CN115171129A - Character recognition error correction method and device, terminal equipment and storage medium

Info

Publication number
CN115171129A
Authority
CN
China
Prior art keywords
character
image
model
matching
recognition
Prior art date
Legal status
Pending
Application number
CN202211081040.4A
Other languages
Chinese (zh)
Inventor
蓝建敏
李思伟
申鑫
池沐霖
张旭君
Current Assignee
Excellence Information Technology Co ltd
Original Assignee
Excellence Information Technology Co ltd
Application filed by Excellence Information Technology Co ltd
Priority to CN202211081040.4A
Publication of CN115171129A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/19007 Matching; Proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/40 Image enhancement or restoration using histogram techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/34 Smoothing or thinning of the pattern; Morphological operations; Skeletonisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/16 Image preprocessing
    • G06V30/162 Quantising the image signal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/16 Image preprocessing
    • G06V30/164 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/16 Image preprocessing
    • G06V30/168 Smoothing or thinning of the pattern; Skeletonisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/18 Extraction of features or characteristics of the image
    • G06V30/1801 Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/26 Techniques for post-processing, e.g. correcting the recognition result
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20036 Morphological image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses a character recognition error correction method and device, terminal equipment and a storage medium. The method comprises: acquiring text information from a text image; inputting the text information into a matching model to obtain a text matching result output by the matching model; acquiring a character structure and character components of the text information; inputting the character structure and the character components into a recognition model to obtain a character recognition result output by the recognition model; and calculating an error correction result by weighted scoring of the matching result and the recognition result. The text information is recognized and corrected through semantic matching, the character structure is matched and corrected through glyph matching, and the outputs of the two methods are fused by weighted scoring, which simplifies the algorithm flow of character recognition error correction and improves its accuracy.

Description

Character recognition error correction method and device, terminal equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a character recognition error correction method, a character recognition error correction device, terminal equipment and a storage medium.
Background
Optical Character Recognition (OCR) is generally used to automatically match and recognize the content of a text image captured by an optical device and to perform error correction. However, conventional OCR algorithms often misrecognize characters, existing OCR error correction algorithms perform poorly, and both recognition and error correction are even worse for Chinese characters.
A character recognition and error correction method is therefore needed to improve the accuracy of recognition and error correction for Chinese characters.
Disclosure of Invention
The purpose of the invention is to provide a character recognition error correction method, a character recognition error correction device, computer terminal equipment and a computer-readable storage medium that can solve the problem of low optical character recognition accuracy.
In order to achieve the above object, the present invention provides a character recognition error correction method, comprising:
acquiring text information in a text image according to the text image;
inputting the text information into a matching model to obtain a text matching result output by the matching model; the matching model is a Keyword-BERT network model, based on Bidirectional Encoder Representations from Transformers (BERT), trained on an existing corpus data set;
acquiring a character structure and character components of the text information according to the text information;
inputting the character structure and the character components into a recognition model to obtain a character recognition result output by the recognition model; the recognition model is a Convolutional Neural Network (CNN) model trained on an existing character data set;
and calculating an error correction result by weighted scoring according to the matching result and the recognition result.
In one embodiment, the CNN model includes a target positioning module and a content recognition module;
the target positioning module is used for generating a positioning region in the form of a bounding box or a pixel-level mask, so as to determine the positions of the content of the character structure and the character components within the positioning region and obtain a positioning result; and the content recognition module is used for recognizing the content represented by the characters in the positioning region according to the positioning result.
In one embodiment, before the acquiring of the text information in the text image according to the text image, the method further includes:
performing image preprocessing on the text image to obtain a preprocessed text image;
wherein the image preprocessing comprises at least one of: noise elimination, edge detection, histogram equalization, morphological processing and binarization.
In one embodiment, the existing character data set is in the form of images, and before the CNN model is trained, the method further includes: performing data enhancement on the existing character data set to obtain an enhanced character data set;
wherein the data enhancement comprises at least one of: random noise addition, image flipping, image shifting, image rotation, image cropping, and image scaling.
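As an illustration of the data enhancement listed above, the following is a minimal sketch of three of the operations (flipping, shifting and random noise addition) on a grayscale image stored as nested lists. The function names, the fill value and the noise amplitude are assumptions made for this example and are not taken from the application; a real training pipeline would normally use an image library.

```python
import random
from typing import List

Image = List[List[int]]  # grayscale image: rows of 0..255 intensities


def flip_horizontal(img: Image) -> Image:
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def shift_right(img: Image, dx: int, fill: int = 0) -> Image:
    """Shift pixels dx columns to the right, padding the left edge with fill."""
    width = len(img[0])
    dx = max(0, min(dx, width))
    return [[fill] * dx + row[: width - dx] for row in img]


def add_noise(img: Image, amplitude: int, rng: random.Random) -> Image:
    """Add uniform integer noise in [-amplitude, amplitude], clamped to 0..255."""
    return [
        [max(0, min(255, p + rng.randint(-amplitude, amplitude))) for p in row]
        for row in img
    ]


def augment(img: Image, rng: random.Random) -> List[Image]:
    """Return the original image plus one variant per enhancement (assumed settings)."""
    return [
        img,
        flip_horizontal(img),
        shift_right(img, dx=1),
        add_noise(img, amplitude=10, rng=rng),
    ]
```

Passing a seeded `random.Random` instance keeps the noise reproducible, which is convenient when comparing training runs.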
The embodiment of the invention also provides a character recognition error correction device, which applies the character recognition error correction method of any of the above embodiments and comprises:
the character information acquisition unit is used for acquiring text information in the text image according to the text image;
the character information matching unit is used for inputting the text information into a matching model to obtain a text matching result output by the matching model; the matching model is a Keyword-BERT network model, based on Bidirectional Encoder Representations from Transformers (BERT), trained on an existing corpus data set;
the character information acquisition unit is used for acquiring a character structure and character components of the text information according to the text information;
the character information matching unit is used for inputting the character structure and the character components into a recognition model and obtaining a character recognition result output by the recognition model; the recognition model is a Convolutional Neural Network (CNN) model trained on an existing character data set;
and the error correction result processing unit is used for calculating an error correction result by weighted scoring according to the matching result and the recognition result.
In one embodiment, the CNN model includes a target positioning module and a content recognition module;
the target positioning module is used for generating a positioning region in the form of a bounding box or a pixel-level mask, so as to determine the positions of the content of the character structure and the character components within the positioning region and obtain a positioning result; and the content recognition module is used for recognizing the content represented by the characters in the positioning region according to the positioning result.
In one embodiment, the device further comprises:
the image preprocessing unit is used for performing image preprocessing on the text image to obtain a preprocessed text image;
wherein the image preprocessing comprises at least one of: noise elimination, edge detection, histogram equalization, morphological processing and binarization.
In one embodiment, the existing character data set is in the form of images, and the character information matching unit is further configured to: perform data enhancement on the existing character data set to obtain an enhanced character data set;
wherein the data enhancement comprises at least one of: random noise addition, image flipping, image shifting, image rotation, image cropping, and image scaling.
The embodiment of the invention also provides computer terminal equipment comprising one or more processors and a memory. The memory is coupled to the processors and stores one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement the character recognition error correction method of any one of the above embodiments.
The embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the character recognition error correction method of any of the above embodiments is implemented.
The embodiment of the invention discloses a character recognition error correction method, a character recognition error correction device, computer terminal equipment and a computer-readable storage medium. Compared with the prior art, the beneficial effects are as follows: the text information is recognized and corrected through semantic matching, the character structure is matched and corrected through glyph matching, and the outputs of the two methods are fused by weighted scoring, which simplifies the algorithm flow of character recognition error correction and improves its accuracy.
Drawings
In order to more clearly illustrate the technical solution of the present invention, the drawings required to be used in the embodiments will be briefly described below, and obviously, the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic flow chart of a character recognition error correction method according to an embodiment of the present invention;
Fig. 2 is a schematic view of an application scenario of a character recognition error correction method according to an embodiment of the present invention;
Fig. 3 is a schematic flow chart of a character recognition error correction method according to a second embodiment of the present invention;
Fig. 4 is a schematic flow chart of a character recognition error correction method according to a third embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a character recognition error correction device according to a fourth embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a computer terminal device according to a fifth embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be understood that the step numbers used herein are for convenience of description only and are not used as limitations on the order in which the steps are performed.
It is to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
The terms "comprises" and "comprising" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The term "and/or" refers to any and all possible combinations of one or more of the associated listed items and includes such combinations.
OCR recognition and error correction is an important task in the field of artificial intelligence: it matches the text content in an optical image and corrects errors in the matched words and phrases. However, current OCR recognition and error correction has low accuracy and is mostly designed for languages such as English; there are few solutions for Chinese and other languages. One feasible solution recognizes and corrects errors using the structure of the characters, such as radicals, and the features of the glyphs; this type of algorithm generally treats a character as an image and decomposes and recognizes each of its structures. Another feasible solution performs context matching and recognition with word vectors based on the meaning of the words, thereby completing error correction. Word vector models that blend glyph features into the vectors already exist, but their error correction effect is not good, and most other algorithms perform recognition and error correction from a single angle only. The technical idea of the present application is to use the visual information and the semantic information of the text at the same time, that is, to recognize and correct text by combining glyph recognition and analysis with semantic recognition and analysis. Compared with the prior art, this reduces the difficulty of implementing the algorithm and improves the recognition accuracy.
Example one
Fig. 1 is a schematic flow chart of a character recognition error correction method according to an embodiment of the present invention. Referring to Fig. 1, the method includes:
S101, acquiring text information in a text image according to the text image;
S102, inputting the text information into a matching model to obtain a text matching result output by the matching model;
S103, acquiring a character structure and character components of the text information according to the text information;
S104, inputting the character structure and the character components into a recognition model to obtain a character recognition result output by the recognition model;
and S105, calculating an error correction result by weighted scoring according to the matching result and the recognition result.
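The flow of steps S101 to S105 can be sketched as a function that chains five pluggable stages. This is only an illustrative skeleton: the name `correct_text`, the parameter names and the `Scores` type are hypothetical, and the matching model, recognition model and fusion rule are passed in as callables rather than implemented here.

```python
from typing import Any, Callable, Dict

Scores = Dict[str, float]  # candidate text -> normalized score (assumed format)


def correct_text(
    text_image: Any,
    extract_text: Callable[[Any], str],
    match_semantics: Callable[[str], Scores],
    decompose: Callable[[str], Any],
    recognize_glyphs: Callable[[Any], Scores],
    fuse: Callable[[Scores, Scores], str],
) -> str:
    """Run steps S101 to S105 in order and return the corrected text."""
    text = extract_text(text_image)                        # S101: OCR text extraction
    matching_result = match_semantics(text)                # S102: semantic matching model
    glyph_features = decompose(text)                       # S103: structure and components
    recognition_result = recognize_glyphs(glyph_features)  # S104: glyph recognition model
    return fuse(matching_result, recognition_result)       # S105: weighted scoring
```

Keeping each stage behind a callable reflects the statement that the step numbers are for convenience of description and that the models may be replaced by alternatives.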
The present embodiment is exemplarily described with reference to specific application scenarios: for the task of character recognition and error correction of the optical image, firstly, character information in the text image is acquired according to the text image, and then corresponding features are extracted from the character information according to a specific algorithm for processing.
After the text information has been extracted, semantic matching can be performed: the text information is input into a matching model, and the text matching result output by the matching model is obtained. The matching model is a Keyword-BERT network model, based on Bidirectional Encoder Representations from Transformers (BERT), trained on an existing corpus data set. It should be noted that the Keyword-BERT network model pays more attention to the keywords in a text segment; that is, during matching and error correction the algorithm places more emphasis on relatively important words, which also improves the practicality of the method. The network model can be pre-trained on large public corpus data sets, and in a usage scenario the text information is input directly into the trained model. It should also be noted that the matching model is not limited to the Keyword-BERT network model; any other natural language processing model capable of semantic recognition and matching is a possible embodiment, although the preferred embodiment given here is more effective than the alternatives.
After semantic matching is completed, glyph matching is performed, and the recognition results of the two aspects are then fused to calculate the final result of the method. Glyph matching first analyzes and splits the characters. For alphabetic languages such as English, glyph recognition is simpler, and the amount of data in existing alphabetic training sets is far smaller than that for Chinese characters; this step is therefore mostly used in the context of Chinese characters. A character structure and character components of the text information are acquired from the text information; the character structure and the character components are then input into a recognition model to obtain the character recognition result output by the recognition model. The recognition model is a Convolutional Neural Network (CNN) model trained on an existing character data set. Similarly, the network model can be pre-trained on large public character data sets, and in a usage scenario the glyphs are input directly into the trained model for recognition. The glyph features comprise the character structure and the character components. Since this step is mostly used for Chinese characters, the character structure is usually a top-bottom structure, a left-right structure, an independent structure, and so on, while the character components, that is, the basic elements constituting a Chinese character, are split at the level of components or radicals down to strokes. If this step is applied to an alphabetic language, the corresponding letters in the text information are input directly into the recognition model without splitting the character structure and the character components.
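The splitting into character structure and character components can be pictured with a small lookup-based sketch. The table below is a hypothetical, hand-written stand-in for a real decomposition resource, and the `Glyph` type and the `"unsplit"` label are assumptions introduced for this example; characters not in the table, including letters, are passed through without splitting.

```python
from typing import Dict, List, NamedTuple, Tuple


class Glyph(NamedTuple):
    char: str
    structure: str               # e.g. "left-right", "top-bottom", "independent"
    components: Tuple[str, ...]  # basic elements that make up the character


# Hypothetical lookup table with a few well-known decompositions.
DECOMPOSITION: Dict[str, Tuple[str, Tuple[str, ...]]] = {
    "好": ("left-right", ("女", "子")),
    "明": ("left-right", ("日", "月")),
    "字": ("top-bottom", ("宀", "子")),
    "人": ("independent", ("人",)),
}


def decompose(text: str) -> List[Glyph]:
    """Return the structure and components of each character in the text."""
    glyphs = []
    for ch in text:
        if ch in DECOMPOSITION:
            structure, components = DECOMPOSITION[ch]
            glyphs.append(Glyph(ch, structure, components))
        else:
            # Letters and unlisted characters are not split.
            glyphs.append(Glyph(ch, "unsplit", (ch,)))
    return glyphs
```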
A CNN is a general-purpose image recognition network that extracts features from an image through a series of convolution and pooling layers.
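To make the two layer types concrete, the following sketch implements a single convolution (computed as cross-correlation without padding, as is usual in CNN layers) and a non-overlapping max pooling step on nested lists. It is a didactic example only; the function names are chosen for this illustration, and a trained recognition model would stack many such layers with learned kernels.

```python
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map: Matrix, size: int = 2) -> Matrix:
    """Keep the maximum of each size x size window; incomplete windows are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[i * size + di][j * size + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]
```

For instance, a `[[-1, 1]]` kernel responds where intensity rises from left to right, which is one way a stroke edge can show up as a feature.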
In one example, the CNN model includes a target positioning module and a content recognition module. The target positioning module is used for generating a positioning region in the form of a bounding box or a pixel-level mask, so as to determine the positions of the content of the character structure and the character components within the positioning region and obtain a positioning result; the content recognition module is used for recognizing the content represented by the characters in the positioning region according to the positioning result. In this embodiment, the preferred CNN model therefore has two parts, a target positioning module and a content recognition module. Among existing algorithms, Fast R-CNN and Mask R-CNN are two representative CNN models: the target positioning module of the former generates a bounding box, and that of the latter generates a positioning region in the form of a pixel-level mask. The target positioning module determines the region to be recognized and the content recognition module recognizes the text content in that region, which improves the accuracy of glyph recognition.
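The two forms of positioning region are related: a bounding box can be derived from a pixel-level mask, and either can be used to cut out the region handed to the content recognition module. The sketch below shows that hand-off under simple assumptions (a non-empty rectangular grid, inclusive box coordinates); `mask_to_box` and `crop` are hypothetical helper names and do not correspond to any particular detection library.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), all inclusive


def mask_to_box(mask: List[List[int]]) -> Optional[Box]:
    """Return the tightest box around the nonzero mask pixels, or None if there are none."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [j for j in range(len(mask[0])) if any(row[j] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the boxed region out of the image for the recognition stage."""
    top, left, bottom, right = box
    return [row[left : right + 1] for row in image[top : bottom + 1]]
```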
After the semantic and glyph recognition is completed, an error correction result can be calculated by weighted scoring according to the matching result and the recognition result. Weighted scoring is a general and simple data fusion method; a more accurate fusion result could be obtained with a neural network or another more complex algorithm to further improve the accuracy of recognition and error correction, but the complexity of the algorithm would increase accordingly. For example, Fig. 2 is a schematic view of an application scenario of a character recognition error correction method according to an embodiment of the present invention, and provides an application case of fusing semantic and glyph recognition. The example sentence shown in the figure is "how to scan codes and add WeChat", but after the text is extracted by the optical device, the word "WeChat" is erroneously recognized as "badge", so recognition and error correction are required. As shown, the top list contains several candidate results of semantic recognition with their normalized scores, the bottom list contains several candidate results of glyph recognition with their normalized scores, and the ellipses represent other results whose scores are low enough to be ignored. In this scenario, the semantic and glyph features can be fused through a weighted score. When the degree of influence of each is unknown, a common choice is to set each weight to 0.5, which gives "WeChat" a score of 0.98, "credit mark" 0.42, "WeChat group" 0.37, "short message" 0.35, and "information" 0.06.
It should be noted that in the scenario shown in Fig. 2, "WeChat" is the matching result finally used for error correction regardless of how the weights are assigned; in other scenarios, however, the weight assignment may affect the correctness of the result. The weight distribution is difficult to model with a specific mathematical formula. Since semantic matching, especially with the Keyword-BERT network model described in this application, generally already covers the judgment of homophones, synonyms and homonyms, the semantic weight should be the heavier one, that is, greater than 0.5, but its specific value is difficult to determine directly. In that case, a neural network assisted by manual labeling can be used: a series of recognition results is input into the network, a converged network model is trained, and the weights of the semantic and glyph scores are inferred back from it and used as the weight distribution scheme, thereby improving the accuracy of the method.
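The weighted scoring described above can be sketched as a convex combination of the two normalized score lists. Several points here are assumptions of this example and not statements of the application: the two weights are taken to sum to 1, a candidate missing from one list is given a score of 0 there, and all function names are hypothetical. In addition, `fit_semantic_weight` replaces the neural network mentioned in the text with a plain grid search over manually labelled examples, purely to show how a weight could be chosen from labelled data.

```python
from typing import Dict, List, Tuple

Scores = Dict[str, float]  # candidate text -> normalized score


def fuse_scores(semantic: Scores, glyph: Scores, semantic_weight: float = 0.5) -> Scores:
    """Combine semantic and glyph scores; the glyph weight is 1 - semantic_weight."""
    if not 0.0 <= semantic_weight <= 1.0:
        raise ValueError("semantic_weight must lie in [0, 1]")
    glyph_weight = 1.0 - semantic_weight
    return {
        cand: semantic_weight * semantic.get(cand, 0.0)
        + glyph_weight * glyph.get(cand, 0.0)
        for cand in set(semantic) | set(glyph)
    }


def best_candidate(fused: Scores) -> str:
    """Return the candidate with the highest fused score."""
    return max(fused, key=fused.get)


def fit_semantic_weight(
    labelled: List[Tuple[Scores, Scores, str]], steps: int = 10
) -> float:
    """Grid-search the semantic weight that gets the most labelled examples right.

    Each example is (semantic scores, glyph scores, correct candidate).
    Ties are resolved in favour of the smallest weight tried.
    """
    best_weight, best_hits = 0.0, -1
    for k in range(steps + 1):
        weight = k / steps
        hits = sum(
            best_candidate(fuse_scores(sem, gly, weight)) == truth
            for sem, gly, truth in labelled
        )
        if hits > best_hits:
            best_weight, best_hits = weight, hits
    return best_weight
```

With equal weights, `best_candidate(fuse_scores(semantic, glyph))` corresponds to the 0.5/0.5 setting used for Fig. 2; the per-list scores of that figure are not reproduced in the text, so any concrete inputs would have to be supplied by the reader.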
This embodiment provides a character recognition error correction method, comprising: acquiring text information in a text image according to the text image; inputting the text information into a matching model to obtain a text matching result output by the matching model, the matching model being a Keyword-BERT network model, based on Bidirectional Encoder Representations from Transformers (BERT), trained on an existing corpus data set; acquiring a character structure and character components of the text information according to the text information; inputting the character structure and the character components into a recognition model to obtain a character recognition result output by the recognition model, the recognition model being a Convolutional Neural Network (CNN) model trained on an existing character data set; and calculating an error correction result by weighted scoring according to the matching result and the recognition result. The text information is recognized and corrected through semantic matching, the character structure is matched and corrected through glyph matching, and the outputs of the two methods are fused by weighted scoring, which simplifies the algorithm flow of character recognition error correction and improves its accuracy.
Example two
Fig. 3 is a schematic flow chart of a character recognition error correction method according to a second embodiment of the present invention. Referring to Fig. 3, before S101, the method further includes:
S201, performing image preprocessing on the text image to obtain a preprocessed text image.
The present embodiment is described by way of example with reference to a specific application scenario. Before the text information in the text image is acquired, the method further includes performing image preprocessing on the text image to obtain a preprocessed text image, where the image preprocessing includes at least one of: noise elimination, edge detection, histogram equalization, morphological processing and binarization. Preprocessing the input image highlights the text in the image, which makes feature extraction easier and thereby improves the accuracy of the method. Traditional image processing is also a common approach to OCR character recognition and error correction: image processing improves the image quality and highlights the text content, feature engineering extracts and combines color, shape, texture and other features of the image, and statistical machine learning models complete the recognition and error correction. It should be noted that this traditional approach can also be used in the present application in place of extracting the character structure and character component. Compared with the scheme provided in the present application, however, it involves more complicated steps, a large amount of manual work and lower accuracy. It nevertheless remains an alternative when the existing character data set is insufficient, in which case image preprocessing becomes even more important for improving image quality.
This embodiment provides a character recognition error correction method in which, before the text information in the text image is acquired, the method further includes: performing image preprocessing on the text image to obtain a preprocessed text image, where the image preprocessing includes at least one of: noise elimination, edge detection, histogram equalization, morphological processing and binarization. Preprocessing the text image before recognition improves its quality and therefore the accuracy of character recognition and error correction.
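Two of the listed preprocessing operations, histogram equalization and binarization, can be sketched in plain Python on a grayscale image stored as a list of rows. The function names, the 256 gray levels and the fixed threshold of 128 are assumptions made for illustration; the application does not specify how each operation is implemented.

```python
def equalize_histogram(image, levels=256):
    """Spread the gray levels of an image over the full range [0, levels - 1]
    using the cumulative distribution of its pixel values."""
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)   # CDF value at the darkest level present
    total = len(flat)
    if total == cdf_min:                      # constant image: nothing to equalize
        return [row[:] for row in image]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]

def binarize(image, threshold=128):
    """Map pixels at or above the threshold to 1 and all others to 0.
    The fixed threshold is a hypothetical choice."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]

# A tiny low-contrast image: all values lie between 10 and 40.
gray = [[10, 10, 20],
        [30, 30, 40]]
equalized = equalize_histogram(gray)   # [[0, 0, 64], [191, 191, 255]]
binary = binarize(equalized)           # [[0, 0, 0], [1, 1, 1]]
```

Without equalization, a threshold of 128 would map every pixel of this example to 0; after equalization the darker and brighter halves separate, which is the sense in which preprocessing highlights the text content.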
Example three
Fig. 4 is a schematic flow chart of a character recognition error correction method according to a third embodiment of the present invention. Referring to Fig. 4, S104 specifically includes:
S301, performing data enhancement on the existing character data set to obtain a data-enhanced character data set;
S302, training the CNN model on the data-enhanced character data set;
S303, inputting the character structure and the character component into the CNN model to obtain a character recognition result output by the CNN model.
The present embodiment is described by way of example with reference to a specific application scenario. When the existing character data set is in the form of images, the amount of data in it may not be enough to train a general, stable and well-performing model, so the data set can be expanded from various angles through data enhancement to address the stability and generality of the model. Specifically, data enhancement is performed on the existing character data set to obtain a data-enhanced character data set; the CNN model is trained on the data-enhanced character data set; and the character structure and the character component are input into the CNN model to obtain the character recognition result output by the CNN model. The data enhancement includes at least one of: random noise addition, image flipping, image shifting, image rotation, image cropping, and image scaling.
This embodiment provides a character recognition error correction method in which the existing character data set is in the form of images and, before the CNN model is trained, the method further includes: performing data enhancement on the existing character data set to obtain a data-enhanced character data set, where the data enhancement includes at least one of: random noise addition, image flipping, image shifting, image rotation, image cropping, and image scaling. Enhancing the existing character data set during training improves the stability and repeatability of the model and therefore the accuracy of character recognition and error correction.
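Several of the listed data enhancement operations can be sketched on a binary glyph image stored as a list of rows. The function names, the 90-degree rotation step, the zero padding used for shifting and the seeded random generator are assumptions for illustration, not details taken from the application.

```python
import random

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def shift(img, dx, fill=0):
    """Shift the image dx pixels to the right (left if dx is negative),
    padding the vacated columns with `fill`."""
    width = len(img[0])
    out = []
    for row in img:
        if dx >= 0:
            out.append([fill] * min(dx, width) + row[:max(width - dx, 0)])
        else:
            out.append(row[-dx:] + [fill] * min(-dx, width))
    return out

def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def add_noise(img, rng, flip_prob=0.05):
    """Invert each binary pixel independently with probability flip_prob."""
    return [[1 - p if rng.random() < flip_prob else p for p in row] for row in img]

def augment(img, seed=0):
    """Return the original image plus one variant per operation."""
    rng = random.Random(seed)   # seeded so that runs are reproducible
    return [img, flip_horizontal(img), shift(img, 1), rotate90(img), add_noise(img, rng)]

glyph = [[1, 0, 0],
         [1, 1, 0],
         [1, 0, 0]]
variants = augment(glyph)   # five images derived from one source image
```

Each source image yields several training images, which is how the data set is expanded when the existing character data set is too small.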
Example four
Fig. 5 is a schematic structural diagram of a character recognition error correction apparatus according to a fourth embodiment of the present invention. Referring to Fig. 5, the apparatus is applied to the character recognition error correction method of any one of the above embodiments. It should be noted that Fig. 5 shows only the most basic embodiment, and other units may be added according to actual requirements. The apparatus includes:
a text information obtaining unit 41, configured to obtain text information in a text image according to the text image;
a text information matching unit 42, configured to input the text information into a matching model and obtain a text matching result output by the matching model, where the matching model is a Keyword-BERT network model (BERT: Bidirectional Encoder Representations from Transformers) trained on an existing corpus data set;
a character information obtaining unit 43, configured to obtain a character structure and a character component of the text information according to the text information;
a character information matching unit 44, configured to input the character structure and the character component into a recognition model and obtain a character recognition result output by the recognition model, where the recognition model is a Convolutional Neural Network (CNN) model trained on an existing character data set;
and an error correction result processing unit 45, configured to calculate an error correction result by weighted scoring according to the matching result and the recognition result.
In one example, the CNN model includes a target positioning module and a content recognition module;
the target positioning module is used for generating a positioning area in the form of a bounding box or a pixel-level mask, so as to determine the positions of the content of the character structure and the character component within the positioning area and obtain a positioning result; and the content recognition module is used for recognizing the content represented by the characters in the positioning area according to the positioning result.
The target positioning module determines the area to be recognized, and the content recognition module recognizes the text content in that area, which improves the accuracy of glyph recognition.
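The division of labor between the two modules can be sketched as follows: a pixel-level mask is reduced to a bounding box, the box is cropped from the image, and the crop is handed to a recognizer. This is only a schematic of the data flow; in the application both modules are part of the CNN model, whereas here `locate`, `crop` and `recognize_region` are hypothetical helper names and the recognizer is any callable supplied by the caller.

```python
def locate(mask):
    """Return the bounding box (top, left, bottom, right), inclusive,
    of the non-zero pixels in a pixel-level mask, or None if it is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    return min(rows), min(cols), max(rows), max(cols)

def crop(image, box):
    """Cut the positioning area described by `box` out of the image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]

def recognize_region(image, mask, recognizer):
    """Positioning first, recognition second: only the located area
    is passed on to the recognizer."""
    box = locate(mask)
    if box is None:
        return None
    return recognizer(crop(image, box))

image = [[0, 0, 0, 0],
         [0, 9, 8, 0],
         [0, 7, 0, 0],
         [0, 0, 0, 0]]
mask = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
box = locate(mask)          # (1, 1, 2, 2)
region = crop(image, box)   # [[9, 8], [7, 0]]
```

Restricting the recognizer to the located area keeps background pixels out of the recognition step.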
In one example, the apparatus further includes:
an image preprocessing unit, configured to perform image preprocessing on the text image to obtain a preprocessed text image;
wherein the image preprocessing comprises at least one of: noise elimination, edge detection, histogram equalization, morphological processing and binarization.
Preprocessing the text image before recognition improves its quality and therefore the accuracy of character recognition and error correction.
In one example, the existing character data set is in the form of images, and the character information matching unit is further configured to perform data enhancement on the existing character data set to obtain a data-enhanced character data set;
wherein the data enhancement comprises at least one of: random noise addition, image flipping, image shifting, image rotation, image cropping, and image scaling.
Enhancing the existing character data set during training improves the stability and repeatability of the model and therefore the accuracy of character recognition and error correction.
For the specific limitations of the character recognition error correction apparatus, reference may be made to the limitations of the character recognition error correction method above, which are not repeated here. All or part of each module in the apparatus can be implemented in software, hardware, or a combination thereof. The modules can be embedded in or independent of a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call them and execute the corresponding operations.
This embodiment provides a character recognition error correction apparatus, including: a text information obtaining unit, configured to obtain text information in a text image according to the text image; a text information matching unit, configured to input the text information into a matching model and obtain a text matching result output by the matching model, where the matching model is a Keyword-BERT network model (BERT: Bidirectional Encoder Representations from Transformers) trained on an existing corpus data set; a character information obtaining unit, configured to obtain a character structure and a character component of the text information according to the text information; a character information matching unit, configured to input the character structure and the character component into a recognition model and obtain a character recognition result output by the recognition model, where the recognition model is a Convolutional Neural Network (CNN) model trained on an existing character data set; and an error correction result processing unit, configured to calculate an error correction result by weighted scoring according to the matching result and the recognition result. The text information is recognized and corrected through semantic matching, the character structure is matched and corrected through glyph matching, and the results of the two matching methods are fused by weighted scoring, which simplifies the algorithm flow of character recognition error correction and improves its accuracy.
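How the five units fit together can be sketched as a small class whose fields are the first four units and whose method plays the role of the error correction result processing unit. Everything here is hypothetical scaffolding: the class name `ErrorCorrectionDevice`, the callable signatures, the default semantic weight of 0.6 and the stub lambdas are assumptions, and the real units would wrap an OCR front end, the Keyword-BERT model and the CNN model.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Scores = Dict[str, float]   # candidate text -> score in [0, 1]

@dataclass
class ErrorCorrectionDevice:
    acquire_text: Callable[[object], str]             # text information obtaining unit
    match_text: Callable[[str], Scores]               # text information matching unit
    acquire_characters: Callable[[str], object]       # character information obtaining unit
    recognize_characters: Callable[[object], Scores]  # character information matching unit
    semantic_weight: float = 0.6                      # hypothetical weight, above 0.5

    def correct(self, image) -> str:
        """Error correction result processing: fuse both score sets by weighted sum."""
        text = self.acquire_text(image)
        semantic = self.match_text(text)
        structure = self.acquire_characters(text)
        glyph = self.recognize_characters(structure)
        candidates = sorted(set(semantic) | set(glyph))

        def fused(candidate):
            return (self.semantic_weight * semantic.get(candidate, 0.0)
                    + (1.0 - self.semantic_weight) * glyph.get(candidate, 0.0))

        return max(candidates, key=fused)

# Stub units with made-up outputs, used only to show the data flow.
device = ErrorCorrectionDevice(
    acquire_text=lambda image: "WeChet",
    match_text=lambda text: {"WeChat": 0.9, "WeChet": 0.2},
    acquire_characters=lambda text: list(text),
    recognize_characters=lambda structure: {"WeChat": 0.6, "WeChet": 0.7},
)
corrected = device.correct(image=None)
```

Passing the units in as callables keeps the sketch independent of any particular model library, and it mirrors the remark that the modules may be realized in software, hardware, or a combination.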
Example five
Fig. 6 is a schematic structural diagram of a computer terminal device according to a fifth embodiment of the present invention. Referring to Fig. 6, the fifth embodiment of the present invention provides a computer terminal device including one or more processors and a memory. The memory is coupled to the processor and configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the character recognition error correction method of any of the above embodiments.
The processor is used for controlling the overall operation of the computer terminal device so as to complete all or part of the steps of the character recognition error correction method. The memory is used to store various types of data to support operation of the computer terminal device; such data may include, for example, instructions for any application or method operating on the computer terminal device, as well as application-related data. The memory may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
In an exemplary embodiment, the computer terminal device may be implemented by one or more Application Specific Integrated Circuits (ASIC), Digital Signal Processors (DSP), Digital Signal Processing Devices (DSPD), Programmable Logic Devices (PLD), Field Programmable Gate Arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components, and is configured to perform the above character recognition error correction method and achieve technical effects consistent with that method.
In another exemplary embodiment, a computer-readable storage medium is further provided, in which a computer program is stored; when executed by a processor, the computer program implements the steps of the character recognition error correction method of any one of the above embodiments. For example, the computer-readable storage medium may be the above memory storing the computer program, and the computer program may be executed by a processor of the computer terminal device to perform the character recognition error correction method and achieve technical effects consistent with that method.
The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and substitutions without departing from the technical principle of the present invention, and these modifications and substitutions should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A character recognition error correction method is characterized by comprising the following steps:
acquiring text information in a text image according to the text image;
inputting the text information into a matching model to obtain a text matching result output by the matching model; the matching model is a Keyword-BERT network model (BERT: Bidirectional Encoder Representations from Transformers) trained on an existing corpus data set;
acquiring a character structure and a character component of the text information according to the text information;
inputting the character structure and the character component into a recognition model to obtain a character recognition result output by the recognition model; the recognition model is a Convolutional Neural Network (CNN) model trained on an existing character data set;
and calculating an error correction result by weighted scoring according to the matching result and the recognition result.
2. The method of claim 1, wherein the CNN model comprises a target positioning module and a content recognition module;
the target positioning module is used for generating a positioning area in the form of a bounding box or a pixel-level mask, so as to determine the positions of the content of the character structure and the character component within the positioning area and obtain a positioning result; and the content recognition module is used for recognizing the content represented by the characters in the positioning area according to the positioning result.
3. The method according to claim 1, wherein before the obtaining of the text information in the text image according to the text image, the method further comprises:
performing image preprocessing on the text image to obtain a preprocessed text image;
wherein the image pre-processing comprises at least one of: noise elimination, edge detection, histogram equalization, morphological processing and binarization.
4. The method of any of claims 1-3, wherein the existing character data set is in the form of images, and the method further comprises, prior to training of the CNN model: performing data enhancement on the existing character data set to obtain a data-enhanced character data set;
wherein the data enhancement comprises at least one of: random noise addition, image flipping, image shifting, image rotation, image cropping, and image scaling.
5. A character recognition error correction apparatus, comprising:
a text information acquisition unit, configured to acquire text information in a text image according to the text image;
a text information matching unit, configured to input the text information into a matching model and obtain a text matching result output by the matching model; the matching model is a Keyword-BERT network model (BERT: Bidirectional Encoder Representations from Transformers) trained on an existing corpus data set;
a character information acquisition unit, configured to acquire a character structure and a character component of the text information according to the text information;
a character information matching unit, configured to input the character structure and the character component into a recognition model and obtain a character recognition result output by the recognition model; the recognition model is a Convolutional Neural Network (CNN) model trained on an existing character data set;
and an error correction result processing unit, configured to calculate an error correction result by weighted scoring according to the matching result and the recognition result.
6. The apparatus of claim 5, wherein the CNN model comprises a target positioning module and a content recognition module;
the target positioning module is used for generating a positioning area in the form of a bounding box or a pixel-level mask, so as to determine the positions of the content of the character structure and the character component within the positioning area and obtain a positioning result; and the content recognition module is used for recognizing the content represented by the characters in the positioning area according to the positioning result.
7. The apparatus of claim 5, further comprising:
an image preprocessing unit, configured to perform image preprocessing on the text image to obtain a preprocessed text image;
wherein the image pre-processing comprises at least one of: noise elimination, edge detection, histogram equalization, morphological processing and binarization.
8. The apparatus according to any one of claims 5 to 7, wherein the existing character data set is in the form of images, and the character information matching unit is further configured to perform data enhancement on the existing character data set to obtain a data-enhanced character data set;
wherein the data enhancement comprises at least one of: random noise addition, image flipping, image shifting, image rotation, image cropping, and image scaling.
9. A computer terminal device, comprising:
one or more processors;
a memory coupled to the processor for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the character recognition error correction method according to any of claims 1-4.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the character recognition error correction method according to any one of claims 1 to 4.
CN202211081040.4A 2022-09-06 2022-09-06 Character recognition error correction method and device, terminal equipment and storage medium Pending CN115171129A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211081040.4A CN115171129A (en) 2022-09-06 2022-09-06 Character recognition error correction method and device, terminal equipment and storage medium


Publications (1)

Publication Number Publication Date
CN115171129A (en) 2022-10-11

Family

ID=83481512

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211081040.4A Pending CN115171129A (en) 2022-09-06 2022-09-06 Character recognition error correction method and device, terminal equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115171129A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200311460A1 (en) * 2016-11-30 2020-10-01 Hangzhou Hikvision Digital Technology Co., Ltd. Character identification method and device
CN112396049A (en) * 2020-11-19 2021-02-23 平安普惠企业管理有限公司 Text error correction method and device, computer equipment and storage medium
CN112434691A (en) * 2020-12-02 2021-03-02 上海三稻智能科技有限公司 HS code matching and displaying method and system based on intelligent analysis and identification and storage medium
CN113408535A (en) * 2021-05-25 2021-09-17 浙江大学 OCR error correction method based on Chinese character level characteristics and language model
CN113743415A (en) * 2021-08-05 2021-12-03 杭州远传新业科技有限公司 Method, system, electronic device and medium for identifying and correcting image text


Similar Documents

Publication Publication Date Title
CN110569830B (en) Multilingual text recognition method, device, computer equipment and storage medium
Breuel et al. High-performance OCR for printed English and Fraktur using LSTM networks
CN111753767A (en) Method and device for automatically correcting operation, electronic equipment and storage medium
CN110647829A (en) Bill text recognition method and system
CN111738251A (en) Optical character recognition method and device fused with language model and electronic equipment
Romero et al. Influence of text line segmentation in handwritten text recognition
US9286527B2 (en) Segmentation of an input by cut point classification
CN110942004A (en) Handwriting recognition method and device based on neural network model and electronic equipment
CN113408535B (en) OCR error correction method based on Chinese character level features and language model
CN112766255A (en) Optical character recognition method, device, equipment and storage medium
CN115438650B (en) Contract text error correction method, system, equipment and medium fusing multi-source characteristics
CN114429636B (en) Image scanning identification method and device and electronic equipment
CN115311666A (en) Image-text recognition method and device, computer equipment and storage medium
CN116704523A (en) Text typesetting image recognition system for publishing and printing equipment
Bosch et al. Semiautomatic text baseline detection in large historical handwritten documents
CN114758341A (en) Intelligent contract image identification and contract element extraction method and device
CN113407676A (en) Title correction method and system, electronic device and computer readable medium
CN111126160B (en) Intelligent Chinese character structure evaluation method and system constructed based on five-stroke input method
CN110929514A (en) Text proofreading method and device, computer readable storage medium and electronic equipment
CN108021918B (en) Character recognition method and device
CN115171129A (en) Character recognition error correction method and device, terminal equipment and storage medium
CN112528980B (en) OCR recognition result correction method and terminal and system thereof
CN111738248B (en) Character recognition method, training method of character decoding model and electronic equipment
CN111382322B (en) Method and device for determining similarity of character strings
CN110399607A (en) A kind of conversational system text error correction system and method based on phonetic

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20221011
