CN115100672A - Character detection and identification method, device and equipment and computer readable storage medium - Google Patents

Info

Publication number
CN115100672A
Authority
CN
China
Prior art keywords
character
text
detection
detection result
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210682544.5A
Other languages
Chinese (zh)
Inventor
廖明
李国鸣
陈洁彦
钱学成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Merchants Bank Co Ltd
Original Assignee
China Merchants Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Merchants Bank Co Ltd
Priority to CN202210682544.5A
Publication of CN115100672A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G06V30/147 Determination of region of interest
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G06V30/1475 Inclination or skew detection or correction of characters or of image to be recognised
    • G06V30/1478 Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/26 Techniques for post-processing, e.g. correcting the recognition result

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses a character detection and recognition method, apparatus, and device, and a computer-readable storage medium, and belongs to the technical field of character recognition. A text image is acquired; text line detection and single character detection are performed on the text image, and a text line detection result and a single character detection result are output; character content is then recognized based on the text line detection result and the single character detection result, and a recognition result is generated. By detecting text at two granularities, text line and single character, within one detection process, the invention readily handles the ordering and recognition of tilted text, multi-directional text, and curved text, achieves higher recognition accuracy than methods that recognize single characters only, and improves the precision of character recognition.

Description

Character detection and identification method, device and equipment and computer readable storage medium
Technical Field
The present invention relates to the field of text recognition technologies, and in particular, to a text detection and recognition method, apparatus, device, and computer readable storage medium.
Background
Character detection and recognition has broad application prospects in the intelligent review and entry of document images, document comparison, image text retrieval, and similar tasks. These applications all require a detection and recognition model to accurately locate each single character in an image, recognize its content, organize the recognized single characters into text lines as needed, and finally restore the document, so that the specific position of the text in the image document can be determined.
In the related art, single characters are usually detected with a general-purpose text detection algorithm trained on character-level annotations. The detected characters are then recognized one by one and sorted by line to restore the document. This divide-and-conquer scheme separates character detection from recognition and is easy to implement, but it tends to suffer from inaccurate single character detection and recognition and from difficulty in ordering characters into text lines.
Disclosure of Invention
The invention mainly aims to provide a character detection and identification method, a character detection and identification device, character detection and identification equipment and a computer readable storage medium, and aims to solve the problems of poor character detection and identification precision and difficult text line sequencing in the prior art.
In order to achieve the above object, the present invention provides a character detection and identification method, which comprises the following steps:
acquiring a text image;
performing text line detection and single character detection on the text image, and outputting a text line detection result and a single character detection result;
and identifying the text content based on the text line detection result and the single character detection result, and generating an identification result.
Optionally, after the step of performing text line detection and single character detection on the text image and outputting a text line detection result and a single character detection result, the method further includes:
grouping the single character detection boxes in the single character detection result by text line according to the text line detection result and the single character detection result;
and sorting the single character detection boxes within each group along the text line direction in the text line detection result to generate a corrected single character detection result.
Optionally, the step of sorting the single character detection boxes within each group along the text line direction in the text line detection result to generate a corrected single character detection result includes:
sorting the single character detection boxes within each group along the text line direction in the text line detection result;
and detecting the size and the direction of the sorted single character detection boxes, and normalizing the size and the direction line by line to generate a corrected single character detection result.
Optionally, the step of sorting the single character detection boxes within each group along the text line direction in the text line detection result to generate a corrected single character detection result includes:
determining whether the text line detection result is curved text;
if so, shrinking the text line detection result to obtain a reduced text line direction;
and sorting the single character detection boxes within the group along the reduced text line direction to generate a corrected single character detection result.
Optionally, the step of performing text content recognition based on the text line detection result and the single character detection result, and generating a recognition result includes:
performing text content recognition on the text line detection result to generate a text line character recognition result;
judging whether the text line character recognition result is aligned with the position of a single character detection frame in the single character detection result;
and if so, generating a recognition result based on the text line character recognition result.
Optionally, after the step of determining whether the text line character recognition result is aligned with the position of the single character detection box in the single character detection result, the method includes:
if not, performing character recognition based on the single character detection result to generate a single character recognition result;
generating a single character string according to the single character detection frame and the single character recognition result;
calculating the edit distance between the single character string and the corresponding text line character string in the text line character recognition result, and obtaining the corresponding edit operations;
and executing the editing operation, aligning the text line recognition result with the position of the single character detection frame, and generating a recognition result.
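The edit-distance step above can be sketched as follows. This is a minimal illustration, assuming the standard Levenshtein distance with a backtrace; the function names `edit_operations` and `box_to_line_index` are hypothetical, and how each operation is finally resolved (for example, what to do with a text line character that has no box) is not specified in this section and is left to the caller.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, int, int]  # (operation, index in char string, index in line string)


def edit_operations(char_string: str, line_string: str) -> Tuple[int, List[Op]]:
    """Levenshtein distance plus one optimal list of edit operations.

    char_string: one recognized character per single character box, in order.
    line_string: the string recognized from the whole text line.
    """
    m, n = len(char_string), len(line_string)
    # dp[i][j] = edit distance between char_string[:i] and line_string[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if char_string[i - 1] == line_string[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # delete a box character
                           dp[i][j - 1] + 1,         # insert a line character
                           dp[i - 1][j - 1] + cost)  # match or substitute

    # Walk back from the bottom-right corner to recover the operations.
    ops: List[Op] = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if char_string[i - 1] == line_string[j - 1] else 1
            if dp[i][j] == dp[i - 1][j - 1] + cost:
                ops.append(("match" if cost == 0 else "substitute", i - 1, j - 1))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(("delete", i - 1, j))
            i -= 1
        else:
            ops.append(("insert", i, j - 1))
            j -= 1
    ops.reverse()
    return dp[m][n], ops


def box_to_line_index(ops: List[Op]) -> Dict[int, int]:
    """Map each box index to the line character it lines up with, if any."""
    return {i: j for op, i, j in ops if op in ("match", "substitute")}
```

In this sketch an `insert` marks a text line character with no corresponding box (a possible missed detection), and a `delete` marks a box with no corresponding text line character.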
Optionally, the step of performing text line detection and single character detection on the text image, and outputting a text line detection result and a single character detection result includes:
respectively carrying out text line detection and single character detection on the text image through a double-branch character detection model, predicting to obtain a text line probability map and a threshold map, and a single character probability map and a threshold map, and carrying out differentiable binarization to respectively obtain a text line binarization probability map and a single character binarization probability map;
and carrying out post-processing on the text line binarization probability map and the single character binarization probability map to obtain the text line detection result and the single character detection result.
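As an illustration of the binarization step, the sketch below uses the approximate step function commonly associated with differentiable binarization, B = 1 / (1 + exp(-k (P - T))), where P is the probability map and T is the threshold map. The formula, the amplification factor `k = 50`, and the function name are assumptions borrowed from the usual formulation; the patent text here does not state them. In the double-branch model the same operation would be applied once to the text line maps and once to the single character maps.

```python
import math
from typing import List

Grid = List[List[float]]


def differentiable_binarize(prob_map: Grid, thresh_map: Grid, k: float = 50.0) -> Grid:
    """Approximate binarization: values near 1 where P > T, near 0 where P < T.

    prob_map and thresh_map are same-shaped grids with values in [0, 1].
    k is an assumed amplification factor controlling how sharp the step is.
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]
```

Because the function is a smooth sigmoid rather than a hard threshold, gradients can flow through it during training, which is the point of making the binarization differentiable.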
In addition, in order to achieve the above object, the present invention further provides a character detection and recognition apparatus, including:
the acquisition module is used for acquiring a text image;
the character detection module is used for carrying out text line detection and single character detection on the text image and outputting a text line detection result and a single character detection result;
and the character recognition module is used for recognizing the character content based on the text line detection result and the single character detection result and generating a recognition result.
Optionally, the apparatus further comprises:
the single character detection box sorting and correction module is used for grouping the single character detection boxes in the single character detection result by text line according to the text line detection result and the single character detection result;
and sorting the single character detection boxes within each group along the text line direction in the text line detection result to generate a corrected single character detection result.
Optionally, the single character detection box sorting and correction module is further configured to:
sorting the single character detection boxes within each group along the text line direction in the text line detection result;
and detecting the size and the direction of the sorted single character detection boxes, and normalizing the size and the direction line by line to generate a corrected single character detection result.
Optionally, the single character detection box sorting and correction module is further configured to:
determining whether the text line detection result is curved text;
if so, shrinking the text line detection result to obtain a reduced text line direction;
and sorting the single character detection boxes within the group along the reduced text line direction to generate a corrected single character detection result.
Optionally, the apparatus further comprises:
the character alignment module is used for carrying out text content recognition on the text line detection result to generate a text line character recognition result;
judging whether the text line character recognition result is aligned with the position of a single character detection frame in the single character detection result;
and if so, generating a recognition result based on the text line character recognition result.
Optionally, the character alignment module is further configured to:
if not, performing character recognition based on the single character detection result to generate a single character recognition result;
generating a single character string according to the single character detection frame and the single character recognition result;
calculating the edit distance between the single character string and the corresponding text line character string in the text line character recognition result, and obtaining the corresponding edit operations;
and executing the editing operation, aligning the text line recognition result with the position of the single character detection frame, and generating a recognition result.
Optionally, the text detection module is further configured to:
respectively carrying out text line detection and single character detection on the text image through a double-branch character detection model, predicting to obtain a text line probability map and a threshold map, and a single character probability map and a threshold map, and carrying out differentiable binarization to respectively obtain a text line binarization probability map and a single character binarization probability map;
and carrying out post-processing on the text line binarization probability map and the single character binarization probability map to obtain the text line detection result and the single character detection result.
In addition, in order to achieve the above object, the present invention further provides a character detection and recognition device, including: a memory, a processor, and a character detection and recognition program stored on the memory and executable on the processor, the character detection and recognition program being configured to implement the steps of the character detection and recognition method.
In addition, to achieve the above object, the present invention further provides a computer readable storage medium, wherein a character detection and recognition program is stored on the computer readable storage medium, and when being executed by a processor, the character detection and recognition program realizes the steps of the character detection and recognition method.
In the character detection and recognition method provided by the invention, a text image is acquired, text line detection and single character detection are performed on the text image, a text line detection result and a single character detection result are output, and character content is then recognized based on the text line detection result and the single character detection result to generate a recognition result. By detecting text at two granularities, text line and single character, within one detection process, the invention readily handles the ordering and recognition of tilted text, multi-directional text, and curved text, achieves higher recognition accuracy than methods that recognize single characters only, and improves the accuracy of character detection and recognition.
Drawings
FIG. 1 is a schematic structural diagram of a character detection and recognition device in a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a character detection and recognition method according to a first embodiment of the present invention;
FIG. 3 is a flowchart illustrating the sorting of single character detection boxes according to an embodiment of the character detection and recognition method of the present invention;
FIG. 4 is a flowchart illustrating a detailed process of step S30 in an embodiment of the character detection and recognition method of the present invention;
FIG. 5 is a schematic diagram illustrating a character alignment process based on an edit distance according to an embodiment of the character detection and recognition method of the present invention;
FIG. 6 is a flowchart of single character detection and recognition according to an embodiment of the character detection and recognition method of the present invention;
FIG. 7 is a flowchart illustrating a detailed process of step S20 in an embodiment of the character detection and recognition method of the present invention;
FIG. 8 is a diagram of a character detection model according to an embodiment of the character detection and recognition method of the present invention;
FIG. 9 is a functional block diagram of an embodiment of the character detection and recognition apparatus of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a character detection and recognition device in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the character detection and recognition device may include: a processor 1001, such as a Central Processing Unit (CPU) or a Graphics Processing Unit (GPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to implement connection and communication among these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 1 is not intended to be limiting of the text detection and recognition device and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a kind of computer-readable storage medium, may include therein an operating system, a data storage module, a network communication module, a user interface module, and a character detection recognition program.
In the text detection and recognition device shown in fig. 1, the network interface 1004 is mainly used for data communication with other devices; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 of the character detection and recognition device of the present invention may be disposed in the character detection and recognition device, and the character detection and recognition device calls the character detection and recognition program stored in the memory 1005 through the processor 1001 and executes the character detection and recognition method provided by the embodiment of the present invention.
An embodiment of the present invention provides a method for detecting and identifying characters, and referring to fig. 2, fig. 2 is a schematic flow chart of a first embodiment of the method for detecting and identifying characters of the present invention.
In this embodiment, the text detection and identification method includes:
step S10, acquiring a text image;
step S20, performing text line detection and single character detection on the text image, and outputting a text line detection result and a single character detection result;
step S30, recognizing the text content based on the text line detection result and the single character detection result, and generating a recognition result.
The character detection and recognition method provided by this embodiment can be used in application scenarios such as intelligent review of document images, document comparison, and image text retrieval. Existing single character detection and recognition schemes fall mainly into two types: a general-purpose text detection model plus a character recognition model, and a single character detection model plus character image classification. The first scheme has difficulty locating single characters accurately in an image, particularly half-width characters. At the same font size, a half-width character is narrow, and in font design its width is generally half that of a full-width character. Half-width characters are mainly English letters and symbols, such as the comma, semicolon, and colon in Chinese and English text. Because they are narrow, they are easily missed or detected together with neighbouring characters. Full-width characters are wider, so they are missed less often; the full-width characters that are missed are typically those with few strokes, such as the character for "one".
In addition, when the text lines in the document image are strongly tilted or laid out along a curve, it is difficult to sort the characters by line, and the recognition accuracy of single characters is low. The second scheme, a single character detection algorithm plus character image classification, likewise suffers from inaccurate single character positions and difficult text line ordering; moreover, a Chinese document requires far more character classes than an English document, which makes single character recognition inaccurate.
Therefore, to address the poor single character detection accuracy of existing single character detection and recognition techniques, which are prone to missed and false detections in specific scenarios, a character detection and recognition method is provided.
The respective steps will be described in detail below:
step S10, acquiring a text image;
In one embodiment, a text image is first acquired. Specifically, the image to be detected, input by the user, is acquired according to the specific application scenario; it may be a scanned image, a photograph, a screenshot, or an image in another form, and the format, size, and color of the text image are not limited. To improve the accuracy of subsequent character detection, the acquired original image may be preprocessed, for example by color removal or resolution adjustment, to obtain the text image.
Step S20, performing text line detection and single character detection on the text image, and outputting a text line detection result and a single character detection result;
In one embodiment, text line detection and single character detection are performed on the text image, and the corresponding text line detection result and single character detection result are output. In the related art, a general-purpose text detection algorithm is trained on character-level annotations to detect single characters; the detected characters are then recognized and sorted by line to restore the document. However, the region occupied by a single character is much smaller than the region occupied by a whole text line, so a text line detection algorithm applied to single characters detects them poorly, and missed and false detections are common, especially for half-width characters. Therefore, in this embodiment, both text line detection and single character detection are performed on the text image, and the position information of both types, text lines and single characters, is output. The text line detection result serves as a cross-check on the single character detection result, which improves the completeness of the recognition result and addresses missed detections in single character detection. The text line detection result and the single character detection result may be text line detection boxes and single character detection boxes, respectively.
Step S30, recognizing the text content based on the text line detection result and the single character detection result, and generating a recognition result.
In one embodiment, character content is further recognized according to the text line detection result and the single character detection result to generate a recognition result. Specifically, the text line detection result (region image) output by the character detection module is recognized and its character content is output; where needed, the single character region images output by the character detection module are also recognized and their character content is output; the text line recognition result is then matched against the single character recognition result.
In this embodiment, a text image is acquired, text line detection and single character detection are performed on the text image, a text line detection result and a single character detection result are output, and character content is then recognized based on the two results to generate a recognition result. By detecting text at two different granularities, text line and single character, within one detection process, the invention readily handles the ordering and recognition of tilted text, multi-directional text, and curved text, achieves higher recognition accuracy than methods that recognize single characters only, and improves the precision of character detection and recognition.
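The control flow of steps S10 to S30 can be sketched as below. This is only a skeleton under stated assumptions: `DetectionResult`, `detect_and_recognize`, and the two callables `detect` and `recognize_region` are hypothetical placeholders for the detection model and the recognition model, which this section does not specify, and the matching of text line results against single character results is left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Polygon = List[Tuple[float, float]]


@dataclass
class DetectionResult:
    line_boxes: List[Polygon]  # one polygon per detected text line
    char_boxes: List[Polygon]  # one polygon per detected single character


def detect_and_recognize(
    image: Any,                                         # S10: the acquired text image
    detect: Callable[[Any], DetectionResult],           # hypothetical detection model
    recognize_region: Callable[[Any, Polygon], str],    # hypothetical recognition model
) -> Tuple[DetectionResult, List[str]]:
    # S20: one detection pass yields both granularities.
    detection = detect(image)
    # S30 (first half): recognize the content of each text line region.
    line_texts = [recognize_region(image, box) for box in detection.line_boxes]
    # Aligning line_texts with detection.char_boxes is a separate step.
    return detection, line_texts
```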
Further, based on the first embodiment of the character detection and identification method of the present invention, a second embodiment of the character detection and identification method of the present invention is provided.
Referring to fig. 3, fig. 3 is a schematic flow chart of sorting single character detection boxes in an embodiment of the character detection and recognition method of the present invention. In the second embodiment, after the step of performing text line detection and single character detection on the text image and outputting a text line detection result and a single character detection result, the method further includes:
step S21, grouping the single character detection boxes in the single character detection result by text line according to the text line detection result and the single character detection result;
and step S22, sorting the single character detection boxes within each group along the text line direction in the text line detection result to generate a corrected single character detection result.
In one embodiment, the single character detection boxes in the single character detection result are grouped by text line according to the text line detection result and the single character detection result, and are then sorted within each group along the text line direction in the text line detection result to generate the corrected single character detection result. Because the image may contain strongly tilted text lines or text lines in several directions, the single character boxes output by the character detection model may differ in size and direction, and each box is independent and not yet assigned to a line. Sorting them by line using the single character detection boxes alone is difficult. This scheme therefore combines the two results: the single character detection boxes are first grouped by text line and then sorted within each group along the text line direction.
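Steps S21 and S22 can be sketched as follows. This is one possible realization, not the patent's stated procedure: it assumes each text line is a quadrilateral whose corners are ordered top-left, top-right, bottom-right, bottom-left in reading orientation, assigns a single character box to the first line whose polygon contains the box center, and sorts by projecting centers onto the line direction. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def center(poly: Polygon) -> Point:
    """Mean of the polygon's vertices."""
    return (sum(p[0] for p in poly) / len(poly), sum(p[1] for p in poly) / len(poly))


def contains(poly: Polygon, pt: Point) -> bool:
    """Ray-casting point-in-polygon test."""
    x, y = pt
    inside = False
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def line_direction(quad: Polygon) -> Point:
    """Unit vector from the middle of the leading edge to the middle of the trailing edge."""
    tl, tr, br, bl = quad
    dx = (tr[0] + br[0]) / 2 - (tl[0] + bl[0]) / 2
    dy = (tr[1] + br[1]) / 2 - (tl[1] + bl[1]) / 2
    norm = math.hypot(dx, dy) or 1.0
    return (dx / norm, dy / norm)


def group_and_sort(line_quads: List[Polygon], char_boxes: List[Polygon]) -> List[List[int]]:
    """Return, for each text line, the indices of its character boxes in reading order."""
    centers = [center(box) for box in char_boxes]
    groups: List[List[int]] = [[] for _ in line_quads]
    # S21: group each character box under the text line that contains its center.
    for idx, c in enumerate(centers):
        for li, quad in enumerate(line_quads):
            if contains(quad, c):
                groups[li].append(idx)
                break
    # S22: sort each group by the projection of the center onto the line direction.
    for li, quad in enumerate(line_quads):
        ux, uy = line_direction(quad)
        groups[li].sort(key=lambda i: centers[i][0] * ux + centers[i][1] * uy)
    return groups
```

Because the sort key is a projection onto the line's own direction rather than the x coordinate, the same code orders characters correctly in tilted or vertical lines.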
Further, in an embodiment, the step of sorting the single character detection boxes within each group along the text line direction in the text line detection result to generate a corrected single character detection result includes:
step S221, sorting the single character detection boxes within each group along the text line direction in the text line detection result;
step S222, detecting the size and direction of the sorted single character detection boxes, and normalizing the size and direction line by line to generate a corrected single character detection result.
In one embodiment, the single character detection boxes within each group are sorted along the text line direction in the text line detection result, and their size and direction are normalized after sorting. In a text line made up of characters of the same size, the single character detection boxes of some characters may still come out larger or smaller than the others, or point in different directions. This is particularly common when the text line is tilted or curved. This embodiment therefore adjusts the size and direction of the single character detection boxes in the sorted line so that they are uniform, which further improves the accuracy of subsequent recognition.
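A minimal sketch of the per-line normalization in step S222 follows. The rotated-rectangle representation `CharBox`, the choice of the median as the common value, and the decision to leave widths untouched (since half-width and full-width characters legitimately differ in width) are all assumptions made for illustration; the text above only says that size and direction are unified. The sketch also ignores angle wrap-around near ±180 degrees.

```python
from dataclasses import dataclass, replace
from statistics import median
from typing import List


@dataclass(frozen=True)
class CharBox:
    cx: float         # center x
    cy: float         # center y
    width: float
    height: float
    angle_deg: float  # rotation of the box


def normalize_row(row: List[CharBox]) -> List[CharBox]:
    """Give every box in one sorted line the line's median height and angle."""
    if not row:
        return []
    row_height = median(box.height for box in row)
    row_angle = median(box.angle_deg for box in row)
    # Centers and widths are kept; only height and direction are unified.
    return [replace(box, height=row_height, angle_deg=row_angle) for box in row]
```

The median is used here so that one badly sized or badly rotated box does not pull the whole line's value, which a mean would allow.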
Further, in an embodiment, the step of sorting the single-character detection boxes within each group according to the text line direction in the text line detection result to generate a corrected single-character detection result includes:
step S223, determining whether the text line detection result is curved text;
step S224, if so, reducing the size of the text line detection result to generate a reduced text line direction;
and step S225, sorting the single-character detection boxes in the group according to the text line direction to generate a corrected single-character detection result.
In one embodiment, it is determined whether the text line detection result is curved text; if so, the text line detection result is reduced to generate a reduced text line direction, and the single-character detection boxes in the group are sorted according to that text line direction to generate the corrected single-character detection result. Curved text lines make it difficult to sort single characters and to restore their correct order from the text line. To address this, this embodiment first determines whether the text line detection result is curved text and, for curved text, reduces the text line detection result (the region image) to a center line that passes through the centers of the single characters. The single-character boxes are grouped into text lines by position and sorted along the reduced text line direction (the center line). It can be understood that a curve is composed of a series of connected points, so the boxes can be sorted by the positions of the single characters along the curve to generate the corrected single-character detection result.
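Sorting along a curve can be sketched as below, assuming the center line has already been extracted and is available as a polyline (an ordered list of points). How the region is reduced to that center line is not shown; `arc_position` and `sort_along_centerline` are hypothetical names. Each character center is projected onto its nearest polyline segment and the characters are ordered by arc length from the start of the line.

```python
import math


def arc_position(point, centerline):
    """Arc length along the polyline of the point on it that is closest to `point`."""
    px, py = point
    best_dist, best_s = float("inf"), 0.0
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(centerline, centerline[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len = math.hypot(dx, dy)
        if seg_len == 0:
            continue
        # Parameter of the orthogonal projection, clamped to the segment.
        t = ((px - x1) * dx + (py - y1) * dy) / (seg_len ** 2)
        t = max(0.0, min(1.0, t))
        qx, qy = x1 + t * dx, y1 + t * dy
        dist = math.hypot(px - qx, py - qy)
        if dist < best_dist:
            best_dist, best_s = dist, travelled + t * seg_len
        travelled += seg_len
    return best_s


def sort_along_centerline(centers, centerline):
    """Order character centers by their position along the curved center line."""
    return sorted(centers, key=lambda c: arc_position(c, centerline))
```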
In this embodiment, the single-character detection boxes are grouped by text line, sorted within each group along the text line direction, and further normalized in size and direction line by line. This addresses the problem that, when an image contains text lines with a large inclination angle or text lines in multiple directions, the single-character boxes detected by the character detection model may differ in size and direction.
Further, based on the previous embodiment of the character detection and recognition method of the present invention, a third embodiment of the character detection and recognition method of the present invention is provided.
Referring to fig. 4, fig. 4 is a schematic view illustrating a detailed flow of step S30 in an embodiment of the text detection and recognition method of the present invention, in a third embodiment, the step of performing text content recognition based on the text line detection result and the single character detection result and generating a recognition result includes:
step S31, recognizing the text content of the text line detection result to generate a text line character recognition result;
step S32, judging whether the text line character recognition result is aligned with the position of the single character detection box in the single character detection result;
in step S33, if yes, a recognition result is generated based on the text line character recognition result.
This scheme provides a character alignment method based on edit distance to handle false detections, missed detections and similar single-character errors caused by extreme conditions such as non-monospaced fonts, small characters and half-width symbols.
Further, in an embodiment, after the step of determining whether the text line character recognition result is aligned with the position of the single character detection box in the single character detection result, the method includes:
step S34, if not, character recognition is carried out based on the single character detection result to generate a single character recognition result;
In one embodiment, if the text line recognition result is not aligned with the single-character detection boxes, the single-character detection result (the character regions) is first recognized to generate a single-character recognition result. The character recognition method is not limited here and may, for example, be implemented with a deep learning algorithm.
Step S35, generating a single character string according to the single character detection frame and the single character recognition result;
In one embodiment, a single-character string is generated from the single-character detection boxes and the single-character recognition result. It will be appreciated that the single-character recognition results and the single-character detection boxes may not correspond one to one. For example, if the recognition result for one single-character detection box contains two characters, that box is split into two single-character boxes according to character width, and the characters are then spliced line by line into a single-character string.
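The splitting and splicing step might look like the following sketch. It assumes axis-aligned boxes given as `(x, y, w, h)` tuples in reading order, splits a box into equal-width parts (the patent only says "according to character width"), and lets an empty recognition result contribute nothing. The function name `build_char_string` is hypothetical.

```python
def build_char_string(boxes, texts):
    """Split multi-character boxes by width and splice the characters into one string.

    boxes: list of (x, y, w, h) tuples in reading order.
    texts: recognition result for each box, same order as `boxes`.
    """
    split_boxes, chars = [], []
    for (x, y, w, h), text in zip(boxes, texts):
        n = len(text)
        for i, ch in enumerate(text):
            # Each character gets an equal share of the original box width.
            split_boxes.append((x + i * w / n, y, w / n, h))
            chars.append(ch)
    return split_boxes, "".join(chars)
```

After this step every character in the returned string has exactly one box, which is what the edit-distance comparison in the next step relies on.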
Step S36, calculating the edit distance between the single-character string and the corresponding text line string in the text line character detection result, and obtaining the corresponding edit operations;
In one embodiment, the edit distance between a single-character string and its corresponding text line string is calculated, and the corresponding edit operations are obtained. That is, the single-character string is compared with the text line string, and the corresponding edit operations are selected according to the comparison result to correct the recognition result. Edit distance is a string metric that measures the difference between two character sequences: the edit distance between two strings is the minimum number of single-character edits (insertions, deletions or substitutions) required to convert one into the other. Generally, the smaller the edit distance, the more similar the two strings. Specifically, the edit distance may be calculated with the Levenshtein distance algorithm.
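A minimal sketch of the Levenshtein computation with a backtrace that recovers the edit operations is given below. The function name `edit_operations` and the operation labels (`"keep"`, `"replace"`, `"insert"`, `"delete"`) are illustrative choices, not taken from the patent. Here `source` plays the role of the single-character string and `target` the text line string.

```python
def edit_operations(source, target):
    """Return (distance, ops) that turn `source` into `target`.

    ops is a list of (op, source_char, target_char) tuples in reading order,
    where op is "keep", "replace", "insert" or "delete".
    """
    m, n = len(source), len(target)
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i
    for j in range(n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # delete from source
                             dist[i][j - 1] + 1,         # insert into source
                             dist[i - 1][j - 1] + cost)  # keep or replace

    # Walk back from the bottom-right cell to recover one optimal edit script.
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and source[i - 1] == target[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (0 if same else 1):
            ops.append(("keep" if same else "replace", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", source[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", None, target[j - 1]))
            j -= 1
    ops.reverse()
    return dist[m][n], ops
```

When several optimal scripts exist, this backtrace prefers keep/replace over delete and delete over insert; a different tie-break would give an equally short script.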
And step S37, executing the editing operation, aligning the text line recognition result with the position of the single character detection box, and generating a recognition result.
In one embodiment, a different processing procedure is executed for each edit operation, and the text line recognition result is aligned with the positions of the single-character detection boxes to generate the recognition result. The edit operations are handled as follows:
First: if a character needs to be inserted into the single-character string, the character is inserted, and a corresponding number of single-character detection boxes are inserted between the two adjacent single-character detection boxes.
Second: if a character needs to be replaced, or no operation is needed, the single-character detection box is retained without modification.
Third: if a character in the single-character string needs to be deleted, the corresponding single-character detection box is deleted.
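The three cases above can be sketched as follows. This is an illustration under several assumptions not stated in the patent: boxes are axis-aligned `(x, y, w, h)` tuples on a horizontal line, the operations arrive as `(op, source_char, target_char)` tuples with `op` in `"keep"`, `"replace"`, `"insert"`, `"delete"`, inserted boxes evenly share the gap between their neighbors, and the output character is taken from the text line string. The name `apply_edit_operations` is hypothetical.

```python
def apply_edit_operations(ops, boxes):
    """Align text-line characters with boxes by applying edit operations.

    Returns a list of (char, box) pairs, one per character of the text line string.
    """
    # Pass 1: keep/replace retain the box, delete drops it, insert leaves a gap.
    slots = []
    k = 0
    for op, _src, tgt in ops:
        if op in ("keep", "replace"):
            slots.append([tgt, boxes[k]])
            k += 1
        elif op == "delete":
            k += 1
        elif op == "insert":
            slots.append([tgt, None])
        else:
            raise ValueError(f"unknown edit operation: {op}")

    # Pass 2: fill each run of gaps using the nearest real neighbors.
    i = 0
    while i < len(slots):
        if slots[i][1] is not None:
            i += 1
            continue
        j = i
        while j < len(slots) and slots[j][1] is None:
            j += 1
        left = slots[i - 1][1] if i > 0 else None
        right = slots[j][1] if j < len(slots) else None
        ref = left or right
        if ref is None:
            raise ValueError("no detected box available to place inserted characters")
        count = j - i
        if left and right:
            start = left[0] + left[2]
            width = (right[0] - start) / count  # share the gap evenly
        elif left:
            start, width = left[0] + left[2], left[2]
        else:
            width = right[2]
            start = right[0] - count * width
        for n in range(count):
            slots[i + n][1] = (start + n * width, ref[1], width, ref[3])
        i = j
    return [(ch, box) for ch, box in slots]
```

If neighboring boxes touch or overlap, the evenly shared gap has zero or negative width; a real implementation would need a fallback for that case.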
Referring to fig. 5, fig. 5 is a schematic diagram of a character alignment process based on edit distance in an embodiment of the text detection and recognition method of the present invention, in which the module first determines whether the line text recognition result is aligned with the position of the single character detection box according to the text line recognition result and the single character detection result, and directly outputs the result if each character is aligned.
Referring to fig. 6, which summarizes the character detection and recognition method of the present invention, fig. 6 is a flow chart of character detection and recognition in an embodiment of the method. The flow is as follows: a text image is input; single-character detection is performed and a single-character detection result is output; the single-character detection boxes are organized according to the text line detection result and sorted within each text line, then rotated, enlarged and aligned after sorting; character recognition is performed on the text line image region (i.e., the text line detection result); and it is determined whether the single-character detection boxes and the text line recognition result are aligned. If so, the matching result of the single characters and the text line (i.e., the recognition result) is output directly. If not, character recognition is performed on the single-character image regions (i.e., the single-character detection result), the single-character detection boxes are aligned with the text line recognition result by character matching based on edit distance, and the result is output. Text line detection and single-character detection are completed simultaneously in one text detection model, which improves the precision of text recognition. Combining the text line detection result with the single-character detection result enables correct ordering of single characters in irregular text such as inclined, curved and multidirectional text. Character alignment between the single-character detection result and the text line recognition result is achieved by combining the text line recognition result, the single-character detection result and the single-character recognition result.
In this embodiment, whether the text line recognition result is aligned with the positions of the single-character detection boxes is determined from the text line recognition result and the single-character detection result. If the characters are aligned, the result is output directly. If they are not, this scheme applies a character alignment method based on edit distance: the edit distance between the single-character string and the corresponding text line string in the text line character detection result is calculated, the corresponding edit operations are obtained, and the text recognition result is aligned with the positions of the single-character detection boxes, which further improves recognition precision.
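The control flow of this embodiment can be outlined as below. The patent does not state how "aligned" is tested; comparing the number of recognized characters with the number of boxes is an assumption made purely for this sketch. The recognizers and the aligner are passed in as callables, and all names (`recognize_with_alignment`, `recognize_line`, `recognize_chars`, `align_by_edit_distance`) are hypothetical.

```python
def recognize_with_alignment(line_image, char_boxes,
                             recognize_line, recognize_chars, align_by_edit_distance):
    """Recognize one text line and pair each recognized character with a box."""
    line_text = recognize_line(line_image)

    # Assumed alignment test: one recognized character per detected box.
    if len(line_text) == len(char_boxes):
        return list(zip(line_text, char_boxes))

    # Not aligned: recognize the single-character regions, then let the
    # edit-distance based aligner reconcile the two results.
    char_texts = recognize_chars(char_boxes)
    return align_by_edit_distance(line_text, char_texts, char_boxes)
```

Single-character recognition is only invoked on the misaligned path, which matches the order of steps S31 to S37.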
Further, based on the previous embodiment of the character detection and recognition method of the present invention, a fourth embodiment of the character detection and recognition method of the present invention is provided.
Referring to fig. 7, fig. 7 is a schematic view illustrating a detailed flow of step S20 in an embodiment of the character detection and recognition method of the present invention. In the fourth embodiment, the step of performing text line detection and single character detection on the text image and outputting a text line detection result and a single character detection result includes:
step S23, respectively carrying out text line detection and single character detection on the text image through a double-branch character detection model, predicting to obtain a text line probability map and a threshold value map, and a single character probability map and a threshold value map, and carrying out differentiable binarization to respectively obtain a text line binarization probability map and a single character binarization probability map;
and step S24, carrying out post-processing on the text line binarization probability map and the single character binarization probability map to obtain a text line detection result and a single character detection result.
In one embodiment, the text detection model used for detection adopts a double-branch structure and outputs a text line detection result and a single-character detection result simultaneously from one model. It can be understood that if only single characters are detected, the resulting single-character detections are usually unordered, which makes it difficult to sort the characters within a text line; in this embodiment, text line detection and single-character detection are therefore performed simultaneously through a double-branch structure. As shown in fig. 8, fig. 8 is a structural diagram of the text detection model in an embodiment of the character detection and recognition method of the present invention. The model adopts a ResNet residual neural network as the backbone network, combined with an FPN (Feature Pyramid Network) module, to extract rich visual features from the character image to be detected. Furthermore, in order to detect characters at both the text line level and the single-character level in one model, the model adopts a double-branch structure. Similar to DBNet (a text detection network), each branch predicts a probability map and a threshold map and performs differentiable binarization (DB), yielding a text line binarization probability map (Text line map) and a single-character binarization probability map (Text char map) respectively. The text line detection boxes and the single-character detection boxes are then obtained through simple post-processing. The post-processing calculates a minimum circumscribed rectangle for each text (character) instance (a single connected domain of white pixels) in the Text line map and the Text char map; this minimum circumscribed rectangle is the text line/character detection box (result).
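As an illustration of the differentiable binarization step, the DBNet formulation computes each output pixel as B = 1 / (1 + exp(-k (P - T))), where P is the probability map, T the threshold map and k an amplifying factor (50 in the DBNet paper). The patent does not give its own formula or value of k, so the sketch below simply assumes the DBNet form; `differentiable_binarization` is a hypothetical name and the maps are plain nested lists to keep the example dependency-free.

```python
import math


def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Apply B = 1 / (1 + exp(-k * (P - T))) element-wise to two equally sized maps."""
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]
```

In the double-branch model described above, the same function would be applied once to the text line branch and once to the single-character branch, each with its own probability and threshold maps.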
It should be noted that the parameters such as the pixel size, the shortest side, etc. in the post-processing can be adjusted according to the actual situation.
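The post-processing and its adjustable parameters can be sketched as follows. For simplicity this sketch returns axis-aligned bounding boxes of 4-connected components rather than true minimum circumscribed (possibly rotated) rectangles, which is a deliberate simplification of the step the text describes. The parameter names `min_area` and `min_side` are hypothetical stand-ins for the "pixel size" and "shortest side" settings.

```python
from collections import deque


def extract_boxes(binary_map, min_area=1, min_side=1):
    """Return (x, y, w, h) boxes for connected regions of non-zero pixels.

    Regions with fewer than `min_area` pixels, or whose shorter side is
    below `min_side`, are discarded as noise.
    """
    rows = len(binary_map)
    cols = len(binary_map[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary_map[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one 4-connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0 = r1 = r
            c0 = c1 = c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary_map[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            w, h = c1 - c0 + 1, r1 - r0 + 1
            if area >= min_area and min(w, h) >= min_side:
                boxes.append((c0, r0, w, h))
    return boxes
```

Raising `min_area` or `min_side` filters out small spurious regions at the cost of possibly dropping very small characters, which is why these values are left adjustable.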
In the embodiment, the character detection model is designed by combining a downstream character recognition task, and two levels of character detection of a text line and a single character are simultaneously performed through a double-branch structure, so that a text line detection result and a single character detection result are generated, and the accuracy of subsequent character recognition is improved.
The invention also provides a character detection and identification device. As shown in fig. 9, fig. 9 is a schematic diagram of functional modules of an embodiment of the text detection and recognition device of the present invention.
The character detection and recognition device of the present invention comprises:
an obtaining module 10, configured to obtain a text image;
the character detection module 20 is configured to perform text line detection and single character detection on the text image, and output a text line detection result and a single character detection result;
and the character recognition module 30 is configured to perform character content recognition based on the text line detection result and the single character detection result, and generate a recognition result.
Optionally, the apparatus further comprises:
the single-character detection box sorting and correction module is configured to group the single-character detection boxes in the single-character detection result according to the text line detection result and the single-character detection result;
and sort the single-character detection boxes within each group according to the text line direction in the text line detection result to generate a corrected single-character detection result.
Optionally, the single-character detection box sorting and correction module is further configured to:
sort the single-character detection boxes within each group according to the text line direction in the text line detection result;
and detect the size and direction of the sorted single-character detection boxes, and normalize the size and direction line by line to generate a corrected single-character detection result.
Optionally, the single-character detection box sorting and correction module is further configured to:
determine whether the text line detection result is curved text;
if so, reduce the size of the text line detection result to generate a reduced text line direction;
and sort the single-character detection boxes in the group according to the text line direction to generate a corrected single-character detection result.
Optionally, the apparatus further comprises:
the character alignment module is used for carrying out text content recognition on the text line detection result to generate a text line character recognition result;
judging whether the text line character recognition result is aligned with the position of a single character detection frame in the single character detection result;
and if so, generating a recognition result based on the text line character recognition result.
Optionally, the character alignment module is further configured to:
performing text content recognition on the text line detection result to generate a text line character recognition result;
judging whether the text line character recognition result is aligned with the position of a single character detection frame in the single character detection result;
and if so, generating a recognition result based on the text line character recognition result.
Optionally, the character detection module is further configured to:
respectively carrying out text line detection and single character detection on the text image through a double-branch character detection model, predicting to obtain a text line probability map and a threshold map, and a single character probability map and a threshold map, and carrying out differentiable binarization to respectively obtain a text line binarization probability map and a single character binarization probability map;
and carrying out post-processing on the text line binarization probability map and the single character binarization probability map to obtain the text line detection result and the single character detection result.
The invention also provides a computer readable storage medium.
The computer readable storage medium of the present invention stores a character detection and recognition program, and the character detection and recognition program, when executed by a processor, implements the steps of the character detection and recognition method as described above.
The method implemented when the character detection and recognition program running on the processor is executed may refer to each embodiment of the character detection and recognition method of the present invention, and is not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another like element in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are only for description, and do not represent the advantages and disadvantages of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a computer-readable storage medium (such as ROM/RAM, magnetic disk, optical disk) as described above, and includes several instructions for enabling a terminal device (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention, and all equivalent structures or equivalent processes performed by the present invention or directly or indirectly applied to other related technical fields are also included in the scope of the present invention.

Claims (10)

1. A character detection and identification method is characterized by comprising the following steps:
acquiring a text image;
performing text line detection and single character detection on the text image, and outputting a text line detection result and a single character detection result;
and identifying the text content based on the text line detection result and the single character detection result, and generating an identification result.
2. The character detection and recognition method of claim 1, wherein after the step of performing text line detection and single character detection on the text image and outputting the text line detection result and the single character detection result, the method further comprises:
grouping the single-character detection boxes in the single-character detection result according to the text line detection result and the single-character detection result;
and sorting the single-character detection boxes within each group according to the text line direction in the text line detection result to generate a corrected single-character detection result.
3. The method of claim 2, wherein the step of sorting the single-character detection boxes within each group according to the text line direction in the text line detection result to generate the corrected single-character detection result comprises:
sorting the single-character detection boxes within each group according to the text line direction in the text line detection result;
and detecting the size and direction of the sorted single-character detection boxes, and normalizing the size and direction line by line to generate a corrected single-character detection result.
4. The method of claim 2, wherein the step of sorting the single-character detection boxes within each group according to the text line direction in the text line detection result to generate the corrected single-character detection result comprises:
determining whether the text line detection result is curved text;
if so, reducing the size of the text line detection result to generate a reduced text line direction;
and sorting the single-character detection boxes in the group according to the text line direction to generate a corrected single-character detection result.
5. The method of claim 1, wherein the step of performing text content recognition based on the text line detection result and the single character detection result and generating a recognition result comprises:
performing text content recognition on the text line detection result to generate a text line character recognition result;
judging whether the text line character recognition result is aligned with the position of a single character detection frame in the single character detection result;
and if so, generating a recognition result based on the text line character recognition result.
6. The method of claim 5, wherein after the step of determining whether the text line character recognition result is aligned with the position of the single-character detection box in the single-character detection result, the method further comprises:
if not, performing character recognition based on the single character detection result to generate a single character recognition result;
generating a single character string according to the single character detection frame and the single character recognition result;
calculating the edit distance between the single character string and the corresponding text line character string in the text line character detection result, and acquiring the corresponding edit operation;
and executing the edit operation, aligning the text line recognition result with the position of the single character detection box, and generating a recognition result.
7. The character detection and recognition method of claim 1, wherein the step of performing text line detection and single character detection on the text image and outputting a text line detection result and a single character detection result comprises:
respectively carrying out text line detection and single character detection on the text image through a double-branch character detection model, predicting to obtain a text line probability map and a threshold map, and a single character probability map and a threshold map, and carrying out differentiable binarization to respectively obtain a text line binarization probability map and a single character binarization probability map;
and carrying out post-processing on the text line binarization probability map and the single character binarization probability map to obtain the text line detection result and the single character detection result.
8. A character detection and recognition apparatus, comprising:
the acquisition module is used for acquiring a text image;
the character detection module is used for carrying out text line detection and single character detection on the text image and outputting a text line detection result and a single character detection result;
and the character recognition module is used for recognizing the character content based on the text line detection result and the single character detection result and generating a recognition result.
9. A character detection and recognition apparatus, comprising: a memory, a processor, and a character detection and recognition program stored on the memory and executable on the processor, the character detection and recognition program being configured to implement the steps of the character detection and recognition method of any one of claims 1 to 7.
10. A computer-readable storage medium, wherein a character detection and recognition program is stored on the computer-readable storage medium, and when executed by a processor, the program implements the steps of the character detection and recognition method according to any one of claims 1 to 7.
CN202210682544.5A 2022-06-16 2022-06-16 Character detection and identification method, device and equipment and computer readable storage medium Pending CN115100672A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210682544.5A CN115100672A (en) 2022-06-16 2022-06-16 Character detection and identification method, device and equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210682544.5A CN115100672A (en) 2022-06-16 2022-06-16 Character detection and identification method, device and equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN115100672A true CN115100672A (en) 2022-09-23

Family

ID=83290937

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210682544.5A Pending CN115100672A (en) 2022-06-16 2022-06-16 Character detection and identification method, device and equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN115100672A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination