CN111814781B - Method, apparatus and storage medium for correcting image block recognition result - Google Patents

Method, apparatus and storage medium for correcting image block recognition result Download PDF

Info

Publication number
CN111814781B
CN111814781B CN201910288895.6A CN201910288895A CN111814781B CN 111814781 B CN111814781 B CN 111814781B CN 201910288895 A CN201910288895 A CN 201910288895A CN 111814781 B CN111814781 B CN 111814781B
Authority
CN
China
Prior art keywords
tree
nodes
recognition result
candidate matrix
lcs
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910288895.6A
Other languages
Chinese (zh)
Other versions
CN111814781A (en
Inventor
夏小洁
孙俊
于小亿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to CN201910288895.6A priority Critical patent/CN111814781B/en
Priority to JP2020066804A priority patent/JP7487532B2/en
Publication of CN111814781A publication Critical patent/CN111814781A/en
Application granted granted Critical
Publication of CN111814781B publication Critical patent/CN111814781B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method and device for correcting an identification result of an image block and a storage medium. The method comprises the following steps: obtaining a candidate matrix of the recognition result, each column of the candidate matrix representing a plurality of candidates of the recognition result of the corresponding image block; determining a range of nodes to search in a tree constructed based on a measure of the difference between the content contained by each pair of nodes in the tree; and correcting the recognition result by matching contents contained in all nodes within the determined range with the candidate matrix.

Description

Method, apparatus and storage medium for correcting image block recognition result
Technical Field
The present disclosure relates to the field of image correction, and in particular to a method of correcting the recognition result of an image block.
Background
OCR (optical character recognition) technology is widely used in industries such as postal service, finance, insurance, tax, etc., and facilitates improvement of industrial and life efficiency. The accurate and automatically generated text recognition result can provide more information, and labor force is saved. The original text image after preprocessing can be identified by utilizing a general OCR engine, and a rough identification result is obtained.
Disclosure of Invention
The following presents a simplified summary of the disclosure in order to provide a basic understanding of some aspects of the disclosure. It should be understood that this summary is not an exhaustive overview of the disclosure. It is not intended to identify key or critical elements of the disclosure or to delineate the scope of the disclosure. Its purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is discussed later.
According to an aspect of the present invention, there is provided a method for correcting a recognition result of an image block, including: obtaining a candidate matrix of the recognition result, wherein each column of the candidate matrix represents a plurality of candidates of the recognition result of the corresponding image block; determining a range of nodes in a tree to search, wherein the tree is constructed based on a measure of the difference between the content contained by each pair of nodes in the tree; and correcting the recognition result by matching contents contained in all nodes within the determined range with the candidate matrix.
According to another aspect of the present invention, there is provided an apparatus for correcting a recognition result of an image block, comprising: obtaining means configured to obtain a candidate matrix of the recognition result, wherein each column of the candidate matrix represents a plurality of candidates of the recognition result of the corresponding image block; determining means configured to determine a range of nodes to be searched in a tree, wherein the tree is constructed based on a measure of a difference between contents contained in each pair of nodes in the tree; and correction means configured to correct the recognition result by matching the contents contained in all the nodes within the determined range with the candidate matrix.
According to other aspects of the invention, corresponding computer program code, computer readable storage medium and computer program product are also provided.
By the method and the device for correcting the identification result of the image block, the correction of the image identification result is improved, the correction speed is increased, and the image identification precision is improved.
These and other advantages of the present invention will become more apparent from the following detailed description of the preferred embodiments of the present invention, taken in conjunction with the accompanying drawings.
Drawings
To further clarify the above and other advantages and features of the present disclosure, a more particular description of the disclosure will be rendered by reference to the appended drawings. The accompanying drawings are incorporated in and form a part of this specification, together with the detailed description below. Elements having the same function and structure are denoted by the same reference numerals. It is appreciated that these drawings depict only typical examples of the disclosure and are not therefore to be considered limiting of its scope. In the drawings:
FIG. 1A shows an example of obtaining portions of characters in an address image using an over-segmentation method;
FIG. 1B illustrates an example of a coarse recognition result obtained using a beam search algorithm;
FIG. 2 illustrates an example of a recognition result matrix with multiple candidates obtained by an OCR engine;
FIG. 3 is a flow chart of a method for correcting the recognition result of an image block according to one embodiment of the present invention;
FIG. 4 schematically illustrates the structure of a BK tree;
FIG. 5 schematically illustrates how a particular search range in a BK tree is determined;
FIG. 6 is a block diagram of an apparatus for correcting the recognition result of an image block according to one embodiment of the present invention; and
Fig. 7 is a block diagram of an exemplary architecture of a general-purpose personal computer in which methods and/or apparatus according to embodiments of the present invention may be implemented.
Detailed Description
Exemplary embodiments of the present disclosure will be described hereinafter with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual implementation are described in this specification. It will of course be appreciated that in the development of any such actual embodiment, numerous implementation-specific decisions must be made in order to achieve the developer's specific goals, such as compliance with system-and business-related constraints, and that these constraints will vary from one implementation to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking for those of ordinary skill in the art having the benefit of this disclosure.
It is also noted herein that, in order to avoid obscuring the disclosure with unnecessary details, only the device structures and/or processing steps closely related to the solution according to the present disclosure are shown in the drawings, while other details not greatly related to the present disclosure are omitted.
As previously described, with an OCR engine, a rough recognition result of an image can be obtained. However, the recognition results are often not particularly accurate, limited by the performance of the OCR engine. Therefore, it is desirable to automatically correct the recognition result in order to obtain a more accurate recognition result and to save labor.
The present invention improves image recognition accuracy by proposing a method for correcting OCR recognition results such as text in an image. In particular, the method according to the invention applies a distance measure to compare the similarity of recognition results to existing text, while accelerating the search process in a large text library during the correction process.
Briefly, the correction method according to the invention comprises the following three phases: (1) Obtaining, by an OCR engine, a recognition result of an image, such as an address, dividing the address image into image blocks, and providing a plurality of character candidates for each image block; (2) building a tree to store an existing exact text library; (3) Searching the best matching recognition text in the constructed tree to obtain a final correction result.
A method 300 according to one embodiment of the invention will be described in detail below in conjunction with fig. 3.
The method 300 starts in step 301 with obtaining a candidate matrix of the recognition result, wherein each column of the candidate matrix represents a plurality of candidates of the recognition result of the corresponding image block.
Specifically, in the present embodiment, a rough recognition result of an image may be obtained using a general OCR engine. OCR engines are designed based on the over-segmentation method and the beam-Search (beam-Search) method. In order to facilitate the understanding of the present invention, the basic principles of these two methods are briefly described below.
Over-segmentation method
Oversegregation refers to segmentation of a string into primitive segments and combining the primitive segments into characters that combine character recognition and context. It is generally divided into two steps: connected component labeling and sticky character segmentation. First, the address image must be preprocessed, such as noise reduction, normalization, and binarization. Then, the connected component of the address image can be obtained. By analyzing these connected components and contour lines, the over-segmentation method can be used to obtain the portions of the individual characters in the address image, as shown in fig. 1A. Each segment may be identified, for example, by a trained Convolutional Neural Network (CNN) model.
Beam search algorithm
After the corresponding identification results of all segments of the address image and the CNN model are obtained, the combined result and the final result can be obtained through a beam search algorithm. The beam search algorithm is a path evaluation and search algorithm. The path evaluation function is based on bayesian decisions that integrate a variety of contexts, including character classification, geometric context, and linguistic context. Different combining styles correspond to different paths. An improved beam search algorithm divides the pruning strategy into two phases so that the path with the largest path evaluation score is effectively found. And finally obtaining a final recognition result through the path with the maximum score, as shown in fig. 1B.
Fig. 2 shows an example of the candidate matrix in step 301 obtained by the above method. For each image block, a plurality of character candidates corresponding to each column in the matrix shown in fig. 2 will be provided.
Preferably, the individual candidates in each column of the candidate matrix are arranged with confidence from high to low.
Next, in step 302, a range of nodes in the tree to search is determined, wherein the tree is constructed based on a measure of the difference between the content contained by each pair of nodes in the tree.
Specifically, in this embodiment, a tree may be constructed, for example, based on a BK tree structure, to store the existing accurate truth text.
Those skilled in the art will appreciate that pre-existing domain knowledge or truth text candidates are generally easier to collect in different OCR application scenarios. For example, if the task is to identify company names on an invoice, receipt, all registered legal company names may be obtained from the tax department; if the task is to identify the address on the courier sheet, envelope, all the exact address entries can be obtained from the official postal system.
In this embodiment, using the resulting domain knowledge, the BK tree structure may preferably be built based on the longest common subsequence (Longest Common Subsequence, LCS), while enabling the speed of BK tree-based searches to be increased. In order to facilitate the understanding of the present invention, the basic principle of the B tree is briefly described below.
BK tree
BK tree was proposed by Walter Austin Burkhard and Robert M.Keller and is therefore also known as Burkhard-KELLER TREE. It is mainly used for spelling correction, fuzzy matching, string approximation comparison in dictionary, etc. Distance metrics d (x, y) are typically used to calculate distances between adjacent nodes of the BK tree. The distance metric most commonly used in BK trees is levenshtein distances. This distance is also called edit distance, i.e. a string distance measure, for comparing two character sequences. The edit distance represents the minimum number of steps by which two character strings composed of single characters are converted into each other by insertion, deletion, or substitution.
Expressed by a formula, the edit distance between two character strings a, b (character lengths are |a| and |b|), respectively, may be represented by ED a,b (|a|, |b|) as follows:
when building a BK tree, a root node is first selected, which may be any element a. The distance between the node to be inserted and the root node is then calculated. The distance between all elements of the k-th subtree under a certain node and the node element is k. Fig. 4 shows a simple BK tree structure.
According to a preferred embodiment, the BK tree may be constructed using, for example, the longest common subsequence. In order to facilitate the understanding of the present invention, the longest common subsequence is briefly described below.
Longest common subsequence
LCS, the longest common subsequence, is used to find a set of sequences (typically only two sequences) of the longest common subsequence of all sequences. Unlike the longest common substring (Longest Common Substring), the positions of consecutive sub-sequences are not necessarily identical in the original sequence. In this embodiment, LCS is used to compare any node element in the BK tree structure with a particular string.
For example, two sequences are defined as follows: x= (X 1,x2,…,xm) and y= (Y 1,y2,…,yn). The prefix of X may be denoted as X 1,2,...,m and the prefix of Y may be denoted as Y1,2,...,n. The set of longest common subsequences obtained with prefixes X i and Y j is denoted by LCS (X i,Yj). The set may be calculated by the following formula:
to find the longest common subsequence of X i and Y j, elements X i and Y j are first compared. If they are equal, the LCS (X i,Yj) can be expressed as LCS (X i-1,Yj-1) plus X i. If not, then LCS (X i,Yj) is the larger of LCS (X i,Yj-1) and LCS (X i-1,Yj).
LCS (X i,Yj) is recorded by a two-dimensional array of numbers C [ i ] [ j ]. The recursive formula for C [ i ] [ j ] can be expressed as:
As previously described, some OCR application scenarios of prior knowledge domain may be collected. Taking japanese address handwriting recognition as an example, first, the trunk address of the entire japanese official post office is collected as a truth text address library. They are different address entries such as "central region of sapporo city in Hokkaido, cover mountain and West", " Yu, jinshan city, yu Zhu, xiong Ben Yu name county Yudong original bin", etc. An arbitrary address string is then selected as the root node. Then, the LCS distance between the next inserted address string and the root node is calculated. And so on, all address entries in the address library are sequentially formed into the final BK tree.
How the search range in the BK tree is determined in step 302 is described in detail below.
As described above, the original OCR engine will split the recognition text image into blocks, each with multiple recognition candidates. The number of identified candidates is denoted by k. Of the k candidates, the higher the candidate character rank (rank), the greater the likelihood of correctness.
The weighted LCS distance between the recognition result matrix A of a plurality of candidates and the specific character string b in the BK tree is recorded by a two-dimensional array column C [ i ] [ j ]. The recursive formula is as follows:
Where f (i, j) is a weight of LCS length. According to equations (4) and (5), if there is a matching character in the recognition result matrix a and the likelihood of the character is high, the LCS length will be weighted near 1 accordingly. In contrast, if there are no matching characters, the weight of the LCS length will be very low, here given as an example a value-999999.
After the multi-character candidate recognition result of each recognition block of the original OCR engine is acquired, the best matching character string may be searched among nodes within a specific range of the BK tree as a correction result. Fig. 5 illustrates how a particular search range in a BK tree is determined.
Specifically, in the present embodiment, the step of determining the search range is as follows (1) setting a search distance threshold n, where n is a positive integer, such as 5; (2) Calculating the LCS length between the candidate recognition result matrix A and the root node of the BK tree; (3) And adding nodes with LCS lengths d (A, B) more than or equal to n-d between the father node of all the child nodes of the root node and the child nodes thereof into the search range.
Those skilled in the art will appreciate that due to the setting of n, many child nodes and subtrees can be removed during the search process, which makes the entire query process traverse no more than 5% to 8% of all nodes, and thus is far more efficient than violent enumeration.
Finally, in step 303, the recognition result is corrected by matching the content contained in all nodes within the determined range with the candidate matrix.
Specifically, in the present embodiment, the steps (2) and (3) for determining the search range described above are repeated until the BK tree is ended. Then, the candidates of the search result are ranked, and the larger the LCS length is, the higher the corresponding matching degree is. In this way, the text that best matches can be found as the correction result.
The methods discussed above may be implemented entirely by a computer executable program, or may be implemented partially or entirely using hardware and/or firmware. When it is implemented in hardware and/or firmware, or when a computer-executable program is loaded into a hardware device that can run the program, a device for correcting the recognition result of an image block, which will be described later, is implemented. Hereinafter, an overview of these devices is given without repeating some of the details that have been discussed above, but it should be noted that while these devices may perform the methods described previously, the methods do not necessarily employ or are not necessarily performed by those components of the described devices.
Fig. 6 shows an apparatus 600 for correcting a recognition result of an image block according to an embodiment of the invention, comprising obtaining means 601, determining means 602 and correcting means 603. Wherein the obtaining means 601 is configured to obtain a candidate matrix of the identification result, where each column of the candidate matrix represents a plurality of candidates of the identification result of the corresponding image block; determining means 602 for determining a range of nodes in a tree to be searched, wherein the tree is constructed based on a measure of a difference between content contained by each pair of nodes in the tree; and correction means 603 for correcting the recognition result by matching the content contained in all nodes within the determined range with the candidate matrix.
The apparatus 600 for correcting the recognition result of the image block shown in fig. 6 corresponds to the method 300 shown in fig. 3. Accordingly, relevant details of each device in the apparatus 600 for correcting the recognition result of the image block are already given in the description of the method 300 for correcting the recognition result of the image block of fig. 3, and are not repeated here.
The individual constituent modules, units in the apparatus described above may be configured by means of software, firmware, hardware or a combination thereof. The specific means or manner in which the configuration may be used is well known to those skilled in the art and will not be described in detail herein. In the case of implementation by software or firmware, a program constituting the software is installed from a storage medium or a network to a computer (for example, a general-purpose computer 700 shown in fig. 7) having a dedicated hardware structure, and the computer can execute various functions and the like when various programs are installed.
Fig. 7 is a block diagram of an exemplary architecture of a general-purpose personal computer in which methods and/or apparatus according to embodiments of the present invention may be implemented. As shown in fig. 7, a Central Processing Unit (CPU) 701 executes various processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 to a Random Access Memory (RAM) 703. In the RAM 703, data required when the CPU 701 executes various processes and the like is also stored as needed. The CPU 701, ROM 702, and RAM 703 are connected to each other via a bus 704. An input/output interface 705 is also connected to the bus 704.
The following components are connected to the input/output interface 705: an input section 706 (including a keyboard, a mouse, and the like), an output section 707 (including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker, and the like), a storage section 708 (including a hard disk, and the like), and a communication section 709 (including a network interface card such as a LAN card, a modem, and the like). The communication section 709 performs communication processing via a network such as the internet. The drive 710 may also be connected to the input/output interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed on the drive 710 as needed, so that a computer program read out therefrom is installed into the storage section 708 as needed.
In the case of implementing the above-described series of processes by software, a program constituting the software is installed from a network such as the internet or a storage medium such as the removable medium 711.
It will be understood by those skilled in the art that such a storage medium is not limited to the removable medium 711 shown in fig. 7, in which the program is stored, and which is distributed separately from the apparatus to provide the program to the user. Examples of the removable medium 711 include a magnetic disk (including a floppy disk (registered trademark)), an optical disk (including a compact disk read only memory (CD-ROM) and a Digital Versatile Disk (DVD)), a magneto-optical disk (including a Mini Disk (MD) (registered trademark)), and a semiconductor memory. Or the storage medium may be a ROM 702, a hard disk contained in the storage section 708, or the like, in which a program is stored and distributed to users together with a device containing them.
The invention also proposes a corresponding computer program code, a computer program product storing machine-readable instruction code. The instruction codes, when read and executed by a machine, may perform the method according to the embodiment of the present invention described above.
Accordingly, a storage medium configured to carry the above-described program product storing machine-readable instruction codes is also included in the disclosure of the present invention. Including but not limited to floppy disks, optical disks, magneto-optical disks, memory cards, memory sticks, and the like.
Through the above description, the embodiments of the present disclosure provide the following technical solutions, but are not limited thereto.
Supplementary note 1. A method for correcting a recognition result of an image block, comprising:
obtaining a candidate matrix of the recognition result, wherein each column of the candidate matrix represents a plurality of candidates of the recognition result of the corresponding image block;
Determining a range of nodes in a tree to search, wherein the tree is constructed based on a measure of the difference between the content contained by each pair of nodes in the tree; and
The recognition result is corrected by matching the contents contained in all the nodes within the determined range with the candidate matrix.
Appendix 2. The method of appendix 1, wherein the individual candidates in each column of the candidate matrix are arranged with confidence from high to low.
Supplementary note 3 the method of supplementary note 1 or 2, wherein the tree is constructed based on the longest common subsequence LCS, and wherein the measure of difference is the LCS length.
The method of supplementary note 4. The method of supplementary note 3, wherein determining the scope of the node to be searched in the tree further includes:
calculating a difference between a weighted LCS length between content contained in a root node in the tree and the candidate matrix and a predetermined threshold; and
Sub-nodes having LCS lengths greater than or equal to the difference are included in the search range.
The method of supplementary note 5. The method of supplementary note 4, wherein, in case of matching, the weighting of the weighted LCS length is based on the number of candidates selected for the corresponding image block and the ranking of the candidates to be matched in the candidate matrix in the selected number of candidates.
Supplementary note 6. The method of supplementary note 4, wherein in case of mismatch, the weight is minus infinity.
The method of appendix 7. Appendix 5 or 6, wherein matching content contained by all nodes within the determined range to the candidate matrix further comprises: a weighted LCS length between the content contained in each node within the search range and the candidate matrix is calculated.
The method of supplementary note 7, wherein correcting the recognition result further includes: the recognition result is corrected based on one or more weighted LCS lengths between the content contained in each node within the calculated search range and the candidate matrix.
Supplementary note 9. The method of supplementary note 1 or 2, wherein the tree is a Burkhard-Keller tree.
Supplementary note 10. The method of supplementary note 1 or 2, wherein the recognition result is obtained by an Optical Character Recognition (OCR) engine.
Supplementary note 11. The method of supplementary note 10, wherein the OCR engine utilizes an over-segmentation method and a beam search algorithm.
Supplementary note 12 an apparatus for correcting the recognition result of an image block, comprising:
Obtaining means configured to obtain a candidate matrix of the recognition result, wherein each column of the candidate matrix represents a plurality of candidates of the recognition result of the corresponding image block;
determining means configured to determine a range of nodes to be searched in a tree, wherein the tree is constructed based on a measure of a difference between contents contained in each pair of nodes in the tree; and correction means configured to correct the recognition result by matching the contents contained in all the nodes within the determined range with the candidate matrix.
The apparatus of supplementary note 13. The apparatus of supplementary note 12, wherein the respective candidates in each column of the candidate matrix are arranged from high to low in confidence.
The apparatus of supplementary note 14. The apparatus of supplementary note 12 or 13, wherein the tree is constructed based on the longest common subsequence LCS, and wherein the measure of difference is the LCS length.
The apparatus of appendix 15. Appendix 14, wherein the determining means is further configured to:
calculating a difference between a weighted LCS length between content contained in a root node in the tree and the candidate matrix and a predetermined threshold; and
Sub-nodes having LCS lengths greater than or equal to the difference are included in the search range.
The apparatus of appendix 16. The apparatus of appendix 15, wherein in case of a match, the weighting of the weighted LCS length is based on the number of candidates selected for the respective image block and the ranking of the candidates to be matched in the candidate matrix in the selected number of candidates.
The device of appendix 17. As in appendix 15, wherein in case of mismatch the weights are minus infinity.
The apparatus of supplementary note 18, as in supplementary note 16 or 17, wherein the correction device is further configured to:
the recognition result is corrected based on one or more weighted LCS lengths between the content contained in each node within the calculated search range and the candidate matrix.
The device of any of the supplementary notes 19. The tree is a Burkhard-Keller tree, as in supplementary notes 12 or 13.
Supplementary note 20. A computer-readable storage medium storing a program executable by a processor to:
obtaining a candidate matrix of the recognition result, wherein each column of the candidate matrix represents a plurality of candidates of the recognition result of the corresponding image block;
Determining a range of nodes in a tree to search, wherein the tree is constructed based on a measure of the difference between the content contained by each pair of nodes in the tree; and
The recognition result is corrected by matching the contents contained in all the nodes within the determined range with the candidate matrix.
Finally, it is also noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Furthermore, without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Although the embodiments of the present invention have been described in detail above with reference to the accompanying drawings, it should be understood that the above-described embodiments are merely configured to illustrate the present invention and do not constitute a limitation of the present invention. Various modifications and alterations to the above described embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention. The scope of the invention is, therefore, indicated only by the appended claims and their equivalents.

Claims (8)

1. A method for correcting a recognition result of an image block, comprising:
obtaining a candidate matrix of the recognition result, wherein each column of the candidate matrix represents a plurality of candidates of the recognition result of the corresponding image block;
Determining a range of nodes in a tree to search, wherein the tree is constructed based on a measure of a difference between content contained by each pair of nodes in the tree; and
Correcting the recognition result by matching the contents contained in all nodes within the determined range with the candidate matrix,
Wherein the tree is constructed based on a longest common subsequence LCS, and wherein the measure of difference is an LCS length, and
Wherein determining the range of nodes to search in the tree further comprises:
calculating a difference between a weighted LCS length between content contained by a root node in the tree and the candidate matrix and a predetermined threshold; and
Sub-nodes having LCS lengths greater than or equal to the difference are included in the range.
2. The method of claim 1, wherein the candidates in each column of the candidate matrix are ranked from high to low with confidence.
3. The method according to claim 1 or 2, wherein the weighted LCS length is calculated by:
In case of matching, the weighting of the weighted LCS length is based on the number of candidates selected for the respective image block and the ranking of the candidates to be matched in the candidate matrix in the selected number of candidates; and
In the case of a mismatch, the weight is minus infinity.
4. The method of claim 3, wherein matching content contained by all nodes within the determined range to the candidate matrix further comprises: a weighted LCS length between the content contained by each node within the range and the candidate matrix is calculated.
5. The method of claim 4, wherein correcting the recognition result further comprises: the recognition result is corrected based on the calculated one or more weighted LCS lengths between the content contained by each node within the range and the candidate matrix.
6. The method of claim 1 or 2, wherein the tree is a Burkhard-Keller tree.
7. An apparatus for correcting a recognition result of an image block, comprising:
Obtaining means configured to obtain a candidate matrix of the recognition result, wherein each column of the candidate matrix represents a plurality of candidates of the recognition result of the corresponding image block;
determining means configured to determine a range of nodes to be searched in a tree, wherein the tree is constructed based on a measure of a difference between contents contained by each pair of nodes in the tree; and
Correction means configured to correct the recognition result by matching contents contained in all nodes within the determined range with the candidate matrix,
Wherein the tree is constructed based on a longest common subsequence LCS, and wherein the measure of difference is an LCS length, and
Wherein the determining means is further configured to:
calculating a difference between a weighted LCS length between content contained by a root node in the tree and the candidate matrix and a predetermined threshold; and
Sub-nodes having LCS lengths greater than or equal to the difference are included in the range.
8. Computer can the storage medium is read and the data is read, the computer-readable storage medium stores a program executable by a processor to:
Obtaining a candidate matrix of the identification result of the image block, wherein each column of the candidate matrix represents a plurality of candidates of the identification result of the corresponding image block;
Determining a range of nodes in a tree to search, wherein the tree is constructed based on a measure of a difference between content contained by each pair of nodes in the tree; and
Correcting the recognition result by matching the contents contained in all nodes within the determined range with the candidate matrix,
Wherein the tree is constructed based on a longest common subsequence LCS, and wherein the measure of difference is an LCS length, and
Wherein determining the range of nodes to search in the tree further comprises:
calculating a difference between a weighted LCS length between content contained by a root node in the tree and the candidate matrix and a predetermined threshold; and
Sub-nodes having LCS lengths greater than or equal to the difference are included in the range.
CN201910288895.6A 2019-04-11 2019-04-11 Method, apparatus and storage medium for correcting image block recognition result Active CN111814781B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910288895.6A CN111814781B (en) 2019-04-11 2019-04-11 Method, apparatus and storage medium for correcting image block recognition result
JP2020066804A JP7487532B2 (en) 2019-04-11 2020-04-02 Method and device for correcting image block recognition results, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910288895.6A CN111814781B (en) 2019-04-11 2019-04-11 Method, apparatus and storage medium for correcting image block recognition result

Publications (2)

Publication Number Publication Date
CN111814781A CN111814781A (en) 2020-10-23
CN111814781B true CN111814781B (en) 2024-08-27

Family

ID=72831500

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910288895.6A Active CN111814781B (en) 2019-04-11 2019-04-11 Method, apparatus and storage medium for correcting image block recognition result

Country Status (2)

Country Link
JP (1) JP7487532B2 (en)
CN (1) CN111814781B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112990225B (en) * 2021-05-17 2021-08-27 深圳市维度数据科技股份有限公司 Image target identification method and device in complex environment
JP7134526B1 (en) 2021-11-15 2022-09-12 株式会社Cogent Labs Matching device, matching method, program, and recording medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101194272A (en) * 2005-05-31 2008-06-04 微软公司 Image comparison by metric embeddings
CN107169445A (en) * 2017-05-11 2017-09-15 北京东方金指科技有限公司 A kind of extensive palmmprint coding and comparison method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001283156A (en) 2000-03-31 2001-10-12 Glory Ltd Device and method for recognizing address and computer readable recording medium stored with program for allowing computer to execute the same method
JP2012223338A (en) * 2011-04-19 2012-11-15 Fujifilm Corp Tree structure creation apparatus, method, and program
JP6470249B2 (en) 2016-12-20 2019-02-13 ソフトバンク株式会社 Data cleansing system, data cleansing method, and data cleansing program
CN109325138B (en) * 2018-07-12 2022-07-15 上海电机学院 Image rapid identification method based on combination of expansion and sub-pixel matrix

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101194272A (en) * 2005-05-31 2008-06-04 微软公司 Image comparison by metric embeddings
CN107169445A (en) * 2017-05-11 2017-09-15 北京东方金指科技有限公司 A kind of extensive palmmprint coding and comparison method

Also Published As

Publication number Publication date
CN111814781A (en) 2020-10-23
JP2020173802A (en) 2020-10-22
JP7487532B2 (en) 2024-05-21

Similar Documents

Publication Publication Date Title
US10515090B2 (en) Data extraction and transformation method and system
KR100630886B1 (en) Character string identification
KR970007281B1 (en) Character recognition method and apparatus
CN111324750B (en) Large-scale text similarity calculation and text duplicate checking method
CN111159990B (en) Method and system for identifying general special words based on pattern expansion
CN111859921A (en) Text error correction method and device, computer equipment and storage medium
US5553284A (en) Method for indexing and searching handwritten documents in a database
CN112381038B (en) Text recognition method, system and medium based on image
KR101379128B1 (en) Dictionary generation device, dictionary generation method, and computer readable recording medium storing the dictionary generation program
CN111814781B (en) Method, apparatus and storage medium for correcting image block recognition result
US20120254190A1 (en) Extracting method, computer product, extracting system, information generating method, and information contents
CN110888946A (en) Entity linking method based on knowledge-driven query
JPH11328317A (en) Method and device for correcting japanese character recognition error and recording medium with error correcting program recorded
CN115858773A (en) Keyword mining method, device and medium suitable for long document
US20030126138A1 (en) Computer-implemented column mapping system and method
CN110968693A (en) Multi-label text classification calculation method based on ensemble learning
JP5252596B2 (en) Character recognition device, character recognition method and program
CN113761902B (en) Target keyword extraction system
CN116229484A (en) Text recognition method, list scanning method and device
CN112651590B (en) Instruction processing flow recommending method
CN111488757B (en) Method and apparatus for dividing recognition result of image and storage medium
JP3975825B2 (en) Character recognition error correction method, apparatus and program
JP2021197165A (en) Information processing apparatus, information processing method and computer readable storage medium
CN115917527A (en) Document retrieval device, document retrieval system, document retrieval program, and document retrieval method
JPH11328318A (en) Probability table generating device, probability system language processor, recognizing device, and record medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant