WO1990015398A1 - Method and apparatus for identifying unrecognizable characters in optical character recognition machines


Info

Publication number
WO1990015398A1
WO1990015398A1 (PCT/US1990/002920)
Authority
WO
WIPO (PCT)
Prior art keywords
characters
character
unrecognizable
document
line
Prior art date
Application number
PCT/US1990/002920
Other languages
English (en)
French (fr)
Inventor
Peter Rudak
Original Assignee
Eastman Kodak Company
Priority date
Filing date
Publication date
Priority claimed from US07/360,967 external-priority patent/US4914709A/en
Priority claimed from US07/360,565 external-priority patent/US4974260A/en
Application filed by Eastman Kodak Company filed Critical Eastman Kodak Company
Publication of WO1990015398A1 publication Critical patent/WO1990015398A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/98Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
    • G06V10/987Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns with the intervention of an operator

Definitions

  • the invention relates generally to systems for reading characters, and more particularly to character reading systems wherein operators are employed to assist in identifying characters which cannot be machine read and for making the necessary corrections.
  • OCR Optical Character Recognition
  • a reject/reentry system is required, where an operator can correct and/or confirm the uncertain characters.
  • Most of today's state-of-the-art OCR systems use electronic imaging for reject/reentry. Although electronic imaging offers the highest productivity for reject/reentry, how the system is implemented plays a major role in operator efficiency, data integrity, and the resulting productivity gain.
  • the most popular method of displaying the reject/reentry information on a computer screen consists of a video window 10 to display the image of the uncertain character(s) and a line of ASCII data 12 to display the OCR results as shown in Figure 1.
  • the operator looks at the ASCII data and finds the uncertain character usually highlighted or replaced by a "?" 14, and then looks up at the video window 10 for that field, and finds the corresponding character.
  • the operator then types the correct character using the keyboard.
  • the entire field (such as the name field) is displayed in the video window in order to give the operator some context (for example, deciding between a "0" or an "o" may depend on whether that character's neighbors were letters or numbers).
  • Looking back and forth between the video window 10 and the ASCII data 12 (Fig. 1) is time consuming and can cause operator fatigue. Also, displaying an entire field of video for each uncertain character slows down screen updates because of the additional information that must be written to the screen. It also means increased data, requiring increased disk storage as well as longer data transmission times, thereby adding further inefficiencies.
  • One way to minimize operator fatigue, speed up the correction process, and reduce the amount of data required for the video display window is to use an "Embedded Video Window" that carries a bit map image of the unrecognizable character.
  • the video image of the uncertain character is used to replace the uncertain character within the ASCII data string.
  • Yet another object is to display via masking only the character of interest without bits and pieces of neighboring characters within a document.
  • Figure 1 is a diagrammatic view showing a video display window used in the prior art to display data and correct unrecognizable characters;
  • Figure 2 is a diagrammatic view showing a video display window depicting the unrecognizable characters in accordance with the present invention;
  • Figure 3 is a block diagram of an OCR system in accordance with the present invention;
  • Figure 4a is a diagrammatic view of the video display;
  • Figure 4b illustrates an enlarged portion of Figure 4a and illustrates the use of location parameters to extract the video information from the OCR video RAM shown in Figure 3;
  • Figures 5-8 illustrate in steps how masking eliminates extraneous information from the left-most and right-most bytes;
  • Figure 9 shows the flow chart for the video extraction and masking functions of the present invention;
  • Figure 10 shows the flow chart for the reject/reentry function of the present invention.
  • Modes of Carrying Out the Invention
  • Referring to Fig. 2, it can be seen that the present invention displays the ASCII data 16 for all the identified characters. However, instead of displaying the video window 10 of the entire field as was done in the past (Fig. 1), only the bit map video image 18 of the uncertain character replaces that character in the ASCII data string 16. Because the surrounding characters are still displayed as ASCII data, extracting and displaying only the character of interest still allows the operator to recognize the character in question by using context (surrounding characters).
  • This type of reject/reentry is especially applicable for documents containing only machine generated type (typewriter, dot matrix printer, laser printer, typeset, etc.). Although slight variations may be noticeable between different fonts, the operator will not normally be able to distinguish font differences based on a single character. Accordingly, the ASCII data string 16 is displayed in a fixed font, and even though the embedded bit map video image 18 may contain a character of a different font, no discontinuity should be noticeable.
  • the system includes an electronic scanner 20 and page buffer 22, a field extraction subsystem 26, an OCR processor 30 with its OCR video RAM 28, and video extraction 32 and masking 34 subsystems; items 30, 32 and 34 are interfaced with the reject/reentry system 36.
  • Electronic scanner 20 is adjacent to a document transport system (not shown) that moves documents past scanner 20 at a controlled rate, allowing each document to be scanned serially; the bit mapped images of the documents are stored sequentially in page buffer 22.
  • the electronic image of the document contains a binary representation of the original document, where a binary "1" signifies character information (black) and a binary "0" signifies background information (white).
  • a computer controller instructs field extraction subsystem 26 to extract pertinent fields of interest 24 off the document. There could be anywhere from one to hundreds of fields of interest on a document.
  • The extracted field video information is stored in OCR video RAM 28, which is part of the OCR processor 30.
  • OCR processor 30 processes the field video information stored in OCR video RAM 28. It identifies character locations, and interprets the bit-mapped information to produce an ASCII representation for each character. Each interpreted character carries a certain level of confidence.
  • If a character is interpreted with an acceptable degree of confidence, the ASCII code for that character is transmitted directly to the reject/reentry system 36. If a character cannot be interpreted with an acceptable degree of confidence, OCR processor 30 transmits a question mark (or other marker) instead of the actual ASCII interpretation, thereby identifying the presence and location of the "unrecognizable character".
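This per-character decision can be sketched as follows; the threshold value and the two transmit callbacks are assumptions, since the patent only specifies that uncertain characters are replaced by a marker before transmission.

```c
/* Sketch of the per-character confidence decision described above.
   The threshold and the callbacks are assumptions; the patent only states that
   an uncertain character is replaced by a "?" (or other marker). */
#define CONFIDENCE_THRESHOLD 90   /* assumed percentage; no value is given in the patent */

void report_character(char ascii, int confidence_pct,
                      void (*send_ascii)(char),
                      void (*send_uncertain_marker)(void))
{
    if (confidence_pct >= CONFIDENCE_THRESHOLD)
        send_ascii(ascii);             /* recognized: transmit the ASCII code directly */
    else
        send_uncertain_marker();       /* uncertain: transmit the marker and, later, the
                                          location parameters for the character */
}
```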
  • In addition to identifying uncertain characters with a question mark, OCR processor 30 also saves the location parameters for that character: X offset 38, Y offset 40, width 42 and height 44, as illustrated in Figure 4.
  • Video extraction 32 functions to define the location of a character by identifying the size and position of the smallest rectangular area 46 which is capable of completely surrounding the unrecognizable character. The width 42 and height 44 of this rectangle 46 define the size of the unrecognizable character, while the X-offset 38 and Y-offset 40 define its position.
  • X-offset 38 measures the horizontal (cross-scan) distance between the upper lefthand corner 48 of the defining character rectangle 46 and a reference point such as the upper lefthand corner 49 of the field or the upper lefthand corner 50 of the original document depending on the application.
  • Y-offset 40 measures the corresponding vertical (line count) distance. All of these parameters are measured in pixels (picture elements, also referred to as PELS), each pixel representing the smallest area or linear distance resolvable by electronic scanner 20. In the present embodiment, there are 200 pixels per inch in both the horizontal and vertical directions.
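For illustration, the four location parameters might be grouped in a structure such as the one below; the struct and field names are assumptions, since the patent defines only the four quantities and their pixel units.

```c
/* Location parameters saved for each unrecognizable character (Fig. 4).
   Names and types are illustrative; the patent defines only the quantities. */
typedef struct {
    unsigned int x_offset;  /* 38: horizontal (cross-scan) distance in pixels from the reference corner */
    unsigned int y_offset;  /* 40: vertical (line count) distance in pixels                             */
    unsigned int width;     /* 42: width of the smallest enclosing rectangle 46, in pixels              */
    unsigned int height;    /* 44: height of the smallest enclosing rectangle 46, in pixels             */
} CharLocation;             /* measured at 200 pixels per inch in both directions                       */
```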
  • OCR processor 30 passes the location parameters to video extraction 32 which uses these parameters to extract the bit mapped video image of the unrecognizable characters from OCR video RAM 28.
  • Although the location parameters can pinpoint the rectangular area surrounding the unrecognizable character to within a single pixel, video extraction 32 must read the information on byte boundaries (8 pixels per byte) in order to accommodate the physical configuration of the memory and its associated data bus. Accordingly, the byte boundaries are selected so as to guarantee that the resulting bit map video image (see Fig. 5) includes the entire unrecognizable character.
  • the resulting bit map video image 18 may extend beyond the actual unrecognizable character boundaries, such that portions of the neighboring characters may be included in the bit map video image as well, as illustrated in Figure 5.
  • the byte format is applied in the horizontal direction - 8 consecutive pixels in the horizontal direction form a byte.
  • the video information may be accessed on a line boundary, where 1 line is equivalent to the height of 1 pixel. Accordingly, the video extraction process must round both the X-offset 38 and width 42 parameters to the nearest byte.
  • the X-offset 38 is rounded down to the nearest multiple of 8 pixels (a byte boundary), and the difference between the actual X-offset 38 and the resulting byte-boundary X-offset is stored as a remainder.
  • the remainder is added to the width 42 parameter and the result is rounded up to the next largest multiple of 8 pixels to obtain the byte-boundary width 56.
  • Such a procedure insures that the resulting rectangle does not truncate any portion of the unrecognizable character.
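As a sketch of the rounding just described (the function and variable names are illustrative assumptions), the byte-boundary X-offset, the remainder, and the byte-boundary width 56 can be computed as:

```c
/* Byte-boundary rounding described above: round X-offset 38 down to a multiple
   of 8 pixels, carry the remainder into width 42, and round the result up to
   the next multiple of 8 to obtain the byte-boundary width 56.
   Names are illustrative. */
static void to_byte_boundaries(unsigned int x_offset, unsigned int width,
                               unsigned int *byte_x_offset,
                               unsigned int *remainder,
                               unsigned int *byte_width)
{
    *byte_x_offset = (x_offset / 8) * 8;                  /* round down to a byte boundary  */
    *remainder     = x_offset - *byte_x_offset;           /* left-hand slack, in pixels     */
    *byte_width    = ((*remainder + width + 7) / 8) * 8;  /* round up to the next full byte */
}
```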
  • the masking process itself is a logic "AND" function.
  • a "mask byte” exists where pixel locations requiring masking contain a logic "0" and all other pixel locations contain a logic "1". Referring to the example, the left-most bytes require masking for the first 2 pixels 57. The resulting left "mask byte” would be 0011 1111 (3F Hex, or 63 Decimal). The right-most bytes require masking for the last 6 pixels 58. The resulting right "mask byte” 59 would be 1100 0000 (CO Hex, or 192 Decimal) . An "AND" operation is performed between the mask byte 59 and the video data 61 to form a masked video byte 63.
  • FIG. 7 shows the last video byte 61 1100 1100 (a right-most byte) being ANDed with the right-most masking byte 59 1100 0000 to form masked video byte 63 1100 0000.
  • the original video data contained some black (logic "1") information from a neighboring character "N".
  • the masking process erased or removed this unwanted information by replacing these pixels with logic "0" (white).
  • Figure 8 illustrates the final character video image after the masking process. Video image 18 remains on byte boundaries but the neighboring character information has been "whited out" by the masking process. It is this final video image (Fig. 8) that will be used during the reject/reentry process.
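A minimal sketch of the mask construction and the "AND" step of Figures 5-8 follows; the function names are assumptions, but with the remainders from the example it reproduces the 3F Hex left mask, the C0 Hex right mask, and the masked byte of Figure 7.

```c
#include <stdio.h>

/* Mask construction and AND operation of Figures 5-8 (names are illustrative).
   left_rem  = unwanted pixels at the left edge of the left-most byte,
   right_pad = unwanted pixels at the right edge of the right-most byte. */
static unsigned char left_mask_byte(unsigned int left_rem)    /* e.g. 2 -> 0x3F (0011 1111) */
{
    return (unsigned char)(0xFFu >> left_rem);    /* leading pixels forced to logic "0" */
}

static unsigned char right_mask_byte(unsigned int right_pad)  /* e.g. 6 -> 0xC0 (1100 0000) */
{
    return (unsigned char)(0xFFu << right_pad);   /* trailing pixels forced to logic "0" */
}

int main(void)
{
    unsigned char video  = 0xCC;                           /* 1100 1100, the last video byte 61 */
    unsigned char masked = video & right_mask_byte(6);     /* AND with 1100 0000 (mask byte 59) */
    printf("left mask:   %02X\n", (unsigned)left_mask_byte(2));  /* prints 3F */
    printf("masked byte: %02X\n", (unsigned)masked);             /* prints C0, as in Figure 7 */
    return 0;
}
```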
  • the bit map video image containing the uncertain or unrecognizable character is passed to the reject/reentry system 36 to be combined with the ASCII information found in that field.
  • the video reject/reentry system 36 displays the bit map video image in place of the uncertain character (Fig. 2), along with the string of ASCII characters, on the screen; in effect, the image occupies the location where the character would reside had it been recognized successfully. This allows the operator to view the bit map video image within the context of its field and to type the correct character via the keyboard 54.
  • an ASCII representation of the now corrected character replaces the bit map video image, so that all the characters in the data line are now ASCII characters.
  • the parser looks at the data produced by OCR 30 in step 60 and separates the successfully read ASCII data from the uncertain character information by searching for the combination of a "?" followed by a control character (the non-printable control character is used together with the printable "?" in case a "?" was actual data read from the document). Each incoming byte is analyzed in sequence. All successfully recognized ASCII information is sent directly to reject/reentry 36 in accordance with step 66. However, if a control character is encountered by the parser in step 66, it knows that the next 16 bytes contain the location parameters (X-offset 38, Y-offset 40, width 42 and height 44, 4 bytes each) for the unrecognizable character.
  • step 68 calculates byte boundaries for the uncertain characters. Because the location parameters locate a character to pixel boundaries, while video RAM 28 requires data to be read on a byte basis, the nearest byte boundaries that encompass the entire character must be calculated (with possible extraneous markings being included due to rounding). To eliminate the extraneous information, step 70 calculates mask bytes for the left-most and right-most bytes; the masks blank the unwanted portions of the bytes that had to be included to ensure that the entire uncertain character was encompassed.
  • step 72 provides for a pointer to be set up to read the first byte of video (upper left-hand corner 50 in Fig. 4) where the reading process begins.
  • In step 74 a byte of video is read from the OCR Video RAM 28 (Fig. 3), with the pointer initially set to the upper left-hand corner. If a particular byte is determined in step 76 to be a left-most byte (the first byte read from the OCR RAM is always the left-most byte), that byte is "ANDed" with the left masking byte (calculated earlier) in accordance with step 78.
  • If in step 76 the byte is found not to be the left-most byte, it is then checked in step 80 for being a right-most byte, in which case the byte is "ANDed" with the right masking byte in accordance with step 82. If the byte is located in the center of the line, the video is passed with no masking. In all instances, no matter what path was taken, the video is transmitted sequentially from left to right to the reject/reentry system as per step 84. As each byte is passed, a determination is made in step 86 as to whether or not it is the end of the line; if not, the pointer is incremented to the next byte in step 88.
  • In step 90 it is determined whether the end of line just reached belongs to the last line. If it is not the last line, the pointer is updated to begin the next line in step 92 and the process continues left to right on that line. Lines of video are processed this way until the last line has been completed (step 90). At this point, the additional OCR results are ready to be processed.
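The extraction and masking loop of Figure 9 might be sketched as below; the video RAM layout (one scan line per ram_stride bytes) and the send_byte callback are assumptions, since the patent describes the data flow rather than a programming interface.

```c
/* Sketch of the extraction-and-masking loop of Figure 9 (steps 72-92).
   The RAM layout (ram_stride bytes per scan line) and send_byte callback are
   assumptions; byte_x_offset and byte_width are the byte-boundary values and
   left_mask/right_mask the mask bytes computed in steps 68 and 70. */
static void extract_and_mask(const unsigned char *video_ram, unsigned int ram_stride,
                             unsigned int byte_x_offset, unsigned int byte_width,
                             unsigned int y_offset, unsigned int height,
                             unsigned char left_mask, unsigned char right_mask,
                             void (*send_byte)(unsigned char))
{
    unsigned int bytes_per_line = byte_width / 8;
    for (unsigned int line = 0; line < height; line++) {         /* steps 90/92: line loop       */
        const unsigned char *p = video_ram
                               + (y_offset + line) * ram_stride
                               + byte_x_offset / 8;               /* step 72: set the pointer     */
        for (unsigned int b = 0; b < bytes_per_line; b++) {       /* steps 86/88: byte loop       */
            unsigned char v = p[b];                               /* step 74: read a video byte   */
            if (b == 0)
                v &= left_mask;                                   /* steps 76/78: left-most byte  */
            if (b == bytes_per_line - 1)
                v &= right_mask;                                  /* steps 80/82: right-most byte */
            send_byte(v);                                         /* step 84: transmit the byte   */
        }
    }
}
```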
  • Figure 10 shows a flow chart for the Reject/Reentry System.
  • ASCII data is received from the OCR processor in box 100.
  • Video information is also received from the Video Extraction and Masking Subsystems in box 102.
  • This received information is stored on a disk drive or other storage media as set forth in step 104. In this way, the reject/reentry process does not have to occur simultaneously with data capture.
  • the information is stored on a disk until an operator is ready to perform the reject/reentry process.
  • a field containing an uncertain character is read from disk storage in step 106.
  • the retrieved information includes all successfully recognized characters in ASCII form, with a bit mapped video image inserted for the uncertain character(s).
  • In step 108 the "?" or other marker within the ASCII data string is located within the particular field.
  • the ASCII characters are displayed on the screen in step 110.
  • In step 112 the X and Y coordinates for the location of the "?" are calculated. These coordinates are used to overwrite the "?" with the bit mapped video image in step 114.
  • In step 116 the operator views the bit mapped video image depicting the uncertain character along with the neighboring textual ASCII character string and types the correct character via a keyboard. The character typed by the operator replaces the bit mapped video image as per step 118.
  • Step 120 causes the ASCII file to be updated with the correct data replacing the former unrecognized character.
  • Step 122 results in the completed line of data being scrolled up on the screen of the workstation and, in accordance with step 124, the next field containing an uncertain or unrecognized character is brought into position for consideration.
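The overall sequence of Figure 10 could be outlined as below; the helper functions for display, keyboard and storage are placeholders (assumptions), and only the order of the steps comes from the description above.

```c
/* Outline of the reject/reentry sequence of Figure 10 (steps 106-124).
   The helpers are placeholders; only the step order follows the text above. */
typedef struct Field Field;   /* a stored field holding ASCII data plus the embedded image */

extern Field *read_uncertain_field(void);                            /* step 106 */
extern int    locate_marker(const Field *f);                         /* step 108 */
extern void   display_ascii(const Field *f);                         /* step 110 */
extern void   overwrite_marker_with_image(const Field *f, int pos);  /* steps 112-114 */
extern char   read_operator_key(void);                               /* step 116 */
extern void   replace_image_with_char(Field *f, int pos, char c);    /* step 118 */
extern void   update_ascii_file(const Field *f);                     /* step 120 */
extern void   scroll_completed_line(void);                           /* step 122 */
extern int    more_uncertain_fields(void);                           /* step 124 */

void reject_reentry_loop(void)
{
    do {
        Field *f = read_uncertain_field();       /* step 106: field read from disk storage   */
        int pos  = locate_marker(f);             /* step 108: find the "?" (or other marker) */
        display_ascii(f);                        /* step 110: show the ASCII characters      */
        overwrite_marker_with_image(f, pos);     /* steps 112-114: embed the bit map image   */
        char c = read_operator_key();            /* step 116: operator types the correction  */
        replace_image_with_char(f, pos, c);      /* step 118: image replaced by typed char   */
        update_ascii_file(f);                    /* step 120: ASCII file updated             */
        scroll_completed_line();                 /* step 122: completed line scrolled up     */
    } while (more_uncertain_fields());           /* step 124: bring up the next field        */
}
```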
  • the present invention is useful in an image management system and more particularly in systems that use optical character recognition (OCR) to enter retrieval information automatically as opposed to manual data entry, which is more labor intensive.
  • the use of an embedded bit mapped video image to replace the unrecognized character in a string of successfully identified ASCII characters minimizes both data storage and transmission requirements, while maximizing screen update speed. Such a system results in a lower cost, higher efficiency reject/reentry system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)
PCT/US1990/002920 1989-06-02 1990-05-30 Method and apparatus for identifying unrecognizable characters in optical character recognition machines WO1990015398A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US360,565 1989-06-02
US07/360,967 US4914709A (en) 1989-06-02 1989-06-02 Method for identifying unrecognizable characters in optical character recognition machines
US360,967 1989-06-02
US07/360,565 US4974260A (en) 1989-06-02 1989-06-02 Apparatus for identifying and correcting unrecognizable characters in optical character recognition machines

Publications (1)

Publication Number Publication Date
WO1990015398A1 true WO1990015398A1 (en) 1990-12-13

Family

ID=27000950

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1990/002920 WO1990015398A1 (en) 1989-06-02 1990-05-30 Method and apparatus for identifying unrecognizable characters in optical character recognition machines

Country Status (3)

Country Link
EP (1) EP0428713A1 (ja)
JP (1) JPH04500422A (ja)
WO (1) WO1990015398A1 (ja)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5878267A (ja) * 1981-11-04 1983-05-11 Toshiba Corp Character segmentation system
JPS60245088A (ja) * 1984-05-18 1985-12-04 Ricoh Co Ltd Character recognition correction system
JPH0721815B2 (ja) * 1985-11-12 1995-03-08 Oki Electric Industry Co., Ltd. Optical character reader
JPS63316189A (ja) * 1987-06-18 1988-12-23 Matsushita Graphic Commun Syst Inc Optical character recognition device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3903517A (en) * 1974-02-26 1975-09-02 Cummins Allison Corp Dual density display
EP0107083A2 (de) * 1982-09-29 1984-05-02 Computer Gesellschaft Konstanz Mbh Belegverarbeitungseinrichtung mit Korrekturschaltung und Datensichtgerät
EP0140527A2 (en) * 1983-10-28 1985-05-08 Unisys Corporation Document reading system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PATENT ABSTRACTS OF JAPAN vol. 10, no. 3 (P-418)(2060) 08 January 1986, & JP-A-60 160483 (HITACHI SEISAKUSHO K.K.) 22 August 1985, see the whole document *
PATENT ABSTRACTS OF JAPAN vol. 7, no. 238 (P-231)(1383) 22 October 1983, & JP-A-58 125183 (RICOH K.K.) 26 July 1983, see the whole document *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0650135A2 (en) * 1993-10-22 1995-04-26 International Business Machines Corporation Data capture variable priority method and system for managing varying processing capacities
EP0650135A3 (en) * 1993-10-22 1995-07-26 Ibm Variable priority data capture method and system for managing variable processing capabilities.
US5555325A (en) * 1993-10-22 1996-09-10 Lockheed Martin Federal Systems, Inc. Data capture variable priority method and system for managing varying processing capacities
EP1814062A1 (en) * 1995-07-31 2007-08-01 Fujitsu Ltd. Method and apparatus for handling errors in document recognition

Also Published As

Publication number Publication date
JPH04500422A (ja) 1992-01-23
EP0428713A1 (en) 1991-05-29

Similar Documents

Publication Publication Date Title
US4914709A (en) Method for identifying unrecognizable characters in optical character recognition machines
US4974260A (en) Apparatus for identifying and correcting unrecognizable characters in optical character recognition machines
EP0439951B1 (en) Data processing
US4933979A (en) Data reading apparatus for reading data from form sheet
EP0446631A2 (en) Method and system for locating the amount field on a document
US7305619B2 (en) Image processing method, device and storage medium therefor
US7149352B2 (en) Image processing device, program product and system
WO1990015398A1 (en) Method and apparatus for identifying unrecognizable characters in optical character recognition machines
US7142733B1 (en) Document processing method, recording medium recording document processing program and document processing device
US8125691B2 (en) Information processing apparatus and method, computer program and computer-readable recording medium for embedding watermark information
JPH02255964A (ja) Automatic identification device for changed parts of a document
KR950001061B1 (ko) Document recognition and correction apparatus
CN115131806B (zh) Deep-learning-based OCR image information recognition method and system for various types of certificates
JP2002230480A (ja) Character recognition device and method for correcting character recognition results
EP0692768A2 (en) Full text storage and retrieval in image at OCR and code speed
JPH0816719A (ja) Character segmentation method, and character recognition method and device using the same
JP2570571B2 (ja) Optical character reader
JP2887823B2 (ja) Document recognition device
JPH04293185A (ja) Filing device
JPH0554178A (ja) Character recognition device and correction form
JPH01287755A (ja) Information input device with correction function
JPH07239901A (ja) Character correction method in an optical reader
JPS6327990A (ja) Character recognition method
JP2890788B2 (ja) Document recognition device
JPS63155385A (ja) Optical character reader

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 1990909990

Country of ref document: EP

AK Designated states

Kind code of ref document: A1

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB IT LU NL SE

WWP Wipo information: published in national office

Ref document number: 1990909990

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1990909990

Country of ref document: EP