JPH11102413A - Pop-up correction method for optical character recognition output and device thereof

Info

Publication number: JPH11102413A
Application number: JP10110884A
Authority: JP
Inventors: L Horowitz Michael; エル．ホロビッツマイケル; J Mcnaney Michael; ジェイ．マキナニーマイケル
Original assignee: KURARITEC CORP
Current assignee: KURARITEC CORP
Priority date: 1997-07-25
Filing date: 1998-04-21
Publication date: 1999-04-13

Abstract

PROBLEM TO BE SOLVED: To allow text in a document image to be compared with its optical character recognition interpretation, by recognizing characters in the document image, determining the regions that correspond to words of the document text, associating each region with its corresponding word by means of a correlation table, and displaying a portion of the document image. SOLUTION: A scanner 126 generates an image of a document, and an optical character recognizer 128 recognizes the characters in the document image to generate document text. A processor 112 determines the regions of the document image that correspond to words of the document text, associates each region with its corresponding word by means of a correlation table, and displays a portion of the document image over the document text on a display device 120. Preferably, the region of the document image corresponding to a word of the document text, the corresponding word itself, or both are displayed so as to indicate the recognition accuracy parameter of each.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

FIELD OF THE INVENTION The present invention relates to optical character recognition technology, and more particularly to a method and apparatus for displaying optical character recognition output and correcting errors in it.

[0002]

BACKGROUND OF THE INVENTION Acquiring text and graphics from paper documents is a significant problem for many industries. For example, a publishing company may print hundreds or thousands of scholarly articles over the course of a year. A publisher often starts from a paper document, which must be entered into the publisher's computer system. One conventional approach is to hire keyboard operators to read the paper document and type it into the computer system. However, keying in documents is time-consuming and costly.

[0003] Optical character recognition (hereinafter, OCR) is a technology that promises to benefit the publishing industry and other industries, because the input throughput of an OCR device far exceeds that of a keyboard operator. Publisher employees therefore often start from a scanned document that an OCR device has converted into a computer-readable text format such as ASCII. However, even the high recognition rates achievable with recent OCR devices (often above 95%) are not sufficient for industries, such as publishing, that demand high accuracy. Publishers therefore often hire proofreaders to correct the OCR output by hand.

[0004]

PROBLEM TO BE SOLVED BY THE INVENTION However, proofreading OCR output by hand is very time-consuming and difficult for people to do. The proofreader must compare the original paper document with a printout or screen display of the OCR output, word by word. Even when the recognition rate is high, a person proofreading OCR output tends to become complacent and overlook errors.

[0005] Another conventional option is to spell-check the resulting computer-readable text. However, a spell check does not catch every incorrect word. In addition, a recognized word may be so garbled that the proofreader must keep referring back to the paper text throughout the spell check. The proofreader looks at the paper text, determines the correct word, and then keys that word into the OCR output text. Because this approach has proven time-consuming and somewhat error-prone, it would be useful for a proofreader to be able to compare the text shown in the document image together with the OCR interpretation of that text, without having to refer to the original document used to generate the OCR interpretation.

[0006] Viewing the document image together with the OCR interpretation of its text is particularly useful when a publisher intends to republish and sell the OCR output as ASCII text rather than in paper form. When a publisher obtains OCR output in order to resell it in electronic form, there is an additional concern beyond the OCR output containing the correct words: when the OCR output is later displayed on a computer monitor, its form should remain the same as the form of the document image. The ability of the proofreader to compare the OCR output and the document image side by side during the editing stage considerably furthers this goal.

[0007] It is an object of the present invention to enable a user to compare the text in a document image together with the OCR interpretation of that text.

[0008] Another object of the present invention is to enable a user to compare the text represented in a document image together with the OCR interpretation of that text, without the user having to refer to the original document used to generate the OCR interpretation.

[0009] Yet another object of the present invention is to enable a user to compare the text represented in a document image with the OCR interpretation of that text, in order to correct errors that occurred while the original text was being converted into the OCR output text.

[0010]

SUMMARY OF THE INVENTION There is a need to make it easier for a person to proofread OCR output. To meet this need, characters in a document image obtained from the original paper document are recognized (for example, via OCR) to generate document text. Regions of the document image corresponding to regions of the document text are determined, and a recognition accuracy parameter is determined for each region. The user can select a word from the document text by positioning the cursor over it. When the user clicks (presses) one mouse button, or presses an equivalent function key, the portion of the document image corresponding to the selected word appears in a pop-up window. When the user clicks another mouse button, or presses another equivalent function key, a pop-up menu for the corresponding OCR output is displayed.

[0011] In particular, the computer-implemented method for displaying a portion of a document image over document text combines the following steps: generating a document image of a document; recognizing characters from the document image to generate document text; determining the regions of the document image that correspond to words of the document text; associating the regions of the document image with the corresponding words of the document text using a correlation table; and displaying a portion of the document image over the document text. Errors in the selected text of the document text are then corrected.

[0012] These and other aspects and advantages of the present invention will be understood by reference to the following description, the drawings, and the claims.

[0013]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS A method and apparatus for pop-up correction of optical character recognition output according to the present invention are described in detail below with reference to the drawings, in which like components are denoted by like reference numerals.

[0014] [1. Hardware Overview] FIG. 1 is a block diagram of a computer system 100 on which an example of the present invention may be implemented. Computer system 100 includes a bus 110 or other communication means for communicating information, and a processor 112 connected to bus 110 for processing information. Computer system 100 further includes a random access memory (RAM) or other dynamic storage device 114 (shown as main memory), which is connected to bus 110 and stores information and instructions to be executed by processor 112. Main memory 114 may also be used to store temporary variables and other intermediate information while processor 112 is executing instructions. Computer system 100 also includes a read-only memory (ROM), another static storage device 116, or both, connected to bus 110, for storing static information and instructions for processor 112. A data storage device 118, such as a magnetic disk or optical disk and its corresponding disk drive, may be connected to bus 110 for storing information and instructions.

[0015] Input and output devices may also be connected to computer system 100 via bus 110. For example, computer system 100 uses a display device 120, such as a cathode ray tube (CRT), to display information to the computer user. Computer system 100 further uses a keyboard 122 and a cursor control 124, such as a mouse. In addition, computer system 100 may use a scanner 126 to convert paper documents into a computer-readable format. Computer system 100 can also use an OCR device 128 to recognize characters in a document image generated by scanner 126 or stored in main memory 114 or data storage device 118. Alternatively, the functions of OCR device 128 may be implemented in software, by processor 112 executing instructions stored in main memory 114. In yet another example, scanner 126 and OCR device 128 may be combined into a single device designed to scan a paper document and recognize the characters on it.

[0016] The present invention relates to the use of computer system 100 to view the original text and the output text on the same display device 120. According to one embodiment, this task is performed by computer system 100 in response to processor 112 executing a sequence of instructions stored in main memory 114. Such instructions may be read into main memory 114 from another computer-readable medium, such as data storage device 118. Executing the sequence of instructions stored in main memory 114 causes processor 112 to perform the processing steps described below. In another example, hard-wired circuitry may be used in place of, or together with, software instructions to implement the present invention. The present invention is therefore not limited to any specific combination of hardware circuitry and software.

[0017] [2. Composite Document Architecture] A composite document contains multiple representations of a document and treats those representations as a logical whole. The composite document 200 shown in FIG. 2 is stored in a memory of computer system 100, such as main memory 114 or data storage device 118.

[0018] Composite document 200 includes a document image 210, which is a bitmap representation of a document (for example, a TIFF file generated by scanner 126). For example, a copy of the United States Constitution may be scanned by scanner 126 to produce an image of the Constitution in the form of document image 210.

[0019] A bitmap representation is an array of pixels, which may be monochrome (for example, black and white) or multicolor (for example, red, blue, green, and so on). The position of a rectangular region of document image 210 can be specified, for example, by the pair of coordinates of the rectangle's upper-left and lower-right corners. In the example of scanning the United States Constitution, the first letter of the word "form" in the Preamble (that is, "f") may be located in document image 210 within a rectangle whose upper-left corner is at (16, 110) and whose lower-right corner is at (31, 119). The last letter of the same word (that is, "m") may be located in document image 210 within a rectangle whose upper-left corner is at (16, 140) and whose lower-right corner is at (31, 149).

[0020] Composite document 200 also includes document text 220 and a correlation table 230, which may be generated by the method shown in the flowchart of FIG. 3. Document text 220 consists of a sequence of 8-bit or 16-bit values that encode characters in ASCII, EBCDIC, or Unicode. A character in document text 220 can therefore be located by its offset within document text 220. In the example above, as shown in the offset column of correlation table 230, the first letter of the word "form" in the Preamble may be located at offset 57 in document text 220, and the last letter of the same word at offset 60.

[0021] Referring to FIG. 3, in step S250 the characters in document image 210 are recognized by OCR device 128 or its equivalent, and in step S252 they are saved to generate document text 220. OCR device 128 is also designed to output, in step S250, the coordinates in document image 210 of each recognized character. A character recognized at a known offset in document text 220 can therefore be associated with a region of document image 210. In the Preamble example above, the first letter of the word "form" in document text 220 (located at offset 57) is associated with the region of document image 210 defined by the coordinates (16, 110) and (31, 119). Similarly, the last letter of the word "form" in document text 220 (located at offset 60) is associated with the region of document image 210 defined by the coordinates (16, 140) and (31, 149).

[0022] In step S254, the words of document text 220 are identified, for example by interpreting the characters between whitespace as a word. Also in step S254, the regions of document image 210 corresponding to the individual characters of each word are merged into a larger region of document image 210 corresponding to that word of document text 220. In one embodiment, the word's region of document image 210 is specified as the rectangle defined by the upper-leftmost and lower-rightmost of the coordinates of the regions corresponding to the word's individual characters. For example, the region of document image 210 corresponding to the word "form" in document text 220 (offsets 57 to 60) is specified by the rectangle with coordinates (16, 110) and (31, 149), as shown in the coordinate and offset columns of correlation table 230. Alternatively, particularly for documents containing characters of various sizes, a list of the coordinates of each character of document text 220 and the corresponding regions of document image 210 may be stored individually.
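The word identification and region merging of step S254 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `merge_boxes` and `word_regions` are hypothetical, the per-character boxes are assumed to arrive as a dictionary keyed by text offset, and the boxes for "o" and "r" in the usage note are invented to fill out the "form" example (the description gives only the boxes for "f" and "m").

```python
import re
from typing import Dict, List, Tuple

Point = Tuple[int, int]
Box = Tuple[Point, Point]  # (upper-left corner, lower-right corner)


def merge_boxes(boxes: List[Box]) -> Box:
    """Return the smallest rectangle that encloses all character boxes."""
    upper_left = (min(b[0][0] for b in boxes), min(b[0][1] for b in boxes))
    lower_right = (max(b[1][0] for b in boxes), max(b[1][1] for b in boxes))
    return (upper_left, lower_right)


def word_regions(text: str, char_boxes: Dict[int, Box]) -> List[Tuple[int, int, Box]]:
    """Split text on whitespace and merge each word's character boxes.

    Returns (first_offset, last_offset, word_box) for every word that has
    at least one character box.
    """
    regions = []
    for match in re.finditer(r"\S+", text):
        first, last = match.start(), match.end() - 1
        boxes = [char_boxes[i] for i in range(first, last + 1) if i in char_boxes]
        if boxes:
            regions.append((first, last, merge_boxes(boxes)))
    return regions
```

With a text in which "form" starts at offset 57 and character boxes running from ((16, 110), (31, 119)) for "f" to ((16, 140), (31, 149)) for "m", `word_regions` yields a single entry covering offsets 57 to 60 with the box ((16, 110), (31, 149)), matching the example in the description.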

[0023] In addition, some implementations of OCR device 128 are designed, as is well known in the art, to output a recognition accuracy parameter that reflects the likelihood that a word or phrase in document text 220 contains an incorrect OCR interpretation. For example, in some fonts the letter "m" in document image 210 may be recognized by OCR device 128 as the letter combination "rn" (for example, because the OCR device could interpret the word as "modern", OCR device 128 may output a low accuracy parameter for the word "modem"). As a result, words containing the letter "m" are likely to be assigned a relatively lower accuracy than words made up entirely of unambiguous characters. In the Preamble example above, the word "form" may be assigned a recognition accuracy parameter of 55% because it contains the letter "m".

[0024] In step S256, information about each word appearing in document text 220 is saved in correlation table 230, so that regions of document image 210 can be related to words of document text 220. Specifically, correlation table 230 stores a coordinate pair 232 specifying the region in document image 210, an offset pair 234 specifying the word in document text 220, and a recognition accuracy parameter 236 for the word. In the example above, the word "form" in document text 220 has a coordinate pair 232 of (16, 110) and (31, 149), an offset pair 234 of 57 and 60, and a recognition accuracy parameter 236 of 55%.

[0025] Using correlation table 230, each offset in document text 220 corresponds to a region of document image 210, and vice versa. For example, given the character at offset 58 of document text 220, the offset column of correlation table 230 can be searched to determine that the character falls within the rectangular region of document image 210 with coordinates (16, 110) and (31, 149). The region of document image 210 at those coordinates (the word "form" in the example above) can then be extracted from document image 210 and displayed. In the other direction, given the coordinates (23, 127) in document image 210, the coordinate column of correlation table 230 can be searched to determine that those coordinates lie within the word of document text 220 with offsets 57 to 60. The word at that offset range of document text 220 (the word "form" in the example above) can then be identified. The composite document architecture described here thus provides one way to relate the position of a word in document text 220 to the corresponding region of document image 210.
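The two lookup directions described for correlation table 230 can be sketched as below. This is an illustrative sketch only: the `Entry` class and the function names are hypothetical, the table is assumed to be a simple list that is searched linearly, and the accuracy parameter is stored as a fraction between 0 and 1.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    """One row of the correlation table (field names are illustrative)."""
    upper_left: Tuple[int, int]   # coordinate pair 232, first corner
    lower_right: Tuple[int, int]  # coordinate pair 232, second corner
    first_offset: int             # offset pair 234, first character
    last_offset: int              # offset pair 234, last character
    accuracy: float               # recognition accuracy parameter 236


def entry_for_offset(table: List[Entry], offset: int) -> Optional[Entry]:
    """Text to image: find the word whose offset range contains offset."""
    for entry in table:
        if entry.first_offset <= offset <= entry.last_offset:
            return entry
    return None


def entry_for_point(table: List[Entry], point: Tuple[int, int]) -> Optional[Entry]:
    """Image to text: find the word whose rectangle contains point."""
    for entry in table:
        inside_first = entry.upper_left[0] <= point[0] <= entry.lower_right[0]
        inside_second = entry.upper_left[1] <= point[1] <= entry.lower_right[1]
        if inside_first and inside_second:
            return entry
    return None


table = [Entry((16, 110), (31, 149), 57, 60, 0.55)]  # the word "form"
```

For the single-row table above, `entry_for_offset(table, 58)` and `entry_for_point(table, (23, 127))` both return the row for "form". A linear scan keeps the sketch short; a real table sorted by offset could be searched with bisection instead.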

[0026] [3. Indicating Words Likely To Be Misrecognized] In the image displays 300 and 350 of FIGS. 4 and 5, the words in document text 220 that are most likely to have been misrecognized may be displayed in various ways (for example, highlighted, shown in a different color or font, underlined, or flashing). These words can be determined by comparing the recognition accuracy parameter 236 of every recognized word with a prescribed threshold. For example, words whose recognition accuracy parameter 236 is 60% or less may be displayed in red, drawing the user's attention to words in the text that may be wrong. For example, the original word "form" has a recognition accuracy parameter 236 of 55% and is therefore displayed in red. In another example, words with a low recognition accuracy parameter 236 are distinguished within document text 220 by changing the background color of the word (for example, the word "form" in document text 220 may be highlighted in a clearly visible color).

[0027] In another example, the recognition accuracy parameter 236 associated with each word of the paper text is compared with a plurality of thresholds to determine an individual display color for each word of document text 220, forming a "heat map" of the recognized words. A heat map is a chart that uses multiple colors to show the value of a parameter (for example, frequency, temperature, or recognition accuracy) at various points in a spectrum. The resulting heat map helps guide the user to the portions of document text 220 where the OCR output is most likely to have problems. In this example, the words of document text 220 displayed to the user appear in various colors.

[0028] Referring to FIG. 6, the heat map is generated for document text 220 by a loop controlled at step S410. Step S410 loops over every word of document text 220 that is to be displayed in image display 300, image display 350, or both. In step S420, correlation table 230 is searched to find the recognition accuracy parameter 236 corresponding to the displayed word of document text 220. This parameter 236 is then compared in turn with a plurality of thresholds, such as 60%, 80%, and 90%.

[0029] Steps S422 to S434 illustrate the generation of the heat map display when the thresholds are, for example, 60%, 80%, and 90%. First, the lowest threshold, 60%, is used for the comparison. If the recognition accuracy parameter 236 is below that threshold, the color of the word is set to red (step S424). In the example above, the word "form" is highlighted in red because its recognition accuracy parameter 236 is 55%. In the example shown in FIGS. 4 and 5, other words set to red may be "general" and "Constitution".

[0030] Next, in step S426, the next lowest threshold, 80%, is used for the comparison. If the recognition accuracy parameter 236 is below that threshold, the color of the word in document text 220 is set to green (step S428). In the example above, the word "Union" may have a recognition accuracy parameter 236 of 75%, in which case it is displayed in green. In the example shown in FIGS. 4 and 5, other words set to green may be "insure" and "secure".

[0031] In step S430, the last threshold, 90%, is used for the comparison. If the recognition accuracy parameter 236 is below that threshold, the color of the word in document text 220 is set to blue (step S432). In the example shown in FIGS. 4 and 5, the words set to blue may be "Tranquility" and "establish". If, on the other hand, the recognition accuracy parameter 236 is above all of the thresholds, the color of the word in document text 220 is set to black, which may serve as the default color (step S434). Once the color has been set, the word of document text 220 is displayed in that color (step S436).

[0032] It will be appreciated that the number of thresholds and the colors assigned to them may vary from one embodiment to another without departing from the spirit of the present invention. For example, there may be one, two, three, or even ten thresholds. As another example, the choice of colors may differ (for example, red, orange, and yellow). Indeed, display attributes other than color, such as blinking or underlining, may be used. It will also be understood that, instead of hard-coding the branches as shown in the flowchart of FIG. 6, the thresholds and the display colors or other display attributes may be entered in a table and examined in turn within a single loop.
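The table-driven alternative to the hard-coded branches of FIG. 6 can be sketched as follows. The thresholds and colors are the example values from the description (60%, 80%, 90%; red, green, blue, with black as the default); the names `THRESHOLDS`, `DEFAULT_COLOR`, and `color_for` are hypothetical, and the accuracy parameter is assumed to be a fraction between 0 and 1. The comparison is strictly "below the threshold", following steps S422 to S432.

```python
from typing import List, Tuple

# (threshold, color) pairs in ascending threshold order; example values only.
THRESHOLDS: List[Tuple[float, str]] = [
    (0.60, "red"),
    (0.80, "green"),
    (0.90, "blue"),
]
DEFAULT_COLOR = "black"


def color_for(accuracy: float,
              thresholds: List[Tuple[float, str]] = THRESHOLDS,
              default: str = DEFAULT_COLOR) -> str:
    """Return the color of the first threshold that accuracy falls below."""
    for limit, color in thresholds:
        if accuracy < limit:
            return color
    return default


# Accuracy values for "form" and "Union" are the ones given in the description.
heat_map = {word: color_for(acc) for word, acc in [("form", 0.55), ("Union", 0.75)]}
```

Here "form" (55%) maps to red and "Union" (75%) to green. Changing the number of thresholds, or swapping colors for other display attributes, only requires editing the table.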

[0033] [4. Displaying the Document Image Window] To reduce the time spent referring to the original paper document, a portion of the scanned image of the original paper document (that is, document image 210) is displayed over the OCR interpretation of the text. In the example of scanning the United States Constitution, a portion of the scanned image of the Preamble may be displayed in a window over the OCR output in image display 350, as shown in FIG. 5.

[0034] In image display 350, document text 220 is displayed on the monitor as shown in FIG. 5. The user then selects a word from document text 220 by positioning cursor 360 over that word. When the user clicks a mouse button, or presses an equivalent function key, the portion of document image 210 corresponding to the area surrounding the selected word appears as pop-up window 390. This lets the user see the relevant portion of document image 210 immediately, whenever it is needed.

[0035] The coordinated behavior of document image 210 and document text 220 is achieved by relating the position of each word in document text 220 to the corresponding region of document image 210 by means of correlation table 230. Based on information supplied by cursor control means 124, the position of cursor 360 over document text 220 at any given moment is identified and converted from the coordinate system of image display 350 to the offset system of document text 220 by mapping techniques well known in the art. Correlation table 230 is then used to associate the offset of each word appearing in document text 220 with the coordinates of the corresponding region of document image 210. The portion of document image 210 containing that region is then extracted so that it can be displayed in pop-up window 390. The user can then view the displayed portion of document image 210 to confirm that the corresponding word in document text 220 matches document image 210.
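
As a rough sketch of the extraction step, the following computes the portion of the image to show in the pop-up window from a word's region. The margin, the `(left, top, right, bottom)` region layout, the row-list image representation, and the function names are all assumptions made for illustration; the text does not specify how the surrounding area is chosen.

```python
def crop_box(region, image_size, margin=30):
    """Expand a word region by a margin and clamp it to the image bounds.

    region     -- (left, top, right, bottom) in image pixels, right/bottom exclusive
    image_size -- (width, height) of the document image
    """
    left, top, right, bottom = region
    width, height = image_size
    return (
        max(0, left - margin),
        max(0, top - margin),
        min(width, right + margin),
        min(height, bottom + margin),
    )


def extract_portion(image, box):
    """Cut the box out of an image stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]
```

Clamping keeps the pop-up contents valid for words near the page edge, where the surrounding area would otherwise extend past the image.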

[0036] In another example, the portion of document image 210 displayed in pop-up window 390 is heat mapped in the same manner as document text 220. To tie the display state of a word in document text 220 to that of the corresponding region of document image 210, the same recognition accuracy parameter 236 that determines how a particular word of document text 220 is displayed also determines how the corresponding region of document image 210 is displayed. For example, if the word "form" is assigned a recognition accuracy parameter 236 of 55% in correlation table 230, the word is displayed in red both in document text 220 and in the corresponding region of document image 210. In yet another example, neither document text 220 nor document image 210 is heat mapped, and the portion of document image 210 is simply displayed in a window over document text 220.

[0037] [5. Correcting Errors in the OCR Output] The flowchart of FIG. 6 also shows the process of correcting errors in the OCR output according to an embodiment of the present invention. To make a correction, cursor 310 is positioned over any part of document text 220 using cursor control means 124, such as a mouse, trackball, or joystick, together with a mouse button or function key other than the one used to display pop-up window 390.

[0038] In step S440, processor 112 receives input from cursor control means 124 regarding the position of cursor 310 on image display 300. This input may be generated automatically by cursor control means 124 whenever cursor 310 is positioned over image display 300, or only when the user operates a button. In the latter case, cursor control means 124 sends the current position of cursor 310 as input when the user operates the button.

[0039] The position of cursor 310 associated with the input received in step S440 is converted from the coordinate system of image display 300 to the offset system of document text 220 by mapping techniques well known in the art. In the example shown in FIG. 4, the position of cursor 310 in image display 300 may correspond to offset 59 of document text 220.
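
The text leaves the mapping technique open. One simple possibility, shown here only as a sketch, assumes a fixed-width font laid out on a grid with the text area starting at the display origin; the cell dimensions and the function `cursor_to_offset` are hypothetical.

```python
def cursor_to_offset(x, y, lines, char_width=8, line_height=16):
    """Map a cursor position in display pixels to a character offset.

    lines -- the displayed text lines; offsets index into "\n".join(lines)
    Returns None when the cursor is not over a character.
    """
    row = y // line_height
    col = x // char_width
    if row < 0 or row >= len(lines):
        return None
    if col < 0 or col >= len(lines[row]):
        return None
    # Each earlier line contributes its length plus one newline character.
    return sum(len(line) + 1 for line in lines[:row]) + col
```

A proportional font or a scrolled view would need a different mapping, for example per-character bounding boxes kept by the display code.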

[0040] In step S442, correlation table 230 is searched for an entry specifying an offset pair 234 that contains the offset derived from the input received in step S440. In the example above, offset 59 falls within the offset pair 57-60. The offset pair is then used to extract the character string located in document text 220 at the offsets within the range of offset pair 234.
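
The lookup in step S442 can be sketched as a search over offset pairs. The example assumes the pairs are inclusive, non-overlapping, and sorted by starting offset, which lets a binary search replace a linear scan; the function names and the neighboring pairs in the usage note are illustrative.

```python
from bisect import bisect_right


def find_offset_pair(pairs, offset):
    """Return the inclusive (start, end) pair containing offset, or None.

    pairs -- non-overlapping pairs sorted by start offset (an assumption here)
    """
    starts = [start for start, _ in pairs]
    i = bisect_right(starts, offset) - 1  # last pair starting at or before offset
    if i >= 0 and pairs[i][0] <= offset <= pairs[i][1]:
        return pairs[i]
    return None


def extract_string(text, pair):
    """Extract the characters at the offsets covered by an inclusive pair."""
    start, end = pair
    return text[start:end + 1]
```

With pairs `[(50, 55), (57, 60), (62, 66)]`, offset 59 resolves to `(57, 60)`, while offset 56 (a gap between words) resolves to `None`.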

[0041] In step S444, possible replacement words are generated for the character string at offsets 57-60. A wide variety of techniques for generating possible replacement words are known in the art, and practicing the invention does not require any particular one. For example, word-level information may be considered (e.g., spell checking) to generate possible replacement words. As another example, phrase-level information (e.g., a Markov model of the word sequences present in a database) may be used. These techniques may also be combined and weighted. In the example above, step S444 may generate the following set of possible replacement words for the selected text "domestic": "dominate", "demeanor", and "demotion".

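One way to combine word-level and phrase-level evidence with weights is sketched below. Nothing here is prescribed by the text: the use of `difflib.SequenceMatcher` as the word-level score, the bigram counts standing in for a Markov model, the weights, and all function names are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def word_score(candidate, recognized):
    """Word-level evidence: string similarity in the range 0.0 to 1.0."""
    return SequenceMatcher(None, candidate, recognized).ratio()


def phrase_score(candidate, previous_word, bigram_counts):
    """Phrase-level evidence: relative frequency of candidate after previous_word."""
    total = sum(n for (prev, _), n in bigram_counts.items() if prev == previous_word)
    if total == 0:
        return 0.0
    return bigram_counts.get((previous_word, candidate), 0) / total


def rank_replacements(recognized, previous_word, vocabulary, bigram_counts,
                      word_weight=0.5, phrase_weight=0.5, limit=3):
    """Return up to `limit` candidates, best first, by weighted combined score."""
    scored = []
    for candidate in vocabulary:
        if candidate == recognized:
            continue  # the recognized string is not a replacement for itself
        score = (word_weight * word_score(candidate, recognized)
                 + phrase_weight * phrase_score(candidate, previous_word, bigram_counts))
        scored.append((score, candidate))
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [candidate for _, candidate in scored[:limit]]
```

The returned order is the order in which the candidates would be listed, so the same function also covers ranking by likelihood.
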
[0042] In step S446, the possible replacement words for the selected text are displayed in pop-up menu 330 near cursor 310. Preferably, the replacement words are displayed in pop-up menu 330 in order of their likelihood of being the replacement for the selected text (that is, if the selected text is judged to be wrong, the replacement word at the top of the list in pop-up menu 330 is the one most likely to be used). In one embodiment, a delete option is also provided in pop-up menu 330 near cursor 310, so that the user can delete part of document text 220 without interrupting the work.

[0043] According to another example, pop-up menu 330 for the selected text is displayed automatically whenever cursor 310 is over a word of document text 220. The user can thus sweep cursor 310 across the displayed lines of document text 220 and quickly compare the selected text with the possible replacement words in pop-up menu 330.

[0044] When pop-up menu 330 is displayed, the user may look at pop-up window 390, which contains the portion of document image 210, and decide that the selected text in document text 220 is incorrect. In that case, the user looks through the possible replacement words in pop-up menu 330 for the correct one. Once the correct replacement word is found, the user can select it (e.g., by highlighting the appropriate word and clicking a button of cursor control means 124, or by releasing a button that is being held down). In the example above, the correct replacement for the word "domestic" might be "demeanor", displayed between "dominate" and "demotion" in pop-up menu 330.

[0045] At this point, processor 112 receives the input for the intended correction, as in step S448, and replaces the word in document text 220 with the correction selected by the user, as in step S450. If, however, the correct replacement word is not in pop-up menu 330, the user may enter it in the conventional manner (e.g., via keyboard 122). Generating possible replacement words and displaying them in pop-up menu 330 reduces the time spent correcting the OCR output.

[0046] Once the user has corrected document text 220 or changed it in any other way, correlation table 230 must be updated to reflect the change. In addition, the recognition accuracy parameter 236 of the corrected word in document text 220 is automatically reset to 100%, and the selected text in document text 220 returns to the default color (e.g., black).
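
A sketch of such an update is given below. The text requires only that the table reflect the correction and that the accuracy be reset to 100%; shifting the offsets of later words when the replacement has a different length is an assumption about what "reflect" entails, and the dictionary layout and the function `apply_correction` are hypothetical.

```python
def apply_correction(text, table, index, replacement):
    """Replace the word at table[index] and return (new_text, new_table).

    table -- list of dicts with inclusive "start"/"end" offsets and "accuracy"
             in percent, sorted by offset (an illustrative layout)
    """
    start, end = table[index]["start"], table[index]["end"]
    new_text = text[:start] + replacement + text[end + 1:]
    delta = len(replacement) - (end - start + 1)  # change in text length

    new_table = []
    for i, row in enumerate(table):
        row = dict(row)  # copy so the caller's table is left untouched
        if i == index:
            row["end"] = start + len(replacement) - 1
            row["accuracy"] = 100  # a user-corrected word is treated as certain
        elif row["start"] > end:
            row["start"] += delta  # later words move with the text
            row["end"] += delta
        new_table.append(row)
    return new_text, new_table
```

Because the accuracy of the corrected entry becomes 100, any threshold-based display rule then falls through to the default color without special handling.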

[0047] Although the present invention has been described and illustrated in considerable detail with reference to certain preferred embodiments, other variations are possible. Having read the above description, those skilled in the art will recognize that changes in form or detail may be made to the description or drawings without departing from the spirit or scope of the invention.

[0048]

[Effects of the Invention] As described above, the method and apparatus for pop-up correction of optical character recognition output according to the present invention allow the user to compare the text in the document image side by side with the OCR interpretation of that text. They also allow the user to make this comparison without having to refer to the original document from which the OCR interpretation was generated. Further, they allow the user to compare the text in the document image with its OCR interpretation in order to correct errors that occurred while the original text was being converted into the OCR output text.

[Brief description of the drawings]

FIG. 1 is a high-level block diagram of a computer system on which the present invention can be implemented.

FIG. 2 is a block diagram showing a composite document architecture.

FIG. 3 is a flowchart showing the process of generating a composite document.

FIG. 4 is a diagram showing an example of a screen display according to an embodiment of the present invention.

FIG. 5 is a diagram showing an example of a screen display according to another embodiment of the present invention.

FIG. 6 is a flowchart showing the process of finding and correcting errors in the OCR output according to an embodiment of the present invention.

[Explanation of symbols]

112 processor
120 display device
122 keyboard
124 cursor control means
126 scanner device
128 optical character recognition device
210 document image
220 document text
230 correlation table
236 recognition accuracy parameter
330 pop-up menu

Claims

[Claims]

1. A method for pop-up correction of optical character recognition output, the method being a method for displaying text and comprising: generating a document image of a document; recognizing characters from the document image to generate document text; determining regions of the document image corresponding to words of the document text; associating the regions of the document image with the corresponding words of the document text using a correlation table; and displaying a portion of the document image over the document text.

2. The method for pop-up correction of optical character recognition output according to claim 1, wherein the regions of the document image are displayed so as to indicate their respective recognition accuracy parameters.

3. The method for pop-up correction of optical character recognition output according to claim 1, wherein the corresponding words of the document text are displayed so as to indicate their respective recognition accuracy parameters.

4. The method for pop-up correction of optical character recognition output according to claim 1, wherein both the regions of the document image and the corresponding words of the document text are displayed so as to indicate their respective recognition accuracy parameters.

5. The method for pop-up correction of optical character recognition output according to claim 1, further comprising: receiving an input selecting a location in the document text; determining selected text corresponding to the location in the document text; receiving an input for correcting the selected text; and updating the correlation table with up-to-date information to reflect the correction made to the selected text.

6. The method for pop-up correction of optical character recognition output according to claim 5, wherein receiving the input for correcting the selected text comprises deleting the selected text.

7. The method for pop-up correction of optical character recognition output according to claim 5, wherein receiving the input for correcting the selected text comprises: determining one or more replacement words for the selected text; displaying the one or more replacement words for the selected text; receiving an input indicating a replacement word for the selected text; and replacing the selected text with the replacement word.

8. The method for pop-up correction of optical character recognition output according to claim 7, wherein receiving the input indicating the replacement word comprises receiving keyboard input of the replacement word.

9. The method for pop-up correction of optical character recognition output according to claim 7, wherein the one or more replacement words are displayed in a pop-up menu.

10. An apparatus for pop-up correction of optical character recognition output, the apparatus being an apparatus for displaying text and comprising: a scanner device for generating a document image of a document; an optical character recognition device for recognizing characters in the document image to generate document text; a processor for determining regions of the document image corresponding to words of the document text and for associating the regions of the document image with the corresponding words of the document text using a correlation table; and a display device for displaying a portion of the document image over the document text.

11. The apparatus for pop-up correction of optical character recognition output according to claim 10, wherein the display device displays the regions of the document image so as to indicate their respective recognition accuracy parameters.

12. The apparatus for pop-up correction of optical character recognition output according to claim 10, wherein the display device displays the corresponding words of the document text so as to indicate their respective recognition accuracy parameters.

13. The apparatus for pop-up correction of optical character recognition output according to claim 10, wherein the display device displays both the regions of the document image and the corresponding words of the document text so as to indicate their respective recognition accuracy parameters.

14. The apparatus for pop-up correction of optical character recognition output according to claim 10, further comprising cursor control means for receiving an input selecting a location in the document text, wherein the processor determines selected text corresponding to the location in the document text, receives an input for correcting the selected text, and updates the correlation table with up-to-date information to reflect the correction made to the selected text.

15. The apparatus for pop-up correction of optical character recognition output according to claim 14, wherein the processor receives the input for correcting the selected text by deleting the selected text.

16. The apparatus for pop-up correction of optical character recognition output according to claim 14, wherein the processor receives the input for correcting the selected text by determining one or more replacement words for the selected text, controlling the display device to display the one or more replacement words for the selected text, receiving an input indicating a replacement word for the selected text, and replacing the selected text with the replacement word.

17. The apparatus according to claim 16, further comprising a keyboard for inputting the replacement word for the selected text.

18. The apparatus for pop-up correction of optical character recognition output according to claim 16, wherein the display device is controlled to display the one or more replacement words in a pop-up menu.