JP2005352735A

JP2005352735A - Document file creation support device, document file creation support method, and program thereof

Info

Publication number: JP2005352735A
Application number: JP2004172299A
Authority: JP
Inventors: Shunichi Kimura; 俊一木村; Masanori Sekino; 雅則関野
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2004-06-10
Filing date: 2004-06-10
Publication date: 2005-12-22

Abstract

PROBLEM TO BE SOLVED: To provide a document file creation support device that makes it easy to check the results of character recognition processing.

SOLUTION: The document file creation support device 2 identifies the characters shown in a document image in raster data format and creates image patterns of character images based on the identification results. To create a character string reproduction image that reproduces a character string contained in the document image, the device applies the created image pattern to each character image whose identification accuracy is at or above a reference value, and applies the character image cut out from the document image to each character image whose identification accuracy is below the reference value. The device then displays the reproduction image. The user can thus enter character codes for the characters with low identification accuracy while checking the reproduction image of the character string.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a document file creation support apparatus that supports the creation of a document file using character recognition processing.

For example, Patent Document 1 discloses a character recognition device that displays both an input image and the character recognition result and, when one of them is scrolled, automatically scrolls the other to the corresponding position. Patent Document 2 discloses a character recognition system that displays a line of the document image and the corresponding line of recognition results at positions facing each other across a reference line. Patent Document 3 discloses an optical character reader that prints the recognition result near the character being recognized. Similarly, Patent Document 4 discloses a character recognition display device that displays the character recognition result together with the image information on which the recognition was based.
Japanese Examined Patent Publication No. 7-072903
Japanese Patent Laid-Open No. 10-021326
Japanese Patent Laid-Open No. 7-098746
Japanese Patent Laid-Open No. 7-121654

The present invention has been made in view of the background described above, and its object is to provide a document file creation support apparatus that makes it easy to check the recognition results of character recognition processing.

[Document file creation support apparatus]
To achieve the above object, the document file creation support apparatus according to the present invention includes: character discrimination means for discriminating the characters displayed in a document image on the basis of the document image in raster data format; image pattern generation means for generating an image pattern of a character image on the basis of the discrimination result of the character discrimination means; reproduction image creation means for creating a character string reproduction image, which reproduces a character string contained in the document image, using character images cut out from the document image and the generated image patterns; and user interface means for displaying the character string reproduction image created by the reproduction image creation means.

Preferably, the user interface means displays the character string reproduction image in a display mode in which character images taken from the document image can be distinguished from character images generated by the image pattern generation means.

Preferably, the character discrimination means further determines the accuracy of each character discrimination result. When the accuracy of the discrimination result for a character image is equal to or greater than a reference value, the reproduction image creation means applies the image pattern generated by the image pattern generation means to the portion corresponding to that character image; when the accuracy is less than the reference value, it applies the character image cut out from the document image to that portion.

Preferably, the character discrimination means determines at least the character identification information of the characters displayed in the document image, and the user interface means accepts input of character identification information for character images contained in the displayed character string reproduction image. The apparatus further includes character string file creation means for creating a file of the character identification information corresponding to the character string contained in the document image, on the basis of the character identification information determined by the character discrimination means and the character identification information accepted by the user interface means.

Preferably, the user interface means causes the cursor, which indicates the input target for character identification information in the character string reproduction image, to skip positions depending on whether or not each character image was cut out from the document image.
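The cursor behaviour described above can be sketched as follows. This is only an illustration under an assumed policy: the cursor stops on characters shown as cut-out document images and skips characters shown as generated patterns. The function name `next_input_position` and the labels `"cropped"` and `"pattern"` are hypothetical and do not appear in the specification.

```python
def next_input_position(sources, current):
    """Return the index of the next cut-out character after `current`.

    `sources` lists, per character of the reproduction image, whether it is
    shown as a generated pattern ("pattern") or as the image cut out of the
    document ("cropped"). Returns None when no cut-out character remains.
    """
    for i in range(current + 1, len(sources)):
        if sources[i] == "cropped":
            return i
    return None


# Hypothetical line of five characters; two of them need manual input.
sources = ["pattern", "cropped", "pattern", "pattern", "cropped"]
first = next_input_position(sources, -1)       # start before the line
second = next_input_position(sources, first)   # skip the two patterns
done = next_input_position(sources, second)    # nothing left to enter
```

With this policy the user only ever lands on characters that still need a character code, which is one plausible reading of the skipping rule.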

[Document file creation support method]
The document file creation support method according to the present invention discriminates the characters displayed in a document image on the basis of the document image in raster data format, generates an image pattern of a character image on the basis of the discrimination result, creates a character string reproduction image that reproduces a character string contained in the document image using character images cut out from the document image and the generated image patterns, and displays the created character string reproduction image.

[Program]
The program according to the present invention causes the computer of a document file creation support apparatus to execute the steps of: discriminating the characters displayed in a document image on the basis of the document image in raster data format; generating an image pattern of a character image on the basis of the discrimination result; creating a character string reproduction image that reproduces a character string contained in the document image, using character images cut out from the document image and the generated image patterns; and displaying the created character string reproduction image.

According to the document file creation support apparatus of the present invention, the recognition results of character recognition processing can be checked easily.

First, to aid understanding of the present invention, its background and outline are described.
Techniques such as OCR (Optical Character Reader) have been proposed that optically read image data (a document image) from a document, recognize characters from the image data (raster data format) of the document image thus read, and extract character codes and the like. This character recognition is performed by pattern matching or similar methods, but it has not reached 100% accuracy. The user therefore needs to check and proofread the recognition results of the character recognition processing.

FIG. 1 illustrates techniques intended to make it easier to check the recognition results of character recognition processing.
One way to make the recognition results easier to check, illustrated in FIG. 1(A), is to display the document image read from the document side by side with the images of the characters recognized from it (the recognition result) and to scroll the two together. The user then checks and proofreads by comparing the document image with the recognition result, but this comparison is not easy. Moreover, when the recognition accuracy of the character recognition processing is very low, it may be faster for the user to type the text directly while looking at the document.
Another method, illustrated in FIG. 1(B), displays a line of the character string contained in the document image and the recognition result for that line at positions facing each other across a reference line. The user can easily identify the lines to compare but cannot see other regions of the document image (for example, the next line).
A further method, illustrated in FIG. 1(C), inserts and displays the recognition result near each character image in the document image. However, the document image does not always contain an area into which the recognition result can be inserted.

Thus, when the document image and the recognition result are displayed side by side, comparing them is not easy, and when the recognition accuracy of the character recognition processing is very low, it may be faster for the user to type the text directly while looking at the document. Here, the recognition accuracy of character recognition processing is information indicating how accurate the processing is, for example the degree of matching in pattern matching (number of matching pixels, distance, and so on).

The document file creation support apparatus 2 of the present embodiment therefore reproduces the character string contained in the document image by applying, according to the recognition accuracy of the character recognition processing, either the character image cut out from the document image or a character image generated on the basis of the recognition result.
FIG. 2 illustrates a reproduced image displayed by the document file creation support apparatus 2.
As illustrated in FIG. 2, for character images whose recognition accuracy is equal to or greater than a reference value, the document file creation support apparatus 2 applies a character image generated on the basis of the recognition result (an image pattern, described later, or a font image). For character images whose recognition accuracy is less than the reference value, it applies the character image cut out from the document image. It then displays a reproduced image in which the character string contained in the document image is reproduced.
For portions with high recognition accuracy, the user can check the recognition result for errors from its relationship to the preceding and following characters. For portions with low recognition accuracy, the user can enter character codes and the like directly while looking at the character image cut out from the document image. In other words, by referring to the reproduced image displayed on the document file creation support apparatus 2, the user can both check the recognition results and complete them (proofread) without having to compare the document image with the recognition result.
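The selection rule behind the reproduced image can be sketched in a few lines. This is a minimal illustration, not the apparatus's actual implementation: the class `RecognizedChar`, the function `build_reproduction`, the string placeholders standing in for images, and a recognition accuracy normalized to the range 0 to 1 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RecognizedChar:
    code: str          # character code determined by recognition
    confidence: float  # recognition accuracy, assumed normalized to [0, 1]
    cropped: str       # placeholder for the image cut out of the document
    pattern: str       # placeholder for the image generated from the result


def build_reproduction(chars, threshold):
    """Pick, per character, the generated pattern or the cut-out image."""
    out = []
    for ch in chars:
        if ch.confidence >= threshold:   # at or above the reference value
            out.append(("pattern", ch.pattern))
        else:                            # below the reference value
            out.append(("cropped", ch.cropped))
    return out


# Hypothetical line: the middle character was recognized with low accuracy.
line = [
    RecognizedChar("A", 0.98, "<crop:A>", "<glyph:A>"),
    RecognizedChar("8", 0.41, "<crop:B?>", "<glyph:8>"),
    RecognizedChar("C", 0.93, "<crop:C>", "<glyph:C>"),
]
reproduction = build_reproduction(line, threshold=0.8)
```

Keeping the source label next to each image also gives the display layer what it needs to render the two kinds of character distinguishably.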

[Hardware configuration]
Next, the hardware configuration of the document file creation support apparatus 2 is described.
FIG. 3 illustrates the hardware configuration of the document file creation support apparatus 2 to which the document file creation support method according to the present invention is applied, centering on the control device 20.
As illustrated in FIG. 3, the document file creation support apparatus 2 comprises a control device 20 including a CPU 202 and a memory 204, a communication device 22, a recording device 24 such as an HDD or CD device, and a user interface device (UI device) 26 including an LCD or CRT display device and a keyboard, touch panel, or the like.
The document file creation support apparatus 2 is, for example, a general-purpose computer on which a document file creation program 5 (described later) is installed. It acquires the image data (raster data format) of a document image via the communication device 22 or the recording device 24, performs character recognition processing on the acquired image data, and creates a document file containing the recognition results (character codes and the like). For example, the document file creation support apparatus 2 is connected to a multifunction machine 10 that has a printer function, a scanner function, and so on. It acquires from the multifunction machine 10 the image data of a document image optically read from a document by the scanner function, and creates a document file consisting of character codes and the like on the basis of that image data.

[Document file creation program]
FIG. 4 illustrates the functional configuration of the document file creation program 5, which is executed by the control device 20 (FIG. 3) and implements the document file creation support method according to the present invention.
As illustrated in FIG. 4, the document file creation program 5 has a character recognition unit 40, an image dictionary creation unit 50, an encoding unit 60, a reproduction image creation unit 70, a user interface control unit (UI control unit) 80, and a code file creation unit 90.
In the document file creation program 5, the character recognition unit 40 acquires the image data (raster data format) of a document image, either read by the scanner function of the multifunction machine 10 or obtained via the communication device 22 or the recording device 24, and performs character recognition processing on it. For example, the character recognition unit 40 performs pattern matching by comparing character template images prepared in advance with images (partial images) contained in the document image. It determines the character identification information of the best-matching character, the font information of that character, the character area information of that character, and the recognition accuracy of the character recognition processing, and outputs these results to the image dictionary creation unit 50. Here, character identification information is information that identifies a character, for example a general-purpose character code (such as an ASCII code or a Shift JIS code). Character area information is information indicating the area of a character image in the document image, for example character layout information consisting of the position, size, or extent of the character image, or a combination of these. Font information is information that defines the shape, size, color, and so on of a font image, and includes the font type (Gothic, italic, Mincho, etc.), the font size (in points), and the font color.
The character recognition unit 40 may also perform character recognition processing on the basis of image patterns generated by the image dictionary creation unit 50 (described later). For example, the character recognition unit 40 performs pattern matching using image patterns generated from the document image and recognizes the characters contained in the document image.
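Template-based recognition of the kind described above can be sketched as follows. The sketch assumes binary images of equal size and uses the fraction of agreeing pixels as the recognition accuracy; the specification mentions matching pixel counts and distances as examples but does not fix a measure, so the measure, the function names `pixel_match_rate` and `recognize`, and the tiny 3x3 templates are illustrative assumptions.

```python
def pixel_match_rate(a, b):
    """Fraction of pixel positions where two equally sized binary images agree."""
    total = len(a) * len(a[0])
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def recognize(glyph, templates):
    """Return (character code, match rate) of the best-matching template."""
    best_code, best_score = None, -1.0
    for code, template in templates.items():
        score = pixel_match_rate(glyph, template)
        if score > best_score:
            best_code, best_score = code, score
    return best_code, best_score


# Hypothetical 3x3 templates keyed by ASCII code.
TEMPLATES = {
    0x49: [(0, 1, 0), (0, 1, 0), (0, 1, 0)],  # 'I'
    0x4C: [(1, 0, 0), (1, 0, 0), (1, 1, 1)],  # 'L'
}
# An 'L' with one pixel missing, as might come from a noisy scan.
noisy_l = [(1, 0, 0), (1, 0, 0), (1, 1, 0)]
code, confidence = recognize(noisy_l, TEMPLATES)
```

Here the noisy glyph agrees with the 'L' template on 8 of 9 pixels and with the 'I' template on 4 of 9, so the 'L' code is returned together with its match rate, which can then be compared against the reference value.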

The image dictionary creation unit 50 creates image patterns of the characters that recur in the document image, on the basis of the results of the character recognition processing input from the character recognition unit 40 and the image data of the document image. For example, the image dictionary creation unit 50 cuts character images out of the document image on the basis of the character identification information, character area information, and so on input from the character recognition unit 40, creates image patterns from the cut-out character images, assigns an index to each created image pattern (character image), and outputs the image patterns and indexes to the encoding unit 60 as an image dictionary.

The encoding unit 60 compresses the image data of the document image on the basis of the image dictionary input from the image dictionary creation unit 50, and outputs the compressed image data and the image dictionary to the recording device 24 (FIG. 3), the multifunction machine 10 (FIG. 3), or the like. More specifically, the encoding unit 60 compares the image patterns registered in the image dictionary with the character images contained in the document image (those whose recognition accuracy is equal to or greater than the reference value), and compresses the data of each character image that matches one of the image patterns by replacing it with the index of that image pattern and the position information of the character image. Here, "match" covers not only an exact match but also a partial match within a predetermined tolerance. The encoding unit 60 may further encode the indexes and position information that replaced the character images, the image dictionary, and so on by entropy coding (Huffman coding, arithmetic coding, LZ coding, etc.).
For character images whose recognition accuracy is less than the reference value, the encoding unit 60 compresses the image data of the character image as it is, using another coding scheme suited to raster data (MH, MMR, etc.). The encoding unit 60 also encodes images that are not subject to character recognition processing (for example, photographic images and CG images) by applying other coding schemes (JPEG, MH, MMR, etc.).
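The substitution step of the encoding unit can be sketched as below. This is a simplified illustration: images are tuples of strings of equal size, the tolerance is a maximum number of differing pixels, and low-accuracy characters are kept as raw images where the specification would apply a raster coding such as MH or MMR. The names `mismatch`, `encode_glyphs`, and the token layout are hypothetical.

```python
def mismatch(a, b):
    """Number of pixel positions at which two equally sized images differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def encode_glyphs(glyphs, dictionary, threshold, tolerance):
    """Replace matching high-accuracy glyphs by (index, position) references."""
    tokens = []
    for g in glyphs:
        # Default: keep the glyph image itself (stand-in for MH/MMR coding).
        token = ("raw", g["pos"], g["image"])
        if g["confidence"] >= threshold:
            for index, pattern in dictionary.items():
                if mismatch(g["image"], pattern) <= tolerance:
                    token = ("ref", g["pos"], index)
                    break
        tokens.append(token)
    return tokens


# Hypothetical dictionary with one pattern (an 'H') under index 7.
DICTIONARY = {7: ("#.#", "###", "#.#")}
GLYPHS = [
    {"image": ("#.#", "###", "#.#"), "pos": (0, 0), "confidence": 0.95},
    {"image": ("#.#", "##.", "#.#"), "pos": (4, 0), "confidence": 0.90},
    {"image": ("#.#", "###", "#.#"), "pos": (8, 0), "confidence": 0.30},
]
tokens = encode_glyphs(GLYPHS, DICTIONARY, threshold=0.8, tolerance=1)
```

The second glyph differs from the pattern by one pixel and is still replaced by a reference, which is the "partial match within a tolerance" case. The third glyph is identical to the pattern but stays raw because its recognition accuracy is below the reference value.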

The reproduction image creation unit 70 creates a reproduced image corresponding to the document image, using the character images cut out from the document image and the image patterns generated by the image dictionary creation unit 50. Specifically, using the image data of the document image input from the encoding unit 60, the reproduction image creation unit 70 reproduces the portions that were replaced with an index and position information (that is, character images whose recognition accuracy is equal to or greater than the reference value) with the image pattern corresponding to the index. It reproduces the portions that were not replaced (that is, character images whose recognition accuracy is less than the reference value and which were compressed by another coding scheme) by decoding the image data of those portions. In other words, for character images whose recognition accuracy is equal to or greater than the reference value, the reproduction image creation unit 70 applies the closest of the image patterns created by the image dictionary creation unit 50, and for character images whose recognition accuracy is less than the reference value, it applies the character image cut out from the document image itself, thereby creating the image data of a reproduced image that reproduces the document image.
The reproduction image creation unit 70 also applies the image cut out from the document image to images that are not subject to character recognition processing.
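Rebuilding a reproduced image from encoded data might look like the following sketch. It assumes a hypothetical token format of `(kind, (x, y), payload)`, where a `"ref"` token carries a dictionary index and a `"raw"` token carries the already decoded cut-out image; the function `render` and the character-grid canvas are illustrative only.

```python
def render(tokens, dictionary, width, height):
    """Paint each token onto a blank canvas and record where it came from."""
    canvas = [["." for _ in range(width)] for _ in range(height)]
    provenance = []
    for kind, (x, y), payload in tokens:
        # "ref": look the pattern up by index; "raw": use the image itself.
        image = dictionary[payload] if kind == "ref" else payload
        for dy, row in enumerate(image):
            for dx, pixel in enumerate(row):
                canvas[y + dy][x + dx] = pixel
        provenance.append("pattern" if kind == "ref" else "cropped")
    return ["".join(row) for row in canvas], provenance


# Hypothetical input: one dictionary reference and one cut-out image.
DICTIONARY = {0: ("#.", "##")}
TOKENS = [("ref", (0, 0), 0), ("raw", (3, 0), ("##", ".#"))]
rows, provenance = render(TOKENS, DICTIONARY, width=5, height=2)
```

The `provenance` list is not required for reproduction itself, but it records which characters came from the dictionary and which from the document, which a user interface could use to display the two kinds differently.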

The UI control unit 80 controls the UI device 26 (FIG. 3) to display the reproduced image created by the reproduction image creation unit 70.
The UI control unit 80 also accepts, from the user via the UI device 26, proofreading operations on the character identification information for the reproduced image, and instructs the reproduction image creation unit 70 and the code file creation unit 90 accordingly. The reproduction image creation unit 70 changes the reproduced image in response to this instruction.

The code file creation unit 90 creates a document file corresponding to the document image (a code file consisting of character codes, font information, and so on) on the basis of the recognition results of the character recognition processing by the character recognition unit 40 and the user's proofreading operations input to the UI control unit 80. Specifically, regardless of recognition accuracy, the code file creation unit 90 takes all recognition results of the character recognition processing by the character recognition unit 40 (all character codes and so on) as the baseline and corrects them (by replacement, deletion, addition, and so on) according to the proofreading operations input to the UI control unit 80.
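The baseline-plus-corrections behaviour of the code file creation unit can be sketched as follows. The correction format `(operation, position, value)`, with positions referring to the original recognition result, and the function name `apply_corrections` are assumptions made for this example; the specification only states that results are replaced, deleted, or added.

```python
def apply_corrections(recognized, corrections):
    """Apply replace/delete/insert operations to the full recognition result.

    Positions refer to the original string, so operations are applied from
    the end backwards to keep earlier positions valid.
    """
    text = list(recognized)
    for op, pos, value in sorted(corrections, key=lambda c: c[1], reverse=True):
        if op == "replace":
            text[pos] = value
        elif op == "delete":
            del text[pos]
        elif op == "insert":
            text.insert(pos, value)
        else:
            raise ValueError(f"unknown operation: {op}")
    return "".join(text)


# Hypothetical recognition result with a misread and a duplicated character.
recognized = "HE1LO WORLDD"
corrections = [("replace", 2, "L"), ("delete", 11, None)]
corrected = apply_corrections(recognized, corrections)
```

Because every recognized code is kept unless the user changes it, high-accuracy characters never need to be retyped.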

FIG. 5 illustrates the functions of the image dictionary creation unit 50 in more detail.
As shown in FIG. 5, the image dictionary creation unit 50 has a storage unit 500, a character image extraction unit 510, a match determination unit 520, a dictionary determination unit 530, a position correction unit 540, and an index assignment unit 550.
The storage unit 500 controls the memory 204 (FIG. 3) and the recording device 24 (FIG. 3) to store the document image, character identification information, and character area information input from the character recognition unit 40. In the following description, a character code is used as a specific example of character identification information, and character position information as a specific example of character area information.

The character image extraction unit 510 cuts character images out of the document image on the basis of the character position information. That is, the character image extraction unit 510 extracts from the document image, as a character image, the area indicated by the character area information. Each extracted character image is an area that the character recognition unit 40 has determined to be a character image. When character images are cut out of the document image during the character recognition processing, the image dictionary creation unit 50 may use those character images as they are.

The match determination unit 520 compares a character image cut out from the document image with an image pattern registered in the image dictionary and determines their degree of matching. Here, the degree of matching is information indicating how closely a plurality of images match one another. When binary images are compared, for example, it is the number of pixels that coincide when the two images are superimposed (hereinafter, the number of matching pixels), a matching pixel rate obtained by normalizing that number (for example, the number of matching pixels divided by the total number of pixels), or the pixel distribution (histogram) obtained when a plurality of images are superimposed.
The match determination unit 520 determines the degree of matching by comparing the cut-out character image and the registered image pattern at a plurality of relative positions. That is, to find the maximum degree of matching, the match determination unit 520 compares the cut-out character image with the image pattern registered in the image dictionary while changing (shifting) their relative position.
For example, the match determination unit 520 calculates the matching pixel rate while shifting, relative to each other, a character image cut out from the document image and an image pattern whose character code (or combination of character code and font information) matches that of the character image. It outputs to the storage unit 500 the maximum matching pixel rate and the shift vector at which that maximum occurred.
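The shift search performed by the match determination unit can be sketched as an exhaustive search over small offsets. The sketch assumes binary images, counts a pixel as matching when both images have the same value at that position, treats pattern pixels shifted outside the frame as background (0), and limits the search to one pixel in each direction; these choices and the names `match_rate` and `best_shift` are assumptions, not details given in the specification.

```python
def match_rate(image, pattern, dx, dy):
    """Matching pixel rate with `pattern` shifted by (dx, dy) over `image`."""
    height, width = len(image), len(image[0])
    same = 0
    for y in range(height):
        for x in range(width):
            py, px = y - dy, x - dx
            inside = 0 <= py < len(pattern) and 0 <= px < len(pattern[0])
            p = pattern[py][px] if inside else 0  # outside counts as background
            same += image[y][x] == p
    return same / (height * width)


def best_shift(image, pattern, max_shift=1):
    """Return (maximum matching pixel rate, shift vector achieving it)."""
    best = (-1.0, (0, 0))
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            rate = match_rate(image, pattern, dx, dy)
            if rate > best[0]:
                best = (rate, (dx, dy))
    return best


# Hypothetical 4x4 images: the same vertical bar, one column apart.
PATTERN = [[0, 1, 0, 0] for _ in range(4)]
IMAGE = [[0, 0, 1, 0] for _ in range(4)]
rate, shift = best_shift(IMAGE, PATTERN)
```

In this example the pattern matches the cut-out image perfectly once it is moved one pixel to the right, so the search returns a rate of 1.0 with the shift vector (1, 0). A position correction step can then add that vector to the character's recorded position.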

The dictionary determination unit 530 determines the image patterns to be registered in the image dictionary on the basis of the character images cut out by the character image extraction unit 510 and the degrees of matching determined by the match determination unit 520. For example, the dictionary determination unit 530 selects a plurality of character images whose degree of matching is equal to or greater than a reference value and takes the common shape of these character images as an image pattern. In other words, the dictionary determination unit 530 associates character images of similar shape with one another by way of an image pattern.
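One way to derive a "common shape" from similar character images is sketched below. The specification does not define how the common shape is computed; this example assumes it is the pixelwise intersection (logical AND) of binary images whose agreement with a seed image reaches the reference value. The names `agreement`, `common_shape`, and `make_pattern` are hypothetical.

```python
def agreement(a, b):
    """Fraction of pixel positions where two equally sized binary images agree."""
    total = len(a) * len(a[0])
    return sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)) / total


def common_shape(images):
    """Pixelwise AND: a pixel is set only if it is set in every image."""
    return [[int(all(pixels)) for pixels in zip(*rows)] for rows in zip(*images)]


def make_pattern(seed, candidates, threshold):
    """Group candidates similar to `seed` and return their common shape."""
    members = [img for img in candidates if agreement(seed, img) >= threshold]
    return common_shape([seed] + members), len(members)


# Hypothetical glyphs: `a` is the seed with one pixel missing, `b` is unrelated.
seed = [[1, 1, 0], [1, 0, 0], [1, 1, 1]]
a = [[1, 1, 0], [1, 0, 0], [1, 1, 0]]
b = [[0, 0, 1], [0, 1, 1], [0, 0, 0]]
pattern, member_count = make_pattern(seed, [a, b], threshold=0.8)
```

Other definitions, such as a per-pixel majority vote, would fit the same interface; the intersection is used here only because it is the simplest to verify by hand.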

The position correction unit 540 corrects the position information of a character image based on the shift vector output from the coincidence determination unit 520. That is, the position correction unit 540 corrects the position information of the character image input from the character recognition unit 40 so that the degree of coincidence between the character image and the image pattern is maximized.
The index assigning unit 550 assigns, to each image pattern registered in the image dictionary, an index that identifies the image pattern, and outputs the assigned indexes, the image patterns, and the character codes, associated with one another, to the storage unit 500 as the image dictionary.

FIG. 6 is a diagram illustrating an image dictionary 902 created by the image dictionary creation unit 50. For convenience, the following description takes as a specific example the case where the image data of the document image is binary data.
As illustrated in FIG. 6, the image dictionary creation unit 50 forms the image dictionary 902 by associating with one another an image pattern generated by the dictionary determination unit 530 (FIG. 5), the character code (character identification information) corresponding to the image pattern, and the index assigned to the image pattern. The character code is the one determined by the character recognition unit 40, and the image pattern is generated from the character images classified under this character code.
In this example, a plurality of image patterns (“file 014” and “file 031”) are associated with the same character code (“0x42”). This is because, even when the character codes match, the dictionary determination unit 530 generates a separate image pattern for each character image if the shapes of the character images differ too much (for example, when the font type or font size differs).
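The structure of the image dictionary 902 described above can be sketched as follows. This is only an illustrative sketch: the names `DictionaryEntry`, `ImageDictionary`, `register` and `patterns_for` are hypothetical and do not appear in the specification, and image patterns are assumed to be binary rasters held as nested lists (1 = black pixel).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DictionaryEntry:
    index: int                # index assigned to the image pattern
    char_code: int            # character code determined by character recognition
    pattern: List[List[int]]  # binary raster of the image pattern (assumed layout)


@dataclass
class ImageDictionary:
    entries: List[DictionaryEntry] = field(default_factory=list)

    def register(self, char_code: int, pattern: List[List[int]]) -> int:
        """Add a pattern and return its index (here simply the next free number)."""
        entry = DictionaryEntry(len(self.entries), char_code, pattern)
        self.entries.append(entry)
        return entry.index

    def patterns_for(self, char_code: int) -> List[DictionaryEntry]:
        """Return every entry registered under one character code."""
        return [e for e in self.entries if e.char_code == char_code]


# Two differently shaped patterns may share the character code 0x42.
dictionary = ImageDictionary()
dictionary.register(0x42, [[1, 1], [1, 0]])
dictionary.register(0x42, [[1, 1, 1], [1, 0, 1]])
print([e.index for e in dictionary.patterns_for(0x42)])
```

Keeping the character code as a non-unique field, rather than as a key, is what allows one code to map to several patterns of different font or size.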

FIG. 7 is a diagram explaining the functions of the encoding unit 60 in more detail.
As illustrated in FIG. 7, the encoding unit 60 includes a pattern determination unit 610, a position information encoding unit 620, an index encoding unit 630, an image encoding unit 640, a dictionary encoding unit 650, a selection unit 660, and a code output unit 670.
The pattern determination unit 610 compares each image pattern registered in the image dictionary with a partial image included in the document image, and determines the image pattern (an identical or similar image pattern) that corresponds to the partial image. Specifically, the pattern determination unit 610 superimposes a partial image cut out from the document image in units of character images (after correction by the position correction unit 540) on an image pattern, calculates the degree of coincidence by the same method as the coincidence determination unit 520 (FIG. 5), and determines whether the two correspond based on whether the calculated degree of coincidence is equal to or greater than an allowable value.
When a corresponding image pattern is found, the pattern determination unit 610 outputs the position information of the partial image to the position information encoding unit 620 and outputs the index of the image pattern to the index encoding unit 630. In addition, the pattern determination unit 610 outputs the image data of the partial image to the image encoding unit 640 regardless of whether a corresponding image pattern is found.
Note that the pattern determination unit 610 in this embodiment acquires, from the image dictionary creation unit 50, the index of the image pattern that matches each character image cut out from the document image and the position information of the character image (as corrected by the position correction unit 540). Therefore, for a partial image cut out as a character image, it outputs the index and the position information input from the image dictionary creation unit 50 to the index encoding unit 630 and the position information encoding unit 620, respectively.

The position information encoding unit 620 encodes the position information input from the pattern determination unit 610 (that is, the position information of the partial image (character image) corrected by the position correction unit 540) and outputs it to the selection unit 660. For example, the position information encoding unit 620 encodes the position information by applying LZ coding, arithmetic coding, or the like.
The index encoding unit 630 encodes the index input from the pattern determination unit 610 and outputs it to the selection unit 660. For example, the index encoding unit 630 assigns to each index a code whose length depends on the appearance frequency of the index.
The image encoding unit 640 encodes the partial image input from the pattern determination unit 610 by applying an encoding method suitable for images (rasterized image data), such as JPEG, MH, or MMR, and outputs it to the selection unit 660.
The dictionary encoding unit 650 encodes at least a part of the image dictionary input from the image dictionary creation unit 50 (FIGS. 4 and 5) and outputs it to the code output unit 670. For example, the dictionary encoding unit 650 encodes the image patterns (raster data) included in the image dictionary 902 by an encoding method suitable for images.
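The text states only that the index encoding unit 630 gives each index a code length that depends on its appearance frequency; it does not name a specific code. As one possible illustration, the sketch below assumes Huffman coding and computes the code length for each index. The function name `huffman_code_lengths` and the sample index values are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code_lengths(indices):
    """Return {index: code length in bits}; frequent indexes get shorter codes.

    Huffman coding is an assumption made for illustration, not the method
    named in the specification.
    """
    freq = Counter(indices)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    tie = count()  # tie-breaker so that the dicts are never compared
    heap = [(n, next(tie), {sym: 0}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol below them.
        merged = {sym: length + 1 for sym, length in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]


# Index 14 appears most often, so it receives the shortest code.
print(huffman_code_lengths([14, 14, 14, 14, 31, 31, 7]))
```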

The selection unit 660 selects, according to the recognition accuracy of the character recognition processing, either the code data input from the position information encoding unit 620 and the index encoding unit 630 or the code data input from the image encoding unit 640, and outputs the selected code data to the code output unit 670. Specifically, for a character image whose recognition accuracy is equal to or higher than the reference value, the selection unit 660 outputs to the code output unit 670 the code data of the position information input from the position information encoding unit 620 and the code data of the index input from the index encoding unit 630, associated with each other. For a character image whose recognition accuracy is below the reference value, it outputs to the code output unit 670 the code data of the partial image encoded by the image encoding unit 640.
The code output unit 670 associates with one another the code data input from the selection unit 660 (the code data of the position information, indexes, and partial images), the code data input from the dictionary encoding unit 650 (the code data of the image dictionary), and the character correspondence table 904, and outputs them to the reproduced image creation unit 70, the recording device 24 (FIG. 3), and the like.
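The choice made by the selection unit 660 can be pictured with a small sketch. The record layout `CharRecord`, the function `select_code`, and the tuple tags are hypothetical names introduced only for illustration; falling back to the image code when no index is available is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CharRecord:
    accuracy: float            # recognition accuracy from character recognition
    position: Tuple[int, int]  # corrected position of the character image
    index: Optional[int]       # index of the matching image pattern, if any
    image_code: bytes          # partial image already encoded as raster data


def select_code(rec: CharRecord, reference: float):
    """Pick dictionary-based code data or raw image code data for one character."""
    if rec.accuracy >= reference and rec.index is not None:
        # Reliable recognition: emit only the index and the position.
        return ("dictionary", rec.index, rec.position)
    # Unreliable recognition: keep the character image itself.
    return ("image", rec.image_code)


reliable = CharRecord(0.95, (10, 20), 14, b"raw-a")
doubtful = CharRecord(0.40, (30, 20), 31, b"raw-b")
print(select_code(reliable, 0.9))
print(select_code(doubtful, 0.9))
```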

FIG. 8 is a diagram illustrating a reproduced image 260 displayed on the UI device 26.
As illustrated in FIG. 8, the UI control unit 80 (FIG. 4) displays, on the monitor of the UI device 26 (FIG. 3), a reproduced image 260 obtained by combining images cut out from the document image with image patterns. The reproduced image 260 also shows an underline 262 for distinguishing images cut out from the document image from image patterns, and a cursor 264 indicating the user's input position.
The underline 262 in this example is displayed in the vicinity of a character image cut out from the document image, and serves to inform the user of a character image whose recognition accuracy is below the reference value. In this example, the underline 262 distinguishes character images cut out from the document image (that is, character images whose recognition accuracy is below the reference value) from image patterns (that is, character images whose recognition accuracy is equal to or higher than the reference value). However, the distinction is not limited to this; for example, the color (density) of the character image or the color around the character image may be varied instead.
The cursor 264 in this example is displayed at a position associated with the character image to be input (specifically, just below it). When the user performs an operation to change the input target, the cursor skips character images whose recognition accuracy is equal to or higher than the reference value (that is, image patterns) and moves only among the regions corresponding to character images whose recognition accuracy is below the reference value (that is, character images cut out from the document image).
As a result, the user can enter character codes and the like only for character images for which the recognition accuracy of the character recognition processing is low.
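The cursor behaviour described above amounts to searching forward for the next character whose recognition accuracy is below the reference value. The following is a minimal sketch under that reading; `next_input_target` is a hypothetical helper, and the accuracies are assumed to be held in reading order.

```python
def next_input_target(accuracies, current, reference):
    """Return the position of the next low-accuracy character after `current`.

    Characters at or above the reference value (shown as image patterns)
    are skipped. Returns None when no further candidate exists.
    """
    for i in range(current + 1, len(accuracies)):
        if accuracies[i] < reference:
            return i
    return None


accuracies = [0.99, 0.40, 0.95, 0.97, 0.30]
position = next_input_target(accuracies, -1, 0.9)       # first target
print(position)
print(next_input_target(accuracies, position, 0.9))     # next target
```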

[Reproduced image display operation]
Next, the processing by which the document file creation support apparatus 2 creates and displays the reproduced image 260 (FIG. 8) will be described.
FIG. 9 is a flowchart showing the overall operation of the reproduced image display process (S10). For convenience of explanation, the case where binary image data is input is taken as a specific example.
As shown in FIG. 9, in step 100 (S100), the multifunction device 10 (FIG. 3) optically reads a document image from a document and transmits the image data (raster data) of the read document image to the document file creation support apparatus 2 (FIG. 3).
In step 102 (S102), when the image data (binary raster data) of the document image is input from the multifunction device 10 (FIG. 3), the character recognition unit 40 (FIG. 4) of the document file creation support apparatus 2 performs character recognition processing on the input image data, determines, character by character, the character code and position information of each character image included in the document image and the recognition accuracy of the character recognition processing, and outputs the determined character code, position information, and recognition accuracy to the image dictionary creation unit 50.

In step 104 (S104), the storage unit 500 (FIG. 5) of the image dictionary creation unit 50 stores the document image (binary), character codes, position information, and recognition accuracy input from the character recognition unit 40 in the memory 204 (FIG. 3).
The character image extraction unit 510 (FIG. 5) cuts out a character image from the document image based on the input position information and outputs it to the storage unit 500.

In step 106 (S106), the dictionary determination unit 530 reads an image pattern and its index from the image dictionary 902 (FIG. 6) based on the character code of each character image cut out in order from the document image. Specifically, the storage unit 500 stores, as the image dictionary 902, the image patterns that have already been determined, the indexes assigned to these image patterns, and the character codes of the character images corresponding to these image patterns, associated with one another. The dictionary determination unit 530 acquires from the character recognition unit 40 the character code of the character image newly cut out by the character image extraction unit 510, and reads the image pattern and index corresponding to the acquired character code from the image dictionary 902. When the processing target is the first character image cut out from the document image (that is, when no image pattern is yet registered in the image dictionary 902), this character image is registered in the image dictionary 902 as it is.

In step 108 (S108), the coincidence determination unit 520 compares the character images sequentially cut out by the character image extraction unit 510 with the image patterns registered in the image dictionary 902, and determines the degree of coincidence at a plurality of relative positions. Specifically, the coincidence determination unit 520 calculates the number of matching black pixels K while shifting the image pattern read by the dictionary determination unit 530 and the cut-out character image relative to each other.
The number of matching pixels K is calculated by the following formula, where x is a position vector indicating a relative position in the image, S(x) is the distribution of black pixels of the image pattern, i (1 to N) is the number of the character image cut out in order, P(i, x) is the distribution of black pixels of the character image, and v_i is the shift vector of character image i.
(Number of matching pixels K) = Σ { S(x) * P(i, x − v_i) }
Note that “Σ” denotes the sum over the variable x.
Next, the coincidence determination unit 520 normalizes the calculated number of matching pixels K to obtain the matching pixel rate K′.
The matching pixel rate K′ is calculated by the following formula, where M is the number of pixels constituting the character image.
(Matching pixel rate K′) = K / M
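The two formulas can be tried out with a short sketch. It assumes binary rasters stored as nested lists (1 = black pixel), treats pixels outside the character image as white, searches shifts within a small window, and takes M to be the total pixel count of the character image; these choices, and the names `matching_pixels` and `best_match`, are assumptions made for illustration.

```python
def matching_pixels(pattern, char, shift):
    """K = sum over x of S(x) * P(i, x - v) for binary rasters."""
    dx, dy = shift
    height, width = len(char), len(char[0])
    k = 0
    for y, row in enumerate(pattern):
        for x, s in enumerate(row):
            cx, cy = x - dx, y - dy  # the position x - v in the character image
            if s and 0 <= cy < height and 0 <= cx < width and char[cy][cx]:
                k += 1
    return k


def best_match(pattern, char, max_shift=1):
    """Return (maximum matching pixel rate K', shift vector giving that maximum)."""
    m = len(char) * len(char[0])  # assumption: M is the total pixel count
    best_k, best_shift = -1, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            k = matching_pixels(pattern, char, (dx, dy))
            if k > best_k:
                best_k, best_shift = k, (dx, dy)
    return best_k / m, best_shift


# The character image is the pattern moved one pixel to the left.
pattern = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
char = [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
print(best_match(pattern, char))
```

The shift returned here is the vector that the later position correction step uses as its correction vector.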

In step 110 (S110), the dictionary determination unit 530 determines, based on the degree of coincidence between the image pattern and the newly cut-out character image as determined by the coincidence determination unit 520, whether to register an image pattern based on the newly cut-out character image. Specifically, when the determined matching pixel rate K′ is equal to or higher than the reference value, the dictionary determination unit 530 associates the cut-out character image with the index of the image pattern having the largest matching pixel rate K′ and proceeds to S114; when the determined matching pixel rate K′ is lower than the reference value, it proceeds to S112.
That is, when the degree of coincidence is equal to or higher than the reference value, the dictionary determination unit 530 prohibits registration of a new image pattern based on this character image and associates the character image with an already registered image pattern; when the degree of coincidence is lower than the reference value, it newly registers this character image in the image dictionary 902 as an image pattern.

In step 112 (S112), the dictionary determination unit 530 registers the cut-out character image in the image dictionary 902 as an image pattern, and associates the character image and the image pattern with each other.
The index assigning unit 550 assigns, to the image pattern determined by the dictionary determination unit 530, identification information (an index) that uniquely identifies the image pattern. The index assigning unit 550 then registers the assigned index in the image dictionary 902 in association with the image pattern. The assigned index uniquely identifies each image pattern at least within this document image.
The index assigned to this character image and the position information of this character image are sequentially output to the encoding unit 60 as data to be encoded.
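The decision made in S110 and the registration in S112 can be summarised in a sketch. The dictionary is modelled as a plain list of (character code, pattern) pairs whose list position serves as the index, and `overlap_rate` is a simplified stand-in for the shifted comparison of S108; all of these names and simplifications are assumptions.

```python
def overlap_rate(pattern, char):
    """Simplified stand-in for K': matching black pixels / total pixels, no shifting."""
    a = [p for row in pattern for p in row]
    b = [p for row in char for p in row]
    return sum(x & y for x, y in zip(a, b)) / len(a)


def classify_or_register(dictionary, char_code, char_image, reference,
                         match_rate=overlap_rate):
    """Return (index, newly_registered) for one cut-out character image."""
    best_index, best_rate = None, -1.0
    for index, (code, pattern) in enumerate(dictionary):
        if code != char_code:
            continue  # only patterns with the same character code are candidates
        rate = match_rate(pattern, char_image)
        if rate > best_rate:
            best_index, best_rate = index, rate
    if best_index is not None and best_rate >= reference:
        return best_index, False  # reuse the registered pattern (go to S114)
    dictionary.append((char_code, char_image))  # register as a new pattern (S112)
    return len(dictionary) - 1, True


patterns = []
print(classify_or_register(patterns, 0x42, [[1, 1], [1, 1]], 0.7))
print(classify_or_register(patterns, 0x42, [[1, 1], [1, 0]], 0.7))
print(classify_or_register(patterns, 0x42, [[1, 0], [0, 0]], 0.7))
```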

In step 114 (S114), the position correction unit 540 determines a correction vector for the position information input from the character recognition unit 410, based on the degree of coincidence (the number of matching pixels K or the matching pixel rate K′) calculated by the coincidence determination unit 520 at a plurality of relative positions for the character image cut out from the document image and the image pattern registered in the image dictionary 902. Specifically, the position correction unit 540 uses as the correction vector the shift vector v_i at which the number of matching pixels K calculated in S108 for the cut-out character image and the image pattern is largest.
That is, the image dictionary creation unit 50 corrects the cut-out position of the character image being processed (the position information of the character image) so that the character image and the image pattern corresponding to it coincide as closely as possible.
The position information of the character image corrected by the correction vector is sequentially output to the encoding unit 60 as data to be encoded, together with the index of the image pattern corresponding to the character image.

In step 116 (S116), the encoding unit 60 encodes character images whose recognition accuracy is equal to or higher than the reference value by applying an encoding method based on the image dictionary 902, and encodes character images whose recognition accuracy is below the reference value by applying another encoding method that does not use the image dictionary 902. Specifically, when the recognition accuracy of an input character image is equal to or higher than the reference value, the encoding unit 60 encodes, instead of the image data of the character image, the index corresponding to the character image and the position information of the character image (that is, the data to be encoded described above); when the recognition accuracy of the input character image is below the reference value, it encodes the image data of the character image.
The encoded image data of the document image is output, together with the image dictionary 902, to the reproduced image creation unit 70 (FIG. 4) and the recording device 24 (FIG. 3).

In step 118 (S118), the document file creation program 5 determines whether the processing from S102 to S116 has been completed for all character images included in the document image. If it has been completed for all character images, the process proceeds to S120; otherwise, the process returns to S102, the next character image is cut out, and the processing from S104 to S116 is repeated.
As a result, each character image included in the document image is associated with one of the image patterns registered in the image dictionary 902. In other words, character images of similar shape are associated with one another via an image pattern (index).

In step 120 (S120), the reproduced image creation unit 70 decodes the code data of the document image input from the encoding unit 60 and creates the reproduced image 260. That is, the reproduced image creation unit 70 creates the reproduced image 260 (FIG. 8) by replacing, within the image data of the document image, each character image whose recognition accuracy is equal to or higher than the reference value with the image pattern corresponding to that character image (the image pattern with the largest number of matching pixels). The reference value is desirably a value at which the result of the character recognition processing can be judged reliable, and is set, for example, by the user.
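The composition of the reproduced image 260 can be illustrated as follows. This sketch skips the actual decoding and works on already decoded data: each character is a hypothetical record holding its recognition accuracy, pattern index, position, and clipped image, and rasters are nested lists with 1 = black pixel.

```python
def build_reproduction(width, height, chars, patterns, reference):
    """Paste image patterns for reliable characters and clipped images for the rest."""
    canvas = [[0] * width for _ in range(height)]
    for ch in chars:
        if ch["accuracy"] >= reference:
            glyph = patterns[ch["index"]]  # image pattern from the dictionary
        else:
            glyph = ch["image"]            # character image from the document image
        x0, y0 = ch["position"]
        for dy, row in enumerate(glyph):
            for dx, pixel in enumerate(row):
                if pixel and 0 <= y0 + dy < height and 0 <= x0 + dx < width:
                    canvas[y0 + dy][x0 + dx] = 1
    return canvas


patterns = {0: [[1, 1]]}
chars = [
    {"accuracy": 0.95, "index": 0, "position": (0, 0), "image": [[1, 0]]},
    {"accuracy": 0.20, "index": 0, "position": (2, 0), "image": [[0, 1]]},
]
print(build_reproduction(4, 1, chars, patterns, 0.9))
```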

In step 122 (S122), the UI control unit 80 displays the reproduced image 260 created by the reproduced image creation unit 70 on the monitor of the UI device 26 (FIG. 3). At that time, the UI control unit 80 displays the underline 262 (FIG. 7) so that character images cut out from the document image can be distinguished from image patterns, and further displays the cursor 264 (FIG. 7), which indicates the user's input position, in the vicinity of a character image cut out from the document image.
Note that the code file creation unit 90 treats the result of the character recognition processing by the character recognition unit 40 (that is, the group of character codes) as a provisional code file.

[Proofreading process]
Next, the proofreading process for the code file (the result of the character recognition processing) will be described.
FIG. 10 is a flowchart showing the proofreading process (S20) for the character recognition result.
As shown in FIG. 10, in step 200 (S200), the UI control unit 80 (FIG. 4) displays the reproduced image 260 created by the reproduced image creation unit 70 on the monitor of the UI device 26 (FIG. 3). The reproduced image 260 contains character images cut out from the document image (corresponding to character images with low recognition accuracy), image patterns (corresponding to character images with high recognition accuracy), and the underline 262 (FIG. 7) placed in the vicinity of the character images cut out from the document image.
At this stage, the result of the character recognition processing by the character recognition unit 40 (that is, the group of character codes) is held as it is as the provisional code file.

In step 202 (S202), the UI control unit 80 searches the character images included in the reproduced image 260, in order, for a character image whose recognition accuracy is below the reference value (that is, a character image cut out from the document image).
In step 204 (S204), when a character image whose recognition accuracy is below the reference value is found, the document file creation program 5 proceeds to S206; when no such character image is found, it displays a message that the proofreading process has finished and proceeds to S220.

In step 206 (S206), the UI control unit 80 displays the cursor 264 (FIG. 7) in the vicinity of the character image that was found (that is, the character image whose recognition accuracy is below the reference value) and accepts input of a character code for this character image (hereinafter, the input target character).
In step 208 (S208), the document file creation program 5 proceeds to S210 when a character code has been input, and proceeds to S216 when no character code has been input.

In step 210 (S210), when a character code is input by the user via the UI device 26 (FIG. 3), the UI control unit 80 associates this character code with the input target character corresponding to the display position of the cursor 264, and outputs it to the encoding unit 60 via the reproduced image creation unit 70.
The encoding unit 60 identifies the index corresponding to the input target character, and identifies the other character images associated with that index. The encoding unit 60 then updates the code data of the document image by replacing the image data of the input target character, and of the other character images whose index matches, with this index and the position information of each character image. That is, when a character code is input for any character image (one with low recognition accuracy), the encoding unit 60 regards the recognition accuracy of this character image, and of the other character images corresponding to the same image pattern (that is, the group of character images that closely resemble one another), as being equal to or higher than the reference value (100%), and applies compression based on the image dictionary 902. In the image dictionary 902, the character code corresponding to this index is replaced with the input character code.

In step 212 (S212), the code file creation unit 90 updates the code file so that the input character code is applied to the input target character and to the other character images whose index matches that of the input target character.
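The effect of S210 and S212, in which one correction is propagated to every character image sharing the same index, can be sketched as below. The parallel lists `code_file`, `indices` and `accuracies` and the function `apply_correction` are hypothetical representations chosen for illustration.

```python
def apply_correction(code_file, indices, accuracies, target, new_code):
    """Apply a user-entered character code to all characters sharing one index.

    Updates code_file and accuracies in place and returns the positions changed.
    """
    shared_index = indices[target]
    updated = []
    for i, index in enumerate(indices):
        if index == shared_index:
            code_file[i] = new_code  # the same pattern receives the same code
            accuracies[i] = 1.0      # treated as 100% after user input
            updated.append(i)
    return updated


code_file = [0x42, 0x38, 0x42, 0x38]   # provisional codes from recognition
indices = [14, 31, 14, 31]             # image pattern index per character
accuracies = [0.99, 0.40, 0.99, 0.50]
print(apply_correction(code_file, indices, accuracies, target=1, new_code=0x42))
print(code_file)
```

Because the update is keyed on the index rather than on the position, a single keystroke by the user corrects every visually similar character at once.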

In step 214 (S214), the reproduced image creation unit 70 creates the reproduced image 260 again based on the code data of the document image updated by the encoding unit 60. In the newly created reproduced image 260, the input target character and the other character images whose index matches it are replaced with the image pattern. That is, the character image for which a character code was input, and the character images whose shape closely resembles it (the group of character images with the same index), are represented in the reproduced image 260 by the corresponding image pattern.

In step 216 (S216), the UI control unit 80 accepts an operation for moving the cursor 264 from the user via the UI device 26 (FIG. 3).
When the UI control unit 80 accepts an operation for moving the cursor 264, it proceeds to S202, searches for a character image whose recognition accuracy is less than the reference value, and moves the cursor 264 to the vicinity of the next input target character; otherwise, it proceeds to S218.
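The search for the next input target character can be sketched as follows. This is an illustrative sketch only: the function name next_target and the representation of the page as a flat list of recognition accuracies are assumptions, and the sketch assumes the search does not wrap around to the beginning, which the description does not specify.

```python
def next_target(accuracies, cursor, threshold):
    """Return the position of the next character after `cursor` whose
    recognition accuracy is below `threshold`, or None if there is none."""
    for pos in range(cursor + 1, len(accuracies)):
        if accuracies[pos] < threshold:
            return pos
    return None
```

Passing a cursor of -1 starts the search from the first character.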

In step 218 (S218), the UI control unit 80 accepts an operation for ending the proofreading process from the user via the UI device 26 (FIG. 3).
When the UI control unit 80 accepts the end operation, it proceeds to S220; otherwise, it returns to S208 and waits for a character code input operation (S208) or a cursor movement operation (S216).

In step 220 (S220), the code file creation unit 90 stores the code file updated in accordance with the user's input in the recording device 24 (FIG. 3) or the like, and the document file creation program 5 ends the proofreading process.
Note that, when the user inputs a character code for a character image, the document file creation support apparatus 2 of this example compresses the image data of that character image by replacing it with an index and position information. However, the present invention is not limited to this; for example, the image data of the character image may be replaced with the input character code itself to improve the compression rate.

As described above, the document file creation support apparatus 2 according to the present embodiment compares the character images included in a document image with one another, classifies them according to the degree to which their shapes match, and generates image patterns from the classified character images. For character images whose recognition accuracy in the character recognition process is equal to or higher than the reference value, the document file creation support apparatus 2 applies the corresponding image patterns; for character images whose recognition accuracy is less than the reference value, it applies the character images themselves as contained in the document image, and thereby creates the reproduction image 260. As a result, for portions where the recognition accuracy is low, the user can view the character image itself from the document image and directly input a character code, and for portions where the recognition accuracy is high, the character images are replaced with image patterns so that the image data of the document image can be encoded at a high compression rate.
In the reproduction image 260, image patterns and character images cut out from the document image are displayed in distinguishable display modes, so the user can easily tell characters with high recognition accuracy from those with low recognition accuracy.
In the reproduction image 260, the cursor 264 moves only to positions corresponding to characters whose recognition accuracy is less than the reference value, so the user can easily input character codes for characters with low recognition accuracy.
Furthermore, even among character images whose recognition accuracy is less than the reference value, character images that are very similar in shape are associated with one another through an index (image pattern). Therefore, when a character code is input for any one of them, the character code is corrected for that image together with the other character images of very similar shape.
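The selection between an image pattern and a cut-out character image when composing the reproduction image 260 can be sketched as follows. This is a minimal sketch under assumptions: the function name compose_reproduction and the tuple layout (index, accuracy, clipped image identifier) are hypothetical, and actual rendering is omitted.

```python
def compose_reproduction(chars, threshold):
    """For each character, choose what the reproduction image shows.

    chars: list of (index, accuracy, clip_id) tuples, where clip_id names the
    character image cut out from the document image.
    Returns a list of ("pattern", index) or ("clip", clip_id) entries.
    """
    plan = []
    for index, accuracy, clip_id in chars:
        if accuracy >= threshold:
            plan.append(("pattern", index))   # high accuracy: image pattern
        else:
            plan.append(("clip", clip_id))    # low accuracy: original character image
    return plan
```

A renderer could then draw the two kinds of entries in distinguishable display modes, for example by underlining the "clip" entries.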

[Modification]
Next, modifications of the above embodiment will be described. In the above embodiment, the patterns applied by the character recognition unit 40 in the character recognition process are independent of the image patterns created by the image dictionary creation unit 50, but the two may depend on each other. For example, the image dictionary creation unit 50 may register in the image dictionary 902, as an image pattern, the template image (for example, a font image) used by the character recognition unit 40 that most closely approximates a character image included in the document image. Conversely, the character recognition unit 40 may use the image patterns created by the image dictionary creation unit 50 as template images in the character recognition process.
When image patterns registered in the image dictionary 902 are used as template images for the character recognition process, it is desirable to associate each image pattern not only with a character code but also with font information.
FIG. 11 is a diagram illustrating an image dictionary 904 in which character codes and font information are registered.
As illustrated in FIG. 11, the image dictionary 904 of this modification holds, in association with each image pattern and index, the character code and font information of the character image corresponding to that image pattern. In this example, the font information consists of the font type and the font size, but it may also include the font color and the like. In this case, the code file may contain font information in addition to character codes.
Further, when an image pattern is applied as a template image for the character recognition process, the font information of a character image can be determined by identifying the font information associated with the matching image pattern. Furthermore, feeding the image patterns registered in the image dictionary 904 back into the character recognition process can be expected to further improve the compression rate.
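One possible in-memory layout for an entry of the image dictionary 904 is sketched below. The class name DictionaryEntry, its field names, and the helper font_info_for are hypothetical and chosen only for illustration; FIG. 11 itself is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DictionaryEntry:
    """One row of an image dictionary holding font information (hypothetical layout)."""
    index: int
    pattern: tuple                    # binary image pattern as a tuple of row tuples
    code: str                         # character code
    font_type: str
    font_size: int
    font_color: Optional[str] = None  # optional, as the font color may also be included


def font_info_for(dictionary, matched_index):
    """Look up the font information of a character image from the index of
    the image pattern that matched it as a template."""
    entry = dictionary[matched_index]
    return entry.font_type, entry.font_size
```

Here the dictionary is assumed to be a mapping from index to entry, so identifying the matching image pattern is enough to recover the font type and size.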

In the above embodiment, the image data of the document image is compressed before the reproduction image is created, but the present invention is not limited to this. For example, even without performing the compression process (that is, the replacement of character images with indexes and position information), the document file creation support apparatus 2 can create a reproduction image using image patterns and correct a group of similar character images in a batch, provided that character images whose shapes approximate one another are associated through an image pattern (or index).

In the above embodiment, the document file creation support apparatus 2 represents the character images in the reproduction image in two ways: as image patterns for character images whose recognition accuracy is equal to or higher than the reference value, and as character images cut out from the document image for those whose recognition accuracy is less than the reference value. However, the present invention is not limited to this. For example, the recognition accuracy may be divided into three levels. Character images in the highest level are displayed as image patterns and excluded from the proofreading process. Character images in the lowest level are displayed as character images cut out from the document image, and character code input is accepted for them as targets of the proofreading process. Character images in the middle level are displayed as both the character image cut out from the document image and the font image corresponding to the character recognition result, and character code input is likewise accepted for them as targets of the proofreading process. When the recognition accuracy is divided into three levels, the document file creation support apparatus 2 may display the character images of each level in a color assigned to that level so that the recognition accuracy of each character image can be identified.
The boundary values (reference values) between the recognition accuracy levels may be changeable in accordance with user input.
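The three-level variation can be sketched as follows. This is an illustration only: the function name classify_accuracy, the level names, and the DISPLAY_PLAN table are assumptions, and the two boundary values are passed as parameters to reflect that they may be changed by the user.

```python
def classify_accuracy(accuracy, low_bound, high_bound):
    """Assign a recognition accuracy to one of three levels."""
    if low_bound > high_bound:
        raise ValueError("low_bound must not exceed high_bound")
    if accuracy >= high_bound:
        return "high"
    if accuracy < low_bound:
        return "low"
    return "middle"


# What is displayed for each level, and whether it is a proofreading target.
DISPLAY_PLAN = {
    "high": (("pattern",), False),
    "middle": (("clip", "font_image"), True),
    "low": (("clip",), True),
}
```

A color per level could be added to the table in the same way if the levels are to be distinguished by color.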

Next, modifications concerning the method of creating the image dictionary 902 will be described.
In the above embodiment, the image dictionary creation unit 50 cuts out character images from the document image one after another and builds the image dictionary sequentially from the cut-out character images, but the present invention is not limited to this. For example, the image dictionary may be created based on an entire document image, such as one page or one document. The image dictionary creation unit 50 may also select a plurality of character images of similar shape in the document image and create an image pattern to be registered in the image dictionary based on the selected character images.
Accordingly, the image dictionary creation unit 50 of this modification classifies the character images cut out from the document image by character code, or by a combination of character code and font information, and integrates the classified character images according to their appearance frequency to create the image patterns to be registered in the image dictionary. A plurality of image patterns may be created from the character images classified under the same character code.
Because the image dictionary creation unit 50 can thus take the appearance frequency of character images and the like into account when creating the image dictionary, a high compression rate can be achieved.
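The classification step of this modification can be sketched as follows. The function name group_character_images and the tuple layout (image identifier, character code, font) are hypothetical; the sketch covers only the grouping, not the integration into image patterns.

```python
from collections import defaultdict


def group_character_images(char_images, use_font=False):
    """Group character images by character code, or by the combination of
    character code and font information when use_font is True.

    char_images: iterable of (image_id, code, font) tuples.
    Returns a dict mapping each classification key to a list of image ids.
    """
    groups = defaultdict(list)
    for image_id, code, font in char_images:
        key = (code, font) if use_font else code
        groups[key].append(image_id)
    return dict(groups)
```

The length of each list gives the number of character images in the group, which is the N used when the appearance frequency within a group is evaluated.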

FIG. 12 is a diagram schematically illustrating the first image pattern creation process of the modification. The figure takes as a specific example a case in which a binary document image is input and the character images cut out from it are classified by character code.
As shown in FIG. 12, the image dictionary creation unit 50 classifies the character images included in the input image into a plurality of character image groups based on their character codes, and calculates the black pixel distribution probability Q'(x) for each character image group. As illustrated in FIG. 12, the calculated distribution probability Q'(x) takes different values depending on the pixel position x. This is because character images of different shapes are mixed in the classified character image group with different appearance frequencies.
The distribution probability Q'(x) is calculated by the following equations.
Q(x) = P(1, x) + P(2, x − v2) + ... + P(i−1, x − v(i−1))
Q'(x) = Q(x) / N
Here, Q(x) is the pixel distribution of the classified character image group, P(i, x) is the black pixel distribution of each character image, x is a position vector, and i identifies each character image belonging to the character image group (1 to N, where N is the number of character images belonging to the group).
When i = 1, Q(x) = P(1, x).
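The calculation of the distribution probability Q'(x) can be sketched as follows. This sketch rests on assumptions that the text does not state: the sum is taken over all N character images of the group, each vi is treated as an alignment offset (row, column) for character image i with v1 = (0, 0), and pixels shifted outside the image count as white. The function names are hypothetical.

```python
def shifted_pixel(image, y, x, v):
    """Return P(i, x - v): the pixel of `image` at (y, x) shifted back by the
    offset v, or 0 (white) when the shifted position is outside the image."""
    sy, sx = y - v[0], x - v[1]
    if 0 <= sy < len(image) and 0 <= sx < len(image[0]):
        return image[sy][sx]
    return 0


def distribution_probability(images, offsets):
    """Compute Q'(x) = Q(x) / N for a group of equally sized binary images
    (lists of rows of 0/1), where Q(x) sums the shifted black pixels."""
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [
        [
            sum(shifted_pixel(img, y, x, v) for img, v in zip(images, offsets)) / n
            for x in range(width)
        ]
        for y in range(height)
    ]
```

A value of 1.0 at a pixel means every character image in the group is black there, while smaller values mark pixels that only some of the shapes share.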

Next, the image dictionary creation unit 50 applies threshold processing to the distribution probability Q'(x) to extract the typical shape with a high appearance frequency (the sum coupling pattern Q''(x)). That is, thresholding the distribution probability Q'(x) with the threshold B removes the difference shapes of character images with a low appearance frequency (their differences from the typical, frequently appearing shape), noise, and the like, so that only the typical shape is extracted.
The sum coupling pattern Q''(x) is calculated by the following conditional expressions.
When Q'(x) > threshold B, Q''(x) = 1
Otherwise, Q''(x) = 0
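The conditional expressions above amount to a per-pixel threshold, which can be sketched as follows. The function name sum_coupling_pattern is hypothetical, and the distribution probability is assumed to be given as a list of rows of values between 0 and 1.

```python
def sum_coupling_pattern(probability, threshold_b):
    """Binarize the distribution probability Q'(x): a pixel becomes 1 when
    Q'(x) is strictly greater than threshold B, and 0 otherwise."""
    return [[1 if p > threshold_b else 0 for p in row] for row in probability]
```

Because the comparison is strict, a pixel whose probability equals the threshold B is excluded from the pattern.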

The image dictionary creation unit 50 then extracts the part common to the extracted sum coupling pattern Q''(x) and each character image belonging to the character image group as an image pattern to be registered in the image dictionary. That is, the image dictionary creation unit 50 computes the product of the sum coupling pattern Q''(x) and the pixel distribution P(i, x − vi) of each character image. In this way, image pattern #1 and image pattern #2 are extracted for the typical character images (those with a high appearance frequency), of which there may be more than one in the character image group.
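The product operation can be sketched as a per-pixel AND between the sum coupling pattern and each character image. This is an illustrative sketch: the function name extract_patterns is hypothetical, the character images are assumed to be already aligned (offsets applied), and numbering the distinct results in order of first appearance is a choice made here, not something the description specifies.

```python
def extract_patterns(coupling_pattern, images):
    """Compute the common part of the sum coupling pattern and each binary
    character image, and collect the distinct results as image patterns.

    Returns (patterns, assignment), where assignment[i] is the number of the
    pattern obtained for images[i].
    """
    patterns = []
    assignment = []
    for image in images:
        common = tuple(
            tuple(q & p for q, p in zip(q_row, p_row))
            for q_row, p_row in zip(coupling_pattern, image)
        )
        if common not in patterns:
            patterns.append(common)
        assignment.append(patterns.index(common))
    return patterns, assignment
```

Character images that differ only in pixels outside the sum coupling pattern yield the same image pattern and therefore share one entry.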

Next, the second image pattern creation process will be described.
In the second image pattern creation process, the image dictionary creation unit 50 classifies the character images cut out from the document image by character code, or by a combination of character code and font information. Based on the classified character images and their appearance frequencies, it extracts the common shape and the difference shapes in each classified character image group, and registers the extracted common shape and difference shapes in the image dictionary in hierarchical form. Here, the common shape is a shape present in common across the character image group classified by character code or the like, and a difference shape is the difference between a character image belonging to the group and the common shape, limited to differences whose appearance frequency is equal to or higher than a reference value.

FIG. 13 is a diagram schematically illustrating the second image pattern creation process. This figure also takes as a specific example a case in which a binary document image is input and the character images cut out from it are classified by character code.
First, as in the first process, the image dictionary creation unit 50 classifies the character images included in the document image into a plurality of character image groups based on their character codes, and calculates the black pixel distribution probability Q'(x) for each character image group. As shown in FIG. 13, the calculated distribution probability Q'(x) takes different values depending on the pixel position x. The portion with the highest distribution probability is considered to be the shape common to the character images belonging to the group (that is, the common shape). The other regions with a nonzero distribution probability are considered to be shapes corresponding to the differences between the individual character images and the common shape (that is, difference shapes).

The image dictionary creation unit 50 has a first level threshold for extracting the common shape (the region where the distribution probability is close to its maximum) and a second level threshold for extracting difference shapes with a high appearance frequency.
First, the image dictionary creation unit 50 thresholds the distribution probability Q'(x) with the first level threshold and extracts the portion corresponding to the first level pattern (common shape). Next, the portion corresponding to the extracted first level pattern is removed from the distribution probability Q'(x), and the remainder is converted to "1" or "0" with the second level threshold as the reference, which yields the second level sum coupling pattern Q1''(x).
The image dictionary creation unit 50 computes the product of this second level sum coupling pattern Q1''(x) and the pixel distribution P(i, x) of each character image i, thereby extracting their common part as a second level pattern. In this example, the common part of the second level sum coupling pattern Q1''(x) and "character image #1" is the second level pattern a, and the common part of Q1''(x) and "character image #2" is the second level pattern b.
As a result, "character image #1" can be expressed as the sum of the first level pattern and the second level pattern a and replaced with their indexes, and "character image #2" can be expressed as the sum of the first level pattern and the second level pattern b and replaced with their indexes.
In this case, a plurality of image patterns (a first level pattern and a second level pattern) are associated with one character image, so one character image is associated with a plurality of indexes. The document file creation program 5 treats character images whose combinations of indexes match as character images of very similar shape when performing the proofreading process (FIG. 10).
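The two-level extraction and the grouping by index combination can be sketched as follows. This is a sketch under assumptions: the character images are aligned binary images of equal size, both thresholds are applied with a strict "greater than" comparison, index 0 is reserved for the first level pattern, and the function names are hypothetical.

```python
def two_level_patterns(images, first_threshold, second_threshold):
    """Extract a first level pattern (common shape) and second level patterns
    (frequent difference shapes) from a group of binary character images.

    Returns (patterns, combos): patterns[0] is the first level pattern, and
    combos[i] is the pair of indexes that represents images[i].
    """
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    prob = [[sum(img[y][x] for img in images) / n for x in range(width)]
            for y in range(height)]
    level1 = tuple(tuple(1 if p > first_threshold else 0 for p in row) for row in prob)
    # Remove the first level portion, then binarize with the second level threshold.
    q1 = [[1 if (p > second_threshold and not c) else 0 for p, c in zip(p_row, c_row)]
          for p_row, c_row in zip(prob, level1)]
    patterns = [level1]
    combos = []
    for img in images:
        level2 = tuple(tuple(q & p for q, p in zip(q_row, p_row))
                       for q_row, p_row in zip(q1, img))
        if level2 not in patterns[1:]:
            patterns.append(level2)
        combos.append((0, patterns.index(level2, 1)))
    return patterns, combos


def group_by_combo(combos):
    """Group character image numbers whose index combinations match."""
    groups = {}
    for i, combo in enumerate(combos):
        groups.setdefault(combo, []).append(i)
    return groups
```

Character images that fall into the same group would be corrected together when a character code is input for any one of them.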

FIG. 1 is a diagram explaining a technique that makes it easy to check the recognition result of a character recognition process.
FIG. 2 is a diagram illustrating a reproduction image displayed by the document file creation support apparatus 2.
FIG. 3 is a diagram illustrating the hardware configuration of the document file creation support apparatus 2 to which the document file creation support method according to the present invention is applied, centered on the control device 20.
FIG. 4 is a diagram illustrating the functional configuration of the document file creation program 5, which is executed by the control device 20 (FIG. 3) and implements the document file creation support method according to the present invention.
FIG. 5 is a diagram explaining the function of the image dictionary creation unit 50 in more detail.
FIG. 6 is a diagram illustrating the image dictionary 902 created by the image dictionary creation unit 50.
FIG. 7 is a diagram explaining the function of the encoding unit 60 in more detail.
FIG. 8 is a diagram illustrating the reproduction image 260 displayed on the UI device 26.
FIG. 9 is a flowchart showing the overall operation of the reproduction image display process (S10).
FIG. 10 is a flowchart showing the proofreading process (S20) for character recognition results.
FIG. 11 is a diagram illustrating the image dictionary 904 in which character codes and font information are registered.
FIG. 12 is a diagram schematically illustrating the first image pattern creation process in the modification.
FIG. 13 is a diagram schematically illustrating the second image pattern creation process in the modification.

Explanation of symbols

2 ... Document file creation support apparatus
26 ... User interface device
260 ... Reproduction image
262 ... Underline
264 ... Cursor
5 ... Document file creation program
40 ... Image input unit
40 ... Character recognition unit
50 ... Image dictionary creation unit
510 ... Character image extraction unit
520 ... Match determination unit
530 ... Dictionary determination unit
540 ... Position correction unit
550 ... Index assignment unit
60 ... Encoding unit
610 ... Pattern determination unit
620 ... Position information encoding unit
630 ... Index encoding unit
640 ... Image encoding unit
650 ... Dictionary encoding unit
660 ... Selection unit
670 ... Code output unit
70 ... Reproduction image creation unit
80 ... User interface control unit
90 ... Code file creation unit
902, 904 ... Image dictionary

Claims

A document file creation support apparatus comprising:
character discrimination means for discriminating characters displayed in a document image based on the document image in raster data format;
image pattern generation means for generating an image pattern of a character image based on a discrimination result by the character discrimination means;
reproduction image creation means for creating a character string reproduction image that reproduces a character string included in the document image, using a character image cut out from the document image and the image pattern generated by the image pattern generation means; and
user interface means for displaying the character string reproduction image created by the reproduction image creation means.

The document file creation support apparatus according to claim 1, wherein the user interface means displays the character string reproduction image in a display mode in which character images included in the document image can be distinguished from character images generated by the image pattern generation means.

The document file creation support apparatus, wherein
the character discrimination means further determines the accuracy of the character discrimination result, and
the reproduction image creation means applies the image pattern generated by the image pattern generation means to the portion corresponding to a character image when the accuracy of the discrimination result determined for that character image by the character discrimination means is equal to or higher than a reference value, and applies the character image cut out from the document image to the portion corresponding to that character image when the accuracy of the discrimination result is lower than the reference value.

The document file creation support apparatus according to claim 1, wherein
the character discrimination means determines at least character identification information of the characters displayed in the document image, and
the user interface means accepts input of character identification information for a character image included in the displayed character string reproduction image,
the apparatus further comprising character string file creation means for creating a file of character identification information corresponding to the character string included in the document image, based on the character identification information determined by the character discrimination means and the character identification information accepted by the user interface means.

The document file creation support apparatus according to claim 4, wherein the user interface means skips a cursor position indicating the input target of character identification information in the character string reproduction image, depending on whether or not the character image is one cut out from the document image.

A document file creation support method comprising:
discriminating characters displayed in a document image based on the document image in raster data format;
generating an image pattern of a character image based on the discrimination result;
creating a character string reproduction image that reproduces a character string included in the document image, using a character image cut out from the document image and the generated image pattern; and
displaying the created character string reproduction image.

A program for causing a computer included in a document file creation support apparatus to execute the steps of:
discriminating characters displayed in a document image based on the document image in raster data format;
generating an image pattern of a character image based on the discrimination result;
creating a character string reproduction image that reproduces a character string included in the document image, using a character image cut out from the document image and the generated image pattern; and
displaying the created character string reproduction image.