JP2021064122A

JP2021064122A - Image processing device, image processing method, and program

Info

Publication number: JP2021064122A
Application number: JP2019187926A
Authority: JP
Inventors: 英智相馬; Hidetomo Soma
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-10-11
Filing date: 2019-10-11
Publication date: 2021-04-22
Anticipated expiration: 2039-10-11
Also published as: JP7408340B2

Abstract

PROBLEM TO BE SOLVED: In character recognition processing, misrecognition cannot be avoided because of factors such as deterioration of the paper medium itself. Correcting a character recognition result using character recognition error pattern information is a known remedy, but the processing time required for the correction increases as the amount of error pattern information grows.
SOLUTION: Character recognition processing is performed under a constraint that corresponds to characteristics of a character string, such as its character type and item. The character recognition result is then corrected using pattern information for correction, such as character recognition error patterns, standard character strings, and character pattern rules, prepared for each constraint of the character recognition processing. This improves the efficiency of the correction processing and shortens its processing time.
SELECTED DRAWING: Figure 8

Description

The present invention relates to an image processing apparatus, an image processing method, and a computer program for entering data described in a scanned document image.

In recent years, it has become common to equip multifunction printers (MFPs), which combine functions such as printing, copying, and facsimile, with an image scanner. It has also become common to equip mobile devices, typified by compact cameras and smartphones, with digital cameras. As a result, it is now easy to create an image (a scanned document image) by optically scanning or photographing a document containing handwritten or printed characters.
Furthermore, techniques that perform character recognition processing (Optical Character Recognition: OCR) on a scanned document image and convert it into character codes usable by a computer have come into wide use. Character recognition processing makes it possible to automate data entry work that converts paper media such as forms into digital data, typified by the expense settlement work carried out in ordinary offices. This improves productivity in data entry work.

In general, however, deterioration of the paper medium itself and the conditions of scanning or photographing inevitably cause degradation and variation in the quality of the scanned document image and in the state of the character images, so it is difficult for character recognition processing to always produce correct results. For this reason, correction processing that estimates the correct content of misrecognized characters is sometimes performed.

For example, Patent Document 1 discloses a method of collating a character recognition result, in character reading processing, using the result of a comparison with correct text. Such correction processing improves the accuracy rate of the character recognition result and therefore reduces the proportion of misrecognized characters it contains.
However, even with correction processing like that of Patent Document 1, unknown character recognition errors that do not exist in a dictionary such as the correct text can occur, so the correction processing cannot completely repair misrecognized characters. In data entry work, therefore, the user needs to check the result of the character recognition processing for the scanned document and correct any misrecognized characters.

For example, Patent Document 2 discloses a method in which information about characters corrected by an operator is saved as a correction history and then used to automatically search for and display misread characters in the reading results. With this method, collecting the error patterns of character recognition results corrected by the operator makes it possible to correct even unknown character recognition errors that do not exist in the dictionary.

However, when the amount of error pattern information from past character recognition results becomes large, the processing time required for the correction processing increases, which hinders the time savings expected from making the user's data entry work more efficient. Indiscriminately using a large amount of error pattern information in the correction processing is therefore undesirable from the viewpoint of processing efficiency.

Japanese Unexamined Patent Publication No. 9-251518
Japanese Unexamined Patent Publication No. 5-314303

The present invention has been made in view of the above circumstances, and its object is to improve the processing efficiency of correction processing for character recognition results and to shorten the processing time the correction processing requires.

The present invention is an image processing apparatus having correction means for performing correction processing on a character string generated by performing character recognition processing on a character string on a document using a constraint that corresponds to the characteristics of that character string, characterized in that the correction means performs the correction processing on the character string generated by the character recognition processing using pattern information for correction prepared according to the constraint.

According to the present invention, the processing time required for correcting character recognition results can be shortened, which reduces the user's workload in data entry work.

A diagram showing the system configuration of the data input system.
A diagram outlining the functions of the data input system and the flow of information.
An example of a document to be scanned.
A flowchart showing the overall flow of processing.
A table showing an example of character recognition results, and an example of a lattice structure.
A flowchart showing the process of assigning character recognition results.
A table for explaining the process of assigning character recognition results.
A flowchart showing the item value correction process.
A diagram for explaining re-recognition of the characters of an item value.
A diagram for explaining the update of the lattice structure using character recognition error patterns.
A diagram for explaining the result of correction processing using the dictionary and character patterns for correction.
A flowchart showing the process of extracting character recognition error pattern information based on corrections by the user.
An example of a confirmation and correction screen.
A table for explaining the result of corrections by the user.
A table for explaining the process of assigning character recognition results in Example 2.

Examples for carrying out the present invention are described below with reference to the drawings. The examples are illustrative only and are not intended to limit the scope of the present invention. In addition, not all combinations of the features described in the examples are essential to the solution provided by the present invention.

<Example 1>
An example of the present invention is described below with reference to the drawings.
FIG. 1 outlines the system configuration of the data input system 100 according to this example. Typically, the data input system 100 is implemented as a multifunction printer (MFP). Each device of the data input system shown in FIG. 1 is described below.
The CPU (Central Processing Unit) 101 executes most of the control and processing in the data input system. The control and processing executed by the CPU 101 are directed by programs in the ROM 102 and the RAM 103, described later. Through functions of the CPU 101 itself or of the computer programs, the CPU 101 can also run multiple computer programs in parallel.
The ROM (Read Only Memory) 102 stores computer programs and data that encode the control procedures carried out by the CPU 101.
The RAM (Random Access Memory) 103 stores control programs for processing by the CPU 101 and provides a work area for various data when the CPU 101 executes various kinds of control.

The input device 104 is a keyboard or mouse that provides an environment for various input operations by the user. The input device 104 may be anything that provides such an environment, including a touch panel or a stylus pen. It may also provide input through voice recognition or gesture operation.
The bus 105 comprises an address bus, a data bus, and the like connected to each device in the data input system, and provides information exchange and communication between the devices. This allows the devices to operate in cooperation.
The external storage device 106 stores various data. It consists of a recording medium such as a hard disk, floppy disk, optical disk, magnetic disk, magnetic tape, or non-volatile memory card, together with a drive that drives the recording medium and records information. All or part of the stored computer programs and data are loaded into the RAM 103 when needed, in response to instructions from the keyboard or the like or instructions from various computer programs.

The NCU (Network Control Unit) 107 is a communication device for communicating with other computer devices. The NCU 107 communicates with a device (not shown) at a remote location via a network (LAN) or the like, making it possible to share the programs and data of this example. Any communication means may be used, including wired communication such as RS232C, USB, IEEE1394, P1284, SCSI, modem, and Ethernet (registered trademark), and wireless communication such as Bluetooth (registered trademark), infrared communication, and IEEE802.11a/b/n. In other words, anything that provides a means of communicating with the devices connected to the data input system is acceptable.
The GPU (Graphics Processing Unit) 108 creates images of display content and calculates display positions and the like in accordance with display and calculation instructions from the CPU 101 and other devices, and sends the results to the display device 109 for rendering. The GPU 108 can also perform calculations in cooperation with the CPU 101 by returning calculation results to the CPU 101 via the bus 105.

The display device 109 consists of a display or the like and shows the user the state of various input operations and the corresponding calculation results.
The scanner 110 reads an image on a document and converts it into image data. The scanner 110 may be connected via the communication function of the NCU 107 or via a separate, dedicated external I/F.

The system configuration described above is only one possibility in this example, and the system configuration of the present invention is not limited to it. The internal components represented by the devices in the system need not be actual hardware and may be created virtually in software.
The internal components of the data input system also need not reside in a single system and may be distributed across multiple systems. In that case, the systems may be configured as servers, clients, and so on, cooperating by exchanging and sharing information through the NCU 107 of each system. That is, with multiple systems, the internal components may be in physically different locations and cooperate via a LAN, the Internet, or the like, and components created virtually in software may be included. Furthermore, all or part of multiple systems, such as servers and PC clients, may be operated on a single shared set of internal components.

FIG. 2 outlines the main functions and the flow of information in the data input system of this example.
The scanned document image 201 is image information obtained by optically scanning a document in which handwritten characters are written, or type is printed, on a medium such as paper. The scanned document image 201 is acquired by scanning the document with the scanner 110. It is basically stored in the external storage device 106 or the RAM 103; when it resides in the external storage device 106, it is copied to the RAM 103 at the time of use.

The document image analysis unit 210 is the functional part that analyzes the scanned document image 201 and extracts the text information of handwritten and printed characters (hereinafter collectively referred to as "characters") on the scanned document image 201, as well as image information such as illustrations and photographs.
The document image analysis unit 210 has the functions of the document image processing 211 and the character recognition processing 212, and the information of the character recognition dictionary 213, all described later.

The document image processing 211 is the processing part that prepares the scanned document image 201 for analysis by adjusting the density and color of the entire image, binarizing the image, and removing ruled lines such as those of tables. The document image processing 211 is basically realized by copying the processing programs and data stored in the external storage device 106 or the ROM 102 to the RAM 103 and executing them using the CPU 101 and the GPU 108.
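As an illustration of the kind of preprocessing described here, the following Python sketch binarizes a grayscale image and clears long horizontal runs of ink as a simple stand-in for ruled-line removal. The function names, the threshold of 128, and the run-length heuristic are assumptions made for this example; the source does not specify how the document image processing 211 is implemented.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 values) to 1 = ink, 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def remove_horizontal_rules(binary, min_run=5):
    """Clear horizontal runs of ink that are at least min_run pixels long.

    Long unbroken runs are treated as ruled lines (an assumed heuristic);
    shorter runs are kept as possible character strokes.
    """
    cleaned = [row[:] for row in binary]
    for row in cleaned:
        start = None
        # The trailing 0 is a sentinel that closes a run ending at the row edge.
        for x, px in enumerate(row + [0]):
            if px and start is None:
                start = x
            elif not px and start is not None:
                if x - start >= min_run:
                    row[start:x] = [0] * (x - start)
                start = None
    return cleaned


gray_image = [
    [255, 0, 0, 0, 0, 0, 0, 255],      # a ruled line six pixels wide
    [255, 0, 255, 0, 255, 255, 0, 255],  # isolated strokes
]
binary_image = binarize(gray_image)
without_rules = remove_horizontal_rules(binary_image)
```

A production system would more likely rely on an image processing library and handle vertical lines as well; the sketch only shows the order of operations.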

The character recognition processing 212 is the processing part that searches the result of the document image processing 211 for character image regions, acquires character images from within those regions, and obtains the character code corresponding to each acquired character image. The character recognition processing 212 is basically realized by copying the processing programs and data stored in the external storage device 106 or the ROM 102 to the RAM 103 and executing them using the CPU 101 and the GPU 108.

The character recognition processing 212 also gathers characters in the vicinity of the character being recognized, estimates the writing direction of the character string, and obtains a character string as the character recognition result. The character string information obtained as the character recognition result consists of the writing direction of the character string, region information giving the position and size of the character string within the scanned document image 201, and character code string information for each character in the character string. The character code string information contains the character codes obtained by the character recognition processing and a likelihood for each character code, based on factors such as its similarity as a character image. Within the character code string information, the character code with the highest likelihood is called the maximum likelihood character, and a character string composed only of maximum likelihood characters is called the maximum likelihood candidate character string.
Besides performing character recognition on the entire scanned document image 201, the character recognition processing 212 can perform it on only part of the scanned document image 201. Constraints such as the character type, the language used, and the writing direction of the character string can also be applied when character recognition is performed.
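The character string information described above can be pictured as a small data structure. The Python sketch below is one possible layout, not the format used by the source: the class names, field names, and likelihood values are hypothetical. It stores the writing direction, the region, and the candidate character codes with likelihoods for each character position, and derives the maximum likelihood candidate character string from them.

```python
from dataclasses import dataclass


@dataclass
class CharCandidate:
    """One candidate character code and its likelihood (hypothetical layout)."""
    char: str
    likelihood: float


@dataclass
class RecognizedString:
    """A recognized string: direction, region, and candidates per position."""
    direction: str                      # e.g. "horizontal" or "vertical"
    region: tuple[int, int, int, int]   # (x, y, width, height) in the image
    positions: list[list[CharCandidate]]

    def maximum_likelihood_string(self) -> str:
        """Join the highest-likelihood candidate at every character position."""
        return "".join(
            max(cands, key=lambda c: c.likelihood).char
            for cands in self.positions
        )


result = RecognizedString(
    direction="horizontal",
    region=(120, 40, 200, 24),
    positions=[
        [CharCandidate("1", 0.62), CharCandidate("l", 0.30)],
        [CharCandidate("O", 0.55), CharCandidate("0", 0.41)],
        [CharCandidate("0", 0.90), CharCandidate("O", 0.08)],
    ],
)
ml_string = result.maximum_likelihood_string()
```

In this made-up example the second position is misread as the letter "O" because that candidate happens to score higher than the digit "0", which is the kind of error the later correction processing is meant to address.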

The character recognition dictionary 213 is data used by the character recognition processing 212 to determine the character code corresponding to a character image. It is basically stored in the external storage device 106 or the ROM 102 and is copied to the RAM 103 for use.

The character recognition result information 202 is the information obtained as the processing result of the document image analysis unit 210. It consists mainly of the character recognition result output by the character recognition processing 212 and the image produced by the document image processing 211. The character recognition result information 202 is stored in the external storage device 106 or the RAM 103; when it resides in the external storage device 106, it is copied to the RAM 103 at the time of use.

The specific information extraction unit 220 is the functional part that analyzes the character recognition result information 202 and extracts the necessary information.
The specific information extraction unit 220 has the functions of the document classification / extraction item selection processing 221, the item name / item value estimation processing 223, and the dictionary / character pattern matching processing 225, described later. It also holds the document classification data (extraction item data) 212, the item name / item value constraint data 214, the character recognition error pattern dictionary 216, and the dictionary / character pattern data 217, described later. The "item" and "item value" of this example are described later with reference to FIG. 3.

The document classification / extraction item selection processing 221 classifies the scanned document image 201 based on the content obtained from the character recognition result in the character recognition result information 202 and on the arrangement (layout) within the document image. It also determines, according to the document classification, the extraction items, that is, the information to be extracted from the scanned document image 201. For example, a document is classified as a register receipt, a formal receipt, an invoice, a delivery note, a report, a quotation, or the like, and extraction items such as the total amount and the date are determined according to that classification. The document classification / extraction item selection processing 221 basically consists of processing programs and data stored in the external storage device 106 or the ROM 102, which are copied to the RAM 103 and executed and used by the CPU 101 and the GPU 108.

The document classification data (extraction item data) 212 is the data that the document classification / extraction item selection processing 221 uses to classify documents. It stores the content of each document classification, information on the arrangement (layout) within the document image for each document classification, and information on the extraction items determined by each document classification. The document classification data (extraction item data) 212 is stored in the external storage device 106 or the RAM 103; when it resides in the external storage device 106, it is copied to the RAM 103 at the time of use.
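The relationship between a document classification and its extraction items can be sketched as a lookup table. The Python example below is a simplified illustration under stated assumptions: the class names, keywords, and item names are invented, and classification here uses keyword counts only, whereas the source also uses layout information.

```python
# Hypothetical document classification data: class -> keywords and items.
DOCUMENT_CLASSES = {
    "receipt": {
        "keywords": ["receipt", "total"],
        "items": ["total_amount", "date", "telephone_number"],
    },
    "invoice": {
        "keywords": ["invoice", "due"],
        "items": ["total_amount", "date", "invoice_number"],
    },
}


def classify_document(recognized_text: str) -> str:
    """Pick the class whose keywords occur most often in the recognized text."""
    lowered = recognized_text.lower()
    scores = {
        name: sum(lowered.count(keyword) for keyword in spec["keywords"])
        for name, spec in DOCUMENT_CLASSES.items()
    }
    return max(scores, key=scores.get)


def extraction_items(recognized_text: str) -> list[str]:
    """Return the extraction items determined by the document classification."""
    return DOCUMENT_CLASSES[classify_document(recognized_text)]["items"]


items = extraction_items("INVOICE No. 12  Due 2019-10-11")
```

Keeping the extraction items inside the classification data means that adding a new document type requires only a new table entry, not new code.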

For each extraction item, the item name / item value estimation processing 223 searches the character recognition result information 202 for the character recognition result presumed to correspond to the item value and assigns it to the item. The item name / item value estimation processing 223 basically consists of processing programs and data stored in the external storage device 106 or the ROM 102, which are copied to the RAM 103 and executed and used by the CPU 101 and the GPU 108.

The item name / item value constraint data 214 is information on the constraints on item names and item values that the item name / item value estimation processing 223 uses for its search. These constraints include constraints on character recognition results, such as the character types and terms for each item name and item value, and constraints on absolute and relative arrangement (layout) within the scanned document image. The item name / item value constraint data 214 is stored in the external storage device 106 or the RAM 103; when it resides in the external storage device 106, it is copied to the RAM 103 at the time of use.
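One way to picture such constraint data is as a record per item that combines the terms used for the item name, a character-type rule for the item value, and a layout hint. The Python sketch below is an assumption-laden illustration: the `ItemConstraint` class, its fields, the regular expressions, and the position label are all hypothetical and not taken from the source.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemConstraint:
    """Constraints for one extraction item (hypothetical structure)."""
    item_name_terms: tuple[str, ...]  # terms that may label the item
    value_pattern: str                # regex the whole item value must match
    relative_position: str            # where the value sits relative to the name


CONSTRAINTS = {
    "total_amount": ItemConstraint(
        ("Total", "Amount"), r"[¥$]?[0-9][0-9,]*", "right_of_name"
    ),
    "date": ItemConstraint(
        ("Date",), r"[0-9]{4}[-/][0-9]{1,2}[-/][0-9]{1,2}", "right_of_name"
    ),
}


def satisfies(item: str, value: str) -> bool:
    """Check whether a candidate item value meets the character-type constraint."""
    return re.fullmatch(CONSTRAINTS[item].value_pattern, value) is not None


ok_amount = satisfies("total_amount", "¥1,080")
bad_amount = satisfies("total_amount", "1O80")  # contains the letter O
```

A candidate that fails its constraint, such as "1O80" for an amount, is a natural trigger for the correction processing described next.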

The dictionary / character pattern matching processing 225 corrects the character recognition result of an item value using the character recognition error pattern dictionary 216 and the dictionary / character pattern data 217. The dictionary / character pattern matching processing 225 basically consists of processing programs and data stored in the external storage device 106 or the ROM 102, which are copied to the RAM 103 and executed and used by the CPU 101 and the GPU 108.

In character recognition processing, it is generally difficult to eliminate misrecognition entirely, because the result is affected by deterioration of the scanned paper medium itself, the environment at the time of scanning, and similar factors. Errors in the character recognition result, however, cause problems when the information extracted from the scanned document image 201 is reused. The dictionary / character pattern matching processing 225 therefore corrects the character recognition result as far as possible; even so, although misrecognition can be reduced, eliminating it completely is very difficult.

The character recognition error pattern dictionary 216 is a collection of typical error patterns in character recognition processing. Error patterns are registered in advance by running character recognition on various sample data through the character recognition processing 212 and the character recognition result information 202, and comparing the recognition results with the correct results. Error patterns extracted during actual data entry work are also collected and added. Error patterns in character recognition processing are described in detail later. The character recognition error pattern dictionary 216 is stored in the external storage device 106 or the RAM 103; when it resides in the external storage device 106, it is copied to the RAM 103 at the time of use.

The dictionary / character pattern data 217 is information used to correct the item values of the items to be extracted. When the content (character string) of an item value is known in advance or can be predicted, that content is prepared beforehand as dictionary / character pattern data 217. The dictionary / character pattern matching processing 225 then uses the dictionary / character pattern data 217 to estimate the content corresponding to the item value. The dictionary / character pattern data 217 is stored in the external storage device 106 or the RAM 103; when it resides in the external storage device 106, it is copied to the RAM 103 at the time of use.
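The two correction resources described above can be combined in a short sketch. The Python example below assumes that error patterns and known strings are grouped by the constraint used during recognition, in line with the abstract's statement that correction patterns are prepared for each constraint. The dictionary contents, the constraint labels, and the `correct` function are hypothetical, and plain string replacement plus `difflib.get_close_matches` is a simplification swapped in for illustration; the figure list indicates that the source operates on a lattice of candidates instead.

```python
import difflib

# Hypothetical error patterns grouped by constraint: (misread, correct).
ERROR_PATTERNS = {
    "digits": [("O", "0"), ("l", "1"), ("S", "5")],
    "alphabetic": [("0", "o"), ("1", "l")],
}

# Hypothetical known strings (dictionary data) grouped by constraint.
KNOWN_STRINGS = {
    "alphabetic": ["Total", "Subtotal", "Date"],
}


def correct(text: str, constraint: str) -> str:
    """Correct a recognized string using only the patterns for its constraint."""
    # Step 1: apply the error patterns registered for this constraint.
    for misread, right in ERROR_PATTERNS.get(constraint, []):
        text = text.replace(misread, right)
    # Step 2: snap to the closest known string, if one is similar enough.
    matches = difflib.get_close_matches(
        text, KNOWN_STRINGS.get(constraint, []), n=1, cutoff=0.7
    )
    return matches[0] if matches else text


amount = correct("1O8O", "digits")
label = correct("T0ta1", "alphabetic")
```

Because patterns are looked up by constraint, a digits-only item never scans the alphabetic patterns, which reflects the efficiency argument made in the abstract.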

The item value information 203 holds the item value of each item extracted by the specific information extraction unit 220. The item value information 203 is stored in the external storage device 106 or the RAM 103; if it resides in the external storage device 106 at the time of use, it is copied to the RAM 103 before use.

The specific information correction unit 230 presents the contents of the item value information 203 to the user for confirmation or correction. It also obtains the correct item values from the user's corrections and derives error pattern information for the character recognition process so that this information can be used later. The specific information correction unit 230 provides two functions, both described below: the user item value confirmation / correction process 231 and the character recognition error pattern information extraction process 232.

The user item value confirmation / correction process 231 presents the contents of the item value information 203 to the user, who confirms the item values or corrects errors. The correct item values are obtained in this way. The process 231 consists of processing programs and data stored in the external storage device 106 or the ROM 102, which are copied to the RAM 103 and executed using the CPU 101 and/or the GPU 108.

The character recognition error pattern information extraction process 232 extracts character recognition error pattern information from the content corrected by the user (the correct item value) and the character recognition result before correction. The extracted character recognition error pattern information is registered in the character recognition error pattern dictionary 216 and used from then on. The arrow 251 in FIG. 2 shows this flow of information. The process 232 consists of processing programs and data stored in the external storage device 106 or the ROM 102, which are copied to the RAM 103 and executed using the CPU 101 and/or the GPU 108.

The document extraction information 204 is the correct item value information obtained by the specific information correction unit 230. The document extraction information 204 is stored in the external storage device 106 or the RAM 103; if it resides in the external storage device 106 at the time of use, it is copied to the RAM 103 before use.

The document extraction information utilization unit 240 is a device, application, service, or the like that uses the document extraction information 204. Various devices, applications, and services can serve as the document extraction information utilization unit 240; anything that uses the document extraction information 204 is applicable.

The functions and processing flow described above are only one example in this embodiment, and the present invention is not limited to them. In particular, the above functions may be divided among multiple devices, or the same function may be distributed across multiple devices.

FIG. 3 shows an example of the document 300 from which the scanned document image 201 used in this embodiment is produced. The following description uses this example.
In this example, the document 300 is an invoice, which is a type of form. The document contains multiple items, each a distinguishable unit indicating part of the document's content, and an item value indicating that content is written for each item. For example, the document 300 in FIG. 3 contains items such as a document title 301, a creation date 302, a billing destination name 303, billing source information 304, a billing amount 305, billing content 306, and billing source transfer destination information 307.

The document title 301 is the item in which the title of the document 300 is written.
For the creation date 302, the billing destination name 303, the billing source information 304, the billing amount 305, the billing content 306, and the billing source transfer destination information 307, the corresponding item values are written. For example, for the creation date 302, the corresponding date (here, "February 1, 2019") is written as the item value.

Some items consist of multiple distinguishable items. For example, the billing source information 304 consists of the billing source's name, address, and telephone number. Similarly, the billing content 306 consists of the product name, quantity, unit price, and amount of each billed entry, together with the subtotal, consumption tax, and total amount. The billing source transfer destination information 307 likewise consists of the bank name, branch name, account number, and account holder name of the billing source's transfer destination.

FIG. 4 is a flowchart showing the overall flow of processing in this embodiment. The following description follows the flowchart of FIG. 4. Each process in this flowchart is executed by the CPU 101 using a program in the ROM 102 or the RAM 103.

First, in S401, the CPU 101 scans the document 300 using the scanner 110 and acquires the scanned document image 201. This process corresponds to the acquisition of the scanned document image 201 in FIG. 2.
Next, in S402, the CPU 101 binarizes the scanned document image 201. This process corresponds to the document image process 211 applied to the scanned document image 201 in FIG. 2.
Next, in S403, the CPU 101 removes ruled lines from the scanned document image 201. This process also corresponds to the document image process 211 applied to the scanned document image 201 in FIG. 2.
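
The description does not specify how S402 and S403 are implemented. The following is a minimal sketch only, assuming a fixed-threshold binarization and a naive removal of long horizontal ink runs; the function names, the threshold of 128, and the run length are illustrative assumptions, not the method of the embodiment.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel (0-255) to 1 (ink) if darker than threshold, else 0.

    Assumption: a fixed global threshold; the source does not name a method.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def remove_horizontal_lines(binary, min_run=5):
    """Clear horizontal ink runs of at least min_run pixels (assumed ruled lines)."""
    cleaned = [row[:] for row in binary]
    for row in cleaned:
        start = None
        # The appended 0 is a sentinel that closes a run reaching the right edge.
        for x, px in enumerate(row + [0]):
            if px and start is None:
                start = x
            elif not px and start is not None:
                if x - start >= min_run:
                    row[start:x] = [0] * (x - start)
                start = None
    return cleaned


# Made-up 2 x 7 grayscale image: row 0 holds a 5-pixel line, row 1 two dots.
gray = [
    [255, 0, 0, 0, 0, 0, 255],
    [255, 0, 255, 0, 255, 255, 255],
]
binary = binarize(gray)
without_lines = remove_horizontal_lines(binary, min_run=5)
```

A run-length rule like this would also erase long horizontal character strokes, so a practical implementation would need a more careful criterion.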

Next, in S404, the CPU 101 performs character recognition processing (OCR processing) on the scanned document image 201 and generates character recognition results. As described later with reference to FIG. 5, the character recognition results include the multiple character strings written on the document. This character recognition processing corresponds to the character recognition process 212 applied to the scanned document image 201 in FIG. 2, and it produces the character recognition result information 202 in FIG. 2.
An example of the result of character recognition processing on the document 300 shown in FIG. 3 is described later with reference to FIG. 5.

Next, in S405, the CPU 101 classifies the document and performs extraction item selection processing, which extracts the items selected according to the classified document type. As described above, an item is a distinguishable unit indicating part of the document's content. The extraction item selection processing corresponds to the document classification / extraction item selection process 221 applied to the character recognition result information 202 in FIG. 2.

Next, in S406, the CPU 101 performs allocation processing, which assigns the corresponding character recognition result to each extracted item. That is, for each item extracted in S405, the character recognition results generated in S404 are searched, and the character string corresponding to the item value of that item is assigned. The allocation processing corresponds to the item name / item value estimation process 223 in FIG. 2.
The details of the allocation processing are described later using the flowchart of FIG. 6. The result of the allocation processing for the document 300 shown in FIG. 3 is described later using the table of FIG. 7.

Next, in S407, the CPU 101 performs item value correction processing, which corrects the character recognition result assigned to each item value using character recognition error patterns and the like. This process corresponds to the dictionary / character pattern matching process 225 in FIG. 2.
The details of the correction processing are described later using the flowchart of FIG. 8. The result of the correction processing for the document 300 shown in FIG. 3 is described later using the table of FIG. 11.

Next, in S408, the CPU 101 performs character recognition error pattern information extraction processing, which extracts character recognition error pattern information from the item values corrected by the user and registers it. In this processing, the character recognition result of each item value corrected in S407 is first presented to the user, who confirms or corrects its content. Then, each character as it was before the user's correction and the corresponding character after correction are extracted as character recognition error pattern information and registered. This processing corresponds to the user item value confirmation / correction process 231 and the character recognition error pattern information extraction process 232 in FIG. 2.
The details of the character recognition error pattern information extraction processing are described later using the flowchart of FIG. 12. The result of this processing for the document 300 shown in FIG. 3 is described later using the table of FIG. 14.
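
As a rough illustration of pairing each character before correction with the corresponding character after correction, the sketch below aligns the two strings with Python's standard `difflib.SequenceMatcher`. The alignment method, the function name `extract_error_patterns`, and the sample strings are assumptions made for illustration; the actual procedure of the embodiment is the one described with FIG. 12.

```python
from collections import Counter
from difflib import SequenceMatcher


def extract_error_patterns(recognized: str, corrected: str) -> Counter:
    """Count (misrecognized, correct) pairs found by aligning the two strings.

    Only substitutions are collected; inserted or deleted characters are
    ignored in this sketch.
    """
    matcher = SequenceMatcher(None, recognized, corrected, autojunk=False)
    patterns = Counter()
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "replace":
            continue
        wrong, right = recognized[i1:i2], corrected[j1:j2]
        if len(wrong) == len(right):
            # Equal-length runs are split into per-character pairs.
            patterns.update(zip(wrong, right))
        else:
            # Unequal runs are kept as a single substring pair.
            patterns[(wrong, right)] += 1
    return patterns


# Hypothetical OCR output and user-corrected value.
patterns = extract_error_patterns("¥1O,8OO", "¥10,800")
```

Counting the pairs, rather than listing them, makes it easy to see which confusions occur most often before registering them in a dictionary.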

Next, in S409, the CPU 101 outputs the extracted character recognition error pattern information. This process corresponds to creating the document extraction information 204 in FIG. 2 and providing it to the document extraction information utilization unit 240.
When the process of S409 is completed, this flowchart ends.

Next, the character recognition results obtained by the character recognition processing are described with reference to FIG. 5.
FIG. 5(a) shows, in tabular form, the character recognition results for the document 300 shown in FIG. 3. These results are obtained by the character recognition processing in S404 of the flowchart of FIG. 4.
Character recognition results are managed in units of character strings. As shown under "character recognition result ID" in FIG. 5(a), each character string is given a character recognition result ID as an identification number.

Each character recognition result also holds its position and size within the scanned document image, as shown under "position in document image - size" in FIG. 5(a).
In addition, each character recognition result holds a character code and position / coordinate information for each character in the character string. For the character code, each result holds, for each character in the string, not only the maximum likelihood candidate but also lower candidates whose likelihood is lower than that of the maximum likelihood candidate. In FIG. 5(a), the column "character recognition result (maximum likelihood candidate character string)" shows only the string of maximum likelihood candidate character codes for each result.

When character recognition processing is performed, constraints can be imposed on, for example, the character types that may appear in the character recognition result, according to characteristics such as the type of characters written on the document, the language used, and the writing direction of the character strings. Examples of such constraints are that all character types may appear in the character recognition result, or that only monetary amount character types may appear. No particular constraint is imposed here, however, so FIG. 5(a) shows "all character types" under "used character recognition constraint".
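
One way to picture a character type constraint is as a set of allowed characters. The sketch below is an illustration under assumptions: the constraint names and the characters included in the monetary amount set are hypothetical, and the check is applied to a finished string rather than inside a recognizer.

```python
# Hypothetical constraint table: None means every character type is allowed.
CONSTRAINTS = {
    "all": None,
    "amount": frozenset("0123456789,.¥"),  # assumed monetary amount characters
}


def conforms(text: str, constraint: str) -> bool:
    """Return True if every character of text is allowed under the constraint."""
    allowed = CONSTRAINTS[constraint]
    return allowed is None or all(ch in allowed for ch in text)
```

For example, a string containing the letter "O" in place of the digit "0" would fail the assumed "amount" constraint, which is the kind of mismatch a constrained recognition pass could avoid.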

In the example of FIG. 5(a), "character recognition result ID" "1" corresponds to the character recognition result of the document title 301 in the document 300 shown in FIG. 3. "Character recognition result ID" "10" corresponds to the telephone number in the billing source information 304 of the document 300.
Similarly, "character recognition result ID" "20" and "21" correspond, respectively, to the label "御請求金額" ("billed amount") and the amount in the billing amount 305 of the document 300.
"Character recognition result ID" "30" and "31" correspond, respectively, to the label "合計" ("total") and the total amount in the billing content 306 of the document 300.

FIGS. 5(b) and 5(c) show examples of the lattice structure of a character recognition result, including the lower candidates obtained as part of that result.
FIG. 5(b) shows the character recognition result for "character recognition result ID" "1".
The start point 511 marks the start of the character recognition result, and the end point 512 marks its end. The character string that is the character recognition result is placed between the start point 511 and the end point 512.
The arrow 513 indicates the order of the characters. The character string 514 is the string formed by the maximum likelihood candidate characters 521 to 524 (the maximum likelihood candidate character string).

The "character recognition result (maximum likelihood candidate character string)" column of FIG. 5(a) is based on the maximum likelihood candidate character string 514. In that column, maximum likelihood candidate characters that do not match what is written in the document 300 shown in FIG. 3 are emphasized in bold.

FIG. 5(c) shows the character recognition result for "character recognition result ID" "21". Whereas FIG. 5(b) shows a lattice structure consisting only of maximum likelihood candidate characters, FIG. 5(c) shows a lattice structure that also includes lower candidate characters.
As in FIG. 5(b), the start point 531 marks the start of the character recognition result, and the end point 532 marks its end. The character string that is the character recognition result is placed between the start point 531 and the end point 532.
The arrow 533 indicates the order of the characters. The character string 534 is the string formed by the maximum likelihood candidate characters 541 to 547 (the maximum likelihood candidate character string).

The lower candidate character 551 is shown for the maximum likelihood candidate character 541, and the lower candidate characters 552 and 553 are shown for the maximum likelihood candidate character 542. Similarly, the lower candidate character 554 is shown for the maximum likelihood candidate character 543, the lower candidate character 555 for the maximum likelihood candidate character 544, and the lower candidate character 556 for the maximum likelihood candidate character 546.
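
The lattice of FIG. 5(c) can be pictured as a list of character positions, each holding its candidates in order of likelihood. The sketch below is one possible in-memory representation, not the data format of the embodiment. The class and method names are hypothetical, and the characters are made up; only the number of candidates per position follows the description of characters 541 to 547 and 551 to 556.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class CharSlot:
    """Candidates for one character position, most likely first."""
    candidates: list[str]


@dataclass
class Lattice:
    result_id: int
    slots: list[CharSlot]

    def best_string(self) -> str:
        """Join the maximum likelihood candidate of every position."""
        return "".join(slot.candidates[0] for slot in self.slots)

    def all_strings(self):
        """Yield every start-to-end path through the lattice."""
        for path in product(*(slot.candidates for slot in self.slots)):
            yield "".join(path)


# Hypothetical characters; only the candidate counts follow FIG. 5(c).
lattice_21 = Lattice(21, [
    CharSlot(["¥", "Y"]),        # 541 with lower candidate 551
    CharSlot(["1", "l", "I"]),   # 542 with lower candidates 552 and 553
    CharSlot(["0", "O"]),        # 543 with lower candidate 554
    CharSlot([",", "."]),        # 544 with lower candidate 555
    CharSlot(["8"]),             # 545, no lower candidate
    CharSlot(["0", "O"]),        # 546 with lower candidate 556
    CharSlot(["0"]),             # 547, no lower candidate
])
```

With these candidate counts the lattice contains 2 x 3 x 2 x 2 x 1 x 2 x 1 = 48 distinct paths, of which the maximum likelihood candidate character string is one.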

FIG. 6 is a flowchart showing the details of the allocation processing performed in S406 of the flowchart of FIG. 4, which assigns a character recognition result to each extraction item as its item value. As a result of the allocation processing, the character recognition result that serves as the item value of each extraction item is identified, and character recognition information for the item values of all extraction items is obtained. The allocation processing is described below following the flowchart of FIG. 6. Each process in this flowchart is executed by the CPU 101 using a program in the ROM 102 or the RAM 103.

First, in S601, the CPU 101 determines whether the information on where the item value is written on the document is "fixed" or "variable". This determination is based on the item value position of each item extracted according to the document type classified in the extraction item selection processing of S405 in the flowchart of FIG. 4.
If the item value position is "fixed", the process proceeds to S602. If it is "variable", the process proceeds to S603.

If the item value position is fixed, in S602 the CPU 101 assigns a character recognition result as the item value based on that position. Because the item value is written within a fixed range, the character recognition results within that range are searched, and if a character string is found, it is assigned as the item value.
When the process of S602 is completed, the process proceeds to S608.

If the item value position is variable, in S603 the CPU 101 searches for the item name. That is, the character recognition results are searched for a character string that matches or resembles the item name character string of the item extracted in the extraction item selection processing of S405 in the flowchart of FIG. 4.

Next, in S604, the CPU 101 determines whether the item name search in S603 produced a result.
If an item name was found, the process proceeds to S605. If not, the process proceeds to S608.

If an item name was found, in S605 the CPU 101 searches for the item value based on that search result. This search uses the information on the relative direction of the item value with respect to the position of the item name in the document image, which is part of the information on the item extracted in the extraction item selection processing of S405 in the flowchart of FIG. 4. Multiple relative directions may be specified; in that case the search proceeds from the first one, and a matching character string, if found, becomes the priority candidate.
When the process of S605 is completed, the process proceeds to S606.

In S606, the CPU 101 determines whether the item value search in S605 produced a result.
If an item value was found, the process proceeds to S607. If not, the process proceeds to S608.
If an item value was found, in S607 the CPU 101 assigns the character recognition result found by the search as the item value.
When the process of S607 is completed, the process proceeds to S608.

In S608, the CPU 101 determines whether an item value has been assigned.
If an item value has been assigned, the process proceeds to S610. If not, the process proceeds to S609.
If no item value could be assigned, in S609 the CPU 101 records that there is no character recognition result information for the item value.
When the process of S609 is completed, this flowchart ends.
If an item value has been assigned, in S610 the CPU 101 creates the character recognition result information of the item value from the assigned character recognition result.
When the process of S610 is completed, this flowchart ends.
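
The branching of S601 to S610 can be sketched as a single function. This is an illustration under assumptions rather than the implementation of the embodiment: the data classes, the bounding-box tests for "right" and "below", the similarity threshold, and all sample coordinates and texts are hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class OcrResult:
    result_id: int
    text: str
    x: int
    y: int
    w: int
    h: int


@dataclass
class ExtractionItem:
    name: str
    position: str                   # "fixed" or "variable"
    region: Optional[tuple] = None  # (x, y, w, h), fixed items only
    name_strings: tuple = ()        # item name strings, in priority order
    directions: tuple = ()          # e.g. ("right", "below"), in priority order


def _inside(r, region):
    x, y, w, h = region
    return x <= r.x and y <= r.y and r.x + r.w <= x + w and r.y + r.h <= y + h


def _similar(a, b, threshold=0.8):
    # Assumed similarity test; the source only says "matches or resembles".
    return SequenceMatcher(None, a, b).ratio() >= threshold


def _find_in_direction(hit, results, direction):
    if direction == "right":   # nearest result to the right on the same row
        cands = [r for r in results if r.x >= hit.x + hit.w
                 and r.y < hit.y + hit.h and hit.y < r.y + r.h]
        return min(cands, key=lambda r: r.x, default=None)
    if direction == "below":   # nearest result below in the same column
        cands = [r for r in results if r.y >= hit.y + hit.h
                 and r.x < hit.x + hit.w and hit.x < r.x + r.w]
        return min(cands, key=lambda r: r.y, default=None)
    return None


def assign_item_value(item, results):
    assigned = None
    if item.position == "fixed":                                   # S601
        assigned = next((r for r in results
                         if _inside(r, item.region)), None)        # S602
    else:
        hit = None
        for name in item.name_strings:                             # S603
            hit = next((r for r in results if _similar(r.text, name)), None)
            if hit is not None:
                break
        if hit is not None:                                        # S604
            for direction in item.directions:                      # S605
                assigned = _find_in_direction(hit, results, direction)
                if assigned is not None:                           # S606, S607
                    break
    if assigned is None:                                           # S608
        return None                                                # S609
    return {"item": item.name, "result_id": assigned.result_id,
            "text": assigned.text}                                 # S610


# Hypothetical OCR results; only the IDs 1, 20, and 21 follow FIG. 5(a).
results = [
    OcrResult(1, "請求書", 300, 50, 100, 30),
    OcrResult(20, "御請求金額", 100, 300, 120, 20),
    OcrResult(21, "¥10,800", 240, 300, 100, 20),
]
title = ExtractionItem("title", "fixed", region=(250, 30, 300, 80))
amount = ExtractionItem("amount", "variable",
                        name_strings=("御請求金額", "請求金額"),
                        directions=("right", "below"))
```

With this sample data, `assign_item_value(amount, results)` finds the item name in result 20 and then picks result 21 to its right, while an item whose name does not appear in any result yields `None`, mirroring the S609 branch.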

Next, the way in which the allocation processing assigns character recognition results to item values is described with reference to FIG. 7.
FIG. 7(a) shows, in tabular form, an example of the extraction items for the document 300 shown in FIG. 3. These are obtained as a result of the extraction item selection processing of S405 in the flowchart of FIG. 4.
When the extraction item selection processing of S405 is applied to the character recognition results (FIG. 5(a)) obtained for the document 300 by the character recognition processing in S404 of the flowchart of FIG. 4, "invoice" is obtained as the document type. When the document is an invoice, the items shown in FIG. 7(a) are extracted as the extraction items.

Each extraction item is given an "extraction item ID" for identification. An "extraction item" is an item to be extracted and is something written in the document. Its content is therefore, in principle, available as a character recognition result. Depending on the document, however, the item value of an extraction item may not be written at all, and even when it is written, the corresponding character recognition result may contain errors.
Each extraction item has, as its "item value position", information on where the item value is written on the document. When the item value position is "fixed", the information includes a specific position (range) in the document as the place where the item value is written (not shown in FIG. 7(a)). When the item value position is "variable", no information on where the item value is written is included. Instead, the "item name character string" and the "relative direction of the item value with respect to the position of the item name" are included.

In documents such as forms, when the item value of an extraction item (for example, an amount) is written, its item name (for example, "billing amount") is generally written as well to identify what the value refers to. The allocation processing of FIG. 6 uses this to search for the character recognition result corresponding to the item value.
If the "item value character string" contains multiple character strings, a search is performed for each of them, and results are taken preferentially from the string listed first. If the "item name position" is "fixed", there is no information for the "item name character string" or the "relative position of the item name with respect to the position of the item name", so these are shown as "(none)".

FIG. 7(b) shows, in tabular form, an example of the result of the item name search performed in S603 of the flowchart of FIG. 6 on the character recognition results (FIG. 5(a)) obtained for the document 300.
For the extraction items whose "extraction item ID" is "4" to "10", the "item name position" is "variable" and an "item value character string" is specified, as shown in FIG. 7(a), so the item name found by the search is listed under "item name character recognition result (maximum likelihood candidate character string)". The constraint applied when that character recognition processing was performed is listed under "constraint at item name character recognition".
For the extraction items whose extraction item ID is "1" to "3", the "item name position" is "fixed", as shown in FIG. 7(a), so both "item name character recognition result (maximum likelihood candidate character string)" and "constraint at item name character recognition" are "(none)".

FIG. 7(c) shows, in tabular form, an example of the character recognition result information created in S610 of the flowchart of FIG. 6 from the character recognition results (FIG. 5(a)) obtained for the document 300.
For each extraction item, the character recognition result found as the item value is listed under "item value character recognition result (maximum likelihood candidate character string)". The constraint applied when that character recognition processing was performed is listed under "constraint at item name character recognition".
For the extraction items whose "extraction item ID" is "1" to "3", the "item name position" is "fixed", so "item value character recognition result (maximum likelihood candidate character string)" lists the result found based on the information on the specific position (range) in the document.

FIG. 8 is a flowchart of the item value correction process performed in S407 of the flowchart of FIG. 4, which corrects the character recognition result assigned to each item value. As a result of this process, the character recognition result that serves as the item value of each extracted item is corrected. The correction process is described below with reference to the flowchart of FIG. 8. Each step of this flowchart is executed by the CPU 101 using a program in the ROM 102 or the RAM 103.

In S801, the CPU 101 determines whether to perform re-character recognition on the character recognition result obtained as the item value. The determination is based on the "re-character recognition constraint of item value" (see FIG. 9(a), described later), which is information on the correction method of each item and is included in the information of the items selected according to the document classification result in S405 of the flowchart of FIG. 4.
If a constraint such as a character type is set in the "re-character recognition constraint of item value", re-character recognition is judged necessary and the process proceeds to S802. If no constraint is set, re-character recognition is judged unnecessary and the process proceeds to S804.

When re-character recognition is necessary, in S802 the CPU 101 configures the re-character recognition process for the item value. That is, it sets the constraint for re-character recognition based on the information recorded in the "re-character recognition constraint of item value".
When S802 is complete, the process proceeds to S803.

In S803, the CPU 101 performs re-character recognition (OCR processing) on the item value, which updates the character recognition result. The details of the re-character recognition process are described later with reference to FIG. 9.
When S803 is complete, the process proceeds to S804.

In S804, the CPU 101 determines whether to correct the character recognition result or the re-character recognition result obtained as the item value (hereinafter collectively the "(re-)character recognition result") using correction pattern information such as a dictionary or character pattern. The determination is based on whether a "dictionary / character pattern setting for item value correction" (see FIG. 9(a), described later) is present. This setting is information on the correction method of each item and is included in the item information extracted in S405 of the flowchart of FIG. 4.
If a dictionary / character pattern setting is present, the process proceeds to S805 to perform the correction. If it is absent, the process of this flowchart ends without any dictionary / character pattern correction.

In S805, the CPU 101 adds characters (or character strings) to the (re-)character recognition result of the item value using a character recognition error pattern dictionary (see FIG. 10, described later). That is, it switches the character recognition error pattern dictionary according to the constraint used in the (re-)character recognition process, finds the matching character recognition error patterns in that dictionary, and adds them to the lattice structure obtained as the (re-)character recognition result.
The details of adding to the lattice structure based on the character recognition error pattern dictionary are described later with reference to FIG. 10.
When S805 is complete, the process proceeds to S806.

In S806, the CPU 101 selects the dictionary / character pattern for correction (see FIG. 11, described later). Dictionaries and character patterns are prepared for each constraint used in the (re-)character recognition process, and the ones corresponding to the constraint in effect are selected here.
The details of the dictionaries and character patterns prepared for each character recognition constraint are described later with reference to FIG. 11.
When S806 is complete, the process proceeds to S807.

In S807, the CPU 101 performs correction using the dictionary / character pattern for each item, which updates the character recognition result.
When S807 is complete, the process of this flowchart ends.
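
The branching of S801 to S807 can be summarized in a short Python sketch. It is an illustration only, not the implementation of the embodiment: the function name `correct_item_value`, its parameters, and the three callables standing in for OCR, lattice augmentation, and dictionary / character pattern correction are all hypothetical. Falling back to "all character types" when no re-character recognition is run is an assumption modelled on the FIG. 10(c) example.

```python
from typing import Callable, List, Optional


def correct_item_value(
    result: str,
    reocr_constraint: Optional[str],
    correction_setting: Optional[str],
    reocr: Callable[[str], str],
    add_error_patterns: Callable[[str, str], List[str]],
    correct_with_patterns: Callable[[List[str], str, str], str],
) -> str:
    """Mirror the S801-S807 branching; every callable is a hypothetical stand-in."""
    # Assumption: with no re-recognition, the "all character types" constraint applies.
    constraint = "all character types"
    if reocr_constraint is not None:          # S801: is a re-recognition constraint set?
        constraint = reocr_constraint         # S802: configure the constraint
        result = reocr(constraint)            # S803: re-run OCR, updating the result
    if correction_setting is None:            # S804: no dictionary / pattern setting
        return result
    # S805: add corrections from the error pattern dictionary for this constraint.
    candidates = add_error_patterns(result, constraint)
    # S806-S807: select the dictionary / pattern and correct with it.
    return correct_with_patterns(candidates, correction_setting, constraint)
```

Passing the constraint through to both later stages reflects the point of the embodiment: the error pattern dictionary and the correction dictionary / character pattern are both chosen per constraint.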

The re-character recognition process for item values is now described with reference to FIG. 9.
FIG. 9(a) shows, in tabular form, an example of the re-character recognition settings for the item values of the document 300 shown in FIG. 3. It is part of the result obtained by the extracted item selection process of S405 in the flowchart of FIG. 4.
When the extracted item selection process of S405 in the flowchart of FIG. 4 is applied to the character recognition result (FIG. 5(a)) obtained for the document 300, information such as that shown in FIG. 9(a) is obtained together with the document type. That is, for each item value, information is obtained on the constraint to apply during re-character recognition and on the settings for correction using a dictionary / character pattern.

As shown in FIG. 9(a), each extracted item is given an "extracted item ID" for identification. Each extracted item also includes, as the "re-character recognition constraint of item value", the constraint to apply when re-character recognition is performed on the character recognition result of that item, and, as the "dictionary / character pattern setting for item value correction", the dictionary / character pattern setting to use in the correction process.

If a constraint such as a character type is given in the "re-character recognition constraint of item value", re-character recognition is performed on the character recognition result of the item value under that constraint. If re-character recognition is unnecessary, the "re-character recognition constraint of item value" is "(none)".
Likewise, if a dictionary / character pattern is set in the "dictionary / character pattern setting for item value correction", the (re-)character recognition result is corrected according to that setting. If correction is unnecessary, the "dictionary / character pattern setting for item value correction" is "(none)".

FIG. 9(b) shows, in tabular form, examples of the character type and other constraints that can be specified in the "re-character recognition constraint of item value" of FIG. 9(a). Each constraint is given a "character recognition constraint ID" for identification, and the set of usable characters is listed as the "character recognition constraint content".
For example, for "all character types", whose "character recognition constraint ID" is "1", the "character recognition constraint content" is "(no restriction)", meaning that every character may be used in re-character recognition. For "amount character type", whose "character recognition constraint ID" is "3", the "character recognition constraint content" lists the digits "0" to "9", ",", "¥", and "円", meaning that only these characters may be used in re-character recognition.

FIG. 9(c) shows, in tabular form, an example of the result of applying the re-character recognition of S803 in the flowchart of FIG. 8 to the character recognition result (FIG. 5(a)). The "(re-)character recognition result of item value (maximum likelihood candidate character string)" column gives the result of re-recognizing the "character recognition result of item value (maximum likelihood candidate character string)" shown in FIG. 7(c) with the "re-character recognition constraint of item value" applied.

For example, in the "(re-)character recognition result of item value" for "total amount", whose "extracted item ID" is "6", a character that had been misrecognized as the letter "I" has been corrected to the digit "1" by re-character recognition with the usable characters restricted.
For item values that do not undergo re-character recognition, the original "character recognition result of item value (maximum likelihood candidate character string)" shown in FIG. 7(c) is kept unchanged.
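
The effect of a character type constraint can be illustrated with a simplified Python sketch. It does not re-run OCR as S803 does. It only filters per-position candidates against the allowed character set, which is an assumption made for brevity. The table `CONSTRAINTS`, the function `best_string`, and the candidate lists are hypothetical and are not the contents of FIG. 5(c) or FIG. 9(b).

```python
# Hypothetical constraint table modelled on FIG. 9(b); None means "no restriction".
CONSTRAINTS = {
    "all character types": None,
    "amount character type": set("0123456789,¥円"),
}


def best_string(candidates, constraint_name):
    """candidates: one list per character position, ordered most to least likely."""
    allowed = CONSTRAINTS[constraint_name]
    chosen = []
    for position in candidates:
        usable = [c for c in position if allowed is None or c in allowed]
        if not usable:
            raise ValueError("no usable candidate at this position")
        chosen.append(usable[0])  # highest-ranked candidate that satisfies the constraint
    return "".join(chosen)


# Illustrative candidates only.
candidates = [["I", "1"], ["0", "O"], [","], ["1"], ["6"], ["2"], ["円"]]
print(best_string(candidates, "all character types"))    # I0,162円
print(best_string(candidates, "amount character type"))  # 10,162円
```

Under the amount constraint the letter "I" is not usable, so the lower-ranked "1" is selected in its place, which is the behaviour described for extracted item "6".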

Next, the correction process using the character recognition error pattern dictionaries created for each character recognition constraint is described with reference to FIG. 10.
FIGS. 10(a) and 10(b) show, in tabular form, examples of the character recognition error pattern dictionaries used in S805 of the flowchart of FIG. 8. A character recognition error pattern dictionary is prepared in advance for each character recognition constraint, and a different dictionary is used depending on the constraint in effect.

FIG. 10(a) shows the character recognition error pattern dictionary 1001, used when the character recognition constraint is "all character types". FIG. 10(b) shows the character recognition error pattern dictionary 1002, used when the constraint is "amount character type". In both dictionaries, each character recognition error pattern has an "error pattern ID" for identification, a "character recognition result" character (or string), and a "correction result" character (or string).
In this embodiment, the "correction result" entries of each dictionary consist only of characters that are usable under the corresponding character recognition constraint. This prevents unnecessary "correction results" from being added, so that the subsequent correction using dictionaries and character patterns can be performed efficiently.

FIGS. 10(c) and 10(d) show lattice structures that have been updated by adding correction results to the character recognition result, based on the character recognition error pattern dictionary for the constraint in effect.
FIG. 10(c) shows the lattice structure obtained by applying the character recognition error pattern dictionary 1001 of FIG. 10(a), used when the constraint is "all character types", to the character recognition result of FIG. 5(b) (the item value of "extracted item ID" "1").
For "extracted item ID" "1", no re-character recognition is performed, in accordance with the "re-character recognition constraint of item value" ("(none)") in FIG. 9(a), so the lattice structure of FIG. 10(c) contains everything in FIG. 5(b). That is, characters 521 to 524 in FIG. 5(b) and characters 1021 to 1024 in FIG. 10(c) are respectively the same characters.

However, in accordance with the "dictionary / character pattern setting for item value correction" in FIG. 9(a), correction is performed using the character recognition error pattern dictionary 1001 for the "all character types" constraint. Here, the character string consisting of characters 1021 and 1022 ("言青") matches the "character recognition result" of "error pattern ID" "1" in FIG. 10(a). Therefore, in the lattice structure of FIG. 10(c), character 1061, which is the "correction result" of "error pattern ID" "1", is added as a lower-ranked candidate for the character string consisting of characters 1021 and 1022. This makes it possible to correct the error pattern in which "請" is split and misrecognized as "言" and "青".
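
The addition of lower-ranked candidates to the lattice can be sketched as follows. This is a minimal illustration under stated assumptions: the lattice is reduced to the list of maximum likelihood characters, each element is a single character so that string offsets equal lattice positions, and an added candidate is represented as a `(start, end, correction)` edge. The function name `add_correction_edges` and the dictionary contents are hypothetical.

```python
def add_correction_edges(chars, error_patterns):
    """Return (start, end, correction) edges for every span of `chars`
    that equals a registered "character recognition result"."""
    text = "".join(chars)
    edges = []
    for wrong, right in error_patterns.items():
        start = text.find(wrong)
        while start != -1:
            edges.append((start, start + len(wrong), right))
            start = text.find(wrong, start + 1)
    return edges


# Hypothetical dictionary for the "all character types" constraint, after FIG. 10(a).
patterns_all = {"言青": "請"}
chars = ["言", "青", "求", "書"]  # stands in for characters 1021 to 1024
print(add_correction_edges(chars, patterns_all))  # [(0, 2, '請')]
```

The single edge spans the first two positions, which corresponds to character 1061 being added as an alternative to the pair of characters 1021 and 1022.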

Similarly, FIG. 10(d) shows the lattice structure obtained by applying the character recognition error pattern dictionary 1002, used when the constraint is "amount character type", to the character recognition result of FIG. 5(c) (the item value of "extracted item ID" "6").
For "extracted item ID" "6", re-character recognition was performed in accordance with the "re-character recognition constraint of item value" ("amount character type") in FIG. 9(a), so some of the characters in the lattice structure of FIG. 5(c) have been deleted. That is, characters 541, 552, 553, 554, 555, and 556 in FIG. 5(c) are not usable under the "re-character recognition constraint of item value" and are therefore absent from the lattice structure of FIG. 10(d). Characters 1042 to 1047 in FIG. 10(d) are respectively the same as characters 542 to 547 in FIG. 5(c).

However, because character 541, the maximum likelihood candidate in FIG. 5(c), has been deleted, character 1051, which corresponds to the lower-ranked candidate character 551, has been promoted to maximum likelihood candidate.
Furthermore, character 1045 ("6") matches the "character recognition result" of "error pattern ID" "1" in the character recognition error pattern dictionary 1002 of FIG. 10(b). Therefore "5", the "correction result" of that entry, is added as character 1071, a lower-ranked candidate for character 1045.

Next, the correction process using the per-item correction dictionaries and character patterns is described with reference to FIG. 11.
FIGS. 11(a) and 11(b) show, in tabular form, the per-item correction dictionary and character pattern used in S806 of the flowchart of FIG. 8. The correction dictionaries and character patterns are created in advance according to the content of each extracted item, and are selected and used according to the "dictionary / character pattern setting for item value correction" of the extracted item (see FIG. 9(a)).

FIG. 11(a) shows, in tabular form, an example of the correction dictionary 1101. In this example the extracted item is the document title for a document whose type is "invoice". Because there are usually many character strings that may be used as a document title, each is given a "dictionary item ID" for identification. The "standard character string" column lists the standard strings that are commonly used as the document title of an invoice.

FIG. 11(b) shows, in tabular form, an example of the correction character pattern 1102. In this example the extracted items are "billed amount", "total amount", and the like, and the table gives the character pattern of the strings written as the corresponding item value, "amount". The "description rule" of the character pattern is shown to be "regular expression", and the "character pattern rule" gives the pattern in regular expression form. The pattern describes a string consisting of one or more digits, followed by ",", followed by one or more digits, which may be preceded by "¥" and may be followed by "円".
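
The character pattern described above can be written as a regular expression. The following Python sketch is reconstructed from the prose description only, since the literal rule in FIG. 11(b) is not reproduced here. The names `AMOUNT_PATTERN` and `matches_amount` are hypothetical, and ASCII digits and an ASCII comma are assumed.

```python
import re

# Optional leading "¥", digits, one ",", digits, optional trailing "円".
AMOUNT_PATTERN = re.compile(r"¥?[0-9]+,[0-9]+円?")


def matches_amount(text: str) -> bool:
    """True when the whole string follows the amount pattern."""
    return AMOUNT_PATTERN.fullmatch(text) is not None


print(matches_amount("10,162円"))  # True
print(matches_amount("¥10,162"))   # True
print(matches_amount("I0,162円"))  # False: "I" is not a digit
```

`fullmatch` is used so that stray characters before or after the amount cause a mismatch. As described, the pattern requires a "," and so rejects an amount such as "162円".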

FIG. 11(c) shows, in tabular form, the result of correcting the (re-)character recognition result shown in FIG. 9(c) using the correction dictionaries and character patterns in S807 of the flowchart of FIG. 8.
For each extracted item, the correction dictionary / character pattern used is the one specified for that item in the "dictionary / character pattern setting for item value correction" of FIG. 9(a). The "item value" column gives the result after correction, the "correction content" column gives the content before and after correction, and the "(re-)character recognition constraint of item value" column gives the constraint that was applied in the (re-)character recognition process.

For example, for the "document title", whose "extracted item ID" is "1", a character string matching one of the "standard character strings" in the correction dictionary 1101 ("請求書", "invoice") is found in the lattice structure of FIG. 10(c), to which the character recognition error pattern has been added.
That is, between the start point 1011 and the end point 1012 of the lattice structure of FIG. 10(c), the characters 1061, 1023, and 1024 form the string "請求書". This matches "請求書", the "standard character string" of "dictionary item ID" "1" in the correction dictionary 1101. Therefore, as shown in FIG. 11(c), the item value of the "document title", whose "extracted item ID" is "1", is corrected to "請求書".

Likewise, for the total amount, whose "extracted item ID" is "6", a character string matching the "character pattern rule" of the correction character pattern 1102 is found in the lattice structure of FIG. 10(d), to which the character recognition error pattern has been added.
That is, between the start point 1031 and the end point 1032 of the lattice structure of FIG. 10(d), the characters 1051, 1042, 1043, 1044, 1045, 1046, and 1047 form the string "10,162円", which matches the "character pattern rule". Therefore, as shown in FIG. 11(c), the item value of "extracted item ID" "6" is "10,162円".
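
Finding a dictionary entry in a lattice can be sketched by enumerating the strings spelled by its paths. The Python below is a toy illustration, not the search used in the embodiment: the lattice is a small acyclic graph keyed by integer node, `lattice_strings` is a hypothetical helper, and the second entry of `standard_strings` is invented for the example. Exhaustive enumeration is practical only for short strings such as these.

```python
def lattice_strings(edges, start, end):
    """Yield every string spelled by a path from `start` to `end`.

    edges: {node: [(next_node, text), ...]}, with candidates listed best first.
    The graph is assumed to be acyclic.
    """
    if start == end:
        yield ""
        return
    for next_node, text in edges.get(start, []):
        for rest in lattice_strings(edges, next_node, end):
            yield text + rest


# Toy lattice after FIG. 10(c): "言" then "青", plus the added candidate "請" spanning both.
edges = {
    0: [(1, "言"), (2, "請")],
    1: [(2, "青")],
    2: [(3, "求")],
    3: [(4, "書")],
}
standard_strings = {"請求書", "御請求書"}  # hypothetical entries of dictionary 1101
hits = [s for s in lattice_strings(edges, 0, 4) if s in standard_strings]
print(hits)  # ['請求書']
```

The same enumeration could be filtered with a regular expression instead of a set lookup to handle character pattern rules such as the amount pattern.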

In this way, the correction process using correction dictionaries and character patterns corrects the (re-)character recognition result with the dictionary or character pattern that corresponds to each item. However, as shown in FIG. 10(d), this does not necessarily eliminate every error in the character recognition result. In such a case the character string with the highest likelihood (for example, "10,162円") is selected. For this reason, as shown in the flowchart of FIG. 4, the user corrects the item values after the correction process.
Various methods are known for finding the most similar character string, using the likelihood of the character recognition result, the likelihood of character adjacency, the degree of agreement with a character pattern or with the characters in a dictionary, and so on. For example, edit distance computation by dynamic programming on character strings and the Viterbi algorithm are well known. Any of these methods may be used in the present invention.
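
As one instance of the dynamic programming approach mentioned above, the following Python sketch computes the Levenshtein edit distance and picks the closest standard string. It uses plain edit distance only and ignores recognition likelihoods, which is a simplification. The function names and the second dictionary entry are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep when equal
            ))
        prev = cur
    return prev[-1]


def closest(recognized: str, standard_strings):
    """Return the standard string with the smallest edit distance."""
    return min(standard_strings, key=lambda s: edit_distance(recognized, s))


print(edit_distance("言青求書", "請求書"))        # 2
print(closest("言青求書", ["請求書", "見積書"]))  # 請求書
```

Keeping only two rows of the table bounds memory by the length of the shorter argument when it is passed as `b`.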

FIG. 12 is a flowchart showing the details of the character recognition error pattern information extraction process performed in S408 of the flowchart of FIG. 4, which extracts and registers character recognition error pattern information based on the user's corrections to item values. When the user checks and corrects the item value of each extracted item in this process, character recognition error pattern information is extracted from the corrections and registered. The process is described below with reference to the flowchart of FIG. 12. Each step of this flowchart is executed by the CPU 101 using a program in the ROM 102 or the RAM 103.

In S1201, the CPU 101 displays a confirmation / correction screen as the user interface for presenting the item values of the extracted items to the user. To make checking easy, each item value is presented by displaying, side by side, an image of the document that includes the neighborhood of the area where the character recognition result lies and the character string of the item value of the extracted item. An example of the confirmation / correction screen is described later with reference to FIG. 13.
When S1201 is complete, the process proceeds to S1202.

In S1202, the CPU 101 obtains the result of the user's confirmation or correction of the displayed item values.
If the user checks a value and judges that no correction is needed, the user can indicate this on the confirmation / correction screen, so that the absence of a correction is obtained directly. It can also be obtained indirectly, for example when a fixed time elapses. If, on the other hand, the user corrects a value on the confirmation / correction screen, both the need for correction and the content of the correction are obtained directly.
When S1202 is complete, the process proceeds to S1203.

In S1203, the CPU 101 determines whether the user made a correction in S1202.
If the user made a correction, the process proceeds to S1204. If not, the process of this flowchart ends.

If the user made a correction, in S1204 the CPU 101 extracts the corrected item value as character recognition error pattern information and associates that information with the constraint used in the character recognition process. That is, it pairs the character (or string) as corrected by the user with the character (or string) before correction, extracts the pair as character recognition error pattern information, and associates it with the character recognition constraint. Alternatively, the corrected portion may be extracted as character recognition error pattern information from the user's edit history for the item value and associated with the character recognition constraint.
In this way, new character recognition error pattern information, in which the character (or string) misrecognized in the character recognition process is paired with the correct character (or string) supplied by the user, is created for each character recognition constraint.
When S1204 is complete, the process proceeds to S1205.
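
The pairing of misrecognized and corrected text in S1204 can be sketched with Python's standard `difflib`. This is one possible way to locate the changed portions and is not stated in the source. The function name `extract_error_patterns`, the tuple layout, and the example correction from "10,162円" to "10,152円" are hypothetical.

```python
from difflib import SequenceMatcher


def extract_error_patterns(recognized: str, corrected: str, constraint: str):
    """Return (constraint, misrecognized, correction) for each changed span."""
    matcher = SequenceMatcher(None, recognized, corrected, autojunk=False)
    patterns = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # For pure insertions or deletions one side is an empty string.
            patterns.append((constraint, recognized[i1:i2], corrected[j1:j2]))
    return patterns


print(extract_error_patterns("10,162円", "10,152円", "amount character type"))
# [('amount character type', '6', '5')]
print(extract_error_patterns("言青求書", "請求書", "all character types"))
# [('all character types', '言青', '請')]
```

Tagging each pattern with the constraint in effect is what later allows it to be stored in the dictionary for that constraint only.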

Next, in S1205 to S1207, the newly created character recognition error pattern information is registered in a character recognition error pattern dictionary. The information is registered per character recognition constraint, so that in subsequent character recognition processing the error pattern information can be used efficiently according to the constraint in effect.

In S1205, the CPU 101 determines whether the character recognition error pattern dictionaries in which the new character recognition error pattern information generated in S1204 should be registered exist.
If all of those dictionaries exist, the process proceeds to S1207. If any of them does not exist, the process proceeds to S1206.

In S1206, if any character recognition error pattern dictionary in which the information should be registered does not exist, the CPU 101 creates the missing dictionary. This occurs when the character recognition error pattern information was extracted from character recognition processing performed under a new or previously unused constraint. In this case, a character recognition error pattern dictionary corresponding to that constraint is created.
When the processing of S1206 is completed, the process proceeds to S1207.

In S1207, the CPU 101 registers the new character recognition error pattern information in the corresponding character recognition error pattern dictionary according to the constraint used in the character recognition process. Because a character recognition error pattern dictionary can thereafter be used for each constraint, efficient correction processing according to the constraint of the character recognition process is realized.
When the processing of S1207 is completed, this flowchart ends.
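The flow of S1205 to S1207 can be sketched as follows. The data layout (one in-memory mapping per constraint) and the name `register_patterns` are assumptions for illustration; the text does not describe how the dictionaries are stored. The flowchart checks for all missing dictionaries before registering, whereas this sketch performs the check per pattern, which yields the same final state.

```python
def register_patterns(dictionaries, patterns):
    """Register (constraint, wrong, right) tuples into per-constraint dictionaries.

    `dictionaries` maps a constraint name to a {wrong: right} mapping.
    """
    for constraint, wrong, right in patterns:
        # S1205: does a dictionary for this constraint already exist?
        if constraint not in dictionaries:
            # S1206: create the missing dictionary for a new or unused constraint.
            dictionaries[constraint] = {}
        # S1207: register the pattern under its constraint.
        dictionaries[constraint][wrong] = right
    return dictionaries


# Hypothetical constraint names and patterns.
dictionaries = {"digits": {"O": "0"}}
register_patterns(dictionaries, [("digits", "l", "1"), ("account_type", "晋通", "普通")])
```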

FIG. 13 is an example of a confirmation/correction screen 1300, which is the user interface provided to the user in S1201 of the flowchart of FIG. 12. The screen of FIG. 13 is assumed to be used in a multi-window GUI environment. On the confirmation/correction screen, the user is asked to compare the character string of each item value obtained in S407 of the flowchart of FIG. 4 against the scanned document image 201 of FIG. 2 and to confirm whether the correct item value has been extracted. If an extracted item value contains an error, the user is asked to correct it. The confirmation/correction screen 1300 is described below using the example shown in FIG. 13.

The confirmation/correction screen 1300 is displayed as a window on a display device 109 such as a display. On the confirmation/correction screen 1300, a title 1301 is displayed, indicating that the screen is for confirming and correcting the item values of the extracted items.
When the temporary save button 1302 is pressed, the contents of the confirmation/correction work on the confirmation/correction screen 1300 are temporarily saved, the work is suspended, and the confirmation/correction screen 1300 is closed. When the confirmation/correction screen 1300 is subsequently opened, the temporarily saved work contents are displayed again and the confirmation/correction work can be resumed.
When the completion button 1303 is pressed, the contents of the confirmation/correction work on the confirmation/correction screen 1300 are saved as the work result, after which the confirmation/correction screen 1300 is closed and the confirmation/correction work ends.

In the display area 1310, the entire scanned document image of a document such as the document 300 shown in FIG. 3 is displayed.
In the display area 1320, the item values of the extracted items obtained in S407 of the flowchart of FIG. 4 are displayed. The user can confirm and correct the item values in the display area 1320.

In the example of FIG. 13, information on four extracted items obtained in S407 of the flowchart of FIG. 4 is displayed in the display area 1320 in the boxes 1330, 1340, 1350, and 1360, respectively. As in this example, when there are too many item values to display at once, a scroll bar 1321 is displayed. By operating the scroll bar 1321, the user can confirm and correct the item values of all the extracted items in the display area 1320.

In the first box 1330, information on the item value of the extracted item "document title" is displayed. Similarly, the boxes 1340, 1350, and 1360 display information on the extracted items "telephone number", "total amount", and "bank / branch name of the billing source", respectively.
The name of the extracted item is displayed in 1331, indicating that the box 1330 relates to the item value of the "document title". In 1332, the region of the item value, cut out of the scanned document image 201 based on its position and size in the document image, is displayed as a partial image. In 1333, the character string of the item value obtained by character recognition processing of the partial image displayed in 1332 is displayed. The character string displayed in 1333 can be confirmed and corrected by the user.
If the character string displayed in 1333 is not edited, the user is regarded as having confirmed that the character recognition result is correct. If, on the other hand, the character string displayed in 1333 is edited and its content is changed, the user is regarded as having made a correction. The same applies to the item values displayed in the other boxes 1340, 1350, and 1360.

In the screen shown in FIG. 13, the cursor 1334 is at the position of 1333, and the frame of the box 1330 is highlighted with a thick border. This indicates that the "document title" of 1330 is currently selected as the extracted item to be confirmed or corrected by the user. Accordingly, the region of the partial image displayed in 1332 is outlined with a dotted line on the scanned document image 201 displayed in the display area 1310, so that its position in the scanned document image 201 can be easily identified.
In this state, the user can edit the item value ("invoice") of the extracted item "document title" on which the cursor 1334 is positioned.
In this way, the user can confirm and correct the item values of all the extracted items on the confirmation/correction screen 1300.

FIG. 14(a) shows, in table form, an example of the user's corrections acquired in S1202 of the flowchart of FIG. 12.
As shown in FIG. 14(a), the "item value output result (before user correction)" and the "item value after user correction" are recorded for each "extracted item ID", together with the "correction content" detected by comparing the two. The "item value character recognition constraint" is also included.
The "item value output result (before user correction)" is the same as the item value shown in FIG. 11(c). In this example, for extracted item IDs "5", "6", and "10", character recognition errors that could not be corrected by the re-character recognition process or by the correction process using dictionaries and character patterns have been corrected by the user. The content corrected by the user is shown as the "correction content" in association with the "item value character recognition constraint".

FIG. 14(b) shows, in table form, an example of the character recognition error pattern information extracted based on the "correction content" and the "item value character recognition constraint" of FIG. 14(a).
Each piece of character recognition error pattern information is given an "additional error pattern ID" for identification. For each "character recognition constraint", the erroneous "character recognition result" produced by the character recognition process and the "correction result" entered by the user are recorded. The "character recognition result" and the "correction result" correspond to those shown in FIGS. 10(a) and 10(b), and the "character recognition constraint" corresponds to that shown in FIG. 9 and elsewhere.
As a result, a character recognition error pattern dictionary can be used for each constraint of the character recognition process, so that character recognition error pattern information is used efficiently according to the constraint.
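Why per-constraint dictionaries help can be seen in a short sketch: when a recognition result is corrected, only the patterns registered for its constraint need to be scanned, rather than every known pattern. The function name `correct`, the constraint labels, and the sample patterns are assumptions for illustration, and plain sequential substring replacement is a simplification of whatever matching the actual device uses.

```python
def correct(text, constraint, dictionaries):
    """Apply only the error patterns registered for the given constraint."""
    for wrong, right in dictionaries.get(constraint, {}).items():
        # Simplified: replace every occurrence of the misrecognized span.
        text = text.replace(wrong, right)
    return text


# Hypothetical dictionaries keyed by character recognition constraint.
dictionaries = {
    "digits": {"O": "0", "l": "1"},
    "account_type": {"晋通": "普通"},
}

phone = correct("O3-l234-5678", "digits", dictionaries)
```

A string recognized under a constraint with no dictionary is returned unchanged, and patterns belonging to other constraints are never consulted.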

As described above, according to the present embodiment, using a dictionary that stores character recognition error pattern information for each constraint of the character recognition process improves the efficiency of the correction processing applied to the character recognition result and shortens the processing time. Furthermore, collecting character recognition error pattern information based on corrections made by the user enables efficient correction processing. This contributes to shortening the time required for data input work by making it more efficient, and to reducing the user's workload.

<Example 2>
As described above, in the first embodiment, the "account number" written in the document 300 shown in FIG. 3, for example, was handled as a single item, the "transfer destination account number of the billing source". However, as written in the document 300, the "account number" consists of multiple distinguishable units written together: the account type ("ordinary" or "current") and the digit string that is the account number itself. Therefore, in the first embodiment, the character recognition result for such an item value had to be divided into its units, and the constraint for character recognition processing and the dictionary or character pattern for correction had to be set for each unit.
For this reason, for an item such as the "account number", in which multiple distinguishable units are written together, it is preferable to divide the item into its units in the allocation process and extract them as separate items. However, because the contents of these units are closely related to each other, it is better to correct them together in the correction process.

The process by which character recognition results are assigned to item values by the allocation process in this embodiment will be described with reference to FIG. 15.
FIG. 15(a) replaces FIG. 7(a) of the first embodiment.
Extracted item IDs "1" to "10" in FIG. 15(a) correspond to extracted item IDs "1" to "10" in FIG. 7(a), respectively. However, extracted item IDs "8" and "9" of FIG. 7(a) have been split and removed, and extracted item IDs "14" to "17" are added in FIG. 15(a) in their place. Extracted item IDs "11" to "13" are also added.

In the "relative direction of the item value with respect to the reference position" column of FIG. 15(a), not only the relative direction but also the reference position itself can be specified. This makes it possible to specify a relative direction based on the position of an item value that has already been found.
For example, the "billing source zip code", which is the extracted item with extracted item ID "11", is specified as being in the "down" direction relative to the position of the "invoice name", which is the extracted item with extracted item ID "3". This allows the process of assigning character recognition results to item values to be performed in finer detail.
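A search relative to an already-found item can be sketched as below. The bounding-box representation, the rule "nearest box below that overlaps horizontally", and the names `Box` and `find_below` are all assumptions for illustration; the text only states that a reference position and a relative direction can be specified. Image coordinates are assumed to have y increasing downward.

```python
from typing import NamedTuple, Optional, Sequence


class Box(NamedTuple):
    text: str  # character recognition result for this region
    x: int     # left edge
    y: int     # top edge (y grows downward, as in image coordinates)
    w: int     # width
    h: int     # height


def find_below(reference: Box, candidates: Sequence[Box]) -> Optional[Box]:
    """Return the nearest candidate lying below `reference` and overlapping it horizontally."""
    below = [
        c for c in candidates
        if c.y >= reference.y + reference.h   # starts below the reference box
        and c.x < reference.x + reference.w   # horizontal ranges overlap
        and reference.x < c.x + c.w
    ]
    return min(below, key=lambda c: c.y, default=None)


# Hypothetical layout: the zip code sits directly under the already-found reference item.
reference = Box("Invoice name", 100, 50, 200, 30)
candidates = [
    Box("123-4567", 120, 100, 80, 20),
    Box("unrelated", 400, 90, 50, 20),
    Box("header", 100, 10, 50, 20),
    Box("address", 110, 150, 60, 20),
]
found = find_below(reference, candidates)
```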

FIG. 15(b) replaces FIG. 9(a) of the first embodiment.
As with FIG. 15(a), extracted item IDs "1" to "10" in FIG. 15(b) correspond to extracted item IDs "1" to "10" in FIG. 9(a), respectively. However, extracted item IDs "8" and "9" of FIG. 9(a) have been split and removed, and extracted item IDs "14" to "17" are added in FIG. 15(b) in their place. Extracted item IDs "11" to "13" are also added.

In FIG. 15(b), because the items are divided into finer units, the "item value re-character recognition constraint" is set in more detail. For example, the "transfer destination account number of the billing source" with extracted item ID "9" in FIG. 9(a) is divided in FIG. 15(b) into the "transfer destination account type of the billing source" with extracted item ID "16" and the "transfer destination account number of the billing source" with extracted item ID "17". Accordingly, the "account type character type" and the "numeric character type" are set as the constraints for the re-character recognition process of the "transfer destination account type of the billing source" and the "transfer destination account number of the billing source", respectively.
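The effect of splitting an account entry into units with different character constraints can be illustrated with a minimal sketch. In the embodiment the split is made by defining separate extraction items in the allocation process; the regular-expression split on an already-recognized string shown here, the function name `split_account`, and the sample values are assumptions used only to show the two units and their constraints.

```python
import re

# Constraint for the account-type unit: one of the known account types.
# Constraint for the account-number unit: ASCII digits only.
ACCOUNT_PATTERN = re.compile(r"\s*(普通|当座)\s*([0-9]+)\s*")


def split_account(value):
    """Split a combined account entry into its type and number units.

    Returns None when the string does not satisfy both constraints.
    """
    match = ACCOUNT_PATTERN.fullmatch(value)
    if match is None:
        return None
    return {"account_type": match.group(1), "account_number": match.group(2)}


# Hypothetical recognized string containing both units.
units = split_account("普通 1234567")
```

A string that fails either constraint returns `None`, which in a fuller pipeline could trigger re-recognition or dictionary-based correction of the offending unit.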

For the "billing source zip code" with extracted item ID "11" and the "billing source address place name" with extracted item ID "12", the correction process is performed on both together using the same correction dictionary, "address place name". This is because a place name and its zip code correspond uniquely to each other; correcting them together allows both item values to be corrected without contradiction.
Similarly, the "transfer destination bank name of the billing source" with extracted item ID "14" and the "transfer destination bank branch name of the billing source" with extracted item ID "15" are closely related, so they are also corrected together using the same correction dictionary.
As described above, according to the second embodiment, the correction process for the character recognition result can be performed more accurately.
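Correcting two related items against one shared dictionary can be sketched as follows. The scoring method (summed `difflib` similarity ratios), the function names, and the sample dictionary entries are assumptions for illustration; the text states only that the same dictionary is used and that both values are corrected together so that they remain consistent.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity ratio in [0, 1] between two strings."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def correct_jointly(zip_code, address, entries):
    """Pick the (zip, address) dictionary entry that best matches both values at once.

    Choosing a single entry guarantees the corrected zip code and
    address agree with each other.
    """
    if not entries:
        return zip_code, address  # nothing to correct against
    return max(
        entries,
        key=lambda e: similarity(zip_code, e[0]) + similarity(address, e[1]),
    )


# Illustrative shared dictionary of (zip code, place name) pairs.
entries = [("100-0001", "Chiyoda"), ("150-0002", "Shibuya")]
fixed = correct_jointly("1O0-0001", "Chiy0da", entries)
```

The same approach would apply to the bank name and branch name pair, with a dictionary of valid (bank, branch) combinations.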

<Other Examples>
The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.
The present invention may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device.
The present invention is not limited to the above-described embodiments; various modifications are possible based on the gist of the present invention, and they are not excluded from its scope. That is, all configurations combining the above-described embodiments and their modifications are also included in the present invention.

100 data entry system

Claims

An image processing apparatus having a correction means for correcting a character string generated by performing character recognition processing on a character string on a document using a constraint according to a characteristic of the character string on the document, wherein the correction means performs correction processing on the character string generated by the character recognition processing using pattern information for correction prepared according to the constraint.

The image processing apparatus according to claim 1, further comprising a character recognition means for performing the character recognition process.

The image processing apparatus according to claim 1 or 2, wherein the character string generated by the character recognition process is composed of a plurality of items that can be distinguished as a unit indicating the content of the document.

The image processing apparatus according to any one of claims 1 to 3, wherein the characteristic is a character type of the character string.

The image processing apparatus according to claim 3, wherein the characteristic is the item.

The image processing apparatus according to any one of claims 1 to 5, wherein the pattern information for correction is a character recognition error pattern.

The image processing apparatus according to any one of claims 1 to 5, wherein the pattern information for correction is a character pattern rule of the character string.

The image processing apparatus according to any one of claims 3 to 5, wherein the pattern information for correction is a standard character string for the item.

The image processing apparatus according to any one of claims 1 to 8, further comprising a re-character recognition means for performing re-character recognition processing, using the constraint, on the character string generated by the character recognition processing.

The image processing apparatus according to any one of claims 1 to 9, wherein the constraint is a character type of the character string.

The image processing apparatus according to any one of claims 1 to 10, further comprising a providing means for providing a user interface capable of confirming or correcting a character string generated by the correction process.

The image processing apparatus according to claim 11, wherein, when the character string generated by the correction process is corrected by the user in the user interface, the corrected information is added to the pattern information for correction.

The image processing apparatus according to any one of claims 3 to 12, wherein the correction means collectively corrects character strings corresponding to a plurality of the items having related contents.

An image processing method having a correction step of correcting a character string generated by performing character recognition processing on a character string on a document using a constraint according to a characteristic of the character string on the document, wherein, in the correction step, correction processing is performed on the character string generated by the character recognition processing using pattern information for correction prepared according to the constraint.

A program for causing a computer to execute the image processing method according to claim 14.