JP2022011037A

JP2022011037A - Image processing device, image processing method and program

Info

Publication number: JP2022011037A
Application number: JP2020111903A
Authority: JP
Inventors: 亮小坂; Ryo Kosaka
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2020-06-29
Filing date: 2020-06-29
Publication date: 2022-01-17

Abstract

PROBLEM TO BE SOLVED: When a plurality of registered document images have a data structure similar to that of a query document, matching alone may fail to select the correct registered document as the similar form, and as a result an erroneous value may be extracted. SOLUTION: A plurality of similar-form candidates are detected by matching processing, a region in which the candidates are easy to compare with one another is selected, and a determination process using the data format of the value attribute is performed on that region. This allows the correct registered document to be selected and the correct value to be extracted. SELECTED DRAWING: Figure 2

Description

The present invention relates to an image processing apparatus, an image processing method, and a program.

In recent years, as the spread of computers has changed the working environment, documents handled in business have increasingly been digitized. Many companies, including those that handle ordering and application screening, are working to digitize various forms such as invoices, quotations, purchase orders, and application forms. In such work, each form is identified from among a large volume of forms, and processing is performed for each form according to a workflow. Data registration is one example of such a workflow. Specifically, predetermined data items (form name, form number, issuer company information, issue date, billing details, and so on) are extracted from the identified form and registered in a predetermined business system such as an accounting system or an expense settlement system. The person in charge of data registration visually checks these data items on the various forms and manually enters their contents into the business system, which lowers work efficiency. Efforts are therefore being made to automate data registration. Specifically, an obtained form is scanned and digitized, and form recognition processing for identifying the form type and optical character recognition (hereinafter, OCR) processing for optically recognizing character information are applied, so that the data items and their contents are extracted from the form automatically.

One method for identifying the form type is the method of Patent Document 1. In this method, the features of a form format (ruled line information and image feature information of character strings) and confirmation information (such as position information of where item values are entered) are registered in advance. When a scanned form is to be identified, the form format of the document being processed is first specified based on the registered form format features. The form is then identified by collation with the confirmation information associated with the specified form format, and the item values are extracted. If the collation fails, an error is presented. Automatically extracting the item values in this way greatly reduces the user's workload. In addition, erroneous detection can be prevented even when a different form is input whose format is similar but whose item positions differ.

Japanese Unexamined Patent Publication No. 2005-242786

This method uses the item name information in the form as the confirmation information for collation. It therefore cannot be applied when item names are not written in the form. In addition, when the collation finds a mismatch, processing is interrupted with a form identification error, so item values cannot be extracted even if another form with a similar format but different item positions has been registered.

To solve the above problem, an image processing apparatus in which a plurality of document images are registered as registered document images comprises: region determining means for determining a region for comparison and determination when a plurality of registered document images similar to an acquired document image are selected; comparison means for performing character recognition on the region of the acquired document image determined by the region determining means, determining an attribute of that region, and comparing it with the attribute of the corresponding region of each registered document image; and determination means for determining a registered document image having the same attribute as the similar document image.

According to the present invention, even when a plurality of registered document images with similar layout structures are registered, the most similar document image can be correctly identified, and the accuracy of item value extraction can be improved.

A block diagram showing an example of the hardware configuration of an image processing apparatus according to an embodiment of the present invention.
A diagram showing the overall flow in the image processing apparatus of the present embodiment.
A diagram explaining the block selection processing in the present embodiment.
A diagram explaining the matching processing in the present embodiment.
A diagram explaining the value region position estimation processing in the present embodiment.
A diagram explaining the same-type document image extraction processing in the present embodiment.
A diagram explaining the data formats of value attributes and the priority determination table used in the same-type document image extraction processing in the present embodiment.
A diagram explaining the document image registration processing in the first embodiment of the present invention.
A diagram showing an example of the confirmation screen in the present embodiment.
A diagram explaining the document image registration processing in the second embodiment of the present invention.
A diagram showing an example of the learning result of the appearance frequency of data formats in the second embodiment of the present invention.
A diagram explaining the same-type document image extraction processing in the second embodiment of the present invention.

Hereinafter, embodiments for carrying out the present invention will be described with reference to the drawings. The components described in these embodiments are merely examples and are not intended to limit the scope of the present invention.

[First Embodiment]
<Scan Assist System>
FIG. 1 is an example of a block diagram showing the hardware configuration of an image processing apparatus according to an embodiment of the present invention. With this configuration, the image processing apparatus 100 executes scan assist processing, which automatically extracts from a scanned form the item values (hereinafter, value values) corresponding to the items (hereinafter, value attributes) required for input to a predetermined business system.

As shown in FIG. 1, the image processing apparatus 100 includes a CPU 111, a ROM 112, a RAM 113, a storage unit 114, an image processing unit 115, a user interface 116, an image reading unit 117, an image output unit 118, and a display device 119. These devices are connected by a data bus 110 so that they can communicate with one another. The apparatus is also connected via an external interface 120 to an external information processing apparatus, a cloud system, a business system, and the like (not shown).

The CPU 111 is a controller that performs overall control of the image processing apparatus 100. The CPU 111 starts an OS (operating system) using a boot program stored in the ROM 112. A controller program stored in the storage unit 114 is executed on this OS. The controller program is a program for controlling the image processing apparatus 100. The CPU 111 performs overall control of the devices connected by the data bus 110. The RAM 113 operates as a temporary storage area, such as the main memory and work area of the CPU 111.

The storage unit 114 is a readable and writable non-volatile memory such as an HDD, and stores various data such as the above-mentioned controller program and processing results.

The image processing unit 115 analyzes a document image that has been read by the image reading unit 117 (described later) and stored in the storage unit 114, and generates information for scan assist. The analysis consists of four processes. The first is block selection processing (BS processing), which extracts blocks (object regions) of objects such as character strings in the document image. The second is optical character recognition processing (OCR processing), which extracts character string information from a character string image. The third is processing that obtains the similarity between a registered document image and a newly scanned document image and determines the type of the document image. The fourth is processing that generates information for scan assist based on the determination. The information for scan assist consists of the item values required for input to various business systems (for example, form name, form number, billing company information, and billing amount).

The user interface 116 is an input/output device composed of, for example, a keyboard, a mouse, a touch panel, and hard keys. The user interface 116 accepts various setting values or designated values from the user and transmits the instruction information to the CPU 111.

The image reading unit 117 is a scanner device that acquires a document image as image data by reading a paper document or the like with an optical reading device such as a CCD. When the CPU 111 acquires a document image from the image reading unit 117, it stores the image in the storage unit 114. When executing the scan assist processing, the CPU 111 reads the document image stored in the storage unit 114 into the RAM 113.

The image output unit 118 is a printer device. For example, the image output unit 118 can output image data of a document image to a storage medium. Alternatively, the image output unit 118 may have a printing function and output the document image to an output medium such as paper.

The display device 119 is a display such as an LCD or a CRT, and displays the display data generated by the CPU 111.

The external interface 120 transmits and receives various data, such as image data and extracted form information, to and from external devices via a network such as a LAN, a telephone line, or short-range wireless communication such as infrared.

The image processing apparatus 100 described above is an example, and the image processing apparatus 100 may lack any of the image reading unit 117, the image output unit 118, and the display device 119. In that case, the necessary information is exchanged via the external interface 120. Some functions of the image processing apparatus 100 may also be executed by an external processing apparatus through communication via the external interface 120. The external processing apparatus may be implemented as a computer such as a server, or as a cloud server on the Internet. Other components may be provided as necessary.

<Overall processing flow>
The overall flow in the image processing apparatus 100 of the present embodiment will be described with reference to FIG. 2. FIG. 2 is a flowchart showing the entire scan assist processing for a newly scanned document image in the image processing apparatus 100. The flow of FIG. 2 starts when the user interface 116 accepts a document image scan instruction from the user. At that time, the CPU 111 holds in the RAM 113 information on the name of the user who started the flow. The user name can be obtained by using an authentication device (not shown) or the like. The processing of the flowchart shown in FIG. 2 is performed by loading the program code stored in the storage unit 114 into the RAM 113 and executing it on the CPU 111.

In step S201 of FIG. 2, the CPU 111 scans a document via the image reading unit 117, acquires a document image, and stores it in the storage unit 114. This document image is hereinafter referred to as the query document image.

In step S202, the CPU 111 reads the query document image stored in the storage unit 114 into the RAM 113 and performs correction processing on it. The correction processing includes correction for document images, such as color conversion and gradation correction, and rotation correction. The rotation correction uses the fact that the character strings and lines in a document image are arranged horizontally: a rotation angle is calculated on that basis, and the image is rotated by the calculated angle.

In step S203, the CPU 111 causes the image processing unit 115 to execute block selection (BS) processing on the query document image. Block selection processing (region division processing) divides the image into object blocks and determines the attribute of each block. The attribute information obtained for each block is used in later steps, for example in the similarity calculation. Details of the block selection processing will be described later with reference to FIG. 3.

In step S204, the CPU 111 performs matching processing between the query document image and the registered document image group. The matching processing calculates the similarity between the query document image and every registered document image. A registered document image whose similarity is equal to or greater than a predetermined value can be judged likely to be the same type of document as the query document image. The registered document image group is a collection of previously processed document images registered in the database on the storage unit 114 in step S211, described later. The group need not hold the document images themselves; it may instead hold feature quantities usable for matching, such as the block selection results of each document image. The CPU 111 reads the registered document image group from the storage unit 114 into the RAM 113 for use. Details of the matching processing will be described later with reference to FIG. 4.

In step S205, the CPU 111 extracts, from the registered document image group, candidate images that are likely to be the same type of document as the query document. Using the matching results obtained in step S204, registered images whose similarity is equal to or greater than the predetermined value are extracted as same-type document candidate images. If no registered document image reaches the predetermined similarity, it is determined that no document image of the same type as the query document image exists.
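The candidate extraction of step S205 can be sketched as a simple threshold filter. This is a minimal illustration, not the patented implementation: the function name `extract_candidates`, the document IDs, the score scale, and the threshold value are all assumptions made for the example.

```python
from typing import Dict, List


def extract_candidates(similarities: Dict[str, float], threshold: float) -> List[str]:
    """Return IDs of registered documents whose similarity is >= threshold.

    The result is ordered from the highest similarity to the lowest.
    An empty list means no same-type document candidate was found.
    """
    hits = [(score, doc_id) for doc_id, score in similarities.items() if score >= threshold]
    # Sort by descending score; ties are broken by document ID for determinism.
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in hits]


# Hypothetical matching results from step S204.
scores = {"reg_410": 0.92, "reg_420": 0.31, "reg_430": 0.88}
candidates = extract_candidates(scores, threshold=0.8)
```

An empty result corresponds to the case in which the query document has no same-type document among the registered images.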

In step S206, the CPU 111 determines whether any registered document image was extracted as a same-type document candidate image in step S205. If so, the process proceeds to step S207. If not, the intermediate steps are skipped and the process proceeds to step S211.

In step S207, the CPU 111 estimates, within the query document image, the value region positions corresponding to each registered document image determined to be a same-type document candidate image in step S204. The value region position estimation processing estimates where in the query document image each of the value region positions registered in association with a registered document image is located. Details of the value region position estimation processing will be described later with reference to FIG. 5.

In step S208, the CPU 111 performs same-type document image extraction processing. This processing determines, from among the plurality of detected same-type document candidate images, the most similar registered document image as the similar document image. Details of the same-type document image extraction processing will be described later with reference to FIGS. 6 and 7.
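As a rough illustration of the idea behind step S208, the sketch below keeps only those candidates for which the text read from the query document has the data format that the registered document expects in the compared region. The details of the actual processing are those described with FIGS. 6 and 7; the format names, the regular expressions, the function names, and the sample data here are all assumptions made for this example.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical data formats, checked in order. These patterns are
# illustrative only and are not taken from the source.
FORMATS = [
    ("date", re.compile(r"^\d{4}[/-]\d{1,2}[/-]\d{1,2}$")),
    ("phone", re.compile(r"^\d{2,4}-\d{2,4}-\d{4}$")),
    ("amount", re.compile(r"^[¥$]?\d[\d,]*$")),
]


def data_format(text: str) -> str:
    """Classify an OCR result into one of the hypothetical data formats."""
    stripped = text.strip()
    for name, pattern in FORMATS:
        if pattern.match(stripped):
            return name
    return "text"


def select_similar(candidates: Dict[str, Tuple[str, str]]) -> List[str]:
    """Keep candidates whose expected format matches the query's OCR text.

    `candidates` maps a registered-document ID to a pair of
    (expected data format, OCR text read from the query at that region).
    """
    return [
        doc_id
        for doc_id, (expected, ocr_text) in candidates.items()
        if data_format(ocr_text) == expected
    ]


# Both candidates expect a date in the compared region, but at the position
# taken from reg_540 the query document actually contains a phone number.
candidates = {
    "reg_520": ("date", "2020/06/29"),
    "reg_540": ("date", "03-1234-5678"),
}
selected = select_similar(candidates)
```

In this toy example only `reg_520` survives, because the region estimated from `reg_540` does not contain a value in the expected format.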

In step S209, the CPU 111 acquires, from the query document image, the value values written at the value region positions estimated from the same-type document image determined in step S208. Specifically, OCR processing is performed on each BS text block obtained by the BS processing to acquire the value value for each value attribute. For BS text blocks already used in the determination of step S208, OCR processing has already been applied, so the results may simply be read from the storage unit 114 into the RAM 113.

In step S210, the CPU 111 displays a confirmation screen for the value value estimation results, such as that shown in FIG. 9, on the display device 119. The screen is displayed with the value values acquired in step S209 filled in as recommendations for the items of the query document image. If it was determined in step S206 that there is no same-type document candidate image, no value values are available to recommend, so all item fields are displayed blank. The user then corrects the recommended value values while viewing this confirmation screen.

In step S211, the CPU 111 performs document image registration processing. When the user has corrected the recommendation results for the query document image, this processing registers document image information reflecting the corrections in the database so that it can be referenced in subsequent processing. Details of the document image registration processing will be described later with reference to FIG. 8.

In step S212, the CPU 111 transmits the information required for registration to a predetermined business system via the external interface 120. Examples of the transmitted information are the scanned image and scan information (date and time information, scan setting information, and so on) obtained in step S201, and the item names and item values confirmed or corrected by the user in step S210. The transmitted information is not limited to these and may also include, for example, information on the user who performed the scanning or the confirmation and correction work.

<Block selection processing>
The block selection processing (region division processing) performed in step S203 of FIG. 2 will be described with reference to FIG. 3. FIG. 3(a) is an example of the document image read in step S201. FIG. 3(b) shows the result of dividing the read query document image into object blocks. The image processing unit 115 determines an attribute, such as character (TEXT), line (LINE), table (TABLE), photograph (PHOTO), or drawing (PICTURE), for each object in the query document image. Each object block is shown as a delimited region having one of these attributes. FIG. 3(c) shows the processing flow of block selection.

The method of block selection processing will be described with reference to FIG. 3(c). In step S301, the image processing unit 115 binarizes the document image to generate a black-and-white binary image.
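The binarization of step S301 can be illustrated with a fixed threshold. The source does not state how the threshold is chosen, so the fixed value of 128, the nested-list image representation, and the function name `binarize` are assumptions made for this sketch.

```python
from typing import List


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Convert a grayscale image (values 0-255) into a binary image.

    Pixels darker than the threshold become 1 (black); all others become 0 (white).
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


# A tiny 2x2 grayscale image used only for illustration.
binary = binarize([[0, 200], [127, 128]])
```

A practical system would more likely pick the threshold adaptively, but a fixed threshold is enough to show what the later steps receive as input.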

In step S302, the image processing unit 115 extracts clusters of pixels enclosed by black pixel contours. They are extracted by tracing contours on the binary image generated in step S301. When the area of a black pixel cluster obtained by contour tracing is larger than a predetermined area, contour tracing is also performed on the white pixels inside it to extract white pixel clusters. Black pixel clusters are then extracted recursively from inside any white pixel cluster of at least a certain area.
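The sketch below shows one way to obtain black pixel clusters from a binary image. Note that it substitutes a simpler technique, 8-connected component labeling by breadth-first search, for the contour tracing described above, and it omits the recursive extraction of white pixel clusters. The function name and the bounding-box representation are assumptions made for this example.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def black_blobs(binary: List[List[int]]) -> List[Box]:
    """Return the bounding box of every 8-connected cluster of black (1) pixels."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Start a new cluster and flood-fill it with a queue.
            seen[y][x] = True
            queue = deque([(x, y)])
            left, top, right, bottom = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        inside = 0 <= nx < width and 0 <= ny < height
                        if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


image = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
boxes = black_blobs(image)
```

The bounding boxes returned here are the kind of size and shape information that the classification in the next step works from.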

In step S303, the image processing unit 115 classifies the black pixel clusters obtained in step S302 by size and shape into regions having different attributes. For example, a cluster whose aspect ratio is close to 1 and whose size falls within a certain range is treated as a character-equivalent pixel cluster. A portion in which neighboring characters can be grouped in good alignment is a character region (TEXT). A flat pixel cluster is a line region (LINE). The range occupied by a black pixel cluster that is at least a certain size and contains well-aligned rectangular white pixel clusters is a table region (TABLE). A region in which irregularly shaped pixel clusters are scattered is a photograph region (PHOTO). Any other pixel cluster of arbitrary shape is a drawing region (PICTURE). Hereinafter, a block determined to be a character region is referred to as a BS text block. The information of the BS text blocks can also be used for OCR processing, and OCR processing may be performed in this step as necessary.
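Two of the classification rules above, the character-equivalent cluster and the flat line cluster, can be sketched from a cluster's width and height alone. The numeric thresholds below are invented for illustration, since the source gives no concrete values, and the table, photograph, and drawing rules are left out because they need information beyond a single bounding box.

```python
def classify_blob(
    width: int,
    height: int,
    char_min: int = 8,
    char_max: int = 64,
    flat_ratio: float = 10.0,
) -> str:
    """Classify a pixel cluster by its bounding-box size and shape.

    Returns "LINE" for a very flat cluster, "CHAR" for a roughly square
    cluster of character-like size, and "OTHER" for anything else.
    Width and height are assumed to be positive pixel counts.
    """
    aspect = width / height
    if aspect >= flat_ratio or aspect <= 1 / flat_ratio:
        return "LINE"
    if 0.5 <= aspect <= 2.0 and char_min <= max(width, height) <= char_max:
        return "CHAR"
    return "OTHER"


labels = [classify_blob(20, 22), classify_blob(300, 2), classify_blob(200, 150)]
```

Neighboring "CHAR" clusters would then be grouped into TEXT regions, which is the grouping step that produces BS text blocks.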

<Matching process>
An outline of the matching processing performed in step S204 of FIG. 2 will be described with reference to FIG. 4. FIG. 4(a) shows a query document image 400, and FIGS. 4(b) and 4(c) show examples of registered document images 410 and 420. The query document image 400 of FIG. 4(a) and the registered document image 410 of FIG. 4(b) have similar text arrangements, whereas the query document image 400 and the registered document image 420 of FIG. 4(c) do not. The image processing unit 115 therefore uses an appropriate method to calculate the similarity between FIG. 4(a) and FIG. 4(b) and the similarity between FIG. 4(a) and FIG. 4(c). Because the former is higher, the registered document image 410 shown in FIG. 4(b) is selected as the result of matching.

Next, the concept of the similarity calculation using BS text blocks will be described. FIG. 4(d) shows the BS text blocks determined to be character regions in the query document image 400 of FIG. 4(a). The dotted lines are BS text blocks, and ID 401 to ID 405 are the IDs of the text blocks. The character strings in FIG. 4(a) correspond to the BS text blocks in FIG. 4(d). Similarly, FIG. 4(e) shows the BS text blocks of the registered document image 410 of FIG. 4(b), which consist of BS text blocks ID 411 to ID 416. FIG. 4(f) shows the BS text blocks of the registered document image 420 of FIG. 4(c), which consist of BS text blocks ID 421 to ID 425.

The similarity calculation using BS text blocks focuses on how similar the shapes and arrangement of the BS text blocks are. For example, when FIG. 4(d) and FIG. 4(e) are compared, the arrangement of the text blocks located in the upper part of each document is compared first. Specifically, when ID 401, ID 402, ID 403, and ID 404 in FIG. 4(d) are compared with ID 411, ID 412, ID 413, and ID 414 in FIG. 4(e), the arrangements of the text blocks are similar. Next, the arrangement of the text blocks located in the lower part of each document is compared. Specifically, the documents are aligned so that ID 405 in FIG. 4(d) and ID 416 in FIG. 4(e) are at the same position. ID 403, ID 404, and ID 405 in FIG. 4(d) then become similar to ID 414, ID 415, and ID 416 in FIG. 4(e). Because both the upper and the lower text block arrangements are similar, FIG. 4(d) and FIG. 4(e) are judged likely to be the same type of document, and a high similarity is calculated. Comparing the upper and lower parts of the documents separately in this way makes it easy to compare documents in which, as shown in FIG. 4, the number of lines in the character region varies with, for example, the number of products.

On the other hand, between FIG. 4(d) and FIG. 4(f), although some BS text blocks overlap slightly, no BS text blocks have a high similarity. Therefore, the two are judged not to be document images of the same type, and a low similarity is calculated.

Note that the method of calculating the similarity is not limited to the above; the similarity can be calculated using information such as the amount of movement required to align the BS text blocks, the area over which BS text blocks overlap, and the number of similar BS text blocks.
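The cues listed above (alignment shift, overlap area, number of similar blocks) can be combined in many ways. The following is a minimal sketch of one possibility, not the method of the embodiment: blocks are assumed to be axis-aligned rectangles `(x, y, width, height)`, overlap is measured by intersection over union, and the names `iou`, `layout_similarity`, and the `min_iou` threshold are hypothetical.

```python
from typing import List, Tuple

Block = Tuple[float, float, float, float]  # (x, y, width, height), an assumed representation


def iou(a: Block, b: Block) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)


def layout_similarity(query: List[Block], registered: List[Block],
                      shift: Tuple[float, float] = (0.0, 0.0),
                      min_iou: float = 0.5) -> float:
    """Fraction of blocks that overlap well after moving the registered blocks by `shift`."""
    dx, dy = shift
    moved = [(x + dx, y + dy, w, h) for x, y, w, h in registered]
    # A query block counts as "similar" if some moved registered block overlaps it enough.
    matched = sum(1 for q in query if any(iou(q, r) >= min_iou for r in moved))
    return matched / max(len(query), len(registered), 1)
```

Evaluating the function once with a shift that aligns the upper blocks and once with a shift that aligns the lower blocks corresponds to the upper-part and lower-part comparison described for FIG. 4.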

<Value region position estimation process>
The value region position estimation process performed in step S207 of FIG. 2 will be described with reference to FIG. 5. FIG. 5(a) shows a query document image 500 composed of BS text blocks 501 to 510. FIG. 5(b) shows a registered document image 520 obtained as a same-type document candidate image in step S204, composed of BS text blocks 521 to 530. FIG. 5(c) shows a registered document image 540 obtained as a same-type document candidate image in step S204, composed of BS text blocks 541 to 552. In the registered document images 520 and 540 of FIGS. 5(b) and 5(c), value attributes are assigned to some of the BS text blocks. Specifically, ID 521 is assigned a title attribute representing the title of the form, ID 522 an issuer attribute representing the name of the company that issued the form, and ID 523 an issuerTel attribute representing the telephone number of the issuing company. In the present embodiment, attributes such as the form number (issueNo), the form issue date (issueDate), the subtotal (subTotal), the tax total (taxTotal), and the total (total) are also assigned, but other value attributes may be assigned as well.

The value region position estimation process estimates the BS text blocks in the query document image whose shape and arrangement are similar to those of the BS text blocks to which value attributes are assigned in a registered document image. Specifically, the CPU 111 projects each of the BS text blocks 521 to 524, 527, and 529, to which value attributes are assigned in the registered document image 520 shown in FIG. 5(b), onto the same position in the query document image 500 shown in FIG. 5(a). For example, suppose that when the BS text block (ID 523) with the issuerTel attribute in the registered document image 520 is projected onto the same position in the query document image, it lands at the position of ID 563, as shown in FIG. 5(d). Searching for BS text blocks of the query document image 500 near the projected ID 563 finds that ID 502 and ID 503 are located nearby. To decide whether ID 563 should be associated with ID 502 or with ID 503, the arrangements of the text blocks are compared.

First, the documents are aligned so that ID 563 coincides with ID 502, and the arrangements of the text blocks of the query document 500 and the registered document 520 are compared to calculate a similarity. Next, the documents are aligned so that ID 563 coincides with ID 503, and the arrangements are compared again to calculate a similarity. Finally, the calculated similarities are compared. Because the similarity of the text block arrangement is higher when ID 563 is aligned with ID 503, the text blocks are associated with each other at the position giving the higher similarity. That is, the BS text blocks ID 522, ID 523, and ID 524 of the registered document 520 are associated with the BS text blocks ID 502, ID 503, and ID 504 of the query document 500, respectively.

Likewise, the BS text blocks 541 to 544, 548, 550, and 552, to which value attributes are assigned in the registered document image 540 shown in FIG. 5(c), are projected onto the query document 500 and the text block arrangements are compared. As a result of the similarity comparison, the BS text blocks ID 542, ID 543, and ID 544 of the registered document 540 are associated with ID 502, ID 503, and ID 504 of the query document 500, respectively. The other BS text blocks to which value attributes are assigned are processed in the same way. By associating the value attributes of the registered documents 520 and 540 with the BS text blocks of the query document in this way, the value region position estimation result shown in FIG. 5(e) is obtained.
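The project-then-align procedure above can be sketched as follows. This is an illustrative simplification under stated assumptions: blocks are `(x, y, width, height)` tuples keyed by ID, the arrangement score simply counts registered blocks that land on a query block of similar size, and the names `arrangement_score`, `match_value_block`, `tol`, and `radius` are hypothetical rather than taken from the embodiment.

```python
from typing import Dict, Optional, Tuple

Block = Tuple[float, float, float, float]  # (x, y, width, height), an assumed representation


def arrangement_score(query: Dict[int, Block], registered: Dict[int, Block],
                      dx: float, dy: float, tol: float = 3.0) -> int:
    """Count registered blocks that coincide with a query block after shifting by (dx, dy)."""
    score = 0
    for rx, ry, rw, rh in registered.values():
        for qx, qy, qw, qh in query.values():
            if (abs(rx + dx - qx) <= tol and abs(ry + dy - qy) <= tol
                    and abs(rw - qw) <= tol and abs(rh - qh) <= tol):
                score += 1
                break
    return score


def match_value_block(target_id: int, query: Dict[int, Block],
                      registered: Dict[int, Block], radius: float = 15.0) -> Optional[int]:
    """Project a registered block onto the query and pick the best-aligned nearby block."""
    tx, ty, _, _ = registered[target_id]
    best_id, best_score = None, -1
    for qid, (qx, qy, _, _) in query.items():
        if abs(qx - tx) > radius or abs(qy - ty) > radius:
            continue  # not in the neighborhood of the projected position
        # Align the projected block with this candidate and compare whole arrangements.
        score = arrangement_score(query, registered, qx - tx, qy - ty)
        if score > best_score:
            best_id, best_score = qid, score
    return best_id
```

Scoring the whole arrangement, rather than the single projected block, is what lets the nearer but wrong neighbor be rejected, as in the ID 502 versus ID 503 example.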

<Same-type document image extraction process>
The same-type document image extraction process performed in step S208 of FIG. 2 will be described with reference to FIGS. 6 and 7. FIG. 6(a) is a flowchart of the same-type document image extraction process, and FIG. 6(b) shows the result of determining the order in which BS text blocks are used for the determination. FIG. 7(a) shows an example of the data formats of the value attributes, and FIG. 7(b) shows a priority determination table for the comparison order between value attributes. Here, the data format of a value attribute is a correspondence table that defines, for each value attribute, what features characterize the character string to be extracted as the value. For example, it is defined in advance whether the character string contains a specific character string, whether the character string follows a predetermined data format, and whether the character types used are restricted. The data format is not limited to these; in addition to character string information such as string length and character size, various other conditions may be registered as the data format.

The priority determination table of FIG. 7(b) assigns priorities, when data formats are compared between different value attributes, in descending order of the difference between the data formats shown in FIG. 7(a). This is because the higher the priority, the easier it is to determine the appropriate value attribute from among the multiple value attributes assigned to a BS text block when determining the same-type document image. For example, for the combination of the issuer's company name (issuer) and the form number (issueNo), the only available criterion is whether the specific character string of the issuer attribute is contained, so the priority is set low (to 3). For the combination of the company name (issuer) and the telephone number (issuerTel), the determination can use the presence of a specific character string, conformance to a format, and restriction of character types, so the priority is set high (to 1). For the subtotal (subTotal) and the tax total (taxTotal), the data formats are exactly the same, so the pair is treated as indistinguishable and the priority is set to 0.
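As a rough illustration of how such a table could be derived, the sketch below counts how many data format conditions differ between two attributes and maps that count to a priority. The table contents are hypothetical stand-ins (the actual entries of FIG. 7(a) are not reproduced here), and the mapping for two differing conditions is an assumption; only the priorities 1, 3, and 0 for the three pairs discussed above come from the text.

```python
# Hypothetical data format table: one entry per value attribute, three conditions each.
DATA_FORMATS = {
    "issuer":    {"keywords": ("株式会社",), "pattern": None, "charset": None},
    "issuerTel": {"keywords": ("TEL",), "pattern": r"\d{2}-\d{4}-\d{4}", "charset": "digits"},
    "issueNo":   {"keywords": (), "pattern": None, "charset": None},
    "subTotal":  {"keywords": ("円",), "pattern": None, "charset": "digits"},
    "taxTotal":  {"keywords": ("円",), "pattern": None, "charset": "digits"},
}

# Number of differing conditions -> priority (1 is highest, 0 means "cannot be distinguished").
PRIORITY_BY_DIFFERENCE = {0: 0, 1: 3, 2: 2, 3: 1}


def comparison_priority(attr_a: str, attr_b: str) -> int:
    """Priority of comparing two value attributes, from how much their data formats differ."""
    fmt_a, fmt_b = DATA_FORMATS[attr_a], DATA_FORMATS[attr_b]
    differing = sum(1 for key in fmt_a if fmt_a[key] != fmt_b[key])
    return PRIORITY_BY_DIFFERENCE[differing]
```

Deriving the priority from the table keeps the two in step when a data format is edited, at the cost of treating every condition as equally discriminative.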

In step S601 of FIG. 6, the CPU 111 determines whether the registered document has been narrowed down to one of the plural candidates in step S606, described later. If multiple document images remain as candidates, the processing from step S602 onward is repeated to narrow down the candidates. If one has been determined, that registered document image can be taken as the same-type document image, and the process ends.

In step S602, the CPU 111 determines the BS text block of the query document image 500 to be used for comparing and determining the same-type document image. The BS text block used for the determination is chosen from the value region positions estimated in step S207, based on the following conditions.
Condition 1: A region in which a plurality of different value attributes are assigned to one BS text block, and
Condition 1-1: the comparison priority is 1
Condition 1-2: the comparison priority is 2
Condition 1-3: the comparison priority is 3
Condition 2: A region in which one value attribute is assigned to one BS text block
Condition 3: No region for determination

Based on the above conditions, the order of the BS text blocks to be used for the determination is decided for the value region position estimation result shown in FIG. 5(e), obtained in step S207. FIG. 6(b) is the estimation result of FIG. 5(e) with the determination order added. As shown in FIG. 6(b), ID 501 and ID 507 of the query document image 500 are assigned the same value attribute by both registered documents 520 and 540; they match none of conditions 1 to 3 above and are therefore excluded from processing. ID 509 of the query document image 500 is assigned different value attributes (taxTotal and subTotal) by the registered document 520 and the registered document 540. However, because those attributes have the same data format, the block is judged unsuitable for comparison and is likewise excluded. ID 503 of the query document image 500 is assigned different value attributes (issuerTel and issuer), a combination with the highest priority (1) in the priority determination table of FIG. 7(b), so its determination order is set to 1.

The determination order is decided for the remaining text blocks in the same way. When the region determination of step S602 is executed for the first time, ID 503, the BS text block with determination order 1, is selected. When step S602 is executed a second or subsequent time, the BS text blocks with determination orders 2 and 3 are selected in turn. That is, the region is determined as ID 504 and ID 502 the second time, ID 510 the third time, and no region the fourth time. The priority determination table is not limited to this one; a priority determination table for combinations of three or more attributes may be used for condition 1. In addition, for condition 2, the region may be determined using a priority determination table based on how easily a single value attribute can be judged.
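The ordering step for condition 1 can be sketched as below. This is only an assumed reading: it handles blocks for which two candidates propose different attributes, skips pairs with priority 0, and leaves condition 2 out entirely. The input shapes and the name `determination_order` are hypothetical.

```python
from typing import Dict, List, Tuple


def determination_order(estimates: Dict[int, Dict[str, str]],
                        pair_priority: Dict[Tuple[str, str], int]) -> List[int]:
    """Order query blocks for comparison: priority 1 first, undecidable blocks dropped.

    estimates maps a query block ID to {registered document ID: proposed attribute}.
    pair_priority maps an alphabetically sorted attribute pair to its priority.
    """
    ranked = []
    for block_id, proposals in estimates.items():
        distinct = sorted(set(proposals.values()))
        if len(distinct) != 2:
            continue  # every candidate agrees (or more than two attributes): not handled here
        priority = pair_priority.get((distinct[0], distinct[1]), 0)
        if priority > 0:  # priority 0 means the two data formats are identical
            ranked.append((priority, block_id))
    return [block_id for _, block_id in sorted(ranked)]
```

Sorting on `(priority, block_id)` makes the order deterministic when several blocks share a priority.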

In step S603, the CPU 111 determines whether a BS text block for comparison was determined in step S602. If a BS text block was determined, the process proceeds to step S604; if not, the process proceeds to step S607.

In step S604, the CPU 111 applies OCR processing to the BS text block determined in step S602 and acquires the character string of the OCR result. Specifically, OCR processing is first applied to ID 503 of the query document image 500, yielding the character string "〇〇株式会社" ("XX Co., Ltd."). Because the character string information acquired here can be reused in the value acquisition process of step S209 of FIG. 2, it is stored in the storage unit 114.

In step S605, the CPU 111 compares the character string obtained in step S604 with the data formats defined in FIG. 7(a). Based on the result, it determines the value attribute to be assigned to the character string at the value region position in the query document image 500. Here, the value attribute is determined by judging which of the issuerTel attribute and the issuer attribute is more suitable for ID 503. The character string "〇〇株式会社" of ID 503 is therefore checked against the data format of the issuerTel attribute and against that of the issuer attribute. Compared with the data format of the issuerTel attribute, it satisfies neither the format condition nor the character-type condition. Compared with the data format of the issuer attribute, on the other hand, it satisfies the condition of containing a specific character string. Since the string does not fit the data format of the issuerTel attribute, the value attribute to be assigned to ID 503 of the query document image 500 is estimated to be the issuer attribute. If the character string fits the data formats of more than one of the compared attributes, it is estimated that any of those attributes may apply.
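A minimal sketch of this data format check is shown below. The keyword list and the telephone pattern are assumptions chosen to fit the examples in the text, not the actual definitions of FIG. 7(a), and `estimate_attribute` is a hypothetical name.

```python
import re
from typing import Callable, Dict, List

# Assumed checks: one predicate per value attribute.
CHECKS: Dict[str, Callable[[str], bool]] = {
    # issuer: the string contains a company-name keyword.
    "issuer": lambda text: any(k in text for k in ("株式会社", "（株）", "(株)")),
    # issuerTel: digits and hyphens in a telephone-like layout.
    "issuerTel": lambda text: re.fullmatch(r"\d{2,4}-\d{2,4}-\d{4}", text) is not None,
}


def estimate_attribute(text: str, candidates: List[str]) -> List[str]:
    """Return the candidate attributes whose data format the OCR string satisfies."""
    return [attr for attr in candidates if CHECKS[attr](text)]
```

Returning a list rather than a single attribute mirrors the case noted above in which the string fits more than one candidate and the block cannot decide between them.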

In step S606, based on the result obtained in step S605, the CPU 111 determines which of the same-type document candidate images used for the comparison is appropriate as the same-type document, and the process returns to step S601. In the example shown in FIG. 5, step S605 estimated that the issuer attribute should be assigned to ID 503, so the registered document image 540, which associates the issuer attribute with ID 503 in FIG. 6(b), is determined to be the same-type document image. If the registered document image cannot be narrowed down to one on the basis of the estimation result for a single BS text block in step S605, the process returns to step S602 and step S605 is performed again with the next BS text block as the comparison target. In that case, step S606 determines the same-type document from among the candidates based on all the comparison results of the text blocks processed in step S605 so far. For example, the registered document image may be selected for which the proportion of value attributes judged to match the data format, among the regions used for comparison, is equal to or greater than a predetermined threshold.

In step S607, the CPU 111 selects, as the same-type document image, the registered document image with the highest similarity calculated in the matching process of step S204 of FIG. 2, and ends the process.
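The loop formed by steps S601 to S607 can be summarized in the following sketch. It simplifies step S606 to outright elimination of candidates whose proposed attribute does not fit the OCR text, instead of the threshold on the proportion of matching attributes mentioned above. The callables `read_text`, `fits`, and `match_score` are hypothetical stand-ins for OCR, the data format check, and the step S204 similarity.

```python
from typing import Callable, Dict, List, Optional


def select_same_type(candidates: Dict[str, Dict[int, str]],
                     ordered_blocks: List[int],
                     read_text: Callable[[int], str],
                     fits: Callable[[Optional[str], str], bool],
                     match_score: Callable[[str], float]) -> str:
    """Narrow the candidate registered documents down to one.

    candidates maps a registered document ID to {query block ID: proposed attribute}.
    """
    remaining = set(candidates)
    for block_id in ordered_blocks:        # S602: next block in determination order
        if len(remaining) == 1:            # S601: already narrowed down to one
            break
        text = read_text(block_id)         # S604: OCR on the selected block
        keep = {doc for doc in remaining   # S605/S606: keep documents whose attribute fits
                if fits(candidates[doc].get(block_id), text)}
        if keep:
            remaining = keep
    if len(remaining) == 1:
        return next(iter(remaining))
    # S607: no block could decide, so fall back to the matching similarity.
    return max(remaining, key=match_score)
```

Because OCR is applied only to the blocks actually visited, the loop reads no more text than is needed to separate the candidates.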

<Document image registration process>
The document image registration process performed in step S211 of FIG. 2 will be described with reference to FIG. 8.

In step S801 of FIG. 8, the CPU 111 determines whether the similarity calculated in the matching process of step S204 for the same-type document image determined in step S208 is equal to or less than a predetermined value. If it is equal to or less than the predetermined value, the process proceeds to step S802; otherwise, the process proceeds to step S804.

In step S802, the CPU 111 determines whether the user made a correction in step S210. If a correction was made, the process proceeds to step S803; if not, additional registration in the database is unnecessary and the process ends.

In step S803, the CPU 111 determines whether the user's correction was caused by an error in estimating the value region. If so, the process proceeds to step S804. On the other hand, if the user merely corrected an OCR error or deleted a character string attached before or after the recommended value, additional registration in the database is judged unnecessary and the process ends. Examples of character strings attached before or after a value that may be deleted include "株式会社" (Co., Ltd.) attached to the company name (issuer) and "円" (yen) attached to the total (total) and subtotal (subTotal).

FIG. 9 shows an example in which a confirmation screen for the value estimation result is displayed on the display device 119. The confirmation screen 900 displays the scanned query document image 901 and an estimation result 902 showing the values written at the value region positions estimated from the same-type document image in step S209.

In step S804, the CPU 111 acquires the value of each item entered on the confirmation screen for the value estimation result in FIG. 9.

In step S805, the CPU 111 generates BS text blocks by performing block selection processing on the query document image. This block selection processing is the same as that performed in step S203 of FIG. 2.

In step S806, the CPU 111 acquires the position information (coordinates and size) of the BS text blocks containing the values acquired in step S804. The value region position can be found by performing full-text OCR on the query document image and searching for the BS text block whose character string matches. Alternatively, it can be acquired by having the user select the region containing the correct value on the confirmation screen shown in FIG. 9.
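The string search over full-text OCR results might look like the sketch below. It assumes the OCR output is available as a mapping from block ID to text and bounding box, and it uses substring containment (ignoring spaces) rather than exact equality, on the assumption that a confirmed value may lack an affix such as "円" that is printed in the block. The name `find_value_block` is hypothetical.

```python
from typing import Dict, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), an assumed representation


def find_value_block(value: str,
                     ocr_results: Dict[int, Tuple[str, Box]]) -> Optional[Tuple[int, Box]]:
    """Return the ID and position of the first block whose OCR text contains the value."""
    needle = value.replace(" ", "")
    for block_id, (text, box) in ocr_results.items():
        if needle in text.replace(" ", ""):
            return block_id, box
    return None  # no match: fall back to asking the user to select the region
```

Returning `None` leaves room for the alternative described above, in which the user selects the region on the confirmation screen.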

In step S807, the CPU 111 registers the document image information in the database stored in the storage unit 114, based on the results obtained in steps S804 and S806, and ends the process. The registered content consists of the position information of the BS text blocks, the value attributes, and the position information of the value regions to which the value attributes are assigned. The registered content is not limited to these; other necessary information, such as the query image data, the OCR results of the values, the scan settings, and operator information, may also be registered.

With the above processing flow, even when multiple registered document images with a layout structure similar to that of the query document image are registered, the most similar registered document image can be identified as the similar document image. In addition, the value of each item can be estimated accurately as form information. Furthermore, because the comparison for determining the similar document image is limited to BS text blocks that are easy to compare, the similarity determination can be performed efficiently.

[Second Embodiment]
The present embodiment provides a mechanism for learning the appearance frequency of the data formats of value attributes, with the aim of enabling comparison even for BS text blocks that have the same value attribute, which could not be distinguished in the first embodiment. Processes that are the same as in the first embodiment are given the same reference numbers, and their detailed description is omitted.

<Document image registration process>
The document image registration process performed in step S211 of FIG. 2 in the present embodiment will be described with reference to FIGS. 10 and 11. FIG. 10 illustrates the processing flow of the document image registration process, and FIG. 11 shows the result of learning the appearance frequency of the data formats of the value attributes. FIG. 11 is the same as FIG. 7(a) except that counts tallied for each registered document image have been added.

In steps S801 to S807 of FIG. 10, the CPU 111 performs the same processing as in the first embodiment to register the query document image in the database stored in the storage unit 114.

In step S1001, the CPU 111 registers the appearance frequency of the data formats. The appearance frequency registration process tallies, for each registered document image, the data formats used for each value attribute in the query document images determined to be of the same type. Specifically, for each value attribute assigned to the query document image 500 in step S209 of FIG. 2, the CPU 111 determines which condition of the data format defined in FIG. 11 applies and tallies it in association with the registered document image. For example, ID 503, which has the issuer attribute, satisfies the condition of the specific character string "株式会社" (Co., Ltd.), so the appearance frequency is incremented by 1, giving a count of 29. ID 510 with the total attribute and ID 507 and ID 509 with the subTotal attribute satisfy the condition of the specific character string "円" (yen), so the appearance frequency is incremented by 3, giving a count of 27. ID 504, which has the issuerTel attribute, satisfies the condition of the format "##-####-####", so the appearance frequency is incremented by 1, giving a count of 11.
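The tally described in step S1001 can be kept in a small per-document counter, as in the sketch below. The class name `FormatFrequency` and the string labels used for conditions are hypothetical; how FIG. 11 actually stores the counts is not specified here.

```python
from collections import Counter, defaultdict


class FormatFrequency:
    """Per registered document, count how often each data format condition was observed."""

    def __init__(self) -> None:
        # {registered document ID: Counter keyed by (value attribute, condition)}
        self._counts = defaultdict(Counter)

    def record(self, doc_id: str, attribute: str, condition: str) -> None:
        """Increment the count for one observed (attribute, condition) pair."""
        self._counts[doc_id][(attribute, condition)] += 1

    def count(self, doc_id: str, attribute: str, condition: str) -> int:
        """Current count; zero if the pair has never been observed."""
        return self._counts[doc_id][(attribute, condition)]
```

For instance, calling `record("540", "issuer", "株式会社")` once per confirmed query document would reproduce the kind of increment described for ID 503.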

<Same-type document image extraction process>
The same-type document image extraction process performed in step S208 of FIG. 2 in the present embodiment will be described, focusing on the differences from the first embodiment, with reference to the detailed processing flow of FIG. 6(a) and to FIG. 12. FIG. 12(a) shows the query document image 500, and FIGS. 12(b) and 12(c) show the registered document image 540 and the registered document image 1200, which are the same-type document candidate images selected in step S205 of FIG. 2. FIG. 12(d) shows the value region position estimation result obtained in step S207. FIGS. 12(a) and 12(b) are the same document images as FIGS. 5(a) and 5(c) used in the description of the first embodiment.

In step S601 of FIG. 6(a), the CPU 111 determines whether the candidates for the same-type document have been narrowed down to one. If so, the process ends; if multiple candidates remain, step S602 and the subsequent steps are repeated.

In step S602, the CPU 111 determines the BS text block to be used for determining the same-type document image. The BS text block used for the determination is chosen from the value region positions estimated in step S207, based on the following conditions.
Condition 1: A region in which a plurality of different value attributes are assigned to one BS text block, and
Condition 1-1: the comparison priority is 1
Condition 1-2: the comparison priority is 2
Condition 1-3: the comparison priority is 3
Condition 2: A region in which one value attribute is assigned to one BS text block
Condition 3: A region in which a common value attribute is assigned to one BS text block and the appearance frequencies of the data formats differ significantly between the registered documents being compared
Condition 4: No region for determination

Condition 3, which is added in the present embodiment, will be described with reference to FIGS. 11 and 12. As shown in FIG. 12(d), ID501, ID502, ID503, and ID504 are selected as candidates satisfying condition 3, that is, as areas in which a common value attribute is assigned to one BS text block. Among these areas, those in which the data format conditions of the value attribute differ significantly between the registered document image 540 and the registered document image 1200 are detected. Whether a significant difference exists may be determined from the appearance frequencies shown in FIG. 11. For example, ID501, to which the title attribute is assigned, has no data format condition whose appearance frequency differs significantly, so it is judged unusable for the determination. Likewise, ID502, to which the issueNo attribute is assigned, is judged unusable. In contrast, for ID503, to which the issuer attribute is assigned, the specific character string "株式会社" (the full form of "Co., Ltd.") is frequently used in the registered document image 540, whereas the abbreviated form "（株）" is frequently used in the registered document image 1100. ID503 is therefore judged usable for the comparison under condition 3. Similarly, ID504, to which the issuerTel attribute is assigned, is judged comparable on the basis of whether the format "(##)-####-####" appears.
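The area selection under condition 3 can be illustrated with a minimal sketch. It assumes that the appearance frequencies of FIG. 11 are held as a table mapping each value attribute to data format conditions and their appearance rates, and that a "significant difference" means the rates differ by at least a fixed threshold. The table layout, the condition labels, the rates, the threshold `min_gap`, and the function name `usable_conditions` are all hypothetical and are not taken from the publication.

```python
from typing import Dict, List

# Hypothetical layout: attribute -> data format condition -> appearance rate (0..1).
# All rates are illustrative placeholders, not values from FIG. 11.
FreqTable = Dict[str, Dict[str, float]]

DOC_A: FreqTable = {
    "title": {"cond_title": 0.95},
    "issueNo": {"cond_issue_no": 0.90},
    "issuer": {"contains:株式会社": 0.90, "contains:（株）": 0.05},
    "issuerTel": {"format:(##)-####-####": 0.05},
}
DOC_B: FreqTable = {
    "title": {"cond_title": 0.93},
    "issueNo": {"cond_issue_no": 0.88},
    "issuer": {"contains:株式会社": 0.10, "contains:（株）": 0.85},
    "issuerTel": {"format:(##)-####-####": 0.90},
}


def usable_conditions(a: FreqTable, b: FreqTable, min_gap: float = 0.5) -> Dict[str, List[str]]:
    """Return, per shared attribute, the conditions whose rates differ by >= min_gap.

    Attributes with no such condition are omitted, which corresponds to
    "unusable for the determination" in the text. The fixed-threshold test is
    an assumption; the publication does not specify how significance is judged.
    """
    result: Dict[str, List[str]] = {}
    for attr in sorted(set(a) & set(b)):
        conditions = sorted(set(a[attr]) | set(b[attr]))
        hits = [
            c for c in conditions
            if abs(a[attr].get(c, 0.0) - b[attr].get(c, 0.0)) >= min_gap
        ]
        if hits:
            result[attr] = hits
    return result


if __name__ == "__main__":
    for attr, conds in usable_conditions(DOC_A, DOC_B).items():
        print(attr, conds)
```

With these placeholder rates, only the issuer and issuerTel attributes are returned, mirroring the outcome described for ID503 and ID504, while title and issueNo are dropped as they are for ID501 and ID502.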

In step S605, the CPU 111 compares data formats for the BS text blocks selected in step S602. Specifically, comparing the character string "〇〇株式会社" of ID503 against the data format of the issuer attribute indicates that the acquired document is likely to correspond to the registered document image 540, which frequently uses the specific character string "株式会社". Likewise, comparing the character string "03-1234-5678" of ID504 against the data format of the issuerTel attribute shows that the format "(##)-####-####" does not appear, so the acquired document is unlikely to correspond to the registered document image 1100.

In step S606, the CPU 111 determines the same-type document image based on the result of step S605. Here, the registered document image 540, which was judged in the comparison of data format appearance frequencies to be the likely same-type document image, is selected.
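Steps S605 and S606 can be sketched as a simple scoring procedure. The sketch assumes that each candidate registered document holds, per attribute, a data format condition together with its learned appearance rate, and that a candidate earns the rate when the OCR text satisfies the condition and one minus the rate otherwise. The scoring rule, the rates, and the names `score` and `select_same_type` are hypothetical; the publication states only that the candidate judged more likely is selected.

```python
import re
from typing import Callable, Dict, List, Tuple

# (attribute, predicate applied to the OCR text, learned appearance rate).
Condition = Tuple[str, Callable[[str], bool], float]


def has_full_company_form(text: str) -> bool:
    """True when the specific character string "株式会社" appears."""
    return "株式会社" in text


def has_paren_tel_format(text: str) -> bool:
    """True when the text matches the format (##)-####-####."""
    return re.fullmatch(r"\(\d{2}\)-\d{4}-\d{4}", text) is not None


# Illustrative rates only; keys are the registered document image numbers.
CANDIDATES: Dict[str, List[Condition]] = {
    "540": [("issuer", has_full_company_form, 0.9),
            ("issuerTel", has_paren_tel_format, 0.1)],
    "1100": [("issuer", has_full_company_form, 0.1),
             ("issuerTel", has_paren_tel_format, 0.9)],
}


def score(conditions: List[Condition], query: Dict[str, str]) -> float:
    """Sum, over the conditions, how well the query text agrees with the learned rate."""
    total = 0.0
    for attr, predicate, rate in conditions:
        text = query.get(attr)
        if text is None:
            continue  # the area was not selected for comparison
        total += rate if predicate(text) else 1.0 - rate
    return total


def select_same_type(candidates: Dict[str, List[Condition]], query: Dict[str, str]) -> str:
    """Return the candidate with the highest agreement score (step S606)."""
    return max(candidates, key=lambda doc_id: score(candidates[doc_id], query))


# OCR results for ID503 and ID504 as given in the description.
QUERY = {"issuer": "〇〇株式会社", "issuerTel": "03-1234-5678"}

if __name__ == "__main__":
    print(select_same_type(CANDIDATES, QUERY))
```

For the query strings given in the description, the issuer text contains "株式会社" and the telephone text lacks the parenthesized format, so both areas favor the registered document image 540 under these placeholder rates.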

The other processing is the same as in the first embodiment, and its description is omitted.

Through the above processing, learning the appearance frequency of data formats for each registered document image makes it possible to compare areas even when they carry the same value attribute, a case that could not be discriminated in the first embodiment. In addition, the accuracy of identifying the same-type document image improves, and as a result the estimation accuracy of the value improves.

[Other embodiments]
The present invention can also be realized by the following processing: software (a program) that implements the functions of the above-described embodiments is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of the system or apparatus reads and executes the program.

The program code for implementing the functions of the present embodiment may be executed by a single computer (CPU, MPU) or by a plurality of computers operating in cooperation. Furthermore, the program code may be executed by a computer, or hardware such as a circuit that implements the functions of the program code may be provided. Alternatively, part of the program code may be implemented in hardware and the remainder executed by a computer.

100 Image processing system
111 CPU
112 ROM
113 RAM
114 Storage unit
115 Image processing unit
116 User interface
117 Image reading unit
118 Image output unit
119 Display device
120 External interface

Claims

An image processing apparatus in which a plurality of document images are registered as registered document images, comprising:
area determination means for determining an area for comparison determination when a plurality of registered document images similar to an acquired document image are selected;
comparison means for performing character recognition on the area of the acquired document image determined by the area determination means, determining an attribute of the area, and comparing the attribute with the attribute of the area of the registered document image corresponding to the area of the acquired document image; and
determination means for determining a registered document image having the same attribute as a similar document image.

The image processing apparatus according to claim 1, wherein the area determination means determines, as the area for comparison determination, an area in which a plurality of different attributes are assigned to one area.

The image processing apparatus according to claim 1, wherein the comparison means refers to a correspondence table in which a data format is defined for each attribute, and determines the attribute from the data format of the area of the acquired document image on which character recognition has been performed.

The image processing apparatus according to claim 3, wherein, when data formats are compared between different attributes, the area determination means determines the area for comparison determination based on a priority assigned in descending order of the difference between the data formats in the correspondence table.

3. The image processing apparatus according to.

The image processing apparatus according to claim 5, wherein the data format includes at least one of specific character string information, a format, and a character type appearing in the extracted item.

The image processing apparatus according to any one of claims 1 to 6, further comprising acquisition means for acquiring an item value described in an area estimated from the similar document image determined by the determination means.

The image processing apparatus according to claim 7, further comprising registration means for registering the acquired document image as a registered document image when the area of the item value acquired by the acquisition means is incorrect.

The image processing apparatus according to claim 8, wherein the registration means registers the appearance frequency of the data format used in the acquired document image for each attribute associated with the similar document image.

The image processing apparatus according to claim 9, wherein, when the same attribute is assigned to one area, the area determination means determines the area for comparison determination based on a significant difference in the appearance frequency of the data format.

An image processing method in an image processing apparatus in which a plurality of document images are registered as registered document images, the method comprising:
an area determination step of determining an area for comparison determination when a plurality of registered document images similar to an acquired document image are selected;
a comparison step of performing character recognition on the area of the acquired document image determined in the area determination step, determining an attribute of the area, and comparing the attribute with the attribute of the area of the registered document image corresponding to the area of the acquired document image; and
a determination step of determining a registered document image having the same attribute as a similar document image.

A program for making a computer function as one means of the image processing apparatus according to any one of claims 1 to 10.