JP7034730B2

JP7034730B2 - Devices, methods, and programs for setting information related to scanned images

Info

Publication number: JP7034730B2
Application number: JP2018009017A
Authority: JP
Inventors: 義高松本
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2018-01-23
Filing date: 2018-01-23
Publication date: 2022-03-14
Anticipated expiration: 2038-01-23
Also published as: US10929657B2; US20190228220A1; JP2019128727A

Description

The present invention relates to a technique for setting information related to a scanned image obtained by scanning.

Conventionally, there is a technique in which character recognition processing (hereinafter referred to as OCR processing) is performed on image data obtained by scanning a paper document (hereinafter referred to as scanned image data), and the recognized characters are used as the file name of the electronic file of that paper document.

Patent Document 1 discloses a technique of performing OCR processing on a predetermined location in the scanned image data of a form and creating a file whose file name consists of the obtained characters.

Japanese Unexamined Patent Publication No. 62-051866

Forms, which are paper documents, include not only formats in which the positions of the entries and the sizes of the entry fields (entry areas) are predetermined, but also formats in which the positions of the entries are not predetermined: even within the same format, an entry field can expand, and the positions of the entries change according to its size. For example, some quotations, which are a type of form, contain a table whose entry area expands downward according to the kinds of products listed, together with entries whose positions change according to the size of the table's entry area. In Patent Document 1, the location (area) subjected to OCR processing is predetermined. If OCR processing is performed on a predetermined location in the scanned image data of such a quotation, in which the positions of the entries are not predetermined and change according to the size of the entry field, and a file is created using the obtained characters as its file name, the file name may contain unintended characters. That is, the area to be subjected to OCR processing may not be specified appropriately. This is not limited to forms: for any paper document in which, even within the same format, the positions of the entries are not predetermined and change according to the size of an entry field, the area to be subjected to OCR processing may not be specified appropriately.

The present invention has been made in view of the above problem, and its object is to appropriately specify the area to be subjected to OCR processing when a file name or the like is set using a character string obtained by performing OCR processing on a scanned image.

A device according to one aspect of the present invention is a device for setting information in scanned image data obtained by scanning a document including a table, the device comprising: extraction means for extracting, in new scanned image data, area information on each of the character string areas and table areas presumed to be character strings and tables; determination means for determining, by comparing the area information extracted from the new scanned image data by the extraction means with the area information extracted from each item of past scanned image data, the past scanned image data from which area information similar to the area information extracted from the new scanned image data was extracted; detection means for detecting a target area to be processed among the character string areas extracted from the new scanned image data, based on the distance between the character string area that was used when setting information for the past scanned image data so determined by the determination means and the table area extracted from that past scanned image data; recognition means for performing character recognition processing on the target area; and setting means for setting information in the new scanned image data using the characters obtained as a result of the character recognition processing.
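As a rough illustration of the detection means described above, the following Python sketch picks the character string area in a new scan whose vertical offset from the table best matches the offset recorded for the past scan. The `Region` type, the choice of the table's bottom edge as the reference point, and the nearest-offset rule are all assumptions made for illustration; the text here states only that the distance between the previously used character string area and the table area is used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Region:
    """Hypothetical rectangle: top-left corner plus size, in pixels."""
    x: int
    y: int
    width: int
    height: int

    @property
    def bottom(self) -> int:
        return self.y + self.height


def offset_from_table(text: Region, table: Region) -> int:
    """Signed vertical distance from the table's bottom edge to the text's top edge."""
    return text.y - table.bottom


def detect_target(past_text: Region, past_table: Region,
                  new_texts: List[Region], new_table: Region) -> Optional[Region]:
    """Return the new text region whose offset from the new table is closest to
    the offset seen in the past scan (an assumed matching rule).
    Ties are broken by horizontal closeness to the past text region."""
    if not new_texts:
        return None
    wanted = offset_from_table(past_text, past_table)
    return min(
        new_texts,
        key=lambda r: (abs(offset_from_table(r, new_table) - wanted),
                       abs(r.x - past_text.x)),
    )


# Past scan: the table ends at y=500 and the chosen text starts 40 px below it.
past_table = Region(50, 300, 700, 200)
past_text = Region(400, 540, 200, 30)

# New scan: the table is 150 px taller, so the same entry has moved down.
new_table = Region(50, 300, 700, 350)
candidates = [
    Region(400, 540, 200, 30),  # old absolute position, now inside the table
    Region(400, 690, 200, 30),  # 40 px below the new table bottom
]
target = detect_target(past_text, past_table, candidates, new_table)
```

Matching on the offset relative to the table, rather than on absolute coordinates, is what lets the sketch follow an entry that moves when the table grows.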

According to the present invention, the area to be subjected to OCR processing can be appropriately specified when a file name or the like is set using a character string obtained by performing OCR processing on a scanned image.

A diagram showing the overall configuration of the image processing system.
A hardware configuration diagram of the MFP.
A hardware configuration diagram of the file server.
A software configuration diagram of the MFP.
A flowchart showing the overall flow of control up to the saving of character string area information.
A diagram showing an example of the scan setting screen.
A flowchart showing the flow of image processing.
A diagram showing an example of a document to be scanned and a preview screen.
A flowchart showing the details of the file name generation process.
A diagram showing an example of the upload setting screen.
A flowchart for deriving the character string area used for the file name.
A diagram showing another example of a document to be scanned and a preview screen.
A flowchart for deriving the character string area used for the file name in Embodiment 2.
A flowchart showing the details of the distance derivation process.
A diagram showing another example of a document to be scanned and a preview screen.
A diagram showing an example of a file name list display.

Hereinafter, embodiments for carrying out the present invention will be described with reference to the drawings. The following embodiments do not limit the invention according to the claims, and not all combinations of the features described in the embodiments are necessarily essential to the solution provided by the invention.

[Embodiment 1]
<Configuration of image processing system>
FIG. 1 is a diagram showing the overall configuration of the image processing system according to the present embodiment. The image processing system is composed of an MFP 110 and a file server 120, which are communicably connected to each other via a LAN (Local Area Network).

The MFP (Multi Function Printer) 110 is a multifunction device having multiple functions such as a scanner and a printer, and is an example of an image processing device. The file server 120 is an example of an external server that stores and manages digitized document files. The image processing system of the present embodiment consists of the MFP 110 and the file server 120, but is not limited to this configuration. For example, the MFP 110 may also serve as the file server 120. The connection may also be made via the Internet or the like instead of a LAN. In addition, the MFP 110 is connected to a PSTN (Public Switched Telephone Network) and can perform facsimile communication of image data with a facsimile machine (not shown).

<Hardware configuration of MFP>
FIG. 2 is a hardware configuration diagram of the MFP 110. The MFP 110 is composed of a control unit 210, an operation unit 220, a printer unit 221, a scanner unit 222, and a modem 223. The control unit 210 is composed of the units 211 to 219 described below and controls the operation of the entire MFP 110. The CPU 211 reads the control program stored in the ROM 212 and executes and controls the various functions of the MFP 110, such as reading, printing, and communication. The RAM 213 is used as a temporary storage area, such as the main memory and work area of the CPU 211. In the present embodiment, one CPU 211 uses one memory (the RAM 213 or the HDD 214) to perform each process shown in the flowcharts described later, but other arrangements are possible. For example, multiple CPUs and multiple RAMs or HDDs may cooperate to perform each process. The HDD 214 is a large-capacity storage unit that stores image data and various programs. The operation unit I/F 215 is an interface that connects the operation unit 220 and the control unit 210. The operation unit 220 includes a liquid crystal display unit with a touch panel function, a keyboard, and the like, and serves as a reception unit that accepts operations, inputs, and instructions from the user. Such user operations may be accepted through touches on the liquid crystal panel or through the user's operation of the keyboard, buttons, and the like. The printer unit I/F 216 is an interface that connects the printer unit 221 and the control unit 210. Image data for printing is transferred from the control unit 210 to the printer unit 221 via the printer unit I/F 216 and printed on a recording medium. The scanner unit I/F 217 is an interface that connects the scanner unit 222 and the control unit 210. The scanner unit 222 reads a document set on a document table (not shown) or in an ADF (Auto Document Feeder) to generate image data, and inputs the image data to the control unit 210 via the scanner unit I/F 217. The MFP 110 can print out (copy) the image data generated by the scanner unit 222 from the printer unit 221, and can also send it as a file or by e-mail. The modem I/F 218 is an interface that connects the modem 223 and the control unit 210. The modem 223 performs facsimile communication of image data with a facsimile machine on the PSTN. The network I/F 219 is an interface that connects the control unit 210 (MFP 110) to the LAN. The MFP 110 uses the network I/F 219 to transmit image data and information to external devices on the LAN (such as the file server 120) and to receive various information.

<Hardware configuration of file server>
FIG. 3 is a hardware configuration diagram of the file server 120. The file server 120 is composed of a CPU 311, a ROM 312, a RAM 313, an HDD 314, and a network I/F 315. The CPU 311 controls the operation of the entire file server 120 by reading the control program stored in the ROM 312 and performing various processes. The RAM 313 is used as a temporary storage area, such as the main memory and work area of the CPU 311. The HDD 314 is a large-capacity storage unit that stores image data and various programs. The network I/F 315 is an interface that connects the file server 120 to the LAN. The file server 120 uses the network I/F 315 to send and receive various information to and from other devices on the LAN (for example, the MFP 110).

<Software configuration of MFP>
FIG. 4 is a diagram showing an example of the software configuration of the MFP 110. The MFP 110 has a native function module 410 and an additional function module 420. The units included in the native function module 410 are provided as standard in the MFP 110, whereas the units of the additional function module 420 are applications additionally installed on the MFP 110. The additional function module 420 is a Java (registered trademark) based application, which makes it easy to add functions to the MFP 110. Other additional function modules (additional applications), not shown, may also be installed on the MFP 110.

The native function module 410 has a scan execution unit 411 and an image data storage unit 412. The additional function module 420 has a scan instruction unit 421, a metadata generation unit 422, an image analysis unit 423, an upload execution unit 424, a file generation unit 425, a display control unit 426, and a form information holding unit 427.

The display control unit 426 displays, on the liquid crystal display unit with a touch panel function in the operation unit 220 of the MFP 110, a UI (user interface) screen for accepting operations, inputs, instructions, and the like from the user. The details of the UI screens displayed are described later.

In accordance with a user instruction input via the display control unit 426, the scan instruction unit 421 requests the scan execution unit 411 to perform scan processing, passing along the scan setting and transfer setting information included in that user instruction.

The scan execution unit 411 receives a scan request, including the scan settings, from the scan instruction unit 421. The scan execution unit 411 generates scanned image data by having the scanner unit 222 read the image on the document via the scanner unit I/F 217. The scan execution unit 411 sends the generated scanned image data to the image data storage unit 412. At this time, the scan execution unit 411 sends a scan image identifier that uniquely identifies the saved scanned image data to the scan instruction unit 421.

The image data storage unit 412 stores the scanned image data received from the scan execution unit 411 in the HDD 214.

The scan instruction unit 421 acquires, from the image data storage unit 412, the scanned image data corresponding to the scan image identifier received from the scan execution unit 411. The scan instruction unit 421 requests the metadata generation unit 422 to generate metadata for the acquired scanned image data. Metadata is information related to the scanned image data; one example is the file name given to the scanned image data. In the present embodiment, the case where the metadata is a file name is described as an example.
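The hand-off described above, in which the scanned image is stored and then retrieved by an identifier, can be sketched roughly as follows. The class and function names are hypothetical, an in-memory dictionary stands in for the HDD 214, and the use of a UUID as the scan image identifier is an assumption; the text only requires that the identifier uniquely identify the saved scanned image data.

```python
import uuid
from typing import Callable, Dict


class ImageDataStore:
    """Stand-in for the image data storage unit 412 (a dict instead of the HDD)."""

    def __init__(self) -> None:
        self._images: Dict[str, bytes] = {}

    def save(self, image: bytes) -> str:
        """Store the image and return an identifier that is unique to it."""
        scan_id = uuid.uuid4().hex  # assumed identifier format
        self._images[scan_id] = image
        return scan_id

    def load(self, scan_id: str) -> bytes:
        return self._images[scan_id]


def execute_scan(store: ImageDataStore, read_document: Callable[[], bytes]) -> str:
    """Role of the scan execution unit 411: scan, store, report the identifier."""
    image = read_document()
    return store.save(image)


# Role of the scan instruction unit 421: receive the identifier, then fetch the image.
store = ImageDataStore()
scan_id = execute_scan(store, lambda: b"fake-scan-bytes")
image = store.load(scan_id)
```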

The metadata generation unit 422 sends an instruction to analyze the scanned image data to the image analysis unit 423. Based on the analysis instruction from the metadata generation unit 422, the image analysis unit 423 performs image analysis (layout analysis processing and OCR processing (character string recognition processing)) on the scanned image data. The image analysis unit 423 sends the area information obtained by analyzing the scanned image data to the metadata generation unit 422 as the analysis result. Table 1 shows an example of area information. For each area contained in the scanned image data, such as a character string area or a table area, the area information includes a number identifying the area, the X coordinate, Y coordinate, width, and height of the area, and information indicating the type of the area. Here, a character string area is an area in which text was detected by image analysis, and a table area is an area in which a table was detected by image analysis. Methods for detecting text and tables in image data are widely known, so their description is omitted. For simplicity of explanation, Table 1 lists only some of the areas in the scanned image data.
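One way to picture the area information described above is as a list of small records, one per detected area. The following Python sketch assumes field names and sample values of its own (they are not taken from Table 1); only the set of fields (number, X, Y, width, height, type) comes from the description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class AreaType(Enum):
    TEXT = "text"    # character string area
    TABLE = "table"  # table area


@dataclass(frozen=True)
class AreaInfo:
    number: int      # identifies the area within one scanned image
    x: int           # X coordinate of the area
    y: int           # Y coordinate of the area
    width: int
    height: int
    kind: AreaType


def areas_of(areas: List[AreaInfo], kind: AreaType) -> List[AreaInfo]:
    """Return only the areas of the requested type, in their original order."""
    return [a for a in areas if a.kind is kind]


# Hypothetical analysis result for one scanned page.
analysis_result = [
    AreaInfo(1, 55, 5, 20, 10, AreaType.TEXT),
    AreaInfo(2, 5, 20, 30, 5, AreaType.TEXT),
    AreaInfo(3, 5, 30, 100, 40, AreaType.TABLE),
]
tables = areas_of(analysis_result, AreaType.TABLE)
texts = areas_of(analysis_result, AreaType.TEXT)
```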

The image analysis unit 423 compares the area information obtained by the current image analysis with each item of area information obtained by previous (past) image analyses. The area information obtained by previous image analyses is held by the form information holding unit 427. To each item of area information held by the form information holding unit 427, selection information is added that identifies the character string area the user selected for the file name in the file name generation process of step S508, described later. Hereinafter, area information with selection information added is referred to as form information. When the image analysis unit 423 determines from the above comparison that the form information holding unit 427 holds area information similar to that obtained by the current image analysis, it further determines whether selection information has been added to that similar area information (hereinafter referred to as similar area information). When it determines that selection information has been added to the similar area information, the image analysis unit 423 includes, in the analysis result of the image analysis, the selection information and the character string obtained by performing OCR processing on the character string area indicated by the selection information (hereinafter referred to as the selected character area), and sends the result to the metadata generation unit 422. Details are described later. The analysis result of the image analysis is also sent to the display control unit 426 via the metadata generation unit 422.
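The decision flow in this paragraph (look for similar past area information, check for selection information, OCR only the selected area) can be sketched as follows. The similarity test and the OCR step are passed in as callables because this part of the description does not define them. Mapping the past selection onto the new scan by area index is a simplification made only for brevity, since the description defers the actual mapping to a later part. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height


@dataclass
class FormInfo:
    """Area information of a past scan plus optional selection information."""
    areas: List[Box]
    selected: Optional[List[int]] = None  # indexes of the areas used for the file name


def analyze(new_areas: Sequence[Box],
            past_forms: Sequence[FormInfo],
            is_similar: Callable[[Sequence[Box], Sequence[Box]], bool],
            ocr: Callable[[Box], str]) -> dict:
    """Build an analysis result; OCR runs only when a similar form has a selection."""
    result = {"areas": list(new_areas)}
    similar = next((f for f in past_forms if is_similar(new_areas, f.areas)), None)
    if similar is not None and similar.selected:
        result["selection"] = similar.selected
        result["strings"] = [ocr(new_areas[i])
                             for i in similar.selected if i < len(new_areas)]
    return result


# Toy stand-ins: "similar" means the same number of areas; OCR echoes the Y coordinate.
toy_similar = lambda a, b: len(a) == len(b)
fake_ocr = lambda box: f"text@{box[1]}"

new_scan = [(10, 10, 100, 20), (10, 300, 500, 200), (300, 540, 150, 20)]
past = [FormInfo(areas=[(10, 10, 100, 20), (10, 300, 500, 150), (300, 490, 150, 20)],
                 selected=[2])]

with_history = analyze(new_scan, past, toy_similar, fake_ocr)
first_time = analyze(new_scan, [], toy_similar, fake_ocr)
```

With no stored form information, as in a first run, the result carries only the area information and no OCR is performed.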

The metadata generation unit 422 also generates metadata (a file name in the present embodiment) based on the user instruction input via the UI screen and the analysis result of the image analysis unit 423. The metadata generation unit 422 sends the scan image identifier and the generated metadata to the upload execution unit 424. The metadata generation unit 422 then instructs the upload execution unit 424 to upload the scanned image data to the file server 120.

Furthermore, the metadata generation unit 422 sends a display instruction to the display control unit 426. Based on the display instruction from the metadata generation unit 422, the display control unit 426 displays a UI screen (for example, FIG. 8(b)) on the liquid crystal display unit with a touch panel function in the operation unit 220 of the MFP 110. This UI screen is a screen for accepting operations, inputs, and instructions for generating a file name. Based on the display instruction from the metadata generation unit 422, the display control unit 426 also displays a preview image of the scanned image data on the UI screen.

The upload execution unit 424 sends an instruction to display a UI screen to the display control unit 426. The display control unit 426 displays a UI screen (for example, FIG. 10) for accepting folder path setting and upload operations, inputs, and instructions from the user. The details of the UI screen displayed at this time are described later. The upload execution unit 424 also receives an upload instruction from the user and, in accordance with that instruction, instructs the file generation unit 425 to generate a file of the scanned image data indicated by the scan image identifier.

The file generation unit 425 acquires the scanned image data corresponding to the specified scan image identifier from the image data storage unit 412 and generates the file to be transmitted to the file server 120.

The upload execution unit 424 connects to the file server 120 using the folder path that was set and the file name generated by the metadata generation unit 422, and transmits the file generated by the file generation unit 425. When the upload is complete, the upload execution unit 424 notifies the display control unit 426 that the upload has completed. On receiving the notification from the upload execution unit 424, the display control unit 426 updates the displayed content. The upload execution unit 424 has an SMB (Server Message Block) client function. It uses SMB to perform file and folder operations on the file server 120, which has an SMB server function. Besides SMB, WebDAV (Distributed Authoring and Versioning protocol for the WWW) can be used. FTP (File Transfer Protocol), SMTP (Simple Mail Transfer Protocol), and the like can also be used. In addition, protocols not intended for file transmission, such as SOAP and REST (Representational State Transfer), can also be used.

<Flowchart of overall processing>
FIG. 5 is a flowchart showing the overall flow of control up to the saving of character string area information. This series of processes is realized by the CPU 211 of the MFP 110 executing the control program stored in the HDD 214. A detailed description follows.

Here, based on the flowchart of FIG. 5, a case is described in which the series of processes up to the saving of character string area information is performed on two similar documents. The first run covers the case where the form information holding unit 427 holds no form information (information on similar documents) at all and the series of processes is performed on one of the two documents.

The second run then covers the case where the form information holding unit 427 holds the form information of the document scanned in the first run, and the series of processes is performed on the other document, which is similar to the document scanned in the first run. The present embodiment describes the case where the document to be scanned contains only one table. The case where multiple tables exist is described in Embodiment 2 below.

First, the first run will be described.

In step S501, the scan instruction unit 421 instructs the display control unit 426 to display the scan setting screen. The display control unit 426 displays, on the operation unit 220, a scan setting screen for making various settings for scan processing.

FIG. 6 is a diagram showing an example of the scan setting screen 600. The scan setting screen 600 of FIG. 6 has five setting buttons 601 to 605. The [Color Setting] button 601 is for choosing color or monochrome when scanning a document. The [Resolution Setting] button 602 is for setting the resolution used when scanning a document. The [Double-Sided Scanning Setting] button 603 is used to scan both sides of a document. The [Mixed Document Setting] button 604 is used to scan documents of different sizes together. The [Image Format Setting] button 605 is used to specify the format in which the scanned image data is saved. When a setting is made with one of the setting buttons 601 to 605, the candidates (options) that can be set within the range supported by the MFP 110 are displayed, and the user selects the desired one from the displayed candidates. The setting buttons described above are only an example: not all of these setting items need to exist, and other setting items may exist. The user makes detailed settings for scan processing via such a scan setting screen 600. The [Cancel] button 620 is used to cancel the scan settings. The [Scan Start] button 621 is used to instruct the start of scan processing for the document set on the document table or the like.

In step S502, the scan instruction unit 421 determines whether the [Scan Start] button 621 or the [Cancel] button 620 has been pressed. When it determines that the [Scan Start] button 621 has been pressed, the scan instruction unit 421 causes the scan execution unit 411 to perform scan processing with the settings selected via the scan setting buttons 601 to 605. When it determines that the [Cancel] button 620 has been pressed, the process ends.

In step S503, the scan execution unit 411 issues a scan instruction to the scanner unit 222, and the document is scanned. The scanned image data generated by the scan is saved in the image data storage unit 412, and the corresponding scan image identifier is reported to the scan instruction unit 421.

ステップＳ５０４では、スキャン指示部４２１は、スキャン画像識別子に対応するスキャン画像データを画像データ保存部４１２から取得する。 In step S504, the scan instruction unit 421 acquires the scan image data corresponding to the scan image identifier from the image data storage unit 412.

ステップＳ５０５では、メタデータ生成部４２２は、画像データ保存部４１２から取得されたスキャン画像データの解析指示を画像解析部４２３に送る。画像解析部４２３は、スキャン画像データの解析を行う。例えば、スキャン画像のヒストグラムを抽出したり、画素の塊を抽出したりするなどして、文字列領域や表領域など、スキャン画像中におけるレイアウトを解析する。この解析によって、スキャン画像全体における文字列領域が抽出される。文字列領域は、文字列と推認される領域（画像領域）である。表領域は、表と推認される領域（画像領域）である。文字列領域は、一文字の領域も含むものである。なお、レイアウト解析処理にはレイアウトしやすいようにスキャン画像の傾きを補正したり、方向を検知して回転したりする処理を含むようにしてもよい。その後、文字列領域に対して文字認識処理（ＯＣＲ：Optical Character Recognition）処理）を行うことで、文字列領域（画像領域）に含まれている文字（テキストデータ）が抽出される。文字認識処理は、例えば文字列領域に含まれている画素群と、予め登録されている辞書とをマッチング処理することで、文字（テキストデータ）を認識する処理である。この文字認識処理は、処理に時間を要する場合がある。このため、本実施形態においては、レイアウト解析によって抽出された文字列領域に逐次的に文字認識処理を行わずに、ユーザが所望する文字列領域に対して文字認識処理を行うことで、処理の高速化を図っている。画像解析部４２３によって解析された文字列領域の情報（以下、文字列領域情報という）は、メタデータ生成部４２２に渡される。 In step S505, the metadata generation unit 422 sends an instruction to analyze the scanned image data acquired from the image data storage unit 412 to the image analysis unit 423. The image analysis unit 423 analyzes the scanned image data. For example, it analyzes the layout of the scanned image, such as character string areas and table areas, by extracting a histogram of the scanned image or extracting clusters of pixels. Through this analysis, the character string areas in the entire scanned image are extracted. A character string area is an area (image area) presumed to be a character string. A table area is an area (image area) presumed to be a table. A character string area may consist of a single character. The layout analysis process may include correcting the skew of the scanned image, or detecting its orientation and rotating it, so that the layout is easier to analyze. After that, character recognition processing (OCR: Optical Character Recognition) is performed on a character string area to extract the characters (text data) contained in that character string area (image area). 
The character recognition process recognizes characters (text data) by, for example, matching the pixel group contained in a character string area against a pre-registered dictionary. This character recognition process may take time. Therefore, in the present embodiment, character recognition is not performed on each of the character string areas extracted by layout analysis one after another; instead, it is performed only on the character string areas the user wants, which speeds up processing. The information on the character string areas analyzed by the image analysis unit 423 (hereinafter referred to as character string area information) is passed to the metadata generation unit 422.

ここで、ステップＳ５０５の画像解析処理の詳細について、図７を用いて説明する。図７は、画像解析部４２３による画像解析処理（ステップＳ５０５）の詳細を示すフローチャートである。以下、図７のフローに沿って説明する。 Here, the details of the image analysis process in step S505 will be described with reference to FIG. 7. FIG. 7 is a flowchart showing the details of the image analysis process (step S505) by the image analysis unit 423. The description below follows the flow of FIG. 7.

ステップＳ７０１では、画像解析部４２３は、ファイル生成部４２５から受け取ったスキャン画像データを解析できる形態にして読み込む。 In step S701, the image analysis unit 423 reads the scanned image data received from the file generation unit 425 in a form that can be analyzed.

ステップＳ７０２では、画像解析部４２３は、読み込んだスキャン画像データをその後の領域判定や文字列解析を行い易い状態に補正する。具体的には、スキャン画像データに対し、画像信号の二値化やスキャン時にずれた原稿の傾きの修正、原稿が正立する方向への回転などを行って、解析処理を行い易い状態に補正する。 In step S702, the image analysis unit 423 corrects the read scanned image data so that the subsequent area determination and character string analysis are easier to perform. Specifically, it binarizes the image signal, corrects any skew of the document introduced during scanning, rotates the image so that the document is upright, and so on, to bring the scanned image data into a state that is easy to analyze.

ステップＳ７０３では、画像解析部４２３は、ステップＳ７０２で補正したスキャン画像データの内容を解析して、文字列領域および表領域を判定する。例えば、画像解析部４２３は、補正されたスキャン画像（二値画像）に対しエッジ抽出などを行って、当該画像内の文字列領域および表領域を特定する。すなわち、一続きの文字列と推認される塊（単位領域）と表と推認される塊（単位領域）を特定する。文字列領域および表領域に関し、座標、幅方向（横方向）および高さ方向（縦方向）の大きさがそれぞれ特定される。文字列領域の幅方向（横方向）とは、文書の文章方向に沿う方向を示している。文字列領域の高さ方向（縦方向）とは、文書の文章方向に沿う方向と交わる方向、例えば直交する方向を示している。また、原稿にて文章方向が横書きであるか縦書きであるかを特定する。これはスキャン画像（二値画像）に対し縦と横の射影をとって、この射影の分散の低いほうを行方向と判定する方法があり、これを用いることができる。以下の表２は、ある見積書のスキャン画像の一部に対して画像解析処理を行った結果の一例を示している。 In step S703, the image analysis unit 423 analyzes the content of the scanned image data corrected in step S702 to determine the character string areas and table areas. For example, the image analysis unit 423 performs edge extraction or the like on the corrected scanned image (binary image) to identify the character string areas and table areas in the image. That is, it identifies each block (unit area) presumed to be a continuous character string and each block (unit area) presumed to be a table. For each character string area and table area, the coordinates and the sizes in the width direction (horizontal direction) and the height direction (vertical direction) are identified. The width direction (horizontal direction) of a character string area is the direction along the text direction of the document. The height direction (vertical direction) of a character string area is a direction that intersects the text direction of the document, for example the direction orthogonal to it. It is also determined whether the text in the document is written horizontally or vertically. One method that can be used for this is to take the vertical and horizontal projections of the scanned image (binary image) and determine the direction whose projection has the lower variance to be the row direction. Table 2 below shows an example of the result of performing the image analysis process on part of the scanned image of a quotation.

ステップＳ７０４では、画像解析部４２３は、ステップＳ７０３で判定した文字列領域および表領域を元に、帳票情報保持部４２７から類似する帳票情報を取得して、事前にファイル名を生成する。 In step S704, the image analysis unit 423 acquires similar form information from the form information holding unit 427 based on the character string area and the table area determined in step S703, and generates a file name in advance.

ここで、ステップＳ７０４のファイル名リスト生成処理の詳細について、図１１を用いて説明する。図１１は、画像解析部４２３によるファイル名リスト生成処理（ステップＳ７０４）の詳細を示すフローチャートである。 Here, the details of the file name list generation process in step S704 will be described with reference to FIG. 11. FIG. 11 is a flowchart showing the details of the file name list generation process (step S704) by the image analysis unit 423.

ステップＳ１１０１では、画像解析部４２３は、ステップＳ７０３で取得した文字列領域が帳票情報保持部４２７に保持される帳票情報と類似するかを判定する。スキャン画像データの文字列領域が帳票情報保持部４２７に保持される帳票情報と所定の割合以上で重複している場合には、帳票情報保持部４２７に保持される帳票情報と類似しており、類似する帳票情報ありと判定し、ステップＳ１１０２に進む。他方、スキャン画像データの文字列領域が帳票情報保持部４２７に保持される帳票情報と所定の割合未満しか重複していない場合には、類似する帳票情報なしと判定し、本フローは終了する。なお、ステップＳ１１０１では、帳票情報保持部４２７に保持されている全ての帳票情報に対して類似判定が行われる。類似判定の判定基準である所定の割合は、ユーザにより設定変更可能な数値である。類似判定に関し領域に応じて重み付けをすることも可能である。実施１回目では、帳票情報保持部４２７が帳票情報を１つも保持していないため、ファイル名リスト生成処理が終了となる。ステップＳ１１０２以降の処理に関しては、実施２回目で説明する。 In step S1101, the image analysis unit 423 determines whether the character string area acquired in step S703 is similar to the form information held in the form information holding unit 427. When the character string area of the scanned image data overlaps with the form information held in the form information holding unit 427 at a predetermined ratio or more, it is similar to the form information held in the form information holding unit 427. It is determined that there is similar form information, and the process proceeds to step S1102. On the other hand, if the character string area of the scanned image data overlaps less than a predetermined ratio with the form information held in the form information holding unit 427, it is determined that there is no similar form information, and this flow ends. In step S1101, a similarity determination is made for all the form information held in the form information holding unit 427. The predetermined ratio, which is the criterion for determining the similarity, is a numerical value that can be set and changed by the user. It is also possible to weight the similarity determination according to the region. In the first implementation, since the form information holding unit 427 does not hold any form information, the file name list generation process is completed. The processing after step S1102 will be described in the second implementation.
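The similarity determination of step S1101 can be pictured with a minimal Python sketch. The text only says that the regions must overlap "at a predetermined ratio or more", so the area-based metric, the `Region` type, the function names, and the default threshold of 0.8 below are all assumptions made for illustration, not the method defined by the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A character string area: upper-left corner plus size, in pixels (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int


def overlap_area(a: Region, b: Region) -> int:
    """Area of the intersection of two rectangles (0 if they do not intersect)."""
    w = min(a.x + a.width, b.x + b.width) - max(a.x, b.x)
    h = min(a.y + a.height, b.y + b.height) - max(a.y, b.y)
    return max(w, 0) * max(h, 0)


def overlap_ratio(scanned: list[Region], stored: list[Region]) -> float:
    """Assumed metric: share of the scanned regions' area covered by a stored region."""
    total = sum(r.width * r.height for r in scanned)
    if total == 0:
        return 0.0
    covered = sum(
        max((overlap_area(r, s) for s in stored), default=0) for r in scanned
    )
    return covered / total


def find_similar_forms(scanned, stored_forms, threshold=0.8):
    """Compare against every stored form and return the names that reach the threshold."""
    return [
        name
        for name, regions in stored_forms.items()
        if overlap_ratio(scanned, regions) >= threshold
    ]
```

With an empty `stored_forms` dictionary, as in the first run described above, the function returns an empty list and the file name list generation would simply end.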

表２は、レイアウト解析処理によって解析された文字列領域情報の一例を示している。 Table 2 shows an example of the character string area information analyzed by the layout analysis process.

上記の表２において、［番号］は、特定された各文字列領域を一意に示す番号である。この例では１から１３までの通し番号が、認識した順番に付けられている。座標は詳細につき後述するプレビュー表示領域８１０の左上を原点（０，０）として右方向にＸ軸、下方向にＹ軸をとるものとする。［領域］の［Ｘ座標］は、特定された各文字列領域の左上隅のＸ座標を示している。［領域］の［Ｙ座標］は、特定された各文字列領域の左上隅のＹ座標を示している。以後、文字列領域に対して“座標”と言う場合は、特に断らない限り、文字列領域の左上隅の位置座標のことを意味するものとする。［領域］の［幅］は、特定された各文字列領域の左辺から右辺までの距離を示している。［領域］の［高さ］は、特定された各文字列領域の上辺から下辺までの距離を示している。本実施形態では、［Ｘ座標］、［Ｙ座標］、［幅］および［高さ］はいずれもピクセルで示すが、ポイントやインチ等で示してもよい。 In Table 2 above, [number] is a number uniquely indicating each specified character string area. In this example, serial numbers from 1 to 13 are assigned in the order of recognition. It is assumed that the coordinates have the X-axis in the right direction and the Y-axis in the downward direction with the upper left of the preview display area 810 described later as the origin (0,0). [X coordinate] of [Area] indicates the X coordinate of the upper left corner of each specified character string area. [Y coordinate] of [Area] indicates the Y coordinate of the upper left corner of each specified character string area. Hereinafter, when the term "coordinates" is used with respect to the character string area, it means the position coordinates of the upper left corner of the character string area unless otherwise specified. [Width] of [Area] indicates the distance from the left side to the right side of each specified character string area. [Height] of [Area] indicates the distance from the upper side to the lower side of each specified character string area. In the present embodiment, the [X coordinate], [Y coordinate], [width], and [height] are all indicated by pixels, but may be indicated by points, inches, or the like.

図５のフローチャートに戻る。 Returning to the flowchart of FIG. 5.

ステップＳ５０６では、メタデータ生成部４２２は、画像解析部４２３で解析されてスキャン画像から抽出された各文字列領域情報（画像解析データ）を取得する。文字列領域情報は、例えばＣＳＶやＸＭＬのフォーマットで取得されるものとするが、他のフォーマットで取得されるものであっても構わない。また、ＨＤＤ２１４に一旦保存した上で、所定のタイミングで取得されるものでもよい。 In step S506, the metadata generation unit 422 acquires each character string area information (image analysis data) analyzed by the image analysis unit 423 and extracted from the scanned image. The character string area information is, for example, acquired in a CSV or XML format, but may be acquired in another format. Further, it may be acquired at a predetermined timing after being temporarily stored in the HDD 214.
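As one illustration of the CSV case mentioned above, the following Python sketch reads character string area information into a list of dictionaries. The column names (`number`, `x`, `y`, `width`, `height`) mirror the fields described for Table 2 but are hypothetical; the text does not specify the actual file layout.

```python
import csv
import io


def parse_region_csv(text: str) -> list[dict[str, int]]:
    """Parse CSV text with an assumed header row into one dict per character string area."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: int(value) for key, value in row.items()} for row in reader]


# Hypothetical sample data; the values are illustrative, not taken from Table 2.
sample = "number,x,y,width,height\n1,55,5,20,10\n2,5,20,30,5\n"
regions = parse_region_csv(sample)
```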

ステップＳ５０７では、メタデータ生成部４２２は、表示制御部４２６にプレビュー画面の表示を指示する。表示制御部４２６は、スキャン指示部４２１から受け取ったスキャン画像データを用いて操作部２２０のタッチパネル上にプレビュー画面を表示する。ユーザは、プレビュー画面を介して、スキャン画像データのファイル名を設定することができる。 In step S507, the metadata generation unit 422 instructs the display control unit 426 to display the preview screen. The display control unit 426 displays a preview screen on the touch panel of the operation unit 220 using the scan image data received from the scan instruction unit 421. The user can set the file name of the scanned image data via the preview screen.

図８（ａ）は、スキャン処理対象の文書の一例を示す図である。図８（ｂ）は、図８（ａ）に示す文書（原稿）に対しスキャン処理を行った場合のプレビュー画面の一例を示す図であり、図８（ｃ）は、後述のプレビュー表示領域８１０に表示されるスキャン画像を下方へスクロールした場合の一例を示す図である。ユーザは、プレビュー画面８００を介してアップロード実行部４２４に実行させる、ファイルサーバ１２０に送信するためのファイル名設定を複数のボタン８０２～８０３を介して実行する。［ファイル名リスト表示］ボタン８０２は、ファイル名入力欄８０１に設定するファイル名選択リストを表示する。ファイル名選択リストは、実施２回目以降の処理で表示される。今回の画像解析で得られた領域情報と類似する領域情報が帳票情報保持部４２７に保持されていると判断し、さらに、類似する領域情報に選択情報が付加されている場合に、ファイル名選択リストが生成される。ファイル名選択リストは、類似する領域情報の選択情報を基に、今回の画像解析で得られた領域情報から抽出されたファイル名で構成される。すなわち、［ファイル名リスト表示］ボタン８０２が押下されると、ユーザにより選択可能な候補と成り得る、全てのファイル名（以下、候補ファイル名と呼ぶ）で構成されるファイル名選択リストが表示される。表３は、ファイル名選択リストの一例を示している。この例は、表２に示される領域情報と類似する領域情報が帳票情報保持部４２７に保持され、類似する領域情報に選択情報が付加されている場合を示している。帳票情報保持部４２７には、表２の番号１（見積書）、表２の番号３（Ｒ１２－３４５６）、表２の番号１３（川崎株式会社）に対応する文字列領域の文字列で構成されるファイル名のスキャン画像データが保持されている。さらに、帳票情報保持部４２７には、表２の番号１（見積書）、表２の番号３（Ｒ１２－３４５６）、表２の番号８（品川株式会社）に対応する文字列領域の文字列で構成されるファイル名のスキャン画像データが保持されている。このような状態で、表２に示される文字列領域情報を含むスキャン画像データに対しファイル名リスト生成処理が行われる。これにより、「見積書＿Ｒ１２－３４５６＿川崎株式会社」の候補ファイル名と、「見積書＿Ｒ１２－３４５６＿品川株式会社」の候補ファイル名とで構成されるファイル名選択リストが生成される。 FIG. 8A is a diagram showing an example of a document to be scanned. FIG. 8B is a diagram showing an example of the preview screen when the document (original) shown in FIG. 8A is scanned, and FIG. 8C is a diagram showing an example in which the scanned image displayed in the preview display area 810, described later, is scrolled downward. Via the preview screen 800, the user uses the buttons 802 and 803 to set the file name under which the upload execution unit 424 transmits the file to the file server 120. The [File name list display] button 802 displays a file name selection list from which the file name to be set in the file name input field 801 is chosen. The file name selection list is displayed in the second and subsequent processes. 
The file name selection list is generated when it is determined that area information similar to the area information obtained in the current image analysis is held in the form information holding unit 427 and selection information has been added to that similar area information. The file name selection list is composed of file names extracted from the area information obtained in the current image analysis, based on the selection information of the similar area information. That is, when the [File name list display] button 802 is pressed, a file name selection list consisting of all the file names that can be candidates selectable by the user (hereinafter referred to as candidate file names) is displayed. Table 3 shows an example of a file name selection list. This example shows a case where area information similar to the area information shown in Table 2 is held in the form information holding unit 427 and selection information has been added to the similar area information. The form information holding unit 427 holds scanned image data whose file name is composed of the character strings in the character string areas corresponding to number 1 (Quotation), number 3 (R12-3456), and number 13 (Kawasaki Co., Ltd.) in Table 2. The form information holding unit 427 also holds scanned image data whose file name is composed of the character strings in the character string areas corresponding to number 1 (Quotation), number 3 (R12-3456), and number 8 (Shinagawa Co., Ltd.) in Table 2. In this state, the file name list generation process is performed on scanned image data containing the character string area information shown in Table 2. As a result, a file name selection list composed of the candidate file name "Quotation_R12-3456_Kawasaki Co., Ltd." and the candidate file name "Quotation_R12-3456_Shinagawa Co., Ltd." is generated.
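The way stored selection information yields candidate file names can be sketched in Python as follows. This is an illustration under assumptions: the recognized text of each area is taken to be already available in a dictionary keyed by area number, the selection information is modelled as an ordered list of area numbers, and the function name is hypothetical.

```python
def build_candidate_file_names(region_texts, stored_selections, delimiter="_"):
    """Apply each stored selection (ordered area numbers) to the current scan's areas."""
    candidates = []
    for numbers in stored_selections:
        # Skip a selection if the current scan lacks one of the areas it refers to.
        if not all(n in region_texts for n in numbers):
            continue
        name = delimiter.join(region_texts[n] for n in numbers)
        if name not in candidates:  # avoid listing the same candidate twice
            candidates.append(name)
    return candidates


# Area numbers and strings follow the example in the text (Table 2, numbers 1, 3, 8, 13).
region_texts = {
    1: "Quotation",
    3: "R12-3456",
    8: "Shinagawa Co., Ltd.",
    13: "Kawasaki Co., Ltd.",
}
stored_selections = [[1, 3, 13], [1, 3, 8]]
candidates = build_candidate_file_names(region_texts, stored_selections)
```

Here `candidates` holds the two candidate file names given in the example above, in the order of the stored selections.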

候補ファイル名は、ファイル名を構成する項目と、項目毎の区切り文字とを１つ以上組み合わせたフォーマットで構成される。ファイル名を構成する項目は、後述のＯＣＲ処理内容とも関連する。 The candidate file name is composed of a format in which one or more items constituting the file name and a delimiter for each item are combined. The items constituting the file name are also related to the OCR processing contents described later.

ボタン８０３は、ファイル名のフォーマットなどを設定するためのボタンである。なお、上述した各種ボタンの種類、各文字列領域の表示や選択の態様は一例にすぎず、これに限定されない。例えば、ファイル名入力欄８０１に表示された文字列を修正・変更したり、ファイル名を確定したりするためのボタンがあってもよい。 The button 803 is a button for setting the format of the file name and the like. It should be noted that the above-mentioned types of various buttons and modes of display and selection of each character string area are merely examples, and the present invention is not limited to these. For example, there may be a button for modifying / changing the character string displayed in the file name input field 801 or for confirming the file name.

プレビュー画面８００において、画面中央にあるプレビュー表示領域８１０内には、スキャン画像と共にその表示状態を変更するための複数のボタン８１１～８１４も表示される。ボタン８１１及び８１２はスキャン画像の全体を表示しきれないときに現れるボタンで、表示領域を縦方向にスクロールするためのボタンである。ＭＦＰ１１０が備えるタッチパネルは通常それほど大きくはない。そこで、例えば、スキャン画像がＡ４縦・横書きの原稿を読み取ったものである場合は、スキャン画像の幅方向（短手方向）全体がプレビュー表示領域８１０にちょうど収まるように上詰めで縮小表示されるよう初期設定される。つまり、初期設定においては、Ａ４縦のスキャン画像の下部は、プレビュー表示領域８１０内に表示されないことになる。このようなとき、「↓」のボタン８１２を押下すると下に表示領域がスクロールし、下部を表示させることができる。また、「↑」のボタン８１１を押下すると上に表示領域がスクロールし、上部を再び表示させることができる。 In the preview screen 800, the preview display area 810 at the center of the screen shows the scanned image together with a plurality of buttons 811 to 814 for changing its display state. Buttons 811 and 812 appear when the entire scanned image cannot be displayed and are used to scroll the display area vertically. The touch panel of the MFP 110 is usually not very large. Therefore, for example, when the scanned image is a scan of an A4 portrait document written horizontally, the initial setting reduces the image so that its entire width (short side) just fits in the preview display area 810, aligned to the top. That is, in the initial setting, the lower part of the A4 portrait scanned image is not displayed in the preview display area 810. In that case, pressing the "↓" button 812 scrolls the display area down so that the lower part can be displayed. Pressing the "↑" button 811 scrolls the display area up so that the upper part is displayed again.

さらに、スキャン画像が例えばＡ４横やＡ３などの場合には、表示領域を横方向にスクロールするためのボタンをさらに設ければよい。ボタン８１３及び８１４は、表示領域を拡大・縮小するためのボタンであり、「＋」のボタン８１３を押下するとズームインし、「－」のボタン８１４を押下するとズームアウトする。これらボタン操作による動作を、プレビュー画面上でスワイプやピンチアウト／ピンチインといったユーザの指による操作で実現してもよい。また、プレビュー表示領域８１０には、ステップＳ５０５の画像解析処理によって特定された文字列領域が、上述の文字列領域情報に基づき、ユーザに選択可能で識別可能な態様（例えば、囲み線）にて表示される。ユーザがユーザに識別可能な態様でプレビュー表示領域８１０に表示された文字列領域の中から任意の文字列領域を選択（例えば指でタッチ）する。この選択操作に伴い、そこに含まれる文字列がファイル名入力欄８０１に表示、すなわち自動入力され、ファイル名を構成する文字列の一部となる。［戻る］ボタン８３０は、プレビュー表示を中止する場合に用いるボタンである。［次へ］ボタン８３１は、読み込まれたスキャン画像データのアップロード先を設定する画面に移行するためのボタンである。 Further, when the scanned image is, for example, A4 landscape or A3, buttons for scrolling the display area horizontally may additionally be provided. Buttons 813 and 814 are buttons for enlarging and reducing the display; pressing the "+" button 813 zooms in, and pressing the "-" button 814 zooms out. The operations performed by these buttons may instead be realized by finger gestures on the preview screen such as swipe and pinch-out/pinch-in. In addition, in the preview display area 810, the character string areas identified by the image analysis process of step S505 are displayed, based on the character string area information described above, in a manner that the user can identify and select (for example, with an enclosing outline). The user selects (for example, touches with a finger) any character string area from among the character string areas displayed in the preview display area 810. With this selection operation, the character string contained in that area is displayed in the file name input field 801, that is, entered automatically, and becomes part of the character string constituting the file name. The [Back] button 830 is a button used to cancel the preview display. The [Next] button 831 is a button for moving to a screen for setting the upload destination of the scanned image data that has been read.

ファイル名文字列設定領域８１５乃至８２７は前記画像解析部４２３がスキャン画像データを解析した文字列領域情報に従って、プレビュー表示領域８１０に表示される。文字列領域情報は、表２に示すようにスキャン画像データ上の位置を示している。よって、文字列領域情報は、プレビュー表示領域８１０に表示しているスキャン画像データのスクロール位置や拡大縮小が反映された位置に表示される。この文字列領域がユーザによりタッチされると、ユーザによりタッチされた文字列領域にある文字列がファイル名入力欄８０１に入力される。斜線で示される領域は、文字列として認識された領域を示し、矩形の形状をなしている。網掛で示される領域は、ユーザにより既にタッチされて、ファイル名として選択された領域を示している。これら各ボタン８１１～８１４および各領域８１５～８２７を用いた設定項目はここに記載した設定項目が存在しなくても良いし、これら以外の設定項目が存在しても良い。 The file name character string setting areas 815 to 827 are displayed in the preview display area 810 according to the character string area information obtained by the image analysis unit 423 analyzing the scanned image data. The character string area information indicates the position on the scanned image data as shown in Table 2. Therefore, the character string area information is displayed at a position where the scroll position and enlargement / reduction of the scanned image data displayed in the preview display area 810 are reflected. When this character string area is touched by the user, the character string in the character string area touched by the user is input to the file name input field 801. The area shown by the diagonal line indicates the area recognized as a character string and has a rectangular shape. The shaded area indicates the area that has already been touched by the user and selected as the file name. The setting items using each of the buttons 811 to 814 and the areas 815 to 827 may not have the setting items described here, or may have other setting items.
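How a character string area given in scanned-image coordinates might be placed on the preview with scrolling and zooming taken into account can be sketched in Python as below. The function name, the single uniform `scale` factor, and the convention that scroll offsets are expressed in preview pixels are assumptions made for illustration; the text does not describe the actual computation.

```python
def to_preview_rect(region, scale, scroll_x, scroll_y):
    """Map an (x, y, width, height) rectangle from image pixels to preview pixels.

    scale    : assumed zoom factor (preview pixels per image pixel)
    scroll_* : assumed scroll offset of the preview, in preview pixels
    """
    x, y, width, height = region
    return (
        x * scale - scroll_x,
        y * scale - scroll_y,
        width * scale,
        height * scale,
    )


# An area at (100, 200) sized 50 x 20, shown at half size and scrolled down by 40 pixels.
rect = to_preview_rect((100, 200, 50, 20), scale=0.5, scroll_x=0, scroll_y=40)
```

A rectangle whose mapped position falls outside the preview display area would simply not be drawn until the user scrolls it back into view.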

ステップＳ５０８では、ファイル生成部４２５は、ユーザからの入力指示に基づいてスキャン画像に対するファイル名を生成する。 In step S508, the file generation unit 425 generates a file name for the scanned image based on an input instruction from the user.

ここで、ステップＳ５０８のファイル名生成処理の詳細について、図９を用いて説明する。図９は、ファイル名生成処理（ステップＳ５０８）の詳細を示すフローチャートである。以下、図９のフローに沿って説明する。 Here, the details of the file name generation process in step S508 will be described with reference to FIG. 9. FIG. 9 is a flowchart showing the details of the file name generation process (step S508). The description below follows the flow of FIG. 9.

ステップＳ９０１では、タッチパネル上に表示されたプレビュー画面８００へのユーザによるタッチ操作の有無が監視される。タッチ操作が検出されるとステップＳ９０２へ進む。続くステップＳ９０２では、タッチ操作の内容によって処理の切り分けがなされる。タッチ操作の内容が、ボタンが押下されたことを検知した場合には、ステップＳ９１１へ進む。ステップＳ９１１では、押下されたボタンの種類によって処理の切り分けがなされる。［ファイル名リスト表示］ボタン８０２以外のボタンが押下されたことを検知した場合には、本フローが終了し、［ファイル名リスト表示］ボタン８０２が押下されたことを検知した場合には、ステップＳ９１２へ進む。ステップＳ９１２以降の処理に関しては、実施２回目で説明する。 In step S901, the presence or absence of a touch operation by the user on the preview screen 800 displayed on the touch panel is monitored. When a touch operation is detected, the process proceeds to step S902. In the following step S902, the processing branches according to the content of the touch operation. If the touch operation is detected to be a button press, the process proceeds to step S911. In step S911, the processing branches according to the type of button pressed. If a button other than the [File name list display] button 802 is detected to have been pressed, this flow ends; if the [File name list display] button 802 is detected to have been pressed, the process proceeds to step S912. The processing from step S912 onward will be described in the second implementation.

他方、ボタン押下以外の操作がなされたことを検知した場合には、ステップＳ９０３へ進む。 On the other hand, if it is detected that an operation other than pressing the button has been performed, the process proceeds to step S903.

ステップＳ９０３では、タッチ操作がなされたタッチパネル上の位置座標（ｘ，ｙ）が取得される。続くステップＳ９０４では、タッチ操作された位置座標が、ユーザに選択可能で識別可能な態様にて表示された何れかの文字列領域と重なるかどうかが判定される。例えば、タッチ操作された位置座標が、ユーザに選択可能で識別可能な態様にてプレビュー表示領域８１０に表示された各文字列領域の内側（文字列領域の四隅を表す位置座標で特定される矩形の内側）に含まれるかどうかで判定する。タッチ操作された位置座標がユーザに選択可能で識別可能な態様にて表示された何れかの文字列領域と重なっている場合は、ステップＳ９０５へ進む。他方、重なっていない場合は、ステップＳ９０１に戻る。 In step S903, the position coordinates (x, y) on the touch panel at which the touch operation was performed are acquired. In the following step S904, it is determined whether the touched position coordinates overlap any of the character string areas displayed in a manner selectable and identifiable by the user. For example, the determination is made according to whether the touched position coordinates fall inside any of the character string areas displayed in the preview display area 810 in a manner selectable and identifiable by the user (that is, inside the rectangle specified by the position coordinates of the four corners of the character string area). If the touched position coordinates overlap any of those character string areas, the process proceeds to step S905. Otherwise, the process returns to step S901.
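The determination in step S904 amounts to a point-in-rectangle test. A minimal Python sketch follows; the function name, the dictionary layout, and the choice to treat the right and bottom edges as outside the rectangle are assumptions for illustration, and the rectangles are taken to be already expressed in touch panel coordinates.

```python
def find_touched_region(x, y, regions):
    """Return the number of the first area containing the point (x, y), or None.

    regions maps an area number to an (x, y, width, height) rectangle.
    """
    for number, (rx, ry, rw, rh) in regions.items():
        if rx <= x < rx + rw and ry <= y < ry + rh:
            return number
    return None


# Hypothetical rectangles; the values are illustrative only.
regions = {1: (55, 5, 20, 10), 3: (5, 20, 30, 5)}
hit = find_touched_region(60, 8, regions)   # inside area 1
miss = find_touched_region(0, 0, regions)   # outside every area
```

A result of `None` corresponds to returning to step S901, and any other result corresponds to proceeding to step S905 with that area as the selected character string area.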

ステップＳ９０５では、タッチ操作された位置座標と重なっている文字列領域の文字列が取得される。文字列の取得は、タッチ操作によって選択された文字列領域（以下、選択文字列領域と呼ぶ）に対するＯＣＲ処理を画像解析部４２３で行って、文字列を抽出することで取得される。そして、ステップＳ９０６では、ファイル名入力欄８０１に現在表示中のファイル名が取得される。続くステップＳ９０７では、取得したファイル名の中身が“空”であるかを判定する。ここで、ファイル名が“空”とは、ファイル名入力欄８０１内に何らの文字列も表示されていない空欄状態を意味する。取得したファイル名が“空”であった場合は、ステップＳ９０９に進む。他方、“空”でなかった場合は、ステップＳ９０８に進む。 In step S905, the character string of the character string area that overlaps the touched position coordinates is acquired. The character string is acquired by having the image analysis unit 423 perform OCR processing on the character string area selected by the touch operation (hereinafter referred to as the selected character string area) and extract the character string. Then, in step S906, the file name currently displayed in the file name input field 801 is acquired. In the following step S907, it is determined whether the acquired file name is "empty". Here, a file name being "empty" means a blank state in which no character string is displayed in the file name input field 801. If the acquired file name is "empty", the process proceeds to step S909. Otherwise, the process proceeds to step S908.

ステップＳ９０８では、ステップＳ９０６で取得したファイル名の末尾に、所定の区切り文字を追加する処理がなされる。所定の区切り文字として、ここでは、アンダーバーを例に説明するが、これに限定されるものではない。例えばハイフンなどアンダーバー以外の記号・文字でも構わないし、さらにはスペースのような実体を伴わないものでも構わない。 In step S908, a process of adding a predetermined delimiter to the end of the file name acquired in step S906 is performed. As a predetermined delimiter, an underscore will be described here as an example, but the present invention is not limited thereto. For example, it may be a symbol / character other than an underscore such as a hyphen, or it may be a symbol / character without an entity such as a space.

ステップＳ９０９では、ステップＳ９０５で取得した文字列（選択文字列領域から抽出した文字列）が、ファイル名の構成要素として設定される。この際、既に設定された文字列が存在している場合は、その末尾に追加される。そして、ステップＳ９１０では、現時点で設定されている文字列が、ファイル名入力欄８０１に表示（自動入力）される。ユーザがプレビュー画面に表示される文字列領域をタッチ操作している間は、上述のステップＳ９０１～Ｓ９１０の処理が繰り返し行われる。 In step S909, the character string (character string extracted from the selected character string area) acquired in step S905 is set as a component of the file name. At this time, if the already set character string exists, it is added to the end. Then, in step S910, the character string currently set is displayed (automatically input) in the file name input field 801. While the user is touching the character string area displayed on the preview screen, the above-mentioned processes of steps S901 to S910 are repeated.

以上が、ファイル名生成処理の内容である。このような処理によって、ユーザに選択された複数の文字列領域の間に区切り文字を挿入して、スキャン画像のファイル名が生成される。 The above is the content of the file name generation process. By such a process, a delimiter is inserted between a plurality of character string areas selected by the user, and a file name of the scanned image is generated.
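Steps S906 to S910 can be summarized in a short Python sketch. The function name is hypothetical, and the underscore default follows the delimiter used as an example in the text.

```python
def append_to_file_name(current, selected_text, delimiter="_"):
    """Add a selected character string to the file name.

    If the current file name is empty (S907), the selected string becomes the
    file name as is (S909); otherwise the delimiter is appended first (S908).
    """
    if current == "":
        return selected_text
    return current + delimiter + selected_text


# Simulate the user touching three character string areas in turn.
file_name = ""
for text in ["Quotation", "R12-3456", "Kawasaki Co., Ltd."]:
    file_name = append_to_file_name(file_name, text)
```

After the loop, `file_name` is "Quotation_R12-3456_Kawasaki Co., Ltd.", which matches the example given for the file name input field 801.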

図８（ｂ）は、ステップＳ５０８でファイル名が生成された後のプレビュー画面８００の状態を示している。この例では、「見積書」、「Ｒ１２―３４５６」、「川崎株式会社」に対応する文字列領域が順次選択されたことで、「見積書＿Ｒ１２―３４５６＿川崎株式会社」の文字列が、ファイル名入力欄８０１に表示（設定）されている。プレビュー表示領域８１０には、ユーザのタッチ操作によりファイル名に使用された文字列を示す領域８１５、８２１、８２７が矩形の形状で表示される。そして、所望するファイル名が生成されてユーザが［次へ］ボタン８３１を押下すると、ステップＳ９０１、Ｓ９０２、Ｓ９１１を経て、本フローを終了する。 FIG. 8B shows the state of the preview screen 800 after the file name is generated in step S508. In this example, the character string areas corresponding to "Quotation", "R12-3456", and "Kawasaki Co., Ltd." have been selected in turn, so the character string "Quotation_R12-3456_Kawasaki Co., Ltd." is displayed (set) in the file name input field 801. In the preview display area 810, the areas 815, 821, and 827 indicating the character strings used for the file name through the user's touch operations are displayed as rectangles. Then, when the desired file name has been generated and the user presses the [Next] button 831, this flow ends via steps S901, S902, and S911.

図５のフローチャートに戻る。 Returning to the flowchart of FIG. 5.

ステップＳ５０９では、メタデータ生成部４２２により、ユーザによるボタン操作の内容によって処理の切り分けがなされる。［次へ］ボタン８３１の押下が検出された場合は、ファイル名入力欄８０１に表示中のファイル名（ファイル名として設定された文字列）の情報がアップロード実行部４２４へ送られ、ステップＳ５１０へ進む。［戻る］ボタン８３０の押下が検出された場合は、ステップＳ５０１（スキャン設定画面の表示）へ戻る。［次へ］ボタン８３１および［戻る］ボタン８３０以外の操作が検出された場合には、ステップＳ５０８（ファイル名の生成）へ戻る。 In step S509, the metadata generation unit 422 branches the processing according to the content of the user's button operation. When a press of the [Next] button 831 is detected, the information on the file name displayed in the file name input field 801 (the character string set as the file name) is sent to the upload execution unit 424, and the process proceeds to step S510. When a press of the [Back] button 830 is detected, the process returns to step S501 (display of the scan setting screen). When an operation other than pressing the [Next] button 831 or the [Back] button 830 is detected, the process returns to step S508 (file name generation).

ステップＳ５１０では、メタデータ生成部４２２は、ファイル名入力欄８０１に設定されたファイル名を取得する。メタデータ生成部４２２は、取得したファイル名とスキャン画像識別子とをアップロード実行部４２４へ渡す。 In step S510, the metadata generation unit 422 acquires the file name set in the file name input field 801. The metadata generation unit 422 passes the acquired file name and the scanned image identifier to the upload execution unit 424.

ステップＳ５１１では、アップロード実行部４２４は、表示制御部４２６にスキャン画像データの送信先を設定するためのＵＩ画面（以下、アップロード設定画面と呼ぶ）の表示を指示する。表示制御部４２６は、データ送信処理における各種設定を行うためのアップロード設定画面を操作部２２０に表示する。ユーザは、このアップロード設定画面を介して、ファイルサーバ１２０へのアップロードに関する詳細設定を行う。 In step S511, the upload execution unit 424 instructs the display control unit 426 to display a UI screen (hereinafter, referred to as an upload setting screen) for setting a transmission destination of the scanned image data. The display control unit 426 displays an upload setting screen for making various settings in the data transmission process on the operation unit 220. The user makes detailed settings related to uploading to the file server 120 via the upload setting screen.

図１０は、アップロード設定画面１０００の一例を示す図である。ユーザは、［フォルダパス］入力欄１００１に、ファイルサーバ１２０へ送信する際のフォルダパスを入力する。図１０の例では、“＼＼Server1＼Share＼ScanData”がフォルダパスとして入力されている。フォルダパスの入力方法としては、例えば［フォルダパス］入力欄１００１へのタップ操作に応じてキーボード画面のサブウインドウ（不図示）を表示し、ユーザに、当該キーボード画面を介してパス名を入力して設定できるようにしてもよい。あるいは、アドレス帳参照画面（不図示）を表示し、ＭＦＰ１１０のＨＤＤ２１４に保存されたアドレス帳データからユーザがアドレスを選択することで設定できるようにしてもよい。［戻る］ボタン１０２０は、アップロードに関する詳細設定を中止する場合に用いるボタンである。［アップロード］ボタン１０２１は、［フォルダパス］入力欄１００１で設定したフォルダパスへのアップロードを指示するためのボタンである。 FIG. 10 is a diagram showing an example of the upload setting screen 1000. The user inputs the folder path for transmission to the file server 120 in the [folder path] input field 1001. In the example of FIG. 10, "\\ Server1 \ Share \ ScanData" is input as the folder path. As a method of inputting the folder path, for example, a sub-window (not shown) of the keyboard screen is displayed in response to a tap operation to the [folder path] input field 1001, and the user inputs the path name via the keyboard screen. It may be possible to set it. Alternatively, an address book reference screen (not shown) may be displayed so that the user can select an address from the address book data stored in the HDD 214 of the MFP 110. The [Back] button 1020 is a button used when canceling the detailed setting related to uploading. The [Upload] button 1021 is a button for instructing uploading to the folder path set in the [Folder path] input field 1001.

ステップＳ５１２では、アップロード実行部４２４により、ステップＳ５０９と同様、ユーザによるボタン操作の内容によって処理の切り分けがなされる。［アップロード］ボタン１０２１の押下が検出された場合は、ステップＳ５１３へ進む。他方、［戻る］ボタン１０２０の押下が検出された場合は、ステップＳ５０７（プレビュー画面の表示）へ戻る。 In step S512, the upload execution unit 424 divides the process according to the content of the button operation by the user, as in step S509. If the pressing of the [Upload] button 1021 is detected, the process proceeds to step S513. On the other hand, when the pressing of the [Back] button 1020 is detected, the process returns to step S507 (display of the preview screen).

In step S513, the upload execution unit 424 acquires the file server settings and passes them, together with the folder path acquired in step S511 and the file name acquired in step S510, to the metadata generation unit 422. The path name entered in the [Folder path] input field 1001, the file name generated in step S508, and the file server settings are the information needed to store the scanned image data in the file server 120. The file server settings include, for example, the host name, the starting point of the folder path, and the user name and password for login.

In step S514, the metadata generation unit 422 generates the storage destination path for the scanned image data from the information received from the upload execution unit 424. For example, the path is generated by appending the folder path to the file server settings (the host name of the file server 120 and the starting point of the folder path). This yields a storage destination path such as "\\Server01\Share\ScanData".
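The path construction of step S514 can be sketched as follows. This is only an illustration under stated assumptions, not the implementation described in the specification: the function name `build_storage_path`, its parameter names, and the treatment of the folder path as relative to the starting point are all hypothetical.

```python
def build_storage_path(host: str, root: str, folder_path: str) -> str:
    """Join host name, folder-path starting point, and folder path (UNC style).

    The parameter names are hypothetical; the text only states that the folder
    path is appended to the file server settings.
    """
    parts = [host.strip("\\")]
    for piece in (root, folder_path):
        # Drop empty components so leading, trailing, or doubled backslashes are tolerated.
        parts.extend(p for p in piece.split("\\") if p)
    return "\\\\" + "\\".join(parts)


print(build_storage_path("Server01", "Share", "ScanData"))  # \\Server01\Share\ScanData
```

Splitting on the backslash before joining keeps the result stable whether or not the individual settings carry their own separators.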

In step S515, the upload execution unit 424 accesses the file server 120. At this time, it sends the user name and password included in the file server settings to the file server 120. On receiving the user name and password, the file server 120 executes authentication processing.

In step S516, the upload execution unit 424 branches the process according to the authentication result from the file server 120. If a notification of successful authentication is received from the file server 120, the process proceeds to step S517. If a notification of failed authentication is received, this process ends.

In step S517, the upload execution unit 424 transmits the scanned image data to the folder indicated by the storage destination path generated in step S514, and the data is stored in the file server 120.
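The authentication branch and the subsequent storage (steps S515 to S517) can be illustrated with a small sketch. The class `InMemoryFileServer`, the function `upload`, and the settings keys `user` and `password` are hypothetical stand-ins; the specification does not describe the protocol or any API.

```python
class InMemoryFileServer:
    """Hypothetical stand-in for the file server 120 (keeps files in a dict)."""

    def __init__(self, accounts: dict) -> None:
        self._accounts = accounts
        self.stored = {}

    def authenticate(self, user: str, password: str) -> bool:
        return self._accounts.get(user) == password

    def store(self, folder: str, file_name: str, data: bytes) -> None:
        self.stored[folder + "\\" + file_name] = data


def upload(server, settings: dict, folder: str, file_name: str, data: bytes) -> bool:
    # S515/S516: send the credentials and branch on the authentication result.
    if not server.authenticate(settings["user"], settings["password"]):
        return False  # authentication failed, so the process ends here
    # S517: store the scanned image data in the folder given by the storage path.
    server.store(folder, file_name, data)
    return True
```

Returning a boolean mirrors the two outcomes in the flow: nothing is stored unless authentication succeeds.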

In step S518, the image analysis unit 423 stores, in the form information holding unit 427, the character string area information acquired in step S506 and the character string area information selected in step S508 (the selection information).

Table 4 shows an example of the stored character string area information and the selected character string area information.

The form number is a number uniquely assigned to each piece of form information held in the form information holding unit 427. It is 1 here because this is the first type of form. The form information holding unit 427 also holds the selection information. The number in the selection information indicates the order in which the user selected the character string areas in step S508.

The above is the overall control up to the storing of the character string area information. This embodiment assumes that steps S503 to S508 are performed on one page of scanned image data generated by the scan process. However, a button for analyzing the next page may be provided in the preview display area 810, for example, so that the next page obtained by that analysis is previewed and character strings making up the file name can also be set from character string areas on the next and subsequent pages.

Next, the second run will be described, using the document of FIG. 12(a), which is similar to the document shown in FIG. 8(a), and the scanned image displayed in the preview display area 1210 in FIG. 12(b). That is, the new scanned image data obtained by the scan process of the second run is assumed to be data that is determined to be similar to past scanned image data. The new scanned image data differs from the past scanned image data in the size of the table and in the position of the character string areas near the table. It is further assumed that the information set for the past scanned image data was set based on the characters in a character string area near the table of that past scanned image data.

The difference from the first run is that steps S1102 to S1114 shown in FIG. 11 and steps S912 to S914 shown in FIG. 9 are performed. Description of processing that is the same as in the first run is omitted as appropriate. For the second run, it is assumed that the scan process, steps S501 to S505, and steps S701 to S703 have already been executed. It is also assumed that the form information holding unit 427 holds the scanned image shown in the preview display area 810 in FIGS. 8(b) and 8(c) under the file name "Quotation_R12-3456_Kawasaki Co., Ltd." that was entered in the file name input field 801.

Table 5 shows an example of the character string areas obtained when the document shown in FIG. 12(a) is scanned and, in step S703, the image analysis unit 423 analyzes the image data corrected in step S702. The table indicated by number 10 corresponds to the table 1224 shown in FIG. 12(b). Compared with the table obtained in the first run, it has one more row and its area is correspondingly larger in the height direction.

In step S1101, form information similar to the scanned image data stored in the image data storage unit 412 is held in the form information holding unit 427, so the image analysis unit 423 determines that similar form information exists. Given this result, the process proceeds to step S1102.

In step S1102, the image analysis unit 423 specifies one target piece of form information from among the form information that is held in the form information holding unit 427 and is similar to the scanned image data stored in the image data storage unit 412.

In step S1103, the image analysis unit 423 acquires all the selected character string areas from the target form information specified in step S1102.

In step S1104, the image analysis unit 423 specifies one target selected character string area from among all the selected character string areas acquired in step S1103.

In step S1105, the image analysis unit 423 determines whether a table exists above the target selected character string area specified in step S1104 in the coordinate system (x, y). If the image analysis unit 423 determines that a table exists above the target selected character string area, the process proceeds to step S1106. If it determines that no table exists above the target selected character string area, the process proceeds to step S1108.

The procedure for determining whether a "table" exists above the target selected character string area in the coordinate system (x, y) is described here using the character string area information shown in Table 4. The area coordinates take the upper left of the document as the origin (0, 0), so a smaller value is judged to be higher on the page. As an example, the relationship between the "table" and each of "Quotation", "R12-3456", and "Kawasaki Co., Ltd." is described.

"Quotation" corresponds to number 1 in Table 4. Its area has a Y coordinate of 24 and a height of 30, so it occupies the range 24 to 54 in the Y coordinate. The "table" corresponds to number 10 in Table 4. Its area has a Y coordinate of 190 and a height of 120, so it occupies the range 190 to 310 in the Y coordinate. Comparing the two, the position coordinates of "Quotation" are smaller than those of the "table", so it is determined that number 10 (the table) is not above number 1 (Quotation).

"R12-3456" corresponds to number 3 in Table 4. Its area has a Y coordinate of 99 and a height of 22, so it occupies the range 99 to 121 in the Y coordinate. Comparing "R12-3456" with the "table", the position coordinates of "R12-3456" are smaller than those of the "table", so, as with number 1 (Quotation), it is determined that number 10 (the table) is not above number 3 (R12-3456).

"Kawasaki Co., Ltd." corresponds to number 13 in Table 4. Its area has a Y coordinate of 359 and a height of 30, so it occupies the range 359 to 389 in the Y coordinate. Comparing "Kawasaki Co., Ltd." with the "table", the position coordinates of number 13 (Kawasaki Co., Ltd.) are larger than those of number 10 (the table), so it is determined that number 10 (the table) is above number 13 (Kawasaki Co., Ltd.). When the image analysis unit 423 determines that a table exists above the selected character string area, the process proceeds to step S1106, where the distance to the table is derived. When it determines that no table exists above the selected character string area, the process proceeds to step S1108, where the character string area used for the file name is specified.
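The three comparisons above can be reproduced with a short sketch. The `Region` class and `table_is_above` are hypothetical names, and the exact test used here (the table ends at or before the top edge of the selected area) is an assumption; the text only says that the position coordinates are compared. The coordinate values are those quoted from Table 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    number: int   # area number in Table 4
    y: int        # Y coordinate of the top edge (origin at the upper left)
    height: int

    @property
    def bottom(self) -> int:
        return self.y + self.height


def table_is_above(table: Region, selected: Region) -> bool:
    # Smaller Y means higher on the page, so the table is above when it
    # ends at or before the top edge of the selected area (assumed test).
    return table.bottom <= selected.y


table = Region(10, 190, 120)      # occupies Y 190 to 310
quotation = Region(1, 24, 30)     # occupies Y 24 to 54
doc_number = Region(3, 99, 22)    # occupies Y 99 to 121
company = Region(13, 359, 30)     # occupies Y 359 to 389

for region in (quotation, doc_number, company):
    print(region.number, table_is_above(table, region))
```

Only number 13 has the table above it, which matches the branch to step S1106 described in the text.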

In step S1106, the image analysis unit 423 derives the distance between the target selected character string area and the table. In the example, the table of number 10 occupies the range 190 to 310 in the Y coordinate and the selected character string area of number 13 occupies the range 359 to 389, so the distance between the table of number 10 and the selected character string area of number 13 (Kawasaki Co., Ltd.) is 49 (= 359 - 310).
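As a worked check of this arithmetic, the distance is the top edge of the selected area minus the bottom edge of the table. The helper name `vertical_gap` is hypothetical.

```python
def vertical_gap(table_y: int, table_height: int, selected_y: int) -> int:
    """Distance from the table's bottom edge to the selected area's top edge."""
    return selected_y - (table_y + table_height)


# Table 4 values: table number 10 (Y 190, height 120), area number 13 (Y 359).
print(vertical_gap(190, 120, 359))  # 49
```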

In step S1107, the selected character string area to use for the file name in the scanned image data is specified based on the distance derived in step S1106, the target selected character string area specified in step S1104, and the table in the scanned image data. That is, the character string area used for the file name is specified. In this way, a character string area is specified in the new scanned image data whose positional relationship to the table is similar or identical to the relationship, in the past scanned image data determined to be similar, between the table and the character string area to which selection information is attached. In this embodiment, the corresponding character string area in Table 5 is specified based on the distance between the character string areas of numbers 13 and 10 in Table 4, the position coordinates of the selected character string area of number 13 in Table 4, and the character string area (table) of number 10 in Table 5. The matching area is the character string area of number 13 in Table 5, which has the same X coordinate (236) and height (30) as number 13 in Table 4 and lies at position 389, that is, at the distance (49) from the Y coordinate 340 (= 190 + 150) of the lower edge of number 10 (the table) in Table 5. If, instead of using this distance, the character string area at the same position as the selected character string area below the table in the similar past scanned image data were extracted from the scanned image data, the character string area of number 12 in Table 5 would be matched, and a character string not intended by the user would be extracted.
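The lookup in step S1107 can be sketched as follows. The function `locate_region`, the dictionary layout, and the `tolerance` parameter are assumptions made for illustration. The coordinates of number 13 and of the table come from the text; the entry for number 12 is an illustrative value, chosen only because the text says it sits at the old position.

```python
def locate_region(regions, table_y, table_height, gap, x, height, tolerance=0):
    """Find the area lying `gap` below the table's bottom edge.

    `tolerance` (hypothetical) would allow for small scan misalignment.
    """
    expected_y = table_y + table_height + gap
    for region in regions:
        if (region["x"] == x and region["height"] == height
                and abs(region["y"] - expected_y) <= tolerance):
            return region
    return None


new_regions = [
    {"number": 12, "x": 236, "y": 359, "height": 30},  # illustrative values
    {"number": 13, "x": 236, "y": 389, "height": 30},  # values quoted in the text
]

# New table (Table 5, number 10): Y 190, height 150, so its bottom edge is 340.
match = locate_region(new_regions, 190, 150, gap=49, x=236, height=30)
print(match["number"])  # 13
```

Passing the old table height (120) instead would land on the old position and return number 12, which is the failure case the paragraph describes.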

In step S1108, the image analysis unit 423 specifies, based on the target selected character string area, the corresponding character string area among the character string areas of the scanned image data as the character string area used for the file name. That is, the image analysis unit 423 holds the selected character string area specified in step S1104 in the storage area of the RAM 213.

In step S1109, the image analysis unit 423 determines whether all the selected character string areas have been processed. If an unprocessed selected character string area remains and the image analysis unit 423 determines that not all of them have been processed, the process returns to step S1104 and steps S1104 to S1108 are performed. If the image analysis unit 423 determines that all the selected character string areas have been processed, the process proceeds to step S1110.

Through the processing up to step S1109, the target area is detected based on the specified target form information, the table included in the past scanned image data determined to be similar, and the table included in the new scanned image data. The target area is the area to be processed among the character string areas extracted from the new scanned image data.

In step S1110, the image analysis unit 423 performs OCR processing on the target area, that is, the selected character string areas held in the storage area of the RAM 213, acquires the character strings of the character string areas used for the file name, and generates a file name. The file name is generated by acquiring the character strings of those areas in the same order as the selection information of the past scanned image data.
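File name generation in the stored selection order can be sketched as below. The function name `build_file_name` and the mapping from area number to OCR text are hypothetical, and the underscore delimiter is inferred from the example file name rather than stated in this passage. The strings are the first-run example values.

```python
def build_file_name(selection_order, ocr_text, delimiter="_"):
    """Join the OCR results of the selected areas in their selection order."""
    return delimiter.join(ocr_text[number] for number in selection_order)


# Hypothetical OCR results keyed by area number.
ocr_text = {13: "Kawasaki Co., Ltd.", 1: "Quotation", 3: "R12-3456"}
selection_order = [1, 3, 13]  # order in which the user selected the areas

print(build_file_name(selection_order, ocr_text))
```

Because the order comes from the selection information and not from the page layout, the name keeps the same structure even when an area moves.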

In step S1111, the image analysis unit 423 determines whether the file name generated in step S1110 duplicates an entry in the file name selection list held in the storage area of the RAM 213. If the image analysis unit 423 determines that it is a duplicate, the process proceeds to step S1112, where the image analysis unit 423 moves the duplicated file name to the top of the file name selection list. If it determines that it is not a duplicate, the process proceeds to step S1113, where the image analysis unit 423 adds the file name generated in step S1110 to the file name selection list held in the storage area of the RAM 213.
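The list handling of steps S1111 to S1113 amounts to a move-to-front on duplicates. In this sketch the function name `update_selection_list` is hypothetical, and appending a new name at the end is an assumption, since the text says only that the name is added.

```python
def update_selection_list(names: list, new_name: str) -> list:
    """Return the updated file name selection list (the input is not modified)."""
    if new_name in names:
        # S1112: a duplicated name is moved to the top of the list.
        return [new_name] + [name for name in names if name != new_name]
    # S1113: otherwise the name is added (appended here, by assumption).
    return names + [new_name]


print(update_selection_list(["a", "b", "c"], "b"))  # ['b', 'a', 'c']
print(update_selection_list(["a", "b", "c"], "d"))  # ['a', 'b', 'c', 'd']
```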

In this embodiment, the character strings extracted by image processing are used as information added to the name of the folder in which the image data is stored or to the file name, but they can also be used for other purposes. For example, a telephone number corresponding to a character string may be identified and the image data sent to that number by fax, or an email address may be identified and the image data sent by email.

The above describes a method of specifying the character string area selected for the file name based on the distance between the table and the selected character string area below the table. Instead of the distance, the area may be specified based on the number of character string areas lying between the table and the selected character string area below the table.

The above also describes the case in which the character string area selected for the file name lies outside the table in the coordinate system (x, y). When that area lies inside the table, it may be specified from the item position within the table instead of from the coordinate position.

Next, the differences from the first run in the flow of FIG. 9, which shows the details of the file name generation process S508, are described. Specifically, the case is described in which a candidate file name is selected as the file name from the file name selection list displayed by pressing the [File name list display] button. It is assumed that the form information holding unit 427 holds form information similar to the scanned image data obtained in the second run.

In step S911, the process branches according to the type of button pressed. If a press of the [File name list display] button 802 is detected, the process proceeds to step S912. In step S912, the display control unit 426 displays, as the file name selection list, the candidate file names created from the form information that the image analysis unit 423 determined to be similar. If the form information holding unit does not hold form information similar to the scanned image data, the file name selection list is not displayed even when the [File name list display] button 802 is pressed. One way to display the file name selection list is, for example, a pull-down that lists the candidate file names as options. Among the files held in the form information holding unit 427, the candidate file name most similar to the scanned image data may be moved to the top of the candidate file names shown in the pull-down and may additionally be highlighted. For example, as shown in FIG. 16, a file name selection list 1601 may be displayed as a pull-down that includes an area 1602, in which the most similar candidate file name is shown decorated with hatching, and an area 1603, in which the other candidate file names are shown without decoration. Alternatively, only the target candidate file name may be shown in a larger font, in bold, or in red compared with the other candidate file names.

In step S913, it is determined whether the user has selected a candidate file name from the file name selection list. If selection of a candidate file name is detected, the process proceeds to step S914. If no selection is detected, this flow ends.

In step S914, the character string of the candidate file name overlapping the touched position coordinates is acquired. The process then proceeds to step S910, where the character string set at that point is displayed (automatically entered) in the file name input field 801.

In this way, candidate file names are generated using the character string area information that was selected for the file name of similar scanned image data among the past scanned image data. This saves the user the effort of selecting character strings on the preview screen. Even for a document in which the positions of the entries are not fixed in advance and, within the same format, an entry field can grow and the positions of the entries change with its size, the area of the scanned image data on which OCR processing is to be performed can be specified appropriately and the information can be acquired reliably. As a result, file names following the same rule can easily be set for similar scanned image data.

[Embodiment 2]
Next, Embodiment 2 of the present invention will be described. Embodiment 1 described the case of processing a document containing one table; this embodiment describes the case of processing a document containing two or more tables. This embodiment differs from Embodiment 1 in the process of deriving the distance between a table and a character string area (step S1106) and in the preview screen. The details of step S1106 are described using the flowcharts shown in FIGS. 13 and 14. The flowcharts shown in FIGS. 13 and 14 differ from the flowchart shown in FIG. 11 in that steps S1301 to S1306 are performed. The preview screen is described using the preview screen shown in FIG. 15. Description of other configurations that are the same as in Embodiment 1 is omitted as appropriate. In this embodiment, the form information holding unit 427 is assumed to store character string area information such as that shown in Table 6.

Table 7 shows an example of the character string areas obtained when the document shown in FIG. 15(a) is scanned and, in step S703, the image analysis unit 423 analyzes the image data corrected in step S702. The table indicated by number 10 corresponds to the table 1425 shown in FIG. 15(b), and the table indicated by number 11 corresponds to the table 1424 shown in FIG. 15(b).

FIG. 14 is a flowchart showing the details of the distance derivation process of step S1301 in FIG. 13. The description below follows the flow of FIG. 14.

In step S1302, the image analysis unit 423 acquires all the table area information from the target form information. In the example shown in Table 6, it acquires the table area information corresponding to number 10 and to number 11. Each piece of table area information includes the X and Y coordinates of the table area and its size in the width and height directions. For the table corresponding to number 10, it acquires table area information in which the X and Y coordinates of the area are 37 and 190 and the sizes in the width and height directions are 110 and 120. For the table corresponding to number 11, it acquires table area information in which the X and Y coordinates of the area are 157 and 190 and the sizes in the width and height directions are 360 and 120.

In step S1303, the image analysis unit 423 specifies one target table area. In the example shown in Table 6, it specifies the table corresponding to number 10 or number 11.

In step S1304, the image analysis unit 423 derives the distance between the target selected character string area specified in step S1104 and the target table area specified in step S1303. The case in which the selected character string area corresponding to number 14 is specified in step S1104 and the table area corresponding to number 10 is specified in step S1303 is described here. The selected character string area of number 14 occupies the range 359 to 389 in the Y coordinate, and the table area of number 10 occupies the range 190 to 310. The distance in the Y-axis direction is therefore 49 (= 359 - 310). As for the X coordinate, the selected character string area of number 14 is at 236 and the table area of number 10 is at 37, so the distance in the X-axis direction is 199 (= 236 - 37).
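The per-axis distances of step S1304 can be checked with a short sketch. The function name `axis_distances` and the dictionary layout are hypothetical; the X distance is taken between the left edges and the Y distance from the table's bottom edge, following the arithmetic quoted in the text.

```python
def axis_distances(selected: dict, table: dict) -> tuple:
    """Return (X distance, Y distance) between a selected area and a table."""
    dx = abs(selected["x"] - table["x"])                     # left edge to left edge
    dy = selected["y"] - (table["y"] + table["height"])      # table bottom to area top
    return dx, dy


selected_14 = {"x": 236, "y": 359, "height": 30}
table_10 = {"x": 37, "y": 190, "width": 110, "height": 120}

print(axis_distances(selected_14, table_10))  # (199, 49)
```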

In step S1305, the image analysis unit 423 determines whether all the table areas in the target form information have been processed. If an unprocessed table area remains and the image analysis unit 423 determines that not all of them have been processed, the process returns to step S1303 and steps S1303 to S1305 are performed. If the image analysis unit 423 determines that all the table areas have been processed, the process proceeds to step S1306.

In step S1306, the image analysis unit 423 specifies the table with the shortest distance derived in step S1304, that is, the table closest to the selected character string area, as the table used for distance derivation. In the example shown in Table 6, the distance derived in step S1304 between the selected character string area of number 14 and the table area of number 10 is compared with the distance derived in step S1304 between the selected character string area of number 14 and the table area of number 11. The distance between the selected character string area of number 14 and the table of number 10 is 199 in the X-axis direction and 49 in the Y-axis direction. The distance between the selected character string area of number 14 and the table of number 11 is 79 in the X-axis direction and 49 in the Y-axis direction. Comparing the two, the distances are equal in the Y-axis direction, and in the X-axis direction the distance for the table of number 11 is shorter than that for the table of number 10. The table of number 11, not the table of number 10, is therefore specified as the table used for distance derivation. The table specified in step S1306 is then used when the character string area used for the file name is specified in step S1107.
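Selecting the closest table (steps S1303 to S1306) can be sketched as a minimum over all table areas. The function name `nearest_table` is hypothetical, and combining the two axes by summing them is an assumption: the text compares the X and Y distances separately and does not define a single metric. The coordinates are the Table 6 values quoted above.

```python
def nearest_table(selected: dict, tables: list) -> dict:
    """Return the table with the smallest combined X and Y distance."""

    def combined_distance(table: dict) -> int:
        dx = abs(selected["x"] - table["x"])
        dy = abs(selected["y"] - (table["y"] + table["height"]))
        return dx + dy  # assumed metric: sum of the per-axis distances

    return min(tables, key=combined_distance)


selected_14 = {"number": 14, "x": 236, "y": 359, "height": 30}
tables = [
    {"number": 10, "x": 37, "y": 190, "width": 110, "height": 120},
    {"number": 11, "x": 157, "y": 190, "width": 360, "height": 120},
]

print(nearest_table(selected_14, tables)["number"])  # 11
```

With these values the Y distances tie at 49, so the X distance (79 against 199) decides the result, as in the text.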

As described above, even when a document contains a plurality of entry fields such as tables, the area on which OCR processing is to be performed can be appropriately identified, as in Embodiment 1, in the scanned image data of a document in which the position of an entry changes according to the size of an entry field, and the information in that area can be reliably obtained. As a result, file names can easily be set according to the same rule for similar scanned image data.

[Other embodiments]
The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of that system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

110 MFP
421 Scan instruction unit
422 Metadata generation unit
423 Image analysis unit
425 File generation unit
426 Display control unit
427 Form information holding unit

Claims

A device for setting information in scanned image data obtained by scanning a document containing a table, the device comprising:
an extraction means for extracting, from new scanned image data, area information on each of the character string areas and table areas, which are areas presumed to be a character string and a table, respectively;
a determination means for comparing the area information extracted from the new scanned image data by the extraction means with the area information extracted from each item of past scanned image data, thereby determining the past scanned image data from which area information similar to the area information extracted from the new scanned image data was extracted;
a detection means for detecting a target area to be processed from among the character string areas extracted from the new scanned image data, based on the distance between the character string area used when setting information for the past scanned image data determined by the determination means to have yielded the similar area information and the table area extracted from that past scanned image data;
a recognition means for performing character recognition processing on the target area; and
a setting means for setting information in the new scanned image data using the characters obtained as a result of the character recognition processing.

The device according to claim 1, wherein, when the size of the table area and the position of the character string area existing in the vicinity of the table area differ between the new scanned image data and the past scanned image data determined to have yielded the similar area information, and the information set for that past scanned image data was set based on the characters in a character string area existing in the vicinity of the table area of that past scanned image data, the detection means detects, as the target area, from among the character string areas existing in the vicinity of the table area of the new scanned image data, the character string area that corresponds to the relationship between the table area in the past scanned image data and the character string area used when obtaining the information set for the past scanned image data.

The device according to claim 2, further comprising an acquisition means for acquiring the position coordinates of the table area and the character string areas of the new scanned image data,
wherein the detection means detects, as the target area, based on the acquired position coordinates of the table area and the character string areas, from among the character string areas existing in the vicinity of the table area of the new scanned image data, the character string area that corresponds to the relationship between the table area in the past scanned image data determined to have yielded the similar area information and the character string area used when obtaining the information set for the past scanned image data.

The device according to claim 3, wherein, when the acquired position coordinates of a character string area are below the acquired position coordinates of the table area, the detection means detects, as the target area, from among the character string areas existing in the vicinity of the table area of the new scanned image data, the character string area that corresponds to the relationship between the table area in the past scanned image data determined to have yielded the similar area information and the character string area used when obtaining the information set for the past scanned image data.

The device according to claim 4, wherein the detection means detects, as the target area, from among the character string areas existing in the vicinity of the table area of the new scanned image data, the character string area whose position coordinates place it at a distance from the table area of the new scanned image data equal to the distance between the table area in the past scanned image data determined to have yielded the similar area information and the character string area used when obtaining the information set for the past scanned image data.

The apparatus according to any one of claims 1 to 5, further comprising a display control means for displaying the target area on the display means together with the scan image of the new scan image data.

The device according to claim 6, wherein the display control means displays the information on the display means.

The device according to claim 7, wherein, when there are a plurality of items of the information, the display control means highlights the information generated for the past scanned image data determined to be most similar to the new scanned image data.

The device according to claim 8, wherein the display control means displays the information generated for the past scanned image data determined to be most similar to the new scanned image data at the top of a list displaying the plurality of items of information.

The device according to any one of claims 1 to 9, wherein the document contains a plurality of the table areas, and
the detection means detects the target area based on the table area that, among the plurality of table areas, is closest to the character string area existing in the vicinity of the table area.

The apparatus according to any one of claims 1 to 10, further comprising a holding means for holding the past scanned image data.

The device according to any one of claims 1 to 11, wherein the document is a form, and
the information is a file name given to the scanned image data.

A method for setting information in scanned image data obtained by scanning a document containing a table, the method comprising:
an extraction step of extracting, from new scanned image data, area information on each of the character string areas and table areas, which are areas presumed to be a character string and a table, respectively;
a determination step of comparing the area information extracted from the new scanned image data in the extraction step with the area information extracted from each item of past scanned image data, thereby determining the past scanned image data from which area information similar to the area information extracted from the new scanned image data was extracted;
a detection step of detecting a target area to be processed from among the character string areas extracted from the new scanned image data, based on the distance between the character string area used when setting information for the past scanned image data determined in the determination step to have yielded the similar area information and the table area extracted from that past scanned image data;
a step of performing character recognition processing on the target area; and
a step of setting information in the new scanned image data using the characters obtained as a result of the character recognition processing.

A program for making a computer function as each means of the apparatus according to any one of claims 1 to 12.