JP2022189109A - Image processing apparatus, image processing method, and program

Info

Publication number: JP2022189109A
Application number: JP2021097497A
Authority: JP
Inventors: 峻中村; Shun Nakamura
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-06-10
Filing date: 2021-06-10
Publication date: 2022-12-22

Abstract

To discriminate separation between documents in scan images obtained by continuously scanning a plurality of documents, while reducing processing cost. SOLUTION: An image processing apparatus analyzes a layout page by page for scan images obtained by continuously scanning a plurality of documents page by page, and based on a result of analysis, calculates the similarity to the first page of the scan images for every page of the scan images. The image processing apparatus extracts, based on the calculated similarity, head page candidates in the respective documents of the plurality of documents from the scan images, and performs character recognition on the extracted head page candidates. The image processing apparatus determines separations between the documents based on a result of character recognition. SELECTED DRAWING: Figure 10

Description

TECHNICAL FIELD
The present disclosure relates to a technique for determining document-by-document breaks in scanned images obtained by continuously scanning a plurality of documents.

As a document management method, it is widely practiced to convert a scanned image, obtained by reading a document such as a paper form with a scanner, into a file of a predetermined format and to transmit the file to a storage server on a network for storage.

In one use case for this method, scanned images obtained by continuously scanning a plurality of forms or the like are divided at the breaks between forms, converted into one file per form, and saved in a storage server. Creating a file for each document in this way requires separating the continuously scanned images document by document. In this regard, Patent Document 1 discloses a technique of performing character recognition processing and analysis processing on document images obtained by reading a plurality of documents, and extracting document delimiter information using the processing results.

Patent Document 1: Japanese Patent Application Laid-Open No. 2002-312385 (JP-A-2002-312385)

The technique of Patent Document 1 requires character recognition processing and analysis processing to be performed on every page of the document images, and therefore incurs a large processing cost.

SUMMARY OF THE INVENTION
The present disclosure has been made in view of the above problem, and aims to provide a technique for determining document-by-document breaks in scanned images obtained by continuously scanning a plurality of documents, while keeping processing cost low.

An image processing apparatus according to an aspect of the present disclosure includes: analysis means for analyzing, page by page, the layout of scanned images obtained by continuously scanning a plurality of documents page by page; calculation means for calculating, based on the analysis result of the analysis means, a degree of similarity between each page of the scanned images and the first page of the scanned images; extraction means for extracting, from the scanned images and based on the degree of similarity calculated by the calculation means, first page candidates for each of the plurality of documents; character recognition means for performing character recognition on the first page candidates extracted by the extraction means; and determination means for determining the breaks between the documents based on the character recognition result of the character recognition means.
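The flow of this aspect (layout analysis, similarity to the first page, candidate extraction, character recognition on candidates only, break determination) can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the layout representation (a set of occupied grid cells), the Jaccard similarity measure, the threshold of 0.8, and the `ocr` and `looks_like_first_page` callbacks are all hypothetical and chosen only for the example.

```python
from typing import Callable, FrozenSet, List, Tuple

# Hypothetical layout representation: grid cells occupied by text blocks.
Layout = FrozenSet[Tuple[int, int]]


def similarity(a: Layout, b: Layout) -> float:
    """Jaccard similarity between two layouts (an assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def extract_candidates(layouts: List[Layout], threshold: float = 0.8) -> List[int]:
    """Return zero-based indices of pages whose layout resembles the first page."""
    first = layouts[0]
    return [i for i, lay in enumerate(layouts) if similarity(lay, first) >= threshold]


def determine_breaks(
    layouts: List[Layout],
    ocr: Callable[[int], str],
    looks_like_first_page: Callable[[str], bool],
    threshold: float = 0.8,
) -> List[int]:
    """Run character recognition only on candidates and keep confirmed first pages."""
    breaks = []
    for i in extract_candidates(layouts, threshold):
        text = ocr(i)  # OCR is limited to candidate pages, which is where cost is saved
        # Assumption: the very first page always starts a document.
        if i == 0 or looks_like_first_page(text):
            breaks.append(i)
    return breaks
```

The point of the design is that the comparatively cheap layout comparison runs on every page, while the expensive character recognition runs only on the pages that survive it.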

Advantageous Effects of Invention
According to the present disclosure, document-by-document breaks in scanned images obtained by continuously scanning a plurality of documents can be determined while keeping processing cost low.

FIG. 1 is a diagram showing the overall configuration of an image processing system.
FIG. 2 is a diagram showing an example of the hardware configuration of an MFP.
FIG. 3 is a diagram showing an example of the hardware configuration of a client PC, an MFP cooperation server, and a storage server.
FIG. 4 is a diagram showing an example of the software configuration of the image processing system.
FIG. 5 is a sequence diagram showing the flow of processing in the entire image processing system.
FIG. 6 is a diagram showing an example of a scanned image group.
FIG. 7 is a diagram showing an example of a split page confirmation screen and an example of a file name setting screen.
FIG. 8 is a flowchart showing the flow of image analysis processing.
FIG. 9 is a flowchart showing the detailed flow of document first page candidate extraction processing.
FIG. 10 is a flowchart showing the detailed flow of document first page determination processing.
FIG. 11 is a flowchart showing the detailed flow of document first page determination processing based on OCR results.
FIG. 12 is a sequence diagram showing the flow of processing in the entire image processing system.
FIG. 13 is a diagram showing an example of a split page confirmation screen.
FIG. 14 is a flowchart showing the detailed flow of document first page determination processing.
FIG. 15 is a flowchart showing the detailed flow of OCR necessity determination processing based on history.
FIG. 16 is a flowchart showing the detailed flow of document first page determination processing based on OCR results.
FIG. 17 is a flowchart showing the detailed flow of document first page determination processing.

Embodiments for implementing the technology of the present disclosure will be described below with reference to the drawings. The following embodiments do not limit the technology of the present disclosure as set out in the claims, and not all combinations of the features described in the following embodiments are necessarily essential to the solution provided by the technology of the present disclosure.

[First embodiment]
<Overview of image processing system>
FIG. 1 is a diagram showing the overall configuration of an image processing system according to this embodiment. The image processing system 100 includes an MFP (Multi-Function Peripheral) 110, a client PC (Personal Computer) 111, an MFP cooperation server 120, and a storage server 130. The MFP 110 and the client PC 111 are communicably connected, via a LAN (Local Area Network), to servers that provide various services on the Internet.

The MFP 110 is an example of an image processing apparatus having a scan function. The MFP 110 is a multifunction peripheral that has a plurality of functions, such as a print function and a BOX storage function, in addition to the scan function. The client PC 111 is an example of a computer on which an application is installed for receiving the services requested from the MFP cooperation server 120. The server devices 120 and 130 are both examples of image processing apparatuses that provide cloud services. The server device 120 of this embodiment provides a service that performs image analysis on scanned images received from the MFP 110 and stores them on its own server, and that forwards requests from the MFP 110 to the server device 130, which provides another service. Hereinafter, the cloud service provided by the server device 120 is referred to as the "MFP cooperation service". The server device 130 provides a cloud service (hereinafter referred to as the "storage service") that stores files sent via the Internet and provides stored files in response to requests from a web browser on a mobile terminal (not shown) or the like. In this embodiment, the server device 120 that provides the MFP cooperation service is called the "MFP cooperation server", and the server device 130 that provides the storage service is called the "storage server".

The image processing system 100 of this embodiment is composed of the MFP 110, the client PC 111, the MFP cooperation server 120, and the storage server 130, but the configuration is not limited to this. For example, the MFP 110 may also provide the functions of the client PC 111 and the MFP cooperation server 120. The MFP cooperation server 120 may be connected to the MFP 110 and the client PC 111 via the LAN rather than over the Internet. The storage server 130 may also be replaced with a mail server that provides a mail delivery service, so that the technique is applied to a situation in which scanned images of documents are attached to an e-mail and sent.

<Hardware configuration of MFP>
FIG. 2 is a diagram showing an example of the hardware configuration of the MFP 110. The MFP 110 has a control unit 210, an operation unit 220, a printer 221, a scanner 222, and a modem 223. The control unit 210 has the units 211 to 219 described below and controls the operation of the MFP 110 as a whole. The CPU 211 is a central processing unit that reads and executes control programs (programs corresponding to the various functions shown in the software configuration diagram described later) stored in a ROM (Read Only Memory) 212. A RAM (Random Access Memory) 213 is used as the main memory of the CPU 211 and as a temporary storage area such as a work area. In this embodiment, one CPU 211 uses one memory (the RAM 213 or the HDD 214) to execute each process shown in the flowcharts described later, but the configuration is not limited to this. For example, a plurality of CPUs and a plurality of RAMs or HDDs may cooperate to execute each process. An HDD (Hard Disk Drive) 214 is a large-capacity storage unit that stores image data and various programs.

An operation unit I/F 215 is an interface that connects the operation unit 220 and the control unit 210. The operation unit 220 includes a touch panel that also functions as a display unit, a keyboard, and the like, and accepts operations, inputs, and instructions from the user. Touch operations on the touch panel include operations with a finger and operations with a touch pen. A printer I/F 216 is an interface that connects the printer 221 and the control unit 210. Image data for printing is transferred from the control unit 210 to the printer 221 via the printer I/F 216 and printed on a recording medium such as paper.

A scanner I/F 217 is an interface that connects the scanner 222 and the control unit 210. The scanner 222 optically reads an original (document) set on a platen (not shown) or in an ADF (Auto Document Feeder) to generate image data (that is, scanned image data), and inputs the image data to the control unit 210 via the scanner I/F 217. In addition to printing (copying) the image data generated by the scanner 222 on the printer 221, the MFP 110 can send it as a file or by e-mail. A modem I/F 218 is an interface that connects the modem 223 and the control unit 210. The modem 223 performs facsimile communication of image data with a facsimile device on the PSTN (Public Switched Telephone Network). A network I/F 219 is an interface that connects the control unit 210 (MFP 110) to the LAN. The MFP 110 uses the network I/F 219 to send image data and information to services on the Internet and to receive various information.

<Hardware configuration of client PC and server devices>
FIG. 3 is a diagram showing an example of the hardware configuration of the client PC 111, the MFP cooperation server 120, and the storage server 130. The client PC 111, the MFP cooperation server 120, and the storage server 130 share a common hardware configuration composed of a CPU 311, a ROM 312, a RAM 313, an HDD 314, and a network I/F 315. The CPU 311 controls overall operation by reading control programs stored in the ROM 312 and executing various processes. The RAM 313 is used as the main memory of the CPU 311 and as a temporary storage area such as a work area. The HDD 314 is a large-capacity storage unit that stores image data and various programs. The network I/F 315 is an interface that connects the control unit 310 to the Internet. The MFP cooperation server 120 and the storage server 130 receive requests for various processes from other devices (such as the MFP 110) via the network I/F 315 and return processing results corresponding to those requests.

<Software configuration of image processing system>
FIG. 4 is a block diagram showing an example of the software configuration of the image processing system 100 according to this embodiment. FIG. 4(a) shows the image processing system 100 as a whole, and FIG. 4(b) shows details of the image processing unit 432 of the MFP cooperation server 120. The software configurations corresponding to the respective roles of the MFP 110, the MFP cooperation server 120, and the storage server 130 that make up the image processing system 100 are described in order below. Of the various functions of each device, the description is limited to those involved in scanning a document, digitizing it (converting it into a file), and saving it in the storage server 130.

<Software configuration of MFP>
The function modules of the MFP 110 are broadly divided into two: a native function module 410 and an additional function module 420. The native function module 410 is an application provided as standard in the MFP 110, whereas the additional function module 420 is an application additionally installed in the MFP 110. The additional function module 420 is a Java (registered trademark)-based application, which makes it easy to add functions to the MFP 110. Other additional applications (not shown) may also be installed in the MFP 110.

The native function module 410 has a scan execution unit 411 and a scan image management unit 412. The additional function module 420 has a display control unit 421, a scan instruction unit 422, a cooperation service request unit 423, and an image processing unit 424.

The display control unit 421 displays user interface screens (UI screens) for accepting various user operations on the liquid crystal display, which has a touch panel function, of the operation unit 220. The user operations include, for example, entering the login authentication information used to access the MFP cooperation server 120, configuring scan settings, instructing the start of a scan, instructing confirmation of split pages, entering a file name, instructing file name setting, and instructing that a file be saved.

In response to a user operation on a UI screen (for example, pressing the "Start scan" button), the scan instruction unit 422 instructs the scan execution unit 411 to execute scan processing and passes along the scan setting information. Following this instruction, the scan execution unit 411 causes the scanner 222 to perform a document reading operation via the scanner I/F 217, and generates scanned image data by reading the original (paper document) placed on the platen glass. The generated scanned image data is saved in the HDD 214 by the scan image management unit 412. At this time, the scan instruction unit 422 is notified of a scanned image identifier that uniquely identifies the saved scanned image data. The scanned image identifier is a number, symbol, letter, or the like that uniquely identifies an image scanned by the MFP 110. The scan instruction unit 422 uses this identifier to obtain, for example, the scanned image data to be converted into a file from the scan image management unit 412. It then instructs the cooperation service request unit 423 to request the MFP cooperation server 120 to perform the processing needed for file conversion.
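The identifier handling described above can be illustrated with a small sketch. It assumes an in-memory dictionary in place of the HDD 214 and a random hexadecimal string as the identifier; the source only says that the identifier is a number, symbol, letter, or the like, so the class name, method names, and the use of `uuid` are all hypothetical.

```python
import uuid
from typing import Dict


class ScanImageStore:
    """In-memory stand-in for the scan image management unit (hypothetical)."""

    def __init__(self) -> None:
        self._images: Dict[str, bytes] = {}

    def save(self, image_data: bytes) -> str:
        """Store scanned image data and return an identifier unique to it."""
        image_id = uuid.uuid4().hex  # assumed identifier format
        self._images[image_id] = image_data
        return image_id

    def get(self, image_id: str) -> bytes:
        """Look up previously saved image data by its identifier."""
        return self._images[image_id]
```

A caller playing the role of the scan instruction unit would keep the value returned by `save` and later pass it to `get` to retrieve the data to be converted into a file.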

The cooperation service request unit 423 sends requests for various processes to the MFP cooperation server 120 and receives the responses. These processes include, for example, login authentication, analysis of scanned images, and transmission of scanned image data. Protocols such as REST (Representational State Transfer) and SOAP (Simple Object Access Protocol) are used for communication with the MFP cooperation server 120, although other communication means may be used instead. The image processing unit 424 performs predetermined processing on the scanned image data to generate the images used in the UI screens displayed by the display control unit 421.

<Software configuration of server devices>
First, the software configuration of the MFP cooperation server 120 is described. The MFP cooperation server 120 has a request control unit 431, an image processing unit 432, a storage server access unit 433, a data management unit 434, and a display control unit 435. The request control unit 431 waits in a state in which it can receive requests from external devices and, according to the content of a received request, instructs the image processing unit 432, the storage server access unit 433, or the data management unit 434 to execute predetermined processing.

The image processing unit 432 performs, on scanned images sent from the MFP 110, analysis processing such as character area detection, character recognition, and similar document (form) determination (described later with reference to FIGS. 8 to 10), as well as image manipulation such as rotation and skew correction. As shown in FIG. 4(b), the image processing unit 432 has an image correction unit 451, a character area detection unit 452, and an OCR (Optical Character Recognition) processing unit (character recognition unit) 453. The image processing unit 432 further has a document first page candidate extraction unit 461 and a document first page determination unit 471. The document first page candidate extraction unit 461 has a layout analysis unit 462 and a similarity calculation unit 463. The document first page determination unit 471 has a nonconformity rate calculation unit 472 and a division determination use flag setting unit 473. The processing of each functional unit of the image processing unit 432 is described in detail where relevant in the description of the overall processing. Each functional unit may also take on part of the functions of another functional unit.

The storage server access unit 433 sends processing requests to the storage server 130. Cloud services publish various interfaces, using protocols such as REST and SOAP, for saving files in a storage server and retrieving saved files. The storage server access unit 433 uses the published interface of the storage server to make requests to the storage server 130. The data management unit 434 holds and manages the user information, image analysis results, various setting data, and the like managed by the MFP cooperation server 120.

The display control unit 435 receives requests from a web browser running on a PC or mobile terminal (neither shown) connected via the Internet, and returns the screen information (HTML, CSS, etc.) needed to display a screen. Through the screens displayed in the web browser, the user can check the user information registered in the MFP cooperation server 120 and change the scan settings.

Next, the software configuration of the storage server 130 is described. The storage server 130 has a request control unit 441, a file management unit 442, and a display control unit 443. The request control unit 441 waits in a state in which it can receive requests from external devices; in this embodiment, in response to a request from the MFP cooperation server 120, it instructs the file management unit 442 to save a received file or to read a saved file. It then returns a response corresponding to the request to the MFP cooperation server 120. The display control unit 443 receives requests from a web browser running on a PC or mobile terminal (not shown) connected via the Internet, and returns the screen configuration information (HTML, CSS, etc.) needed to display a screen. Through the screens displayed in the web browser, the user can check and retrieve the saved files registered in the storage server 130.

Although a configuration example in which the MFP 110 has the additional function module 420 has been described with reference to FIG. 4(a), this embodiment is not limited to that configuration. For example, the client PC 111 may include the functions of the additional function module 420. That is, the system may be configured so that the client PC 111 issues the analysis request for the scanned images obtained by the MFP 110 and, based on the analysis result, confirms the document first page of each document, sets file names, and so on.

When the client PC 111 confirms the document first page of each document, sets file names, and so on, the programs (modules) for this processing may be installed on the client PC 111 in advance. The configuration is not limited to this, however. For example, a general-purpose web browser on the client PC 111 may be used to obtain, from the MFP cooperation server 120, a web application for confirming document first pages and the like, and to execute it.

<Processing flow of the entire image processing system>
FIG. 5 is a sequence diagram showing the flow of processing among the devices when the MFP 110 continuously scans a plurality of documents page by page, converts the resulting scanned images into files, and saves (transmits) them to the storage server 130. The description here focuses on the exchanges between the devices. Although the sequence diagram of FIG. 5 describes the case in which the MFP 110 communicates with the MFP cooperation server 120, the acquisition of analysis results, the display of screens, and so on described later may be performed by the client PC 111 instead of the MFP 110.

In its normal state, the MFP 110 displays on the touch panel a main screen on which buttons for executing each of the functions it provides are arranged.

Installing in the MFP 110 an additional application for transmitting scanned images to the storage server 130 (hereinafter referred to as the "scan application") causes a button for using the application's functions to be displayed on the main screen of the MFP 110. When the user presses this button on the main screen, a screen for transmitting scanned images to the storage server 130 is displayed, and the series of processes shown in the sequence diagram of FIG. 5 starts. The exchanges between the devices are described below in chronological order, following the sequence diagram of FIG. 5. In the following description, the symbol "S" denotes a step.

In S501, when the scan application is executed, the MFP 110 displays on the operation unit 220 a UI screen (login screen) for entering the login authentication information used to access the MFP cooperation server 120.

In S502, when the user enters a pre-registered user ID and password in the respective input fields on the login screen and presses the login button, a login authentication request is transmitted to the MFP cooperation server 120.

In S503, the MFP cooperation server 120, having received the login authentication request, performs authentication processing using the user ID and password included in the request. If the authentication processing confirms that the user is a legitimate user, the MFP cooperation server 120 returns an access token to the MFP 110. From then on, the logged-in user is identified by sending this access token along with each request that the MFP 110 makes to the MFP cooperation server 120. In this embodiment, completing the login to the MFP cooperation server 120 also completes the login to the storage server 130 at the same time. For this purpose, the user associates in advance, via a web browser on a PC (not shown) on the Internet or the like, the user ID for using the MFP cooperation service with the user ID for using the storage service. As a result, if login authentication to the MFP cooperation server 120 succeeds, login authentication to the storage server 130 is completed at the same time, and the operation of logging in to the storage server 130 can be omitted. The MFP cooperation server 120 can then also handle requests relating to the storage service from a user who has logged in to it. Login authentication may be performed using a generally known method (Basic authentication, Digest authentication, authorization using OAuth, etc.).
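The token exchange in S502 and S503 can be sketched as follows. This is a simplified illustration under stated assumptions rather than the actual server logic: the class and method names are hypothetical, passwords are held in plain text only to keep the example short, and a real deployment would rely on one of the established schemes mentioned above.

```python
import hmac
import secrets
from typing import Dict, Optional


class CooperationServerAuth:
    """Toy model of login authentication on the MFP cooperation server (hypothetical)."""

    def __init__(self, users: Dict[str, str]) -> None:
        self._users = users                 # user ID -> password (plain text for brevity only)
        self._tokens: Dict[str, str] = {}   # access token -> user ID

    def login(self, user_id: str, password: str) -> Optional[str]:
        """Return a fresh access token for valid credentials, otherwise None."""
        stored = self._users.get(user_id)
        if stored is None:
            return None
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(stored.encode(), password.encode()):
            return None
        token = secrets.token_urlsafe(32)
        self._tokens[token] = user_id
        return token

    def identify(self, token: str) -> Optional[str]:
        """Resolve the logged-in user from a token sent with a later request."""
        return self._tokens.get(token)
```

Each subsequent request from the MFP would carry the token, and the server would call something like `identify` to determine which user is making the request.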

When login is completed, the MFP 110 displays a UI screen for scan settings (hereinafter referred to as the "scan setting screen") on the operation unit 220 (S504). When the user sets detailed conditions for the scan processing via the scan setting screen, places the plurality of paper documents to be scanned on the platen glass or in the ADF, and presses the "Start Scan" button, the scan is executed (S505). As a result, scan image data (scan images) is generated by digitizing the plurality of paper documents. After the scan is completed, the MFP 110 transmits the scan image data obtained by the scan to the MFP cooperation server 120 together with an analysis request (S506).

スキャン画像の解析リクエストを受けたＭＦＰ連携サーバ１２０では、リクエスト制御部４３１が画像処理部４３２に対し、画像解析処理の実行を指示する（Ｓ５０７）。その際、リクエスト制御部４３１は、受信した解析リクエストを一意に特定可能な識別子であるリクエストＩＤ（”processId”）をＭＦＰ１１０に返す。一方、解析処理の実行指示を受けた画像処理部４３２は、スキャン画像データに対する解析処理を実行する（Ｓ５０８）。 In the MFP cooperation server 120 that has received the scan image analysis request, the request control unit 431 instructs the image processing unit 432 to execute image analysis processing (S507). At that time, the request control unit 431 returns to the MFP 110 a request ID (“processId”) that is an identifier that uniquely identifies the received analysis request. On the other hand, the image processing unit 432 that has received the analysis processing execution instruction executes the analysis processing for the scan image data (S508).

画像解析処理では、まず、Ｓ５０６にてＭＦＰ１１０より受信したスキャン画像データに対して、Ｓ５０８にて各文書の文書先頭ページ候補を抽出し、抽出した文書先頭ページ候補に対して詳細な解析を行う。そして、解析結果を基づき、各文書の区切り位置を判定するための各文書の文書先頭ページを確定する。Ｓ５０８の解析処理の詳細については、図８から図１０を用いて後述する。なお、スキャン画像は、スキャン画像群ともいう。 In the image analysis process, first, in S508, document first page candidates of each document are extracted from the scanned image data received from the MFP 110 in S506, and detailed analysis is performed on the extracted document first page candidates. Then, based on the analysis result, the first page of each document for determining the break position of each document is determined. Details of the analysis processing in S508 will be described later using FIGS. 8 to 10 . A scan image is also referred to as a scan image group.

While the image analysis processing is being performed, the MFP 110 uses the above-described request ID to inquire about the processing status from the MFP cooperation server 120 periodically (for example, at intervals ranging from several hundred milliseconds down to several milliseconds) (S509 to S509'). This inquiry is repeated until the analysis completion response (S510) is obtained from the MFP cooperation server 120. On receiving a processing status inquiry, the MFP cooperation server 120 checks the progress of the image analysis processing corresponding to the request ID. If the processing has not been completed, it returns a response indicating that processing is in progress; if it has been completed, it returns a response indicating completion. The "status" field of this response contains a character string indicating the current processing status: "processing" while the MFP cooperation server 120 is still processing, and "completed" once processing has finished. Other status strings may also appear, such as "failed" when the processing fails. In addition to the status information, the response at the time of completion (when status is "completed") includes analysis result information obtained by analyzing the scan images, scan setting information, and the like.
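The polling exchange of S509 to S510 can be sketched as follows. This is only an illustration under stated assumptions: `get_status` is a hypothetical callable standing in for the inquiry sent to the MFP cooperation server 120, and the interval and attempt limit are example values. Only the status strings "processing", "completed", and "failed" come from the description.

```python
import time
from typing import Callable, Dict


def poll_until_done(get_status: Callable[[str], Dict],
                    process_id: str,
                    interval_s: float = 0.2,
                    max_attempts: int = 100) -> Dict:
    """Repeat the status inquiry until a terminal status is returned.

    get_status is a hypothetical stand-in for the inquiry request; it
    takes the request ID and returns the response as a dict.
    """
    for _ in range(max_attempts):
        response = get_status(process_id)
        status = response.get("status")
        if status == "completed":
            # The completion response also carries the analysis results.
            return response
        if status == "failed":
            raise RuntimeError(f"analysis {process_id} failed")
        # Any other status (e.g. "processing"): wait and ask again.
        time.sleep(interval_s)
    raise TimeoutError(f"analysis {process_id} did not finish in time")
```

Bounding the number of attempts is a design choice of this sketch; the description itself only says that the inquiry is repeated until the completion response is obtained.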

After receiving the processing completion response, the MFP 110 requests the image analysis result from the MFP cooperation server 120, using the URL included in the response that indicates where the image analysis result is stored (S511). In the MFP cooperation server 120 that has received the request, the request control unit 431 returns the result information of the analysis processing to the MFP 110.

Then, using the analysis result information obtained through the request of S511, the MFP 110 displays a UI screen for confirming the pages at which the scan image group is divided into documents (hereinafter referred to as the "divided page confirmation screen") (S512). Details of the processing on the displayed divided page confirmation screen will be described later with reference to FIG. 7(a).

分割ページ確定画面７１０（図７（ａ））にてスキャン画像群に対し各文書の区切り位置が必要に応じてユーザ操作で再指定された状態で、「ＯＫ」ボタン７１７が押下されると、スキャン画像において、各文書の区切りとなる位置が確定されることとなる。 When the "OK" button 717 is pressed in a state where the separation position of each document for the scan image group is re-designated as necessary by the user operation on the divided page confirmation screen 710 (FIG. 7A), In the scanned image, the position to be the delimiter of each document is determined.

そして、ＭＦＰ１１０は、Ｓ５１２の処理で確定した位置で区切られた各ファイルのファイル名を設定するためのＵＩ画面（以下、「ファイル名設定画面」と表記）を表示する（Ｓ５１３）。表示されたファイル名設定画面での処理の詳細については、図７（ｂ）を用いて後述する。ファイル名設定画面７２０（図７（ｂ）にて各ファイルのファイル名が入力された状態で、「ＯＫ」ボタン７２７が押下されると、各ファイルのファイル名が設定されることとなる。 Then, the MFP 110 displays a UI screen (hereinafter referred to as a "file name setting screen") for setting the file name of each file separated at the positions determined in the process of S512 (S513). Details of the processing on the displayed file name setting screen will be described later with reference to FIG. 7(b). When the "OK" button 727 is pressed with the file name of each file entered on the file name setting screen 720 (FIG. 7B), the file name of each file is set.

そして、Ｓ５１２にて確定された各文書の区切りとなる位置やＳ５１３にて設定されたファイル名などに関する情報がＭＦＰ連携サーバ１２０に送られる（Ｓ５１４）。ＭＦＰ連携サーバ１２０はリクエストを受信すると、ＭＦＰ１１０より受信した情報に基づきファイル生成処理を開始するとともにリクエストを正常に受けたことをＭＦＰ１１０に返す。ＭＦＰ１１０は送信のレスポンスを受けると処理を終了し、Ｓ５０４のスキャン設定画面表示に戻る。 Then, the information about the position of each document delimiter decided in S512 and the file name set in S513 is sent to the MFP cooperation server 120 (S514). Upon receiving the request, MFP cooperation server 120 starts file generation processing based on the information received from MFP 110 and returns to MFP 110 that the request has been received normally. Upon receiving the transmission response, the MFP 110 ends the process and returns to the scan setting screen display of S504.

一方、ＭＦＰ連携サーバ１２０では、事前に登録されたスキャン設定からストレージサーバ１３０に送信するファイルフォーマットの情報を取得し、当該ファイルフォーマットに従ってスキャン画像からファイルを生成する（Ｓ５１５）。この際、生成されたファイルにはＳ５１３で設定されたファイル名が付されることになる。こうして生成されたスキャン画像ファイルは、ストレージサーバ１３０に送信され、保存される（Ｓ５１６）。スキャン画像ファイルを受信したストレージサーバ１３０は、ＭＦＰ連携サーバ１２０のリクエスト制御部４３１にスキャン画像ファイルの送信完了のレスポンスを返す。 On the other hand, the MFP cooperation server 120 acquires the information of the file format to be transmitted to the storage server 130 from the scan settings registered in advance, and generates a file from the scanned image according to the file format (S515). At this time, the file name set in S513 is attached to the generated file. The scanned image file thus generated is transmitted to and stored in the storage server 130 (S516). Upon receiving the scanned image file, the storage server 130 returns a scanned image file transmission completion response to the request control unit 431 of the MFP cooperation server 120 .

以上が、画像処理システム全体の処理の流れである。 The above is the processing flow of the entire image processing system.

FIG. 6 is a diagram showing an example of a scan image group obtained by continuously scanning a plurality of documents page by page. The scan image group generated by the MFP 110 in S505 consists of six pages in the order 1, 2, 3, 4, 5, 6, as shown under "page order in scan image group". The six-page scan image group consists of the scan images of three documents, referred to as document A, document B, and document C. The page images at positions 2, 3, and 5 in the scan image group are similar in layout to the page image at position 1. Hereinafter, the page image at position 1 in the scan image group is called the "scan first page", and the first page of a given document included in the scan image group is called a "document first page", to distinguish the two. In the example of FIG. 6, the page image at position 1 is both the scan first page and a document first page, and the page images at positions 2 and 5 are document first pages. The page images whose layout is similar to the scan first page each contain a "document ID" representing the identifier of the document and a "P." representing the page number within the document. On the scan first page, the value of "document ID" is 1 and the value of "P." is 1. For the pages at positions 2, 3, and 5 in the scan image group, the values of "document ID" are 2, 2, and 3, and the values of "P." are 1, 2, and 1, in that order. In general, the document identifier is the same throughout a single document, and the page numbers within a document are consecutive or increase in page order. It can also be assumed that the scan first page is very likely the document first page of one of the one or more documents included in the scan image group. The approach further rests on the premise that the document first pages of the documents included in the scan image group are often similar to the scan first page. Based on these premises, this embodiment proposes a technique for appropriately dividing (discriminating) the one or more documents included in a scan image group document by document.

Details of the processing on the divided page confirmation screen will be described with reference to the drawings. FIG. 7 shows examples of UI screens displayed on the touch panel of the MFP 110: FIG. 7(a) shows an example of the divided page confirmation screen displayed in S512, and FIG. 7(b) shows an example of the file name setting screen displayed in S513. The divided page confirmation screen and the file name setting screen are not limited to being displayed there and may instead be displayed on the client PC 111. As shown in FIG. 7(a), the divided page confirmation screen 710 displays all the pages of the scan image group obtained after scanning and image analysis processing are completed and before transmission to the storage server 130. That is, the divided page confirmation screen 710 displays a thumbnail image 711 and a corresponding page number 712 for each page of the scan image group. In addition, the divided page confirmation screen 710 displays the separation position of each document in the scan image group, as determined by the image processing in S508, so that the user can check it. The separation position of each document can be corrected by user operation. Specifically, the divided page confirmation screen 710 displays document separation lines 713 indicating the separation position of each document determined by the image processing in S508, and the separation position of each document can be corrected by a user operation such as dragging a document separation line 713. The divided page confirmation screen 710 also displays an "OK" button 717 for confirming the document separation positions. When this button is pressed by user operation, the separation positions set on the divided page confirmation screen 710 are confirmed as the document separation information.

Details of the processing on the file name setting screen will be described with reference to FIG. 7(b). As shown in FIG. 7(b), the file name setting screen 720, like the divided page confirmation screen 710, displays a thumbnail image 721 and a corresponding page number 722 for each page of the scan image group. The file name setting screen 720 also displays document separation lines 723 indicating the separation position of each document determined by the image processing in S508, and file name setting fields 724 for setting a file name for each document delimited by the document separation lines 723. In addition, the file name setting screen 720 displays an "OK" button 727 for setting the file names. When a file name setting field 724 is selected by a user operation such as tapping, an input UI screen (not shown), such as a keyboard, for entering a file name is displayed. When a character string is entered via the input UI screen, the entered character string is displayed in the file name setting field 724. After this operation has been performed for each file, pressing the "OK" button 727 by user operation causes the information about the separation position of each document in the scan image group, the file names, and so on to be transmitted to the MFP cooperation server 120.

<Image analysis processing>
Next, the image analysis processing executed by the image processing unit 432 of the MFP cooperation server 120 in S508 described above will be described with reference to the drawings. FIG. 8 is a flowchart showing the flow of the image analysis processing.

Ｓ８０１では、画像処理部４３２（画像補正部４５１）は、複数の文書をページ単位で連続してスキャンして得られたスキャン画像群に対して画像補正を行う。画像補正部４５１は、スキャン画像群の傾きなどを補正する処理を実行する。なお、スキャン画像群の傾きなどを補正する処理には、公知の技術が用いられる。 In S801, the image processing unit 432 (image correction unit 451) performs image correction on a group of scanned images obtained by continuously scanning a plurality of documents page by page. The image correction unit 451 executes processing for correcting the tilt of the scan image group. Note that a known technique is used for the process of correcting the tilt of the scan image group.

In S802, the image processing unit 432 (document first page candidate extraction unit 461) executes document first page candidate extraction processing, which extracts first page candidates for each of the plurality of documents from the scan image group after image correction. The detailed flow of this document first page candidate extraction processing will be described later with reference to the drawings.

Ｓ８０３では、画像処理部４３２（文書先頭ページ判定部４７１）は、Ｓ８０２にて抽出した先頭ページ候補に対して、文書先頭ページを判定する処理を行う。文書先頭ページ判定処理の詳細な流れについては、図を用いて後述する。 In S803, the image processing unit 432 (document first page determination unit 471) performs a document first page determination process on the first page candidate extracted in S802. A detailed flow of the document first page determination process will be described later with reference to the drawings.

Ｓ８０３の処理を完了すると、図８に示すフローを終える。 When the processing of S803 is completed, the flow shown in FIG. 8 ends.

<Document first page candidate extraction processing>
Next, the document first page candidate extraction processing executed by the image processing unit 432 in S802 described above will be described with reference to the drawings. FIG. 9 is a flowchart showing the detailed flow of the document first page candidate extraction processing.

In S901, the image processing unit 432 (document first page candidate extraction unit 461) of the MFP cooperation server 120 acquires, from the scan image group received from the MFP 110 in S506, the page image corresponding to an unprocessed page, proceeding in order from the scan first page.

Ｓ９０２では、ＭＦＰ連携サーバ１２０の画像処理部４３２（レイアウト解析部４６２）は、Ｓ９０１で取得したページ画像に対して、レイアウト解析を実施する。レイアウト解析の具体的手法は限定されないが、ここでは一例としてページ画像内に存在する文字領域の解析を行うことによって行うものとして説明を進める。レイアウト解析部４６２は、例えば、現在の処理対象であるページ画像に対して文字領域の検出処理を実行するよう文字領域検出部４５２に指示して文字領域を検出する処理を実行させる。そして、レイアウト解析部４６２は、文字領域検出処理後のページ画像のヒストグラムを抽出したり、画素の塊を抽出したりして、文字領域や図形領域など、ページ画像のレイアウトを解析する。すなわち、Ｓ９０２では、レイアウト解析部４６２は、スキャン画像群（スキャン画像）についてページ単位でレイアウトを解析するといえる。なお、Ｓ９０２では、画像処理部４３２（レイアウト解析部４６２）は、文字領域検出部４５２を制御する制御部として機能するともいえる。 In S902, the image processing unit 432 (layout analysis unit 462) of the MFP cooperation server 120 performs layout analysis on the page image acquired in S901. Although the specific method of layout analysis is not limited, here, as an example, a description will be given assuming that the layout analysis is performed by analyzing a character area existing in a page image. For example, the layout analysis unit 462 instructs the character area detection unit 452 to execute character area detection processing on the page image that is the current processing target, and causes the character area detection unit 452 to execute the character area detection process. Then, the layout analysis unit 462 extracts a histogram of the page image after the character area detection process, extracts a cluster of pixels, and analyzes the layout of the page image such as the character area and the figure area. That is, in S902, the layout analysis unit 462 can be said to analyze the layout of the scanned image group (scanned image) page by page. It can be said that the image processing unit 432 (layout analysis unit 462) functions as a control unit that controls the character area detection unit 452 in S902.

Ｓ９０３では、画像処理部４３２は、Ｓ９０２の処理で得たレイアウト解析結果を、レイアウト解析情報としてデータ管理部４３４に保存する。 In S903, the image processing unit 432 stores the layout analysis result obtained in the processing of S902 in the data management unit 434 as layout analysis information.

Ｓ９０４では、画像処理部４３２は、Ｓ９０２でレイアウト解析を実施したページ画像がスキャン先頭ページであるか否かを判定する。スキャン先頭ページであるとの判定結果を得た場合（Ｓ９０４のＹＥＳ）、処理がＳ９０７に移行される。他方、Ｓ９０２でレイアウト解析を実施したページ画像が文書の２ページ目以降のページ画像であり、スキャン先頭ページではないとの判定結果を得た場合（Ｓ９０４のＮＯ）、処理がＳ９０５に移行される。 In S904, the image processing unit 432 determines whether or not the page image for which layout analysis has been performed in S902 is the scan first page. If it is determined that the page is the scan first page (YES in S904), the process proceeds to S907. On the other hand, if it is determined that the page image subjected to the layout analysis in S902 is the page image of the second and subsequent pages of the document and is not the scan first page (NO in S904), the process proceeds to S905. .

Ｓ９０５では、画像処理部４３２（類似度算出部４６３）は、先ず、１ページ目（スキャン先頭ページ）の画像に対応し、Ｓ９０２の処理で得たレイアウト解析結果をデータ管理部４３４より取得する。そして、画像処理部４３２（類似度算出部４６３）は、取得したレイアウト解析結果と、現在の処理対象であるページ画像に対するレイアウト解析結果とを比較し、比較結果を基に類似度を導出する。類似度の導出について、具体的な方法は限定しないが、ここでは一例としてレイアウト解析結果として得られた画像内に存在する文字領域を示す座標群のそれぞれについて、各領域の重なり面積を類似度として導出するものとする。 In S905, the image processing unit 432 (similarity calculation unit 463) first acquires from the data management unit 434 the layout analysis result obtained in the processing of S902 corresponding to the image of the first page (scan first page). Then, the image processing unit 432 (similarity calculation unit 463) compares the acquired layout analysis result with the layout analysis result of the page image currently being processed, and derives the similarity based on the comparison result. Although the specific method for deriving the similarity is not limited, here, as an example, for each of the coordinate groups indicating the character regions existing in the image obtained as the layout analysis result, the overlapping area of each region is used as the similarity. shall be derived.
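The overlap-area similarity of S905 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Region` tuple layout and the function names are assumptions, and the normalization by the larger total region area (so that identical layouts score 1.0) is added here for convenience, since the description only states that the overlapping area of the regions is used as the similarity. The sketch also assumes that the character regions within one page do not overlap each other.

```python
from typing import List, Tuple

# Hypothetical region layout: (x, y, width, height), top-left origin, pixels.
Region = Tuple[int, int, int, int]


def intersection_area(a: Region, b: Region) -> int:
    """Area of the rectangle shared by two character regions (0 if none)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return w * h if w > 0 and h > 0 else 0


def layout_similarity(first_page: List[Region], current: List[Region]) -> float:
    """Total overlap between the two pages' character regions.

    Dividing by the larger page's total region area is an assumption of
    this sketch; it keeps the score in [0, 1] for non-overlapping regions.
    """
    overlap = sum(intersection_area(a, b) for a in first_page for b in current)
    total_first = sum(w * h for _, _, w, h in first_page)
    total_current = sum(w * h for _, _, w, h in current)
    denominator = max(total_first, total_current)
    return overlap / denominator if denominator else 0.0
```

A score computed this way can then be compared against the predetermined threshold used in S906.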

In S906, the image processing unit 432 (document first page candidate extraction unit 461) determines whether the page image currently being processed is similar to the image of the first page (scan first page), according to whether the similarity calculated in S905 exceeds a predetermined threshold. If the similarity exceeds the predetermined threshold and the page image currently being processed is determined to be similar to the image of the first page (YES in S906), the process proceeds to S907. If the similarity does not exceed the predetermined threshold and the page image currently being processed is determined not to be similar to the image of the first page (NO in S906), the process proceeds to S908.

Ｓ９０７では、画像処理部４３２（文書先頭ページ候補抽出部４６１）は、現在の処理対象であるページ画像を文書先頭ページ候補として登録する。すなわち、Ｓ９０７では、画像処理部４３２は、Ｓ９０５で算出した類似度に基づき、スキャン画像から、複数の文書の各文書における文書先頭ページ候補を抽出しているといえる。 In S907, the image processing unit 432 (document first page candidate extraction unit 461) registers the page image currently being processed as a document first page candidate. That is, in S907, the image processing unit 432 can be said to extract document first page candidates for each of the plurality of documents from the scanned image based on the degree of similarity calculated in S905.

Ｓ９０８では、画像処理部４３２は、現在の処理対象であるページ画像を処理済みとして登録する。これらＳ９０７、Ｓ９０８による登録処理は、例えば、下記の表１に示すようなページ管理リストをデータ管理部４３４に保持することで行ってもよい。表１に示すページ管理リストでは、スキャン画像群に対するページ番号と、各ページに対する文書先頭ページ候補抽出処理の処理状況と、文書先頭ページ候補か否かを示すフラグ情報とが管理されている。さらに、ページ管理リストでは、各ページに対する文書先頭ページ判定処理の処理状況と、文書先頭ページか否かを示すフラグ情報も管理されている。 In S908, the image processing unit 432 registers the page image that is the current processing target as processed. The registration processing in S907 and S908 may be performed by holding a page management list as shown in Table 1 below in the data management unit 434, for example. The page management list shown in Table 1 manages the page number of the scanned image group, the processing status of document first page candidate extraction processing for each page, and flag information indicating whether or not the page is a document first page candidate. Further, the page management list manages the processing status of document first page determination processing for each page and flag information indicating whether or not the page is the document first page.

In the page management list, the document first page candidate flag is set to "1" for each page number (page image) registered as a document first page candidate by the processing of S907, and to "0" for every other page number (page image). That is, a page number for which S907 was skipped, and which is therefore not registered as a document first page candidate, has a document first page candidate flag of "0". Likewise, the processing status of the document first page candidate extraction processing is set to "completed" for each page number (page image) registered as processed by the processing of S908, and to "unprocessed" for every other page number (page image). That is, a page number that has not yet been handled by S908 has a processing status of "unprocessed" for the document first page candidate extraction processing.

The page management list also manages the processing status of the document first page determination processing, described later with reference to FIG. 10, and the document first page flag registered as its result. These take values in the same way as for the document first page candidate extraction processing. That is, the document first page flag is set to "1" for each page number (page image) registered as a document first page, and to "0" for every other page number (page image). The processing status of the document first page determination processing is set to "completed" for each page number (page image) registered as processed, and to "unprocessed" for every other page number (page image).

表１の例では、９ページで構成されるスキャン画像群のうち、３ページ目まで図９に示す文書先頭ページ候補抽出処理が完了しており、そのうち１ページ目及び３ページ目が文書先頭ページ候補として抽出されたことを示している。 In the example of Table 1, the document first page candidate extraction process shown in FIG. 9 has been completed up to the third page of the scan image group composed of nine pages, and the first and third pages are document first page candidates. It shows that it was extracted as a candidate.
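The page management list described above can be pictured as one record per page. The following sketch reproduces the state described for the Table 1 example (nine pages, candidate extraction completed up to page 3, pages 1 and 3 registered as candidates). The class and field names are hypothetical; only the four managed items, the flag values "1"/"0", and the status values "completed"/"unprocessed" come from the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageRecord:
    """One row of the page management list (field names are assumptions)."""
    page_number: int
    candidate_status: str = "unprocessed"      # candidate extraction status
    is_candidate: int = 0                      # document first page candidate flag
    determination_status: str = "unprocessed"  # first page determination status
    is_first_page: int = 0                     # document first page flag


def new_page_list(page_count: int) -> List[PageRecord]:
    """Create the list with every page still unprocessed."""
    return [PageRecord(number) for number in range(1, page_count + 1)]


# State described for the Table 1 example.
pages = new_page_list(9)
for number, flag in [(1, 1), (2, 0), (3, 1)]:
    record = pages[number - 1]
    record.is_candidate = flag               # S907 registers candidates
    record.candidate_status = "completed"    # S908 marks the page processed
```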

Returning to the description of FIG. 9, in S909 the image processing unit 432 determines, from the page management list shown in Table 1, whether there is a page following the page that was just processed. If it is determined that a next page exists (YES in S909), the process returns to S901, and the series of processes from S901 to S909 is performed on the page image of the next page. If it is determined that no next page exists (NO in S909), the flow shown in FIG. 9 ends.

以上、図９を用いて説明したフローを実行することにより、Ｓ５０５において生成された複数ページで構成されるスキャン画像群から、複数の文書の各文書における先頭ページ候補が抽出されることになる。 As described above, by executing the flow described with reference to FIG. 9, first page candidates in each document of a plurality of documents are extracted from the scan image group composed of a plurality of pages generated in S505.
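The loop of FIG. 9 (S901 to S909) can be condensed into the following sketch. It is an illustration under assumptions: the per-page layout analysis results are passed in as an already computed sequence, and `similarity` and `threshold` are supplied by the caller, since the description leaves the concrete similarity measure and threshold open. The function name is hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

Layout = TypeVar("Layout")


def extract_first_page_candidates(
    layouts: Sequence[Layout],
    similarity: Callable[[Layout, Layout], float],
    threshold: float,
) -> List[int]:
    """Return 1-based page numbers registered as document first page candidates."""
    candidates: List[int] = []
    for index, layout in enumerate(layouts):
        if index == 0:
            # S904 YES: the scan first page is always registered (S907).
            candidates.append(1)
            continue
        # S905: compare with the scan first page; S906: threshold test.
        if similarity(layouts[0], layout) > threshold:
            candidates.append(index + 1)
    return candidates


# Toy stand-in for FIG. 6: pages 2, 3 and 5 share the layout of page 1.
toy_layouts = ["form", "form", "form", "text", "form", "text"]
print(extract_first_page_candidates(
    toy_layouts, lambda a, b: 1.0 if a == b else 0.0, 0.5))  # [1, 2, 3, 5]
```

Note that page 3 is still only a candidate at this stage; whether it is actually a document first page is decided by the subsequent determination processing.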

<Document first page determination processing>
Next, the document first page determination processing executed by the image processing unit 432 in S803 described above will be described with reference to the drawings. FIG. 10 is a flowchart showing the detailed flow of the document first page determination processing.

In S1001, the image processing unit 432 of the MFP cooperation server 120 acquires the page management list shown in Table 1 from the data management unit 434 and, among the pages whose document first page determination processing status is "unprocessed", acquires the management data corresponding to the page with the smallest page number.

In S1002, the image processing unit 432 (document first page determination unit 471) refers to the document first page candidate flag in the management data acquired in S1001 to determine whether the page number currently being processed is a document first page candidate. If the document first page candidate flag is "0" and the page number currently being processed is determined not to be a document first page candidate (NO in S1002), the process proceeds to S1009. If the document first page candidate flag is "1" and the page number currently being processed is determined to be a document first page candidate (YES in S1002), the process proceeds to S1003.

Ｓ１００３では、画像処理部４３２（文書先頭ページ判定部４７１）は、ページ管理リストを参照し、現在の処理対象であるページ番号の１つ前のページ番号が存在するか否かを判定する。現在の処理対象であるページ番号の１つ前のページ番号が存在するとの判定結果を得た場合（Ｓ１００３のＹＥＳ）、処理がＳ１００４に移行される。他方、現在の処理対象であるページ番号の１つ前のページ番号が存在しないとの判定結果を得た場合（Ｓ１００３のＮＯ）、処理がＳ１００８に移行される。 In S1003, the image processing unit 432 (document first page determination unit 471) refers to the page management list and determines whether or not there is a page number immediately preceding the page number currently being processed. If it is determined that there is a page number immediately preceding the current page number to be processed (YES in S1003), the process proceeds to S1004. On the other hand, if it is determined that there is no page number immediately preceding the page number currently being processed (NO in S1003), the process proceeds to S1008.

In S1004, the image processing unit 432 (document first page determination unit 471) refers to the page management list and acquires the value of the document first page candidate flag assigned to the page number immediately preceding the page number currently being processed.

In S1005, the image processing unit 432 (document first page determination unit 471) determines whether the document first page candidate flag acquired in S1004 for the immediately preceding page number is "1". If the value is "1" (YES in S1005), the process proceeds to S1006. If the value is "0" (NO in S1005), the process proceeds to S1008.

In S1006, the image processing unit 432 (document first page determination unit 471) instructs the OCR processing unit 453 to perform OCR processing on the page image of the page number currently being processed, so that character recognition is executed on the character regions detected in S902. In S1006, the image processing unit 432 (document first page determination unit 471) can therefore also be said to function as a control unit that controls the OCR processing unit 453.

The OCR processing result (character recognition result) obtained in S1006 is described with reference to Table 2 below, which shows an example of an OCR processing result. For each character region subjected to character recognition, the OCR processing result contains a region number, an X coordinate, a Y coordinate, a width, a height, and the in-region character string. The region number, coordinates (X coordinate, Y coordinate), width, and height of each character region are acquired in S902 by the image processing unit 432 (character region detection unit 452). The "X coordinate" and "Y coordinate" are the coordinates of the upper-left corner of the character region, the "width" is the number of pixels of the region in the X (width) direction, and the "height" is the number of pixels of the region in the Y (height) direction.
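The record layout described for Table 2 could be modelled roughly as follows. This is only an illustrative sketch: the class name `OcrRegion`, the field names, and the coordinate values are assumptions made for this example, and only the set of fields and the string for region number 4 come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrRegion:
    """One row of an OCR processing result (hypothetical field names)."""
    region_no: int  # region number assigned during layout analysis (S902)
    x: int          # X coordinate of the upper-left corner, in pixels
    y: int          # Y coordinate of the upper-left corner, in pixels
    width: int      # number of pixels in the X (width) direction
    height: int     # number of pixels in the Y (height) direction
    text: str       # in-region character string produced by OCR (S1006)

    def bottom_right(self):
        """Lower-right corner derived from the stored corner and size."""
        return (self.x + self.width, self.y + self.height)


# Coordinates are made up for illustration; the text is the region 4 example.
sample = OcrRegion(region_no=4, x=100, y=200, width=300, height=40,
                   text="見積番号: R12-3456")
```

Keeping geometry and text in the same record reflects the description: the geometry is already known from S902, and OCR only fills in the text.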

In S1007, the image processing unit 432 (document first page determination unit 471) determines, based on the OCR result (character recognition result) obtained in S1006, whether the page image currently being processed is a document first page. In this embodiment, the determination is made by comparing the OCR result of the current page image acquired in S1006 with the OCR result of the page image of the immediately preceding page number. The determination method is not limited to this; any method that uses the OCR result of the page image may be used.

<Document First Page Determination Processing Based on OCR Results>
Next, the document first page determination processing based on OCR results, which the image processing unit 432 (document first page determination unit 471) executes in S1007 described above, is explained with reference to the drawings. FIG. 11 is a flowchart showing the detailed flow of this processing.

In S1101, the image processing unit 432 acquires the OCR result obtained in S1006 for the page image currently being processed. Assume, for example, that the OCR result acquired in S1101 is the OCR processing result shown in Table 2.

In S1102, the image processing unit 432 acquires, from the data management unit 434, the OCR result of the page image immediately preceding the page image currently being processed. Assume, for example, that this is the OCR result shown in Table 3 below. Compared with the OCR result in Table 2, the in-region character strings (OCR character strings) in Table 3 match for region numbers 1 to 5 but differ for region numbers 6 and 7.

In S1103, from the OCR result acquired in S1101 for the page image currently being processed, the image processing unit 432 acquires, among the in-region character strings (OCR character strings) not yet acquired, the one with the smallest region number. For the OCR result shown in Table 2, for example, it first acquires "見積書" (quotation), the in-region character string for region number 1. When the process passes through S1105, described later, and returns to S1103, the region number has been incremented and the in-region character string for the next region number is acquired.

In S1104, the image processing unit 432 determines whether the in-region character string (OCR character string) acquired in S1103 matches a same-document determination rule, described below.

The same-document determination rules are described with reference to Table 4 below, which shows an example of rules set in advance for use in the determination of S1104. Each rule consists of a rule ID identifying the rule, a determination token, which is a character string used in the same-document determination, and a determination condition that specifies the condition under which two pages are judged to belong to the same document for that token. The determination condition for each token is either "consecutive" or "same", and this value defines the condition applied in the same-document determination of S1110, described later. Specifically, when the condition is "consecutive", the determination is based on whether the two in-region character strings (OCR character strings) being compared are in a consecutive relationship. When the condition is "same", the determination is based on whether the two in-region character strings are identical. For example, given the two in-region character strings "Page 1" and "Page 2", their values are consecutive, so the pages are judged to belong to the same document when the condition is "consecutive". As another example, given the two in-region character strings "文書番号A-123" and "文書番号A-123" (document number A-123), the strings are identical, so the pages are judged to belong to the same document when the condition is "same".
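The two determination conditions can be sketched as follows. The description does not say how a "consecutive" relationship is detected, so comparing the trailing integers of the two strings is an assumption made here, and the names `trailing_number` and `satisfies_condition` are hypothetical.

```python
import re


def trailing_number(text):
    """Return the integer at the end of text, or None if there is none."""
    match = re.search(r"(\d+)\s*$", text)
    return int(match.group(1)) if match else None


def satisfies_condition(condition, prev_text, curr_text):
    """Return True when the two OCR strings indicate the same document."""
    if condition == "same":
        # "same": the two in-region character strings must be identical.
        return prev_text == curr_text
    if condition == "consecutive":
        # "consecutive" (assumed reading): the trailing number grows by one.
        prev_no = trailing_number(prev_text)
        curr_no = trailing_number(curr_text)
        return prev_no is not None and curr_no is not None and curr_no == prev_no + 1
    raise ValueError(f"unknown determination condition: {condition!r}")
```

With this sketch, "Page 1" followed by "Page 2" satisfies "consecutive", and two identical document-number strings satisfy "same", matching the two examples in the text.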

The determination in S1104 is made, for example, according to whether the in-region character string (OCR character string) contains any of the determination tokens shown in Table 4. For example, among the OCR results shown in Table 2, the in-region character string "見積番号: R12-3456" for region number 4 contains the determination token "見積番号" (quotation number) of rule ID 7 in Table 4, whereas the other in-region character strings contain no determination token. If the in-region character string contains a determination token (YES in S1104), the process proceeds to S1107. If it contains none (NO in S1104), the process proceeds to S1105.
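The token check of S1104 amounts to a substring search over the rule table. In the sketch below, only rule ID 7 ("見積番号", condition "same") is taken from the description; the other entries, their rule IDs, and the function name `find_rule` are hypothetical.

```python
# Hypothetical rule table in the spirit of Table 4.
RULES = [
    {"rule_id": 1, "token": "Page", "condition": "consecutive"},  # assumed ID
    {"rule_id": 2, "token": "文書番号", "condition": "same"},      # assumed ID
    {"rule_id": 7, "token": "見積番号", "condition": "same"},      # from the text
]


def find_rule(ocr_text, rules=RULES):
    """Return the first rule whose token occurs in ocr_text, else None."""
    for rule in rules:
        if rule["token"] in ocr_text:
            return rule
    return None
```

A plain substring test is enough here because the tokens are label prefixes such as "見積番号" that appear verbatim in the OCR string.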

In S1105, the image processing unit 432 determines whether a next in-region character string (OCR character string) exists, that is, one whose region number is one greater than that of the in-region character string acquired in S1103. If it exists (YES in S1105), the process returns to S1103, and the processing of S1103 is performed on the in-region character string for the incremented (next) region number. If it does not exist (NO in S1105), the process proceeds to S1106.

In S1107, the image processing unit 432 acquires the determination condition corresponding to the determination token that was found in the in-region character string (OCR character string) in S1104. For the in-region character string "見積番号: R12-3456" of region number 4 in Table 2, for example, it acquires "same", the determination condition of rule ID 7 in Table 4.

In S1108, the image processing unit 432 determines whether the OCR result acquired in S1102 for the immediately preceding page image also contains an in-region character string that includes the determination token found in S1104. In other words, it determines whether the OCR result of the preceding page image also matches the same-document determination rule. If the OCR result of the preceding page image does not match the rule (NO in S1108), the process proceeds to S1106. If it matches (YES in S1108), the process proceeds to S1109.

In S1109, the image processing unit 432 acquires, from the OCR result of the immediately preceding page image, the OCR character string that was determined to match the same-document determination rule. For example, the determination token "見積番号" of rule ID 7 is contained in the OCR character string "見積番号: R12-3456" of region number 4 in the OCR result shown in Table 3. In this example, therefore, the determination in S1108 is YES, the process proceeds to S1109, and "見積番号: R12-3456" is acquired as the OCR character string in S1109.

In S1110, the image processing unit 432 determines whether the current page image and the immediately preceding page image belong to the same document. The determination uses the OCR character string acquired in S1103 for the current page image, the same-document determination condition acquired in S1107, and the OCR character string acquired in S1109 for the preceding page image, and follows the procedure described above with reference to Table 4. If the pages are determined to belong to the same document (YES in S1110), the process proceeds to S1111. If not (NO in S1110), the process proceeds to S1106.

In S1106, the image processing unit 432 determines that the current page image is a document first page and registers it as such. In S1111, on the other hand, the image processing unit 432 determines that the current page image is not a document first page and registers it as such. More specifically, S1106 registers "1" as the value of the "document first page flag" in the page management list shown in Table 1, and S1111 registers "0" as that value.

When the processing of S1106 or S1111 is completed, the flow shown in FIG. 11 ends.
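Taken together, steps S1101 to S1111 can be sketched as a single function. This is a simplified illustration rather than the claimed implementation: the rule table is reduced to a token-to-condition mapping, the "consecutive" test compares trailing integers (an assumption), and all identifiers are hypothetical.

```python
import re

# Hypothetical rules: determination token -> determination condition.
RULES = {"見積番号": "same", "Page": "consecutive"}


def _trailing_number(text):
    match = re.search(r"(\d+)\s*$", text)
    return int(match.group(1)) if match else None


def _satisfies(condition, prev_text, curr_text):
    if condition == "same":
        return prev_text == curr_text
    prev_no, curr_no = _trailing_number(prev_text), _trailing_number(curr_text)
    return prev_no is not None and curr_no is not None and curr_no == prev_no + 1


def is_first_page(curr_texts, prev_texts, rules=RULES):
    """Return True for S1106 (first page) and False for S1111 (not first page).

    curr_texts and prev_texts are the in-region character strings of the
    current and preceding page, ordered by region number.
    """
    for text in curr_texts:  # S1103 / S1105: ascending region numbers
        token = next((t for t in rules if t in text), None)  # S1104
        if token is None:
            continue
        condition = rules[token]  # S1107
        prev_text = next((p for p in prev_texts if token in p), None)  # S1108 / S1109
        if prev_text is None:
            return True  # S1108 NO -> S1106
        return not _satisfies(condition, prev_text, text)  # S1110
    return True  # S1105 NO: no region matched any rule -> S1106
```

As in the flowchart, the loop stops at the first region that matches a rule, and a page with no matching region is treated as a document first page.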

Returning to the description of FIG. 10, in S1008 the image processing unit 432 determines that the current page image is a document first page and, as in S1106 described above, registers it as such. More specifically, S1008 registers "1" as the value of the "document first page flag" in the page management list shown in Table 1.

In S1009, as in S909 described above, the image processing unit 432 determines whether a page following the page currently being processed exists. If a next page exists (YES in S1009), the process returns to S1001, and the series of processes from S1001 to S1009 is executed on the page image of the next page. If no next page exists (NO in S1009), the flow shown in FIG. 10 ends.
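The page loop of FIG. 10 can be sketched as below. The sketch assumes that a page whose candidate flag is "0" keeps a first page flag of "0" (the description does not state the initial value explicitly), and `ocr_says_first_page` is a hypothetical callback standing in for S1006 and S1007.

```python
def determine_first_pages(candidate_flags, ocr_says_first_page):
    """Return document first page flags for pages in scan order.

    candidate_flags: list of 0/1 document first page candidate flags.
    ocr_says_first_page: callable taking a 0-based page index and returning
        True when the OCR comparison judges the page to be a first page.
    """
    first_flags = []
    for index, candidate in enumerate(candidate_flags):
        if candidate == 0:
            first_flags.append(0)  # S1002 NO: not a candidate
        elif index == 0 or candidate_flags[index - 1] == 0:
            first_flags.append(1)  # S1003 NO or S1005 NO -> S1008
        else:
            # S1005 YES: two candidates in a row, so OCR decides (S1006, S1007).
            first_flags.append(1 if ocr_says_first_page(index) else 0)
    return first_flags
```

Under this reading, OCR runs only on a candidate page that directly follows another candidate page, which is where the layout similarity alone cannot separate a new document from a continuation.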

The flows shown in FIG. 8 to FIG. 11 complete the image analysis processing of S508, and the image processing unit 432 returns the analysis processing result (analysis information) to the request control unit 431. The analysis processing result returned to the request control unit 431 includes the page management list described with reference to Table 1. Assume, for example, that the page management list shown in Table 5 below is finally included as part of the analysis processing result (analysis information). In Table 5, the pages with page numbers 1, 3, 4, 5, and 7 are first extracted as document first page candidates by layout analysis in the first page candidate determination processing described above (part of S508). Then, in the document first page determination processing described above (part of S508), a detailed determination is made using the OCR results obtained by performing OCR processing on those candidates, and page numbers 1, 3, 5, and 7 are determined to be the first pages of the respective documents. In S512, based on this result, the divided page confirmation screen 710 is displayed with a document separation line 713 drawn at each page boundary where the value of the document first page flag changes from "0" to "1".
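The placement of document separation lines can be illustrated with a short sketch. It reads "the boundary where the flag changes from 0 to 1" literally, and the function name and return convention (the 1-based number of the page before each line) are assumptions for this example.

```python
def separator_positions(first_page_flags):
    """Return the 1-based page numbers after which a separation line is drawn.

    A line is placed wherever the document first page flag goes from 0 on
    one page to 1 on the next page.
    """
    return [
        page_no
        for page_no in range(1, len(first_page_flags))
        if first_page_flags[page_no - 1] == 0 and first_page_flags[page_no] == 1
    ]


# Flags for pages 1 to 7 when pages 1, 3, 5 and 7 are document first pages.
positions = separator_positions([1, 0, 1, 0, 1, 0, 1])
```

For the Table 5 example this yields lines after pages 2, 4, and 6, that is, immediately before pages 3, 5, and 7.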

As described above, according to this embodiment, when determining the document-by-document breaks in a group of scanned images obtained by continuously scanning, page by page, a plurality of documents each consisting of a plurality of pages, detailed analysis using OCR processing is performed only on pages that layout analysis finds similar to the first scanned page. This keeps the number of pages subjected to OCR processing to a minimum while maintaining the accuracy of the automatic determination of document breaks (division positions), and the shorter processing time improves responsiveness for the user. In other words, the document-by-document breaks in scanned images obtained by continuously scanning a plurality of documents can be determined while processing cost is kept down.

[Second Embodiment]
This embodiment describes, with reference to the drawings, a mode that makes use of the history of divided pages confirmed by the user. Descriptions of configurations and processing procedures that are the same as in the first embodiment are omitted, and only the differences are described.

<Processing Flow of the Entire Image Processing System>
First, the processing flow of the entire image processing system according to this embodiment is described with reference to the drawings. FIG. 12 is a sequence diagram showing the flow of processing between the devices when the MFP 110 continuously scans a plurality of documents page by page, converts the obtained scanned images into files, and saves (transmits) them to the storage server 130. The description here focuses on the exchanges between the devices. Although the sequence diagram of FIG. 12 describes the case where the MFP 110 communicates with the MFP cooperation server 120, the acquisition of analysis results, the screen display, and so on, described later, may instead be performed by the client PC 111.

In S1201, the request control unit 431 of the MFP cooperation server 120 requests the image processing unit 432 to register the division result of the scan image group confirmed by the user in S512. The information to be registered is, for example, data (hereinafter referred to as division confirmation data) in which the page management list at the time the division result was finally confirmed in S512 is associated with the layout analysis result obtained in S902 for the first page image of the scan image group. Suppose, for example, that the divided page confirmation screen displayed in S512 is in the state shown in FIG. 7(a), that the user then corrects the division as shown in FIG. 13, and that the division is confirmed by pressing the "OK" button 717. Specifically, a document separation line 1301 is added between the third and fourth pages by user operation. As a result, in the page management list for the scan image group, the document first page flag for page number 4 is changed from "0" to "1", as shown in Table 6 below. The page management list in the division confirmation data whose registration is requested in S1201 is thus the list as confirmed by the user through S512. On the divided page confirmation screen 710 shown in FIG. 13, the document separation lines can be adjusted by user operation.

Returning to the description of FIG. 12, in S1202 the image processing unit 432 adds the division confirmation data received from the request control unit 431 in S1201, as history information, to the division history data registered in the data management unit 434. The division history data is an accumulation of the division confirmation data for each of a plurality of scan image groups. If no division history data is registered in the data management unit 434 yet, the division confirmation data received in S1201 is registered in the data management unit 434 as new division history data.

Next, the document first page determination processing that uses the division history data registered in this way is described.

<Document First Page Determination Processing>
FIG. 14 is a flowchart showing the detailed flow of the document first page determination processing according to this embodiment. This flow corresponds to the processing of S803 in FIG. 8.

In S1401, the image processing unit 432 of the MFP cooperation server 120 uses the division history data described above to determine whether OCR processing needs to be performed in order to determine the document first pages.

<OCR Necessity Determination Processing Based on History>
The OCR necessity determination processing based on history is described with reference to the drawings. FIG. 15 is a flowchart showing the detailed flow of the history-based OCR necessity determination processing of S1401.

In S1501, the image processing unit 432 of the MFP cooperation server 120 acquires the layout analysis result (layout analysis data) for the first scanned page that was saved in the data management unit 434 in S903. In the subsequent S1502, the image processing unit 432 searches the division history data saved in the data management unit 434 for entries whose layout analysis data is similar to the layout analysis data acquired in S1501, and acquires them.

In S1503, the image processing unit 432 (nonconformance rate calculation unit 472) creates a concatenated list by joining the page management lists contained in the plurality of division history data entries acquired in S1502. From the values of the document first page candidate flag and the document first page flag in the concatenated list, it then calculates the nonconformance rate between the two flags. As one example of the calculation, the exclusive OR of the document first page candidate flag and the document first page flag is taken for each page of the concatenated list, and the sum of the exclusive ORs divided by the sum (total count) of the document first page candidate flags is used as the nonconformance rate. As a concrete case, if the concatenated list and exclusive ORs shown in Table 7 below are obtained during S1503, then (sum of document first page candidate flags) = 5 and (sum of exclusive ORs) = 1, so the nonconformance rate is 1/5. Calculating the nonconformance rate in this way yields, from the division history data, an index of the degree of mismatch between the document first page candidates and the document first pages confirmed by user operation in S512. The calculation method is therefore not limited to the one above; any method that yields such an index may be used.
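The example calculation can be written out as a short sketch. Table 7 is not reproduced here, so the flag values below are hypothetical and were chosen only to give the totals stated in the text (candidate sum 5, exclusive OR sum 1); the function name and the handling of an empty candidate set are likewise assumptions.

```python
from fractions import Fraction


def nonconformance_rate(pages):
    """Compute sum(candidate XOR first) / sum(candidate) over all pages.

    pages: iterable of (candidate_flag, first_page_flag) pairs, each 0 or 1,
        taken from the concatenated page management lists.
    """
    pages = list(pages)
    candidate_total = sum(candidate for candidate, _ in pages)
    if candidate_total == 0:
        # Not covered by the description; rejected here as an assumption.
        raise ValueError("no document first page candidates in the history")
    mismatch_total = sum(candidate ^ first for candidate, first in pages)
    return Fraction(mismatch_total, candidate_total)


# Hypothetical history: five candidate pages, one of which was not a first page.
history = [(1, 1), (0, 0), (1, 1), (1, 0), (1, 1), (0, 0), (1, 1)]
rate = nonconformance_rate(history)
```

Using `Fraction` keeps the result exact, so it can be compared with a threshold such as 1/2 without floating-point rounding.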

In S1504, the image processing unit 432 determines whether the nonconformance rate calculated in S1503 is greater than a predetermined threshold. The threshold may be, for example, 1/2 (50%), but is not limited to this value. If the nonconformance rate is greater than the threshold (YES in S1504), the process proceeds to S1505. If the nonconformance rate is equal to or less than the threshold (NO in S1504), the process proceeds to S1506.

In S1505, the image processing unit 432 determines that OCR processing is necessary for determining the division of the scan image group. In S1506, the image processing unit 432 determines that OCR processing is not necessary for determining the division of the scan image group. When the processing of S1505 or S1506 is completed, the flow shown in FIG. 15 ends.

Returning to the description of FIG. 14, in S1402 the image processing unit 432 determines, based on the result of the determination processing in S1401, whether OCR processing needs to be performed on the scan image group. If OCR processing is determined to be necessary (YES in S1402), the process proceeds to S1001. If OCR processing is determined to be unnecessary (NO in S1402), the process proceeds to S1403.

In S1403, the image processing unit 432 sets the value of the document first page candidate flag obtained through the processing of FIG. 9 directly as the document first page flag of each page. When the processing of S1403 is completed, the flow shown in FIG. 14 ends.
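The branch formed by S1504 and S1402 to S1403 can be sketched as follows. The threshold default of 1/2 is the example value from the text; `run_ocr_determination` is a hypothetical callback standing in for the flow of FIG. 10, and the function name is an assumption.

```python
from fractions import Fraction


def decide_first_page_flags(candidate_flags, rate, run_ocr_determination,
                            threshold=Fraction(1, 2)):
    """Choose between the OCR-based flow and reusing the candidate flags.

    rate: nonconformance rate computed from the division history (S1503).
    run_ocr_determination: callable that takes the candidate flags and
        returns first page flags using OCR (stands in for FIG. 10).
    """
    if rate > threshold:
        # S1504 YES -> S1505 -> S1402 YES: history shows many corrections.
        return run_ocr_determination(candidate_flags)
    # S1504 NO -> S1506 -> S1402 NO -> S1403: copy the candidate flags as-is.
    return list(candidate_flags)
```

A rate exactly equal to the threshold takes the no-OCR branch, since the description routes "equal to or less than" to S1506.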

As described above, according to this embodiment, when the division history data confirmed by the user shows only a small discrepancy between the document first page candidates and the confirmed document first pages, a highly reliable division result can be presented to the user without performing OCR processing. Compared with the case where the division history data is not used, the processing time is shortened further, so responsiveness for the user can be improved further.

[Third Embodiment]
This embodiment describes, with reference to the drawings, a mode in which, even when OCR processing is found to be necessary for determining the document first pages, OCR processing is performed only on the minimum necessary regions. Descriptions of configurations and processing procedures that are the same as in the first and second embodiments are omitted, and only the differences are described.

In this embodiment, within the above-described division confirmation data, the value of a "division determination use flag" (area information) is added to the layout analysis result obtained in S902 for the image of the first page of the scan image group. Table 8 below shows an example of the layout analysis result included in the division confirmation data in this embodiment. In this layout analysis result, the division determination use flag corresponding to area number 4 is registered as "1", and the flags corresponding to the other area numbers are registered as "0". How the value of the division determination use flag is set and used in this embodiment will be described later together with the detailed processing procedure. Although Table 8 lists the character string in each area, these values are given only for clarity of explanation and need not be included.

<Document First Page Determination Processing Based on OCR Results>
Next, the document first page determination processing based on OCR results, which is executed by the image processing unit 432 (document first page determination unit 471) in S1007 in this embodiment, will be described with reference to the drawings. FIG. 16 is a flowchart showing the detailed flow of the document first page determination processing based on OCR results. This flowchart is executed by the MFP cooperation server 120.

In S1601, the image processing unit 432 (division determination use flag setting unit 473) sets to "1" the division determination use flag of each area containing an in-area character string (OCR character string) that includes a determination token determined in S1104 to be included in the same document determination rule. For example, assume that, as in the first embodiment, a layout analysis result having the areas shown in Table 2 has been obtained, and that the same document determination rule is the same as the one explained using Table 4. In this case, the area containing the in-area character string (OCR character string) that includes the determination token determined in S1104 to be included in the same document determination rule is the area corresponding to area number 4. Therefore, in S1601, the division determination use flag of the area corresponding to area number 4 is set to "1", and the layout analysis data is in the state shown in Table 8. The layout analysis data including the division determination use flag set in this way is transmitted to the MFP 110 via the request control unit 431 as the image analysis result of S508. Then, through S512, S513, S514, S1201, and S1202, it is stored in the data management unit 434 as division history data.
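The flag setting of S1601 can be sketched as follows. The `Region` record, its field names, and the sample strings are hypothetical stand-ins; the actual contents of Tables 2, 4, and 8 are not reproduced here. The sketch assumes that a determination token matches an area when it occurs as a substring of that area's OCR character string.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Region:
    number: int                # area number in the layout analysis result
    ocr_text: str              # in-area character string (OCR character string)
    use_for_division: int = 0  # division determination use flag ("0" or "1")


def set_division_use_flags(regions: List[Region],
                           matched_tokens: Iterable[str]) -> None:
    """Set the flag to 1 for every area whose OCR string contains a token
    that was found in the same document determination rule (S1104)."""
    tokens = [t for t in matched_tokens if t]  # ignore empty tokens
    for region in regions:
        hit = any(token in region.ocr_text for token in tokens)
        region.use_for_division = 1 if hit else 0


# Usage with made-up strings: only area 4 contains the matched token.
regions = [
    Region(1, "ABC Corporation"),
    Region(2, "Tokyo"),
    Region(3, "Total"),
    Region(4, "Invoice No. 123"),
]
set_division_use_flags(regions, ["Invoice"])
```

After this call only the area numbered 4 carries the flag value 1, which mirrors the state described for Table 8.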

<Document First Page Determination Processing>
FIG. 17 is a flowchart showing the detailed flow of the document first page determination processing according to the present embodiment. This flow corresponds to the processing of S803 in FIG. 8.

In S1701, the image processing unit 432 of the MFP cooperation server 120 acquires, from the division history data stored in the data management unit 434, layout analysis data whose layout is similar to that of the first page of the scan image group currently being processed. Note that the division history data is the data registered in S1202.

In S1702, the image processing unit 432 determines whether the layout analysis data of the division history data acquired in S1701 contains an area whose division determination use flag is set to "1". If the determination result is that such an area exists (YES in S1702), the process proceeds to S1703. On the other hand, if the determination result is that no such area exists (NO in S1702), the process proceeds to S1006.

In S1703, the image processing unit 432 (document first page determination unit 471) instructs the OCR processing unit 453 to execute OCR processing only on the areas whose division determination use flag is set to "1" in the layout analysis data acquired in S1701. The OCR processing unit 453 then executes OCR processing only on those areas. In S1703, the image processing unit 432 (document first page determination unit 471) can thus be said to function as a control unit that controls the OCR processing unit 453. When the processing of S1703 is completed, the process proceeds to S1007.
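The flow of S1702 and S1703 can be sketched as follows, assuming that the layout analysis data acquired in S1701 is a list of area records and that S1006 is the unrestricted OCR step of the first embodiment. The dictionary keys, the function name, and the two OCR callables are hypothetical; no particular OCR engine is implied.

```python
from typing import Callable, Dict, List, Tuple

BBox = Tuple[int, int, int, int]  # hypothetical (x, y, width, height)


def run_restricted_ocr(
    page_image: object,
    history_regions: List[dict],
    ocr_region: Callable[[object, BBox], str],
    ocr_all_regions: Callable[[object], Dict[int, str]],
) -> Dict[int, str]:
    """Return OCR strings keyed by area number.

    history_regions is the layout analysis data from the division history
    (S1701); each record has "number", "bbox" and "use_for_division" keys.
    """
    flagged = [r for r in history_regions if r.get("use_for_division") == 1]
    if not flagged:
        # NO in S1702: fall back to OCR on every detected area (S1006).
        return ocr_all_regions(page_image)
    # YES in S1702 -> S1703: recognize only the flagged areas.
    return {r["number"]: ocr_region(page_image, r["bbox"]) for r in flagged}


# Usage with stub recognizers in place of a real OCR engine.
history = [
    {"number": 1, "bbox": (0, 0, 100, 20), "use_for_division": 0},
    {"number": 4, "bbox": (0, 60, 100, 20), "use_for_division": 1},
]
texts = run_restricted_ocr("page-1", history,
                           lambda img, box: "stub text",
                           lambda img: {1: "stub", 4: "stub"})
```

Passing the recognizers in as callables keeps the selection logic separate from the OCR engine, so the same branch can be exercised with stubs.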

As described above, according to the present embodiment, OCR processing can be limited to the areas that were used in the document first page determination of S1007 in the past. As a result, the time required for the document first page determination processing can be further reduced compared to processing that performs OCR on all character areas detected in the scan image group (page images).

[Other embodiments]
The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

120 MFP cooperation server
432 image processing unit

Claims

1. An image processing apparatus characterized by comprising:
analysis means for analyzing the layout of each page of scanned images obtained by continuously scanning a plurality of documents page by page;
calculation means for calculating the degree of similarity between each page of the scanned images and the first page of the scanned images based on the analysis result of the analysis means;
extraction means for extracting first page candidates in each of the plurality of documents from the scanned images based on the degree of similarity calculated by the calculation means;
character recognition means for performing character recognition on the first page candidates extracted by the extraction means; and
determination means for determining a delimiter of each document based on the result of character recognition by the character recognition means.

2. The image processing apparatus according to claim 1, wherein said determination means determines the first page of each document as the delimiter of each document.

3. The image processing apparatus according to claim 2, wherein the extraction means extracts, as first page candidates of each document, a page whose degree of similarity exceeds a predetermined threshold and the first page of the scanned images.

4. The image processing apparatus according to claim 2, further comprising registration means for registering history information indicating the first page confirmed based on a past determination result of the determination means,
wherein the determination means determines the first page of each document based on the character recognition result of the character recognition means and the history information registered by the registration means.

5. The image processing apparatus according to claim 4, further comprising second calculation means for calculating a non-conformance rate between the first page candidates in each of the documents extracted by the extraction means and the confirmed first pages,
wherein the determination means determines the first page candidates to be the first pages, without character recognition being performed by the character recognition means, when the non-conformance rate calculated by the second calculation means is smaller than a predetermined value.

6. The image processing apparatus according to claim 5, wherein the non-conformance rate is a value obtained by dividing an exclusive OR obtained from the first page candidates and the first pages by the total number of the first page candidates.

7. The image processing apparatus according to any one of claims 4 to 6, wherein the history information further includes area information indicating the character area of the character string used for the determination of the first page by the determination means, and
the character recognition means performs character recognition, based on the area information, only on the character area of the character string used for the determination of the first page.

8. The image processing apparatus according to any one of claims 1 to 7, further comprising division means for dividing the scanned images into document units according to the first page of each document determined by the determination means.

9. The image processing apparatus according to claim 8, further comprising management means for managing a file obtained by dividing the scanned image by the division means at the delimiter for each document.

10. The image processing apparatus according to claim 8, further comprising display means for displaying a screen containing the result of determination by said determination means.

11. The image processing apparatus according to claim 10, wherein on the screen containing the result of determination by said determining means, the delimiter position for each document can be adjusted by a user operation.

12. The image processing apparatus according to claim 10, wherein said display means displays a setting screen for setting a file name for a file obtained by dividing said scanned image by said delimiter for each document.

13. An image processing method comprising:
an analysis step of analyzing the layout of each page of scanned images obtained by continuously scanning a plurality of documents page by page;
a calculation step of calculating the degree of similarity between each page of the scanned images and the first page of the scanned images based on the analysis result of the analysis step;
an extraction step of extracting first page candidates in each of the plurality of documents from the scanned images based on the degree of similarity calculated in the calculation step;
a character recognition step of performing character recognition on the first page candidates extracted in the extraction step; and
a determination step of determining a delimiter of each document based on the character recognition result of the character recognition step.

14. A program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 12.