JP2005011260A

JP2005011260A - Document management device, document management system and program for document management

Info

Publication number: JP2005011260A
Application number: JP2003177211A
Authority: JP
Inventors: Masao Edamitsu; 正夫枝光; Masaru Otaka; 大大高; Masao Tsukawaki; 正生塚脇; Hiroshi Nomura; 大志野村; Toshihiko Wada; 俊彦和田
Original assignee: Canon Marketing Japan Inc
Current assignee: Canon Marketing Japan Inc
Priority date: 2003-06-20
Filing date: 2003-06-20
Publication date: 2005-01-13

Abstract

PROBLEM TO BE SOLVED: To provide a document management device, a document management system, and a document management program that minimize the waiting time of processing other than character recognition, reduce the man-hours needed to enter bibliographic information, and improve work efficiency.

SOLUTION: A document management server 110 is connected via a communication line to a multifunction peripheral 140, which reads an image from an original, and to a work terminal 120, which displays the image data of the read image. The document management server 110 performs OCR processing on the image data read by the multifunction peripheral 140, extracts bibliographic information from the resulting text data according to fixed rules, and transmits the image data, the text data, and the extracted bibliographic information to the work terminal 120, while continuing OCR processing in the background in parallel with the extraction and transmission processing. While OCR processing continues, the work terminal 120 displays the image data, the text data, and the extracted bibliographic information on its screen.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

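[0001]
[Technical Field of the Invention]
The present invention relates to a document management device, a document management system, and a document management program capable of extracting bibliographic items from text data according to fixed rules.
[0002]
[Prior Art]
For work that digitizes and registers documents written on paper, a document management system has conventionally been known that comprises a scanner that reads the image of a paper document, and a computer that generates text data by performing OCR (optical character recognition) processing on the image data read by the scanner and registers bibliographic items entered by a user in association with the image data and the text data.
[0003]
Patent Document 1 discloses a technique that optically reads a paper document, recognizes characters by OCR processing, recognizes the layout of the character regions, and then identifies character size and font type in order to extract titles, figure captions, and keywords.
[0004]
Patent Document 2 discloses a technique for a system comprising a copier and a computer, in which index information (categories, keywords) and the like is entered at the copier, the additional information and the image data are then sent to the computer (PC), the computer converts them into a data format for a database (DB), and the converted data is registered and managed in a database recorded on a hard disk or the like.
[0005]
[Patent Document 1]
Japanese Unexamined Patent Publication No. H11-238072
[Patent Document 2]
Japanese Unexamined Patent Publication No. 2002-290661
[0006]
[Problems to Be Solved by the Invention]
However, in the conventional document management system described above, manually entering bibliographic information on a PC screen while referring to the image data and text data is laborious, and usability is poor.
[0007]
The technique of Patent Document 1 can identify titles automatically to some extent, but identification may fail when, for example, the title uses the same character size and font as the body text. Moreover, the document management information within the bibliographic information (document creation date, issuer, addressee, and so on) usually appears only once, so frequency-ordered keyword extraction is unlikely to capture it.
[0008]
Furthermore, OCR processing of an entire document takes considerable time, so the person in charge of document registration must wait after scanning until OCR processing finishes, which lowers the effective utilization rate.
[0009]
In the system of Patent Document 2, image data is sent to the computer one item at a time after keywords are entered at the copier. Associating the additional information with the image data is therefore easy, but keyword entry and image reading cannot be performed simultaneously. That is, the proportion of total working time during which the copier is actually reading images is low, and work efficiency is poor.
[0010]
The present invention was made to solve the above problems. Its object is to provide a document management device, a document management system, and a document management program that minimize the waiting time of processing other than character recognition, reduce the man-hours for entering bibliographic information, and improve work efficiency.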
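[0011]
[Means for Solving the Problems]
To achieve the above object, the document management device according to claim 1 is a document management device connected via a communication line to an image reading device that reads an image of an original and to a client device that displays image data of the read image, and comprises: image storage means for storing image data received from the image reading device; character recognition processing means for performing character recognition processing on the image data stored in the image storage means to generate text data; extraction rule storage means for storing extraction rules for extracting bibliographic information from the text data; extraction means for extracting bibliographic information based on the text data and the extraction rules; and transmission means for transmitting the image data, the text data, and the bibliographic information to the client device.
[0012]
The document management device according to claim 2 is the document management device according to claim 1, further comprising document information storage means for storing the image data, the text data, and the bibliographic information in association with one another.
[0013]
The document management device according to claim 3 is the document management device according to claim 1 or 2, further comprising reception means for receiving, from the client device, edit information for the bibliographic information, and bibliographic information update means for updating the bibliographic information based on the received edit information.
[0014]
The document management device according to claim 4 is the document management device according to claim 3, further comprising text data update means for updating the text data based on the received edit information.
[0015]
The document management device according to claim 5 is the document management device according to claim 4, further comprising display information generation means for generating display information to be displayed on the screen of the client device, based on the image data, the bibliographic information updated by the bibliographic information update means, and the text data updated by the text data update means.
[0016]
The document management device according to claim 6 is the document management device according to any one of claims 3 to 5, wherein the received edit information is information indicating whether text data selected on the client device is highlighted, information indicating whether a rectangular region containing text data selected on the client device has been designated, and information indicating whether the highlighted text data or the text data contained in the designated rectangular region has been dragged and dropped onto a predetermined input field.
[0017]
The document management system according to claim 7 comprises an image reading device that reads an image of an original, a client device that displays image data of the read image, and a document management device connected to the image reading device and the client device via a communication line. The image reading device comprises character recognition processing means for performing character recognition processing on the image data of the read image to generate text data, and transmission means for transmitting the image data and the text data to the document management device. The document management device comprises storage means for storing the image data and text data received from the image reading device, extraction rule storage means for storing extraction rules for extracting bibliographic information from the text data, extraction means for extracting bibliographic information based on the text data and the extraction rules, and transmission means for transmitting the image data, the text data, and the bibliographic information to the client device.
[0018]
The document management system according to claim 8 comprises an image reading device that reads an image of an original, a client device that displays image data of the read image, and a document management device connected to the image reading device and the client device via a communication line. The client device comprises image storage means for storing image data received from the image reading device, character recognition processing means for performing character recognition processing on the image data stored in the image storage means to generate text data, extraction rule storage means for storing extraction rules for extracting bibliographic information from the text data, extraction means for extracting bibliographic information based on the text data and the extraction rules, and display information generation means for causing the client device to display the image data, the text data, and the bibliographic information.
[0019]
The document management system according to claim 9 is the document management system according to claim 8, wherein the document management device comprises reception means for receiving, from the client device, edit information for the bibliographic information, and the received edit information is information indicating whether text data selected on the client device is highlighted, information indicating whether a rectangular region containing text data selected on the client device has been designated, and information indicating whether the highlighted text data or the text data contained in the designated rectangular region has been dragged and dropped onto a predetermined input field.
[0020]
The document management program according to claim 10 is a program to be executed by a computer connected via a communication line to an image reading device that reads an image of an original and to a client device that displays image data of the read image, and comprises: an image storage module that stores image data received from the image reading device; a character recognition processing module that performs character recognition processing on the stored image data to generate text data; an extraction rule storage module that stores extraction rules for extracting bibliographic information from the text data; an extraction module that extracts bibliographic information based on the text data and the extraction rules; and a display information generation module that generates display information for causing the client device to display the image data, the text data, and the bibliographic information.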
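[0021]
[Embodiments of the Invention]
Embodiments of the present invention are described below with reference to the drawings.
[0022]
(First Embodiment)
FIG. 1 is a block diagram showing the hardware configuration of the document management device according to the first embodiment of the present invention.
[0023]
In the figure, a CPU 21 (transmission means, reception means, display information generation means), a RAM 22, a ROM 23, a LAN adapter 24 (transmission means, reception means), a video adapter 25, a keyboard 26, a mouse 27, a hard disk 28, and a CD-ROM drive 29 are connected to one another via a system bus 20. The system bus 20 is, for example, a PCI bus, an AGP bus, or a memory bus. The document management server 110 also includes chips for connecting the buses, a keyboard interface, and input/output interfaces such as SCSI and ATAPI, but these are omitted from FIG. 1.
[0024]
The CPU 21 performs various operations, such as arithmetic and comparison operations, and controls hardware and software. The RAM 22 stores the operating system program, application programs, and the like read from the hard disk 28 or from a storage medium such as a CD-ROM or CD-R loaded in the CD-ROM drive 29, and these programs are executed under the control of the CPU 21. The ROM 23 stores the so-called BIOS and the like, which cooperates with the operating system to handle input and output to the hard disk and other devices. The LAN adapter 24 cooperates with a communication program included in the operating system controlled by the CPU 21 to communicate with external devices (not shown) via a network (not shown). The video adapter 25 is connected to a display device (not shown) and generates the image signal output to it. The keyboard 26 and mouse 27 are used to enter instructions to the document management server 110.
[0025]
The hard disk 28 stores the operating system, application programs, various data such as the extraction rule storage unit 115 and the bibliographic DB 116 described later, and various master files (not shown). The CD-ROM drive 29 accepts a storage medium such as a CD-ROM, CD-R, or CD-R/W and is used to install application programs on the hard disk 28. A CD-R drive, CD-R/W drive, MO drive, or the like may of course be used instead of the CD-ROM drive.
[0026]
The work terminal 120 and the management terminal 130 described later have the same hardware configuration as the document management device of FIG. 1.
[0027]
FIG. 2 is a block diagram showing the configuration of a document management system to which the document management device according to the embodiment of the present invention can be applied. The document management device according to the embodiment is applied to the document management server 110.
[0028]
In the figure, the document management system comprises a communication line 100, the document management server 110, the work terminal 120, the management terminal 130, and a multifunction peripheral 140.
[0029]
The work terminal 120 and the management terminal 130 are, for example, personal computers, so-called PDAs (Personal Digital Assistants), or Internet-capable mobile phones. Any device that can accept character input and display images and characters will do.
[0030]
The communication line 100 is a so-called communication network, typically realized by any one or a combination of the Internet, a LAN (Local Area Network), a WAN (Wide Area Network), a telephone line, a dedicated digital line, ATM (Asynchronous Transfer Mode), a frame relay line, a communication satellite line, a cable television line, a wireless line for data broadcasting, and the like. Any network capable of sending and receiving data will do.
[0031]
The document management server 110 performs document registration and search processing using a predetermined OS (for example, UNIX (registered trademark) or WINDOWS (registered trademark)) and application programs. The document management server 110 comprises a bibliographic registration unit 111 (bibliographic information update means), an OCR processing unit 112 (character recognition processing means), a bibliographic extraction unit 113 (extraction means), an image management unit 114, an extraction rule storage unit 115 (extraction rule storage means), a bibliographic database (DB) 116 (document information storage means), an image database (DB) 117 (image storage means), and a text database (DB) 118.
[0032]
The image management unit 114 stores image data received from the multifunction peripheral 140 via the communication line 100 in the image DB 117. The OCR processing unit 112 has a program and a recognition dictionary for performing character recognition processing on image data in the image DB 117 that has not yet undergone OCR processing, and stores the text data generated by character recognition in the text DB 118. The text DB also supports full-text search, but a detailed description of searching is omitted.
[0033]
The bibliographic extraction unit 113 extracts bibliographic information based on the text data and the extraction rules stored in the extraction rule storage unit 115. The extraction rules stored in the extraction rule storage unit 115 are described in detail later with reference to FIGS. 8 and 9.
[0034]
The bibliographic registration unit 111 updates the bibliographic information based on edit information received from the work terminal 120. When it receives edit information instructing "registration" from the work terminal 120, it registers the bibliographic information in the bibliographic DB 116. This bibliographic information is associated with the corresponding image data in the image DB 117 and the corresponding text data in the text DB 118. The document management server 110 also includes a search processing unit (not shown) that can search the bibliographic DB 116, the image DB 117, and the text DB 118 in combination, but a detailed description is omitted.
[0035]
The bibliographic registration unit 111, the OCR processing unit 112, the bibliographic extraction unit 113, and the image management unit 114 are realized by the CPU 21 executing control based on programs stored on the hard disk 28. The bibliographic DB 116, the image DB 117, and the text DB 118 are built on the hard disk 28.
[0036]
The multifunction peripheral 140 comprises a scan engine (not shown) with an imaging element such as a CCD; an image data storage unit 147 that stores image data read by the scan engine; a control unit 145 that sends the stored image data to the document management server 110 and records the transmission history in a log data storage unit 146; and a printer engine (not shown) that can print input PDL data and can print image data output from the scan engine. The scan engine and the printer engine can communicate with each other. The log data storage unit 146 and the image data storage unit 147 are implemented on a hard disk device of the multifunction peripheral 140.
[0037]
The multifunction peripheral 140 also has a network controller and a communication I/F (not shown) and is communicably connected to the document management server 110, the work terminal 120, and the management terminal 130 via the communication line 100.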
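[0038]
FIGS. 3 and 4 are flowcharts showing the processing performed by the document management system of FIG. 2.
[0039]
In FIGS. 3 and 4, steps S601 to S607 are executed under the control of the CPU (not shown) of the work terminal 120, steps S621 to S636 are executed under the control of the CPU 21 of the document management server 110, and steps S641 to S643 are executed under the control of the CPU (not shown) of the multifunction peripheral 140.
[0040]
Before the document management server 110 executes step S621, the following are assumed to have been completed: the work terminal 120 has issued an authentication request to the document management server 110 (that is, sent a user ID and password); the document management server 110 has performed authentication; the work terminal 120 has sent menu selection information to the document management server 110; and the document management server 110 has generated screen information for the work terminal 120 based on the menu selection information. The document to be scanned is also assumed to be already placed on the multifunction peripheral 140.
[0041]
First, the document management server 110 sends screen information for displaying the document registration screen on the work terminal 120 (step S621), and the display connected to the work terminal 120 shows a screen such as that of FIG. 5 (step S601).
[0042]
FIG. 5 shows an example of the application displayed on the screen of the work terminal 120 during bibliographic registration.
[0043]
In the figure, 1200 is a scan button for entering a scan instruction; 1201 is an image display section that displays the image data subject to OCR processing; 1202 is a recognized-text display section that displays the text data actually recognized by OCR processing; 1209 to 1215 are input fields for the various items of bibliographic information; 1203 is an update button for updating the text data or the bibliographic information; 1204 is a previous-page button for requesting the image data, text data, and bibliographic candidates of the previous page from the document management server 110; 1205 is a next-page button for requesting those of the next page; 1206 is a registration button for registering the bibliographic information in the document management server 110; 1207 is a next-document button for requesting the image data, text data, and bibliographic candidates of the first page of the next document; and 1208 is an end button.
[0044]
When the scan button 1200 in FIG. 5 is pressed, the work terminal 120 sends the document management server 110 information requesting the start of scanning (scan request information) together with work terminal identification information (for example, a user ID or session ID) identifying the work terminal 120 as the sender (step S602).
[0045]
The document management server 110 forwards the scan request information and work terminal identification information received from the work terminal 120 to the multifunction peripheral 140 and assigns one document number (step S622). The document number is a unique management number assigned once per document. It is used as the search key, or part of the search key, of the bibliographic DB 116, the image DB 117, and the text DB 118, and links the data in these three DBs.
[0046]
The multifunction peripheral 140 receives the scan request information and work terminal identification information from the document management server 110 (step S641), scans the document (step S642), and stores the image data in the image data storage unit 147. A unique image number is assigned to the image data page by page. The image number is formed, for example, by combining the time at which the scan request was received (14 digits) with the page number (the last 3 digits).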
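The image numbering described above can be illustrated with a short sketch. It assumes the 14-digit time is formatted as YYYYMMDDhhmmss, which the source does not specify; the function name `make_image_number` is hypothetical.

```python
from datetime import datetime


def make_image_number(request_time: datetime, page: int) -> str:
    """Build a per-page image number: 14-digit request time + 3-digit page number.

    Assumption: the 14 digits are YYYYMMDDhhmmss (the source only says "14 digits").
    """
    if not 1 <= page <= 999:
        raise ValueError("page number must fit in three digits")
    return request_time.strftime("%Y%m%d%H%M%S") + f"{page:03d}"


if __name__ == "__main__":
    received = datetime(2003, 6, 20, 9, 5, 7)
    print(make_image_number(received, 2))
```

Because the time prefix is shared by all pages of one scan request, the pages of a document sort together and in page order under this scheme.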
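[0047]
The control unit 145 of the multifunction peripheral 140 sends the scanned image together with the work terminal identification information of the scan requester to the document management server 110, and records the name of the sent image and the transmission time in the log data storage unit 146 (step S643).
[0048]
If transmission of the image data in step S643 does not complete normally, the system may be configured to retry after a fixed time and, if transmission still does not complete normally after a predetermined number of retries, to record that fact in the log data storage unit 146.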
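The retry behaviour of step S643 might be sketched as follows. This is only an illustration under stated assumptions: `send` stands in for the real transmission, the log is a plain list rather than the log data storage unit 146, and the retry count and wait time are hypothetical parameters.

```python
import time
from typing import Callable, List


def send_with_retry(send: Callable[[], bool], log: List[str],
                    max_retries: int = 3, wait_seconds: float = 0.0) -> bool:
    """Try to send once, then retry up to max_retries times, logging the outcome."""
    for attempt in range(1 + max_retries):
        if send():
            log.append(f"sent (attempt {attempt + 1})")
            return True
        if attempt < max_retries:
            time.sleep(wait_seconds)  # wait a fixed time before the next retry
    log.append(f"failed after {max_retries} retries")
    return False
```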
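[0049]
On the document management server 110, a process separate from steps S621 and S622 (steps S623 to S629) is running. The CPU 21 waits for a scanned image from the multifunction peripheral (step S623) and checks at fixed intervals whether there is a scanned image to receive (step S624). If there is none (No in step S624), it returns to step S623 and waits. If there is one (Yes in step S624), the scanned image is received and the image management unit 114 registers the received image data in the image DB 117 (step S625).
[0050]
Next, the OCR processing unit 112 performs OCR processing page by page on the image data registered in the image DB 117, and the recognized text is added from the RAM 22 to the text DB 118 (step S626). Image data that has undergone OCR processing is flagged in the image DB 117. Details of the OCR processing are given later.
[0051]
Next, the bibliographic extraction unit 113 extracts bibliographic information candidates from the recognized text data in the work area of the RAM 22, based on the extraction rules stored in the extraction rule storage unit 115, and stores them in the work area of the RAM 22 (step S627). Details of the bibliographic candidate extraction are also given later.
[0052]
The CPU 21 then sends the work terminal 120, from the work area of the RAM 22, the received image data for one page, the text data corresponding to that image data, and the bibliographic information candidates extracted from the first page of the document (step S628).
[0053]
Next, the CPU 21 determines whether the image data registered in the image DB 117 includes any page that the OCR processing unit 112 has not yet processed (step S629). Specifically, the CPU 21 checks the flag in the image DB 117 that is set when OCR processing is performed: if any page lacks the flag, it determines that an unprocessed page exists, and if every page has the flag, it determines that no unprocessed page exists. Alternatively, the CPU 21 may make the determination of step S629 by checking whether text data corresponding to the image data exists in the text DB 118.
[0054]
If step S629 finds an unprocessed page, processing returns to step S626 and OCR processing is performed on the image data of the next page. If no unprocessed page exists, processing returns to step S623 and the CPU 21 waits for the next image data.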
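The loop of steps S626 to S629 can be pictured as a background worker that recognizes one page at a time and hands each finished page onward while later pages are still being processed. The sketch below is a simplified illustration, not the patented implementation: `fake_ocr` is a hypothetical stand-in for real character recognition, and a `queue.Queue` replaces the image and text DBs.

```python
import queue
import threading
from typing import List, Tuple


def fake_ocr(image: str) -> str:
    """Hypothetical stand-in for OCR: just upper-cases the 'image' string."""
    return image.upper()


def ocr_worker(pages: List[str], results: "queue.Queue") -> None:
    """Recognize pages one by one (S626) and publish each result as soon as it is ready."""
    for page_no, image in enumerate(pages, start=1):
        results.put((page_no, fake_ocr(image)))
    results.put(None)  # sentinel: no unprocessed pages remain (S629 = No)


def run(pages: List[str]) -> List[Tuple[int, str]]:
    results: "queue.Queue" = queue.Queue()
    worker = threading.Thread(target=ocr_worker, args=(pages, results), daemon=True)
    worker.start()
    delivered = []
    while True:
        item = results.get()
        if item is None:
            break
        # At this point the server would send the page to the work terminal (S628)
        # while the worker keeps recognizing the following pages.
        delivered.append(item)
    worker.join()
    return delivered
```

The point of the design is that the operator can start checking page 1 as soon as it is recognized, instead of waiting for the whole document.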
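[0055]
Next, the work terminal 120 receives the image data, text data, and bibliographic candidates sent from the document management server 110 in step S628 (step S603). The work terminal 120 displays a screen such as that of FIG. 5: the image data is shown in the image display section 1201, the text data recognized by OCR processing is shown in the recognized-text display section 1202, and the bibliographic information candidates are shown in the input fields 1209 to 1215, so that the screen display is updated (step S604).
[0056]
Next, the work terminal 120 identifies keyboard input and mouse operations (step S605) and sends this information to the document management server 110 (step S606). If there is no input or operation, the work terminal 120 waits until there is one.
[0057]
The CPU 21 of the document management server 110 receives the keyboard input information and mouse operation information sent from the work terminal 120 in step S606 and, based on it, determines whether the end button 1208 was pressed on the screen of the work terminal 120 (step S630).
[0058]
If step S630 finds that the end button 1208 was pressed, this processing ends. If not, the CPU 21 determines, based on the input and operation information sent in step S606, whether the next-document button 1207 was pressed on the screen of the work terminal 120 (step S631).
[0059]
If step S631 finds that the next-document button 1207 was pressed, the CPU 21 determines whether OCR processing and bibliographic candidate extraction have finished for the document following the one currently being processed (step S632). If they have finished, processing returns to step S628, and the CPU 21 sends the work terminal 120 the image data of the first page of the next document, the corresponding text data, and the bibliographic information candidates. If they have not finished, processing returns to step S626, and the CPU 21 performs OCR processing on the first page of the next document.
[0060]
If step S631 finds that the next-document button 1207 was not pressed, the CPU 21 determines, based on the input and operation information sent in step S606, whether the previous-page button 1204 or the next-page button 1205 was pressed (step S633).
[0061]
If step S633 finds that the previous-page button 1204 or the next-page button 1205 was pressed, processing returns to step S628, and the image data of the previous or next page (according to the button pressed), the corresponding text data, and the bibliographic information candidates are sent to the work terminal 120.
[0062]
If step S633 finds that neither button was pressed, the CPU 21 determines, based on the input and operation information sent in step S606, whether the registration button 1206 was pressed (step S634).
[0063]
If step S634 finds that the registration button 1206 was pressed, the bibliographic information held in the work area of the RAM 22 is written to the hard disk 28, and processing returns to step S623. The document management server 110 sends the work terminal 120 a message that registration has finished and then updates the bibliographic DB 116.
[0064]
If step S634 finds that the registration button 1206 was not pressed, then either the text data shown in the recognized-text display section 1202 or one of the bibliographic input fields 1209 to 1215 has been modified. The server therefore applies the modification to the text data (text data update means) or the bibliographic information in the work area of the RAM 22, generates screen information containing the updated content, and sends it to the work terminal 120 (step S635).
[0065]
The work terminal 120 receives the screen information sent from the document management server 110 in step S635 (step S607) and executes step S604.
[0066]
When the update button 1203 is pressed, the document management server 110 executes step S635 and the work terminal 120 executes step S607.
[0067]
With this processing, the image data of the original is OCR-processed page by page and bibliographic information is extracted; the original image data, the text data, and the extracted bibliographic information are displayed on the screen of the work terminal 120 while OCR of the following pages continues in the background. The waiting time of processing other than character recognition is thus minimized, the man-hours for entering bibliographic information are reduced, and work efficiency is improved.
[0068]
The scan request information of step S602 may instead be sent directly to a scanner or to a multifunction peripheral with a scanner function. In that case, in step S643 the information identifying the scan requester (user ID or the like) is sent together with the scanned image.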
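[0069]
FIGS. 6 and 7 are flowcharts showing the OCR processing (step S626 of FIG. 3).
[0070]
In FIGS. 6 and 7, steps S701 to S724 are executed under the control of the CPU 21 of the document management server 110.
[0071]
First, character blocks and line spacing are analyzed based on the OCR target image data held in the work area of the RAM 22 (step S701). A "character block" is a character string that is bounded above and below by blank lines, or bounded on the left and right by at least a predetermined number of spaces. The left/right case is included to handle layouts in which, for example, the creation date sits at the right end of one line and the title sits in the center of the line immediately below. Line spacing is determined by scanning the image data horizontally: where the black ratio is at or below a predetermined value close to zero (for example, 0.1), the row is treated as line spacing or a blank line, and where it exceeds the predetermined value, the row is treated as part of a text line. The character block analysis and line spacing analysis also fix the position of the last character on the page, which is taken to be the right end of the last line of the lowest character block. The last character position is discussed in detail later.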
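The row-wise black-ratio test used for line spacing analysis can be sketched as below. This is a minimal illustration, assuming a binarized page represented as a list of rows of 0/1 pixels (1 = black); the function name and data layout are assumptions, and only the 0.1 threshold comes from the text.

```python
from typing import List, Tuple


def find_text_rows(image: List[List[int]], threshold: float = 0.1) -> List[Tuple[int, int]]:
    """Return inclusive (start_row, end_row) ranges judged to belong to text lines.

    A row whose black ratio is at or below the threshold counts as line spacing
    or a blank line; a row above the threshold counts as part of a text line.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    for y, row in enumerate(image):
        black_ratio = sum(row) / len(row)
        is_text = black_ratio > threshold
        if is_text and start is None:
            start = y                      # a text line begins
        elif not is_text and start is not None:
            runs.append((start, y - 1))    # the text line ended on the previous row
            start = None
    if start is not None:
        runs.append((start, len(image) - 1))
    return runs
```

The gaps between consecutive ranges give the line spacing, which the later character range analysis uses as its starting estimate.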
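[0072]
Next, the entire image data is scanned horizontally, ruled lines are recognized, and the image data is analyzed to determine whether it matches a specific format (for example, the format of a document separator sheet) (step S702).
[0073]
If the analysis of step S702 finds that the image data does not match the specific format, this processing ends. If it matches, the characters at the specific positions corresponding to that format become the target of character recognition. The ruled-line information of the specific format is stored in a format table (not shown; physically on the hard disk 28).
[0074]
Next, it is determined whether the last-character flag in the work area of the RAM 22 is on (step S703). The last-character flag is turned on when character recognition of the last character of the OCR target image data has finished.
[0075]
If step S703 finds that the last-character flag is on, it is determined whether the image data being processed is a separator sheet (step S705), identification data indicating whether it is a separator sheet is added to the OCR text data according to the result (step S706), and this processing ends.
[0076]
If step S703 finds that the last-character flag is not on, the character range is analyzed based on the line spacing analysis of step S701 (step S704). Because the vertical character spacing of the image data is known from the line spacing analysis of step S701, step S704 uses the vertical character spacing, or half of it, as the initial value for character range analysis. The range of each character is determined by identifying the blank areas between characters.
[0077]
Next, the ratio of black within the range fixed for one character is computed (step S707), and it is determined whether this ratio exceeds a predetermined value (for example, 0.005) (step S708). If the black ratio is at or below the predetermined value, the black ratios of a predetermined number of preceding and following characters are examined and, together with the character block range determined in step S701, it is determined whether the current character is at the beginning or end of a line (step S709).
[0078]
Next, it is determined whether the current character is a space (step S710). If it is, one space is added to the output text data (step S711), and processing proceeds to step S723 described later. If it is not a space, it is determined whether the current character is a symbol such as "。", "、", or "・" (step S712). If it is a symbol, the symbol is added to the output text data (step S713), and processing proceeds to step S723; if it is not a symbol, processing proceeds directly to step S723.
[0079]
If step S708 finds that the black ratio exceeds the predetermined value, the image data for one character is compared pixel by pixel with the character data in the dictionary to check for matches and mismatches (step S714).
[0080]
Next, it is determined whether the dictionary comparison of step S714 is complete (step S715). If it is not complete, the pixel match rate between the image data for one character and the dictionary character data is computed from the comparison result of step S714 (step S716). The dictionary character data is stored on the hard disk 28.
[0081]
Next, it is determined whether the match rate computed in step S716 exceeds a predetermined value A (for example, 0.8) (step S717). If it does, the text data of the corresponding character is selected from the dictionary, paired with its match rate, and added to a candidate array in the work area of the RAM 22 (step S718); processing then returns to step S714 to compare against the next dictionary character.
[0082]
If step S717 finds that the match rate is at or below the predetermined value A, step S718 is skipped and processing returns to step S714.
[0083]
When the dictionary comparison of step S715 is complete, the match rates of the characters in the candidate array are compared (step S719), and the text data of the character with the highest match rate is added to the output text data (step S720).
[0084]
Next, it is determined whether the highest match rate exceeds a predetermined value (for example, 0.9) (step S721). If it is at or below that value, a warning flag is turned on and a predetermined special character is written to the output text data (step S722). When the screen is displayed (step S604 of FIG. 4), this special character causes the text immediately before it to be shown in a color (for example, blue) other than the normal color (for example, black).
[0085]
This makes it possible to display in a different color the places where character identification is doubtful, so that manual visual checking and correction of the OCR result can be done efficiently.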
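Steps S714 to S722 amount to template matching with two thresholds: candidates must exceed match rate A (0.8 in the example), and the best candidate is flagged as doubtful if its rate does not exceed 0.9. The sketch below illustrates this under simplifying assumptions: glyphs are flattened lists of 0/1 pixels of equal length, the dictionary is a plain `dict`, and returning an empty string when no candidate qualifies is an assumption, since the source does not describe that case.

```python
from typing import Dict, List, Tuple

Glyph = List[int]  # flattened binary bitmap, 1 = black pixel


def match_rate(a: Glyph, b: Glyph) -> float:
    """Fraction of pixel positions at which the two bitmaps agree (S716)."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def recognize(glyph: Glyph, dictionary: Dict[str, Glyph],
              threshold_a: float = 0.8, warn_at_or_below: float = 0.9) -> Tuple[str, bool]:
    """Return (recognized character, warning flag)."""
    candidates = []
    for char, template in dictionary.items():        # S714: compare with each entry
        rate = match_rate(glyph, template)
        if rate > threshold_a:                        # S717: keep only strong candidates
            candidates.append((char, rate))           # S718: add to the candidate array
    if not candidates:
        return "", True                               # assumption: nothing matched, flag it
    best_char, best_rate = max(candidates, key=lambda c: c[1])   # S719, S720
    return best_char, best_rate <= warn_at_or_below               # S721, S722
```

In the display step, a set warning flag would correspond to showing the character in a different color so the operator can check it.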
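[0086]
Next, it is determined whether the character just identified is the last character of the image data for the page (step S723). If it is, the last-character flag in the work area of the RAM 22 is turned on (set to 1) (step S724), and processing returns to step S703. If it is not the last character, processing returns to step S703 immediately.
[0087]
As described above, the OCR processing of FIGS. 6 and 7 also identifies spaces and symbols such as punctuation marks in the image data and outputs text data corresponding to the image data, so character recognition that preserves the layout of the original image data becomes possible.
[0088]
FIGS. 8 and 9 are flowcharts showing the bibliographic candidate extraction processing (step S627 of FIG. 3).
[0089]
Steps S821 to S836 of FIGS. 8 and 9 are executed under the control of the CPU 21 of the document management server 110.
[0090]
This processing assumes that the bibliographic information is on the first page of the document.
[0091]
First, it is determined whether the page from which bibliographic information is to be extracted is the first page of the document (step S821). If it is not the first page, this processing ends. If it is the first page, one character block of the text data produced by the OCR processing is read (step S822). Here a "character block" is a run of character data bounded by at least a predetermined number of spaces (for example, two or more), or whose first character is at the beginning or end of a line.
[0092]
Next, the character block that was read is checked against the candidate dictionary stored in the extraction rule storage unit 115 (step S823), and it is determined whether the block begins with an era name such as "平成" (Heisei), ends with "日" (day), and is positioned to the right of the center of the page (step S824). If all of these conditions are satisfied (YES in step S824), the text data of the character block is written over the "issue date" input field of the bibliographic information (step S825), and processing proceeds to step S836.
[0093]
In step S836, it is determined whether the character block currently read is the last block of the first page (that is, whether any other character block exists to its right or below it). If it is the last block, this processing ends. If it is not, processing returns to step S822 and the next character block is read.
[0094]
If any condition of step S824 is not satisfied (NO in step S824), it is determined whether the character block begins with an organization name and ends with "殿" (an honorific for an addressee) (step S826).
[0095]
If all conditions of step S826 are satisfied (YES in step S826), the text data of the character block is written over the "destination department" input field of the bibliographic information (step S827), and processing proceeds to step S836.
[0096]
If any condition of step S826 is not satisfied (NO in step S826), it is determined whether the character block either begins with an organization name or ends with an official title, and is positioned to the right of the center of the page (step S828).
[0097]
If all conditions of step S828 are satisfied (YES in step S828), the text data of the character block is written over the "document originator" input field of the bibliographic information (step S829), and processing proceeds to step S836.
[0098]
If any condition of step S828 is not satisfied (NO in step S828), it is determined whether the character block begins with a document name or an era name, ends with "号" (number), and is positioned to the right of the center of the page (step S830).
[0099]
If all conditions of step S830 are satisfied (YES in step S830), the text data of the character block is written over the "document number" input field of the bibliographic information (step S831), and processing proceeds to step S836.
[0100]
If any condition of step S830 is not satisfied (NO in step S830), it is determined whether the character block ends with "の件" (matter of), "通達" (directive), "通知" (notice), or "について" (regarding) (step S832).
[0101]
If the condition of step S832 is satisfied (YES in step S832), the text data of the character block is written over the "received document name" input field of the bibliographic information (step S833), the attribute of that input field is changed to "overwrite prohibited", and processing proceeds to step S834.
[0102]
If the condition of step S832 is not satisfied (NO in step S832), processing proceeds to step S836.
[0103]
Next, using the received document name written in step S833 as a key, the data for the responsible section, the person in charge, the document category, and the retention period is extracted from the management table stored in the extraction rule storage unit 115 (step S834). This data is written over the corresponding bibliographic input fields, and immediately afterward the attributes of these fields are changed to overwrite prohibited (step S835); processing then proceeds to step S836.
[0104]
The attributes are set to "overwrite prohibited" in steps S833 and S835 to prevent the "received document name" and its related data from being overwritten by this processing when another document name appears in the body of the document.
[0105]
Although it lies outside the scope of this flowchart, after the bibliographic candidate extraction has finished, the operator of the work terminal 120 can of course move the cursor to these input fields and correct them by hand.
[0106]
With this processing, bibliographic information candidates are extracted from the text data of the first page of the document according to fine-grained extraction rules and entered into the input fields, so the man-hours for entering bibliographic information are kept to a minimum.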
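The rule cascade of steps S824 to S833 can be expressed compactly as an ordered series of prefix, suffix, and position checks. The sketch below is an illustration only: the word lists (`ORGANIZATIONS`, `TITLES`, `DOC_NAMES`) are hypothetical stand-ins for the candidate dictionary in the extraction rule storage unit 115, the field names are English labels chosen here, and the management-table lookup of steps S834 and S835 is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

ERAS = ("平成", "昭和")
ORGANIZATIONS = ("総務部", "経理課")   # hypothetical dictionary entries
TITLES = ("部長", "課長")              # hypothetical official titles
DOC_NAMES = ("総発",)                  # hypothetical document-name prefixes
SUBJECT_ENDINGS = ("の件", "通達", "通知", "について")


@dataclass
class Block:
    text: str
    right_of_center: bool  # True if the block lies right of the page center


def classify(block: Block) -> Optional[str]:
    """Apply the rules in flowchart order and return the matching field name."""
    t, right = block.text, block.right_of_center
    if t.startswith(ERAS) and t.endswith("日") and right:            # S824
        return "issue_date"
    if t.startswith(ORGANIZATIONS) and t.endswith("殿"):             # S826
        return "destination_department"
    if (t.startswith(ORGANIZATIONS) or t.endswith(TITLES)) and right:  # S828
        return "document_originator"
    if t.startswith(DOC_NAMES + ERAS) and t.endswith("号") and right:  # S830
        return "document_number"
    if t.endswith(SUBJECT_ENDINGS):                                   # S832
        return "received_document_name"
    return None


def extract(blocks: List[Block]) -> Dict[str, str]:
    """Fill fields block by block; the received document name is locked once set."""
    fields: Dict[str, str] = {}
    locked = set()
    for block in blocks:
        name = classify(block)
        if name is None or name in locked:
            continue
        fields[name] = block.text
        if name == "received_document_name":
            locked.add(name)   # "overwrite prohibited" after S833
    return fields
```

The lock mirrors the reason given in paragraph [0104]: a later block in the body that also looks like a document name must not replace the one already found.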
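[0107]
FIG. 10 is a flowchart showing a modification of part of the processing performed by the document management system in FIG. 3, and therefore shows only the processing that differs from FIG. 3.
[0108]
Step S630 and steps S1123 to S1134 of FIG. 10 are executed under the control of the CPU 21 of the document management server 110, and steps S606 and S607 are executed under the control of the CPU of the work terminal 120.
[0109]
On the work terminal 120, either browser software combined with JavaScript ("Java" is a registered trademark) and ActiveX is used, or a client application is installed and running.
[0110]
The work terminal 120 identifies keyboard input and mouse operations in step S605 of FIG. 3 and sends this information to the document management server 110 (step S606). The information is, for example, a "drag-and-drop operation", a "character string highlighting operation", "character input to a specific input field", "designation of a rectangular region", or the pressing of one of the buttons. When the operation is a drag and drop, the operation information includes the dragged character string, the coordinates where the drag started, and the coordinates of the drop destination. When a rectangular region is designated, the operation information also includes the text within the rectangular region.
[0111]
Based on the input and operation information sent from the work terminal 120 in step S606, the CPU 21 of the document management server 110 determines whether the end button 1208 was pressed on the screen of the work terminal 120 (step S630).
[0112]
If step S630 finds that the end button 1208 was pressed, this processing ends. If not, it is determined whether the input or operation information sent in step S606 is a "designation of a rectangular region" (step S1123).
[0113]
If step S1123 finds that it is a "designation of a rectangular region", the CPU 21 stores the text data within the rectangular region in a buffer area of the work area of the RAM 22 (step S1124). If it is not, processing proceeds to step S1125 described below.
[0114]
Next, the CPU 21 determines whether the input or operation information is "character string highlighting" (step S1125). If it is not, processing proceeds to step S1127 described below. If it is, the text data of the highlighted portion is stored in the buffer area of the work area of the RAM 22 (step S1126), and processing proceeds to step S1127. The coordinates of the drag source are also stored in this buffer area in association with the text data.
[0115]
Next, the CPU 21 determines whether the input or operation information is a "drag-and-drop operation" (step S1127). If it is not, processing proceeds to step S634. If it is, the text data within the rectangular region is stored in the buffer area of the work area of the RAM 22. The coordinates of the drag source are also stored in this buffer area in association with the text data.
[0116]
Next, the CPU 21 detects the coordinates of the drag source and the drag destination, identifies from the destination coordinates the input field into which the data is to be written (step S1128), and identifies from the source coordinates whether the drag source is the highlighted character string or which character string within the rectangular region it is (step S1129). The identified character string in the buffer area of the work area of the RAM 22 is then written over the input field identified in step S1128 (step S1130), and processing proceeds to step S634. Step S634 has been described above with reference to FIG. 3.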
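Steps S1128 to S1130 can be sketched as a lookup from drop coordinates to an input field, followed by an overwrite with the buffered text. The structures below (`Field`, a buffer keyed by exact drag-source coordinates) are assumptions made for illustration; the source only states that the buffer associates text with drag-source coordinates.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Point = Tuple[int, int]


@dataclass
class Field:
    name: str
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, point: Point) -> bool:
        x, y = point
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def apply_drop(layout: List[Field], values: Dict[str, str],
               buffer: Dict[Point, str], drag_from: Point, drop_at: Point) -> Optional[str]:
    """Overwrite the field under the drop point with the text buffered for the drag source.

    Returns the name of the updated field, or None if nothing was updated.
    """
    text = buffer.get(drag_from)          # S1129: identify the dragged string
    if text is None:
        return None
    for field in layout:                  # S1128: identify the destination field
        if field.contains(drop_at):
            values[field.name] = text     # S1130: overwrite the input field
            return field.name
    return None
```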
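[0117]
In step S1134, the CPU 21 sends the work terminal 120 a message that bibliographic registration has finished.
[0118]
With this processing, when the operator of the work terminal 120 designates a rectangular region around a particular character string, or highlights a particular character string, and then drags and drops it onto the desired input field, these actions are reproduced automatically within the document management server 110. The character strings in the input fields for bibliographic information and the like can therefore be updated easily, the man-hours for entering bibliographic information are reduced, and the burden on the operator is lightened.
[0119]
As described above, according to this embodiment, the document management server 110 performs OCR processing on the image data read by the multifunction peripheral 140, extracts bibliographic information from the resulting text data according to fixed rules, and sends the image data, the text data, and the extracted bibliographic information to the work terminal 120, while continuing OCR processing in the background in parallel with the extraction and transmission processing. Meanwhile the work terminal 120 displays the image data, the text data, and the extracted bibliographic information on its screen. The waiting time of processing other than character recognition is thus minimized, the man-hours for entering bibliographic information are reduced, and work efficiency is improved.
[0120]
In this embodiment the document management server 110 includes the OCR processing unit, but the OCR processing unit alone may be configured as a separate server (OCR server). In that case, the text data recognized by OCR processing is sent from the OCR server to the document management server and stored in the text DB 118. Making the OCR server independent in this way lightens the load on the document management server 110 and further improves processing speed.
[0121]
In this embodiment, in the example screen of the work terminal 120 displayed during bibliographic registration (FIG. 5), the image display section 1201, the recognized-text display section 1202, the buttons 1203 to 1208, and the bibliographic input fields 1209 to 1215 all belong to one application. As shown in FIG. 17, however, the image display section 1201, the recognized-text display section 1202, and the buttons 1203 to 1208 may belong to one application while the bibliographic input fields 1209 to 1215 belong to another.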
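[0122]
(Second Embodiment)
In the first embodiment the document management server 110 performs OCR processing; this embodiment differs in that the multifunction peripheral 240 performs OCR processing.
[0123]
FIG. 11 is a block diagram showing the configuration of a document management system to which the document management device according to the second embodiment of the present invention can be applied. The document management device according to the embodiment is applied to the document management server 210.
[0124]
Unlike the document management server 110 of the first embodiment, the document management server 210 in the figure has no OCR processing unit 112 but has a text registration unit 212.
[0125]
In addition to the configuration of the multifunction peripheral 140 of the first embodiment, the multifunction peripheral 240 has an OCR processing unit 244. The OCR processing unit 112 and the OCR processing unit 244 have the same function.
[0126]
On the work terminal 120 of this embodiment, either browser software combined with JavaScript ("Java" is a registered trademark) and ActiveX is used, or a client application is installed and running.
[0127]
Apart from the differences described above, the document management system of FIG. 11 has the same configuration as that of FIG. 2.
[0128]
FIGS. 12 and 13 are flowcharts showing the processing performed by the document management system of FIG. 11.
[0129]
These flowcharts are largely the same as those of FIGS. 3 and 4 described above, so steps that perform the same processing carry the same step numbers and only the differences are described.
[0130]
Steps S943 to S945 are executed under the control of the CPU (not shown) of the multifunction peripheral 240, steps S923 to S925 and step S935 are executed under the control of the CPU of the document management server 210, and step S907 is executed under the control of the CPU of the work terminal 120.
[0131]
In FIG. 12, the multifunction peripheral 240 scans the document, stores the image data in the image data storage unit 147 (step S642), and then performs OCR processing (step S943). This OCR processing is the same as that described with reference to FIGS. 6 and 7, except that it is executed by the control unit 145 of the multifunction peripheral 240.
[0132]
After OCR processing finishes, the multifunction peripheral 240 sends the scanned image data and the text data resulting from OCR recognition to the document management server 210 (step S944), and the control unit 145 determines whether any page remains that the OCR processing unit 244 has not processed (step S945). If such a page exists, processing returns to step S943 and OCR processing continues; if not, processing proceeds to step S641.
[0133]
On the document management server 210, the CPU 21 waits not only for scanned images from the multifunction peripheral but also for the text data resulting from OCR recognition (step S923). It checks at fixed intervals whether there is a scanned image and text data to receive (step S924); if there is none (No in step S924), it returns to step S923 and waits. If there is (Yes in step S924), the scanned image and text data are received, the image management unit 114 registers the received image data in the image DB 117, and the text registration unit 212 registers the received text data in the text DB 118 (step S925).
[0134]
If step S634 finds that the registration button 1206 was not pressed, then either the text data shown in the recognized-text display section 1202 or one of the bibliographic input fields 1209 to 1215 has been modified. The server therefore applies the modification to the text data and bibliographic information in the work area of the RAM 22 and sends the updated text data and bibliographic information to the work terminal 120 (step S935).
[0135]
The work terminal 120 receives the text data and bibliographic information sent from the document management server 210 in step S935 (step S907) and executes step S604.
[0136]
In step S935 only the updated text data and bibliographic information are sent to the work terminal 120, because the work terminal 120 has browser software or the like and can update the screen display from the updated text data and bibliographic information alone.
[0137]
The modification of part of the processing shown in FIG. 10 can also be applied to this embodiment. In that case, step S630, steps S1123 to S1130, and step S634 may be executed on the work terminal 120, with the work terminal 120 sending the final edited registration information to the document management server 210.
[0138]
As described above, according to this embodiment, the multifunction peripheral 240 performs OCR processing on the image data it has read, and the document management server 210 receives the image data and the resulting text data from the multifunction peripheral 240, extracts bibliographic information from the text data according to fixed rules, and sends the image data, the text data, and the extracted bibliographic information to the work terminal 120. The multifunction peripheral 240 continues OCR processing in the background in parallel with the extraction and transmission processing, while the work terminal 120 displays the image data, the text data, and the extracted bibliographic information on its screen. In the document management server 210, the waiting time of processing other than character recognition is thus minimized, the man-hours for entering bibliographic information are reduced, and work efficiency is improved.
[0139]
In addition, because the multifunction peripheral 240 performs OCR processing, the load on the document management server 210 is lightened and its processing speed is further improved.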
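[0140]
(Third Embodiment)
In the first embodiment the document management server 110 performs OCR processing and bibliographic extraction; this embodiment differs in that the work terminal 320 performs them.
[0141]
FIG. 14 is a block diagram showing the configuration of a document management system to which the document management device according to the third embodiment of the present invention can be applied. The document management device according to the embodiment is applied to the document management server 310.
[0142]
Unlike the document management server 110 of the first embodiment, the document management server 310 in the figure has no OCR processing unit 112, bibliographic extraction unit 113, or extraction rule storage unit 115, but has a text registration unit 212. Unlike in the first embodiment, the multifunction peripheral 140 has no image data storage unit 147. Unlike the work terminal 120 of the first embodiment, the work terminal 320 has an OCR processing unit 324, a bibliographic extraction unit 323, an extraction rule storage unit 325, and a text data storage unit 328.
[0143]
On the work terminal 320 of this embodiment, either browser software combined with JavaScript ("Java" is a registered trademark) and ActiveX is used, or a client application is installed and running.
[0144]
Apart from the differences described above, the document management system of FIG. 14 has the same configuration as that of FIG. 2.
[0145]
FIGS. 15 and 16 are flowcharts showing the processing performed by the document management system of FIG. 14.
[0146]
These flowcharts are largely the same as those of FIGS. 3 and 4 described above, so steps that perform the same processing carry the same step numbers and only the differences are described.
[0147]
Steps S1025 to S1030 are executed under the control of the CPU of the document management server 310, and steps S1003 to S1013 and steps S1031 to S1033 are executed under the control of the CPU of the work terminal 320.
[0148]
In FIG. 15, in step S624 the document management server 310 determines whether there is a scanned image to receive and, if there is, assigns one management number to the received scanned image (step S1025). This management number is used as the search key, or part of the search key, of the bibliographic DB 116, the image DB 117, and the text DB 118, and links the data in these three DBs.
[0149]
The document management server 310 then forwards the image data to the work terminal 320 (step S1026), registers the image data in the image DB 117 (step S1027), and returns to step S623.
[0150]
The work terminal 320 receives the image data forwarded from the document management server 310 in step S1026 (step S1003) and performs OCR processing (step S1004) and bibliographic candidate extraction (step S1005). These are the same as the OCR processing and bibliographic candidate extraction performed by the document management server 110 in the first embodiment.
[0151]
Next, the work terminal 320 displays a screen such as that of FIG. 5: the image data is shown in the image display section 1201, the text data recognized by OCR processing is shown in the recognized-text display section 1202, and the bibliographic information candidates are shown in the input fields 1209 to 1215, so that the screen display is updated (step S1006).
[0152]
Next, the work terminal 320 identifies keyboard input and mouse operations (step S1007) and executes steps S1008 to S1011. Steps S1008 to S1011 are the same as steps S630 to S634 of FIG. 4, so their description is omitted.
[0153]
If the registration button was not pressed in step S1011, then either the text data shown in the recognized-text display section 1202 of FIG. 5 or one of the bibliographic input fields 1209 to 1215 has been modified. The text data is therefore updated to reflect the modification (step S1012), and processing returns to step S1006.
[0154]
If step S1009 finds that the next-document button 1207 was pressed, the CPU of the work terminal 320 determines whether OCR processing has finished for the document following the one currently being processed (step S1031). If it has, the image data of the first page of the next document, the corresponding text data, and the bibliographic information candidates are read from RAM or the like (step S1032), and processing proceeds to step S1006. If OCR processing of the next document has not finished, processing returns to step S1004.
[0155]
If step S1010 finds that the previous-page button 1204 or the next-page button 1205 of FIG. 5 was pressed, the CPU of the work terminal 320 reads from RAM or the like the image data of the previous or next page (according to the button pressed), the corresponding text data, and the bibliographic information candidates (step S1032), and processing proceeds to step S1006.
[0156]
If the registration button was pressed in step S1011, the CPU of the work terminal 320 sends the text data and bibliographic information to the document management server 310 (step S1013), and processing returns to step S1006.
[0157]
The CPU of the document management server 310 receives the text data and bibliographic information sent from the work terminal 320 (step S1028), registers the bibliographic information in the bibliographic DB 116 and the text data in the text DB 118 (step S1029), and then waits until it next receives text data and bibliographic information from the work terminal 320 (step S1030).
[0158]
As described above, according to this embodiment, the work terminal 320 performs OCR processing on the image data read by the multifunction peripheral 140, extracts bibliographic information from the resulting text data according to fixed rules, and continues OCR processing in the background in parallel with the extraction processing, while displaying the image data, the text data, and the extracted bibliographic information on its screen. In the work terminal 320, the waiting time of processing other than character recognition is thus minimized, the man-hours for entering bibliographic information are reduced, and work efficiency is improved. Because the load on the document management server 310 is lightened, processing speed is further improved.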
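[0159]
(Fourth Embodiment)
This embodiment differs from the first embodiment in part of the processing performed by the document management system in FIG. 3 and is otherwise the same, so only the differences are described.
[0160]
This embodiment assumes that the bibliographic candidate extraction of step S627 in FIG. 3 is not executed, and that in step S628 the image and text information shown in FIG. 20 has been sent to the work terminal 120.
[0161]
First, the changes to the screen on the work terminal 120 are described with reference to FIG. 20.
[0162]
The display section 1901 of FIG. 20 shows the OCR text recognized in step S626 of FIG. 3. In the image display section 1902, the OCR text is converted to HTML and laid over the received scanned image as a transparent layer. This HTML is normally transparent (invisible). Horizontally the text positions coincide with the original image, but vertically the text is offset from the original image by roughly one line upward (or downward). Because the layer is normally transparent, only the scanned image beneath it is visible.
[0163]
Reference numeral 1903 is a button for displaying the first page of the currently displayed document, and 1904 is a button for displaying the previous page. Item 1905 shows the displayed page and the total page count of the currently displayed document. For example, "1/3" means that the scanned image and corresponding OCR text of page 1 of a 3-page document are currently displayed. Reference numeral 1906 is a button for displaying the next document, and it shows the progress of OCR work on the next document, that is, the page up to which OCR has been completed in steps S626 to S629 of FIG. 3. For example, "4/4" means that the next document has 4 pages and that OCR has been completed through page 4. In this case the display of 1906 changes from the initial "1/4" to "2/4" and "3/4", and finally reaches "4/4". Reference numeral 1907 is a button for displaying the next page of the next document, and 1908 is a button for displaying its last page. Pressing button 1905 while the next document is displayed redisplays the scanned image and OCR text of the current document that was shown immediately before. When the registration button 1911 is pressed, bibliographic registration is performed and, on the screen, the display of button 1906 moves up to display 1905; that is, "1/3" changes to "1/4". Button 1906 then shows the page count and OCR progress of the document after that. If the new next document (formerly the document after next) has 5 pages in total and OCR has been completed through page 3, "next document 3/5" is displayed.
[0164]
When the cursor is placed on a particular portion of text in the display section 1901 showing the OCR text, for example "○△改正について（通達）", and the mouse button is pressed, the corresponding portion of the transparent HTML in the image display section 1902 is highlighted (highlighted portion 1919). One sentence is highlighted, where a sentence is delimited by the preceding and following whitespace or line break.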
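Selecting the unit of text to highlight can be illustrated by expanding outward from the clicked character until whitespace or a line break is reached on each side. This is a minimal sketch assuming the OCR text is available as one string and the click has already been mapped to a character index; the function name `phrase_at` is hypothetical.

```python
def phrase_at(text: str, index: int) -> str:
    """Return the run of non-whitespace characters containing text[index].

    Whitespace (spaces, tabs, full-width spaces) and line breaks delimit the run.
    Returns an empty string if the index is out of range or on whitespace.
    """
    if not 0 <= index < len(text) or text[index].isspace():
        return ""
    start = index
    while start > 0 and not text[start - 1].isspace():
        start -= 1
    end = index
    while end < len(text) and not text[end].isspace():
        end += 1
    return text[start:end]
```

For example, clicking anywhere inside "○△改正について（通達）" in a line such as "平成15年 ○△改正について（通達）" selects that whole phrase, which can then be dragged to an input field.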
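[0165]
Because the OCR progress of the next document can be checked easily in this way, the operator can move smoothly from bibliographic registration of the current document to that of the next document. When the highlighted portion 1919 is dragged and dropped onto the subject input field 1912, "○△改正について（通達）" is entered there.
[0166]
Next, the processing corresponding to the screen behavior described for FIG. 20 is explained in detail with the flowcharts of FIGS. 18 and 19.
[0167]
FIGS. 18 and 19 are flowcharts showing a modification of part of the processing performed by the document management system in FIG. 3, and therefore show only the processing that differs from FIG. 3. Processing steps that are the same as in FIG. 3 are given step numbers beginning with "S6".
[0168]
Step S630, steps S1821 to S1835, and step S635 of FIG. 18 are executed under the control of the CPU 21 of the document management server 110, and steps S604 to S607 are executed under the control of the CPU of the work terminal 120.
[0169]
On the work terminal 120, either browser software combined with JavaScript ("Java" is a registered trademark) and ActiveX is used, or a client application is installed and running.
[0170]
The work terminal 120 identifies keyboard input and mouse operations in step S605 of FIG. 3 and sends this operation information to the document management server 110 (step S606). The operation information is, for example, "pressing the mouse button at a particular position on the screen", a "drag-and-drop operation", "character input to or deletion from a highlighted portion", or the pressing of one of the buttons. When the operation is a drag and drop, the operation information includes the dragged character string, the coordinates where the drag started, and the coordinates of the drop destination. For character input to or deletion from a highlighted portion, the operation information also includes the position coordinates of the highlighted portion and the text that was entered or deleted.
[0171]
Based on the input and operation information sent from the work terminal 120 in step S606, the CPU 21 of the document management server 110 determines whether the end button 1920 was pressed on the screen of the work terminal 120 (step S630).
[0172]
If step S630 finds that the end button 1920 was pressed, this processing ends. If the end button was not pressed, processing proceeds to step S1821.
[0173]
In step S1821, it is determined whether the operation information sent from the work terminal 120 in step S606 is a mouse button press. If not, processing proceeds to step S1830. If so, processing proceeds to step S1822, where it is detected whether a page button was pressed. The page buttons are the buttons 1903 to 1908 of FIG. 20, and also include the two buttons immediately below buttons 1903 and 1904 and the two buttons immediately above buttons 1907 and 1908.
[0174]
If a page button was pressed in step S1822, processing proceeds to step S1823, where the image and OCR-processed text data of the designated page are read from the image DB 117 and text DB 118 of the document management server 110. Screen information is sent in step S635, the work terminal 120 receives it in step S607, and a screen such as that of FIG. 20 is displayed in step S604.
[0175]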
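If no page button was pressed in step S1822, processing proceeds to step S1824, where it is detected whether the mouse button was pressed while the cursor was over text, that is, while the cursor was within the screen region where text is displayed. If a press over text is detected, processing proceeds to step S1825, and the one phrase of text at the mouse position is copied to a buffer reserved in RAM. A mouse button press that begins a drag and drop, described later, is handled in the same way. Here, one phrase is a character string bounded before and after by a space or tab. Processing then proceeds to the next step (S1826), where the HTML attribute of the portion corresponding to the text is changed from transparent to highlighted and the result is displayed on the screen. Pressing a portion that is already highlighted does not change the highlighting. Step S635 and the subsequent steps are the same as in FIG. 4, so their description is omitted.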
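[0176]
If the press in step S1824 was not over text, processing proceeds to step S634, where it is determined whether the registration button was pressed. This is the same as in FIG. 3, so its description is omitted. If the registration button was not pressed, processing proceeds to step S1829, where the processing corresponding to the pressed button, such as "text update", is executed. This concludes the description of the processing when the mouse button is pressed.
[0177]
If the mouse button was not pressed in step S1821, processing proceeds to step S1830, where it is detected whether characters were entered or deleted. If so, processing proceeds to step S1831, and the text at the corresponding position in the buffer is updated.
[0178]
If no characters were entered or deleted in step S1830, processing proceeds to step S1832, where it is detected whether a drag and drop occurred, that is, whether the mouse pointer was moved with the mouse button held down and the button was then released. If a drag and drop occurred, processing proceeds to step S1833, where the start and end points of the drag and drop are detected from the mouse event information. Processing then proceeds to step S1834, where it is detected whether the highlighted text position was dragged and dropped. If it was, processing proceeds to step S1835, the text in the buffer is entered into the input field at the drop position, and processing proceeds to step S635, where screen information is sent.
[0179]
If step S1834 finds that the highlighted position was not dragged, processing proceeds directly to step S635.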
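[0180]
As described above, according to this embodiment, text on the screen can be entered into the desired input field with a simple operation, so the efficiency of bibliographic registration is greatly improved. That is, what would normally take four actions (highlighting from the start to the end of the target characters by clicking and dragging, pressing Ctrl+C, moving the cursor to the input field, and pressing Ctrl+V) can be done in two.
[0181]
Needless to say, the object of the present invention is also achieved by supplying a software program that realizes the functions of the above embodiments to a computer or control unit (specifically, a CPU) and having the computer or CPU read and execute the supplied program.
[0182]
In this case, the program is supplied either directly from a recording medium (not shown) on which it is recorded, or by downloading it from another computer, database, or the like (not shown) connected to the Internet, a commercial network, a local area network, or the like.
[0183]
The program need only be able to realize the functions of the above embodiments on a computer, and may take the form of object code, a program executed by an interpreter, script data supplied to an OS, or the like.
[0184]
Needless to say, the object of the present invention is also achieved by supplying a computer with a recording medium on which a software program realizing the functions of the above embodiments is recorded, and having the computer read and execute the program stored on the recording medium.
[0185]
The recording medium that supplies the program may be any medium capable of storing the program, for example a RAM, NV-RAM, floppy (registered trademark) disk, optical disk, magneto-optical disk, CD-ROM, MO, CD-R, CD-RW, DVD (DVD-ROM, DVD-RAM, DVD-RW, DVD+RW, DVD-R, DVD+R, Blu-ray Disc, etc.), magnetic tape, non-volatile memory card, or other ROM.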
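[0186]
[Effects of the invention]
As described above, according to the document management apparatus of claim 1 and the document management program of claim 9, text data is generated by performing character recognition processing on image data received from the image reading device, and bibliographic information is extracted based on the generated text data and the extraction rules and transmitted to the client device. The image data, the text data, and the extracted bibliographic information can therefore be displayed on the screen of the client device while character recognition processing of other image data continues in the background. Accordingly, the waiting time for processing other than the character recognition processing can be minimized, the man-hours for entering bibliographic information can be reduced, and work efficiency can be improved.
[0187]
According to the document management apparatus of claim 2, the image data, the text data, and the bibliographic information are stored in association with one another, so the image data, the text data, and the extracted bibliographic information can be registered in a database and retrieved together, and the time required to correct bibliographic information can be minimized.
[0188]
According to the document management apparatus of claim 3, the bibliographic information is updated based on editing information received from the client device, so character strings in input fields such as those for bibliographic information can be updated easily, which reduces the man-hours for entering bibliographic information and the like and lightens the operator's burden.
[0189]
According to the document management apparatus of claim 6, correcting bibliographic information becomes easier.
[0190]
According to the document management system of claim 7, the waiting time in the document management apparatus for processing other than the character recognition processing can be minimized, the man-hours for entering bibliographic information can be reduced, and work efficiency can be improved. Because the image reading device executes the character recognition processing, the load on the document management apparatus is reduced and its processing speed can be further improved.
[0191]
According to the document management system of claim 8, in the client device, the waiting time for processing other than the character recognition processing can be minimized, the man-hours for entering bibliographic information can be reduced, and work efficiency can be improved. Because the load on the document management apparatus is reduced, its processing speed can be further improved.
[0192]
According to the document management system of claim 9, text on the screen can be entered into a desired input field with a simple operation, so the efficiency of bibliographic information registration can be greatly improved. Specifically, what would normally take four actions (highlighting specific characters by clicking and dragging from their start point to their end point, pressing Ctrl+C, moving the cursor to the input field, and pressing Ctrl+V) can be done in two.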
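[Brief description of the drawings]
FIG. 1 is a block diagram showing the hardware configuration of a document management apparatus according to the first embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of a document management system to which the document management apparatus according to the embodiment of the present invention can be applied.
FIG. 3 is a flowchart showing processing executed by the document management system in FIG. 2.
FIG. 4 is a flowchart showing processing executed by the document management system in FIG. 2.
FIG. 5 is a diagram showing an example of an application displayed on the screen of the work terminal 120 during bibliographic registration processing.
FIG. 6 is a flowchart showing the OCR processing (step S626 in FIG. 3).
FIG. 7 is a flowchart showing the OCR processing (step S626 in FIG. 3).
FIG. 8 is a flowchart showing the bibliographic candidate extraction processing (step S627 in FIG. 3).
FIG. 9 is a flowchart showing the bibliographic candidate extraction processing (step S627 in FIG. 3).
FIG. 10 is a flowchart showing a modification of part of the processing executed by the document management system in FIG. 3.
FIG. 11 is a block diagram showing the configuration of a document management system to which a document management apparatus according to the second embodiment of the present invention can be applied.
FIG. 12 is a flowchart showing processing executed by the document management system in FIG. 11.
FIG. 13 is a flowchart showing processing executed by the document management system in FIG. 11.
FIG. 14 is a block diagram showing the configuration of a document management system to which a document management apparatus according to the third embodiment of the present invention can be applied.
FIG. 15 is a flowchart showing processing executed by the document management system in FIG. 14.
FIG. 16 is a flowchart showing processing executed by the document management system in FIG. 14.
FIG. 17 is a diagram showing an example of an application displayed on the screen of the work terminal 120 during bibliographic registration processing.
FIG. 18 is a flowchart showing a modification of part of the processing executed by the document management system in FIG. 3.
FIG. 19 is a flowchart showing a modification of part of the processing executed by the document management system in FIG. 3.
FIG. 20 is a diagram showing an example of an application displayed on the screen of the work terminal 120 during bibliographic registration processing.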
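[Explanation of reference numerals]
21 CPU
22 RAM
23 ROM
28 Hard disk
110 Document management server
111 Bibliographic registration unit
112 OCR processing unit
113 Bibliographic extraction unit
114 Image management unit
115 Extraction rule storage unit
116 Bibliographic database (DB)
117 Image database (DB)
118 Text database (DB)
[0001]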
[Technical field to which the invention belongs]
The present invention relates to a document management apparatus, a document management system, and a document management program capable of executing processing for extracting bibliographic items from text data based on certain rules.
[0002]
[Prior art]
Conventionally, for work in which documents written on paper are digitized and registered, a document management system is known that includes a scanner that reads an image of a document written on paper, and a computer that generates text data by executing OCR (optical character recognition) processing on the image data read by the scanner and registers bibliographic items entered by a user in association with the image data and the text data.
[0003]
Patent Document 1 discloses a technique in which a document written on paper is optically read, characters are recognized by OCR processing, the layout of the character areas is recognized, character sizes and font types are identified, and titles, figure captions, and keywords are extracted.
[0004]
Patent Document 2 discloses a technique for a system including a copying machine and a computer in which, after index information (classification and keywords) is input on the copying machine, the additional information and image data are transmitted to the computer (PC), converted into a data format for a database (DB), and registered and managed in a database recorded on a hard disk or the like.
[0005]
[Patent Document 1]
Japanese Patent Laid-Open No. 11-238072
[Patent Document 2]
JP 2002-290661 A
[0006]
[Problems to be solved by the invention]
However, in the conventional document management system described above, bibliographic information must be entered manually on the screen of a personal computer or the like while referring to the image data and text data, which is troublesome and makes the system hard to use.
[0007]
With the technique of Patent Document 1, titles can be identified automatically to some extent, but identification may fail if, for example, the title uses the same character size and font as the body text. In addition, among bibliographic information, document management information (document creation date, issuer, addressee, etc.) usually appears only once, so it is highly likely that it cannot be extracted by keyword extraction in order of frequency.
[0008]
Furthermore, OCR processing of an entire document takes a considerable amount of time, so the person in charge of document registration has to wait after the document is read until the OCR processing is completed, and the actual operating rate is low.
[0009]
With the system of Patent Document 2, image data is transmitted to the computer each time keywords are input on the copying machine, so associating the additional information with the image data is easy, but keyword input and image reading cannot be performed at the same time. As a result, the time during which the copying machine is actually reading images is a small share of the total work time, and work efficiency is poor.
[0010]
The present invention has been made to solve the above problems, and its object is to provide a document management apparatus, a document management system, and a document management program that can minimize the waiting time for processing other than character recognition processing, reduce the man-hours for entering bibliographic information, and improve work efficiency.
[0011]
[Means for Solving the Problems]
To achieve the above object, the document management apparatus according to claim 1 is a document management apparatus connected via a communication line to an image reading device that reads an image of a document and to a client device that displays image data of the read image, the apparatus comprising: image storage means for storing image data received from the image reading device; character recognition processing means for generating text data by performing character recognition processing on the image data stored in the image storage means; extraction rule storage means for storing extraction rules for extracting bibliographic information from the text data; extraction means for extracting bibliographic information based on the text data and the extraction rules; and transmitting means for transmitting the image data, the text data, and the bibliographic information to the client device.
[0012]
The document management apparatus according to claim 2 is the document management apparatus according to claim 1, further comprising document information storage means for storing the image data, the text data, and the bibliographic information in association with one another.
[0013]
The document management apparatus according to claim 3 is the document management apparatus according to claim 1 or 2, further comprising receiving means for receiving, from the client device, editing information for the bibliographic information, and bibliographic information updating means for updating the bibliographic information based on the received editing information.
[0014]
The document management apparatus according to claim 4 is the document management apparatus according to claim 3, further comprising text data updating means for updating the text data based on the received editing information.
[0015]
The document management apparatus according to claim 5 is the document management apparatus according to claim 4, further comprising display information generating means for generating display information to be displayed on the screen of the client device based on the image data, the bibliographic information updated by the bibliographic information updating means, and the text data updated by the text data updating means.
[0016]
The document management apparatus according to claim 6 is the document management apparatus according to any one of claims 3 to 5, wherein the received editing information is information indicating whether text data selected on the client device is highlighted, information indicating whether a rectangular area including text data selected on the client device is designated, and information indicating whether the highlighted text data or the text data included in the designated rectangular area has been dragged and dropped into a predetermined input field.
[0017]
The document management system according to claim 7 comprises an image reading device that reads an image of a document, a client device that displays image data of the read image, and a document management apparatus connected to the image reading device and the client device via a communication line, wherein the image reading device comprises character recognition processing means for generating text data by performing character recognition processing on the image data of the read image, and transmitting means for transmitting the image data and the text data to the document management apparatus, and the document management apparatus comprises storage means for storing the image data and text data received from the image reading device, extraction rule storage means for storing extraction rules for extracting bibliographic information from the text data, extraction means for extracting bibliographic information based on the text data and the extraction rules, and transmitting means for transmitting the image data, the text data, and the bibliographic information to the client device.
[0018]
The document management system according to claim 8 comprises an image reading device that reads an image of a document, a client device that displays image data of the read image, and a document management apparatus connected to the image reading device and the client device via a communication line, wherein the client device comprises image storage means for storing image data received from the image reading device, character recognition processing means for generating text data by performing character recognition processing on the image data stored in the image storage means, extraction rule storage means for storing extraction rules for extracting bibliographic information from the text data, extraction means for extracting bibliographic information based on the text data and the extraction rules, and display information generating means for displaying the image data, the text data, and the bibliographic information on the client device.
[0019]
In the document management system according to claim 9, the document management apparatus comprises receiving means for receiving, from the client device, editing information for the bibliographic information, and the received editing information includes information indicating whether text data selected on the client device is highlighted, information indicating whether a rectangular area including text data selected on the client device is designated, and information indicating whether the highlighted text data or the text data included in the designated rectangular area has been dragged and dropped into a predetermined input field.
[0020]
The document management program according to claim 10 is executed by a computer connected via a communication line to an image reading device that reads an image of a document and to a client device that displays image data of the read image, and comprises: an image storage module for storing image data received from the image reading device; a character recognition processing module for generating text data by performing character recognition processing on the stored image data; an extraction rule storage module for storing extraction rules for extracting bibliographic information from the text data; an extraction module for extracting bibliographic information based on the text data and the extraction rules; and a display information generating module for generating display information for displaying the image data, the text data, and the bibliographic information on the client device.
[0021]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0022]
(First embodiment)
FIG. 1 is a block diagram showing a hardware configuration of a document management apparatus according to the first embodiment of the present invention.
[0023]
In the figure, a CPU 21 (transmitting means, receiving means, display information generating means), a RAM 22, a ROM 23, a LAN adapter 24 (transmitting means, receiving means), a video adapter 25, a keyboard 26, a mouse 27, a hard disk 28, and a CD-ROM drive 29 are connected to one another via a system bus 20. The system bus 20 is, for example, a PCI bus, an AGP bus, or a memory bus. The document management server 110 also includes chips for connecting the buses, a keyboard interface, and input/output interfaces such as SCSI or ATAPI, but these are omitted from FIG. 1.
[0024]
The CPU 21 performs various operations such as the four basic arithmetic operations and comparisons, and controls hardware and software. The RAM 22 stores the operating system, application programs, and the like read from the hard disk 28 or from a storage medium such as a CD-ROM or CD-R mounted in the CD-ROM drive 29, and these programs are executed under the control of the CPU 21. The ROM 23 stores the so-called BIOS, which works with the operating system to manage input and output to the hard disk and other devices. The LAN adapter 24 communicates with external devices (not shown) via a network (not shown) in cooperation with a communication program of the operating system controlled by the CPU 21. The video adapter 25 is connected to a display device (not shown) and generates the image signal output to it. The keyboard 26 and the mouse 27 are used to input instructions to the document management server 110.
[0025]
The hard disk 28 stores the operating system, application programs, various data such as the extraction rule storage unit 115 and the bibliographic DB 116 described later, and various master files (not shown). The CD-ROM drive 29 is used to install application programs on the hard disk 28 from a mounted storage medium such as a CD-ROM, CD-R, or CD-R/W. It goes without saying that a CD-R drive, a CD-R/W drive, or an MO drive may be used instead of the CD-ROM drive.
[0026]
The work terminal 120 and the management terminal 130, which will be described later, have the same hardware configuration as the document management apparatus in FIG. 1.
[0027]
FIG. 2 is a block diagram showing a configuration of a document management system to which the document management apparatus according to the embodiment of the present invention can be applied. The document management apparatus according to the embodiment of the present invention is applied to the document management server 110.
[0028]
In FIG. 2, the document management system includes a communication line 100, a document management server 110, a work terminal 120, a management terminal 130, and a multifunction device 140.
[0029]
The work terminal 120 and the management terminal 130 are each, for example, a personal computer, a PDA (Personal Digital Assistant), or an Internet-capable mobile phone; any device that can accept character input and display images and characters may be used.
[0030]
The communication line 100 is typically the Internet, a LAN (Local Area Network), a WAN (Wide Area Network), a telephone line, a dedicated digital line, an ATM (Asynchronous Transfer Mode), a frame relay line, a communication satellite line, and a cable TV line. Or a so-called communication network realized by any one of a wireless link for data broadcasting or a combination thereof, and any data transmission / reception is possible.
[0031]
The document management server 110 performs document registration processing and search processing using a predetermined OS (for example, UNIX (registered trademark) or WINDOWS (registered trademark)) and application programs. The document management server 110 includes a bibliographic registration unit 111 (bibliographic information updating means), an OCR processing unit 112 (character recognition processing means), a bibliographic extraction unit 113 (extraction means), an image management unit 114, an extraction rule storage unit 115 (extraction rule storage means), a bibliographic database (DB) 116 (document information storage means), an image database (DB) 117 (image storage means), and a text database (DB) 118.
[0032]
The image management unit 114 stores image data received from the multifunction device 140 via the communication line 100 in the image DB 117. The OCR processing unit 112 comprises a program that executes character recognition processing on image data stored in the image DB 117 that has not yet undergone OCR processing, together with a recognition dictionary, and stores the text data generated by the character recognition processing in the text DB 118. The text DB 118 is a database that supports full-text search, but a detailed description of the search is omitted.
[0033]
The bibliographic extraction unit 113 extracts bibliographic information based on the text data and the bibliographic information extraction rules stored in the extraction rule storage unit 115. The bibliographic information extraction rules stored in the extraction rule storage unit 115 will be described in detail with reference to FIGS.
[0034]
The bibliographic registration unit 111 updates the bibliographic information based on the editing information received from the work terminal 120. When editing information instructing "registration" is received from the work terminal 120, the bibliographic information is registered in the bibliographic DB 116. This bibliographic information is associated with the corresponding image data in the image DB 117 and the corresponding text data in the text DB 118. The document management server 110 also includes a search processing unit (not shown) that can search across the bibliographic DB 116, the image DB 117, and the text DB 118 in a linked manner, but a detailed description thereof is omitted.
[0035]
The bibliographic registration unit 111, the OCR processing unit 112, the bibliographic extraction unit 113, and the image management unit 114 are realized by the CPU 21 executing control based on programs stored on the hard disk 28. The bibliographic database (DB) 116, the image database (DB) 117, and the text database (DB) 118 are constructed on the hard disk 28.
[0036]
Next, the multifunction device 140 includes a scan engine (not shown) provided with an image sensor such as a CCD; an image data storage unit 147 that stores image data read by the scan engine; a control unit 145 that transmits the stored image data to the document management server 110 and stores the transmission history in a log data storage unit 146; and a printer engine (not shown) having a function of printing input PDL data and a function of printing image data output from the scan engine. The scan engine and the printer engine are configured to be able to communicate with each other. The log data storage unit 146 and the image data storage unit 147 are provided on a hard disk device included in the multifunction device 140.
[0037]
The multifunction device 140 includes a network controller (not shown) and a communication I/F, and is communicably connected to the document management server 110, the work terminal 120, and the management terminal 130 via the communication line 100.
[0038]
FIGS. 3 and 4 are flowcharts showing processing executed by the document management system in FIG. 2.
[0039]
In FIGS. 3 and 4, the processes in steps S601 to S607 are executed under the control of a CPU (not shown) of the work terminal 120, the processes in steps S621 to S636 are executed under the control of the CPU 21 of the document management server 110, and the processes in steps S641 to S643 are executed under the control of a CPU (not shown) of the multifunction device 140.
[0040]
It is assumed that, before the document management server 110 executes step S621, the following have already been completed: the work terminal 120 has sent an authentication request, that is, a user ID, password, or the like, to the document management server 110; the document management server 110 has executed the authentication processing; the work terminal 120 has transmitted menu selection information to the document management server 110; and the document management server 110 has generated screen information for the work terminal 120 based on that menu selection information. It is also assumed that the document to be scanned has already been placed in the multifunction device 140.
[0041]
First, the document management server 110 transmits screen information for displaying a document registration screen to the work terminal 120 (step S621), and the display connected to the work terminal 120 displays a screen such as that shown in FIG. 5 (step S601).
[0042]
FIG. 5 is a diagram illustrating an example of an application displayed on the screen of the work terminal 120 during the bibliographic registration process.
[0043]
In the figure, reference numeral 1200 denotes a scan button for entering a scan instruction; 1201, an image display unit that displays the image data subjected to OCR processing; 1202, an image display unit that displays the text data actually recognized by the OCR processing; 1209 to 1215, input fields for various bibliographic information; 1203, an update button for updating the text data or the bibliographic information; 1204, a previous page button for requesting the image data, text data, and bibliographic candidates of the previous page from the document management server 110; 1205, a next page button for requesting the image data, text data, and bibliographic candidates of the next page from the document management server 110; 1206, a registration button for instructing the document management server 110 to register the bibliographic information; 1207, a next document button for requesting the image data of the first page of the next document, its text data, and its bibliographic candidates from the document management server 110; and 1208, an end button.
[0044]
In FIG. 5, when the scan button 1200 is pressed, the work terminal 120 transmits to the document management server 110 information requesting the start of scanning (scan request information) and work terminal specifying information (for example, a user ID or a session ID) that identifies the work terminal 120 from which the request originates (step S602).
[0045]
The document management server 110 transfers the scan request information and work terminal specifying information received from the work terminal 120 to the multifunction device 140 and assigns one predetermined document number (step S622). The document number is a unique management number assigned to each document. It is used as a search key, or as part of a search key, for the bibliographic DB 116, the image DB 117, and the text DB 118, and serves to associate their records with one another.
[0046]
The multifunction device 140 receives the scan request information and work terminal specifying information from the document management server 110 (step S641), scans the document (step S642), and stores the image data in the image data storage unit 147. Each page of image data is assigned a unique image number. For example, the image number is obtained by combining the reception time of the image scan request (14 digits) with the page number (lower 3 digits).
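The image numbering scheme can be sketched as follows. The specification states only that the reception time occupies 14 digits and the page number the lower 3 digits; the `YYYYMMDDhhmmss` layout of the time and the function name `make_image_number` are assumptions made for illustration.

```python
from datetime import datetime


def make_image_number(request_time: datetime, page: int) -> str:
    """Combine a 14-digit reception time with a 3-digit page number."""
    if not 0 <= page <= 999:
        raise ValueError("page number must fit in three digits")
    # Assumed layout: year, month, day, hour, minute, second (14 digits in total).
    return request_time.strftime("%Y%m%d%H%M%S") + f"{page:03d}"


image_number = make_image_number(datetime(2003, 6, 20, 9, 30, 5), 2)
```

Because every page of one scan request shares the same 14-digit prefix, the pages of a document sort together and in page order.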
[0047]
The control unit 145 of the multifunction device 140 transmits the scanned image together with the work terminal specifying information of the scan request source to the document management server 110, and stores the name and transmission time of the transmitted image in the log data storage unit 146 (step S643).
[0048]
If the transmission of the image data in step S643 does not end normally, the transmission may be retried after a fixed time has elapsed, and if it still does not end normally after a predetermined number of retries, that fact may be stored in the log data storage unit 146.
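The optional retry behavior can be sketched as below. The retry count, the waiting time, the log message, and the function name `send_with_retry` are all assumed values and names; the specification says only that they are fixed or predetermined. The `send` callable stands in for the actual transmission and returns `True` on success.

```python
import time
from typing import Callable, List


def send_with_retry(send: Callable[[], bool], log: List[str],
                    max_retries: int = 3, wait_seconds: float = 5.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try once, then retry up to max_retries times, waiting between attempts."""
    if send():
        return True
    for _ in range(max_retries):
        sleep(wait_seconds)  # wait a fixed time before each retry
        if send():
            return True
    # Every retry failed: record the failure (stand-in for the log data storage unit).
    log.append(f"transmission failed after {max_retries} retries")
    return False
```

Passing the `sleep` function as a parameter keeps the sketch testable without real delays.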
[0049]
In the document management server 110, processing separate from steps S621 and S622 above (steps S623 to S629) is also executed. The CPU 21 waits for a scanned image from the multifunction device 140 (step S623) and determines at regular intervals whether there is a scanned image to be received (step S624). If there is none (No in step S624), the process returns to step S623 and continues waiting. If there is a scanned image to be received (Yes in step S624), the scanned image is received and the image management unit 114 registers the received image data in the image DB 117 (step S625).
[0050]
Next, the OCR processing unit 112 performs OCR processing on the image data registered in the image DB 117 one page at a time, and the text recognized by the OCR processing is added from the RAM 22 to the text DB 118 (step S626). A flag is attached to the image data in the image DB 117 that has undergone OCR processing. Details of the OCR processing will be described later.
[0051]
Next, the bibliographic extraction unit 113 extracts bibliographic information candidates from the recognized text data in the work area of the RAM 22 based on the extraction rules stored in the extraction rule storage unit 115, and stores them in the work area of the RAM 22 (step S627). Details of the bibliographic candidate extraction processing will also be described later.
[0052]
Thereafter, the CPU 21 transmits the received image data for one page, the text data corresponding to that image data, and the bibliographic information candidates extracted from the first page of the document from the work area of the RAM 22 to the work terminal 120 (step S628).
[0053]
Next, the CPU 21 determines whether the image data registered in the image DB 117 includes a page that has not yet undergone OCR processing by the OCR processing unit 112 (step S629). Specifically, the CPU 21 checks the flag in the image DB 117 that is updated during the OCR processing and, depending on whether the flag is present, determines that there is, or is not, a page that has not been subjected to OCR processing. Alternatively, the CPU 21 may make the determination in step S629 by checking whether text data corresponding to the image data exists in the text DB 118.
[0054]
As a result of the determination in step S629, if there is a page that has not been subjected to the OCR process, the process returns to the process in step S626, and the OCR process for the image data of the next page is executed. On the other hand, if there is no page that has not been subjected to OCR processing, the process returns to step S623, and the CPU 21 waits for reception of the next image data.
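The page-by-page loop of steps S626 and S629 can be sketched as follows. The sketch assumes, in line with the description of step S626, that a flag is set on each page once OCR has been applied to it; the `Page` class, the `recognize` callable standing in for the OCR engine, and the dictionary standing in for the text DB are hypothetical. Candidate extraction (step S627) and transmission (step S628) are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Page:
    image_number: str
    image: bytes
    ocr_done: bool = False  # assumed flag: set once OCR has been applied


def next_unprocessed(pages: List[Page]) -> Optional[Page]:
    """Step S629: return the first page without the flag, or None if all are done."""
    for page in pages:
        if not page.ocr_done:
            return page
    return None


def process_pending(pages: List[Page], recognize: Callable[[bytes], str],
                    text_db: Dict[str, str]) -> int:
    """Run OCR one page at a time until no unprocessed page remains."""
    count = 0
    while (page := next_unprocessed(pages)) is not None:
        text_db[page.image_number] = recognize(page.image)  # step S626
        page.ocr_done = True
        count += 1
    return count
```

When `process_pending` returns, the server would go back to waiting for the next scanned image (step S623).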
[0055]
Next, the work terminal 120 receives the image data, text data, and bibliographic candidates transmitted from the document management server 110 in step S628 (step S603). The work terminal 120 displays a screen such as that shown in FIG. 5: the image data is displayed on the image display unit 1201, the text data recognized by the OCR processing is displayed on the image display unit 1202, and the bibliographic information candidates are displayed in the bibliographic information input fields 1209 to 1215, whereby the screen display is updated (step S604).
[0056]
Next, the work terminal 120 identifies the input information of the keyboard 26 and the operation information of the mouse 27 (step S605), and transmits this information to the document management server 110 (step S606). When there is no input information or operation information, the work terminal 120 stands by until there is an input or an operation.
[0057]
The CPU 21 of the document management server 110 receives the input information of the keyboard 26 and the operation information of the mouse 27 transmitted from the work terminal 120 in step S606, and the screen of the work terminal 120 based on the received input information and operation information. In step S630, it is determined whether or not the end button 1208 has been pressed.
[0058]
If it is determined in step S630 that the end button 1208 has been pressed, the process is terminated. If the end button 1208 has not been pressed, the keyboard 26 transmitted from the work terminal 120 in step S606 is displayed. Based on the input information and the operation information of the mouse 27, it is determined whether or not the next document button 1207 is pressed on the screen of the work terminal 120 (step S631).
[0059]
If the next document button 1207 is pressed as a result of the determination in step S631, it is determined whether or not the OCR processing and bibliographic candidate extraction processing for the next document of the document currently processed by the CPU 21 has been completed (step S631). S632) When these processes are completed, the process returns to the process of step S628, and the CPU 21 displays the image data of the first page of the next document, the corresponding text data, and the bibliographic information candidate as the work terminal. 120. On the other hand, if the process of step S632 has not ended, the process returns to step S626, and the CPU 21 performs the OCR process for the first page of the next document.
[0060]
If the next document button 1207 is not pressed as a result of the determination in step S631, the screen of the work terminal 120 is based on the input information of the keyboard 26 and the operation information of the mouse 27 transmitted from the work terminal 120 in step S606. In step S633, it is determined whether the previous page button 1204 or the next page button 1205 is pressed.
[0061]
If the previous page button 1204 or the next page button 1205 is pressed as a result of the determination in step S633, the process returns to step S628, and the image data of the previous page or the next page is determined according to the pressed button. Corresponding text data and bibliographic information candidates are transmitted to the work terminal 120.
[0062]
If the previous page button 1204 or the next page button 1205 is not pressed as a result of the determination in step S633, based on the input information of the keyboard 26 and the operation information of the mouse 27 transmitted from the work terminal 120 in step S606. It is determined whether or not the registration button 1206 has been pressed on the screen of the work terminal 120 (step S634).
[0063]
If the registration button 1206 is pressed as a result of the determination in step S634, the bibliographic information stored in the work area of the RAM 22 is transmitted to the hard disk 28, and the process returns to step S623. The document management server 110 updates the bibliography DB 116 after transmitting a message indicating that the registration process has been completed to the work terminal 120.
[0064]
On the other hand, if the determination in step S634 shows that the registration button 1206 has not been pressed, either the text data displayed on the image display unit 1202 or the information in one of the bibliographic information input fields 1209 to 1215 has been corrected. Therefore, the text data (text data updating means) or the bibliographic information in the work area of the RAM 22 is updated to reflect the correction, and screen information including the updated contents is generated and transmitted to the work terminal 120 (step S635).
[0065]
The work terminal 120 receives the screen information transmitted from the document management server 110 in step S635 (step S607), and executes the process of step S604.
[0066]
When the update button 1203 is pressed, the document management server 110 executes the process of step S635, and the work terminal 120 executes the process of step S607.
[0067]
According to this processing, OCR processing is performed on the original image data one page at a time and bibliographic information is extracted; the original image data, the text data, and the extracted bibliographic information are displayed on the screen of the work terminal 120 while the OCR work on the subsequent pages continues in the background. As a result, the waiting time for processes other than the character recognition process can be minimized, the man-hours for inputting bibliographic information can be reduced, and work efficiency can be improved.
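The overlap between background OCR and foreground display described above can be illustrated with a minimal Python sketch. It assumes a single worker thread feeding a FIFO queue; `ocr_page`, `start_background_ocr`, and the string "pages" are hypothetical stand-ins introduced only for illustration and are not part of the described system.

```python
import queue
import threading
from typing import Callable, List


def ocr_page(image: str) -> str:
    # Hypothetical stand-in for real OCR: just upper-cases the "image" string.
    return image.upper()


def start_background_ocr(pages: List[str],
                         ocr: Callable[[str], str] = ocr_page) -> queue.Queue:
    """Recognize pages one at a time on a worker thread.

    Each (page_number, text) pair is queued as soon as that page is done,
    followed by None once every page has been processed.
    """
    results: queue.Queue = queue.Queue()

    def worker() -> None:
        for number, image in enumerate(pages, start=1):
            results.put((number, ocr(image)))
        results.put(None)  # sentinel: no more pages

    threading.Thread(target=worker, daemon=True).start()
    return results


results = start_background_ocr(["page one", "page two", "page three"])

# The first page can be shown while later pages are still being recognized.
first = results.get()
shown = [first]
while (item := results.get()) is not None:
    shown.append(item)
```

The foreground code blocks only until the first page is ready; every later page arrives in order because a single worker writes to a FIFO queue.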
[0068]
The information requesting scanning in step S602 may be directly transmitted to a scanner or a multifunction peripheral having a scanner function. In this case, in step S643, information (user ID or the like) specifying the scanning request source is transmitted together with the scanned image.
[0069]
FIGS. 6 and 7 are flowcharts showing the OCR process (step S626 in FIG. 3).
[0070]
In FIGS. 6 and 7, the processes in steps S701 to S724 are executed under the control of the CPU 21 of the document management server 110.
[0071]
First, character block analysis and line spacing analysis are executed on the OCR target image data stored in the work area of the RAM 22 (step S701). A “character block” is a character string bounded above and below by blank lines, or bounded on the left and right by at least a predetermined number of spaces. The left-and-right case is included to handle layouts in which, for example, the creation date is at the right end of a line and the title is centered on the line immediately below. In the line spacing analysis, the image data is scanned in the horizontal direction; a scan line whose black ratio is less than a predetermined value close to zero (for example, 0.1) is determined to be line spacing or a blank line, and a portion whose black ratio exceeds the predetermined value is determined to be part of a character line. Further, the position of the last character of the page is determined by the character block analysis and the line spacing analysis: it is the right end of the last line of the lowermost character block. The final character position will be described in detail later.
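The line spacing analysis can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the page is assumed to be a binary raster (1 = black pixel), `find_text_lines` is a hypothetical name, and a row whose black ratio is exactly equal to the threshold is treated as blank, a case the description leaves open.

```python
from typing import List, Tuple


def find_text_lines(image: List[List[int]],
                    threshold: float = 0.1) -> List[Tuple[int, int]]:
    """Return inclusive (first_row, last_row) pairs for each character line.

    A pixel row belongs to a character line when its black ratio exceeds
    `threshold`; all other rows count as line spacing or blank lines.
    """
    lines: List[Tuple[int, int]] = []
    start = None
    for y, row in enumerate(image):
        ratio = sum(row) / len(row) if row else 0.0
        if ratio > threshold:
            if start is None:
                start = y          # a new character line begins here
        elif start is not None:
            lines.append((start, y - 1))
            start = None
    if start is not None:          # character line reaching the page bottom
        lines.append((start, len(image) - 1))
    return lines


page = [
    [0] * 10,
    [1, 1, 1, 0, 0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 1, 0, 0, 0, 0],
    [0] * 10,
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],  # ratio equals the threshold: blank
    [0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
]
text_lines = find_text_lines(page)
```

The gaps between consecutive pairs give the vertical spacing that the later character range analysis uses as its starting estimate.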
[0072]
Next, the entire image data is scanned in the horizontal direction, ruled line recognition is performed, and it is analyzed whether or not the image data matches a specific format (for example, the format of a document partition sheet) (step S702).
[0073]
As a result of the analysis in step S702, if the image data does not match the specific format, this process is terminated. If the image data matches the specific format, the characters at the specific positions corresponding to the format are subject to the character recognition processing. The ruled line information of the specific format is stored in a format table (not shown) (physically, on the hard disk 28).
[0074]
Next, it is determined whether or not the final character flag in the work area of the RAM 22 is on (step S703). The final character flag is turned on when the character recognition process for the last character of the image data to be subjected to OCR is completed.
[0075]
If the determination in step S703 shows that the final character flag is on, it is determined whether or not the image data being processed is a partition sheet (step S705), identification data indicating whether or not it is a partition sheet is added to the text data of the OCR result according to the determination result (step S706), and this process ends.
[0076]
On the other hand, if the determination in step S703 shows that the final character flag is not on, character range analysis is executed based on the result of the line spacing analysis in step S701 (step S704). Since the line spacing analysis in step S701 yields the character spacing in the vertical direction of the image data, step S704 uses that spacing, or half of it, as the initial value for the character range analysis. The range of each character is then determined by identifying the blank portions between characters.
[0077]
Next, the black ratio within the determined range for one character is calculated (step S707), and it is determined whether or not the calculated black ratio exceeds a predetermined value (for example, 0.005) (step S708). If the black ratio is equal to or less than the predetermined value, the black ratios for a predetermined number of characters before and after are examined and, together with the character block range determined in step S701, it is determined whether or not the current character is at the beginning or end of a line (step S709).
[0078]
Next, it is determined whether or not the current character is a space (step S710). If it is a space, one space is added to the output text data (step S711), and the process proceeds to step S723 described later. On the other hand, if the current character is not a space, it is determined whether or not it is a symbol such as “.”, “,”, or “•” (step S712). If it is a symbol, that symbol is added to the output text data (step S713), and the process proceeds to step S723 described later. If it is not a symbol, the process proceeds directly to step S723.
[0079]
If the determination in step S708 shows that the black ratio is greater than the predetermined value, the image data for one character is collated with the character data in the dictionary (step S714).
[0080]
Next, it is determined whether or not the dictionary collation in step S714 is completed (step S715). If the dictionary collation is not completed, the coincidence rate between the pixels of the image data for one character and those of the character data in the dictionary is determined based on the collation result in step S714 (step S716). The dictionary character data is stored in the hard disk 28.
[0081]
Next, it is determined whether or not the coincidence rate determined in step S716 exceeds a predetermined value A (for example, 0.8) (step S717). If it does, the text data of that character is selected from the dictionary, paired with its coincidence rate, and added to the candidate array in the work area of the RAM 22 (step S718), and the process returns to step S714 to collate with the next character in the dictionary.
[0082]
On the other hand, as a result of the determination in step S717, if the coincidence rate is equal to or less than the predetermined value A, the process of step S718 is skipped and the process returns to step S714.
[0083]
When the dictionary collation is determined to be completed in step S715, the coincidence rates of the characters in the candidate array are compared (step S719), and the text data of the character with the maximum coincidence rate is added to the output text data (step S720).
[0084]
Next, it is determined whether or not the maximum value of the coincidence rate exceeds a predetermined value (for example, 0.9) (step S721). If the maximum value is equal to or less than the predetermined value, the warning flag is turned on and a predetermined special character is output to the output text data (step S722). When the screen is displayed (step S604 in FIG. 4), the character immediately before the special character is displayed in a color (for example, blue) other than the normal color (for example, black).
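The collation loop of steps S714 to S722 can be sketched in Python as below. This is a minimal sketch under assumptions: glyphs are same-sized binary grids, the coincidence rate is taken as the fraction of identical pixels, and `match_rate`, `recognize`, and the tiny 3x3 dictionary are hypothetical. The description does not say what happens when no dictionary entry exceeds the first threshold; returning `None` with a warning is an assumption made here.

```python
from typing import Dict, List, Optional, Tuple

Glyph = List[List[int]]  # binary pixel grid, 1 = black


def match_rate(a: Glyph, b: Glyph) -> float:
    """Fraction of pixel positions at which two same-sized glyphs agree."""
    total = 0
    same = 0
    for row_a, row_b in zip(a, b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += 1
            same += int(pixel_a == pixel_b)
    return same / total if total else 0.0


def recognize(glyph: Glyph, dictionary: Dict[str, Glyph],
              candidate_min: float = 0.8,
              confident_min: float = 0.9) -> Tuple[Optional[str], float, bool]:
    """Return (best_character, rate, warning) for one character image."""
    candidates: List[Tuple[str, float]] = []
    for character, template in dictionary.items():
        rate = match_rate(glyph, template)
        if rate > candidate_min:           # keep only plausible candidates
            candidates.append((character, rate))
    if not candidates:
        return None, 0.0, True             # assumption: flag as doubtful
    best, best_rate = max(candidates, key=lambda pair: pair[1])
    return best, best_rate, best_rate <= confident_min


DICTIONARY = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
clean = recognize([[0, 1, 0], [0, 1, 0], [0, 1, 0]], DICTIONARY)
noisy = recognize([[0, 1, 0], [0, 1, 0], [0, 1, 1]], DICTIONARY)
```

Keeping two thresholds separates "plausible enough to be a candidate" from "certain enough to display without highlighting", which is what lets doubtful characters be recolored for manual review.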
[0085]
With this processing, parts where it is doubtful whether the character was identified correctly can be displayed in a different color, so that manual confirmation and correction of the OCR result can be performed efficiently.
[0086]
Next, it is determined whether or not the character identified this time is the last character in the image data for one page (step S723). If it is the last character, the final character flag in the work area of the RAM 22 is turned on (1 is substituted) (step S724), and the process returns to step S703. If the determination in step S723 shows that it is not the last character in the image data for one page, the process immediately returns to step S703.
[0087]
As described above, since the OCR processing of FIGS. 6 and 7 also identifies spaces and symbols such as punctuation marks in the image data and outputs the corresponding text data, character recognition that preserves the layout of the original image data becomes possible.
[0088]
FIGS. 8 and 9 are flowcharts showing the bibliographic candidate extraction process (step S627 in FIG. 3).
[0089]
The processes in FIGS. 8 and 9 are executed under the control of the CPU 21 of the document management server 110.
[0090]
In this processing, it is assumed that bibliographic information is on the first page of the document.
[0091]
First, it is determined whether or not the page from which bibliographic information is to be extracted is the first page of the document (step S821). If it is not the first page, the process ends. On the other hand, if it is the first page, one character block of the text data obtained by the OCR process is read (step S822). Here, a “character block” refers to a run of character data delimited at its beginning and end by at least a predetermined number of spaces (for example, two or more) or by the beginning or end of a line.
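Reading character blocks from one line of OCR text can be sketched in Python as follows. The sketch assumes plain ASCII spaces as separators and a threshold of at least two; `split_character_blocks` is a hypothetical helper, and the start column it returns is one possible way to later decide whether a block lies to the right of the page center.

```python
import re
from typing import List, Tuple


def split_character_blocks(line: str,
                           min_spaces: int = 2) -> List[Tuple[int, str]]:
    """Split a text line into (start_column, text) character blocks.

    Runs of fewer than `min_spaces` spaces stay inside a block; a run of
    `min_spaces` or more spaces, or the line start/end, delimits a block.
    """
    if min_spaces < 2:
        raise ValueError("min_spaces must be at least 2 in this sketch")
    pattern = re.compile(r"\S+(?: {1,%d}\S+)*" % (min_spaces - 1))
    return [(m.start(), m.group()) for m in pattern.finditer(line)]


blocks = split_character_blocks("Notice of meeting    No. 12")
```

With the default threshold, single spaces inside a phrase are kept, so a centered title and a right-aligned date on the same line come out as two separate blocks.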
[0092]
Next, the read character block is collated with the candidate dictionary stored in the extraction rule storage unit 115 (step S823), and it is determined whether the head of the character block is an era name such as “Heisei”, the end is “day”, and the position of the character block is on the right side of the center of the page (step S824). If all these conditions are satisfied (YES in step S824), the text data of the character block is overwritten in the “issue date” entry field of the bibliographic information (step S825), and the process proceeds to step S836.
[0093]
In step S836, it is determined whether or not the character block currently being read is the last block of the first page (whether or not another character block exists on the right side or the lower side of the character block). If the character block is the last block, this process is terminated. On the other hand, if the character block currently being read is not the final block, the process returns to step S822 to read the next character block.
[0094]
On the other hand, if the determination in step S824 shows that any one of the conditions is not satisfied (NO in step S824), it is determined whether the head of the character block is an organization name and the end is “dono” (step S826).
[0095]
As a result of the determination in step S826, if all the conditions are satisfied (YES in step S826), the text data of the character block is overwritten in the input field of “destination department” of the bibliographic information (step S827), The process proceeds to step S836.
[0096]
If the determination in step S826 shows that any one of the conditions is not satisfied (NO in step S826), it is determined whether the beginning of the character block is an organization name or the end of the character block is a government office, and whether the position of the character block is on the right side of the center of the page (step S828).
[0097]
If all the conditions are satisfied as a result of the determination in step S828 (YES in step S828), the text data of the character block is overwritten in the “document creator” input field of the bibliographic information (step S829), The process proceeds to step S836.
[0098]
If the determination in step S828 shows that any one of the conditions is not satisfied (NO in step S828), it is determined whether the beginning of the character block is a document name or an era name, the end of the character block is “No.”, and the position of the character block is on the right side of the center of the page (step S830).
[0099]
If all the conditions are satisfied as a result of the determination in step S830 (YES in step S830), the text data of the character block is overwritten in the “document number” input field of the bibliographic information (step S831). The process proceeds to S836.
[0100]
If the determination in step S830 shows that any one of the conditions is not satisfied (NO in step S830), it is determined whether the end of the character block is “case”, “notice”, “notification”, or “about” (step S832).
[0101]
If all the conditions are satisfied as a result of the determination in step S832 (YES in step S832), the text data of the character block is overwritten in the “acquired document name” input field of the bibliographic information (step S833). Then, the attribute of the input field is changed to “overwrite prohibited”, and the process proceeds to step S834.
[0102]
As a result of the determination in step S832, if any one of the conditions is not satisfied (NO in step S832), the process proceeds to step S836.
[0103]
Next, using the receipt document name overwritten in step S833 as a key, data on the managing section, the person in charge, the document classification, and the storage deadline are extracted from the management table stored in the extraction rule storage unit 115 (step S834). These data are overwritten in the entry field of bibliographic information, and immediately after this overwriting, the attribute of these data is changed to prohibit overwriting (step S835), and the process proceeds to step S836.
[0104]
In steps S833 and S834, the attribute is set to “overwrite prohibited” in order to prevent the “acquired document name” and the data related to the received document name from being overwritten when another document name appears in the body of the document.
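The rule cascade of steps S824 to S835, including the “overwrite prohibited” attribute, can be sketched in Python as below. Everything concrete here is an assumption for illustration: the word lists stand in for the candidate dictionary, `MANAGEMENT_TABLE` stands in for the management table in the extraction rule storage unit 115, the English markers mirror this translation rather than the original rule strings, and `Bibliography`, `classify`, and `extract` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical stand-ins for the candidate dictionary and management table.
ERA_NAMES = ("Heisei", "Showa")
ORGANIZATION_NAMES = ("Ministry", "City")
OFFICE_ENDINGS = ("Office", "Bureau")
DOCUMENT_NAME_PREFIXES = ("Circular",)
TITLE_ENDINGS = ("case", "notice", "notification", "about")
MANAGEMENT_TABLE = {
    "Relocation of office notice": {
        "managing section": "General Affairs",
        "person in charge": "Clerk A",
        "document classification": "Notices",
        "storage deadline": "5 years",
    },
}


@dataclass
class Bibliography:
    fields: Dict[str, str] = field(default_factory=dict)
    locked: Set[str] = field(default_factory=set)

    def overwrite(self, name: str, value: str, lock: bool = False) -> bool:
        """Overwrite a field unless it is marked overwrite-prohibited."""
        if name in self.locked:
            return False
        self.fields[name] = value
        if lock:
            self.locked.add(name)
        return True


def classify(text: str, right_of_center: bool) -> Optional[str]:
    """Apply the rules in order; the first rule that matches wins."""
    if text.startswith(ERA_NAMES) and text.endswith("day") and right_of_center:
        return "issue date"
    if text.startswith(ORGANIZATION_NAMES) and text.endswith("dono"):
        return "destination department"
    if (text.startswith(ORGANIZATION_NAMES)
            or text.endswith(OFFICE_ENDINGS)) and right_of_center:
        return "document creator"
    if (text.startswith(DOCUMENT_NAME_PREFIXES + ERA_NAMES)
            and text.endswith("No.") and right_of_center):
        return "document number"
    if text.endswith(TITLE_ENDINGS):
        return "acquired document name"
    return None


def extract(blocks: Iterable[Tuple[str, bool]]) -> Bibliography:
    bib = Bibliography()
    for text, right_of_center in blocks:
        name = classify(text, right_of_center)
        if name is None:
            continue
        is_title = name == "acquired document name"
        if bib.overwrite(name, text, lock=is_title) and is_title:
            # Look up the related data by title and lock those fields too.
            for key, value in MANAGEMENT_TABLE.get(text, {}).items():
                bib.overwrite(key, value, lock=True)
    return bib


result = extract([
    ("Heisei 15 June 20 day", True),
    ("City Planning Division dono", False),
    ("Relocation of office notice", False),
    ("Budget case", False),  # later title-like block: must not overwrite
])
```

Locking the title and its derived fields on first match is what keeps a later title-like phrase in the body text from replacing them.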
[0105]
Although it is outside the scope of this flowchart, after the bibliographic candidate extraction processing is completed, the operator of the work terminal 120 can of course move the cursor to these input fields and correct them manually.
[0106]
According to this processing, bibliographic information candidates are extracted from the text data of the first page of the document and input to the input field based on detailed extraction rules, so that the input manpower of bibliographic information can be minimized.
[0107]
FIG. 10 is a flowchart showing a modification of part of the processing executed in the document management system in FIG. 3, and therefore shows processing different from FIG.
[0108]
The processes in FIG. 10 are performed under the control of the CPU 21 of the document management server 110, except that the processes of steps S606 and S607 are performed under the control of the CPU of the work terminal 120.
[0109]
In the work terminal 120, browser software, Java (registered trademark) Script, and ActiveX are combined, or a client application is installed and activated.
[0110]
The work terminal 120 identifies the input information of the keyboard 26 and the operation information of the mouse 27 in step S605 of FIG. 3, and transmits this information to the document management server 110 (step S606). Such information includes, for example, “drag and drop operation”, “character string inversion operation”, “character input to a specific input field”, “designation of a rectangular area”, and pressing of various buttons. When the operation is a drag and drop, the dragged character string, the drag start coordinates, and the drop destination coordinates are included in the operation information. When a rectangular area is designated, the text information in the rectangular area is also included in the operation information.
[0111]
The CPU 21 of the document management server 110 determines whether or not the end button 1208 is pressed on the screen of the work terminal 120 based on the input information of the keyboard 26 and the operation information of the mouse 27 transmitted from the work terminal 120 in step S606. (Step S630).
[0112]
If it is determined in step S630 that the end button 1208 has been pressed, the process is terminated. If the end button 1208 has not been pressed, it is determined whether the input information of the keyboard 26 and the operation information of the mouse 27 transmitted from the work terminal 120 in step S606 are “designation of a rectangular area” (step S1123).
[0113]
If it is determined in step S1123 that the input information of the keyboard 26 and the operation information of the mouse 27 are “designation of a rectangular area”, the CPU 21 stores the text data in the rectangular area in the buffer area of the work area of the RAM 22 (step S1124). On the other hand, if they are not “designation of a rectangular area”, the process proceeds to step S1125 described later.
[0114]
Next, the CPU 21 determines whether the input information of the keyboard 26 and the operation information of the mouse 27 are “character string inversion” (step S1125). If they are not “character string inversion”, the process proceeds to step S1127 described later. On the other hand, if they are “character string inversion”, the text data of the inverted part is stored in the buffer area of the work area of the RAM 22 (step S1126), and the process proceeds to step S1127. In this buffer area, the coordinates of the drag source are also stored in association with the text data.
[0115]
Next, the CPU 21 determines whether or not the input information of the keyboard 26 and the operation information of the mouse 27 are a “drag and drop operation” (step S1127). If they are not a “drag and drop operation”, the process proceeds to step S634. On the other hand, if they are a “drag and drop operation”, the text data in the rectangular area is accumulated in the buffer area of the work area of the RAM 22. In this buffer area, the coordinates of the drag source are also stored in association with the text data.
[0116]
Next, the CPU 21 detects the coordinates of the drag source and the coordinates of the drag destination, specifies the input field to be overwritten based on the coordinates of the drag destination (step S1128), and further specifies, from the coordinates of the drag source, whether the drag source is an inverted character string or which character string in the rectangular area it is (step S1129). Thereafter, the specified character string held in the buffer area of the work area of the RAM 22 is overwritten in the input field specified in step S1128 (step S1130), and the process proceeds to step S634. The processing in step S634 has been described above with reference to FIG. 4.
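Steps S1128 to S1130 can be sketched in Python as follows, as a rough illustration only. It assumes that input fields are axis-aligned rectangles in screen coordinates and that a buffered string is found by an exact match on its stored drag-source coordinates; `field_at`, `apply_drop`, and the sample layout are hypothetical.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[int, int]
Rect = Tuple[int, int, int, int]  # left, top, right, bottom (inclusive)


def field_at(layout: Dict[str, Rect], point: Point) -> Optional[str]:
    """Return the name of the input field containing `point`, if any."""
    x, y = point
    for name, (left, top, right, bottom) in layout.items():
        if left <= x <= right and top <= y <= bottom:
            return name
    return None


def apply_drop(buffer: Dict[Point, str], layout: Dict[str, Rect],
               values: Dict[str, str],
               drag_source: Point, drop_target: Point) -> bool:
    """Overwrite the field under the drop point with the buffered string."""
    name = field_at(layout, drop_target)   # destination field (cf. S1128)
    text = buffer.get(drag_source)         # dragged string (cf. S1129)
    if name is None or text is None:
        return False
    values[name] = text                    # overwrite (cf. S1130)
    return True


layout = {
    "issue date": (100, 10, 200, 30),
    "document number": (100, 40, 200, 60),
}
buffer = {(12, 34): "Heisei 15 June 20 day"}
values: Dict[str, str] = {}
dropped = apply_drop(buffer, layout, values, (12, 34), (150, 20))
missed = apply_drop(buffer, layout, values, (12, 34), (5, 5))
```

Keying the buffer by drag-source coordinates mirrors the statement that those coordinates are stored in association with the text data; a real screen would more likely need a hit test against the selected region than an exact coordinate match.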
[0117]
In step S1134, the CPU 21 transmits a message indicating that the bibliographic registration is completed to the work terminal 120.
[0118]
According to this processing, when the operator of the work terminal 120 designates a rectangular area surrounding a specific character string, or inverts (highlights) the specific character string, and then drags and drops it onto a desired input field, these actions are automatically carried out as they are in the document management server 110. The character string of an input field such as bibliographic information can therefore be updated easily, which reduces the man-hours for inputting bibliographic information and the burden on the operator.
[0119]
As described above, according to the present embodiment, the document management server 110 executes OCR processing of the image data read by the multi-function peripheral 140, extracts bibliographic information from the text data obtained by this OCR processing based on certain rules, transmits the image data, the text data, and the extracted bibliographic information to the work terminal 120, and continues the OCR process in parallel in the background of the bibliographic information extraction process and the data transmission process. Meanwhile, since the work terminal 120 displays the image data, the text data, and the extracted bibliographic information on the screen, the waiting time for processes other than the character recognition process can be minimized, the man-hours for inputting bibliographic information can be reduced, and work efficiency can be improved.
[0120]
In the present embodiment, the document management server 110 includes the OCR processing unit. However, only the OCR processing unit may be configured as another independent server (OCR server). At this time, the text data recognized by the OCR process is transmitted from the OCR server to the document management server and stored in the text DB 118. By making the OCR server independent in this way, it is possible to reduce the load on the document management server 110 and further improve the processing speed.
[0121]
In this embodiment, in the screen example of the work terminal 120 displayed during the bibliographic registration process (FIG. 5), the image display unit 1201, the image display unit 1202, various buttons 1203 to 1208, and various bibliographic information items are displayed. Although input fields 1209 to 1215 are included in one application, as shown in FIG. 17, an image display unit 1201, an image display unit 1202, and various buttons 1203 to 1208 are included in one application, and various bibliographies are included. Information input fields 1209 to 1215 may be included in another application.
[0122]
(Second Embodiment)
In the first embodiment, the document management server 110 executes the OCR process. However, the present embodiment is different in that the MFP 240 executes the OCR process.
[0123]
FIG. 11 is a block diagram showing a configuration of a document management system to which the document management apparatus according to the second embodiment of the present invention can be applied. The document management apparatus according to the embodiment of the present invention is applied to the document management server 110.
[0124]
Unlike the document management server 110 according to the first embodiment, the document management server 210 in the figure does not include the OCR processing unit 112 but includes a text registration unit 212.
[0125]
The multifunction device 240 further includes an OCR processing unit 244 in addition to the configuration of the multifunction device 140 according to the first embodiment. The functions of the OCR processing unit 112 and the OCR processing unit 244 are the same.
[0126]
In the work terminal 120 according to the present embodiment, browser software, Java (registered trademark) Script, and ActiveX are combined, or a client application is installed and activated.
[0127]
The configuration of the document management system in FIG. 11 and the configuration of the document management system in FIG. 2 are the same except for the difference in configuration described above.
[0128]
FIGS. 12 and 13 are flowcharts showing processing executed by the document management system in FIG.
[0129]
Since this flowchart is substantially the same as the flowcharts of FIGS. 3 and 4 described above, steps for executing the same processing are given the same step numbers, and only different points will be described.
[0130]
The processing of steps S943 to S945 is executed under the control of a CPU (not shown) of the multifunction device 140, the processing of steps S923 to S925 and step S935 is executed under the control of the CPU of the document management server 210, and the processing of step S907 is executed under the control of the CPU of the work terminal 120.
[0131]
In FIG. 12, the MFP 140 scans a document, stores image data in the image data storage unit 147 (step S642), and then executes an OCR process (step S943). The OCR process is the same as the process described with reference to FIGS. 6 and 7 described above, but is executed by the control unit 145 of the multi-function device 140.
[0132]
After completion of the OCR processing, the multifunction device 140 transmits the scanned image data and the text data that is the recognition result of the OCR processing to the document management server 210 (step S944), and the control unit 145 determines whether or not there is a page that has not yet been subjected to the OCR processing by the OCR processing unit 244 (step S945). If there is such a page, the process returns to step S943 to continue the OCR process. If there is no such page, the process proceeds to step S641.
[0133]
In the document management server 210, the CPU 21 waits for the scanned image from the multifunction peripheral together with the text data that is the recognition result of the OCR process (step S923), and checks at regular time intervals whether there are a scanned image and text data to be received (step S924). If there are none (No in step S924), the process returns to step S923 and waits. If there are a scanned image and text data to be received (Yes in step S924), they are received; the image management unit 114 then registers the received image data in the image DB 117, and the text registration unit 212 registers the received text data in the text DB 118 (step S625).
[0134]
If the determination in step S634 shows that the registration button 1206 has not been pressed, either the text data displayed on the image display unit 1202 or the information in one of the bibliographic information input fields 1209 to 1215 has been corrected. Therefore, the text data and bibliographic information in the work area of the RAM 22 are updated to reflect these corrections, and the updated text data and bibliographic information are transmitted to the work terminal 120 (step S935).
[0135]
The work terminal 120 receives the text data and bibliographic information transmitted from the document management server 110 in step S935 (step S907), and executes the process of step S604.
[0136]
In step S935, only the updated text data and bibliographic information are transmitted to the work terminal 120 because the work terminal 120 includes browser software and the like and can update the screen display from the updated text data and bibliographic information alone.
[0137]
The modification of part of the processing executed in the document management system, shown in FIG. 10, can also be applied to the present embodiment. In this case, the processing of step S630, steps S1123 to S1130, and step S634 may be executed by the work terminal 120, and the work terminal 120 may be configured to transmit the final edited registration information to the document management server 210.
[0138]
As described above, according to the present embodiment, the MFP 240 performs OCR processing on the read image data, and the document management server 210 receives the image data and the text data obtained by the OCR processing from the MFP 240, extracts bibliographic information from the text data based on certain rules, and transmits the image data, the text data, and the extracted bibliographic information to the work terminal 120. Since the work terminal 120 displays the image data, the text data, and the extracted bibliographic information on the screen while the multifunction device 240 continues the OCR process in parallel in the background, the document management server 210 can minimize the waiting time for processes other than the character recognition process, the man-hours for inputting bibliographic information can be reduced, and work efficiency can be improved.
[0139]
Further, since the multifunction device 240 executes the OCR process and the load on the document management server 210 is reduced, the processing speed of the document management server 210 can be further improved.
[0140]
(Third embodiment)
In the first embodiment, the document management server 110 executes the OCR process and the bibliographic extraction process. However, the present embodiment is different in that the work terminal 320 executes the OCR process and the bibliographic extraction process.
[0141]
FIG. 14 is a block diagram showing a configuration of a document management system to which the document management apparatus according to the third embodiment of the present invention can be applied. The document management apparatus according to the embodiment of the present invention is applied to the document management server 110.
[0142]
Unlike the document management server 110 according to the first embodiment, the document management server 310 in the figure does not include the OCR processing unit 112, the bibliographic extraction unit 113, or the extraction rule storage unit 115, and instead includes a text registration unit 212. Unlike in the first embodiment, the multi-function device 140 does not include the image data storage unit 147. Unlike the work terminal 120 according to the first embodiment, the work terminal 320 includes an OCR processing unit 324, a bibliographic extraction unit 323, an extraction rule storage unit 325, and a text data storage unit 328.
[0143]
In the work terminal 320 according to the present embodiment, browser software, Java (registered trademark) Script, and ActiveX are combined, or a client application is installed and activated.
[0144]
The configuration of the document management system in FIG. 14 and the configuration of the document management system in FIG. 2 are the same except for the difference in configuration described above.
[0145]
FIGS. 15 and 16 are flowcharts showing processing executed by the document management system in FIG.
[0146]
Since this flowchart is substantially the same as the flowcharts of FIGS. 3 and 4 described above, steps for executing the same processing are given the same step numbers, and only different points will be described.
[0147]
Steps S1025 to S1030 are executed under the control of the CPU of the document management server 310, and steps S1003 to S1013 and steps S1031 to S1033 are executed under the control of the CPU of the work terminal 320.
[0148]
In FIG. 15, in step S624, the document management server 310 determines whether there is a scanned image to be received. If there is, the document management server 310 assigns a predetermined management number to the received scanned image (step S1025). This management number is used as a search key, or as part of a search key, for the bibliographic DB 116, the image DB 117, and the text DB 118, and serves to associate the data held in these three DBs.
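The role of the management number as a shared key can be illustrated with a minimal Python sketch. The class `DocumentStore`, its method names, and the `DOC-000001` numbering format are hypothetical; the text only states that a predetermined management number links the bibliographic DB 116, the image DB 117, and the text DB 118.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class DocumentStore:
    """Hypothetical in-memory stand-in for the bibliographic, image and text DBs."""
    bibliographic_db: dict = field(default_factory=dict)
    image_db: dict = field(default_factory=dict)
    text_db: dict = field(default_factory=dict)
    _counter: count = field(default_factory=lambda: count(1))

    def assign_management_number(self) -> str:
        # The numbering format is an assumption made for illustration only.
        return f"DOC-{next(self._counter):06d}"

    def register_image(self, image: bytes) -> str:
        """Number a received scan and store it (compare steps S1025 and S1027)."""
        number = self.assign_management_number()
        self.image_db[number] = image
        return number

    def register_text_and_bibliography(self, number: str, text: str, bibliography: dict) -> None:
        """Store text and bibliographic data under the same key (compare step S1029)."""
        if number not in self.image_db:
            raise KeyError(f"unknown management number: {number}")
        self.text_db[number] = text
        self.bibliographic_db[number] = bibliography

    def lookup(self, number: str) -> tuple:
        """Fetch the three associated records with a single key."""
        return (
            self.bibliographic_db.get(number),
            self.image_db.get(number),
            self.text_db.get(number),
        )
```

Because all three stores use the same key, one lookup by management number returns the image, its text, and its bibliographic record together.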
[0149]
Thereafter, the document management server 310 transfers the image data to the work terminal 320 (step S1026), registers the image data in the image DB 117 (step S1027), and returns to the process of step S623.
[0150]
The work terminal 320 receives the image data transferred from the document management server 310 in step S1026 (step S1003), and executes OCR processing (step S1004) and bibliographic candidate extraction processing (step S1005). These OCR processing (step S1004) and bibliographic candidate extraction processing (step S1005) are the same as the OCR processing and bibliographic candidate extraction processing executed by the document management server 110 in the first embodiment.
[0151]
Next, a screen as shown in FIG. 5 is displayed on the work terminal 320, image data is displayed on the image display unit 1201, and text data recognized by the OCR process is displayed on the image display unit 1202. Further, bibliographic information candidates are displayed in the bibliographic information input fields 1209 to 1215, and the screen display is updated (step S1006).
[0152]
Next, the work terminal 320 identifies keyboard input information and mouse operation information (step S1007), and executes the processing of steps S1008 to S1011. Since the processing of steps S1008 to S1011 is the same as the processing of steps S630 to S634 in FIG. 4, the description thereof is omitted.
[0153]
If the registration button has not been pressed in step S1011, this means that the text data displayed on the image display unit 1202 in FIG. 5 has been corrected, or that information in one of the bibliographic information input fields 1209 to 1215 has been corrected. The text data is therefore updated to reflect these corrections (step S1012), and the process returns to step S1006.
[0154]
On the other hand, if the next document button 1207 is pressed as a result of the determination in step S1009, the CPU of the work terminal 320 determines whether or not the OCR processing of the document following the one currently being processed has been completed (step S1031). If it has been completed, the image data of the first page of the next document, the corresponding text data, and the bibliographic information candidates are read out from the RAM or the like (step S1032), and the process proceeds to step S1006. If the OCR processing of the next document has not been completed, the process returns to step S1004.
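The check in step S1031 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: `run_ocr` is a placeholder for a real OCR engine, and `BackgroundOcrQueue` and its method names are hypothetical. OCR for every queued document is submitted to a background worker, and the next document is shown only once its OCR has finished.

```python
from concurrent.futures import ThreadPoolExecutor


def run_ocr(page_images):
    """Placeholder OCR: a real system would call an OCR engine here."""
    return [f"text of {name}" for name in page_images]


class BackgroundOcrQueue:
    """Runs OCR for queued documents in the background while one is displayed."""

    def __init__(self, documents):
        # A single worker processes the documents in order, one after another.
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._futures = [self._executor.submit(run_ocr, pages) for pages in documents]
        self.current = 0  # index of the document currently displayed

    def next_document_ready(self) -> bool:
        """Corresponds to the completion check of step S1031."""
        nxt = self.current + 1
        return nxt < len(self._futures) and self._futures[nxt].done()

    def show_next_document(self):
        """Return the OCR text of the next document, or None if it is not ready."""
        if not self.next_document_ready():
            return None  # caller keeps the current display while OCR continues
        self.current += 1
        return self._futures[self.current].result()

    def close(self) -> None:
        """Wait for outstanding OCR jobs and release the worker."""
        self._executor.shutdown(wait=True)
```

Returning `None` rather than blocking keeps the operator's screen responsive, which is the point of continuing OCR in the background.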
[0155]
If the previous page button 1204 or the next page button 1205 shown in FIG. 5 is pressed as a result of the determination in step S1010, the CPU of the work terminal 320 reads the image data of the previous page or the next page, according to the pressed button, together with the corresponding text data and bibliographic information candidates, from the RAM or the like (step S1032), and the process proceeds to step S1006.
[0156]
If the registration button is pressed in step S1011, the CPU of the work terminal 320 transmits text data and bibliographic information to the document management server 310 (step S1013), and the process returns to step S1006.
[0157]
The CPU of the document management server 310 receives the text data and bibliographic information transmitted from the work terminal 320 (step S1028), registers the bibliographic information in the bibliographic DB 116, and registers the text data in the text DB 118 (step S1029). Furthermore, it waits until it receives text data and bibliographic information from the work terminal 320 (step S1030).
[0158]
As described above, according to the present embodiment, the work terminal 320 executes the OCR process on the image data read by the multi-function device 140 and extracts bibliographic information from the text data obtained by the OCR process on the basis of a certain rule. While the OCR processing continues in parallel in the background of this bibliographic information extraction processing, the image data, the text data, and the extracted bibliographic information are displayed on the screen. Therefore, the waiting time for processes other than the recognition process can be minimized, the man-hours for inputting bibliographic information can be reduced, and work efficiency can be improved. In addition, since the load on the document management server 310 is reduced, the processing speed can be further improved.
[0159]
(Fourth embodiment)
In the present embodiment, part of the processing executed in the document management system in FIG. 3 is different from that of the first embodiment, and the others are the same as those of the first embodiment, so only different points will be described.
[0160]
In the present embodiment, it is assumed that the bibliographic candidate extraction process in step S627 in FIG. 3 is not executed and the image and text information shown in FIG. 20 is transmitted to the work terminal 120 in step S628.
[0161]
First, changes in the screen on the work terminal 120 will be described with reference to FIG. 20.
[0162]
The OCR text recognized in step S626 in FIG. 3 is displayed on the display unit 1901 in FIG. 20. The OCR text is also converted into HTML and superimposed on the received scan image as a transparent layer displayed above the scan image. This HTML is normally transparent (invisible). The horizontal position of the text matches the original image, but the vertical position is shifted from the original image by approximately one line upward (or approximately one line downward). Because the HTML is normally transparent and invisible, the scan image underneath remains visible.
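One way to picture the transparent HTML layer is the following Python sketch, which builds an HTML fragment with absolutely positioned, transparent text placed over the scan image. The function name `overlay_html`, the inline CSS, the pixel coordinates, and the fixed line height are all assumptions for illustration; the text does not specify how the HTML is generated.

```python
from html import escape


def overlay_html(image_url, lines, line_height_px=20):
    """Build an HTML fragment with transparent OCR text layered over a scan image.

    lines is a list of (left_px, text) pairs, one per recognized text line.
    """
    spans = []
    for row, (left_px, text) in enumerate(lines):
        top_px = row * line_height_px
        spans.append(
            f'<span style="position:absolute;left:{left_px}px;top:{top_px}px;'
            f'color:transparent">{escape(text)}</span>'
        )
    return (
        '<div style="position:relative">'
        f'<img src="{escape(image_url, quote=True)}" alt="scan">'
        + "".join(spans)
        + "</div>"
    )
```

Switching a span from `color:transparent` to a visible, reversed style would then correspond to the reverse display of a selected phrase.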
[0163]
Reference numeral 1903 denotes a button for displaying the first page of the currently displayed document, and reference numeral 1904 denotes a button for displaying the previous page. Reference numeral 1905 displays the current page and the total number of pages of the currently displayed document. For example, “1/3” means that the OCR text corresponding to the scanned image of the first page of a three-page document is currently displayed. Reference numeral 1906 denotes a button for displaying the next document; it also shows the progress of the OCR work for the next document, that is, the number of pages for which OCR has been completed in steps S626 to S629 in FIG. 3. For example, “4/4” means that the next document has four pages and that OCR has been completed up to the fourth page. In this case, the display of 1906 changes from “1/4” to “2/4” to “3/4” and finally reaches “4/4”. Reference numeral 1907 denotes a button for displaying the next page of the next document, and reference numeral 1908 denotes a button for displaying the last page of the next document. If the button 1905 is pressed while the next document is displayed, the scan image and the OCR text of the current document displayed immediately before are displayed again. When the registration button 1911 is pressed, bibliographic registration processing is performed, and the display of the button 1906 moves up to the display 1905 on the screen; that is, “1/3” changes to “1/4”. The button 1906 then displays the number of pages and the OCR progress of the new next document. In other words, if the new next document (previously the document after next) has five pages in total and OCR has been completed up to the third page, “Next document 3/5” is displayed.
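The page and progress labels described for 1905 and 1906 can be sketched in Python as below. The names `QueuedDocument`, `current_label`, `next_label`, and `register_current` are hypothetical, and the sketch assumes that the newly current document is shown from its first page after registration, as in the “1/3” to “1/4” example.

```python
from dataclasses import dataclass


@dataclass
class QueuedDocument:
    total_pages: int
    ocr_done_pages: int = 0  # pages whose OCR has finished


def current_label(displayed_page: int, doc: QueuedDocument) -> str:
    """Label for 1905: displayed page / total pages of the current document."""
    return f"{displayed_page}/{doc.total_pages}"


def next_label(doc: QueuedDocument) -> str:
    """Label for 1906: OCR-completed pages / total pages of the next document."""
    return f"{doc.ocr_done_pages}/{doc.total_pages}"


def register_current(queue: list) -> tuple:
    """Drop the registered document; the next one becomes current at page 1."""
    queue.pop(0)
    current = current_label(1, queue[0]) if queue else ""
    upcoming = next_label(queue[1]) if len(queue) > 1 else ""
    return current, upcoming
```

With a queue of a 3-page document, a 4-page document with OCR finished, and a 5-page document with 3 pages recognized, registration moves the labels from “1/3” and “4/4” to “1/4” and “3/5”.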
[0164]
Further, when the cursor is positioned on a specific character portion of the display unit 1901 that displays the OCR text, for example “Regarding Revision (Notification)”, and the mouse button is pressed, the corresponding portion of the transparent HTML of the image display unit 1902 is displayed in reverse video (reversing unit 1919). Reverse display is applied to one sentence at a time, and the extent of one sentence is determined by the preceding and following spaces or line feeds.
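The delimiting rule (a run of characters bounded by spaces or line feeds) can be sketched in Python as follows. The function name `phrase_at` and the use of a character index to represent the cursor position are assumptions for illustration.

```python
import re


def phrase_at(text: str, index: int):
    """Return (start, end, phrase) for the whitespace-delimited run containing index.

    Returns None when index is out of range or points at whitespace.
    """
    if not 0 <= index < len(text) or text[index].isspace():
        return None
    # \S+ matches a maximal run with no space, tab or line feed inside it.
    for match in re.finditer(r"\S+", text):
        if match.start() <= index < match.end():
            return match.start(), match.end(), match.group()
    return None
```

For example, with the text `"To:AllStaff\nRevisionNotice 2003"`, a press at index 15 selects `"RevisionNotice"` (characters 12 to 25), while a press on the line feed at index 11 selects nothing.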
[0165]
As described above, since the OCR progress of the next document can be easily confirmed, it is possible to smoothly shift from the bibliographic registration work for the current document to the bibliographic registration work for the next document. When the reversing unit 1919 is dragged and dropped to the position of the subject input field 1912, “○ △ Revision (Notification)” is input.
[0166]
Next, details of the processing corresponding to the screen changes described with reference to FIG. 20 will be described with reference to the flowcharts of FIGS. 18 and 19.
[0167]
FIGS. 18 and 19 are flowcharts showing a modification of part of the processing executed in the document management system in FIG. 3, and thus show the processing that differs from FIG. 3. Processing steps similar to those in FIG. 3 are given step numbers beginning with “S6”.
[0168]
The server-side processes in FIG. 18 are executed under the control of the CPU 21 of the document management server 110, and the processes of steps S604 to S607 are executed under the control of the CPU of the work terminal 120.
[0169]
In the work terminal 120, either a combination of browser software, JavaScript (Java is a registered trademark), and ActiveX is used, or a client application is installed and activated.
[0170]
The work terminal 120 identifies the input information of the keyboard 26 and the operation information of the mouse 27 in step S605 of FIG. 3, and transmits this operation information to the document management server 110 (step S606). The operation information includes, for example, “pressing a mouse button at a specific position on the screen”, “drag and drop operation”, “character input to or deletion from a reverse display portion”, and the pressing of various buttons. When the operation is a drag and drop, the operation information includes the dragged character string, the drag start coordinates, and the drop destination coordinates. When the operation is character input or deletion in the reverse display portion, the operation information also includes the position coordinates of the reverse display portion and the text that was input or deleted.
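The contents of the operation information can be pictured as a small record sent from the work terminal to the server. The following Python sketch is illustrative only: the `OperationInfo` fields, the `kind` values, and the JSON encoding are assumptions, since the text lists what is included but not how it is encoded.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass
class OperationInfo:
    """Hypothetical record for one keyboard or mouse operation (step S606)."""
    kind: str  # e.g. "mouse_press", "drag_and_drop", "text_edit", "button"
    position: Optional[Tuple[int, int]] = None     # press or reverse-display position
    dragged_text: Optional[str] = None             # character string being dragged
    drag_start: Optional[Tuple[int, int]] = None   # drag start coordinates
    drop_target: Optional[Tuple[int, int]] = None  # drop destination coordinates
    edited_text: Optional[str] = None              # text input or deleted
    button: Optional[str] = None                   # name of a pressed button


def encode(op: OperationInfo) -> str:
    """Serialize only the fields that are set, for transmission to the server."""
    return json.dumps({k: v for k, v in asdict(op).items() if v is not None})
```

A drag and drop would carry `dragged_text`, `drag_start`, and `drop_target`, while an edit in the reverse display portion would carry `position` and `edited_text`.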
[0171]
The CPU 21 of the document management server 110 determines whether or not the end button 1920 has been pressed on the screen of the work terminal 120, based on the input information of the keyboard 26 and the operation information of the mouse 27 transmitted from the work terminal 120 in step S606 (step S630).
[0172]
If the result of determination in step S630 is that the end button 1920 has been pressed, this processing ends. If the end button has not been pressed, the process proceeds to step S1821.
[0173]
In step S1821, it is determined whether or not the operation information transmitted from the work terminal 120 in step S606 is a press of a mouse button. If not, the process proceeds to step S1830. If it has been pressed, the process advances to step S1822 to detect whether the page button has been pressed. The page buttons are buttons 1903 to 1908 in FIG. 20, and include two buttons just below buttons 1903 and 1904 and two buttons just above buttons 1907 and 1908.
[0174]
If the page button has been pressed in step S1822, the process advances to step S1823, where the image of the specified page and the OCR-completed text data are read from the image DB 117 and the text DB 118 of the document management server 110. The screen information is then transmitted in step S635, the work terminal 120 receives the screen information in step S607, and a screen as shown in FIG. 20 is displayed in step S604.
[0175]
If the page button has not been pressed in step S1822, the process advances to step S1824 to detect whether the mouse button was pressed with the cursor on the text. “On the text” means that the cursor is within the screen area where the text is displayed. If it is detected that the text has been pressed, the process advances to step S1825, and one phrase of the text at the mouse position is copied to a buffer secured in the RAM. The same processing is performed when the mouse button is pressed for the drag and drop described later. One phrase here refers to a character string delimited by blanks or tabs. Next, the process proceeds to step S1825, where the HTML attribute of the portion corresponding to the text is changed from transparent to reverse display and shown on the screen. Pressing a portion that is already displayed in reverse does not change the reverse display. Since step S635 and the subsequent steps are the same as in the first embodiment, their description is omitted.
[0176]
If the text is not pressed in step S1824, the process advances to step S634 to determine whether the registration button has been pressed. Since this is the same as FIG. 3, the description thereof is omitted. If the registration button has not been pressed, the process advances to step S1829 to execute a process corresponding to the pressed button, such as “update text”. This is the end of the description of the processing when the mouse button is pressed.
[0177]
Next, when the mouse button has not been pressed in step S1821, the process proceeds to step S1830, where it is detected whether a character has been input or deleted. If a character has been input or deleted, the process proceeds to step S1831, where the text in the buffer at the corresponding position is updated.
[0178]
If there is no character input or deletion in step S1830, the process proceeds to step S1832, where it is detected whether a drag and drop has occurred, that is, whether the mouse button was released after the mouse pointer was moved with the mouse button held down. If a drag and drop has occurred, the process advances to step S1833, and the start and end points of the drag and drop are detected from the mouse event information. The process then proceeds to step S1834, where it is detected whether the reverse display portion of the text was dragged and dropped. If so, the process proceeds to step S1835, where the text in the buffer is entered into the input field at the drop position; the process then proceeds to step S635, and the screen information is transmitted.
[0179]
If the reverse display portion has not been dragged in step S1834, the process proceeds directly to step S635.
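The branching in steps S1821 to S1835 can be summarized with a small Python sketch. It is a simplified illustration under stated assumptions: the operation is represented as a plain dictionary, the key names (`kind`, `button`, `on_text`, `phrase`, `from_highlight`, `target_field`) and the returned labels are hypothetical, and screen transmission (step S635) is omitted.

```python
PAGE_BUTTONS = {"first", "previous", "next", "last"}  # hypothetical button names


def handle_operation(op: dict, state: dict) -> str:
    """Dispatch one operation and return a label for the branch that was taken."""
    kind = op.get("kind")
    if kind == "end":                               # S630: end button pressed
        return "end"
    if kind == "mouse_press":                       # S1821
        if op.get("button") in PAGE_BUTTONS:        # S1822 -> S1823
            state["page"] = op.get("page")
            return "page_loaded"
        if op.get("on_text"):                       # S1824 -> S1825
            state["buffer"] = op["phrase"]          # copy one phrase to the buffer
            state["highlighted"] = True             # switch to reverse display
            return "phrase_highlighted"
        if op.get("button") == "register":          # S634
            return "registered"
        return "other_button"                       # S1829
    if kind == "text_edit":                         # S1830 -> S1831
        state["buffer"] = op["text"]
        return "buffer_updated"
    if kind == "drag_and_drop":                     # S1832 -> S1833 -> S1834
        if op.get("from_highlight") and state.get("highlighted"):
            state["fields"][op["target_field"]] = state["buffer"]  # S1835
            return "field_filled"
    return "ignored"                                # proceed directly to S635
```

Pressing on a phrase and then dropping it on an input field therefore takes two operations, which fill the field from the buffer.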
[0180]
As described above, according to the present embodiment, text on the screen can be input to a desired input field with a simple operation, so that the efficiency of bibliographic information registration work can be greatly improved. In other words, what normally takes four operations (clicking and dragging from the start point to the end point of a specific character string to highlight it, pressing Ctrl+C, moving the cursor to the input field, and pressing Ctrl+V) can be performed in two operations.
[0181]
It goes without saying that the object of the present invention is also achieved by supplying a software program that realizes the functions of the above-described embodiments to a computer or a control unit (specifically, a CPU), and having the computer or CPU read and execute the supplied program.
[0182]
In this case, the program is supplied directly from a recording medium on which the program is recorded (not shown), or by downloading from another computer or database (not shown) connected to the Internet, a commercial network, a local area network, or the like.
[0183]
The above-described program only needs to be able to realize the functions of the above-described embodiments on a computer, and it may take any form, such as object code, a program executed by an interpreter, or script data supplied to the OS.
[0184]
Furthermore, needless to say, the object of the present invention can also be achieved by supplying a computer with a recording medium on which a software program that implements the functions of the above-described embodiments is recorded, and having the computer read and execute the program stored in the recording medium.
[0185]
Examples of the recording medium for supplying the program include RAM, NV-RAM, floppy (registered trademark) disk, optical disk, magneto-optical disk, CD-ROM, MO, CD-R, CD-RW, DVD (DVD-ROM, DVD-RAM, DVD-RW, DVD+RW, DVD-R, DVD+R, Blu-ray Disc, etc.), magnetic tape, nonvolatile memory card, and other ROM; any medium may be used as long as it can store the above program.
[0186]
[Effects of the invention]
As described above, according to the document management device according to claim 1 and the document management program according to claim 9, text data is generated by performing character recognition processing on the image data received from the image reading device, and bibliographic information is extracted based on the generated text data and the extraction rules and sent to the client device. As a result, the character recognition processing of the image data can be continued while the image data, the text data, and the extracted bibliographic information are displayed on the screen of the client device. Therefore, the waiting time for processes other than the character recognition process can be minimized, the man-hours for inputting bibliographic information can be reduced, and work efficiency can be improved.
[0187]
According to the document management device of claim 2, since image data, text data, and bibliographic information are stored in association with each other, the image data, text data, and extracted bibliographic information can be registered in the database and recalled together, so that the time required for bibliographic information correction can be minimized.
[0188]
According to the document management apparatus of claim 3, since the bibliographic information is updated based on the editing information received from the client apparatus, it is possible to easily update the character string of the input field such as the bibliographic information. It is possible to reduce the man-hours for inputting information and the like and reduce the burden on the operator.
[0189]
According to the document management apparatus of the sixth aspect, the bibliographic information can be easily corrected.
[0190]
According to the document management system of claim 7, the waiting time for processing other than the character recognition processing in the document management apparatus can be minimized, the man-hours for inputting bibliographic information can be reduced, and work efficiency can be improved. In addition, since the image reading apparatus executes the character recognition process, the load on the document management apparatus is reduced, and the processing speed of the document management apparatus can be further improved.
[0191]
According to the document management system according to claim 8, the waiting time in the client device for processing other than the character recognition processing can be minimized, the man-hours for inputting bibliographic information can be reduced, and work efficiency can be improved. In addition, since the load on the document management apparatus is reduced, the processing speed of the document management apparatus can be further improved.
[0192]
According to the document management system of claim 9, since the text on the screen can be input to a desired input field with a simple operation, the efficiency of the bibliographic information registration work can be greatly improved. In other words, what normally takes four operations (clicking and dragging from the start point to the end point of a specific character string to highlight it, pressing Ctrl+C, moving the cursor to the input field, and pressing Ctrl+V) can be performed in two operations.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a hardware configuration of a document management apparatus according to a first embodiment of the present invention.
FIG. 2 is a block diagram showing a configuration of a document management system to which a document management apparatus according to an embodiment of the present invention can be applied.
FIG. 3 is a flowchart showing processing executed by the document management system in FIG. 2;
FIG. 4 is a flowchart showing processing executed by the document management system in FIG. 2.
FIG. 5 is a diagram showing an example of an application displayed on the screen of the work terminal 120 during the bibliographic registration process.
FIG. 6 is a flowchart showing an OCR process (step S626 in FIG. 3).
FIG. 7 is a flowchart showing an OCR process (step S626 in FIG. 3).
FIG. 8 is a flowchart showing a bibliographic candidate extraction process (step S627 in FIG. 3).
FIG. 9 is a flowchart showing a bibliographic candidate extraction process (step S627 in FIG. 3).
FIG. 10 is a flowchart showing a modification of part of the processing executed in the document management system in FIG. 3.
FIG. 11 is a block diagram showing a configuration of a document management system to which a document management apparatus according to a second embodiment of the present invention can be applied.
FIG. 12 is a flowchart showing processing executed by the document management system in FIG. 11.
FIG. 13 is a flowchart showing processing executed in the document management system in FIG. 11.
FIG. 14 is a block diagram showing a configuration of a document management system to which a document management apparatus according to a third embodiment of the present invention can be applied.
FIG. 15 is a flowchart showing processing executed by the document management system in FIG. 14.
FIG. 16 is a flowchart showing processing executed by the document management system in FIG. 14.
FIG. 17 is a diagram illustrating an example of an application displayed on the screen of the work terminal 120 in the bibliographic registration process.
FIG. 18 is a flowchart showing a modification of part of the processing executed in the document management system in FIG. 3.
FIG. 19 is a flowchart showing a modification of part of the processing executed by the document management system in FIG. 3;
FIG. 20 is a diagram showing an example of an application displayed on the screen of the work terminal 120 during the bibliographic registration process.
[Explanation of symbols]
21 CPU
22 RAM
23 ROM
28 hard disk
110 Document management server
111 Bibliographic registration unit
112 OCR processing unit
113 Bibliographic extraction unit
114 Image management unit
115 Extraction rule storage unit
116 Bibliographic Database (DB)
117 Image database (DB)
118 Text Database (DB)

Claims

1. In a document management apparatus connected via a communication line to an image reading apparatus for reading an image of a document and to a client apparatus for displaying image data of the read image,
Image storage means for storing image data received from the image reading device;
Character recognition processing means for generating text data by performing character recognition processing on the image data stored in the image storage means;
Extraction rule storage means for storing extraction rules for extracting bibliographic information from the text data;
Extraction means for extracting bibliographic information based on the text data and the extraction rule;
A document management apparatus comprising: transmission means for transmitting the image data, the text data, and the bibliographic information to the client apparatus.

2. The document management apparatus according to claim 1, further comprising document information storage means for storing the image data, the text data, and the bibliographic information in association with each other.

3. The document management apparatus described above, further comprising a receiving unit that receives, from the client device, editing information on the bibliographic information, and a bibliographic information updating unit that updates the bibliographic information based on the received editing information.

4. The document management apparatus according to claim 3, further comprising text data updating means for updating the text data based on the received editing information.

5. The document management apparatus according to claim 4, further comprising display information generation means for generating display information to be displayed on the screen of the client device based on the image data, the bibliographic information updated by the bibliographic information update means, and the text data updated by the text data update means.

6. The document management apparatus described above, wherein the received editing information includes information on whether or not the text data selected on the client device is displayed in reverse video, information on whether or not a rectangular area including the text data selected on the client device is designated, and information indicating whether or not the text data displayed in reverse video or the text data included in the designated rectangular area has been dragged and dropped onto a predetermined input field.

7. In a document management system comprising an image reading device for reading an image of a document, a client device for displaying image data of the read image, and a document management device connected to the image reading device and the client device via a communication line,
The image reading device includes:
Character recognition processing means for generating text data by performing character recognition processing on the image data of the read image;
Transmission means for transmitting the image data and the text data to the document management device,
The document management apparatus includes:
Storage means for storing image data and text data received from the image reading device;
Extraction rule storage means for storing extraction rules for extracting bibliographic information from the text data;
Extraction means for extracting bibliographic information based on the text data and the extraction rule;
A document management system comprising: transmission means for transmitting the image data, the text data, and the bibliographic information to the client device.

8. In a document management system comprising an image reading device for reading an image of a document, a client device for displaying image data of the read image, and a document management device connected to the image reading device and the client device via a communication line,
The client device is
Image storage means for storing image data received from the image reading device;
Character recognition processing means for generating text data by performing character recognition processing on the image data stored in the image storage means;
Extraction rule storage means for storing extraction rules for extracting bibliographic information from the text data;
Extraction means for extracting bibliographic information based on the text data and the extraction rule;
A document management system comprising: the image data, the text data, and display information generating means for generating display information for displaying the bibliographic information on the client device.

9. The document management system according to claim 8, wherein the document management apparatus includes a receiving unit that receives, from the client apparatus, editing information for the bibliographic information, and the received editing information is information indicating whether or not text data selected on the client device is displayed in reverse video, information indicating whether or not a rectangular area including text data selected on the client device is designated, and information indicating whether or not the text data displayed in reverse video or the text data included in the designated rectangular area has been dragged and dropped onto a predetermined input field.

10. In a document management program that is executed by a computer connected via a communication line to an image reading device that reads an image of a document and a client device that displays image data of the read image,
An image storage module for storing image data received from the image reading device;
A character recognition processing module for generating text data by performing character recognition processing on the image data stored in the image storage means;
An extraction rule storage module for storing extraction rules for extracting bibliographic information from the text data;
An extraction step of extracting bibliographic information based on the text data and the extraction rule;
A document management program, comprising: a display information generation module that generates display information for causing the client device to display the image data, the text data, and the bibliographic information.