JP3319203B2 - Document filing method and apparatus

Info

Publication number: JP3319203B2
Application number: JP02898195A
Authority: JP
Inventors: 修森口; 泰博高山; 洋一藤井; 裕三丸田
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1995-02-17
Filing date: 1995-02-17
Publication date: 2002-08-26
Anticipated expiration: 2017-08-26
Also published as: JPH08221558A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a document filing method and apparatus. It is applicable in particular to inputting and registering document image information in a document filing system that registers and retrieves document image information.

[0002]

2. Description of the Related Art

In document filing systems that register and retrieve document image information, various methods have been proposed for easily entering the registration items of the database used to retrieve the document image information. One document filing system that allows such registration items to be entered is disclosed in Japanese Patent Application Laid-Open No. 4-77885. In that system, image data of a business card or the like read by a scanner is subjected to character recognition by character recognition means, the item to which each piece of character data belongs is determined, and the character data and its item type are registered in a database and used for retrieval.

[0003] FIG. 8 shows the display used for the correction operation in this document filing system. In the figure, C1 is a cursor on the currently designated character data, C2 is a selection display listing the recognition candidates for the character data indicated by cursor C1, and C3 is a selection display listing the item names for the character data indicated by cursor C1. The user moves cursor C1 to the character data to be corrected, selects the correct character with C2, and selects the correct item name with C3. After such post-corrections have been made one after another, the read image data and the recognized character data are registered in the database.

[0004]

Problems to Be Solved by the Invention

In the document filing system described above, however, display and correction proceed by displaying the item name list for one document, correcting it, then displaying the item list for another document and correcting that, and so on. Correcting all the documents therefore requires the cumbersome operation of updating the screen display each time, and the correction work is inefficient. In addition, the system handles only documents, such as business cards and slips, that have a limited format and a limited number of items. It does not offer the convenience of inputting and correcting a plurality of documents of a plurality of types in one batch, and so remains insufficient in terms of usability.

[0005] SUMMARY OF THE INVENTION

The present invention has been made to solve the above problems. Its object is to obtain a document filing method and apparatus that, with a simple configuration and operation, allow a plurality of documents of a plurality of types to be input in one batch, reduce the user's workload in correction work, and improve usability.

[0006]

Means for Solving the Problems

A document filing method according to the present invention comprises: a document image data input step of inputting a series of image data, obtained by reading a plurality of documents of a plurality of types in one batch from an image reading device, as document image data corresponding to each of the read documents; a pattern recognition step of pattern-recognizing and encoding the document image data input in the document image data input step to generate coded data; a first display step of displaying together, on a display unit, the document image data and coded data corresponding to each of the plurality of documents; a correction step of determining, based on the display result of the first display step, whether the coded data needs post-correction and post-correcting the coded data according to that determination; a second display step of displaying the coded data corrected in the correction step on the display unit; and a database registration processing step of registering in a database, in one batch, the document image data and coded data displayed on the display unit.

The document image data input step extracts, from the document image data, the image data of the database registration item areas according to the document type. The pattern recognition step encodes the image data of the database registration item areas extracted in the document image data input step into database registration item values. The first and second display steps display together the image data of the database registration item areas and the database registration item values obtained by encoding that image data, in a table in which the documents and the database registration items are arranged.
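
The sequence of steps above can be pictured with a minimal Python sketch. Everything named here (`Document`, `recognize`, `build_table`, `correct`, `register`, and the string placeholders standing in for cropped images) is a hypothetical illustration and does not come from the patent; the recognizer is passed in as a function because the text does not specify one.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Document:
    doc_type: str
    # item name -> image of the registration item area (a string stands in for pixels)
    item_images: Dict[str, str]
    # item name -> coded registration item value, filled in by recognize()
    item_values: Dict[str, str] = field(default_factory=dict)


def recognize(docs: List[Document], ocr: Callable[[str], str]) -> None:
    """Pattern recognition step: encode every item-area image into an item value."""
    for doc in docs:
        doc.item_values = {name: ocr(img) for name, img in doc.item_images.items()}


def build_table(docs: List[Document]) -> List[Dict[str, str]]:
    """Display steps: one row per document, one column per registration item."""
    return [{"type": d.doc_type, **d.item_values} for d in docs]


def correct(docs: List[Document], fixes: Dict[Tuple[int, str], str]) -> None:
    """Correction step: fixes maps (row index, item name) to the corrected value."""
    for (row, item), value in fixes.items():
        docs[row].item_values[item] = value


def register(docs: List[Document], database: list) -> None:
    """Registration step: store all documents in one batch."""
    database.extend((d.doc_type, dict(d.item_values)) for d in docs)


# Usage: two documents of different types go through the whole sequence.
docs = [
    Document("invoice", {"number": "img:A-17", "total": "img:1OO"}),
    Document("memo", {"title": "img:Minutes"}),
]
recognize(docs, ocr=lambda img: img[4:])   # fake recognizer: strip the "img:" prefix
first_view = build_table(docs)             # first display
correct(docs, {(0, "total"): "100"})       # user fixes a misread value
second_view = build_table(docs)            # second display
database: list = []
register(docs, database)
```

The point of the sketch is only the ordering: recognition runs once over the whole batch, and the user sees and corrects all documents in a single table before anything is registered.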

[0007] In a document filing method according to a further aspect of the invention, the document image data input step comprises: a document segmentation step of comparing the series of image data with predefined standard document information and dividing it into the plurality of documents that were read; and a document discrimination step of comparing each of the documents divided in the document segmentation step with the standard document information to determine its type, and outputting it as document image data.
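
As a rough illustration of segmentation and type discrimination, the Python sketch below assumes that each scanned page has already been reduced to a layout signature string and that the standard document information maps the signature of a document's first page to its type. Both assumptions, and the name `split_and_classify`, are hypothetical; the text here does not say how the comparison is performed, and the sketch folds the two steps into one pass for brevity.

```python
from typing import Dict, List, Tuple


def split_and_classify(
    pages: List[str], templates: Dict[str, str]
) -> List[Tuple[str, List[int]]]:
    """Return (document type, page indexes) for each document in the scanned series.

    A page whose signature matches a template starts a new document of that
    type; any other page is treated as a continuation of the current document.
    """
    documents: List[Tuple[str, List[int]]] = []
    for index, signature in enumerate(pages):
        doc_type = templates.get(signature)
        if doc_type is not None:
            documents.append((doc_type, [index]))
        elif documents:
            documents[-1][1].append(index)
        else:
            # A leading page that matches nothing is kept as an unknown document.
            documents.append(("unknown", [index]))
    return documents


templates = {"INV": "invoice", "MEMO": "memo"}
result = split_and_classify(["INV", "cont", "MEMO", "INV"], templates)
```

With these inputs the four scanned pages are grouped into three documents: a two-page invoice, a memo, and a second invoice.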

[0008] In a document filing method according to a further aspect of the invention, the document image data input step further comprises: a document item definition step in which, in addition to the standard document information, a database registration item area is defined in advance for each document type, indicating where each database registration item value used when registering the document in the database is written; and a database registration item extraction step of extracting, from the document image data corresponding to a document, the image data of the database registration item areas according to the document type. The image data of the database registration item areas extracted in the database registration item extraction step is encoded in the pattern recognition step, and the document image data is registered in the database in the database registration processing step with the encoded database registration item values attached.
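
One way to picture the predefined item areas and their extraction is the following Python sketch. The rectangle format `(top, left, height, width)`, the `ITEM_AREAS` table, the document types, and the item names are all assumptions made for illustration; the image is a plain list of pixel rows so that the example needs no imaging library.

```python
from typing import Dict, List, Tuple

Area = Tuple[int, int, int, int]   # (top, left, height, width), in pixels
Image = List[List[int]]            # rows of pixel values

# Hypothetical item definitions: for each document type, where each item is written.
ITEM_AREAS: Dict[str, Dict[str, Area]] = {
    "invoice": {"number": (0, 0, 1, 2), "total": (1, 1, 1, 2)},
    "memo": {"title": (0, 0, 1, 3)},
}


def extract_item_images(image: Image, doc_type: str) -> Dict[str, Image]:
    """Crop the registration item areas defined for doc_type out of the page image."""
    return {
        name: [row[left:left + width] for row in image[top:top + height]]
        for name, (top, left, height, width) in ITEM_AREAS[doc_type].items()
    }


page = [[1, 2, 3],
        [4, 5, 6]]
crops = extract_item_images(page, "invoice")
```

Only the cropped areas would then be passed to the recognizer, so the work done per document depends on the number of defined items rather than on the page size.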

[0009] In a document filing method according to a further aspect of the invention, the first and second display steps display together the document image data and the encoded database registration item values of the plurality of documents of a plurality of types, in a table in which the documents and the database registration items are arranged, and the correction step corrects, in that table, the encoded database registration item value corresponding to each document.

[0010] In a document filing method according to a further aspect of the invention, the display and behavior of the first and second display steps switch according to the document type.

[0011] A document filing method according to a further aspect of the invention switches, at the user's request, between a first display mode, in which documents of the same type among a plurality of documents of different types are displayed together, and a second display mode, in which documents of different types are displayed mixed together at the same time.
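
The two display modes amount to two row orderings of the same table. The Python sketch below is one possible reading: the mode names `"grouped"` and `"mixed"`, the function `arrange_rows`, and the dictionary layout of a row are hypothetical, and grouping is assumed to keep types in the order in which they first appear.

```python
from typing import Dict, List


def arrange_rows(docs: List[dict], mode: str) -> List[dict]:
    """Order table rows for the first ("grouped") or second ("mixed") display mode."""
    if mode == "grouped":
        # Rank each type by first appearance; sorted() is stable, so documents
        # of the same type keep their original relative order.
        rank: Dict[str, int] = {}
        for doc in docs:
            rank.setdefault(doc["type"], len(rank))
        return sorted(docs, key=lambda doc: rank[doc["type"]])
    if mode == "mixed":
        # Keep the input (scan) order, so different types are interleaved.
        return list(docs)
    raise ValueError(f"unknown display mode: {mode}")


scanned = [
    {"id": 1, "type": "invoice"},
    {"id": 2, "type": "memo"},
    {"id": 3, "type": "invoice"},
]
grouped_ids = [d["id"] for d in arrange_rows(scanned, "grouped")]
mixed_ids = [d["id"] for d in arrange_rows(scanned, "mixed")]
```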

[0012] In a document filing method according to a further aspect of the invention, the second display mode switches, at the user's request, between a fixed display mode, in which the database registration item values, which differ by document type, are displayed at fixed positions, and a left-justified display mode, in which they are displayed packed to the left.
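
The difference between the fixed and left-justified modes can be sketched as two ways of laying out one table row. In this hypothetical Python example, `all_items` is assumed to be the full column list across every document type, and an empty string stands for a blank cell; none of these names come from the patent.

```python
from typing import Dict, List


def layout_row(values: Dict[str, str], all_items: List[str], mode: str) -> List[str]:
    """Lay out one document's item values as the cells of a table row."""
    if mode == "fixed":
        # Every item keeps its own column; items the document lacks are left blank.
        return [values.get(item, "") for item in all_items]
    if mode == "left":
        # Only the items the document actually has, packed to the left.
        return [values[item] for item in all_items if item in values]
    raise ValueError(f"unknown layout mode: {mode}")


columns = ["number", "title", "total"]
memo = {"title": "Minutes"}
fixed_row = layout_row(memo, columns, "fixed")
left_row = layout_row(memo, columns, "left")
```

In the fixed layout a blank cell shows at a glance that the memo has no "number" or "total" item, while the left-justified layout spends no screen width on items the document does not have.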

[0013] In a document filing method according to a further aspect of the invention, the correction step provides an image data display area that always shows an enlarged view of the document image data corresponding to the database registration item value being corrected in the document, and a pattern data display area that always shows an enlarged view of the pattern data of the encoded database registration item value.

[0014] In a document filing method according to a further aspect of the invention, the correction step provides a fixed display area that always displays, at a designated display magnification, the document image data of the whole page containing the database registration item value being corrected in the document, together with the pattern data of the encoded database registration item value.

[0015] In a document filing method according to a further aspect of the invention, when the document image data of the whole page is displayed in the fixed display area, the region of the document image data corresponding to the database registration item value being corrected is outlined with a frame or shown in color, and the view is moved by the minimum amount necessary to bring that region within the display area.
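
The "minimum necessary movement" can be computed independently for the horizontal and vertical axes. The Python sketch below shows one axis; the function name `min_scroll` and the choice to align to the start of a region that is larger than the view are assumptions, since the text does not cover that case.

```python
def min_scroll(view_start: int, view_size: int, region_start: int, region_size: int) -> int:
    """Return the new view offset that shows the region with the least movement.

    Apply once with x coordinates and once with y coordinates.
    """
    region_end = region_start + region_size
    if region_start < view_start or region_size >= view_size:
        # Region begins before the view (or cannot fit): align the view to its start.
        return region_start
    if region_end > view_start + view_size:
        # Region runs past the end of the view: scroll just far enough to show its end.
        return region_end - view_size
    # Region is already fully visible: do not move.
    return view_start


already_visible = min_scroll(0, 100, 20, 30)
scroll_forward = min_scroll(0, 100, 120, 30)
scroll_back = min_scroll(50, 100, 10, 20)
```

Because the view stays put whenever the region is already visible, moving between nearby items on the same page does not cause the page image to jump.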

[0016] In a document filing method according to a further aspect of the invention, when the user, in the correction step, changes the type of a document to one different from the type determined in the document discrimination step, only the document whose type was corrected is reprocessed, following the processing procedure for the document type defined in the document item definition step.
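
Reprocessing only the corrected documents might look like the following Python sketch. The function `apply_type_corrections`, the dictionary layout of a document, and the `process` callback (standing in for item extraction plus recognition for a given type) are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List


def apply_type_corrections(
    docs: List[dict],
    corrections: Dict[int, str],
    process: Callable[[object, str], Dict[str, str]],
) -> List[int]:
    """Re-run processing only for documents whose type the user changed.

    corrections maps a document index to the type chosen by the user.
    Returns the indexes that were actually reprocessed.
    """
    reprocessed = []
    for index, new_type in corrections.items():
        doc = docs[index]
        if doc["type"] == new_type:
            continue  # type unchanged, so the existing values are still valid
        doc["type"] = new_type
        doc["values"] = process(doc["image"], new_type)
        reprocessed.append(index)
    return reprocessed


batch = [
    {"type": "invoice", "image": "page-0", "values": {"number": "A-17"}},
    {"type": "invoice", "image": "page-1", "values": {"number": "???"}},
]
# Stand-in for extraction and recognition using the definitions of the given type.
fake_process = lambda image, doc_type: {"title": f"{doc_type} from {image}"}
changed = apply_type_corrections(batch, {1: "memo"}, fake_process)
```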

[0017] In a document filing method according to a further aspect of the invention, the database registration processing step includes a keyword extraction step of cutting out words from the coded data obtained by encoding the whole document in the character recognition step and extracting them as keywords. When a keyword is designated as a database registration item, the keywords extracted in the keyword extraction step are attached and the document is registered in the database.

[0018] In a document filing method according to a further aspect of the invention, the area from which keywords are extracted is limited to an area among the database registration item values. Keywords are extracted from the coded data obtained by encoding only the designated area in the character recognition step, and the extracted keywords are attached and registered in the database.

[0019] A document filing method according to a further aspect of the invention includes a keyword extraction range designation step of designating the range from which keywords are extracted according to the document content. Keywords are extracted from the designated extraction range within the coded data obtained by encoding the whole document in the character recognition step, and the extracted keywords are attached and registered in the database.
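
Keyword extraction from a designated range can be sketched in Python as follows. The sketch assumes the extraction range is given as character offsets into the recognized text and that words are separated by spaces or punctuation; both are simplifications made for illustration, and Japanese text would need a proper word segmenter in place of the regular expression. The name `extract_keywords` is hypothetical.

```python
import re
from typing import List, Optional


def extract_keywords(coded_text: str, start: int = 0, end: Optional[int] = None) -> List[str]:
    """Cut words out of coded_text[start:end] and return them as unique keywords."""
    keywords: List[str] = []
    for word in re.findall(r"\w+", coded_text[start:end]):
        lowered = word.lower()
        if lowered not in keywords:   # keep the first occurrence only
            keywords.append(lowered)
    return keywords


text = "Report on scanner OCR. Appendix: scanner settings"
whole_document = extract_keywords(text)
first_sentence = extract_keywords(text, 0, 22)   # designated range: characters 0 to 21
```

Restricting the range keeps words from the appendix out of the keyword list, which is the effect the range designation is meant to achieve.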

[0020] A document filing apparatus according to a further aspect of the invention comprises: document image data input means for inputting a series of image data, obtained by reading a plurality of documents of a plurality of types in one batch from an image reading device, as document image data corresponding to each of the read documents; pattern recognition means for pattern-recognizing and encoding the document image data input by the document image data input means to generate coded data; display means for displaying together, on a display unit, the document image data and coded data corresponding to each of the plurality of documents; correction means for determining, based on the display result on the display unit, whether the coded data needs post-correction, post-correcting the coded data according to that determination, and supplying it to the display means; and database registration processing means for registering in a database, in one batch, the document image data and coded data displayed on the display unit.

The document image data input means extracts, from the document image data, the image data of the database registration item areas according to the document type. The pattern recognition means encodes the image data of the database registration item areas extracted by the document image data input means into database registration item values. The display means displays together the image data of the database registration item areas and the database registration item values obtained by encoding that image data, in a table in which the documents and the database registration items are arranged.

[0021] In a document filing apparatus according to a further aspect of the invention, the document image data input means comprises: document segmentation means for comparing the series of image data with predefined standard document information and dividing it into the plurality of documents that were read; and document discrimination means for comparing each of the documents divided by the document segmentation means with the standard document information to determine its type, and outputting it as document image data.

[0022] In a document filing apparatus according to a further aspect of the invention, the document image data input means further comprises: document item definition means in which, in addition to the standard document information, a database registration item area is defined in advance for each document type, indicating where each database registration item value used when registering the document in the database is written; and database registration item extraction means for extracting, from the document image data corresponding to a document, the image data of the database registration item areas according to the document type. The image data of the database registration item areas extracted by the database registration item extraction means is encoded by the pattern recognition means, and the document image data is registered in the database by the database registration processing means with the encoded database registration item values attached.

[0023] In a document filing apparatus according to a further aspect of the invention, the display means displays together the document image data and the encoded database registration item values of the plurality of documents of a plurality of types, in a table in which the documents and the database registration items are arranged, and the correction means corrects, in that table, the encoded database registration item value corresponding to each document.

[0024]

OPERATION

A series of image data obtained by reading a plurality of documents of a plurality of types in one batch from an image reading device is input as document image data corresponding to each of the read documents. The document image data is pattern-recognized to generate coded data, the document image data and coded data corresponding to each of the documents are displayed together on a display unit, the coded data is post-corrected as necessary, and the document image data and coded data displayed on the display unit are registered in a database in one batch. As a result, a plurality of documents of a plurality of types can be input in one batch with a simple configuration and operation, the user's workload in correction work is reduced, and usability is improved.

[0025] Further, the series of image data is compared with predefined standard document information and divided into the plurality of documents that were read, and each divided document is compared with the standard document information to determine its type. As a result, a plurality of documents of a plurality of types can be reliably input in one batch with a simple configuration and operation.

[0026] Further, the image data of the database registration item areas according to the document type is extracted from the document image data corresponding to a document, that image data is encoded in the pattern recognition step, and the document image data is registered in the database with the encoded database registration item values attached. As a result, a plurality of documents of a plurality of types can be reliably input in one batch with a simple configuration and operation.

[0027] Further, the document image data and the encoded database registration item values of the plurality of documents of a plurality of types are displayed together in a table in which the documents and the database registration items are arranged, and the encoded database registration item value corresponding to each document is corrected in that table. As a result, the user's workload in correction work is reduced and usability is improved.

[0028] Further, the display and behavior of the first and second display steps switch according to the document type. The user can therefore easily recognize the type of each document, which reduces the user's workload in correction work and improves usability.

[0029] Further, for a plurality of documents of different types, the display is switched at the user's request between a first display mode, in which documents of the same type are displayed together, and a second display mode, in which documents of different types are displayed mixed together at the same time. Selecting the first display mode lets the user compare the contents of several documents of the same type, and selecting the second display mode lets the user compare the contents of different documents. Because either mode can be chosen as needed, the user's workload in correction work is reduced and usability is improved.

[0030] Further, the second display mode is switched at the user's request between a fixed display mode, in which the database registration item values, which differ by document type, are displayed at fixed positions, and a left-justified display mode, in which they are displayed packed to the left. Selecting the fixed display mode lets the user easily see whether each database registration item is present for each document type, and selecting the left-justified display mode lets the user view more database registration items within the same document. Because either mode can be chosen as needed, the user's workload in correction work is reduced and usability is improved.

[0031] Further, in the correction step, the image data display area always shows an enlarged view of the document image data corresponding to the database registration item value being corrected in the document, and the pattern data display area always shows an enlarged view of the pattern data of the encoded database registration item value. The user can therefore compare the enlarged pattern data being corrected against its document image data, which reduces the user's workload in correction work and improves usability.

[0032] Further, in the correction step, the fixed display area always displays, at the designated display magnification, the document image data of the whole page containing the database registration item value being corrected in the document, together with the pattern data of the encoded database registration item value. The user can therefore compare the pattern data being corrected against its document image data at the designated magnification, which reduces the user's workload in correction work and improves usability.

[0033] Further, when the document image data of the whole page is displayed in the fixed display area, the region of the document image data corresponding to the database registration item value being corrected is outlined with a frame or shown in color, and the view is moved by the minimum amount necessary to bring that region within the display area. The user can therefore reliably identify the document image data corresponding to the database registration item value being corrected, which reduces the user's workload in correction work and improves usability.

[0034] Further, when the user, in the correction step, changes the type of a document to one different from the type determined in the document discrimination step, only the document whose type was corrected is reprocessed, following the processing procedure for the document type defined in the document item definition step. Even if a document is misclassified, the error can therefore be corrected easily, and a plurality of documents of a plurality of types can be reliably input in one batch.

[0035] Further, the keyword extraction step of the database registration processing step cuts out words from the coded data obtained by encoding the whole document in the character recognition step and extracts them as keywords. When a keyword is designated as a database registration item, the extracted keywords are attached and the document is registered in the database. Keywords can therefore be extracted and registered in the database easily, which improves usability.

[0036] Further, the area from which keywords are extracted is limited to an area among the database registration item values, keywords are extracted from the coded data obtained by encoding only the designated area in the character recognition step, and the extracted keywords are attached and registered in the database. Database registration item values can therefore be extracted as keywords and registered in the database easily, which improves usability.

[0037] Further, in the keyword extraction range designation step, the range from which keywords are extracted is designated according to the document content. Keywords are extracted from the designated extraction range within the coded data obtained by encoding the whole document in the character recognition step, and the extracted keywords are attached and registered in the database. Useful keywords can therefore be extracted and registered in the database, which improves usability.

[0038] Further, the document image data input means inputs a series of image data, obtained by reading a plurality of documents of a plurality of types in one batch from an image reading device, as document image data corresponding to each document. The pattern recognition means pattern-recognizes the data to generate coded data. The display means displays together, on the display unit, the document image data and coded data corresponding to each of the documents. The correction means post-corrects the coded data as necessary, based on the display result on the display unit, and supplies it to the display means. The database registration means registers in the database, in one batch, the document image data and coded data displayed on the display unit. As a result, a plurality of documents of a plurality of types can be input in one batch with a simple configuration and operation, the user's workload in correction work is reduced, and usability is improved.

[0039] Further, in the document image data input means, the document segmentation means compares the series of image data with predefined standard document information and divides it into the plurality of documents that were read, and the document discrimination means compares each divided document with the standard document information to determine its type and outputs it as document image data. As a result, a plurality of documents of a plurality of types can be reliably input in one batch with a simple configuration and operation.

[0040] Further, in the document image data input means, the document item definition means defines in advance, in addition to the standard document information, a database registration item area for each document type, indicating where the database registration item values are written. The database registration item extraction means extracts the image data of the database registration item areas from the document image data corresponding to a document, and the document image data is registered in the database with the database registration item values encoded by the pattern recognition means attached. As a result, a plurality of documents of a plurality of types can be reliably input in one batch with a simple configuration and operation.

【００４１】The display means collectively displays the document image data and the coded database registration item values of the plurality of documents of a plurality of types in a table in which documents and database registration items are arranged, and the correction means corrects, in that table, the coded database registration item values corresponding to each document. As a result, the user's workload during correction is reduced and usability is improved.

【００４２】[0042]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS An embodiment of the present invention will be described in detail below with reference to the drawings.

【００４３】Embodiment 1. FIG. 1 shows the overall configuration of a document filing system according to the present invention, built into a personal computer or a workstation. Reference numeral 1 denotes a control unit, consisting of a CPU (central processing unit), that controls the operation of the entire system. The control unit 1 issues control instructions to, and exchanges data with, the document reading means 2, the document classification means 3, the document discrimination means 4, the database registration item extraction means 5, the character recognition means 6, the batch display means 7, the correction means 8, the keyword extraction means 9, the database registration means 10, and the storage unit 11.

【００４４】The document reading means 2 is a software driver that operates the document reading unit 12, an optical image reading device with a sheet feeder. A plurality of documents of a plurality of types are placed on the document reading unit 12 and scanned in one batch, and the resulting image data is input to the document filing system. The document classification means 3 compares the series of image data for these documents, input through the document reading means 2, with the document image patterns predefined in the document item definition means 13, using pattern matching or a similar technique. It recognizes the first page of each document and divides the data into document image data corresponding to one document each. The document discrimination means 4 then compares the resulting document image data once more with the document image patterns defined in the document item definition means 13 and determines the type of each document.

【００４５】In practice, in this embodiment, the head identification page PH shown in FIG. 2 is attached to the front of each document, and the plurality of documents of a plurality of types are read and input in one batch. The head identification page PH carries a head page mark HM, which is a rectangular area filled in beforehand, and document type marks SM, a set of rectangular areas that the user fills in to specify the document type. The document item definition means 13 therefore predefines a document image pattern corresponding to the head page mark HM and document image patterns carrying the document type marks SM for each document type. This lets the document classification means 3 and the document discrimination means 4 divide the input image data into documents and determine the type of each document with simple pattern matching.
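
The division of a scanned page stream by head identification pages can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes mark detection has already been done, and the names `Page`, `Document`, and `split_documents` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Page:
    """One scanned page (hypothetical structure).

    head_mark  -- True when the head page mark HM was detected on the page
    type_marks -- filled/unfilled state of each document type mark SM box
    """
    head_mark: bool = False
    type_marks: List[bool] = field(default_factory=list)
    name: str = ""


@dataclass
class Document:
    doc_type: Optional[int]              # index of the filled SM box, or None
    pages: List[Page] = field(default_factory=list)


def split_documents(pages: List[Page]) -> List[Document]:
    """Start a new document at every head identification page."""
    docs: List[Document] = []
    for page in pages:
        if page.head_mark:
            filled = [i for i, on in enumerate(page.type_marks) if on]
            # Assumption: exactly one SM box must be filled; otherwise the
            # type is left undetermined so the user can correct it later.
            doc_type = filled[0] if len(filled) == 1 else None
            docs.append(Document(doc_type))
        elif docs:
            docs[-1].pages.append(page)
        # Pages that appear before the first head page are ignored here.
    return docs
```

Leaving the type as `None` when the marks are ambiguous is a design choice of this sketch; it matches the text only in that the user can later correct division and type errors.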

【００４６】When the head identification page PH (FIG. 2) is used, the documents may be divided and their types determined by detecting the presence or absence of filled rectangular areas instead of by pattern matching, which further lightens the processing of the document classification means 3 and the document discrimination means 4. Alternatively, the head identification page PH may be omitted: the document image pattern of the first page of each document type expected as input is defined in advance and stored in the storage unit 11, and pattern matching against these patterns is used to divide the documents and determine their types. In this case the user only has to place the documents themselves on the document reading unit 12, which makes document input correspondingly easier.
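
Detecting whether a rectangular mark area is filled can be reduced to counting dark pixels inside the box. The sketch below assumes a binarized page held as a list of rows with 1 for dark and 0 for light; the function name `is_filled` and the 0.5 threshold are illustrative assumptions, not values from the patent.

```python
def is_filled(image, box, threshold=0.5):
    """Return True when the share of dark pixels inside box reaches threshold.

    image -- list of rows, each a list of 0 (light) / 1 (dark) pixels
    box   -- (left, top, right, bottom), right and bottom exclusive
    """
    left, top, right, bottom = box
    total = (right - left) * (bottom - top)
    if total <= 0:
        raise ValueError("box must have positive area")
    dark = sum(image[y][x] for y in range(top, bottom) for x in range(left, right))
    return dark / total >= threshold
```

A ratio test of this kind tolerates small scanning noise better than requiring every pixel to be dark.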

【００４７】In addition to the document image patterns used for dividing documents and determining their types, the document item definition means 13 defines, for each document type, the names of the database registration items and the position in the document where each item is written, as database registration item areas. The database registration item extraction means 5 cuts out, from the document image data discriminated by the document discrimination means 4, the image data contained in the database registration item areas for that document type. In this embodiment, each database registration item area is defined numerically by the coordinates of the area in the document, and the document image data within those coordinates is cut out as the image data of the area. A database registration item area may also be defined as a document image pattern, in which case the image data of the area can be cut out by pattern matching in the same way as described above.
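
Coordinate-based extraction of database registration item areas can be sketched as a lookup followed by a crop. The definition table layout and the names `crop` and `extract_fields` are assumptions made for illustration; the patent only states that each area is defined by numeric coordinates per document type.

```python
def crop(image, box):
    """Cut the (left, top, right, bottom) box out of a row-major image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def extract_fields(image, doc_type, definitions):
    """Return {item name: cropped image} for the given document type.

    definitions -- {doc_type: {item name: (left, top, right, bottom)}},
                   a hypothetical stand-in for the document item definitions
    """
    return {name: crop(image, box) for name, box in definitions[doc_type].items()}
```

For example, with `definitions = {"A": {"title": (0, 0, 3, 1)}}`, calling `extract_fields(page, "A", definitions)` returns the top row of a three-pixel-wide page under the key `"title"`.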

【００４８】The character recognition means 6 recognizes character patterns in the image data of the database registration item areas extracted by the database registration item extraction means 5, using a character recognition technique, converts them into character codes, and sends the document image data and the character string data to the batch display means 7. The patterns recognized include characters, alphanumerics, symbols, and pictorial marks. The batch display means 7 arranges documents and database registration items vertically and horizontally, places the image data of each database registration item and the character string data obtained by coding it in two rows, and displays them collectively on the display unit 14 for the plurality of documents of a plurality of types. At the same time, it generates various other display information and sends it to the display unit 14; for example, it shows the document image data of the page containing the database registration item currently being edited by the correction means 8 in an image data viewer, which is a fixed display area, and draws a frame over the area being edited. The display unit 14 presents the display information generated by the batch display means 7 to the user and consists of a CRT (cathode ray tube) display or the like.

【００４９】The correction means 8 responds to the various correction operations that the user performs with the operation unit 15 on the information shown on the display unit 14. It applies the requested corrections to the internal data and at the same time passes them to the batch display means 7 so that they are reflected in what the display unit 14 shows. In addition to correcting recognized characters, the user can correct errors in how the documents were divided or in their types. In that case, the database registration item areas for the corrected document type are obtained from the document item definition means 13, the relevant document image data is reprocessed through the database registration item extraction means 5 and the character recognition means 6, and the result is added to the display by the batch display means 7. The operation unit 15 consists of a keyboard and a pointing device such as a mouse, with which the user enters the desired corrections to the information shown on the display unit 14 into the correction means 8.

【００５０】The keyword extraction means 9 extracts keywords useful for document retrieval from the character string data of the database registration items that were coded by the character recognition means 6 and corrected by the correction means 8. Depending on the document type, the document image data of the whole document, or of an area belonging to a designated document category such as an abstract, may instead be coded by the character recognition means 6, and every word cut out of the resulting character string data may be extracted as a keyword. The database registration means 10 registers the document image data discriminated by the document discrimination means 4 in the database in the storage unit 11, attaching as additional information the database registration items (extracted by the database registration item extraction means 5, coded by the character recognition means 6, and corrected by the correction means 8) and the keywords extracted by the keyword extraction means 9. The storage unit 11 stores the document image patterns of the document item definition means 13, the documents registered by the database registration means 10, and so on, and consists of a semiconductor memory device, a disk device, or the like.
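
The following sketch shows one way the corrected item values and extracted keywords could be attached to a document before registration. It is only an illustration: splitting on word characters is a crude stand-in for the keyword extraction the patent leaves unspecified (Japanese text would need a proper word segmenter), and `extract_keywords` and `build_record` are hypothetical names.

```python
import re


def extract_keywords(fields, min_len=2):
    """Collect unique lower-cased words from all item values, in first-seen order."""
    keywords = []
    for value in fields.values():
        for word in re.findall(r"\w+", value):
            word = word.lower()
            if len(word) >= min_len and word not in keywords:
                keywords.append(word)
    return keywords


def build_record(doc_type, image_ref, fields):
    """Bundle the document image reference with its additional information."""
    return {
        "type": doc_type,
        "image": image_ref,
        "fields": dict(fields),
        "keywords": extract_keywords(fields),
    }
```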

【００５１】FIG. 3 shows the flow of document filing processing in the document filing system of this embodiment configured as described above. A plurality of documents 20 of a plurality of types are placed on the document reading unit 12 and input in one batch through the document reading means 2. The series of image data corresponding to these documents is divided into individual documents by the document classification means 3, and the type of each document is determined by the document discrimination means 4. For example, for document image data 21A determined to be document A, the definition 22A for document A obtained from the document item definition means 13 is applied, and the document image data 21A is processed according to the procedure defined there.

【００５２】From each set of document image data 21, the database registration item extraction means 5 cuts out the image data 23 of the database registration item areas according to the definition 22 in the document item definition means 13. The character recognition means 6 converts each cut-out image data 23 into character codes, producing character string data 24 that holds the database registration item values. These are displayed by the batch display/correction interface 25 and post-corrected as necessary, and the document image data and the character string data are then registered in the database 26 built in the storage unit 11.

【００５３】The batch display/correction interface 25 will now be described in detail. The batch display means 7 generates the information needed to display the database registration items of all the documents together, and this is shown on the display unit 14, so the user can check it and make corrections interactively with the operation unit 15 as needed. Exactly one cursor is always displayed on the display unit 14, and the database registration item at the cursor position is the correction target at that moment.

【００５４】For the database registration item being corrected, the image data and the coded character string data are displayed enlarged, so the user can correct the coded character string data while checking the enlarged image. The image data of the page containing that database registration item is always shown in the image data viewer, and the corresponding database registration item area is outlined with a frame or displayed in inverted colors, so the user can easily locate the item being corrected in the image.

【００５５】If the image data of the page is too large to fit in the image data viewer, the view is scrolled by the minimum amount needed to show the area corresponding to the database registration item being corrected. When the character string data of a coded database registration item is corrected and the selected character has two or more recognition candidates, a list of the candidates is displayed and the character can be replaced with the one the user selects. If the correct character is not in the list of recognition candidates, the user enters it from the keyboard or the like, using kana-kanji conversion or a similar input method.
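
Scrolling "by the minimum amount needed" can be computed independently per axis. The sketch below is one possible reading, with hypothetical names `min_scroll` and `scroll_to_region`; how to treat a region larger than the viewer is an assumption (here its start edge is aligned with the view).

```python
def min_scroll(offset, view_size, start, end):
    """Return the new 1-D scroll offset that shows [start, end) with least movement."""
    if end - start >= view_size:
        return start                     # region larger than the view: align its start
    if start < offset:
        return start                     # region begins before the view: scroll back
    if end > offset + view_size:
        return end - view_size           # region ends after the view: scroll forward
    return offset                        # already fully visible: do not move


def scroll_to_region(view, region):
    """view = (x, y, width, height); region = (left, top, right, bottom)."""
    x, y, width, height = view
    left, top, right, bottom = region
    return (min_scroll(x, width, left, right), min_scroll(y, height, top, bottom))
```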

【００５６】FIG. 4 shows the batch display/correction interface 25 as displayed on the display unit 14 for correction. It is built from the batch display data generated by the batch display means 7. The correction means 8 interprets a correction instruction entered from the operation unit 15, applies the correction, and passes the result to the batch display means 7 so that it is reflected on the display unit 14. By repeating this cycle, the user corrects each database registration item of each document as needed. In this embodiment, the display frame of the display unit 14 contains, in addition to the main display area that shows documents and database registration items in table form, the display areas of the image data viewer 30 and the character string data viewer 31 described above.

【００５７】In the main display area, the document type label 32 indicates the type of each document as determined by the document discrimination means 4; data displayed in the same row are database registration items of the same document. The database registration item label 33 indicates a database registration item used for retrieving documents, as extracted by the database registration item extraction means 5; data displayed in the same column are the same database registration item of each document. The cell corresponding to each document consists of two tiers, an image data cell 34 and a character string data cell 35.

【００５８】The image data cell 34 shows the image data of the database registration item area in the document image data of that document. The displayed image is reduced by the minimum amount needed to fit within the image data cell 34. The character string data cell 35 shows the character string data obtained by coding that image data with the character recognition means 6, and the character string data can be corrected in this cell. At any given time exactly one character string data cell 35 holds the cursor 36. The image data cell 34 and the character string data cell 35 where the cursor 36 is located are called the current cell and are the correction target at that time.
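
Reducing an image "by the minimum amount needed" to fit a cell amounts to choosing the largest scale factor, capped at 1, that satisfies both the width and the height limit. The function name `fit_scale` is hypothetical, and preserving the aspect ratio is an assumption of this sketch.

```python
def fit_scale(image_w, image_h, cell_w, cell_h):
    """Largest uniform scale (never above 1.0) at which the image fits the cell."""
    if image_w <= 0 or image_h <= 0:
        raise ValueError("image dimensions must be positive")
    return min(1.0, cell_w / image_w, cell_h / image_h)
```

An image that already fits is left at its original size, since the scale is never allowed to exceed 1.0.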

【００５９】The image data viewer 30 is a fixed display area that shows the image data of the relevant page of the document corresponding to the current cell. Within the image shown in the image data viewer 30, a frame 37 is drawn around the area corresponding to the current cell. If the frame 37 would fall outside the display area, the displayed portion of the image is shifted by the minimum amount needed so that as much of the frame 37 as possible is shown within the image data viewer. The character string data viewer 31 is a fixed display area that shows the character string data of the current cell and provides ordinary editor functions. A cursor 38 is always displayed in the character string data viewer 31, and the character string data can be corrected there.

【００６０】To correct the single character at the cursor 36 on the current cell or at the cursor 38 in the character string data viewer 31, a recognition candidate selection screen listing the candidate characters pops up, as shown in FIG. 5. The screen shows the image data 40 of the one character to be corrected together with the recognition candidate characters 41, one character each, labeled by class such as kanji, hiragana, katakana, or alphanumeric. The user picks the replacement character by moving a pointing device such as a mouse or the cursor. To leave the character unchanged, the user selects the End indicator 42.

【００６１】The recognition candidate selection screen then closes, and the character at the positions of the cursors 38 and 36, in the character string data viewer 31 and on the current cell, is corrected in the corresponding character string data. If the desired character does not appear on the recognition candidate selection screen, the user can correct the character at the cursors 36 and 38, or a character string containing it, through an ordinary editor interface. In practice, when either the cursor 38 in the character string data viewer 31 or the cursor 36 on the current cell moves, the other cursor moves so that both always stay at the same position in the character string data. Likewise, when a correction is made in either the character string data viewer 31 or the current cell, the same correction is always applied to the other.
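
One simple way to keep the two cursors and the two copies of the text in step is to let both views render a single shared model. The sketch below assumes that design; the patent does not say how the synchronization is implemented, and `SharedText` and `View` are hypothetical names.

```python
class SharedText:
    """Character string data plus one cursor position, shared by all views."""

    def __init__(self, text, candidates=None):
        self.text = text
        self.cursor = 0
        # {character index: [recognition candidate characters]} (illustrative)
        self.candidates = candidates or {}

    def move(self, pos):
        """Move the cursor, clamped to the valid character range."""
        self.cursor = max(0, min(pos, len(self.text) - 1))

    def candidates_at_cursor(self):
        return self.candidates.get(self.cursor, [])

    def replace_at_cursor(self, char):
        """Replace the single character under the cursor."""
        c = self.cursor
        self.text = self.text[:c] + char + self.text[c + 1:]


class View:
    """Stands in for either the current cell or the character string data viewer."""

    def __init__(self, model):
        self.model = model

    def render(self):
        return (self.model.text, self.model.cursor)
```

Because neither view stores its own copy of the text or cursor, an edit made through one is visible in the other without any explicit copying.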

【００６２】With this configuration, the database registration item areas to be extracted are defined for each document type. Based on these definitions, the required database registration items are extracted in one batch from the image data of a plurality of documents of a plurality of types that were read collectively, and are coded into character string data by the character recognition means. The database registration items used for retrieving document images can therefore be input efficiently. Furthermore, the image data of the database registration items and the coded character string data for the plurality of documents are displayed together for correction, which reduces the user's workload during correction.

【００６３】Embodiment 2. The batch display/correction interface of Embodiment 1 described above displays the image data viewer 30 and the character string data viewer 31 in addition to the documents and database registration items. In this embodiment these viewers are not displayed, and only the documents and database registration items are shown in table form, as illustrated in FIG. 6. The current cell, where the cursor 36 is located, is always displayed enlarged. When the cursor 36 moves to another cell, the cell it left returns to normal size and the cell it moved to is enlarged.

【００６４】This achieves the same effects as Embodiment 1. In addition, for the same display area, database registration items for more documents can be displayed than in Embodiment 1, which further reduces the user's workload during correction. Conversely, when database registration items for the same number of documents as in Embodiment 1 are displayed, the display area can be made smaller because the image data viewer 30 and the character string data viewer 31 are not shown, so the screen of the display unit 14 is used more effectively.

【００６５】Embodiment 3. The batch display/correction interface of Embodiments 1 and 2 described above displays the plurality of documents of a plurality of types together in the order in which they were read. In this embodiment, only data for documents of the same type is gathered and corrected as one unit. When correction of one document type is finished, documents of another type are corrected, and this is repeated until all of the documents read in the batch have been corrected. Alternatively, the display positions of the documents may be reordered according to document type. This achieves the same effects as Embodiments 1 and 2. In addition, compared with Embodiments 1 and 2, the user can correct the displayed documents while comparing documents of the same type, which further reduces the workload of correction.
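
Gathering documents of the same type, or reordering the display by type, can be sketched with a dictionary keyed by type. The record layout (`{"id": ..., "type": ...}`) and the function names are assumptions for illustration; read order is preserved within each type.

```python
def group_by_type(docs):
    """Return {type: [documents]} with types in first-seen order."""
    groups = {}
    for doc in docs:
        groups.setdefault(doc["type"], []).append(doc)
    return groups


def sort_by_type(docs):
    """Flatten the groups so that documents of the same type sit together."""
    return [doc for group in group_by_type(docs).values() for doc in group]
```

`group_by_type` corresponds to correcting one type at a time, while `sort_by_type` corresponds to the alternative of rearranging the display positions.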

【００６６】Embodiment 4. In the batch display/correction interface of Embodiments 1 to 3 described above, when a document has no value for a given database registration item and therefore no cell at that position, a blank is displayed. In this embodiment, as shown in FIG. 7, the cells of each document are instead packed to the left by the number of items that have no value. In this case, the item names shown in the database registration item label 33 are those of the document corresponding to the current cell, and each time the current cell moves, the item names shown in the database registration item label 33 are changed to match the destination document. This achieves the same effects as Embodiments 1 to 3. In addition, compared with Embodiments 1 to 3, more of the database registration items of the document being corrected are displayed, which further reduces the user's workload during correction.
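
The left-packed row of this embodiment, and the item labels that follow the current document, can be sketched as below. The names `pack_row` and `labels_for`, and the convention that a missing value is `None` or an empty string, are assumptions of this illustration.

```python
def pack_row(fields, columns):
    """Keep only the items that have a value, in column order (left-packed)."""
    return [(name, fields[name]) for name in columns if fields.get(name)]


def labels_for(packed_row):
    """Item names to show in the label row while this document is current."""
    return [name for name, _value in packed_row]
```

Because each packed row carries its own item names, the label row can simply be rebuilt from the row of the current cell whenever the cursor moves to another document.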

【００６７】Embodiment 5. In the image data viewer 30 of the batch display/correction interface of Embodiment 1, the displayed image data follows the movement of the current cell so that it includes the database registration item area moved to. In this embodiment, the user moves the displayed region of the image data with a pointing device such as a mouse and specifies any point inside a database registration item area in the image data, which moves the cursor 36 in the table of documents and database registration items. The image data viewer 30 may also be provided with an operation control for switching the display to another sheet of the document image data. This achieves the same effects as Embodiment 1. In addition, the user can perform corrections visually while looking at the image data viewer 30, which reduces the workload of correction still further.
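
Mapping a clicked point in the page image back to a database registration item is a point-in-rectangle test over the defined areas. The sketch below uses a hypothetical `field_at` function and assumes areas are given as (left, top, right, bottom) with exclusive right and bottom edges.

```python
def field_at(point, regions):
    """Return the name of the item area containing point, or None.

    point   -- (x, y) in page image coordinates
    regions -- {item name: (left, top, right, bottom)}
    """
    x, y = point
    for name, (left, top, right, bottom) in regions.items():
        if left <= x < right and top <= y < bottom:
            return name
    return None
```

The returned item name, together with the document being viewed, identifies the table cell to which the cursor should move.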

【００６８】[0068]

EFFECTS OF THE INVENTION As described above, according to the present invention, a series of image data read collectively by an image reading device from a plurality of documents of a plurality of types is input as document image data corresponding to each of the read documents. The document image data is pattern-recognized to generate coded data, the document image data and the coded data corresponding to each of the documents are displayed together on a display unit, the coded data is post-corrected as necessary, and the displayed document image data and coded data are registered collectively in a database. This realizes a document filing method with which a plurality of documents of a plurality of types can be input collectively with a simple configuration and operation, and which reduces the user's workload during correction and improves usability.

[0069] According to the next invention, the series of image data is compared with predefined standard document information and divided into the plurality of documents read, and each of the divided documents is compared with the standard document information to determine its type. This realizes a document filing method in which a plurality of documents of a plurality of types can be reliably input in one batch with a simple configuration and operation.
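The division and type determination described in this paragraph can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the specification: each page is reduced to a set of layout features, the standard document information is modeled as a hypothetical `Template` holding the features expected on the first page of each document type, and similarity is measured with a Jaccard score against an arbitrary threshold of 0.5.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """Hypothetical stand-in for the 'standard document information'."""
    doc_type: str
    head_features: frozenset  # features expected on a document's first page


@dataclass
class Document:
    doc_type: str
    pages: list = field(default_factory=list)


def jaccard(a, b):
    """Similarity of two feature sets, in the range [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_template(page, templates, threshold=0.5):
    """Return the template whose head page best matches `page`, or None."""
    scored = [(jaccard(page, t.head_features), t) for t in templates]
    score, tmpl = max(scored, key=lambda st: st[0], default=(0.0, None))
    return tmpl if score >= threshold else None


def split_and_classify(pages, templates):
    """Divide a page stream into documents and label each with its type."""
    docs = []
    for page in pages:
        tmpl = best_template(page, templates)
        if tmpl is not None:
            # The page looks like a known first page: start a new document.
            docs.append(Document(tmpl.doc_type, [page]))
        elif docs:
            # Otherwise treat it as a continuation of the current document.
            docs[-1].pages.append(page)
        else:
            # Leading pages that match nothing are kept as an unknown type.
            docs.append(Document("unknown", [page]))
    return docs
```

In this sketch a single comparison against the templates does both jobs: a match marks a document boundary and also supplies the type. A real system would compare image features rather than string labels.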

[0070] According to the next invention, the image data of the database registration item area corresponding to the type of the document is extracted from the document image data corresponding to the document, the image data of that area is coded in the pattern recognition step, and the document image data is registered in the database with the coded database registration item values attached. This realizes a document filing method in which a plurality of documents of a plurality of types can be reliably input in one batch with a simple configuration and operation.
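As a rough illustration of this paragraph, the sketch below crops each predefined item area out of a page image and attaches the recognized values to the record. The data layout is assumed for illustration only: the image is a list of pixel rows, `definitions` maps a document type to named areas given as `(top, left, height, width)`, and `recognize` is a hypothetical callable standing in for the pattern recognition step.

```python
def crop(image, region):
    """Cut a rectangular area out of an image stored as a list of rows."""
    top, left, height, width = region
    return [row[left:left + width] for row in image[top:top + height]]


def build_record(image, doc_type, definitions, recognize):
    """Extract every item area defined for `doc_type` and code it.

    `definitions` and `recognize` are illustrative assumptions, not names
    from the specification.
    """
    regions = definitions[doc_type]
    values = {
        item: recognize(crop(image, region))
        for item, region in regions.items()
    }
    # The document image is stored together with its coded item values.
    return {"type": doc_type, "image": image, "values": values}
```

Keeping the area definitions in a table keyed by document type means that supporting a new type only requires a new table entry, not new extraction code.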

[0071] According to the next invention, the document image data and the coded database registration item values of a plurality of documents of a plurality of types are displayed together in a table format in which documents and database registration items are arranged, and the coded database registration item value for a document is corrected in that table. This realizes a document filing method that reduces the user's workload in correction work and improves usability.

[0072] According to the next invention, the display and operation in the first and second display steps are switched according to the type of the document, so that the user can easily recognize the type of each document. This realizes a document filing method that reduces the user's workload in correction work and improves usability.

[0073] According to the next invention, for a plurality of documents of different types, a first display mode in which documents of the same type are displayed together and a second display mode in which documents of different types are mixed and displayed at the same time are switched at the user's request. If the user selects the first display mode, the contents of a plurality of documents of the same type can be compared; if the user selects the second display mode, the contents of documents of different types can be compared. The user can choose between the two as needed, which realizes a document filing method that reduces the user's workload in correction work and improves usability.
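The two display modes can be pictured as two orderings of the same rows. The following is only a sketch under assumptions: documents are dictionaries with a `"type"` key, the mode names `"grouped"` and `"mixed"` are invented labels for the first and second display modes, and the second mode is taken to mean input order.

```python
def order_rows(docs, mode):
    """Return the rows in display order for the chosen mode."""
    if mode == "mixed":
        # Second display mode: types are intermixed, in input order.
        return list(docs)
    if mode == "grouped":
        # First display mode: same-type documents are shown together.
        # Types are ranked by first appearance; sorted() is stable, so
        # input order is preserved within each type.
        rank = {}
        for doc in docs:
            rank.setdefault(doc["type"], len(rank))
        return sorted(docs, key=lambda doc: rank[doc["type"]])
    raise ValueError(f"unknown display mode: {mode!r}")
```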

[0074] According to the next invention, the second display mode switches, at the user's request, between a fixed display mode, in which the display positions of the database registration item values, which differ for each type of document, are fixed, and a left-justified display mode, in which those values are displayed left-justified. If the user selects the fixed display mode, the presence or absence of each database registration item can easily be recognized for each type of document; if the user selects the left-justified display mode, more database registration items can be displayed and recognized within the same document. The user can choose between the two as needed, which realizes a document filing method that reduces the user's workload in correction work and improves usability.
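The difference between the fixed and left-justified modes is easiest to see on a single table row. The sketch below assumes, purely for illustration, that each document's values are held in a dictionary keyed by item name and that `all_items` lists every database registration item across all document types in column order; the mode names are invented labels.

```python
def layout_row(values, all_items, mode):
    """Place one document's item values into table columns."""
    if mode == "fixed":
        # Each item always occupies its own column; a missing item leaves
        # a blank cell, so its absence is visible at a glance.
        return [values.get(item, "") for item in all_items]
    if mode == "left":
        # Only the items this document has are shown, packed to the left,
        # so more of them fit into the visible columns.
        present = [values[item] for item in all_items if item in values]
        return present + [""] * (len(all_items) - len(present))
    raise ValueError(f"unknown display mode: {mode!r}")
```

For a document that lacks the first column's item, the fixed mode yields a leading blank cell while the left-justified mode shifts every value one column to the left.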

[0075] According to the next invention, in the correction step, the document image data corresponding to the database registration item value being corrected is always displayed enlarged in an image data display area, and the pattern data of the coded database registration item value is always displayed enlarged in a pattern data display area. The enlarged pattern data being corrected can therefore be compared with its document image data, which realizes a document filing method that reduces the user's workload in correction work and improves usability.

[0076] According to the next invention, in the correction step, the document image data of the entire page containing the database registration item value being corrected and the pattern data of the coded database registration item value are always displayed in a fixed display area at a specified display magnification. The pattern data being corrected, displayed at the specified magnification, can therefore be compared with its document image data, which realizes a document filing method that reduces the user's workload in correction work and improves usability.

[0077] According to the next invention, when the document image data of the entire page is displayed in the fixed display area, the area on the document image data corresponding to the database registration item value being corrected is enclosed in a frame or displayed in color, and that area is moved by the minimum amount necessary to bring it within the display area. The document image data corresponding to the value being corrected can therefore be reliably identified, which realizes a document filing method that reduces the user's workload in correction work and improves usability.
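One way to read "moved by the minimum amount necessary" is as a per-axis scroll that leaves the view untouched when the item area is already visible. The sketch below takes that reading as an assumption; the function name and the coordinate convention (page coordinates, with the view given by its top-left corner and size) are illustrative and not from the specification.

```python
def scroll_minimally(view_x, view_y, view_w, view_h, region):
    """Return the new top-left corner of the view so that `region` fits.

    `region` is (x, y, width, height) in page coordinates.
    """
    def shift(view_start, view_len, reg_start, reg_len):
        if reg_len >= view_len:
            # Area larger than the view: align its start with the view.
            return reg_start
        if reg_start < view_start:
            # Area sticks out on the near side: move just far enough back.
            return reg_start
        if reg_start + reg_len > view_start + view_len:
            # Area sticks out on the far side: move just far enough forward.
            return reg_start + reg_len - view_len
        # Already fully visible: do not move.
        return view_start

    rx, ry, rw, rh = region
    return shift(view_x, view_w, rx, rw), shift(view_y, view_h, ry, rh)
```

Because each axis is handled independently, moving the cursor along a row of the table scrolls the page image horizontally without disturbing its vertical position.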

[0078] According to the next invention, in the correction step, when the user corrects the type of a document to a type different from the one determined in the document discrimination step, only the document whose type was corrected is reprocessed according to the processing procedure corresponding to the document type defined in the document item definition step. Even when a document has been misclassified, the error can therefore be corrected easily, which realizes a document filing method in which a plurality of documents of a plurality of types can be reliably input in one batch.

[0079] According to the next invention, the keyword extraction step of the database registration processing step cuts out words from the coded data obtained by coding the entire document in the character recognition step and extracts them as keywords, and, when a keyword is designated as a registration item of the database, the extracted keywords are attached and registered in the database. Keywords can therefore be extracted and registered in the database easily, which realizes a document filing method that improves usability for the user.

[0080] According to the next invention, the area from which keywords are extracted is limited to within the database registration item values, keywords are extracted from the coded data obtained by coding only the designated area in the character recognition step, and the extracted keywords are attached and registered in the database. Database registration item values can therefore easily be extracted as keywords and registered in the database, which realizes a document filing method that improves usability for the user.

[0081] According to the next invention, in the keyword extraction range designation step, the range from which keywords are extracted is designated according to the contents of the document; keywords are extracted from the designated extraction range of the coded data obtained by coding the entire document in the character recognition step; and the extracted keywords are attached and registered in the database. Effective keywords can therefore be extracted and registered in the database, which realizes a document filing method that improves usability for the user.
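Keyword extraction from a designated range can be sketched as follows. This is a simplification under stated assumptions: the coded data is taken to be a list of recognized text lines, the designated range is a half-open line interval, and words are cut out with a plain regular expression. Japanese text has no spaces between words, so a real system would need a proper word segmenter in place of the regular expression; the stop-word list and minimum length are likewise illustrative.

```python
import re


def extract_keywords(lines, start, end, stop_words=frozenset(), min_len=2):
    """Return unique keywords from lines[start:end], in first-seen order."""
    keywords = []
    for line in lines[start:end]:
        # Cut the line into word-like tokens (assumption: space-delimited text).
        for word in re.findall(r"\w+", line.lower()):
            if len(word) < min_len or word in stop_words:
                continue
            if word not in keywords:
                keywords.append(word)
    return keywords
```

Restricting the range to, for example, the title lines of a report keeps page footers and boilerplate out of the keyword list.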

[0082] According to the next invention, the document image data input means inputs a series of image data, obtained by reading a plurality of documents of a plurality of types in one batch from an image reading device, as document image data corresponding to each document; the pattern recognition means pattern-recognizes the document image data to generate coded data; the display means displays the document image data and coded data corresponding to each of the documents together on a display unit; the correction means post-corrects the coded data as necessary based on the display result of the display unit and supplies it to the display means; and the database registration means registers the document image data and coded data displayed on the display unit together in a database. This realizes a document filing apparatus in which a plurality of documents of a plurality of types can be input in one batch with a simple configuration and operation, and in which the user's workload in correction work is reduced and usability is improved.

[0083] According to the next invention, in the document image data input means, the document classification means compares the series of image data with predefined standard document information and divides it into the plurality of documents read, and the document discriminating means compares each of the divided documents with the standard document information to determine its type and sends the result out as document image data. This realizes a document filing apparatus in which a plurality of documents of a plurality of types can be reliably input in one batch with a simple configuration and operation.

[0084] According to the next invention, in the document image data input means, the document item definition means further defines in advance, in addition to the standard document information, a database registration item area indicating where the database registration item value is written for each type of document; the database registration item extraction means extracts the image data of the database registration item area from the document image data corresponding to the document; and the document image data is registered in the database with the database registration item value coded by the pattern recognition means attached. This realizes a document filing apparatus in which a plurality of documents of a plurality of types can be reliably input in one batch with a simple configuration and operation.

[0085] According to the next invention, the display means displays the document image data and the coded database registration item values of a plurality of documents of a plurality of types together in a table format in which documents and database registration items are arranged, and the correction means corrects the coded database registration item value for a document in that table. This realizes a document filing apparatus that reduces the user's workload in correction work and improves usability.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a first embodiment of a document filing system according to the present invention.

FIG. 2 is a schematic diagram showing a head identification page used for batch input in the document filing system of FIG. 1.

FIG. 3 is a schematic diagram illustrating the processing flow of the document filing system of FIG. 1.

FIG. 4 is a schematic diagram illustrating the batch display and correction operations in the document filing system of FIG. 1.

FIG. 5 is a schematic diagram showing a recognition candidate selection screen in the batch display and correction operations of FIG. 3.

FIG. 6 is a schematic diagram illustrating the batch display and correction operations of a second embodiment of the document filing system according to the present invention.

FIG. 7 is a schematic diagram illustrating the batch display and correction operations of a fourth embodiment of the document filing system according to the present invention.

FIG. 8 is a schematic diagram illustrating the display and correction operations in a conventional document filing system.

[Explanation of symbols]

1 control unit; 2 document reading means; 3 document classification means; 4 document discriminating means; 5 database registration item extraction means; 6 character recognition means; 7 batch display means; 8 correction means; 9 keyword extraction means; 10 database registration means; 11 storage unit; 12 document reading unit; 13 document item definition means; 14 display unit; 15 operation unit;
20 document; 21 document image data; 22 definition; 23 image data of database registration item area; 24 character string data; 25 batch display correction interface; 26 database;
30 image data viewer; 31 character string data viewer; 32 document type label; 33 database registration item label; 34 image data cell; 35 character data cell; 36 cursor; 37 frame; 38 cursor;
40 image data for one character; 41 recognition candidate character; 42 end display

Continuation of front page

(72) Inventor: Yoichi Fujii, c/o Personal Information Equipment Development Laboratory, Mitsubishi Electric Corporation, 5-1-1 Ofuna, Kamakura-shi
(72) Inventor: Yuzo Maruta, c/o Personal Information Equipment Development Laboratory, Mitsubishi Electric Corporation, 5-1-1 Ofuna, Kamakura-shi
(56) References: JP-A-6-52236 (JP, A); JP-A-5-303619 (JP, A); JP-A-1-111285 (JP, A); JP-A-3-263182 (JP, A); JP-A-6-68165 (JP, A); JP-A-5-54120 (JP, A)
(58) Fields investigated (Int. Cl.⁷, DB name): G06T 1/00; G06K 9/03

Claims

(57) [Claims]

1. A document filing method comprising: a document image data input step of inputting a series of image data, obtained by reading a plurality of documents of a plurality of types in one batch from an image reading device, as document image data corresponding to each of the documents read; a pattern recognition step of pattern-recognizing and coding the document image data input in the document image data input step to generate coded data; a first display step of collectively displaying, on a display unit, the document image data and the coded data corresponding to each of the documents; a correction step of determining, based on the display result of the first display step, whether post-correction of the coded data is necessary and post-correcting the coded data based on that determination; a second display step of displaying the coded data corrected in the correction step on the display unit; and a database registration processing step of collectively registering, in a database, the document image data and the coded data displayed on the display unit, wherein the document image data input step extracts image data of a database registration item area corresponding to the type of the document, the pattern recognition step codes the image data of the database registration item area extracted in the document image data input step into a database registration item value, and the first and second display steps collectively display the image data of the database registration item area and the database registration item value obtained by coding that image data in a table format in which documents and database registration items are arranged.

2. The document filing method according to claim 1, wherein the document image data input step comprises: a document classification step of comparing the series of image data with predefined standard document information and dividing it into the plurality of documents read; and a document discrimination step of comparing each of the documents divided in the document classification step with the standard document information to determine the type of the document, and sending the result out as the document image data.

3. The document filing method according to claim 2, wherein the document image data input step further comprises: a document item definition step in which, in addition to the standard document information, a database registration item area indicating where a database registration item value used when registering the document in the database is written is defined in advance for each type of document; and a database registration item extraction step of extracting, from the document image data corresponding to the document, the image data of the database registration item area corresponding to the type of the document, and wherein the image data of the database registration item area extracted in the database registration item extraction step is coded in the pattern recognition step, and the document image data is registered in the database with the coded database registration item value attached.

4. The document filing method according to claim 3, wherein the display and operation of the first and second display steps are switched according to the type of the document.

5. The document filing method according to claim 4, wherein, for a plurality of documents of different types, a first display mode in which documents of the same type are displayed together and a second display mode in which documents of different types are mixed and displayed at the same time are switched at the user's request.

6. The document filing method according to claim 5, wherein the second display mode switches, at the user's request, between a fixed display mode in which the display positions of the database registration item values, which differ for each type of document, are fixed, and a left-justified display mode in which the database registration item values, which differ for each type of document, are displayed left-justified.

7. The document filing method according to claim 3, wherein the correction step has an image data display area that always displays, enlarged, the document image data corresponding to the database registration item value in the document being corrected, and a pattern data display area that always displays, enlarged, the pattern data of the coded database registration item value.

8. The document filing method according to claim 3, wherein the correction step has a fixed display area that always displays, at a specified display magnification, the document image data of the entire page including the database registration item value in the document being corrected, and the pattern data of the coded database registration item value.

9. The document filing method according to claim 8, wherein, in displaying the document image data of the entire page in the fixed display area, the area on the document image data corresponding to the database registration item value being corrected is enclosed in a frame or displayed in color, and that area is moved by the minimum amount necessary to fit within the display area.

10. The document filing method according to claim 3, wherein, in the correction step, when the user corrects the type of a document to a type different from the one determined in the document discrimination step, only the document whose type was corrected is reprocessed according to the processing procedure corresponding to the type of document defined in the document item definition step.

11. The document filing method according to claim 3, wherein the database registration processing step includes a keyword extraction step of cutting out words from the coded data obtained by coding the entire document in the character recognition step and extracting them as keywords, and, when a keyword is designated as a registration item of the database, the keywords extracted in the keyword extraction step are attached and registered in the database.

12. The document filing method according to claim 11, wherein the area from which the keyword is extracted is limited to within the database registration item values, the keyword is extracted from the coded data obtained by coding only the designated area in the character recognition step, and the extracted keyword is attached and registered in the database.

13. The document filing method according to claim 11, further comprising a keyword extraction range designation step of designating, according to the contents of the document, the range from which the keyword is to be extracted, wherein the keyword is extracted from the designated extraction range of the coded data obtained by coding the entire document in the character recognition step, and the extracted keyword is attached and registered in the database.

14. A document filing apparatus comprising: document image data input means for inputting a series of image data, obtained by reading a plurality of documents of a plurality of types in one batch from an image reading device, as document image data corresponding to each of the documents read; pattern recognition means for pattern-recognizing and coding the document image data input by the document image data input means to generate coded data; display means for collectively displaying, on a display unit, the document image data and the coded data corresponding to each of the documents; correction means for determining, based on the display result of the display unit, whether post-correction of the coded data is necessary, post-correcting the coded data based on that determination, and supplying it to the display means; and database registration processing means for collectively registering, in a database, the document image data and the coded data displayed on the display unit, wherein the document image data input means extracts image data of a database registration item area corresponding to the type of the document, the pattern recognition means codes the image data of the database registration item area extracted by the document image data input means into a database registration item value, and the display means collectively displays the image data of the database registration item area and the database registration item value obtained by coding that image data in a table format in which documents and database registration items are arranged.

15. The document filing apparatus according to claim 14, wherein the document image data input means comprises: document classification means for comparing the series of image data with predefined standard document information and dividing it into the plurality of documents read; and document discriminating means for comparing each of the documents divided by the document classification means with the standard document information to determine the type of the document, and sending the result out as the document image data.

16. The document filing apparatus according to claim 15, wherein the document image data input means further comprises: document item definition means in which, in addition to the standard document information, a database registration item area indicating where a database registration item value used when registering the document in the database is written is defined in advance for each type of document; and database registration item extraction means for extracting, from the document image data corresponding to the document, the image data of the database registration item area corresponding to the type of the document, and wherein the image data of the database registration item area extracted by the database registration item extraction means is coded by the pattern recognition means, and the document image data is registered in the database with the coded database registration item value attached.