JPH08221558A

JPH08221558A - Method and device for filing document

Info

Publication number: JPH08221558A
Application number: JP7028981A
Authority: JP
Inventors: Osamu Moriguchi; 修森口; Yasuhiro Takayama; 泰博高山; Yoichi Fujii; 洋一藤井; Yuzo Maruta; 裕三丸田
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1995-02-17
Filing date: 1995-02-17
Publication date: 1996-08-30
Anticipated expiration: 2017-08-26
Also published as: JP3319203B2

Abstract

PURPOSE: To input multiple documents of multiple kinds in a single batch with a simple configuration and operation, by displaying the document image data and coded data corresponding to the respective documents together on a display part, performing post-correction as needed, and then registering the document image data and the coded data in a database in one batch. CONSTITUTION: A document partitioning means 3 compares a series of image data corresponding to plural kinds of documents with the document image patterns defined by a document item defining means 13 and partitions the series into document image data, each corresponding to one document. A document discriminating means 4 compares the document image data with the document image patterns defined by the document item defining means 13 and discriminates the kind of each document. A database registration item extracting means 5 segments the image data, and a character recognizing means 6 recognizes the character patterns in the image data, converts them to character codes, and sends them to a batch display means 7. A correcting means 8 responds to correcting operations on the information displayed on a display part 14, passes the corrections to the batch display means 7, and reflects them in the contents displayed on the display part 14.

Description

Detailed Description of the Invention

【０００１】[0001]

[Field of Industrial Application] The present invention relates to a document filing method and apparatus, and is particularly applicable to inputting and registering document image information in a document filing system that registers and retrieves document image information.

【０００２】[0002]

[Prior Art] For document filing systems that register and retrieve document image information, various methods have been proposed for easily inputting the registration items of the database used to retrieve the document image information. One document filing system that allows such registration items to be input is disclosed in, for example, Japanese Patent Laid-Open No. 4-77885. In that system, image data of a business card or the like read by a scanner or similar device is subjected to character recognition by a character recognition means, the item to which each piece of character data belongs is determined, and the character data and its item type are registered in a database and used for retrieval.

[0003] FIG. 8 shows the display used for the correction operation in this document filing system. In the figure, C1 is a cursor indicating the currently designated character data, C2 is a selection display of the recognition candidate list for the character data designated by the cursor C1, and C3 is a selection display of the item name list for the item designated by the cursor C1. The cursor C1 is moved to the position of the character data to be corrected, the correct character is selected with C2, and the correct item name is selected with C3 to make the correction. After such post-corrections have been performed one after another, the read image data and the recognized character data are registered in the database.

【０００４】[0004]

[Problems to Be Solved by the Invention] In the document filing system described above, however, display and correction proceed by repetition: the item name list for one document is displayed and corrected, then the item list for the next document is displayed and corrected. Correcting all the documents therefore requires the cumbersome operation of updating the screen display each time, and the correction work is inefficient. Moreover, the system described above handles only documents that have a limited format and a limited number of items, such as business cards and slips. It offers no way to input and correct multiple documents of multiple types in a single batch, and its usability remains insufficient.

[0005] The present invention was made to solve the above problems. Its object is to provide a document filing method and apparatus that allow multiple documents of multiple types to be input in a single batch with a simple configuration and operation, and that reduce the user's workload in correction work and improve usability.

【０００６】[0006]

[Means for Solving the Problems] A document filing method according to the present invention comprises: a document image data input step of inputting a series of image data, read in a single batch from an image reading device for multiple documents of multiple types, as document image data corresponding to each of the read documents; a pattern recognition step of pattern-recognizing and coding the document image data input in the document image data input step to generate coded data; a first display step of displaying the document image data and coded data corresponding to each of the documents together on a display unit; a correction step of determining, based on the display result of the first display step, whether the coded data needs post-correction, and post-correcting the coded data according to that determination; a second display step of displaying the coded data corrected in the correction step on the display unit; and a database registration processing step of registering the document image data and coded data displayed on the display unit in a database in a single batch.
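The sequence of steps in this paragraph can be sketched in Python as follows. This is only an illustrative outline, not the patented implementation: the names `FiledDocument`, `batch_view`, and `file_batch` are hypothetical, images are stood in for by strings, the recognizer is passed in as a function, and the user's corrections are supplied up front rather than entered interactively after the first display.

```python
from dataclasses import dataclass, field


@dataclass
class FiledDocument:
    image: str                                  # stand-in for document image data
    coded: dict = field(default_factory=dict)   # coded data keyed by item name


def batch_view(docs):
    """Return every document's image and coded data together (the batch display)."""
    return [(d.image, dict(d.coded)) for d in docs]


def file_batch(images, recognize, corrections, database):
    # Document image data input step: one entry per read document.
    docs = [FiledDocument(image) for image in images]
    # Pattern recognition step: generate coded data for each document.
    for doc in docs:
        doc.coded = recognize(doc.image)
    # First display step: show all documents at once.
    first_view = batch_view(docs)
    # Correction step: apply post-corrections keyed by (document index, item).
    for (index, item), value in corrections.items():
        docs[index].coded[item] = value
    # Second display step: show the corrected data.
    second_view = batch_view(docs)
    # Database registration processing step: register everything in one batch.
    database.extend(second_view)
    return first_view, second_view
```

In this sketch the database is a plain list and registration happens once, after all corrections, which mirrors the batch registration described above.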

[0007] In a document filing method according to a further invention, the document image data input step comprises: a document partitioning step of comparing the series of image data with predefined fixed-form document information and partitioning it into the multiple documents that were read; and a document discrimination step of comparing the documents partitioned in the document partitioning step with the fixed-form document information to determine the type of each document, and sending each out as document image data.
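A minimal sketch of partitioning and discrimination is given below. It assumes, purely for illustration, that each page carries a `layout` signature and that the fixed-form document information is a mapping from document type to the layout of that type's first page; the source does not specify how the comparison is performed, and the function names are hypothetical.

```python
def partition(pages, templates):
    """Split a page stream into documents.

    Assumption: a page whose layout matches any template's first-page
    layout starts a new document; other pages continue the current one.
    """
    first_layouts = set(templates.values())
    documents = []
    for page in pages:
        if page["layout"] in first_layouts or not documents:
            documents.append([])
        documents[-1].append(page)
    return documents


def discriminate(document, templates):
    """Return the document type whose template matches the first page, or None."""
    for kind, layout in templates.items():
        if document[0]["layout"] == layout:
            return kind
    return None
```

With templates `{"invoice": "INV", "report": "REP"}`, a stream of an invoice page, a continuation page, a report page, and another invoice page is split into three documents, each of which is then labeled by type.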

[0008] In a document filing method according to a further invention, the document image data input step further comprises: a document item definition step in which, in addition to the fixed-form document information, a database registration item region is predefined for each document type, indicating where the database registration item values used when registering that document in the database are written; and a database registration item extraction step of extracting, from the document image data corresponding to a document, the image data of the database registration item regions for that document's type. The image data of the database registration item regions extracted in the database registration item extraction step is coded in the pattern recognition step, and the document image data is registered in the database in the database registration processing step with the coded database registration item values attached.
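The per-type region definitions and the extraction step might look like the sketch below. The region table `ITEM_REGIONS`, the item names, and the `(top, left, bottom, right)` coordinate convention are assumptions made for the example; the image is modeled as a list of equal-length strings so that cropping is easy to follow.

```python
# Hypothetical region definitions: document type -> item -> (top, left, bottom, right),
# with bottom and right exclusive.
ITEM_REGIONS = {
    "invoice": {"number": (0, 0, 1, 3), "total": (1, 2, 2, 5)},
}


def extract_item_images(image, kind, regions=ITEM_REGIONS):
    """Crop the image data of each database registration item region for this type."""
    crops = {}
    for item, (top, left, bottom, right) in regions[kind].items():
        crops[item] = [row[left:right] for row in image[top:bottom]]
    return crops


def code_items(crops):
    """Stand-in for pattern recognition: join the cropped rows into one string."""
    return {item: "".join(rows) for item, rows in crops.items()}
```

Only the cropped regions are passed to recognition, so each document type needs its own entry in the region table.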

[0009] In a document filing method according to a further invention, the first and second display steps display the document image data and the coded database registration item values of multiple documents of multiple types together in a table in which documents and database registration items are arranged, and the correction step corrects, in that table of documents and database registration items, the coded database registration item values corresponding to each document.

[0010] In a document filing method according to a further invention, the display and behavior of the first and second display steps switch according to the type of document.

[0011] A document filing method according to a further invention switches, at the user's request, between a first display mode in which, for multiple documents of different types, documents of the same type are displayed grouped together, and a second display mode in which documents of different types are displayed mixed together at the same time.
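The two display modes can be sketched as an ordering function. The mode names `"grouped"` and `"mixed"` and the dictionary representation of a document are hypothetical; the source only states that same-type documents are shown together in one mode and that types are intermixed in the other.

```python
def order_for_display(docs, mode):
    """Return documents in display order for the requested mode."""
    if mode == "mixed":
        # Second display mode: keep the input order, types intermixed.
        return list(docs)
    if mode == "grouped":
        # First display mode: group by type, types in order of first appearance.
        first_seen = {}
        for doc in docs:
            first_seen.setdefault(doc["kind"], len(first_seen))
        # sorted() is stable, so documents of one type keep their input order.
        return sorted(docs, key=lambda doc: first_seen[doc["kind"]])
    raise ValueError(f"unknown display mode: {mode}")
```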

[0012] In a document filing method according to a further invention, the second display mode switches, at the user's request, between a fixed display mode in which the database registration item values, which differ by document type, are displayed at fixed positions, and a left-justified display mode in which the database registration item values, which differ by document type, are displayed packed to the left.
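The difference between the fixed and left-justified modes can be illustrated by how one table row is built. This is a sketch under assumed names (`table_row`, modes `"fixed"` and `"left"`), with a document's items held in a dictionary and the full column list given explicitly.

```python
def table_row(doc_items, all_columns, mode):
    """Build one table row of item values for a document."""
    if mode == "fixed":
        # Every item keeps its own column; items the document lacks stay blank.
        return [doc_items.get(column, "") for column in all_columns]
    if mode == "left":
        # Only the items the document has, packed to the left.
        return [doc_items[column] for column in all_columns if column in doc_items]
    raise ValueError(f"unknown display mode: {mode}")
```

In fixed mode a blank cell shows at a glance that a document type has no such item, while left-justified mode leaves no gaps and so fits more items into the same width.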

[0013] In a document filing method according to a further invention, the correction step provides an image data display area that always shows an enlarged view of the document image data corresponding to the database registration item value being corrected in the document, and a pattern data display area that always shows an enlarged view of the pattern data of the coded database registration item value.

[0014] In a document filing method according to a further invention, the correction step provides a fixed display area that always shows, at a designated display magnification, the document image data of the entire page containing the database registration item value being corrected and the pattern data of the coded database registration item value.

[0015] In a document filing method according to a further invention, when the document image data of the entire page is displayed in the fixed display area, the region of the document image data corresponding to the database registration item value being corrected is enclosed in a frame or shown in color, and that region is moved by the minimum amount necessary to bring it within the display area.
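The "minimum necessary movement" can be sketched along one axis as below; applying it once horizontally and once vertically gives the two-dimensional case. The function name and parameters are hypothetical, and the sketch assumes the region is no larger than the display area.

```python
def min_scroll(view_start, view_size, region_start, region_end):
    """Return the new view start that brings the region into view with minimal movement.

    Assumes region_end - region_start <= view_size.
    """
    view_end = view_start + view_size
    if region_start < view_start:
        # Region begins before the view: scroll back just far enough.
        return region_start
    if region_end > view_end:
        # Region ends after the view: scroll forward just far enough.
        return region_end - view_size
    # Region already fully visible: do not move.
    return view_start
```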

[0016] In a document filing method according to a further invention, when the user, in the correction step, corrects a document's type to one different from the type determined in the document discrimination step, only the document whose type was corrected is reprocessed according to the processing procedure corresponding to the document type defined in the document item definition step.
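Selective reprocessing after a type correction might be sketched as follows. The function `correct_kind`, the dictionary layout of a document, and the `process` callback (standing in for the type-specific extraction and recognition procedure) are all assumptions made for illustration.

```python
def correct_kind(docs, index, new_kind, process):
    """Change one document's type and reprocess only that document.

    `process(image, kind)` stands in for the type-specific procedure and
    returns the coded item values for the document.
    """
    doc = docs[index]
    if doc["kind"] != new_kind:
        doc["kind"] = new_kind
        doc["items"] = process(doc["image"], new_kind)
    return docs
```

The other documents in the batch are left untouched, so a single misclassification does not force the whole batch to be recognized again.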

[0017] In a document filing method according to a further invention, the database registration processing step includes a keyword extraction step that cuts out words from the coded data obtained by coding the entire document in the character recognition step and extracts them as keywords. When keywords are designated as a database registration item, the keywords extracted in the keyword extraction step are attached and the document is registered in the database.

[0018] In a document filing method according to a further invention, the region from which keywords are extracted is limited to within the database registration item values. Keywords are extracted from coded data obtained by coding only the designated region in the character recognition step, and the extracted keywords are attached and the document is registered in the database.

[0019] A document filing method according to a further invention includes a keyword extraction range designation step that designates the range from which keywords are extracted according to the document content. Keywords are extracted from the designated extraction range within the coded data obtained by coding the entire document in the character recognition step, and the extracted keywords are attached and the document is registered in the database.
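Keyword extraction with an optional extraction range can be sketched as below. This is a simplified stand-in: it splits English-like text with a regular expression and drops a small, hypothetical stop-word list, whereas segmenting Japanese text into words would need a morphological analyzer that the source does not describe. The range is expressed as a slice of text lines, which is also an assumption.

```python
import re

# Hypothetical stop-word list for the example.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in"}


def extract_keywords(coded_lines, line_range=None):
    """Extract unique keywords, optionally from a designated range of lines."""
    if line_range is not None:
        start, end = line_range          # end is exclusive
        coded_lines = coded_lines[start:end]
    keywords = []
    for line in coded_lines:
        for word in re.findall(r"[A-Za-z0-9]+", line.lower()):
            if word not in STOP_WORDS and word not in keywords:
                keywords.append(word)
    return keywords
```

Restricting the range to, say, a title line keeps the keyword list short and relevant, which is the effect the range designation step aims for.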

[0020] A document filing apparatus according to a further invention comprises: a document image data input means for inputting a series of image data, read in a single batch from an image reading device for multiple documents of multiple types, as document image data corresponding to each of the read documents; a pattern recognition means for pattern-recognizing and coding the document image data input by the document image data input means to generate coded data; a display means for displaying the document image data and coded data corresponding to each of the documents together on a display unit; a correction means for determining, based on the display result on the display unit, whether the coded data needs post-correction, post-correcting the coded data according to that determination, and supplying it to the display means; and a database registration processing means for registering the document image data and coded data displayed on the display unit in a database in a single batch.

[0021] In a document filing apparatus according to a further invention, the document image data input means comprises: a document partitioning means for comparing the series of image data with predefined fixed-form document information and partitioning it into the multiple documents that were read; and a document discrimination means for comparing the documents partitioned by the document partitioning means with the fixed-form document information to determine the type of each document, and sending each out as document image data.

[0022] In a document filing apparatus according to a further invention, the document image data input means further comprises: a document item definition means in which, in addition to the fixed-form document information, a database registration item region is predefined for each document type, indicating where the database registration item values used when registering that document in the database are written; and a database registration item extraction means for extracting, from the document image data corresponding to a document, the image data of the database registration item regions for that document's type. The image data of the database registration item regions extracted by the database registration item extraction means is coded by the pattern recognition means, and the document image data is registered in the database by the database registration processing means with the coded database registration item values attached.

[0023] In a document filing apparatus according to a further invention, the display means displays the document image data and the coded database registration item values of multiple documents of multiple types together in a table in which documents and database registration items are arranged, and the correction means corrects, in that table of documents and database registration items, the coded database registration item values corresponding to each document.

【００２４】[0024]

[Operation] A series of image data, read in a single batch from an image reading device for multiple documents of multiple types, is input as document image data corresponding to each of the read documents. The document image data is pattern-recognized to generate coded data, the document image data and coded data corresponding to each document are displayed together on the display unit, the coded data is post-corrected as needed, and the document image data and coded data displayed on the display unit are registered in the database in a single batch. As a result, multiple documents of multiple types can be input in a single batch with a simple configuration and operation, and the user's workload in correction work is reduced and usability improved.

[0025] The series of image data is compared with predefined fixed-form document information and partitioned into the multiple documents that were read, and the partitioned documents are compared with the fixed-form document information to determine the type of each document. As a result, multiple documents of multiple types can be reliably input in a single batch with a simple configuration and operation.

[0026] From the document image data corresponding to a document, the image data of the database registration item regions for that document's type is extracted and coded in the pattern recognition step, and the document image data is registered in the database with the coded database registration item values attached. As a result, multiple documents of multiple types can be reliably input in a single batch with a simple configuration and operation.

[0027] The document image data and coded database registration item values of multiple documents of multiple types are displayed together in a table in which documents and database registration items are arranged, and the coded database registration item values corresponding to each document are corrected in that table. As a result, the user's workload in correction work is reduced and usability improved.

[0028] The display and behavior of the first and second display steps switch according to the type of document. As a result, the user can easily recognize the type of each document, and the user's workload in correction work is reduced and usability improved.

[0029] For multiple documents of different types, a first display mode in which documents of the same type are displayed grouped together and a second display mode in which documents of different types are displayed mixed together at the same time are switched at the user's request. If the user selects the first display mode, the contents of multiple documents of the same type can be compared; if the user selects the second display mode, the contents of different documents can be compared. Because either can be selected as needed, the user's workload in correction work is reduced and usability improved.

[0030] The second display mode switches, at the user's request, between a fixed display mode in which the database registration item values, which differ by document type, are displayed at fixed positions, and a left-justified display mode in which they are displayed packed to the left. If the user selects the fixed display mode, the presence or absence of each database registration item can be easily recognized for each document type; if the user selects the left-justified display mode, more database registration items of the same document can be displayed and recognized. Because either can be selected as needed, the user's workload in correction work is reduced and usability improved.

[0031] In the correction step, the image data display area always shows an enlarged view of the document image data corresponding to the database registration item value being corrected in the document, and the pattern data display area always shows an enlarged view of the pattern data of the coded database registration item value. As a result, the enlarged pattern data being corrected can be checked against its document image data, and the user's workload in correction work is reduced and usability improved.

[0032] In the correction step, the fixed display area always shows, at the designated display magnification, the document image data of the entire page containing the database registration item value being corrected and the pattern data of the coded database registration item value. As a result, the pattern data being corrected, displayed at the designated magnification, can be checked against its document image data, and the user's workload in correction work is reduced and usability improved.

[0033] When the document image data of the entire page is displayed in the fixed display area, the region of the document image data corresponding to the database registration item value being corrected is enclosed in a frame or shown in color, and that region is moved by the minimum amount necessary to bring it within the display area. As a result, the document image data corresponding to the database registration item value being corrected can be reliably identified, and the user's workload in correction work is reduced and usability improved.

[0034] In the correction step, when the user corrects a document's type to one different from the type determined in the document discrimination step, only the document whose type was corrected is reprocessed according to the processing procedure corresponding to the document type defined in the document item definition step. As a result, a misclassified document can be easily corrected, and multiple documents of multiple types can be reliably input in a single batch.

[0035] The keyword extraction step of the database registration processing step cuts out words from the coded data obtained by coding the entire document in the character recognition step and extracts them as keywords. When keywords are designated as a database registration item, the extracted keywords are attached and the document is registered in the database. As a result, keywords can be easily extracted and registered in the database, improving usability for the user.

[0036] The region from which keywords are extracted is limited to within the database registration item values, keywords are extracted from coded data obtained by coding only the designated region in the character recognition step, and the extracted keywords are attached and the document is registered in the database. As a result, database registration item values can be easily extracted as keywords and registered in the database, improving usability for the user.

[0037] In the keyword extraction range designation step, the range from which keywords are extracted is designated according to the document content. Keywords are extracted from the designated extraction range within the coded data obtained by coding the entire document in the character recognition step, and the extracted keywords are attached and the document is registered in the database. As a result, useful keywords can be extracted and registered in the database, improving usability for the user.

[0038] The document image data input means inputs a series of image data, read in a single batch from an image reading device for multiple documents of multiple types, as the corresponding document image data. The pattern recognition means pattern-recognizes it to generate coded data. The display means displays the document image data and coded data corresponding to each document together on the display unit. The correction means post-corrects the coded data as needed, based on the display result on the display unit, and supplies it to the display means. The database registration processing means registers the document image data and coded data displayed on the display unit in the database in a single batch. As a result, multiple documents of multiple types can be input in a single batch with a simple configuration and operation, and the user's workload in correction work is reduced and usability improved.

[0039] In the document image data input means, the document partitioning means compares the series of image data with predefined fixed-form document information and partitions it into the multiple documents that were read, and the document discrimination means compares the partitioned documents with the fixed-form document information to determine the type of each document and sends each out as document image data. As a result, multiple documents of multiple types can be reliably input in a single batch with a simple configuration and operation.

【００４０】また、文書画像データ入力手段では、さら
に、文書項目定義手段で定型文書情報に加えて文書の種
類毎にデータベース登録項目値の記載位置を示すデータ
ベース登録項目領域が予め定義され、データベース登録
項目抽出手段で文書に対応する文書画像データからデー
タベース登録項目域の画像データを抽出し、パターン認
識手段でコード化したデータベース登録項目値を付与し
て、文書画像データをデータベースへ登録する。これに
より、簡易な構成及び操作で複数種類の複数の文書を確
実に一括して入力し得る。Further, in the document image data inputting means, the document item definition means holds, in addition to the standard document information, a predefined database registration item area for each document type, indicating where each database registration item value is written. The database registration item extracting means extracts the image data of the database registration item area from the document image data corresponding to each document, and the document image data is registered in the database with the database registration item values coded by the pattern recognizing means attached. As a result, a plurality of documents of a plurality of types can be reliably input in a batch with a simple configuration and operation.

【００４１】また、表示手段では、複数種類の複数の文
書の文書画像データ及びコード化したデータベース登録
項目値を、文書及びデータベース登録項目を配列した表
形式で一括して表示し、修正手段では、文書及びデータ
ベース登録項目の表で文書に対応してコード化されたデ
ータベース登録項目値を修正する。これにより、修正作
業における利用者の作業負担を軽減して使い勝手を向上
し得る。Further, the display means displays the document image data of the plurality of documents of a plurality of types and the coded database registration item values together in a table in which documents and database registration items are arranged, and the correction means corrects the coded database registration item value of a given document in that table. As a result, the user's workload during correction is reduced and usability improved.

【００４２】[0042]

【実施例】以下図面を参照して、この発明の一実施例を
詳述する。DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS An embodiment of the present invention will be described in detail below with reference to the drawings.

【００４３】実施例１．図１は全体として、パーソナル
コンピュータやワークステーション上に組み込まれた本
発明による文書ファイリングシステムの構成を示し、１
はＣＰＵ（中央処理ユニット）でなりシステム全体の動
作を制御する制御部である。この制御部１は、文書読み
取り手段２、文書区分け手段３、文書判別手段４、デー
タベース登録項目抽出手段５、文字認識手段６、一括表
示手段７、修正手段８、キーワード抽出手段９、データ
ベース登録手段１０及び記憶部１１に対して、制御を指
示すると共にデータを送受信する。Embodiment 1. FIG. 1 shows the overall configuration of a document filing system according to the present invention installed on a personal computer or workstation. Reference numeral 1 denotes a control unit, consisting of a CPU (central processing unit), that controls the operation of the entire system. The control unit 1 issues control instructions to, and exchanges data with, the document reading means 2, the document dividing means 3, the document discriminating means 4, the database registration item extracting means 5, the character recognizing means 6, the collective display means 7, the correction means 8, the keyword extracting means 9, the database registration means 10, and the storage unit 11.

【００４４】文書読み取り手段２は、シートフィーダを
有する光学式画像読み取り装置からなる文書読み取り部
１２に対応動作するソフトウェアドライバである。これ
により複数種類の複数の文書を文書読み取り部１２に載
置し、一括してスキャンして得られる画像データが文書
ファイリングシステムに入力される。文書区分け手段３
は、文書読み取り手段２を通じて入力される複数種類の
複数の文書に応じた一連の画像データと、文書項目定義
手段１３に予め定義された文書画像パターンとをパター
ンマッチングの手法等を用いて比較し、文書の先頭のペ
ージを認識してそれぞれ１文書に対応した文書画像デー
タとして区分けする。この結果得られる文書画像データ
は文書判別手段４において、再度文書項目定義手段１３
で定義された文書画像パターンと比較され、文書の種類
を判別される。The document reading means 2 is a software driver that operates the document reading unit 12, an optical image reading device with a sheet feeder. A plurality of documents of a plurality of types are placed on the document reading unit 12 and scanned in a batch, and the resulting image data is input to the document filing system. The document dividing means 3 compares the series of image data for the plurality of documents, input through the document reading means 2, with the document image patterns predefined in the document item definition means 13, using pattern matching or a similar technique, recognizes the first page of each document, and divides the data into document image data each corresponding to one document. The document discriminating means 4 then compares the resulting document image data once more with the document image patterns defined in the document item definition means 13 and determines the type of each document.

【００４５】実際上この実施例では、文書の先頭に図２
に示す先頭識別頁ＰＨを付けて、複数種類の複数の文書
を一括して読み取り入力する。この先頭識別頁ＰＨ上に
は、予め塗りつぶされた矩形領域の先頭頁マークＨＭ
と、文書の種類をそれぞれ利用者が塗りつぶして指定す
る複数の矩形領域でなる文書種別マークＳＭが配されて
いる。従って文書項目定義手段１３では、先頭頁マーク
ＨＭに応じた文書画像パターンと、文書の種類の応じた
文書種別マークＳＭを有する文書画像パターンとが予め
定義されている。これにより、文書区分け手段３及び文
書判別手段４では、容易なパターンマッチングの手法
で、入力された画像データについて文書毎に区分けし得
ると共に、文書の種類を判別できる。In practice, in this embodiment a head identification page PH, shown in FIG. 2, is attached to the front of each document, and a plurality of documents of a plurality of types are read and input in a batch. The head identification page PH carries a head page mark HM, which is a rectangular area filled in beforehand, and document type marks SM, a set of rectangular areas that the user fills in to designate the document type. Accordingly, a document image pattern corresponding to the head page mark HM and document image patterns having the document type mark SM for each document type are predefined in the document item definition means 13. This allows the document dividing means 3 and the document discriminating means 4 to divide the input image data into documents and to determine the document types with simple pattern matching.

【００４６】なお上述した先頭識別頁ＰＨ（図２）を用
いる場合には、パターンマッチングの手法に代えて、塗
りつぶされた矩形領域の有無を検出して、文書を区分け
すると共に文書の種類を判別しても良く、この場合文書
区分け手段３や文書判別手段４の処理を一段と軽減でき
る。また先頭識別頁ＰＨを用いずに、入力される予定の
複数種類の複数の文書について、先頭の文書画像パター
ンを予め定義して記憶部１１に記憶し、この文書画像パ
ターンを用いてパターンマッチングを行い、文書を区分
けすると共に文書の種類を判別しても良く、この場合利
用者は文書のみを文書読み取り装置１２に載置して読み
取れば良く、その分文書の入力に際して使い勝手を向上
し得る。When the head identification page PH (FIG. 2) is used, the documents may be divided and their types determined by detecting the presence or absence of filled rectangular areas instead of by pattern matching; this further reduces the processing load on the document dividing means 3 and the document discriminating means 4. Alternatively, without using the head identification page PH, the document image pattern of the first page of each document type expected to be input may be defined in advance and stored in the storage unit 11, and pattern matching against these patterns may be used to divide the documents and determine their types. In this case the user only has to place the documents themselves on the document reading unit 12 for reading, which makes document input correspondingly easier to use.
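The filled-rectangle variant described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the page representation (a binarized pixel grid), the box coordinates, the type names "A" and "B", and the 0.8 fill threshold are all assumptions introduced for the example.

```python
from typing import Dict, List, Optional, Tuple

Page = List[List[int]]            # binarized page: 1 = black pixel, 0 = white
Box = Tuple[int, int, int, int]   # (left, top, right, bottom), end-exclusive

# Hypothetical layout of the head identification page PH.
HEAD_MARK_BOX: Box = (0, 0, 2, 2)               # head page mark HM
TYPE_MARK_BOXES: Dict[str, Box] = {             # document type marks SM
    "A": (4, 0, 6, 2),
    "B": (8, 0, 10, 2),
}
FILL_THRESHOLD = 0.8  # assumed fraction of black pixels that counts as filled


def fill_ratio(page: Page, box: Box) -> float:
    """Fraction of black pixels inside the box."""
    left, top, right, bottom = box
    cells = [page[r][c] for r in range(top, bottom) for c in range(left, right)]
    return sum(cells) / len(cells)


def is_filled(page: Page, box: Box) -> bool:
    return fill_ratio(page, box) >= FILL_THRESHOLD


def document_type(head_page: Page) -> Optional[str]:
    """Return the first type whose mark SM is filled, or None if none is."""
    for name, box in TYPE_MARK_BOXES.items():
        if is_filled(head_page, box):
            return name
    return None


def split_documents(pages: List[Page]) -> List[Tuple[Optional[str], List[Page]]]:
    """Start a new document at every page whose mark HM is filled.

    Pages that appear before the first head page are ignored in this sketch.
    """
    documents: List[Tuple[Optional[str], List[Page]]] = []
    for page in pages:
        if is_filled(page, HEAD_MARK_BOX):
            documents.append((document_type(page), []))
        elif documents:
            documents[-1][1].append(page)
    return documents
```

Because only a few small boxes are inspected per page, this check is cheaper than matching a whole-page pattern, which is the saving the paragraph refers to.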

【００４７】文書項目定義手段１３には、上述した文書
の区分け及び文書の種類の判別に用いる複数種類の文書
画像パターンに加えて、それぞれの文書の種類に対応す
るデータベース登録項目の項目名とその項目名に対応す
る文書中の記載位置が、データベース登録項目領域とし
て定義されている。データベース登録項目抽出手段６
は、文書判別手段４で判別された文書画像データについ
て、文書種類に応じたデータベース登録項目領域の内容
の画像データを切り出す。この実施例の場合、データベ
ース登録項目領域は、文書中の記載位置として領域の座
標情報が数値で定義され、この座標情報の範囲内の文書
画像データが、データベース登録項目領域の画像データ
として切り出される。なおこのデータベース登録領域も
文書画像パターンとして定義しても良く、この場合は上
述と同様にパターンマッチングの手法を用いてデータベ
ース登録項目領域の画像データを切り出すことができ
る。In addition to the document image patterns used for dividing the documents and determining their types, the document item definition means 13 defines, for each document type, the names of the database registration items and the position in the document where each item is written, as database registration item areas. The database registration item extracting means 5 cuts out, from the document image data classified by the document discriminating means 4, the image data of the database registration item areas for that document type. In this embodiment a database registration item area is defined numerically by the coordinates of the area in the document, and the document image data within those coordinates is cut out as the image data of the area. A database registration item area may instead be defined as a document image pattern, in which case its image data can be cut out by pattern matching in the same way as described above.
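The coordinate-based extraction can be sketched as below. The item names, the coordinates, and the list-of-rows image representation are hypothetical and serve only to show how a per-type table of areas drives the cropping.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]           # image as rows of pixel values
Box = Tuple[int, int, int, int]   # (left, top, right, bottom), end-exclusive

# Hypothetical definitions: document type -> item name -> area coordinates.
FIELD_REGIONS: Dict[str, Dict[str, Box]] = {
    "A": {"title": (0, 0, 4, 1), "date": (2, 1, 4, 2)},
    "B": {"author": (1, 0, 3, 2)},
}


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by the box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def extract_field_images(doc_type: str, image: Image) -> Dict[str, Image]:
    """Cut out one sub-image per database registration item of this type."""
    return {name: crop(image, box) for name, box in FIELD_REGIONS[doc_type].items()}
```

Each cropped image would then be passed to character recognition; that step is outside this sketch.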

【００４８】文字認識手段６は、データベース登録項目
抽出手段５で抽出されたデータベース登録項目領域の画
像データについて、文字認識の手法で文字パターンを認
識して文字コード化し、文書画像データと文字列データ
を一括表示手段７に送出する。文字認識するパターン
は、文字や英数字、記号、絵柄等である。一括表示手段
７は、文書及びデータベース登録項目を縦及び横に配列
し、複数種類の複数の文書のデータベース登録項目の画
像データとそれを文字コード化した文字列データとをそ
れぞれ２行に配して、一括して表示部１４に表示する。
これと同時に修正手段８で編集の対象としているデータ
ベース登録項目が存在する文書の該当ページの文書画像
データを固定表示領域でなる画像データビューワに表示
し、その編集対象の該当領域上に枠を表示する等種々の
表示情報を生成し、表示部に送出する。表示部１４は一
括表示手段８で生成された表示情報を利用者に提示する
もので、ＣＲＴ（陰極線管）ディスプレイ等より構成さ
れる。The character recognizing means 6 applies character recognition to the image data of the database registration item areas extracted by the database registration item extracting means 5, recognizes the character patterns and converts them to character codes, and sends the document image data and the character string data to the collective display means 7. The patterns recognized include characters, alphanumerics, symbols, and pictorial patterns. The collective display means 7 arranges documents and database registration items vertically and horizontally, places the image data of each database registration item and the character string data obtained by coding it in two rows, and displays the items of all the documents together on the display unit 14. At the same time, it generates various display information, such as showing the document image data of the page containing the database registration item currently being edited through the correction means 8 in an image data viewer, which is a fixed display area, and drawing a frame over the area being edited, and sends this information to the display unit 14. The display unit 14 presents the display information generated by the collective display means 7 to the user and consists of a CRT (cathode ray tube) display or the like.

【００４９】修正手段８は、表示部１４に表示された各
種情報に対して、利用者が操作部１５によって指示する
種々の修正操作に応答し、内部データに対する所望の修
正を施すと同時に、それを一括表示手段７に伝え、表示
部１４で映出する表示内容に反映させる。また利用者
は、認識した文字の修正操作に加えて、文書の区分けや
種類の誤りの訂正を操作し得るようになされ、この場合
修正された文書の種類に応じて、該当する文書画像デー
タについて、文書項目定義手段１３より所望のデータベ
ース登録項目抽出領域を得て、上述したデータベース登
録項目抽出手段５及び文字認識手段６を通じて再処理が
行われ、再度一括表示手段７で追加されて表示される。
なお操作部１４は、表示部１４に映出される情報に対
し、利用者が所望する修正を修正手段１０に入力するた
めのキーボードや、マウス等のポインティングデバイス
から構成されている。The correction means 8 responds to the various correction operations that the user specifies through the operation unit 15 on the information shown on the display unit 14, applies the desired correction to the internal data, and at the same time passes it to the collective display means 7 so that it is reflected in what the display unit 14 shows. Besides correcting recognized characters, the user can also correct errors in how the documents were divided or in their assigned types. In that case, the database registration item areas for the corrected document type are obtained from the document item definition means 13 for the affected document image data, the data is reprocessed by the database registration item extracting means 5 and the character recognizing means 6, and the result is added to the display by the collective display means 7. The operation unit 15 consists of a keyboard and a pointing device such as a mouse, with which the user inputs the desired corrections to the information shown on the display unit 14 into the correction means 8.

【００５０】キーワード抽出手段９は、文字認識手段６
でコード化され修正手段８で修正されたデータベース登
録項目の文字列データから、文書の検索に有用なキーワ
ードを抽出する。また文書の種類に応じて、文書全体の
領域や指定された文書カテゴリーとして例えば要約等の
領域の文書画像データを、文字認識手段６でコード化し
た文字列データから、全ての単語を切り出してキーワー
ドとして抽出しても良い。データベース登録手段１０
は、データベース登録項目抽出手段５で抽出され、文字
認識手段６でコード化され、修正手段８で修正されたデ
ータベース登録項目と、キーワード抽出手段９で抽出し
たキーワードとを付加情報とし、文書判別手段４で判別
した文書画像データを、記憶部１１のデータベースに登
録する。記憶部１４は、文書項目定義手段１３の文書画
像パターンや、データベース登録手段１３によって登録
される文書等を記憶するもので、半導体記憶装置やディ
スク装置等からなる。The keyword extracting means 9 extracts keywords useful for document retrieval from the character string data of the database registration items coded by the character recognizing means 6 and corrected by the correction means 8. Depending on the document type, the document image data of the whole document, or of an area designated as a document category such as the abstract, may also be coded by the character recognizing means 6, and every word cut out of the resulting character string data may be extracted as a keyword. The database registration means 10 registers the document image data classified by the document discriminating means 4 in the database in the storage unit 11, attaching as additional information the database registration items extracted by the database registration item extracting means 5, coded by the character recognizing means 6, and corrected by the correction means 8, together with the keywords extracted by the keyword extracting means 9. The storage unit 11 stores the document image patterns of the document item definition means 13, the documents registered by the database registration means 10, and so on, and consists of a semiconductor memory device, a disk device, or the like.
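A rough sketch of keyword extraction and registration follows. It assumes an in-memory list as the database and a plain word split as the keyword rule; the `Record` type and the `register` and `search` helpers are hypothetical names, and real Japanese text would need proper word segmentation rather than the regular expression used here.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Record:
    doc_type: str
    image: bytes              # document image data
    fields: Dict[str, str]    # corrected database registration item values
    keywords: Set[str]        # additional information used for retrieval


def extract_keywords(fields: Dict[str, str]) -> Set[str]:
    """Take every word of every item value as a keyword (lower-cased)."""
    words: Set[str] = set()
    for value in fields.values():
        words.update(word.lower() for word in re.findall(r"\w+", value))
    return words


def register(database: List[Record], doc_type: str, image: bytes,
             fields: Dict[str, str]) -> Record:
    """Store the image together with its item values and keywords."""
    record = Record(doc_type, image, dict(fields), extract_keywords(fields))
    database.append(record)
    return record


def search(database: List[Record], keyword: str) -> List[Record]:
    """Return the records whose keyword set contains the given word."""
    return [record for record in database if keyword.lower() in record.keywords]
```

Storing the keywords next to the image is what lets a later search find an image-only document by text.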

【００５１】このような構成で、この実施例の文書ファ
イリングシステムにおける文書ファイリング処理の流れ
を図３に示す。複数種類の複数の文書２０は、文書読み
取り部１２に載置されて文書読み取り手段２を通じて一
括して入力され、この複数の文書に対応する一連の画像
データが、文書区分け手段３で各々文書に区分けされ、
文書判別手段４によって各文書の種類が判別される。例
えば文書Ａと判別された文書画像データ２１Ａに対して
は、文書項目定義手段１３から得られる文書Ａに対する
定義２２Ａを適用し、そこに定義された手順に従って文
書画像データ２１Ａは処理される。FIG. 3 shows the flow of document filing processing in the document filing system of this embodiment configured as above. A plurality of documents 20 of a plurality of types are placed on the document reading unit 12 and input in a batch through the document reading means 2. The series of image data corresponding to these documents is divided into individual documents by the document dividing means 3, and the document discriminating means 4 determines the type of each. For example, for document image data 21A determined to be document A, the definition 22A for document A obtained from the document item definition means 13 is applied, and the document image data 21A is processed according to the procedure defined there.

【００５２】各々の文書画像データ２１からは、文書項
目定義手段１３の定義２２に従ってデータベース登録項
目領域の画像データ２３がデータベース登録項目抽出手
段５によって切り出される。この結果切り出された各々
のデータベース登録項目領域の画像データ２３は文字認
識手段６によって文字コード化され、データベース登録
項目値でなる文字列データ２４が作成される。それらを
一括表示／修正インタフェース２５によって表示すると
共に必要に応じて後修正し、記憶部１１内に構築された
データベース２６へ文書画像データと文字列データとを
登録する。From each document image data 21, the database registration item extracting means 5 cuts out the image data 23 of the database registration item areas according to the definition 22 of the document item definition means 13. The character recognizing means 6 converts each cut-out image data 23 to character codes, producing character string data 24 consisting of database registration item values. These are displayed through the collective display/correction interface 25 and post-corrected as needed, and the document image data and character string data are registered in the database 26 built in the storage unit 11.

【００５３】ここで一括表示／修正インタフェース２５
を詳述する。一括表示手段７によって全ての文書のデー
タベース登録項目を一括して表示するための情報が生成
され表示部１４に表示されるので、利用者はそれを確認
しながら、必要に応じて操作部１５によって対話的に修
正操作できる。この場合、操作部１５には常に１つだけ
カーソルが表示され、カーソルが表示されている位置の
データベース登録項目が、その時点での修正対象とな
る。The collective display/correction interface 25 is now described in detail. The collective display means 7 generates the information for displaying the database registration items of all the documents together, and this is shown on the display unit 14, so the user can check it and make corrections interactively through the operation unit 15 as needed. Exactly one cursor is displayed on the display unit 14 at all times, and the database registration item at the cursor position is the correction target at that moment.

【００５４】修正対象のデータベース登録項目は、その
画像データとコード化された文字列データが拡大表示さ
れ、これにより利用者は拡大された画像データを確認し
ながらコード化された文字列データを修正できる。また
修正対象のデータベース登録項目を含む文書の該当ペー
ジの画像データは、常に画像データビューワに表示さ
れ、そのデータベース登録項目に対応するデータベース
登録項目領域を枠で囲んで表示したり、色を反転して表
示することにより、利用者は修正対象のデータベース登
録項目を画像データ上で容易に認識できる。The image data and the coded character string data of the database registration item being corrected are displayed enlarged, so the user can correct the coded character string data while checking the enlarged image. The image data of the page containing that item is always shown in the image data viewer, and the corresponding database registration item area is displayed enclosed in a frame or in inverted colors, so the user can easily locate the item being corrected on the image.

【００５５】なお該当するページの画像データのサイズ
が大きく、画像データビューワに収まらない場合は、修
正対象のデータベース登録項目に対応する領域を表示す
るために、必要最小限の範囲でスクロールする。またコ
ード化したデータベース登録項目の文字列データを修正
する際には、選択した文字が２つ以上の文字認識候補を
持つ場合はその一覧を表示し、利用者が選択した文字に
置き換えることができるが、文字認識候補の一覧中に正
しい文字がない場合は、仮名漢字変換等を利用してキー
ボード等から正しい文字を入力する。If the image data of the page is too large to fit in the image data viewer, the view is scrolled by the minimum amount needed to show the area of the database registration item being corrected. When correcting the coded character string data of a database registration item, if the selected character has two or more recognition candidates, a list of them is displayed and the character can be replaced with the one the user selects; if the correct character is not in the list, the user enters it from the keyboard or the like, using kana-kanji conversion for example.
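The minimum-scroll behavior can be illustrated per axis. This sketch assumes pixel coordinates with an end-exclusive region and a viewer described by its offset and size; the function names are hypothetical. When the region is larger than the viewer, it aligns the region's start with the viewer, which is one reasonable reading of showing as much of it as possible.

```python
from typing import Tuple


def scroll_axis(offset: int, view_length: int, start: int, end: int) -> int:
    """Return the new offset on one axis, moving as little as possible."""
    if end - start >= view_length:
        return start                 # region does not fit: align its start
    if start < offset:
        return start                 # region begins before the view
    if end > offset + view_length:
        return end - view_length     # region ends after the view
    return offset                    # already fully visible: do not move


def scroll_to_region(offset: Tuple[int, int], view_size: Tuple[int, int],
                     box: Tuple[int, int, int, int]) -> Tuple[int, int]:
    """Scroll a 2-D view so that box = (left, top, right, bottom) is shown."""
    (offset_x, offset_y), (width, height) = offset, view_size
    left, top, right, bottom = box
    return (scroll_axis(offset_x, width, left, right),
            scroll_axis(offset_y, height, top, bottom))
```

For a 100 by 50 viewer at the origin, a region spanning (120, 60) to (150, 80) moves the view to (50, 30), just far enough to bring the region's far edges inside.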

【００５６】ここで、表示部１４に表示される修正のた
めの一括表示／修正インタフェース２５の表示を図４に
示す。これは、一括表示手段７で生成された一括表示デ
ータを基に構成され、操作部１５から入力された修正指
示を修正手段８で解釈し、修正を行った後、その結果を
一括表示手段７に伝え、表示部１４に反映させ、以降こ
れを繰り返すことによって利用者は必要に応じて、各文
書の各データベース登録項目について修正作業を行う。
この実施例の場合、表示部１４の表示枠中には、文書及
びデータベース登録項目を表形式で表示するメイン表示
領域に加えて、上述した画像データビューワ３０及び文
字列データビューワ３１の表示領域が配されている。FIG. 4 shows the collective display/correction interface 25 as presented on the display unit 14 for correction. It is built from the collective display data generated by the collective display means 7. The correction means 8 interprets a correction instruction input from the operation unit 15 and applies it, the result is passed to the collective display means 7 and reflected on the display unit 14, and by repeating this cycle the user corrects each database registration item of each document as needed. In this embodiment the display frame of the display unit 14 contains, besides the main display area showing the documents and database registration items in table form, the display areas of the image data viewer 30 and the character string data viewer 31 mentioned above.

【００５７】メイン表示領域中において、文書種類ラベ
ル３２は文書の種類を示すラベルであり、文書判別手段
４で判別された結果を示し、同一行に表示されるデータ
は同一文書のデータベース登録項目であることを示す。
またデータベース登録項目ラベル３３は、文書を検索す
るためのデータベース登録項目を示すラベルであり、デ
ータベース登録項目抽出手段５で抽出された結果を示
し、同一桁に示されるデータは各文書の同一のデータベ
ース登録項目であることを示す。なお各文書に対応する
セルは、画像データセル３４及び文字列データセル３５
の２段で構成されている。In the main display area, the document type label 32 indicates the type of a document as determined by the document discriminating means 4; data displayed in the same row are database registration items of the same document. The database registration item label 33 indicates a database registration item used for retrieving documents, as extracted by the database registration item extracting means 5; data shown in the same column are the same database registration item of the respective documents. The cell for each document consists of two tiers, an image data cell 34 and a character string data cell 35.

【００５８】このうち画像データセル３４は、その文書
における文書画像データのデータベース登録項目の領域
の画像データを示す。このとき表示される画像データ
は、画像データセル３４内に表示が収まるように、必要
最小限だけ縮小される。また文字列データセル３５は、
その文書における文書画像データのデータベース登録項
目に対応する領域の画像データを文字認識手段６でコー
ド化した文字列データを示す。この文字列データセル３
５上で、文字列データが修正できる。またある時点で
は、文字列データセル３５の１つだけ必ずカーソル３６
を所有し、カーソル３６が存在する画像データセル３４
及び文字列データセル３５をカレントセルと呼び、その
時点での修正対象となる。Of these, the image data cell 34 shows the image data of the database registration item area in the document image data of that document. The image is reduced by the minimum amount needed to fit within the image data cell 34. The character string data cell 35 shows the character string data obtained by coding, with the character recognizing means 6, the image data of the area corresponding to that database registration item. The character string data can be corrected in the character string data cell 35. At any given time exactly one character string data cell 35 holds the cursor 36; the image data cell 34 and the character string data cell 35 where the cursor 36 is located are called the current cell and are the correction target at that time.

【００５９】画像データビューワ３０は、カレントセル
に対応する文書で、対応するページの画像データを表示
するための固定表示領域である。またこの画像データビ
ューワ３０の画像データ中、カレントセルに対応する領
域には、枠３７が表示されている。この枠３７がもし表
示領域外になる場合には、必要最小限だけ表示部分の画
像が移動されて、枠３７が最大限画像データビューワ内
に表示される。文字列データビューワ３１はカレントセ
ルの文字列データを表示するための固定表示領域であ
り、通常のエディタ機能を有する。文字列データビュー
ワ３１には常にカーソル３８が表示され、文字列データ
を修正することができる。The image data viewer 30 is a fixed display area that shows the image data of the relevant page of the document corresponding to the current cell. A frame 37 is drawn in the image over the area corresponding to the current cell. If the frame 37 would fall outside the display area, the displayed portion of the image is moved by the minimum amount needed so that as much of the frame 37 as possible appears within the image data viewer. The character string data viewer 31 is a fixed display area that shows the character string data of the current cell and provides ordinary editor functions. A cursor 38 is always displayed in the character string data viewer 31, and the character string data can be corrected there.

【００６０】なお文字列データの修正として、カレント
セル上又は文字列データビューワ３１上のカーソル３６
又は３８に対応する１文字を修正する場合、図５に示す
ように、認識候補文字の一覧を表す認識候補選択画面が
ポップアップして表示される。この図中には、修正対象
となる１文字分の画像データ４０が表示されると共に、
認識候補の文字４１が１文字ずつ漢字、平仮名、片仮
名、英数字等の種別を付して表示される。従って利用者
は、マウス等のポインティングデバイスやカーソルを移
動させることによって修正選択する文字を選択する。ま
た、修正しない場合は終了４２の表示を指示入力する。When correcting the single character of the character string data at the cursor 36 on the current cell or the cursor 38 on the character string data viewer 31, a recognition candidate selection screen listing the candidate characters pops up, as shown in FIG. 5. It shows the image data 40 of the one character to be corrected, together with the recognition candidate characters 41, each labeled with its type, such as kanji, hiragana, katakana, or alphanumeric. The user selects the replacement character with a pointing device such as a mouse or by moving the cursor. To leave the character unchanged, the user selects the End 42 item.

【００６１】この結果認識候補選択画面が閉じられ、文
字列データビューワ３１上及びカレントセル上のカーソ
ル３８及び３６の位置で、それぞれ対応する文字列デー
タ中の文字が修正される。なお認識候補選択画面中に修
正を希望する文字が表示されないとき、利用者は通常の
エディタのインタフェースでカーソル３６及び３８に対
する文字や、その文字を含む文字列を修正することがで
きる。実際上この文字列データビューワ３１上のカーソ
ル３８とカレントセル上のカーソル３６のどちらか一方
が移動した場合、他方のカーソルは文字列データ上で常
に同じ位置を保つように移動する。また文字列データビ
ューワ３１とカレントセルのどちらかで修正が行われた
場合、もう一方の文字列データは常に同じ修正が行われ
る。The recognition candidate selection screen then closes, and the character in the character string data is corrected at the positions of the cursors 38 and 36 on the character string data viewer 31 and on the current cell. If the desired character does not appear on the recognition candidate selection screen, the user can correct the character at the cursors 36 and 38, or a character string containing it, through an ordinary editor interface. In practice, when either the cursor 38 on the character string data viewer 31 or the cursor 36 on the current cell moves, the other moves so that both always stay at the same position in the character string data. Likewise, when a correction is made in either the character string data viewer 31 or the current cell, the same correction is always applied to the character string data in the other.
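One simple way to keep the two cursors and the two copies of the text in step is to let both displays read from a single shared model. The sketch below takes that approach; it is an assumption about structure, not something the specification prescribes, and the class and method names are hypothetical.

```python
from typing import List


class SharedText:
    """Single source of truth for one item's text and cursor position."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.cursor = 0

    def move_cursor(self, position: int) -> None:
        # Clamp so the cursor always rests on an existing character.
        last = max(len(self.text) - 1, 0)
        self.cursor = min(max(position, 0), last)

    def replace_at_cursor(self, character: str) -> None:
        i = self.cursor
        self.text = self.text[:i] + character + self.text[i + 1:]


class View:
    """Stands in for either the current cell or the character string viewer."""

    def __init__(self, model: SharedText) -> None:
        self.model = model

    @property
    def text(self) -> str:
        return self.model.text

    @property
    def cursor(self) -> int:
        return self.model.cursor

    def move_cursor(self, position: int) -> None:
        self.model.move_cursor(position)

    def choose_candidate(self, candidates: List[str], index: int) -> None:
        """Replace the character at the cursor with the chosen candidate."""
        self.model.replace_at_cursor(candidates[index])


model = SharedText("1995-02-l7")           # "l" misrecognized for "1"
cell, viewer = View(model), View(model)
cell.move_cursor(8)                         # move in the cell ...
viewer.choose_candidate(["l", "1", "I"], 1) # ... correct in the viewer
```

After these calls both views report the text "1995-02-17" with the cursor at position 8, since neither holds a private copy.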

【００６２】このように構成すれば、文書の種類毎に抽
出すべきデータベース登録項目領域を定義し、この定義
に基づいて一括して読み取った複数種類の複数の文書の
画像データから、一括して必要なデータベース登録項目
を抽出すると共に、文字認識手段によってコード化して
文字列データとすることにより、文書画像の検索に用い
るためのデータベース登録項目を効率良く入力すること
ができる。さらに、複数種類の複数の文書に対するデー
タベース登録項目の画像データとコード化した文字列デ
ータとを一括表示して修正することにより、修正操作の
際の利用者の作業負担を軽減できる。With this configuration, the database registration item areas to be extracted are defined for each document type, the necessary database registration items are extracted in a batch, according to these definitions, from the image data of a plurality of documents of a plurality of types read in a batch, and they are coded into character string data by the character recognizing means, so the database registration items used for retrieving document images can be input efficiently. Moreover, because the image data of the database registration items and the coded character string data for the plurality of documents are displayed together for correction, the user's workload during correction is reduced.

【００６３】実施例２．上述の実施例１の一括表示修正
インターフェースでは、文書及びデータベース登録項目
の表示に加えて、画像データビューワ３０及び文字列デ
ータビューワ３１を表示したが、この実施例ではこれら
を表示せずに、図６に示すように、文字及びデータベー
ス登録項目のみを表形式で表示する。この場合カーソル
３６が存在するカレントセルの表示を常時拡大して表示
し、カーソル３６が他のセルに移動したとき、移動元の
セルの表示を通常のサイズに戻し、移動先のセルの表示
を拡大表示する。Embodiment 2. The collective display/correction interface of Embodiment 1 displays the image data viewer 30 and the character string data viewer 31 in addition to the documents and database registration items. This embodiment does not display them and instead shows only the documents and database registration items in table form, as shown in FIG. 6. The current cell, where the cursor 36 is located, is always displayed enlarged; when the cursor 36 moves to another cell, the cell it left returns to normal size and the cell it enters is enlarged.

【００６４】このようにすれば、実施例１と同様の効果
を実現できることに加えて、実施例１に比較して同一表
示面積ならば、より多くの文書に応じたデータベース登
録項目を表示できることにより、修正操作の際の利用者
の作業負担を一段と軽減できる。また実施例１と同じ数
の文書に応じたデータベース登録項目を表示するときに
は、画像データビューワ３０や文字列データビューワ３
１を表示しない分、表示面積を小さくすることができ、
表示部１４の表示画面を有効に利用することができる。This achieves the same effects as Embodiment 1 and, for the same display area, shows the database registration items of more documents than Embodiment 1, which further reduces the user's workload during correction. When showing database registration items for the same number of documents as Embodiment 1, the display area can be made smaller because the image data viewer 30 and the character string data viewer 31 are not displayed, so the screen of the display unit 14 is used effectively.

【００６５】実施例３．上述の実施例１や実施例２の一
括表示／修正インタフェースでは、一括して読み取った
順序で複数種類の複数の文書を一括表示するようにした
が、この実施例では文書の種類が同じデータのみを揃
え、それらを１単位として修正作業を行い、１つの文書
の種類の修正が終了したら別の種類の文書の修正を行
い、それを繰り返すことで一括して読み取った複数種類
の複数の文書全体の修正を行う。また文書の種類をもと
にして各文書を表示する位置を並び変えても良い。この
ようにすれば、上述の実施例１、実施例２と同様の効果
を実現できることに加えて、実施例１及び実施例２と比
較した場合、表示された同一の種類の文書を比較しなが
ら修正操作することができ分、利用者の修正操作の作業
負担を一段と軽減できる。Embodiment 3. The collective display/correction interfaces of Embodiments 1 and 2 display the plurality of documents of a plurality of types together in the order in which they were read. In this embodiment, only data of the same document type are gathered and corrected as one unit; when correction of one document type is finished, documents of another type are corrected, and this is repeated until all the documents read in the batch have been corrected. The display positions of the documents may also be rearranged according to document type. This achieves the same effects as Embodiments 1 and 2 and, in comparison with them, lets the user make corrections while comparing displayed documents of the same type, which further reduces the user's workload.
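Gathering documents by type while preserving read order within each type can be sketched as follows. The dictionary-based document records and their "id" and "type" keys are assumptions made for the example.

```python
from typing import Dict, List

Document = Dict[str, object]  # hypothetical record with "id" and "type" keys


def group_by_type(documents: List[Document]) -> Dict[object, List[Document]]:
    """Group documents by type; types and members keep their read order."""
    groups: Dict[object, List[Document]] = {}
    for document in documents:
        groups.setdefault(document["type"], []).append(document)
    return groups


def display_order(documents: List[Document]) -> List[Document]:
    """Rearranged order: all documents of one type, then the next type."""
    return [doc for group in group_by_type(documents).values() for doc in group]
```

Each group can be presented as one correction unit, or the flattened order can drive a single rearranged table.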

【００６６】実施例４．上述の実施例１〜実施例３の一
括表示／修正インタフェースでは、各文書のあるデータ
ベース登録項目の値が無く、その位置のセルが存在しな
い項目については空白を表示したが、これに代え、この
実施例では、図７に示すように、各文書のデータベース
登録項目の値が無く、その位置のセルが存在しない分だ
けセルの表示を左に詰めて表示する。この場合、データ
ベース登録項目ラベル３３に表示される項目名は、カレ
ントセルに対応する文書に対応するデータベース登録項
目名となり、カレントセルが移動する度に移動先の文書
に対応して、データベース登録項目ラベル３３に表示さ
れる項目名が変更される。このようにすれば、上述の実
施例１〜実施例３と同様の効果を実現できることに加え
て、実施例１〜実施例３と比較して、修正対象となる文
書に対応するデータベース登録項目の表示量が増加する
ことにより、その分利用者の修正操作の際の作業負担を
一段と軽減できる。Embodiment 4. In the collective display/correction interfaces of Embodiments 1 to 3, an item for which a document has no database registration item value, and hence no cell at that position, is shown as a blank. In this embodiment, as shown in FIG. 7, the cells are instead packed to the left by the number of such missing cells. In this case the item names shown in the database registration item label 33 are those of the document corresponding to the current cell, and each time the current cell moves, the item names shown in the label 33 change to match the destination document. This achieves the same effects as Embodiments 1 to 3 and, compared with them, shows more of the database registration items of the document being corrected, which further reduces the user's workload during correction.
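The left-packed row and the header that follows the current document can be sketched like this. A document is modeled as a mapping from item name to value, with `None` standing for a missing value; that representation and the function names are assumptions.

```python
from typing import Dict, List, Optional, Tuple

Document = Dict[str, Optional[str]]  # item name -> value, None when absent


def packed_cells(document: Document) -> List[Tuple[str, str]]:
    """Keep only the items that have a value, packed to the left."""
    return [(name, value) for name, value in document.items() if value is not None]


def header_labels(current_document: Document) -> List[str]:
    """Item names to show in the label row for the current document."""
    return [name for name, _ in packed_cells(current_document)]
```

Calling `header_labels` again whenever the current cell moves to another document reproduces the label change the paragraph describes.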

【００６７】実施例５．実施例１の一括表示修正インタ
フェースにおける画像データビューワ３０では、表示さ
れる画像データがカレントセルの移動に応じて、移動し
たデータベース登録項目領域を含むようにする場合につ
いて述べた。この実施例ではマウス等のポインティング
デバイスを用いて画像データの表示領域を移動させて、
画像データ中であるデータベース登録項目領域内の任意
の１点を指定することによって、文書及びデータベース
登録項目表のカーソル３６を移動させる。さらに画像デ
ータビューワ３０に文書画像データの他の用紙の表示に
切り替えるための操作表示を設けても良い。このように
すれば、実施例１と同様の効果を実現できることに加え
て、利用者は画像データビューワ３０の表示を見ながら
視覚的に修正操作することができ、利用者の修正操作の
際の作業負担をさらに一段と軽減できる。Embodiment 5. In the image data viewer 30 of the collective display/correction interface of Embodiment 1, the displayed image data follows the movement of the current cell so that it includes the database registration item area moved to. In this embodiment, the user moves the displayed region of the image data with a pointing device such as a mouse and designates any point inside a database registration item area in the image data, which moves the cursor 36 in the table of documents and database registration items. The image data viewer 30 may also be provided with a control for switching the display to another sheet of the document image data. This achieves the same effects as Embodiment 1 and also lets the user make corrections visually while looking at the image data viewer 30, which reduces the user's workload during correction still further.
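Mapping a designated point to a database registration item is a hit test against the defined areas. The following sketch assumes end-exclusive (left, top, right, bottom) boxes in page coordinates and represents the table cursor as a (document index, item name) pair; both choices and the function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Box = Tuple[int, int, int, int]     # (left, top, right, bottom), end-exclusive
Cursor = Tuple[int, Optional[str]]  # (document index, item name)


def field_at(point: Tuple[int, int], regions: Dict[str, Box]) -> Optional[str]:
    """Return the name of the item area containing the point, if any."""
    x, y = point
    for name, (left, top, right, bottom) in regions.items():
        if left <= x < right and top <= y < bottom:
            return name
    return None


def cursor_after_click(cursor: Cursor, document_index: int,
                       point: Tuple[int, int], regions: Dict[str, Box]) -> Cursor:
    """Move the table cursor to the clicked item; keep it if nothing was hit."""
    name = field_at(point, regions)
    return (document_index, name) if name is not None else cursor
```

A point that lies outside every area leaves the cursor where it was, so stray clicks do not change the correction target.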

【００６８】[0068]

【発明の効果】上述したように本発明によれば、複数種
類の複数の文書を一括して画像読み取り装置から読み取
った一連の画像データを、読み取った複数の文書に各々
対応する文書画像データとして入力し、その文書画像デ
ータをパターン認識してコード化データを生成し、複数
の文書に各々対応する文書画像データ及びコード化デー
タを一括して表示部に表示し、必要に応じてコード化デ
ータを後修正し、表示部に表示された文書画像データ及
びコード化データを一括してデータベースに登録するこ
とにより、簡易な構成及び操作で複数種類の複数の文書
を一括して入力し得ると共に、修正作業における利用者
の作業負担を軽減し使い勝手を向上し得る文書ファイリ
ング方法を実現できる。As described above, according to the present invention, a series of image data obtained by collectively reading a plurality of documents of a plurality of types from an image reading apparatus is input as document image data corresponding to each of the read documents; the document image data is pattern-recognized to generate coded data; the document image data and the coded data corresponding to each of the documents are displayed collectively on the display unit; the coded data is post-corrected as necessary; and the document image data and the coded data displayed on the display unit are registered collectively in the database. This realizes a document filing method that allows a plurality of documents of a plurality of types to be input collectively with a simple configuration and operation, reduces the user's work load in correction work, and improves usability.

【００６９】また次の発明によれば、一連の画像データ
を予め定義された定型文書情報と比較して、読み取った
複数の文書に区分けし、区分けされた複数の文書を、定
型文書情報と比較して文書の種類を判別することによ
り、簡易な構成及び操作で、複数種類に複数の文書を確
実に一括して入力し得る文書ファイリング方法を実現で
きる。According to the next invention, a series of image data is compared with predefined standard document information and divided into the plurality of read documents, and each divided document is compared with the standard document information to determine its type. This realizes a document filing method that can reliably input a plurality of documents of a plurality of types collectively with a simple configuration and operation.
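The dividing and type-determination steps can be sketched as follows. This is only an illustration under strong assumptions: each scanned page is assumed to have already been reduced to a hypothetical `signature` string, the standard document information is assumed to map the signature of a document's first page to a document type, and an exact dictionary lookup stands in for the image pattern comparison that the specification describes.

```python
def partition_and_classify(pages, head_patterns):
    """Split a scanned page stream into documents and label each with a type.

    pages:         list of dicts, each with a hypothetical "signature" key
    head_patterns: dict mapping a first-page signature to a document type
    """
    documents = []
    for page in pages:
        doc_type = head_patterns.get(page["signature"])
        if doc_type is not None:
            # This page matches a defined first-page pattern: start a new document.
            documents.append({"type": doc_type, "pages": [page]})
        elif documents:
            # No match: treat the page as a continuation of the current document.
            documents[-1]["pages"].append(page)
        else:
            raise ValueError("first page matches no defined document type")
    return documents


head_patterns = {"INV-HEAD": "invoice", "ORD-HEAD": "order"}
pages = [
    {"signature": "INV-HEAD"},
    {"signature": "plain"},
    {"signature": "ORD-HEAD"},
]
for doc in partition_and_classify(pages, head_patterns):
    print(doc["type"], len(doc["pages"]))
```

In this sketch one lookup performs both the dividing and the type determination; a real system could keep them as separate comparisons, as the two means in the specification suggest.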

【００７０】また次の発明によれば、文書に対応する文
書画像データから、文書の種類に応じたデータベース登
録項目領域の画像データを抽出し、そのデータベース登
録項目領域の画像データをパターン認識行程でコード化
し、そのコード化したデータベース登録項目値を付与し
て、文書画像データをデータベースへ登録することによ
り、簡易な構成及び操作で複数種類の複数の文書を確実
に一括して入力し得る文書ファイリング方法を実現でき
る。According to the next invention, the image data of the database registration item area corresponding to the type of the document is extracted from the document image data corresponding to the document, the image data of that area is coded in the pattern recognition step, and the document image data is registered in the database with the coded database registration item values attached. This realizes a document filing method that can reliably input a plurality of documents of a plurality of types collectively with a simple configuration and operation.

【００７１】また次の発明によれば、複数種類の複数の
文書の文書画像データ及びコード化したデータベース登
録項目値を、文書及びデータベース登録項目を配列した
表形式で一括して表示し、文書及びデータベース登録項
目の表で文書に対応してコード化されたデータベース登
録項目値を修正することにより、修正作業における利用
者の作業負担を軽減して使い勝手を向上し得る文書ファ
イリング方法を実現できる。According to the next invention, the document image data of the plurality of documents of the plurality of types and the coded database registration item values are displayed collectively in a table format in which documents and database registration items are arranged, and the coded database registration item value corresponding to a document is corrected in that table. This realizes a document filing method that reduces the user's work load in correction work and improves usability.

【００７２】また次の発明によれば、第１及び第２の表
示行程で、文書の種類に応じて表示及び動作が切り替わ
るようにしたことにより、利用者は文書の種類を容易に
認識することができ、修正作業における利用者の作業負
担を軽減して使い勝手を向上し得る文書ファイリング方
法を実現できる。According to the next invention, the display and the operation in the first and second display steps are switched according to the type of the document, so that the user can easily recognize the type of each document. This realizes a document filing method that reduces the user's work load in correction work and improves usability.

【００７３】また次の発明によれば、種類の異なる複数
の文書に対して、同一種類の文書をまとめて表示する第
１の表示モードと、種類の異なる文書を混在させて同時
に表示する第２の表示モードとを、利用者の要求に応じ
て切り替えるようにしたことにより、利用者が第１の表
示モードを選択すれば、同一種類の複数の文書の内容を
比較でき、また第２の表示モードを選択すれば、異なる
文書の内容を比較でき、必要に応じてこれらを選択し
得、かくして修正作業における利用者の作業負担を軽減
して使い勝手を向上し得る文書ファイリング方法を実現
できる。According to the next invention, for a plurality of documents of different types, a first display mode that displays documents of the same type together and a second display mode that displays documents of different types mixed together at the same time are switched at the user's request. If the user selects the first display mode, the contents of a plurality of documents of the same type can be compared; if the user selects the second display mode, the contents of documents of different types can be compared. The modes can be selected as needed, which realizes a document filing method that reduces the user's work load in correction work and improves usability.
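The two display modes can be sketched as a choice of row order for the table. This is a minimal illustration, assuming each document is a dict with a hypothetical `type` key and that "displaying documents of the same type together" means making rows of the same type adjacent; the specification does not fix the exact arrangement.

```python
def arrange_for_display(documents, mode):
    """Return the documents in the row order used by the chosen display mode."""
    if mode == "mixed":
        # Second display mode: keep the input order, with types interleaved.
        return list(documents)
    if mode == "same_type":
        # First display mode: rows of one type are kept adjacent,
        # in order of each type's first appearance.
        groups = {}
        for doc in documents:
            groups.setdefault(doc["type"], []).append(doc)
        return [doc for docs in groups.values() for doc in docs]
    raise ValueError(f"unknown display mode: {mode}")


docs = [
    {"id": 1, "type": "invoice"},
    {"id": 2, "type": "order"},
    {"id": 3, "type": "invoice"},
]
print([d["id"] for d in arrange_for_display(docs, "same_type")])
print([d["id"] for d in arrange_for_display(docs, "mixed")])
```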

【００７４】また次の発明によれば、第２の表示モード
は、文書の種類毎に異なるデータベース登録項目値の表
示位置を固定して表示する固定表示モードと、文書の種
類毎に異なるデータベース登録項目値の表示位置を左詰
めで表示する左詰め表示モードとを、利用者の要求の応
じて切り替えるようにしたことにより、利用者が固定表
示モードを選択すれば、文書の種類毎にデータベース登
録項目の有無を容易に認識でき、また左詰め表示モード
を選択すれば、同一文書内でより多くのデータベース登
録項目を表示して認識でき、必要に応じてこれらを選択
し得、かくして修正作業における利用者の作業負担を軽
減して使い勝手を向上し得る文書ファイリング方法を実
現できる。According to the next invention, the second display mode is switched, at the user's request, between a fixed display mode in which the display positions of the database registration item values, which differ for each type of document, are fixed, and a left-justified display mode in which those values are displayed left-justified. If the user selects the fixed display mode, the presence or absence of each database registration item can be easily recognized for each type of document; if the user selects the left-justified display mode, more database registration items within the same document can be displayed and recognized. The modes can be selected as needed, which realizes a document filing method that reduces the user's work load in correction work and improves usability.
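The difference between the fixed and left-justified display modes can be sketched for a single table row. The sketch assumes a document's recognized values are held in a dict keyed by item name and that `all_items` lists every column of the mixed table; both names are hypothetical.

```python
def layout_row(doc_values, all_items, mode):
    """Lay out one document's registration item values as a list of table cells."""
    if mode == "fixed":
        # Every item keeps its own column; missing items show as blanks.
        return [doc_values.get(item, "") for item in all_items]
    if mode == "left":
        # Only items the document actually has are shown, packed to the left.
        return [doc_values[item] for item in all_items if item in doc_values]
    raise ValueError(f"unknown layout mode: {mode}")


all_items = ["date", "amount", "customer", "part_no"]
order = {"date": "1995-02-17", "part_no": "A-12"}
print(layout_row(order, all_items, "fixed"))
print(layout_row(order, all_items, "left"))
```

In the fixed layout the blanks make it obvious which items a document type lacks, while the left-justified layout frees horizontal space for more of the document's own items.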

【００７５】また次の発明によれば、修正行程では、画
像データ表示領域に修正の対象としている文書中のデー
タベース登録項目値に対応する文書画像データを常に拡
大表示し、パターンデータ表示領域にコード化したデー
タベース登録項目値のパターンデータを常に拡大表示す
るようにしたことにより、拡大表示された修正対象のパ
ターンデータとその文書画像データを対比して認識で
き、修正作業における利用者の作業負担を軽減して使い
勝手を向上し得る文書ファイリング方法を実現できる。According to the next invention, in the correction step, the document image data corresponding to the database registration item value being corrected is always displayed enlarged in an image data display area, and the pattern data of the coded database registration item value is always displayed enlarged in a pattern data display area. The enlarged pattern data being corrected can therefore be compared with its document image data, which realizes a document filing method that reduces the user's work load in correction work and improves usability.

【００７６】また次の発明によれば、修正行程では、固
定された表示領域に、修正の対象としている文書中のデ
ータベース登録項目値を含む頁全体の文書画像データ
と、コード化したデータベース登録項目値のパターンデ
ータとを、指定された表示倍率で常に表示することによ
り、指定された表示倍率で表示された修正対象のパター
ンデータとその文書画像データを対比して認識でき、修
正作業における利用者の作業負担を軽減して使い勝手を
向上し得る文書ファイリング方法を実現できる。According to the next invention, in the correction step, the document image data of the entire page containing the database registration item value being corrected and the pattern data of the coded database registration item value are always displayed in a fixed display area at a designated display magnification. The pattern data being corrected can therefore be compared with its document image data at the designated magnification, which realizes a document filing method that reduces the user's work load in correction work and improves usability.

【００７７】また次の発明によれば、頁全体の文書画像
データの固定された表示領域への表示は、修正の対象と
しているデータベース登録項目値に対応する文書画像デ
ータ上の領域を枠で囲み又は色を表示し、修正対象のデ
ータベース登録項目値に対応する文書画像データ上の領
域を、表示領域に収めるために必要最小限移動させるこ
とにより、修正対象のデータベース登録項目値に対応す
る文書画像データを確実に認識でき、修正作業における
利用者の作業負担を軽減して使い勝手を向上し得る文書
ファイリング方法を実現できる。According to the next invention, when the document image data of the entire page is displayed in the fixed display area, the area on the document image data corresponding to the database registration item value being corrected is enclosed in a frame or displayed in color, and that area is moved by the minimum amount necessary to fit it within the display area. The document image data corresponding to the database registration item value being corrected can therefore be recognized reliably, which realizes a document filing method that reduces the user's work load in correction work and improves usability.
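The "minimum necessary movement" can be sketched as a scroll computation applied independently to each axis. The sketch assumes the viewer and the item area are both given as `(x, y, width, height)` tuples in page coordinates; `min_scroll` and `scroll_to_region` are hypothetical names.

```python
def min_scroll(view_start, view_size, region_start, region_size):
    """Return the new view origin on one axis after the smallest scroll
    that brings the region into view."""
    region_end = region_start + region_size
    if region_start < view_start:
        # Region begins before the view: scroll back to its start.
        return region_start
    if region_end > view_start + view_size:
        # Region ends after the view: scroll forward just far enough,
        # but never past the region's start if it is larger than the view.
        return min(region_start, region_end - view_size)
    return view_start  # already fully visible, do not move


def scroll_to_region(view, region):
    """Apply the one-axis rule to x and y and return the new view origin."""
    vx, vy, vw, vh = view
    rx, ry, rw, rh = region
    return (min_scroll(vx, vw, rx, rw), min_scroll(vy, vh, ry, rh))


print(scroll_to_region((0, 0, 100, 100), (120, 30, 30, 10)))
```

Leaving the view untouched when the area is already visible is what keeps the page from jumping each time the cursor moves to a neighbouring item.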

【００７８】また次の発明によれば、修正行程は、文書
判別行程によって判別された文書の種類と異なる文書の
種類となるように利用者が訂正すると、文書の種類を訂
正された文書のみ文書項目定義行程によって定義された
文書の種類に対応する処理手順に従って再処理するよう
にしたことにより、文書が誤判別された場合でも容易に
訂正でき、複数種類の複数の文書を確実に一括して入力
し得る文書ファイリング方法を実現できる。According to the next invention, when, in the correction step, the user corrects the type of a document to a type different from the one determined in the document determination step, only the document whose type was corrected is reprocessed according to the processing procedure for that type of document as defined in the document item definition step. Even when a document has been misclassified, it can therefore be corrected easily, which realizes a document filing method that can reliably input a plurality of documents of a plurality of types collectively.
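The selective reprocessing can be sketched as follows, assuming each document is a dict holding its current `type`, its `image`, and its recognized `items`, and that the per-type processing procedure is available as a callable in a hypothetical `procedures` mapping. None of these names come from the specification.

```python
def apply_type_corrections(documents, corrections, procedures):
    """Reprocess only the documents whose type the user actually changed.

    documents:   list of dicts with "type", "image" and "items" keys
    corrections: dict mapping a document index to the user-corrected type
    procedures:  dict mapping a type to a callable(image) -> items dict
    Returns the indices of the documents that were reprocessed.
    """
    reprocessed = []
    for index, new_type in corrections.items():
        doc = documents[index]
        if new_type == doc["type"]:
            continue  # type unchanged, nothing to redo
        doc["type"] = new_type
        # Re-run extraction and recognition with the corrected type's procedure.
        doc["items"] = procedures[new_type](doc["image"])
        reprocessed.append(index)
    return reprocessed


procedures = {
    "invoice": lambda image: {"amount": f"read from {image}"},
    "order": lambda image: {"part_no": f"read from {image}"},
}
documents = [
    {"type": "order", "image": "img0", "items": {"part_no": "A-12"}},
    {"type": "order", "image": "img1", "items": {"part_no": "??"}},
]
print(apply_type_corrections(documents, {1: "invoice"}, procedures))
```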

【００７９】また次の発明によれば、データベース登録
処理行程のキーワード抽出行程は、文字認識行程によっ
て文書全体をコード化したコード化データから単語を切
り出してキーワードとして抽出し、データベースの登録
項目としてキーワードが指定された場合、抽出したキー
ワードを付与して、データベースへ登録することによ
り、容易にキーワードを抽出してデータベースへ登録で
き、利用者の使い勝手を向上し得る文書ファイリング方
法を実現できる。According to the next invention, the keyword extraction step of the database registration processing step cuts out words from the coded data obtained by coding the entire document in the character recognition step and extracts them as keywords; when a keyword is designated as a registration item of the database, the extracted keywords are attached and the document is registered in the database. Keywords can therefore be easily extracted and registered in the database, which realizes a document filing method that improves usability for the user.

【００８０】また次の発明によれば、データベース登録
項目値の中からキーワードを抽出する領域を限定し、指
定された領域のみを文字認識行程によってコード化した
コード化データからキーワードを抽出し、その抽出した
キーワードを付与してデータベースへ登録することによ
り、容易にデータベース登録項目値をキーワードとして
抽出してデータベースへ登録でき、利用者の使い勝手を
向上し得る文書ファイリング方法を実現できる。According to the next invention, the area from which keywords are extracted is limited to the database registration item values, keywords are extracted from the coded data obtained by coding only the designated area in the character recognition step, and the extracted keywords are attached and registered in the database. Database registration item values can therefore be easily extracted as keywords and registered in the database, which realizes a document filing method that improves usability for the user.

【００８１】また次の発明によれば、キーワード抽出範
囲指定行程で、キーワードを抽出する範囲を文書内容に
応じて指定し、文書全体を文字認識行程によってコード
化したコード化データのうち、指定された抽出範囲より
キーワードを抽出し、その抽出したキーワードを付与し
てデータベースへ登録することにより、有効なキーワー
ドを抽出してデータベースへ登録でき、利用者の使い勝
手を向上し得る文書ファイリング方法を実現できる。According to the next invention, a keyword extraction range designating step designates the range from which keywords are extracted according to the document contents, keywords are extracted from the designated range of the coded data obtained by coding the entire document in the character recognition step, and the extracted keywords are attached and registered in the database. Effective keywords can therefore be extracted and registered in the database, which realizes a document filing method that improves usability for the user.
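Keyword extraction from a designated range of the coded data can be sketched as below. This is only an illustration: the range is assumed to be a pair of character offsets into the recognized text, and a simple regular expression on word characters stands in for real word segmentation, which for Japanese text would need a morphological analyzer. The function name and the stop-word parameter are hypothetical.

```python
import re


def extract_keywords(coded_text, start=0, end=None, stop_words=frozenset()):
    """Cut words out of the designated range of the recognized text and
    return them as keywords, lower-cased, without duplicates, in order."""
    segment = coded_text[start:end]
    keywords = []
    for word in re.findall(r"\w+", segment):
        word = word.lower()
        if word not in stop_words and word not in keywords:
            keywords.append(word)
    return keywords


text = "Invoice for printer parts. Payment due in March."
body_end = text.index("Payment")  # designate only the first sentence
print(extract_keywords(text, 0, body_end, stop_words={"for"}))
```

Leaving `start` and `end` at their defaults extracts from the whole document, which corresponds to the case where no range is designated.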

【００８２】また次の発明によれば、文書画像データ入
力手段で、複数種類の複数の文書を一括して画像読み取
り装置から読み取った一連の画像データを各々対応する
文書画像データとして入力し、パターン認識手段でパタ
ーン認識してコード化データを生成し、表示手段で複数
の文書に各々対応する文書画像データ及びコード化デー
タを一括して表示部に表示し、修正手段で表示部の表示
結果に基づいて必要に応じて後修正して表示手段に供給
し、データベース登録手段で表示部に表示された文書画
像データ及びコード化データを一括してデータベースに
登録することにより、簡易な構成及び操作で複数種類の
複数の文書を一括して入力し得ると共に、修正作業にお
ける利用者の作業負担を軽減し使い勝手を向上し得る文
書ファイリング装置を実現できる。According to the next invention, the document image data input means inputs a series of image data, obtained by collectively reading a plurality of documents of a plurality of types from an image reading device, as document image data corresponding to each document; the pattern recognition means pattern-recognizes the document image data to generate coded data; the display means collectively displays the document image data and the coded data corresponding to each of the documents on the display unit; the correction means post-corrects the coded data as necessary based on the display result on the display unit and supplies the result to the display means; and the database registration means collectively registers the document image data and the coded data displayed on the display unit in the database. This realizes a document filing apparatus that allows a plurality of documents of a plurality of types to be input collectively with a simple configuration and operation, reduces the user's work load in correction work, and improves usability.

【００８３】また次の発明によれば、文書画像データ入
力手段では、文書区分け手段で一連の画像データを予め
定義された定型文書情報と比較して、読み取った複数の
文書に区分けし、文書判別手段で区分けされた複数の文
書を、定型文書情報と比較して文書の種類を判別し、文
書画像データとして送出することにより、これにより簡
易な構成及び操作で、複数種類に複数の文書を確実に一
括して入力し得る文書ファイリング装置を実現できる。According to the next invention, in the document image data input means, the document dividing means compares the series of image data with predefined standard document information and divides it into the plurality of read documents, and the document discriminating means compares the divided documents with the standard document information to determine the type of each document and sends them out as document image data. This realizes a document filing apparatus that can reliably input a plurality of documents of a plurality of types collectively with a simple configuration and operation.

【００８４】また次の発明によれば、文書画像データ入
力手段では、さらに、文書項目定義手段で定型文書情報
に加えて文書の種類毎にデータベース登録項目値の記載
位置を示すデータベース登録項目領域が予め定義され、
データベース登録項目抽出手段で文書に対応する文書画
像データからデータベース登録項目域の画像データを抽
出し、パターン認識手段でコード化したデータベース登
録項目値を付与して、文書画像データをデータベースへ
登録することにより、簡易な構成及び操作で複数種類の
複数の文書を確実に一括して入力し得る文書ファイリン
グ装置を実現できる。According to the next invention, in the document image data input means, the document item definition means further defines in advance, in addition to the standard document information, a database registration item area indicating the position at which each database registration item value is written for each type of document; the database registration item extraction means extracts the image data of the database registration item area from the document image data corresponding to the document; and the document image data is registered in the database with the database registration item values coded by the pattern recognition means attached. This realizes a document filing apparatus that can reliably input a plurality of documents of a plurality of types collectively with a simple configuration and operation.

【００８５】また次の発明によれば、表示手段では、複
数種類の複数の文書の文書画像データ及びコード化した
データベース登録項目値を、文書及びデータベース登録
項目を配列した表形式で一括して表示し、修正手段で
は、文書及びデータベース登録項目の表で文書に対応し
てコード化されたデータベース登録項目値を修正するこ
とにより、修正作業における利用者の作業負担を軽減し
て使い勝手を向上し得る文書ファイリング装置を実現で
きる。According to the next invention, the display means collectively displays the document image data of the plurality of documents of the plurality of types and the coded database registration item values in a table format in which documents and database registration items are arranged, and the correction means corrects the coded database registration item value corresponding to a document in that table. This realizes a document filing apparatus that reduces the user's work load in correction work and improves usability.

【図面の簡単な説明】[Brief description of drawings]

【図１】本発明による文書ファイリングシステムの実
施例１の構成を示すブロック図である。FIG. 1 is a block diagram showing a configuration of a first embodiment of a document filing system according to the present invention.

【図２】図１の文書ファイリングシステムの一括入力
の際に用いる先頭識別頁を示す略線図である。FIG. 2 is a schematic diagram showing a head identification page used for batch input in the document filing system of FIG. 1.

【図３】図１の文書ファイリングシステムの処理の流
れを説明する略線図である。FIG. 3 is a schematic diagram illustrating the processing flow of the document filing system of FIG. 1.

【図４】図１の文書ファイリングシステムにおける一
括表示及び修正操作を説明する略線図である。FIG. 4 is a schematic diagram illustrating the collective display and correction operations in the document filing system of FIG. 1.

【図５】図３の一括表示及び修正操作における認識候
補選択画面を示す略線図である。FIG. 5 is a schematic diagram showing a recognition candidate selection screen in the collective display and correction operation of FIG. 3.

【図６】本発明による文書ファイリングシステムの実
施例２の一括表示及び修正操作を説明する略線図であ
る。FIG. 6 is a schematic diagram illustrating collective display and correction operations of a second embodiment of the document filing system according to the present invention.

【図７】本発明による文書ファイリングシステムの実
施例４の一括表示及び修正操作を説明する略線図であ
る。FIG. 7 is a schematic diagram illustrating collective display and correction operations of a fourth embodiment of the document filing system according to the present invention.

【図８】従来の文書ファイリングシステムにおける表
示及び修正操作を説明する略線図である。FIG. 8 is a schematic diagram illustrating display and correction operations in a conventional document filing system.

[Explanation of symbols]

１ 制御部、２ 文書読み取り手段、３ 文書区分け手段、４ 文書判別手段、５ データベース登録項目抽出手段、６ 文字認識手段、７ 一括表示手段、８ 修正手段、９ キーワード抽出手段、１０ データベース登録手段、１１ 記憶部、１２ 文書読み取り部、１３ 文書項目定義手段、１４ 表示部、１５ 操作部、２０ 文書、２１ 文書画像データ、２２ 定義、２３ データベース登録項目領域の画像データ、２４ 文字列データ、２５ 一括表示修正インターフェース、２６ データベース、３０ 画像データビューワ、３１ 文字列データビューワ、３２ 文書種類ラベル、３３ データベース登録項目ラベル、３４ 画像データセル、３５ 文字データセル、３６ カーソル、３７ 枠、３８ カーソル、４０ １文字分の画像データ、４１ 認識候補の文字、４２ 終了表示
DESCRIPTION OF SYMBOLS: 1 control unit; 2 document reading means; 3 document dividing means; 4 document discriminating means; 5 database registration item extraction means; 6 character recognition means; 7 collective display means; 8 correction means; 9 keyword extraction means; 10 database registration means; 11 storage unit; 12 document reading unit; 13 document item definition means; 14 display unit; 15 operation unit; 20 document; 21 document image data; 22 definition; 23 image data of database registration item area; 24 character string data; 25 collective display/correction interface; 26 database; 30 image data viewer; 31 character string data viewer; 32 document type label; 33 database registration item label; 34 image data cell; 35 character data cell; 36 cursor; 37 frame; 38 cursor; 40 image data for one character; 41 recognition candidate characters; 42 end display

フロントページの続き (72)発明者 丸田 裕三 鎌倉市大船五丁目１番１号 三菱電機株式会社パーソナル情報機器開発研究所内
Continuation of front page: (72) Inventor Yuzo Maruta, c/o Personal Information Equipment Development Laboratory, Mitsubishi Electric Corporation, 5-1-1 Ofuna, Kamakura-shi

Claims

[Claims]

1. A document filing method comprising: a document image data input step of inputting a series of image data, obtained by collectively reading a plurality of documents of a plurality of types from an image reading device, as document image data corresponding to each of the read documents; a pattern recognition step of pattern-recognizing and coding the document image data input in the document image data input step to generate coded data; a first display step of collectively displaying, on a display unit, the document image data and the coded data corresponding to each of the plurality of documents; a correction step of determining, based on the display result of the first display step, whether post-correction of the coded data is necessary, and post-correcting the coded data based on the determination result; a second display step of displaying the coded data corrected in the correction step on the display unit; and a database registration processing step of collectively registering, in a database, the document image data and the coded data displayed on the display unit.

2. The document filing method according to claim 1, wherein the document image data input step comprises: a document dividing step of comparing the series of image data with predefined standard document information and dividing it into the plurality of read documents; and a document determination step of comparing the plurality of documents divided in the document dividing step with the standard document information to determine the type of each document, and sending them out as the document image data.

3. The document filing method according to claim 2, wherein the document image data input step further comprises: a document item definition step in which, in addition to the standard document information, a database registration item area indicating the position at which a database registration item value used when registering the document in the database is written is defined in advance for each type of document; and a database registration item extraction step of extracting, from the document image data corresponding to the document, the image data of the database registration item area according to the type of the document, and wherein the image data of the database registration item area extracted in the database registration item extraction step is coded in the pattern recognition step, and the document image data is registered in the database in the database registration processing step with the coded database registration item value attached.

4. The document filing method according to claim 3, wherein the first and second display steps collectively display the document image data of the plurality of documents of the plurality of types and the coded database registration item values in a table format in which documents and database registration items are arranged, and the correction step corrects the coded database registration item value corresponding to a document in the table of documents and database registration items.

5. The document filing method according to claim 4, wherein the display and the operation of the first and second display steps are switched according to the type of the document.

6. The document filing method according to claim 5, wherein, for a plurality of documents of different types, a first display mode in which documents of the same type are displayed together and a second display mode in which documents of different types are mixed and displayed at the same time are switched according to a user's request.

7. The document filing method according to claim 6, wherein the second display mode is switched, according to a user's request, between a fixed display mode in which the display positions of the database registration item values, which differ for each type of document, are fixed, and a left-justified display mode in which the database registration item values, which differ for each type of document, are displayed left-justified.

8. The document filing method according to claim 4, wherein the correction step has an image data display area that always displays, enlarged, the document image data corresponding to the database registration item value in the document being corrected, and a pattern data display area that always displays, enlarged, the pattern data of the coded database registration item value.

9. The document filing method according to claim 4, wherein the correction step has a fixed display area that always displays, at a designated display magnification, the document image data of the entire page containing the database registration item value in the document being corrected and the pattern data of the coded database registration item value.

10. The document filing method according to claim 9, wherein, when the document image data of the entire page is displayed in the fixed display area, the area on the document image data corresponding to the database registration item value being corrected is enclosed in a frame or displayed in color, and that area is moved by the minimum amount necessary to fit it within the display area.

11. The document filing method according to claim 4, wherein, when the user corrects, in the correction step, the type of a document to a type different from the type determined in the document determination step, only the document whose type was corrected is reprocessed according to the processing procedure corresponding to the type of document defined in the document item definition step.

12. The document filing method according to claim 3, wherein the database registration processing step includes a keyword extraction step of cutting out words from the coded data obtained by coding the entire document in the character recognition step and extracting them as keywords, and, when a keyword is designated as a registration item of the database, the keywords extracted in the keyword extraction step are attached and registered in the database.

13. The document filing method according to claim 12, wherein the area from which the keywords are extracted is limited to the database registration item values, the keywords are extracted from the coded data obtained by coding only the designated area in the character recognition step, and the extracted keywords are attached and registered in the database.

14. The document filing method according to claim 12, further comprising a keyword extraction range designating step of designating the range from which the keywords are extracted according to the document contents, wherein the keywords are extracted from the designated extraction range of the coded data obtained by coding the entire document in the character recognition step, and the extracted keywords are attached and registered in the database.

15. A document filing apparatus comprising: document image data input means for inputting a series of image data, obtained by collectively reading a plurality of documents of a plurality of types from an image reading device, as document image data corresponding to each of the read documents; pattern recognition means for pattern-recognizing and coding the document image data input by the document image data input means to generate coded data; display means for collectively displaying, on a display unit, the document image data and the coded data corresponding to each of the plurality of documents; correction means for determining, based on the display result on the display unit, whether post-correction of the coded data is necessary, post-correcting the coded data based on the determination result, and supplying the result to the display means; and database registration processing means for collectively registering, in a database, the document image data and the coded data displayed on the display unit.

16. The document filing apparatus according to claim 15, wherein the document image data input means comprises: document dividing means for comparing the series of image data with predefined standard document information and dividing it into the plurality of read documents; and document discriminating means for comparing the plurality of documents divided by the document dividing means with the standard document information to determine the type of each document, and sending them out as the document image data.

17. The document filing apparatus according to claim 16, wherein the document image data input means further comprises: document item definition means in which, in addition to the standard document information, a database registration item area indicating the position at which a database registration item value used when registering the document in the database is written is defined in advance for each type of document; and database registration item extraction means for extracting, from the document image data corresponding to the document, the image data of the database registration item area according to the type of the document, and wherein the image data of the database registration item area extracted by the database registration item extraction means is coded by the pattern recognition means, and the document image data is registered in the database by the database registration processing means with the coded database registration item value attached.

18. The document filing apparatus according to claim 17, wherein the display means collectively displays the document image data of the plurality of documents of the plurality of types and the coded database registration item values in a table format in which documents and database registration items are arranged, and the correction means corrects the coded database registration item value corresponding to a document in the table of documents and database registration items.