JP2001034763A

JP2001034763A - Document image processor, method for extracting its document title and method for attaching document tag information

Info

Publication number: JP2001034763A
Application number: JP2000053079A
Authority: JP
Inventors: Hirosuke Monobe; 裕亮物部; Atsutsugu Hirose; 篤嗣広瀬; Akito Umebayashi; 明人梅林
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1999-03-01
Filing date: 2000-02-29
Publication date: 2001-02-09

Abstract

PROBLEM TO BE SOLVED: To extract title areas and user-attached marks from a document image and to use them as document tag information. SOLUTION: First, a title area extracting means 104 extracts, as a title area, every area whose area average character size is larger than a prescribed extraction decision value. A plurality of title areas can thus be extracted from one document image. Next, a mark extracting means extracts a mark attached to an input image by a user, and a calculating means calculates the characteristic value of the mark. A document tag information attaching means then selects, from the standard tag information, the document tag information to be attached to the input image on the basis of the characteristic value and the attribute values of the standard tag information. Document tag information can thus be attached to the document image automatically.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a document image processing apparatus and a document image processing method that store and manage document images as image data, and more particularly to an apparatus and method that extract a title area and a mark attached by a user from the document image and use them as document tag information.

[0002]

2. Description of the Related Art

With the remarkable increase in the capacity of data storage devices, document image processing apparatuses that store and manage paper documents read by a scanner or the like as document images, that is, as image data, have rapidly become widespread.

[0003] In such a document image processing apparatus, a character string serving as document tag information, such as a title or a keyword, is registered in association with each document image so that a desired document image can be retrieved from among the plurality of document images stored in the data storage device.

[0004] FIG. 19 conceptually illustrates this document tag information. As shown in the figure, document tag information such as "Top Secret" 191, "Company A" 192, "FY 1999" 193, and "new car" 194 serves as keywords for the document image 190. When a plurality of pieces of document tag information are assigned to each document image in this way, a required document image can be found quickly by narrowing down the candidates using those pieces of document tag information.
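The narrowing-down described above can be sketched as a set-containment filter. This is only an illustration: the in-memory `documents` index, the document ids, and the `narrow_down` helper are hypothetical and do not appear in the publication.

```python
# Hypothetical in-memory index: document id -> set of document tag strings.
documents = {
    "doc190": {"Top Secret", "Company A", "FY 1999", "new car"},
    "doc191": {"Company A", "FY 1999", "minutes"},
    "doc192": {"Company B", "FY 1999", "new car"},
}


def narrow_down(index, required_tags):
    """Return the ids of documents that carry every tag in required_tags."""
    required = set(required_tags)
    return sorted(doc_id for doc_id, tags in index.items() if required <= tags)


# Each additional tag shrinks the candidate list.
print(narrow_down(documents, ["FY 1999"]))
print(narrow_down(documents, ["FY 1999", "new car"]))
print(narrow_down(documents, ["FY 1999", "new car", "Company A"]))
```

With the sample data, the three calls return three, two, and one document respectively, which is the effect the paragraph attributes to assigning several tags per document.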

[0005] Conventionally, such document tag information was entered manually by the user when a document image was stored. However, having the user enter the document tag information is impractical, because the amount of work becomes enormous as the number of documents grows. In recent years, therefore, apparatuses have appeared that perform character recognition on a document image and use the recognized character string as document tag information, so that document tag information can be attached without human intervention.

[0006] For example, Japanese Patent Application Laid-Open No. H8-147313 discloses a method that uses a mark sheet. In this method, the user first marks, on a mark sheet of a predetermined format, the check boxes of the document tag information to be attached to a document image. The mark sheet is then read by the document image processing apparatus ahead of the paper document, which designates the document tag information to be attached from among candidates registered in advance. With this method, document tag information can be attached automatically to a document image being registered without using an input device such as a keyboard or a pointing device.

[0007] To search for document images efficiently, it is important to attach appropriate document tag information. A typical search consists of identifying, among the pieces of document tag information listed on a display, the one that corresponds to the desired document image. For that piece to be identified quickly, each piece of document tag information must express the content of its document concisely.

[0008] Japanese Patent Application Laid-Open No. H8-202859 proposes a method in which the area containing a title character string (hereinafter referred to as a "title area") is extracted from a document image, character recognition is performed on the image of this title area, and the recognized title character string is used as document tag information. Because a title character string expresses the content of a document concisely, a document image processing apparatus that adopts such a title area extraction method allows the document tag information corresponding to a desired document image to be identified quickly.

[0009]

PROBLEMS TO BE SOLVED BY THE INVENTION

The title area extraction method of Japanese Patent Application Laid-Open No. H8-202859 assumes that title characters are the largest of all the characters in the document image. It divides the document image into a plurality of areas (each formed by merging adjacent character rectangles), calculates the average character size within each area, and extracts the area with the largest average character size as the title area. Consequently, this method necessarily extracts exactly one title area per document image.

[0010] However, when several documents have similar content, their titles are usually similar as well. The conventional title area extraction method described above therefore has the problem that, when many documents with similar content exist, the document tag information corresponding to a desired document image cannot be identified quickly.

[0011] This problem could be avoided by not giving similar titles to paper documents when they are created, but it is undesirable to demand such preparatory work from the user.

[0012] The mark sheet method of Japanese Patent Application Laid-Open No. H8-147313, on the other hand, is very laborious: when the image management apparatus is built in software, the format of the mark sheet listing every document tag information item, the reading process, and so on must all be defined. Moreover, when new document tag information candidates are registered later, the set of items changes, so the mark sheet format, the reading process, and so on must be redone.

[0013] Furthermore, because a mark sheet always uses the same form and the user merely marks check boxes, it is hard for the user to see at a glance which document tag information has been assigned, and input errors are likely.

[0014] The present invention has been proposed in view of the above circumstances. Its object is to provide a document image processing apparatus that can extract a plurality of title areas and marks attached by a user from one document image and use them as document tag information, together with a document title extraction method and a document tag information attaching method for the apparatus.

[0015]

MEANS FOR SOLVING THE PROBLEMS

To achieve the above object, the present invention adopts the following means.

[0016] First, as shown in FIG. 1, the following means are adopted in a document image processing apparatus comprising an area dividing means 103 that divides a document image into a plurality of areas, and a title area extracting means 104 that calculates an area average character size for each area produced by the area dividing means 103 and then extracts title areas from among all the areas on the basis of the area average character sizes.

[0017] The title area extracting means 104 first calculates a total average character size, which corresponds to the average height of the characters in all the areas. It then compares each area average character size with an extraction determination value obtained by multiplying the total average character size by an extraction parameter, and extracts as a title area every area whose area average character size is larger than the extraction determination value. Because every such area is extracted, a plurality of title areas can be extracted from one document image.

[0018] The title area extracting means 104 may also calculate extraction determination values at a plurality of levels by using extraction parameters at a plurality of levels. Extraction is then decided against several determination values, so that not only title areas but also subtitle areas (areas containing a subtitle character string made up of characters slightly smaller than the title characters) can be extracted.

[0019] Furthermore, the title area extracting means 104 may determine the extraction parameters for the plurality of levels on the basis of the value obtained by dividing the maximum area average character size by the total average character size. Calculating the extraction parameters from quantities such as the maximum area average character size, rather than fixing them, yields better extraction determination values.

[0020] The accuracy of the total average character size and the area average character size can be improved by using a trimmed mean, which excludes a predetermined proportion of the largest characters and a predetermined proportion of the smallest characters.

[0021] The character images contained in an extracted title area can be converted by a character recognition means 105 into a title character string, that is, a string of character codes. By correcting this title character string with a correction means 112, the user can change the title of the document image as appropriate.

[0022] Second, as shown in FIG. 12, in document image processing that reads a paper document to generate and store a document image, a standard tag information storage means 1215 is first provided, which stores standard tag information (candidates for document tag information) in advance together with attribute values of that standard tag information.

[0023] Next, a mark extracting means 1205 is provided, which extracts a specific mark that the user has attached to the paper document. Here, a mark means any mark that the user attaches in order to identify a paper document, such as a stamp, a sticker, an illustration, or a signature in a consistent handwriting.

[0024] A calculating means 120A is also provided, which calculates a feature value representing the characteristics of the extracted mark on the basis of the distribution of the pixels that make up the mark.

[0025] Finally, a document tag information attaching means 1208 is provided, which compares the attribute values with the feature value, selects the standard tag information with the highest similarity, and attaches it to the document image.
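The comparison of mark feature values with stored attribute values can be sketched as follows. The text in this section does not specify the feature or the similarity measure, so everything concrete here is an assumption made for illustration: the feature is the black-pixel density in each cell of a 2 x 2 grid, similarity is taken as smaller Euclidean distance, and the names `mark_features`, `select_tag`, and the sample attribute values are hypothetical.

```python
import math


def mark_features(mark, grid=2):
    """Fraction of black pixels (1) in each cell of a grid x grid partition.

    Assumes the mark image has at least `grid` rows and `grid` columns.
    """
    rows, cols = len(mark), len(mark[0])
    features = []
    for gr in range(grid):
        for gc in range(grid):
            r0, r1 = gr * rows // grid, (gr + 1) * rows // grid
            c0, c1 = gc * cols // grid, (gc + 1) * cols // grid
            cell = [mark[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(cell) / len(cell))
    return features


def select_tag(features, standard_tags):
    """Pick the standard tag whose attribute values lie closest to `features`."""
    return min(standard_tags, key=lambda tag: math.dist(features, standard_tags[tag]))


# Hypothetical standard tag information with pre-stored attribute values.
standard_tags = {
    "Top Secret": [1.0, 1.0, 0.0, 0.0],
    "Company A": [0.0, 1.0, 0.0, 1.0],
}

# A small binary mark image: mostly black in the upper half.
mark = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

print(select_tag(mark_features(mark), standard_tags))  # Top Secret
```

A nearest-neighbour choice like this always returns some tag; a practical system would probably also need a rejection threshold for marks that match nothing well, which this sketch omits.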

[0026] As a result, document tag information can be attached automatically to document images on the basis of the marks that users routinely use when organizing documents, which simplifies document management in offices and similar settings.

[0027]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

(Embodiment 1) Embodiments of the present invention are described in detail below with reference to the drawings. Embodiments 1, 2, 3, and 4 describe a document image processing apparatus that extracts a plurality of titles from one paper document.

[0028] FIG. 1 is a schematic functional block diagram of a document image processing apparatus to which the present invention is applied. Its configuration is described below together with the document image registration procedure.

[0029] First, a document image input means 101 such as a scanner photoelectrically converts a paper document to obtain a document image 108a, which is multi-valued image data. An image processing means 111a applies processing suited to storage (for example, compression) to the document image, which is then registered in a document image area Aa of a storage means 108. The image processing means 111a may of course be omitted, in which case the multi-valued image data is registered in the document image area Aa as is.

[0030] The document image from the document image input means 101 is input not only to the image processing means 111a but also to an image processing means 111b, where it is converted into binary image data and stored in an image memory 107. With the document image stored in the image memory 107, a character rectangle generating means 102 refers to it and performs the following labeling process. In the labeling process, among the pixels adjacent to a black pixel of interest (hereinafter the "pixel of interest") in the eight directions (above, upper right, right, lower right, below, lower left, left, and upper left), those that are black are given the same label value (identification information) as the pixel of interest. That is, when the eight pixels W1, W2, W3, W4, W6, W7, W8, and W9 adjoin the pixel of interest W5 as shown in FIG. 7, the character rectangle generating means 102 gives the black pixels W2, W3, and W8 the same label value as the pixel of interest W5. This labeling process assigns one label value to each black pixel connected component (run of connected black pixels) in the document image.
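The labeling process described above amounts to 8-connected component labeling. A minimal sketch follows, assuming the binary image is a list of rows with 1 for black and 0 for white; the breadth-first traversal and the name `label_components` are choices made for this illustration, not details taken from the publication.

```python
from collections import deque

# Offsets of the eight neighbours: above, upper right, right, lower right,
# below, lower left, left, upper left.
NEIGHBOURS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def label_components(image):
    """Label 8-connected black pixels (value 1) in a binary image.

    Returns a grid of the same shape holding 0 for white pixels and a
    label value 1, 2, 3, ... shared by all pixels of one connected component.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or labels[r][c] != 0:
                continue  # white pixel, or already labeled
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels


image = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
]
for row in label_components(image):
    print(row)
# [1, 0, 0, 0, 2]
# [0, 1, 0, 0, 2]
# [0, 0, 0, 0, 0]
# [0, 0, 3, 3, 0]
```

The two pixels at the upper left touch only diagonally, yet receive the same label, which is the difference between 8-connectivity and 4-connectivity.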

[0031] Next, the character rectangle generating means 102 generates character rectangles by cutting out the black pixel connected components that share a label value, and passes the character rectangles to the area dividing means 103. Here, a character rectangle means the circumscribed rectangle of a black pixel connected component. Some characters do not consist of a single black pixel connected component. To allow for this, the black pixel areas in the document image may be dilated before the labeling process. Dilation converts the eight pixels adjacent to each black pixel of interest into black pixels. Applying it an appropriate number of times (usually two or three) enlarges the black pixel areas so that connected components that were separate within one character merge into one. Performing the labeling process after this dilation makes it possible to generate one character rectangle per character correctly.
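The dilation step and the circumscribed rectangles can be sketched as below. The representation (lists of rows, 1 for black), the inclusive `(top, left, bottom, right)` rectangle format, and the function names are assumptions for this illustration; the label grid passed to `bounding_rectangles` is written out by hand so that the block stays self-contained.

```python
def dilate(image):
    """Turn the eight neighbours of every black pixel (1) black as well."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 1:
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if 0 <= nr < rows and 0 <= nc < cols:
                            out[nr][nc] = 1
    return out


def bounding_rectangles(labels):
    """Map each non-zero label to its circumscribed rectangle.

    Rectangles are (top, left, bottom, right), all inclusive.
    """
    boxes = {}
    for r, row in enumerate(labels):
        for c, label in enumerate(row):
            if label == 0:
                continue
            if label not in boxes:
                boxes[label] = (r, c, r, c)
            else:
                top, left, bottom, right = boxes[label]
                boxes[label] = (min(top, r), min(left, c), max(bottom, r), max(right, c))
    return boxes


# A character such as "i": the dot and the stem are separate components.
glyph = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
grown = dilate(glyph)  # after one pass the dot and the stem touch

# A hand-written label grid with three components.
labels = [
    [1, 0, 0, 0, 2],
    [0, 1, 0, 0, 2],
    [0, 0, 0, 0, 0],
    [0, 0, 3, 3, 0],
]
print(bounding_rectangles(labels))
# {1: (0, 0, 1, 1), 2: (0, 4, 1, 4), 3: (3, 2, 3, 3)}
```

Note that rectangles measured on a dilated image come out slightly larger than the original strokes, so a real implementation would have to decide whether to measure character height before or after dilation; the text here does not say.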

[0032] When the character rectangle generating means 102 has finished, the area dividing means 103 examines the neighborhood of each character rectangle and divides the document image into areas by merging character rectangles that are adjacent to one another. For example, on receiving the character rectangles C1 to C12 shown in FIG. 8, the area dividing means 103 merges C1 to C4, C5 to C9, and C10 to C12 respectively, dividing the document image into areas E1, E2, and E3. This area division splits the document image into one area per character string. Whether two character rectangles are adjacent to each other or are separated by a line gap is decided with suitable thresholds on the spacing between characters and between lines.
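One way to realize the merging of adjacent character rectangles is a union-find pass over all rectangle pairs. This is a sketch under assumptions: the publication only says that suitable thresholds on character and line spacing are used, so the exact adjacency rule, the `(left, top, right, bottom)` format, and the names `gap`, `split_into_regions`, `max_char_gap`, and `max_line_gap` are hypothetical.

```python
def gap(a_start, a_end, b_start, b_end):
    """Distance between two 1-D intervals; 0 when they overlap or touch."""
    return max(0, max(a_start, b_start) - min(a_end, b_end))


def split_into_regions(rects, max_char_gap, max_line_gap):
    """Group character rectangles (left, top, right, bottom) into areas.

    Two rectangles count as adjacent when their horizontal gap is at most
    max_char_gap and their vertical gap is at most max_line_gap. Adjacency
    is applied transitively, so a whole character string ends up in one area.
    Returns a sorted list of lists of rectangle indices.
    """
    parent = list(range(len(rects)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, a in enumerate(rects):
        for j in range(i + 1, len(rects)):
            b = rects[j]
            if (gap(a[0], a[2], b[0], b[2]) <= max_char_gap
                    and gap(a[1], a[3], b[1], b[3]) <= max_line_gap):
                parent[find(j)] = find(i)

    regions = {}
    for i in range(len(rects)):
        regions.setdefault(find(i), []).append(i)
    return sorted(regions.values())


rects = [
    (0, 0, 20, 20), (24, 0, 44, 20), (48, 0, 68, 20),   # a line of large characters
    (0, 40, 10, 50), (12, 40, 22, 50),                  # a line of small characters
]
print(split_into_regions(rects, max_char_gap=5, max_line_gap=5))
# [[0, 1, 2], [3, 4]]
```

The pairwise loop is quadratic in the number of rectangles, which is acceptable for a sketch; a spatial index would be the usual remedy for full pages.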

[0033] This yields information such as every character size in the document image (defined below), the number of areas, and the number of character rectangles in each area. In the present invention, the areas are given serial numbers starting from 1, and the character rectangles belonging to each area are likewise given serial numbers starting from 1. Hereinafter, the number of character rectangles in the n-th area is written NumChar_n, and the size of the m-th character in the n-th area is written SizeChar_{n,m}.

[0034] As shown in FIG. 9, the widths W1 to W4 and the surface areas A1 to A4 of character rectangles vary greatly with the kind of character even when a font of the same point size is used, whereas the heights H1 to H4 of the character rectangles vary little. The present invention therefore adopts the height of the character rectangle, which reflects the point size of the font comparatively accurately, as the character size.

[0035] The title area extracting means 104 then extracts only certain of the areas produced above as title areas. This title area extraction process is described below with reference to the flowchart in FIG. 2.

[0036] First, the title area extracting means 104 calculates the area average character size of each area (FIG. 2, step 1). The area average character size is the mean of all the character sizes belonging to one area: the area average character size SizeReg_n of the n-th area is the sum of all the character sizes SizeChar_{n,m} belonging to that area divided by the number of characters NumChar_n in the area. This relationship is expressed by the following equation.

[0037]

(Equation 1)

$$\mathrm{SizeReg}_n = \frac{1}{\mathrm{NumChar}_n} \sum_{m=1}^{\mathrm{NumChar}_n} \mathrm{SizeChar}_{n,m}$$

[0038] Next, from the area average character size SizeReg_n of each area calculated above and the number of characters NumChar_n in each area, the total average character size SizeAll of the document image is calculated by the following equation (FIG. 2, step 2).

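[0039]

(Equation 2)

$$\mathrm{SizeAll} = \frac{\sum_{n} \mathrm{SizeReg}_n \times \mathrm{NumChar}_n}{\sum_{n} \mathrm{NumChar}_n}$$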
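The two averages can be sketched as follows, on the assumption that the total average is the mean of the area averages weighted by each area's character count, which is what the surrounding text describes (SizeAll is computed from SizeReg_n and NumChar_n and corresponds to the average height of all characters). The function names and the sample heights are illustrative only.

```python
def region_average_size(heights):
    """SizeReg_n: mean character height within one area."""
    return sum(heights) / len(heights)


def total_average_size(regions):
    """SizeAll: mean of SizeReg_n weighted by NumChar_n over all areas."""
    weighted_sum = sum(region_average_size(h) * len(h) for h in regions)
    return weighted_sum / sum(len(h) for h in regions)


# Character heights (in pixels) per area; values are made up for the example.
regions = [
    [24, 24, 24, 24],                    # 4 large characters
    [10, 10, 10, 10, 10, 10, 10, 10],    # 8 body-text characters
    [10, 10, 10, 10],                    # 4 body-text characters
]
print([region_average_size(h) for h in regions])  # [24.0, 10.0, 10.0]
print(total_average_size(regions))                # 13.5
```

Weighting by character count makes SizeAll equal to the plain mean over every character on the page (216 / 16 = 13.5 here), so long body text dominates and a few large title characters barely raise the baseline.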

[0040] The method of calculating the area average character size SizeReg_n and the total average character size SizeAll is not limited to the one described above. For example, a trimmed mean, described later, may be used instead; it calculates the mean after excluding a predetermined proportion of the data, for example 10%, from both the minimum side and the maximum side.
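A trimmed mean of the kind mentioned above can be sketched in a few lines. The rounding rule (truncating the number of dropped items toward zero) and the name `trimmed_mean` are assumptions of this sketch; the text only gives 10% as an example proportion.

```python
def trimmed_mean(values, proportion=0.10):
    """Mean after dropping `proportion` of the data from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    cut = int(len(ordered) * proportion)       # items dropped per side
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


# Character heights with one speck of noise (2) and one oversized blob (40).
heights = [2, 10, 10, 10, 10, 10, 10, 10, 10, 40]
print(sum(heights) / len(heights))  # 12.2
print(trimmed_mean(heights))        # 10.0
```

The plain mean is pulled to 12.2 by the two outliers, while the trimmed mean recovers the dominant height of 10, which is why trimming helps when noise or merged characters produce spurious rectangles.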

[0041] The title area extracting means 104 then decides whether to extract each area as a title area according to whether the following extraction determination formula holds.

[0042]

(Equation 3)

$$\mathrm{SizeReg}_n > \mathrm{SizeAll} \times \alpha$$

[0043] That is, the value obtained by multiplying the total average character size SizeAll calculated above by an extraction parameter α (the extraction determination value) is compared with the area average character size SizeReg_n of each area, and only the areas for which the extraction determination formula holds are extracted as title areas (FIG. 2, steps 3 → 4 → 5). The extraction parameter α is a constant larger than 1.0, preferably about 1.2.
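The extraction decision can be sketched as a simple filter over the area averages, using the suggested value of about 1.2 for α. The function name and the sample sizes are illustrative assumptions.

```python
def extract_title_regions(region_sizes, size_all, alpha=1.2):
    """Return 1-based numbers of areas with SizeReg_n > SizeAll * alpha."""
    threshold = size_all * alpha  # the extraction determination value
    return [n for n, size in enumerate(region_sizes, start=1) if size > threshold]


region_sizes = [24.0, 18.0, 10.0, 10.0]  # SizeReg_1 .. SizeReg_4
size_all = 12.0                          # SizeAll for the page

print(extract_title_regions(region_sizes, size_all))  # [1, 2]
```

Both area 1 and area 2 exceed the determination value of about 14.4, so two title areas are extracted from the same page. A method that keeps only the single largest area would return area 1 alone.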

[0044] When this procedure has been repeated and the extraction decision has been made for every area ("NO" at step 3 in FIG. 2), the title area extraction process ends, and each title area image 108b extracted is registered in a title area Ab of the storage means 108.

[0045] Next, the character recognition means 105 cuts the title area images extracted above out of the document image and performs character recognition on each title area image to obtain a title character string, which is a string of character codes. The title character string is passed through the correction means 112 to a display control means 110, listed together with the title area image on a display (not shown), and presented to the user (see FIG. 10(I)).

[0046] The user checks the displayed title area image and title character string. To register the title character string as displayed, the user instructs registration through an instruction input means 109. The title character string is then passed from the character recognition means 105 to a document registration means 106.

[0047] To change or correct the title character string, on the other hand, the user selects the displayed title character string, for example by double-clicking it with the pointing device of the instruction input means 109. In response to the double click, the correction means 112 instructs the display control means 110 to, for example, make the title character string blink on the display and show a cursor within the character string. The user then operates the keyboard of the instruction input means 109 to enter a corrected character string into the correction means 112, which replaces the character string following the cursor. The corrected character string is input from the correction means 112 to the character recognition means 105, and the title character string is updated. As before, when the user instructs registration through the instruction input means 109, the corrected title character string is passed from the character recognition means 105 to the document registration means 106.

[0048] If the confirmation and correction steps are not provided, the result recognized by the character recognition means 105 is passed directly to the document registration means 106 without being shown on the display.

[0049] On receiving the title character string, the document registration means 106 registers registration information in a registration information management table 108c (see FIG. 5) formed in a table area Ac of the storage means 108. The registration information consists of the storage pointer of the document image 108a in the storage means 108, the storage pointer of the title area image 108b, the title character string, and the position and size of the title area within the document image. The storage pointer of the document image 108a is obtained from the document image area Aa of the storage means 108, the storage pointer of the title area image 108b from the title area Ab of the storage means 108, and the position and size of the title area from the character recognition means 105.
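One row of the registration information management table can be modeled as a small record. The field list follows the paragraph above, but the concrete layout of FIG. 5 is not reproduced in this section, so the field names, the integer pointers, and the `register` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RegistrationInfo:
    document_image_ptr: int      # where document image 108a sits in area Aa
    title_image_ptr: int         # where title area image 108b sits in area Ab
    title_string: str            # recognized or user-corrected title
    position: Tuple[int, int]    # (x, y) of the title area in the document image
    size: Tuple[int, int]        # (width, height) of the title area


def register(table: List[RegistrationInfo], info: RegistrationInfo) -> int:
    """Append one row and return its index in the table."""
    table.append(info)
    return len(table) - 1


table: List[RegistrationInfo] = []
# Two title areas extracted from the same document image (pointer 0).
register(table, RegistrationInfo(0, 0, "Sales Report", (40, 30), (300, 48)))
register(table, RegistrationInfo(0, 1, "Company A", (40, 90), (180, 36)))

titles_of_doc_0 = [row.title_string for row in table if row.document_image_ptr == 0]
print(titles_of_doc_0)  # ['Sales Report', 'Company A']
```

Storing one row per title area, rather than one per document, lets a single document image carry several titles, and the stored position and size are enough to redraw a frame around the title area later.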

[0050] Once the registration information management table 108c has been generated in this way, when a search for a document image is later instructed through the instruction input means 109, which comprises a keyboard, a pointing device, and the like, the display control means 110 lists the stored title area images and title character strings on the display (FIG. 10(I)).

[0051] When the user selects the desired title (a title area image or a title character string) from the list through the instruction input means 109, the display control means 110 shows the document image corresponding to that title on the display. At this point it is preferable to indicate the title area within the document image explicitly, for example by enclosing it in a rectangular frame F as shown in FIG. 10(II). The rectangular frame F can be generated from the position and size of the title area registered in the registration information management table 108c.

[0052] In addition to selecting an entry from the list shown on the display as described above, it is of course also possible to enter specific document tag information through the instruction input means 109 and, when a title matching that document tag information is registered in the registration information management table 108c, to display the corresponding document image.

【００５３】以上のように本実施の形態によれば、抽出
判定値より大きい領域平均文字サイズの領域であればタ
イトル領域として抽出する構成としているため、１つの
文書画像から複数のタイトル領域を抽出できる。従っ
て、似た内容の文書が多数存在する場合であっても、所
望の文書画像に対応する文書タグ情報（タイトル）を迅
速に特定できる。As described above, according to the present embodiment, any area whose area average character size is larger than the extraction determination value is extracted as a title area, so a plurality of title areas can be extracted from one document image. Therefore, even when many documents with similar contents exist, the document tag information (title) corresponding to the desired document image can be identified quickly.

【００５４】なお、上記の説明では、タイトル領域抽出
処理において抽出判定式の成立する領域が存在しなかっ
た場合の手順については言及していないが、このような
場合には、タイトル領域が抽出されなかった旨をディス
プレイ表示するとともに文書タグ情報となる文字列を入
力するようユーザに対して要求し、この要求に対してユ
ーザが文字列を入力すると、この文字列を当該文書画像
のタイトル文字列として用いるようにしている。The above description does not mention the procedure for the case where no area satisfies the extraction determination formula in the title area extraction processing. In such a case, a message indicating that no title area was extracted is shown on the display and the user is requested to input a character string to serve as document tag information; when the user inputs a character string in response, that character string is used as the title character string of the document image.

【００５５】（実施の形態２）上記実施の形態１では、
抽出判定値より大きい領域平均文字サイズの領域であれ
ば、領域平均文字サイズの大小を区別することなく同様
にタイトル領域として抽出する構成としている。従っ
て、タイトル文字より若干小さなサイズの文字からなる
サブタイトル文字列はリスト表示せずタイトル文字列の
みをリスト表示する処理など、領域平均文字サイズの大
小に基づいた適切な処理を行うことができない。本実施
の形態では、複数段階の抽出パラメータを用いて複数段
階の抽出判定値を算出するとともにレベル属性（抽出し
た段階を示す情報）と対応付けてタイトル領域を抽出す
る構成とすることによって上記した問題を解消してお
り、以下、その構成を実施の形態１と異なる点のみ説明
する。(Embodiment 2) In Embodiment 1 described above, any area whose area average character size is larger than the extraction determination value is extracted as a title area in the same way, regardless of how large that size is. Consequently, processing that depends on the magnitude of the area average character size cannot be performed appropriately, for example listing only title character strings while omitting subtitle character strings composed of characters slightly smaller than the title characters. The present embodiment solves this problem by calculating extraction determination values at a plurality of levels using extraction parameters at a plurality of levels, and by extracting each title area in association with a level attribute (information indicating the level at which it was extracted). Only the points that differ from Embodiment 1 are described below.

【００５６】上記実施の形態１と同様の手順で領域平均
文字サイズSizeReg _nおよび全平均文字サイズSizeAll
を算出したタイトル領域抽出手段１０４は、以下に示す
複数段階の抽出判定式が成立するか否かに基づいて複数
段階の抽出判定を行う。Having calculated the area average character size SizeReg_n and the total average character size SizeAll by the same procedure as in Embodiment 1, the title area extracting means 104 performs a multi-level extraction determination based on whether the following multi-level extraction determination formula holds.

【００５７】[0057]

【数４】 (Equation 4)

【００５８】上式におけるα_pは、ｐ段階（レベルｐ）
の抽出パラメータであり、〔数５〕の条件を満たすよう
に値を設定しておく。例えば、５段階の抽出判定を行う
場合には、α₁=1.5 、α₂=1.3 、α₃=1.2 、α₄=1.15、
α₅=1.1 程度とするのが好ましい。In the above formula, α_p is the extraction parameter of the p-th level (level p), and its value is set so as to satisfy the condition of [Equation 5]. For example, when a five-level extraction determination is performed, values of about α₁ = 1.5, α₂ = 1.3, α₃ = 1.2, α₄ = 1.15, and α₅ = 1.1 are preferable.

【００５９】[0059]

【数５】 (Equation 5)

【００６０】図３に示すフローチャートを用いて説明す
ると、タイトル領域抽出手段１０４は、レベル１から順
に全レベルの抽出判定を行い（図３、ステップ１４→１
５→１４）、全レベルにおいて抽出判定式が成立しなか
った場合には、この領域をタイトル領域として抽出せ
ず、次の領域について抽出判定を行う（図３、ステップ
１４→１３→１４→１５）。一方、いずれかのレベルに
おいて抽出判定式が成立した場合には、この領域を当該
レベルのタイトル領域として（上記レベル属性を対応付
けて）抽出した後、次の領域について抽出判定を行う
（図３、ステップ１５→１６→１３→１４→１５）。Referring to the flowchart shown in FIG. 3, the title area extracting means 104 performs the extraction determination for all levels in order from level 1 (FIG. 3, steps 14 → 15 → 14). If the extraction determination formula does not hold at any level, the area is not extracted as a title area and the extraction determination proceeds to the next area (FIG. 3, steps 14 → 13 → 14 → 15). If the extraction determination formula holds at some level, the area is extracted as a title area of that level (in association with the level attribute), and the extraction determination then proceeds to the next area (FIG. 3, steps 15 → 16 → 13 → 14 → 15).
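
The level-by-level test described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes the extraction determination formula has the form SizeReg_n > α_p × SizeAll (Equation 4 is not reproduced in this text), and the function name `classify_level` is hypothetical. The α values are the example values given for the five-level case.

```python
from typing import Optional, Sequence


def classify_level(size_reg: float, size_all: float,
                   alphas: Sequence[float]) -> Optional[int]:
    """Return the first level p (1-based) whose test passes, or None.

    Assumed test for level p: size_reg > alphas[p - 1] * size_all.
    """
    for p, alpha in enumerate(alphas, start=1):
        if size_reg > alpha * size_all:
            return p          # extracted as a title area of level p
    return None               # not a title area at any level


# Example values from the text for a five-level determination.
alphas = [1.5, 1.3, 1.2, 1.15, 1.1]
size_all = 10.0               # total average character size (made-up value)
levels = [classify_level(s, size_all, alphas) for s in (16.0, 12.5, 10.2)]
print(levels)
```

Because the levels are tested from level 1 downward and the parameters decrease, an area is assigned the most demanding level it satisfies.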

【００６１】以上の手順を繰り返し全ての領域について
抽出判定が行われると（図３、ステップ１３で“Ｎ
Ｏ”）、タイトル領域抽出処理を終了する。When the above procedure has been repeated and the extraction determination has been performed for all areas (“NO” in step 13 of FIG. 3), the title area extraction processing ends.

【００６２】なお、抽出判定式の成立する領域が存在し
なかった場合ユーザが入力した文字列をタイトル文字列
として用いる点は上記実施の形態１と同様であり、この
タイトル文字列のレベル属性はレベル１、全レベル数も
１としている。As in Embodiment 1, when no area satisfies the extraction determination formula, the character string input by the user is used as the title character string. The level attribute of this title character string is set to level 1, and the total number of levels is also set to 1.

【００６３】また、抽出された上記タイトル文字列を変
更・修正できる点についても実施の形態１と同様であ
る。As in Embodiment 1, the extracted title character string can also be changed or corrected.

【００６４】図６は、本実施の形態における登録情報管
理テーブル１０８ｃの説明図であり、上記実施の形態１
において示した構成（フィールド５０１〜５０５）に
「レベル属性」フィールド６０１と「全レベル数」フィ
ールド６０２とを加えた構成としている。そして文書登
録手段１０６は、例えば５段階の抽出判定においてレベ
ル１で抽出された領域がある場合、この領域に対応する
「全レベル数」フィールド６０２には“５”を、「レベ
ル属性」フィールド６０１には“１”をそれぞれ登録す
る。FIG. 6 is an explanatory diagram of the registration information management table 108c in the present embodiment. It adds a “level attribute” field 601 and a “total number of levels” field 602 to the configuration shown in Embodiment 1 (fields 501 to 505). For example, when an area is extracted at level 1 in a five-level extraction determination, the document registration means 106 registers “5” in the “total number of levels” field 602 and “1” in the “level attribute” field 601 corresponding to this area.

【００６５】図１１は、本実施の形態の検索時において
ディスプレイに表示される内容を示す図であり、上段に
リスト表示するタイトルのレベル属性を指示入力手段１
０９より範囲指定できるようにしている。そして、表示
制御手段１１０は、登録情報管理テーブル１０８ｃの
「レベル属性」フィールド６０１と「全レベル数」フィ
ールド６０２とを参照することによって上記のように指
定された範囲内のタイトルのみをディスプレイにリスト
表示する。FIG. 11 is a diagram showing the contents displayed on the display during a search in the present embodiment. In the upper part, a range of level attributes of the titles to be listed can be specified from the instruction input means 109. The display control means 110 then refers to the “level attribute” field 601 and the “total number of levels” field 602 of the registration information management table 108c and lists on the display only the titles within the specified range.
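
As a rough illustration of this range-limited listing, the sketch below filters table rows by their level attribute. The record layout and the names `TitleRecord` and `titles_in_range` are hypothetical, and the passage does not spell out how the “total number of levels” field enters the comparison, so the sketch only carries that field along and filters on the level attribute.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TitleRecord:
    title: str
    level: int          # "level attribute" field 601
    total_levels: int   # "total number of levels" field 602


def titles_in_range(records: List[TitleRecord], lo: int, hi: int) -> List[str]:
    """Return the titles whose level attribute lies in [lo, hi]."""
    return [r.title for r in records if lo <= r.level <= hi]


records = [
    TitleRecord("Annual report", 1, 5),
    TitleRecord("1. Overview", 3, 5),
    TitleRecord("1.1 Scope", 5, 5),
]
print(titles_in_range(records, 1, 3))
```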

【００６６】以上のように本実施の形態によれば、複数
段階の抽出パラメータを用いて複数段階の抽出判定値を
算出するとともにレベル属性と対応付けてタイトル領域
を抽出する構成としているため、サブタイトル文字列は
リスト表示せずタイトル文字列のみをリスト表示する処
理など領域平均文字サイズの大小に基づいて、異なる処
理を行うことができる。As described above, according to the present embodiment, extraction determination values at a plurality of levels are calculated using extraction parameters at a plurality of levels, and each title area is extracted in association with a level attribute. Different processing can therefore be performed depending on the magnitude of the area average character size, for example listing only title character strings without listing subtitle character strings.

【００６７】（実施の形態３）上記実施の形態２では、
複数段階の抽出パラメータを予め設定する（固定値とす
る）構成としているが、このような抽出パラメータは入
力された文書画像の特性に応じて決定するのが好まし
い。本実施の形態では、領域平均文字サイズの最大値を
全平均文字サイズで除算した値に基づいて複数段階の抽
出パラメータを決定する（図４、ステップ２３参照）よ
うにしており、以下、その構成を実施の形態２と異なる
点のみ説明する。(Embodiment 3) In Embodiment 2 described above, the multi-level extraction parameters are set in advance (as fixed values), but it is preferable to determine such extraction parameters according to the characteristics of the input document image. In the present embodiment, the multi-level extraction parameters are determined from the value obtained by dividing the maximum area average character size by the total average character size (see step 23 in FIG. 4). Only the points that differ from Embodiment 2 are described below.

【００６８】上記の実施の形態２と同様の手順で領域平
均文字サイズSizeReg _nおよび全平均文字サイズSizeAl
l を算出したタイトル領域抽出手段４は、まず、領域平
均文字サイズの最大値max ｛SizeReg _n｝を全平均文字
サイズSizeAll で除算した値α₁を次式によって算出す
る。Having calculated the area average character size SizeReg_n and the total average character size SizeAll by the same procedure as in Embodiment 2, the title area extracting means 4 first calculates the value α₁ by dividing the maximum area average character size max{SizeReg_n} by the total average character size SizeAll, using the following equation.

【００６９】[0069]

【数６】 (Equation 6)

【００７０】次いで、タイトル領域抽出手段４は、上記
のように算出したα₁と当該抽出判定の全レベル数Ｐ(P
>=1)とから、各レベルの抽出パラメータα_pを次式によ
って決定する。Next, from α₁ calculated as described above and the total number of levels P (P >= 1) of the extraction determination, the title area extracting means 4 determines the extraction parameter α_p of each level by the following equation.

【００７１】[0071]

【数７】 (Equation 7)

【００７２】例えばα₁が1.5 で５段階の抽出判定を行
う場合、各レベルの抽出パラメータα₁〜α₅は以下の
ようになる。For example, when α₁ is 1.5 and a five-level extraction determination is performed, the extraction parameters α₁ to α₅ of the levels are as follows.

【００７３】[0073]

【数８】 (Equation 8)

【００７４】このように〔数７〕によれば、上記のよう
に算出したα₁から1.0 の間で等間隔になるように各レ
ベルの抽出パラメータα_pを決定することができる。Thus, according to [Equation 7], the extraction parameter α_p of each level can be determined so that the levels are equally spaced between the α₁ calculated as described above and 1.0.
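
The equal spacing can be illustrated with a short sketch. Equation 7 itself is not reproduced in this text, so the spacing rule used here is an assumption: a step of (α₁ − 1)/P, which keeps every level strictly above 1.0 and remains defined for P = 1. The function name `extraction_params` is hypothetical, and the printed values are not claimed to match the source's Equation 8.

```python
from typing import List


def extraction_params(alpha1: float, total_levels: int) -> List[float]:
    """Spread P parameters evenly from alpha1 down toward 1.0.

    Assumed rule: alpha_p = alpha1 - (p - 1) * (alpha1 - 1) / P.
    """
    if total_levels < 1:
        raise ValueError("total_levels must be at least 1")
    step = (alpha1 - 1.0) / total_levels
    return [alpha1 - (p - 1) * step for p in range(1, total_levels + 1)]


params = extraction_params(1.5, 5)
print([round(a, 3) for a in params])
```

Here α₁ would come from max{SizeReg_n} / SizeAll as described for Equation 6.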

【００７５】以降の手順は、上記のように決定した抽出
パラメータを用いて抽出判定を行う点を除いて実施の形
態２と同様であるため説明を省略する。The subsequent procedure is the same as that of the second embodiment except that the extraction determination is performed using the extraction parameters determined as described above, and therefore the description is omitted.

【００７６】ただし上記した方法には、文書画像内にタ
イトル領域が存在しない場合、α₁が例えば1.03など1.0
付近の値となるため本文の領域をタイトル領域として
誤抽出してしまうという不具合がある。そこで本発明で
は、例えば1.05など所定値以下となる抽出パラメータは
採用しないようにしている。However, with the method described above, when no title area exists in the document image, α₁ takes a value close to 1.0, for example 1.03, so a body-text area is erroneously extracted as a title area. In the present invention, therefore, an extraction parameter that is equal to or less than a predetermined value, for example 1.05, is not adopted.

【００７７】また、各レベル間の抽出パラメータの差が
例えば0.03など所定値以下となると、良好な抽出判定が
できないため、上記抽出パラメータの差が上記所定値
（0.03）となるように抽出パラメータの設定値を修正す
るようにしている。すなわち上記の場合、α₁から順に
0.03ずつ減算した値を各レベルの抽出パラメータとして
設定する。Furthermore, when the difference between the extraction parameters of adjacent levels is equal to or less than a predetermined value, for example 0.03, a good extraction determination cannot be made, so the extraction parameter settings are corrected so that the difference equals the predetermined value (0.03). That is, in this case, the values obtained by successively subtracting 0.03 from α₁ are set as the extraction parameters of the levels.

【００７８】以上の結果全レベル数Ｐが減少する場合も
あるが、このような場合には、実際のレベル数（全レベ
ル数Ｐから減少レベル数を減じた値）を全レベル数Ｐと
して登録情報管理テーブル１０８ｃの「全レベル数」フ
ィールド６０２に設定する。As a result of the above, the total number of levels P may decrease. In such a case, the actual number of levels (the total number of levels P minus the number of levels removed) is set as the total number of levels P in the “total number of levels” field 602 of the registration information management table 108c.
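
The two safeguards and the resulting reduction in the number of levels can be sketched together. This is a sketch under stated assumptions: the initial spacing (α₁ − 1)/P is assumed because Equation 7 is not reproduced here, the thresholds 1.05 and 0.03 are the example values from the text, and `adjust_params` is a hypothetical name.

```python
from typing import List


def adjust_params(alpha1: float, total_levels: int,
                  min_gap: float = 0.03, floor: float = 1.05) -> List[float]:
    """Return the extraction parameters that survive both safeguards."""
    step = (alpha1 - 1.0) / total_levels      # assumed initial spacing
    if step <= min_gap:
        step = min_gap                        # subtract 0.03 successively
    params = [alpha1 - k * step for k in range(total_levels)]
    # Parameters at or below the floor are not adopted.
    return [a for a in params if a > floor]


kept = adjust_params(1.12, 5)
print(len(kept), [round(a, 2) for a in kept])
```

The length of the returned list plays the role of the actual number of levels that would be written to the “total number of levels” field 602. An empty list corresponds to the case where no title area is extracted.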

【００７９】以上のように本実施の形態によれば、抽出
パラメータを固定値とするのではなく、入力された文書
画像の特性に応じて決定する構成としているため良好な
抽出判定を行うことができる。As described above, according to the present embodiment, the extraction parameters are not fixed values but are determined according to the characteristics of the input document image, so a good extraction determination can be made.

【００８０】（実施の形態４）上記の各実施の形態にお
いては、全平均文字サイズの算出に比較的サイズの大き
いタイトル領域の文字も算入され、また、サイズの小さ
いコンマ、ピリオド、句読点も算入されるので、精度が
低くなる傾向がある。そこで、文書画像の全文字から、
所定割合（例えば９０％）より大きいサイズの文字と、
所定割合（例えば１０％）より小さいサイズの文字を除
外した文字から全平均文字サイズを算出する、いわゆる
トリム平均を利用する。更に、領域平均文字サイズを算
出するときにも、同様の問題が発生するところから、領
域平均文字サイズの算出についても上記トリム平均を用
いることもできる。(Embodiment 4) In each of the above embodiments, the calculation of the total average character size includes the relatively large characters of title areas as well as small marks such as commas, periods, and other punctuation, so its accuracy tends to be low. Therefore, a so-called trimmed mean is used: the total average character size is calculated from the characters that remain after excluding, from all characters in the document image, those whose size is above a predetermined proportion (for example, 90%) and those whose size is below a predetermined proportion (for example, 10%). Since the same problem arises when calculating the area average character size, the trimmed mean can also be used for that calculation.

【００８１】これによって、全平均文字サイズ、および
領域平均文字サイズとも、ピリオド、コンマ、句読点を
除外した文字サイズを求めることができ、より精度の高
い値が得られることになる。As a result, both the total average character size and the area average character size can be obtained with periods, commas, and other punctuation marks excluded, which yields more accurate values.
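
A minimal sketch of such a trimmed mean follows. It interprets the 10% and 90% figures as rank cut-offs in the sorted list of character sizes, which is an assumption about the text; the function name `trimmed_mean` and the sample sizes are illustrative only.

```python
from typing import Sequence


def trimmed_mean(sizes: Sequence[float],
                 lower_pct: int = 10, upper_pct: int = 90) -> float:
    """Average the sizes left after dropping the smallest and largest tails."""
    if not sizes:
        raise ValueError("sizes must not be empty")
    ordered = sorted(sizes)
    n = len(ordered)
    lo = n * lower_pct // 100      # entries below this rank are dropped
    hi = n * upper_pct // 100      # entries from this rank upward are dropped
    kept = ordered[lo:hi] or ordered   # fall back for very short lists
    return sum(kept) / len(kept)


# One punctuation-sized box (2) and one title-sized box (30) among body text.
sizes = [2, 10, 10, 10, 10, 10, 10, 10, 10, 30]
print(trimmed_mean(sizes), sum(sizes) / len(sizes))
```

In this example the plain mean is pulled upward by the single large character, while the trimmed mean stays at the body-text size.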

【００８２】ここで、上記各実施の形態では領域平均文
字サイズより、全平均文字サイズを算出しているが、同
じ方法をこのトリム平均を用いる場合に適用すると、領
域毎にサイズの大きい文字と小さい文字を除外すること
になるため、全平均文字サイズの算出においてタイトル
領域に含まれるすべての文字を除外することができな
い。従ってここでは、全平均文字サイズを算出するとき
に、あらためて文書画像中の全文字を対象として処理を
行っている。In each of the above embodiments, the total average character size is calculated from the area average character sizes. If the same approach were applied with the trimmed mean, large and small characters would be excluded area by area, so not all of the characters contained in a title area could be excluded from the calculation of the total average character size. Here, therefore, the total average character size is calculated by processing all characters in the document image afresh.

【００８３】但し、このトリム平均を用いる方式を使用
するにしても、上記抽出パラメータとして、実施の形態
１の所定値、あるいは実施の形態２、３の段階値のいず
れを用いてもよいことはもちろんである。Even when the trimmed-mean method is used, the extraction parameter may of course be either the predetermined value of Embodiment 1 or the level values of Embodiments 2 and 3.

【００８４】なお、上記の各実施の形態の説明では、文
書画像となる文書の枚数については言及していないが、
紙文書の枚数は特に限定されるものではない。すなわ
ち、１枚であっても複数枚であっても、各頁に同じ抽出
パラメータを用いる限り同様の効果が得られる。特に、
実施の形態２、３においては、複数頁に対して同じ抽出
パラメータ用いることにより、論文データのように複数
頁にわたる単一文書から、タイトル、サブタイトルを正
しく抽出することができる。The description of the above embodiments does not mention the number of sheets of the document from which document images are obtained, but the number of sheets of the paper document is not particularly limited. That is, whether there is one sheet or several, the same effect is obtained as long as the same extraction parameters are used for every page. In particular, in Embodiments 2 and 3, using the same extraction parameters for a plurality of pages makes it possible to extract titles and subtitles correctly from a single document spanning a plurality of pages, such as a research paper.

【００８５】また、上記の説明では、文字矩形の高さを
文字サイズとして採用することとしているが、文字矩形
の幅・面積を文字サイズとして採用してもよい。In the above description, the height of the character rectangle is used as the character size, but the width or the area of the character rectangle may be used instead.

【００８６】尚、図１の説明において、記憶手段１０８
の前段と画像メモリ１０７の前段に画像処理手段１１１
ａ、１１１ｂを設けて、タイトル抽出用の文書画像は２
値画像データを用い、記憶手段１０８の文書画像エリア
Ａａに登録される文書画像データとして、圧縮画像ある
いは多値画像データを用いることができるようになって
いる。これによって、上記のように抽出されたタイトル
に基づく検索処理の結果得られた文書画像をカラーで表
示する等の多様な表示方法が可能となる。In the configuration of FIG. 1, image processing means 111a and 111b are provided upstream of the storage means 108 and upstream of the image memory 107, so that binary image data is used as the document image for title extraction while compressed image data or multi-valued image data can be used as the document image data registered in the document image area Aa of the storage means 108. This enables a variety of display methods, such as displaying in color the document image obtained as a result of a search based on the titles extracted as described above.

【００８７】（実施の形態５）以下、実施の形態５及び
６ではユーザが紙文書に付したマークを文書タグ情報と
して自動的に付与する文書画像処理装置に関して説明す
る。(Embodiment 5) Embodiments 5 and 6 below describe a document image processing apparatus that automatically attaches, as document tag information, a mark that the user has placed on a paper document.

【００８８】先ず、紙文書を構成するいずれかのページ
にタイトルやキーワード等よりなるマークがユーザによ
って付される。ここで、マークとはスタンプ、シール、
イラスト、一定の筆跡による署名等、ユーザが紙文書を
識別することを意図して付すマーク一般を指すこととす
る。First, the user places a mark representing a title, keyword, or the like on one of the pages constituting the paper document. Here, a mark means any mark that the user attaches with the intention of identifying the paper document, such as a stamp, a sticker, an illustration, or a signature in consistent handwriting.

【００８９】本発明の文書画像処理装置に、多数のペー
ジからなる紙文書を記憶させるにあたって、この紙文書
のどのページに上記マークが付されているかを判別する
必要がある。この際、上記紙文書の全ページを検索して
上記マークを検出する方法も考えられるが、検出処理に
時間がかかるという問題がある。When a paper document consisting of many pages is stored in the document image processing apparatus of the present invention, it is necessary to determine which page of the paper document bears the mark. One conceivable method is to search every page of the paper document for the mark, but the detection processing then takes a long time.

【００９０】このような問題を解決する方法としては、
例えば、１ページ目のみに上記マークの検出を行うよ
う、予め文書画像処理装置に設定しておくことなどが挙
げられる。One way to solve this problem is, for example, to configure the document image processing apparatus in advance so that the mark is detected only on the first page.

【００９１】本発明の実施の形態においては、図１３
（ｂ）に示すように、上記マークを付したページ（「以
下「文書タグ情報指定ページ」と呼ぶ）２１、２４に対
しては、右下の特定位置に特定の２次元コード画像２６
を記載することによって、この文書タグ情報指定ページ
を判別することにしている。In the embodiment of the present invention, as shown in FIG. 13(b), a specific two-dimensional code image 26 is printed at a specific position at the lower right of each page bearing the mark (hereinafter called a “document tag information designation page”) 21, 24, and the document tag information designation page is identified by this code.

【００９２】図１は本発明の実施の形態５による文書画
像処理装置のブロック図であり、以下、この文書画像処
理装置の行う処理の手順について説明する。FIG. 1 is a block diagram of a document image processing apparatus according to a fifth embodiment of the present invention. Hereinafter, a procedure of processing performed by the document image processing apparatus will be described.

【００９３】先ず、画像入力手段１２０１では、スキャ
ナやディジタル複合機などの光電変換装置を用いて紙文
書を電子化し、文書画像として入力する。ここでは、図
１３に示すように、入力画像２２及び２３に文書タグ情
報指定ページ２１に付された「極秘」「Ａ社」「９９年
度」の文書タグ情報を、入力画像２５に文書タグ情報指
定ページ２４に付された「極秘」「Ｂ社」の文書タグ情
報を付与することとする。そして、画像入力手段１２０
１には、文書タグ情報指定ページ２１、入力画像２２、
２３、文書タグ情報指定ページ２４、入力画像２５の順
に入力するようにしておく。First, the image input means 1201 digitizes a paper document using a photoelectric conversion device such as a scanner or a digital multifunction peripheral and inputs it as document images. Here, as shown in FIG. 13, the document tag information “top secret”, “Company A”, and “fiscal year 1999” attached to the document tag information designation page 21 is to be given to the input images 22 and 23, and the document tag information “top secret” and “Company B” attached to the document tag information designation page 24 is to be given to the input image 25. The pages are input to the image input means 1201 in the following order: document tag information designation page 21, input image 22, input image 23, document tag information designation page 24, input image 25.

【００９４】ここで入力された文書画像は、一旦、画像
メモリ１２０２に格納され、更に、画像データ圧縮処理
手段１２０３においてデータ圧縮が施された後、記憶手
段１２１０の画像記憶領域１２１１に記憶される。この
とき、記憶された各文書画像を特定できるように、それ
ぞれの文書画像に画像ＩＤを付与し、この画像ＩＤを図
１３（ａ）に示す登録管理テーブル１２１２の「画像Ｉ
Ｄ」フィールド１２１に格納する。また、記憶手段１２
１０の画像記憶領域１２１１に記憶された画像データへ
のポインタ情報を、登録画像管理テーブル１２１２の
「画像データへのポインタ」フィールド１２２に格納す
る。The input document image is temporarily stored in the image memory 1202, compressed by the image data compression processing means 1203, and then stored in the image storage area 1211 of the storage means 1210. At this time, an image ID is assigned to each document image so that each stored document image can be identified, and this image ID is stored in the “image ID” field 121 of the registered image management table 1212 shown in FIG. 13(a). In addition, pointer information to the image data stored in the image storage area 1211 of the storage means 1210 is stored in the “pointer to image data” field 122 of the registered image management table 1212.

【００９５】また、画像メモリ１２０２に格納された上
記文書画像は、画像２値化処理手段１２０４において２
値化された後、マーク抽出手段１２０５にも送られる。
このマーク抽出手段１２０５では、先ず、画像右下の予
め決められた位置に特定の２次元コード画像が存在する
か否かを判定することによって、入力された各文書画像
が文書タグ情報指定ページであるかどうかを判定する。The document image stored in the image memory 1202 is also binarized by the image binarization processing means 1204 and then sent to the mark extracting means 1205. The mark extracting means 1205 first determines whether each input document image is a document tag information designation page by checking whether the specific two-dimensional code image is present at the predetermined position at the lower right of the image.

【００９６】このとき、文書タグ情報指定ページと判断
された文書画像に関しては、上記登録画像管理テーブル
１２１２の「文書タグ情報指定ページフラグ」フィール
ド１２３に「１」を、そうでない場合には「０」を格納
する。このフラグは上記文書画像がマークのみが付され
た文書タグ情報指定ページであって、紙文書の文書とし
ての内容を含んでいないことを識別するために用いられ
る。例えば、後述する方法によって文書画像に対して文
書タグ情報が付与された後は、このフラグに基づいて、
文書タグ情報指定ページに該当する文書画像を削除する
ようにすれば、メモリ資源の節約になる。For a document image determined to be a document tag information designation page, “1” is stored in the “document tag information designation page flag” field 123 of the registered image management table 1212; otherwise “0” is stored. This flag identifies a document image as a document tag information designation page that bears only marks and contains none of the actual content of the paper document. For example, once document tag information has been given to the document images by the method described later, deleting the document images that correspond to document tag information designation pages on the basis of this flag saves memory resources.

【００９７】そして、ある文書タグ情報指定ページが入
力されてから次の文書タグ情報指定ページが入力される
までに入力された全ての文書画像に対して、同一のマー
ク管理グループ番号を付与する。更に、このマーク管理
グループ番号を上記登録画像管理テーブル１２１２の
「マーク管理グループ番号」フィールド１２５に格納す
る。ここで、同一のマーク管理グループ番号が付与され
た文書画像には、同一の文書タグ情報が付与されること
を意味している。The same mark management group number is assigned to all document images input from the time one document tag information designation page is input until the next document tag information designation page is input, and this number is stored in the “mark management group number” field 125 of the registered image management table 1212. Document images that share a mark management group number are given the same document tag information.
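
The grouping rule can be sketched as a single pass over the pages in input order. The function name `assign_groups` is hypothetical, and the use of group number 0 for pages that arrive before any designation page is an assumption not stated in the text.

```python
from typing import List, Sequence


def assign_groups(is_tag_page: Sequence[bool]) -> List[int]:
    """Give each page the group number of the latest designation page."""
    groups = []
    current = 0                 # 0 = no designation page seen yet (assumption)
    for flag in is_tag_page:
        if flag:
            current += 1        # a designation page opens a new group
        groups.append(current)
    return groups


# Input order from the example: page 21 (tag), 22, 23, page 24 (tag), 25.
flags = [True, False, False, True, False]
print(assign_groups(flags))
```

With this numbering, input images 22 and 23 share the group of designation page 21, and input image 25 shares the group of designation page 24.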

【００９８】次に、上記の処理によって文書タグ情報指
定ページと判断された文書画像から、マーク抽出手段１
２０５がマークを抽出する処理について説明する。Next, the processing by which the mark extracting means 1205 extracts marks from a document image determined to be a document tag information designation page by the above processing is described.

【００９９】先ず、文書タグ情報指定ページのうち、上
記２次元コードが付された領域を除く全ての領域に対し
て、実施の形態１で説明したラベリング処理を行う。そ
してラベリング処理で得られた複数の黒画素連結成分の
うち、相互の距離が特定の閾値よりも小さい成分に関し
ては統合して１つの領域とする。このようにして得られ
た各領域は、図１６に示すように、それぞれ各マークの
領域４１〜４３に対応しており、これらの領域を抽出す
ることによって、各マーク画像を得ることができる。First, the labeling process described in Embodiment 1 is applied to the entire document tag information designation page except the area bearing the two-dimensional code. Among the black-pixel connected components obtained by the labeling process, components whose mutual distance is smaller than a specific threshold are merged into one area. As shown in FIG. 16, the areas obtained in this way correspond to the areas 41 to 43 of the individual marks, and each mark image is obtained by extracting these areas.
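
The merging step can be sketched with bounding boxes standing in for the connected components. This is a simplification and an assumption: the text does not say how the distance between components is measured, so the sketch uses the gap between axis-aligned bounding boxes `(x0, y0, x1, y1)`. The names `box_gap` and `merge_boxes` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1)


def box_gap(a: Box, b: Box) -> float:
    """Euclidean gap between two boxes (0 if they touch or overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return (dx * dx + dy * dy) ** 0.5


def merge_boxes(boxes: List[Box], threshold: float) -> List[Box]:
    """Repeatedly merge any two boxes closer than the threshold."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if box_gap(merged[i], merged[j]) < threshold:
                    a, b = merged[i], merged[j]
                    merged[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                 max(a[2], b[2]), max(a[3], b[3]))
                    del merged[j]
                    changed = True
                    break
            if changed:
                break             # restart the scan after each merge
    return merged


boxes = [(0, 0, 10, 10), (12, 0, 20, 10), (100, 100, 110, 110)]
print(merge_boxes(boxes, threshold=5))
```

Each box left after merging would correspond to one mark area, and the count of such boxes to the number of marks on the page.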

【０１００】ここで、各文書タグ情報指定ページから抽
出されたマークの個数を、上記登録画像管理テーブル１
２１２の「マーク数」フィールド１２４に格納する。The number of marks extracted from each document tag information designation page is stored in the “number of marks” field 124 of the registered image management table 1212.

【０１０１】また、抽出された各マーク画像の情報を管
理するために、各マーク画像にマークＩＤを付与し、図
１４に示すような、マーク管理テーブル１２１３の「マ
ークＩＤ」フィールド１３１に格納する。更に、各マー
クが付されていた文書タグ情報指定ページのマーク管理
グループ番号を、上記マーク管理テーブル１２１３の
「マーク管理グループ番号」フィールド１３２に格納す
る。また、各文書タグ情報指定ページから抽出されたマ
ーク画像の該文書タグ情報指定ページ内での位置、サイ
ズ（幅、高さ）の情報を、それぞれ上記マーク管理テー
ブル１２１３の「位置」フィールド１３４、「サイズ」
フィールド１３５に格納する。To manage the information on each extracted mark image, a mark ID is assigned to each mark image and stored in the “mark ID” field 131 of the mark management table 1213 shown in FIG. 14. The mark management group number of the document tag information designation page on which each mark was placed is stored in the “mark management group number” field 132 of the mark management table 1213. The position and size (width and height) of each mark image within its document tag information designation page are stored in the “position” field 134 and the “size” field 135 of the mark management table 1213, respectively.

【０１０２】本実施の形態では、最初の文書タグ情報指
定ページと、次の文書タグ情報指定ページとの間に入力
された文書画像には同一のマーク管理グループ番号を付
与し、上記文書画像を上記最初の文書タグ情報指定ペー
ジに付随する一連の文書画像として管理している。この
他にも文書タグ情報指定ページの次に入力された特定の
文書画像にのみにマーク管理グループ番号を付与し、そ
の他の文書画像にはマーク管理グループ番号を付与しな
い管理方法も考えられる。これは例えば上記特定の文書
画像に目次を付けたい場合などに利用される管理方法で
ある。In the present embodiment, the document images input between one document tag information designation page and the next are given the same mark management group number and are managed as a series of document images belonging to the first of those designation pages. Another conceivable management method assigns a mark management group number only to a specific document image input immediately after the document tag information designation page and assigns none to the other document images. This method is used, for example, when a table of contents is to be attached to that specific document image.

【０１０３】次に、算出手段１２０Ａの特徴量算出手段
１２０６では、マーク抽出手段１２０５において抽出さ
れた各マーク画像の特徴を表す数値を算出する。ここで
はこの数値として、公知の技術であるモーメント・イン
バリアント（Moment Invariants ）における特徴量を利
用する。以下、このモーメント・インバリアントについ
て簡単に説明する。Next, the feature amount calculating means 1206 of the calculating means 120A calculates numerical values representing the features of each mark image extracted by the mark extracting means 1205. The feature amounts of moment invariants, a known technique, are used as these values. Moment invariants are briefly described below.

【０１０４】ｉ，ｊを画素の座標、Ｉ（ｉ，ｊ）をその
画素値、即ち、黒画素についてはＩ＝１、白画素につい
てはＩ＝０の値を持つ関数とする。そして〔数９〕で定
義されるｍ_pqを（ｐ＋ｑ）次のモーメントと呼ぶ。Let i and j be the coordinates of a pixel and let I(i, j) be its pixel value, that is, a function taking the value I = 1 for a black pixel and I = 0 for a white pixel. Then m_pq defined by [Equation 9] is called the moment of order (p + q).

【０１０５】[0105]

【数９】 (Equation 9)

【０１０６】ここで、この m_pqを用いると、２次元画像
の重心（ｘ，ｙ）は〔数１０〕で表される。Using m_pq, the center of gravity (x, y) of the two-dimensional image is expressed by [Equation 10].

【０１０７】[0107]

【数１０】 (Equation 10)

【０１０８】このようにして算出された重心に基づい
て、〔数１１〕で定義されるμ_pqを中心モーメントと言
う。Based on the center of gravity calculated in this way, μ_pq defined by [Equation 11] is called the central moment.

【０１０９】[0109]

【数１１】 (Equation 11)

【０１１０】そしてこの中心モーメントに基づき、〔数
１２〕によって以下のように算出される数値M1〜M6を、
当該２次元画像の（モーメント・インバリアントにおけ
る）特徴量と定義する。Based on these central moments, the numerical values M1 to M6 calculated by [Equation 12] as follows are defined as the feature amounts (in the sense of moment invariants) of the two-dimensional image.

【０１１１】[0111]

【数１２】 (Equation 12)

【０１１２】これらの特徴量は当該２次元画像が回転や
平行移動した場合にも不変となるため、本発明の実施の
形態のように、ユーザが手作業で特定のマークを用紙の
上に付すような場合において、このマークを特徴付ける
のに有効な数値となるのである。Because these feature amounts remain unchanged when the two-dimensional image is rotated or translated, they are effective values for characterizing a mark in cases such as the embodiments of the present invention, where the user places a specific mark on the sheet by hand.
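
The chain from moments to invariant features can be illustrated with a small sketch. Equations 9 to 12 are not reproduced in this text, so the code uses the standard textbook definitions of raw moments, centroid, and central moments, and computes only two well-known rotation-invariant combinations of second-order central moments. These two values are an illustration and are not claimed to be the source's M1 to M6. All function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]   # 1 = black pixel, 0 = white pixel


def raw_moment(img: Image, p: int, q: int) -> float:
    """m_pq = sum over pixels of i^p * j^q * I(i, j)."""
    return sum((i ** p) * (j ** q) * v
               for i, row in enumerate(img) for j, v in enumerate(row))


def central_moment(img: Image, p: int, q: int) -> float:
    """mu_pq: the same sum taken about the center of gravity."""
    m00 = raw_moment(img, 0, 0)           # must be non-zero (non-blank image)
    cx = raw_moment(img, 1, 0) / m00
    cy = raw_moment(img, 0, 1) / m00
    return sum(((i - cx) ** p) * ((j - cy) ** q) * v
               for i, row in enumerate(img) for j, v in enumerate(row))


def invariants(img: Image) -> Tuple[float, float]:
    """Two combinations unchanged by translation and rotation."""
    mu20 = central_moment(img, 2, 0)
    mu02 = central_moment(img, 0, 2)
    mu11 = central_moment(img, 1, 1)
    return mu20 + mu02, (mu20 - mu02) ** 2 + 4 * mu11 ** 2


img = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
shifted = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
rotated = [list(r) for r in zip(*img[::-1])]   # 90-degree rotation
print(invariants(img), invariants(shifted), invariants(rotated))
```

The three printed pairs agree up to floating-point rounding, which is the property that makes such features suitable for hand-placed marks.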

【０１１３】このように特徴量算出手段１２０６により
算出された特徴量は、算出手段１２０Ａの類似度算出手
段１２０７に渡され、この特徴量と各標準タグ情報の属
性値との類似度が算出される。この方法を説明するため
に、以下では先ず、各標準タグ情報の管理方法及び各標
準タグ情報の属性値を算出する方法について説明する。The feature amounts calculated by the feature amount calculating means 1206 are passed to the similarity calculating means 1207 of the calculating means 120A, which calculates the similarity between these feature amounts and the attribute values of each item of standard tag information. To explain this method, the management of the standard tag information and the calculation of its attribute values are described first.

【０１１４】上記の標準タグ情報とは具体的にはユーザ
の使用が予測されるマーク（以下「標準マーク」と呼
ぶ）に関連付けられたデータであり、入力画像に対して
キーワード的な役割を果たす文字列等の文書タグ情報の
候補である。この標準タグ情報を、図１５（ａ）に示す
ような、標準タグ情報管理テーブル１２１４の「標準タ
グ情報」フィールド１４１に格納する。また、上記標準
マークの画像データは標準タグ情報蓄積手段１２１５に
格納されており、更に、この画像データへのポインタが
上記標準タグ情報管理テーブル１２１４の「標準マーク
へのポインタ」フィールド１４２に格納されている。ま
た、上記特徴量算出手段１２０６はこれら標準マークの
モーメント・インバリアントにおける６つの特徴量を算
出し、これら特徴量を標準タグ情報管理テーブル１２１
４の「属性値(M1 〜M6) 」フィールドに格納する。即
ち、この特徴量が、各標準マークの属性値となるのであ
る。Specifically, the standard tag information is data associated with marks that the user is expected to use (hereinafter called “standard marks”), and is a candidate for document tag information, such as a character string that serves as a keyword for an input image. The standard tag information is stored in the “standard tag information” field 141 of the standard tag information management table 1214 shown in FIG. 15(a). The image data of each standard mark is stored in the standard tag information storage means 1215, and a pointer to this image data is stored in the “pointer to standard mark” field 142 of the standard tag information management table 1214. The feature amount calculating means 1206 also calculates the six moment-invariant feature amounts of each standard mark and stores them in the “attribute values (M1 to M6)” field of the standard tag information management table 1214. These feature amounts thus serve as the attribute values of each standard mark.

【０１１５】このようにして算出された各標準マークの
属性値と、入力画像から抽出されたマーク画像の上記モ
ーメント・インバリアントにおける特徴量との距離を
〔数１３〕（最小２乗法）によって算出する。The distance between the attribute values of each standard mark calculated in this way and the moment-invariant feature amounts of the mark image extracted from the input image is calculated by [Equation 13] (the least squares method).

【０１１６】[0116]

【数１３】 (Equation 13)

【０１１７】ここで、M1〜M6は上記標準マークの属性
値、m1〜m6は抽出されたマーク画像の特徴量を表してい
る。上式によって算出された距離Ｌの値が小さいほど、
抽出されたマーク画像と標準タグ情報との類似度が高い
ことを示している。Here, M1 to M6 are the attribute values of the standard mark and m1 to m6 are the feature amounts of the extracted mark image. The smaller the distance L calculated by the above equation, the higher the similarity between the extracted mark image and the standard tag information.

【０１１８】次に文書タグ情報決定手段１２０８では、
上記類似度算出手段１２０７において算出された類似度
が最大となる標準マークを特定し、この標準マークの標
準タグ情報を入力された文書画像の文書タグ情報として
選択し、この文書画像に付与する。さらに、この文書タ
グ情報をマーク管理テーブル１２１３の「文書タグ情
報」フィールド１３３に格納する。Next, the document tag information determining means 1208 identifies the standard mark for which the similarity calculated by the similarity calculating means 1207 is highest, selects the standard tag information of that standard mark as the document tag information of the input document image, and attaches it to the document image. This document tag information is also stored in the “document tag information” field 133 of the mark management table 1213.
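
The distance calculation and the selection of the closest standard mark can be sketched as follows. Equation 13 is not reproduced in this text; the sketch assumes a sum of squared differences over the six values, in line with the reference to the least squares method. The names `distance` and `select_tag` and the numeric attribute values are made up for illustration.

```python
from typing import Dict, Sequence


def distance(attr: Sequence[float], feat: Sequence[float]) -> float:
    """Assumed form of L: sum over k of (M_k - m_k)^2."""
    return sum((M - m) ** 2 for M, m in zip(attr, feat))


def select_tag(standard: Dict[str, Sequence[float]],
               feat: Sequence[float]) -> str:
    """Return the standard tag whose attribute values are closest."""
    return min(standard, key=lambda tag: distance(standard[tag], feat))


# Made-up attribute values (M1..M6) for two standard marks.
standard = {
    "top secret": (1.0, 2.0, 3.0, 4.0, 5.0, 6.0),
    "Company A": (6.0, 5.0, 4.0, 3.0, 2.0, 1.0),
}
feat = (1.1, 2.0, 3.0, 4.0, 5.0, 6.2)   # features of an extracted mark
print(select_tag(standard, feat))
```

Smallest distance is treated as highest similarity, so the minimum over the standard marks gives the tag to attach.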

【０１１９】以上の処理を適用することにより、入力さ
れた各文書画像に自動的に文書タグ情報を付与すること
ができる。ここで得られた各テーブルの情報を用いる
と、次の手順に従って画像の検索を行うことができる。By applying the above processing, document tag information can be automatically added to each input document image. Using the information of each table obtained here, an image can be searched according to the following procedure.

【０１２０】先ず、ユーザが検索に使用する文書タグ情
報を指定すると、この文書タグ情報に関連付けられてい
るマーク管理グループ番号をマーク管理テーブル１２１
３から特定することができる。さらに、上記マーク管理
グループ番号が付与されている文書画像の画像ＩＤおよ
びこの文書画像データへのポインタの情報を、登録画像
管理テーブル１２１２から特定することができる。ここ
で特定された文書画像が、ユーザの指定した文書タグ情
報に関連付けられている画像となる。また、複数の文書
タグ情報を指定することにより、検索したい画像データ
を絞り込むこともできる。First, when the user specifies the document tag information to be used for the search, the mark management group number associated with that document tag information can be identified from the mark management table 1213. Then the image ID of each document image to which that mark management group number has been assigned, together with the pointer to its document image data, can be identified from the registered image management table 1212. The document images identified in this way are the images associated with the document tag information specified by the user. Specifying several pieces of document tag information narrows down the image data to be retrieved.
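The two-table lookup described above can be sketched as follows. The row layout and the key names (`group`, `tag`, `image_id`) are hypothetical stand-ins for the mark management table 1213 and the registered image management table 1212. Treating several specified tags as a logical AND is an assumption based on the statement that they narrow the result.

```python
from typing import Dict, Iterable, List, Set


def find_images(
    wanted_tags: Iterable[str],
    mark_table: List[dict],
    image_table: List[dict],
) -> List[str]:
    """Return the image IDs whose mark management group carries every wanted tag."""
    wanted = set(wanted_tags)

    # Step 1: mark management table -> tags attached to each group number.
    tags_by_group: Dict[int, Set[str]] = {}
    for row in mark_table:
        tags_by_group.setdefault(row["group"], set()).add(row["tag"])
    groups = {g for g, tags in tags_by_group.items() if wanted <= tags}

    # Step 2: registered image management table -> images in those groups.
    return [row["image_id"] for row in image_table if row["group"] in groups]
```

For instance, with group 1 tagged "top secret" and "Company A" and group 2 tagged "top secret" and "Company B", a search for "top secret" alone returns the images of both groups, while adding "Company A" narrows the result to group 1.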

【０１２１】次に、類似度算出手段１２０７において算
出された類似度が最大となる文書タグ情報に関しても、
抽出されたマーク画像との距離Ｌが予め指定された閾値
よりも大きかった場合には、このマーク画像に関連付け
るべき既存の文書タグ情報は存在せず、新規の標準マー
クが入力されたものと判断する。この場合、マーク管理
テーブル１２１３の「位置」フィールド１３４、「サイ
ズ」フィールド１３５および登録画像管理テーブル１２
１２の「画像データへのポインタ」フィールド１２２の
情報に基づいてマーク画像を表示し、ユーザに対してこ
の新規の標準マークを関連付けておく文書タグ情報を登
録するように促す。Next, even for the document tag information with the highest similarity calculated by the similarity calculating means 1207, if the distance L from the extracted mark image is greater than a predetermined threshold, it is determined that no existing document tag information should be associated with this mark image and that a new standard mark has been input. In this case, the mark image is displayed based on the information in the "position" field 134 and the "size" field 135 of the mark management table 1213 and the "pointer to image data" field 122 of the registered image management table 1212, and the user is prompted to register the document tag information to be associated with the new standard mark.
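The selection rule and the new-mark check can be sketched together. This is an illustration only: the dictionary standing in for the standard tag information management table, the function name `choose_standard_tag`, and the use of a sum of squared differences as the distance L are assumptions.

```python
from typing import Dict, Optional, Sequence, Tuple


def choose_standard_tag(
    features: Sequence[float],
    standard_marks: Dict[str, Sequence[float]],
    max_distance: float,
) -> Tuple[Optional[str], float]:
    """Pick the tag of the nearest standard mark, or None for a new mark.

    `standard_marks` maps each standard tag to its attribute values M1..M6.
    `max_distance` plays the role of the predetermined threshold on L.
    """
    best_tag: Optional[str] = None
    best_distance = float("inf")
    for tag, attributes in standard_marks.items():
        distance = sum((M - m) ** 2 for M, m in zip(attributes, features))
        if distance < best_distance:
            best_tag, best_distance = tag, distance
    if best_distance > max_distance:
        # Even the closest standard mark is too far away: treat the mark as
        # new, so the caller can ask the user to register tag information.
        return None, best_distance
    return best_tag, best_distance
```

Returning the distance alongside the tag lets the caller log or display how close the match was before prompting the user.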

【０１２２】ここで入力された文書タグ情報を、新たに
標準タグ情報管理テーブル１２１４の「標準タグ情報」
フィールド１４１に格納する。また、上記の新規に入力
された標準マークの画像データを以降の検索処理に利用
するために標準タグ情報蓄積手段１２１５に格納し、こ
のマーク画像データへのポインタ情報を標準タグ情報管
理テーブル１２１４の「マーク画像へのポインタ」フィ
ールド１４２に格納する。さらに、この新規の標準マー
クのモーメント・インバリアントにおける特徴量を算出
し、標準タグ情報管理テーブル１２１４の「属性値(M1
〜M6) 」フィールド１４３に格納する。The document tag information input here is newly stored in the "standard tag information" field 141 of the standard tag information management table 1214. The image data of the newly input standard mark is stored in the standard tag information storage unit 1215 for use in subsequent search processing, and a pointer to this mark image data is stored in the "pointer to mark image" field 142 of the standard tag information management table 1214. Further, the moment-invariant feature amounts of the new standard mark are calculated and stored in the "attribute value (M1 to M6)" field 143 of the standard tag information management table 1214.

【０１２３】以上のように、ユーザは新しいマーク画像
と文書タグ情報を入力するだけで、新規の標準タグ情報
を登録することができる。As described above, the user can register new standard tag information only by inputting a new mark image and document tag information.

【０１２４】なお、上記の説明において、図１４、図１
５（ａ）では、標準マークが関連付けられている標準タ
グ情報を、当該標準マークに使用されている文字列にし
ているが、これらは必ずしも文字列に限定する必要はな
い。すなわち、標準タグ情報管理テーブル１２１４にお
いて、各標準マークに任意の標準タグ情報を関連付ける
ことが可能である。In the above description, in FIG. 14 and FIG. 15(a), the standard tag information associated with each standard mark is the character string used in that standard mark, but it need not be limited to such a character string. That is, in the standard tag information management table 1214, arbitrary standard tag information can be associated with each standard mark.

【０１２５】例えば、上述のような文字列による標準タ
グ情報の代わりに、各標準マークの縮小画像を標準タグ
情報としてそれぞれの標準マークに関連付けておき、こ
の縮小画像を検索用シートに印刷しておく。そしてこの
検索用シートの縮小画像をスキャナで読み取らせること
により、所望の文書画像の検索を行うようにすることも
可能である。For example, instead of standard tag information in the form of a character string as described above, a reduced image of each standard mark may be associated with that standard mark as its standard tag information, and the reduced images may be printed on a search sheet. A desired document image can then be retrieved by having a scanner read a reduced image on the search sheet.

【０１２６】更に、図１３および図１６の説明図では、
全ての入力画像の中から文書タグ情報指定ページを特定
するために２次元コードを用いたが、１次元コード等を
用いても良い。他にも、文書タグ情報指定ページを特定
するための手段としては、２次元コード画像の替わりに
特定のマークを使用する方法や、特定カラーの用紙を使
用する方法、あるいは特定の形状やサイズの用紙を使用
する方法等によっても同様の効果が期待できる。Furthermore, in the explanatory diagrams of FIGS. 13 and 16, a two-dimensional code is used to identify the document tag information designation page among all input images, but a one-dimensional code or the like may be used instead. The same effect can also be expected from other means of identifying the document tag information designation page, such as using a specific mark in place of the two-dimensional code image, using paper of a specific color, or using paper of a specific shape or size.

【０１２７】また、全ての入力画像に同一の文書タグ情
報を付与する場合には、１枚目に入力する画像だけが文
書タグ情報指定ページ画像であると定義して文書画像処
理装置を構築することも可能である。この場合、１枚目
に入力される画像だけが文書タグ情報指定ページである
と分かっているので、２次元コード画像等を用いて文書
タグ指定画像を特定する処理は必要なくなり、全体の処
理を簡略化することができる。When the same document tag information is to be assigned to all input images, the document image processing apparatus can also be constructed so that only the first input image is defined as the document tag information designation page image. In this case, because it is known that only the first input image is the document tag information designation page, the process of identifying the document tag designation image using a two-dimensional code image or the like becomes unnecessary, and the overall processing can be simplified.

【０１２８】勿論、２次元コード画像等を用いずに、紙
文書の全てのページを検索してマークを抽出する方法も
可能である。この際に、ユーザが付したマークとは別
に、紙文書中の例えば「極秘」等の文字をマークとして
抽出してしまうことも起こり得る。このような場合には
当該文字も一つの上記のマークの１つとして登録画像管
理テーブル１２１３に追加すればよい。Of course, it is also possible to search all pages of a paper document and extract marks without using a two-dimensional code image or the like. At this time, apart from the mark added by the user, a character such as “top secret” in the paper document may be extracted as the mark. In such a case, the character may be added to the registered image management table 1213 as one of the marks.

【０１２９】なお、上記の説明ではモーメント・インバ
リアントにおける特徴量を用いてマーク画像と標準タグ
情報との関連付けを行ったが、２つの画像を重ねて一致
する黒画素の割合を比較するテンプレートマッチングを
用いて関連付けを行っても同様の効果が期待できる。In the above description, the mark image is associated with the standard tag information using moment-invariant feature amounts. The same effect can be expected if the association is instead performed by template matching, in which the two images are superimposed and the proportion of coinciding black pixels is compared.
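The template-matching alternative can be sketched as below. The text does not say what the proportion of coinciding black pixels is measured against; this sketch assumes the count of pixels black in both images divided by the count black in at least one of them, and assumes the two images have already been scaled to the same size. The function name `black_pixel_match_ratio` is hypothetical.

```python
from typing import List

Bitmap = List[List[int]]  # rows of pixels, 1 = black, 0 = white


def black_pixel_match_ratio(mark: Bitmap, template: Bitmap) -> float:
    """Overlay two equal-size binary images and score their black-pixel overlap."""
    if len(mark) != len(template) or any(
        len(row_m) != len(row_t) for row_m, row_t in zip(mark, template)
    ):
        raise ValueError("images must have identical dimensions")
    both = either = 0
    for row_m, row_t in zip(mark, template):
        for pm, pt in zip(row_m, row_t):
            if pm and pt:
                both += 1  # black in both images
            if pm or pt:
                either += 1  # black in at least one image
    return both / either if either else 1.0
```

Unlike the moment-invariant approach, this score changes when the mark is shifted or rotated, so the mark area would need to be aligned with the template before comparison.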

【０１３０】また、１つの標準タグ情報に複数の標準マ
ークを関連付けておくこともできる。これは、標準タグ
情報管理テーブル１２１４において、同一の標準タグ情
報を複数登録しておき、それぞれに異なる標準マークを
関連付けることによって実現できる。この場合、異なる
マークの付された紙文書を入力して、この入力された文
書画像に同一の文書タグ情報を付与することができる。Further, a plurality of standard marks can be associated with one standard tag information. This can be realized by registering a plurality of the same standard tag information in the standard tag information management table 1214 and associating each with a different standard mark. In this case, a paper document with a different mark can be input, and the same document tag information can be added to the input document image.

【０１３１】逆に、１つの標準マークが複数の標準タグ
情報に関連付けられるようにすることもできる。これ
は、標準タグ情報管理テーブル１２１４において、異な
る標準タグ情報に同一の標準マークを関連付けることに
よって実現できる。この場合、１つのマークの付された
紙文書を入力して、この入力された文書画像に複数の文
書タグ情報を付与することができる。Conversely, one standard mark may be associated with a plurality of standard tag information. This can be realized by associating the same standard mark with different standard tag information in the standard tag information management table 1214. In this case, a paper document with one mark can be input, and a plurality of document tag information can be added to the input document image.

【０１３２】（実施の形態６）本実施の形態では、登録
する紙文書の余白部分に押されたマークを抽出すること
により、文書画像に文書タグ情報を付与する形態とす
る。以下、図１２を参照しながら実施の形態５と異なる
点についてのみ説明する。(Embodiment 6) In this embodiment, a document image is given document tag information by extracting a mark pressed in a margin of a paper document to be registered. Hereinafter, only differences from the fifth embodiment will be described with reference to FIG.

【０１３３】まず、画像入力手段１２０１では、実施の
形態５と同様、ユーザが入力した紙文書を電子化して文
書画像を得る。ここでは、図１７（ｂ）に示すように、
文書画像３１および３２に「極秘」「Ａ社」「９９年
度」の文書タグ情報を、文書画像３３に「極秘」「Ｂ
社」の文書タグ情報を付与することとする。このため
に、各画像の余白部分には、それぞれ付与したい文書タ
グ情報に関連付けられているマークが付されている。First, as in the fifth embodiment, the image input means 1201 digitizes a paper document input by the user to obtain a document image. Here, as shown in FIG. 17(b), the document tag information "top secret", "Company A", and "fiscal year 99" is to be assigned to document images 31 and 32, and the document tag information "top secret" and "Company B" to document image 33. For this purpose, the margin of each image carries the marks associated with the document tag information to be assigned.

【０１３４】ここで得られた画像データは、一旦、画像
メモリ１２０２に格納され、さらに画像データ圧縮処理
手段１２０３においてデータが圧縮された後、記憶手段
１２１０の画像記憶領域１２１１に格納される。ここで
格納された画像データの情報として、図１７（ａ）に示
すように登録画像管理テーブル１２１２’の「画像Ｉ
Ｄ」フィールド１２１’および「画像データへのポイン
タ」フィールド１２２’に、それぞれ必要な情報を格納
することについては、実施の形態５と同様である。The image data obtained here is temporarily stored in the image memory 1202, compressed by the image data compression processing means 1203, and then stored in the image storage area 1211 of the storage means 1210. As in the fifth embodiment, the necessary information about the stored image data is stored in the "image ID" field 121' and the "pointer to image data" field 122' of the registered image management table 1212', as shown in FIG. 17(a).

【０１３５】また、画像メモリ１２０２の画像は、画像
２値化処理手段１２０４において２値化された後、マー
ク抽出手段１２０５’に送られる。本実施の形態では、
マーク画像の領域を確実に抽出することができるよう
に、図１７（ｂ）に示すような枠付きのマークを使用
し、上記マーク抽出手段１２０５’が以下の処理によっ
て各マークの抽出を行う。The image in the image memory 1202 is binarized by the image binarization processing means 1204 and then sent to the mark extraction means 1205'. In the present embodiment, so that the area of each mark image can be extracted reliably, framed marks as shown in FIG. 17(b) are used, and the mark extraction means 1205' extracts each mark by the following processing.

【０１３６】まず、各２値画像の黒画素に対して上述の
ラベリング処理を行い、さらに同一のラベル値が付与さ
れた黒画素連結成分毎に外接矩形のサイズを算出してお
く。このとき、マークの枠の部分に対応する黒画素連結
成分の外接矩形サイズは、入力画像内の各文字のサイズ
に比べると十分大きいが、マークは書類の余白部分に納
まるように押す必要があることから、極端に大きなサイ
ズになることもない。この性質を利用し、上記ラベリン
グ処理によって得られた黒画素連結成分のうち、外接矩
形の占める領域の大きさが、指定された２つの閾値の間
に納まる領域だけを抽出する。すなわち、高さおよび幅
の大きさが、それぞれある閾値〔余白のサイズ（高さ、
幅）として通常考えられる最小サイズ〕よりも大きく、
且つ、別のある閾値〔余白のサイズとして通常考えられ
る最大サイズ〕よりも小さくなるような黒画素連結成分
の領域だけを抽出することによって、各マーク画像の領
域を抽出することができる。First, the labeling process described above is performed on the black pixels of each binary image, and the size of the circumscribed rectangle is calculated for each connected component of black pixels given the same label value. The circumscribed rectangle of the connected component corresponding to the frame of a mark is considerably larger than each character in the input image, but because the mark must be stamped so that it fits within the margin of the document, it never becomes extremely large. Using this property, only those connected components obtained by the labeling process whose circumscribed rectangle size falls between two specified thresholds are extracted. That is, the area of each mark image is extracted by keeping only the connected components whose height and width are each larger than one threshold [the smallest size (height, width) normally expected for a margin] and smaller than another threshold [the largest size normally expected for a margin].
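The labeling and size filtering steps above can be sketched as follows. The breadth-first labeling, the use of 8-connectivity, and the function names are assumptions made for illustration; the text specifies only that connected black pixels share a label and that the circumscribed rectangle must lie between two size thresholds.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)


def component_boxes(image: List[List[int]]) -> List[Box]:
    """Label 8-connected black-pixel components and return their bounding boxes."""
    height = len(image)
    width = len(image[0]) if image else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for y0 in range(height):
        for x0 in range(width):
            if not image[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            left = right = x0
            top = bottom = y0
            while queue:
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right - left + 1, bottom - top + 1))
    return boxes


def extract_mark_boxes(
    image: List[List[int]],
    min_size: Tuple[int, int],
    max_size: Tuple[int, int],
) -> List[Box]:
    """Keep boxes whose width and height lie strictly between the two thresholds."""
    (min_w, min_h), (max_w, max_h) = min_size, max_size
    return [
        box for box in component_boxes(image)
        if min_w < box[2] < max_w and min_h < box[3] < max_h
    ]
```

The lower threshold removes ordinary characters and noise, and the upper threshold removes page-sized components such as ruled borders, leaving the framed marks.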

【０１３７】上記の処理によって、各文書画像から抽出
されたマークの個数を、それぞれ登録画像管理テーブル
１２１２’の「マーク数」フィールド１２４’に格納す
る。また、抽出された各マーク画像にマークＩＤを付与
し、このマークＩＤを図１８に示すようなマーク管理テ
ーブル１２１３’の「マークＩＤ」フィールド１３１’
に格納する。また、各マークが付されていた入力画像の
画像ＩＤ、マークが付されていたおよびマークのサイズ
に関する情報を、それぞれマーク管理テーブル１２１
３’の「画像ＩＤ」「位置」「サイズ」の各フィールド
１３２’、１３４’、１３５’に格納する。Through the above processing, the number of marks extracted from each document image is stored in the "number of marks" field 124' of the registered image management table 1212'. A mark ID is assigned to each extracted mark image and stored in the "mark ID" field 131' of the mark management table 1213' shown in FIG. 18. In addition, the image ID of the input image on which each mark was found, the position of the mark, and the size of the mark are stored in the "image ID", "position", and "size" fields 132', 134', and 135' of the mark management table 1213', respectively.

【０１３８】なお、本実施の形態ではマークの付された
画像にのみ文書タグ情報を付与することとしている。こ
の他にも、最初のマークの付された画像から、次のマー
クが付された画像までの間に入力された画像を、上記最
初のマークの付された画像に付随する一連の文書画像と
して管理したい場合には、上記実施の形態５と同様にに
マーク管理グループ番号を付与して管理する方法を採用
することもできる。In this embodiment, document tag information is assigned only to images bearing a mark. Alternatively, if the images input between one marked image and the next marked image are to be managed as a series of document images belonging to the first marked image, a mark management group number can be assigned and used for management, as in the fifth embodiment.

【０１３９】以下、算出手段１２０Ａ（特徴量算出手段
１２０６、類似度算出手段１２０７）、文書タグ情報決
定手段１２０８では、それぞれ実施の形態５と同様、公
知の技術であるモーメント・インバリアントの特徴量に
基づいて、各マーク画像に関連付けられている文書タグ
情報を特定する。そして、特定された文書タグ情報をマ
ーク管理テーブル１２１３’の「文書タグ情報」フィー
ルド１３３’に格納する。Thereafter, as in the fifth embodiment, the calculation means 120A (feature amount calculation means 1206 and similarity calculation means 1207) and the document tag information determination means 1208 identify the document tag information associated with each mark image based on moment-invariant feature amounts, a known technique. The identified document tag information is then stored in the "document tag information" field 133' of the mark management table 1213'.

【０１４０】以上の処理を用いることにより、登録した
い紙文書の余白部分にマークを押して入力するだけで、
自動的に検出して文書タグ情報を付与することが可能に
なる。この場合、実施の形態５で用いた文書タグ情報指
定ページは不要であり、登録したい書類だけを入力する
ことになる。また上述のように、登録画像管理テーブル
１２１２’及びマーク管理テーブル１２１３’は、実施
の形態５における登録画像管理テーブル１２１２及びマ
ーク管理テーブル１２１３よりも簡単な構成となってい
る。With the above processing, simply stamping a mark in the margin of the paper document to be registered and inputting the document allows the mark to be detected automatically and document tag information to be assigned. In this case, the document tag information designation page used in the fifth embodiment is unnecessary, and only the documents to be registered are input. As described above, the registered image management table 1212' and the mark management table 1213' also have a simpler configuration than the registered image management table 1212 and the mark management table 1213 of the fifth embodiment.

【０１４１】勿論、本実施の形態においても、実施の形
態５と同様、マーク抽出の処理を速めるために、マーク
の付されたページに２次元コード等を付与しておく方法
を採用することもできる。Of course, in this embodiment as well, as in the fifth embodiment, a two-dimensional code or the like may be added to each marked page in order to speed up the mark extraction process.

【０１４２】また、本実施の形態では、紙文書の内容が
記載されている面の余白部分にマークを押して入力した
が、両面を読み取ることができるスキャナ等を利用する
場合には、書類の裏面にマークを押して入力する場合に
も同様の効果が期待できる。In the present embodiment, the mark is stamped in the margin of the side of the paper document on which its content is written. When a scanner or the like capable of reading both sides is used, the same effect can be expected by stamping the mark on the back of the document.

【０１４３】更に、上記のマークは枠を持つものとした
が、この枠は必須のものではない。枠がない場合にも、
通常マークは紙文書本文中の文字よりも大きな黒画素連
結成分から構成されると考えられるので、本実施の形態
が適用できる。Further, although the marks above are described as having a frame, the frame is not essential. Even without a frame, a mark is normally composed of black pixel connected components larger than the characters in the body of the paper document, so this embodiment can still be applied.

【０１４４】[0144]

【発明の効果】以上説明したように、第１に、本発明に
よれば、抽出判定値より大きい領域平均文字サイズの領
域をタイトル領域として抽出するようにしているため、
１つの文書画像から複数のタイトル領域を抽出できる。
また、複数段階の抽出パラメータに基づいて複数段階の
抽出判定をすることもでき、更に、この複数段階の抽出
パラメータを入力された文書画像の特性に応じて決定で
きる。また、全平均文字サイズの算出、あるいは領域平
均文字サイズの算出に、大きい方の所定割合と小さい方
の所定割合に属する文字を除外して算出するトリム平均
を用いると、より精度を上げることができる。As described above, first, according to the present invention, an area whose region average character size is larger than the extraction determination value is extracted as a title area, so a plurality of title areas can be extracted from one document image. Extraction can also be judged in multiple stages based on multiple stages of extraction parameters, and these extraction parameters can be determined according to the characteristics of the input document image. Accuracy can be improved further by using a trimmed mean, which excludes the characters in a predetermined proportion at the large end and a predetermined proportion at the small end, when calculating the total average character size or the region average character size.
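The title-area rule summarized above can be sketched as follows. The default extraction parameter of 1.5, the trim ratio of 0.1, and the function and argument names are hypothetical values chosen for illustration; the text states only that the extraction determination value is the total average character size multiplied by an extraction parameter, and that a trimmed mean may be used for the averages.

```python
from typing import Dict, List, Sequence


def trimmed_mean(values: Sequence[float], trim_ratio: float) -> float:
    """Average after dropping the given fraction of values at each end."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_ratio)  # number dropped from each end
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("nothing left to average after trimming")
    return sum(kept) / len(kept)


def extract_title_regions(
    regions: Dict[str, List[float]],
    extraction_parameter: float = 1.5,
    trim_ratio: float = 0.1,
) -> List[str]:
    """Return the regions whose average character size exceeds the threshold.

    `regions` maps a region name to the sizes of the characters it contains.
    """
    all_sizes = [size for sizes in regions.values() for size in sizes]
    total_average = trimmed_mean(all_sizes, trim_ratio)
    determination_value = extraction_parameter * total_average
    return [
        name for name, sizes in regions.items()
        if trimmed_mean(sizes, trim_ratio) > determination_value
    ]
```

Because every region is compared with the same determination value, more than one region can qualify, which corresponds to extracting several title areas from a single document image.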

【０１４５】更に、第２に、本発明によれば、キーボー
ドやポインティングデバイス等を用いることなく、マー
ク処理された書類を文書画像処理装置に入力するだけ
で、自動的に入力画像に文書タグ情報を付与することが
できる。ここで付与された文書タグ情報を利用すること
によって文書画像を検索することができるため、文書画
像処理装置を効率良く管理、運用することができるよう
になる。Secondly, according to the present invention, document tag information can be assigned to an input image automatically, simply by inputting a marked document into the document image processing apparatus, without using a keyboard, pointing device, or the like. Because document images can be retrieved using the document tag information assigned in this way, the document image processing apparatus can be managed and operated efficiently.

[Brief description of the drawings]

【図１】本発明の実施の形態１における文書画像処理装
置の概略機能ブロック図である。FIG. 1 is a schematic functional block diagram of a document image processing device according to a first embodiment of the present invention.

【図２】実施の形態１におけるタイトル領域抽出処理の
フローチャートである。FIG. 2 is a flowchart of a title area extraction process according to the first embodiment.

【図３】実施の形態２におけるタイトル領域抽出処理の
フローチャートである。FIG. 3 is a flowchart of a title area extraction process according to the second embodiment.

【図４】実施の形態３におけるタイトル領域抽出処理の
フローチャートである。FIG. 4 is a flowchart of a title area extraction process according to the third embodiment.

【図５】実施の形態１における登録情報管理テーブルの
説明図である。FIG. 5 is an explanatory diagram of a registration information management table according to the first embodiment.

【図６】実施の形態２における登録情報管理テーブルの
説明図である。FIG. 6 is an explanatory diagram of a registration information management table according to the second embodiment.

【図７】ラベリング処理の説明図である。FIG. 7 is an explanatory diagram of a labeling process.

【図８】領域分割処理の説明図である。FIG. 8 is an explanatory diagram of a region dividing process.

【図９】文字矩形の高さ・幅・面積の関係を示す図であ
る。FIG. 9 is a diagram illustrating a relationship among height, width, and area of a character rectangle.

【図１０】実施の形態１の検索時においてディスプレイ
に表示される内容を示す図である。FIG. 10 is a diagram showing contents displayed on a display at the time of search according to the first embodiment.

【図１１】実施の形態２の検索時においてディスプレイ
に表示される内容を示す図である。FIG. 11 is a diagram showing contents displayed on a display at the time of a search according to the second embodiment.

【図１２】本発明の実施の形態５及び６における文書画
像処理装置の概略機能ブロック図である。FIG. 12 is a schematic functional block diagram of a document image processing device according to Embodiments 5 and 6 of the present invention.

【図１３】本発明の実施の形態５における登録画像管理
テーブルの説明図である。FIG. 13 is an explanatory diagram of a registered image management table according to the fifth embodiment of the present invention.

【図１４】本発明の実施の形態５におけるマーク管理テ
ーブルの説明図である。FIG. 14 is an explanatory diagram of a mark management table according to the fifth embodiment of the present invention.

【図１５】文書タグ情報管理テーブルの説明図である。FIG. 15 is an explanatory diagram of a document tag information management table.

【図１６】マーク画像の抽出結果に関する説明図であ
る。FIG. 16 is an explanatory diagram related to a result of extracting a mark image.

【図１７】本発明の実施の形態６における登録画像管理
テーブルの説明図である。FIG. 17 is an explanatory diagram of a registered image management table according to the sixth embodiment of the present invention.

【図１８】本発明の実施の形態６におけるマーク管理テ
ーブルの説明図である。FIG. 18 is an explanatory diagram of a mark management table according to the sixth embodiment of the present invention.

【図１９】文書タグ情報の概念を示す説明図である。FIG. 19 is an explanatory diagram showing the concept of document tag information.

[Explanation of symbols]

１０１画像入力手段１０２文字矩形生成手段１０３領域分割手段１０４タイトル領域抽出手段１０５文字認識手段１０６文書登録手段１０７画像メモリ１０８記憶手段１０８ａ文書画像１０８ｂタイトル領域画像１０８ｃ登録情報管理テーブル１０９指示入力手段１１０表示制御手段１１１ａ，１１１ｂ画像処理手段１１２修正手段１２０１画像入力手段１２０２画像メモリ１２０３画像データ圧縮処理手段１２０４画像２値化処理手段１２０５，１２０５’ マーク抽出手段１２０Ａ算出手段１２０６特徴量算出手段１２０７類似度算出手段１２０８文書タグ情報付与手段１２１０記憶手段１２１１画像記憶領域１２１２，１２１２’ 登録画像管理テーブル１２１３，１２１３’ マーク管理テーブル１２１４標準タグ情報管理テーブル１２１５標準タグ情報蓄積手段
101: image input means; 102: character rectangle generation means; 103: area division means; 104: title area extraction means; 105: character recognition means; 106: document registration means; 107: image memory; 108: storage means; 108a: document image; 108b: title area image; 108c: registration information management table; 109: instruction input means; 110: display control means; 111a, 111b: image processing means; 112: correction means; 1201: image input means; 1202: image memory; 1203: image data compression processing means; 1204: image binarization processing means; 1205, 1205': mark extraction means; 120A: calculation means; 1206: feature amount calculation means; 1207: similarity calculation means; 1208: document tag information adding means; 1210: storage means; 1211: image storage area; 1212, 1212': registered image management table; 1213, 1213': mark management table; 1214: standard tag information management table; 1215: standard tag information storing means

フロントページの続き (72)発明者梅林明人大阪府門真市大字門真1006番地松下電器産業株式会社内Ｆターム(参考） 5B050 BA10 BA16 DA06 EA01 EA03 EA04 EA07 5B075 ND07 NK31 NK39 NR03 NR12 5L096 BA17 EA35 EA43 FA44 FA59 FA64 GA15 GA34 HA08 JA03 JA11 Continued on the front page (72) Inventor Akito Umebayashi 1006 Kazuma Kadoma, Kadoma City, Osaka Prefecture F-term in Matsushita Electric Industrial Co., Ltd. FA59 FA64 GA15 GA34 HA08 JA03 JA11

Claims

[Claims]

1. A document image processing apparatus comprising: image input means for reading a paper document to generate a document image; area dividing means for dividing the document image into a plurality of areas; and title area extracting means for calculating, for each area divided by the area dividing means, a region average character size corresponding to the average size of the characters in that area, and for extracting a title area from all areas based on the region average character size, wherein the title area extracting means calculates a total average character size corresponding to the average size of the characters in all areas, compares an extraction determination value obtained by multiplying the total average character size by an extraction parameter with the region average character size, and extracts an area whose region average character size is larger than the extraction determination value as a title area.

2. The document image processing apparatus according to claim 1, wherein the title area extracting means calculates the region average character size and the total average character size based on the average character height.

3. The document image processing apparatus according to claim 1, wherein the title area extracting means calculates the region average character size and the total average character size based on the average character width.

4. The document image processing apparatus according to claim 1, wherein the title area extracting means calculates the region average character size and the total average character size based on the average character area.

5. The document image processing apparatus according to claim 1, wherein the title area extracting means calculates the extraction determination value in a plurality of stages using an extraction parameter in a plurality of stages.

6. The title region extracting means calculates a plurality of stages of the extraction determination value using a plurality of stages of extraction parameters, and extracts a title region in association with a level attribute indicating the extracted stage. 2. The document image processing device according to 1.

7. The document image processing apparatus according to claim 2, wherein the title area extracting means determines the plurality of stages of extraction parameters based on a value obtained by dividing the maximum value of the region average character size by the total average character size.

8. The document image processing apparatus according to claim 1, wherein the title area extracting means calculates the total average character size and the region average character size using a trimmed mean, which is an average taken over the characters that remain after excluding a predetermined proportion of the largest characters and a predetermined proportion of the smallest characters.

9. The document image processing apparatus according to claim 1, further comprising a correction unit that corrects the extracted character string of the title area.

10. The document image processing apparatus according to claim 1, wherein said document image is a document image of a plurality of pages.

11. A document title extracting method of a document image processing apparatus, comprising: image input processing for reading a paper document to generate a document image; division processing for dividing the document image into a plurality of regions; calculation processing for calculating, for each region, a region average character size corresponding to the average character size of that region; and title extraction processing for extracting a title region from all regions based on the region average character size, wherein the calculation processing further calculates a total average character size corresponding to the average character size of all regions, and the title extraction processing includes comparison processing for comparing an extraction determination value obtained by multiplying the total average character size by an extraction parameter with the region average character size, and extracts a region whose region average character size is larger than the extraction determination value as a title region.

12. The document title extracting method according to claim 11, wherein the calculation processing calculates the region average character size and the total average character size based on the average character height.

13. The document title extracting method according to claim 11, wherein the calculation processing calculates the region average character size and the total average character size based on the average character width.

14. The document title extracting method according to claim 11, wherein the calculation processing calculates the region average character size and the total average character size based on the average character area.

15. The document title extracting method of the document image processing apparatus according to claim 14, wherein the title extracting process calculates the extraction determination value in a plurality of stages using an extraction parameter in a plurality of stages.

16. The method according to claim 1, further comprising calculating the extraction determination values in a plurality of stages using the extraction parameters in a plurality of stages, and extracting a title area in association with a level attribute indicating the extracted stage. 14. The document title extracting method of the document image processing device according to 14.

17. The document title extracting method of the document image processing apparatus according to claim 15, wherein the title extraction processing determines the plurality of stages of extraction parameters based on a value obtained by dividing the maximum value of the region average character size by the total average character size.

18. The document title extracting method of the document image processing apparatus according to claim 11, wherein the total average character size and the region average character size are calculated using a trimmed mean, which is an average taken over the characters that remain after excluding a predetermined proportion of the largest characters and a predetermined proportion of the smallest characters.

19. The document title extracting method for a document image processing apparatus according to claim 11, further comprising a correction process for correcting the extracted character string in the title area.

20. The method according to claim 11, wherein the document image is a multi-page document image.

21. A recording medium on which is recorded a program that divides a document image generated by reading a paper document into a plurality of regions, calculates, for each region, a region average character size corresponding to the average character size of that region, calculates a total average character size corresponding to the average character size of all regions, compares an extraction determination value obtained by multiplying the total average character size by an extraction parameter with the region average character size, and extracts a region whose region average character size is larger than the extraction determination value as a title region.

22. A document image processing apparatus comprising image input means for reading a paper document to generate a document image and storage means for storing the document image, the apparatus further comprising: standard tag information accumulating means for storing standard tag information together with attribute values of the standard tag information; mark extracting means for extracting a specific mark added by a user to the paper document; calculating means for calculating a characteristic value representing the characteristics of the mark based on the distribution of the pixels constituting the specific mark; and document tag information providing means for selecting specific standard tag information based on the attribute values and the characteristic value and providing the selected standard tag information to the document image.

23. The document image processing apparatus according to claim 22, wherein said mark extracting means extracts said specific mark on a specific sheet.

24. The document image processing apparatus according to claim 23, wherein said mark extracting means determines a sheet with a two-dimensional code as said specific sheet.

25. The document image processing apparatus according to claim 22, wherein said mark extracting means extracts said specific mark on a margin of said paper document.

26. The document image processing apparatus according to claim 22, wherein said paper document is composed of a plurality of pages.

27. A document tag information assigning method of a document image processing apparatus for assigning document tag information of a document image generated by reading a paper document, wherein a standard tag information is stored together with an attribute value of the standard tag information. Tag information accumulation processing, mark extraction processing for extracting a specific mark added by a user on the paper document, calculation processing for calculating a numerical value representing the characteristic of the mark based on the distribution of pixels of the mark, A document tag information providing method for a document image processing apparatus, comprising: a process of selecting specific standard tag information based on an attribute value and a characteristic value and providing the selected standard tag information to the document image.

28. The method according to claim 27, wherein said paper document comprises a plurality of pages.

29. When a paper document is read to generate a document image, a specific mark added by a user on the paper document is extracted, and the characteristics of the mark based on the pixel distribution of the mark are represented. A recording medium in which a program for calculating a numerical value and selecting document tag information to be added to the document image from among the document tag information candidates based on the numerical value is recorded.