JPH11203305A

JPH11203305A - Document image processing method and recording medium

Info

Publication number: JPH11203305A
Application number: JP10004225A
Authority: JP
Inventors: Takashi Saito; 高志齋藤
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1998-01-12
Filing date: 1998-01-12
Publication date: 1999-07-30

Abstract

PROBLEM TO BE SOLVED: To automatically extract, from a document image, the portions (key areas) useful for grasping its contents, and to narrow the number of extracted portions down to an appropriate number. SOLUTION: An area division/layout information extraction means 102 divides an input document image into character areas and elements such as figures, tables, and ruled lines, and extracts layout information such as the column type and the columns. A line extraction means 103 extracts lines from each character area, and a font identification means 104 identifies the font line by line. A key area determination means 105 extracts key areas that concisely represent the document contents on the basis of layout-expression features, checks the total number of extracted key areas, and narrows them down to an appropriate number when the total exceeds a threshold value.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a document image processing method and a recording medium for automatically extracting, from a document image, areas useful for grasping its contents.

[0002]

[Description of the Related Art] In recent years, opportunities to distribute document images have increased dramatically owing to improvements in the capabilities of computers and digital devices and of data storage devices such as hard disks. However, a document image contains a large amount of data, which has become a bottleneck when processing speed is required or when image data is exchanged over a network.

[0003] In general, when browsing a large accumulated collection of document images, it is not necessary to view every image in its stored high-quality form; in many cases it is sufficient to be able to check the contents for the time being. Of course, if a keyword search or the like is available, processing is far faster than handling images. However, entering appropriate keywords is difficult, and it is even more difficult when the user has never seen the stored image contents.

[0004] Therefore, optical filing devices and the like hold a reduced image as an index image separately from the original image and present the index image to the user first, thereby reducing the amount of data to be processed. However, when the whole image is simply reduced, the overall impression can be grasped but the characters in the image are hard to read. In particular, when there are many documents with similar layouts, it is difficult to select the desired document.

[0005] A document processing apparatus that addresses this problem is described in Japanese Patent Application Laid-Open No. Hei 5-342326. This apparatus divides a document image into areas, assigns logical identifiers to the divided elements according to a logical model, extracts only the required logical elements using the identifiers as keys, and rearranges them so that they are easy to view. It also sorts the extracted information, for example, by applying OCR to parts of the image.

[0006]

[Problems to Be Solved by the Invention] However, the document processing method described above requires that the layout structure and logical structure of the document images to be input be understood in advance and that a corresponding model be created. If the documents are structured according to the model and the area division is perfect, processing is accurate, but in practice this is often not the case, and the range of documents to which the model applies is limited. Moreover, because creating a logical model requires considerable expertise, it is very difficult with the above method to extract partial images (key areas) for grasping the contents of a new group of documents.

[0007] Other methods for solving the above problem include, for example, Japanese Patent Application Laid-Open No. Hei 5-242142 and Japanese Patent Application No. Hei 8-110808, previously proposed by the present applicant. These methods do not require the layout structure and logical structure to be understood and a corresponding model to be created, and they are more flexible than the method of Japanese Patent Application Laid-Open No. Hei 5-342326 cited above.

[0008] Specifically, the method described in Japanese Patent Application Laid-Open No. Hei 5-242142, which summarizes a document without decoding the document image, determines importance, that is, the semantically significant image units, from linguistic criteria (words such as "important" and "significant"), position within the document, and morphological image characteristics (typeface, character type, and the like). However, although this method can judge the "importance" of an image unit (area), it does not control how much of the whole image such units amount to. The total number of areas regarded as important could be controlled by using frequency information, but in Japanese text an area often qualifies as a "key area" even when its frequency of occurrence is low, so control by frequency alone is difficult.

[0009] The method proposed in Japanese Patent Application No. Hei 8-110808 divides a document image into a plurality of elements, determines from the layout features of each divided element (portions emphasized differently from the body text) whether the element is an area that concisely represents the document contents, and extracts that area as a partial image. This method, too, can judge the "importance" of an area but does not control how much of the whole image such areas amount to.

[0010] The present invention has been made in view of the above circumstances. Its object is to provide a document image processing method and a recording medium that can automatically extract, from a document image, the portions (key areas) useful for grasping its contents, and that can also narrow the number of extracted portions down to an appropriate number.

[0011]

[Means for Solving the Problems] To achieve the above object, the invention of claim 1 is a document image processing method that divides a document image into a plurality of elements and determines, on the basis of layout-expression features of each divided element, whether the element is an area that concisely represents the contents of the document image (hereinafter, a key area), characterized in that the number of determined key areas is checked and the number of key areas is reduced to an optimal number.

[0012] The invention of claim 2 is characterized in that the number of key areas is reduced when the number of determined key areas is equal to or greater than a predetermined threshold.

[0013] The invention of claim 3 is characterized in that the number of key areas is reduced when the ratio of the number of determined key areas to the total number of lines in the page is equal to or greater than a predetermined threshold.

[0014] The invention of claim 4 is characterized in that priorities are assigned to the layout-expression features and key areas are removed in order, starting from those determined on the basis of lower-priority features.

[0015] The invention of claim 5 is characterized in that the same label is assigned to key areas having the same feature, the number of key areas is counted for each label, and key areas are removed according to the count for each label.

[0016] The invention of claim 6 is a document image processing method that divides a document image into a plurality of elements and determines, on the basis of layout-expression features of each divided element, whether the element is an area that concisely represents the contents of the document image (hereinafter, a key area), characterized in that the column structure of the entire page of the document image is determined and the layout-expression features used for determining key areas are selected according to the determined column structure.

[0017] The invention of claim 7 is a document image processing method that divides a document image into a plurality of elements and determines, on the basis of layout-expression features of each divided element, whether the element is an area that concisely represents the contents of the document image (hereinafter, a key area), characterized in that the number of determined key areas is checked and, when the number of key areas needs to be reduced, the key areas are determined again using the layout-expression features used for the determination with one or more of those features excluded.

[0018] The invention of claim 8 is a computer-readable recording medium on which is recorded a program for causing a computer to realize a function of dividing a document image into a plurality of elements and a function of determining, on the basis of layout-expression features of each divided element, whether the element is an area that concisely represents the contents of the document image (hereinafter, a key area), characterized in that the recorded program further causes the computer to realize a function of checking the number of determined key areas and a function of reducing the number of key areas to an optimal number.

[0019]

[Embodiments of the Invention] An embodiment of the present invention is described in detail below with reference to the drawings.

<Embodiment 1> FIG. 1 shows the configuration of an embodiment of the present invention. In the figure, 101 is an image input means; 102 is an area division/layout information extraction means that divides a document image into elements and extracts layout information such as columns; 103 is a line extraction means that extracts lines in each character area; 104 is a font identification means; 105 is a key area determination means that determines, from the information extracted by the area division/layout information extraction means 102 and the line extraction means 103, the areas (key areas) that concisely represent the contents of the document image; 106 is a data storage unit that stores the input image and various information during processing; 107 is a control unit that controls the whole; and 108 is a data communication path.

[0020] FIG. 2 shows a processing flowchart of the present invention, which is described below with reference to FIG. 2. First, a document image is obtained by the image input means 101 (step 201). The image input means may be a scanner or a facsimile machine, or a means for obtaining an image from another device via a network.

[0021] Next, the area division/layout information extraction means 102 divides the input document image into character areas and elements such as figures, tables, and ruled lines, and at the same time extracts layout information such as the column type and the columns (step 202). For the area division, the known technique described in Japanese Patent Application Laid-Open No. Hei 6-20092, for example, may be used. For the extraction of layout information, the known technique described in Japanese Patent Application Laid-Open No. Hei 9-44594, for example, may be used. Each extracted area has as attributes the type of element (character area, table, and so on) and its position (the circumscribed rectangle of the area), and the image as a whole has layout information such as the column type and the column dividing lines.
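The attributes listed above can be pictured as a small data structure. The following is only an illustrative sketch: the class names, field names, and the string values for the column type are assumptions made here and do not appear in the source.

```python
from dataclasses import dataclass, field
from enum import Enum


class ElementKind(Enum):
    # Element types named in the description; the identifiers are assumptions.
    TEXT = "text"
    FIGURE = "figure"
    TABLE = "table"
    RULED_LINE = "ruled_line"


@dataclass(frozen=True)
class BoundingBox:
    # Circumscribed rectangle in pixel coordinates.
    left: int
    top: int
    right: int
    bottom: int


@dataclass
class Region:
    kind: ElementKind
    box: BoundingBox


@dataclass
class PageLayout:
    # Column type of the whole page, e.g. "single", "multi", "free" (assumed names).
    column_type: str
    # x positions of column dividing lines (assumed representation).
    column_separators: list[int] = field(default_factory=list)
    regions: list[Region] = field(default_factory=list)


page = PageLayout(column_type="multi", column_separators=[600])
page.regions.append(Region(ElementKind.TEXT, BoundingBox(40, 50, 580, 120)))
```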

[0022] The line extraction means 103 extracts lines from the extracted character areas (step 203). For the line extraction, the technique described in, for example, the paper "Area Segmentation of Document Images Using Marginal Distributions, Line Density, and Circumscribed Rectangle Features" (Akiyama et al., Transactions of the Institute of Electronics and Communication Engineers of Japan, August 1986, Vol. J69-D, No. 8) may be used.

[0023] Further, the font identification means 104 identifies the font for each line or for each character contained in each line (step 204). For the font identification, the method described in Japanese Patent Application Laid-Open No. Hei 6-208649, for example, may be used.

[0024] Once layout information such as areas, line information, and font information have been extracted, the key area determination means 105 determines the key areas (step 205). Here, a key area is a portion that is useful for grasping the contents of the document. In general, such portions are emphasized in some way: for example, large characters or an emphasis font are used, the portion is set apart from other portions, or it is enclosed in a frame. This embodiment uses the document image processing method proposed in the earlier Japanese Patent Application No. Hei 8-110808, which determines key areas using layout-expression features of the document.

[0025] FIG. 3 is a detailed flowchart of the key area determination processing. First, the layout-expression emphasis of each unit (here, each line) is examined and key areas are extracted (step 301).

[0026] That is, the key area extraction processing proposed in the above application first detects the character-size feature of each line (large-character line, medium-character line, or normal-character line), then detects the font feature (whether the font is an emphasis font, judged from black-pixel density and the like), and detects the title portion (an independent area with a large character size). It then detects subheading portions (for example, areas of medium-character lines containing fewer than a predetermined number of lines) and bibliographic items. Finally, when an enclosing frame is present, the first few lines inside the frame are extracted, because text inside a frame is often a separate article and is useful for grasping the contents.
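The first step of this extraction, sorting lines into large, medium, and normal character sizes, can be sketched as follows. The source does not say how the three classes are separated; comparing each line's character height with the page median and the two ratio values are assumptions made purely for illustration.

```python
from statistics import median


def classify_line_sizes(heights, large_ratio=1.8, medium_ratio=1.3):
    """Label each line 'large', 'medium' or 'normal' by its character height.

    The ratios relative to the page's median height are illustrative
    assumptions; the source does not state how the classes are separated.
    """
    base = median(heights)
    labels = []
    for h in heights:
        if h >= large_ratio * base:
            labels.append("large")
        elif h >= medium_ratio * base:
            labels.append("medium")
        else:
            labels.append("normal")
    return labels


# Hypothetical character heights (in pixels) for seven lines of one page.
heights = [40, 12, 12, 18, 12, 12, 12]
labels = classify_line_sizes(heights)
```

With these sample heights the first line would be a title candidate and the fourth a subheading candidate, subject to the further checks (independence, number of lines) described above.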

[0027] The areas obtained by the above processing, for example the "title area", "font-emphasized areas", "subheading areas and lines", "bibliographic items", and "first lines inside enclosing frames", become the key areas.

[0028] Once key areas have been extracted by examining the layout-expression emphasis of each unit (here, each line), the total number of extracted key areas is checked (step 302). Because key areas are intended to summarize the document image, extracting too many lines as key areas halves that effect. The present invention therefore checks the total number of extracted key areas and reduces them when a predetermined condition is satisfied. Examples of the condition are that the total number extracted is equal to or greater than a predetermined threshold, or that the ratio of the number of extracted key areas to the total number of lines in the page is equal to or greater than a predetermined threshold. The two conditions may be combined as either an OR condition or an AND condition.
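The reduction condition of step 302 can be sketched as a small predicate. The function name, parameter names, and the default threshold values are assumptions chosen for illustration; the source only states that each threshold is "predetermined".

```python
def needs_reduction(num_key_areas, total_lines,
                    max_count=10, max_ratio=0.3, combine="or"):
    """Decide whether the extracted key areas should be reduced.

    max_count and max_ratio are hypothetical threshold values.
    combine selects how the two conditions are joined: "or" or "and".
    """
    # Condition 1: the total number of key areas reaches the count threshold.
    over_count = num_key_areas >= max_count
    # Condition 2: the share of key areas among all lines reaches the ratio threshold.
    over_ratio = total_lines > 0 and (num_key_areas / total_lines) >= max_ratio

    if combine == "or":
        return over_count or over_ratio
    if combine == "and":
        return over_count and over_ratio
    raise ValueError("combine must be 'or' or 'and'")
```

For a page of 100 lines with 12 key areas, the count condition holds but the ratio condition does not, so the OR form triggers reduction while the AND form does not.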

[0029] The key area reduction processing (step 303) is performed as follows. FIG. 4 is a detailed flowchart of step 303. The fact that a portion was determined to be a key area indicates that some emphasis is present in its expression: for example, an emphasis font is used, or the line was judged to be a subheading from its indentation or character size. Therefore, in the processing of step 301, the same label is assigned to key areas determined to have the same feature. FIG. 5 shows this labeling. The number of key areas is then counted for each label.

[0030] When the number of key areas is judged to be too large overall, the areas to be removed are decided. Possible ways of deciding are (1) to follow a predetermined priority order of the emphasis features, and (2) to remove key areas starting from the label that has the largest count.

[0031] In method (1), taking the example of FIG. 5, the priority order title equivalent > subheading equivalent > font emphasis is decided in advance. The key areas that have only font emphasis, the lowest priority, are removed first, and if the number of key areas is still too large, the subheading-equivalent key areas are removed. When a key area has several emphasis features (for example, subheading equivalent and font emphasis), its priority is made equal to or higher than that of its highest-priority feature alone.
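Method (1) can be sketched as follows. The feature names, the numeric priority values, and the rule of never removing the last remaining priority level are assumptions introduced for illustration; for areas with several features the sketch takes the "equal to the highest-priority feature" option mentioned above.

```python
# Hypothetical feature names; a higher number means a higher priority.
PRIORITY = {"font_emphasis": 1, "subheading": 2, "title": 3}


def area_priority(features):
    # An area with several features ranks with its highest-priority feature.
    return max(PRIORITY[f] for f in features)


def reduce_by_priority(key_areas, threshold):
    """Remove whole priority levels, lowest first, while the count >= threshold.

    key_areas is a list of (line_number, set_of_features) pairs.
    """
    kept = list(key_areas)
    while len(kept) >= threshold:
        levels = sorted({area_priority(f) for _, f in kept})
        if len(levels) < 2:
            break  # assumption: keep the last remaining priority level
        lowest = levels[0]
        kept = [(n, f) for n, f in kept if area_priority(f) != lowest]
    return kept


areas = [
    (1, {"title"}),
    (5, {"subheading"}),
    (7, {"font_emphasis"}),
    (9, {"font_emphasis"}),
    (12, {"subheading", "font_emphasis"}),
    (15, {"font_emphasis"}),
]
remaining = reduce_by_priority(areas, threshold=4)
```

In this sample the three lines that have font emphasis only are removed, while line 12 survives because it is also subheading equivalent.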

[0032] In method (2), taking the example of FIG. 5, key areas are removed starting from the label with the largest count (here, font emphasis), and this is repeated until the number of key areas becomes acceptable.
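Method (2) can be sketched with a per-label count. Removing an entire label group in one pass and stopping before the last label is removed are assumptions; the source only says that removal starts from the most frequent label and repeats until the number is acceptable. The label names are hypothetical.

```python
from collections import Counter


def reduce_by_label_count(key_areas, threshold):
    """Drop the most frequent label group while the count >= threshold.

    key_areas is a list of (line_number, label) pairs.
    """
    kept = list(key_areas)
    while len(kept) >= threshold:
        counts = Counter(label for _, label in kept)
        if len(counts) < 2:
            break  # assumption: never remove the last remaining label
        # On ties, most_common returns the label encountered first.
        most_frequent, _ = counts.most_common(1)[0]
        kept = [(n, label) for n, label in kept if label != most_frequent]
    return kept


labelled = [
    (1, "title"),
    (5, "subheading"),
    (7, "font_emphasis"),
    (9, "font_emphasis"),
    (12, "subheading"),
    (15, "font_emphasis"),
]
remaining = reduce_by_label_count(labelled, threshold=4)
```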

[0033] <Embodiment 2> The configuration of this embodiment is the same as that of FIG. 1, and the overall processing flow is also as shown in FIG. 2, but the content of the key area determination processing in step 205 differs from that of Embodiment 1.

[0034] In step 202, the column structure of the entire page is determined by using, for example, the technique described in Japanese Patent Application Laid-Open No. Hei 9-44594 (a column type determination method that extracts a plurality of small areas containing character strings from a document image, detects blank portions or ruled lines from the small areas, and determines the column type, including single-column, multi-column, and free-form column layouts, on the basis of the detected blank portions or ruled lines). The following description takes as an example the case in which the whole page is determined to be single-column. Single-column documents include official notices, which tend to make heavy use of indentation. In addition, emphasis-type Gothic fonts are sometimes used for portions that are not particularly important.

[0035] Therefore, when the page is single-column and contains deep indentation as in FIG. 6, the processing of step 205 does not use the indentation feature for key area determination, or lowers its weight as a feature, or does not use the font information.
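The adjustment described for step 205 can be sketched as a weight table that depends on the column structure. The feature names, the default weights, and the parameters are assumptions for illustration; the source presents the three adjustments as alternatives, which the sketch exposes through `indent_weight` (0.0 disables the indentation feature, a value between 0 and 1 merely lowers its weight) and `use_font`.

```python
# Hypothetical feature names with equal default weights.
DEFAULT_WEIGHTS = {
    "char_size": 1.0,
    "font_emphasis": 1.0,
    "indent": 1.0,
    "frame": 1.0,
}


def select_feature_weights(column_type, has_deep_indent,
                           indent_weight=0.0, use_font=False):
    """Return the feature weights to use for key area determination."""
    weights = dict(DEFAULT_WEIGHTS)
    if column_type == "single" and has_deep_indent:
        # Indentation is common in single-column notices, so it says little.
        weights["indent"] = indent_weight
        if not use_font:
            # Emphasis fonts may mark unimportant text in such documents.
            weights["font_emphasis"] = 0.0
    return weights


notice_weights = select_feature_weights("single", has_deep_indent=True)
article_weights = select_feature_weights("multi", has_deep_indent=True)
```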

[0036] By changing the weighting of the features used for key area determination according to the column structure of the entire page in this way, more suitable key areas are extracted.

[0037] <Embodiment 3> The configuration of Embodiment 3 is the same as that of FIG. 1, and the overall processing flow is also as shown in FIG. 2. The difference lies in the content of the key area determination processing in step 205.

[0038] FIG. 7 is a detailed flowchart of step 205 according to Embodiment 3. As in Embodiment 1, key areas are first extracted once (step 701) and their number is evaluated (step 702). If a reduction is needed (that is, too many areas qualify), the key area extraction is performed again without using one or more of the features (step 703). Steps 702 and 703 may be performed recursively.
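The loop of steps 701 to 703 can be sketched as follows. The source does not say which feature is excluded on each pass; dropping the lowest-priority feature first, the feature names, and the simplified extraction rule (a line is a key area if it has any enabled feature) are all assumptions for illustration.

```python
def extract_key_areas(lines, features):
    """Return the line numbers that show at least one enabled feature."""
    return [no for no, line_features in lines if line_features & features]


def extract_with_feature_dropping(lines, features_low_to_high, threshold):
    """Re-extract key areas with fewer features while too many are found.

    lines is a list of (line_number, set_of_features) pairs.
    features_low_to_high lists the features, lowest priority first.
    """
    active = list(features_low_to_high)
    key_areas = extract_key_areas(lines, set(active))          # step 701
    while len(key_areas) >= threshold and len(active) > 1:     # step 702
        active.pop(0)  # exclude one feature (assumed: the lowest priority)
        key_areas = extract_key_areas(lines, set(active))      # step 703
    return key_areas, active


lines = [
    (1, {"title"}),
    (5, {"subheading"}),
    (7, {"font_emphasis"}),
    (9, {"font_emphasis"}),
    (12, {"subheading", "font_emphasis"}),
]
order = ["font_emphasis", "subheading", "title"]
key_areas, used = extract_with_feature_dropping(lines, order, threshold=4)
```

Unlike the reduction of Embodiment 1, which deletes already extracted areas, this variant reruns the extraction itself with a smaller feature set.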

[0039] <Embodiment 4> The present invention is not limited to the embodiments described above and can also be realized in software. In that case, as shown in FIG. 8, a computer system comprising a CPU, memory, a display device, a hard disk, a keyboard, a CD-ROM drive, a mouse, and so on is prepared. A computer-readable recording medium such as a CD-ROM stores a program that implements the document image processing functions and processing procedures of the present invention. The document images to be processed are stored on, for example, the hard disk. The CPU reads the program that implements the above processing functions and procedures from the recording medium, extracts key areas from a document image read from the hard disk or the like, narrows the number down to an optimal number when there are too many key areas, and outputs the result to a display or the like, for example by changing the color.

[0040]

[Effects of the Invention] As described above, according to the inventions of claims 1, 2, 3, and 8, when areas useful for grasping the contents of a document image are extracted in excess, the number of areas is reduced, so that only an optimal amount is obtained.

[0041] According to the inventions of claims 4 and 5, the choice of what to remove is taken into account, so that an appropriate amount of the areas useful for grasping the contents of the document image is obtained.

[0042] According to the invention of claim 6, an appropriate amount of the areas useful for grasping the contents of the document image is obtained in accordance with the column structure of the entire page.

[0043] According to the invention of claim 7, when areas useful for grasping the contents of the document image are extracted in excess, the key areas are determined again using different features, so that an optimal amount of the areas that concisely represent the contents is obtained.

[Brief description of the drawings]

FIG. 1 shows the configuration of an embodiment of the present invention.

FIG. 2 shows a processing flowchart of the present invention.

FIG. 3 is a detailed flowchart of the key area determination processing.

FIG. 4 is a detailed flowchart of step 303 (key area reduction processing).

FIG. 5 shows the labeling of key areas.

FIG. 6 shows an example of deep indentation.

FIG. 7 is a detailed flowchart of step 205 (key area determination processing) in another embodiment.

FIG. 8 shows a configuration example for realizing the present invention in software.

[Explanation of symbols]

101: Image input means
102: Area division/layout information extraction means
103: Line extraction means
104: Font identification means
105: Key area determination means
106: Data storage unit
107: Control unit
108: Data communication path

Claims

1. A document image processing method that divides a document image into a plurality of elements and determines, on the basis of layout-expression features of each divided element, whether the element is an area that concisely represents the contents of the document image (hereinafter, a key area), the method comprising: checking the number of determined key areas; and reducing the number of key areas to an optimal number.

2. The document image processing method according to claim 1, wherein when the number of the determined key areas is equal to or more than a predetermined threshold, the number of the key areas is reduced.

3. The document image processing method according to claim 1, wherein the number of key areas is reduced when the ratio of the number of determined key areas to the total number of lines in the page is equal to or greater than a predetermined threshold.

4. The document image processing method according to claim 1, wherein priorities are assigned to the layout-expression features, and key areas are removed in order, starting from those determined on the basis of lower-priority features.

5. The document image processing method according to claim 1, wherein the same label is assigned to key areas having the same feature, the number of key areas is counted for each label, and key areas are removed according to the count for each label.

6. A document image processing method that divides a document image into a plurality of elements and determines, on the basis of layout-expression features of each divided element, whether the element is an area that concisely represents the contents of the document image (hereinafter, a key area), the method comprising: determining the column structure of the entire page of the document image; and selecting, according to the determined column structure, the layout-expression features used for determining the key areas.

7. A document image processing method that divides a document image into a plurality of elements and determines, on the basis of layout-expression features of each divided element, whether the element is an area that concisely represents the contents of the document image (hereinafter, a key area), the method comprising: checking the number of determined key areas; and, when the number of key areas needs to be reduced, determining the key areas again using the layout-expression features used for the determination with one or more of those features excluded.

8. A computer-readable recording medium on which is recorded a program for causing a computer to realize a function of dividing a document image into a plurality of elements and a function of determining, on the basis of layout-expression features of each divided element, whether the element is an area that concisely represents the contents of the document image (hereinafter, a key area), wherein the recorded program further causes the computer to realize a function of checking the number of determined key areas and a function of reducing the number of key areas to an optimal number.