JP2001076095A

JP2001076095A - Information processor and method therefor

Info

Publication number: JP2001076095A
Application number: JP25193299A
Authority: JP
Inventors: Makoto Takaoka; 真琴高岡
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1999-09-06
Filing date: 1999-09-06
Publication date: 2001-03-23

Abstract

PROBLEM TO BE SOLVED: To create document image data that is suitable for storing and distributing a document image and is easy to convert into an electronic document. SOLUTION: The layout of an input document image is analyzed (S103) to recognize text areas and picture areas. An overlap area where a recognized text area and picture area overlap each other is detected (S104), and the document image data is stored with the storage format switched according to whether an overlap area was detected (S105 and S106).

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to an information processing apparatus and method for handling a document original read by optical reading means and reusing the data.

[0002]

[Description of the Related Art] Conventionally, document images have been stored and distributed using MMR compression for black-and-white binary images and JPEG compression for color images. In recent years, as image recognition technology has improved, demand has grown not only for storing and distributing images but also for reusing them as electronic documents, and this is now becoming feasible.

[0003] Normally, when a document image is to be reused, it is converted into an image format suited to recognition processing, layout analysis is performed, and character recognition is applied to the character portions to produce an electronic document. However, both layout analysis and character recognition occasionally make errors. With character recognition in particular, even at 99% accuracy the remaining 1% of errors inevitably occurs. These errors are not always caused by character recognition itself; they may stem from preprocessing steps such as image binarization or layout analysis. In any case, perfect reproduction is close to impossible. For this reason, documents are generally stored and distributed as image information using conventional image compression.

[0004]

[Problems to Be Solved by the Invention] In the conventional approach, however, when a color document image is stored or distributed, JPEG compression, which compresses the original image relatively faithfully, is often used. With a JPEG-compressed image, the receiving device must decompress the image in order to check its contents. If the original image has a high resolution or a large size, checking it requires a large amount of memory, and the decompression also takes a long time.

[0005] As a result, in the case of a multi-page document image, the recipient has sometimes discarded the data without checking its contents.

[0006] To reduce the file size and the memory needed for decompression, color document images are therefore sometimes converted to black-and-white binary images before distribution. This processing, however, loses color information, and during binarization photographic parts may be crushed to black or faded, so image quality degrades markedly. For a document image with a colored or patterned background, the result may also be a noisy image. Black-and-white binary images are used as a compromise: because the character portions are ultimately the important part of a document, it is considered sufficient that the characters be clearly legible. Facsimile transmission is a good example. A fax ultimately handles only black-and-white documents, and in many cases that is sufficient.

[0007] Black-and-white binary images are typically either simply binarized (thresholded) images or images binarized by error diffusion. With the former, character portions are clean but photographic parts are crushed to black. With the latter, the file size becomes large under MMR compression and background noise appears in the character portions. In the latter case especially, converting the image into an electronic document for the kind of reuse described above is quite difficult.

[0008] In any case, when working with images, the barrier to converting a document image into an electronic document that can be handled as code has been quite high. Reuse of such documents has, in practice, been rarer still.

[0009] An electronic document format that satisfies the following requirements has therefore been desired.
・A color image stays in color, becomes small in size, and still reproduces cleanly.
・Even for a black-and-white image with a background, the reproduced image is clean: the necessary parts are rendered clearly and the unnecessary parts are removed, so that background noise does not appear and spoil the image.
・In either case, the result is a reusable electronic document that can be imported into application software on a computer.

[0010] Accordingly, an object of the present invention is to provide an information processing apparatus and method capable of generating an electronic document format that satisfies the above requirements.

[0011] That is, an object of the present invention is to provide an information processing method and apparatus capable of creating document image data that is suitable for storing and distributing document images and is easy to convert into an electronic document.

[0012]

[Means for Solving the Problems] An information processing apparatus according to the present invention for achieving the above object has, for example, the following configuration. That is, it comprises: analysis means for performing layout analysis on an input document image and recognizing at least text areas and picture areas; detection means for detecting an overlap area where a text area and a picture area recognized by the analysis means overlap; and storage means for storing the document image data while switching the storage format of the document image based on whether the detection means has detected an overlap area.

[0013]

[Embodiments of the Invention] Preferred embodiments of the present invention are described below with reference to the accompanying drawings.

[0014] In this embodiment, an electronic document format called PAF (Page Analysis Format) is provisionally defined. The above problems are solved by using PAF as an intermediary.

[0015] <PAF> In PAF (pronounced "paf"), a document image is analyzed, and every area of the document image that can be coded is coded as far as possible. Areas that are difficult to code are stored as partial images. FIG. 1 is a diagram illustrating the electronic document format (PAF) according to this embodiment. Reference numeral 11 denotes the overall PAF structure. Although not shown, it has areas for storing property information for the whole document (for example, items that manage how many pages the document PAF has) and management information (for example, when and with what scanner the document was input).

[0016] The Text Object part 12 holds the areas of the document image judged to be Text areas. Text areas are the areas with the highest compression efficiency. Data in a Text area is held by one of two storage methods. One holds the Text area as a binary image. This applies not only to black-and-white binary originals, of course, but also to RGB full-color originals: the Text area alone is held as a binary image. The other performs character recognition on the Text area and holds the result as character codes. A Text area converted to character codes becomes reusable document data.

[0017] The Picture Object part 13 stores, as partial images, natural images such as photographs and image areas that analysis cannot yet convert to vector codes. If the original image is a binary image, the area is held as a partial binary image; if it is a multi-valued image, it is held as a partial multi-valued image.

[0018] The Drawing Object part 14 holds areas that can be converted to line vectors, such as lines, line drawings, and enclosing frames. For a horizontal line, for example, it holds the start position, end position, thickness, line type, and so on.

[0019] The Table Object part 15 holds the areas of the document image judged to be table areas. It holds, for example, the number of cells in the vertical and horizontal directions, the cell structure information, and the table frame information, as well as the Text written inside the cells.

[0020] The electronic document format obtained by performing layout analysis on a document image and storing it in the above form based on the analysis result is called PAF.
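The PAF structure described above can be pictured as a small set of record types. The following Python sketch is illustrative only: the class and field names (`PafDocument`, `TextObject`, and so on) and the rectangle layout are assumptions made for this example, since the source does not specify a concrete encoding.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom); layout assumed for this sketch


@dataclass
class TextObject:  # corresponds to the Text Object part 12
    rect: Rect
    binary_image: Optional[bytes] = None  # binary cutout image of the area
    text: Optional[str] = None            # character codes after recognition

    def is_reusable(self) -> bool:
        # Only a character-coded Text area is reusable as document data.
        return self.text is not None


@dataclass
class PictureObject:  # Picture Object part 13
    rect: Rect
    image: bytes
    multi_valued: bool  # True: multi-valued partial image, False: binary


@dataclass
class DrawingObject:  # Drawing Object part 14
    start: Tuple[int, int]
    end: Tuple[int, int]
    thickness: int
    line_type: str


@dataclass
class TableObject:  # Table Object part 15
    rect: Rect
    rows: int
    cols: int
    cell_text: List[List[str]] = field(default_factory=list)


@dataclass
class PafDocument:  # overall structure 11
    page_count: int   # document property information
    scanned_at: str   # management information (input time)
    scanner: str      # management information (input device)
    texts: List[TextObject] = field(default_factory=list)
    pictures: List[PictureObject] = field(default_factory=list)
    drawings: List[DrawingObject] = field(default_factory=list)
    tables: List[TableObject] = field(default_factory=list)

    def is_image_only(self) -> bool:
        # True while no Text area has been character-coded yet.
        return all(t.text is None for t in self.texts)
```

Keeping the binary image and the recognized text as two optional fields on the same Text object mirrors the two storage methods described for the Text Object part; a real format would also need serialization rules, which this sketch leaves out.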

[0021] FIG. 2 is a diagram showing adaptive compression, in which each object is compressed appropriately and stored. As described above, when an area is held as an image, a compression method suited to that area is selected, and the area is compressed and stored. This reduces the file size. The Text Object area 12 and Table Object area 15 described above may be held as binary images. In that case, everything except the Drawing Object part 14, which represents lines, is a collection of partial images. For clarity in the description that follows, storage in this form is called IMG-PAF (image PAF). The form in which the results of character recognition are stored, the recognition having been performed on the areas that layout analysis judged to be image areas containing Text, is called O-PAF.
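As a rough illustration of adaptive compression, the selection could look like the sketch below. The source does not say which codec is applied to which PAF object; MMR and JPEG are simply the codecs it mentions for conventional binary and color storage, so the pairing here, the function name `choose_compression`, and the object-kind strings are all assumptions.

```python
def choose_compression(object_kind: str, multi_valued: bool) -> str:
    """Return a compression label for one PAF object (hypothetical policy)."""
    if object_kind == "drawing":
        # Drawing objects are held as line vectors, not as images.
        return "none (vector data)"
    if object_kind in ("text", "table"):
        # Text and Table areas held as images are binary images.
        return "MMR"
    if object_kind == "picture":
        # Partial multi-valued images versus partial binary images.
        return "JPEG" if multi_valued else "MMR"
    raise ValueError(f"unknown object kind: {object_kind}")
```

For example, `choose_compression("picture", multi_valued=True)` returns `"JPEG"`, while a Text area from the same color page still maps to `"MMR"` because it is held as a binary image.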

[0022] Storing the electronic document PAF described above in place of the document image makes it possible to reduce the file size and to reuse the document. It also brings other advantages, such as enabling full-text search.

[0023] However, not every document image can be properly divided into partial objects when converted to PAF. The document image shown in FIG. 2 has a relatively orderly layout, so converting it to PAF yields an efficiently reduced file, and the document image can also be reproduced faithfully. For a document image with a more intricate layout, however, storing the IMG-PAF that results from layout analysis as is complicates the handling of overlapping areas, and the reproduced electronic document may end up unusable. This point is explained with reference to FIGS. 4 to 6.

[0024] FIG. 4 shows a document image in which, according to the layout analysis result, the areas are arranged in an orderly manner. FIG. 5 shows a document image with a slightly intricate layout, and FIG. 6 shows a layout analysis result for a document image in which TEXT parts (text areas) overlap or lie inside PICT parts (image areas). If a complicated analysis result such as that of FIG. 6 is stored as partial area images as is, there are many overlapping portions, so the file size actually increases, and on reproduction the Text parts stand out unnaturally within the image. It is therefore necessary to determine in advance which document images are suited to partial storage and which would instead yield an inefficient and unsightly image.

[0025] In this embodiment, a document image such as that of FIG. 4 is handled as partial images subdivided as finely as possible, whereas a complicated document image such as that of FIG. 6 is handled, with reproduction in mind, as somewhat coarser partial images. Adding this classification technique allows any document image to be converted to PAF and to gain its benefits. In addition, for a document image that would not reproduce well under cutout storage, in which each partial image is stored separately, it is judged that the original image should simply be stored as is. As a result, when layout analysis or character recognition is likely to give poor results, as much of the original image information as possible is preserved, while well-behaved document images are compressed as efficiently as possible and digitized in a reusable form.

[0026] The processing of this embodiment up to the PAF conversion is described below. FIG. 12 is a block diagram showing the configuration of a system that executes document image processing according to this embodiment. In FIG. 12, reference numeral 101 denotes a CPU, which executes control programs stored in a ROM 102 and a RAM 103 to realize various kinds of processing. Reference numeral 104 denotes an external storage device, which stores document image data and various application programs that are loaded into the RAM 103 as needed for execution by the CPU 101. Reference numeral 105 denotes a display, which presents various displays under the control of the CPU 101. Reference numeral 106 denotes an operation unit, which includes a keyboard, a pointing device, and the like. Reference numeral 107 denotes a bus, which connects the above components. The control programs that realize the processing described later with reference to the flowcharts are assumed to be stored in the external storage device 104, loaded into the RAM 103 as needed, and executed by the CPU 101. These control programs may of course be stored in the ROM 102 instead.

[0027] FIG. 3 is a flowchart explaining the procedure of the PAF conversion processing according to this embodiment. First, a document image is input in step S101. The document image input here is a scanner image or an image file. In step S102, preprocessing is performed. In the preprocessing, if the document image is a multi-valued image, it is binarized. An RGB color image, for example, is first converted to grayscale and then binarized. This binarization is not binarization for printing but binarization for the subsequent layout analysis. The binary image is created by, for example, simple binarization with a slice level of 128, or by a method that performs histogram analysis on partial areas of the document image to find an appropriate threshold and binarizes using that threshold as the slice level. Binarization is used because the subsequent layout analysis and character recognition generally operate on binary images. A further reason is to remove the background color reliably, since for characters on a background it is difficult to discriminate areas and segment characters. A PICT part, on the other hand, is best rendered as an area filled uniformly with black. If the black is broken up, the area is easily misrecognized as a PICT part with a TEXT part inside it.
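The simple binarization with a slice level of 128 can be sketched as follows. This is a minimal illustration, not the embodiment's actual code: the source does not state which grayscale conversion is used, so the ITU-R BT.601 luma weights below are an assumption, as are the function names and the convention that values below the slice level become black (1).

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def to_gray(pixel: RGB) -> int:
    # Luma weights from ITU-R BT.601; an assumed choice, since the
    # source only says the RGB image is converted to grayscale.
    r, g, b = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def binarize(image: List[List[RGB]], slice_level: int = 128) -> List[List[int]]:
    # 1 = black pixel, 0 = white pixel. Treating values below the
    # slice level as black is an assumption about polarity.
    return [[1 if to_gray(p) < slice_level else 0 for p in row] for row in image]
```

The histogram-based variant mentioned in the text would replace the fixed `slice_level` with a threshold computed per partial area; that step is omitted here.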

[0028] Furthermore, recognition errors are likely if the document image is skewed, so skew correction is applied to the document image. Document orientation correction is also performed so that the document image is not rotated sideways or upside down.

[0029] The process then proceeds to step S103, where layout analysis is performed. Known techniques can be used for the layout analysis here, such as a contour tracing method that detects clusters of black pixels in the image and traces their outlines, or a labeling method that assigns numbers in sequence to black pixel areas as they are detected. The attribute of each area is then determined from the size, position, and so on of the detected clusters of black pixels.
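The labeling approach can be illustrated with a small connected-component pass that numbers black-pixel clusters in scan order and records a bounding box for each. This is a generic sketch under stated assumptions (4-connectivity, 1 = black, inclusive box coordinates, the name `label_components`); the source does not specify these details, and the attribute judgment that follows labeling is not shown.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def label_components(binary: List[List[int]]) -> List[Box]:
    """Number black-pixel clusters in scan order and return their bounding boxes."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    boxes: List[Box] = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or labels[y][x]:
                continue
            label = len(boxes) + 1  # next number in sequence
            labels[y][x] = label
            left, top, right, bottom = x, y, x, y
            queue = deque([(x, y)])
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                # Visit the four direct neighbours (4-connectivity assumed).
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] == 1 and not labels[ny][nx]):
                        labels[ny][nx] = label
                        queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes
```

The size and position of each returned box are the kind of features from which an attribute such as text or picture would then be judged.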

[0030] As a result of the layout analysis, attribute information for each block, recognized under attributes such as TEXT (text), TITLE (title), CAPTION (caption), LINEART (line drawing), PICTURE (natural image), FRAME (frame), LINE (line), and TABLE (table), is output together with its rectangle address information as the layout analysis result. The areas belonging to the Text Object part 12 of FIG. 1 are TEXT, TITLE, and CAPTION. Likewise, PICTURE and LINEART correspond to the Picture Object part 13, FRAME and LINE to the Drawing Object part 14, and TABLE to the Table Object part 15.
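The correspondence between layout attributes and PAF object parts amounts to a lookup table. The sketch below encodes the mapping stated in the text; the dictionary name, the helper `group_blocks`, and the `(attribute, rect)` input shape are assumptions for illustration.

```python
# Attribute-to-part mapping as stated in the description.
OBJECT_PART_FOR_ATTRIBUTE = {
    "TEXT": "Text Object",
    "TITLE": "Text Object",
    "CAPTION": "Text Object",
    "PICTURE": "Picture Object",
    "LINEART": "Picture Object",
    "FRAME": "Drawing Object",
    "LINE": "Drawing Object",
    "TABLE": "Table Object",
}


def group_blocks(blocks):
    """Group (attribute, rect) pairs by the PAF object part they belong to."""
    grouped = {}
    for attribute, rect in blocks:
        part = OBJECT_PART_FOR_ATTRIBUTE[attribute]
        grouped.setdefault(part, []).append(rect)
    return grouped
```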

[0031] Next, in step S104, image understanding processing is performed. This is a key process in this embodiment, and its details are described later. As a result of the image understanding processing, the document image is classified so that the PAF conversion can be performed appropriately.

[0032] In step S105, IMG-PAF conversion is performed. In this IMG-PAF conversion, partial images are cut out according to the rules of the image understanding processing of step S104. If the original image is a multi-valued image, PICT parts are cut out as multi-valued images.

[0033] Next, in step S106, IMG-PAF storage processing is performed. Here, each image cut out in step S105 is compressed by a method adapted to it. The compressed images are then stored together with the layout information to form the IMG-PAF.

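[0034] Subsequently, in step S201, the data stored as IMG-PAF is read back. In step S202, the Text areas are extracted based on the layout information. Each Text area holds a binary cutout image; if this image is compressed, it is decompressed. In step S203, character recognition is performed on the images held in the Text areas. As preprocessing for this character recognition, the language is identified, generally by discriminating between Japanese and English. Next, the typesetting direction is determined, distinguishing vertical from horizontal writing. Then, if the text is Japanese, for example, the processing is executed using a Japanese character recognition engine.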
The data stored in the G-PAF format is read again. In step S202, a text area is extracted from the layout information. In the text area, a binary clipped image is stored. If this image is compressed, it is expanded. In step S203, a character recognition process is performed on the image stored in the text area. In this character recognition processing, language determination is performed as preprocessing. Generally, Japanese-English distinction is performed. Next, group orientation discrimination is performed to discriminate between vertical writing and horizontal writing. Then, for example, in the case of Japanese, the processing is executed using a Japanese character recognition engine.

[0035] In the O-PAF conversion of step S204, the character codes and character recognition information obtained in step S203 are incorporated back into the PAF. In this case, the images of the Text areas may be deleted as needed. Then, in the O-PAF storage processing of step S205, the O-PAF obtained in step S204 is stored.

[0036] Software that implements the above flow is hereinafter called PAF Capture. The key point of this embodiment is that the layout analysis result from step S103 is used to determine which class a document image belongs to and how the image should be stored. This processing is called image understanding processing.

[0037] <Classification of Document Images> Document images can be classified by their layout. FIG. 7 shows the classification scheme for document images according to this embodiment. A document image 51 is first broadly classified into one of two types: aligned layout images 52 and non-aligned layout images 53. An aligned layout image 52 is a document in which the rectangles of the various attributes are laid out in an orderly manner, as shown in FIG. 4. Patent publications, for example, belong to this class.

[0038] A non-aligned layout image 53 is a document image in which PICT areas and Text areas overlap one another or one lies inside the other, as shown in FIG. 6. A non-aligned layout image 53 is thus a document image whose areas are difficult to separate completely.

[0039] Non-aligned layout images 53 can be further divided into overlapping rectangle layouts 54 and other layouts 55. The layout of FIG. 6 falls under the overlapping rectangle layout 54. For a layout like that of FIG. 6, a fairly good reproduction result can be obtained as long as image cutout rules are established. Document images classified under the other layouts 55 include, for example, images without characters and images for which the area discrimination output was extremely distorted because the image was skewed. Document images classified under the other layouts 55 are better left as the original image than forced into partial area images.

[0040] In this embodiment, document images that fall under the aligned layout image 52 or the overlapping rectangle layout 54 are extracted; all other images are not passed to the cutout image storage processing, which prevents images from being cut out inappropriately.

[0041] FIG. 8 shows the overlap patterns of PICT parts and TEXT parts. In FIG. 8, (a) is an example in which a TEXT part is contained entirely inside a PICT part. FIG. 9(a) shows an example of an image from which rectangles like those of FIG. 8(a) are obtained. FIG. 8(b) shows a case in which part of a TEXT part overlaps a PICT part, and FIG. 9(b) is an example of such an image. For these two patterns, an overlap flag is set on the TEXT part during layout analysis, with the PICT part as its parent. Different flags are set for cases (a) and (b). FIG. 8(c) is an example in which a PICT part is analyzed as lying inside another PICT part, and FIG. 9(c) is an example of such an image. Similarly, FIG. 8(d) shows a case in which a PICT part partially overlaps another PICT part, and FIG. 9(d) is an example of such an image.
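The distinction between full containment and partial overlap can be sketched as a rectangle test. The flag values, the function name `overlap_flag`, and the rectangle convention (right and bottom edges exclusive) are assumptions for this example; the source states only that different flags are set for the two patterns.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive

NO_OVERLAP, CONTAINED, PARTIAL = "none", "contained", "partial"  # hypothetical flag values


def overlap_flag(parent: Rect, child: Rect) -> str:
    """Classify how `child` (e.g. a TEXT part) overlaps `parent` (a PICT part)."""
    pl, pt, pr, pb = parent
    cl, ct, cr, cb = child
    # The rectangles share no area at all.
    if cr <= pl or pr <= cl or cb <= pt or pb <= ct:
        return NO_OVERLAP
    # The child lies entirely inside the parent: patterns (a) and (c).
    if pl <= cl and pt <= ct and cr <= pr and cb <= pb:
        return CONTAINED
    # Only part of the child overlaps the parent: patterns (b) and (d).
    return PARTIAL
```

For instance, with a PICT part at `(0, 0, 100, 100)`, a TEXT part at `(10, 10, 50, 50)` yields `"contained"` and one at `(80, 80, 150, 120)` yields `"partial"`.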

[0042] The image understanding processing of step S104 is described using the image examples above. FIG. 10 is a flowchart showing the procedure of the image understanding processing according to this embodiment. First, layout analysis is performed in step S301. In step S302, overlap flags are added to the layout analysis result. An overlap flag is added when a PICT part and a TEXT part overlap, as shown in (a) and (b) of FIGS. 8 and 9.

[0043] In step S303, the overlap flags in the layout analysis result are checked. If no PICT part and TEXT part overlap, the process proceeds to step S304, where it is determined whether a TEXT part exists in the image. If a TEXT part exists, the image is judged to be an aligned layout image and the process proceeds to step S306, where the image is sent to the partial-image cutout processing that performs the IMG-PAF conversion of step S105. If no TEXT area exists in step S304, the image is judged to be an other-layout image and the process proceeds to step S310.

[0044] If it is determined in step S303 that there is an overlap, the process proceeds to step S305, where each TEXT part that at least partly overlaps a PICT part is detected and the PICT area is expanded to include the protruding portion of the TEXT part. The PICT part of FIG. 6, for example, is changed to a PICT part like that of FIG. 11. Here the rectangles are left joined together rather than being replaced by one large circumscribed rectangle.
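One way to picture the expansion in step S305 is to represent the enlarged PICT region as a list of joined rectangles rather than a single bounding box. The sketch below is an assumption-laden illustration: the names `expand_pict` and `intersects` and the rectangle convention are invented for this example, and the source does not describe the data representation.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def expand_pict(pict: Rect, texts: List[Rect]) -> Tuple[List[Rect], List[Rect]]:
    """Return (expanded PICT region, TEXT rectangles left untouched).

    The region is a list of joined rectangles, not one circumscribed box.
    """
    region = [pict]
    remaining = []
    for text in texts:
        if intersects(pict, text):
            region.append(text)  # any protruding part is now covered by the region
        else:
            remaining.append(text)
    return region, remaining
```

Keeping the rectangles joined instead of taking their circumscribed rectangle avoids pulling unrelated neighbouring areas into the PICT region.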

[0045] In step S307, it is checked whether any rectangles exist other than the overlapping rectangles. If there are none, the process proceeds to step S310 and the image is judged to be an other layout. This is because, for such a document image, the PICT and TEXT parts are intricately intertwined, and it is judged better to keep the image as a single document image than to separate it.

【００４６】ステップＳ３０７で、他の矩形があると判
定された場合は、ステップＳ３０８へ進み、再結合した
矩形が文書画像の８０％以上を占めるかどうかを判定す
る。この８０％は、一例であり、矩形の位置、幅、高さ
を考慮して、判断する。ここで８０％以上と判定された
場合には、その他のレイアウトと判断してステップＳ３
１０へ進む。この場合は、ＰＩＣＴ部とＴＥＸＴ部が複
雑に入り組んだ領域が多く存在し、分割しないほうが良
いと判断されるからである。一方、８０％以下であった
場合は、ステップＳ３０９へ進み、重複矩形レイアウト
と判断される。If it is determined in step S307 that other rectangles exist, the process proceeds to step S308, where it is determined whether the recombined rectangles occupy 80% or more of the document image. The 80% figure is only an example; the determination takes the position, width, and height of the rectangles into account. If the result is 80% or more, the image is judged to be an other layout and the process proceeds to step S310, because many regions exist in which PICT and TEXT parts are intricately intertwined and it is judged better not to divide the image. If the result is less than 80%, the process proceeds to step S309 and the image is judged to be an overlapping rectangular layout.
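
The decision flow of steps S303 to S310 can be summarised in a short sketch. The function name `classify`, the `Layout` enum, and the boolean and ratio inputs are hypothetical; the specification does not define how the area ratio is computed beyond saying that position, width, and height are considered, so the ratio is taken here as a precomputed input. The 80% threshold is the example value given in the text, and a ratio of exactly 80% is treated as "80% or more".

```python
from enum import Enum


class Layout(Enum):
    ALIGNED = "aligned"          # aligned layout image
    OVERLAPPING = "overlapping"  # overlapping rectangular layout image
    OTHER = "other"              # kept as a single image


def classify(has_overlap: bool,
             has_text: bool,
             has_other_rects: bool,
             merged_area_ratio: float,
             threshold: float = 0.8) -> Layout:
    """Classify a document image following steps S303 to S310 (illustrative only)."""
    if not has_overlap:                       # S303: overlap flag not set
        # S304: aligned layout if any TEXT part exists, otherwise other layout
        return Layout.ALIGNED if has_text else Layout.OTHER
    # S305 (merging TEXT into PICT) is assumed to have been done already.
    if not has_other_rects:                   # S307: only overlapped rectangles remain
        return Layout.OTHER                   # S310
    if merged_area_ratio >= threshold:        # S308: merged rectangles dominate the page
        return Layout.OTHER                   # S310
    return Layout.OVERLAPPING                 # S309
```

For instance, `classify(True, True, True, 0.5)` returns `Layout.OVERLAPPING`, while the same call with a ratio of `0.85` returns `Layout.OTHER`.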

【００４７】整列レイアウトもしくは重複レイアウトと
判定された文書画像は、共にステップＳ１０５のＩＭＧ
−ＰＡＦ化処理における部分画像保存処理系で、部分画
像保存を行う。その際、整列レイアウトの場合は問題な
く、部分画像切り出しを行いそれぞれ最適な圧縮を行
う。重複レイアウトの文書画像は、前記ＴＥＸＴとＰＩ
ＣＴのオーバーラップ部分は、接続処理を行い、ＰＩＣ
Ｔ部とする。ＰＩＣＴ部とＰＩＣＴ部のオーバーラップ
部分は、同様に接続処理を行い、ＰＩＣＴ部とする。Ｐ
ＩＣＴ部内にＴＥＸＴ部がある場合とＰＩＣＴ部がある
場合は、大きい方が小さい方を包含して画像保存する。Document images judged to be an aligned layout or an overlapping layout are both saved as partial images by the partial image saving system of the IMG-PAF conversion in step S105. For an aligned layout, the partial images are simply cut out and each is compressed with the method best suited to it. For an overlapping-layout document image, each part where TEXT and PICT overlap is joined and treated as a PICT part. A part where two PICT parts overlap is joined in the same way and treated as a PICT part. When a TEXT part or another PICT part lies inside a PICT part, the larger region absorbs the smaller one and is saved as an image.
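
The containment rule at the end of this paragraph (a TEXT or PICT part lying inside a PICT part is absorbed by the larger part) might look like the sketch below. The tuple layout `(x, y, width, height, kind)` and the names `contains` and `absorb_contained` are assumptions made for illustration, not part of the specification.

```python
from typing import List, Tuple

# Hypothetical region format: (x, y, width, height, kind), kind is "TEXT" or "PICT"
Region = Tuple[int, int, int, int, str]


def contains(outer: Region, inner: Region) -> bool:
    """True when `inner` lies entirely within `outer` (borders may coincide)."""
    ox, oy, ow, oh, _ = outer
    ix, iy, iw, ih, _ = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh


def absorb_contained(regions: List[Region]) -> List[Region]:
    """Drop every region that lies inside some PICT region; the PICT region is kept.

    If two regions have identical geometry, the earlier one in the list is
    allowed to absorb the later one, so duplicates do not delete each other.
    """
    kept = []
    for i, r in enumerate(regions):
        swallowed = any(
            j != i and o[4] == "PICT" and contains(o, r)
            and (o[:4] != r[:4] or j < i)
            for j, o in enumerate(regions)
        )
        if not swallowed:
            kept.append(r)
    return kept
```

The absorbed regions are not saved separately in this sketch because their pixels are already part of the enclosing PICT partial image.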

【００４８】このような、画像理解ルールを用いること
により、ＩＭＧ−ＰＡＦ化したファイルが、不自然に部
分画像とならないように、無理な場合は、元画像を自動
保存させることができる。By using such image understanding rules, the original image is automatically saved whenever splitting is impractical, so that an IMG-PAF file is not broken into unnatural partial images.

【００４９】以上の様な処理により、整列レイアウト画
像の場合には、かなり小さなファイルとなる。また、こ
の分類に属した文書画像は、さらに文字認識処理によ
り、さらに使いやすい電子文書化が可能となる。一方、
重複レイアウト画像は、できるだけ原画像部を残し、再
現させる。これにより、不自然なＴＥＸＴ部とＰＩＣＴ
部の境目をなくすことが可能となる。With the above processing, an aligned layout image yields a considerably smaller file. Document images in this class can also be turned into more usable electronic documents by applying character recognition. An overlapping layout image, on the other hand, is reproduced with as much of the original image as possible preserved, which eliminates unnatural boundaries between TEXT and PICT parts.

【００５０】［他の実施形態］次に他の実施形態につい
て説明する。上記実施形態では、ＩＭＧ−ＰＡＦを保存
する際に画像理解ルールを保存時に用いたが、文書画像
自動分類技術は、画像ファイリングの自動分類方法とし
ても実施できる。この場合、レイアウト解析した結果に
ついて画像理解処理を行い、図７に示した分類を得る。
そして、その分類の情報をファイルの管理情報に記述し
ておく。そして、整列レイアウト画像５２、重複矩形レ
イアウト画像５４に属する文書画像は、少なくとも再利
用可能なドキュメントとして分類しておく。その他レイ
アウト５５に属する文書画像は、画像データとして扱う
分類に分ける。このドキュメントは、画像として扱うこ
とにする。[Other Embodiments] Another embodiment is described next. In the embodiment above, the image understanding rules are applied when an IMG-PAF is saved, but the automatic document image classification technique can also be used as an automatic classification method for image filing. In that case, image understanding processing is performed on the layout analysis result to obtain the classification shown in FIG. 7, and the classification is written into the file's management information. Document images belonging to the aligned layout image 52 or the overlapping rectangular layout image 54 are classified as at least reusable documents. Document images belonging to the other layout 55 are placed in a class that is handled as image data; such documents are treated as images.

【００５１】この分類分けは、ファイルを保管する領域
を設定したりするのに都合がよい。また検索するのも、
整列レイアウト画像に属するものから順番にアクセスす
る方が、ヒットする時間も早くなる。例えば、その他レ
イアウトの文書画像は、検索項目に入れないとすると、
その分の処理が軽減されることになる。This classification is convenient for, for example, deciding the area in which files are stored. Searching is also faster when files are accessed in order starting with those belonging to the aligned layout class. For example, if document images of the other layout are excluded from the search, the processing load is reduced accordingly.
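
As a rough illustration of how a classification recorded in the file management information could drive the search order, consider the sketch below. The dictionary keys `name` and `layout`, the `SEARCH_PRIORITY` table, and the `search_order` function are hypothetical. The text only says that aligned layout images are accessed first and that other-layout images may be left out; placing overlapping layout images second is an assumption.

```python
from typing import Dict, List

# Lower number = searched earlier. Files classified as "other" are absent
# from the table and therefore skipped (assumed policy).
SEARCH_PRIORITY = {"aligned": 0, "overlapping": 1}


def search_order(files: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the files worth searching, aligned layout images first."""
    candidates = [f for f in files if f["layout"] in SEARCH_PRIORITY]
    # sorted() is stable, so files of the same class keep their original order.
    return sorted(candidates, key=lambda f: SEARCH_PRIORITY[f["layout"]])
```

Skipping the other-layout files shortens the candidate list before any content matching is attempted, which is the saving the paragraph above refers to.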

【００５２】以上説明してきたように、本実施形態の文
書画像自動分類によれば、人間が、この画像は画像のま
まがよいとか、この画像は部分画像にして保存しておい
ても良いといったことを判断することを行わなくて済む
ことになる。また、整列レイアウト画像のように、文字
認識を実行して、再びコード化して再利用しても、かな
り良好な電子文書を得ることができるといったような判
断を、自動分類されたファイル類から判断することがで
きる効果がある。As described above, the automatic document image classification of this embodiment relieves a person of having to decide whether an image is best kept as a whole image or may be saved as partial images. It also makes it possible to tell from the automatically classified files that, for example, an aligned layout image will yield a reasonably good electronic document when character recognition is run and the result is re-encoded for reuse.

【００５３】なお、本発明は、複数の機器（例えばホス
トコンピュータ、インタフェイス機器、リーダ、プリン
タなど）から構成されるシステムに適用しても、一つの
機器からなる装置（例えば、複写機、ファクシミリ装置
など）に適用してもよい。The present invention may be applied to a system composed of multiple devices (for example, a host computer, an interface device, a reader, and a printer) or to an apparatus consisting of a single device (for example, a copying machine or a facsimile machine).

【００５４】また、本発明の目的は、前述した実施形態
の機能を実現するソフトウェアのプログラムコードを記
録した記憶媒体（または記録媒体）を、システムあるい
は装置に供給し、そのシステムあるいは装置のコンピュ
ータ（またはCPUやMPU）が記憶媒体に格納されたプログ
ラムコードを読み出し実行することによっても、達成さ
れることは言うまでもない。この場合、記憶媒体から読
み出されたプログラムコード自体が前述した実施形態の
機能を実現することになり、そのプログラムコードを記
憶した記憶媒体は本発明を構成することになる。また、
コンピュータが読み出したプログラムコードを実行する
ことにより、前述した実施形態の機能が実現されるだけ
でなく、そのプログラムコードの指示に基づき、コンピ
ュータ上で稼働しているオペレーティングシステム(OS)
などが実際の処理の一部または全部を行い、その処理に
よって前述した実施形態の機能が実現される場合も含ま
れることは言うまでもない。Needless to say, the object of the present invention is also achieved by supplying a system or apparatus with a storage medium (or recording medium) on which the program code of software implementing the functions of the above embodiments is recorded, and having a computer (or CPU or MPU) of that system or apparatus read and execute the program code stored on the storage medium. In this case, the program code read from the storage medium itself implements the functions of the above embodiments, and the storage medium storing the program code constitutes the present invention. Needless to say, this covers not only the case where the functions of the above embodiments are implemented by the computer executing the program code it has read, but also the case where an operating system (OS) or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and that processing implements the functions of the above embodiments.

【００５５】さらに、記憶媒体から読み出されたプログ
ラムコードが、コンピュータに挿入された機能拡張カー
ドやコンピュータに接続された機能拡張ユニットに備わ
るメモリに書込まれた後、そのプログラムコードの指示
に基づき、その機能拡張カードや機能拡張ユニットに備
わるCPUなどが実際の処理の一部または全部を行い、そ
の処理によって前述した実施形態の機能が実現される場
合も含まれることは言うまでもない。Needless to say, this also covers the case where the program code read from the storage medium is written into memory provided on a function expansion card inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like on that card or unit performs part or all of the actual processing based on the instructions of the program code, and that processing implements the functions of the above embodiments.

【００５６】[0056]

【発明の効果】以上説明したように、本発明によれば、
文書画像の保存や配信に適し、しかも電子文書化するこ
とが容易な電子文書データを作成することが可能とな
る。As described above, the present invention makes it possible to create electronic document data that is suitable for storing and distributing document images and that can easily be converted into electronic documents.

[Brief description of the drawings]

【図１】電子文書ＰＡＦの簡単な構造を示す図である。FIG. 1 is a diagram showing a simple structure of an electronic document PAF.

【図２】ＩＭＧ−ＰＡＦの適応的圧縮を説明した図であ
る。FIG. 2 is a diagram illustrating adaptive compression of IMG-PAF.

【図３】本実施形態による本実施形態によるＰＡＦ化処
理の手順を説明するフローチャートである。FIG. 3 is a flowchart illustrating the procedure of the PAF conversion process according to the present embodiment.

【図４】整列レイアウト画像のレイアウト解析結果を示
す図である。FIG. 4 is a diagram showing a layout analysis result of an aligned layout image.

【図５】入り組んだ整列レイアウト画像のレイアウト解
析結果を示す図である。FIG. 5 is a diagram illustrating a layout analysis result of a complicated aligned layout image.

【図６】重複矩形レイアウト画像のレイアウト解析結果
を示す図である。FIG. 6 is a diagram illustrating a layout analysis result of an overlapping rectangular layout image.

【図７】文書画像の分類を示した図である。FIG. 7 is a diagram showing classification of document images.

【図８】ＰＩＣＴ部とＴＥＸＴ部の重なりのパターンを
説明する図である。FIG. 8 is a diagram illustrating overlap patterns of a PICT part and a TEXT part.

【図９】ＰＩＣＴ部とＴＥＸＴ部の重なりのパターンを
説明する図である。FIG. 9 is a diagram illustrating overlap patterns of a PICT part and a TEXT part.

【図１０】本実施形態による画像理解処理の手順を示す
フローチャートである。FIG. 10 is a flowchart illustrating a procedure of an image understanding process according to the embodiment.

【図１１】ＰＩＣＴ領域とＴＥＸＴ領域の結合を説明す
る図である。FIG. 11 is a diagram illustrating the connection between the PICT region and the TEXT region.

【図１２】本実施形態による文書画像処理を実現するた
めのシステム構成を示すブロック図である。FIG. 12 is a block diagram illustrating a system configuration for implementing document image processing according to the present embodiment.

Claims

[Claims]

1. An information processing apparatus comprising: an analysis unit that performs layout analysis on an input document image to recognize at least a text area and a picture area; a detection unit that detects an overlap area where the text area and the picture area recognized by the analysis unit overlap; and a storage unit that stores document image data while switching the storage mode of the document image according to whether an overlap area is detected by the detection unit.

2. The information processing apparatus according to claim 1, wherein the storage unit stores the text area and the picture area detected by the detection unit as separate partial images when the detection unit detects no overlap area.

3. The information processing apparatus according to claim 1, wherein the storage unit treats the overlap area detected by the detection unit as one picture area.

4. The information processing apparatus according to claim 2, wherein the storage unit stores the document image data by performing an appropriate compression process on each of a text area and a picture area.

5. The information processing apparatus according to claim 1, wherein the storage unit stores the document image without dividing it into partial images when the proportion of the document image occupied by the overlap area detected by the detection unit is larger than a predetermined threshold.

6. The information processing apparatus according to claim 1, further comprising a character recognition unit that executes a character recognition process on a partial image of the area stored as the text area.

7. An information processing method comprising: an analysis step of performing layout analysis on an input document image to recognize at least a text area and a picture area; a detecting step of detecting an overlap area where the text area and the picture area recognized in the analysis step overlap; and a storing step of storing document image data while switching the storage mode of the document image according to whether an overlap area is detected in the detecting step.

8. The information processing method according to claim 7, wherein the storing step stores the text area and the picture area detected in the detecting step as separate partial images when no overlap area is detected in the detecting step.

9. The information processing method according to claim 7, wherein in the storing step, the overlapping area detected in the detecting step is treated as one picture area.

10. The information processing method according to claim 8, wherein in the storing step, the document image data is stored by performing an appropriate compression process on each of a text area and a picture area.

11. The information processing method according to claim 7, wherein, in the storing step, the document image is stored without being divided into partial images when the proportion of the document image occupied by the overlap area detected in the detecting step is greater than a predetermined threshold value.

12. The information processing method according to claim 7, further comprising a character recognition step of performing a character recognition process on a partial image of the area stored as the text area.

13. A storage medium storing a control program for causing a computer to implement a process of storing document image data, the control program comprising: code for an analysis step of performing layout analysis on an input document image to recognize at least a text area and a picture area; code for a detection step of detecting an overlap area where the text area and the picture area recognized in the analysis step overlap; and code for a storage step of storing the document image data while switching the storage mode of the document image according to whether an overlap area is detected in the detection step.