JP2000324331A - Image processor and method therefor

Info

Publication number: JP2000324331A
Application number: JP11133069A
Authority: JP
Inventors: Kazuyuki Saito; 和之齋藤
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1999-05-13
Filing date: 1999-05-13
Publication date: 2000-11-24
Anticipated expiration: 2019-05-13
Also published as: JP4109793B2

Abstract

PROBLEM TO BE SOLVED: To reduce the data quantity of an input document image for storage, and to read out the stored data and reproduce the image with high quality. SOLUTION: The image processor receives a document image 101, generates binary image data 105 from the document image, generates layout analysis data 107 according to the attributes of the images included in the document image, and generates and stores document image understanding data 111 denoting the result of analyzing the document image on the basis of the layout analysis data 107. The attribute of each partial image of the document image 101 is checked on the basis of the document image understanding data 111; when the area is a text area, the partial image is taken from the binary image and stored, and otherwise the original image is selected as the source and the partial image is stored from it. At reproduction, synthesis/reproduction processing is executed according to the image type of each partial image.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001] [Technical Field of the Invention] The present invention relates to an image processing apparatus that inputs and stores document images and reproduces them, and to a method therefor.

[0002] [Prior Art] Conventional apparatuses that input and process document images include, for example, copying machines that optically read a document on which text and images are printed and print out the resulting document image, document database systems that store the optically input document image, and facsimile apparatuses that transmit the document image via a network or the like.

[0003] [Problems to be Solved by the Invention] However, such conventional apparatuses can no longer keep up with digitization and networking. Specifically, as document images have become color images, storing a document image as it is or distributing it via a network involves an enormous amount of image data. Moreover, if the image data of the document image is compressed uniformly, image quality suitable for reuse cannot be maintained. There is also the problem that the quality of the output image deteriorates depending on the output device, for example depending on whether it is a monochrome printer or a color printer.

[0004] The present invention has been made in view of the above conventional examples, and its object is to provide an image processing apparatus and method capable of storing an input document image with a reduced amount of data, and of reading out the stored data and reproducing it with high quality.

[0005] [Means for Solving the Problems] To achieve the above object, an image processing apparatus according to the present invention has the following configuration. Namely, it comprises: image input means for inputting a document image; binary image creating means for creating binary image data from the document image input by the image input means; layout analysis means for generating, from the binary image data created by the binary image creating means, layout analysis data according to the attributes of the images included in the document image; analysis means for generating and storing analysis data indicating the result of analyzing the document image based on the layout analysis data; and document image reproducing means for reproducing the document image based on the analysis data analyzed and stored by the analysis means.

[0006] To achieve the above object, an image processing method according to the present invention comprises the following steps. Namely, it comprises: an image input step of inputting a document image; a binary image creating step of creating binary image data from the document image input in the image input step; a layout analysis step of generating, from the binary image data created in the binary image creating step, layout analysis data according to the attributes of the images included in the document image; an analysis step of generating and storing analysis data indicating the result of analyzing the document image based on the layout analysis data; and a document image reproducing step of reproducing the document image based on the analysis data analyzed and stored in the analysis step.

[0007] [Embodiments of the Invention] Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings.

[0008] FIG. 1 is a functional block diagram showing the functional configuration of the image processing system according to the first embodiment.

[0009] In FIG. 1, reference numeral 101 denotes an input document, which is a document original such as printed matter or image data handled on a computer. Reference numeral 102 denotes an image input unit, which has, for example, a scanner and reads the document 101 and inputs it as an image signal. Reference numeral 103 denotes the original image data input from the image input unit 102. Reference numeral 104 denotes a binary image output unit, which creates and outputs binary image data 105 from the image signal of the input document 101. Reference numeral 106 denotes a layout analysis unit, which, based on the binary image data 105, outputs layout analysis data 107, that is, information dividing the input document image into regions corresponding to attributes such as figures, text, and tables. Reference numeral 108 denotes a document image understanding storage unit, which stores document image understanding data while adaptively changing the storage conditions, using the original image 103 of the input document and the binary image data 105. Reference numeral 109 denotes a document image understanding data output unit, which reads out and outputs the document image understanding data 110 stored in the document image understanding storage unit 108. Reference numeral 111 denotes a document image understanding data input unit, which receives the document image understanding data 110 output from the document image understanding data output unit 109. Reference numeral 112 denotes a document image understanding reproduction unit, which creates a document image from the document image understanding data 110 while adaptively changing the reproduction conditions. Reference numeral 113 denotes the reproduced document image data, 114 denotes an image output unit that outputs the reproduced document image data 113, and 115 denotes the output document that is finally reproduced and output.

[0010] FIG. 2 is a conceptual diagram illustrating the configuration of the image processing system according to the first embodiment.

[0011] In FIG. 2, reference numeral 201 denotes a computer that performs document image understanding processing and corresponds to the binary image output unit 104, the layout analysis unit 106, and so on in FIG. 1. Reference numeral 202 denotes a storage device that stores data and corresponds to the document image understanding storage unit 108 in FIG. 1. Reference numeral 203 denotes a scanner, which reads a document image and inputs it photoelectrically. Reference numeral 204 denotes a color printer. Reference numeral 205 denotes a facsimile apparatus, and 206 denotes a digital color copying machine that combines color image input and print output. Reference numeral 207 denotes a printer server that manages the printer 208, and 208 denotes a monochrome printer. Reference numeral 209 denotes a file server that manages a database, 210 denotes the database, and 211 denotes a digital copying machine that combines monochrome image input and print output. Reference numeral 212 denotes a network that interconnects the digital devices described above. In this configuration, the image input unit 102 corresponds to the scanner 203, the scanner units of the copying machines 206 and 211, and also the document reading unit of the facsimile apparatus 205. The image output unit 114 corresponds to the printers 204 and 208, the printer units of the copying machines 206 and 211, and the printer unit of the facsimile apparatus 205.

[0012] Next, the flow of processing in the image processing system according to the present embodiment will be described.

[0013] FIG. 3 is a flowchart illustrating the flow of processing in the image processing system according to the present embodiment.

[0014] The present embodiment is described for the case where a 24-bit multi-valued image is input from the scanner 203, the copying machine 206, or the like, distributed via the network 212, and output at the distribution destination by the monochrome printer 208, the copying machine 211, the facsimile apparatus 205, or the like.

[0015] First, in step S301, a document image is input as a multi-valued image from the scanner 203 serving as the image input unit 102. Next, in step S302, the multi-valued image data is converted into binary image data 105 by the binary image output unit 104. Then, in step S303, based on the binary image data 105, the layout analysis unit 106 divides the figures, text, tables, and the like contained in the document image into regions by attribute and outputs layout analysis data 107 (see FIG. 4).
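The description does not state how the binary image output unit 104 derives the binary image in step S302. As a minimal sketch only, assuming a fixed threshold on the mean of the R, G, and B values (the function name `binarize` and the threshold of 128 are hypothetical), the conversion could look like this, using the black = 1, white = 0 convention given later for the binary image:

```python
def binarize(image, threshold=128):
    """Convert rows of (R, G, B) tuples into rows of 1 (black) / 0 (white).

    The fixed mean-intensity threshold is an assumption for illustration;
    the source does not specify the binarization method.
    """
    return [
        [1 if sum(pixel) / 3 < threshold else 0 for pixel in row]
        for row in image
    ]


# One row with a black, a white, and a dark gray pixel.
sample = [[(0, 0, 0), (255, 255, 255), (100, 100, 100)]]
binary = binarize(sample)
```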

[0016] In FIGS. 4(a) and 4(b), the layout analysis data 107 stores the number n of divided regions and, for each region, the X and Y coordinates of its start point, its width, its height, and its attribute (text = 1, line drawing = 2, picture/photograph = 3, table = 4).
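The layout analysis data can be pictured as a small record type. The following is a sketch under assumptions: the class and field names (`RegionAttribute`, `Region`, `LayoutAnalysisData`) are hypothetical, and only the fields and attribute codes come from the description.

```python
from dataclasses import dataclass
from enum import IntEnum


class RegionAttribute(IntEnum):
    """Attribute codes as listed in the description."""
    TEXT = 1
    LINE_DRAWING = 2
    PICTURE = 3  # picture / photograph
    TABLE = 4


@dataclass(frozen=True)
class Region:
    x: int       # X coordinate of the region's start point
    y: int       # Y coordinate of the region's start point
    width: int
    height: int
    attribute: RegionAttribute


@dataclass
class LayoutAnalysisData:
    regions: list

    @property
    def n(self) -> int:
        """Number of divided regions."""
        return len(self.regions)


layout = LayoutAnalysisData([
    Region(10, 10, 200, 40, RegionAttribute.TEXT),
    Region(10, 60, 120, 90, RegionAttribute.PICTURE),
])
```

Deriving n from the list length, rather than storing it separately, keeps the count and the region entries from drifting apart in this sketch.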

[0017] Next, in step S304, the document image understanding storage unit 108 creates and stores document image understanding data 110 from the multi-valued image 103, which is the original image of the input document, and the binary image data 105, while adaptively changing the storage conditions based on the layout analysis data 107. Then, in step S305, the document image understanding data 110 is output.

[0018] The details of each of the above processes are described below.

[0019] FIG. 5 is a flowchart showing an example of the document image understanding storage processing according to the present embodiment.

[0020] First, in step S501, the layout analysis data 107 of each divided region is analyzed to obtain the region attribute of each region. Next, in step S502, the region attribute is determined; if it is "text", the process proceeds to step S503, where the binary image data 105 is selected as the whole image to cut from, and in step S504 the partial image is cut out using the coordinate data of the region specified in the layout analysis data 107. In step S505, the partial image thus cut out is saved as document image understanding data 110 (see FIG. 6).

[0021] FIGS. 6(a) and 6(b) are diagrams showing the structure of the document image understanding data 110 according to the present embodiment.

[0022] In FIG. 6, the order in which the target partial image data appears in the layout analysis data 107 is used as the ID of the layout region data, and the cut width, cut height, image type, image size, and the cut-out image are stored as the partial image data.

[0023] If the region attribute is not "text" in step S502, the process proceeds to step S506, where the multi-valued image, that is, the original image, is selected as the whole image to cut from, and in step S504 the partial image is cut out using the coordinate data in the region data of the layout analysis data 107. The process then proceeds to step S505, where the cut-out partial image is saved as document image understanding data 110 (FIG. 6).

[0024] Next, in step S507, it is checked whether any divided regions remain; if so, the process returns to step S501 and the above processing is repeated for the remaining divided regions. If no divided regions remain in step S507, the document image understanding storage processing ends.
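The storage loop of FIG. 5 can be sketched as follows. This is an illustration under assumptions, not the patented implementation: images are modeled as row-major lists, the names `crop` and `store_regions` and the dictionary keys are hypothetical, IDs are assumed to start at 0, and the "image size" field is assumed here to be the pixel count.

```python
TEXT = 1  # attribute code for text regions in the layout analysis data


def crop(image, x, y, width, height):
    """Cut the rectangle starting at (x, y) out of a row-major 2D image."""
    return [row[x:x + width] for row in image[y:y + height]]


def store_regions(regions, original, binary):
    """Build partial-image records from (x, y, width, height, attribute) tuples."""
    records = []
    for region_id, (x, y, w, h, attr) in enumerate(regions):  # loop closed by S507
        if attr == TEXT:                                       # S502 -> S503
            source, image_type = binary, "binary"
        else:                                                  # S502 -> S506
            source, image_type = original, "24bit"
        records.append({                                       # S504, S505
            "id": region_id,       # order of appearance (0-based: assumption)
            "width": w,
            "height": h,
            "image_type": image_type,
            "image_size": w * h,   # pixel count (assumption)
            "image": crop(source, x, y, w, h),
        })
    return records


red = (200, 10, 10)
original = [[red] * 4 for _ in range(4)]
binary = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
records = store_regions([(0, 0, 2, 2, 1), (2, 2, 2, 2, 3)], original, binary)
```

Text regions are stored at one bit per pixel while other regions keep their full color, which is where the data reduction in this embodiment comes from.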

[0025] Furthermore, in the present embodiment, the document image understanding data 110 created and stored by the document image understanding storage unit 108 is output onto the network 212 by the document image understanding data output unit 109 and distributed to its destination via the network 212.

[0026] The processing at the destination to which the document image understanding data has been distributed is described with reference to the flowchart of FIG. 7.

[0027] As shown in FIG. 7, first, in step S401, the document image understanding data 110 is input by the document image understanding data input unit 111. Next, in step S402, document image data 113 to be reproduced is created from the document image understanding data 110 while adaptively changing the reproduction conditions, and based on the document image data 113, a reproduced image is output as an output document from the monochrome printer serving as the image output unit 114 (S403).

[0028] FIG. 8 is a flowchart showing the document image understanding reproduction processing of step S402 in FIG. 7.

[0029] First, in step S701, the image that serves as the background of the reproduced document image is created as a white background. Next, in step S702, partial image data is taken out of the document image understanding data 110, and in step S703, the partial image and its coordinate data are used to combine it with the white background image to create reproduced document image data 113. Then, in step S704, it is checked whether all partial images have been processed; if not, the process returns to step S702 and the above processing is repeated.
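The reproduction loop of FIG. 8 can be sketched as below. The function name `reproduce` and the tuple layout of `parts` are hypothetical, and the per-pixel combination in step S703 is simplified here to plain placement of the partial image at its coordinates, which is an assumption for brevity.

```python
WHITE = (255, 255, 255)


def reproduce(page_width, page_height, parts):
    """Place partial images on a white page.

    parts: iterable of (x, y, image), where image is rows of (R, G, B) tuples.
    """
    page = [[WHITE] * page_width for _ in range(page_height)]  # S701
    for x, y, image in parts:                                  # S702, S704
        for dy, row in enumerate(image):                       # S703 (simplified)
            for dx, pixel in enumerate(row):
                page[y + dy][x + dx] = pixel
    return page


page = reproduce(3, 2, [(1, 0, [[(0, 0, 0)]])])
```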

[0030] FIG. 9 is a flowchart showing the reproduced image synthesis processing of step S703 in FIG. 8.

[0031] First, in step S801, the image type of the partial image is extracted from the document image understanding data 110. Next, in step S802, it is checked whether the image type is the "binary image type"; if so, the process proceeds to step S803, where pseudo-24-bit conversion is performed to convert the black and white pixels of the binary image into the black and white pixels of a 24-bit multi-valued image, respectively. Here, a black pixel of the binary image is "1" and a white pixel is "0"; a black pixel of the 24-bit multi-valued image is R = 0, G = 0, B = 0 and a white pixel is R = 255, G = 255, B = 255 (R is the red component, G the green component, and B the blue component, each an 8-bit value).
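The pseudo-24-bit conversion of step S803 is a direct pixel mapping. A minimal sketch, assuming images are lists of rows (the function name `to_pseudo_24bit` is hypothetical):

```python
BLACK = (0, 0, 0)
WHITE = (255, 255, 255)


def to_pseudo_24bit(binary_image):
    """Map binary pixels (1 = black, 0 = white) to (R, G, B) tuples."""
    return [[BLACK if p == 1 else WHITE for p in row] for row in binary_image]


converted = to_pseudo_24bit([[1, 0], [0, 1]])
```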

[0032] If the image type of the partial image is the "24-bit multi-valued image type" in step S802, the process proceeds to step S804 and the partial image is used for synthesis as it is. In step S804, a composite image is created by performing a logical operation between the background image and the partial image for each pixel of the partial image.

[0033] In the present embodiment, a logical operation is used such that combining a white pixel of the background image (R = 255, G = 255, B = 255) with a black pixel of the partial image (R = 0, G = 0, B = 0) yields a black pixel (R = 0, G = 0, B = 0).
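The description does not name the logical operation. One operation with the stated property is a channel-wise bitwise AND; the sketch below assumes that choice, and the function name `and_pixels` is hypothetical.

```python
def and_pixels(base, part):
    """Bitwise AND of two (R, G, B) pixels, channel by channel."""
    return tuple(b & p for b, p in zip(base, part))


white = (255, 255, 255)
black = (0, 0, 0)
result = and_pixels(white, black)
```

Because 255 has all eight bits set, AND with a white background pixel returns the partial image pixel unchanged, so color regions also pass through intact on a white page.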

[0034] [Second Embodiment] Next, a second embodiment of the present invention will be described. In the second embodiment, in the document image understanding storage processing of step S304 in FIG. 3 described above, the image data is compressed when it is stored, and the compression method is changed adaptively according to the image type.

[0035] The flow of the document image understanding storage processing according to this embodiment is described below with reference to the flowchart of FIG. 10.

[0036] First, in step S601, the layout analysis data 107 of each divided region is analyzed to obtain the attribute of each region. Next, in step S602, it is checked whether the attribute is "text"; if so, the process proceeds to step S603, where the binary image data 105 is selected as the whole image to cut from, and then to step S605, where the partial image is cut out using the coordinate data of each region in the layout analysis data 107.

[0037] If the attribute of the partial image is not "text" in step S602, the process proceeds to step S604, where the multi-valued image, that is, the original image, is selected as the whole image to cut from, and in step S605 the partial image is cut out using the coordinate data of each region in the layout analysis data 107. Up to this point, the processing is the same as that of FIG. 5 (first embodiment).

[0038] Next, in step S606, the image type of the cut-out partial image is checked. If it is a binary image, the process proceeds to step S607, where compression for binary images (for example, MMR) is applied, and the compressed partial image is saved as document image understanding data 110 (FIG. 11) in step S609. At this time, the compression method used is also recorded in the document image understanding data 110, so that the compressed image can be decoded when the document image understanding data is reproduced.

[0039] If the result of the check in step S606 is a multi-valued image, the process proceeds to step S608, where compression for multi-valued images (for example, JPEG) is applied, and the compressed partial image is saved as document image understanding data 110 (FIG. 11). Next, in step S610, it is checked whether any divided regions remain; if so, the process returns to step S601 and the above processing is repeated for the remaining divided regions. If no divided regions remain, the document image understanding storage processing ends.

[0040] FIG. 11 is a diagram illustrating an example of the document image understanding data 110 according to the second embodiment of the present invention.

[0041] The order in which the target layout region data appears in the layout analysis data 107 is used as the ID of that layout region data, and the cut width, cut height, image type, compression method, image size, and the compressed cut-out image are stored as partial image data. Here, the image type is classified as binary image "1" or 24-bit multi-valued image "2", and the compression method as MMR (for binary images) "1" or JPEG (for multi-valued images) "2".
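The type-dependent compression of FIG. 10 and the recorded method code can be sketched as a small dispatch table. Neither MMR nor JPEG is available in the Python standard library, so `zlib` stands in for both codecs here; that substitution, the names `compress_part` and `decompress_part`, and the dictionary keys are assumptions for illustration. Note that zlib is lossless, whereas JPEG is normally lossy.

```python
import zlib

BINARY, MULTI = 1, 2  # image type codes from the description
MMR, JPEG = 1, 2      # compression method codes from the description

# zlib is a placeholder for both codecs in this sketch.
CODECS = {
    MMR: (zlib.compress, zlib.decompress),
    JPEG: (zlib.compress, zlib.decompress),
}


def compress_part(image_type, raw):
    """Choose the method from the image type and record it with the data."""
    method = MMR if image_type == BINARY else JPEG   # S606
    encode, _ = CODECS[method]                       # S607 / S608
    return {"image_type": image_type, "method": method, "data": encode(raw)}  # S609


def decompress_part(record):
    """Use the recorded method code to pick the matching decoder."""
    _, decode = CODECS[record["method"]]
    return decode(record["data"])


record = compress_part(BINARY, b"\x00" * 64)
restored = decompress_part(record)
```

Storing the method code next to the data is what lets the receiving side decode each partial image without knowing in advance how it was compressed.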

[0042] [Third Embodiment] Next, a third embodiment of the present invention will be described. In the third embodiment, each partial image of the document image understanding data 110 is output as a separate image.

[0043] FIG. 12 is a flowchart illustrating the document image understanding reproduction processing (corresponding to step S402 in FIG. 7) according to the third embodiment.

[0044] First, in step S901, partial image data is taken out of the document image understanding data 110. Next, in step S902, the partial image is stored in the output buffer as an image for one page. Then, in step S903, it is checked whether any partial images remain; if so, the process returns to step S901 and the above processing is repeated for the remaining partial images. If no partial images remain in step S903, the document image understanding reproduction processing ends.

[0045] [Fourth Embodiment] Next, a fourth embodiment of the present invention will be described. In the fourth embodiment, when each partial image of the document image understanding data is output as a separate image, the partial images to output are selected according to their attributes.

[0046] FIG. 13 is a flowchart illustrating the document image understanding reproduction processing (corresponding to step S402 in FIG. 7) according to the fourth embodiment.

[0047] First, in step S1001, a partial image is taken out of the document image understanding data 110, and in step S1002 the attribute of the partial image is checked. If the attribute is "text", the process proceeds to step S1003, where one-page image creation processing is executed and the partial image is stored in the output buffer as an image for one page. If the attribute is not "text" in step S1002, the process proceeds to step S1004 without doing anything to the partial image. In step S1004, it is checked whether any partial images remain; if so, the process returns to step S1001 and the above processing is repeated for the remaining partial images. When all partial images have been processed, the document image understanding reproduction processing ends.
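The attribute-based selection of FIG. 13 amounts to a filter over the partial images. A minimal sketch, assuming each partial image arrives paired with its attribute code (the function name `pages_for_text_parts` and that pairing are hypothetical):

```python
TEXT = 1  # attribute code for text regions


def pages_for_text_parts(parts):
    """Return the output buffer: one page per text partial image.

    parts: iterable of (attribute, image) pairs.
    """
    output_buffer = []
    for attribute, image in parts:        # S1001, loop closed by S1004
        if attribute == TEXT:             # S1002
            output_buffer.append(image)   # S1003: one page per partial image
        # Non-text partial images are skipped without further processing.
    return output_buffer


pages = pages_for_text_parts([(1, "title"), (3, "photo"), (1, "body")])
```

Dropping the attribute check gives the behavior of the third embodiment, where every partial image becomes its own page.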

[0048] [Fifth Embodiment] Next, a fifth embodiment of the present invention will be described. In the fifth embodiment, a reproduction level is set before the document image understanding storage processing, making it possible to select whether the partial images are combined into one page or output on separate pages.

[0049] FIG. 14 is a flowchart showing the processing at the destination to which the document image understanding data has been distributed, according to the fifth embodiment of the present invention.

[0050] First, in step S1101, a setting (hereinafter called the reproduction level) is made as to whether the partial images of the document image understanding data 110 are combined into one page or output on separate pages. Here, "level 1" means that the partial images are combined into one page and output, and "level 2" means that each partial image is output as one page.

[0051] Next, in step S1102, the document image understanding data 110 is input by the document image understanding data input unit 111. Then, in step S1103, document image data 113 to be reproduced is created from the document image understanding data 110 while adaptively changing the reproduction conditions, and based on the document image data 113, a reproduced image is output as an output document from the monochrome printer serving as the image output unit 114 (S1104).

[0052] FIG. 15 is a flowchart showing the document image understanding reproduction processing of step S1103 according to the fifth embodiment.

[0053] First, in step S1201, the reproduction level set in step S1101 described above is checked. If it is "level 1" (one-page composite output), the process proceeds to step S1202, where the image that serves as the background of the reproduced document image is created as a white background. Next, in step S1203, partial image data is taken out of the document image understanding data 110, and in step S1204, the partial image and its coordinate data are used to combine it with the white background image to create a reproduced image. Then, in step S1205, it is checked whether all partial images have been processed; if not, the process returns to step S1203 and the above processing is executed. When all partial images have been processed, the document image understanding reproduction processing ends. The processing of steps S1202 to S1205 is the same as that of steps S701 to S704 in FIG. 8 described above. Accordingly, as an example of the reproduction synthesis processing in step S1204, as shown in FIG. 9 described above, the image type of the partial image is extracted from the stored document image data in step S801, and if the image type is the "binary image type" in step S802, the process proceeds to step S803, where pseudo-24-bit conversion is performed to convert the black and white pixels of the binary image into the black and white pixels of a 24-bit multi-valued image, respectively.

[0054] If the reproduction level in step S1201 is "level 2" (per partial image), the process proceeds to step S1206, where partial image data is extracted from the document image understanding data 110. In step S1207, the extracted partial image is stored in the output buffer as a one-page image. In step S1208, it is checked whether any partial images remain; if so, the process returns to step S1206 and the above processing is repeated for the remaining partial images. When it is determined in step S1205 or step S1208 that no partial image remains, the document image understanding and reproduction process ends. The processing in steps S1206 to S1208 is the same as that in steps S901 to S903 of the flowchart of FIG. 12 (Embodiment 3) described above.
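
The branch on the reproduction level in step S1201 can be sketched as follows. This is only an illustrative Python outline under stated assumptions: the function name `reproduce`, the use of the integers 1 and 2 for the levels, and the `compose_page` callback standing in for the level 1 composition are hypothetical and do not come from the specification.

```python
from typing import Callable, List, Sequence


def reproduce(level: int,
              parts: Sequence[object],
              compose_page: Callable[[Sequence[object]], object]) -> List[object]:
    """Return the list of output pages for the given reproduction level."""
    if level == 1:
        # Steps S1202 to S1205: one composite page containing every part.
        return [compose_page(parts)]
    if level == 2:
        # Steps S1206 to S1208: each partial image becomes its own page.
        output_buffer = []
        for part in parts:              # S1206: take the next partial image
            output_buffer.append(part)  # S1207: store it as a one-page image
        return output_buffer            # S1208: no partial images remain
    # Assumed behaviour: the source defines only levels 1 and 2.
    raise ValueError(f"unsupported reproduction level: {level}")
```

With level 1 the caller receives a single page; with level 2 it receives one page per partial image, in the order in which the parts were stored.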

[0055] The present invention may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, a reader, and a printer) or to an apparatus consisting of a single device (for example, a copying machine or a facsimile machine).

[0056] The object of the present invention is also achieved by supplying a system or an apparatus with a storage medium (or recording medium) on which the program code of software that implements the functions of the above-described embodiments is recorded, and having a computer (or a CPU or MPU) of the system or apparatus read and execute the program code stored in the storage medium. In this case, the program code read from the storage medium itself implements the functions of the above-described embodiments, and the storage medium storing the program code constitutes the present invention. This covers not only the case where the functions of the above-described embodiments are implemented by the computer executing the read program code, but also the case where an operating system (OS) or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are implemented by that processing.

[0057] It further covers the case where, after the program code read from the storage medium is written into a memory provided in a function expansion card inserted into the computer or in a function expansion unit connected to the computer, a CPU or the like provided in the function expansion card or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are implemented by that processing.

[0058] As described above, according to the present embodiment, the data amount can be reduced when a document image is stored. The load on network traffic can be reduced when a document image is distributed. High image quality suitable for reuse can be maintained when a document image is stored and distributed. Degradation of image quality and loss of information in the document image at output time can be prevented. Reuse in electronic documents such as DTP (desktop publishing) is also made easier.

【００５９】[0059]

[Effects of the Invention] As described above, according to the present invention, an input document image can be stored with a reduced data amount and then read out and reproduced with high quality.

[Brief description of the drawings]

FIG. 1 is a functional block diagram showing the functional configuration of an image processing system according to an embodiment of the present invention.

FIG. 2 is a conceptual diagram showing the configuration of the image processing system according to the present embodiment.

FIG. 3 is a flowchart showing processing from image input to output of document image understanding data in the image processing system according to the present embodiment.

FIG. 4 is a diagram illustrating the structure of layout analysis data according to the present embodiment.

FIG. 5 is a flowchart showing the document image understanding and storage process (S304) according to the first embodiment.

FIG. 6 is a diagram illustrating the structure of document image understanding data according to the present embodiment.

FIG. 7 is a flowchart showing processing from input of document image understanding data to output of a reproduced image according to the present embodiment.

FIG. 8 is a flowchart showing the document image understanding and reproduction process (S402) according to the present embodiment.

FIG. 9 is a flowchart showing the reproduced image composition process of step S703 according to the present embodiment.

FIG. 10 is a flowchart showing the document image understanding and storage process (S304) according to the second embodiment.

FIG. 11 is a diagram illustrating the structure of document image understanding data according to the second embodiment.

FIG. 12 is a flowchart showing the document image understanding and reproduction process (S402) according to the third embodiment.

FIG. 13 is a flowchart showing the document image understanding and reproduction process (S402) according to the fourth embodiment.

FIG. 14 is a flowchart showing processing from input of document image understanding data to output of a reproduced image according to the fifth embodiment.

FIG. 15 is a flowchart showing the document image understanding and reproduction process (S1103) according to the fifth embodiment.

Continued from the front page. F-terms (reference): 5B009 NC02 NC05 NG02 5B050 AA08 BA10 BA15 BA16 BA18 EA02 EA03 EA10 EA19 FA05 5C076 AA01 AA12 AA17 AA19 AA27 BA06 CA10 5C077 LL17 LL19 MP06 MP08 PP19 PP21 PP23 PP27 PP28 PP58 PQ12 RR02 RR21

Claims

[Claims]

1. An image processing apparatus comprising: an image input unit for inputting a document image; a binary image generation unit for generating binary image data from the document image input by the image input unit; a layout analysis unit for generating, from the binary image data, layout analysis data corresponding to attributes of images included in the document image; an analysis unit for generating and storing analysis data indicating a result of analyzing the document image based on the layout analysis data; and a document image reproduction unit for reproducing the document image based on the analysis data generated and stored by the analysis unit.

2. The image processing apparatus according to claim 1, wherein the analysis unit comprises: a cut-out target selection unit for selecting, for each area of the document image divided based on the layout analysis data, either the original image of the input document or the binary image as the image to cut from; a partial image cut-out unit for cutting out a partial image from the image selected by the cut-out target selection unit; and an image storage unit for storing the partial image cut out by the partial image cut-out unit.

3. The image processing apparatus according to claim 1, wherein the analysis unit compresses and stores the image of each area according to the attribute of each area of the document image divided according to the layout analysis data.

4. The image processing apparatus according to claim 1, wherein the analysis data includes information on the number of areas into which the document image is divided according to the layout analysis data, and on the size, image type, and compression method of each area.

5. The image processing apparatus according to claim 1, wherein the document image reproduction unit comprises: a whole base image creation unit for creating a white base image; and an image composition unit for combining the created base image with a partial image obtained from the analysis data.

6. The image processing apparatus according to claim 5, wherein the image composition unit converts a binary image into pseudo multi-valued image data and combines it with the base image.

7. The image processing apparatus according to claim 1, wherein the document image reproduction unit reproduces each partial image of the document image as a one-page image.

8. The image processing apparatus according to claim 1, wherein the document image reproduction unit selects, based on the analysis data, whether to reproduce each of the partial images of the document image, and reproduces each selected partial image as a one-page image.

9. The image processing apparatus according to claim 1, further comprising a reproduction level setting unit for setting a reproduction condition for each of the partial images, wherein the document image reproduction unit determines, according to the set reproduction level, whether to combine the partial images with a base image and reproduce and output the result, or to reproduce and output each partial image of the document image as a one-page image.

10. An image processing method comprising: an image input step of inputting a document image; a binary image creation step of creating binary image data from the document image input in the image input step; a layout analysis step of generating, from the binary image data created in the binary image creation step, layout analysis data corresponding to attributes of images included in the document image; an analysis step of generating and storing analysis data indicating a result of analyzing the document image based on the layout analysis data; and a document image reproduction step of reproducing the document image based on the analysis data generated and stored in the analysis step.

11. The image processing method according to claim 10, wherein the analysis step comprises: a cut-out target selection step of selecting, for each area of the document image divided based on the layout analysis data, either the original image of the input document or the binary image as the image to cut from; a partial image cut-out step of cutting out a partial image from the image selected in the cut-out target selection step; and an image storage step of storing the partial image cut out in the partial image cut-out step.

12. The image processing method according to claim 10, wherein in the analysis step, the image of each area is compressed and stored according to the attribute of each area of the document image divided according to the layout analysis data.

13. The image processing method according to any one of claims 10 to 12, wherein the analysis data includes information on the number of areas into which the document image is divided according to the layout analysis data, and on the size, image type, and compression method of each area.

14. The image processing method according to claim 10, wherein the document image reproduction step comprises: a whole base image creation step of creating a white base image; and an image composition step of combining the created base image with a partial image obtained from the analysis data.

15. The image processing method according to claim 14, wherein in the image composition step, a binary image is converted into pseudo multi-valued image data and combined with the base image.

16. The image processing method according to claim 10, wherein in the document image reproduction step, each partial image of the document image is reproduced as a one-page image.

17. The image processing method according to claim 10, wherein in the document image reproduction step, whether to reproduce each of the partial images of the document image is selected based on the analysis data, and each selected partial image is reproduced as a one-page image.

18. The image processing method according to claim 10, further comprising a reproduction level setting step of setting a reproduction condition for each of the partial images, wherein in the document image reproduction step it is determined, according to the set reproduction level, whether to combine the partial images with a base image and reproduce and output the result, or to reproduce and output each partial image of the document image as a one-page image.

19. A computer-readable storage medium storing a program for executing an image processing method of inputting and storing a document image and reproducing the document image, the program comprising: an image input step module for inputting a document image; a binary image creation step module for creating binary image data from the document image input by the image input step module; a layout analysis step module for generating, from the binary image data created by the binary image creation step module, layout analysis data corresponding to attributes of images included in the document image; an analysis step module for generating and storing analysis data indicating a result of analyzing the document image based on the layout analysis data; and a document image reproduction step module for reproducing the document image based on the analysis data generated and stored by the analysis step module.

20. The storage medium according to claim 19, wherein the analysis step module comprises: a cut-out target selection step module for selecting, for each area of the document image divided based on the layout analysis data, either the original image of the input document or the binary image as the image to cut from; a partial image cut-out step module for cutting out a partial image from the image selected by the cut-out target selection step module; and an image storage step module for storing the partial image cut out by the partial image cut-out step module.

21. The storage medium according to claim 19, wherein the analysis step module compresses and stores the image of each area according to the attribute of each area of the document image divided according to the layout analysis data.

22. The storage medium according to any one of claims 19 to 21, wherein the analysis data includes information on the number of areas into which the document image is divided according to the layout analysis data, and on the size, image type, and compression method of each area.