JPH07244715A - Image processor

Info

Publication number: JPH07244715A
Application number: JP6032151A
Authority: JP
Inventors: Kentaro Matsumoto; 健太郎松本
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1994-03-02
Filing date: 1994-03-02
Publication date: 1995-09-19

Abstract

PURPOSE: To read and file image originals at high speed without using a large-capacity memory. CONSTITUTION: A color image original is read by a color original input unit 101, compressed, and stored. Almost simultaneously with this compression and storage, the color multi-value image data are converted to monochrome and binarized. The binary data are stored in a binary page memory 106 and, at almost the same time, thinned out to generate a reduced image. In the reduced image, areas where black pixels are contiguous are defined as rectangular areas, and data describing those rectangular areas are generated. Only for the areas defined as rectangular areas are the binary data read out of the binary page memory 106, compressed, and stored.

Description

Detailed Description of the Invention

[0001]

[Field of the Invention] The present invention relates to an image processing apparatus, and in particular to an image processing apparatus that files documents in which, for example, characters and images are mixed.

[0002]

[Background of the Invention] Conventionally, an image filing apparatus that reads and files an original in which photographic images and characters are mixed on one page has been configured as shown in FIG. 12.

[0003] In FIG. 12, reference numeral 101 denotes a color original input unit, such as a scanner, that reads an original in the main scanning direction with 8 bits each for R, G, and B; 202 denotes an area determination unit that separates image areas from character areas within a page; 203 denotes an area determination result storage unit that stores the area determination result; 102 denotes a color multi-value compression encoding unit that compresses and encodes the read color original over one page, for example according to the JPEG method; 103 denotes a color code storage unit, comprising a hard disk, magneto-optical disk, or the like, that stores the color code data after compression encoding; 104 denotes a color-to-monochrome conversion unit that calculates monochrome data from color data; 105 denotes a binarization processing unit that binarizes the monochrome data output from the color-to-monochrome conversion unit; 107 denotes a binary compression encoding unit that compresses and encodes the binary data of a designated area; and 108 denotes a monochrome code storage unit that stores the binary compressed data and is configured on the same medium as the color code storage unit.

[0004] Many binarization methods have been proposed, such as simple binarization with a fixed threshold, and the error diffusion method and the average density preservation method, both of which improve the reproduction of halftone portions. Here, because characters and tables are the target, a binarization method best suited to them is used. As the compression method for monochrome binary data, MH, MR, MMR, a method using arithmetic coding, or the like is used.

[0005] Next, the processing by which the apparatus configured as above reads an original containing both photographic images and characters and files it is described with reference to the flowchart in FIG. 13 and the time chart in FIG. 14.

[0006] In step S301, a first reading of a page of the original is performed as a prescan (141 in FIG. 14). At the same time, the area determination unit 202 extracts the photographic image areas within the page (142 in FIG. 14), and in step S302 their position information is stored in the area determination result storage unit 203. Then, in step S303, a second reading of the same page is started (143 in FIG. 14).

[0007] In step S304, it is checked, based on the position information of the photographic images stored in the area determination result storage unit 203, whether the read image lies in a photographic area. If it does, processing proceeds to step S305 and the color multi-value compression encoding unit 102 performs multi-value compression (144 in FIG. 14). If it does not, processing proceeds to step S306 and the binary compression encoding unit 107 performs binary compression (145 in FIG. 14). Finally, in step S307, the result is stored in the color code storage unit 103 or the monochrome code storage unit 108.

[0008]

[Problems to Be Solved by the Invention] In the conventional example above, however, the original must be read in advance (prescanned) to determine the photographic areas in it. As a result, each page must be read twice, which lengthens the processing time.

[0009] In addition, because reading of the original, multi-value compression, and binary compression must operate synchronously, the other parts of the apparatus must be matched to the slowest process. This not only hinders faster processing but also complicates the system. The apparatus was therefore unsuitable for applications that process many originals at high speed.

[0010] To overcome these drawbacks, an apparatus has also been proposed that stores the data in a multi-value page memory at the same time as the original is input from the color original input unit 101 and then reads the data back from that memory, so that the original is read only once. Such an apparatus, however, requires a large-capacity multi-value page memory. Color data for an A4 page at a recording density of 400 dpi, for example, requires as much as 48 MB of memory, which is again undesirable from the standpoint of keeping the production cost of the apparatus down.
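The memory figure quoted above can be checked with simple arithmetic. The following is a minimal sketch, assuming ISO A4 dimensions of 210 mm × 297 mm and uncompressed storage; the function name `page_memory_bytes` is illustrative and does not come from the specification. For comparison it also computes a one-bit-per-pixel page, which corresponds to the binary page memory of about 2 MB used in the first embodiment. The computed values come out slightly below the quoted figures, which are presumably rounded.

```python
# A4 paper dimensions in millimeters (assumed: ISO 216) and the inch conversion factor.
A4_WIDTH_MM = 210.0
A4_HEIGHT_MM = 297.0
MM_PER_INCH = 25.4


def page_memory_bytes(dpi: int, bits_per_pixel: int) -> int:
    """Return the bytes needed to hold one uncompressed A4 page."""
    width_px = round(A4_WIDTH_MM / MM_PER_INCH * dpi)
    height_px = round(A4_HEIGHT_MM / MM_PER_INCH * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


color_bytes = page_memory_bytes(dpi=400, bits_per_pixel=24)  # R, G, B at 8 bits each
binary_bytes = page_memory_bytes(dpi=400, bits_per_pixel=1)  # one bit per pixel

print(f"color page : {color_bytes / 1e6:.1f} MB")
print(f"binary page: {binary_bytes / 1e6:.1f} MB")
```

The ratio between the two is the 24:1 difference in bits per pixel, which is why holding only the binarized page is so much cheaper than holding the full color page.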

[0011] The present invention has been made in view of the above conventional examples, and its object is to provide an image processing apparatus that can read, process, and file image originals at high speed while keeping the production cost of the apparatus down.

[0012]

[Means for Solving the Problems] To achieve the above object, the image processing apparatus of the present invention has the following configuration. Namely, the image processing apparatus comprises: reading means for reading an image original; first compression means for compressing multi-value image data representing the image original read by the reading means; first storage means for storing the multi-value image data compressed by the first compression means; binarization means for binarizing the multi-value image data almost simultaneously with the compression by the first compression means; area discrimination means for, when a predetermined amount of binary data has been input from the binarization means, thinning the binary data at a predetermined thinning rate to form a reduced binary image and generating, for each area defined by the presence or continuity of black pixels in the reduced binary image, information on that area; second compression means for compressing, for the areas defined by the area discrimination means, the binary data binarized by the binarization means on the basis of the generated area information; and second storage means for storing the binary image data compressed by the second compression means.

[0013]

[Operation] With the above configuration, the present invention operates as follows. Almost simultaneously with compression of the multi-value image data representing the image original, the data are binarized. When a predetermined amount of binary data has been obtained, the binary data are thinned at a predetermined thinning rate to form a reduced binary image. Information on each area defined by the presence or continuity of black pixels in the reduced binary image is generated, and the binary data are compressed on the basis of the generated area information.

[0014]

[Embodiments] Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

[0015] [First Embodiment] <Outline configuration of the apparatus (FIG. 1)> FIG. 1 is a block diagram showing the configuration of an image processing apparatus that is a representative embodiment of the present invention. In FIG. 1, components identical to those of the conventional example shown in FIG. 12 are given the same reference numerals, and their description is omitted here.

[0016] In FIG. 1, reference numeral 106 denotes a binary page memory with a capacity of about 2 MB that stores one page (A4 size, recording density 400 dpi) of binarization results; 107 denotes a binary compression encoding unit that reads the binary data of a designated area from the binary page memory 106 and compresses and encodes it; 109 denotes a reduction/labeling unit that reduces the binary data obtained from the binarization processing unit 105 and, in the reduced image, groups adjacent black pixels together under the same label while giving different labels to clusters of black pixels that are not adjacent; 110 denotes a label memory that stores the rectangle data obtained by the reduction/labeling unit 109; 111 denotes a merging unit that, after one page of the original has been read, reads the rectangle information from the label memory 110 and merges rectangular areas; 112 denotes a text rectangular area storage unit that stores information indicating the positions of the text areas after processing by the merging unit 111; 113 denotes an operation unit for operating the apparatus from outside; 114 denotes a CPU that controls the entire apparatus; 115 denotes a ROM that stores the control program executed by the CPU 114; and 116 denotes a RAM used as a work area for executing the control program.

[0017] The merging performed by the merging unit 111 is executed according to the control program stored in the ROM 115, which defines a predetermined procedure.

[0018] In the conversion from color data to monochrome data performed by the color-to-monochrome conversion unit 104, a luminance component (Y) is calculated from R, G, and B according to equations (1) and (2) below, and the resulting luminance component (Y) is converted to a density value (D) according to a Log curve.

[0019]

$Y = 0.3R + 0.6G + 0.1B$ ... (1)

$D = -\log Y$ ... (2)

<Outline of apparatus operation> Next, an outline of the operation of the apparatus is described with reference to the time chart in FIG. 2 and the flowcharts in FIGS. 3 and 4.

[0020] First, in step S101, in response to an instruction from the user at the operation unit 113 to start inputting an original, one page of the original is read, the color image is compressed (multi-value compression) and stored, the binarized data are stored, and the rectangle data are stored (described later). The details are as follows.

[0021] The color original input unit 101 separates the image data into R, G, and B and reads them out (121 in FIG. 2). When the JPEG method is used, the color multi-value compression encoding unit 102 processes data in blocks of 8 × 8 pixels, so an image buffer of at least 8 lines is required. Details of the JPEG method are specified in ISO/IEC JTC1 CD 10918-1 and elsewhere, so they are not described here. The compressed multi-value color code data are stored in the color code storage unit 103 (122 in FIG. 2). Consequently, although the color multi-value compression encoding unit 102 introduces a delay of 8 lines, storage of the multi-value color code data finishes almost at the same time as input of the color original finishes.
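The 8-line buffering requirement can be illustrated with a small generator. This is only a sketch of the buffering idea, not of JPEG encoding itself: it handles a single color component, assumes the page width and height are multiples of 8, and the name `blocks_from_lines` is hypothetical.

```python
from typing import Iterable, Iterator, List

BLOCK = 8  # the encoder works on 8 x 8 pixel blocks

Line = List[int]         # one scan line of samples (single component, for brevity)
Block = List[List[int]]  # 8 rows of 8 samples


def blocks_from_lines(lines: Iterable[Line]) -> Iterator[Block]:
    """Buffer scan lines eight at a time and yield 8 x 8 blocks left to right."""
    strip: List[Line] = []
    for line in lines:
        strip.append(line)
        if len(strip) == BLOCK:
            for x in range(0, len(line), BLOCK):
                yield [row[x:x + BLOCK] for row in strip]
            strip = []  # the buffer is reused for the next eight lines
    # Assumption: width and height are multiples of 8, so no partial strip
    # remains here; padding of partial strips is omitted from this sketch.
```

Because a block can only be emitted once its eighth line has arrived, the encoder output trails the scanner by at most eight lines, which matches the delay described above.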

[0022] Meanwhile, the R, G, and B data from the color original input unit are converted to monochrome data by the color-to-monochrome conversion unit 104. As equations (1) and (2) show, the color-to-monochrome conversion is a pixel-by-pixel process that can be computed with a delay of about one pixel. The resulting monochrome data are converted to binary data by the binarization processing unit 105 and stored in the binary page memory 106. At the same time, the reduction/labeling unit 109 performs labeling (123 in FIG. 2).
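The per-pixel path from color to binary data can be sketched as follows. Equations (1) and (2) are taken from the text; everything else is an assumption made for illustration: the logarithm is taken as base 10, Y is normalized by 255 and clamped to avoid log(0), and binarization uses a simple fixed threshold whose value is hypothetical. The specification only says that a method suited to characters and tables is used.

```python
import math


def luminance(r: int, g: int, b: int) -> float:
    """Equation (1): Y = 0.3R + 0.6G + 0.1B, with 8-bit R, G, B."""
    return 0.3 * r + 0.6 * g + 0.1 * b


def density(y: float, y_max: float = 255.0) -> float:
    """Equation (2): D = -Log Y.

    Assumptions: Y is normalized to (0, 1] before taking a base-10 log,
    and Y is clamped to at least 1 so that log(0) is never evaluated.
    """
    return -math.log10(max(y, 1.0) / y_max)


def binarize(d: float, threshold: float = 0.3) -> int:
    """Simple fixed-threshold binarization: 1 = black, 0 = white.

    The threshold value is hypothetical.
    """
    return 1 if d >= threshold else 0


def to_binary_pixel(r: int, g: int, b: int) -> int:
    """Per-pixel pipeline: color -> luminance -> density -> binary."""
    return binarize(density(luminance(r, g, b)))
```

Each function depends only on the current pixel, which is why this stage needs no line buffer and adds a delay of about one pixel.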

[0023] The rectangle data storage processing (the processing up to extraction of the character areas and table areas in the image) is described in detail below with reference to the flowchart in FIG. 4.

[0024] First, in step S1001, the image is thinned out and reduced. The logical OR of m dots vertically by n dots horizontally of the original image is taken, so that each m × n block of pixels is thinned to a single pixel. If the m × n block of the original image contains even one black pixel, the thinned pixel is black. FIG. 5 shows an example with m (vertical) = 4 and n (horizontal) = 8.
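The OR-based thinning of step S1001 can be sketched in a few lines. The function name `thin_or` and the list-of-lists bitmap representation are assumptions for illustration; tiles at the right and bottom edges are simply clipped when the image size is not a multiple of the tile size, which the specification does not address.

```python
from typing import List

Bitmap = List[List[int]]  # rows of 0 (white) / 1 (black)


def thin_or(image: Bitmap, m: int = 4, n: int = 8) -> Bitmap:
    """Reduce each m (vertical) x n (horizontal) tile to one pixel by logical OR."""
    height, width = len(image), len(image[0])
    reduced: Bitmap = []
    for top in range(0, height, m):
        row = []
        for left in range(0, width, n):
            # The output pixel is black if any pixel in the tile is black.
            tile_has_black = any(
                image[y][x]
                for y in range(top, min(top + m, height))
                for x in range(left, min(left + n, width))
            )
            row.append(1 if tile_has_black else 0)
        reduced.append(row)
    return reduced
```

With m = 4 and n = 8, as in the FIG. 5 example, the reduced image has 1/32 as many pixels as the original, so the labeling step that follows has far less data to scan.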

[0025] In step S1002, the black pixels in the thinned image are labeled one pixel at a time, with pixels that are contiguous vertically, horizontally, or diagonally given the same label. At the same time, a rectangular area enclosing the black pixels with the same label is defined, and data on that rectangular area (rectangle data) are generated. Rectangle data are information representing the region of adjacent black pixels enclosed by the rectangular area. In FIG. 6, "●" denotes a black pixel in the thinned image and "○" denotes a white pixel in the thinned image. Each black pixel carries a label (here, the capital letters A to E). The rectangular area 61 is an area in which black pixels are contiguous vertically, horizontally, and diagonally, and the labels A to E are set to the same value.

[0026] An area in which a single black pixel stands isolated is also defined as one rectangular area, and rectangle data are generated for it.

[0027] This processing proceeds row by row (in the Y-axis direction in FIG. 6), and by the time one page has been read, multiple rectangle data records like the one shown in FIG. 7 have been created in the label memory 110. As FIG. 7 shows, each rectangle data record contains a rectangle label, the start-point coordinates of the rectangular area (the XY coordinates of the upper-left corner of the rectangle), the end-point coordinates (the XY coordinates of the lower-right corner of the rectangle), and the number of pixels enclosed by the rectangular area.
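The labeling and rectangle data generation can be sketched as below. Note that this sketch uses a flood fill over a complete reduced image rather than the row-by-row procedure described above, which would allow labeling to run while the page is still being scanned. The `RectData` fields mirror the items listed for FIG. 7, but the field names are hypothetical, and interpreting the pixel count as the number of black pixels in the region is an assumption.

```python
from collections import deque
from dataclasses import dataclass
from typing import List


@dataclass
class RectData:
    label: int   # rectangle label
    x0: int      # start point (upper-left corner)
    y0: int
    x1: int      # end point (lower-right corner)
    y1: int
    pixels: int  # assumption: number of black pixels in the region


def label_rectangles(image: List[List[int]]) -> List[RectData]:
    """Label 8-connected black regions and return one RectData per region."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    rects: List[RectData] = []
    for sy in range(height):
        for sx in range(width):
            if not image[sy][sx] or seen[sy][sx]:
                continue
            rect = RectData(len(rects), sx, sy, sx, sy, 0)
            queue = deque([(sx, sy)])
            seen[sy][sx] = True
            while queue:
                x, y = queue.popleft()
                rect.pixels += 1
                rect.x0, rect.y0 = min(rect.x0, x), min(rect.y0, y)
                rect.x1, rect.y1 = max(rect.x1, x), max(rect.y1, y)
                # Visit the eight neighbors (up/down, left/right, diagonal).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            rects.append(rect)
    return rects
```

An isolated black pixel yields a record whose start and end points coincide, consistent with the statement that a single pixel also forms a rectangular area.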

[0028] Next, as soon as reading of the image and creation and storage of the rectangle data are finished, processing moves to step S102, where the rectangle data are read from the label memory 110 (125 in FIG. 2) and the kind of data each rectangular area represents (for example, text, a table, or a separator) is determined. The details of this processing correspond to steps S1003 to S1007 in the flowchart of FIG. 4.

[0029] In steps S1003 to S1005, the merging unit 111 uses the width, height, and area of each rectangle obtained from the read rectangle data, together with the number of pixels relative to the area (that is, the pixel density), to detect separators (step S1003), the typesetting direction (step S1004), and headings and the like (step S1005). In step S1006, multiple rectangular areas are merged into a single rectangular area where the detection results call for it. As a result, rectangles corresponding to body text, figures, photographs, tables, and separators are distinguished (124 in FIG. 2). The information distinguished in this way is set in the pixel label of FIG. 7. Then, in step S1008, the result is temporarily saved in the text rectangular area storage unit 112.
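The quantities used in steps S1003 to S1005 follow directly from a rectangle data record, as the sketch below shows. The specification does not give the actual decision rules or thresholds, so `looks_like_separator` is purely a hypothetical example of how such features might be used; its thresholds are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class RectFeatures:
    width: int
    height: int
    area: int
    density: float  # black pixels per unit area


def rect_features(x0: int, y0: int, x1: int, y1: int, pixels: int) -> RectFeatures:
    """Derive width, height, area, and pixel density from one rectangle record.

    Coordinates are inclusive, so a single isolated pixel has width 1 and height 1.
    """
    width = x1 - x0 + 1
    height = y1 - y0 + 1
    area = width * height
    return RectFeatures(width, height, area, pixels / area)


def looks_like_separator(f: RectFeatures, min_aspect: float = 20.0) -> bool:
    """Hypothetical rule: a long, thin, densely filled rectangle is a ruled line."""
    aspect = max(f.width, f.height) / min(f.width, f.height)
    return aspect >= min_aspect and f.density >= 0.9
```

For example, a record spanning 100 × 2 reduced pixels that is completely filled would be flagged by this rule, whereas a single isolated pixel would not.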

[0030] Next, in step S103, binary image data are read from the binary page memory 106 only for the areas indicated in the text rectangular area storage unit, and the binary compression encoding unit 107 performs binary compression (126 in FIG. 2). In step S104, the result is stored in the monochrome code storage unit 108, which holds the binary compressed data. Consequently, areas for which no rectangular area is defined, that is, areas consisting only of white pixels, are not compressed or stored.
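Reading and compressing only the rectangular areas can be sketched as follows. Several points are assumptions: the rectangle coordinates are taken to be in reduced-image units and are scaled back by n horizontally and m vertically, the page is held as a list of rows, and `zlib` stands in for the MH, MR, MMR, or arithmetic coders named in the specification, since none of those is available in the Python standard library. The names `pack_rows` and `compress_regions` are hypothetical.

```python
import zlib
from typing import Dict, List, Tuple

Bitmap = List[List[int]]          # full-resolution binary page, 1 = black
Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in reduced-image coordinates, inclusive


def pack_rows(rows: List[List[int]]) -> bytes:
    """Pack 0/1 pixels into bytes, most significant bit first, each row padded to a byte."""
    out = bytearray()
    for row in rows:
        for i in range(0, len(row), 8):
            group = row[i:i + 8]
            byte = 0
            for bit in group:
                byte = (byte << 1) | bit
            byte <<= 8 - len(group)  # pad a short final group with zeros
            out.append(byte)
    return bytes(out)


def compress_regions(page: Bitmap, rects: List[Rect], m: int = 4, n: int = 8) -> Dict[Rect, bytes]:
    """Read only the listed rectangles out of the page memory and compress each one."""
    stored: Dict[Rect, bytes] = {}
    for x0, y0, x1, y1 in rects:
        # Scale reduced coordinates back to full resolution (n horizontal, m vertical).
        rows = [row[x0 * n:(x1 + 1) * n] for row in page[y0 * m:(y1 + 1) * m]]
        stored[(x0, y0, x1, y1)] = zlib.compress(pack_rows(rows))  # stand-in coder
    return stored
```

Any part of the page not covered by a rectangle is never read or encoded, which is how all-white areas are excluded from storage.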

[0031] Thus, according to this embodiment, the image is compressed in a single reading of the image original, and only the binarized portion of the input image is stored once in the binary page memory. Using the stored data, rectangular areas of the binarized image that contain black pixels are distinguished from areas that contain only white pixels, and for each rectangular area containing black pixels, the kind of image it represents (text, table, separator, and so on) is determined. Because the binarized image data are compressed and stored on the basis of that information, areas consisting only of white pixels are not needlessly compressed. Moreover, without a page memory large enough to hold the entire color image, and with a single reading of the image, the various information needed to file the image is obtained, the image is compressed, and the read image data can be stored and managed.

[0032] This makes faster reading and filing of image data possible. A further advantage is that, because processes with different speeds such as multi-value compression and binary compression are separated, each can run at the speed that suits it best.

[0033] [Second Embodiment] This embodiment considers the case in which the processing described in the first embodiment is applied to an image processing apparatus whose binary page memory and label memory have a double-buffer configuration.

[0034] FIG. 8 is a block diagram showing the configuration of the image processing apparatus according to this embodiment. In FIG. 8, components in common with those described in the first embodiment and the conventional example are given the same reference numerals, and their description is omitted. Only the parts characteristic of this embodiment are described here.

[0035] In this embodiment, the binary page memory 106 and the label memory 110 described in the first embodiment are given a double-buffer configuration: the binary page memory 106 becomes binary page memories 701 and 702, and the label memory 110 becomes label memories 703 and 704.

[0036] FIG. 9 is a time chart showing the operation timing of the internal processing units in this embodiment. Double buffering is a common technique and is not described here. With the double-buffer configuration, as FIG. 9 shows, while the first page is being processed, that is, while the rectangle data are being read from one label memory and the kind of each rectangular area is being determined, the image original for the second page can be read, its color image compressed, and data written to the other binary page memory and label memory.
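The overlap that double buffering permits can be illustrated with a small scheduling sketch. It does not model real concurrency; it only lists which page is being scanned and which is being analyzed in each time slot. The buffer indices 0 and 1 stand for the two memory pairs (for example 701/703 and 702/704), and the function name and the assumption that scanning and analysis take one slot each are hypothetical.

```python
from typing import List, Tuple


def double_buffer_schedule(num_pages: int) -> List[Tuple[str, str]]:
    """Return, per time slot, (scan activity, analysis activity) for two alternating buffers."""
    slots: List[Tuple[str, str]] = []
    for slot in range(num_pages + 1):
        # Page slot+1 is scanned into one buffer...
        scan = f"scan page {slot + 1} -> buffer {slot % 2}" if slot < num_pages else "idle"
        # ...while the page scanned in the previous slot is analyzed from the other.
        analyze = f"analyze page {slot} <- buffer {(slot - 1) % 2}" if slot >= 1 else "idle"
        slots.append((scan, analyze))
    return slots


for scan, analyze in double_buffer_schedule(3):
    print(f"{scan:28} | {analyze}")
```

In every slot except the first and last, the scanner and the analysis stage use different buffers, so neither has to wait for the other.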

[0037] Thus, according to this embodiment, giving the binary page memory and the label memory a double-buffer configuration allows the processing of two different image originals to overlap partially, so image data can be read and filed even faster than in the first embodiment.

[0038] In the first and second embodiments, the input image data were processed separately as a multi-value color image and as binarized monochrome data, but the present invention is not limited to this. For example, character recognition may additionally be applied to the binarized monochrome data, and the character information obtained may be stored, as text data represented by code data, in a storage area separate from the monochrome code storage unit 108.

[0039] FIG. 10 is a block diagram showing the configuration of an apparatus in which a character recognition function is added to the image processing apparatus configured as in the first embodiment. In FIG. 10, reference numeral 801 denotes a character recognition unit that reads the binary data of a designated area, performs character recognition, and converts the data to code data, and 802 denotes a text data storage unit that stores the character code data. The text data storage unit 802 may be configured on the same medium as the color code storage unit 103 and the monochrome code storage unit 108.

[0040] In the embodiments above, the multi-value color image data among the input image data received no special handling. Considering that an entire input page may be a binary image, however, the apparatus may be configured, as shown in the flowchart of FIG. 11, to add steps S120 and S121 to the processing shown in the flowchart of FIG. 3, so that when the entire input page is determined to be a binary image, the compressed multi-value color image is deleted from the color code storage unit 103. This prevents useless data from being kept in the color code storage unit 103 and allows it to be used more efficiently.

[0041] Furthermore, the color original input unit described in the embodiments above was assumed, as in the conventional example, to be a scanner or the like that reads the original in the main scanning direction with 8 bits each for R, G, and B. Needless to say, the scanning method, the number of effective bits per pixel, and so on are not limited to this, and units with other specifications can be used.

[0042] Furthermore, the embodiments above considered the case in which a color image original is input from a color original input unit such as a scanner, but the present invention is not limited to this. For example, the color original input unit may be a monochrome scanner, the color-to-monochrome conversion unit may be omitted, and the color multi-value compression encoding unit and the color code storage unit may be configured to handle monochrome halftone (multi-value) images. The apparatus then handles exclusively monochrome halftone (multi-value) images and monochrome binary images such as characters and tables, and can read and file images at high speed.

[0043] The present invention may be applied to a system comprising multiple devices or to an apparatus comprising a single device. Needless to say, the present invention is also applicable where it is achieved by supplying a program to a system or apparatus.

[0044]

[Effects of the Invention] As described above, according to the present invention, almost simultaneously with compression of the multi-value image data representing the image original, the data are binarized. When a predetermined amount of binary data has been obtained, the binary data are thinned at a predetermined thinning rate to form a reduced binary image, information on each area defined by the presence or continuity of black pixels in the reduced binary image is generated, and the binary data are compressed on the basis of the generated area information. Therefore, when an image original in which, for example, characters and photographs are mixed is read and filed, the image can be appropriately compressed and filed with a single reading of the original.

[0045] As a result, the processing from image reading through filing can be performed at high speed without using a large-capacity storage means for holding the image information of the entire read image original. Moreover, because no such large-capacity storage means is needed, the production cost of the apparatus is also reduced.
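To give a rough sense of scale for the memory saving, the sketch below compares the uncompressed size of one page held as 24-bit color data (R, G and B at 8 bits each) with the same page held as 1-bit binary data. The page size (A4) and the 400 dpi resolution are assumptions chosen for illustration only, as is the helper name `page_bytes`; the text does not state these figures.

```python
def page_bytes(width_px, height_px, bits_per_pixel):
    """Uncompressed size of one page in bytes, rounded up to whole bytes."""
    return (width_px * height_px * bits_per_pixel + 7) // 8


# Hypothetical page: A4 (210 mm x 297 mm) scanned at an assumed 400 dpi.
DPI = 400
WIDTH_PX = round(210 / 25.4 * DPI)
HEIGHT_PX = round(297 / 25.4 * DPI)

color_page = page_bytes(WIDTH_PX, HEIGHT_PX, 24)  # R, G, B at 8 bits each
binary_page = page_bytes(WIDTH_PX, HEIGHT_PX, 1)  # 1 bit per pixel

print(f"color page : {color_page / 1e6:.1f} MB")
print(f"binary page: {binary_page / 1e6:.1f} MB")
```

Whatever page size is assumed, the binary page needs about one twenty-fourth of the memory of the 24-bit color page, since the ratio depends only on the bit depths.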

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of an image processing apparatus according to a first embodiment of the present invention.

FIG. 2 is a time chart showing image reading and filing processing according to the first embodiment.

FIG. 3 is a flowchart showing image reading and filing processing.

FIG. 4 is a flowchart showing details of the rectangle data generation and area discrimination processing involved in image reading and filing.

FIG. 5 is a diagram showing an outline of the thinning-out reduction processing.

FIG. 6 is a diagram explaining the labeling of black pixels and the definition of a rectangular area.

FIG. 7 is a diagram showing the structure of rectangle data.

FIG. 8 is a block diagram showing the configuration of an image processing apparatus according to a second embodiment.

FIG. 9 is a time chart showing image reading and filing processing according to the second embodiment.

FIG. 10 is a block diagram showing the configuration of an apparatus in which a character recognition function is added to the image processing apparatus according to the first embodiment.

FIG. 11 is a flowchart showing another embodiment of image reading and filing processing.

FIG. 12 is a block diagram showing the configuration of an image processing apparatus according to a conventional example.

FIG. 13 is a flowchart showing image reading and filing processing according to the conventional example.

FIG. 14 is a time chart showing image reading and filing processing according to the conventional example.

[Explanation of symbols]

101 Color original input unit
102 Color multi-value compression encoding unit
103 Color code storage unit
104 Color black-and-white conversion unit
105 Binarization processing unit
106 Binary page memory
107 Binary compression encoding unit
108 Black-and-white code storage unit
109 Reduction and labeling unit
110 Label memory
111 Synthesis unit
112 Text rectangular area storage unit
113 Operation unit
114 CPU
115 ROM
116 RAM

Continuation of the front page: (51) Int. Cl.⁶ / Identification code / Internal reference number / FI / Technical indication: H04N 1/411

Claims

[Claims]

1. An image processing apparatus comprising: reading means for reading an image original; first compression means for compressing multi-valued image data representing the image original read by the reading means; first storage means for storing the multi-valued image data compressed by the first compression means; binarizing means for binarizing the multi-valued image data almost simultaneously with the compression by the first compression means; area discriminating means for, when a predetermined amount of binary data has been input from the binarizing means, thinning out the binary data at a predetermined thinning rate to form a reduced binary image and generating area information for each area defined by the presence or continuity of black pixels in the reduced binary image; second compression means for compressing, for the area defined by the area discriminating means, the binary data binarized by the binarizing means based on the generated area information; and second storage means for storing the binary image data compressed by the second compression means.

2. The image processing apparatus according to claim 1, wherein the area defined by the area discriminating means is a rectangular area surrounding black pixels that are continuous in a vertical, horizontal or oblique direction, or a rectangular area surrounding an isolated black pixel.

3. The image processing apparatus according to claim 1, wherein the compression by the first and second compression means is performed in units of one page of the image original.

4. The image processing apparatus according to claim 3, further comprising: buffer means for storing, for one page of the image original, the binary data binarized by the binarizing means; and area information storage means for storing, for one page of the image original, the area information generated by the area discriminating means, wherein the reading of the image original by the reading means, the compression by the first compression means, the storage of the binary data in the buffer means, and the storage of the area information in the area information storage means are performed substantially simultaneously.

5. The image processing apparatus according to claim 4, wherein each of the buffer means and the area information storage means has a double buffer configuration composed of two storage media.

6. The image processing apparatus according to claim 1, further comprising: character recognition means for recognizing characters, for the area defined by the area discriminating means, from the binary data binarized by the binarizing means based on the generated area information; and third storage means for storing the character information obtained by the character recognition means.

7. The image processing apparatus according to claim 1, further comprising deleting means for deleting the compressed multi-valued image data stored in the first storage means when, based on the area information generated by the area discriminating means, the entire image original read by the reading means is binarized and defined as the area.

8. The image processing apparatus according to claim 1, wherein the image original is a color image original, and the binarizing means converts multi-valued color image data representing the color image original into black-and-white multi-valued image data and binarizes the black-and-white multi-valued image data.
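The two-step conversion recited in claim 8 (color multi-valued data to black-and-white multi-valued data, then to binary data) can be sketched as below. This is an illustration under assumptions rather than the claimed implementation: the luma weights are those of ITU-R BT.601, the fixed threshold of 128 is arbitrary, and the names `to_gray` and `binarize_row` are hypothetical. The claims specify neither the conversion formula nor the threshold.

```python
def to_gray(r, g, b):
    """Convert 8-bit R, G, B values to an 8-bit gray level using integer
    arithmetic with ITU-R BT.601 luma weights (an assumed choice)."""
    return (299 * r + 587 * g + 114 * b + 500) // 1000


def binarize_row(rgb_row, threshold=128):
    """Binarize one scan line of (R, G, B) pixels: 1 (black) where the
    gray level is below the threshold, otherwise 0 (white)."""
    return [1 if to_gray(r, g, b) < threshold else 0 for r, g, b in rgb_row]
```

Processing one scan line at a time matches an input unit that reads the original line by line in the main scanning direction, so binarization can proceed while the same data is being compressed.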