JP4693289B2 - Image compression apparatus, image compression method, program code, and storage medium

Info

Publication number: JP4693289B2
Application number: JP2001202450A
Authority: JP
Inventors: 哲臣田中
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2001-07-03
Filing date: 2001-07-03
Publication date: 2011-06-01
Anticipated expiration: 2021-07-03
Also published as: JP2003018413A

Description

【０００１】
【発明の属する技術分野】
本発明は、カラー文書画像を圧縮する画像圧縮装置及び画像圧縮方法並びにプログラムコード、記憶媒体に関するものである。
【０００２】
【従来の技術】
近年、スキャナの普及により文書の電子化が進んでいる。電子化された文書をフルカラーで所有すると３００ｄｐｉでＡ４サイズの場合、約２４Ｍバイトになり、保有するにもメモリを逼迫するし、メール添付などで他人に送信できるサイズではない。フルカラー画像圧縮にはＪＰＥＧが知られている。ＪＰＥＧは写真などの自然画像を圧縮するには非常に効果も高く、画質も良いが、文字部などの高周波部分をＪＰＥＧ圧縮するとモスキートノイズと呼ばれる画像劣化が発生し、圧縮率も悪い。そこで原画像に対して領域分割を行い、文字領域を抜いた下地部分のＪＰＥＧ圧縮画像と、色情報付き文字領域部分のＭＭＲ圧縮画像を作成する。
【０００３】
しかし、上記方法では例えば、黒文字の文章中の赤で示した強調文字の情報が欠落してしまう等、２色以上を用いた文字部を含む画像を上記圧縮方法で圧縮し、この圧縮した画像を伸長した場合、伸長後の画像に含まれる文字部は１色とされてしまう。
【０００４】
それに対しカラー文書画像を画質劣化少なく高圧縮する方式として、以下の方法があった。まず、カラー文書画像に対して２値化処理を行い、２値画像を得る。そして２値画像から文字領域を検出する。具体的には、２値画像中の黒画素の輪郭線追跡を行い、すべての黒領域に対してラベル付けする。そしてラベル付けされた黒領域を検索し、黒領域中の文字らしい領域を判定する。そして２値画像の黒の領域に該当する原画像中の領域を黒の領域の周囲の色で塗りつぶし、画像Ａを作成する。そして画像Ａを縮小した画像ＢをＪＰＥＧ圧縮する。そして、２値画像の黒の領域に該当する原画像（カラー文書画像）の領域の色を算出し、複数のパレットを作成する。またパレットに従って原画像に対して減色処理を行い、減色画像を生成する。減色画像が１ビットであるときには、減色画像をＭＭＲ圧縮する。減色画像が２ビット以上であるときには、減色画像を可逆圧縮する。
【発明が解決しようとする課題】
しかしながら従来の方式では画像中に多くのテキスト部が存在した場合にテキスト領域の部分的圧縮サイズは小さいがその部分画像の位置座標や色情報（パレット）等のヘッダ情報がそれぞれに付随して加わるため、結果的に圧縮サイズが大きくなるという欠点があった。例えば全面に表が配置された画像の場合には表の各セルがテキスト領域として処理されるため圧縮後のファイルサイズが大きくなってしまう。
【０００５】
本発明は以上の問題に鑑みてなされたものであり、テキスト領域を含むカラー文書画像を圧縮する事で得られる圧縮データのサイズを抑えることを目的とする。
【０００６】
【課題を解決するための手段】
本発明の目的を達成するために、例えば本発明の画像圧縮装置は以下の構成を備える。
【０００７】
即ち、カラー文書画像を圧縮する画像圧縮装置であって、
前記カラー文書画像に含まれるテキスト領域の色を抽出する抽出手段と、
前記テキスト領域において、予め設定された色範囲内の色を有するテキスト領域を包含する包含画像を生成する生成手段と、
前記包含画像及び／又は前記テキスト領域に対して圧縮を行う圧縮手段と
を備え、
前記生成手段は、予め設定された色範囲内の色を有するテキスト領域のうち、注目テキスト領域に結合するテキスト領域を決定する決定手段を備え、
前記注目テキスト領域と、前記決定手段が決定したテキスト領域とを包含する包含画像と、当該包含画像に関する情報を生成し、
前記決定手段は、
前記注目テキスト領域、もしくは前記注目テキスト領域を含む包含画像を圧縮した際に推定される圧縮サイズと、前記テキスト領域を圧縮した際に推定される圧縮サイズとの合計により得られる第１の圧縮サイズと、
前記注目テキスト領域、もしくは前記注目テキスト領域を含む包含画像と前記テキスト領域とを包含する包含画像を圧縮した際に推定される第２の圧縮サイズとを求め、
前記第２の圧縮サイズが前記第１の圧縮サイズよりも小さい場合、前記テキスト領域を前記注目テキスト領域に結合する
ことを特徴とする。
【００１１】
【発明の実施の形態】
以下添付図面を参照して、本発明を好適な実施形態に従って詳細に説明する。
【００１２】
［第１の実施形態］
図１に本実施形態における画像圧縮装置の基本構成を示す。１０１はＣＰＵで、ＲＡＭ１０２やＲＯＭ１０３に格納されたプログラムやデータを用いて本装置全体の制御を行うと共に、後述の画像圧縮処理を行う。１０２はＲＡＭで、外部記憶装置１０４や記憶媒体ドライブ１０９からロードされたプログラムやデータ、画像入力装置１０８から入力された画像データなどを一時的に記憶するエリアを備えると共に、ＣＰＵ１０１が各種の処理を実行する際に用いるワークエリアも備える。１０３はＲＯＭで、本装置全体の制御プログラムやブートプログラム、本装置の設定データ等を格納する。１０４はハードディスクなどの外部記憶装置で、記憶媒体ドライブ１０９からロードされたプログラムやデータなどを保存することができる。また、ワークエリアのサイズがＲＡＭ１０２のサイズを越えた場合、越えた分のエリアをファイルとして提供することもできる。１０５，１０６は夫々キーボード、マウスで、ポインティングデバイスとして機能し、各種の指示を本装置に入力することができる。
【００１３】
１０７は表示装置で、ＣＲＴや液晶画面などにより構成されており、画像情報や文字情報を表示することができる。１０８は画像入力装置で、スキャナやディジタルカメラなどにより構成されており、画像をデータとして入力することができる。尚、画像入力装置１０８は本装置と接続するためのインターフェースを含む。１０９は記憶媒体ドライブで、ＣＤ−ＲＯＭドライブ、ＤＶＤ−ＲＯＭドライブ、フロッピーディスク（ＦＤ）ドライブ等により構成されており、ＣＤ−ＲＯＭやＤＶＤ−ＲＯＭやＦＤ等の記憶媒体からプログラムやデータなどを読み込むことができる。１１０は上述の各部を繋ぐバスである。
【００１４】
図２に本実施形態における画像圧縮装置の機能構成を示す。２０１はカラーの文書画像で、画像２０１に含まれる文字部（テキスト部）には、複数の色が用いられている。２０２は２値化部で、カラー文書画像２０１を２値化処理し、２値画像を生成する。２０３は領域解析部で、２値画像におけるテキスト領域を特定し、２値画像におけるテキスト領域の位置やサイズなどの情報（テキスト情報）を生成する。テキスト領域の特定方法として、例えば、２値画像中の黒画素の輪郭線追跡を行い、すべての黒領域に対してラベル付けし、ラベル付けされた黒領域を検索し、黒領域中の文字らしい領域（すなわち、テキスト領域と思われる領域）を特定する方法が挙げられるが、これに限定されるものではない。
【００１５】
２０４はテキスト部色抽出部で、各テキスト領域毎に、用いられている色を抽出する。２０５は画像結合部で、同色が用いられていると判断されたテキスト領域を包含する領域の画像（以下、包含画像）を生成する。２０６は２値画像圧縮部で、画像結合部２０５で生成された包含画像、及び／又はテキスト領域に対して圧縮を施す。２０７は文字部塗りつぶし部で、カラー文書画像２０１において、領域解析部２０３で解析されたテキスト領域を所定の色で塗りつぶした画像（以下、下地画像）を生成する。この所定の色は予め決められた色でも良いし、テキスト領域の周辺の画素の平均値でも良い。２０８は下地画像圧縮部で、文字部塗りつぶし部２０７で生成された下地画像を圧縮する。
【００１６】
尚、図２に示した機能構成に従ったプログラムコードを記憶媒体に格納し、この記憶媒体を記憶媒体ドライブ１０９を介して図１に示した画像圧縮装置に（記憶媒体ドライブ１０８を介して）読み込ませてもよい。この場合、読み込んだプログラムをＣＰＵ１０１が実行することで、図１に示した構成を備える画像圧縮装置は図２に示した機能構成を有する装置として動作する。
【００１７】
図２の機能構成図を用いて本実施形態におけるカラー文書画像の圧縮方法について説明する。
【００１８】
まず、外部記憶装置１０４もしくは、画像入力装置１０８もしくは、記憶媒体ドライブ１０９のいずれかから、カラー文書画像２０１をＲＡＭ１０２に読み込む。本実施形態ではこのカラー文書画像２０１として図３Ａに示す画像を用いる。
【００１９】
次に、ＲＡＭ１０２に読み込まれたカラー文書画像２０１に基づいて、２値化部２０２は２値画像を生成する。２値画像を生成する方法は特に限定されるものではないが、本実施形態では以下の方法を用いる。まず、カラー文書画像２０１における輝度データのヒストグラムを取り、２値化閾値Ｔを算出する。この算出方法はここでは特には限定しないが、例えばヒストグラムの中間値となる輝度値をこの閾値Ｔとしてもよい。そして２値化閾値Ｔを用いてカラー文書画像２０１を２値化し、２値画像を作成する。生成された２値画像はＲＡＭ１０２内において、カラー文書画像２０１が記憶されているエリアとは別のエリアに記憶される。
【００２０】
次に、領域解析部２０３は上述の２値画像を参照して上述の方法で、テキスト領域を特定する。その際に上述のテキスト情報を生成する。領域解析部２０３によって領域解析される対象を図３Ａに示した画像とした場合、その結果を図３Ｂに示す。同図ではテキスト領域としてＴＥＸＴ１〜ＴＥＸＴ５が特定されており、夫々の領域に対してテキスト情報が生成される。このテキスト情報はテキスト部色抽出部２０４と、文字部塗りつぶし部２０７に出力される。
【００２１】
テキスト部色抽出部２０４は、テキスト情報を参照してカラー文書画像２０１におけるテキスト領域を特定し、特定したテキスト領域における色、つまり、テキスト領域内の文字の色を抽出する。図３Ｂにおいて、本実施形態ではＴＥＸＴ１とＴＥＸＴ３の領域は赤の文字、ＴＥＸＴ２とＴＥＸＴ４の領域は黒の文字、ＴＥＸＴ５は青の文字とする。テキスト部色抽出部２０４により抽出された各テキスト領域の色はパレット情報として生成される。
【００２２】
画像結合部２０５は、テキスト部色抽出部２０４により抽出された各テキスト領域ＴＥＸＴ１〜ＴＥＸＴ５における色を前述のパレット情報を参照して、同じ色を用いているテキスト領域を結合する。この場合、ＴＥＸＴ１とＴＥＸＴ３は同じ色を用いた文字を含んでいるので、これらの領域を包含する領域の画像（包含画像）を生成する。ここで、ＴＥＸＴ１とＴＥＸＴ３とを包含する領域の画像（包含画像）を生成することを、「ＴＥＸＴ１とＴＥＸＴ３とを結合する」と呼ぶことにする。この包含画像を図３Ｃにおいて、ＴＥＸＴ１’で示す。尚、この包含画像内の画素は、文字の部分以外は単色の画素値を有する。ＴＥＸＴ２とＴＥＸＴ４についても同様である。なお、ＴＥＸＴ２とＴＥＸＴ４とを包含する包含画像は図３Ｃにおいて、ＴＥＸＴ２’で示す。また、包含画像ＴＥＸＴ１’、ＴＥＸＴ２’の詳細を夫々図３Ｄに示す。また、画像結合部２０５は各包含画像の（２値画像もしくはカラー文書画像２０１における）位置、サイズを含む包含画像情報を生成する。
【００２３】
また、画像結合部２０５において同じ色を用いているテキスト領域を特定する方法について説明する。テキスト領域内におけるテキストの色がＲＧＢ各８ビットであった場合、ＲＧＢ各２ビット、もしくは３ビットといったように、予め決められた色範囲に減色する。そして各テキスト領域をこのように減色しておいて、同一色になるかどうかを判断する。どの程度まで減色するかは圧縮した画像にどの程度階調性を持たせたいかによって決まる。例えば人の目の青色に対する感度が低いことを利用してＲＧＢを夫々２ビット、２ビット、１ビットとしてもよいし、ＲＧＢを夫々３ビット、３ビット、２ビットとしてもよい。
【００２４】
また、より正確に同色の判定を行いたい場合はＲＧＢ形式ではなく、より色差を比較しやすいＬＡＢ形式やＹＣｒＣｂ形式に変換して、２ビットや３ビットに丸めて用いると良い。説明するとＲＧＢ形式では黒色を灰及び暗い青色とそれぞれ比較した場合には距離的に暗い青色が近くなるが、ＬＡＢやＹＣｒＣｂ形式では輝度成分と色成分が分かれているため黒色と暗い青色の分離が可能となる。
【００２５】
またスキャンされた文字の色と多少異なるが、黒文字などの輝度の低い色の場合は同色のテキスト領域内の最も輝度の低い色を採用し、逆に白文字などの輝度の高い色の場合は同色のテキスト領域内の最も輝度の高い色を採用すると入力画像の再現性は多少低くなるが見た目が良くなる。
【００２６】
２値画像圧縮部２０６は、各包含画像及び／又はテキスト領域を圧縮するが、複数色を有するテキスト領域も存在する可能性がある。よってテキスト領域に対して圧縮を行う場合、このテキスト領域が１つの色を有するか複数の色を有するかに応じて圧縮方法を変更する。これはテキスト領域のパレット情報を参照することで決定する。このパレット情報を参照した結果、注目テキスト領域が１つの色のみを有している場合、この注目テキスト領域に対してＭＭＲ圧縮を行い、注目テキスト領域が複数の色を有する場合、この注目テキスト領域に対して可逆圧縮を行う。また、圧縮結果には上述のパレット情報とテキスト情報をヘッダとして添付する。
【００２７】
一方、包含画像を圧縮する際には、ＭＭＲ圧縮を用いる。またこの圧縮結果には、この包含画像のパレット情報と包含画像情報をヘッダとして添付する。尚、パレット情報は各テキスト領域毎に存在するが、包含画像内のテキスト領域は全て同じパレット情報を有する。よって、包含画像のパレット情報として、包含画像内のテキスト領域のいずれか１つのパレット情報を用いればよい。
【００２８】
このようにすることで、各テキスト領域を圧縮すると５つのヘッダ（ＴＥＸＴ１〜ＴＥＸＴ５に対するヘッダ）が作成されるのに対して、本実施形態では３つのヘッダ（ＴＥＸＴ１’、ＴＥＸＴ２’、ＴＥＸＴ５に対するヘッダ）が作成されることになる。その結果、ヘッダの数を減らすことができ、結果として圧縮後のデータのサイズが減ることになる。
【００２９】
一方、文字部塗りつぶし部２０７は、テキスト情報を用いてカラー文書画像２０１におけるテキスト領域を特定して、特定したテキスト領域を所定の色で塗りつぶした画像（下地画像）を生成する。この下地画像を図３Ｅに示す。この所定の色は予め決められた色でも良いし、カラー文書画像２０１におけるテキスト領域の周辺の画素の平均値でも良い。
【００３０】
そして下地画像圧縮部２０８は、文字部塗りつぶし部２０７で生成された画像（下地画像）に対してＪＰＥＧ圧縮を行う。
【００３１】
以上の説明の通り、本実施形態の画像圧縮装置及び画像圧縮方法によって、テキスト領域を多く含むカラー文書画像を圧縮する場合でも、同じ色を有するテキスト領域を包含する画像を生成し、この画像を圧縮するので、圧縮後の画像に添付されるヘッダの数を減らすことができる。又、同時に、圧縮後のデータのサイズを減らすことができる。
【００３２】
［第２の実施形態］
第１の実施形態では、同一色を有するテキスト領域は同じ包含画像に含まれ、ＭＭＲ圧縮される。しかし同一色を有してはいるが、離れた小さなテキスト領域をこの包含画像に含ませる場合に、逆に圧縮後のサイズが大きくなる場合がある。本実施形態ではこのような場合の画像圧縮方法について、以下説明する。
【００３３】
本実施形態における画像圧縮装置の機能構成は、図２に示した機能構成図において画像結合部２０５における処理が第１の実施形態とは異なる。よって、本実施形態における画像結合部２０５の処理を図４を用いて説明する。
【００３４】
図４は本実施形態の画像結合部２０５における具体的な処理のフローチャートである。
【００３５】
まず、画像結合部２０５で、同一色であると判定されたテキスト領域群の中から基準となる一つのテキスト領域（以下、基準テキスト領域）を選択する（ステップＳ４０１）。もしテキスト領域がなければ、もしくは全テキスト領域に対して後述の処理を終えたのであれば（ステップＳ４０２）、本処理を終了する。一方、未処理のテキスト領域が有れば、処理をステップＳ４０３に進める。
【００３６】
基準テキスト領域の近傍のテキスト領域であって、同一色のテキスト領域を検索し（ステップＳ４０３）、この条件に合致する適する領域が有れば、処理をステップＳ４０４に進め、この条件に合致するテキスト領域であって、基準テキスト領域に最も近いテキスト領域（以下、近傍テキスト領域）を選択する（ステップＳ４０４）。一方、上述の条件に合致したテキスト領域が存在しなければ、処理をステップＳ４０９に進め、後述のステップＳ４０８で基準テキスト領域と結合したと見なされたテキスト領域を包含する包含画像を作成する（ステップＳ４０９）。
【００３７】
次に、基準テキスト領域と近傍テキスト領域とを包含する包含画像矩形を決定する（ステップＳ４０５）。そして、基準テキスト領域、近傍テキスト領域の夫々を圧縮した場合に、夫々の圧縮データの合計サイズと、包含画像を圧縮した場合の圧縮サイズを推定する（ステップＳ４０６）。ここで実際に圧縮を施して正確なサイズを出す方法もあるが、以下の方法で簡易的に算出すれば圧縮サイズの精度は落ちるが処理時間を軽減できる。予め測定していたテキスト領域の圧縮率Ａを用いて、２つの領域（基準テキスト領域と近傍テキスト領域）を夫々圧縮した場合に、その合計サイズは、以下の式で推定することができる。
【００３８】
圧縮サイズ１＝（基準テキスト領域の面積＋近傍テキスト領域の面積）×Ａ+２×ヘッダサイズ
一方、包含画像を圧縮する場合、包含画像に含まれる２つの領域、基準テキスト領域と近傍テキスト領域には必ず隙間部分が生じる。この部分は単一の画素値を表すデータで埋められており、テキスト領域を圧縮した場合に比べではるかに高圧縮率で圧縮できる。この圧縮率をＢとすると
圧縮サイズ２＝（テキスト領域の面積）×Ａ＋（隙間部分の面積）×Ｂ＋ヘッダサイズ
となる。
【００３９】
そして、上述の推定結果を用いて圧縮サイズ１と圧縮サイズ２の比較を行い、圧縮サイズ２の方が小さい、つまり、包含画像を圧縮した方が、各領域を別々に圧縮するよりも、発生する圧縮データのサイズが小さくなる場合（ステップＳ４０７）、処理をステップＳ４０８に進め、結合リストに基準テキスト領域と近傍テキスト領域とを同じ包含画像に含める（結合する）ことを示すデータを追加する（ステップＳ４０８）。
【００４０】
図５に結合リストの例を示す。同図では、基準テキスト領域をＴＥＸＴ２とした場合の結合リストの構成例を示したものであり、ＴＥＸＴ２と、各テキスト領域ＴＥＸＴ１〜ＴＥＸＴ５との対応が示されている。同図において、０は結合していないことを示す符号で、１は結合していることを示している符号、９９９が無効（自身とは結合できない）を示す符号である。結合リストには最初全て結合していないことを示す符号（同図では０）がセットされており、ステップＳ４０８における処理を実行したときのみ、結合していることを示す符号（同図では１）に変更される。
【００４１】
一方、圧縮サイズ２の方が大きい、つまり、包含画像を圧縮した方が、各領域を別々に圧縮するよりも、発生する圧縮データのサイズが大きくなる場合（ステップＳ４０７）、処理をステップＳ４０３に戻し、次の近傍テキスト領域を検索する。
【００４２】
以上の処理が一巡し、基準テキスト領域と近傍テキスト領域が結合された場合、再び行われるステップＳ４０３以降の処理では、一度選択されたテキスト領域以外であって、基準テキスト領域と同一色であって、基準テキスト領域に最も近いテキスト領域を新たな近傍テキスト領域とする（ステップＳ４０３，ステップＳ４０４）。そして、基準テキスト領域と前回の近傍テキスト領域、そして今回の近傍テキスト領域とを含む包含画像（第２の包含画像）矩形を決定し（ステップＳ４０５）、第２の包含画像と今回の近傍テキスト領域について、上述の式を用いて圧縮サイズ１，圧縮サイズ２を推定する（ステップＳ４０６）。具体的には以下のような式になる。
【００４３】
圧縮サイズ１＝（第２の包含画像の面積＋近傍テキスト領域の面積）×Ａ+２×ヘッダサイズ
圧縮サイズ２＝（テキスト領域の面積）×Ａ＋（隙間部分の面積）×Ｂ＋ヘッダサイズ
そして上述のステップＳ４０７以降の処理を行う。このようにすることで、最も多くのテキスト領域を含み、且つ圧縮後のサイズが最も小さい包含画像の作成を行うことができる。
【００４４】
［他の実施形態］
また、本発明は上記実施形態を実現する為の装置及び方法のみに限定されるものではなく、上記システム又は装置内のコンピュータ（ＣＰＵあるいはＭＰＵ）に、上記実施形態を実現する為のソフトウェアのプログラムコードを供給し、このプログラムコードに従って上記システムあるいは装置のコンピュータが上記各種デバイスを動作させることにより上記実施形態を実現する場合も本発明の範疇に含まれる。
【００４５】
またこの場合、ソフトウェアのプログラムコード自体が上記実施形態の機能を実現することになり、そのプログラムコード自体、及びそのプログラムコードをコンピュータに供給する為の手段、具体的には上記プログラムコードを格納した記憶媒体は本発明の範疇に含まれる。
【００４６】
この様なプログラムコードを格納する記憶媒体としては、例えばフロッピーディスク、ハードディスク、光ディスク、光磁気ディスク、ＣＤ−ＲＯＭ、磁気テープ、不揮発性のメモリカード、ＲＯＭ等を用いることができる。
【００４７】
また、上記コンピュータが、供給されたプログラムコードのみに従って各種デバイスを制御することにより、上記実施形態の機能が実現される場合だけではなく、上記プログラムコードがコンピュータ上で稼働しているＯＳ（オペレーティングシステム）、あるいは他のアプリケーションソフト等と共同して上記実施形態が実現される場合にもかかるプログラムコードは本発明の範疇に含まれる。
【００４８】
更に、この供給されたプログラムコードが、コンピュータの機能拡張ボードやコンピュータに接続された機能拡張ユニットに備わるメモリに格納された後、そのプログラムコードの指示に基づいてその機能拡張ボードや機能格納ユニットに備わるＣＰＵ等が実際の処理の一部又は全部を行い、その処理によって上記実施形態が実現される場合も本発明の範疇に含まれる。
【００４９】
【発明の効果】
以上の説明により、本発明によって、所定の色範囲内で同じ色を有するテキスト領域を包含する包含画像、及びこの包含画像のヘッダを生成することで、テキスト領域毎に設けられたヘッダの数を包含画像のヘッダの数に減らすことができる。その結果、テキスト領域を含むカラー文書画像を圧縮する事で得られる圧縮データのサイズを抑えることができる。
【図面の簡単な説明】
【図１】本発明の第１の実施形態における画像圧縮装置の基本構成を示す図である。
【図２】本発明の第１の実施形態における画像圧縮装置の機能構成を示す図である。
【図３Ａ】カラー文書画像２０１を示す図である。
【図３Ｂ】領域解析部２０３により特定したカラー文書画像２０１のテキスト領域を示す図である。
【図３Ｃ】包含画像を示す図である。
【図３Ｄ】ＴＥＸＴ１’、ＴＥＸＴ２’の詳細を示す図である。
【図３Ｅ】下地画像を示す図である。
【図４】本発明の第２の実施形態の画像結合部２０５における具体的な処理のフローチャートである。
【図５】結合リストの例を示す図である。

[0001]
[Technical field to which the invention belongs]
The present invention relates to an image compression apparatus, an image compression method, a program code, and a storage medium for compressing a color document image.
[0002]
[Prior art]
In recent years, the digitization of documents has progressed with the spread of scanners. A digitized A4 document held in full color at 300 dpi occupies about 24 Mbytes, which strains memory to keep and is too large to send to others as, for example, an e-mail attachment. JPEG is well known for full-color image compression. JPEG is very effective and gives good image quality for natural images such as photographs, but when it is applied to high-frequency portions such as text, image degradation called mosquito noise occurs and the compression ratio is also poor. Therefore, the original image is divided into regions, and a JPEG-compressed image of the background with the character regions removed and an MMR-compressed image of the character regions with color information are created.
[0003]
However, with the above method, when an image containing text in two or more colors is compressed and the compressed image is then decompressed, the text in the decompressed image is rendered in a single color. For example, the color information of emphasized characters shown in red within black body text is lost.
[0004]
In contrast, the following method has been used to compress a color document image at a high ratio with little deterioration in image quality. First, binarization is performed on the color document image to obtain a binary image. Character regions are then detected from the binary image. Specifically, the contours of black pixels in the binary image are traced, and all black regions are labeled. The labeled black regions are then searched to determine which of them appear to be characters. Next, the regions of the original image corresponding to the black regions of the binary image are filled with the color surrounding each black region to create an image A. An image B obtained by reducing the image A is then JPEG compressed. The colors of the regions of the original image (color document image) corresponding to the black regions of the binary image are calculated to create a plurality of palettes. Color reduction is then performed on the original image according to the palettes to generate a reduced-color image. When the reduced-color image is 1 bit, it is MMR compressed. When the reduced-color image is 2 bits or more, it is losslessly compressed.
[Problems to be solved by the invention]
However, in the conventional method, when there are many text parts in the image, the partial compression size of the text area is small, but header information such as position coordinates and color information (palette) of the partial image is added to each. As a result, there is a drawback that the compression size increases. For example, in the case of an image in which a table is arranged on the entire surface, each cell of the table is processed as a text area, so the file size after compression becomes large.
[0005]
The present invention has been made in view of the above problems, and an object of the present invention is to suppress the size of compressed data obtained by compressing a color document image including a text area.
[0006]
[Means for Solving the Problems]
In order to achieve the object of the present invention, for example, an image compression apparatus of the present invention comprises the following arrangement.
[0007]
That is, an image compression apparatus for compressing a color document image comprises:
extraction means for extracting the color of each text region included in the color document image;
generation means for generating an inclusion image that includes text regions having a color within a preset color range; and
compression means for compressing the inclusion image and/or the text regions,
wherein the generation means includes determination means for determining, among the text regions having a color within the preset color range, a text region to be combined with a target text region,
and generates an inclusion image including the target text region and the text region determined by the determination means, together with information about that inclusion image,
and wherein the determination means
obtains a first compressed size, which is the sum of the compressed size estimated for compressing the target text region (or an inclusion image including the target text region) and the compressed size estimated for compressing the text region,
and a second compressed size, which is the size estimated for compressing an inclusion image that includes both the target text region (or the inclusion image including the target text region) and the text region,
and combines the text region with the target text region when the second compressed size is smaller than the first compressed size.
[0011]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, the present invention will be described in detail according to preferred embodiments with reference to the accompanying drawings.
[0012]
[First Embodiment]
FIG. 1 shows the basic configuration of the image compression apparatus according to this embodiment. Reference numeral 101 denotes a CPU, which controls the entire apparatus using programs and data stored in the RAM 102 and the ROM 103 and performs the image compression processing described later. Reference numeral 102 denotes a RAM, which has an area for temporarily storing programs and data loaded from the external storage device 104 or the storage medium drive 109 and image data input from the image input device 108, and also has a work area used when the CPU 101 executes various processes. Reference numeral 103 denotes a ROM, which stores a control program and a boot program for the entire apparatus, setting data for the apparatus, and the like. Reference numeral 104 denotes an external storage device such as a hard disk, which can store programs and data loaded from the storage medium drive 109. When the size of the work area exceeds the size of the RAM 102, the excess area can also be provided as a file. Reference numerals 105 and 106 denote a keyboard and a mouse, respectively, which function as pointing devices and can input various instructions to the apparatus.
[0013]
Reference numeral 107 denotes a display device, which is composed of a CRT, a liquid crystal screen, or the like, and can display image information and character information. Reference numeral 108 denotes an image input device, which is composed of a scanner, a digital camera, or the like, and can input an image as data. Note that the image input device 108 includes an interface for connecting to the apparatus. Reference numeral 109 denotes a storage medium drive, which is composed of a CD-ROM drive, a DVD-ROM drive, a floppy disk (FD) drive, or the like, and can read programs and data from a storage medium such as a CD-ROM, DVD-ROM, or FD. Reference numeral 110 denotes a bus connecting the above-described units.
[0014]
FIG. 2 shows the functional configuration of the image compression apparatus according to this embodiment. Reference numeral 201 denotes a color document image, and a plurality of colors are used for the character portions (text portions) included in the image 201. Reference numeral 202 denotes a binarization unit, which binarizes the color document image 201 to generate a binary image. Reference numeral 203 denotes a region analysis unit, which identifies text regions in the binary image and generates information (text information) such as the position and size of each text region in the binary image. One example of a method for identifying text regions is to trace the contours of black pixels in the binary image, label all black regions, search the labeled black regions, and identify the regions among them that appear to be characters (that is, regions that seem to be text regions), but the method is not limited to this.
[0015]
Reference numeral 204 denotes a text color extraction unit, which extracts the color used in each text region. Reference numeral 205 denotes an image combining unit, which generates an image of a region that includes the text regions determined to use the same color (hereinafter, inclusion image). Reference numeral 206 denotes a binary image compression unit, which compresses the inclusion images generated by the image combining unit 205 and/or the text regions. Reference numeral 207 denotes a character filling unit, which generates an image (hereinafter, background image) in which the text regions analyzed by the region analysis unit 203 are filled with a predetermined color in the color document image 201. The predetermined color may be a color fixed in advance or the average value of the pixels surrounding the text region. Reference numeral 208 denotes a background image compression unit, which compresses the background image generated by the character filling unit 207.
[0016]
Note that program code following the functional configuration shown in FIG. 2 may be stored in a storage medium, and this storage medium may be read into the image compression apparatus shown in FIG. 1 via the storage medium drive 109. In this case, when the CPU 101 executes the read program, the image compression apparatus having the configuration shown in FIG. 1 operates as an apparatus having the functional configuration shown in FIG. 2.
[0017]
A color document image compression method according to this embodiment will be described with reference to the functional configuration diagram of FIG.
[0018]
First, the color document image 201 is read into the RAM 102 from any of the external storage device 104, the image input device 108, or the storage medium drive 109. In the present embodiment, an image shown in FIG. 3A is used as the color document image 201.
[0019]
Next, based on the color document image 201 read into the RAM 102, the binarization unit 202 generates a binary image. A method for generating a binary image is not particularly limited, but in the present embodiment, the following method is used. First, a histogram of luminance data in the color document image 201 is taken, and a binarization threshold T is calculated. Although this calculation method is not particularly limited here, for example, a luminance value that is an intermediate value of the histogram may be used as the threshold value T. Then, the color document image 201 is binarized using the binarization threshold T, and a binary image is created. The generated binary image is stored in a different area from the area where the color document image 201 is stored in the RAM 102.
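The binarization step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it reads "an intermediate value of the histogram" as the median luminance, which is only one possible interpretation, and the function names and the list-of-rows image layout are assumptions made for the example.

```python
def median_luminance_threshold(luma):
    """Return the luminance level at which the cumulative histogram reaches half
    of all pixels (one reading of "intermediate value of the histogram")."""
    hist = [0] * 256
    for row in luma:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    running = 0
    for level, count in enumerate(hist):
        running += count
        if running * 2 >= total:
            return level
    return 255


def binarize(luma, threshold):
    """Map pixels darker than the threshold to 1 (black) and all others to 0."""
    return [[1 if value < threshold else 0 for value in row] for row in luma]
```

On a mostly white page the median is the paper luminance itself, so every darker pixel, including colored text, falls on the black side of the binary image.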
[0020]
Next, the region analysis unit 203 specifies the text region by referring to the above binary image by the above method. At that time, the above-described text information is generated. When the object whose area is analyzed by the area analysis unit 203 is the image shown in FIG. 3A, the result is shown in FIG. 3B. In the figure, TEXT1 to TEXT5 are specified as text areas, and text information is generated for each area. This text information is output to the text part color extracting part 204 and the character part filling part 207.
[0021]
The text color extraction unit 204 refers to the text information to identify each text region in the color document image 201, and extracts the color in the identified text region, that is, the color of the characters in the text region. In FIG. 3B, in this embodiment, the TEXT1 and TEXT3 regions contain red characters, the TEXT2 and TEXT4 regions contain black characters, and TEXT5 contains blue characters. The color of each text region extracted by the text color extraction unit 204 is generated as palette information.
[0022]
The image combining unit 205 refers to the above palette information for the colors of the text regions TEXT1 to TEXT5 extracted by the text color extraction unit 204, and combines text regions that use the same color. In this case, since TEXT1 and TEXT3 contain characters of the same color, an image of a region that includes both of them (inclusion image) is generated. Here, generating an image of a region that includes TEXT1 and TEXT3 (inclusion image) is referred to as "combining TEXT1 and TEXT3". This inclusion image is shown as TEXT1' in FIG. 3C. Note that the pixels in the inclusion image, other than those of the characters, have a single pixel value. The same applies to TEXT2 and TEXT4. The inclusion image including TEXT2 and TEXT4 is shown as TEXT2' in FIG. 3C. Details of the inclusion images TEXT1' and TEXT2' are shown in FIG. 3D. The image combining unit 205 also generates inclusion image information including the position and size of each inclusion image (in the binary image or the color document image 201).
[0023]
Next, the method by which the image combining unit 205 identifies text regions that use the same color will be described. When the text color in a text region is 8 bits for each of R, G, and B, the color is reduced to a predetermined color range, such as 2 bits or 3 bits for each of R, G, and B. Each text region is reduced in this way, and it is then determined whether the regions have the same color. How far to reduce the colors depends on how much gradation the compressed image is to retain. For example, taking advantage of the human eye's low sensitivity to blue, R, G, and B may be 2 bits, 2 bits, and 1 bit, respectively, or 3 bits, 3 bits, and 2 bits, respectively.
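A minimal sketch of this same-color test follows. It assumes that "reducing to n bits" means keeping the top n bits of each 8-bit channel; the function names and the default bit depths are illustrative and not taken from the patent.

```python
def reduce_color(rgb, bits=(2, 2, 2)):
    """Keep only the most significant `bits` bits of each 8-bit channel."""
    return tuple(channel >> (8 - b) for channel, b in zip(rgb, bits))


def same_color(rgb_a, rgb_b, bits=(2, 2, 2)):
    """Two colors count as the same if they agree after color reduction."""
    return reduce_color(rgb_a, bits) == reduce_color(rgb_b, bits)
```

With the default 2 bits per channel, two slightly different scanned reds such as (200, 30, 40) and (220, 10, 20) both reduce to (3, 0, 0) and are treated as one color. Passing `bits=(3, 3, 2)` gives the finer R, G, B split mentioned in the text.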
[0024]
If the same-color determination needs to be more accurate, it is better to convert from the RGB format to a format in which color differences are easier to compare, such as LAB or YCrCb, and then round to 2 or 3 bits. To explain, when black is compared with gray and with dark blue in the RGB format, dark blue is the closer of the two in distance, but in the LAB and YCrCb formats the luminance component and the color components are separate, so black and dark blue can be distinguished.
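The black, gray, and dark blue comparison can be checked numerically. The sketch below uses the JFIF (full-range BT.601) RGB to YCbCr conversion as an assumed stand-in for the "YCrCb format" named in the text; the sample color values and function names are hypothetical.

```python
def rgb_to_ycbcr(rgb):
    """JFIF full-range conversion from 8-bit RGB to (Y, Cb, Cr)."""
    r, g, b = rgb
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def rgb_distance(a, b):
    """Euclidean distance in RGB space."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def chroma_distance(a, b):
    """Euclidean distance using only the Cb and Cr components."""
    _, cb_a, cr_a = rgb_to_ycbcr(a)
    _, cb_b, cr_b = rgb_to_ycbcr(b)
    return ((cb_a - cb_b) ** 2 + (cr_a - cr_b) ** 2) ** 0.5


black, gray, dark_blue = (0, 0, 0), (64, 64, 64), (0, 0, 100)
```

For these sample values, dark blue is nearer to black than gray is in RGB distance, while in the chroma components gray coincides with black and dark blue is clearly separated, which is the behavior the paragraph describes.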
[0025]
In addition, although the result differs slightly from the scanned character color, for low-luminance colors such as black characters the lowest-luminance color among the text regions of the same color may be adopted, and conversely, for high-luminance colors such as white characters the highest-luminance color among the text regions of the same color may be adopted. This lowers the fidelity to the input image somewhat but improves the appearance.
[0026]
The binary image compression unit 206 compresses each inclusion image and/or text region, but some text regions may contain a plurality of colors. Therefore, when a text region is compressed, the compression method is changed depending on whether the text region has one color or a plurality of colors. This is determined by referring to the palette information of the text region. If the palette information shows that the target text region has only one color, MMR compression is applied to the target text region; if the target text region has a plurality of colors, lossless compression is applied to it. The above palette information and text information are attached to the compression result as a header.
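The choice between the two compression methods can be sketched as below. The header layout (a dictionary with `palette`, `position`, `size`, and `method` keys) is purely hypothetical, since the text does not specify a header format, and the function returns only the name of the method rather than running an actual MMR or lossless encoder.

```python
def choose_text_compression(palette):
    """Return "MMR" for a single-color text region and "lossless" otherwise."""
    return "MMR" if len(set(palette)) == 1 else "lossless"


def build_text_header(palette, position, size):
    """Bundle palette information and text information (hypothetical layout)."""
    return {
        "palette": list(palette),
        "position": position,
        "size": size,
        "method": choose_text_compression(palette),
    }
```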
[0027]
On the other hand, when compressing an included image, MMR compression is used. In addition, the palette information and the included image information of the included image are attached to the compression result as a header. Note that palette information exists for each text area, but all text areas in the included image have the same palette information. Therefore, any one palette information of the text area in the inclusion image may be used as the palette information of the inclusion image.
[0028]
If each text region were compressed individually, five headers (the headers for TEXT1 to TEXT5) would be created, whereas in this embodiment three headers (the headers for TEXT1', TEXT2', and TEXT5) are created. The number of headers is thus reduced, and as a result the size of the compressed data is reduced.
[0029]
Meanwhile, the character filling unit 207 uses the text information to identify the text regions in the color document image 201, and generates an image (background image) in which the identified text regions are filled with a predetermined color. This background image is shown in FIG. 3E. The predetermined color may be a color fixed in advance, or may be the average value of the pixels surrounding the text region in the color document image 201.
[0030]
Then, the background image compression unit 208 performs JPEG compression on the image (background image) generated by the character filling unit 207.
[0031]
As described above, with the image compression apparatus and image compression method according to this embodiment, even when a color document image containing many text regions is compressed, an image that includes the text regions having the same color is generated and this image is compressed, so the number of headers attached to the compressed image can be reduced. At the same time, the size of the compressed data can be reduced.
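The header count in this example can be reproduced with a short sketch that groups regions by color and takes the enclosing rectangle of each group. The rectangle coordinates below are invented for illustration (FIG. 3B gives no numbers), rectangles are written as (left, top, right, bottom), and the function names are assumptions.

```python
from collections import defaultdict


def bounding_box(rects):
    """Smallest (left, top, right, bottom) rectangle enclosing all rects."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def combine_by_color(regions):
    """regions maps name -> (rect, color); returns color -> (names, enclosing rect)."""
    groups = defaultdict(list)
    for name, (_, color) in regions.items():
        groups[color].append(name)
    return {color: (names, bounding_box([regions[n][0] for n in names]))
            for color, names in groups.items()}


# Hypothetical layout loosely following FIG. 3B.
regions = {
    "TEXT1": ((10, 10, 110, 30), "red"),
    "TEXT2": ((10, 40, 200, 80), "black"),
    "TEXT3": ((10, 90, 110, 110), "red"),
    "TEXT4": ((10, 120, 200, 160), "black"),
    "TEXT5": ((120, 10, 200, 30), "blue"),
}
combined = combine_by_color(regions)
```

Here `combined` has three entries, one per color, so three headers are needed instead of five. The red entry plays the role of TEXT1' and spans (10, 10, 110, 110).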
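One way to realize the "average of the surrounding pixels" option is sketched below for a single-channel image. A color image would apply the same step per channel. The one-pixel ring used as the "surrounding" area, the exclusive right and bottom rectangle edges, and the function name are all assumptions made for the example.

```python
def fill_with_surrounding_mean(image, rect):
    """Fill rect (left, top, right, bottom; right and bottom exclusive) in a
    list-of-rows grayscale image with the mean of the one-pixel ring around it."""
    left, top, right, bottom = rect
    height, width = len(image), len(image[0])
    ring = [image[y][x]
            for y in range(max(top - 1, 0), min(bottom + 1, height))
            for x in range(max(left - 1, 0), min(right + 1, width))
            if not (top <= y < bottom and left <= x < right)]
    fill = round(sum(ring) / len(ring)) if ring else 0
    for y in range(top, bottom):
        for x in range(left, right):
            image[y][x] = fill
    return image
```

Filling with a locally averaged color keeps the background image smooth where the characters used to be, which suits the JPEG compression applied to it in the next step.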
[0032]
[Second Embodiment]
In the first embodiment, text regions having the same color are included in the same inclusion image and MMR compressed. However, if a small text region that has the same color but lies far away is included in the inclusion image, the size after compression may actually increase. This embodiment describes an image compression method for such a case.
[0033]
The functional configuration of the image compression apparatus in this embodiment is that shown in FIG. 2, and differs from the first embodiment only in the processing performed by the image combining unit 205. The processing of the image combining unit 205 in this embodiment is therefore described with reference to FIG. 4.
[0034]
FIG. 4 is a flowchart of specific processing in the image combining unit 205 of the present embodiment.
[0035]
First, the image combining unit 205 selects one text region to serve as a reference (hereinafter, reference text region) from the group of text regions determined to have the same color (step S401). If there is no text region, or if the processing described later has been completed for all text regions (step S402), this process ends. If there is an unprocessed text region, the process proceeds to step S403.
[0036]
Text regions that are near the reference text region and have the same color are searched for (step S403). If a region matching this condition exists, the process proceeds to step S404, and among the matching text regions, the one closest to the reference text region (hereinafter, neighboring text region) is selected (step S404). If no text region matches the above condition, the process proceeds to step S409, and an inclusion image is created that includes the text regions regarded as combined with the reference text region in step S408 described later (step S409).
[0037]
Next, an inclusion image rectangle including the reference text region and the neighboring text region is determined (step S405). Then, when each of the reference text area and the neighboring text area is compressed, the total size of the respective compressed data and the compressed size when the inclusion image is compressed are estimated (step S406). Here, there is a method of actually compressing and obtaining an accurate size, but if it is simply calculated by the following method, the processing time can be reduced although the accuracy of the compression size is reduced. When the two areas (reference text area and neighboring text area) are compressed using the compression ratio A of the text area measured in advance, the total size can be estimated by the following equation.
[0038]
Compression size 1 = (area of reference text area + area of neighboring text area) × A + 2 × header size
On the other hand, when an inclusion image is compressed, there is always a gap between the two areas it contains, the reference text area and the neighboring text area. This gap is filled with data representing a single pixel value, and can be compressed at a much higher compression ratio than a text area. If this compression ratio is B, then
Compression size 2 = (area of text area) × A + (area of gap) × B + header size
[0039]
Then, using the above estimates, compression size 1 and compression size 2 are compared (step S407). If compression size 2 is smaller, that is, if compressing the inclusion image produces less compressed data than compressing each region separately, the process proceeds to step S408, where data indicating that the reference text region and the neighboring text region are to be included in the same inclusion image (combined) is added to the combined list (step S408).
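The estimate and comparison of steps S406 and S407 can be written out directly from the two formulas. In this sketch rectangles are (left, top, right, bottom) and are assumed not to overlap, the gap is taken as the enclosing rectangle minus the two text areas, and the values used for A, B, and the header size in the usage note are hypothetical.

```python
def area(rect):
    left, top, right, bottom = rect
    return (right - left) * (bottom - top)


def union(a, b):
    """Enclosing rectangle of two rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def compression_size_1(base, near, A, header):
    """Estimated total when the two regions are compressed separately."""
    return (area(base) + area(near)) * A + 2 * header


def compression_size_2(base, near, A, B, header):
    """Estimated size when one inclusion image covers both regions."""
    text_area = area(base) + area(near)
    gap_area = area(union(base, near)) - text_area
    return text_area * A + gap_area * B + header


def should_combine(base, near, A, B, header):
    """Step S407: combine only if the inclusion image is estimated to be smaller."""
    return compression_size_2(base, near, A, B, header) < compression_size_1(base, near, A, header)
```

With A = 0.1, B = 0.01, and a header size of 50, a reference region (0, 0, 100, 20) and a nearby region (0, 30, 100, 50) are combined, because the small gap costs less than a second header. A tiny region at (900, 900, 910, 910) is not, because the gap term dominates.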
[0040]
FIG. 5 shows an example of a combined list. In the figure, a configuration example of the combined list when the reference text area is TEXT2 is shown, and correspondence between TEXT2 and each text area TEXT1 to TEXT5 is shown. In the figure, 0 is a code indicating that they are not combined, 1 is a code indicating that they are combined, and 999 is a code indicating invalid (cannot be combined with itself). A code (0 in the figure) indicating that all items are not initially combined is set in the combination list, and a code (1 in the figure) indicating that the items are combined only when the process in step S408 is executed. Changed to
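A combined list of this kind can be held in a small mapping. The sketch below uses the three codes from the text (0, 1, 999); representing the list as a dictionary keyed by region name, and the function names, are assumptions for illustration.

```python
NOT_COMBINED, COMBINED, INVALID = 0, 1, 999


def new_combined_list(reference, names):
    """All entries start as NOT_COMBINED; the reference itself is INVALID."""
    return {name: (INVALID if name == reference else NOT_COMBINED) for name in names}


def mark_combined(combined_list, name):
    """Record the result of step S408 for one neighboring text region."""
    if combined_list[name] == INVALID:
        raise ValueError("a text region cannot be combined with itself")
    combined_list[name] = COMBINED
```

For a reference region TEXT2, calling `mark_combined` for TEXT4 leaves TEXT1, TEXT3, and TEXT5 at 0, TEXT2 at 999, and TEXT4 at 1.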
[0041]
On the other hand, if the compressed size 2 is larger, that is, if the size of the generated compressed data is larger when compressing the included image than when compressing each area separately (step S407), the process goes to step S403. Go back and search the next neighborhood text area.
[0042]
When the above process has finished and the reference text area and the neighboring text area have been combined, the processing from step S403 onward is performed again: excluding text areas that have already been selected, the text area that has the same color as the reference text area and is closest to it is set as the new neighboring text area (steps S403 and S404). Then the rectangle of an inclusion image (second inclusion image) that includes the reference text area, the previous neighboring text area, and the current neighboring text area is determined (step S405), and compression size 1 and compression size 2 are estimated for the second inclusion image and the current neighboring text area using the above formulas (step S406). Specifically, the following equations are used.
[0043]
Compression size 1 = (area of second inclusion image + area of neighboring text area) × A + 2 × header size
Compression size 2 = (area of text areas) × A + (area of gap) × B + header size
The processing from step S407 onward is then performed. In this way, an inclusion image can be created that includes as many text areas as possible while keeping the compressed size small.
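The repeated search and comparison (steps S403 to S408) can be sketched as a greedy loop. This is a sketch under several assumptions rather than the patented procedure itself: rectangles are `(x0, y0, x1, y1)` tuples, "closest" is measured as the gap between rectangles, all candidates are presumed to already share the reference color, and compression size 1 is read as "the inclusion image built so far plus the new neighbor, each with its own header". All function names and sample values are hypothetical.

```python
def area(r):
    x0, y0, x1, y1 = r
    return (x1 - x0) * (y1 - y0)


def union(a, b):
    """Bounding rectangle of a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def distance(a, b):
    """Gap between two rectangles (0 if they touch); an assumed metric."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return (dx * dx + dy * dy) ** 0.5


def grow_inclusion(reference, candidates, A, B, header):
    """Return (inclusion rectangle, combined names) for one reference area.

    candidates maps a name to a rectangle of the same color as `reference`.
    """
    current = reference          # inclusion image built so far
    text_area = area(reference)  # text pixels contained in `current`
    combined = []
    # S403/S404: visit each candidate once, nearest to the reference first.
    ordered = sorted(candidates.items(),
                     key=lambda kv: distance(reference, kv[1]))
    for name, rect in ordered:
        merged = union(current, rect)                           # S405
        # S406: size 1 uses the inclusion image built so far (an assumption).
        size1 = (area(current) + area(rect)) * A + 2 * header
        new_text = text_area + area(rect)
        size2 = new_text * A + (area(merged) - new_text) * B + header
        if size2 < size1:                                       # S407
            combined.append(name)                               # S408
            current, text_area = merged, new_text
    return current, combined


# Illustrative data: one nearby area and one far away (values are made up).
rect, names = grow_inclusion(
    (0, 0, 100, 20),
    {"near": (0, 30, 100, 50), "far": (0, 1000, 100, 1020)},
    A=0.1, B=0.01, header=64,
)
```

In this example only the nearby area is combined: merging the distant one would add a gap large enough that even at the cheaper ratio B it costs more than a second header.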
[0044]
[Other Embodiments]
Further, the present invention is not limited to the apparatus and method that realize the above-described embodiment. The scope of the present invention also includes the case where software program code for realizing the above-described embodiment is supplied to a computer (CPU or MPU) in the system or apparatus, and the computer of the system or apparatus operates the various devices according to the program code, thereby realizing the embodiment.
[0045]
In this case, the software program code itself realizes the functions of the above embodiment, and the program code itself and the means for supplying the program code to the computer, specifically a storage medium storing the program code, are included in the scope of the present invention.
[0046]
As a storage medium for storing such a program code, for example, a floppy disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a magnetic tape, a nonvolatile memory card, a ROM, or the like can be used.
[0047]
Such program code is included in the scope of the present invention not only when the functions of the above embodiment are realized by the computer controlling the various devices according to the supplied program code alone, but also when the embodiment is realized by the program code in cooperation with the OS (operating system) running on the computer, other application software, or the like.
[0048]
Further, the scope of the present invention also includes the case where the supplied program code is stored in a memory of a function expansion board of the computer or of a function expansion unit connected to the computer, after which a CPU or the like provided on the function expansion board or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the above-described embodiment is realized by that processing.
[0049]
[Effect of the Invention]
As described above, according to the present invention, an inclusion image that includes text areas having colors within a predetermined color range is generated together with a header for the inclusion image, so the headers that would otherwise be provided for each text area are replaced by the header of the inclusion image and the number of headers is reduced. As a result, the size of the compressed data obtained by compressing a color document image including text areas can be suppressed.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating a basic configuration of an image compression apparatus according to a first embodiment of the present invention.
FIG. 2 is a diagram illustrating a functional configuration of the image compression apparatus according to the first embodiment of the present invention.
FIG. 3A is a diagram showing a color document image 201.
FIG. 3B is a diagram showing a text area of the color document image 201 specified by the area analysis unit 203.
FIG. 3C is a diagram showing an inclusion image.
FIG. 3D is a diagram showing details of TEXT1 ′ and TEXT2 ′.
FIG. 3E is a diagram showing a base image.
FIG. 4 is a flowchart of specific processing in an image combining unit 205 according to the second embodiment of this invention.
FIG. 5 is a diagram illustrating an example of a combined list.

Claims

An image compression apparatus for compressing a color document image, comprising:
extracting means for extracting the color of a text area included in the color document image;
generating means for generating an inclusion image including text areas having a color within a preset color range among the text areas; and
compression means for compressing the inclusion image and/or the text area,
wherein the generating means includes determining means for determining, among text areas having a color within the preset color range, a text area to be combined with a target text area,
and generates an inclusion image including the target text area and the text area determined by the determining means, together with information about the inclusion image, and
wherein the determining means
obtains a first compressed size, given by the sum of the compressed size estimated when the target text area, or an inclusion image including the target text area, is compressed and the compressed size estimated when the text area is compressed, and
a second compressed size, estimated when an inclusion image including both the target text area (or the inclusion image including the target text area) and the text area is compressed,
and combines the text area with the target text area when the second compressed size is smaller than the first compressed size.

The image compression apparatus according to claim 1, further comprising binarization means for performing binarization processing on the color document image to generate a binary image,
wherein the extracting means specifies a text area from the binary image and extracts the color contained in the area of the color document image corresponding to the specified text area.

The image compression apparatus according to claim 2, wherein the extraction unit generates text information including a position of a text area and a size of the text area in the binary image.

The image compression apparatus according to claim 2, wherein the extraction unit generates a color extracted from the text area as palette information of the text area.

The image compression apparatus according to claim 1, wherein the generating means performs color reduction processing on the colors of the text areas and generates an inclusion image including text areas having, among the color-reduced colors, a color within a preset color range, together with information on the inclusion image.

6. The image compression apparatus according to claim 5, wherein the information on the inclusion image includes a position of the inclusion image in the color document image and a size of the inclusion image.

The image compression apparatus according to any one of claims 1 to 6, wherein the generating means generates a combined list indicating the text areas to be combined with the target text area among text areas having a color within the preset color range, and, by referring to the combined list, generates the inclusion image including the target text area and the text areas determined by the determining means, together with the information on the inclusion image.

The image compression apparatus according to claim 1, wherein the compression unit performs MMR compression on an included image and / or a text region having one color.

The image compression apparatus according to claim 1, wherein the compression unit performs lossless compression on a text region having a plurality of colors.

The image compression apparatus according to any one of claims 1 to 9, further comprising:
base image generation means for generating a base image in which the text areas in the color document image are filled with a predetermined color; and base image compression means for compressing the base image.

The image compression apparatus according to claim 10, wherein the base image generation means identifies the text area by referring to text information including the position of the text area and the size of the text area.

The image compression apparatus according to claim 10, wherein the base image generation means fills the text area with the average value of the pixels of the color document image near the text area.

The image compression apparatus according to claim 10, wherein the base image compression means performs JPEG compression on the base image.

An image compression method performed by an image compression apparatus for compressing a color document image, comprising:
an extraction step of extracting the color of a text area included in the color document image;
a generating step of generating an inclusion image including text areas having a color within a preset color range among the text areas; and
a compression step of compressing the inclusion image and/or the text area,
wherein the generating step includes a determining step of determining, among text areas having a color within the preset color range, a text area to be combined with a target text area,
and generates an inclusion image including the target text area and the text area determined in the determining step, together with information about the inclusion image, and
wherein the determining step
obtains a first compressed size, given by the sum of the compressed size estimated when the target text area, or an inclusion image including the target text area, is compressed and the compressed size estimated when the text area is compressed, and
a second compressed size, estimated when an inclusion image including both the target text area (or the inclusion image including the target text area) and the text area is compressed,
and combines the text area with the target text area when the second compressed size is smaller than the first compressed size.

A computer program for causing a computer to function as each means included in the image compression apparatus according to any one of claims 1 to 13.

A computer-readable storage medium storing the program according to claim 15.