JP3907505B2

JP3907505B2 - Block truncation coding apparatus and method

Info

Publication number: JP3907505B2
Application number: JP2002072966A
Authority: JP
Inventors: 宏幸作山
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2002-03-15
Filing date: 2002-03-15
Publication date: 2007-04-18
Anticipated expiration: 2022-03-15
Also published as: JP2003274195A

Description

【０００１】
【発明の属する技術分野】
本発明は、ブロックトランケーション符号化装置及び方法に関するもので、特に解像度成分の適切な区間設定を行うことで画質を向上させるものに関する。
【０００２】
【従来の技術】
従来から、静止画像の符号化法として、ブロックトランケーション符号化（BTC Block Truncation Coding ）と呼ばれる方法が知られている。これは、原画像を m×n 画素のブロックに分解し（分割は必須ではないが）、該ブロックに所定の階調数を割り当て、該限られた階調数によってブロック内の画素値を表現する（階調数を量子化する）方法である。
【０００３】
図２０は、４階調を割り当ててモノクロ画像の符号化を行う場合の、量子化区間および復号値の設定の概要を示したものである。図２０の太線は、符号化対象となるブロックの画素値のヒストグラムを示しており、横軸が画素値、縦軸が頻度である。
該ヒストグラムを元に、ブロック内の画素値の平均値的な値LA、ブロック内の画素値のダイナミックレンジを反映した値LDが求められ、これらの２つの量をもとに、画素値を量子化する４つの区間が設定される。原画像の各画素がどの区間に含まれるかを示すフラグをφと呼び、符号化時には、LA、LD、各画素のφ、の３つの量が符号化される。なお、LDは階調幅指標と呼ばれ、φは解像度成分と呼ばれる。
【０００４】
さて今、上記モノクロ画像（１画素8bitの階調とする）について、２×４画素（計64bit ）を１ブロックとして、1/2 に圧縮して符号化することを考える。各ブロックに４階調を割り当てるためには、１画素当たりφとして2bitが必要になるため、φだけで計16bit が必要になり、1/2 （＝ブロック当たり32bit ）に圧縮するためには、残る16bit をLA、LDに割り当てることになる。
ここで、LA、LDおよびφの値は、一般的なブロックトランケーション符号化においては、以下の様に求められる。
図２０のようにブロック内の画素値の最大値をmax 、最小値をmin とし、区間(max−min)を４等分する。そして、該4 区間の内の両端の区間について、各区間内の画素値の平均値Q1、Q4を求める。そして、
LA＝（Q1＋Q4）/2
LD＝ Q4 −Q1
によって、LA、LDが求められ、これを元に、量子化区間の境界L1、L2が
L1＝ LA −LD/4
L2＝ LA+LD/4
の様に求められる。ここで重要なのは、LDを４等分して、画素値の量子化区間を設定している点である。これは、ヒストグラムの中央値付近を細かく量子化し、端部を粗く量子化することに他ならず、中央値付近の出現頻度が相対的に大きいヒストグラムを想定して量子化区間の設定を行っていると考えられる。なぜならば、出現頻度の多い区間を密に量子化することで、ブロック全体としての量子化誤差が小さく抑えられ、復号後の画質を向上させられるからである。このように、一般的なヒストグラムを想定して、LDの区間設定を行うBTC を、GBTC（Generalized Block Truncation Coding)型符号化方式と呼び、特開平６−１６４９５０公報にも記載のある通りである。
ここで、x を画素値とした場合、φは
min ≦x ≦L1 の場合 φ=0
L1＜x ≦LA の場合 φ=1
LA＜x ≦L2 の場合 φ=2
L2＜x ≦max の場合 φ=3
と規定される。
一方、復号値ｄは、
φ＝0 の場合ｄ = d1 ＝ Q1 （＝ＬＡ―ＬＤ/2）
φ＝1 の場合ｄ = d2 ＝ LA − LD/6
φ＝2 の場合ｄ = d3 ＝ LA ＋ LD/6
φ＝3 の場合ｄ = d4 ＝ Q4 （＝ＬＡ＋ＬＤ/2）
と規定される。ここで重要なのは、LDを6 等分して、復号値d2、d3を規定している点である。これは、LAから見て、各量子化区間の中央よりも離れた位置に復号することに他ならないが、これは復号値のダイナミックレンジを広めにとることを意図しているように思われる。
なお上記では、４階調に量子化するために区間（max −min)を4 等分し、その後LDを求めているが、その他の階調数に量子化する場合は、K 。Ogura ；"Generalized Block Truncation Coding（GBTC)"、ISO/TC97/SC2/WG8 N510 （June 1987)を参照されたい（GBTC型符号化方式は、GBTCのサブセットである）。
【０００５】
以上のようにGBTC型符号化方式では、中央値付近の出現頻度が相対的に大きいヒストグラムを想定して量子化区間の設定を行っていると考えられるが、実際の画像はそうでない場合がある。たとえは、図２１に示したようなモノクロの文字画像の場合、その画素値のヒストグラムは図２２の通りであり、図２０とは反対に中央の頻度が少ないものとなる。
また上記GBTC型符号化方式では、ブロック内の全ての画素に一律に2bitのφを割り当てているが、原画像を小さな領域に分割して符号化する場合など、対象となる画像自体の階調がもともと限られている場合には、一律な割り当ては効率的ではない。すなわち、原画像の階調が広い場合にはφへの割り当てを増やし、狭い場合には、φへの割り当てを減らすような構成が必要である。（特願２０００−０９２３９３記載の“画像を予め定められた一定の大きさの矩形ブロックに分割し、該ブロックを更に分割して周波数変換の単位となる画像ブロックを作成し、該周波数変換結果を元に矩形ブロックを予め定められた一定長の符号長に符号化を行う固定長符号化装置。”は、こうした手法を意図したものである。）
【０００６】
そこで例えば、上記２×４画素のブロックを２×２画素のサブブロック２つに分割し、該２つのサブブロックを“階調が狭いサブブロック" と“階調が広いサブブロック" に分類し、前者に1 階調を割り当て、後者に４階調を割り当てることを考える。これにより、２×４画素のブロック単位で見れば、計５階調（あるいはそれ以上）を得ることが出来、復号画像の画質を向上させることができる。この場合、“階調が狭いサブブロック" に対しては画素の代表値のみを符号化し、“階調が広いサブブロック" に対してはBTC を適用するような構成が考えられる。
この場合にも、“階調が広いサブブロック" のみの画素値のヒストグラムは、もはやブロック全体のヒストグラムではなく、文字画像でない場合でも、上記GBTC型符号化方式が想定しているヒストグラムとは異なるものとなる。そもそも“階調が狭いサブブロック" とは、サブブロック内の画素値の差が少ないサブブロックであり、“階調が広いサブブロック" とは、サブブロック内の画素値として、大きいものと小さいものの両方を含むサブブロックである。よって、“階調が広いサブブロック" のみの画素値のヒストグラムは、図２３のようにフラットであったり、図２４に示すように、むしろ中央の頻度が少な目の、図２０とは上下が反転した形状になることがあるのである。
【０００７】
以上のような場合には、ヒストグラムに対する想定が成り立たないため、GBTC型符号化方式をそのまま適用することはできない。最適な画質を得るためには、新たなLDの分割法が必須であり、例えば、図１のように、LDの中央を粗に分割し、両端側を密に分割することが必要である。また、復号値の設定としては、解像度成分として設定された各区間の中央よりも、階調幅指標の両端側よりであるのが望ましい。
【０００８】
【発明が解決しようとする課題】
そこで、本発明は、通常の形状ではないヒストグラムを有しうる画像にブロックトランケーション符号化を適用する場合において、適切な区間設定を行って画質を向上させることを目的とする。
【００２６】
なお、" 通常の形状ではないヒストグラム" として、図２３のようなフラットな分布を想定した場合、区間(max−min)を４等分したものを、そのままφの区間とし、各区間の中央に復号することで、量子化誤差を最小にすることができる。これはLDを１：２：２：１に分割することに相当する。
【００２７】
よって、本発明は、上記の目的に加えて、フラットな分布の場合に、量子化誤差を最小にすることを目的とする。
【００３５】
【課題を解決するための手段】
以上に鑑み、請求項１に係る発明は、階調幅指標を３つ以上に分割し、階調幅指標の中央よりも両端側を密に分割して、解像度成分の区間設定を行うことを特徴とするブロックトランケーション符号化装置を提案する。
【００４８】
請求項２に係る発明は、請求項１に記載のブロックトランケーション符号化装置であって、前記階調幅指標を１：２：２：１に分割することを特徴とするブロックトランケーション符号化装置を提案する。
【００５０】
請求項３に係る発明は、階調幅指標を３つ以上に分割し、階調幅指標の中央よりも両端側を密に分割して、解像度成分の区間設定を行うステップを有することを特徴とするブロックトランケーション符号化方法を提案する。
【００５６】
【発明の実施の形態】
以下、本発明の実施の形態の好適な例を図面を参照しながら説明するが、まず、本願実施例の装置構成について説明を行う。
【００５７】
［装置構成１］
図２は、本願にかかる装置構成の第１の例を示したものである。データバスを介して、ＨＤＤ、ＲＡＭ、ＣＰＵが接続されており、以下の流れで、原画像の符号化処理がなされる。
▲１▼ ＨＤＤ上に記録されたオリジナル画像は、ＣＰＵからの命令によってＲＡＭ上に読み込まれる。
▲２▼ 〔符号化ステップ〕ＣＰＵはＲＡＭ上の画像を部分的に読み込み、本願の符号化方法を適用して符号化を行う。
▲３▼ ＣＰＵは、符号化後のデータをＲＡＭ上の別の領域に書き込む。
▲４▼ 全ての原画像が符号化されると、ＣＰＵからの命令によって、符号化後のデータがＨＤＤ上に記録される。
また、同一の装置構成において、以下の流れで、符号化された画像の復号化処理がなされる。
▲１▼ ＨＤＤ上に記録された符号化された画像は、ＣＰＵからの命令によってＲＡＭ上に読み込まれる。
▲２▼ 〔復号化ステップ〕ＣＰＵはＲＡＭ上の符号化された画像を部分的に読み込み、本願を適用して復号化を行う。
▲３▼ ＣＰＵは、復号化後のデータをＲＡＭ上の別の領域に書き込む。
▲４▼ 全ての画像が復号化されると、ＣＰＵからの命令によって、復号化後のデータがＨＤＤ上に記録される。
【００５８】
［装置構成２］
また図３は、本願にかかる装置構成の第２の例を示したものである。データバスを介して、ＨＤＤ、ＲＡＭ１（ＰＣ内) 、ＣＰＵ１（ＰＣ内) 、プリンタが接続されている。オリジナル画像のプリントアウトに際し、画像の符号化がなされ、符号化後のデータがプリンタに送信される。プリンタへの送信データ量が低減されるため、送信時間が短縮され、符号化・復号化に要する時間を加味しても、高速なプリントが可能になる。
▲１▼ ＨＤＤ上に記録されたオリジナル画像は、ＣＰＵからの命令によってＲＡＭ上に読み込まれる。
▲２▼ 〔符号化ステップ〕ＣＰＵ１は、ＲＡＭ１上の画像を部分的に読み込み、本願の符号化方法を適用して符号化を行う。
▲３▼ ＣＰＵ１は、符号化後のデータをＲＡＭ１上の別の領域に書き込む。
▲４▼ ＣＰＵ１からの命令によって、符号化後のデータがプリンタ内のＲＡＭ２上に記録される。
▲５▼ 〔復号化ステップ〕プリンタ内のＣＰＵ２は、符号化後のデータを読み込み、本願の復号化処理を適用して画像の復号を行う。
▲６▼ ＣＰＵ２は、復号化後のデータをＲＡＭ２上に書き込む。
プリンタは、全てのデータが復号化された後、該復号化後のデータを所定の手順でプリントアウトする
【００５９】
以上のような装置構成のもと、本願実施例１においては、CMYKの４コンポーネントからなるカラーの原画像（１画素32bit ）を２×２画素単位のブロックに分割し、４階調に量子化してブロックトランケーション符号化を行う（図４）。本実施例は、図５にその符号構成を示している（φは４画素分のbit を連結するため、8bitとなる）。
ここで図７は、本願実施例に共通な、ブロック内で最大値・最小値を探索する処理、図８はLA、LD算出処理、図９はφ算出処理を示したものである。本願においては、図９の様に、ＬＤを６で割ることにより、その両端を密に量子化する。
さて、図１０は本願実施例１の流れを示したものであり、まずＣコンポーネント（Ｃプレーン）の画素値Ｐ（ｉ）が入力された後、ブロック内の最大値・最小値が求められる。そして、最大値―最小値（これは画素値の分布幅を直接反映する量である）が所定値Th以上のときは、フラグ＝１としてBTC を適用し、そうでない場合にはフラグ＝０としてブロック内の画素値の平均値を符号化し、これを全てのコンポーネントおよびブロックについて繰り返す。
本例においては、各コンポーネント（プレーン）の画素値は８bit であり、図５のように、ＬＡ、ＬＤ、平均値全てに８bit を割り当てているため、LA、ＬＤ、平均値は量子化せずにそのまま符号とすることができる。本例においては、各ブロックごとに、BTC を適用するか平均値を符号化するかを判断し、これを全プレーンについて行うため、符号の並びは、例えば図６のようになる。
また、これらの符号を復号する場合には、まずフラグを読み、ＢＴＣで符号化されている場合には、LA、LD、φを読み込み、復号値ｄを、
φ＝0 の場合ｄ = d1 ＝ Q1 （＝ＬＡ−ＬＤ/2）
φ＝1 の場合ｄ = d2 ＝ LA − LD/6
φ＝2 の場合ｄ = d3 ＝ LA ＋ LD/6
φ＝3 の場合ｄ = d4 ＝ Q4 （＝ＬＡ＋ＬＤ/2）
によって得る。これは、先述のように、通常復号値を量子化区間の両端よりに選ぶことを意味するが、特にヒストグラムがフラットな場合には、区間の中央に復号することを意味する。また、ＢＴＣで符号化されていない場合は、符号がそのまま４画素分の復号値となる（なお、本明細書内ではフローチャートは省略する）。
【００６０】
また、図１４は本願実施例２の流れを示したものである。本例では、CMY の３コンポーネントからなるカラーの原画像（１画素24bit ）を２×２画素単位のブロックに分割し、４階調に量子化してブロックトランケーション符号化を行う。まずＭコンポーネントの画素値Ｐ（ｉ）が入力された後、ブロック内の最大値・最小値が求められ、最大値―最小値（これは画素値の分布幅を直接反映する量である）が所定値Th以上のときは、フラグ＝１としてBTC を適用し、そうでない場合にはフラグ＝０としてブロック内の画素値の平均値を符号化する。つぎに、その他のコンポーネントに関しては、Ｍコンポーネントの結果をそのまま反映して、BTC とするか否かを決定する。このため符号は、図１１の様になる。
【００６１】
また、図１５は本願実施例３の流れを示したものである。本例では、RGB の３コンポーネントからなるカラーの原画像（１画素24bit ）を２×２画素単位のブロックに分割し、４階調に量子化してブロックトランケーション符号化を行う。まず、ブロック内の全ての画素値は下記式によって、輝度Ｙと色差Ｕ、Ｖに変換される。これは、RCT （Reversible Component Transform）と呼ばれる変換である。
輝度Ｙ＝｜＿（Ｒ＋２Ｇ＋Ｂ）／４＿｜（記号｜＿＿｜はフロア関数を示す）
色差Ｕ＝Ｒ−Ｇ・・・・式１
色差Ｖ＝Ｂ−Ｇ
逆変換は、
Ｒ＝Ｇ＋Ｕ
Ｇ＝Ｙ−｜＿（Ｕ＋Ｖ）／４＿｜・・・・式２
Ｂ＝Ｖ＋Ｇ
そして、ブロック内の４画素分の輝度（ａ、ｂ、ｃ、ｄとする）について、図１３に示したＳ変換が施され、４つの係数ＬＬ、ＨＬ、ＬＨ、ＨＨが生成される。Ｓ変換は簡易なウェーブレットフィルタであり、ＨＬ、ＬＨ、ＨＨ係数は、ハイパスフィルタ出力に相当する。そして、これらのいずれかの値が所定値Ｔｈ（ＨＨは２Ｔ）以上である場合、ＹＵＶの全てに対してＢＴＣが適用され、そうでない場合には平均値が符号化される。よって符号は、図１２の様になる。
なお、これらの符号を復号する場合には、まずフラグを読み、ＢＴＣで符号化されている場合には、LA、LD、φを読み込み、復号値ｄを、
φ＝0 の場合ｄ = d1 ＝ Q1 （＝ＬＡ−ＬＤ/2）
φ＝1 の場合ｄ = d2 ＝ LA − LD/6
φ＝2 の場合ｄ = d3 ＝ LA ＋ LD/6
φ＝3 の場合ｄ = d4 ＝ Q4 （＝ＬＡ＋ＬＤ/2）
によって得る。また、ＢＴＣで符号化されていない場合は、符号がそのまま４画素分の復号値となる。その後、上式２によってYUV をＲＧＢに戻すことによって、画素値を得ることができる（本明細書内ではフローチャートは省略する）。
【００６２】
また本願実施例４において、図１６はＲＧＢの３コンポーネントからなるカラーの原画像（１画素24bit ）を２×４画素単位のブロックに分割し、さらにこれを２×２画素のサブブロックＡ、Ｂに再分割し、符号化する場合の様子を示している。本実施例は、図１７にその符号構成を示している。
図１９はその流れを示したものであり、各サブブロックのＲＧＢ値は、それぞれＹＵＶ変換され、その後輝度または色差がＳ変換される。そしてＳ変換係数と所定値との関係で、ＢＴＣの適否が、
サブブロックＡ、Ｂともに平均値で符号化。フラグ＝０
サブブロックＡは平均値、ＢはＢＴＣ。フラグ＝１でφは8bit
サブブロックＡはＢＴＣ、Ｂは平均値。フラグ＝２でφは8bit
サブブロックＡ、Ｂをまとめて、８画素分でＢＴＣ。フラグ＝３でφは16bit
（ＬＡ、ＬＤは２で割って７bit に量子化）
のように決定され、これを輝度・色差の全て、および全てのブロックについて繰り返す。よって符号の例は、図１８の様になる。実施例１のように、各サブブロックに個別にＢＴＣを適用するよりも、本例のように２つのサブブロックにまとめてＢＴＣを適用した方が、LA、LDが一組で済む分、圧縮率を上げることができる。
また同様に、復号の場合には、まずフラグを読み、ＢＴＣで符号化されている場合には、LA、LD、φを読み込み、復号値ｄを、
φ＝0 の場合ｄ = d1 ＝ Q1 （＝ＬＡ−ＬＤ/2）
φ＝1 の場合ｄ = d2 ＝ LA − LD/6
φ＝2 の場合ｄ = d3 ＝ LA ＋ LD/6
φ＝3 の場合ｄ = d4 ＝ Q4 （＝ＬＡ＋ＬＤ/2）
によって得る。また、平均値としても符号化されている場合は、符号がそのまま４画素分の復号値となる。その後、上式２によってYUV をＲＧＢに戻すことによって、画素値を得ることができる（本明細書内ではフローチャートは省略する）。
【００６４】
【発明の効果】
請求項１に記載のブロックトランケーション符号化装置は、階調幅指標を３つ以上に分割し、階調幅指標の中央よりも両端側を密に分割して、解像度成分の区間設定を行うので、通常の形状ではないヒストグラムを有しうる画像にブロックトランケーション符号化を適用する場合において、適切な区間設定を行って画質を向上させることが可能となる。
【００７７】
請求項２に記載のブロックトランケーション符号化装置は、請求項１に記載のブロックトランケーション符号化装置であって、前記階調幅指標を１：２：２：１に分割するので、フラットな分布の場合に、量子化誤差を最小にすることができる。
【００７９】
請求項３に記載のブロックトランケーション符号化方法は、階調幅指標を３つ以上に分割し、階調幅指標の中央よりも両端側を密に分割して、解像度成分の区間設定を行うステップを有するので、通常の形状ではないヒストグラムを有しうる画像にブロックトランケーション符号化を適用する場合において、適切な区間設定を行って画質を向上させることができる。
【図面の簡単な説明】
【図１】本願の区間設定法である。
【図２】本願の装置構成例１の図である。
【図３】本願の装置構成例２の図である。
【図４】本願実施例１のCMYKの４コンポーネントからなるカラーの原画像（１画素32bit ）を２×２画素単位のブロックに分割する図である。
【図５】本願実施例１の符号構成図である。
【図６】本願実施例１の符号の並びの一例である。
【図７】本願実施例の最大・最小値探索処理のフローチャートである。
【図８】本願実施例のLA、LD算出処理のフローチャートである。
【図９】本願実施例のφ算出処理のフローチャートである。
【図１０】本願実施例１のフローチャートである。
【図１１】本願実施例２の符号構成図である。
【図１２】本願実施例３の符号構成図である。
【図１３】Ｓ変換および逆Ｓ変換の変換式である。
【図１４】本願実施例２のフローチャートである。
【図１５】本願実施例３のフローチャートである。
【図１６】本願実施例４のＲＧＢの３コンポーネントからなるカラーの原画像（１画素24bit ）を２×４画素単位のブロックに分割する図である。
【図１７】本願実施例４の符号構成図である。
【図１８】本願実施例４の符号の並びの一例である。
【図１９】本願実施例４のフローチャートである。
【図２０】従来技術のＧＢＴＣにおける量子各区間および復号値の設定法の概要図である。
【図２１】従来技術のＧＢＴＣ型符号化方式によるモノクロ文字画像の一例である。
【図２２】図２１の画像に対する画素値のヒストグラムである。
【図２３】ＧＢＴＣ型符号化方式での“階調が広いサブブロック”のみの画素値のヒストグラムの一例である。
【図２４】ＧＢＴＣ型符号化方式での“階調が広いサブブロック”のみの画素値のヒストグラムの一例である。
[0001]
[Technical Field of the Invention]
The present invention relates to a block truncation coding apparatus and method, and more particularly to improving image quality by setting appropriate intervals for the resolution component.
[0002]
[Prior art]
A method called block truncation coding (BTC: Block Truncation Coding) has long been known as a still-image coding method. In this method, the original image is divided into blocks of m × n pixels (although division is not essential), a predetermined number of gradations is assigned to each block, and the pixel values in the block are represented with that limited number of gradations (that is, the number of gradations is quantized).
[0003]
FIG. 20 shows an outline of setting a quantization interval and a decoded value when a monochrome image is encoded by assigning four gradations. A thick line in FIG. 20 shows a histogram of pixel values of a block to be encoded, with the horizontal axis representing pixel values and the vertical axis representing frequency.
Based on the histogram, a value LA representing the average of the pixel values in the block and a value LD reflecting the dynamic range of the pixel values in the block are obtained, and four intervals for quantizing the pixel values are set from these two quantities. The flag indicating which interval each pixel of the original image falls into is called φ, and three quantities are encoded: LA, LD, and the φ of each pixel. LD is called the gradation width index, and φ is called the resolution component.
[0004]
Now consider encoding the above monochrome image (8 bits of gradation per pixel) with 2 × 4 pixels (64 bits in total) as one block, compressing it to 1/2. To assign four gradations to each block, 2 bits of φ are required per pixel, so φ alone takes 16 bits in total; to compress to 1/2 (= 32 bits per block), the remaining 16 bits are assigned to LA and LD.
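The bit budget in this paragraph can be checked with a few lines of arithmetic. This is only an illustrative sketch; the constant names are made up for the example and do not come from the specification.

```python
# Bit budget for one 2x4 block of an 8-bit monochrome image at 1/2 compression.
# All names below are illustrative, not taken from the specification.
PIXELS_PER_BLOCK = 2 * 4
BITS_PER_PIXEL = 8
PHI_BITS_PER_PIXEL = 2  # four gradations need 2 bits of phi per pixel

raw_bits = PIXELS_PER_BLOCK * BITS_PER_PIXEL      # uncompressed block size
target_bits = raw_bits // 2                       # 1/2 compression
phi_bits = PIXELS_PER_BLOCK * PHI_BITS_PER_PIXEL  # bits spent on phi alone
la_ld_bits = target_bits - phi_bits               # what remains for LA and LD

print(raw_bits, target_bits, phi_bits, la_ld_bits)
```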
Here, the values of LA, LD, and φ are obtained as follows in general block truncation coding.
As shown in FIG. 20, let max be the maximum and min the minimum of the pixel values in the block, and divide the range (max − min) into four equal parts. For the two end intervals of these four, the average values Q1 and Q4 of the pixel values within each interval are obtained. LA and LD are then given by
LA = (Q1 + Q4) / 2
LD = Q4 − Q1
and, based on these, the boundaries L1 and L2 of the quantization intervals are obtained as
L1 = LA − LD / 4
L2 = LA + LD / 4
The important point here is that the quantization intervals for the pixel values are set by dividing LD into four equal parts. This amounts to quantizing finely near the median of the histogram and coarsely at its ends, so the quantization intervals appear to be set on the assumption of a histogram whose frequency is relatively high near the median. The reason is that quantizing a high-frequency region finely keeps the quantization error of the block as a whole small and improves the image quality after decoding. A BTC that sets the intervals of LD on the assumption of such a typical histogram is called a GBTC (Generalized Block Truncation Coding) type coding method, and is also described in JP-A-6-164950.
Here, with x denoting a pixel value, φ is defined as
φ = 0 when min ≤ x ≤ L1
φ = 1 when L1 < x ≤ LA
φ = 2 when LA < x ≤ L2
φ = 3 when L2 < x ≤ max
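The prior-art computation of LA, LD, and φ described above can be sketched as follows. This is a minimal illustration, not the referenced implementation: the function name `gbtc_encode` is hypothetical, and the handling of pixels lying exactly on a quarter boundary when forming Q1 and Q4 is an assumption.

```python
def gbtc_encode(block):
    """Return (LA, LD, phi) for a flat list of pixel values (prior-art GBTC type)."""
    lo, hi = min(block), max(block)
    width = (hi - lo) / 4.0  # (max - min) divided into four equal parts
    # Pixels in the lowest and the highest quarter (boundary handling is assumed).
    low_end = [x for x in block if x <= lo + width]
    high_end = [x for x in block if x >= hi - width]
    q1 = sum(low_end) / len(low_end)
    q4 = sum(high_end) / len(high_end)
    la = (q1 + q4) / 2
    ld = q4 - q1
    l1 = la - ld / 4
    l2 = la + ld / 4
    phi = []
    for x in block:
        if x <= l1:
            phi.append(0)
        elif x <= la:
            phi.append(1)
        elif x <= l2:
            phi.append(2)
        else:
            phi.append(3)
    return la, ld, phi


# Example 2x4 block given as a flat list.
la, ld, phi = gbtc_encode([0, 10, 20, 30, 40, 50, 60, 80])
print(la, ld, phi)
```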
The decoded value d, on the other hand, is defined as
d = d1 = Q1 (= LA − LD / 2) when φ = 0
d = d2 = LA − LD / 6 when φ = 1
d = d3 = LA + LD / 6 when φ = 2
d = d4 = Q4 (= LA + LD / 2) when φ = 3
The important point here is that LD is divided into six equal parts to define the decoded values d2 and d3. This amounts to decoding to a position farther from LA than the center of each quantization interval, which appears intended to give the decoded values a somewhat wider dynamic range.
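The decoding rule is a four-entry lookup driven by φ. The sketch below (with a hypothetical function name `btc_decode`) also makes the remark about d2 and d3 concrete: with L1 = LA − LD/4, the center of the interval (L1, LA] is LA − LD/8, whereas d2 = LA − LD/6 lies farther from LA.

```python
def btc_decode(la, ld, phi):
    """Map each phi value (0..3) to its decoded level."""
    levels = (la - ld / 2, la - ld / 6, la + ld / 6, la + ld / 2)
    return [levels[p] for p in phi]


la, ld = 40, 60
decoded = btc_decode(la, ld, [0, 1, 2, 3])
center_of_second_interval = la - ld / 8  # midpoint of (L1, LA] in the prior art
print(decoded, center_of_second_interval)
```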
In the above, the range (max − min) is divided into four equal parts and LD is then obtained in order to quantize to four gradations. For quantization to other numbers of gradations, see K. Ogura, "Generalized Block Truncation Coding (GBTC)", ISO/TC97/SC2/WG8 N510 (June 1987). (The GBTC type coding method is a subset of GBTC.)
[0005]
As described above, the GBTC type coding method appears to set its quantization intervals on the assumption of a histogram whose frequency is relatively high near the median, but actual images do not always behave this way. For example, for a monochrome character image such as the one shown in FIG. 21, the histogram of the pixel values is as shown in FIG. 22 and, contrary to FIG. 20, has a low frequency at the center.
In addition, the GBTC type coding method uniformly assigns 2 bits of φ to every pixel in the block. When the gradation of the target image itself is inherently limited, as when the original image is divided into small areas for encoding, such uniform allocation is not efficient. What is needed is a configuration that increases the allocation to φ when the gradation range of the original image is wide and decreases it when the range is narrow. (The "fixed-length encoding device that divides an image into rectangular blocks of a predetermined fixed size, further divides each block to create image blocks serving as the unit of frequency transform, and encodes each rectangular block to a predetermined fixed code length based on the result of the frequency transform" described in Japanese Patent Application No. 2000-092393 is intended for such a technique.)
[0006]
Consider, for example, dividing the above 2 × 4 pixel block into two 2 × 2 pixel sub-blocks, classifying the two sub-blocks into a "sub-block with narrow gradation" and a "sub-block with wide gradation", and assigning one gradation to the former and four gradations to the latter. Viewed in units of the 2 × 4 pixel block, a total of five gradations (or more) is then obtained, and the image quality of the decoded image can be improved. In this case, a conceivable configuration encodes only a representative pixel value for the "sub-block with narrow gradation" and applies BTC to the "sub-block with wide gradation".
In this case too, the histogram of the pixel values of the "sub-block with wide gradation" alone is no longer the histogram of the whole block, and even for a non-character image it differs from the histogram assumed by the GBTC type coding method. A "sub-block with narrow gradation" is, after all, a sub-block in which the pixel values differ little, and a "sub-block with wide gradation" is a sub-block that contains both large and small pixel values. The histogram of the pixel values of the "sub-block with wide gradation" alone may therefore be flat as in FIG. 23 or, as shown in FIG. 24, may have a somewhat lower frequency at the center, giving a shape that is vertically inverted relative to FIG. 20.
[0007]
In such cases the assumption about the histogram does not hold, so the GBTC type coding method cannot be applied as it is. To obtain the best image quality, a new way of dividing LD is essential; for example, as shown in FIG. 1, LD needs to be divided coarsely at its center and finely toward both ends. As for the decoded values, it is desirable that each be placed closer to the ends of the gradation width index than the center of the corresponding interval set for the resolution component.
[0008]
[Problems to be solved by the invention]
An object of the present invention is therefore to improve image quality by setting appropriate intervals when block truncation coding is applied to an image whose histogram may not have the usual shape.
[0026]
When a flat distribution such as that in FIG. 23 is assumed as the "histogram that does not have the usual shape", the quantization error can be minimized by using the four equal parts of the range (max − min) directly as the intervals of φ and decoding to the center of each interval. This corresponds to dividing LD in the ratio 1:2:2:1.
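The 1:2:2:1 ratio follows from a short calculation. The derivation below is a sketch that assumes an ideal continuous uniform distribution on [min, max], with R introduced here as shorthand for max − min; it also shows that the interval centers coincide with the decoded values LA ± LD/2 and LA ± LD/6.

```latex
\begin{aligned}
R &= \max - \min, \\
Q_1 &= \min + \tfrac{R}{8}, \qquad Q_4 = \max - \tfrac{R}{8}, \\
LD &= Q_4 - Q_1 = \tfrac{3R}{4}, \qquad LA = \tfrac{Q_1 + Q_4}{2} = \min + \tfrac{R}{2}, \\
\text{boundaries:}\quad & \min + \tfrac{R}{4} = LA - \tfrac{LD}{3}, \qquad \min + \tfrac{R}{2} = LA, \qquad \min + \tfrac{3R}{4} = LA + \tfrac{LD}{3}, \\
\text{segments of } LD:\quad & \tfrac{R}{8} : \tfrac{2R}{8} : \tfrac{2R}{8} : \tfrac{R}{8} = 1 : 2 : 2 : 1, \\
\text{interval centers:}\quad & LA - \tfrac{LD}{2}, \quad LA - \tfrac{LD}{6}, \quad LA + \tfrac{LD}{6}, \quad LA + \tfrac{LD}{2}.
\end{aligned}
```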
[0027]
Therefore, in addition to the above object, the present invention aims to minimize the quantization error in the case of a flat distribution.
[0035]
[Means for Solving the Problems]
In view of the above, the invention according to claim 1 proposes a block truncation coding apparatus characterized in that the gradation width index is divided into three or more parts, with the end sides divided more finely than the center of the gradation width index, to set the intervals of the resolution component.
[0048]
The invention according to claim 2 proposes the block truncation coding apparatus according to claim 1, characterized in that the gradation width index is divided in the ratio 1:2:2:1.
[0050]
The invention according to claim 3 proposes a block truncation coding method characterized by having a step of dividing the gradation width index into three or more parts, with the end sides divided more finely than the center of the gradation width index, to set the intervals of the resolution component.
[0056]
[Embodiments of the Invention]
Preferred examples of embodiments of the present invention are described below with reference to the drawings. The apparatus configurations of the embodiments are described first.
[0057]
[Device Configuration 1]
FIG. 2 shows a first example of a device configuration according to the present application. An HDD, RAM, and CPU are connected via a data bus, and an original image is encoded in the following flow.
(1) The original image recorded on the HDD is read onto the RAM by a command from the CPU.
(2) [Encoding step] The CPU partially reads an image on the RAM and applies the encoding method of the present application to perform encoding.
(3) The CPU writes the encoded data in another area on the RAM.
(4) When all the original images are encoded, the encoded data is recorded on the HDD in accordance with a command from the CPU.
In the same apparatus configuration, the encoded image is decoded according to the following flow.
(1) The encoded image recorded on the HDD is read onto the RAM in accordance with a command from the CPU.
(2) [Decoding step] The CPU partially reads the encoded image on the RAM and decodes it by applying the method of the present application.
(3) The CPU writes the decoded data in another area on the RAM.
(4) When all the images are decoded, the decoded data is recorded on the HDD in accordance with a command from the CPU.
[0058]
[Device configuration 2]
FIG. 3 shows a second example of the device configuration according to the present application. The HDD, RAM 1 (inside PC), CPU 1 (inside PC), and printer are connected via a data bus. When the original image is printed out, the image is encoded, and the encoded data is transmitted to the printer. Since the amount of data transmitted to the printer is reduced, the transmission time is shortened, and high-speed printing is possible even when the time required for encoding / decoding is taken into account.
(1) The original image recorded on the HDD is read onto the RAM by a command from the CPU.
(2) [Encoding step] The CPU 1 partially reads an image on the RAM 1 and performs encoding by applying the encoding method of the present application.
(3) The CPU 1 writes the encoded data in another area on the RAM 1.
(4) The encoded data is recorded on the RAM 2 in the printer in accordance with a command from the CPU 1.
(5) [Decoding step] The CPU 2 in the printer reads the encoded data and applies the decoding process of the present application to decode the image.
(6) The CPU 2 writes the decoded data on the RAM 2.
After all data has been decoded, the printer prints out the decoded data according to a predetermined procedure.
[0059]
With the above apparatus configurations, in Embodiment 1 of the present application a color original image (32 bits per pixel) consisting of the four CMYK components is divided into blocks of 2 × 2 pixels and quantized to four gradations by block truncation coding (FIG. 4). The code structure of this embodiment is shown in FIG. 5 (φ is 8 bits because the bits for four pixels are concatenated).
FIG. 7 shows the processing that searches for the maximum and minimum values within a block, FIG. 8 the LA and LD calculation processing, and FIG. 9 the φ calculation processing; these are common to the embodiments of the present application. In the present application, as shown in FIG. 9, LD is divided by 6 so that its two ends are quantized finely.
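FIG. 9 is not reproduced in this text, so the following is only a sketch of what a φ calculation based on LD/6 could look like. It assumes the 1:2:2:1 division stated elsewhere in the description, which places the interval boundaries at LA − LD/3, LA, and LA + LD/3; the function name `phi_1221` and the treatment of values exactly on a boundary (taken from the prior-art convention) are assumptions.

```python
def phi_1221(x, la, ld):
    """Classify pixel value x into one of four intervals of LD split 1:2:2:1."""
    step = ld / 6  # LD divided by 6
    # Segments of LD have widths 1, 2, 2, 1 steps, so the outer boundaries
    # sit two steps (LD/3) away from LA.
    if x <= la - 2 * step:
        return 0
    if x <= la:
        return 1
    if x <= la + 2 * step:
        return 2
    return 3


# With LA = 40 and LD = 60 the boundaries are 20, 40, and 60,
# compared with 25, 40, and 55 for the prior-art LD/4 rule.
print([phi_1221(x, 40, 60) for x in (10, 20, 21, 40, 41, 60, 61, 70)])
```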
FIG. 10 shows the flow of Embodiment 1. First, the pixel values P(i) of the C component (C plane) are input, and the maximum and minimum values in the block are obtained. When maximum − minimum (a quantity that directly reflects the spread of the pixel values) is equal to or greater than a predetermined value Th, the flag is set to 1 and BTC is applied; otherwise the flag is set to 0 and the average of the pixel values in the block is encoded. This is repeated for all components and all blocks.
In this example, the pixel value of each component (plane) is 8 bits and, as shown in FIG. 5, 8 bits are assigned to each of LA, LD, and the average value, so LA, LD, and the average value can be used as codes directly without quantization. Because the choice between applying BTC and encoding the average value is made for each block, and this is done for all planes, the codes are arranged as shown in FIG. 6, for example.
When these codes are decoded, the flag is read first. If the block was encoded with BTC, LA, LD, and φ are read and the decoded value d is obtained by
d = d1 = Q1 (= LA − LD / 2) when φ = 0
d = d2 = LA − LD / 6 when φ = 1
d = d3 = LA + LD / 6 when φ = 2
d = d4 = Q4 (= LA + LD / 2) when φ = 3
As noted earlier, this generally means choosing the decoded values toward the two ends rather than the centers of the quantization intervals, but in the particular case of a flat histogram it means decoding to the center of each interval. If the block was not encoded with BTC, the code is used as is as the decoded value for the four pixels (the flowchart is omitted in this specification).
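The encode and decode flow of Embodiment 1 for a single 2 × 2 block of one plane can be sketched as below. Several details are assumptions made for the example, since FIGS. 7 to 10 are not reproduced here: Q1 and Q4 are computed as in the prior-art description, LA and LD are truncated to integers, the average is an integer floor, and the code is modeled as a Python tuple rather than the bit layout of FIG. 5. The function names are hypothetical.

```python
def classify(x, la, ld):
    """phi for LD split 1:2:2:1 (boundaries at LA - LD/3, LA, LA + LD/3)."""
    step = ld / 6
    if x <= la - 2 * step:
        return 0
    if x <= la:
        return 1
    if x <= la + 2 * step:
        return 2
    return 3


def encode_block(pixels, th):
    """Return (0, mean) for a narrow block or (1, LA, LD, phi) for a wide one."""
    lo, hi = min(pixels), max(pixels)
    if hi - lo < th:  # spread below Th: encode the average only
        return (0, sum(pixels) // len(pixels))
    width = (hi - lo) / 4
    low_end = [p for p in pixels if p <= lo + width]
    high_end = [p for p in pixels if p >= hi - width]
    q1 = sum(low_end) / len(low_end)
    q4 = sum(high_end) / len(high_end)
    la = int((q1 + q4) / 2)  # integer truncation is an assumption
    ld = int(q4 - q1)
    return (1, la, ld, [classify(p, la, ld) for p in pixels])


def decode_block(code, n_pixels=4):
    """Read the flag first, then decode either the average or the BTC levels."""
    if code[0] == 0:
        return [code[1]] * n_pixels
    _, la, ld, phi = code
    levels = (la - ld / 2, la - ld / 6, la + ld / 6, la + ld / 2)
    return [levels[k] for k in phi]


wide = encode_block([10, 30, 50, 70], th=16)
narrow = encode_block([100, 101, 102, 103], th=16)
print(wide, decode_block(wide))
print(narrow, decode_block(narrow))
```

In this evenly spread example the decoded values coincide with the inputs, which is consistent with the statement that the rule decodes to interval centers when the histogram is flat.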
[0060]
FIG. 14 shows the flow of Embodiment 2. In this example, a color original image (24 bits per pixel) consisting of the three CMY components is divided into blocks of 2 × 2 pixels and quantized to four gradations by block truncation coding. First, the pixel values P(i) of the M component are input and the maximum and minimum values in the block are obtained. When maximum − minimum (a quantity that directly reflects the spread of the pixel values) is equal to or greater than a predetermined value Th, the flag is set to 1 and BTC is applied; otherwise the flag is set to 0 and the average of the pixel values in the block is encoded. For the other components, the result for the M component is reused as it is to decide whether to apply BTC. The code is therefore as shown in FIG. 11.
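The difference from Embodiment 1 is that only the M plane is examined and its decision is reused for the other planes. A minimal sketch of that decision follows; the function name `plan_block` and the dictionary representation are illustrative assumptions, and the actual code layout of FIG. 11 is not modeled.

```python
def plan_block(planes, th):
    """Decide BTC vs. average for every plane of one block from the M plane alone.

    planes maps a component name ("C", "M", "Y") to that block's pixel values.
    """
    m = planes["M"]
    flag = 1 if max(m) - min(m) >= th else 0
    mode = "BTC" if flag else "mean"
    # The M-plane result is applied unchanged to the other components.
    return flag, {name: mode for name in planes}


block = {"C": [0, 0, 0, 0], "M": [0, 100, 0, 100], "Y": [5, 5, 5, 5]}
print(plan_block(block, th=16))
```

One consequence visible in the example is that planes C and Y are coded with BTC even though they are flat, because the decision depends on M only.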
[0061]
FIG. 15 shows the flow of Embodiment 3. In this example, a color original image (24 bits per pixel) consisting of the three RGB components is divided into blocks of 2 × 2 pixels and quantized to four gradations by block truncation coding. First, all pixel values in the block are converted into luminance Y and color differences U and V by the following formulas. This conversion is called RCT (Reversible Component Transform).
Luminance Y = ⌊(R + 2G + B) / 4⌋ (the symbol ⌊ ⌋ denotes the floor function)
Color difference U = R − G ... (Formula 1)
Color difference V = B − G
The inverse transform is
R = G + U
G = Y − ⌊(U + V) / 4⌋ ... (Formula 2)
B = V + G
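Formulas 1 and 2 translate directly into integer arithmetic. The sketch below uses Python's `//` operator, which rounds toward negative infinity and therefore matches the floor function even when U + V is negative; the function names are illustrative.

```python
def rct_forward(r, g, b):
    """Formula 1: RGB -> (Y, U, V) using integer floor division."""
    y = (r + 2 * g + b) // 4
    return y, r - g, b - g


def rct_inverse(y, u, v):
    """Formula 2: (Y, U, V) -> RGB; G is recovered first, then R and B."""
    g = y - (u + v) // 4
    return g + u, g, v + g


# The transform is reversible for integer inputs.
print(rct_inverse(*rct_forward(255, 0, 255)))
```

Reversibility holds because R + 2G + B = 4G + (U + V), so Y = G + ⌊(U + V) / 4⌋.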
Next, the S-transform shown in FIG. 13 is applied to the luminance values of the four pixels in the block (denoted a, b, c, d), generating four coefficients LL, HL, LH, and HH. The S-transform is a simple wavelet filter, and the HL, LH, and HH coefficients correspond to high-pass filter outputs. If any of these values is equal to or greater than the predetermined value Th (2T for HH), BTC is applied to all of Y, U, and V; otherwise the average value is encoded. The code is therefore as shown in FIG. 12.
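FIG. 13 is not reproduced in this text, so the exact transform used cannot be confirmed here. The sketch below uses one common form of the S-transform (floor of the average plus the difference), applied first along rows and then along columns; the pixel layout (a b / c d), the coefficient naming, the use of absolute values in the comparison, and the separate `th_hh` parameter are all assumptions for illustration.

```python
def s_pair(p, q):
    """1-D S-transform of a pair: low-pass = floor average, high-pass = difference."""
    return (p + q) // 2, p - q


def s_pair_inverse(low, high):
    """Exact inverse of s_pair for integers."""
    p = low + (high + 1) // 2
    return p, p - high


def s_transform_2x2(a, b, c, d):
    """2-D S-transform of the block [[a, b], [c, d]]; returns (LL, HL, LH, HH)."""
    l0, h0 = s_pair(a, b)  # top row
    l1, h1 = s_pair(c, d)  # bottom row
    ll, lh = s_pair(l0, l1)
    hl, hh = s_pair(h0, h1)
    return ll, hl, lh, hh


def needs_btc(a, b, c, d, th, th_hh):
    """True if any high-pass coefficient reaches its threshold (assumed rule)."""
    _, hl, lh, hh = s_transform_2x2(a, b, c, d)
    return abs(hl) >= th or abs(lh) >= th or abs(hh) >= th_hh


print(s_transform_2x2(7, 7, 7, 7), needs_btc(7, 7, 7, 7, 8, 16))
print(s_transform_2x2(0, 100, 0, 100), needs_btc(0, 100, 0, 100, 8, 16))
```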
When these codes are decoded, the flag is read first. If the block was encoded with BTC, LA, LD, and φ are read and the decoded value d is obtained by
d = d1 = Q1 (= LA − LD / 2) when φ = 0
d = d2 = LA − LD / 6 when φ = 1
d = d3 = LA + LD / 6 when φ = 2
d = d4 = Q4 (= LA + LD / 2) when φ = 3
If the block was not encoded with BTC, the code is used as is as the decoded value for the four pixels. The pixel values are then obtained by converting YUV back to RGB with Formula 2 above (the flowchart is omitted in this specification).
[0062]
In Embodiment 4 of the present application, FIG. 16 shows how a color original image (24 bits per pixel) consisting of the three RGB components is divided into blocks of 2 × 4 pixels, each of which is further subdivided into 2 × 2 pixel sub-blocks A and B, and encoded. The code structure of this embodiment is shown in FIG. 17.
FIG. 19 shows the flow. The RGB values of each sub-block are converted to YUV, and the luminance or color difference is then S-transformed. Whether BTC is applied is determined from the relationship between the S-transform coefficients and the predetermined value, as follows:
Flag = 0: sub-blocks A and B are both encoded as average values.
Flag = 1: sub-block A is an average value and B is BTC; φ is 8 bits.
Flag = 2: sub-block A is BTC and B is an average value; φ is 8 bits.
Flag = 3: sub-blocks A and B are combined and BTC is applied to the 8 pixels; φ is 16 bits (LA and LD are divided by 2 and quantized to 7 bits).
This is repeated for luminance and color difference alike and for all blocks. An example of the resulting code is shown in FIG. 18. Compared with applying BTC to each sub-block individually as in Embodiment 1, applying BTC to the two sub-blocks together as in this example requires only one pair of LA and LD, so the compression ratio can be raised.
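The four cases can be summarized as a small lookup. In this sketch the inputs `a_is_wide` and `b_is_wide` stand for the outcome of the S-transform threshold test on each sub-block; those names, the entry of 0 φ bits for flag 0, and the multiply-by-two reconstruction of the 7-bit values are assumptions, since FIGS. 17 to 19 are not reproduced here.

```python
def subblock_flag(a_is_wide, b_is_wide):
    """Map the per-sub-block decisions to the 2-bit flag listed in the text."""
    if a_is_wide and b_is_wide:
        return 3  # A and B combined, BTC over 8 pixels
    if a_is_wide:
        return 2  # A is BTC, B is an average value
    if b_is_wide:
        return 1  # A is an average value, B is BTC
    return 0      # both are average values


# phi bits per flag: 2 bits per BTC-coded pixel (0 for flag 0 is inferred).
PHI_BITS = {0: 0, 1: 8, 2: 8, 3: 16}


def quantize_7bit(value):
    """For flag 3, LA and LD are divided by 2 to fit in 7 bits."""
    return value // 2


def dequantize_7bit(code):
    """Assumed reconstruction: multiply back by 2 (the lowest bit is lost)."""
    return code * 2


flag = subblock_flag(True, True)
print(flag, PHI_BITS[flag], quantize_7bit(255), dequantize_7bit(quantize_7bit(255)))
```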
Likewise, when decoding, the flag is read first. If the block was encoded with BTC, LA, LD, and φ are read and the decoded value d is obtained by
d = d1 = Q1 (= LA − LD / 2) when φ = 0
d = d2 = LA − LD / 6 when φ = 1
d = d3 = LA + LD / 6 when φ = 2
d = d4 = Q4 (= LA + LD / 2) when φ = 3
If the sub-block was encoded as an average value, the code is used as is as the decoded value for the four pixels. The pixel values are then obtained by converting YUV back to RGB with Formula 2 above (the flowchart is omitted in this specification).
[0064]
[Effects of the Invention]
The block truncation coding apparatus according to claim 1 divides the gradation width index into three or more parts, with the end sides divided more finely than the center of the gradation width index, to set the intervals of the resolution component. Therefore, when block truncation coding is applied to an image whose histogram may not have the usual shape, image quality can be improved by setting appropriate intervals.
[0077]
The block truncation coding apparatus according to claim 2 is the block truncation coding apparatus according to claim 1 in which the gradation width index is divided in the ratio 1:2:2:1, so the quantization error can be minimized in the case of a flat distribution.
[0079]
The block truncation coding method according to claim 3 has a step of dividing the gradation width index into three or more parts, with the end sides divided more finely than the center of the gradation width index, to set the intervals of the resolution component. Therefore, when block truncation coding is applied to an image whose histogram may not have the usual shape, image quality can be improved by setting appropriate intervals.
[Brief description of the drawings]
FIG. 1 shows the interval setting method of the present application.
FIG. 2 is a diagram of an apparatus configuration example 1 of the present application.
FIG. 3 is a diagram of an apparatus configuration example 2 of the present application.
FIG. 4 is a diagram in which a color original image (1 pixel 32 bits) composed of four CMYK components according to Embodiment 1 of the present application is divided into blocks of 2 × 2 pixel units.
FIG. 5 is a code configuration diagram of Embodiment 1 of the present application.
FIG. 6 is an example of an arrangement of codes according to the first embodiment of the present application.
FIG. 7 is a flowchart of maximum / minimum value search processing according to the embodiment of the present invention.
FIG. 8 is a flowchart of LA and LD calculation processing according to the embodiment of the present invention.
FIG. 9 is a flowchart of φ calculation processing according to the embodiment of the present application.
FIG. 10 is a flowchart of Embodiment 1 of the present application.
FIG. 11 is a code configuration diagram of Embodiment 2 of the present application.
FIG. 12 is a code configuration diagram of Embodiment 3 of the present application.
FIG. 13 is a conversion formula of S conversion and inverse S conversion.
FIG. 14 is a flowchart of Embodiment 2 of the present application.
FIG. 15 is a flowchart of Embodiment 3 of the present application.
FIG. 16 is a diagram in which a color original image (one pixel 24 bits) composed of three RGB components according to the fourth embodiment of the present application is divided into blocks of 2 × 4 pixel units.
FIG. 17 is a code configuration diagram of the fourth embodiment of the present application.
FIG. 18 is an example of an arrangement of codes according to the fourth embodiment of the present invention.
FIG. 19 is a flowchart of Embodiment 4 of the present application.
FIG. 20 is a schematic diagram of the method for setting the quantization intervals and decoded values in prior-art GBTC.
FIG. 21 is an example of a monochrome character image for the prior-art GBTC type coding method.
FIG. 22 is a histogram of the pixel values of the image of FIG. 21.
FIG. 23 is an example of a histogram of the pixel values of only the "sub-block with wide gradation" in the GBTC type coding method.
FIG. 24 is an example of a histogram of the pixel values of only the "sub-block with wide gradation" in the GBTC type coding method.

Claims

A block truncation coding apparatus characterized in that a gradation width index is divided into three or more parts, with the end sides divided more finely than the center of the gradation width index, to set the intervals of a resolution component.

The block truncation coding apparatus according to claim 1, characterized in that the gradation width index is divided in the ratio 1:2:2:1.

A block truncation coding method characterized by having a step of dividing a gradation width index into three or more parts, with the end sides divided more finely than the center of the gradation width index, to set the intervals of a resolution component.