JP3792067B2

JP3792067B2 - Visual progressive coding method for images

Info

Publication number: JP3792067B2
Application number: JP11402899A
Authority: JP
Inventors: Jin Li (リージン)
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1998-03-19
Filing date: 1999-03-16
Publication date: 2006-06-28
Anticipated expiration: 2019-03-16
Also published as: EP0944262A2; DE69943154D1; JP2000041249A; EP0944262B1; JP2005020768A; EP0944262A3

Description

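[0001]
[Technical field of the invention]
The present invention relates to a method of embedded coding for video images, and more particularly to a method that improves the image quality of embedded coding and gives embedded coding flexible visual control.
[0002]
[Prior art]
In embedded coding, the coded bitstream can be truncated and used over a wide range of bit rates. The viewing conditions, or appearance, at a high bit rate differ fundamentally from those at a low bit rate. Visual progressive coding (VPC) provides a mechanism and a method for adjusting the viewing conditions at every coding bit rate so that better subjective image quality is obtained over the whole bit-rate range. Visual weighting has proven to be an effective means of improving the subjective quality of an embedded image. It allocates more bits to coefficients in visually sensitive frequency bands and fewer bits to coefficients in visually insensitive bands, which emphasizes the features that the human eye perceives most readily and improves the subjective quality of the image. Conventionally, visual weighting is implemented in one of two ways: either the transform coefficients are multiplied or divided according to a model of the contrast sensitivity function (CSF) of the visual system, as in equation (1), and the weighted coefficients f̂_{i,j} are entropy coded; or the quantization step size is adjusted in inverse proportion to the CSF, as in equation (2).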
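[0003]
[Equation 1] (The equation image is missing from this text; the forms below are reconstructed from the definitions in paragraph [0004].)
$$\hat{f}_{i,j} = w_i \, f_{i,j} \qquad (1)$$
$$x_{i,j} = Q\!\left(\frac{f_{i,j}}{q_i}\right), \quad q_i \propto \frac{1}{w_i} \qquad (2)$$
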
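[0004]
Equations (1) and (2) are both known as fixed visual weighting schemes. Here f_{i,j} and f̂_{i,j} are the transform coefficient without and with visual weighting, respectively; x_{i,j} is the quantized coefficient; i indexes the frequency band and j the position within band i. q is the quantization step size associated with band i, adjusted in inverse proportion to the weight, and Q is the quantizer. w_i is a weighting factor that depends on the frequency content of coefficient x_i and on the viewing conditions; it can be derived from a model of the contrast sensitivity function (CSF) of the visual system and from the distance at which the image is viewed. Many embedding schemes use the implementation of equation (1) and perform no quantization at all. The visual weighting factor w_i is normally assumed to be fixed throughout the coding process, which is why such schemes are called fixed visual weighting. For schemes that include an explicit quantization step, such as JPEG, the implementation of equation (2) is simpler and widely used. Because fixed visual weighting is easy to implement, most current research on visually optimal coding concentrates on deriving the weighting factor w_i from the viewing distance, as described in the references cited here.
[0005]
In summary, coding can be carried out either (A) in two stages, transform and entropy coding, or (B) in three stages, transform, quantization, and entropy coding. Approach (A) is common in embedded coders. Fixed visual weighting has to be implemented separately for the two kinds of coding: approach (A) requires the implementation of equation (1), and approach (B) requires that of equation (2).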
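The two fixed visual weighting implementations can be sketched as follows. This is a minimal illustration, not the patent's code: the function names, the use of plain Python lists for bands, and the uniform rounding quantizer are all assumptions, and every weight is assumed to be positive.

```python
def weight_coefficients(bands, weights):
    """Implementation (1): scale every coefficient of band i by its weight w_i."""
    return [[w * f for f in band] for band, w in zip(bands, weights)]


def quantize_with_weighted_step(bands, weights, base_step):
    """Implementation (2): quantize band i with a step inversely proportional to w_i.

    Assumption: q_i = base_step / w_i and Q is plain rounding to the nearest integer.
    """
    return [[round(f / (base_step / w)) for f in band]
            for band, w in zip(bands, weights)]


bands = [[8.0, -4.0], [8.0, -4.0]]   # two bands holding the same coefficients
weights = [1.0, 0.5]                 # the second band is visually less sensitive
weighted = weight_coefficients(bands, weights)
quantized = quantize_with_weighted_step(bands, weights, base_step=2.0)
```

In both cases the less sensitive band ends up with smaller values, so fewer bits are spent on it; the weights are baked into the data that reaches the entropy coder, which is the property VPC later avoids.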
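[0006]
One recent achievement in image coding is embedded coding. An embedded coder such as embedded zerotree wavelet coding (EZW) (J. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients", IEEE Trans. on Signal Processing, vol. 41, pp. 3445-3462, 1993) generates a coded bitstream that can be truncated at a later processing stage and still be decoded into a visually recognizable image. Embedded coders are used mainly for Internet image browsing, image databases, digital cameras, and similar applications.
[0007]
Taking Internet image browsing as an example, with embedded coding only one version of the compressed image needs to be stored in the central database. The user first requests only a small part of the bitstream of each image so that many images can be browsed quickly at low fidelity. On finding an image of interest, the user requests the rest of the bitstream and obtains a faithful image at full resolution. EZW codes the image bit-plane by bit-plane; within each bit-plane it uses a zerotree structure to group insignificant coefficients and code them efficiently.
[0008]
Many other papers have been published and many patents granted in the field of embedded coding. One well-known paper is by D. Taubman and A. Zakhor ("Multirate 3-D subband coding of video", IEEE Trans. on Image Processing, vol. 3, no. 5, Sept. 1994, pp. 572-588), which describes an embedded coding method called layered zero coding (LZC). LZC codes the transform coefficients bit-plane by bit-plane with context-adaptive arithmetic coding. It achieves better rate-distortion performance than EZW, but the paper says nothing about the characteristics of human vision. Besides performing well, the bitstream produced by LZC can be organized to be progressive in either quality or resolution, which adds flexibility to the embedding process.
[0009]
Set partitioning in hierarchical trees (SPIHT) was proposed by A. Said and W. Pearlman ("A new, fast and efficient image codec based on set partitioning in hierarchical trees", IEEE Trans. on Circuits and Systems for Video Technology, vol. 6, no. 3, June 1996, pp. 243-250). SPIHT redefines the grouping of insignificant coefficients and achieves better results than EZW. In addition, one mode of SPIHT needs no entropy coder, which makes the encoder and decoder very simple. However, it takes no account of the characteristics of human vision.
[0010]
H. Wang and C. J. Kuo, in "A multi-threshold wavelet coder (MTWC) for high fidelity image" (IEEE International Conference on Image Processing, 1997), disclose a scheme that improves on LZC by coding first the wavelet coefficients with the largest threshold. This scheme, too, takes no account of the characteristics of human vision.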
[0011]
J. Li and S. Lei, in "An embedded still image coder with rate-distortion optimization" (SPIE, Visual Communication and Image Processing, vol. 3309, pp. 36-47, San Jose, Jan. 1998), disclose a scheme that optimizes the performance of an embedded coder by coding first the coding unit with the steepest rate-distortion slope, that is, with the largest distortion reduction per coding bit spent. The paper describes a rate-distortion optimized embedding (RDE) coder, which has a smooth rate-distortion curve and improves on the performance of SPIHT and LZC. However, this scheme does not consider the human visual system either.
[0012]
Jones, Daly, Gaborski, and Rabbani, in "Comparative study of wavelet and DCT decompositions with equivalent quantization and encoding strategies for medical images" (SPIE vol. 2341, Proceedings of the Conference on Medical Imaging, pp. 571-582, 1995), disclose a method of computing visual weights.
U.S. Patent No. 5,426,512 to A. Watson, "Image data compression having minimum perceptual error", discloses a method of adapting or customizing the DCT quantization matrix to the image being compressed.
U.S. Patent No. 5,629,780 to A. Watson, "Image data compression having minimum perceptual error", describes a method of adjusting the quantization matrix using visual masking by luminance and contrast techniques and by an error pooling technique.
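[0013]
U.S. Patent No. 4,780,761 to Daly et al., "Digital image compression and transmission system visually weighted transform coefficients", discloses a method of quantizing transform coefficients according to a two-dimensional model of the sensitivity of the human visual system. The model exploits the fact that the human visual system is less sensitive to diagonal spatial frequencies than to horizontal or vertical ones, and thereby increases the compression of the image. The model is used under a fixed viewing condition.
[0014]
U.S. Patent No. 5,144,688 to A. Bovior et al., "Method and apparatus for visual pattern image coding", describes a subband compression system. The image is divided into a number of subbands. A perceptual matrix is determined from the characteristics of the subband filters, the error distribution of the quantizers, and the properties of the human visual system, and is used to adjust the quantizer that codes each subband signal. This teaching, too, assumes a fixed viewing condition.
U.S. Patent No. 4,939,645 to Hopkins, "Method and apparatus to reduce transform compression visual artifacts in medical images", describes a digital image encoding and decoding method in which the digital image is partitioned into blocks that are coded individually according to the visually significant responses of the human eye. Encoding is achieved by computing and removing the average intensity value from the digital numbers in each block and detecting visually perceivable edge locations within the resulting residual image block. If a visually perceivable edge is present in a block, the gradient magnitude and orientation at opposite sides of the edge within each edge block are computed and suitably coded. If no perceivable edge is present, the block is coded as a uniform-intensity block. Decoding requires receiving the coded average intensity value, gradient magnitude, and pattern code, and combining these three values to reconstruct an arrangement resembling the original digital image. The viewing condition is fixed.
[0015]
U.S. Patent No. 5,321,776 to J. Shapiro, "Data compression system including successive approximation quantizer", describes a data compression system that facilitates data compression by successive refinement quantization and entropy coding. The compressed bitstream it generates can be truncated at any point and still yield a perceptible image. The bitstream provides progressively improving image quality; that is, it is ordered so as to minimize the mean squared error at each truncation point. This scheme, too, takes no account of the characteristics of human vision.
[0016]
Fixed visual weighting is easily built into an embedded coder by multiplying or dividing the transform coefficients according to a contrast sensitivity function (CSF) model of the visual system. In an embedded coder, however, the coded bitstream may be truncated at a later time, and the viewing conditions at different stages of embedding differ considerably. At a low bit rate the compressed image is of poor quality and fine image detail is not available. The image is usually viewed from a relatively large distance, and the observer is interested in its global features. As more bits are received, the image quality improves and the observer becomes interested in the details as well as the global features. The image is viewed from a closer distance, or is analyzed or enlarged for inspection, which likewise amounts to a shorter viewing distance. Different viewing conditions are thus required at different stages of embedding.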
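[0017]
(Summary of the invention)
A coding method that progressively improves the visual quality of an image includes the steps of: transforming the image into a set of transform coefficients; dividing the set of transform coefficients into a plurality of bands, each containing transform coefficients with the same visual characteristics; assigning a set of active weights, one to each band; generating coding units; identifying a set of candidate coding units; determining the significance of each candidate coding unit; coding the candidate coding units with the largest visual significance; and updating the active weights.
[0018]
[Problems to be solved by the invention]
An object of the present invention is to provide a coding method that lets a user view a low-resolution, low-quality image before requesting a high-quality image at full resolution.
Another object of the invention is to apply a coding method that progressively improves visual quality to rate-distortion optimized embedding.
A further object of the invention is to apply a coding method that progressively improves visual quality at the subband or DCT index level.
These and other objects and advantages will become apparent from the following description read with reference to the drawings.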
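[0019]
[Embodiments of the invention]
The invention described here is a visual weighting method named visual progressive coding (VPC). Unlike the prior art, VPC neither multiplies the transform coefficients by weights nor adjusts the quantization step size in inverse proportion to the weights. Instead, it uses the weights to change the order of embedding. Several sets of weights can be used in the VPC coding process: at any point a new set of weights may become active, and VPC reorders the remaining bitstream according to the new weights. The new weights do not affect the order of the part of the bitstream that has already been coded. VPC can be implemented on top of existing embedded coders and adds flexible visual adjustment to fully embedded coding.
[0020]
VPC improves the subjective quality of embedded coding. In embedded coding, the coded bitstream can be truncated in later processing and still be decoded into a perceptible image. The viewing conditions at a high bit rate are very different from those at a low bit rate. The VPC method of the invention provides a way to adjust the viewing conditions over the whole range of coding bit rates in order to obtain better subjective image quality.
[0021]
If an embedded coded image is viewed from one particular distance, visual weighting is easily built into the coder by multiplying the transform coefficients by the weights w_i. However, different stages of embedding call for different viewing conditions. Take a query to an image database with embedding capability as an example. Only one version of the compressed bitstream is stored in the central database. The user first requests only a very small part of the bitstream of each image in order to browse many images quickly at low resolution and low fidelity, for example at 1/16 of the screen per image. On finding an interesting image, the user views it at full screen resolution. If satisfied with the image, the user requests the complete, lossless compressed image for analysis and prints it. The viewing conditions change during this query process. As the received bit rate increases, the image is enlarged or viewed more closely. At a low bit rate the image is usually viewed from a relatively large distance; the quality of the compressed image is low and fine detail is not available anyway, so the user is interested in the global features. Quality improves as the received bit rate rises, and the user becomes interested in the details as well. As the image is viewed more closely, analyzed, or enlarged for inspection, the viewing distance decreases accordingly. Changing the weights with either implementation (1) or (2) is inconvenient, because every time the weights change the coefficients must be multiplied by the new weights or requantized. Moreover, in such an implementation the binary representation of the coefficients fed to the entropy coder changes each time the weights change, and the resulting change in statistics degrades the performance of the subsequent entropy coder.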
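[0022]
Visual weighting has proven to be an effective way to improve the subjective quality of a coded image. Flowchart 10 of the prior-art visual weighting scheme is described with reference to FIG. 1. Conventional visual weighting of image 12 is implemented in one of two ways: (1) the transform coefficients from block 14 are multiplied or divided according to a contrast sensitivity function (CSF) model of the visual system, as shown in visual weighting block 16; or (2) the visual weighting is incorporated into the quantization operation of block 18.
[0023]
The weighted coefficients are entropy coded in block 20. On decoding, the image is first entropy decoded in block 22, the coefficients are dequantized in block 24, inverse weighted in block 26, and inverse transformed in block 28 to give output image 30. Alternatively, the visual weighting can be incorporated into the dequantization by adjusting the quantization step size in inverse proportion to the weights.
[0024]
At a low bit rate, interest centers on the global features of the image, and the image is viewed from a relatively large distance. Fine detail is unavailable because the bit rate is insufficient. At a high bit rate, however, the image is examined in detail and viewed from a relatively short distance, and it may be further enlarged for analysis. Different stages of embedding therefore call for different visual weighting. No known embedded coder can adjust the visual weighting factors during the embedding process.
[0025]
A syntax and specific apparatus for an adjustable embedded coder are described below. The syntax allows the weighting factors to be adjusted during the embedding process. Such a coder is called a visual progressive coder (VPC), and the process is likewise called visual progressive coding (VPC). Several embodiments of VPC are described.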
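[0026]
Visual progressive coding (VPC) method
VPC allows the visual weighting factors to be adjusted flexibly during the embedding process. With this capability the coder can exploit the benefit of visual weighting at low bit rates, allocating more bits to the low-pass coefficients and improving the overall appearance of the image. At high bit rates VPC gradually turns the weighting off to accommodate more flexible viewing conditions and preserve high-frequency image detail. VPC improves the subjective quality of embedded coding. Rather than multiplying or dividing the coefficients by the visual weights, or adjusting the quantization step size by the visual weights, VPC adjusts the embedding order according to the visual weights. In other words, VPC uses visual weighting to control the order of coding, not the content of coding.
[0027]
Implementation of visual progressive coding (VPC)
In VPC the image is first transformed into a set of coefficients. The transform may be a DCT (discrete cosine transform), a wavelet transform, or even a wavelet packet transform. Without loss of generality, a band in VPC is defined as a group of transform coefficients with the same visual characteristics. For a wavelet or wavelet packet transform, a band is one wavelet or wavelet packet subband; for the DCT, a band comprises all coefficients with the same DCT basis. The transform coefficients are indexed as f_{i,j}, where i denotes the band and j the position within the band. The binary representation of transform coefficient f_{i,j} is:
$$\pm\, b_1 b_2 b_3 \ldots b_n \ldots b_L \qquad (3)$$
[0028]
Here b_1 is the most significant bit and b_L the least significant bit, and b_u(f_{i,j}) denotes the u-th most significant bit, or u-th coding layer, of coefficient f_{i,j}. FIG. 2 shows a sample bit array produced by the transform. Each row of the bit array represents a transform coefficient, and each column a coding layer. The most significant bits are in the leftmost column and the least significant bits in the rightmost column. Clearly, a more significant bit b_u(f_{i,j}) must always be coded before a less significant bit b_v(f_{i,j}) with u < v. Bit b_u(f_{i,j}) is designated a candidate bit if it is the most significant uncoded bit, that is, if all the more significant bits b_v(f_{i,j}), v = 1, ..., u-1, of the same coefficient have already been coded. At any given moment the coder must choose the next bit to code from the set of candidate bits. A coefficient is considered significant if any of its coded bits is nonzero, and insignificant if they are all zero. Candidate bits of insignificant coefficients are coded in the mode of significance identification, and candidate bits of significant coefficients in the mode of refinement. Significance identification and refinement are described later.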
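The bit array and the candidate-bit rule can be illustrated with a short sketch. It assumes integer coefficients stored in sign-magnitude form; the function names and the string labels for the two modes are hypothetical and chosen only for this illustration.

```python
def to_bits(value, num_layers):
    """Return (sign, [b_1, ..., b_L]) with b_1 the most significant bit."""
    magnitude = abs(value)
    bits = [(magnitude >> (num_layers - u)) & 1 for u in range(1, num_layers + 1)]
    return (-1 if value < 0 else 1), bits


def candidate_bit(bits, num_coded):
    """Return (layer u, coding mode) of the next codable bit of one coefficient.

    num_coded is how many of the most significant bits are already coded.
    Returns None once every bit of the coefficient has been coded.
    """
    if num_coded >= len(bits):
        return None
    # The coefficient is significant once any already-coded bit is nonzero.
    significant = any(bits[:num_coded])
    mode = "refinement" if significant else "significance identification"
    return num_coded + 1, mode


sign, bits = to_bits(-5, 4)   # |-5| = 0101 in four coding layers
```

For this coefficient the first two candidate bits are coded in significance identification mode (no nonzero bit has been coded yet), and the remaining two in refinement mode.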
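[0029]
A conventional coder and an embedded coder differ in the order in which they code the bit array. A conventional coder such as JPEG or MPEG first decides the quantization precision, or equivalently the number of bits to code for each coefficient, and then codes coefficient by coefficient. For the bit array above, conventional coding typically proceeds in the order shown at 32 in FIG. 2. In the example of FIG. 2, rows w_0 through w_7 contain bit-planes b_1 to b_7, and each row carries a + or - sign.
[0030]
Unlike conventional coding, embedded coding codes the image bit-plane by bit-plane, that is, column by column, as shown at 34 in FIG. 3. Because the most significant part of each coefficient is coded first, the embedded bitstream can be truncated partway and still give reasonable image quality. The quality of the decoded image improves gradually as more bits are received, which also makes the scheme suitable for progressive image transmission.
In VPC there are multiple sets of visual weights:
$$w^{(0)} = \{ w_0^{(0)}, w_1^{(0)}, \ldots, w_n^{(0)} \};$$
$$w^{(1)} = \{ w_0^{(1)}, w_1^{(1)}, \ldots, w_n^{(1)} \};$$
$$\cdots$$
$$w^{(m)} = \{ w_0^{(m)}, w_1^{(m)}, \ldots, w_n^{(m)} \}. \qquad (4)$$
In addition to this series of weights, a set of global weights wg, optionally applied immediately after the transform, is:
$$wg = \{ wg_0, wg_1, \ldots, wg_n \} \qquad (5)$$
The global weights are implemented as fixed visual weighting. At any given moment one set of weights, called the active weights w, is in effect:
$$w = \{ w_0, w_1, \ldots, w_n \} \qquad (6)$$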
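[0031]
Here w_i is the active weight for band i. The key concept of VPC is that the weights are used to control the order of embedding, rather than to weight the transform coefficients as in implementation (1) or to adjust the quantization in inverse proportion to the weights as in implementation (2). The smallest unit of reordering in VPC is called the coding unit (CU), indexed by k. What constitutes a coding unit depends on the particular embedding scheme on which VPC is implemented. A candidate coding unit is defined as a CU consisting only of candidate bits. Since only candidate CUs can be coded, the operation of VPC is to order the candidate CUs according to the active weights. When a new set of weights becomes active, VPC arranges a new coding order for the remaining CUs; the order of the CUs already coded is unaffected. This reordering-by-weight strategy allows a VPC coder to incorporate several sets of weights during the embedding process.
[0032]
FIG. 4 shows flowchart 40 of the overall operation of the VPC method of the invention. Input image 12 is received and transformed, and coding units (CUs) are generated (block 42). After the transform, the global weight set wg, if any, is applied by fixed visual weighting using implementation (1) or (2). The active weight set w is initialized, and the bits of the transform coefficients are grouped into coding units. VPC identifies the candidate CUs and determines the significance s_k of each (block 44). The significance s_k is a magnitude that governs the embedding order in the absence of visual weighting. The visual significance Vs_k of a CU is then obtained by multiplying its significance by its weight (block 46).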
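[0033]
[Equation 2] (The equation image is missing from this text; the form below is reconstructed from the description of block 46.)
$$Vs_k = w_i \, s_k \qquad (7)$$
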
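[0034]
Here w_i is the active weight of the band in which the CU resides. VPC codes the CU with the largest visual significance (block 48). When the coding of one CU is finished, new candidate CUs emerge. VPC evaluates the significance and visual significance of the newly emerged candidates and again codes the CU with the largest visual significance. It then decides whether the weights should be updated (block 50). If they should, the process goes on to the next step; if not, the steps from block 44 are repeated with the same weights. The active weights can be changed at any time (block 52); once new weights become active, they affect only the embedding order of the remaining CUs. A change of weights must be agreed between coder and decoder; several workable methods exist and are described later as the VPC syntax. The coding process is repeated until a termination criterion is met (block 54): for example, all CUs have been coded, that is, coding has become lossless; the final coding rate has been reached; or the coding distortion has reached a given threshold. The process then ends (block 56). If the termination criterion is not met, the process repeats from block 44.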
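The loop of FIG. 4 can be sketched as a greedy selection over candidate CUs. The sketch below is an illustration under several assumptions that are not in the patent: each band offers its CUs in a fixed order (so only the next uncoded CU of a band is a candidate), the significances are given as plain numbers, weight updates are driven by a hypothetical `weight_schedule` keyed by the number of CUs already coded, and the only termination criterion is that every CU has been coded.

```python
def vpc_order(band_cus, weight_schedule):
    """Return the coding order of CUs as (band, index) pairs.

    band_cus: {band: [s_1, s_2, ...]}, CU significances in the order in which
        the band must be coded.
    weight_schedule: {number_of_coded_CUs: {band: weight}}; key 0 holds the
        initial active weights.
    """
    weights = dict(weight_schedule[0])
    next_index = {band: 0 for band in band_cus}
    order = []
    while True:
        # Candidate CUs: the next uncoded CU of every band that still has one.
        candidates = [b for b in band_cus if next_index[b] < len(band_cus[b])]
        if not candidates:
            break
        # Visual significance = active weight * significance (equation (7)).
        best = max(candidates,
                   key=lambda b: weights[b] * band_cus[b][next_index[b]])
        order.append((best, next_index[best]))
        next_index[best] += 1
        # New weights only influence the CUs that are still uncoded.
        if len(order) in weight_schedule:
            weights = dict(weight_schedule[len(order)])
    return order


cus = {"low": [8, 4, 2], "high": [6, 3, 1]}
fixed = vpc_order(cus, {0: {"low": 1.0, "high": 0.25}})
progressive = vpc_order(cus, {0: {"low": 1.0, "high": 0.25},
                              2: {"low": 1.0, "high": 1.0}})
```

With fixed weights the high band is deferred until the low band is exhausted. When the weights are reset to 1 after two CUs, the high band moves ahead of the last low-band CU, while the two CUs already coded keep their positions.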
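[0035]
Improving visual quality at the level of individual bits: visual progressive rate-distortion optimized embedding (VPCRDE)
Rate-distortion optimized embedding (RDE) was developed by Li and Lei, as mentioned above. In RDE the coding unit (CU) is a single bit b_u(f_{i,j}) of one transform coefficient f_{i,j}. RDE codes the candidate bits in order of their expected rate-distortion slope (R-D slope), that is, in decreasing order of distortion reduction per coding bit.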
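[0036]
[Equation 3] (The equation image is missing from this text; the form below is reconstructed from the definition of the R-D slope in paragraph [0035], with ΔD the expected distortion decrease and ΔR the expected number of coding bits.)
$$\mathrm{slope}_{i,j} = \frac{E[\Delta D_{i,j}]}{E[\Delta R_{i,j}]} \qquad (8)$$
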
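[0037]
To simplify the computation, a lookup table was developed so that the R-D slope of each candidate bit is obtained with a single table lookup indexed by coding layer, significance state, and arithmetic coding context. To implement VPC for rate-distortion optimized embedding (RDE), the coding units (CUs), that is, the individual bits of the coefficients, are coded in decreasing order of visual significance. The significance of a CU is defined as the square root of its R-D slope:
$$s_{i,j} = \sqrt{\mathrm{slope}_{i,j}} \qquad (9)$$
[0038]
The square root is applied because the R-D slope measures a reduction in energy, whereas the significance of a coding unit (CU) is a measure of magnitude. Because the number of CUs is very large, no exhaustive search is made for the CU with the largest visual significance; a threshold-based approximation is used instead.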
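[0039]
A set of decreasing thresholds γ_0 > γ_1 > ... > γ_n > ... is defined. A typical threshold sequence decreases by a factor α at each iteration:
$$\gamma_n = \gamma_0 \, \alpha^{-n} \qquad (10)$$
VPCRDE scans the transform coefficients many times; in the n-th scan it codes all CUs whose visual significance exceeds γ_n. Because the active weight is the same throughout band i, the threshold for band i is inversely weighted instead of computing the visual significance of every coefficient and comparing it with the current threshold.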
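[0040]
[Equation 4] (The equation image is missing from this text; the form below is reconstructed from the inverse weighting of the threshold described in paragraph [0039].)
$$\gamma'_i = \frac{\gamma}{w_i} \qquad (11)$$
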
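[0041]
All candidate bits whose significance exceeds γ'_i are coded. The steps of VPCRDE are as follows.
Step 1: Transform the image.
Step 2: Fixed visual weighting. Apply the global weights wg, if any.
Step 3: Set the initial threshold γ = γ_0 and the active weights w.
Step 4: Scan and code.
The image is scanned from the lowest-resolution band to the highest-resolution band, in raster order within each band. For band i, the weighted threshold γ'_i is computed by equation (11). For each candidate bit, the R-D slope is determined by a lookup table operation indexed by coding layer, significance state, and arithmetic coding context, as described in the paper by Li and Lei. The R-D slope of the candidate bit is compared with the adjusted threshold γ'_i, and only bits whose R-D slope exceeds the adjusted threshold are coded.
Step 5: Update the active weights if necessary.
Step 6: Reduce the threshold. After the whole image has been scanned, the threshold is reduced by the factor α (γ ← γ/α). Return to step 4. Coding continues until a termination condition is met, for example until a final bit rate selected by the user, such as 2.0 bpp, is reached.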
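Steps 3 to 6 can be sketched as a multi-pass scan with a per-band threshold. This is a simplified illustration, not the VPCRDE implementation: the R-D slopes are supplied directly instead of coming from the lookup table, the significance of a bit is taken as the square root of its slope as in equation (9), the threshold is assumed to be γ'_i = γ / w_i, at most one bit per coefficient is coded in each pass, the active weights are positive and never updated (step 5 is omitted), and a fixed number of passes replaces the bit-rate termination test.

```python
import math


def vpcrde_passes(slopes, weights, gamma0, alpha, num_passes):
    """Return the coded bits as (band, coefficient index, layer) in coding order.

    slopes[band][j] lists hypothetical R-D slopes of coefficient j's bits,
    most significant bit first. Bands are visited in dictionary order, which
    stands in for the lowest-to-highest resolution scan.
    """
    coded = {band: [0] * len(coeffs) for band, coeffs in slopes.items()}
    order = []
    gamma = gamma0
    for _ in range(num_passes):
        for band, coeffs in slopes.items():
            band_threshold = gamma / weights[band]   # inverse-weighted threshold
            for j, bit_slopes in enumerate(coeffs):
                u = coded[band][j]                   # number of bits already coded
                if u < len(bit_slopes) and math.sqrt(bit_slopes[u]) > band_threshold:
                    order.append((band, j, u + 1))
                    coded[band][j] += 1
        gamma /= alpha                               # step 6: reduce the threshold
    return order


slopes = {"LL": [[64.0, 16.0]], "HH": [[64.0, 16.0]]}
order = vpcrde_passes(slopes, {"LL": 1.0, "HH": 0.5},
                      gamma0=6.0, alpha=2.0, num_passes=3)
```

Although both bands hold identical slopes, the lower weight of the HH band doubles its threshold, so each of its bits is coded one pass later than the corresponding LL bit.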
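[0042]
Visual progressive coding at the subband or DCT index level
For bit-plane schemes such as the layered zero coding (LZC) proposed by Taubman and Zakhor, compression with reversible embedded wavelets (CREW) proposed by Zandi et al., and the multi-threshold wavelet coder (MTWC) proposed by Wang and Kuo, the coding unit (CU) of VPC is a band bit-plane, which comprises all bits of the same band in the same coding layer; this organization of bits already exists in MTWC. Enlarging the CU makes the granularity of reordering coarser, but it makes implementation easier, and most of the coder remains unchanged. The implementation of VPC in this category is illustrated below by the implementation of VPC in JPEG 2000 VM2. Within one band bit-plane, the bits are further classified into sub-bit-planes, or three sub-modes: (1) the predicted significance mode, in which the current coefficient is insignificant but a neighboring coefficient is significant; (2) the refinement mode, in which the current coefficient is significant; and (3) the predicted insignificance mode, in which the current coefficient and all its neighbors are insignificant. Within a band the coder always proceeds from the most significant bit-plane to the least significant bit-plane, and within a bit-plane it always codes the predicted significance mode first, then the refinement mode, and finally the predicted insignificance mode. To implement VPC, a coding unit (CU) is defined as one sub-mode of one bit-plane, and the CUs are reordered according to the active weights. VPC-enabled JPEG 2000 VM2 is implemented as follows.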
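[0043]
Step 1: Transform the image.
Step 2: Quantize with a scalar quantizer or a trellis coded quantizer (TCQ), and apply fixed visual weighting with the global weights wg, if any.
Step 3: Set the initial active weights w.
Step 4: Compute the significance s_k of each candidate coding unit (CU):
$$s_k = \begin{cases} \sqrt{3} \cdot 2^{-n_k} & \text{predicted significance mode} \\ 1 \cdot 2^{-n_k} & \text{refinement mode} \\ \sqrt{0.96} \cdot 2^{-n_k} & \text{predicted insignificance mode} \end{cases} \qquad (12)$$
Here n_k is the current coding layer. The constants √3, 1, and √0.96 are chosen to approximate the R-D slopes of the different coding modes and to preserve the embedding order that applies when visual progression is not used.
Step 5: Compute the visual significance of each candidate CU according to equation (7).
Step 6: Code the candidate CU with the largest visual significance. Because the number of CUs is relatively small, JPEG 2000 VM2 codes the order of the CUs instead of coding the weight changes. Before a CU is coded, a tag identifying it is coded. Since there is only one possible coding order within a band, the tag need only identify the band that contains the CU.
Step 7: Update the active weights if necessary. Coding continues until a termination condition is met.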
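The significance rule of equation (12) and the resulting CU order can be sketched as follows. The mode names, function names, and band labels are hypothetical, the visual significance is taken to be the active band weight times the significance, and the weights are held fixed so that a simple sort reproduces the greedy selection of step 6.

```python
import math

# Mode-dependent constants of equation (12).
MODE_FACTOR = {
    "predicted_significance": math.sqrt(3.0),
    "refinement": 1.0,
    "predicted_insignificance": math.sqrt(0.96),
}


def cu_significance(mode, layer):
    """Significance s_k of the CU formed by one sub-mode of bit-plane `layer`."""
    return MODE_FACTOR[mode] * 2.0 ** (-layer)


def visual_significance(mode, layer, weight):
    """Visual significance: active band weight times significance."""
    return weight * cu_significance(mode, layer)


def embedding_order(weights, num_layers):
    """Order every (band, layer, mode) CU by decreasing visual significance."""
    cus = [(band, layer, mode)
           for band in weights
           for layer in range(1, num_layers + 1)
           for mode in MODE_FACTOR]
    return sorted(cus, key=lambda cu: -visual_significance(cu[2], cu[1], weights[cu[0]]))


order = embedding_order({"LL": 1.0, "HH": 0.25}, num_layers=2)
tags = [band for band, _, _ in order]   # the tag only needs to name the band
```

Within a band the constants keep the usual order: all three sub-modes of bit-plane n come before any sub-mode of bit-plane n+1, because √0.96 · 2^-n is still larger than √3 · 2^-(n+1). The band weights then only interleave the bands differently.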
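[0044]
Visual progressive coding (VPC) for embedding schemes with multi-band coding units
This section describes how VPC is implemented in set partitioning in hierarchical trees (SPIHT), which codes symbols whose coefficients span several bands. The implementation generalizes to other similar embedding schemes such as EZW. SPIHT has three kinds of coding symbols: elements of the list of insignificant pixels (LIP), the list of significant pixels (LSP), and the list of insignificant sets (LIS). An element of the LIP or LSP is one bit of one coefficient. An element of the LIS comprises a tree of insignificant bits in the same layer spanning several bands. The coding unit (CU), the smallest unit of reordering in VPC, is defined as one element of the LIP, LSP, or LIS. Because the CUs are numerous, threshold-based processing like that of VPCRDE above is adopted. The coding procedure of VPC with SPIHT is as follows.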
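[0045]
Step 1: Transform the image.
Step 2: Apply fixed visual weighting with the global weights wg, if any.
Step 3: Set the initial threshold γ = γ_0 and the active weights w.
Step 4: Traverse and code. VPC traverses the LIS, LIP, and LSP, evaluates the significance and visual significance of each CU, and codes the CUs whose visual significance exceeds the threshold γ. The significance of a CU is computed from the quantization step size and the coding mode:
$$s_k = \begin{cases} 1.9 \cdot 2^{-n_k} & \text{LIS element} \\ \sqrt{3} \cdot 2^{-n_k} & \text{LIP element} \\ 1 \cdot 2^{-n_k} & \text{LSP element} \end{cases} \qquad (13)$$
[0046]
Here n_k is again the coding layer of the CU. The constants 1.9, √3, and 1 are again chosen to approximate the R-D slopes of the different coding modes and to preserve the embedding order when visual progression is not used. The visual significance of a CU is its significance multiplied by its weight. For a one-bit CU (an LIP or LSP element), the weight is the active weight w_i of the band i in which the pixel resides. For a CU that is an LIS element, containing a tree of insignificant bits spanning several bands, the weight can be computed either from the most visually sensitive band, as in equation (14), or as a weighted sum, as in equation (15).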
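[0047]
[Equation 5] (The equation image is missing from this text; the forms below are reconstructed from paragraphs [0046] and [0048], and the normalization in equation (15) is an assumption.)
$$w = \max_{c} \, w_c \qquad (14)$$
$$w = \frac{\sum_{c=0}^{L} P_c \, w_c}{\sum_{c=0}^{L} P_c} \qquad (15)$$
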
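[0048]
Here P_c is the number of pixels of the CU that lie in band c (c = 0, ..., L). The method of equation (14) is preferred because it guarantees the visual quality of the CU.
The computed visual significance is compared with the current threshold, and only CUs whose visual significance exceeds the threshold are coded. The coding of a CU strictly follows the rules described by Said and Pearlman.
Step 5: Update the active weights if necessary.
Step 6: Reduce the threshold. After the LIS, LIP, and LSP have been scanned, the threshold γ is reduced by the factor α (γ ← γ/α) and the process returns to step 4. Coding continues until a termination condition is met.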
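The two ways of weighting an LIS element can be sketched as follows. The equation images for (14) and (15) are not available in this text, so the forms used here are assumptions: the maximum of the weights of the bands the tree spans, and a pixel-count-weighted average of those weights. Function names and the band numbering are hypothetical.

```python
def lis_weight_max(band_weights, bands):
    """Assumed form of equation (14): weight of the most visually sensitive band."""
    return max(band_weights[c] for c in bands)


def lis_weight_average(band_weights, pixel_counts):
    """Assumed form of equation (15): average of band weights, weighted by P_c.

    pixel_counts maps band c to P_c, the number of pixels of the tree in band c.
    """
    total = sum(pixel_counts.values())
    return sum(band_weights[c] * p for c, p in pixel_counts.items()) / total


band_weights = {1: 1.0, 2: 0.5, 3: 0.25}
pixel_counts = {1: 1, 2: 4, 3: 16}    # a tree that quadruples at each finer band
w_max = lis_weight_max(band_weights, pixel_counts.keys())
w_avg = lis_weight_average(band_weights, pixel_counts)
```

The averaged weight is dominated by the many pixels in the fine bands, so the tree would be coded later than its single coarse-band pixel warrants; the maximum rule avoids that, which fits the stated preference for equation (14).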
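[0049]
Bitstream syntax of VPC
In VPC the decoder must be informed of changes in the active weights. There are three ways of doing this. One is to let coder and decoder negotiate a default strategy for changing the weights. This default-weight approach eliminates the overhead sent to the decoder, but because the number of default weight sets is limited, it restricts the flexibility of visual progression.
[0050]
A more general approach lets the coder control the weight changes, that is, the changes in viewing conditions during embedding; the decoder simply receives and updates the weights as instructed by the coder. This can be done in two ways. When the number of coding units (CUs) is small, as in the implementation of VPC in JPEG 2000 VM2, a tag specifying the embedding order of the CUs can be coded. This is the first way of signaling a change in the active weights.
[0051]
For certain coders, an additional tag is required to specify the number of bits needed to code the next CU. When the number of CUs is large, the usual approach is to transmit a visual mark (VM) explicitly at regular intervals to tell the decoder whether the weights have changed. This is the second way of signaling a change in the active weights.
[0052]
FIG. 5 shows the syntax of the visual mark (VM) 60. The VM is led by a one-bit symbol M that indicates whether the weights have changed. If M is 0, the previous weights remain active. If M is 1, VPC updates the weights of all bands. This syntax minimizes overhead when the weights do not change. The interval between weight updates is agreed in advance between coder and decoder; an update may occur, for example, after one band bit-plane has been coded or after the whole image has been scanned. The longer the update interval, the smaller the overhead for weight updates, but the coarser the granularity of weight changes.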
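The visual mark can be sketched as a flag followed, when set, by one weight per band. FIG. 5 is not reproduced in this text, so the layout after the flag is an assumption: the weights are written as plain numbers in band order, with no entropy coding. Function names are hypothetical.

```python
def write_visual_mark(new_weights=None):
    """Return one VM as a list of symbols: [0], or [1, w_0, ..., w_n]."""
    if new_weights is None:
        return [0]                       # M = 0: previous weights stay active
    return [1] + list(new_weights)       # M = 1: weights of all bands follow


def read_visual_mark(symbols, position, active_weights):
    """Parse the VM starting at `position`; return (weights, next position)."""
    if symbols[position] == 0:
        return active_weights, position + 1
    n = len(active_weights)              # one weight per band (assumed layout)
    start = position + 1
    return list(symbols[start:start + n]), start + n


stream = write_visual_mark() + write_visual_mark([1.0, 1.0])
weights, pos = read_visual_mark(stream, 0, [1.0, 0.5])    # unchanged weights
weights, pos = read_visual_mark(stream, pos, weights)     # updated weights
```

An unchanged interval costs a single symbol, which is the overhead property described for the VM syntax.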
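[0053]
The visual mark syntax can also support quality and spatial scalability in the special case in which no CU has coefficients spanning several bands. For quality scalability, the initial weights are all set to 1, and a visual mark of 0, indicating that the weights have not changed, is transmitted at every weight update interval. For spatial scalability, the weights of the lowest resolution are all set to 1 and the weights of the remaining resolutions to 0. With these weights, the visual significance of coefficients outside the lowest resolution is 0, so VPC codes only the coefficients of the lowest resolution. After all bit-planes of all coefficients of the lowest resolution have been coded, VPC proceeds to the next lowest resolution: the weights of this new resolution are set to 1 and those of the remaining resolutions to 0. After all coefficients of the new resolution have been coded, VPC moves on to the next higher resolution. The process continues until all coefficients have been coded.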
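The weight schedule for spatial scalability can be sketched as a generator that yields one weight set per stage. The function name is hypothetical, and one weight per resolution level is assumed for brevity; resolutions that are already fully coded are also given weight 0 here, which is harmless because they have no CUs left.

```python
def spatial_scalability_weights(num_resolutions):
    """Yield one weight set per stage: 1 for the resolution being coded, 0 elsewhere."""
    for stage in range(num_resolutions):
        yield [1 if r == stage else 0 for r in range(num_resolutions)]


schedule = list(spatial_scalability_weights(3))
```

Each weight set in the schedule would be announced to the decoder with a visual mark whose flag M is 1, once the resolution of the previous stage has been coded completely.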
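[0054]
Experimental results
The simulation software used to obtain the experimental results was JPEG 2000 VM2, run in a non-visual weighting mode (NW), a fixed visual weighting mode (VW), and a visual progression mode (VPC). The test image is the bicycle image shown in FIG. 6, of size 2048 × 2560.
[0055]
The image was compressed at 1.0 bit per pixel (bpp) and decoded from the embedded bitstream at 0.125 bpp and at 1.0 bpp. For fixed visual weighting, the image was assumed to be viewed from a distance of 14 inches (35 cm), and the contrast sensitivity function (CSF) visual weights were computed by the method proposed by Jones et al. (hereinafter the Jones method). In VPC the same CSF weights are used up to 0.125 bpp, after which the weights are all set to 1. The resulting images are shown in FIGS. 7, 8, and 9. The peak signal-to-noise ratio (PSNR) and root-mean-square error (RMSE) of the coded images are given in Table 1 for reference, although PSNR and RMSE are not good measures of visual quality.
[0056]
FIG. 7 shows the images decoded at 0.125 bpp; the images coded in NW, VPC, and VW modes are shown in FIGS. 7(a), 7(b), and 7(c), respectively. The subjective quality of the VPC-coded image of FIG. 7(b) is better than that of the NW-coded image of FIG. 7(a) and close to that of the VW-coded image of FIG. 7(c). By emphasizing the frequency components to which the human eye is most sensitive, the VPC-coded image looks clearer and shows fewer ringing artifacts around the bicycle wheels. The stripe pattern in the background is clearer in the VPC-coded and VW-coded images.
[0057]
FIG. 8 shows the images fully decoded at 1.0 bpp. FIGS. 8(a), 8(b), and 8(c) are the images decoded in NW, VPC, and VW modes, respectively; all have similar visual quality. At a high bit rate, however, the user may well want to enlarge the image and examine it closely. When the image is enlarged four times, as shown in FIG. 9, the VW-coded image 9c is smoother and has stronger ringing artifacts around sharp edges, whereas the VPC-coded image 9a and the NW-coded image 9b show few such artifacts. At high bit rates, the reordering-by-weight strategy of VPC makes it possible to turn the visual weighting off gradually so that the image can be viewed at close range. The VW-coded image has no such flexibility.
[0058]
VPC allows the visual weights to be adjusted more flexibly during embedding. It exploits the benefit of visual weighting at low bit rates, allocating more bits to the low-pass coefficients to improve the overall appearance of the image. At high bit rates it turns the visual weighting off to accommodate more flexible viewing conditions and preserve high-frequency image detail. VPC improves the subjective quality of embedded coded images.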
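[0059]
[Table 1]

[0060]
The visual progressive coding method and several of its variants have been described above. These are preferred embodiments and alternatives, and it will be understood that further changes and modifications can be made without departing from the scope of the invention as defined in the claims.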
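[Brief description of the drawings]
[FIG. 1] A block diagram showing prior-art weighting in a typical coding framework.
[FIG. 2] A diagram showing the bit array and coding order of a conventional coding method.
[FIG. 3] A diagram showing the bit array and coding order of the coding method of the invention.
[FIG. 4] A block diagram of the coding method of the invention.
[FIG. 5] A diagram showing the syntax used in the invention.
[FIG. 6] The original image.
[FIG. 7] Images processed according to the invention.
[FIG. 8] Images processed according to the invention.
[FIG. 9] Images processed according to the invention.
[Explanation of reference numerals]
12: image; 14: transform; 16: visual weighting; 18: quantization; 20: entropy coding; 22: entropy decoding; 24: dequantization; 26: inverse weighting; 28: inverse transform; 30: output image; 40: flowchart; 42: transform and generation of coding units (CUs); 44: determine the significance of each candidate CU; 46: determine the visual significance; 48: code the CU with the largest visual significance; 50: update weights?; 52: change weights; 54: finished?; 56: end.
[0001]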
BACKGROUND OF THE INVENTION
The present invention relates to an embedded coding method in a video image, and more particularly to a method of improving the image quality of embedded coding and giving flexible visual control to embedded coding.
[0002]
[Prior art]
In embedded coding, the coded bitstream can be truncated and used over a wide range of bit rates. The viewing conditions, or appearance, at a high bit rate differ fundamentally from those at a low bit rate. Visual progressive coding (VPC) provides a mechanism and a method for adjusting the viewing conditions at every coding bit rate so that better subjective image quality is obtained over the whole bit-rate range. Visual weighting has proven to be an effective means of improving the subjective quality of an embedded image. It allocates more bits to coefficients in visually sensitive frequency bands and fewer bits to coefficients in visually insensitive bands, which emphasizes the features that the human eye perceives most readily and improves the subjective quality of the image. Conventionally, visual weighting is implemented in one of two ways: either the transform coefficients are multiplied or divided according to a model of the contrast sensitivity function (CSF) of the visual system, as in equation (1), and the weighted coefficients f̂_{i,j} are entropy coded; or the quantization step size is adjusted in inverse proportion to the CSF, as in equation (2).
[0003]
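[Expression 1] (The equation image is missing from this text; the forms below are reconstructed from the definitions in paragraph [0004].)
$$\hat{f}_{i,j} = w_i \, f_{i,j} \qquad (1)$$
$$x_{i,j} = Q\!\left(\frac{f_{i,j}}{q_i}\right), \quad q_i \propto \frac{1}{w_i} \qquad (2)$$
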
[0004]
Equations (1) and (2) are both known as fixed visual weighting schemes. Here f_{i,j} and f̂_{i,j} are the transform coefficient without and with visual weighting, respectively; x_{i,j} is the quantized coefficient; i indexes the frequency band and j the position within band i. q is the quantization step size associated with band i, adjusted in inverse proportion to the weight, and Q is the quantizer. w_i is a weighting factor that depends on the frequency content of coefficient x_i and on the viewing conditions; it can be derived from a model of the contrast sensitivity function (CSF) of the visual system and from the viewing distance of the image. Many embedding schemes use the implementation of equation (1) and perform no quantization at all. The visual weighting factor w_i is normally assumed to be fixed throughout the coding process, which is why such schemes are called fixed visual weighting. For schemes that include an explicit quantization step, such as JPEG, the implementation of equation (2) is simpler and widely used. Because fixed visual weighting is easy to implement, most current research on visually optimal coding concentrates on deriving the weighting factor w_i from the viewing distance, as described in the references cited here.
[0005]
In summary, encoding can be performed with (A) a two-stage operation of transform and entropy encoding or (B) a three-stage operation of transform, quantization and entropy encoding. The method according to (A) is often used for embedded encoders. In the case of the above two types of encoding, it is necessary to perform fixed visual weighting individually. That is, in the case of the method according to (A), the execution according to the equation (1) is required, and in the case of the method according to (B), the execution according to the equation (2) is required.
[0006]
One recent achievement in image coding is embedded coding. An embedded coder such as embedded zerotree wavelet coding (EZW) (J. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients", IEEE Trans. on Signal Processing, vol. 41, pp. 3445-3462, 1993) generates a coded bitstream that can be truncated at a later processing stage and still be decoded into a visually recognizable image. Embedded coders are used mainly for Internet image browsing, image databases, digital cameras, and similar applications.
[0007]
As an example, if embedded coding is used for browsing Internet images, only one version of the compressed image needs to be stored in the central database. The user initially requests only a small portion of the bitstream of each image so that many images can be browsed quickly at low fidelity. On finding an image of interest, the user requests the rest of the bitstream and obtains a faithful image at full resolution. EZW codes the image bit-plane by bit-plane; within each bit-plane it uses a zerotree structure to group insignificant coefficients and code them efficiently.
[0008]
Many other papers have been published and many patents granted in the field of embedded coding. One well-known paper is by D. Taubman and A. Zakhor ("Multirate 3-D subband coding of video", IEEE Trans. on Image Processing, vol. 3, no. 5, Sept. 1994, pp. 572-588), which describes an embedded coding method called layered zero coding (LZC). LZC codes the transform coefficients bit-plane by bit-plane with context-adaptive arithmetic coding. It achieves better rate-distortion performance than EZW, but the paper says nothing about the characteristics of human vision. Besides performing well, the bitstream produced by LZC can be organized to be progressive in either quality or resolution, which adds flexibility to the embedding process.
[0009]
Set partitioning in hierarchical trees (SPIHT) was proposed by A. Said and W. Pearlman ("A new, fast and efficient image codec based on set partitioning in hierarchical trees", IEEE Trans. on Circuits and Systems for Video Technology, vol. 6, no. 3, June 1996, pp. 243-250). SPIHT redefines the grouping of insignificant coefficients and achieves better results than EZW. In addition, one mode of SPIHT needs no entropy coder, which makes the encoder and decoder very simple. However, it takes no account of the characteristics of human vision.
[0010]
H. Wang and C. J. Kuo, in "A multi-threshold wavelet coder (MTWC) for high fidelity image" (IEEE International Conference on Image Processing, 1997), disclose a scheme that improves on LZC by coding first the wavelet coefficients with the largest threshold. This scheme, too, takes no account of the characteristics of human vision.
[0011]
J. Li and S. Lei, in "An embedded still image coder with rate-distortion optimization" (SPIE, Visual Communication and Image Processing, vol. 3309, pp. 36-47, San Jose, Jan. 1998), disclose a scheme that optimizes the performance of an embedded coder by coding first the coding unit with the steepest rate-distortion slope, that is, with the largest distortion reduction per coding bit spent. The paper describes a rate-distortion optimized embedding (RDE) coder, which has a smooth rate-distortion curve and improves on the performance of SPIHT and LZC. However, this scheme does not consider the human visual system either.
[0012]
Jones, Daly, Gaborski, and Rabbani, in "Comparative study of wavelet and DCT decompositions with equivalent quantization and encoding strategies for medical images" (SPIE vol. 2341, Proceedings of the Conference on Medical Imaging, pp. 571-582, 1995), disclose a method of computing visual weights.
U.S. Patent No. 5,426,512 to A. Watson, "Image data compression having minimum perceptual error", discloses a method of adapting or customizing the DCT quantization matrix to the image being compressed.
U.S. Patent No. 5,629,780 to A. Watson, "Image data compression having minimum perceptual error", describes a method of adjusting the quantization matrix using visual masking by luminance and contrast techniques and by an error pooling technique.
[0013]
U.S. Patent No. 4,780,761 to Daly et al., "Digital image compression and transmission system visually weighted transform coefficients", discloses a method of quantizing transform coefficients according to a two-dimensional model of the sensitivity of the human visual system. The model exploits the fact that the human visual system is less sensitive to diagonal spatial frequencies than to horizontal or vertical ones, and thereby increases the compression of the image. The model is used under a fixed viewing condition.
[0014]
U.S. Patent No. 5,144,688 to A. Bovior et al., "Method and apparatus for visual pattern image coding", describes a subband compression system. The image is divided into a number of subbands. A perceptual matrix is determined from the characteristics of the subband filters, the error distribution of the quantizers, and the properties of the human visual system, and is used to adjust the quantizer that codes each subband signal. This teaching, too, assumes a fixed viewing condition.
U.S. Patent No. 4,939,645 to Hopkins, "Method and apparatus to reduce transform compression visual artifacts in medical images", describes a digital image encoding and decoding method in which the digital image is partitioned into blocks that are coded individually according to the visually significant responses of the human eye. Encoding is achieved by computing and removing the average intensity value from the digital numbers in each block and detecting visually perceivable edge locations within the resulting residual image block. If a visually perceivable edge is present in a block, the gradient magnitude and orientation at opposite sides of the edge within each edge block are computed and suitably coded. If no perceivable edge is present, the block is coded as a uniform-intensity block. Decoding requires receiving the coded average intensity value, gradient magnitude, and pattern code, and combining these three values to reconstruct an arrangement resembling the original digital image. The viewing condition is fixed.
[0015]
U.S. Patent No. 5,321,776 to J. Shapiro, "Data compression system including successive approximation quantizer", describes a data compression system that facilitates data compression by successive refinement quantization and entropy coding. The compressed bitstream it generates can be truncated at any point and still yield a perceptible image. The bitstream provides progressively improving image quality; that is, it is ordered so as to minimize the mean squared error at each truncation point. This scheme, too, takes no account of the characteristics of human vision.
[0016]
Fixed visual weighting can be easily incorporated into an embedded coder by multiplying or dividing the transform coefficients according to a contrast sensitivity function (CSF) model of the visual system. In an embedded coder, however, the coded bitstream may be truncated at a later time, and the viewing conditions at different stages of embedding differ considerably. At a low bit rate the compressed image is of poor quality and fine image detail is not available. The image is usually viewed from a relatively large distance, and the observer is interested in its global features. As more bits are received, the image quality improves and the observer becomes interested in the details as well as the global features. The image is viewed from a closer distance, or is analyzed or enlarged for inspection, which likewise amounts to a shorter viewing distance. Different viewing conditions are thus required at different stages of embedding.
[0017]
(Summary of the Invention)
An encoding method for gradually improving the visual image quality of an image transforms an image into a set of transform coefficients, divides the transform coefficient set into a plurality of bands each including transform coefficient groups having the same visual characteristics, Assign a set of active weights to each band, generate coding units, identify a set of coding unit candidates, determine the importance of each coding unit candidate, and set the maximum visual importance Encoding a plurality of encoding unit candidates having and updating the active weights.
[0018]
[Problems to be solved by the invention]
An object of the present invention is to provide an encoding method that allows a user to view a low-resolution and low-quality image before requesting a high-quality image with full resolution.
Another object of the present invention is to apply an encoding method that progressively improves visual image quality to a rate-distortion optimized embedding technique.
A further object of the present invention is to apply an encoding method that gradually improves visual image quality at the subband or DCT index level.
These and other objects and benefits will become apparent upon reading the following description with reference to the drawings.
[0019]
DETAILED DESCRIPTION OF THE INVENTION
  The invention described here is a visual weighting method named visual progressive coding (VPC). Unlike the prior art, VPC neither multiplies the transform coefficients by the weights nor adjusts the quantization step size in inverse proportion to the weights. Instead, the weights are used to change the order of embedding. Multiple sets of weights can be used in the VPC encoding process; each time a new set of weights becomes active, the VPC changes the order of embedding. The new weights do not affect the order of the bitstream that has already been encoded. VPC can be implemented on top of existing embedded encoders and gives embedded coding flexible visual control.
[0020]
VPC improves the subjective image quality of embedded coding. In embedded coding, the encoded bitstream can be truncated during post-processing and still be decoded into a perceivable image. Visual conditions at high bit rates are very different from visual conditions at low bit rates. The visual progressive coding method according to the present invention provides a way to adjust the visual conditions over the full range of coding bit rates in order to obtain better subjective image quality.
[0021]
If the embedded coded image is viewed from a fixed distance, visual weighting can easily be incorporated into the encoder through the weights w_i. However, different viewing conditions are required at different embedding stages. Take as an example a query to an image database that uses the embedding capability, where only one version of the compressed bitstream is stored in the central database. To browse a large number of images quickly at low resolution and low fidelity, for example at 1/16 of the screen per image, the user initially requests only a very small part of the bitstream for each image. When an interesting image is found, the user views it at full screen resolution. If satisfied with the image, the user requests the fully compressed image for printing or analysis. Visual conditions change during this query process. As the received bit rate increases, the image is magnified or viewed more closely. At low bit rates, images are usually viewed from a relatively long distance. In this case, since the quality of the compressed image is low and detailed image features cannot be obtained anyway, the user is interested in the overall characteristics of the image. The image quality improves as the received bit rate increases, and the user becomes interested in the image details as well as the overall features. When the image is viewed from a closer distance, analyzed, or magnified for inspection, the viewing distance is reduced accordingly. Changing the weights with either formula (1) or formula (2) above is inconvenient, because the coefficients must be multiplied by the new weights or re-quantized every time the weights change. Furthermore, in such an implementation the binary representation of the coefficients sent to the entropy encoder changes every time the weights change, so the performance of the subsequent entropy encoder is degraded by the changing statistics.
[0022]
Visual weighting has proven to be an effective way to improve the subjective image quality of encoded images. With reference to FIG. 1, a flowchart 10 of a prior art visual weighting scheme will be described. Conventional visual weighting of the image 12 is performed in one of two ways: by method (1), in which the transform coefficients from block 14 are multiplied or divided by weights derived from a contrast sensitivity function (CSF) model of the visual system, as shown in visual weighting block 16; or by method (2), in which the visual weights are incorporated into the quantization operation of block 18.
[0023]
The weighted coefficients are entropy encoded at block 20. For decoding, the bitstream is first entropy decoded at block 22; the weighted coefficients are then dequantized at block 24, deweighted at block 26, and inverse transformed at block 28 to obtain the output image 30. Alternatively, the visual weighting can be incorporated into the inverse quantization by adjusting the quantization step size to be inversely proportional to the weight.
[0024]
At low bit rates, only the overall characteristics of the image are of interest and the image is also viewed from a relatively far distance. Detail features cannot be obtained because the bit rate is not sufficient. However, at high bit rates, the image is examined in detail and viewed from a relatively close distance. The image is further magnified for analysis. Thus, different visual weights are required at different embedding stages. There is no known embedded encoder that can adjust the visual weighting factor during the embedding process.
[0025]
The syntax of an adjustable embedded encoder and the mechanism that implements it are described below. The syntax makes the weighting factors adjustable during the embedding process. Such an encoder is called a visual progressive coder (VPC), and the process is likewise called visual progressive coding (VPC). Several embodiments of VPC are described.
[0026]
Visual progressive coding method (VPC)
  VPC allows flexible adjustment of the visual weighting factors during the embedding process. This allows the encoder to take advantage of visual weighting at low bit rates and allocate more bits to the low-pass coefficients, improving the overall appearance of the image. At high bit rates, the VPC gradually turns this weighting off to adapt to more flexible visual conditions and preserve high-frequency image details. VPC improves the subjective image quality of embedded coding. Rather than multiplying or dividing the coefficients by the visual weights, or adjusting the quantization step size by the visual weights, VPC adjusts the embedding order according to the visual weights. In other words, VPC uses visual weighting to control the order of encoding rather than the content of encoding.
[0027]
Implementation of visual progressive coding method (VPC)
  In VPC, an image is first transformed into a set of coefficients. The transform may be a DCT (discrete cosine transform), a wavelet transform, or even a wavelet packet transform. Without loss of generality, a band in VPC is defined as a group of transform coefficients having the same visual characteristics. For a wavelet or wavelet packet transform, a band is one wavelet or wavelet packet subband; for the DCT, a band contains all coefficients having the same DCT basis. The transform coefficients are indexed as f_{i,j}, where i denotes the band and j denotes the position within the band. The binary representation of a transform coefficient f_{i,j} is as follows.
          $\pm\, b_1 b_2 b_3 \ldots b_n \ldots b_L$          ... Formula (3)
[0028]
Here b₁ is the most significant bit and b_L is the least significant bit; b_u(f_{i,j}) is the u-th most significant bit, or the u-th coding layer, of coefficient f_{i,j}. A sample bit array generated by the transform is shown in the drawings. Each row of the bit array represents a transform coefficient, and each column represents a coding layer. The most significant bit is in the leftmost column and the least significant bit is in the rightmost column. Clearly, a higher-order bit b_u(f_{i,j}) must always be coded before a lower-order bit b_v(f_{i,j}) of the same coefficient (u < v). A bit b_u(f_{i,j}) is called a candidate bit if it is the most significant uncoded bit of its coefficient, that is, if all higher-order bits b_v(f_{i,j}) (v = 1, ..., u−1) of the same coefficient have already been encoded. At any given time, the encoder must select the next bit to encode from the set of candidate bits. A coefficient is considered significant if any of its encoded bits is non-zero, and insignificant if they are all zero. Candidate bits of insignificant coefficients are encoded in significance identification mode, and candidate bits of significant coefficients are encoded in refinement mode. Significance identification and refinement will be described later.
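The candidate-bit rule above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: coefficients are taken to be integers, and the helper names `bit_layers` and `candidate_bit` are hypothetical, not part of any described encoder.

```python
def bit_layers(value, num_layers):
    """Split an integer coefficient into its sign and bits b_1..b_L (b_1 = MSB)."""
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    bits = [(magnitude >> (num_layers - u)) & 1 for u in range(1, num_layers + 1)]
    return sign, bits


def candidate_bit(bits, num_coded):
    """Return (u, mode) for the candidate bit, or None when every layer is coded.

    num_coded is the number of leading layers b_1..b_{u-1} already encoded.
    """
    if num_coded >= len(bits):
        return None
    # A coefficient is significant once any already-coded bit is non-zero.
    significant = any(bits[:num_coded])
    mode = "refinement" if significant else "significance identification"
    return num_coded + 1, mode
```

For example, the coefficient −5 with four layers has bits 0101; its candidate bit stays in significance identification mode until the second layer has been coded, after which the remaining layers are coded in refinement mode.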
[0029]
The conventional encoder and the embedded encoder differ in the order in which they encode the bit array. A conventional encoder such as JPEG or MPEG first determines the quantization accuracy, or equivalently the number of bits to be encoded for each coefficient, and then encodes coefficient by coefficient. Taking the above bit array as an example, conventional encoding usually follows an order like that indicated by 32 in the drawings. In the illustrated example, the rows run from w₀ to w₇ and the columns contain bit planes b₁ to b₇. Each coefficient has a + or − sign.
[0030]
  Unlike conventional encoding, embedded encoding encodes an image in bit plane units or column units as indicated by 34 in FIG. Since the most important part of each coefficient is encoded first in the bit stream of the embedded encoding, a reasonable image quality is maintained even if it is cut off in the middle. Since the image quality of the decoded image is gradually improved as the number of bits received increases, it is suitable for sequential image transmission.
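The difference between the two scan orders can be made concrete with a small Python sketch. The function names are hypothetical, and the conventional order is simplified by assuming the same number of layers is coded for every coefficient.

```python
def conventional_order(num_coeffs, num_layers):
    """Coefficient by coefficient: all layers of row 0, then row 1, and so on."""
    return [(j, u) for j in range(num_coeffs) for u in range(1, num_layers + 1)]


def embedded_order(num_coeffs, num_layers):
    """Bit plane by bit plane: layer 1 of every row, then layer 2, and so on."""
    return [(j, u) for u in range(1, num_layers + 1) for j in range(num_coeffs)]
```

Truncating the embedded order after the first `num_coeffs` entries still leaves the most significant bit of every coefficient, which is why a truncated embedded bitstream remains decodable to a reasonable image.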
  In VPC, there are multiple sets of visual weights:
      $w^{(0)} = \{w_0^{(0)}, w_1^{(0)}, \ldots, w_n^{(0)}\}$;
      $w^{(1)} = \{w_0^{(1)}, w_1^{(1)}, \ldots, w_n^{(1)}\}$;
                  ...
      $w^{(m)} = \{w_0^{(m)}, w_1^{(m)}, \ldots, w_n^{(m)}\}$.          ... Formula (4)
  In addition to this series of weights, VPC may optionally use a set of overall weights wg, applied immediately after the transform operation:
      $wg = \{wg_0, wg_1, \ldots, wg_n\}$          ... Formula (5)
  This overall weight set is implemented as fixed visual weighting. At any given time, one set of weights, denoted the active weights w, is in effect:
      $w = \{w_0, w_1, \ldots, w_n\}$          ... Formula (6)
[0031]
  Here w_i is the active weight for band i. The key concept of VPC is that the weights are used to control the order of embedding, rather than to weight the transform coefficients as in implementation formula (1) or to adjust the quantization step in inverse proportion to the weights as in implementation formula (2). The smallest unit of reordering in VPC is called a coding unit (CU) and is indexed by k. The definition of the coding unit depends on the particular embedding scheme in which VPC is implemented. A coding unit candidate is defined as a coding unit (CU) consisting only of candidate bits. Since only CU candidates can be encoded, the operation of VPC is to order the CU candidates according to the active weights. When a new weight set becomes active, the VPC establishes a new coding order for the remaining CUs. The encoding order of CUs that have already been encoded is not affected by the new weights. This weight-based reordering strategy allows the VPC encoder to incorporate multiple sets of weights during the embedding process.
[0032]
  Flowchart 40 shows the overall operation of the VPC method according to the present invention. The input image 12 is received and transformed, and coding units (CUs) are generated (block 42). After the transform, the overall weight set wg, if any, is applied by fixed visual weighting using either implementation formula (1) or (2). The active weight set w is initialized, and the bits of the transform coefficients are grouped to form the coding units (CUs). The VPC identifies the CU candidates and determines the importance s_k of each CU candidate (block 44). The importance s_k is a magnitude value related to the embedding order without visual weighting. The visual importance V_sk of the CU is then determined by multiplying the importance of the CU by its weight (block 46).
[0033]
[Expression 2]
          $V_{sk} = s_k \cdot w_i$          ... Formula (7)
[0034]
Here w_i is the active weight of the band in which the CU resides. The VPC encodes the CU with the greatest visual importance (block 48). When the encoding of one CU is completed, new CU candidates appear. The VPC evaluates the importance and visual importance of the newly appearing CU candidates and again encodes the CU with the greatest visual importance. A determination is then made as to whether the weights should be updated (block 50). If so, the process moves to the next step; if not, the steps from block 44 are repeated with the same weights. The active weights can be changed at any time (block 52), and when a new weight set becomes active it affects only the embedding order of the remaining CUs. The weight change must be negotiated between the encoder and the decoder. There are several effective ways to do this, which are described later under the VPC syntax. The above encoding process is repeated until a termination criterion is met (block 54): for example, until all CUs have been encoded (that is, the encoding has reached a lossless state), the final encoding rate has been reached, or the encoding distortion has fallen to a certain threshold. The process then ends (block 56). If the termination criterion is not met, the process is repeated from block 44.
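The loop of blocks 44 to 54 can be sketched in Python as follows. This is a minimal illustration rather than the described encoder: the name `vpc_order`, the per-band queues, and the step-indexed `weight_schedule` are assumptions made for the example, each CU is reduced to a label with an importance value s_k, and the only termination criterion modeled is "all CUs encoded".

```python
def vpc_order(band_queues, weight_schedule):
    """Return CU labels in VPC coding order.

    band_queues: {band: [(label, s_k), ...]} in the fixed within-band order;
                 the head of each queue is that band's current CU candidate.
    weight_schedule: {step: {band: weight}}; must contain step 0. A new entry
                 replaces the active weights before that step is coded.
    """
    queues = {band: list(cus) for band, cus in band_queues.items()}
    weights = {}
    order = []
    step = 0
    while any(queues.values()):                      # block 54: CUs remain
        if step in weight_schedule:                  # blocks 50/52: update weights
            weights = dict(weight_schedule[step])
        candidates = [band for band, cus in queues.items() if cus]   # block 44
        # Block 46: visual importance = active band weight * importance s_k.
        best = max(candidates, key=lambda band: weights[band] * queues[band][0][1])
        label, _s_k = queues[best].pop(0)            # block 48: encode this CU
        order.append(label)
        step += 1
    return order
```

With CSF-like weights that de-emphasize a high-frequency band, both low-band CUs are emitted before any high-band CU; switching to uniform weights partway through changes only the order of the CUs that have not yet been emitted.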
[0035]
Visual progressive coding at the individual bit level: visually progressive rate-distortion optimized embedded coding (VPC RDE)
The rate-distortion optimized embedding (RDE) method was developed by Li and Lei, as described above. In RDE, the coding unit (CU) is a single bit b_u(f_{i,j}) of one transform coefficient f_{i,j}. RDE encodes the candidate bits in order of their expected rate-distortion slope (R-D slope), that is, in decreasing order of distortion reduction per encoded bit.
[0036]
[Equation 3]

[0037]
  To simplify the calculation, a lookup table is built so that computing the R-D slope of each candidate bit requires only one table lookup, indexed by the coding layer, the significance status, and the arithmetic coding context. To implement VPC for rate-distortion optimized embedding (RDE), the coding units (CUs), that is, the individual bits of the coefficients, are coded in descending order of visual importance. The importance of a CU is defined as the square root of its R-D slope.
          $s_{i,j} = \sqrt{\mathrm{slope}_{i,j}}$                            ... Formula (9)
[0038]
The square root is applied because the R-D slope is a measure of energy reduction, whereas the importance of a coding unit (CU) is a measure of magnitude. Because the number of CUs is so large, the encoder does not exhaustively search for and encode the CU with the maximum visual importance; instead, a threshold approximation is applied.
[0039]
  A decreasing sequence of thresholds γ₀ > γ₁ > ... > γ_n > ... is defined. A typical threshold sequence decreases by a factor α at each iteration.
          $\gamma_n = \gamma_0 \cdot \alpha^{-n}$                                ... Formula (10)
  VPC RDE scans the transform coefficients many times, and in scan n it encodes all CUs with visual importance greater than γ_n. Since the active weight is the same throughout band i, instead of calculating the visual importance of each CU and comparing it with the current threshold, the threshold for band i is inversely weighted.
[0040]
[Expression 4]

[0041]
  All candidate bits with importance greater than γ′_i are encoded. The steps of VPC RDE are as follows.
  Step 1: Transform the image.
  Step 2: Fixed visual weighting. Apply the overall weights wg, if any.
  Step 3: Set the initial threshold γ = γ₀ and the active weights w.
  Step 4: Scan and encode.
  The image is scanned band by band from the lowest resolution band to the highest resolution band, in raster line order within each band. For band i, the weighted threshold γ′_i is calculated according to equation (11). For each candidate bit, the R-D slope is determined, as described in the Li and Lei reference, by a lookup table operation indexed by the coding layer, the significance status, and the arithmetic coding context. The R-D slope of the candidate bit is compared with the adjusted threshold γ′_i, and only bits whose R-D slope exceeds the adjusted threshold are encoded.
  Step 5: Update the active weights as needed.
  Step 6: Decrease the threshold. After the entire image has been scanned, the threshold is decreased by a factor α (γ ← γ / α). The process returns to step 4, and encoding continues until an end condition is satisfied, such as reaching the final bit rate selected by the user, e.g. 2.0 bpp.
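Steps 3 to 6 above can be sketched in Python as follows. Several simplifications are assumptions of this sketch and not part of the description: each CU has a fixed, precomputed R-D slope instead of a lookup table, coding a bit does not expose a new candidate bit, the weights stay fixed, and the per-band threshold is taken to be γ / w_i compared against the importance √slope, which is equivalent to comparing the visual importance w_i · √slope with γ. The name `vpc_rde_passes` is hypothetical.

```python
import math


def vpc_rde_passes(slopes, weights, gamma0, alpha, num_passes):
    """Return, for each scan, the list of (band, index) CUs coded in that scan.

    slopes:  {band: [R-D slope of each CU]}, bands listed from lowest to
             highest resolution (dict insertion order is the scan order).
    weights: {band: active weight w_i}.
    """
    pending = {(band, j) for band, vals in slopes.items() for j in range(len(vals))}
    gamma = gamma0
    passes = []
    for _ in range(num_passes):
        coded = []
        for band, vals in slopes.items():
            if weights[band] == 0:
                continue                      # zero weight: band is never selected
            gamma_i = gamma / weights[band]   # inversely weighted band threshold
            for j, slope in enumerate(vals):
                # Importance is the square root of the R-D slope (Formula (9)).
                if (band, j) in pending and math.sqrt(slope) > gamma_i:
                    coded.append((band, j))
                    pending.discard((band, j))
        passes.append(coded)
        gamma /= alpha                        # step 6: gamma <- gamma / alpha
    return passes
```

Lowering a band's weight raises its effective threshold, so its bits are deferred to later scans without changing the values that are eventually coded.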
[0042]
Visual progressive coding at subband or DCT index level
For bit-plane schemes such as layered zero coding (LZC) proposed by Taubman and Zakhor, compression with reversible embedded wavelets (CREW) proposed by Zandi et al., and the multi-threshold wavelet coder (MTWC) of Wang and Kuo, the VPC coding unit (CU) is a band bit plane, which includes all bits in the same coding layer and the same band. This bit organization already exists in MTWC. Enlarging the CU coarsens the reordering granularity, but it is easier to implement and most of the encoder remains unchanged. VPC in this category is illustrated by the implementation of VPC in JPEG2000 VM2, as follows. Within one band bit plane, the bits are further divided into three partial bit planes, or sub-modes: (1) predicted significance mode, in which the current coefficient is insignificant but its neighboring coefficients are significant; (2) refinement mode, in which the current coefficient is significant; and (3) predicted insignificance mode, in which neither the current coefficient nor any of its neighboring coefficients is significant. Within one band, the encoder always proceeds from the most significant bit plane to the least significant bit plane, and within one bit plane it always encodes the predicted significance mode first, then the refinement mode, and finally the predicted insignificance mode. To implement VPC, a coding unit (CU) is defined as one sub-mode of one band bit plane, and the CUs are reordered according to the active weights. VPC-enabled JPEG2000 VM2 is implemented as follows.
[0043]
  Step 1: Transform the image.
  Step 2: Quantize with a scalar quantizer or a trellis coded quantizer (TCQ) and, if applicable, apply fixed visual weighting with the overall weights wg.
  Step 3: Set the first active weights w.
  Step 4: Calculate the importance s_k of each coding unit (CU) candidate.
  $s_k = \sqrt{3} \cdot 2^{-n_k}$          in predicted significance mode
  $s_k = 1 \cdot 2^{-n_k}$              in refinement mode
  $s_k = \sqrt{0.96} \cdot 2^{-n_k}$  in predicted insignificance mode          ... Formula (12)
  Here n_k is the current coding layer. The constants $\sqrt{3}$, 1 and $\sqrt{0.96}$ are chosen so that they approximate the R-D slopes of the different coding modes and preserve the embedding order that is used when visual progression is not applied.
  Step 5: Calculate the visual importance of each CU candidate according to equation (7).
  Step 6: Encode the CU candidate with the greatest visual importance. Since the number of CUs is relatively small, JPEG2000 VM2 encodes the order of the CUs instead of encoding the weight changes. Before a CU is encoded, a tag identifying the CU is encoded. Since there is only one encoding order within a band, the tag only needs to identify the band containing the CU.
  Step 7: Update the active weights as needed. Encoding continues until the end condition is satisfied.
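Steps 4 to 6 above can be sketched in Python as follows. The sketch assumes every band has the same number of coding layers and that the weights stay fixed; the names `importance` and `band_tags`, and the representation of a tag as the band key, are illustrative choices and not the JPEG2000 VM2 syntax.

```python
import math

# Within one band bit plane the three sub-modes are always coded in this order.
MODE_ORDER = ("predicted_significance", "refinement", "predicted_insignificance")
MODE_CONSTANT = {
    "predicted_significance": math.sqrt(3.0),
    "refinement": 1.0,
    "predicted_insignificance": math.sqrt(0.96),
}


def importance(mode, layer):
    """Importance s_k of a CU: mode constant times 2^(-n_k) (Formula (12))."""
    return MODE_CONSTANT[mode] * 2.0 ** (-layer)


def band_tags(weights, num_layers):
    """Return the sequence of band tags, one tag per encoded CU."""
    queues = {
        band: [(layer, mode) for layer in range(1, num_layers + 1) for mode in MODE_ORDER]
        for band in weights
    }

    def visual_importance(band):
        layer, mode = queues[band][0]
        return weights[band] * importance(mode, layer)

    tags = []
    while any(queues.values()):
        best = max((b for b in queues if queues[b]), key=visual_importance)
        queues[best].pop(0)   # the encoder would code this CU after writing its tag
        tags.append(best)
    return tags
```

Because the within-band order is fixed, a decoder that reads the tag sequence can tell which CU comes next for each band without knowing the weights themselves.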
[0044]
Visual progressive coding (VPC) for embedding schemes with multi-band coding units
A method is described for implementing VPC in set partitioning in hierarchical trees (SPIHT), which encodes symbols whose coefficients span multiple bands. This implementation can be generalized to other similar embedding schemes such as EZW. SPIHT has three types of coding symbols: the list of insignificant pixels (LIP), the list of significant pixels (LSP), and the list of insignificant sets (LIS). Each element of the LIP and the LSP is one bit of one coefficient. Each element of the LIS comprises a tree of insignificant bits in the same layer that spans multiple bands. A coding unit (CU), the minimum unit of VPC reordering, is defined as one element of the LIP, LSP, or LIS. Since the number of CUs is large, thresholding similar to that of the VPC RDE described above is adopted. The VPC encoding procedure with SPIHT is as follows.
[0045]
  Step 1: Transform the image.
  Step 2: If applicable, apply fixed visual weighting with the overall weights wg.
  Step 3: Set the initial threshold γ = γ₀ and the active weights w.
  Step 4: Traverse and encode. The VPC traverses the LIS, LIP, and LSP, evaluates the importance and the visual importance of each CU, and encodes the CUs whose visual importance is greater than the threshold γ. The importance of a CU is calculated according to the quantization step size and the encoding mode.
        $s_k = 1.9 \cdot 2^{-n_k}$          for an LIS element          ... Formula (13)
        $s_k = \sqrt{3} \cdot 2^{-n_k}$       for an LIP element
        $s_k = 1 \cdot 2^{-n_k}$            for an LSP element
[0046]
  Here n_k is again the coding layer of the CU. The constants 1.9, $\sqrt{3}$ and 1 are again chosen so that they estimate the R-D slopes of the different coding modes and preserve the embedding order that is used when visual progression is not applied. The visual importance of a CU is determined by multiplying the importance of the CU by its weight. For a single-bit CU (LIP or LSP), the weight is the active weight w_i of the band i in which the pixel resides. For an LIS CU, which contains a tree of insignificant bits spanning multiple bands, the weight is calculated either according to the most visually sensitive band, as in equation (14), or as a weighted sum, as in equation (15).
[0047]
[Equation 5]

[0048]
In the equations, P_c denotes the number of pixels in band c (c = 0, ..., L). The method of equation (14) is preferable because it guarantees the visual image quality of the CU.
The calculated visual importance is compared with the current threshold, and only CUs whose visual importance exceeds the threshold are encoded. The encoding of the CUs strictly follows the rules described by Said and Pearlman.
Step 5: Update the active weights as needed.
Step 6: Decrease the threshold. After the LIS, LIP, and LSP have been scanned, the threshold γ is decreased by a factor α (γ ← γ / α), and the process returns to Step 4. Encoding continues until the end condition is satisfied.
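The weight of a multi-band LIS element can be sketched in Python as follows. Equations (14) and (15) are not reproduced in the text, so the two forms used here are assumptions inferred from the description: the maximum of the band weights for the "most sensitive visual band" rule, and a pixel-count-weighted average for the "weighted sum" rule. All function names are hypothetical.

```python
import math

# Mode constants of Formula (13).
SPIHT_CONSTANT = {"LIS": 1.9, "LIP": math.sqrt(3.0), "LSP": 1.0}


def spiht_importance(kind, layer):
    """Importance s_k of a SPIHT CU: mode constant times 2^(-n_k)."""
    return SPIHT_CONSTANT[kind] * 2.0 ** (-layer)


def lis_weight_max(band_weights):
    """Assumed form of equation (14): weight of the most sensitive band."""
    return max(band_weights)


def lis_weight_average(band_weights, pixel_counts):
    """Assumed form of equation (15): average weighted by pixel counts P_c."""
    total = sum(pixel_counts)
    return sum(p * w for p, w in zip(pixel_counts, band_weights)) / total


def lis_visual_importance(layer, band_weights):
    """Visual importance of an LIS element using the maximum-weight rule."""
    return lis_weight_max(band_weights) * spiht_importance("LIS", layer)
```

Under the maximum rule a tree is never deferred longer than its most visible band would be, which matches the stated reason for preferring equation (14).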
[0049]
VPC bitstream syntax
For VPC, the decoder must be informed of changes to the active weights, and there are three ways to do this. The first is to have a default weight-change strategy negotiated between the encoder and the decoder. This default approach eliminates the overhead sent to the decoder, but it restricts the weights to a limited number of default sets and thus limits the flexibility of visual progression.
[0050]
A more general approach is to let the encoder control the weight changes during embedding, that is, the changes in visual conditions, while the decoder simply receives the encoder's instructions and updates the weights accordingly. There are two ways to do this. When the number of coding units (CUs) is small, tags that specify the embedding order of the CUs can be encoded, as in the case of VPC in JPEG2000 VM2. This constitutes the first method of signaling active weight changes.
[0051]
For certain encoders, an additional tag is required to specify the number of bits needed to encode the next CU. When the number of CUs is large, the usual method is to transmit visual marks (VMs) explicitly at regular intervals to inform the decoder whether the weights have changed. This constitutes the second method of signaling active weight changes.
[0052]
FIG. 5 shows the syntax of the visual mark (VM) 60. The VM begins with a 1-bit symbol M that indicates whether the weights have changed. If M is 0, the previous weights remain active. If M is 1, the VPC updates the weights for all bands. This syntax minimizes the overhead when the weights do not change. The weight update interval is negotiated in advance between the encoder and the decoder; for example, an update may be allowed after each band bit plane is encoded or after each scan of the entire image. The longer the weight update interval, the smaller the overhead for weight updates, but the coarser the granularity of weight changes.
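The visual mark can be sketched in Python as follows. Only the leading 1-bit symbol M and the fact that weights for all bands follow when M is 1 come from the description; the fixed-point representation of each weight (8 bits with a scale of 64) and the function names are assumptions made so that the example is complete.

```python
def write_visual_mark(new_weights, active_weights, bits_per_weight=8, scale=64):
    """Return the visual mark as a list of bits.

    M = 0 alone means the previous weights stay active; M = 1 is followed by
    one fixed-point code per band (the code layout is an assumption).
    """
    if list(new_weights) == list(active_weights):
        return [0]
    bits = [1]
    max_code = (1 << bits_per_weight) - 1
    for weight in new_weights:
        code = min(max_code, max(0, round(weight * scale)))
        bits.extend((code >> shift) & 1 for shift in range(bits_per_weight - 1, -1, -1))
    return bits


def read_visual_mark(bits, pos, active_weights, bits_per_weight=8, scale=64):
    """Parse one visual mark at bits[pos]; return (weights, next position)."""
    marker = bits[pos]
    pos += 1
    if marker == 0:
        return list(active_weights), pos
    weights = []
    for _band in active_weights:
        code = 0
        for _bit in range(bits_per_weight):
            code = (code << 1) | bits[pos]
            pos += 1
        weights.append(code / scale)
    return weights, pos
```

An unchanged weight set costs a single bit per update interval, which is the low-overhead property noted above; a change costs one bit plus the per-band codes.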
[0053]
The visual mark syntax can support both quality scalability and spatial scalability in the special case where no CU contains coefficients that span multiple bands. For quality scalability, the initial weights are uniformly set to 1, and a visual mark of 0, indicating that the weights have not changed, is transmitted at each weight update interval. To implement spatial scalability, all weights for the lowest resolution are set to 1 and all weights for the remaining resolutions are set to 0. With such weights, the visual importance of coefficients outside the lowest resolution is 0, so the VPC encodes only the lowest-resolution coefficients. After encoding all bit planes of all lowest-resolution coefficients, the VPC proceeds to the next lowest resolution: the weights of this new resolution are set to 1 and the weights of the remaining resolutions are set to 0. After encoding all coefficients of the new resolution, the VPC proceeds to the next higher resolution. The process continues until all coefficients are encoded.
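The two weight patterns described above can be written out in a short Python sketch. The function names are hypothetical, and resolution levels are assumed to be indexed from 0 (lowest) upward, with one weight per resolution level.

```python
def quality_scalability_weights(num_resolutions):
    """Uniform weights of 1: ordering follows the unweighted importance only."""
    return [1] * num_resolutions


def spatial_scalability_schedule(num_resolutions):
    """One weight set per stage; stage r gives weight 1 to resolution r only."""
    return [
        [1 if level == stage else 0 for level in range(num_resolutions)]
        for stage in range(num_resolutions)
    ]
```

Each stage of the schedule would be signaled by a visual mark with M = 1 once all coefficients of the previous resolution have been coded, whereas the quality-scalable case sends only marks with M = 0.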
[0054]
Experimental result
The simulation software used to obtain the experimental results is JPEG2000 VM2, run in three modes: no visual weighting (NW), fixed visual weighting (VW), and visual progressive coding (VPC). The test image is the bicycle image shown in FIG. 6, which has a size of 2048 × 2560.
[0055]
This image was compressed at 1.0 bit per pixel (bpp), and the embedded bitstream was decoded at 0.125 bpp and at 1.0 bpp. For fixed visual weighting, the image is assumed to be viewed from a distance of 14 inches (35 centimeters), and the visual weights of the contrast sensitivity function (CSF) are calculated according to Jones et al. In VPC, the same CSF weights are used up to 0.125 bpp, after which the weights are uniformly set to 1. The resulting images are shown in the drawings. The peak signal-to-noise ratio (PSNR) and root mean square error (RMSE) of the coded images are listed in Table 1 for reference, although PSNR and RMSE are not good measures of visual image quality.
[0056]
FIGS. 7A, 7B, and 7C show the images decoded at 0.125 bpp after encoding in the NW, VPC, and VW modes, respectively. The subjective image quality of the VPC encoded image in FIG. 7B is superior to that of the NW encoded image in FIG. 7A and close to that of the VW encoded image in FIG. 7C. Because the frequency components most readily perceived by the human eye are emphasized, the VPC encoded image looks clearer and has fewer ringing artifacts around the bicycle wheel. The background stripes are also clearer in the VPC and VW encoded images.
[0057]
The images fully decoded at 1.0 bpp are shown in FIGS. 8A, 8B, and 8C for the NW, VPC, and VW modes, respectively; their visual quality is nearly identical. However, at high bit rates the user will want to magnify the image to see more detail. As shown in FIG. 9, when the image is enlarged four times, the VW encoded image 9c appears smoother and shows stronger ringing artifacts around sharp edges, whereas the VPC encoded image 9a and the NW encoded image 9b show few such artifacts. At high bit rates, the weight-based reordering strategy of VPC allows the visual weighting to be phased out gradually so that the image can be viewed at close range. VW encoded images do not have this flexibility.
[0058]
The VPC method makes it possible to adjust the visual weights flexibly during embedding. At low bit rates, it takes advantage of visual weighting and allocates more bits to the low-pass coefficients to improve the overall appearance of the image. At high bit rates, the visual weighting is turned off to accommodate more flexible visual conditions and preserve high-frequency image details. VPC thus improves the subjective image quality of embedded coded images.
[0059]
[Table 1]

[0060]
While the visual progressive coding method and several variations thereof have been described above, these are preferred embodiments and alternatives, and it will be understood that further changes and modifications may be made without departing from the scope of the invention as defined in the claims.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating prior art weighting in a representative coding framework.
FIG. 2 is a diagram illustrating a bit arrangement and an encoding order of a conventional encoding method.
FIG. 3 is a diagram showing a bit arrangement and an encoding order of an encoding method according to the present invention.
FIG. 4 is a block diagram of an encoding method of the present invention.
FIG. 5 is a diagram illustrating syntax used in the present invention.
FIG. 6 is a diagram illustrating an original image.
FIG. 7 shows an image processed according to the present invention.
FIG. 8 shows an image processed according to the present invention.
FIG. 9 shows an image processed according to the present invention.
[Explanation of symbols]
12 ... Image, 14 ... Transformation, 16 ... Visual weighting, 18 ... Quantization, 20 ... Entropy coding, 22 ... Entropy decoding, 24 ... Inverse quantization, 26 ... Inverse weighting, 28 ... Inverse transformation, 30 ... Output image, 40 ... Flowchart, 42 ... Transformation and coding unit (CU) generation, 44 ... Determining importance of each CU candidate, 46 ... Determining visual importance, 48 ... Encoding of CU with maximum visual importance, 50 ... Weight update (?), 52 ... Weight change, 54 ... End (?), 56 ... End.

Claims

In an encoding method in which an image is transformed by DCT and divided into a plurality of bands each containing transform coefficients having the same DCT basis, and in which bits representing the transform coefficients in binary are encoded in a predetermined order,
Classifying the bits based on whether the transform coefficient to which the bits belong is zero, and defining a coding unit as a partial bit plane consisting of bits belonging to the same classification for each band and for each bit plane;
Determining the importance of the coding unit based on the bit plane number for each classification;
Multiplying the importance by the visual weight defined corresponding to the band to determine the visual importance corresponding to the coding unit;
A method of visually progressive encoding of an image, wherein the coding units are encoded in the order of the visual importance.

In an encoding method in which an image is transformed by a wavelet transform and divided into a plurality of bands, which are wavelet subbands, and in which bits representing the transform coefficients in binary are encoded in a predetermined order,
Classifying the bits based on whether the transform coefficient to which the bits belong is zero, and defining a coding unit as a partial bit plane consisting of bits belonging to the same classification for each band and for each bit plane;
Determining the importance of the coding unit based on the bit plane number for each classification;
Multiplying the importance by the visual weight defined corresponding to the band to determine the visual importance corresponding to the coding unit;
A method of visually progressive encoding of an image, wherein the coding units are encoded in the order of the visual importance.