JP3833585B2

JP3833585B2 - Image coding apparatus, image coding method, and computer program

Info

Publication number: JP3833585B2
Application number: JP2002215618A
Authority: JP
Inventors: 真幸橋本; 賢治松尾; 淳小池; 康之中島
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2002-07-24
Filing date: 2002-07-24
Publication date: 2006-10-11
Anticipated expiration: 2022-07-24
Also published as: JP2004064126A

Description

【０００１】
【発明の属する技術分野】
本発明は、ディジタル画像処理の分野に係り、画像データを高能率に符号化する画像符号化装置および画像符号化方法、並びにその画像符号化装置をコンピュータを利用して実現するためのコンピュータプログラムに関する。
【０００２】
【従来の技術】
従来、ＪＰＥＧ２０００と呼ばれる画像符号化方式がＩＳＯ（International Organization for Standardization）およびＩＥＣ（International Electrotechnical Commission）の国際標準化機関によって「ISO/IEC 15444-1,“Information technology - JPEG2000 image coding system - Part 1: Core coding system”，ISO/IEC JTC 1/SC 29 WG1，Jan.2001.」で規格化されている。このＪＰＥＧ２０００はウェーブレット変換（ＷＴ:Wavelet Transform）を用いた画像符号化方式（以下、ＷＴ符号化方式と称する）の一つである。このＷＴ符号化方式は、ＤＣＴ符号化方式（例えばＪＰＥＧと呼ばれて知られている方式）よりも高圧縮、高品質な画像圧縮が可能なことから注目されている。
【０００３】
ＷＴ符号化方式では、ウェーブレット関数により画像全体を周波数帯域に分けた水平、垂直方向それぞれの周波数成分を、量子化および符号化して圧縮することが可能であるが、一般にＪＰＥＧ２０００などでは、画像全体を複数の矩形領域（タイル）に分割し、一つのタイルの中で独立にＤＷＴ（離散ウェーブレット変換）と量子化および符号化が行われる。このタイル化ＷＴ符号化方式は、ハードウェアで実現する際のメモリ量を削減することが可能なために、実用上、非常に重要な方式となっている。
【０００４】
【発明が解決しようとする課題】
しかし、上述した従来のタイル化ＷＴ符号化方式では、符号化ビットレートを低く設定して符号化した場合、量子化誤差のため、再生画像においてタイルの境界部分で歪（タイル歪）が発生し、再生画像の品質が低下するという問題が生じている。
【０００５】
本発明は、このような事情を考慮してなされたもので、その目的は、タイル化ＷＴ符号化方式により画像を符号化する際、再生画像の品質を向上させるために、効率的にタイル歪を軽減させることができる画像符号化装置および画像符号化方法を提供することにある。
【０００６】
また、本発明は、その画像符号化装置をコンピュータを利用して実現するためのコンピュータプログラムを提供することも目的とする。
【０００７】
【課題を解決するための手段】
上記の課題を解決するために、請求項１に記載の画像符号化装置は、入力画像を矩形領域のタイルに分割し、該タイル毎にウェーブレット変換符号化方式により符号化して符号化データを出力する画像符号化装置において、ウェーブレット変換係数群に対して、全サブバンドを符号化の処理単位である符号化ブロックに分割する符号ブロック分割手段と、該符号化ブロック内のウェーブレット変換係数をビットプレーン符号化し、該ビットプレーン符号化して生成された符号列において、所定の符号化データ量を超えた下位ビットプレーンを切り捨てることによって量子化を行うビットプレーン符号化及びポスト量子化手段と、前記量子化における切り捨て位置を記憶する前タイル符号化パラメータ保存手段と、画素値の変化が小さい平坦部に含まれるタイル境界を検出する歪発生境界予測手段と、各タイルのウェーブレット変換係数の分散値を算出して保持するデータ量推定手段と、を備え、前記ビットプレーン符号化及びポスト量子化手段は、符号化対象タイル内の符号化ブロックにおいて前記タイル境界に隣接する符号化ブロックを高精度化対象とし、前記タイル境界が符号化対象タイルと既に符号化済みのタイルとの境界であった場合には、前記前タイル符号化パラメータ保存手段から該符号化済みタイルについての切り捨て位置を読み出し、この位置に基づく符号化済みタイルの量子化誤差と、符号化対象タイルの前記所定の符号化データ量による切り捨て位置に基づく量子化誤差とを比較し、符号化対象タイルの方が符号化済みタイルよりも量子化誤差が大きいときに、当該符号化対象タイルの高精度化対象の符号化ブロックの切り捨て位置を当該符号化済みタイルに合わせるようにし、一方、前記タイル境界が符号化対象タイルと未だ符号化されていないタイルとの境界であった場合には、該符号化対象タイルの分散値と前記タイル境界を挟んで当該符号化対象タイルに隣接する隣接タイルの分散値とを前記データ量推定手段から読み出し、その分散値を比較し、符号化対象タイルの分散値の方が隣接タイルの分散値よりも大きいときに、当該符号化対象タイルの高精度化対象の符号化ブロックの切り捨て位置を、該分散値の差に応じて、前記所定の符号化データ量による切り捨て位置よりも下位の方にずらす、ことを特徴としている。
【０００８】
請求項２に記載の画像符号化装置においては、前記ビットプレーン符号化及びポスト量子化手段は、符号化対象タイル内の符号化ブロックにおいて前記タイル境界に隣接していない符号化ブロックを低精度化対象とし、符号化対象タイルの低精度化対象の符号化ブロックの切り捨て位置を、前記高精度化対象の符号化ブロックについての符号化データ量に応じて、前記所定の符号化データ量による切り捨て位置よりも上位の方にずらす、ことを特徴とする。
【０００９】
請求項３に記載の画像符号化装置においては、前記歪発生境界予測手段は、前記入力画像内のタイル境界部分の所定範囲の画素値の分散を算出し、この分散値に基づいて当該画像領域が平坦であるか否かを判断することを特徴とする。
【００１３】
上記の課題を解決するために、請求項４に記載の画像符号化方法は、入力画像を矩形領域のタイルに分割し、該タイル毎にウェーブレット変換符号化方式により符号化して符号化データを出力する画像符号化方法であって、ウェーブレット変換係数群に対して、全サブバンドを符号化の処理単位である符号化ブロックに分割する第１の過程と、該符号化ブロック内のウェーブレット変換係数をビットプレーン符号化し、該ビットプレーン符号化して生成された符号列において、所定の符号化データ量を超えた下位ビットプレーンを切り捨てることによって量子化を行う第２の過程と、前記量子化における切り捨て位置を記憶する第３の過程と、画素値の変化が小さい平坦部に含まれるタイル境界を検出する第４の過程と、各タイルのウェーブレット変換係数の分散値を算出して保持する第５の過程とを、含み、前記第２の過程は、符号化対象タイル内の符号化ブロックにおいて前記タイル境界に隣接する符号化ブロックを高精度化対象とし、前記タイル境界が符号化対象タイルと既に符号化済みのタイルとの境界であった場合には、前記記憶された該符号化済みタイルについての切り捨て位置に基づく符号化済みタイルの量子化誤差と、符号化対象タイルの前記所定の符号化データ量による切り捨て位置に基づく量子化誤差とを比較し、符号化対象タイルの方が符号化済みタイルよりも量子化誤差が大きいときに、当該符号化対象タイルの高精度化対象の符号化ブロックの切り捨て位置を当該符号化済みタイルに合わせるようにし、一方、前記タイル境界が符号化対象タイルと未だ符号化されていないタイルとの境界であった場合には、前記保持された該符号化対象タイルの分散値と前記タイル境界を挟んで当該符号化対象タイルに隣接する隣接タイルの分散値とを比較し、符号化対象タイルの分散値の方が隣接タイルの分散値よりも大きいときに、当該符号化対象タイルの高精度化対象の符号化ブロックの切り捨て位置を、該分散値の差に応じて、前記所定の符号化データ量による切り捨て位置よりも下位の方にずらす、ことを特徴としている。
請求項５に記載の画像符号化方法においては、前記第２の過程は、符号化対象タイル内の符号化ブロックにおいて前記タイル境界に隣接していない符号化ブロックを低精度化対象とし、符号化対象タイルの低精度化対象の符号化ブロックの切り捨て位置を、前記高精度化対象の符号化ブロックについての符号化データ量に応じて、前記所定の符号化データ量による切り捨て位置よりも上位の方にずらす、ことを特徴とする。
【００１４】
上記の課題を解決するために、請求項６に記載のコンピュータプログラムは、入力画像を矩形領域のタイルに分割し、該タイル毎にウェーブレット変換符号化方式により符号化して符号化データを出力する画像符号化処理を行うためのコンピュータプログラムであって、ウェーブレット変換係数群に対して、全サブバンドを符号化の処理単位である符号化ブロックに分割する第１の機能と、該符号化ブロック内のウェーブレット変換係数をビットプレーン符号化し、該ビットプレーン符号化して生成された符号列において、所定の符号化データ量を超えた下位ビットプレーンを切り捨てることによって量子化を行う第２の機能と、前記量子化における切り捨て位置を記憶する第３の機能と、画素値の変化が小さい平坦部に含まれるタイル境界を検出する第４の機能と、各タイルのウェーブレット変換係数の分散値を算出して保持する第５の機能とを、コンピュータに実現させるコンピュータプログラムであり、前記第２の機能は、符号化対象タイル内の符号化ブロックにおいて前記タイル境界に隣接する符号化ブロックを高精度化対象とし、前記タイル境界が符号化対象タイルと既に符号化済みのタイルとの境界であった場合には、前記記憶された該符号化済みタイルについての切り捨て位置に基づく符号化済みタイルの量子化誤差と、符号化対象タイルの前記所定の符号化データ量による切り捨て位置に基づく量子化誤差とを比較し、符号化対象タイルの方が符号化済みタイルよりも量子化誤差が大きいときに、当該符号化対象タイルの高精度化対象の符号化ブロックの切り捨て位置を当該符号化済みタイルに合わせるようにし、一方、前記タイル境界が符号化対象タイルと未だ符号化されていないタイルとの境界であった場合には、前記保持された該符号化対象タイルの分散値と前記タイル境界を挟んで当該符号化対象タイルに隣接する隣接タイルの分散値とを比較し、符号化対象タイルの分散値の方が隣接タイルの分散値よりも大きいときに、当該符号化対象タイルの高精度化対象の符号化ブロックの切り捨て位置を、該分散値の差に応じて、前記所定の符号化データ量による切り捨て位置よりも下位の方にずらす、ことを特徴としている。
請求項７に記載のコンピュータプログラムにおいては、前記第２の機能は、符号化対象タイル内の符号化ブロックにおいて前記タイル境界に隣接していない符号化ブロックを低精度化対象とし、符号化対象タイルの低精度化対象の符号化ブロックの切り捨て位置を、前記高精度化対象の符号化ブロックについての符号化データ量に応じて、前記所定の符号化データ量による切り捨て位置よりも上位の方にずらす、ことを特徴とする。
これにより、前述の画像符号化装置がコンピュータを利用して実現できるようになる。
【００１５】
【発明の実施の形態】
以下、図面を参照し、本発明の一実施形態について説明する。
図１は、本発明の一実施形態による画像符号化装置の構成を示すブロック図である。この図１に示す画像符号化装置は、タイル化ウェーブレット変換符号化方式（タイル化ＷＴ符号化方式）の基本構成部と、本発明の特徴的な構成部とから構成される。
【００１６】
図１において、タイル化ＷＴ符号化方式の基本構成部は、色変換／ＤＣレベルシフト部１とタイル分割部２とＤＷＴ部３と符号ブロック分割部４と量子化部５とビットプレーン符号化／ポスト量子化部６と符号列順序制御部７からなる。この基本構成部は、ＪＰＥＧ２０００方式による従来の画像符号化装置と略同様であるが、ビットプレーン符号化／ポスト量子化部６については改良している。本発明の特徴的な構成部は、歪発生境界予測部１１とデータ量推定部１２と前タイル符号化パラメータ保存部１３からなる。
【００１７】
初めに、図１に示すタイル化ＷＴ符号化方式の基本構成部について説明する。色変換／ＤＣレベルシフト部１には、入力画像の画像データが入力される。色変換／ＤＣレベルシフト部１は、入力された画像データに対して、符号化効率を高めるための色変換及びＤＣレべルシフトを行う。次いで、タイル分割部２は、入力画像全体を複数の矩形領域（タイル）に分割する。このタイル分割の例を図２に示す。図２の例では、入力画像１０１が９個のタイル１〜９＿１１０に分割されている。
以降は各タイル１１０ごとに、タイル１からタイル２，タイル３，タイル４，…，タイル８，タイル９へと左上から右下に向かってラスタ順に処理を実行する。
【００１８】
ＤＷＴ（離散ウェーブレット変換）部３は、タイル１１０をＤＷＴにより図３に示すような複数の周波数領域（サブバンド）のＤＷＴ係数群２０１に変換する。図３において、サブバンド「ＬＬｉ」の領域には水平および垂直方向ともに低周波数領域に属するＤＷＴ係数が位置している。サブバンド「ＨＬｉ」の領域には水平方向で高周波領域且つ垂直方向で低周波数領域に属するＤＷＴ係数が位置している。サブバンド「ＬＨｉ」の領域には水平方向で低周波領域且つ垂直方向で高周波数領域に属するＤＷＴ係数が位置している。サブバンド「ＨＨｉ」の領域には水平および垂直方向ともに高周波数領域に属するＤＷＴ係数が位置している。但し、ｉは２次元ＤＷＴを繰り返し行った回数である。図３の例は、情報量の多い低周波数領域に対して２次元ＤＷＴを繰り返し３回おこなった場合のサブバンド状態となっている。
【００１９】
次いで、符号ブロック分割部４は、ＤＷＴ係数群２０１に対して、全サブバンドを符号化の処理単位である符号化ブロック２１０に分割する。図４に符号化ブロック分割の例を示す。図４の例では、符号化ブロック２１０の大きさは全サブバンドにおいて一定としている。
【００２０】
次いで、量子化部５は、スカラー量子化と呼ばれる一般的な量子化方法によって符号化ブロック２１０内のＤＷＴ係数値を量子化する。このスカラー量子化では、符号化対象係数を量子化ステップで除算し、係数のダイナミックレンジを削減する。
なお、この量子化部５は符号ブロック分割部４の前に設けてもよい。また、量子化部５の処理実行の有無については、適宜可変としてもよい。例えば、非可逆符号化モードの場合に実行有りとし、可逆符号化モードの場合には実行なしとする。
【００２１】
次いで、ビットプレーン符号化／ポスト量子化部６は、ＤＷＴ係数をビットプレーン符号化して生成された符号列の下位ビットプレーンを切り捨てることによって量子化（ポスト量子化）を行う。
先ずビットプレーン符号化では、符号化ブロック２１０内のＤＷＴ係数値あるいはそのスカラー量子化値を、正負を表すビットと絶対値とに分ける。次いで、その絶対値を自然２進数によりビットプレーン表現し、上位のビットプレーンから順にビットプレーン符号化を行う。但し、最初に、必ずどこかにビット１が含まれるビットプレーンから符号化は開始され、ある変換係数において最初にビット１が発生した場合、直ちに正負を表すビットを符号化する。なお、符号化開始のビットプレーンより上位で全ビットが０であるビットプレーンの情報は別途復号側へ送信される。
【００２２】
図５にビットプレーン符号化の概念を示す。図５に示すように、一つの符号化ブロック２１０内のＤＷＴ係数値あるいはそのスカラー量子化値の絶対値について、そのＭＳＢからＬＳＢまでの第１〜第Ｎビットプレーン３０１をビット桁ごとに構成する。これら第１〜第Ｎビットプレーン３０１において、上位ビットプレーンから順にビット値を参照し、０でないビットが初めて出現するビットプレーンを第ｍビットプレーン３０１とすると、第１ビットプレーン３０１から第（ｍ−１）ビットプレーン３０１までは符号化処理を行わない。そして、第ｍビットプレーン３０１から第Ｎビットプレーン３０１まで符号化処理を行う。最初に符号化される第ｍビットプレーン３０１を除く、第（ｍ＋１）ビットプレーン３０１から第Ｎビットプレーン３０１については、それぞれ３つの符号化パス１〜３に分割されて符号化される。これら３つのパス１〜３ヘの分類は、より画像の精練化に対する寄与度の高いビットが優先度の高いパス１に含まれるよう、符号化対象係数と隣接する周辺８係数の値から決定される。
【００２３】
符号化の優先順位はパス１が最優先で次がパス２、最後がパス３である。なお、第ｍビットプレーン３０１はパス３のみである。したがって、符号化の順番は、第ｍビットプレーン３０１のパス３から始まり、順次下位の第（ｍ＋１）〜第（Ｎ−１）ビットプレーン３０１のパス１，パス２，パス３を行い、最後がＬＳＢの第Ｎビットプレーン３０１のパス１，パス２，パス３となる。
【００２４】
次に、ポスト量子化では、上記符号化の順番に従って上位のパスからビットプレーン符号化が行われ、この結果として生成された符号化データの量が所定量に達したところで、それ以降の下位の符号化パスについては符号化を行わず、切り捨てることによって量子化を行う。例えば、第ｍビットプレーン３０１のパス３から順次符号化を実行し、第（ｍ＋２）ビットプレーン３０１のパス１まで符号化が完了したところで所定の符号化データ量となった場合、ここで符号化を中止する。これにより、第（ｍ＋２）ビットプレーン３０１のパス２からＬＳＢの第Ｎビットプレーン３０１までの全パスについては符号化されず、その分の情報量が廃棄されることになる。
【００２５】
なお、符号化を中止するか否かの判定に使用される所定の符号化データ量は、利用者によって任意に設定可能であり、予め指定される。あるいは、全ての符号化ブロック２１０において同程度の量子化誤差が含まれるように決定するようにしてもよい。
このようにして決定された所定の符号化データ量により符号化が中止された位置は、通常の符号化中止位置Ｐ１である。この符号化中止位置Ｐ１により通常の量子化精度が決まる。
上記ビットプレーン符号化／ポスト量子化部６の動作は従来と同様であり、本実施形態において改良した内容については後述する。
【００２６】
次いで、符号列順序制御部７は、符号系列の並び順を制御して符号化データを出力する。
【００２７】
次に、歪発生境界予測部１１とデータ量推定部１２と前タイル符号化パラメータ保存部１３からなる本発明の特徴的な構成部と、ビットプレーン符号化／ポスト量子化部６の改良内容について説明する。
本実施形態では、これら構成によって、視覚的にタイル歪の影響が大きい箇所を予測し、該箇所の量子化精度を通常より上げることにより、効率的にタイル歪を軽減させて再生画像の品質向上を図る。
【００２８】
一般に、画素値の変化が小さい平坦な画像領域（平坦部）では、歪みに対する人の視覚感度が高いために、画素値の変化が大きい画像領域（変動部）に比べてタイル歪が検知されやすいことが知られている。したがって、平坦部を含むタイル境界部分は、変動部を含むタイル境界部分に比べて視覚的にタイル歪の影響が大きくなるので、再生画像の品質を向上させるためには、平坦部を含むタイル境界部分のタイル歪を軽減させるのがより効果的である。
このような知見に基づき、本実施形態では、視覚的にタイル歪の影響が大きい箇所として平坦部を含むタイル境界部分を検出し、該タイル境界部分の量子化精度を通常よりも上げることによって該当するタイル歪を軽減させる。これにより、再生画像品質向上のために、効率よくタイル歪を軽減させることを実現する。
【００２９】
先ず、歪発生境界予測部１１について説明する。歪発生境界予測部１１は、平坦部に含まれるタイル境界を検出する。ここで検出されたタイル境界は、後段のビットプレーン符号化／ポスト量子化部６の符号化処理において歪補正対象となる。
図６を参照して、歪発生境界予測部１１が平坦部に含まれるタイル境界を検出する動作を説明する。図６は該検出動作の概念図である。図６において、タイル境界４１０は、図２のタイル２＿１１０とタイル５＿１１０の境界である。画素値分散調査領域４０１は、自領域内の画素値の分散を算出する所定の矩形領域である。
【００３０】
歪発生境界予測部１１は、タイル境界４１０のいずれかの端に画素値分散調査領域４０１を設定し、当該調査領域に含まれる画素値の分散を算出する。そして、この分散値が所定値以下であるときに、当該調査領域が平坦部であると判断する。一方、当該調査領域が平坦部でなかった場合には、この画素値分散調査領域４０１をタイル境界４１０に沿ってずらして再度、当該調査領域に含まれる画素値の分散を算出し、この分散値により当該調査領域が平坦部であるか否かを判断する。この処理を繰り返し実行し、該タイル境界４１０の端から端まで全てにおいて平坦部が検出されなかった場合のみ、当該タイル境界４１０は平坦部に含まれるものではないと判断する。言い換えれば、一つでも平坦部であると判断された調査領域があれば、当該タイル境界４１０を平坦部に含まれるタイル境界として検出する。
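The boundary scan described in this paragraph can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the distortion generation boundary prediction unit 11: the function name `boundary_is_flat`, the window dimensions, the non-overlapping window step, and the threshold value are all hypothetical, and the image is assumed to be a list of pixel rows crossed by a horizontal tile boundary.

```python
from statistics import pvariance


def boundary_is_flat(image, boundary_row, window_w, half_h, threshold):
    """Return True if any window along a horizontal tile boundary is flat.

    image        : list of rows of pixel values (assumed layout)
    boundary_row : index of the first row below the boundary
    window_w     : width of the variance survey window (assumed parameter)
    half_h       : rows taken on each side of the boundary (assumed parameter)
    threshold    : a window counts as flat when its variance is <= threshold
    """
    width = len(image[0])
    top = max(boundary_row - half_h, 0)
    bottom = min(boundary_row + half_h, len(image))
    # Slide the survey window from one end of the boundary to the other.
    for x0 in range(0, width - window_w + 1, window_w):
        pixels = [image[y][x]
                  for y in range(top, bottom)
                  for x in range(x0, x0 + window_w)]
        if pvariance(pixels) <= threshold:
            return True   # one flat window is enough to flag the boundary
    return False          # no flat window anywhere along the boundary
```

The early return mirrors the rule in the text: the boundary is rejected only when no window along its full length is flat.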
【００３１】
次に、データ量推定部１２について説明する。データ量推定部１２は、各タイル１１０のＤＷＴ係数の分散値を算出して保持する。これにより、各タイル１１０で生成される符号化データ量を事前にある程度推定することができる。これら推定した符号化データ量は、タイル１１０の符号化時に、複数のタイル１１０間における互いの生成データ量や画質（量子化精度）の制御に用いることが可能であり、該制御によって画像全体としてより高品質な符号化を行うことができる。
【００３２】
なお、各タイル１１０で生成される符号化データ量を事前に推定することなく、複数のタイル１１０の符号化を同時に行いながら互いの生成データ量や画質を制御することも考えられるが、このためには必要なメモリ量が増大する。しかしながら、タイル化ＷＴ符号化方式を適用するような場合には、使用可能なメモリ量に制約があることが多く、実用化には不向きである。このような理由からも、本実施形態のように、各タイル１１０のＤＷＴ係数の分散値を算出して保持しておき、これにより、各タイル１１０で生成される符号化データ量を事前にある程度推定可能とすることは、非常に有用である。
【００３３】
次に、前タイル符号化パラメータ保存部１３について説明する。前タイル符号化パラメータ保存部１３は、ビットプレーン符号化／ポスト量子化部６のポスト量子化の際に符号化を中止した位置を記憶する。すなわち、前タイル符号化パラメータ保存部１３には、ある符号化ブロック２１０について、どの符号化パスまで符号化して、どの符号化パスから切り捨てられたかの情報が保存される。この情報に基づいて当該符号化ブロック２１０の量子化誤差を求めることができる。
【００３４】
次に、ビットプレーン符号化／ポスト量子化部６の改良内容について説明する。ビットプレーン符号化／ポスト量子化部６は、従来の機能に加えて、タイル歪を軽減するための制御機能を備える。以下、このタイル歪軽減制御機能について説明する。
ビットプレーン符号化／ポスト量子化部６は、歪発生境界予測部１１によって検出された「平坦部に含まれるタイル境界」、すなわち歪補正対象のタイル境界に隣接するタイルを、歪補正対象のタイルとする。
【００３５】
先ず、歪補正対象のタイルを符号化する際に、該符号化対象タイル内の符号化ブロックの中から、タイル歪補正対象となる符号化ブロック、すなわち高い量子化精度で量子化する符号化ブロックの選択方法を説明する。
図７に示すように、一つのタイル１１０に対応するＤＷＴ係数群２０１の各サブバンドの符号化ブロック２１０を、高い量子化精度で量子化する候補の符号化ブロック５０１と低い量子化精度で量子化する候補の符号化ブロック５０２に分類する。高精度化候補の符号化ブロック５０１は、タイル境界に隣接する符号化ブロック２１０であり、低精度化候補の符号化ブロック５０２はタイル境界に隣接していない符号化ブロック２１０である。
【００３６】
ビットプレーン符号化／ポスト量子化部６は、符号化対象タイルについての高精度化候補の符号化ブロック５０１のうち、歪補正対象のタイル境界に隣接する符号化ブロック２１０を、高精度化対象の符号化ブロックとして選択する。例えば、図２のタイル５＿１１０を符号化する際、タイル５＿１１０の上側に位置するタイル２＿１１０とのタイル境界が歪補正対象であったとする。この場合、図８に示すように、タイル５＿１１０についての各サブバンドの符号化ブロック２１０のうち、タイル２＿１１０とのタイル境界に隣接する符号化ブロック２１０、すなわち上辺の符号化ブロック２１０全てを高精度化対象の符号化ブロック６０１とする。
【００３７】
また、他の例として、符号化対象であるタイル５＿１１０の左側に位置するタイル４＿１１０とのタイル境界が歪補正対象であった場合には、図９に示すように、タイル５＿１１０についての各サブバンドの符号化ブロック２１０のうち、タイル４＿１１０とのタイル境界に隣接する符号化ブロック２１０、すなわち左辺の符号化ブロック２１０全てを高精度化対象の符号化ブロック６０１とする。
【００３８】
同様に、符号化対象タイルの下側に位置するタイルとのタイル境界が歪補正対象であった場合には、符号化対象タイルについての各サブバンドの下辺の符号化ブロック２１０全てを高精度化対象の符号化ブロック６０１とする。また、符号化対象タイルの右側に位置するタイルとのタイル境界が歪補正対象であった場合には、符号化対象タイルについての各サブバンドの右辺の符号化ブロック２１０全てを高精度化対象の符号化ブロック６０１とする。
【００３９】
次に、高精度化対象の符号化ブロック６０１のポスト量子化方法を説明する。タイルごとの符号化処理は、図２のタイル１＿１１０からタイル９＿１１０へと、すなわち左上から右下ヘとラスタ順に実行されるものとする。
ビットプレーン符号化／ポスト量子化部６は、歪補正対象のタイルの高精度化対象の符号化ブロック６０１について、量子化精度を上げるために、符号化を中止する位置を以下の（１）、（２）のようにして決定する。
【００４０】
（１）歪補正対象のタイル境界が、符号化対象タイルと既に符号化済みのタイル（符号化対象タイルの上側あるいは左側のタイル）との境界であった場合。
この場合には、前タイル符号化パラメータ保存部１３から該符号化済みタイルについての符号化を中止した位置を読み出す。そして、この位置に基づく符号化済みタイルの量子化誤差と、符号化対象タイルの通常の符号化中止位置Ｐ１に基づく量子化誤差を比較する。この比較の結果、符号化対象タイルの方が符号化済みタイルよりも量子化誤差が大きい場合に、符号化対象タイルの高精度化対象の符号化ブロック６０１の量子化精度を通常よりも上げるようにする。このために、符号化対象タイルの符号化中止位置を通常の符号化中止位置Ｐ１よりも後ろにずらすが、符号化対象タイルの量子化誤差が符号化済みタイルの量子化誤差と同程度になるように、今回の符号化中止位置を決定する。例えば、前タイル符号化パラメータ保存部１３から読み出した符号化済みタイルの符号化中止位置を、当該符号化対象タイルの高精度化対象の符号化ブロック６０１の符号化中止位置とする。
【００４１】
（２）歪補正対象のタイル境界が、符号化対象タイルと未だ符号化されていないタイル（符号化対象タイルの下側あるいは右側のタイル）との境界であった場合。
この場合には、歪補正対象のタイル境界を挟んで符号化対象タイルに隣接するタイルの量子化誤差が不明である。そこで、データ量推定部１２において算出され保持されているＤＷＴ係数の分散値のうち、符号化対象タイルの分散値と該隣接タイルの分散値をデータ量推定部１２から読み出して比較する。この比較の結果、符号化対象タイルの分散値の方が隣接タイルの分散値よりも大きい場合、後から符号化される隣接タイルの量子化精度のほうが高いと予想されるので、符号化対象タイルの高精度化対象の符号化ブロック６０１の量子化精度を通常よりも上げるようにする。このために、符号化対象タイルの符号化中止位置を通常の符号化中止位置Ｐ１よりも後ろにずらすが、どの程度ずらすかは隣接タイルの分散値と符号化対象タイルの分散値の差に応じて決定する。
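Cases (1) and (2) can be summarized in a short sketch. It rests on several assumptions that the text does not fix: a truncation position is modeled as the number of coding passes kept, fewer kept passes are assumed to mean a larger quantization error, and the mapping from variance difference to shift (the hypothetical `gain` parameter, rounded up and capped at `max_cut`) is purely illustrative.

```python
import math


def cut_against_coded_tile(p1, neighbor_cut):
    """Case (1): the neighboring tile across the boundary is already coded.

    p1           : normal truncation position P1 of the current tile
    neighbor_cut : stored truncation position of the coded neighbor
    Positions count coding passes kept (assumed representation).
    """
    # Assumption: keeping fewer passes implies a larger quantization error.
    if p1 < neighbor_cut:
        return neighbor_cut   # match the neighbor's truncation position
    return p1                 # current tile is already at least as precise


def cut_against_uncoded_tile(p1, var_current, var_neighbor, gain, max_cut):
    """Case (2): the neighboring tile has not been coded yet.

    The shift grows with the variance difference; `gain` is a hypothetical
    tuning constant, since the text only says the shift depends on the
    difference.
    """
    if var_current <= var_neighbor:
        return p1             # no precision increase needed
    extra = math.ceil(gain * (var_current - var_neighbor))
    return min(p1 + extra, max_cut)
```

In both cases the returned position would apply only to the high-precision target coding blocks 601 next to the boundary being corrected.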
【００４２】
また、上記（１），（２）どちらの場合においても、後で符号化される隣接タイル（右または下の隣接タイル）の符号化の際に必要となるため、符号化対象タイルの今回の符号化中止位置を前タイル符号化パラメータ保存部１３に記憶しておく。
【００４３】
また、上記（１），（２）により高精度化対象の符号化ブロック６０１について量子化精度を上げると、その分、符号化データ量が増大する。そこで、この増加分を相殺するために、ビットプレーン符号化／ポスト量子化部６は、歪補正対象のタイルの低精度化候補の符号化ブロック５０２（図７参照）について、量子化精度を下げて符号化データ量を減少させる。このために、符号化対象タイルの符号化中止位置を通常の符号化中止位置Ｐ１よりも前にずらすが、どの程度ずらすかは相殺分の符号化データ量に応じて決定する。
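One way to picture the offsetting step is the sketch below. The text states only that the shift depends on the amount of encoded data to be offset, so the round-robin strategy, the function name `offset_extra_bytes`, and the per-block data layout are assumptions made for illustration.

```python
def offset_extra_bytes(blocks, extra_bytes):
    """Drop trailing coding passes from low-precision candidate blocks.

    blocks      : list of dicts {"sizes": [bytes per pass], "cut": passes kept}
    extra_bytes : bytes added by raising precision in the boundary blocks
    Returns (new cuts, bytes recovered). Round-robin order is an assumption.
    """
    cuts = [b["cut"] for b in blocks]
    recovered = 0
    progress = True
    while recovered < extra_bytes and progress:
        progress = False
        for i, block in enumerate(blocks):
            if recovered >= extra_bytes:
                break
            if cuts[i] > 0:
                cuts[i] -= 1                        # move the cut one pass earlier
                recovered += block["sizes"][cuts[i]]
                progress = True
    return cuts, recovered
```

Removing one pass at a time from each block spreads the precision loss across the interior of the tile instead of concentrating it in a single block.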
【００４４】
なお、本実施形態においては、歪発生境界予測部１１が予測手段に対応する。また、ビットプレーン符号化／ポスト量子化部６とデータ量推定部１２と前タイル符号化パラメータ保存部１３が量子化精度制御手段に対応する。
【００４５】
また、図１に示す画像符号化装置が行う各処理を実現するためのプログラムをコンピュータ読み取り可能な記録媒体に記録して、この記録媒体に記録されたプログラムをコンピュータシステムに読み込ませ、実行することにより画像符号化処理を行ってもよい。なお、ここでいう「コンピュータシステム」とは、ＯＳや周辺機器等のハードウェアを含むものであってもよい。
また、「コンピュータシステム」は、ＷＷＷシステムを利用している場合であれば、ホームページ提供環境（あるいは表示環境）も含むものとする。
また、「コンピュータ読み取り可能な記録媒体」とは、フレキシブルディスク、光磁気ディスク、ＲＯＭ、ＣＤ−ＲＯＭ等の可搬媒体、コンピュータシステムに内蔵されるハードディスク等の記憶装置のことをいう。
【００４６】
さらに「コンピュータ読み取り可能な記録媒体」とは、インターネット等のネットワークや電話回線等の通信回線を介してプログラムが送信された場合のサーバやクライアントとなるコンピュータシステム内部の揮発性メモリ（ＲＡＭ）のように、一定時間プログラムを保持しているものも含むものとする。
また、上記プログラムは、このプログラムを記憶装置等に格納したコンピュータシステムから、伝送媒体を介して、あるいは、伝送媒体中の伝送波により他のコンピュータシステムに伝送されてもよい。ここで、プログラムを伝送する「伝送媒体」は、インターネット等のネットワーク（通信網）や電話回線等の通信回線（通信線）のように情報を伝送する機能を有する媒体のことをいう。
また、上記プログラムは、前述した機能の一部を実現するためのものであっても良い。さらに、前述した機能をコンピュータシステムにすでに記録されているプログラムとの組み合わせで実現できるもの、いわゆる差分ファイル（差分プログラム）であっても良い。
【００４７】
以上、本発明の実施形態を図面を参照して詳述してきたが、具体的な構成はこの実施形態に限られるものではなく、本発明の要旨を逸脱しない範囲の設計変更等も含まれる。
【００４８】
【発明の効果】
以上説明したように、本発明によれば、再生画像において視覚的に影響するタイル歪が軽減されるので、再生画像品質を向上させる上で効果的であり、タイル化ＷＴ符号化方式により画像を符号化する際、効率的にタイル歪を軽減させることができる。この結果、原画像に近い再生画像がより効率的に得られるという優れた効果を奏する。
【００４９】
また、タイル化処理の利点であるタイルごとに符号化処理を独立して行い、使用メモリ量を節約するという効果を損なうことなく、各タイル間での符号化状況に応じて符号化処理を制御することが可能となる。
【００５０】
また、本発明の画像符号化装置および方法により生成された符号化データは、従来の復号装置および方法を変更することなく再生することができる。したがって、本発明をＪＰＥＧ２０００方式の画像符号化装置および方法に適用することにより、ＪＰＥＧ２０００方式の画像復号装置および方法を変更することなく、再生画像のタイル歪を軽減して再生画像品質を向上させることが可能となる。
【図面の簡単な説明】
【図１】本発明の一実施形態による画像符号化装置の構成を示すブロック図である。
【図２】タイル分割の一例を示す図である。
【図３】ＤＷＴ係数群２０１への変換の一例を示す図である。
【図４】符号化ブロック分割の一例を示す図である。
【図５】ビットプレーン符号化の概念を示す図である。
【図６】歪発生境界予測部１１が行う平坦部に含まれるタイル境界の検出動作の概念を示す図である。
【図７】高精度化候補の符号化ブロック５０１と低精度化候補の符号化ブロック５０２の分類例を示す図である。
【図８】高い量子化精度で量子化する符号化ブロック６０１の選択例を示す第１の図である。
【図９】高い量子化精度で量子化する符号化ブロック６０１の選択例を示す第２の図である。
【符号の説明】
１…色変換／ＤＣレベルシフト部、２…タイル分割部、３…ＤＷＴ部、４…符号ブロック分割部、５…量子化部、６…ビットプレーン符号化／ポスト量子化部、７…符号列順序制御部、１１…歪発生境界予測部、１２…データ量推定部、１３…前タイル符号化パラメータ保存部
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to the field of digital image processing, and in particular to an image encoding device and an image encoding method for encoding image data with high efficiency, and to a computer program for realizing the image encoding device using a computer.
[0002]
[Prior art]
Conventionally, an image coding method called JPEG2000 has been standardized by the international standardization bodies ISO (International Organization for Standardization) and IEC (International Electrotechnical Commission) in "ISO/IEC 15444-1, 'Information technology - JPEG2000 image coding system - Part 1: Core coding system', ISO/IEC JTC 1/SC 29 WG1, Jan. 2001". JPEG2000 is one of the image coding methods that use the wavelet transform (WT) (hereinafter referred to as WT coding methods). The WT coding method is attracting attention because it enables higher compression and higher quality image compression than the DCT coding method (for example, the method known as JPEG).
[0003]
In the WT coding method, the horizontal and vertical frequency components obtained by dividing the entire image into frequency bands with a wavelet function can be quantized and encoded for compression. In JPEG2000 and similar schemes, however, the entire image is generally divided into a plurality of rectangular regions (tiles), and the DWT (discrete wavelet transform), quantization, and encoding are performed independently within each tile. Because this tiled WT coding method reduces the amount of memory needed in a hardware implementation, it is very important in practice.
[0004]
[Problems to be solved by the invention]
However, in the conventional tiled WT encoding method described above, when encoding is performed at a low encoding bit rate, quantization error causes distortion (tile distortion) at the tile boundaries in the reproduced image, and the quality of the reproduced image is degraded.
[0005]
The present invention has been made in consideration of such circumstances, and its object is to provide an image encoding device and an image encoding method capable of efficiently reducing tile distortion, in order to improve the quality of a reproduced image, when an image is encoded by the tiled WT encoding method.
[0006]
Another object of the present invention is to provide a computer program for realizing the image encoding apparatus using a computer.
[0007]
[Means for Solving the Problems]
In order to solve the above-described problem, the image encoding device according to claim 1 is an image encoding device that divides an input image into rectangular tiles, encodes each tile by a wavelet transform encoding method, and outputs encoded data, the device comprising: code block dividing means for dividing all subbands of a wavelet transform coefficient group into coding blocks, which are the processing units of encoding; bit-plane encoding and post-quantization means for bit-plane encoding the wavelet transform coefficients in each coding block and performing quantization by truncating, in the code string generated by the bit-plane encoding, the lower bit planes that exceed a predetermined amount of encoded data; previous tile coding parameter storage means for storing the truncation position used in the quantization; distortion generation boundary prediction means for detecting a tile boundary contained in a flat portion where the change in pixel values is small; and data amount estimation means for calculating and holding the variance of the wavelet transform coefficients of each tile; wherein the bit-plane encoding and post-quantization means treats the coding blocks in the encoding target tile that are adjacent to the tile boundary as high-accuracy targets; when the tile boundary is a boundary between the encoding target tile and an already encoded tile, reads the truncation position of the encoded tile from the previous tile coding parameter storage means, compares the quantization error of the encoded tile based on that position with the quantization error of the encoding target tile based on the truncation position given by the predetermined amount of encoded data, and, when the quantization error of the encoding target tile is larger than that of the encoded tile, matches the truncation position of the high-accuracy target coding blocks of the encoding target tile to that of the encoded tile; and, when the tile boundary is a boundary between the encoding target tile and a tile that has not yet been encoded, reads from the data amount estimation means the variance of the encoding target tile and the variance of the adjacent tile on the other side of the tile boundary, compares the two variances, and, when the variance of the encoding target tile is larger than that of the adjacent tile, shifts the truncation position of the high-accuracy target coding blocks of the encoding target tile to a lower position than the truncation position given by the predetermined amount of encoded data, by an amount that depends on the difference between the variances.
[0008]
In the image encoding device according to claim 2, the bit-plane encoding and post-quantization means treats the coding blocks in the encoding target tile that are not adjacent to the tile boundary as low-accuracy targets, and shifts the truncation position of the low-accuracy target coding blocks of the encoding target tile to a higher position than the truncation position given by the predetermined amount of encoded data, by an amount that depends on the amount of encoded data of the high-accuracy target coding blocks.
[0009]
In the image encoding device according to claim 3, the distortion generation boundary prediction means calculates the variance of the pixel values within a predetermined range around a tile boundary in the input image and determines, based on that variance, whether or not the image region is flat.
[0013]
In order to solve the above-described problem, the image encoding method according to claim 4 is an image encoding method that divides an input image into rectangular tiles, encodes each tile by a wavelet transform encoding method, and outputs encoded data, the method comprising: a first step of dividing all subbands of a wavelet transform coefficient group into coding blocks, which are the processing units of encoding; a second step of bit-plane encoding the wavelet transform coefficients in each coding block and performing quantization by truncating, in the code string generated by the bit-plane encoding, the lower bit planes that exceed a predetermined amount of encoded data; a third step of storing the truncation position used in the quantization; a fourth step of detecting a tile boundary contained in a flat portion where the change in pixel values is small; and a fifth step of calculating and holding the variance of the wavelet transform coefficients of each tile; wherein the second step treats the coding blocks in the encoding target tile that are adjacent to the tile boundary as high-accuracy targets; when the tile boundary is a boundary between the encoding target tile and an already encoded tile, compares the quantization error of the encoded tile based on the stored truncation position of the encoded tile with the quantization error of the encoding target tile based on the truncation position given by the predetermined amount of encoded data, and, when the quantization error of the encoding target tile is larger than that of the encoded tile, matches the truncation position of the high-accuracy target coding blocks of the encoding target tile to that of the encoded tile; and, when the tile boundary is a boundary between the encoding target tile and a tile that has not yet been encoded, compares the held variance of the encoding target tile with the variance of the adjacent tile on the other side of the tile boundary, and, when the variance of the encoding target tile is larger than that of the adjacent tile, shifts the truncation position of the high-accuracy target coding blocks of the encoding target tile to a lower position than the truncation position given by the predetermined amount of encoded data, by an amount that depends on the difference between the variances.
In the image encoding method according to claim 5, the second step treats the coding blocks in the encoding target tile that are not adjacent to the tile boundary as low-accuracy targets, and shifts the truncation position of the low-accuracy target coding blocks of the encoding target tile to a higher position than the truncation position given by the predetermined amount of encoded data, by an amount that depends on the amount of encoded data of the high-accuracy target coding blocks.
[0014]
In order to solve the above-described problem, the computer program according to claim 6 is a computer program for performing image encoding processing that divides an input image into rectangular tiles, encodes each tile by a wavelet transform encoding method, and outputs encoded data, the computer program causing a computer to realize: a first function of dividing all subbands of a wavelet transform coefficient group into coding blocks, which are the processing units of encoding; a second function of bit-plane encoding the wavelet transform coefficients in each coding block and performing quantization by truncating, in the code string generated by the bit-plane encoding, the lower bit planes that exceed a predetermined amount of encoded data; a third function of storing the truncation position used in the quantization; a fourth function of detecting a tile boundary contained in a flat portion where the change in pixel values is small; and a fifth function of calculating and holding the variance of the wavelet transform coefficients of each tile; wherein the second function treats the coding blocks in the encoding target tile that are adjacent to the tile boundary as high-accuracy targets; when the tile boundary is a boundary between the encoding target tile and an already encoded tile, compares the quantization error of the encoded tile based on the stored truncation position of the encoded tile with the quantization error of the encoding target tile based on the truncation position given by the predetermined amount of encoded data, and, when the quantization error of the encoding target tile is larger than that of the encoded tile, matches the truncation position of the high-accuracy target coding blocks of the encoding target tile to that of the encoded tile; and, when the tile boundary is a boundary between the encoding target tile and a tile that has not yet been encoded, compares the held variance of the encoding target tile with the variance of the adjacent tile on the other side of the tile boundary, and, when the variance of the encoding target tile is larger than that of the adjacent tile, shifts the truncation position of the high-accuracy target coding blocks of the encoding target tile to a lower position than the truncation position given by the predetermined amount of encoded data, by an amount that depends on the difference between the variances.
In the computer program according to claim 7, the second function treats the coding blocks in the encoding target tile that are not adjacent to the tile boundary as low-accuracy targets, and shifts the truncation position of the low-accuracy target coding blocks of the encoding target tile to a higher position than the truncation position given by the predetermined amount of encoded data, by an amount that depends on the amount of encoded data of the high-accuracy target coding blocks.
  As a result, the above-described image encoding apparatus can be realized using a computer.
[0015]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram showing a configuration of an image encoding device according to an embodiment of the present invention. The image encoding apparatus shown in FIG. 1 includes a basic configuration unit of a tiled wavelet transform encoding method (tiled WT encoding method) and a characteristic configuration unit of the present invention.
[0016]
In FIG. 1, the basic components of the tiled WT coding method are the color conversion / DC level shift unit 1, the tile division unit 2, the DWT unit 3, the code block division unit 4, the quantization unit 5, the bit plane encoding / post-quantization unit 6, and the code string order control unit 7. These basic components are substantially the same as those of a conventional image encoding apparatus based on the JPEG2000 system, except that the bit plane encoding / post-quantization unit 6 has been improved. The characteristic components of the present invention are a distortion generation boundary prediction unit 11, a data amount estimation unit 12, and a previous tile coding parameter storage unit 13.
[0017]
First, the basic components of the tiled WT coding method shown in FIG. 1 will be described. The color conversion / DC level shift unit 1 receives image data of an input image. The color conversion / DC level shift unit 1 performs color conversion and DC level shift for improving encoding efficiency with respect to input image data. Next, the tile dividing unit 2 divides the entire input image into a plurality of rectangular areas (tiles). An example of this tile division is shown in FIG. In the example of FIG. 2, the input image 101 is divided into nine tiles 1 to 9_110.
Thereafter, processing is executed for each tile 110 in raster order from the upper left to the lower right, from tile 1 to tile 2, tile 3, tile 4, ..., tile 8, and tile 9.
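As a rough illustration of the tile division and processing order, the sketch below enumerates tile rectangles row by row starting from the top-left tile, which is the raster order given later in the description. The function name `split_into_tiles` and the tuple layout are assumptions for this example; the tile size is a free parameter.

```python
def split_into_tiles(width, height, tile_w, tile_h):
    """Return (index, x0, y0, x1, y1) tile rectangles in raster order.

    Tile indices start at 1; x1 and y1 are exclusive bounds. Tiles on the
    right and bottom edges are clipped to the image size.
    """
    tiles = []
    index = 1
    for y0 in range(0, height, tile_h):        # top row of tiles first
        for x0 in range(0, width, tile_w):     # left to right within a row
            x1 = min(x0 + tile_w, width)
            y1 = min(y0 + tile_h, height)
            tiles.append((index, x0, y0, x1, y1))
            index += 1
    return tiles
```

For a 300 x 300 image with 100 x 100 tiles this yields nine tiles, with tile 5 in the center, matching the layout of the nine-tile example in FIG. 2.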
[0018]
The DWT (discrete wavelet transform) unit 3 transforms the tile 110 by DWT into a DWT coefficient group 201 made up of a plurality of frequency regions (subbands), as shown in FIG. 3. In FIG. 3, the subband "LLi" region contains the DWT coefficients that belong to the low-frequency region in both the horizontal and vertical directions. The subband "HLi" region contains the DWT coefficients that belong to the high-frequency region in the horizontal direction and the low-frequency region in the vertical direction. The subband "LHi" region contains the DWT coefficients that belong to the low-frequency region in the horizontal direction and the high-frequency region in the vertical direction. The subband "HHi" region contains the DWT coefficients that belong to the high-frequency region in both the horizontal and vertical directions. Here, i is the number of times the two-dimensional DWT has been applied. The example of FIG. 3 shows the subband layout obtained when the two-dimensional DWT is applied three times in succession to the low-frequency region, which carries a large amount of information.
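To make the subband naming concrete, here is a one-level decomposition using an unnormalized Haar filter. The Haar filter is a stand-in chosen for brevity and is not stated in the text (JPEG2000 itself specifies other wavelet filters); the function name `haar_dwt2` is likewise an assumption.

```python
def haar_dwt2(block):
    """One level of an unnormalized 2-D Haar transform on a list of rows.

    Returns a dict with the four subbands LL, HL, LH, HH.
    Width and height must be even.
    """
    rows = len(block)
    cols = len(block[0])
    # Horizontal pass: low = average, high = half difference of pixel pairs.
    lo = [[(r[2 * j] + r[2 * j + 1]) / 2 for j in range(cols // 2)] for r in block]
    hi = [[(r[2 * j] - r[2 * j + 1]) / 2 for j in range(cols // 2)] for r in block]

    def vertical(band):
        # Vertical pass over row pairs of an already filtered band.
        width = len(band[0])
        low = [[(band[2 * i][j] + band[2 * i + 1][j]) / 2 for j in range(width)]
               for i in range(rows // 2)]
        high = [[(band[2 * i][j] - band[2 * i + 1][j]) / 2 for j in range(width)]
                for i in range(rows // 2)]
        return low, high

    ll, lh = vertical(lo)   # horizontally low:  vertical low -> LL, high -> LH
    hl, hh = vertical(hi)   # horizontally high: vertical low -> HL, high -> HH
    return {"LL": ll, "HL": hl, "LH": lh, "HH": hh}
```

Applying the same function again to the returned LL band would give the second decomposition level, which is how repeated application to the low-frequency region produces the nested layout described above.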
[0019]
Next, the code block division unit 4 divides all the subbands into coding blocks 210 which are coding processing units for the DWT coefficient group 201. FIG. 4 shows an example of coding block division. In the example of FIG. 4, the size of the encoding block 210 is constant in all subbands.
[0020]
Next, the quantization unit 5 quantizes the DWT coefficient value in the coding block 210 by a general quantization method called scalar quantization. In this scalar quantization, the coefficient to be encoded is divided by the quantization step to reduce the dynamic range of the coefficient.
The quantization unit 5 may be provided before the code block dividing unit 4. Further, whether or not the quantizing unit 5 performs processing may be appropriately changed. For example, execution is performed in the lossy encoding mode, and execution is not performed in the lossless encoding mode.
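The division by a quantization step can be sketched as below. The text says only that the coefficient is divided by the step; rounding the magnitude toward zero and reconstructing at the middle of the interval are assumptions made here, as are the function names.

```python
import math


def scalar_quantize(coeff, step):
    """Divide a coefficient by the quantization step, keeping the sign.

    Assumption: the magnitude is rounded toward zero.
    """
    sign = -1 if coeff < 0 else 1
    return sign * math.floor(abs(coeff) / step)


def scalar_dequantize(index, step):
    """Reconstruct a coefficient at the middle of its quantization interval.

    Mid-interval reconstruction is an assumption, not taken from the text.
    """
    if index == 0:
        return 0.0
    sign = -1 if index < 0 else 1
    return sign * (abs(index) + 0.5) * step
```

With a step of 8, a coefficient of 37.4 maps to index 4, so the values to be bit-plane coded need fewer bits than the original coefficient.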
[0021]
Next, the bit plane encoding / post-quantization unit 6 performs quantization (post-quantization) by truncating the lower bit planes of the code string generated by bit-plane encoding the DWT coefficients.
First, in bit-plane coding, each DWT coefficient value in the coding block 210, or its scalar quantized value, is separated into a sign bit and an absolute value. The absolute value is then represented as bit planes in natural binary, and bit-plane coding is performed in order from the most significant bit plane. Encoding always starts, however, from the first bit plane that contains a 1 bit somewhere, and as soon as a 1 bit first occurs in a given transform coefficient, its sign bit is encoded. Information on the all-zero bit planes above the bit plane where encoding starts is transmitted separately to the decoding side.
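The separation into sign bits and magnitude bit planes can be illustrated as follows. This sketch covers only the data layout, not the entropy coding; the function names and the 1-based plane numbering (plane 1 = MSB) are assumptions chosen to match the numbering used in the description.

```python
def to_bit_planes(values, num_planes):
    """Split integers into sign bits and magnitude bit planes, MSB first."""
    signs = [1 if v < 0 else 0 for v in values]
    mags = [abs(v) for v in values]
    planes = []
    for p in range(num_planes - 1, -1, -1):     # from MSB down to LSB
        planes.append([(m >> p) & 1 for m in mags])
    return signs, planes


def first_significant_plane(planes):
    """Return the 1-based index m of the first plane containing a 1 bit."""
    for i, plane in enumerate(planes, start=1):
        if any(plane):
            return i
    return None                                  # all coefficients are zero
```

For the values 5, -3, 0, 2 with four planes, the first plane is all zero and coding would start at plane 2, so only the count of skipped planes needs to be signaled for plane 1.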
[0022]
FIG. 5 shows the concept of bit-plane coding. As shown in FIG. 5, for the absolute values of the DWT coefficient values in one coding block 210, or of their scalar quantized values, first to N-th bit planes 301 are formed, one per bit position from the MSB to the LSB. Examining these first to N-th bit planes 301 in order from the most significant plane, let the m-th bit plane 301 be the first bit plane in which a non-zero bit appears; the first to (m-1)-th bit planes 301 are then not encoded, and encoding is performed from the m-th bit plane 301 to the N-th bit plane 301. Except for the m-th bit plane 301, which is encoded first, each of the (m+1)-th to N-th bit planes 301 is divided into three coding passes 1 to 3 and encoded. The assignment of bits to passes 1 to 3 is determined from the values of the eight coefficients surrounding the coefficient being encoded, so that bits that contribute more to refining the image fall into pass 1, which has the higher priority.
[0023]
As for encoding priority, pass 1 has the highest priority, followed by pass 2 and then pass 3. Note that the m-th bit plane 301 has only pass 3. The encoding order therefore starts with pass 3 of the m-th bit plane 301, proceeds through pass 1, pass 2, and pass 3 of each of the lower (m+1)-th to (N-1)-th bit planes 301 in turn, and ends with pass 1, pass 2, and pass 3 of the N-th bit plane 301, the LSB.
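As a small illustration of this encoding order, the following sketch enumerates the coding passes as (bit plane, pass) pairs. The function name `coding_pass_order` and the tuple representation are assumptions introduced here for illustration; bit planes are numbered from 1 (MSB) to N (LSB) as in the text.

```python
def coding_pass_order(m, n):
    """List (bit_plane, coding_pass) pairs in encoding order.

    m is the first bit plane that contains a non-zero bit and n is the
    LSB bit plane, both 1-based as in the description.
    """
    order = [(m, 3)]                       # the m-th plane has only pass 3
    for plane in range(m + 1, n + 1):      # lower planes, down to the LSB
        for coding_pass in (1, 2, 3):      # pass 1 has the highest priority
            order.append((plane, coding_pass))
    return order


order = coding_pass_order(m=2, n=4)
```

An index into this list is one convenient way to represent an encoding stop position, although the specification does not prescribe any particular representation.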
[0024]
Next, in post-quantization, the coding passes are bit-plane encoded from the highest-priority pass in the encoding order described above, and once the amount of encoded data generated reaches a predetermined amount, the subsequent coding passes are not encoded but truncated, which quantizes the coefficients. For example, suppose that encoding proceeds from pass 3 of the m-th bit plane 301 and that the predetermined amount of encoded data is reached when pass 1 of the (m+2)-th bit plane 301 has been encoded; encoding is then stopped. As a result, all the passes from pass 2 of the (m+2)-th bit plane 301 to the N-th bit plane 301, the LSB, are left unencoded and the corresponding information is discarded.
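The truncation rule can be sketched as a simple accumulation over per-pass data sizes. This is a hedged sketch: the name `truncate_passes`, the byte-size list, and the choice to stop only after the pass that makes the running total reach the budget are assumptions based on the wording above, not a definitive reading of the embodiment.

```python
def truncate_passes(pass_sizes, budget):
    """Return (kept, total) for coding passes listed in encoding order.

    Passes are encoded until the accumulated data amount reaches the
    budget; every later pass is truncated (left unencoded).
    """
    total = 0
    kept = 0
    for size in pass_sizes:
        if total >= budget:     # the predetermined amount has been reached
            break
        total += size
        kept += 1
    return kept, total


# Sizes (hypothetical) of successive passes, with a budget of 55 units.
kept, total = truncate_passes([10, 20, 30, 40], budget=55)
```

Here `kept` plays the role of the encoding stop position: passes with an index of `kept` or higher are discarded.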
[0025]
Note that the predetermined amount of encoded data used to decide whether to stop encoding can be set freely by the user and is specified in advance. Alternatively, it may be determined so that all the coding blocks 210 have the same level of quantization error.
The position where encoding is stopped due to the predetermined amount of encoded data determined in this way is a normal encoding stop position P1. The normal quantization accuracy is determined by the encoding stop position P1.
The operation of the bit plane encoding / post-quantization unit 6 is the same as the conventional one, and the contents improved in this embodiment will be described later.
[0026]
Next, the code sequence control unit 7 controls the arrangement order of the code sequences and outputs encoded data.
[0027]
Next, the characteristic components of the present invention, namely the distortion occurrence boundary prediction unit 11, the data amount estimation unit 12, and the previous tile coding parameter storage unit 13, together with the improvements made to the bit-plane encoding / post-quantization unit 6, will be described.
In this embodiment, these components predict the portions where tile distortion has a large visual impact and raise the quantization accuracy of those portions above the normal level, thereby reducing tile distortion effectively and improving the quality of the reproduced image.
[0028]
It is generally known that human visual sensitivity to distortion is high in a flat image region (flat portion) where pixel values change little, so tile distortion is perceived more easily there than in an image region where pixel values change greatly (variable portion). A tile boundary portion that includes a flat portion is therefore affected more strongly, in visual terms, by tile distortion than a tile boundary portion that includes a variable portion. Consequently, reducing the tile distortion at tile boundary portions that include a flat portion is the more effective way to improve the quality of the reproduced image.
Based on this knowledge, the present embodiment detects tile boundary portions that include a flat portion as the places where tile distortion has a large visual impact, and reduces tile distortion by raising the quantization accuracy of those tile boundary portions above the normal level. This makes it possible to reduce tile distortion efficiently for the purpose of improving reproduced image quality.
[0029]
First, the distortion occurrence boundary prediction unit 11 will be described. The distortion generation boundary prediction unit 11 detects a tile boundary included in the flat portion. The tile boundary detected here becomes a distortion correction target in the encoding process of the subsequent bit plane encoding / post-quantization unit 6.
The operation by which the distortion occurrence boundary prediction unit 11 detects a tile boundary included in a flat portion will be described with reference to FIG. 6, which is a conceptual diagram of the detection operation. In FIG. 6, a tile boundary 410 is the boundary between the tile 2_110 and the tile 5_110 in FIG. 2. The pixel value variance investigation area 401 is a predetermined rectangular area within which the variance of the pixel values is calculated.
[0030]
The distortion occurrence boundary prediction unit 11 places the pixel value variance investigation area 401 at one end of the tile boundary 410 and calculates the variance of the pixel values contained in the investigation area. If the variance is equal to or less than a predetermined value, the investigation area is judged to be a flat portion. If the investigation area is not a flat portion, the pixel value variance investigation area 401 is shifted along the tile boundary 410, the variance of the pixel values in the area is calculated again, and the area is again judged to be a flat portion or not. This process is repeated, and only if no flat portion is found from one end of the tile boundary 410 to the other is the tile boundary 410 judged not to be included in a flat portion. In other words, if even one investigation area is judged to be a flat portion, the tile boundary 410 is detected as a tile boundary included in a flat portion.
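The sliding-window test described above might look like the following sketch for a horizontal tile boundary. Several details are assumptions made only for illustration: the function name `boundary_in_flat_part`, an image given as a list of pixel rows, an investigation area centered vertically on the boundary, and a configurable step size; the specification states only that a rectangular area is shifted along the boundary and its pixel variance compared with a predetermined value.

```python
from statistics import pvariance


def boundary_in_flat_part(image, boundary_row, win_w, win_h, threshold, step=1):
    """Return True if any investigation area along the boundary is flat.

    image is a list of rows of pixel values; the horizontal tile boundary
    lies between rows boundary_row - 1 and boundary_row.
    """
    width = len(image[0])
    top = boundary_row - win_h // 2          # window straddles the boundary (assumption)
    for left in range(0, width - win_w + 1, step):
        pixels = [image[y][x]
                  for y in range(top, top + win_h)
                  for x in range(left, left + win_w)]
        if pvariance(pixels) <= threshold:   # flat portion found
            return True
    return False                             # no flat portion from end to end


flat = [[10, 10, 10, 10] for _ in range(4)]
is_target = boundary_in_flat_part(flat, boundary_row=2, win_w=2, win_h=2, threshold=25)
```

The early return mirrors the rule that a single flat investigation area is enough to mark the boundary as a distortion correction target.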
[0031]
Next, the data amount estimation unit 12 will be described. The data amount estimation unit 12 calculates and holds the variance value of the DWT coefficients of each tile 110. This allows the amount of encoded data that each tile 110 will generate to be estimated in advance to some extent. When the tiles 110 are encoded, these estimates can be used to control the amount of generated data and the image quality (quantization accuracy) across the plurality of tiles 110, and this control allows the image as a whole to be encoded with higher quality.
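A minimal sketch of such a unit is a small store of per-tile variance values. The class name `DataAmountEstimator`, its method names, and the use of the population variance over a flat list of coefficients are assumptions for illustration; the specification does not state which variance estimator is used.

```python
from statistics import pvariance


class DataAmountEstimator:
    """Holds the variance of the DWT coefficients of each tile (illustrative)."""

    def __init__(self):
        self._variance = {}

    def update(self, tile_id, dwt_coeffs):
        # Only one number per tile is kept, so memory use stays small.
        self._variance[tile_id] = pvariance(dwt_coeffs)

    def variance(self, tile_id):
        return self._variance[tile_id]


estimator = DataAmountEstimator()
estimator.update(5, [0, 2, 0, 2])
estimator.update(8, [1, 1, 1, 1])
```

Because only one value per tile is retained, the variance of a tile that has not yet been encoded can be consulted without keeping that tile's coefficients in memory.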
[0032]
Note that the amount of generated data and the image quality of each tile 110 could instead be controlled by encoding the plurality of tiles 110 simultaneously, without estimating in advance the amount of encoded data generated in each tile 110, but this increases the amount of memory required. Where the tiled WT coding method is applied, the available memory is often tightly constrained, so that approach is not practical. For this reason too, it is very useful to calculate and hold the variance value of the DWT coefficients of each tile 110, as in the present embodiment, so that the amount of encoded data generated in each tile 110 can be estimated in advance to some extent.
[0033]
Next, the previous tile coding parameter storage unit 13 will be described. The previous tile coding parameter storage unit 13 stores the position at which encoding was stopped during post-quantization by the bit-plane encoding / post-quantization unit 6. In other words, it stores, for a given coding block 210, the coding pass up to which encoding was performed and the coding pass from which truncation began. The quantization error of the coding block 210 can be obtained from this information.
[0034]
Next, the improvement content of the bit plane encoding / post quantization unit 6 will be described. The bit-plane encoding / post-quantization unit 6 has a control function for reducing tile distortion in addition to the conventional function. Hereinafter, the tile distortion reduction control function will be described.
The bit-plane encoding / post-quantization unit 6 treats the tiles adjacent to a "tile boundary included in a flat portion" detected by the distortion occurrence boundary prediction unit 11, that is, to a distortion correction target tile boundary, as distortion correction target tiles.
[0035]
First, the method of selecting, when a distortion correction target tile is encoded, the coding blocks in the encoding target tile that are subject to tile distortion correction, that is, the coding blocks to be quantized with high quantization accuracy, will be described.
As shown in FIG. 7, the coding blocks 210 of each subband of the DWT coefficient group 201 corresponding to one tile 110 are classified into coding blocks 501, which are candidates for quantization with high quantization accuracy, and coding blocks 502, which are candidates for quantization with low quantization accuracy. The high-accuracy candidate coding blocks 501 are the coding blocks 210 adjacent to a tile boundary, and the low-accuracy candidate coding blocks 502 are the coding blocks 210 not adjacent to a tile boundary.
[0036]
From among the high-accuracy candidate coding blocks 501 of the encoding target tile, the bit-plane encoding / post-quantization unit 6 selects the coding blocks 210 adjacent to the distortion correction target tile boundary as the high-accuracy target coding blocks. For example, suppose that, when the tile 5_110 of FIG. 2 is encoded, the tile boundary with the tile 2_110 located above the tile 5_110 is a distortion correction target. In this case, as shown in FIG. 8, among the coding blocks 210 of each subband for the tile 5_110, all the coding blocks 210 adjacent to the tile boundary with the tile 2_110, that is, all the coding blocks 210 along the upper side, become the high-accuracy target coding blocks 601.
[0037]
As another example, when the tile boundary with the tile 4_110 located on the left side of the encoding target tile 5_110 is a distortion correction target, as shown in FIG. 9, among the coding blocks 210 of each subband for the tile 5_110, the coding blocks 210 adjacent to the tile boundary with the tile 4_110, that is, all the coding blocks 210 along the left side, become the high-accuracy target coding blocks 601.
[0038]
Similarly, when the tile boundary with the tile located below the encoding target tile is a distortion correction target, all the coding blocks 210 along the lower side of each subband for the encoding target tile become the high-accuracy target coding blocks 601. When the tile boundary with the tile located on the right side of the encoding target tile is a distortion correction target, all the coding blocks 210 along the right side of each subband for the encoding target tile become the high-accuracy target coding blocks 601.
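The selection rule for the four sides can be sketched for a single subband laid out as a grid of coding blocks. The function name `select_high_accuracy_blocks`, the (row, column) indexing, and the side labels `"top"`, `"bottom"`, `"left"`, and `"right"` are assumptions introduced for this illustration.

```python
def select_high_accuracy_blocks(rows, cols, target_sides):
    """Return the (row, col) indices of coding blocks along the given sides.

    rows x cols is the coding block grid of one subband; target_sides is
    the set of tile sides whose boundary is a distortion correction target.
    """
    selected = set()
    for r in range(rows):
        for c in range(cols):
            on_target_side = (
                ("top" in target_sides and r == 0)
                or ("bottom" in target_sides and r == rows - 1)
                or ("left" in target_sides and c == 0)
                or ("right" in target_sides and c == cols - 1)
            )
            if on_target_side:
                selected.add((r, c))
    return selected


# Boundary with the tile above is the correction target (as in FIG. 8).
upper_blocks = select_high_accuracy_blocks(rows=2, cols=3, target_sides={"top"})
```

Applying the same function to every subband of the tile would give the full set of high-accuracy target coding blocks; blocks that touch no target side remain low-accuracy candidates.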
[0039]
Next, a post-quantization method for the coding block 601 to be improved will be described. It is assumed that the encoding process for each tile is executed in the raster order from the tile 1_110 to the tile 9_110 in FIG. 2, that is, from the upper left to the lower right.
For the high-accuracy target coding blocks 601 of a distortion correction target tile, the bit-plane encoding / post-quantization unit 6 determines the position at which encoding is stopped as described in (1) and (2) below.
[0040]
(1) When the tile boundary of the distortion correction target is a boundary between the encoding target tile and a tile that has already been encoded (the tile on the upper side or the left side of the encoding target tile).
In this case, the position at which encoding was stopped for the encoded tile is read from the previous tile coding parameter storage unit 13. The quantization error of the encoded tile based on this position is then compared with the quantization error of the encoding target tile based on its normal encoding stop position P1. If the comparison shows that the encoding target tile has a larger quantization error than the encoded tile, the quantization accuracy of the high-accuracy target coding blocks 601 of the encoding target tile is raised above the normal level. To this end, the encoding stop position of the encoding target tile is shifted to after the normal encoding stop position P1, and the new encoding stop position is determined so that the quantization error of the encoding target tile becomes approximately equal to that of the encoded tile. For example, the encoding stop position of the encoded tile read from the previous tile coding parameter storage unit 13 is used as the encoding stop position of the high-accuracy target coding blocks 601 of the encoding target tile.
[0041]
(2) When the tile boundary of the distortion correction target is a boundary between the encoding target tile and a tile that has not been encoded yet (the lower or right tile of the encoding target tile).
In this case, the quantization error of the tile adjacent to the encoding target tile across the distortion correction target tile boundary is unknown. Therefore, from among the variance values of the DWT coefficients calculated and held in the data amount estimation unit 12, the variance value of the encoding target tile and the variance value of the adjacent tile are read and compared. If the comparison shows that the variance value of the encoding target tile is larger than that of the adjacent tile, the adjacent tile, which is encoded later, is expected to have the higher quantization accuracy, so the quantization accuracy of the high-accuracy target coding blocks 601 of the encoding target tile is raised above the normal level. To this end, the encoding stop position of the encoding target tile is shifted to after the normal encoding stop position P1, and the extent of the shift is determined according to the difference between the variance value of the adjacent tile and the variance value of the encoding target tile.
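Cases (1) and (2) can be sketched together as two small decision functions. This is a sketch under stated assumptions: an encoding stop position is represented as an integer index in the coding pass order, a later stop position is assumed to mean a smaller quantization error, and the names `stop_position_case1`, `stop_position_case2`, `variance_per_pass`, and `last_pass` are hypothetical. In particular, the specification says only that the shift depends on the variance difference; the linear mapping used here is an illustrative choice.

```python
def stop_position_case1(p1, neighbor_stop):
    """Case (1): the neighbor across the boundary is already encoded.

    If stopping at P1 would leave the target tile coarser than the
    neighbor, adopt the neighbor's stored stop position.
    """
    target_is_coarser = p1 < neighbor_stop   # fewer passes kept, larger error (assumed)
    return neighbor_stop if target_is_coarser else p1


def stop_position_case2(p1, var_target, var_neighbor, variance_per_pass, last_pass):
    """Case (2): the neighbor across the boundary is not yet encoded.

    If the target tile has the larger coefficient variance, shift the stop
    position later by an amount that grows with the variance difference.
    """
    if var_target <= var_neighbor:
        return p1                            # keep the normal stop position
    shift = int((var_target - var_neighbor) // variance_per_pass)
    return min(p1 + shift, last_pass)        # never go past the final pass


stop_upper = stop_position_case1(p1=5, neighbor_stop=8)
stop_lower = stop_position_case2(5, 100.0, 60.0, variance_per_pass=20.0, last_pass=12)
```

In both functions the stop position is never moved earlier than P1, reflecting that these blocks are only ever made more accurate, not less.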
[0042]
In both cases (1) and (2), the encoding stop position is stored in the previous tile coding parameter storage unit 13, because it is needed when the adjacent tile to be encoded later (the tile to the right or below) is encoded.
[0043]
Further, raising the quantization accuracy of the high-accuracy target coding blocks 601 in (1) and (2) above increases the amount of encoded data accordingly. To offset this increase, the bit-plane encoding / post-quantization unit 6 lowers the quantization accuracy of the low-accuracy candidate coding blocks 502 (see FIG. 7) of the distortion correction target tile, reducing their amount of encoded data. To this end, the encoding stop position for these blocks is shifted to before the normal encoding stop position P1, and the extent of the shift is determined according to the amount of encoded data to be offset.
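One way to picture this offsetting step is to drop trailing coding passes from the low-accuracy blocks until the extra data has been recovered. The round-robin strategy, the function name `compensate_low_accuracy`, and the per-pass size lists are assumptions for illustration; the specification states only that the shift is determined by the amount of data to be offset.

```python
def compensate_low_accuracy(kept_pass_sizes, extra_amount):
    """Drop trailing passes from low-accuracy blocks to offset extra data.

    kept_pass_sizes holds, per block, the sizes of the passes kept at the
    normal stop position P1, in encoding order. Returns the reduced lists
    and the amount of data actually recovered.
    """
    kept = [list(sizes) for sizes in kept_pass_sizes]   # do not mutate the input
    recovered = 0
    while recovered < extra_amount and any(kept):
        for sizes in kept:                              # one pass per block per round
            if recovered >= extra_amount:
                break
            if sizes:
                recovered += sizes.pop()                # truncate the last kept pass
    return kept, recovered


reduced, recovered = compensate_low_accuracy([[10, 20, 30], [5, 15]], extra_amount=40)
```

Spreading the reduction across blocks, as this sketch does, is only one possible policy; removing passes from a single block would also satisfy the description.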
[0044]
In the present embodiment, the distortion occurrence boundary prediction unit 11 corresponds to a prediction unit. The bit plane encoding / post-quantization unit 6, the data amount estimation unit 12, and the previous tile encoding parameter storage unit 13 correspond to quantization accuracy control means.
[0045]
Also, a program for realizing each process performed by the image encoding device shown in FIG. 1 may be recorded on a computer-readable recording medium, and the image encoding process may be performed by having a computer system read and execute the program recorded on the recording medium. Here, the “computer system” may include an OS and hardware such as peripheral devices.
Further, the “computer system” includes a homepage providing environment (or display environment) if a WWW system is used.
The “computer-readable recording medium” refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or to a storage device such as a hard disk incorporated in a computer system.
[0046]
Further, the “computer-readable recording medium” also includes a medium that holds the program for a certain period of time, such as a volatile memory (RAM) inside a computer system that serves as a server or a client when the program is transmitted via a network such as the Internet or a communication line such as a telephone line.
The program may be transmitted from a computer system storing the program in a storage device or the like to another computer system via a transmission medium or by a transmission wave in the transmission medium. Here, the “transmission medium” for transmitting the program refers to a medium having a function of transmitting information, such as a network (communication network) such as the Internet or a communication line (communication line) such as a telephone line.
The program may realize only a part of the functions described above. Furthermore, the program may be a so-called difference file (difference program), which realizes the functions described above in combination with a program already recorded in the computer system.
[0047]
The embodiment of the present invention has been described in detail with reference to the drawings. However, the specific configuration is not limited to this embodiment, and includes design changes and the like within a scope not departing from the gist of the present invention.
[0048]
[Effect of the invention]
As described above, the present invention reduces the tile distortion that has a large visual impact on the reproduced image, which is the distortion whose reduction is most effective for improving reproduced image quality, so that tile distortion can be reduced efficiently when an image is encoded by the tiled WT coding method. This provides the excellent effect that a reproduced image close to the original image can be obtained more efficiently.
[0049]
In addition, the encoding process can be controlled according to the encoding status across tiles without losing the advantage of tiling, namely that encoding is performed independently for each tile and the amount of memory used is thereby reduced.
[0050]
Also, the encoded data generated by the image encoding apparatus and method of the present invention can be reproduced without changing the conventional decoding apparatus and method. Therefore, by applying the present invention to a JPEG2000 format image encoding apparatus and method, the tile distortion of the reproduced image can be reduced and the reproduced image quality improved without changing the JPEG2000 format image decoding apparatus and method.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration of an image encoding device according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating an example of tile division.
FIG. 3 is a diagram illustrating an example of conversion into a DWT coefficient group 201.
FIG. 4 is a diagram illustrating an example of coding block division.
FIG. 5 is a diagram illustrating the concept of bit-plane encoding.
FIG. 6 is a diagram illustrating the concept of the operation, performed by the distortion occurrence boundary prediction unit 11, of detecting a tile boundary included in a flat portion.
FIG. 7 is a diagram illustrating an example of classification into high-accuracy candidate coding blocks 501 and low-accuracy candidate coding blocks 502.
FIG. 8 is a first diagram illustrating an example of selecting the coding blocks 601 to be quantized with high quantization accuracy.
FIG. 9 is a second diagram illustrating an example of selecting the coding blocks 601 to be quantized with high quantization accuracy.
[Explanation of symbols]
1 ... color conversion / DC level shift unit, 2 ... tile division unit, 3 ... DWT unit, 4 ... code block dividing unit, 5 ... quantization unit, 6 ... bit-plane encoding / post-quantization unit, 7 ... code sequence order control unit, 11 ... distortion occurrence boundary prediction unit, 12 ... data amount estimation unit, 13 ... previous tile coding parameter storage unit

Claims

An image encoding apparatus that divides an input image into tiles of rectangular areas, encodes each tile by a wavelet transform encoding method, and outputs encoded data, the apparatus comprising:
code block dividing means for dividing all subbands of a wavelet transform coefficient group into coding blocks, which are the processing units of encoding;
bit plane encoding and post quantization means for bit-plane encoding the wavelet transform coefficients in the coding blocks and performing quantization by truncating, in the code string generated by the bit-plane encoding, the lower bit planes that exceed a predetermined encoded data amount;
previous tile coding parameter storage means for storing the truncation position in the quantization;
distortion generation boundary prediction means for detecting a tile boundary included in a flat portion where the change in pixel values is small; and
data amount estimation means for calculating and holding the variance value of the wavelet transform coefficients of each tile,
wherein the bit plane encoding and post quantization means:
sets, among the coding blocks in the encoding target tile, the coding blocks adjacent to the tile boundary as high-accuracy targets;
when the tile boundary is a boundary between the encoding target tile and an already encoded tile, reads the truncation position for the encoded tile from the previous tile coding parameter storage means, compares the quantization error of the encoded tile based on this position with the quantization error of the encoding target tile based on the truncation position given by the predetermined encoded data amount, and, when the encoding target tile has a larger quantization error than the encoded tile, matches the truncation position of the high-accuracy target coding blocks of the encoding target tile to that of the encoded tile; and
when the tile boundary is a boundary between the encoding target tile and a tile that has not yet been encoded, reads from the data amount estimation means the variance value of the encoding target tile and the variance value of the adjacent tile that adjoins the encoding target tile across the tile boundary, compares the variance values, and, when the variance value of the encoding target tile is larger than that of the adjacent tile, shifts the truncation position of the high-accuracy target coding blocks of the encoding target tile to a lower position than the truncation position given by the predetermined encoded data amount, in accordance with the difference between the variance values.

The image encoding apparatus according to claim 1, wherein the bit plane encoding and post quantization means:
sets, among the coding blocks in the encoding target tile, the coding blocks not adjacent to the tile boundary as low-accuracy targets; and
shifts the truncation position of the low-accuracy target coding blocks of the encoding target tile to a higher position than the truncation position given by the predetermined encoded data amount, in accordance with the encoded data amount of the high-accuracy target coding blocks.

The image encoding apparatus according to claim 1 or 2, wherein the distortion generation boundary prediction means calculates the variance of pixel values within a predetermined range of a tile boundary portion in the input image, and determines from the variance value whether or not the image region is a flat portion.

An image encoding method that divides an input image into tiles of rectangular areas, encodes each tile by a wavelet transform encoding method, and outputs encoded data, the method comprising:
a first step of dividing all subbands of a wavelet transform coefficient group into coding blocks, which are the processing units of encoding;
a second step of bit-plane encoding the wavelet transform coefficients in the coding blocks and performing quantization by truncating, in the code string generated by the bit-plane encoding, the lower bit planes that exceed a predetermined encoded data amount;
a third step of storing the truncation position in the quantization;
a fourth step of detecting a tile boundary included in a flat portion where the change in pixel values is small; and
a fifth step of calculating and holding the variance value of the wavelet transform coefficients of each tile,
wherein the second step includes:
setting, among the coding blocks in the encoding target tile, the coding blocks adjacent to the tile boundary as high-accuracy targets;
when the tile boundary is a boundary between the encoding target tile and an already encoded tile, comparing the quantization error of the encoded tile based on the stored truncation position for the encoded tile with the quantization error of the encoding target tile based on the truncation position given by the predetermined encoded data amount, and, when the encoding target tile has a larger quantization error than the encoded tile, matching the truncation position of the high-accuracy target coding blocks of the encoding target tile to that of the encoded tile; and
when the tile boundary is a boundary between the encoding target tile and a tile that has not yet been encoded, comparing the held variance value of the encoding target tile with the held variance value of the adjacent tile that adjoins the encoding target tile across the tile boundary, and, when the variance value of the encoding target tile is larger than that of the adjacent tile, shifting the truncation position of the high-accuracy target coding blocks of the encoding target tile to a lower position than the truncation position given by the predetermined encoded data amount, in accordance with the difference between the variance values.

  The image encoding method according to claim 4, wherein the second step includes:
  setting, among the coding blocks in the encoding target tile, the coding blocks not adjacent to the tile boundary as low-accuracy targets; and
  shifting the truncation position of the low-accuracy target coding blocks of the encoding target tile to a higher position than the truncation position given by the predetermined encoded data amount, in accordance with the encoded data amount of the high-accuracy target coding blocks.

A computer program for performing image encoding processing that divides an input image into tiles of rectangular areas, encodes each tile by a wavelet transform encoding method, and outputs encoded data, the computer program causing a computer to realize:
a first function of dividing all subbands of a wavelet transform coefficient group into coding blocks, which are the processing units of encoding;
a second function of bit-plane encoding the wavelet transform coefficients in the coding blocks and performing quantization by truncating, in the code string generated by the bit-plane encoding, the lower bit planes that exceed a predetermined encoded data amount;
a third function of storing the truncation position in the quantization;
a fourth function of detecting a tile boundary included in a flat portion where the change in pixel values is small; and
a fifth function of calculating and holding the variance value of the wavelet transform coefficients of each tile,
wherein the second function:
sets, among the coding blocks in the encoding target tile, the coding blocks adjacent to the tile boundary as high-accuracy targets;
when the tile boundary is a boundary between the encoding target tile and an already encoded tile, compares the quantization error of the encoded tile based on the stored truncation position for the encoded tile with the quantization error of the encoding target tile based on the truncation position given by the predetermined encoded data amount, and, when the encoding target tile has a larger quantization error than the encoded tile, matches the truncation position of the high-accuracy target coding blocks of the encoding target tile to that of the encoded tile; and
when the tile boundary is a boundary between the encoding target tile and a tile that has not yet been encoded, compares the held variance value of the encoding target tile with the held variance value of the adjacent tile that adjoins the encoding target tile across the tile boundary, and, when the variance value of the encoding target tile is larger than that of the adjacent tile, shifts the truncation position of the high-accuracy target coding blocks of the encoding target tile to a lower position than the truncation position given by the predetermined encoded data amount, in accordance with the difference between the variance values.

  The computer program according to claim 6, wherein the second function:
  sets, among the coding blocks in the encoding target tile, the coding blocks not adjacent to the tile boundary as low-accuracy targets; and
  shifts the truncation position of the low-accuracy target coding blocks of the encoding target tile to a higher position than the truncation position given by the predetermined encoded data amount, in accordance with the encoded data amount of the high-accuracy target coding blocks.