JP3937632B2 - Image decoding apparatus and image decoding method

Info

Publication number: JP3937632B2
Application number: JP04373499A
Authority: JP
Inventors: 一彦西堀; 幸彦茂木; 芳人近藤
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1999-02-22
Filing date: 1999-02-22
Publication date: 2007-06-27
Anticipated expiration: 2019-02-22
Also published as: JP2000244916A

Description

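[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an image decoding apparatus and an image decoding method for decoding compressed image data of a first resolution that has been compression-encoded by orthogonal transform in units of orthogonal transform blocks of 8 × 8 pixels, and in particular to an image decoding apparatus and an image decoding method that decode the compressed image data of the first resolution and reduce it to moving image data of a second resolution lower than the first resolution.
[0002]
[Prior art]
Digital television broadcasting standards that use an image compression scheme such as MPEG2 (Moving Picture Experts Group phase 2) are being standardized. These include standards for standard-resolution images (for example, 576 effective lines in the vertical direction) and standards for high-resolution images (for example, 1152 effective lines in the vertical direction). Consequently, there has recently been demand for a down decoder that decodes the compressed image data of a high-resolution image and reduces it to 1/2 resolution, thereby generating image data of a standard-resolution image for display on a television monitor that supports standard resolution.
[0003]
A down decoder that decodes a bit stream such as MPEG2, produced by applying predictive coding by motion prediction and compression coding by discrete cosine transform to a high-resolution image, and down-samples it to a standard-resolution image is proposed in the document "Scalable Decoder without Low-Frequency Drift" (Iwahashi, Kambayashi, Kiya: IEICE Technical Report CS94-186, DSP94-108, 1995-01) (hereinafter referred to as Document 1). Document 1 describes the following first to third down decoders.
[0004]
As shown in FIG. 4, the first down decoder comprises an inverse discrete cosine transform device 101 that applies an 8 (number of coefficients counted from the DC component in the vertical direction) × 8 (number of coefficients counted from the DC component in the horizontal direction) inverse discrete cosine transform to the bit stream of a high-resolution image, an adder 102 that adds the high-resolution image obtained by the inverse discrete cosine transform and a motion-compensated reference image, a frame memory 103 that temporarily stores the reference image, a motion compensation device 104 that applies motion compensation with 1/2-pixel accuracy to the reference image stored in the frame memory 103, and a down-sampling device 105 that converts the reference image stored in the frame memory 103 into a standard-resolution image.
[0005]
In this first down decoder, the output image decoded as a high-resolution image by the inverse discrete cosine transform is reduced by the down-sampling device 105, and standard-resolution image data is output.
[0006]
As shown in FIG. 5, the second down decoder comprises an inverse discrete cosine transform device 111 that replaces the high-frequency coefficients of each DCT (Discrete Cosine Transform) block in the bit stream of a high-resolution image with 0 and applies an 8 × 8 inverse discrete cosine transform, an adder 112 that adds the high-resolution image obtained by the inverse discrete cosine transform and a motion-compensated reference image, a frame memory 113 that temporarily stores the reference image, a motion compensation device 114 that applies motion compensation with 1/2-pixel accuracy to the reference image stored in the frame memory 113, and a down-sampling device 115 that converts the reference image stored in the frame memory 113 into a standard-resolution image.
[0007]
In this second down decoder, the high-frequency coefficients among all the coefficients of the DCT block are replaced with 0 before the inverse discrete cosine transform, and the output image decoded as a high-resolution image is reduced by the down-sampling device 115 to output standard-resolution image data.
[0008]
As shown in FIG. 6, the third down decoder comprises a reduced inverse discrete cosine transform device 121 that applies, for example, a 4 × 4 inverse discrete cosine transform using only the low-frequency coefficients of each DCT block in the bit stream of a high-resolution image and thereby decodes a standard-resolution image, an adder 122 that adds the standard-resolution image obtained by the reduced inverse discrete cosine transform and a motion-compensated reference image, a frame memory 123 that temporarily stores the reference image, and a motion compensation device 124 that applies motion compensation with 1/4-pixel accuracy to the reference image stored in the frame memory 123.
[0009]
In this third down decoder, the inverse discrete cosine transform is performed using only the low-frequency coefficients among all the coefficients of the DCT block, so that a standard-resolution image is decoded directly from the high-resolution image.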
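The reduced inverse transform used by the third down decoder can be sketched in one dimension as follows. This is only an illustration under stated assumptions: the orthonormal DCT definition, the 1/sqrt(2) rescaling, and the names `dct`, `idct` and `reduced_idct` are choices made here and are not taken from Document 1 or from the patent.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a sequence (assumed normalization)."""
    n = len(x)
    return [math.sqrt((1 if k == 0 else 2) / n)
            * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
            for k in range(n)]

def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * coeffs[k]
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for k in range(n))
            for i in range(n)]

def reduced_idct(coeffs8):
    """Decode 8 DCT coefficients into 4 samples using only the 4 low-frequency ones."""
    # Rescale so that a flat signal keeps its level after the shorter transform
    # (hypothetical choice; the source does not state a normalization).
    low = [c / math.sqrt(2) for c in coeffs8[:4]]
    return idct(low)

flat = reduced_idct(dct([10] * 8))
ramp = reduced_idct(dct([0, 2, 4, 6, 8, 10, 12, 14]))
```

Because the four high-frequency coefficients are never touched, the transform needs fewer operations, and the decoded block has half the samples, which is why the frame memory can be smaller.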
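[0010]
Because the first down decoder applies the inverse discrete cosine transform to all the coefficients in the DCT block and decodes a high-resolution image, it requires an inverse discrete cosine transform device 101 with high processing capability and a high-capacity frame memory 103. Because the second down decoder sets the high-frequency coefficients in the DCT block to 0 before the inverse discrete cosine transform and decodes a high-resolution image, the processing capability of the inverse discrete cosine transform device 111 may be lower, but a high-capacity frame memory 113 is still required. In contrast to these first and second down decoders, the third down decoder applies the inverse discrete cosine transform using only the low-frequency coefficients among all the coefficients in the DCT block, so the processing capability of the inverse discrete cosine transform device 121 may be low, and because it decodes a standard-resolution reference image, the capacity of the frame memory 123 can also be reduced.
[0011]
Display methods for moving images such as television broadcasts include progressive scanning and interlaced scanning. Progressive scanning displays, one after another, images in which all the pixels of a frame were sampled at the same time. Interlaced scanning alternately displays images in which the pixels of a frame were sampled at a different time for every other horizontal line.
[0012]
In interlaced scanning, one of the two images obtained by sampling the pixels of a frame at different times for alternate lines is called the top field (also called the first field), and the other is called the bottom field (also called the second field). The image containing the first horizontal line of the frame is the top field, and the image containing the second horizontal line of the frame is the bottom field. In interlaced scanning, therefore, one frame consists of two fields.
[0013]
To compress interlaced moving image signals efficiently, MPEG2 allows not only a frame but also a field to be assigned to a picture, which is the unit of compression of the screen.
[0014]
In MPEG2, when a field is assigned to a picture, the structure of the bit stream is called the field structure, and when a frame is assigned to a picture, the structure of the bit stream is called the frame structure. In the field structure, DCT blocks are formed from the pixels of a field, and the discrete cosine transform is performed in units of fields. The processing mode in which the discrete cosine transform is performed in units of fields is called field DCT mode. In the frame structure, DCT blocks are formed from the pixels of a frame, and the discrete cosine transform is performed in units of frames. The processing mode in which the discrete cosine transform is performed in units of frames is called frame DCT mode. Furthermore, in the field structure, macroblocks are formed from the pixels of a field, and motion prediction is performed in units of fields. The processing mode in which motion prediction is performed in units of fields is called field motion prediction mode. In the frame structure, macroblocks are formed from the pixels of a frame, and motion prediction is performed in units of frames. The processing mode in which motion prediction is performed in units of frames is called frame motion prediction mode.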
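[0015]
An image decoding apparatus that uses the third down decoder of Document 1 to decode compressed image data for interlaced scanning is proposed, for example, in the document "A Compensation Method of Drift Errors in Scalability" (N.OBIKANE, K.TAHARA and J.YONEMITSU, HDTV Work Shop'93) (hereinafter referred to as Document 2).
[0016]
As shown in FIG. 7, the conventional image decoding apparatus of Document 2 comprises a bit stream analysis device 131 that receives and analyzes a bit stream in which a high-resolution image is compressed with MPEG2, a variable-length code decoding device 132 that decodes the bit stream, which has been variable-length coded by assigning code lengths according to the frequency of occurrence of the data, an inverse quantization device 133 that multiplies each coefficient of the DCT block by the quantization step, a reduced inverse discrete cosine transform device 134 that applies, for example, a 4 × 4 inverse discrete cosine transform using only the low-frequency coefficients among all the coefficients of the DCT block to decode a standard-resolution image, an adder 135 that adds the standard-resolution image obtained by the reduced inverse discrete cosine transform and a motion-compensated reference image, a frame memory 136 that temporarily stores the reference image, and a motion compensation device 137 that applies motion compensation with 1/4-pixel accuracy to the reference image stored in the frame memory 136.
[0017]
The reduced inverse discrete cosine transform device 134 of this conventional apparatus applies the inverse discrete cosine transform using only the low-frequency coefficients among all the coefficients in the DCT block, but the positions of the coefficients it transforms differ between frame DCT mode and field DCT mode.
[0018]
Specifically, in field DCT mode, as shown in FIG. 8, the reduced inverse discrete cosine transform device 134 applies the inverse discrete cosine transform only to the 4 × 4 low-band coefficients among the 8 × 8 coefficients in the DCT block. In frame DCT mode, as shown in FIG. 9, it applies the inverse discrete cosine transform only to 4 × 2 + 4 × 2 coefficients among the 8 × 8 coefficients in the DCT block.
[0019]
The motion compensation device 137 of this conventional apparatus performs motion compensation with 1/4-pixel accuracy for each of field motion prediction mode and frame motion prediction mode, based on the motion prediction information (motion vectors) that was generated for the high-resolution image. MPEG2 normally specifies motion compensation with 1/2-pixel accuracy, but when a standard-resolution image is decoded from a high-resolution image, the number of pixels in the picture is thinned to 1/2, so the motion compensation device 137 performs motion compensation with 1/4-pixel accuracy.
[0020]
To perform motion compensation that corresponds to the high-resolution image, the motion compensation device 137 therefore linearly interpolates the pixels of the reference image stored in the frame memory 136 as a standard-resolution image and generates pixels with 1/4-pixel accuracy.
[0021]
The linear interpolation of pixels in the vertical direction for field motion prediction mode and frame motion prediction mode is described below with reference to FIGS. 10 and 11. In these figures the vertical axis shows the vertical phase of the pixels, and the phase at which each pixel of the displayed image lies is shown as an integer.
[0022]
First, interpolation of an image predicted in field motion prediction mode is described with reference to FIG. 10. For the high-resolution image (upper layer), as shown in FIG. 10(a), motion compensation is performed with 1/2-pixel accuracy independently for each field. For the standard-resolution image (lower layer), as shown in FIG. 10(b), linear interpolation is performed within each field from the integer-accuracy pixels to generate pixels whose phase is shifted vertically by 1/4, 1/2 and 3/4 pixel, and motion compensation is performed. That is, in the standard-resolution image (lower layer), the 1/4-pixel-accuracy pixels of the top field are generated by linear interpolation from the integer-accuracy pixels of the top field, and the 1/4-pixel-accuracy pixels of the bottom field are generated by linear interpolation from the integer-accuracy pixels of the bottom field. For example, let a be the value of the top-field pixel at vertical phase 0 and b the value of the top-field pixel at vertical phase 1. Then the top-field pixel at vertical phase 1/4 is (3a + b)/4, the top-field pixel at vertical phase 1/2 is (a + b)/2, and the top-field pixel at vertical phase 3/4 is (a + 3b)/4.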
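The field-mode interpolation weights given above can be written out as a small sketch. The function name `field_quarter_pel` and the use of exact fractions are assumptions made for illustration; only the three weight formulas come from the text.

```python
from fractions import Fraction

def field_quarter_pel(a, b):
    """Interpolate between two same-field pixels.

    a is the pixel at vertical phase 0 and b the pixel at vertical phase 1
    (both from the same field). Returns the values at phases 1/4, 1/2, 3/4.
    """
    a, b = Fraction(a), Fraction(b)
    return {
        Fraction(1, 4): (3 * a + b) / 4,
        Fraction(1, 2): (a + b) / 2,
        Fraction(3, 4): (a + 3 * b) / 4,
    }

samples = field_quarter_pel(0, 4)
```

With a = 0 and b = 4 the three interpolated values fall on the straight line between the two pixels, which is what linear interpolation within one field means here.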
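[0023]
Next, interpolation of an image predicted in frame motion prediction mode is described with reference to FIG. 11. For the high-resolution image (upper layer), as shown in FIG. 11(a), interpolation is performed between the fields, that is, between the bottom field and the top field, and motion compensation is performed with 1/2-pixel accuracy. For the standard-resolution image (lower layer), as shown in FIG. 11(b), pixels whose phase is shifted vertically by 1/4, 1/2 and 3/4 pixel are generated by linear interpolation from the integer-accuracy pixels of both the top field and the bottom field, and motion compensation is performed. For example, let a be the value of the bottom-field pixel at vertical phase -1, b the value of the top-field pixel at vertical phase 0, c the value of the bottom-field pixel at vertical phase 1, d the value of the top-field pixel at vertical phase 2, and e the value of the bottom-field pixel at vertical phase 3. The 1/4-pixel-accuracy pixels at vertical phases between 0 and 2 are then obtained as follows.
[0024]
The pixel at vertical phase 1/4 is (a + 4b + 3c)/8. The pixel at vertical phase 1/2 is (a + 3c)/4. The pixel at vertical phase 3/4 is (a + 2b + 3c + 2d)/8. The pixel at vertical phase 5/4 is (2b + 3c + 2d + e)/8. The pixel at vertical phase 3/2 is (3c + e)/4. The pixel at vertical phase 7/4 is (3c + 4d + e)/8.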
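The six frame-mode formulas above can be collected into one lookup, sketched below. The function name `frame_quarter_pel` is hypothetical; the weights are copied from the text, and the check that each set of weights sums to 1 is an observation added here.

```python
from fractions import Fraction

def frame_quarter_pel(a, b, c, d, e):
    """Frame-mode interpolation from pixels at vertical phases -1, 0, 1, 2, 3.

    a, c, e are bottom-field pixels and b, d are top-field pixels.
    Returns a dict mapping vertical phase to interpolated value.
    """
    a, b, c, d, e = (Fraction(v) for v in (a, b, c, d, e))
    return {
        Fraction(1, 4): (a + 4 * b + 3 * c) / 8,
        Fraction(1, 2): (a + 3 * c) / 4,
        Fraction(3, 4): (a + 2 * b + 3 * c + 2 * d) / 8,
        Fraction(5, 4): (2 * b + 3 * c + 2 * d + e) / 8,
        Fraction(3, 2): (3 * c + e) / 4,
        Fraction(7, 4): (3 * c + 4 * d + e) / 8,
    }

flat = frame_quarter_pel(8, 8, 8, 8, 8)      # a flat area stays flat
impulse = frame_quarter_pel(0, 0, 8, 0, 0)   # only pixel c is non-zero
```

In every formula the weights add up to the denominator, so a flat area is reproduced unchanged, and the impulse case shows how strongly the bottom-field pixel c contributes at each phase.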
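[0025]
As described above, the conventional image decoding apparatus of Document 2 can decode the compressed image data of an interlaced high-resolution image into standard-resolution image data.
[0026]
However, in the conventional image decoding apparatus of Document 2, the phases of the pixels of the standard-resolution image obtained in field DCT mode differ from those of the pixels obtained in frame DCT mode. Specifically, in field DCT mode, as shown in FIG. 12, the vertical phases of the pixels of the lower-layer top field are 1/2, 5/2, ... and the vertical phases of the pixels of the lower-layer bottom field are 1, 3, .... In frame DCT mode, as shown in FIG. 13, the vertical phases of the pixels of the lower-layer top field are 0, 2, ... and the vertical phases of the pixels of the lower-layer bottom field are 1, 3, .... As a result, images with different phases are mixed in the frame memory 136, and the quality of the output image is degraded.
[0027]
In addition, the conventional image decoding apparatus of Document 2 does not correct the phase shift between field motion prediction mode and frame motion prediction mode. This also degrades the quality of the output image.
[0028]
[Problems to be solved by the invention]
An image decoding apparatus intended to solve these problems is proposed in Japanese Patent Application No. Hei 10-208385.
[0029]
The image decoding apparatus proposed in Japanese Patent Application No. Hei 10-208385 is described next.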
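[0030]
The image decoding apparatus 200 proposed in Japanese Patent Application No. Hei 10-208385 and shown in FIG. 14 receives a bit stream in which a high-resolution image with, for example, 1152 effective lines in the vertical direction is compressed with MPEG2, decodes the input bit stream while reducing it to 1/2 resolution, and outputs a standard-resolution image with, for example, 576 effective lines in the vertical direction.
[0031]
In the following, the high-resolution image is also called the upper layer and the standard-resolution image is also called the lower layer. Normally, applying the inverse discrete cosine transform to a DCT block of 8 × 8 discrete cosine coefficients yields decoded data of 8 × 8 pixels. A process that performs the inverse discrete cosine transform and reduces the resolution at the same time, for example by decoding 8 × 8 discrete cosine coefficients into decoded data of 4 × 4 pixels, is called a reduced inverse discrete cosine transform.
[0032]
The image decoding apparatus 200 comprises a bit stream analysis device 201 that receives and analyzes the bit stream of the compressed high-resolution image, a variable-length code decoding device 202 that decodes the bit stream, which has been variable-length coded by assigning code lengths according to the frequency of occurrence of the data, an inverse quantization device 203 that multiplies each coefficient of the DCT block by the quantization step, a field-mode reduced inverse discrete cosine transform device 204 that applies the reduced inverse discrete cosine transform to DCT blocks transformed in field DCT mode to generate a standard-resolution image, a frame-mode reduced inverse discrete cosine transform device 205 that applies the reduced inverse discrete cosine transform to DCT blocks transformed in frame DCT mode to generate a standard-resolution image, an adder 206 that adds the standard-resolution image obtained by the reduced inverse discrete cosine transform and a motion-compensated reference image, a frame memory 207 that temporarily stores the reference image, a field-mode motion compensation device 208 that applies motion compensation corresponding to field motion prediction mode to the reference image stored in the frame memory 207, a frame-mode motion compensation device 209 that applies motion compensation corresponding to frame motion prediction mode to the reference image stored in the frame memory 207, and a picture frame conversion and phase shift correction device 210 that post-filters the image stored in the frame memory 207, thereby converting the picture frame and correcting the pixel phase shift, and outputs standard-resolution image data for display on a television monitor or the like.
[0033]
The field-mode reduced inverse discrete cosine transform device 204 is used when a macroblock of the input bit stream has been transformed in field DCT mode. For a DCT block holding the 8 × 8 coefficients of a macroblock transformed in field DCT mode, the device 204 applies the inverse discrete cosine transform only to the 4 × 4 low-band coefficients, as shown in FIG. 8. That is, it performs the reduced inverse discrete cosine transform based on the 4 low-band discrete cosine coefficients in the horizontal and vertical directions. By performing this reduced inverse discrete cosine transform, the device 204 can decode a standard-resolution image in which one DCT block consists of 4 × 4 pixels. As shown in FIG. 15, the vertical phases of the pixels of the decoded image data are 1/2, 5/2, ... for the top field and 1, 3, ... for the bottom field. That is, in the decoded lower-layer top field, the phase of the first pixel (the pixel at phase 1/2) is midway between the first and second pixels of the upper-layer top field (the pixels at phases 0 and 2), and the phase of the second pixel (the pixel at phase 5/2) is midway between the third and fourth pixels of the upper-layer top field (the pixels at phases 4 and 6). In the decoded lower-layer bottom field, the phase of the first pixel (the pixel at phase 1) is midway between the first and second pixels of the upper-layer bottom field (the pixels at phases 1 and 3), and the phase of the second pixel (the pixel at phase 3) is midway between the third and fourth pixels of the upper-layer bottom field (the pixels at phases 5 and 7).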
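The phase values quoted above can be reproduced with a short calculation. The sketch assumes that upper-layer phases are counted in upper-layer lines and that dividing by 2 converts them to lower-layer line units; that unit conversion is an interpretation made here, not a statement from the source, and the function name is hypothetical.

```python
from fractions import Fraction

def lower_layer_phases(upper_phases):
    """Phases of 2:1 reduced pixels, in lower-layer line units.

    Each reduced pixel is taken to sit midway between a pair of
    consecutive same-field lines of the upper layer.
    """
    pairs = zip(upper_phases[0::2], upper_phases[1::2])
    # Midpoint in upper-layer units, then halve for lower-layer units (assumption).
    return [Fraction(p + q, 2) / 2 for p, q in pairs]

top = lower_layer_phases([0, 2, 4, 6])       # upper-layer top-field lines
bottom = lower_layer_phases([1, 3, 5, 7])    # upper-layer bottom-field lines
```

Under these assumptions the top field lands on 1/2 and 5/2 and the bottom field on 1 and 3, so the two fields are no longer evenly interleaved, which is the phase shift that the post-filter later corrects.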
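[0034]
The frame-mode reduced inverse discrete cosine transform device 205 is used when a macroblock of the input bit stream has been transformed in frame DCT mode. It applies the reduced inverse discrete cosine transform to a DCT block holding the 8 × 8 coefficients of a macroblock transformed in frame DCT mode. The device 205 decodes a standard-resolution image in which one DCT block consists of 4 × 4 pixels, and generates an image whose pixel phases are the same as those of the standard-resolution image generated by the field-mode reduced inverse discrete cosine transform device 204. That is, as shown in FIG. 15, the vertical phases of the pixels of the image data decoded by the device 205 are 1/2, 5/2, ... for the top field and 1, 3, ... for the bottom field.
[0035]
The processing of the frame-mode reduced inverse discrete cosine transform device 205 is described in detail later.
[0036]
When a macroblock that has undergone the reduced inverse discrete cosine transform in the field-mode device 204 or the frame-mode device 205 is an intra image, the adder 206 stores the intra image in the frame memory 207 as it is. When such a macroblock is an inter image, the adder 206 combines the inter image with the reference image that has been motion-compensated by the field-mode motion compensation device 208 or the frame-mode motion compensation device 209, and stores the result in the frame memory 207.
[0037]
The field-mode motion compensation device 208 is used when the motion prediction mode of the macroblock is field motion prediction mode. It interpolates the standard-resolution reference image stored in the frame memory 207 with 1/4-pixel accuracy, taking into account the phase shift between the top field and the bottom field, and performs motion compensation corresponding to field motion prediction mode. The reference image motion-compensated by the device 208 is supplied to the adder 206 and combined with the inter image.
[0038]
The frame-mode motion compensation device 209 is used when the motion prediction mode of the macroblock is frame motion prediction mode. It interpolates the standard-resolution reference image stored in the frame memory 207 with 1/4-pixel accuracy, taking into account the phase shift between the top field and the bottom field, and performs motion compensation corresponding to frame motion prediction mode. The reference image motion-compensated by the device 209 is supplied to the adder 206 and combined with the inter image.
[0039]
The picture frame conversion and phase shift correction device 210 receives the standard-resolution reference image stored in the frame memory 207 or the image combined by the adder 206, and by post-filtering corrects the phase shift between the top field and the bottom field and converts the picture frame to conform to the standard-resolution television standard. That is, it corrects a standard-resolution image in which the vertical phases of the top-field pixels are 1/2, 5/2, ... and those of the bottom-field pixels are 1, 3, ... so that, for example, the vertical phases of the top-field pixels become 0, 2, 4, ... and those of the bottom-field pixels become 1, 3, 5, .... The device 210 also reduces the picture frame of the high-resolution television standard to 1/4 and converts it to the picture frame of the standard-resolution television standard.
[0040]
With the above configuration, the image decoding apparatus 200 proposed in Japanese Patent Application No. Hei 10-208385 can decode a bit stream in which a high-resolution image is compressed with MPEG2 while reducing the resolution to 1/2, and can output a standard-resolution image.
[0041]
The processing of the frame-mode reduced inverse discrete cosine transform device 205 is described next in more detail.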
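[0042]
As shown in FIG. 16, the bit stream in which the high-resolution image is compression-encoded is input to the frame-mode reduced inverse discrete cosine transform device 205 one DCT block at a time.
[0043]
First, in step S1, an 8 × 8 inverse discrete cosine transform (IDCT 8 × 8) is applied to the discrete cosine coefficients y of this DCT block (of all the discrete cosine coefficients of the DCT block, the coefficients in the vertical direction are shown in the figure as y₁ to y₈). The inverse discrete cosine transform yields 8 × 8 decoded pixel data x (of all the pixel data of the DCT block, the pixel data in the vertical direction are shown in the figure as x₁ to x₈).
[0044]
Next, in step S2, the 8 × 8 pixel data x are taken out alternately line by line in the vertical direction and separated into two pixel blocks: a 4 × 4 top-field pixel block and a 4 × 4 bottom-field pixel block, each corresponding to interlaced scanning. That is, the pixel data x₁ of the first line, x₃ of the third line, x₅ of the fifth line and x₇ of the seventh line in the vertical direction are taken out to form the pixel block for the top field. Likewise, the pixel data x₂ of the second line, x₄ of the fourth line, x₆ of the sixth line and x₈ of the eighth line are taken out to form the pixel block for the bottom field. The process of separating the pixels of a DCT block into two pixel blocks corresponding to interlaced scanning is hereinafter called field separation.
[0045]
Next, in step S3, a 4 × 4 discrete cosine transform (DCT 4 × 4) is applied to each of the two field-separated pixel blocks.
[0046]
Next, in step S4, the high-band components of the discrete cosine coefficients z of the top-field pixel block obtained by the 4 × 4 discrete cosine transform (of all the coefficients of the top-field pixel block, the coefficients in the vertical direction are shown in the figure as z₁, z₃, z₅, z₇) are discarded, leaving a pixel block of 2 × 2 discrete cosine coefficients. Likewise, the high-band components of the discrete cosine coefficients z of the bottom-field pixel block (of all the coefficients of the bottom-field pixel block, the coefficients in the vertical direction are shown in the figure as z₂, z₄, z₆, z₈) are discarded, leaving a pixel block of 2 × 2 discrete cosine coefficients.
[0047]
Next, in step S5, a 2 × 2 inverse discrete cosine transform (IDCT 2 × 2) is applied to the pixel blocks from which the high-band coefficients have been discarded. This yields 2 × 2 decoded pixel data x′ (of all the pixel data of the top-field pixel block, the pixel data in the vertical direction are shown in the figure as x′₁, x′₃, and of all the pixel data of the bottom-field pixel block, the pixel data in the vertical direction are shown as x′₂, x′₄).
[0048]
Next, in step S6, the pixel data of the top-field pixel block and the pixel data of the bottom-field pixel block are combined alternately line by line in the vertical direction to generate a DCT block of 4 × 4 pixel data that has undergone the reduced inverse discrete cosine transform. The process of alternately combining, in the vertical direction, the pixels of the two pixel blocks corresponding to the top field and the bottom field is hereinafter called frame synthesis.
[0049]
By performing steps S1 to S6, the frame-mode reduced inverse discrete cosine transform device 205 can generate a 4 × 4 DCT block whose pixels have the same phase as the pixels of the standard-resolution image generated by the field-mode reduced inverse discrete cosine transform device 204, as shown in FIG. 15.
[0050]
The frame-mode reduced inverse discrete cosine transform device 205 computes steps S1 to S6 with a single matrix. Specifically, the matrix [FS′] shown in equation (1) below, obtained by expanding the above processing using the addition theorem, is multiplied by the discrete cosine coefficients y (y₁ to y₈) of one DCT block to obtain the pixel data x′ (x′₁ to x′₄) of the DCT block after the reduced inverse discrete cosine transform.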
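Steps S1 to S6 can be traced for a single column of vertical coefficients with the sketch below, which also collapses them into one 4 × 8 matrix by feeding unit vectors. It is an illustration under assumptions: the orthonormal DCT, the 1/sqrt(2) rescaling in step S5, and all function names are choices made here, and the resulting matrix is not claimed to match the entries of [FS′] in equation (1), which is not reproduced in this text.

```python
import math

def dct(x):
    """Orthonormal DCT-II (assumed normalization)."""
    n = len(x)
    return [math.sqrt((1 if k == 0 else 2) / n)
            * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
            for k in range(n)]

def idct(coeffs):
    """Inverse of dct() above."""
    n = len(coeffs)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * coeffs[k]
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for k in range(n))
            for i in range(n)]

def frame_mode_reduce(y):
    """Steps S1-S6 applied to one column of 8 vertical coefficients."""
    x = idct(y)                               # S1: 8-point IDCT
    top, bottom = x[0::2], x[1::2]            # S2: field separation
    zt, zb = dct(top), dct(bottom)            # S3: 4-point DCT per field
    zt, zb = zt[:2], zb[:2]                   # S4: discard the high-band coefficients
    s = 1 / math.sqrt(2)                      # level-preserving rescale (assumption)
    xt = idct([s * v for v in zt])            # S5: 2-point IDCT per field
    xb = idct([s * v for v in zb])
    return [xt[0], xb[0], xt[1], xb[1]]       # S6: frame synthesis

def as_matrix():
    """Collapse S1-S6 into one 4x8 matrix by transforming the 8 unit vectors."""
    cols = [frame_mode_reduce([1.0 if j == k else 0.0 for j in range(8)]) for k in range(8)]
    return [[cols[k][r] for k in range(8)] for r in range(4)]

y = dct([10, 20] * 4)                         # top-field lines = 10, bottom-field lines = 20
direct = frame_mode_reduce(y)
via_matrix = [sum(w * v for w, v in zip(row, y)) for row in as_matrix()]
```

The test column alternates between 10 and 20 from line to line, as an interlaced picture with motion might. The reduced output keeps that alternation, which illustrates why field separation preserves the interlaced structure, and the single matrix gives the same result as the six steps because every step is linear.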
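[0051]
[Equation 1]

[0052]
In equation (1), A to J are as follows.
[0053]
[Equation 2]

[0054]
As described above, in the image decoding apparatus 200 proposed in Japanese Patent Application No. Hei 10-208385, in field DCT mode a 4 × 4 reduced inverse discrete cosine transform is applied to each of the top field and the bottom field to decode a standard-resolution image, and in frame DCT mode field separation is performed before the reduced inverse discrete cosine transform to decode a standard-resolution image. Because the apparatus 200 performs different processing in field DCT mode and frame DCT mode in this way, the interlaced nature of the interlaced image is preserved, the phases of the images decoded in field DCT mode and frame DCT mode are made identical, and the quality of the output image is not degraded.
[0055]
In the image decoding apparatus 200, the 4 × 4 reduced inverse discrete cosine transform of the field-mode device 204 and the reduced inverse discrete cosine transform of steps S1 to S6 of the frame-mode device 205 may be processed using a fast algorithm.
[0056]
For example, the processing can be accelerated by using Wang's algorithm (reference: Zhong DE Wang, "Fast Algorithms for the Discrete W Transform and for the Discrete Fourier Transform", IEEE Tr. ASSP-32, No. 4, pp. 803-816, Aug. 1984).
[0057]
When the matrix computed by the field-mode reduced inverse discrete cosine transform device 204 is decomposed using Wang's algorithm, it is decomposed as shown in equation (2) below.
[0058]
[Equation 3]

[0059]
FIG. 17 shows the processing flow when Wang's algorithm is applied to the processing of the field-mode reduced inverse discrete cosine transform device 204.
[0060]
This processing flow consists of first to fifth multipliers 204a to 204e and first to ninth adders 204f to 204n, and receives the one-dimensional discrete cosine coefficients X(0) to X(3). W₀ to W₄ shown at the first to fifth multipliers 204a to 204e are the values by which each multiplier multiplies. The processing flow outputs four coefficient values or pixel values Y(0) to Y(3) computed according to equations (3) to (6) below. In equations (3) to (6), W₀ = W₁.
[0061]
[Equation 4]

[0062]
The field-mode reduced inverse discrete cosine transform device 204 applies the processing flow of FIG. 17 to the four vertical coefficients of the low-band 4 × 4 region of the DCT block, then rotates the coefficient positions by 90° in memory, and applies the processing flow of FIG. 17 again to the four horizontal coefficients of the low-band 4 × 4 region of the DCT block.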
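The two-pass structure described above (vertical pass, 90° turn, horizontal pass with the same routine) can be sketched as follows. A plain 4-point inverse DCT stands in for the fast flow of FIG. 17, whose constants are not reproduced in this text; the normalization, the 1/sqrt(2) rescaling and the names `reduce4` and `reduce_block_field` are assumptions made for illustration.

```python
import math

def idct(coeffs):
    """Orthonormal inverse DCT (assumed normalization)."""
    n = len(coeffs)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * coeffs[k]
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for k in range(n))
            for i in range(n)]

def reduce4(low4):
    """1-D routine: 4 low-band coefficients in, 4 values out (rescale is an assumption)."""
    return idct([c / math.sqrt(2) for c in low4])

def reduce_block_field(coeffs):
    """coeffs[v][h] holds 8x8 coefficients: v is vertical, h is horizontal frequency."""
    # Pass 1: vertical transform on each of the 4 low horizontal-frequency columns.
    cols = [reduce4([coeffs[v][h] for v in range(4)]) for h in range(4)]
    # 90-degree turn: regroup so each output line holds its 4 horizontal coefficients.
    lines = [[cols[h][y] for h in range(4)] for y in range(4)]
    # Pass 2: the same 1-D routine, now along the horizontal direction.
    return [reduce4(line) for line in lines]

dc_only = [[0.0] * 8 for _ in range(8)]
dc_only[0][0] = 80.0   # orthonormal 8x8 DCT of a flat block of value 10
pixels = reduce_block_field(dc_only)
```

Reusing one 1-D routine for both passes is what makes the turn in memory worthwhile: only the order in which coefficients are read changes, not the arithmetic.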
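[0063]
Applying Wang's algorithm in this way speeds up the computation.
[0064]
When the matrix [FS′] computed by the frame-mode reduced inverse discrete cosine transform device 205 is decomposed using Wang's algorithm, it is decomposed as shown in equation (7) below.
[0065]
[Equation 5]

[0066]
In equation (7), A to J are as follows.
[0067]
[Equation 6]

[0068]
FIG. 18 shows the processing flow when Wang's algorithm is applied to the processing of the frame-mode reduced inverse discrete cosine transform device 205.
[0069]
This processing flow consists of first to tenth multipliers 205a to 205j and first to twelfth adders 205k to 205v, and receives the one-dimensional discrete cosine coefficients X(0) to X(7). W₀ to W₉ shown at the first to tenth multipliers 205a to 205j are the values by which each multiplier multiplies. The processing flow outputs four coefficient values Y(0) to Y(3) computed according to equations (8) to (11) below. In equations (8) to (11), W₀ = W₁.
[0070]
[Equation 7]

[0071]
The frame-mode reduced inverse discrete cosine transform device 205 applies the processing flow of FIG. 18 to the eight vertical coefficients of the low-band 8 × 4 region of the DCT block, then rotates the coefficient positions by 90° in memory, and applies the processing flow of FIG. 17 to the four horizontal coefficients of the low-band 4 × 4 region of the DCT block.
[0072]
Applying Wang's algorithm in this way speeds up the computation.
[0073]
In the image decoding apparatus 200 proposed in Japanese Patent Application No. Hei 10-208385 described above, however, the reduced inverse discrete cosine transform in field mode and the reduced inverse discrete cosine transform in frame mode are entirely separate processing flows.
[0074]
Consequently, if these processes are implemented in software, the code size becomes large and a large instruction memory is required. The larger code size also makes cache misses more likely, which may slow down decoding. If these processes are implemented in dedicated circuits, the circuit scale likewise becomes large.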
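[0075]
The present invention has been made in view of these circumstances. Its object is to provide an image decoding apparatus and an image decoding method that decode standard-resolution image data from the compressed image data of a high-resolution image, that can eliminate the pixel phase shift between field orthogonal transform mode and frame orthogonal transform mode without losing the interlaced nature of the interlaced image, and that have a simplified processing flow.
[0076]
[Means for solving the problems]
An image decoding apparatus according to the present invention decodes moving image data of 1/2 the first resolution from compressed image data of a first resolution that has been compression-encoded by applying a two-dimensional orthogonal transform to pixel blocks of 8 × 8 pixels to generate orthogonal transform blocks of 8 × 8 coefficients. The apparatus comprises first inverse orthogonal transform means that switches each multiplication coefficient used in the inverse orthogonal transform, and the coefficient input to each multiplication coefficient, between an orthogonal transform block transformed by the orthogonal transform scheme for interlaced scanning (field orthogonal transform mode) and an orthogonal transform block transformed by the orthogonal transform scheme for progressive scanning (frame orthogonal transform mode), and that applies a one-dimensional inverse orthogonal transform in the vertical direction to 8 × 4 coefficients among the coefficients of the orthogonal transform block of the compressed image data to generate an orthogonal transform block of 4 × 4 coefficients, and second inverse orthogonal transform means that applies a one-dimensional inverse orthogonal transform in the horizontal direction to the 4 × 4 coefficients produced by the first inverse orthogonal transform means to generate a pixel block of 4 × 4 pixels. For the coefficients of an orthogonal transform block transformed in field orthogonal transform mode, the first inverse orthogonal transform means applies the inverse orthogonal transform to the 4 low-band coefficients in the horizontal and vertical directions to generate an orthogonal transform block of 4 × 4 coefficients. For the coefficients of an orthogonal transform block transformed in frame orthogonal transform mode, it applies the inverse orthogonal transform to the 4 low-band coefficients in the horizontal direction and the 8 coefficients in the vertical direction, separates the pixels of the inverse-transformed block into two pixel blocks corresponding to interlaced scanning, applies an orthogonal transform to each of the two separated pixel blocks, applies the inverse orthogonal transform to 2 coefficients in the horizontal and vertical directions among the coefficients of the two transformed pixel blocks, and combines the two inverse-transformed pixel blocks to generate an orthogonal transform block of 4 × 4 coefficients.
[0077]
In this image decoding apparatus, for the coefficients of an orthogonal transform block transformed in field orthogonal transform mode, the inverse orthogonal transform is applied to the 4 low-band coefficients in the horizontal and vertical directions to generate an orthogonal transform block of 4 × 4 coefficients. For the coefficients of an orthogonal transform block transformed in frame orthogonal transform mode, the inverse orthogonal transform is applied to the 4 low-band coefficients in the horizontal direction and the 8 coefficients in the vertical direction, the pixels of the inverse-transformed block are separated into two pixel blocks corresponding to interlaced scanning, an orthogonal transform is applied to each of the two separated pixel blocks, the inverse orthogonal transform is applied to 2 coefficients in the horizontal and vertical directions among the coefficients of the two transformed pixel blocks, and the two inverse-transformed pixel blocks are combined to generate an orthogonal transform block of 4 × 4 coefficients. The apparatus then outputs moving image data of a second resolution lower than the first resolution. Furthermore, the apparatus switches each multiplication coefficient used in the inverse orthogonal transform, and the coefficient input to each multiplication coefficient, between an orthogonal transform block transformed in field orthogonal transform mode and an orthogonal transform block transformed in frame orthogonal transform mode, and applies a one-dimensional inverse orthogonal transform in the vertical direction to 8 × 4 coefficients among the coefficients of the orthogonal transform block of the compressed image data.
[0078]
An image decoding method according to the present invention decodes moving image data of 1/2 the first resolution from compressed image data of a first resolution that has been compression-encoded by applying a two-dimensional orthogonal transform to pixel blocks of 8 × 8 pixels to generate orthogonal transform blocks of 8 × 8 coefficients. The method switches each multiplication coefficient used in the inverse orthogonal transform, and the coefficient input to each multiplication coefficient, between an orthogonal transform block transformed by the orthogonal transform scheme for interlaced scanning (field orthogonal transform mode) and an orthogonal transform block transformed by the orthogonal transform scheme for progressive scanning (frame orthogonal transform mode), applies a one-dimensional inverse orthogonal transform in the vertical direction to 8 × 4 coefficients among the coefficients of the orthogonal transform block of the compressed image data to generate an orthogonal transform block of 4 × 4 coefficients, and applies a one-dimensional inverse orthogonal transform in the horizontal direction to the inverse-transformed 4 × 4 coefficients to generate a pixel block of 4 × 4 pixels. For the coefficients of an orthogonal transform block transformed in field orthogonal transform mode, the inverse orthogonal transform is applied to the 4 low-band coefficients in the horizontal and vertical directions to generate an orthogonal transform block of 4 × 4 coefficients. For the coefficients of an orthogonal transform block transformed in frame orthogonal transform mode, the inverse orthogonal transform is applied to the 4 low-band coefficients in the horizontal direction and the 8 coefficients in the vertical direction, the pixels of the inverse-transformed block are separated into two pixel blocks corresponding to interlaced scanning, an orthogonal transform is applied to each of the two separated pixel blocks, the inverse orthogonal transform is applied to 2 coefficients in the horizontal and vertical directions among the coefficients of the two transformed pixel blocks, and the two inverse-transformed pixel blocks are combined to generate an orthogonal transform block of 4 × 4 coefficients.
[0079]
In this image decoding method, for the coefficients of an orthogonal transform block transformed in field orthogonal transform mode, the inverse orthogonal transform is applied to the 4 low-band coefficients in the horizontal and vertical directions to generate an orthogonal transform block of 4 × 4 coefficients. For the coefficients of an orthogonal transform block transformed in frame orthogonal transform mode, the inverse orthogonal transform is applied to the 4 low-band coefficients in the horizontal direction and the 8 coefficients in the vertical direction, the pixels of the inverse-transformed block are separated into two pixel blocks corresponding to interlaced scanning, an orthogonal transform is applied to each of the two separated pixel blocks, the inverse orthogonal transform is applied to 2 coefficients in the horizontal and vertical directions among the coefficients of the two transformed pixel blocks, and the two inverse-transformed pixel blocks are combined to generate an orthogonal transform block of 4 × 4 coefficients. The method then outputs moving image data of a second resolution lower than the first resolution. Furthermore, the method switches each multiplication coefficient used in the inverse orthogonal transform, and the coefficient input to each multiplication coefficient, between an orthogonal transform block transformed in field orthogonal transform mode and an orthogonal transform block transformed in frame orthogonal transform mode, and applies a one-dimensional inverse orthogonal transform in the vertical direction to 8 × 4 coefficients among the coefficients of the orthogonal transform block of the compressed image data.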
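[0080]
[Embodiments of the invention]
Embodiments of the present invention are described below with reference to the drawings.
[0081]
FIG. 1 is a block diagram of an image decoding apparatus according to an embodiment of the present invention.
[0082]
The image decoding apparatus 10 shown in FIG. 1 receives a bit stream in which a high-resolution image with, for example, 1152 effective lines in the vertical direction is compressed with MPEG2, decodes the input bit stream while reducing it to 1/2 resolution, and outputs a standard-resolution image with, for example, 576 effective lines in the vertical direction.
[0083]
The image decoding apparatus 10 comprises a bit stream analysis device 11 that receives and analyzes the bit stream of the compressed high-resolution image, a variable-length code decoding device 12 that decodes the bit stream, which has been variable-length coded by assigning code lengths according to the frequency of occurrence of the data, an inverse quantization device 13 that multiplies each coefficient of the DCT block by the quantization step, a reduced inverse discrete cosine transform device 14 that applies the reduced inverse discrete cosine transform to discrete-cosine-transformed DCT blocks to generate a standard-resolution image, an adder 16 that adds the standard-resolution image obtained by the reduced inverse discrete cosine transform and a motion-compensated reference image, a frame memory 17 that temporarily stores the reference image, a field-mode motion compensation device 18 that applies motion compensation corresponding to field motion prediction mode to the reference image stored in the frame memory 17, a frame-mode motion compensation device 19 that applies motion compensation corresponding to frame motion prediction mode to the reference image stored in the frame memory 17, and a picture frame conversion and phase shift correction device 20 that post-filters the image stored in the frame memory 17, thereby converting the picture frame and correcting the pixel phase shift, and outputs standard-resolution image data for display on a television monitor or the like.
[0084]
When a macroblock of the input bit stream has been transformed in field DCT mode, the reduced inverse discrete cosine transform device 14 applies the inverse discrete cosine transform only to the 4 × 4 low-band coefficients, as shown in FIG. 8, of a DCT block holding the 8 × 8 coefficients of that macroblock. That is, it performs the reduced inverse discrete cosine transform based on the 4 low-band discrete cosine coefficients in the horizontal and vertical directions. By performing this reduced inverse discrete cosine transform, the device 14 can decode a standard-resolution image in which one DCT block consists of 4 × 4 pixels. As shown in FIG. 15, the vertical phases of the pixels of the decoded image data are 1/2, 5/2, ... for the top field and 1, 3, ... for the bottom field. That is, in the decoded lower-layer top field, the phase of the first pixel (the pixel at phase 1/2) is midway between the first and second pixels of the upper-layer top field (the pixels at phases 0 and 2), and the phase of the second pixel (the pixel at phase 5/2) is midway between the third and fourth pixels of the upper-layer top field (the pixels at phases 4 and 6). In the decoded lower-layer bottom field, the phase of the first pixel (the pixel at phase 1) is midway between the first and second pixels of the upper-layer bottom field (the pixels at phases 1 and 3), and the phase of the second pixel (the pixel at phase 3) is midway between the third and fourth pixels of the upper-layer bottom field (the pixels at phases 5 and 7).
[0085]
When a macroblock of the input bit stream has been transformed in frame DCT mode, the reduced inverse discrete cosine transform device 14 applies, to a DCT block holding the 8 × 8 coefficients of that macroblock, the same 8 × 4 reduced inverse discrete cosine transform as the frame-mode reduced inverse discrete cosine transform device 205 described above. The device 14 decodes a standard-resolution image in which one DCT block consists of 4 × 4 pixels, and generates an image whose pixel phases are the same as those of the standard-resolution image decoded when the block was transformed in field DCT mode. That is, as shown in FIG. 15, the device 14 generates an image in which the vertical phases of the top-field pixels are 1/2, 5/2, ... and those of the bottom-field pixels are 1, 3, ....
[0086]
When a macroblock that has undergone the reduced inverse discrete cosine transform in the device 14 is an intra image, the adder 16 stores the intra image in the frame memory 17 as it is. When such a macroblock is an inter image, the adder 16 combines the inter image with the reference image that has been motion-compensated by the field-mode motion compensation device 18 or the frame-mode motion compensation device 19, and stores the result in the frame memory 17.
[0087]
The field-mode motion compensation device 18 is used when the motion prediction mode of the macroblock is field motion prediction mode. It interpolates the standard-resolution reference image stored in the frame memory 17 with 1/4-pixel accuracy, taking into account the phase shift between the top field and the bottom field, and performs motion compensation corresponding to field motion prediction mode. The reference image motion-compensated by the device 18 is supplied to the adder 16 and combined with the inter image.
[0088]
The frame-mode motion compensation device 19 is used when the motion prediction mode of the macroblock is frame motion prediction mode. It interpolates the standard-resolution reference image stored in the frame memory 17 with 1/4-pixel accuracy, taking into account the phase shift between the top field and the bottom field, and performs motion compensation corresponding to frame motion prediction mode. The reference image motion-compensated by the device 19 is supplied to the adder 16 and combined with the inter image.
[0089]
The picture frame conversion and phase shift correction device 20 receives the standard-resolution reference image stored in the frame memory 17 or the image combined by the adder 16, and by post-filtering corrects the phase shift between the top field and the bottom field and converts the picture frame to conform to the standard-resolution television standard. That is, it corrects a standard-resolution image in which the vertical phases of the top-field pixels are 1/2, 5/2, ... and those of the bottom-field pixels are 1, 3, ... so that, for example, the vertical phases of the top-field pixels become 0, 2, 4, ... and those of the bottom-field pixels become 1, 3, 5, .... The device 20 also reduces the picture frame of the high-resolution television standard to 1/4 and converts it to the picture frame of the standard-resolution television standard.
[0090]
With the above configuration, the image decoding apparatus 10 can decode a bit stream in which a high-resolution image is compressed with MPEG2 while reducing the resolution to 1/2, and can output a standard-resolution image.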
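[0091]
The processing of the reduced inverse discrete cosine transform device 14 is described next in more detail.
[0092]
As shown in FIG. 2, the reduced inverse discrete cosine transform device 14 has an input buffer 21 supplied from the inverse quantization device 13 with the bit stream in which the high-resolution image is compressed with MPEG2, a vertical-component inverse discrete cosine transform unit 22 that performs the one-dimensional inverse discrete cosine transform of the vertical component, an intermediate buffer 23, a horizontal-component inverse discrete cosine transform unit 24 that performs the one-dimensional inverse discrete cosine transform of the horizontal component, and an output buffer 25.
[0093]
The bit stream inversely quantized by the inverse quantization device 13 is input to the input buffer 21 in units of DCT blocks, and the input buffer 21 temporarily stores each DCT block.
[0094]
The vertical-component inverse discrete cosine transform unit 22 extracts the discrete cosine coefficients stored in the input buffer 21 in units of DCT blocks and applies a one-dimensional inverse discrete cosine transform in the vertical direction to each DCT block. In doing so, the unit 22 performs a reduced inverse discrete cosine transform that applies the inverse discrete cosine transform to the 8 vertical coefficients and reduces these 8 coefficients to 4.
[0095]
The vertical-component inverse discrete cosine transform unit 22 also receives mode information on the discrete cosine transform of the DCT block to be processed, that is, information indicating whether the DCT block was transformed in field DCT mode or in frame DCT mode. When the DCT block to be processed was transformed in field DCT mode, the unit 22 performs the vertical inverse discrete cosine transform with the same algorithm as the field-mode reduced inverse discrete cosine transform device 204 of the image decoding apparatus 200 proposed in Japanese Patent Application No. Hei 10-208385. When the DCT block was transformed in frame DCT mode, the unit 22 performs the vertical inverse discrete cosine transform with the same algorithm as the frame-mode reduced inverse discrete cosine transform device 205 of that apparatus.
[0096]
FIG. 3 shows the specific processing flow of the vertical-component inverse discrete cosine transform unit 22.
[0097]
The processing flow of FIG. 3 is obtained by decomposing the matrix computed by the reduced inverse discrete cosine transform device 14 and applying Wang's algorithm to it.
[0098]
The processing flow of the vertical-component inverse discrete cosine transform unit 22 consists of first to tenth multipliers 31 to 40 and first to twelfth adders 41 to 52, and receives the one-dimensional discrete cosine coefficients X(0) to X(7). A to J shown at the first to tenth multipliers 31 to 40 are the values by which each multiplier multiplies.
[0099]
The first multiplier 31 multiplies the lowest-band discrete cosine coefficient X(0) by the value A. The second multiplier 32 multiplies the third-lowest coefficient X(2) by the value B. The third multiplier 33 multiplies the fifth-lowest coefficient X(4) by the value C. The fourth multiplier 34 multiplies the seventh-lowest coefficient X(6) by the value D. The fifth multiplier 35 multiplies the fourth-lowest coefficient X(3) by the value E. The sixth multiplier 36 multiplies the sixth-lowest coefficient X(5) by the value F. The seventh multiplier 37 multiplies the second-lowest coefficient X(1) by the value G. The eighth multiplier 38 multiplies the second-lowest coefficient X(1) by the value H.
[0100]
The ninth multiplier 39 multiplies the fourth-lowest coefficient X(3) by the value I in field DCT mode, and multiplies the highest-band coefficient X(7) by the value I in frame DCT mode. The tenth multiplier 40 multiplies the fourth-lowest coefficient X(3) by the value J in field DCT mode, and multiplies the highest-band coefficient X(7) by the value J in frame DCT mode. The coefficient input to the ninth and tenth multipliers 39 and 40 is switched, that is, the input data port is switched between the coefficient X(3) and the coefficient X(7), by a changeover switch 60.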
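[0101]
Accordingly, in field DCT mode, the processing flow of FIG. 3 outputs four coefficient values Y(0) to Y(3) computed according to equations (12) to (15) below.
[0102]
[Equation 8]

[0103]
In frame DCT mode, the processing flow of FIG. 3 outputs four coefficient values Y(0) to Y(3) computed according to equations (16) to (19) below.
[0104]
[Equation 9]

[0105]
Each of the first to tenth multipliers 31 to 40 has, for example, an internal register that stores the value A to J by which it multiplies. The values in these registers are switched between field DCT mode and frame DCT mode.
[0106]
The table below shows the multiplication constants of each multiplier as switched between field DCT mode and frame DCT mode. In this table, W₀ = W₁.
[0107]
[Table 1]

[0108]
Substituting the field DCT mode values of this table into equations (12) to (15) shows that the processing flow of FIG. 3 outputs four coefficient values Y(0) to Y(3) computed according to equations (20) to (23) below.
[0109]
[Equation 10]

[0110]
The values Y(0) to Y(3) given by equations (20) to (23) are identical to those of equations (3) to (6). That is, the vertical inverse discrete cosine transform is performed with the same algorithm as the field-mode reduced inverse discrete cosine transform device 204 of the image decoding apparatus 200 proposed in Japanese Patent Application No. Hei 10-208385.
[0111]
Substituting the frame DCT mode values into equations (16) to (19) shows that the processing flow of FIG. 3 outputs four coefficient values Y(0) to Y(3) computed according to equations (24) to (27) below.
[0112]
[Equation 11]

[0113]
The values Y(0) to Y(3) given by equations (24) to (27) are identical to those of equations (8) to (11). That is, the vertical inverse discrete cosine transform is performed with the same algorithm as the frame-mode reduced inverse discrete cosine transform device 205 of the image decoding apparatus 200 proposed in Japanese Patent Application No. Hei 10-208385.
[0114]
As described above, by switching the multiplication constants of the multipliers 31 to 40 and switching the data input port, the vertical-component inverse discrete cosine transform unit 22 can perform the vertical reduced inverse discrete cosine transform for both field DCT mode and frame DCT mode. Because the unit 22 computes the reduced inverse discrete cosine transform for field DCT mode and that for frame DCT mode with a single processing flow, its configuration is very simple.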
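The idea of one processing flow whose constants are switched by mode can be sketched with a table-driven stand-in. This is not the butterfly of FIG. 3: the constants A to J and Table 1 are not reproduced in this text, so the sketch builds a full 4 × 8 constant table per mode from reference transforms and runs a single multiply-accumulate loop. The normalization, the 1/sqrt(2) rescaling and all names (`TABLES`, `vertical_reduce`, and so on) are assumptions made for illustration.

```python
import math

def _scale(k, n):
    return math.sqrt((1 if k == 0 else 2) / n)

def dct(x):
    n = len(x)
    return [_scale(k, n) * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)) for k in range(n)]

def idct(c):
    n = len(c)
    return [sum(_scale(k, n) * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k in range(n)) for i in range(n)]

S = 1 / math.sqrt(2)   # level-preserving rescale (assumption)

def field_column(y):
    """Field DCT mode reference: 4-point IDCT of the 4 low-band coefficients."""
    return idct([S * v for v in y[:4]])

def frame_column(y):
    """Frame DCT mode reference: IDCT, field separation, DCT, truncate, IDCT, merge."""
    x = idct(y)
    top, bottom = dct(x[0::2])[:2], dct(x[1::2])[:2]
    t, b = idct([S * v for v in top]), idct([S * v for v in bottom])
    return [t[0], b[0], t[1], b[1]]

def build_table(column_fn):
    """Turn a linear 8-in, 4-out routine into a 4x8 table of constants."""
    cols = [column_fn([1.0 if j == k else 0.0 for j in range(8)]) for k in range(8)]
    return [[cols[k][r] for k in range(8)] for r in range(4)]

# Switching the mode only selects a different set of constants.
TABLES = {"field": build_table(field_column), "frame": build_table(frame_column)}

def vertical_reduce(y, mode):
    """Single code path shared by both modes."""
    return [sum(w * v for w, v in zip(row, y)) for row in TABLES[mode]]

field_out = vertical_reduce(dct([10] * 8), "field")
frame_out = vertical_reduce(dct([10, 20] * 4), "frame")
```

Only the table lookup depends on the mode, so the code that executes is the same in both cases. A fast flow such as the one in FIG. 3 would reach the same goal with far fewer multiplications than this dense table.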
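[0115]
The discrete cosine coefficients computed as described above are temporarily stored in the intermediate buffer 23.
[0116]
The horizontal-component inverse discrete cosine transform unit 24 extracts the discrete cosine coefficients stored in the intermediate buffer 23 in units of DCT blocks and applies a one-dimensional inverse discrete cosine transform in the horizontal direction to each DCT block. In doing so, the unit 24 applies the inverse discrete cosine transform to 4 horizontal coefficients. Regardless of whether the DCT block to be processed was transformed in field DCT mode or in frame DCT mode, the unit 24 performs the horizontal inverse discrete cosine transform with the same algorithm as the field-mode reduced inverse discrete cosine transform device 204 of the image decoding apparatus 200 proposed in Japanese Patent Application No. Hei 10-208385.
[0117]
That is, the processing flow of the horizontal-component inverse discrete cosine transform unit 24 is the same as that shown in FIG. 17.
[0118]
Because the processing flow of the horizontal-component inverse discrete cosine transform unit 24 is the same as that of the vertical-component inverse discrete cosine transform unit 22 in field DCT mode, the processing flow of the unit 22 may be reused for it.
[0119]
The output buffer 25 temporarily stores the 4 × 4 pixel values computed by the horizontal-component inverse discrete cosine transform unit 24 in units of DCT blocks and supplies them to the adder 16.
[0120]
As described above, the image decoding apparatus 10 according to the embodiment of the present invention can decode standard-resolution image data from the compressed image data of a high-resolution image while eliminating the pixel phase shift between field orthogonal transform mode and frame orthogonal transform mode without losing the interlaced nature of the interlaced image, and can do so with a simplified processing flow.
[0121]
[Effects of the invention]
In the image decoding apparatus and image decoding method according to the present invention, the inverse orthogonal transform is applied to the coefficients of all frequency components of an orthogonal transform block transformed in frame orthogonal transform mode, the result is separated into two pixel blocks corresponding to interlaced scanning, an orthogonal transform is applied to each of the two separated pixel blocks, the inverse orthogonal transform is applied to the low-frequency coefficients, and the two inverse-transformed pixel blocks are combined. Moving image data of a second resolution lower than the first resolution is then output. Furthermore, each multiplication coefficient used in the inverse orthogonal transform, and the coefficient input to each multiplication coefficient, are switched between an orthogonal transform block transformed in field orthogonal transform mode and an orthogonal transform block transformed in frame orthogonal transform mode, and a one-dimensional inverse orthogonal transform in the vertical direction is applied to 8 × 4 coefficients among the coefficients of the orthogonal transform block of the compressed image data.
[0122]
As a result, the present invention reduces the amount of computation and the storage capacity required for decoding, and eliminates the pixel phase shift between field orthogonal transform mode and frame orthogonal transform mode without losing the interlaced nature of the interlaced image. It also improves the image quality of the moving image data of the second resolution. Furthermore, it simplifies the processing associated with the inverse discrete cosine transform.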
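[Brief description of the drawings]
[FIG. 1] A block diagram of the image decoding apparatus according to the embodiment of the present invention.
[FIG. 2] A block diagram of the reduced inverse discrete cosine transform device of the image decoding apparatus.
[FIG. 3] A diagram showing the processing flow of the vertical-component inverse discrete cosine transform unit of the reduced inverse discrete cosine transform device.
[FIG. 4] A block diagram showing the conventional first down decoder.
[FIG. 5] A block diagram showing the conventional second down decoder.
[FIG. 6] A block diagram showing the conventional third down decoder.
[FIG. 7] A block diagram of the conventional image decoding apparatus.
[FIG. 8] A diagram for explaining the reduced inverse discrete cosine transform in field DCT mode of the conventional image decoding apparatus.
[FIG. 9] A diagram for explaining the reduced inverse discrete cosine transform in frame DCT mode of the conventional image decoding apparatus.
[FIG. 10] A diagram for explaining the linear interpolation in field motion prediction mode of the conventional image decoding apparatus.
[FIG. 11] A diagram for explaining the linear interpolation in frame motion prediction mode of the conventional image decoding apparatus.
[FIG. 12] A diagram for explaining the pixel phases obtained in field DCT mode of the conventional image decoding apparatus.
[FIG. 13] A diagram for explaining the pixel phases obtained in frame DCT mode of the conventional image decoding apparatus.
[FIG. 14] A block diagram of the image decoding apparatus proposed in Japanese Patent Application No. Hei 10-208385.
[FIG. 15] A diagram for explaining the vertical pixel phases of the reference image stored in the frame memory of the image decoding apparatus proposed in Japanese Patent Application No. Hei 10-208385.
[FIG. 16] A diagram for explaining the one-block processing of the frame-mode reduced inverse discrete cosine transform device of the image decoding apparatus proposed in Japanese Patent Application No. Hei 10-208385.
[FIG. 17] A diagram showing the computation flow when Wang's algorithm is applied to the processing of the field-mode reduced inverse discrete cosine transform device of the image decoding apparatus proposed in Japanese Patent Application No. Hei 10-208385.
[FIG. 18] A diagram showing the computation flow when Wang's algorithm is applied to the one-block processing of the frame-mode reduced inverse discrete cosine transform device of the image decoding apparatus proposed in Japanese Patent Application No. Hei 10-208385.
[Explanation of symbols]
10: image decoding apparatus, 14: reduced inverse discrete cosine transform device, 22: vertical-component inverse discrete cosine transform unit, 24: horizontal-component inverse discrete cosine transform unit
[0001]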
BACKGROUND OF THE INVENTION
The present invention relates to an image decoding apparatus and an image decoding method for decoding compressed image data of a first resolution that has been compression-encoded by orthogonal transform in units of orthogonal transform blocks of 8 × 8 pixels, and in particular to an image decoding apparatus and an image decoding method that decode the compressed image data of the first resolution and reduce it to moving image data of a second resolution lower than the first resolution.
[0002]
[Prior art]
Digital television broadcasting standards that use an image compression scheme such as MPEG2 (Moving Picture Experts Group phase 2) are being standardized. These include standards for standard-resolution images (for example, 576 effective lines in the vertical direction) and standards for high-resolution images (for example, 1152 effective lines in the vertical direction). Consequently, there has recently been demand for a down decoder that decodes the compressed image data of a high-resolution image and reduces it to 1/2 resolution, thereby generating image data of a standard-resolution image for display on a television monitor that supports standard resolution.
[0003]
A down decoder that decodes a bit stream such as MPEG2, produced by applying predictive coding by motion prediction and compression coding by discrete cosine transform to a high-resolution image, and down-samples it to a standard-resolution image is proposed in the document "Scalable Decoder without Low-Frequency Drift" (Iwahashi, Kambayashi, Kiya: IEICE Technical Report CS94-186, DSP94-108, 1995-01) (hereinafter referred to as Document 1). Document 1 describes the following first to third down decoders.
[0004]
As shown in FIG. 4, the first down decoder comprises an inverse discrete cosine transform device 101 that applies an 8 (number of coefficients counted from the DC component in the vertical direction) × 8 (number of coefficients counted from the DC component in the horizontal direction) inverse discrete cosine transform to the bit stream of a high-resolution image, an adder 102 that adds the high-resolution image obtained by the inverse discrete cosine transform and a motion-compensated reference image, a frame memory 103 that temporarily stores the reference image, a motion compensation device 104 that applies motion compensation with 1/2-pixel accuracy to the reference image stored in the frame memory 103, and a down-sampling device 105 that converts the reference image stored in the frame memory 103 into a standard-resolution image.
[0005]
In the first down decoder, an output image obtained by performing inverse discrete cosine transform and decoded as a high-resolution image is reduced by the down-sampling device 105 to output standard-resolution image data.
[0006]
As shown in FIG. 5, the second down decoder comprises an inverse discrete cosine transform device 111 that replaces the high-frequency coefficients of each DCT (Discrete Cosine Transform) block in the bit stream of a high-resolution image with 0 and applies an 8 × 8 inverse discrete cosine transform, an adder 112 that adds the high-resolution image obtained by the inverse discrete cosine transform and a motion-compensated reference image, a frame memory 113 that temporarily stores the reference image, a motion compensation device 114 that applies motion compensation with 1/2-pixel accuracy to the reference image stored in the frame memory 113, and a down-sampling device 115 that converts the reference image stored in the frame memory 113 into a standard-resolution image.
[0007]
In this second down decoder, the high-frequency coefficients among all the coefficients of the DCT block are replaced with 0 before the inverse discrete cosine transform, and the output image decoded as a high-resolution image is reduced by the down-sampling device 115 to output standard-resolution image data.
[0008]
As shown in FIG. 6, the third down decoder performs, for example, 4 × 4 inverse discrete cosine transform using only the low-frequency component coefficient of the DCT block of the bit stream of the high-resolution image and decodes it to the standard-resolution image. A reduced inverse discrete cosine transform device 121, an adder 122 that adds the standard resolution image subjected to the reduced inverse discrete cosine transform and the motion compensated reference image, a frame memory 123 that temporarily stores the reference image, and a frame And a motion compensation device 124 for performing motion compensation on the reference image stored in the memory 123 with 1/4 pixel accuracy.
[0009]
In the third down decoder, inverse discrete cosine transform is performed using only the coefficients of the low frequency component among all the coefficients of the DCT block, and the high resolution image is decoded as the standard resolution image.
[0010]
Because the first down decoder applies the inverse discrete cosine transform to all the coefficients in the DCT block and decodes a high-resolution image, it requires an inverse discrete cosine transform device 101 with high processing capability and a high-capacity frame memory 103. Because the second down decoder sets the high-frequency coefficients in the DCT block to 0 before the inverse discrete cosine transform and decodes a high-resolution image, the processing capability of the inverse discrete cosine transform device 111 may be lower, but a high-capacity frame memory 113 is still required. In contrast to these first and second down decoders, the third down decoder applies the inverse discrete cosine transform using only the low-frequency coefficients among all the coefficients in the DCT block, so the processing capability of the inverse discrete cosine transform device 121 may be low, and because it decodes a standard-resolution reference image, the capacity of the frame memory 123 can also be reduced.
[0011]
Moving image display methods such as those used in television broadcasting include the sequential (progressive) scanning method and the interlaced scanning method. The sequential scanning method displays, in order, images obtained by sampling all the pixels in a frame at the same timing. The interlaced scanning method alternately displays images obtained by sampling the pixels in a frame at different timings on alternate horizontal lines.
[0012]
In the interlaced scanning method, one of the two images obtained by sampling the pixels in a frame at different timings on alternate lines is called the top field (also called the first field), and the other is called the bottom field (also called the second field). The image that includes the first horizontal line of the frame is the top field, and the image that includes the second horizontal line of the frame is the bottom field. In the interlaced scanning method, therefore, one frame consists of two fields.
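The relationship between a frame and its two fields can be sketched as follows. This is only an illustration: the function names `split_fields` and `merge_fields` and the representation of a frame as a list of rows are assumptions made here, not part of the patent.

```python
def split_fields(frame):
    """Split a frame (a list of rows) into its top and bottom fields.

    The top field holds the 1st, 3rd, 5th, ... lines of the frame and the
    bottom field holds the 2nd, 4th, 6th, ... lines.
    """
    top = frame[0::2]
    bottom = frame[1::2]
    return top, bottom


def merge_fields(top, bottom):
    """Interleave a top field and a bottom field back into one frame."""
    frame = []
    for top_line, bottom_line in zip(top, bottom):
        frame.append(top_line)
        frame.append(bottom_line)
    return frame


# A tiny 4-line frame; each row is labelled with its line index.
frame = [[0, 0], [1, 1], [2, 2], [3, 3]]
top, bottom = split_fields(frame)
```

Splitting and then merging returns the original frame, which is the sense in which one frame consists of two fields.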
[0013]
In MPEG2, in order to efficiently compress a moving image signal corresponding to the interlaced scanning method, encoding can be performed not only by assigning a frame to a picture, which is the unit of compression of a screen, but also by assigning a field to a picture.
[0014]
In MPEG2, the structure of the bit stream when a field is assigned to a picture is called the field structure, and the structure of the bit stream when a frame is assigned to a picture is called the frame structure. In the field structure, DCT blocks are formed from pixels within a field, and the discrete cosine transform is performed in units of fields. The processing mode in which the discrete cosine transform is performed in units of fields is called the field DCT mode. In the frame structure, DCT blocks are formed from pixels within a frame, and the discrete cosine transform is performed in units of frames. The processing mode in which the discrete cosine transform is performed in units of frames is called the frame DCT mode. Furthermore, in the field structure, macroblocks are formed from pixels within a field, and motion prediction is performed in units of fields. The processing mode in which motion prediction is performed in units of fields is called the field motion prediction mode. In the frame structure, macroblocks are formed from pixels within a frame, and motion prediction is performed in units of frames. The processing mode in which motion prediction is performed in units of frames is called the frame motion prediction mode.
[0015]
An image decoding apparatus that uses the third down decoder shown in Document 1 to decode compressed image data corresponding to the interlaced scanning method is disclosed in, for example, the document “A Compensation Method of Drift Errors in Scalability” (N. OBIKANE, K. TAHARA and J. YONEMITSU, HDTV Work Shop '93) (hereinafter referred to as Document 2).
[0016]
As shown in FIG. 7, the conventional image decoding apparatus disclosed in Document 2 includes a bit stream analysis device 131 that is supplied with a bit stream obtained by compressing a high-resolution image with MPEG2 and analyzes this bit stream; a variable-length code decoding device 132 that decodes the bit stream, which has been variable-length coded by assigning code lengths according to the frequency of occurrence of the data; an inverse quantization device 133 that multiplies each coefficient of the DCT block by the quantization step; a reduced inverse discrete cosine transform device 134 that decodes a standard-resolution image by performing, for example, a 4 × 4 inverse discrete cosine transform using only the low-frequency coefficients among all the coefficients of the DCT block; an adder 135 that adds the standard-resolution image subjected to the reduced inverse discrete cosine transform and the motion-compensated reference image; a frame memory 136 that temporarily stores the reference image; and a motion compensation device 137 that performs motion compensation with 1/4-pixel accuracy on the reference image stored in the frame memory 136.
[0017]
The reduced inverse discrete cosine transform device 134 of the conventional image decoding device shown in Document 2 performs the inverse discrete cosine transform using only the low-frequency coefficients among all the coefficients in the DCT block, but the positions of the coefficients on which the inverse discrete cosine transform is performed differ between the frame DCT mode and the field DCT mode.
[0018]
Specifically, in the field DCT mode, the reduced inverse discrete cosine transform device 134 performs the inverse discrete cosine transform only on the 4 × 4 low-frequency coefficients among the 8 × 8 coefficients in the DCT block, as shown in FIG. 8. In the frame DCT mode, on the other hand, the reduced inverse discrete cosine transform device 134 performs the inverse discrete cosine transform only on 4 × 2 coefficients + 4 × 2 coefficients among the 8 × 8 coefficients in the DCT block, as shown in FIG. 9.
[0019]
In addition, the motion compensation device 137 of the conventional image decoding device disclosed in Document 2 performs motion compensation with 1/4-pixel accuracy corresponding to each of the field motion prediction mode and the frame motion prediction mode, based on the motion prediction information (motion vectors) obtained for the high-resolution image. That is, standard MPEG2 specifies that motion compensation is performed with 1/2-pixel accuracy, but when a standard-resolution image is decoded from a high-resolution image, the number of pixels in the picture is reduced to 1/2, so the motion compensation device 137 performs motion compensation with 1/4-pixel accuracy.
[0020]
Accordingly, in order to perform motion compensation corresponding to the high-resolution image, the motion compensation device 137 linearly interpolates the pixels of the reference image stored in the frame memory 136 as a standard-resolution image, thereby generating pixels with 1/4-pixel accuracy.
[0021]
Specifically, linear interpolation processing of pixels in the vertical direction in the field motion prediction mode and the frame motion prediction mode will be described with reference to FIGS. 10 and 11. In the drawing, the phase of pixels in the vertical direction is shown in the vertical direction, and the phase in which each pixel of the display image is located is shown as an integer.
[0022]
First, interpolation of an image whose motion was predicted in the field motion prediction mode will be described with reference to FIG. 10. For the high-resolution image (upper layer), as shown in FIG. 10A, motion compensation is performed with 1/2-pixel accuracy independently for each field. For the standard-resolution image (lower layer), as shown in FIG. 10B, pixels whose phases are shifted by 1/4 pixel, 1/2 pixel, and 3/4 pixel are generated by linear interpolation within the field from the integer-accuracy pixels, and motion compensation is performed. In other words, in the standard-resolution image (lower layer), each 1/4-pixel-accuracy pixel of the top field is generated by linear interpolation from the integer-accuracy pixels of the top field, and each 1/4-pixel-accuracy pixel of the bottom field is generated by linear interpolation from the integer-accuracy pixels of the bottom field. For example, let a be the value of the top-field pixel at vertical phase 0 and b the value of the top-field pixel at vertical phase 1. Then the top-field pixel at vertical phase 1/4 is (3a + b) / 4, the top-field pixel at vertical phase 1/2 is (a + b) / 2, and the top-field pixel at vertical phase 3/4 is (a + 3b) / 4.
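The three expressions above are ordinary linear interpolation between two pixels of the same field. A minimal sketch follows; the function name `field_quarter_pel` and the use of exact fractions are choices made here for illustration only.

```python
from fractions import Fraction


def field_quarter_pel(a, b, phase):
    """Interpolate within one field between pixel a (phase 0) and pixel b (phase 1).

    `phase` must be a multiple of 1/4 in the range 0..1. For 1/4, 1/2 and 3/4
    the result equals (3a + b) / 4, (a + b) / 2 and (a + 3b) / 4 respectively.
    """
    phase = Fraction(phase)
    if not 0 <= phase <= 1 or (phase * 4).denominator != 1:
        raise ValueError("phase must be a multiple of 1/4 between 0 and 1")
    return (1 - phase) * a + phase * b


# Example with a = 8 and b = 16.
quarter = field_quarter_pel(8, 16, Fraction(1, 4))        # (3*8 + 16) / 4
half = field_quarter_pel(8, 16, Fraction(1, 2))           # (8 + 16) / 2
three_quarter = field_quarter_pel(8, 16, Fraction(3, 4))  # (8 + 3*16) / 4
```

The same function serves the top field and the bottom field, since in this mode each field is interpolated only from its own pixels.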
[0023]
Next, interpolation of an image whose motion was predicted in the frame motion prediction mode will be described with reference to FIG. 11. For the high-resolution image (upper layer), as shown in FIG. 11A, interpolation is performed between fields, that is, between the bottom field and the top field, and motion compensation is performed with 1/2-pixel accuracy. For the standard-resolution image (lower layer), as shown in FIG. 11B, pixels whose phases are shifted by 1/4 pixel, 1/2 pixel, and 3/4 pixel are generated by linear interpolation from the integer-accuracy pixels of both the top field and the bottom field, and motion compensation is performed. For example, let a be the value of the bottom-field pixel at vertical phase -1, b the value of the top-field pixel at vertical phase 0, c the value of the bottom-field pixel at vertical phase 1, d the value of the top-field pixel at vertical phase 2, and e the value of the bottom-field pixel at vertical phase 3. Then each 1/4-pixel-accuracy pixel whose vertical phase lies between 0 and 2 is obtained as follows.
[0024]
The pixel whose vertical phase is 1/4 is (a + 4b + 3c) / 8. The pixel whose vertical phase is 1/2 is (a + 3c) / 4. The pixel whose vertical phase is 3/4 is (a + 2b + 3c + 2d) / 8. The pixel whose vertical phase is 5/4 is (2b + 3c + 2d + e) / 8. The pixel whose vertical phase is 3/2 is (3c + e) / 4. The pixel whose vertical phase is 7/4 is (3c + 4d + e) / 8.
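The six expressions can be written out directly, as in the sketch below. The function name `frame_quarter_pel` and the dictionary return type are illustrative assumptions, and integer pixel values are assumed so that exact fractions can be used. As a sanity check (an observation made here, not a statement from Document 2), a linear ramp whose pixel value equals its vertical phase is reproduced exactly by all six expressions.

```python
from fractions import Fraction


def frame_quarter_pel(a, b, c, d, e):
    """Quarter-pel pixels for vertical phases between 0 and 2 (frame mode).

    a, c, e are bottom-field pixels at phases -1, 1, 3 and b, d are top-field
    pixels at phases 0, 2. Pixel values are assumed to be integers.
    Returns a dict that maps each phase to its interpolated value.
    """
    return {
        Fraction(1, 4): Fraction(a + 4 * b + 3 * c, 8),
        Fraction(1, 2): Fraction(a + 3 * c, 4),
        Fraction(3, 4): Fraction(a + 2 * b + 3 * c + 2 * d, 8),
        Fraction(5, 4): Fraction(2 * b + 3 * c + 2 * d + e, 8),
        Fraction(3, 2): Fraction(3 * c + e, 4),
        Fraction(7, 4): Fraction(3 * c + 4 * d + e, 8),
    }


# Linear ramp: each pixel value equals its own vertical phase.
ramp = frame_quarter_pel(-1, 0, 1, 2, 3)
```

Note that the pixels at phases 1/2 and 3/2 depend only on the bottom-field pixels a, c, and e, whereas the other positions mix pixels of both fields.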
[0025]
As described above, the conventional image decoding apparatus disclosed in Document 2 can decode compressed image data of a high resolution image corresponding to the interlace scanning method into standard resolution image data.
[0026]
However, in the conventional image decoding device shown in Document 2, the phase of each pixel of the standard-resolution image obtained in the field DCT mode is shifted relative to the phase of each pixel of the standard-resolution image obtained in the frame DCT mode. Specifically, in the field DCT mode, as shown in FIG. 12, the vertical phase of each top-field pixel of the lower layer is 1/2, 5/2, ..., and the vertical phase of each bottom-field pixel is 1, 3, .... In the frame DCT mode, on the other hand, as shown in FIG. 13, the vertical phase of each top-field pixel of the lower layer is 0, 2, ..., and the vertical phase of each bottom-field pixel is 1, 3, .... Images with different phases are therefore mixed in the frame memory 136, and the image quality of the output image deteriorates.
[0027]
Further, in the conventional image decoding device disclosed in the above-mentioned document 2, the phase shift is not corrected in the field motion prediction mode and the frame motion prediction mode. Therefore, the image quality of the output image is deteriorated.
[0028]
[Problems to be solved by the invention]
An image decoding apparatus for solving such a problem has been proposed in Japanese Patent Application No. 10-208385.
[0029]
Next, the image decoding apparatus proposed in Japanese Patent Application No. 10-208385 will be described.
[0030]
The image decoding apparatus 200 proposed in Japanese Patent Application No. 10-208385, shown in FIG. 14, receives a bit stream obtained by compressing with MPEG2 a high-resolution image having, for example, 1152 effective lines in the vertical direction, decodes this bit stream while reducing the resolution to 1/2, and outputs a standard-resolution image having, for example, 576 effective lines in the vertical direction.
[0031]
Hereinafter, the high-resolution image is also referred to as the upper layer, and the standard-resolution image is also referred to as the lower layer. In general, applying the inverse discrete cosine transform to a DCT block having 8 × 8 discrete cosine coefficients yields decoded data composed of 8 × 8 pixels. Processing that performs the inverse discrete cosine transform and reduces the resolution at the same time, for example decoding 8 × 8 discrete cosine coefficients to obtain decoded data composed of 4 × 4 pixels, is called the reduced inverse discrete cosine transform.
[0032]
The image decoding apparatus 200 includes a bit stream analysis device 201 that is supplied with the compressed bit stream of the high-resolution image and analyzes this bit stream; a variable-length code decoding device 202 that decodes the bit stream, which has been variable-length coded by assigning code lengths according to the frequency of occurrence of the data; an inverse quantization device 203 that multiplies each coefficient of the DCT block by the quantization step; a reduced inverse discrete cosine transform device 204 for field mode that generates a standard-resolution image by applying the reduced inverse discrete cosine transform to DCT blocks that were discrete cosine transformed in the field DCT mode; a reduced inverse discrete cosine transform device 205 for frame mode that generates a standard-resolution image by applying the reduced inverse discrete cosine transform to DCT blocks that were discrete cosine transformed in the frame DCT mode; an adder 206 that adds the standard-resolution image subjected to the reduced inverse discrete cosine transform and the motion-compensated reference image; a frame memory 207 that temporarily stores the reference image; a field mode motion compensation device 208 that applies motion compensation corresponding to the field motion prediction mode to the reference image stored in the frame memory 207; a frame mode motion compensation device 209 that applies motion compensation corresponding to the frame motion prediction mode to the reference image stored in the frame memory 207; and an image frame conversion / phase shift correction device 210 that post-filters the image stored in the frame memory 207 to convert the image frame and correct the pixel phase shift, and outputs standard-resolution image data for display on a television monitor or the like.
[0033]
The reduced inverse discrete cosine transform device 204 for field mode is used when a macroblock of the input bit stream has been discrete cosine transformed in the field DCT mode. For a DCT block holding the 8 × 8 coefficients of a macroblock transformed in the field DCT mode, the reduced inverse discrete cosine transform device 204 for field mode performs the inverse discrete cosine transform only on the 4 × 4 low-frequency coefficients, as shown in the figure. That is, the reduced inverse discrete cosine transform is performed on the four low-frequency discrete cosine coefficients in each of the horizontal and vertical directions. By performing this reduced inverse discrete cosine transform, the device 204 can decode a standard-resolution image in which one DCT block consists of 4 × 4 pixels. As shown in FIG. 15, in the decoded image data the vertical phase of each top-field pixel is 1/2, 5/2, ..., and the vertical phase of each bottom-field pixel is 1, 3, .... That is, in the top field of the decoded lower layer, the phase of the first pixel (the pixel whose phase is 1/2) is midway between the first and second pixels from the top of the top field of the upper layer (the pixels whose phases are 0 and 2), and the phase of the second pixel (the pixel whose phase is 5/2) is midway between the third and fourth pixels from the top of the top field of the upper layer (the pixels whose phases are 4 and 6). In the bottom field of the decoded lower layer, the phase of the first pixel (the pixel whose phase is 1) is midway between the first and second pixels from the top of the bottom field of the upper layer (the pixels whose phases are 1 and 3), and the phase of the second pixel (the pixel whose phase is 3) is midway between the third and fourth pixels from the top of the bottom field of the upper layer (the pixels whose phases are 5 and 7).
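A reduced inverse discrete cosine transform that keeps only the 4 × 4 low-frequency coefficients can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: it assumes orthonormal DCT scaling and a gain of 1/√2 per dimension so that a flat block keeps its level. The text does not specify the normalisation, and the function names are hypothetical.

```python
import math


def idct_1d(coeffs):
    """Orthonormal inverse DCT (DCT-III) of a 1-D coefficient list."""
    n = len(coeffs)
    out = []
    for i in range(n):
        acc = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            acc += scale * c * math.cos((2 * i + 1) * k * math.pi / (2 * n))
        out.append(acc)
    return out


def reduced_idct_1d(coeffs):
    """Keep the four lowest-frequency coefficients and invert them at length 4.

    Assumption: the 1/sqrt(2) gain compensates for going from an 8-point
    transform to a 4-point one with orthonormal scaling.
    """
    gain = 1.0 / math.sqrt(2.0)
    return idct_1d([gain * c for c in coeffs[:4]])


def reduced_idct_4x4(block):
    """Decode an 8x8 coefficient block (list of rows) into 4x4 pixels."""
    # Vertical pass over the four low-frequency columns only.
    cols = [reduced_idct_1d([block[r][c] for r in range(8)]) for c in range(4)]
    # Horizontal pass over each of the four resulting rows.
    return [reduced_idct_1d([cols[c][r] for c in range(4)]) for r in range(4)]


# A block whose only non-zero coefficient is the DC term of a flat block of 10s.
flat = [[0.0] * 8 for _ in range(8)]
flat[0][0] = 80.0
pixels = reduced_idct_4x4(flat)
```

With the orthonormal convention assumed here, the DC coefficient of a flat 8 × 8 block of value 10 is 80, and the reduced transform returns a flat 4 × 4 block of the same value.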
[0034]
The reduced inverse discrete cosine transform device 205 for frame mode is used when a macroblock of the input bit stream has been discrete cosine transformed in the frame DCT mode. The device 205 applies the reduced inverse discrete cosine transform to a DCT block holding the 8 × 8 coefficients of a macroblock transformed in the frame DCT mode. The device 205 decodes a standard-resolution image in which one DCT block consists of 4 × 4 pixels, and generates an image whose pixel phase is the same as that of the standard-resolution image generated by the reduced inverse discrete cosine transform device 204 for field mode. That is, as shown in FIG. 15, in the image data decoded by the reduced inverse discrete cosine transform device 205 for frame mode, the vertical phase of each top-field pixel is 1/2, 5/2, ..., and the vertical phase of each bottom-field pixel is 1, 3, ....
[0035]
Details of the processing of the reduced inverse discrete cosine transform device 205 for frame mode will be described later.
[0036]
When the macroblock subjected to the reduced inverse discrete cosine transform by the reduced inverse discrete cosine transform device 204 for field mode or the reduced inverse discrete cosine transform device 205 for frame mode is an intra image, the adder 206 stores the intra image in the frame memory 207 as it is. When that macroblock is an inter image, the adder 206 combines the inter image with the reference image motion-compensated by the field mode motion compensation device 208 or the frame mode motion compensation device 209, and stores the result in the frame memory 207.
[0037]
The field mode motion compensation device 208 is used when the motion prediction mode of the macroblock is the field motion prediction mode. The field mode motion compensation device 208 applies interpolation with 1/4-pixel accuracy to the standard-resolution reference image stored in the frame memory 207, taking into account the phase shift between the top field and the bottom field, and performs motion compensation corresponding to the field motion prediction mode. The reference image motion-compensated by the field mode motion compensation device 208 is supplied to the adder 206 and combined with the inter image.
[0038]
The frame mode motion compensation device 209 is used when the motion prediction mode of the macroblock is the frame motion prediction mode. The frame mode motion compensation device 209 applies interpolation with 1/4-pixel accuracy to the standard-resolution reference image stored in the frame memory 207, taking into account the phase shift between the top field and the bottom field, and performs motion compensation corresponding to the frame motion prediction mode. The reference image motion-compensated by the frame mode motion compensation device 209 is supplied to the adder 206 and combined with the inter image.
[0039]
The image frame conversion / phase shift correction device 210 is supplied with the standard-resolution reference image stored in the frame memory 207 or with the image synthesized by the adder 206, corrects the phase shift between the top field and the bottom field of this image by post-filtering, and converts the image frame so that it conforms to the standard-resolution television standard. That is, the image frame conversion / phase shift correction device 210 corrects a standard-resolution image in which the vertical phase of each top-field pixel is 1/2, 5/2, ... and the vertical phase of each bottom-field pixel is 1, 3, ..., so that, for example, the vertical phase of each top-field pixel becomes 0, 2, 4, ... and the vertical phase of each bottom-field pixel becomes 1, 3, 5, .... The image frame conversion / phase shift correction device 210 also reduces the image frame of the high-resolution television standard to 1/4 and converts it into the image frame of the standard-resolution television standard.
[0040]
The image decoding apparatus 200 proposed in Japanese Patent Application No. 10-208385 has the above-described configuration, thereby decoding a bit stream obtained by compressing a high-resolution image with MPEG2 and reducing the resolution to ½. Standard resolution images can be output.
[0041]
Next, the processing contents of the reduced inverse discrete cosine transform device 205 for frame mode will be described in more detail.
[0042]
As shown in FIG. 16, the reduced inverse discrete cosine transform device 205 for frame mode receives a bit stream obtained by compression-coding a high-resolution image in units of one DCT block.
[0043]
First, in step S1, an 8 × 8 inverse discrete cosine transform (IDCT 8 × 8) is applied to the discrete cosine coefficients y of this one DCT block (in the figure, the vertical coefficients among all the discrete cosine coefficients of the DCT block are shown as y₁ to y₈). The inverse discrete cosine transform yields 8 × 8 decoded pixel data x (in the figure, the vertical pixel data among all the pixel data of the DCT block are shown as x₁ to x₈).
[0044]
Subsequently, in step S2, the 8 × 8 pixel data x are taken out alternately, line by line in the vertical direction, and separated into two pixel blocks corresponding to interlaced scanning: a 4 × 4 pixel block of the top field and a 4 × 4 pixel block of the bottom field. That is, a pixel block corresponding to the top field is generated from the pixel data x₁ of the first line in the vertical direction, the pixel data x₃ of the third line, the pixel data x₅ of the fifth line, and the pixel data x₇ of the seventh line. Likewise, a pixel block corresponding to the bottom field is generated from the pixel data x₂ of the second line in the vertical direction, the pixel data x₄ of the fourth line, the pixel data x₆ of the sixth line, and the pixel data x₈ of the eighth line. The process of separating the pixels of a DCT block into two pixel blocks corresponding to interlaced scanning is hereinafter referred to as field separation.
[0045]
Subsequently, in step S3, a 4 × 4 discrete cosine transform (DCT 4 × 4) is applied to each of the two pixel blocks obtained by field separation.
[0046]
Subsequently, in step S4, the high-frequency components of the discrete cosine coefficients z of the pixel block corresponding to the top field, obtained by the 4 × 4 discrete cosine transform, are thinned out to give a pixel block composed of 2 × 2 discrete cosine coefficients (in the figure, the vertical discrete cosine coefficients among all the coefficients of the pixel block corresponding to the top field are shown as z₁, z₃, z₅, z₇). Likewise, the high-frequency components of the discrete cosine coefficients z of the pixel block corresponding to the bottom field, obtained by the 4 × 4 discrete cosine transform, are thinned out to give a pixel block composed of 2 × 2 discrete cosine coefficients (in the figure, the vertical discrete cosine coefficients among all the coefficients of the pixel block corresponding to the bottom field are shown as z₂, z₄, z₆, z₈).
[0047]
Subsequently, in step S5, a 2 × 2 inverse discrete cosine transform (IDCT 2 × 2) is applied to each pixel block from which the discrete cosine coefficients of the high-frequency components have been thinned out. The 2 × 2 inverse discrete cosine transform yields 2 × 2 decoded pixel data x′ (in the figure, the vertical pixel data among all the pixel data of the pixel block of the top field are shown as x′₁, x′₃, and the vertical pixel data among all the pixel data of the pixel block corresponding to the bottom field are shown as x′₂, x′₄).
[0048]
Subsequently, in step S6, the pixel data of the pixel block corresponding to the top field and the pixel data of the pixel block corresponding to the bottom field are combined alternately, one line at a time in the vertical direction, to generate a DCT block composed of 4 × 4 pixel data that has undergone the reduced inverse discrete cosine transform. The process of alternately combining, in the vertical direction, the pixels of the two pixel blocks corresponding to the top field and the bottom field is hereinafter referred to as frame synthesis.
[0049]
By performing steps S1 to S6 above, the reduced inverse discrete cosine transform device 205 for frame mode can generate a 4 × 4 DCT block composed of pixels having the same phase as the pixels of the standard-resolution image generated by the reduced inverse discrete cosine transform device 204 for field mode, as shown in the figure.
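Steps S1 to S6 can be traced in the vertical direction with the one-dimensional sketch below. It is an illustration under stated assumptions: orthonormal DCT scaling and a 1/√2 gain before the 2-point inverse transform are choices made here, and all function names are hypothetical. Because every step is linear, the helper `as_matrix` also shows that the whole chain amounts to one 4 × 8 matrix; its entries depend on the normalisation assumed here and are not claimed to match the matrix given in the patent.

```python
import math


def dct(x):
    """Orthonormal N-point DCT-II."""
    n = len(x)
    return [
        math.sqrt((1.0 if k == 0 else 2.0) / n)
        * sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    return [
        sum(
            math.sqrt((1.0 if k == 0 else 2.0) / n)
            * coeffs[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for k in range(n)
        )
        for i in range(n)
    ]


def frame_mode_reduced_idct(y):
    """Map 8 vertical frame-DCT coefficients to 4 interlaced output pixels."""
    x = idct(y)                               # S1: 8-point IDCT
    top, bottom = x[0::2], x[1::2]            # S2: field separation
    z_top, z_bottom = dct(top), dct(bottom)   # S3: 4-point DCT per field
    z_top, z_bottom = z_top[:2], z_bottom[:2]  # S4: drop high frequencies
    gain = 1.0 / math.sqrt(2.0)               # assumed gain for the 4 -> 2 change
    x_top = idct([gain * c for c in z_top])   # S5: 2-point IDCT per field
    x_bottom = idct([gain * c for c in z_bottom])
    return [x_top[0], x_bottom[0], x_top[1], x_bottom[1]]  # S6: frame synthesis


def as_matrix():
    """Collapse S1 to S6 into a 4x8 matrix by feeding in unit vectors."""
    cols = [frame_mode_reduced_idct([1.0 if k == j else 0.0 for k in range(8)])
            for j in range(8)]
    return [[cols[j][r] for j in range(8)] for r in range(4)]


# Top-field lines are all 10 and bottom-field lines are all 50.
y = dct([10, 50] * 4)
out = frame_mode_reduced_idct(y)
```

In this example the two fields stay separate in the output (approximately 10, 50, 10, 50), which illustrates why the field separation in step S2 preserves the interlaced character of the image.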
[0050]
Further, the reduced inverse discrete cosine transform device 205 for frame mode performs the processing of steps S1 to S6 using a single matrix. Specifically, the device 205 can obtain the pixel data x′ (x′₁ to x′₄) of the DCT block subjected to the reduced inverse discrete cosine transform by multiplying the matrix [FS′] given by the following equation (1), which is obtained by expanding the above processing using the addition theorem, by the discrete cosine coefficients y (y₁ to y₈) of one DCT block.
[0051]
[Expression 1]

[0052]
In equation (1), A to J are as follows.
[0053]
[Expression 2]

[0054]
As described above, in the image decoding apparatus 200 proposed in Japanese Patent Application No. 10-208385, in the field DCT mode, 4 × 4 reduced inverse discrete cosine transform is performed on each of the top field and the bottom field to decode a standard resolution image. In the frame DCT mode, the standard resolution image is decoded by frame separation and reduced inverse discrete cosine transform. Since the image decoding apparatus 200 performs different processing in the field DCT mode and the frame DCT mode in this way, decoding is performed in the field DCT mode and the frame DCT mode without impairing the interlace property of the interlaced scanning image. The phase of the output image can be made the same, and the image quality of the output image is not deteriorated.
[0055]
In the image decoding apparatus 200, the 4 × 4 reduced inverse discrete cosine transform of the reduced inverse discrete cosine transform device 204 for field mode and the reduced inverse discrete cosine transform of steps S1 to S6 of the reduced inverse discrete cosine transform device 205 for frame mode may be computed using a fast algorithm.
[0056]
For example, the processing can be sped up by using Wang's algorithm (reference: Zhongde Wang, “Fast Algorithms for the Discrete W Transform and for the Discrete Fourier Transform”, IEEE Trans. ASSP-32, No. 4, pp. 803-816, Aug. 1984).
[0057]
When the matrix operated by the reduced inverse discrete cosine transform device 204 for field mode is decomposed using the Wang algorithm, it is decomposed as shown in the following equation (2).
[0058]
[Expression 3]

[0059]
FIG. 17 shows a processing flow when the Wang algorithm is applied to the processing of the reduced inverse discrete cosine transform device 204 for field mode.
[0060]
This processing flow consists of first to fifth multipliers 204a to 204e and first to ninth adders 204f to 204n, and receives the one-dimensional discrete cosine coefficients X(0) to X(3) as input. W₀ to W₄, shown at the first to fifth multipliers 204a to 204e, indicate the value by which each multiplier multiplies its input. The flow outputs four coefficient values or pixel values Y(0) to Y(3) computed according to the following equations (3) to (6). In equations (3) to (6), the relationship W₀ = W₁ holds.
[0061]
[Expression 4]

[0062]
In the reduced inverse discrete cosine transform device 204 for field mode, the processing flow shown in FIG. 17 is performed, and is then performed again on the four horizontal coefficients of the 4 × 4 low-frequency coefficients of the DCT block.
[0063]
In this way, the calculation can be speeded up by applying the Wang algorithm.
[0064]
Further, when the matrix [FS ′] calculated by the reduced inverse discrete cosine transform device 205 for frame mode is decomposed using the Wang algorithm, it is decomposed as shown in the following equation (7).
[0065]
[Expression 5]

[0066]
In equation (7), A to J are as follows.
[0067]
[Expression 6]

[0068]
FIG. 18 shows a processing flow when the Wang algorithm is applied to the processing of the reduced inverse discrete cosine transform device 205 for frame mode.
[0069]
This processing flow consists of first to tenth multipliers 205a to 205j and first to twelfth adders 205k to 205v, and receives the one-dimensional discrete cosine coefficients X(0) to X(7) as input. W₀ to W₉, shown at the first to tenth multipliers 205a to 205j, indicate the value by which each multiplier multiplies its input. The flow outputs four coefficient values Y(0) to Y(3) computed according to the following equations (8) to (11). In equations (8) to (11), the relationship W₀ = W₁ holds.
[0070]
[Expression 7]

[0071]
In the reduced inverse discrete cosine transform device 205 for frame mode, the processing flow shown in FIG. 18 is applied in the vertical direction to the 8 × 4 coefficients of the low-frequency region of the DCT block; the coefficient positions are then rotated by 90 degrees in memory, and the processing flow shown in FIG. 17 is applied in the horizontal direction to the 4 × 4 coefficients of the low-frequency region of the DCT block.
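The two-pass ordering (a vertical 1-D pass, a rearrangement in memory, then a horizontal 1-D pass) can be sketched generically as below. The sketch assumes that the rearrangement between the passes is a transposition of the intermediate result, which is how row-column transforms are commonly implemented. The helper `apply_row_column` and the two stand-in 1-D transforms are hypothetical placeholders and are not the flows of FIG. 17 or FIG. 18.

```python
def apply_row_column(coeffs, vertical_1d, horizontal_1d):
    """Apply a 1-D transform down every column, then another along every row.

    `coeffs` is a list of rows. `vertical_1d` and `horizontal_1d` each take a
    list and return a list; they stand in for the 1-D processing flows.
    """
    # Transpose so that each column becomes a contiguous list.
    columns = [list(col) for col in zip(*coeffs)]
    after_vertical = [vertical_1d(col) for col in columns]
    # Transpose back so that the horizontal pass works on rows.
    rows = [list(row) for row in zip(*after_vertical)]
    return [horizontal_1d(row) for row in rows]


def pair_average(values):
    """Stand-in vertical transform: 8 inputs -> 4 outputs (not a real IDCT)."""
    return [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]


def identity(values):
    """Stand-in horizontal transform that leaves the row unchanged."""
    return list(values)


# 8 rows x 4 columns, as in the 8 x 4 low-frequency region; entry = row index.
block = [[float(r)] * 4 for r in range(8)]
result = apply_row_column(block, pair_average, identity)
```

Replacing the two stand-ins with an 8-to-4 vertical transform and a 4-to-4 horizontal transform gives the structure described above: an 8 × 4 input region and a 4 × 4 output block.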
[0072]
In this way, the calculation can be speeded up by applying the Wang algorithm.
[0073]
Incidentally, in the image decoding apparatus 200 proposed in Japanese Patent Application No. 10-208385 as described above, the reduced inverse discrete cosine transform in the field mode and the reduced inverse discrete cosine transform in the frame mode have completely different processing flows.
[0074]
Therefore, when these processes are implemented in software, for example, the code size increases and a large amount of instruction memory is required. A larger code size also makes cache misses more likely, so the decoding processing may slow down. When these processes are implemented by dedicated circuits, the circuit scale becomes large.
[0075]
The present invention has been made in view of such circumstances. Its object is to provide an image decoding apparatus and an image decoding method that decode standard-resolution image data from compressed image data of a high-resolution image, that can eliminate the pixel phase shift between the field orthogonal transform mode and the frame orthogonal transform mode without impairing the interlaced character of an interlaced scanning image, and that have a simplified processing flow.
[0076]
[Means for Solving the Problems]
The image decoding apparatus according to the present invention decodes moving image data having a resolution that is 1/2 of a first resolution from compressed image data of the first resolution, the compressed image data having been compression-coded by applying a two-dimensional orthogonal transform to pixel blocks composed of 8 × 8 pixels to generate orthogonal transform blocks composed of 8 × 8 coefficients. The apparatus comprises: means for switching each multiplication coefficient used for the inverse orthogonal transform between an orthogonal transform block that has been orthogonally transformed by the orthogonal transform method corresponding to interlaced scanning (field orthogonal transform mode) and an orthogonal transform block that has been orthogonally transformed by the orthogonal transform method corresponding to progressive scanning (frame orthogonal transform mode); first inverse orthogonal transform means to which the multiplication coefficients are input and which generates an orthogonal transform block composed of 4 × 4 coefficients by performing a one-dimensional inverse orthogonal transform in the vertical direction on 8 × 4 coefficients among the coefficients of the orthogonal transform block of the compressed image data; and second inverse orthogonal transform means which generates a pixel block composed of 4 × 4 pixels by performing a one-dimensional inverse orthogonal transform in the horizontal direction on the 4 × 4 coefficients inverse-transformed by the first inverse orthogonal transform means. The apparatus is characterized in that the first inverse orthogonal transform means, for the coefficients of an orthogonal transform block orthogonally transformed in the field orthogonal transform mode, performs an inverse orthogonal transform on the four low-frequency coefficients in the horizontal and vertical directions to generate an orthogonal transform block composed of 4 × 4 coefficients, and, for an orthogonal transform block orthogonally transformed in the frame orthogonal transform mode, performs an inverse orthogonal transform on the four low-frequency coefficients in the horizontal direction and the eight coefficients in the vertical direction, separates the pixels of the inverse-transformed orthogonal transform block into two pixel blocks corresponding to interlaced scanning, orthogonally transforms each of the two separated pixel blocks, performs an inverse orthogonal transform on two coefficients in the horizontal and vertical directions among the coefficients of each of the two orthogonally transformed pixel blocks, and combines the two inverse-transformed pixel blocks to generate an orthogonal transform block composed of 4 × 4 coefficients.
[0077]
In this image decoding apparatus, for each coefficient of an orthogonal transform block orthogonally transformed in the field orthogonal transform mode, the inverse orthogonal transform is performed on the four low-frequency coefficients in the horizontal and vertical directions to generate an orthogonal transform block composed of 4 × 4 coefficients. For each coefficient of an orthogonal transform block orthogonally transformed in the frame orthogonal transform mode, the inverse orthogonal transform is performed on the four low-frequency coefficients in the horizontal direction and the eight coefficients in the vertical direction, the pixels of the inverse orthogonally transformed block are separated into two pixel blocks corresponding to interlaced scanning, each of the two separated pixel blocks is orthogonally transformed, the inverse orthogonal transform is performed on two coefficients in the horizontal and vertical directions among the coefficients of the two orthogonally transformed pixel blocks, and the two inverse orthogonally transformed pixel blocks are combined to generate an orthogonal transform block composed of 4 × 4 coefficients. The image decoding apparatus outputs moving image data having a second resolution lower than the first resolution.
Further, in this image decoding apparatus, each multiplication coefficient used for the inverse orthogonal transform, and the coefficient input to each multiplication coefficient, are switched between an orthogonal transform block orthogonally transformed in the field orthogonal transform mode and an orthogonal transform block orthogonally transformed in the frame orthogonal transform mode, and a one-dimensional inverse orthogonal transform in the vertical direction is performed on 8 × 4 coefficients among the coefficients of the orthogonal transform block of the compressed image data.
[0078]
The image decoding method according to the present invention decodes moving image data having a resolution that is ½ of a first resolution from compressed image data of the first resolution, the compressed image data having been compression-encoded by applying a two-dimensional orthogonal transform to a pixel block composed of 8 × 8 pixels to generate an orthogonal transform block composed of 8 × 8 coefficients. In this method, each multiplication coefficient used for the inverse orthogonal transform, and the coefficient input to each multiplication coefficient, are switched between an orthogonal transform block orthogonally transformed by the orthogonal transform method corresponding to interlaced scanning (field orthogonal transform mode) and an orthogonal transform block orthogonally transformed by the orthogonal transform method corresponding to progressive scanning (frame orthogonal transform mode); a one-dimensional inverse orthogonal transform in the vertical direction is performed on 8 × 4 coefficients among the coefficients of the orthogonal transform block of the compressed image data to generate an orthogonal transform block composed of 4 × 4 coefficients; and a one-dimensional inverse orthogonal transform in the horizontal direction is performed on the inverse orthogonally transformed 4 × 4 coefficients to generate a pixel block composed of 4 × 4 pixels.
For each coefficient of an orthogonal transform block orthogonally transformed in the field orthogonal transform mode, the inverse orthogonal transform is performed on the four low-frequency coefficients in the horizontal and vertical directions to generate an orthogonal transform block composed of 4 × 4 coefficients. For each coefficient of an orthogonal transform block orthogonally transformed in the frame orthogonal transform mode, the inverse orthogonal transform is performed on the four low-frequency coefficients in the horizontal direction and the eight coefficients in the vertical direction, the pixels of the inverse orthogonally transformed block are separated into two pixel blocks corresponding to interlaced scanning, each of the two separated pixel blocks is orthogonally transformed, the inverse orthogonal transform is performed on two coefficients in the horizontal and vertical directions among the coefficients of the two orthogonally transformed pixel blocks, and the two inverse orthogonally transformed pixel blocks are combined to generate an orthogonal transform block composed of 4 × 4 coefficients.
[0079]
In this image decoding method, for each coefficient of an orthogonal transform block orthogonally transformed in the field orthogonal transform mode, the inverse orthogonal transform is performed on the four low-frequency coefficients in the horizontal and vertical directions to generate an orthogonal transform block composed of 4 × 4 coefficients. For each coefficient of an orthogonal transform block orthogonally transformed in the frame orthogonal transform mode, the inverse orthogonal transform is performed on the four low-frequency coefficients in the horizontal direction and the eight coefficients in the vertical direction, the pixels of the inverse orthogonally transformed block are separated into two pixel blocks corresponding to interlaced scanning, each of the two separated pixel blocks is orthogonally transformed, the inverse orthogonal transform is performed on two coefficients in the horizontal and vertical directions among the coefficients of the two orthogonally transformed pixel blocks, and the two inverse orthogonally transformed pixel blocks are combined to generate an orthogonal transform block composed of 4 × 4 coefficients. The image decoding method outputs moving image data having a second resolution lower than the first resolution.
Further, in this image decoding method, each multiplication coefficient used for the inverse orthogonal transform, and the coefficient input to each multiplication coefficient, are switched between an orthogonal transform block orthogonally transformed in the field orthogonal transform mode and an orthogonal transform block orthogonally transformed in the frame orthogonal transform mode, and a one-dimensional inverse orthogonal transform in the vertical direction is performed on 8 × 4 coefficients among the coefficients of the orthogonal transform block of the compressed image data.
[0080]
[Detailed Description of the Invention]
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0081]
FIG. 1 shows a block configuration diagram of an image decoding apparatus according to an embodiment of the present invention.
[0082]
The image decoding apparatus 10 shown in FIG. 1 receives a bit stream obtained by compressing, with MPEG2, a high-resolution image having, for example, 1152 effective lines in the vertical direction. It decodes the input bit stream, reduces the resolution to ½, and outputs a standard resolution image having, for example, 576 effective lines in the vertical direction.
[0083]
The image decoding apparatus 10 includes: a bit stream analysis device 11 that is supplied with the bit stream of the compressed high-resolution image and analyzes the bit stream; a variable length code decoding device 12 that decodes the bit stream, which has been variable-length encoded by assigning code lengths according to the frequency of occurrence of the data; an inverse quantization device 13 that multiplies each coefficient of the DCT block by the quantization step; a reduced inverse discrete cosine transform device 14 that performs a reduced inverse discrete cosine transform on the discrete cosine transformed DCT block to generate a standard resolution image; an adder 16 that adds the standard resolution image subjected to the reduced inverse discrete cosine transform and the motion compensated reference image; a frame memory 17 that temporarily stores the reference image; a field mode motion compensation device 18 that performs motion compensation corresponding to the field motion prediction mode on the reference image stored in the frame memory 17; a frame mode motion compensation device 19 that performs motion compensation corresponding to the frame motion prediction mode on the reference image stored in the frame memory 17; and an image frame conversion / phase shift correction device 20 that post-filters the image stored in the frame memory 17 to convert the image frame and correct the phase shift of the pixels, and outputs standard resolution image data for display on a television monitor or the like.
[0084]
When a macroblock of the input bit stream has been discrete cosine transformed in the field DCT mode, the reduced inverse discrete cosine transform device 14 performs the inverse discrete cosine transform only on the low-frequency 4 × 4 coefficients, as shown in FIG. 8, of each DCT block of 8 × 8 coefficients in that macroblock. That is, the reduced inverse discrete cosine transform is performed based on the four low-frequency discrete cosine coefficients in each of the horizontal and vertical directions. By performing the reduced inverse discrete cosine transform in this way, the reduced inverse discrete cosine transform device 14 can decode a standard resolution image in which one DCT block is composed of 4 × 4 pixels. As shown in FIG. 15, in the decoded image data the vertical phase of each pixel in the top field is 1/2, 5/2, ..., and the vertical phase of each pixel in the bottom field is 1, 3, .... That is, in the top field of the decoded lower layer, the phase of the first pixel from the top (the pixel whose phase is 1/2) is intermediate between the first and second pixels from the top of the top field of the upper layer (the pixels whose phases are 0 and 2), and the phase of the second pixel from the top (the pixel whose phase is 5/2) is intermediate between the third and fourth pixels from the top of the top field of the upper layer (the pixels whose phases are 4 and 6). In the bottom field of the decoded lower layer, the phase of the first pixel from the top (the pixel whose phase is 1) is intermediate between the first and second pixels from the top of the bottom field of the upper layer (the pixels whose phases are 1 and 3), and the phase of the second pixel from the top (the pixel whose phase is 3) is intermediate between the third and fourth pixels from the top of the bottom field of the upper layer (the pixels whose phases are 5 and 7).
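The field DCT mode processing described above can be illustrated in one dimension with the following minimal sketch. It is not the processing flow of FIG. 3: it uses a direct orthonormal DCT rather than a fast algorithm, and the function names and the 1/√2 gain correction are assumptions made for this illustration only.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a sequence of length n (direct form)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i in range(n)))
    return out


def idct(coeffs):
    """Orthonormal inverse DCT (DCT-III), the exact inverse of dct()."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * coeffs[k] * math.cos(
                (2 * i + 1) * k * math.pi / (2 * n))
        out.append(total)
    return out


def reduced_idct_field_1d(coeffs8):
    """Keep the four low-frequency coefficients X(0)..X(3) of an 8-point
    DCT and apply a 4-point inverse DCT, giving four pixels.

    Dividing by sqrt(2) is an assumed normalization that keeps a flat
    signal at its original level after halving the number of samples.
    """
    low4 = coeffs8[:4]
    return [v / math.sqrt(2.0) for v in idct(low4)]
```

With these hypothetical helpers, `reduced_idct_field_1d(dct([10.0] * 8))` returns four values that are all approximately 10.0.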
[0085]
Further, when a macroblock of the input bit stream has been discrete cosine transformed in the frame DCT mode, the reduced inverse discrete cosine transform device 14 performs, on each DCT block of 8 × 8 coefficients in that macroblock, the same 8 × 4 reduced inverse discrete cosine transform as that of the frame mode reduced inverse discrete cosine transform device 15 described above. The reduced inverse discrete cosine transform device 14 decodes a standard resolution image in which one DCT block is composed of 4 × 4 pixels, and generates an image whose pixel phase is the same as that of the standard resolution image decoded when the discrete cosine transform was performed in the field DCT mode. That is, as shown in FIG. 15, the reduced inverse discrete cosine transform device 14 generates an image in which the vertical phase of each pixel in the top field is 1/2, 5/2, ..., and the vertical phase of each pixel in the bottom field is 1, 3, ....
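The frame DCT mode processing can likewise be sketched in one dimension, following the steps stated in this description: an 8-point inverse transform, separation into two fields, a 4-point transform of each field, a 2-point inverse transform of the two low-frequency coefficients, and recombination. This is a minimal sketch under assumptions; the function names and the 1/√2 gain correction are not taken from the source, and a direct DCT is used instead of the fast algorithm of FIG. 3.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a sequence of length n (direct form)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i in range(n)))
    return out


def idct(coeffs):
    """Orthonormal inverse DCT (DCT-III), the exact inverse of dct()."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * coeffs[k] * math.cos(
                (2 * i + 1) * k * math.pi / (2 * n))
        out.append(total)
    return out


def reduced_idct_frame_1d(coeffs8):
    """Frame-mode vertical reduction of eight coefficients to four pixels."""
    lines = idct(coeffs8)                    # eight frame lines
    top, bottom = lines[0::2], lines[1::2]   # split into the two fields
    reduced = []
    for field in (top, bottom):
        low2 = dct(field)[:2]                # two low-frequency coefficients
        # sqrt(2) division: assumed normalization, keeps flat areas unchanged
        reduced.append([v / math.sqrt(2.0) for v in idct(low2)])
    top2, bottom2 = reduced
    # Re-interleave so the output lines alternate top, bottom, top, bottom.
    return [top2[0], bottom2[0], top2[1], bottom2[1]]
```

For a column whose top-field lines are all 100 and whose bottom-field lines are all 20, this sketch returns approximately `[100, 20, 100, 20]`, so the two fields are not mixed by the reduction.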
[0086]
When the macroblock subjected to the reduced inverse discrete cosine transform by the reduced inverse discrete cosine transform device 14 is an intra image, the adder 16 stores the intra image in the frame memory 17 as it is. When the macroblock subjected to the reduced inverse discrete cosine transform by the reduced inverse discrete cosine transform device 14 is an inter image, the adder 16 combines the inter image with the reference image that has been motion compensated by the field mode motion compensation device 18 or the frame mode motion compensation device 19, and stores the result in the frame memory 17.
[0087]
The field mode motion compensation device 18 is used when the motion prediction mode of the macroblock is the field motion prediction mode. The field mode motion compensation device 18 performs interpolation processing with 1/4 pixel accuracy on the standard resolution reference image stored in the frame memory 17, taking into account the phase shift component between the top field and the bottom field, and performs motion compensation corresponding to the field motion prediction mode. The reference image that has been motion compensated by the field mode motion compensation device 18 is supplied to the adder 16 and combined with the inter image.
[0088]
The frame mode motion compensation device 19 is used when the motion prediction mode of the macroblock is the frame motion prediction mode. The frame mode motion compensation device 19 performs interpolation processing with 1/4 pixel accuracy on the standard resolution reference image stored in the frame memory 17, taking into account the phase shift component between the top field and the bottom field, and performs motion compensation corresponding to the frame motion prediction mode. The reference image that has been motion compensated by the frame mode motion compensation device 19 is supplied to the adder 16 and combined with the inter image.
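Because the reference image has half the resolution of the original, a motion vector with 1/2 pixel accuracy in the high-resolution image points to a 1/4 pixel position in the stored reference image. The following minimal sketch shows one way to sample a line of reference pixels at such a position. Linear interpolation, the clamping at the end of the line, and the function name are assumptions made for illustration; this section does not specify the interpolation filter actually used by the motion compensation devices 18 and 19.

```python
def quarter_pel_sample(line, pos_q):
    """Sample a 1-D line of reference pixels at a quarter-pixel position.

    pos_q is a non-negative position in quarter-pixel units, so pos_q = 4
    is exactly pixel 1 and pos_q = 6 lies halfway between pixels 1 and 2.
    Linear interpolation is an illustrative assumption.
    """
    base, frac = divmod(pos_q, 4)
    nxt = min(base + 1, len(line) - 1)   # clamp at the end of the line
    return line[base] + (line[nxt] - line[base]) * frac / 4.0
```

For example, with `line = [0, 8, 16]`, the position `pos_q = 1` yields 2.0 and `pos_q = 6` yields 12.0.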
[0089]
The image frame conversion / phase shift correction device 20 is supplied with the standard resolution reference image stored in the frame memory 17 or the image combined by the adder 16, post-filters this image to correct the phase shift component between the top field and the bottom field, and converts the image frame so that it conforms to the standard definition television standard. That is, the image frame conversion / phase shift correction device 20 corrects a standard resolution image in which the vertical phase of each pixel in the top field is 1/2, 5/2, ... and the vertical phase of each pixel in the bottom field is 1, 3, ... so that, for example, the vertical phase of each pixel in the top field becomes 0, 2, 4, ... and the vertical phase of each pixel in the bottom field becomes 1, 3, 5, .... Further, the image frame conversion / phase shift correction device 20 reduces the image frame of the high-resolution television standard to 1/4 and converts it to the image frame of the standard-definition television standard.
[0090]
With the above-described configuration, the image decoding apparatus 10 can decode a bit stream obtained by compressing a high-resolution image with MPEG2, reduce the resolution to ½, and output a standard resolution image.
[0091]
Next, the processing contents of the reduced inverse discrete cosine transform device 14 will be described in more detail.
[0092]
As shown in FIG. 2, the reduced inverse discrete cosine transform device 14 includes an input buffer 21 to which the bit stream obtained by compressing a high-resolution image with MPEG2 is supplied from the inverse quantization device 13, a vertical component inverse discrete cosine transform unit 22 that performs a one-dimensional inverse discrete cosine transform of the vertical component, an intermediate buffer 23, a horizontal component inverse discrete cosine transform unit 24 that performs a one-dimensional inverse discrete cosine transform of the horizontal component, and an output buffer 25.
[0093]
The input buffer 21 receives the bit stream inversely quantized by the inverse quantizer 13 in units of DCT blocks, and temporarily stores the DCT blocks.
[0094]
The vertical component inverse discrete cosine transform unit 22 extracts the discrete cosine coefficients of each DCT block stored in the input buffer 21, and performs a one-dimensional inverse discrete cosine transform in the vertical direction on the DCT block. At this time, the vertical component inverse discrete cosine transform unit 22 operates on the eight coefficients in the vertical direction and performs a reduced inverse discrete cosine transform that reduces these eight coefficients to four.
[0095]
The vertical component inverse discrete cosine transform unit 22 also receives the discrete cosine transform mode information of the DCT block to be inverse transformed, that is, information indicating whether the DCT block was discrete cosine transformed in the field DCT mode or in the frame DCT mode. When the DCT block to be inverse transformed was discrete cosine transformed in the field DCT mode, the vertical component inverse discrete cosine transform unit 22 performs the inverse discrete cosine transform in the vertical direction using the same algorithm as the field mode reduced inverse discrete cosine transform device 204 of the image decoding apparatus 200 proposed in Japanese Patent Application No. 10-208385 described above. When the DCT block to be inverse transformed was discrete cosine transformed in the frame DCT mode, the vertical component inverse discrete cosine transform unit 22 performs the inverse discrete cosine transform in the vertical direction using the same algorithm as the frame mode reduced inverse discrete cosine transform device 205 of the image decoding apparatus 200 proposed in Japanese Patent Application No. 10-208385 described above.
[0096]
A specific processing flow of the vertical component inverse discrete cosine transform unit 22 is shown in FIG.
[0097]
The processing flow shown in FIG. 3 is obtained by decomposing the matrix computed by the reduced inverse discrete cosine transform device 14 and applying the Wang algorithm to it.
[0098]
The processing flow of the vertical component inverse discrete cosine transform unit 22 includes first to tenth multipliers 31 to 40 and first to twelfth adders 41 to 52, and receives the one-dimensional discrete cosine coefficients X(0) to X(7) as input. A to J shown in the first to tenth multipliers 31 to 40 indicate the values multiplied by the respective multipliers.
[0099]
The first multiplier 31 multiplies the lowest-frequency discrete cosine coefficient X(0) by the value A. The second multiplier 32 multiplies the third discrete cosine coefficient from the low-frequency end, X(2), by the value B. The third multiplier 33 multiplies the fifth discrete cosine coefficient from the low-frequency end, X(4), by the value C. The fourth multiplier 34 multiplies the seventh discrete cosine coefficient from the low-frequency end, X(6), by the value D. The fifth multiplier 35 multiplies the fourth discrete cosine coefficient from the low-frequency end, X(3), by the value E. The sixth multiplier 36 multiplies the sixth discrete cosine coefficient from the low-frequency end, X(5), by the value F. The seventh multiplier 37 multiplies the second discrete cosine coefficient from the low-frequency end, X(1), by the value G. The eighth multiplier 38 multiplies the second discrete cosine coefficient from the low-frequency end, X(1), by the value H.
[0100]
The ninth multiplier 39 multiplies the fourth discrete cosine coefficient from the low-frequency end, X(3), by the value I in the field DCT mode, and multiplies the highest-frequency discrete cosine coefficient X(7) by the value I in the frame DCT mode. The tenth multiplier 40 multiplies the fourth discrete cosine coefficient from the low-frequency end, X(3), by the value J in the field DCT mode, and multiplies the highest-frequency discrete cosine coefficient X(7) by the value J in the frame DCT mode. The switching of the discrete cosine coefficients input to the ninth and tenth multipliers 39 and 40, that is, the switching of the input data port between the discrete cosine coefficient X(3) and the discrete cosine coefficient X(7), is performed by the changeover switch 60.
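The input switching performed by the changeover switch 60 can be modeled as follows. This is only an illustrative sketch: the function names and the mode strings are hypothetical, and the constants passed as `value_i` and `value_j` stand in for the values I and J, whose actual numbers are not reproduced here.

```python
def select_switched_input(coeffs, mode):
    """Model of changeover switch 60: pick the coefficient that is fed to
    the ninth and tenth multipliers (X(3) in field mode, X(7) in frame mode)."""
    if mode == "field":
        return coeffs[3]
    if mode == "frame":
        return coeffs[7]
    raise ValueError("mode must be 'field' or 'frame'")


def ninth_and_tenth_products(coeffs, mode, value_i, value_j):
    """Outputs of multipliers 39 and 40 for placeholder constants I and J."""
    x = select_switched_input(coeffs, mode)
    return value_i * x, value_j * x
```

Both multipliers always receive the same coefficient; only the choice of that coefficient depends on the DCT mode.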
[0101]
Therefore, in the field DCT mode, the processing flow shown in FIG. 3 outputs four coefficient values Y(0) to Y(3) on which the operations shown in the following equations (12) to (15) have been performed.
[0102]
[Equation 8]

[0103]
In the frame DCT mode, the processing flow shown in FIG. 3 outputs four coefficient values Y(0) to Y(3) on which the operations shown in the following equations (16) to (19) have been performed.
[0104]
[Equation 9]

[0105]
Each of the first to tenth multipliers 31 to 40 includes a register, for example, and the values of A to J to be multiplied are stored in the register. The value in this register is switched between the field DCT mode and the frame DCT mode.
[0106]
The following table shows the multiplication constants of each multiplier, which are switched between the field DCT mode and the frame DCT mode. In this table, there is the relationship W ₀ = W ₁.
[0107]
[Table 1]

[0108]
When the values for the field DCT mode shown in this table are substituted into the above equations (12) to (15), the processing flow shown in FIG. 3 outputs four coefficient values Y(0) to Y(3) on which the operations shown in the following equations (20) to (23) have been performed.
[0109]
[Equation 10]

[0110]
Y(0) to Y(3) given by equations (20) to (23) are the same as the values in the above equations (3) to (6). That is, the inverse discrete cosine transform in the vertical direction is performed by the same algorithm as the field mode reduced inverse discrete cosine transform device 204 of the image decoding apparatus 200 proposed in Japanese Patent Application No. 10-208385.
[0111]
Further, when the respective values for the frame DCT mode are substituted into the above equations (16) to (19), the processing flow shown in FIG. 3 outputs four coefficient values Y(0) to Y(3) on which the operations shown in the following equations (24) to (27) have been performed.
[0112]
[Equation 11]

[0113]
Y(0) to Y(3) given by equations (24) to (27) are the same as the values in the above equations (7) to (11). That is, the inverse discrete cosine transform in the vertical direction is performed by the same algorithm as the frame mode reduced inverse discrete cosine transform device 205 of the image decoding apparatus 200 proposed in Japanese Patent Application No. 10-208385.
[0114]
As described above, by switching the multiplication constants of the multipliers 31 to 33 and switching the data input port, the vertical component inverse discrete cosine transform unit 22 can perform the reduced inverse discrete cosine transform in the vertical direction for both the field DCT mode and the frame DCT mode. The vertical component inverse discrete cosine transform unit 22 thus computes the reduced inverse discrete cosine transform of the field DCT mode and the reduced inverse discrete cosine transform of the frame DCT mode with a single processing flow.
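The idea of serving both modes with a single processing flow whose constants are switched can be sketched as follows. This is a deliberate simplification under stated assumptions: each mode is expressed as a 4 × 8 table of constants applied by one shared multiply-accumulate routine, whereas the flow of FIG. 3 uses the Wang algorithm with only ten multipliers. The function names, the table layout, and the 1/√2 gain correction are hypothetical.

```python
import math


def dct(x):
    """Orthonormal DCT-II (direct form)."""
    n = len(x)
    return [(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)) *
            sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                for i in range(n))
            for k in range(n)]


def idct(c):
    """Orthonormal inverse DCT, the exact inverse of dct()."""
    n = len(c)
    return [sum((math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)) *
                c[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                for k in range(n))
            for i in range(n)]


def field_reference(c8):
    """Field mode: 4-point inverse DCT of the four low-frequency coefficients."""
    return [v / math.sqrt(2.0) for v in idct(c8[:4])]


def frame_reference(c8):
    """Frame mode: 8-point IDCT, field split, 4-point DCT, 2-point IDCT, merge."""
    lines = idct(c8)
    top, bottom = ([v / math.sqrt(2.0) for v in idct(dct(f)[:2])]
                   for f in (lines[0::2], lines[1::2]))
    return [top[0], bottom[0], top[1], bottom[1]]


def build_table(transform):
    """Constants for one mode: column j is the response to a unit X(j)."""
    cols = [transform([1.0 if k == j else 0.0 for k in range(8)])
            for j in range(8)]
    return [[cols[j][i] for j in range(8)] for i in range(4)]


# One table of constants per mode; only the table is switched.
TABLES = {"field": build_table(field_reference),
          "frame": build_table(frame_reference)}


def vertical_reduced_idct(c8, mode):
    """Single shared flow: Y(i) = sum over j of table[i][j] * X(j)."""
    table = TABLES[mode]
    return [sum(table[i][j] * c8[j] for j in range(8)) for i in range(4)]
```

In the field mode table the columns for X(4) to X(7) are all zero, while in the frame mode table the column for X(7) is not, which is consistent with X(7) being routed into the flow only in the frame DCT mode.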
[0115]
The discrete cosine coefficient calculated as described above is temporarily stored in the intermediate buffer 23.
[0116]
The horizontal component inverse discrete cosine transform unit 24 extracts the discrete cosine coefficients stored in the intermediate buffer 23 in units of DCT blocks, and performs a one-dimensional inverse discrete cosine transform in the horizontal direction on the DCT block. At this time, the horizontal component inverse discrete cosine transform unit 24 performs the inverse discrete cosine transform on the four coefficients in the horizontal direction. Regardless of whether the DCT block to be inverse transformed was discrete cosine transformed in the field DCT mode or in the frame DCT mode, the horizontal component inverse discrete cosine transform unit 24 performs the inverse discrete cosine transform in the horizontal direction using the same algorithm as the field mode reduced inverse discrete cosine transform device 204 of the image decoding apparatus 200 proposed in Japanese Patent Application No. 10-208385 described above.
[0117]
That is, the processing flow of the horizontal component inverse discrete cosine transform unit 24 is the same as that shown in FIG.
[0118]
Note that, because the processing flow of the horizontal component inverse discrete cosine transform unit 24 is the same as the processing flow of the vertical component inverse discrete cosine transform unit 22 in the field DCT mode, the processing flow of the vertical component inverse discrete cosine transform unit 22 may be used for it.
[0119]
The output buffer 25 temporarily stores the 4 × 4 pixel values calculated by the horizontal component inverse discrete cosine transform unit 24 in units of DCT blocks and supplies them to the adder 16.
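The two-pass structure described above (a vertical pass on the 8 × 4 coefficients, an intermediate buffer, then a horizontal pass) can be sketched as follows. This is a minimal sketch under assumptions, using a direct DCT rather than the fast algorithm of FIG. 3; the function names, the `coeffs[v][h]` layout, and the 1/√2 gain corrections are hypothetical and chosen for illustration.

```python
import math


def dct(x):
    """Orthonormal DCT-II (direct form)."""
    n = len(x)
    return [(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)) *
            sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                for i in range(n))
            for k in range(n)]


def idct(c):
    """Orthonormal inverse DCT, the exact inverse of dct()."""
    n = len(c)
    return [sum((math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)) *
                c[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                for k in range(n))
            for i in range(n)]


def vertical_field(col8):
    """Field DCT mode: four low-frequency coefficients, 4-point IDCT."""
    return [v / math.sqrt(2.0) for v in idct(col8[:4])]


def vertical_frame(col8):
    """Frame DCT mode: IDCT, field split, DCT, 2-point IDCT, merge."""
    lines = idct(col8)
    top, bottom = ([v / math.sqrt(2.0) for v in idct(dct(f)[:2])]
                   for f in (lines[0::2], lines[1::2]))
    return [top[0], bottom[0], top[1], bottom[1]]


def reduced_idct_2d(coeffs, mode):
    """coeffs[v][h] holds 8 x 8 DCT coefficients (v vertical, h horizontal).

    Returns a 4 x 4 pixel block.
    """
    vertical = vertical_field if mode == "field" else vertical_frame
    # Vertical pass on the 8 x 4 low-horizontal-frequency coefficients.
    intermediate = [[0.0] * 4 for _ in range(4)]  # role of intermediate buffer 23
    for h in range(4):
        column = [coeffs[v][h] for v in range(8)]
        for row, value in enumerate(vertical(column)):
            intermediate[row][h] = value
    # Horizontal pass: the same 4-point reduction for both modes.
    return [[v / math.sqrt(2.0) for v in idct(row)] for row in intermediate]
```

Only the vertical pass depends on the DCT mode; the horizontal pass is identical in both cases, which matches the description of the horizontal component inverse discrete cosine transform unit 24.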
[0120]
As described above, the image decoding apparatus 10 according to the embodiment of the present invention can decode standard resolution image data from compressed image data of a high-resolution image while eliminating the pixel phase shift between the field orthogonal transform mode and the frame orthogonal transform mode without impairing the interlaced property of the interlaced scanning image, and its processing flow can be simplified.
[0121]
[Effects of the Invention]
In the image decoding apparatus and the image decoding method according to the present invention, the inverse orthogonal transform is performed on the coefficients of all frequency components of an orthogonal transform block orthogonally transformed in the frame orthogonal transform mode, the result is separated into two pixel blocks corresponding to interlaced scanning, each of the two separated pixel blocks is orthogonally transformed, the inverse orthogonal transform is performed on the coefficients of the low-frequency components, and the two inverse orthogonally transformed pixel blocks are combined. In this image decoding method, moving image data having a second resolution lower than the first resolution is output. Further, in this image decoding method, each multiplication coefficient used for the inverse orthogonal transform, and the coefficient input to each multiplication coefficient, are switched between an orthogonal transform block orthogonally transformed in the field orthogonal transform mode and an orthogonal transform block orthogonally transformed in the frame orthogonal transform mode, and a one-dimensional inverse orthogonal transform in the vertical direction is performed on 8 × 4 coefficients among the coefficients of the orthogonal transform block of the compressed image data.
[0122]
As a result, in the present invention, the amount of computation and the storage capacity required for decoding can be reduced, and the pixel phase shift between the field orthogonal transform mode and the frame orthogonal transform mode can be eliminated without impairing the interlaced property of the interlaced scanning image. In addition, the image quality of the moving image data having the second resolution can be improved. Furthermore, in the present invention, the processing associated with the inverse discrete cosine transform can be simplified.
[Brief description of the drawings]
FIG. 1 is a block diagram of an image decoding apparatus according to an embodiment of the present invention.
FIG. 2 is a block configuration diagram of a reduced inverse discrete cosine transform device of the image decoding device.
FIG. 3 is a diagram illustrating a processing flow of a vertical component inverse discrete cosine transform unit of the reduced inverse discrete cosine transform device.
FIG. 4 is a block diagram showing a conventional first down decoder.
FIG. 5 is a block diagram showing a conventional second down decoder.
FIG. 6 is a block diagram showing a conventional third down decoder.
FIG. 7 is a block diagram of a conventional image decoding device.
FIG. 8 is a diagram for explaining a reduced inverse discrete cosine transform process in a field DCT mode of the conventional image decoding apparatus.
FIG. 9 is a diagram for explaining a reduced inverse discrete cosine transform process in a frame DCT mode of the conventional image decoding apparatus.
FIG. 10 is a diagram for explaining linear interpolation processing in a field motion prediction mode of the conventional image decoding apparatus.
FIG. 11 is a diagram for explaining linear interpolation processing in a frame motion prediction mode of the conventional image decoding apparatus.
FIG. 12 is a diagram for explaining a phase of a pixel obtained as a result of a field DCT mode of the conventional image decoding device.
FIG. 13 is a diagram for explaining a phase of a pixel obtained as a result of a frame DCT mode of the conventional image decoding device.
FIG. 14 is a block diagram of an image decoding apparatus proposed in Japanese Patent Application No. 10-208385.
FIG. 15 is a diagram for explaining the phase of a pixel in the vertical direction of a reference image stored in the frame memory of the image decoding apparatus proposed in Japanese Patent Application No. 10-208385.
FIG. 16 is a diagram for explaining the contents of one block processing of the frame mode reduced inverse discrete cosine transform device of the image decoding device proposed in Japanese Patent Application No. 10-208385.
FIG. 17 is a diagram showing a calculation flow when the Wang algorithm is applied to the process of the reduced inverse discrete cosine transform device for field mode of the image decoding device proposed in Japanese Patent Application No. 10-208385.
FIG. 18 is a diagram showing a calculation flow when the Wang algorithm is applied to one-block processing of the frame mode reduced inverse discrete cosine transform device of the image decoding device proposed in Japanese Patent Application No. 10-208385.
[Explanation of symbols]
10 image decoding device, 14 reduced inverse discrete cosine transform device, 22 vertical component inverse discrete cosine transform unit, 24 horizontal component inverse discrete cosine transform unit

Claims

An image decoding apparatus for decoding moving image data having a resolution of 1/2 of a first resolution from compressed image data of the first resolution, the compressed image data having been compression-encoded by performing a two-dimensional orthogonal transform on a pixel block composed of 8 × 8 pixels to generate an orthogonal transform block composed of 8 × 8 coefficients, the image decoding apparatus comprising:
first inverse orthogonal transform means for switching each multiplication coefficient used for inverse orthogonal transform, and the coefficient input to each multiplication coefficient, between an orthogonal transform block orthogonally transformed by an orthogonal transform method corresponding to interlaced scanning (field orthogonal transform mode) and an orthogonal transform block orthogonally transformed by an orthogonal transform method corresponding to progressive scanning (frame orthogonal transform mode), and for performing a one-dimensional inverse orthogonal transform in the vertical direction on 8 × 4 coefficients among the coefficients of the orthogonal transform block of the compressed image data to generate an orthogonal transform block composed of 4 × 4 coefficients; and
second inverse orthogonal transform means for performing a one-dimensional inverse orthogonal transform in the horizontal direction on the 4 × 4 coefficients inverse orthogonally transformed by the first inverse orthogonal transform means to generate a pixel block composed of 4 × 4 pixels,
wherein the first inverse orthogonal transform means, for each coefficient of an orthogonal transform block orthogonally transformed in the field orthogonal transform mode, performs the inverse orthogonal transform on the four low-frequency coefficients in the horizontal and vertical directions to generate an orthogonal transform block composed of 4 × 4 coefficients, and, for each coefficient of an orthogonal transform block orthogonally transformed in the frame orthogonal transform mode, performs the inverse orthogonal transform on the four low-frequency coefficients in the horizontal direction and the eight coefficients in the vertical direction, separates the pixels of the inverse orthogonally transformed block into two pixel blocks corresponding to interlaced scanning, orthogonally transforms each of the two separated pixel blocks, performs the inverse orthogonal transform on two coefficients in the horizontal and vertical directions among the coefficients of the two orthogonally transformed pixel blocks, and combines the two inverse orthogonally transformed pixel blocks to generate an orthogonal transform block composed of 4 × 4 coefficients.

The image decoding apparatus according to claim 1, wherein the first inverse orthogonal transform means and the second inverse orthogonal transform means perform the inverse orthogonal transform based on a fast algorithm.

An image decoding method for decoding, from compressed image data of a first resolution obtained by compression encoding in which a two-dimensional orthogonal transform is applied to a pixel block of 8 × 8 pixels to generate an orthogonal transform block of 8 × 8 coefficients, moving image data having a resolution of 1/2 of the first resolution, the method comprising:
performing a one-dimensional inverse orthogonal transform in the vertical direction on 8 × 4 coefficients among the coefficients of an orthogonal transform block of the compressed image data to generate an orthogonal transform block of 4 × 4 coefficients, while switching the multiplication coefficients used for the inverse orthogonal transform, and the coefficients input to each multiplication coefficient, between an orthogonal transform block that has been orthogonally transformed by the orthogonal transform method corresponding to interlaced scanning (field orthogonal transform mode) and an orthogonal transform block that has been orthogonally transformed by the orthogonal transform method corresponding to progressive scanning (frame orthogonal transform mode); and
performing a one-dimensional inverse orthogonal transform in the horizontal direction on the 4 × 4 coefficients that have been inverse orthogonally transformed to generate a pixel block composed of 4 × 4 pixels,
wherein, for the coefficients of an orthogonal transform block orthogonally transformed in the field orthogonal transform mode, an orthogonal transform block of 4 × 4 coefficients is generated by performing inverse orthogonal transform on the four low-frequency coefficients in the horizontal direction and the vertical direction, and
for the coefficients of an orthogonal transform block orthogonally transformed in the frame orthogonal transform mode, inverse orthogonal transform is performed on the 4 coefficients in the horizontal direction and the 8 coefficients in the vertical direction, the pixels of the inverse orthogonally transformed block are separated into two pixel blocks corresponding to interlaced scanning, each of the two separated pixel blocks is orthogonally transformed, inverse orthogonal transform is performed on two coefficients in the horizontal direction and the vertical direction among the coefficients of the two orthogonally transformed pixel blocks, and the two inverse orthogonally transformed pixel blocks are combined to generate an orthogonal transform block of 4 × 4 coefficients.

The image decoding method according to claim 3, wherein the inverse orthogonal transform is performed based on a fast algorithm.
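
The two-stage processing recited in the claims (a vertical one-dimensional pass on 8 × 4 coefficients that depends on the field or frame mode, followed by a horizontal one-dimensional pass on 4 × 4 coefficients) can be illustrated with the following minimal Python sketch. It is not the patented implementation: the function names (`dct`, `idct`, `vertical_field`, `vertical_frame`, `reduced_idct`), the orthonormal DCT convention, and the 1/√2 rescaling per stage are assumptions made for illustration, and the sketch uses direct summation rather than a fast algorithm or precombined multiplication coefficients.

```python
import math

def dct(x):
    """Orthonormal 1-D DCT-II (direct summation, not a fast algorithm)."""
    n = len(x)
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
            for k in range(n)]

def idct(c):
    """Orthonormal 1-D inverse DCT (DCT-III), the inverse of dct()."""
    n = len(c)
    return [sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c[k]
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k in range(n))
            for i in range(n)]

# Assumed rescaling: halving the transform length with orthonormal
# transforms raises the DC level by sqrt(2), so each stage divides it out.
HALF_SCALE = 1.0 / math.sqrt(2.0)

def vertical_field(col8):
    """Field mode: 4-point inverse transform of the 4 low-frequency coefficients."""
    return [p * HALF_SCALE for p in idct(col8[:4])]

def vertical_frame(col8):
    """Frame mode: 8-point inverse transform, field separation,
    4-point forward transform per field, 2-point inverse transform of
    the 2 low-frequency coefficients, then recombination of the fields."""
    pixels = idct(col8)
    reduced = []
    for field in (pixels[0::2], pixels[1::2]):  # top field, bottom field
        low2 = dct(field)[:2]
        reduced.append([p * HALF_SCALE for p in idct(low2)])
    top, bottom = reduced
    return [top[0], bottom[0], top[1], bottom[1]]  # re-interleave lines

def reduced_idct(block, frame_mode):
    """block[v][u]: 8x8 coefficients (v vertical, u horizontal frequency).
    Returns a 4x4 pixel block as a list of rows."""
    vertical = vertical_frame if frame_mode else vertical_field
    # Stage 1: vertical pass over the 4 low-frequency columns (8x4 -> 4x4).
    cols = [vertical([block[v][u] for v in range(8)]) for u in range(4)]
    # Stage 2: horizontal 4-point inverse transform on each of the 4 rows.
    return [[p * HALF_SCALE for p in idct([cols[u][y] for u in range(4)])]
            for y in range(4)]
```

Under these assumptions, a block whose only nonzero coefficient is a DC value of 800 (a flat 8 × 8 block of pixel value 100 in the orthonormal convention) decodes to a 4 × 4 block of value 100 in both modes, up to floating-point rounding. Selecting between `vertical_field` and `vertical_frame` is a simplified stand-in for the claimed switching of multiplication coefficients between the two orthogonal transform modes.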