JP4088205B2

JP4088205B2 - Encoding apparatus, computer-readable program, and encoding method.

Info

Publication number: JP4088205B2
Application number: JP2003165594A
Authority: JP
Inventors: 一公清水; 吉一郎柏木
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 2002-06-11
Filing date: 2003-06-10
Publication date: 2008-05-21
Anticipated expiration: 2023-06-10
Also published as: JP2004072732A

Description

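[0001]
[Industrial application fields]
The present invention relates to an encoding apparatus that encodes moving images, and in particular to improvements in motion vector detection and in the selection of a motion compensation method.
[0002]
[Prior art]
Products that apply the MPEG2 video standard (ISO/IEC 13818-2, "Information technology - Generic coding of moving pictures and associated audio information: Video") have swept the consumer electronics market in recent years. Because the MPEG2 video standard specifies only the decoding method and not the encoding method, a suitable encoding method must be chosen when such products are implemented. A widely known encoding method for MPEG2 video is "Test Model 5" (ISO/IEC JTC/SC29/WG11/N0400, Apr 1993; abbreviated TM5).
[0003]
Encoding according to Test Model 5 consists of the following steps: motion vector search (1), motion compensation mode selection (2), DCT type selection (3), and DCT, quantization, and variable-length coding (4). Of these steps, how the motion vector is searched for and how the motion compensation mode is selected greatly affect the image quality.
The selection of the motion compensation mode for a P picture is described here. A P picture has four motion compensation modes: forward frame prediction, forward field prediction, NoMC (motion compensation with a motion vector of 0), and intra (no motion compensation). The macroblock predicted differs from one motion compensation mode to another.
[0004]
To select the best motion compensation mode, Test Model 5 calculates the mean square error (MSE) of the macroblock predicted in each mode (predicted MB) and selects the motion compensation mode using the MSE of the predicted MB as the evaluation value. The encoding technique of Test Model 5 is described in Non-Patent Document 1 below.
[0005]
For motion vector search, the technique described in Patent Document 1 below is known. A motion vector is information indicating the position of the reference macroblock relative to the macroblock to be encoded. A motion vector search calculates an evaluation value for each of a plurality of candidate macroblocks and selects the macroblock with the smallest evaluation value as the reference macroblock.
[0006]
The encoding apparatus described in Patent Document 1 is characterized by how this evaluation value is calculated. Of the DC component and the AC component of the error of a macroblock, the DC component is removed completely and the evaluation value is calculated from the AC component alone.
As for encoding techniques, the technique described in Patent Document 2 below is also known.
[0007]
[Patent Document 1]
Japanese Patent No. 2625424
[0008]
[Patent Document 2]
JP 63-193784 A
[0009]
[Non-Patent Document 1]
Test Model 5, ISO/IEC JTC/SC29/WG11/N0400, 1993
[0010]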
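[Problems to be solved by the invention]
However, conventional motion compensation mode selection and motion vector search have the following three problems.
First, mode selection according to Test Model 5 may fail to select an appropriate motion compensation mode for moving images whose luminance changes greatly over time. An example is footage of a dark concert hall in which lights flash violently. In such images the luminance change is large, and the DC component of the error accounts for a large share of the MSE. When the MSE of the predicted MB is calculated for each motion compensation mode, the DC component of the error dominates the MSE for every mode. If an inter-type motion compensation mode (inter mode) is chosen with such an MSE as the evaluation value, a macroblock whose picture content is completely different but whose luminance happens to match may be chosen as the predicted MB. Such an erroneous choice of predicted MB can degrade image quality.
[0011]
Second, mode selection according to Test Model 5 may fail to select an appropriate motion compensation mode for interlaced images with large luminance changes. In an interlaced image, the temporal change between fields can be large even when the picture itself is flat. When the temporal change between fields is large, the variance of the macroblock is calculated as a large value. The variance of the macroblock is a parameter referred to when deciding whether to encode in intra mode or in inter mode. If this variance is large, inter mode is selected in the intra/inter decision. However, even in an interlaced moving image the picture itself is flat, so it is objectively clear that the code amount would be small if the macroblock were encoded in intra mode with field DCT; if inter mode is selected nonetheless, the optimum code amount may not be obtained. An interlaced image with large luminance changes was used as the example, but the same problem can arise in interlaced images in which an object moves horizontally.
[0012]
Third, the technique described in Patent Document 1 excludes the DC component of the error and evaluates only the AC component. It can therefore search for motion vectors appropriately even in moving images whose luminance changes sharply over time. For moving images with small luminance changes and little change in picture content, however, the motion vector search may go wrong. When macroblocks with a flat picture, such as a picture of the sky, surround the macroblock to be encoded, the AC component of the error is small for every surrounding macroblock. Yet an image of the sky, although flat at first glance, often contains subtle variation over a wide area. Such subtle wide-area variation appears in the DC component of the error. Because the motion vector search of Patent Document 1 ignores this variation when choosing the reference macroblock, it may select a macroblock with a large DC error component and cause image quality degradation.
[0013]
A first object of the present invention is to provide an encoding apparatus that can appropriately select the motion compensation mode when encoding moving images whose luminance changes greatly over time.
A second object of the present invention is to provide an encoding apparatus that can appropriately choose between encoding with motion compensation (inter mode) and encoding in intra mode when encoding moving images whose luminance changes greatly over time.
[0014]
A third object of the present invention is to provide an encoding apparatus that can appropriately search for motion vectors whether the temporal luminance change is large or small and whether the picture is flat or complex.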
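[0015]
[Means for solving the problems]
To achieve the first object, an encoding apparatus according to the present invention selects, from among a plurality of methods, the compensation method to be used for motion compensation and encodes a macroblock with the selected compensation method, the apparatus comprising: first calculation means for calculating the AC component and the DC component of the error for the macroblock predicted by each motion compensation method; second calculation means for calculating an evaluation value for each compensation method using the calculated DC component and AC component of the error; and selection means for selecting a compensation method based on the calculated evaluation values, wherein the second calculation means calculates the evaluation value after attenuating the DC component of the error of each macroblock based on a predetermined coefficient.
[0016]
To achieve the second object, the encoding apparatus according to the present invention comprises: determination means for determining, from among a plurality of methods, the transform method to be used when applying the discrete cosine transform to a macroblock; third calculation means for calculating the variance of the macroblock to be encoded by executing a different calculation depending on the transform method determined by the determination means; and comparison means for comparing the mean square error between the macroblock predicted by the compensation method selected by the selection means and the macroblock to be encoded with the variance calculated by the third calculation means,
wherein the macroblock to be encoded is encoded with the compensation method selected by the selection means when the mean square error with respect to the macroblock to be encoded is smaller than the variance, or when the mean square error is smaller than a predetermined threshold.
[0017]
To achieve the third object, an encoding apparatus according to the present invention selects, from among a plurality of macroblocks belonging to a forward or backward frame, the reference macroblock to be used for motion compensation of a macroblock and calculates a motion vector for the selected reference macroblock, the apparatus comprising: first calculation means for calculating the AC component and the DC component of the error for each macroblock that is a candidate reference macroblock; second calculation means for calculating an evaluation value for each candidate macroblock using the calculated AC component and DC component of the error; and selection means for selecting the reference macroblock for motion compensation based on the calculated evaluation values, wherein the second calculation means calculates the evaluation value after attenuating the DC component of the error of each macroblock based on a predetermined coefficient.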
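[0018]
[Embodiments of the invention]
An embodiment of the encoding apparatus according to the present invention will now be described. The encoding apparatus is manufactured industrially on the basis of the hardware configuration shown in FIG. 1. As shown in FIG. 1, the encoding apparatus comprises an A/D converter 1, a format conversion unit 2, a picture reordering unit 3, a frame memory 4, a subtractor 5, a motion compensated prediction unit 6, a DCT unit 7, a quantization unit 8, a variable-length coding unit 9, a buffer 10, a rate control unit 11, an inverse quantization unit 12, an inverse DCT unit 13, an adder 14, and a D/A converter 15.
[0019]
The A/D converter 1 is a circuit that performs A/D conversion. It separates a video frame in analog signal form into a luminance signal (Y) and color difference signals (Cb, Cr) and converts each signal into digital form. This conversion yields video frames in digital data form, which are output sequentially to the format conversion unit 2.
[0020]
The format conversion unit 2 converts the video frames obtained by the A/D converter 1 into a spatial resolution format and outputs the converted video frames to the picture reordering unit 3.
The picture reordering unit 3 reorders the video frames output from the format conversion unit 2. In the analog signal the video frames are in what is called display order; reordering them yields a sequence of video frames arranged in coding order. The picture reordering unit 3 outputs each video frame of the reordered sequence, as the frame to be encoded, to the subtractor 5 and the motion compensated prediction unit 6.
[0021]
The frame memory 4 stores frames that can serve as reference frames for the frame to be encoded when motion compensation is performed. Specifically, the video frames located before and after the frame to be encoded are stored in the frame memory 4.
The subtractor 5 calculates the residual between the video frame to be encoded and the reference frame stored in the frame memory 4 and outputs it to the motion compensated prediction unit 6 and the DCT unit 7.
[0022]
The motion compensated prediction unit 6 selects a motion compensation mode based on the residual output from the subtractor 5 and the macroblock to be encoded (hereinafter, coding MB), performs motion compensation, and outputs a motion vector and a prediction mode to the variable-length coding unit 9. The prediction mode is information that tells the variable-length coding unit 9 whether to encode in intra mode or, if not, in which of the plurality of inter modes to encode.
[0023]
The DCT unit 7 applies the DCT to the residual output from the subtractor 5 or to the coding MB and outputs the resulting DCT coefficients to the quantization unit 8. This produces a matrix holding a plurality of DCT coefficients.
The quantization unit 8 quantizes each DCT coefficient by multiplying it by 16, dividing the result by (quantization coefficient × 2 × quantization scale), and rounding to the nearest integer.
[0024]
The variable-length coding unit 9 encodes the DCT coefficients, the motion vectors, and the prediction modes so that data occurring more frequently is assigned shorter codes.
The buffer 10 is a FIFO memory that stores the data input from the variable-length coding unit 9 sequentially in input order.
[0025]
The rate control unit 11 monitors the amount of data in the buffer 10 so that the buffer 10 neither underflows nor overflows. It does so by reading the amount of data in the buffer 10 and feeding information indicating that amount back to the quantization unit 8. If the quantization unit 8 adjusts its output rate to the variable-length coding unit 9 based on this information, the output from the buffer 10 can be kept at a constant rate.
[0026]
The inverse quantization unit 12 inverse-quantizes the DCT coefficients quantized by the quantization unit 8 by multiplying them by (quantization coefficient × 2 × quantization scale) and then dividing by 16, and outputs the result to the inverse DCT unit 13.
The inverse DCT unit 13 receives the DCT coefficient values from the inverse quantization unit 12, applies the inverse DCT to them to recover the residual as it was before encoding, and outputs it to the adder 14.
[0027]
The adder 14 is an adding circuit that adds the residual output from the inverse DCT unit 13 to the reference frame stored in the frame memory 4 and outputs the sum to the frame memory 4.
This completes the overall configuration of the encoding apparatus.
Next, the motion compensated prediction unit 6, which is the core of the encoding apparatus, is described. The motion compensated prediction unit 6 is implemented in the encoding apparatus as a typical computer system consisting of a CPU, a ROM storing a program, and a RAM. The program stored in the ROM is read by the CPU, and the motion compensated prediction unit 6 performs its functions through the cooperation of the program and the hardware resources. The box representing the motion compensated prediction unit 6 shows the concrete means realized by the cooperation of the program stored in the ROM and the hardware resources.
[0028]
As shown in this box, the motion compensated prediction unit 6 comprises a motion vector search unit 16 that searches for motion vectors, a DCT type determination unit 17 that determines the DCT type, an inter selection unit 18 that selects the best of the plurality of inter-type motion compensation modes (inter modes), and an inter/intra selection unit 19 that compares the best inter mode with intra mode and selects one of them.
[0029]
This embodiment describes the DCT type determination unit 17 and the inter selection unit 18 in detail; the inter/intra selection unit 19 and the motion vector search unit 16 are described in the second and third embodiments, respectively.
The DCT type determination unit 17 determines the DCT type based on the motion vector and the macroblocks in the frame memory 4. For the residual macroblock, it compares the sum of squared luminance differences between adjacent lines with the sum of squared luminance differences between every other line; if the former is smaller it chooses frame DCT, and if the latter is smaller it chooses field DCT.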
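The comparison performed by the DCT type determination unit 17 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `decide_dct_type` is hypothetical, and the tie-breaking rule (a tie goes to frame DCT) is an assumption, since the text does not say what happens when the two sums are equal.

```python
def decide_dct_type(residual):
    """Return "frame" or "field" for a residual macroblock given as a list of rows."""
    rows = len(residual)
    # Sum of squared luminance differences between vertically adjacent lines.
    adjacent = sum(
        (a - b) ** 2
        for r in range(rows - 1)
        for a, b in zip(residual[r], residual[r + 1])
    )
    # Same measure between lines two apart (lines belonging to the same field).
    alternate = sum(
        (a - b) ** 2
        for r in range(rows - 2)
        for a, b in zip(residual[r], residual[r + 2])
    )
    # Assumption: a tie is resolved in favour of frame DCT.
    return "frame" if adjacent <= alternate else "field"


# Interlaced-looking block: the two fields sit at clearly different levels.
interlaced = [[50 if r % 2 == 0 else 70] * 16 for r in range(16)]
# Smooth vertical ramp: neighbouring lines are the most similar ones.
ramp = [[r] * 16 for r in range(16)]

print(decide_dct_type(interlaced))  # field
print(decide_dct_type(ramp))        # frame
```

In the first block every adjacent line pair differs by 20 while lines of the same field are identical, so field DCT wins; in the ramp the adjacent lines are closer than lines two apart, so frame DCT wins.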
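[0030]
The inter selection unit 18 calculates the MSE of the macroblock predicted in each mode (predicted MB) and selects the motion compensation mode using the MSE of the predicted MB as the evaluation value. Its mode selection differs from that of Test Model 5 in two respects: (1) the procedure for calculating the DC and AC components of the error changes according to the DCT type, and (2) a modified MSE is used as the evaluation value.
[0031]
How the calculation of the DC and AC components of the error changes is explained separately for frames and fields. First, the calculation of the mean square error of the DC component and of the AC component is explained. Let Xij be the luminance at coordinates (i,j) in the coding MB and Yij the luminance at coordinates (i,j) in the reference macroblock. The MSE between these two macroblocks is calculated by Equation 1 below.
[0032]
[Equation 1]

The DC component of the error (DCE) is calculated as the square of the mean error m, as shown in Equation 2 below.
[0033]
[Equation 2]

The AC component of the error (ACE), on the other hand, is calculated as the variance about the mean m described above. Equation 3 below shows the variance formula.
[0034]
[Equation 3]

Here, the variance formula can be expanded into the form
variance = (mean of the squares) − (square of the mean)
If the (mean of the squares) term and the (square of the mean) term of the expanded formula are replaced by MSE and DCE respectively, the relation ACE = MSE − DCE holds.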
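The images for Equations 1 to 3 are not reproduced in this text. From the definitions in the surrounding paragraphs they can be reconstructed as below; this is a reconstruction rather than a copy of the original figures, and the symbol n (the number of pixels in the macroblock, 256 for a 16×16 block) is introduced here for illustration.

```latex
% Equation 1 (reconstructed): mean square error between coding MB X and reference MB Y
\mathrm{MSE} = \frac{1}{n}\sum_{i,j}\left(X_{ij}-Y_{ij}\right)^{2}

% Equation 2 (reconstructed): DC component of the error, the square of the mean error m
m = \frac{1}{n}\sum_{i,j}\left(X_{ij}-Y_{ij}\right), \qquad \mathrm{DCE} = m^{2}

% Equation 3 (reconstructed): AC component of the error, the variance of the error about m
\mathrm{ACE} = \frac{1}{n}\sum_{i,j}\left(\left(X_{ij}-Y_{ij}\right)-m\right)^{2}
```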
[0035]
The relation between MSE, DCE, and ACE is therefore as follows.
MSE = DCE + ACE
For the expansion of Equation 3, see the explanation below.
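The decomposition MSE = DCE + ACE can be checked numerically. The sketch below assumes the definitions given in the text (DCE as the squared mean of the residual, ACE as the variance of the residual about that mean); the function name `error_components` and the 2×2 sample blocks are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def error_components(coding_mb, predicted_mb):
    """Return (MSE, DCE, ACE) of the residual between two equally sized blocks."""
    residual = [
        x - y
        for row_x, row_y in zip(coding_mb, predicted_mb)
        for x, y in zip(row_x, row_y)
    ]
    n = len(residual)
    mean = sum(residual) / n
    mse = sum(e * e for e in residual) / n           # mean of the squares
    dce = mean * mean                                # square of the mean
    ace = sum((e - mean) ** 2 for e in residual) / n  # variance about the mean
    return mse, dce, ace


coding = [[60, 62], [64, 70]]
predicted = [[50, 50], [50, 50]]
mse, dce, ace = error_components(coding, predicted)
print(mse, dce, ace)  # 210.0 196.0 14.0
```

The residuals are 10, 12, 14, and 20, so the mean is 14, DCE is 196, ACE is 14, and their sum equals the MSE of 210.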
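[0036]
The variance formula of Equation 3 is expressed using the mean m of n values x1, x2, x3, ..., xn. The mean m of the n values x1, x2, x3, ..., xn is given by Equation 4 below.
[0037]
[Equation 4]

Expressing the formula of Equation 3 with Equation 4 gives Equation 5.
[0038]
[Equation 5]

Equation 6 shows the steps of expanding the variance formula.
[0039]
[Equation 6]

This expansion shows that the formula variance = (mean of the squares) − (square of the mean) holds.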
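The images for Equations 4 to 6 are likewise missing from this text. The standard expansion they refer to can be written as follows; this is a reconstruction from the surrounding description, with x_k denoting the k-th of the n values.

```latex
% Equation 4 (reconstructed): mean of n values
m = \frac{1}{n}\sum_{k=1}^{n} x_{k}

% Equation 5 (reconstructed): variance about the mean
\sigma^{2} = \frac{1}{n}\sum_{k=1}^{n}\left(x_{k}-m\right)^{2}

% Equation 6 (reconstructed): expansion of the variance
\sigma^{2}
  = \frac{1}{n}\sum_{k=1}^{n} x_{k}^{2} - 2m\cdot\frac{1}{n}\sum_{k=1}^{n} x_{k} + m^{2}
  = \frac{1}{n}\sum_{k=1}^{n} x_{k}^{2} - 2m^{2} + m^{2}
  = \frac{1}{n}\sum_{k=1}^{n} x_{k}^{2} - m^{2}
```

With x_k taken as the pixel residuals, the first term is the MSE and the second is the DCE, which gives ACE = MSE − DCE.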
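The above are the formulas for DCE and ACE in frame prediction. The formulas for the DCE and ACE of one field in field prediction are shown in Equation 7 below.
[0040]
[Equation 7]

This completes the description of how DCE and ACE are calculated. The first feature of the inter selection unit 18 is that it changes this calculation according to the DCT type.
When the DCT type is frame, the above calculation is applied to the residual between the frame macroblock (16×16) and the coding MB to obtain DCE and ACE.
[0041]
When the DCT type is field, DCE is calculated for each of the two fields (8×16) and the average of the two values is taken as the DCE of the macroblock. Likewise, ACE is calculated for each of the two fields and the average of the two values is taken as the ACE of the macroblock.
The second feature of the inter selection unit 18 is that it uses a modified MSE as the evaluation value for mode selection. The modified MSE is the MSE with the DCE attenuated and is written modMSE. As a formula, modMSE is as follows.
modMSE = α×DCE + ACE
Here α is the attenuation factor, with 0<α<1.
The value of α is preferably set to 1/64 by default, because various simulations have shown that the compression ratio and the S/N ratio are best when α is 1/64. It is also preferable that the value of α can be changed on instruction from the user through the user interface. The user can then check the encoded picture quality on a monitor and adjust α until satisfactory picture quality is obtained.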
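The two features just described (DCT-type-dependent DCE/ACE and the attenuated evaluation value) can be sketched together. This is an illustration under assumptions, not the patent's program: the function names are hypothetical, the residual is given as a list of 16 rows, and the even-numbered rows are assumed to form one field and the odd-numbered rows the other.

```python
ALPHA = 1 / 64  # default attenuation factor given in the text


def dce_ace(rows):
    """DCE (squared mean) and ACE (variance) of a residual given as a list of rows."""
    values = [e for row in rows for e in row]
    n = len(values)
    mean = sum(values) / n
    dce = mean * mean
    ace = sum((e - mean) ** 2 for e in values) / n
    return dce, ace


def mod_mse(residual, dct_type, alpha=ALPHA):
    """modMSE = alpha * DCE + ACE, computed on the frame or averaged over two fields."""
    if dct_type == "frame":
        dce, ace = dce_ace(residual)
    else:
        # Field DCT: compute DCE and ACE per field, then average the two fields.
        top = dce_ace(residual[0::2])
        bottom = dce_ace(residual[1::2])
        dce = (top[0] + bottom[0]) / 2
        ace = (top[1] + bottom[1]) / 2
    return alpha * dce + ace


# Residual whose two fields sit at different levels (5 and 21).
residual = [[5 if r % 2 == 0 else 21] * 16 for r in range(16)]
print(mod_mse(residual, "frame"))  # 66.640625
print(mod_mse(residual, "field"))  # 3.640625
```

For this residual the plain MSE is 233 either way, but the split differs: treated as a frame, the field-to-field level difference lands in ACE (64), whereas treated as two fields it lands in DCE (233) and is attenuated by α.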
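[0042]
The processing of the motion compensated prediction unit 6 and the inter selection unit 18 is described below with reference to flowcharts. FIG. 2 is a flowchart showing the overall flow of processing in the motion compensated prediction unit 6. The motion vector search unit 16 searches for motion vectors (step S101), the DCT type determination unit 17 determines the DCT type (step S102), a motion compensation mode is selected (step S103), and then the DCT unit 7, the quantization unit 8, and the variable-length coding unit 9 perform DCT, quantization, and variable-length coding (step S104). Selection of the motion compensation mode comprises having the inter selection unit 18 select the best inter mode (step S105) and having the inter/intra selection unit 19 choose between the best inter mode and intra mode (step S106).
[0043]
The inter selection unit 18 can be built by writing the procedure shown in the flowchart of FIG. 3 in a computer language to create a program and having a computer execute it. The procedure of the inter selection unit 18 is described below with reference to FIG. 3. For brevity, "macroblock" is abbreviated MB in this flowchart.
[0044]
Steps S1 to S2 form a loop that repeats steps S3 to S8 for each inter mode. For a P picture, forward frame prediction (1), forward field prediction (2), and NoMC (3) are each subject to steps S3 to S8. For a B picture, forward frame prediction (1), forward field prediction (2), backward frame prediction (3), backward field prediction (4), bidirectional frame prediction (5), and bidirectional field prediction (6) are each subject to steps S3 to S8.
[0045]
In this loop, the motion compensation mode being processed is called motion compensation mode p, and the macroblock predicted in motion compensation mode p is called macroblock p. Step S3 switches the procedure according to the DCT type determined by the DCT type determination unit.
If the DCT type is frame, DCE and ACE are calculated from the residual of the 16×16 frame of predicted MB p (step S4).
[0046]
If the DCT type is field, DCE and ACE are calculated for each of the two 16×8 fields of predicted MB p from the residual between macroblock p and the coding MB (step S5). The average of the DCE of the two fields is then set as DCE (step S6), and the average of the ACE of the two fields is set as ACE (step S7).
[0047]
Once DCE and ACE have been calculated in either step S4 or steps S5 to S7, DCE is multiplied by the coefficient α and ACE is added to obtain modMSE(p) (step S8). This modMSE(p) is the evaluation value for mode p. Repeating steps S3 to S8 for each inter mode yields an evaluation value for each inter mode. When modMSE has been calculated for every inter mode, the inter mode whose predicted MB has the smallest modMSE is selected (step S9).
An example of the operation of the inter selection unit 18 is described next, with sample calculations of DCE, ACE, and modMSE.
[0048]
The image in this example is an interlaced image with large luminance changes, and the picture type is P picture. The macroblock to be encoded in this P picture (coding MB) is assumed to have the following 16×16 luminance values.
[0049]
Coding MB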
48 59 57 50 52 56 54 51 56 60 56 52 57 60 55 56
72 66 67 74 75 71 70 75 73 69 71 74 73 71 76 77
55 50 55 58 56 59 62 60 58 59 61 60 59 62 62 60
70 74 72 73 76 74 70 72 74 73 73 75 75 73 73 77
53 55 54 55 54 51 55 57 57 59 60 59 58 58 55 56
73 72 73 75 74 72 73 74 72 74 75 76 76 74 76 78
56 54 57 57 54 51 54 58 58 59 59 57 58 57 55 59
65 69 73 75 74 73 75 76 75 75 76 75 76 76 77 78
50 56 57 58 56 52 55 58 58 58 60 58 58 58 57 59
71 68 74 75 72 71 69 71 74 75 74 75 77 75 74 76
56 54 56 56 55 54 55 58 57 57 59 57 57 57 57 60
67 71 70 70 71 71 71 74 74 73 73 73 74 74 77 76
50 56 55 56 57 56 58 58 57 57 58 58 58 59 60 61
70 67 71 71 73 73 72 74 73 72 73 73 74 73 72 74
54 51 55 56 56 57 57 56 55 57 59 59 59 57 59 61
69 71 70 69 72 74 74 73 73 74 74 73 74 76 76 76
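[0050]
The inter selection unit 18 selects which of forward frame prediction (1), forward field prediction (2), and NoMC (3) is the best motion compensation mode. To do so, it calculates an evaluation value for each of these inter-type motion compensation modes. When forward frame prediction (1) is inter mode p, the macroblock with the following 16×16 luminance values is assumed to be predicted.
[0051]
Macroblock predicted in forward frame mode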
49 47 47 48 49 49 50 50 51 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
49 47 47 48 49 50 50 50 51 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
49 47 48 48 49 50 51 51 52 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
49 48 48 49 50 51 51 51 52 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
50 49 50 50 51 51 52 52 52 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
50 50 50 51 51 52 52 53 53 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
50 50 51 51 52 52 53 53 53 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
50 50 51 51 52 53 53 53 53 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
[0052]
The pixel-by-pixel residual between this predicted MB and the coding MB is as follows.
Residual with respect to the macroblock predicted in forward frame mode
-1 12 10 2 3 7 4 1 5 8 4 0 5 8 3 4
21 15 16 23 24 20 19 24 21 16 18 21 20 18 23 24
6 3 8 10 7 9 12 10 7 7 9 8 7 10 10 8
19 23 21 22 25 23 19 21 22 20 20 22 22 20 20 24
4 8 6 7 5 1 4 6 5 7 8 7 6 6 3 4
22 21 22 24 23 21 22 23 20 21 22 23 23 21 23 25
7 6 9 8 4 0 3 7 6 7 7 5 6 5 3 7
14 18 22 24 23 22 24 25 23 22 23 22 23 23 24 25
0 7 7 8 5 1 3 6 6 6 8 6 6 6 5 7
20 17 23 24 21 20 18 20 22 22 21 22 24 22 21 23
6 4 6 5 4 2 3 5 4 5 7 5 5 5 5 8
16 20 19 19 20 20 20 23 22 20 20 20 21 21 24 23
0 6 4 5 5 4 5 5 4 5 6 6 6 7 8 9
19 16 20 20 22 22 21 23 21 19 20 20 21 20 19 21
4 1 4 5 4 4 4 3 2 5 7 7 7 5 7 9
18 20 19 18 21 23 23 22 21 21 21 20 21 23 23 23
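[0053]
Calculating DCE from this residual in step S4 gives DCE=239, and calculating ACE gives ACE=5. Calculating modMSE in step S8 with the coefficient α set to 1/64 gives 9.
[0054]
MSE_frame 9 (ACE 5, DCE 239)
When forward field prediction (2) is inter mode p, the macroblock with the following 16×16 luminance values is assumed to be predicted.
Macroblock predicted in forward field mode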
47 48 48 49 49 50 50 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 50 50 51 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 50 50 51 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 50 50 51 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 50 50 51 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 49 50 51 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 49 50 51 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 49 50 50 51 52 52 52 52 52 52 52 52
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
[0055]
The pixel-by-pixel residual between this predicted MB and the coding MB is as follows.
Residual with respect to the macroblock predicted in forward field mode
1 11 9 1 3 6 4 0 5 9 5 1 6 9 4 5
20 14 15 22 23 19 18 23 21 17 19 22 21 19 24 24
8 2 7 9 6 9 11 9 7 8 10 9 8 11 11 9
18 22 20 21 24 22 18 20 22 21 21 23 23 21 21 24
6 7 6 6 4 1 4 6 6 8 9 8 7 7 4 5
21 20 21 23 22 20 21 22 20 22 23 24 24 22 24 25
9 6 9 8 4 1 3 7 7 8 8 6 7 6 4 8
13 17 21 23 22 21 23 24 23 23 24 23 24 24 25 25
3 8 9 9 6 2 4 7 7 7 9 7 7 7 6 8
19 16 22 23 20 19 17 19 22 23 22 23 25 23 22 23
9 6 8 7 6 4 4 7 6 6 8 6 6 6 6 9
15 19 18 18 19 19 19 22 22 21 21 21 22 22 25 23
3 8 7 7 8 6 7 7 6 6 7 7 7 8 9 10
18 15 19 19 21 21 20 22 21 20 21 21 22 21 20 21
7 3 7 7 7 7 7 5 3 5 7 7 7 5 7 9
17 19 18 17 20 22 22 21 21 22 22 21 22 24 24 23
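Calculating DCE from the residual in steps S5 to S7 gives DCE=242, and calculating ACE gives ACE=5. Calculating modMSE in step S8 with the coefficient α set to 1/64 gives 9.
[0056]
MSE_field 9 (ACE 5, DCE 242)
When NoMC (3) is inter mode p, the macroblock with the following 16×16 luminance values is assumed to be predicted.
Macroblock predicted in NoMC mode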
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 47 47 48 48 49 49 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 47 47 48 48 49 49 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 46 47 47 49 49 50 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 47 47 48 48 49 49 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 46 47 47 49 49 50 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 47 47 48 48 49 49 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 46 47 48 48 49 50 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 47 47 48 48 49 49 50
[0057]
The pixel-by-pixel residual between this predicted MB and the coding MB is as follows.
Residual with respect to the macroblock predicted in NoMC mode
4 15 13 6 8 12 10 7 10 14 10 6 11 14 9 10
28 22 23 30 31 27 26 31 27 22 24 26 25 22 27 27
11 6 11 14 12 15 18 16 12 13 15 14 13 16 16 14
26 30 28 29 32 30 26 28 28 26 26 27 27 24 24 27
9 11 10 11 10 7 11 13 11 13 14 13 12 12 9 10
29 28 29 31 30 28 29 30 26 28 28 29 27 25 26 28
12 10 13 13 10 7 10 14 12 13 13 11 12 11 9 13
21 25 29 31 30 29 31 32 29 28 29 27 28 27 28 28
6 12 13 14 12 8 11 14 12 12 14 12 12 12 11 13
27 24 30 31 28 27 25 27 28 29 27 28 28 26 24 26
12 10 12 12 11 10 11 14 11 11 13 11 11 11 11 14
23 27 26 26 27 27 27 30 28 26 26 25 26 25 28 26
6 12 11 12 13 12 14 14 11 11 12 12 12 13 14 15
26 23 27 27 29 29 28 30 27 26 26 25 26 24 22 24
10 7 11 12 12 13 13 12 9 11 13 13 13 11 13 15
25 27 26 25 28 30 30 29 27 27 27 25 26 27 27 26
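[0058]
Calculating DCE from this residual in step S4 gives DCE=435, and calculating ACE gives ACE=5. Calculating modMSE in step S8 with the coefficient α set to 1/64 gives 12.
MSE_noMC 12 (ACE 5, DCE 435)
[0059]
The process above has produced the three modMSE values below, so selecting the inter mode with the smallest modMSE in step S9 gives the best inter mode. Note that in this example the modMSE values of forward frame mode and forward field mode are equal.
MSE_frame 9 (ACE 5, DCE 239)
MSE_field 9 (ACE 5, DCE 242)
MSE_noMC 12 (ACE 5, DCE 435)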
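The three evaluation values above can be reproduced from the quoted DCE and ACE figures. The sketch below is illustrative only: the mode names are hypothetical labels, and rounding modMSE to the nearest integer is an assumption made to match the printed values 9, 9, and 12.

```python
ALPHA = 1 / 64

# (mode, DCE, ACE) as quoted in the worked example, in evaluation order.
candidates = [
    ("forward_frame", 239, 5),
    ("forward_field", 242, 5),
    ("no_mc", 435, 5),
]

# modMSE = alpha * DCE + ACE, rounded to an integer (assumption, see lead-in).
scores = {name: round(ALPHA * dce + ace) for name, dce, ace in candidates}

# min() returns the first minimal entry, so a tie keeps the mode evaluated earlier.
best = min(scores, key=scores.get)

print(scores)  # {'forward_frame': 9, 'forward_field': 9, 'no_mc': 12}
print(best)    # forward_frame
```

Resolving the tie in favour of the mode evaluated first is one possible design choice; it happens to agree with the second embodiment, which assumes that frame prediction is selected because of the order of evaluation.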
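[0060]
As described above, according to this embodiment the evaluation value is calculated after the DCE is attenuated, so the motion compensation mode is selected appropriately, and picture quality improves, even for moving images such as a dark concert hall in which lights flash violently.
(Second embodiment)
The second embodiment describes in more detail the improvement in the inter/intra selection unit 19. The inter/intra selection unit 19 compares the inter mode judged best by the inter selection unit 18 with intra mode and decides which to adopt. The comparison is made by calculating a variance for intra mode, comparing this variance with the MSE of the best mode, and also determining whether the MSE of the best mode exceeds a predetermined threshold.
[0061]
The inter/intra selection unit 19 of this embodiment is characterized in that it changes the way the variance is calculated according to the DCT type. If the DCT type is frame, the variance is calculated for the 16×16 frame of the coding MB. If the DCT type is field, the variance is calculated for each of the two 16×8 fields of the coding MB, and the average of the variances of the two fields is used as the variance.
[0062]
The inter/intra selection unit 19 can be built by writing the procedure shown in the flowchart of FIG. 4 in a computer language to create a program and having a computer execute it. The procedure of the inter/intra selection unit 19 is described below with reference to the flowchart of FIG. 4.
Step S21 switches the procedure according to the DCT type determined by the DCT type determination unit. If the DCT type is frame, the variance is calculated for the 16×16 frame of the coding MB (step S22). If the DCT type is field, the variance is calculated for each of the two 16×8 fields of the coding MB (step S23), and the average of the variances of the two fields is set as the variance (step S24).
[0063]
Step S25 is a decision step that tests whether the calculated variance VAR is smaller than the MSE of the best mode and the MSE of the best mode is greater than 64. What is compared with the variance here is the MSE, not modMSE; that is, the MSE in which the DCE has not been attenuated. If the condition holds, intra mode is selected (step S26). If it does not hold, the inter mode of mode p is selected (step S27).
[0064]
An example of the operation of the inter/intra selection unit 19 of the second embodiment follows. It describes how the inter/intra selection unit 19 makes its selection for the calculation example shown in the first embodiment.
In the inter mode selection of the first embodiment, forward frame mode and forward field mode had the same modMSE. It is assumed here that frame prediction is selected because of the order in which the modes are evaluated.
[0065]
Whether the forward frame mode selected in this way or intra mode is better is decided by the procedure described above.
In the first embodiment the DCT type for forward frame mode was determined to be field. Since the DCT type for intra is field, the inter/intra selection unit 19 calculates the variance VAR field by field in steps S23 and S24. The variance VAR calculated in this way is assumed to be 7.
[0066]
The variance VAR calculated in this way is compared with the MSE of forward frame mode (step S25). The MSE of forward frame mode is 244 (=5+239), which satisfies VAR<MSE. The MSE of 244 also satisfies MSE>64, so intra mode is selected.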
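The decision of step S25, together with the DCT-type-dependent variance, can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes the coding MB is a list of rows in which even-numbered and odd-numbered rows form the two fields.

```python
def variance(rows):
    """Population variance of the luminance values in a list of rows."""
    values = [v for row in rows for v in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def mb_variance(coding_mb, dct_type):
    """Variance of the coding MB: whole frame, or the average of the two fields."""
    if dct_type == "frame":
        return variance(coding_mb)
    return (variance(coding_mb[0::2]) + variance(coding_mb[1::2])) / 2


def choose_mode(var, best_inter_mse, threshold=64):
    """Step S25: intra if VAR < MSE and MSE > threshold, otherwise inter."""
    if var < best_inter_mse and best_inter_mse > threshold:
        return "intra"
    return "inter"


# Values from the worked example: VAR = 7, MSE = ACE + DCE = 5 + 239.
print(choose_mode(7, 5 + 239))  # intra

# A flat but interlaced block: the fields differ in level, each field is uniform.
flat_interlaced = [[50 if r % 2 == 0 else 70] * 16 for r in range(16)]
print(mb_variance(flat_interlaced, "frame"))  # 100.0
print(mb_variance(flat_interlaced, "field"))  # 0.0
```

The last two lines show why the field-wise calculation matters: the frame variance of 100 reflects only the level difference between fields, while the field-wise variance of 0 reflects the flat picture and makes the intra condition easier to satisfy.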
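As described above, according to this embodiment, changing the way the variance is calculated according to the DCT type causes the variance of a macroblock to be calculated as a small value for images with large luminance changes. In the comparison with the MSE, the relation "variance < MSE" is then satisfied more easily, intra mode is selected more often than with the Test Model 5 method for images with large luminance changes, and picture quality improves.
[0067]
The threshold compared with the MSE was set to 64, but it may instead be set to 4. With a threshold of 4, block noise, in which rectangular noise appears in flat images, can be suppressed.
(Third embodiment)
The third embodiment describes in more detail the improvement in the motion vector search unit 16.
[0068]
The motion vector search unit 16 calculates an evaluation value for each macroblock located in the reference frame/field and takes the macroblock with the smallest evaluation value as the reference macroblock. It then calculates the position of the reference macroblock (hereinafter, reference MB) relative to the coding MB as the motion vector. The evaluation value in the first embodiment was derived from AC and DC components based on the squared error, whereas the evaluation value in this embodiment is derived from AC and DC components based on the absolute error.
[0069]
The formula for the AC component is shown in Equation 8.
[0070]
[Equation 8]

The formula for the DC component is shown in Equation 9.
[0071]
[Equation 9]

The evaluation value of each reference MB is calculated by a formula that uses these DC and AC components of the error. Equation 10 below is that formula.
[0072]
[Equation 10]

As Equation 10 shows, the absolute error of the DC component is multiplied by the coefficient k and thereby attenuated.
The DC and AC components may also be calculated as in Equations 11 and 12.
[0073]
[Equation 11]

[0074]
[Equation 12]

Here, Xij is the luminance value of the pixel at coordinates (i,j) in the coding MB, and Yij is the luminance value of the pixel at coordinates (i,j) in a macroblock in the reference frame/field.
[0075]
Although the DC component is used in the evaluation value in the third embodiment, it is multiplied by the coefficient k and attenuated, which is a point in common with the first embodiment.
The value of k is preferably set by default in the range 1/16 to 1/4. This is because, with k in this range, it has been confirmed that no visual problems arise and the code amount is small both for (1) images in which the luminance change is larger than the change in picture content and (2) images in which both the change in picture content and the luminance change are small.
[0076]
It is also preferable that the value of k can be changed on instruction from the user through the user interface. The user can then check the encoded picture quality on a monitor and adjust k until satisfactory picture quality is obtained.
The motion vector search unit 16 can be built by writing the procedure shown in the flowchart of FIG. 5 in a computer language to create a program and having a computer execute it. The procedure of the motion vector search unit 16 is described below with reference to the flowchart of FIG. 5. In step S31, the coding MB is set as MBx.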
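[0077]
Steps S32 to S33 form a loop that repeats steps S34 to S38 for the reference frame/field of each motion compensation mode. The reference frames/fields of a P picture are the forward frame (i) and the forward field (ii), so steps S34 to S38 are performed for these. The reference frames/fields of a B picture are the forward frame (i), the forward field (ii), the backward frame (iii), and the backward field (iv), so steps S34 to S38 are performed for these.
[0078]
In the loop of steps S32 to S38, each reference frame/field being processed is called reference frame/field (r).
Steps S34 to S36 form a loop that repeats step S36 for every macroblock belonging to reference frame/field (r) that can be a candidate. All macroblocks within the search range of the coding MB can be candidates here. This is called a full search.
[0079]
The macroblock processed in this loop is called macroblock y. Step S36 obtains the evaluation value f(y) of macroblock y based on Equation 10 above.
Repeating step S36 yields an evaluation value for every macroblock in reference frame/field (r) that can be a candidate.
[0080]
Step S37 takes, from the candidate macroblocks in reference frame/field (r), the one with the smallest f(y) as the reference MB for reference frame/field (r). The following step S38 sets the position of the reference MB relative to macroblock x as motion vector (r). Motion vector (r) is the motion vector for reference frame/field (r). Repeating steps S32 to S38 yields a motion vector for each reference frame/field.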
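The full search of steps S34 to S38 can be sketched as below. The images for Equations 8 to 10 are not available in this text, so the forms used here are assumptions: the DC term is taken as the absolute value of the summed error, the AC term as the sum of absolute deviations from the mean error, and the evaluation value as f = AC + k × DC. The function names, the 4×4 block size, and the test picture are hypothetical.

```python
def evaluate(cur, ref, top, left, k):
    """Assumed evaluation value f = AC + k * DC, built on absolute errors."""
    size = len(cur)
    diff = [
        cur[r][c] - ref[top + r][left + c]
        for r in range(size)
        for c in range(size)
    ]
    mean = sum(diff) / len(diff)
    dc = abs(sum(diff))                    # absolute error of the DC part
    ac = sum(abs(d - mean) for d in diff)  # absolute error around the mean
    return ac + k * dc


def full_search(cur, ref, pos, search_range, k=1 / 8):
    """Return the motion vector (dy, dx) that minimises f over the search window."""
    size = len(cur)
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            top, left = pos[0] + dy, pos[1] + dx
            inside_rows = 0 <= top <= len(ref) - size
            inside_cols = 0 <= left <= len(ref[0]) - size
            if not (inside_rows and inside_cols):
                continue  # candidate would fall outside the reference picture
            f = evaluate(cur, ref, top, left, k)
            if best is None or f < best[0]:
                best = (f, (dy, dx))
    return best[1]


# Reference picture with a non-repeating pattern (hypothetical test data).
ref = [[3 * r * r + 5 * c * c + r * c for c in range(10)] for r in range(10)]
# The coding MB sits at (4, 4); its content equals the reference block at (3, 6).
cur = [row[6:10] for row in ref[3:7]]
print(full_search(cur, ref, pos=(4, 4), search_range=2))  # (-1, 2)
```

The default k of 1/8 lies inside the 1/16 to 1/4 range recommended in the text; setting k to 0 would reduce the evaluation to the AC-only criterion attributed to Patent Document 1.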
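[0081]
The present invention and Patent Document 1 are now compared in terms of whether the reference MB search is appropriate. The comparison is made by calculating how much the DC component for the reference MB changes across the lossy transform. If the mean of the DC component before the lossy transform equals the mean after it, so that the difference is 0, the reference MB search is optimal. The larger the difference between the means, the less appropriate the reference MB search. The image used for the comparison is an image of the sky like that in FIG. 6. In FIG. 6, the pixel values vary subtly, for example 46, 47, 48. This variation is random and appears as the AC component. When the pixel variation is observed over a wide area, however, variation is also seen in such groups of pixels. This variation becomes variation in the DC component of the macroblocks. FIG. 7 shows the luminance values of FIG. 6 rewritten in decimal notation.
[0082]
With reference to FIG. 8, the reference MB found by the procedure of the present invention is compared with the reference MB found by Patent Document 1. Macroblock 1 is the reference MB found by the procedure of the present invention, and macroblock 2 is the reference MB found by the procedure of Patent Document 1.
Patent Document 1 ignores the DC component of the luminance and searches for the reference MB with an evaluation value that uses only the AC component. Equation 13 is the formula for the evaluation value of a macroblock in Patent Document 1.
[0083]
[Equation 13]

With Patent Document 1, a reference MB with a large DC component may be chosen even for a flat image such as the sky. mx1 and mx2 in FIG. 8 show the residuals for macroblock 1 and macroblock 2 in matrix form. Although the reference MB is chosen from a flat image, the DC component is ignored in Patent Document 1, so a reference MB whose pixel residuals are 3, 4, and 5 is chosen, as shown by mx2. Because the pixel residuals are 3, 4, and 5, the mean residual is calculated as a large value, 3.8.
[0084]
In the encoding apparatus according to the present invention, the reference MB is chosen with an evaluation value calculated after the DC component is attenuated, so for a flat image such as the sky a reference MB whose pixel residuals are -1, 0, and 1 is chosen, as shown by arrow mx1. Because the pixel residuals are -1, 0, and 1, the mean residual is a small value, -0.25.
<DCT>, <Quantization>, <Inverse quantization>, and <Inverse DCT> in the figure are the steps of the lossy transform.
[0085]
First, the difference in the mean residual before and after the lossy transform is explained for reference MB 1. In the DCT, the mean "-0.25" of reference MB 1 is multiplied by 8 to become "-2.0", as shown by arrow my1. In the subsequent quantization, the DC coefficient "-2.0" is multiplied by 16, divided by (quantization coefficient × 2 × quantization scale), and rounded to the nearest integer to become "0.00", as shown by arrow my2.
[0086]
In the inverse quantization unit 12, the DC coefficient "0.00" is multiplied by (quantization coefficient × 2 × quantization scale) and then divided by 16 to become "0.00", as shown by arrow my3.
In the inverse DCT, the DC coefficient is divided by 8 and the mean becomes "0.00", as shown by arrow my8. This "0.00" is the mean residual after the lossy transform. The mean residual differs by "0.25" before and after the lossy transform.
[0087]
Next, the difference in the mean before and after the lossy transform is explained for reference MB 2.
In the DCT, the mean "3.80" of reference MB 2 is multiplied by 8 to become "30.38", as shown by arrow my4. In the subsequent quantization, the DC coefficient "30.38" is multiplied by 16, divided by (quantization coefficient × 2 × quantization scale), and rounded to the nearest integer to become "2", as shown by arrow my5.
[0088]
In the inverse quantization unit 12, the DC coefficient "2" is multiplied by (quantization coefficient × 2 × quantization scale) and then divided by 16 to become "40.00", as shown by arrow my6.
In the inverse DCT, the DC coefficient is divided by 8 and the mean becomes "5.00", as shown by arrow my7. This "5.00" is the mean residual after the lossy transform. The mean differs by "1.20" before and after the lossy transform.
[0089]
Comparing the difference in the mean before and after the lossy transform for reference MB 1 and reference MB 2 shows that reference MB 1 gives the better result.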
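The two round trips above can be replayed in a few lines. The value of (quantization coefficient × 2 × quantization scale) is not stated in the text; the sketch assumes 320 because that is the value implied by the quoted step from "2" to "40.00" (2 × 320 / 16 = 40). The function names are hypothetical.

```python
import math

DIVISOR = 320  # assumed value of (quantization coefficient x 2 x quantization scale)


def round_half_away(x):
    """Round to the nearest integer, with halves rounded away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def dc_round_trip(residual_mean):
    """Mean of the residual after DCT, quantization, inverse quantization, inverse DCT."""
    dc = residual_mean * 8                      # DCT: DC coefficient is 8 x the mean
    level = round_half_away(dc * 16 / DIVISOR)  # quantization
    dc_rec = level * DIVISOR / 16               # inverse quantization
    return dc_rec / 8                           # inverse DCT


for mean in (-0.25, 3.80):
    rec = dc_round_trip(mean)
    print(mean, rec, round(abs(rec - mean), 2))
```

Under this assumption the small mean of reference MB 1 is quantized to zero and reconstructed as 0.0 (an error of 0.25), while the larger mean of reference MB 2 is reconstructed as 5.0 (an error of 1.2), matching the figures quoted in the text.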
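<Comparison of picture quality>
A simulation was run in which images were encoded with motion vectors obtained by three apparatuses: an encoding apparatus using Test Model 5, the encoding apparatus described in Patent Document 1, and the encoding apparatus of the present invention with k set in the range 1/16 to 1/4. The results were as follows.
[0090]
For a relatively complex image with small temporal luminance change (hereinafter, "first image"), all three encoding apparatuses gave good picture quality.
For an image with large temporal luminance change (hereinafter, "second image"), the encoding apparatus using Test Model 5 gave inferior picture quality, while the encoding apparatus of Patent Document 1 and the encoding apparatus of the present invention gave good picture quality.
[0091]
For an image with very small temporal luminance change and some luminance variation with position (hereinafter, "third image"), the encoding apparatus using Test Model 5 and the encoding apparatus of the present invention gave good picture quality, while the encoding apparatus of Patent Document 1 gave inferior picture quality. More specifically, when the third image (for example, an image of the sky) is encoded with motion vectors obtained by the method of Patent Document 1, blocks appear to move in regions that are actually still, and picture quality deteriorates.
[0092]
The detection of motion vectors with the encoding apparatus of the present invention is described in detail below for each of the first, second, and third images.
For the first image, that is, a relatively complex image with small temporal luminance change, the AC component of the macroblocks in each group of reference MB candidates is close for the macroblock that represents the true motion and differs greatly for the other macroblocks.
[0093]
Consequently, the difference between the error evaluation values of the macroblocks is determined mainly by the difference in the AC component, the motion vector with the smallest AC error component is selected, and the encoding apparatus of the present invention selects the motion vector that represents the true motion.
For the second image, that is, an image with large temporal luminance change, the DC component becomes large because of the temporal luminance change, but since the DC component is attenuated to 1/16 to 1/4, the influence of the AC error component becomes relatively large.
[0094]
As a result, the picture content has a stronger influence and the luminance change a weaker one, so an appropriate motion vector is obtained.
For the third image (a flat image with small temporal luminance change), the AC component is very small for every reference MB because the image is flat.
For example, as shown in FIGS. 6 and 7, the variation is random over a narrow area, but over a wide area there is meaningful, non-random variation, and the DC component reflects the motion.
[0095]
With the encoding apparatus of the present invention, 1/16 to 1/4 of the DC component is reflected in the error evaluation value and the DC and AC components are evaluated in a suitable proportion, so a reference MB whose DC component and AC component are both close to those of the coding MB can be detected.
The encoding apparatus of Patent Document 1, on the other hand, evaluates the error with the AC component alone when obtaining the motion vector; since the AC component does not reflect the motion here, it obtains an inappropriate motion vector.
[0096]
Cases in which the encoding apparatus of Patent Document 1 obtains an inappropriate motion vector also exist outside the third image. One example is an image whose luminance is high at the left edge, low at the right edge, and changes uniformly in between. In an image with such a constant rate of luminance change, the AC component is the same everywhere in the image. The AC component therefore does not reflect any meaningful motion, and evaluating the error with the AC component alone can yield a motion vector with a large DC error component.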
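The constant-gradient case can be illustrated with a one-dimensional sketch. It makes several assumptions that are not in the source: the scene is static (the correct shift is 0), the AC and DC terms take the absolute-error forms used for illustration earlier, ties are resolved in favour of the first candidate examined, and the function names and sample data are hypothetical.

```python
def ac_dc(cur, cand):
    """Assumed absolute-error AC and DC terms between two equally long lines."""
    diff = [x - y for x, y in zip(cur, cand)]
    mean = sum(diff) / len(diff)
    ac = sum(abs(d - mean) for d in diff)
    dc = abs(sum(diff))
    return ac, dc


def best_shift(line, start, width, shifts, k):
    """Shift that minimises AC + k * DC; the first minimum wins on ties."""
    cur = line[start:start + width]
    scored = []
    for s in shifts:
        ac, dc = ac_dc(cur, line[start + s:start + s + width])
        scored.append((ac + k * dc, s))
    return min(scored, key=lambda t: t[0])[1]


# Luminance falling at a constant rate from left to right (hypothetical data).
line = [200 - 4 * x for x in range(32)]
shifts = [-3, -2, -1, 0, 1, 2, 3]

print(best_shift(line, 12, 8, shifts, k=0))      # -3: AC alone cannot separate the shifts
print(best_shift(line, 12, 8, shifts, k=1 / 8))  # 0: the attenuated DC term breaks the tie
```

With k = 0 every candidate has an AC error of zero, so the outcome depends only on search order; with a small positive k the DC term grows with the size of the shift and the zero vector is chosen.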
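[0097]
With the encoding apparatus of the present invention, as stated above, part (1/16 to 1/4) of the absolute error of the DC component is reflected in the evaluation value and the DC and AC components are evaluated in a suitable proportion, so a reference MB whose DC component and AC component are both close to those of the coding MB can be detected.
As described above, according to this embodiment the evaluation value of a reference MB is calculated after the absolute error of the DC component is attenuated, rather than set to "0" as in Patent Document 1, so motion vectors can be searched for appropriately whether the image to be encoded is a flat image with small luminance change or an image in which lights flash violently.
[0098]
(Remarks)
The description above is based on the embodiments, which are presented merely as examples of systems expected to give the best effect at present. The present invention can be modified without departing from its gist. Representative modifications include (A), (B), (C), and so on below.
[0099]
(A) In the third embodiment the default value of the attenuation coefficient k was set to 1/16 to 1/4, but it is not limited to this range; any value greater than 0 and less than 1 that causes no picture quality problem may be used.
(B) In the third embodiment the absolute error was used for the evaluation value of the reference MB, but the squared error may be used instead.
[0100]
In that case, the formula for the squared error is
[0101]
[Equation 14]

Alternatively, the squared error may be obtained as the mean squared error, in which each term of the above formula is divided by the square of n.
[0102]
(C) A functional unit that detects the characteristics of the image may be added to the encoding apparatus of the third embodiment, and the value of the coefficient k may be changed automatically according to the characteristics of the image. In addition, the motion compensated prediction unit 6 of the third embodiment calculated the DC and AC components of the error from luminance values, but they may be calculated from color difference values or from the values of any of the RGB components of the pixels.
[0103]
(D) The motion compensated prediction unit 6 of the third embodiment was described as performing a so-called full search, which compares the coding MB with all macroblocks within the search range, but other search methods may be used. One example is a search that uses a reduced-size image.
(E) Because the information processing by the programs shown in FIGS. 3 to 5 concretely uses hardware resources such as a CPU and a frame memory, these programs constitute an invention on their own. The first to third embodiments showed acts of implementing the program according to the present invention in a form built into an encoding apparatus, but the programs shown in the first to third embodiments may also be implemented separately from the encoding apparatus. Acts of implementing the programs alone include (1) producing these programs, (2) transferring the programs with or without charge, (3) lending them, (4) importing them, (5) providing them to the public over a two-way electronic communication line, and (6) offering transfer or lending of the programs to general users through shop displays, catalog solicitation, or pamphlet distribution.
[0104]
Provision over a two-way electronic communication line (5) includes the provider sending the program to the user for the user's use (a program download service) and the provider keeping the program and providing only its functions to the user over the electronic communication line (a function-providing ASP service).
(F) The "time" element of the steps executed in time series in the flowcharts of FIGS. 3 to 5 is regarded as an essential matter for specifying the invention. The procedures of these flowcharts can then be seen to disclose forms of use of an encoding method. These flowcharts are themselves embodiments of the act of using the encoding method according to the present invention. If the processing of these flowcharts is carried out so that the steps are performed in time series and the original object of the present invention is achieved with its operation and effects, it goes without saying that this constitutes an act of implementing the encoding method according to the present invention.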
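[0105]
[Effects of the invention]
As described above, the encoding apparatus according to the present invention is configured as set forth in claim 1, so in moving images such as a dark concert hall in which lights flash violently, the DC component of the error, which accounts for most of the evaluation value (MSE), is attenuated by a coefficient. Because the evaluation value is calculated after the DC component of the error is attenuated, the motion compensation mode is selected appropriately and coding efficiency can be increased.
[0106]
Selecting the motion compensation mode with the DC component of the error attenuated has a further effect, as follows. When variable-length coding is performed after the DCT, each block (8×8) has one DC coefficient but may have several AC coefficients. Even if the DCE and ACE of a macroblock are equal, the code amount of the AC coefficients is likely to be larger than that of the DC coefficient. This means that ACE affects the code amount more than DCE does. In the MSE of TM5, DCE and ACE are reflected with the same weight, so a comparison by MSE does not properly reflect the effect of DCE and ACE on the code amount. In the present invention, attenuating DCE gives more weight to ACE, so the motion compensation mode can be selected with more emphasis on picture content than on luminance. For images whose luminance changes, this makes it more likely that the mode selection reflects the true motion.
[0107]
Here, the encoding apparatus may comprise: determination means for determining, from among a plurality of methods, the transform method to be used when applying the discrete cosine transform to a macroblock; third calculation means for calculating the variance of the macroblock to be encoded by executing a different calculation depending on the transform method determined by the determination means; and comparison means for comparing the mean square error between the macroblock predicted by the compensation method selected by the selection means and the macroblock to be encoded with the variance calculated by the third calculation means,
and the macroblock to be encoded may be encoded with the compensation method selected by the selection means when the mean square error with respect to the macroblock to be encoded is smaller than the variance, or when the mean square error is smaller than a predetermined threshold.
[0108]
Because the way the variance is calculated changes according to the type of discrete cosine transform, the variance of a macroblock in an interlaced image with sharp luminance changes is calculated as a smaller value than with the Test Model 5 method. In the comparison with the MSE, the relation "variance < MSE" is then satisfied more easily and intra mode is selected more often. Coding efficiency can thereby be increased.
[0109]
Here, the predetermined threshold may be 4. With a threshold of 4, block noise, in which rectangular noise appears in flat images, can be suppressed.
Here, the apparatus may be an encoding apparatus that selects, from among a plurality of macroblocks belonging to a forward or backward frame, the reference macroblock to be used for motion compensation of a macroblock and calculates a motion vector for the selected reference macroblock, comprising: first calculation means for calculating the AC component and the DC component of the error for each macroblock that is a candidate reference macroblock; second calculation means for calculating an evaluation value for each candidate macroblock using the calculated AC component and DC component of the error; and selection means for selecting the reference macroblock for motion compensation based on the calculated evaluation values,
wherein the second calculation means may calculate the evaluation value after attenuating the DC component of the error of each macroblock based on a predetermined coefficient. Because the evaluation value of a reference MB is calculated after the DC component is attenuated, rather than set to "0" as in Patent Document 1, motion vectors can be searched for appropriately, and coding efficiency increased, whether the image to be encoded is a flat image with small luminance change or an image in which lights flash violently.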
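[Brief description of the drawings]
[FIG. 1] A diagram showing the hardware configuration of the encoding apparatus.
[FIG. 2] A flowchart showing the overall flow of processing in the motion compensated prediction unit 6.
[FIG. 3] A flowchart showing the procedure of the inter selection unit 18.
[FIG. 4] A flowchart showing the procedure of the inter/intra selection unit 19.
[FIG. 5] A flowchart showing the procedure of the motion vector search unit 16.
[FIG. 6] A diagram showing the luminance values in the coding MB and the reference MB for an image, such as a view of the sky, in which both the change in picture content and the luminance change are small.
[FIG. 7] A diagram showing the luminance values (in decimal notation) of the image of the sky.
[FIG. 8] A diagram explaining how the motion vector searches differ in terms of the difference between the value of the DC component of the residual and the value obtained by applying DCT, quantization, inverse quantization, and inverse DCT to it in that order.
[Explanation of reference signs]
1 A/D converter
2 Format conversion unit
3 Picture reordering unit
4 Frame memory
5 Subtractor
6 Motion compensated prediction unit
7 DCT unit
8 Quantization unit
9 Variable-length coding unit
10 Buffer
11 Rate control unit
12 Inverse quantization unit
13 Inverse DCT unit
14 Adder
16 Motion vector search unit
17 DCT type determination unit
18 Inter selection unit
19 Inter/intra selection unit
[0001]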
[Industrial application fields]
The present invention relates to an encoding apparatus that encodes moving images, and in particular to improvements in motion vector detection and in the selection of a motion compensation method.
[0002]
[Prior art]
Products that apply the MPEG2 video standard (ISO/IEC 13818-2, "Information technology - Generic coding of moving pictures and associated audio information: Video") have swept the consumer electronics market in recent years. Because the MPEG2 video standard specifies only the decoding method and not the encoding method, a suitable encoding method must be chosen when such products are implemented. A widely known encoding method for MPEG2 video is "Test Model 5" (ISO/IEC JTC/SC29/WG11/N0400, Apr 1993; abbreviated TM5).
[0003]
Encoding according to Test Model 5 consists of the following steps: motion vector search (1), motion compensation mode selection (2), DCT type selection (3), and DCT, quantization, and variable-length coding (4). Of these steps, how the motion vector is searched for and how the motion compensation mode is selected greatly affect the image quality.
Here, selection of the motion compensation mode for the P picture will be described. There are four motion compensation modes for P pictures: forward frame prediction, forward field prediction, NoMC (motion compensation with a motion vector of 0), and intra (no motion compensation is performed). Macroblocks predicted in these motion compensation modes differ for each motion compensation mode.
[0004]
In order to select the best motion compensation mode, Test Model 5 calculates the mean square error (Mean Square Error (MSE)) for the macroblock (predicted MB) predicted in each mode, and the MSE of the predicted MB is used as the evaluation value. The idea is to select a motion compensation mode. The encoding technique based on Test Model 5 is described in Non-Patent Document 1 below.
[0005]
For motion vector search, a technique described in Patent Document 1 below is known. The motion vector is information indicating the relative position of the reference macroblock with reference to the macroblock to be encoded. The motion vector search is performed by a procedure in which an evaluation value is calculated for each of a plurality of macroblocks that can be candidates, and a macroblock having the smallest evaluation value is selected as a reference macroblock.
[0006]
The encoding device described in Patent Document 1 is characterized by the calculation of the evaluation value. That is, of the error DC component and error AC component of the macroblock, the error DC component is completely removed, and the evaluation value is calculated using only the error AC component.
Regarding the encoding technique, the technique described in Patent Document 2 below is also known.
[0007]
[Patent Document 1]
Japanese Patent No. 2625424
[0008]
[Patent Document 2]
JP 63-193784 A
[0009]
[Non-Patent Document 1]
Test Model 5, ISO/IEC JTC/SC29/WG11/N0400, 1993
[0010]
[Problems to be solved by the invention]
However, the conventional motion compensation mode selection and motion vector search have the following three problems.
First, mode selection according to Test Model 5 may fail to select an appropriate motion compensation mode for moving images whose luminance changes greatly over time. An example is footage of a dark concert hall in which lights flash violently. In such images the luminance change is large, and the DC component of the error accounts for a large share of the MSE. When the MSE of the predicted MB is calculated for each motion compensation mode, the DC component of the error dominates the MSE for every mode. If an inter-type motion compensation mode (inter mode) is chosen with such an MSE as the evaluation value, a macroblock whose picture content is completely different but whose luminance happens to match may be chosen as the predicted MB. Such an erroneous choice of predicted MB can degrade image quality.
[0011]
Second, motion compensation mode selection by Test Model 5 may fail to select an appropriate motion compensation mode for an interlaced image with a large luminance change. An interlaced image may change greatly between fields over time even if the pattern itself is flat. When the temporal change between fields is large, the variance value of the macroblock is calculated to be large. The macroblock variance value is a parameter referred to when deciding whether to encode in the intra mode or the inter mode. If the variance value is large, the inter mode is chosen in the intra mode / inter mode selection. However, even for interlaced video, when the pattern itself is flat it is objectively clear that the code amount would be smaller if field DCT coding were performed in the intra mode. Consequently, the optimum code amount may not be obtained. Although an interlaced image with a large luminance change has been described as an example, a similar problem may occur in an interlaced image in which an object moves in the horizontal direction.
[0012]
Third, the technique described in Patent Document 1 excludes the DC component of the error and evaluates candidates using only the AC component of the error. It can therefore search for a motion vector appropriately even in a moving image whose luminance changes drastically over time. However, for a moving image with a small luminance change and little variation in pattern, the motion vector search may go wrong. For example, when macroblocks with a flat pattern, such as a picture of the sky, surround the macroblock to be encoded, the AC component of the error is small for every surrounding macroblock. Yet even though a picture of the sky looks flat at first glance, it often contains subtle changes over a wide range. Such subtle, wide-range changes appear in the DC component of the error. The motion vector search of Patent Document 1 ignores this change when selecting a reference macroblock. As a result, a macroblock with a large DC component of error may be selected, causing image quality degradation.
[0013]
A first object of the present invention is to provide an encoding device capable of appropriately selecting a motion compensation mode when encoding a moving image having a large luminance change over time.
A second object of the present invention is to provide an encoding device capable of appropriately selecting whether to encode in a motion compensation (inter) mode or in the intra mode when encoding a moving image having a large luminance change over time.
[0014]
A third object of the present invention is to provide an encoding device capable of appropriately searching for a motion vector regardless of whether the temporal luminance change is large or small and whether the pattern is flat or complicated.
[0015]
[Means for Solving the Problems]
In order to achieve the first object, the encoding apparatus according to the present invention is an encoding apparatus that selects a compensation method for motion compensation from among a plurality of methods and encodes a macroblock using the selected compensation method. It comprises: first calculation means for calculating an AC component of error and a DC component of error for the macroblock predicted by each motion compensation method; second calculation means for calculating an evaluation value for each compensation method using the calculated DC component and AC component of the error; and selection means for selecting a compensation method based on the calculated evaluation values. The second calculation means calculates the evaluation value after attenuating the DC component of the error for the macroblock based on a predetermined coefficient.
[0016]
In order to achieve the second object, the encoding apparatus according to the present invention comprises: determining means for determining, from among a plurality of methods, a transform method for performing a discrete cosine transform on a macroblock; third calculation means for calculating a variance value for the macroblock to be encoded by performing a different calculation according to the transform method determined by the determining means; and comparing means for comparing the mean square error between the macroblock predicted by the compensation method selected by the selection means and the macroblock to be encoded with the variance value calculated by the third calculation means.
The macroblock to be encoded is encoded with the compensation method selected by the selection means when the mean square error with respect to the macroblock to be encoded is smaller than the variance value, or when the mean square error is smaller than a predetermined threshold value.
[0017]
In order to achieve the third object, the encoding apparatus according to the present invention is an encoding apparatus that selects a reference macroblock for performing motion compensation on a macroblock from among a plurality of macroblocks belonging to a preceding or following frame, and calculates a motion vector for the selected reference macroblock. It comprises: first calculation means for calculating an AC component of error and a DC component of error for each macroblock that is a candidate for the reference macroblock; second calculation means for calculating an evaluation value for each candidate macroblock using the calculated AC component and DC component of the error; and selection means for selecting, based on the calculated evaluation values, the reference macroblock to be used for motion compensation. The second calculation means calculates the evaluation value after attenuating the DC component of the error for each macroblock based on a predetermined coefficient.
[0018]
[Detailed Description of the Invention]
An embodiment of an encoding apparatus according to the present invention will be described. The encoding apparatus according to the present invention is industrially produced based on the hardware configuration shown in FIG. 1. As shown in FIG. 1, the encoding device includes an A / D converter 1, a format conversion unit 2, a screen rearrangement unit 3, a frame memory 4, a subtracter 5, a motion compensation prediction unit 6, a DCT unit 7, a quantization unit 8, a variable length encoding unit 9, a buffer 10, a rate control unit 11, an inverse quantization unit 12, an inverse DCT unit 13, an adder 14, and a D / A converter 15.
[0019]
The A / D converter 1 is composed of a circuit that performs A / D conversion. It separates a video frame in analog signal format into a luminance signal (Y) and color difference signals (Cb, Cr), and converts each signal into a digital format. A video frame in a digital data format is obtained by this conversion. The video frames obtained in this way are sequentially output to the format conversion unit 2.
[0020]
The format conversion unit 2 converts the video frame obtained by the A / D converter 1 into a spatial resolution format, and outputs the converted video frame to the screen rearrangement unit 3.
The screen rearrangement unit 3 rearranges the video frames output from the format conversion unit 2. That is, in the analog signal format, the video frames are in an order called a display order, and a video frame sequence arranged in the encoding order is obtained by rearranging the video frames. The screen rearrangement unit 3 outputs the individual video frames in the video frame sequence rearranged in the encoding order to the subtracter 5 and the motion compensation prediction unit 6 as frames to be encoded.
[0021]
The frame memory 4 stores a frame that can be a reference frame of a frame to be encoded when performing motion compensation. More specifically, a video frame located in front of a frame to be encoded and a video frame located behind are stored in the frame memory 4.
The subtracter 5 calculates a residual between the video frame to be encoded and the reference frame stored in the frame memory 4 and outputs the residual to the motion compensation prediction unit 6 and the DCT unit 7.
[0022]
The motion compensation prediction unit 6 selects a motion compensation mode based on the residual output from the subtracter 5 and the macroblock to be encoded (hereinafter referred to as the encoding MB), performs motion compensation, and outputs the motion vector and the prediction mode to the variable length coding unit 9. The prediction mode is information that tells the variable length coding unit 9 whether to encode in the intra mode or, if not, in which of the plurality of inter modes.
[0023]
The DCT unit 7 performs DCT on the residual output from the subtracter 5 or on the encoding MB, generating a matrix of DCT coefficients, and outputs the resulting DCT coefficients to the quantization unit 8.
The quantization unit 8 multiplies the DCT coefficient by 16, divides by the value of (quantization coefficient × 2 × quantization scale), and further performs quantization by rounding off after the decimal point.
[0024]
The variable length encoding unit 9 performs encoding so that a shorter code is assigned to data having a higher appearance frequency for each DCT coefficient, motion vector, and prediction mode.
The buffer 10 is a FIFO memory, and sequentially stores the data input from the variable length encoding unit 9 in the input order.
[0025]
The rate control unit 11 monitors the amount of data in the buffer 10 so that the buffer 10 neither underflows nor overflows. This monitoring is performed by referring to the data amount of the buffer 10 and feeding back information indicating the data amount to the quantization unit 8. When the quantization unit 8 adjusts its output rate to the variable length coding unit 9 based on this information, the output from the buffer 10 can be maintained at a constant rate.
[0026]
The inverse quantization unit 12 performs inverse quantization by multiplying the DCT coefficient quantized by the quantization unit 8 by the value of (quantization coefficient × 2 × quantization scale) and then dividing by 16, and outputs the result to the inverse DCT unit 13.
The inverse DCT unit 13 receives the value of the DCT coefficient from the inverse quantization unit 12, performs inverse DCT on this value, obtains a residual before encoding, and outputs the residual to the adder 14.
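The quantization and inverse quantization rules described above can be sketched as follows. This is a minimal illustration, not the apparatus itself: the function names are hypothetical, and the rounding rule (round to nearest, halves away from zero) is an assumption, since the text only says the fractional part is rounded off.

```python
import math

def round_half_away(x: float) -> int:
    """Round to the nearest integer, halves away from zero (assumed rounding rule)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

def quantize(dct_coef: float, quant_coef: int, quant_scale: int) -> int:
    """Multiply by 16, divide by (quantization coefficient x 2 x quantization scale), round."""
    return round_half_away(dct_coef * 16 / (quant_coef * 2 * quant_scale))

def dequantize(level: int, quant_coef: int, quant_scale: int) -> float:
    """Multiply by (quantization coefficient x 2 x quantization scale), divide by 16."""
    return level * (quant_coef * 2 * quant_scale) / 16

# Illustrative values only (not taken from the source).
level = quantize(103, quant_coef=16, quant_scale=2)
print(level, dequantize(level, quant_coef=16, quant_scale=2))
```

With these illustrative inputs the coefficient 103 is reconstructed as 104, which shows the rounding loss that quantization introduces.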
[0027]
The adder 14 is an addition circuit, adds the residual output from the inverse DCT unit 13 to the reference frame stored in the frame memory 4, and outputs the addition result to the frame memory 4.
The above is the overall configuration of the encoding apparatus.
Next, the motion compensation prediction unit 6, which is the core of the encoding device, will be described. The motion compensation prediction unit 6 is mounted on the encoding device as a typical computer system including a CPU, a ROM storing a program, and a RAM. The CPU reads the program stored in the ROM, and the motion compensation prediction unit 6 fulfills its function through the cooperation of the program and the hardware resources. The box drawn for the motion compensation prediction unit 6 shows the specific means realized by this cooperation between the program stored in the ROM and the hardware resources.
[0028]
As shown in this box, the motion compensation prediction unit 6 includes a motion vector search unit 16 that searches for a motion vector, a DCT type determination unit 17 that determines a DCT type, an inter selection unit 18 that selects the best of the plurality of inter-type motion compensation modes (inter modes), and an inter / intra selection unit 19 that compares the best inter mode with the intra mode and selects one of them.
[0029]
In this embodiment, the DCT type determination unit 17 and the inter selection unit 18 will be described in detail, and the motion vector search unit 16 and the inter / intra selection unit 19 will be described in the second embodiment and the third embodiment.
The DCT type determination unit 17 determines the DCT type based on the motion vector and the macroblock in the frame memory 4. In the residual macroblock, the sum of squared luminance differences between adjacent lines is compared with the sum of squared luminance differences between every other line. If the former is smaller, frame DCT is selected; if the latter is smaller, field DCT is selected.
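The DCT type decision described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, "adjacent lines" is taken to mean lines i and i+1, "every other line" is taken to mean lines i and i+2, and ties are resolved in favor of frame DCT because the text does not say how ties are handled.

```python
from typing import List

Block = List[List[int]]

def line_diff_energy(block: Block, step: int) -> int:
    """Sum of squared differences between lines that are `step` lines apart."""
    total = 0
    for i in range(len(block) - step):
        for a, b in zip(block[i], block[i + step]):
            total += (a - b) ** 2
    return total

def decide_dct_type(residual: Block) -> str:
    """Return "frame" or "field" for a residual macroblock."""
    adjacent = line_diff_energy(residual, 1)      # lines i and i+1
    every_other = line_diff_energy(residual, 2)   # lines i and i+2 (same field)
    # Assumption: a tie falls back to frame DCT.
    return "field" if every_other < adjacent else "frame"

# Toy residuals (not from the source).
interlaced = [[5] * 16 if i % 2 == 0 else [21] * 16 for i in range(16)]
smooth = [[i] * 16 for i in range(16)]
print(decide_dct_type(interlaced), decide_dct_type(smooth))
```

A residual whose lines alternate between two levels, as in the interlaced examples of this document, has no difference within each field and therefore falls on the field side of the comparison.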
[0030]
The inter selection unit 18 calculates an MSE for the macroblock (predicted MB) predicted in each mode, and selects a motion compensation mode using the MSE of the predicted MB as an evaluation value. Its mode selection differs from that of Test Model 5 in two respects: (1) the procedure for calculating the DC component and the AC component of the error is changed according to the DCT type, and (2) a value obtained by modifying the MSE is used as the evaluation value.
[0031]
How the calculation of the DC component of error and the AC component of error changes will be described for the frame case and the field case. First, how the mean square error of the DC component and the mean square error of the AC component are calculated will be described. Let Xij be the luminance at coordinate (i, j) in the encoding MB, and let Yij be the luminance at coordinate (i, j) in the reference macroblock. The MSE between these two macroblocks is calculated by the following Equation 1.
[0032]
[Expression 1]
$$\mathrm{MSE} = \frac{1}{256}\sum_{i=0}^{15}\sum_{j=0}^{15}\left(X_{ij} - Y_{ij}\right)^{2}$$
The direct current component (DCE) of the error is calculated as the square of the average value m of the error as shown in Equation 2 below.
[0033]
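[Expression 2]
$$m = \frac{1}{256}\sum_{i=0}^{15}\sum_{j=0}^{15}\left(X_{ij} - Y_{ij}\right), \qquad \mathrm{DCE} = m^{2}$$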
On the other hand, the AC component (ACE) of the error is calculated as a variance value based on the above average value m. Equation 3 below shows the formula for calculating the variance value.
[0034]
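[Equation 3]
$$\mathrm{ACE} = \frac{1}{256}\sum_{i=0}^{15}\sum_{j=0}^{15}\left(\left(X_{ij} - Y_{ij}\right) - m\right)^{2}$$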
Here, the formula for calculating the variance can be expanded to
Variance = (mean of squares) − (square of mean)
In the expanded expression, replacing the (mean of squares) term with MSE and the (square of mean) term with DCE gives the relationship ACE = MSE − DCE.
[0035]
The relationship between MSE, DCE, and ACE is as follows.
MSE = DCE + ACE
Refer to the following description for the expansion of Equation (3).
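The relationship MSE = DCE + ACE can be checked numerically with a short sketch. The function name and the toy macroblocks are hypothetical and are not taken from the source; the definitions follow the text (DCE is the square of the average error, ACE is the variance of the error around that average).

```python
from typing import List, Tuple

Block = List[List[int]]

def error_components(encoding_mb: Block, predicted_mb: Block) -> Tuple[float, float, float]:
    """Return (MSE, DCE, ACE) of the pixel-wise error X - Y."""
    errors = [x - y
              for row_x, row_y in zip(encoding_mb, predicted_mb)
              for x, y in zip(row_x, row_y)]
    n = len(errors)
    m = sum(errors) / n                          # average error
    mse = sum(e * e for e in errors) / n         # mean of squares
    dce = m * m                                  # square of mean
    ace = sum((e - m) ** 2 for e in errors) / n  # variance around m
    return mse, dce, ace

# Toy data: the error alternates between 5 and 21 from line to line.
encoding_mb = [[5] * 16 if i % 2 == 0 else [21] * 16 for i in range(16)]
predicted_mb = [[0] * 16 for _ in range(16)]
mse, dce, ace = error_components(encoding_mb, predicted_mb)
print(mse, dce, ace)  # 233.0 169.0 64.0
```

For this toy block the average error is 13, so DCE is 169, ACE is 64, and their sum equals the MSE of 233.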
[0036]
The formula for calculating the variance value in Equation 3 is expressed using the average value m of n numerical values x1, x2, x3 ... xn. Here, the average value m of the n numerical values x1, x2, x3 ... xn is expressed by the following Equation 4.
[0037]
[Expression 4]
$$m = \frac{1}{n}\sum_{k=1}^{n} x_k$$
If the formula of Formula 3 is expressed using the formula of Formula 4, Formula 5 is obtained.
[0038]
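[Equation 5]
$$\text{Variance} = \frac{1}{n}\sum_{k=1}^{n}\left(x_k - m\right)^{2}, \qquad m = \frac{1}{n}\sum_{k=1}^{n} x_k$$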
Equation 6 shows the process of expanding the formula for calculating the variance value.
[0039]
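[Formula 6]
$$\frac{1}{n}\sum_{k=1}^{n}\left(x_k - m\right)^{2} = \frac{1}{n}\sum_{k=1}^{n} x_k^{2} - 2m\cdot\frac{1}{n}\sum_{k=1}^{n} x_k + m^{2} = \frac{1}{n}\sum_{k=1}^{n} x_k^{2} - m^{2}$$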
This expansion shows that the equation variance = (mean of squares) − (square of mean) holds.
The above is the calculation formula of DCE and ACE at the time of frame prediction. Equation 7 below shows DCE and ACE calculation formulas for one field at the time of field prediction.
[0040]
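[Expression 7]
$$m_F = \frac{1}{128}\sum_{i \in F}\sum_{j=0}^{15}\left(X_{ij} - Y_{ij}\right), \qquad \mathrm{DCE}_F = m_F^{2}, \qquad \mathrm{ACE}_F = \frac{1}{128}\sum_{i \in F}\sum_{j=0}^{15}\left(\left(X_{ij} - Y_{ij}\right) - m_F\right)^{2}$$
Here F denotes the set of the 8 lines of the macroblock that belong to the field.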
The above is the calculation method of DCE and ACE. The first feature of the inter selector 18 is that the calculation method as described above is changed according to the DCT type.
When the DCT type is frame, DCE and ACE are obtained by applying the above calculation to the 16 × 16 frame macroblock of the residual with respect to the encoding MB.
[0041]
When the DCT type is a field, the DCE is obtained for each of the two fields (8 × 16), and the average of the obtained DCE is used as the DCE of the macroblock. Similarly, for ACEs, ACEs are obtained for each of the two fields, and the average of these values is used as the ACE of the macroblock.
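The switch between the frame calculation and the field calculation can be sketched as follows. The function names are hypothetical, and the sketch assumes that the two fields are the even-numbered and odd-numbered lines of the residual macroblock.

```python
from typing import List, Tuple

Block = List[List[int]]

def dce_ace(errors: List[int]) -> Tuple[float, float]:
    """DCE (square of the mean) and ACE (variance) of a flat list of errors."""
    n = len(errors)
    m = sum(errors) / n
    return m * m, sum((e - m) ** 2 for e in errors) / n

def dce_ace_by_dct_type(residual: Block, dct_type: str) -> Tuple[float, float]:
    """Frame: one calculation over all lines. Field: average over the two fields."""
    if dct_type == "frame":
        return dce_ace([e for row in residual for e in row])
    if dct_type == "field":
        top = [e for row in residual[0::2] for e in row]     # even lines (assumed top field)
        bottom = [e for row in residual[1::2] for e in row]  # odd lines (assumed bottom field)
        dce_t, ace_t = dce_ace(top)
        dce_b, ace_b = dce_ace(bottom)
        return (dce_t + dce_b) / 2, (ace_t + ace_b) / 2
    raise ValueError("dct_type must be 'frame' or 'field'")

# Toy residual: lines alternate between 5 and 21 (not from the source).
residual = [[5] * 16 if i % 2 == 0 else [21] * 16 for i in range(16)]
print(dce_ace_by_dct_type(residual, "frame"))  # (169.0, 64.0)
print(dce_ace_by_dct_type(residual, "field"))  # (233.0, 0.0)
```

In the toy residual, the line-to-line alternation counts as AC energy in the frame calculation but moves entirely into the DC component in the field calculation, where it can then be attenuated.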
The second feature of the inter selection unit 18 is that a corrected MSE is used as the evaluation value for mode selection. The corrected MSE is the MSE with its DCE attenuated, and is denoted modMSE. Expressed as a formula, modMSE is as follows.
modMSE = α × DCE + ACE
Here α is the attenuation factor, with 0 < α < 1.
The value of α is preferably set to 1/64 as a default value. This is because various simulations show that the compression ratio and S / N ratio are optimal when α is 1/64. It is also desirable that the value of α can be changed in response to a user instruction received through the user interface. In that case, the user can check the encoded image quality on a monitor and adjust α until a satisfactory image quality is obtained.
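The modMSE formula with an adjustable attenuation factor can be sketched as follows. The function name and the range check are illustrative assumptions; only the formula and the default of 1/64 come from the text.

```python
DEFAULT_ALPHA = 1 / 64  # default attenuation factor given in the text

def mod_mse(dce: float, ace: float, alpha: float = DEFAULT_ALPHA) -> float:
    """modMSE = alpha x DCE + ACE, with 0 < alpha < 1."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    return alpha * dce + ace

print(mod_mse(239, 5))              # 8.734375
print(mod_mse(239, 5, alpha=1 / 4))  # 64.75, a weaker attenuation chosen by the user
```

Passing a different `alpha` corresponds to the user adjusting the factor while checking the image quality.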
[0042]
Hereinafter, the processing procedures of the motion compensation prediction unit 6 and the inter selection unit 18 will be described with reference to flowcharts. FIG. 2 is a flowchart showing the overall processing flow of the motion compensation prediction unit 6. The motion vector search unit 16 searches for a motion vector (step S101), the DCT type determination unit 17 determines a DCT type (step S102), a motion compensation mode is selected (step S103), and then the DCT unit 7, the quantization unit 8, and the variable length coding unit 9 perform DCT, quantization, and variable length coding (step S104). The motion compensation mode is selected by having the inter selection unit 18 select the best inter mode (step S105) and then having the inter / intra selection unit 19 select either the best inter mode or the intra mode (step S106).
[0043]
In order to configure the inter selection unit 18, a program is created by describing the processing procedure shown in the flowchart of FIG. 3 in a programming language, and the program is executed by a computer. Hereinafter, the processing procedure of the inter selection unit 18 will be described with reference to FIG. 3. For the sake of simplicity, "macroblock" is abbreviated as MB in this flowchart.
[0044]
Steps S1 to S2 form a loop that repeats the processing of steps S3 to S8 for each inter mode. For a P picture, forward frame prediction (1), forward field prediction (2), and NoMC (3) are each subjected to steps S3 to S8. For a B picture, forward frame prediction (1), forward field prediction (2), backward frame prediction (3), backward field prediction (4), bidirectional frame prediction (5), and bidirectional field prediction (6) are each subjected to steps S3 to S8.
[0045]
In this loop processing, the target motion compensation mode is a motion compensation mode p, and a macroblock predicted in the motion compensation mode p is a macroblock p. In step S3, the processing procedure is switched according to the DCT type determined by the DCT type determination unit.
If the DCT type is a frame, DCE and ACE are calculated based on the residual for a 16 × 16 frame in the predicted MBp (step S4).
[0046]
If the DCT type is field, DCE and ACE are calculated for each of the two 16 × 8 fields of the predicted MBp, based on the residual between the macroblock p and the encoding MB (step S5). Then the average of the DCEs of the two fields is taken as DCE (step S6), and the average of the ACEs of the two fields is taken as ACE (step S7).
[0047]
Once DCE and ACE have been calculated, either in step S4 or in steps S5 to S7, DCE is multiplied by the coefficient α and ACE is added to obtain modMSE(p) (step S8). This modMSE(p) is the evaluation value for mode p. Repeating steps S3 to S8 for each inter mode yields an evaluation value for every inter mode. When modMSE has been calculated for every inter mode in this way, the inter mode whose predicted MB has the smallest modMSE is selected (step S9).
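The loop of steps S1 to S9 can be sketched as follows. This is a minimal sketch, not the program of FIG. 3: the function names, the dictionary layout, and the toy residuals are hypothetical, the fields are assumed to be the even and odd lines, and a tie is resolved in favor of the mode listed first.

```python
from typing import Dict, List, Tuple

Block = List[List[int]]
ALPHA = 1 / 64

def _dce_ace(errors: List[int]) -> Tuple[float, float]:
    n = len(errors)
    m = sum(errors) / n
    return m * m, sum((e - m) ** 2 for e in errors) / n

def mod_mse(residual: Block, dct_type: str, alpha: float = ALPHA) -> float:
    if dct_type == "frame":                                   # step S4
        dce, ace = _dce_ace([e for row in residual for e in row])
    else:                                                     # steps S5 to S7
        parts = [_dce_ace([e for row in residual[f::2] for e in row]) for f in (0, 1)]
        dce = (parts[0][0] + parts[1][0]) / 2
        ace = (parts[0][1] + parts[1][1]) / 2
    return alpha * dce + ace                                  # step S8

def select_best_inter(candidates: Dict[str, Tuple[Block, str]],
                      alpha: float = ALPHA) -> Tuple[str, Dict[str, float]]:
    """candidates maps a mode name to (residual, DCT type); returns (best mode, scores)."""
    scores = {mode: mod_mse(res, dct, alpha) for mode, (res, dct) in candidates.items()}
    return min(scores, key=scores.get), scores                # step S9

# Toy residuals: one with a pure brightness offset, one with a zero-mean texture.
offset = [[10] * 16 for _ in range(16)]
texture = [[3 if (i + j) % 2 == 0 else -3 for j in range(16)] for i in range(16)]
modes = {"mode_a": (texture, "frame"), "mode_b": (offset, "frame")}
print(select_best_inter(modes))             # mode_b wins with the attenuated DCE
print(select_best_inter(modes, alpha=1.0))  # plain MSE would pick mode_a instead
```

The second call sets the factor to 1 only to emulate an unmodified MSE for comparison; with the attenuated DCE, the prediction that differs only in brightness is preferred over the one with a different pattern.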
An example of the operation of the inter selection unit 18 will be described along with calculation examples of DCE, ACE, and modMSE.
[0048]
It is assumed that an image that is a target of this operation example is an interlaced image with a large luminance change, and a picture type is a P picture. Among the P pictures, a macroblock (encoding MB) to be encoded has the following 16 × 16 luminance.
[0049]
Encoding MB
48 59 57 50 52 56 54 51 56 60 56 52 57 60 55 56
72 66 67 74 75 71 70 75 73 69 71 74 73 71 76 77
55 50 55 58 56 59 62 60 58 59 61 60 59 62 62 60
70 74 72 73 76 74 70 72 74 73 73 75 75 73 73 77
53 55 54 55 54 51 55 57 57 59 60 59 58 58 55 56
73 72 73 75 74 72 73 74 72 74 75 76 76 74 76 78
56 54 57 57 54 51 54 58 58 59 59 57 58 57 55 59
65 69 73 75 74 73 75 76 75 75 76 75 76 76 77 78
50 56 57 58 56 52 55 58 58 58 60 58 58 58 57 59
71 68 74 75 72 71 69 71 74 75 74 75 77 75 74 76
56 54 56 56 55 54 55 58 57 57 59 57 57 57 57 60
67 71 70 70 71 71 71 74 74 73 73 73 74 74 77 76
50 56 55 56 57 56 58 58 57 57 58 58 58 59 60 61
70 67 71 71 73 73 72 74 73 72 73 73 74 73 72 74
54 51 55 56 56 57 57 56 55 57 59 59 59 57 59 61
69 71 70 69 72 74 74 73 73 74 74 73 74 76 76 76
[0050]
The inter selection unit 18 selects which motion compensation mode is the best among the forward frame prediction (1), the forward field prediction (2), and NoMC (3). Therefore, an evaluation value is calculated for each of the inter type motion compensation modes. When the forward frame prediction (1) is the inter mode p, the following macroblock having 16 × 16 luminance is assumed to be predicted.
[0051]
Macroblock predicted in forward frame mode
49 47 47 48 49 49 50 50 51 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
49 47 47 48 49 50 50 50 51 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
49 47 48 48 49 50 51 51 52 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
49 48 48 49 50 51 51 51 52 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
50 49 50 50 51 51 52 52 52 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
50 50 50 51 51 52 52 53 53 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
50 50 51 51 52 52 53 53 53 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
50 50 51 51 52 53 53 53 53 52 52 52 52 52 52 52
51 51 51 51 51 51 51 51 52 53 53 53 53 53 53 53
[0052]
The residual for each pixel between the predicted MB and the encoded MB is as follows.
Residual with macroblock predicted by forward frame mode
-1 12 10 2 3 7 4 1 5 8 4 0 5 8 3 4
21 15 16 23 24 20 19 24 21 16 18 21 20 18 23 24
6 3 8 10 7 9 12 10 7 7 9 8 7 10 10 8
19 23 21 22 25 23 19 21 22 20 20 22 22 20 20 24
4 8 6 7 5 1 4 6 5 7 8 7 6 6 3 4
22 21 22 24 23 21 22 23 20 21 22 23 23 21 23 25
7 6 9 8 4 0 3 7 6 7 7 5 6 5 3 7
14 18 22 24 23 22 24 25 23 22 23 22 23 23 24 25
0 7 7 8 5 1 3 6 6 6 8 6 6 6 5 7
20 17 23 24 21 20 18 20 22 22 21 22 24 22 21 23
6 4 6 5 4 2 3 5 4 5 7 5 5 5 5 8
16 20 19 19 20 20 20 23 22 20 20 20 21 21 24 23
0 6 4 5 5 4 5 5 4 5 6 6 6 7 8 9
19 16 20 20 22 22 21 23 21 19 20 20 21 20 19 21
4 1 4 5 4 4 4 3 2 5 7 7 7 5 7 9
18 20 19 18 21 23 23 22 21 21 21 20 21 23 23 23
[0053]
If DCE is calculated from this residual in step S4, DCE = 239, and if ACE is calculated, ACE = 5. When modMSE is calculated with the coefficient α set to 1/64 in step S8, it becomes 9.
[0054]
MSE_frame 9 (ACE 5, DCE 239)
When the forward field prediction (2) is the inter mode p, it is assumed that the following macroblock having 16 × 16 luminance is predicted.
Macroblock predicted in forward field mode
47 48 48 49 49 50 50 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 50 50 51 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 50 50 51 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 50 50 51 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 50 50 51 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 49 50 51 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 49 50 51 51 51 51 51 51 51 51 51 51
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
47 48 48 49 49 50 50 51 52 52 52 52 52 52 52 52
52 52 52 52 52 52 52 52 52 52 52 52 52 52 52 53
[0055]
The residual for each pixel between the predicted MB and the encoded MB is as follows.
Residual with macroblock predicted by forward field mode
1 11 9 1 3 6 4 0 5 9 5 1 6 9 4 5
20 14 15 22 23 19 18 23 21 17 19 22 21 19 24 24
8 2 7 9 6 9 11 9 7 8 10 9 8 11 11 9
18 22 20 21 24 22 18 20 22 21 21 23 23 21 21 24
6 7 6 6 4 1 4 6 6 8 9 8 7 7 4 5
21 20 21 23 22 20 21 22 20 22 23 24 24 22 24 25
9 6 9 8 4 1 3 7 7 8 8 6 7 6 4 8
13 17 21 23 22 21 23 24 23 23 24 23 24 24 25 25
3 8 9 9 6 2 4 7 7 7 9 7 7 7 6 8
19 16 22 23 20 19 17 19 22 23 22 23 25 23 22 23
9 6 8 7 6 4 4 7 6 6 8 6 6 6 6 9
15 19 18 18 19 19 19 22 22 21 21 21 22 22 25 23
3 8 7 7 8 6 7 7 6 6 7 7 7 8 9 10
18 15 19 19 21 21 20 22 21 20 21 21 22 21 20 21
7 3 7 7 7 7 7 5 3 5 7 7 7 5 7 9
17 19 18 17 20 22 22 21 21 22 22 21 22 24 24 23
In step S5 to step S7, when DCE is calculated from the residual, DCE = 242, and when ACE is calculated, ACE = 5. When modMSE is calculated with the coefficient α set to 1/64 in step S8, it becomes 9.
[0056]
MSE_field 9 (ACE 5, DCE 242)
When NoMC (3) is inter mode p, the following macroblock having 16 × 16 luminance is assumed to be predicted.
Macroblock predicted in noMC mode
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 47 47 48 48 49 49 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 47 47 48 48 49 49 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 46 47 47 49 49 50 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 47 47 48 48 49 49 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 46 47 47 49 49 50 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 47 47 48 48 49 49 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 46 47 48 48 49 50 50
44 44 44 44 44 44 44 44 46 46 46 46 46 46 46 46
44 44 44 44 44 44 44 44 46 47 47 48 48 49 49 50
[0057]
The residual for each pixel between the predicted MB and the encoded MB is as follows.
Residual with macroblock predicted by noMC mode
4 15 13 6 8 12 10 7 10 14 10 6 11 14 9 10
28 22 23 30 31 27 26 31 27 22 24 26 25 22 27 27
11 6 11 14 12 15 18 16 12 13 15 14 13 16 16 14
26 30 28 29 32 30 26 28 28 26 26 27 27 24 24 27
9 11 10 11 10 7 11 13 11 13 14 13 12 12 9 10
29 28 29 31 30 28 29 30 26 28 28 29 27 25 26 28
12 10 13 13 10 7 10 14 12 13 13 11 12 11 9 13
21 25 29 31 30 29 31 32 29 28 29 27 28 27 28 28
6 12 13 14 12 8 11 14 12 12 14 12 12 12 11 13
27 24 30 31 28 27 25 27 28 29 27 28 28 26 24 26
12 10 12 12 11 10 11 14 11 11 13 11 11 11 11 14
23 27 26 26 27 27 27 30 28 26 26 25 26 25 28 26
6 12 11 12 13 12 14 14 11 11 12 12 12 13 14 15
26 23 27 27 29 29 28 30 27 26 26 25 26 24 22 24
10 7 11 12 12 13 13 12 9 11 13 13 13 11 13 15
25 27 26 25 28 30 30 29 27 27 27 25 26 27 27 26
[0058]
If DCE is calculated from this residual in step S4, DCE = 435, and if ACE is calculated, ACE = 5. If modMSE is calculated with the coefficient α set to 1/64 in step S8, it becomes 12.
MSE_noMC 12 (ACE 5, DCE 435)
[0059]
In the above process, the following three modMSE values are calculated. The inter mode with the smallest modMSE, selected in step S9, is the best inter mode. In this calculation example, modMSE is the same for the forward frame mode and the forward field mode.
MSE_frame 9 (ACE 5, DCE 239)
MSE_field 9 (ACE 5, DCE 242)
MSE_noMC 12 (ACE 5, DCE 435)
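The three evaluation values listed above can be reproduced from the reported ACE and DCE figures with α = 1/64. The sketch below only redoes that arithmetic; rounding to the nearest integer is an assumption made to match the reported values, and the mode names are illustrative labels.

```python
ALPHA = 1 / 64

# (ACE, DCE) pairs as reported in the calculation example.
reported = {
    "forward_frame": (5, 239),
    "forward_field": (5, 242),
    "noMC": (5, 435),
}

def summarize(values):
    """Return {mode: (modMSE before rounding, rounded modMSE, unmodified MSE)}."""
    result = {}
    for mode, (ace, dce) in values.items():
        mod = ALPHA * dce + ace
        result[mode] = (mod, round(mod), dce + ace)
    return result

for mode, (mod, rounded, mse) in summarize(reported).items():
    print(mode, mod, rounded, mse)
```

Before rounding, the forward frame and forward field values differ slightly (about 8.73 and 8.78), so their reported equality appears to rely on integer evaluation values; that reading is an inference from the numbers, not something the text states. The unmodified MSE values (244, 247, 440) show how strongly the DC component dominates when it is not attenuated.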
[0060]
As described above, according to the present embodiment, the evaluation value is calculated after the DCE is attenuated. The motion compensation mode is therefore selected appropriately even for a moving image in which lights flash intensely in a dark concert venue, and the image quality can be improved.
(Second embodiment)
The second embodiment describes the improvement in the inter / intra selection unit 19 in more detail. The inter / intra selection unit 19 compares the inter mode judged best by the inter selection unit 18 with the intra mode and determines which of the two to use. This comparison is made by calculating a variance value for the intra mode, comparing this variance value with the MSE of the best mode, and determining whether the MSE of the best mode exceeds a predetermined threshold.
[0061]
The feature of the inter / intra selection unit 19 in this embodiment is that the method of calculating the variance value is changed according to the DCT type. That is, if the DCT type is frame, a variance value is calculated for the 16 × 16 frame of the encoding MB. If the DCT type is field, a variance value is calculated for each of the two 16 × 8 fields of the encoding MB, and the average of the two variance values is taken as the variance value.
[0062]
In order to configure the inter / intra selection unit 19 described above, a program may be created by describing the processing procedure shown in the flowchart of FIG. 4 in a programming language and executed by a computer. Hereinafter, the processing procedure of the inter / intra selection unit 19 will be described with reference to the flowchart of FIG. 4.
Step S21 realizes switching of the processing procedure according to the DCT type determined by the DCT type determination unit. If the DCT type is a frame, a variance value is calculated for a 16 × 16 frame in the encoded MB (step S22). If the DCT type is a field, variance values are calculated for two fields of 16 × 8 and 16 × 8 in the encoded MB (step S23). Then, the average of the variance values in the two fields is set as the variance value (step S24).
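Steps S21 to S24 can be sketched as follows. The function names are hypothetical, and the two fields are assumed to be the even-numbered and odd-numbered lines of the macroblock to be encoded.

```python
from typing import List

Block = List[List[int]]

def variance(values: List[int]) -> float:
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def mb_variance(mb: Block, dct_type: str) -> float:
    """Variance of the macroblock to be encoded, switched by DCT type (step S21)."""
    if dct_type == "frame":                                # step S22
        return variance([p for row in mb for p in row])
    top = [p for row in mb[0::2] for p in row]             # step S23, assumed top field
    bottom = [p for row in mb[1::2] for p in row]          # step S23, assumed bottom field
    return (variance(top) + variance(bottom)) / 2          # step S24

# Toy macroblock: flat pattern whose two fields differ in brightness (not from the source).
mb = [[50] * 16 if i % 2 == 0 else [70] * 16 for i in range(16)]
print(mb_variance(mb, "frame"), mb_variance(mb, "field"))  # 100.0 0.0
```

For a flat pattern whose fields differ only in brightness, the per-field calculation gives a much smaller variance than the frame calculation, which is the effect this embodiment relies on.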
[0063]
Step S25 determines whether both of the following conditions are satisfied: the calculated variance value VAR is smaller than the MSE of the best mode, and the MSE of the best mode is larger than 64. In this determination, it is the MSE, not modMSE, that is compared with the variance value. In other words, the MSE with the DCE not attenuated is the comparison target. If the conditions are satisfied, the intra mode is selected (step S26). If they are not satisfied, the inter mode of mode p is selected (step S27).
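The determination of steps S25 to S27 can be sketched as a single function. The function name and the string return values are hypothetical; the condition and the threshold of 64 follow the text.

```python
def select_inter_or_intra(var: float, best_inter_mse: float, threshold: float = 64) -> str:
    """Intra is chosen only when VAR < MSE and MSE > threshold (step S25)."""
    if var < best_inter_mse and best_inter_mse > threshold:
        return "intra"   # step S26
    return "inter"       # step S27

print(select_inter_or_intra(var=7, best_inter_mse=244))   # intra
print(select_inter_or_intra(var=300, best_inter_mse=244)) # inter
print(select_inter_or_intra(var=7, best_inter_mse=50))    # inter
```

Note that `best_inter_mse` is the unmodified MSE (DCE + ACE) of the best inter mode, not its modMSE.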
[0064]
Hereinafter, an operation example of the inter / intra selection unit 19 according to the second embodiment will be described. This operation example describes how the inter / intra selection unit 19 performs selection in the calculation example shown in the first embodiment.
In the selection of the inter mode in the first embodiment, modMSE is the same for the forward frame mode and the forward field mode. Here, forward frame prediction is selected on account of the order in which the modes are evaluated.
[0065]
It is determined based on the above-described procedure whether the forward frame mode selected in this way or the intra mode is better.
In the first embodiment, the DCT type of the forward frame mode was determined to be field. Since the intra DCT type is also field, the inter / intra selection unit 19 calculates the variance value VAR for each field in steps S23 and S24. The variance value VAR calculated in this way is assumed to be 7.
[0066]
The variance value VAR calculated in this way is compared with the MSE of the forward frame mode (step S25). The MSE of the forward frame mode is 244 (= 5 + 239), so the relationship VAR < MSE is satisfied. Since the MSE of 244 also satisfies the relationship MSE > 64, the intra mode is selected.
As described above, according to the present embodiment, the method of calculating the variance value is changed according to the DCT type, so that the variance value of the macroblock is calculated to be small for an image with a large luminance change. When this value is compared with the MSE, the relationship variance value < MSE is easily satisfied, and the intra mode is selected more often for images with large luminance changes than with the Test Model 5 method, so the shortcoming of that method can be improved upon.
[0067]
Although the threshold compared with the MSE is 64 here, it may instead be 4. With a threshold of 4, block noise, in which rectangular noise appears in a flat image, can be suppressed.
(Third embodiment)
The third embodiment describes the improvement to the motion vector search unit 16 in more detail.
[0068]
The motion vector search unit 16 calculates an evaluation value for each macroblock located in the reference frame / field and takes the macroblock with the smallest evaluation value as the reference macroblock (hereinafter referred to as the reference MB). The relative position of the reference MB with respect to the encoded MB is then calculated as the motion vector. The evaluation value in the first embodiment is derived from the AC component and the DC component based on the square error, whereas the evaluation value in the present embodiment is derived from the AC component and the DC component based on the absolute error.
[0069]
The formula for calculating the AC component is shown in Equation 8.
[0070]
[Equation 8]

A formula for calculating the DC component is shown in Equation 9.
[0071]
[Equation 9]

The evaluation value for each reference MB is calculated by an equation that uses the DC component of the error and the AC component of the error. Equation 10 below is that equation.
[0072]
[Expression 10]

As can be seen from Equation 10, the absolute error of the DC component is attenuated by being multiplied by the coefficient k.
The DC component and the AC component may also be calculated as in Expressions 11 and 12.
[0073]
[Expression 11]

[0074]
[Expression 12]

Here, Xij is the luminance value of the pixel located at coordinate (i, j) in the encoded MB, and Yij is the luminance value of the pixel located at coordinate (i, j) in the macroblock in the reference frame / field.
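The bodies of the equations referred to above are not reproduced in this text. One reading that is consistent with the definitions of Xij and Yij and with the wording of claim 11 is sketched below. The exact form, the use of n as the side length of the macroblock, and the symbols for the means, AC, DC, and f are assumptions made for illustration, not the published equations.

```latex
\begin{aligned}
\bar{X} &= \frac{1}{n^{2}} \sum_{i,j} X_{ij}, \qquad
\bar{Y} = \frac{1}{n^{2}} \sum_{i,j} Y_{ij} \\
\mathrm{AC} &= \sum_{i,j} \left| \left( X_{ij} - \bar{X} \right) - \left( Y_{ij} - \bar{Y} \right) \right| \\
\mathrm{DC} &= \left| \sum_{i,j} X_{ij} - \sum_{i,j} Y_{ij} \right| \\
f &= \mathrm{AC} + k \cdot \mathrm{DC}, \qquad 0 < k < 1
\end{aligned}
```

Under this reading, k = 1 weights the two components equally, and k = 0 corresponds to an evaluation that uses only the AC component.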
[0075]
Although the DC component is used for the evaluation value in the third embodiment, this embodiment has in common with the first embodiment that the DC component is attenuated by being multiplied by the coefficient k.
The value of k is preferably set in the range of 1/16 to 1/4 by default. This is because it has been confirmed that, when k is within this range, there is no visual problem and the code amount is small both for an image (1) in which the luminance change is larger than the change in the picture pattern and for an image (2) in which both the change in the picture pattern and the luminance change are small.
[0076]
Further, it is desirable that the value of k can be changed in response to a user instruction received through the user interface. With this arrangement, the user can check the encoded image quality on a monitor and adjust the value of k until satisfactory image quality is obtained.
The motion vector search unit 16 described above can be implemented by writing the processing procedure shown in the flowchart of FIG. 5 in a computer description language and having a computer execute the resulting program. The processing procedure of the motion vector search unit 16 is described below with reference to the flowchart of FIG. 5. In step S31, the encoded MB is set as MBx.
[0077]
Steps S32 to S33 form a loop that repeats the processes of steps S34 to S38 for the reference frame / field of each motion compensation mode. Since the reference frames / fields of a P picture are the forward frame (i) and the forward field (ii), steps S34 to S38 are performed on these. Since the reference frames / fields of a B picture are the forward frame (i), the forward field (ii), the backward frame (iii), and the backward field (iv), steps S34 to S38 are performed on these.
[0078]
In the loop processing from step S31 to step S38, each reference frame / field being processed is referred to as the reference frame / field (r).
Steps S34 to S36 form a loop in which the process of step S36 is repeated for every macroblock belonging to the reference frame / field (r) that can be a candidate. Here, every macroblock within the search range of the encoded MB can be a candidate. This is called a full search.
[0079]
The target macroblock in this loop processing is defined as macroblock y. In step S36, an evaluation value f (y) for the macroblock y is obtained based on the above equation (10).
By repeating this step S36, evaluation values are calculated for all macroblocks that can be candidates in the reference frame / field (r).
[0080]
In step S37, among the macroblocks that can be candidates in the reference frame / field (r), the macroblock with the smallest f (y) is set as the reference MB for the reference frame / field (r). In the subsequent step S38, the relative position of the reference MB based on the macroblock x is set as the motion vector (r). The motion vector (r) is a motion vector for the reference frame / field (r). By repeating the above steps S32 to S38, a motion vector is calculated for each reference frame / field.
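The search loop of steps S34 to S38 can be sketched for a single reference frame / field as follows. This is a minimal illustration under stated assumptions: the evaluation value is taken as the AC absolute error plus k times the DC absolute error, following the wording of claim 11; frames are lists of pixel rows; and the names `block`, `evaluation_value`, and `full_search`, the `search` window parameter, and the default k = 1/8 (a value inside the 1/16 to 1/4 range) are hypothetical.

```python
def block(frame, top, left, size):
    """Cut a size x size macroblock out of a frame (list of rows)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def evaluation_value(cur, ref, k=1 / 8):
    """f(y): AC absolute error plus the DC absolute error attenuated by k."""
    n = len(cur) * len(cur[0])
    sum_x = sum(map(sum, cur))
    sum_y = sum(map(sum, ref))
    mean_x, mean_y = sum_x / n, sum_y / n
    dc = abs(sum_x - sum_y)
    ac = sum(abs((x - mean_x) - (y - mean_y))
             for row_x, row_y in zip(cur, ref)
             for x, y in zip(row_x, row_y))
    return ac + k * dc


def full_search(cur_frame, ref_frame, top, left, size=16, search=2, k=1 / 8):
    """Return ((dx, dy), f) for the candidate with the smallest f(y)."""
    cur = block(cur_frame, top, left, size)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if (t < 0 or l < 0 or t + size > len(ref_frame)
                    or l + size > len(ref_frame[0])):
                continue
            f = evaluation_value(cur, block(ref_frame, t, l, size), k)
            if best is None or f < best[0]:
                best = (f, (dx, dy))
    return best[1], best[0]
```

The returned vector is the position of the reference MB relative to the encoded MB. Running the function once per reference frame / field (r) gives one motion vector for each, as in the outer loop of steps S32 to S33.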
[0081]
The present invention is now compared with Patent Document 1 in terms of whether the reference MB search is appropriate. The comparison is made by calculating how much the DC component for the reference MB changes before and after the irreversible conversion. If the average value of the DC component before the irreversible conversion equals the average value after it, so that the difference is zero, the reference MB search is optimal. Conversely, the larger the difference between the average values, the less appropriate the reference MB search. The image used for the comparison is an image of the sky, as shown in FIG. 6. In FIG. 6, the value of each pixel changes slightly, such as 46, 47, and 48. This change is random and appears as an AC component. On the other hand, when the pixels are observed over a wider range, a change is observed even across such groups of pixels. This change is a change in the DC component of the macroblock. FIG. 7 is a diagram in which the luminance values of FIG. 6 are written in decimal notation.
[0082]
Referring to FIG. 8, the reference MB searched by the procedure of the present invention is compared with the reference MB searched in Patent Document 1. Macroblock 1 is a reference MB searched by the procedure of the present invention, and macroblock 2 is a reference MB searched by the procedure of Patent Document 1.
In Patent Document 1, the direct current component of luminance is ignored, and the reference MB is searched with an evaluation value using only the alternating current component. Equation 13 is a formula for calculating the evaluation value of the macroblock in Patent Document 1.
[0083]
[Formula 13]

In Patent Document 1, a reference MB that increases the DC component may be selected even for a flat image such as the sky. In FIG. 8, mx1 and mx2 show the residuals for macroblock 1 and macroblock 2 in matrix form. Because the DC component is ignored in Patent Document 1, a reference MB is selected from the flat image such that the residual for each pixel becomes 3, 4, or 5, as shown in mx2. Since the residual for each pixel is 3, 4, or 5, the average value of the residuals is as large as 3.8.
[0084]
In the encoding apparatus according to the present invention, the evaluation value is calculated and the reference MB is selected after the DC component is attenuated. As a result, a reference MB whose residual for each pixel is -1, 0, or 1 is chosen. Since the residual for each pixel is -1, 0, or 1, the average value of the residuals is as small as -0.25.
In the figure, <DCT>, <Quantization>, <Inverse quantization>, and <Inverse DCT> are the steps of the irreversible conversion.
[0085]
First, the difference between the average values of the residuals before and after the irreversible conversion for the reference MB1 will be described. In the DCT, the average value “−0.25” of the reference MB1 is multiplied by 8 to become “−2.0”, as indicated by the arrow my1. In the subsequent quantization, the DC coefficient “−2.0” is multiplied by 16, divided by the value of (quantization coefficient × 2 × quantization scale), and rounded to the nearest integer, as indicated by the arrow my2.
[0086]
In the inverse quantization unit 12, the DC coefficient “0.00” is multiplied by the value of (quantization coefficient × 2 × quantization scale) and then divided by 16 to become “0.00”, as indicated by the arrow my3.
In the inverse DCT, the DC coefficient is divided by 8, and the average becomes “0.00”, as indicated by the arrow my8. This “0.00” is the average value of the residuals after the irreversible conversion. The average value of the residuals thus differs by “0.25” before and after the irreversible conversion.
[0087]
Next, the difference between the average values before and after the irreversible conversion in the reference MB2 will be described.
In the DCT, the average value “3.80” of the reference MB2 is multiplied by 8 to become “30.38”, as indicated by the arrow my4. In the subsequent quantization, the DC coefficient “30.38” is multiplied by 16, divided by the value of (quantization coefficient × 2 × quantization scale), and rounded to the nearest integer, as indicated by the arrow my5.
[0088]
In the inverse quantization unit 12, the DC coefficient “2” is multiplied by the value of (quantization coefficient × 2 × quantization scale) and then divided by 16 to become “40.00”, as indicated by the arrow my6.
In the inverse DCT, the DC coefficient is divided by 8, and the average becomes “5.00”, as indicated by the arrow my7. This “5.00” is the average value of the residuals after the irreversible conversion. The average value thus differs by “1.20” before and after the irreversible conversion.
[0089]
When the difference between the average values before and after the irreversible conversion is compared between the reference MB1 and the reference MB2, it can be seen that the reference MB1 has a better result.
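The arithmetic of the two round trips can be checked with a short sketch. The function name `dc_round_trip` is hypothetical, and the divisor 320 for (quantization coefficient × 2 × quantization scale) is not stated in the text; it is inferred here from the step in which the quantized value “2” becomes “40.00” after inverse quantization (2 × 320 / 16 = 40).

```python
def dc_round_trip(mean_residual, divisor=320):
    """Average residual after DCT, quantization, and their inverses.

    divisor stands for (quantization coefficient x 2 x quantization scale);
    320 is a value inferred from the worked example, not given in the text.
    """
    dct_dc = mean_residual * 8            # DCT: average -> DC coefficient
    # Quantization. Python's round() rounds ties to even, but no tie
    # occurs for the two values of this example.
    level = round(dct_dc * 16 / divisor)
    dequantized = level * divisor / 16    # inverse quantization
    return dequantized / 8                # inverse DCT: back to an average


after_mb1 = dc_round_trip(-0.25)   # reference MB1 (present invention)
after_mb2 = dc_round_trip(3.80)    # reference MB2 (Patent Document 1)
drift_mb1 = abs(after_mb1 - (-0.25))
drift_mb2 = abs(after_mb2 - 3.80)
```

Under these assumptions the averages after the round trip are 0.0 and 5.0, so the average residual shifts by 0.25 for the reference MB1 and by about 1.20 for the reference MB2, matching the figures given above.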
<Comparison of image quality>
A simulation was performed in which images were encoded using the motion vectors obtained by three encoding devices: the encoding device using Test Model 5, the encoding device described in Patent Document 1, and the encoding device of the present invention with the value of k set to 1/16 to 1/4. The simulation revealed the following.
[0090]
In a relatively complex image (hereinafter referred to as “first image”) having a small luminance change over time, the image quality is good in all the above three encoding devices.
For an image with a large luminance change over time (hereinafter referred to as the “second image”), the encoding device using Test Model 5 yields poor image quality, whereas the encoding device described in Patent Document 1 and the encoding device of the present invention yield good image quality.
[0091]
In an image having a very small temporal luminance change and a slight luminance change depending on the position (hereinafter referred to as the “third image”), the encoding device using Test Model 5 and the encoding device of the present invention yield good image quality, whereas the encoding device described in Patent Document 1 yields poor image quality. More specifically, in the third image (for example, an image of the sky), if the motion vector obtained by the method of Patent Document 1 is used for encoding, blocks appear to move in a region that is actually stationary, and the image quality is degraded.
[0092]
Hereinafter, the case where a motion vector is detected using the encoding apparatus of the present invention for each of the first, second, and third images will be described in detail.
For the first image, that is, a relatively complex image with a small temporal luminance change, among the candidate reference MBs the AC component of the macroblock representing the original motion is close to that of the encoded MB, while the other macroblocks differ greatly.
[0093]
Therefore, the difference between macroblocks in the evaluation value of the luminance error is determined mainly by the difference in the AC component, and the motion vector with the smallest AC component of the error is selected. The encoding device of the present invention thus selects a motion vector indicating the original motion.
For the second image, that is, an image with a large temporal luminance change, the DC component of the error increases because of the temporal luminance change, but since the DC component is attenuated to 1/16 to 1/4, the influence of the AC component of the error becomes relatively large.
[0094]
As a result, the influence of the picture pattern becomes stronger and the influence of the luminance change becomes smaller, so that an appropriate motion vector is obtained.
For the third image (a flat image with a small change in luminance over time), the AC component is very small for any reference MB because the image is flat.
For example, as shown in FIGS. 6 and 7, the change is random within a narrow range, but over a wide range there is a meaningful, non-random change, and the DC component reflects the motion.
[0095]
According to the encoding apparatus of the present invention, 1/16 to 1/4 of the DC component is reflected in the error evaluation value, so that the DC component and the AC component are evaluated at an appropriate ratio. It is therefore possible to detect a reference MB whose DC component and AC component are both close to those of the encoded MB.
In contrast, the encoding device described in Patent Document 1 calculates the motion vector by evaluating the error using only the AC component. Because the AC component does not reflect the motion in such an image, an inappropriate motion vector is obtained.
[0096]
In the encoding device described in Patent Document 1, an inappropriate motion vector may also be obtained for images other than the third image. An example is an image in which the luminance is high at the left end, low at the right end, and changes uniformly in between. In such an image with a constant rate of change in luminance, the AC component is the same everywhere in the image. Therefore, the AC component does not reflect a meaningful motion, and if the error is evaluated only by the AC component, a motion vector with a large DC component of the error may be obtained.
[0097]
According to the encoding device of the present invention, as described above, a part (1/16 to 1/4) of the absolute error of the DC component is reflected in the evaluation value, so that the DC component and the AC component are evaluated at an appropriate ratio. It is therefore possible to detect a reference MB whose DC component and AC component are both close to those of the encoded MB.
As described above, according to the present embodiment, the evaluation value for the reference MB is calculated after attenuating the absolute error of the DC component, rather than setting it to “0” as in Patent Document 1. Therefore, a motion vector can be searched for appropriately whether the image is a flat image with a small luminance change or an image in which lights flash violently.
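The case of an image whose luminance changes at a constant rate can be illustrated with a one-dimensional sketch. The function `evaluation_value`, the sample values, and the choice k = 1/8 are assumptions for illustration; the evaluation value is again taken as the AC absolute error plus k times the DC absolute error, with k = 0 standing in for an evaluation that uses only the AC component.

```python
def evaluation_value(cur, ref, k):
    """AC absolute error plus k times the DC absolute error (1-D blocks)."""
    mean_c = sum(cur) / len(cur)
    mean_r = sum(ref) / len(ref)
    ac = sum(abs((c - mean_c) - (r - mean_r)) for c, r in zip(cur, ref))
    dc = abs(sum(cur) - sum(ref))
    return ac + k * dc


# Luminance is high at the left end and falls at a constant rate.
ramp = [44 - 4 * x for x in range(12)]
cur = ramp[4:8]                     # encoded block; the true shift is 0
shifts = list(range(-2, 3))

# k = 0: only the AC component is evaluated.
ac_only = [evaluation_value(cur, ramp[4 + s:8 + s], k=0) for s in shifts]
# k = 1/8: the DC component is attenuated but still contributes.
attenuated = [evaluation_value(cur, ramp[4 + s:8 + s], k=1 / 8) for s in shifts]
best_shift = shifts[attenuated.index(min(attenuated))]
```

In this sketch every candidate has the same AC-only value, so that evaluation cannot distinguish the shifts, whereas the attenuated DC term makes the candidate at shift 0 the unique minimum.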
[0098]
(Remarks)
Although the present invention has been described based on the above embodiments, they are presented merely as examples of systems that can be expected to give the best effect at present. The present invention can be modified and implemented without departing from its gist. Typical modifications include the following (A), (B), (C), and so on.
[0099]
(A) In the third embodiment, the default value of the attenuation coefficient k is set to 1/16 to 1/4, but it is not limited to this range; any value larger than 0 and smaller than 1 may be used as long as it causes no problem in image quality.
(B) Although the absolute error is used as the evaluation value of the reference MB in the third embodiment, the present invention is not limited to this, and a square error may be used.
[0100]
In that case, the square error is calculated by the following Expression 14.
[0101]
[Expression 14]

Further, as the square error, a mean square error obtained by dividing each term of the above expression by the square of n may be used.
[0102]
(C) A function unit for detecting image characteristics may be added to the encoding apparatus according to the third embodiment, and the value of the coefficient k may be changed automatically according to the image characteristics. In addition, although the motion compensation prediction unit 6 in the third embodiment calculates the DC component of the error and the AC component of the error based on the luminance value, the calculation may instead be based on the color difference value or on the value of any of the RGB components representing a pixel.
[0103]
(D) Although the motion compensation prediction unit 6 according to the third embodiment performs a so-called full search, in which the encoded MB is compared with all macroblocks within the search range, other search methods may be used. An example of such a method is a search using a reduced image.
(E) Since the information processing by the programs shown in FIGS. 3 to 5 is concretely realized using hardware resources such as a CPU and a frame memory, these programs by themselves constitute an invention. The first to third embodiments described the implementation of the programs according to the present invention in a form incorporated in the encoding apparatus, but the programs shown in the first to third embodiments may also be implemented alone, separately from the encoding apparatus. Acts of implementing the programs alone include the act of producing the programs (1), the act of transferring the programs for a fee or free of charge (2), the act of lending them (3), the act of importing them (4), the act of providing them to the public via a two-way electronic communication line (5), and the act of offering to transfer or lend the programs to general users by store display, catalog solicitation, or pamphlet distribution.
[0104]
Provision via a two-way electronic communication line (5) takes two forms: the provider sends the program to the user and lets the user use it (program download service), or the provider keeps the program on its own side and provides only the function of the program to the user via the electronic communication line (function-providing ASP service).
(F) The “time” element of the steps executed in time series in the flowcharts of FIGS. 3 to 5 is regarded as an essential matter for specifying the invention. Seen this way, the processing procedures of these flowcharts disclose forms of use of the encoding method; that is, these flowcharts are embodiments of the act of using the encoding method according to the present invention. Needless to say, if the processing of these flowcharts is performed so that the steps are carried out in time series, the original purpose of the present invention is achieved, and its operations and effects are obtained, this corresponds to an act of implementing the encoding method according to the present invention.
[0105]
[Effects of the invention]
As described above, the encoding apparatus according to the present invention has the configuration of claim 1, so that the DC component of the error, which occupies most of the evaluation value (MSE) in a moving image in which lights flash violently in a dark concert hall, is attenuated by the coefficient. Since the evaluation value is calculated after the DC component of the error is attenuated, the motion compensation mode is selected appropriately, and the encoding efficiency can be increased.
[0106]
Attenuating the DC component of the error when selecting the motion compensation mode has another effect, as follows. When variable length coding is performed after the DCT, there is only one DC coefficient for each block (8 × 8), but there can be multiple AC coefficients. Even if the DCE and ACE values of a macroblock are equal, the code amount of the AC coefficients is likely to be larger than that of the DC coefficient. This means that ACE affects the code amount more than DCE does. Because the MSE of TM5 reflects DCE and ACE with the same weight, a comparison based on the MSE does not appropriately reflect their respective effects on the code amount. In the present invention, attenuating DCE places more weight on ACE, so that a motion compensation mode that emphasizes the picture pattern rather than the luminance can be selected. As a result, for an image whose luminance changes, mode selection is more likely to reflect the original motion.
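The difference between weighting DCE and ACE equally and attenuating DCE can be illustrated as follows. The decomposition follows the definitions in the claims (DCE is the square of the average residual, ACE is the mean square error minus DCE, and the evaluation value is α × DCE + ACE), and α = 1/64 is the value given in claim 2. The function names and the sample pixel values are hypothetical.

```python
def error_components(cur, pred):
    """Return (MSE, DCE, ACE) for two macroblocks given as flat lists."""
    residuals = [c - p for c, p in zip(cur, pred)]
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n
    dce = (sum(residuals) / n) ** 2     # square of the average residual
    ace = mse - dce                     # what remains is the AC error
    return mse, dce, ace


def mod_mse(cur, pred, alpha=1 / 64):
    """Evaluation value with the DC component attenuated by alpha."""
    _, dce, ace = error_components(cur, pred)
    return alpha * dce + ace


cur = [55, 45] * 8             # encoded MB with a picture pattern
same_pattern = [45, 35] * 8    # same pattern, uniformly darker by 10
flat = [50] * 16               # same average luminance, no pattern

mse_same, _, _ = error_components(cur, same_pattern)
mse_flat, _, _ = error_components(cur, flat)
```

In this example the plain MSE is 100 for the darker copy of the pattern and 25 for the flat block, so an MSE comparison prefers the flat block. With the DC component attenuated, `mod_mse` gives 1.5625 and 25 respectively, so the prediction that preserves the picture pattern is preferred.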
[0107]
Here, the encoding device may further include: determination means for determining, from among a plurality of methods, a conversion method for performing a discrete cosine transform on a macroblock; third calculation means for calculating a variance value for the macroblock to be encoded by performing different calculations according to the conversion method determined by the determination means; and comparison means for comparing the mean square error between the macroblock predicted by the compensation method selected by the selection means and the macroblock to be encoded with the variance value calculated by the third calculation means.
The macroblock to be encoded may be encoded with the compensation method selected by the selection means when the mean square error with respect to the macroblock to be encoded is smaller than the variance value or the mean square error is smaller than a predetermined threshold.
[0108]
Since the calculation method of the variance value is changed according to the type of the discrete cosine transform, the variance value for the macroblock in an interlaced image with a large luminance change is calculated to be smaller than with the method of Test Model 5. When this value is compared with the MSE, the relationship “variance value < MSE” is easily satisfied, and the intra mode is selected more often. Thereby, encoding efficiency can be improved.
[0109]
Here, the predetermined threshold may be 4. When the threshold value is 4, block noise in which rectangular noise appears in a flat image can be suppressed.
Here, an encoding device that selects, from a plurality of macroblocks belonging to a forward or backward frame, a reference macroblock for performing motion compensation on a macroblock and calculates a motion vector for the selected reference macroblock may include: first calculation means for calculating an AC component of error and a DC component of error for each macroblock that is a candidate for the reference macroblock; second calculation means for calculating an evaluation value for each candidate macroblock using the calculated AC component of error and DC component of error; and selection means for selecting the reference macroblock for the motion compensation method based on the calculated evaluation value.
The second calculation means may calculate the evaluation value after attenuating the DC component of the error for each macroblock based on a predetermined coefficient. Because the evaluation value for the reference MB is calculated after attenuating the DC component, rather than setting it to “0” as in Patent Document 1, a motion vector can be searched for appropriately, and encoding efficiency can be improved, whether the image to be encoded is a flat image with a small luminance change or an image in which lights flash violently.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating a hardware configuration of an encoding device.
FIG. 2 is a flowchart showing the overall processing flow of the motion compensation prediction unit 6.
FIG. 3 is a flowchart showing the processing procedure of the inter selector 18.
FIG. 4 is a flowchart showing the processing procedure of the inter / intra selection unit 19.
FIG. 5 is a flowchart showing the processing procedure of the motion vector search unit 16.
FIG. 6 is a diagram illustrating luminance values in an encoded MB and a reference MB in an image in which both the change in the picture pattern and the change in luminance are small, such as a view of the sky.
FIG. 7 is a diagram illustrating the luminance values (in decimal notation) of the sky image.
FIG. 8 is a diagram explaining the difference between the motion vector searches in terms of the difference between the DC component value of the residual and the DC component value obtained by sequentially performing DCT, quantization, inverse quantization, and inverse DCT on it.
[Explanation of symbols]
1 D / A converter
2 Format converter
3 Screen rearrangement part
4 Frame memory
5 Subtractor
6 Motion compensation prediction unit
7 DCT section
8 Quantization part
9 Variable length encoder
10 Buffer
11 Rate control unit
12 Inverse quantization part
13 Inverse DCT section
14 Adder
16 Motion vector search unit
17 DCT type determination section
18 Inter selector
19 Inter / Intra Selector

Claims

An encoding device that selects a compensation method for performing motion compensation from a plurality of methods and encodes a macroblock using the selected compensation method, comprising:
first calculating means for calculating an AC component of error and a DC component of error for the macroblock predicted by each motion compensation method;
second calculating means for calculating an evaluation value for each compensation method by attenuating the DC component of the error for each macroblock through multiplication by a coefficient α (0 < α < 1) and adding the AC component of the error for that macroblock; and
selecting means for selecting a compensation method based on the calculated evaluation value.

The encoding apparatus according to claim 1, wherein the coefficient α is 1/64.

The encoding apparatus according to claim 1, wherein
the first calculating means calculates a residual for each pixel between the value of a pixel belonging to the macroblock to be encoded and the value of a pixel belonging to the predicted macroblock, sums the squares of the residuals, and divides the sum by the total number of pixels in the macroblock to obtain the mean square error between the macroblocks,
the DC component of the error is calculated by squaring the average of the residuals, and
the AC component of the error is calculated by subtracting the DC component of the error from the mean square error.

The encoding apparatus according to claim 1 or 3, further comprising determination means for determining, from a plurality of methods, a conversion method for performing a discrete cosine transform on a macroblock, wherein
the determination of the conversion method by the determination means is performed prior to the selection of the compensation method by the selecting means, and
the first calculating means calculates the AC component of error and the DC component of error for the macroblock to be encoded by executing different calculations according to the conversion method determined by the determination means.

The encoding device according to claim 4, wherein
the conversion methods determined by the determining means include a frame method and a field method,
if the conversion method determined by the determining means is the frame method, the AC component of error and the DC component of error are calculated for the frame constituting the macroblock, and
if the conversion method determined by the determining means is the field method, the AC component of error and the DC component of error are calculated for each field constituting the macroblock, and the averages of the calculated AC components and DC components are used as the AC component of error and the DC component of error.

The encoding apparatus according to claim 1, further comprising:
determining means for determining, from a plurality of methods, a conversion method for performing a discrete cosine transform on a macroblock;
third calculating means for calculating a variance value for the macroblock to be encoded by performing different calculations according to the conversion method determined by the determining means; and
comparing means for comparing the mean square error between the macroblock predicted by the compensation method selected by the selecting means and the macroblock to be encoded with the variance value calculated by the third calculating means, wherein
the macroblock to be encoded is encoded with the compensation method selected by the selecting means when the mean square error with respect to the macroblock to be encoded is smaller than the variance value or the mean square error is smaller than a predetermined threshold value.

The encoding apparatus according to claim 6, wherein the predetermined threshold value is 4.

The encoding apparatus according to claim 6 or 7, wherein
the conversion methods determined by the determining means include a frame method and a field method, and
the third calculating means,
if the conversion method determined by the determining means is the frame method, calculates a variance value for the frame constituting the macroblock to be encoded, and
if the conversion method determined by the determining means is the field method, calculates a variance value for each field constituting the macroblock and calculates the average of the calculated variance values.

The encoding apparatus according to claim 6, wherein
the first calculating means calculates a residual for each pixel between the value of a pixel belonging to the macroblock to be encoded and the value of a pixel belonging to the predicted macroblock, sums the squares of the residuals, and divides the sum by the total number of pixels in the macroblock to obtain the mean square error between the macroblocks,
the DC component of the error is calculated by squaring the average of the residuals, and
the variance value is calculated as the average of the squares of the differences between the average of the macroblock to be encoded and the pixel values of that macroblock.

An encoding device that selects, from a plurality of macroblocks belonging to a forward or backward frame, a reference macroblock for performing motion compensation on a macroblock, and calculates a motion vector for the selected reference macroblock, comprising:
first calculating means for calculating an AC component of error and a DC component of error for each macroblock that is a candidate for the reference macroblock;
second calculating means for calculating an evaluation value for each macroblock that is a candidate for the reference macroblock by attenuating the DC component of the error for each macroblock through multiplication by a coefficient α (0 < α < 1) and adding the AC component of the error for that macroblock; and
selecting means for selecting the reference macroblock for the motion compensation method based on the calculated evaluation value.

The encoding apparatus according to claim 10, wherein
the DC component of the error is the absolute value of the difference obtained by subtracting the sum of the values of the pixels belonging to the candidate macroblock from the sum of the values of the pixels belonging to the macroblock to be encoded, and
the AC component of the error is calculated based on the values obtained by subtracting the average pixel value of the macroblock to be encoded from the value of each pixel belonging to that macroblock and the values obtained by subtracting the average pixel value of the reference macroblock from the value of each pixel belonging to the reference macroblock.

A computer-readable program that causes a computer to execute a process of selecting a compensation method for performing motion compensation from a plurality of methods and encoding a macroblock using the selected compensation method, the program causing the computer to execute:
a first calculation step of calculating an AC component of error and a DC component of error for the macroblock predicted by each motion compensation method;
a second calculation step of calculating an evaluation value for each compensation method by attenuating the DC component of the error for each macroblock through multiplication by a coefficient α (0 < α < 1) and adding the AC component of the error for that macroblock; and
a selection step of selecting a compensation method based on the calculated evaluation value.

The computer-readable program according to claim 12, wherein the coefficient α is 1/64.

The computer-readable program according to claim 12, wherein
the first calculation step calculates, for each pixel, a residual between the value of a pixel belonging to the macroblock to be encoded and the value of the corresponding pixel belonging to the predicted macroblock, sums the squares of the residuals, and divides the sum by the total number of pixels in the macroblock to obtain the mean square error between the macroblocks,
the DC component of the error is calculated by squaring the average of the residuals, and
the AC component of the error is calculated by subtracting the DC component of the error from the mean square error.
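
The decomposition recited above can be sketched in Python as follows: the mean square error is split into a DC part (the squared mean residual) and an AC part (the remainder), and each compensation method is scored as α·DC + AC with α = 1/64 as in claim 13. The function names, the flattened-list macroblock representation, and the mode labels `forward_frame` and `forward_field` are hypothetical choices for this sketch.

```python
def error_components(current, predicted):
    """Return (mse, dc, ac) for a macroblock and its prediction."""
    residuals = [c - p for c, p in zip(current, predicted)]
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n   # mean square error
    dc = (sum(residuals) / n) ** 2            # squared average residual
    ac = mse - dc                             # what remains after removing DC
    return mse, dc, ac


def evaluation_value(current, predicted, alpha=1 / 64):
    """Attenuated DC error plus AC error."""
    _, dc, ac = error_components(current, predicted)
    return alpha * dc + ac


def select_method(current, predictions, alpha=1 / 64):
    """predictions maps a (hypothetical) mode label to its predicted macroblock."""
    return min(predictions,
               key=lambda mode: evaluation_value(current, predictions[mode], alpha))


current = [10, 20, 30, 40]
predictions = {
    "forward_frame": [60, 70, 80, 90],  # right pattern, large brightness offset
    "forward_field": [25, 25, 25, 25],  # matching brightness, wrong pattern
}
chosen = select_method(current, predictions)
```

In this toy case the plain mean square errors are 2500 and 125, so an MSE-only rule would pick `forward_field`. The attenuated evaluation values are 39.0625 and 125, so `forward_frame` is chosen instead.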

The computer-readable program according to claim 12 or 14, further causing the computer to execute a determination step of determining, from among a plurality of methods, a transform method for performing a discrete cosine transform on the macroblock, wherein
the determination of the transform method by the determination step is performed prior to the selection of the compensation method by the selection step, and
the first calculation step calculates the AC component of error and the DC component of error for the macroblock to be encoded by executing different calculations according to the transform method determined by the determination step.

The computer-readable program according to claim 15, wherein
the transform methods from which the determination step chooses include a frame method and a field method,
if the transform method determined in the determination step is the frame method, the AC component of error and the DC component of error are calculated for the frame constituting the macroblock, and
if the transform method determined in the determination step is the field method, the AC component of error and the DC component of error are calculated for each field constituting the macroblock, and the average of the calculated AC components and the average of the calculated DC components are used as the AC component of error and the DC component of error, respectively.
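
A minimal Python sketch of the frame/field distinction above, assuming that a macroblock is given as a list of pixel rows, that the two fields are the even-numbered and odd-numbered rows, and that the components are defined as in claim 14 (DC = squared mean residual, AC = MSE minus DC). The function names and the `"frame"` / `"field"` labels are hypothetical.

```python
def components(cur_rows, pred_rows):
    """Return (ac, dc) over all pixels in the given rows."""
    residuals = [c - p
                 for cur_row, pred_row in zip(cur_rows, pred_rows)
                 for c, p in zip(cur_row, pred_row)]
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n
    dc = (sum(residuals) / n) ** 2
    return mse - dc, dc


def components_for_transform(cur_rows, pred_rows, method):
    """Compute (ac, dc) per frame, or per field and then averaged."""
    if method == "frame":
        return components(cur_rows, pred_rows)
    if method == "field":
        # Assumption: top field = even rows, bottom field = odd rows.
        top_ac, top_dc = components(cur_rows[0::2], pred_rows[0::2])
        bot_ac, bot_dc = components(cur_rows[1::2], pred_rows[1::2])
        return (top_ac + bot_ac) / 2, (top_dc + bot_dc) / 2
    raise ValueError("method must be 'frame' or 'field'")


# Interlaced-looking toy block: the two fields differ strongly in brightness.
cur = [[10, 10], [30, 30], [10, 10], [30, 30]]
pred = [[0, 0], [0, 0], [0, 0], [0, 0]]
frame_ac, frame_dc = components_for_transform(cur, pred, "frame")
field_ac, field_dc = components_for_transform(cur, pred, "field")
```

For this block the frame calculation gives AC = 100 and DC = 400, while the field calculation gives AC = 0 and DC = 500, because within each field the residual is constant.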

The computer-readable program according to claim 12, further causing the computer to execute:
a determination step of determining, from among a plurality of methods, a transform method for performing a discrete cosine transform on the macroblock;
a third calculation step of calculating a variance value for the macroblock to be encoded by performing different calculations according to the transform method determined in the determination step; and
a comparison step of comparing the mean square error between the macroblock predicted by the compensation method selected in the selection step and the macroblock to be encoded with the variance value calculated in the third calculation step,
wherein the macroblock to be encoded is encoded by the compensation method selected in the selection step when the mean square error with respect to the macroblock to be encoded is smaller than the variance value or the mean square error is smaller than a predetermined threshold.

The computer-readable program according to claim 17, wherein the predetermined threshold is 4.
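
The condition in claims 17 and 18 can be sketched as a simple check in Python: the selected compensation method is kept when the mean square error is below the variance of the macroblock to be encoded, or below the threshold of 4. The function names and the flattened-list representation are assumptions of this sketch, and the variance is the plain frame-based form. What the encoder does when the check fails is not stated in these claims, so the sketch only returns a boolean.

```python
def variance(macroblock):
    """Average squared difference between each pixel and the block mean."""
    mean = sum(macroblock) / len(macroblock)
    return sum((p - mean) ** 2 for p in macroblock) / len(macroblock)


def mean_square_error(current, predicted):
    """Average squared residual between the block and its prediction."""
    return sum((c - p) ** 2 for c, p in zip(current, predicted)) / len(current)


def keep_selected_method(current, predicted, threshold=4):
    """True when MSE < variance or MSE < threshold."""
    mse = mean_square_error(current, predicted)
    return mse < variance(current) or mse < threshold


textured = [10, 20, 30, 40]   # variance 125
flat = [50, 50, 50, 50]       # variance 0

good_prediction = keep_selected_method(textured, [11, 19, 31, 39])  # MSE 1 < 125
flat_small_error = keep_selected_method(flat, [51, 51, 51, 51])     # MSE 1 < 4
flat_large_error = keep_selected_method(flat, [60, 60, 60, 60])     # MSE 100
```

The threshold term matters for flat blocks: their variance is close to zero, so without it even a nearly perfect prediction would fail the variance comparison.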

The computer-readable program according to claim 17, wherein
the transform methods from which the determination step chooses include a frame method and a field method, and
in the third calculation step,
if the transform method determined in the determination step is the frame method, the variance value is calculated for the frame constituting the macroblock to be encoded, and
if the transform method determined in the determination step is the field method, a variance value is calculated for each field constituting the macroblock, and the average of the calculated variance values is calculated.
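
A short Python sketch of the frame/field variance calculation, assuming a macroblock given as a list of pixel rows with the two fields taken as the even-numbered and odd-numbered rows. The function names and the `"frame"` / `"field"` labels are hypothetical.

```python
def variance(rows):
    """Variance over all pixels in the given rows."""
    pixels = [p for row in rows for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def variance_for_transform(rows, method):
    """Frame method: one variance. Field method: average of the two field variances."""
    if method == "frame":
        return variance(rows)
    if method == "field":
        # Assumption: top field = even rows, bottom field = odd rows.
        return (variance(rows[0::2]) + variance(rows[1::2])) / 2
    raise ValueError("method must be 'frame' or 'field'")


block = [[10, 10], [30, 30], [10, 10], [30, 30]]
frame_var = variance_for_transform(block, "frame")
field_var = variance_for_transform(block, "field")
```

For this toy block the frame variance is 100 and the field variance is 0, since each field is uniform on its own.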

The computer-readable program according to claim 17, wherein
the first calculation step calculates, for each pixel, a residual between the value of a pixel belonging to the macroblock to be encoded and the value of the corresponding pixel belonging to the predicted macroblock, sums the squares of the residuals, and divides the sum by the total number of pixels in the macroblock to obtain the mean square error between the macroblocks,
the DC component of the error is calculated by squaring the average of the residuals, and
the variance value is calculated as the average of the squares of the differences between the average pixel value of the macroblock to be encoded and each pixel value of that macroblock.

A computer-readable program that causes a computer to execute a process of selecting a reference macroblock for performing motion compensation on a macroblock from among a plurality of macroblocks belonging to a forward or backward frame, and calculating a motion vector for the selected reference macroblock, the program causing the computer to execute:
a first calculation step of calculating an AC component of error and a DC component of error for each macroblock that is a candidate for the reference macroblock;
a second calculation step of calculating an evaluation value for each macroblock that is a candidate for the reference macroblock by multiplying the DC component of the error for that macroblock by a coefficient α (0 < α < 1), thereby attenuating the DC component of the error, and adding the AC component of the error for that macroblock; and
a selection step of selecting the reference macroblock for motion compensation based on the calculated evaluation values.

The computer-readable program according to claim 21, wherein
the DC component of the error is the absolute value of the difference obtained by subtracting the sum of the values of the pixels belonging to the candidate macroblock from the sum of the values of the pixels belonging to the macroblock to be encoded, and
the AC component of the error is a value calculated based on the values obtained by subtracting the average pixel value of the macroblock to be encoded from the value of each pixel belonging to the macroblock to be encoded, and the values obtained by subtracting the average pixel value of the reference macroblock from the value of each pixel belonging to the reference macroblock.

An encoding method for selecting a compensation method for performing motion compensation from among a plurality of methods and encoding a macroblock using the selected compensation method, the encoding method comprising:
a first calculation step of calculating an AC component of error and a DC component of error for the macroblock predicted by each motion compensation method;
a second calculation step of calculating an evaluation value for each compensation method by multiplying the DC component of the error for each macroblock by a coefficient α (0 < α < 1), thereby attenuating the DC component of the error, and adding the AC component of the error for that macroblock; and
a selection step of selecting a compensation method based on the calculated evaluation values.

The encoding method according to claim 23, further comprising:
a determination step of determining, from among a plurality of methods, a transform method for performing a discrete cosine transform on the macroblock;
a third calculation step of calculating a variance value for the macroblock to be encoded by performing different calculations according to the transform method determined in the determination step; and
a comparison step of comparing the mean square error between the macroblock predicted by the compensation method selected in the selection step and the macroblock to be encoded with the variance value calculated in the third calculation step,
wherein the macroblock to be encoded is encoded by the compensation method selected in the selection step when the mean square error with respect to the macroblock to be encoded is smaller than the variance value or the mean square error is smaller than a predetermined threshold.

An encoding method for selecting a reference macroblock for performing motion compensation on a macroblock from among a plurality of macroblocks belonging to a forward or backward frame, and calculating a motion vector for the selected reference macroblock, the encoding method comprising:
a first calculation step of calculating an AC component of error and a DC component of error for each macroblock that is a candidate for the reference macroblock;
a second calculation step of calculating an evaluation value for each macroblock that is a candidate for the reference macroblock by multiplying the DC component of the error for that macroblock by a coefficient α (0 < α < 1), thereby attenuating the DC component of the error, and adding the AC component of the error for that macroblock; and
a selection step of selecting the reference macroblock for motion compensation based on the calculated evaluation values.