JP2004260251A

JP2004260251A - Apparatus and program of detecting motion vector

Info

Publication number: JP2004260251A
Application number: JP2003045504A
Authority: JP
Inventors: Hideki Takehara; 英樹竹原; Ichiro Ando; 一郎安藤
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2003-02-24
Filing date: 2003-02-24
Publication date: 2004-09-16

Abstract

PROBLEM TO BE SOLVED: To provide an apparatus and a program of detecting a motion vector which are used for motion compensation predictive coding for highly accurately coding motion image information, and capable of saving memories and reducing amounts of operations.

SOLUTION: The apparatus is provided with a thinning rate setting means for setting a thinning rate of pixels in a motion detection object block to be a target of comparison in block matching for the pixels in a detection range, according to a distance from the center of the detection range of the pixels in the detection range of the motion vector; and a thinning means for thinning the pixels in the motion detection object block used for block matching and the pixels in the detection range according to the set thinning rate.

COPYRIGHT: (C)2004,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は動画像情報を高能率符号化する動き補償予測符号化における動きベクトル検出に係り、特に省メモリ化と演算処理量の削減の図れる動きベクトル検出装置及び動きベクトル検出プログラムに関する。
【０００２】
【従来の技術】
動画像符号化として、従来から動き補償予測が用いられている。動き補償予測はＭＰＥＧなど国際標準方式でも広く用いられていて、ブロック毎に画像間予測を行なう際に動画像の動きに合わせて画像の各ブロックを移動させてから予測する。動き補償予測は一般に１６×１６画素または８×８画素単位で行われ、動きベクトルもその単位で求められる。
【０００３】
以下、上記のような動き補償予測を行う従来の符号化装置について図面及びフローチャートに基づき説明する。図１４は従来の動き補償予測を行う符号化装置の構成例を示すブロック図である。また図１５は従来の動き補償予測を行う符号化装置を説明するフローチャートである。
【０００４】
入力された画像信号は、動き検出器１１１に与えられる一方、減算器１０１において動き補償予測器１１０から与えられる画像間予測信号が減算され、予測誤差となってＤＣＴ１０２に与えられる（ステップＳ１０３）。ＤＣＴ１０２は８×８画素単位で離散コサイン変換（ＤＣＴ）の変換処理を行い、得られた係数を量子化器１０３に与える（ステップＳ１０４）。量子化器１０３は所定の量子化ステップ幅で係数を量子化し、固定長の符号となった係数を可変長符号化器１０４と逆量子化器１０６に与える（ステップＳ１０５）。可変長符号化器１０４は２次元の８×８個の係数を１次元に変換し、係数をハフマン符号で符号化し、ハフマン符号列は多重化器１０５に与えられる（ステップＳ１０６）。
【０００５】
一方、逆量子化器１０６及び逆ＤＣＴ１０７ではＤＣＴ１０２及び量子化器１０３の逆処理が行われ、予測誤差を再生する（ステップＳ１０９及びステップＳ１１０）。得られた予測誤差には加算器１０８で予測信号が加算され参照画像となり（ステップＳ１１１）、画像メモリ１０９に蓄えられる（ステップＳ１１２）。画像メモリ１０９は参照画像を動き検出器１１１と動き補償予測器１１０に与える。
【０００６】
動き補償予測器１１０は、動き検出器１１１から与えられる動きベクトルに従って、画像メモリ１０９に蓄積されている画像をブロック毎に移動させ、画像間予測信号を得る（ステップＳ１０２）。動き補償予測により得られた画像間予測信号は、減算器１０１及び加算器１０８に与えられる。動き検出器１１１では、動き補償予測のブロック単位で画像メモリ１０９に蓄積されている参照画像をハーフペル単位で移動させて入力画像とブロックマッチングを取り、最もマッチングのとれた移動を動きベクトルとする（ステップＳ１０１）。得られた動きベクトルは動き補償予測器１１０の他に、動きベクトル符号化器１１２にも与えられる。動きベクトル符号化器１１２では、所定の手順で求められた予測動きベクトルと符号化対象となるブロックの動きベクトルを水平成分、垂直成分毎に比較し、その差分値を符号化する（ステップＳ１０７）。符号化された動きベクトルの符号列は多重化器１０５で、画像間予測誤差の符号列と多重化される（ステップＳ１０８）。
【０００７】
ここで、従来の動き補償予測における動きベクトルの検出手法について説明する。動きベクトルの検出手法にはブロックマッチング法がよく利用されている。このブロックマッチング法は、動き検出対象ブロック（Ｎ×Ｎ画素）を参照画像中の検出範囲内の画像と画素単位で全ての位置毎にマッチングをとり、マッチング評価関数を計算し、最もマッチングのとれた位置、すなわち最もマッチング評価関数の値が小さくなる位置を動きベクトルとする手法である。マッチング評価関数としては、例えばブロック内の全画素について、画素毎の画素値の差分の絶対値の和をとったもの（絶対値差分和）が使われる。
【０００８】
フルサーチ法の場合は、検出範囲内の全ての位置においてブロックマッチングが行われるため計算量が膨大になるという問題がある。このため、３ステップサーチ法（例えば「会議テレビ信号の動き補償フレーム間符号化」（信学技報ＩＥ８１−５４１９８１−７）を参照）や階層化サーチ法（例えば、「階層画素情報を用いた動画像における動き量検出方式」電子情報通信学会論文誌Ｄ−ＩＩＶｏｌ．Ｊ７２−Ｄ−ＩＩＮｏ．３ｐｐ．３９５−４０３１９８９年３月を参照）などの簡略化手法が提案されている。
【０００９】
３ステップサーチ法は、図１６に示すように、検出範囲内の全ての位置をサーチする代わりに、まず第１段として、粗い間隔で３×３の９点のサーチを行い、最もマッチングのとれた位置を中心に、サーチする間隔を、第１段のときの半分にして第２段の検出を行う。同様の処理を複数段繰り返して、粗から密へ検出点の間隔を絞っていき動きベクトルを検出する方法である。±７画素の検出範囲の場合、３段で終了する。フルサーチ法の場合と比較すると処理量に大きく削減される。ただし、線画部分などの高解像度部分では、最初の方の段階での粗い検出位置とマッチングすべき位置がたまたま合っている場合を除いて、検出初期段階で誤った動きベクトルを検出する場合があり、実際の動き補償予測画像における予測誤差を大きくしてしまうという問題がある。
【００１０】
これに対して、階層化サーチ法は、低域通過型フィルタおよびサブサンプリングによって、水平・垂直方向について、各々１／２ずつ解像度を落した複数解像度の多階層画像を生成し、低解像度階層画像から動きベクトルを検出し、順次１つ上の階層での検出動きベクトルを参照しながら次階層の動きベクトル検出を行う方法で、各階層の動きベクトル検出はブロックマッチング法を用いる。
【００１１】
なお、以下の説明では、低解像度画像を下位の階層画像とし、高解像度画像を上位の階層画像と定義する。各階層での検出範囲の大きさは通常固定されていて、例えば水平・垂直±２画素の５×５＝２５点を検出位置とする。各階層の検出範囲の中心位置は、１つ上の階層での検出動きベクトルを２倍にした位置が使われる。
【００１２】
図１７は、階層化サーチ法における検出範囲の中心位置の設定を説明する図である。下位階層において、始点１０を中心として検出範囲１１で動きベクトル検出を行った結果、動きベクトル１２が検出されたとする。上位階層では、動きベクトル１２を２倍した位置２０を中心として検出範囲２１で動きベクトルを検出する。
【００１３】
以下、上記のような階層化サーチを行う動き検出器１１１について図面及びフローチャートに基づき説明する。図１８は、階層化サーチを行う従来の動き検出器の構成例を示すブロック図である。また図１９及び図２０は階層化サーチを行う従来の動き検出器を説明するフローチャートである。
【００１４】
入力画像は階層レベル１サブサンプル器２１１に与えられる。階層レベル１サブサンプル器２１１は入力された画像を水平方向及び垂直方向に１／２にサブサンプルして階層レベル１の入力画像を作成し、画像メモリ２１２及び階層レベル０サブサンプル器２２１に与える。参照画像は階層レベル１サブサンプル器２１４に与えられる。階層レベル１サブサンプル器２１４は入力された画像を水平方向及び垂直方向に１／２にサブサンプルして階層レベル１の参照画像を作成し、画像メモリ２１３及び階層レベル０サブサンプル器２２４に与える（ステップＳ２０１）。
【００１５】
階層レベル０サブサンプル器２２１は入力された画像を水平方向及び垂直方向に１／２にサブサンプルして階層レベル０の入力画像を作成し、画像メモリ２２２に与える。階層レベル０サブサンプル器２２４は入力された画像を水平方向及び垂直方向に１／２にサブサンプルして階層レベル０の参照画像を作成し、画像メモリ２２３に与える（ステップＳ２０２）。階層レベル０動きベクトル決定器２２０は、検出開始点を（０，０）、検出範囲を±１、マッチングブロックサイズを４×４に設定して（ステップＳ２０３〜ステップ２０６）、画像メモリ２２２内の階層レベル０の入力画像と画像メモリ２２３内の階層レベル０の参照画像から絶対値差分和が最小となる位置の階層レベル０動きベクトルＶ（０）を求め（ステップＳ２０７）、Ｖ（０）を階層レベル１動きベクトル決定器２１０に与え、次の階層レベルに進む（ステップＳ２０８）。
【００１６】
階層レベル１動きベクトル決定器２１０は、検出開始点をＶ（０）を２倍にスケーリングした位置、検出範囲を±１、マッチングブロックサイズを８×８に設定して（ステップＳ２０４〜ステップ２０６）、画像メモリ２１２内の階層レベル１の入力画像と画像メモリ２１３内の階層レベル１の参照画像から絶対値差分和が最小となる位置の階層レベル１動きベクトルＶ（１）を求め（ステップＳ２０７）、Ｖ（０）を階層レベル２動きベクトル決定器２００に与え、次の階層レベルに進む（ステップＳ２０８）。階層レベル２動きベクトル決定器２００は、検出開始点をＶ（１）を２倍にスケーリングした位置、検出範囲を±１、マッチングブロックサイズを１６×１６に設定して（ステップＳ２０４〜ステップ２０６）、入力画像と参照画像から絶対値差分和が最小となる位置の階層レベル１動きベクトルＶ（２）を求め（ステップＳ２０７）、Ｖ（２）を動きベクトルとして出力する。ここで、動きベクトルの決定（ステップＳ２０７）について説明する。
【００１７】
各階層レベルの動きベクトル検出範囲内でブロックマッチング（ステップＳ２５１）をして、ブロックマッチングの評価値を評価し（ステップＳ２５２）、その評価値が最小であればそれを動きベクトルとし（ステップＳ２５３）、そうでなければ次の検出点についてブロックマッチング（ステップＳ２５１）を行う。
【００１８】
なお、階層化サーチ法には、各階層での動き検出対象ブロックの大きさを、階層によらず一定の画素数とする方法（従って低解像度階層ほど実際の画像上でのブロックサイズは大きくなる）と、各階層のサブサンプル比率に応じて、動き検出対象ブロックの大きさを変え、実際上の画像において同じ領域の動きベクトル検出を行う方法とがある。
【００１９】
このような階層化サーチ法では、低域通過型フィルタおよびサブサンプリングによって多階層画像を生成して各階層の動き検出を行うため、３ステップ法で動きベクトルを誤検出するような高解像度の部分に対しても、動きベクトルの誤検出を回避することができ、誤検出の場合の、実際の動き補償予測画像における予測誤差も３ステップサーチ法に比べて大きくない。
【００２０】
表１は動きベクトルの検出範囲が±７である場合のフルサーチ法におけるマッチングブロックサイズ、検出範囲、検出点数、差分演算回数を示している。表２は動きベクトルの検出範囲が±７である場合の３ステップサーチ法における各ステップでのマッチングブロックサイズ、検出範囲、検出点数、差分演算回数を示している。表３は、動きベクトルの検出範囲が±７である場合の階層化サーチ法における各階層でのマッチングブロックサイズ、検出範囲、検出点数、差分演算回数を示している。表１、表２、及び表３から３ステップサーチ法における差分演算回数はフルサーチ法における差分演算回数と比較して１／９に削減されている。また階層化サーチ法における差分演算回数はフルサーチ法における差分演算回数と比較して約１／２１に削減されている。ただし、階層化サーチ法では階層レベル１及び階層レベル０の画像メモリが必要となる。
【００２１】
【表１】

【００２２】
【表２】

【００２３】
【表３】

【００２４】
【特許文献１】
特開平７−１５４８０１号公報
【００２５】
【非特許文献１】
「会議テレビ信号の動き補償フレーム間符号化」（信学技報ＩＥ８１−５４１９８１−７）
【００２６】
【非特許文献２】
「階層画素情報を用いた動画像における動き量検出方式」（電子情報通信学会論文誌Ｄ−ＩＩＶｏｌ．Ｊ７２−Ｄ−ＩＩＮｏ．３ｐｐ．３９５−４０３１９８９年３月）
【００２７】
【発明が解決しようとする課題】
フルサーチ法はブロックマッチングを用いる動きベクトル検出としては最高の精度を得ることができる。しかし、フルサーチ法では処理量は膨大であるという問題がある。階層化サーチ法はフルサーチよりも検出精度は劣るが、検出範囲内を均一に動きベクトルを検出する特徴を有している。しかし、階層化サーチ法では、高周波成分で各階層レベルにおける入力画像と参照画像のための画像メモリが必要となる問題がある。また、３ステップサーチ法では画像メモリは必要ないが、線画部分などの高解像度部分では、最初のステップで誤った動きベクトルを検出した場合には正しい動きベクトルが得られない問題がある。
【００２８】
本発明は以上の点を鑑みてなされたもので、動きベクトルの検出範囲の中心点からの距離に応じてブロックマッチングに用いる画素を間引くことで、演算量を削減し、余計な画像メモリを必要としない精度の良い動きベクトル検出装置びプログラムを提供することを目的とする。
【００２９】
【課題を解決するための手段】
そこで上記課題を解決するために本発明は、下記の装置及びプログラムを提供するものである。
（１）入力画像を所定数の画素単位毎にブロック化し、前記入力画像において各ブロックを順次動き検出対象ブロックとして設定し、参照画像において前記動き検出対象ブロックに応じて順次検出範囲を設定し、前記各動き検出対象ブロック内の画素と前記各検出範囲内の画素とを比較して、ブロックマッチングにより前記各動き検出対象ブロック毎に動きベクトルを求める動画像の動きベクトル検出装置において、
前記検出範囲内の画素の前記検出範囲の中心からの距離に応じて、前記検出範囲内の画素に対してブロックマッチングにおける比較対象となる前記動き検出対象ブロック内の画素の間引き率を設定する間引き率設定手段と、
前記設定された間引き率に応じて、ブロックマッチングに用いる前記動き検出対象ブロック内の画素及び前記検出範囲内の画素を間引く間引き手段と、
を設けたことを特徴とする動きベクトル検出装置。
（２）上記（１）記載の動きベクトル検出装置において、
動き検出対象ブロックに設定されたブロックに隣接する、既に動きベクトルが求められているブロックの動きベクトルを用いて、予め定められている所定方法に従って求められる予測動きベクトルに応じて、前記動き検出対象ブロックに対して設定される前記検出範囲の中心位置を移動させる検出範囲移動手段を設けたことを特徴とする動きベクトル検出装置。
（３）上記（１）記載の動きベクトル検出装置において、
前記間引き率設定手段は、動き検出対象ブロックに設定されたブロックに隣接する、既に動きベクトルが求められているブロックの動きベクトル、及び前記検出範囲の中心からの距離に応じて、前記間引き率を設定するものであることを特徴とする動きベクトル検出装置。
（４）動画像である入力画像を所定数の画素単位毎にブロック化し、前記入力画像において各ブロックを順次動き検出対象ブロックとして設定し、参照画像において前記動き検出対象ブロックに応じて順次検出範囲を設定し、前記各動き検出対象ブロック内の画素と前記各検出範囲内の画素とを比較して、ブロックマッチングにより前記各動き検出対象ブロック毎に動きベクトルを求める動作をコンピュータに行わせる動きベクトル検出プログラムにおいて、
前記検出範囲内の画素の前記検出範囲の中心からの距離に応じて、前記検出範囲内の画素に対してブロックマッチングにおける比較対象となる前記動き検出対象ブロック内の画素の間引き率を設定する間引き率設定手機能と、
前記設定された間引き率に応じて、ブロックマッチングに用いる前記動き検出対象ブロック内の画素及び前記検出範囲内の画素を間引く間引き機能と、
をコンピュータに実行させることを特徴とする動きベクトル検出プログラム。
【００３０】
【発明の実施の形態】
（第１実施例）
以下、本発明の第１実施例について図面及びフローチャートに基づき説明する。図１は本発明に係る動きベクトル検出器の第１実施例を用いた符号化装置の一実施例を示すブロック図である。図１は、従来の動きベクトル検出器を用いた符号化装置の例である図１４と同じ構成である。ただし、動き検出器１１１の動作が従来例と異なる。図１で減算器１０１、ＤＣＴ１０２、量子化器１０３、可変長符号化器１０４、多重化器１０５、逆量子化器１０６、逆ＤＣＴ１０７、加算器１０８、画像メモリ１０９、動き補償予測器１１０、及び動きベクトル符号化器の動作は従来例と同じである。
【００３１】
図２は本発明の第１実施例を用いた符号化装置の一実施例を説明するフローチャートである。図２は、従来の動きベクトル検出器を用いた符号化装置を説明するフローチャートである図１５と同じ手順である。ただし、手順Ｓ１０１の動作が従来例と異なる。図２で手順Ｓ１０２、手順Ｓ１０３、手順Ｓ１０４、手順Ｓ１０５、手順Ｓ１０６、手順Ｓ１０７、手順Ｓ１０８、手順Ｓ１０９、手順Ｓ１１０、手順Ｓ１１１、及び手順Ｓ１１２は従来例と同じである。
【００３２】
以下、本実施例における動きベクトル検出器の従来例との違いを図面及びフローチャートに基づき説明する。図３は動きベクトル検出器の第１実施例を示すブロック図である。図４は動きベクトル検出器の第１実施例を説明するフローチャートである。
【００３３】
ブロックマッチング設定器３０１はあらかじめ決められた（ステップＳ３０１）方法に従って動きベクトルの検出範囲の中心からの距離に応じてブロックマッチングに用いる画素の間引き率を設定し（ステップＳ３０２）、その情報を動きベクトル決定器３００、間引き器３０２、３０３に与える。間引き器３０２は与えられた間引き率に従って入力画像を間引き、間引いた入力画像を動きベクトル決定器３００に与える（ステップＳ３０３）。間引き器３０３は与えられた間引き率に従って参照画像を間引き、間引いた参照画像を動きベクトル決定器３００に与える（ステップＳ３０４）。
【００３４】
動きベクトル決定器３００は与えられた間引き率に従って、ブロックマッチングを行い（ステップＳ３０５）、ブロックマッチングの評価値を評価し（ステップＳ３０６）、その評価値が最小であればそれを動きベクトルとし（ステップＳ３０７）、そうでなければ次の検出点に移る。動きベクトル検出範囲内のブロックマッチングが全て終了すると、動きベクトルを出力する。
【００３５】
本実施例における動きベクトルの検出範囲の中心からの距離に応じてブロックマッチングに用いる画素の間引き率を設定する方法について図５を用いて説明する。
【００３６】
動きベクトルは１６×１６画素のブロック単位で求め、動きベクトルの検出範囲は水平方向、垂直方向に±７画素であるとする。ブロックマッチングに用いる画素の間引き率を検出範囲の中心からの距離に応じて３段階のマッチングレベルに分けるとする。図５において検出中心位置５０は動きベクトルを（０，０）とした位置であるとし、検出中心位置５０を中心として水平方向、垂直方向に±２画素の範囲を第１の検出範囲５１とする（マッチングレベル１）。第１の検出範囲５１から検出中心位置５０を中心として水平方向、垂直方向に±４画素の範囲を第２の検出範囲５２とする（マッチングレベル２）。第２の検出範囲５２から検出中心位置５０を中心として水平方向、垂直方向に±７画素の範囲を第３の検出範囲５３とする（マッチングレベル３）。第１の検出範囲５１、第２の検出範囲５２、第３の検出範囲５３のそれぞれにブロックマッチングに用いる画素の間引き率を割り当てる。
【００３７】
第１の検出範囲５１の中では図６（Ａ）で示されるように、動きベクトル検出の対象となるブロックの全画素をブロックマッチングに用いる。マッチング評価関数Ｆ（ｘ，ｙ）は式１で表すことができる。ここでｘ、ｙは検出位置の水平位置、垂直位置であり、ＣｕｒＰｉｘｅｌ（ｉ，ｊ）は入力画像の動きベクトル検出の対象となるブロック内の画素であり、ＲｅｆＰｉｘｅｌ（ｉ，ｊ）は検出位置における参照画像のブロック内の画素である。
【００３８】
【数１】

第２の検出範囲５２の中では図６（Ｂ）で示されるように、動きベクトル検出の対象となるブロックに対して水平方向、垂直方向にそれぞれ１／２間引いた画素をブロックマッチングに用いる。マッチング評価関数Ｆ（ｘ，ｙ）は式２で表すことができる。
【００３９】
【数２】

第３の検出範囲５３の中では図６（Ｃ）で示されるように、動きベクトル検出の対象となるブロックに対して水平方向、垂直方向にそれぞれ１／４間引いた画素をブロックマッチングに用いる。マッチング評価関数Ｆ（ｘ，ｙ）は式３で表すことができる。
【００４０】
【数３】

以上のようにして求めたマッチング評価関数Ｆ（ｘ，ｙ）の値が最小となる位置を動きベクトルとする。
【００４１】
検出範囲の中心からの距離が大きくなるに従って、ブロックマッチングに用いる画素の間引き率を大きくするのは、人間の目の視覚特性に基づいた処理である。一般的に人間の目は動きの早いものに対しては認知力が低下するという特性があり、この特性を利用した処理である。認知力の高い小さな動きを高精度で検出し、認知力の低い大きな動きについては精度を落として検出することを特徴としている。
【００４２】
表４は動きベクトルの検出範囲が±７である本実施例における各マッチングレベルにおけるブロックマッチングブロックサイズ、検出範囲、検出点数、差分演算回数を示している。表１、表２、表３と比較すると、本実施例の差分演算回数はフルサーチ法における差分演算回数と比較して約１／５に削減されている。検出点数がフルサーチと同じ２２５であるため検出した動きベクトルの精度は高い。３ステップサーチ法と比較すると差分演算回数は１．６８倍になっているが、３ステップサーチ法のように検出点数を減らしていないため、線画部分などの高解像度部分において誤った動きベクトルを検出する確率は極めて低く、高精度な動きベクトルを得ることができる。階層サーチ法と比較すると差分演算回数は約４倍になっているが、階層化サーチ法のように各階層レベルにおける入力画像と参照画像のための画像メモリが不要である。
【００４３】
【表４】

（第２実施例）
本発明の第２実施例について図面及びフローチャートに基づき説明する。図７は本発明に係る動きベクトル検出器の第２実施例を用いた符号化装置の一実施例を示すブロック図である。図７は、図１に示す符号化装置と同じ構成である。ただし、動き検出器１１１の動作が図１に示す符号化装置と異なる。図７で、減算器１０１、ＤＣＴ１０２、量子化器１０３、可変長符号化器１０４、多重化器１０５、逆量子化器１０６、逆ＤＣＴ１０７、加算器１０８、画像メモリ１０９、動き補償予測器１１０、及び動きベクトル符号化器１１２の動作は図１に示す符号化装置と同じである。
【００４４】
以下、第２実施例における動きベクトル検出器の第１実施例との違いを図面及びフローチャートに基づき説明する。図８は動きベクトル検出器の第２実施例を示すブロック図である。図８は、第１実施例を示す図３と比較して、動きベクトル予測器３０４が増えている。また、間引き設定器３０１、動きベクトル決定器３００の動作が本発明の第１実施例と異なる。図８で、間引き器３０２、３０３の動作は本発明の第１実施例と同じである。図９は動きベクトル検出器の第２実施例を説明するフローチャートである。図９は本発明の第１実施例を説明した図４と比較して、手順Ｓ３００が増えている。また、手順Ｓ３０１が本発明の第一の実施例と異なる。
【００４５】
動きベクトル予測器３０４は予測動きベクトルを所定の手順で求め（ステップＳ３００）、間引き設定器３０１に与える。間引き設定器３０１は与えられたは与えられた予測動きベクトルに従って、動きベクトルの検出範囲の中心からの距離に応じたブロックマッチングに用いる画素の間引き率を設定する（ステップＳ３０１）。動きベクトルの検出範囲内のブロックマッチングが終了すると、動きベクトル決定器３００は動きベクトルを動きベクトル予測器３０４に与え、動きベクトルを出力する。
【００４６】
予測動きベクトルを求める手順について図１０を用いて説明する。図１０において、ブロックＸの予測動きベクトルであるＭＶｐ（Ｘ）を求める。ブロックＸの隣接ブロックＡ、Ｂ、Ｃのそれぞれの動きベクトルをＭＶ（Ａ）、ＭＶ（Ｂ）、ＭＶ（Ｃ）とする。最も単純な方法として、左隣のブロックの動きベクトルを用いてＭＶｐ（Ｘ）＝ＭＶ（Ｃ）とする方法がある。これらはＭＰＥＧ−２などで用いられている方法である。またＭＶ（Ａ）、ＭＶ（Ｂ）、ＭＶ（Ｃ）の水平方向、垂直方向それぞれの値の中央値をとる方法もある。この方法はＭＰＥＧ−４で用いられている方法であり、ＭＶ（Ａ）＝（０，２）、ＭＶ（Ｂ）＝（１、−１）、ＭＶ（Ｃ）＝（３，３）であれば、ＭＶｐ（Ｘ）＝（１，２）となる。
【００４７】
本実施例における動きベクトルの検出範囲の中心からの距離に応じてブロックマッチングに用いる画素の間引き率を設定する方法について図５と図１１を用いて説明する。予測動きベクトルＭＶｐ（ｘ，ｙ）に応じて、第１の検出範囲であるＲａｎｇｅ１（ｘ，ｙ）及び第２の検出範囲であるＲａｎｇｅ２（ｘ，ｙ）が式４に従って変化する。
【００４８】
【数４】

例えば、ＭＶｐが（０，０）であれば、検索範囲の中心位置５０は（０，０）となり、間引き率の設定範囲は第一の実施例と同じ図５となる。ＭＶｐが（３，３）であれば、検索範囲の検出中心位置５０は（０，１）となり、間引き率の設定範囲は図１１となる。図１１において検出中心位置５０を中心として水平方向、垂直方向に±２画素の範囲を第１の検出範囲５１とする（マッチングレベル１）。第１の検出範囲５１から検出中心位置５０を中心として水平方向、垂直方向に±４画素の範囲を第２の検出範囲５２とする（マッチングレベル２）。第２の検出範囲５２の外側を第３の検出範囲５３とする（マッチングレベル３）。それぞれのマッチングレベルは表４で示されている。
【００４９】
本実施例では、間引き率の設定範囲の計算に平行移動の例をあげたが、これらに拡大や縮小の計算を加えても良い。例えば、動きの小さな入力画像に対しては、図１２のように、検出中心位置５０を（０，０）とし、水平方向、垂直方向に±１画素の範囲を第１の検出範囲５１（マッチングレベル１）とし、第１の検出範囲５１から検出中心位置５０を中心として水平方向、垂直方向に±２画素の範囲を第２の検出範囲５２（マッチングレベル２）とし、第２の検出範囲５２の外側を第３の検出範囲５３（マッチングレベル３）として、マッチングレベル１、マッチングレベル２の範囲を小さくすることで演算量をさらに減らすことが可能となる。
【００５０】
この場合の各マッチングレベルにおけるブロックマッチングブロックサイズ、検出範囲、検出点数、差分演算回数を表５に示す。この場合には差分演算回数は３ステップサーチ法とほぼ同じになる。このように入力画像の動きの大小に合わせて間引き率の設定を変化させることで、演算量をより小さくすることが可能である。
【００５１】
【表５】

また、ＭＶｐを利用した間引き率設定方法において、動きベクトルの検出範囲の中心をＭＶｐとして移動させる方法もある。例えば、ＭＶｐが（０，１）である場合に、動きベクトルの検出範囲の中心を（０，１）に移動させる。この結果は図１１と同じになる。動きが大きな入力画像に対しても、予測動きベクトルが正しく予測できる限りにおいて、動きベクトルの検出範囲の中心をＭＶｐに合わせて移動させることによって、動きベクトルの検出範囲の広い、演算量の少ない、且つ精度の高い動きベクトルを検出することができる。
【００５２】
次に、隣接するブロックの動きベクトルに応じてブロック内の画素を間引く率を設定する方法について説明する。本実施例においては隣接するブロックの動きベクトルとしてＭＶｐを用いる。ＭＶｐが（０，３）であるような水平方向に移動するような動きに対しては図１３のような水平重視型の間引き率を設定する。検出中心位置５０を（２，０）とし、水平方向±３画素、垂直方向に±１画素の範囲を第１の検出範囲５１（マッチングレベル１）とし、第１の検出範囲５１から検出中心位置５０を中心として水平方向±４画素、垂直方向に±２画素の範囲を第２の検出範囲５２（マッチングレベル２）とし、第２の検出範囲５２の外側を第３の検出範囲５３（マッチングレベル３）とする。
【００５３】
ＭＶｐが（３，０）であるような垂直方向に移動するような動きに対しては図１３のような垂直重視型の間引き率を設定する。検出中心位置５０を（０，２）とし、水平方向±１画素、垂直方向に±３画素の範囲を第１の検出範囲５１（マッチングレベル１）とし、第１の検出範囲５１から検出中心位置５０を中心として水平方向±２画素、垂直方向に±４画素の範囲を第２の検出範囲５２（マッチングレベル２）とし、第２の検出範囲５２の外側を第３の検出範囲５３（マッチングレベル３）とする。このようにすることで、動きベクトルの検出範囲の広い、演算量の少ない、且つ精度の高い動きベクトルを検出することができる。
【００５４】
即ち、隣接するブロックの動きベクトルに応じてブロック内の画素を間引く率を設定する場合に、水平方向の移動であると判断される場合には垂直方向の画素を間引き率を大きくして垂直方向の移動であると判断される場合には水平方向の画素を間引く率を大きくするようにする。
【００５５】
なお、本発明は、上記した動きベクトル検出器の機能をコンピュータに実現させるためのプログラムを含むものである。このプログラムは、記録媒体から読みとられてコンピュータに取り込まれてもよいし、通信ネットワーク等を介して伝送されてコンピュータに取り込まれてもよい。
【００５６】
【発明の効果】
本発明は、検出範囲の中心からの距離に応じてブロックマッチングに用いる画素の間引き率を変化させることで、演算量を減らし、余計な画像メモリを必要とせず、動きの小さな動画像に対しては特に精度の高い動きベクトルを検出することができる。
【００５７】
また、予測動きベクトルに応じて動きベクトルの検出範囲の中心位置を移動させるようにした場合には、動きの大きな動画像に対しても、演算量を減らし、余計な画像メモリを必要とせず、精度の高い動きベクトルを検出することができる。
【００５８】
また、隣接するブロックの動きベクトルに応じてブロックマッチングに用いる画素の間引き率を変化させるようにした場合には、さらに演算量を減らし、余計な画像メモリを必要とせず、精度の高い動きベクトルを検出することができる。
【図面の簡単な説明】
【図１】本発明に係る動きベクトル検出器の第１実施例を用いた符号化装置の一実施例を示すブロック図である。
【図２】本発明の第１実施例を用いた符号化装置の一実施例を説明するフローチャートである。
【図３】動きベクトル検出器の第１実施例を示すブロック図である。
【図４】動きベクトル検出器の第１実施例を説明するフローチャートである。
【図５】間引き率の設定範囲を示す図である。
【図６】ブロックマッチングに用いる画素の例を示す図である。
【図７】本発明に係る動きベクトル検出器の第２実施例を用いた符号化装置の一実施例を示すブロック図である。
【図８】動きベクトル検出器の第２実施例を示すブロック図である。
【図９】動きベクトル検出器の第２実施例を説明するフローチャートである。
【図１０】予測動きベクトルを求める手順を示す図である。
【図１１】間引き率の設定範囲を示す図である。
【図１２】間引き率の設定範囲を示す図である。
【図１３】間引き率の設定範囲を示す図である。
【図１４】従来の動き補償予測を行う符号化装置の構成例を示すブロック図である。
【図１５】従来の動き補償予測を行う符号化装置を説明するフローチャートである。
【図１６】３ステップサーチ法を説明するための図である。
【図１７】階層化サーチ法を説明するための図である。
【図１８】階層化サーチを行う従来の動き検出器の構成例を示すブロック図である。
【図１９】階層化サーチを行う従来の動き検出器を説明するフローチャートである。
【図２０】階層化サーチを行う従来の動き検出器を説明するフローチャートである。
[0001]
[Technical field of the invention]
The present invention relates to motion vector detection in motion-compensated prediction coding for highly efficient coding of moving image information, and more particularly to a motion vector detection device and a motion vector detection program capable of saving memory and reducing the amount of arithmetic processing.
[0002]
[Prior art]
Motion-compensated prediction has long been used for moving image coding. It is widely used in international standards such as MPEG: when inter-picture prediction is performed block by block, each block of the image is shifted in accordance with the motion in the moving image before prediction. Motion-compensated prediction is generally performed in units of 16 × 16 or 8 × 8 pixels, and motion vectors are obtained in the same units.
[0003]
Hereinafter, a conventional encoding apparatus that performs the above-described motion compensation prediction will be described with reference to the drawings and flowcharts. FIG. 14 is a block diagram showing a configuration example of a conventional coding apparatus that performs motion compensation prediction. FIG. 15 is a flowchart for explaining a conventional coding apparatus that performs motion compensation prediction.
[0004]
The input image signal is supplied to the motion detector 111. In parallel, the subtractor 101 subtracts from it the inter-picture prediction signal supplied by the motion compensation predictor 110, and the resulting prediction error is supplied to the DCT 102 (step S103). The DCT 102 performs a discrete cosine transform (DCT) in units of 8 × 8 pixels and supplies the resulting coefficients to the quantizer 103 (step S104). The quantizer 103 quantizes the coefficients with a predetermined quantization step width and supplies the coefficients, now fixed-length codes, to the variable-length encoder 104 and the inverse quantizer 106 (step S105). The variable-length encoder 104 converts the two-dimensional 8 × 8 coefficients into a one-dimensional sequence and encodes them with a Huffman code; the Huffman code sequence is supplied to the multiplexer 105 (step S106).
[0005]
Meanwhile, the inverse quantizer 106 and the inverse DCT 107 perform the inverse of the processing of the quantizer 103 and the DCT 102 and reproduce the prediction error (steps S109 and S110). The adder 108 adds the prediction signal to the reproduced prediction error to form a reference image (step S111), which is stored in the image memory 109 (step S112). The image memory 109 supplies the reference image to the motion detector 111 and the motion compensation predictor 110.
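The local decoding loop described above (subtract, transform, quantize, then dequantize, inverse-transform and add) can be illustrated for a single 8 × 8 block. The following is a minimal sketch, not the encoder of the patent: the orthonormal DCT-II, the uniform quantizer with rounding, and all function names are assumptions made for illustration.

```python
import math

N = 8  # transform block size used by the DCT 102


def dct_matrix(n=N):
    """Orthonormal DCT-II basis; row k is the k-th cosine basis vector."""
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


C = dct_matrix()
CT = transpose(C)


def dct2(block):
    """Two-dimensional DCT: C * block * C^T."""
    return matmul(matmul(C, block), CT)


def idct2(coef):
    """Inverse two-dimensional DCT: C^T * coef * C."""
    return matmul(matmul(CT, coef), C)


def encode_and_reconstruct(cur_block, pred_block, qstep):
    """Return (quantized levels, reconstructed block) for one 8x8 block."""
    # Subtractor 101: prediction error.
    residual = [[c - p for c, p in zip(cr, pr)]
                for cr, pr in zip(cur_block, pred_block)]
    # DCT 102 followed by quantizer 103 (uniform quantizer, assumed).
    levels = [[round(v / qstep) for v in row] for row in dct2(residual)]
    # Inverse quantizer 106 and inverse DCT 107.
    dequantized = [[lv * qstep for lv in row] for row in levels]
    recon_residual = idct2(dequantized)
    # Adder 108: prediction plus reproduced prediction error.
    recon = [[p + round(r) for p, r in zip(pr, rr)]
             for pr, rr in zip(pred_block, recon_residual)]
    return levels, recon
```

The reconstructed block, not the original input, is what would be stored as the reference image, so that the encoder predicts from the same data a decoder would have.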
[0006]
The motion compensation predictor 110 shifts the image stored in the image memory 109 block by block according to the motion vector supplied by the motion detector 111 and obtains an inter-picture prediction signal (step S102). This inter-picture prediction signal is supplied to the subtractor 101 and the adder 108. For each motion-compensated prediction block, the motion detector 111 shifts the reference image stored in the image memory 109 in half-pel steps, performs block matching against the input image, and takes the best-matching displacement as the motion vector (step S101). The obtained motion vector is supplied to the motion vector encoder 112 as well as to the motion compensation predictor 110. The motion vector encoder 112 compares a predicted motion vector, obtained by a predetermined procedure, with the motion vector of the block to be encoded, separately for the horizontal and vertical components, and encodes the difference values (step S107). The multiplexer 105 multiplexes the code sequence of the encoded motion vectors with the code sequence of the inter-picture prediction error (step S108).
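As a rough illustration of the two uses of the motion vector described above, the sketch below fetches a motion-compensated block and forms the per-component difference from a predicted motion vector. It is a simplified sketch under stated assumptions: displacements are whole pixels (the text uses half-pel steps), images are lists of rows indexed `img[y][x]`, and the function names are hypothetical.

```python
def predict_block(ref, bx, by, mv, n):
    """Motion-compensated prediction: the n x n block of `ref` located at
    block position (bx, by) displaced by mv = (dx, dy).
    The caller must keep the displaced block inside the image."""
    dx, dy = mv
    return [row[bx + dx: bx + dx + n] for row in ref[by + dy: by + dy + n]]


def mv_difference(mv, predicted_mv):
    """Difference between the motion vector and its prediction,
    taken separately for the horizontal and vertical components."""
    return (mv[0] - predicted_mv[0], mv[1] - predicted_mv[1])


def mv_reconstruct(diff, predicted_mv):
    """Decoder side: add the difference back onto the predicted vector."""
    return (diff[0] + predicted_mv[0], diff[1] + predicted_mv[1])
```

Only the difference would be entropy-coded; when neighbouring blocks move alike, the difference tends to be small, which is the reason for coding it instead of the vector itself.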
[0007]
Here, the method of detecting a motion vector in conventional motion-compensated prediction will be described. Block matching is commonly used for motion vector detection. In block matching, the motion detection target block (N × N pixels) is matched, pixel by pixel, against the image within the detection range of the reference image at every position; a matching evaluation function is calculated at each position, and the best-matching position, that is, the position where the value of the matching evaluation function is smallest, is taken as the motion vector. As the matching evaluation function, for example, the sum over all pixels in the block of the absolute differences between corresponding pixel values (the sum of absolute differences) is used.
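The block matching procedure just described can be sketched as follows. This is an illustrative sketch only: images are assumed to be lists of rows of integer pixel values indexed `img[y][x]`, displacements are whole pixels, candidates falling outside the reference image are simply skipped, and the names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
    return total


def full_search(cur, ref, bx, by, n, search):
    """Evaluate every displacement within +/-search pixels and return the
    displacement with the smallest SAD together with that SAD."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose block would leave the reference image.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

For a search range of ±7 this evaluates up to 15 × 15 = 225 positions, each costing N × N difference operations.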
[0008]
In the full search method, block matching is performed at every position within the detection range, so the amount of calculation is enormous. For this reason, simplified methods have been proposed, such as the three-step search method (see, for example, "Motion Compensation Interframe Coding of Conference Television Signals," IEICE Technical Report IE81-54, 1981-7) and the hierarchical search method (see, for example, "Motion Amount Detection Method in Moving Image Using Hierarchical Pixel Information," Transactions of the Institute of Electronics, Information and Communication Engineers, D-II, Vol. J72-D-II, No. 3, pp. 395-403, March 1989).
[0009]
In the three-step search method, as shown in FIG. 16, instead of searching every position within the detection range, a first stage searches 3 × 3 = 9 points at a coarse interval. The second stage then searches around the best-matching position with the interval halved. Repeating the same process over several stages narrows the spacing of the detection points from coarse to fine until the motion vector is detected. For a detection range of ±7 pixels, the search ends after three stages. The processing amount is greatly reduced compared with the full search method. However, in high-resolution portions such as line drawings, an erroneous motion vector may be detected in the early stages unless a coarse detection position in those stages happens to coincide with the position that should match, and this increases the prediction error in the actual motion-compensated prediction image.
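The coarse-to-fine procedure can be sketched as below. The intervals 4, 2, 1 are an assumption that is consistent with the text (halving at each stage and covering ±7 pixels in three stages); the image layout (`img[y][x]`), the bounds handling, and the function names are likewise illustrative.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences for displacement (dx, dy)."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))


def three_step_search(cur, ref, bx, by, n, first_step=4):
    """Return (motion vector, SAD, number of positions evaluated).
    The block at zero displacement is assumed to lie inside `ref`."""
    height, width = len(ref), len(ref[0])

    def in_bounds(dx, dy):
        return (0 <= bx + dx and bx + dx + n <= width
                and 0 <= by + dy and by + dy + n <= height)

    best_mv = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    evaluations = 1
    step = first_step
    while step >= 1:
        cx, cy = best_mv  # centre of this stage's 3 x 3 pattern
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                if ox == 0 and oy == 0:
                    continue  # the centre was evaluated in an earlier stage
                dx, dy = cx + ox, cy + oy
                if not in_bounds(dx, dy):
                    continue
                cost = sad(cur, ref, bx, by, dx, dy, n)
                evaluations += 1
                if cost < best_cost:
                    best_mv, best_cost = (dx, dy), cost
        step //= 2  # halve the interval for the next stage
    return best_mv, best_cost, evaluations
```

Because only the neighbourhood of the stage-1 winner is refined, a true displacement that does not fall near a coarse grid point can be missed on detailed content, which is the weakness the text points out.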
[0010]
In contrast, the hierarchical search method uses a low-pass filter and subsampling to generate multi-layer images at several resolutions, each with the resolution reduced by 1/2 in the horizontal and vertical directions. A motion vector is first detected from the low-resolution layer image, and the motion vector of each subsequent layer is then detected with reference to the motion vector detected in the preceding layer. Block matching is used for motion vector detection in each layer.
[0011]
In the following description, a low-resolution image is defined as a lower-layer image and a high-resolution image as an upper-layer image. The size of the detection range in each layer is usually fixed; for example, the 5 × 5 = 25 points within ±2 pixels horizontally and vertically are used as detection positions. The center position of the detection range in each layer is the position obtained by doubling the motion vector detected in the preceding layer.
[0012]
FIG. 17 is a diagram illustrating the setting of the center position of the detection range in the hierarchical search method. In the lower layer, it is assumed that a motion vector 12 is detected as a result of performing a motion vector detection in a detection range 11 around a start point 10. In the upper hierarchy, a motion vector is detected in a detection range 21 around a position 20 where the motion vector 12 is doubled.
[0013]
Hereinafter, the motion detector 111 that performs the above-described hierarchical search will be described with reference to the drawings and flowcharts. FIG. 18 is a block diagram illustrating a configuration example of a conventional motion detector that performs a hierarchical search. FIGS. 19 and 20 are flowcharts illustrating a conventional motion detector that performs a hierarchical search.
[0014]
The input image is supplied to the hierarchical level 1 subsampler 211. The hierarchical level 1 subsampler 211 subsamples the input image by 1/2 in the horizontal and vertical directions to create the hierarchical level 1 input image, and supplies it to the image memory 212 and the hierarchical level 0 subsampler 221. The reference image is supplied to the hierarchical level 1 subsampler 214. The hierarchical level 1 subsampler 214 subsamples the reference image by 1/2 in the horizontal and vertical directions to create the hierarchical level 1 reference image, and supplies it to the image memory 213 and the hierarchical level 0 subsampler 224 (step S201).
[0015]
The hierarchical level 0 subsampler 221 subsamples its input by 1/2 in the horizontal and vertical directions to create the hierarchical level 0 input image and supplies it to the image memory 222. The hierarchical level 0 subsampler 224 likewise subsamples its input by 1/2 in the horizontal and vertical directions to create the hierarchical level 0 reference image and supplies it to the image memory 223 (step S202). The hierarchical level 0 motion vector determiner 220 sets the detection start point to (0, 0), the detection range to ±1, and the matching block size to 4 × 4 (steps S203 to S206). From the hierarchical level 0 input image in the image memory 222 and the hierarchical level 0 reference image in the image memory 223, it obtains the hierarchical level 0 motion vector V(0) at the position where the sum of absolute differences is smallest (step S207), supplies V(0) to the hierarchical level 1 motion vector determiner 210, and proceeds to the next hierarchical level (step S208).
[0016]
The hierarchical level 1 motion vector determiner 210 sets the detection start point to the position obtained by scaling V(0) by a factor of two, the detection range to ±1, and the matching block size to 8 × 8 (steps S204 to S206). From the hierarchical level 1 input image in the image memory 212 and the hierarchical level 1 reference image in the image memory 213, it obtains the hierarchical level 1 motion vector V(1) at the position where the sum of absolute differences is smallest (step S207), supplies V(1) to the hierarchical level 2 motion vector determiner 200, and proceeds to the next hierarchical level (step S208). The hierarchical level 2 motion vector determiner 200 sets the detection start point to the position obtained by scaling V(1) by a factor of two, the detection range to ±1, and the matching block size to 16 × 16 (steps S204 to S206). From the input image and the reference image, it obtains the hierarchical level 2 motion vector V(2) at the position where the sum of absolute differences is smallest (step S207), and outputs V(2) as the motion vector. The determination of the motion vector (step S207) is described next.
[0017]
Block matching is performed at a detection point within the motion vector detection range of each hierarchical level (step S251), and the evaluation value of the block matching is assessed (step S252). If the evaluation value is the minimum, that point is taken as the motion vector (step S253); otherwise, block matching is performed for the next detection point (step S251).
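The three-level procedure of the conventional detector (levels 0, 1 and 2 with block sizes 4 × 4, 8 × 8 and 16 × 16, a ±1 search at each level, and the vector doubled between levels) can be sketched as follows. The 2 × 2 averaging used as the low-pass filter and subsampler, the image layout (`img[y][x]`), and all function names are assumptions made for illustration.

```python
def downsample(img):
    """Halve the resolution by averaging each 2 x 2 neighbourhood
    (a simple stand-in for low-pass filtering plus subsampling)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) // 4
             for x in range(w)] for y in range(h)]


def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences for displacement (dx, dy)."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))


def search_around(cur, ref, bx, by, n, center, radius=1):
    """Best displacement within +/-radius of `center` (steps S251 to S253)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for oy in range(-radius, radius + 1):
        for ox in range(-radius, radius + 1):
            dx, dy = center[0] + ox, center[1] + oy
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    if best_mv is None:
        raise ValueError("no candidate lies inside the reference image")
    return best_mv, best_cost


def hierarchical_search(cur, ref, bx, by, n=16):
    """Detect V(0), V(1), V(2) in turn and return V(2) with its SAD."""
    cur1, ref1 = downsample(cur), downsample(ref)    # level 1: 1/2 resolution
    cur0, ref0 = downsample(cur1), downsample(ref1)  # level 0: 1/4 resolution
    levels = [(cur0, ref0, 4), (cur1, ref1, 2), (cur, ref, 1)]
    mv = (0, 0)  # detection start point at level 0
    for index, (c, r, scale) in enumerate(levels):
        if index > 0:
            mv = (2 * mv[0], 2 * mv[1])  # scale the previous vector by two
        mv, cost = search_around(c, r, bx // scale, by // scale,
                                 n // scale, mv)
    return mv, cost
```

The four downsampled images held in `cur0`, `ref0`, `cur1` and `ref1` correspond to the extra image memories that the hierarchical method needs.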
[0018]
The hierarchical search method has two variants. In one, the motion detection target block has the same number of pixels in every layer, so that the lower the resolution of the layer, the larger the block is on the actual image. In the other, the size of the motion detection target block is changed according to the subsampling ratio of each layer, so that motion vectors are detected for the same area of the actual image.
[0019]
In such a hierarchical search method, multi-layer images are generated by low-pass filtering and subsampling, and motion detection is performed in each layer. Erroneous detection of a motion vector can therefore be avoided even in high-resolution portions where the three-step method detects motion vectors incorrectly, and when erroneous detection does occur, the prediction error in the actual motion-compensated prediction image is not as large as with the three-step search method.
[0020]
Table 1 shows the matching block size, the detection range, the number of detection points, and the number of difference calculations in the full search method when the detection range of the motion vector is ± 7. Table 2 shows the matching block size, the detection range, the number of detection points, and the number of difference calculations in each step in the three-step search method when the detection range of the motion vector is ± 7. Table 3 shows the matching block size, the detection range, the number of detection points, and the number of difference calculations in each layer in the hierarchical search method when the detection range of the motion vector is ± 7. From Tables 1, 2 and 3, the number of difference operations in the three-step search method is reduced to 1/9 as compared with the number of difference operations in the full search method. Also, the number of difference operations in the hierarchical search method is reduced to about 1/21 compared to the number of difference operations in the full search method. However, the hierarchical search method requires hierarchical level 1 and hierarchical level 0 image memories.
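Since Tables 1 to 3 are not reproduced in this text, the 1/9 figure can be checked with a short calculation. The sketch below assumes a 16 × 16 matching block, one difference operation per compared pixel, and a three-step search that evaluates 9 points in the first stage and 8 new points in each later stage; these assumptions and the function names are illustrative. The hierarchical figure depends on the per-level point counts of Table 3 and is not recomputed here.

```python
def full_search_ops(search, n):
    """(detection points, difference operations) for a full search over
    +/-search pixels with an n x n block."""
    points = (2 * search + 1) ** 2
    return points, points * n * n


def three_step_ops(stages, n):
    """(detection points, difference operations) for a three-step style
    search: 9 points in the first stage, 8 new points per later stage."""
    points = 9 + 8 * (stages - 1)
    return points, points * n * n


full_points, full_ops = full_search_ops(7, 16)
tss_points, tss_ops = three_step_ops(3, 16)
ratio = full_ops / tss_ops  # how many times cheaper the three-step search is
```

Under these assumptions the full search uses 225 points and the three-step search 25 points, which reproduces the stated reduction to 1/9.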
[0021]
[Table 1]

[0022]
[Table 2]

[0023]
[Table 3]

[0024]
[Patent Document 1]
JP-A-7-154801
[0025]
[Non-patent document 1]
"Motion Compensation Interframe Coding of Conference Television Signals" (IEICE IE81-54, 1981-7)
[0026]
[Non-patent document 2]
"Motion Amount Detection Method in Moving Image Using Hierarchical Pixel Information" (Transactions of the Institute of Electronics, Information and Communication Engineers, D-II, Vol. J72-D-II, No. 3, pp. 395-403, March 1989)
[0027]
[Problems to be solved by the invention]
The full search method achieves the highest accuracy among motion vector detection methods that use block matching. However, its processing amount is enormous. The hierarchical search method has lower detection accuracy than the full search, but it detects motion vectors uniformly over the detection range. However, the hierarchical search method has the problem that, for high-frequency components, image memories for the input image and the reference image at each hierarchical level are required. The three-step search method needs no such image memory, but in high-resolution portions such as line drawings it cannot obtain the correct motion vector once an erroneous motion vector has been detected in the first step.
[0028]
The present invention has been made in view of the above points. Its object is to provide a highly accurate motion vector detecting device and program that reduce the amount of calculation by thinning out the pixels used for block matching according to the distance from the center point of the motion vector detection range, and that require no extra image memory.
[0029]
[Means for Solving the Problems]
Therefore, in order to solve the above problems, the present invention provides the following devices and programs.
(1) The input image is divided into blocks for each predetermined number of pixels, each block is sequentially set as a motion detection target block in the input image, and a detection range is sequentially set in the reference image according to the motion detection target block. In the motion vector detecting device for a moving image, a pixel in each of the motion detection target blocks is compared with a pixel in each of the detection ranges, and a motion vector is obtained for each of the motion detection target blocks by block matching.
Thinning-out for setting a thinning rate of a pixel in the motion detection target block to be compared in block matching for a pixel in the detection range according to a distance of a pixel in the detection range from a center of the detection range. Rate setting means;
According to the set thinning rate, thinning means for thinning pixels in the motion detection target block and pixels in the detection range used for block matching,
A motion vector detecting device comprising:
(2) In the motion vector detecting device according to (1),
Using a motion vector of a block adjacent to a block set as a motion detection target block and for which a motion vector has already been obtained, the motion detection target is obtained in accordance with a predicted motion vector obtained according to a predetermined method. A motion vector detecting device comprising a detection range moving means for moving a center position of the detection range set for a block.
(3) In the motion vector detecting device according to (1),
The decimation rate setting unit is configured to set the decimation rate in accordance with a motion vector of a block adjacent to a block set as a motion detection target block, for which a motion vector has already been obtained, and a distance from the center of the detection range. A motion vector detecting device, which is to be set.
(4) A motion vector detecting program for causing a computer to perform an operation in which an input image that is a moving image is divided into blocks each having a predetermined number of pixels, each block is sequentially set as a motion detection target block in the input image, a detection range is sequentially set in a reference image in accordance with the motion detection target block, pixels in each motion detection target block are compared with pixels in each detection range, and a motion vector is obtained for each motion detection target block by block matching, the program causing the computer to realize:
a thinning rate setting function of setting, according to the distance of a pixel in the detection range from the center of the detection range, a thinning rate for the pixels in the motion detection target block that are compared with that pixel in block matching; and
a thinning function of thinning out, according to the set thinning rate, the pixels in the motion detection target block and the pixels in the detection range that are used for block matching.
[0030]
BEST MODE FOR CARRYING OUT THE INVENTION
(First embodiment)
Hereinafter, a first embodiment of the present invention will be described with reference to the drawings and flowcharts. FIG. 1 is a block diagram showing one embodiment of an encoding apparatus that uses the first embodiment of the motion vector detector according to the present invention. FIG. 1 has the same configuration as FIG. 14, which shows an example of an encoding device using a conventional motion vector detector; however, the operation of the motion detector 111 differs from the conventional example. In FIG. 1, the operations of the subtractor 101, DCT 102, quantizer 103, variable length encoder 104, multiplexer 105, inverse quantizer 106, inverse DCT 107, adder 108, image memory 109, motion compensation predictor 110, and motion vector encoder 112 are the same as in the conventional example.
[0031]
FIG. 2 is a flowchart illustrating an embodiment of the encoding apparatus that uses the first embodiment of the present invention. FIG. 2 follows the same procedure as FIG. 15, which is a flowchart illustrating a conventional encoding device using a motion vector detector; however, the operation of step S101 differs from the conventional example. In FIG. 2, steps S102 to S112 are the same as in the conventional example.
[0032]
Hereinafter, the difference between the motion vector detector in the present embodiment and the conventional example will be described with reference to the drawings and flowcharts. FIG. 3 is a block diagram showing a first embodiment of the motion vector detector. FIG. 4 is a flowchart illustrating a first embodiment of the motion vector detector.
[0033]
The block matching setting unit 301 sets, by a predetermined method, the thinning rate of the pixels used for block matching according to the distance from the center of the motion vector detection range (steps S301 and S302), and provides this information to the motion vector determiner 300 and the thinning units 302 and 303. The thinning unit 302 thins out the input image according to the given thinning rate and supplies the thinned input image to the motion vector determiner 300 (step S303). The thinning unit 303 thins out the reference image according to the given thinning rate and supplies the thinned reference image to the motion vector determiner 300 (step S304).
[0034]
The motion vector determiner 300 performs block matching according to the given thinning rate (step S305) and evaluates the resulting evaluation value (step S306). If the evaluation value is the minimum so far, the corresponding detection position is set as the motion vector (step S307); otherwise, processing moves to the next detection point. When block matching has been completed for all detection points within the motion vector detection range, the motion vector is output.
[0035]
A method of setting the pixel thinning rate used for block matching according to the distance from the center of the detection range of the motion vector in the present embodiment will be described with reference to FIG.
[0036]
The motion vector is obtained in units of 16 × 16 pixel blocks, and the detection range of the motion vector is ±7 pixels in the horizontal and vertical directions. The thinning rate of the pixels used for block matching is divided into three matching levels according to the distance from the center of the detection range. In FIG. 5, the detection center position 50 is the position where the motion vector is (0, 0), and the range of ±2 pixels in the horizontal and vertical directions around the detection center position 50 is the first detection range 51 (matching level 1). The range of ±4 pixels in the horizontal and vertical directions around the detection center position 50, excluding the first detection range 51, is the second detection range 52 (matching level 2). The range of ±7 pixels in the horizontal and vertical directions around the detection center position 50, outside the second detection range 52, is the third detection range 53 (matching level 3). A thinning rate of the pixels used for block matching is assigned to each of the first detection range 51, the second detection range 52, and the third detection range 53.
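The assignment of matching levels described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `matching_level` is hypothetical, and it assumes that "distance from the center" means the larger of the horizontal and vertical offsets, which is what the square ranges of ±2, ±4, and ±7 pixels imply.

```python
def matching_level(dx: int, dy: int) -> int:
    """Return the matching level (1, 2 or 3) for a displacement (dx, dy).

    Assumption: the ranges are squares centered on (0, 0), so the distance
    is the larger of |dx| and |dy| (Chebyshev distance).
    """
    d = max(abs(dx), abs(dy))
    if d > 7:
        raise ValueError("displacement is outside the ±7 detection range")
    if d <= 2:
        return 1  # first detection range 51: all pixels used
    if d <= 4:
        return 2  # second detection range 52: 1/2 thinning
    return 3      # third detection range 53: 1/4 thinning


# Count the detection points that fall into each matching level.
counts = {1: 0, 2: 0, 3: 0}
for dy in range(-7, 8):
    for dx in range(-7, 8):
        counts[matching_level(dx, dy)] += 1
print(counts)  # {1: 25, 2: 56, 3: 144}
```

The three counts add up to 225, the total number of detection points in a ±7 range.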
[0037]
In the first detection range 51, as shown in FIG. 6A, all pixels of the block subject to motion vector detection are used for block matching. The matching evaluation function F(x, y) is given by Equation 1. Here, x and y are the horizontal and vertical positions of the detection position, CurPixel(i, j) is a pixel in the block of the input image for which the motion vector is detected, and RefPixel(i, j) is a pixel in the block of the reference image at the detection position.
[0038]
(Equation 1)

In the second detection range 52, as shown in FIG. 6B, the pixels obtained by thinning out the block subject to motion vector detection to 1/2 in each of the horizontal and vertical directions are used for block matching. The matching evaluation function F(x, y) is given by Equation 2.
[0039]
(Equation 2)

In the third detection range 53, as shown in FIG. 6C, the pixels obtained by thinning out the block subject to motion vector detection to 1/4 in each of the horizontal and vertical directions are used for block matching. The matching evaluation function F(x, y) is given by Equation 3.
[0040]
[Equation 3]

The position where the value of the matching evaluation function F (x, y) obtained as described above is the minimum is defined as a motion vector.
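The search over the detection range with distance-dependent thinning might be sketched as below. Equations 1 to 3 are not reproduced in this text, so the evaluation function here is an assumption: a sum of absolute differences over the retained pixels, divided by the number of retained pixels so that values from different matching levels are comparable. All names (`step_for`, `evaluate`, `find_motion_vector`) are hypothetical.

```python
import random

BLOCK = 16  # block size in pixels
RANGE = 7   # detection range is ±RANGE pixels


def step_for(dx, dy):
    """Sampling step: 1 (all pixels), 2 (1/2 thinning) or 4 (1/4 thinning)."""
    d = max(abs(dx), abs(dy))
    return 1 if d <= 2 else 2 if d <= 4 else 4


def evaluate(cur, ref, bx, by, dx, dy):
    """Assumed evaluation value: mean absolute difference over thinned pixels."""
    step = step_for(dx, dy)
    total = 0
    n = 0
    for j in range(0, BLOCK, step):
        for i in range(0, BLOCK, step):
            total += abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
            n += 1
    return total / n  # normalization is an assumption, not from the source


def find_motion_vector(cur, ref, bx, by):
    """Visit all 225 detection points and keep the one with the minimum value."""
    best, best_mv = None, (0, 0)
    for dy in range(-RANGE, RANGE + 1):
        for dx in range(-RANGE, RANGE + 1):
            f = evaluate(cur, ref, bx, by, dx, dy)
            if best is None or f < best:
                best, best_mv = f, (dx, dy)
    return best_mv


# Synthetic check: the input image is the reference image displaced by (2, 1).
SIZE = 48
rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(SIZE)] for _ in range(SIZE)]
cur = [[ref[(y + 1) % SIZE][(x + 2) % SIZE] for x in range(SIZE)] for y in range(SIZE)]
mv = find_motion_vector(cur, ref, 16, 16)
print(mv)
```

On this synthetic input the search should recover the displacement (2, 1), because the evaluation value is exactly zero at that detection point.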
[0041]
Increasing the thinning rate of the pixels used for block matching as the distance from the center of the detection range increases is based on the visual characteristics of the human eye. In general, human perception is reduced for a fast-moving object, and the processing exploits this characteristic: small motions, for which perception is acute, are detected with high accuracy, and large motions, for which perception is poor, are detected with lower accuracy.
[0042]
Table 4 shows the block size for block matching, the detection range, the number of detection points, and the number of difference calculations at each matching level in the present embodiment, in which the detection range of the motion vector is ±7. Compared with Tables 1, 2, and 3, the number of difference calculations in the present embodiment is reduced to about 1/5 of that of the full search method. Since the number of detection points is 225, the same as in the full search, the accuracy of the detected motion vector is high. Compared with the three-step search method, the number of difference calculations is 1.68 times larger. However, because the number of detection points is not reduced as it is in the three-step search method, the probability of detecting an erroneous motion vector in a high-resolution part such as a line drawing is very low, and a highly accurate motion vector can be obtained. Compared with the hierarchical search method, the number of difference calculations is about four times larger, but the image memory for the input image and the reference image at each hierarchical level, which the hierarchical search method needs, is not required.
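The "about 1/5" figure can be checked with a short calculation. The sketch below assumes one difference calculation per compared pixel pair and the level boundaries of ±2, ±4, and ±7 given earlier; the function name `difference_ops` is hypothetical, and since Table 4 is not reproduced here, the totals are computed values rather than figures quoted from the table.

```python
def difference_ops(levels, block=16):
    """Total difference calculations for nested square detection ranges.

    levels: list of (half_width, step) pairs from innermost to outermost,
    where step is the sampling step in each direction (1, 2 or 4).
    """
    total = 0
    covered = 0  # detection points already assigned to inner levels
    for half_width, step in levels:
        points = (2 * half_width + 1) ** 2 - covered  # new points in this ring
        pixels = (block // step) ** 2                 # pixels compared per point
        total += points * pixels
        covered += points
    return total


full_search = difference_ops([(7, 1)])
first_embodiment = difference_ops([(2, 1), (4, 2), (7, 4)])
ratio = first_embodiment / full_search
print(full_search, first_embodiment, round(ratio, 3))  # 57600 12288 0.213
```

Under these assumptions the thinned search needs 12288 difference calculations against 57600 for the full search, a ratio of roughly 0.21, which agrees with the stated reduction to about 1/5.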
[0043]
[Table 4]

(Second embodiment)
A second embodiment of the present invention will be described with reference to the drawings and the flowchart. FIG. 7 is a block diagram showing an embodiment of an encoding apparatus that uses the second embodiment of the motion vector detector according to the present invention. FIG. 7 has the same configuration as the encoding device shown in FIG. However, the operation of the motion detector 111 is different from that of the encoding device shown in FIG. In FIG. 7, the operations of the subtractor 101, DCT 102, quantizer 103, variable length encoder 104, multiplexer 105, inverse quantizer 106, inverse DCT 107, adder 108, image memory 109, motion compensation predictor 110, and motion vector encoder 112 are the same as those of the encoder shown in FIG.
[0044]
Hereinafter, the differences between the motion vector detector of the second embodiment and that of the first embodiment will be described with reference to the drawings and flowcharts. FIG. 8 is a block diagram showing the second embodiment of the motion vector detector. FIG. 8 differs from FIG. 3, which shows the first embodiment, in that a motion vector predictor 304 is added. The operations of the thinning setting unit 301 and the motion vector determiner 300 also differ from those of the first embodiment. In FIG. 8, the operation of the thinning units 302 and 303 is the same as in the first embodiment. FIG. 9 is a flowchart explaining the second embodiment of the motion vector detector. FIG. 9 differs from FIG. 4, which illustrates the first embodiment, in that step S300 is added. Step S301 also differs from the first embodiment.
[0045]
The motion vector predictor 304 obtains a predicted motion vector according to a predetermined procedure (step S300) and supplies it to the thinning setting unit 301. The thinning setting unit 301 sets the thinning rate of the pixels used for block matching according to the given predicted motion vector and the distance from the center of the motion vector detection range (step S301). When block matching within the motion vector detection range is finished, the motion vector determiner 300 outputs the motion vector and also supplies it to the motion vector predictor 304.
[0046]
The procedure for obtaining the predicted motion vector will be described with reference to FIG. In FIG. 10, MVp(X), the predicted motion vector of block X, is obtained. The motion vectors of blocks A, B, and C adjacent to block X are MV(A), MV(B), and MV(C). The simplest method uses the motion vector of the block on the left, that is, MVp(X) = MV(C). This method is used in MPEG-2 and the like. Another method takes the median of the horizontal components and the median of the vertical components of MV(A), MV(B), and MV(C). This method is used in MPEG-4 and the like. For example, if MV(A) = (0, 2), MV(B) = (1, −1), and MV(C) = (3, 3), then MVp(X) = (1, 2).
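The component-wise median prediction can be sketched in a few lines. The function name `predict_mv` is hypothetical; the sketch covers only the median rule described above and not the handling of missing neighbors that a real codec would also need.

```python
from statistics import median


def predict_mv(mv_a, mv_b, mv_c):
    """Predicted motion vector: median of each component of three neighbors."""
    return (
        median([mv_a[0], mv_b[0], mv_c[0]]),  # first components
        median([mv_a[1], mv_b[1], mv_c[1]]),  # second components
    )


mvp = predict_mv((0, 2), (1, -1), (3, 3))
print(mvp)  # (1, 2)
```

With the example vectors from the text, the first components 0, 1, 3 give a median of 1 and the second components 2, −1, 3 give a median of 2, reproducing MVp(X) = (1, 2).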
[0047]
A method for setting the pixel thinning rate used for block matching according to the distance from the center of the motion vector detection range in the present embodiment will be described with reference to FIGS. 5 and 11. In accordance with the predicted motion vector MVp (x, y), Range1 (x, y) as the first detection range and Range2 (x, y) as the second detection range change according to Equation 4.
[0048]
(Equation 4)

For example, if MVp is (0, 0), the detection center position 50 of the search range is (0, 0), and the setting range of the thinning rate is the same as that of the first embodiment shown in FIG. If MVp is (3, 3), the detection center position 50 of the search range is (0, 1), and the setting range of the thinning rate is as shown in FIG. In FIG. 11, the range of ±2 pixels in the horizontal and vertical directions around the detection center position 50 is the first detection range 51 (matching level 1). The range of ±4 pixels in the horizontal and vertical directions around the detection center position 50, excluding the first detection range 51, is the second detection range 52 (matching level 2). The outside of the second detection range 52 is the third detection range 53 (matching level 3). The matching levels are as shown in Table 4.
[0049]
In the present embodiment, an example in which the setting range of the thinning rate is translated has been described, but the size of the setting range may also be changed. For example, as shown in FIG. 12, with the detection center position 50 at (0, 0), the range of ±1 pixel in the horizontal and vertical directions may be set as the first detection range 51 (matching level 1), the range of ±2 pixels in the horizontal and vertical directions around the detection center position 50, excluding the first detection range 51, as the second detection range 52 (matching level 2), and the outside of the second detection range 52 as the third detection range 53 (matching level 3). Reducing the ranges of matching level 1 and matching level 2 in this way further reduces the amount of calculation.
[0050]
Table 5 shows the block matching block size, the detection range, the number of detection points, and the number of times of difference calculation at each matching level in this case. In this case, the number of difference operations is almost the same as in the three-step search method. By changing the setting of the thinning rate in accordance with the magnitude of the motion of the input image, the amount of calculation can be reduced.
[0051]
[Table 5]

In the thinning rate setting method using MVp, there is also a method of moving the center of the motion vector detection range to MVp. For example, when MVp is (0, 1), the center of the motion vector detection range is moved to (0, 1). This result is the same as FIG. By moving the center of the detection range in accordance with MVp, even for an input image with large motion, as long as the motion vector is predicted correctly, a highly accurate motion vector can be detected over a wide detection range with a small amount of calculation.
[0052]
Next, a method of setting the thinning rate of the pixels in a block according to the motion vector of an adjacent block will be described. In this embodiment, MVp is used as the motion vector of the adjacent block. For a motion in the horizontal direction, such as MVp = (0, 3), a horizontally oriented thinning rate as shown in FIG. 13 is set. The detection center position 50 is (2, 0); the range of ±3 pixels in the horizontal direction and ±1 pixel in the vertical direction around it is the first detection range 51 (matching level 1); the range of ±4 pixels in the horizontal direction and ±2 pixels in the vertical direction around the detection center position 50, excluding the first detection range 51, is the second detection range 52 (matching level 2); and the outside of the second detection range 52 is the third detection range 53 (matching level 3).
[0053]
For a motion in the vertical direction, such as MVp = (3, 0), a vertically oriented thinning rate as shown in FIG. 13 is set. The detection center position 50 is (0, 2); the range of ±1 pixel in the horizontal direction and ±3 pixels in the vertical direction around it is the first detection range 51 (matching level 1); the range of ±2 pixels in the horizontal direction and ±4 pixels in the vertical direction around the detection center position 50, excluding the first detection range 51, is the second detection range 52 (matching level 2); and the outside of the second detection range 52 is the third detection range 53 (matching level 3). In this way, a highly accurate motion vector can be detected over a wide detection range with a small amount of calculation.
[0054]
That is, when the thinning rate of the pixels in a block is set in accordance with the motion vector of an adjacent block, if the motion is determined to be horizontal, the thinning rate for pixels in the vertical direction is increased, and if the motion is determined to be vertical, the thinning rate for pixels in the horizontal direction is increased.
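The shifted and direction-dependent ranges described for the second embodiment can be expressed by generalizing the level assignment to a movable center and separate horizontal and vertical half-widths. This is a sketch under stated assumptions: the function name and parameters are hypothetical, and how the center and half-widths are derived from MVp (Equation 4) is not reproduced in this text, so they are passed in directly.

```python
def matching_level(dx, dy, center=(0, 0), level1=(2, 2), level2=(4, 4)):
    """Matching level for displacement (dx, dy).

    center: detection center position as (horizontal, vertical).
    level1, level2: (horizontal, vertical) half-widths of the first and
    second detection ranges around the center. Everything else is level 3.
    """
    ox = abs(dx - center[0])
    oy = abs(dy - center[1])
    if ox <= level1[0] and oy <= level1[1]:
        return 1
    if ox <= level2[0] and oy <= level2[1]:
        return 2
    return 3


# Horizontally oriented setting from the text: center (2, 0),
# level 1 within ±3 horizontally and ±1 vertically,
# level 2 within ±4 horizontally and ±2 vertically.
horizontal = dict(center=(2, 0), level1=(3, 1), level2=(4, 2))
print(matching_level(5, 0, **horizontal))   # 1
print(matching_level(0, 2, **horizontal))   # 2
print(matching_level(-3, 0, **horizontal))  # 3
```

With the default arguments the function reduces to the square ±2 and ±4 ranges of the first embodiment; swapping the half-widths and using center (0, 2) gives the vertically oriented setting.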
[0055]
The present invention includes a program for causing a computer to realize the function of the motion vector detector described above. This program may be read from a recording medium and taken into the computer, or may be transmitted via a communication network or the like and taken into the computer.
[0056]
[Effects of the Invention]
The present invention changes the thinning rate of the pixels used for block matching according to the distance from the center of the detection range, which reduces the amount of calculation, requires no extra image memory, and makes it possible to detect a particularly accurate motion vector for a moving image with small motion.
[0057]
In addition, when the center position of the detection range of the motion vector is moved according to the predicted motion vector, the amount of calculation is reduced even for a moving image having a large motion, and no extra image memory is required. A highly accurate motion vector can be detected.
[0058]
When the thinning rate of the pixels used for block matching is changed in accordance with the motion vector of an adjacent block, the amount of calculation is further reduced, no extra image memory is required, and a highly accurate motion vector can be detected.
[Brief description of the drawings]
FIG. 1 is a block diagram showing one embodiment of an encoding apparatus using a first embodiment of a motion vector detector according to the present invention.
FIG. 2 is a flowchart illustrating an embodiment of an encoding device using the first embodiment of the present invention.
FIG. 3 is a block diagram showing a first embodiment of a motion vector detector.
FIG. 4 is a flowchart illustrating a first embodiment of the motion vector detector.
FIG. 5 is a diagram showing a setting range of a thinning rate.
FIG. 6 is a diagram illustrating an example of pixels used for block matching.
FIG. 7 is a block diagram showing an embodiment of an encoding device using a motion vector detector according to a second embodiment of the present invention.
FIG. 8 is a block diagram showing a second embodiment of the motion vector detector.
FIG. 9 is a flowchart illustrating a second embodiment of the motion vector detector.
FIG. 10 is a diagram showing a procedure for obtaining a predicted motion vector.
FIG. 11 is a diagram illustrating a setting range of a thinning rate.
FIG. 12 is a diagram illustrating a setting range of a thinning rate.
FIG. 13 is a diagram illustrating a setting range of a thinning rate.
FIG. 14 is a block diagram illustrating a configuration example of a conventional encoding device that performs motion compensation prediction.
FIG. 15 is a flowchart illustrating a conventional encoding device that performs motion compensation prediction.
FIG. 16 is a diagram for explaining a three-step search method.
FIG. 17 is a diagram for explaining a hierarchical search method.
FIG. 18 is a block diagram illustrating a configuration example of a conventional motion detector that performs a hierarchical search.
FIG. 19 is a flowchart illustrating a conventional motion detector that performs a hierarchical search.
FIG. 20 is a flowchart illustrating a conventional motion detector that performs a hierarchical search.

Claims

A motion vector detecting device for a moving image, in which an input image is divided into blocks each having a predetermined number of pixels, each block is sequentially set as a motion detection target block in the input image, a detection range is sequentially set in a reference image in accordance with the motion detection target block, pixels in each motion detection target block are compared with pixels in each detection range, and a motion vector is obtained for each motion detection target block by block matching, the device comprising:
thinning rate setting means for setting, according to the distance of a pixel in the detection range from the center of the detection range, a thinning rate for the pixels in the motion detection target block that are compared with that pixel in block matching; and
thinning means for thinning out, according to the set thinning rate, the pixels in the motion detection target block and the pixels in the detection range that are used for block matching.

The motion vector detecting device according to claim 1, further comprising detection range moving means for moving the center position of the detection range set for the motion detection target block in accordance with a predicted motion vector, the predicted motion vector being obtained by a predetermined method from the motion vector of a block that is adjacent to the block set as the motion detection target block and for which a motion vector has already been obtained.

The motion vector detecting device according to claim 1, wherein the thinning rate setting means sets the thinning rate in accordance with both the motion vector of a block that is adjacent to the block set as the motion detection target block and for which a motion vector has already been obtained, and the distance from the center of the detection range.

A motion vector detecting program for causing a computer to perform an operation in which an input image that is a moving image is divided into blocks each having a predetermined number of pixels, each block is sequentially set as a motion detection target block in the input image, a detection range is sequentially set in a reference image in accordance with the motion detection target block, pixels in each motion detection target block are compared with pixels in each detection range, and a motion vector is obtained for each motion detection target block by block matching, the program causing the computer to realize:
a thinning rate setting function of setting, according to the distance of a pixel in the detection range from the center of the detection range, a thinning rate for the pixels in the motion detection target block that are compared with that pixel in block matching; and
a thinning function of thinning out, according to the set thinning rate, the pixels in the motion detection target block and the pixels in the detection range that are used for block matching.