JP2004215215A

JP2004215215A - Motion vector detecting method

Info

Publication number: JP2004215215A
Application number: JP2003068058A
Authority: JP
Inventors: Makoto Hagai; 誠羽飼; Shinya Sumino; 眞也角野; Toshiyuki Kondo; 敏志近藤
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2002-03-14
Filing date: 2003-03-13
Publication date: 2004-07-29

Abstract

PROBLEM TO BE SOLVED: To provide a motion vector detecting method that detects an optimum motion vector even when fading occurs.
SOLUTION: The method comprises: a step (S300) of generating a first motion vector candidate MVC1 based on a reference frame Rf1 and a second motion vector candidate MVC2 based on a reference frame Rf2; steps (S302, S304) of generating an interpolated prediction block PB0 by interpolating pixel values based on a prediction block PB1 indicated by the first motion vector candidate MVC1 and a prediction block PB2 indicated by the second motion vector candidate MVC2; steps (S306-S310) of calculating a prediction error evaluation value based on the differences between the pixel values of the interpolated prediction block PB0 and those of the block to be coded; and steps (S312, S314) of detecting the first and second motion vector candidates MVC1 and MVC2 that minimize the prediction error evaluation value as the motion vectors MV1 and MV2 of the block to be coded, respectively.
COPYRIGHT: (C)2004, JPO&NCIPI

Description

【０００１】
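【Technical Field of the Invention】
The present invention relates to a motion vector detecting method that, when a moving picture is coded, detects motion vectors indicating the motion of regions within a picture.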
【０００２】
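【Prior Art】
In recent years, with the development of multimedia applications, it has become common to handle all kinds of media information, such as images, audio, and text, in a unified way by digitizing it. Digitized images, however, carry an enormous amount of data, so image compression technology is indispensable for storage and transmission. At the same time, standardization of compression technology is important so that compressed image data can be exchanged between systems. Standards for image compression include H.261 and H.263 of the ITU-T (International Telecommunication Union, Telecommunication Standardization Sector) and MPEG (Moving Picture Experts Group)-1, MPEG-2, and MPEG-4 of the ISO (International Organization for Standardization) (see, for example, Non-Patent Documents 1 and 2).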
【０００３】
これらの標準規格の動画像符号化方式に共通の技術として動き補償を伴うフレーム間予測がある。これらの動画像符号化方式の動き補償では、入力画像のフレームを所定のサイズの矩形（以降、ブロックと呼ぶ）に分割し、各ブロック毎にフレーム間の動きを示す動きベクトルから予測画素を生成する。
【０００４】
図３１は、動きベクトルを説明するための説明図である。
例えば、ビデオカメラで動きのある被写体を撮影した場合、各フレーム毎に被写体の写っているブロックの位置は移動する。つまり、動きのある被写体がフレームＲｆのブロックＢに含まれ、フレームＴｆではその被写体がブロックＢ０に含まれていれば、ブロックＢ０はブロックＢから変位することとなり、ブロックＢ０に対するその変位が動きベクトルＭＶとして表される。
【０００５】
そして、上述の動き補償では各ブロックに対して動きベクトルＭＶが検出され、その検出は、一般に、参照元となるフレーム（図３１ではフレームＲｆ）の中から、検出の対象となるブロック（図３１ではブロックＢ０）と画素値が近いブロックを探すことで行われる。
【０００６】
また、現在、ＩＴＵ−Ｔで標準化中のＨ．２６Ｌでは、符号化対象フレームの直前２枚のフレームを参照フレームとして画素補間することで予測画素を得る方法が検討されている。ここで、符号化対象フレームより表示時刻が前の２枚のフレームを参照して画素補間により予測画像（予測画素）を生成する予測方法を、前方補間予測という。
【０００７】
図３２は、２枚のフレームを用いて予測画像を生成する様子を説明するための説明図である。
この図３２に示すように、符号化対象フレームＴｆのブロックＢ０の予測画像を生成するときには、例えば、表示時刻から見て、符号化対象フレームＴｆの直前にあるフレームＲｆ１と、符号化対象フレームＴｆの２枚前にある参照フレームＲｆ２とが参照フレームとして用いられる。即ち、これらの参照フレームＲｆ１，Ｒｆ２の中からブロックＢ０と画素値が近いブロックＢ１，Ｂ２が探し出され、そのブロックＢ１，Ｂ２とブロックＢ０の位置の変位から動きベクトルＭＶ１，ＭＶ２が検出される。
【０００８】
そして、動きベクトルＭＶ１で示される参照フレームＲｆ１のブロックＢ１と、動きベクトルＭＶ２で示される参照フレームＲｆ２のブロックＢ２とからブロックＢ０の予測画像が生成される。つまり、ブロックＢ１の画素値とブロックＢ２の画素値とを用いて、ブロックＢ０に対して画素値の補間を行うことによって上記予測画像が生成される。このような画素値の補間方法には、例えば、平均値や外挿補間などがある。外挿補間はフェードなど時刻経過に対し画素値が線形に変化していく画面効果の予測に高い効果がある。
【０００９】
また、ブロックＢ０は上述のように生成された予測画像を用いて符号化される。
図３３は、上記従来の動画像符号化方式により動画像を符号化する動画像符号化装置８００の構成を示すブロック図である。
【００１０】
この動画像符号化装置８００は、マルチフレームバッファ８０１と、動き推定部８０２と、動き補償部８０３と、画像符号化部８０４と、画像復号部８０５と、可変長符号化部８０６と、動きベクトルスケーリング部８０７と、加算器８０８と、減算器８０９，８１０と、スイッチ８１１，８１２とを備え、画像信号Ｉｍｇにより示されるフレームをブロックに分割し、そのブロック毎に処理を行う。
【００１１】
減算器８０９は、動画像符号化装置８００に入力された画像信号Ｉｍｇから予測画像信号Ｐｒｅを減算し、残差信号Ｒｅｓを出力する。
画像符号化部８０４は、残差信号Ｒｅｓを取得してＤＣＴ変換・量子化などの画像符号化処理を行い、量子化済ＤＣＴ係数などを含む残差符号化信号Ｅｒを出力する。
【００１２】
画像復号部８０５は、残差符号化信号Ｅｒを取得して逆量子化・逆ＤＣＴ変換などの画像復号処理を行い、残差復号信号Ｄｒを出力する。加算器８０８は、残差復号信号Ｄｒと予測画像信号Ｐｒｅを加算し、再構成画像信号Ｒｃを出力する。また、再構成画像信号Ｒｃのうち、以降のフレーム間予測で参照される可能性がある信号は、マルチフレームバッファ８０１に格納される。
【００１３】
図３４は、動き推定部８０２の構成を示すブロック図である。
動き推定部８０２は、ブロック毎の動きベクトルを検出するものであって、動きベクトル候補生成部８２１と画素取得部８２２と減算器８２３と動きベクトル選択部８２４とを備えている。
【００１４】
動きベクトル候補生成部８２１は、符号化対象となるブロックの動きベクトルＭＶの候補として、動きベクトル候補ＭＶＣを生成する。ここで、動きベクトル候補生成部８２１は、所定の動きベクトル検出範囲内から幾つかの動きベクトル候補ＭＶＣを順次生成する。
【００１５】
画素取得部８２２は、動きベクトル候補ＭＶＣが示す参照フレームＲｆ中の１ブロックを取得して、予測ブロックＰＢとして減算器８２３に出力する。
減算器８２３は、画像信号Ｉｍｇの符号化対象となるブロックと、予測ブロックＰＢとの画素値の差分を計算し、予測誤差ブロックＲＢとして動きベクトル選択部８２４に出力する。
【００１６】
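On acquiring the prediction error block RB for each motion vector candidate MVC generated by the motion vector candidate generation unit 821, the motion vector selection unit 824 calculates, for each of these prediction error blocks RB, a prediction error evaluation value such as the SAD (sum of absolute prediction errors) or the SSD (sum of squared prediction errors) from the pixel values in the block. The motion vector selection unit 824 then selects the motion vector candidate MVC that yields the smallest prediction error evaluation value and outputs it as the motion vector MV.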
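The two evaluation values named above can be sketched as follows. This is a minimal illustration, assuming blocks are represented as lists of rows of integer pixel values; the function names `sad` and `ssd` are hypothetical and do not come from the source.

```python
def sad(target, prediction):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(t - p)
               for row_t, row_p in zip(target, prediction)
               for t, p in zip(row_t, row_p))


def ssd(target, prediction):
    """Sum of squared differences between two equally sized blocks."""
    return sum((t - p) ** 2
               for row_t, row_p in zip(target, prediction)
               for t, p in zip(row_t, row_p))


# Hypothetical 2x2 blocks: the block to be coded and one prediction block PB.
target = [[10, 12], [14, 16]]
prediction = [[9, 12], [17, 16]]
print(sad(target, prediction))  # 1 + 0 + 3 + 0 = 4
print(ssd(target, prediction))  # 1 + 0 + 9 + 0 = 10
```

The SSD penalizes large individual errors more heavily than the SAD, while the SAD needs no multiplications.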
【００１７】
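In forward interpolation prediction, the motion estimation unit 802 repeats the processing described above to detect, for the block to be coded, the motion vectors MV1 and MV2 based on the two reference frames Rf1 and Rf2.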
【００１８】
図３５は、動きベクトルＭＶ１，ＭＶ２を検出する様子を説明するための説明図である。
動き推定部８０２は、スイッチ８１１，８１２の接点が接点０側に切り換えられると、上述のように予測誤差評価値を算出することで、符号化対象フレームＴｆ中の符号化対象ブロックと画素値の近いブロック（予測誤差評価値が最小のブロック）を参照フレームＲｆ１から探し出し、符号化対象フレームＴｆ中のブロックの画素Ｐｔ０と、参照フレームＲｆ１から探し出されたブロック内において画素Ｐｔ０とブロック内の相対位置が等しい画素Ｐｔ１との変位を示す動きベクトルＭＶ１を検出する。
【００１９】
次に、動き推定部８０２は、スイッチ８１１，８１２の接点が接点１側に切り換えられると、上述と同様、符号化対象フレームＴｆ中の符号化対象ブロックと画素値の近いブロック（予測誤差評価値が最小のブロック）を参照フレームＲｆ２から探し出し、符号化対象フレームＴｆ中のブロックの画素Ｐｔ０と、参照フレームＲｆ２から探し出されたブロック内において画素Ｐｔ０とブロック内の相対位置が等しい画素Ｐｔ２との変位を示す動きベクトルＭＶ２を検出する。
【００２０】
動き補償部８０３は、前方補間予測時には、参照フレームＲｆ１における動きベクトルＭＶ１により指し示される位置のブロックと、参照フレームＲｆ２における動きベクトルＭＶ２により指し示される位置のブロックとを、マルチフレームバッファ８０１から取り出す。そして動き補償部８０３は、これらのブロックに基づいて画素値の補間処理を行って予測画像を示す予測画像信号Ｐｒｅを作成し、これを出力する。
【００２１】
なお、この前方補間予測により予測画像を得る場合の符号化対象ブロックを前方補間予測ブロックと呼ぶ。また、動き補償部８０３はブロック毎に他の予測方法、例えば表示時刻より前の１枚のフレームのみから予測を行う前方予測、に切り替えることも可能である。
【００２２】
ここで、画素値（輝度値）が時間的に変化するフェードについて説明する。上述のように被写体を含むブロックの位置はその被写体の動きに応じて変位するとともに、そのブロック内の画素値は時間的に変化する。
【００２３】
図３６は、フェードによる画素値の変化を説明するための説明図である。
上述の動きベクトルＭＶ２で示される画素Ｐｔ２の画素値は、動きベクトルＭＶ１で示される画素Ｐｔ１の画素値に変化する。このような変化は時間間隔が短ければ図３６中の線Ｌで示されるように時間に比例すると仮定することができる。
【００２４】
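Accordingly, the pixel value P0 of the pixel Pt0 in the block B0 of the frame Tf to be coded is predicted by extrapolation from the pixel values P1 and P2 of the pixels Pt1 and Pt2 in the reference frames Rf1 and Rf2, using the formula P0 = 2 × P1 - P2.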
【００２５】
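By performing this extrapolation, the motion compensation unit 803 improves prediction for fades and thereby raises coding efficiency. For pictures without a fade, the motion compensation unit 803 interpolates by averaging instead of extrapolating, which widens the choice of prediction methods and also raises coding efficiency.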
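As a small worked example of the two interpolation modes, the sketch below compares extrapolation with averaging for one pixel during a linear fade. It assumes the equally spaced frames of Fig. 32 (Rf2, Rf1, Tf in display order); the function names and pixel values are illustrative only.

```python
def extrapolate(p1, p2):
    """Linear extrapolation P0 = 2 * P1 - P2 for equally spaced frames."""
    return 2 * p1 - p2


def average(p1, p2):
    """Averaging (interpolation) of the two reference pixel values."""
    return (p1 + p2) / 2


# Hypothetical fade that brightens a pixel by 10 per frame: Rf2 -> Rf1 -> Tf.
p2, p1, actual_p0 = 100, 110, 120
print(extrapolate(p1, p2))  # 120, matches the faded pixel exactly
print(average(p1, p2))      # 105.0, lags behind the fade
```

For a static scene without a fade (P1 equal to P2) both modes return the same value, but averaging additionally suppresses noise, which is one reason to keep both available.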
【００２６】
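The motion vector scaling unit 807 scales the motion vector MV1.
As shown in Fig. 35, the motion vector scaling unit 807 scales the motion vector MV1 detected by the motion estimation unit 802 based on the display time difference T1 between the frame Tf to be coded and the reference frame Rf1, and the display time difference T2 between the frame Tf to be coded and the reference frame Rf2.
【００２７】
That is, the motion vector scaling unit 807 multiplies the motion vector MV1 by the ratio of the display time difference T2 to the display time difference T1 (T2/T1), thereby scaling MV1 and obtaining the motion vector MVs.
【００２８】
The information on the display time differences T1 and T2 is obtained from the multi-frame buffer 801. That is, each frame indicated by the reconstructed image signal Rc is recorded in the multi-frame buffer 801 together with information on its display time.
【００２９】
The subtractor 810 subtracts the motion vector MVs from the motion vector MV2 detected by the motion estimation unit 802 and outputs the differential vector MVd shown in Fig. 35.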
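The scaling and subtraction performed by units 807 and 810 can be sketched as below. This is only an illustration: vectors are assumed to be (x, y) tuples, exact rational arithmetic is used to sidestep the rounding rule, which the text does not specify, and the function names are hypothetical.

```python
from fractions import Fraction


def scale_mv(mv1, t1, t2):
    """MVs = MV1 * (T2 / T1), computed exactly with rationals."""
    ratio = Fraction(t2, t1)
    return (mv1[0] * ratio, mv1[1] * ratio)


def differential_mv(mv1, mv2, t1, t2):
    """MVd = MV2 - MVs, the vector that is coded instead of MV2."""
    mvs = scale_mv(mv1, t1, t2)
    return (mv2[0] - mvs[0], mv2[1] - mvs[1])


# Constant motion: Rf2 is twice as far away as Rf1, and MV2 is twice MV1.
print(differential_mv((3, -1), (6, -2), t1=1, t2=2) == (0, 0))  # True
# Slightly accelerating motion leaves a small residual vector.
print(differential_mv((3, -1), (7, -2), t1=1, t2=2) == (1, 0))  # True
```

When the motion is uniform across the three frames, MVd collapses to the zero vector, which is what makes it cheap to code.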
【００３０】
可変長符号化部８０６は、動きベクトルＭＶ１と差分ベクトルＭＶｄと残差符号化信号Ｅｒとを可変長符号化し、画像符号化信号Ｂｓを出力する。
以上、説明した処理により動画像符号化装置８００は画像信号Ｉｍｇを符号化して、画像符号化信号Ｂｓを出力する。
【００３１】
図３７は、画像符号化信号Ｂｓのフォーマットの概念を示す概念図である。
画像符号化信号Ｂｓには、前方補間予測により符号化されたフレームを示す内容のフレーム符号化信号Ｂｓｆ９が含まれ、さらにこのフレーム符号化信号Ｂｓｆ９には符号化された前方補間予測ブロック（符号化対象ブロック）を示す内容のブロック符号化信号Ｂｓｂ９が含まれている。そしてさらに、このブロック符号化信号Ｂｓｂ９には、符号化された動きベクトルＭＶ１を示す内容の第１動きベクトル符号化信号Ｂｓ１と、符号化された差分ベクトルＭＶｄを示す内容の差分ベクトル符号化信号Ｂｓｄとが含まれる。
【００３２】
このような動画像符号化装置８００が実行する動きベクトル符号化方法では、動きベクトルスケーリング部８０７及び減算器８１０並びに可変長符号化部８０６により、動きベクトルＭＶ１と差分ベクトルＭＶｄが符号化されるため、符号化対象フレームＴｆ、参照フレームＲｆ１及び参照フレームＲｆ２の各フレーム間で、画面内の被写体の動きの向き・速度が一定と仮定すると、差分ベクトルＭＶｄが０に近くなって、動きベクトルの符号化効率が良い。
【００３３】
次に、動画像符号化装置８００により符号化された画像を復号化する動画像復号化装置について説明する。
図３８は、従来の動画像復号化装置の構成を示すブロック図である。
【００３４】
この動画像復号化装置９００は、マルチフレームバッファ９０１と、動き補償部９０３と、画像復号化部９０５と、可変長復号化部９０６と、動きベクトルスケーリング部９０７と、加算器９０９，９１０とを備えている。
【００３５】
可変長復号化部９０６は、画像符号化信号Ｂｓを取得して可変長復号を行い、残差符号化信号Ｅｒと動きベクトルＭＶ１と差分ベクトルＭＶｄとを出力する。画像復号化部９０５は、残差符号化信号Ｅｒを取得して、逆量子化や逆ＤＣＴ変換などの画像復号処理を行い、残差復号信号Ｄｒを出力する。
【００３６】
動きベクトルスケーリング部９０７は、動画像符号化装置８００の動きベクトルスケーリング部８０７と同様、可変長復号化部９０６から出力された動きベクトルＭＶ１を取得すると、符号化対象フレームＴｆと参照フレームＲｆ１との表示時間差Ｔ１、及び符号化対象フレームＴｆと参照フレームＲｆ２との表示時間差Ｔ２に基づいて、動きベクトルＭＶ１に対してスケーリングを行い、その結果生成された動きベクトルＭＶｓを出力する。
【００３７】
The adder 910 adds the scaled motion vector MVs and the differential vector MVd and outputs the sum as the motion vector MV2.
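On the decoder side the same scaling is repeated and the differential vector is added back. A minimal sketch, under the same assumptions as any illustration here (vectors as (x, y) tuples, exact rational scaling, hypothetical function name):

```python
from fractions import Fraction


def reconstruct_mv2(mv1, mvd, t1, t2):
    """MV2 = MVs + MVd, where MVs = MV1 * (T2 / T1)."""
    ratio = Fraction(t2, t1)
    return (mv1[0] * ratio + mvd[0], mv1[1] * ratio + mvd[1])


# Decoded MV1 = (3, -1), decoded MVd = (1, 0), display time differences 1 and 2.
print(reconstruct_mv2((3, -1), (1, 0), t1=1, t2=2) == (7, -2))  # True
```

Encoder and decoder must use identical display time differences T1 and T2, otherwise the reconstructed MV2 differs from the one the encoder used for prediction.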
動き補償部９０３は、動画像符号化装置８００の動き補償部８０３と同様、参照フレームＲｆ１における動きベクトルＭＶ１により指し示される位置のブロックと、参照フレームＲｆ２における動きベクトルＭＶ２により指し示される位置のブロックとを、マルチフレームバッファ９０１から取り出す。そして動き補償部９０３は、これらのブロックに基づいて画素値の補間処理を行って予測画像信号Ｐｒｅを作成しこれを出力する。
【００３８】
加算器９０９は、動き補償部９０３からの予測画像信号Ｐｒｅと、画像復号化部９０５からの残差復号信号Ｄｒとを加算し、その結果を復号画像信号Ｄｉとして出力する。
【００３９】
マルチフレームバッファ９０１は、動画像符号化装置８００のマルチフレームバッファ８０１と同様の構成を有し、復号画像信号Ｄｉのうち、フレーム間予測で参照される可能性がある信号を格納する。
【００４０】
このような動画像復号化装置９００は画像符号化信号Ｂｓを復号化して、その復号結果を復号画像信号Ｄｉとして出力する。
以上のように動画像符号化装置８００を含む多くの従来の動画像符号化装置における動きベクトルの検出方法には、ＳＡＤやＳＳＤなどの予測誤差評価値が用いられている。
【００４１】
【Non-Patent Document 1】
"Video Coding for Low Bit Rate Communication", H.263, ITU-T, March 1996
【００４２】
【Non-Patent Document 2】
Draft for "H.263++" Annexes U, V, and W to Recommendation H.263 (U.4 Decoder Process), ITU-T, November 2000
【００４３】
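【Problems to Be Solved by the Invention】
However, the conventional motion vector detecting method described above does not take into account changes in pixel values caused by fades. Differences therefore readily arise between the pixel values of the block to be coded and those of the predicted image; in other words, prediction error evaluation values such as the SAD and SSD readily increase, and the optimum motion vector cannot be detected.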
【００４４】
An object of the present invention is therefore to provide a motion vector detecting method that detects the optimum motion vector even when a fade occurs.
【００４５】
【課題を解決するための手段】
上記目的を達成するために、本発明の動きベクトル検出方法は、動画像を構成するピクチャの中のブロックにおける、他のピクチャからの変位を示す動きベクトルを検出する動きベクトル検出方法であって、前記検出対象ブロックの第１の参照ピクチャに基づく第１の動きベクトル候補を生成する第１の候補生成ステップと、前記検出対象ブロックの第２の参照ピクチャに基づく第２の動きベクトル候補を生成する第２の候補生成ステップと、前記第１の参照ピクチャにおける第１の動きベクトル候補が示す第１の予測用ブロックと、前記第２の参照ピクチャにおける第２の動きベクトル候補が示す第２の予測用ブロックとに基づいて、互いに対応する画素の画素値の補間を行うことで補間予測ブロックを作成する補間ステップと、前記補間予測ブロックと前記検出対象ブロックとの互いに対応する画素の画素値の差に基づく評価値を算出する算出ステップと、前記評価値に基づいて、前記第１の候補生成ステップで生成された複数の第１の動きベクトル候補の中から１つを選択するとともに、前記第２の候補生成ステップで生成された複数の第２の動きベクトル候補の中から１つを選択する選択ステップと、選択された前記第１の動きベクトル候補を、前記検出対象ブロックにおける前記第１の参照ピクチャに基づく第１の動きベクトルとして検出するとともに、選択された前記第２の動きベクトル候補を、前記検出対象ブロックにおける前記第２の参照ピクチャに基づく第２の動きベクトルとして検出する検出ステップとを含むことを特徴とする。例えば、前記選択ステップでは、前記評価値が最小となる第１及び第２の動きベクトル候補をそれぞれ１つ選択する。
【００４６】
これにより、画素値の補間処理を行った結果に基づいて評価値が算出されるため、フェードが生じる場合であってもその影響による評価値の誤差の増加を防止して、最適な動きベクトルを検出することができる。
【００４７】
さらに、本発明に係る動きベクトル符号化方法は、動画像を構成するピクチャの中のブロックにおける、他のピクチャからの変位を示す動きベクトルを符号化する動きベクトル符号化方法であって、上記本発明に係る動きベクトル検出方法で第１及び第２の動きベクトルを検出する動きベクトル検出ステップと、前記第１及び第２の動きベクトルをそれぞれ符号化する符号化ステップとを含むことを特徴とする。
【００４８】
これにより、最適な動きベクトルをそれぞれ符号化することができる。
また、本発明に係る動きベクトル符号化方法は、動画像における符号化の対象となる符号化対象ピクチャに対して、他の２つのピクチャを参照ピクチャとして参照することで第１及び第２の動きベクトルの符号化方法を特定し、前記第１及び第２の動きベクトルに関する情報を符号化する動きベクトル符号化方法であって、ピクチャがその表示時刻に関する情報と共に記録される第１の領域と前記第１の領域に記録されていないピクチャが記録される第２の領域とを有する記憶手段から前記２つの参照ピクチャを読み出す読出ステップと、前記２つの参照ピクチャのうちの少なくとも１つが前記第２の領域から読み出されたか否かを判定する判定ステップと、前記判定ステップで、前記２つの参照ピクチャのうちの少なくとも１つが前記第２の領域から読み出されたと判定されたときには、前記第１の動きベクトルと前記第２の動きベクトルとの差分を示す差分ベクトルを求める差分ベクトル導出ステップと、前記第１の動きベクトルと前記差分ベクトルとを符号化する符号化ステップとを含むことを特徴としても良い。
【００４９】
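Accordingly, when at least one of the two reference pictures has been read out from the second area of the memory, the motion vector is not scaled as it is in the conventional example. Unreasonable scaling is thus avoided, and motion vector coding efficiency can be improved.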
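The decision described above can be sketched as a single function. It assumes that, when neither reference picture comes from the second area, the conventional scaled difference is used; the text implies this but states it only for the conventional example. The function name, the tuple representation, and the use of `None` for missing display time differences are all illustrative.

```python
from fractions import Fraction


def differential_vector(mv1, mv2, t1, t2, from_second_area):
    """Return the vector coded alongside MV1.

    from_second_area: True if at least one reference picture was read from
    the second area (no usable display time); scaling is then skipped.
    """
    if from_second_area:
        base = mv1  # plain difference MV2 - MV1, no scaling
    else:
        ratio = Fraction(t2, t1)  # assumed conventional scaling by T2 / T1
        base = (mv1[0] * ratio, mv1[1] * ratio)
    return (mv2[0] - base[0], mv2[1] - base[1])


# Reference picture without display time information: T1 and T2 are unknown.
print(differential_vector((3, -1), (4, -1), None, None, True) == (1, 0))  # True
# Both reference pictures in the first area: scaled difference as before.
print(differential_vector((3, -1), (6, -2), 1, 2, False) == (0, 0))       # True
```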
【００５０】
一方、本発明に係る動きベクトル復号化方法は、動画像における復号化の対象となる復号化対象ピクチャに対して、他の２つのピクチャを参照ピクチャとして参照することで第１及び第２の動きベクトルの復号化方法を特定し、前記第１及び第２の動きベクトルに関する情報を符号化した符号化情報を復号化する動きベクトル復号化方法であって、前記符号化情報から前記第１の動きベクトルと前記第１及び第２の動きベクトルに関する関連ベクトルとを復号化する復号化ステップと、ピクチャがその表示時刻に関する情報と共に記録される第１の領域と前記第１の領域に記録されていないピクチャが記録される第２の領域とを有する記憶手段から前記２つの参照ピクチャを読み出す読出ステップと、前記２つの参照ピクチャのうちの少なくとも１つが前記第２の領域から読み出されたか否かを判定する判定ステップと、前記判定ステップで、前記２つの参照ピクチャのうちの少なくとも１つが前記第２の領域から読み出されたと判定されたときには、前記関連ベクトルと前記第１の動きベクトルを加算して前記第２の動きベクトルを求める演算ステップとを含むことを特徴とする。
【００５１】
これにより、前記２つの参照ピクチャのうちの少なくとも１つがメモリの第２の領域から読み出されたときには、従来例のように動きベクトルのスケーリングが行われないため、無理なスケーリングの実行を省き、動きベクトルの復号化効率の向上を図ることができる。
【００５２】
なお、本発明は、上記動きベクトル検出方法や動きベクトル符号化方法を用いる動画像符号化装置、プログラム、及びそのプログラムを格納する記憶媒体としても実現することができる。
【００５３】
【発明の実施の形態】
（実施の形態１）
以下、本発明の第１の実施の形態における動画像符号化装置について図面を参照しながら説明する。
【００５４】
図１は、本実施の形態における動画像符号化装置３００Ａの構成を示すブロック図である。
本実施の形態の動画像符号化装置３００Ａは、最適な動きベクトルを検出して符号化するものであって、マルチフレームバッファ３０１と、動き推定部３０２と、動き補償部３０３と、画像符号化部３０４と、画像復号部３０５と、可変長符号化部３０６と、加算器３０８及び減算器３０９とを備える。
【００５５】
動き推定部３０２は、マルチフレームバッファ３０１から読み出された参照フレームＲｆ１，Ｒｆ２のそれぞれに基づいて、画像信号Ｉｍｇにより示される符号化対象フレームＴｆ中のブロック（符号化対象ブロック）の最適な動きベクトルＭＶ１，ＭＶ２を検出する。
【００５６】
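Like the motion compensation unit 803 of the moving picture coding apparatus 800, the motion compensation unit 303, in forward interpolation prediction, retrieves from the multi-frame buffer 301 the block in the reference frame Rf1 at the position pointed to by the motion vector MV1 and the block in the reference frame Rf2 at the position pointed to by the motion vector MV2. Based on these blocks, the motion compensation unit 303 interpolates pixel values by extrapolation as described with reference to Fig. 36, and generates and outputs the predicted image signal Pre representing the predicted image. Interpolating pixel values by extrapolation in this way improves prediction for fades. The motion compensation unit 303 may also switch the prediction method for each block to be coded between forward interpolation prediction and another prediction method, for example forward prediction, which predicts from only one frame whose display time precedes that of the frame to be coded.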
【００５７】
減算器３０９は、画像信号Ｉｍｇから予測画像信号Ｐｒｅを減算して残差信号Ｒｅｓを出力する。
画像符号化部３０４は、残差信号Ｒｅｓを取得してＤＣＴ変換・量子化などの画像符号化処理を行い、量子化済ＤＣＴ係数などを含む残差符号化信号Ｅｒを出力する。
【００５８】
画像復号部３０５は、残差符号化信号Ｅｒを取得して、逆量子化・逆ＤＣＴ変換などの画像復号処理を行い、残差復号信号Ｄｒを出力する。
The adder 308 adds the residual decoded signal Dr and the predicted image signal Pre and outputs the reconstructed image signal Rc.
【００５９】
マルチフレームバッファ３０１は、再構成画像信号Ｒｃのうち、フレーム間予測で参照される可能性がある信号を格納する。
可変長符号化部３０６は、動き推定部３０２で検出された動きベクトルＭＶ１，ＭＶ２と、画像符号化部３０４から出力される残差符号化信号Ｅｒとを可変長符号化し、これらの符号化された結果を画像符号化信号Ｂｓ１として出力する。
【００６０】
このような動画像符号化装置３００Ａは、画像信号Ｉｍｇにより示される符号化対象フレームＴｆの各ブロックを符号化するときには、２枚の参照フレームＲｆ１，Ｒｆ２を参照して、これらの参照フレームＲｆ１，Ｒｆ２に基づく符号化対象ブロックの動きベクトルＭＶ１，ＭＶ２を検出する。そして動画像符号化装置３００Ａは、検出した動きベクトルＭＶ１，ＭＶ２をそれぞれ符号化するとともに、参照フレームＲｆ１，Ｒｆ２及び動きベクトルＭＶ１，ＭＶ２から予測される予測画像と、符号化対象ブロックとの画素値の差を符号化する。
【００６１】
ここで、参照フレームＲｆ１，Ｒｆ２はそれぞれ、符号化対象フレームＴｆに対して時間的に前方にあっても後方にあっても良い。
図２は、参照フレームＲｆ１，Ｒｆ２と符号化対象フレームＴｆの時間的な位置関係を示すフレーム配置図である。
【００６２】
この図２中の（ａ）に示すように、動画像符号化装置３００Ａは、符号化対象フレームＴｆの前方に位置するフレームを参照フレームＲｆ１，Ｒｆ２として参照しても良く、図２中の（ｂ）に示すように、符号化対象フレームＴｆの後方に位置するフレームを参照フレームＲｆ１，Ｒｆ２として参照しても良い。さらに、図２中の（ｃ）に示すように、符号化対象フレームＴｆの前方に位置する１つのフレームを参照フレームＲｆ２として参照し、符号化対象フレームＴｆの後方に位置する１つのフレームを参照フレームＲｆ１として参照しても良く、その逆、つまり符号化対象フレームＴｆの前方に位置する１つのフレームを参照フレームＲｆ１として参照し、符号化対象フレームＴｆの後方に位置する１つのフレームを参照フレームＲｆ２として参照しても良い。
【００６３】
このような本実施の形態における動画像符号化装置３００Ａの動き推定部３０２について詳細に説明する。
動き推定部３０２は、上述のように参照フレームＲｆ１，Ｒｆ２のそれぞれに基づいて、符号化対象ブロックの動きベクトルＭＶ１，ＭＶ２を検出し、これらの動きベクトルＭＶ１，ＭＶ２を動き補償部３０３に出力する。
【００６４】
図３は、動き推定部３０２の構成を示すブロック図である。
本実施の形態における上述の動き推定部３０２は、第１動きベクトル候補生成部３２１と、第２動きベクトル候補生成部３２２と、画素取得部３２３，３２４と、補間部３２５と、減算器３２６と、動きベクトル選択部３２７とを備えている。
【００６５】
第１動きベクトル候補生成部３２１は、符号化対象ブロックの参照フレームＲｆ１に基づく動きベクトルＭＶ１の候補を、所定の検出範囲から全て抽出し、これらをそれぞれ第１動きベクトル候補ＭＶＣ１として順次出力する。
【００６６】
第２動きベクトル候補生成部３２２は、第１動きベクトル候補生成部３２１と同様、符号化対象ブロックの参照フレームＲｆ２に基づく動きベクトルＭＶ２の候補を、所定の検出範囲から全て抽出し、これらをそれぞれ第２動きベクトル候補ＭＶＣ２として順次出力する。
【００６７】
図４は、第１動きベクトル候補ＭＶＣ１及び第２動きベクトル候補ＭＶＣ２の生成方法を説明するための説明図である。
第１動きベクトル候補生成部３２１は、参照フレームＲｆ１の動きベクトル検出範囲ＳＲから何れかの画素Ｐｔ１を選択し、符号化対象ブロックの画素Ｐｔ０と画素Ｐｔ１との変位を第１動きベクトル候補ＭＶＣ１として出力し、第２動きベクトル候補生成部３２２は、参照フレームＲｆ２の動きベクトル検出範囲ＳＲ１から何れかの画素Ｐｔ２を選択し、符号化対象ブロックの画素Ｐｔ０と画素Ｐｔ２との変位を第２動きベクトル候補ＭＶＣ２として出力する。
【００６８】
画素取得部３２３は、第１動きベクトル候補ＭＶＣ１が示す参照フレームＲｆ１中の１ブロック（図４中の画素Ｐｔ１を含むブロック）を取得して、予測ブロックＰＢ１として補間部３２５に出力する。
【００６９】
画素取得部３２４は、第２動きベクトル候補ＭＶＣ２が示す参照フレームＲｆ２中の１ブロック（図４中の画素Ｐｔ２を含むブロック）を取得して、予測ブロックＰＢ２として補間部３２５に出力する。
【００７０】
補間部３２５は、予測ブロックＰＢ１，ＰＢ２のブロック内での相対位置が互いに等しい２つの画素を用いて画素値の補間を行うことで、符号化対象ブロックに対する補間予測ブロックＰＢ０を作成し、これを減算器３２６に出力する。
【００７１】
このような画素値の補間処理は、前方補間予測の場合であれば、図３６で説明したように、２つの画素値を外挿することにより行われる。つまり、図３６に示すように、予測ブロックＰＢ１内の画素Ｐｔ１と、予測ブロックＰＢ２内の画素Ｐｔ２とがブロック内での相対位置が互いに等しい場合、画素Ｐｔ１の画素値Ｐ１と画素Ｐｔ２の画素値Ｐ２とから計算式「Ｐ０’＝２×Ｐ１−Ｐ２」の計算を行い、画素Ｐｔ１，Ｐｔ２に対応する符号化対象ブロック内の画素Ｐｔ０の補間予測画素値Ｐ０’を算出する。そして補間部３２５は、符号化対象ブロック内の全ての画素に対する補間予測画素値Ｐ０’を算出することで補間予測ブロックＰＢ０を作成している。
【００７２】
減算器３２６は、画像信号Ｉｍｇに示される符号化対象ブロックと、補間予測ブロックＰＢ０との互いに対応する画素の画素値の差分（Ｐ０−Ｐ０’）を計算し、その結果を予測誤差ブロックＲＢ０として動きベクトル選択部３２７に出力する。
【００７３】
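On acquiring the prediction error block RB0 from the subtractor 326, the motion vector selection unit 327 calculates a prediction error evaluation value such as the SAD (sum of absolute prediction errors) given by Equation (1) below or the SSD (sum of squared prediction errors) given by Equation (2) below.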
【００７４】
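【Equation 1】
\[ \mathrm{SAD} = \sum_{i} \left| P0_i - P0'_i \right| \qquad (1) \]
\[ \mathrm{SSD} = \sum_{i} \left( P0_i - P0'_i \right)^{2} \qquad (2) \]
Here the index i runs over all pixels of the block to be coded, P0_i is the pixel value of the block to be coded, and P0'_i is the corresponding interpolated prediction pixel value.
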
このような予測誤差評価値は、第１動きベクトル候補生成部３２１で生成される第１動きベクトル候補ＭＶＣ１と、第２動きベクトル候補生成部３２２で生成される第２動きベクトル候補ＭＶＣ２との全ての組合せに対して算出される。
【００７５】
そして、動きベクトル選択部３２７は、予測誤差評価値が最も小さくなる第１動きベクトル候補ＭＶＣ１及び第２動きベクトル候補ＭＶＣ２を選択し、選択した第１動きベクトル候補ＭＶＣ１を、参照フレームＲｆ１に基づく符号化対象ブロックの動きベクトルＭＶ１として出力するとともに、選択した第２動きベクトル候補ＭＶＣ２を、参照フレームＲｆ２に基づく符号化対象ブロックの動きベクトルＭＶ２として出力する。
【００７６】
これにより、本実施の形態の動き推定部３０２は、フェードによる画素値の変化が生じても、これを考慮して動きベクトルを検出するため、最適な動きベクトルを検出することができる。
【００７７】
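The variable-length coding unit 306 then variable-length codes the motion vectors MV1 and MV2 detected by the motion estimation unit 302 as described above and includes them in the coded image signal Bs1. The variable-length coding unit 306 may instead subtract from MV1 and MV2 their respective predicted values, obtained from neighboring blocks around the block to be coded, and code the resulting differences.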
【００７８】
図５は、画像符号化信号Ｂｓ１のフォーマットの概念を示す概念図である。
画像符号化信号Ｂｓ１には、符号化されたフレームを示す内容のフレーム符号化信号Ｂｓｆ１が含まれ、さらにこのフレーム符号化信号Ｂｓｆ１には符号化されたブロックを示す内容のブロック符号化信号Ｂｓｂ１が含まれている。そしてさらに、このブロック符号化信号Ｂｓｂ１には、符号化された動きベクトルＭＶ１を示す内容の第１動きベクトル符号化信号Ｂｓ１と、符号化された動きベクトルＭＶ２を示す内容の第２動きベクトル符号化信号Ｂｓ２とが含まれる。
【００７９】
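Fig. 6 is a flowchart showing the operation of the moving picture coding apparatus 300A from detection to coding of the motion vectors.
First, the motion estimation unit 302 of the moving picture coding apparatus 300A generates one first motion vector candidate MVC1 and one second motion vector candidate MVC2 (step S300).
【００８０】
The motion estimation unit 302 then acquires the prediction block PB1 indicated by the first motion vector candidate MVC1 and the prediction block PB2 indicated by the second motion vector candidate MVC2 (step S302).
【００８１】
Next, the motion estimation unit 302 generates the interpolated prediction block PB0 by pixel interpolation from the prediction blocks PB1 and PB2 (step S304).
The motion estimation unit 302 then acquires the block to be coded (step S306), computes the difference between that block and the interpolated prediction block PB0, and generates the prediction error block RB0 (step S308).
【００８２】
The motion estimation unit 302 calculates the prediction error evaluation value from the prediction error block RB0 generated in step S308 (step S310), and judges whether the prediction error evaluation value has been calculated for every combination of first motion vector candidate MVC1 and second motion vector candidate MVC2 (step S312).
【００８３】
If the motion estimation unit 302 judges that the evaluation value has not yet been calculated for every combination (N in step S312), it generates a combination of MVC1 and MVC2 different from the previous ones and repeats the operation from step S300. If it judges that the evaluation value has been calculated for every combination (Y in step S312), it detects as the motion vector MV1 the first motion vector candidate MVC1 that was generated in step S300 when the smallest of all the evaluation values calculated in step S310 was obtained, and detects the second motion vector candidate MVC2 of that same combination as the motion vector MV2 (step S314).
【００８４】
The variable-length coding unit 306 of the moving picture coding apparatus 300A then codes each of the motion vectors MV1 and MV2 detected in step S314 (step S316).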
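Steps S300 to S314 amount to an exhaustive search over all pairs of candidates. The following is a minimal sketch under several assumptions that are not in the source: frames are lists of rows of integers, vectors are (dy, dx) offsets of the block's top-left corner, both references use the same square search range, candidates pointing outside the frame are skipped, and the SAD with the extrapolation 2 × P1 - P2 is the evaluation value. All names are hypothetical.

```python
from itertools import product


def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame given as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad_extrapolated(target, pb1, pb2):
    """SAD between the target block and the prediction PB0 = 2 * PB1 - PB2."""
    return sum(abs(t - (2 * a - b))
               for rt, ra, rb in zip(target, pb1, pb2)
               for t, a, b in zip(rt, ra, rb))


def joint_search(tf, rf1, rf2, top, left, size, search):
    """Evaluate every (MVC1, MVC2) pair and return (cost, MV1, MV2)."""
    target = get_block(tf, top, left, size)                    # S306
    height, width = len(rf1), len(rf1[0])
    candidates = [(dy, dx)
                  for dy, dx in product(range(-search, search + 1), repeat=2)
                  if 0 <= top + dy <= height - size
                  and 0 <= left + dx <= width - size]
    best = None
    for mvc1 in candidates:                                    # S300
        pb1 = get_block(rf1, top + mvc1[0], left + mvc1[1], size)      # S302
        for mvc2 in candidates:
            pb2 = get_block(rf2, top + mvc2[0], left + mvc2[1], size)  # S302
            cost = sad_extrapolated(target, pb1, pb2)          # S304-S310
            if best is None or cost < best[0]:
                best = (cost, mvc1, mvc2)
    return best                                                # S312, S314


def make_frame(left, offset):
    """Hypothetical 4x6 test frame: a 2x2 textured object on a black background."""
    frame = [[0] * 6 for _ in range(4)]
    pattern = [[10, 20], [30, 40]]
    for r in range(2):
        for c in range(2):
            frame[1 + r][left + c] = pattern[r][c] + offset
    return frame


# The object moves one pixel right per frame while brightening by 10 (a fade).
rf2, rf1, tf = make_frame(0, 0), make_frame(1, 10), make_frame(2, 20)
print(joint_search(tf, rf1, rf2, top=1, left=2, size=2, search=2))
# (0, (0, -1), (0, -2))
```

The cost of this search grows with the square of the number of candidates per reference frame, which is the motivation for the reduced searches of the later embodiments.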
【００８５】
このように本実施の形態では、画素値の補間処理を行った結果に基づいて予測誤差評価値を算出するため、フェードが生じる場合であってもその影響による予測誤差評価値の増加を防止して、最適な動きベクトルを検出することができる。また、その結果、動きベクトルの符号化効率を向上することができる。
【００８６】
（実施の形態２）
以下、本発明の第２の実施の形態における動画像復号化装置について図面を参照しながら説明する。
【００８７】
図７は、本実施の形態における動画像復号化装置３００Ｂの構成を示すブロック図である。
本実施の形態の動画像復号化装置３００Ｂは、実施の形態１の動画像符号化装置３００Ａにより符号化された動画像を復号化するものであって、可変長復号化部３３６と、動き補償部３３３と、画像復号化部３３５と、マルチフレームバッファ３３１と、加算器３３９とを備えている。
【００８８】
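The variable-length decoding unit 336 acquires the coded image signal Bs1, performs variable-length decoding, and outputs the coded residual signal Er, the motion vector MV1, and the motion vector MV2. When MV1 and MV2 have each been coded after subtraction of predicted values obtained from neighboring blocks around the block to be decoded, the variable-length decoding unit 336 may decode each coded difference and add the corresponding predicted value to it to generate the motion vectors MV1 and MV2.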
【００８９】
画像復号化部３３５は、残差符号化信号Ｅｒを取得して、逆量子化や逆ＤＣＴ変換などの画像復号処理を行い、残差復号信号Ｄｒを出力する。
動き補償部３３３は、動画像符号化装置３００Ａの動き補償部３０３と同様に、前方補間予測時には、参照フレームＲｆ１における動きベクトルＭＶ１により指し示される位置のブロックと、参照フレームＲｆ２における動きベクトルＭＶ２により指し示される位置のブロックとを、マルチフレームバッファ３３１から取り出す。そして、動き補償部３３３はこれらのブロックに基づいて、図３６で説明したような外挿による画素値の補間処理を行って予測画像信号Ｐｒｅを作成してこれを出力する。
【００９０】
加算器３３９は、動き補償部３３３からの予測画像信号Ｐｒｅと、画像復号化部３３５からの残差復号信号Ｄｒとを加算し、その結果を復号画像信号Ｄｉとして出力する。
【００９１】
マルチフレームバッファ３３１は、復号画像信号Ｄｉのうち、フレーム間予測で参照される可能性がある信号を格納する。
このような本実施の形態における動画像復号化装置３００Ｂは、動きベクトルＭＶ１，ＭＶ２に基づく動き補償を行うことで、動画像符号化装置３００Ａで符号化された画像を正確に復号することができる。
【００９２】
（実施の形態３）
以下、本発明の第３の実施の形態における動画像符号化装置について図面を参照しながら説明する。
【００９３】
図８は、本実施の形態における動画像符号化装置４００Ａの構成を示すブロック図である。
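Like that of Embodiment 1, the moving picture coding apparatus 400A of this embodiment detects and codes optimum motion vectors, and comprises a multi-frame buffer 401, a motion estimation unit 402, a motion compensation unit 403, an image coding unit 404, an image decoding unit 405, a variable-length coding unit 406, an adder 408, and a subtractor 409.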
【００９４】
ここで本実施の形態におけるマルチフレームバッファ４０１と、動き補償部４０３と、画像符号化部４０４と、画像復号部４０５と、加算器４０８及び減算器４０９とは、実施の形態１のマルチフレームバッファ３０１と、動き補償部３０３と、画像符号化部３０４と、画像復号部３０５と、加算器３０８及び減算器３０９とそれぞれ同様の機能及び構成を有する。
【００９５】
このような本実施の形態の動画像符号化装置４００Ａでは、実施の形態１と比べて、動きベクトルの検出方法と動きベクトルの符号化方法とが異なっており、前方補間予測時であっても動きベクトルを実質的に１つだけ検出してこれを符号化する点に特徴がある。
【００９６】
図９は、本実施の形態の動き推定部４０２の構成を示すブロック図である。
本実施の形態における動き推定部４０２は、第１動きベクトル候補生成部４２１と、動きベクトルスケーリング部４２２と、画素取得部４２３，４２４と、補間部４２５と、減算器４２６と、動きベクトル選択部４２７とを備えている。そして、このような動き推定部４０２は、マルチフレームバッファ４０１から読み出された参照フレームＲｆ１，Ｒｆ２のそれぞれに基づいて、画像信号Ｉｍｇにより示される符号化対象フレームＴｆ中のブロック（符号化対象ブロック）の最適な動きベクトルＭＶ１，ＭＶ２を検出する。
【００９７】
ここで、第１動きベクトル候補生成部４２１と画素取得部４２３，４２４と補間部４２５と減算器４２６と動きベクトル選択部４２７は、実施の形態１の動き推定部３０２における第１動きベクトル候補生成部３２１と画素取得部３２３，３２４と補間部３２５と減算器３２６と動きベクトル選択部３２７とそれぞれ同様の機能及び構成を有している。
【００９８】
即ち本実施の形態における動き推定部４０２は、実施の形態１の第２動きベクトル候補生成部３２２の代わりに動きベクトルスケーリング部４２２を備えており、第１動きベクトル候補ＭＶＣ１をスケーリングすることで第２動きベクトル候補ＭＶＣ２を生成している。
【００９９】
図１０は、第１動きベクトル候補ＭＶＣ１及び第２動きベクトル候補ＭＶＣ２の生成方法を説明するための説明図である。
第１動きベクトル候補生成部４２１は、参照フレームＲｆ１の動きベクトル検出範囲ＳＲから何れかの画素Ｐｔ１を選択し、符号化対象ブロックの画素Ｐｔ０と画素Ｐｔ１との変位を第１動きベクトル候補ＭＶＣ１として出力する。
【０１００】
動きベクトルスケーリング部４２２は、上述のように生成された第１動きベクトル候補ＭＶＣ１を取得すると、符号化対象フレームＴｆ及び参照フレームＲｆ１が表示される時間差を示す表示時間差Ｔ１と、符号化対象フレームＴｆ及び参照フレームＲｆ２が表示される時間差を示す表示時間差Ｔ２とに基づいて、第１動きベクトル候補ＭＶＣ１に対してスケーリングを行う。
【０１０１】
即ち、動きベクトルスケーリング部４２２は、表示時間差Ｔ１に対する表示時間差Ｔ２の割合（Ｔ２／Ｔ１）を、第１動きベクトル候補ＭＶＣ１に乗じることで、第１動きベクトル候補ＭＶＣ１に対するスケーリングを行い、動きベクトルＭＶＣｓを求める。
【０１０２】
そして動きベクトルスケーリング部４２２はこのようにスケーリングして求めた動きベクトルＭＶＣｓを画素取得部４２４と動きベクトル選択部４２７へ出力する。
【０１０３】
画素取得部４２４と動きベクトル選択部４２７は、動きベクトルスケーリング部４２２から動きベクトルＭＶＣｓを取得すると、この動きベクトルＭＶＣｓを第２動きベクトル候補ＭＶＣ２として扱い、それぞれ実施の形態１と同様の動作を行う。
【０１０４】
また、第１動きベクトル候補生成部４２１は、動きベクトル検出範囲ＳＲから順次全ての第１動きベクトル候補ＭＶＣ１を作成し、動きベクトルスケーリング部４２２は、その第１動きベクトル候補ＭＶＣ１を取得するごとにスケーリングを行い、動きベクトルＭＶＣｓを作成する。
【０１０５】
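Each time the first motion vector candidate generation unit 421 generates a first motion vector candidate MVC1, the motion vector selection unit 427 calculates the prediction error evaluation value based on that candidate MVC1 and the motion vector MVCs derived from it, and finally selects the first motion vector candidate MVC1 and the motion vector MVCs that minimize the prediction error evaluation value. The motion vector selection unit 427 then outputs the selected MVC1 and MVCs as the motion vectors MV1 and MV2.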
【０１０６】
上述のような動き推定部４０２の一連の動作について以下に説明する。
まず、第１動きベクトル候補生成部４２１は、動きベクトルＭＶ１の候補である第１動きベクトル候補ＭＶＣ１を生成する。
【０１０７】
次に、動きベクトルスケーリング部４２２は、第１動きベクトル候補ＭＶＣ１に対してスケーリングを行うことで動きベクトルＭＶＣｓを生成してこれを出力する。
【０１０８】
そして画素取得部４２３は、第１動きベクトル候補ＭＶＣ１により示される参照フレームＲｆ１の画素を含む１ブロックを予測ブロックＰＢ１として取得し、これを補間部４２５に出力する。画素取得部４２４は、動きベクトルＭＶＣｓにより示される参照フレームＲｆ２の画素を含む１ブロックを予測ブロックＰＢ２として取得し、これを補間部４２５に出力する。
【０１０９】
補間部４２５は、画素取得部４２３，４２４がそれぞれ取得した２つの予測ブロックＰＢ１，ＰＢ２のそれぞれにおいて、対応する画素の画素値を補間処理することで補間予測ブロックＰＢ０を生成する。
【０１１０】
減算器４２６は、補間予測ブロックＰＢ０と画像信号Ｉｍｇ中の符号化対象ブロックとの間の画素値の差分を計算し、その計算結果を予測誤差ブロックＲＢ０として出力する。
【０１１１】
動きベクトル選択部４２７は、予測誤差ブロックＲＢ０内の画素の画素値に基づいて予測誤差評価値を算出する。そして動きベクトル選択部４２７は、第１動きベクトル候補生成部４２１で生成された全ての第１動きベクトル候補ＭＶＣ１に対してそれぞれ上述のように予測誤差評価値を算出すると、予測誤差評価値が最小となる第１動きベクトル候補ＭＶＣ１と動きベクトルＭＶＣｓとを選択する。
【０１１２】
そして動きベクトル選択部４２７は、選択した第１動きベクトル候補ＭＶＣ１と動きベクトルＭＶＣｓとを、動きベクトルＭＶ１，ＭＶ２として出力する。
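In this way, the motion estimation unit 402 of this embodiment, as in Embodiment 1, calculates the prediction error evaluation value from the result of pixel value interpolation, so that even when a fade occurs it prevents the evaluation value from increasing as a result and can detect the optimum motion vector. In addition, even in forward interpolation prediction the motion estimation unit 402 does not actually search for the motion vector MV2 but uses the scaled motion vector MV1 as MV2, which saves the effort of detecting MV2 and improves coding efficiency.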
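The single-vector search of this embodiment can be sketched as follows. The sketch assumes (none of this is stated in the source) that frames are lists of rows, vectors are (dy, dx) offsets of the block's top-left corner, the scaled vector is rounded to whole pixels, candidates whose blocks fall outside a frame are skipped, and the SAD with extrapolation is the evaluation value. All names are hypothetical.

```python
def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame given as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def scaled_search(tf, rf1, rf2, top, left, size, search, t1, t2):
    """Search MVC1 only; MVCs = MVC1 * (T2 / T1) serves as the second vector."""
    target = get_block(tf, top, left, size)
    height, width = len(rf1), len(rf1[0])

    def inside(y, x):
        return 0 <= y <= height - size and 0 <= x <= width - size

    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Rounding to integer pixels is an assumption of this sketch.
            sy, sx = round(dy * t2 / t1), round(dx * t2 / t1)
            if not (inside(top + dy, left + dx) and inside(top + sy, left + sx)):
                continue
            pb1 = get_block(rf1, top + dy, left + dx, size)
            pb2 = get_block(rf2, top + sy, left + sx, size)
            cost = sum(abs(t - (2 * a - b))
                       for rt, ra, rb in zip(target, pb1, pb2)
                       for t, a, b in zip(rt, ra, rb))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx), (sy, sx))
    return best


def make_frame(left, offset):
    """Hypothetical 4x6 test frame: a 2x2 textured object on a black background."""
    frame = [[0] * 6 for _ in range(4)]
    pattern = [[10, 20], [30, 40]]
    for r in range(2):
        for c in range(2):
            frame[1 + r][left + c] = pattern[r][c] + offset
    return frame


# Uniform motion (one pixel per frame) combined with a fade (+10 per frame).
rf2, rf1, tf = make_frame(0, 0), make_frame(1, 10), make_frame(2, 20)
print(scaled_search(tf, rf1, rf2, top=1, left=2, size=2, search=2, t1=1, t2=2))
# (0, (0, -1), (0, -2))
```

Only one loop over candidates remains, so the search cost is linear in the number of candidates, at the price of assuming uniform motion across the three frames.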
【０１１３】
また本実施の形態の可変長符号化部４０６は、動きベクトルＭＶ１と残差符号化信号Ｅｒとを可変長符号化し、画像符号化信号Ｂｓ２を出力する。
図１１は、画像符号化信号Ｂｓ２のフォーマットの概念を示す概念図である。
【０１１４】
画像符号化信号Ｂｓ２には、符号化されたフレームを示す内容のフレーム符号化信号Ｂｓｆ２が含まれ、さらにこのフレーム符号化信号Ｂｓｆ２には符号化されたブロックを示す内容のブロック符号化信号Ｂｓｂ２が含まれている。そしてさらに、このブロック符号化信号Ｂｓｂ２には、符号化された動きベクトルＭＶ１を示す内容の動きベクトル符号化信号Ｂｓ１が含まれる。
【０１１５】
このように本実施の形態では、符号化された動きベクトルＭＶ２を示す第２動きベクトル符号化信号Ｂｓ２を、画像符号化信号Ｂｓ２に格納しなくてもよいため、符号量を減少して符号化効率を向上することができる。
【０１１６】
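Although this embodiment has been described for the case of forward interpolation prediction as shown in Fig. 10, the reference frames Rf1 and Rf2 may each be located either before or after the frame Tf to be coded, as in Embodiment 1.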
【０１１７】
さらに、本実施の形態では、符号化対象フレームＴｆを基準とした表示時間差が参照フレームＲｆ１よりも長くなるフレームを、参照フレームＲｆ２として選択し、参照フレームＲｆ１に基づく第１動きベクトル候補ＭＶＣ１に対してスケーリングを行ったが、符号化対象フレームＴｆの表示時刻よりも前にあるフレームを参照フレームＲｆ１として選択するとともに、符号化対象フレームＴｆの表示時刻よりも後にあるフレームを参照フレームＲｆ２として選択して、その参照フレームＲｆ１に基づく第１動きベクトル候補ＭＶＣ１に対してスケーリングを行っても良い。
【０１１８】
なお、可変長符号化部４０６は、動きベクトルＭＶ１と、符号化対象ブロックの周辺にある周辺ブロックの動きベクトルから予測される予測値との差分を符号化し、その結果を示す符号化信号を動きベクトル符号化信号Ｂｓ１の代わりにブロック符号化信号Ｂｓｂ２に含めても良い。この場合には符号化効率をさらに向上することができる。
【０１１９】
（実施の形態４）
以下、本発明の第４の実施の形態における動画像復号化装置について図面を参照しながら説明する。
【０１２０】
図１２は、本実施の形態における動画像復号化装置４００Ｂの構成を示すブロック図である。
本実施の形態の動画像復号化装置４００Ｂは、実施の形態３の動画像符号化装置４００Ａにより符号化された動画像を復号化するものであって、可変長復号化部４３６と、動きベクトルスケーリング部４３７と、動き補償部４３３と、画像復号化部４３５と、マルチフレームバッファ４３１と、加算器４３９とを備えている。
【０１２１】
ここで本実施の形態における画像復号化部４３５と動き補償部４３３とマルチフレームバッファ４３１と加算器４３９とは、実施の形態２の動画像復号化装置３００Ｂの画像復号化部３３５と動き補償部３３３とマルチフレームバッファ３３１と加算器３３９とそれぞれ同様の機能及び構成を有する。
【０１２２】
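The variable-length decoding unit 436 acquires the coded image signal Bs2, performs variable-length decoding, and outputs the coded residual signal Er and the motion vector MV1. When MV1 has been coded after subtraction of its predicted value obtained from neighboring blocks around the block to be decoded, the variable-length decoding unit 436 may decode the coded difference and add the predicted value to it to generate and output the motion vector MV1.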
【０１２３】
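Like the motion vector scaling unit 907 of the moving picture decoding apparatus 900, the motion vector scaling unit 437, on acquiring the motion vector MV1 output from the variable-length decoding unit 436, scales MV1 based on the display time difference T1 between the frame Tf and the reference frame Rf1 and the display time difference T2 between the frame Tf and the reference frame Rf2. The motion vector scaling unit 437 then outputs the resulting motion vector to the motion compensation unit 433 as the motion vector MV2 based on the reference frame Rf2.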
【０１２４】
動き補償部４３３は、実施の形態２の動き補償部３３３と同様、参照フレームＲｆ１における動きベクトルＭＶ１により指し示される位置のブロックと、参照フレームＲｆ２における動きベクトルＭＶ２により指し示される位置のブロックとを、マルチフレームバッファ４３１から取り出す。そして動き補償部４３３は、これらのブロックに基づいて画素値の補間処理を行って予測画像信号Ｐｒｅを作成してこれを出力する。
【０１２５】
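The moving picture decoding apparatus 400B of this embodiment thus derives the motion vector MV2 by scaling the motion vector MV1, and can accurately decode pictures coded by the moving picture coding apparatus 400A.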
【０１２６】
（実施の形態５）
以下、本発明の第５の実施の形態における動画像符号化装置について図面を参照しながら説明する。
【０１２７】
図１３は、本実施の形態における動画像符号化装置５００Ａの構成を示すブロック図である。
この動画像符号化装置５００Ａは、マルチフレームバッファ５０１、動き推定部５０２、動き補償部５０３、画像符号化部５０４、画像復号部５０５、可変長符号化部５０６、動きベクトルスケーリング部５０７、加算器５０８、減算器５０９，５１０を備え、画像信号Ｉｍｇにより示されるフレームをブロックに分割し、そのブロック毎に処理を行う。
【０１２８】
ここで本実施の形態におけるマルチフレームバッファ５０１、動き補償部５０３、画像符号化部５０４、画像復号部５０５、可変長符号化部５０６、動きベクトルスケーリング部５０７、加算器５０８、減算器５０９，５１０は、従来の動画像符号化装置８００におけるマルチフレームバッファ８０１、動き補償部８０３、画像符号化部８０４、画像復号部８０５、可変長符号化部８０６、動きベクトルスケーリング部８０７、加算器８０８、減算器８０９，８１０とそれぞれ同様の機能及び構成を有する。
【０１２９】
つまり、本実施の形態は、動き推定部５０２の動きベクトルの検出方法に特徴があり、動きベクトル符号化方法などの動作処理は動画像符号化装置８００と共通する。
【０１３０】
図１４は、本実施の形態の動き推定部５０２の構成を示すブロック図である。動き推定部５０２は、第１動きベクトル候補生成部５２１と、第２動きベクトル候補生成部５２２と、動きベクトルスケーリング部５２１ａと、画素取得部５２３，５２４と、補間部５２５と、減算器５２６と、動きベクトル選択部５２７と、スイッチ５２８，５２９とを備えている。
【０１３１】
また、動き推定部５０２における第１動きベクトル候補生成部５２１と、動きベクトルスケーリング部５２１ａと、画素取得部５２３，５２４と、補間部５２５と、減算器５２６とは、実施の形態３の動き推定部４０２における第１動きベクトル候補生成部４２１と、動きベクトルスケーリング部４２２と、画素取得部４２３，４２４と、補間部４２５と、減算器４２６とそれぞれ同様の機能及び構成を有する。
【０１３２】
そして第２動きベクトル候補生成部５２２は、符号化対象ブロックの参照フレームＲｆ２に基づく動きベクトルＭＶ２の候補を、所定の動きベクトル検出範囲から全て抽出し、これらをそれぞれ第２動きベクトル候補ＭＶＣ２として出力する。
【０１３３】
実施の形態３で説明したように実施の形態３における動き推定部４０２は、幾つかの第１動きベクトル候補ＭＶＣ１を生成するとともに、それぞれの第１動きベクトル候補ＭＶＣ１に対してスケーリングを行うことで動きベクトルＭＶＣｓを生成し、最も予測誤差評価値が小さくなる第１動きベクトル候補ＭＶＣ１とそれに対応する動きベクトルＭＶＣｓとを動きベクトルＭＶ１，ＭＶ２として検出した。即ち動き推定部４０２は、動きベクトルＭＶ１に対してスケーリングを行ったものを動きベクトルＭＶ２として検出している。
【０１３４】
しかし、本実施の形態における動き推定部５０２は、実施の形態３における動き推定部４０２と同様の方法で動きベクトルＭＶ１を検出するが、動きベクトルＭＶ１に対してスケーリングを行ったものを動きベクトルＭＶ２とすることなく、検出した動きベクトルＭＶ１を用いて予測誤差評価値が最も小さくなる動きベクトルＭＶ２をさらに検出する点に特徴がある。
【０１３５】
このような動き推定部５０２の具体的な動作について、図１４及び図１５を用いて説明する。
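When the switches 528 and 529 are set to contact 0, the motion estimation unit 502 generates the first motion vector candidate MVC1 and the motion vector MVCs in the same way as in Embodiment 3 and detects the motion vector MV1.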
【０１３６】
図１５の（ａ）は、第１動きベクトル候補ＭＶＣ１及び動きベクトルＭＶＣｓの生成方法を説明するための説明図である。
第１動きベクトル候補生成部５２１は、参照フレームＲｆ１の動きベクトル検出範囲ＳＲから何れかの第１動きベクトル候補ＭＶＣ１を生成する。
【０１３７】
動きベクトルスケーリング部５２１ａは、表示時間差Ｔ１に対する表示時間差Ｔ２の割合（Ｔ２／Ｔ１）を、上述のように生成された第１動きベクトル候補ＭＶＣ１に乗じることで、第１動きベクトル候補ＭＶＣ１に対するスケーリングを行い、動きベクトルＭＶＣｓを生成する。
【０１３８】
画素取得部５２３は、第１動きベクトル候補ＭＶＣ１が示す参照フレームＲｆ１中の１ブロック（画素Ｐｔ１を含むブロック）を取得して、予測ブロックＰＢ１として補間部５２５に出力する。
【０１３９】
画素取得部５２４は、動きベクトル候補ＭＶＣｓが示す参照フレームＲｆ２中の１ブロック（画素Ｐｔ２を含むブロック）を取得して、予測ブロックＰＢ２として補間部５２５に出力する。
【０１４０】
補間部５２５は、予測ブロックＰＢ１，ＰＢ２のブロック内での相対位置が互いに等しい２つの画素を用いて画素値の補間を行うことで、符号化対象ブロックに対する補間予測ブロックＰＢ０を作成し、これを減算器５２６に出力する。
【０１４１】
減算器５２６は、画像信号Ｉｍｇに示される符号化対象ブロックと、補間予測ブロックＰＢ０との互いに対応する画素の画素値の差分を計算し、その結果を予測誤差ブロックＲＢ０として動きベクトル選択部５２７に出力する。
【０１４２】
動きベクトル選択部５２７は、減算器５２６から予測誤差ブロックＲＢ０を取得すると、ＳＡＤやＳＳＤなどの予測誤差評価値を算出する。
そして、動きベクトル選択部５２７は、予測誤差評価値が最も小さくなる第１動きベクトル候補ＭＶＣ１及び動きベクトル候補ＭＶＣｓを選択し、選択した第１動きベクトル候補ＭＶＣ１を、参照フレームＲｆ１に基づく符号化対象ブロックの動きベクトルＭＶ１として出力する。
【０１４３】
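Next, when the switches 528 and 529 are switched to contact 1, the motion estimation unit 502 uses the motion vector MV1 detected as described above to detect the motion vector MV2 that minimizes the prediction error evaluation value.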
【０１４４】
図１５の（ｂ）は、第２動きベクトル候補ＭＶＣ２の生成方法を説明するための説明図である。
第２動きベクトル候補生成部５２２は、動きベクトルＭＶ１のスケーリングにより示される参照フレームＲｆ２中の位置Ｃを中心とした動きベクトル検出範囲ＳＲ２から、幾つかの第２動きベクトル候補ＭＶＣ２を順次生成する。
【０１４５】
画素取得部５２３は、上述のように既に検出された動きベクトルＭＶ１が示す参照フレームＲｆ１中の１ブロックを取得して、予測ブロックＰＢ１として補間部５２５に出力する。
【０１４６】
画素取得部５２４は、第２動きベクトル候補ＭＶＣ２が示す参照フレームＲｆ２中の１ブロックを取得して、予測ブロックＰＢ２として補間部５２５に出力する。
【０１４７】
補間部５２５は、上述と同様に、予測ブロックＰＢ１，ＰＢ２のブロック内での相対位置が互いに等しい２つの画素を用いて画素値の補間を行うことで、符号化対象ブロックに対する補間予測ブロックＰＢ０を作成し、これを減算器５２６に出力する。
【０１４８】
減算器５２６は、画像信号Ｉｍｇに示される符号化対象ブロックと、補間予測ブロックＰＢ０との互いに対応する画素の画素値の差分を計算し、その結果を予測誤差ブロックＲＢ０として動きベクトル選択部５２７に出力する。
【０１４９】
動きベクトル選択部５２７は、減算器５２６から予測誤差ブロックＲＢ０を取得すると、ＳＡＤやＳＳＤなどの予測誤差評価値を算出する。
そして、動きベクトル選択部５２７は、予測誤差評価値が最も小さくなる第２動きベクトル候補ＭＶＣ２を選択し、選択した第２動きベクトル候補ＭＶＣ２を、参照フレームＲｆ２に基づく符号化対象ブロックの動きベクトルＭＶ２として出力する。
【０１５０】
ここで、被写体の動きがフレーム間で一定であるとすれば、既に検出した動きベクトルＭＶ１のスケーリングにより示される位置Ｃに近いほど、動きベクトルＭＶ２が存在する確率が高い。
【０１５１】
従って、本実施の形態の動き推定部５０２は、動きベクトルＭＶ２を検出するときに、位置Ｃを中心とした動きベクトル検出範囲ＳＲ２を設定して、その動きベクトル検出範囲ＳＲ２から第２動きベクトル候補ＭＶＣ２を生成するため、動きベクトル検出範囲ＳＲ２を狭くすることができ、動きベクトルの検出効率を向上することができる。また、スパイラルサーチなどの手法を用いたときには、より高速に動きベクトルＭＶ２を検出することができる。
【０１５２】
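In this embodiment, after the motion vector MV1 is detected, the motion vector MV2 is detected with MV1 held fixed. MV1 may further be detected once more with the MV2 so detected held fixed. In that case, while the pixel acquisition unit 524 holds the fixed motion vector MV2 that has already been detected, the pixel acquisition unit 523 acquires from the first motion vector candidate generation unit 521 variable first motion vector candidates MVC1 extracted from a predetermined motion vector detection range. The motion vector selection unit 527 then detects as the motion vector MV1 the candidate MVC1, among those extracted, that minimizes the prediction error evaluation value. This yields more appropriate motion vectors and improves detection efficiency.
【０１５３】
Furthermore, MV2 may be detected yet again with the re-detected MV1 held fixed. Such detection may be repeated any number of times, for example until the number of repetitions reaches a predetermined count or until the rate of decrease of the prediction error evaluation value falls to or below a predetermined value.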
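The alternating detection with its two stopping rules can be sketched as below. To keep the example short, the prediction error evaluation value is passed in as a callable `cost(mv1, mv2)`. The search window centered on the current vector, the parameter names `radius`, `max_rounds`, and `min_gain`, and the toy cost function are all assumptions of this sketch, not details from the source.

```python
def neighbours(center, radius):
    """All integer vectors within a square window around `center`."""
    cy, cx = center
    return [(cy + dy, cx + dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)]


def refine(cost, mv1, mv2, radius, max_rounds=4, min_gain=0.01):
    """Alternately re-detect MV2 (MV1 fixed) and MV1 (MV2 fixed).

    Stops after max_rounds rounds, or when the relative decrease of the
    evaluation value in one round is min_gain or less.
    """
    best = cost(mv1, mv2)
    for _ in range(max_rounds):
        previous = best
        mv2 = min(neighbours(mv2, radius), key=lambda c: cost(mv1, c))
        mv1 = min(neighbours(mv1, radius), key=lambda c: cost(c, mv2))
        best = cost(mv1, mv2)
        if previous == 0 or (previous - best) / previous <= min_gain:
            break
    return mv1, mv2, best


# Toy evaluation value with its minimum at MV1 = (0, -1), MV2 = (0, -3).
def toy_cost(mv1, mv2):
    return (abs(mv1[0]) + abs(mv1[1] + 1)
            + abs(mv2[0]) + abs(mv2[1] + 3))


# Start from MV1 and its scaled version (position C), as in this embodiment.
print(refine(toy_cost, (0, -1), (0, -2), radius=1))
# ((0, -1), (0, -3), 0)
```

In a real encoder `cost` would build the interpolated prediction block from both vectors and return its SAD or SSD against the block to be coded.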
【０１５４】
このように本実施の形態では、実施の形態１又は３と同様、画素値の補間処理を行った結果に基づいて予測誤差評価値を算出するため、フェードが生じる場合であってもその影響による予測誤差評価値の増加を防止して、最適な動きベクトルを検出することができる。また、本実施の形態では、実施の形態３と異なり、各参照フレーム毎にそれぞれ独立した動きベクトルを使用するため、フレーム間で動きが一定でない場合でも予測効率を向上することができる。
【０１５５】
（変形例）
次に、上記本実施の形態における動画像符号化装置５００Ａの変形例について説明する。
【０１５６】
図１６は、本実施の形態の変形例に係る動画像符号化装置５５０Ａの構成を示すブロック図である。
この変形例に係る動画像符号化装置５５０Ａは、動画像符号化装置５００Ａの動き推定部５０２や動き補償部５０３などの他、減算器５１０から出力される差分ベクトルＭＶｄに応じて「１」又は「２」を示すコードＮｕを生成するコード生成部５１２と、減算器５１０と可変長符号化部５０６ａとの間を開閉するスイッチ５１１とを備えている。
【０１５７】
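On acquiring the differential vector MVd from the subtractor 510, the code generation unit 512 determines whether MVd is "0". If it is, the code generation unit 512 opens the switch 511 to prevent the variable-length coding unit 506a from acquiring MVd, generates a code Nu indicating "1", and outputs it to the variable-length coding unit 506a. If MVd is not "0", the code generation unit 512 closes the switch 511 so that the variable-length coding unit 506a acquires MVd, generates a code Nu indicating "2", and outputs it to the variable-length coding unit 506a.
【０１５８】
When the code Nu indicates "1", the variable-length coding unit 506a of this variation variable-length codes the coded residual signal Er, the motion vector MV1, and the code Nu; when the code Nu indicates "2", it variable-length codes the coded residual signal Er, the motion vector MV1, the differential vector MVd, and the code Nu. That is, when Nu is "1", meaning that MVd is "0", the variable-length coding unit 506a does not code MVd. It then outputs the variable-length coded result as the coded image signal Bs3.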
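The behavior of the code generation unit 512 and the switch 511 can be sketched symbolically. The dictionary below only stands in for the fields that are passed to the variable-length coder; the actual bitstream syntax is not given in the text, and the function and key names are hypothetical.

```python
def encode_block_vectors(mv1, mvd):
    """Return the vector-related fields that are coded for one block."""
    if mvd == (0, 0):
        # Switch 511 open: MVd is withheld, Nu = 1 signals its absence.
        return {"Nu": 1, "MV1": mv1}
    # Switch 511 closed: MVd is coded, Nu = 2 signals its presence.
    return {"Nu": 2, "MV1": mv1, "MVd": mvd}


print(encode_block_vectors((3, -1), (0, 0)))  # {'Nu': 1, 'MV1': (3, -1)}
print(encode_block_vectors((3, -1), (1, 0)))  # {'Nu': 2, 'MV1': (3, -1), 'MVd': (1, 0)}
```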
【０１５９】
図１７は、画像符号化信号Ｂｓ３のフォーマットの概念を示す概念図である。
画像符号化信号Ｂｓ３には、符号化されたフレームを示す内容のフレーム符号化信号Ｂｓｆ３が含まれ、さらにこのフレーム符号化信号Ｂｓｆ３には符号化されたブロックを示す内容のブロック符号化信号Ｂｓｂ３，Ｂｓｂ４が含まれている。そしてさらに、このブロック符号化信号Ｂｓｂ３には、符号化されたコードＮｕ（２）を示す内容のコード信号Ｂｓｎ２と、符号化された動きベクトルＭＶ１を示す内容の第１動きベクトル符号化信号Ｂｓ１と、符号化された差分ベクトルＭＶｄを示す内容の差分ベクトル符号化信号Ｂｓｄとが含まれる。また、ブロック符号化信号Ｂｓｂ４には、符号化されたコードＮｕ（１）を示す内容のコード信号Ｂｓｎ１と、符号化された動きベクトルＭＶ１を示す内容の第１動きベクトル符号化信号Ｂｓ１とが含まれる。
【０１６０】
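That is, for the block represented by the coded block signal Bsb3 the differential vector MVd is not "0", so Bsb3 contains the code signal Bsn2 and the coded differential vector signal Bsd in addition to the coded first motion vector signal Bs1. For the block represented by the coded block signal Bsb4 the differential vector MVd is "0", so Bsb4 contains only the code signal Bsn1 in addition to the coded first motion vector signal Bs1.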
【０１６１】
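Since the code Nu indicates either "1" or "2", one bit is sufficient to represent it. The differential vector MVd, on the other hand, requires at least two bits when its horizontal and vertical components are variable-length coded independently. Moreover, because the motion of a subject in the picture is in many cases constant over a short time, MVd is "0" for most blocks to be coded.
【０１６２】
In this variation, therefore, the coded image signal Bs3 contains many coded block signals Bsb4, whose information content is reduced by omitting the coded differential vector signal Bsd, so coding efficiency can be improved.
【０１６３】
Furthermore, when MVd is "0" for most blocks to be coded, the values indicated by the code Nu occur with skewed frequencies, and the information content of Nu falls below one bit. Therefore, when motion vectors are coded with a variable-length coding method that works in whole bits, such as Huffman coding, coding Nu in combination with other kinds of codes gives better coding efficiency than coding Nu on its own as described above.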
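The statement that a skewed code carries less than one bit can be checked with the binary entropy function. The probabilities below are illustrative only and are not taken from the source.

```python
from math import log2


def binary_entropy(p):
    """Average information in bits of a two-valued code with P(value 1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


print(round(binary_entropy(0.5), 3))  # 1.0   (Nu = 1 and Nu = 2 equally likely)
print(round(binary_entropy(0.9), 3))  # 0.469 (MVd is zero for 90% of blocks)
```

A Huffman code still spends at least one whole bit on Nu by itself, which is why merging Nu with another syntax element into one joint symbol can approach the lower figure.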
【０１６４】
なお、本実施の形態では、コード信号Ｂｓｎ１，Ｂｓｎ２をブロック符号化信号ごとに格納したが、ブロック符号化信号ごとではなく、例えば、ＭＰＥＧのマクロブロックやスライスなどのように、ブロックより大きな単位で画像が符号化された内容を示す信号毎に、コード信号Ｂｓｎ１，Ｂｓｎ２を格納するようにしてもよい。これにより、コード信号Ｂｓｎ１，Ｂｓｎ２を減らすことができ、より符号化効率を改善することができる。
【０１６５】
以上のように本変形例によれば、画像符号化信号Ｂｓ３にコード信号Ｂｓｎ１の情報を格納して差分ベクトル符号化信号Ｂｓｄを省くことで、情報量を削減することができ、符号化効率を向上することができる。
【０１６６】
（実施の形態６）
以下、本発明の第６の実施の形態における動画像復号化装置について図面を参照しながら説明する。
【０１６７】
図１８は、本実施の形態における動画像復号化装置５５０Ｂの構成を示すブロック図である。
本実施の形態の動画像復号化装置５５０Ｂは、実施の形態５の変形例に係る動画像符号化装置５５０Ａにより符号化された動画像を復号化するものであって、可変長復号化部５３６と、動きベクトルスケーリング部５３７と、動き補償部５３３と、画像復号化部５３５と、マルチフレームバッファ５３１と、加算器５３９，５４０と、スイッチ５４１とを備えている。
【０１６８】
ここで本実施の形態における画像復号化部５３５と動き補償部５３３とマルチフレームバッファ５３１と加算器５３９，５４０とは、従来例に示す動画像復号化装置９００における画像復号化部９０５と動き補償部９０３とマルチフレームバッファ９０１と加算器９０９，９１０とそれぞれ同様の機能及び構成を有するため説明を省略する。
【０１６９】
本実施の形態における可変長復号化部５３６は、画像符号化信号Ｂｓ３を取得して可変長復号を行い、コードＮｕが「１」を示しているときには、そのコードＮｕと残差符号化信号Ｅｒと動きベクトルＭＶ１とを出力し、コードＮｕが「２」を示しているときには、そのコードＮｕと残差符号化信号Ｅｒと動きベクトルＭＶ１と差分ベクトルＭＶｄとを出力する。
【０１７０】
スイッチ５４１は、可変長復号化部５３６と加算器５４０との間を、可変長復号化部５３６からのコードＮｕに応じて開閉する。即ち、コードＮｕが「１」を示すときにはスイッチ５４１は開いて、可変長復号化部５３６から加算器５４０に差分ベクトルＭＶｄが出力されるのを禁止し、コードＮｕが「２」を示すときにはスイッチ５４１は閉じて、可変長復号化部５３６から加算器５４０への差分ベクトルＭＶｄの出力を許可する。
【０１７１】
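As a result, when the switch 541 is open, the adder 540 acquires only the motion vector MVs generated by the motion vector scaling unit 537, and outputs that motion vector MVs to the motion compensation unit 533 as the motion vector MV2. When the switch 541 is closed, the adder 540 acquires both the motion vector MVs generated by the motion vector scaling unit 537 and the differential vector MVd output from the variable-length decoding unit 536, adds MVd to MVs, and outputs the sum to the motion compensation unit 533 as the motion vector MV2.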
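The interplay of the code Nu, the switch 541, and the adder 540 can be sketched as follows. The dictionary stands in for the decoded fields of one block, exact rational scaling is assumed because the rounding rule is not specified, and all names are hypothetical.

```python
from fractions import Fraction


def decode_mv2(block, t1, t2):
    """Rebuild MV2 from the decoded fields of one block."""
    ratio = Fraction(t2, t1)
    mv1 = block["MV1"]
    mvs = (mv1[0] * ratio, mv1[1] * ratio)   # scaling unit 537
    if block["Nu"] == 1:                     # switch 541 open: no MVd
        return mvs
    mvd = block["MVd"]                       # switch 541 closed
    return (mvs[0] + mvd[0], mvs[1] + mvd[1])


print(decode_mv2({"Nu": 1, "MV1": (3, -1)}, t1=1, t2=2) == (6, -2))                 # True
print(decode_mv2({"Nu": 2, "MV1": (3, -1), "MVd": (1, 0)}, t1=1, t2=2) == (7, -2))  # True
```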
【０１７２】
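Thus, in this embodiment, if the coded image signal Bs3 contains information on the differential vector MVd, the motion vector MV2 is generated by adding the scaled motion vector MVs to that differential vector MVd; if the coded image signal Bs3 contains no information on MVd, the scaled motion vector MVs itself is used as the motion vector MV2.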
【０１７３】
従って本実施の形態によれば、実施の形態５の変形例に係る動画像符号化装置５５０Ａで符号化された動きベクトルに関する情報を正しく復号化し、その結果、動画像を正確に復号化することができる。
【０１７４】
（実施の形態７）
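In the motion vector coding method of the conventional moving picture coding apparatus 800, the motion vector MV1 is scaled. However, the information on the display time differences T1 and T2 that scaling requires sometimes cannot be obtained from the multi-frame buffer 801, and in that case the motion vectors cannot be coded. Even when the information on T1 and T2 can be obtained from the multi-frame buffer 801, scaling is meaningless if at least one of T1 and T2 is very large, and motion vector coding efficiency falls.
【０１７５】
That is, two kinds of areas, a short-term memory and a long-term memory, are reserved in the multi-frame buffer 801. Frames may be recorded in the long-term memory without their display time information, and when such a frame is read out as a reference frame, scaling cannot be performed. The long-term memory may also hold frames whose display time differs greatly from that of the frame to be coded, and when such a frame is read out as a reference frame, meaningless scaling is performed.
【０１７６】
The moving picture coding apparatus of the seventh embodiment of the present invention is therefore characterized in that it avoids meaningless scaling and codes motion vectors in a way that raises coding efficiency.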
【０１７７】
以下、本発明の第７の実施の形態における動画像符号化装置について図面を参照しながら説明する。
図１９は、本実施の形態における動画像符号化装置１００の構成を示すブロック図である。
【０１７８】
本実施の形態の動画像符号化装置１００は、マルチフレームバッファ１０１と、動き推定部１０２と、動き補償部１０３と、画像符号化部１０４と、画像復号部１０５と、可変長符号化部１０６と、動きベクトルスケーリング部１０７と、加算器１０８及び減算器１０９，１１０と、スイッチ１１１，１１２，１１３と、判定部１１４とを備える。
【０１７９】
この動画像符号化装置１００は、画像信号Ｉｍｇにより示される符号化対象フレームＴｆの各ブロックを符号化するときには、２枚の参照フレームＲｆ１，Ｒｆ２を参照して、これらの参照フレームＲｆ１，Ｒｆ２に対する符号化対象ブロックの動きベクトルＭＶ１，ＭＶ２に関する情報と、参照フレームＲｆ１，Ｒｆ２及び動きベクトルＭＶ１，ＭＶ２から予測される予測画像に基づく情報とを符号化する。
【０１８０】
ここで、本実施の形態においても実施の形態１と同様、参照フレームＲｆ１，Ｒｆ２はそれぞれ、符号化対象フレームＴｆに対して時間的に前方にあっても後方にあっても良い。
【０１８１】
図２中の（ａ）に示すように、動画像符号化装置１００は、符号化対象フレームＴｆの前方に位置するフレームを参照フレームＲｆ１，Ｒｆ２として参照しても良く、図２中の（ｂ）に示すように、符号化対象フレームＴｆの後方に位置するフレームを参照フレームＲｆ１，Ｒｆ２として参照しても良く、さらに、図２中の（ｃ）に示すように、符号化対象フレームＴｆの前方に位置する１つのフレームを参照フレームＲｆ２として参照し、符号化対象フレームＴｆの後方に位置する１つのフレームを参照フレームＲｆ１として参照しても良い。
【０１８２】
スイッチ１１１，１１２は、符号化対象ブロック毎に、参照される２つのフレーム（参照フレームＲｆ１，Ｒｆ２）に応じて、その接点０，１を切り換える。例えば、参照フレームＲｆ１が参照されるときには、スイッチ１１１，１１２は、それぞれ接点０を動き推定部１０２に接続し、参照フレームＲｆ２が参照されるときには、スイッチ１１１，１１２は、それぞれ接点１を動き推定部１０２に接続する。
【０１８３】
動き推定部１０２は、マルチフレームバッファ１０１から読み出された参照フレームＲｆ１，Ｒｆ２のそれぞれに基づいて、画像信号Ｉｍｇにより示される符号化対象フレームＴｆ中のブロックに対する動きベクトルＭＶ１，ＭＶ２を、実施の形態１の動き推定部３０２、又は実施の形態３の動き推定部４０２、又は実施の形態５の動き推定部５０２と同様の方法で検出する。
【０１８４】
動き補償部１０３は、参照フレームＲｆ１における動きベクトルＭＶ１により指し示される位置のブロックと、参照フレームＲｆ２における動きベクトルＭＶ２により指し示される位置のブロックとを、マルチフレームバッファ１０１から取り出す。そして動き補償部１０３は、これらのブロックに基づいて画素の補間処理を行って予測画像信号Ｐｒｅを作成しこれを出力する。
【０１８５】
減算器１０９は、画像信号Ｉｍｇから予測画像信号Ｐｒｅを減算して残差信号Ｒｅｓを出力する。
画像符号化部１０４は、残差信号Ｒｅｓを取得してＤＣＴ変換・量子化などの画像符号化処理を行い、量子化済ＤＣＴ係数などを含む残差符号化信号Ｅｒを出力する。
【０１８６】
画像復号部１０５は、残差符号化信号Ｅｒを取得して、逆量子化・逆ＤＣＴ変換などの画像復号処理を行い、残差復号信号Ｄｒを出力する。
加算器１０８は、残差復号信号Ｄｒと予測画像信号Ｐｒｅを加算して再構成画像信号Ｒｃを出力する。
【０１８７】
マルチフレームバッファ１０１は、再構成画像信号Ｒｃのうち、フレーム間予測で参照される可能性がある信号を格納する。
図２０は、マルチフレームバッファ１０１における上記信号を格納するメモリの概略構成を示す構成図である。
【０１８８】
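As shown in FIG. 20, a short-term memory 101s and a long-term memory 101l are allocated in the multi-frame buffer 101, and each frame indicated by the reconstructed image signal Rc is sorted into and stored in the short-term memory 101s or the long-term memory 101l as appropriate.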
【０１８９】
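The short-term memory 101s is a first-in first-out (FIFO) memory. When a new signal is recorded in the short-term memory 101s, recorded contents are discarded starting from the oldest, so that the short-term memory 101s always holds the most recent fixed number of frames. When a frame indicated by the reconstructed image signal Rc is recorded in the short-term memory 101s, it is recorded together with information on the display time of that frame.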
【０１９０】
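The long-term memory 101l is a random-access memory: a frame can be stored in any area, and a frame stored in any area can be read out. The long-term memory 101l mainly stores images that are referenced over a long period, such as a background image or an image from before a scene insertion, and holds frames for a longer time than the short-term memory 101s. A frame is stored in the long-term memory 101l by moving a frame held in the short-term memory 101s to the long-term memory 101l.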
【０１９１】
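Furthermore, when a frame indicated by the reconstructed image signal Rc is recorded in the long-term memory 101l, it is either recorded together with information on the display time of that frame or recorded with that time information omitted.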
【０１９２】
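The multi-frame buffer 101 of the present embodiment also includes a notification unit 115. The notification unit 115 outputs a notification signal Inf indicating whether each of the reference frames Rf1 and Rf2 read out by the motion estimation unit 102 via the switch 111 was read from the short-term memory 101s or from the long-term memory 101l.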
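The buffer behaviour described above can be sketched in Python as follows. This is only an illustrative model, not the structure of the actual device: the class name `MultiFrameBuffer`, its method names, and the `"short"`/`"long"` strings standing in for the notification signal Inf are all assumptions introduced here.

```python
from collections import deque


class MultiFrameBuffer:
    """Hypothetical model of the multi-frame buffer 101 (names are illustrative)."""

    def __init__(self, short_capacity):
        # Short-term memory: FIFO, the oldest entry is dropped automatically.
        self.short = deque(maxlen=short_capacity)
        # Long-term memory: random access by slot; display time may be None.
        self.long = {}

    def store(self, frame, display_time):
        # New frames always enter the short-term memory with their display time.
        self.short.append((frame, display_time))

    def move_to_long_term(self, frame, slot, keep_time=True):
        # Long-term storage happens by moving a frame out of short-term memory.
        for entry in list(self.short):
            if entry[0] == frame:
                self.short.remove(entry)
                self.long[slot] = (frame, entry[1] if keep_time else None)
                return
        raise KeyError(frame)

    def read(self, frame):
        # Returns the frame, its display time, and the memory it came from
        # (the last item plays the role of the notification signal Inf).
        for f, t in self.short:
            if f == frame:
                return f, t, "short"
        for f, t in self.long.values():
            if f == frame:
                return f, t, "long"
        raise KeyError(frame)


buf = MultiFrameBuffer(short_capacity=2)
buf.store("fs1", 0)
buf.store("fs2", 1)
buf.move_to_long_term("fs1", slot=0, keep_time=False)
buf.store("fs3", 2)
buf.store("fs4", 3)          # "fs2" is discarded here (FIFO)
print(buf.read("fs1"))       # read from long-term memory, no display time
print(buf.read("fs4"))       # read from short-term memory
```

The `keep_time=False` case models a frame whose display time information was omitted, which is exactly the situation in which scaling becomes impossible.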
【０１９３】
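FIG. 21 is a state diagram showing the state of the frames stored in the multi-frame buffer 101.
As time passes, frames fs1, fs2, fs3, ... are stored in order in the short-term memory 101s, and, of the frames stored in the short-term memory 101s, frames fl1 and fl2, which may be referenced later, are stored in order in the long-term memory 101l.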
【０１９４】
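Here, as shown in (a) of FIG. 21, when the frame fs2 stored in the short-term memory 101s is read out from the multi-frame buffer 101 as the reference frame Rf1, the notification unit 115 of the multi-frame buffer 101 outputs a notification signal Inf indicating that a frame has been read from the short-term memory 101s. When the frame fl2 stored in the long-term memory 101l is read out from the multi-frame buffer 101 as the reference frame Rf2, the notification unit 115 outputs a notification signal Inf indicating that a frame has been read from the long-term memory 101l.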
【０１９５】
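Similarly, as shown in (b) of FIG. 21, when the frames fl1 and fl2 stored in the long-term memory 101l are read out from the multi-frame buffer 101 as the reference frames Rf1 and Rf2, respectively, the notification unit 115 outputs, each time one of the frames fl1 and fl2 is read, a notification signal Inf indicating that a frame has been read from the long-term memory 101l.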
【０１９６】
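The determination unit 114 obtains the notification signal Inf from the notification unit 115 and determines, for each encoding target block, whether at least one of the reference frames Rf1 and Rf2 was read from the long-term memory 101l. Based on the result, the determination unit 114 outputs a switching signal si1 that instructs the switch 113 to change its contact.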
【０１９７】
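The switch 113 changes its contact in response to the switching signal si1, thereby switching the destination of the motion vector MV1 output by the motion estimation unit 102 between the motion vector scaling unit 107 and the subtractor 110.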
【０１９８】
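That is, when the determination unit 114 determines that both reference frames Rf1 and Rf2 were read from the short-term memory 101s, it outputs a switching signal si1 that sets the output destination of the switch 113 to the motion vector scaling unit 107. When it determines that at least one of the reference frames Rf1 and Rf2 was read from the long-term memory 101l, it outputs a switching signal si1 that sets the output destination of the switch 113 to the subtractor 110.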
【０１９９】
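As in the operation described with reference to FIG. 35, the motion vector scaling unit 107 scales the motion vector MV1 based on the display time difference T1 between the encoding target frame Tf and the reference frame Rf1 and the display time difference T2 between the encoding target frame Tf and the reference frame Rf2, and outputs the resulting motion vector MVs.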
【０２００】
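When the output destination of the switch 113 is set to the motion vector scaling unit 107, the subtractor 110 calculates the difference between the motion vector MV2 obtained from the motion estimation unit 102 and the motion vector MVs obtained from the motion vector scaling unit 107, and outputs a difference vector MVd indicating the result.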
【０２０１】
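When the subtractor 110 itself is set as the output destination of the switch 113, it uses the motion vector MV1 obtained from the motion estimation unit 102 via the switch 113 in place of the motion vector MVs from the motion vector scaling unit 107, calculates the difference between the motion vector MV2 and the motion vector MV1, and outputs the result as the difference vector MVd.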
【０２０２】
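FIG. 22 is an explanatory diagram illustrating how the difference vector MVd is created.
As shown in FIG. 22, when the subtractor 110 obtains the motion vector MV1 instead of the motion vector MVs, it creates the difference vector MVd by calculating the difference between the motion vector MV2 and the motion vector MV1.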
【０２０３】
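The variable-length encoding unit 106 performs variable-length encoding on the difference vector MVd, the motion vector MV1, and the residual encoded signal Er, and outputs the encoded result as the image coded signal Bs.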
【０２０４】
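The sequence of operations for encoding motion vectors in the moving image encoding device 100 of the present embodiment is described with reference to FIG. 23.
FIG. 23 is a flowchart showing the sequence of operations for encoding motion vectors.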
【０２０５】
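First, the determination unit 114 of the moving image encoding device 100 determines, based on the notification signal Inf, whether at least one of the reference frames Rf1 and Rf2 was read from the long-term memory 101l (step S101).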
【０２０６】
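When the determination unit 114 determines that both reference frames Rf1 and Rf2 were read from the short-term memory 101s (N in step S101), it causes the switch 113 to change its contact so that the output destination of the switch 113 is the motion vector scaling unit 107. As a result, the motion vector scaling unit 107 obtains the motion vector MV1 and scales it to create the motion vector MVs (step S102). The subtractor 110 then obtains the created motion vector MVs.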
【０２０７】
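On the other hand, when the determination unit 114 determines that at least one of the reference frames Rf1 and Rf2 was read from the long-term memory 101l (Y in step S101), it causes the switch 113 to change its contact so that the output destination of the switch 113 is the subtractor 110. As a result, the subtractor 110 treats the motion vector MV1, obtained from the motion estimation unit 102 via the switch 113, as if it were the motion vector MVs output from the motion vector scaling unit 107 (step S103).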
【０２０８】
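Next, the subtractor 110 calculates the difference between the motion vector MV2 and the above motion vector MVs, and outputs a difference vector MVd indicating the result to the variable-length encoding unit 106 (step S104).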
【０２０９】
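The variable-length encoding unit 106 then performs variable-length encoding on the motion vector MV1 obtained from the motion estimation unit 102 (step S105) and on the difference vector MVd obtained from the subtractor 110 (step S106).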
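Steps S101 to S106 can be sketched as the following Python function. It is a minimal illustration under stated assumptions: the function and parameter names are hypothetical, the scaling factor is taken to be T2 / T1 as in the conventional scaling described with reference to FIG. 35, exact rational arithmetic stands in for whatever rounding a real encoder would use, and the variable-length coding itself is omitted.

```python
from fractions import Fraction


def encode_motion_vectors(mv1, mv2, any_long_term, t1=None, t2=None):
    """Return (MV1, MVd), the two values passed to variable-length coding."""
    if any_long_term:
        # S101: Y -> S103: skip scaling and use MV1 in place of MVs.
        mvs = mv1
    else:
        # S101: N -> S102: scale MV1 by the display time ratio T2 / T1.
        ratio = Fraction(t2, t1)
        mvs = (mv1[0] * ratio, mv1[1] * ratio)
    # S104: difference vector MVd = MV2 - MVs.
    mvd = (mv2[0] - mvs[0], mv2[1] - mvs[1])
    # S105, S106: MV1 and MVd would be entropy-coded here (not modelled).
    return mv1, mvd


# Both references in short-term memory: MVd is taken against the scaled vector.
print(encode_motion_vectors((4, -2), (9, -4), any_long_term=False, t1=1, t2=2))
# At least one long-term reference: MVd is taken against MV1 itself.
print(encode_motion_vectors((4, -2), (9, -4), any_long_term=True))
```

Note that the long-term branch never touches `t1` or `t2`, so it still works when the display time information is missing.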
【０２１０】
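As described above, in the present embodiment, when at least one of the two reference frames Rf1 and Rf2 was read from the long-term memory 101l, the motion vector scaling unit 107 is not made to perform scaling. Therefore, even when information on the display time of a frame is recorded in the long-term memory 101l, meaningless scaling using that information is omitted, and the coding efficiency of the motion vectors can be improved. When information on the display time of a frame is not recorded in the long-term memory 101l, scaling that cannot properly be performed is omitted, and the coding efficiency of the motion vectors can likewise be improved.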
【０２１１】
なお、本実施の形態では、スイッチ１１３の切り換えにより動きベクトルスケーリング部１０７に対してスケーリングを行わせたり、スケーリングを行わせないようにしたが、スイッチ１１３を備えずに図３３に示すような構成として、２つの参照フレームＲｆ１，Ｒｆ２のうちの少なくとも１つが長時間メモリ１０１ｌから読み出されたときには、常に動きベクトルスケーリング部１０７にスケーリングを行わないようにさせても良い。
【０２１２】
また、本実施の形態では、２つの参照フレームＲｆ１，Ｒｆ２のうちの少なくとも１つが長時間メモリ１０１ｌから読み出されたときには、動きベクトルＭＶ１と差分ベクトルＭＶｄを符号化したが、差分ベクトルＭＶｄを求めずに動きベクトルＭＶ１と動きベクトルＭＶ２を符号化しても良い。これは、２つの参照フレームＲｆ１，Ｒｆ２のうち少なくとも１つが長時間メモリ１０１ｌから読み出された場合には差分ベクトルＭＶｄを符号化するのではなく、動きベクトルＭＶ２を符号化することを意味する。この場合にはさらに、符号化対象ブロックの周辺にある周辺ブロックから動きベクトルＭＶ１，ＭＶ２の予測値を求めて動きベクトルＭＶ１，ＭＶ２のそれぞれとの予測値の差分を計算し、その差分を符号化しても良い。
【０２１３】
また、本実施の形態では、通知部１１５をマルチフレームバッファ１０１に備えたが、マルチフレームバッファ１０１以外の他の構成要素に備えても良く、通知部１１５を単独で備えても良い。
【０２１４】
（実施の形態８）
以下、本発明の第８の実施の形態における動画像復号化装置について図面を参照しながら説明する。
【０２１５】
図２４は、本実施の形態における動画像復号化装置２００の構成を示すブロック図である。
本実施の形態の動画像復号化装置２００は、実施の形態７の動画像符号化装置１００により符号化された動画像を復号化するものであって、可変長復号化部２０６と、動きベクトルスケーリング部２０７と、動き補償部２０３と、画像復号化部２０４と、マルチフレームバッファ２０１と、判定部２１４と、加算器２０９，２１０と、スイッチ２１３とを備えている。
【０２１６】
可変長復号化部２０６は、画像符号化信号Ｂｓを取得して可変長復号を行い、残差符号化信号Ｅｒと、動きベクトルＭＶ１と、差分ベクトルＭＶｄとを出力する。
【０２１７】
画像復号化部２０４は、残差符号化信号Ｅｒを取得して、逆量子化や逆ＤＣＴ変換などの画像復号処理を行い、残差復号信号Ｄｒを出力する。
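Like the motion compensation unit 103 of Embodiment 7, the motion compensation unit 203 extracts, from the multi-frame buffer 201, the block at the position indicated by the motion vector MV1 in the reference frame Rf1 and the block at the position indicated by the motion vector MV2 in the reference frame Rf2. The motion compensation unit 203 then performs pixel interpolation based on these blocks to create and output the predicted image signal Pre.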
【０２１８】
加算器２０９は、動き補償部２０３からの予測画像信号Ｐｒｅと、画像復号化部２０４からの残差復号信号Ｄｒとを加算し、その結果を復号画像信号Ｄｉとして出力する。
【０２１９】
動きベクトルスケーリング部２０７は、実施の形態７の動きベクトルスケーリング部１０７と同様、可変長復号化部２０６から出力された動きベクトルＭＶ１を取得すると、符号化対象フレームＴｆと参照フレームＲｆ１との表示時間差Ｔ１、及び符号化対象フレームＴｆと参照フレームＲｆ２との表示時間差Ｔ２に基づいて、動きベクトルＭＶ１に対してスケーリングを行い、その結果生成された動きベクトルＭＶｓを出力する。
【０２２０】
マルチフレームバッファ２０１は、復号画像信号Ｄｉのうち、フレーム間予測で参照される可能性がある信号を格納する。また、このマルチフレームバッファ２０１には、実施の形態７のマルチフレームバッファ１０１の短時間メモリ１０１ｓ及び長時間メモリ１０１ｌと同様の機能及び構成を有する、短時間メモリ２０１ｓと長時間メモリ２０１ｌとが確保されている。つまり、復号画像信号Ｄｉが示すフレームは短時間メモリ２０１ｓと長時間メモリ２０１ｌとに適宜分別して保存される。
【０２２１】
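Furthermore, the multi-frame buffer 201 includes a notification unit 215 having the same function and configuration as the notification unit 115 of the multi-frame buffer 101 of Embodiment 7. That is, the notification unit 215 outputs a notification signal Inf indicating whether each of the reference frames Rf1 and Rf2 read out by the motion compensation unit 203 was read from the short-term memory 201s or from the long-term memory 201l.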
【０２２２】
判定部２１４は、実施の形態７の判定部１１４と同様の機能及び構成を有し、通知部２１５からの通知信号Ｉｎｆを取得して、符号化対象ブロックごとに参照される参照フレームＲｆ１，Ｒｆ２のうちの少なくとも１つが長時間メモリ２０１ｌから読み出されたものか否かを判定する。そして、判定部２１４は、その判定結果に基づいてスイッチ２１３の接点の切り換えを指示する切換信号ｓｉ１を出力する。
【０２２３】
スイッチ２１３は、上述の切換信号ｓｉ１に応じて接点を切り換えることで、可変長復号化部２０６から取得された動きベクトルＭＶ１の出力先を、動きベクトルスケーリング部２０７と加算器２１０とに切り換える。
【０２２４】
即ち上述の判定部２１４は、参照フレームＲｆ１，Ｒｆ２が短時間メモリ２０１ｓから読み出されたものであると判定したときには、スイッチ２１３の出力先が動きベクトルスケーリング部２０７になるように指示する切換信号ｓｉ１を出力し、参照フレームＲｆ１，Ｒｆ２のうちの少なくとも１つが長時間メモリ２０１ｌから読み出されたものであると判定したときには、スイッチ２１３の出力先が加算器２１０になるように指示する切換信号ｓｉ１を出力する。
【０２２５】
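When the output destination of the switch 213 is set to the motion vector scaling unit 207, the adder 210 adds the difference vector MVd obtained from the variable-length decoding unit 206 and the motion vector MVs obtained from the motion vector scaling unit 207, and outputs the resulting motion vector MV2 to the motion compensation unit 203.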
【０２２６】
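When the adder 210 itself is set as the output destination of the switch 213, it uses the motion vector MV1 obtained from the variable-length decoding unit 206 via the switch 213 in place of the motion vector MVs from the motion vector scaling unit 207, adds the difference vector MVd and the motion vector MV1, and outputs the result to the motion compensation unit 203 as the motion vector MV2.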
【０２２７】
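The sequence of operations for decoding motion vectors in the moving image decoding device 200 of the present embodiment is described with reference to FIG. 25.
FIG. 25 is a flowchart showing the sequence of operations for decoding motion vectors.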
【０２２８】
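First, the variable-length decoding unit 206 of the moving image decoding device 200 obtains the image coded signal Bs and performs variable-length decoding on it, thereby decoding the motion vector MV1 (step S201) and the difference vector MVd (step S202).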
【０２２９】
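Next, the determination unit 214 determines, based on the notification signal Inf, whether at least one of the reference frames Rf1 and Rf2 was read from the long-term memory 201l (step S203).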
【０２３０】
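When the determination unit 214 determines that both reference frames Rf1 and Rf2 were read from the short-term memory 201s (N in step S203), it causes the switch 213 to change its contact so that the output destination of the switch 213 is the motion vector scaling unit 207. As a result, the motion vector scaling unit 207 obtains the motion vector MV1 and scales it to create the motion vector MVs (step S204). The adder 210 then obtains the created motion vector MVs.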
【０２３１】
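On the other hand, when the determination unit 214 determines that at least one of the reference frames Rf1 and Rf2 was read from the long-term memory 201l (Y in step S203), it causes the switch 213 to change its contact so that the output destination of the switch 213 is the adder 210. As a result, the adder 210 treats the motion vector MV1, obtained from the variable-length decoding unit 206 via the switch 213, as if it were the motion vector MVs output from the motion vector scaling unit 207 (step S205).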
【０２３２】
The adder 210 then adds the motion vector MVs to the difference vector MVd and outputs the resulting motion vector MV2 to the motion compensation unit 203 (step S206).
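The decoding steps S203 to S206 mirror the encoder and can be sketched as below. As before, this is an illustrative sketch only: the function name and parameters are hypothetical, the scaling factor is assumed to be T2 / T1, and MV1 and MVd are assumed to have already been obtained by variable-length decoding (steps S201 and S202).

```python
from fractions import Fraction


def decode_mv2(mv1, mvd, any_long_term, t1=None, t2=None):
    """Reconstruct MV2 from the decoded MV1 and difference vector MVd."""
    if any_long_term:
        # S203: Y -> S205: use MV1 in place of the scaled vector MVs.
        mvs = mv1
    else:
        # S203: N -> S204: scale MV1 by the display time ratio T2 / T1.
        ratio = Fraction(t2, t1)
        mvs = (mv1[0] * ratio, mv1[1] * ratio)
    # S206: MV2 = MVd + MVs.
    return (mvd[0] + mvs[0], mvd[1] + mvs[1])


# Both branches recover the same MV2 = (9, -4) that an encoder following
# the matching rule would have started from.
print(decode_mv2((4, -2), (1, 0), any_long_term=False, t1=1, t2=2))
print(decode_mv2((4, -2), (5, -2), any_long_term=True))
```

The decoder must make the same short-term or long-term decision as the encoder; otherwise the reconstructed MV2 would differ from the encoded one.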
【０２３３】
このように本実施の形態では、実施の形態７と同様、２つの参照フレームＲｆ１，Ｒｆ２のうちの少なくとも１つが長時間メモリ２０１ｌから読み出されたときには、動きベクトルスケーリング部２０７にスケーリングをさせないため、フレームの表示時間に関する情報が長時間メモリ２０１ｌに記録されている場合であっても、その情報を利用した意味を持たないスケーリングの実行を省き、動きベクトルの復号化効率の向上を図ることができる。また、フレームの表示時間に関する情報が長時間メモリ２０１ｌに記録されていない場合には、無理なスケーリングの実行を省き、動きベクトルの復号化効率の向上を図ることができる。
【０２３４】
なお、本実施の形態では、スイッチ２１３の切り換えにより動きベクトルスケーリング部２０７に対してスケーリングを行わせたり、スケーリングを行わせないようにしたが、スイッチ２１３を備えずに、２つの参照フレームＲｆ１，Ｒｆ２のうちの少なくとも１つが長時間メモリ２０１ｌから読み出されたときには、常に動きベクトルスケーリング部２０７にスケーリングを行わせないようにしても良い。
【０２３５】
また、本実施の形態では、２つの参照フレームＲｆ１，Ｒｆ２のうちの少なくとも１つが長時間メモリ２０１ｌから読み出されたときには、動きベクトルＭＶ１に差分ベクトルＭＶｄを加算して動きベクトルＭＶ２を導出したが、加算せずに動きベクトルＭＶ２を直接復号化しても良い。これは、２つの参照フレームＲｆ１，Ｒｆ２のうちの少なくとも１つが長時間メモリ２０１ｌから読み出された場合には差分ベクトルＭＶｄを復号化するのではなく、動きベクトルＭＶ２を復号化することを意味する。この場合にはさらに、動きベクトルＭＶ１，ＭＶ２のそれぞれが、復号化対象ブロックの周辺にある周辺ブロックから求められた動きベクトルＭＶ１，ＭＶ２の予測値を差し引いて符号化されているときには、その符号化された動きベクトルＭＶ１，ＭＶ２と上述の予測値を加算して、動きベクトルＭＶ１，ＭＶ２を復号化しても良い。
【０２３６】
また、本実施の形態では、通知部２１５をマルチフレームバッファ２０１に備えたが、マルチフレームバッファ２０１以外の他の構成要素に備えても良く、通知部２１５を単独で備えても良い。
【０２３７】
なお、実施の形態１〜８で説明したフレームは、フィールドであっても良い。また、フレームとフィールドを総称してピクチャと呼ぶ。
以上のように、本発明に係る動きベクトル検出方法と、この方法を用いた動きベクトル符号化方法と、動きベクトル復号化方法と、これらの方法を用いた装置について実施の形態１〜８を用いて説明したが、本発明は実施の形態１〜８に限定されるものではなく、他の形態によっても実現されるのは言うまでもない。
【０２３８】
（実施の形態９）
さらに、上記各実施の形態で示した動きベクトル検出方法及び動きベクトル符号化方法並びに動きベクトル復号化方法を実現するためのプログラムを、フレキシブルディスク等の記憶媒体に記録するようにすることにより、上記各実施の形態で示した処理を、独立したコンピュータシステムにおいて簡単に実施することが可能となる。
【０２３９】
図２６は、実施の形態１〜８の動きベクトル検出方法及び動きベクトル符号化方法並びに動きベクトル復号化方法をコンピュータシステムにより実現するためのプログラムを格納する記憶媒体についての説明図である。
【０２４０】
図２６中の（ｂ）は、フレキシブルディスクＦＤの正面からみた外観、断面構造、及びディスク本体ＦＤ１を示し、図２６中の（ａ）は、記録媒体の本体であるディスク本体ＦＤ１の物理フォーマットの例を示している。
【０２４１】
ディスク本体ＦＤ１はケースＦ内に内蔵され、ディスク本体ＦＤ１の表面には、同心円状に外周からは内周に向かって複数のトラックＴｒが形成され、各トラックは角度方向に１６のセクタＳｅに分割されている。従って、上記プログラムを格納したフレキシブルディスクＦＤでは、上記ディスク本体ＦＤ１上に割り当てられた領域に、上記プログラムとしての動きベクトル符号化方法や動きベクトル復号化方法が記録されている。
【０２４２】
また、図２６中の（ｃ）は、フレキシブルディスクＦＤに上記プログラムの記録再生を行うための構成を示す。
上記プログラムをフレキシブルディスクＦＤに記録する場合は、コンピュータシステムＣｓが上記プログラムとしての動きベクトル符号化方法または動きベクトル復号化方法をフレキシブルディスクドライブＦＤＤを介して書き込む。また、フレキシブルディスクＦＤ内のプログラムにより上記動きベクトル符号化方法又は動きベクトル復号化方法をコンピュータシステムＣｓ中に構築する場合は、フレキシブルディスクドライブＦＤＤによりプログラムがフレキシブルディスクＦＤから読み出され、コンピュータシステムＣｓに転送される。
【０２４３】
なお、上記説明では、記録媒体としてフレキシブルディスクＦＤを用いて説明を行ったが、光ディスクを用いても同様に行うことができる。また、記録媒体はこれに限らず、ＩＣカード、ＲＯＭカセット等、プログラムを記録できるものであれば同様に実施することができる。
【０２４４】
（実施の形態１０）
さらにここで、上記実施の形態で示した動きベクトル検出方法及び動きベクトル符号化方法並びに動きベクトル復号化方法の応用例とそれを用いたシステムを説明する。
【０２４５】
図２７は、コンテンツ配信サービスを実現するコンテンツ供給システムｅｘ１００の全体構成を示すブロック図である。通信サービスの提供エリアを所望の大きさに分割し、各セル内にそれぞれ固定無線局である基地局ｅｘ１０７〜ｅｘ１１０が設置されている。
【０２４６】
このコンテンツ供給システムｅｘ１００は、例えば、インターネットｅｘ１０１にインターネットサービスプロバイダｅｘ１０２および電話網ｅｘ１０４、および基地局ｅｘ１０７〜ｅｘ１１０を介して、コンピュータｅｘ１１１、ＰＤＡ（ｐｅｒｓｏｎａｌｄｉｇｉｔａｌａｓｓｉｓｔａｎｔ）ｅｘ１１２、カメラｅｘ１１３、携帯電話ｅｘ１１４、カメラ付きの携帯電話ｅｘ１１５などの各機器が接続される。
【０２４７】
しかし、コンテンツ供給システムｅｘ１００は図２７のような組合せに限定されず、いずれかを組み合わせて接続するようにしてもよい。また、固定無線局である基地局ｅｘ１０７〜ｅｘ１１０を介さずに、各機器が電話網ｅｘ１０４に直接接続されてもよい。
【０２４８】
カメラｅｘ１１３はデジタルビデオカメラ等の動画撮影が可能な機器である。また、携帯電話は、ＰＤＣ（ＰｅｒｓｏｎａｌＤｉｇｉｔａｌＣｏｍｍｕｎｉｃａｔｉｏｎｓ）方式、ＣＤＭＡ（ＣｏｄｅＤｉｖｉｓｉｏｎＭｕｌｔｉｐｌｅＡｃｃｅｓｓ）方式、Ｗ−ＣＤＭＡ（Ｗｉｄｅｂａｎｄ−ＣｏｄｅＤｉｖｉｓｉｏｎＭｕｌｔｉｐｌｅＡｃｃｅｓｓ）方式、若しくはＧＳＭ（ＧｌｏｂａｌＳｙｓｔｅｍｆｏｒＭｏｂｉｌｅＣｏｍｍｕｎｉｃａｔｉｏｎｓ）方式の携帯電話機、またはＰＨＳ（ＰｅｒｓｏｎａｌＨａｎｄｙｐｈｏｎｅＳｙｓｔｅｍ）等であり、いずれでも構わない。
【０２４９】
また、ストリーミングサーバｅｘ１０３は、カメラｅｘ１１３から基地局ｅｘ１０９、電話網ｅｘ１０４を通じて接続されており、カメラｅｘ１１３を用いてユーザが送信する符号化処理されたデータに基づいたライブ配信等が可能になる。撮影したデータの符号化処理はカメラｅｘ１１３で行っても、データの送信処理をするサーバ等で行ってもよい。また、カメラｅｘ１１６で撮影した動画データはコンピュータｅｘ１１１を介してストリーミングサーバｅｘ１０３に送信されてもよい。カメラｅｘ１１６はデジタルカメラ等の静止画、動画が撮影可能な機器である。この場合、動画データの符号化はカメラｅｘ１１６で行ってもコンピュータｅｘ１１１で行ってもどちらでもよい。また、符号化処理はコンピュータｅｘ１１１やカメラｅｘ１１６が有するＬＳＩｅｘ１１７において処理することになる。なお、画像符号化・復号化用のソフトウェアをコンピュータｅｘ１１１等で読み取り可能な記録媒体である何らかの蓄積メディア（ＣＤ−ＲＯＭ、フレキシブルディスク、ハードディスクなど）に組み込んでもよい。さらに、カメラ付きの携帯電話ｅｘ１１５で動画データを送信してもよい。このときの動画データは携帯電話ｅｘ１１５が有するＬＳＩで符号化処理されたデータである。
【０２５０】
このコンテンツ供給システムｅｘ１００では、ユーザがカメラｅｘ１１３、カメラｅｘ１１６等で撮影しているコンテンツ（例えば、音楽ライブを撮影した映像等）を上記実施の形態同様に符号化処理してストリーミングサーバｅｘ１０３に送信する一方で、ストリーミングサーバｅｘ１０３は要求のあったクライアントに対して上記コンテンツデータをストリーム配信する。クライアントとしては、上記符号化処理されたデータを復号化することが可能な、コンピュータｅｘ１１１、ＰＤＡｅｘ１１２、カメラｅｘ１１３、携帯電話ｅｘ１１４等がある。このようにすることでコンテンツ供給システムｅｘ１００は、符号化されたデータをクライアントにおいて受信して再生することができ、さらにクライアントにおいてリアルタイムで受信して復号化し、再生することにより、個人放送をも実現可能になるシステムである。
【０２５１】
このシステムを構成する各機器の符号化、復号化には上記各実施の形態で示した動画像符号化装置あるいは動画像復号化装置を用いるようにすればよい。
その一例として携帯電話について説明する。
【０２５２】
図２８は、上記実施の形態で説明した動きベクトル検出方法及び動きベクトル符号化方法と動きベクトル復号化方法を用いた携帯電話ｅｘ１１５を示す図である。携帯電話ｅｘ１１５は、基地局ｅｘ１１０との間で電波を送受信するためのアンテナｅｘ２０１、ＣＣＤカメラ等の映像、静止画を撮ることが可能なカメラ部ｅｘ２０３、カメラ部ｅｘ２０３で撮影した映像、アンテナｅｘ２０１で受信した映像等が復号化されたデータを表示する液晶ディスプレイ等の表示部ｅｘ２０２、操作キーｅｘ２０４群から構成される本体部、音声出力をするためのスピーカ等の音声出力部ｅｘ２０８、音声入力をするためのマイク等の音声入力部ｅｘ２０５、撮影した動画もしくは静止画のデータ、受信したメールのデータ、動画のデータもしくは静止画のデータ等、符号化されたデータまたは復号化されたデータを保存するための記録メディアｅｘ２０７、携帯電話ｅｘ１１５に記録メディアｅｘ２０７を装着可能とするためのスロット部ｅｘ２０６を有している。記録メディアｅｘ２０７はＳＤカード等のプラスチックケース内に電気的に書換えや消去が可能な不揮発性メモリであるＥＥＰＲＯＭ（ＥｌｅｃｔｒｉｃａｌｌｙＥｒａｓａｂｌｅａｎｄＰｒｏｇｒａｍｍａｂｌｅＲｅａｄＯｎｌｙＭｅｍｏｒｙ）の一種であるフラッシュメモリ素子を格納したものである。
【０２５３】
さらに、携帯電話ｅｘ１１５について図２９を用いて説明する。携帯電話ｅｘ１１５は表示部ｅｘ２０２及び操作キーｅｘ２０４を備えた本体部の各部を統括的に制御するようになされた主制御部ｅｘ３１１に対して、電源回路部ｅｘ３１０、操作入力制御部ｅｘ３０４、画像符号化部ｅｘ３１２、カメラインターフェース部ｅｘ３０３、ＬＣＤ（ＬｉｑｕｉｄＣｒｙｓｔａｌＤｉｓｐｌａｙ）制御部ｅｘ３０２、画像復号化部ｅｘ３０９、多重分離部ｅｘ３０８、記録再生部ｅｘ３０７、変復調回路部ｅｘ３０６及び音声処理部ｅｘ３０５が同期バスｅｘ３１３を介して互いに接続されている。
【０２５４】
電源回路部ｅｘ３１０は、ユーザの操作により終話及び電源キーがオン状態にされると、バッテリパックから各部に対して電力を供給することによりカメラ付ディジタル携帯電話ｅｘ１１５を動作可能な状態に起動する。
【０２５５】
携帯電話ｅｘ１１５は、ＣＰＵ、ＲＯＭ及びＲＡＭ等でなる主制御部ｅｘ３１１の制御に基づいて、音声通話モード時に音声入力部ｅｘ２０５で集音した音声信号を音声処理部ｅｘ３０５によってディジタル音声データに変換し、これを変復調回路部ｅｘ３０６でスペクトラム拡散処理し、送受信回路部ｅｘ３０１でディジタルアナログ変換処理及び周波数変換処理を施した後にアンテナｅｘ２０１を介して送信する。また携帯電話機ｅｘ１１５は、音声通話モード時にアンテナｅｘ２０１で受信した受信データを増幅して周波数変換処理及びアナログディジタル変換処理を施し、変復調回路部ｅｘ３０６でスペクトラム逆拡散処理し、音声処理部ｅｘ３０５によってアナログ音声データに変換した後、これを音声出力部ｅｘ２０８を介して出力する。
【０２５６】
さらに、データ通信モード時に電子メールを送信する場合、本体部の操作キーｅｘ２０４の操作によって入力された電子メールのテキストデータは操作入力制御部ｅｘ３０４を介して主制御部ｅｘ３１１に送出される。主制御部ｅｘ３１１は、テキストデータを変復調回路部ｅｘ３０６でスペクトラム拡散処理し、送受信回路部ｅｘ３０１でディジタルアナログ変換処理及び周波数変換処理を施した後にアンテナｅｘ２０１を介して基地局ｅｘ１１０へ送信する。
【０２５７】
データ通信モード時に画像データを送信する場合、カメラ部ｅｘ２０３で撮像された画像データをカメラインターフェース部ｅｘ３０３を介して画像符号化部ｅｘ３１２に供給する。また、画像データを送信しない場合には、カメラ部ｅｘ２０３で撮像した画像データをカメラインターフェース部ｅｘ３０３及びＬＣＤ制御部ｅｘ３０２を介して表示部ｅｘ２０２に直接表示することも可能である。
【０２５８】
画像符号化部ｅｘ３１２は、本願発明で説明した画像符号化装置を備えた構成であり、カメラ部ｅｘ２０３から供給された画像データを上記実施の形態で示した画像符号化装置に用いた符号化方法によって圧縮符号化することにより符号化画像データに変換し、これを多重分離部ｅｘ３０８に送出する。また、このとき同時に携帯電話機ｅｘ１１５は、カメラ部ｅｘ２０３で撮像中に音声入力部ｅｘ２０５で集音した音声を音声処理部ｅｘ３０５を介してディジタルの音声データとして多重分離部ｅｘ３０８に送出する。
【０２５９】
多重分離部ｅｘ３０８は、画像符号化部ｅｘ３１２から供給された符号化画像データと音声処理部ｅｘ３０５から供給された音声データとを所定の方式で多重化し、その結果得られる多重化データを変復調回路部ｅｘ３０６でスペクトラム拡散処理し、送受信回路部ｅｘ３０１でディジタルアナログ変換処理及び周波数変換処理を施した後にアンテナｅｘ２０１を介して送信する。
【０２６０】
データ通信モード時にホームページ等にリンクされた動画像ファイルのデータを受信する場合、アンテナｅｘ２０１を介して基地局ｅｘ１１０から受信した受信データを変復調回路部ｅｘ３０６でスペクトラム逆拡散処理し、その結果得られる多重化データを多重分離部ｅｘ３０８に送出する。
【０２６１】
また、アンテナｅｘ２０１を介して受信された多重化データを復号化するには、多重分離部ｅｘ３０８は、多重化データを分離することにより画像データのビットストリームと音声データのビットストリームとに分け、同期バスｅｘ３１３を介して当該符号化画像データを画像復号化部ｅｘ３０９に供給すると共に当該音声データを音声処理部ｅｘ３０５に供給する。
【０２６２】
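Next, the image decoding unit ex309, which includes the image decoding device described in the present invention, generates reproduced moving image data by decoding the bit stream of image data with a decoding method corresponding to the encoding method shown in the above embodiments, and supplies it to the display unit ex202 via the LCD control unit ex302. In this way, for example, the moving image data contained in a moving image file linked to a home page is displayed. At the same time, the audio processing unit ex305 converts the audio data into analog audio data and supplies it to the audio output unit ex208. In this way, for example, the audio data contained in a moving image file linked to a home page is reproduced.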
【０２６３】
なお、上記システムの例に限られず、最近は衛星、地上波によるディジタル放送が話題となっており、図３０に示すようにディジタル放送用システムにも上記実施の形態の少なくとも画像符号化装置または画像復号化装置のいずれかを組み込むことができる。具体的には、放送局ｅｘ４０９では映像情報のビットストリームが電波を介して通信または放送衛星ｅｘ４１０に伝送される。これを受けた放送衛星ｅｘ４１０は、放送用の電波を発信し、この電波を衛星放送受信設備をもつ家庭のアンテナｅｘ４０６で受信し、テレビ（受信機）ｅｘ４０１またはセットトップボックス（ＳＴＢ）ｅｘ４０７などの装置によりビットストリームを復号化してこれを再生する。また、記録媒体であるＣＤやＤＶＤ等の蓄積メディアｅｘ４０２に記録したビットストリームを読み取り、復号化する再生装置ｅｘ４０３にも上記実施の形態で示した画像復号化装置を実装することが可能である。この場合、再生された映像信号はモニタｅｘ４０４に表示される。また、ケーブルテレビ用のケーブルｅｘ４０５または衛星／地上波放送のアンテナｅｘ４０６に接続されたセットトップボックスｅｘ４０７内に画像復号化装置を実装し、これをテレビのモニタｅｘ４０８で再生する構成も考えられる。このときセットトップボックスではなく、テレビ内に画像復号化装置を組み込んでも良い。また、アンテナｅｘ４１１を有する車ｅｘ４１２で衛星ｅｘ４１０からまたは基地局ｅｘ１０７等から信号を受信し、車ｅｘ４１２が有するカーナビゲーションｅｘ４１３等の表示装置に動画を再生することも可能である。
【０２６４】
更に、画像信号を上記実施の形態で示した画像符号化装置で符号化し、記録媒体に記録することもできる。具体例としては、ＤＶＤディスクｅｘ４２１に画像信号を記録するＤＶＤレコーダや、ハードディスクに記録するディスクレコーダなどのレコーダｅｘ４２０がある。更にＳＤカードｅｘ４２２に記録することもできる。レコーダｅｘ４２０が上記実施の形態で示した画像復号化装置を備えていれば、ＤＶＤディスクｅｘ４２１やＳＤカードｅｘ４２２に記録した画像信号を再生し、モニタｅｘ４０８で表示することができる。
【０２６５】
なお、カーナビゲーションｅｘ４１３の構成は例えば図２９に示す構成のうち、カメラ部ｅｘ２０３とカメラインターフェース部ｅｘ３０３、画像符号化部ｅｘ３１２を除いた構成が考えられ、同様なことがコンピュータｅｘ１１１やテレビ（受信機）ｅｘ４０１等でも考えられる。
【０２６６】
また、上記携帯電話ｅｘ１１４等の端末は、符号化器・復号化器を両方持つ送受信型の端末の他に、符号化器のみの送信端末、復号化器のみの受信端末の３通りの実装形式が考えられる。
【０２６７】
このように、上記実施の形態で示した動きベクトル検出方法、符号化方法、復号化方法を上述したいずれの機器・システムに用いることは可能であり、そうすることで、上記実施の形態で説明した効果を得ることができる。
【０２６８】
また、本発明はかかる上記実施形態に限定されるものではなく、本発明の範囲を逸脱することなく種々の変形または修正が可能である。
【０２６９】
【発明の効果】
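As is clear from the above description, the motion vector detection method according to the present invention is a motion vector detection method for detecting a motion vector indicating the displacement, from another picture, of a block in a picture constituting a moving image, and is characterized by including: a first candidate generation step of generating first motion vector candidates for the detection target block based on a first reference picture; a second candidate generation step of generating second motion vector candidates for the detection target block based on a second reference picture; an interpolation step of creating an interpolated prediction block by interpolating the pixel values of mutually corresponding pixels, based on a first prediction block indicated by a first motion vector candidate in the first reference picture and a second prediction block indicated by a second motion vector candidate in the second reference picture; a calculation step of calculating an evaluation value based on the differences between the pixel values of mutually corresponding pixels of the interpolated prediction block and the detection target block; a selection step of selecting, based on the evaluation value, one of the plurality of first motion vector candidates generated in the first candidate generation step and one of the plurality of second motion vector candidates generated in the second candidate generation step; and a detection step of detecting the selected first motion vector candidate as the first motion vector of the detection target block based on the first reference picture, and detecting the selected second motion vector candidate as the second motion vector of the detection target block based on the second reference picture. For example, in the selection step, the first and second motion vector candidates that minimize the evaluation value are selected, one of each.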
【０２７０】
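As a result, the evaluation value is calculated from the result of interpolating pixel values, so that even when a fade occurs, the increase in the error of the evaluation value that the fade would otherwise cause is prevented and the optimum motion vectors can be detected.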
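The joint selection of the two motion vectors can be sketched in Python as follows. This is only one possible reading of the steps: the function names are hypothetical, extrapolation (2 × PB1 − PB2) is assumed as the interpolation, the SAD is assumed as the evaluation value, an exhaustive search over all candidate pairs is assumed, and every candidate is assumed to keep its block inside the frame.

```python
from itertools import product


def get_block(frame, top, left, size):
    # Cut a size x size block out of a frame stored as a list of rows.
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    # Sum of absolute differences between corresponding pixels.
    return sum(abs(a - b)
               for ra, rb in zip(block_a, block_b)
               for a, b in zip(ra, rb))


def extrapolate(pb1, pb2):
    # Interpolated prediction block PB0 = 2 * PB1 - PB2 (assumed method).
    return [[2 * a - b for a, b in zip(r1, r2)] for r1, r2 in zip(pb1, pb2)]


def detect_motion_vectors(target, rf1, rf2, top, left, size, cands1, cands2):
    """Pick the (MVC1, MVC2) pair whose interpolated block best matches."""
    cur = get_block(target, top, left, size)
    best = None
    for (dx1, dy1), (dx2, dy2) in product(cands1, cands2):
        pb1 = get_block(rf1, top + dy1, left + dx1, size)
        pb2 = get_block(rf2, top + dy2, left + dx2, size)
        cost = sad(cur, extrapolate(pb1, pb2))
        if best is None or cost < best[0]:
            best = (cost, (dx1, dy1), (dx2, dy2))
    return best[1], best[2], best[0]


# Synthetic fade: content moves 1 pixel right per frame while brightness
# rises by 10 per frame (rf2 -> rf1 -> target).
rf2 = [[(x + 2) ** 2 for x in range(8)] for _ in range(2)]
rf1 = [[(x + 1) ** 2 + 10 for x in range(8)] for _ in range(2)]
target = [[x ** 2 + 20 for x in range(8)] for _ in range(2)]

mv1, mv2, cost = detect_motion_vectors(
    target, rf1, rf2, top=0, left=3, size=2,
    cands1=[(0, 0), (-1, 0)], cands2=[(0, 0), (-2, 0)])
print(mv1, mv2, cost)
```

Because the cost is measured on the interpolated block rather than on each reference block separately, the brightness change of the fade cancels out for the correct pair of vectors.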
【図面の簡単な説明】
【図１】本発明の第１の実施の形態における動画像符号化装置の構成を示すブロック図である。
【図２】同上の参照フレームと符号化対象フレームの時間的位置関係を示すフレーム配置図である。
【図３】同上の動き推定部の構成を示すブロック図である。
【図４】同上の第１動きベクトル候補及び第２動きベクトル候補の生成方法を説明するための説明図である。
【図５】同上の画像符号化信号のフォーマットの概念を示す概念図である。
【図６】同上の動画像符号化装置が動きベクトルを検出して符号化するまでの動作を示すフロー図である。
【図７】本発明の第２の実施の形態における動画像復号化装置の構成を示すブロック図である。
【図８】本発明の第３の実施の形態における動画像符号化装置の構成を示すブロック図である。
【図９】同上の動き推定部の構成を示すブロック図である。
【図１０】同上の第１動きベクトル候補及び第２動きベクトル候補の生成方法を説明するための説明図である。
【図１１】同上の画像符号化信号のフォーマットの概念を示す概念図である。
【図１２】本発明の第４の実施の形態における動画像復号化装置の構成を示すブロック図である。
【図１３】本発明の第５の実施の形態における動画像符号化装置の構成を示すブロック図である。
【図１４】同上の動き推定部の構成を示すブロック図である。
【図１５】第１動きベクトル候補及び動きベクトル並びに第２動きベクトル候補の生成方法を説明するための説明図である。
【図１６】同上の変形例に係る動画像符号化装置の構成を示すブロック図である。
【図１７】同上の変形例に係る動画像符号化装置の画像符号化信号のフォーマットの概念を示す概念図である。
【図１８】本発明の第６の実施の形態における動画像復号化装置の構成を示すブロック図である。
【図１９】本発明の第７の実施の形態における動画像符号化装置の構成を示すブロック図である。
【図２０】同上のマルチフレームバッファの内部のメモリの概略構成を示す構成図である。
【図２１】同上のマルチフレームバッファに保存されたフレームの状態を示す状態図である。
【図２２】同上の差分ベクトルの作成される様子を説明するための説明図である。
【図２３】同上の動きベクトルの符号化の一連の動作を示すフロー図である。
【図２４】本発明の第８の実施の形態における動画像復号化装置の構成を示すブロック図である。
【図２５】同上の動きベクトルの復号化の一連の動作を示すフロー図である。
【図２６】本発明の第９の実施の形態における記録媒体についての説明図である。
【図２７】本発明の第１０の実施の形態におけるコンテンツ供給システムの全体構成を示すブロック図である。
【図２８】同上の携帯電話の正面図である。
【図２９】同上の携帯電話のブロック図である。
【図３０】同上のディジタル放送用システムの全体構成を示すブロック図である。
【図３１】動きベクトルを説明するための説明図である。
【図３２】２枚のフレームを用いて予測画像を生成する様子を説明するための説明図である。
【図３３】従来例を示す動画像符号化装置の構成を示すブロック図である。
【図３４】同上の動き推定部の構成を示すブロック図である。
【図３５】同上の動きベクトルを検出する様子を説明するための説明図である。
【図３６】フェードによる画素値の変化を説明するための説明図である。
【図３７】従来例を示す動画像符号化装置が出力する画像符号化信号のフォーマットの概念を示す概念図である。
【図３８】従来例を示す動画像復号化装置の構成を示すブロック図である。
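[Explanation of symbols]
301 Multi-frame buffer
302 Motion estimation unit
303 Motion compensation unit
304 Image encoding unit
305 Image decoding unit
306 Variable-length encoding unit
308 Adder
309 Subtractor
Bs1 Image coded signal
Dr Residual decoded signal
Er Residual encoded signal
Img Image signal
MV1, MV2 Motion vectors
Pre Predicted image signal
Rc Reconstructed image signal
Rf1, Rf2 Reference frames
[0001]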
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a motion vector detecting method for detecting a motion vector indicating a motion of a region in an image when encoding the moving image.
[0002]
[Prior art]
In recent years, with the development of multimedia applications, it has become common to handle information from all kinds of media, such as images, audio, and text, in a unified manner by digitizing it. However, since a digitized image contains a huge amount of data, image information compression techniques are indispensable for storage and transmission. At the same time, standardization of compression technology is important so that compressed image data can be used interoperably. Standards for image compression technology include H.261 and H.263 of the ITU-T (International Telecommunication Union, Telecommunication Standardization Sector) and MPEG (Moving Picture Experts Group)-1, MPEG-2, and MPEG-4 of the ISO (International Organization for Standardization) (see, for example, Non-Patent Documents 1 and 2).
[0003]
Inter-frame prediction with motion compensation is a technique common to the video coding schemes of these standards. In the motion compensation of these schemes, a frame of an input image is divided into rectangles of a predetermined size (hereinafter referred to as blocks), and predicted pixels are generated for each block from a motion vector indicating the motion between frames.
[0004]
FIG. 31 is an explanatory diagram for describing a motion vector.
For example, when a moving subject is shot with a video camera, the position of the block containing the subject moves from frame to frame. That is, if the moving subject is contained in block B of frame Rf and in block B0 of frame Tf, block B0 is displaced from block B, and the displacement relative to block B0 is expressed as the motion vector MV.
[0005]
In the motion compensation described above, a motion vector MV is detected for each block. The detection is generally performed by searching the reference frame (frame Rf in FIG. 31) for a block whose pixel values are close to those of the detection target block (block B0 in FIG. 31).
[0006]
In addition, in H.26L, which is currently being standardized by the ITU-T, a method is being studied in which predicted pixels are obtained by pixel interpolation using the two frames immediately preceding the encoding target frame as reference frames. Here, a prediction method that generates a predicted image (predicted pixels) by pixel interpolation with reference to two frames whose display times are earlier than that of the encoding target frame is referred to as forward interpolation prediction.
[0007]
FIG. 32 is an explanatory diagram for explaining how to generate a predicted image using two frames.
As shown in FIG. 32, when a predicted image of the block B0 of the encoding target frame Tf is generated, for example, the frame Rf1 immediately before the encoding target frame Tf and the frame Rf2 two frames before the encoding target frame Tf are used as reference frames. That is, blocks B1 and B2 whose pixel values are close to those of the block B0 are searched for in the reference frames Rf1 and Rf2, and the motion vectors MV1 and MV2 are detected from the displacements between the positions of the blocks B1 and B2 and the block B0.
[0008]
Then, a predicted image of the block B0 is generated from the block B1 of the reference frame Rf1 indicated by the motion vector MV1 and the block B2 of the reference frame Rf2 indicated by the motion vector MV2. That is, the predicted image is generated by interpolating the pixel values of the block B0 from the pixel values of the block B1 and the pixel values of the block B2. Methods for such pixel value interpolation include averaging and extrapolation. Extrapolation is highly effective for predicting screen effects such as a fade, in which pixel values change linearly over time.
[0009]
Further, the block B0 is encoded using the predicted image generated as described above.
FIG. 33 is a block diagram showing a configuration of a moving picture coding apparatus 800 for coding a moving picture according to the conventional moving picture coding method.
[0010]
The moving picture coding apparatus 800 includes a multi-frame buffer 801, a motion estimation unit 802, a motion compensation unit 803, an image encoding unit 804, an image decoding unit 805, a variable-length encoding unit 806, a motion vector scaling unit 807, an adder 808, subtractors 809 and 810, and switches 811 and 812. It divides the frame indicated by the image signal Img into blocks and performs processing for each block.
[0011]
The subtractor 809 subtracts the predicted image signal Pre from the image signal Img input to the video encoding device 800, and outputs a residual signal Res.
The image encoding unit 804 obtains the residual signal Res, performs image encoding processing such as DCT transform / quantization, and outputs a residual encoded signal Er including quantized DCT coefficients and the like.
[0012]
The image decoding unit 805 obtains the residual coded signal Er, performs image decoding processing such as inverse quantization and inverse DCT, and outputs a residual decoded signal Dr. The adder 808 adds the residual decoded signal Dr and the predicted image signal Pre, and outputs a reconstructed image signal Rc. Also, of the reconstructed image signal Rc, a signal that may be referred to in subsequent inter-frame prediction is stored in the multi-frame buffer 801.
[0013]
FIG. 34 is a block diagram illustrating a configuration of the motion estimation unit 802.
The motion estimation unit 802 detects a motion vector for each block, and includes a motion vector candidate generation unit 821, a pixel acquisition unit 822, a subtractor 823, and a motion vector selection unit 824.
[0014]
The motion vector candidate generation unit 821 generates a motion vector candidate MVC as a candidate for a motion vector MV of a block to be encoded. Here, the motion vector candidate generation unit 821 sequentially generates some motion vector candidates MVC from within a predetermined motion vector detection range.
[0015]
The pixel acquisition unit 822 acquires one block in the reference frame Rf indicated by the motion vector candidate MVC, and outputs it to the subtractor 823 as a prediction block PB.
The subtracter 823 calculates the difference between the pixel value of the block to be encoded of the image signal Img and the pixel value of the prediction block PB, and outputs the difference to the motion vector selection unit 824 as the prediction error block RB.
[0016]
When the motion vector selection unit 824 obtains the prediction error block RB for each of the motion vector candidates MVC generated by the motion vector candidate generation unit 821, it calculates, for each prediction error block RB, a prediction error evaluation value from the pixel values in the block, such as the SAD (sum of absolute values of the prediction errors) or the SSD (sum of squares of the prediction errors). The motion vector selection unit 824 then selects the motion vector candidate MVC for which the prediction error evaluation value is smallest and outputs it as the motion vector MV.
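The conventional single-reference search described above can be sketched in Python as follows. This is a minimal illustration, assuming a full search over a square range and the SAD as the evaluation value; the function names and the list-of-rows frame representation are choices made for this sketch, not details from the text.

```python
def get_block(frame, top, left, size):
    # Cut a size x size prediction block PB out of a frame (list of rows).
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    # Prediction error evaluation value: sum of absolute differences.
    return sum(abs(a - b)
               for ra, rb in zip(block_a, block_b)
               for a, b in zip(ra, rb))


def search_motion_vector(target, ref, top, left, size, search_range):
    """Return ((dx, dy), cost) for the candidate with the smallest SAD."""
    cur = get_block(target, top, left, size)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates whose block would fall outside the frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(cur, get_block(ref, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1], best[0]


# Reference frame with unique pixel values; the target block at (2, 2)
# is a copy of the reference block at row 3, column 1.
ref = [[y * 6 + x for x in range(6)] for y in range(6)]
target = [[0] * 6 for _ in range(6)]
target[2][2:4] = ref[3][1:3]
target[3][2:4] = ref[4][1:3]

print(search_motion_vector(target, ref, top=2, left=2, size=2, search_range=2))
```

Replacing `abs(a - b)` with `(a - b) ** 2` would give the SSD variant mentioned in the text.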
[0017]
During forward interpolation prediction, the motion estimation unit 802 repeats the processing described above to detect the motion vectors MV1 and MV2 of the encoding target block based on the two reference frames Rf1 and Rf2.
[0018]
FIG. 35 is an explanatory diagram for describing how to detect the motion vectors MV1 and MV2.
When the contacts of the switches 811 and 812 are switched to the contact 0 side, the motion estimation unit 802 calculates the prediction error evaluation value as described above and thereby searches the reference frame Rf1 for the block whose pixel values are closest to those of the encoding target block in the encoding target frame Tf (the block with the smallest prediction error evaluation value). It then detects the motion vector MV1, which indicates the displacement between the pixel Pt0 of the block in the encoding target frame Tf and the pixel Pt1 at the same relative position as the pixel Pt0 in the block found in the reference frame Rf1.
[0019]
Next, when the contacts of the switches 811 and 812 are switched to the contact 1 side, the motion estimation unit 802 searches the reference frame Rf2, in the same way as described above, for the block whose pixel values are closest to those of the encoding target block in the encoding target frame Tf (the block with the smallest prediction error evaluation value). It then detects the motion vector MV2, which indicates the displacement between the pixel Pt0 of the block in the encoding target frame Tf and the pixel Pt2 at the same relative position as the pixel Pt0 in the block found in the reference frame Rf2.
[0020]
At the time of forward interpolation prediction, the motion compensation unit 803 extracts, from the multi-frame buffer 801, the block at the position indicated by the motion vector MV1 in the reference frame Rf1 and the block at the position indicated by the motion vector MV2 in the reference frame Rf2. Then, the motion compensation unit 803 performs a pixel value interpolation process based on these blocks to generate and output a predicted image signal Pre indicating a predicted image.
[0021]
Note that an encoding target block for which a predicted image is obtained by forward interpolation prediction is referred to as a forward interpolation prediction block. The motion compensation unit 803 can also switch to another prediction method for each block, for example, forward prediction, in which prediction is performed from only one frame whose display time is earlier.
[0022]
Here, a fade in which a pixel value (luminance value) changes with time will be described. As described above, the position of the block including the subject is displaced in accordance with the movement of the subject, and the pixel value in the block changes with time.
[0023]
FIG. 36 is an explanatory diagram for describing a change in a pixel value due to a fade.
The pixel value of the pixel Pt2 indicated by the above-described motion vector MV2 changes to the pixel value of the pixel Pt1 indicated by the motion vector MV1. It can be assumed that such a change is proportional to time if the time interval is short, as shown by the line L in FIG.
[0024]
Therefore, the pixel value P0 of the pixel Pt0 in the block B0 of the encoding target frame Tf can be predicted by extrapolation from the pixel values P1 and P2 of the pixels Pt1 and Pt2 in the reference frames Rf1 and Rf2, using the expression P0 = 2 × P1 − P2.
[0025]
By performing the extrapolation given by the above expression, the motion compensation unit 803 enhances the prediction effect for fades and improves the coding efficiency. For images without a fade, the motion compensation unit 803 performs interpolation by averaging instead of extrapolation; this widens the choice of prediction methods so that a more suitable one can be selected, which also improves the coding efficiency.
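The two interpolation methods can be compared with a short Python sketch. The block values below are made up for illustration (a fade that brightens every pixel by 10 per frame), and the function names are hypothetical; clipping to a valid pixel range, which a real codec would need, is left out.

```python
def extrapolate(p1, p2):
    # Fade-oriented prediction from the text: P0 = 2 * P1 - P2.
    return 2 * p1 - p2


def average(p1, p2):
    # Averaging interpolation, suited to images without a fade.
    return (p1 + p2) / 2


def predict_block(pb1, pb2, method):
    # Apply the chosen method to each pair of corresponding pixels.
    return [[method(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(pb1, pb2)]


# Illustrative fade: Rf2 -> Rf1 brightens by 10, so Tf is expected at +20.
block_rf2 = [[100, 102], [104, 106]]
block_rf1 = [[110, 112], [114, 116]]

print(predict_block(block_rf1, block_rf2, extrapolate))
print(predict_block(block_rf1, block_rf2, average))
```

With these values, extrapolation continues the brightness trend, whereas averaging lands between the two reference blocks and so lags behind the fade.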
[0026]
The motion vector scaling unit 807 performs scaling on the motion vector MV1.
As shown in FIG. 35, the motion vector scaling unit 807 scales the motion vector MV1 detected by the motion estimation unit 802 based on the display time difference T1, which is the difference between the display times of the encoding target frame Tf and the reference frame Rf1, and the display time difference T2, which is the difference between the display times of the encoding target frame Tf and the reference frame Rf2.
[0027]
That is, the motion vector scaling unit 807 performs scaling on the motion vector MV1 by multiplying the ratio of the display time difference T2 to the display time difference T1 (T2 / T1) by the motion vector MV1, thereby obtaining the motion vector MVs.
[0028]
Information on the display time differences T1 and T2 is obtained from the multi-frame buffer 801. That is, each frame indicated by the reconstructed image signal Rc is recorded in the multi-frame buffer 801 together with information on the display time of that frame.
[0029]
The subtractor 810 subtracts the above-described motion vector MVs from the motion vector MV2 detected by the motion estimating unit 802, and outputs a difference vector MVd shown in FIG.
[0030]
The variable-length encoding unit 806 performs variable-length encoding on the motion vector MV1, the difference vector MVd, and the residual encoded signal Er, and outputs an image encoded signal Bs.
As described above, the moving image encoding device 800 encodes the image signal Img and outputs the image encoded signal Bs by the processing described above.
[0031]
FIG. 37 is a conceptual diagram showing the concept of the format of the image coded signal Bs.
The image coded signal Bs includes a frame coded signal Bsf9 whose content indicates a frame coded by forward interpolation prediction, and the frame coded signal Bsf9 further includes a block coded signal Bsb9 whose content indicates a coded forward interpolation prediction block (encoding target block). Further, the block coded signal Bsb9 includes a first motion vector coded signal Bs1 whose content indicates the coded motion vector MV1 and a difference vector coded signal Bsd whose content indicates the coded difference vector MVd.
[0032]
In the motion vector encoding method executed by the moving image encoding device 800, the motion vector MV1 and the difference vector MVd are encoded by means of the motion vector scaling unit 807, the subtractor 810, and the variable length encoding unit 806. If the direction and speed of the motion of the subject in the screen are constant among the encoding target frame Tf, the reference frame Rf1, and the reference frame Rf2, the difference vector MVd becomes close to 0, so the coding efficiency of the motion vector is good.
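The scaling and subtraction described above can be sketched as follows. This is only an illustration: the function names and the sample vectors are hypothetical, and exact rational arithmetic is used because the text does not state how the scaled vector is rounded.

```python
from fractions import Fraction


def scale_motion_vector(mv1, t1, t2):
    """Scale MV1 by T2 / T1 to obtain MVs (exact arithmetic, no rounding)."""
    ratio = Fraction(t2, t1)
    return (mv1[0] * ratio, mv1[1] * ratio)


def difference_vector(mv2, mvs):
    """MVd = MV2 - MVs, the output of the subtractor."""
    return (mv2[0] - mvs[0], mv2[1] - mvs[1])


# Hypothetical constant motion: the displacement to Rf2 is exactly
# twice the displacement to Rf1, and T2 is twice T1.
t1, t2 = 1, 2
mv1 = (-3, 1)
mv2 = (-6, 2)

mvs = scale_motion_vector(mv1, t1, t2)
mvd = difference_vector(mv2, mvs)
```

With constant motion the scaled vector MVs coincides with MV2, so the difference vector MVd is zero and only MV1 carries information.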
[0033]
Next, a moving image decoding device that decodes an image encoded by the moving image encoding device 800 will be described.
FIG. 38 is a block diagram showing a configuration of a conventional video decoding device.
[0034]
This video decoding device 900 includes a multi-frame buffer 901, a motion compensation unit 903, an image decoding unit 905, a variable length decoding unit 906, a motion vector scaling unit 907, and adders 909 and 910.
[0035]
The variable length decoding unit 906 obtains the coded image signal Bs, performs variable length decoding, and outputs the coded residual signal Er, the motion vector MV1, and the difference vector MVd. The image decoding unit 905 acquires the residual coded signal Er, performs image decoding processing such as inverse quantization and inverse DCT transform, and outputs a residual decoded signal Dr.
[0036]
When the motion vector scaling unit 907 acquires the motion vector MV1 output from the variable length decoding unit 906, it scales the motion vector MV1, in the same way as the motion vector scaling unit 807 of the moving picture coding device 800, based on the display time difference T1 between the encoding target frame Tf and the reference frame Rf1 and the display time difference T2 between the encoding target frame Tf and the reference frame Rf2, and outputs the resulting motion vector MVs.
[0037]
The adder 910 adds the scaled motion vector MVs and the difference vector MVd, and outputs the addition result as a motion vector MV2.
Similar to the motion compensation unit 803 of the video encoding device 800, the motion compensation unit 903 extracts, from the multi-frame buffer 901, the block at the position indicated by the motion vector MV1 in the reference frame Rf1 and the block at the position indicated by the motion vector MV2 in the reference frame Rf2. Then, the motion compensation unit 903 performs pixel value interpolation processing based on these blocks to generate and output a predicted image signal Pre.
[0038]
The adder 909 adds the predicted image signal Pre from the motion compensation unit 903 and the residual decoded signal Dr from the image decoding unit 905, and outputs the result as a decoded image signal Di.
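The two additions performed on the decoder side (adder 910 for the motion vector, adder 909 for the pixel values) can be sketched as below. The function names and numeric values are hypothetical, rounding of the scaled vector is not modeled, and clipping of the decoded pixel to a valid range is not mentioned in the text and is therefore omitted.

```python
from fractions import Fraction


def reconstruct_mv2(mv1, mvd, t1, t2):
    """Decoder side: MVs = MV1 * (T2 / T1), then MV2 = MVs + MVd."""
    ratio = Fraction(t2, t1)
    mvs = (mv1[0] * ratio, mv1[1] * ratio)
    return (mvs[0] + mvd[0], mvs[1] + mvd[1])


def reconstruct_pixel(pre, dr):
    """Decoded pixel Di = predicted pixel Pre + decoded residual Dr."""
    return pre + dr


# Hypothetical decoded values for one block and one pixel.
mv2 = reconstruct_mv2(mv1=(-3, 1), mvd=(1, 0), t1=1, t2=2)
di = reconstruct_pixel(pre=118, dr=2)
```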
[0039]
The multi-frame buffer 901 has the same configuration as the multi-frame buffer 801 of the video encoding device 800, and stores a signal of the decoded image signal Di that may be referred to in inter-frame prediction.
[0040]
Such a moving picture decoding apparatus 900 decodes the coded image signal Bs, and outputs the decoding result as a decoded image signal Di.
As described above, prediction error evaluation values such as SAD and SSD are used in the motion vector detection method in many conventional video encoding devices including the video encoding device 800.
[0041]
[Non-patent document 1]
"Video Coding for Low Bit Rate Communication," H.263, ITU-T, 1996.3
[0042]
[Non-patent document 2]
DRAFT FOR “H.263++” ANNEXES U, V, AND W TO RECOMMENDATION H.263 (U.4 Decoder Process), ITU-T, 2000.11
[0043]
[Problems to be solved by the invention]
However, the above-described conventional motion vector detection method does not take into account changes in pixel values due to a fade, so differences easily arise between the pixel values of the encoding target block and those of the prediction image. As a result, the prediction error evaluation value tends to increase, and an optimal motion vector cannot be detected.
[0044]
Therefore, an object of the present invention is to provide a motion vector detecting method for detecting an optimal motion vector even when a fade occurs.
[0045]
[Means for Solving the Problems]
In order to achieve the above object, the motion vector detection method of the present invention is a motion vector detection method for detecting a motion vector indicating the displacement, from another picture, of a block in a picture constituting a moving image, and includes: a first candidate generation step of generating first motion vector candidates for the detection target block based on a first reference picture; a second candidate generation step of generating second motion vector candidates for the detection target block based on a second reference picture; an interpolation step of creating an interpolation prediction block by interpolating the pixel values of mutually corresponding pixels of a first prediction block, indicated by a first motion vector candidate in the first reference picture, and a second prediction block, indicated by a second motion vector candidate in the second reference picture; a calculation step of calculating an evaluation value based on the differences between the pixel values of mutually corresponding pixels of the interpolation prediction block and the detection target block; a selection step of selecting, based on the evaluation value, one of the plurality of first motion vector candidates generated in the first candidate generation step and one of the plurality of second motion vector candidates generated in the second candidate generation step; and a detection step of detecting the selected first motion vector candidate as a first motion vector of the detection target block based on the first reference picture, and detecting the selected second motion vector candidate as a second motion vector of the detection target block based on the second reference picture. For example, in the selection step, the first and second motion vector candidates that give the smallest evaluation value are selected.
[0046]
As a result, since the evaluation value is calculated based on the result of the pixel value interpolation processing, even if a fade occurs, an increase in the evaluation value caused by the fade can be prevented, and an optimal motion vector can be detected.
[0047]
Further, the motion vector encoding method according to the present invention is a motion vector encoding method for encoding a motion vector indicating the displacement, from another picture, of a block in a picture constituting a moving image. The method includes a motion vector detecting step of detecting first and second motion vectors by the motion vector detecting method according to the present invention, and an encoding step of encoding each of the first and second motion vectors.
[0048]
Thereby, it is possible to encode the optimum motion vectors.
Further, the motion vector encoding method according to the present invention is a motion vector encoding method for identifying first and second motion vectors of an encoding target picture in a moving image by referring to two other pictures as reference pictures, and encoding information relating to the first and second motion vectors. The method includes: a reading step of reading the two reference pictures from storage means having a first area, in which a picture is recorded together with information relating to its display time, and a second area, in which a picture not recorded in the first area is recorded; a determination step of determining whether or not at least one of the two reference pictures has been read from the second area; a difference vector derivation step of obtaining, when it is determined in the determination step that at least one of the two reference pictures has been read from the second area, a difference vector indicating the difference between the first motion vector and the second motion vector; and an encoding step of encoding the first motion vector and the difference vector.
[0049]
With this, when at least one of the two reference pictures is read from the second area of the memory, the motion vector is not scaled as in the conventional example. Unreasonable scaling is thus omitted, and the coding efficiency of the motion vector can be improved.
[0050]
On the other hand, the motion vector decoding method according to the present invention is a motion vector decoding method for decoding coded information, obtained by coding information relating to first and second motion vectors, in order to identify the first and second motion vectors of a decoding target picture in a moving image that refers to two other pictures as reference pictures. The method includes: a decoding step of decoding, from the coded information, the first motion vector and a related vector relating to the first and second motion vectors; a reading step of reading the two reference pictures from storage means having a first area, in which a picture is recorded together with information relating to its display time, and a second area, in which a picture not recorded in the first area is recorded; a determination step of determining whether or not at least one of the two reference pictures has been read from the second area; and a calculation step of calculating, when it is determined in the determination step that at least one of the two reference pictures has been read from the second area, the second motion vector by adding the related vector and the first motion vector.
[0051]
With this, when at least one of the two reference pictures is read from the second area of the memory, the motion vector is not scaled as in the conventional example. Unreasonable scaling is thus omitted, and the decoding efficiency of the motion vector can be improved.
[0052]
Note that the present invention can also be realized as a moving image encoding device, a program, and a storage medium that stores the program using the motion vector detection method and the motion vector encoding method.
[0053]
[Best mode for carrying out the invention]
(Embodiment 1)
Hereinafter, a moving picture coding apparatus according to the first embodiment of the present invention will be described with reference to the drawings.
[0054]
FIG. 1 is a block diagram showing a configuration of a moving picture coding apparatus 300A according to the present embodiment.
The moving picture coding apparatus 300A according to the present embodiment detects and codes an optimal motion vector, and includes a multi-frame buffer 301, a motion estimation unit 302, a motion compensation unit 303, an image coding unit 304, an image decoding unit 305, a variable length coding unit 306, an adder 308, and a subtractor 309.
[0055]
The motion estimation unit 302 detects the optimal motion vectors MV1 and MV2 of a block (encoding target block) in the encoding target frame Tf indicated by the image signal Img, based on each of the reference frames Rf1 and Rf2 read from the multi-frame buffer 301.
[0056]
Similar to the motion compensation unit 803 of the video encoding device 800, during forward interpolation prediction the motion compensation unit 303 extracts, from the multi-frame buffer 301, the block at the position indicated by the motion vector MV1 in the reference frame Rf1 and the block at the position indicated by the motion vector MV2 in the reference frame Rf2. Then, based on these blocks, the motion compensation unit 303 interpolates pixel values by extrapolation as described with reference to FIG. 36 to generate and output a predicted image signal Pre indicating a predicted image. Interpolating the pixel values by extrapolation in this way enhances the prediction effect on a fade. Note that the motion compensation unit 303 may switch the prediction method for each encoding target block between forward interpolation prediction and another prediction method, for example, forward prediction in which prediction is performed from only one frame that precedes in display time.
[0057]
The subtractor 309 subtracts the predicted image signal Pre from the image signal Img and outputs a residual signal Res.
The image coding unit 304 obtains the residual signal Res, performs image coding processing such as DCT transform / quantization, and outputs a residual coded signal Er including quantized DCT coefficients and the like.
[0058]
The image decoding unit 305 acquires the residual coded signal Er, performs image decoding processing such as inverse quantization and inverse DCT, and outputs a residual decoded signal Dr.
The adder 308 adds the residual decoded signal Dr and the predicted image signal Pre and outputs a reconstructed image signal Rc.
[0059]
The multi-frame buffer 301 stores a signal of the reconstructed image signal Rc which may be referred to in inter-frame prediction.
The variable-length coding unit 306 performs variable-length coding on the motion vectors MV1 and MV2 detected by the motion estimating unit 302 and the residual coded signal Er output from the image coding unit 304. The result is output as an image encoded signal Bs1.
[0060]
When encoding each block of the encoding target frame Tf indicated by the image signal Img, the moving image encoding device 300A refers to the two reference frames Rf1 and Rf2 and detects the motion vectors MV1 and MV2 of the encoding target block based on these reference frames Rf1 and Rf2. Then, the moving picture coding apparatus 300A encodes each of the detected motion vectors MV1 and MV2, and also encodes the difference between the pixel values of the predicted image, which is predicted from the reference frames Rf1 and Rf2 and the motion vectors MV1 and MV2, and the pixel values of the encoding target block.
[0061]
Here, each of the reference frames Rf1 and Rf2 may be temporally forward or backward with respect to the encoding target frame Tf.
FIG. 2 is a frame layout diagram showing a temporal positional relationship between the reference frames Rf1 and Rf2 and the encoding target frame Tf.
[0062]
As shown in (a) of FIG. 2, the moving picture coding apparatus 300A may refer to frames located before the encoding target frame Tf as the reference frames Rf1 and Rf2, or, as shown in (b) of FIG. 2, may refer to frames located behind the encoding target frame Tf as the reference frames Rf1 and Rf2. Further, as shown in (c) of FIG. 2, one frame located before the encoding target frame Tf may be referred to as the reference frame Rf2 and one frame located behind the encoding target frame Tf may be referred to as the reference frame Rf1, or vice versa, that is, one frame located before the encoding target frame Tf may be referred to as the reference frame Rf1 and one frame located behind the encoding target frame Tf may be referred to as the reference frame Rf2.
[0063]
The motion estimating unit 302 of the moving picture coding apparatus 300A according to the present embodiment will be described in detail.
The motion estimation unit 302 detects the motion vectors MV1 and MV2 of the encoding target block based on each of the reference frames Rf1 and Rf2 as described above, and outputs these motion vectors MV1 and MV2 to the motion compensation unit 303.
[0064]
FIG. 3 is a block diagram illustrating a configuration of the motion estimation unit 302.
The above-described motion estimation unit 302 in the present embodiment includes a first motion vector candidate generation unit 321, a second motion vector candidate generation unit 322, pixel acquisition units 323 and 324, an interpolation unit 325, a subtractor 326, and a motion vector selection unit 327.
[0065]
The first motion vector candidate generation unit 321 extracts all motion vector MV1 candidates based on the reference frame Rf1 of the current block from a predetermined detection range, and sequentially outputs them as first motion vector candidates MVC1.
[0066]
The second motion vector candidate generation unit 322 extracts all the motion vector MV2 candidates based on the reference frame Rf2 of the encoding target block from the predetermined detection range, similarly to the first motion vector candidate generation unit 321. These are sequentially output as the second motion vector candidate MVC2.
[0067]
FIG. 4 is an explanatory diagram for describing a method of generating the first motion vector candidate MVC1 and the second motion vector candidate MVC2.
The first motion vector candidate generation unit 321 selects any pixel Pt1 from the motion vector detection range SR of the reference frame Rf1, and sets the displacement between the pixel Pt0 and the pixel Pt1 of the encoding target block as the first motion vector candidate MVC1. Then, the second motion vector candidate generation unit 322 selects any pixel Pt2 from the motion vector detection range SR1 of the reference frame Rf2, and calculates the displacement between the pixel Pt0 and the pixel Pt2 of the encoding target block by the second motion vector. Output as candidate MVC2.
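Candidate generation of this kind can be sketched as an enumeration of displacements. The text only speaks of a predetermined detection range, so the square range, the whole-pixel displacements, and the name `generate_candidates` are assumptions made for illustration.

```python
def generate_candidates(search_radius):
    """Yield every displacement (dx, dy) inside a square detection range.

    Assumption: the detection range is a square of whole-pixel
    displacements centered on the position of the encoding target block.
    """
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            yield (dx, dy)


# A radius of 1 gives the 3 x 3 neighborhood of displacements.
candidates = list(generate_candidates(search_radius=1))
```

Each generated displacement plays the role of one first motion vector candidate MVC1 (or, for the reference frame Rf2, one second motion vector candidate MVC2).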
[0068]
The pixel acquisition unit 323 acquires one block (a block including the pixel Pt1 in FIG. 4) in the reference frame Rf1 indicated by the first motion vector candidate MVC1, and outputs the acquired block to the interpolation unit 325 as a prediction block PB1.
[0069]
The pixel acquisition unit 324 acquires one block (a block including the pixel Pt2 in FIG. 4) in the reference frame Rf2 indicated by the second motion vector candidate MVC2, and outputs the acquired block to the interpolation unit 325 as a prediction block PB2.
[0070]
The interpolation unit 325 creates an interpolation prediction block PB0 for the encoding target block by interpolating pixel values using pairs of pixels whose relative positions in the prediction blocks PB1 and PB2 are equal to each other, and outputs it to the subtractor 326.
[0071]
In the case of forward interpolation prediction, such interpolation of pixel values is performed by extrapolating two pixel values as described with reference to FIG. That is, as shown in FIG. 36, when the pixel Pt1 in the prediction block PB1 and the pixel Pt2 in the prediction block PB2 have the same relative position in the block, the calculation formula “P0′ = 2 × P1 − P2” is evaluated from the pixel value P1 of the pixel Pt1 and the pixel value P2 of the pixel Pt2 to obtain the interpolation prediction pixel value P0′ of the pixel Pt0 in the encoding target block corresponding to the pixels Pt1 and Pt2. Then, the interpolation unit 325 creates the interpolation prediction block PB0 by calculating the interpolation prediction pixel values P0′ for all the pixels in the encoding target block.
[0072]
The subtractor 326 calculates the difference (P0 − P0′) between the pixel values of corresponding pixels of the encoding target block indicated by the image signal Img and the interpolation prediction block PB0, and outputs the result to the motion vector selection unit 327 as a prediction error block RB0.
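The work of the interpolation unit 325 and the subtractor 326 can be sketched on small blocks as follows. The 2 × 2 block size, the pixel values, and the function names are hypothetical; blocks are represented as lists of rows, and clipping of extrapolated values to a valid pixel range is not addressed in the text and is omitted here.

```python
def interpolation_prediction_block(pb1, pb2):
    """PB0: apply P0' = 2 * P1 - P2 to pixels at the same relative position."""
    return [[2 * p1 - p2 for p1, p2 in zip(row1, row2)]
            for row1, row2 in zip(pb1, pb2)]


def prediction_error_block(target, pb0):
    """RB0: per-pixel difference P0 - P0'."""
    return [[p0 - p0_pred for p0, p0_pred in zip(t_row, p_row)]
            for t_row, p_row in zip(target, pb0)]


# Hypothetical 2 x 2 blocks.
pb1 = [[110, 112], [114, 116]]      # prediction block from Rf1
pb2 = [[100, 102], [104, 106]]      # prediction block from Rf2
target = [[120, 123], [124, 125]]   # encoding target block

pb0 = interpolation_prediction_block(pb1, pb2)
rb0 = prediction_error_block(target, pb0)
```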
[0073]
Upon obtaining the prediction error block RB0 from the subtractor 326, the motion vector selection unit 327 calculates a prediction error evaluation value such as the SAD (sum of absolute values of the prediction error values) shown by the following (Equation 1) or the SSD (sum of squares of the prediction error values) shown by the following (Equation 2).
[0074]
(Equation 1)
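The formulas themselves are not reproduced in this text. Based on the verbal definitions of SAD and SSD and on the prediction error P0 − P0′, they can be written as below; the index notation (x, y) ranging over the pixels of the encoding target block B is an assumption introduced for this sketch.

```latex
\begin{align}
\mathrm{SAD} &= \sum_{(x,y) \in B} \bigl| P0(x,y) - P0'(x,y) \bigr| \tag{Equation 1} \\
\mathrm{SSD} &= \sum_{(x,y) \in B} \bigl( P0(x,y) - P0'(x,y) \bigr)^{2} \tag{Equation 2}
\end{align}
```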

Such a prediction error evaluation value is calculated for all combinations of the first motion vector candidates MVC1 generated by the first motion vector candidate generation unit 321 and the second motion vector candidates MVC2 generated by the second motion vector candidate generation unit 322.
[0075]
Then, the motion vector selection unit 327 selects the first motion vector candidate MVC1 and the second motion vector candidate MVC2 that give the smallest prediction error evaluation value, outputs the selected first motion vector candidate MVC1 as the motion vector MV1 of the encoding target block based on the reference frame Rf1, and outputs the selected second motion vector candidate MVC2 as the motion vector MV2 of the encoding target block based on the reference frame Rf2.
[0076]
Thus, even if a change in pixel value due to fading occurs, the motion estimating unit 302 according to the present embodiment detects a motion vector in consideration of the change, so that an optimal motion vector can be detected.
[0077]
Then, the variable-length coding unit 306 performs variable-length coding on the motion vectors MV1 and MV2 detected by the motion estimation unit 302 as described above, and outputs them included in the image coded signal Bs1. Note that the variable-length coding unit 306 may subtract, from the motion vectors MV1 and MV2, predicted values of the motion vectors MV1 and MV2 obtained from the peripheral blocks around the encoding target block, and encode the respective differences.
[0078]
FIG. 5 is a conceptual diagram showing the concept of the format of the image coded signal Bs1.
The image coded signal Bs1 includes a frame coded signal Bsf1 whose content indicates a coded frame, and the frame coded signal Bsf1 further includes a block coded signal Bsb1 whose content indicates a coded block. Further, the block coded signal Bsb1 includes a first motion vector coded signal Bs1 whose content indicates the coded motion vector MV1 and a second motion vector coded signal Bs2 whose content indicates the coded motion vector MV2.
[0079]
FIG. 6 is a flowchart showing an operation until the moving picture coding apparatus 300A detects and codes a motion vector.
First, the motion estimator 302 of the video encoding device 300A generates one first motion vector candidate MVC1 and one second motion vector candidate MVC2 (Step S300).
[0080]
Then, the motion estimating unit 302 acquires a predicted block PB1 indicated by the first motion vector candidate MVC1 and a predicted block PB2 indicated by the second motion vector candidate MVC2 (step S302).
[0081]
Next, the motion estimating unit 302 creates an interpolation prediction block PB0 by performing pixel interpolation processing from the prediction blocks PB1 and PB2 (step S304).
Thereafter, the motion estimating unit 302 acquires the current block to be coded (step S306), obtains the difference between the current block to be coded and the interpolated prediction block PB0, and creates a prediction error block RB0 (step S308).
[0082]
Then, the motion estimation unit 302 calculates a prediction error evaluation value from the prediction error block RB0 created in step S308 (step S310), and determines whether or not the prediction error evaluation value has been calculated for all combinations of the first motion vector candidate MVC1 and the second motion vector candidate MVC2 (step S312).
[0083]
Here, when the motion estimation unit 302 determines that the prediction error evaluation values have not been calculated for all the combinations (N in step S312), it generates another combination of a first motion vector candidate MVC1 and a second motion vector candidate MVC2 and repeats the operation from step S300. When the motion estimation unit 302 determines that the prediction error evaluation values have been calculated for all the combinations (Y in step S312), it takes the combination for which the smallest of all the prediction error evaluation values calculated in step S310 was obtained, detects the first motion vector candidate MVC1 of that combination generated in step S300 as the motion vector MV1, and detects the second motion vector candidate MVC2 of that combination as the motion vector MV2 (step S314).
[0084]
Then, the variable-length encoding unit 306 of the video encoding device 300A encodes the motion vectors MV1 and MV2 detected in step S314 (step S316).
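The detection loop of steps S300 to S314 can be sketched as an exhaustive search over all candidate pairs. This is a minimal illustration rather than the apparatus itself: frames are lists of rows, SAD is used as the evaluation value, the detection range is assumed to be a square of whole-pixel displacements, candidates that would point outside a frame are skipped (the text does not discuss frame boundaries), and all names and test frames are hypothetical.

```python
def get_block(frame, x, y, size):
    """Return the size x size block whose top-left pixel is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def sad_extrapolated(target, pb1, pb2):
    """SAD between the target block and PB0, where P0' = 2 * P1 - P2."""
    return sum(abs(t - (2 * p1 - p2))
               for t_row, r1, r2 in zip(target, pb1, pb2)
               for t, p1, p2 in zip(t_row, r1, r2))


def detect_motion_vectors(tf, rf1, rf2, x, y, size, radius):
    """Evaluate every (MVC1, MVC2) pair and keep the pair with minimum SAD."""
    target = get_block(tf, x, y, size)
    height, width = len(tf), len(tf[0])

    def in_frame(dx, dy):
        # Assumption: only candidates lying fully inside the frame are used.
        return 0 <= x + dx <= width - size and 0 <= y + dy <= height - size

    candidates = [(dx, dy)
                  for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)
                  if in_frame(dx, dy)]
    best = None
    for mvc1 in candidates:
        pb1 = get_block(rf1, x + mvc1[0], y + mvc1[1], size)
        for mvc2 in candidates:
            pb2 = get_block(rf2, x + mvc2[0], y + mvc2[1], size)
            cost = sad_extrapolated(target, pb1, pb2)
            if best is None or cost < best[0]:
                best = (cost, mvc1, mvc2)
    return best


def make_frame(obj_x, obj_y, value, width=5, height=4, obj_size=2):
    """Hypothetical test frame: a bright square on a black background."""
    frame = [[0] * width for _ in range(height)]
    for yy in range(obj_y, obj_y + obj_size):
        for xx in range(obj_x, obj_x + obj_size):
            frame[yy][xx] = value
    return frame


# The object moves one pixel right per frame while fading in linearly.
rf2 = make_frame(0, 1, 100)
rf1 = make_frame(1, 1, 110)
tf = make_frame(2, 1, 120)

cost, mv1, mv2 = detect_motion_vectors(tf, rf1, rf2, x=2, y=1, size=2, radius=2)
```

In this constructed example the search returns the true displacements with an evaluation value of zero even though the object changes brightness between frames, because the evaluation value is computed against the extrapolated block rather than against either reference block alone.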
[0085]
As described above, in the present embodiment, since the prediction error evaluation value is calculated based on the result of performing the pixel value interpolation processing, even if a fade occurs, it is possible to prevent the prediction error evaluation value from increasing due to the effect. Thus, an optimal motion vector can be detected. As a result, the coding efficiency of the motion vector can be improved.
[0086]
(Embodiment 2)
Hereinafter, a moving picture decoding apparatus according to the second embodiment of the present invention will be described with reference to the drawings.
[0087]
FIG. 7 is a block diagram illustrating a configuration of a video decoding device 300B according to the present embodiment.
The moving picture decoding apparatus 300B according to the present embodiment decodes a moving picture encoded by the moving picture coding apparatus 300A according to the first embodiment, and includes a variable length decoding unit 336, a motion compensation unit 333, an image decoding unit 335, a multi-frame buffer 331, and an adder 339.
[0088]
The variable-length decoding unit 336 acquires the coded image signal Bs1, performs variable-length decoding, and outputs the coded residual signal Er, the motion vector MV1, and the motion vector MV2. Note that when each of the motion vectors MV1 and MV2 has been encoded after subtracting the predicted values of the motion vectors MV1 and MV2 obtained from the peripheral blocks around the current block, the variable-length decoding unit 336 may decode the respective encoded differences and add the above-described predicted values to the respective differences to generate the motion vectors MV1 and MV2.
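The optional predictive coding of the motion vectors amounts to a subtraction on the encoder side and an addition on the decoder side. In the sketch below the predicted value is simply given, since the text does not specify how it is derived from the peripheral blocks; the function names and values are hypothetical.

```python
def encode_mv_difference(mv, predicted):
    """Encoder side: difference between the motion vector and its prediction."""
    return (mv[0] - predicted[0], mv[1] - predicted[1])


def decode_mv(difference, predicted):
    """Decoder side: add the predicted value back to the decoded difference."""
    return (difference[0] + predicted[0], difference[1] + predicted[1])


predicted = (-2, 1)   # hypothetical predicted value from peripheral blocks
mv1 = (-3, 1)

diff = encode_mv_difference(mv1, predicted)
restored = decode_mv(diff, predicted)
```

The round trip only works if the encoder and the decoder derive the same predicted value, which is why both sides must use the same peripheral blocks.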
[0089]
The image decoding unit 335 acquires the residual coded signal Er, performs image decoding processing such as inverse quantization and inverse DCT transform, and outputs a residual decoded signal Dr.
The motion compensation unit 333, like the motion compensation unit 303 of the video encoding device 300A, uses the block at the position indicated by the motion vector MV1 in the reference frame Rf1 and the motion vector MV2 in the reference frame Rf2 during forward interpolation prediction. The block at the indicated position is extracted from the multi-frame buffer 331. Then, based on these blocks, the motion compensating unit 333 performs a process of interpolating pixel values by extrapolation as described with reference to FIG. 36 to generate and output a predicted image signal Pre.
[0090]
The adder 339 adds the predicted image signal Pre from the motion compensation unit 333 and the residual decoded signal Dr from the image decoding unit 335, and outputs the result as a decoded image signal Di.
[0091]
The multi-frame buffer 331 stores a signal of the decoded image signal Di that may be referred to in inter-frame prediction.
By performing motion compensation based on the motion vectors MV1 and MV2, the video decoding device 300B according to the present embodiment can accurately decode the image encoded by the video encoding device 300A.
[0092]
(Embodiment 3)
Hereinafter, a video encoding device according to the third embodiment of the present invention will be described with reference to the drawings.
[0093]
FIG. 8 is a block diagram showing a configuration of a moving picture coding apparatus 400A according to the present embodiment.
The moving picture coding apparatus 400A according to the present embodiment detects and codes an optimal motion vector, as in the first embodiment, and includes a multi-frame buffer 401, a motion estimation unit 402, a motion compensation unit 403, an image coding unit 404, an image decoding unit 405, a variable length coding unit 406, an adder 408, and a subtractor 409.
[0094]
Here, the multi-frame buffer 401, the motion compensation unit 403, the image encoding unit 404, the image decoding unit 405, the adder 408, and the subtractor 409 according to the present embodiment have the same functions and configurations as the multi-frame buffer 301, the motion compensation unit 303, the image encoding unit 304, the image decoding unit 305, the adder 308, and the subtractor 309 according to the first embodiment, respectively.
[0095]
In the moving picture coding apparatus 400A according to the present embodiment, the method of detecting a motion vector and the coding method of a motion vector are different from those of the first embodiment. The feature is that only one motion vector is detected and encoded.
[0096]
FIG. 9 is a block diagram illustrating a configuration of the motion estimation unit 402 according to the present embodiment.
The motion estimation unit 402 according to the present embodiment includes a first motion vector candidate generation unit 421, a motion vector scaling unit 422, pixel acquisition units 423 and 424, an interpolation unit 425, a subtractor 426, and a motion vector selection unit 427. Such a motion estimation unit 402 detects the optimal motion vectors MV1 and MV2 of a block (encoding target block) in the encoding target frame Tf indicated by the image signal Img, based on each of the reference frames Rf1 and Rf2 read from the multi-frame buffer 401.
[0097]
Here, the first motion vector candidate generation unit 421, the pixel acquisition units 423 and 424, the interpolation unit 425, the subtractor 426, and the motion vector selection unit 427 have the same functions and configurations as the first motion vector candidate generation unit 321, the pixel acquisition units 323 and 324, the interpolation unit 325, the subtractor 326, and the motion vector selection unit 327 in the motion estimation unit 302 of the first embodiment, respectively.
[0098]
That is, the motion estimation unit 402 according to the present embodiment includes a motion vector scaling unit 422 instead of the second motion vector candidate generation unit 322 according to the first embodiment, and generates the second motion vector candidate MVC2 by scaling the first motion vector candidate MVC1.
[0099]
FIG. 10 is an explanatory diagram for describing a method of generating the first motion vector candidate MVC1 and the second motion vector candidate MVC2.
The first motion vector candidate generation unit 421 selects any pixel Pt1 from the motion vector detection range SR of the reference frame Rf1, and outputs the displacement between the pixel Pt0 of the encoding target block and the pixel Pt1 as the first motion vector candidate MVC1.
[0100]
When the motion vector scaling unit 422 acquires the first motion vector candidate MVC1 generated as described above, it scales the first motion vector candidate MVC1 based on the display time difference T1, which indicates the time difference between the display of the encoding target frame Tf and that of the reference frame Rf1, and the display time difference T2, which indicates the time difference between the display of the encoding target frame Tf and that of the reference frame Rf2.
[0101]
That is, the motion vector scaling unit 422 scales the first motion vector candidate MVC1 by multiplying it by the ratio of the display time difference T2 to the display time difference T1 (T2 / T1), and thereby obtains the motion vector MVCs.
[0102]
Then, the motion vector scaling unit 422 outputs the motion vector MVCs obtained by scaling in this manner to the pixel acquisition unit 424 and the motion vector selection unit 427.
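As an illustration only, the scaling of the first motion vector candidate MVC1 by the ratio T2 / T1 can be sketched in Python as follows. The function name `scale_motion_vector`, the representation of a vector as an (x, y) tuple, and the rounding to whole pixels are assumptions made for this sketch and are not taken from the description.

```python
from fractions import Fraction

def scale_motion_vector(mvc1, t1, t2):
    """Scale MVC1 by the ratio T2 / T1 to obtain MVCs."""
    if t1 == 0:
        raise ValueError("display time difference T1 must be non-zero")
    ratio = Fraction(t2, t1)
    # Rounding to whole pixels is an assumption; Python's round() sends ties to even.
    return tuple(round(component * ratio) for component in mvc1)

# Rf2 is twice as far from Tf as Rf1 (T1 = 1, T2 = 2), so the vector doubles.
mvcs = scale_motion_vector((3, -2), t1=1, t2=2)
print(mvcs)  # (6, -4)
```

An exact fraction is used for the ratio so that no floating-point error enters before the single rounding step.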
[0103]
Upon acquiring the motion vector MVCs from the motion vector scaling unit 422, the pixel acquisition unit 424 and the motion vector selection unit 427 treat the motion vector MVCs as the second motion vector candidate MVC2, and perform the same operation as in the first embodiment.
[0104]
Further, the first motion vector candidate generation unit 421 sequentially generates all the first motion vector candidates MVC1 from the motion vector detection range SR, and each time the motion vector scaling unit 422 obtains a first motion vector candidate MVC1, it scales that candidate to create a motion vector MVCs.
[0105]
Each time a first motion vector candidate MVC1 is generated by the first motion vector candidate generation unit 421, the motion vector selection unit 427 calculates a prediction error evaluation value based on that first motion vector candidate MVC1 and the motion vector MVCs obtained from it, and as a result selects the first motion vector candidate MVC1 and the motion vector MVCs that minimize the prediction error evaluation value. Then, the motion vector selection unit 427 outputs the selected first motion vector candidate MVC1 and motion vector MVCs as the motion vectors MV1 and MV2.
[0106]
A series of operations of the above-described motion estimation unit 402 will be described below.
First, the first motion vector candidate generation unit 421 generates a first motion vector candidate MVC1, which is a candidate for the motion vector MV1.
[0107]
Next, the motion vector scaling unit 422 generates and outputs the motion vector MVCs by scaling the first motion vector candidate MVC1.
[0108]
Then, the pixel acquisition unit 423 acquires one block including the pixels of the reference frame Rf1 indicated by the first motion vector candidate MVC1 as the prediction block PB1, and outputs this to the interpolation unit 425. The pixel acquisition unit 424 acquires one block including the pixels of the reference frame Rf2 indicated by the motion vector MVCs as the prediction block PB2, and outputs this to the interpolation unit 425.
[0109]
The interpolation unit 425 generates an interpolation prediction block PB0 by interpolating pixel values of corresponding pixels in the two prediction blocks PB1 and PB2 obtained by the pixel acquisition units 423 and 424.
[0110]
The subtractor 426 calculates the difference in pixel values between the interpolation prediction block PB0 and the encoding target block in the image signal Img, and outputs the calculation result as a prediction error block RB0.
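The interpolation and subtraction steps above can be sketched as follows. This is a minimal illustration under stated assumptions: blocks are lists of pixel rows, the interpolation is taken to be a rounded per-pixel average (the description only says that pixel values are interpolated), and the function names are hypothetical.

```python
def interpolate_blocks(pb1, pb2):
    """PB0: per-pixel rounded average of PB1 and PB2 (averaging is an assumption)."""
    return [[(a + b + 1) // 2 for a, b in zip(row1, row2)]
            for row1, row2 in zip(pb1, pb2)]

def prediction_error_block(target, pb0):
    """RB0: per-pixel difference between the encoding target block and PB0."""
    return [[t - p for t, p in zip(target_row, pb0_row)]
            for target_row, pb0_row in zip(target, pb0)]

# Fade example: the target brightness (100) lies between the references (80 and 120).
pb1 = [[80, 80], [80, 80]]
pb2 = [[120, 120], [120, 120]]
target = [[100, 100], [100, 100]]

pb0 = interpolate_blocks(pb1, pb2)
rb0 = prediction_error_block(target, pb0)
```

In this example neither reference block alone matches the target, but the interpolated block does, which is why the evaluation is made on PB0 rather than on PB1 or PB2 individually.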
[0111]
The motion vector selection unit 427 calculates a prediction error evaluation value based on the pixel values of the pixels in the prediction error block RB0. The motion vector selection unit 427 calculates the prediction error evaluation values in this way for all the first motion vector candidates MVC1 generated by the first motion vector candidate generation unit 421, and selects the first motion vector candidate MVC1 and the motion vector MVCs that give the smallest prediction error evaluation value.
[0112]
Then, the motion vector selection unit 427 outputs the selected first motion vector candidate MVC1 and motion vector MVCs as the motion vectors MV1 and MV2.
As described above, the motion estimating unit 402 in the present embodiment calculates the prediction error evaluation value based on the result of interpolating the pixel values, as in the first embodiment. Therefore, even when a fade occurs, it prevents the prediction error evaluation value from increasing due to the fade and can detect an optimal motion vector. Further, even for forward interpolation prediction, the motion estimating unit 402 in the present embodiment does not actually search for the motion vector MV2 but uses the scaled motion vector MV1 as the motion vector MV2. Since the effort of detecting the motion vector MV2 is eliminated, the coding efficiency can be improved.
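The series of operations of the motion estimation unit 402 can be sketched as a single search loop. The sketch assumes frames stored as lists of pixel rows, a square search range of plus or minus `search` pixels, averaging as the interpolation, SAD as the evaluation value, and display time differences for which the scaled vector is an integer. All function names are hypothetical.

```python
def make_frame(width, height, x, y, size, value):
    """Test helper: a black frame with one bright square at (x, y)."""
    frame = [[0] * width for _ in range(height)]
    for row in frame[y:y + size]:
        row[x:x + size] = [value] * size
    return frame

def get_block(frame, x, y, size):
    """Return the block whose top-left pixel is (x, y), or None if it leaves the frame."""
    if x < 0 or y < 0 or x + size > len(frame[0]) or y + size > len(frame):
        return None
    return [row[x:x + size] for row in frame[y:y + size]]

def sad_of_interpolation(target, pb1, pb2):
    """SAD between the target block and the averaged block PB0."""
    return sum(abs(t - (a + b + 1) // 2)
               for target_row, row1, row2 in zip(target, pb1, pb2)
               for t, a, b in zip(target_row, row1, row2))

def search_scaled(tf, rf1, rf2, x0, y0, size, search, t1, t2):
    """Return (MV1, MV2, cost) with MV2 derived from MV1 by scaling with T2 / T1."""
    target = get_block(tf, x0, y0, size)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            mvc1 = (dx, dy)
            # Integer division: exact only when T1 divides dx * T2 and dy * T2.
            mvcs = (dx * t2 // t1, dy * t2 // t1)
            pb1 = get_block(rf1, x0 + mvc1[0], y0 + mvc1[1], size)
            pb2 = get_block(rf2, x0 + mvcs[0], y0 + mvcs[1], size)
            if pb1 is None or pb2 is None:
                continue
            cost = sad_of_interpolation(target, pb1, pb2)
            if best is None or cost < best[2]:
                best = (mvc1, mvcs, cost)
    return best

# A square moving one pixel to the right per frame; Rf1 and Rf2 precede Tf by 1 and 2 frames.
tf = make_frame(12, 12, 6, 4, 2, 100)
rf1 = make_frame(12, 12, 5, 4, 2, 100)
rf2 = make_frame(12, 12, 4, 4, 2, 100)
result = search_scaled(tf, rf1, rf2, x0=6, y0=4, size=2, search=2, t1=1, t2=2)
```

Only the candidates MVC1 are enumerated; the second vector is never searched, which is the saving the embodiment describes.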
[0113]
Further, the variable-length coding unit 406 of the present embodiment performs variable-length coding on the motion vector MV1 and the residual coded signal Er, and outputs the image coded signal Bs2.
FIG. 11 is a conceptual diagram showing the concept of the format of the image coded signal Bs2.
[0114]
The image coded signal Bs2 includes a frame coded signal Bsf2 having a content indicating a coded frame, and the frame coded signal Bsf2 further includes a block coded signal Bsb2 having a content indicating a coded block. Further, the block coded signal Bsb2 includes a motion vector coded signal Bs1 having a content indicating the coded motion vector MV1.
[0115]
As described above, in the present embodiment, the second motion vector coded signal Bs2 indicating the coded motion vector MV2 does not need to be stored in the image coded signal Bs2, so the coding efficiency can be improved.
[0116]
In the present embodiment, the case of forward interpolation prediction has been described as shown in FIG. 38. However, as in Embodiment 1, each of the reference frames Rf1 and Rf2 may be either forward or backward of the encoding target frame Tf.
[0117]
Furthermore, in the present embodiment, a frame whose display time difference from the encoding target frame Tf is longer than that of the reference frame Rf1 is selected as the reference frame Rf2, and the first motion vector candidate MVC1 based on the reference frame Rf1 is scaled. However, a frame before the display time of the encoding target frame Tf may be selected as the reference frame Rf1, a frame after the display time of the encoding target frame Tf may be selected as the reference frame Rf2, and scaling may then be performed on the first motion vector candidate MVC1 based on the reference frame Rf1.
[0118]
Note that the variable-length coding unit 406 may code the difference between the motion vector MV1 and a predicted value predicted from the motion vectors of peripheral blocks around the current block, and include a coded signal indicating the result in the block coded signal Bsb2 instead of the motion vector coded signal Bs1. In this case, the coding efficiency can be further improved.
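The differential coding of the motion vector MV1 mentioned above can be illustrated as follows. The description does not state how the predicted value is derived from the peripheral blocks; the component-wise median used here is an assumption chosen for the sketch, and the function names are hypothetical.

```python
def predict_mv(neighbor_mvs):
    """Component-wise median of the neighbouring motion vectors (assumed predictor)."""
    xs = sorted(mv[0] for mv in neighbor_mvs)
    ys = sorted(mv[1] for mv in neighbor_mvs)
    mid = len(neighbor_mvs) // 2
    return (xs[mid], ys[mid])

def encode_mv_difference(mv1, neighbor_mvs):
    """Encoder side: the value that is variable-length coded instead of MV1."""
    px, py = predict_mv(neighbor_mvs)
    return (mv1[0] - px, mv1[1] - py)

def decode_mv_difference(diff, neighbor_mvs):
    """Decoder side: add the same predicted value back to recover MV1."""
    px, py = predict_mv(neighbor_mvs)
    return (diff[0] + px, diff[1] + py)

neighbors = [(2, 0), (3, 1), (2, -1)]
diff = encode_mv_difference((3, 0), neighbors)
restored = decode_mv_difference(diff, neighbors)
```

Because neighbouring blocks tend to move together, the difference is usually small and therefore cheaper to code than the vector itself.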
[0119]
(Embodiment 4)
Hereinafter, a video decoding device according to the fourth embodiment of the present invention will be described with reference to the drawings.
[0120]
FIG. 12 is a block diagram illustrating a configuration of a video decoding device 400B according to the present embodiment.
A moving picture decoding apparatus 400B according to the present embodiment decodes a moving picture encoded by the moving picture coding apparatus 400A according to the third embodiment, and includes a variable length decoding unit 436, a motion vector scaling unit 437, a motion compensation unit 433, an image decoding unit 435, a multi-frame buffer 431, and an adder 439.
[0121]
Here, the image decoding unit 435, the motion compensation unit 433, the multi-frame buffer 431, and the adder 439 according to the present embodiment have the same functions and configurations as the image decoding unit 335, the motion compensation unit 333, the multi-frame buffer 331, and the adder 339, respectively, of the video decoding device 300B according to the second embodiment.
[0122]
The variable length decoding unit 436 obtains the image coded signal Bs, performs variable length decoding, and outputs the residual coded signal Er and the motion vector MV1. Note that when the motion vector MV1 has been encoded as a difference from a predicted value of the motion vector MV1 obtained from the peripheral blocks around the current block, the variable length decoding unit 436 may generate and output the motion vector MV1 by decoding the difference and adding the above-described predicted value to it.
[0123]
When the motion vector scaling unit 437 acquires the motion vector MV1 output from the variable-length decoding unit 436, it scales the motion vector MV1, in the same way as the motion vector scaling unit 907 of the video decoding device 900, based on the display time difference T1 between the encoding target frame Tf and the reference frame Rf1 and the display time difference T2 between the encoding target frame Tf and the reference frame Rf2. Then, the motion vector scaling unit 437 outputs the resulting motion vector to the motion compensation unit 433 as the motion vector MV2 detected based on the reference frame Rf2.
[0124]
Similar to the motion compensation unit 333 according to the second embodiment, the motion compensation unit 433 reads, from the multi-frame buffer 431, the block at the position indicated by the motion vector MV1 in the reference frame Rf1 and the block at the position indicated by the motion vector MV2 in the reference frame Rf2. Then, the motion compensation unit 433 performs pixel value interpolation based on these blocks, and creates and outputs a predicted image signal Pre.
[0125]
The video decoding device 400B according to the present embodiment derives the motion vector MV2 by scaling the motion vector MV1, and can therefore correctly decode the image encoded by the video encoding device 400A.
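The decoder-side flow of this embodiment (scale MV1 to obtain MV2, fetch both blocks, interpolate, add the residual) can be sketched as below. It assumes frames as lists of pixel rows, averaging as the interpolation, an already decoded residual block, and display time differences for which the scaled vector is an integer. The name `decode_block` is hypothetical.

```python
def decode_block(rf1, rf2, x0, y0, size, mv1, t1, t2, residual):
    """Reconstruct one block from MV1 alone; MV2 is derived by scaling."""
    # Integer division: exact only when T1 divides each scaled component.
    mv2 = (mv1[0] * t2 // t1, mv1[1] * t2 // t1)

    def block(frame, mv):
        x, y = x0 + mv[0], y0 + mv[1]
        return [row[x:x + size] for row in frame[y:y + size]]

    pb1, pb2 = block(rf1, mv1), block(rf2, mv2)
    # Predicted pixel (rounded average) plus the decoded residual.
    return [[(a + b + 1) // 2 + r for a, b, r in zip(row1, row2, res_row)]
            for row1, row2, res_row in zip(pb1, pb2, residual)]

rf1 = [[10 * r + c for c in range(6)] for r in range(6)]
rf2 = [[value + 20 for value in row] for row in rf1]
decoded = decode_block(rf1, rf2, x0=3, y0=3, size=2, mv1=(-1, 0),
                       t1=1, t2=2, residual=[[1, -1], [0, 2]])
```

The decoder performs the same scaling as the encoder, so both sides arrive at the same MV2 without it being transmitted.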
[0126]
(Embodiment 5)
Hereinafter, a moving picture coding apparatus according to a fifth embodiment of the present invention will be described with reference to the drawings.
[0127]
FIG. 13 is a block diagram illustrating a configuration of a moving picture coding apparatus 500A according to the present embodiment.
The moving image encoding device 500A includes a multi-frame buffer 501, a motion estimating unit 502, a motion compensating unit 503, an image encoding unit 504, an image decoding unit 505, a variable length encoding unit 506, a motion vector scaling unit 507, an adder 508, and subtractors 509 and 510. It divides the frame indicated by the image signal Img into blocks and performs processing for each block.
[0128]
Here, the multi-frame buffer 501, the motion compensation unit 503, the image encoding unit 504, the image decoding unit 505, the variable length encoding unit 506, the motion vector scaling unit 507, the adder 508, and the subtractors 509 and 510 in the present embodiment have the same functions and configurations as the multi-frame buffer 801, the motion compensation unit 803, the image encoding unit 804, the image decoding unit 805, the variable length encoding unit 806, the motion vector scaling unit 807, the adder 808, and the subtractors 809 and 810, respectively, in the conventional video encoding device 800.
[0129]
That is, the present embodiment is characterized by the motion vector detection method of the motion estimation unit 502; other processing, such as the motion vector encoding method, is the same as in the moving image encoding device 800.
[0130]
FIG. 14 is a block diagram illustrating a configuration of the motion estimation unit 502 according to the present embodiment. The motion estimation unit 502 includes a first motion vector candidate generation unit 521, a second motion vector candidate generation unit 522, a motion vector scaling unit 521a, pixel acquisition units 523 and 524, an interpolation unit 525, a subtractor 526, a motion vector selection unit 527, and switches 528 and 529.
[0131]
Also, the first motion vector candidate generation unit 521, the motion vector scaling unit 521a, the pixel acquisition units 523 and 524, the interpolation unit 525, and the subtractor 526 in the motion estimation unit 502 have the same functions and configurations as the first motion vector candidate generation unit 421, the motion vector scaling unit 422, the pixel acquisition units 423 and 424, the interpolation unit 425, and the subtractor 426, respectively, in the motion estimation unit 402.
[0132]
Then, the second motion vector candidate generation unit 522 extracts all the candidates for the motion vector MV2 of the encoding target block based on the reference frame Rf2 from a predetermined motion vector detection range, and outputs each of them as a second motion vector candidate MVC2.
[0133]
As described above, the motion estimating unit 402 in the third embodiment generates several first motion vector candidates MVC1, scales each of them to generate the motion vector MVCs, and detects the first motion vector candidate MVC1 with the smallest prediction error evaluation value and the corresponding motion vector MVCs as the motion vectors MV1 and MV2. That is, the motion estimating unit 402 detects a scaled version of the motion vector MV1 as the motion vector MV2.
[0134]
In contrast, the motion estimating unit 502 in the present embodiment detects the motion vector MV1 in the same manner as the motion estimating unit 402 in the third embodiment, but does not simply take the scaled motion vector MV1 as the motion vector MV2. Its feature is that it further detects, using the detected motion vector MV1, the motion vector MV2 with the smallest prediction error evaluation value.
[0135]
The specific operation of the motion estimation unit 502 will be described with reference to FIGS.
When the contacts of the switches 528 and 529 are switched to the contact 0 side, the motion estimating unit 502 generates the first motion vector candidate MVC1 and the motion vector MVCs by the same method as in the third embodiment, and detects the motion vector MV1.
[0136]
FIG. 15A is an explanatory diagram for describing a method of generating the first motion vector candidate MVC1 and the motion vector MVCs.
The first motion vector candidate generation unit 521 generates one of the first motion vector candidates MVC1 from the motion vector detection range SR of the reference frame Rf1.
[0137]
The motion vector scaling unit 521a scales the first motion vector candidate MVC1 generated as described above by multiplying it by the ratio (T2 / T1) of the display time difference T2 to the display time difference T1, and thereby generates the motion vector MVCs.
[0138]
The pixel acquisition unit 523 acquires one block (a block including the pixel Pt1) in the reference frame Rf1 indicated by the first motion vector candidate MVC1, and outputs the acquired block to the interpolation unit 525 as a prediction block PB1.
[0139]
The pixel acquisition unit 524 acquires one block (a block including the pixel Pt2) in the reference frame Rf2 indicated by the motion vector candidate MVCs, and outputs the acquired block to the interpolation unit 525 as a prediction block PB2.
[0140]
The interpolation unit 525 creates an interpolation prediction block PB0 for the encoding target block by performing pixel value interpolation using each pair of pixels having the same relative position in the prediction blocks PB1 and PB2, and outputs it to the subtractor 526.
[0141]
The subtractor 526 calculates the difference between the pixel value of each pixel of the encoding target block indicated by the image signal Img and the pixel value of the corresponding pixel of the interpolation prediction block PB0, and outputs the result to the motion vector selection unit 527 as the prediction error block RB0.
[0142]
Upon obtaining the prediction error block RB0 from the subtractor 526, the motion vector selection unit 527 calculates a prediction error evaluation value such as SAD or SSD.
Then, the motion vector selection unit 527 selects the first motion vector candidate MVC1 and the motion vector candidate MVCs with the smallest prediction error evaluation value, and outputs the selected first motion vector candidate MVC1 as the motion vector MV1 of the encoding target block based on the reference frame Rf1.
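The two evaluation values named above, SAD (sum of absolute differences) and SSD (sum of squared differences), can be computed from a prediction error block as in the following minimal sketch. The block is assumed to be a list of rows of signed pixel differences; the function names are illustrative.

```python
def sad(rb0):
    """Sum of absolute differences over the prediction error block RB0."""
    return sum(abs(value) for row in rb0 for value in row)

def ssd(rb0):
    """Sum of squared differences over the prediction error block RB0."""
    return sum(value * value for row in rb0 for value in row)

rb0 = [[1, -2], [0, 3]]
print(sad(rb0), ssd(rb0))  # 6 14
```

SSD penalizes large individual errors more heavily than SAD, while SAD avoids the multiplications.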
[0143]
Next, when the contacts of the switches 528 and 529 are switched to the contact 1 side, the motion estimating unit 502 detects the motion vector MV2 having the smallest prediction error evaluation value using the motion vector MV1 detected as described above.
[0144]
FIG. 15B is an explanatory diagram for describing a method of generating the second motion vector candidate MVC2.
The second motion vector candidate generation unit 522 sequentially generates some second motion vector candidates MVC2 from the motion vector detection range SR2 centered on the position C in the reference frame Rf2 indicated by the scaling of the motion vector MV1.
[0145]
The pixel acquisition unit 523 acquires one block in the reference frame Rf1 indicated by the motion vector MV1 already detected as described above, and outputs the acquired block to the interpolation unit 525 as a prediction block PB1.
[0146]
The pixel acquisition unit 524 acquires one block in the reference frame Rf2 indicated by the second motion vector candidate MVC2, and outputs it to the interpolation unit 525 as a prediction block PB2.
[0147]
As described above, the interpolation unit 525 creates an interpolation prediction block PB0 for the encoding target block by performing pixel value interpolation using each pair of pixels having the same relative position in the prediction blocks PB1 and PB2, and outputs it to the subtractor 526.
[0148]
The subtractor 526 calculates the difference between the pixel value of each pixel of the encoding target block indicated by the image signal Img and the pixel value of the corresponding pixel of the interpolation prediction block PB0, and outputs the result to the motion vector selection unit 527 as the prediction error block RB0.
[0149]
Upon obtaining the prediction error block RB0 from the subtractor 526, the motion vector selection unit 527 calculates a prediction error evaluation value such as SAD or SSD.
Then, the motion vector selection unit 527 selects the second motion vector candidate MVC2 having the smallest prediction error evaluation value, and outputs the selected second motion vector candidate MVC2 as the motion vector MV2 of the encoding target block based on the reference frame Rf2.
[0150]
Here, assuming that the motion of the subject is constant between frames, the closer to the position C indicated by the scaling of the already detected motion vector MV1, the higher the probability that the motion vector MV2 exists.
[0151]
Therefore, when detecting the motion vector MV2, the motion estimating unit 502 of the present embodiment sets the motion vector detection range SR2 around the position C and generates the second motion vector candidates MVC2 from the motion vector detection range SR2. As a result, the motion vector detection range SR2 can be narrowed, and the detection efficiency of the motion vector can be improved. Further, when a technique such as a spiral search is used, the motion vector MV2 can be detected at higher speed.
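One way to picture a narrowed detection range SR2 with a spiral-like scan is to list the second motion vector candidates around the position C in order of increasing distance. The sketch below is an approximation under stated assumptions: SR2 is a square of plus or minus `radius` pixels, candidates are ordered ring by ring rather than along a literal spiral path, and the function name is hypothetical.

```python
def spiral_candidates(center, radius):
    """List candidates MVC2 in the range SR2 around `center`, nearest first."""
    cx, cy = center
    offsets = [(dx, dy)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)]
    # Order by square ring, then by squared distance within the ring.
    offsets.sort(key=lambda o: (max(abs(o[0]), abs(o[1])), o[0] ** 2 + o[1] ** 2))
    return [(cx + dx, cy + dy) for dx, dy in offsets]

# Position C is given by scaling the detected MV1 = (-2, 1) with T2 / T1 = 2.
center_c = (-2 * 2, 1 * 2)
candidates = spiral_candidates(center_c, radius=1)
```

Visiting the most probable candidates first allows a search to stop early once the evaluation value is small enough, which is the usual motivation for this ordering.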
[0152]
In the present embodiment, after the motion vector MV1 is detected, the motion vector MV2 is detected with the motion vector MV1 held fixed. However, the motion vector MV2 detected in this manner may in turn be held fixed, and the motion vector MV1 may be detected again. In this case, while the pixel acquisition unit 524 holds the fixed motion vector MV2 that has been detected, the pixel acquisition unit 523 obtains, from the first motion vector candidate generation unit 521, variable first motion vector candidates MVC1 extracted from a predetermined motion vector detection range. Then, the motion vector selection unit 527 detects, as the motion vector MV1, the first motion vector candidate MVC1 having the smallest prediction error evaluation value among the extracted first motion vector candidates MVC1. As a result, a more appropriate motion vector can be detected, and the detection efficiency can be improved.
[0153]
Furthermore, the motion vector MV1 thus detected again may be fixed and used, and the motion vector MV2 may be detected again. Such detection of the motion vector may be repeated any number of times, and may be performed until the number of repetitions reaches a predetermined number, or until the reduction rate of the prediction error evaluation value becomes a predetermined value or less.
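The repeated detection described above, in which one vector is held fixed while the other is searched, can be sketched as an alternating refinement loop. The cost function, the search radius, and both stopping thresholds are assumptions introduced for this sketch; `cost(mv1, mv2)` stands in for the prediction error evaluation value of the interpolated block.

```python
def refine_alternately(cost, mv1, mv2, radius=1, max_rounds=4, min_gain=0.01):
    """Alternately re-detect MV2 (MV1 fixed) and MV1 (MV2 fixed)."""
    def neighbours(mv):
        return [(mv[0] + dx, mv[1] + dy)
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)]

    best = cost(mv1, mv2)
    for _ in range(max_rounds):                                   # predetermined number
        previous = best
        mv2 = min(neighbours(mv2), key=lambda c: cost(mv1, c))    # MV1 held fixed
        mv1 = min(neighbours(mv1), key=lambda c: cost(c, mv2))    # MV2 held fixed
        best = cost(mv1, mv2)
        # Stop when the reduction rate of the evaluation value is small enough.
        if previous == 0 or (previous - best) / previous <= min_gain:
            break
    return mv1, mv2, best

# Toy evaluation value whose minimum is at MV1 = (3, 1), MV2 = (5, -2).
def toy_cost(mv1, mv2):
    return (abs(mv1[0] - 3) + abs(mv1[1] - 1)
            + abs(mv2[0] - 5) + abs(mv2[1] + 2))

refined = refine_alternately(toy_cost, mv1=(2, 1), mv2=(4, -2))
```

Each half-step can only keep or lower the evaluation value, because the current vector is always among the candidates that are compared.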
[0154]
As described above, in the present embodiment, as in the first or third embodiment, the prediction error evaluation value is calculated based on the result of performing the pixel value interpolation processing, so an increase in the prediction error evaluation value is prevented and an optimal motion vector can be detected. Further, in the present embodiment, unlike Embodiment 3, since independent motion vectors are used for each reference frame, prediction efficiency can be improved even when the motion is not constant between frames.
[0155]
(Modification)
Next, a modified example of the moving picture coding apparatus 500A according to the present embodiment will be described.
[0156]
FIG. 16 is a block diagram illustrating a configuration of a moving picture coding device 550A according to a modification of the present embodiment.
The moving picture coding apparatus 550A according to this modification includes, in addition to the components of the moving picture coding apparatus 500A such as the motion estimating section 502 and the motion compensating section 503, a code generation unit 512 that generates a code Nu indicating “1” or “2” according to the difference vector MVd output from the subtractor 510, and a switch 511 that opens and closes the connection between the subtractor 510 and the variable length coding unit 506a.
[0157]
Upon acquiring the difference vector MVd from the subtractor 510, the code generation unit 512 determines whether or not the difference vector MVd is “0”. If the difference vector MVd is “0”, the code generation unit 512 opens the switch 511 to prevent the variable-length coding unit 506a from acquiring the difference vector MVd, generates a code Nu indicating “1”, and outputs the code Nu to the variable-length coding unit 506a. If the difference vector MVd is not “0”, the code generation unit 512 closes the switch 511 so that the variable-length coding unit 506a acquires the difference vector MVd, generates a code Nu indicating “2”, and outputs it to the variable-length coding unit 506a.
[0158]
When the code Nu indicates “1”, the variable-length coding unit 506a according to this modification performs variable-length coding on the residual coded signal Er, the motion vector MV1, and the code Nu; when the code Nu indicates “2”, it performs variable-length coding on the residual coded signal Er, the motion vector MV1, the difference vector MVd, and the code Nu. That is, when the code Nu is “1”, in other words when the difference vector MVd is “0”, the variable-length coding unit 506a does not encode the difference vector MVd. Then, the variable-length coding unit 506a outputs the result of the variable-length coding as an image coded signal Bs3.
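The behaviour of the code generation unit 512 and the switch 511 can be summarized in a short sketch that decides which items are handed to the variable-length coder for one block. Representing the items as a dictionary, and the function names, are assumptions made for illustration; the difference vector is taken as MV2 minus the scaled vector MVs, consistent with the decoder adding MVd to MVs.

```python
def difference_vector(mv2, mvs):
    """MVd: difference between the detected MV2 and the scaled vector MVs."""
    return (mv2[0] - mvs[0], mv2[1] - mvs[1])

def block_symbols(er, mv1, mvd):
    """Return the items to be variable-length coded for one block."""
    if mvd == (0, 0):
        # Switch 511 open: MVd is withheld and the code Nu indicates 1.
        return {"Nu": 1, "Er": er, "MV1": mv1}
    # Switch 511 closed: MVd is passed on and the code Nu indicates 2.
    return {"Nu": 2, "Er": er, "MV1": mv1, "MVd": mvd}

constant_motion = block_symbols("Er", (-1, 0), difference_vector((-2, 0), (-2, 0)))
changed_motion = block_symbols("Er", (-1, 0), difference_vector((-3, 1), (-2, 0)))
```

Here the string "Er" is only a placeholder for the residual coded signal.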
[0159]
FIG. 17 is a conceptual diagram showing the concept of the format of the image coded signal Bs3.
The image coded signal Bs3 includes a frame coded signal Bsf3 having a content indicating a coded frame, and the frame coded signal Bsf3 further includes block coded signals Bsb3 and Bsb4, each having a content indicating a coded block. The block coded signal Bsb3 includes a code signal Bsn2 having a content indicating the coded code Nu (2), a first motion vector coded signal Bs1 having a content indicating the coded motion vector MV1, and a difference vector coded signal Bsd having a content indicating the coded difference vector MVd. The block coded signal Bsb4 includes a code signal Bsn1 having a content indicating the coded code Nu (1) and a first motion vector coded signal Bs1 having a content indicating the coded motion vector MV1.
[0160]
That is, since the difference vector MVd is not “0” for the block indicated by the block coded signal Bsb3, the block coded signal Bsb3 includes the code signal Bsn2 and the difference vector coded signal Bsd in addition to the first motion vector coded signal Bs1. Since the difference vector MVd is “0” for the block indicated by the block coded signal Bsb4, the block coded signal Bsb4 includes only the code signal Bsn1 in addition to the first motion vector coded signal Bs1.
[0161]
Here, since the code Nu indicates “1” or “2”, it is sufficient that the information amount for the code Nu is 1 bit. On the other hand, the information amount for the difference vector MVd requires at least 2 bits when the difference vector MVd is subjected to variable-length coding independently for the horizontal component and the vertical component. In many cases, the motion of the subject in the image is constant in a short time, and thus the difference vector MVd for most of the coding target blocks is “0”.
[0162]
Therefore, in the present modified example, the image coded signal Bs3 includes many block coded signals Bsb4, whose amount of information is reduced by omitting the difference vector coded signal Bsd, so the coding efficiency can be improved.
[0163]
Also, when the difference vector MVd for most of the coding target blocks is “0”, the appearance frequency of the values indicated by the code Nu is biased, and the information amount for the code Nu becomes smaller than 1 bit. Therefore, when a motion vector is encoded by a variable length encoding method that works in units of an integer number of bits, such as a Huffman code, encoding the code Nu in combination with another type of code improves the encoding efficiency compared with encoding the code Nu independently as described above.
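The statement that a biased code Nu carries less than 1 bit of information can be checked with the binary entropy formula. The sketch below is a worked calculation only; the probability value used in the example is an arbitrary assumption, not a figure from the description.

```python
import math

def code_nu_information(p_zero_difference):
    """Average information (in bits) of the code Nu when MVd is 0 with probability p."""
    p = p_zero_difference
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

balanced = code_nu_information(0.5)   # both values equally likely: 1 bit
biased = code_nu_information(0.9)     # MVd is 0 for 90% of blocks: about 0.47 bit
```

A coder that must spend a whole number of bits per symbol still spends 1 bit on the code Nu by itself, which is why combining it with another symbol before coding can recover part of the difference.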
[0164]
In this embodiment, the code signals Bsn1 and Bsn2 are stored for each block coded signal. However, instead of being stored for each block coded signal, the code signals Bsn1 and Bsn2 may be stored for each signal indicating the content of the encoded image in a unit larger than a block, such as an MPEG macroblock or slice. As a result, the number of code signals Bsn1 and Bsn2 can be reduced, and the coding efficiency can be further improved.
[0165]
As described above, according to the present modification, by storing the code signal Bsn1 in the image coded signal Bs3 and omitting the difference vector coded signal Bsd, the amount of information can be reduced and the coding efficiency can be improved.
[0166]
(Embodiment 6)
Hereinafter, a video decoding device according to the sixth embodiment of the present invention will be described with reference to the drawings.
[0167]
FIG. 18 is a block diagram illustrating a configuration of a video decoding device 550B according to the present embodiment.
The moving picture decoding apparatus 550B of the present embodiment decodes a moving picture encoded by the moving picture coding apparatus 550A according to the modification of the fifth embodiment, and includes a variable length decoding unit 536, a motion vector scaling unit 537, a motion compensation unit 533, an image decoding unit 535, a multi-frame buffer 531, adders 539 and 540, and a switch 541.
[0168]
Here, the image decoding unit 535, the motion compensation unit 533, the multi-frame buffer 531, and the adders 539 and 540 in the present embodiment have the same functions and configurations as the image decoding unit 905, the motion compensation unit 903, the multi-frame buffer 901, and the adders 909 and 910, respectively, in the moving image decoding apparatus 900 shown in the conventional example, and thus description thereof is omitted.
[0169]
The variable-length decoding unit 536 in the present embodiment obtains the image coded signal Bs3 and performs variable-length decoding. When the code Nu indicates “1”, it outputs the code Nu, the residual coded signal Er, and the motion vector MV1; when the code Nu indicates “2”, it outputs the code Nu, the residual coded signal Er, the motion vector MV1, and the difference vector MVd.
[0170]
The switch 541 opens and closes the connection between the variable length decoding unit 536 and the adder 540 according to the code Nu from the variable length decoding unit 536. That is, when the code Nu indicates “1”, the switch 541 is opened to inhibit the output of the difference vector MVd from the variable length decoding unit 536 to the adder 540, and when the code Nu indicates “2”, the switch 541 is closed to permit the output of the difference vector MVd from the variable length decoding unit 536 to the adder 540.
[0171]
As a result, when the switch 541 is open, the adder 540 acquires only the motion vector MVs generated by the motion vector scaling unit 537, and outputs the motion vector MVs to the motion compensation unit 533 as the motion vector MV2. When the switch 541 is closed, the adder 540 acquires the motion vector MVs generated by the motion vector scaling unit 537 and the difference vector MVd output from the variable length decoding unit 536, adds the difference vector MVd to the motion vector MVs, and outputs the result of the addition to the motion compensation unit 533 as the motion vector MV2.
[0172]
Accordingly, in the present embodiment, if the image coded signal Bs3 includes information indicating the difference vector MVd, the motion vector MV2 is generated by adding the difference vector MVd to the scaled motion vector MVs; if the image coded signal Bs3 does not include information indicating the difference vector MVd, the scaled motion vector MVs is used as the motion vector MV2.
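The reconstruction of the motion vector MV2 on the decoder side can be sketched as follows. The sketch assumes vectors as (x, y) tuples and display time differences for which the scaled vector is an integer; the function name `reconstruct_mv2` is hypothetical.

```python
def reconstruct_mv2(nu, mv1, t1, t2, mvd=None):
    """Rebuild MV2 from MV1, the code Nu and, when present, the difference vector MVd."""
    # Motion vector scaling unit: MVs = MV1 * T2 / T1 (exact division assumed).
    mvs = (mv1[0] * t2 // t1, mv1[1] * t2 // t1)
    if nu == 1:
        # Switch open: no MVd was coded, so MVs is used as MV2.
        return mvs
    if nu == 2:
        if mvd is None:
            raise ValueError("code Nu = 2 requires a difference vector MVd")
        # Switch closed: the adder outputs MVs + MVd as MV2.
        return (mvs[0] + mvd[0], mvs[1] + mvd[1])
    raise ValueError("code Nu must indicate 1 or 2")

mv2_constant = reconstruct_mv2(1, (-1, 0), t1=1, t2=2)
mv2_changed = reconstruct_mv2(2, (-1, 0), t1=1, t2=2, mvd=(-1, 1))
```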
[0173]
Therefore, according to the present embodiment, it is possible to correctly decode the information relating to a motion vector coded by the moving image coding apparatus 550A according to the modification of Embodiment 5, and as a result, a moving image can be decoded accurately.
[0174]
(Embodiment 7)
In the motion vector encoding method of the moving image encoding device 800 shown in the conventional example, scaling is performed on the motion vector MV1, but in some cases the information regarding the display time differences T1 and T2 necessary for scaling cannot be obtained from the multi-frame buffer 801. In such a case, there is a problem that the motion vector cannot be encoded. Even if the information about the display time differences T1 and T2 can be obtained from the multi-frame buffer 801, if at least one of the display time differences T1 and T2 is very large, scaling is not meaningful, and there is a problem that the coding efficiency of the motion vector is reduced.
[0175]
That is, two types of areas, a short-term memory and a long-term memory, may be secured in the multi-frame buffer 801, and a frame may be recorded in the long-term memory with the information on its display time omitted; scaling cannot be performed when such a frame is read as a reference frame. In addition, a frame having a very large display time difference from the encoding target frame may be recorded in the long-term memory, and when such a frame is read as a reference frame, meaningless scaling is performed.
[0176]
Therefore, the moving picture coding apparatus according to the seventh embodiment of the present invention is characterized in that a motion vector is coded in a way that increases coding efficiency by avoiding meaningless scaling.
[0177]
Hereinafter, a video encoding device according to the seventh embodiment of the present invention will be described with reference to the drawings.
FIG. 19 is a block diagram illustrating a configuration of a moving picture coding apparatus 100 according to the present embodiment.
[0178]
The moving picture coding apparatus 100 according to the present embodiment includes a multi-frame buffer 101, a motion estimation unit 102, a motion compensation unit 103, an image coding unit 104, an image decoding unit 105, a variable length coding unit 106, a motion vector scaling unit 107, an adder 108, subtractors 109 and 110, switches 111, 112 and 113, and a determination unit 114.
[0179]
When encoding each block of the encoding target frame Tf indicated by the image signal Img, the moving picture coding apparatus 100 refers to the two reference frames Rf1 and Rf2. It encodes information on the motion vectors MV1 and MV2 of the block with respect to the reference frames Rf1 and Rf2, together with information based on the predicted image predicted from the reference frames Rf1 and Rf2 and the motion vectors MV1 and MV2.
[0180]
Here, also in the present embodiment, as in Embodiment 1, each of reference frames Rf1 and Rf2 may be temporally forward or backward with respect to encoding target frame Tf.
[0181]
As illustrated in (a) of FIG. 2, the moving picture coding apparatus 100 may refer to frames located ahead of the encoding target frame Tf as the reference frames Rf1 and Rf2, or, as illustrated in (b) of FIG. 2, may refer to frames located behind the encoding target frame Tf as the reference frames Rf1 and Rf2. Further, as illustrated in (c) of FIG. 2, it may refer to one frame located ahead of the encoding target frame Tf as the reference frame Rf2 and to one frame located behind the encoding target frame Tf as the reference frame Rf1.
[0182]
The switches 111 and 112 switch between their contacts 0 and 1 according to which of the two frames (reference frames Rf1 and Rf2) is referred to for each encoding target block. For example, when the reference frame Rf1 is referred to, the switches 111 and 112 each connect contact 0 to the motion estimation unit 102, and when the reference frame Rf2 is referred to, the switches 111 and 112 each connect contact 1 to the motion estimation unit 102.
[0183]
The motion estimation unit 102 detects the motion vectors MV1 and MV2 for a block in the encoding target frame Tf indicated by the image signal Img, based on the reference frames Rf1 and Rf2 read from the multi-frame buffer 101, in the same manner as the motion estimation unit 302 of Embodiment 1, the motion estimation unit 402 of Embodiment 3, or the motion estimation unit 502 of Embodiment 5.
[0184]
The motion compensation unit 103 extracts from the multi-frame buffer 101 the block at the position indicated by the motion vector MV1 in the reference frame Rf1 and the block at the position indicated by the motion vector MV2 in the reference frame Rf2. Then, the motion compensation unit 103 performs a pixel interpolation process based on these blocks to generate a predicted image signal Pre, and outputs this.
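The pixel interpolation performed by the motion compensation unit 103 can be illustrated with a minimal sketch. It assumes that interpolation means a rounded per-pixel average of the two blocks, which this passage does not state explicitly; the function name interpolate_blocks and the nested-list block representation are hypothetical.

```python
def interpolate_blocks(block1, block2):
    """Return the rounded per-pixel average of two equally sized blocks.

    block1: block indicated by MV1 in reference frame Rf1 (list of rows).
    block2: block indicated by MV2 in reference frame Rf2 (list of rows).
    Averaging is an assumption made for illustration only.
    """
    if len(block1) != len(block2):
        raise ValueError("blocks must have the same number of rows")
    predicted = []
    for row1, row2 in zip(block1, block2):
        if len(row1) != len(row2):
            raise ValueError("rows must have the same length")
        # (a + b + 1) // 2 rounds the average to the nearest integer.
        predicted.append([(p1 + p2 + 1) // 2 for p1, p2 in zip(row1, row2)])
    return predicted


# Example: two 2x2 blocks yield the predicted image block Pre.
pre = interpolate_blocks([[10, 20], [30, 40]], [[20, 21], [30, 41]])
```

The result plays the role of the predicted image signal Pre for one block.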
[0185]
The subtractor 109 subtracts the predicted image signal Pre from the image signal Img and outputs a residual signal Res.
The image coding unit 104 obtains the residual signal Res, performs image coding processing such as DCT transform / quantization, and outputs a residual coded signal Er including quantized DCT coefficients and the like.
[0186]
The image decoding unit 105 acquires the residual coded signal Er, performs image decoding processing such as inverse quantization and inverse DCT transform, and outputs a residual decoded signal Dr.
The adder 108 adds the residual decoded signal Dr and the prediction image signal Pre to output a reconstructed image signal Rc.
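The signal path through the subtractor 109, the image coding unit 104, the image decoding unit 105, and the adder 108 can be sketched as follows. This is a simplified illustration: a plain scalar quantizer stands in for the DCT and quantization named in the text, and the function names and the step parameter are hypothetical.

```python
def encode_residual(img, pre, step):
    """Subtractor 109 plus a stand-in for image coding unit 104.

    Res = Img - Pre, then each residual is quantized with a uniform step.
    A real coder would apply a DCT before quantization; that is omitted here.
    """
    return [round((i - p) / step) for i, p in zip(img, pre)]


def reconstruct(er, pre, step):
    """Stand-in for image decoding unit 105 plus adder 108.

    Dr = Er * step (inverse quantization), then Rc = Dr + Pre.
    """
    return [e * step + p for e, p in zip(er, pre)]


# Example: a residual that is a multiple of the step is reconstructed exactly.
img = [100, 104, 91]
pre = [98, 100, 95]
er = encode_residual(img, pre, step=2)
rc = reconstruct(er, pre, step=2)
```

Because the encoder reconstructs Rc from the same quantized data that the decoder will receive, both sides keep identical reference frames.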
[0187]
The multi-frame buffer 101 stores a signal that may be referred to in inter-frame prediction among the reconstructed image signals Rc.
FIG. 20 is a configuration diagram showing a schematic configuration of a memory for storing the above signals in the multi-frame buffer 101.
[0188]
As shown in FIG. 20, a short-term memory 101s and a long-term memory 101l are secured in the multi-frame buffer 101, and a frame indicated by the reconstructed image signal Rc is stored in the short-term memory 101s or the long-term memory 101l as appropriate.
[0189]
The short-term memory 101s is a first-in first-out (FIFO) memory. When a new signal is recorded in the short-term memory 101s, the recorded contents are discarded starting from the oldest, so that the short-term memory 101s always holds the images of a fixed number of the most recent frames. When a frame indicated by the reconstructed image signal Rc is recorded in the short-term memory 101s, it is recorded together with information on the display time of the frame.
[0190]
The long-term memory 101l is a random access memory, configured so that a frame can be stored in an arbitrary area and a frame stored in an arbitrary area can be read. The long-term memory 101l mainly stores images that are referred to over a long period, such as a background image or an image from before a scene insertion, and keeps frames for a longer time than the short-term memory 101s. Frames are stored in the long-term memory 101l by moving frames that are stored in the short-term memory 101s to the long-term memory 101l.
[0191]
Further, when the frame indicated by the reconstructed image signal Rc is recorded in the long-term memory 101l, the frame is recorded either together with the information on its display time or with that information omitted.
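The behavior of the two memory areas described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class and method names (Frame, MultiFrameBuffer, store, move_to_long_term, read) are hypothetical, frames are identified by a name, and an omitted display time is represented by None.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    name: str
    display_time: Optional[int]  # None when the time information is omitted


class MultiFrameBuffer:
    def __init__(self, short_capacity):
        # Short-term memory: FIFO that drops the oldest frame when full.
        self.short_term = deque(maxlen=short_capacity)
        # Long-term memory: random access by slot index.
        self.long_term = {}

    def store(self, frame):
        """Record a reconstructed frame in the short-term memory."""
        self.short_term.append(frame)

    def move_to_long_term(self, name, slot, keep_time=True):
        """Move a frame from the short-term to the long-term memory."""
        frame = next((f for f in self.short_term if f.name == name), None)
        if frame is None:
            raise KeyError(name)
        self.short_term.remove(frame)
        time = frame.display_time if keep_time else None
        self.long_term[slot] = Frame(frame.name, time)

    def read(self, name):
        """Return (frame, from_long_term) for the requested frame."""
        for frame in self.short_term:
            if frame.name == name:
                return frame, False
        for frame in self.long_term.values():
            if frame.name == name:
                return frame, True
        raise KeyError(name)


buf = MultiFrameBuffer(short_capacity=2)
buf.store(Frame("fs1", 0))
buf.store(Frame("fs2", 1))
buf.move_to_long_term("fs1", slot=0, keep_time=False)
buf.store(Frame("fs3", 2))
buf.store(Frame("fs4", 3))  # fs2 is now the oldest and is discarded
```

The second element returned by read corresponds to what a notification about the source memory would need to convey.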
[0192]
Further, the multi-frame buffer 101 according to the present embodiment includes a notification unit 115. The notification unit 115 outputs a notification signal Inf indicating whether each of the reference frames Rf1 and Rf2 read by the motion estimation unit 102 via the switch 111 was read from the short-term memory 101s or from the long-term memory 101l.
[0193]
FIG. 21 is a state diagram showing a state of a frame stored in the multi-frame buffer 101.
Frames fs1, fs2, fs3, ... are stored sequentially in the short-term memory 101s as time passes, and, among the frames stored in the short-term memory 101s, frames fl1 and fl2 that may be referred to later are stored in order in the long-term memory 101l.
[0194]
Here, as shown in (a) of FIG. 21, when the frame fs2 stored in the short-term memory 101s is read from the multi-frame buffer 101 as the reference frame Rf1, the notification unit 115 of the multi-frame buffer 101 outputs a notification signal Inf indicating that the frame has been read from the short-term memory 101s. When the frame fl2 stored in the long-term memory 101l is read from the multi-frame buffer 101 as the reference frame Rf2, the notification unit 115 outputs a notification signal Inf indicating that the frame has been read from the long-term memory 101l.
[0195]
Similarly, as shown in (b) of FIG. 21, when the frames fl1 and fl2 stored in the long-term memory 101l are read from the multi-frame buffer 101 as the reference frames Rf1 and Rf2, respectively, the notification unit 115 of the multi-frame buffer 101 outputs, each time one of the frames fl1 and fl2 is read, a notification signal Inf indicating that a frame has been read from the long-term memory 101l.
[0196]
The determination unit 114 acquires the notification signal Inf from the notification unit 115 and determines whether at least one of the reference frames Rf1 and Rf2 referred to for each encoding target block has been read from the long-term memory 101l. Then, based on the determination result, the determination unit 114 outputs a switching signal si1 that instructs the switch 113 to switch its contact.
[0197]
The switch 113 switches its contact according to the switching signal si1, thereby setting the output destination of the motion vector MV1 from the motion estimation unit 102 to either the motion vector scaling unit 107 or the subtractor 110.
[0198]
That is, when the determination unit 114 determines that both of the reference frames Rf1 and Rf2 have been read from the short-term memory 101s, it outputs a switching signal si1 instructing that the output destination of the switch 113 be the motion vector scaling unit 107. When it determines that at least one of the reference frames Rf1 and Rf2 has been read from the long-term memory 101l, it outputs a switching signal si1 instructing that the output destination of the switch 113 be the subtractor 110.
[0199]
The motion vector scaling unit 107 calculates the display time difference T1 between the encoding target frame Tf and the reference frame Rf1, and the display time difference T2 between the encoding target frame Tf and the reference frame Rf2, as in the operation described with reference to FIG. Based on this, the motion vector MV1 is scaled, and the resulting motion vector MVs is output.
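The scaling step can be illustrated with a short sketch. The exact formula is not given in this passage; the sketch assumes that MV1 is scaled by the ratio of the display time differences, MVs = MV1 × T2 / T1, rounded to integer components. The function name scale_motion_vector is hypothetical.

```python
def scale_motion_vector(mv1, t1, t2):
    """Scale MV1 by the display-time ratio T2/T1 (assumed formula).

    mv1: (x, y) motion vector with respect to reference frame Rf1.
    t1:  display time difference between Tf and Rf1 (must be non-zero).
    t2:  display time difference between Tf and Rf2.
    """
    if t1 == 0:
        raise ValueError("T1 must be non-zero")
    return (round(mv1[0] * t2 / t1), round(mv1[1] * t2 / t1))


# Example: Rf2 is twice as far from Tf as Rf1, so the vector doubles.
mvs = scale_motion_vector((4, -2), t1=1, t2=2)
```

With a uniform motion, MVs is then close to MV2, which keeps the difference vector small.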
[0200]
When the output destination of the switch 113 is set to the motion vector scaling unit 107, the subtractor 110 obtains the difference between the motion vector MV2 obtained from the motion estimation unit 102 and the motion vector MVs obtained from the motion vector scaling unit 107, and outputs a difference vector MVd indicating the result.
[0201]
When the output destination of the switch 113 is set to the subtractor 110, the subtractor 110 uses the motion vector MV1 obtained from the motion estimation unit 102 via the switch 113 in place of the motion vector MVs from the motion vector scaling unit 107. It obtains the difference between the motion vector MV2 and the motion vector MV1 and outputs the result as the difference vector MVd.
[0202]
FIG. 22 is an explanatory diagram for explaining how the difference vector MVd is created.
As shown in FIG. 22, when the subtractor 110 acquires the motion vector MV1 instead of the motion vector MVs, the subtractor 110 calculates a difference between the motion vector MV2 and the motion vector MV1, and creates a difference vector MVd.
[0203]
The variable-length coding unit 106 performs variable-length coding on the difference vector MVd, the motion vector MV1, and the residual coded signal Er, and outputs the coded result as an image coded signal Bs.
[0204]
A series of motion vector coding operations of the moving picture coding apparatus 100 according to the present embodiment will be described with reference to FIG. 23.
FIG. 23 is a flowchart showing a series of operations for encoding a motion vector.
[0205]
First, the determination unit 114 of the moving picture coding apparatus 100 determines, based on the notification signal Inf, whether at least one of the reference frames Rf1 and Rf2 has been read from the long-term memory 101l (step S101).
[0206]
When the determination unit 114 determines that both of the reference frames Rf1 and Rf2 have been read from the short-term memory 101s (N in step S101), it switches the contact of the switch 113 so that the output destination of the switch 113 is the motion vector scaling unit 107. As a result, the motion vector scaling unit 107 acquires the motion vector MV1 and scales it to create the motion vector MVs (step S102). Then, the subtractor 110 acquires the created motion vector MVs.
[0207]
On the other hand, when the determination unit 114 determines that at least one of the reference frames Rf1 and Rf2 has been read from the long-term memory 101l (Y in step S101), it switches the contact of the switch 113 so that the output destination of the switch 113 is the subtractor 110. As a result, the subtractor 110 treats the motion vector MV1 obtained from the motion estimation unit 102 via the switch 113 as if it were the motion vector MVs output from the motion vector scaling unit 107 (step S103).
[0208]
Next, the subtractor 110 calculates a difference between the motion vector MV2 and the above-described motion vector MVs, and outputs a difference vector MVd indicating the difference result to the variable-length encoding unit 106 (Step S104).
[0209]
Then, the variable length coding unit 106 performs variable length coding on the motion vector MV1 obtained from the motion estimation unit 102 (step S105), and performs variable length coding on the difference vector MVd obtained from the subtractor 110 (step S106).
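Steps S101 to S106 can be summarized in a brief sketch. It is an illustration under stated assumptions rather than the apparatus itself: scaling is assumed to use the ratio T2/T1, variable length coding is left out (the function simply returns the two values that would be coded), and all names are hypothetical.

```python
def scale(mv, t1, t2):
    """Assumed scaling rule: MVs = MV1 * T2 / T1, rounded per component."""
    return (round(mv[0] * t2 / t1), round(mv[1] * t2 / t1))


def encode_motion_vectors(mv1, mv2, t1, t2, rf1_long_term, rf2_long_term):
    """Return (MV1, MVd), the values passed to variable length coding.

    t1 and t2 may be None when a reference frame comes from the long-term
    memory without display time information; they are not used in that case.
    """
    if rf1_long_term or rf2_long_term:   # S101: Y
        mvs = mv1                        # S103: MV1 is used in place of MVs
    else:                                # S101: N
        mvs = scale(mv1, t1, t2)         # S102: scale MV1 into MVs
    mvd = (mv2[0] - mvs[0], mv2[1] - mvs[1])  # S104: MVd = MV2 - MVs
    return mv1, mvd                      # S105 and S106 would code these


# Both reference frames in the short-term memory: scaling is applied.
short_case = encode_motion_vectors((4, -2), (9, -3), 1, 2, False, False)
# Rf1 read from the long-term memory: scaling is skipped.
long_case = encode_motion_vectors((4, -2), (9, -3), None, None, True, False)
```

In the second call no display time is needed at all, which is the situation the long-term memory can produce.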
[0210]
As described above, in the present embodiment, when at least one of the two reference frames Rf1 and Rf2 is read from the long-term memory 101l, the motion vector scaling unit 107 does not perform scaling. Therefore, even when information on the display time of a frame is recorded in the long-term memory 101l, the execution of meaningless scaling using that information can be omitted, and the coding efficiency of the motion vector can be improved. Further, when information on the display time of a frame is not recorded in the long-term memory 101l, the execution of infeasible scaling can be omitted, and the coding efficiency of the motion vector can likewise be improved.
[0211]
In the present embodiment, whether or not the motion vector scaling unit 107 performs scaling is selected by switching the switch 113. However, the switch 113 may be omitted, and the motion vector scaling unit 107 may instead be configured never to perform scaling when at least one of the two reference frames Rf1 and Rf2 is read from the long-term memory 101l.
[0212]
In the present embodiment, when at least one of the two reference frames Rf1 and Rf2 is read from the long-term memory 101l, the motion vector MV1 and the difference vector MVd are encoded. Instead, the motion vectors MV1 and MV2 may be encoded. That is, when at least one of the two reference frames Rf1 and Rf2 is read from the long-term memory 101l, the motion vector MV2 is encoded in place of the difference vector MVd. In this case, predicted values of the motion vectors MV1 and MV2 may further be obtained from peripheral blocks around the encoding target block, and the differences between the motion vectors MV1 and MV2 and their respective predicted values may be encoded.
[0213]
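Further, in the present embodiment, the notification unit 115 is provided in the multi-frame buffer 101, but it may be provided in a component other than the multi-frame buffer 101, or the notification unit 115 may be provided alone.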
[0214]
(Embodiment 8)
Hereinafter, a video decoding device according to the eighth embodiment of the present invention will be described with reference to the drawings.
[0215]
FIG. 24 is a block diagram illustrating a configuration of a video decoding device 200 according to the present embodiment.
The moving picture decoding apparatus 200 according to the present embodiment decodes a moving picture encoded by the moving picture coding apparatus 100 according to the seventh embodiment, and includes a variable length decoding unit 206, a motion vector scaling unit 207, a motion compensation unit 203, an image decoding unit 204, a multi-frame buffer 201, a determination unit 214, adders 209 and 210, and a switch 213.
[0216]
The variable length decoding unit 206 obtains the image coded signal Bs, performs variable length decoding, and outputs the residual coded signal Er, the motion vector MV1, and the difference vector MVd.
[0217]
The image decoding unit 204 acquires the residual coded signal Er, performs image decoding processing such as inverse quantization and inverse DCT, and outputs a residual decoded signal Dr.
Like the motion compensation unit 103 of the seventh embodiment, the motion compensation unit 203 extracts from the multi-frame buffer 201 the block at the position indicated by the motion vector MV1 in the reference frame Rf1 and the block at the position indicated by the motion vector MV2 in the reference frame Rf2. Then, the motion compensation unit 203 creates a predicted image signal Pre by performing pixel interpolation processing based on these blocks, and outputs it.
[0218]
The adder 209 adds the predicted image signal Pre from the motion compensation unit 203 and the residual decoded signal Dr from the image decoding unit 204, and outputs the result as a decoded image signal Di.
[0219]
When acquiring the motion vector MV1 output from the variable length decoding unit 206, the motion vector scaling unit 207 obtains the display time difference between the encoding target frame Tf and the reference frame Rf1 similarly to the motion vector scaling unit 107 of the seventh embodiment. Based on T1 and the display time difference T2 between the encoding target frame Tf and the reference frame Rf2, the motion vector MV1 is scaled, and the resulting motion vector MVs is output.
[0220]
The multi-frame buffer 201 stores those parts of the decoded image signal Di that may be referred to in inter-frame prediction. Further, a short-term memory 201s and a long-term memory 201l having the same functions and configurations as the short-term memory 101s and the long-term memory 101l of the multi-frame buffer 101 of the seventh embodiment are secured in the multi-frame buffer 201. That is, the frame indicated by the decoded image signal Di is stored in the short-term memory 201s or the long-term memory 201l as appropriate.
[0221]
Further, the multi-frame buffer 201 includes a notification unit 215 having the same function and configuration as the notification unit 115 of the multi-frame buffer 101 according to the seventh embodiment. That is, the notification unit 215 outputs a notification signal Inf indicating whether each of the reference frames Rf1 and Rf2 read by the motion compensation unit 203 was read from the short-term memory 201s or from the long-term memory 201l.
[0222]
The determination unit 214 has the same function and configuration as the determination unit 114 of the seventh embodiment. It acquires the notification signal Inf from the notification unit 215 and determines whether at least one of the reference frames Rf1 and Rf2 referred to for each decoding target block has been read from the long-term memory 201l. Then, based on the determination result, the determination unit 214 outputs a switching signal si1 that instructs the switch 213 to switch its contact.
[0223]
The switch 213 switches its contact according to the switching signal si1, thereby setting the output destination of the motion vector MV1 acquired from the variable length decoding unit 206 to either the motion vector scaling unit 207 or the adder 210.
[0224]
That is, when the determination unit 214 determines that both of the reference frames Rf1 and Rf2 have been read from the short-term memory 201s, it outputs a switching signal si1 instructing that the output destination of the switch 213 be the motion vector scaling unit 207. When it determines that at least one of the reference frames Rf1 and Rf2 has been read from the long-term memory 201l, it outputs a switching signal si1 instructing that the output destination of the switch 213 be the adder 210.
[0225]
When the output destination of the switch 213 is set to the motion vector scaling unit 207, the adder 210 adds the difference vector MVd obtained from the variable length decoding unit 206 and the motion vector MVs obtained from the motion vector scaling unit 207, and outputs a motion vector MV2 indicating the result to the motion compensation unit 203.
[0226]
When the output destination of the switch 213 is set to the adder 210 itself, the adder 210 uses the motion vector MV1 obtained from the variable length decoding unit 206 via the switch 213 in place of the motion vector MVs from the motion vector scaling unit 207. It adds the difference vector MVd and the motion vector MV1 and outputs the result to the motion compensation unit 203 as the motion vector MV2.
[0227]
A series of operations by which the moving picture decoding apparatus 200 according to the present embodiment decodes a motion vector will be described with reference to FIG. 25.
FIG. 25 is a flowchart showing a series of operations for decoding a motion vector.
[0228]
First, the variable length decoding unit 206 of the moving picture decoding apparatus 200 obtains the image coded signal Bs and performs variable length decoding, thereby decoding the motion vector MV1 (step S201) and decoding the difference vector MVd (step S202).
[0229]
Next, the determination unit 214 determines whether at least one of the reference frames Rf1 and Rf2 has been read from the long-term memory 201l based on the notification signal Inf (step S203).
[0230]
When the determination unit 214 determines that both of the reference frames Rf1 and Rf2 have been read from the short-term memory 201s (N in step S203), it switches the contact of the switch 213 so that the output destination of the switch 213 is the motion vector scaling unit 207. As a result, the motion vector scaling unit 207 acquires the motion vector MV1 and scales it to create the motion vector MVs (step S204). Then, the adder 210 acquires the created motion vector MVs.
[0231]
On the other hand, when the determination unit 214 determines that at least one of the reference frames Rf1 and Rf2 has been read from the long-term memory 201l (Y in step S203), it switches the contact of the switch 213 so that the output destination of the switch 213 is the adder 210. As a result, the adder 210 treats the motion vector MV1 obtained from the variable length decoding unit 206 via the switch 213 as if it were the motion vector MVs output from the motion vector scaling unit 207 (step S205).
[0232]
Then, the adder 210 adds the motion vector MVs to the difference vector MVd, and outputs a motion vector MV2 indicating the result of the addition to the motion compensation unit 203 (Step S206).
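Steps S203 to S206 mirror the encoder side and can be sketched in the same way. The sketch assumes that MV1 and MVd have already been obtained by variable length decoding (steps S201 and S202) and that scaling uses the ratio T2/T1, which this passage does not state explicitly; all names are hypothetical.

```python
def scale(mv, t1, t2):
    """Assumed scaling rule: MVs = MV1 * T2 / T1, rounded per component."""
    return (round(mv[0] * t2 / t1), round(mv[1] * t2 / t1))


def decode_motion_vector_mv2(mv1, mvd, t1, t2, rf1_long_term, rf2_long_term):
    """Recover MV2 from the decoded MV1 and difference vector MVd.

    t1 and t2 may be None when a reference frame was read from the
    long-term memory; they are not used in that case.
    """
    if rf1_long_term or rf2_long_term:   # S203: Y
        mvs = mv1                        # S205: MV1 is used in place of MVs
    else:                                # S203: N
        mvs = scale(mv1, t1, t2)         # S204: scale MV1 into MVs
    return (mvs[0] + mvd[0], mvs[1] + mvd[1])  # S206: MV2 = MVs + MVd


# Short-term case: MVs = (8, -4), so MV2 = (8 + 1, -4 + 1).
mv2_short = decode_motion_vector_mv2((4, -2), (1, 1), 1, 2, False, False)
# Long-term case: MV1 itself is added to MVd.
mv2_long = decode_motion_vector_mv2((4, -2), (5, -1), None, None, False, True)
```

Both calls recover the same MV2, because each applies the same branch that an encoder following the rule of Embodiment 7 would have taken.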
[0233]
Thus, in the present embodiment, as in Embodiment 7, when at least one of the two reference frames Rf1 and Rf2 is read from the long-term memory 201l, the motion vector scaling unit 207 does not perform scaling. Therefore, even when information on the display time of a frame is recorded in the long-term memory 201l, the execution of meaningless scaling using that information can be omitted, and the efficiency of decoding a motion vector can be improved. Further, when information on the display time of a frame is not recorded in the long-term memory 201l, the execution of infeasible scaling can be omitted, and the efficiency of decoding the motion vector can likewise be improved.
[0234]
In the present embodiment, whether or not the motion vector scaling unit 207 performs scaling is selected by switching the switch 213. However, the switch 213 may be omitted, and the motion vector scaling unit 207 may instead be configured never to perform scaling when at least one of the two reference frames Rf1 and Rf2 is read from the long-term memory 201l.
[0235]
In the present embodiment, when at least one of the two reference frames Rf1 and Rf2 is read from the long-term memory 201l, the motion vector MV2 is derived by adding the difference vector MVd to the motion vector MV1. Instead, the motion vector MV2 may be decoded directly without this addition. That is, when at least one of the two reference frames Rf1 and Rf2 is read from the long-term memory 201l, the motion vector MV2 is decoded in place of the difference vector MVd. In this case, if each of the motion vectors MV1 and MV2 has been coded after subtracting a predicted value obtained from peripheral blocks around the decoding target block, the motion vectors MV1 and MV2 may be decoded by adding the decoded values and the above-described predicted values.
[0236]
Further, in the present embodiment, the notification unit 215 is provided in the multi-frame buffer 201, but may be provided in a component other than the multi-frame buffer 201, or the notification unit 215 may be provided alone.
[0237]
The frames described in the first to eighth embodiments may be fields. The frame and the field are collectively called a picture.
As described above, the motion vector detection method according to the present invention, the motion vector encoding method and motion vector decoding method using it, and the apparatuses using these methods have been described using Embodiments 1 to 8. However, the present invention is not limited to Embodiments 1 to 8, and it goes without saying that it can be realized in other forms.
[0238]
(Embodiment 9)
Further, by recording a program for realizing the motion vector detection method, the motion vector encoding method, and the motion vector decoding method described in each of the above embodiments on a storage medium such as a flexible disk, the processing described in each embodiment can easily be carried out on an independent computer system.
[0239]
FIG. 26 is an explanatory diagram of a storage medium that stores a program for realizing a motion vector detection method, a motion vector encoding method, and a motion vector decoding method according to Embodiments 1 to 8 by a computer system.
[0240]
FIG. 26B shows the external appearance of the flexible disk FD as viewed from the front, its cross-sectional structure, and the disk main body FD1, and FIG. 26A shows an example of the physical format of the disk main body FD1, which is the main body of the recording medium.
[0241]
The disk main body FD1 is housed in the case F. A plurality of tracks Tr are formed concentrically on the surface of the disk main body FD1 from the outer periphery toward the inner periphery, and each track is divided into 16 sectors Se in the angular direction. Therefore, in the flexible disk FD storing the program, the motion vector encoding method and the motion vector decoding method as the program are recorded in an area allocated on the disk main body FD1.
[0242]
FIG. 26C shows a configuration for recording and reproducing the program on the flexible disk FD.
When recording the program on the flexible disk FD, the computer system Cs writes the motion vector encoding method or the motion vector decoding method as the program via the flexible disk drive FDD. When the motion vector encoding method or the motion vector decoding method is to be constructed in the computer system Cs from the program on the flexible disk FD, the program is read from the flexible disk FD by the flexible disk drive FDD and transferred to the computer system Cs.
[0243]
In the above description, the description has been made using the flexible disk FD as the recording medium, but the same can be done using an optical disk. Further, the recording medium is not limited to this, and the present invention can be similarly implemented as long as the program can be recorded, such as an IC card or a ROM cassette.
[0244]
(Embodiment 10)
Further, here, application examples of the motion vector detection method, the motion vector encoding method, and the motion vector decoding method described in the above embodiment and a system using the same will be described.
[0245]
FIG. 27 is a block diagram illustrating an overall configuration of a content supply system ex100 that realizes a content distribution service. A communication service providing area is divided into desired sizes, and base stations ex107 to ex110, which are fixed wireless stations, are installed in each cell.
[0246]
In the content supply system ex100, devices such as a computer ex111, a PDA (personal digital assistant) ex112, a camera ex113, a mobile phone ex114, and a camera-equipped mobile phone ex115 are connected to the Internet ex101 via, for example, an Internet service provider ex102, a telephone network ex104, and the base stations ex107 to ex110.
[0247]
However, the content supply system ex100 is not limited to the combination as shown in FIG. 27, and may be connected in any combination. Further, each device may be directly connected to the telephone network ex104 without going through the base stations ex107 to ex110 which are fixed wireless stations.
[0248]
The camera ex113 is a device such as a digital video camera capable of shooting moving images. The mobile phone may be a mobile phone of the PDC (Personal Digital Communications) system, the CDMA (Code Division Multiple Access) system, the W-CDMA (Wideband-Code Division Multiple Access) system, or the GSM (Global System for Mobile Communications) system, or a PHS (Personal Handyphone System) or the like.
[0249]
The streaming server ex103 is connected to the camera ex113 via the base station ex109 and the telephone network ex104, and enables live distribution and the like based on encoded data transmitted by a user using the camera ex113. The encoding of the captured data may be performed by the camera ex113, or by a server or the like that performs the data transmission processing. Moving image data captured by a camera ex116 may also be transmitted to the streaming server ex103 via the computer ex111. The camera ex116 is a device such as a digital camera that can shoot still images and moving images. In this case, the moving image data may be encoded by either the camera ex116 or the computer ex111. The encoding is processed in an LSI ex117 included in the computer ex111 or the camera ex116. The image encoding and decoding software may be stored on any storage medium (a CD-ROM, a flexible disk, a hard disk, or the like) that is a recording medium readable by the computer ex111 or the like. Further, the moving image data may be transmitted by the camera-equipped mobile phone ex115. The moving image data in this case is data encoded by the LSI included in the mobile phone ex115.
[0250]
In the content supply system ex100, content captured by the user with the camera ex113, the camera ex116, or the like (for example, video of a live music performance) is encoded in the same manner as in the above-described embodiments and transmitted to the streaming server ex103, while the streaming server ex103 stream-distributes the content data to clients that request it. Examples of the client include the computer ex111, the PDA ex112, the camera ex113, and the mobile phone ex114, which can decode the encoded data. In this way, the content supply system ex100 is a system in which the client can receive and reproduce the encoded data, and in which personal broadcasting can also be realized by having the client receive, decode, and reproduce the data in real time.
[0251]
The encoding and decoding of each device constituting this system may be performed using the video encoding device or the video decoding device described in each of the above embodiments.
A mobile phone will be described as an example.
[0252]
FIG. 28 is a diagram illustrating the mobile phone ex115 that uses the motion vector detection method, the motion vector encoding method, and the motion vector decoding method described in the above embodiments. The mobile phone ex115 includes: an antenna ex201 for transmitting and receiving radio waves to and from the base station ex110; a camera unit ex203, such as a CCD camera, capable of capturing video and still images; a display unit ex202, such as a liquid crystal display, for displaying data obtained by decoding the video captured by the camera unit ex203, the video received by the antenna ex201, and the like; a main body including a group of operation keys ex204; an audio output unit ex208, such as a speaker, for outputting audio; an audio input unit ex205, such as a microphone, for inputting audio; a recording medium ex207 for storing encoded or decoded data, such as data of captured moving images or still images, data of received mail, and moving image or still image data; and a slot unit ex206 that allows the recording medium ex207 to be attached to the mobile phone ex115. The recording medium ex207 houses, in a plastic case such as that of an SD card, a flash memory element, which is a kind of EEPROM (Electrically Erasable and Programmable Read Only Memory), a nonvolatile memory that can be electrically rewritten and erased.
[0253]
Further, the mobile phone ex115 will be described with reference to FIG. 29. In the mobile phone ex115, a power supply circuit unit ex310, an operation input control unit ex304, an image encoding unit ex312, a camera interface unit ex303, an LCD (Liquid Crystal Display) control unit ex302, an image decoding unit ex309, a demultiplexing unit ex308, a recording / reproducing unit ex307, a modulation / demodulation circuit unit ex306, and an audio processing unit ex305 are connected to one another via a synchronous bus ex313, together with a main control unit ex311 that performs overall control of each unit of the main body including the display unit ex202 and the operation keys ex204.
[0254]
When the end-call / power key is turned on by a user operation, the power supply circuit unit ex310 supplies power from the battery pack to each unit, thereby bringing the camera-equipped digital mobile phone ex115 into an operable state.
[0255]
Under the control of the main control unit ex311, which includes a CPU, a ROM, a RAM, and the like, the mobile phone ex115 in the voice call mode converts the audio signal collected by the audio input unit ex205 into digital audio data in the audio processing unit ex305. This data is subjected to spread spectrum processing in the modulation / demodulation circuit unit ex306 and to digital-to-analog conversion processing and frequency conversion processing in the transmission / reception circuit unit ex301, and is then transmitted via the antenna ex201. Also in the voice call mode, the mobile phone ex115 amplifies the data received by the antenna ex201, performs frequency conversion processing and analog-to-digital conversion processing, performs spectrum despreading processing in the modulation / demodulation circuit unit ex306, converts the result into analog audio data in the audio processing unit ex305, and then outputs it via the audio output unit ex208.
[0256]
Further, when an e-mail is transmitted in the data communication mode, text data of the e-mail input by operating the operation key ex204 of the main body is sent to the main control unit ex311 via the operation input control unit ex304. The main control unit ex311 performs spread spectrum processing on the text data in the modulation / demodulation circuit unit ex306, performs digital / analog conversion processing and frequency conversion processing in the transmission / reception circuit unit ex301, and transmits the data to the base station ex110 via the antenna ex201.
[0257]
When transmitting image data in the data communication mode, the image data captured by the camera unit ex203 is supplied to the image encoding unit ex312 via the camera interface unit ex303. When image data is not transmitted, image data captured by the camera unit ex203 can be directly displayed on the display unit ex202 via the camera interface unit ex303 and the LCD control unit ex302.
[0258]
The image encoding unit ex312 includes the image encoding device described in the present invention. It converts the image data supplied from the camera unit ex203 into encoded image data by compression-encoding it with the encoding method used in the image encoding device described in the above embodiment, and sends the encoded image data to the demultiplexing unit ex308. At the same time, the mobile phone ex115 sends the audio collected by the audio input unit ex205 during imaging by the camera unit ex203 to the demultiplexing unit ex308 as digital audio data via the audio processing unit ex305.
[0259]
The demultiplexing unit ex308 multiplexes the encoded image data supplied from the image encoding unit ex312 and the audio data supplied from the audio processing unit ex305 by a predetermined method. The resulting multiplexed data is subjected to spread spectrum processing in the modulation / demodulation circuit unit ex306 and to digital-to-analog conversion processing and frequency conversion processing in the transmission / reception circuit unit ex301, and is then transmitted via the antenna ex201.
[0260]
When data of a moving image file linked to a homepage or the like is received in the data communication mode, the data received from the base station ex110 via the antenna ex201 is subjected to spectrum despreading processing in the modulation / demodulation circuit unit ex306, and the resulting multiplexed data is sent to the demultiplexing unit ex308.
[0261]
To decode the multiplexed data received via the antenna ex201, the demultiplexing unit ex308 separates the multiplexed data into a bit stream of image data and a bit stream of audio data and, via the synchronous bus ex313, supplies the encoded image data to the image decoding unit ex309 and the audio data to the audio processing unit ex305.
[0262]
Next, the image decoding unit ex309 includes the image decoding device described in the present invention. It generates reproduced moving image data by decoding the bit stream of image data with a decoding method corresponding to the encoding method described in the above embodiment, and supplies the data to the display unit ex202 via the LCD control unit ex302, whereby, for example, the moving image data included in a moving image file linked to a homepage is displayed. At the same time, the audio processing unit ex305 converts the audio data into analog audio data and supplies it to the audio output unit ex208, whereby, for example, the audio data included in the moving image file linked to the homepage is reproduced.
[0263]
It should be noted that the present invention is not limited to the example of the system described above. Digital broadcasting using satellites and terrestrial waves has recently attracted attention, and, as shown in FIG. 30, any of the encoding or decoding devices of the above embodiment can also be incorporated into a digital broadcasting system. Specifically, at the broadcasting station ex409, the bit stream of the video information is transmitted via radio waves to a communication or broadcasting satellite ex410. The broadcasting satellite ex410 that receives it emits radio waves for broadcasting; a home antenna ex406 having satellite broadcast receiving equipment receives these radio waves, and a device such as a television (receiver) ex401 or a set-top box (STB) ex407 decodes the bit stream and reproduces it. Further, the image decoding device described in the above embodiment can also be mounted on a reproducing device ex403 that reads and decodes a bit stream recorded on a storage medium ex402, such as a CD or DVD, which is a recording medium. In this case, the reproduced video signal is displayed on the monitor ex404. A configuration is also conceivable in which the image decoding device is mounted in a set-top box ex407 connected to a cable ex405 for cable television or to an antenna ex406 for satellite / terrestrial broadcasting, and the video is reproduced on a monitor ex408 of the television. In this case, the image decoding device may be incorporated in the television instead of the set-top box. In addition, a car ex412 having an antenna ex411 can receive a signal from the satellite ex410, a base station ex107, or the like, and can reproduce a moving image on a display device such as a car navigation system ex413 included in the car ex412.
[0264]
Further, an image signal can be encoded by the image encoding device described in the above embodiment and recorded on a recording medium. Specific examples include a recorder ex420 such as a DVD recorder that records an image signal on a DVD disc ex421 and a disc recorder that records on a hard disk. Furthermore, it can be recorded on the SD card ex422. If the recorder ex420 includes the image decoding device described in the above embodiment, the image signal recorded on the DVD disc ex421 or the SD card ex422 can be reproduced and displayed on the monitor ex408.
[0265]
The car navigation system ex413 may be configured, for example, as shown in FIG. 29 with the camera unit ex203, the camera interface unit ex303, and the image encoding unit ex312 omitted. A similar configuration is conceivable for the television (receiver) ex401 and the like.
[0266]
In addition, three implementation formats are conceivable for a terminal such as the mobile phone ex114: a transmitting and receiving terminal having both an encoder and a decoder, a transmitting terminal having only an encoder, and a receiving terminal having only a decoder.
[0267]
As described above, the motion vector detecting method, the encoding method, and the decoding method described in the above embodiment can be used in any of the devices and systems described above, and by doing so, the effects described in the above embodiment can be obtained.
[0268]
Further, the present invention is not limited to the above embodiment, and various changes or modifications can be made without departing from the scope of the present invention.
[0269]
[Effect of the invention]
As is apparent from the above description, the motion vector detection method according to the present invention is a method for detecting a motion vector indicating the displacement, from another picture, of a block in a picture constituting a moving image, and includes: a first candidate generating step of generating first motion vector candidates for the detection target block based on a first reference picture; a second candidate generating step of generating second motion vector candidates for the detection target block based on a second reference picture; an interpolation step of creating an interpolation prediction block by interpolating the pixel values of mutually corresponding pixels, based on a first prediction block indicated by a first motion vector candidate in the first reference picture and a second prediction block indicated by a second motion vector candidate in the second reference picture; a calculating step of calculating an evaluation value based on the differences between the pixel values of mutually corresponding pixels of the interpolation prediction block and the detection target block; a selecting step of selecting, based on the evaluation value, one candidate from among the plurality of first motion vector candidates generated in the first candidate generating step and one candidate from among the plurality of second motion vector candidates generated in the second candidate generating step; and a detecting step of detecting the selected first motion vector candidate as a first motion vector of the detection target block based on the first reference picture, and the selected second motion vector candidate as a second motion vector of the detection target block based on the second reference picture. For example, in the selecting step, the first and second motion vector candidates that give the smallest evaluation value are selected.
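The steps summarized above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the function names, the exhaustive full-pel search window, the rounded average used as the interpolation, and the sum of absolute differences (SAD) used as the evaluation value are all assumptions chosen for illustration, and both reference pictures are assumed to have the same size.

```python
from itertools import product

def get_block(frame, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def interpolate(pb1, pb2):
    """Interpolation step (assumed here to be a rounded average of co-located pixels)."""
    return [[(a + b + 1) // 2 for a, b in zip(r1, r2)] for r1, r2 in zip(pb1, pb2)]

def sad(block_a, block_b):
    """Evaluation value (assumed here to be the sum of absolute differences)."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b) for a, b in zip(ra, rb))

def detect_motion_vectors(target, ref1, ref2, top, left, size, search):
    """Return (evaluation value, MV1, MV2) for the candidate pair with the smallest value."""
    cur = get_block(target, top, left, size)
    h, w = len(ref1), len(ref1[0])
    # Candidate generation: every (dy, dx) offset that keeps the block inside the picture.
    offsets = [(dy, dx)
               for dy in range(-search, search + 1)
               for dx in range(-search, search + 1)
               if 0 <= top + dy <= h - size and 0 <= left + dx <= w - size]
    best = None
    for mvc1, mvc2 in product(offsets, offsets):
        pb1 = get_block(ref1, top + mvc1[0], left + mvc1[1], size)
        pb2 = get_block(ref2, top + mvc2[0], left + mvc2[1], size)
        # The evaluation value is computed on the interpolated block, not on PB1 or PB2 alone.
        cost = sad(interpolate(pb1, pb2), cur)
        if best is None or cost < best[0]:
            best = (cost, mvc1, mvc2)
    return best
```

The point of the sketch is that the candidates are evaluated as a pair: the cost of a first candidate depends on which second candidate it is combined with. The exhaustive pairing shown here grows with the square of the number of candidates, which is a property of this illustration only.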
[0270]
As a result, because the evaluation value is calculated from the result of interpolating the pixel values, even when a fade occurs, an increase in the error of the evaluation value due to the fade can be prevented and an optimal motion vector can be detected.
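The effect on the evaluation value can be seen in a small worked example, given here as a Python sketch under stated assumptions: a linear fade in which the first reference picture is at 50 % brightness, the picture containing the detection target block at 75 %, and the second reference picture at 100 %; a rounded average as the interpolation; and the sum of absolute differences as the evaluation value. The pixel values and names are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between co-located pixels."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def average(pb1, pb2):
    """Rounded average of co-located pixels of two prediction blocks."""
    return [[(x + y + 1) // 2 for x, y in zip(r1, r2)] for r1, r2 in zip(pb1, pb2)]

# Hypothetical 2 x 2 block content at full brightness.
content = [[40, 80], [120, 160]]

pb1 = [[p // 2 for p in row] for row in content]         # first reference: 50 % brightness
target = [[3 * p // 4 for p in row] for row in content]  # detection target block: 75 %
pb2 = [row[:] for row in content]                        # second reference: 100 %

sad_single_1 = sad(pb1, target)              # evaluation against PB1 alone
sad_single_2 = sad(pb2, target)              # evaluation against PB2 alone
sad_interp = sad(average(pb1, pb2), target)  # evaluation against the interpolated block
```

With these values, each prediction block taken alone differs from the target by 100 even though it points at exactly the right position, whereas the interpolated block matches the target with an evaluation value of 0. An evaluation made per reference picture can therefore be dominated by the brightness change rather than by the motion.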
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration of a video encoding device according to a first embodiment of the present invention.
FIG. 2 is a frame layout diagram showing a temporal positional relationship between a reference frame and an encoding target frame according to the first embodiment.
FIG. 3 is a block diagram illustrating a configuration of a motion estimating unit according to the first embodiment.
FIG. 4 is an explanatory diagram for describing a method of generating a first motion vector candidate and a second motion vector candidate according to the embodiment.
FIG. 5 is a conceptual diagram showing a concept of a format of an image coded signal according to the embodiment.
FIG. 6 is a flowchart showing an operation until the moving image encoding apparatus detects and encodes a motion vector.
FIG. 7 is a block diagram illustrating a configuration of a video decoding device according to a second embodiment of the present invention.
FIG. 8 is a block diagram illustrating a configuration of a video encoding device according to a third embodiment of the present invention.
FIG. 9 is a block diagram illustrating a configuration of a motion estimation unit according to the embodiment.
FIG. 10 is an explanatory diagram for describing a method of generating a first motion vector candidate and a second motion vector candidate according to the first embodiment.
FIG. 11 is a conceptual diagram showing the concept of the format of an image coded signal according to the embodiment.
FIG. 12 is a block diagram illustrating a configuration of a video decoding device according to a fourth embodiment of the present invention.
FIG. 13 is a block diagram illustrating a configuration of a video encoding device according to a fifth embodiment of the present invention.
FIG. 14 is a block diagram illustrating a configuration of a motion estimation unit according to the third embodiment.
FIG. 15 is an explanatory diagram for describing a method of generating a first motion vector candidate, a motion vector, and a second motion vector candidate.
FIG. 16 is a block diagram showing a configuration of a moving picture coding device according to a modification of the above.
FIG. 17 is a conceptual diagram showing a concept of a format of an image coded signal of a moving image coding apparatus according to a modification of the above.
FIG. 18 is a block diagram illustrating a configuration of a video decoding device according to a sixth embodiment of the present invention.
FIG. 19 is a block diagram illustrating a configuration of a video encoding device according to a seventh embodiment of the present invention.
FIG. 20 is a configuration diagram showing a schematic configuration of a memory inside the multi-frame buffer according to the first embodiment.
FIG. 21 is a state diagram showing a state of a frame stored in the multi-frame buffer according to the embodiment.
FIG. 22 is an explanatory diagram for explaining how a difference vector is created.
FIG. 23 is a flowchart showing a series of operations for encoding a motion vector according to the embodiment.
FIG. 24 is a block diagram illustrating a configuration of a video decoding device according to an eighth embodiment of the present invention.
FIG. 25 is a flowchart showing a series of operations for decoding the motion vector according to the above.
FIG. 26 is an explanatory diagram of a recording medium according to a ninth embodiment of the present invention.
FIG. 27 is a block diagram illustrating an overall configuration of a content supply system according to a tenth embodiment of the present invention.
FIG. 28 is a front view of the above mobile phone.
FIG. 29 is a block diagram of the mobile phone of the above.
FIG. 30 is a block diagram showing the overall configuration of the digital broadcasting system of the above.
FIG. 31 is an explanatory diagram for explaining a motion vector.
FIG. 32 is an explanatory diagram for explaining how to generate a predicted image using two frames.
FIG. 33 is a block diagram illustrating a configuration of a moving image encoding device showing a conventional example.
FIG. 34 is a block diagram illustrating a configuration of a motion estimation unit according to the embodiment.
FIG. 35 is an explanatory diagram for describing how to detect a motion vector according to the embodiment.
FIG. 36 is an explanatory diagram for explaining a change in a pixel value due to a fade.
FIG. 37 is a conceptual diagram showing the concept of the format of an image coded signal output by a moving image coding apparatus showing a conventional example.
FIG. 38 is a block diagram illustrating a configuration of a moving image decoding device illustrating a conventional example.
[Explanation of symbols]
301 Multi-frame buffer
302 Motion estimation unit
303 Motion compensation unit
304 Image coding unit
305 Image decoding unit
306 Variable length coding unit
308 Adder
309 Subtractor
Bs1 image encoded signal
Dr residual decoded signal
Er residual coded signal
Img image signal
MV1, MV2 motion vector
Pre prediction image signal
Rc reconstructed image signal
Rf1, Rf2 reference frame

Claims

A motion vector detection method for detecting a motion vector indicating a displacement from another picture in a block in a picture constituting a moving image,
A first candidate generating step of generating a first motion vector candidate based on a first reference picture of the detection target block;
A second candidate generating step of generating a second motion vector candidate based on a second reference picture of the detection target block;
Based on a first prediction block indicated by a first motion vector candidate in the first reference picture and a second prediction block indicated by a second motion vector candidate in the second reference picture, An interpolation step of creating an interpolation prediction block by interpolating the pixel value of the corresponding pixel;
A calculation step of calculating an evaluation value based on a difference between pixel values of pixels corresponding to each other between the interpolation prediction block and the detection target block,
A selection step of selecting, based on the evaluation value, one from among the plurality of first motion vector candidates generated in the first candidate generation step and one from among the plurality of second motion vector candidates generated in the second candidate generation step;
A detecting step of detecting the selected first motion vector candidate as a first motion vector of the detection target block based on the first reference picture, and detecting the selected second motion vector candidate as a second motion vector of the detection target block based on the second reference picture.

The motion vector detecting method according to claim 1, wherein, in the selecting step, the one first motion vector candidate and the one second motion vector candidate for which the evaluation value is smallest are selected.

The motion vector detecting method according to claim 1, wherein, in the second candidate generation step, the second motion vector candidate is generated by scaling the first motion vector candidate at a rate corresponding to the display time differences of the first and second reference pictures relative to the picture including the detection target block.
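As an illustration only, and not as part of the claim language, the scaling described above can be sketched in Python as follows. The function name, the use of integer display times, the assumption of constant motion between pictures, and the rounding to full-pel precision are all assumptions of this sketch.

```python
from fractions import Fraction

def scale_motion_vector(mvc1, t_target, t_ref1, t_ref2):
    """Scale MVC1 by the ratio of the signed display time differences.

    Each vector is taken as the displacement from the target picture to its
    reference picture, so under constant motion it is proportional to the
    signed display time difference between the two pictures.
    Raises ZeroDivisionError if the first reference has the target's display time.
    """
    ratio = Fraction(t_target - t_ref2, t_target - t_ref1)
    return tuple(round(component * ratio) for component in mvc1)

# Both references precede the target; the second is twice as far away.
same_side = scale_motion_vector((2, -4), t_target=3, t_ref1=2, t_ref2=1)

# The second reference follows the target at the same distance: the vector is mirrored.
opposite_side = scale_motion_vector((2, -4), t_target=3, t_ref1=2, t_ref2=4)
```

Here `same_side` is `(4, -8)` and `opposite_side` is `(-2, 4)`. Because the second candidate is derived from the first, only the first candidate has to be searched in this case.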

The motion vector detecting method according to claim 3, wherein, in the second candidate generation step, a picture whose display time difference from the picture including the detection target block is longer than that of the first reference picture is selected as the second reference picture.

The motion vector detecting method according to claim 3, wherein, in the first candidate generation step, a picture whose display time precedes that of the picture including the detection target block is selected as the first reference picture.

The motion vector detecting method according to claim 1, wherein, in the first candidate generation step, a picture whose display time precedes that of the picture including the detection target block is selected as the first reference picture, and, in the second candidate generation step, a picture whose display time precedes that of the picture including the detection target block is selected as the second reference picture.

The motion vector detecting method according to claim 1, wherein, in the first candidate generation step, a picture whose display time follows that of the picture including the detection target block is selected as the first reference picture, and, in the second candidate generation step, a picture whose display time follows that of the picture including the detection target block is selected as the second reference picture.

The motion vector detection method according to claim 1, wherein, in the second candidate generation step, a picture that lies on the opposite side of the first reference picture in display order (before versus after), relative to the display time of the picture including the detection target block, is selected as the second reference picture.

A motion vector detection method for detecting a motion vector indicating a displacement from another picture in a block in a picture constituting a moving image,
A first candidate generating step of generating a first motion vector candidate based on a first reference picture of the detection target block;
A step of generating a scaling vector based on the second reference picture by scaling the first motion vector candidate at a rate corresponding to the display time differences of the first and second reference pictures relative to the picture including the detection target block;
A first interpolation step of creating a first interpolation prediction block by interpolating the pixel values of mutually corresponding pixels, based on a first prediction block indicated by the first motion vector candidate in the first reference picture and a second prediction block indicated by the scaling vector in the second reference picture;
A first calculation step of calculating an evaluation value based on a difference between pixel values of pixels corresponding to each other between the first interpolation prediction block and the detection target block;
A first detection step of detecting, from among the plurality of first motion vector candidates generated in the first candidate generation step, the first motion vector candidate for which the evaluation value calculated in the first calculation step is smallest, as a first motion vector of the detection target block based on the first reference picture;
A second candidate generating step of generating a second motion vector candidate based on a second reference picture of the detection target block;
Based on a third prediction block indicated by the first motion vector in the first reference picture and a fourth prediction block indicated by a second motion vector candidate in the second reference picture, A second interpolation step of creating a second interpolation prediction block by interpolating pixel values of corresponding pixels;
A second calculation step of calculating an evaluation value based on a difference between pixel values of pixels corresponding to each other between the second interpolation prediction block and the detection target block;
A second detection step of detecting, from among the plurality of second motion vector candidates generated in the second candidate generation step, the second motion vector candidate for which the evaluation value calculated in the second calculation step is smallest, as a second motion vector of the detection target block based on the second reference picture.
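As an illustration only, and not as part of the claim language, the two-stage detection described above can be sketched in Python. The first motion vector is fixed using the scaling vector as its partner, and the second motion vector is then searched with the first held fixed. The function names, the integer `scale` factor, the full-pel search window, the rounded average, and the sum of absolute differences are assumptions of this sketch; both reference pictures are assumed to have the same size.

```python
def get_block(frame, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def interpolate(pb1, pb2):
    """Rounded average of co-located pixels (assumed interpolation)."""
    return [[(a + b + 1) // 2 for a, b in zip(r1, r2)] for r1, r2 in zip(pb1, pb2)]

def sad(a, b):
    """Sum of absolute differences (assumed evaluation value)."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def two_stage_search(target, ref1, ref2, top, left, size, search, scale):
    """Return (MV1, MV2, evaluation value) found by the two-stage procedure."""
    cur = get_block(target, top, left, size)
    h, w = len(ref1), len(ref1[0])

    def inside(mv):
        return 0 <= top + mv[0] <= h - size and 0 <= left + mv[1] <= w - size

    def fetch(ref, mv):
        return get_block(ref, top + mv[0], left + mv[1], size)

    candidates = [(dy, dx)
                  for dy in range(-search, search + 1)
                  for dx in range(-search, search + 1)
                  if inside((dy, dx))]

    # Stage 1: pair each first candidate with its scaling vector and keep the best.
    best1 = None
    for mvc1 in candidates:
        sv = (mvc1[0] * scale, mvc1[1] * scale)
        if not inside(sv):
            continue  # the scaling vector would point outside the second reference
        cost = sad(interpolate(fetch(ref1, mvc1), fetch(ref2, sv)), cur)
        if best1 is None or cost < best1[0]:
            best1 = (cost, mvc1)
    mv1 = best1[1]

    # Stage 2: hold MV1 fixed and search the second candidates directly.
    pb3 = fetch(ref1, mv1)
    best2 = None
    for mvc2 in candidates:
        cost = sad(interpolate(pb3, fetch(ref2, mvc2)), cur)
        if best2 is None or cost < best2[0]:
            best2 = (cost, mvc2)
    return mv1, best2[1], best2[0]
```

Compared with evaluating every pair of candidates, this ordering evaluates each candidate list once, at the cost of fixing the first motion vector before the second is known.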

A motion vector encoding method for encoding a motion vector indicating a displacement from another picture in a block in a picture constituting a moving image,
A motion vector detecting step of detecting the first and second motion vectors by the motion vector detecting method according to claim 1 or 2;
A coding step of coding the first and second motion vectors, respectively.

A motion vector encoding method for encoding a motion vector indicating a displacement from another picture in a block in a picture constituting a moving image,
A reading step of reading the first and second reference pictures from a storage unit having a first area in which a picture is recorded together with information on its display time and a second area in which another picture is recorded;
A motion vector detecting step of detecting the first and second motion vectors by using the first and second reference pictures;
A determining step of determining whether at least one of the first and second reference pictures has been read from the second area;
An encoding step of encoding each of the first and second motion vectors when it is determined in the determining step that at least one of the first and second reference pictures has been read from the second area.

A motion vector detection device for detecting a motion vector indicating a displacement from another picture in a block in a picture constituting a moving image,
First candidate generating means for generating a first motion vector candidate based on a first reference picture of the detection target block;
Second candidate generating means for generating a second motion vector candidate based on a second reference picture of the detection target block;
Based on a first prediction block indicated by a first motion vector candidate in the first reference picture and a second prediction block indicated by a second motion vector candidate in the second reference picture, Interpolation means for creating an interpolation prediction block by interpolating the pixel value of the corresponding pixel,
Calculation means for calculating an evaluation value based on a difference between pixel values of pixels corresponding to each other between the interpolation prediction block and the detection target block,
Selecting means for selecting, from among the plurality of first motion vector candidates generated by the first candidate generating means and the plurality of second motion vector candidates generated by the second candidate generating means, the one first motion vector candidate and the one second motion vector candidate for which the evaluation value is smallest;
Detecting means for detecting the selected first motion vector candidate as a first motion vector of the detection target block based on the first reference picture, and detecting the selected second motion vector candidate as a second motion vector of the detection target block based on the second reference picture.

A motion vector encoding device that encodes a motion vector indicating a displacement from another picture in a block in a picture constituting a moving image,
A motion vector detecting device according to claim 12,
Encoding means for encoding each of the first and second motion vectors detected by the motion vector detecting device.

A moving image encoding device that encodes a picture constituting a moving image,
A motion vector encoding device according to claim 13,
A moving image encoding apparatus comprising: an image encoding unit that encodes blocks corresponding to the first and second motion vectors encoded by the motion vector encoding apparatus.

A program for causing a computer to execute a motion vector detection method of detecting a motion vector indicating a displacement from another picture in a block in a picture constituting a moving image,
A first candidate generating step of generating a first motion vector candidate based on a first reference picture of the detection target block;
A second candidate generating step of generating a second motion vector candidate based on a second reference picture of the detection target block;
Based on a first prediction block indicated by a first motion vector candidate in the first reference picture and a second prediction block indicated by a second motion vector candidate in the second reference picture, An interpolation step of creating an interpolation prediction block by interpolating the pixel value of the corresponding pixel;
A calculation step of calculating an evaluation value based on a difference between pixel values of pixels corresponding to each other between the interpolation prediction block and the detection target block,
A selection step of selecting, from among the plurality of first motion vector candidates generated in the first candidate generation step and the plurality of second motion vector candidates generated in the second candidate generation step, the one first motion vector candidate and the one second motion vector candidate for which the evaluation value is smallest;
A detecting step of detecting the selected first motion vector candidate as a first motion vector of the detection target block based on the first reference picture, and detecting the selected second motion vector candidate as a second motion vector of the detection target block based on the second reference picture.

A program for causing a computer to execute a motion vector encoding method for encoding a motion vector indicating a displacement from another picture in a block in a picture constituting a moving image,
Steps included in the program according to claim 15;
A coding step of coding the first and second motion vectors, respectively.

A storage medium for storing the program according to claim 16.