JP2004048711A

JP2004048711A - Method for coding and decoding moving picture and data recording medium

Info

Publication number: JP2004048711A
Application number: JP2003136452A
Authority: JP
Inventors: Seishi Abe; 安倍　清史; Shinya Sumino; 角野　眞也; Toshiyuki Kondo; 近藤　敏志; Makoto Hagai; 羽飼　誠
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2002-05-22
Filing date: 2003-05-14
Publication date: 2004-02-12

Abstract

PROBLEM TO BE SOLVED: To make B-pictures, and the direct mode in particular, usable even when pictures that follow the current picture in display order cannot be referenced, while maintaining high coding efficiency. SOLUTION: When pictures that follow in display order cannot be referenced, predictive coding in direct mode refers to the motion vectors of already-coded blocks in the same picture. The direct mode is thus realized without referring to any picture that follows in display order. Furthermore, the entries that refer to backward pictures are deleted from the coding-mode table, which reduces the number of entries in the table and achieves high coding efficiency even with forward-only motion compensation. COPYRIGHT: (C)2004,JPO

Description

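[0001]
[Technical Field of the Invention]
The present invention relates to a method for coding and a method for decoding moving pictures, and in particular to a predictive coding method and a predictive decoding method for B-pictures, which are predictively coded with reference to a plurality of already-coded pictures located temporally forward (earlier in display order) or backward (later in display order).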
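[0002]
[Prior Art]
In moving picture coding, the amount of information is generally compressed by reducing redundancy in the temporal and spatial directions. In inter-picture predictive coding, which aims to reduce temporal redundancy, motion estimation and motion compensation are performed block by block with reference to a forward or backward picture, and the difference between the resulting predicted image and the current picture is coded.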
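[0003]
H.26L, a moving picture coding method currently being standardized, proposes three picture types: pictures that use only intra-picture predictive coding (I-pictures); pictures that use inter-picture predictive coding with reference to one temporally forward picture (P-pictures); and pictures that use inter-picture predictive coding with reference to two temporally forward pictures, two temporally backward pictures, or one forward and one backward picture (B-pictures). In MPEG (Moving Picture Experts Group)-1, MPEG-2 and MPEG-4, the coding methods that preceded H.26L, a B-picture could reference only one picture in each direction; a major feature of H.26L is that it can reference two.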
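[0004]
FIG. 16 shows an example of the reference relationships between pictures and the pictures they reference in a conventional moving picture coding method. In the figure, pictures I1 to B20 are displayed in this order. FIG. 17(a) shows the pictures around picture B18 of FIG. 16, extracted in display order. FIG. 17(b) shows the coding order of the pictures around picture B18 when picture B18 is coded with the reference relationships shown in FIG. 17(a).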
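[0005]
Picture I1 has no reference picture and is intra-picture coded, and picture P10 is inter-picture coded with reference to the temporally forward picture P7. Picture B6 references two temporally forward pictures (I1 and P4), picture B12 references two temporally backward pictures (P13 and P16), and picture B18 references one forward and one backward picture (P16 and P19). Because coding that uses B-pictures references temporally backward pictures in this way, pictures cannot be coded in display order. That is, for a B-picture such as B18 in FIG. 17(a), picture P19, which it references, must be coded first. The pictures from P16 to P19 therefore have to be reordered as in FIG. 17(b) before coding.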
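The reordering constraint described above can be illustrated with a small sketch. It is not taken from the source: the function name `coding_order`, the presence of a picture B17, and the reference lists of B17 and P19 are assumptions made only for illustration, and the order it prints is not claimed to match FIG. 17(b).

```python
def coding_order(display_order, refs):
    """Return an order in which every picture follows all pictures it references."""
    coded = []
    pending = list(display_order)
    while pending:
        # Take the earliest picture in display order whose references are all coded.
        ready = next((p for p in pending if all(r in coded for r in refs[p])), None)
        if ready is None:
            raise ValueError("reference relationships cannot be satisfied")
        coded.append(ready)
        pending.remove(ready)
    return coded


# Hypothetical window around B18; references outside the window are omitted.
display = ["P16", "B17", "B18", "P19"]
refs = {"P16": [], "B17": ["P16", "P19"], "B18": ["P16", "P19"], "P19": ["P16"]}
print(coding_order(display, refs))
```

When every B-picture references only forward pictures, the same function returns the display order unchanged, which is the situation the invention targets.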
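[0006]
Skip mode is one of the prediction modes of P-pictures, which are inter-picture coded with reference to one temporally forward picture. In skip mode, the current block carries no motion vector information of its own. The motion vector used for its motion compensation is determined by referring to the motion vectors of coded blocks located around it, and motion compensation is performed by generating a predicted image from the P-picture immediately preceding the picture to which the current block belongs.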
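[0007]
FIG. 18 shows the positional relationship between the current block and the coded blocks whose motion vectors are referenced, for the case in which the motion vectors of coded blocks around the current block in the same picture are referenced. FIG. 18(a) is an example in which the current block BL51 is 16×16 pixels, and FIG. 18(b) is an example in which the block BL52 to be coded is 8×8 pixels. The figure shows this positional relationship for the skip mode of a P-picture. Block BL51 is a 16×16-pixel block coded in skip mode, and basically the motion vectors of the three coded blocks at positions A, B and C (hereinafter block A, block B and block C) are referenced. However, if either of the following conditions applies, no motion vectors are referenced; the motion vector of the current block is set to "0", the immediately preceding P-picture is referenced, and motion compensation is performed in direct mode.
[0008]
1. Block A or block B lies outside the picture or outside the slice to which the current block belongs.
2. Block A or block B has a motion vector of value "0" that references the immediately preceding picture.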
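[0009]
From the motion vectors of the three referenced blocks A, B and C, only those that reference the immediately preceding P-picture are extracted, and their median is taken as the motion vector actually used in direct mode. If block C cannot be referenced, the motion vector of block D is used instead.
[0010]
FIG. 19 shows an example of the motion vectors referenced in the skip mode of a P-picture and the coded pictures that those motion vectors reference. Block BL51, belonging to picture P64, is the block currently being coded. In this example the only motion vector that references the immediately preceding picture is MVA1, so the motion vector MV1 used in direct mode takes the value of MVA1 as it is. Because this referencing method makes it unnecessary to code a motion vector, the bit amount of the output bitstream can be reduced. Moreover, because the motion vector is determined by referring to surrounding blocks, the method is particularly effective when the subject moves in a fixed direction, for example because of camera panning.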
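As a rough illustration of the conventional rule described above, the following sketch derives the skip-mode motion vector from neighbours A, B and C (or D). Several points are assumptions rather than statements from the source: the name `skip_mode_mv`, the representation of each neighbour as one `(mv_x, mv_y, ref_picture)` tuple, the component-wise median, the zero-vector fallback when no neighbour references the previous picture, and all numeric vector values.

```python
from statistics import median

ZERO_MV = (0, 0)


def skip_mode_mv(a, b, c, d, prev_picture):
    """Each neighbour is None (not referenceable) or (mv_x, mv_y, ref_picture)."""
    # Condition 1: A or B lies outside the picture or the slice.
    if a is None or b is None:
        return ZERO_MV
    # Condition 2: A or B has a zero vector that references the previous picture.
    if any(blk == (0, 0, prev_picture) for blk in (a, b)):
        return ZERO_MV
    third = c if c is not None else d  # use D when C cannot be referenced
    picked = [blk for blk in (a, b, third)
              if blk is not None and blk[2] == prev_picture]
    if not picked:
        return ZERO_MV  # assumption: the source does not cover this case
    # Component-wise median of the vectors that reference the previous picture.
    return (median(blk[0] for blk in picked), median(blk[1] for blk in picked))


# Situation like FIG. 19: only block A references the previous picture (63).
mv1 = skip_mode_mv((3, -1, 63), (2, 2, 62), (5, 0, 61), None, prev_picture=63)
print(mv1)
```

With only one qualifying vector, the median is that vector itself, which matches the statement that MV1 takes the value of MVA1 unchanged.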
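[0011]
Direct mode is one of the prediction modes of B-pictures, which are inter-picture coded with reference to two temporally forward pictures, two temporally backward pictures, or one forward and one backward picture. In direct mode, the current block carries no motion vector of its own. The two motion vectors actually used for its motion compensation are calculated by referring to the motion vector of the co-located block in the coded picture immediately following the current picture in time, and a predicted image is created from them.
[0012]
FIG. 20 illustrates how motion vectors are determined in direct mode. Picture B73 is the B-picture currently being coded, and bidirectional prediction in direct mode is performed with pictures P72 and P74 as reference pictures. If the block to be coded is block BL71, the two required motion vectors are determined from motion vector MV71 of block BL72, the co-located block in picture P74, which is the coded backward reference picture. The two motion vectors MV72 and MV73 used in direct mode are calculated either by scaling MV71 using the picture intervals TR72 and TR73 or by multiplying MV71 by predetermined coefficients. The predicted image needed to code block BL71 is generated by averaging the pixel values of the two reference images designated by these two motion vectors. Because no motion vector needs to be coded for a block coded in direct mode, the bit amount of the output bitstream can be reduced.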
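The source does not spell out the scaling formula, so the following is only a sketch of one common form of temporal direct scaling. The parameter meanings are assumptions: `tr_cur` is taken to be the display distance from the forward reference to the current picture, `tr_col` the distance from the forward reference to the backward reference, and the co-located vector is assumed to point from the backward reference to the forward reference. How these correspond to TR72 and TR73 in FIG. 20 is not stated in the text.

```python
from fractions import Fraction


def temporal_direct_mvs(mv_col, tr_cur, tr_col):
    """Scale the co-located block's vector into a forward and a backward vector."""
    scale = Fraction(tr_cur, tr_col)
    # Forward vector: the co-located vector shortened in proportion to the distance.
    mv_fwd = tuple(round(c * scale) for c in mv_col)
    # Backward vector: the remainder, pointing toward the backward reference.
    mv_bwd = tuple(f - c for f, c in zip(mv_fwd, mv_col))
    return mv_fwd, mv_bwd


# Current picture midway between its two references (hypothetical values).
print(temporal_direct_mvs((8, -4), tr_cur=1, tr_col=2))
```

The two resulting vectors designate one region in each reference picture, and the predicted block is the average of those two regions, as described above.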
[0013]
[Non-Patent Document 1]
Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG -- Joint Committee Draft (2002-5-10), p. 99, 11 B pictures
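[0014]
[Problems to Be Solved by the Invention]
However, in moving picture coding that uses the direct mode of B-pictures, coding is performed with reference to a temporally backward picture, so any picture that may be referenced has to be coded before the current picture. In an environment where a temporally backward picture cannot be coded and decoded first, coding using the direct mode of B-pictures was therefore impossible.
[0015]
The present invention solves this problem. Its first object is to propose a method that allows B-pictures, and the direct mode in particular, to be used consistently even in an environment where temporally backward pictures have not been coded or decoded before the current picture. Its second object is to propose highly efficient moving picture coding and decoding methods using B-pictures, by proposing an efficient way of referring to the table that associates coding modes with their identification numbers.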
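[0016]
[Means for Solving the Problems]
To achieve the above objects, the moving picture coding method of the present invention is a method for coding a moving picture to generate a bitstream. It includes a coding step that allows the use of a direct mode, in which motion compensation of a current block is performed with reference to motion vectors of coded blocks, in the coding of a B-picture that is predictively coded with reference to a plurality of coded pictures located temporally forward or backward. The coding step includes a motion compensation step in which, when the B-picture is predictively coded with reference only to coded pictures located in one direction in display order from the picture to which the current block belongs, motion compensation is performed as the direct mode with reference to motion vectors of coded blocks located around the current block in the same picture.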
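[0017]
Another moving picture coding method of the present invention is a method for coding a moving picture to generate a bitstream. It includes a coding step that allows the use of a direct mode, in which motion compensation of a current block is performed with reference to motion vectors of coded blocks, in the coding of a B-picture that is predictively coded with reference to a plurality of coded pictures located temporally forward or backward. In the coding step, when the B-picture is predictively coded with reference only to coded pictures located in one direction in display order from the picture to which the current block belongs, motion compensation is performed as the direct mode by setting the motion vector of the current block to "0" and referencing one or more pictures in order of temporal proximity.
[0018]
Further, in the moving picture coding method of the present invention, the coding step may include a table regeneration step of regenerating a table that associates the predictive coding methods of the B-picture with identifiers for identifying them, by removing from the table the predictive coding methods that reference backward pictures. The coding step may then code the identifier indicating the predictive coding method of the B-picture using the regenerated table.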
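[0019]
To achieve the above objects, the moving picture decoding method of the present invention is a method for decoding a bitstream obtained by coding a moving picture. It includes a decoding step that allows the use of a direct mode, in which motion compensation of a current block is performed with reference to motion vectors of decoded blocks, in the decoding of a B-picture that is predictively decoded with reference to a plurality of decoded pictures located temporally forward or backward. The decoding step includes a motion compensation step in which, when the B-picture is predictively decoded with reference only to decoded pictures located in one temporal direction from the picture to which the current block belongs, motion compensation is performed as the direct mode with reference to motion vectors of decoded blocks located around the current block in the same picture.
[0020]
Another moving picture decoding method of the present invention is a method for decoding a bitstream obtained by coding a moving picture. It includes a decoding step that allows the use of a direct mode, in which motion compensation of a current block is performed with reference to motion vectors of decoded blocks, in the decoding of a B-picture that is predictively decoded with reference to a plurality of decoded pictures located temporally forward or backward. In the decoding step, when the B-picture is predictively decoded with reference only to decoded pictures located in one temporal direction from the picture to which the current block belongs, motion compensation is performed as the direct mode by setting the motion vector of the current block to "0" and referencing one or more pictures in order of temporal proximity.
[0021]
Further, in the moving picture decoding method of the present invention, the decoding step may include a table regeneration step of regenerating a table held in advance, which associates the predictive decoding methods of the B-picture with identifiers for identifying them, by removing from the table the predictive decoding methods that reference backward pictures. In the decoding step, the identifier for identifying the predictive decoding method of the B-picture may be decoded from the bitstream, the predictive decoding method of the B-picture may be identified using the regenerated table, and the current block may be predictively decoded according to the identified method.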
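[0022]
[Embodiments of the Invention]
Embodiments of the present invention are described in detail below with reference to the drawings.
(Embodiment 1)
FIG. 1 is a block diagram showing the configuration of a moving picture coding apparatus 100 that executes the moving picture coding method of Embodiment 1. When coding a B-picture in direct mode with reference only to pictures that precede the current picture in display order, the moving picture coding apparatus 100 determines the motion vector of the current block by referring to the motion vectors of coded blocks around the current block in the same picture. It comprises a frame memory 101, a prediction residual coding unit 102, a bitstream generation unit 103, a prediction residual decoding unit 104, a frame memory 105, a motion vector estimation unit 106, a mode selection unit 107, a motion vector storage unit 108, a backward picture determination unit 109, a difference calculation unit 110, an addition unit 111, a switch 112 and a switch 113.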
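[0023]
Frame memory 101, frame memory 105 and motion vector storage unit 108 are memories implemented with RAM or the like. Frame memory 101 provides a storage area for rearranging the pictures of the moving picture, which are input in display order, into coding order.
[0024]
The prediction residual coding unit 102 applies a frequency transform such as a DCT to the prediction residual obtained by the difference calculation unit 110, quantizes the result and outputs it. The bitstream generation unit 103 variable-length codes the coding result from the prediction residual coding unit 102, converts it into the format of the output bitstream, and generates the bitstream by attaching additional information such as headers describing information related to the predictive coding method. The prediction residual decoding unit 104 variable-length decodes the coding result from the prediction residual coding unit 102, inverse-quantizes it, and then applies an inverse frequency transform such as an IDCT to generate a decoded prediction residual.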
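[0025]
Frame memory 105 provides a storage area for holding predicted pictures picture by picture. The motion vector estimation unit 106 detects a motion vector for each predetermined unit, such as a macroblock or a block obtained by further dividing a macroblock. The mode selection unit 107 selects the optimum prediction mode while referring to the motion vectors used in coded pictures, which are stored in the motion vector storage unit 108. It reads from frame memory 105 each block in the predicted picture indicated by the motion vector detected by the motion vector estimation unit 106 and outputs it to the difference calculation unit 110.
[0026]
The motion vector storage unit 108 provides a storage area for holding the motion vectors detected for each block of coded pictures. The backward picture determination unit 109 determines whether a picture that follows the current picture in display order has already been coded. The difference calculation unit 110 outputs the difference between the macroblock to be coded and the macroblock of the predicted image determined by the motion vector.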
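[0027]
The addition unit 111 adds the decoded prediction residual output from the prediction residual decoding unit 104 and the block of the predicted picture output from the mode selection unit 107, and stores the sum (a block constituting a predicted picture) in frame memory 105. Switch 112 is switched according to the picture type of the current picture. For an I-picture, which is intra-picture coded, it connects the read line of frame memory 101 to the prediction residual coding unit 102. Each macroblock of the current picture read from frame memory 101 is thereby input directly to the prediction residual coding unit 102.
[0028]
For P-pictures and B-pictures, which are inter-picture coded, switch 112 connects the output of the difference calculation unit 110 to the prediction residual coding unit 102, so that the result of the difference calculation unit 110 is input to the prediction residual coding unit 102. Switch 113 is switched between connected and disconnected according to the picture type of the current picture. For an I-picture, the output of the mode selection unit 107 is disconnected from the input of the addition unit 111; for P-pictures and B-pictures, the two are connected. For an I-picture, the decoded prediction residual decoded by the prediction residual decoding unit 104 is thereby output to frame memory 105.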
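[0029]
The moving picture coding method of Embodiment 1 is described below with reference to the block diagram of FIG. 1.
The moving picture to be coded is input to frame memory 101 picture by picture in time order. Each picture is divided into blocks called macroblocks, for example 16 horizontal × 16 vertical pixels, and the subsequent processing is performed block by block.
[0030]
A macroblock read from frame memory 101 is input to the motion vector estimation unit 106. The motion vector estimation unit 106 detects the motion vector of the current macroblock using the images stored in frame memory 105 (images obtained by decoding coded pictures) as reference pictures. In prediction modes other than direct mode, the motion vector estimation unit 106 detects a motion vector for each macroblock or for each region into which a macroblock is divided (for example, sub-blocks of 16×8, 8×16 or 8×8 pixels).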
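[0031]
For the current macroblock, the motion vector estimation unit 106 uses an already-coded picture as the reference picture and detects the motion vector indicating the position of the block, within the search area of that picture, that is predicted to be closest to the current block in its pixel values. The mode selection unit 107 selects the optimum prediction mode while referring to the motion vectors used in coded pictures, which are stored in the motion vector storage unit 108. At this point the backward picture determination unit 109 determines whether a picture that follows in display order has already been coded. If it determines that no backward picture has been coded, the mode selection unit 107 selects, for the coding of the B-picture, a prediction mode that does not reference pictures that follow in display order.
[0032]
According to the prediction mode selected by the mode selection unit 107, the optimum motion vector is determined from among the motion vectors detected by the motion vector estimation unit 106, and the prediction block referenced by the determined motion vector is read from frame memory 105 and input to the difference calculation unit 110. The difference calculation unit 110 generates a prediction residual image by taking the difference between the prediction block and the macroblock to be coded. The prediction residual image is input to the prediction residual coding unit 102, where it is frequency-transformed and quantized. The flow above is the operation when inter-picture predictive coding is selected; switch 112 switches between it and intra-picture predictive coding. Finally, the bitstream generation unit 103 variable-length codes control information such as motion vectors and the image information output from the prediction residual coding unit 102, and generates the bitstream that is finally output.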
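[0033]
The above is an overview of the coding flow. The processing in direct mode in the mode selection unit 107 is described in detail below, for the case in which the backward picture determination unit 109 has determined that no backward picture has been coded. FIG. 2 shows an example of the reference relationships between pictures when pictures that follow the picture containing the current block in display order cannot be referenced. As shown, every B-picture in the display-order sequence is predictively coded with reference to one or more coded pictures that precede it in display order. For example, for pictures B82 and B83, both B-pictures, the only coded picture that precedes them in display order is P81, so each performs motion compensation with reference to P81 only.
[0034]
As for pictures B85 and B86, both B-pictures, B85 references the two coded pictures that precede it in display order (P81 and P84), whereas B86, for example, does not reference P81, which is temporally farther in display order, and performs motion compensation with reference only to P84, which is temporally closer. In such cases, all motion vectors of each B-picture reference coded pictures that precede the current picture in display order.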
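[0035]
In this embodiment, in an environment where pictures that follow in display order have not been coded, direct mode is realized as follows when the mode selection unit 107 selects it for predictive coding of a B-picture. Conventionally, the motion vector of the current block is generated with reference to the motion vector of a coded block in the picture immediately following the current picture in display order (hereinafter "temporal prediction"). Here, it is instead generated with reference to the motion vectors of coded blocks located around the current block in the same picture (hereinafter "spatial prediction").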
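[0036]
FIG. 3 is a flowchart showing an example of the operation of the mode selection unit 107 when direct mode is selected. When direct mode is selected, the mode selection unit 107 first has the backward picture determination unit 109 determine whether a picture that follows the current picture in display order has been coded (S501). If such a picture has been coded, the current block is predictively coded using temporal prediction, as is done conventionally (S502). The mode selection unit 107 then ends processing of the current block and moves on to the next block.
[0037]
If the determination in step S501 shows that no picture that follows in display order has been coded, the current block is predictively coded using the spatial prediction described above (S503). The mode selection unit 107 further sets the flag spatial_flag, which indicates that spatial prediction was performed, to "1" and outputs it to the bitstream generation unit 103 (S504). It then ends processing of the current block and moves on to the next block.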
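The branch in FIG. 3 can be summarized in a few lines. This is a minimal sketch under assumptions: the function name and the returned dictionary are hypothetical, and the value 0 for `spatial_flag` on the temporal branch is inferred from the decoder description in Embodiment 2 rather than stated for step S502.

```python
def code_direct_block(backward_picture_coded):
    """Follow steps S501 to S504 for one block and report the prediction used."""
    if backward_picture_coded:  # S501: a later picture is already coded
        # S502: conventional temporal prediction (flag value 0 is an assumption)
        return {"prediction": "temporal", "spatial_flag": 0}
    # S503: spatial prediction from neighbouring blocks; S504: set the flag to 1
    return {"prediction": "spatial", "spatial_flag": 1}


print(code_direct_block(backward_picture_coded=False))
```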
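[0038]
The specific method of spatial prediction performed in step S503 of FIG. 3 is described below.
The skip mode example described with FIG. 19 concerned the case in which each referenced coded block has one motion vector. However, the prediction modes of B-pictures include, as shown in FIG. 2, modes that perform motion compensation with simultaneous reference to two pictures that precede in display order. In such a mode, one block has two motion vectors. FIG. 4 shows an example of the motion vector reference relationships when the coded blocks whose motion vectors are referenced include blocks with two motion vectors. Picture P94 is the picture currently being coded, and block BL51 is the block to be predictively coded in direct mode.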
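[0039]
In the first method, in both of the cases shown in FIG. 18(a) and FIG. 18(b), the mode selection unit 107 basically refers to the motion vectors of the blocks at positions A, B and C relative to block BL51 (or block BL52), the block to be predictively coded in direct mode. The references are changed according to the following conditions.
[0040]
1. If block C cannot be referenced, the blocks at positions A, B and D are referenced.
2. If any of the three blocks at positions A, B, C (or A, B, D) has no referenceable motion vector, that block is excluded from motion vector referencing.
[0041]
For the motion vectors of the three referenced blocks A, B, C (or A, B, D), the mode selection unit 107 compares how far, in display order, the picture referenced by each motion vector is from the current picture. It extracts the motion vectors that reference the picture closest to the current picture in display order. If more than one motion vector is extracted, their median or mean is taken; for example, the median may be taken when an odd number is extracted and the mean when an even number is extracted. The resulting motion vector is used as the motion vector of the current block when direct mode is selected and motion compensation references only pictures that precede the current picture in display order. If none of the blocks A, B, C (or A, B, D) can be referenced, the motion vector of the current block is set to 0, the immediately preceding picture is used as the reference picture, and predictive coding is performed in direct mode.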
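The first method can be sketched as follows. This is an illustration under assumptions, not the patented implementation: the name `spatial_direct_mv`, the representation of a neighbour as a list of `(mv_x, mv_y, ref_picture)` tuples, the use of integer display-order numbers for pictures, the component-wise median and mean, and `cur_picture - 1` as the immediately preceding picture are all choices made for this example. The numeric vector values are invented.

```python
from statistics import median


def spatial_direct_mv(a, b, c, d, cur_picture):
    """Each neighbour is None (not referenceable) or a list of (mv_x, mv_y, ref)."""
    third = c if c is not None else d               # condition 1: D replaces C
    blocks = [blk for blk in (a, b, third) if blk]  # condition 2: drop unusable blocks
    candidates = [mv for blk in blocks for mv in blk]
    if not candidates:
        # No neighbour can be referenced: zero vector, previous picture.
        return (0, 0), cur_picture - 1
    # Reference picture closest to the current picture in display order.
    nearest = min((ref for _, _, ref in candidates),
                  key=lambda ref: abs(cur_picture - ref))
    picked = [(x, y) for x, y, ref in candidates if ref == nearest]
    xs = [x for x, _ in picked]
    ys = [y for _, y in picked]
    if len(picked) % 2 == 1:
        return (median(xs), median(ys)), nearest    # odd count: median
    return (sum(xs) / len(xs), sum(ys) / len(ys)), nearest  # even count: mean


# Layout like FIG. 4: A has one vector, B and C have two each.
a = [(4, 2, 93)]
b = [(6, 0, 93), (1, 1, 91)]
c = [(5, 3, 93), (0, 4, 92)]
print(spatial_direct_mv(a, b, c, None, cur_picture=94))
```

Only the three vectors that reference picture 93 take part in the median; the vectors that reference pictures 91 and 92 are ignored.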
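[0042]
FIG. 5 is a flowchart showing an example of the procedure by which the mode selection unit 107 of FIG. 1 performs spatial prediction of the current block using the first method. The current block BL51 of FIG. 4 is used as an example. First, the mode selection unit 107 checks whether the block at position C relative to the current block BL51 can be referenced (S601). In FIG. 4, the block at position C has motion vector MVC1, which references picture P93, and motion vector MVC2, which references picture P92. The mode selection unit 107 therefore refers to the motion vectors of the blocks at positions A, B and C (S602).
[0043]
The block at position A has motion vector MVA1, which references picture P93, and the block at position B has motion vector MVB1, which references picture P93, and motion vector MVB3, which references picture P91. If, in step S601, the block at position C lies outside the current picture P94 or outside the slice to which the current block BL51 belongs, or has no motion vector because it was coded by intra-picture prediction or the like, the motion vector of the block at position D shown in FIG. 18(a) and FIG. 18(b) is referenced in place of the block at position C (S603). That is, the three blocks at positions A, B and D are referenced.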
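[0044]
Next, if any of the three referenced blocks (A, B, C or A, B, D) lies outside the current picture P94 or outside the slice to which the current block BL51 belongs, or has no motion vector because it was coded by intra-picture prediction or the like, the mode selection unit 107 excludes that block from the reference candidates and calculates the motion vector of the current block (S604).
[0045]
If none of the three blocks (A, B, C or A, B, D) can be referenced, the motion vector of the current block is set to "0" and the picture immediately preceding the current picture is referenced. When the mode selection unit 107 extracts, from the referenced motion vectors, only those that reference the picture closest to the current picture in display order, it obtains motion vectors MVA1, MVB1 and MVC1, which reference picture P93. It then takes their median or mean; here, since three motion vectors were obtained, it takes the median. This determines the single motion vector MV1 used for motion compensation of block BL51.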
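[0046]
FIG. 6 shows an example of the per-slice data structure of the bitstream generated by the bitstream generation unit 103 of FIG. 1. The bitstream of each picture consists of a plurality of slice data, and each slice data consists of a plurality of macroblock data. As shown in the figure, a slice header is attached to each slice data in the bitstream, and information about the slice is written in the slice header. This information includes, for example, the number of the frame to which the slice belongs and the flag spatial_flag, which indicates the type of coding method used for the direct mode described above.
[0047]
As described above, this embodiment proposes a method that realizes direct mode without referring to pictures that follow in display order, even in an environment where such pictures cannot be referenced, and presents a coding method that achieves high coding efficiency.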
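[0048]
In the first method above, the motion vectors that reference the picture closest to the current picture in display order were extracted from the referenced motion vectors. Alternatively, only those that reference the picture immediately preceding the current picture may be extracted. In the example of FIG. 4, the picture closest to the current picture in display order among the pictures referenced by the referenced motion vectors is the picture immediately preceding the current picture, so the resulting motion vector is the same. If no motion vector references the picture closest in display order, the motion vector of the current block is set to "0" and coding is performed in direct mode.
[0049]
In the first method above, when determining the motion vector used in direct mode, only the motion vectors that reference the picture nearest before the current picture in display order, among the pictures referenced by the surrounding coded blocks, were extracted, and a single motion vector was finally calculated. Instead, as a second method, it is also possible to extract the motion vectors that reference the N pictures immediately preceding the current picture in display order, determine one motion vector for each referenced picture, and perform forward-only motion compensation using the resulting N motion vectors as the motion vectors for predictive coding in direct mode. In this case the predicted image is generated by averaging the pixel values of the N regions designated by the N motion vectors.
[0050]
The predicted image can also be generated by a weighted average of the pixel values of the regions rather than a simple average. This makes more accurate motion compensation possible for image sequences whose pixel values change gradually in display order.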
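The simple and weighted averaging of the N designated regions can be sketched as below. The function name `predict_from_regions`, the nested-list pixel representation, and the particular weights are assumptions for illustration; the source does not say how weights are chosen.

```python
def predict_from_regions(regions, weights=None):
    """Blend N equally sized regions (2-D lists of pixel values) into one block."""
    n = len(regions)
    if weights is None:
        weights = [1.0 / n] * n  # simple average
    rows, cols = len(regions[0]), len(regions[0][0])
    return [
        [sum(w * reg[r][c] for w, reg in zip(weights, regions)) for c in range(cols)]
        for r in range(rows)
    ]


# Two hypothetical 1x2 regions taken from the two nearest preceding pictures.
near = [[100, 104]]
far = [[110, 108]]
print(predict_from_regions([near, far]))                # simple average
print(predict_from_regions([near, far], [0.75, 0.25]))  # favour the nearer picture
```

Giving the nearer picture a larger weight is one plausible way to follow a gradual brightness change such as a fade, which is the kind of sequence the paragraph above mentions.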
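[0051]
FIG. 7 shows an example of the motion vector referencing method when motion vectors that reference the two pictures immediately preceding the current picture in display order are extracted and two motion vectors are calculated. Picture P104 is the picture currently being coded, and BL51 is the block to be predictively coded in direct mode. Motion vector MV1 is determined by taking the median or mean of motion vectors MVA1, MVB1 and MVC1, which reference picture P103, the nearest preceding picture in display order among the pictures referenced by the candidate motion vectors. Motion vector MV2 is determined by taking the median or mean of the motion vectors that reference picture P102, two pictures before in display order, which here is MVC2 itself. Coding in direct mode is performed using these two motion vectors.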
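A sketch of the second method with N = 2 follows. The names, the flat candidate list of `(mv_x, mv_y, ref_picture)` tuples, the integer picture numbers, the component-wise median and mean, and the numeric values are assumptions; the extra vector that references picture 101 is a hypothetical addition included only to show that it is ignored. The source also does not say what happens when fewer than N pictures are referenced; this sketch simply returns fewer vectors.

```python
from statistics import median


def direct_mvs_for_n_refs(candidates, cur_picture, n):
    """Derive one vector per reference picture for the n nearest preceding pictures."""
    by_ref = {}
    for x, y, ref in candidates:
        by_ref.setdefault(ref, []).append((x, y))
    # Nearest preceding pictures first (all references are assumed to precede).
    nearest_refs = sorted(by_ref, key=lambda ref: cur_picture - ref)[:n]
    result = {}
    for ref in nearest_refs:
        xs = [x for x, _ in by_ref[ref]]
        ys = [y for _, y in by_ref[ref]]
        if len(xs) % 2 == 1:
            result[ref] = (median(xs), median(ys))
        else:
            result[ref] = (sum(xs) / len(xs), sum(ys) / len(ys))
    return result


# Like FIG. 7: three vectors reference picture 103, one references picture 102.
candidates = [(4, 2, 103), (6, 0, 103), (5, 3, 103), (0, 4, 102), (1, 1, 101)]
print(direct_mvs_for_n_refs(candidates, cur_picture=104, n=2))
```

The entry for picture 102 is the single vector that references it, which corresponds to MV2 being MVC2 itself.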
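[0052]
Instead of using only those motion vectors of the blocks referenced in FIG. 18(a) and FIG. 18(b) that reference the one or N nearest preceding pictures in display order, it is also possible to extract only the motion vectors that reference a designated picture, determine from them the motion vector of the current block used in direct mode, and perform motion compensation from the designated picture.
[0053]
When coding in direct mode, instead of performing motion compensation with reference to coded blocks in the positional relationships of FIG. 18(a) and FIG. 18(b), it is also possible to perform motion compensation in direct mode with the motion vector of the current block set to "0" and the immediately preceding picture as the reference picture. This eliminates the step of calculating the motion vector used for direct mode and thus simplifies the coding process.
[0054]
In this case, instead of spatial_flag, which indicates whether temporal or spatial prediction is performed in direct mode, a flag indicating that motion compensation is performed with the motion vector of the current block set to "0" without referring to coded blocks may be described in the slice header.
[0055]
In the method above, the motion vectors that reference the picture closest to the current picture in display order, among the pictures referenced by the motion vectors obtained from the three blocks, are extracted, but the present invention is not limited to this. For example, the motion vectors that reference the picture closest to the current picture in coding order may be extracted.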
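[0056]
(Embodiment 2)
The moving picture decoding method of Embodiment 2 is described with reference to the block diagram of FIG. 8. This decoding method decodes the bitstream generated by the moving picture coding method of Embodiment 1.
[0057]
FIG. 8 is a block diagram showing the configuration of the moving picture decoding apparatus 200 of this embodiment. For a current block coded in direct mode, the moving picture decoding apparatus 200 decodes using spatial prediction when the flag indicating the direct mode decoding method is "1". It comprises a bitstream analysis unit 201, a prediction residual decoding unit 202, a frame memory 203, a motion compensation decoding unit 204, a motion vector storage unit 205, a backward picture determination unit 206, an addition unit 207 and a switch 208.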
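[0058]
The bitstream analysis unit 201 analyzes the input bitstream and extracts from it prediction residual coded data, motion vector information, the prediction mode and other information. It outputs the extracted motion vector information, prediction mode and the like to the motion compensation decoding unit 204 and the prediction residual coded data to the prediction residual decoding unit 202. The prediction residual decoding unit 202 applies variable-length decoding, inverse quantization, an inverse frequency transform and the like to the extracted prediction residual coded data and generates a prediction residual image.
[0059]
Frame memory 203 stores decoded images picture by picture and outputs the stored pictures in display order as output images to an external monitor or the like. The motion compensation decoding unit 204 decodes the prediction mode and the motion vectors used in that mode, and generates a predicted image for the current block based on the input motion vector information, using the decoded images stored in frame memory 203 as reference pictures. When decoding motion vectors, it uses the decoded motion vectors stored in the motion vector storage unit 205.
[0060]
The motion vector storage unit 205 stores the motion vectors decoded by the motion compensation decoding unit 204. The backward picture determination unit 206 determines, when the motion compensation decoding unit 204 generates a predicted image, whether a picture that follows the current picture in display order has been decoded. The backward picture determination unit 206 is used in Embodiment 4 and is not needed in this embodiment. The addition unit 207 adds the prediction residual image decoded by the prediction residual decoding unit 202 and the predicted image generated by the motion compensation decoding unit 204 to generate the decoded image of the current block.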
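[0061]
First, the bitstream analysis unit 201 extracts various information such as motion vector information and prediction residual coded data from the input bitstream. The extracted motion vector information is output to the motion compensation decoding unit 204, and the prediction residual coded data to the prediction residual decoding unit 202. The motion compensation decoding unit 204 generates a predicted image based on the decoded motion vectors, using the decoded images of decoded pictures stored in frame memory 203 as reference pictures.
[0062]
The predicted image generated in this way is input to the addition unit 207 and added to the prediction residual image generated by the prediction residual decoding unit 202 to generate a decoded image. When the prediction direction is not restricted, the decoded pictures are rearranged into display order in frame memory 203. When pictures that follow in display order cannot be referenced, the pictures can be displayed in the order in which they are decoded, without rearrangement. The above is the operation for a bitstream coded by inter-picture prediction; switch 208 switches between it and the decoding process for a bitstream coded by intra-picture prediction.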
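[0063]
The above is an overview of the decoding flow. The processing in the motion compensation decoding unit 204 is described in detail below.
FIG. 9 is a flowchart showing the procedure of decoding in direct mode in the motion compensation decoding unit 204 of FIG. 8.
[0064]
The prediction mode and motion vector information are attached to each macroblock or to each block into which a macroblock is divided. This information is described in the slice data area of the bitstream in the order of the macroblocks in the slice. When the prediction mode Mode indicates direct mode, the motion compensation decoding unit 204 checks whether the flag spatial_flag decoded from the slice header is set to "0" or "1" (S901). When no backward picture has been decoded, spatial_flag is set to "1", which instructs the decoder to decode using spatial prediction.
[0065]
If spatial_flag is set to "1", the motion compensation decoding unit 204 creates the predicted image of the current block using spatial prediction in direct mode (S902); if it is set to "0", the unit creates the predicted image using temporal prediction in direct mode (S903). If the prediction mode Mode in the slice header indicates a prediction mode other than direct mode, the motion compensation decoding unit 204 uses an already-decoded picture as the reference picture for the current macroblock, identifies a block in that reference picture by the decoded motion vector, and creates the predicted image by cutting out from the identified block the predicted image used for motion compensation.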
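[0066]
The specific method of spatial prediction performed in step S902 of FIG. 9 is described below.
The skip mode example described with FIG. 19 concerned the case in which each referenced decoded block has one motion vector. However, the prediction modes of B-pictures include, as shown in FIG. 2, modes that perform motion compensation with simultaneous reference to two pictures that precede in display order. In such a mode, one block has two motion vectors.
[0067]
FIG. 4 shows an example of the motion vector reference relationships when the decoded blocks whose motion vectors are referenced include blocks with two motion vectors. Picture P94 is the picture currently being decoded, and block BL51 is the block to be predictively decoded in direct mode.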
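[0068]
In the first method, in both of the cases shown in FIG. 18(a) and FIG. 18(b), the motion compensation decoding unit 204 basically refers to the motion vectors of the blocks at positions A, B and C relative to block BL51 (or block BL52), the block to be predictively decoded in direct mode. The references are changed according to the following conditions.
[0069]
1. If block C cannot be referenced, the blocks at positions A, B and D are referenced.
2. If any of the three blocks at positions A, B, C (or A, B, D) has no referenceable motion vector, that block is excluded from motion vector referencing.
[0070]
For the motion vectors of the three referenced blocks A, B, C (or A, B, D), the motion compensation decoding unit 204 compares how far, in display order, the picture referenced by each motion vector is from the current picture. It extracts the motion vectors that reference the picture closest to the current picture in display order. If more than one motion vector is extracted, their median or mean is taken; for example, the median may be taken when an odd number is extracted and the mean when an even number is extracted.
[0071]
The resulting motion vector is used as the motion vector of the current block when direct mode is selected and motion compensation references only pictures that precede the current picture in display order. If none of the blocks A, B, C (or A, B, D) can be referenced, the motion vector of the current block is set to 0, the immediately preceding picture is used as the reference picture, and predictive decoding is performed in direct mode.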
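[0072]
The flowchart of FIG. 5 also shows an example of the procedure by which the motion compensation decoding unit 204 of FIG. 8 performs spatial prediction of the current block using the first method. The current block BL51 of FIG. 4 is used as an example.
[0073]
First, the motion compensation decoding unit 204 checks whether the block at position C relative to the current block BL51 can be referenced (S601). In FIG. 4, the block at position C has motion vector MVC1, which references picture P93, and motion vector MVC2, which references picture P92. The motion compensation decoding unit 204 therefore refers to the motion vectors of the blocks at positions A, B and C (S602).
[0074]
The block at position A has motion vector MVA1, which references picture P93, and the block at position B has motion vector MVB1, which references picture P93, and motion vector MVB3, which references picture P91. If, in step S601, the block at position C lies outside the current picture P94 or outside the slice to which the current block BL51 belongs, or has no motion vector because it was decoded by intra-picture prediction or the like, the motion vector of the block at position D shown in FIG. 18(a) and FIG. 18(b) is referenced in place of the block at position C (S603). That is, the three blocks at positions A, B and D are referenced.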
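[0075]
Next, if any of the three referenced blocks (A, B, C or A, B, D) lies outside the current picture P94 or outside the slice to which the current block BL51 belongs, or has no motion vector because it was decoded by intra-picture prediction or the like, the motion compensation decoding unit 204 excludes that block from the reference candidates and calculates the motion vector of the current block (S604).
[0076]
If none of the three blocks (A, B, C or A, B, D) can be referenced, the motion vector of the current block is set to "0" and the picture immediately preceding the current picture is referenced. When the motion compensation decoding unit 204 extracts, from the referenced motion vectors, only those that reference the picture closest to the current picture in display order, it obtains motion vectors MVA1, MVB1 and MVC1, which reference picture P93. It then takes their median or mean; here, since three motion vectors were obtained, it takes the median. This determines the single motion vector MV1 used for motion compensation of block BL51.
[0077]
As described above, this embodiment proposes a method that realizes direct mode in predictive decoding without referring to pictures that follow in display order, even in an environment where such pictures cannot be referenced, and presents a decoding method that achieves high coding efficiency.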
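[0078]
In the first method above, the motion vectors that reference the picture closest to the current picture in display order were extracted from the referenced motion vectors. Alternatively, only those that reference the picture immediately preceding the current picture may be extracted. In the example of FIG. 4, the picture closest to the current picture in display order among the pictures referenced by the referenced motion vectors is the picture immediately preceding the current picture, so the resulting motion vector is the same. If no motion vector references the picture closest in display order, the motion vector of the current block is set to "0" and decoding is performed in direct mode.
[0079]
In the first method above, when determining the motion vector used in direct mode, only the motion vectors that reference the picture nearest before the current picture in display order, among the pictures referenced by the surrounding decoded blocks, were extracted, and a single motion vector was finally calculated. Instead, as a second method, it is also possible to extract the motion vectors that reference the N pictures immediately preceding the current picture in display order, determine one motion vector for each referenced picture, and perform forward-only motion compensation using the resulting N motion vectors as the motion vectors for predictive decoding in direct mode. In this case the predicted image is generated by averaging the pixel values of the N regions designated by the N motion vectors.
[0080]
The predicted image can also be generated by a weighted average of the pixel values of the regions rather than a simple average. This makes more accurate motion compensation possible for image sequences whose pixel values change gradually in display order.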
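[0081]
FIG. 7 shows an example of the motion vector referencing method when motion vectors that reference the two pictures immediately preceding the current picture in display order are extracted and two motion vectors are calculated. Picture P104 is the picture currently being decoded, and BL51 is the block to be predictively decoded in direct mode. Motion vector MV1 is determined by taking the median or mean of motion vectors MVA1, MVB1 and MVC1, which reference picture P103, the nearest preceding picture in display order among the pictures referenced by the candidate motion vectors. Motion vector MV2 is determined by taking the median or mean of the motion vectors that reference picture P102, two pictures before in display order, which here is MVC2 itself. Decoding in direct mode is performed using these two motion vectors.
[0082]
Instead of using only those motion vectors of the blocks referenced in FIG. 18(a) and FIG. 18(b) that reference the one or N nearest preceding pictures in display order, it is also possible to extract only the motion vectors that reference a designated picture, determine from them the motion vector of the current block used in direct mode, and perform motion compensation from the designated picture.
[0083]
When decoding in direct mode, instead of performing motion compensation with reference to already-decoded blocks in the positional relationships of FIG. 18(a) and FIG. 18(b), it is also possible to perform motion compensation in direct mode with the motion vector of the current block set to "0" and the immediately preceding picture as the reference picture. This eliminates the step of calculating the motion vector used for direct mode and thus simplifies the decoding process.
[0084]
If, in the corresponding coding process, a flag was coded indicating that motion compensation in direct mode is performed with the motion vector of the current block set to "0" without referring to coded blocks, the decoder can interpret the value of that flag, switch to that operation, and perform motion prediction in direct mode.
[0085]
In the method above, the motion vectors that reference the picture closest to the current picture in display order, among the pictures referenced by the motion vectors obtained from the three blocks, are extracted, but the present invention is not limited to this. For example, the motion vectors that reference the picture closest to the current picture in decoding order may be extracted.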
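[0086]
(Embodiment 3)
The moving picture coding method of Embodiment 3 is described with reference to the block diagram of FIG. 1.
The moving picture to be coded is input to frame memory 101 picture by picture in time order. Each picture is divided into blocks called macroblocks, for example 16 horizontal × 16 vertical pixels, and the subsequent processing is performed block by block.
[0087]
A macroblock read from frame memory 101 is input to the motion vector estimation unit 106. Here, the motion vector of the current macroblock is detected using, as reference pictures, the images stored in frame memory 105, which are obtained by decoding coded pictures.
[0088]
The mode selection unit 107 determines the optimum prediction mode while referring to the motion vectors used in coded pictures, which are stored in the motion vector storage unit 108. At this point the backward picture determination unit 109 determines whether a picture that follows in display order has already been coded. If it determines that none has, prediction modes that reference pictures that follow in display order are restricted so that they are not selected in the coding of the B-picture.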
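[0089]
FIG. 10 shows an example of tables that associate the codes for identifying prediction modes in B-pictures with coding modes. When the prediction direction is not restricted, a table listing all reference patterns is used, as in FIG. 10(a). When the prediction direction is restricted to forward only, the table is rebuilt with all backward-referencing patterns removed, as in FIG. 10(b), and the rebuilt table is used. This reduces the bit amount needed for the codes that identify the prediction mode. The entries in the tables of FIG. 10(a) and FIG. 10(b) can be handled in the same way when other values are used.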
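The table regeneration can be sketched as a filter followed by renumbering. The contents of FIG. 10 are not reproduced in the text, so every entry in `FULL_TABLE` below is hypothetical, as are the names `FULL_TABLE` and `regenerate_table`. The sketch also assumes that smaller code numbers map to shorter variable-length codes, which is why a shorter table can save bits.

```python
# Hypothetical mode table: (mode name, whether the mode references a backward picture).
FULL_TABLE = [
    ("Direct", False),
    ("Forward 16x16", False),
    ("Backward 16x16", True),
    ("Bidirectional 16x16", True),
    ("Forward 16x8", False),
    ("Backward 16x8", True),
]


def regenerate_table(table):
    """Drop backward-referencing modes and renumber the remaining codes from 0."""
    kept = [name for name, uses_backward in table if not uses_backward]
    return {code: name for code, name in enumerate(kept)}


print(regenerate_table(FULL_TABLE))
```

In this made-up example the forward 16x8 mode moves from code 4 to code 2. The encoder and the decoder must both apply the same regeneration so that the codes stay in agreement.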
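[0090]
FIG. 2 shows the reference relationships between pictures when pictures that follow in display order cannot be referenced. Every B-picture in the sequence is predictively coded with reference to one or more coded pictures that precede it in display order.
[0091]
The predicted image determined by the obtained motion vector is input to the difference calculation unit 110, a prediction residual image is generated by taking the difference from the macroblock to be coded, and the prediction residual coding unit 102 codes it. The flow above is the operation when inter-picture predictive coding is selected; switch 112 switches between it and intra-picture predictive coding. Finally, the bitstream generation unit 103 variable-length codes control information such as motion vectors and the image information output from the prediction residual coding unit 102, and generates the bitstream that is finally output.
[0092]
The above is an overview of the coding flow. The processing in the motion vector estimation unit 106 and the mode selection unit 107 is described in detail below, for the case in which the backward picture determination unit 109 has determined that no backward picture has been coded.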
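[0093]
Motion vectors are detected for each macroblock or for each region into which a macroblock is divided. For the current macroblock, an already-coded picture is used as the reference picture, and the predicted image is created by determining the prediction mode and the motion vector indicating the position predicted to be optimal within the search area of that picture.
[0094]
In an environment where pictures that follow in display order have not been coded, when the mode selection unit 107 selects direct mode for predictive coding of a B-picture, direct mode is realized by referring to the motion vectors of coded blocks located around the current block, instead of referring to the picture immediately following in display order to obtain the motion vector as described in the prior art.
[0095]
First, the case in which each coded block located around the current block has one motion vector is described. FIG. 18 shows the positional relationship of the referenced blocks. FIG. 18(a) is an example in which block BL51, coded in direct mode, is 16×16 pixels, and FIG. 18(b) is an example in which block BL52, coded in direct mode, is 8×8 pixels. In both cases the motion vectors of the three blocks at positions A, B and C are basically referenced. However, under the following conditions no reference is made; the motion vector of the current block is set to "0" and motion compensation in direct mode is performed with reference to the immediately preceding picture.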
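[0096]
1. A or B lies outside the picture or outside the slice.
2. A or B has a motion vector of value "0" that references the immediately preceding picture.
From the motion vectors of the three referenced blocks A, B and C, only those that reference the immediately preceding picture are extracted, and their median or mean is taken as the motion vector actually used in direct mode. If block C cannot be referenced, block D is used instead.
[0097]
FIG. 19 shows an example of the motion vector reference relationships in that case. Block BL51, belonging to picture P64, is the block currently being coded. In this example the only motion vector that references the immediately preceding picture is MVA1, so the motion vector MV1 used in direct mode takes the value of MVA1 as it is. The same applies when positions other than A, B, C and D shown in FIG. 18(a) and FIG. 18(b) are used for the referenced blocks.
[0098]
The example of FIG. 19 concerned the case in which each referenced coded block has one motion vector. However, the prediction modes of B-pictures include modes that perform motion compensation with simultaneous reference to two pictures that precede in display order. In such a mode, one block has two motion vectors.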
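[0099]
The case in which the coded blocks located around the current block include blocks with two motion vectors is described below. FIG. 4 shows an example of the motion vector reference relationships in that case. Picture P94 is the picture currently being coded, and block BL51 is the block to be predictively coded in direct mode. Among the pictures referenced by all the motion vectors of the referenced blocks, picture P93 is the nearest preceding picture in display order. The motion vector MV1 used for predictive coding in direct mode is determined by taking the median or mean of motion vectors MVA1, MVB1 and MVC1, which reference picture P93, and forward-only motion compensation is performed.
[0100]
As described above, this embodiment proposes a method that realizes direct mode without referring to pictures that follow in display order, even in an environment where such pictures cannot be referenced. In addition, removing the entries that reference backward pictures from the coding-mode table reduces the number of entries in the table. The embodiment thus presents a coding method that achieves high coding efficiency.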
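[0101]
When determining the motion vector used in direct mode, instead of extracting only the motion vectors that reference the nearest preceding picture in display order among the pictures referenced by the surrounding coded blocks and calculating a single motion vector, it is also possible to extract the motion vectors that reference the N nearest preceding pictures, determine one motion vector for each referenced picture, and perform forward-only motion compensation using the resulting N motion vectors as the motion vectors for predictive coding in direct mode. In this case the predicted image is generated by averaging the pixel values of the N regions designated by the N motion vectors.
[0102]
The predicted image can also be generated by a weighted average of the pixel values of the regions rather than a simple average. This makes more accurate motion compensation possible for image sequences whose pixel values change gradually in display order.
[0103]
FIG. 7 shows an example of the motion vector referencing method in the above case with N = 2. P104 is the picture currently being coded, and BL51 is the block to be predictively coded in direct mode. Motion vector MV1 is determined by taking the median or mean of motion vectors MVA1, MVB1 and MVC1, which reference picture P103, the nearest preceding picture in display order among the pictures referenced by the candidate motion vectors. Motion vector MV2 is determined by taking the median or mean of the motion vectors that reference picture P102, two pictures before in display order, which here is MVC2 itself. Coding in direct mode is performed using these two motion vectors.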
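[0104]
As the method of determining the blocks whose motion vectors are referenced in FIG. 18(a) and FIG. 18(b), the following conditions can be used instead of the method described in the above embodiment.
1. If A and D cannot be referenced, their motion vectors are referenced as "0".
2. If B, C and D cannot be referenced, only A is referenced.
3. If only C cannot be referenced, A, B and D are referenced.
4. In cases other than 2 and 3 above, A, B and C are referenced.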
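The four alternative conditions listed above can be read as a small decision procedure. The sketch below is one possible reading, not the definitive one: the source does not say how rule 1 interacts with rules 2 to 4, so treating rule 1 as an independent marker (positions whose vectors count as zero) is an assumption, as are the function name and the dictionary-based interface.

```python
def select_reference_positions(available):
    """available maps 'A', 'B', 'C', 'D' to True when the block can be referenced.

    Returns (positions to reference, positions whose vectors are treated as zero).
    """
    zero_positions = set()
    # Rule 1: A and D both unavailable, so their vectors are treated as "0".
    if not available["A"] and not available["D"]:
        zero_positions = {"A", "D"}
    # Rule 2: B, C and D all unavailable, so only A is referenced.
    if not (available["B"] or available["C"] or available["D"]):
        return ["A"], zero_positions
    # Rule 3: only C unavailable, so D takes its place.
    if not available["C"] and available["A"] and available["B"] and available["D"]:
        return ["A", "B", "D"], zero_positions
    # Rule 4: every other case.
    return ["A", "B", "C"], zero_positions


print(select_reference_positions({"A": True, "B": True, "C": False, "D": True}))
```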
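[0105]
Instead of using only those motion vectors of the blocks referenced in FIG. 18(a) and FIG. 18(b) that reference the one or N nearest preceding pictures in display order, it is also possible to extract only the motion vectors that reference a designated picture, determine from them the motion vector of the current block used in direct mode, and perform motion compensation from the designated picture.
[0106]
When coding in direct mode, instead of performing motion compensation with reference to blocks in the positional relationships of FIG. 18(a) and FIG. 18(b), it is also possible to perform motion compensation in direct mode with the motion vector of the current block set to "0" and the immediately preceding picture as the reference picture. This eliminates the step of calculating the motion vector used for direct mode and thus simplifies the coding process.
[0107]
In the above embodiment, the motion vectors that reference the picture closest to the current picture in display order, among the pictures referenced by the motion vectors obtained from the three blocks, are extracted, but the present invention is not limited to this. For example, the motion vectors that reference the picture closest to the current picture in coding order may be extracted.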
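[0108]
(Embodiment 4)
The moving picture decoding method of Embodiment 4 is described with reference to the block diagram of FIG. 8. The bitstream generated by the moving picture coding method of Embodiment 3 is assumed to be input.
[0109]
First, the bitstream analysis unit 201 extracts various information such as motion vector information and prediction residual coded data from the input bitstream. The extracted motion vector information is output to the motion compensation decoding unit 204, and the prediction residual coded data to the prediction residual decoding unit 202. The motion compensation decoding unit 204 generates a predicted image based on the input motion vector information, using the decoded images of decoded pictures stored in frame memory 203 as reference pictures. At this point the backward picture determination unit 206 determines whether a picture that follows in display order has already been decoded. If it determines that none has, prediction modes that reference pictures that follow in display order are treated as not selectable for the B-picture.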
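[0110]
FIG. 10 shows an example of tables that associate the codes for identifying prediction modes in B-pictures with coding modes. When the prediction direction is not restricted, a table listing all reference patterns is used, as in FIG. 10(a). When the prediction direction is restricted to forward only, the table is rebuilt with all backward-referencing patterns removed, as in FIG. 10(b), and the rebuilt table is used. The entries in the tables of FIG. 10(a) and FIG. 10(b) can be handled in the same way when other values are used.
[0111]
The predicted image generated in this way is input to the addition unit 207 and added to the prediction residual image generated by the prediction residual decoding unit 202 to generate a decoded image. When the prediction direction is not restricted, the decoded pictures are rearranged into display order in frame memory 203. When pictures that follow in display order cannot be referenced, the pictures can be displayed in the order in which they are decoded, without rearrangement. The above is the operation for a bitstream coded by inter-picture prediction; switch 208 switches between it and the decoding process for a bitstream coded by intra-picture prediction.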
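[0112]
The above is an overview of the decoding flow. The processing in the motion compensation decoding unit 204 is described in detail below, for the case in which the backward picture determination unit 206 has determined that no backward picture has been decoded.
Motion vector information is attached to each macroblock or to each region into which a macroblock is divided. For the current macroblock, an already-decoded picture is used as the reference picture, and the predicted image used for motion compensation is created from that picture by means of the decoded motion vector.
[0113]
In an environment where pictures that follow in display order have not been decoded, when direct mode is indicated in the predictive decoding of a B-picture, direct mode is realized by referring to the motion vectors of decoded blocks located around the current block, instead of referring to the picture immediately following in display order to obtain the motion vector as described in the prior art.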
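[0114]
First, the case in which each decoded block located around the current block has one motion vector is described. FIG. 18 shows the positional relationship of the referenced blocks. FIG. 18(a) is an example in which block BL51, decoded in direct mode, is 16×16 pixels, and FIG. 18(b) is an example in which block BL52, decoded in direct mode, is 8×8 pixels. In both cases the motion vectors of the three blocks at positions A, B and C are basically referenced. However, under the following conditions no reference is made; the motion vector of the current block is set to "0" and motion compensation in direct mode is performed with reference to the immediately preceding picture.
1. A or B lies outside the picture or outside the slice.
2. A or B has a motion vector of value "0" that references the immediately preceding picture.
[0115]
From the motion vectors of the three referenced blocks A, B and C, only those that reference the immediately preceding picture are extracted, and their median or mean is taken as the motion vector actually used in direct mode. If block C cannot be referenced, block D is used instead.
[0116]
FIG. 19 shows an example of the motion vector reference relationships in that case. Block BL51, belonging to picture P64, is the block currently being decoded. In this example the only motion vector that references the immediately preceding picture is MVA1, so the motion vector MV1 used in direct mode takes the value of MVA1 as it is. The same applies when positions other than A, B, C and D shown in FIG. 18(a) and FIG. 18(b) are used for the referenced blocks.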
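[0117]
The example of FIG. 19 concerned the case in which each referenced decoded block has one motion vector. However, the prediction modes of B-pictures include modes that perform motion compensation with simultaneous reference to two pictures that precede in display order. In such a mode, one block has two motion vectors.
[0118]
The case in which the decoded blocks located around the current block include blocks with two motion vectors is described below. FIG. 4 shows an example of the motion vector reference relationships in that case. P94 is the picture currently being decoded, and BL51 is the block to be predictively decoded in direct mode. Among the pictures referenced by all the motion vectors of the referenced blocks, picture P93 is the nearest preceding picture in display order. The motion vector MV1 used for predictive decoding in direct mode is determined by taking the median or mean of motion vectors MVA1, MVB1 and MVC1, which reference picture P93, and forward-only motion compensation is performed.
[0119]
As described above, this embodiment proposes a method that realizes direct mode in predictive decoding without referring to pictures that follow in display order, even in an environment where such pictures cannot be referenced. In addition, removing the entries that reference backward pictures from the coding-mode table reduces the number of entries in the table. The embodiment thus presents a decoding method that achieves high coding efficiency.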
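[0120]
When determining the motion vector used in direct mode, instead of extracting only the motion vectors that reference the nearest preceding picture in display order among the pictures referenced by the surrounding decoded blocks and calculating a single motion vector, it is also possible to extract the motion vectors that reference the N nearest preceding pictures, determine one motion vector for each referenced picture, and perform forward-only motion compensation using the resulting N motion vectors as the motion vectors for predictive decoding in direct mode. In this case the predicted image is generated by averaging the pixel values of the N regions designated by the N motion vectors.
[0121]
The predicted image can also be generated by a weighted average of the pixel values of the regions rather than a simple average. This makes more accurate motion compensation possible for image sequences whose pixel values change gradually in display order.
[0122]
FIG. 7 shows an example of the motion vector referencing method in the above case with N = 2. P104 is the picture currently being decoded, and BL51 is the block to be predictively decoded in direct mode. Motion vector MV1 is determined by taking the median or mean of motion vectors MVA1, MVB1 and MVC1, which reference picture P103, the nearest preceding picture in display order among the pictures referenced by the candidate motion vectors. Motion vector MV2 is determined by taking the median or mean of the motion vectors that reference picture P102, two pictures before in display order, which here is MVC2 itself. Decoding in direct mode is performed using these two motion vectors.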
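[0123]
As the method of determining the blocks whose motion vectors are referenced in FIG. 18(a) and FIG. 18(b), the following conditions can be used instead of the method described in the above embodiment.
1. If A and D cannot be referenced, their motion vectors are referenced as "0".
2. If B, C and D cannot be referenced, only A is referenced.
3. If only C cannot be referenced, A, B and D are referenced.
4. In cases other than 2 and 3 above, A, B and C are referenced.
[0124]
Instead of using only those motion vectors of the blocks referenced in FIG. 18(a) and FIG. 18(b) that reference the one or N nearest preceding pictures in display order, it is also possible to extract only the motion vectors that reference a designated picture, determine from them the motion vector of the current block used in direct mode, and perform motion compensation from the designated picture.
[0125]
When decoding in direct mode, instead of performing motion compensation with reference to blocks in the positional relationships of FIG. 18(a) and FIG. 18(b), it is also possible to perform motion compensation in direct mode with the motion vector of the current block set to "0" and the immediately preceding picture as the reference picture. This eliminates the step of calculating the motion vector used for direct mode and thus simplifies the decoding process.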
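[0126]
(Embodiment 5)
Furthermore, by recording a program for realizing the moving picture coding method or decoding method shown in the above embodiments on a recording medium such as a flexible disk, the processing shown in Embodiment 1 can easily be carried out on an independent computer system.
[0127]
FIG. 11 illustrates the case in which the moving picture coding method or decoding method of Embodiment 1 is carried out by a computer system using a flexible disk on which it is stored.
[0128]
FIG. 11(b) shows the external appearance of the flexible disk seen from the front, its cross-sectional structure, and the flexible disk itself, and FIG. 11(a) shows an example of the physical format of the flexible disk, which is the recording medium body. The flexible disk FD is contained in a case F. On the surface of the disk, a plurality of tracks Tr are formed concentrically from the outer circumference toward the inner circumference, and each track is divided into 16 sectors Se in the angular direction. In the flexible disk storing the program, the moving picture coding method as the program is therefore recorded in an area allocated on the flexible disk FD.
[0129]
FIG. 11(c) shows a configuration for recording and reproducing the program on the flexible disk FD. To record the program on the flexible disk FD, the moving picture coding method or decoding method as the program is written from the computer system Cs via a flexible disk drive. To build the moving picture coding method in the computer system from the program on the flexible disk, the program is read from the flexible disk by the flexible disk drive and transferred to the computer system.
[0130]
Although a flexible disk was used as the recording medium in the above description, an optical disk can be used in the same way. The recording medium is not limited to these; anything that can record a program, such as an IC card or a ROM cassette, can be used in the same way.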
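[0131]
(Embodiment 6)
Here, application examples of the moving picture coding method and the moving picture decoding method described in the above embodiments, and a system using them, are described.
[0132]
FIG. 12 is a block diagram showing the overall configuration of a content supply system ex100 that realizes a content distribution service. The area in which the communication service is provided is divided into cells of a desired size, and base stations ex107 to ex110, which are fixed wireless stations, are installed in the respective cells.
[0133]
In this content supply system ex100, devices such as a computer ex111, a PDA (personal digital assistant) ex112, a camera ex113, a mobile phone ex114, and a camera-equipped mobile phone ex115 are connected to the Internet ex101 via, for example, an Internet service provider ex102, a telephone network ex104, and the base stations ex107 to ex110.
[0134]
However, the content supply system ex100 is not limited to the combination shown in FIG. 12, and any of the elements may be combined and connected. Each device may also be connected directly to the telephone network ex104 without going through the base stations ex107 to ex110, which are fixed wireless stations.
[0135]
The camera ex113 is a device capable of shooting moving pictures, such as a digital video camera. The mobile phone may be a mobile phone of the PDC (Personal Digital Communications) system, the CDMA (Code Division Multiple Access) system, the W-CDMA (Wideband-Code Division Multiple Access) system, or the GSM (Global System for Mobile Communications) system, or a PHS (Personal Handyphone System) or the like; any of these may be used.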
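[0136]
The streaming server ex103 is connected to the camera ex113 via the base station ex109 and the telephone network ex104, which enables live distribution and the like based on coded data transmitted by a user using the camera ex113. The captured data may be coded by the camera ex113 or by a server or the like that performs the data transmission processing.
[0137]
Moving picture data shot by a camera ex116 may also be transmitted to the streaming server ex103 via the computer ex111. The camera ex116 is a device capable of shooting still pictures and moving pictures, such as a digital camera. In this case, the moving picture data may be coded by either the camera ex116 or the computer ex111. The coding is processed in an LSI ex117 included in the computer ex111 or the camera ex116.
[0138]
Software for moving picture coding and decoding may be incorporated into some kind of storage medium (a CD-ROM, a flexible disk, a hard disk, or the like) that is a recording medium readable by the computer ex111 or the like. Moving picture data may also be transmitted by the camera-equipped mobile phone ex115. In that case, the moving picture data is data coded by an LSI included in the mobile phone ex115.
[0139]
In this content supply system ex100, content that a user is shooting with the camera ex113, the camera ex116, or the like (for example, video of a live music performance) is coded in the same way as in the above embodiments and transmitted to the streaming server ex103, while the streaming server ex103 streams the content data to clients that request it. The clients include the computer ex111, the PDA ex112, the camera ex113, the mobile phone ex114, and other devices capable of decoding the coded data.
[0140]
In this way, the content supply system ex100 allows clients to receive and reproduce coded data and, by having clients receive, decode, and reproduce the data in real time, also makes personal broadcasting possible.
[0141]
The moving picture coding apparatus or the moving picture decoding apparatus described in the above embodiments may be used for coding and decoding in each device of this system.
A mobile phone is described as one example.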
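[0142]
FIG. 13 shows the mobile phone ex115 that uses the moving picture coding method and the moving picture decoding method described in the above embodiments. The mobile phone ex115 has: an antenna ex201 for transmitting and receiving radio waves to and from the base station ex110; a camera unit ex203, such as a CCD camera, capable of shooting video and still pictures; a display unit ex202, such as a liquid crystal display, that displays decoded data such as video shot by the camera unit ex203 and video received by the antenna ex201; a main body including a set of operation keys ex204; an audio output unit ex208, such as a speaker, for outputting audio; an audio input unit ex205, such as a microphone, for inputting audio; a recording medium ex207 for storing coded or decoded data, such as data of shot moving or still pictures, data of received e-mail, and moving picture or still picture data; and a slot unit ex206 that allows the recording medium ex207 to be attached to the mobile phone ex115. The recording medium ex207, such as an SD card, houses in a plastic case a flash memory element, which is a kind of EEPROM (Electrically Erasable and Programmable Read Only Memory), an electrically rewritable and erasable non-volatile memory.
[0143]
The mobile phone ex115 is further described with reference to FIG. 14. In the mobile phone ex115, a power supply circuit unit ex310, an operation input control unit ex304, an image coding unit ex312, a camera interface unit ex303, an LCD (Liquid Crystal Display) control unit ex302, an image decoding unit ex309, a multiplexing/demultiplexing unit ex308, a recording/reproducing unit ex307, a modem circuit unit ex306, and an audio processing unit ex305 are connected to one another via a synchronous bus ex313, together with a main control unit ex311 that performs overall control of each part of the main body including the display unit ex202 and the operation keys ex204.
[0144]
When the end-call/power key is turned on by a user operation, the power supply circuit unit ex310 supplies power from a battery pack to each unit, thereby activating the camera-equipped digital mobile phone ex115 into an operable state.
[0145]
Under the control of the main control unit ex311, which consists of a CPU, a ROM, a RAM, and the like, the mobile phone ex115 in voice call mode converts the audio signal collected by the audio input unit ex205 into digital audio data in the audio processing unit ex305, performs spread spectrum processing on it in the modem circuit unit ex306, performs digital-to-analog conversion and frequency conversion in the transmission/reception circuit unit ex301, and then transmits it via the antenna ex201. Also in voice call mode, the mobile phone ex115 amplifies the data received by the antenna ex201, performs frequency conversion and analog-to-digital conversion, performs inverse spread spectrum processing in the modem circuit unit ex306, converts the result into analog audio data in the audio processing unit ex305, and outputs it via the audio output unit ex208.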
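[0146]
Furthermore, when e-mail is transmitted in data communication mode, the text data of the e-mail entered by operating the operation keys ex204 on the main body is sent to the main control unit ex311 via the operation input control unit ex304. The main control unit ex311 has the text data undergo spread spectrum processing in the modem circuit unit ex306 and digital-to-analog conversion and frequency conversion in the transmission/reception circuit unit ex301, and then transmits it to the base station ex110 via the antenna ex201.
[0147]
When image data is transmitted in data communication mode, the image data captured by the camera unit ex203 is supplied to the image coding unit ex312 via the camera interface unit ex303. When image data is not transmitted, the image data captured by the camera unit ex203 can also be displayed directly on the display unit ex202 via the camera interface unit ex303 and the LCD control unit ex302.
[0148]
The image coding unit ex312, which includes the moving picture coding apparatus described in the present invention, converts the image data supplied from the camera unit ex203 into coded image data by compression coding using the coding method of the moving picture coding apparatus described in the above embodiments, and sends it to the multiplexing/demultiplexing unit ex308. At the same time, the mobile phone ex115 sends the audio collected by the audio input unit ex205 during capture by the camera unit ex203 to the multiplexing/demultiplexing unit ex308 as digital audio data via the audio processing unit ex305.
[0149]
The multiplexing/demultiplexing unit ex308 multiplexes the coded image data supplied from the image coding unit ex312 and the audio data supplied from the audio processing unit ex305 using a predetermined method. The resulting multiplexed data undergoes spread spectrum processing in the modem circuit unit ex306 and digital-to-analog conversion and frequency conversion in the transmission/reception circuit unit ex301, and is then transmitted via the antenna ex201.
[0150]
When data of a moving picture file linked to a web page or the like is received in data communication mode, the data received from the base station ex110 via the antenna ex201 undergoes inverse spread spectrum processing in the modem circuit unit ex306, and the resulting multiplexed data is sent to the multiplexing/demultiplexing unit ex308.
[0151]
To decode the multiplexed data received via the antenna ex201, the multiplexing/demultiplexing unit ex308 demultiplexes it into a bitstream of image data and a bitstream of audio data, and supplies the coded image data to the image decoding unit ex309 and the audio data to the audio processing unit ex305 via the synchronous bus ex313.
[0152]
Next, the image decoding unit ex309, which includes the moving picture decoding apparatus described in the present invention, generates reproduced moving picture data by decoding the bitstream of image data using the decoding method corresponding to the coding method described in the above embodiments, and supplies it to the display unit ex202 via the LCD control unit ex302. In this way, for example, the moving picture data contained in a moving picture file linked to a web page is displayed. At the same time, the audio processing unit ex305 converts the audio data into analog audio data and supplies it to the audio output unit ex208. In this way, for example, the audio data contained in a moving picture file linked to a web page is reproduced.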
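[0153]
The present invention is not limited to the above system example. Digital broadcasting via satellite and terrestrial waves has recently been attracting attention, and, as shown in FIG. 15, at least either the moving picture coding apparatus or the moving picture decoding apparatus of the above embodiments can be incorporated into a digital broadcasting system. Specifically, at a broadcasting station ex409, a bitstream of video information is transmitted by radio waves to a communication or broadcasting satellite ex410. On receiving it, the broadcasting satellite ex410 emits radio waves for broadcasting; a home antenna ex406 equipped for satellite broadcast reception receives these radio waves, and a device such as a television (receiver) ex401 or a set-top box (STB) ex407 decodes the bitstream and reproduces it.
[0154]
The moving picture decoding apparatus described in the above embodiments can also be implemented in a reproduction apparatus ex403 that reads and decodes a bitstream recorded on a storage medium ex402, such as a CD or a DVD, which is a recording medium. In this case, the reproduced video signal is displayed on a monitor ex404. Another possible configuration is to implement the moving picture decoding apparatus in the set-top box ex407 connected to a cable ex405 for cable television or to the antenna ex406 for satellite/terrestrial broadcasting, and to reproduce the video on a television monitor ex408. The moving picture decoding apparatus may be incorporated in the television instead of the set-top box.
[0155]
A car ex412 having an antenna ex411 can also receive a signal from the satellite ex410, the base station ex107, or the like, and reproduce moving pictures on a display device such as a car navigation system ex413 in the car ex412.
[0156]
Furthermore, an image signal can be coded by the moving picture coding apparatus described in the above embodiments and recorded on a recording medium. Specific examples include a recorder ex420, such as a DVD recorder that records image signals on a DVD disc ex421 or a disk recorder that records them on a hard disk. The signals can also be recorded on an SD card ex422. If the recorder ex420 includes the moving picture decoding apparatus described in the above embodiments, the image signals recorded on the DVD disc ex421 or the SD card ex422 can be reproduced and displayed on the monitor ex408.
[0157]
The car navigation system ex413 may, for example, have the configuration shown in FIG. 14 without the camera unit ex203, the camera interface unit ex303, and the image coding unit ex312; the same applies to the computer ex111, the television (receiver) ex401, and the like.
[0158]
Three implementation forms are conceivable for a terminal such as the mobile phone ex114: a transmitting and receiving terminal having both a coder and a decoder, a transmitting terminal having only a coder, and a receiving terminal having only a decoder.
In this way, the moving picture coding method or the moving picture decoding method described in the above embodiments can be used in any of the devices and systems described above, and doing so provides the effects described in the above embodiments.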
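[0159]
The present invention is not limited to the above embodiments, and various variations and modifications are possible without departing from the scope of the present invention.
The moving picture coding apparatus according to the present invention is useful as a moving picture coding apparatus provided in a personal computer with a communication function, a PDA, a digital broadcasting station, a mobile phone, and the like.
[0160]
The moving picture decoding apparatus according to the present invention is useful as a moving picture decoding apparatus provided in a personal computer with a communication function, a PDA, an STB that receives digital broadcasts, a mobile phone, and the like.
[0161]
[Effects of the Invention]
As described above, the moving picture coding method of the present invention realizes the direct mode without referring to a temporally backward picture when predictive coding is performed in the direct mode, even in an environment where temporally backward pictures cannot be referred to. In addition, by removing the entries that refer to backward pictures from the coding mode table, it reduces the number of table entries and makes high coding efficiency possible.
[0162]
Likewise, the moving picture decoding method of the present invention realizes the direct mode without referring to a temporally backward picture when predictive decoding is performed in the direct mode, even in an environment where temporally backward pictures cannot be referred to. In addition, by removing the entries that refer to backward pictures from the coding mode table, it reduces the number of table entries and makes it possible to decode, without contradiction, a code sequence coded with high coding efficiency.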
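[Brief Description of the Drawings]
[FIG. 1] A block diagram showing the configuration of a moving picture coding apparatus that executes the moving picture coding method of Embodiment 1.
[FIG. 2] A diagram showing an example of the reference relationship between pictures when pictures located after the picture containing the current block in display order cannot be referred to.
[FIG. 3] A flowchart showing an example of the operation of the mode selection unit when the direct mode is selected.
[FIG. 4] A diagram showing an example of the motion vector reference relationship when the coded blocks whose motion vectors are referred to include a block having two motion vectors.
[FIG. 5] A flowchart showing an example of the processing procedure when the mode selection unit shown in FIG. 1 performs spatial prediction of the current block using the first method.
[FIG. 6] A diagram showing an example of the per-slice data structure of the code sequence generated by the code sequence generation unit shown in FIG. 1.
[FIG. 7] A diagram showing an example of the motion vector reference method when motion vectors referring to the two pictures nearest before the current picture in display order are extracted and two motion vectors are calculated.
[FIG. 8] A block diagram showing the configuration of the moving picture decoding apparatus of the present embodiment.
[FIG. 9] A flowchart showing the processing procedure of decoding in the direct mode in the motion compensation decoding unit shown in FIG. 8.
[FIG. 10] (a) A diagram showing an example of a table that associates codes for identifying prediction modes in a B picture with coding modes.
(b) A diagram showing an example of a table that associates codes for identifying prediction modes in a B picture with coding modes when the prediction direction is restricted to forward only.
[FIG. 11] (a) A diagram showing an example of the physical format of the flexible disk, which is the recording medium body.
(b) A diagram showing the external appearance of the flexible disk as seen from the front, its cross-sectional structure, and the flexible disk itself.
(c) A diagram showing a configuration for recording and reproducing the program on the flexible disk FD.
[FIG. 12] A block diagram showing the overall configuration of a content supply system that realizes a content distribution service.
[FIG. 13] A diagram showing an example of the external appearance of a mobile phone.
[FIG. 14] A block diagram showing the configuration of the mobile phone.
[FIG. 15] A diagram illustrating devices that perform the coding or decoding described in the above embodiments, and a system using these devices.
[FIG. 16] A diagram showing an example of the reference relationship between each picture and the pictures it refers to in a conventional moving picture coding method.
[FIG. 17] (a) A diagram showing the pictures around picture B18 of FIG. 16, extracted in display order.
(b) A diagram showing the coding order of the pictures around picture B18 when picture B18 is coded with the reference relationship shown in FIG. 17(a).
[FIG. 18] A diagram showing the positional relationship between the current block and the coded blocks whose motion vectors are referred to, when motion vectors of coded blocks located around the current block in the same picture are referred to.
(a) An example in which the current block BL51 has a size of 16 pixels × 16 pixels.
(b) An example in which the current block BL52 has a size of 8 pixels × 8 pixels.
[FIG. 19] A diagram showing an example of the motion vectors referred to in the skip mode of a P picture and the coded pictures referred to by those motion vectors.
[FIG. 20] A diagram for explaining the method of determining motion vectors in the direct mode.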
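[Explanation of Reference Signs]
100  Moving picture coding apparatus
101  Frame memory
102  Prediction error coding unit
103  Code sequence generation unit
104  Prediction residual decoding unit
105  Frame memory
106  Motion vector detection unit
107  Mode selection unit
108  Motion vector storage unit
109  Backward picture determination unit
200  Moving picture decoding apparatus
201  Code sequence analysis unit
202  Prediction residual decoding unit
203  Frame memory
204  Motion compensation decoding unit
205  Motion vector storage unit
206  Backward picture determination unit
Se  Sector
Tr  Track
FD  Flexible disk
F  Flexible disk case
Cs  Computer system
FDD  Flexible disk drive
[0001]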
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a moving picture coding method and a moving picture decoding method, and in particular to a predictive coding method and a predictive decoding method for B pictures, which are predictively coded with reference to a plurality of already coded pictures located temporally forward or backward.
[0002]
[Prior art]
Generally, in coding of a moving picture, the amount of information is compressed by reducing redundancy in the temporal and spatial directions. In inter-picture predictive coding, which aims to reduce temporal redundancy, motion detection and motion compensation are performed block by block with reference to a forward or backward picture, and the difference between the resulting predicted image and the current picture is coded.
[0003]
In H.26L, a moving picture coding method currently being standardized, the following picture types have been proposed: a picture for which only intra-picture predictive coding is performed (I picture); a picture for which inter-picture predictive coding is performed with reference to one temporally forward picture (P picture); and a picture for which inter-picture predictive coding is performed with reference to two temporally forward pictures, two temporally backward pictures, or one temporally forward and one temporally backward picture (B picture). In MPEG (Motion Picture Experts Group) 1, MPEG2, and MPEG4, which are coding methods preceding H.26L, a B picture could refer to only one picture in the same direction. One major feature of H.26L is that this has been changed so that two pictures can be referred to.
[0004]
FIG. 16 is a diagram illustrating an example of a reference relationship between each picture and a picture referred to by the picture in the conventional moving picture coding method. In the figure, picture I1 to picture B20 are displayed in this order. FIG. 17 (a) is a diagram showing the pictures around the picture B18 shown in FIG. 16 extracted in the display order. FIG. 17B is a diagram illustrating the encoding order of the peripheral pictures of the picture B18 when the picture B18 is encoded with the reference relationship illustrated in FIG. 17A.
[0005]
Picture I1 has no reference picture and is intra-picture predictively coded, and picture P10 is inter-picture predictively coded with reference to the temporally forward picture P7. Picture B6 refers to two temporally forward pictures (picture I1 and picture P4), picture B12 refers to two temporally backward pictures (picture P13 and picture P16), and picture B18 refers to one temporally forward and one temporally backward picture (picture P16 and picture P19) for inter-picture predictive coding. As described above, coding with B pictures refers to temporally later pictures, so coding cannot be performed in display order. That is, when there is a B picture such as picture B18 in FIG. 17(a), picture P19, which it refers to, must be coded first. Therefore, picture P16 to picture P19 must be rearranged into the order shown in FIG. 17(b).
[0006]
One of the prediction modes of a P picture, for which inter-picture predictive coding is performed with reference to one temporally forward picture, is the skip mode. In the skip mode, the current block does not directly carry motion vector information. Instead, the motion vector used for motion compensation of the current block is determined by referring to the motion vectors of coded blocks located in its vicinity, and motion compensation is performed by generating a predicted image from the P picture temporally immediately preceding the picture to which the current block belongs.
[0007]
FIG. 18 is a diagram illustrating the positional relationship between the current block and the coded blocks whose motion vectors are referred to, when motion vectors of coded blocks located around the current block in the same picture are referred to. FIG. 18(a) shows an example in which the current block BL51 has a size of 16 pixels × 16 pixels, and FIG. 18(b) shows an example in which the current block BL52 has a size of 8 pixels × 8 pixels. Here, the positional relationship between the current block and the coded blocks whose motion vectors are referred to in the skip mode of a P picture is shown. Block BL51 is a 16 pixel × 16 pixel block coded in the skip mode, and basically refers to the motion vectors of three coded blocks in the positions A, B, and C (hereinafter, the block at position A is referred to as block A, the block at position B as block B, and the block at position C as block C). However, if either of the following conditions is satisfied, the motion vectors are not referred to; the value of the motion vector of the current block is set to "0", and motion compensation in the direct mode is performed with reference to the immediately preceding P picture.
[0008]
1. When the block A or the block B is outside the picture or the slice to which the encoding target block belongs.
2. When the block A or the block B has a motion vector of a value “0” referring to the immediately preceding picture.
[0009]
Among the motion vectors of the three referenced blocks A, B, and C, only those that refer to the immediately preceding P picture are extracted, and their median is taken as the motion vector actually used in the direct mode. However, when block C cannot be referred to, the motion vector of block D is used instead.
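The rule in paragraphs [0007] to [0009] can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `skip_mode_mv`, the `(distance, vector)` representation of a neighbouring block, the component-wise median, and the fallback to a zero vector when no neighbour refers to the immediately preceding picture are choices made for the example and are not stated in the text.

```python
from statistics import median


def skip_mode_mv(a, b, c, d=None):
    """Derive the motion vector of the current block from its neighbours.

    Each neighbour is None (cannot be referred to) or a pair
    (distance, (mvx, mvy)), where distance 1 means the vector refers to
    the immediately preceding picture.
    """
    # Conditions 1 and 2: A or B is unavailable, or has a zero vector
    # referring to the immediately preceding picture.
    for block in (a, b):
        if block is None or block == (1, (0, 0)):
            return (0, 0)
    # Block D replaces block C when C cannot be referred to.
    if c is None:
        c = d
    # Keep only vectors that refer to the immediately preceding picture.
    candidates = [
        blk[1] for blk in (a, b, c) if blk is not None and blk[0] == 1
    ]
    if not candidates:
        return (0, 0)  # assumption: the text does not cover this case
    return (
        median(v[0] for v in candidates),
        median(v[1] for v in candidates),
    )
```

For instance, if only block A refers to the immediately preceding picture, the result is A's vector unchanged, which matches the situation the document describes for FIG. 19.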
[0010]
FIG. 19 is a diagram illustrating an example of the motion vectors referred to in the skip mode of a P picture and the coded pictures referred to by those motion vectors. The block BL51 belonging to picture P64 is the block currently being coded. In this example, the only motion vector referring to the immediately preceding picture is motion vector MVA1, so the motion vector MV1 used in the direct mode takes the value of MVA1 as is. With this reference method, motion vectors need not be coded, so the bit amount of the output code sequence can be reduced. Moreover, since the motion vector is determined with reference to surrounding blocks, the method is particularly effective when the captured object moves in a fixed direction, for example because of camera panning.
[0011]
One of the prediction modes of a B picture, for which inter-picture predictive coding is performed with reference to two temporally forward pictures, two temporally backward pictures, or one temporally forward and one temporally backward picture, is the direct mode. In the direct mode, the current block does not directly carry a motion vector. Instead, two motion vectors for actually performing motion compensation of the current block are calculated by referring to the motion vector of the block at the same position in the coded picture temporally immediately following the current picture, and a predicted image is created.
[0012]
FIG. 20 is a diagram for explaining a method of determining motion vectors in the direct mode. Picture B73 is the B picture currently being coded, and bidirectional prediction in the direct mode is performed using picture P72 and picture P74 as reference pictures. If the block to be coded is block BL71, the two motion vectors required are determined using the motion vector MV71 of block BL72, which is at the same position in picture P74, the coded backward reference picture. The two motion vectors MV72 and MV73 used in the direct mode are calculated by scaling motion vector MV71 using the picture intervals TR72 and TR73, or by multiplying motion vector MV71 by a predetermined coefficient. The predicted image required for coding block BL71 is generated by averaging the pixel values of the two reference images specified by the two motion vectors. As described above, a block coded in the direct mode needs no coded motion vector, so the bit amount of the output code sequence can be reduced.
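The scaling step can be illustrated with the sketch below. The passage does not give the exact formula or say which interval TR72 and TR73 each denote, so this example assumes the commonly used temporal-direct scaling in which the co-located vector spans the whole interval between the two reference pictures. The function name and the parameters `tr_b` and `tr_d` are hypothetical.

```python
from fractions import Fraction


def temporal_direct_vectors(mv_col, tr_b, tr_d):
    """Scale the co-located block's vector into forward/backward vectors.

    mv_col: motion vector of the co-located block in the backward
            reference picture (MV71 in FIG. 20).
    tr_b:   display-order distance from the forward reference picture to
            the current picture (hypothetical parameter name).
    tr_d:   display-order distance from the forward reference picture to
            the backward reference picture (hypothetical parameter name).
    Assumes mv_col points from the backward to the forward reference.
    """
    forward = tuple(Fraction(c * tr_b, tr_d) for c in mv_col)
    backward = tuple(Fraction(c * (tr_b - tr_d), tr_d) for c in mv_col)
    return forward, backward


# B73 lies midway between P72 and P74, so tr_b = 1 and tr_d = 2.
mv_forward, mv_backward = temporal_direct_vectors((6, -4), tr_b=1, tr_d=2)
```

Exact fractions are used here only to keep the arithmetic transparent; a real codec would round to its motion vector precision.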
[0013]
[Non-patent document 1]
Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, Joint Committee Draft (2002-5-10), 99, 11 B pictures
[0014]
[Problems to be solved by the invention]
However, when a moving picture is coded using the direct mode of a B picture, coding refers to a temporally backward picture, so any picture that may be referred to had to be coded before the current picture. For this reason, in an environment where a temporally later picture cannot be coded and decoded first, coding using the direct mode of a B picture cannot be performed.
[0015]
The present invention solves the above problem. A first object of the present invention is to propose a method that allows B pictures, and in particular the direct mode, to be used without contradiction even in an environment in which temporally backward pictures are not coded and decoded before the picture to be coded or decoded. A second object is to propose highly efficient moving picture coding and decoding methods using B pictures, by proposing an efficient way of referring to the table that associates each coding mode with its identification number.
[0016]
[Means for Solving the Problems]
In order to achieve the above objects, the moving picture coding method according to the present invention is a moving picture coding method that codes a moving picture to generate a code sequence, and includes a coding step that, in coding a B picture predictively coded with reference to a plurality of coded pictures located temporally forward or backward, allows the use of a direct mode in which motion compensation of the current block is performed with reference to motion vectors of coded blocks. The coding step includes a motion compensation step of, when the B picture is predictively coded by referring only to coded pictures in one direction in display order from the picture to which the current block belongs, performing motion compensation as the direct mode by referring to the motion vectors of coded blocks located around the current block in the same picture.
[0017]
Further, the moving picture coding method of the present invention is a moving picture coding method that codes a moving picture to generate a code sequence, and includes a coding step that, in coding a B picture predictively coded with reference to a plurality of coded pictures located temporally forward or backward, allows the use of a direct mode in which motion compensation of the current block is performed with reference to motion vectors of coded blocks. The coding step includes a motion compensation step of, when the B picture is predictively coded by referring only to coded pictures in one direction in display order from the picture to which the current block belongs, performing motion compensation as the direct mode by setting the value of the motion vector of the current block to "0" and referring to one or more pictures in order from the temporally nearest one.
[0018]
Further, in the moving picture coding method according to the present invention, the coding step may include a table regeneration step of regenerating a table, in which the predictive coding methods of the B picture are associated with identifiers for identifying them, by excluding from it the predictive coding methods that refer to backward pictures, and in the coding step, the identifier indicating the predictive coding method of the B picture may be coded using the regenerated table.
[0019]
In order to achieve the above objects, the moving picture decoding method according to the present invention is a moving picture decoding method that decodes a code sequence obtained by coding a moving picture, and includes a decoding step that, in decoding a B picture predictively decoded with reference to a plurality of decoded pictures located temporally forward or backward, allows the use of a direct mode in which motion compensation of the current block is performed with reference to motion vectors of decoded blocks. The decoding step includes a motion compensation step of, when the B picture is predictively decoded by referring only to decoded pictures in one temporal direction from the picture to which the current block belongs, performing motion compensation as the direct mode by referring to the motion vectors of decoded blocks located around the current block in the same picture.
[0020]
Further, the moving picture decoding method of the present invention is a moving picture decoding method that decodes a code sequence obtained by coding a moving picture, and includes a decoding step that, in decoding a B picture predictively decoded with reference to a plurality of decoded pictures located temporally forward or backward, allows the use of a direct mode in which motion compensation of the current block is performed with reference to motion vectors of decoded blocks. The decoding step includes a motion compensation step of, when the B picture is predictively decoded by referring only to decoded pictures in one temporal direction from the picture to which the current block belongs, performing motion compensation as the direct mode by setting the value of the motion vector of the current block to "0" and referring to one or more pictures in order from the temporally nearest one.
[0021]
Further, in the moving picture decoding method according to the present invention, the decoding step may include a table regeneration step of regenerating a previously held table, in which the predictive decoding methods of the B picture are associated with identifiers for identifying them, by excluding from it the predictive decoding methods that refer to backward pictures. In the decoding step, the identifier for identifying the predictive decoding method of the B picture may be decoded from the code sequence, the predictive decoding method of the B picture may be identified using the regenerated table, and the current block may be predictively decoded according to the identified predictive decoding method.
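The table regeneration step can be pictured with the following sketch. The mode names in `FULL_TABLE` are hypothetical placeholders and are not the entries of FIG. 10; the function name `regenerate_table` and the renumbering from 0 are likewise assumptions for the example.

```python
# Hypothetical B-picture mode table: (mode name, refers to a backward picture).
FULL_TABLE = [
    ("Direct", False),
    ("Forward 16x16", False),
    ("Backward 16x16", True),
    ("Bidirectional 16x16", True),
    ("Forward 16x8", False),
    ("Backward 16x8", True),
    ("Intra", False),
]


def regenerate_table(table):
    """Drop backward-referencing modes and renumber the rest from 0."""
    kept = [name for name, uses_backward in table if not uses_backward]
    return {code: name for code, name in enumerate(kept)}


forward_only_table = regenerate_table(FULL_TABLE)
```

Because both the coder and the decoder can apply the same rule to the same stored table, they arrive at the same shortened table. With fewer entries, the remaining modes receive smaller identifiers, which under variable length coding would typically mean shorter codes.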
[0022]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
(Embodiment 1)
FIG. 1 is a block diagram illustrating the configuration of a moving picture coding apparatus 100 that executes the moving picture coding method according to Embodiment 1. When coding a B picture in the direct mode while referring only to pictures ahead of the current picture in display order, the moving picture coding apparatus 100 determines the motion vector of the current block by referring to the motion vectors of coded blocks located around the current block in the same picture. It includes a frame memory 101, a prediction residual coding unit 102, a code sequence generation unit 103, a prediction residual decoding unit 104, a frame memory 105, a motion vector detection unit 106, a mode selection unit 107, a motion vector storage unit 108, a backward picture determination unit 109, a difference calculation unit 110, an addition operation unit 111, a switch 112, and a switch 113.
[0023]
The frame memory 101, the frame memory 105, and the motion vector storage unit 108 are memories realized by a RAM or the like. The frame memory 101 provides a storage area for rearranging the pictures of a moving picture, which are input in display order, into coding order.
[0024]
The prediction residual coding unit 102 performs a frequency transform such as DCT on the prediction residual obtained by the difference calculation unit 110, quantizes the result, and outputs it. The code sequence generation unit 103 performs variable length coding on the coding result from the prediction residual coding unit 102, converts it into the format of the output code sequence, and generates a code sequence by adding other information such as header information describing the predictive coding method. The prediction residual decoding unit 104 performs variable length decoding on the coding result from the prediction residual coding unit 102, performs inverse quantization, and then performs an inverse frequency transform such as IDCT to generate a decoded prediction residual.
[0025]
The frame memory 105 provides a storage area for holding predicted pictures picture by picture. The motion vector detection unit 106 detects a motion vector for each predetermined unit, such as a macroblock or a block obtained by further dividing a macroblock. The mode selection unit 107 selects the optimal prediction mode while referring to the motion vectors used for coded pictures, which are stored in the motion vector storage unit 108, reads from the frame memory 105 each block of the predicted picture indicated by the motion vector detected by the motion vector detection unit 106, and outputs it to the difference calculation unit 110.
[0026]
The motion vector storage unit 108 provides a storage area for holding a motion vector detected for each block of an encoded picture. The backward picture determination unit 109 determines whether or not a picture behind the current picture in display order has already been coded. The difference calculation unit 110 outputs a difference between a coding-target macroblock and a macroblock of a predicted image determined by a motion vector.
[0027]
The addition operation unit 111 adds the decoded prediction residual output from the prediction residual decoding unit 104 and the block of the predicted picture output from the mode selection unit 107, and stores the result (a block constituting the predicted picture) in the frame memory 105. The switch 112 is switched according to the picture type of the current picture, and for an I picture, which is intra-picture predictively coded, connects the readout line of the frame memory 101 to the prediction residual coding unit 102. As a result, each macroblock of the current picture read from the frame memory 101 is input directly to the prediction residual coding unit 102.
[0028]
In the case of a P picture and a B picture for which inter-picture prediction coding is performed, the output side of the difference calculation unit 110 and the prediction residual coding unit 102 are made conductive. As a result, the calculation result of the difference calculation unit 110 is input to the prediction residual coding unit 102. The switch 113 is switched between conduction and blocking according to the picture type of the current picture. The output side of the mode selection unit 107 and the input side of the addition operation unit 111 are shut off for an I picture for which intra prediction coding is performed, and for a P picture and a B picture for which inter prediction coding is performed, the mode selection unit 107 The output side and the input side of the addition operation unit 111 are conducted. As a result, in an I picture for which intra prediction coding is performed, the decoded prediction residual decoded by the prediction residual decoding unit 104 is output to the frame memory 105.
[0029]
Hereinafter, the moving picture coding method according to the first embodiment of the present invention will be described with reference to the block diagram shown in FIG.
A moving picture to be coded is input to the frame memory 101 picture by picture in time order. Each picture is divided into blocks of, for example, 16 horizontal pixels × 16 vertical pixels, called macroblocks, and subsequent processing is performed block by block.
[0030]
The macroblock read from the frame memory 101 is input to the motion vector detection unit 106. The motion vector detection unit 106 detects the motion vector of the macroblock to be coded, using an image stored in the frame memory 105 (an image obtained by decoding a coded picture) as the reference picture. In prediction modes other than the direct mode, the motion vector detection unit 106 detects a motion vector for each macroblock or for each area obtained by dividing the macroblock (for example, small blocks of 16 pixels × 8 pixels, 8 pixels × 16 pixels, or 8 pixels × 8 pixels).
[0031]
For the macroblock to be coded, the motion vector detection unit 106 uses an already coded picture as the reference picture and detects, within a search area in that picture, a motion vector indicating the position of the block predicted to be closest to the arrangement of pixel values of the current block. The mode selection unit 107 selects the optimal prediction mode while referring to the motion vectors used for coded pictures, which are stored in the motion vector storage unit 108. At this time, the backward picture determination unit 109 determines whether the pictures located later in display order have already been coded. If it is determined that the backward pictures have not been coded, the mode selection unit 107 selects, in coding the B picture, a prediction mode that does not refer to pictures located later in display order.
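A motion search of the kind described for the motion vector detection unit 106 can be sketched as an exhaustive block-matching search. The text only says that the block closest in pixel values is found within a search area; the use of the sum of absolute differences (SAD) as the closeness measure, the full search strategy, and all names below are assumptions for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block and a candidate."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def find_motion_vector(cur, ref, bx, by, size, search):
    """Full search over displacements within +/- search pixels."""
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference picture.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv


# Reference picture with a smooth, non-repeating pattern.
ref = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
# Current picture: the 2x2 block at (2, 2) is a copy of the reference
# block located 2 pixels to the right and 1 pixel down.
cur = [[0] * 8 for _ in range(8)]
for y in range(2):
    for x in range(2):
        cur[2 + y][2 + x] = ref[3 + y][4 + x]

mv = find_motion_vector(cur, ref, bx=2, by=2, size=2, search=2)
```

In this toy data the search recovers the displacement used to build the current block. Practical coders use larger blocks and faster search patterns than this exhaustive loop.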
[0032]
According to the prediction mode selected by the mode selection unit 107, the optimal motion vector is determined from the motion vectors detected by the motion vector detection unit 106, and the prediction block referred to by the determined motion vector is read from the frame memory 105 and input to the difference calculation unit 110. The difference calculation unit 110 generates a prediction residual image by calculating the difference between the prediction block and the macroblock to be coded. The generated prediction residual image is input to the prediction residual coding unit 102, where it undergoes frequency transform and quantization. The above flow is the operation when inter-picture predictive coding is selected; switching to intra-picture predictive coding is done by the switch 112. Finally, the code sequence generation unit 103 performs variable length coding on control information such as motion vectors and on the image information output from the prediction residual coding unit 102, generating the code sequence that is finally output.
[0033]
The outline of the coding flow has been described above. The details of the direct mode processing in the mode selection unit 107 are described below, for the case in which the backward picture determination unit 109 determines that the backward pictures have not been coded. FIG. 2 is a diagram illustrating an example of the reference relationship between pictures when pictures located after the picture containing the current block in display order cannot be referred to. As shown in the figure, all B pictures in the display-order sequence are predictively coded with reference to one or more coded pictures located ahead in display order. For example, picture B82 and picture B83, both B pictures, have only the coded picture P81 ahead of them in display order, so motion compensation is performed with reference to picture P81 only.
[0034]
Further, of the pictures B85 and B86, both B pictures, the picture B85 refers to two coded pictures (the picture P81 and the picture P84) located ahead in display order, whereas the picture B86 performs motion compensation by referring only to the picture P84, which is temporally closer in display order, without referring to the picture P81, which is temporally farther. In such cases, the motion vectors of each B picture all refer to coded pictures that are ahead of the current picture in display order.
[0035]
In the present embodiment, when the mode selection unit 107 selects direct mode while a B picture is being predictively encoded in an environment in which pictures behind in display order have not been coded, the motion vector of the current block is not generated, as in the related art, by referring to the motion vector of a coded block belonging to the picture immediately following the current picture in display order (hereinafter referred to as “temporal prediction”). Instead, direct mode is realized by generating the motion vector of the current block with reference to the motion vectors of coded blocks located around the current block in the same picture (hereinafter referred to as “spatial prediction”).
[0036]
FIG. 3 is a flowchart illustrating an example of the operation of the mode selection unit 107 when the direct mode is selected. When the direct mode is selected, the mode selection unit 107 first causes the backward picture determination unit 109 to determine whether or not a picture behind the current picture in display order has been coded (S501). If the result of the determination is that the picture behind in the display order has already been coded, predictive coding of the current block is performed using temporal prediction as in the prior art (S502). Then, the mode selection unit 107 ends the processing of the current block, and moves on to the processing of the next current block.
[0037]
Also, as a result of the determination in step S501, if the picture in the rear in the display order is not coded, predictive coding of the coding target block is performed using the above spatial prediction (S503). Further, the mode selection unit 107 sets the value of the flag spatial_flag indicating that the spatial prediction has been performed to “1” and outputs the value to the code sequence generation unit 103 (S504). After that, the mode selection unit 107 ends the processing of the current block to be processed, and proceeds to the processing of the next current block.
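The branch of FIG. 3 (steps S501 to S504) can be sketched as follows. This is a minimal illustration, not the apparatus itself: the function and class names are hypothetical, and the value 0 for spatial_flag in the temporal case is inferred from the decoder-side description, where “0” selects temporal prediction.

```python
from dataclasses import dataclass


@dataclass
class DirectModeDecision:
    prediction: str    # "temporal" or "spatial"
    spatial_flag: int  # value handed to the code sequence generation unit


def select_direct_prediction(backward_picture_coded: bool) -> DirectModeDecision:
    # S501: has a picture behind the current picture in display order been coded?
    if backward_picture_coded:
        # S502: temporal prediction, as in the prior art (flag value 0 is assumed).
        return DirectModeDecision("temporal", 0)
    # S503 and S504: spatial prediction, and spatial_flag is set to 1.
    return DirectModeDecision("spatial", 1)
```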
[0038]
Hereinafter, a specific method of the spatial prediction performed in step S503 of FIG. 3 will be described.
The example of the skip mode described with reference to FIG. 19 is for the case where the encoded block to be referred to has one motion vector at a time. However, among the prediction modes of the B picture, as shown in FIG. 2, there is also a mode in which the motion compensation is performed by simultaneously referring to two pictures ahead in the display order. In such a mode, one block has two motion vectors. FIG. 4 is a diagram illustrating an example of a motion vector reference relationship in a case where a block having two motion vectors is included in an encoded block whose motion vector is referred to. The picture P94 is a picture currently being coded, and the block BL51 is a block for performing predictive coding in the direct mode.
[0039]
First, as a first method, whichever of the cases shown in FIGS. 18A and 18B applies to the block BL51 (or the block BL52) to be predictively encoded in direct mode, the mode selection unit 107 basically refers to the motion vectors of the blocks at positions A, B, and C. However, the blocks referred to are changed according to the following conditions.
[0040]
1. If the block C cannot be referred to, the blocks at positions A, B, and D are referred to.
2. If any of the three blocks at positions A, B, and C (or A, B, and D) has no motion vector that can be referred to, that block is excluded from the motion vector reference targets.
[0041]
Among the motion vectors of the three referenced blocks A, B, and C (or A, B, and D), the mode selection unit 107 compares how far each referenced picture is from the picture to be coded in display order. From this comparison, it extracts the motion vectors that refer to the picture closest to the current picture in display order. When more than one motion vector is extracted, their median or average is taken. For example, the median may be taken when an odd number of motion vectors is extracted, and the average when an even number is extracted. When motion compensation is performed with reference only to pictures preceding the current picture in display order, the motion vector obtained in this way is used as the motion vector of the current block when direct mode is selected. When none of the blocks A, B, and C (or A, B, and D) can be referred to, the motion vector of the current block is set to 0 and the immediately preceding picture is used as the reference picture.
[0042]
FIG. 5 is a flowchart illustrating an example of a processing procedure when the mode selection unit 107 illustrated in FIG. 1 performs spatial prediction of the current block using the first method. Hereinafter, the encoding target block BL51 illustrated in FIG. 4 will be described as an example. First, the mode selection unit 107 checks whether or not the block at the position C with respect to the encoding target block BL51 can be referred to (S601). In FIG. 4, the block at the position C has a motion vector MVC1 referring to the picture P93 and a motion vector MVC2 referring to the picture P92. Therefore, the mode selection unit 107 refers to the motion vectors of the blocks at the positions of A, B, and C (S602).
[0043]
The block at position A has a motion vector MVA1 referring to the picture P93, and the block at position B has a motion vector MVB1 referring to the picture P93 and a motion vector MVB3 referring to the picture P91. If, in step S601, the block at position C has no motion vector because it lies outside the current picture P94, lies outside the slice to which the current block BL51 belongs, or has been encoded by intra prediction or the like, the motion vector of the block at position D shown in FIGS. 18A and 18B is referred to instead of that of the block at position C (S603). That is, the three blocks at positions A, B, and D are referred to.
[0044]
Next, if any of the three referenced blocks (A, B, C or A, B, D) has no motion vector because it lies outside the current picture P94, lies outside the slice to which the current block BL51 belongs, or has been encoded by intra prediction or the like, the mode selection unit 107 excludes that block from the reference candidates and calculates the motion vector of the encoding target block (S604).
[0045]
When none of the three blocks (A, B, C or A, B, D) can be referred to, the motion vector of the current block is set to “0” and the picture immediately before the current picture is referred to. When the mode selection unit 107 extracts from the referenced motion vectors only those that refer to the picture closest to the current picture in display order, the motion vectors MVA1, MVB1, and MVC1, which refer to the picture P93, are obtained. The mode selection unit 107 then takes their median or average. In this case three motion vectors are obtained, so the median is taken. This determines one motion vector MV1 for motion compensation of the block BL51.
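The first method (steps S601 to S604 followed by the median or average) can be sketched as below. This is an illustration under stated assumptions rather than the patented implementation: the names `MotionVector`, `combine`, and `spatial_direct_mv` are hypothetical, pictures are identified by a display-order index, all references are assumed to point forward, the "immediately preceding picture" is taken to be index minus one, and the vector components in the example are invented, since the text gives only the reference relationships of FIG. 4. Any rounding to the codec's motion vector precision is left out.

```python
from statistics import median
from typing import List, NamedTuple, Optional


class MotionVector(NamedTuple):
    x: float
    y: float
    ref_order: int  # display-order index of the referenced picture (assumed representation)


# None or an empty list stands for a block that cannot be referred to
# (outside the picture or slice, or intra coded).
Neighbour = Optional[List[MotionVector]]


def combine(values: List[float]) -> float:
    # Example rule from the text: median for an odd count, average for an even count.
    return median(values) if len(values) % 2 else sum(values) / len(values)


def spatial_direct_mv(current_order: int, a: Neighbour, b: Neighbour,
                      c: Neighbour, d: Neighbour = None) -> MotionVector:
    # S601 to S603: use position D in place of position C when C is unavailable.
    third = c if c else d
    # S604: drop neighbours without motion vectors and pool what remains.
    candidates = [mv for block in (a, b, third) if block for mv in block]
    if not candidates:
        # No usable neighbour: zero vector, immediately preceding picture
        # (assumes consecutive display-order indices).
        return MotionVector(0, 0, current_order - 1)
    # Forward-only references are assumed, so the largest index is the closest picture.
    nearest = max(mv.ref_order for mv in candidates)
    picked = [mv for mv in candidates if mv.ref_order == nearest]
    return MotionVector(combine([mv.x for mv in picked]),
                        combine([mv.y for mv in picked]), nearest)


# Reference layout of FIG. 4 with made-up vector components; the current picture is P94.
mva1 = MotionVector(4, 2, 93)
mvb1, mvb3 = MotionVector(6, -2, 93), MotionVector(10, 0, 91)
mvc1, mvc2 = MotionVector(5, 1, 93), MotionVector(9, 3, 92)
mv1 = spatial_direct_mv(94, [mva1], [mvb1, mvb3], [mvc1, mvc2])
print(mv1)  # MotionVector(x=5, y=1, ref_order=93)
```

In the example, only MVA1, MVB1, and MVC1 refer to P93, so MVB3 and MVC2 are ignored and the component-wise median of the three remaining vectors is returned.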
[0046]
FIG. 6 is a diagram illustrating an example of a data structure for each slice of the code string generated by the code string generation unit 103 illustrated in FIG. The code string of each picture is composed of a plurality of slice data, and each slice data is composed of a plurality of macroblock data. As shown in the figure, a slice header is added to each slice data in the code string, and information about the slice and the like is written in the slice header. The information on the slice describes, for example, the number of the frame to which the slice belongs, a flag spatial_flag indicating the type of the encoding method in the direct mode, and the like.
[0047]
As described above, the present embodiment proposes a method for realizing direct mode without referring to pictures located backward in display order when predictive encoding is performed using direct mode, even in an environment where pictures located backward in display order cannot be referred to, and thereby provides a coding method that achieves high coding efficiency.
[0048]
Note that, in the first method, those of the referenced motion vectors that refer to the picture closest to the encoding target picture in display order are extracted; alternatively, only those that refer to the immediately preceding picture may be extracted. In the example shown in FIG. 4, the picture closest to the current picture in display order among the pictures referred to by the referenced motion vectors is the picture immediately before the current picture, so the motion vectors obtained are the same in either case. If no motion vector refers to the closest picture in display order, encoding in direct mode is performed with the motion vector of the encoding target block set to “0”.
[0049]
Further, in the first method, when the motion vector to be used in direct mode is determined, only the motion vectors that refer to the picture closest to the picture to be coded in display order are extracted from among the pictures referred to by the neighboring coded blocks, and a single motion vector is finally calculated. Instead, as a second method, it is also possible to extract the motion vectors that refer to the N nearest pictures preceding the current picture in display order, determine one motion vector for each referenced picture, and perform forward-only motion compensation using the resulting N motion vectors as the motion vectors for predictive coding in direct mode. In that case, the predicted image is generated by averaging the pixel values of the N regions specified by the N motion vectors.
[0050]
Note that it is also possible to generate a predicted image by a method of averaging by weighting the pixel values of each area instead of a simple average. By using this method, it is possible to realize more accurate motion compensation for an image sequence in which pixel values gradually change in display order.
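The averaging of the N regions in the second method, and its weighted variant, can be illustrated with the following sketch. The function name `blend_regions`, the representation of a region as a list of rows, and the weight values are all assumptions for illustration; the text does not specify how the weights are chosen.

```python
from typing import List, Optional, Sequence

Block = List[List[float]]  # pixel values of one region, row by row (assumed representation)


def blend_regions(regions: Sequence[Block],
                  weights: Optional[Sequence[float]] = None) -> Block:
    # With no weights this is the simple average of the N regions;
    # otherwise each region contributes in proportion to its weight.
    if weights is None:
        weights = [1.0] * len(regions)
    total = sum(weights)
    rows, cols = len(regions[0]), len(regions[0][0])
    return [[sum(w * r[i][j] for w, r in zip(weights, regions)) / total
             for j in range(cols)]
            for i in range(rows)]


# Two 1x2 regions fetched with two forward motion vectors (made-up pixel values).
near, far = [[120, 140]], [[100, 100]]
simple = blend_regions([far, near])            # [[110.0, 120.0]]
weighted = blend_regions([far, near], [1, 3])  # [[115.0, 130.0]]
```

Giving the nearer picture a larger weight, as in the second call, is one plausible way to follow a gradual change in pixel values such as a fade.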
[0051]
FIG. 7 is a diagram illustrating an example of how motion vectors are referred to when two motion vectors are calculated by extracting the motion vectors that refer to the two nearest pictures preceding the current picture in display order. Picture P104 is the picture currently being coded, and BL51 is the block to be predictively coded in direct mode. The motion vector MV1 is determined by taking the median or average of the motion vectors MVA1, MVB1, and MVC1, which refer to the picture P103, the closest preceding picture in display order among the pictures referred to by the referenced motion vectors. The motion vector MV2 is determined by taking the median or average of the motion vectors that refer to the picture P102, two pictures before in display order; here that is MVC2 itself. Coding in direct mode is performed using these two motion vectors.
[0052]
Note that, from among the motion vectors of the blocks referred to in FIGS. 18A and 18B, instead of using only those that refer to the nearest one or N preceding pictures in display order, it is also possible to extract only the motion vectors that refer to a specified picture, determine from them the motion vector of the current block for use in direct mode, and perform motion compensation from the specified picture.
[0053]
Note that, when encoding is performed using direct mode, instead of performing motion compensation with reference to coded blocks in the positional relationship shown in FIGS. 18A and 18B, it is also possible to perform motion compensation in direct mode by setting the motion vector of the encoding target block to “0” and using the immediately preceding picture as the reference picture. With this method the step of calculating the motion vector to be used in direct mode is unnecessary, so the encoding process can be simplified.
[0054]
At this time, instead of the flag spatial_flag indicating whether temporal prediction or spatial prediction is performed in direct mode, a flag indicating that motion compensation is performed with the motion vector of the encoding target block set to “0”, without referring to coded blocks, may be described in the slice header.
[0055]
The above method was described as extracting, from among the motion vectors obtained by referring to the three blocks, the motion vectors that refer to the picture closest to the current picture in display order among the pictures they refer to; however, the present invention is not limited to this. For example, the motion vectors referring to the picture closest to the current picture in coding order may be extracted instead.
[0056]
(Embodiment 2)
A moving picture decoding method according to the second embodiment of the present invention will be described with reference to the block diagram shown in FIG. However, in the present moving picture decoding method, it is assumed that the code sequence generated by the moving picture coding method of Embodiment 1 is decoded.
[0057]
FIG. 8 is a block diagram illustrating the configuration of a moving picture decoding apparatus 200 according to the present embodiment. When the flag indicating the decoding method in direct mode is “1”, the moving picture decoding apparatus 200 decodes a decoding target block that was encoded in direct mode by using spatial prediction. It includes a code sequence analysis unit 201, a prediction residual decoding unit 202, a frame memory 203, a motion compensation decoding unit 204, a motion vector storage unit 205, a backward picture determination unit 206, an addition operation unit 207, and a switch 208.
[0058]
The code sequence analysis unit 201 analyzes the input code sequence, extracts information such as prediction residual coded data, motion vector information and a prediction mode from the code sequence, and extracts the extracted motion vector information and prediction mode. The information is output to the motion compensation decoding unit 204 and the prediction residual coded data is output to the prediction residual decoding unit 202. The prediction residual decoding unit 202 performs variable-length decoding, inverse quantization, inverse frequency transformation, and the like on the extracted prediction residual encoded data to generate a prediction residual image.
[0059]
The frame memory 203 stores the decoded images in picture units, and outputs the stored pictures in display order to an external monitor or the like as output images. The motion compensation decoding unit 204 decodes the prediction mode and the motion vectors used in that prediction mode and, using a decoded image stored in the frame memory 203 as a reference picture, generates a predicted image for the current block on the basis of the input motion vector information. When decoding a motion vector, it uses the decoded motion vectors stored in the motion vector storage unit 205.
[0060]
The motion vector storage unit 205 stores the motion vectors decoded by the motion compensation decoding unit 204. When the motion compensation decoding unit 204 generates the predicted image, the backward picture determination unit 206 determines whether or not a picture behind the current picture in display order has been decoded. Note that the backward picture determination unit 206 is used in the fourth embodiment but is unnecessary in the present embodiment. The addition operation unit 207 adds the prediction residual image decoded by the prediction residual decoding unit 202 and the predicted image generated by the motion compensation decoding unit 204 to generate a decoded image of the decoding target block.
[0061]
First, various information such as motion vector information and prediction residual coded data is extracted from the input code sequence by the code sequence analysis unit 201. The motion vector information extracted here is output to the motion compensation decoding unit 204, and the prediction residual coded data is output to the prediction residual decoding unit 202. The motion compensation decoding unit 204 uses the decoded image of the decoded picture stored in the frame memory 203 as a reference picture, and generates a predicted image based on the decoded motion vector.
[0062]
The predicted image generated in this way is input to the addition operation unit 207, where it is added to the prediction residual image generated by the prediction residual decoding unit 202 to produce a decoded image. If the prediction direction is not restricted, the generated decoded images are rearranged in the frame memory 203 into display order; if pictures behind in display order cannot be referred to, however, the images can be displayed in the order in which they were decoded, without rearrangement. The operation above applies to a code sequence that was inter-picture predictively encoded; the switch 208 switches between it and the decoding process for a code sequence that was intra-picture predictively encoded.
[0063]
The outline of the decoding flow has been described above, and the details of the processing in the motion compensation decoding unit 204 will be described below.
FIG. 9 is a flowchart showing a processing procedure of decoding in the direct mode in the motion compensation decoding unit 204 shown in FIG.
[0064]
The prediction mode and the motion vector information are attached to each macroblock or to each block obtained by dividing a macroblock. These pieces of information are described in the slice data area of the code sequence, in the order of the macroblocks within the slice. If the prediction mode Mode indicates direct mode, the motion compensation decoding unit 204 checks whether the flag spatial_flag decoded from the slice header is set to “0” or “1” (S901). When the backward picture has not been decoded, the flag spatial_flag is set to “1”, which indicates that decoding is performed using spatial prediction.
[0065]
If the flag spatial_flag is set to “1”, the motion compensation decoding unit 204 creates the predicted image of the current block using spatial prediction in direct mode (S902); if it is set to “0”, the motion compensation decoding unit 204 creates the predicted image of the current block using temporal prediction in direct mode (S903). When the prediction mode Mode in the slice header indicates a prediction mode other than direct mode, the motion compensation decoding unit 204 uses a decoded picture as the reference picture for the macroblock to be decoded, identifies a block in the reference picture on the basis of the decoded motion vector, and cuts out from the identified block the image used for motion compensation to generate the predicted image.
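The decoder-side dispatch of FIG. 9 (S901 to S903) can be sketched as follows. The `SliceHeader` fields mirror the two items the text names for the slice header (the frame number and spatial_flag); the class name, the function name, and the string labels are hypothetical and serve only to show the control flow.

```python
from dataclasses import dataclass


@dataclass
class SliceHeader:
    frame_number: int  # number of the frame to which the slice belongs
    spatial_flag: int  # 1: spatial prediction in direct mode, 0: temporal prediction


def direct_mode_prediction(header: SliceHeader, mode: str) -> str:
    if mode != "direct":
        # Other modes use the motion vectors decoded from the code sequence.
        return "explicit"
    if header.spatial_flag == 1:  # S901: check the flag decoded from the slice header
        return "spatial"          # S902
    return "temporal"             # S903
```

Because the flag is carried once per slice, every direct-mode macroblock in the slice is decoded with the same kind of prediction.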
[0066]
Hereinafter, a specific method of the spatial prediction performed in step S902 of FIG. 9 will be described.
The example of the skip mode described with reference to FIG. 19 is for the case where the decoded block to be referred to has one motion vector at a time. However, among the prediction modes of the B picture, as shown in FIG. 2, there is also a mode in which the motion compensation is performed by simultaneously referring to two pictures ahead in the display order. In such a mode, one block has two motion vectors.
[0067]
FIG. 4 shows an example of a motion vector reference relationship in the case where a block having two motion vectors is included in a decoded block to which a motion vector is referred. Picture P94 is a picture currently being decoded, and block BL51 is a block for performing predictive decoding in the direct mode.
[0068]
First, as a first method, whichever of the cases shown in FIGS. 18A and 18B applies to the block BL51 (or the block BL52) to be predictively decoded in direct mode, the motion compensation decoding unit 204 basically refers to the motion vectors of the blocks at positions A, B, and C. However, the blocks referred to are changed according to the following conditions.
[0069]
1. If the block C cannot be referred to, the blocks at positions A, B, and D are referred to.
2. If any of the three blocks at positions A, B, and C (or A, B, and D) has no motion vector that can be referred to, that block is excluded from the motion vector reference targets.
[0070]
Among the motion vectors of the three referenced blocks A, B, and C (or A, B, and D), the motion compensation decoding unit 204 compares how far each referenced picture is from the decoding target picture in display order. From this comparison, it extracts the motion vectors that refer to the picture closest to the decoding target picture in display order. When more than one motion vector is extracted, their median or average is taken. For example, the median may be taken when an odd number of motion vectors is extracted, and the average when an even number is extracted.
[0071]
When motion compensation is performed with reference only to pictures preceding the current picture in display order, the motion vector obtained in this way is used as the motion vector of the decoding target block when direct mode is selected. If none of the blocks A, B, and C (or A, B, and D) can be referred to, the motion vector of the current block is set to 0 and the immediately preceding picture is used as the reference picture.
[0072]
The flowchart in FIG. 5 illustrates an example of a processing procedure when the motion compensation decoding unit 204 illustrated in FIG. 8 performs the spatial prediction of the current block using the first method. Hereinafter, the decoding target block BL51 illustrated in FIG. 4 will be described as an example.
[0073]
First, the motion compensation decoding unit 204 checks whether or not the block at the position C can be referred to with respect to the decoding target block BL51 (S601). In FIG. 4, the block at the position C has a motion vector MVC1 referring to the picture P93 and a motion vector MVC2 referring to the picture P92. Therefore, the motion compensation decoding unit 204 refers to the motion vectors of the blocks at the positions of A, B, and C (S602).
[0074]
The block at position A has a motion vector MVA1 referring to the picture P93, and the block at position B has a motion vector MVB1 referring to the picture P93 and a motion vector MVB3 referring to the picture P91. If, in step S601, the block at position C has no motion vector because it lies outside the decoding target picture P94, lies outside the slice to which the decoding target block BL51 belongs, or has been decoded by intra prediction or the like, the motion vector of the block at position D shown in FIGS. 18A and 18B is referred to instead of that of the block at position C (S603). That is, the three blocks at positions A, B, and D are referred to.
[0075]
Next, if any of the three referenced blocks (A, B, C or A, B, D) has no motion vector because it lies outside the decoding target picture P94, lies outside the slice to which the decoding target block BL51 belongs, or has been decoded by intra prediction or the like, the motion compensation decoding unit 204 excludes that block from the reference candidates and calculates the motion vector of the decoding target block (S604).
[0076]
When none of the three blocks (A, B, C or A, B, D) can be referred to, the motion vector of the current block is set to “0” and the picture immediately before the decoding target picture is referred to. When the motion compensation decoding unit 204 extracts from the referenced motion vectors only those that refer to the picture closest to the decoding target picture in display order, the motion vectors MVA1, MVB1, and MVC1, which refer to the picture P93, are obtained. The motion compensation decoding unit 204 then takes their median or average. In this case three motion vectors are obtained, so the median is taken. This determines one motion vector MV1 for motion compensation of the block BL51.
[0077]
As described above, the present embodiment proposes a method for realizing direct mode without referring to pictures located backward in display order when predictive decoding is performed using direct mode, even in an environment where pictures located backward in display order cannot be referred to, and thereby provides a decoding method that achieves high coding efficiency.
[0078]
Note that, in the first method, those of the referenced motion vectors that refer to the picture closest to the decoding target picture in display order are extracted; alternatively, only those that refer to the immediately preceding picture may be extracted. In the example shown in FIG. 4, the picture closest to the current picture in display order among the pictures referred to by the referenced motion vectors is the picture immediately before the decoding target picture, so the motion vectors obtained are the same in either case. If no motion vector refers to the closest picture in display order, decoding in direct mode is performed with the motion vector of the decoding target block set to “0”.
[0079]
Further, in the first method, when the motion vector to be used in direct mode is determined, only the motion vectors that refer to the picture closest to the decoding target picture in display order are extracted from among the pictures referred to by the neighboring decoded blocks, and a single motion vector is finally calculated. Instead, as a second method, it is also possible to extract the motion vectors that refer to the N nearest pictures preceding the decoding target picture in display order, determine one motion vector for each referenced picture, and perform forward-only motion compensation using the resulting N motion vectors as the motion vectors for predictive decoding in direct mode. In that case, the predicted image is generated by averaging the pixel values of the N regions specified by the N motion vectors.
[0080]
Note that it is also possible to generate a predicted image by a method of averaging by weighting the pixel values of each area instead of a simple average. By using this method, it is possible to realize more accurate motion compensation for an image sequence in which pixel values gradually change in display order.
[0081]
FIG. 7 illustrates an example of how motion vectors are referred to when two motion vectors are calculated by extracting the motion vectors that refer to the two nearest pictures preceding the current picture in display order. Picture P104 is the picture currently being decoded, and BL51 is the block to be predictively decoded in direct mode. The motion vector MV1 is determined by taking the median or average of the motion vectors MVA1, MVB1, and MVC1, which refer to the picture P103, the closest preceding picture in display order among the pictures referred to by the referenced motion vectors. The motion vector MV2 is determined by taking the median or average of the motion vectors that refer to the picture P102, two pictures before in display order; here that is MVC2 itself. Decoding in direct mode is then performed using these two motion vectors.
[0082]
Note that, from among the motion vectors of the blocks referred to in FIGS. 18A and 18B, instead of using only those that refer to the nearest one or N preceding pictures in display order, it is also possible to extract only the motion vectors that refer to a specified picture, determine from them the motion vector of the current block for use in direct mode, and perform motion compensation from the specified picture.
[0083]
When decoding is performed using direct mode, instead of performing motion compensation with reference to decoded blocks in the positional relationship shown in FIGS. 18A and 18B, it is also possible to perform motion compensation in direct mode by setting the motion vector of the decoding target block to “0” and using the immediately preceding picture as the reference picture. With this method the step of calculating the motion vector to be used in direct mode is unnecessary, so the decoding process can be simplified.
[0084]
If, in the corresponding encoding process, a flag has been encoded indicating that motion compensation in direct mode is performed with the motion vector of the encoding target block set to “0” without referring to coded blocks, the decoder can interpret the value of that flag, switch to that operation, and perform motion compensation in direct mode accordingly.
[0085]
The above method was described as extracting, from among the motion vectors obtained by referring to the three blocks, the motion vectors that refer to the picture closest to the decoding target picture in display order among the pictures they refer to; however, the present invention is not limited to this. For example, the motion vectors referring to the picture closest to the current picture in decoding order may be extracted instead.
[0086]
(Embodiment 3)
The moving picture coding method according to the third embodiment of the present invention will be described with reference to the block diagram shown in FIG.
A moving image to be encoded is input to the frame memory 101 in time order, one picture at a time. Each picture is divided into blocks called macroblocks, for example of 16 × 16 pixels, and the subsequent processing is performed in block units.
[0087]
The macroblock read from the frame memory 101 is input to the motion vector detection unit 106. Here, a motion vector of a macroblock to be coded is detected using a decoded picture of a coded picture stored in the frame memory 105 as a reference picture.
[0088]
The mode selection unit 107 determines the optimal prediction mode while referring to the motion vectors used in coded pictures, which are stored in the motion vector storage unit 108. At this time, the backward picture determination unit 109 determines whether or not a picture behind in display order has already been coded. If it determines that no backward picture has been coded, the selection is restricted so that a prediction mode that refers to a picture located backward in display order is not selected.
[0089]
FIG. 10 shows an example of a table that associates the codes for identifying the prediction mode in a B picture with the encoding modes. If the prediction direction is not restricted, a table listing all the reference patterns is used, as shown in FIG. 10A; if the prediction direction is restricted to forward only, a table from which all patterns that refer to backward pictures have been removed is created anew, as shown in FIG. 10B, and is referred to instead. This makes it possible to reduce the number of bits required for the code that identifies the prediction mode. The items in the tables of FIG. 10A and FIG. 10B can be handled in the same manner when other values are used.
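Why removing the backward-referencing entries saves bits can be seen in the following sketch. Everything concrete in it is an assumption: the mode names and their order are invented (the actual contents of FIG. 10 are not reproduced in the text), and an Exp-Golomb code is used as one example of a variable length code in which smaller code numbers receive shorter codewords.

```python
from typing import List, Tuple

# Hypothetical unrestricted table: (mode name, refers to a backward picture?).
FULL_TABLE: List[Tuple[str, bool]] = [
    ("direct", False),
    ("forward", False),
    ("backward", True),
    ("bidirectional", True),
    ("two_forward", False),
    ("two_backward", True),
]


def restrict_to_forward(table: List[Tuple[str, bool]]) -> List[str]:
    # Rebuild the table without the patterns that refer to backward pictures;
    # the remaining modes move up to smaller code numbers.
    return [name for name, uses_backward in table if not uses_backward]


def exp_golomb_bits(code_number: int) -> int:
    # Codeword length of the unsigned Exp-Golomb code for a given code number.
    return 2 * (code_number + 1).bit_length() - 1


full_names = [name for name, _ in FULL_TABLE]
forward_names = restrict_to_forward(FULL_TABLE)
bits_before = exp_golomb_bits(full_names.index("two_forward"))    # code number 4
bits_after = exp_golomb_bits(forward_names.index("two_forward"))  # code number 2
print(bits_before, bits_after)  # 5 3
```

In this made-up table the mode that refers to two forward pictures drops from code number 4 to code number 2, so its codeword shrinks from 5 bits to 3 bits.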
[0090]
FIG. 2 shows the reference relationship of each picture when it is not possible to refer to the picture behind in the display order. All the B pictures included in the sequence are subjected to predictive encoding with reference to one or a plurality of encoded pictures located ahead in the display order.
[0091]
The prediction image determined by the obtained motion vector is input to the difference calculation unit 110, where the difference from the coding target macroblock is calculated to generate a prediction residual image, which is then coded in the prediction residual coding unit 102. The above processing flow is the operation when inter-picture prediction coding is selected; switching to intra-picture prediction coding is performed by the switch 112. Finally, the code string generation unit 103 performs variable length coding on control information such as motion vectors and on the image information output from the prediction residual coding unit 102, and generates the code string that is finally output.
[0092]
The outline of the encoding flow has been described above. The details of the processing in the motion vector detection unit 106 and the mode selection unit 107 are described below. Here, it is assumed that the backward picture determination unit 109 has determined that the backward pictures are not coded.
[0093]
Motion vector detection is performed for each macroblock or for each region obtained by dividing a macroblock. For the macroblock to be coded, a previously coded picture is used as a reference picture, and a predicted image is created by determining the prediction mode and the motion vector that indicates the position predicted to be optimal within the search area in that picture.
[0094]
If the direct mode is selected by the mode selection unit 107 when predictive encoding of a B picture is performed in an environment in which the pictures behind in the display order have not been encoded, the direct mode is realized by referring to the motion vectors of coded blocks located around the block to be coded, instead of using the motion vector of the immediately following picture in the display order as described in the related art.
[0095]
First, the case in which each of the encoded blocks located around the current block has one motion vector is described. FIG. 18 shows the positional relationship of the referenced blocks. FIG. 18A shows an example in which the block BL51 to be encoded in the direct mode has a size of 16 × 16 pixels, and FIG. 18B shows an example in which the block BL52 to be encoded in the direct mode has a size of 8 × 8 pixels. In either case, the motion vectors of the three blocks in positions A, B, and C are basically referred to. However, under either of the following conditions, they are not referred to; instead, the motion vector of the current block is set to “0” and motion compensation in the direct mode is performed by referring to the immediately preceding picture.
[0096]
1. When A or B is outside the picture or outside the slice.
2. When A or B has a motion vector of value “0” that refers to the immediately preceding picture.
From the motion vectors of the three referenced blocks A, B, and C, only those that refer to the immediately preceding picture are extracted, and their median or average value is taken as the motion vector actually used in the direct mode. However, when the block C cannot be referred to, the block D is used instead.
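The derivation described above can be sketched as follows. This is only an illustration under stated assumptions: the `Neighbor` type and the `ref_distance` field (1 meaning the immediately preceding picture in display order) are hypothetical, the component-wise median with integer rounding for an even number of candidates is one of the options the text allows ("median or average"), and returning a zero vector when no neighbor refers to the immediately preceding picture is an assumption the text does not state.

```python
from typing import List, NamedTuple, Optional, Tuple

MV = Tuple[int, int]


class Neighbor(NamedTuple):
    mv: MV             # (horizontal, vertical) motion vector of the neighboring block
    ref_distance: int  # 1 = refers to the immediately preceding picture in display order


def median_mv(mvs: List[MV]) -> MV:
    """Component-wise median; for an even count the two middle values are averaged (floor)."""
    xs = sorted(v[0] for v in mvs)
    ys = sorted(v[1] for v in mvs)
    mid = len(mvs) // 2
    if len(mvs) % 2:
        return (xs[mid], ys[mid])
    return ((xs[mid - 1] + xs[mid]) // 2, (ys[mid - 1] + ys[mid]) // 2)


def direct_mode_mv(a: Optional[Neighbor], b: Optional[Neighbor],
                   c: Optional[Neighbor], d: Optional[Neighbor] = None) -> MV:
    """None means the block is outside the picture or slice, or otherwise unavailable."""
    # Condition 1: A or B is outside the picture or outside the slice.
    if a is None or b is None:
        return (0, 0)
    # Condition 2: A or B has a zero motion vector referring to the preceding picture.
    if any(n.ref_distance == 1 and n.mv == (0, 0) for n in (a, b)):
        return (0, 0)
    # Block D substitutes for block C when C cannot be referred to.
    third = c if c is not None else d
    candidates = [n.mv for n in (a, b, third)
                  if n is not None and n.ref_distance == 1]
    if not candidates:
        return (0, 0)  # assumption: this case is not specified in the text
    return median_mv(candidates)
```

When only one neighbor refers to the immediately preceding picture, its vector is returned unchanged, which matches the single-vector case the text illustrates with MVA1.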
[0097]
FIG. 19 shows an example of the reference relationship of the motion vectors in this case. The block BL51 belonging to the picture P64 is assumed to be the block currently being coded. In this example, MVA1 is the only motion vector that refers to the immediately preceding picture, so the value of MVA1 is used as is as the motion vector MV1 used in the direct mode. Note that blocks in positions other than A, B, C, and D shown in FIGS. 18A and 18B can be referred to in the same manner.
[0098]
The example of FIG. 19 covers the case where each encoded block to be referred to has one motion vector. However, some of the B picture prediction modes perform motion compensation by referring to two pictures ahead in the display order at the same time. In such a mode, one block has two motion vectors.
[0099]
The following describes the case in which the coded blocks located around the current block include a block having two motion vectors. FIG. 4 is a diagram illustrating an example of the reference relationship of the motion vectors in this case. The picture P94 is the picture currently being coded, and the block BL51 is the block to be predictively coded in the direct mode. Among all the motion vectors of the blocks to be referred to, the motion vectors MVA1, MVB1, and MVC1, which refer to the picture P93 immediately preceding in the display order, are used: their median or average value is taken to determine the motion vector MV1 used for predictive coding in the direct mode, and motion compensation is performed by referring only to the forward direction.
[0100]
As described above, this embodiment presents a method that realizes the direct mode without referring to a picture located backward in the display order when performing predictive encoding using the direct mode, even in an environment where such a picture cannot be referred to. It also shows an encoding method that achieves high coding efficiency by removing the items that refer to backward pictures from the encoding mode table, thereby reducing the number of items in the table.
[0101]
When determining the motion vector to be used in the direct mode, instead of extracting, from among the pictures referred to by the neighboring encoded blocks, only the motion vectors that refer to the picture closest in the display order and calculating a single motion vector, it is also possible to extract the motion vectors that refer to the N closest pictures, determine one motion vector for each referenced picture, and use the resulting N motion vectors for predictive coding in the direct mode, performing motion compensation that refers only to the forward direction. In this case, the predicted image is generated by averaging the pixel values of the N regions specified by the N motion vectors.
[0102]
Note that it is also possible to generate a predicted image by a method of averaging by weighting the pixel values of each area instead of a simple average. By using this method, it is possible to realize more accurate motion compensation for an image sequence in which pixel values gradually change in display order.
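The simple and weighted averaging of the N regions can be sketched as below. The function name `blend_regions`, the use of integer weights, and the round-to-nearest integer division are assumptions for illustration; the text does not specify how the weights are chosen or how rounding is done.

```python
from typing import List, Optional

Block = List[List[int]]  # 2-D array of pixel values, indexed [row][column]


def blend_regions(regions: List[Block], weights: Optional[List[int]] = None) -> Block:
    """Average N equally sized regions pixel by pixel.

    With weights=None every region counts equally (simple average);
    otherwise each region is scaled by its integer weight.
    """
    if weights is None:
        weights = [1] * len(regions)
    total = sum(weights)
    height, width = len(regions[0]), len(regions[0][0])
    return [
        [
            # Adding total // 2 before dividing rounds to the nearest integer.
            (sum(w * r[y][x] for w, r in zip(weights, regions)) + total // 2) // total
            for x in range(width)
        ]
        for y in range(height)
    ]
```

For a sequence that is gradually brightening, for example, giving the nearer reference picture a larger weight pulls the prediction toward the more recent brightness level, which is the situation the paragraph above has in mind.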
[0103]
FIG. 7 shows an example of the motion vector reference method when N = 2 in the above case. P104 is the picture currently being coded, and BL51 is the block to be predictively coded in the direct mode. Among the pictures referred to by the motion vectors of the blocks to be referred to, the motion vectors MVA1, MVB1, and MVC1, which refer to the picture P103 immediately preceding in the display order, are used, and their median or average value is taken to determine the motion vector MV1. Likewise, the motion vector MV2 is determined by taking the median or average value of the motion vectors that refer to the picture P102, two pictures before in the display order, which here is MVC2 itself. Coding in the direct mode is performed using these two motion vectors.
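The N = 2 example can be sketched by grouping the neighboring motion vectors by the picture they refer to. The representation of each vector as an `(mv, ref_distance)` pair, with `ref_distance` counting pictures back in display order, is a hypothetical convention, and the component-wise median is again only one of the two options ("median or average") that the text allows.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

MV = Tuple[int, int]


def median_mv(mvs: List[MV]) -> MV:
    """Component-wise median; for an even count the two middle values are averaged (floor)."""
    xs = sorted(v[0] for v in mvs)
    ys = sorted(v[1] for v in mvs)
    mid = len(mvs) // 2
    if len(mvs) % 2:
        return (xs[mid], ys[mid])
    return ((xs[mid - 1] + xs[mid]) // 2, (ys[mid - 1] + ys[mid]) // 2)


def direct_mode_mvs(neighbor_mvs: List[Tuple[MV, int]], n: int) -> Dict[int, MV]:
    """Return one motion vector for each of the n nearest referenced pictures.

    neighbor_mvs lists every motion vector of blocks A, B, and C as
    (mv, ref_distance); a block with two motion vectors contributes two entries.
    """
    by_distance = defaultdict(list)
    for mv, distance in neighbor_mvs:
        by_distance[distance].append(mv)
    nearest = sorted(by_distance)[:n]
    return {distance: median_mv(by_distance[distance]) for distance in nearest}
```

Fed with three vectors at distance 1 and a single vector at distance 2, the function returns the median of the three as the first vector and the single vector unchanged as the second, mirroring how MV1 and MV2 are obtained in the example above.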
[0104]
Note that, as a method of determining the blocks whose motion vectors are referred to in FIGS. 18A and 18B, the following conditions can be used instead of the method described in the above embodiment.
1. When A and D cannot be referred to, their motion vectors are treated as “0”.
2. When B, C, and D cannot be referred to, only A is referred to.
3. When only C cannot be referred to, A, B, and D are referred to.
4. In cases other than 2 and 3 above, A, B, and C are referred to.
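The four conditions leave some room for interpretation, so the following is only one possible reading, written out to make the rule order explicit. The function name, the availability dictionary, and in particular the handling of condition 1 (applied only when both A and D are unavailable, and only to blocks that end up selected) are assumptions, not statements from the text.

```python
from typing import Dict, List, Set, Tuple


def select_reference_blocks(available: Dict[str, bool]) -> Tuple[List[str], Set[str]]:
    """Choose which neighboring blocks supply motion vectors.

    available maps "A", "B", "C", "D" to True if the block can be referred to.
    Returns (blocks to refer to, blocks whose motion vector is treated as 0).
    """
    a, b, c, d = (available[k] for k in "ABCD")
    if not b and not c and not d:
        blocks = ["A"]                 # condition 2
    elif a and b and d and not c:
        blocks = ["A", "B", "D"]       # condition 3: only C is unavailable
    else:
        blocks = ["A", "B", "C"]       # condition 4
    # Condition 1 (assumed reading): if neither A nor D can be referred to,
    # their motion vectors are taken as 0 wherever they are selected.
    zero = {"A", "D"} & set(blocks) if (not a and not d) else set()
    return blocks, zero
```

A different reading of condition 1 (for example, treating A or D individually as zero whenever it is unavailable) would change only the last step.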
[0105]
Note that, instead of using only the motion vectors of the blocks referred to in FIGS. 18A and 18B that refer to the one or N closest preceding pictures in the display order, it is also possible to extract only the motion vectors that refer to a specified picture, determine from them the motion vector of the current block used in the direct mode, and perform motion compensation from the specified picture.
[0106]
When performing encoding using the direct mode, instead of performing motion compensation with reference to the blocks in the positional relationship shown in FIG. 18, it is also possible to perform motion compensation in the direct mode with the motion vector of the current block set to “0” and the immediately preceding picture used as the reference picture. With this method, the step of calculating the motion vector used in the direct mode is unnecessary, so the encoding process can be simplified.
[0107]
The above embodiment describes extracting, from among the motion vectors obtained by referring to the three blocks, a motion vector that refers to the picture closest to the encoding target picture in the display order among the pictures they refer to; however, the present invention is not limited to this. For example, a motion vector referring to the picture closest to the encoding target picture in the coding order may be extracted.
[0108]
(Embodiment 4)
A moving picture decoding method according to the fourth embodiment of the present invention will be described with reference to the block diagram shown in FIG. However, it is assumed that a code string generated by the moving picture coding method according to the third embodiment is input.
[0109]
First, the code sequence analysis unit 201 extracts various information, such as motion vector information and prediction residual coded data, from the input code sequence. The motion vector information extracted here is output to the motion compensation decoding unit 204, and the prediction residual coded data is output to the prediction residual decoding unit 202. The motion compensation decoding unit 204 uses the decoded images of decoded pictures stored in the frame memory 203 as reference pictures and generates a predicted image based on the input motion vector information. At this time, the backward picture determination unit 206 determines whether or not the pictures behind in the display order have already been decoded. If it determines that they have not been decoded, prediction modes that refer to a picture located backward in the display order are excluded from selection.
[0110]
FIG. 10 shows an example of a table that associates a code for identifying a prediction mode in a B picture with an encoding mode. If the prediction direction is not restricted, a table listing all the reference patterns is used, as shown in FIG. 10A. If the prediction direction is restricted to the forward direction only, the table is regenerated with all patterns that refer backward removed, and the regenerated table is referred to. The items in the tables of FIG. 10A and FIG. 10B are examples; other values can be handled in the same manner.
[0111]
The predicted image generated in this way is input to the addition operation unit 207, which adds it to the prediction residual image generated in the prediction residual decoding unit 202 to generate a decoded image. If the prediction direction is not restricted, the generated decoded images are rearranged into display order in the frame memory 203; if the pictures behind in the display order cannot be referred to, the images can be displayed in the order in which they were decoded, without rearrangement. The above describes the operation for a code string that has been subjected to inter-picture predictive coding; switching to the decoding process for a code string that has been subjected to intra-picture predictive coding is performed by the switch 208.
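The addition step and the output-order consequence can be sketched as follows. The function names, the clipping of the sum to an 8-bit range, and the representation of a decoded picture as a `(display_index, picture)` pair are assumptions for illustration and are not taken from the text.

```python
from typing import Any, List, Tuple

Block = List[List[int]]


def reconstruct(predicted: Block, residual: Block, max_value: int = 255) -> Block:
    """Add the prediction residual to the predicted image, clipping to [0, max_value]."""
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(predicted, residual)
    ]


def output_order(decoded: List[Tuple[int, Any]], backward_reference_allowed: bool) -> List[Any]:
    """decoded holds (display_index, picture) pairs in decoding order.

    When backward reference is possible, decoding order can differ from display
    order, so the pictures are rearranged. When it is not possible, every picture
    depends only on earlier ones and can be output as decoded.
    """
    if backward_reference_allowed:
        return [pic for _, pic in sorted(decoded, key=lambda item: item[0])]
    return [pic for _, pic in decoded]
```

Skipping the rearrangement in the forward-only case means a decoder does not have to hold pictures back while waiting for ones that come earlier in display order.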
[0112]
The outline of the decoding flow has been described above, and the details of the processing in the motion compensation decoding unit 204 will be described below. Here, it is assumed that the backward picture determination unit 206 determines that the backward picture is not decoded.
The motion vector information is added for each macroblock or each region obtained by dividing the macroblock. For a macroblock to be decoded, a picture that has already been decoded is used as a reference picture, and a predicted image for performing motion compensation is created from within the picture using the decoded motion vector.
[0113]
When the direct mode is specified in the predictive decoding of a B picture in an environment in which the pictures behind in the display order have not been decoded, the direct mode is realized by referring to the motion vectors of decoded blocks located around the block to be decoded, instead of using the motion vector of the immediately following picture in the display order as described in the related art.
[0114]
First, the case in which each of the decoded blocks located around the current block has one motion vector is described. FIG. 18 shows the positional relationship of the referenced blocks. FIG. 18A shows an example in which the block BL51 to be decoded in the direct mode has a size of 16 × 16 pixels, and FIG. 18B shows an example in which the block BL52 to be decoded in the direct mode has a size of 8 × 8 pixels. In either case, the motion vectors of the three blocks in positions A, B, and C are basically referred to. However, under either of the following conditions, they are not referred to; instead, the motion vector of the decoding target block is set to “0” and motion compensation in the direct mode is performed by referring to the immediately preceding picture.
1. When A or B is outside the picture or outside the slice.
2. When A or B has a motion vector of value “0” that refers to the immediately preceding picture.
[0115]
From the motion vectors of the three referenced blocks A, B, and C, only those that refer to the immediately preceding picture are extracted, and their median or average value is taken as the motion vector actually used in the direct mode. However, when the block C cannot be referred to, the block D is used instead.
[0116]
FIG. 19 shows an example of the reference relationship of the motion vectors in this case. The block BL51 belonging to the picture P64 is assumed to be the block currently being decoded. In this example, MVA1 is the only motion vector that refers to the immediately preceding picture, so the value of MVA1 is used as is as the motion vector MV1 used in the direct mode. Note that blocks in positions other than A, B, C, and D shown in FIGS. 18A and 18B can be referred to in the same manner.
[0117]
The example in FIG. 19 covers the case where each decoded block to be referred to has one motion vector. However, some of the B picture prediction modes perform motion compensation by referring to two pictures ahead in the display order at the same time. In such a mode, one block has two motion vectors.
[0118]
The following describes the case in which the decoded blocks located around the current block to be decoded include a block having two motion vectors. FIG. 4 shows an example of the reference relationship of the motion vectors in such a case. P94 is the picture currently being decoded, and BL51 is the block to be predictively decoded in the direct mode. Among all the motion vectors of the blocks to be referred to, the motion vectors MVA1, MVB1, and MVC1, which refer to the picture P93 immediately preceding in the display order, are used: their median or average value is taken to determine the motion vector MV1 used for predictive decoding in the direct mode, and motion compensation is performed by referring only to the forward direction.
[0119]
As described above, this embodiment presents a method that realizes the direct mode without referring to a picture located backward in the display order when performing predictive decoding using the direct mode, even in an environment where such a picture cannot be referred to. It also shows a decoding method that achieves high coding efficiency by removing the items that refer to backward pictures from the encoding mode table, thereby reducing the number of items in the table.
[0120]
When determining the motion vector to be used in the direct mode, instead of extracting, from among the pictures referred to by the neighboring decoded blocks, only the motion vectors that refer to the picture closest in the display order and calculating a single motion vector, it is also possible to extract the motion vectors that refer to the N closest pictures, determine one motion vector for each referenced picture, and use the resulting N motion vectors for predictive decoding in the direct mode, performing motion compensation that refers only to the forward direction. In this case, the predicted image is generated by averaging the pixel values of the N regions specified by the N motion vectors.
[0121]
Note that it is also possible to generate a predicted image by a method of averaging by weighting the pixel values of each area instead of a simple average. By using this method, it is possible to realize more accurate motion compensation for an image sequence in which pixel values gradually change in display order.
[0122]
FIG. 7 shows an example of the motion vector reference method when N = 2 in the above case. P104 is the picture currently being decoded, and BL51 is the block to be predictively decoded in the direct mode. Among the pictures referred to by the motion vectors of the blocks to be referred to, the motion vectors MVA1, MVB1, and MVC1, which refer to the picture P103 immediately preceding in the display order, are used, and their median or average value is taken to determine the motion vector MV1. Likewise, the motion vector MV2 is determined by taking the median or average value of the motion vectors that refer to the picture P102, two pictures before in the display order, which here is MVC2 itself. Decoding in the direct mode is performed using these two motion vectors.
[0123]
Note that, as a method of determining the blocks whose motion vectors are referred to in FIGS. 18A and 18B, the following conditions can be used instead of the method described in the above embodiment.
1. When A and D cannot be referred to, their motion vectors are treated as “0”.
2. When B, C, and D cannot be referred to, only A is referred to.
3. When only C cannot be referred to, A, B, and D are referred to.
4. In cases other than 2 and 3 above, A, B, and C are referred to.
[0124]
Note that, instead of using only the motion vectors of the blocks referred to in FIGS. 18A and 18B that refer to the one or N closest preceding pictures in the display order, it is also possible to extract only the motion vectors that refer to a specified picture, determine from them the motion vector of the current block used in the direct mode, and perform motion compensation from the specified picture.
[0125]
When performing decoding using the direct mode, instead of performing motion compensation with reference to the blocks in the positional relationship shown in FIG. 18, it is also possible to perform motion compensation in the direct mode with the motion vector of the current block set to “0” and the immediately preceding picture used as the reference picture. With this method, the step of calculating the motion vector used in the direct mode is unnecessary, so the decoding process can be simplified.
[0126]
(Embodiment 5)
Furthermore, by recording a program for realizing the configuration of the moving picture encoding method or the moving picture decoding method shown in the above-described embodiment on a recording medium such as a flexible disk, the processing shown in the above-described embodiment can easily be carried out on an independent computer system.
[0127]
FIG. 11 is an explanatory diagram of a case where the present invention is implemented by a computer system using a flexible disk storing the moving picture coding method or the moving picture decoding method according to the first embodiment.
[0128]
FIG. 11B shows the external appearance of the flexible disk as viewed from the front, its cross-sectional structure, and the flexible disk itself, and FIG. 11A shows an example of the physical format of the flexible disk, which is the recording medium body. The flexible disk FD is housed in the case F. On the surface of the disk, a plurality of tracks Tr are formed concentrically from the outer circumference toward the inner circumference, and each track is divided into 16 sectors Se in the angular direction. Therefore, in the flexible disk storing the program, the moving image encoding method as the program is recorded in an area allocated on the flexible disk FD.
[0129]
FIG. 11C shows a configuration for recording and reproducing the program on the flexible disk FD. When the above program is recorded on the flexible disk FD, the moving picture encoding method or the moving picture decoding method as the above program is written from the computer system Cs via the flexible disk drive. When the moving picture encoding method is constructed in a computer system using a program in a flexible disk, the program is read from the flexible disk by a flexible disk drive and transferred to the computer system.
[0130]
In the above description, the description has been made using a flexible disk as a recording medium. However, the same description can be made using an optical disk. Further, the recording medium is not limited to this, and the present invention can be similarly implemented as long as the program can be recorded, such as an IC card or a ROM cassette.
[0131]
(Embodiment 6)
Further, here, application examples of the moving picture coding method and the moving picture decoding method described in the above embodiment and a system using the same will be described.
[0132]
FIG. 12 is a block diagram illustrating an overall configuration of a content supply system ex100 that realizes a content distribution service. A communication service providing area is divided into desired sizes, and base stations ex107 to ex110, which are fixed wireless stations, are installed in each cell.
[0133]
In the content supply system ex100, devices such as a computer ex111, a PDA (personal digital assistant) ex112, a camera ex113, a mobile phone ex114, and a camera-equipped mobile phone ex115 are connected to the Internet ex101 via the Internet service provider ex102, the telephone network ex104, and the base stations ex107 to ex110.
[0134]
However, the content supply system ex100 is not limited to the combination as shown in FIG. 12, and may be connected in any combination. Further, each device may be directly connected to the telephone network ex104 without going through the base stations ex107 to ex110 which are fixed wireless stations.
[0135]
The camera ex113 is a device capable of shooting moving images, such as a digital video camera. The mobile phone may be of any type, such as a PDC (Personal Digital Communications) system, a CDMA (Code Division Multiple Access) system, a W-CDMA (Wideband-Code Division Multiple Access) system, a GSM (Global System for Mobile Communications) system, or a PHS (Personal Handyphone System).
[0136]
The streaming server ex103 is connected to the camera ex113 via the base station ex109 and the telephone network ex104, which enables live distribution and the like based on the encoded data that the user transmits using the camera ex113. The encoding of the captured data may be performed by the camera ex113 or by a server or the like that performs the data transmission processing.
[0137]
Also, moving image data captured by the camera ex116 may be transmitted to the streaming server ex103 via the computer ex111. The camera ex116 is a device, such as a digital camera, that can shoot still images and moving images. In this case, the moving image data may be encoded by either the camera ex116 or the computer ex111. The encoding is processed in the LSI ex117 included in the computer ex111 or the camera ex116.
[0138]
The moving image encoding / decoding software may be incorporated in any storage medium (a CD-ROM, a flexible disk, a hard disk, or the like) that is a recording medium readable by the computer ex111 or the like. Further, the moving image data may be transmitted by the mobile phone with camera ex115. The moving image data at this time is data encoded by the LSI included in the mobile phone ex115.
[0139]
In the content supply system ex100, content captured by the user with the camera ex113, the camera ex116, or the like (for example, video of a live music performance) is encoded as in the above-described embodiment and transmitted to the streaming server ex103, while the streaming server ex103 stream-distributes the content data to clients that request it. Examples of the clients include the computer ex111, the PDA ex112, the camera ex113, and the mobile phone ex114, which can decode the encoded data.
[0140]
In this way, the content supply system ex100 allows the clients to receive and reproduce the encoded data. Furthermore, because the clients can receive, decode, and reproduce the data in real time, the system also makes personal broadcasting possible.
[0141]
The encoding and decoding of each device constituting this system may be performed using the video encoding device or the video decoding device described in each of the above embodiments.
A mobile phone will be described as an example.
[0142]
FIG. 13 is a diagram illustrating the mobile phone ex115 using the moving picture coding method and the moving picture decoding method described in the above embodiment. The mobile phone ex115 includes: an antenna ex201 for transmitting and receiving radio waves to and from the base station ex110; a camera unit ex203, such as a CCD camera, capable of capturing video and still images; a display unit ex202, such as a liquid crystal display, for displaying decoded data such as video captured by the camera unit ex203 and video received by the antenna ex201; a main body including operation keys ex204; an audio output unit ex208, such as a speaker, for outputting audio; an audio input unit ex205, such as a microphone, for inputting audio; a recording medium ex207 for storing encoded or decoded data, such as captured moving image or still image data, received mail data, and moving image data or still image data; and a slot unit ex206 that allows the recording medium ex207 to be attached to the mobile phone ex115. The recording medium ex207 houses, in a plastic case such as that of an SD card, a flash memory element, which is a kind of EEPROM (Electrically Erasable and Programmable Read Only Memory), a nonvolatile memory that can be electrically rewritten and erased.
[0143]
Further, the mobile phone ex115 will be described with reference to FIG. In the mobile phone ex115, a power supply circuit unit ex310, an operation input control unit ex304, an image encoding unit ex312, a camera interface unit ex303, an LCD (Liquid Crystal Display) control unit ex302, an image decoding unit ex309, a demultiplexing unit ex308, a recording / reproducing unit ex307, a modulation / demodulation circuit unit ex306, and an audio processing unit ex305 are connected to one another via a synchronous bus ex313 and to a main control unit ex311, which performs overall control of each unit of the main body including the display unit ex202 and the operation keys ex204.
[0144]
When the call-end and power key is turned on by a user operation, the power supply circuit unit ex310 supplies power from the battery pack to each unit, thereby activating the camera-equipped digital mobile phone ex115 into an operable state.
[0145]
Under the control of the main control unit ex311, which includes a CPU, a ROM, a RAM, and the like, the mobile phone ex115 in the voice call mode converts the audio signal collected by the audio input unit ex205 into digital audio data in the audio processing unit ex305, performs spread spectrum processing on it in the modulation / demodulation circuit unit ex306, performs digital-to-analog conversion and frequency conversion in the transmission / reception circuit unit ex301, and then transmits it via the antenna ex201. Also in the voice call mode, the mobile phone ex115 amplifies the data received by the antenna ex201, performs frequency conversion and analog-to-digital conversion, performs spectrum despreading in the modulation / demodulation circuit unit ex306, converts the result into analog audio data in the audio processing unit ex305, and then outputs it via the audio output unit ex208.
[0146]
Further, when an e-mail is transmitted in the data communication mode, text data of the e-mail input by operating the operation key ex204 of the main body is sent to the main control unit ex311 via the operation input control unit ex304. The main control unit ex311 performs spread spectrum processing on the text data in the modulation / demodulation circuit unit ex306, performs digital / analog conversion processing and frequency conversion processing in the transmission / reception circuit unit ex301, and transmits the data to the base station ex110 via the antenna ex201.
[0147]
When transmitting image data in the data communication mode, the image data captured by the camera unit ex203 is supplied to the image encoding unit ex312 via the camera interface unit ex303. When image data is not transmitted, image data captured by the camera unit ex203 can be directly displayed on the display unit ex202 via the camera interface unit ex303 and the LCD control unit ex302.
[0148]
The image encoding unit ex312 includes the moving image encoding device described in the present invention. It converts the image data supplied from the camera unit ex203 into encoded image data by compression-encoding it with the encoding method used in the moving image encoding device described in the above embodiment, and sends the encoded image data to the demultiplexing unit ex308. At the same time, the mobile phone ex115 sends the audio collected by the audio input unit ex205 during imaging by the camera unit ex203 to the demultiplexing unit ex308 as digital audio data via the audio processing unit ex305.
[0149]
The demultiplexing unit ex308 multiplexes the encoded image data supplied from the image encoding unit ex312 and the audio data supplied from the audio processing unit ex305 by a predetermined method. The resulting multiplexed data is subjected to spread spectrum processing in the modulation / demodulation circuit unit ex306 and to digital-to-analog conversion and frequency conversion in the transmission / reception circuit unit ex301, and is then transmitted via the antenna ex201.
[0150]
When data of a moving image file linked to a homepage or the like is received in the data communication mode, the data received from the base station ex110 via the antenna ex201 is subjected to spectrum despreading in the modulation / demodulation circuit unit ex306, and the resulting multiplexed data is sent to the demultiplexing unit ex308.
[0151]
To decode the multiplexed data received via the antenna ex201, the demultiplexing unit ex308 separates the multiplexed data into a bit stream of image data and a bit stream of audio data. It supplies the encoded image data to the image decoding unit ex309 via the synchronous bus ex313 and supplies the audio data to the audio processing unit ex305.
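As an illustration only, the multiplexing and separation performed by the demultiplexing unit ex308 can be sketched as follows. The description says only that the data are multiplexed "by a predetermined method", so the record layout used here (a one-byte tag, a four-byte length, then the payload) and the names `multiplex`, `demultiplex`, `VIDEO` and `AUDIO` are assumptions made for this sketch, not the format used by the device.

```python
import struct

# Hypothetical stream tags; the source does not define any.
VIDEO, AUDIO = 0x01, 0x02


def multiplex(video_chunks, audio_chunks):
    """Interleave video and audio chunks into (tag, length, payload) records."""
    out = bytearray()
    for i in range(max(len(video_chunks), len(audio_chunks))):
        for tag, chunks in ((VIDEO, video_chunks), (AUDIO, audio_chunks)):
            if i < len(chunks):
                payload = chunks[i]
                # ">BI": big-endian, 1-byte tag followed by 4-byte length.
                out += struct.pack(">BI", tag, len(payload)) + payload
    return bytes(out)


def demultiplex(data):
    """Split multiplexed data back into a video bit stream and an audio bit stream."""
    video, audio = bytearray(), bytearray()
    pos = 0
    while pos < len(data):
        tag, length = struct.unpack_from(">BI", data, pos)
        pos += 5  # size of the (tag, length) header
        payload = data[pos:pos + length]
        pos += length
        if tag == VIDEO:
            video += payload
        elif tag == AUDIO:
            audio += payload
        else:
            raise ValueError(f"unknown stream tag {tag:#x}")
    return bytes(video), bytes(audio)
```

In this sketch, separating the output of `multiplex` returns the concatenated video chunks and the concatenated audio chunks, which correspond to the two bit streams handed to the image decoding unit ex309 and the audio processing unit ex305.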
[0152]
Next, the image decoding unit ex309 includes the moving image decoding device described in the present invention, and decodes the bit stream of image data using a decoding method corresponding to the encoding method described in the above embodiment, thereby generating reproduced moving image data. This data is supplied to the display unit ex202 via the LCD control unit ex302, whereby, for example, the moving image data included in a moving image file linked to a homepage is displayed. At the same time, the audio processing unit ex305 converts the audio data into analog audio data and supplies it to the audio output unit ex208, whereby, for example, the audio data included in the moving image file linked to the homepage is reproduced.
[0153]
It should be noted that the present invention is not limited to the example of the system described above. Digital broadcasting using satellites and terrestrial waves has recently attracted attention, and, as shown in the drawings, the moving picture encoding device or the moving picture decoding device of the above embodiment can also be incorporated into a digital broadcasting system. Specifically, at the broadcasting station ex409, the bit stream of the video information is transmitted via radio waves to the communication or broadcasting satellite ex410. The broadcasting satellite ex410 that receives it emits radio waves for broadcasting; a home antenna ex406 equipped with satellite broadcast receiving facilities receives these radio waves, and a device such as a television (receiver) ex401 or a set-top box (STB) ex407 decodes the bit stream and reproduces it.
[0154]
Further, the moving picture decoding apparatus described in the above embodiment can also be mounted on a reproducing apparatus ex403 that reads and decodes a bit stream recorded on a storage medium ex402, such as a CD or DVD. In this case, the reproduced video signal is displayed on the monitor ex404. A configuration is also conceivable in which the moving picture decoding apparatus is mounted in a set-top box ex407 connected to a cable ex405 for cable television or to an antenna ex406 for satellite/terrestrial broadcasting, and the video is reproduced on a monitor ex408 of the television. In this case, the moving picture decoding apparatus may be incorporated in the television instead of the set-top box.
[0155]
Further, a car ex412 having an antenna ex411 can receive a signal from the satellite ex410, the base station ex107, or the like, and reproduce the moving image on a display device such as the car navigation system ex413 mounted in the car ex412.
[0156]
Further, an image signal can be encoded by the moving image encoding device described in the above embodiment and recorded on a recording medium. Specific examples include a recorder ex420, such as a DVD recorder that records an image signal on a DVD disc ex421 or a disk recorder that records on a hard disk. The signal can also be recorded on the SD card ex422. If the recorder ex420 includes the moving picture decoding device described in the above embodiment, the video signal recorded on the DVD disc ex421 or the SD card ex422 can be reproduced and displayed on the monitor ex408.
[0157]
The configuration of the car navigation system ex413 may be, for example, the configuration illustrated in FIG. 14 without the camera unit ex203, the camera interface unit ex303, and the image encoding unit ex312. The same can be considered for the computer ex111, the television (receiver) ex401, and the like.
[0158]
In addition, three implementation forms can be considered for a terminal such as the mobile phone ex114: a transmitting/receiving terminal having both an encoder and a decoder, a transmitting terminal having only an encoder, and a receiving terminal having only a decoder.
As described above, the moving picture coding method or the moving picture decoding method described in the above embodiment can be used in any of the devices and systems described above, and by doing so the effects described in the above embodiment can be obtained.
[0159]
Further, the present invention is not limited to the above embodiment, and various changes or modifications can be made without departing from the scope of the present invention.
[Industrial applicability]
The moving picture coding apparatus according to the present invention is useful as a moving picture coding apparatus provided in a personal computer, a PDA, a digital broadcast station, a mobile phone, or the like having a communication function.
[0160]
Further, the moving picture decoding apparatus according to the present invention is useful as a moving picture decoding apparatus provided in a personal computer having a communication function, a PDA, an STB for receiving digital broadcasts, and a mobile phone.
[0161]
[Effect of the invention]
As described above, the moving picture coding method of the present invention realizes the direct mode without referring to a temporally backward picture when predictive coding is performed using the direct mode, even in an environment where a temporally backward picture cannot be referred to. Furthermore, by removing the items that refer to backward pictures from the coding mode table, the number of items in the table is reduced and high coding efficiency can be achieved.
[0162]
Further, the moving picture decoding method of the present invention realizes the direct mode without referring to a temporally backward picture when predictive decoding is performed using the direct mode, even in an environment where a temporally backward picture cannot be referred to. Furthermore, by removing the items that refer to backward pictures from the coding mode table and thereby reducing the number of items in the table, a code sequence coded with high coding efficiency can be decoded without contradiction.
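The effect of removing backward-referring items from the coding mode table can be sketched as follows. This is a minimal illustration under stated assumptions: the mode names in `FULL_TABLE` are hypothetical and are not the contents of the table in FIG. 10, and `exp_golomb_length` assumes that mode identifiers are written with an Exp-Golomb style variable-length code, which this text does not state.

```python
# Hypothetical B-picture mode table: (mode name, refers to a backward picture).
# These entries are illustrative only, not the table shown in FIG. 10.
FULL_TABLE = [
    ("Direct", False),
    ("Forward 16x16", False),
    ("Backward 16x16", True),
    ("Bi-directional 16x16", True),
    ("Forward 16x8", False),
    ("Backward 16x8", True),
    ("Intra", False),
]


def regenerate_table(table, backward_available):
    """Return {code number: mode name}, dropping backward-referring modes
    when no backward picture can be referred to."""
    kept = [name for name, uses_backward in table
            if backward_available or not uses_backward]
    return {code: name for code, name in enumerate(kept)}


def exp_golomb_length(code):
    """Bit length of the unsigned Exp-Golomb codeword for a code number
    (assumed here as the identifier code; not specified by the source)."""
    return 2 * (code + 1).bit_length() - 1


full = regenerate_table(FULL_TABLE, backward_available=True)
forward_only = regenerate_table(FULL_TABLE, backward_available=False)
```

Under these assumptions, "Forward 16x8" has code number 4 (a 5-bit codeword) in the full table and code number 2 (a 3-bit codeword) in the regenerated table, which illustrates why a shorter table can reduce the bits spent on mode identifiers. A decoder would have to regenerate the same table in order to interpret the identifiers consistently.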
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration of a video encoding device that executes a video encoding method according to Embodiment 1.
FIG. 2 is a diagram illustrating an example of a reference relationship between pictures in a case where it is not possible to refer to a picture behind a picture to which an encoding target block belongs in display order.
FIG. 3 is a flowchart illustrating an example of an operation of a mode selection unit when a direct mode is selected.
FIG. 4 is a diagram illustrating an example of a motion vector reference relationship in a case where a block having two motion vectors is included in the coded blocks whose motion vectors are referred to.
FIG. 5 is a flowchart illustrating an example of a processing procedure in a case where the mode selection unit illustrated in FIG. 1 performs spatial prediction of an encoding target block using the first method.
FIG. 6 is a diagram illustrating an example of a data structure of each slice of a code string generated by the code string generation unit illustrated in FIG. 1.
FIG. 7 is a diagram illustrating an example of a motion vector referencing method in a case where two motion vectors are calculated by extracting a motion vector referring to two pictures from a position before a current picture in a display order.
FIG. 8 is a block diagram illustrating a configuration of a video decoding device according to the present embodiment.
FIG. 9 is a flowchart showing a decoding procedure in a direct mode in the motion compensation decoding unit shown in FIG. 8.
FIG. 10(a) is a diagram illustrating an example of a table that associates a code for identifying a prediction mode in a B picture with an encoding mode.
FIG. 10(b) is a diagram illustrating an example of a table that associates a code for identifying a prediction mode in a B picture with a coding mode when the prediction direction is restricted to only the forward direction.
FIG. 11(a) is a diagram illustrating an example of a physical format of a flexible disk as a recording medium body.
FIG. 11(b) is a diagram showing the appearance of the flexible disk as viewed from the front, its cross-sectional structure, and the flexible disk itself.
FIG. 11(c) is a diagram showing a configuration for recording and reproducing the program on a flexible disk FD.
FIG. 12 is a block diagram illustrating an overall configuration of a content supply system that realizes a content distribution service.
FIG. 13 is a diagram illustrating an example of an appearance of a mobile phone.
FIG. 14 is a block diagram illustrating a configuration of a mobile phone.
FIG. 15 is a diagram illustrating a device that performs the encoding process or the decoding process described in the above embodiment, and a system using the device.
FIG. 16 is a diagram illustrating an example of a reference relationship between each picture and a picture referred to by the picture in the conventional moving picture coding method.
FIG. 17(a) is a diagram showing the pictures around the picture B18 shown in FIG. 16, extracted in display order.
FIG. 17(b) is a diagram illustrating the encoding order of the peripheral pictures of the picture B18 when the picture B18 is encoded based on the reference relationship illustrated in FIG. 17(a).
FIG. 18 is a diagram showing the positional relationship between the current block and the coded blocks whose motion vectors are referred to, in a case where motion vectors of coded blocks located around the current block in the same picture are referred to.
(a) is an example in which the target block BL51 has a size of 16 pixels × 16 pixels.
(b) is an example in which the target block BL52 has a size of 8 pixels × 8 pixels.
FIG. 19 is a diagram illustrating an example of a motion vector referred to in a skip mode of a P picture and an encoded picture referred to by the motion vector.
FIG. 20 is a diagram illustrating a method for determining a motion vector in the direct mode.
[Explanation of symbols]
100 Moving picture coding device
101 Frame memory
102 Prediction error encoding unit
103 Code string generation unit
104 Prediction residual decoding unit
105 Frame memory
106 Motion vector detection unit
107 Mode selector
108 Motion vector storage unit
109 Back picture determination unit
200 Video decoding device
201 Code string analyzer
202 Prediction residual decoding unit
203 Frame memory
204 Motion compensation decoding unit
205 Motion vector storage unit
206 Back picture determination unit
Se Sector
Tr Track
FD Flexible disk
F Flexible disk case
Cs Computer system
FDD Flexible disk drive

Claims

A moving picture coding method for coding a moving picture to generate a code sequence,
the method including a coding step that, in coding a B picture for which predictive coding is performed with reference to a plurality of coded pictures located temporally forward or backward, allows use of a direct mode in which motion compensation of a current block to be coded is performed with reference to a motion vector of a coded block,
wherein the coding step includes a motion compensation step of, when the predictive coding of the B picture is performed with reference only to coded pictures located in one direction in display order from the picture to which the current block belongs, performing motion compensation as the direct mode by referring to a motion vector of a coded block located around the current block in the same picture.

The moving picture coding method according to claim 1, wherein, in the motion compensation step, the motion compensation is performed when all the coded pictures referred to for the predictive coding are pictures located temporally forward of the picture to which the current block belongs.

The moving picture coding method according to claim 1, wherein, in the motion compensation step, the motion compensation is performed when no coded picture that can be referred to for the predictive coding exists temporally backward of the picture to which the current block belongs.

The moving picture coding method according to claim 1, wherein the motion compensation step includes a motion vector calculation step of, in performing the motion compensation, referring to those motion vectors of the coded blocks that refer to one or more pictures taken, among the plurality of pictures referred to by the motion vectors of the coded blocks, in order of temporal proximity to the picture to which the current block belongs, and calculating the motion vector of the current block by taking the median or average of the referred motion vectors, and
in the motion compensation step, the motion compensation in the direct mode is performed using the one or more motion vectors obtained in the motion vector calculation step.
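
For illustration, the motion vector calculation recited above can be sketched as follows, taking the median variant and a single nearest picture. The function name `direct_mode_mv` and the representation of each neighboring motion vector as a pair (temporal distance of the referred picture, vector) are assumptions made for this sketch.

```python
from statistics import median


def direct_mode_mv(neighbor_mvs):
    """Predict the motion vector of the current block from coded neighbors.

    neighbor_mvs: list of (distance, (mvx, mvy)), where distance is the
    temporal distance from the current picture to the picture that the
    neighbor's motion vector refers to (hypothetical representation).
    Returns None when no neighboring motion vector is available.
    """
    if not neighbor_mvs:
        return None
    # Keep only the vectors that refer to the temporally closest picture.
    nearest = min(distance for distance, _ in neighbor_mvs)
    candidates = [mv for distance, mv in neighbor_mvs if distance == nearest]
    # Take the median separately for each component.
    return (median(mv[0] for mv in candidates),
            median(mv[1] for mv in candidates))
```

With an even number of candidates, `statistics.median` returns the mean of the two middle values, so a real codec would need its own rounding rule; that rule is not specified here.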

The moving picture coding method according to claim 4, wherein, in the motion vector calculation step, the motion vector is calculated by referring to those motion vectors that refer to one or more pictures taken, among the plurality of pictures referred to by the motion vectors of the coded blocks, in order of proximity in display order to the picture to which the current block belongs.

The moving picture coding method according to claim 1, wherein, in the motion compensation step, when, in performing the motion compensation, the coded block is located outside the picture or slice to which the current block belongs, or the motion vector of the coded block has a value of “0”, or the coded block has no motion vector, the value of the motion vector of the current block is set to “0”, and the motion compensation is performed by referring to one or more pictures.
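
The three conditions recited above can be expressed as a simple predicate. This is only a sketch: the `Neighbor` type and the function names are hypothetical, and applying the test to any one of the referenced neighboring blocks (rather than to all of them) is an assumption that the claim wording does not settle.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Neighbor(NamedTuple):
    """A coded block around the current block (hypothetical representation)."""
    inside: bool                    # inside the current picture and slice
    mv: Optional[Tuple[int, int]]   # None if the block has no motion vector


def forces_zero(neighbor: Neighbor) -> bool:
    """True if this neighbor meets one of the three recited conditions."""
    return (not neighbor.inside
            or neighbor.mv is None
            or neighbor.mv == (0, 0))


def use_zero_mv(neighbors: Sequence[Neighbor]) -> bool:
    """Assumed reading: the current block's vector is set to 0 if any
    referenced neighbor meets a condition."""
    return any(forces_zero(n) for n in neighbors)
```

When `use_zero_mv` returns True, the current block would be compensated with a zero motion vector; otherwise the motion vector would be derived from the neighboring motion vectors.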

A moving picture coding method for coding a moving picture to generate a code sequence,
the method including a coding step that, in coding a B picture for which predictive coding is performed with reference to a plurality of coded pictures located temporally forward or backward, allows use of a direct mode in which motion compensation of a current block to be coded is performed with reference to a motion vector of a coded block,
wherein, in the coding step, when the predictive coding of the B picture is performed with reference only to coded pictures located in one direction in display order from the picture to which the current block belongs, the value of the motion vector of the current block is set to “0” as the direct mode, and motion compensation is performed by referring to one or more pictures in order from the temporally closest one.

The moving picture coding method according to any one of claims 1 to 7, wherein the coding step includes a table regeneration step of regenerating a table, in which predictive coding methods of the B picture are associated with identifiers for identifying the predictive coding methods, by excluding from the table the predictive coding methods that refer to the backward direction, and
in the coding step, the identifier indicating the predictive coding method of the B picture is coded using the regenerated table.

A moving picture decoding method for decoding a code sequence obtained by coding a moving picture,
the method including a decoding step that, in decoding a B picture for which predictive decoding is performed with reference to a plurality of decoded pictures located temporally forward or backward, allows use of a direct mode in which motion compensation of a current block to be decoded is performed with reference to a motion vector of a decoded block,
wherein the decoding step includes a motion compensation step of, when the predictive decoding of the B picture is performed with reference only to decoded pictures located temporally in one direction from the picture to which the current block belongs, performing motion compensation as the direct mode by referring to a motion vector of a decoded block located around the current block in the same picture.

The moving picture decoding method according to claim 9, wherein, in the motion compensation step, the motion compensation is performed when all the decoded pictures referred to for the predictive decoding are pictures located temporally forward of the picture to which the current block belongs.

The moving picture decoding method according to claim 9, wherein, in the motion compensation step, the motion compensation is performed when no decoded picture that can be referred to for the predictive decoding exists temporally backward of the picture to which the current block belongs.

The moving picture decoding method according to claim 9, wherein the motion compensation step includes a motion vector calculation step of, in performing the motion compensation, referring to those motion vectors of the decoded blocks that refer to one or more pictures taken, among the plurality of pictures referred to by the motion vectors of the decoded blocks, in order of temporal proximity to the picture to which the current block belongs, and calculating the motion vector of the current block by taking the median or average of the referred motion vectors, and
in the motion compensation step, the motion compensation in the direct mode is performed using the one or more motion vectors obtained in the motion vector calculation step.

The moving picture decoding method according to claim 12, wherein, in the motion vector calculation step, the motion vector is calculated by referring to those motion vectors that refer to one or more pictures taken, among the plurality of pictures referred to by the motion vectors of the decoded blocks, in order of proximity in display order to the picture to which the current block belongs.

The moving picture decoding method according to claim 9, wherein, in the motion compensation step, when, in performing the motion compensation, the decoded block is located outside the picture or slice to which the current block belongs, or the motion vector of the decoded block has a value of “0”, or the decoded block has no motion vector, the value of the motion vector of the current block is set to “0”, and the motion compensation is performed by referring to one or more pictures.

A moving picture decoding method for decoding a code sequence obtained by coding a moving picture,
the method including a decoding step that, in decoding a B picture for which predictive decoding is performed with reference to a plurality of decoded pictures located temporally forward or backward, allows use of a direct mode in which motion compensation of a current block to be decoded is performed with reference to a motion vector of a decoded block,
wherein, in the decoding step, when the predictive decoding of the B picture is performed with reference only to decoded pictures located temporally in one direction from the picture to which the current block belongs, the value of the motion vector of the current block is set to "0" as the direct mode, and motion compensation is performed by referring to one or more pictures in order from the temporally closest one.

The moving picture decoding method according to any one of claims 9 to 15, wherein the decoding step includes a table regeneration step of regenerating a pre-stored table, in which predictive decoding methods of the B picture are associated with identifiers for identifying the predictive decoding methods, by excluding from the table the predictive decoding methods that refer to the backward direction, and
in the decoding step, the identifier for identifying the predictive decoding method of the B picture is decoded from the code sequence, the predictive decoding method of the B picture is identified using the regenerated table, and predictive decoding of the current block is performed according to the identified predictive decoding method.

A moving picture coding apparatus that codes a moving picture to generate a code sequence,
the apparatus comprising a coding unit that, in coding a B picture for which predictive coding is performed with reference to a plurality of coded pictures located temporally forward or backward, allows use of a direct mode in which motion compensation of a current block to be coded is performed with reference to a motion vector of a coded block,
wherein the coding unit includes a motion compensation unit that, when the predictive coding of the B picture is performed with reference only to coded pictures located temporally in one direction from the picture to which the current block belongs, performs motion compensation as the direct mode by referring to a motion vector of a coded block located around the current block in the same picture.

A moving picture decoding apparatus for decoding a code sequence obtained by coding a moving picture,
the apparatus comprising a decoding unit that, in decoding a B picture for which predictive decoding is performed with reference to a plurality of decoded pictures located temporally forward or backward, allows use of a direct mode in which motion compensation of a current block to be decoded is performed with reference to a motion vector of a decoded block,
wherein the decoding unit includes a motion compensation unit that, when the predictive decoding of the B picture is performed with reference only to decoded pictures located temporally in one direction from the picture to which the current block belongs, performs motion compensation as the direct mode by referring to a motion vector of a decoded block located around the current block in the same picture.

A data recording medium storing a program for causing a computer to execute each step of the moving picture encoding method or the moving picture decoding method according to any one of claims 1 to 16.

A program for causing a computer to execute each step of the moving picture encoding method or the moving picture decoding method according to any one of claims 1 to 16.

A coded data stream configured by repeatedly arranging a header part and a data part for each slice constituting a picture,
wherein the header part of each slice contains a flag relating to the direct mode of a B picture for which predictive coding is performed with reference to a plurality of coded pictures located temporally forward or backward, the flag indicating whether, when predictive coding is performed with reference only to coded pictures located temporally in one direction from the picture to which the current block belongs, motion compensation is performed with reference to motion vectors of a plurality of coded blocks located around the current block in the same picture, or motion compensation is performed with the value of the motion vector of the current block set to “0”, and
the data part of each slice contains moving picture data coded by the moving picture coding method according to any one of claims 1 to 8.