JP3777998B2

JP3777998B2 - Global motion compensation method

Info

Publication number: JP3777998B2
Application number: JP2001104091A
Authority: JP
Inventors: 雄一郎中屋
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2001-04-03
Filing date: 2001-04-03
Publication date: 2006-05-24
Anticipated expiration: 2016-03-18
Also published as: JP2001352549A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像全体に対して線形内・外挿または共１次内・外挿に基づくグローバル動き補償を適用する画像符号化および復号化方法に関するものである。
【０００２】
【従来の技術】
動画像の高能率符号化において、時間的に近接するフレーム間の類似性を活用する動き補償は情報圧縮に大きな効果を示すことが知られている。現在の画像符号化技術の主流となっている動き補償方式は、動画像符号化方式の国際標準であるＨ.２６１，ＭＰＥＧ１，ＭＰＥＧ２に採用されているブロックマッチングである。この方式では、符号化しようとする画像を多数のブロックに分割し、ブロックごとにその動きベクトルを求める。
【０００３】
図１にＨ.２６１の符号化器の構成例１００を示す。Ｈ.２６１は、符号化方式として、ブロックマッチングとＤＣＴ（離散コサイン変換）を組み合わせたハイブリッド符号化方式（フレーム間／フレーム内適応符号化方式）を採用している。減算器１０２は入力画像（現フレームの原画像）１０１とフレーム間／フレーム内符号化切り換えスイッチ１１９の出力画像１１３（後述）との差を計算し、誤差画像１０３を出力する。この誤差画像は、ＤＣＴ変換器１０４でＤＣＴ係数に変換された後に量子化器１０５で量子化され、量子化ＤＣＴ係数１０６となる。この量子化ＤＣＴ計数は伝送情報として通信路に出力されると同時に、符号化器内でもフレーム間予測画像を合成するために使用される。以下に予測画像合成の手順を説明する。上述の量子化ＤＣＴ係数１０６は、逆量子化器１０８と逆ＤＣＴ変換器１０９を経て復号誤差画像１１０（受信側で再生される誤差画像と同じ画像）となる。これに、加算器１１１においてフレーム間／フレーム内符号化切り換えスイッチ１１９の出力画像１１３（後述）が加えられ、現フレームの復号画像１１２（受信側で再生される現フレームの復号画像と同じ画像）を得る。この画像は一旦フレームメモリ１１４に蓄えられ、１フレーム分の時間だけ遅延される。したがって、現時点では、フレームメモリ１１４は前フレームの復号画像１１５を出力している。この前フレームの復号画像と現フレームの入力画像１０１がブロックマッチング部１１６に入力され、ブロックマッチングの処理が行われる。ブロックマッチングでは、画像を複数のブロックに分割し、各ブロックごとに現フレームの原画像に最も似た部分を前フレームの復号画像から取り出すことにより、現フレームの予測画像１１７が合成される。このときに、各ブロックが前フレームと現フレームの間でどれだけ移動したかを検出する処理（動き推定処理）を行う必要がある。動き推定処理によって検出された各ブロックごとの動きベクトルは、動き情報１２０として受信側へ伝送される。受信側は、この動き情報と前フレームの復号画像から、独自に送信側で得られるものと同じ予測画像を合成することができる。予測画像１１７は、「０」信号１１８と共にフレーム間／フレーム内符号化切り換えスイッチ１１９に入力される。このスイッチは、両入力のいずれかを選択することにより、フレーム間符号化とフレーム内符号化を切り換える。予測画像１１７が選択された場合（図２はこの場合を表している）には、フレーム間符号化が行われる。一方、「０」信号が選択された場合には、入力画像がそのままＤＣＴ符号化されて通信路に出力されるため、フレーム内符号化が行われることになる。受信側が正しく復号化画像を得るためには、送信側でフレーム間符号化が行われたかフレーム内符号化が行われたかを知る必要がある。このため、識別フラグ１２１が通信路へ出力される。最終的なＨ.２６１符号化ビットストリーム１２３は多重化器１２２で量子化ＤＣＴ係数，動きベクトル，フレーム内／フレーム間識別フラグの情報を多重化することによって得られる。
【０００４】
図２に図１の符号化器が出力した符号化ビットストリームを受信する復号化器２００の構成例を示す。受信したＨ.２６１ビットストリーム２１７は、分離器２１６で量子化ＤＣＴ係数２０１，動きベクトル２０２，フレーム内／フレーム間識別フラグ２０３に分離される。量子化ＤＣＴ係数２０１は逆量子化器２０４と逆ＤＣＴ変換器２０５を経て復号化された誤差画像２０６となる。この誤差画像は加算器２０７でフレーム間／フレーム内符号化切り換えスイッチ２１４の出力画像２１５を加算され、復号化画像２０８として出力される。フレーム間／フレーム内符号化切り換えスイッチはフレーム間／フレーム内符号化識別フラグ２０３に従って、出力を切り換える。フレーム間符号化を行う場合に用いる予測画像２１２は、予測画像合成部２１１において合成される。ここでは、フレームメモリ２０９に蓄えられている前フレームの復号画像２１０に対して、受信した動きベクトル２０２に従ってブロックごとに位置を移動させる処理が行われる。一方フレーム内符号化の場合、フレーム間／フレーム内符号化切り換えスイッチは、「０」信号２１３をそのまま出力する。
【０００５】
ブロックマッチングは現在最も広く利用されている動き補償方式であるが、画像全体が拡大・縮小・回転している場合には、すべてのブロックに対して動きベクトルを伝送しなければならず、符号化効率が悪くなる問題が発生する。この問題に対し、画像全体の動きベクトル場を少ないパラメータを用いて表現するグローバル動き補償（例えば、M.Hotter, "Differential estimation of the global motion parameters zoom and pan", Signal Processing, vol. 16, no. 3, pp. 249-265, Mar. 1989）が提案されている。これは、画像内の画素（ｘ,ｙ）の動きベクトル（ｕg(ｘ,ｙ),ｖg(ｘ,ｙ)）を、
【０００６】
【数１】

【０００７】
や、
【０００８】
【数２】

【０００９】
の形式で表し、この動きベクトルを利用して動き補償を行う方式である。ここでａ0〜ａ5，ｂ0〜ｂ7 は動きパラメータである。動き補償を行う際には、送信側と受信側で同じ予測画像が得られなければならない。このために、送信側は受信側へａ0〜ａ5 またはｂ0〜ｂ7 の値を直接伝送しても良いが、代わりにいくつかの代表点の動きベクトルを伝送する方法もある。いま、画像の左上端，右上端，左下端，右下端の画素の座標がそれぞれ（０,０)，(ｒ,０)，(０,ｓ)，(ｒ,ｓ）で表されるとする（ただし、ｒとｓは正の整数）。このとき、代表点（０,０)，(ｒ,０)，(０,ｓ）の動きベクトルの水平，垂直成分をそれぞれ（ｕa,ｖa)，(ｕb,ｖb)，(ｕc,ｖc）とすると、(数１) は
【００１０】
【数３】

【００１１】
と書き換えることができる。このことはａ0〜ａ5 を伝送する代わりにｕa，ｖa，ｕb，ｖb，ｕc，ｖc を伝送しても同様の機能が実現できることを意味する。この様子を図３に示す。現フレームの原画像３０２と参照画像３０１の間でグローバル動き補償が行われたとして、動きパラメータの代わりに代表点３０３，３０４，３０５の動きベクトル３０６，３０７，３０８（このとき、動きベクトルは現フレームの原画像の点を出発点として、参照画像内の対応する点を終点とするものとして定義する）を伝送しても良い。これと同じように、４個の代表点（０,０)，(ｒ,０)，(０,ｓ)，(ｒ,ｓ）の動きベクトルの水平，垂直成分（ｕa,ｖa)，(ｕb,ｖb)，(ｕc,ｖc)，(ｕd,ｖd）を用いて (数２) は、
【００１２】
【数４】

【００１３】
と書き換えることができる。したがって、ｂ0〜ｂ7 を伝送する代わりにｕa，ｖa，ｕb，ｖb，ｕc，ｖc，ｕd，ｖd を伝送しても同様の機能が実現できる。本明細書では (数１) を用いる方式を線形内・外挿に基づくグローバル動き補償，(数２) を用いる方式を共１次内・外挿に基づくグローバル動き補償とよぶこととする。
【００１４】
代表点の動きベクトルを伝送する線形内・外挿に基づくグローバル動き補償方式を採用した画像符号化器の動き補償処理部４０１の構成例を図４に示す。図１と同じ番号は同じものを指すとする。図１のブロックマッチング部１１６をこの動き補償処理部４０１に入れ替えることにより、グローバル動き補償を行う画像符号化装置を構成することができる。グローバル動き補償部４０２で前フレームの復号画像１１５と現フレームの原画像１０１との間でグローバル動き補償に関する動き推定が行われ、上記ｕa，ｖa，ｕb，ｖb，ｕc，ｖc の値が推定される。これらの値に関する情報４０３は動き情報１２０の一部として伝送される。グローバル動き補償の予測画像４０４は数３を用いて合成され、ブロックマッチング部４０５に供給される。ここでは、グローバル動き補償の予測画像と現フレームの原画像との間でブロックマッチングによる動き補償が行われ、ブロックの動きベクトル情報４０６と最終的な予測画像１１７が得られる。この動きベクトル情報は動きパラメータ情報と多重化部４０７において多重化され、動き情報１２０として出力される。
【００１５】
図４とは異なる動き補償処理部５０１の構成例を図５に示す。図１と同じ番号は同じものを指すとする。図１のブロックマッチング部１１６をこの動き補償処理部５０１に入れ替えることにより、グローバル動き補償を行う画像符号化装置を構成することができる。この例では、グローバル動き補償の予測画像にブロックマッチングを適用するのではなく、各ブロックに関してグローバル動き補償かブロックマッチングのいずれかが適用される。前フレームの復号画像１１５と現フレームの原画像１０１との間で、グローバル動き補償部５０２とブロックマッチング部５０５でそれぞれグローバル動き補償とブロックマッチングが並列に行われる。選択スイッチ５０８は、グローバル動き補償による予測画像５０３とブロックマッチングによる予測画像５０６の間でブロックごとに最適な方式を選択する。代表点の動きベクトル５０４，ブロックごとの動きベクトル５０７，グローバル動き補償／ブロックマッチングの選択情報５０９は多重化部５１０で多重化され、動き情報１２０として出力される。
【００１６】
以上述べたグローバル動き補償を導入することにより、画像の大局的な動きを少ないパラメータを用いて表現することが可能となり、より高い情報圧縮率が実現できる。しかし、その一方で符号化および復号化における処理量は従来の方式と比較して増加する。特に (数３) および (数４) に見られる除算は、処理を複雑にする大きな要因となってしまう。
【００１７】
【発明が解決しようとする課題】
画像全体の動きベクトル場を少ないパラメータによって近似するグローバル動き補償では、予測画像の合成のための処理量が多くなる問題が発生する。本発明の目的は、グローバル動き補償における除算の処理を２進数のシフト演算に置き換えることにより、演算量を減少させることにある。
【００１８】
【課題を解決するための手段】
グローバル動き補償を行う際の代表点の座標をうまく選択することにより、除算処理をシフト演算で実現できるようにする。
【００１９】
【発明の実施の形態】
以下の議論では、画素のサンプリング間隔を水平，垂直方向共に１とし、画像の左上端，右上端，左下端，右下端の画素の座標がそれぞれ（０,０)，(ｒ,０)，(０,ｓ)，(ｒ,ｓ）で表されるとする（ただし、ｒとｓは正の整数）。
【００２０】
線形内・外挿（アフィン変換）または共１次内・外挿（共１次変換）を用いた動き補償を行う際には、画素ごとの動きベクトルに対して量子化を行うと、ミスマッチの防止や演算の簡略化などの効果を得ることができる（特願平06-193970）。以下では、画素の動きベクトルの水平成分と垂直成分が１／ｍ（ｍは正の整数）の整数倍であるとする。また、「従来の技術」で説明した代表点の動きベクトルを用いるグローバル動き補償を行うと仮定し、各代表点の動きベクトルは１／ｋ（ｋは正の整数）の整数倍であるとする。なお、本明細書では、「画素の動きベクトル」はグローバル動き補償を行う際に、実際に予測画像を合成するために用いる動きベクトルのことを指す。一方、「代表点の動きベクトル」は画素の動きベクトルを計算するために用いるパラメータを意味している。したがって、量子化ステップサイズの違いなどが原因で、同じ座標上に存在していても画素の動きベクトルと代表点の動きベクトルが一致しない場合も起こり得る。
【００２１】
まず線形内・外挿を用いる場合について図６を用いて説明する。このとき、「従来の技術」で述べたように代表点を画像６０１の隅に位置する点とはせず、（ｉ,ｊ)，(ｉ+ｐ,ｊ)，(ｉ,ｊ+ｑ）に位置する点６０２，６０３，６０４とする（ｉ，ｊ，ｐ，ｑは整数）。このとき、点６０２，６０３，６０４は画像の内部に存在していても外部に存在していても良い。代表点の動きベクトルの水平，垂直成分をｋ倍したものをそれぞれ（ｕ0,ｖ0)，(ｕ1,ｖ1)，(ｕ2,ｖ2）とすると（ｕ0，ｖ0，ｕ1，ｖ1，ｕ2，ｖ2 は整数）、画素（ｘ,ｙ）の動きベクトルの水平，垂直成分をｍ倍したもの（ｕ(ｘ,ｙ),ｖ(ｘ,ｙ)）は以下の式で表すことができる（ただし、ｘ，ｙ，ｕ(ｘ,ｙ)，ｖ(ｘ,ｙ）は整数）。
【００２２】
【数５】

【００２３】
ただし、「//」は通常の除算による演算結果が整数ではない場合にこれを近隣の整数に丸め込む除算で、演算子としての優先順位は乗除算と同等である。演算誤差を小さくするためには、非整数値は最も近い整数に丸め込まれることが望ましい。このとき整数に１／２を加えた値の丸め込み方法は、
(1) ０に近づける方向に丸め込む、
(2) ０から遠ざける方向に丸め込む、
(3) 被除数が負の場合は０に近づける方向，正の場合は０から遠ざける方向に丸め込む（除数は常に正であるとする）、
(4) 被除数が負の場合は０から遠ざける方向，正の場合は０に近づける方向に丸め込む（除数は常に正であるとする）、
などが考えられる。これらの中で（3）と（4）は、被除数の正負に関わらず丸め込みの方向が変化しないため、正負判定が必要ない分だけ処理量の点で有利である。(3）を用いた高速処理は以下の式によって実現される。
【００２４】
【数６】

【００２５】
ただし、「＃」は小数点以下を０の方向に切り捨てる整数の除算であり、一般に計算機では最も実現しやすい形式の除算である。ここで、ＬとＭは除算の被除数を常に正に保つための数で、十分に大きな正の整数である。また、(ｐｑｋ＃２）の項は、除算結果を最も近い整数に丸め込むために用いられる。
【００２６】
処理を整数化することはそれ自体処理量の低減に貢献するが、ここでｐ，ｑ，ｋをそれぞれ２の α，β，ｈ0 乗（α，βは正の整数、ｈ0は負ではない整数）とすると、(数５) の除算は α＋β＋ｈ0 ビットのシフト演算で実現できるため、計算機や専用ハードウェアにおける処理量を大きく減らすことができる。さらにｍを２のｈ1乗とすれば（ｈ1は負ではない整数、ｈ1＜α＋β＋ｈ0）、(数６) は、
【００２７】
【数７】

【００２８】
と書き換えることができ（「ｘ<<α」はｘをαビット左にシフトして下位αビットに０を入れる、「ｘ>>α」はｘをαビット右にシフトして上位αビットに０または１を入れる（ｘが２の補数表現の場合、ｘの最上位ビットが１のときは１，０のときは０を入れる）ことを意味し、これらの演算子の優先順位は加減算と乗除算の中間であるとする）、さらに演算を簡略化することができる。
【００２９】
線形内・外挿を用いた場合、さらに（ｉ+ｐ,ｊ+ｑ）に位置する代表点の動きベクトルの水平，垂直成分をｋ倍したものを（ｕ3,ｖ3）として、(数５) は（ｉ,ｊ)，(ｉ+ｐ,ｊ)，(ｉ+ｐ,ｊ+ｑ）を代表点とすれば、
【００３０】
【数８】

【００３１】
（ｉ,ｊ)，(ｉ,ｊ+ｑ)，(ｉ+ｐ,ｊ+ｑ）を代表点とすれば、
【００３２】
【数９】

【００３３】
（ｉ+ｐ,ｊ)，(ｉ,ｊ+ｑ)，(ｉ+ｐ,ｊ+ｑ）を代表点とすれば、
【００３４】
【数１０】

【００３５】
と書き換えられ、ｐ，ｑ，ｋ，ｍの値を２の正の整数乗とすることによって同様に処理量を減らすことができる。
【００３６】
共１次内・外挿を用いた場合には、代表点（ｉ,ｊ)，(ｉ+ｐ,ｊ)，(ｉ,ｊ+ｑ)，(ｉ+ｐ,ｊ+ｑ）それぞれの動きベクトルの水平，垂直成分をｋ倍したものである（ｕ0,ｖ0)，(ｕ1,ｖ1)，(ｕ2,ｖ2)，(ｕ3,ｖ3）を用いて（ｕ(ｘ,ｙ),ｖ(ｘ,ｙ)）は以下の式で表すことができる。
【００３７】
【数１１】

【００３８】
この式もｐ，ｑ，ｋ，ｍの値をそれぞれ２の α，β，ｈ0，ｈ1 乗とすることによって、
【００３９】
【数１２】

【００４０】
と書き換えることができ、上と同様に処理量を減らすことができる。
【００４１】
送信側と受信側で同じグローバル動き補償予測画像を得るためには、代表点の動きベクトルに関する情報を何らかの形で受信側に伝える必要がある。代表点の動きベクトルをそのまま伝送する方法もあるが、画像の隅の点の動きベクトルを伝送し、この値から代表点の動きベクトルを計算する方法もある。この方法に関し、以下に説明する。
【００４２】
まずは線形内・外挿が使用される場合について説明する。画像の隅の３個の点（０,０)，(ｒ,０)，(０,ｓ）の動きベクトルが１／ｎ整数倍の値のみとれるとして、これらの水平，垂直成分をｎ倍した（ｕ00,ｖ00)，(ｕ01,ｖ01)，(ｕ02,ｖ02）が伝送されるとする。このとき、点（ｉ,ｊ)，(ｉ+ｐ,ｊ)，(ｉ,ｊ+ｑ)，(ｉ+ｐ,ｊ+ｑ）それぞれの動きベクトルの水平，垂直成分をｋ倍したものである（ｕ0,ｖ0)，(ｕ1,ｖ1)，(ｕ2,ｖ2)，(ｕ3,ｖ3）を、
【００４３】
【数１３】

【００４４】
と定義する。ただし、ｕ'(ｘ,ｙ)，ｖ'(ｘ,ｙ）は、(数５) を変形して、
【００４５】
【数１４】

【００４６】
と定義する。このとき、「///」は通常の除算による演算結果が整数ではない場合にこれを近隣の整数に丸め込む除算で、演算子としての優先順位は乗除算と同等である。(ｕ0,ｖ0)，(ｕ1,ｖ1)，(ｕ2,ｖ2)，(ｕ3,ｖ3）の中から３点を選び、それらを代表点とするグローバル動き補償を行えば、(０,０)，(ｒ,０)，(０,ｓ）を代表点とするグローバル動き補償を近似することができる。もちろんこのときにｐとｑを２の正の整数乗とすれば、上で述べたように処理を簡略化することが可能となる。なお、演算誤差を小さくするためには、「///」は非整数値を最も近い整数に丸め込むことが望ましい。このとき整数に１／２を加えた値の丸め込み方法としては、上で述べた（1）〜（4）の方法が考えられる。ただし、(数５)（画素ごとに計算）の場合と比較して、(数１４) （１枚の画像で３回のみ計算）は演算が実行される回数が少ないため、(1）または（2）の方法を選んだとしても全体の演算量に大きな影響は与えない。
【００４７】
画像の隅の点として (数１３) の例とは異なる３点が選ばれた場合も数８〜１０を変形することによって同様の処理を実現することができる。上記の例に加え、さらに画像の隅の点（ｒ,ｓ）の動きベクトルの水平，垂直成分をｎ倍したものを（ｕ03,ｖ03）とすれば、(数１４) は、（ｕ00,ｖ00)，(ｕ01,ｖ01)，(ｕ03,ｖ03）が伝送される場合には、
【００４８】
【数１５】

【００４９】
と、(ｕ00,ｖ00)，(ｕ02,ｖ02)，(ｕ03,ｖ03）が伝送される場合には、
【００５０】
【数１６】

【００５１】
と、(ｕ01,ｖ01)，(ｕ02,ｖ02)，(ｕ03,ｖ03）が伝送される場合には、
【００５２】
【数１７】

【００５３】
と書き換えられる。
【００５４】
共１次内・外挿が行われる場合も同様である。上と同様に画像の隅の４個の代表点（０,０)，(ｒ,０)，(０,ｓ)，(ｒ,ｓ）の動きベクトルが１／ｎ整数倍の値のみとれるとして、これらの水平，垂直成分をｎ倍した（ｕ00,ｖ00)，(ｕ01,ｖ01)，(ｕ02,ｖ02)，(ｕ03,ｖ03）が伝送されるとする。このときの代表点（ｉ,ｊ)，(ｉ+ｐ,ｊ)，(ｉ,ｊ+ｑ)，(ｉ+ｐ,ｊ+ｑ）それぞれの動きベクトルの水平，垂直成分をｋ倍したものである（ｕ0,ｖ0)，(ｕ1,ｖ1)，(ｕ2,ｖ2)，(ｕ3,ｖ3）は上と同様、(数１３) で与えられる。ただし、ｕ'(ｘ,ｙ)，ｖ'(ｘ,ｙ）は、(数１１) を変形して、
【００５５】
【数１８】

【００５６】
と定義される。
【００５７】
画像の隅の点の動きベクトルを伝送し、これに対して内・外挿を行うことによって代表点の動きベクトルを求める方式の長所は、画素ごとの動きベクトルの範囲を限定しやすい点である。例えば、(数４) で与えられる共１次内・外挿では、(ｘ,ｙ）が画像内の点であるとき、ｕg(ｘ,ｙ）の値はｕa，ｕb，ｕc，ｕd の最大値を超えることも、最小値を下まわることもない。したがって、グローバル動き推定のときにｕa，ｕb，ｕc，ｕd の値がある制限範囲内（例えば±３２画素以内の範囲）に収まるような制約条件を加えれば、全ての画素に関してｕg(ｘ,ｙ）の値を同じ制限範囲内入れておくことができる（もちろんこれはｖg(ｘ,ｙ）に関しても成立する）。こうすると演算に必要な桁数を明確にすることができ、ソフトウェアまたはハードウェアを設計する上で便利である。ただし、以上の議論は計算がすべて浮動小数点演算で行われた場合の議論なので、実際の処理では注意が必要である。画像の隅の点の動きベクトルから代表点の動きベクトルを求める演算 (数１８) には整数への丸め込みが存在するため、計算誤差の影響で (数１２) で求まる動きベクトルが上で述べた制限範囲内から出る可能性を考慮する必要がある。特に代表点が画像の内側に位置するような場合には注意が必要である。これは、代表点が囲む長方形の外側に位置する画素に関しては外挿によって動きベクトルが求められるため、丸め込み誤差が増幅される可能性があるためである。外挿によって動きベクトルが求まる例を図７に示す。画像７０１に対し、代表点７０２，７０３，７０４，７０５を用いてグローバル動き補償を行うと、画像内の斜線で示された部分は外挿により動きベクトルが計算されることになる。これは、斜線部が代表点が囲む長方形７０６の外にあるためである。
【００５８】
この問題への対策としては、４個の代表点を、代表点が囲む長方形が画像全体を含むように配置する方法が有効である。この例を図８に示す。代表点８０２，８０３，８０４，８０５が囲む長方形８０６は、画像８０１を含んでいる。こうすれば、全ての画素の動きベクトルは代表点からの内挿により求められるため、代表点における丸め込み誤差の影響は画像内では増幅されない。したがって、代表点の丸め込み誤差より大きな誤差が画像内で発生することはなく、誤差の上限が明確になる。ただし、代表点が囲む長方形を大きくし過ぎると、代表点の動きベクトルがとり得る値の範囲が広くなるため、演算に必要な桁数が増加し、実装する上では不利となる。
【００５９】
以上の議論から、丸め込み誤差の影響を小さくするため、ｐの値はｒ以上であり、ｑの値はｓ以上であることが望ましい。ｐ，ｑはそれぞれｒ，ｓより小さい場合にも、なるべく大きな値をとることが望ましい。また、ｉ，ｊの値は、画像内のできるだけ広い部分が代表点により囲まれる領域に入るような値とすることが望ましい。
【００６０】
上で述べたように、グローバル動き補償に共１次内・外挿を用いた場合には、４個の代表点が囲む長方形に含まれる画素の動きベクトルの各成分は、代表点の動きベクトルの各成分の最大値と最小値の間の値しかとれないという性質がある。これに対し、線形内・外挿が使用された場合には、３個の代表点が囲む３角形内の画素の動きベクトルが同様の性質を持つ。したがって、線形内・外挿を用いるグローバル動き補償を行う場合には、画像の四隅の点の動きベクトルを伝送し、画像の対角線によって分割される２個の直角３角形に対してそれぞれ独立にグローバル動き補償を行う方法が有効である。こうすることにより、４隅の点に対する動きベクトルの範囲に関する制約が、そのまま画像内のすべての画素の動きベクトルに適用できる。このとき、ｉ，ｊ，ｐ，ｑの値は、２個の直角３角形の間で異なっていても良い。また、演算誤差の観点から言えば、外挿によって画素の動きベクトルを計算するケースを避けるため、代表点の囲む３角形が、グローバル動き補償の対象となる直角３角形を含むことが望ましい。この例を図９に示す。画像９０１の４隅である点９０９，９０３，９０８，９１０の動きベクトルが伝送され、点９０９，９０３，９１０によって構成される直角３角形と、点９０９，９１０，９０８によって構成される直角３角形それぞれに対し、独立にグローバル動き補償が行われる。したがって、頂点の動きベクトルの範囲に関して制約を設ければ、画像内の全ての画素の動きベクトルもこの制約範囲内に入ることになる。点９０９，９０３，９１０によって構成される直角３角形は代表点として点９０２，９０３，９０４を使用し、点９０９，９１０，９０８によって構成される直角３角形は代表点として点９０６，９０７，９０８を使用する。代表点によって構成される３角形は、それぞれグローバル動き補償の対象となる直角３角形を中に含んでいる。このため、代表点の動きベクトルの丸め込み誤差の影響は、画像内の点において増幅されることはない。なお、この例では代表点が構成する２個の三角形は相似となっているが、必ずしもそうである必要はない。
【００６１】
【発明の効果】
本発明により、グローバル動き補償の予測画像合成処理における除算の処理をシフト演算で代用することが可能となり、ソフトウェアや専用ハードウェアによる処理を簡略化することが可能となる。
【図面の簡単な説明】
【図１】Ｈ.２６１の画像符号化器の構成例を示した図である。
【図２】Ｈ.２６１の画像復号化器の構成例を示した図である。
【図３】代表点の動きベクトルを伝送するグローバル動き補償の例を示した図である。
【図４】グローバル動き補償の予測画像に対してブロックマッチングを行う画像符号化器の動き補償処理部を示した図である。
【図５】ブロックごとにグローバル動き補償とブロックマッチングを選択する画像符号化器の動き補償処理部を示した図である。
【図６】高速な処理を行うための代表点の配置の例を示した図である。
【図７】画像内において、外挿によって動きベクトルを求める領域を示した図である。
【図８】画像内のすべての画素の動きベクトルが、代表点の動きベクトルからの内挿によって求められる場合を示した図である。
【図９】画像を２個の直角３角形に分割し、それぞれに対して代表点の動きベクトルからの内挿によるグローバル動き補償を適用した例を示した図である。
【符号の説明】
１００…画像符号化器、１０１…入力画像、１０２…減算器、１０３…誤差画像、１０４…ＤＣＴ変換器、１０５…ＤＣＴ係数量子化器、１０６，２０１…量子化ＤＣＴ係数、１０８，２０４…ＤＣＴ係数逆量子化器、１０９，２０５…逆ＤＣＴ変換器、１１０，２０６…復号誤差画像、１１１，２０７…加算器、１１２…現フレームの復号画像、１１３，２１５…フレーム間／フレーム内符号化切り換えスイッチの出力画像、１１４，２０９…フレームメモリ、１１５，２１０…前フレームの復号画像、１１６，４０５，５０５…ブロックマッチング部、１１７，２１２…現フレームの予測画像、１１８，２１３…「０」信号、１１９，２１４…フレーム間／フレーム内符号化切り換えスイッチ、１２０，２０２，４０６，５０７…動きベクトル情報、１２１，２０３…フレーム間／フレーム内識別フラグ、１２２，４０７，５１０…多重化器、１２３…伝送ビットストリーム、２００…画像復号化器、２０８…出力画像、２１１…予測画像合成部、２１６…分離器、３０１…参照画像、３０２…現フレームの原画像、３０３，３０４，３０５，６０２，６０３，６０４，７０２，７０３，７０４，７０５，８０２，８０３，８０４，８０５，９０２，９０４，９０６，９０７…代表点、３０６，３０７，３０８…代表点の動きベクトル、４０１，５０１…グローバル動き補償を行う動き補償処理部、４０２，５０２…グローバル動き補償部、４０３，５０４…動きパラメータ、４０４，５０３…グローバル動き補償の予測画像、５０６…ブロックマッチングによる予測画像、５０８…ブロックマッチング／グローバル動き補償切り換えスイッチ、５０９…ブロックマッチング／グローバル動き補償の選択情報、６０１，７０１，８０１，９０１…グローバル動き補償の対象となる画像、７０６…代表点が囲む長方形、９０３，９０８…画像の隅の点と代表点を兼用する点、９０９，９１０…画像の隅の点。
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an image encoding and decoding method that applies global motion compensation based on linear interpolation / extrapolation or bilinear interpolation / extrapolation to an entire image.
[0002]
[Prior art]
In high-efficiency coding of moving images, it is known that motion compensation that uses the similarity between temporally adjacent frames has a great effect on information compression. The motion compensation method which is the mainstream of the current image coding technology is block matching adopted in H.261, MPEG1 and MPEG2 which are international standards for moving image coding. In this method, an image to be encoded is divided into a large number of blocks, and a motion vector is obtained for each block.
[0003]
FIG. 1 shows a configuration example 100 of an H.261 encoder. H.261 employs a hybrid encoding scheme (interframe / intraframe adaptive encoding scheme) that combines block matching and DCT (discrete cosine transform). The subtractor 102 calculates the difference between the input image (original image of the current frame) 101 and the output image 113 (described later) of the interframe / intraframe encoding changeover switch 119, and outputs an error image 103. This error image is converted into DCT coefficients by the DCT converter 104 and then quantized by the quantizer 105 to become quantized DCT coefficients 106. The quantized DCT coefficients are output to the communication path as transmission information and, at the same time, are used within the encoder to synthesize an interframe prediction image. The procedure for synthesizing the predicted image is described below. The quantized DCT coefficients 106 pass through the inverse quantizer 108 and the inverse DCT transformer 109 to become a decoded error image 110 (the same image as the error image reproduced on the receiving side). The adder 111 adds to this the output image 113 (described later) of the interframe / intraframe encoding changeover switch 119, yielding the decoded image 112 of the current frame (the same image as the decoded image of the current frame reproduced on the receiving side). This image is temporarily stored in the frame memory 114 and delayed by a time corresponding to one frame. Therefore, at the present time, the frame memory 114 outputs the decoded image 115 of the previous frame. The decoded image of the previous frame and the input image 101 of the current frame are input to the block matching unit 116, and block matching processing is performed.
In block matching, an image is divided into a plurality of blocks, and the portion most similar to the original image of the current frame is extracted for each block from the decoded image of the previous frame, thereby synthesizing the predicted image 117 of the current frame. At this time, it is necessary to perform a process (motion estimation process) for detecting how much each block has moved between the previous frame and the current frame. The motion vector for each block detected by the motion estimation process is transmitted as motion information 120 to the receiving side. The receiving side can synthesize the same predicted image as obtained on the transmitting side independently from the motion information and the decoded image of the previous frame. The predicted image 117 is input to the interframe / intraframe coding changeover switch 119 together with the “0” signal 118. This switch switches between interframe coding and intraframe coding by selecting either of the two inputs. When the predicted image 117 is selected (FIG. 2 shows this case), interframe coding is performed. On the other hand, when the “0” signal is selected, the input image is directly DCT-encoded and output to the communication path, so that intraframe encoding is performed. In order for the receiving side to obtain a decoded image correctly, it is necessary to know whether inter-frame encoding or intra-frame encoding has been performed on the transmitting side. For this reason, the identification flag 121 is output to the communication path. The final H.261 encoded bit stream 123 is obtained by multiplexing information on quantized DCT coefficients, motion vectors, and intra-frame / inter-frame identification flags in a multiplexer 122.
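As an illustration of the motion estimation step described above, the following is a minimal full-search sketch. The text does not specify the matching criterion or the search range; the sum of absolute differences and the names `sad` and `block_match` are assumptions made for this example. The returned vector points from the block in the current frame to the matching position in the decoded previous frame.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(by, by + n):
        for x in range(bx, bx + n):
            total += abs(cur[y][x] - ref[y + dy][x + dx])
    return total


def block_match(cur, ref, bx, by, n, search):
    """Return the displacement (dx, dy) within +/-search that minimizes
    the SAD for the block at (bx, by). Frames are lists of rows."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose reference block leaves the frame.
            if bx + dx < 0 or bx + dx + n > w:
                continue
            if by + dy < 0 or by + dy + n > h:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]
```

A practical encoder would repeat this for every block and usually restrict or accelerate the search; the exhaustive loop is shown only because it is the simplest correct form.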
[0004]
FIG. 2 shows a configuration example of a decoder 200 that receives the encoded bit stream output from the encoder of FIG. 1. The received H.261 bit stream 217 is separated by a separator 216 into quantized DCT coefficients 201, motion vectors 202, and an intra-frame / inter-frame identification flag 203. The quantized DCT coefficients 201 pass through the inverse quantizer 204 and the inverse DCT transformer 205 to become a decoded error image 206. The adder 207 adds the output image 215 of the interframe / intraframe encoding changeover switch 214 to this error image, and the result is output as a decoded image 208. The interframe / intraframe coding changeover switch switches its output according to the interframe / intraframe coding identification flag 203. The predicted image 212 used for interframe coding is synthesized in the predicted image synthesis unit 211. Here, the decoded image 210 of the previous frame stored in the frame memory 209 is shifted in position block by block according to the received motion vectors 202. In the case of intraframe coding, on the other hand, the interframe / intraframe coding changeover switch outputs the "0" signal 213 as it is.
[0005]
Block matching is currently the most widely used motion compensation method, but when the entire image is enlarged, reduced, or rotated, motion vectors must be transmitted for all blocks, and coding efficiency deteriorates. To address this problem, global motion compensation, which represents the motion vector field of the entire image using a small number of parameters, has been proposed (for example, M. Hotter, "Differential estimation of the global motion parameters zoom and pan", Signal Processing, vol. 16, no. 3, pp. 249-265, Mar. 1989). In this method, the motion vector (ug(x, y), vg(x, y)) of the pixel (x, y) in the image is expressed in the form
[0006]
[Equation 1]

[0007]
or
[0008]
[Equation 2]

[0009]
In this method, motion compensation is performed using the motion vector expressed in one of these forms. Here, a0 to a5 and b0 to b7 are motion parameters. When motion compensation is performed, the same predicted image must be obtained on the transmitting side and the receiving side. For this purpose, the transmitting side may directly transmit the values of a0 to a5 or b0 to b7 to the receiving side, but it may instead transmit the motion vectors of several representative points. Now, assume that the coordinates of the pixels at the upper left, upper right, lower left, and lower right corners of the image are (0, 0), (r, 0), (0, s), and (r, s), respectively (where r and s are positive integers). If the horizontal and vertical components of the motion vectors of the representative points (0, 0), (r, 0), (0, s) are (ua, va), (ub, vb), (uc, vc), respectively, then (Equation 1)
[0010]
[Equation 3]

[0011]
can be rewritten in this form. This means that the same function can be realized by transmitting ua, va, ub, vb, uc, vc instead of a0 to a5. This is illustrated in FIG. 3. Assuming that global motion compensation is performed between the original image 302 of the current frame and the reference image 301, the motion vectors 306, 307, and 308 of the representative points 303, 304, and 305 may be transmitted instead of the motion parameters (here, a motion vector is defined as starting at a point in the original image of the current frame and ending at the corresponding point in the reference image). Similarly, using the horizontal and vertical components (ua, va), (ub, vb), (uc, vc), (ud, vd) of the motion vectors of the four representative points (0, 0), (r, 0), (0, s), (r, s), (Equation 2)
[0012]
[Equation 4]

[0013]
can be rewritten in this form. Therefore, the same function can be realized by transmitting ua, va, ub, vb, uc, vc, ud, vd instead of b0 to b7. In this specification, the method using (Equation 1) is called global motion compensation based on linear interpolation / extrapolation, and the method using (Equation 2) is called global motion compensation based on bilinear interpolation / extrapolation.
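For reference, the four equations of this section can be written out from the definitions in the text as follows. This is a sketch rather than a transcription: the assignment of the indices of a0 to a5 and b0 to b7 to individual terms is an assumption, while the forms given for (Equation 3) and (Equation 4) follow from requiring the linear and bilinear fields to reproduce the corner motion vectors at (0, 0), (r, 0), (0, s), and (r, s).

```latex
\begin{align*}
% Equation 1: linear model (index-to-term assignment assumed)
u_g(x,y) &= a_0 x + a_1 y + a_2, &
v_g(x,y) &= a_3 x + a_4 y + a_5 \\
% Equation 2: bilinear model (index-to-term assignment assumed)
u_g(x,y) &= b_0 xy + b_1 x + b_2 y + b_3, &
v_g(x,y) &= b_4 xy + b_5 x + b_6 y + b_7 \\
% Equation 3: linear model in terms of three corner vectors
u_g(x,y) &= \frac{u_b-u_a}{r}\,x + \frac{u_c-u_a}{s}\,y + u_a, &
v_g(x,y) &= \frac{v_b-v_a}{r}\,x + \frac{v_c-v_a}{s}\,y + v_a \\
% Equation 4: bilinear model in terms of four corner vectors
u_g(x,y) &= \frac{s-y}{s}\Bigl(\frac{r-x}{r}\,u_a + \frac{x}{r}\,u_b\Bigr)
          + \frac{y}{s}\Bigl(\frac{r-x}{r}\,u_c + \frac{x}{r}\,u_d\Bigr) \\
v_g(x,y) &= \frac{s-y}{s}\Bigl(\frac{r-x}{r}\,v_a + \frac{x}{r}\,v_b\Bigr)
          + \frac{y}{s}\Bigl(\frac{r-x}{r}\,v_c + \frac{x}{r}\,v_d\Bigr)
\end{align*}
```

Substituting the corner coordinates confirms the construction: at (r, 0), for example, the third line gives u_g = u_b and the fourth gives u_g = u_b as well.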
[0014]
FIG. 4 shows a configuration example of a motion compensation processing unit 401 of an image encoder that employs global motion compensation based on linear interpolation / extrapolation with transmission of the motion vectors of representative points. The same numbers as those in FIG. 1 indicate the same items. By replacing the block matching unit 116 of FIG. 1 with the motion compensation processing unit 401, an image encoding device that performs global motion compensation can be configured. The global motion compensation unit 402 performs motion estimation for global motion compensation between the decoded image 115 of the previous frame and the original image 101 of the current frame, and estimates the values of ua, va, ub, vb, uc, vc described above. Information 403 on these values is transmitted as part of the motion information 120. The predicted image 404 of global motion compensation is synthesized using (Equation 3) and supplied to the block matching unit 405. Here, motion compensation by block matching is performed between the predicted image of global motion compensation and the original image of the current frame, yielding block motion vector information 406 and the final predicted image 117. This motion vector information is multiplexed with the motion parameter information in the multiplexing unit 407 and output as motion information 120.
[0015]
FIG. 5 shows a configuration example of a motion compensation processing unit 501 that differs from the one in FIG. 4. The same numbers as those in FIG. 1 indicate the same items. By replacing the block matching unit 116 in FIG. 1 with the motion compensation processing unit 501, an image encoding device that performs global motion compensation can be configured. In this example, block matching is not applied to the predicted image of global motion compensation; instead, either global motion compensation or block matching is applied to each block. The global motion compensation unit 502 and the block matching unit 505 perform global motion compensation and block matching, respectively, in parallel between the decoded image 115 of the previous frame and the original image 101 of the current frame. The selection switch 508 selects the better method for each block between the predicted image 503 from global motion compensation and the predicted image 506 from block matching. The representative point motion vectors 504, the motion vectors 507 for each block, and the global motion compensation / block matching selection information 509 are multiplexed by the multiplexing unit 510 and output as motion information 120.
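The per-block choice made by the selection switch can be sketched as below. The text only says that the better method is selected for each block; using the smaller sum of absolute differences as the criterion, preferring global motion compensation on ties, and the names `block_sad` and `select_per_block` are assumptions of this example. Frame dimensions are assumed to be multiples of the block size.

```python
def block_sad(cur, pred, bx, by, n):
    """Sum of absolute differences over the n x n block at (bx, by)."""
    return sum(abs(cur[y][x] - pred[y][x])
               for y in range(by, by + n)
               for x in range(bx, bx + n))


def select_per_block(cur, pred_global, pred_block, n):
    """Choose, for every n x n block, whichever prediction is closer to
    the current frame. Returns the assembled prediction and a dict that
    maps the block origin (bx, by) to True when global MC was chosen."""
    h, w = len(cur), len(cur[0])
    final = [row[:] for row in pred_block]
    flags = {}
    for by in range(0, h, n):
        for bx in range(0, w, n):
            use_global = (block_sad(cur, pred_global, bx, by, n)
                          <= block_sad(cur, pred_block, bx, by, n))
            flags[(bx, by)] = use_global
            if use_global:
                for y in range(by, by + n):
                    for x in range(bx, bx + n):
                        final[y][x] = pred_global[y][x]
    return final, flags
```

The `flags` dictionary plays the role of the selection information that has to be sent alongside the motion vectors so that the decoder can assemble the same prediction.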
[0016]
By introducing the global motion compensation described above, it is possible to represent the global motion of an image using a small number of parameters, and a higher information compression rate can be realized. However, on the other hand, the amount of processing in encoding and decoding increases compared to the conventional method. In particular, the division found in (Equation 3) and (Equation 4) is a major factor that complicates the processing.
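To make the cost concrete, the following sketch evaluates the linear model directly from three corner motion vectors, assuming (Equation 3) has the usual form in which the horizontal and vertical gradients are the corner differences divided by r and s. The function name `affine_mv` and the tuple layout are assumptions of this example. Note the divisions by r and s in every evaluation, which is the operation the invention aims to remove.

```python
def affine_mv(x, y, r, s, a, b, c):
    """Motion vector at pixel (x, y) interpolated or extrapolated from
    the corner vectors a = (ua, va) at (0, 0), b = (ub, vb) at (r, 0)
    and c = (uc, vc) at (0, s). Uses floating-point division."""
    ug = (b[0] - a[0]) / r * x + (c[0] - a[0]) / s * y + a[0]
    vg = (b[1] - a[1]) / r * x + (c[1] - a[1]) / s * y + a[1]
    return ug, vg
```

Since r and s are the image dimensions minus one, they are generally not powers of two, so these divisions cannot be turned into shifts without the change of representative points described later.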
[0017]
[Problems to be solved by the invention]
In global motion compensation in which the motion vector field of the entire image is approximated with a small number of parameters, there is a problem that the amount of processing for synthesizing the predicted image increases. An object of the present invention is to reduce the amount of calculation by replacing the division processing in global motion compensation with a binary shift operation.
[0018]
[Means for Solving the Problems]
By appropriately selecting the coordinates of representative points when performing global motion compensation, division processing can be realized by shift operation.
[0019]
DETAILED DESCRIPTION OF THE INVENTION
In the following discussion, the pixel sampling interval is set to 1 in both the horizontal and vertical directions, and the coordinates of the upper left, upper right, lower left, and lower right pixels of the image are (0, 0), (r, 0), ( 0, s) and (r, s) (where r and s are positive integers).
[0020]
When performing motion compensation using linear interpolation / extrapolation (affine transformation) or bilinear interpolation / extrapolation (bilinear transformation), quantizing the motion vector of each pixel provides benefits such as preventing mismatch and simplifying computation (Japanese Patent Application No. 06-193970). In the following, it is assumed that the horizontal and vertical components of a pixel motion vector are integer multiples of 1/m (m is a positive integer). It is further assumed that global motion compensation using the motion vectors of representative points, as described in "Prior art", is performed, and that the motion vector of each representative point is an integer multiple of 1/k (k is a positive integer). In this specification, "pixel motion vector" refers to the motion vector actually used to synthesize a predicted image when performing global motion compensation. On the other hand, "representative point motion vector" means a parameter used to calculate the pixel motion vectors. Therefore, because of differences in quantization step size and the like, a pixel motion vector and a representative point motion vector may not coincide even when they are located at the same coordinates.
[0021]
First, the case of using linear interpolation / extrapolation will be described with reference to FIG. 6. Here, the representative points are not the points located at the corners of the image 601 as described in "Prior art", but the points 602, 603, and 604 located at (i, j), (i + p, j), and (i, j + q) (i, j, p, q are integers). The points 602, 603, and 604 may lie inside or outside the image. Let (u0, v0), (u1, v1), (u2, v2) be the horizontal and vertical components of the motion vectors of the representative points multiplied by k (u0, v0, u1, v1, u2, v2 are integers). Then the horizontal and vertical components of the motion vector of the pixel (x, y) multiplied by m, denoted (u(x, y), v(x, y)), can be expressed by the following equation (where x, y, u(x, y), and v(x, y) are integers).
[0022]
[Equation 5]

[0023]
Here, "//" denotes a division that rounds the result of ordinary division to a neighboring integer when that result is not an integer; its precedence as an operator is the same as that of multiplication and division. To reduce the calculation error, a non-integer value is preferably rounded to the nearest integer. In that case, the possible ways of rounding a value equal to an integer plus 1/2 include:
(1) rounding toward 0,
(2) rounding away from 0,
(3) rounding toward 0 when the dividend is negative and away from 0 when it is positive (assuming the divisor is always positive),
(4) rounding away from 0 when the dividend is negative and toward 0 when it is positive (assuming the divisor is always positive).
Of these, (3) and (4) round in the same direction regardless of the sign of the dividend, so no sign test is needed, which is advantageous in terms of processing amount. High-speed processing using (3) is realized by the following equation.
[0024]
[Equation 6]

[0025]
Here, "#" denotes integer division that truncates the fractional part toward 0, which is generally the form of division most easily implemented on a computer. L and M are sufficiently large positive integers used to keep the dividend always positive. The term (pqk # 2) is used to round the division result to the nearest integer.
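The mechanism described for (Equation 6) can be sketched as follows: add an offset that makes the dividend positive, add half the divisor, truncate, then subtract the offset again. The helper names `div_trunc` and `div_round_half_up` and the parameter name `big` (standing for L or M) are assumptions of this example, and the code illustrates the technique on a generic dividend rather than transcribing the equation.

```python
def div_trunc(a, b):
    """The '#' operator: integer division truncating toward zero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b > 0) else -q


def div_round_half_up(n, d, big):
    """The '//' operator with tie rule (3): round to the nearest integer,
    with halves going toward +infinity (toward 0 for negative values,
    away from 0 for positive values).

    d must be positive. `big` must be large enough that
    big * d + n + d # 2 is never negative, so that truncation toward
    zero behaves like a floor.
    """
    return div_trunc(big * d + n + div_trunc(d, 2), d) - big
```

For example, with n = -3, d = 2 and big = 10 the dividend becomes 20 - 3 + 1 = 18, the truncating division gives 9, and subtracting the offset yields -1, which is -1.5 rounded toward 0 as rule (3) requires.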
[0026]
Converting the processing to integer arithmetic in itself helps reduce the amount of processing. Furthermore, if p, q, and k are 2 to the powers α, β, and h0, respectively (where α and β are positive integers and h0 is a non-negative integer), the division in (Equation 5) can be realized by a shift operation of α + β + h0 bits, so the amount of processing in a computer or dedicated hardware can be greatly reduced. Further, if m is 2 to the power h1 (h1 is a non-negative integer, h1 < α + β + h0), (Equation 6) can be rewritten as
[0027]
[Equation 7]

[0028]
(here, "x << α" means shifting x to the left by α bits and filling the lower α bits with 0, and "x >> α" means shifting x to the right by α bits and filling the upper α bits with 0 or 1 (when x is in two's complement representation, 1 is filled in if the most significant bit of x is 1, and 0 if it is 0); the precedence of these operators is assumed to lie between that of addition / subtraction and that of multiplication / division), which simplifies the computation further.
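A shift-only evaluation along these lines can be sketched and checked against the rounding division. The sketch assumes that (Equation 5) is the linear interpolation from the three representative points scaled by m / k, and that the shift form adds the rounding term and the offset L before a single right shift by α + β + h0 - h1 bits; the function names and the parameter `big` (standing for L) are assumptions of this example.

```python
def mv_eq5(x, y, i, j, p, q, k, m, u0, u1, u2):
    """Reference value: linear interpolation from the representative
    points (i, j), (i+p, j), (i, j+q), divided by p*q*k with rounding
    to nearest and ties toward +infinity (rule (3))."""
    num = ((u1 - u0) * (x - i) * q
           + (u2 - u0) * (y - j) * p
           + u0 * p * q) * m
    den = p * q * k
    return (2 * num + den) // (2 * den)  # floor(num / den + 1/2)


def mv_shift(x, y, i, j, alpha, beta, h0, h1, u0, u1, u2, big):
    """Same value using only multiplications, additions and shifts, for
    p = 2**alpha, q = 2**beta, k = 2**h0, m = 2**h1, h1 < alpha+beta+h0.
    Python's >> on integers is an arithmetic (sign-extending) shift."""
    sh = alpha + beta + h0 - h1
    acc = (((2 * big + 1) << (sh - 1))            # offset plus 1/2
           + (((u1 - u0) * (x - i)) << beta)      # times q
           + (((u2 - u0) * (y - j)) << alpha)     # times p
           + (u0 << (alpha + beta)))              # times p*q
    return (acc >> sh) - big
```

The vertical component is obtained in the same way from v0, v1, v2 with M in place of L. Only the terms that depend on x and y change from pixel to pixel, so an implementation could also update the accumulator incrementally along a scan line.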
[0029]
When linear interpolation / extrapolation is used, let (u3, v3) further denote the horizontal and vertical components, multiplied by k, of the motion vector of the representative point located at (i + p, j + q). Then (Equation 5) can be rewritten as follows. If (i, j), (i + p, j), (i + p, j + q) are the representative points,
[0030]
[Equation 8]

[0031]
If (i, j), (i, j + q), (i + p, j + q) are representative points,
[0032]
[Equation 9]

[0033]
If (i + p, j), (i, j + q), (i + p, j + q) are representative points,
[0034]
[Equation 10]

[0035]
In each case, the amount of processing can be similarly reduced by setting the values of p, q, k, and m to positive integer powers of 2.
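The three alternative forms can be derived from the same linear model by expressing the two gradients and the value at (i, j) in terms of the chosen representative points. The derivation below is a sketch for the horizontal component only (the vertical component is obtained by replacing u with v). It assumes that (Equation 5) has the form shown on the first line, and it is not a transcription of the original equations.

```latex
\begin{align*}
% Assumed form of Equation 5: points (i,j), (i+p,j), (i,j+q)
u(x,y) &= \bigl((u_1-u_0)(x-i)\,q + (u_2-u_0)(y-j)\,p + u_0\,pq\bigr)\,m \mathbin{/\!/} (pqk) \\
% Points (i,j), (i+p,j), (i+p,j+q): vertical gradient is (u_3-u_1)/q
u(x,y) &= \bigl((u_1-u_0)(x-i)\,q + (u_3-u_1)(y-j)\,p + u_0\,pq\bigr)\,m \mathbin{/\!/} (pqk) \\
% Points (i,j), (i,j+q), (i+p,j+q): horizontal gradient is (u_3-u_2)/p
u(x,y) &= \bigl((u_3-u_2)(x-i)\,q + (u_2-u_0)(y-j)\,p + u_0\,pq\bigr)\,m \mathbin{/\!/} (pqk) \\
% Points (i+p,j), (i,j+q), (i+p,j+q): value at (i,j) is u_1+u_2-u_3
u(x,y) &= \bigl((u_3-u_2)(x-i)\,q + (u_3-u_1)(y-j)\,p + (u_1+u_2-u_3)\,pq\bigr)\,m \mathbin{/\!/} (pqk)
\end{align*}
```

Each line can be checked by substituting the coordinates of its three representative points: with m = k, the last line gives u_1 at (i + p, j), u_2 at (i, j + q), and u_3 at (i + p, j + q).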
[0036]
When bilinear interpolation / extrapolation is used, (u(x, y), v(x, y)) can be expressed by the following equation using (u0, v0), (u1, v1), (u2, v2), (u3, v3), which are the horizontal and vertical components, multiplied by k, of the motion vectors of the representative points (i, j), (i + p, j), (i, j + q), (i + p, j + q), respectively.
[0037]
[Equation 11]

[0038]
By setting the values of p, q, k, and m to 2 to the powers α, β, h0, and h1, respectively, this equation can also be rewritten as
[0039]
[Equation 12]

[0040]
With this form, the amount of processing can be reduced in the same manner as above.
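The bilinear case can be sketched in the same way. The code assumes that (Equation 11) is the standard bilinear interpolation of the four representative point values, scaled by m / k and rounded with ties toward +infinity, and that the shift form again adds the rounding term and an offset before one right shift. The function names, the tuple `u` holding (u0, u1, u2, u3), and the parameter `big` are assumptions of this example.

```python
def bilinear_eq11(x, y, i, j, p, q, k, m, u):
    """Reference value: bilinear interpolation from the points (i, j),
    (i+p, j), (i, j+q), (i+p, j+q), divided by p*q*k with rounding to
    nearest and ties toward +infinity."""
    top = (i + p - x) * u[0] + (x - i) * u[1]
    bottom = (i + p - x) * u[2] + (x - i) * u[3]
    num = ((j + q - y) * top + (y - j) * bottom) * m
    den = p * q * k
    return (2 * num + den) // (2 * den)


def bilinear_shift(x, y, i, j, alpha, beta, h0, h1, u, big):
    """Same value with the division replaced by one right shift, for
    p = 2**alpha, q = 2**beta, k = 2**h0, m = 2**h1."""
    p, q = 1 << alpha, 1 << beta
    sh = alpha + beta + h0 - h1
    top = (i + p - x) * u[0] + (x - i) * u[1]
    bottom = (i + p - x) * u[2] + (x - i) * u[3]
    acc = (((2 * big + 1) << (sh - 1))
           + (j + q - y) * top + (y - j) * bottom)
    return (acc >> sh) - big
```

Unlike the linear case, the weights here are products of two coordinate differences, so the multiplications remain; only the final division is replaced by a shift.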
[0041]
In order to obtain the same global motion compensated prediction image on the transmission side and the reception side, it is necessary to transmit information on the motion vector of the representative point to the reception side in some form. There is a method of transmitting the motion vector of the representative point as it is, but there is also a method of transmitting the motion vector of the point at the corner of the image and calculating the motion vector of the representative point from this value. This method will be described below.
[0042]
First, the case where linear interpolation / extrapolation is used will be described. Assume that the motion vectors of the three corner points (0, 0), (r, 0), and (0, s) of the image can take only values that are integer multiples of 1/n, and that (u00, v00), (u01, v01), (u02, v02), obtained by multiplying their horizontal and vertical components by n, are transmitted. Then (u0, v0), (u1, v1), (u2, v2), (u3, v3), which are the horizontal and vertical components, multiplied by k, of the motion vectors at the points (i, j), (i + p, j), (i, j + q), (i + p, j + q), respectively, are defined by the following equation.
[0043]
[Equation 13]

[0044]
Here, u'(x, y) and v'(x, y) are defined, by modifying (Equation 5), as follows.
[0045]
[Expression 14]

[0046]
It is defined as At this time, “///” is a division in which the result of the normal division is rounded to a neighboring integer when the result is not an integer, and the priority as an operator is equivalent to multiplication and division. If three points are selected from (u0, v0), (u1, v1), (u2, v2), (u3, v3) and global motion compensation is performed using them as representative points, (0, 0), Global motion compensation with (r, 0) and (0, s) as representative points can be approximated. Of course, if p and q are raised to a positive integer power of 2, the processing can be simplified as described above. In order to reduce the calculation error, “///” desirably rounds a non-integer value to the nearest integer. At this time, as the rounding method of the value obtained by adding 1/2 to the integer, the methods (1) to (4) described above can be considered. However, compared to the case of (Equation 5) (calculated for each pixel), (Equation 14) (calculated only 3 times for one image) has a smaller number of operations, so (1) or ( Even if the method of 2) is selected, the overall amount of computation is not greatly affected.
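The derivation of the representative-point vectors from the transmitted corner vectors can be sketched as below. Several things here are assumptions rather than the text of (Expression 13) and (Expression 14), which are not reproduced in this document: the linear model is the standard one through the corners (0, 0), (r, 0), (0, s); the names `rdiv` and `corner_to_representative` are hypothetical; and halves are rounded away from zero, which is only one possible convention for "///".

```python
def rdiv(a, b):
    """Assumed form of '///': divide and round to the nearest integer.

    Values ending in exactly 1/2 are rounded away from zero (an assumption).
    """
    if b <= 0:
        raise ValueError("divisor must be positive")
    quotient, remainder = divmod(abs(a), b)
    if 2 * remainder >= b:
        quotient += 1
    return quotient if a >= 0 else -quotient


def corner_to_representative(corners, r, s, n, k, i, j, p, q):
    """Return [(u0, v0), (u1, v1), (u2, v2), (u3, v3)], each multiplied by k.

    corners holds (u00, v00), (u01, v01), (u02, v02): the motion vectors at
    (0, 0), (r, 0), (0, s), each multiplied by n.
    """
    (u00, v00), (u01, v01), (u02, v02) = corners

    def scaled(c00, c01, c02, x, y):
        # Linear model evaluated at (x, y), rescaled from units of 1/n to 1/k.
        num = (c00 * r * s + (c01 - c00) * x * s + (c02 - c00) * y * r) * k
        return rdiv(num, r * s * n)

    points = [(i, j), (i + p, j), (i, j + q), (i + p, j + q)]
    return [(scaled(u00, u01, u02, x, y), scaled(v00, v01, v02, x, y))
            for x, y in points]
```

Because this conversion runs only once per image, the cost of the rounding division is negligible compared with the per-pixel interpolation.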
[0047]
Even when three points different from those in the example of (Expression 13) are selected as the corner points of the image, the same processing can be realized by modifying Expressions 8 to 10. In addition to the above example, let (u03, v03) be the horizontal and vertical components of the motion vector at the corner point (r, s) of the image multiplied by n. Then, when (u00, v00), (u01, v01), (u03, v03) are transmitted, (Equation 14) can be rewritten as
[0048]
[Expression 15]

[0049]
when (u00, v00), (u02, v02), (u03, v03) are transmitted, as
[0050]
[Expression 16]

[0051]
and when (u01, v01), (u02, v02), (u03, v03) are transmitted, as
[0052]
[Expression 17]

[0053]
respectively.
[0054]
The same applies when bilinear interpolation / extrapolation is performed. As above, assume that the motion vectors of the four representative points (0, 0), (r, 0), (0, s), (r, s) at the corners of the image can only take values that are integer multiples of 1 / n, and that (u00, v00), (u01, v01), (u02, v02), (u03, v03), obtained by multiplying their horizontal and vertical components by n, are transmitted. At this time, (u0, v0), (u1, v1), (u2, v2), (u3, v3), which are the horizontal and vertical components of the motion vectors of the representative points (i, j), (i + p, j), (i, j + q), (i + p, j + q) multiplied by k, are given by (Equation 13) in the same manner as above. Here, however, u ′ (x, y) and v ′ (x, y) are obtained by modifying (Equation 11):
[0055]
[Expression 18]

[0056]
The expression above serves as their definition.
[0057]
The advantage of transmitting the motion vectors of the corner points of the image and obtaining the motion vectors of the representative points from them by interpolation / extrapolation is that it is easy to limit the range of the motion vector of each pixel. For example, in the bilinear interpolation / extrapolation given by (Equation 4), when (x, y) is a point in the image, the value of ug (x, y) neither exceeds the maximum nor falls below the minimum of ua, ub, uc, ud. Therefore, if a constraint is imposed during global motion estimation so that the values of ua, ub, uc, ud fall within a certain limited range (for example, within ± 32 pixels), ug (x, y) is kept within the same range for all pixels (of course, the same holds for vg (x, y)). In this way, the number of digits required for the calculation becomes clear, which is convenient for designing software or hardware. However, this discussion assumes that all calculations are performed in floating point arithmetic, so care must be taken in actual processing. Since the calculation of the motion vectors of the representative points from the motion vectors of the corner points of the image (Equation 18) involves rounding to integers, the possibility must be considered that, under the influence of calculation error, the motion vector obtained by (Equation 12) goes outside the limit range described above. Particular care is needed when the representative points are located inside the image, because for pixels located outside the rectangle surrounded by the representative points the motion vector is obtained by extrapolation, and the rounding error may be amplified. An example in which motion vectors are obtained by extrapolation is shown in FIG. 7. When global motion compensation is performed on the image 701 using the representative points 702, 703, 704, and 705, motion vectors are calculated by extrapolation for the hatched portion of the image, because the hatched portion lies outside the rectangle 706 surrounded by the representative points.
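The range property stated above can be checked numerically with a short sketch. (Equation 4) is not reproduced in this passage, so the code assumes the standard bilinear form with ua, ub, uc, ud located at the corners (0, 0), (r, 0), (0, s), (r, s); the function names are hypothetical, and exact rational arithmetic is used so that no rounding enters the check.

```python
from fractions import Fraction


def bilinear(x, y, r, s, ua, ub, uc, ud):
    """Assumed bilinear model with corner values at (0,0), (r,0), (0,s), (r,s)."""
    fx, fy = Fraction(x, r), Fraction(y, s)
    return (1 - fy) * ((1 - fx) * ua + fx * ub) + fy * ((1 - fx) * uc + fx * ud)


def value_range(r, s, corners):
    """Minimum and maximum of the model over all integer points inside the image."""
    values = [bilinear(x, y, r, s, *corners)
              for y in range(s + 1) for x in range(r + 1)]
    return min(values), max(values)
```

For corner values (-32, 17, 5, 32) on a 16 x 12 image, the range over the image is exactly -32 to 32, while a point outside the rectangle, such as (24, 0), evaluates to 41.5 and leaves the range. This is the extrapolation case the paragraph warns about.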
[0058]
As a countermeasure to this problem, it is effective to arrange the four representative points so that the rectangle surrounded by them includes the entire image. An example of this is shown in FIG. 8. The rectangle 806 surrounded by the representative points 802, 803, 804, and 805 includes the image 801. In this way, the motion vectors of all pixels are obtained by interpolation from the representative points, so the influence of the rounding error at the representative points is not amplified within the image. Consequently, no error larger than the rounding error of the representative points occurs in the image, and the upper limit of the error becomes clear. However, if the rectangle surrounded by the representative points is made too large, the range of values that the motion vectors of the representative points can take widens, which increases the number of digits required for the calculation and is a disadvantage for implementation.
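The difference between the two arrangements can be quantified with a small sketch. It assumes that each representative-point component carries an independent error of at most e, so that the worst-case error at a pixel is e times the sum of the absolute interpolation weights. The function names and the sample coordinates are hypothetical.

```python
from fractions import Fraction


def gain_1d(x, i, p):
    """Worst-case error gain along one axis for representative points at i and i + p."""
    w0 = Fraction(i + p - x, p)  # weight of the point at i
    w1 = Fraction(x - i, p)      # weight of the point at i + p
    return abs(w0) + abs(w1)


def gain_bilinear(x, y, i, j, p, q):
    """Bilinear weights are products of 1-D weights, so the gains multiply."""
    return gain_1d(x, i, p) * gain_1d(y, j, q)
```

With representative points inside the image (for example i = 96, p = 128), the gain at x = 0 is 2.5, so a rounding error there is amplified. With i = 0 and p = 512 on a 352-pixel-wide image, the gain is exactly 1 at every pixel, matching the statement that the error never exceeds the rounding error of the representative points.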
[0059]
From the above discussion, in order to reduce the influence of rounding error, the value of p is preferably r or more, and the value of q is preferably s or more. It is desirable that p and q be as large as possible even when they are smaller than r and s, respectively. Further, it is desirable that the values of i and j are such that the widest possible portion in the image falls within the region surrounded by the representative points.
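One way to combine this guideline with the power-of-2 requirement is to pick the smallest power of 2 that is not less than the image dimension. The sketch below is only an illustrative selection rule, not one prescribed by the text; the function name is hypothetical, and it assumes i = j = 0 so that the rectangle starts at the image origin.

```python
def enclosing_power_of_two(extent):
    """Return (exponent, value): the smallest power of 2 that is >= extent.

    The exponent is kept at 1 or more because the text requires the
    exponents to be positive integers.
    """
    if extent < 1:
        raise ValueError("extent must be a positive integer")
    exponent = max(1, (extent - 1).bit_length())
    return exponent, 1 << exponent
```

For a 352 x 288 image this yields p = q = 512 (α = β = 9). Choosing anything larger would only widen the range of the representative-point motion vectors, which the preceding paragraph identifies as a cost in the number of digits required.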
[0060]
As described above, when bilinear interpolation / extrapolation is used for global motion compensation, each component of the motion vector of a pixel inside the rectangle surrounded by the four representative points can take only a value between the maximum and minimum of the corresponding component of the representative points' motion vectors. When linear interpolation / extrapolation is used, the motion vectors of the pixels inside the triangle surrounded by the three representative points have the same property. Therefore, when performing global motion compensation using linear interpolation / extrapolation, it is effective to transmit the motion vectors of the four corner points of the image and to perform global motion compensation independently for each of the two right triangles into which the image is divided by a diagonal. In this way, the restriction on the motion vector range for the four corner points applies as it is to the motion vectors of all pixels in the image. In this case, the values of i, j, p, and q may differ between the two right triangles. Further, from the viewpoint of calculation error, it is desirable that the triangle surrounded by the representative points include the right triangle that is the target of global motion compensation, so that no pixel motion vector is calculated by extrapolation. An example of this is shown in FIG. 9. The motion vectors of points 909, 903, 908, and 910, which are the four corners of the image 901, are transmitted, and global motion compensation is performed independently for the right triangle formed by points 909, 903, and 910 and for the right triangle formed by points 909, 910, and 908. Therefore, if a constraint is placed on the range of the motion vectors of the vertices, the motion vectors of all pixels in the image also fall within this range. The right triangle formed by points 909, 903, and 910 uses points 902, 903, and 904 as representative points, and the right triangle formed by points 909, 910, and 908 uses points 906, 907, and 908 as representative points. Each triangle formed by the representative points includes the right triangle that is the target of global motion compensation. For this reason, the influence of the rounding error of the representative points' motion vectors is not amplified at points in the image. In this example, the two triangles formed by the representative points are similar, but this is not necessarily required.
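The two-triangle scheme can be illustrated with the following sketch. It simplifies the arrangement of FIG. 9 in ways that are assumptions: the image corners themselves serve as the vertices of both triangles, the dividing diagonal is taken to run from (r, 0) to (0, s), and exact rational arithmetic replaces the integer arithmetic with enlarged representative triangles described in the text. The function name is hypothetical.

```python
from fractions import Fraction


def triangle_motion(x, y, r, s, c00, c01, c02, c03):
    """One motion vector component at (x, y) from the four corner values.

    c00, c01, c02, c03 are the values at (0, 0), (r, 0), (0, s), (r, s).
    Each right triangle is interpolated linearly from its own three corners.
    """
    fx, fy = Fraction(x, r), Fraction(y, s)
    if fx + fy <= 1:
        # Triangle with its right angle at (0, 0).
        return c00 + fx * (c01 - c00) + fy * (c02 - c00)
    # Triangle with its right angle at (r, s).
    return c03 + (1 - fx) * (c02 - c03) + (1 - fy) * (c01 - c03)
```

Inside each triangle the three weights are non-negative and sum to 1, so every pixel value lies between the minimum and maximum of the corner values, and the two linear models agree along the shared diagonal.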
[0061]
[Effects of the Invention]
According to the present invention, the division in the predicted image synthesis process of global motion compensation can be replaced with a shift operation, which simplifies processing by software or dedicated hardware.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating a configuration example of an H.261 image encoder.
FIG. 2 is a diagram illustrating a configuration example of an H.261 image decoder.
FIG. 3 is a diagram illustrating an example of global motion compensation for transmitting a motion vector of a representative point.
FIG. 4 is a diagram illustrating a motion compensation processing unit of an image encoder that performs block matching on a predicted image of global motion compensation.
FIG. 5 is a diagram illustrating a motion compensation processing unit of an image encoder that selects global motion compensation and block matching for each block.
FIG. 6 is a diagram illustrating an example of arrangement of representative points for performing high-speed processing.
FIG. 7 is a diagram showing a region for obtaining a motion vector by extrapolation in an image.
FIG. 8 is a diagram illustrating a case where motion vectors of all pixels in an image are obtained by interpolation from motion vectors of representative points.
FIG. 9 is a diagram illustrating an example in which an image is divided into two right triangles, and global motion compensation is applied to each by interpolation from motion vectors of representative points.
[Explanation of symbols]
100 ... Image encoder, 101 ... Input image, 102 ... Subtractor, 103 ... Error image, 104 ... DCT converter, 105 ... DCT coefficient quantizer, 106, 201 ... Quantized DCT coefficient, 108, 204 ... DCT coefficient inverse quantizer, 109, 205 ... Inverse DCT converter, 110, 206 ... Decoded error image, 111, 207 ... Adder, 112 ... Decoded image of current frame, 113, 215 ... Output image of inter-frame / intra-frame coding changeover switch, 114, 209 ... Frame memory, 115, 210 ... Decoded image of previous frame, 116, 405, 505 ... Block matching unit, 117, 212 ... Predicted image of current frame, 118, 213 ... "0" signal, 119, 214 ... Inter-frame / intra-frame coding changeover switch, 120, 202, 406, 507 ... Motion information, 121, 203 ... Inter-frame / intra-frame identification flag, 122, 407, 510 ... Multiplexer, 123 ... Transmission bit stream, 200 ... Image decoder, 208 ... Output image, 211 ... Predicted image synthesizer, 216 ... Separator, 301 ... Reference image, 302 ... Original image of current frame, 303, 304, 305, 602, 603, 604, 702, 703, 704, 705, 802, 803, 804, 805, 902, 904, 906, 907 ... Representative points, 306, 307, 308 ... Motion vectors of representative points, 401, 501 ... Motion compensation processing units that perform global motion compensation, 402, 502 ... Global motion compensation units, 403, 504 ... Motion parameters, 404, 503 ... Predicted image of global motion compensation, 506 ... Predicted image by block matching, 508 ... Block matching / global motion compensation changeover switch, 509 ... Block matching / global motion compensation selection information, 601, 701, 801, 901 ... Global motion compensation target image, 706 ... Rectangle surrounded by representative points, 903, 908 ... Points that are both corner points of the image and representative points, 909, 910 ... Points at the corners of the image.

Claims

In an image encoding method that performs motion estimation by global motion compensation between a decoded image obtained by decoding an already encoded frame and an input image of the current frame, and synthesizes a predicted image of the current frame,
in the synthesis of the predicted image, the rectangular predicted image is divided into two right triangle images by a diagonal line, and, when the motion vectors of all pixels in each right triangle image are calculated by linear interpolation / extrapolation of the motion vectors at three representative points,
the sampling interval of the pixels is 1 in both the horizontal and vertical directions, and the sampling points lie at points whose horizontal and vertical coordinate components are both integers,
for one right triangle image, three points whose coordinates are represented by (i, j), (i + p, j), (i, j + q) are used as representative points (i and j are integers, and p and q are integers),
for the other right triangle image, three points whose coordinates are represented by (i ′, j ′), (i ′ − p, j ′), (i ′, j ′ − q) are used as representative points (i ′ and j ′ are integers),
and p and q are 2 to the power of α and 2 to the power of β, respectively (where α and β are positive integers).

In an image encoding method that performs motion estimation by global motion compensation between a decoded image obtained by decoding an already encoded frame and an input image of the current frame, and synthesizes a predicted image of the current frame,
in the synthesis of the predicted image, the rectangular predicted image is divided into two right triangle images by a diagonal line, and, when the motion vectors of all pixels in each right triangle image are calculated by linear interpolation / extrapolation of the motion vectors at three representative points,
the sampling interval of the pixels is 1 in both the horizontal and vertical directions, and the sampling points lie at points whose horizontal and vertical coordinate components are both integers,
for one right triangle image, three points whose coordinates are represented by (i, j), (i − p, j), (i, j + q) are used as representative points (i and j are integers, and p and q are integers),
for the other right triangle image, three points whose coordinates are represented by (i ′, j ′), (i ′ + p, j ′), (i ′, j ′ − q) are used as representative points (i ′ and j ′ are integers),
and p and q are 2 to the power of α and 2 to the power of β, respectively (where α and β are positive integers).