JP2004241957A

JP2004241957A - Image processor and encoding device, and methods therefor

Info

Publication number: JP2004241957A
Application number: JP2003027895A
Authority: JP
Inventors: Kazufumi Sato; 数史佐藤; Yoichi Yagasaki; 陽一矢ケ崎
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2003-02-05
Filing date: 2003-02-05
Publication date: 2004-08-26
Anticipated expiration: 2023-02-05
Also published as: JP4360093B2

Abstract

PROBLEM TO BE SOLVED: To provide an image processing device, an encoding device, and methods therefor that reduce the amount of computation involved in motion compensation.
SOLUTION: In a motion vector search with fractional-pixel precision, a simplified SATD calculation circuit 42 generates a simplified SATD. The simplified SATD is the sum of the absolute values obtained by applying a simplified orthogonal transform (the orthogonal transform of the present invention), which requires less computation than the Hadamard transform, to the differences (residual components) between pixel data in a motion compensation block of the current frame to be processed and pixel data in a motion compensation block of a reference frame. A motion prediction/compensation circuit 36 specifies the motion vector that minimizes an evaluation value J defined using the simplified SATD.
COPYRIGHT: (C)2004,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、画像データの量子化制御に特徴を有する画像処理装置および符号化装置と、それらの方法に関する。
【０００２】
【従来の技術】
近年、画像データとして取り扱い、その際、効率の高い情報の伝送、蓄積を目的とし、画像情報特有の冗長性を利用して、離散コサイン変換等の直交変換と動き補償により圧縮するＭＰＥＧ（ＭｏｖｉｎｇＰｉｃｔｕｒｅＥｘｐｅｒｔｓＧｒｏｕｐ）などの方式に準拠した装置が、放送局などの情報配信、及び一般家庭における情報受信の双方において普及しつつある。
【０００３】
特に、ＭＰＥＧ２（ＩＳＯ／ＩＥＣ１３８１８−２）は、汎用画像符号化方式として定義されており、飛び越し走査画像及び順次走査画像の双方、並びに標準解像度画像及び高精細画像を網羅する標準で、プロフェッショナル用途及びコンシューマー用途の広範なアプリケーションに現在広く用いられている。
ＭＰＥＧ２圧縮方式を用いることにより、例えば７２０×４８０画素を持つ標準解像度の飛び越し走査画像であれば４〜８Ｍｂｐｓ、１９２０×１０８８画素を持つ高解像度の飛び越し走査画像であれば１８〜２２Ｍｂｐｓの符号量（ビットレート）を割り当てることで、高い圧縮率と良好な画質の実現が可能である。
【０００４】
ＭＰＥＧ２は主として放送用に適合する高画質符号化を対象としていたが、ＭＰＥＧ１より低い符号量（ビットレート）、つまり、より高い圧縮率の符号化方式には対応していなかった。携帯端末の普及により、今後そのような符号化方式のニーズは高まると思われ、これに対応してＭＰＥＧ４符号化方式の標準化が行われた。画像符号化方式に関しては、１９９８年１２月にＩＳＯ／ＩＥＣ１４４９６−２としてその規格が国際標準に承認された。
【０００５】
さらに、近年、当初テレビ会議用の画像符号化を目的として、Ｈ．２６Ｌ（ＩＴＵ−ＴＱ６／１６ＶＣＥＧ）という標準の規格化が進んでいる。Ｈ．２６ＬはＭＰＥＧ２やＭＰＥＧ４といった従来の符号化方式に比べ、その符号化、復号化により多くの演算量が要求されるものの、より高い符号化効率が実現されることが知られている。また、現在、ＭＰＥＧ４の活動の一環として、このＨ．２６Ｌをベースに、Ｈ．２６Ｌ規格ではサポートされない機能をも取り入れ、より高い符号化効率を実現する標準化がＪｏｉｎｔＭｏｄｅｌｏｆＥｎｈａｎｃｅｄ−ＣｏｍｐｒｅｓｓｉｏｎＶｉｄｅｏＣｏｄｉｎｇとして行われている。
【０００６】
ところで、ＭＰＥＧおよびＨ．２６Ｌ規格の符号化装置では、高い圧縮効率を得るために、動き予測・補償が重要な役割を果たす。
ＭＰＥＧおよびＨ．２６Ｌを発展させたＪＶＴ（ＪｏｉｎｔＶｉｄｅｏＴｅａｍ）方式では、以下の３つの方式を導入して、高い圧縮効率を達成している。
第１の方式はマルチプル参照フレーム（ＭｕｌｔｉｐｌｅＲｅｆｅｒｅｎｃｅＦｒａｍｅ）であり、第２の方式は可変動き予測・補償ブロックサイズであり、第３の方式はＦＩＲフィルタを用いた１／４画素精度あるいは１／８画素精度の少数画素精度の動き補償である。
このようなＪＶＴ方式では、例えば、少数精度の動きベクトル探索において、処理対象となる現フレームの動き補償ブロック内の画素データと参照フレームの動き補償ブロック内の画素データとの差分（残差成分）の絶対値和を示すＳＡＤ（ＳｕｍｏｆＡｂｓｏｌｕｔｅＤｉｆｆｅｒｅｎｃｅ）ではなく、上記差分に対して、垂直方向および水平方向のアダマール変換を施した値の絶対値和であるＳＡＴＤを生成し、当該ＳＡＴＤを用いて規定される値Ｊを最小にする動きベクトルを探索する。
【０００７】
【発明が解決しようとする課題】
しかしながら、上述したように動き探索において、上記差分に対して、垂直方向および水平方向のアダマール変換を施してＳＡＴＤを生成するのでは、演算量が多くなり、処理負担が大きいという問題がある。
なお、動き補償における動き補償（ＭＣ）ブロックの予測方向の決定処理についても、同様な問題がある。
【０００８】
本発明はかかる事情に鑑みてなされたものであり、動き補償に伴う演算量を削減できる画像処理装置、符号化装置およびそれらの方法を提供することを目的とする。
また、本発明は、予測符号化における予測方向の決定に伴う演算量を削減できる画像処理装置、符号化装置およびそれらの方法を提供することを目的とする。
【０００９】
【課題を解決するための手段】
上記の目的を達成するため、第１の発明の画像処理装置は、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する差分生成手段と、前記差分生成手段が生成した前記差分に１次元の直交変換を施す直交変換手段と、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記候補となる複数の動きベクトルのなかから、動き補償に用いる動きベクトルを特定する動きベクトル特定手段とを有する。
【００１０】
第１の発明の画像処理装置の作用は以下のようになる。
差分生成手段が、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する。
次に、直交変換手段が、前記差分生成手段が生成した前記差分に１次元の直交変換を施す。
次に、動きベクトル特定手段が、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記候補となる複数の動きベクトルのなかから、動き補償に用いる動きベクトルを特定する。
【００１１】
第２の発明の符号化装置は、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する差分生成手段と、前記差分生成手段が生成した前記差分に１次元の直交変換を施す直交変換手段と、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記候補となる複数の動きベクトルのなかから、動き補償に用いる動きベクトルを特定する動きベクトル特定手段と、前記第２の画像データと前記第１の画像データとの差分、並びに前記動きベクトル特定手段が特定した前記動きベクトルを符号化する符号化手段とを有する。
【００１２】
第２の発明の符号化装置の作用は以下のようになる。
差分生成手段が、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する。
次に、直交変換手段が、前記差分生成手段が生成した前記差分に１次元の直交変換を施す。
次に、動きベクトル特定手段が、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記候補となる複数の動きベクトルのなかから、動き補償に用いる動きベクトルを特定する。
次に、符号化手段が、前記第２の画像データと前記第１の画像データとの差分、並びに前記動きベクトル特定手段が特定した前記動きベクトルを符号化する。
【００１３】
第３の発明の画像処理装置は、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する差分生成手段と、前記差分生成手段が生成した前記差分に、水平方向および垂直方向のうち少なくとも一方の方向において前記差分の一部の周波数成分のみを残すように２次元の直交変換を施す直交変換手段と、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記候補となる複数の動きベクトルのなかから、動き補償に用いる動きベクトルを特定する動きベクトル特定手段とを有する。
【００１４】
第３の発明の画像処理装置の作用は以下のようになる。
差分生成手段が、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する。
次に、直交変換手段が、前記差分生成手段が生成した前記差分に、水平方向および垂直方向のうち少なくとも一方の方向において前記差分の一部の周波数成分のみを残すように２次元の直交変換を施す。
次に、動きベクトル特定手段が、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記候補となる複数の動きベクトルのなかから、動き補償に用いる動きベクトルを特定する。
【００１５】
第４の発明の符号化装置は、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する差分生成手段と、前記差分生成手段が生成した前記差分に、水平方向および垂直方向のうち少なくとも一方の方向において前記差分の一部の周波数成分のみを残すように２次元の直交変換を施す直交変換手段と、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記候補となる複数の動きベクトルのなかから、動き補償に用いる動きベクトルを特定する動きベクトル特定手段と、前記第２の画像データと前記第１の画像データとの差分、並びに前記動きベクトル特定手段が特定した前記動きベクトルを符号化する符号化手段とを有する。
【００１６】
第４の発明の符号化装置の作用は以下のようになる。
差分生成手段が、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する。
次に、直交変換手段が、前記差分生成手段が生成した前記差分に、水平方向および垂直方向のうち少なくとも一方の方向において前記差分の一部の周波数成分のみを残すように２次元の直交変換を施す。
次に、動きベクトル特定手段が、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記候補となる複数の動きベクトルのなかから、動き補償に用いる動きベクトルを特定する。
次に、符号化手段が、前記第２の画像データと前記第１の画像データとの差分、並びに前記動きベクトル特定手段が特定した前記動きベクトルを符号化する。
【００１７】
第５の発明の画像処理装置は、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する差分生成手段と、前記差分生成手段が生成した前記差分に１次元の直交変換を施す直交変換手段と、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記動きベクトルを用いた予測方向を特定する予測方向特定手段とを有する。
【００１８】
第５の発明の画像処理装置の作用は以下のようになる。
差分生成手段が、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する。
次に、直交変換手段が、前記差分生成手段が生成した前記差分に１次元の直交変換を施す。
次に、予測方向特定手段が、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記動きベクトルを用いた予測方向を特定する。
【００１９】
第６の発明の符号化装置は、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する差分生成手段と、前記差分生成手段が生成した前記差分に１次元の直交変換を施す直交変換手段と、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記動きベクトルを用いた予測方向を特定する予測方向特定手段と、前記第２の画像データと前記第１の画像データとの差分、並びに前記動きベクトルを符号化する符号化手段とを有する。
【００２０】
第７の発明の画像処理装置は、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する差分生成手段と、前記差分生成手段が生成した前記差分に、水平方向および垂直方向のうち少なくとも一方の方向において前記差分の一部の周波数成分のみを残すように２次元の直交変換を施す直交変換手段と、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記動きベクトルを用いた予測方向を特定する予測方向特定手段とを有する。
【００２１】
第８の発明の符号化装置は、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する差分生成手段と、前記差分生成手段が生成した前記差分に、水平方向および垂直方向のうち少なくとも一方の方向において前記差分の一部の周波数成分のみを残すように２次元の直交変換を施す直交変換手段と、前記直交変換手段で前記直交変換が施された前記差分の累積値を基に、前記動きベクトルを用いた予測方向を特定する予測方向特定手段と、前記第２の画像データと前記第１の画像データとの差分、並びに前記動きベクトルを符号化する符号化手段とを有する。
【００２２】
第９の発明の画像処理方法は、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する第１の工程と、前記第１の工程で生成した前記差分に１次元の直交変換を施す第２の工程と、前記第２の工程で前記直交変換が施された前記差分の累積値を基に、前記候補となる複数の動きベクトルのなかから、動き補償に用いる動きベクトルを特定する第３の工程とを有する。
【００２３】
第１０の発明の画像処理方法は、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する第１の工程と、前記第１の工程で生成した前記差分に、水平方向および垂直方向のうち少なくとも一方の方向において前記差分の一部の周波数成分のみを残すように２次元の直交変換を施す第２の工程と、前記第２の工程で前記直交変換が施された前記差分の累積値を基に、前記候補となる複数の動きベクトルのなかから、動き補償に用いる動きベクトルを特定する第３の工程とを有する。
【００２４】
第１１の発明の画像処理方法は、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する第１の工程と、前記第１の工程で生成した前記差分に１次元の直交変換を施す第２の工程と、前記第２の工程で前記直交変換が施された前記差分の累積値を基に、前記動きベクトルを用いた予測方向を特定する第３の工程とを有する。
【００２５】
第１２の発明の画像処理方法は、第１の画像データ内の複数の第１の画素データと、前記第１の画像データと相関を有する第２の画像データ内における、前記複数の第１の画素データに対応する位置から候補となる動きベクトルによって指し示される位置に対応する複数の第２の画素データとの差分をそれぞれ生成する第１の工程と、前記第１の工程で生成した前記差分に、水平方向および垂直方向のうち少なくとも一方の方向において前記差分の一部の周波数成分のみを残すように２次元の直交変換を施す第２の工程と、前記第２の工程で前記直交変換が施された前記差分の累積値を基に、前記動きベクトルを用いた予測方向を特定する第３の工程とを有する。
【００２６】
【発明の実施の形態】
〔本発明の関連技術〕
図１は、本発明の関連技術に係わる符号化装置５００の機能ブロック図である。
図１に示す符号化装置５００において、入力となる画像信号は、まず、Ａ／Ｄ変換回路５０１においてデジタル信号に変換される。次に、出力となる画像圧縮情報のＧＯＰ（ＧｒｏｕｐｏｆＰｉｃｔｕｒｅｓ）構造に応じ、画面並べ替え回路５０２においてフレーム画像データの並べ替えが行われる。
そして、イントラ符号化が行われる画像に関しては、フレーム画像データの全体が直交変換回路５０４に入力され、直交変換回路５０４において離散コサイン変換やカルーネン・レーベ変換等の直交変換が施される。
直交変換回路５０４の出力となる変換係数は、量子化回路５０５において量子化処理される。
量子化回路５０５の出力となる、量子化された変換係数は、可逆符号化回路５０６に入力され、ここで可変長符号化、算術符号化等の可逆符号化が施された後、バッファ５０７に蓄積され、圧縮された画像データとして出力される。
【００２７】
量子化回路５０５における量子化レートは、レート制御回路５１２によって制御される。
同時に、量子化回路５０５の出力となる、量子化された変換係数は、逆量子化回路５０８において逆量子化され、続いて逆直交変換回路５０９において逆直交変換処理が施され、デブロックフィルタ５１３においてブロック歪みが除去されて復号された参照フレーム画像データが得られる。当該参照フレーム画像データは、フレームメモリ５１０に蓄積される。
【００２８】
一方、インター符号化が行われる画像に関しては、画面並べ替え回路５０２から出力されたフレーム画像データが、動き予測・補償回路５１１に入力される。同時に参照フレーム画像データがフレームメモリ５１０より読み出され、動き予測・補償回路５１１によって動きベクトルが生成され、当該動きベクトルおよび参照フレーム画像データを用いて予測フレーム画像データが生成される。予測フレーム画像データが演算回路５０３に出力され、演算回路５０３において、画面並べ替え回路５０２からのフレーム画像データと、動き予測・補償回路５１１からの予測フレーム画像データとの差分を示す画像データが生成され、当該画像データが直交変換回路５０４に出力される。
また、動き補償・予測回路５１１は、動きベクトルを可逆符号化回路５０６に出力し、可逆符号化回路５０６において、動きベクトルが可変長符号化あるいは算術符号化といった可逆符号化処理され、画像信号のヘッダ部に挿入される。その他の処理はイントラ符号化を施される画像信号と同様である。
【００２９】
図２は、図１に示す符号化装置５００に対応する復号回路４９９の機能ブロック図である。
図２に示す復号回路４９９では、入力となる画像データがバッファ６１３に格納された後、可逆復号回路６１４に出力される。そして、可逆復号回路６１４において、フレーム画像データのフォーマットに基づき、可変長復号化、算術復号化等の処理が行われる。同時に、当該フレーム画像データがインター符号化されたものである場合には、可逆復号回路６１４において、フレーム画像データのヘッダ部に格納された動きベクトルＭＶも復号され、その動きベクトルＭＶが動き予測・補償装置６２０に出力される。
【００３０】
可逆復号回路６１４の出力となる、量子化された変換係数は、逆量子化回路６１５に入力され、ここで逆量子化される。当該逆量子化された変換係数には、逆直交変換回路６１６において、定められたフレーム画像データのフォーマットに基づき、逆離散コサイン変換や逆カルーネン・レーベ変換等の逆直交変換が施される。当該フレーム画像データがイントラ符号化されたものである場合には、逆直交変換処理が施されたフレーム画像データは、デブロックフィルタ６２１でブロック歪みが除去された後に画面並べ替えバッファ６１８に格納され、Ｄ／Ａ変換回路６１９によるＤ／Ａ変換処理を経て出力される。
【００３１】
一方、当該フレームがインター符号化されたものである場合には、動き予測・補償回路６２０において、動きベクトルＭＶ及びフレームメモリ６５０に格納された参照フレーム画像データを基に予測フレーム画像データが生成され、この予測フレーム画像データと、逆直交変換回路６１６から出力されたフレーム画像データとが加算器６１７において加算される。その他の処理はイントラ符号化されたフレーム画像データと同様である。
【００３２】
ところで、図１に示す符号化装置５００では、ＪＶＴ方式により、例えば、少数精度の動きベクトル探索において、処理対象となる現フレームの動き補償ブロック内の画素データと参照フレームの動き補償ブロック内の画素データとの差分ｄ（残差成分）の絶対値和を示すＳＡＤ（ＳｕｍｏｆＡｂｓｏｌｕｔｅＤｉｆｆｅｒｅｎｃｅ）ではなく、上記差分ｄに対して、下記式（１）に示すように、垂直方向および水平方向のアダマール変換を施した値の絶対値和であるＳＡＴＤを生成し、当該ＳＡＴＤを用いて規定される値Ｊを最小にする動きベクトルを探索する。
【００３３】
【数１】

【００３４】
しかしながら、上述したように動き探索において、上記差分ｄに対して、上記式（１）に示す垂直方向および水平方向のアダマール変換を施してＳＡＴＤを生成するのでは、演算量が多くなり、処理負担が大きいという問題がある。
なお、動き補償における動き補償（ＭＣ）ブロックの予測方向の決定処理、並びにマクロブロックモードの決定処理についても、同様な問題がある。
【００３５】
以下、上述した問題を解決するための本実施形態の画像処理装置およびその方法と符号化装置について説明する。
第１実施形態
図３は、本実施形態の通信システム１の概念図である。
図３に示すように、通信システム１は、送信側に設けられた符号化装置２と、受信側に設けられた復号装置４９９とを有する。
符号化装置２が発明の符号化装置に対応している。
符号化装置２および復号装置４９９は、上述したＪＶＴ符号化方式に基づいて符号化および復号を行なう。
復号回路４９９は、図２を用いて前述したものと同じである。
【００３６】
通信システム１では、送信側の符号化装置２において、離散コサイン変換やカルーネン・レーベ変換などの直交変換と動き補償によって圧縮したフレーム画像データ（ビットストリーム）を生成し、当該フレーム画像データを変調した後に、衛星放送波、ケーブルＴＶ網、電話回線網、携帯電話回線網などの伝送媒体を介して送信する。
受信側では、受信した画像信号を復調した後に、上記変調時の直交変換の逆変換と動き補償によって伸張したフレーム画像データを生成して利用する。
なお、上記伝送媒体は、光ディスク、磁気ディスクおよび半導体メモリなどの記録媒体であってもよい。
【００３７】
〔符号化装置２〕
図４は、図３に示す符号化装置２の全体構成図である。
図４に示すように、符号化装置２は、例えば、Ａ／Ｄ変換回路２２、画面並べ替えバッファ２３、演算回路２４、直交変換回路２５、量子化回路２６、可逆符号化回路２７、バッファ２８、逆量子化回路２９、逆直交変換回路３０、フレームメモリ３１、レート制御回路３２、動き予測・補償回路３６、デブロックフィルタ３７、モード判別回路４０、簡易ＳＡＴＤ算出回路４１および簡易ＳＡＴＤ算出回路４２を有する。
ここで、簡易ＳＡＴＤ算出回路４２が第１の発明の差分生成手段および直交変換手段に対応し、動き予測・補償回路３６が第１の発明の動きベクトル特定手段に対応している。
また、可逆符号化回路２７が本発明の符号化手段に対応している。
さらに、モード判別回路４０が本発明の予測方向特定手段に対応し、動き予測・補償回路３６が本発明の動きベクトル特定手段に対応している。
【００３８】
符号化装置２では、例えば、少数精度の動きベクトル探索において、処理対象となる現フレームの動き補償ブロック内の画素データと参照フレームの動き補償ブロック内の画素データとの差分（残差成分）に対して、前述したアダマール変換に比べて演算量の少ない簡易的な直交変換（本発明の直交変換）を施した値の絶対値和である簡易ＳＡＴＤを簡易ＳＡＴＤ算出回路４１および簡易ＳＡＴＤ算出回路４２で生成することに特徴を有している。
モード判別回路４０および動き予測・補償回路３６は、このように同様に生成された簡易ＳＡＴＤを基に、それぞれモード判別および動きベクトルの探索を行う。
【００３９】
以下、符号化装置２の構成要素について説明する。
〔Ａ／Ｄ変換回路２２〕
Ａ／Ｄ変換回路２２は、入力されたアナログの輝度信号Ｙ、色差信号Ｐｂ，Ｐｒから構成される画像信号をデジタルの画像信号に変換し、これを画面並べ替えバッファ２３に出力する。
【００４０】
〔画面並べ替えバッファ２３〕
画面並べ替えバッファ２３は、Ａ／Ｄ変換回路２２から入力した画像信号内のフレーム画像信号を、そのピクチャタイプＩ，Ｐ，ＢからなるＧＯＰ（ＧｒｏｕｐＯｆＰｉｃｔｕｒｅｓ）構造に応じて、符号化する順番に並べ替えたフレーム画像データＳ２３をモード判別回路４０に出力する。
【００４１】
〔モード判別回路４０〕
モード判別回路４０は、動き予測・補償回路３６における動き予測・補償において用いられる動き補償ブロックの予測方向（インター予測における順方向および逆方向）を決定する予測方向決定処理、並びにマクロブロックをイントラ／インターのどちらのモードで符号化するかを決定するモード決定処理を行う。
モード判別回路４０は、例えば、Ｂフレームの動き補償ブロックの予測方向として、下記式（２）で定義される値Ｊ（本発明の累積値）を最小にする方向を選択する。
【００４２】
【数２】

【００４３】
上記式（２）において、ＰＤＩＲは、動き補償ブロックの予測方向を示し、ｓは符号化対象の画像データ（現フレームデータ）を示し、ｃは参照画像データ（参照フレームデータ）を示し、ｍは動きベクトルを示し、ｐは予測動きベクトルを示し、ＲＥＦは参照画像データの識別番号を示し、Ｒは当該予測方向を選択した場合の動きベクトルの発生符号量を示し、ｌ_{ＭＯＴＩＯＮ}は所定の係数を示している。
また、上記式（２）におけるＳＡＴＤは、その引数に示される符号化対象の画像データのブロック内の画素データと、参照画像データのブロック内の画素データとの差分（残差成分）ｄに、所定の直交変換Ｃ（本発明の直交変換）を施した値の累積値（絶対値和）を示している。
【００４４】
ここで、簡易ＳＡＴＤ算出回路４１がＳＡＴＤの算出を行い、モード判別回路４０は、簡易ＳＡＴＤ算出回路４１から入力したＳＡＴＤを用いて、上記式（２）の値Ｊを算出する。
本実施形態では、上記差分ｄを下記式（３）のように定義する。
【００４５】
【数３】

【００４６】
なお、モード判別回路４０は、上記簡易ＳＡＴＤを基に、マクロブロックモードの決定を行ってもよい。
【００４７】
〔簡易ＳＡＴＤ算出回路４１〕
図５は、簡易ＳＡＴＤ算出回路４１の機能ブロック図である。
図５に示すように、簡易ＳＡＴＤ算出回路４１は、例えば、差分算出回路６１および直交変換回路６２を有する。
ここで、差分算出回路６１が本発明の差分生成手段に対応し、直交変換回路６２が本発明の直交変換手段に対応している。
【００４８】
差分算出回路６１は、現フレームの所定のブロック内の画素データＳ９０と参照フレームの所定のブロック内の画素データＳ９１との差分（残差成分）ｄを生成する。
また、直交変換回路６２は、以下に示す式（４）〜（１１）の何れかを、直交変換Ｃとして上記差分ｄに施し、それによって得られた値の絶対値和を簡易ＳＡＴＤとする。
【００４９】
すなわち、直交変換回路６２は、上記式（３）に示す差分ｄに対して、下記式（４）に示すように、アダマール変換の水平方向のみの変換を直交変換Ｃとして施し、それによって得られた値の絶対値和を上記簡易ＳＡＴＤとする。
【００５０】
【数４】

【００５１】
直交変換回路６２は、上記式（３）に示す差分ｄに対して、下記式（５）に示すように、アダマール変換の垂直方向のみの変換を直交変換Ｃとして施し、それによって得られた値の絶対値和を上記簡易ＳＡＴＤとする。
【００５２】
【数５】

【００５３】
直交変換回路６２は、上記式（３）に示す差分ｄに対して、下記式（６）に示すように、アダマール変換により水平方向の低域成分および垂直方向の低域成分のみを残す変換を直交変換Ｃとして施し、それによって得られた値の絶対値和を上記簡易ＳＡＴＤとする。
なお、本実施形態において、水平方向の低域成分とは差分ｄの最も低域の周波数成分から所定の範囲にある周波数成分（例えば、式（３）の左２列あるいは１列分の係数）を示し、水平方向の高域成分とは差分ｄの最も高域の成分から所定の範囲にある周波数成分（例えば、式（３）の右２列あるいは１列分の係数）を示し、水平方向の中域成分とは上記低域成分と上記高域成分との間の周波数成分を示す。
また、本実施形態において、垂直方向の低域成分とは差分ｄの最も低域の周波数成分から所定の範囲にある周波数成分（例えば、式（３）の上２行あるいは１行分の係数）を示し、垂直方向の高域成分とは差分ｄの最も高域の成分から所定の範囲にある周波数成分（例えば、式（３）の下２行あるいは１行分の係数）を示し、垂直方向の中域成分とは上記低域成分と上記高域成分との間の周波数成分を示す。
【００５４】
【数６】

【００５５】
直交変換回路６２は、上記式（３）に示す差分ｄに対して、下記式（７）に示すように、アダマール変換により水平方向の低域成分および垂直方向の全域成分を残す変換を直交変換Ｃとして施し、それによって得られた値の絶対値和を上記簡易ＳＡＴＤとする。この場合には、直交変換回路６２は、垂直方向の変換に先立って水平方向の変換を行うことで、上記式（１）に示す演算に比べて演算量を削減できる。
【００５６】
【数７】

【００５７】
直交変換回路６２は、上記式（３）に示す差分ｄに対して、下記式（８）に示すように、アダマール変換により水平方向の全域成分および垂直方向の低域成分のみを残す変換を直交変換Ｃとして施し、それによって得られた値の絶対値和を上記簡易ＳＡＴＤとする。この場合には、直交変換回路６２は、水平方向の変換に先立って垂直方向の変換を行うことで、上記式（１）に示す演算に比べて演算量を削減できる。
【００５８】
【数８】

【００５９】
直交変換回路６２は、上記式（３）に示す差分ｄに対して、下記式（９）に示すように、アダマール変換により水平方向の低域成分、並びに垂直方向の高域および低域成分のみを残す変換を直交変換Ｃとして施し、それによって得られた値の絶対値和を上記簡易ＳＡＴＤとする。
【００６０】
【数９】

【００６１】
直交変換回路６２は、上記式（３）に示す差分ｄに対して、下記式（１０）に示すように、アダマール変換により水平方向の低域および高域成分、並びに垂直方向の低域成分のみを残す変換を直交変換Ｃとして施し、それによって得られた値の絶対値和を上記簡易ＳＡＴＤとする。
【００６２】
【数１０】

【００６３】
直交変換回路６２は、上記式（３）に示す差分ｄに対して、下記式（１１）に示すように、アダマール変換により水平方向の低域および高域成分、並びに垂直方向の低域および高域成分のみを残す変換を直交変換Ｃとして施し、それによって得られた値の絶対値和を上記簡易ＳＡＴＤとする。
【００６４】
【数１１】

【００６５】
直交変換回路６２は、動き補償ブロックの種類に応じて、上述した式（４）〜（１１）の何れかを、直交変換Ｃとして差分ｄに施し、それによって得られた値の絶対値和を上記簡易ＳＡＴＤとしてもよい。
例えば、直交変換回路６２は、動き補償ブロックの種類が図７に示す１６×８あるいは８×４である場合には、上記式（４），（７）あるいは（９）を直交変換Ｃとして用いる。
また、直交変換回路６２は、動き補償ブロックの種類が図７に示す８×１６あるいは４×８である場合には、上記式（５），（７）あるいは（１０）を直交変換Ｃとして用いる。
また、直交変換回路６２は、動き補償ブロックの種類が正方形状である場合には、下記式（１２）、上記式（６）あるいは（１１）を用いてもよい。また、この場合には、アダマール変換を行わず、ＳＡＤを用いてもよい。
【００６６】
〔演算回路２４〕
演算回路２４は、フレーム画像データＳ２３がインター（Ｉｎｔｅｒ）符号化される場合には、フレーム画像データＳ２３と、動き予測・補償回路３６から入力した予測フレーム画像データＳ３６ａとの差分を示す画像データＳ２４を生成し、これを直交変換回路２５に出力する。
また、演算回路２４は、フレーム画像データＳ２３がイントラ（Ｉｎｔｒａ）符号化される場合には、フレーム画像データＳ２３を画像データＳ２４として直交変換回路２５に出力する。
【００６７】
〔直交変換回路２５〕
直交変換回路２５は、画像データＳ２４に離散コサイン変換やカルーネン・レーベ変換などの直交変換を施して画像データ（例えばＤＣＴ係数信号）Ｓ２５を生成し、これを量子化回路２６に出力する。
直交変換回路２５は、例えば、４×４のブロックを単位として直交変換を行う。
【００６８】
〔量子化回路２６〕
量子化回路２６は、レート制御回路３２から入力した量子化スケールで、画像データＳ２５を量子化して画像データＳ２６を生成し、これを可逆符号化回路２７および逆量子化回路２９に出力する。
【００６９】
〔可逆符号化回路２７〕
可逆符号化回路２７は、画像データＳ２６を可変長符号化あるいは算術符号化し、符号化された画像データをバッファ２８に格納する。
また、可逆符号化回路２７は、動き予測・補償回路３６から入力した動きベクトルＭＶあるいはその差分を符号化してヘッダデータに格納する。
バッファ２８に格納された画像データは、変調等された後に送信される。
【００７０】
〔逆量子化回路２９および逆直交変換回路３０〕
逆量子化回路２９は、画像データＳ２６を逆量子化したデータを生成し、これをデブロックフィルタ３７に出力する。
逆直交変換回路３０は、量子化され、デブロックフィルタ３７でブロック歪みが除去された画像データに上記直交変換の逆変換を施して生成したフレーム画像データをフレームメモリ３１に格納する。
【００７１】
〔レート制御回路３２〕
レート制御回路３２は、バッファ２８から読み出した画像データ、量子化パラメータＱＰを基に、量子化回路２６における量子化の量子化スケールを生成し、これを量子化回路２６に出力する。
【００７２】
〔動き予測・補償回路３６〕
動き予測・補償回路３６は、フレームメモリ３１からの画像データＳ３１と、画面並べ替えバッファ２３からの画像データとを基に動き予測・補償処理を行って、動きベクトルＭＶおよび参照画像データＳ３６ａを生成する。
動き予測・補償回路３６は、動きベクトルＭＶを可逆符号化回路２７に出力し、参照画像データＳ３６ａを演算回路２４に出力する。
動き予測・補償回路３６は、ＪＶＴ方式により、図６に示すようにマルチプル参照フレームを用いると共に、図７に示すように複数の種類の動き補償ブロックを選択的に用いる。さらには、ＦＩＲフィルタを用いた１／４画素精度あるいは１／８画素精度の少数画素精度の動き補償を行う。
【００７３】
図８は、動き予測・補償回路３６の機能ブロック図である。
図８に示すように、動き予測・補償回路３６は、例えば、整数精度ＭＶ探索回路５１、ＳＡＤ算出回路５２、補間回路５３、少数画素精度ＭＶ探索回路５４および参照画像決定回路５５を有する。
整数精度ＭＶ探索回路５１は、モード判別回路４０から入力した処理対象の画像データ（現フレームデータ）Ｓ２３内の画素データと、フレームメモリ３１から入力した参照画像データＳ３１（参照フレームデータ）内の画素データと、候補となる動きベクトルＭＶをＳＡＤ算出回路５２に出力する。
整数精度ＭＶ探索回路５１は、ＳＡＤ算出回路５２から入力した後述するＳＡＤを基に、候補となる複数の動きベクトルのうち、下記式（１２）で示される値Ｊを最小にする動きベクトルＭＶ１を整数画素精度で探索（特定）する。
【００７４】
【数１２】

【００７５】
上記式（１２）に示すλ_ＭＯＤＥとしては、ＩおよびＰフレームに対しては下記式（１３）で示されるλ_{ＭＯＤＥ，Ｐ（Ｉ）} が用いられ、Ｂフレームに対しては下記式（１４）で示されるλ_{ＭＯＤＥ，Ｂ}が用いられる。また、ＳＡＤは、ＳＡＤ算出回路５２から入力したものを用いる。
【００７６】
【数１３】

【００７７】
【数１４】

【００７８】
ＳＡＤ算出回路５２は、画像データ（現フレームデータ）Ｓ２３の動き補償ブロックＢ内の画素データｓ（ｘ，ｙ）と、フレームメモリ３１から入力した参照画像データＳ３１（参照フレームデータ）の動き補償ブロックＢに対応するブロック内の画素データｃ（ｘ−ｍ_ｘ，ｙ−ｍ_ｙ）とを用いて、下記式（１５）によりＳＡＤを算出する。
【００７９】
【数１５】

【００８０】
補間回路５３は、フレームメモリ３１から入力した整数画素精度の参照画像データ（参照フレームデータ）Ｓ３１をＦＩＲフィルタ等を用いて補間処理して１／４画素あるいは１／８画素精度の少数画素精度の参照画像データを生成し、これを少数画素精度ＭＶ探索回路５４に出力する。
【００８１】
少数画素精度ＭＶ探索回路５４は、例えば、フレームメモリ３１からの整数画素精度の参照画像データＳ３１と、補間回路５３で得られた少数画素精度の参照画像データとを用いて、整数精度ＭＶ探索回路５１で生成された動きベクトルＭＶ１によって規定される検索範囲内で、少数画素精度で動きベクトルの探索を行う。
このとき、少数画素精度ＭＶ探索回路５４は、簡易ＳＡＴＤ算出回路４２から入力した簡易ＳＡＴＤを用いて、下記式（１６）に示される値Ｊを算出し、複数の候補動きベクトルのなかから、当該値Ｊを最小にする動きベクトルＭＶを探索する。
【００８２】
【数１６】

【００８３】
参照画像決定回路５５は、少数画素精度ＭＶ探索回路５４から入力した動きベクトルＭＶを基に参照画像データＳ３６ａを生成する。
【００８４】
〔簡易ＳＡＴＤ算出回路４２〕
図９は、図４および図８に示す簡易ＳＡＴＤ算出回路４２の機能ブロック図である。
図９に示すように、簡易ＳＡＴＤ算出回路４２は、例えば、差分算出回路７１および直交変換回路７２を有している。
但し、簡易ＳＡＴＤ算出回路４２は、図８に示す画像データＳ２３（現フレームデータ）の少数画素および整数画素の画素データＳ２３ａと、参照画像データＳ３１（参照フレームデータ）内の少数画素および整数画素の画素データＳ３１ａとを用いて簡易ＳＡＴＤを算出する。
ここで、差分算出回路７１が本発明の差分生成手段に対応し、直交変換回路７２が本発明の直交変換手段に対応している。
【００８５】
差分算出回路７１は、各動き補償ブロック内の画素データＳ２３ａと、当該画素データＳ２３ａに対応する画素位置から候補動きベクトルによって指し示される画素位置に対応する参照画像データＳ３１ａ内の画素データＳ３１ａとの差分（残差成分）ｄをそれぞれ算出する。
また、直交変換回路７２は、前述した式（４）〜（１１）の何れかを、直交変換Ｃとして、差分算出回路７１が算出した上記差分ｄに施し、それによって得られた値の絶対値和を簡易ＳＡＴＤとする。
【００８６】
直交変換回路７２は、動き補償ブロックの種類に応じて、上述した式（４）〜（１１）の何れかを、直交変換Ｃとして差分ｄに施し、それによって得られた値の絶対値和を上記簡易ＳＡＴＤとしてもよい。
例えば、直交変換回路７２は、動き補償ブロックの種類が図７に示す１６×８あるいは８×４である場合には、上記式（４），（７）あるいは（９）を直交変換Ｃとして用いる。
また、直交変換回路７２は、動き補償ブロックの種類が図７に示す８×１６あるいは４×８である場合には、上記式（５），（７）あるいは（１０）を直交変換Ｃとして用いる。
また、直交変換回路７２は、動き補償ブロックの種類が正方形状である場合には、下記式（１２）、上記式（６）あるいは（１１）を用いてもよい。また、この場合には、アダマール変換を行わず、ＳＡＤを用いてもよい。
【００８７】
次に、図４に示す符号化装置２の全体動作を説明する。
入力となる画像信号は、まず、Ａ／Ｄ変換回路２２においてデジタル信号に変換される。次に、出力となる画像圧縮情報のＧＯＰ構造に応じ、画面並べ替えバッファ２３においてフレーム画像データの並べ替えが行われる。
そして、モード判別回路４０において、簡易ＳＡＴＤ算出回路４１において簡易的に算出された簡易ＳＡＴＤを基に、動き予測・補償において用いられる動き補償ブロックの予測方向を決定する予測方向決定処理、並びにマクロブロックをイントラ／インターのどちらのモードで符号化するかを決定するモード決定処理が行われる。
【００８８】
イントラ符号化が行われる画像データＳ２３（フレームデータ）に関しては、フレームデータ全体の画像情報が直交変換回路２５に入力され、直交変換回路２５において離散コサイン変換やカルーネン・レーベ変換等の直交変換が施される。
直交変換回路２５の出力となる変換係数は、量子化回路２６において量子化処理される。
量子化回路２６は、レート制御回路３２からの制御に基づいて、量子化を行う。
【００８９】
量子化回路２６の出力となる、量子化された変換係数は、可逆符号化回路２７に入力され、ここで可変長符号化、算術符号化等の可逆符号化が施された後、バッファ２８に蓄積され、圧縮された画像データとして出力される。
同時に、量子化回路２６の出力となる、量子化された変換係数は、逆量子化回路２９に入力され、さらに逆直交変換回路３０において逆直交変換処理が施されて、復号された画像データ（フレームデータ）となり、その画像データがフレームメモリ３１に蓄積される。
【００９０】
一方、インター符号化が行われる画像に関しては、先ず、その画像データＳ２３が動き予測・補償回路３６に入力される。また、参照画像データＳ３１がフレームメモリ３１より読み出され、動き予測・補償回路３６に出力される。
そして、動き予測・補償回路３６において、参照画像の画像データＳ３１を用いて、動きベクトルＭＶおよび予測画像データＳ３６ａが生成される。
【００９１】
そして、演算回路２４において、モード判別回路４０からの画像データＳ２３と、動き予測・補償回路３６からの予測画像データＳ３６ａとの差分信号である画像データＳ２４が生成され、当該画像データＳ２４が直交変換回路２５に出力される。
このとき、動き予測・補償回路３６は、ＪＶＴ方式により、マルチプル参照フレームを用いると共に、複数の種類の動き補償ブロックを選択的に用いる。さらには、ＦＩＲフィルタを用いた１／４画素精度あるいは１／８画素精度の少数画素精度の動き補償を行う。
当該動き補償において、動き予測・補償回路３６は、前述したように、簡易ＳＡＴＤ算出回路４２から入力した簡易ＳＡＴＤを用いて、上記式（１６）に示される値Ｊを算出し、複数の候補動きベクトルのなかから、当該値Ｊを最小にする動きベクトルＭＶを探索する。
【００９２】
そして、可逆符号化回路２７において、動きベクトルＭＶが可変長符号化あるいは算術符号化といった可逆符号化処理され、画像データのヘッダ部に挿入される。その他の処理はイントラ符号化を施される画像データと同様である。
【００９３】
以上説明したように、符号化装置２によれば、簡易ＳＡＴＤ算出回路４１において、アダマール変換を簡易的にした上記式（４）〜（１１）に示される直交変換Ｃを行って簡易ＳＡＴＤを算出し、当該簡易ＳＡＴＤを基に動き補償ブロックの予測方向を決定する。
そのため、アダマール変換を用いる場合に比べて動き予測・補償回路３６の演算量を削減できると共に、直交変換を行わない場合に比べて適切な動きベクトルＭＶを得ることができ、演算回路２４で生成される画像信号Ｓ２４の情報量を削減できる（符号化効率を高められる）。
【００９４】
本発明は上述した実施形態には限定されない。
例えば、上述した実施形態によれば、簡易ＳＡＴＤ算出回路４１および簡易ＳＡＴＤ算出回路４２において直交変換としてアダマール変換を用いる場合を例示したが、ＤＣＴ（ＤｉｓｃｒｅｔｅＣｏｓｉｎｅｔｒａｎｓｆｏｒｍ）などのその他の直交変換を用いてもよい。
なお、本実施形態において、ＪＶＴ方式では、簡易ＳＡＴＤ算出回路４１および簡易ＳＡＴＤ算出回路４２における直交変換Ｃを４×４画素のブロックに対して行うが、本発明は、例えば、８×８画素などの４×４画素以外のブロックについて直交変換Ｃを施してもよい。
【００９５】
【発明の効果】
以上説明したように、本発明によれば、動き補償に伴う演算量を削減できる画像処理装置、符号化装置およびそれらの方法を提供できる。
また、本発明によれば、予測符号化における予測方向の決定に伴う演算量を削減できる画像処理装置、符号化装置およびそれらの方法を提供できる。
【図面の簡単な説明】
【図１】図１は、本発明の関連技術に係わる符号化装置の機能ブロック図である。
【図２】図２は、本発明の関連技術に係わる復号装置の機能ブロック図である。
【図３】図３は、本発明の第１実施形態に係わる符号化装置を説明するための図である。
【図４】図４は、図３に示す符号化装置の機能ブロック図である。
【図５】図５は、図３に示すモード判別回路に対応する簡易ＳＡＴＤ算出回路の機能ブロック図である。
【図６】図６は、図３に示す動き予測・補償回路によるマルチプル参照フレームを説明するための図である。
【図７】図７は、図３に示す動き予測・補償回路によって選択される複数の種類の動き補償ブロックを説明するための図である。
【図８】図８は、図３に示す動き予測・補償回路の機能ブロック図である。
【図９】図９は、図３に示す動き予測・補償回路に対応した簡易ＳＡＴＤ算出回路の機能ブロック図である。
【符号の説明】
１…通信システム、２…符号化装置、２２…Ａ／Ｄ変換回路、２３…画面並べ替えバッファ、２４…演算回路、２５…直交変換回路、２６…量子化回路、２７…可逆符号化回路、２８…バッファ、２９…逆量子化回路、３０…逆直交変換回路、３１…フレームメモリ、３２…レート制御回路、３６…動き予測・補償回路、４１…簡易ＳＡＴＤ算出回路、４２…簡易ＳＡＴＤ算出回路、６１，７１…差分算出回路、６２，７２…直交変換回路、５１…整数精度ＭＶ探索回路、５２…ＳＡＤ算出回路、５３…補間回路、５４…少数画素精度ＭＶ探索回路、５５…参照画像決定回路
[0001]
[Technical field of the invention]
The present invention relates to an image processing device and an encoding device characterized by their quantization control of image data, and to methods therefor.
[0002]
[Prior art]
In recent years, devices conforming to schemes such as MPEG (Moving Picture Experts Group) have become widespread, both for information distribution by broadcast stations and the like and for information reception in ordinary households. These schemes handle image information as image data and, for the purpose of efficient transmission and storage of that information, compress it by an orthogonal transform such as the discrete cosine transform and by motion compensation, exploiting the redundancy peculiar to image information.
[0003]
In particular, MPEG2 (ISO/IEC 13818-2) is defined as a general-purpose image coding scheme. It is a standard that covers both interlaced and progressive scan images, as well as standard-resolution and high-definition images, and it is now widely used in a broad range of professional and consumer applications.
With the MPEG2 compression scheme, a high compression ratio and good image quality can be achieved by allocating a code amount (bit rate) of, for example, 4 to 8 Mbps to a standard-resolution interlaced image of 720 × 480 pixels, or 18 to 22 Mbps to a high-resolution interlaced image of 1920 × 1088 pixels.
[0004]
MPEG2 was mainly intended for high-quality coding suitable for broadcasting, and did not support code amounts (bit rates) lower than those of MPEG1, that is, coding schemes with higher compression ratios. With the spread of mobile terminals, the need for such coding schemes was expected to grow, and the MPEG4 coding scheme was standardized in response. Its image coding specification was approved as the international standard ISO/IEC 14496-2 in December 1998.
[0005]
Furthermore, in recent years, standardization of H.26L (ITU-T Q6/16 VCEG), originally intended for image coding for video conferencing, has been progressing. H.26L is known to require more computation for encoding and decoding than conventional coding schemes such as MPEG2 and MPEG4, but to achieve higher coding efficiency. In addition, as part of MPEG4 activities, standardization that builds on H.26L, incorporates functions not supported by the H.26L standard, and achieves still higher coding efficiency is currently under way as the Joint Model of Enhanced-Compression Video Coding.
[0006]
Incidentally, in encoding devices conforming to the MPEG and H.26L standards, motion prediction/compensation plays an important role in achieving high compression efficiency.
The JVT (Joint Video Team) scheme, which was developed from MPEG and H.26L, introduces the following three techniques to achieve high compression efficiency.
The first is multiple reference frames, the second is variable motion prediction/compensation block sizes, and the third is motion compensation with fractional-pixel precision of 1/4 pixel or 1/8 pixel using an FIR filter.
In the JVT scheme, for example, a motion vector search with fractional precision does not use the SAD (Sum of Absolute Differences), which is the sum of the absolute values of the differences (residual components) between the pixel data in the motion compensation block of the current frame to be processed and the pixel data in the motion compensation block of the reference frame. Instead, it generates the SATD, which is the sum of the absolute values obtained by applying Hadamard transforms in the vertical and horizontal directions to those differences, and searches for the motion vector that minimizes a value J defined using the SATD.
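The difference between the two cost measures can be illustrated with a minimal Python sketch. It assumes a 4 × 4 block, an unnormalized Hadamard matrix with rows ordered by increasing frequency, and no scaling of the result; the function names (`residual`, `sad`, `satd`) are hypothetical and are not taken from the patent or from any reference encoder.

```python
# Unnormalized 4x4 Hadamard matrix; rows ordered by number of sign changes
# (an assumption made for this sketch).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def residual(cur, ref):
    """Pixel-wise difference d between a current block and a reference block."""
    return [[c - r for c, r in zip(cur_row, ref_row)]
            for cur_row, ref_row in zip(cur, ref)]


def sad(d):
    """Sum of absolute differences: no transform is applied."""
    return sum(abs(v) for row in d for v in row)


def satd(d):
    """Sum of absolute values after a vertical and a horizontal Hadamard transform."""
    t = matmul(matmul(H4, d), transpose(H4))
    return sum(abs(v) for row in t for v in row)


# A residual with a single non-zero sample: SAD = 4, SATD = 64,
# because the transform spreads the impulse over all 16 coefficients.
impulse = [[4, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(sad(impulse), satd(impulse))
```

The two matrix products in `satd` are the part whose cost the invention aims to reduce.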
[0007]
[Problems to be solved by the invention]
However, generating the SATD by applying Hadamard transforms in both the vertical and horizontal directions to the differences during the motion search, as described above, requires a large amount of computation and imposes a heavy processing load.
A similar problem arises in the process of determining the prediction direction of a motion compensation (MC) block in motion compensation.
[0008]
The present invention has been made in view of these circumstances, and its object is to provide an image processing device, an encoding device, and methods therefor that can reduce the amount of computation involved in motion compensation.
A further object of the present invention is to provide an image processing device, an encoding device, and methods therefor that can reduce the amount of computation involved in determining the prediction direction in predictive coding.
[0009]
[Means for Solving the Problems]
To achieve the above object, an image processing apparatus according to a first invention comprises: difference generating means for generating the differences between a plurality of first pixel data in first image data and a plurality of second pixel data in second image data correlated with the first image data, the second pixel data being located at the positions indicated by a candidate motion vector from the positions corresponding to the plurality of first pixel data; orthogonal transform means for applying a one-dimensional orthogonal transform to the differences generated by the difference generating means; and motion vector specifying means for specifying, from among a plurality of candidate motion vectors, the motion vector to be used for motion compensation, based on a cumulative value of the differences to which the orthogonal transform has been applied by the orthogonal transform means.
[0010]
The operation of the image processing apparatus according to the first invention is as follows.
The difference generating means generates the differences between the plurality of first pixel data in the first image data and the plurality of second pixel data in the second image data correlated with the first image data, the second pixel data being located at the positions indicated by a candidate motion vector from the positions corresponding to the plurality of first pixel data.
Next, the orthogonal transform means applies a one-dimensional orthogonal transform to the differences generated by the difference generating means.
Next, the motion vector specifying means specifies, from among the plurality of candidate motion vectors, the motion vector to be used for motion compensation, based on the cumulative value of the differences to which the orthogonal transform has been applied by the orthogonal transform means.
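The three steps of the first invention (difference, one-dimensional transform, selection by cumulative value) might be sketched as follows. This is an illustrative sketch only: it assumes 4 × 4 blocks, a horizontal Hadamard transform as the one-dimensional orthogonal transform, integer-pixel candidates that stay inside the frame, and a cost equal to the cumulative value alone, without the rate term that the description adds when defining J. The reference position follows the c(x − m_x, y − m_y) convention used in the description; all function names are hypothetical.

```python
# Unnormalized 4-point Hadamard basis (rows ordered by increasing frequency).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def hadamard_1d(v):
    """4-point Hadamard transform of one row of residuals."""
    return [sum(h * x for h, x in zip(basis, v)) for basis in H4]


def simplified_satd_rows(d):
    """Cumulative value: absolute sum after a horizontal-only transform."""
    return sum(abs(c) for row in d for c in hadamard_1d(row))


def block_at(frame, x, y, size=4):
    """Extract a size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def best_motion_vector(cur, ref, x, y, candidates):
    """Return (motion_vector, cost) minimizing the simplified SATD.

    Candidates must keep the reference block inside the frame.
    """
    cur_blk = block_at(cur, x, y)
    best = None
    for mx, my in candidates:
        ref_blk = block_at(ref, x - mx, y - my)
        d = [[c - r for c, r in zip(cr, rr)]
             for cr, rr in zip(cur_blk, ref_blk)]
        cost = simplified_satd_rows(d)
        if best is None or cost < best[1]:
            best = ((mx, my), cost)
    return best
```

Compared with a two-dimensional transform, the horizontal-only version skips the entire vertical pass, which is where the saving in computation comes from.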
[0011]
An encoding device according to a second invention comprises: difference generating means for generating the differences between a plurality of first pixel data in first image data and a plurality of second pixel data in second image data correlated with the first image data, the second pixel data being located at the positions indicated by a candidate motion vector from the positions corresponding to the plurality of first pixel data; orthogonal transform means for applying a one-dimensional orthogonal transform to the differences generated by the difference generating means; motion vector specifying means for specifying, from among a plurality of candidate motion vectors, the motion vector to be used for motion compensation, based on a cumulative value of the differences to which the orthogonal transform has been applied by the orthogonal transform means; and encoding means for encoding the difference between the second image data and the first image data, together with the motion vector specified by the motion vector specifying means.
[0012]
The operation of the encoding device according to the second invention is as follows.
The difference generating means generates the differences between the plurality of first pixel data in the first image data and the plurality of second pixel data in the second image data correlated with the first image data, the second pixel data being located at the positions indicated by a candidate motion vector from the positions corresponding to the plurality of first pixel data.
Next, the orthogonal transform means applies a one-dimensional orthogonal transform to the differences generated by the difference generating means.
Next, the motion vector specifying means specifies, from among the plurality of candidate motion vectors, the motion vector to be used for motion compensation, based on the cumulative value of the differences to which the orthogonal transform has been applied by the orthogonal transform means.
Next, the encoding means encodes the difference between the second image data and the first image data, together with the motion vector specified by the motion vector specifying means.
[0013]
An image processing apparatus according to a third invention comprises: difference generating means for generating the differences between a plurality of first pixel data in first image data and a plurality of second pixel data in second image data correlated with the first image data, the second pixel data being located at the positions indicated by a candidate motion vector from the positions corresponding to the plurality of first pixel data; orthogonal transform means for applying to the differences generated by the difference generating means a two-dimensional orthogonal transform that retains only some of the frequency components of the differences in at least one of the horizontal and vertical directions; and motion vector specifying means for specifying, from among a plurality of candidate motion vectors, the motion vector to be used for motion compensation, based on a cumulative value of the differences to which the orthogonal transform has been applied by the orthogonal transform means.
[0014]
The operation of the image processing apparatus according to the third invention is as follows.
The difference generating means generates differences between the plurality of first pixel data in the first image data and the plurality of second pixel data which, in the second image data having a correlation with the first image data, correspond to positions indicated by the candidate motion vector from the positions corresponding to the plurality of first pixel data.
Next, the orthogonal transformation means performs, on the differences generated by the difference generating means, a two-dimensional orthogonal transformation that leaves only a part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction.
Next, the motion vector specifying means specifies the motion vector to be used for motion compensation from among the plurality of candidate motion vectors, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means.
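The three operations above (difference generation, partial two-dimensional transformation, selection of the candidate with the smallest accumulated value) can be sketched as follows. This is a minimal illustration only: the function names, the 4 × 4 block size, the integer-pixel candidates and the choice of keeping the two lowest frequencies in each direction are assumptions, and the sequency-ordered Hadamard matrix is used here merely as one possible orthogonal transformation.

```python
# Sequency-ordered 4x4 Hadamard matrix (assumption: row 0 is the lowest frequency).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]

def block_difference(cur, ref, x, y, mv, size=4):
    """Difference d between the current block at (x, y) and the reference
    block displaced by the candidate motion vector mv = (dx, dy)."""
    dx, dy = mv
    return [[cur[y + j][x + i] - ref[y + dy + j][x + dx + i]
             for i in range(size)] for j in range(size)]

def partial_transform_cost(d, rows_kept=(0, 1), cols_kept=(0, 1)):
    """Accumulated |coefficient| of H4 * d * H4^T over the kept frequencies only."""
    total = 0
    for u in rows_kept:        # vertical frequency index
        for v in cols_kept:    # horizontal frequency index
            coeff = sum(H4[u][j] * d[j][i] * H4[v][i]
                        for j in range(4) for i in range(4))
            total += abs(coeff)
    return total

def best_motion_vector(cur, ref, x, y, candidates):
    """Candidate motion vector with the smallest accumulated value."""
    return min(candidates,
               key=lambda mv: partial_transform_cost(block_difference(cur, ref, x, y, mv)))
```

Only the kept coefficients are evaluated, which is where the saving relative to a full two-dimensional transformation would come from.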
[0015]
An encoding device according to a fourth aspect of the present invention comprises: difference generating means for generating differences between a plurality of first pixel data in first image data and a plurality of second pixel data which, in second image data having a correlation with the first image data, correspond to positions indicated by a candidate motion vector from the positions corresponding to the plurality of first pixel data; orthogonal transformation means for performing, on the differences generated by the difference generating means, a two-dimensional orthogonal transformation that leaves only a part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction; motion vector specifying means for specifying a motion vector to be used for motion compensation from among a plurality of the candidate motion vectors, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means; and encoding means for encoding the difference between the second image data and the first image data, together with the motion vector specified by the motion vector specifying means.
[0016]
The operation of the encoding device according to the fourth invention is as follows.
The difference generating means generates differences between the plurality of first pixel data in the first image data and the plurality of second pixel data which, in the second image data having a correlation with the first image data, correspond to positions indicated by the candidate motion vector from the positions corresponding to the plurality of first pixel data.
Next, the orthogonal transformation means performs, on the differences generated by the difference generating means, a two-dimensional orthogonal transformation that leaves only a part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction.
Next, the motion vector specifying means specifies the motion vector to be used for motion compensation from among the plurality of candidate motion vectors, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means.
Next, the encoding means encodes the difference between the second image data and the first image data, together with the motion vector specified by the motion vector specifying means.
[0017]
An image processing apparatus according to a fifth aspect of the present invention comprises: difference generating means for generating differences between a plurality of first pixel data in first image data and a plurality of second pixel data which, in second image data having a correlation with the first image data, correspond to positions indicated by a motion vector from the positions corresponding to the plurality of first pixel data; orthogonal transformation means for performing a one-dimensional orthogonal transformation on the differences generated by the difference generating means; and prediction direction specifying means for specifying a prediction direction using the motion vector, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means.
[0018]
The operation of the image processing apparatus according to the fifth invention is as follows.
The difference generating means generates differences between the plurality of first pixel data in the first image data and the plurality of second pixel data which, in the second image data having a correlation with the first image data, correspond to positions indicated by the motion vector from the positions corresponding to the plurality of first pixel data.
Next, orthogonal transformation means performs one-dimensional orthogonal transformation on the difference generated by the difference generation means.
Next, a prediction direction specifying unit specifies a prediction direction using the motion vector based on the accumulated value of the difference subjected to the orthogonal transformation by the orthogonal transformation unit.
[0019]
An encoding device according to a sixth aspect of the present invention comprises: difference generating means for generating differences between a plurality of first pixel data in first image data and a plurality of second pixel data which, in second image data having a correlation with the first image data, correspond to positions indicated by a candidate motion vector from the positions corresponding to the plurality of first pixel data; orthogonal transformation means for performing a one-dimensional orthogonal transformation on the differences generated by the difference generating means; prediction direction specifying means for specifying a prediction direction using the motion vector, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means; and encoding means for encoding the difference between the second image data and the first image data, together with the motion vector.
[0020]
An image processing apparatus according to a seventh aspect of the present invention comprises: difference generating means for generating differences between a plurality of first pixel data in first image data and a plurality of second pixel data which, in second image data having a correlation with the first image data, correspond to positions indicated by a candidate motion vector from the positions corresponding to the plurality of first pixel data; orthogonal transformation means for performing, on the differences generated by the difference generating means, a two-dimensional orthogonal transformation that leaves only a part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction; and prediction direction specifying means for specifying a prediction direction using the motion vector, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means.
[0021]
An encoding device according to an eighth aspect of the present invention comprises: difference generating means for generating differences between a plurality of first pixel data in first image data and a plurality of second pixel data which, in second image data having a correlation with the first image data, correspond to positions indicated by a candidate motion vector from the positions corresponding to the plurality of first pixel data; orthogonal transformation means for performing, on the differences generated by the difference generating means, a two-dimensional orthogonal transformation that leaves only a part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction; prediction direction specifying means for specifying a prediction direction using the motion vector, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means; and encoding means for encoding the difference between the second image data and the first image data, together with the motion vector.
[0022]
An image processing method according to a ninth aspect of the present invention comprises: a first step of generating differences between a plurality of first pixel data in first image data and a plurality of second pixel data which, in second image data having a correlation with the first image data, correspond to positions indicated by a candidate motion vector from the positions corresponding to the plurality of first pixel data; a second step of performing a one-dimensional orthogonal transformation on the differences generated in the first step; and a third step of specifying a motion vector to be used for motion compensation from among a plurality of the candidate motion vectors, based on the accumulated value of the differences subjected to the orthogonal transformation in the second step.
[0023]
An image processing method according to a tenth aspect of the present invention comprises: a first step of generating differences between a plurality of first pixel data in first image data and a plurality of second pixel data which, in second image data having a correlation with the first image data, correspond to positions indicated by a candidate motion vector from the positions corresponding to the plurality of first pixel data; a second step of performing, on the differences generated in the first step, a two-dimensional orthogonal transformation that leaves only a part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction; and a third step of specifying a motion vector to be used for motion compensation from among a plurality of the candidate motion vectors, based on the accumulated value of the differences subjected to the orthogonal transformation in the second step.
[0024]
An image processing method according to an eleventh aspect of the present invention comprises: a first step of generating differences between a plurality of first pixel data in first image data and a plurality of second pixel data which, in second image data having a correlation with the first image data, correspond to positions indicated by a motion vector from the positions corresponding to the plurality of first pixel data; a second step of performing a one-dimensional orthogonal transformation on the differences generated in the first step; and a third step of specifying a prediction direction using the motion vector, based on the accumulated value of the differences subjected to the orthogonal transformation in the second step.
[0025]
An image processing method according to a twelfth aspect of the present invention comprises: a first step of generating differences between a plurality of first pixel data in first image data and a plurality of second pixel data which, in second image data having a correlation with the first image data, correspond to positions indicated by a candidate motion vector from the positions corresponding to the plurality of first pixel data; a second step of performing, on the differences generated in the first step, a two-dimensional orthogonal transformation that leaves only a part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction; and a third step of specifying a prediction direction using the motion vector, based on the accumulated value of the differences subjected to the orthogonal transformation in the second step.
[0026]
BEST MODE FOR CARRYING OUT THE INVENTION
(Related technology of the present invention)
FIG. 1 is a functional block diagram of an encoding device 500 according to the related art of the present invention.
In the encoding device 500 shown in FIG. 1, an input image signal is first converted into a digital signal in an A / D conversion circuit 501. Next, the frame image data is rearranged in the screen rearrangement circuit 502 in accordance with the GOP (Group of Pictures) structure of the image compression information to be output.
For the image to be subjected to intra coding, the entire frame image data is input to the orthogonal transform circuit 504, and the orthogonal transform circuit 504 performs an orthogonal transform such as a discrete cosine transform or a Karhunen-Loeve transform.
The transform coefficient output from the orthogonal transform circuit 504 is quantized by the quantizing circuit 505.
The quantized transform coefficient output from the quantization circuit 505 is input to the lossless encoding circuit 506, where it is subjected to lossless encoding such as variable-length encoding or arithmetic encoding. It is then stored in the buffer 507 and output as compressed image data.
[0027]
The quantization rate in the quantization circuit 505 is controlled by the rate control circuit 512.
At the same time, the quantized transform coefficient output from the quantization circuit 505 is inversely quantized by the inverse quantization circuit 508 and then subjected to inverse orthogonal transform processing by the inverse orthogonal transform circuit 509. The deblocking filter 513 then removes the block distortion to obtain decoded reference frame image data. The reference frame image data is stored in the frame memory 510.
[0028]
On the other hand, for an image on which inter encoding is performed, frame image data output from the screen rearranging circuit 502 is input to the motion prediction / compensation circuit 511. At the same time, the reference frame image data is read from the frame memory 510, a motion vector is generated by the motion prediction / compensation circuit 511, and predicted frame image data is generated using the motion vector and the reference frame image data. The predicted frame image data is output to the arithmetic circuit 503, and the arithmetic circuit 503 generates image data indicating a difference between the frame image data from the screen rearranging circuit 502 and the predicted frame image data from the motion prediction / compensation circuit 511. Then, the image data is output to the orthogonal transformation circuit 504.
In addition, the motion prediction / compensation circuit 511 outputs the motion vector to the lossless encoding circuit 506, and the lossless encoding circuit 506 subjects the motion vector to lossless encoding such as variable-length encoding or arithmetic encoding and inserts the result into the header. The other processing is the same as for an image signal subjected to intra coding.
[0029]
FIG. 2 is a functional block diagram of a decoding circuit 499 corresponding to the encoding device 500 shown in FIG. 1.
In the decoding circuit 499 illustrated in FIG. 2, input image data is stored in the buffer 613 and then output to the lossless decoding circuit 614. The lossless decoding circuit 614 performs processing such as variable-length decoding or arithmetic decoding based on the format of the frame image data. At the same time, when the frame image data is inter-coded, the lossless decoding circuit 614 also decodes the motion vector MV stored in the header portion of the frame image data and outputs the motion vector MV to the motion prediction / compensation circuit 620.
[0030]
The quantized transform coefficient output from the lossless decoding circuit 614 is input to the inverse quantization circuit 615, where it is inversely quantized. The inversely quantized transform coefficient is subjected to an inverse orthogonal transform such as an inverse discrete cosine transform or an inverse Karhunen-Loeve transform in the inverse orthogonal transform circuit 616, based on a predetermined frame image data format. When the frame image data is intra-coded, the frame image data subjected to the inverse orthogonal transform is stored in the screen rearrangement buffer 618 after the block distortion is removed by the deblocking filter 621, and is then output after D / A conversion by the D / A conversion circuit 619.
[0031]
On the other hand, if the frame is inter-coded, the motion prediction / compensation circuit 620 generates predicted frame image data based on the motion vector MV and the reference frame image data stored in the frame memory 650. The predicted frame image data and the frame image data output from the inverse orthogonal transform circuit 616 are added in the adder 617. The other processing is the same as that of the frame image data that has been intra-coded.
[0032]
Incidentally, in the encoding device 500 shown in FIG. 1, for example in a motion vector search with fractional-pixel precision under the JVT method, the search does not use the SAD (Sum of Absolute Difference), that is, the sum of the absolute values of the differences d (residual components) between the pixel data in the motion compensation block of the current frame to be processed and the pixel data in the motion compensation block of the reference frame. Instead, the difference d is subjected to the Hadamard transform in the vertical and horizontal directions as shown in the following equation (1) to generate the SATD (Sum of Absolute Transformed Difference), and a motion vector that minimizes a value J defined using the SATD is searched for.
[0033]
(Equation 1)

[0034]
However, as described above, if in the motion search the difference d is subjected to the Hadamard transform in the vertical and horizontal directions shown in the above equation (1) to generate the SATD, there is a problem in that the amount of calculation increases and the processing load becomes large.
Note that there is a similar problem in the process of determining the prediction direction of a motion compensation (MC) block and the process of determining a macroblock mode in motion compensation.
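The difference between the SAD and the SATD discussed above can be made concrete with a small sketch. It assumes a 4 × 4 difference block d and a sequency-ordered 4 × 4 Hadamard matrix, and it omits any normalisation factor that equation (1) may contain; the function names are hypothetical.

```python
# Sequency-ordered 4x4 Hadamard matrix (assumed form of the transform).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def sad(d):
    """Sum of absolute differences: no transform is applied."""
    return sum(abs(v) for row in d for v in row)

def satd(d):
    """Sum of absolute values of the 2-D Hadamard transform H4 * d * H4^T."""
    c = matmul(matmul(H4, d), transpose(H4))
    return sum(abs(v) for row in c for v in row)
```

The SATD requires two matrix products per candidate block on top of the subtraction, which is the additional computation the related art incurs.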
[0035]
Hereinafter, an image processing apparatus, an image processing method, and an encoding apparatus according to the present embodiment for solving the above-described problem will be described.
First embodiment
FIG. 3 is a conceptual diagram of the communication system 1 of the present embodiment.
As shown in FIG. 3, the communication system 1 includes an encoding device 2 provided on the transmission side and a decoding device 499 provided on the reception side.
The encoding device 2 corresponds to the encoding device of the present invention.
The encoding device 2 and the decoding device 499 perform encoding and decoding based on the above-described JVT encoding method.
The decoding device 499 is the same as the decoding circuit 499 described with reference to FIG. 2.
[0036]
In the communication system 1, the encoding device 2 on the transmission side generates frame image data (a bit stream) compressed by an orthogonal transform, such as a discrete cosine transform or a Karhunen-Loeve transform, and by motion compensation. After the frame image data is modulated, it is transmitted via a transmission medium such as a satellite broadcast wave, a cable TV network, a telephone line network, or a mobile telephone line network.
On the receiving side, the received image signal is demodulated, and frame image data is then generated and used by expanding it with the inverse of the orthogonal transform applied on the transmission side and with motion compensation.
Note that the transmission medium may be a recording medium such as an optical disk, a magnetic disk, and a semiconductor memory.
[0037]
[Encoding device 2]
FIG. 4 is an overall configuration diagram of the encoding device 2 shown in FIG. 3.
As shown in FIG. 4, the encoding device 2 includes, for example, an A / D conversion circuit 22, a screen rearrangement buffer 23, an arithmetic circuit 24, an orthogonal transformation circuit 25, a quantization circuit 26, a lossless encoding circuit 27, a buffer 28, an inverse quantization circuit 29, an inverse orthogonal transform circuit 30, a frame memory 31, a rate control circuit 32, a motion prediction / compensation circuit 36, a deblocking filter 37, a mode discrimination circuit 40, a simple SATD calculation circuit 41 and a simple SATD calculation circuit 42.
Here, the simple SATD calculation circuit 42 corresponds to the difference generation means and the orthogonal transformation means of the first invention, and the motion prediction / compensation circuit 36 corresponds to the motion vector identification means of the first invention.
Further, the lossless encoding circuit 27 corresponds to the encoding means of the present invention.
Further, the mode discrimination circuit 40 corresponds to the prediction direction specifying means of the present invention, and the motion prediction / compensation circuit 36 corresponds to the motion vector specifying means of the present invention.
[0038]
The encoding device 2 is characterized in that, for example in a motion vector search with fractional-pixel accuracy, the simple SATD calculation circuit 41 and the simple SATD calculation circuit 42 generate a simple SATD. The simple SATD is the sum of the absolute values obtained by applying, to the differences (residual components) between the pixel data in the motion compensation block of the current frame to be processed and the pixel data in the motion compensation block of the reference frame, a simple orthogonal transformation (the orthogonal transformation of the present invention) that requires fewer operations than the Hadamard transform described above.
The mode discrimination circuit 40 and the motion prediction / compensation circuit 36 perform mode discrimination and search for a motion vector, respectively, based on the simplified SATD generated in the same manner.
[0039]
Hereinafter, components of the encoding device 2 will be described.
[A / D conversion circuit 22]
The A / D conversion circuit 22 converts the input image signal composed of the analog luminance signal Y and the color difference signals Pb and Pr into a digital image signal, and outputs this to the screen rearrangement buffer 23.
[0040]
[Screen rearrangement buffer 23]
The screen rearrangement buffer 23 rearranges the frame image signals in the image signal input from the A / D conversion circuit 22 into encoding order according to the GOP (Group Of Pictures) structure composed of the picture types I, P, and B, and outputs the rearranged frame image data S23 to the mode discrimination circuit 40.
[0041]
[Mode discriminating circuit 40]
The mode discrimination circuit 40 performs a prediction direction determination process, which determines the prediction direction (forward or backward in inter prediction) of the motion compensation block used for motion prediction / compensation in the motion prediction / compensation circuit 36, and a mode determination process, which determines whether a macroblock is to be encoded in intra mode or inter mode.
The mode determination circuit 40 selects, for example, a direction that minimizes a value J (accumulated value of the present invention) defined by the following equation (2) as a prediction direction of a motion compensation block of a B frame.
[0042]
(Equation 2)

[0043]
In the above equation (2), PDIR indicates the prediction direction of the motion compensation block, s indicates the image data to be encoded (current frame data), c indicates the reference image data (reference frame data), m indicates a motion vector, p indicates a predicted motion vector, REF indicates the identification number of the reference image data, R indicates the code amount generated for the motion vector when the prediction direction is selected, and λ_MOTION indicates a predetermined coefficient.
The SATD in the above equation (2) is the cumulative value (sum of absolute values) of the values obtained by applying a predetermined orthogonal transformation C (the orthogonal transformation of the present invention) to the difference (residual component) d between the pixel data in the block of the image data to be encoded and the pixel data in the block of the reference image data indicated by the arguments.
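How a value J of this kind could drive the choice of prediction direction is sketched below. The sketch assumes a Lagrangian form, J = SATD + λ_MOTION × R, which is consistent with the symbols defined above but is not taken from equation (2) itself; the function names and the numeric inputs in the usage note are hypothetical.

```python
def prediction_cost(satd_value, rate_bits, lambda_motion):
    """Assumed cost form: distortion term plus weighted code amount."""
    return satd_value + lambda_motion * rate_bits

def choose_prediction_direction(candidates, lambda_motion):
    """candidates maps a direction name to (satd_value, rate_bits);
    the direction with the smallest cost J is returned."""
    return min(candidates,
               key=lambda name: prediction_cost(*candidates[name], lambda_motion))
```

For example, with candidates {"forward": (120, 10), "backward": (100, 30)}, a coefficient of 2 favours the forward direction (140 against 160), while a coefficient of 0.5 favours the backward direction (115 against 125).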
[0044]
Here, the simple SATD calculation circuit 41 calculates the SATD, and the mode determination circuit 40 calculates the value J of the above equation (2) using the SATD input from the simple SATD calculation circuit 41.
In the present embodiment, the difference d is defined as in the following equation (3).
[0045]
(Equation 3)

[0046]
Note that the mode determination circuit 40 may determine the macroblock mode based on the simple SATD.
[0047]
[Simple SATD calculation circuit 41]
FIG. 5 is a functional block diagram of the simple SATD calculation circuit 41.
As shown in FIG. 5, the simple SATD calculation circuit 41 includes, for example, a difference calculation circuit 61 and an orthogonal transformation circuit 62.
Here, the difference calculation circuit 61 corresponds to the difference generation means of the present invention, and the orthogonal transformation circuit 62 corresponds to the orthogonal transformation means of the invention.
[0048]
The difference calculation circuit 61 generates a difference (residual component) d between pixel data S90 in a predetermined block of the current frame and pixel data S91 in a predetermined block of the reference frame.
Further, the orthogonal transformation circuit 62 performs any one of the following equations (4) to (11) as the orthogonal transformation C on the difference d, and sets the sum of the absolute values of the obtained values as a simple SATD.
[0049]
That is, as shown in the following equation (4), the orthogonal transformation circuit 62 applies only the horizontal part of the Hadamard transform, as the orthogonal transformation C, to the difference d shown in the above equation (3), and sets the sum of the absolute values of the obtained values as the simple SATD.
[0050]
(Equation 4)

[0051]
Alternatively, as shown in the following equation (5), the orthogonal transformation circuit 62 applies only the vertical part of the Hadamard transform, as the orthogonal transformation C, to the difference d shown in the above equation (3), and sets the sum of the absolute values of the obtained values as the simple SATD.
[0052]
(Equation 5)
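The horizontal-only and vertical-only transformations described for equations (4) and (5) can be sketched as follows. The sketch assumes a 4 × 4 difference block and a 4-point Hadamard butterfly with outputs in sequency order; the function names are hypothetical and no normalisation is applied.

```python
def hadamard_1d(v):
    """4-point Hadamard transform using a two-stage butterfly;
    outputs are ordered from lowest to highest frequency (assumption)."""
    a, b, c, d = v
    s0, s1 = a + d, b + c
    t0, t1 = a - d, b - c
    return [s0 + s1, t0 + t1, s0 - s1, t0 - t1]

def simple_satd_horizontal(d):
    """Transform each row only, then sum the absolute values."""
    return sum(abs(c) for row in d for c in hadamard_1d(row))

def simple_satd_vertical(d):
    """Transform each column only, then sum the absolute values."""
    return sum(abs(c) for col in zip(*d) for c in hadamard_1d(col))
```

Each variant applies four 1-D transforms to a 4 × 4 block, where the full two-dimensional transform of equation (1) applies eight.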

[0053]
Alternatively, as shown in the following equation (6), the orthogonal transformation circuit 62 applies to the difference d shown in the above equation (3), as the orthogonal transformation C, a transformation that leaves only the horizontal low-frequency component and the vertical low-frequency component of the Hadamard transform, and sets the sum of the absolute values of the obtained values as the simple SATD.
In the present embodiment, the horizontal low-frequency component indicates the frequency components within a predetermined range from the lowest frequency component of the difference d (for example, the coefficients of the left two columns or the left one column of equation (3)), the horizontal high-frequency component indicates the frequency components within a predetermined range from the highest frequency component of the difference d (for example, the coefficients of the right two columns or the right one column of equation (3)), and the horizontal mid-frequency component indicates the frequency components between the low-frequency component and the high-frequency component.
Similarly, in the present embodiment, the vertical low-frequency component indicates the frequency components within a predetermined range from the lowest frequency component of the difference d (for example, the coefficients of the upper two rows or the upper one row of equation (3)), the vertical high-frequency component indicates the frequency components within a predetermined range from the highest frequency component of the difference d (for example, the coefficients of the lower two rows or the lower one row of equation (3)), and the vertical mid-frequency component indicates the frequency components between the low-frequency component and the high-frequency component.
[0054]
(Equation 6)

[0055]
Alternatively, as shown in the following equation (7), the orthogonal transformation circuit 62 applies to the difference d shown in the above equation (3), as the orthogonal transformation C, a transformation that leaves only the horizontal low-frequency component and all vertical frequency components of the Hadamard transform, and sets the sum of the absolute values of the obtained values as the simple SATD. In this case, the orthogonal transformation circuit 62 performs the horizontal transformation before the vertical transformation, so that the vertical transformation needs to be applied only to the retained horizontal components and the amount of computation is smaller than for the computation shown in the above equation (1).
[0056]
(Equation 7)

[0057]
Alternatively, as shown in the following equation (8), the orthogonal transformation circuit 62 applies to the difference d shown in the above equation (3), as the orthogonal transformation C, a transformation that leaves only all horizontal frequency components and the vertical low-frequency component of the Hadamard transform, and sets the sum of the absolute values of the obtained values as the simple SATD. In this case, the orthogonal transformation circuit 62 performs the vertical transformation before the horizontal transformation, so that the horizontal transformation needs to be applied only to the retained vertical components and the amount of computation is smaller than for the computation shown in the above equation (1).
[0058]
(Equation 8)

[0059]
Alternatively, as shown in the following equation (9), the orthogonal transformation circuit 62 applies to the difference d shown in the above equation (3), as the orthogonal transformation C, a transformation that leaves only the horizontal low-frequency component and the vertical high-frequency and low-frequency components of the Hadamard transform, and sets the sum of the absolute values of the obtained values as the simple SATD.
[0060]
(Equation 9)

[0061]
Alternatively, as shown in the following equation (10), the orthogonal transformation circuit 62 applies to the difference d shown in the above equation (3), as the orthogonal transformation C, a transformation that leaves only the horizontal low-frequency and high-frequency components and the vertical low-frequency component of the Hadamard transform, and sets the sum of the absolute values of the obtained values as the simple SATD.
[0062]
(Equation 10)

[0063]
Alternatively, as shown in the following equation (11), the orthogonal transformation circuit 62 applies to the difference d shown in the above equation (3), as the orthogonal transformation C, a transformation that leaves only the horizontal low-frequency and high-frequency components and the vertical low-frequency and high-frequency components of the Hadamard transform, and sets the sum of the absolute values of the obtained values as the simple SATD.
[0064]
(Equation 11)
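The variants described for equations (6) to (11) differ only in which horizontal and vertical frequency components are retained, so they can be illustrated by one parameterised sketch. The function names, the 4 × 4 block size, the sequency ordering of the coefficients and the use of index sets to denote "low" and "high" components are assumptions made for illustration.

```python
def hadamard_1d(v):
    """4-point Hadamard transform, outputs ordered from low to high frequency."""
    a, b, c, d = v
    s0, s1, t0, t1 = a + d, b + c, a - d, b - c
    return [s0 + s1, t0 + t1, s0 - s1, t0 - t1]

def partial_simple_satd(d, h_keep, v_keep):
    """Sum of |coefficients| of the 2-D Hadamard transform of d, restricted to
    horizontal frequency indices h_keep and vertical frequency indices v_keep."""
    # Horizontal pass first: transform each row, keep only the selected columns.
    kept = []
    for row in d:
        t = hadamard_1d(row)
        kept.append([t[v] for v in h_keep])
    total = 0
    # Vertical pass: only the surviving columns are transformed.
    for col in zip(*kept):
        t = hadamard_1d(col)
        total += sum(abs(t[u]) for u in v_keep)
    return total
```

Under these assumptions, h_keep=(0, 1) with v_keep=(0, 1) would correspond to a low/low variant in the manner of equation (6), and h_keep=(0, 1) with v_keep=(0, 1, 2, 3) to a variant in the manner of equation (7). In the latter case the vertical pass runs on two columns instead of four, which illustrates why performing the horizontal transformation first reduces the work.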

[0065]
The orthogonal transformation circuit 62 may select one of the above equations (4) to (11) as the orthogonal transformation C according to the type of the motion compensation block, apply it to the difference d, and set the sum of the absolute values of the obtained values as the simple SATD.
For example, when the type of the motion compensation block is 16 × 8 or 8 × 4 shown in FIG. 7, the orthogonal transformation circuit 62 uses the above equation (4), (7) or (9) as the orthogonal transformation C.
When the type of the motion compensation block is 8 × 16 or 4 × 8 shown in FIG. 7, the orthogonal transformation circuit 62 uses the above equation (5), (7) or (10) as the orthogonal transformation C.
When the type of the motion compensation block is square, the orthogonal transformation circuit 62 may use the following equation (12), or the above equation (6) or (11). In this case, the SAD may also be used without performing the Hadamard transform.
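The block-type dependent selection can be sketched as a simple dispatch on the block shape. This sketch picks only the first option listed for each shape in the text (equation (4) for wide blocks, equation (5) for tall blocks) and, as an arbitrary choice for illustration, equation (6) for square blocks; the function name and return convention are hypothetical.

```python
def equation_for_block(width, height):
    """Return the number of the equation used as orthogonal transformation C
    for a motion compensation block of the given size (illustrative choice)."""
    if width > height:   # e.g. 16x8 or 8x4: horizontal-only transformation
        return 4
    if height > width:   # e.g. 8x16 or 4x8: vertical-only transformation
        return 5
    return 6             # square block: low-frequency 2-D variant
```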
[0066]
[Arithmetic circuit 24]
When the frame image data S23 is inter-coded, the arithmetic circuit 24 generates image data S24 indicating the difference between the frame image data S23 and the predicted frame image data S36a input from the motion prediction / compensation circuit 36, and outputs the image data S24 to the orthogonal transformation circuit 25.
When the frame image data S23 is intra-coded, the arithmetic circuit 24 outputs the frame image data S23 to the orthogonal transformation circuit 25 as image data S24.
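The two cases handled by the arithmetic circuit 24 can be sketched as follows, with frames modelled as lists of rows of integer samples. The function name and the use of None to signal intra coding are assumptions for illustration.

```python
def arithmetic_circuit(frame, predicted=None):
    """Return the data passed on to the orthogonal transformation stage.

    Inter coding: sample-wise difference between the frame and its prediction.
    Intra coding (predicted is None): the frame is passed through unchanged."""
    if predicted is None:
        return [row[:] for row in frame]
    return [[f - p for f, p in zip(frame_row, pred_row)]
            for frame_row, pred_row in zip(frame, predicted)]
```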
[0067]
[Orthogonal transformation circuit 25]
The orthogonal transform circuit 25 performs orthogonal transform such as discrete cosine transform or Karhunen-Loeve transform on the image data S24 to generate image data (for example, a DCT coefficient signal) S25, and outputs this to the quantization circuit 26.
The orthogonal transformation circuit 25 performs an orthogonal transformation in units of, for example, 4 × 4 blocks.
[0068]
[Quantization circuit 26]
The quantization circuit 26 quantizes the image data S25 using the quantization scale input from the rate control circuit 32 to generate image data S26, and outputs this to the lossless encoding circuit 27 and the inverse quantization circuit 29.
[0069]
[Lossless encoding circuit 27]
The lossless encoding circuit 27 performs variable length encoding or arithmetic encoding on the image data S26, and stores the encoded image data in the buffer 28.
Further, the lossless encoding circuit 27 encodes the motion vector MV input from the motion prediction / compensation circuit 36, or its difference, and stores the result in the header data.
The image data stored in the buffer 28 is transmitted after being modulated.
[0070]
[Inverse quantization circuit 29 and inverse orthogonal transformation circuit 30]
The inverse quantization circuit 29 generates data obtained by inversely quantizing the image data S26, and outputs this to the deblocking filter 37.
The inverse orthogonal transform circuit 30 applies the inverse of the above orthogonal transform to the image data that has been inversely quantized and from which block distortion has been removed by the deblocking filter 37, and stores the resulting frame image data in the frame memory 31.
[0071]
[Rate control circuit 32]
The rate control circuit 32 generates a quantization scale for quantization in the quantization circuit 26 based on the image data read from the buffer 28 and the quantization parameter QP, and outputs this to the quantization circuit 26.
[0072]
[Motion prediction / compensation circuit 36]
The motion prediction / compensation circuit 36 performs motion prediction / compensation processing based on the image data S31 from the frame memory 31 and the image data from the screen rearrangement buffer 23, and generates a motion vector MV and reference image data S36a.
The motion prediction / compensation circuit 36 outputs the motion vector MV to the lossless encoding circuit 27, and outputs reference image data S36a to the arithmetic circuit 24.
In accordance with the JVT method, the motion prediction / compensation circuit 36 uses multiple reference frames as shown in FIG. 6 and selectively uses a plurality of types of motion compensation blocks as shown in FIG. 7. Further, it performs motion compensation with fractional-pixel precision of 1/4 pixel or 1/8 pixel using an FIR filter.
[0073]
FIG. 8 is a functional block diagram of the motion prediction / compensation circuit 36.
As shown in FIG. 8, the motion prediction / compensation circuit 36 includes, for example, an integer precision MV search circuit 51, a SAD calculation circuit 52, an interpolation circuit 53, a fractional-pixel-precision MV search circuit 54, and a reference image determination circuit 55.
The integer-precision MV search circuit 51 outputs to the SAD calculation circuit 52 the pixel data in the image data to be processed (current frame data) S23 input from the mode determination circuit 40, the pixel data in the reference image data (reference frame data) S31 input from the frame memory 31, and the candidate motion vectors MV.
Based on the SAD (described later) input from the SAD calculation circuit 52, the integer-precision MV search circuit 51 searches for (specifies), with integer pixel precision, the motion vector MV1 that minimizes the value J given by equation (12) below among the plurality of candidate motion vectors.
[0074]
(Equation 12)

[0075]
As λ_MODE in equation (12) above, λ_{MODE,P(I)} given by equation (13) below is used for I and P frames, and λ_{MODE,B} given by equation (14) below is used for B frames. As the SAD, the value input from the SAD calculation circuit 52 is used.
[0076]
(Equation 13)

[0077]
[Equation 14]

[0078]
The SAD calculation circuit 52 calculates the SAD by equation (15) below from the pixel data s(x, y) in the motion compensation block B of the image data (current frame data) S23 and the pixel data c(x − m_x, y − m_y) in the block of the reference image data (reference frame data) S31, input from the frame memory 31, that corresponds to the motion compensation block B.
[0079]
[Equation 15]
$$\mathrm{SAD} = \sum_{(x, y) \in B} \left| s(x, y) - c(x - m_x,\, y - m_y) \right|$$
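The SAD computation and the integer-precision search built on it can be sketched in Python as follows. Equation (12) is not reproduced in this section, so the cost used here, SAD plus a multiplier times a rate term, is an assumed general form; the rate proxy (distance from a predicted vector `pred`), the multiplier `lam`, and the function names are all hypothetical.

```python
def sad(cur, ref, bx, by, bw, bh, mx, my):
    """Sum of absolute differences between block B of `cur`
    (top-left (bx, by), size bw x bh) and the reference pixels
    c(x - mx, y - my) in `ref`."""
    total = 0
    for y in range(by, by + bh):
        for x in range(bx, bx + bw):
            total += abs(cur[y][x] - ref[y - my][x - mx])
    return total


def integer_search(cur, ref, bx, by, bw, bh, search_range, lam, pred=(0, 0)):
    """Full search over integer candidates; returns (best_mv, best_cost)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for my in range(-search_range, search_range + 1):
        for mx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block leaves the frame.
            if not (0 <= bx - mx and bx - mx + bw <= width):
                continue
            if not (0 <= by - my and by - my + bh <= height):
                continue
            # Hypothetical rate proxy: distance from the predicted vector.
            rate = abs(mx - pred[0]) + abs(my - pred[1])
            cost = sad(cur, ref, bx, by, bw, bh, mx, my) + lam * rate
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mx, my), cost
    return best_mv, best_cost


# Synthetic frames: `cur` is `ref` displaced by the vector (1, -1).
ref = [[7 * y * y + 3 * x * x + x * y for x in range(8)] for y in range(8)]
cur = [[ref[y + 1][x - 1] if (y + 1 < 8 and x >= 1) else 0
        for x in range(8)] for y in range(8)]
print(integer_search(cur, ref, 2, 2, 4, 4, 2, lam=0))
```

With `lam` set to 0 the search reduces to pure SAD minimization; a positive `lam` biases the result toward vectors that are cheaper to encode.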
[0080]
The interpolation circuit 53 interpolates the integer-pixel-precision reference image data (reference frame data) S31 input from the frame memory 31, using an FIR filter or the like, to generate fractional-pixel-precision reference image data with 1/4-pixel or 1/8-pixel precision, and outputs it to the fractional-pixel-precision MV search circuit 54.
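As an illustration of FIR-based interpolation, the sketch below computes half-sample values along one row. The text does not state the filter taps, so the 6-tap filter (1, −5, 20, 20, −5, 1) / 32 of the H.264 luma half-sample interpolation is used here as an assumption, together with edge replication and 8-bit clipping; 1/4-pixel and 1/8-pixel positions would require further steps not shown.

```python
# Assumed taps: the H.264 luma half-sample filter, sum of taps = 32.
TAPS = (1, -5, 20, 20, -5, 1)


def half_pel_row(row):
    """Return the half-sample values between row[i] and row[i + 1].

    Samples outside the row are replaced by the nearest edge sample.
    """
    n = len(row)
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(TAPS):
            # The filter window covers samples i - 2 .. i + 3.
            idx = min(max(i - 2 + k, 0), n - 1)
            acc += tap * row[idx]
        # Round, divide by 32, and clip to the 8-bit range.
        out.append(min(max((acc + 16) >> 5, 0), 255))
    return out


print(half_pel_row([100, 100, 100, 100, 100, 100]))
print(half_pel_row([0, 0, 0, 255, 255, 255]))
```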
[0081]
The fractional-pixel-precision MV search circuit 54 uses, for example, the integer-pixel-precision reference image data S31 from the frame memory 31 and the fractional-pixel-precision reference image data obtained by the interpolation circuit 53 to perform a motion vector search with fractional-pixel precision within the search range defined by the motion vector MV1 generated by the integer-precision MV search circuit 51.
At this time, the fractional-pixel-precision MV search circuit 54 calculates the value J shown in equation (16) below using the simple SATD input from the simple SATD calculation circuit 42, and searches the plurality of candidate motion vectors for the motion vector MV that minimizes J.
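The refinement around MV1 can be sketched as a small search over fractional offsets. Equation (16) is not reproduced in this section, so the cost is passed in as a callback `cost_fn` that stands in for the value J computed from the simple SATD; the callback, the quarter-pixel units, the search `radius`, and the function name are assumptions for illustration.

```python
def refine_fractional(mv1, cost_fn, step_den=4, radius=3):
    """Refine an integer-precision vector to fractional precision.

    mv1      : (mx, my) in integer pixels.
    cost_fn  : hypothetical callback returning J for a candidate
               expressed in units of 1/step_den pixel.
    Returns (best_mv, best_cost), best_mv in 1/step_den pixel units.
    """
    cx, cy = mv1[0] * step_den, mv1[1] * step_den
    best_mv, best_cost = None, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cand = (cx + dx, cy + dy)
            cost = cost_fn(cand)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = cand, cost
    return best_mv, best_cost


# Toy cost whose minimum lies at (9, -3) in quarter-pixel units.
toy_cost = lambda mv: abs(mv[0] - 9) + abs(mv[1] + 3)
print(refine_fractional((2, -1), toy_cost))
```

Restricting the candidates to a small window around MV1 keeps the number of simple SATD evaluations low, which matters because each candidate requires interpolated reference pixels.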
[0082]
(Equation 16)

[0083]
The reference image determination circuit 55 generates reference image data S36a based on the motion vector MV input from the fractional-pixel-precision MV search circuit 54.
[0084]
[Simple SATD calculation circuit 42]
FIG. 9 is a functional block diagram of the simplified SATD calculation circuit 42 shown in FIGS.
As shown in FIG. 9, the simple SATD calculation circuit 42 has, for example, a difference calculation circuit 71 and an orthogonal transformation circuit 72.
Note that the simple SATD calculation circuit 42 calculates the simple SATD using pixel data S23a at fractional-pixel and integer-pixel positions of the image data S23 (current frame data) shown in FIG. 8 and pixel data S31a at fractional-pixel and integer-pixel positions of the reference image data S31 (reference frame data).
Here, the difference calculation circuit 71 corresponds to the difference generation means of the present invention, and the orthogonal transformation circuit 72 corresponds to the orthogonal transformation means of the present invention.
[0085]
The difference calculation circuit 71 calculates the difference (residual component) d between each piece of pixel data S23a in a motion compensation block and the pixel data S31a in the reference image data S31 at the pixel position indicated by the candidate motion vector from the pixel position corresponding to that pixel data S23a.
The orthogonal transformation circuit 72 then applies one of equations (4) to (11) above, as the orthogonal transformation C, to the difference d calculated by the difference calculation circuit 71, and takes the sum of the absolute values of the results as the simple SATD.
[0086]
The orthogonal transformation circuit 72 may also select one of equations (4) to (11) above as the orthogonal transformation C according to the type of the motion compensation block, apply it to the difference d, and use the sum of the absolute values of the results as the simple SATD.
For example, when the type of the motion compensation block is 16 × 8 or 8 × 4 as shown in FIG. 7, the orthogonal transform circuit 72 uses equation (4), (7), or (9) above as the orthogonal transform C.
When the type of the motion compensation block is 8 × 16 or 4 × 8 as shown in FIG. 7, the orthogonal transform circuit 72 uses equation (5), (7), or (10) above as the orthogonal transform C.
When the motion compensation block is square, the orthogonal transformation circuit 72 may use equation (12) below, or equation (6) or (11) above. Alternatively, in this case the SAD may be used without performing the Hadamard transform.
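The frequency-limited variants, in which a two-dimensional transform leaves only part of the frequency components (as in equations (10) and (11) and the corresponding claims), can be illustrated with the sketch below. Which coefficients those equations actually keep is not shown in this section, so keeping the lowest `keep_h` horizontal and `keep_v` vertical sequency components of a 4 × 4 Hadamard transform is an assumption, as are the names `H4_SEQ`, `transform_keep`, and `partial_satd_4x4`.

```python
# 4-point Hadamard basis in sequency order (0, 1, 2, 3 sign changes).
H4_SEQ = [
    (1, 1, 1, 1),
    (1, 1, -1, -1),
    (1, -1, -1, 1),
    (1, -1, 1, -1),
]


def transform_keep(v, keep):
    """Compute only the `keep` lowest-sequency Hadamard coefficients."""
    return [sum(h * x for h, x in zip(H4_SEQ[k], v)) for k in range(keep)]


def partial_satd_4x4(diff, keep_h=2, keep_v=2):
    """Sum of |coefficients| over a keep_v x keep_h low-frequency subset
    of the 2-D Hadamard transform of a 4x4 residual."""
    # Horizontal pass: each row yields keep_h coefficients.
    rows = [transform_keep(row, keep_h) for row in diff]
    total = 0
    # Vertical pass on each retained column: keep_v coefficients each.
    for col in zip(*rows):
        total += sum(abs(c) for c in transform_keep(col, keep_v))
    return total


residual = [[1, 2, 3, 4] for _ in range(4)]
print(partial_satd_4x4(residual, 2, 2))
print(partial_satd_4x4(residual, 4, 4))
```

Setting both `keep_h` and `keep_v` to 4 gives the ordinary full SATD, so the difference between the two printed values shows how much of the measure the discarded components contributed for this residual.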
[0087]
Next, the overall operation of the encoding device 2 shown in FIG. 4 will be described.
The input image signal is first converted into a digital signal by the A / D conversion circuit 22. Next, the frame image data is rearranged in the screen rearrangement buffer 23 according to the GOP structure of the image compression information to be output.
Then, the mode determination circuit 40 performs a prediction direction determination process, which determines the prediction direction of the motion compensation block used in motion prediction / compensation based on the simple SATD calculated by the simple SATD calculation circuit 41, and determines whether each macroblock is to be encoded in intra mode or inter mode.
[0088]
For image data S23 (frame data) to be intra-coded, the image information of the entire frame data is input to the orthogonal transform circuit 25, which applies an orthogonal transform such as the discrete cosine transform or the Karhunen-Loeve transform.
The transform coefficient output from the orthogonal transform circuit 25 is quantized by a quantization circuit 26.
The quantization circuit 26 performs quantization based on the control from the rate control circuit 32.
[0089]
The quantized transform coefficients output from the quantization circuit 26 are input to the lossless encoding circuit 27, where they undergo lossless encoding such as variable-length encoding or arithmetic encoding; they are then stored in the buffer 28 and output as compressed image data.
At the same time, the quantized transform coefficients output from the quantization circuit 26 are input to the inverse quantization circuit 29 and then subjected to inverse orthogonal transform processing in the inverse orthogonal transform circuit 30 to produce decoded image data (frame data), which is stored in the frame memory 31.
[0090]
On the other hand, for an image to be subjected to inter-coding, first, the image data S23 is input to the motion prediction / compensation circuit 36. Further, the reference image data S31 is read from the frame memory 31 and output to the motion prediction / compensation circuit 36.
Then, in the motion prediction / compensation circuit 36, a motion vector MV and predicted image data S36a are generated using the image data S31 of the reference image.
[0091]
Then, the arithmetic circuit 24 generates image data S24, which is the difference signal between the image data S23 from the mode determination circuit 40 and the predicted image data S36a from the motion prediction / compensation circuit 36, and outputs the image data S24 to the orthogonal transform circuit 25.
At this time, in accordance with the JVT method, the motion prediction / compensation circuit 36 uses multiple reference frames and selectively uses a plurality of types of motion compensation blocks. Further, it performs motion compensation with fractional-pixel precision of 1/4 pixel or 1/8 pixel using an FIR filter.
In the motion compensation, as described above, the motion prediction / compensation circuit 36 calculates the value J shown in equation (16) above using the simple SATD input from the simple SATD calculation circuit 42, and searches the plurality of candidate motion vectors for the motion vector MV that minimizes J.
[0092]
Then, in the lossless encoding circuit 27, the motion vector MV is subjected to a lossless encoding process such as variable length encoding or arithmetic encoding, and is inserted into a header portion of the image data. Other processes are the same as those of the image data subjected to intra coding.
[0093]
As described above, according to the encoding device 2, the simple SATD calculation circuit 41 calculates the simple SATD by performing the orthogonal transform C shown in the above equations (4) to (11) that simplify the Hadamard transform. Then, the prediction direction of the motion compensation block is determined based on the simple SATD.
Therefore, the amount of computation in the motion prediction / compensation circuit 36 can be reduced compared with the case where the Hadamard transform is used, and a more appropriate motion vector MV can be obtained than in the case where no orthogonal transform is performed, so that the information amount of the image data S24 can be reduced (encoding efficiency can be increased).
[0094]
The present invention is not limited to the embodiments described above.
For example, the above embodiment describes the case where the Hadamard transform is used as the orthogonal transform in the simple SATD calculation circuit 41 and the simple SATD calculation circuit 42, but another orthogonal transform such as the DCT (Discrete Cosine Transform) may be used instead.
In the present embodiment, following the JVT system, the orthogonal transformation C in the simple SATD calculation circuit 41 and the simple SATD calculation circuit 42 is performed on blocks of 4 × 4 pixels, but the orthogonal transformation C may also be performed on blocks of other sizes.
[0095]
[Effects of the invention]
As described above, according to the present invention, it is possible to provide an image processing device, an encoding device, and a method thereof that can reduce the amount of computation involved in motion compensation.
Further, according to the present invention, it is possible to provide an image processing device, an encoding device, and a method thereof, which can reduce the amount of computation involved in determining a prediction direction in predictive encoding.
[Brief description of the drawings]
FIG. 1 is a functional block diagram of an encoding device according to a related technique of the present invention.
FIG. 2 is a functional block diagram of a decoding device according to a related technique of the present invention.
FIG. 3 is a diagram for explaining an encoding device according to the first embodiment of the present invention.
FIG. 4 is a functional block diagram of the encoding device shown in FIG. 3;
FIG. 5 is a functional block diagram of a simple SATD calculation circuit corresponding to the mode determination circuit shown in FIG. 3;
FIG. 6 is a diagram for explaining a multiple reference frame by the motion prediction / compensation circuit shown in FIG. 3;
FIG. 7 is a diagram for explaining a plurality of types of motion compensation blocks selected by the motion prediction / compensation circuit shown in FIG. 3;
FIG. 8 is a functional block diagram of the motion prediction / compensation circuit shown in FIG. 3;
FIG. 9 is a functional block diagram of a simple SATD calculation circuit corresponding to the motion prediction / compensation circuit shown in FIG. 3;
[Explanation of symbols]
1: communication system, 2: encoding device, 22: A / D conversion circuit, 23: screen rearrangement buffer, 24: arithmetic circuit, 25: orthogonal transform circuit, 26: quantization circuit, 27: lossless encoding circuit, 28: buffer, 29: inverse quantization circuit, 30: inverse orthogonal transform circuit, 31: frame memory, 32: rate control circuit, 36: motion prediction / compensation circuit, 41: simple SATD calculation circuit, 42: simple SATD calculation circuit, 61, 71: difference calculation circuit, 62, 72: orthogonal transformation circuit, 51: integer precision MV search circuit, 52: SAD calculation circuit, 53: interpolation circuit, 54: fractional-pixel-precision MV search circuit, 55: reference image determination circuit

Claims

1. An image processing apparatus comprising:
difference generating means for generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a candidate motion vector from the position corresponding to the respective piece of first pixel data;
orthogonal transformation means for performing a one-dimensional orthogonal transformation on the differences generated by the difference generating means; and
motion vector specifying means for specifying, from among a plurality of the candidate motion vectors, the motion vector to be used for motion compensation, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means.

2. The image processing apparatus according to claim 1, wherein the difference generating means generates differences between the plurality of pieces of first pixel data in a predetermined motion compensation block and the pieces of second pixel data corresponding to the first pixel data, and
the orthogonal transformation means performs on the differences the one-dimensional orthogonal transformation defined in whichever of the vertical direction and the horizontal direction corresponds to the type of the motion compensation block.

3. The image processing apparatus according to claim 2, wherein the orthogonal transformation means performs on the differences the one-dimensional orthogonal transformation defined in the horizontal direction when the motion compensation block has its long side in the horizontal direction, and performs on the differences the one-dimensional orthogonal transformation defined in the vertical direction when the motion compensation block has its long side in the vertical direction.

4. The image processing apparatus according to claim 3, wherein the orthogonal transformation means performs on the differences a two-dimensional orthogonal transformation defined in the horizontal direction and the vertical direction when the motion compensation block is square.

5. The image processing apparatus according to claim 3, wherein the orthogonal transformation means does not perform the orthogonal transformation when the motion compensation block is square.

6. An encoding device comprising:
difference generating means for generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a candidate motion vector from the position corresponding to the respective piece of first pixel data;
orthogonal transformation means for performing a one-dimensional orthogonal transformation on the differences generated by the difference generating means;
motion vector specifying means for specifying, from among a plurality of the candidate motion vectors, the motion vector to be used for motion compensation, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means; and
encoding means for encoding the difference between the second image data and the first image data, and the motion vector specified by the motion vector specifying means.

7. An image processing apparatus comprising:
difference generating means for generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a candidate motion vector from the position corresponding to the respective piece of first pixel data;
orthogonal transformation means for performing on the differences generated by the difference generating means a two-dimensional orthogonal transformation that leaves only part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction; and
motion vector specifying means for specifying, from among a plurality of the candidate motion vectors, the motion vector to be used for motion compensation, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means.

8. The image processing apparatus according to claim 7, wherein the orthogonal transformation means performs the two-dimensional orthogonal transformation so as to leave, in at least one of the horizontal direction and the vertical direction, only those frequency components that lie within a predetermined range from the frequency component having the highest difference.

9. An encoding device comprising:
difference generating means for generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a candidate motion vector from the position corresponding to the respective piece of first pixel data;
orthogonal transformation means for performing on the differences generated by the difference generating means a two-dimensional orthogonal transformation that leaves only part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction;
motion vector specifying means for specifying, from among a plurality of the candidate motion vectors, the motion vector to be used for motion compensation, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means; and
encoding means for encoding the difference between the second image data and the first image data, and the motion vector specified by the motion vector specifying means.

10. An image processing apparatus comprising:
difference generating means for generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a motion vector from the position corresponding to the respective piece of first pixel data;
orthogonal transformation means for performing a one-dimensional orthogonal transformation on the differences generated by the difference generating means; and
prediction direction specifying means for specifying a prediction direction that uses the motion vector, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means.

11. An encoding device comprising:
difference generating means for generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a candidate motion vector from the position corresponding to the respective piece of first pixel data;
orthogonal transformation means for performing a one-dimensional orthogonal transformation on the differences generated by the difference generating means;
prediction direction specifying means for specifying a prediction direction that uses the motion vector, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means; and
encoding means for encoding the difference between the second image data and the first image data, and the motion vector.

12. An image processing apparatus comprising:
difference generating means for generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a candidate motion vector from the position corresponding to the respective piece of first pixel data;
orthogonal transformation means for performing on the differences generated by the difference generating means a two-dimensional orthogonal transformation that leaves only part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction; and
prediction direction specifying means for specifying a prediction direction that uses the motion vector, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means.

13. An encoding device comprising:
difference generating means for generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a candidate motion vector from the position corresponding to the respective piece of first pixel data;
orthogonal transformation means for performing on the differences generated by the difference generating means a two-dimensional orthogonal transformation that leaves only part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction;
prediction direction specifying means for specifying a prediction direction that uses the motion vector, based on the accumulated value of the differences subjected to the orthogonal transformation by the orthogonal transformation means; and
encoding means for encoding the difference between the second image data and the first image data, and the motion vector.

14. An image processing method comprising:
a first step of generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a candidate motion vector from the position corresponding to the respective piece of first pixel data;
a second step of performing a one-dimensional orthogonal transformation on the differences generated in the first step; and
a third step of specifying, from among a plurality of the candidate motion vectors, the motion vector to be used for motion compensation, based on the accumulated value of the differences subjected to the orthogonal transformation in the second step.

15. An encoding method comprising:
a first step of generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a candidate motion vector from the position corresponding to the respective piece of first pixel data;
a second step of performing a one-dimensional orthogonal transformation on the differences generated in the first step;
a third step of specifying, from among a plurality of the candidate motion vectors, the motion vector to be used for motion compensation, based on the accumulated value of the differences subjected to the orthogonal transformation in the second step; and
a fourth step of encoding the difference between the second image data and the first image data, and the motion vector specified in the third step.

16. An image processing method comprising:
a first step of generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a candidate motion vector from the position corresponding to the respective piece of first pixel data;
a second step of performing on the differences generated in the first step a two-dimensional orthogonal transformation that leaves only part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction; and
a third step of specifying, from among a plurality of the candidate motion vectors, the motion vector to be used for motion compensation, based on the accumulated value of the differences subjected to the orthogonal transformation in the second step.

17. An encoding method comprising:
a first step of generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a candidate motion vector from the position corresponding to the respective piece of first pixel data;
a second step of performing on the differences generated in the first step a two-dimensional orthogonal transformation that leaves only part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction;
a third step of specifying, from among a plurality of the candidate motion vectors, the motion vector to be used for motion compensation, based on the accumulated value of the differences subjected to the orthogonal transformation in the second step; and
a fourth step of encoding the difference between the second image data and the first image data, and the motion vector specified in the third step.

18. An image processing method comprising:
a first step of generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a motion vector from the position corresponding to the respective piece of first pixel data;
a second step of performing a one-dimensional orthogonal transformation on the differences generated in the first step; and
a third step of specifying a prediction direction that uses the motion vector, based on the accumulated value of the differences subjected to the orthogonal transformation in the second step.

19. An image processing method comprising:
a first step of generating differences between a plurality of pieces of first pixel data in first image data and a plurality of pieces of second pixel data in second image data correlated with the first image data, each piece of second pixel data being located at the position indicated by a candidate motion vector from the position corresponding to the respective piece of first pixel data;
a second step of performing on the differences generated in the first step a two-dimensional orthogonal transformation that leaves only part of the frequency components of the differences in at least one of the horizontal direction and the vertical direction; and
a third step of specifying a prediction direction that uses the motion vector, based on the accumulated value of the differences subjected to the orthogonal transformation in the second step.