JP4120916B2 - Information processing apparatus and method, recording medium, and program

Info

Publication number: JP4120916B2
Application number: JP2002008251A
Authority: JP
Inventors: 哲二郎近藤; 泰弘藤森; 直己武田; 崇中西
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2002-01-17
Filing date: 2002-01-17
Publication date: 2008-07-16
Anticipated expiration: 2022-01-17
Also published as: JP2003209843A

Description

【０００１】
【発明の属する技術分野】
本発明は、情報処理装置および方法、記録媒体、並びにプログラムに関し、特に、効率よく符号化を行うことができるようにした情報処理装置および方法、記録媒体、並びにプログラムに関する。
【０００２】
【従来の技術】
最近、MPEG(Moving Picture Experts Group)に代表される圧縮方式が普及し、ビデオデータやオーディオデータは、MPEG方式で圧縮された上で、ハードディスク、磁気テープ、光ディスクといった記録媒体に記録されたり、ネットワークを介して伝送されたり、衛星を介して放送される。
【０００３】
MPEG方式では、例えば、図１に示されるように、現在フレームの前後に、前フレームと後フレームが存在するような場合、現在フレームが、例えば、８×８画素のブロックに分割される。そして、前フレームの所定の探索範囲内において、８×８画素のブロックが抽出され、現在フレームのブロック１と前フレームのブロック２が、画素毎に絶対差分和、自乗誤差和、または代表点マッチング処理により比較される。前フレームのブロック２は、探索範囲内でその位置が順次移動される。
【０００４】
そして、探索範囲の中で、絶対差分和、自乗誤差和、または代表点マッチングの値が最も小さかった相対座標位置が動きベクトルとして推定される。
【０００５】
さらに、クラス分類適応処理が用いられる場合、動きベクトルに対応するブロック２の画素から、クラス分類適応処理に基づいて、ブロック位置の画素が予測され、動きベクトルと予測値とが伝送される。
【０００６】
【発明が解決しようとする課題】
しかしながら、このような、従来のブロックマッチングとクラス分類適応処理を用いる予測符号化方式は、伝送効率、圧縮効率、および符号化効率が悪いという課題があった。
【０００７】
本発明は、このような状況に鑑みてなされたものであり、伝送効率、圧縮効率、および符号化効率を向上させるようにするものである。
【０００８】
【課題を解決するための手段】
本発明の情報処理装置は、入力される画像データのうちの画像データの第１のフレームより時間的に前の第２のフレームを記憶し、フレーム毎に、あるいは数フレーム毎に、記憶している第２のフレームを更新する更新手段と、第１のフレームの一部であって１つの画素からなる第１の画素データに対応する第２のフレームにおける位置の周辺である探索範囲内に存在する複数の第２の画素データをクラス分類適応処理するのに用いる、クラス分類に用いる複数の画素データであるクラスタップおよび予測値の生成に用いる複数の画素データである予測タップを抽出する抽出手段と、抽出手段により抽出されたクラスタップおよび予測タップを用いて複数の第２の画素データをクラス分類適応処理することにより、予測値を生成する予測値生成手段と、予測値生成手段により生成された予測値と、第１の画素データとの予測残差を演算する予測残差演算手段と、予測残差演算手段により演算された予測残差のうち、最小の予測残差を第１の画素データに対する評価値として選択し、当該評価値と、所定の閾値とを比較する比較手段と、比較手段の比較結果に基づいて、予測残差が閾値より小さいと判定された場合、第１の画素データと、予測残差が最小である第２の画素データの位置により規定される動きベクトルを出力し、予測残差が閾値より大きいと判定された場合、動きベクトルの他、予測値生成手段により生成された予測値、予測残差演算手段により演算された予測残差、または第１の画素データのいずれかと、送出したデータが、予測残差であるか否かを表すフラグを出力する出力手段とを備えることを特徴とする。
【００１４】
本発明の情報処理方法は、入力される画像データのうちの画像データの第１のフレームより時間的に前の第２のフレームを記憶し、フレーム毎に、あるいは数フレーム毎に、記憶している第２のフレームを更新する更新ステップと、第１のフレームの一部であって１つの画素からなる第１の画素データに対応する第２のフレームにおける位置の周辺である探索範囲内に存在する複数の第２の画素データをクラス分類適応処理するのに用いる、クラス分類に用いる複数の画素データであるクラスタップおよび予測値の生成に用いる複数の画素データである予測タップを抽出する抽出ステップと、抽出ステップの処理により抽出されたクラスタップおよび予測タップを用いて複数の第２の画素データをクラス分類適応処理することにより、予測値を生成する予測値生成ステップと、予測値生成ステップの処理により生成された予測値と、第１の画素データとの予測残差を演算する予測残差演算ステップと、予測残差演算ステップの処理により演算された予測残差のうち、最小の予測残差を第１の画素データに対する評価値として選択し、当該評価値と、所定の閾値とを比較する比較ステップと、比較ステップの処理における比較結果に基づいて、予測残差が閾値より小さいと判定された場合、第１の画素データと、予測残差が最小である第２の画素データの位置により規定される動きベクトルを出力し、予測残差が閾値より大きいと判定された場合、動きベクトルの他、予測値生成ステップの処理により生成された予測値、予測残差演算ステップの処理により演算された予測残差、または第１の画素データのいずれかと、送出したデータが、予測残差であるか否かを表すフラグを出力する出力ステップとを含むことを特徴とする。
【００１５】
本発明の第１の記録媒体のプログラムは、入力される画像データのうちの画像データの第１のフレームより時間的に前の第２のフレームを記憶し、フレーム毎に、あるいは数フレーム毎に、記憶している第２のフレームを更新する更新ステップと、第１のフレームの一部であって１つの画素からなる第１の画素データに対応する第２のフレームにおける位置の周辺である探索範囲内に存在する複数の第２の画素データをクラス分類適応処理するのに用いる、クラス分類に用いる複数の画素データであるクラスタップおよび予測値の生成に用いる複数の画素データである予測タップを抽出する抽出ステップと、抽出ステップの処理により抽出されたクラスタップおよび予測タップを用いて複数の第２の画素データをクラス分類適応処理することにより、予測値を生成する予測値生成ステップと、予測値生成ステップの処理により生成された予測値と、第１の画素データとの予測残差を演算する予測残差演算ステップと、予測残差演算ステップの処理により演算された予測残差のうち、最小の予測残差を第１の画素データに対する評価値として選択し、当該評価値と、所定の閾値とを比較する比較ステップと、比較ステップの処理における比較結果に基づいて、予測残差が閾値より小さいと判定された場合、第１の画素データと、予測残差が最小である第２の画素データの位置により規定される動きベクトルを出力し、予測残差が閾値より大きいと判定された場合、動きベクトルの他、予測値生成ステップの処理により生成された予測値、予測残差演算ステップの処理により演算された予測残差、または第１の画素データのいずれかと、送出したデータが、予測残差であるか否かを表すフラグを出力する出力ステップとを含むことを特徴とする。
【００１６】
本発明の第１のプログラムは、入力される画像データのうちの画像データの第１のフレームより時間的に前の第２のフレームを記憶し、フレーム毎に、あるいは数フレーム毎に、記憶している第２のフレームを更新する更新ステップと、第１のフレームの一部であって１つの画素からなる第１の画素データに対応する第２のフレームにおける位置の周辺である探索範囲内に存在する複数の第２の画素データをクラス分類適応処理するのに用いる、クラス分類に用いる複数の画素データであるクラスタップおよび予測値の生成に用いる複数の画素データである予測タップを抽出する抽出ステップと、抽出ステップの処理により抽出されたクラスタップおよび予測タップを用いて複数の第２の画素データをクラス分類適応処理することにより、予測値を生成する予測値生成ステップと、予測値生成ステップの処理により生成された予測値と、第１の画素データとの予測残差を演算する予測残差演算ステップと、予測残差演算ステップの処理により演算された予測残差のうち、最小の予測残差を第１の画素データに対する評価値として選択し、当該評価値と、所定の閾値とを比較する比較ステップと、比較ステップの処理における比較結果に基づいて、予測残差が閾値より小さいと判定された場合、第１の画素データと、予測残差が最小である第２の画素データの位置により規定される動きベクトルを出力し、予測残差が閾値より大きいと判定された場合、動きベクトルの他、予測値生成ステップの処理により生成された予測値、予測残差演算ステップの処理により演算された予測残差、または第１の画素データのいずれかと、送出したデータが、予測残差であるか否かを表すフラグを出力する出力ステップとをコンピュータに実行させることを特徴とする。
【００２６】
本発明の情報処理装置および方法、記録媒体、並びにプログラムにおいては、入力される画像データのうちの画像データの第１のフレームより時間的に前の第２のフレームが記憶され、フレーム毎に、あるいは数フレーム毎に、記憶されている第２のフレームが更新され、第１のフレームの一部であって１つの画素からなる第１の画素データに対応する第２のフレームにおける位置の周辺である探索範囲内に存在する複数の第２の画素データをクラス分類適応処理するのに用いる、クラス分類に用いる複数の画素データであるクラスタップおよび予測値の生成に用いる複数の画素データである予測タップが抽出され、抽出された前記クラスタップおよび前記予測タップを用いて複数の第２の画素データをクラス分類適応処理することにより、予測値が生成され、生成された予測値と、第１の画素データとの予測残差が演算され、演算された予測残差のうち、最小の予測残差を前記第１の画素データに対する評価値として選択し、当該評価値と、所定の閾値とが比較される。そして、比較結果に基づいて、予測残差が閾値より小さいと判定された場合、第１の画素データと、予測残差が最小である第２の画素データの位置により規定される動きベクトルが出力され、予測残差が閾値より大きいと判定された場合、動きベクトルの他、生成された予測値、演算された予測残差、または第１の画素データのいずれかと、送出したデータが、予測残差であるか否かを表すフラグが出力される。
【００２８】
【発明の実施の形態】
図２は、本発明を適用した画像処理システムのうちの送信装置の構成例を表している。この送信装置１１においては、基準フレームメモリ２２に、予測の基準とされる基準フレームの画像データが記憶される。予測フレームメモリ２１には、基準フレームの画像に基づいて予測される予測フレームの画像データが、順次、記憶される。
【００２９】
アドレス設定部２３−１は、フレームメモリ２２に記憶されている画素データのうち、所定の探索範囲（フレームメモリ２１に記憶されている予測フレームの注目画像に対応する探索範囲）内の第１の位置（動きベクトルｖ０に対応する位置）の予測タップとクラスタップを構成する任意の数の画素のアドレスを設定し、そのアドレスに対応する画素データ、すなわち、予測タップとクラスタップを構成する画素データを読み出し、クラス分類適応処理部２４−１に供給する。クラス分類適応処理部２４−１は、アドレス設定部２３−１より供給された予測タップとクラスタップの画素データに基づいて、クラス分類適応処理を行い、予測フレームメモリ２１に記憶されている予測フレームの対応する画素（注目画素）の予測値を演算する。
【００３０】
アドレス設定部２３−２は、探索範囲内の第２の位置（第２の動きベクトルｖ１に対応する位置）の予測タップとクラスタップの画素データを抽出し、クラス分類適応処理部２４−２に供給する。クラス分類適応処理部２４−２は、アドレス設定部２３−２より供給された予測タップとクラスタップの画素データに基づいて、クラス分類適応処理を行い、注目画素に対する予測値を演算する。
【００３１】
同様の構成が探索範囲内を探索して得られる動きベクトルの数（ｎ個）だけ設けられている。すなわち、アドレス設定部とクラス分類適応処理部の組み合わせがｎ組用意されている。例えば、探索範囲が水平、垂直ともに、マイナス８からプラス８までであるとすると、ｎの数は、２８９（＝１７×１７）となる。
【００３２】
比較器２５は、クラス分類適応処理部２４−１乃至２４−ｎから供給されるｎ個の予測値を、予測フレームメモリ２１から供給される注目画素と比較し、その差を予測残差として検出するとともに、予測残差のうちの、最小の予測残差を評価値として選択する。閾値判定部２６は、比較器２５より供給される最小の予測残差（評価値）を閾値と比較する。評価値が閾値と等しいか、それより小さい場合には、閾値判定部２６は、動きベクトルを後処理部２７に供給する。閾値判定部２６は、予測残差が閾値より大きい場合には、動きベクトルと予測残差を後処理部２７に出力する。
【００３３】
後処理部２７は、閾値判定部２６より動きベクトルが供給されてきたとき、それをそのまま出力する。後処理部２７は、閾値判定部２６より動きベクトルと予測残差が供給されてきた場合には、予測残差に、それが予測残差であることを表すフラグを付加する。
【００３４】
後処理部２７より出力された動きベクトル、または動きベクトルと、フラグが付加された予測残差は、量子化器２８に供給される。量子化器２８は、このうちの予測残差については、例えば、ロイドマックス、ランレングス等の符号により量子化し、所定の伝送路に伝送する。この伝送路には、各種の通信路の他、記録再生によりデータを伝送する記録媒体も含まれる。
【００３５】
例えば、マイクロプロセッサなどによりなる制御部３０は、送信装置１１の各部を制御する。制御部３０には、必要に応じて、インタフェース３１を介して、磁気ディスク４１、光ディスク４２、光磁気ディスク４３、または半導体メモリ４４などが適宜装着される。
【００３６】
図３は、クラス分類適応処理部２４−１乃至２４−ｎ（以下、これらを個々に区別する必要がない場合、単に、クラス分類適応処理部２４と称する）の構成を表している。クラスタップ抽出部６１は、アドレス設定部２３−１乃至２３−ｎ（以下、これらを個々に区別する必要がない場合、単に、アドレス設定部２３と称する）のうち対応するものより供給された画素データから、クラスタップを抽出し、ADRC(Adaptive Dynamic Range Coding)部６２に供給する。ADRC部６２は、クラスタップ抽出部６１より供給された画素データに対して、例えば、１ビットADRC処理を施す。
【００３７】
すなわち、クラスタップを構成する複数の画素データの最大値と最小値が検出され、その最大値と最小値の差が、さらにダイナミックレンジとして算出される。各画素データは、最小値が減算され、さらに、ダイナミックレンジで割算することで正規化される。
【００３８】
正規化されたデータは、所定の基準値（例えば、０．５）と比較され、基準値より大きい場合には、例えば１、基準値より小さい場合には、例えば０、とされる。
【００３９】
すなわち、例えばクラスタップを構成する画素の数が９個である場合、ADRC部６２から、９ビットのデータがクラスコード決定部６３に供給されることになる。クラスコード決定部６３は、この９ビットのデータに基づいて、そのクラスタップのクラスコードを決定し、予測係数メモリ６４に出力する。
【００４０】
予測係数メモリ６４には、クラスコードに対応する予測係数が予め記憶されており、クラスコード決定部６３より供給されたクラスコードに対応する予測係数を、予測値算出部６６に出力する。
【００４１】
予測タップ抽出部６５は、アドレス設定部２３より供給された画素データから予測タップを構成する画素データを抽出し、予測値算出部６６に出力する。予測値算出部６６は、予測タップ抽出部６５より供給された予測タップを構成する画素データに、予測係数メモリ６４より供給される予測係数を乗算し、その和から（すなわち、線形１次結合から）、予測値を算出する。
【００４２】
次に、図４と図５のフローチャートを参照して、送信装置１１の処理について説明する。
【００４３】
最初に、ステップＳ１１において、制御部３０は、閾値判定部２６に対して、伝送判定閾値をセットする。この伝送判定閾値は、後述するステップＳ２６の処理で利用される。
【００４４】
ステップＳ１２において、前フレームデータを蓄積する処理が実行され、ステップＳ１３において、現在フレームデータを蓄積する処理が実行される。すなわち、例えば、１フレーム分の画像データが、予測フレームメモリ２１に記憶された後、再びそこから読み出され、基準フレームメモリ２２に伝送され、記憶される。そして、予測フレームメモリ２１には、それに続く新たな１フレーム分の画像データが記憶される。このようにして、基準フレームメモリ２２には、前フレームの画素データが記憶され、予測フレームメモリ２１には、現在フレーム（基準フレームより時間的に後のフレーム）の画像データが蓄積される。
【００４５】
ステップＳ１４において、予測フレームメモリ２１は、制御部３０の制御に基づいて、記憶している１フレーム分の画像データの中から注目画素のデータを抽出し、比較器２５に供給する。
【００４６】
ステップＳ１５において、制御部３０は、比較器２５に比較値をセットする。この比較値は、最小の予測残差としての評価値を求めるために、ステップＳ２４の処理で、より小さい予測残差の値に更新されるものであり、初期値としては、最大値がセットされる。この比較値は、ステップＳ２３の処理で利用される。
【００４７】
次に、ステップＳ１６において、アドレス設定部２３−１乃至２３−ｎに対して、それぞれステップＳ１４の処理で抽出された注目画素に対応するアドレスが設定される。これにより、アドレス設定部２３−１乃至２３−ｎから、注目画素に対応する探索範囲内における、各探索位置（動きベクトル）に対応するクラスタップと予測タップを含む画素データが、対応するクラス分類適応処理部２４−１乃至２４−ｎに取り込み可能となる。
【００４８】
そこで、ステップＳ１７において、クラスタップ抽出部６１は、供給された画素データの中からクラスタップを抽出する。クラスタップは、例えば、図６に示されるように、予測フレームの注目画素を中心とする７×７個の画素のうちの図中黒く示される３×３個の画素とされる。
【００４９】
ADRC部６２は、ステップＳ１８において、クラスタップを構成する９個の画素データに対して、１ビットADRC処理を施す。これにより、９個の画素データがそれぞれ０または１の値に変換されて、９ビットのデータがクラスコード決定部６３に供給される。クラスコード決定部６３は、ADRC部６２より供給される９ビットのデータに基づいて、その９個の画素データで構成されるクラスタップに対応するクラスコードを決定し、予測係数メモリ６４に出力する。
【００５０】
予測係数メモリ６４は、ステップＳ１９において、クラスコード決定部６３より供給されたクラスコードに対する予測係数を読み出し、予測値算出部６６に出力する。
【００５１】
ステップＳ２０において、予測タップ抽出部６５は、アドレス設定部２３より供給される画素データから、クラスタップを構成する画素データを取得する。図７は、クラスタップの例を表している。この例においては、予測フレームの注目画素を中心とする７×７個の画素のうち、中央に黒く示される１３個の画素がクラスタップとされている。
【００５２】
予測値算出部６６は、ステップＳ２１において、予測値計算処理を実行する。すなわち、予測値算出部６６は、予測タップ抽出部６５より供給された予測タップを構成する１３個の画素データと、予測係数メモリ６４より供給される１３個の予測係数の線形１次結合を演算して、予測値を算出し、比較器２５に出力する。
【００５３】
比較器２５は、ステップＳ２２において、予測残差計算処理を実行する。すなわち、比較器２５は、クラス分類適応処理部２４（いまの場合、クラス分類適応処理部２４−１）より供給される予測値と、予測フレームメモリ２１より供給される注目画素（真値）との差を演算することで、予測残差を計算する。
【００５４】
ステップＳ２３において、比較器２５は、ステップＳ２２の処理で求めた予測残差（評価値）を比較値と比較する。この比較値は、いまの場合、ステップＳ１５の処理で最大値に設定されている。従って、評価値は、比較値より小さいと判定され、ステップＳ２４に進み、比較器２５は、ステップＳ１５の処理で最大値にセットされた比較値の値を、ステップＳ２２の処理で計算された予測残差（評価値）に更新する。すなわち、比較値として、より小さい値が設定される。
【００５５】
ステップＳ２５において、閾値判定部２６は、ステップＳ２２の処理で求められた評価値に対応する動きベクトルを内蔵するメモリに保持する（既に保持されている動きベクトルがある場合には、更新される）。いまの場合、クラス分類適応処理部２４−１の出力が処理されているので、この動きベクトルは、ｖ０となる。
【００５６】
ステップＳ２６において、閾値判定部２６は、ステップＳ２２の処理で得られた予測残差（評価値）と、ステップＳ１１の処理でセットされた伝送判定閾値とを比較する。予測残差（評価値）が伝送判定閾値と等しいか、それより大きいと判定された場合、ステップＳ２７において、閾値判定部２６は、ステップＳ２２の処理で計算された予測残差を評価値として保持する。
【００５７】
予測残差（評価値）が伝送判定閾値より小さいと判定された場合には、ステップＳ２７の処理はスキップされる。
【００５８】
ステップＳ２３において、予測残差（評価値）が比較値と等しいか、それより大きいと判定された場合には、ステップＳ２４乃至ステップＳ２７の処理はスキップされる。
【００５９】
ステップＳ２８において、閾値判定部２６は、探索範囲内の全ての位置の処理を終了したか否かを判定する。すなわち、クラス分類適応処理部２４−１乃至２４−ｎの全ての出力に対する処理を完了したか否かがここで判定される。クラス分類適応処理部２４−１乃至２４−ｎの出力のうち、まだ処理していないものが残っている場合には、ステップＳ１６に戻り、それ以降の処理が繰り返し実行される。
【００６０】
以上のようにして、ステップＳ１６乃至ステップＳ２８の処理がクラス分類適応処理部２４−１乃至２４−ｎが出力するｎ個の予測値の全てに対して実行される。
【００６１】
ステップＳ２８で探索範囲内の全ての位置の処理が終了したと判定された場合、ステップＳ２９に進み、閾値判定部２６は、伝送判定閾値より小さい予測残差（評価値）が保存されているか否かを判定する。すなわち、探索範囲内の全ての位置の処理により、ｎ個の予測残差（評価値）が得られることになるが、そのｎ個の予測残差のうち、少なくとも１つ伝送判定閾値より小さいものがある場合には、その中で最小のものに対応する動きベクトルがステップＳ２５の処理で保持されている。そこで、その場合には、ステップＳ３０において、閾値判定部２６は、ステップＳ２５の処理で保持した動きベクトルを読み出し、後処理部２７に出力する。
【００６２】
ステップＳ２９において、伝送判定閾値より小さい評価値が存在しないと判定された場合、閾値判定部２６、ステップＳ３１において、ステップＳ２７の処理により保持されている予測残差（評価値）を後処理部２７に出力する処理を実行する。
【００６３】
すなわち、ｎ個の予測残差（評価値）が全て伝送判定閾値と等しいか、それより大きい場合には、そのうちの最小の値に対応する予測残差がステップＳ２７の処理で保持されている。そこで、閾値判定部２６は、その予測残差を、いま対象とされている検索範囲の評価値（最小の予測残差）として、後処理部２７に出力する。
【００６４】
ステップＳ３０，Ｓ３１の処理の後、ステップＳ３２において、閾値判定部２６は、１フレーム内の全ての探索範囲の処理が終了したか否かを判定し、まだ終了していない探索範囲が残っている場合には、ステップＳ１４に戻り、それ以降の処理が繰り返し実行される。
【００６５】
ステップＳ３２において、全ての探索範囲の処理が終了したと判定された場合、ステップＳ３３において、後処理部２７は、閾値判定部２６より動きベクトルだけが供給されてきた場合には、その動きベクトルをそのまま出力し、閾値判定部２６より供給されてきたのが、動きベクトルと予測残差（評価値）である場合には、予測残差に、それが予測残差であることを表すフラグを付加する。
【００６６】
ステップＳ３４において、量子化器２８は、後処理部２７より供給された動きベクトルをそのまま出力する。動きベクトルと予測残差（評価値）が供給されてきた場合には、量子化器２８は、予測残差を量子化し、伝送路に伝送する。
【００６７】
以上のようにして、図８に示されるように、予測フレームメモリ２１に記憶されている予測フレーム９０において、所定の画素が注目画素９１として選択される。そして、基準フレームメモリ２２に記憶されている基準フレーム８０の注目画素９１に対応する画素が注目対応画素８３として選択され、注目対応画素８３を中心とする所定の範囲が、探索範囲８１として設定される。
【００６８】
探索範囲８１内において、所定の範囲の画素が予測タップ８２として選択され、予測タップ８２を構成する画素に基づいて、上述したように、予測値が演算される。そして、その予測値と注目画素９１との差が予測残差として算出される。
【００６９】
予測タップ８２は、探索範囲８１内を、予測タップ８２−１乃至８２−ｎとして示されるように、ｎ個の位置に、順次移動される。そして、ｎ個の予測タップのそれぞれに対応して得られるｎ個の予測残差の中から最小のものがその探索範囲８１の評価値として選択される。
【００７０】
以上のような処理が、予測フレーム９０の中の全ての画素を注目画素９１として順次選択することで実行される。
【００７１】
以上のようにして、伝送路に伝送された画像データは、図９に示されるような受信装置により受信され、復号される。
【００７２】
この受信装置１１１においては、逆量子化器１２１が送信装置１１により符号化され、伝送路を伝送されてきたデータを取得し、逆量子化する。伝送判定部１２２は、逆量子化器１２１より供給されたデータからフラグを読み取り、伝送されてきたデータのうち動きベクトルをアドレス設定部１２３に出力する。アドレス設定部１２３は、基準フレームメモリ１２４に記憶されている、既に復調して得られている基準フレームの画素データから、動きベクトルに対応する範囲の画素データを抽出し、クラス分類適応処理部１２５に供給する。
【００７３】
クラス分類適応処理部１２５は、アドレス設定部１２３より供給された画素データに対して、クラス分類適応処理を施し、画素データを生成する。
【００７４】
このクラス分類適応処理部１２５は、図２の送信装置１１のクラス分類適応処理部２４と基本的に同様の構成とされる。そこで、図３は、以下の説明で、クラス分類適応処理部１２５としても、引用される。
【００７５】
伝送判定部１２２は、逆量子化器１２１より入力されたデータに予測残差のデータが含まれていると判定した場合、これを復号部１２６に供給する。復号部１２６は、この予測残差を復号する。
【００７６】
合成部１２７は、クラス分類適応処理部１２５と復号部１２６より供給された画素データを、メモリ上で適宜合成し、１フレーム分の画素データとし、出力する。合成部１２７より出力された画素データの一部は、必要に応じて、基準フレームメモリ１２４に供給され、後続するフレームの処理に対する基準フレームとして記憶される。
【００７７】
例えば、マイクロプロセッサなどにより構成される制御部１２８は、受信装置１１１の各部の動作を制御する。
【００７８】
制御部１２８には、インタフェース１３１を介して、必要に応じて磁気ディスク１４１、光ディスク１４２、光磁気ディスク１４３、または半導体メモリ１４４が装着され、そこに記憶されているプログラムやデータなどが適宜インストールされる。
【００７９】
次に、図１０のフローチャートを参照して、受信装置１１１における復号処理について説明する。
【００８０】
ステップＳ５１において、逆量子化器１２１は、伝送路を介して伝送されてきた量子化されている画像データを受信し、逆量子化して伝送判定部１２２に供給する。伝送判定部１２２は、ステップＳ５２において、いま伝送されてきたデータに含まれているのは、動きベクトルのデータであるのか否かを判定する。入力されたデータに含まれているのが動きベクトルのデータであると判定された場合、伝送判定部１２２は、ステップＳ５３において、その動きベクトルのデータをアドレス設定部１２３に出力する。アドレス設定部１２３は、基準フレームメモリ１２４に記憶されている画素データから探索範囲の画素データを抽出し、クラス分類適応処理部１２５に出力する。
【００８１】
クラス分類適応処理部１２５は、ステップＳ５４において、クラス分類適応処理を実行する。すなわち、クラスタップ抽出部６１と、予測タップ抽出部６５は、アドレス設定部１２３より供給される動きベクトルに基づいて、探索範囲の画素データから、それぞれクラスタップと予測タップの画素データを抽出する。ADRC部６２は、クラスタップを１ビットADRC処理し、その結果得られたデータに基づいて、クラスコード決定部６３は、クラスコードを決定する。予測係数メモリ６４は、クラスコードに対応する予測係数を予測値算出部６６に出力する。予測値算出部６６は、予測タップと予測係数の線形１次結合から予測値を演算する。
【００８２】
生成された予測値は、合成部１２７に供給される。ステップＳ５７において、合成部１２７は、クラス分類適応処理部１２５より供給された予測値をフレームの対応する位置の画素データとして合成する。
【００８３】
ステップＳ５２において、伝送されてきたデータのうち動きベクトルではないと判定されたデータ（予測誤差のデータ）は、伝送判定部１２２から復号部１２６に出力される。復号部１２６は、ステップＳ５６において、予測残差の復号処理を実行する。すなわち、復号部１２６は、伝送判定部１２２より供給された予測残差を復号する。復号部１２６により復号された予測残差は、合成部１２７に供給され、ステップＳ５７の処理で、クラス分類適応処理部１２５で復号された対応する位置の画素データと合成される。
【００８４】
予測係数メモリ６４に記憶される予測係数は、学習により取得することができる。図１１は、この学習を行う学習装置の構成例を表している。この学習装置１６０においては、生徒データ生成部１６１が、入力された教師データとしての画像データの画素値を適宜変更処理するなどして、生徒データを生成する。クラスタップ抽出部１６２は、生徒データ生成部１６１より供給された生徒データからクラスタップを抽出し、ADRC部１６３に出力する。ADRC部１６３は、クラスタップ抽出部１６２より供給されたクラスタップの画素データに対して、１ビットADRC処理を施し、得られた結果をクラスコード決定部１６４に出力する。
【００８５】
クラスコード決定部１６４は、ADRC部１６３より入力されたデータに基づいて、クラスコードを決定し、そのクラスコードを正規方程式生成部１６６に出力する。
【００８６】
予測タップ抽出部１６５は、生徒データ生成部１６１より供給された生徒データから予測タップを抽出し、正規方程式生成部１６６に出力する。正規方程式生成部１６６には、また、生徒データ生成部１６１において、生徒データを生成する元になった親データとしての教師データが供給されている。正規方程式生成部１６６は、予測係数（未知数）と生徒データの積和を教師データの値と等しいとする線形１次結合の正規方程式を生成する。
【００８７】
予測係数決定部１６７は、正規方程式生成部１６６により生成された正規方程式を、掃き出し法などの一般的な行列解法を用いて、未知数とされている予測係数を求め、予測係数メモリ６４に記憶させる。
【００８８】
次に、学習装置１６０の動作について説明する。クラスタップ抽出部１６２は、生徒データ生成部１６１より供給された生徒データからクラスタップの画素データ（図６の例では、３×３個の画素データ）を抽出し、ADRC部１６３に出力する。ADRC部１６３は、クラスタップの画素データに対して、１ビットADRC処理を施し、得られた結果を、クラスコード決定部１６４に出力する。クラスコード決定部１６４は、ADRC部１６３より入力されたデータに基づいて、クラスコードを決定し、正規方程式生成部１６６に出力する。
【００８９】
生徒データ生成部１６１により生成された生徒データからは、予測タップ抽出部１６５により予測タップを構成する画素データ（図６の例では、１３個の画素データ）が抽出され、正規方程式生成部１６６に供給される。正規方程式生成部１６６は、予測タップの画素データと予測係数の積和が教師データに等しいとする線形１次結合の正規方程式をクラスコード毎に生成し、予測係数決定部１６７に出力する。予測係数決定部１６７は、掃き出し法に基づいて、未知数としての予測係数を決定し、予測係数メモリ６４に記憶させる。
【００９０】
なお、以上においては、図４のステップＳ２９において、伝送判定閾値より小さい評価値が存在しないと判定された場合には、ステップＳ３１の処理で、動きベクトルと予測残差を送出するようにしたが、予測残差に代えて、ステップＳ２１で計算された予測値を送出したり、注目画素の画素データの値そのものを送出するようにすることも可能である。
【００９１】
また、画素単位ではなく、ブロック単位で処理することも可能である。この場合には、ブロック内での予測値と入力値とで、絶対誤差和を取得し、その誤差和が所定の閾値よりも小さい場合には、動きベクトルを伝送し、閾値より大きい場合には、ブロック内の画素データを伝送するようにすればよい。
【００９２】
このようにした場合、フラグがブロック内に１ビット付加されるだけの構成となるため、余分な情報量が減り、効率よく圧縮することが可能になる。
【００９３】
さらに、図１２に示されるように、閾値判定部２６（そこにおける閾値判定処理）を省略し、常に動きベクトルを送出するようにしてもよい。そのようにすれば、より圧縮効率、伝送効率、符号化効率を向上させることが可能となる。
【００９４】
なお、一番最初のフレームのデータに関しては、量子化を行ってもよいし、行わなくてもよい。また、前フレーム毎に、あるいは数フレーム毎に、基準フレームを更新してもよい。あるいは、また、基準フレームを全く更新しないようにすることも可能である。
【００９５】
基準フレームを更新するようにした場合には、画像の劣化も少なく、復調側で、より品質の高い画像を再現することが可能なるばかりでなく、雑音による影響が後々まで伝播しないため、よりロバストな画像伝送方式を実現することができる。
【００９６】
このように、本発明によれば、予測誤差が、絶対差分和や自乗誤差和に比べて同じか、それより小さくなるため、絶対差分和、または自乗誤差和を用いる方式に比べて、データの圧縮効率をより高めることができ、また、高画質の画像を伝送することが可能となる。
【００９７】
以上においては、送信装置１１と受信装置１１１は、独立する構成としたが、１つの装置内に、これらを一体化することも可能である。特に、伝送路が記録媒体により構成される場合には、その記録媒体に対して、データを記録再生する装置においては、送信装置と受信装置の両方が、１つの装置内に配置される。
【００９８】
また、以上においては、画像データを例として説明したが、画像データ以外のコンテンツデータを伝送する場合にも、本発明を適用することが可能である。
【００９９】
上述した一連の処理は、ハードウエアにより実行させることもできるが、ソフトウエアにより実行させることもできる。一連の処理をソフトウエアにより実行させる場合には、そのソフトウエアを構成するプログラムが、専用のハードウエアに組み込まれているコンピュータ、または、各種のプログラムをインストールすることで、各種の機能を実行することが可能な、例えば汎用のパーソナルコンピュータなどに、ネットワークや記録媒体からインストールされる。
【０１００】
この記録媒体は、図２と図９に示されるように、装置本体とは別に、ユーザにプログラムを提供するために配布される、プログラムが記録されている磁気ディスク４１，１４１（フロッピディスクを含む）、光ディスク４２，１４２（CD-ROM(Compact Disk-Read Only Memory),DVD(Digital Versatile Disk)を含む）、光磁気ディスク４３，１４３（ＭＤ（Mini-Disk）を含む）、もしくは半導体メモリ４４，１４４などよりなるパッケージメディアにより構成されるだけでなく、装置本体に予め組み込まれた状態でユーザに提供される、プログラムが記録されているROMや、記憶部に含まれるハードディスクなどで構成される。
【０１０１】
なお、本明細書において、記録媒体に記録されるプログラムを記述するステップは、記載された順序に沿って時系列的に行われる処理はもちろん、必ずしも時系列的に処理されなくとも、並列的あるいは個別に実行される処理をも含むものである。
【０１０２】
また、本明細書において、システムとは、複数の装置により構成される装置全体を表すものである。
【０１０３】
【発明の効果】
以上の如く、本発明によれば、伝送効率、圧縮効率、および符号化効率を向上させることが可能となる。
【０１０４】
また、本発明によれば、伝送効率、圧縮効率、または符号化効率を向上させた伝送データを、簡単かつ確実に、受信し、復号することが可能となる。
【図面の簡単な説明】
【図１】ブロックマッチングを説明する図である。
【図２】本発明を適用した送信装置の構成例を示すブロック図である。
【図３】図２のクラス分類適応処理部の構成例を示すブロック図である。
【図４】図２の送信装置の処理を説明するフローチャートである。
【図５】図２の送信装置の処理を説明するフローチャートである。
【図６】クラスタップの例を示す図である。
【図７】予測タップの例を示す図である。
【図８】図２のクラス分類適応処理部における予測処理を説明する図である。
【図９】本発明を適用した受信装置の構成例を示すブロック図である。
【図１０】図９の受信装置の動作を説明するフローチャートである。
【図１１】予測係数を取得する学習装置の構成例を示すブロック図である。
【図１２】本発明を適用した送信装置の他の構成例を示すブロック図である。
【符号の説明】
２１予測フレームメモリ，２２基準フレームメモリ，２３−１乃至２３−ｎアドレス設定部，２４−１乃至２４−ｎクラス分類適応処理部，２５比較器，２６閾値判定部，２７後処理部，２８量子化器
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an information processing apparatus and method, a recording medium, and a program, and more particularly, to an information processing apparatus and method, a recording medium, and a program that enable efficient encoding.
[0002]
[Prior art]
Recently, compression methods represented by MPEG (Moving Picture Experts Group) have become widespread, and video data and audio data are compressed by the MPEG method and then recorded on a recording medium such as a hard disk, magnetic tape, or optical disk, transmitted over a network, or broadcast via satellite.
[0003]
In the MPEG system, for example, as shown in FIG. 1, when a previous frame and a subsequent frame exist before and after the current frame, the current frame is divided into blocks of, for example, 8 × 8 pixels. A block of 8 × 8 pixels is then extracted within a predetermined search range of the previous frame, and block 1 of the current frame and block 2 of the previous frame are compared pixel by pixel using the sum of absolute differences, the sum of squared errors, or representative point matching. The position of block 2 of the previous frame is moved sequentially within the search range.
[0004]
Then, the relative coordinate position having the smallest absolute difference sum, square error sum, or representative point matching value in the search range is estimated as a motion vector.
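The block-matching search described in the two paragraphs above can be sketched in a few lines. This is a minimal illustration, not the MPEG reference procedure: the function names (`sad`, `get_block`, `estimate_motion`), the list-of-rows frame layout, and the choice to skip candidates that fall partly outside the frame are assumptions made for this example. Only the sum of absolute differences is shown, although the text also mentions the sum of squared errors and representative point matching.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def estimate_motion(cur, prev, top, left, size=8, search=8):
    """Return ((dy, dx), cost) for the previous-frame block that best matches
    the current-frame block whose top-left corner is (top, left)."""
    target = get_block(cur, top, left, size)
    height, width = len(prev), len(prev[0])
    best_cost, best_mv = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Assumption: candidates partly outside the frame are skipped.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(target, get_block(prev, y, x, size))
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost
```

The returned offset is the relative coordinate position with the smallest matching cost, which is what the text calls the motion vector.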
[0005]
Further, when the class classification adaptive process is used, the pixel at the block position is predicted from the pixel of the block 2 corresponding to the motion vector based on the class classification adaptive process, and the motion vector and the predicted value are transmitted.
[0006]
[Problems to be solved by the invention]
However, such a conventional predictive coding method using block matching and class classification adaptive processing has a problem that transmission efficiency, compression efficiency, and coding efficiency are poor.
[0007]
The present invention has been made in view of such a situation, and is intended to improve transmission efficiency, compression efficiency, and encoding efficiency.
[0008]
[Means for Solving the Problems]
The information processing apparatus of the present invention comprises: updating means for storing a second frame that temporally precedes a first frame of the input image data, and for updating the stored second frame every frame or every several frames; extraction means for extracting a class tap, which is a plurality of pixel data used for class classification, and a prediction tap, which is a plurality of pixel data used for generating a prediction value, both used to apply class classification adaptive processing to a plurality of second pixel data existing within a search range around the position in the second frame that corresponds to first pixel data, the first pixel data being a part of the first frame and consisting of one pixel; prediction value generation means for generating prediction values by applying class classification adaptive processing to the plurality of second pixel data using the class tap and the prediction tap extracted by the extraction means; prediction residual calculation means for calculating prediction residuals between the prediction values generated by the prediction value generation means and the first pixel data; comparison means for selecting the smallest of the prediction residuals calculated by the prediction residual calculation means as an evaluation value for the first pixel data, and for comparing the evaluation value with a predetermined threshold; and output means for outputting, based on the comparison result of the comparison means, a motion vector defined by the positions of the first pixel data and of the second pixel data having the smallest prediction residual when the prediction residual is determined to be smaller than the threshold, and for outputting, when the prediction residual is determined to be larger than the threshold, in addition to the motion vector, any one of the prediction value generated by the prediction value generation means, the prediction residual calculated by the prediction residual calculation means, and the first pixel data, together with a flag indicating whether or not the transmitted data is a prediction residual.
[0014]
The information processing method of the present invention includes: an update step of storing a second frame that temporally precedes a first frame of the input image data, and updating the stored second frame every frame or every several frames; an extraction step of extracting a class tap, which is a plurality of pixel data used for class classification, and a prediction tap, which is a plurality of pixel data used for generating a prediction value, both used to apply class classification adaptive processing to a plurality of second pixel data existing within a search range around the position in the second frame that corresponds to first pixel data, the first pixel data being a part of the first frame and consisting of one pixel; a prediction value generation step of generating prediction values by applying class classification adaptive processing to the plurality of second pixel data using the class tap and the prediction tap extracted in the extraction step; a prediction residual calculation step of calculating prediction residuals between the prediction values generated in the prediction value generation step and the first pixel data; a comparison step of selecting the smallest of the prediction residuals calculated in the prediction residual calculation step as an evaluation value for the first pixel data, and comparing the evaluation value with a predetermined threshold; and an output step of outputting, based on the comparison result of the comparison step, a motion vector defined by the positions of the first pixel data and of the second pixel data having the smallest prediction residual when the prediction residual is determined to be smaller than the threshold, and outputting, when the prediction residual is determined to be larger than the threshold, in addition to the motion vector, any one of the prediction value generated in the prediction value generation step, the prediction residual calculated in the prediction residual calculation step, and the first pixel data, together with a flag indicating whether or not the transmitted data is a prediction residual.
[0015]
The program recorded on the first recording medium of the present invention includes: an update step of storing a second frame that temporally precedes a first frame of the input image data, and updating the stored second frame every frame or every several frames; an extraction step of extracting a class tap, which is a plurality of pixel data used for class classification, and a prediction tap, which is a plurality of pixel data used for generating a prediction value, both used to apply class classification adaptive processing to a plurality of second pixel data existing within a search range around the position in the second frame that corresponds to first pixel data, the first pixel data being a part of the first frame and consisting of one pixel; a prediction value generation step of generating prediction values by applying class classification adaptive processing to the plurality of second pixel data using the class tap and the prediction tap extracted in the extraction step; a prediction residual calculation step of calculating prediction residuals between the prediction values generated in the prediction value generation step and the first pixel data; a comparison step of selecting the smallest of the prediction residuals calculated in the prediction residual calculation step as an evaluation value for the first pixel data, and comparing the evaluation value with a predetermined threshold; and an output step of outputting, based on the comparison result of the comparison step, a motion vector defined by the positions of the first pixel data and of the second pixel data having the smallest prediction residual when the prediction residual is determined to be smaller than the threshold, and outputting, when the prediction residual is determined to be larger than the threshold, in addition to the motion vector, any one of the prediction value generated in the prediction value generation step, the prediction residual calculated in the prediction residual calculation step, and the first pixel data, together with a flag indicating whether or not the transmitted data is a prediction residual.
[0016]
The first program of the present invention causes a computer to execute: an update step of storing a second frame that temporally precedes a first frame of the input image data, and updating the stored second frame every frame or every several frames; an extraction step of extracting a class tap, which is a plurality of pixel data used for class classification, and a prediction tap, which is a plurality of pixel data used for generating a prediction value, both used to apply class classification adaptive processing to a plurality of second pixel data existing within a search range around the position in the second frame that corresponds to first pixel data, the first pixel data being a part of the first frame and consisting of one pixel; a prediction value generation step of generating prediction values by applying class classification adaptive processing to the plurality of second pixel data using the class tap and the prediction tap extracted in the extraction step; a prediction residual calculation step of calculating prediction residuals between the prediction values generated in the prediction value generation step and the first pixel data; a comparison step of selecting the smallest of the prediction residuals calculated in the prediction residual calculation step as an evaluation value for the first pixel data, and comparing the evaluation value with a predetermined threshold; and an output step of outputting, based on the comparison result of the comparison step, a motion vector defined by the positions of the first pixel data and of the second pixel data having the smallest prediction residual when the prediction residual is determined to be smaller than the threshold, and outputting, when the prediction residual is determined to be larger than the threshold, in addition to the motion vector, any one of the prediction value generated in the prediction value generation step, the prediction residual calculated in the prediction residual calculation step, and the first pixel data, together with a flag indicating whether or not the transmitted data is a prediction residual.
[0026]
In the information processing apparatus and method, the recording medium, and the program of the present invention, a second frame that temporally precedes a first frame of the input image data is stored, and the stored second frame is updated every frame or every several frames. A class tap, which is a plurality of pixel data used for class classification, and a prediction tap, which is a plurality of pixel data used for generating a prediction value, are extracted; both are used to apply class classification adaptive processing to a plurality of second pixel data existing within a search range around the position in the second frame that corresponds to first pixel data, the first pixel data being a part of the first frame and consisting of one pixel. Prediction values are generated by applying class classification adaptive processing to the plurality of second pixel data using the extracted class tap and prediction tap, prediction residuals between the generated prediction values and the first pixel data are calculated, the smallest of the calculated prediction residuals is selected as an evaluation value for the first pixel data, and the evaluation value is compared with a predetermined threshold. Then, based on the comparison result, when the prediction residual is determined to be smaller than the threshold, a motion vector defined by the positions of the first pixel data and of the second pixel data having the smallest prediction residual is output; when the prediction residual is determined to be larger than the threshold, any one of the generated prediction value, the calculated prediction residual, and the first pixel data is output in addition to the motion vector, together with a flag indicating whether or not the transmitted data is a prediction residual.
[0028]
DETAILED DESCRIPTION OF THE INVENTION
FIG. 2 shows a configuration example of a transmission device in an image processing system to which the present invention is applied. In this transmission device 11, the reference frame memory 22 stores image data of a reference frame that serves as the reference for prediction. The prediction frame memory 21 sequentially stores image data of prediction frames, which are predicted based on the image of the reference frame.
[0029]
From the pixel data stored in the frame memory 22, the address setting unit 23-1 sets the addresses of an arbitrary number of pixels that constitute the prediction tap and the class tap at a first position (the position corresponding to the motion vector v0) within a predetermined search range (the search range corresponding to the target image of the prediction frame stored in the frame memory 21). It then reads out the pixel data at those addresses, that is, the pixel data constituting the prediction tap and the class tap, and supplies it to the class classification adaptive processing unit 24-1. The class classification adaptive processing unit 24-1 performs class classification adaptive processing based on the pixel data of the prediction tap and the class tap supplied from the address setting unit 23-1, and calculates the prediction value of the corresponding pixel (target pixel) of the prediction frame stored in the prediction frame memory 21.
[0030]
The address setting unit 23-2 extracts the pixel data of the prediction tap and the class tap at a second position (the position corresponding to the second motion vector v1) within the search range, and supplies it to the class classification adaptive processing unit 24-2. The class classification adaptive processing unit 24-2 performs class classification adaptive processing based on the pixel data of the prediction tap and the class tap supplied from the address setting unit 23-2, and calculates a prediction value for the target pixel.
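As a rough illustration of what an address setting unit does, the sketch below reads the tap pixels around one candidate position of the reference frame. The function name `extract_taps`, the 3 × 3 tap pattern, and the clamping of addresses at the frame border are assumptions for this example; the text above only says that an arbitrary number of pixels make up the taps.

```python
# Hypothetical 3 x 3 tap pattern, given as (row, column) offsets from the centre.
TAP_OFFSETS = [(r, c) for r in (-1, 0, 1) for c in (-1, 0, 1)]


def extract_taps(ref_frame, y, x, motion_vector, offsets=TAP_OFFSETS):
    """Read tap pixels from the reference frame around the position that the
    motion vector (dy, dx) points to from the target pixel at (y, x)."""
    height, width = len(ref_frame), len(ref_frame[0])
    dy, dx = motion_vector
    taps = []
    for r, c in offsets:
        # Assumption: addresses outside the frame are clamped to the border.
        yy = min(max(y + dy + r, 0), height - 1)
        xx = min(max(x + dx + c, 0), width - 1)
        taps.append(ref_frame[yy][xx])
    return taps
```

Calling this once per candidate motion vector (v0, v1, and so on) yields the input that each class classification adaptive processing unit would receive.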
[0031]
Similar configurations are provided for the number of motion vectors (n) obtained by searching the search range. That is, n combinations of address setting units and class classification adaptive processing units are prepared. For example, assuming that the search range is from minus 8 to plus 8 in both the horizontal and vertical directions, the number of n is 289 (= 17 × 17).
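The count of 289 follows from enumerating every integer offset in the stated range, as the short sketch below shows. The function name `search_offsets` and the row-major ordering of the offsets are assumptions for illustration.

```python
def search_offsets(radius=8):
    """List every (dy, dx) candidate motion vector in a square search range
    that runs from -radius to +radius in both directions."""
    span = range(-radius, radius + 1)
    return [(dy, dx) for dy in span for dx in span]


offsets = search_offsets(8)
n = len(offsets)  # (2 * 8 + 1) ** 2 candidate positions
```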
[0032]
The comparator 25 compares the n prediction values supplied from the class classification adaptive processing units 24-1 to 24-n with the target pixel supplied from the prediction frame memory 21, and detects the difference as a prediction residual. In addition, the smallest prediction residual among the prediction residuals is selected as the evaluation value. The threshold determination unit 26 compares the minimum prediction residual (evaluation value) supplied from the comparator 25 with the threshold. When the evaluation value is equal to or smaller than the threshold value, the threshold value determination unit 26 supplies the motion vector to the post-processing unit 27. The threshold determination unit 26 outputs the motion vector and the prediction residual to the post-processing unit 27 when the prediction residual is larger than the threshold.
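The comparator and the threshold determination unit can be pictured with the following sketch. It is an illustration under assumptions, not the circuit itself: the function name `select_and_decide`, the dictionary input, the use of the absolute difference as the magnitude that is compared, and the layout of the returned record are all invented for this example. The input is assumed to contain at least one candidate.

```python
def select_and_decide(predictions, target, threshold):
    """predictions maps each candidate motion vector (dy, dx) to the prediction
    value produced for it; target is the true value of the target pixel."""
    best_mv, best_residual = None, None
    for mv, predicted in predictions.items():
        residual = target - predicted  # signed prediction residual
        # Assumption: residuals are ranked by their absolute value.
        if best_residual is None or abs(residual) < abs(best_residual):
            best_mv, best_residual = mv, residual
    if abs(best_residual) <= threshold:
        # Evaluation value at or below the threshold: motion vector only.
        return {"motion_vector": best_mv}
    # Otherwise the motion vector is accompanied by the prediction residual.
    return {"motion_vector": best_mv, "residual": best_residual}
```

Keeping the sign of the residual is a design choice of this sketch, made so that a receiver could add it back to its own prediction.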
[0033]
When the motion vector is supplied from the threshold determination unit 26, the post-processing unit 27 outputs it as it is. When the motion vector and the prediction residual are supplied from the threshold determination unit 26, the post-processing unit 27 adds a flag indicating that it is a prediction residual to the prediction residual.
[0034]
The motion vector, or the motion vector and the flagged prediction residual, output from the post-processing unit 27 is supplied to the quantizer 28. Of these, the quantizer 28 quantizes the prediction residual using, for example, a code such as Lloyd-Max or run-length coding, and transmits the result to a predetermined transmission path. In addition to various communication paths, this transmission path includes a recording medium that conveys data by recording and reproduction.
[0035]
The control unit 30, configured by a microprocessor or the like, controls each unit of the transmission device 11. A magnetic disk 41, an optical disk 42, a magneto-optical disk 43, a semiconductor memory 44, or the like is attached to the control unit 30 via an interface 31 as necessary.
[0036]
FIG. 3 shows the configuration of the class classification adaptive processing units 24-1 to 24-n (hereinafter simply referred to as the class classification adaptive processing unit 24 when they need not be distinguished individually). The class tap extraction unit 61 extracts class taps from the pixel data supplied from the corresponding one of the address setting units 23-1 to 23-n (hereinafter simply referred to as the address setting unit 23 when they need not be distinguished individually) and supplies them to an ADRC (Adaptive Dynamic Range Coding) unit 62. The ADRC unit 62 performs, for example, 1-bit ADRC processing on the pixel data supplied from the class tap extraction unit 61.
[0037]
That is, the maximum value and the minimum value of the plurality of pixel data constituting the class tap are detected, and the difference between the maximum value and the minimum value is further calculated as a dynamic range. Each pixel data is normalized by subtracting the minimum value and further dividing by the dynamic range.
[0038]
The normalized data is compared with a predetermined reference value (for example, 0.5), and is set to, for example, 1 when it is larger than the reference value and 0 when it is smaller than the reference value.
[0039]
That is, for example, when the number of pixels constituting the class tap is nine, 9-bit data is supplied from the ADRC unit 62 to the class code determining unit 63. The class code determination unit 63 determines the class code of the class tap based on the 9-bit data, and outputs it to the prediction coefficient memory 64.
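The 1-bit ADRC step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the bit-packing order, the handling of a value exactly equal to the reference, and the handling of a flat tap (zero dynamic range) are all assumptions.

```python
def adrc_1bit_class_code(class_tap, reference=0.5):
    """Turn a class tap (sequence of pixel values) into an integer class code.

    Each pixel is normalized to (value - min) / (max - min) and thresholded
    against `reference`, giving one bit per pixel.
    """
    lowest, highest = min(class_tap), max(class_tap)
    dynamic_range = highest - lowest
    code = 0
    for value in class_tap:
        # Assumption: a flat tap (dynamic range 0) maps every pixel to 0.
        normalized = (value - lowest) / dynamic_range if dynamic_range else 0.0
        # Assumption: a value equal to the reference is coded as 0.
        bit = 1 if normalized > reference else 0
        # Assumption: the first pixel becomes the most significant bit.
        code = (code << 1) | bit
    return code


tap = [10, 200, 30, 180, 90, 120, 10, 250, 60]  # nine pixels of a 3 x 3 class tap
print(format(adrc_1bit_class_code(tap), "09b"))
```

With nine pixels the code takes one of 512 values, so the prediction coefficient memory needs at most 512 entries under this scheme.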
[0040]
A prediction coefficient corresponding to the class code is stored in the prediction coefficient memory 64 in advance, and the prediction coefficient corresponding to the class code supplied from the class code determination unit 63 is output to the prediction value calculation unit 66.
[0041]
The prediction tap extraction unit 65 extracts the pixel data constituting the prediction tap from the pixel data supplied from the address setting unit 23 and outputs it to the prediction value calculation unit 66. The prediction value calculation unit 66 multiplies the pixel data constituting the prediction tap supplied from the prediction tap extraction unit 65 by the prediction coefficients supplied from the prediction coefficient memory 64, and calculates the predicted value from their sum (that is, from a linear combination).
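As a minimal sketch of this calculation, the predicted value is the sum of products of the prediction tap pixels and the coefficients looked up by class code. The function name and the dictionary used as the coefficient memory are illustrative assumptions.

```python
def predict_pixel(prediction_tap, class_code, coefficient_table):
    """Return the linear combination of tap pixels and per-class coefficients.

    `coefficient_table` is a hypothetical stand-in for the prediction
    coefficient memory: it maps a class code to a list of coefficients.
    """
    coefficients = coefficient_table[class_code]
    if len(coefficients) != len(prediction_tap):
        raise ValueError("tap and coefficient lengths differ")
    return sum(w * x for w, x in zip(coefficients, prediction_tap))


table = {3: [0.25, 0.25, 0.25, 0.25]}  # toy coefficients for class 3
print(predict_pixel([10, 20, 30, 40], 3, table))  # 25.0
```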
[0042]
Next, processing of the transmission device 11 will be described with reference to the flowcharts of FIGS. 4 and 5.
[0043]
First, in step S11, the control unit 30 sets a transmission determination threshold for the threshold determination unit 26. This transmission determination threshold is used in the process of step S26 described later.
[0044]
In step S12, processing for accumulating previous frame data is executed, and in step S13, processing for accumulating current frame data is executed. That is, for example, image data for one frame is stored in the prediction frame memory 21, read out from there again, transferred to the reference frame memory 22, and stored. Then, the prediction frame memory 21 stores image data for the new frame that follows. In this way, the pixel data of the previous frame is stored in the reference frame memory 22, and the image data of the current frame (a frame temporally later than the reference frame) is stored in the prediction frame memory 21.
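The hand-over between the two memories can be pictured with a small sketch. The class and attribute names are hypothetical and the sketch assumes the reference frame is updated every frame.

```python
class FrameMemories:
    """Hypothetical model of prediction frame memory 21 and reference frame memory 22."""

    def __init__(self):
        self.prediction_frame = None  # current frame (memory 21)
        self.reference_frame = None   # previous frame (memory 22)

    def push(self, new_frame):
        # The frame that was current becomes the reference for the next one.
        self.reference_frame = self.prediction_frame
        self.prediction_frame = new_frame


memories = FrameMemories()
memories.push("frame 0")
memories.push("frame 1")
print(memories.reference_frame, "->", memories.prediction_frame)
```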
[0045]
In step S14, the prediction frame memory 21 extracts the data of the target pixel from the stored image data for one frame under the control of the control unit 30, and supplies it to the comparator 25.
[0046]
In step S15, the control unit 30 sets a comparison value in the comparator 25. So that the evaluation value ends up as the minimum prediction residual, this comparison value is updated to a smaller prediction residual in step S24; its initial value is set to the maximum value. The comparison value is used in the process of step S23.
[0047]
Next, in step S16, addresses corresponding to the target pixel extracted in the process of step S14 are set in the address setting units 23-1 to 23-n. As a result, the pixel data containing the class tap and the prediction tap for each search position (motion vector) in the search range corresponding to the target pixel can be taken from the address setting units 23-1 to 23-n into the corresponding class classification adaptive processing units 24-1 to 24-n.
[0048]
Therefore, in step S17, the class tap extraction unit 61 extracts class taps from the supplied pixel data. For example, as shown in FIG. 6, the class tap is 3 × 3 pixels shown in black in the figure out of 7 × 7 pixels centered on the target pixel of the prediction frame.
[0049]
In step S18, the ADRC unit 62 performs 1-bit ADRC processing on the nine pieces of pixel data constituting the class tap. As a result, each of the nine pieces of pixel data is converted into a value of 0 or 1, and 9-bit data is supplied to the class code determination unit 63. Based on the 9-bit data supplied from the ADRC unit 62, the class code determination unit 63 determines a class code corresponding to the class tap composed of the nine pixel data, and outputs the class code to the prediction coefficient memory 64.
[0050]
In step S19, the prediction coefficient memory 64 reads the prediction coefficient for the class code supplied from the class code determination unit 63 and outputs the prediction coefficient to the prediction value calculation unit 66.
[0051]
In step S20, the prediction tap extraction unit 65 acquires the pixel data constituting the prediction tap from the pixel data supplied from the address setting unit 23. FIG. 7 shows an example of a prediction tap. In this example, of the 7 × 7 pixels centered on the target pixel of the prediction frame, the 13 pixels shown in black at the center form the prediction tap.
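The tap extraction of steps S17 and S20 can be sketched as below. The figures are not reproduced here, so the geometries are assumptions: the class tap is taken as the central 3 × 3 pixels and the 13-pixel prediction tap as a diamond (|dx| + |dy| ≤ 2). The clamping at the frame edge is likewise an assumption.

```python
# Hypothetical tap geometries; the actual layouts are defined by FIG. 6 and FIG. 7.
CLASS_TAP_OFFSETS = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
PREDICTION_TAP_OFFSETS = [
    (dx, dy)
    for dy in range(-2, 3)
    for dx in range(-2, 3)
    if abs(dx) + abs(dy) <= 2
]


def extract_tap(frame, cx, cy, offsets):
    """Collect pixel values around (cx, cy) from a frame given as a list of rows.

    Coordinates outside the frame are clamped to the nearest edge pixel
    (an assumption; the text does not describe edge handling).
    """
    height, width = len(frame), len(frame[0])
    tap = []
    for dx, dy in offsets:
        x = min(max(cx + dx, 0), width - 1)
        y = min(max(cy + dy, 0), height - 1)
        tap.append(frame[y][x])
    return tap


frame = [[y * 5 + x for x in range(5)] for y in range(5)]
print(extract_tap(frame, 2, 2, CLASS_TAP_OFFSETS))
print(len(PREDICTION_TAP_OFFSETS))  # 13 pixels in the assumed diamond
```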
[0052]
In step S21, the predicted value calculation unit 66 executes a predicted value calculation process. That is, the predicted value calculation unit 66 calculates the predicted value as a linear combination of the 13 pixel data constituting the prediction tap supplied from the prediction tap extraction unit 65 and the 13 prediction coefficients supplied from the prediction coefficient memory 64, and outputs it to the comparator 25.
[0053]
In step S22, the comparator 25 executes a prediction residual calculation process. That is, the comparator 25 calculates the prediction residual as the difference between the predicted value supplied from the class classification adaptive processing unit 24 (in this case, the class classification adaptive processing unit 24-1) and the target pixel (true value) supplied from the prediction frame memory 21.
[0054]
In step S23, the comparator 25 compares the prediction residual (evaluation value) obtained in the process of step S22 with the comparison value. At this point, the comparison value is still the maximum value set in the process of step S15. It is therefore determined that the evaluation value is smaller than the comparison value, and the process proceeds to step S24, where the comparator 25 updates the comparison value from that maximum value to the prediction residual (evaluation value) calculated in the process of step S22. That is, the smaller value becomes the new comparison value.
[0055]
In step S25, the threshold determination unit 26 holds the motion vector corresponding to the evaluation value obtained in the process of step S22 in its built-in memory (overwriting any motion vector already held). In this case, since the output of the class classification adaptive processing unit 24-1 is being processed, this motion vector is v0.
[0056]
In step S26, the threshold determination unit 26 compares the prediction residual (evaluation value) obtained in the process of step S22 with the transmission determination threshold set in the process of step S11. When it is determined that the prediction residual (evaluation value) is equal to or larger than the transmission determination threshold, in step S27, the threshold determination unit 26 holds the prediction residual calculated in the process of step S22 as the evaluation value.
[0057]
When it is determined that the prediction residual (evaluation value) is smaller than the transmission determination threshold, the process of step S27 is skipped.
[0058]
If it is determined in step S23 that the prediction residual (evaluation value) is equal to or greater than the comparison value, the processes in steps S24 to S27 are skipped.
[0059]
In step S28, the threshold determination unit 26 determines whether or not the processing for all positions within the search range has been completed. That is, it is determined here whether or not the processing for all the outputs of the class classification adaptive processing units 24-1 to 24-n has been completed. If there is an unprocessed output among the outputs of the class classification adaptive processing units 24-1 to 24-n, the process returns to step S16, and the subsequent processing is repeatedly executed.
[0060]
As described above, the processes in steps S16 to S28 are executed for all n predicted values output from the class classification adaptive processing units 24-1 to 24-n.
[0061]
If it is determined in step S28 that all positions within the search range have been processed, the process proceeds to step S29, and the threshold determination unit 26 determines whether a prediction residual (evaluation value) smaller than the transmission determination threshold exists. That is, processing all positions in the search range yields n prediction residuals (evaluation values), and if at least one of them is smaller than the transmission determination threshold, the motion vector corresponding to the smallest one has been held by the process of step S25. In that case, in step S30, the threshold determination unit 26 reads out the motion vector held in the process of step S25 and outputs it to the post-processing unit 27.
[0062]
If it is determined in step S29 that there is no evaluation value smaller than the transmission determination threshold, then in step S31 the threshold determination unit 26 outputs the prediction residual (evaluation value) held by the process of step S27 to the post-processing unit 27.
[0063]
That is, if all n prediction residuals (evaluation values) are equal to or larger than the transmission determination threshold, the prediction residual corresponding to the smallest value among them is held in the process of step S27. Therefore, the threshold determination unit 26 outputs the prediction residual to the post-processing unit 27 as an evaluation value (minimum prediction residual) of the search range currently targeted.
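Steps S15 to S31 for one target pixel can be summarized in a short sketch. It is an illustration under stated assumptions: residuals are compared by magnitude while the signed difference is what gets held, the candidates arrive as (motion vector, predicted value) pairs, and the function name and return shape are hypothetical.

```python
import math


def evaluate_search_range(target_pixel, candidates, threshold):
    """Return (motion_vector, residual_or_None) for one target pixel.

    `candidates` is an iterable of (motion_vector, predicted_value) pairs,
    one per search position. The residual is None when the minimum residual
    is below the transmission determination threshold.
    """
    comparison_value = math.inf            # step S15: start from the maximum
    best_vector = None
    held_residual = None
    for motion_vector, predicted in candidates:
        signed = target_pixel - predicted  # step S22 (sign convention assumed)
        magnitude = abs(signed)
        if magnitude >= comparison_value:  # step S23: not an improvement
            continue
        comparison_value = magnitude       # step S24
        best_vector = motion_vector        # step S25
        # Steps S26/S27: keep the residual only if it reaches the threshold.
        held_residual = signed if magnitude >= threshold else None
    return best_vector, held_residual      # step S30 or S31


candidates = [((0, 0), 100), ((1, 0), 118), ((0, 1), 121)]
print(evaluate_search_range(120, candidates, threshold=4))
```

When the second element is None only the motion vector needs to be sent; otherwise the residual accompanies it and is flagged by the post-processing stage.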
[0064]
After the process of step S30 or S31, in step S32, the threshold determination unit 26 determines whether all search ranges within one frame have been processed. If a search range remains unprocessed, the process returns to step S14, and the subsequent processing is repeated.
[0065]
If it is determined in step S32 that all the search ranges have been processed, then in step S33, when only the motion vector has been supplied from the threshold determination unit 26, the post-processing unit 27 outputs it as is; when the motion vector and the prediction residual (evaluation value) have been supplied, it adds to the prediction residual a flag indicating that it is a prediction residual.
[0066]
In step S34, the quantizer 28 outputs the motion vector supplied from the post-processing unit 27 as it is. When the motion vector and the prediction residual (evaluation value) have been supplied, the quantizer 28 quantizes the prediction residual and transmits it to the transmission path.
[0067]
As described above, as shown in FIG. 8, a predetermined pixel is selected as the target pixel 91 in the prediction frame 90 stored in the prediction frame memory 21. Then, the pixel of the reference frame 80 stored in the reference frame memory 22 that corresponds to the target pixel 91 is selected as the target corresponding pixel 83, and a predetermined range centered on the target corresponding pixel 83 is set as the search range 81.
[0068]
Within the search range 81, pixels in a predetermined range are selected as the prediction taps 82, and the prediction values are calculated as described above based on the pixels constituting the prediction taps 82. Then, a difference between the predicted value and the target pixel 91 is calculated as a prediction residual.
[0069]
The prediction tap 82 is sequentially moved to n positions within the search range 81 as indicated by prediction taps 82-1 to 82-n. Then, the smallest one of the n prediction residuals obtained corresponding to each of the n prediction taps is selected as the evaluation value of the search range 81.
[0070]
The above process is executed by sequentially selecting all the pixels in the prediction frame 90 as the target pixel 91.
[0071]
The image data transmitted to the transmission path as described above is received and decoded by a receiving apparatus such as that shown in FIG. 9.
[0072]
In this receiving apparatus 111, the inverse quantizer 121 acquires the data encoded by the transmitting apparatus 11 and transmitted through the transmission path, and inversely quantizes it. The transmission determination unit 122 reads the flag from the data supplied from the inverse quantizer 121 and outputs the motion vector contained in the transmitted data to the address setting unit 123. The address setting unit 123 extracts the pixel data in the range corresponding to the motion vector from the pixel data of the already demodulated reference frame stored in the reference frame memory 124, and supplies it to the class classification adaptive processing unit 125.
[0073]
The class classification adaptation processing unit 125 performs class classification adaptation processing on the pixel data supplied from the address setting unit 123 to generate pixel data.
[0074]
The class classification adaptive processing unit 125 has basically the same configuration as the class classification adaptive processing unit 24 of the transmission device 11 of FIG. 2. Therefore, in the following description, FIG. 3 is also referred to as the configuration of the class classification adaptive processing unit 125.
[0075]
When the transmission determination unit 122 determines that the data input from the inverse quantizer 121 includes prediction residual data, the transmission determination unit 122 supplies the data to the decoding unit 126. The decoding unit 126 decodes this prediction residual.
[0076]
The synthesizing unit 127 appropriately synthesizes the pixel data supplied from the class classification adaptive processing unit 125 and the decoding unit 126 on the memory, and outputs the pixel data for one frame. Part of the pixel data output from the combining unit 127 is supplied to the reference frame memory 124 as necessary, and is stored as a reference frame for processing of subsequent frames.
[0077]
The control unit 128, configured by a microprocessor or the like, controls the operation of each unit of the reception device 111.
[0078]
A magnetic disk 141, an optical disk 142, a magneto-optical disk 143, or a semiconductor memory 144 is attached to the control unit 128 via an interface 131 as necessary, and the programs and data stored on it are installed as appropriate.
[0079]
Next, the decoding process in the receiving apparatus 111 will be described with reference to the flowchart of FIG. 10.
[0080]
In step S51, the inverse quantizer 121 receives the quantized image data transmitted via the transmission path, inversely quantizes it, and supplies the result to the transmission determination unit 122. In step S52, the transmission determination unit 122 determines whether the transmitted data is motion vector data. When it is determined that the input data includes motion vector data, the transmission determination unit 122 outputs the motion vector data to the address setting unit 123 in step S53. The address setting unit 123 extracts pixel data in the search range from the pixel data stored in the reference frame memory 124 and outputs the pixel data to the class classification adaptive processing unit 125.
[0081]
In step S54, the class classification adaptive processing unit 125 executes class classification adaptive processing. That is, the class tap extraction unit 61 and the prediction tap extraction unit 65 extract the pixel data of the class tap and the prediction tap, respectively, from the pixel data in the search range supplied from the address setting unit 123 based on the motion vector. The ADRC unit 62 performs 1-bit ADRC processing on the class tap, and the class code determination unit 63 determines the class code based on the resulting data. The prediction coefficient memory 64 outputs the prediction coefficients corresponding to the class code to the prediction value calculation unit 66. The predicted value calculation unit 66 calculates the predicted value as a linear combination of the prediction tap and the prediction coefficients.
[0082]
The generated predicted value is supplied to the synthesis unit 127. In step S57, the synthesizing unit 127 synthesizes the prediction value supplied from the class classification adaptation processing unit 125 as pixel data at a corresponding position in the frame.
[0083]
Of the transmitted data, data determined in step S52 not to be a motion vector (prediction residual data) is output from the transmission determination unit 122 to the decoding unit 126. In step S56, the decoding unit 126 performs a prediction residual decoding process. That is, the decoding unit 126 decodes the prediction residual supplied from the transmission determination unit 122. The prediction residual decoded by the decoding unit 126 is supplied to the combining unit 127, and in the process of step S57 is combined with the pixel data at the corresponding position decoded by the class classification adaptive processing unit 125.
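The two decoding branches can be sketched as follows. This is a simplified illustration: the record layout, the `predictor` callback standing in for the class classification adaptive processing, and the convention that the residual equals the true value minus the predicted value are all assumptions.

```python
def decode_pixel(reference_frame, x, y, record, predictor):
    """Reconstruct one pixel from a received record.

    `record` is a hypothetical dict with a "motion_vector" entry and an
    optional "residual" entry (present only when the flag was set).
    `predictor(frame, cx, cy)` stands in for the class classification
    adaptive processing applied at the motion-compensated position.
    """
    dx, dy = record["motion_vector"]
    value = predictor(reference_frame, x + dx, y + dy)  # steps S53 and S54
    residual = record.get("residual")
    if residual is not None:
        value += residual                               # steps S56 and S57
    return value


def copy_predictor(frame, cx, cy):
    # Toy predictor: just copy the reference pixel.
    return frame[cy][cx]


reference = [[10, 20], [30, 40]]
print(decode_pixel(reference, 0, 0, {"motion_vector": (1, 1)}, copy_predictor))
print(decode_pixel(reference, 0, 0, {"motion_vector": (1, 1), "residual": -3}, copy_predictor))
```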
[0084]
The prediction coefficient stored in the prediction coefficient memory 64 can be acquired by learning. FIG. 11 shows a configuration example of a learning device that performs this learning. In the learning device 160, the student data generation unit 161 generates student data by appropriately changing the pixel value of the image data as the input teacher data. The class tap extraction unit 162 extracts class taps from the student data supplied from the student data generation unit 161 and outputs the class taps to the ADRC unit 163. The ADRC unit 163 performs 1-bit ADRC processing on the class tap pixel data supplied from the class tap extraction unit 162 and outputs the obtained result to the class code determination unit 164.
[0085]
The class code determination unit 164 determines a class code based on the data input from the ADRC unit 163, and outputs the class code to the normal equation generation unit 166.
[0086]
The prediction tap extraction unit 165 extracts a prediction tap from the student data supplied from the student data generation unit 161 and outputs the prediction tap to the normal equation generation unit 166. The normal equation generation unit 166 is also supplied with the teacher data, that is, the parent data from which the student data generation unit 161 generates the student data. The normal equation generation unit 166 generates a normal equation for the linear combination, in which the sum of products of the prediction coefficients (unknowns) and the student data equals the value of the teacher data.
[0087]
The prediction coefficient determination unit 167 obtains the prediction coefficients, which are the unknowns, from the normal equation generated by the normal equation generation unit 166 by using a general matrix solving method such as the sweep-out method (Gauss-Jordan elimination), and stores them in the prediction coefficient memory 64.
[0088]
Next, the operation of the learning device 160 will be described. The class tap extraction unit 162 extracts class tap pixel data (3 × 3 pixel data in the example of FIG. 6) from the student data supplied from the student data generation unit 161, and outputs it to the ADRC unit 163. The ADRC unit 163 performs 1-bit ADRC processing on the pixel data of the class tap, and outputs the obtained result to the class code determination unit 164. The class code determination unit 164 determines a class code based on the data input from the ADRC unit 163 and outputs the class code to the normal equation generation unit 166.
[0089]
From the student data generated by the student data generation unit 161, the pixel data constituting the prediction tap (13 pixel data in the example of FIG. 7) is extracted by the prediction tap extraction unit 165 and supplied to the normal equation generation unit 166. The normal equation generation unit 166 generates a normal equation for the linear combination for each class code, assuming that the sum of products of the prediction tap pixel data and the prediction coefficients is equal to the teacher data, and outputs the normal equation to the prediction coefficient determination unit 167. The prediction coefficient determination unit 167 determines the prediction coefficients, which are the unknowns, by the sweep-out method and stores them in the prediction coefficient memory 64.
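The learning procedure can be sketched in pure Python as below. This is an illustration under assumptions: the normal equation is taken to be the usual least-squares form (sums of tap products on the left, sums of tap times teacher value on the right), accumulated separately per class code, and the sample format and function names are hypothetical.

```python
def solve_sweep_out(matrix, rhs):
    """Solve matrix * w = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular normal equation; more samples are needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def learn_coefficients(samples, tap_size):
    """samples: iterable of (class_code, prediction_tap, teacher_value)."""
    sums = {}
    for class_code, tap, teacher in samples:
        lhs, rhs = sums.setdefault(
            class_code,
            ([[0.0] * tap_size for _ in range(tap_size)], [0.0] * tap_size),
        )
        for i in range(tap_size):
            rhs[i] += tap[i] * teacher          # sum of x_i * y
            for j in range(tap_size):
                lhs[i][j] += tap[i] * tap[j]    # sum of x_i * x_j
    return {code: solve_sweep_out(lhs, rhs) for code, (lhs, rhs) in sums.items()}


# Toy data generated from the coefficients [0.5, 0.25] for class 0.
samples = [(0, [1, 0], 0.5), (0, [0, 1], 0.25), (0, [1, 1], 0.75), (0, [2, 1], 1.25)]
print(learn_coefficients(samples, tap_size=2))
```

The resulting dictionary plays the role of the prediction coefficient memory: one coefficient list per class code.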
[0090]
In the above, when it is determined in step S29 in FIG. 4 that there is no evaluation value smaller than the transmission determination threshold, the motion vector and the prediction residual are transmitted in the process of step S31. Instead of the prediction residual, it is also possible to send the predicted value calculated in step S21 or send the pixel data value of the target pixel itself.
[0091]
It is also possible to perform processing in units of blocks instead of in units of pixels. In this case, the sum of absolute errors between the predicted values and the input values in the block is obtained; if this error sum is smaller than a predetermined threshold, the motion vector is transmitted, and otherwise the pixel data in the block may be transmitted.
[0092]
In this case, since only a 1-bit flag needs to be added per block, the amount of extra information is reduced, and compression can be performed efficiently.
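The block-based variant can be sketched as follows. The sketch reads the passage as sending the block's pixel data whenever the error sum is not below the threshold, which is an interpretation; the function name and the returned labels are hypothetical.

```python
def block_decision(predicted_block, input_block, threshold):
    """Decide what to transmit for one block, based on the sum of absolute errors.

    Blocks are lists of rows of pixel values. Returns a (label, error_sum) pair,
    where the label stands for the 1-bit per-block flag.
    """
    error_sum = sum(
        abs(p - x)
        for predicted_row, input_row in zip(predicted_block, input_block)
        for p, x in zip(predicted_row, input_row)
    )
    label = "motion_vector_only" if error_sum < threshold else "pixel_data"
    return label, error_sum


predicted = [[10, 12], [14, 16]]
actual = [[11, 12], [13, 18]]
print(block_decision(predicted, actual, threshold=5))
```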
[0093]
Furthermore, as shown in FIG. 12, the threshold value determination unit 26 (threshold value determination process) may be omitted and a motion vector may be always transmitted. By doing so, it is possible to further improve the compression efficiency, transmission efficiency, and encoding efficiency.
[0094]
Note that the first frame data may or may not be quantized. Further, the reference frame may be updated every frame or every several frames. Alternatively, it is also possible not to update the reference frame at all.
[0095]
When the reference frame is updated, image degradation is small. Not only can the demodulation side reproduce a higher quality image, but the influence of noise also does not propagate to later stages, so a more robust image transmission method can be realized.
[0096]
As described above, according to the present invention, the prediction error is equal to or smaller than that obtained with the absolute difference sum or the square error sum. Therefore, compared with methods using the absolute difference sum or the square error sum, the compression efficiency can be further increased, and a high-quality image can be transmitted.
[0097]
In the above description, the transmission device 11 and the reception device 111 are configured to be independent from each other. However, they can be integrated in one device. In particular, when the transmission path is configured by a recording medium, in a device that records and reproduces data with respect to the recording medium, both the transmission device and the reception device are arranged in one device.
[0098]
In the above description, image data has been described as an example. However, the present invention can also be applied to the case of transmitting content data other than image data.
[0099]
The series of processes described above can be executed by hardware, but can also be executed by software. When the series of processes is executed by software, the program constituting the software is installed from a network or a recording medium into a computer incorporated in dedicated hardware, or into, for example, a general-purpose personal computer that can execute various functions by installing various programs.
[0100]
As shown in FIGS. 2 and 9, this recording medium is constituted not only by media that are distributed separately from the apparatus main body in order to provide the program to the user and on which the program is recorded, namely the magnetic disks 41 and 141 (including floppy disks), the optical disks 42 and 142 (including CD-ROM (Compact Disk-Read Only Memory) and DVD (Digital Versatile Disk)), the magneto-optical disks 43 and 143 (including MD (Mini-Disk)), or the semiconductor memories 44 and 144, but also by a ROM on which the program is recorded, a hard disk included in a storage unit, and the like, which are provided to the user preinstalled in the apparatus main body.
[0101]
In the present specification, the steps describing the program recorded on the recording medium include not only processing performed in chronological order according to the described order, but also processing that is not necessarily performed in chronological order and is executed in parallel or individually.
[0102]
Further, in this specification, the system represents the entire apparatus constituted by a plurality of apparatuses.
[0103]
[Effects of the Invention]
As described above, according to the present invention, transmission efficiency, compression efficiency, and encoding efficiency can be improved.
[0104]
Furthermore, according to the present invention, it is possible to easily and reliably receive and decode transmission data with improved transmission efficiency, compression efficiency, or encoding efficiency.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating block matching.
FIG. 2 is a block diagram illustrating a configuration example of a transmission apparatus to which the present invention is applied.
FIG. 3 is a block diagram illustrating a configuration example of a class classification adaptation processing unit in FIG. 2.
FIG. 4 is a flowchart illustrating processing of the transmission apparatus in FIG. 2.
FIG. 5 is a flowchart for explaining processing of the transmission apparatus in FIG. 2.
FIG. 6 is a diagram illustrating an example of a class tap.
FIG. 7 is a diagram illustrating an example of a prediction tap.
FIG. 8 is a diagram for explaining prediction processing in the class classification adaptation processing unit in FIG. 2.
FIG. 9 is a block diagram illustrating a configuration example of a receiving device to which the present invention is applied.
FIG. 10 is a flowchart for explaining the operation of the receiving apparatus of FIG. 9.
FIG. 11 is a block diagram illustrating a configuration example of a learning apparatus that acquires a prediction coefficient.
FIG. 12 is a block diagram showing another configuration example of the transmission apparatus to which the present invention is applied.
[Explanation of symbols]
21 prediction frame memory, 22 reference frame memory, 23-1 to 23-n address setting unit, 24-1 to 24-n class classification adaptive processing unit, 25 comparator, 26 threshold determination unit, 27 post-processing unit, 28 quantizer

Claims

Update means for storing, of input image data, a second frame temporally preceding a first frame of the image data, and for updating the stored second frame every frame or every several frames;
Extraction means for extracting class taps, which are a plurality of pixel data used for class classification, and prediction taps, which are a plurality of pixel data used for generating a predicted value, the taps being used for class classification adaptive processing of a plurality of second pixel data present within a search range around the position in the second frame corresponding to first pixel data that is a part of the first frame and consists of one pixel;
Predicted value generation means for generating a predicted value by performing class classification adaptive processing on the plurality of second pixel data using the class taps and the prediction taps extracted by the extraction means;
Prediction residual calculation means for calculating a prediction residual between the predicted value generated by the predicted value generation means and the first pixel data;
Comparison means for selecting the minimum prediction residual among the prediction residuals calculated by the prediction residual calculation means as an evaluation value for the first pixel data, and for comparing the evaluation value with a predetermined threshold; and
Output means for outputting, when it is determined based on the comparison result of the comparison means that the prediction residual is smaller than the threshold, a motion vector defined by the positions of the first pixel data and the second pixel data having the minimum prediction residual, and for outputting, when it is determined that the prediction residual is larger than the threshold, in addition to the motion vector, one of the predicted value generated by the predicted value generation means, the prediction residual calculated by the prediction residual calculation means, and the first pixel data, together with a flag indicating whether or not the transmitted data is the prediction residual. An information processing apparatus characterized by comprising the above means.

Of the input image data, the second frame temporally prior to the first frame of the image data is stored, and the stored second frame is stored every frame or every several frames. An update step to update;
Class a plurality of second pixel data present within the search range around the position in the second frame is a part corresponding to the first pixel data of one pixel of the first frame An extraction step for extracting a class tap which is a plurality of pixel data used for class classification and a plurality of pixel data used for generation of a prediction value, which are used for classification adaptive processing ;
A predicted value generation step of generating a predicted value by performing a class classification adaptive process on the plurality of second pixel data using the class tap and the prediction tap extracted by the processing of the extraction step;
A prediction residual calculation step of calculating a prediction residual between the predicted value generated by the processing of the predicted value generation step and the first pixel data;
A comparison step of selecting, from among the prediction residuals calculated by the processing of the prediction residual calculation step, the smallest prediction residual as an evaluation value for the first pixel data, and comparing the evaluation value with a predetermined threshold; and
An output step of outputting, based on the comparison result of the comparison step, a motion vector defined by the positions of the first pixel data and the second pixel data having the smallest prediction residual when the prediction residual is determined to be smaller than the threshold, and, when the prediction residual is determined to be larger than the threshold, outputting, in addition to the motion vector, one of the predicted value generated by the processing of the predicted value generation step, the prediction residual calculated by the processing of the prediction residual calculation step, or the first pixel data, together with a flag indicating whether or not the transmitted data is the prediction residual. An information processing method characterized by the above.

An update step of storing, from the input image data, a second frame temporally preceding a first frame of the image data, and updating the stored second frame every frame or every several frames;
An extraction step of extracting a class tap, which is a plurality of pixel data used for class classification, and a prediction tap, which is a plurality of pixel data used for generating a predicted value, both taps being used for class classification adaptive processing of a plurality of second pixel data present within a search range around the position in the second frame corresponding to first pixel data that is part of the first frame and consists of one pixel;
A predicted value generation step of generating a predicted value by performing a class classification adaptive process on the plurality of second pixel data using the class tap and the prediction tap extracted by the processing of the extraction step;
A prediction residual calculation step of calculating a prediction residual between the predicted value generated by the processing of the predicted value generation step and the first pixel data;
A comparison step of selecting, from among the prediction residuals calculated by the processing of the prediction residual calculation step, the smallest prediction residual as an evaluation value for the first pixel data, and comparing the evaluation value with a predetermined threshold; and
An output step of outputting, based on the comparison result of the comparison step, a motion vector defined by the positions of the first pixel data and the second pixel data having the smallest prediction residual when the prediction residual is determined to be smaller than the threshold, and, when the prediction residual is determined to be larger than the threshold, outputting, in addition to the motion vector, one of the predicted value generated by the processing of the predicted value generation step, the prediction residual calculated by the processing of the prediction residual calculation step, or the first pixel data, together with a flag indicating whether or not the transmitted data is the prediction residual. A recording medium on which a computer-readable program comprising the above steps is recorded.

An update step of storing, from the input image data, a second frame temporally preceding a first frame of the image data, and updating the stored second frame every frame or every several frames;
An extraction step of extracting a class tap, which is a plurality of pixel data used for class classification, and a prediction tap, which is a plurality of pixel data used for generating a predicted value, both taps being used for class classification adaptive processing of a plurality of second pixel data present within a search range around the position in the second frame corresponding to first pixel data that is part of the first frame and consists of one pixel;
A predicted value generation step of generating a predicted value by performing a class classification adaptive process on the plurality of second pixel data using the class tap and the prediction tap extracted by the processing of the extraction step;
A prediction residual calculation step of calculating a prediction residual between the predicted value generated by the processing of the predicted value generation step and the first pixel data;
A comparison step of selecting, from among the prediction residuals calculated by the processing of the prediction residual calculation step, the smallest prediction residual as an evaluation value for the first pixel data, and comparing the evaluation value with a predetermined threshold; and
An output step of outputting, based on the comparison result of the comparison step, a motion vector defined by the positions of the first pixel data and the second pixel data having the smallest prediction residual when the prediction residual is determined to be smaller than the threshold, and, when the prediction residual is determined to be larger than the threshold, outputting, in addition to the motion vector, one of the predicted value generated by the processing of the predicted value generation step, the prediction residual calculated by the processing of the prediction residual calculation step, or the first pixel data, together with a flag indicating whether or not the transmitted data is the prediction residual. A program that causes a computer to execute the above steps.
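
For illustration only, the prediction, residual, comparison, and output steps recited in the claims can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the names `encode_pixel` and `mean_predictor`, the dictionary keys, the clamping at frame borders, and the handling of a residual exactly equal to the threshold are all hypothetical. In particular, `mean_predictor` is a simple stand-in for class classification adaptive processing, which the claims do not reduce to a specific formula, and the sketch chooses the prediction residual as the extra data when the threshold is not met, although the claims also allow the predicted value or the first pixel data.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Frame = Sequence[Sequence[float]]
Predictor = Callable[[Frame, int, int], float]


def mean_predictor(prev: Frame, y: int, x: int) -> float:
    """Hypothetical stand-in for class classification adaptive processing:
    averages the 3x3 taps around (y, x), clamping coordinates at the borders."""
    h, w = len(prev), len(prev[0])
    taps = [
        prev[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    ]
    return sum(taps) / len(taps)


def encode_pixel(
    cur: Frame,
    prev: Frame,
    y: int,
    x: int,
    search: int,
    threshold: float,
    predict: Predictor = mean_predictor,
) -> Dict[str, object]:
    """Encode one pixel of the first (current) frame against the second (stored) frame."""
    h, w = len(prev), len(prev[0])
    # best holds (absolute residual, signed residual, (dy, dx)) of the best candidate.
    best: Optional[Tuple[float, float, Tuple[int, int]]] = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            py, px = y + dy, x + dx
            if not (0 <= py < h and 0 <= px < w):
                continue  # assumption: candidates outside the frame are skipped
            residual = cur[y][x] - predict(prev, py, px)
            if best is None or abs(residual) < best[0]:
                best = (abs(residual), residual, (dy, dx))
    if best is None:
        raise ValueError("pixel position lies outside the stored frame")
    evaluation, residual, motion_vector = best
    if evaluation < threshold:
        # Small residual: the motion vector alone is output.
        return {"motion_vector": motion_vector}
    # Otherwise: motion vector, extra data, and a flag saying the data is a residual.
    # Assumption: a residual equal to the threshold is treated as "not smaller".
    return {"motion_vector": motion_vector, "data": residual, "is_residual": True}
```

Passing the predictor as a parameter keeps the search and output logic separate from the prediction itself, so a real class-tap and prediction-tap based predictor could be substituted without changing `encode_pixel`.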