JP4124436B2 - Motion estimation device, program, storage medium, and motion estimation method

Info

Publication number: JP4124436B2
Application number: JP2002329553A
Authority: JP
Inventors: 水納　　亨
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2002-11-13
Filing date: 2002-11-13
Publication date: 2008-07-23
Anticipated expiration: 2022-11-13
Also published as: JP2004165982A

Description

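[0001]
[Technical field to which the invention belongs]
The present invention relates to a motion amount estimation device, a program, a storage medium, and a motion amount estimation method.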
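[0002]
[Prior art]
Advances in image input and output technology have sharply increased the demand for higher-definition images in recent years. Taking the digital camera as an example of an image input device, high-performance charge-coupled devices (CCDs) with 3 million or more pixels have fallen in price and are now widely used even in popularly priced products. Products with 5 million pixels are about to appear, and this upward trend in pixel count is said to continue for some time.
[0003]
Image output and display devices have likewise become remarkably finer and cheaper, both in the hard-copy field (laser printers, ink jet printers, sublimation printers and the like) and in the soft-copy field (CRTs and flat panel displays such as LCDs (liquid crystal display devices) and PDPs (plasma display devices)).
[0004]
The market introduction of such high-performance, low-priced image input/output products has begun to popularize high-definition images, and demand for them is expected to grow in every setting. Indeed, the development of technology related to personal computers and to networks such as the Internet is accelerating this trend. Recently in particular, mobile devices such as mobile phones and notebook personal computers have spread very rapidly, and opportunities to transmit or receive high-definition images from any location by some means of communication are increasing sharply.
[0005]
Against this background, demand for higher performance and more functions in the image compression/decompression technology that makes high-definition images easy to handle will inevitably grow stronger.
[0006]
In recent years, a new scheme called JPEG2000, which can restore high-quality images even at high compression rates, has therefore been undergoing standardization as one image compression scheme that meets these demands. In JPEG2000, dividing an image into rectangular regions (tiles) makes it possible to perform compression/decompression in an environment with little memory. Each tile is the basic unit for executing the compression/decompression process, and compression/decompression can be performed independently for each tile.
[0007]
A moving image can be obtained by displaying such single-frame JPEG2000 images in succession at a predetermined frame rate (the number of frames reproduced per unit time). Motion JPEG2000 is an international standard for displaying a moving image by advancing JPEG2000 images frame by frame in this way.
[0008]
A scheme that, like Motion JPEG2000, compression-codes image data using the discrete wavelet transform has also been proposed (see, for example, Patent Document 1).
[0009]
The technique disclosed in Patent Document 1 not only applies the discrete wavelet transform to pixel values and compression-codes them, but also takes the correlation between images in different frames, thereby removing the redundancy in moving image data when there is no image motion between frames. The data compression rate can therefore be improved further.
[0010]
[Patent Document 1]
JP 2001-309381 A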
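[0011]
[Problems to be solved by the invention]
With the scheme described in Patent Document 1, however, obtaining the correlation between frames requires the complicated process of decoding the encoded orthogonal transform coefficient values and then inverse-quantizing them, so processing takes time. Memory is also needed to store the previous frame used to obtain the inter-frame correlation.
[0012]
An object of the present invention is to provide a motion amount estimation device, a program, a storage medium, and a motion amount estimation method that can obtain the motion amount of an image quickly and accurately.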
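[0013]
[Means for solving the problems]
The motion amount estimation device of the invention according to claim 1 comprises: means for selecting, block by block, sub-blocks corresponding to the high-frequency subbands of the LH component (1LH) and the HL component (1HL) of the first level from code string data that has been hierarchically compression-coded by dividing each frame of the interlaced images constituting a moving image into a plurality of blocks and applying the discrete wavelet transform to the pixel values of each block; means for calculating the code amount of the selected 1LH sub-block and the code amount of the selected 1HL sub-block and calculating the ratio between them (1LH code amount / 1HL code amount); and means for comparing the calculated ratio with a threshold and estimating that the motion amount of the sub-block is high-speed if the ratio is larger and low-speed if it is not.
[0014]
The code amounts of the sub-blocks included in the high-frequency subbands are calculated block by block, and the motion amount per code block or the like is estimated from these code amounts. Because no inter-frame difference needs to be taken, memory consumption is reduced and processing time is shortened, so the motion amount per code block or the like can be estimated quickly and accurately. In particular, comparing the level-1 1LH component of the wavelet transform coefficients, in which horizontal edges of the image appear strongly, with the level-1 1HL component, in which vertical edges appear strongly, allows the motion amount (speed) of an interlaced image to be estimated reliably.
[0015]
The invention according to claim 2 is the motion amount estimation device according to claim 1, further comprising: means for dividing, over the entire frame, the number of sub-blocks estimated to be high-speed or the number of sub-blocks estimated to be low-speed by the total number of sub-blocks; and means for comparing the result of the division with a threshold and estimating whether the motion amount of the entire frame is high-speed or low-speed.
[0016]
The motion amount of the entire frame is thus estimated from the estimated motion amounts of the individual sub-blocks. Image quality can then be adjusted coarsely according to the motion amount of the entire frame and finely according to the motion amount of each code block or the like, which enables efficient image quality control. Moreover, because the motion amount of the entire frame image is estimated from the proportion of results obtained by comparing the 1LH code amount with the 1HL code amount for all sub-blocks included in the high-frequency subbands, it can be estimated simply.
[0017]
The invention according to claim 3 is the motion amount estimation device according to claim 1 or 2, wherein the calculated code amount of a sub-block is a losslessly compressed code amount. The estimation accuracy of the motion amount can therefore be improved.
[0018]
The invention according to claim 4 is the motion amount estimation device according to any one of claims 1 to 3, wherein the calculated code amount of a sub-block is the code amount before bit truncation. The estimation accuracy of the motion amount can therefore be improved.
[0019]
The invention according to claim 5 is a program that causes a computer to execute the function of each means of the motion amount estimation device according to any one of claims 1 to 4. The invention according to claim 6 is a computer-readable storage medium storing a program that causes a computer to execute the function of each means of the motion amount estimation device according to any one of claims 1 to 4. The same effects as those of claims 1 to 4 can therefore be obtained either by executing the program directly on a computer or by having a computer read and execute the program recorded on the storage medium.
[0020]
The motion amount estimation method of the invention according to claim 7 comprises: a step of selecting, block by block, sub-blocks corresponding to the high-frequency subbands of the LH component (1LH) and the HL component (1HL) of the first level from code string data that has been hierarchically compression-coded by dividing each frame of the interlaced images constituting a moving image into a plurality of blocks and applying the discrete wavelet transform to the pixel values of each block; a step of calculating the code amount of the selected 1LH sub-block and the code amount of the selected 1HL sub-block and calculating the ratio between them (1LH code amount / 1HL code amount); and a step of comparing the calculated ratio with a threshold and estimating that the motion amount of the sub-block is high-speed if the ratio is larger and low-speed if it is not. The same effect as that of claim 1 can therefore be obtained.
[0021]
The invention according to claim 8 is the motion amount estimation method according to claim 7, further comprising: a step of dividing, over the entire frame, the number of sub-blocks estimated to be high-speed or the number of sub-blocks estimated to be low-speed by the total number of sub-blocks; and a step of comparing the result of the division with a threshold and estimating whether the motion amount of the entire frame is high-speed or low-speed. The same effect as that of claim 2 can therefore be obtained.
[0022]
The invention according to claim 9 is the motion amount estimation method according to claim 7 or 8, wherein the calculated code amount of a sub-block is a losslessly compressed code amount. The same effect as that of claim 3 can therefore be obtained.
[0023]
The invention according to claim 10 is the motion amount estimation method according to any one of claims 7 to 9, wherein the calculated code amount of a sub-block is the code amount before bit truncation. The same effect as that of claim 4 can therefore be obtained.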
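[0024]
[Embodiments of the invention]
First, an outline is given of the "hierarchical coding algorithm" and the "JPEG2000 algorithm" on which the present embodiment is premised.
[0025]
FIG. 1 is a functional block diagram of a system that implements the hierarchical coding algorithm underlying the JPEG2000 scheme. The system consists of the following functional blocks: a color space transform/inverse transform unit 101, a two-dimensional wavelet transform/inverse transform unit 102, a quantization/inverse quantization unit 103, an entropy encoding/decoding unit 104, and a tag processing unit 105.
[0026]
One of the largest differences between this system and the conventional JPEG algorithm is the transform. JPEG uses the discrete cosine transform (DCT), whereas this hierarchical coding algorithm uses the discrete wavelet transform (DWT) in the two-dimensional wavelet transform/inverse transform unit 102. Compared with the DCT, the DWT has the advantage of better image quality at high compression, and this is one of the main reasons the DWT was adopted in JPEG2000, the successor to JPEG.
[0027]
The other major difference is that this hierarchical coding algorithm adds the tag processing unit 105 as a functional block at the final stage of the system to form the code. During compression, the tag processing unit 105 generates the compressed data as code string data; during decompression, it interprets the code string data needed for decompression. The code string data is what allows JPEG2000 to provide various convenient functions. For example, compression or decompression of a still image can be stopped freely at any level (decomposition level) corresponding to the octave division of the block-based DWT (see FIG. 3, described later).
[0028]
The color space transform/inverse transform unit 101 is often connected to the input/output side of the original image. It corresponds, for example, to the part that converts from the RGB color system, composed of the primary color components R (red), G (green) and B (blue), or from the YMC color system, composed of the complementary color components Y (yellow), M (magenta) and C (cyan), to the YUV or YCbCr color system, or performs the inverse conversion.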
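[0029]
Next, the JPEG2000 algorithm is described.
[0030]
In a color image, as shown in FIG. 2, each component 111 of the original image (here the RGB primary color system) is generally divided into rectangular regions. These regions are generally called blocks or tiles; because JPEG2000 commonly calls them tiles, they are referred to as tiles below (in the example of FIG. 2, each component 111 is divided 4 × 4 into a total of 16 rectangular tiles 112). Each such tile 112 (in the example of FIG. 2, R00, R01, ..., R15 / G00, G01, ..., G15 / B00, B01, ..., B15) is the basic unit for executing the image data compression/decompression process. Compression and decompression of the image data are therefore performed independently for each component and for each tile 112.
[0031]
When image data is encoded, the data of each tile 112 of each component 111 is input to the color space transform/inverse transform unit 101 of FIG. 1 and color-space transformed, after which the two-dimensional wavelet transform unit 102 applies the two-dimensional wavelet transform (forward transform) and divides the data spatially into frequency bands.
[0032]
FIG. 3 shows the subbands at each decomposition level when the number of decomposition levels is 3. The tile original image (0LL) (decomposition level 0) obtained by dividing the original image into tiles is subjected to the two-dimensional wavelet transform and separated into the subbands shown at decomposition level 1 (1LL, 1HL, 1LH, 1HH). The low-frequency component 1LL at this level is then subjected to the two-dimensional wavelet transform and separated into the subbands shown at decomposition level 2 (2LL, 2HL, 2LH, 2HH). In the same way, the low-frequency component 2LL is subjected to the two-dimensional wavelet transform and separated into the subbands shown at decomposition level 3 (3LL, 3HL, 3LH, 3HH). In FIG. 3, the subbands to be encoded at each decomposition level are shaded. For example, when the number of decomposition levels is 3, the shaded subbands (3HL, 3LH, 3HH, 2HL, 2LH, 2HH, 1HL, 1LH, 1HH) are encoded and the 3LL subband is not.
[0033]
Next, the bits to be encoded are determined in the specified coding order, and the quantization/inverse quantization unit 103 shown in FIG. 1 generates a context from the bits surrounding the target bit.
[0034]
The quantized wavelet coefficients are divided, for each subband, into non-overlapping rectangles called "precincts". Precincts were introduced so that implementations can use memory efficiently. As shown in FIG. 4, one precinct consists of three spatially coincident rectangular regions. Each precinct is further divided into non-overlapping rectangular "code blocks", which are the basic unit of entropy coding.
[0035]
The coefficient values after the wavelet transform can be quantized and encoded as they are, but to raise coding efficiency JPEG2000 decomposes the coefficient values into "bit planes" and can rank the bit planes for each pixel or code block.
[0036]
FIG. 5 is an explanatory diagram showing an example of the procedure for ranking bit planes. In this example, an original image (32 × 32 pixels) is divided into four tiles of 16 × 16 pixels, and the precincts and code blocks at decomposition level 1 measure 8 × 8 pixels and 4 × 4 pixels, respectively. Precincts and code blocks are numbered in raster order; here precincts are numbered 0 to 3 and code blocks 0 to 3. Pixel extension beyond the tile boundary uses the mirroring method, the wavelet transform is performed with the reversible (5,3) filter, and the wavelet coefficient values at decomposition level 1 are obtained.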
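The reversible (5,3) filter with mirrored boundary extension mentioned above can be illustrated with a small sketch. This is not taken from the patent: it is a one-dimensional, single-level version of the integer lifting scheme commonly used for the reversible 5/3 wavelet, and the function names `mirror`, `dwt53_forward` and `dwt53_inverse` are hypothetical. A real codec applies the transform along rows and columns of each tile.

```python
def mirror(i, n):
    """Reflect index i into [0, n-1] (whole-sample symmetric extension)."""
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i %= period                      # Python's % yields a non-negative result
    return i if i < n else period - i


def dwt53_forward(x):
    """One level of reversible 5/3 lifting on a 1-D list of integers.

    Returns (low, high): the low-pass and high-pass coefficients.
    """
    n = len(x)

    def sample(i):
        return x[mirror(i, n)]

    def detail(k):                   # predict step, valid for any integer k
        return sample(2 * k + 1) - (sample(2 * k) + sample(2 * k + 2)) // 2

    high = [detail(k) for k in range(n // 2)]
    low = [x[2 * k] + (detail(k - 1) + detail(k) + 2) // 4
           for k in range((n + 1) // 2)]
    return low, high


def dwt53_inverse(low, high):
    """Exactly undo dwt53_forward (integer arithmetic, so lossless)."""
    n = len(low) + len(high)

    def detail(k):                   # mirrored high-pass value at the borders
        if not high:
            return 0
        return high[min(max(k, 0), len(high) - 1)]

    x = [0] * n
    for k in range(len(low)):        # undo the update step (even samples)
        x[2 * k] = low[k] - (detail(k - 1) + detail(k) + 2) // 4
    for k in range(len(high)):       # undo the predict step (odd samples)
        right = x[2 * k + 2] if 2 * k + 2 < n else x[2 * k]
        x[2 * k + 1] = high[k] + (x[2 * k] + right) // 2
    return x
```

Because every step uses integer floor division, the inverse reproduces the input exactly, which is what makes the filter usable for lossless coding.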
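[0037]
FIG. 5 also shows an example of the concept of a typical "layer" structure for tile 0 / precinct 3 / code block 3. The transformed code block is divided into subbands (1LL, 1HL, 1LH, 1HH), and wavelet coefficient values are assigned to each subband.
[0038]
The layer structure is easiest to understand by viewing the wavelet coefficient values from the side (the bit-plane direction). One layer consists of an arbitrary number of bit planes. In this example, layers 0, 1, 2 and 3 consist of 1, 3, 1 and 3 bit planes, respectively. Layers containing bit planes closer to the LSB (least significant bit) are quantized first; conversely, layers closer to the MSB (most significant bit) remain unquantized until the end. Discarding layers starting from those closest to the LSB is called truncation, and it allows the quantization rate to be controlled finely.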
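As a rough illustration of truncation, the sketch below counts how many bit planes are lost when a number of layers is discarded from the LSB side and applies that to a coefficient magnitude. It assumes the layer list is ordered from the MSB side to the LSB side, which the text does not state explicitly; the function names are hypothetical.

```python
def planes_to_drop(layer_sizes, layers_discarded):
    """Number of bit planes removed when discarding layers from the LSB side.

    layer_sizes: bit planes per layer, assumed ordered MSB side first.
    """
    if layers_discarded <= 0:
        return 0
    return sum(layer_sizes[-layers_discarded:])


def truncate(coefficient, planes):
    """Zero the `planes` least significant bit planes of the magnitude."""
    sign = -1 if coefficient < 0 else 1
    return sign * ((abs(coefficient) >> planes) << planes)


layers = [1, 3, 1, 3]                # the example layer sizes in the text
dropped = planes_to_drop(layers, 1)  # discard only the layer nearest the LSB
coarse = truncate(45, dropped)       # coefficient with those planes removed
```

Discarding more layers removes more low-order bit planes, which is why the number of discarded layers acts as a fine-grained quantization control.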
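[0039]
The entropy encoding/decoding unit 104 shown in FIG. 1 encodes the tiles 112 of each component 111 by probability estimation from the context and the target bit. In this way, all components 111 of the original image are encoded tile 112 by tile 112. Finally, the tag processing unit 105 combines all the coded data from the entropy encoding/decoding unit 104 into a single piece of code string data (codestream) and attaches tags to it.
[0040]
FIG. 6 shows the schematic structure of one frame of this code string data. Tag information called headers is attached to the head of the code string data (the main header) and to the head of the code data (bit stream) of each tile (the tile-part header, which carries information such as tile boundary position and tile boundary direction), and the coded data of each tile follows. The main header describes the coding parameters and quantization parameters. A tag (end of codestream) is placed again at the end of the code string data.
[0041]
In decoding, conversely, image data is generated from the code string data of each tile 112 of each component 111. The tag processing unit 105 interprets the tag information attached to the externally input code string data, decomposes the code string data into the code string data of each tile 112 of each component 111, and decoding (decompression) is performed for each of these. The positions of the bits to be decoded are determined in the order given by the tag information in the code string data, and the quantization/inverse quantization unit 103 generates a context from the arrangement of the already decoded bits surrounding the target bit position. The entropy encoding/decoding unit 104 decodes by probability estimation from this context and the code string data, generates the target bit, and writes it at the target bit position. Because the data decoded in this way is divided spatially by frequency band, the two-dimensional wavelet transform/inverse transform unit 102 applies the inverse two-dimensional wavelet transform to restore each tile of each component of the image data. The color space transform/inverse transform unit 101 converts the restored data into image data of the original color system.
[0042]
This concludes the outline of the "JPEG2000 algorithm". The "Motion JPEG2000 algorithm" extends this scheme for still images, that is, single frames, to multiple frames. As shown in FIG. 7, Motion JPEG2000 produces a moving image by displaying single-frame JPEG2000 images in succession at a predetermined frame rate (the number of frames reproduced per unit time).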
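[0043]
A first embodiment of the present invention is described below. The example concerns moving image compression/decompression technology typified by Motion JPEG2000, but the present invention is of course not limited to the following description.
[0044]
FIG. 8 is a block diagram showing the schematic configuration of a movie camera system 1 to which the present invention is applied. As shown in FIG. 8, the movie camera system 1, to which the moving image display system of the present invention is applied, connects an image recording device 1a, which is a movie camera, and a moving image reproduction device 1b, which is a personal computer, through a network 1c, which is the Internet.
[0045]
The following describes the image recording device 1a, which provides the characteristic functions of the present invention. The moving image reproduction device 1b may be any standard system capable of decompressing code string data compressed with Motion JPEG2000, so its detailed description is omitted.
[0046]
As shown in FIG. 8, the image recording device 1a includes an image input device 2 that captures moving images and an image compression device 3 that compression-codes the captured image data. The image compression device 3 embodies the image processing device of the present invention, which compresses moving image data.
[0047]
FIG. 9 is a block diagram showing an example of the hardware configuration of the image recording device 1a. As shown in FIG. 9, the image recording device 1a includes a CPU (central processing unit) 11, the main part of the computer, which centrally controls each part. Connected to the CPU 11 through a bus 14 are a memory 12, a storage medium consisting of various ROMs (read-only memories) and RAMs (random access memories), a communication interface 13 for communicating with the network 1c, and an operation panel 18 that accepts various operations from the user.
[0048]
In the image recording device 1a, a logic circuit 19 is connected to the CPU 11 through the bus 14 in addition to the image input device 2 and the image compression device 3 described above.
[0049]
The memory 12 (its ROM) of the image recording device 1a configured in this way stores control programs such as a moving image processing program that processes moving images. This moving image processing program embodies the program of the present invention. The function of the code string conversion device 4 is realized by the processing that the CPU 11 executes on the basis of this moving image processing program.
[0050]
Various kinds of media can be used as the memory 12, including optical discs such as CDs and DVDs, magneto-optical discs, magnetic disks such as flexible disks, and semiconductor memory. The program may also be downloaded from the network 1c and installed in the memory 12. In that case, the storage device that stores the program on the sending server is also a storage medium of this invention. The program may run on a predetermined OS (operating system), in which case it may have the OS take over part of the various processes described later, or it may be included as part of a group of program files constituting predetermined application software, an OS, or the like.
[0051]
The operation of each part of the image recording device 1a is now described briefly. The image input device 2 of the image recording device 1a captures moving images frame by frame using a photoelectric conversion device such as a CCD or MOS image sensor and outputs the digital pixel value signal of the moving image to the image compression device 3.
[0052]
The image compression device 3 of the image recording device 1a compression-codes the digital pixel value signal of the moving image according to the "Motion JPEG2000 algorithm". As shown in FIG. 10, the image compression device 3 consists of a color space transform unit 31, a two-dimensional wavelet transform unit 32, a quantization unit 33, an entropy coding unit 34, a post-quantization unit 35, and an arithmetic coding unit 36. The functions of these units are realized by processing that the CPU 11 performs according to the moving image processing program described above. When real-time performance is important, processing must be accelerated, and it is then desirable to realize the functions of these units through the operation of the logic circuit 19.
[0053]
Next, the operation of each unit of the image compression device 3 is described briefly. The color space transform unit 31 converts the digital pixel value signal of the moving image input from the image input device 2 from RGB to YUV or YCbCr, and the two-dimensional wavelet transform unit 32 applies the two-dimensional wavelet transform to each color component. The quantization unit 33 then divides the wavelet coefficients by a suitable quantization denominator, the entropy coding unit 34 produces lossless code, the post-quantization unit 35 performs bit truncation (discarding of code), and the arithmetic coding unit 36 forms the code into the JPEG2000 code format. Through this series of processes, the moving image data of each of the R, G and B components of the original moving image is divided, frame by frame, into one or more (usually more) tiles and becomes coded data that is hierarchically compression-coded tile by tile.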
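[0054]
The post-quantization unit 35, which provides the characteristic functions of the present embodiment, is now described in detail. FIG. 11 is a block diagram schematically showing the configuration of the post-quantization unit 35. As shown in FIG. 11, the post-quantization unit 35 includes a speed estimation unit 41, a quantization table determination unit 42, a code discarding unit 43, and a masking control unit 44.
[0055]
The speed estimation unit 41 functions as the motion amount estimation device. It estimates the motion amount (speed) of the image from the information in the code blocks created by the entropy coding unit 34 and sends the estimated motion amount (speed) to the masking control unit 44.
[0056]
The masking control unit 44 adjusts the truncation amount of the quantization table (the number of bit planes removed) for each code block.
[0057]
The quantization table determination unit 42 determines a quantization table according to the compression rate given by the CPU 11 and the motion amount (speed) of the image estimated by the speed estimation unit 41, and supplies the quantization table to the code discarding unit 43.
[0058]
The code discarding unit 43 uses the quantization table determined by the quantization table determination unit 42, together with the per-code-block truncation amount (number of bit planes removed) obtained when the masking control unit 44 adjusts that table for each code block, to discard code from the code with no bit planes (or sub-bit planes) removed until the predetermined compression rate is reached.
[0059]
The method by which the speed estimation unit 41 estimates the motion amount (speed) of the image is described next. FIG. 12 illustrates the basic idea of this method. In general, when motion occurs within an interlaced image, the edges of the moving object in the frame data take a line-by-line comb-tooth shape (hereinafter called the interlace comb). As shown in FIG. 12, in an interlaced image of a fast-moving object the interlace comb is long in the horizontal direction, whereas in an image of a slowly moving object it is short. It is also known that horizontal edges of an image appear strongly in the 1LH component of the wavelet transform coefficients. That is, the faster the image moves, the longer the high-frequency horizontal edges become, so the faster a code block is, the larger the sum of the absolute values of its 1LH coefficients and the larger the sum of its lossless 1LH code amount. The speed estimation unit 41 uses this property to estimate the motion amount (speed) of the image independently for each frame.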
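The relationship between comb length and the 1LH component can be checked on a synthetic frame. The sketch below is only an illustration under stated assumptions: it uses a Haar-style 2 × 2 analysis instead of the (5,3) filter, and it sums absolute coefficient values as a stand-in for the lossless code amount that the patent actually measures. The function names and the test frame are hypothetical.

```python
def interlaced_frame(width, height, edge_x, shift):
    """Weave two fields of a vertical step edge that moves `shift` pixels.

    Even lines come from the first field (edge at edge_x), odd lines from
    the second field (edge at edge_x + shift), producing an interlace comb.
    """
    frame = []
    for y in range(height):
        x0 = edge_x if y % 2 == 0 else edge_x + shift
        frame.append([255 if x >= x0 else 0 for x in range(width)])
    return frame


def haar_lh_hl_energy(frame):
    """Sum of absolute level-1 LH and HL coefficients (Haar-style)."""
    lh = hl = 0
    for y in range(0, len(frame) - 1, 2):
        for x in range(0, len(frame[0]) - 1, 2):
            a, b = frame[y][x], frame[y][x + 1]
            c, d = frame[y + 1][x], frame[y + 1][x + 1]
            lh += abs((a + b) - (c + d))   # vertical difference: horizontal edges
            hl += abs((a - b) + (c - d))   # horizontal difference: vertical edges
    return lh, hl


for shift in (0, 4, 8):
    lh, hl = haar_lh_hl_energy(interlaced_frame(32, 8, 5, shift))
    print(shift, lh, hl, lh / hl)
```

In this toy frame the LH energy grows in proportion to the shift between fields while the HL energy stays constant, so the LH/HL ratio rises with speed, which is the property the estimation method relies on.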
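[0060]
FIG. 13 is a block diagram schematically showing the configuration of the speed estimation unit 41. As shown in FIG. 13, the speed estimation unit 41 consists of a block selection unit 51, a feature amount calculation unit 52, and a speed determination unit 53. The block selection unit 51 selects the code blocks to be processed. For example, it selects the code blocks in the level-1 subband 1LH in raster order. Alternatively, it may be configured to select all code blocks. The feature amount calculation unit 52 operates on the coefficients or code amounts of the code blocks selected by the block selection unit 51 and calculates a feature amount. The speed determination unit 53 uses the feature amount obtained by the feature amount calculation unit 52 to estimate the motion amount (speed) of the image per code block.
[0061]
The estimation of the motion amount (speed) of the image per code block is described next. As noted above, a code block is a subband divided into smaller blocks; in other words, a code block is a sub-block. In the present embodiment, the motion amount (speed) of an interlaced image is estimated by comparing the level-1 1LH component of the wavelet transform coefficients, in which horizontal edges of the image appear strongly, with the level-1 1HL component, in which vertical edges appear strongly. Furthermore, the motion amount (speed) per code block is estimated by comparing code blocks that fall at the same position when decoded. FIG. 14 shows level-1 subbands that each contain four code blocks. As shown in FIG. 14, code block 1HL_1 is compared with code block 1LH_1, which falls at the same position when decoded. Likewise, 1HL_2 is compared with 1LH_2, 1HL_3 with 1LH_3, and 1HL_4 with 1LH_4. When a subband consists of a single code block, the comparison is made per subband rather than per code block. The motion amount (speed) of the image per code block is then estimated from the result of this comparison.
[0062]
This estimation of the motion amount (speed) per code block is described with reference to the flowchart of FIG. 15. As shown in FIG. 15, the first of the code blocks selected by the block selection unit 51 is acquired (step S1: sub-block acquisition means), the code amounts of 1LH and 1HL for that code block before any bit planes are removed are calculated (steps S2, S3: code amount calculation means), the division (1LH / 1HL) is performed, and the resulting ratio is taken as the feature amount (rate) for motion estimation (step S4). Steps S1 to S4 are executed by the feature amount calculation unit 52.
[0063]
The result (rate) of the division (1LH / 1HL) is then compared with a threshold (th0) (step S5). If rate is larger than th0 (Y in step S5), horizontal edges are taken to appear strongly in the image and the motion amount (speed) of the image for that code block is estimated to be high-speed (step S6). If rate is not larger than th0 (N in step S5), vertical edges are taken to appear strongly and the motion amount (speed) of the image for that code block is estimated to be low-speed (step S7). Steps S5 to S7 are executed by the speed determination unit 53. This carries out the function of the sub-block motion amount estimation means.
[0064]
Steps S2 to S7 are repeated until all the code blocks selected by the block selection unit 51 have been processed (steps S8, S9).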
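The per-code-block flow of steps S1 to S9 can be sketched as follows. This is a minimal illustration, not the patented implementation: the code amounts are passed in as plain numbers (in bytes, say), the threshold value is made up because the text gives none, and the comparison is written as `lh > th0 * hl` so that a 1HL code amount of zero does not cause a division by zero, a case the text does not address.

```python
def estimate_block_speeds(lh_code_amounts, hl_code_amounts, th0):
    """Classify each code block as "fast" or "slow" from its 1LH/1HL ratio.

    lh_code_amounts[i] and hl_code_amounts[i] are the pre-truncation code
    amounts of the 1LH and 1HL code blocks at the same decoded position.
    """
    if len(lh_code_amounts) != len(hl_code_amounts):
        raise ValueError("1LH and 1HL lists must pair up block by block")
    speeds = []
    for lh, hl in zip(lh_code_amounts, hl_code_amounts):
        # Equivalent to (lh / hl) > th0 whenever hl > 0 (steps S4 and S5).
        is_fast = lh > th0 * hl
        speeds.append("fast" if is_fast else "slow")   # steps S6 / S7
    return speeds


# Hypothetical code amounts for the four code-block pairs of FIG. 14.
lh = [1200, 300, 900, 100]
hl = [400, 600, 450, 0]
print(estimate_block_speeds(lh, hl, th0=1.5))
```

Only data from the current frame is needed, which reflects the point made in the text that no inter-frame difference, and hence no stored previous frame, is required.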
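[0065]
That is, when each level-1 subband contains four code blocks as shown in FIG. 14, the above processing is repeated once for each code block of the level-1 coefficients, so the speed is estimated for four code blocks. As an example of the number of level-1 code blocks, if the code block size is 32 × 32 and the 1LH coefficients measure 256 × 128, the number of code blocks is (256/32) × (128/32) = 8 × 4 = 32.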
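The count in this example can be reproduced with a short helper. Rounding up for subbands whose size is not a multiple of the code block size is an assumption added here; the example in the text divides evenly.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def count_code_blocks(subband_w, subband_h, block_w, block_h):
    """Number of code blocks needed to cover one subband."""
    return ceil_div(subband_w, block_w) * ceil_div(subband_h, block_h)


print(count_code_blocks(256, 128, 32, 32))
```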
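[0066]
In this way, the code amounts of the sub-blocks included in the high-frequency subbands are calculated block by block, and the motion amount per code block is estimated from these code amounts. Because no inter-frame difference needs to be taken, memory consumption is reduced and processing time is shortened, so the motion amount per code block can be estimated quickly and accurately.
[0067]
The motion amount (speed) per code block estimated in this way is sent from the speed estimation unit 41 to the masking control unit 44, so that the most suitable processing (such as masking) can be applied to each part of an image in which fast-moving and slow-moving objects are mixed.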
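[0068]
Next, a second embodiment of the present invention is described with reference to FIGS. 16 to 18. Parts identical to those described in the first embodiment are given the same reference numerals and their description is omitted. Whereas the first embodiment estimates the motion amount (speed) of the image per code block, the present embodiment estimates both the motion amount (speed) per code block and the motion amount (speed) of the entire frame image.
[0069]
FIG. 16 is a block diagram schematically showing the configuration of the speed estimation unit 41 of the second embodiment. As shown in FIG. 16, the speed estimation unit 41 consists of a block selection unit 61, a feature amount calculation unit 62, and a speed determination unit 63. The block selection unit 61 selects the code blocks to be processed. For example, it selects the code blocks in the level-1 subband 1LH in raster order. Alternatively, it may be configured to select all code blocks. The feature amount calculation unit 62 operates on the coefficients or code amounts of the code blocks selected by the block selection unit 61 and calculates a feature amount. The speed determination unit 63 uses the feature amount obtained by the feature amount calculation unit 62 to estimate the motion amount (speed) of the image per code block and the motion amount (speed) of the entire frame image.
[0070]
The estimation of the motion amount (speed) per code block was described in the first embodiment and is not repeated here.
[0071]
Next, the estimation of the motion amount (speed) of the entire frame image from the per-code-block estimates is described. The description assumes that, as explained in the first embodiment, the motion amount (speed) per code block has been estimated for four code blocks.
[0072]
In the present embodiment, the motion amount (speed) of the entire frame image is estimated from the proportion of per-code-block motion amounts (speeds). More specifically, as shown in FIG. 17, if the ratio of high-speed to low-speed estimates for the four code blocks is
fast : slow = 3 : 1
then the proportion of high-speed blocks is high, so the motion amount (speed) of the entire frame image is estimated to be high-speed.
[0073]
This is only an example. The configuration may allow free setting of whether the per-code-block estimates make the entire frame image more likely to be judged high-speed or more likely to be judged low-speed.
[0074]
The estimation of the motion amount (speed) per code block and of the entire frame image is described with reference to the flowchart of FIG. 18. As shown in FIG. 18, after a counter is initialized (step S21) and the total number of code blocks selected by the block selection unit 51 is set (step S22), the code amounts of 1LH and 1HL before any bit planes are removed are calculated for each pair of corresponding code blocks (steps S23, S24), and the division (1LH / 1HL) is performed (step S25).
[0075]
The result (rate) of the division (1LH / 1HL) is then compared with a threshold (th1) (step S26), and if rate is larger than th1 (Y in step S26), the count of code blocks whose rate exceeds th1 is incremented (step S27).
[0076]
Steps S23 to S27 are repeated until all the code blocks selected by the block selection unit 51 have been processed.
[0077]
When steps S23 to S27 have been completed for all the code blocks selected by the block selection unit 51 (Y in step S28), the number of code blocks whose rate exceeds th1 is divided by the total number of code blocks, and the resulting proportion is taken as the feature amount (speed) for motion estimation (step S29). Steps S21 to S29 are executed by the feature amount calculation unit 52.
[0078]
The feature amount (speed) obtained by the feature amount calculation unit 52 is then compared with a threshold (th2) (step S30), and high or low speed is judged from the result. If speed is larger than th2 (Y in step S30), the proportion of high-speed code blocks is taken to be high and the motion amount (speed) of the entire frame image is estimated to be high-speed (step S31). If speed is not larger than th2 (N in step S30), the proportion of low-speed code blocks is taken to be high and the motion amount (speed) of the entire frame image is estimated to be low-speed (step S32). Steps S30 to S32 are executed by the speed determination unit 53. This carries out the function of the frame motion amount estimation means.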
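The frame-level flow of steps S21 to S32 can be sketched in the same spirit. The thresholds `th1` and `th2` are placeholders, since the text gives no values; the code amounts are hypothetical numbers; and the ratio test is written as `lh > th1 * hl` to avoid dividing by a zero 1HL code amount, which is an assumption of this sketch.

```python
def estimate_frame_speed(lh_code_amounts, hl_code_amounts, th1, th2):
    """Return ("fast" | "slow", speed) for a whole frame.

    speed is the fraction of code blocks whose 1LH/1HL ratio exceeds th1.
    """
    total = len(lh_code_amounts)                 # step S22
    if total == 0 or total != len(hl_code_amounts):
        raise ValueError("need matching, non-empty 1LH and 1HL lists")
    fast_blocks = 0                              # step S21: counter reset
    for lh, hl in zip(lh_code_amounts, hl_code_amounts):
        if lh > th1 * hl:                        # steps S25 and S26
            fast_blocks += 1                     # step S27
    speed = fast_blocks / total                  # step S29
    label = "fast" if speed > th2 else "slow"    # steps S30 to S32
    return label, speed


# Three of four blocks exceed th1, matching the 3 : 1 example of FIG. 17.
print(estimate_frame_speed([1200, 300, 900, 100], [400, 600, 450, 0],
                           th1=1.5, th2=0.5))
```

Raising or lowering `th2` biases the frame-level decision toward "slow" or "fast", which corresponds to the adjustable configuration mentioned in paragraph [0073].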
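[0079]
In this way, the code amounts of the sub-blocks included in the high-frequency subbands are calculated block by block, the motion amount per code block is estimated from these code amounts, and the motion amount of the entire frame is estimated from the estimated per-sub-block motion amounts. The motion amount (speed) of the entire frame estimated in this way is sent from the speed estimation unit 41 to the quantization table determination unit 42, which can then select a quantization table suited to the motion amount (speed). Image quality can thus be adjusted coarsely according to the motion amount of the entire frame and finely according to the motion amount of each code block, which enables efficient image quality control.
[0080]
The description above applies the image recording device 1a of the present invention to a movie camera, but the image recording device 1a can also be applied to information terminal devices such as personal digital assistants (PDAs) and mobile phones.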
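[0081]
[Effects of the invention]
The present invention provides the following effects.
(1) The code amounts of the sub-blocks included in the high-frequency subbands are calculated block by block and the motion amount per code block is estimated from these code amounts. Because no inter-frame difference needs to be taken, memory consumption is reduced and processing time is shortened, so the motion amount per code block can be estimated quickly and accurately. In particular, comparing the level-1 1LH component of the wavelet transform coefficients, in which horizontal edges of the image appear strongly, with the level-1 1HL component, in which vertical edges appear strongly, allows the motion amount (speed) of an interlaced image to be estimated reliably.
[0082]
(2) Estimating the motion amount of the entire frame from the estimated per-sub-block motion amounts allows image quality to be adjusted coarsely according to the motion amount of the entire frame and finely according to the motion amount of each code block, which enables efficient image quality control. Moreover, because the motion amount of the entire frame image is estimated from the proportion of results obtained by the sub-block motion amount estimation means in comparing the 1LH code amount with the 1HL code amount for all sub-blocks included in the high-frequency subbands, it can be estimated simply.
[0083]
(3) Because the code amount of a sub-block is a losslessly compressed code amount, the estimation accuracy of the motion amount can be improved.
[0084]
(4) Because the code amount of a sub-block is the code amount before bit truncation, the estimation accuracy of the motion amount can be improved.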
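[Brief description of the drawings]
[FIG. 1] A functional block diagram of a system that implements the hierarchical coding algorithm underlying the JPEG2000 scheme on which the present invention is premised.
[FIG. 2] An explanatory diagram showing the rectangular regions into which each component of the original image is divided.
[FIG. 3] An explanatory diagram showing the subbands at each decomposition level when the number of decomposition levels is 3.
[FIG. 4] An explanatory diagram showing precincts.
[FIG. 5] An explanatory diagram showing an example of the procedure for ranking bit planes.
[FIG. 6] An explanatory diagram showing the schematic structure of one frame of code string data.
[FIG. 7] An explanatory diagram showing the concept of Motion JPEG2000.
[FIG. 8] A block diagram showing the schematic configuration of the movie camera system of the first embodiment of the present invention.
[FIG. 9] A block diagram showing an example of the hardware configuration of the image recording device.
[FIG. 10] A block diagram schematically showing the configuration of the image compression device.
[FIG. 11] A block diagram schematically showing the configuration of the post-quantization unit.
[FIG. 12] An explanatory diagram of the basic idea of the method by which the speed estimation unit estimates the motion amount (speed) of the image.
[FIG. 13] A block diagram schematically showing the configuration of the speed estimation unit.
[FIG. 14] An explanatory diagram showing level-1 subbands that each contain four code blocks.
[FIG. 15] A flowchart showing the flow of the process of estimating the motion amount (speed) of the image per code block.
[FIG. 16] A block diagram schematically showing the configuration of the speed estimation unit of the second embodiment of the present invention.
[FIG. 17] An explanatory diagram showing an example of the result of estimating the motion amount (speed) of the image per code block.
[FIG. 18] A flowchart showing the flow of the process of estimating the motion amount (speed) of the image per code block and of the entire frame image.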
[Explanation of reference numerals]
12 Storage medium
41 Motion amount estimation device
[0001]
BACKGROUND OF THE INVENTION
  The present invention relates to a motion estimation device, a program, a storage medium, and a motion estimation method.
[0002]
[Prior art]
  Advances in image input and output technology have sharply increased the demand for higher-definition images in recent years. Taking the digital camera as an example of an image input device, high-performance charge-coupled devices (CCDs) with 3 million or more pixels have fallen in price and are now widely used even in popularly priced products. Products with 5 million pixels are about to appear, and this upward trend in pixel count is said to continue for some time.
[0003]
  Image output and display devices have likewise become remarkably finer and cheaper, both in the hard-copy field (laser printers, ink jet printers, sublimation printers and the like) and in the soft-copy field (CRTs and flat panel displays such as LCDs (liquid crystal display devices) and PDPs (plasma display devices)).
[0004]
  Due to the market launch of these high-performance, low-priced image input / output products, high-definition images have become popular, and it is expected that demand for high-definition images will increase in all situations. In fact, the development of technology related to networks such as personal computers and the Internet is accelerating these trends. In particular, recently, mobile devices such as mobile phones and notebook personal computers have become very popular, and opportunities for transmitting or receiving high-definition images from any point using communication means are rapidly increasing.
[0005]
  Against this background, it is inevitable that the demand for higher performance or higher functionality for image compression / decompression technology that facilitates the handling of high-definition images will become stronger in the future.
[0006]
  Therefore, in recent years, a new method called JPEG2000, which can restore high-quality images even at a high compression rate, is being standardized as one of image compression methods that satisfy these requirements. In JPEG2000, it is possible to perform compression / decompression processing in a small memory environment by dividing an image into rectangular regions (tiles). That is, each tile becomes a basic unit for executing the compression / decompression process, and the compression / decompression operation can be performed independently for each tile.
[0007]
  Further, such a JPEG 2000 image of one frame can be converted into a moving image by being continuously displayed at a predetermined frame rate (the number of frames reproduced per unit time). As a standard for moving a JPEG2000 image frame by frame and displaying a moving image, there is an international standard called Motion JPEG2000.
[0008]
  In addition, as in the Motion JPEG2000 system, there has also been proposed one in which image data is compression-coded using discrete wavelet transform (see, for example, Patent Document 1).
[0009]
  The technique disclosed in Patent Document 1 not only applies the discrete wavelet transform to pixel values and compression-codes them, but also takes the correlation between images in different frames, thereby removing the redundancy in moving image data when there is no image motion between frames. The data compression rate can therefore be improved further.
[0010]
[Patent Document 1]
JP 2001-309381 A
[0011]
[Problems to be solved by the invention]
  With the method described in Patent Document 1, however, obtaining the correlation between frames requires the complicated process of decoding the encoded orthogonal transform coefficient values and then inverse-quantizing them, so processing takes time. Memory is also needed to store the previous frame used to obtain the inter-frame correlation.
[0012]
  An object of the present invention is to provide a motion amount estimation device, a program, a storage medium, and a motion amount estimation method capable of obtaining a motion amount of an image with high speed and accuracy.
[0013]
[Means for Solving the Problems]
  The motion amount estimation apparatus according to the first aspect of the present invention comprises: means for selecting, block by block, sub-blocks corresponding to the high-frequency subbands of the LH component (1LH) and the HL component (1HL) of the first level from code string data that has been hierarchically compression-coded by dividing each frame of the interlaced images constituting a moving image into a plurality of blocks and applying the discrete wavelet transform to the pixel values of each block; means for calculating the code amount of the selected 1LH sub-block and the code amount of the selected 1HL sub-block and calculating the ratio between them (1LH code amount / 1HL code amount); and means for comparing the calculated ratio with a threshold value and estimating that the motion amount of the sub-block is high-speed if the ratio is larger and low-speed if it is not.
[0014]
  The code amounts of the sub-blocks included in the high-frequency subbands are calculated block by block, and the motion amount per code block or the like is estimated from these code amounts. Because no inter-frame difference needs to be taken, memory consumption is reduced and processing time is shortened, so the motion amount per code block or the like can be estimated quickly and accurately. In particular, comparing the level-1 1LH component of the wavelet transform coefficients, in which horizontal edges of the image appear strongly, with the level-1 1HL component, in which vertical edges appear strongly, allows the motion amount (speed) of an interlaced image to be estimated reliably.
[0015]
  According to a second aspect of the present invention, the motion amount estimation apparatus according to the first aspect further comprises: means for dividing, over the entire frame, the number of sub-blocks estimated to be high-speed or the number of sub-blocks estimated to be low-speed by the total number of sub-blocks; and means for comparing the result of the division with a threshold value and estimating whether the motion amount of the entire frame is high-speed or low-speed.
[0016]
  The motion amount of the entire frame is thus estimated from the estimated motion amounts of the individual sub-blocks. Image quality can then be adjusted coarsely according to the motion amount of the entire frame and finely according to the motion amount of each code block or the like, which enables efficient image quality control. Moreover, because the motion amount of the entire frame image is estimated from the proportion of results obtained by comparing the 1LH code amount with the 1HL code amount for all sub-blocks included in the high-frequency subbands, it can be estimated simply.
[0017]
  According to a third aspect of the present invention, in the motion amount estimation apparatus according to the first or second aspect, the calculated code amount of the sub-block is a lossless compressed code amount. Therefore, it is possible to improve the estimation accuracy of the motion amount.
[0018]
  According to a fourth aspect of the present invention, in the motion amount estimation apparatus according to any one of the first to third aspects, the calculated code amount of the sub-block is the code amount before bit truncation. Accordingly, it is possible to improve the estimation accuracy of the motion amount.
[0019]
  A fifth aspect of the present invention is a program that causes a computer to execute the function of each unit of the motion amount estimating apparatus according to any one of the first to fourth aspects. A sixth aspect of the present invention is a computer-readable storage medium storing a program that causes a computer to execute the function of each means of the motion amount estimating apparatus according to any one of the first to fourth aspects. Therefore, it is possible to obtain the same operation as in the first to fourth aspects by executing the program directly on the computer or by reading the program recorded on the storage medium into the computer and executing it.
[0020]
  The motion amount estimation method according to the invention of claim 7 comprises: a step of selecting, block by block, sub-blocks corresponding to the high-frequency subbands of the LH component (1LH) and the HL component (1HL) of the first level from code string data that has been hierarchically compression-coded by dividing each frame of the interlaced images constituting a moving image into a plurality of blocks and applying the discrete wavelet transform to the pixel values of each block; a step of calculating the code amount of the selected 1LH sub-block and the code amount of the selected 1HL sub-block and calculating the ratio between them (1LH code amount / 1HL code amount); and a step of comparing the calculated ratio with a threshold value and estimating that the motion amount of the sub-block is high-speed if the ratio is larger and low-speed if it is not. Therefore, it is possible to obtain the same effect as that of the first aspect.
[0021]
  The invention according to claim 8 is the motion amount estimation method according to claim 7, further comprising: a step of dividing, over the entire frame, the number of sub-blocks estimated to be high-speed or the number of sub-blocks estimated to be low-speed by the total number of sub-blocks; and a step of comparing the result of the division with a threshold value and estimating whether the motion amount of the entire frame is high-speed or low-speed. Therefore, it is possible to obtain the same effect as that of the second aspect.
[0022]
  According to a ninth aspect of the present invention, in the motion amount estimation method according to the seventh or eighth aspect, the calculated code amount of the sub-block is a losslessly compressed code amount. Therefore, it is possible to obtain the same effect as that of the third aspect.
[0023]
  According to a tenth aspect of the present invention, in the motion amount estimation method according to any one of the seventh to ninth aspects, the calculated code amount of the sub-block is a code amount before bit truncation. Therefore, it is possible to obtain the same effect as that of the fourth aspect.
[0024]
DETAILED DESCRIPTION OF THE INVENTION
  First, an outline of the “hierarchical encoding algorithm” and the “JPEG2000 algorithm” that are the premise of the present embodiment will be described.
[0025]
  FIG. 1 is a functional block diagram of a system that realizes the hierarchical encoding algorithm that is the basis of the JPEG2000 system. This system consists of a color space transform / inverse transform unit 101, a two-dimensional wavelet transform / inverse transform unit 102, a quantization / inverse quantization unit 103, an entropy encoding / decoding unit 104, and a tag processing unit 105.
[0026]
  One of the biggest differences between this system and the conventional JPEG algorithm is the transform. JPEG uses the discrete cosine transform (DCT), whereas this hierarchical coding algorithm uses the discrete wavelet transform (DWT) in the two-dimensional wavelet transform / inverse transform unit 102. Compared with the DCT, the DWT has the advantage of better image quality at high compression, and this is one of the main reasons the DWT was adopted in JPEG2000, the successor to JPEG.
[0027]
  Another major difference is that in this hierarchical encoding algorithm, a functional block of the tag processing unit 105 is added in order to perform code formation at the final stage of the system. The tag processing unit 105 generates compressed data as code string data during an image compression operation, and interprets code string data necessary for decompression during the decompression operation. With the code string data, JPEG2000 can realize various convenient functions. For example, the compression / decompression operation of a still image can be freely stopped at an arbitrary layer (decomposition level) corresponding to octave division in block-based DWT (see FIG. 3 described later).
[0028]
  In many cases, the color space transform / inverse transform unit 101 is connected to the input / output portion of the original image. It corresponds, for example, to the part that converts from the RGB color system, composed of the primary color components R (red), G (green) and B (blue), or from the YMC color system, composed of the complementary color components Y (yellow), M (magenta) and C (cyan), to the YUV or YCbCr color system, or performs the inverse conversion.
[0029]
  Next, the JPEG2000 algorithm will be described.
[0030]
  As shown in FIG. 2, in a color image, each component 111 (RGB primary color system here) of an original image is generally divided by a rectangular area. This divided rectangular area is generally called a block or a tile. In JPEG2000, it is generally called a tile. Therefore, such a divided rectangular area is hereinafter referred to as a tile. (In the example of FIG. 2, each component 111 is divided into a total of 16 rectangular tiles 112, 4 × 4 in length and breadth). When such individual tiles 112 (R00, R01,..., R15 / G00, G01,..., G15 / B00, B01,..., B15 in the example of FIG. 2) execute the image data compression / decompression process. It becomes the basic unit. Therefore, the compression / decompression operation of the image data is performed independently for each component and for each tile 112.
[0031]
  At the time of encoding image data, the data of each tile 112 of each component 111 is input to the color space conversion / inverse conversion unit 101 in FIG. 1 and subjected to color space conversion, after which the two-dimensional wavelet transform / inverse transform unit 102 applies a two-dimensional wavelet transform (forward transform) to divide the data spatially into frequency bands.
[0032]
  FIG. 3 shows subbands at each decomposition level when the number of decomposition levels is three. That is, the tile original image (0LL) (decomposition level 0) obtained by tile division of the original image is subjected to a two-dimensional wavelet transform and separated into the subbands (1LL, 1HL, 1LH, 1HH) shown at decomposition level 1. Subsequently, the low-frequency component 1LL in this hierarchy is subjected to a two-dimensional wavelet transform and separated into the subbands (2LL, 2HL, 2LH, 2HH) shown at decomposition level 2. Similarly, the low-frequency component 2LL is also subjected to a two-dimensional wavelet transform and separated into the subbands (3LL, 3HL, 3LH, 3HH) shown at decomposition level 3. In FIG. 3, the subbands to be encoded at each decomposition level are indicated by shading. For example, when the number of decomposition levels is 3, the shaded subbands (3HL, 3LH, 3HH, 2HL, 2LH, 2HH, 1HL, 1LH, 1HH) are to be encoded, and the 3LL subband is not encoded.
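The octave structure of the decomposition can be made concrete by listing the subbands and their sizes. The Python sketch below assumes a tile whose origin is at (0, 0), so that each low-pass band receives the rounded-up half and each high-pass band the rounded-down half of the samples; the function name `subband_sizes` is hypothetical.

```python
def subband_sizes(width, height, levels):
    """Return {subband name: (width, height)} for an octave decomposition."""
    sizes = {}
    w, h = width, height
    for level in range(1, levels + 1):
        low_w, low_h = (w + 1) // 2, (h + 1) // 2   # low-pass halves
        high_w, high_h = w // 2, h // 2             # high-pass halves
        sizes[f"{level}HL"] = (high_w, low_h)   # high-pass horizontally
        sizes[f"{level}LH"] = (low_w, high_h)   # high-pass vertically
        sizes[f"{level}HH"] = (high_w, high_h)
        w, h = low_w, low_h                     # next level splits the LL band
    sizes[f"{levels}LL"] = (w, h)               # remaining low-frequency band
    return sizes
```

For a 64 × 64 tile and three levels, this yields nine high-frequency subbands plus the remaining 3LL band of 8 × 8 coefficients.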
[0033]
  Next, the bits to be encoded are determined in the specified encoding order, and the quantization / inverse quantization unit 103 shown in FIG. 1 generates a context from the bits around each target bit.
[0034]
  The wavelet coefficients that have undergone the quantization process are divided into non-overlapping rectangles called “precincts” for each subband. This was introduced to use memory efficiently in implementation. As shown in FIG. 4, one precinct consists of three rectangular regions that are spatially coincident. Further, each precinct is divided into non-overlapping rectangular “code blocks”. This is the basic unit for entropy coding.
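The division of a region into non-overlapping rectangles, whether a subband into precincts or a precinct into code blocks, can be sketched as below. This is an illustrative Python fragment; the function name `partition` and the (x0, y0, x1, y1) rectangle convention are assumptions, not taken from the specification.

```python
def partition(width, height, block_w, block_h):
    """Return non-overlapping rectangles (x0, y0, x1, y1) in raster order.

    Rectangles at the right and bottom edges are clipped to the region.
    """
    rects = []
    for y0 in range(0, height, block_h):
        for x0 in range(0, width, block_w):
            rects.append((x0, y0,
                          min(x0 + block_w, width),
                          min(y0 + block_h, height)))
    return rects
```

For instance, partitioning an 8 × 8 region into 4 × 4 blocks gives four rectangles, whose list indices 0 to 3 correspond to raster-order numbering.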
[0035]
  The coefficient values after the wavelet transform can be quantized and encoded as they are, but in JPEG2000, in order to increase the encoding efficiency, the coefficient values are decomposed into “bit plane” units, and the “bit planes” can be ranked for each pixel or code block.
[0036]
  Here, FIG. 5 is an explanatory diagram showing an example of a procedure for ranking the bit planes. As shown in FIG. 5, in this example the original image (32 × 32 pixels) is divided into four 16 × 16 pixel tiles, and the sizes of the precinct and the code block at decomposition level 1 are 8 × 8 pixels and 4 × 4 pixels, respectively. The precincts and the code blocks are numbered in raster order; in this example, the precincts are numbered 0 to 3 and the code blocks are numbered 0 to 3. A mirroring method is used for pixel expansion outside the tile boundary, the wavelet transform is performed with a reversible (5, 3) filter, and the wavelet coefficient values of decomposition level 1 are obtained.
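The reversible (5, 3) filter with mirroring at the boundary can be illustrated in one dimension with the usual integer lifting steps. The Python sketch below is a simplified illustration under stated assumptions: it handles only even-length signals, and the names `forward_53` and `inverse_53` are hypothetical. A two-dimensional transform would apply the same steps along the rows and the columns of a tile.

```python
def forward_53(x):
    """One level of the reversible 5/3 lifting transform (even length only)."""
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("sketch assumes an even-length signal")
    half = n // 2
    high = []
    for i in range(half):
        # Mirror at the right boundary: x[n] is replaced by x[n - 2].
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        high.append(x[2 * i + 1] - (x[2 * i] + right) // 2)
    low = []
    for i in range(half):
        # Mirror at the left boundary: high[-1] is replaced by high[0].
        left = high[i - 1] if i > 0 else high[0]
        low.append(x[2 * i] + (left + high[i] + 2) // 4)
    return low, high


def inverse_53(low, high):
    """Undo forward_53 exactly by reversing the lifting steps."""
    half = len(low)
    n = 2 * half
    x = [0] * n
    for i in range(half):
        left = high[i - 1] if i > 0 else high[0]
        x[2 * i] = low[i] - (left + high[i] + 2) // 4
    for i in range(half):
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = high[i] + (x[2 * i] + right) // 2
    return x
```

Since every lifting step adds or subtracts an integer that the inverse can recompute, the round trip is lossless, which is what "reversible" refers to here.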
[0037]
  An explanatory diagram showing an example of the concept of a typical “layer” configuration for tile 0 / precinct 3 / code block 3 is also shown in FIG. 5. The code block after the transform is divided into subbands (1LL, 1HL, 1LH, 1HH), and wavelet coefficient values are assigned to the subbands.
[0038]
  The layer structure is easy to understand when the wavelet coefficient values are viewed from the horizontal direction (bit plane direction). One layer is composed of an arbitrary number of bit planes. In this example, layers 0, 1, 2, and 3 are made up of 1, 3, 1, and 3 bit planes, respectively. A layer including bit planes closer to the LSB (Least Significant Bit) is subject to quantization first, and conversely, a layer close to the MSB (Most Significant Bit) remains unquantized until the end. The method of discarding layers starting from those close to the LSB is called truncation, and it allows the quantization rate to be finely controlled.
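The effect of truncation on a single coefficient can be sketched as follows. This Python fragment assumes that layer 0 holds the bit plane nearest the MSB and layer 3 the planes nearest the LSB, which the text implies but does not state outright; the names `LAYER_BITPLANES` and `truncate_to_layers` are hypothetical.

```python
LAYER_BITPLANES = [1, 3, 1, 3]  # bit planes in layers 0..3, as in the example


def truncate_to_layers(coeff, layers_kept, layer_bitplanes=LAYER_BITPLANES):
    """Keep the first `layers_kept` layers and discard the rest (LSB side)."""
    if not 0 <= layers_kept <= len(layer_bitplanes):
        raise ValueError("layers_kept out of range")
    # Number of least significant bit planes removed by the truncation.
    dropped = sum(layer_bitplanes[layers_kept:])
    magnitude = (abs(coeff) >> dropped) << dropped
    return -magnitude if coeff < 0 else magnitude
```

Keeping three of the four layers removes the three least significant bit planes, so a coefficient of 183 (binary 10110111) becomes 176 (binary 10110000).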
[0039]
  The entropy encoding / decoding unit 104 illustrated in FIG. 1 performs encoding on the tile 112 of each component 111 by probability estimation from the context and the target bit. In this way, encoding processing is performed in units of tiles 112 for all components 111 of the original image. Finally, the tag processing unit 105 performs a process of combining all the encoded data from the entropy encoding / decoding unit 104 into one code string data (code stream) and adding a tag thereto.
[0040]
  FIG. 6 shows a schematic configuration for one frame of the code string data. A header (the main header, and a tile header containing tile boundary position information, tile boundary direction information, and the like) is added to the head of this code string data and to the head of the code data (bit stream) of each tile, and is followed by the encoded data of each tile. Note that the main header (Main header) describes coding parameters and quantization parameters. A tag (end of codestream) is placed again at the end of the code string data.
[0041]
  On the other hand, at the time of decoding, image data is generated from the code string data of each tile 112 of each component 111, in the reverse of the encoding procedure. In this case, the tag processing unit 105 interprets the tag information added to the code string data input from the outside and decomposes the code string data into the code string data of each tile 112 of each component 111, and decoding processing (decompression processing) is performed for the code string data of each tile 112 of each component 111. At this time, the positions of the bits to be decoded are determined in the order based on the tag information in the code string data, and the quantization / inverse quantization unit 103 generates a context from the sequence of the (already decoded) bits surrounding the target bit position. The entropy encoding / decoding unit 104 performs decoding by probability estimation from this context and the code string data, generates the target bit, and writes it at the position of the target bit. Since the data decoded in this way is spatially divided into frequency bands, the two-dimensional wavelet transform / inverse transform unit 102 applies a two-dimensional inverse wavelet transform to it, thereby restoring each tile of each component of the image data. The restored data is converted to image data of the original color system by the color space conversion / inverse conversion unit 101.
[0042]
  The above is an outline of the “JPEG2000 algorithm”. The “Motion JPEG2000 algorithm” extends this method for still images, that is, for single frames, to a plurality of frames. That is, as shown in FIG. 7, “Motion JPEG2000” produces a moving image by continuously displaying one-frame JPEG2000 images at a predetermined frame rate (the number of frames reproduced per unit time).
[0043]
  Hereinafter, a first embodiment of the present invention will be described. Here, an example relating to a moving image compression / decompression technique typified by Motion JPEG 2000 will be described, but it goes without saying that the present invention is not limited to the contents of the following description.
[0044]
  FIG. 8 is a block diagram showing a schematic configuration of a movie camera system 1 to which the present invention is applied. As shown in FIG. 8, in the movie camera system 1 to which the moving image display system of the present invention is applied, an image recording device 1a that is a movie camera and a moving image playback device 1b that is a personal computer are connected via a network 1c that is the Internet.
[0045]
  In the following, an image recording apparatus 1a that exhibits the characteristic functions of the present invention will be described. The moving image playback apparatus 1b may be a standard system capable of decompressing code string data compressed by the Motion JPEG2000 method, and thus detailed description thereof is omitted.
[0046]
  As shown in FIG. 8, the image recording device 1a includes an image input device 2 that captures a moving image and an image compression device 3 that compresses and encodes the captured image data. The image compression apparatus 3 implements the image processing apparatus of the present invention that performs compression processing of moving image data.
[0047]
  FIG. 9 is a block diagram illustrating an example of a hardware configuration of the image recording apparatus 1a. As shown in FIG. 9, the image recording apparatus 1a includes a CPU (Central Processing Unit) 11 that is the main part of a computer and centrally controls each part. Connected to the CPU 11 via a bus 14 are a memory 12, which is a storage medium including various ROMs (Read Only Memory) and RAMs (Random Access Memory), a predetermined communication interface 13 that communicates with the network 1c, and an operation panel 18 that receives various operations from the user.
[0048]
  In the image recording apparatus 1a, in addition to the image input apparatus 2 and the image compression apparatus 3 described above, a logic circuit 19 is connected to the CPU 11 via the bus 14.
[0049]
  Control programs such as a moving image processing program for processing moving images are stored in the memory 12 (ROM) of the image recording apparatus 1a having such a configuration. This moving image processing program implements the program of the present invention. The function of the code sequence converter 4 is realized by the processing that the CPU 11 performs based on this moving image processing program.
[0050]
  As the memory 12, various types of media can be used, such as various optical disks including CDs and DVDs, various magneto-optical disks, various magnetic disks including flexible disks, and semiconductor memories. Alternatively, the program may be downloaded from the network 1c and installed in the memory 12. In this case, the storage device storing the program in the server on the transmission side is also a storage medium of the present invention. Note that the program may operate on a predetermined OS (Operating System), in which case the OS may take over the execution of some of the various processes described below, or the program may be included as a part of a group of program files constituting the application software or the OS.
[0051]
  Here, the operation of each part of the image recording apparatus 1a will be briefly described. The image input device 2 of the image recording device 1a captures a moving image in units of frames using a photoelectric conversion device such as a CCD or a MOS image sensor, and outputs a digital pixel value signal of the moving image to the image compression device 3.
[0052]
  The image compression device 3 of the image recording device 1a compresses and encodes the digital pixel value signal of the moving image according to the “Motion JPEG2000 algorithm”. As shown in FIG. 10, the image compression apparatus 3 consists of a color space conversion unit 31, a two-dimensional wavelet conversion unit 32, a quantization unit 33, an entropy encoding unit 34, a post quantization unit 35, and an arithmetic encoding unit 36. The various functions of these units are realized by processing performed by the CPU 11 in accordance with the above-described moving image processing program. When real-time performance is regarded as important, the processing must be speeded up; for this purpose, it is desirable to realize the various functions of each unit by the operation of the logic circuit 19.
[0053]
  Next, the operation of each part constituting the image compression apparatus 3 will be briefly described. The color space conversion unit 31 converts the digital pixel value signal of the moving image input from the image input device 2 from RGB to YUV or YCbCr; the two-dimensional wavelet conversion unit 32 performs a two-dimensional wavelet transform on each color component; the quantization unit 33 divides the wavelet coefficients by an appropriate quantization denominator; the entropy encoding unit 34 creates a lossless code; the post quantization unit 35 performs bit truncation (code discard); and the arithmetic encoding unit 36 forms a code in the JPEG2000 code format. Through this series of processing, the moving image data of the R, G, and B components of the original moving image is divided, for each frame, into one or more (usually a plurality of) tiles, and each tile is compression-coded into hierarchical encoded data.
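The division by a quantization denominator performed by the quantization unit 33 can be sketched as a scalar quantizer. The Python fragment below assumes a dead-zone quantizer that rounds the magnitude toward zero, which is the common choice in JPEG2000 but is not specified in the text; the function name `quantize` is hypothetical.

```python
def quantize(coeff, step):
    """Dead-zone scalar quantization: sign(c) * floor(|c| / step)."""
    if step <= 0:
        raise ValueError("quantization step must be positive")
    index = int(abs(coeff) // step)  # magnitude is rounded toward zero
    return -index if coeff < 0 else index
```

With a denominator of 2, for example, coefficients 7 and -7 map to indices 3 and -3, while a coefficient of 1 falls in the dead zone and maps to 0.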
[0054]
  Here, the post-quantization unit 35 that exhibits a characteristic function in the present embodiment will be described in detail. FIG. 11 is a block diagram schematically showing the configuration of the post quantization unit 35. As shown in FIG. 11, the post quantization unit 35 includes a speed estimation unit 41, a quantization table determination unit 42, a code discard unit 43, and a masking control unit 44.
[0055]
  The speed estimation unit 41 functions as a motion amount estimation device: it estimates the motion amount (speed) of an image from information in the code blocks created by the entropy encoding unit 34 and sends the estimated motion amount (speed) of the image to the masking control unit 44.
[0056]
  The masking control unit 44 adjusts the truncation amount (bit plane cutting amount) of the quantization table for each code block.
[0057]
  The quantization table determination unit 42 determines a quantization table according to the compression rate given from the CPU 11 and the amount of motion (speed) of the image estimated by the speed estimation unit 41, and sends the quantization table to the code discarding unit 43.
[0058]
  The code discarding unit 43 uses the quantization table determined by the quantization table determination unit 42 and the truncation amount (bit plane cutting amount) for each code block adjusted by the masking control unit 44 to discard code, code block by code block, from the code in which no bit plane (or sub-bit plane) has yet been deleted, until a predetermined compression rate is reached.
[0059]
  Here, a method for estimating the amount of motion (speed) of the image by the speed estimation unit 41 will be described. FIG. 12 illustrates the basic idea of the image motion amount (speed) estimation method used by the speed estimation unit 41. In general, when motion occurs in an interlaced image, the edge of the moving object takes a comb-tooth shape (hereinafter referred to as an interlaced comb) in the frame data. As shown in FIG. 12, in an interlaced image in which an object is moving at high speed, the interlaced comb is long in the horizontal direction, whereas in an image in which an object is moving at low speed, the interlaced comb is short in the horizontal direction. Further, it is known that horizontal edges of the image appear strongly in the 1LH component of the wavelet transform coefficients. That is, since the horizontal edges become longer as the motion becomes faster, a code block with faster motion has a larger sum of absolute values of its 1LH coefficients and a larger lossless 1LH code amount. The speed estimation unit 41 uses this characteristic to estimate the motion amount (speed) of the image independently for each frame.
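The relationship between comb length and 1LH energy can be checked on synthetic data. The following Python sketch builds a binary test frame in which the odd lines (standing in for the second field) see a vertical edge displaced by `shift` pixels, and sums the absolute level-1 coefficients. It substitutes a simple 2 × 2 Haar transform for the (5, 3) filter to keep the example short; that substitution, the test frame, and the names `interlaced_frame` and `haar_level1_sums` are all assumptions made for illustration.

```python
def interlaced_frame(height, width, edge_col, shift):
    """Binary frame whose odd lines see the edge `shift` pixels further right."""
    frame = []
    for row in range(height):
        pos = edge_col + (shift if row % 2 else 0)
        frame.append([255 if col >= pos else 0 for col in range(width)])
    return frame


def haar_level1_sums(frame):
    """Return (sum of |1LH|, sum of |1HL|) from a 2x2 Haar transform."""
    sum_lh = sum_hl = 0.0
    for y in range(0, len(frame) - 1, 2):
        for x in range(0, len(frame[0]) - 1, 2):
            a, b = frame[y][x], frame[y][x + 1]
            c, d = frame[y + 1][x], frame[y + 1][x + 1]
            sum_hl += abs(a - b + c - d) / 4  # horizontal high-pass: vertical edges
            sum_lh += abs(a + b - c - d) / 4  # vertical high-pass: horizontal edges
    return sum_lh, sum_hl
```

In this toy setting the 1LH sum is zero for a static edge and grows in proportion to the displacement between the two fields, which matches the tendency described above.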
[0060]
  FIG. 13 is a block diagram schematically showing the configuration of the speed estimation unit 41. As illustrated in FIG. 13, the speed estimation unit 41 includes a block selection unit 51, a feature amount calculation unit 52, and a speed determination unit 53. The block selection unit 51 selects a code block to be calculated. For example, code blocks in one-layer subband 1LH are selected in raster order. Alternatively, all the code blocks may be selected. The feature amount calculation unit 52 calculates the feature amount by calculating the coefficient or code amount of the code block selected by the block selection unit 51. The speed determination unit 53 estimates the motion amount (speed) of the image in units of code blocks using the feature amount obtained by the feature amount calculation unit 52.
[0061]
  Here, estimation of the motion amount (speed) of the image in units of code blocks will be described. As described above, a code block is obtained by dividing a subband into finer blocks; that is, the code block is a sub-block. In the present embodiment, the motion amount (speed) of the interlaced image is estimated by comparing the one-layer 1LH component of the wavelet transform coefficients, in which horizontal edges of the image appear strongly, with the one-layer 1HL component of the wavelet transform coefficients, in which vertical edges of the image appear strongly. Further, the motion amount (speed) of the image in units of code blocks is estimated by comparing code blocks that are at the same position when decoded. Here, FIG. 14 shows a one-layer subband having four code blocks. As shown in FIG. 14, the code block 1HL_1 and the code block 1LH_1, which are at the same position when decoded, are compared. Similarly, 1HL_2 and 1LH_2, 1HL_3 and 1LH_3, and 1HL_4 and 1LH_4 are compared. When the code block is one subband, the comparison is performed in units of subbands instead of units of code blocks. Then, the amount of motion (speed) of the image in units of code blocks is estimated according to the comparison result.
[0062]
  The process for estimating the amount of motion (speed) of the image in units of code blocks as described above will be described with reference to the flowchart shown in FIG. 15. As shown in FIG. 15, the first of the code blocks selected by the block selection unit 51 is acquired (step S1: sub-block acquisition means), the code amounts of the 1LH and 1HL code blocks before bit plane truncation are calculated (steps S2 and S3: code amount calculation means), the division (1LH / 1HL) is performed, and the resulting ratio is used as a feature amount (rate) for motion amount estimation (step S4). The processing in steps S1 to S4 is executed by the feature amount calculation unit 52.
[0063]
  Then, the result (rate) of the division (1LH / 1HL) is compared with the threshold (th0) (step S5). If the result (rate) of the division (1LH / 1HL) is larger than the threshold (th0) (Y in step S5), it is assumed that horizontal edges of the image appear strongly, and the motion amount (speed) of the image in units of code blocks is estimated to be high (step S6). On the other hand, if the result (rate) of the division (1LH / 1HL) is not larger than the threshold (th0) (N in step S5), it is assumed that vertical edges of the image appear strongly, and the motion amount (speed) of the image in units of code blocks is estimated to be low (step S7). The processing in steps S5 to S7 is executed by the speed determination unit 53. Here, the function of the sub-block motion amount estimation means is executed.
[0064]
  The processes in steps S2 to S7 are repeated until all the code blocks selected by the block selection unit 51 are completed (steps S8 and S9).
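The loop over steps S2 to S7 can be sketched in Python as follows. The sketch takes the lossless code amounts of co-located 1LH and 1HL code blocks as plain numbers; the function name `estimate_block_speeds`, the "high" / "low" labels, and the handling of a zero 1HL code amount are assumptions of this illustration, and the threshold th0 must be supplied by the caller because the text gives no value for it.

```python
def estimate_block_speeds(code_amounts_1lh, code_amounts_1hl, th0):
    """Classify each co-located code block pair as 'high' or 'low' speed."""
    speeds = []
    for lh, hl in zip(code_amounts_1lh, code_amounts_1hl):
        if hl == 0:
            # Assumption: an empty 1HL block gives an unbounded ratio
            # when 1LH is non-empty, and a zero ratio otherwise.
            rate = float("inf") if lh > 0 else 0.0
        else:
            rate = lh / hl                       # step S4: 1LH / 1HL
        speeds.append("high" if rate > th0 else "low")  # steps S5 to S7
    return speeds
```

Only code amounts from the current frame enter the calculation, so no previous frame needs to be stored.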
[0065]
  That is, when there are four code blocks in a one-layer subband as shown in FIG. 14, the above process is repeated for the number of code blocks of the one-layer coefficients, and the speed of each code block is estimated. As for the number of code blocks in one layer, when, for example, the size of a code block is 32 × 32 and the size of the 1LH coefficients is 256 × 128, the number of code blocks is (256/32) × (128/32) = 8 × 4 = 32.
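The count in this example can be reproduced with a one-line calculation. The Python helper below is a sketch; the name `code_block_count` is hypothetical, and rounding up for subbands that are not an exact multiple of the code block size is an assumption beyond what the text states.

```python
def code_block_count(subband_w, subband_h, block_w, block_h):
    """Number of code blocks covering a subband (partial blocks count as one)."""
    blocks_x = (subband_w + block_w - 1) // block_w  # ceiling division
    blocks_y = (subband_h + block_h - 1) // block_h
    return blocks_x * blocks_y
```

For a 256 × 128 subband and 32 × 32 code blocks this gives 8 × 4 = 32, as in the text.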
[0066]
  Here, the code amount of the sub-blocks included in the high-frequency subbands is calculated in units of blocks, and the amount of motion in units of code blocks is estimated based on the code amounts of the sub-blocks. As a result, it is not necessary to take the difference between frames, so memory consumption can be suppressed and the processing time can be shortened, and the amount of motion in units of code blocks can be estimated quickly and accurately.
[0067]
  The motion amount (speed) in units of code blocks estimated in this way is sent from the speed estimation unit 41 to the masking control unit 44, so that, for an image in which fast-moving and slow-moving objects are mixed, optimum processing (masking processing, etc.) can be applied to each.
[0068]
  Next, a second embodiment of the present invention will be described with reference to FIGS. 16 to 18. The same parts as those described in the first embodiment are denoted by the same reference numerals, and description thereof is omitted. In the first embodiment, the motion amount (speed) of the image in units of code blocks is estimated. In this embodiment, both the motion amount (speed) of the image in units of code blocks and the motion amount (speed) of the entire frame image are estimated.
[0069]
  FIG. 16 is a block diagram schematically showing the configuration of the speed estimation unit 41 according to the second embodiment of this invention. As illustrated in FIG. 16, the speed estimation unit 41 includes a block selection unit 61, a feature amount calculation unit 62, and a speed determination unit 63. The block selection unit 61 selects a code block to be calculated. For example, code blocks in one-layer subband 1LH are selected in raster order. Alternatively, all the code blocks may be selected. The feature amount calculation unit 62 calculates the feature amount by calculating the coefficient or code amount of the code block selected by the block selection unit 61. The speed determination unit 63 estimates the motion amount (speed) of the image in units of code blocks and the motion amount (speed) of the entire frame image using the feature amount obtained by the feature amount calculation unit 62.
[0070]
  Since the estimation of the motion amount (speed) of the image in units of code blocks has been described in the first embodiment, the description thereof is omitted.
[0071]
  Next, estimation of the motion amount (speed) of the entire frame image using the estimation result of the motion amount (speed) of the image in units of code blocks will be described. Note that, as described in the first embodiment, the description will be made on the assumption that the motion amount (speed) of an image in units of code blocks is estimated for four code blocks.
[0072]
  In the present embodiment, the motion amount (speed) of the entire frame image is estimated according to the proportions of the per-code-block motion amounts (speeds) of the image. More specifically, as shown in FIG. 17, if the ratio between high speed and low speed among the motion amounts (speeds) of the image for the four code blocks is
High speed : Low speed = 3 : 1
then, since the proportion of high speed is high, the motion amount (speed) of the entire frame image is estimated to be high.
[0073]
  Note that this is just an example; any configuration may be used as long as it can easily determine, from the estimation results of the motion amount (speed) of the image in units of code blocks, whether the motion amount (speed) of the entire frame image is high or low.
[0074]
  The process of estimating the amount of motion (speed) of the image in units of code blocks and the amount of motion (speed) of the entire frame image as described above will be described with reference to the flowchart shown in FIG. As shown in FIG. 18, after initializing the counter (step S21) and setting the total number of code blocks selected by the block selection unit 51 (step S22), 1LH and 1HL bits for each corresponding code block The code amount before cutting the plane is calculated (steps S23 and S24), and division (1LH / 1HL) is performed (step S25).
[0075]
  Then, the result (rate) of the division (1LH / 1HL) is compared with the threshold (th1) (step S26), and if the result (rate) of the division (1LH / 1HL) is larger than the threshold (th1) (Y in step S26), the count of code blocks whose division (1LH / 1HL) result (rate) is greater than the threshold (th1) is incremented (step S27).
[0076]
  The processes in steps S23 to S27 are repeated until all the code blocks selected by the block selection unit 51 are completed.
[0077]
  When the processing of steps S23 to S27 is completed for all code blocks selected by the block selection unit 51 (Y in step S28), the number of code blocks whose division (1LH / 1HL) result (rate) is greater than the threshold (th1) is divided by the total number of code blocks, and the calculated ratio is set as a feature amount (speed) for motion amount estimation (step S29). The processes in steps S21 to S29 are executed by the feature amount calculation unit 52.
[0078]
Then, the feature amount (speed) obtained by the feature amount calculation unit 52 is compared with the threshold value (th2) (step S30), and high speed or low speed is determined based on the comparison result. That is, if the feature amount (speed) obtained by the feature amount calculation unit 52 is larger than the threshold value (th2) (Y in step S30), the proportion of code blocks whose image motion amount (speed) is high is regarded as large, and the motion amount (speed) of the entire frame image is estimated to be high (step S31). On the other hand, if the feature amount (speed) obtained by the feature amount calculation unit 52 is not larger than the threshold value (th2) (N in step S30), the proportion of code blocks whose image motion amount (speed) is low is regarded as large, and the motion amount (speed) of the entire frame image is estimated to be low (step S32). The processes in steps S30 to S32 are executed by the speed determination unit 53. Here, the function of the frame motion amount estimation means is executed.
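The frame-level decision in steps S29 to S32 can be sketched in Python as follows. The sketch takes per-code-block labels as input; the function name `estimate_frame_speed` and the "high" / "low" labels are hypothetical, and the threshold th2 is left to the caller because the text gives no value for it.

```python
def estimate_frame_speed(block_speeds, th2):
    """Return (frame speed label, fraction of high-speed code blocks)."""
    if not block_speeds:
        raise ValueError("at least one code block is required")
    high_count = sum(1 for speed in block_speeds if speed == "high")
    ratio = high_count / len(block_speeds)        # step S29
    label = "high" if ratio > th2 else "low"      # steps S30 to S32
    return label, ratio
```

With the 3 : 1 split of FIG. 17 and an assumed th2 of 0.5, the fraction of high-speed blocks is 0.75 and the frame is classified as high speed.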
[0079]
  Here, the code amount of the sub-blocks included in the high-frequency subbands is calculated in units of blocks, the motion amount in units of code blocks is estimated based on the code amounts of the sub-blocks, and the motion amount of the entire frame is estimated based on the estimated motion amounts in units of sub-blocks. The motion amount (speed) of the entire frame estimated in this way is sent from the speed estimation unit 41 to the quantization table determination unit 42, so that the quantization table determination unit 42 can select a quantization table suitable for the motion amount (speed). That is, the image quality can be coarsely adjusted based on the motion amount of the entire frame and finely adjusted based on the motion amount of each code block, so that efficient image quality control can be performed.
[0080]
In the above description, the image recording apparatus 1a of the present invention is applied to a movie camera. However, it can also be applied to an information terminal apparatus such as a personal digital assistant (PDA) or a mobile phone.
[0081]
【Effects of the Invention】
  According to the present invention, the following effects can be obtained.
(1) By calculating the code amount of the sub-blocks included in the high-frequency subbands in units of blocks and estimating the amount of motion in units of code blocks based on the code amounts of the sub-blocks, it is no longer necessary to take inter-frame differences. Memory consumption can therefore be suppressed and the processing time can be shortened, so that the motion amount in units of code blocks can be estimated quickly and accurately. In particular, by comparing the one-layer 1LH component of the wavelet transform coefficients, in which horizontal edges of the image appear strongly, with the one-layer 1HL component of the wavelet transform coefficients, in which vertical edges of the image appear strongly, the motion amount (speed) of the interlaced image can be reliably estimated.
[0082]
(2) By estimating the motion amount of the entire frame based on the estimated motion amount of each sub-block, the image quality can be coarsely adjusted by the motion amount of the entire frame and finely adjusted by the motion amount of each code block, so that efficient image quality control can be performed. In addition, the motion amount of the entire frame image is estimated according to the proportions of the results obtained by the sub-block motion amount estimation means in comparing the code amount of subband 1LH with the code amount of subband 1HL for all sub-blocks included in the high-frequency subbands, so the motion amount of the entire frame image can be estimated easily.
[0083]
(3) Since the code amount of the sub-block is a lossless-compressed code amount, the motion amount estimation accuracy can be improved.
[0084]
(4) Since the code amount of the sub-block is the code amount before bit truncation, it is possible to improve the estimation accuracy of the motion amount.
[Brief description of the drawings]
FIG. 1 is a functional block diagram of a system that realizes a hierarchical encoding algorithm that is the basis of a JPEG2000 system that is a premise of the present invention.
FIG. 2 is an explanatory diagram showing a divided rectangular area of each component of the original image.
FIG. 3 is an explanatory diagram showing subbands at each decomposition level when the number of decomposition levels is 3.
FIG. 4 is an explanatory diagram showing a precinct.
FIG. 5 is an explanatory diagram showing an example of a procedure for ranking bit planes.
FIG. 6 is an explanatory diagram illustrating a schematic configuration of one frame of code string data.
FIG. 7 is an explanatory diagram showing the concept of Motion JPEG2000.
FIG. 8 is a block diagram showing a schematic configuration of the movie camera system according to the first embodiment of the present invention.
FIG. 9 is a block diagram illustrating an example of a hardware configuration of an image recording apparatus.
FIG. 10 is a block diagram schematically showing a configuration of an image compression apparatus.
FIG. 11 is a block diagram schematically showing a configuration of a post quantization unit.
FIG. 12 is an explanatory diagram of a basic idea of an image motion amount (speed) estimation method performed by a speed estimation unit.
FIG. 13 is a block diagram schematically showing a configuration of a speed estimation unit.
FIG. 14 is an explanatory diagram showing a one-layer subband having four code blocks.
FIG. 15 is a flowchart showing the flow of estimation processing of the motion amount (speed) of an image in units of code blocks.
FIG. 16 is a block diagram schematically illustrating a configuration of a speed estimation unit according to the second embodiment of this invention.
FIG. 17 is an explanatory diagram illustrating an example of an estimation result of a motion amount (speed) of an image in units of code blocks.
FIG. 18 is a flowchart showing a flow of estimation processing of the motion amount (speed) of an image in units of code blocks and the motion amount (speed) of the entire frame image.
[Explanation of symbols]
  12 storage media
  41 Motion estimation device

Claims

Means for selecting, in units of blocks, sub-blocks corresponding to the high-frequency subbands of the one-layer LH component (1LH) and the one-layer HL component (1HL) from code string data that has been hierarchically compression-coded by dividing each frame of an interlaced image constituting a moving image into a plurality of blocks and applying a discrete wavelet transform to the pixel values of each block,
Means for calculating a code amount of the selected 1LH sub-block and a code amount of the 1HL sub-block, and calculating a ratio between the two (a code amount of 1LH / a code amount of 1HL);
A means for comparing the calculated ratio with a threshold and estimating that the motion amount of the sub-block is high if it is large, and estimating that the motion amount of the sub-block is low if it is not large;
A motion amount estimation apparatus comprising:

The motion amount estimation apparatus according to claim 1, further comprising:
means for dividing the number of sub-blocks estimated to be fast, or the number of sub-blocks estimated to be slow, by the total number of sub-blocks in the entire frame; and
means for comparing the result of the division with a threshold and estimating whether the motion amount of the entire frame is fast or slow.

The motion amount estimation apparatus according to claim 1 or 2, wherein the code amount of the sub-block is a lossless compressed code amount.

The motion amount estimation apparatus according to claim 1, wherein the code amount of the sub-block is a code amount before bit truncation.

A program that causes a computer to execute the function of each means of the motion amount estimation apparatus according to any one of claims 1 to 4.

A computer-readable storage medium storing a program for causing a computer to execute the function of each means of the motion amount estimation apparatus according to claim 1.

A motion amount estimation method comprising:
selecting, in units of blocks, the sub-blocks corresponding to the high-frequency subbands of the first-layer LH component (1LH) and the first-layer HL component (1HL) from code string data that has been hierarchically compression-coded by dividing each frame of the interlaced images constituting a moving image into a plurality of blocks and applying a discrete wavelet transform to the pixel values of each block;
calculating the code amount of the selected 1LH sub-block and the code amount of the selected 1HL sub-block, and calculating the ratio between the two (code amount of 1LH / code amount of 1HL); and
comparing the calculated ratio with a threshold, estimating that the motion amount of the sub-block is fast if the ratio is larger than the threshold, and estimating that the motion amount of the sub-block is slow otherwise.
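The per-block decision in the method claim above can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation of the embodiments: the function name `estimate_block_motion`, the use of byte counts as the "code amount", the threshold value passed by the caller, and the handling of an empty 1HL sub-block are all hypothetical choices that the claim does not specify.

```python
def estimate_block_motion(code_amount_1lh: int,
                          code_amount_1hl: int,
                          threshold: float) -> str:
    """Classify one block as 'fast' or 'slow' from its 1LH / 1HL code amounts.

    code_amount_1lh, code_amount_1hl: code amounts (assumed here to be byte
    counts) of the 1LH and 1HL sub-blocks belonging to the same block.
    threshold: hypothetical tuning value; the claim only says "a threshold".
    """
    if code_amount_1lh < 0 or code_amount_1hl < 0:
        raise ValueError("code amounts must be non-negative")

    if code_amount_1hl == 0:
        # Assumption: the claim does not cover a zero denominator. Here a
        # block with any 1LH code but no 1HL code is treated as 'fast',
        # and a block with no code in either sub-block as 'slow'.
        return "fast" if code_amount_1lh > 0 else "slow"

    # Ratio named in the claim: code amount of 1LH / code amount of 1HL.
    ratio = code_amount_1lh / code_amount_1hl

    # Larger than the threshold -> fast; otherwise -> slow.
    return "fast" if ratio > threshold else "slow"


if __name__ == "__main__":
    # Example values are made up for illustration only.
    print(estimate_block_motion(300, 100, threshold=1.5))
    print(estimate_block_motion(100, 100, threshold=1.5))
```

Because the decision uses only the sizes of already-encoded sub-blocks, a sketch like this never needs to decode or dequantize coefficient data.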

The motion amount estimation method according to claim 7, further comprising:
dividing the number of sub-blocks estimated to be fast, or the number of sub-blocks estimated to be slow, by the total number of sub-blocks in the entire frame; and
comparing the result of the division with a threshold and estimating whether the motion amount of the entire frame is fast or slow.
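The frame-level step above can be sketched the same way. The claim allows either the fast count or the slow count to be divided by the total, and it does not state which side of the threshold maps to which result. This sketch assumes the fast count is used and that a fraction above the threshold means a fast frame. The name `estimate_frame_motion` and the label strings are hypothetical.

```python
from typing import Iterable


def estimate_frame_motion(block_labels: Iterable[str],
                          frame_threshold: float) -> str:
    """Classify a whole frame as 'fast' or 'slow' from per-block labels.

    block_labels: one 'fast' / 'slow' label per block of the frame.
    frame_threshold: hypothetical tuning value between 0 and 1.
    """
    labels = list(block_labels)
    if not labels:
        raise ValueError("a frame must contain at least one block")

    # Number of blocks estimated to be fast, divided by the total number
    # of blocks in the entire frame.
    fast_count = sum(1 for label in labels if label == "fast")
    fast_fraction = fast_count / len(labels)

    # Assumption: a fraction above the threshold means the frame is fast.
    return "fast" if fast_fraction > frame_threshold else "slow"


if __name__ == "__main__":
    # Made-up labels for a frame with four blocks.
    print(estimate_frame_motion(["fast", "fast", "slow", "slow"], 0.4))
```

Using the slow count instead would give an equivalent rule with the complementary fraction and the comparison reversed.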

The motion amount estimation method according to claim 7, wherein the code amount of the sub-block is a lossless compressed code amount.

The motion amount estimation method according to claim 7, wherein the code amount of the sub-block is a code amount before bit truncation.