JP4193460B2 - Image processing apparatus and method, recording medium, and program

Info

Publication number: JP4193460B2
Application number: JP2002296675A
Authority: JP
Inventors: 哲二郎近藤; 靖立平; 淳一石橋; 成司和田; 泰広周藤
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2002-10-09
Filing date: 2002-10-09
Publication date: 2008-12-10
Anticipated expiration: 2022-10-09
Also published as: JP2004134990A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像処理装置および方法、記録媒体、並びにプログラムに関し、特に、より正確に動きベクトルを検出できるようにした画像処理装置および方法、記録媒体、並びにプログラムに関する。
【０００２】
【従来の技術】
画像の動きを示す動きベクトルを求め、この動きベクトルに基づいて効率よく動画像を圧縮する技術がある。
【０００３】
この動画像圧縮技術における上述の動きベクトルを求める手法としては、いくつか提案されているが、代表的な手法としてブロックマッチングアルゴリズムと呼ばれる手法がある。
【０００４】
図１は、ブロックマッチングアルゴリズムを採用した従来の画像処理装置の動き検出部１の構成を示している。
【０００５】
動き検出部１のフレームメモリ１１は、例えば、時刻ｔ１において、入力端子Tinから画像信号が入力されると、1フレーム分の情報を格納する。さらに、フレームメモリ１１は、次のタイミングとなる時刻ｔ２において、入力端子Tinから次のフレームの画像信号が入力されると、時刻ｔ１において格納した1フレーム分の画像情報をフレームメモリ１２に出力した後、新たに入力された1フレーム分の画像情報を格納する。
【０００６】
また、フレームメモリ１２は、時刻ｔ２のタイミングで、フレームメモリ１１から入力されてくる時刻ｔ１のタイミングで入力端子Tinから入力されてきた1フレーム分の画像情報を格納する。
【０００７】
すなわち、フレームメモリ１１が、上述の時刻ｔ２のタイミングで入力される（今現在の）1フレーム分の画像情報を格納するとき、フレームメモリ１２は、時刻ｔ１のタイミングで入力された（１タイミング過去の）1フレーム分の画像情報を格納していることになる。尚、以下において、フレームメモリ１１に格納される画像情報をカレントフレームＦｃ、フレームメモリ１２に格納される画像情報を参照フレームＦｒと称するものとする。
【０００８】
動きベクトル検出部１３は、フレームメモリ１１，１２に格納されているカレントフレームＦｃと参照フレームＦｒをそれぞれから読出し、このカレントフレームＦｃと参照フレームＦｒに基づいて、ブロックマッチングアルゴリズムにより動きベクトルを検出し、出力端子Toutから出力する。
【０００９】
ここで、ブロックマッチングアルゴリズムについて説明する。例えば、図２で示すように、カレントフレームＦｃ内の注目画素Ｐ（ｉ，ｊ）に対応する動きベクトルを求める場合、まず、カレントフレームＦｃ上に注目画素Ｐ（ｉ，ｊ）を中心としたＬ（画素数）×Ｌ（画素数）からなる基準ブロックＢｂ（ｉ，ｊ）、参照フレームＦｒ上に、注目画素Ｐ（ｉ，ｊ）の位置に対応するサーチエリアＳＲ、そして、そのサーチエリアＳＲ内に、Ｌ（画素数）×Ｌ（画素数）の画素からなる参照ブロックＢｒｎ（ｉ，ｊ）がそれぞれ設定される。
【００１０】
次に、この基準ブロックＢｂ（ｉ，ｊ）と、参照ブロックＢｒｎ（ｉ，ｊ）の各画素間の差分の絶対値の和を求める処理が、参照ブロックＢｒｎ（ｉ，ｊ）をサーチエリアＳＲ内の全域で水平方向、または、垂直方向に１画素分ずつ移動させながら、図２中のＢｒ１（ｉ，ｊ）からＢｒｍ（ｉ，ｊ）（参照ブロックＢｒｎ（ｉ，ｊ）が、サーチエリアＳＲ内にｍ個設定できるものとする）まで繰り返される。
【００１１】
このようにして求められた基準ブロックＢｂ（ｉ，ｊ）と、参照ブロックＢｒｎ（ｉ，ｊ）の各画素間の差分絶対値和のうち、差分絶対値和が最小となる参照ブロックＢｒｎ（ｉ，ｊ）を求めることにより、基準ブロックＢｂ（ｉ，ｊ）に最も近い（類似している）参照ブロックＢｒｎ（ｉ，ｊ）を構成するＬ×Ｌ個の画素の中心となる参照画素Ｐｎ（ｉ，ｊ）が求められる。
【００１２】
そして、このカレントフレームＦｃ上の注目画素Ｐ（ｉ，ｊ）に対応する参照フレームＦｒ上の画素Ｐ’（ｉ，ｊ）を始点とし、参照画素Ｐｎ（ｉ，ｊ）を終点とするベクトルが、注目画素Ｐ（ｉ，ｊ）の動きベクトル（Ｖｘ，Ｖｙ）として出力される。ここで、例えば、Ｐ（ｉ，ｊ）＝（ａ，ｂ）、および、Ｐｎ（ｉ，ｊ）＝（ｃ，ｄ）である場合、（Ｖｘ，Ｖｙ）は、（Ｖｘ，Ｖｙ）＝（ｃ−ａ，ｄ−ｂ）となる。
【００１３】
すなわち、注目画素Ｐ（ｉ，ｊ）に対応する参照フレームＦｒ上の参照画素Ｐ’（ｉ，ｊ）を始点とし、基準ブロックＢｂ（ｉ，ｊ）に最も近い（類似している）参照ブロックＢｒｎ（ｉ，ｊ）を構成するＬ×Ｌ個の画素の中心となる参照画素Ｐｎ（ｉ，ｊ）を終点とするベクトルが動きベクトルとして求められる。
【００１４】
次に、図３のフローチャートを参照して、図１の動き検出部１の動き検出処理について説明する。
【００１５】
ステップＳ１において、動きベクトル検出部１３は、フレームメモリ１１に格納されているカレントフレームＦｃ上の注目画素Ｐ（ｉ，ｊ）の画素位置に応じて、サーチエリアＳＲを設定する。
【００１６】
ステップＳ２において、動きベクトル検出部１３は、上述のように、基準ブロックＢｂ（ｉ，ｊ）と参照ブロックＢｒｎ（ｉ，ｊ）の画素間の差分絶対値和の最小値を設定する変数Minを、画素の階調数に基準ブロックＢｂ（ｉ，ｊ）を構成する画素数を乗じた値に設定することにより初期化する。すなわち、例えば、１画素が８ビットのデータであった場合、１画素の階調数は、２の８乗となるため２５６階調（２５６色）となる。また、基準ブロックＢｂ（ｉ，ｊ）がＬ画素×Ｌ画素＝３画素×３画素から構成される場合、その画素数は、９個となる。結果として、変数Minは、２３０４（＝２５６（階調数）×９（画素数））に初期化される。
【００１７】
ステップＳ３において、動きベクトル検出部１３は、参照ブロックＢｒｎ（ｉ，ｊ）をカウントするカウンタ変数ｎを１に初期化する。
【００１８】
ステップＳ４において、動きベクトル検出部１３は、基準ブロックＢｂ（ｉ，ｊ）と参照ブロックＢｒｎ（ｉ，ｊ）の画素間の差分絶対値和を代入するために用いる変数Sumを０に初期化する。
【００１９】
ステップＳ５において、動きベクトル検出部１３は、基準ブロックＢｂ（ｉ，ｊ）と参照ブロックＢｒｎ（ｉ，ｊ）の画素間の差分絶対値和（＝Sum）を求める。すなわち、基準ブロックＢｂ（ｉ，ｊ）の各画素がＰ_Ｂｂ（ｉ，ｊ）、参照ブロックＢｒｎ（ｉ，ｊ）の各画素がＰ_Ｂｒｎ（ｉ，ｊ）としてそれぞれ示される場合、動きベクトル検出部１３は、以下の式（１）で示される演算を実行して、基準ブロックＢｂ（ｉ，ｊ）と参照ブロックＢｒｎ（ｉ，ｊ）の画素間の差分絶対値和を求める。
【００２０】
【数１】
Sum＝Σ｜Ｐ_Ｂｂ（ｉ，ｊ）−Ｐ_Ｂｒｎ（ｉ，ｊ）｜・・・（１）
（Σは、ブロックを構成するＬ×Ｌ個の画素についての総和を示す）
【００２１】
ステップＳ６において、動きベクトル検出部１３は、変数Minが変数Sumよりも大きいか否かを判定し、例えば、変数Minが変数Sumよりも大きいと判定する場合、ステップＳ７において、変数Minを変数Sumに更新し、その時点でのカウンタｎの値を動きベクトル番号として登録する。すなわち、今求めた差分絶対値和を示す変数Sumが、最小値を示す変数Minよりも小さいと言うことは、これまで演算したどの参照ブロックよりも、今演算している参照ブロックＢｒｎ（ｉ，ｊ）が基準ブロックＢｂ（ｉ，ｊ）により類似したものであるとみなすことができるので、動きベクトルを求める際の候補とするため、その時点でのカウンタｎが動きベクトル番号として登録される。また、ステップＳ６において、変数Minが変数Sumよりも大きくないと判定された場合、ステップＳ７の処理がスキップされる。
【００２２】
ステップＳ８において、動きベクトル検出部１３は、カウンタ変数ｎがサーチエリアＳＲの参照ブロックＢｒｎ（ｉ，ｊ）の総数ｍであるか否か、すなわち、今の参照ブロックＢｒｎ（ｉ，ｊ）がＢｒｎ（ｉ，ｊ）＝Ｂｒｍ（ｉ，ｊ）であるか否かを判定し、例えば、総数ｍではないと判定した場合、ステップＳ９において、カウンタ変数ｎを１インクリメントし、その処理は、ステップＳ４に戻る。
【００２３】
ステップＳ８において、カウンタ変数ｎがサーチエリア内の参照ブロックＢｒｎ（ｉ，ｊ）の総数ｍである、すなわち、今の参照ブロックＢｒｎ（ｉ，ｊ）がＢｒｎ（ｉ，ｊ）＝Ｂｒｍ（ｉ，ｊ）であると判定された場合、ステップＳ１０において、動きベクトル検出部１３は、登録されている動きベクトル番号に基づいて動きベクトルを出力する。すなわち、ステップＳ４乃至Ｓ９が繰り返されることにより、差分絶対値和が最小となる参照ブロックＢｒｎ（ｉ，ｊ）に対応するカウンタ変数ｎが動きベクトル番号として登録されることになるので、動きベクトル検出部１３は、この動きベクトル番号に対応する参照ブロックＢｒｎ（ｉ，ｊ）のＬ×Ｌ個の画素のうち、その中心となる参照画素Ｐｎ（ｉ，ｊ）を求め、カレントフレームＦｃ上の注目画素Ｐ（ｉ，ｊ）に対応する参照フレームＦｒ上の画素Ｐ’（ｉ，ｊ）を始点とし、参照画素Ｐｎ（ｉ，ｊ）を終点とするベクトルを、注目画素Ｐ（ｉ，ｊ）の動きベクトル（Ｖｘ，Ｖｙ）として求めて出力する。
【００２４】
また、ブロックマッチング法により動きベクトルを検出する時に、定常成分および過渡成分の絶対値差分を累算した値の加重平均から得られる評価値に基づいて動きベクトルを検出することにより、演算量を低減させるようにするものがある（例えば、特許文献１参照）。
【００２５】
さらに、参照ブロックおよび探索範囲内の画素値を符号化してコード値に基づいてマッチング演算を行い、演算結果に基づいて第１の動きベクトルを算出し、第１の動きベクトルに応じた動き補償を行った後に第１の動きベクトルに係る候補ブロックを１画素を単位としてずらすことで得られる新たな探索範囲について、画素値の差分に基づくブロックマッチングを行うことにより、第２の動きベクトルを算出して、第１の動きベクトルと第２の動きベクトルの和として最終的な動きベクトルを算出することで、演算を簡素化するものがある（例えば、特許文献２参照）。
【００２６】
【特許文献１】
特開平０７−０８７４９４号公報
【特許文献２】
特開２０００−２７８６９１号公報
【００２７】
【発明が解決しようとする課題】
しかしながら、上述したブロックマッチングアルゴリズムは、式（１）の演算量が非常に膨大なものとなるため、MPEG（Moving Picture Experts Group）等の画像圧縮処理においては、大半の時間がこの処理に費やされてしまうという課題があった。
【００２８】
また、カレントフレームＦｃ、または、参照フレームＦｒの動きベクトルの始点、または、終点付近でノイズが含まれた場合、ブロックマッチングでは基準ブロックに類似する参照ブロックを検出することができず、正確な動きベクトルを検出することができないという課題があった。
【００２９】
本発明はこのような状況に鑑みてなされたものであり、正確に動きベクトルを生成することができるようにするものである。
【００３０】
【課題を解決するための手段】
本発明の画像処理装置は、入力された画像における第１のフレームの各画素のうちの、注目画素に対応する４ｍ（ｍ≧１）個の周辺画素の画素値をクラスタップとして抽出するクラスタップ抽出手段と、クラスタップのダイナミックレンジを検出するダイナミックレンジ検出手段と、ダイナミックレンジに基づいて、ダイナミックレンジが大きいときにはビット長を小さくし、ダイナミックレンジが小さいときにはビット長を大きくするように、クラスタップのビット長を決定するビット長決定手段と、ビット長決定手段により決定されたビット長で表現されたクラスタップの画素値レベルから、特徴量としての量子化コードを生成する量子化コード生成手段と、第１のフレームの全ての画素を注目画素として、量子化コード生成手段により生成された特徴量と、特徴量に対応する第１のフレームの各画素位置の情報を記憶する記憶手段と、第１のフレームとは異なる第２のフレーム中の注目画素について、量子化コード生成手段により生成された特徴量に対応する、第１のフレームの画素位置の情報を、記憶手段より読み出す読出し手段と、読出し手段により読み出された第１のフレームの画素位置の情報のうち、第２のフレーム中の注目画素との距離が最小となる画素の画素位置を始点とし、第２のフレーム中の注目画素の画素位置を終点とする動きベクトルを検出する動きベクトル検出手段とを備えることを特徴とする。
【００３１】
前記ビット長決定手段には、ダイナミックレンジが大きいときにはビット長を小さくし、ダイナミックレンジが小さいときにはビット長を大きくするように、ダイナミックレンジと対応付けられたビット長の情報を記憶したテーブルを設けるようにさせることができ、ダイナミックレンジに基づいてビット長の情報をテーブルから読み出して、クラスタップのビット長を決定させるようにすることができる。
【００３３】
動きベクトル検出手段により検出された前記動きベクトルの大きさを所定の動きベクトルの大きさと比較する動きベクトル比較手段をさらに設けるようにさせることができ、動きベクトル比較手段の比較結果が、所定の動きベクトルの大きさよりも、動きベクトル生成手段により生成された動きベクトルの大きさの方が大きい場合、読出し手段には、第２のフレーム中の注目画素の量子化コードに類似する量子化コードに対応する第１のフレームの画素位置の情報を、記憶手段より読み出させ、動きベクトル検出手段には、読出し手段により読み出された、第２のフレーム中の注目画素の量子化コードに類似する量子化コードに対応する第１のフレームの画素位置の情報のうち、第２のフレーム中の注目画素との距離が最小となる画素の画素位置を始点とし、第２のフレーム中の注目画素の画素位置を終点とする動きベクトルを検出させることができる。
【００３４】
本発明の画像処理方法は、入力された画像における第１のフレームの各画素のうちの、注目画素に対応する４ｍ（ｍ≧１）個の周辺画素の画素値をクラスタップとして抽出するクラスタップ抽出ステップと、クラスタップのダイナミックレンジを検出するダイナミックレンジ検出ステップと、ダイナミックレンジに基づいて、ダイナミックレンジが大きいときにはビット長を小さくし、ダイナミックレンジが小さいときにはビット長を大きくするように、クラスタップのビット長を決定するビット長決定ステップと、ビット長決定ステップの処理で決定されたビット長で表現されたクラスタップの画素値レベルから、特徴量としての量子化コードを生成する量子化コード生成ステップと、第１のフレームの全ての画素を注目画素として、量子化コード生成ステップの処理で生成された特徴量と、特徴量に対応する第１のフレームの各画素位置の情報を記憶する記憶ステップと、第１のフレームとは異なる第２のフレーム中の注目画素について、量子化コード生成ステップの処理で生成された特徴量に対応する、第１のフレームの画素位置の情報を読み出す読出しステップと、読出しステップの処理で読み出された第１のフレームの画素位置の情報のうち、第２のフレーム中の注目画素との距離が最小となる画素の画素位置を始点とし、第２のフレーム中の注目画素の画素位置を終点とする動きベクトルを検出する動きベクトル検出ステップとを含むことを特徴とする。
【００３５】
本発明の記録媒体のプログラムは、入力された画像における第１のフレームの各画素のうちの、注目画素に対応する４ｍ（ｍ≧１）個の周辺画素の画素値をクラスタップとして抽出するクラスタップ抽出ステップと、クラスタップのダイナミックレンジを検出するダイナミックレンジ検出ステップと、ダイナミックレンジに基づいて、ダイナミックレンジが大きいときにはビット長を小さくし、ダイナミックレンジが小さいときにはビット長を大きくするように、クラスタップのビット長を決定するビット長決定ステップと、ビット長決定ステップの処理で決定されたビット長で表現されたクラスタップの画素値レベルから、特徴量としての量子化コードを生成する量子化コード生成ステップと、第１のフレームの全ての画素を注目画素として、量子化コード生成ステップの処理で生成された特徴量と、特徴量に対応する第１のフレームの各画素位置の情報を記憶する記憶ステップと、第１のフレームとは異なる第２のフレーム中の注目画素について、量子化コード生成ステップの処理で生成された特徴量に対応する、第１のフレームの画素位置の情報を読み出す読出しステップと、読出しステップの処理で読み出された第１のフレームの画素位置の情報のうち、第２のフレーム中の注目画素との距離が最小となる画素の画素位置を始点とし、第２のフレーム中の注目画素の画素位置を終点とする動きベクトルを検出する動きベクトル検出ステップとを含むことを特徴とする。
【００３６】
本発明のプログラムは、入力された画像における第１のフレームの各画素のうちの、注目画素に対応する４ｍ（ｍ≧１）個の周辺画素の画素値をクラスタップとして抽出するクラスタップ抽出ステップと、クラスタップのダイナミックレンジを検出するダイナミックレンジ検出ステップと、ダイナミックレンジに基づいて、ダイナミックレンジが大きいときにはビット長を小さくし、ダイナミックレンジが小さいときにはビット長を大きくするように、クラスタップのビット長を決定するビット長決定ステップと、ビット長決定ステップの処理で決定されたビット長で表現されたクラスタップの画素値レベルから、特徴量としての量子化コードを生成する量子化コード生成ステップと、第１のフレームの全ての画素を注目画素として、量子化コード生成ステップの処理で生成された特徴量と、特徴量に対応する第１のフレームの各画素位置の情報を記憶する記憶ステップと、第１のフレームとは異なる第２のフレーム中の注目画素について、量子化コード生成ステップの処理で生成された特徴量に対応する、第１のフレームの画素位置の情報を読み出す読出しステップと、読出しステップの処理で読み出された第１のフレームの画素位置の情報のうち、第２のフレーム中の注目画素との距離が最小となる画素の画素位置を始点とし、第２のフレーム中の注目画素の画素位置を終点とする動きベクトルを検出する動きベクトル検出ステップとを含む処理をコンピュータに実行させることを特徴とする。
【００３７】
本発明の画像処理装置および方法、並びにプログラムにおいては、入力された画像における第１のフレームの各画素のうちの、注目画素に対応する４ｍ（ｍ≧１）個の周辺画素の画素値がクラスタップとして抽出され、クラスタップのダイナミックレンジが検出され、ダイナミックレンジに基づいて、ダイナミックレンジが大きいときにはビット長を小さくし、ダイナミックレンジが小さいときにはビット長を大きくするように、クラスタップのビット長が決定され、決定されたビット長で表現されたクラスタップの画素値レベルから、特徴量としての量子化コードが生成され、第１のフレームの全ての画素を注目画素として生成された特徴量と、特徴量に対応する第１のフレームの各画素位置の情報が記憶され、第１のフレームとは異なる第２のフレーム中の注目画素について生成された特徴量に対応する、第１のフレームの画素位置の情報が読み出され、読み出された第１のフレームの画素位置の情報のうち、第２のフレーム中の注目画素との距離が最小となる画素の画素位置を始点とし、第２のフレーム中の注目画素の画素位置を終点とする動きベクトルが検出される。
【００３８】
【発明の実施の形態】
図４は、本発明を適用した画像処理装置の動き検出部２１の構成を示すブロック図である。
【００３９】
動き検出部２１は、フレームメモリ３１、特徴量抽出部３２、バッファメモリ３３、データベース制御部３４、および、動きベクトル検出部３５で構成されている。
【００４０】
フレームメモリ３１は、入力端子Ｔｉｎから入力された画像信号の１画面（１フレーム）の情報を格納し、特徴量抽出部３２に供給する。特徴量抽出部３２は、フレームメモリから供給された画面情報、すなわちカレントフレームＦｃの情報を基に、注目画素の特徴量を抽出する。
【００４１】
特徴量抽出部３２は、注目画素に対応したクラスタップを抽出し（注目画素に対応した付近の画素の画素値を抽出し）、そのクラスタップの情報から量子化コードを生成し、この量子化コードを特徴量として出力する。特徴量抽出部３２は、次の画面情報が入力されると、先に供給された画面情報から抽出した特徴量と対応する画素位置の情報（例えば、座標情報）を、バッファメモリ３３および動きベクトル検出部３５に供給する。尚、特徴量抽出部３２については、詳細を後述する。
【００４２】
バッファメモリ３３は、供給された特徴量を、参照フレームＦｒの特徴量情報として格納する。
【００４３】
データベース制御部３４は、バッファメモリ３３に格納されている参照フレームＦｒの特徴量情報に基づいて、特徴量をアドレスとして画素の位置情報をデータベース４１に格納することにより、参照フレーム情報を生成する。データベース制御部３４は、内部に、処理済の画素数をカウントするためのカウンタを有している。
【００４４】
次に、図５を参照して、データベース４１に格納される参照フレーム情報の構成について説明する。
【００４５】
データベース４１は、特徴量アドレス０乃至ａと、フラグアドレス０乃至ｂによって示されるａ×ｂ個のセルにより構成されている。データベース制御部３４は、画素の特徴量を特徴量アドレスに対応付けて、特徴量毎にその特徴量を持つ画素位置の情報を、データベース４１の、特徴量アドレスに対応するフラグアドレス１乃至ｂに順次格納する。そして、フラグアドレス０には、現在、その特徴量アドレスに格納されている画素位置の情報の数が、順次、インクリメントされて格納される。具体的には、特徴量アドレス１に、１つの画素位置の情報がセル（１，１）に格納されている場合、セル（１，０）には、格納されている画素位置の情報の数として、１が格納される。そして、次の注目画素の特徴量が、特徴量アドレス１に対応するものであった場合、セル（１，０）に格納されている値は、インクリメントされて２となり、注目画素の位置情報が、セル（１，２）に格納される。
【００４６】
再び、図４に戻り、動き検出部２１の構成について説明する。動きベクトル検出部３５は、特徴量抽出部３２から供給されたカレントフレームＦｃの特徴量情報と、データベース制御部３４のデータベース４１に記憶されている情報とのマッチング処理を実行して、動きベクトルを検出する。動きベクトル検出部３５は、具体的には、カレントフレームＦｃの注目画素の特徴量に対応する、データベース４１の特徴量アドレスに記載されている、複数の候補それぞれの画素位置と注目画素の距離を演算し、算出された距離が最小である画素位置の情報に基づいて、差分座標を注目画素の動きベクトル（Ｖｘ，Ｖｙ）として検出する。
【００４７】
動きベクトル検出部３５は、検出した動きベクトルＭの値が想定される動きベクトルの最大値M_Maxより小さいときは（規定範囲内にあるとき）、正しい動きベクトルとして判定し、端子Ｔout から出力する。また、動きベクトル検出部３５は、検出した動きベクトルＭの値が想定される動きベクトルの最大値M_Maxより大きいときは（規定範囲外にあるとき（等しい場合も含む））、対応していない位置の動きベクトルと判定し、カレントフレーム内の画素とデータベース４１で近い特徴量のアドレスを生成して、近傍の特徴量のアドレスに属する画素位置の情報に基づいて再度マッチング処理を行う。なお、動きベクトル検出部３５は、たとえば近傍の特徴量の選択においてビット反転を行うが、このとき注目画素のパターンに応じてビット反転の形態を変更する。
【００４８】
次に、図６のフローチャートを参照して、参照フレーム情報生成処理について説明する。
【００４９】
ステップＳ３１において、データベース制御部３４は、データベース４１に登録されている参照フレーム情報を初期化する。すなわち、データベース制御部３４は、全ての特徴量アドレスに対応するフラグアドレス０のセルに０を書き込み、フラグアドレス１乃至ｂに格納されている画素位置の情報を削除する。
【００５０】
ステップＳ３２において、データベース制御部３４は、フレームメモリ３１内の画素をカウントするカウンタのカウンタ変数ｎを０に初期化する。
【００５１】
ステップＳ３３において、特徴量抽出部３２は、注目画素に対応するクラスタップから特徴量を算出し、算出された特徴量を、動きベクトル検出部３５およびバッファメモリ３３に供給する。尚、特徴量算出処理については、後述する。
【００５２】
ステップＳ３４において、データベース制御部３４は、バッファメモリ３３に保存されている参照フレームＦｒの情報から、カウンタ変数ｎに対応する画素に対応する注目画素の特徴量を読み出し、データベース４１内の対応する特徴量アドレスの、フラグアドレス０に記載されている値Kを読み込む。
【００５３】
ステップＳ３５において、データベース制御部３４は、ステップＳ３４において読み出した値Ｋを、１だけインクリメントして（K=K＋１として）、データベース４１の対応する特徴量アドレスの、フラグアドレス０に書き込む。
【００５４】
ステップＳ３６において、データベース制御部３４は、バッファメモリ３３に保存されている参照フレームＦｒの情報から、注目画素の位置情報を読み出し、データベース４１の対応する特徴量アドレスのフラグアドレスＫ＋１に、注目画素の位置情報を書き込む。
【００５５】
ステップＳ３７において、データベース制御部３４は、カウンタ変数ｎをインクリメントして、ｎ＝ｎ＋１とする。
【００５６】
ステップＳ３８において、データベース制御部３４は、カウンタ変数ｎ＝1フレームの画素数であるか否かを判定する。ステップＳ３８において、カウンタ変数ｎ＝１フレームの画素数ではないと判定された場合、処理は、ステップＳ３３に戻り、それ以降の処理が繰り返される。ステップＳ３８において、カウンタ変数ｎ＝１フレームの画素数であると判定された場合、処理は、終了される。
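As an illustration of the reference frame information generation described above, the following is a minimal Python sketch, not the patented implementation. It assumes the feature codes of the reference frame are already available as a 2-D list; the names `build_reference_db` and `features` are hypothetical. A dictionary of lists stands in for the database 41: the dictionary key plays the role of the feature address, the list length plays the role of the count kept at flag address 0, and the list entries play the role of the pixel positions kept at flag addresses 1 onward.

```python
from collections import defaultdict


def build_reference_db(features):
    """Group pixel positions of the reference frame by feature code.

    features[y][x] is the (hypothetical) quantization code of pixel (x, y).
    """
    db = defaultdict(list)
    for y, row in enumerate(features):
        for x, code in enumerate(row):
            # Appending both stores the position and raises the candidate count.
            db[code].append((x, y))
    return db


# Toy reference frame of 2 rows x 3 columns with 4-bit feature codes.
ref_features = [
    [0b0101, 0b0101, 0b0011],
    [0b0011, 0b0101, 0b1000],
]
db = build_reference_db(ref_features)
print(len(db[0b0101]), db[0b0101])
```

Because every pixel is visited once, building the table costs one pass over the frame, which is the property the lookup-based matching relies on.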
【００５７】
次に、図７，図８のフローチャートを参照して、図４の動き検出部２１による動きベクトル検出処理について説明する。
【００５８】
ステップＳ５１において、動きベクトル検出部３５は、１フレームの画素数をカウントするカウント変数ｓを０に初期化する。
【００５９】
ステップＳ５２において、動きベクトル検出部３５は、特徴量抽出部３２により抽出されたカレントフレームＦｃ内の注目画素Ｐｎの特徴量を取得する。
【００６０】
ステップＳ５３において、動きベクトル検出部３５は、取得した特徴量に対応する特徴量アドレスについて、データベース４１上のセル（特徴量アドレス，０）に記録されている値を読み込み、同じ特徴量に分類される画素数（候補数：画素位置の情報の数）を変数ｔｓに代入する。また、動きベクトル検出部３５は、候補数カウンタを示すカウンタ変数ｔを１に、距離の最小値を示す変数Minを∞に、距離を示す変数Ｌを０にそれぞれ初期化する。
【００６１】
ステップＳ５４において、動きベクトル検出部３５は、カレントフレームＦｃ内の注目画素Ｐｎとデータベース制御部３４から読み込んだデータベース４１上の（特徴量アドレス，ｔ）に記録されている画素位置の情報との距離を演算して、変数Ｌに代入する。
【００６２】
ステップＳ５５において、動きベクトル検出部３５は、ステップＳ５４の処理で求められた距離Ｌが最小値を示す変数Minよりも小さいか否かを判定し、例えば、変数Min＞距離Ｌであると判定した場合、ステップＳ５６において、変数Minを距離Lに更新し、そのときの変数ｔを動きベクトル番号として登録する。また、変数Min≦距離Ｌであると判定された場合、ステップＳ５６の処理はスキップされる。
【００６３】
ステップＳ５７において、動きベクトル検出部３５は、候補カウンタ変数ｔが候補数の変数ｔｓ以上であるか否かを判定し、候補カウンタ変数ｔが候補数の変数ｔｓ以上ではないと判定した場合、すなわち、候補となる未処理の画素が存在する場合、ステップＳ５８において、変数ｔを１インクリメントして、その処理は、ステップＳ５４に戻る。
【００６４】
すなわち、候補カウンタ変数ｔが候補数の変数ｔｓ以上ではないということは、データベース４１上に記録されている、注目画素の特徴量と同一の特徴量に分類された画素位置の情報のうち、ステップＳ５４乃至Ｓ５６の処理がなされていない画素位置の情報が存在することになるので、候補となる全ての画素位置の情報について、ステップＳ５４乃至Ｓ５７の処理が施されるまで、その処理が繰り返される。
【００６５】
ステップＳ５７において、変数ｔが変数ｔｓ以上であると判定された場合、すなわち、注目画素の特徴量と同一の特徴量を有する全ての画素の画素位置と注目画素との距離が比較されたと判定された場合、ステップＳ５９において、動きベクトル検出部３５は、ステップＳ５６の処理で登録した動きベクトル番号に対応する画素位置を始点とし、注目画素位置を終点とする動きベクトルＭを求める。
【００６６】
ステップＳ６０において、動きベクトル検出部３５は、求められた動きベクトルＭの絶対値と、動きベクトルＭの想定される最大値M_Maxとを比較し、動きベクトルＭの絶対値が最大値M_Maxよりも小さいか否かを判定し、例えば、動きベクトルＭの絶対値が最大値M_Maxよりも小さいと判定した場合、その処理は、ステップＳ６１に進む。
【００６７】
ステップＳ６１において、動きベクトル検出部３５は、ステップＳ５９の処理で求められた動きベクトルＭを出力端子Toutから出力し、ステップＳ６２において、変数ｓを１インクリメントする。
【００６８】
ステップＳ６３において、動きベクトル検出部３５は、変数ｓがフレーム内の画素数と一致するか否かを判定し、例えば、変数ｓがフレーム内の画素数と一致しないと判定した場合、すなわち、まだ、処理すべき画素が存在すると判定する場合、その処理は、ステップＳ５２に戻り、例えば、変数ｓがフレーム内の画素数と一致すると判定した場合、すなわち、全ての画素について処理がなされたと判定された場合、その処理は終了する。
【００６９】
ステップＳ６０において、動きベクトルＭの絶対値が最大値M_Maxよりも小さくないと判定された場合、すなわち、注目画素と同じ特徴量を有する参照フレームの画素のうち、注目画素と最も近い位置に存在する画素との距離が想定していた距離M_Maxよりも離れていた場合、その処理は、ステップＳ６４（図８）に進む。
【００７０】
ステップＳ６４（図８）において、動きベクトル検出部３５は、反転ビット数を示す変数ｕを１に初期化する。
【００７１】
ステップＳ６５において、動きベクトル検出部３５は、注目画素の特徴量を示す量子化コードのうちｕビットを反転させる。すなわち、最初の処理では注目画素の特徴量を示す量子化コードを構成する複数ビットのうちのいずれか１ビットが反転される。
【００７２】
ステップＳ６６において、動きベクトル検出部３５は、ｕビットだけ反転した特徴量に対応する特徴量アドレスについて、データベース４１上の（特徴量アドレス，０）に記録されている値を読み込み、同じ特徴量に分類される画素数（候補数：画素位置の情報の数）を変数ｔｓに代入する。また、動きベクトル検出部３５は、候補数カウンタを示すカウンタ変数ｔを１に、距離の最小値を意味する変数Minを∞に、距離を示す変数Ｌを０にそれぞれ初期化する。
【００７３】
ステップＳ６７において、動きベクトル検出部３５は、カレントフレームＦｃ内の注目画素Ｐｎと、データベース制御部３４から読み込んだデータベース４１上の（特徴量アドレス，ｔ）に記録されている画素位置との距離を演算して、変数Ｌに代入する。
【００７４】
ステップＳ６８において、動きベクトル検出部３５は、ステップＳ６７の処理で求められた距離Ｌが距離の最小値の変数Minよりも小さいか否かを判定し、例えば、変数Min＞距離Ｌであると判定した場合、ステップＳ６９において、最小値の変数Minを距離Lに更新し、そのときの変数ｔを動きベクトル番号として登録する。また、変数Min≦距離Ｌであると判定された場合、ステップＳ６９の処理はスキップされる。
【００７５】
ステップＳ７０において、動きベクトル検出部３５は、候補カウンタ変数ｔが候補数の変数ｔｓ以上であるか否かを判定し、候補カウンタ変数ｔが候補数の変数ｔｓ以上ではないと判定した場合、すなわち、候補となる未処理の画素が存在するとみなし、ステップＳ７１において、変数ｔを１インクリメントして、その処理は、ステップＳ６７に戻る。
【００７６】
すなわち、候補カウンタ変数ｔが候補数の変数ｔｓ以上ではないということは、データベース４１上に記録されている、注目画素の特徴量がｕビット反転された特徴量と同一の特徴量に分類された画素位置の情報のうち、ステップＳ６７乃至Ｓ６９の処理がなされていない画素が存在することになるので、候補となる全ての画素について、ステップＳ６７乃至Ｓ６９の処理が施されるまで、その処理が繰り返される。
【００７７】
ステップＳ７０において、変数ｔが変数ｔｓ以上であると判定された場合、すなわち、注目画素の特徴量のうちｕビット反転された特徴量と同一の特徴量を有する全ての画素の画素位置と注目画素との距離が比較されたと判定された場合、ステップＳ７２において、注目画素の特徴量のうち反転させていないビットの組み合わせがあるか否かを判定し、まだ、反転させていないビットが、または、反転させていないビットの組み合わせがあると判定した場合、その処理は、ステップＳ６５に戻る。
【００７８】
ステップＳ７２において、反転させていないビットが存在しないと判定された場合、ステップＳ７３において、動きベクトル検出部３５は、ステップＳ６９の処理で登録した動きベクトル番号に対応する画素位置を始点とし、注目画素位置を終点とする動きベクトルＭを求める。
【００７９】
ステップＳ７４において、動きベクトル検出部３５は、求められた動きベクトルＭの大きさの絶対値と、動きベクトルＭの大きさとして想定される最大値M_Maxとを比較し、動きベクトルＭの絶対値が最大値M_Maxよりも小さいか否かを判定し、例えば、動きベクトルＭの絶対値が最大値M_Maxよりも小さいと判定した場合、その処理は、ステップＳ６１に戻る。
【００８０】
ステップＳ７４において、動きベクトルＭの絶対値が最大値M_Maxよりも小さくないと判定された場合、すなわち、注目画素の特徴量のうちｕビット反転された特徴量と同じ特徴量を有する参照フレームの画素のうち、注目画素と最も近い位置に存在する画素との距離が想定していた距離よりも離れていた場合、ステップＳ７５において、変数ｕを１インクリメントして、その処理は、ステップＳ６５に戻る。
【００８１】
すなわち、ステップＳ５１乃至Ｓ６３の処理により、注目画素Ｐｎの特徴量と同一の特徴量として分類されている画素と、注目画素との距離を順次演算し、最小となる画素を求め、その求められた画素位置の情報と注目画素Ｐｎの画素位置の情報から動きベクトルを生成して、出力する。
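The lookup in steps S51 to S63 can be sketched in Python as follows. This is a hedged illustration under stated assumptions: the database is modelled as a plain dictionary from feature code to a list of `(x, y)` positions, the function name `detect_motion_vector` is hypothetical, and Euclidean distance is assumed because the text only speaks of "distance". The vector runs from the reference-frame pixel (start point) to the target pixel (end point), as in step S59.

```python
import math


def detect_motion_vector(db, code, pixel):
    """Return (Vx, Vy) for `pixel`, or None if no candidate shares `code`."""
    candidates = db.get(code, [])
    if not candidates:
        return None
    px, py = pixel
    # min() keeps the first of several equally near candidates, which matches
    # the strict "Min > L" update rule of steps S55 and S56.
    bx, by = min(candidates, key=lambda p: math.hypot(px - p[0], py - p[1]))
    return (px - bx, py - by)


# Toy database: three reference pixels share feature code 5.
toy_db = {5: [(0, 0), (4, 4), (10, 1)]}
print(detect_motion_vector(toy_db, 5, (5, 5)))
```

Only the pixels filed under the same feature code are compared, so the cost per target pixel depends on the number of candidates rather than on the size of a search area.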
【００８２】
ただし、ステップＳ６０において、生成された動きベクトルMの絶対値が、最大値M_Maxよりも大きい場合、正しく求められた動きベクトルではないと判定され、ステップＳ６４乃至Ｓ７５の処理により、注目画素Ｐｎの特徴量に類似した近傍の特徴量に属する（分類される）画素と、注目画素Ｐｎとの距離を順次求めて、その距離が最小となる画素を求め、動きベクトルを求める。
【００８３】
すなわち、動き物体のある部分の特徴量が隣接フレームで少量変化することがありうる。
【００８４】
特徴量空間の各要素間は直交の関係にあるため、ある特徴量は図９のｄ−ａ１のように要素の個数を軸とした特徴量空間で定義される。すなわち、ｄ−ａ１で示す座標に特徴量を抽出した画素の空間座標をリンクさせる。
【００８５】
ある特徴量に対して近傍の特徴量（ある特徴量に対して類似する特徴量）は、図９中ｄ−ａ２で示すように、各特徴量空間内の軸ｘ，ｙ，ｚの値に対してある範囲の振れ幅を許容する領域として定義される。特徴量抽出部３２から出力される特徴量は、量子化コードであり、例えば、３ビットの量子化コードとして「０００」が出力される場合、最も近傍の特徴量は、ハミング距離が１となる「００１」、「０１０」、および「１００」となる。結果として、図９において、注目画素の特徴量が「０００」である場合、図中ｄ−ａ２で定義される範囲に含まれる特徴量（類似する特徴量）は、「００１」、「０１０」、および「１００」となる。
【００８６】
そこで、ステップＳ６５乃至Ｓ７２の処理においては、上述したようにいずれかｕビットのデータを反転させた（０ならば１に、１ならば０に反転させた）ときに求められる特徴量に属する画素と注目画素との距離を求めて最小となる画素と注目画素の画素位置の情報から動きベクトルＭを生成し、大きさが最大値M_Max以下となる動きベクトルＭが生成されるまで、反転させるビット数ｕを大きくしながらその処理を繰り返す。すなわち、注目画素の特徴量を示す量子化コードとのハミング距離を徐々に大きくしながら、注目画素との距離が最小となる画素を求める処理を繰り返す。
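The search over neighbouring feature codes with a growing Hamming distance can be sketched as below. This is an illustrative Python sketch rather than the patented procedure: `flipped_codes` and `search_with_fallback` are hypothetical names, the database is a plain dictionary, Euclidean distance is assumed, and `u = 0` is used here to represent the initial search with the unmodified code.

```python
import math
from itertools import combinations


def flipped_codes(code, n_bits, u):
    """All codes obtained by inverting exactly u of the n_bits bits of `code`."""
    result = []
    for positions in combinations(range(n_bits), u):
        mask = 0
        for p in positions:
            mask |= 1 << p
        result.append(code ^ mask)
    return result


def search_with_fallback(db, code, pixel, n_bits, m_max):
    """Widen the Hamming distance u until a vector shorter than m_max is found."""
    px, py = pixel
    for u in range(n_bits + 1):
        best = None  # (distance, vector) of the nearest candidate at this u
        for c in flipped_codes(code, n_bits, u):
            for x, y in db.get(c, []):
                d = math.hypot(px - x, py - y)
                if best is None or d < best[0]:
                    best = (d, (px - x, py - y))
        if best is not None and best[0] < m_max:
            return best[1], u
    return None, None


print([format(c, "03b") for c in flipped_codes(0b000, 3, 1)])
toy_db = {0b000: [(50, 50)], 0b001: [(3, 3)]}
print(search_with_fallback(toy_db, 0b000, (4, 3), n_bits=3, m_max=8))
```

In the toy run, the only pixel with the exact code lies too far away, so the search falls back to the codes at Hamming distance 1 and accepts the nearby pixel filed under `001`.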
【００８７】
この場合、演算内容はアドレス参照と差分演算と分岐のみなので、演算量が大幅に増大することはない。
【００８８】
画素値としては、たとえば１画素＝８ビットとした場合、コンピュータグラフィックス（ＣＧ）のような画像はフルビット（８ビット）情報でマッチング処理を行えるが、自然画像の場合は、フレーム毎にバラツキを含むので、複数ビットのうち所定ビットを除いて、マッチング処理を行うことが望ましい。具体的には、下位数ビットをマスクして使用してもよいし、ビット数を少なくして再量子化しても良い。つまり、非線形／線形な量子化におけるビット数を削減する（量子化ビット数を少なくする）ことが望ましい。
【００８９】
次に、特徴量抽出部３２について詳細を説明する。
【００９０】
図１０は、特徴量抽出部３２の構成例を示したブロック図である。
【００９１】
クラスタップ抽出部５１は、入力された画像情報のうち、特徴量の抽出に必要な注目画素に対応する周辺画素（クラスタップ）の情報（画素値）を抽出し、DR演算部５２、および、ADRCコード生成部５３に出力する。また、クラスタップ抽出部５１は、クラスタップのパターンを記憶したタップテーブル５１ａを有しており、DR判定部５４より入力される指示情報に基づいて（指示回数に応じて）、抽出するクラスタップのパターンを切り替える。
【００９２】
すなわち、タップテーブル５１ａは、クラスタップのパターンとして、図１１で示すように注目画素（図中黒丸で表示されている画素）を中心とした３画素×３画素からなるブロックの最外周位置に配置される８画素（図中斜線で塗りつぶされている画素）のパターンや、図１２で示すように、注目画素（図中黒丸で表示されている画素）を中心とした５画素×５画素からなるブロックの最外周位置に配置される１６画素（図中斜線で塗りつぶされている画素）のパターンや、図１３で示すように、注目画素（図中黒丸で表示されている画素）を中心とした７画素×７画素からなるブロックの最外周位置に配置される２４画素（図中斜線で塗りつぶされている画素）パターンや、さらに、図示しないが、注目画素（図中黒丸で表示されている画素）を中心としたｎ画素×ｎ画素からなるブロックの最外周位置に配置される４（ｎ−１）画素（図中斜線で塗りつぶされている画素）パターンを記憶している。
【００９３】
また、クラスタップのパターンは、図１４で示すように、注目画素の上下左右の１画素からなる４画素であってもよいし、図１５で示すように、注目画素の上下左右の２画素からなる８画素であってもよいし、図１６で示すように、注目画素の上下左右の３画素からなる１２画素であってもよいし、さらには、図示しないが、注目画素の上下左右のｍ画素からなる４ｍ画素であってもよい。さらに、クラスタップは、図示した以外の構成でもよく、例えば、注目画素との位置関係が非対称となる配置のものであってもよい。
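The two families of class tap patterns described above can be generated as offset lists relative to the target pixel. The following Python sketch is only an illustration; `ring_taps` and `cross_taps` are hypothetical names, and n is assumed to be odd and at least 3 so that the block has a well-defined centre.

```python
def ring_taps(n):
    """Offsets on the outermost ring of an n x n block centred on the target pixel."""
    r = n // 2
    return [(dx, dy)
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)
            if max(abs(dx), abs(dy)) == r]


def cross_taps(m):
    """Offsets of the m pixels above, below, left and right of the target pixel."""
    offsets = []
    for k in range(1, m + 1):
        offsets += [(0, -k), (0, k), (-k, 0), (k, 0)]
    return offsets


for n in (3, 5, 7):
    print(n, len(ring_taps(n)))   # ring size is 4 * (n - 1)
for m in (1, 2, 3):
    print(m, len(cross_taps(m)))  # cross size is 4 * m
```

A tap table such as 51a could then be thought of as a list of these offset lists ordered by rank.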
【００９４】
クラスタップ抽出部５１は、DR（ダイナミックレンジ）判定部５４からの指示情報に基づいて、クラスタップの配置が注目画素の位置に対してその範囲が広がっていくような配置となるように、ランク分けされたクラスタップのパターンを順次タップテーブル５１ａから読み出して変更させていく。この際、クラスタップ抽出部５１は、内蔵するランクを示すカウンタを制御して対応するランクのクラスタップを抽出する。
【００９５】
すなわち、図１１乃至図１３で示すように変化していくクラスタップのパターン（注目画素を中心としたｎ画素×ｎ画素からなるブロックの最外周位置に配置される４（ｎ−１）画素のパターン）が、順次ランクと共にタップテーブル５１ａに記憶されていた場合、クラスタップ抽出部５１は、DR判定部５４から指示情報が無い状態では、図１１で示すランク１のパターンでクラスタップの情報を読出し、その後のタイミングで、最初に指示情報を受信した場合、図１２で示すランク２のパターンに変化させて、クラスタップの情報を読み出し、さらに、２回目の指示情報を受信した場合、図１３で示すランク３のパターンに変更してクラスタップの情報を読み出し、さらに、ｎ回目の指示情報を受信した場合、注目画素を中心としたｎ画素×ｎ画素からなるブロックの最外周位置に配置される４（ｎ−１）画素のランクｎのパターンでクラスタップの情報を抽出する。このようにクラスタップのパターンのランクｎは、指示情報を受信した回数に基づいて変化する。尚、このクラスタップ抽出部５１が用いるパターンの変化（変更）は、図１１乃至図１３で示すようなパターンの変化のように同一形状で規則的に変化するもの（クラスタップの配置の形状が正方形で相似に変化するようなもの）である必要は無く、配置の形状などは変化するものであってもよい。また、パターンの変化は、図１４乃至図１６で示すようなパターンであってもよく、この場合、ランクｓは、図１４，図１５，図１６の順に設定されることになる。
【００９６】
DR（ダイナミックレンジ）演算部５２は、クラスタップ抽出部５１より入力されるクラスタップの情報（画素値）からダイナミックレンジを求め、ダイナミックレンジをDR判定部５４とADRC（Adaptive Dynamic Range Coding）コード生成部５３に出力すると共に、ダイナミックレンジとダイナミックレンジを求める際に得られる最小値の情報をADRCコード生成部５３に出力する。すなわち、クラスタップのパターンが、例えば、図１４で示すものであった場合、各クラスタップＣ１乃至Ｃ４の情報（画素値レベル）が、（Ｃ１，Ｃ２，Ｃ３，Ｃ４）＝（６０，９０，５１，１００）であるとき、その関係は、図１７で示すようになる。このような場合、ダイナミックレンジは、画素値レベルの最小値と最大値の差として定義され、その値は、以下の式（２）で定義される。
DR＝Max−Min＋１・・・（２）
【００９７】
ここで、Maxは、クラスタップの情報である画素値レベルの最大値であり、Minは、クラスタップの画素値レベルの最小値を示す。ここで、１を加算するのは、クラスを定義するためである（例えば、０，１で示されるクラスを設定する場合、両者の差分は１であるが、クラスとしては２クラスとなるため、差分に１を加算する）。従って、図１７の場合、クラスタップＣ４の画素値レベル１００が最大値であり、クラスタップＣ３の画素値レベル５１が最小値となるので、DRは、５０（＝１００−５１＋１）となる。
【００９８】
このように、DR演算部５２は、DRを演算するにあたり、クラスタップの画素値レベルのうちの最小値と最大値を検出することになるので、その最小値（または最大値：最小値は、最大値とDRから求めることができる）をADRCコード生成部５３に出力する。
【００９９】
ADRCコード生成部５３は、DR演算部５２より入力されたDRの値と最小値Minを用いて、クラスタップの各画素値レベルから、ADRCコードからなる量子化コードを生成して出力する。より詳細には、ADRCコードは、クラスタップの各画素値レベルを以下の式（３）に代入することにより求められる。
Ｑ＝Round（（Ｌ−Min＋０．５）×（２＾ｎ）／DR）・・・（３）
【０１００】
ここで、Roundは切り捨てを、Lは画素値レベルを、ｎは割り当てビット数を、（２＾ｎ）は２のｎ乗を、それぞれ示している。
【０１０１】
従って、例えば、割り当てビット数ｎが１であった場合、各クラスタップの画素値レベルは、以下の式（４）で示される閾値ｔｈ以上であれば１であり、閾値ｔｈより小さければ０とされる。
ｔｈ＝DR／２−０．５＋Min・・・（４）
【０１０２】
結果として、割り当てビット数ｎが１である場合、図１７で示すようなクラスタップが得られた場合、閾値ｔｈは、７５．５（＝５０／２−０．５＋５１）となるので、ADRCコードは、ADRC（Ｃ１，Ｃ２，Ｃ３，Ｃ４）＝０１０１となる。
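Equations (2) and (3) can be checked against the Fig. 17 example with a few lines of Python. This is a minimal sketch, assuming that `Round` means truncation as stated in the text; the function names are hypothetical.

```python
import math


def dynamic_range(taps):
    """Equation (2): DR = Max - Min + 1."""
    return max(taps) - min(taps) + 1


def adrc_code(taps, n_bits=1):
    """Equation (3): Q = floor((L - Min + 0.5) * 2**n / DR) for each tap level L."""
    lo = min(taps)
    dr = dynamic_range(taps)
    return [math.floor((level - lo + 0.5) * (2 ** n_bits) / dr) for level in taps]


taps = [60, 90, 51, 100]  # (C1, C2, C3, C4) from the Fig. 17 example
print(dynamic_range(taps))
print(adrc_code(taps))
```

For the example taps this yields DR = 50 and the 1-bit codes 0, 1, 0, 1, which agrees with the threshold of 75.5 given by equation (4).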
【０１０３】
ここで、図１０の特徴量抽出部３２の構成の説明に戻る。
【０１０４】
DR判定部５４は、DR演算部５２より入力されるDRの値が所定の閾値DR_thよりも大きいか否かを判定し、所定の閾値DR_thよりも大きくない場合、クラスタップ抽出部５１に対してクラスタップのパターンを変更するように指示すると共に、ADRCコード生成部５３に対して、今入力されたDRを用いたADRCコードの生成処理を実行しないように指示する。
【０１０５】
すなわち、ADRCコードは、クラスタップによっては、異なるDRであっても同じADRCコードが生成されることがある。この場合、DRが小さいほど誤差の影響を受けやすくなり、ADRCコードがフレーム毎に変化する恐れが生じやすくなる。
【０１０６】
より詳細には、例えば、図１８で示すように、図１４で示すクラスタップにより画素値レベル（Ｃ１，Ｃ２，Ｃ３，Ｃ４）＝（７４，８０，６５．５，８５．５）となる場合を考える。尚、図１８において、画素値レベル（Ｃ１，Ｃ２，Ｃ３，Ｃ４）＝（７４，８０，６５．５，８５．５）は、バツ印により示されている。また、図中の黒丸は、図１７で示した場合の例をそのまま示しており、ダイナミックレンジは、DR１（＝５０）で示されている。
【０１０７】
このとき、図１８中で、バツ印で示されるクラスタップのダイナミックレンジであるDR２は、２０となり、最小値は６５．５となるので、閾値は、７５．５となり、生成されるADRCコードは、０１０１となり、図１７で示した場合と同一のADRCコードが生成されることになる。
【０１０８】
ところで、各画素値レベルには、一定のレベルの誤差が生じる事が知られている。例えば、図１８では、各データに発生する誤差の範囲が、図中の縦棒で示されている。ここで、バツ印のクラスタップＣ１のデータは、誤差範囲に閾値が含まれているため、誤差の程度によっては、閾値ｔｈよりも大きくなることがあり、クラスタップのコードが反転してしまう可能性がある。
【０１０９】
動きベクトルの始点、または、終点となる画素に対応するクラスタップにこのような現象が生じると、結果として、本来動きベクトルの始点、または、終点として設定されるADRCコードが変化してしまうため、正確な動きベクトルの生成ができなくなる可能性が生じてしまう。さらに、DRが小さいと、クラスタップの各画素値レベルは、分散の程度が小さいことになるので、このように、閾値付近に存在する画素が多く含まれている可能性があり、それぞれの誤差範囲が閾値を跨ぐことにより複数ビットのコードが反転してしまう可能性がある。
【０１１０】
以上のことから、DRが大きい程、ADRCコードの誤差は小さく、DRが小さい程、ADRCコードの誤差が大きくなる。
【０１１１】
そこで、ダイナミックレンジDRがダイナミックレンジの閾値DR_thよりも小さい場合、クラスタップのパターンを変化させて、例えば、図１１のランク１のパターンから図１２のランク２のパターンへと変化させ、クラスタップの個数とクラスタップを抽出する範囲を注目画素に対して広くすることによりDRが大きくなるようにパターンを変更させる。さらに、パターンを変更させた後でも、DRがDR_thよりも小さい場合には、さらに、高いランクのパターンに変更させ、DRがDR_thを超える値になるまでパターンを変更させる指示情報をクラスタップ抽出部５１に送信する。
【０１１２】
次に、図１９のフローチャートを参照して、特徴量算出処理について説明する。
【０１１３】
ステップＳ９１において、クラスタップ抽出部５１は、クラスタップのパターンのランクを示すカウンタ変数ｓを１に初期化する。
【０１１４】
ステップＳ９２において、クラスタップ抽出部５１は、フレームメモリ３１に記憶されている１フレーム分の画像を読出し、その中の注目画素に対応するランクｓのパターンの情報をタップテーブル５１ａから読出し、対応するクラスタップを抽出し、DR演算部５２、および、ADRCコード生成部５３に出力する。すなわち、最初の処理では、ランクｓ＝１のパターン（例えば、上述の図１１、または、図１４のクラスタップのパターン）のクラスタップが抽出され、DR演算部５２に出力される。
【０１１５】
ステップＳ９３において、DR演算部５２は、クラスタップ抽出部５１より入力されたクラスタップのダイナミックレンジを求め、ADRCコード生成部５３、および、DR判定部５４に出力すると共に、その際、検出される最小値をADRCコード生成部５３に出力する。すなわち、図１７で示すような、クラスタップの画素値レベルが入力されると、上述のように、DRとして５０が出力され、最小値Minとして５１が出力される。
【０１１６】
ステップＳ９４において、DR判定部５４は、DR演算部５２より入力されたDRが所定の閾値DR_thよりも大きいか否かを判定し、例えば、DR演算部５２より入力されたDRが所定の閾値DR_thよりも大きくない、すなわち、クラスタップの各画素値レベルが分散していないと判定した場合、ステップＳ９６において、クラスタップ抽出部５１に対して、クラスタップのパターンを変更させる指示を出力し、ADRCコード生成部５３に対して、DR演算部５２より入力されたDR、および、最小値Minを用いたADRCコードを生成しないように指示する。
【０１１７】
ステップＳ９７において、クラスタップ抽出部５１は、ランクを示すカウンタ変数ｓを１インクリメントし、その処理は、ステップＳ９２に進む。すなわち、今の場合、ランクを示すカウンタ変数ｓは２となり、例えば、上述の図１２、または、図１５のパターンのように、注目画素から空間的に広がりをもつパターンに変更させてクラスタップを抽出する。
【０１１８】
このように、DRが所定の閾値DR_thを超えるまで、ステップＳ９２乃至Ｓ９７の処理が繰り返される。
【０１１９】
ステップＳ９４において、DR演算部５２より入力されたDRが所定の閾値DR_thよりも大きいと判定された場合、ステップＳ９５において、ADRCコード生成部５３は、DR演算部５２より入力されたDR、および、最小値Minを用いて、クラスタップ抽出部５１より入力されたクラスタップの各画素値レベルからADRCコードを生成して出力する。
【０１２０】
以上のように、クラスタップとして検出される画素値レベルに含まれる誤差を考慮し、所定の大きさ以上のダイナミックレンジが確保されるまでクラスタップを抽出する範囲が広げられることにより、ADRCコードを生成する際に生じるノイズなどの誤差を抑制し、より正確な動きベクトルの検出を可能にすることができる。
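The loop of steps S91 to S97 can be sketched as follows. This Python sketch makes several assumptions that are not in the text: rank s uses the outer ring of a (2s+1) x (2s+1) block, coordinates outside the image are clamped to the border, a 1-bit ADRC code is generated, and the loop gives up at a hypothetical `max_rank` and uses the last pattern tried.

```python
import math


def ring_offsets(rank):
    """Outer ring of the (2*rank+1) x (2*rank+1) block around the target pixel."""
    return [(dx, dy)
            for dy in range(-rank, rank + 1)
            for dx in range(-rank, rank + 1)
            if max(abs(dx), abs(dy)) == rank]


def feature_code(image, x, y, dr_th, max_rank=3):
    """Widen the tap pattern until DR exceeds dr_th, then return (rank, 1-bit codes)."""
    h, w = len(image), len(image[0])
    for rank in range(1, max_rank + 1):
        taps = [image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dx, dy in ring_offsets(rank)]
        lo = min(taps)
        dr = max(taps) - lo + 1
        if dr > dr_th:
            break  # enough spread: stop widening (step S94 "yes" branch)
    bits = [math.floor((v - lo + 0.5) * 2 / dr) for v in taps]
    return rank, bits


# 7 x 7 toy image: flat at level 100 except for a bright row two pixels above the centre.
image = [[100] * 7 for _ in range(7)]
image[1] = [200] * 7
print(feature_code(image, 3, 3, dr_th=20))
```

In the toy image the rank 1 ring is completely flat (DR = 1), so the pattern is widened once and the code is generated from the 16 taps of the rank 2 ring.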
【０１２１】
以上においては、注目画素に対するクラスタップのパターンを変化させることにより、DRを大きくして、結果的に正しい動きベクトルの検出が可能となる例について説明してきたが、クラスタップのパターンを変更させること無く、画像を縮小させるようにしてもよい。
【０１２２】
図２０は、DRが所定の閾値DR_thより小さい場合、画像を縮小させることにより、同一のクラスタップのパターンでもDRを大きくさせるようにした特徴量抽出部３２の構成を示している。
【０１２３】
尚、図２０中、図１０における場合と対応する部分については、同一の符号を付してあり、以下では、その説明は、適宜省略する。
【０１２４】
図２０において、図１０の特徴量抽出部３２と異なるのは、画像縮小部７１が新たに設けられ、クラスタップ抽出部５１およびDR判定部５４に代えて、クラスタップ抽出部７２およびDR判定部７３が設けられた点である。
【０１２５】
画像縮小部７１は、DR判定部７３からの指示に基づいて入力された画像を縮小してクラスタップ抽出部７２に出力する。尚、画像縮小部７１は、縮小率を示すカウンタを備えており、デフォルトの状態では（DR判定部７３から特に指示を受けていない状態では）、縮小率１、すなわち、入力された画像を縮小することなくそのままクラスタップ抽出部７２に出力する。
【０１２６】
クラスタップ抽出部７２は、クラスタップ抽出部５１と同一の機能を有するものであるが、クラスタップ抽出部５１とは異なりタップテーブル５１ａを有していないので、所定の固定されたクラスタップのパターンで画像縮小部７１より入力された画像の注目画素に対応するクラスタップを抽出し、DR演算部５２、および、ADRCコード生成部５３に出力する。
【０１２７】
DR判定部７３は、基本的には、DR判定部５４と同一の機能を有するものであるが、DRが所定の閾値DR_thよりも小さいと判定した場合、画像縮小部７１に対して画像を縮小するように指示を送信する。
【０１２８】
次に、図２１のフローチャートを参照して、図２０の特徴量算出処理について説明する。尚、図２１のフローチャートにおけるステップＳ１１３乃至Ｓ１１６の処理については、図１９のフローチャートにおけるステップＳ９２乃至Ｓ９５の処理と、同様の処理であるので、その説明は省略する。
【０１２９】
ステップＳ１１１において、画像縮小部７１は、縮小率を示すカウンタ変数ｔを１に初期化する。
【０１３０】
ステップＳ１１２において、画像縮小部７１は、フレームメモリ３１に記憶されている画像情報を読出し、１／（ｔ＾２）倍（ここで、ｔ＾２は、ｔの２乗を示す）に画像を縮小して（水平方向、および、垂直方向にそれぞれ、ｔ行毎、または、ｔ列毎に間引きして、１／ｔ倍に縮小して）、クラスタップ抽出部７２に出力する。すなわち、最初の処理では、縮小率が１（＝１／（１＾２））であるので、画像は縮小されること無くフレームメモリ３１に記憶されている画像そのものがクラスタップ抽出部７２に出力される。
【０１３１】
ステップＳ１１５において、DR判定部７３は、DRが所定の閾値DR_thよりも大きくないと判定すると、ステップＳ１１７において、画像縮小部７１に対して、画像を縮小するようにする指示情報を送信する。
【０１３２】
ステップＳ１１８において、画像縮小部７１は、DR判定部７３からの指示に基づいて縮小率を示すカウンタ変数ｔを１インクリメントし、その処理は、ステップＳ１１２に戻る。従って、例えば、ｔ＝２のとき、ステップＳ１１２においては、画像が水平方向、および垂直方向に１／２に縮小されて（１／（２＾２）に縮小されて）以降の処理が繰り返される。
【０１３３】
すなわち、例えば、図２２で示すように、２L（画素数）×２M（画素数）からなる画像の注目画素（図中黒丸で表示されている画素）を中心とした３画素×３画素からなるブロックの最外周位置に配置される８画素（図中斜線で塗りつぶされている画素）のパターンからなるクラスタップを抽出する場合、DRが所定の閾値DR_thよりも大きくないと判定されると、図２３で示すように、画像が、水平方向、および、垂直方向に対して１／２倍となるL×Mからなる画像に縮小されて（水平方向および垂直方向に、それぞれ１行、および、１列ずつ間引きされて）、注目画素を中心とした３画素×３画素からなるブロックの最外周位置に配置される８画素（図２３中格子状の線で塗りつぶされている画素）のパターンからなるクラスタップを抽出する。このように処理することにより、図２３で示す縮小された画像のクラスタップは、図２２で示すように縮小前の画像の注目画素の水平方向、および、垂直方向に対して２画素分の距離に存在していた画素（図２２中格子状の線で塗りつぶされている画素）がクラスタップとして選択されることになる。
【０１３４】
このように、画像を縮小して、同一のパターンからなるクラスタップを抽出することにより、実質的に、元の画像の注目画素の位置から空間的に広がるように、クラスタップのパターンが変更されるのと同様の処理がなされることになるので、図１０を参照して説明した特徴量抽出部３２と同様の効果を得ることができる。
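The equivalence between shrinking the image and widening the tap pattern can be checked with a small Python sketch. It assumes reduction by simple decimation (keeping every t-th row and column) and a target pixel whose coordinates are multiples of t; `decimate` and `ring_values` are hypothetical names.

```python
def decimate(image, t):
    """Keep every t-th row and every t-th column (reduction to 1/t per direction)."""
    return [row[::t] for row in image[::t]]


# Fixed 3 x 3 outer-ring pattern (8 taps) around the target pixel.
RING_3X3 = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dx, dy) != (0, 0)]


def ring_values(image, x, y, step=1):
    """Pixel values on the ring pattern, with the offsets scaled by `step`."""
    return [image[y + dy * step][x + dx * step] for dx, dy in RING_3X3]


original = [[10 * y + x for x in range(8)] for y in range(8)]
reduced = decimate(original, 2)

taps_reduced = ring_values(reduced, 2, 2)            # fixed pattern, reduced image
taps_original = ring_values(original, 4, 4, step=2)  # widened pattern, original image
print(taps_reduced == taps_original)
```

The fixed pattern applied to the reduced image picks exactly the pixels that lie two pixels away from the target pixel in the original image.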
【０１３５】
また、以上においては、クラスタップを空間的に広がるようにパターンを切り替えてDRを所定の閾値DR_th以上となるようにさせる例について説明してきたが、例えば、DRが小さい場合、クラスタップとして抽出される各画素の画素値レベルを表すビット長を変化させることによりADRCコードを正確に表現できるようにしてもよい。
【０１３６】
図２４は、クラスタップとして抽出する画素値レベルを表すビット長を変化させるようにして、正確なADRCコードを生成するようにした特徴量抽出部３２の構成例を示すブロック図である。
【０１３７】
尚、図２４中、図１０または図２０における場合と対応する部分については、同一の符号を付してあり、以下では、その説明は、適宜省略する。図２４において、図１０の特徴量抽出部３２と異なるのは、DR判定部５４、および、ADRCコード生成部５３に代えて、使用ビット長決定部８１、および、ADRCコード生成部８２が設けられている点である。
【０１３８】
使用ビット長決定部８１は、ADRCコードを生成する際に、DRに対応した使用ビット長の情報が予め記憶されたテーブル８１ａを有しており、DR演算部より入力されるDRに基づいて、このテーブルからクラスタップのデータ（各画素の画素値）の使用ビット長を決定し、決定した使用ビット長の情報をADRCコード生成部８２に出力する。
【０１３９】
ADRCコード生成部８２は、基本的に図１０のADRCコード生成部５３と同様の機能を有するものであるが、使用ビット長決定部８１より入力される使用ビット長の情報に基づいてADRCコードを生成する。
【０１４０】
ここで、クラスタップの各画素値レベルを示すビット長を変化させることにより、ADRCコード（量子化コード）を正確に表現する方法について説明する。
【０１４１】
例えば、クラスタップ抽出部７２により、クラスタップから抽出される各画素の画素値レベルを上位２ビットで表現させる場合、図２５で示すように、クラスタップＣ１が、０乃至６３の範囲に、クラスタップＣ２が、６４乃至１２７の範囲に、クラスタップＣ３が、１２８乃至１９１の範囲に、クラスタップＣ４が、１９２乃至２５５の範囲に、それぞれ分布して、ダイナミックレンジDR_Lが大きければ、２ビットのADRCコードで表現するとき、各クラスタップの値は、それぞれ異なる値として表現できることになる。すなわち、画素値レベルが、０乃至６３のとき００、６４乃至１２７のとき０１、１２８乃至１９１のとき１０、１９２乃至２５５のとき１１としてADRCコードの各コードを表すとするならば、クラスタップ（Ｃ１，Ｃ２，Ｃ３，Ｃ４）に対応するコードは、（００，０１，１０，１１）となるので、各画素値に対応するコードに差をつけて表現することが可能である。
【０１４２】
ところが、図２５と類似する傾向を持ったクラスタップＣ１乃至Ｃ４のダイナミックレンジDR_Sが小さくなると、例えば、図２６で示すようにクラスタップＣ１乃至Ｃ４が同一の範囲（１２８乃至１９１の範囲）に分布してしまうことになり、２ビットの表現では、全てが１０で表現されてしまうことになるため、結果として、画素値レベルを２ビットで表現すると、各画素毎に差をつけて表現することができないことになる。
【０１４３】
従って、このような場合には、画素値レベルの表現方法を２ビットからそれ以上のビット数で表現すれば、各画素毎に異なる画素値レベルを表現することができるので、ADRCコードの各画素値レベルを示すコードにも差をつけることが可能になる。
【０１４４】
これらをまとめると、画素値レベルを表現するビット長は、実際の画素値レベルのダイナミックレンジDR／２の分解能が表現できるように選択すればよいことになる。従って、例えば、画素値レベルのダイナミックレンジを８ビットで表現できる場合、図２７で示すように、必要分解能（＝DR／２）が１２８であるとき、使用ビット長は１ビット以上であればよく、必要分解能が６４であるとき、使用ビット長は２ビット以上であればよく、必要分解能が３２であるとき、使用ビット長は３ビット以上であればよく、必要分解能が１６であるとき、使用ビット長は４ビット以上であればよく、必要分解能が８であるとき、使用ビット長は５ビット以上であればよく、必要分解能が４であるとき、使用ビット長は６ビット以上であればよく、必要分解能が２であるとき、使用ビット長は７ビット以上であればよく、必要分解能が１であるとき、使用ビット長は８ビット以上であればよいことになる。
【０１４５】
換言すれば、図２８で示すように、ダイナミックレンジDRに対して、DRが２５６であるとき、使用ビット長は１ビット以上であればよく、DRが１２８であるとき、使用ビット長は２ビット以上であればよく、DRが６４であるとき、使用ビット長は３ビット以上であればよく、DRが３２であるとき、使用ビット長は４ビット以上であればよく、DRが１６であるとき、使用ビット長は５ビット以上であればよく、DRが８であるとき、使用ビット長は６ビット以上であればよく、DRが４であるとき、使用ビット長は７ビット以上であればよく、DRが２であるとき、使用ビット長は８ビット以上であればよいことになる。
【０１４６】
使用ビット長決定部８１のテーブル８１ａには、図２８で示すダイナミックレンジDRと、対応する使用ビット長の情報が記憶されている。
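The relation of Fig. 28 between dynamic range and bit length can be sketched as a lookup table in Python. The table entries are the ones listed in the text for 8-bit pixel values; how a DR value that falls between two entries is handled is not stated, so this sketch assumes the entry for the next smaller listed DR is used (the longer bit length). `use_bit_length` and `keep_upper_bits` are hypothetical names.

```python
# Dynamic range -> minimum bit length, as listed for Fig. 28 (8-bit pixel values).
BIT_LENGTH_TABLE = {256: 1, 128: 2, 64: 3, 32: 4, 16: 5, 8: 6, 4: 7, 2: 8}


def use_bit_length(dr):
    """Bit length for a given DR; in-between values fall to the next smaller entry."""
    for threshold in sorted(BIT_LENGTH_TABLE, reverse=True):
        if dr >= threshold:
            return BIT_LENGTH_TABLE[threshold]
    return 8  # DR below 2: use all 8 bits


def keep_upper_bits(value, bits):
    """Represent an 8-bit pixel value by its upper `bits` bits."""
    return value >> (8 - bits)


taps = [130, 150, 170, 190]          # all within 128..191, as in the Fig. 26 case
dr = max(taps) - min(taps) + 1       # 61
print([keep_upper_bits(v, 2) for v in taps])                   # 2 bits: all identical
print([keep_upper_bits(v, use_bit_length(dr)) for v in taps])  # longer: distinct
```

With 2 bits all four taps collapse to the same level, while the bit length chosen from the table keeps them distinct, which is the effect the variable bit length is meant to achieve.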
【０１４７】
次に、図２９のフローチャートを参照して、ビット長を調整することにより、注目画素に対応する特徴量を抽出する処理（注目画素に対応するクラスタップに基づいてADRCコードを生成する処理）について説明する。尚、図２９のフローチャートにおいて、ステップＳ１３１，Ｓ１３２の処理については、図２１のフローチャートにおけるステップＳ１１３，Ｓ１１４の処理と同様であるので、その説明は省略する。
【０１４８】
ステップＳ１３３において、使用ビット長決定部８１は、ステップＳ１３２の処理によりDR演算部５２で演算されたDRに基づいて、テーブル８１ａを参照して使用ビット長を決定し、決定した使用ビット長の情報をADRCコード生成部８２に出力する。
【０１４９】
ステップＳ１３４において、ADRCコード生成部８２は、クラスタップ抽出部７２より入力されたクラスタップの各画素値レベルを、使用ビット長決定部８１より入力された使用ビット長により処理し、DR演算部５２より入力されたDR、および、最小値Minを用いてADRCコードを生成して出力する。
【０１５０】
以上によれば、クラスタップとして抽出される画素値レベルを示すデータのビット長をDRに対応して変更させることにより、ダイナミックレンジが小さい場合には、画素値レベルを表現するビット長を長くして、各画素毎のコードに差をつけて表現することができ、結果として正確なADRCコードを生成することが可能になる。また、ダイナミックレンジが大きい場合、ビット長を長くしすぎないようにすることができるので、フレーム単位での画素値レベルの変化を吸収してノイズによる影響を抑制することが可能となる。
【０１５１】
また、動き検出部２１は、図４で示すように、参照フレームＦｒの特徴量をバッファメモリ３３に記憶させ、次のタイミングでデータベース４１に記憶させて、動きベクトルを検出させるようにする例について説明してきたが、例えば、図３０で示すように、カレントフレームＦｃを記憶するフレームメモリ３１−１と、その次のタイミングで、カレントフレームＦｃを参照フレームＦｒとして記憶するフレームメモリ３１−２を設けるようにして、カレントフレームＦｃと参照フレームＦｒのそれぞれの特徴量抽出部３２−１，３２−２を設けるようにしてもよい。尚、図３０の動き検出部２１の処理は、図４の動き検出部２１と同様であるので、その説明は省略する。
【０１５２】
以上においては、特徴量を表現する量子化コードとしてADRCコードを生成する場合について説明してきたが、特徴量を表現する量子化コードは、ADRCコードに限るものではなく、例えば、注目画素に対応したクラスタップから求められるダイナミックレンジDR、最小値Min、ラプラシアン、ソーベルオペレータ、または、クラスタップの中心画素の画素値、もしくは、注目画素近傍N画素値の平均、または、和などであってもよい。
【０１５３】
以上によれば、注目画素に対応したクラスタップのDRの大きさに基づいて、適応的にDR、または、ビット長を変更することにより、適正な量子化コードを生成することが可能となるので、動きベクトルを正確に求めることが可能となる。
【０１５４】
上述した一連の処理は、ハードウェアにより実行させることもできるが、ソフトウェアにより実行させることもできる。一連の処理をソフトウェアにより実行させる場合には、そのソフトウェアを構成するプログラムが、専用のハードウェアに組み込まれているコンピュータ、または、各種のプログラムをインストールすることで、各種の機能を実行させることが可能な、例えば汎用のパーソナルコンピュータなどに記録媒体からインストールされる。
【０１５５】
図３１は、動き検出部２１をソフトウェアにより実現する場合のパーソナルコンピュータの一実施の形態の構成を示している。パーソナルコンピュータのCPU２０１は、パーソナルコンピュータの動作の全体を制御する。また、CPU２０１は、バス２０４および入出力インタフェース２０５を介してユーザからキーボードやマウスなどからなる入力部２０６から指令が入力されると、それに対応してROM(Read Only Memory)２０２に格納されているプログラムを実行する。あるいはまた、CPU２０１は、ドライブ２１０に接続された磁気ディスク２１１、光ディスク２１２、光磁気ディスク２１３、または半導体メモリ２１４から読み出され、記憶部２０８にインストールされたプログラムを、RAM(Random Access Memory)２０３にロードして実行し、出力部２０７が実行結果を出力する。さらに、CPU２０１は、通信部２０９を制御して、外部と通信し、データの授受を実行する。
【０１５６】
プログラムが記録されている記録媒体は、図３１に示すように、コンピュータとは別に、ユーザにプログラムを提供するために配布される、プログラムが記録されている磁気ディスク２１１（フレキシブルディスクを含む）、光ディスク２１２（CDROM(Compact Disc-Read Only Memory)，DVD（Digital Versatile Disc）を含む）、光磁気ディスク２１３（MD（Mini-Disc）を含む）、もしくは半導体メモリ２１４などよりなるパッケージメディアにより構成されるだけでなく、コンピュータに予め組み込まれた状態でユーザに提供される、プログラムが記録されているROM２０２や、記憶部２０８に含まれるハードディスクなどで構成される。
【０１５７】
尚、本明細書において、記録媒体に記録されるプログラムを記述するステップは、記載された順序に沿って時系列的に行われる処理は、もちろん、必ずしも時系列的に処理されなくとも、並列的あるいは個別に実行される処理を含むものである。
【０１５８】
【発明の効果】
本発明によれば、適正な量子化コードを生成することが可能となるので、動きベクトルを正確に求めることが可能となる。
【図面の簡単な説明】
【図１】従来の動き検出部の構成を示すブロック図である。
【図２】動きベクトルの検出方法を説明する図である。
【図３】図１の動き検出部による動き検出処理を説明するフローチャートである。
【図４】本発明を適用した動き検出部の一実施の形態の構成を説明するブロック図である。
【図５】図４のデータベースの構造を説明する図である。
【図６】図４の動き検出部による参照フレーム情報生成処理を説明するフローチャートである。
【図７】図４の動き検出部による動きベクトル検出処理を説明するフローチャートである。
【図８】図４の動き検出部による動きベクトル検出処理を説明するフローチャートである。
【図９】注目画素の特徴量の近傍の特徴量を説明する図である。
【図１０】本発明を適用した特徴量抽出部の一実施の形態の構成を説明するブロック図である。
【図１１】クラスタップを説明する図である。
【図１２】クラスタップを説明する図である。
【図１３】クラスタップを説明する図である。
【図１４】クラスタップを説明する図である。
【図１５】クラスタップを説明する図である。
【図１６】クラスタップを説明する図である。
【図１７】 ADRCコードを説明する図である。
【図１８】 DRが小さいときに生じる誤差の例を説明する図である。
【図１９】図１０で示す特徴量抽出部による特徴量算出処理を説明するフローチャートである。
【図２０】本発明を適用した特徴量抽出部の他の実施の形態の構成を説明するブロック図である。
【図２１】図２０で示す特徴量抽出部による特徴量算出処理を説明するフローチャートである。
【図２２】画像を縮小してクラスタップを抽出する処理を説明する図である。
【図２３】画像を縮小してクラスタップを抽出する処理を説明する図である。
【図２４】本発明を適用した特徴量抽出部の他の実施の形態の構成を説明するブロック図である。
【図２５】 DRに対応して使用ビット長を変更させる処理を説明する図である。
【図２６】 DRに対応して使用ビット長を変更させる処理を説明する図である。
【図２７】必要分解能と使用ビット長の関係を説明する図である。
【図２８】ダイナミックレンジと使用ビット長の関係を説明する図である。
【図２９】図２４で示す特徴量抽出部による特徴量算出処理を説明するフローチャートである。
【図３０】本発明を適用した動き検出部の他の実施の形態の構成を説明するブロック図である。
【図３１】媒体を説明する図である。
【符号の説明】
３１フレームメモリ，３２特徴量抽出部，３３バッファメモリ，３４データベース制御部，３５動きベクトル検出部，４１データベース，５１クラスタップ抽出部，５１ａタップテーブル，５２ DR演算部，５３ ADRCコード生成部，５４ DR判定部，７１画像縮小部，７２クラスタップ抽出部，７３ DR判定部，８１使用ビット長決定部，８１ａテーブル，８２ ADRCコード生成部

[0001]
[Technical field to which the invention belongs]
The present invention relates to an image processing apparatus and method, a recording medium, and a program, and more particularly to an image processing apparatus and method, a recording medium, and a program that can detect a motion vector more accurately.
[0002]
[Prior art]
There is a technique for obtaining a motion vector indicating the motion of an image and efficiently compressing the moving image based on the motion vector.
[0003]
Several methods for obtaining the above-described motion vector in this moving image compression technique have been proposed. As a representative method, there is a method called a block matching algorithm.
[0004]
FIG. 1 shows a configuration of a motion detection unit 1 of a conventional image processing apparatus employing a block matching algorithm.
[0005]
For example, when an image signal is input from the input terminal Tin at time t1, the frame memory 11 of the motion detection unit 1 stores one frame of information. Further, when the image signal of the next frame is input from the input terminal Tin at time t2, which is the next timing, the frame memory 11 outputs the one frame of image information stored at time t1 to the frame memory 12 and then stores the newly input one frame of image information.
[0006]
At time t2, the frame memory 12 stores the one frame of image information that was input from the input terminal Tin at time t1 and is now supplied from the frame memory 11.
[0007]
That is, when the frame memory 11 stores the (current) one frame of image information input at time t2, the frame memory 12 holds the one frame of image information input at time t1 (one timing earlier). Hereinafter, the image information stored in the frame memory 11 is referred to as the current frame Fc, and the image information stored in the frame memory 12 is referred to as the reference frame Fr.
[0008]
The motion vector detection unit 13 reads the current frame Fc and the reference frame Fr stored in the frame memories 11 and 12, respectively, detects a motion vector by the block matching algorithm based on the current frame Fc and the reference frame Fr, and outputs it from the output terminal Tout.
[0009]
Here, the block matching algorithm will be described. For example, as shown in FIG. 2, when obtaining the motion vector corresponding to a target pixel P(i, j) in the current frame Fc, first, a base block Bb(i, j) of L (pixels) × L (pixels) centered on the target pixel P(i, j) is set on the current frame Fc, a search area SR corresponding to the position of the target pixel P(i, j) is set on the reference frame Fr, and a reference block Brn(i, j) of L (pixels) × L (pixels) is set within the search area SR.
[0010]
Next, the process of obtaining the sum of the absolute values of the differences between the pixels of the base block Bb(i, j) and the reference block Brn(i, j) is repeated from Br1(i, j) to Brm(i, j) in FIG. 2 (assuming that m reference blocks Brn(i, j) can be set in the search area SR), while the reference block Brn(i, j) is moved one pixel at a time horizontally or vertically over the entire search area SR.
[0011]
Among the sums of absolute differences between the pixels of the base block Bb(i, j) and the reference blocks Brn(i, j) obtained in this way, the reference block Brn(i, j) with the minimum sum is found. This yields the reference pixel Pn(i, j) at the center of the L × L pixels constituting the reference block Brn(i, j) closest (most similar) to the base block Bb(i, j).
[0012]
A vector whose start point is the pixel P′(i, j) on the reference frame Fr corresponding to the target pixel P(i, j) on the current frame Fc and whose end point is the reference pixel Pn(i, j) is then obtained and output as the motion vector (Vx, Vy) of the target pixel P(i, j). For example, when P(i, j) = (a, b) and Pn(i, j) = (c, d), the motion vector is (Vx, Vy) = (c − a, d − b).
[0013]
That is, the motion vector is the vector whose start point is the pixel P′(i, j) on the reference frame Fr corresponding to the target pixel P(i, j) and whose end point is the reference pixel Pn(i, j), the center of the L × L pixels constituting the reference block Brn(i, j) closest (most similar) to the base block Bb(i, j).
[0014]
Next, the motion detection processing of the motion detection unit 1 in FIG. 1 will be described with reference to the flowchart in FIG.
[0015]
In step S1, the motion vector detection unit 13 sets the search area SR according to the pixel position of the target pixel P(i, j) on the current frame Fc stored in the frame memory 11.
[0016]
In step S2, the motion vector detection unit 13 initializes the variable Min, which holds the minimum value of the sum of absolute differences between the pixels of the base block Bb(i, j) and the reference block Brn(i, j), to the value obtained by multiplying the number of gradations of a pixel by the number of pixels constituting the base block Bb(i, j). For example, when one pixel is 8-bit data, the number of gradations of one pixel is 2^8, that is, 256 gradations (256 colors). Further, when the base block Bb(i, j) consists of L pixels × L pixels = 3 pixels × 3 pixels, the number of pixels is nine. As a result, the variable Min is initialized to 2304 (= 256 (number of gradations) × 9 (number of pixels)).
[0017]
In step S3, the motion vector detection unit 13 initializes a counter variable n for counting the reference blocks Brn(i, j) to 1.
[0018]
In step S4, the motion vector detection unit 13 initializes the variable Sum, which holds the sum of absolute differences between the pixels of the base block Bb(i, j) and the reference block Brn(i, j), to 0.
[0019]
In step S5, the motion vector detection unit 13 obtains the sum of absolute differences (= Sum) between the pixels of the base block Bb(i, j) and the reference block Brn(i, j). That is, when each pixel of the base block Bb(i, j) is denoted P_Bb(i, j) and each pixel of the reference block Brn(i, j) is denoted P_Brn(i, j), the motion vector detection unit 13 performs the operation represented by the following expression (1) to obtain the sum of absolute differences between the pixels of the base block Bb(i, j) and the reference block Brn(i, j).
[0020]
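[Expression 1]
$$\mathrm{Sum} = \sum_{i=1}^{L} \sum_{j=1}^{L} \left| P_{Bb}(i, j) - P_{Brn}(i, j) \right| \qquad (1)$$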
[0021]
In step S6, the motion vector detection unit 13 determines whether the variable Min is larger than the variable Sum. If it is determined that the variable Min is larger than the variable Sum, then in step S7 the variable Min is updated to the variable Sum, and the value of the counter n at that time is registered as the motion vector number. In other words, the fact that the variable Sum, the sum of absolute differences just obtained, is smaller than the variable Min, the minimum value so far, means that the reference block Brn(i, j) now being processed can be regarded as more similar to the base block Bb(i, j) than any reference block processed so far. The counter n at that time is therefore registered as the motion vector number so that the block becomes a candidate for obtaining the motion vector. If it is determined in step S6 that the variable Min is not larger than the variable Sum, the process of step S7 is skipped.
[0022]
In step S8, the motion vector detection unit 13 determines whether the counter variable n equals the total number m of reference blocks Brn(i, j) in the search area SR, that is, whether the current reference block is Brn(i, j) = Brm(i, j). If it is determined that n is not the total number m, the counter variable n is incremented by 1 in step S9, and the processing returns to step S4.
[0023]
If it is determined in step S8 that the counter variable n equals the total number m of reference blocks Brn(i, j) in the search area SR, that is, that the current reference block is Brn(i, j) = Brm(i, j), then in step S10 the motion vector detection unit 13 outputs a motion vector based on the registered motion vector number. That is, by repeating steps S4 to S9, the counter variable n corresponding to the reference block Brn(i, j) that minimizes the sum of absolute differences is registered as the motion vector number. The motion vector detection unit 13 therefore obtains the reference pixel Pn(i, j) at the center of the L × L pixels of the reference block Brn(i, j) corresponding to the motion vector number, obtains the vector whose start point is the pixel P′(i, j) on the reference frame Fr corresponding to the target pixel P(i, j) on the current frame Fc and whose end point is the reference pixel Pn(i, j), and outputs it as the motion vector (Vx, Vy) of the target pixel P(i, j).
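As an illustration only, the conventional flow of steps S1 to S10 can be sketched in Python as follows. The function name `block_matching`, the parameter `search` (the half-width of the search area SR), the list-of-rows frame layout, and the skipping of blocks that would leave the frame are assumptions made for this sketch and do not come from the specification.

```python
def block_matching(current, reference, i, j, L=3, search=2, levels=256):
    """Return (Vx, Vy) for the target pixel at column i, row j."""
    half = L // 2
    height, width = len(current), len(current[0])
    # Step S2: start Min at gradations x pixels, which no real sum can reach.
    best_sum = levels * L * L
    best_vector = (0, 0)
    # Steps S3 to S9: visit every reference block position in the search area.
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ci, cj = i + dx, j + dy  # centre of the candidate reference block
            if not (half <= ci < width - half and half <= cj < height - half):
                continue  # assumption: skip blocks that would leave the frame
            # Steps S4 and S5: sum of absolute differences, expression (1).
            total = 0
            for y in range(-half, half + 1):
                for x in range(-half, half + 1):
                    total += abs(current[j + y][i + x] - reference[cj + y][ci + x])
            # Steps S6 and S7: keep the smallest sum seen so far.
            if best_sum > total:
                best_sum = total
                best_vector = (dx, dy)
    # Step S10: vector from P'(i, j) to the centre Pn(i, j) of the best block.
    return best_vector
```

Every candidate position costs L × L subtractions, and the search is repeated for every target pixel, which is the computational load that the following paragraphs identify as the problem.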
[0024]
In addition, there is a technique that, when detecting a motion vector by the block matching method, reduces the amount of computation by detecting the motion vector based on an evaluation value obtained as a weighted average of the accumulated absolute differences of the steady component and of the transient component (see, for example, Patent Document 1).
[0025]
Furthermore, there is a technique that simplifies the calculation as follows. The pixel values in the reference block and the search range are encoded, a matching operation is performed on the code values, and a first motion vector is calculated from the result. After motion compensation according to the first motion vector, block matching based on pixel value differences is performed over a new search range obtained by shifting the candidate block of the first motion vector by one pixel, which yields a second motion vector. The final motion vector is then calculated as the sum of the first motion vector and the second motion vector (see, for example, Patent Document 2).
[0026]
[Patent Document 1]
Japanese Patent Application Laid-Open No. 07-087494
[Patent Document 2]
JP 2000-278691 A
[0027]
[Problems to be solved by the invention]
However, the block matching algorithm described above requires an enormous amount of computation for expression (1), so that in image compression processing such as MPEG (Moving Picture Experts Group) most of the processing time is spent on this calculation.
[0028]
In addition, when noise is present near the start point or end point of a motion vector in the current frame Fc or the reference frame Fr, block matching cannot find a reference block similar to the base block, so that an accurate motion vector cannot be detected.
[0029]
The present invention has been made in view of such circumstances, and makes it possible to accurately generate a motion vector.
[0030]
[Means for Solving the Problems]
The image processing apparatus of the present invention comprises: class tap extraction means for extracting, as a class tap, the pixel values of 4m (m ≧ 1) peripheral pixels corresponding to a pixel of interest among the pixels of a first frame in an input image; dynamic range detection means for detecting the dynamic range of the class tap; bit length determining means for determining the bit length of the class tap based on the dynamic range, such that the bit length is decreased when the dynamic range is large and increased when the dynamic range is small; quantization code generation means for generating a quantization code as a feature amount from the pixel value levels of the class tap expressed with the bit length determined by the bit length determining means; storage means for storing, with every pixel of the first frame taken as the pixel of interest, the feature amount generated by the quantization code generation means together with information on the pixel position of the first frame corresponding to that feature amount; reading means for reading from the storage means, for a pixel of interest in a second frame different from the first frame, the information on the pixel positions of the first frame corresponding to the feature amount generated by the quantization code generation means; and motion vector detection means for detecting a motion vector whose start point is, among the pixel positions of the first frame read by the reading means, the position of the pixel at the smallest distance from the pixel of interest in the second frame, and whose end point is the pixel position of the pixel of interest in the second frame.
[0031]
The bit length determining means may be provided with a table that stores bit length information in association with the dynamic range, such that the bit length is decreased when the dynamic range is large and increased when the dynamic range is small, and may determine the bit length of the class tap by reading the bit length information from the table based on the dynamic range.
[0033]
Motion vector comparison means may further be provided for comparing the magnitude of the motion vector detected by the motion vector detection means with the magnitude of a predetermined motion vector. When the comparison shows that the detected motion vector is larger than the predetermined motion vector, the reading means may read from the storage means the pixel position information of the first frame corresponding to a quantization code similar to the quantization code of the pixel of interest in the second frame, and the motion vector detection means may detect a motion vector whose start point is, among the pixel positions so read, the position of the pixel at the smallest distance from the pixel of interest in the second frame, and whose end point is the pixel position of the pixel of interest in the second frame.
[0034]
The image processing method of the present invention includes: a class tap extraction step of extracting, as a class tap, the pixel values of 4m (m ≧ 1) peripheral pixels corresponding to a pixel of interest among the pixels of a first frame in an input image; a dynamic range detection step of detecting the dynamic range of the class tap; a bit length determination step of determining the bit length of the class tap based on the dynamic range, such that the bit length is decreased when the dynamic range is large and increased when the dynamic range is small; a quantization code generation step of generating a quantization code as a feature amount from the pixel value levels of the class tap expressed with the bit length determined in the bit length determination step; a storage step of storing, with every pixel of the first frame taken as the pixel of interest, the feature amount generated in the quantization code generation step together with information on the pixel position of the first frame corresponding to that feature amount; a reading step of reading, for a pixel of interest in a second frame different from the first frame, the information on the pixel positions of the first frame corresponding to the feature amount generated in the quantization code generation step; and a motion vector detection step of detecting a motion vector whose start point is, among the pixel positions of the first frame read in the reading step, the position of the pixel at the smallest distance from the pixel of interest in the second frame, and whose end point is the pixel position of the pixel of interest in the second frame.
[0035]
The program recorded on the recording medium of the present invention includes: a class tap extraction step of extracting, as a class tap, the pixel values of 4m (m ≧ 1) peripheral pixels corresponding to a pixel of interest among the pixels of a first frame in an input image; a dynamic range detection step of detecting the dynamic range of the class tap; a bit length determination step of determining the bit length of the class tap based on the dynamic range, such that the bit length is decreased when the dynamic range is large and increased when the dynamic range is small; a quantization code generation step of generating a quantization code as a feature amount from the pixel value levels of the class tap expressed with the bit length determined in the bit length determination step; a storage step of storing, with every pixel of the first frame taken as the pixel of interest, the feature amount generated in the quantization code generation step together with information on the pixel position of the first frame corresponding to that feature amount; a reading step of reading, for a pixel of interest in a second frame different from the first frame, the information on the pixel positions of the first frame corresponding to the feature amount generated in the quantization code generation step; and a motion vector detection step of detecting a motion vector whose start point is, among the pixel positions of the first frame read in the reading step, the position of the pixel at the smallest distance from the pixel of interest in the second frame, and whose end point is the pixel position of the pixel of interest in the second frame.
[0036]
The program of the present invention causes a computer to execute: a class tap extraction step of extracting, as a class tap, the pixel values of 4m (m ≧ 1) peripheral pixels corresponding to a pixel of interest among the pixels of a first frame in an input image; a dynamic range detection step of detecting the dynamic range of the class tap; a bit length determination step of determining the bit length of the class tap based on the dynamic range, such that the bit length is decreased when the dynamic range is large and increased when the dynamic range is small; a quantization code generation step of generating a quantization code as a feature amount from the pixel value levels of the class tap expressed with the bit length determined in the bit length determination step; a storage step of storing, with every pixel of the first frame taken as the pixel of interest, the feature amount generated in the quantization code generation step together with information on the pixel position of the first frame corresponding to that feature amount; a reading step of reading, for a pixel of interest in a second frame different from the first frame, the information on the pixel positions of the first frame corresponding to the feature amount generated in the quantization code generation step; and a motion vector detection step of detecting a motion vector whose start point is, among the pixel positions of the first frame read in the reading step, the position of the pixel at the smallest distance from the pixel of interest in the second frame, and whose end point is the pixel position of the pixel of interest in the second frame.
[0037]
In the image processing apparatus and method and the program according to the present invention, the pixel values of 4m (m ≧ 1) peripheral pixels corresponding to a pixel of interest among the pixels of a first frame in an input image are extracted as a class tap; the dynamic range of the class tap is detected; the bit length of the class tap is determined based on the dynamic range, such that the bit length is decreased when the dynamic range is large and increased when the dynamic range is small; a quantization code is generated as a feature amount from the pixel value levels of the class tap expressed with the determined bit length; with every pixel of the first frame taken as the pixel of interest, the generated feature amount is stored together with information on the pixel position of the first frame corresponding to that feature amount; the information on the pixel positions of the first frame corresponding to the feature amount generated for a pixel of interest in a second frame different from the first frame is read out; and a motion vector is detected whose start point is, among the pixel positions of the first frame that were read out, the position of the pixel at the smallest distance from the pixel of interest in the second frame, and whose end point is the pixel position of the pixel of interest in the second frame.
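The feature amount generation summarized above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the thresholds in `bit_length_for` are hypothetical stand-ins for the table contents, the class tap is taken as the four pixels above, left, right, and below (4m with m = 1), and the requantization formula is one common ADRC-style form rather than a formula stated in this section. The function names are likewise illustrative.

```python
def bit_length_for(dynamic_range):
    """Hypothetical table: fewer bits when the dynamic range is large."""
    if dynamic_range >= 128:
        return 1
    if dynamic_range >= 32:
        return 2
    return 3


def quantization_code(frame, x, y):
    """Feature amount of the pixel at column x, row y (interior pixels only)."""
    # Class tap: the 4m peripheral pixels with m = 1 (up, left, right, down).
    taps = [frame[y - 1][x], frame[y][x - 1], frame[y][x + 1], frame[y + 1][x]]
    low = min(taps)
    dynamic_range = max(taps) - low
    bits = bit_length_for(dynamic_range)
    code = 0
    for value in taps:
        # Requantize each tap to `bits` bits within [low, low + dynamic_range].
        level = ((value - low) << bits) // (dynamic_range + 1)
        code = (code << bits) | level
    return code, bits
```

The sketch returns the bit length alongside the code so that codes produced with different bit lengths cannot be confused when used as a lookup key. How the apparatus itself maps codes of different lengths to addresses is not specified in this passage.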
[0038]
DETAILED DESCRIPTION OF THE INVENTION
FIG. 4 is a block diagram showing a configuration of the motion detection unit 21 of the image processing apparatus to which the present invention is applied.
[0039]
The motion detection unit 21 includes a frame memory 31, a feature amount extraction unit 32, a buffer memory 33, a database control unit 34, and a motion vector detection unit 35.
[0040]
The frame memory 31 stores one screen (one frame) of the image signal input from the input terminal Tin and supplies it to the feature amount extraction unit 32. The feature amount extraction unit 32 extracts the feature amount of the target pixel based on the screen information supplied from the frame memory 31, that is, the information of the current frame Fc.
[0041]
The feature amount extraction unit 32 extracts a class tap corresponding to the target pixel (that is, extracts the pixel values of the peripheral pixels corresponding to the target pixel), generates a quantization code from the class tap information, and outputs this quantization code as the feature amount. When the next screen information is input, the feature amount extraction unit 32 supplies the information (for example, coordinate information) on the pixel position corresponding to the feature amount extracted from the previously supplied screen information to the buffer memory 33 and the motion vector detection unit 35. Details of the feature amount extraction unit 32 will be described later.
[0042]
The buffer memory 33 stores the supplied feature amount as feature amount information of the reference frame Fr.
[0043]
Based on the feature amount information of the reference frame Fr stored in the buffer memory 33, the database control unit 34 generates reference frame information by storing pixel position information in the database 41 with the feature amount as the address. The database control unit 34 has a counter for counting the number of processed pixels.
[0044]
Next, the configuration of reference frame information stored in the database 41 will be described with reference to FIG.
[0045]
The database 41 is composed of a × b cells designated by feature amount addresses 0 to a and flag addresses 0 to b. The database control unit 34 associates the feature amount of each pixel with a feature amount address and sequentially stores, for each feature amount, the information on the positions of the pixels having that feature amount in flag addresses 1 to b of the corresponding feature amount address of the database 41. In flag address 0, the number of pieces of pixel position information currently stored at that feature amount address is stored and incremented each time one is added. Specifically, when one piece of pixel position information is stored at feature amount address 1, in the cell (1, 1), the value 1 is stored in the cell (1, 0) as the number of stored pieces of pixel position information. When the feature amount of the next target pixel also corresponds to feature amount address 1, the value stored in the cell (1, 0) is incremented to 2, and the position information of that target pixel is stored in the cell (1, 2).
[0046]
Returning to FIG. 4, the description of the configuration of the motion detection unit 21 continues. The motion vector detection unit 35 detects a motion vector by executing a matching process between the feature amount information of the current frame Fc supplied from the feature amount extraction unit 32 and the information stored in the database 41 of the database control unit 34. Specifically, the motion vector detection unit 35 calculates the distance between the target pixel and each pixel position recorded at the feature amount address of the database 41 corresponding to the feature amount of the target pixel of the current frame Fc, and, based on the information of the pixel position at which the calculated distance is smallest, detects the difference coordinates as the motion vector (Vx, Vy) of the target pixel.
[0047]
When the magnitude of the detected motion vector M is smaller than the assumed maximum motion vector value M_Max (that is, within the specified range), the motion vector detection unit 35 judges the motion vector to be correct and outputs it from the output terminal Tout. When the magnitude of the detected motion vector M is larger than or equal to M_Max (outside the specified range), the motion vector detection unit 35 treats the case as having no corresponding position, generates the address of a feature amount close to that of the target pixel of the current frame for the database 41, and performs the matching process again based on the pixel position information belonging to that nearby feature amount address. When selecting a nearby feature amount, the motion vector detection unit 35 performs, for example, bit inversion, and changes the manner of bit inversion according to the pattern of the target pixel.
[0048]
Next, reference frame information generation processing will be described with reference to the flowchart of FIG.
[0049]
In step S31, the database control unit 34 initializes reference frame information registered in the database 41. That is, the database control unit 34 writes 0 in the cell of flag address 0 corresponding to all feature amount addresses, and deletes the pixel position information stored in flag addresses 1 to b.
[0050]
In step S32, the database control unit 34 initializes the counter variable n of the counter that counts the pixels in the frame memory 31 to 0.
[0051]
In step S33, the feature amount extraction unit 32 calculates a feature amount from the class tap corresponding to the target pixel and supplies the calculated feature amount to the motion vector detection unit 35 and the buffer memory 33. The feature amount calculation process will be described later.
[0052]
In step S34, the database control unit 34 reads, from the information of the reference frame Fr stored in the buffer memory 33, the feature amount of the target pixel corresponding to the counter variable n, and reads the value K written in flag address 0 of the corresponding feature amount address in the database 41.
[0053]
In step S35, the database control unit 34 increments the value K read in step S34 by 1 (K = K + 1) and writes it to flag address 0 of the corresponding feature amount address in the database 41.
[0054]
In step S36, the database control unit 34 reads the position information of the target pixel from the information of the reference frame Fr stored in the buffer memory 33 and writes the position information of the target pixel to flag address K + 1 of the corresponding feature amount address in the database 41.
[0055]
In step S37, the database control unit 34 increments the counter variable n so that n = n + 1.
[0056]
In step S38, the database control unit 34 determines whether or not the counter variable n is the number of pixels of one frame. If it is determined in step S38 that the counter variable n is not the number of pixels in one frame, the process returns to step S33, and the subsequent processes are repeated. If it is determined in step S38 that the counter variable n is the number of pixels in one frame, the process is terminated.
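The reference frame information generation of steps S31 to S38 can be sketched in Python as follows. This is an illustrative sketch, not the apparatus itself: the function name `build_reference_database` and the representation of each feature amount address as a growing list (index 0 playing the role of flag address 0) are assumptions. The placement of positions follows the cell example given for the database 41, in which the first position for an address is stored at flag address 1 and the count becomes 1.

```python
def build_reference_database(features, num_addresses):
    """features: list of (feature_address, (x, y)), one per reference pixel."""
    # Step S31: every feature amount address starts with a count of 0 in
    # flag address 0 and no stored pixel positions.
    database = [[0] for _ in range(num_addresses)]
    # Steps S32 to S38: visit each pixel of the frame once.
    for address, position in features:
        row = database[address]
        row[0] += 1            # steps S34, S35: increment the stored count
        row.append(position)   # step S36: store at the next free flag address
    return database
```

Because the feature amount itself is the address, all reference pixels that share a feature amount can later be retrieved with a single lookup instead of a search over the frame.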
[0057]
Next, the motion vector detection process by the motion detection unit 21 in FIG. 4 will be described with reference to the flowcharts in FIGS.
[0058]
In step S51, the motion vector detection unit 35 initializes a count variable s for counting the number of pixels in one frame to zero.
[0059]
In step S52, the motion vector detection unit 35 acquires the feature amount of the pixel of interest Pn in the current frame Fc extracted by the feature amount extraction unit 32.
[0060]
In step S53, the motion vector detection unit 35 reads the value recorded in the cell (feature amount address, 0) of the database 41 corresponding to the feature amount address of the acquired feature amount, and substitutes the number of pixels classified into the same feature amount (the number of candidates, that is, the number of pieces of pixel position information) into the variable ts. In addition, the motion vector detection unit 35 initializes the counter variable t, the candidate number counter, to 1, the variable Min, the minimum distance, to ∞, and the variable L, the distance, to 0.
[0061]
In step S54, the motion vector detection unit 35 calculates the distance between the target pixel Pn in the current frame Fc and the pixel position recorded in the cell (feature amount address, t) of the database 41, read via the database control unit 34, and substitutes it into the variable L.
[0062]
In step S55, the motion vector detection unit 35 determines whether the distance L obtained in step S54 is smaller than the variable Min indicating the minimum value. If it is determined that Min > L, then in step S56 the variable Min is updated to the distance L, and the variable t at that time is registered as the motion vector number. If it is determined that Min ≦ L, the process of step S56 is skipped.
[0063]
In step S57, the motion vector detection unit 35 determines whether the candidate counter variable t is greater than or equal to the number of candidates ts. If it is determined that t is not greater than or equal to ts, that is, that unprocessed candidate pixels remain, the variable t is incremented by 1 in step S58, and the process returns to step S54.
[0064]
That is, when the candidate counter variable t is not greater than or equal to the number of candidates ts, pixel position information that has not yet undergone steps S54 to S56 remains among the pixel position information recorded in the database 41 under the same feature amount as that of the target pixel, so steps S54 to S57 are repeated until all candidate pixel positions have been processed.
[0065]
If it is determined in step S57 that the variable t is greater than or equal to the variable ts, that is, that the distances between the target pixel and the pixel positions of all pixels having the same feature amount as the target pixel have been compared, then in step S59 the motion vector detection unit 35 obtains the motion vector M whose start point is the pixel position corresponding to the motion vector number registered in step S56 and whose end point is the position of the target pixel.
[0066]
In step S60, the motion vector detection unit 35 compares the absolute value of the obtained motion vector M with the assumed maximum value M_Max and determines whether the absolute value of the motion vector M is smaller than the maximum value M_Max. If it is determined that the absolute value of the motion vector M is smaller than the maximum value M_Max, the process proceeds to step S61.
[0067]
In step S61, the motion vector detection unit 35 outputs the motion vector M obtained in the process of step S59 from the output terminal Tout. In step S62, the variable s is incremented by one.
[0068]
In step S63, the motion vector detection unit 35 determines whether the variable s matches the number of pixels in the frame. If it is determined that the variable s does not match the number of pixels in the frame, that is, that pixels remain to be processed, the process returns to step S52. If it is determined that the variable s matches the number of pixels in the frame, that is, that all pixels have been processed, the process ends.
[0069]
If it is determined in step S60 that the absolute value of the motion vector M is not smaller than the maximum value M_Max, that is, that even the pixel closest to the target pixel among the pixels of the reference frame having the same feature amount as the target pixel is at a distance not smaller than the assumed distance M_Max, the process proceeds to step S64 (FIG. 8).
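For a single target pixel, steps S53 to S60 can be sketched in Python as follows. The function name `detect_motion_vector`, the list-based database layout (index 0 holding the candidate count, as in flag address 0), the use of Euclidean distance, and the return of `None` when the check of step S60 fails are assumptions made for this sketch. The case of an address with no candidates is not discussed in the text and is simply treated here as a failed match.

```python
import math


def detect_motion_vector(database, address, target, m_max):
    """Return (Vx, Vy) for the target pixel, or None if step S60 fails."""
    row = database[address]
    count = row[0]                      # step S53: number of candidates ts
    best_distance = math.inf            # variable Min, initialized to infinity
    best_position = None
    for t in range(1, count + 1):       # steps S54 to S58
        px, py = row[t]
        distance = math.hypot(target[0] - px, target[1] - py)
        if best_distance > distance:    # steps S55 and S56
            best_distance = distance
            best_position = row[t]
    if best_position is None:
        return None                     # assumption: no pixel shares this code
    # Step S59: start point in the reference frame, end point at the target.
    vector = (target[0] - best_position[0], target[1] - best_position[1])
    # Step S60: accept only vectors shorter than the assumed maximum M_Max.
    if math.hypot(vector[0], vector[1]) < m_max:
        return vector
    return None
```

In this sketch a `None` result corresponds to the branch to step S64, where nearby feature amounts are tried instead.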
[0070]
In step S64 (FIG. 8), the motion vector detection unit 35 initializes a variable u indicating the number of inverted bits to 1.
[0071]
In step S65, the motion vector detection unit 35 inverts u bits in the quantization code indicating the feature amount of the target pixel. That is, in the first process, any one of a plurality of bits constituting the quantization code indicating the feature amount of the target pixel is inverted.
[0072]
In step S66, the motion vector detection unit 35 reads the value recorded in (feature amount address, 0) on the database 41 corresponding to the feature amount address with the feature amount inverted by u bits, and sets the same feature amount. The number of pixels to be classified (number of candidates: the number of pieces of pixel position information) is substituted into the variable ts. Further, the motion vector detection unit 35 initializes a counter variable t indicating a candidate number counter to 1, a variable Min indicating a minimum value of the distance to ∞, and a counter variable L indicating the distance to 0.
[0073]
In step S67, the motion vector detection unit 35 calculates the distance between the target pixel Pn in the current frame Fc and the pixel position recorded in (feature amount address, t) on the database 41, read through the database control unit 35, and assigns it to the variable L.
[0074]
In step S68, the motion vector detection unit 35 determines whether or not the distance L obtained in step S67 is smaller than the variable Min holding the minimum distance. If it is determined that Min > L, in step S69 the variable Min is updated to the distance L, and the value of the variable t at that time is registered as the motion vector number. If it is determined that Min ≦ L, the process of step S69 is skipped.
[0075]
In step S70, the motion vector detection unit 35 determines whether or not the candidate counter variable t is greater than or equal to the number of candidates ts. If the variable t is not greater than or equal to ts, that is, if unprocessed candidate pixels remain, the variable t is incremented by 1 in step S71, and the process returns to step S67.
[0076]
That is, the fact that the candidate counter variable t is not greater than or equal to the number of candidates ts means that, among the pixel position information recorded in the database 41 for the feature amount obtained by inverting u bits of the feature amount of the target pixel, there is a pixel that has not yet undergone the processing of steps S67 to S69. The processing is therefore repeated until steps S67 to S69 have been performed for all candidate pixels.
[0077]
If it is determined in step S70 that the variable t is greater than or equal to the variable ts, that is, if the distances between the target pixel and the pixel positions of all pixels classified into the feature amount obtained by inverting u bits of the feature amount of the target pixel have been compared, it is determined in step S72 whether or not a combination of u bits that has not yet been inverted remains in the feature amount of the target pixel. If such a combination remains, the process returns to step S65.
[0078]
If it is determined in step S72 that no uninverted bit combination remains, in step S73 the motion vector detection unit 35 obtains a motion vector M whose start point is the pixel position corresponding to the motion vector number registered in step S69 and whose end point is the position of the target pixel.
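The candidate scan of steps S66 to S71 and the vector construction of step S73 can be illustrated with the following minimal sketch. It is not the implementation of the motion vector detection unit 35: the function names, the sample coordinates, and the use of Euclidean distance are assumptions made for illustration, since the text only speaks of a "distance".

```python
import math

def nearest_candidate(target, candidates):
    """Scan all candidate positions and keep the closest one (cf. S66 to S71)."""
    min_dist = math.inf      # plays the role of the variable Min
    best_number = None       # plays the role of the registered motion vector number
    for t, pos in enumerate(candidates, start=1):
        # Euclidean distance is an assumption; the text does not fix the metric.
        dist = math.hypot(pos[0] - target[0], pos[1] - target[1])
        if dist < min_dist:
            min_dist, best_number = dist, t
    return best_number, min_dist

def motion_vector(target, ref_pos):
    """Vector from the reference-frame pixel (start) to the target pixel (end)."""
    return (target[0] - ref_pos[0], target[1] - ref_pos[1])

target = (10, 10)                          # hypothetical target pixel position
candidates = [(40, 3), (12, 9), (10, 30)]  # hypothetical positions sharing one feature amount
number, dist = nearest_candidate(target, candidates)
print(number, motion_vector(target, candidates[number - 1]))  # 2 (-2, 1)
```

Each candidate costs one address reference, one distance calculation, and one comparison, so the work grows only with the number of pixels sharing the feature amount.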
[0079]
In step S74, the motion vector detection unit 35 compares the absolute value of the obtained motion vector M with the maximum value M_Max assumed for the magnitude of the motion vector M, and determines whether or not the absolute value of the motion vector M is smaller than M_Max. If it is smaller, the process returns to step S61.
[0080]
If it is determined in step S74 that the absolute value of the motion vector M is not smaller than the maximum value M_Max, that is, if, among the reference frame pixels classified into the feature amount obtained by inverting u bits of the feature amount of the target pixel, even the pixel closest to the target pixel lies farther away than the assumed distance, the variable u is incremented by 1 in step S75, and the process returns to step S65.
[0081]
That is, in the processing of steps S51 to S63, the distances between the target pixel Pn and the pixels classified into the same feature amount as the target pixel Pn are calculated in sequence, the pixel with the minimum distance is found, and a motion vector is generated from the pixel position information of that pixel and that of the target pixel Pn and is output.
[0082]
However, when the absolute value of the generated motion vector M is not smaller than the maximum value M_Max in step S60, it is judged that the motion vector has not been obtained correctly. In the processing of steps S64 to S75, the distances between the target pixel Pn and the pixels classified into nearby feature amounts similar to the feature amount of the target pixel Pn are then calculated in sequence, the pixel with the minimum distance is found, and the motion vector is obtained.
[0083]
This is because the feature amount of a given part of a moving object may change slightly between adjacent frames.
[0084]
Since the elements of the feature amount are orthogonal to each other, a given feature amount is defined as a point in a feature amount space having one axis per element, as indicated by d-a1 in the figure. That is, the spatial coordinates of the pixel from which the feature amount was extracted are linked to the coordinates indicated by d-a1.
[0085]
A feature amount in the vicinity of a given feature amount (a feature amount similar to it) is defined as a region that allows a certain range of variation in the values along the axes x, y, and z of the feature amount space, as indicated by d-a2 in the figure. The feature amount output from the feature amount extraction unit 32 is a quantization code. For example, when “000” is output as a 3-bit quantization code, the nearest feature amounts are “001”, “010”, and “100”, which have a Hamming distance of 1. As a result, when the feature amount of the target pixel is “000”, the feature amounts (similar feature amounts) included in the range defined by d-a2 in the drawing are “001”, “010”, and “100”.
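The neighbouring feature amounts at a given Hamming distance can be enumerated by inverting every combination of u bit positions. The following is a minimal sketch; the function name flip_u_bits and its parameters are hypothetical and are not taken from the description.

```python
from itertools import combinations

def flip_u_bits(code: int, n_bits: int, u: int) -> list[int]:
    """Return every code obtained by inverting exactly u of the n_bits bits."""
    flipped = []
    for positions in combinations(range(n_bits), u):
        mask = 0
        for p in positions:
            mask |= 1 << p           # select the bits to invert
        flipped.append(code ^ mask)  # XOR inverts exactly the selected bits
    return flipped

# Neighbours of the 3-bit code 000 at Hamming distance 1
print([format(c, "03b") for c in flip_u_bits(0b000, 3, 1)])  # ['001', '010', '100']
```

For an n-bit code there are C(n, u) such neighbours, which is the set of bit combinations that the check for uninverted combinations in step S72 iterates over.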
[0086]
Therefore, in the processing of steps S65 to S72, as described above, the distances between the target pixel and the pixels belonging to the feature amounts obtained by inverting any u bits of the quantization code (0 is inverted to 1, and 1 to 0) are obtained, and a motion vector M is generated from the pixel position information of the pixel with the minimum distance and that of the target pixel. This processing is repeated while increasing the number u of inverted bits until the magnitude of the motion vector M falls below the maximum value M_Max. That is, the process of finding the pixel with the minimum distance from the target pixel is repeated while gradually increasing the Hamming distance from the quantization code indicating the feature amount of the target pixel.
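The overall search, first with the exact feature amount and then with a gradually increasing number of inverted bits, might be sketched as follows. The dictionary standing in for the database 41, the Euclidean distance, the sample data, and the behaviour when no candidate is ever found (returning None) are all assumptions for illustration.

```python
from itertools import combinations
import math

def flip_codes(code, n_bits, u):
    """All codes at Hamming distance u from code (u = 0 yields code itself)."""
    return [code ^ sum(1 << p for p in pos)
            for pos in combinations(range(n_bits), u)]

def find_motion_vector(target, code, database, n_bits, m_max):
    """Try u = 0, 1, 2, ... inverted bits until a vector shorter than m_max is found."""
    for u in range(n_bits + 1):
        best, best_dist = None, math.inf
        for c in flip_codes(code, n_bits, u):
            for pos in database.get(c, []):      # address reference
                d = math.hypot(pos[0] - target[0], pos[1] - target[1])
                if d < best_dist:                # keep the closest candidate
                    best, best_dist = pos, d
        if best is not None and best_dist < m_max:
            vector = (target[0] - best[0], target[1] - best[1])
            return vector, u
    return None, None   # assumption: the description does not cover this case

# Hypothetical database: feature code -> pixel positions in the reference frame
database = {0b000: [(50, 50)], 0b010: [(11, 12)]}
print(find_motion_vector((10, 10), 0b000, database, n_bits=3, m_max=8))
# ((-1, -2), 1)
```

In this example the only pixel with the identical code lies too far away, so the search falls back to the codes at Hamming distance 1 and finds a nearby pixel there.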
[0087]
In this case, since the calculation contents are only address reference, difference calculation, and branching, the calculation amount does not increase significantly.
[0088]
When each pixel value is represented by, for example, 8 bits, an image such as computer graphics (CG) can be matched using full-bit (8-bit) information. In a natural image, however, pixel values vary from frame to frame, so it is desirable to perform the matching after excluding predetermined bits from the plurality of bits. Specifically, the lower few bits may be masked, or the values may be requantized with a smaller number of bits. In other words, it is desirable to reduce the number of quantization bits in the linear or nonlinear quantization.
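The two options mentioned above, masking the lower bits and linear requantization to fewer bits, can be illustrated as follows. This is a minimal sketch assuming 8-bit pixel values; the function names and the sample values are hypothetical.

```python
def mask_lower_bits(value: int, k: int) -> int:
    """Zero the k least significant bits of an 8-bit pixel value."""
    return value & ~((1 << k) - 1) & 0xFF

def requantize(value: int, bits: int) -> int:
    """Linearly requantize an 8-bit value to the given number of bits."""
    return value >> (8 - bits)

# Two slightly different observations of the same pixel in adjacent frames
a, b = 0b10110101, 0b10110110   # 181 and 182
print(a == b, mask_lower_bits(a, 2) == mask_lower_bits(b, 2))  # False True
print(requantize(a, 4), requantize(b, 4))                      # 11 11
```

Values that differ only in the excluded bits compare as equal, although two values that straddle a quantization boundary would still differ.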
[0089]
Next, details of the feature quantity extraction unit 32 will be described.
[0090]
FIG. 10 is a block diagram illustrating a configuration example of the feature amount extraction unit 32.
[0091]
The class tap extraction unit 51 extracts, from the input image information, the information (pixel values) of the peripheral pixels (class taps) corresponding to the target pixel that is needed for feature amount extraction, and outputs it to the DR calculation unit 52 and the ADRC code generation unit 53. The class tap extraction unit 51 has a tap table 51a that stores class tap patterns, and changes the pattern of the class taps to be extracted based on the instruction information input from the DR determination unit 54 (according to the number of instructions received).
[0092]
That is, the tap table 51a stores the following class tap patterns, as shown in the figures: a pattern of 8 pixels (hatched in the figure) arranged at the outermost peripheral positions of a block of 3 pixels × 3 pixels centered on the target pixel (shown as a black circle in the figure); a pattern of 16 pixels arranged at the outermost peripheral positions of a block of 5 pixels × 5 pixels centered on the target pixel; a pattern of 24 pixels arranged at the outermost peripheral positions of a block of 7 pixels × 7 pixels centered on the target pixel; and, although not shown, in general a pattern of 4(n−1) pixels arranged at the outermost peripheral positions of a block of n pixels × n pixels centered on the target pixel.
[0093]
Further, the class tap pattern may consist of 4 pixels, one pixel each above, below, left, and right of the target pixel, as shown in FIG. 14; of 8 pixels, two pixels each above, below, left, and right of the target pixel; or of 12 pixels, three pixels each above, below, left, and right of the target pixel, as shown in FIG. 16. Although not shown, it may also consist of 4m pixels, m pixels each above, below, left, and right of the target pixel. Furthermore, the class tap may have a configuration other than those illustrated, for example an arrangement that is asymmetric with respect to the target pixel.
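Both families of class tap patterns can be expressed as lists of offsets relative to the target pixel. The sketch below is illustrative only: the function names are hypothetical, and offsets are written as (dy, dx) with the target pixel at (0, 0).

```python
def ring_taps(n: int) -> list[tuple[int, int]]:
    """Offsets on the outermost ring of an n x n block (n odd, n >= 3)."""
    r = n // 2
    return [(dy, dx)
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)
            if max(abs(dy), abs(dx)) == r]   # keep only the border of the block

def cross_taps(m: int) -> list[tuple[int, int]]:
    """Offsets of m pixels each above, below, left, and right of the target."""
    offsets = []
    for k in range(1, m + 1):
        offsets += [(-k, 0), (k, 0), (0, -k), (0, k)]
    return offsets

print([len(ring_taps(n)) for n in (3, 5, 7)])    # [8, 16, 24], i.e. 4 * (n - 1)
print([len(cross_taps(m)) for m in (1, 2, 3)])   # [4, 8, 12], i.e. 4 * m
```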
[0094]
Based on the instruction information from the DR (dynamic range) determination unit 54, the class tap extraction unit 51 sequentially reads from the tap table 51a class tap patterns that are ranked so that the range of the class taps widens around the position of the target pixel, and switches to them. To do so, the class tap extraction unit 51 controls a built-in counter indicating the rank and extracts the class taps of the corresponding rank.
[0095]
That is, suppose that class tap patterns that change as shown in FIGS. 11 to 13 (patterns of 4(n−1) pixels arranged at the outermost peripheral positions of a block of n pixels × n pixels centered on the target pixel) are stored in the tap table 51a together with their sequential ranks. In a state where no instruction information has been received from the DR determination unit 54, the class tap extraction unit 51 reads the class tap information using the rank 1 pattern shown in FIG. 11. When instruction information is subsequently received for the first time, it changes to the rank 2 pattern shown in FIG. 12 and reads the class tap information; when the second instruction information is received, it changes to the rank 3 pattern shown in FIG. 13 and reads the class tap information. Each time further instruction information is received, it extracts the class tap information using the next-ranked pattern of 4(n−1) pixels arranged at the outermost peripheral positions of a block of n pixels × n pixels centered on the target pixel. As described above, the rank of the class tap pattern changes based on the number of times the instruction information has been received. Note that the change of the pattern used by the class tap extraction unit 51 need not be a regular change among similar shapes (squares) as in FIGS. 11 to 13; the shape of the arrangement itself may change. The pattern may also be changed as shown in FIGS. 14 to 16, in which case the ranks are set in the order of FIG. 14, FIG. 15, and FIG. 16.
[0096]
The DR (dynamic range) calculation unit 52 obtains the dynamic range from the class tap information (pixel values) input from the class tap extraction unit 51 and outputs it to the DR determination unit 54 and the ADRC (Adaptive Dynamic Range Coding) code generation unit 53. It also outputs the minimum value obtained when calculating the dynamic range to the ADRC code generation unit 53. For example, if the class tap pattern is the one shown in FIG. 14 and the information (pixel value levels) of the class taps C1 to C4 is (C1, C2, C3, C4) = (60, 90, 51, 100), the relationship is as shown in the figure. In this case, the dynamic range is defined from the difference between the maximum value and the minimum value of the pixel value levels, by the following equation (2).
DR = Max−Min + 1 (2)
[0097]
Here, Max is the maximum value and Min is the minimum value of the pixel value levels of the class tap. 1 is added in order to count classes (for example, when the classes 0 and 1 are set, the difference between them is 1, but there are 2 classes, so 1 is added to the difference). Therefore, in the case of FIG. 17, the pixel value level 100 of the class tap C4 is the maximum value and the pixel value level 51 of the class tap C3 is the minimum value, so DR is 50 (= 100 − 51 + 1).
[0098]
Thus, since the DR calculation unit 52 detects the minimum value and the maximum value of the pixel value levels of the class tap when calculating DR, it outputs the minimum value (or the maximum value, since the minimum value can be obtained from the maximum value and DR) to the ADRC code generation unit 53.
[0099]
The ADRC code generation unit 53 generates and outputs a quantization code composed of the ADRC codes of the pixel value levels of the class tap, using the DR value and the minimum value Min input from the DR calculation unit 52. More specifically, the ADRC code is obtained by substituting each pixel value level of the class tap into the following equation (3).
Q = Round ((L−Min + 0.5) × (2 ^ n) / DR) (3)
[0100]
Here, Round indicates truncation, L indicates the pixel value level, n indicates the number of assigned bits, and (2 ^ n) indicates 2 to the nth power.
[0101]
Therefore, for example, when the number of assigned bits n is 1, the pixel value level of each class tap is coded as 1 if it is greater than or equal to the threshold th given by the following equation (4), and as 0 if it is smaller than the threshold th.
th = DR / 2−0.5 + Min (4)
[0102]
As a result, when the number of assigned bits n is 1 and the class tap shown in FIG. 17 is obtained, the threshold th is 75.5 (= 50 / 2 − 0.5 + 51), so the ADRC code becomes ADRC (C1, C2, C3, C4) = 0101.
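Equations (2) to (4) can be checked numerically against the example values (60, 90, 51, 100). The following is a minimal sketch; the function names are hypothetical, and math.floor is used for Round because the description states that Round indicates truncation.

```python
import math

def adrc_code(taps, n_bits=1):
    """Quantize each class-tap level with equations (2) and (3)."""
    lo, hi = min(taps), max(taps)
    dr = hi - lo + 1                                              # equation (2)
    codes = [math.floor((level - lo + 0.5) * (2 ** n_bits) / dr)  # equation (3)
             for level in taps]
    return dr, lo, codes

def threshold(dr, lo):
    """1-bit decision threshold of equation (4)."""
    return dr / 2 - 0.5 + lo

taps = [60, 90, 51, 100]   # (C1, C2, C3, C4)
dr, lo, codes = adrc_code(taps)
print(dr, lo, threshold(dr, lo), "".join(map(str, codes)))  # 50 51 75.5 0101
```

Because (L − Min + 0.5) is always smaller than DR, the quantized value never exceeds 2^n − 1, so each tap fits in n bits.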
[0103]
Here, the description returns to the configuration of the feature amount extraction unit 32 in FIG.
[0104]
The DR determination unit 54 determines whether or not the DR value input from the DR calculation unit 52 is equal to or greater than a predetermined threshold DR_th. If DR is smaller than DR_th, it instructs the class tap extraction unit 51 to change the class tap pattern and also instructs the ADRC code generation unit 53 not to execute the ADRC code generation processing using the currently input DR.
[0105]
That is, depending on the class tap, the same ADRC code may be generated from different DRs. In that case, the smaller the DR, the more easily the code is affected by errors, and the more likely the ADRC code is to change from frame to frame.
[0106]
More specifically, consider, for example, the case where the pixel value levels (C1, C2, C3, C4) = (74, 80, 65.5, 85.5) are obtained with the class tap shown in FIG. 14, as shown in FIG. 18. In FIG. 18, these pixel value levels are indicated by crosses. The black circles in the figure reproduce the example shown in FIG. 17, whose dynamic range is indicated by DR1 (= 50).
[0107]
In FIG. 18, DR2, the dynamic range of the class tap indicated by crosses, is 20 and the minimum value is 65.5, so the threshold is 75.5 and the generated ADRC code is 0101, the same ADRC code as in the case of FIG. 17.
[0108]
Incidentally, it is known that each pixel value level contains a certain amount of error. In FIG. 18, the range of the error in each data item is indicated by a vertical bar. Here, the error range of the class tap C1 indicated by a cross contains the threshold, so depending on the error its level may exceed the threshold th, and the code of that class tap may be inverted.
[0109]
If such a phenomenon occurs in the class tap corresponding to the pixel at the start point or end point of a motion vector, the ADRC code that should have identified the start point or end point changes, and an accurate motion vector may not be generated. In addition, when DR is small, the pixel value levels of the class tap are only slightly dispersed, so many of them may lie near the threshold, and the codes of multiple bits may be inverted when their error ranges cross the threshold.
[0110]
From the above, the larger the DR, the smaller the ADRC code error, and the smaller the DR, the larger the ADRC code error.
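The sensitivity of a small-DR class tap can be reproduced with the two example tap sets. In the sketch below, the error of +2 applied to C1 is a hypothetical value chosen for illustration, and the function name adrc_1bit is likewise an assumption; the code itself follows equations (2) and (3) with n = 1.

```python
import math

def adrc_1bit(taps):
    """1-bit ADRC code string for a class tap, per equations (2) and (3)."""
    lo, hi = min(taps), max(taps)
    dr = hi - lo + 1
    return "".join(str(math.floor((v - lo + 0.5) * 2 / dr)) for v in taps)

wide = [60, 90, 51, 100]        # large dynamic range (FIG. 17 example)
narrow = [74, 80, 65.5, 85.5]   # small dynamic range (FIG. 18 example)
noise = [2, 0, 0, 0]            # hypothetical error of +2 on C1 only

for taps in (wide, narrow):
    noisy = [v + e for v, e in zip(taps, noise)]
    print(adrc_1bit(taps), adrc_1bit(noisy))
# 0101 0101
# 0101 1101
```

The same error leaves the code of the wide class tap unchanged but inverts the first bit of the narrow one, because C1 = 74 lies just below the threshold of 75.5.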
[0111]
Therefore, when the dynamic range DR is smaller than the threshold DR_th, the class tap pattern is changed, for example from the rank 1 pattern of FIG. 11 to the rank 2 pattern of FIG. 12, so that DR becomes larger by increasing the number of class taps and widening the extraction range around the target pixel. If DR is still smaller than DR_th after the pattern has been changed, the class tap extraction unit 51 is instructed to change to a pattern of still higher rank, and the pattern is changed until DR exceeds DR_th.
[0112]
Next, the feature amount calculation process will be described with reference to the flowchart of FIG.
[0113]
In step S91, the class tap extraction unit 51 initializes a counter variable s indicating the rank of the class tap pattern to 1.
[0114]
In step S92, the class tap extraction unit 51 reads the image for one frame stored in the frame memory 31, reads the information of the pattern of rank s from the tap table 51a, extracts the corresponding class tap for the target pixel in the image, and outputs it to the DR calculation unit 52 and the ADRC code generation unit 53. That is, in the first pass, the class tap of the rank s = 1 pattern (for example, the class tap pattern of FIG. 11 or FIG. 14 described above) is extracted and output.
[0115]
In step S93, the DR calculation unit 52 obtains the dynamic range of the class tap input from the class tap extraction unit 51, outputs it to the ADRC code generation unit 53 and the DR determination unit 54, and outputs the minimum value detected at that time to the ADRC code generation unit 53. That is, when the pixel value levels of the class tap shown in FIG. 17 are input, 50 is output as DR and 51 as the minimum value Min, as described above.
[0116]
In step S94, the DR determination unit 54 determines whether or not the DR input from the DR calculation unit 52 is larger than the predetermined threshold DR_th. If it is determined that DR is not larger than DR_th, that is, that the pixel value levels of the class tap are not sufficiently dispersed, then in step S96 the DR determination unit 54 outputs an instruction to change the class tap pattern to the class tap extraction unit 51 and instructs the ADRC code generation unit 53 not to generate an ADRC code using the DR and the minimum value Min input from the DR calculation unit 52.
[0117]
In step S97, the class tap extraction unit 51 increments the counter variable s indicating the rank by 1, and the process returns to step S92. That is, in this case the counter variable s becomes 2, and the class tap is extracted after changing to a pattern that extends farther from the target pixel, such as the pattern of FIG. 12 or FIG. 15 described above.
[0118]
In this way, the processes of steps S92 to S97 are repeated until DR exceeds the predetermined threshold value DR_th.
[0119]
If it is determined in step S94 that the DR input from the DR calculation unit 52 is larger than the predetermined threshold DR_th, in step S95 the ADRC code generation unit 53 generates an ADRC code from each pixel value level of the class tap input from the class tap extraction unit 51, using the DR and the minimum value Min input from the DR calculation unit 52, and outputs it.
[0120]
As described above, in consideration of the error contained in the pixel value levels detected as class taps, the range from which the class taps are extracted is widened until a dynamic range of at least a predetermined size is secured. This suppresses the errors, such as those due to noise, that arise when the ADRC code is generated, and allows more accurate motion vectors to be detected.
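The loop of steps S91 to S97 can be sketched as follows for the square ring patterns, where rank s corresponds to a block of (2s + 1) × (2s + 1) pixels. This is an illustration under several assumptions not stated in the description: the image is a list of rows, coordinates outside the image are clamped to the border, the search gives up at a hypothetical max_rank, and the test image is synthetic.

```python
import math

def ring_taps(n):
    """Offsets (dy, dx) on the outermost ring of an n x n block (n odd)."""
    r = n // 2
    return [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)
            if max(abs(dy), abs(dx)) == r]

def extract_feature(image, y, x, dr_th, max_rank=3):
    """Widen the class tap until DR > dr_th, then emit a 1-bit ADRC code."""
    h, w = len(image), len(image[0])
    for s in range(1, max_rank + 1):                  # S91, S97: rank counter s
        taps = [image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy, dx in ring_taps(2 * s + 1)]   # S92 (border clamping assumed)
        lo = min(taps)
        dr = max(taps) - lo + 1                       # S93: equation (2)
        if dr > dr_th:                                # S94
            code = "".join(str(math.floor((v - lo + 0.5) * 2 / dr)) for v in taps)
            return s, dr, code                        # S95: equation (3), n = 1
    return None                                       # assumption: stop at max_rank

# Synthetic 7 x 7 image: flat 3 x 3 centre, horizontal gradient outside it
image = [[100 + (8 * x if max(abs(y - 3), abs(x - 3)) >= 2 else 0)
          for x in range(7)] for y in range(7)]
print(extract_feature(image, 3, 3, dr_th=10)[:2])   # (2, 33)
```

The rank 1 tap lies entirely in the flat region (DR = 1), so the pattern is widened once and the code is generated from the 16 pixels of the rank 2 pattern.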
[0121]
In the above, an example has been described in which DR is increased by changing the class tap pattern for the target pixel so that a correct motion vector can be detected. Instead of changing the class tap pattern, however, the image may be reduced.
[0122]
FIG. 20 shows a configuration of the feature amount extraction unit 32 in which, when DR is smaller than the predetermined threshold DR_th, DR is increased by reducing the image while keeping the same class tap pattern.
[0123]
In FIG. 20, portions corresponding to those in FIG. 10 are denoted by the same reference numerals, and description thereof will be omitted below as appropriate.
[0124]
The feature amount extraction unit 32 of FIG. 20 differs from that of FIG. 10 in that an image reduction unit 71 is newly provided, and a class tap extraction unit 72 and a DR determination unit 73 are provided instead of the class tap extraction unit 51 and the DR determination unit 54.
[0125]
The image reduction unit 71 reduces the input image based on an instruction from the DR determination unit 73 and outputs the reduced image to the class tap extraction unit 72. The image reduction unit 71 includes a counter indicating the reduction rate. In the default state (when no instruction has been received from the DR determination unit 73), it outputs the input image to the class tap extraction unit 72 as is, without reducing it.
[0126]
The class tap extraction unit 72 has basically the same function as the class tap extraction unit 51. Unlike the class tap extraction unit 51, however, it does not have the tap table 51a; using a predetermined fixed class tap pattern, it extracts the class tap corresponding to the target pixel of the image input from the image reduction unit 71 and outputs it to the DR calculation unit 52 and the ADRC code generation unit 53.
[0127]
The DR determination unit 73 has basically the same function as the DR determination unit 54. However, when it determines that DR is smaller than the predetermined threshold DR_th, it sends the image reduction unit 71 an instruction to reduce the image.
[0128]
Next, the feature amount calculation processing of FIG. 20 will be described with reference to the flowchart of FIG. Note that the processing of steps S113 to S116 in the flowchart of FIG. 21 is the same as the processing of steps S92 to S95 in the flowchart of FIG.
[0129]
In step S111, the image reducing unit 71 initializes a counter variable t indicating a reduction rate to 1.
[0130]
In step S112, the image reduction unit 71 reads the image information stored in the frame memory 31, reduces the image to 1/(t^2) times its size (where t^2 denotes the square of t) by thinning out rows and columns so that the image becomes 1/t times in each of the horizontal and vertical directions, and outputs it to the class tap extraction unit 72. That is, in the first pass the reduction ratio is 1 (= 1/(1^2)), so the image stored in the frame memory 31 is output to the class tap extraction unit 72 without being reduced.
[0131]
In step S115, if the DR determination unit 73 determines that DR is not larger than the predetermined threshold DR_th, then in step S117 the DR determination unit 73 transmits instruction information for reducing the image to the image reduction unit 71.
[0132]
In step S118, the image reduction unit 71 increments the counter variable t indicating the reduction rate by 1 based on the instruction from the DR determination unit 73, and the process returns to step S112. Therefore, for example, when t = 2, in step S112, the image is reduced to 1/2 in the horizontal direction and the vertical direction (reduced to 1 / (2 ^ 2)) and the subsequent processing is repeated.
[0133]
That is, suppose, for example, that a class tap consisting of a pattern of 8 pixels (hatched in the figure) arranged at the outermost peripheral positions of a block of 3 pixels × 3 pixels centered on the target pixel (shown as a black circle in the figure) is extracted from an image of 2L (pixels) × 2M (pixels), as shown in FIG. 22. If DR is determined not to be larger than the predetermined threshold DR_th, the image is reduced to an L × M image that is 1/2 times the size in each of the horizontal and vertical directions (every other row and column is thinned out), as shown in FIG. 23, and a class tap consisting of a pattern of 8 pixels (filled with grid lines in FIG. 23) arranged at the outermost peripheral positions of a block of 3 pixels × 3 pixels centered on the target pixel is extracted from it. As a result, the class tap of the reduced image shown in FIG. 23 corresponds to selecting, in the image before reduction shown in FIG. 22, the pixels located two pixels away from the target pixel in the horizontal and vertical directions (filled with grid lines in FIG. 22).
[0134]
In this way, reducing the image and extracting class taps with the same pattern effectively changes the class tap pattern so that it spreads spatially from the position of the target pixel in the original image. The same effect as that of the feature amount extraction unit 32 described with reference to FIG. 10 can therefore be obtained.
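The equivalence between a fixed pattern on the reduced image and a widened pattern on the original image can be verified directly. The sketch below assumes reduction by simple thinning (keeping every t-th row and column, without filtering); the function names and the synthetic 8 × 8 image are hypothetical.

```python
def decimate(image, t):
    """Reduce the image to 1/t in each direction by keeping every t-th row and column."""
    return [row[::t] for row in image[::t]]

def ring3(image, y, x):
    """The 8 pixels on the outer ring of the 3 x 3 block centred on (y, x)."""
    return [image[y + dy][x + dx]
            for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]

image = [[10 * y + x for x in range(8)] for y in range(8)]   # synthetic 8 x 8 image
small = decimate(image, 2)                                   # 4 x 4 image

# Pixels two positions away from (4, 4) in the original image
spaced = [image[4 + 2 * dy][4 + 2 * dx]
          for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
print(ring3(small, 2, 2) == spaced)   # True
```

The class tap around pixel (2, 2) of the reduced image consists of exactly the pixels at a distance of two around pixel (4, 4) of the original image.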
[0135]
Further, in the above, examples have been described in which the class taps are spatially widened so that DR becomes equal to or greater than the predetermined threshold DR_th. Alternatively, when DR is small, for example, the ADRC code may be expressed accurately by changing the bit length used to represent the pixel value level of each pixel extracted as a class tap.
[0136]
FIG. 24 is a block diagram illustrating a configuration example of the feature amount extraction unit 32 that generates an accurate ADRC code by changing a bit length representing a pixel value level to be extracted as a class tap.
[0137]
In FIG. 24, portions corresponding to those in FIG. 10 or FIG. 20 are denoted by the same reference numerals, and description thereof will be omitted below as appropriate. The feature amount extraction unit 32 of FIG. 24 differs from that of FIG. 10 in that a used bit length determination unit 81 and an ADRC code generation unit 82 are provided instead of the DR determination unit 54 and the ADRC code generation unit 53.
[0138]
The used bit length determination unit 81 has a table 81a that stores in advance the used bit length corresponding to each DR for use in generating an ADRC code. Based on the DR input from the DR calculation unit 52, it determines from this table the used bit length of the class tap data (the pixel value of each pixel) and outputs information on the determined used bit length to the ADRC code generation unit 82.
[0139]
The ADRC code generation unit 82 has basically the same function as the ADRC code generation unit 53 of FIG. 10, but it generates the ADRC code based on the used bit length information input from the used bit length determination unit 81.
[0140]
Here, a method of accurately expressing the ADRC code (quantization code) by changing the bit length indicating each pixel value level of the class tap will be described.
[0141]
For example, suppose that the class tap extraction unit 72 expresses the pixel value level of each pixel extracted as a class tap by its upper 2 bits. If, as shown in the figure, the class tap C1 lies in the range 0 to 63, the class tap C2 in the range 64 to 127, the class tap C3 in the range 128 to 191, and the class tap C4 in the range 192 to 255, so that the dynamic range DR_L is large, then a 2-bit ADRC code can express the value of each class tap as a different value. In other words, if each code of the ADRC code is 00 for pixel value levels 0 to 63, 01 for 64 to 127, 10 for 128 to 191, and 11 for 192 to 255, the codes corresponding to the class taps (C1, C2, C3, C4) are (00, 01, 10, 11), so the code for each pixel value can be expressed with a difference.
[0142]
However, when the dynamic range DR_S is small, for example as shown in FIG. 26, where the class taps C1 to C4 follow a tendency similar to that in FIG. 25 but all lie in the same range (128 to 191), the 2-bit representation expresses every class tap as 10. As a result, when the pixel value level is represented by 2 bits, the differences between the pixels cannot be expressed.
[0143]
Therefore, in such a case, if the pixel value level is expressed with more than 2 bits, a different pixel value level can be expressed for each pixel, and the codes indicating the pixel value levels can also be made to differ.
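The effect of the bit length can be illustrated by taking the upper bits of 8-bit pixel values. In the sketch below, the pixel value levels in both lists are hypothetical values chosen to fall into the ranges discussed above, and the function name upper_bits is an assumption.

```python
def upper_bits(level: int, bits: int) -> str:
    """Binary string of the top `bits` bits of an 8-bit pixel value."""
    return format(level >> (8 - bits), f"0{bits}b")

spread = [30, 100, 150, 220]    # hypothetical levels, one in each 64-wide range
narrow = [130, 145, 160, 185]   # hypothetical levels, all within 128 to 191

print([upper_bits(v, 2) for v in spread])   # ['00', '01', '10', '11']
print([upper_bits(v, 2) for v in narrow])   # ['10', '10', '10', '10']
print([upper_bits(v, 4) for v in narrow])   # ['1000', '1001', '1010', '1011']
```

With 2 bits the narrow class tap collapses to a single code, whereas 4 bits are enough to give each of these four levels a different code.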
[0144]
In summary, the bit length used to express the pixel value level should be selected so that a resolution of half the actual dynamic range of the pixel value levels (DR/2) can be expressed. For example, when the pixel value level is expressed in 8 bits, as shown in FIG. 27, a required resolution (= DR/2) of 128 calls for a used bit length of 1 bit or more; a required resolution of 64, 2 bits or more; a required resolution of 32, 3 bits or more; a required resolution of 16, 4 bits or more; a required resolution of 8, 5 bits or more; a required resolution of 4, 6 bits or more; a required resolution of 2, 7 bits or more; and a required resolution of 1, 8 bits or more.
[0145]
In terms of the dynamic range DR, as shown in FIG. 28, when DR is 256 the used bit length should be 1 bit or more; when DR is 128, 2 bits or more; when DR is 64, 3 bits or more; when DR is 32, 4 bits or more; when DR is 16, 5 bits or more; when DR is 8, 6 bits or more; when DR is 4, 7 bits or more; and when DR is 2, 8 bits or more.
[0146]
The table 81a of the used bit length determination unit 81 stores the dynamic range DR shown in FIG. 28 and the corresponding used bit length information.
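The correspondence between DR and the used bit length can be derived from the rule that the quantization step must not exceed DR/2. The following is a minimal sketch assuming 8-bit pixel values; the function name is hypothetical, and the handling of DR values that are not powers of two or are smaller than 2 is an assumption, since the description lists only the power-of-two entries.

```python
def used_bit_length(dr: int) -> int:
    """Smallest bit length b (1 to 8) whose step 256 / 2**b does not exceed DR / 2."""
    for b in range(1, 9):
        if 2 * (256 >> b) <= dr:   # step <= DR / 2, written without fractions
            return b
    return 8   # assumption: fall back to full 8-bit precision for DR below 2

table = {dr: used_bit_length(dr) for dr in (256, 128, 64, 32, 16, 8, 4, 2)}
print(table)
# {256: 1, 128: 2, 64: 3, 32: 4, 16: 5, 8: 6, 4: 7, 2: 8}
```

A lookup table such as the table 81a avoids recomputing this rule for every class tap.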
[0147]
Next, with reference to the flowchart of FIG. 29, the process of extracting the feature amount corresponding to the target pixel by adjusting the bit length (the process of generating an ADRC code based on the class tap corresponding to the target pixel) will be described. In the flowchart of FIG. 29, the processing of steps S131 and S132 is the same as the processing of steps S113 and S114 described above.
[0148]
In step S133, the used bit length determination unit 81 refers to the table 81a and determines the used bit length based on the DR calculated by the DR calculation unit 52 in step S132, and outputs information on the determined used bit length to the ADRC code generation unit 82.
[0149]
In step S134, the ADRC code generation unit 82 expresses each pixel value level of the class tap input from the class tap extraction unit 72 with the used bit length input from the used bit length determination unit 81, and generates and outputs an ADRC code using the DR and the minimum value Min input from the DR calculation unit 52.
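The processing of step S134 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the ADRC code generation unit 82: it assumes 8-bit pixel values, assumes that expressing a level with the used bit length means keeping only the upper bits of each value, assumes DR = Max - Min + 1, and assumes a 1-bit ADRC code per tap pixel. The function name `adrc_code` and its parameters are hypothetical.

```python
def adrc_code(tap, used_bits, total_bits=8):
    """Generate a 1-bit-per-pixel ADRC code from a class tap.

    Assumptions (not taken from the specification): each pixel level is first
    reduced to its upper `used_bits` bits, and a pixel's code bit is 1 when its
    reduced level lies in the upper half of the range [Min, Min + DR).
    """
    shift = total_bits - used_bits
    # Express each pixel value level with the used bit length (drop low bits).
    levels = [(p >> shift) << shift for p in tap]
    lo = min(tap)              # minimum value Min of the class tap
    dr = max(tap) - lo + 1     # dynamic range DR of the class tap
    code = 0
    for level in levels:
        bit = 1 if (level - lo) * 2 >= dr else 0
        code = (code << 1) | bit   # pack bits in tap order, first pixel highest
    return code
```

For the tap `[100, 101, 102, 103]` (DR = 4), this sketch yields the code `0b0011` with 8 used bits, but with only 2 used bits every level collapses to 64 and the code becomes 0. This is the kind of loss of pixel-to-pixel difference that a longer bit length for a small DR is meant to avoid.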
[0150]
As described above, the bit length of the data representing the pixel value levels extracted as the class tap is changed according to the DR. When the dynamic range is small, the bit length expressing the pixel value level is lengthened so that differences between pixels are reflected in the code for each pixel, and as a result an accurate ADRC code can be generated. When the dynamic range is large, the bit length is kept from becoming longer than necessary, which makes it possible to absorb frame-to-frame changes in the pixel value level and suppress the influence of noise.
[0151]
In the above description, as shown in FIG. 4, the motion detection unit 21 holds the feature amount for the reference frame Fr in the buffer memory 33 and stores it in the database 41 at the next timing in order to detect a motion vector. Alternatively, for example, as shown in FIG. 30, a frame memory 31-1 that stores the current frame Fc and a frame memory 31-2 that stores that frame as the reference frame Fr at the next timing may be provided, together with feature amount extraction units 32-1 and 32-2 for the current frame Fc and the reference frame Fr, respectively. The rest of the configuration in FIG. 30 is the same as that of the motion detection unit 21 in FIG. 4, and therefore its description is omitted.
[0152]
In the above, the case where an ADRC code is generated as the quantization code expressing the feature amount has been described. However, the quantization code expressing the feature amount is not limited to the ADRC code. For example, the dynamic range DR or the minimum value Min obtained from the class tap corresponding to the target pixel, a Laplacian, a Sobel operator, the pixel value of the center pixel of the class tap, or the average or sum of N pixel values near the target pixel may be used.
[0153]
According to the above, the DR or the bit length is adaptively changed based on the magnitude of the DR of the class tap corresponding to the target pixel, so that an appropriate quantization code can be generated and the motion vector can be obtained accurately.
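The way a quantization code is used to obtain a motion vector, as set out in the claims, can be sketched as follows: pixel positions of the first frame are stored per feature amount, the positions sharing the target pixel's feature amount are read out, and the nearest one becomes the start point of the vector. This is only an illustrative sketch. The dictionary-based database, the squared Euclidean distance, and the names `build_database` and `detect_motion_vector` are assumptions and do not describe the database 41 or the motion vector detection unit 35 themselves.

```python
from collections import defaultdict


def build_database(features):
    """Group first-frame pixel positions by feature amount.

    `features` maps a pixel position (x, y) in the first frame to its
    quantization code (hypothetical input format).
    """
    db = defaultdict(list)
    for pos, code in features.items():
        db[code].append(pos)
    return db


def detect_motion_vector(db, code, target):
    """Return the motion vector for a target pixel of the second frame.

    The start point is the stored first-frame position with the same code that
    is nearest to `target`; the end point is `target` itself. Returns None when
    no first-frame pixel has the same code.
    """
    candidates = db.get(code, [])
    if not candidates:
        return None
    tx, ty = target
    # Assumption: "smallest distance" is measured as Euclidean distance.
    sx, sy = min(candidates, key=lambda p: (p[0] - tx) ** 2 + (p[1] - ty) ** 2)
    return (tx - sx, ty - sy)
```

For example, if first-frame positions (0, 0) and (4, 4) both have code 5, a target pixel at (3, 3) with code 5 is matched to (4, 4), giving the vector (-1, -1).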
[0154]
The series of processes described above can be executed by hardware, but can also be executed by software. When the series of processes is executed by software, the program constituting the software is installed from a recording medium into a computer incorporated in dedicated hardware, or into, for example, a general-purpose personal computer that can execute various functions by installing various programs.
[0155]
FIG. 31 shows the configuration of an embodiment of a personal computer in which the motion detection unit 21 is realized by software. The CPU 201 of the personal computer controls the overall operation of the personal computer. When a command is input by the user from the input unit 206, such as a keyboard or a mouse, via the bus 204 and the input/output interface 205, the CPU 201 executes the program stored in the ROM (Read Only Memory) 202 accordingly. Alternatively, the CPU 201 loads into the RAM (Random Access Memory) 203 and executes a program that was read from the magnetic disk 211, the optical disk 212, the magneto-optical disk 213, or the semiconductor memory 214 connected to the drive 210 and installed in the storage unit 208. The output unit 207 outputs the execution result. Further, the CPU 201 controls the communication unit 209 to communicate with the outside and exchange data.
[0156]
As shown in FIG. 31, the recording medium on which the program is recorded is constituted not only by package media that are distributed to provide the program to the user separately from the computer, namely the magnetic disk 211 (including a flexible disk), the optical disk 212 (including a CD-ROM (Compact Disc-Read Only Memory) and a DVD (Digital Versatile Disc)), the magneto-optical disk 213 (including an MD (Mini-Disc)), or the semiconductor memory 214 on which the program is recorded, but also by the ROM 202 in which the program is recorded and the hard disk included in the storage unit 208, which are provided to the user in a state of being incorporated in the computer in advance.
[0157]
In this specification, the steps describing the program recorded on the recording medium include not only processing performed in time series in the order described, but also processing that is executed in parallel or individually and is not necessarily performed in time series.
[0158]
[Effect of the invention]
According to the present invention, an appropriate quantization code can be generated, so that a motion vector can be accurately obtained.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration of a conventional motion detection unit.
FIG. 2 is a diagram for explaining a motion vector detection method.
FIG. 3 is a flowchart for explaining motion detection processing by the motion detection unit in FIG. 1.
FIG. 4 is a block diagram illustrating a configuration of an embodiment of a motion detection unit to which the present invention is applied.
FIG. 5 is a diagram for explaining the structure of the database of FIG. 4.
FIG. 6 is a flowchart for describing reference frame information generation processing by the motion detection unit in FIG. 4.
FIG. 7 is a flowchart for explaining motion vector detection processing by the motion detection unit in FIG. 4.
FIG. 8 is a flowchart for explaining motion vector detection processing by the motion detection unit of FIG. 4;
FIG. 9 is a diagram illustrating a feature amount in the vicinity of a feature amount of a target pixel.
FIG. 10 is a block diagram illustrating a configuration of an embodiment of a feature amount extraction unit to which the present invention is applied.
FIG. 11 is a diagram illustrating class taps.
FIG. 12 is a diagram illustrating class taps.
FIG. 13 is a diagram illustrating class taps.
FIG. 14 is a diagram illustrating class taps.
FIG. 15 is a diagram illustrating class taps.
FIG. 16 is a diagram illustrating class taps.
FIG. 17 is a diagram illustrating an ADRC code.
FIG. 18 is a diagram illustrating an example of an error that occurs when DR is small.
FIG. 19 is a flowchart for describing feature amount calculation processing by the feature amount extraction unit illustrated in FIG. 10.
FIG. 20 is a block diagram illustrating a configuration of another embodiment of a feature amount extraction unit to which the present invention is applied.
FIG. 21 is a flowchart for describing feature amount calculation processing by the feature amount extraction unit illustrated in FIG. 20.
FIG. 22 is a diagram illustrating a process of extracting a class tap by reducing an image.
FIG. 23 is a diagram illustrating processing for reducing class taps by reducing an image.
FIG. 24 is a block diagram illustrating a configuration of another embodiment of a feature amount extraction unit to which the present invention is applied.
FIG. 25 is a diagram for describing processing for changing the used bit length in accordance with DR.
FIG. 26 is a diagram for describing processing for changing the used bit length in accordance with DR.
FIG. 27 is a diagram for explaining a relationship between required resolution and used bit length.
FIG. 28 is a diagram illustrating a relationship between a dynamic range and a used bit length.
FIG. 29 is a flowchart for describing feature amount calculation processing by the feature amount extraction unit illustrated in FIG. 24.
FIG. 30 is a block diagram illustrating a configuration of another embodiment of a motion detection unit to which the present invention is applied.
FIG. 31 is a diagram illustrating a medium.
[Explanation of symbols]
31 frame memory, 32 feature extraction unit, 33 buffer memory, 34 database control unit, 35 motion vector detection unit, 41 database, 51 class tap extraction unit, 51a tap table, 52 DR calculation unit, 53 ADRC code generation unit, 54 DR determination unit, 71 image reduction unit, 72 class tap extraction unit, 73 DR determination unit, 81 used bit length determination unit, 81a table, 82 ADRC code generation unit

Claims

An image processing apparatus comprising:
class tap extraction means for extracting, as a class tap, the pixel values of 4m (m ≧ 1) peripheral pixels corresponding to a target pixel among the pixels of a first frame in an input image;
dynamic range detecting means for detecting the dynamic range of the class tap;
bit length determining means for determining, based on the dynamic range, the bit length of the class tap so that the bit length is reduced when the dynamic range is large and increased when the dynamic range is small;
quantization code generation means for generating a quantization code as a feature amount from the pixel value levels of the class tap expressed with the bit length determined by the bit length determining means;
storage means for storing, with every pixel of the first frame taken as the target pixel, the feature amount generated by the quantization code generation means and information on the pixel position in the first frame corresponding to that feature amount;
reading means for reading from the storage means, for a target pixel in a second frame different from the first frame, the information on the pixel positions of the first frame corresponding to the feature amount generated by the quantization code generation means; and
motion vector detecting means for detecting a motion vector whose start point is, among the pixel positions of the first frame read by the reading means, the pixel position having the smallest distance from the target pixel in the second frame, and whose end point is the pixel position of the target pixel in the second frame.

The image processing apparatus according to claim 1, wherein the bit length determining means includes a table storing bit length information associated with the dynamic range so that the bit length is reduced when the dynamic range is large and increased when the dynamic range is small, reads the bit length information from the table based on the dynamic range, and determines the bit length of the class tap.

The image processing apparatus according to claim 1, further comprising motion vector comparison means for comparing the magnitude of the motion vector detected by the motion vector detecting means with the magnitude of a predetermined motion vector, wherein:
when the comparison result of the motion vector comparison means indicates that the magnitude of the motion vector detected by the motion vector detecting means is greater than the magnitude of the predetermined motion vector, the reading means reads from the storage means the pixel position information of the first frame corresponding to a quantization code similar to the quantization code of the target pixel in the second frame; and
the motion vector detecting means detects a motion vector whose start point is, among the pixel positions of the first frame read by the reading means for the similar quantization code, the pixel position having the smallest distance from the target pixel in the second frame, and whose end point is the pixel position of the target pixel in the second frame.

An image processing method comprising:
a class tap extraction step of extracting, as a class tap, the pixel values of 4m (m ≧ 1) peripheral pixels corresponding to a target pixel among the pixels of a first frame in an input image;
a dynamic range detecting step of detecting the dynamic range of the class tap;
a bit length determination step of determining, based on the dynamic range, the bit length of the class tap so that the bit length is reduced when the dynamic range is large and increased when the dynamic range is small;
a quantization code generation step of generating a quantization code as a feature amount from the pixel value levels of the class tap expressed with the bit length determined in the bit length determination step;
a storage step of storing, with every pixel of the first frame taken as the target pixel, the feature amount generated in the quantization code generation step and information on the pixel position in the first frame corresponding to that feature amount;
a reading step of reading, for a target pixel in a second frame different from the first frame, the information on the pixel positions of the first frame corresponding to the feature amount generated in the quantization code generation step; and
a motion vector detecting step of detecting a motion vector whose start point is, among the pixel positions of the first frame read in the reading step, the pixel position having the smallest distance from the target pixel in the second frame, and whose end point is the pixel position of the target pixel in the second frame.

A recording medium on which is recorded a computer-readable program comprising:
a class tap extraction step of extracting, as a class tap, the pixel values of 4m (m ≧ 1) peripheral pixels corresponding to a target pixel among the pixels of a first frame in an input image;
a dynamic range detecting step of detecting the dynamic range of the class tap;
a bit length determination step of determining, based on the dynamic range, the bit length of the class tap so that the bit length is reduced when the dynamic range is large and increased when the dynamic range is small;
a quantization code generation step of generating a quantization code as a feature amount from the pixel value levels of the class tap expressed with the bit length determined in the bit length determination step;
a storage step of storing, with every pixel of the first frame taken as the target pixel, the feature amount generated in the quantization code generation step and information on the pixel position in the first frame corresponding to that feature amount;
a reading step of reading, for a target pixel in a second frame different from the first frame, the information on the pixel positions of the first frame corresponding to the feature amount generated in the quantization code generation step; and
a motion vector detecting step of detecting a motion vector whose start point is, among the pixel positions of the first frame read in the reading step, the pixel position having the smallest distance from the target pixel in the second frame, and whose end point is the pixel position of the target pixel in the second frame.

A program for causing a computer to execute processing comprising:
a class tap extraction step of extracting, as a class tap, the pixel values of 4m (m ≧ 1) peripheral pixels corresponding to a target pixel among the pixels of a first frame in an input image;
a dynamic range detecting step of detecting the dynamic range of the class tap;
a bit length determination step of determining, based on the dynamic range, the bit length of the class tap so that the bit length is reduced when the dynamic range is large and increased when the dynamic range is small;
a quantization code generation step of generating a quantization code as a feature amount from the pixel value levels of the class tap expressed with the bit length determined in the bit length determination step;
a storage step of storing, with every pixel of the first frame taken as the target pixel, the feature amount generated in the quantization code generation step and information on the pixel position in the first frame corresponding to that feature amount;
a reading step of reading, for a target pixel in a second frame different from the first frame, the information on the pixel positions of the first frame corresponding to the feature amount generated in the quantization code generation step; and
a motion vector detecting step of detecting a motion vector whose start point is, among the pixel positions of the first frame read in the reading step, the pixel position having the smallest distance from the target pixel in the second frame, and whose end point is the pixel position of the target pixel in the second frame.