JP3988879B2

JP3988879B2 - Stereo image generation method, stereo image generation apparatus, stereo image generation program, and recording medium

Info

Publication number: JP3988879B2
Application number: JP2003016302A
Authority: JP
Inventors: 秋彦橋本; 肇能登; 憲二中沢
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2003-01-24
Filing date: 2003-01-24
Publication date: 2007-10-10
Anticipated expiration: 2023-01-24
Also published as: JP2004229093A

Description

【０００１】
【発明の属する技術分野】
本発明は、立体動画像データ生成方法及び立体動画像データ生成装置、ならびに立体動画像データ生成プログラム及び記録媒体に関し、特に、携帯機器等で表示する立体動画像データの生成に適用して有効な技術に関するものである。
【０００２】
【従来の技術】
従来、画像通信の分野では、より高い臨場感が得られる立体画像による動画通信が注目されており、学会発表や企業のデモ試作等も盛んになりつつある。ここで、前記立体画像とは、色や輝度を記録した二次元画像と、計測地点から撮影対象までの距離を記録した距離画像または奥行き画像と呼ばれる画像の組み合わせであるとする。また、前記距離画像とは、各画素に撮影対象までの距離を記録した画像である。また、前記動画とは、例えば、毎秒３０フレーム程度以上の画像からなるものとする。
【０００３】
前記立体画像による動画を生成するときには、例えば、ＴＶカメラ等で前記色や輝度を記録した二次元画像からなる動画像データを入力し、距離計測器等で前記距離画像データを入力する。
【０００４】
また、前記立体画像による動画を生成する方法には、例えば、視差を有する２枚の二次元画像を用いて距離画像データを生成する方法もある。
【０００５】
また、二次元画像の視差画像による立体画像を生成する方法として、距離画像から輪郭部を求めて視差領域を決定し、前後の画像と動きベクトル等を用いて視差量を生成する技術が開示されている（たとえば、特許文献１を参照。）。
【０００６】
【特許文献１】
特開２０００‐２５３４２２号公報
【０００７】
【発明が解決しようとする課題】
しかしながら、前記従来の技術では、前記立体画像による動画通信を行うことが難しいという問題があった。
【０００８】
前記立体画像による動画通信を行うためには、例えば、毎秒３０フレーム以上の距離画像データを入力しなければならないが、従来、そのような高速で距離画像データを取得できる距離計測器は非常に高価であり、携帯機器に組み込んで用いることが難しい。
【０００９】
また、従来の技術では、前記距離画像データは、前記二次元画像と同様に１フレーム時間毎に取得し、入力しているので、消費電力量が高くなる。そのため、充電池等で使用する携帯機器等への組み込みが難しいという問題があった。
【００１０】
また、携帯機器に組み込み可能な技術として、例えば、遠近の区別ができる程度の分解能の低い距離画像データを入力し、これを用いて書き割り形式の立体画像表示を行う方法がある。また、前記分解能の低い距離画像から、擬似的な距離画像データを生成して立体表示する方法もある。ここで、前記擬似的な距離画像データとは、距離を計測によってではなく、人為的に生成したデータであるとする。すなわち、前記擬似的な距離画像データは、必ずしも現実の距離と一致しているわけではなく、近似の度合いは生成のアルゴリズムに依存する。
【００１１】
しかしながら、前記各方法でも、前記各距離画像データは１フレーム時間毎に取得するので、消費電力量が高い。そのため、携帯機器で立体画像を動画入力することが難しいという問題があった。
【００１２】
また、前記特許文献１に記載されたような技術でも、前記距離画像データは１フレーム時間毎に取得するので、消費電力量が高く、携帯機器に組み込んで用いることが難しいという問題があった。
【００１３】
本発明の目的は、動画の立体画像を容易に生成することが可能な技術を提供することにある。
【００１４】
本発明の他の目的は、動画の立体画像を生成する装置の消費電力量を低減することが可能な技術を提供することにある。
【００１５】
本発明の他の目的は、立体画像による動画入力を容易にし、かつ、立体画像生成時の消費電力量を低減することが可能なプログラム及び記録媒体を提供することにある。
【００１６】
本発明の前記ならびにその他の目的と新規な特徴は、本明細書の記述及び添付図面によって明らかになるであろう。
【００１７】
【課題を解決するための手段】
本願において開示される発明の概要を説明すれば、以下の通りである。
（１）１フレーム時間毎の動画像データを入力するステップと、前記動画像データから立体画像を生成するためのデータ（以下、距離画像データと称する）を、間欠的なフレーム間隔で入力するステップと、前記動画像データ及び前記入力された距離画像データに基づいて、前記距離画像データが存在しないフレーム時刻の距離画像データを補間生成するステップとを有する立体画像生成方法であって、前記距離画像データを補間生成するステップは、前記入力された距離画像データから、遠近の境目領域を求めるステップと、前記入力された距離画像データと同じフレーム時刻の動画像データから、前記境目領域と同じ位置、または前記境目領域の近傍の輪郭部領域を求めるステップと、前記距離画像データが存在しないフレーム時刻の動画像データから、前記輪郭部領域に相当する領域を求めるステップと、前記輪郭部領域と、前記輪郭部領域に相当する領域の移動量に基づいて、前記境目領域及びその内部領域を移動させて、前記距離画像データを生成するステップとを有する。
【００１９】
（２）１フレーム時間毎の動画像データを入力するステップと、前記動画像データから立体画像を生成するためのデータ（以下、距離画像データと称する）を、間欠的なフレーム間隔で入力するステップと、前記動画像データ及び前記入力された距離画像データに基づいて、前記距離画像データが存在しないフレーム時刻の距離画像データを補間生成するステップとを有する立体画像生成方法であって、前記距離画像データを補間生成するステップは、前記入力された距離画像データと同じフレーム時刻の動画像データを選択するステップと、前記選択された動画像データから、前記距離画像データが存在しないフレーム時刻の動画像データまでの間の画像の動きベクトルを求めるステップと、前記入力された距離画像データを複数の領域に分割し、前記動きベクトルに基づいて前記各領域を移動させて、距離画像データを生成するステップとを有する。
【００２０】
（３）前記（１）または（２）の立体画像生成方法において、前記距離画像データを入力するステップは、前記距離画像データを生成可能な補助画像データを、間欠的なフレーム間隔で入力するステップと、前記補助画像データ、または前記補助画像データ及び前記補助画像データと同じフレーム時刻の動画像データに基づいて距離画像データを生成するステップとを有する。
【００２１】
前記（１）または（２）の立体画像生成方法によれば、距離計測器などの前記距離画像データを取得する手段を高速で動作させることができない場合でも、立体画像による動画を容易に得ることができる。
【００２３】
また、前記距離画像データを入力するステップは、距離計測器等を用いて、直接距離画像データを入力してもよいし、前記（３）の各ステップのような処理を行って前記補助画像データから間接的に生成した距離画像データを入力してもよい。
【００２４】
（４）１フレーム時間毎の動画像データを入力する動画像データ入力手段と、前記動画像データから立体画像を生成するためのデータ（以下、距離画像データと称する）を、間欠的なフレーム間隔で入力する距離画像データ入力手段と、前記動画像データ入力手段から入力された動画像データ及び前記距離画像データ入力手段から入力された距離画像データに基づいて、前記距離画像データが存在しないフレーム時刻の距離画像データを補間生成する距離画像データ補間手段とを備える立体画像生成装置であって、前記距離画像データ補間手段は、前記距離画像データ入力手段から入力された距離画像データから、遠近の境目領域を求める手段と、前記距離画像データと同じフレーム時刻の動画像データから、前記境目領域と同じ位置、または前記境目領域の近傍の輪郭部領域を求める手段と、前記距離画像データが存在しないフレーム時刻の動画像データから、前記輪郭部領域に相当する領域を求める手段と、前記輪郭部領域と、前記輪郭部領域に相当する領域の移動量に基づいて、前記境目領域及びその内部領域を移動させて、前記距離画像データを生成する手段とを備える。
【００２６】
（５）１フレーム時間毎の動画像データを入力する動画像データ入力手段と、前記動画像データから立体画像を生成するためのデータ（以下、距離画像データと称する）を、間欠的なフレーム間隔で入力する距離画像データ入力手段と、前記動画像データ入力手段から入力された動画像データ及び前記距離画像データ入力手段から入力された距離画像データに基づいて、前記距離画像データが存在しないフレーム時刻の距離画像データを補間生成する距離画像データ補間手段とを備える立体画像生成装置であって、前記距離画像データ補間手段は、前記入力された距離画像データと同じフレーム時刻の動画像データを選択する手段と、前記選択された動画像データから、前記距離画像データが存在しないフレーム時刻の動画像データまでの間の画像の動きベクトルを求める手段と、前記入力された距離画像データを複数の領域に分割し、前記動きベクトルに基づいて前記各領域を移動させて、距離画像データを生成する手段とを備える。
【００２７】
（６）前記（４）または（５）の立体画像生成装置において、前記距離画像データ入力手段は、前記距離画像データを生成可能な補助画像データを、間欠的なフレーム間隔で入力する補助画像データ入力手段と、前記補助画像データ、または前記補助画像データ及び前記補助画像データと同じフレーム時刻の動画像データに基づいて距離画像データを生成する手段とを備える。
【００２８】
前記（４）または（５）の立体画像生成装置によれば、距離計測器などの距離画像データを取得する手段を高速で動作させることができない場合でも、立体画像による動画を容易に得ることができる。
【００２９】
また、前記距離画像データを取得する手段が、１フレーム時間毎に距離画像データを取得できる手段であっても、動作を間欠的にすることで、消費電力量を低くすることができる。
【００３１】
また、前記距離画像データ入力手段は、距離計測器等から前記距離画像データを直接入力する手段であってもよいし、前記（６）のように、前記補助画像データを入力する手段と、前記補助画像データから距離画像データを生成する手段を設けて、間接的に距離画像データを入力してもよい。
【００３２】
以上のようなことから、前記（４）から（６）までのいずれかの立体画像生成装置を用いることで、前記立体画像生成装置の携帯機器への組み込みを容易にすることができる。
【００３３】
（７）前記（１）から（３）までのいずれかの立体画像生成方法の、各ステップをコンピュータに実行させるための立体画像生成プログラムである。
【００３４】
前記（７）の立体画像生成プログラムによれば、コンピュータなどの汎用機器（装置）を用いて容易に立体画像を生成させることができる。
【００３５】
（８）前記（７）の立体画像生成プログラムがコンピュータで読み取り可能に記録された記録媒体。
【００３６】
前記（８）の記録媒体によれば、ある１つの記録媒体に記録された前記（７）の立体画像生成プログラムを他の記録媒体にコピー、あるいはネットワークを通して提供することができる。そのため、前記立体画像生成プログラムを実行可能な装置であれば、どのような装置でも立体画像を生成させることができる。
【００３７】
以下、本発明について、図面を参照して実施の形態（実施例）とともに詳細に説明する。
なお、実施例を説明するための全図において、同一機能を有するものは、同一符号を付け、その繰り返しの説明は省略する。
【００３８】
【発明の実施の形態】
（実施例１）
図１は、本発明による実施例１の立体画像生成装置の概略構成を示す模式図である。
【００３９】
本実施例１の立体画像生成装置は、図１に示すように、動画像データ取得手段１Ａから動画像データを入力する動画像データ入力手段１Ｂと、距離画像データ取得手段２Ａから距離画像データを入力する距離画像データ入力手段２Ｂと、前記動画像データ入力手段１Ｂ及び前記距離画像データ入力手段２Ｂに入力する各データの制御をする制御手段３と、前記動画像データ入力手段１Ｂから入力された動画像データを記録する動画像データ記録手段４と、前記距離画像データ入力手段２Ｂから入力された距離画像データを記録する距離画像データ記録手段５と、距離画像データを補間生成する距離画像データ補間手段６とを備える。
【００４０】
また、前記立体画像生成装置は、例えば、毎秒３０フレーム以上の動画像データと距離画像データの組からなる立体画像を生成する装置であり、生成した前記立体画像は、図１に示したように、演算手段７で演算し、表示手段８で表示することにより立体化された動画像を得ることができる。このとき、前記動画像データ入力手段１Ｂ、前記距離画像データ入力手段２Ｂ、前記制御手段３、前記動画像データ記録手段４、前記距離画像データ記録手段５、前記距離画像データ補間手段６、前記演算手段７の各手段は、コンピュータのＣＰＵやメモリを用いることができる。
【００４１】
図２及び図３は、本実施例１の立体画像生成装置を用いた立体画像生成方法を説明するための模式図であり、立体画像生成方法の全体的な処理手順を示す図である。
【００４２】
本実施例１の立体画像生成装置を用いて前記立体画像を生成するには、まず、図２に示すように、動画像データＣＭ（Ｔ）及び距離画像データＲＭ（Ｔ）を入力し、前記動画像データ記録手段４及び前記距離画像データ記録手段５に記録する。このとき、前記動画像データＣＭ（Ｔ）は、１フレーム時間毎に入力するが、前記距離画像データＲＭ（Ｔ）はＮフレーム間隔、たとえば、図２に示したように、３フレーム毎の間欠的なフレーム間隔で入力する。またこのとき、前記距離画像データＲＭ（Ｔ）は、たとえば、１bitの分解能の距離画像データであって、黒い領域１１は近景領域、たとえば、撮影対象１０に相当する領域であり、白い領域は遠景領域であるとする。
【００４３】
しかしながら、この状態では、前記距離画像データＲＭが存在しないフレーム時刻の動画像データＣＭ（Ｔ＋１），ＣＭ（Ｔ＋２）があるので、そのままでは前記立体画像を生成したことにならない。そこで、前記動画像データ及び距離画像データを用いて、存在しないフレーム時間の距離画像データを補間生成する。
【００４４】
本実施例１の立体画像生成方法では、図２及び図３に示すように、初期化（ステップ１４０１）した後、前記距離画像データＲＭ（Ｔ）から、近景と遠景の境目領域１２を求め、この画像をＱ（Ｔ）とする（ステップ１４０２）。
【００４５】
次に、前記距離画像データＲＭ（Ｔ）と同じフレーム時刻Ｔの動画像データＣＭ（Ｔ）から、前記画像Ｑ（Ｔ）の境目領域１２と同じ領域、または近傍の位置にある輪郭部領域１３を求め、この画像をＲ（Ｔ）とする（ステップ１４０３）。
【００４６】
次に、距離画像データが存在しないフレーム時刻Ｔ＋１の動画像データＣＭ（Ｔ＋１）から、前記画像Ｒ（Ｔ）の輪郭部領域１３と同じ領域を求め、この画像をＲ（Ｔ＋１）とする（ステップ１４０４）。前記映像領域を求めるためには、たとえば、既存のパターンマッチング法（田村秀行著、「コンピュータ画像処理入門」、総研出版、p.148‐153）を用いる。パターンマッチング法には様々なバリエーションがあり、適宜選択することができる。
【００４７】
次に、前記ステップ１４０４で求めた画像Ｒ（Ｔ＋１）における輪郭部領域１３の、前記画像Ｒ（Ｔ）の輪郭部領域１３に対する移動量に合わせて、前記フレーム時刻Ｔの距離画像データＲＭ（Ｔ）における近景領域１１を移動させることで、フレーム時刻Ｔ＋１の距離画像データＲＭ（Ｔ＋１）を生成する（ステップ１４０５）。
【００４８】
その後、次のフレーム時刻Ｔ＋２の距離画像データＲＭ（Ｔ＋２）が存在するか調べる（ステップ１４０６，１４０７，１４０８，１４０９）。本実施例１では、フレーム時刻Ｔ＋２の距離画像データＲＭ（Ｔ＋２）は存在しないので、前記フレーム時刻Ｔ＋１の距離画像データＲＭ（Ｔ＋１）と同様の手順で生成する。このとき、フレーム時刻がＴ＋２の前記動画像データの映像領域の位置は、たとえば、図２に示すように、前記フレーム時刻がＴの動画像データの輪郭部領域と直接パターンマッチングして求めても良いし、フレーム時刻がＴ＋１の動画像データの輪郭部領域とパターンマッチングして求めても良い。
【００４９】
以上の手順を繰り返すことにより、前記距離画像データが存在しないフレーム時刻の距離画像データを順次補間生成することができる。そのため、本実施例１の立体画像生成装置では、１フレーム時間毎に距離画像データを取得する必要がない。そのため、間欠的なフレーム間隔でしか動作できない距離画像データ取得手段２Ａを用いることが可能である。また、前記距離画像データ取得手段２Ａが、１フレーム時間毎の距離画像データを取得できる場合でも、間欠的に動作させることにより、消費電力量を低減することができる。また、本実施例１では、前記距離画像データを３フレーム時間毎に入力しているが、これに限らず、たとえば、５フレーム時間毎あるいは１０フレーム時間毎に入力することも可能であり、フレーム時間間隔を大きくすることで、消費電力量を大幅に低減することができる。
【００５０】
図４乃至図１８は、本実施例１の立体画像生成方法の具体例を説明するための図である。
【００５１】
本実施例１の立体画像生成方法では、まず、図４に示したように、１フレーム毎の動画像データＣＭと、Ｎフレーム毎の距離画像データＲＭを入力し、前記動画像データ記録手段４及び前記距離画像データ記録手段５に記録する。このとき、画像と分解能の低い距離画像がともに存在するフレーム時刻をＴとすると、ＣＭ（Ｔ），ＣＭ（Ｔ＋１），・・・，ＣＭ（Ｔ＋Ｎ−１）の動画像に対してＲＭ（Ｔ）が１枚存在する。これらの画像から、ＲＭ（Ｔ＋１），・・・，ＲＭ（Ｔ＋Ｎ−１）の距離画像データを生成することによって動画の立体画像データを生成する。ここで、ＣＭもＲＭも縦横Ｘ，Ｙ画素の画像サイズであるとする。また、ＣＭ（Ｔ）の座標（ｘ，ｙ）のデータ値をＣＭ（Ｔ）_x,yと記述し、ＲＭ（Ｔ）の座標（ｘ，ｙ）のデータ値をＲＭ（Ｔ）_x,yと記述する。ＣＭ（Ｔ）_x,yはＲＧＢデータであり、ＲＭ（Ｔ）_x,yは０か１の２値データである。ここで、１は近景、０は遠景を表すものとする。
【００５２】
また、前記動画像データＣＭ及び前記距離画像データＲＭを入力するときには、図５に示すように、動画像撮影用のＴＶカメラ（動画像データ取得手段）１Ａと、距離画像撮影用のＴＶカメラ（距離画像データ取得手段）２Ａと、フラッシュなどのパルス光を照射する光源１５とを前記制御手段３で制御する。
【００５３】
このとき、前記距離画像データ取得手段２Ａの撮影光軸は、全反射ミラー１６及びハーフミラー１７によって前記動画像データ取得手段１Ａの撮影光軸と合わせ、同じ撮影画角及び同じ撮影視点で撮影できるように設定する。
【００５４】
前記動画像データ取得手段１Ａを用いて前記動画像データＣＭを入力するときには、前記制御手段３から前記動画像データ取得手段１Ａに対して、図６（ａ）に示すように、あらかじめ設定されたフレーム時間毎に制御信号を出力し、前記動画像データを撮影し、前記動画像データ入力手段１Ｂに入力する。このとき、フレームの時間間隔ΔＴは、例えば、１／３０秒に設定する。
【００５５】
一方、前記距離画像データＲＭを入力するときには、前記制御手段３から前記距離画像データ取得手段２Ａに対して、図６（ｂ）に示すように、あらかじめ設定したフレーム間隔のＮ倍の時間間隔、例えば、前記動画像データのフレーム間隔の３倍の時間間隔で制御信号を出力し、間欠的な距離画像データを撮影し、前記距離画像データ入力手段２Ｂに入力する。このとき、前記制御手段３は、前記光源１５にも同じタイミングで制御信号を出力し、撮影対象１０を照明しておく。
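The control timing described in the two paragraphs above can be sketched as follows. This is only an illustration under stated assumptions, not the controller of the embodiment: the function name `capture_schedule` and its parameters are hypothetical, while the frame interval ΔT = 1/30 s and the factor N = 3 are the example values given in the text.

```python
from fractions import Fraction


def capture_schedule(num_frames, n, frame_interval=Fraction(1, 30)):
    """Return (time, capture_distance) pairs for num_frames video frames.

    The moving image CM is captured at every frame time k * frame_interval.
    The distance image RM (and the light source pulse) is triggered only
    when k is a multiple of n, i.e. at N times the frame interval.
    """
    return [(k * frame_interval, k % n == 0) for k in range(num_frames)]


if __name__ == "__main__":
    for time, has_distance in capture_schedule(7, 3):
        print(f"t = {time} s, distance capture: {has_distance}")
```

With N = 3 the distance camera and the light source fire on frames 0, 3, 6 and stay idle otherwise, which is where the power saving claimed in the text comes from.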
【００５６】
なお、前記動画像データ取得手段１Ａと前記距離画像データ取得手段２Ａの撮影開始時刻は、少なくとも前記動画像データ取得手段１Ａの露光時間以上ずらしておくか、もしくは、光源１５から照射する照明光に赤外光を用い、前記距離画像データ取得手段２Ａに赤外カメラを使う等することによって、照明光が前記動画像データ取得手段１Ａの撮影に影響しないようにする。
【００５７】
また、高速度カメラを用いることにより前記動画像データ取得手段１Ａと前記距離画像データ取得手段２Ａを一台のカメラで兼用することも可能である。この場合、前記全反射ミラー１６及び前記ハーフミラー１７は不要である。
【００５８】
また、前記距離画像データ取得手段２Ａで撮影した距離画像データＲＭＰ（Ｔ）は、例えば、遠近の区別のみを記録した分解能の低い距離画像データＲＭ（Ｔ）に変換する。そこでまず、図７に示すように、時刻Ｔに前記動画像データ取得手段１Ａで撮影した動画像データＣＭ（Ｔ）にもっとも撮影時刻の近い距離画像データＲＭＰ（Ｔ）を選ぶ（ステップ１８０１）。この２枚の画像データは同じ撮影時刻に同じ撮影対象を照明の有無のみが異なって撮影した画像データとみなすことができる。
【００５９】
次に、前記動画像データＣＭ（Ｔ）と距離画像データＲＭＰ（Ｔ）において、同座標（ｘ，ｙ）でＣＭ（Ｔ）_x,y／ＲＭＰ（Ｔ）_x,yの比を求め（ステップ１８０２）、ある設定値ｋ以上の比であった場合にはＲＭ（Ｔ）_x,y＝１（ステップ１８０３）、そうでない場合にはＲＭ（Ｔ）_x,y＝０（ステップ１８０４）とすることによってＲＭ（Ｔ）_x,yを生成する。以降、座標（ｘ，ｙ）を更新（ステップ１８０５）して、すべての座標においてＲＭ（Ｔ）_x,yを生成し（ステップ１８０６）、距離画像データＲＭ（Ｔ）とする。
【００６０】
照明した撮影画像は照明しない撮像画像と比較すると、近景の撮影対象は明るくなるのに対して、遠景の画像の明るさはあまり変わらないため、この判定処理によって、近景領域は１、遠景領域は０で表現される分解能の低い距離画像データＲＭ（Ｔ）が生成されることになる。なお、前記分解能の低い距離画像データＲＭ（Ｔ）は、比の代わりに差を用いて生成しても構わない。
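The per-pixel decision of steps 1802 to 1806 can be sketched as below. Several points are assumptions rather than part of the source: the images are taken as scalar luminance arrays (the text says CM holds RGB data), the function name and the `eps` guard against division by zero are hypothetical, and the ratio is oriented as illuminated over non-illuminated (RMP/CM) so that pixels that brighten under the flash become 1 (near), in line with the rationale in the paragraph above. The text itself writes the ratio as CM(T)_x,y / RMP(T)_x,y; with that orientation the comparison would have to be reversed to give the same result.

```python
def near_far_from_flash(cm_lum, rmp_lum, k, eps=1e-6):
    """Build the 1-bit distance image RM(T) from a flash / no-flash pair.

    cm_lum  : luminance of CM(T), captured without the pulsed light
    rmp_lum : luminance of RMP(T), captured with the pulsed light
    k       : threshold on the brightness ratio (set in advance)
    Returns a list of rows with 1 = near view, 0 = distant view.
    """
    height, width = len(cm_lum), len(cm_lum[0])
    rm = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # eps avoids dividing by zero on black pixels (assumption).
            ratio = rmp_lum[y][x] / max(cm_lum[y][x], eps)
            rm[y][x] = 1 if ratio >= k else 0
    return rm


if __name__ == "__main__":
    cm = [[10, 10], [10, 10]]
    rmp = [[11, 40], [10, 12]]
    print(near_far_from_flash(cm, rmp, k=2))
```

As the text notes, a difference could be used instead of a ratio; only the expression assigned to `ratio` and the meaning of `k` would change.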
【００６１】
また、本実施例１では分解能の低い距離画像データＲＭ（Ｔ）の入力手段としてフラッシュなどの光源１５を用いる撮影方法を示したが、分解能の低い距離画像データＲＭ（Ｔ）を取得、あるいは入力できるのであれば、前記実施例１に示した方法に限定されるものではなく、近景に焦点のあった画像と遠景に焦点のあった画像（またはパンフォーカス画像）間で空間周波数の大小に応じて距離を算出してもよく、あるいはステレオ撮影して単純に差分を取り、ある設定値を下回る差分値を示した領域を遠景領域として分解能の低い距離画像データＲＭ（Ｔ）を生成すること等が可能である。
【００６２】
次に、図２及び図３に示した手順にしたがって、動画の立体画像データを生成してゆく。
【００６３】
前記距離画像データＲＭ（Ｔ）から境目領域１２の画像Ｑ（Ｔ）を求めるステップ１４０２では、まず、図８に示すように、距離画像データＲＭ（Ｔ）の座標（ｘ，ｙ）の画素ＲＭ（Ｔ）_x,yと隣接する画素の値を比較する（ステップ１４０２ａ，１４０２ｂ）。このとき、前記ＲＭ（Ｔ）_x,yと隣接する画素は、図９（ａ）に示すように、８連結で隣接するＲＭ（Ｔ）_x-1,y-1、ＲＭ（Ｔ）_x-1,y、ＲＭ（Ｔ）_x-1,y+1、ＲＭ（Ｔ）_x,y-1、ＲＭ（Ｔ）_x,y+1、ＲＭ（Ｔ）_x+1,y-1、ＲＭ（Ｔ）_x+1,y、ＲＭ（Ｔ）_x+1,y+1とし、全てが同じデータ値であったときにはＱ（Ｔ）_x,y＝０（ステップ１４０２ｃ）、そうでない場合はＱ（Ｔ）_x,y＝１（ステップ１４０２ｄ）とする。その後、座標（ｘ，ｙ）を更新（ステップ１４０２ｅ）し、すべての画素ＲＭ（Ｔ）_x,yについてＱ（Ｔ）_x,yを求め、図９（ｂ）に示したような、境目領域１２の画像Ｑ（Ｔ）を生成する。
【００６４】
また、前記ステップ１４０２ａでは、画素ＲＭ（Ｔ）_x,yと８連結で隣接する場合を示したが、４連結で隣接するＲＭ（Ｔ）_x-1,y、ＲＭ（Ｔ）_x,y-1、ＲＭ（Ｔ）_x,y+1、ＲＭ（Ｔ）_x+1,yと比較してＱ（Ｔ）を生成しても良い。
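Step 1402 (building the boundary image Q(T) from RM(T)) can be sketched as follows for both the 8-connected and the 4-connected variant. The function name is hypothetical, images are indexed as `rm[y][x]`, and neighbours that fall outside the image are simply ignored, which is an assumption because the text does not say how edge pixels are handled.

```python
def boundary_image(rm, connectivity=8):
    """Return Q(T): 0 where all neighbours equal the pixel, otherwise 1."""
    if connectivity == 8:
        offsets = [(-1, -1), (0, -1), (1, -1), (-1, 0),
                   (1, 0), (-1, 1), (0, 1), (1, 1)]
    elif connectivity == 4:
        offsets = [(0, -1), (-1, 0), (1, 0), (0, 1)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    height, width = len(rm), len(rm[0])
    q = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for dx, dy in offsets:
                nx, ny = x + dx, y + dy
                inside = 0 <= nx < width and 0 <= ny < height
                if inside and rm[ny][nx] != rm[y][x]:
                    q[y][x] = 1  # at least one neighbour differs
                    break
    return q


if __name__ == "__main__":
    rm = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
    for row in boundary_image(rm, connectivity=4):
        print(row)
```

Note that the rule marks pixels on both sides of a near/far transition, so the resulting boundary region is two pixels wide.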
【００６５】
次に、前記動画像データＣＭ（Ｔ）から前記境目領域１２と同じ位置または近傍にある輪郭部領域１３を求めるステップ１４０３は、まず、図１０に示すように、輪郭部領域１３の画像Ｒ（Ｔ）のブロックＲ（Ｔ）_i,jに相当するＱ（Ｔ）_i,jを抽出する（ステップ１４０３ａ）。このとき、前記Ｒ（Ｔ）_i,jは、図１１（ａ）に示すように、横方向がＸ／Ｗ画素，縦方向がＹ／Ｗ画素のブロック画像とし、ｉ，ｊはブロック座標を表している。またこのとき、前記Ｘ，Ｙ，Ｗは、Ｘ／ＷとＹ／Ｗが整数となるように、あらかじめ設定されているものとする。Ｒの各画素は２値データ格納エリアＲｋ_i,jと移動ベクトル格納エリアＲＶ_i,jで構成されているものとする。
【００６６】
また、前記Ｑ（Ｔ）_i,jは、図１１（ｂ）に示すように、縦Ｗ画素，横Ｗ画素毎に分割して抽出する。このとき、前記Ｑ（Ｔ）_i,jの各画素の値を調べ（ステップ１４０３ｂ）、すべて０の場合にはＲｋ（Ｔ）_i,j＝０とし（ステップ１４０３ｃ）、１である画素が含まれる場合にはＲｋ（Ｔ）_i,j＝１とする（ステップ１４０３ｄ）。その後、ｉまたはｊを更新し（ステップ１４０３ｅ）、すべてのＲ（Ｔ）_i,jに対するＲｋ（Ｔ）_i,jを求めるまで繰り返す（ステップ１４０３ｆ）。
【００６７】
前記Ｒｋ（Ｔ）_i,jは、ＣＭ（Ｔ）に対して遠近の境目領域と同じ位置または近傍にある輪郭部領域のみを示すデータである。すなわち、Ｒｋ_i,j＝１ならば、座標（ｉ×Ｗ，ｊ×Ｗ）から座標（（ｉ＋１）×Ｗ−１，（ｊ＋１）×Ｗ−１）を対角の頂点とするブロック領域が境目領域と同じ位置または近傍にある輪郭部領域となる。また、Ｒｋ_i,j＝０ならば、座標（ｉ×Ｗ，ｊ×Ｗ）から座標（（ｉ＋１）×Ｗ−１，（ｊ＋１）×Ｗ−１）を対角の頂点とするブロック領域は境目領域と同じ位置または近傍にある輪郭部領域ではない。そのため、Ｒｋ（Ｔ）_i,jを求めることにより、図１２に示したような、輪郭部領域１３の画像Ｒ（Ｔ）が得られる。
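The reduction from the pixel-level boundary image Q(T) to the block-level flags Rk(T)_i,j (steps 1403a to 1403f) can be sketched as below. The function name is hypothetical and the result is indexed as `rk[j][i]`; as in the text, the image width and height are assumed to be multiples of W.

```python
def block_flags(q, w):
    """Return Rk(T): one flag per W x W block of Q(T).

    rk[j][i] is 1 if the block with corners (i*W, j*W) and
    ((i+1)*W-1, (j+1)*W-1) contains at least one boundary pixel, else 0.
    """
    height, width = len(q), len(q[0])
    if height % w or width % w:
        raise ValueError("image size must be a multiple of the block size")
    return [
        [
            1 if any(q[j * w + v][i * w + u]
                     for v in range(w) for u in range(w)) else 0
            for i in range(width // w)
        ]
        for j in range(height // w)
    ]


if __name__ == "__main__":
    q = [[0, 0, 0, 1],
         [0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
    print(block_flags(q, 2))
```

Only blocks whose flag is 1 need to be tracked in the following steps, which keeps the matching work proportional to the length of the contour rather than to the image area.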
【００６８】
なお、本実施例１では輪郭部領域１３の画像Ｒ（Ｔ）のブロックＲ（Ｔ）_i,jを矩形状のブロックに分割したが、これに限らず、種々の形状で分割し、抽出してよい。
【００６９】
次に、フレーム時刻Ｔ＋ｎの動画像データＣＭ（Ｔ＋ｎ）から、前記ステップ１４０３で求めた輪郭部領域１３に相当する輪郭部領域を求めるステップ１４０４では、まず、図１３に示すように、前記ＣＭ（Ｔ）において、Ｒｋ（Ｔ）_i,j＝１のブロックを抽出する（ステップ１４０４ａ）。
【００７０】
次に、Ｒｋ（Ｔ）_i,j＝１のブロックから、図１４（ａ）に示すように、ＣＭ（Ｔ）の座標（ｉ×Ｗ，ｊ×Ｗ）から座標（（ｉ＋１）×Ｗ−１，（ｊ＋１）×Ｗ−１）のブロックを、縦横Ｗ画素の参照画像Ｂ（Ｔ）_i,jとして切り出す（ステップ１４０４ａ，１４０４ｂ）。
【００７１】
次に、図１４（ｂ）に示すように、ＣＭ（Ｔ＋ｎ）内でもっともＢ（Ｔ）_i,jに画像内容の近いブロックを決定し、このブロックの左上の座標値（ｉ×Ｗ＋ｄｘ，ｊ×Ｗ＋ｄｙ）を得る（ステップ１４０４ｃ）。このときｄｘとｄｙはフレーム数ｎ進んだ画像におけるＢ（Ｔ＋ｎ）_i,jに含まれる輪郭部領域のｘ方向とｙ方向に移動した距離を表す。ｄｘとｄｙをＲＶ（Ｔ＋ｎ）_i,jに格納する（ステップ１４０４ｄ）。
【００７２】
なお、画像内容の近いブロックは、２つのブロック毎に相関量を求め画像内で最も高い相関量を示したブロックを画像内容の近いブロックと決定する。相関量の式は統計論で用いられる厳密な式から画素毎の差分の二乗和等の簡略な式まで様々な既存の式があり、ブロックの探索法（例えば、K.R.Rao 他著、「デジタル放送・インターネットのための情報圧縮技術」、共立出版、p.69‐71）も様々な既存手法が存在するが、本発明ではこれらを特定の式または手法に限定するものではない。また、本実施例ではブロック相関法を用いた例を示したが、輪郭部領域のｘ方向とｙ方向に移動した距離が求められるのであればブロック相関法に限定するものではない。
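The search for the most similar block (step 1404c) can be sketched as follows. The text leaves the correlation measure and the search strategy open; this sketch picks the sum of squared differences, which the text names as one of the simpler measures, and an exhaustive search over a window of ±`search` pixels, which is an assumption. Images are taken as scalar luminance arrays, and all names are hypothetical.

```python
def ssd(a, b, ax, ay, bx, by, w):
    """Sum of squared differences between two W x W blocks."""
    total = 0
    for v in range(w):
        for u in range(w):
            d = a[ay + v][ax + u] - b[by + v][bx + u]
            total += d * d
    return total


def match_block(cm_t, cm_tn, i, j, w, search):
    """Return (dx, dy) for reference block B(T)_i,j searched in CM(T+n).

    The lowest SSD is treated as the highest similarity. Candidate
    positions that would leave the image are skipped.
    """
    height, width = len(cm_tn), len(cm_tn[0])
    x0, y0 = i * w, j * w
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = x0 + dx, y0 + dy
            if x < 0 or y < 0 or x + w > width or y + w > height:
                continue
            cost = ssd(cm_t, cm_tn, x0, y0, x, y, w)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]


if __name__ == "__main__":
    frame_t = [[0] * 8 for _ in range(8)]
    frame_tn = [[0] * 8 for _ in range(8)]
    patch = [[1, 2], [3, 4]]
    for v in range(2):
        for u in range(2):
            frame_t[2 + v][2 + u] = patch[v][u]   # block (1, 1) at time T
            frame_tn[3 + v][4 + u] = patch[v][u]  # same patch moved by (2, 1)
    print(match_block(frame_t, frame_tn, 1, 1, 2, 3))
```

The returned pair corresponds to the dx and dy stored in RV(T+n)_i,j.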
【００７３】
次に、ｉまたはｊを更新して（ステップ１４０４ｅ）、上記の処理を全てのＲｋ（Ｔ）_i,j＝１であるブロックに対して行い（ステップ１４０４ｆ）、各々のブロックに対して移動量ｄｘ，ｄｙを求めてＲＶ（Ｔ＋ｎ）_i,jに格納する。そしてさらに、画像Ｒ（Ｔ）内の各Ｂ（Ｔ）_i,jをＲＶ（Ｔ＋ｎ）_i,jのｄｘ及びｄｙに基づいて移動させることにより、図１５に示すように、フレーム時刻Ｔ＋ｎの動画像データＣＭ（Ｔ＋ｎ）に対応する輪郭部領域１３の画像Ｒ（Ｔ＋ｎ）が得られる。
【００７４】
次に、フレーム時刻Ｔ＋ｎの距離画像データＲＭ（Ｔ＋ｎ）を生成するステップ１４０５では、まず、図１６に示すように、フレーム時刻Ｔの距離画像データＲＭ（Ｔ）から、前記ブロックＢ（Ｔ）_i,jに相当するブロックを、ＲＶ（Ｔ＋ｎ）_i,jのｄｘ及びｄｙに基づいて移動させる（ステップ１４０５ａ）。このとき、前記移動させるブロックは、図１７（ａ）及び図１７（ｂ）に示すように、Ｒｋ（Ｔ）_i,j＝１のブロックのみ、すなわち境目領域１２のみをまず移動させる。
【００７５】
ところで、移動前にブロック内に存在する境目領域はブロック間で連続しているが、移動方向や移動量によっては移動後のブロック間で途切れてしまうことがある。不連続になってしまった境目領域については、各々のブロック境界で途切れた点同士を直線で接続することによって連続性を回復する（ステップ１４０５ｂ）。なお、本実施例では直線補間による接続例を示したが、本発明は接続を直線補間に限定するものではない。
【００７６】
次に、境目領域１２で囲まれた内部領域を全て境目領域内側の近傍領域のデータに置き換えることによって塗りつぶすことで、図１８に示すように、フレーム時刻Ｔ＋ｎの距離画像データＲＭ（Ｔ＋ｎ）が得られる。
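The final filling step can be sketched as below. This is one possible realisation, not necessarily the one used in the embodiment: it assumes the moved boundary has already been reconnected into a closed outline of near-view pixels (value 1), finds every 0-pixel reachable from the image edge with a 4-connected flood fill, and sets everything else to 1. The function name is hypothetical.

```python
from collections import deque


def fill_enclosed(outline):
    """Fill the interior of a closed outline of 1-pixels with 1."""
    height, width = len(outline), len(outline[0])
    outside = [[False] * width for _ in range(height)]
    queue = deque()

    # Seed the flood fill with every 0-pixel on the image edge.
    for y in range(height):
        for x in range(width):
            on_edge = x in (0, width - 1) or y in (0, height - 1)
            if on_edge and outline[y][x] == 0:
                outside[y][x] = True
                queue.append((x, y))

    # Spread through 0-pixels; outline pixels block the spread.
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if (0 <= nx < width and 0 <= ny < height
                    and not outside[ny][nx] and outline[ny][nx] == 0):
                outside[ny][nx] = True
                queue.append((nx, ny))

    return [[0 if outside[y][x] else 1 for x in range(width)]
            for y in range(height)]


if __name__ == "__main__":
    ring = [[0, 0, 0, 0, 0],
            [0, 1, 1, 1, 0],
            [0, 1, 0, 1, 0],
            [0, 1, 1, 1, 0],
            [0, 0, 0, 0, 0]]
    for row in fill_enclosed(ring):
        print(row)
```

If the outline still has a gap, the fill leaks into the interior, which is why the reconnection of step 1405b has to come first.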
【００７７】
その後、図３に示したように、フレーム時刻を更新して、前記各ステップの処理を繰り返し行うことによって、距離画像データを取得していないフレーム時刻における分解能の低い距離画像データＲＭ（Ｔ＋ｎ）が全て生成され、動画の立体画像データが生成されることになる。
【００７８】
図１９は、本実施例１の立体画像生成方法で生成した立体画像の特徴を説明するための模式図である。
【００７９】
本実施例１の方法によって生成される距離画像データは、動画像データから抽出したｘ方向とｙ方向の動きに合わせて距離画像を移動させて生成するものである。そのため、撮影対象のｚ方向の動きを正確に表現することができない。例えば、図１９（ａ）に示すように、撮影対象が撮影地点に近づくと、実際には、図１９（ｂ）に示したように、撮影地点に近づくにつれて、奥行き（ｚ方向）の距離ｚが大きくなる。しかしながら、本実施例１によって生成される距離画像は図１９（ｃ）のように、奥行きの距離ｚがほとんど変わらない。したがって工業計測に使用するような動画の立体画像データの生成には不向きである。しかしながら、映像産業や通信産業で用いられる立体画像通信のような人間が見ることが目的である場合、画像と組み合わせて提示されるので、人間の視覚認識機構が合理的な解釈を行うことによって不自然さが目立たない。また、分解能の低い距離画像や擬似的な距離画像である場合、この点は元々問題にならない。
【００８０】
以上説明したように、本実施例１の立体画像生成装置を用いた立体画像生成方法によれば、１フレーム毎に距離画像データを取得することができない距離画像データ取得手段２Ａを用いても、容易に動画の立体画像を生成することができる。
【００８１】
また、１フレーム毎に距離画像データを取得することができる距離画像データ取得手段２Ａであっても、間欠的に取得することで、前記距離画像データ取得手段２Ａの消費電力量を低減することができる。そのため、前記立体画像生成装置の携帯機器への組み込みが容易になる。
【００８２】
また、前記立体画像生成方法をプログラム化すれば、前記各ステップをコンピュータに実行させることが可能である。そのため、専用の装置を用いることなく、動画の立体画像を容易に生成することができる。このとき、前記立体画像を生成するプログラムは、半導体メモリやハードディスク、ＣＤ−ＲＯＭ等の記録媒体によって提供することも出来るし、ネットワークを通して提供することも可能である。
【００８３】
なお、本実施例１では、分解能の低い距離画像の取得手段としてフラッシュを用いる撮影方法を示したが、分解能の低い距離画像を取得できるのであれば、実施例に示した方法に限定するものではなく、近景に焦点のあった画像と遠景に焦点のあった画像（またはパンフォーカス画像）間で空間周波数の大小に応じて距離を算出してもよく、あるいはステレオ撮影して単純に差分を取り、ある設定値を下回る差分値を示した領域を遠景領域として分解能の低い距離画像を生成すること等が可能である。
【００８４】
（実施例２）
図２０乃至図２１は、本発明による実施例２の立体画像生成方法の全体的な処理手順を説明するための模式図である。
【００８５】
本実施例２の立体画像生成方法は、例えば、前記実施例１で説明したような立体画像生成装置を用いて行うので、装置に関する詳細な説明は省略する。
【００８６】
本実施例２の立体画像生成方法は、まず、図２０に示すように、動画像データＣＭ（Ｔ）及び距離画像データＲＭ（Ｔ）を入力し、前記動画像データ記録手段４及び前記距離画像データ記録手段５に記録する。このとき、前記動画像データＣＭ（Ｔ）は、１フレーム時間毎に入力するが、前記距離画像データＲＭ（Ｔ）はＮフレーム間隔、例えば、図２０に示したように、３フレーム毎の間欠的なフレーム間隔で入力する。
【００８７】
しかしながら、この状態では、前記距離画像データＲＭが存在しないフレーム時刻の動画像データＣＭ（Ｔ＋１），ＣＭ（Ｔ＋２）があるので、そのままでは前記立体画像を生成したことにならない。そこで、前記動画像データ及び距離画像データを用いて、存在しないフレーム時間の距離画像データを補間生成する。
【００８８】
本実施例２の立体画像生成方法では、まず、図２０及び図２１に示したように、あるフレーム時刻の距離画像データＲＭと、同フレーム時刻の動画像データを選択する（ステップ１９０２）。ここでは、図２０に示したように、フレーム時刻Ｔの動画像データＣＭ（Ｔ）と距離画像データＲＭ（Ｔ）が選択された場合を説明する。なお、図２０に示した距離画像データＲＭ（Ｔ），ＲＭ（Ｔ＋３）は、黒い領域が撮影地点からの距離が近い領域（近景領域）１１を表しており、距離が遠くなるにしたがって白くなる。
【００８９】
次に、前記ステップ１９０２で選択した動画像データＣＭ（Ｔ）と、距離画像データのないフレーム時刻Ｔ＋１の動画像データＣＭ（Ｔ＋１）との間での画像の動きベクトルを求める（ステップ１９０３）。この求めた動きベクトルをＭ（Ｔ＋１）に示す。動きベクトルの画像Ｍ（Ｔ＋１）中で矢印の記号ＭＶが検出した動きベクトルである。なお、分かり易いように動きベクトルを矢印で図示したが、実際の画像データＭ（Ｔ＋１）は各画素に動きベクトル情報ＭＶが記録された画像であってこのような矢印で描かれた画像ではない。
【００９０】
次に、フレーム時刻Ｔの距離画像データＲＭ（Ｔ）を複数の領域に分割し、各領域を前記動きベクトルＭＶに従って別々に移動させることによって、フレーム時刻Ｔ＋１における距離画像データＲＭ（Ｔ＋１）を生成する（ステップ１９０４）。ここで前記距離画像データＲＭ（Ｔ）の領域は、動きベクトルＭＶの領域に合わせる。例えば、動きベクトルＭＶが画素毎に与えられているならば、前記距離画像データＲＭ（Ｔ）は画素単位で移動させる。また、動きベクトルＭＶがあるブロック領域の動きベクトルであるならば、各々の領域とはそのブロック領域単位である。この領域は動きベクトルを求める方法に依存するが、本発明は特定の動きベクトル生成法に限定するものではない。
【００９１】
次に、フレーム時刻を更新し（ステップ１９０５）、動画像データＣＭ（Ｔ＋２）及び距離画像データＲＭ（Ｔ＋２）が存在するか判定し（ステップ１９０６，１９０７）、動画像データＣＭ（Ｔ＋２）はあるが距離画像データＲＭ（Ｔ＋２）がない場合には、前記ステップ１９０３に戻って、距離画像データＲＭ（Ｔ＋２）を補間生成する。このとき、前記距離画像データＲＭ（Ｔ＋２）は、例えば、図２０に示したように、フレーム時刻Ｔの動画像データＣＭ（Ｔ）とフレーム時刻Ｔ＋２の動画像データＣＭ（Ｔ＋２）から動きベクトルＭＶを求め、その動きベクトルにしたがって前記フレーム時刻Ｔの距離画像データＲＭ（Ｔ）の各領域を移動させ、フレーム時刻Ｔ＋２の距離画像データＲＭ（Ｔ＋２）を生成する。
【００９２】
以上の手順を繰り返すことにより、前記距離画像データが存在しないフレーム時刻の距離画像データを順次補間生成することができる。そのため、本実施例２の立体画像生成方法でも、１フレーム時間毎に距離画像データを入力する必要がない。そのため、間欠的なフレーム間隔でしか動作できない距離画像データ取得手段２Ａを用いることが可能である。また、前記距離画像データ取得手段２Ａが、１フレーム時間毎の距離画像データを取得できる場合でも、間欠的に動作させることにより、消費電力量を低減することができる。また、本実施例２でも、前記距離画像データを３フレーム時間毎に入力しているが、これに限らず、たとえば、５フレーム時間毎あるいは１０フレーム時間毎に入力することも可能であり、フレーム時間間隔を大きくすることで、消費電力量を大幅に低減することができる。
【００９３】
図２２乃至図２６は、本実施例２の立体画像生成方法の具体例を説明するための模式図である。
【００９４】
本実施例２の立体画像生成方法では、１フレーム時間毎の動画像データＣＭと、Ｎフレーム時間毎の距離画像データＲＭが、半導体メモリ等の動画像データ記録手段４及び距離画像データ記録手段５にあらかじめ記録されているとする。このとき、前記動画像データと前記距離画像データがともに存在するフレーム時刻をＴとすると、ＣＭ（Ｔ），ＣＭ（Ｔ＋１），・・・，ＣＭ（Ｔ＋Ｎ−１）の動画像データに対して距離画像データＲＭ（Ｔ）が１枚存在する。これらの画像から、ＲＭ（Ｔ＋１），・・・，ＲＭ（Ｔ＋Ｎ−１）の距離画像データを補間生成することによって動画の立体画像データを生成する。ここで、ＣＭもＲＭも、横方向がＸ画素、縦方向がＹ画素の画像サイズであるとする。また、ＣＭ（Ｔ）の座標（ｘ，ｙ）のデータ値をＣＭ（Ｔ）_x,yと記述し、ＲＭ（Ｔ）の座標（ｘ，ｙ）のデータ値をＲＭ（Ｔ）_x,yと記述する。ＣＭ（Ｔ）_x,yはＲＧＢデータであり、ＲＭ（Ｔ）_x,yは距離を記録したスカラ値である。
【００９５】
また、前記動画像データＣＭ（Ｔ）を取得する動画像データ取得手段１Ａには、前記実施例１で説明した動画像撮影用のＴＶカメラを用いる。また、前記距離画像データＲＭ（Ｔ）を取得する距離画像データ取得手段２Ａには、たとえば、光切断法（例えば、吉澤徹著、「光三次元計測」、新技術コミュニケーションズ、p.28‐37）、ＴＯＦ（光飛行時間計測法）（例えば、井口征士、「３次元計測の最新の動向」、計測と制御、第３４巻、第６号、p.430や、河北真宏、「ハイビジョン３次元カメラ：Axi-vision カメラ」、平１４年度技研公開公演・研究発表会予稿集、p.58‐63）、種々のパターン投影法（例えば、吉澤徹著、「光三次元計測（第２版）」、新技術コミュニケーションズ、p.77‐99）、種々のステレオ法（例えば、画像電子学会著、「３次元画像用語辞典」、新技術コミュニケーションズ、p.51）等に基づく距離計測装置を使用する。
【００９６】
前記動画像データＣＭ（Ｔ）と間欠的な距離画像データＲＭ（Ｔ）を用いて距離画像データを補間生成するときには、まず、あるフレーム時刻Ｔにおける距離画像データＲＭ（Ｔ）と、同フレーム時刻の画像データＣＭ（Ｔ）を選択する（ステップ１９０２）。
【００９７】
次に、前記フレーム時刻Ｔの動画像データＣＭ（Ｔ）と、距離画像データのないフレーム時刻Ｔ＋ｎの動画像データＣＭ（Ｔ＋ｎ）との間での画像の動きベクトルを求めるステップ１９０３を行う。
【００９８】
前記ステップ１９０３では、図２２に示すように、動画像データＣＭ（Ｔ）を縦Ｗ画素、横Ｗ画素のブロックに格子状に分割する（ステップ１９０３ａ）。このとき、図２３（ａ）及び図２３（ｂ）に示すように、各々のブロックサイズはＷ×Ｗであり、横方向のブロック数はＸ／Ｗ、縦方向のブロック数はＹ／Ｗである。なお、ＸとＹは、Ｗで割り切れるような値があらかじめ設定されているものとする。このとき、ブロック座標（ｉ，ｊ）のブロックをＢ（Ｔ）_i,jと記述する。
【００９９】
次に、動きベクトル画像Ｍ（Ｔ＋ｎ）を生成する。このとき、前記動きベクトル画像Ｍ（Ｔ＋ｎ）は、前記ブロックＢ（Ｔ）_i,jと同じサイズ、すなわち、横Ｘ／Ｗ、縦Ｙ／Ｗのブロック状の二次元データＭＶ（Ｔ＋ｎ）_i,jからなり、各要素には二次元のベクトルデータが格納されるとする。
【０１００】
このとき、まず、各々のブロック座標（ｉ，ｊ）に対応するＢ（Ｔ）_i,jを選択し、図２４（ａ）に示すように、Ｂ（Ｔ）_i,jにもっとも画像内容の近いブロックをＣＭ（Ｔ＋ｎ）内から決定する（ステップ１９０３ｂ）。
【０１０１】
次に、図２４（ｂ）に示すように、前記ＣＭ（Ｔ＋ｎ）から決定したこのブロックの左上の座標値（ｉ×Ｗ＋ｄｘ，ｊ×Ｗ＋ｄｙ）を得る。このときｄｘ及びｄｙはｎフレーム進んだ画像におけるＢ（Ｔ）_i,jの動きベクトルＭＶ（Ｔ＋ｎ）_i,jである。ｄｘ及びｄｙをＭＶ（Ｔ＋ｎ）_i,jに格納する（ステップ１９０３ｃ）。
【０１０２】
なお、前記動きベクトルＭＶ（Ｔ＋ｎ）_i,jを求める処理は、ブロックマッチング法と呼ばれ、例えば、ＭＰＥＧ２画像生成時の動き予測に用いられる公知の手法である。一般に、２つのブロック毎に相関量を求め画像内で最も高い相関量を示したブロックを画像内容の近いブロックと決定するが、相関量の式は統計論で用いられる厳密な式から画素毎の差分の二乗和等の簡略な式まで様々な既存の式があり、ブロックの探索法、移動後のブロック間歪み除去法なども様々な既存手法が存在する。本発明ではこれらを特定の式または手法に限定するものではない。また、本実施例２ではブロックマッチング法を用いた例を示したが、画像内の動きベクトルを求められるのであれば、ブロックマッチング法に限定するものではない。この結果、生成されたＭＶ（Ｔ＋ｎ）_i,jはフレーム時刻ＴからＴ＋ｎに変わった際の画像内の動きを表す。
【０１０３】
その後、ｉまたはｊを更新し（ステップ１９０３ｄ）、ＣＭ（Ｔ）の全てのブロックＢ（Ｔ）_i,jで同様の処理を繰り返す（ステップ１９０３ｅ）。
【０１０４】
次に、前記距離画像データＲＭ（Ｔ）の各領域を前記動きベクトル画像Ｍ（Ｔ＋ｎ）に基づいて移動させるステップ１９０４では、まず、図２５に示すように、前記距離画像データＲＭ（Ｔ）を、前記ブロックＢ（Ｔ）_i,jと同じサイズのブロックＲＭ（Ｔ）_i,jに分割する（ステップ１９０４ａ）。その後、図２６（ａ）に示すように、前記ブロックＲＭ（Ｔ）_i,jを前記動きベクトルＭＶ（Ｔ＋ｎ）_i,jで移動させる（ステップ１９０４ｂ）。その後、ｉまたはｊを更新し（ステップ１９０４ｃ）、全てのブロックＲＭ（Ｔ）_i,jを移動させると、図２６（ｂ）に示したように、フレーム時刻Ｔ＋ｎの距離画像データＲＭ（Ｔ＋ｎ）が得られる。
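Step 1904 (moving each block of RM(T) by its motion vector) can be sketched as follows. The text does not say what happens where moved blocks overlap or where no block lands, so two rules here are assumptions: on overlap the smaller (nearer) distance wins, and uncovered pixels receive a caller-supplied `far` value. The function name and the `mv[j][i] = (dx, dy)` layout are hypothetical.

```python
def move_blocks(rm, mv, w, far):
    """Build RM(T+n) by shifting each W x W block of RM(T) by MV(T+n)_i,j."""
    height, width = len(rm), len(rm[0])
    out = [[None] * width for _ in range(height)]
    for j in range(height // w):
        for i in range(width // w):
            dx, dy = mv[j][i]
            for v in range(w):
                for u in range(w):
                    x, y = i * w + u + dx, j * w + v + dy
                    if not (0 <= x < width and 0 <= y < height):
                        continue  # moved outside the image
                    value = rm[j * w + v][i * w + u]
                    # Assumption: the nearer surface hides the farther one.
                    if out[y][x] is None or value < out[y][x]:
                        out[y][x] = value
    # Assumption: pixels that no block covered are treated as distant view.
    return [[far if p is None else p for p in row] for row in out]


if __name__ == "__main__":
    rm_t = [[1, 1, 9, 9],
            [1, 1, 9, 9]]
    motion = [[(2, 0), (0, 0)]]  # the near block moves 2 pixels to the right
    print(move_blocks(rm_t, motion, w=2, far=9))
```

A real implementation would also need some treatment of the distortion between moved blocks, which the text mentions only as an existing technique.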
【０１０５】
その後は、ｎ＝ｎ＋１としてフレーム時刻を更新し、距離画像データＲＭ（Ｔ＋ｎ）が存在しない場合には、前記手順を繰り返し、距離画像データＲＭ（Ｔ＋ｎ）を生成する。
【０１０６】
このとき、前記フレーム時刻Ｔ＋ｎの距離画像データＲＭ（Ｔ＋ｎ）は、例えば、図２０に示したように、フレーム時刻Ｔの動画像データＣＭ（Ｔ）及び距離画像データＲＭ（Ｔ）と、フレーム時刻Ｔ＋ｎの動画像データＣＭ（Ｔ＋ｎ）を用いて生成する。
【０１０７】
以下、距離画像データＲＭが存在する動画像データＣＭのフレーム時刻になるまで、前記処理を繰り返し行うことで、フレーム時刻Ｔとフレーム時刻Ｔ＋Ｎの間の距離画像データＲＭを補間生成することができる。
【０１０８】
以上説明したように、本実施例２の立体画像生成方法によれば、１フレーム毎に距離画像データを取得することができない取得手段を用いても、容易に動画の立体画像を生成することができる。
【０１０９】
また、１フレーム毎に距離画像データを取得することができる取得手段であっても、間欠的に取得することで、前記距離画像データ取得手段２Ａの消費電力量を低減することができる。そのため、前記立体画像を生成する装置の携帯機器への組み込みが容易になる。
【０１１０】
また、本実施例２の立体画像生成方法では、前記手順で説明したように、距離計測器から入力した距離画像データの代わりに、前記実施例１で説明したような、遠近の区別ができる程度の分解能の低い距離画像データを用いることも出来る。その場合は、輪郭部領域のみの動きを求めればよいので、計算量が少なくなり、消費電力量をさらに低減することができる。
【０１１１】
図２７は、前記実施例２の変形例を説明するための模式図である。
【０１１２】
前記実施例２では、フレーム時刻Ｔとフレーム時刻Ｔ＋Ｎの間のフレーム時刻Ｔ＋ｎの距離画像データを補間生成するときに、図２０に示したように、前記フレーム時刻Ｔの動画像データＣＭ（Ｔ）及び距離画像データＲＭ（Ｔ）、ならびにフレーム時刻Ｔ＋ｎの動画像データＣＭ（Ｔ＋ｎ）を用いて距離画像データＲＭ（Ｔ＋ｎ）を補間生成する例を示したが、その代わりに、例えば、１フレーム前の動画像データＣＭ（Ｔ＋ｎ−１）及び補間生成した距離画像データＲＭ（Ｔ＋ｎ−１）、ならびに動画像データＣＭ（Ｔ＋ｎ）を用いて距離画像データＲＭ（Ｔ＋ｎ）を生成してもよい。
【０１１３】
前記動画像データは通常、画像フレームの更新に伴い、撮影対象の輪郭形状が徐々に変形していくため、前記実施例２のように、常に初めのフレーム時刻Ｔで得られた輪郭部領域との間でパターンマッチングを行う方法では、変形量がある一定量以上となる、時間的に離れたフレームにおいて正しいパターンマッチングが行えなくなる。一方、図２７に示した方法の場合、直前のフレームで得られた輪郭部領域との間でパターンマッチングを行うことにより前記動きベクトルを求めるので、前記距離画像データが存在しないフレームの数の多少に関わらず、撮影対象の輪郭形状の変形量は、常に１フレーム以内でありその量は微小である。したがって、図２７に示した方法の場合、距離画像データが存在しないフレームの数が多くても、前記動画の立体画像を精度よく生成することができる。
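The difference between the two strategies (always matching against frame T, or matching against the immediately preceding frame as in FIG. 27) comes down to which frame serves as the reference. The sketch below makes that choice explicit. The driver function and the two callables `estimate_motion` and `warp` are hypothetical placeholders for whatever motion estimation and region-moving procedures are used; they are not defined by the source.

```python
def interpolate_gap(cm_frames, rm_first, estimate_motion, warp, chained=False):
    """Interpolate distance data for a run of frames.

    cm_frames : [CM(T), CM(T+1), ..., CM(T+N-1)]
    rm_first  : measured RM(T)
    chained   : False -> reference is always frame T (Embodiment 2)
                True  -> reference is frame T+n-1 (the FIG. 27 variant)
    Returns [RM(T), RM(T+1), ..., RM(T+N-1)].
    """
    rms = [rm_first]
    for n in range(1, len(cm_frames)):
        ref = n - 1 if chained else 0
        motion = estimate_motion(cm_frames[ref], cm_frames[n])
        rms.append(warp(rms[ref], motion))
    return rms


if __name__ == "__main__":
    # Toy stand-ins: a "frame" is just the object's x position.
    frames = [0, 2, 5]
    direct = interpolate_gap(frames, 10, lambda a, b: b - a, lambda d, m: d + m)
    chained = interpolate_gap(frames, 10, lambda a, b: b - a,
                              lambda d, m: d + m, chained=True)
    print(direct, chained)
```

In the chained mode each match spans only one frame, so the contour deformation seen by the matcher stays small, at the cost of building each result on a previously interpolated one.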
【０１１４】
（実施例３）
図２８は、本発明による実施例３の立体画像生成装置の概略構成を示す模式図である。
【０１１５】
本実施例３の立体画像生成装置は、図２８に示すように、動画像データ取得手段１Ａから動画像データを入力する動画像データ入力手段１Ｂと、補助画像データ取得手段２０Ａから補助画像データを入力する補助画像データ入力手段２０Ｂと、前記動画像データ入力手段１Ｂ及び前記補助画像データ入力手段２０Ｂに入力する各データを制御する制御手段３と、前記動画像データ入力手段１Ｂから入力された動画像データを記録する動画像データ記録手段４と、前記補助画像データ入力手段２０Ｂから入力された補助画像データ及び距離画像データを記録する距離画像データ記録手段５と、距離画像データを補間生成する距離画像データ補間生成手段２１とを備える。
【０１１６】
また、前記立体画像生成装置は、たとえば、毎秒３０フレーム以上の動画像データと距離画像データの組からなる立体画像を生成する装置であり、生成した前記立体画像は、図２８に示したように、演算手段７で演算し、表示手段８で表示することにより立体化された動画像を得ることができる。
【０１１７】
図２９乃至図３６は、本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【０１１８】
本実施例３の立体画像生成装置では、前記実施例１及び前記実施例２と異なり、図２９に示すように、前記距離画像データを生成可能な補助画像データを入力し、前記補助画像データ、または前記補助画像データと前記動画像データを用いて、距離画像データを生成する。
【０１１９】
ここで、前記距離画像データを生成可能な補助画像データとは、データそのものは直接距離情報を表現しているわけではないが、補助画像データ単独、または補助画像データと画像データの組み合わせを用いて演算することによって距離画像データまたは分解能の低い距離画像データまたは擬似的な距離画像データを生成可能な画像データと定義する。
【０１２０】
一例をあげると、補助画像データが普通の画像である場合、これ単独では距離情報を含まないが、もう一方の画像データと組み合わせてステレオ画像とみなすことによってステレオ法を用いて距離画像が生成できる。したがって普通の画像は補助画像データとなる。
【０１２１】
また、本実施例３では、このようなステレオ画像を補助画像データとして用いた例のみを詳細に説明するが、補助画像データ単独、または補助画像データと画像データの組み合わせを用いて演算することによって距離画像データまたは分解能の低い距離画像データまたは擬似的な距離画像データを生成可能な画像データであるならば、本発明は補助画像データの種類を限定するものではない。
【０１２２】
本実施例３の立体画像生成装置を用いて動画の立体画像を生成するときには、半導体メモリなどの記録手段に、１フレーム毎の動画像データＣＭと、Ｎフレーム毎に距離画像を生成可能な補助画像データＨＭがあらかじめ記録されているとする。動画像データと補助画像データがともに存在するフレーム時刻をＴとすると、ＣＭ（Ｔ），ＣＭ（Ｔ＋１），・・・，ＣＭ（Ｔ＋Ｎ−１）の動画像に対してＨＭ（Ｔ）が１枚存在する。これらの画像から、ＲＭ（Ｔ），ＲＭ（Ｔ＋１），・・・，ＲＭ（Ｔ＋Ｎ−１）の距離画像データを補間生成することによって動画の立体画像データを生成できる。
【０１２３】
なお、本実施例３で用いる補助画像データＨＭ（Ｔ）は、動画像データＣＭ（Ｔ）の撮影位置とは水平方向に異なる位置で撮影された画像データであるとする。
【０１２４】
本実施例３の立体画像生成方法を用いて前記距離画像データＲＭ（Ｔ）を生成するときには、図３０に示すように、まず、動画像データＣＭ（Ｔ）と同じフレーム時刻の補助画像データＨＭ（Ｔ）を選択し（ステップ２２０１）、前記フレーム時刻Ｔの距離画像データＲＭ（Ｔ）を生成する（ステップ２２０２）。
【０１２５】
前記フレーム時刻Ｔの距離画像データＲＭ（Ｔ）を生成するステップ２２０２では、図３１に示すように、まず、図３２（ａ）に示した動画像データＣＭ（Ｔ）と図３２（ｂ）に示した補助画像データＨＭ（Ｔ）の差分の絶対値をとり、その画像ＤＭ（Ｔ）を生成する（ステップ２２０２ａ）。このとき、前記動画像データＣＭ（Ｔ）と補助画像データＨＭ（Ｔ）はステレオ画像なので近景領域は動画像データＣＭ（Ｔ）と補助画像データＨＭ（Ｔ）で画像内の位置が異なり、差分値は０にならない。特に、輝度値の変化が大きい領域においては著しい差分値が発生する。一方、遠景領域は動画像データＣＭ（Ｔ）と補助画像データＨＭ（Ｔ）で画像内の位置がほとんど変わらないので差分値は非常に小さい。
【０１２６】
次に、差分値の画像ＤＭ（Ｔ）の各点ＤＭ（Ｔ）_x,yが、あらかじめ設定したある差分値ｋ以上なら１、ｋ未満なら０とする（ステップ２２０２ｂ，２２０２ｃ，２２０２ｄ）。その後、ｘまたはｙを更新し（ステップ２２０２ｅ），全てのＤＭ（Ｔ）_x,yを２値化（２２０２ｆ）して、図３３に示したように、２値化した画像ＤＭ（Ｔ）を生成する。この結果、物体が写っていない領域（遠景領域）は０となる。一方、物体が写っている領域（近景領域）は１になる確率が高いが、輝度の変化に乏しい領域では０になることもあり、０と１の混在したノイジーな画像となる。
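Steps 2202a to 2202f (absolute difference of the stereo pair followed by thresholding) can be sketched as below. The images are assumed to be scalar luminance arrays of equal size, and the function name is hypothetical.

```python
def binarize_difference(cm, hm, k):
    """Return the binarized DM(T): 1 where |CM - HM| >= k, otherwise 0."""
    return [
        [1 if abs(c - h) >= k else 0 for c, h in zip(cm_row, hm_row)]
        for cm_row, hm_row in zip(cm, hm)
    ]


if __name__ == "__main__":
    cm = [[10, 50, 50, 10]]  # near object occupies columns 1-2
    hm = [[10, 10, 50, 12]]  # same object shifted by one pixel
    print(binarize_difference(cm, hm, k=5))
```

In this toy row only the edge of the shifted object exceeds the threshold while its uniform interior stays 0, which is the noisy mixture of 0 and 1 that the text describes.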
【０１２７】
そこで、次に、前記２値化した画像ＤＭ（Ｔ）からノイズを除去する（ステップ２２０２ｇ）。前記ノイズの除去は、例えば、図３４、図３５（ａ）、図３５（ｂ）に示したように、２値化した画像ＤＭ（Ｔ）の全ての画素ＤＭ（Ｔ）_x,yを中心とした、あらかじめ設定したｐ画素×ｐ画素の各画素の値を調べ（ステップ２２０２ｈ，２２０２ｉ）、１の値を持つ画素があるならＲＭ（Ｔ）_x,y＝１とし（ステップ２２０２ｋ）、そうでなければ０とする（ステップ２２０２ｊ）。これによってノイズが除去され、図３６に示すように、近景が１、遠景が０で表現される分解能の低い距離画像ＲＭ（Ｔ）が生成される。
【０１２８】
なお、ノイズの除去方法は、これ以外にメディアンフィルタを用いたり、平滑化フィルタをかけた後、再度２値化する方法もある。本発明はノイズ除去の方法を限定するものではない。
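The p × p rule of steps 2202h to 2202k (output 1 if any pixel in the window is 1) is a binary dilation, i.e. a maximum filter. A minimal sketch follows; the function name is hypothetical, p is assumed to be odd so the window is centred on the pixel, and windows are clipped at the image edge, which the text does not specify.

```python
def dilate(dm, p):
    """Return RM(T): 1 where the p x p window around a pixel contains a 1."""
    if p % 2 == 0:
        raise ValueError("p is assumed to be odd")
    r = p // 2
    height, width = len(dm), len(dm[0])
    rm = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window_has_one = any(
                dm[ny][nx]
                for ny in range(max(0, y - r), min(height, y + r + 1))
                for nx in range(max(0, x - r), min(width, x + r + 1))
            )
            rm[y][x] = 1 if window_has_one else 0
    return rm


if __name__ == "__main__":
    dm = [[0] * 5 for _ in range(5)]
    dm[2][2] = 1
    for row in dilate(dm, 3):
        print(row)
```

This fills the 0-valued holes inside the near-view region, but it also enlarges the region by p // 2 pixels and grows isolated false 1s, which may be one reason to prefer the median or smoothing filters mentioned above.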
【０１２９】
また、補助画像にもこれ以外の様々な画像が考えられる。例えば、近距離でピントの合う焦点深度の浅い入力画像を補助画像として用いる方法もある。補助画像では遠景領域はボケて撮影されるので高周波成分が著しく小さい。そこで、補助画像と画像について同じ位置の近傍の空間周波数を求めて高周波成分の比を求めることによって、その位置における距離の概略を比から算出することが可能である。本発明は補助画像または補助画像と画像の組み合わせから距離画像を生成できるのであれば補助画像に何を用いるのかを限定するものではない。
【０１３０】
その後、前記手順を繰り返し、動画像データＣＭ（Ｔ）と間欠的に入力した補助画像データＨＭ（Ｔ）から距離画像データＲＭ（Ｔ）を生成した後は、例えば、前記実施例１で説明した手順に沿って、動画の立体画像を生成する。
【０１３１】
なお、本実施例３では分解能の低い距離画像が生成されるので、前記実施例１で説明した生成方法と組み合わせたが、生成される距離画像の分解能が高い場合には前記実施例２で説明した生成方法と組み合わせてもよい。
【０１３２】
以上説明したように、本実施例３の立体画像生成方法によれば、前記距離画像データを生成可能な補助画像データを間欠的に取得し、その補助画像データから距離画像データを生成することにより、前記実施例１または前記実施例２で説明した生成方法と同様に、動画の立体画像を容易に生成することができる。
【０１３３】
また、補助画像データを間欠的に取得することにより、補助画像データを取得する補助画像データ取得手段２０Ａの消費電力量を低減することができる。そのため、前記立体画像生成装置の携帯機器への組み込みが容易になる。
【０１３４】
以上、本発明を、前記実施例に基づき具体的に説明したが、本発明は、前記実施例に限定されるものではなく、その要旨を逸脱しない範囲において、種々変更可能であることはもちろんである。
【０１３５】
【発明の効果】
本願において開示される発明のうち、代表的なものによって得られる効果を簡単に説明すれば、以下の通りである。
（１）動画の立体画像を容易に生成することができる。
（２）動画の立体画像を生成する装置の消費電力量を低減することができる。
【図面の簡単な説明】
【図１】本発明による実施例１の立体画像生成装置の概略構成を示す模式図である。
【図２】本実施例１の立体画像生成装置を用いた立体画像生成方法を説明するための模式図であり、立体画像生成方法の全体的な処理手順を示す図である。
【図３】本実施例１の立体画像生成装置を用いた立体画像生成方法を説明するための模式図であり、立体画像生成方法の全体的な処理手順を示す図である。
【図４】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図５】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図６】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図７】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図８】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図９】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１０】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１１】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１２】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１３】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１４】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１５】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１６】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１７】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１８】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１９】本実施例１の立体画像生成方法で生成した立体画像の特徴を説明するための模式図である。
【図２０】本発明による実施例２の立体画像生成方法の全体的な処理手順を説明するための模式図である。
【図２１】本発明による実施例２の立体画像生成方法の全体的な処理手順を説明するための模式図である。
【図２２】本実施例２の立体画像生成方法の具体例を説明するための模式図である。
【図２３】本実施例２の立体画像生成方法の具体例を説明するための模式図である。
【図２４】本実施例２の立体画像生成方法の具体例を説明するための模式図である。
【図２５】本実施例２の立体画像生成方法の具体例を説明するための模式図である。
【図２６】本実施例２の立体画像生成方法の具体例を説明するための模式図である。
【図２７】前記実施例２の変形例を説明するための模式図である。
【図２８】本発明による実施例３の立体画像生成装置の概略構成を示す模式図である。
【図２９】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３０】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３１】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３２】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３３】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３４】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３５】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３６】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【符号の説明】
１Ａ…動画像データ取得手段、１Ｂ…動画像データ入力手段、２Ａ…距離画像データ取得手段、２Ｂ…距離画像データ入力手段、３…制御手段、４…動画像データ記録手段、５…距離画像データ記録手段、６…距離画像データ補間手段、７…演算手段、８…表示手段、１０…撮影対象、１１…近景領域、１２…境目領域、１３…輪郭部領域、１５…光源、１６…全反射ミラー、１７…ハーフミラー、２０Ａ…補助画像データ取得手段、２０Ｂ…補助画像データ入力手段、２１…距離画像データ補間生成手段。
[0001]
[Technical Field of the Invention]
The present invention relates to a stereoscopic moving image data generation method, a stereoscopic moving image data generation apparatus, a stereoscopic moving image data generation program, and a recording medium, and in particular to a technique that is effective when applied to the generation of stereoscopic moving image data to be displayed on portable devices and the like.
[0002]
[Prior art]
In the field of image communication, moving image communication using stereoscopic images, which gives a higher sense of presence, has been attracting attention, and academic conference presentations and corporate demonstration prototypes are becoming increasingly common. Here, a stereoscopic image is defined as the combination of a two-dimensional image recording color and luminance and an image, called a distance image or depth image, recording the distance from the measurement point to the subject. A distance image is an image in which each pixel records the distance to the subject. A moving image is assumed to consist of, for example, about 30 or more frames per second.
[0003]
When a moving image made up of such stereoscopic images is generated, moving image data consisting of two-dimensional images recording color and luminance is input from, for example, a TV camera, and the distance image data is input from a distance measuring device or the like.
[0004]
Another way of generating a moving image made up of stereoscopic images is, for example, to generate the distance image data from two two-dimensional images that have parallax between them.
[0005]
In addition, as a method of generating a stereoscopic image from parallax images of a two-dimensional image, a technique has been disclosed in which a parallax region is determined by obtaining contours from a distance image, and the amount of parallax is generated using the preceding and following images, motion vectors, and the like (see, for example, Patent Document 1).
[0006]
[Patent Document 1]
JP 2000-253422 A
[0007]
[Problems to be solved by the invention]
However, the conventional technology has a problem that it is difficult to perform moving image communication using the stereoscopic image.
[0008]
To perform moving image communication using stereoscopic images, distance image data must be input at, for example, 30 frames per second or more. However, distance measuring devices that can acquire distance image data at such a high rate have been very expensive, which makes them difficult to incorporate into portable devices.
[0009]
Moreover, in the conventional techniques the distance image data is acquired and input at every frame time, in the same way as the two-dimensional image, so power consumption is high. This makes it difficult to incorporate such techniques into portable devices and other equipment that runs on rechargeable batteries.
[0010]
As a technique that can be incorporated into portable devices, there is, for example, a method of inputting low-resolution distance image data, just fine enough to distinguish near from far, and using it to display stereoscopic images in a flat cutout style, like layered stage scenery. There is also a method of generating pseudo distance image data from such a low-resolution distance image and displaying it stereoscopically. Here, pseudo distance image data means data generated artificially rather than by measuring distance. That is, pseudo distance image data does not necessarily match the actual distances, and the degree of approximation depends on the generation algorithm.
[0011]
However, even in each of the above methods, each distance image data is acquired every frame time, so that the power consumption is high. Therefore, there is a problem that it is difficult to input a stereoscopic image as a moving image on a portable device.
[0012]
The technique described in Patent Document 1 also acquires the distance image data at every frame time, so its power consumption is high and it is difficult to incorporate into a portable device.
[0013]
An object of the present invention is to provide a technique capable of easily generating a stereoscopic image of a moving image.
[0014]
Another object of the present invention is to provide a technique capable of reducing the power consumption of an apparatus that generates a stereoscopic image of a moving image.
[0015]
Another object of the present invention is to provide a program and a recording medium that make it easy to input a moving image using a stereoscopic image and reduce the power consumption when generating a stereoscopic image.
[0016]
The above and other objects and novel features of the present invention will be apparent from the description of this specification and the accompanying drawings.
[0017]
[Means for Solving the Problems]
The outline of the invention disclosed in the present application will be described as follows.
(1) A stereoscopic image generation method comprising: a step of inputting moving image data for every frame time; a step of inputting, at intermittent frame intervals, data for generating a stereoscopic image from the moving image data (hereinafter referred to as distance image data); and a step of interpolating, based on the moving image data and the input distance image data, distance image data for frame times at which no distance image data exists, wherein the step of interpolating the distance image data includes: a step of obtaining a near-far boundary region from the input distance image data; a step of obtaining, from the moving image data at the same frame time as the input distance image data, a contour region located at the same position as, or in the vicinity of, the boundary region; a step of obtaining a region corresponding to the contour region from the moving image data at a frame time at which no distance image data exists; and a step of generating the distance image data by moving the boundary region and its interior region based on the amount of movement between the contour region and the region corresponding to the contour region.
[0019]
(2) A stereoscopic image generation method comprising: a step of inputting moving image data for each frame time; a step of inputting data for generating a stereoscopic image from the moving image data (hereinafter referred to as distance image data) at intermittent frame intervals; and a step of generating, by interpolation, distance image data at a frame time when the distance image data does not exist, based on the moving image data and the input distance image data, wherein the step of generating the distance image data by interpolation includes: a step of selecting the moving image data at the same frame time as the input distance image data; a step of obtaining a motion vector of the image from the selected moving image data to the moving image data at a frame time when the distance image data does not exist; and a step of dividing the input distance image data into a plurality of regions and moving the regions based on the motion vector to generate the distance image data.
[0020]
(3) In the stereoscopic image generation method of (1) or (2), the step of inputting the distance image data includes: a step of inputting, at intermittent frame intervals, auxiliary image data from which the distance image data can be generated; and a step of generating distance image data based on the auxiliary image data, or on the auxiliary image data and the moving image data at the same frame time as the auxiliary image data.
[0021]
According to the stereoscopic image generation method of (1) or (2), even when the means for acquiring the distance image data, such as a distance measuring device, cannot be operated at high speed, a moving image based on stereoscopic images can be obtained easily.
[0023]
In the step of inputting the distance image data, the distance image data may be input directly using a distance measuring device or the like, or, by performing each step of (3), distance image data generated indirectly from the auxiliary image data may be input.
[0024]
(4) A stereoscopic image generation apparatus comprising: moving image data input means for inputting moving image data for each frame time; distance image data input means for inputting data for generating a stereoscopic image from the moving image data (hereinafter referred to as distance image data) at intermittent frame intervals; and distance image data interpolation means for generating, by interpolation, distance image data at a frame time when the distance image data does not exist, based on the moving image data input from the moving image data input means and the distance image data input from the distance image data input means, wherein the distance image data interpolation means includes: means for obtaining a near/far boundary region from the distance image data input from the distance image data input means; means for obtaining, from the moving image data at the same frame time as the distance image data, a contour region at the same position as the boundary region or in its vicinity; means for obtaining a region corresponding to the contour region from the moving image data at a frame time when the distance image data does not exist; and means for generating the distance image data by moving the boundary region and its internal region based on the movement amount between the contour region and the region corresponding to the contour region.
[0026]
(5) A stereoscopic image generation apparatus comprising: moving image data input means for inputting moving image data for each frame time; distance image data input means for inputting data for generating a stereoscopic image from the moving image data (hereinafter referred to as distance image data) at intermittent frame intervals; and distance image data interpolation means for generating, by interpolation, distance image data at a frame time when the distance image data does not exist, based on the moving image data input from the moving image data input means and the distance image data input from the distance image data input means, wherein the distance image data interpolation means includes: means for selecting the moving image data at the same frame time as the input distance image data; means for obtaining a motion vector of the image from the selected moving image data to the moving image data at a frame time when no distance image data exists; and means for dividing the input distance image data into a plurality of regions and moving the regions based on the motion vector to generate distance image data.
[0027]
(6) In the stereoscopic image generation apparatus of (4) or (5), the distance image data input means includes: auxiliary image data input means for inputting, at intermittent frame intervals, auxiliary image data from which the distance image data can be generated; and means for generating distance image data based on the auxiliary image data, or on the auxiliary image data and the moving image data at the same frame time as the auxiliary image data.
[0028]
According to the stereoscopic image generation apparatus of (4) or (5), even when the means for acquiring distance image data, such as a distance measuring device, cannot be operated at high speed, a moving image based on stereoscopic images can be obtained easily.
[0029]
Even when the means for acquiring the distance image data is capable of acquiring distance image data at every frame time, the power consumption can be reduced by operating it intermittently.
[0031]
Further, the distance image data input means may be means for directly inputting the distance image data from a distance measuring instrument or the like, or, as in (6), means for inputting the auxiliary image data and means for generating distance image data from the auxiliary image data may be provided so that the distance image data is input indirectly.
[0032]
For these reasons, using the stereoscopic image generation apparatus of any one of (4) to (6) makes it easy to incorporate the stereoscopic image generation apparatus into a portable device.
[0033]
(7) A stereoscopic image generation program for causing a computer to execute each step of the stereoscopic image generation method of any one of (1) to (3).
[0034]
According to the stereoscopic image generation program of (7), a stereoscopic image can be easily generated using a general-purpose device such as a computer.
[0035]
(8) A recording medium on which the stereoscopic image generation program of (7) is recorded so as to be readable by a computer.
[0036]
According to the recording medium of (8), the stereoscopic image generation program recorded on one recording medium can be provided by copying it to another recording medium or via a network. Therefore, any apparatus that can execute the stereoscopic image generation program can generate a stereoscopic image.
[0037]
Hereinafter, the present invention will be described in detail together with embodiments (examples) with reference to the drawings.
In all the drawings for explaining the embodiments, parts having the same function are given the same reference numerals and their repeated explanation is omitted.
[0038]
[Detailed Description of the Invention]
(Example 1)
FIG. 1 is a schematic diagram illustrating a schematic configuration of a stereoscopic image generation apparatus according to a first embodiment of the present invention.
[0039]
As shown in FIG. 1, the stereoscopic image generating apparatus according to the first embodiment includes: moving image data input means 1B for inputting moving image data from moving image data acquisition means 1A; distance image data input means 2B for inputting distance image data from distance image data acquisition means 2A; control means 3 for controlling the input of data to the moving image data input means 1B and the distance image data input means 2B; moving image data recording means 4 for recording the moving image data input from the moving image data input means 1B; distance image data recording means 5 for recording the distance image data input from the distance image data input means 2B; and distance image data interpolation means 6 for generating distance image data by interpolation.
[0040]
The stereoscopic image generating apparatus generates stereoscopic images, each consisting of a pair of moving image data and distance image data, at 30 frames or more per second. As shown in the figure, a three-dimensional moving image is obtained by processing the generated stereoscopic images with the calculating means 7 and displaying the result on the display means 8. A CPU and memory of a computer can be used to implement the moving image data input means 1B, the distance image data input means 2B, the control means 3, the moving image data recording means 4, the distance image data recording means 5, the distance image data interpolation means 6, and the calculating means 7.
[0041]
FIGS. 2 and 3 are schematic diagrams for explaining a stereoscopic image generation method using the stereoscopic image generation apparatus according to the first embodiment, and illustrate the overall processing procedure of the stereoscopic image generation method.
[0042]
To generate the stereoscopic image using the stereoscopic image generating apparatus according to the first embodiment, first, as shown in FIG. 2, moving image data CM(T) and distance image data RM(T) are input and recorded in the moving image data recording means 4 and the distance image data recording means 5, respectively. The moving image data CM(T) is input at every frame time, whereas the distance image data RM(T) is input at intermittent frame intervals of every N frames, for example every 3 frames as shown in the figure. Here, the distance image data RM(T) is assumed to be, for example, distance image data with 1-bit resolution, in which the black region 11 is the near-view region, for example the region corresponding to the subject 10, and the white region is the distant-view region.
[0043]
However, in this state there are moving image data CM(T+1) and CM(T+2) at frame times for which no distance image data RM exists, so stereoscopic images cannot be generated as they are. Therefore, the distance image data for the missing frame times is generated by interpolation using the moving image data and the distance image data.
[0044]
In the stereoscopic image generation method according to the first embodiment, as shown in FIGS. 2 and 3, after initialization (step 1401), the boundary region 12 between the near view and the distant view is obtained from the distance image data RM(T), and the resulting image is defined as Q(T) (step 1402).
[0045]
Next, from the moving image data CM(T) at the same frame time T as the distance image data RM(T), the contour region 13 located at the same position as the boundary region 12 of the image Q(T) or in its vicinity is obtained, and the resulting image is defined as R(T) (step 1403).
[0046]
Next, the region matching the contour region 13 of the image R(T) is obtained from the moving image data CM(T+1) at frame time T+1, for which no distance image data exists, and the resulting image is defined as R(T+1) (step 1404). To obtain this image region, an existing pattern matching method is used, for example (Hideyuki Tamura, “Introduction to Computer Image Processing”, Soken Publishing, p. 148-153). Many variations of pattern matching exist, and one can be selected as appropriate.
[0047]
Next, the near-view region 11 in the distance image data RM(T) at frame time T is moved in accordance with the movement amount of the contour region 13 in the image R(T+1) obtained in step 1404 relative to the contour region 13 of the image R(T), thereby generating the distance image data RM(T+1) at frame time T+1 (step 1405).
[0048]
Thereafter, it is checked whether the distance image data RM(T+2) at the next frame time T+2 exists (steps 1406, 1407, 1408, 1409). In the first embodiment, the distance image data RM(T+2) at frame time T+2 does not exist, so it is generated by the same procedure as the distance image data RM(T+1) at frame time T+1. The position of the image region in the moving image data at frame time T+2 may be obtained by direct pattern matching against the contour region of the moving image data at frame time T, as shown in the figure, or by pattern matching against the contour region of the moving image data at frame time T+1.
[0049]
By repeating the above procedure, the distance image data at frame times for which no distance image data exists can be generated sequentially. The stereoscopic image generating apparatus according to the first embodiment therefore does not need to acquire distance image data at every frame time, so distance image data acquisition means 2A that can operate only at intermittent frame intervals can be used. Further, even when the distance image data acquisition means 2A can acquire distance image data at every frame time, the power consumption can be reduced by operating it intermittently. In the first embodiment, the distance image data is input every 3 frame times, but the present invention is not limited to this. For example, the distance image data can be input every 5 frame times or every 10 frame times; lengthening the interval in this way can reduce the power consumption significantly.
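The overall loop described above can be sketched as follows. This is only an illustrative outline, not the patent's implementation: the function name `fill_missing_distance_frames`, the `Frame` type, and the pluggable `interpolate` callback are hypothetical, and the sketch always interpolates directly from the most recent frame time that has measured data (one of the two matching options mentioned in the text).

```python
from typing import Callable, Dict, List, Optional

Frame = List[List[int]]  # placeholder image type: rows of pixel values


def fill_missing_distance_frames(
    cm: List[Frame],
    rm: Dict[int, Frame],
    interpolate: Callable[[Frame, Frame, Frame], Frame],
) -> List[Optional[Frame]]:
    """Return one distance image per moving-image frame.

    cm: moving image data, one frame per frame time.
    rm: measured distance images keyed by frame time (sparse).
    interpolate(cm_key, rm_key, cm_now) -> estimated distance image
        (hypothetical callback standing in for steps 1402-1405).
    """
    out: List[Optional[Frame]] = []
    key: Optional[int] = None  # most recent frame time with measured data
    for t, cm_now in enumerate(cm):
        if t in rm:
            key = t
            out.append(rm[t])  # measured data exists: use it as is
        elif key is not None:
            out.append(interpolate(cm[key], rm[key], cm_now))
        else:
            out.append(None)  # no earlier measurement to interpolate from
    return out
```

With distance images supplied only at frame times 0, 3, 6, and so on, every other frame time is filled by the callback, so the measuring hardware is needed only once every N frames.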
[0050]
FIGS. 4 to 18 are diagrams for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
[0051]
In the stereoscopic image generation method according to the first embodiment, first, as shown in FIG. 4, the moving image data CM for each frame and the distance image data RM for every N frames are input and recorded in the moving image data recording means 4 and the distance image data recording means 5. If T is a frame time at which both the image and the low-resolution distance image exist, RM(T) is the distance image that exists for the moving images CM(T), CM(T+1), ..., CM(T+N−1). Stereoscopic image data of the moving image is generated by generating the distance image data RM(T+1), ..., RM(T+N−1) from these images. Here, both CM and RM are assumed to have an image size of X pixels horizontally and Y pixels vertically. The data value of CM(T) at coordinates (x, y) is written CM(T)_{x, y}, and the data value of RM(T) at coordinates (x, y) is written RM(T)_{x, y}. CM(T)_{x, y} is RGB data, and RM(T)_{x, y} is binary data of 0 or 1, where 1 represents the near view and 0 represents the distant view.
[0052]
Further, when the moving image data CM and the distance image data RM are input, as shown in FIG. 5, a TV camera for moving image shooting (moving image data acquisition means) 1A, a TV camera for distance image shooting (distance image data acquisition means) 2A, and a light source 15 that emits pulsed light, such as a flash, are controlled by the control means 3.
[0053]
At this time, the photographing optical axis of the distance image data acquisition means 2A is aligned with that of the moving image data acquisition means 1A by the total reflection mirror 16 and the half mirror 17, so that both can photograph with the same angle of view and from the same viewpoint.
[0054]
When the moving image data CM is input using the moving image data acquisition means 1A, the control means 3 outputs a control signal to the moving image data acquisition means 1A at every preset frame time, as shown in FIG. 6(a), so that the moving image data is photographed and input to the moving image data input means 1B. The frame time interval ΔT is set to 1/30 seconds, for example.
[0055]
On the other hand, when the distance image data RM is input, as shown in FIG. 6(b), the control means 3 outputs a control signal to the distance image data acquisition means 2A at a time interval N times the preset frame interval, for example three times the frame interval of the moving image data, so that intermittent distance image data is photographed and input to the distance image data input means 2B. The control means 3 outputs a control signal to the light source 15 at the same timing to illuminate the object 10 to be photographed.
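The trigger timing of the control means 3 can be illustrated with a short sketch. The function name `capture_schedule` and its parameters are hypothetical; the sketch assumes only what the text states, namely that the moving image is captured at every frame time ΔT and that the distance camera and light source are triggered every N-th frame.

```python
def capture_schedule(num_frames: int, n: int, dt: float = 1.0 / 30.0):
    """Return (time_seconds, capture_distance) per frame time.

    The moving image is captured at every frame time; the distance image
    (and the light source) is triggered only when the frame index is a
    multiple of n.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [(i * dt, i % n == 0) for i in range(num_frames)]


# Six frame times at 1/30 s with a distance capture every 3 frames.
schedule = capture_schedule(num_frames=6, n=3)
flags = [fire for _, fire in schedule]
```

In this example the distance camera fires on 2 of the 6 frame times, which is the source of the power saving the text describes for intermittent operation.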
[0056]
Note that the illumination light can be prevented from affecting the photographing by the moving image data acquisition means 1A either by shifting the shooting start times of the moving image data acquisition means 1A and the distance image data acquisition means 2A by at least the exposure time of the moving image data acquisition means 1A, or by using infrared light as the illumination light emitted from the light source 15 and an infrared camera as the distance image data acquisition means 2A.
[0057]
In addition, by using a high-speed camera, the moving image data acquisition unit 1A and the distance image acquisition unit 2A can be shared by a single camera. In this case, the total reflection mirror 16 and the half mirror 17 are unnecessary.
[0058]
Further, the distance image data RMP(T) photographed by the distance image data acquisition means 2A is converted into low-resolution distance image data RM(T) in which, for example, only the near/far distinction is recorded. To do so, first, as shown in FIG. 7, the distance image data RMP(T) whose shooting time is closest to that of the moving image data CM(T) photographed by the moving image data acquisition means 1A at time T is selected (step 1801). These two pieces of image data can be regarded as images of the same subject photographed at the same time, differing only in the presence or absence of illumination.
[0059]
Next, for the moving image data CM(T) and the distance image data RMP(T), the ratio CM(T)_{x, y} / RMP(T)_{x, y} at the same coordinates (x, y) is obtained (step 1802). If the ratio is equal to or greater than a set value k, RM(T)_{x, y} = 1 is set (step 1803); otherwise RM(T)_{x, y} = 0 is set (step 1804). Thereafter, the coordinates (x, y) are updated (step 1805), and RM(T)_{x, y} is generated for all coordinates (step 1806) to form the distance image data RM(T).
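Steps 1802 to 1806 amount to a per-pixel threshold on a ratio, which can be sketched as below. The function name `binarize_by_ratio` is hypothetical, and the sketch assumes that each pixel has already been reduced to a single luminance value (the text describes CM as RGB data and does not say how the ratio is taken per channel). The ratio direction follows the text as written; which image must be in the numerator for 1 to land on the near view depends on which of the two images was shot under illumination.

```python
def binarize_by_ratio(cm_lum, rmp_lum, k):
    """Build RM(T) from the per-pixel ratio CM(T)/RMP(T) (steps 1802-1806).

    cm_lum, rmp_lum: same-size 2D lists of luminance values, indexed [y][x].
    k: threshold (the "set value k" in the text).
    """
    rm = []
    for row_cm, row_rmp in zip(cm_lum, rmp_lum):
        out_row = []
        for c, r in zip(row_cm, row_rmp):
            # Assumption: a zero denominator is treated as "below threshold".
            ratio = c / r if r != 0 else 0.0
            out_row.append(1 if ratio >= k else 0)
        rm.append(out_row)
    return rm
```

The same structure works for the difference-based variant mentioned in the text by replacing the ratio with a subtraction.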
[0060]
Compared with the image captured without illumination, the object in the near view appears brighter in the illuminated image, whereas the brightness of the distant view hardly changes. Using this property, low-resolution distance image data RM(T) is generated in which the near view is expressed by 1 and the distant view by 0. The low-resolution distance image data RM(T) may also be generated using a difference instead of a ratio.
[0061]
In the first embodiment, a photographing method using the light source 15, such as a flash, is shown as the means of inputting the low-resolution distance image data RM(T). However, as long as the low-resolution distance image data RM(T) can be acquired or input, the method is not limited to the one described in the first embodiment. For example, the distance may be calculated from the magnitude of the spatial frequency in an image focused on the near view and an image focused on the distant view (or a pan-focus image). Alternatively, the difference between images obtained by stereo shooting may simply be taken, and the low-resolution distance image data RM(T) generated by treating regions whose difference value is at or below a set value as the distant-view region.
[0062]
Next, stereoscopic image data of the moving image is generated according to the procedure shown in the figures.
[0063]
In step 1402, which obtains the image Q(T) of the boundary region 12 from the distance image data RM(T), first, as shown in FIG. 8, the value of the pixel RM(T)_{x, y} at coordinates (x, y) of the distance image data RM(T) is compared with the values of its adjacent pixels (steps 1402a and 1402b). As shown in FIG. 9(a), the adjacent pixels are the 8-connected neighbors RM(T)_{x-1, y-1}, RM(T)_{x-1, y}, RM(T)_{x-1, y+1}, RM(T)_{x, y-1}, RM(T)_{x, y+1}, RM(T)_{x+1, y-1}, RM(T)_{x+1, y}, and RM(T)_{x+1, y+1}. If all of them have the same data value as RM(T)_{x, y}, Q(T)_{x, y} = 0 is set (step 1402c); otherwise Q(T)_{x, y} = 1 is set (step 1402d). Thereafter, the coordinates (x, y) are updated (step 1402e), Q(T)_{x, y} is obtained for all pixels RM(T)_{x, y}, and the image Q(T) of the boundary region 12 as shown in FIG. 9(b) is generated.
[0064]
In step 1402a, Q(T) may instead be generated by comparing the pixel RM(T)_{x, y} with its 4-connected neighbors RM(T)_{x-1, y}, RM(T)_{x, y-1}, RM(T)_{x, y+1}, and RM(T)_{x+1, y}.
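The boundary extraction of step 1402 can be sketched for both the 8-connected and the 4-connected case. The names `N8`, `N4`, and `boundary_image` are hypothetical, images are indexed as `[y][x]`, and the treatment of pixels at the image edge (neighbors outside the image are ignored) is an assumption, since the text does not specify it.

```python
# Neighbour offsets as (dx, dy) pairs.
N8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
N4 = [(-1, 0), (0, -1), (0, 1), (1, 0)]


def boundary_image(rm, neighbours=N8):
    """Q[y][x] = 0 if every neighbour equals rm[y][x], else 1 (step 1402)."""
    h, w = len(rm), len(rm[0])
    q = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dx, dy in neighbours:
                nx, ny = x + dx, y + dy
                # Assumption: neighbours outside the image are ignored.
                if 0 <= nx < w and 0 <= ny < h and rm[ny][nx] != rm[y][x]:
                    q[y][x] = 1
                    break
    return q
```

For a single near-view pixel in a uniform distant view, 8-connectivity marks the surrounding 3 x 3 square, while 4-connectivity marks a plus-shaped region, so the 4-connected variant yields a thinner boundary.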
[0065]
Next, in step 1403, which obtains from the moving image data CM(T) the contour region 13 at the same position as the boundary region 12 or in its vicinity, first, as shown in the figure, the block Q(T)_{i, j} corresponding to the block R(T)_{i, j} of the image R(T) is extracted (step 1403a). As shown in FIG. 11(a), R(T) is a block image with X/W pixels horizontally and Y/W pixels vertically, and i and j represent block coordinates. X, Y, and W are set in advance so that X/W and Y/W are integers. Each pixel of R consists of a binary data storage area Rk_{i, j} and a movement vector storage area RV_{i, j}.
[0066]
As shown in FIG. 11(b), Q(T)_{i, j} is extracted by dividing Q(T) into blocks of W pixels vertically by W pixels horizontally. The value of each pixel of Q(T)_{i, j} is then checked (step 1403b). If all pixels are 0, Rk(T)_{i, j} = 0 is set (step 1403c); if any pixel is 1, Rk(T)_{i, j} = 1 is set (step 1403d). Then i or j is updated (step 1403e), and the process is repeated until Rk(T)_{i, j} has been obtained for all R(T)_{i, j} (step 1403f).
[0067]
Rk(T)_{i, j} is data indicating only those contour regions of CM(T) that lie at the same position as the near/far boundary region or in its vicinity. That is, if Rk_{i, j} = 1, the block region whose diagonal vertices are the coordinates (i × W, j × W) and ((i + 1) × W−1, (j + 1) × W−1) is a contour region at the same position as the boundary region or in its vicinity. If Rk_{i, j} = 0, the block region whose diagonal vertices are the coordinates (i × W, j × W) and ((i + 1) × W−1, (j + 1) × W−1) is not such a contour region. Therefore, once Rk(T)_{i, j} has been obtained, the image R(T) of the contour region 13 as shown in FIG. 12 is obtained.
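The reduction from the pixel-level boundary image Q(T) to the block-level flags Rk(T) in step 1403 can be sketched as follows. The function name `block_flags` is hypothetical; the image is indexed as `[y][x]` and the result as `[j][i]`.

```python
def block_flags(q, w):
    """Rk[j][i] = 1 if the W x W block (i, j) of Q contains any 1 (step 1403)."""
    height, width = len(q), len(q[0])
    if height % w or width % w:
        # The text requires X/W and Y/W to be integers.
        raise ValueError("image size must be a multiple of W")
    rk = []
    for j in range(height // w):
        row = []
        for i in range(width // w):
            # Pixels from (i*W, j*W) to ((i+1)*W-1, (j+1)*W-1).
            block = (q[y][x]
                     for y in range(j * w, (j + 1) * w)
                     for x in range(i * w, (i + 1) * w))
            row.append(1 if any(block) else 0)
        rk.append(row)
    return rk
```

A larger W gives fewer blocks to match in the next step at the cost of a coarser contour region.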
[0068]
In the first embodiment, the image R(T) of the contour region 13 is divided into rectangular blocks R(T)_{i, j}, but the division is not limited to this, and the image may be divided into blocks of various shapes for extraction.
[0069]
Next, in step 1404, which obtains from the moving image data CM(T+n) at frame time T+n the contour region corresponding to the contour region 13 obtained in step 1403, first, as shown in the figure, a block with Rk(T)_{i, j} = 1 is extracted from the image R(T) (step 1404a).
[0070]
Next, for a block with Rk(T)_{i, j} = 1, the region of CM(T) from the coordinates (i × W, j × W) to the coordinates ((i + 1) × W−1, (j + 1) × W−1) is extracted as a reference image B(T)_{i, j} of W × W pixels, as shown in the figure (steps 1404a and 1404b).
[0071]
Next, as shown in FIG. 14(b), the block in CM(T+n) whose image content is closest to B(T)_{i, j} is determined, and the upper-left coordinates (i × W + dx, j × W + dy) of this block are obtained (step 1404c). Here, dx and dy represent the distances in the x direction and the y direction that the contour region contained in the block has moved in the image n frames later. dx and dy are stored in RV(T+n)_{i, j} (step 1404d).
[0072]
To find the block with the closest image content, a correlation amount is obtained between the two blocks being compared, and the block with the highest correlation amount in the image is taken as the block with the closest image content. Various formulas for the correlation exist, from exact formulas used in statistical theory to simple ones such as the sum of squared differences for each pixel. Various existing methods are also available (for example, K. R. Rao et al., "Information compression technology for the Internet", Kyoritsu Shuppan, p. 69-71), and the present invention is not limited to a specific formula or method. This embodiment shows an example using the block correlation method, but the present invention is not limited to the block correlation method as long as the distances moved by the contour region in the x direction and the y direction can be obtained.
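As one concrete instance of the simple formulas mentioned above, block matching with the sum of squared differences (SSD) can be sketched as follows. The names `ssd` and `find_displacement` are hypothetical. The sketch assumes single-channel pixel values and limits the search to a window of `radius` pixels around the original block position, whereas the text speaks of searching the image; both are simplifications. With SSD, the lowest value plays the role of the highest correlation.

```python
def ssd(image, ref, x0, y0):
    """Sum of squared differences between ref and image at top-left (x0, y0)."""
    return sum(
        (image[y0 + y][x0 + x] - ref[y][x]) ** 2
        for y in range(len(ref))
        for x in range(len(ref[0]))
    )


def find_displacement(cm_t, cm_tn, i, j, w, radius):
    """Return (dx, dy) moving block (i, j) of cm_t to its best match in cm_tn."""
    height, width = len(cm_tn), len(cm_tn[0])
    bx, by = i * w, j * w
    # Reference image B(T)_{i, j}: the W x W block of CM(T) at block (i, j).
    ref = [row[bx:bx + w] for row in cm_t[by:by + w]]
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + w > width or y0 + w > height:
                continue  # candidate block would leave the image
            cost = ssd(cm_tn, ref, x0, y0)
            # Lowest SSD = closest content; ties keep the first candidate found.
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]
```

The returned pair corresponds to the dx and dy that the text stores in RV(T+n)_{i, j}.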
[0073]
Next, i or j is updated (step 1404e), and the above processing is performed for all blocks with Rk(T)_{i, j} = 1 (step 1404f), so that the movement amounts dx and dy are obtained for each block and stored in RV(T+n)_{i, j}. Further, by moving each B(T)_{i, j} in the image R(T) according to RV(T+n)_{i, j}, the image R(T+n) of the contour region 13 corresponding to the moving image data CM(T+n) at frame time T+n is obtained, as shown in FIG. 15.
[0074]
Next, in step 1405, which generates the distance image data RM(T+n) at frame time T+n, first, as shown in FIG. 16, the block of the distance image data RM(T) at frame time T that corresponds to the block B(T)_{i, j} is moved based on the dx and dy stored in RV(T+n)_{i, j} (step 1405a). At this time, as shown in FIGS. 17(a) and 17(b), only the blocks with Rk(T)_{i, j} = 1, that is, only the boundary region 12, are moved first.
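Step 1405a can be sketched as copying each flagged block to its displaced position. The function name `move_flagged_blocks` and the `empty` marker are hypothetical. Leaving uncovered pixels as `empty`, so that the later continuity repair and interior fill can act on them, is an assumption of this sketch and not something the text prescribes.

```python
def move_flagged_blocks(rm, rk, rv, w, empty=None):
    """Copy each W x W block of rm with rk[j][i] == 1 to its displaced position.

    rv[j][i] is the (dx, dy) stored for block (i, j). Pixels that no moved
    block covers are left as `empty` for the later repair and fill steps.
    """
    height, width = len(rm), len(rm[0])
    out = [[empty] * width for _ in range(height)]
    for j, flag_row in enumerate(rk):
        for i, flag in enumerate(flag_row):
            if not flag:
                continue  # only boundary blocks are moved at this stage
            dx, dy = rv[j][i]
            for y in range(j * w, (j + 1) * w):
                for x in range(i * w, (i + 1) * w):
                    nx, ny = x + dx, y + dy
                    if 0 <= nx < width and 0 <= ny < height:
                        out[ny][nx] = rm[y][x]
    return out
```

Because neighboring blocks may receive different displacements, the moved boundary can break at block edges, which is why a reconnection step follows.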
[0075]
By the way, the boundary area existing in the block before the movement is continuous between the blocks, but may be interrupted between the moved blocks depending on the movement direction and the movement amount. For the boundary region that has become discontinuous, the continuity is recovered by connecting the broken points at each block boundary with a straight line (step 1405b). In this embodiment, an example of connection by linear interpolation is shown, but the present invention is not limited to linear interpolation.
[0076]
Next, as shown in FIG. 18, the distance image data RM(T+n) at frame time T+n is obtained by filling the entire inner region surrounded by the boundary region 12 with the data of the neighboring region inside the boundary region.
[0077]
Thereafter, as shown in FIG. 3, the frame time is updated and the processing of each step is repeated, so that the low-resolution distance image data RM(T+n) is generated for every frame time at which distance image data was not acquired, and the stereoscopic image data of the moving image is generated.
[0078]
FIG. 19 is a schematic diagram for explaining the feature of the stereoscopic image generated by the stereoscopic image generation method of the first embodiment.
[0079]
The distance image data generated by the method of the first embodiment is produced by moving the distance image according to the x-direction and y-direction movement extracted from the moving image data. For this reason, movement of the photographed object in the z direction cannot be represented accurately. For example, when the object approaches the photographing point as shown in FIG. 19(a), in reality the depth (z-direction) distance z becomes larger as the object approaches the photographing point, as shown in FIG. 19(b). In the distance image generated according to the first embodiment, however, the depth distance z hardly changes, as shown in the figure. The method is therefore not suitable for generating stereoscopic image data of moving images for industrial measurement. However, when the data is intended for human viewing, as in stereoscopic image communication in the video and communication industries, the distance image is presented together with the image, so the human visual recognition mechanism interprets the scene rationally and the unnaturalness is inconspicuous. In addition, this is not a problem for a low-resolution distance image or a pseudo distance image.
[0080]
As described above, according to the stereoscopic image generation method using the stereoscopic image generation apparatus of the first embodiment, even if the distance image data acquisition unit 2A that cannot acquire distance image data for each frame is used, A stereoscopic image of a moving image can be easily generated.
[0081]
Further, even with distance image data acquisition means 2A that can acquire distance image data for every frame, the power consumption of the distance image data acquisition means 2A can be reduced by acquiring the data intermittently. This makes it easy to incorporate the stereoscopic image generating apparatus into a portable device.
[0082]
If the stereoscopic image generation method is programmed, each step can be executed by a computer. Therefore, it is possible to easily generate a stereoscopic image of a moving image without using a dedicated device. At this time, the program for generating the stereoscopic image can be provided by a recording medium such as a semiconductor memory, a hard disk, or a CD-ROM, or can be provided through a network.
[0083]
In the first embodiment, a photographing method using a flash is shown as the means of acquiring a low-resolution distance image. However, as long as a low-resolution distance image can be acquired, the method is not limited to the one shown in the embodiment. For example, the distance may be calculated from the magnitude of the spatial frequency in an image focused on the near view and an image focused on the distant view (or a pan-focus image). Alternatively, the difference between images obtained by stereo shooting may simply be taken, and a low-resolution distance image generated by treating regions whose difference value is at or below a set value as the distant-view region.
[0084]
(Example 2)
FIGS. 20 and 21 are schematic diagrams for explaining the overall processing procedure of the stereoscopic image generation method according to the second embodiment of the present invention.
[0085]
The stereoscopic image generation method according to the second embodiment is performed using, for example, the stereoscopic image generation apparatus described in the first embodiment, and thus detailed description regarding the apparatus is omitted.
[0086]
In the stereoscopic image generation method according to the second embodiment, first, as shown in FIG. 20, moving image data CM(T) and distance image data RM(T) are input and recorded in the moving image data recording means 4 and the distance image data recording means 5, respectively. The moving image data CM(T) is input at every frame time, whereas the distance image data RM(T) is input at intermittent frame intervals of every N frames, for example every 3 frames as shown in the figure.
[0087]
However, in this state there are moving image data CM(T+1) and CM(T+2) at frame times for which no distance image data RM exists, so stereoscopic images cannot be generated as they are. Therefore, the distance image data for the missing frame times is generated by interpolation using the moving image data and the distance image data.
[0088]
In the stereoscopic image generation method of the second embodiment, first, as shown in FIGS. 20 and 21, the distance image data RM at a certain frame time and the moving image data at the same frame time are selected (step 1902). Here, as shown in FIG. 20, the case where the moving image data CM(T) and the distance image data RM(T) at frame time T are selected is described. In the distance image data RM(T) and RM(T+3) shown in FIG. 20, the black region represents the region close to the shooting point (near-view region) 11, and the shading becomes whiter as the distance increases.
[0089]
Next, the motion vector of the image between the moving image data CM(T) selected in step 1902 and the moving image data CM(T+1) at frame time T+1, for which no distance image data exists, is obtained (step 1903). The obtained motion vectors are denoted M(T+1). The arrow symbols MV in the motion vector image M(T+1) are the detected motion vectors. The motion vectors are drawn as arrows for ease of understanding, but the actual image data M(T+1) is an image in which motion vector information MV is recorded for each pixel, not an image drawn with such arrows.
[0090]
Next, the distance image data RM (T + 1) at the frame time T + 1 is generated by dividing the distance image data RM (T) at the frame time T into a plurality of areas and moving each area separately according to the motion vector MV. (Step 1904). Here, the area of the distance image data RM (T) is matched with the area of the motion vector MV. For example, if the motion vector MV is given for each pixel, the distance image data RM (T) is moved in units of pixels. If the motion vector MV is a motion vector of a block area, each area is a unit of the block area. This region depends on the method for obtaining the motion vector, but the present invention is not limited to a specific motion vector generation method.
[0091]
Next, the frame time is updated (step 1905), and it is determined whether moving image data CM (T + 2) and distance image data RM (T + 2) exist (steps 1906 and 1907). If the moving image data CM (T + 2) exists but the distance image data RM (T + 2) does not, the process returns to step 1903 to generate the distance image data RM (T + 2) by interpolation. At this time, for example, as shown in FIG. 20, the motion vector is obtained from the moving image data CM (T) at the frame time T and the moving image data CM (T + 2) at the frame time T + 2, and each area of the distance image data RM (T) at the frame time T is moved according to that motion vector to generate the distance image data RM (T + 2) at the frame time T + 2.
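The loop over steps 1902 to 1907 can be sketched as below. This is an illustrative outline under stated assumptions: `estimate_motion` and `move_regions` are hypothetical callables standing in for steps 1903 and 1904, and each "frame" in the demonstration is reduced to a single number (the horizontal position of an object) so that the control flow can be followed by hand.

```python
def interpolate_from_reference(cm, rm, estimate_motion, move_regions):
    """Fill missing distance images from the most recent measured one.

    cm: list of moving image frames, one per frame time.
    rm: list of distance images, with None where none was measured.
    estimate_motion, move_regions: hypothetical stand-ins for
    step 1903 (motion vectors) and step 1904 (moving each region).
    """
    out = list(rm)
    ref = None  # frame time of the last measured distance image
    for t in range(len(cm)):
        if rm[t] is not None:
            ref = t                                # step 1902
            continue
        if ref is None:
            continue                               # nothing to interpolate from
        mv = estimate_motion(cm[ref], cm[t])       # step 1903
        out[t] = move_regions(rm[ref], mv)         # step 1904
    return out


# Toy stand-ins: a frame is an object position, motion is the position
# difference, and a "distance image" is the position of the near region.
cm = [0, 1, 2, 5, 6]
rm = [10, None, None, 50, None]
result = interpolate_from_reference(
    cm, rm,
    estimate_motion=lambda a, b: b - a,
    move_regions=lambda r, mv: r + mv,
)
print(result)  # [10, 11, 12, 50, 51]
```

Note that every interpolated frame is derived directly from the measured frame at time T, which is the behaviour described for the second embodiment.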
[0092]
By repeating the above procedure, the distance image data at the frame times for which no distance image data exists can be generated sequentially. Therefore, in the stereoscopic image generation method according to the second embodiment as well, it is not necessary to input distance image data every frame time, and a distance image data acquisition means 2A that can operate only at intermittent frame intervals can be used. Further, even when the distance image data acquisition means 2A can acquire distance image data at every frame time, the power consumption can be reduced by operating it intermittently. In the second embodiment, the distance image data is input every 3 frame times. However, the present invention is not limited to this. For example, the distance image data can be input every 5 frame times or every 10 frame times. Increasing the time interval reduces the power consumption significantly.
[0093]
FIGS. 22 to 26 are schematic diagrams for explaining a specific example of the stereoscopic image generation method according to the second embodiment.
[0094]
In the stereoscopic image generation method according to the second embodiment, it is assumed that the moving image data CM for each frame time and the distance image data RM for every N frame times are recorded in advance in the moving image data recording means 4 and the distance image data recording means 5, such as a semiconductor memory. At this time, if the frame time at which both the moving image data and the distance image data exist is T, there is one piece of distance image data RM (T) for the moving image data CM (T), CM (T + 1), ..., CM (T + N−1). Stereoscopic image data of the moving image is generated by interpolating the distance image data RM (T + 1), ..., RM (T + N−1) from these images. Here, it is assumed that both CM and RM have an image size of X pixels in the horizontal direction and Y pixels in the vertical direction. Further, the data value at the coordinates (x, y) of CM (T) is written CM (T)_{x, y}, and the data value at the coordinates (x, y) of RM (T) is written RM (T)_{x, y}. CM (T)_{x, y} is RGB data, and RM (T)_{x, y} is a scalar value that records the distance.
[0095]
The moving image data acquisition means 1A that acquires the moving image data CM (T) is the TV camera for moving image shooting described in the first embodiment. The distance image data acquisition means 2A that acquires the distance image data RM (T) is, for example, a distance measurement device based on the light-section method (for example, Toru Yoshizawa, “Optical three-dimensional measurement”, New Technology Communications, p. 28-37), the TOF (time of flight) method (for example, Seiji Iguchi, “Latest Trends in 3D Measurement”, Measurement and Control, Vol. 34, No. 6, p. 430; Masahiro Kawakita, “Hi-Vision 3D Camera: Axi-vision camera”, STRL Open House, Heisei 14 (2002), Proceedings of lectures and research presentations, p. 58-63), various pattern projection methods (for example, Toru Yoshizawa, “Three-dimensional optical measurement (2nd edition)”, New Technology Communications, p. 77-99), or various stereo methods (for example, the Institute of Image Electronics Engineers, “3D Image Glossary”, New Technology Communications, p. 51).
[0096]
When generating distance image data by interpolation using the moving image data CM (T) and the intermittent distance image data RM (T), first, the distance image data RM (T) at a certain frame time T and the moving image data CM (T) at the same frame time are selected (step 1902).
[0097]
Next, step 1903 is performed to obtain a motion vector of the image between the moving image data CM (T) at the frame time T and the moving image data CM (T + n) at the frame time T + n without the distance image data.
[0098]
In step 1903, as shown in FIG. 22, the moving image data CM (T) is divided in a grid pattern into blocks of W pixels vertically by W pixels horizontally (step 1903a). At this time, as shown in FIGS. 23A and 23B, each block has a size of W × W, the number of blocks in the horizontal direction is X / W, and the number of blocks in the vertical direction is Y / W. It is assumed that X and Y are set in advance so that they are divisible by W. The block at block coordinates (i, j) is written B (T)_{i, j}.
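The block division of step 1903a can be sketched as follows. This is a minimal example that assumes the image is stored as a list of Y rows of X values; the function name `split_into_blocks` and the dictionary keyed by block coordinates (i, j) are choices made for illustration.

```python
def split_into_blocks(image, w):
    """Divide an image into W x W blocks keyed by block coordinates (i, j).

    image: list of Y rows, each a list of X pixel values.
    i indexes blocks horizontally (0 .. X/W - 1), j vertically (0 .. Y/W - 1).
    """
    y_size = len(image)
    x_size = len(image[0])
    if x_size % w or y_size % w:
        raise ValueError("X and Y must be divisible by W")
    blocks = {}
    for j in range(y_size // w):
        for i in range(x_size // w):
            rows = image[j * w:(j + 1) * w]
            blocks[(i, j)] = [row[i * w:(i + 1) * w] for row in rows]
    return blocks


# A 4 x 4 test image whose pixel value encodes its position (x + 10 * y).
image = [[x + 10 * y for x in range(4)] for y in range(4)]
blocks = split_into_blocks(image, 2)
print(len(blocks))     # 4
print(blocks[(1, 0)])  # [[2, 3], [12, 13]]
```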
[0099]
Next, a motion vector image M (T + n) is generated. The motion vector image M (T + n) is two-dimensional data with X / W elements horizontally and Y / W elements vertically, that is, one element for each block B (T)_{i, j}, and each element MV (T + n)_{i, j} stores two-dimensional vector data.
[0100]
At this time, first, for the block B (T)_{i, j} corresponding to each block coordinate (i, j), the block whose image content is closest to that of B (T)_{i, j} is determined from within CM (T + n), as illustrated in the drawings (step 1903b).
[0101]
Next, as shown in FIG. 24B, the upper left coordinate value (i × W + dx, j × W + dy) of the block determined from CM (T + n) is obtained. Here, (dx, dy) is the motion vector MV (T + n)_{i, j} of the block B (T)_{i, j} in the image n frames later. dx and dy are stored in MV (T + n)_{i, j} (step 1903c).
[0102]
The process for obtaining the motion vector MV (T + n)_{i, j} is called the block matching method and is a known technique used, for example, for motion prediction when generating MPEG2 images. In general, a correlation amount is computed between two blocks, and the block in the image showing the highest correlation is determined to be the block with the closest image content. Various formulas for the correlation amount exist, ranging from the rigorous formula used in statistical theory to simple formulas such as the sum of squared differences over the pixels, and various existing techniques are also available, such as block search methods and methods for removing inter-block distortion after movement. The present invention is not limited to any specific formula or technique. Although the second embodiment is described using the block matching method, the present invention is not limited to the block matching method as long as a motion vector in the image can be obtained. The MV (T + n)_{i, j} generated in this way represents the movement in the image when the frame time changes from T to T + n.
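A minimal sketch of block matching for a single block (steps 1903b and 1903c) is given below. It makes several assumptions that the text leaves open: frames are single-channel (grayscale) rather than RGB, the correlation measure is the sum of squared differences (lower means closer), and the search is an exhaustive scan within a hypothetical range `search` around the original block position.

```python
def ssd(frame_a, ax, ay, frame_b, bx, by, w):
    """Sum of squared differences between two W x W blocks."""
    total = 0
    for v in range(w):
        for u in range(w):
            d = frame_a[ay + v][ax + u] - frame_b[by + v][bx + u]
            total += d * d
    return total


def block_motion_vector(cm_t, cm_tn, i, j, w, search):
    """Return (dx, dy) of the block in cm_tn closest to block (i, j) of cm_t."""
    height, width = len(cm_tn), len(cm_tn[0])
    x0, y0 = i * w, j * w
    best, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = x0 + dx, y0 + dy
            # Skip candidate blocks that fall outside the frame.
            if x < 0 or y < 0 or x + w > width or y + w > height:
                continue
            cost = ssd(cm_t, x0, y0, cm_tn, x, y, w)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def make_frame(px, py):
    """6 x 6 dark frame with a textured 2 x 2 patch at (px, py)."""
    frame = [[0] * 6 for _ in range(6)]
    patch = [[9, 8], [7, 6]]
    for v in range(2):
        for u in range(2):
            frame[py + v][px + u] = patch[v][u]
    return frame


cm_t = make_frame(2, 2)    # patch fills block (1, 1) when W = 2
cm_tn = make_frame(3, 2)   # patch has moved one pixel to the right
mv = block_motion_vector(cm_t, cm_tn, 1, 1, 2, 1)
print(mv)  # (1, 0)
```

The upper left corner of the matched block is (i × W + dx, j × W + dy) = (3, 2), in agreement with the coordinate expression used in step 1903c.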
[0103]
Thereafter, i or j is updated (step 1903d), and the same processing is repeated for all blocks B (T)_{i, j} of CM (T) (step 1903e).
[0104]
Next, in step 1904, in which each region of the distance image data RM (T) is moved based on the motion vector image M (T + n), first, as shown in FIG. 25, the distance image data RM (T) is divided into blocks RM (T)_{i, j} of the same size as the blocks B (T)_{i, j} (step 1904a). Thereafter, as shown in FIG. 26 (a), each block RM (T)_{i, j} is moved by the motion vector MV (T + n)_{i, j} (step 1904b). Thereafter, i or j is updated (step 1904c), and when all blocks RM (T)_{i, j} have been moved, the distance image data RM (T + n) at frame time T + n is obtained, as illustrated in the drawings.
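Step 1904 can be sketched as below. The text does not say what happens where moved blocks overlap or where no block lands, so this example adopts two labeled assumptions: when several blocks land on the same pixel, the nearer (smaller) distance value is kept, and pixels that receive no block are filled with a hypothetical `far` value.

```python
def move_distance_blocks(rm_t, mv, w, far):
    """Move each W x W block of rm_t by its motion vector.

    rm_t: distance image as a list of rows of scalar distances.
    mv:   dict mapping block coordinates (i, j) to a vector (dx, dy).
    far:  assumed fill value for pixels that no block lands on.
    """
    height, width = len(rm_t), len(rm_t[0])
    rm_tn = [[None] * width for _ in range(height)]
    for (i, j), (dx, dy) in mv.items():
        for v in range(w):
            for u in range(w):
                x, y = i * w + u + dx, j * w + v + dy
                if not (0 <= x < width and 0 <= y < height):
                    continue  # the block moved partly out of the image
                value = rm_t[j * w + v][i * w + u]
                # Assumption: on overlap, the nearer surface wins.
                if rm_tn[y][x] is None or value < rm_tn[y][x]:
                    rm_tn[y][x] = value
    return [[far if p is None else p for p in row] for row in rm_tn]


# 4 x 4 distance image: a near block (distance 1) in the upper left corner.
rm_t = [[1, 1, 9, 9],
        [1, 1, 9, 9],
        [9, 9, 9, 9],
        [9, 9, 9, 9]]
# The near block moves 2 pixels to the right; the other blocks stay put.
mv = {(0, 0): (2, 0), (1, 0): (0, 0), (0, 1): (0, 0), (1, 1): (0, 0)}
rm_tn = move_distance_blocks(rm_t, mv, 2, far=9)
for row in rm_tn:
    print(row)
# [9, 9, 1, 1]
# [9, 9, 1, 1]
# [9, 9, 9, 9]
# [9, 9, 9, 9]
```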
[0105]
Thereafter, the frame time is updated with n = n + 1. When the distance image data RM (T + n) does not exist, the above procedure is repeated to generate the distance image data RM (T + n).
[0106]
At this time, the distance image data RM (T + n) at the frame time T + n is generated, as illustrated in the drawings, using the moving image data CM (T) and the distance image data RM (T) at the frame time T and the moving image data CM (T + n) at the frame time T + n.
[0107]
Thereafter, by repeating the above processing until the frame time of the next moving image data CM for which distance image data RM exists, the distance image data RM between the frame time T and the frame time T + N can be generated by interpolation.
[0108]
As described above, according to the stereoscopic image generation method of the second embodiment, a stereoscopic image of a moving image can be generated easily even when using an acquisition means that cannot acquire distance image data for every frame.
[0109]
Even if the acquisition unit can acquire the distance image data for each frame, the power consumption of the distance image data acquisition unit 2A can be reduced by intermittent acquisition. Therefore, it becomes easy to incorporate the apparatus for generating the stereoscopic image into a portable device.
[0110]
Further, in the procedure of the stereoscopic image generation method of the second embodiment described above, it is also possible to use, instead of the distance image data input from a distance measuring device, low-resolution distance image data that can only distinguish near from far, as described in the first embodiment. In that case, only the movement of the contour region needs to be obtained, so the amount of calculation is reduced and the power consumption can be reduced further.
[0111]
FIG. 27 is a schematic diagram for explaining a modification of the second embodiment.
[0112]
In the example of the second embodiment described above, when the distance image data at a frame time T + n between the frame time T and the frame time T + N is generated by interpolation, the distance image data RM (T + n) is generated using the moving image data CM (T) and the distance image data RM (T) at the frame time T and the moving image data CM (T + n) at the frame time T + n. However, the present invention is not limited to this. As in the method shown in FIG. 27, the distance image data RM (T + n) may instead be generated using the moving image data CM (T + n−1), the interpolated distance image data RM (T + n−1), and the moving image data CM (T + n).
[0113]
In moving image data, the contour shape of the object being photographed normally changes gradually as the image frame is updated. Consequently, in a method such as that of the second embodiment, in which pattern matching is always performed between the contour region obtained at the first frame time T and each later frame, correct pattern matching cannot be performed for temporally distant frames in which the amount of deformation exceeds a certain level. In contrast, in the method shown in FIG. 27, the motion vector is obtained by pattern matching against the contour region obtained in the immediately preceding frame. Therefore, regardless of the number of frames for which no distance image data exists, the deformation of the contour shape of the photographed object is always that of a single frame, which is very small. As a result, with the method shown in FIG. 27, a stereoscopic image of the moving image can be generated with high accuracy even if the number of frames without distance image data is large.
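The frame-to-frame variant of FIG. 27 can be sketched as follows. As an illustration only, `estimate_motion` and `move_regions` are hypothetical callables standing in for the motion vector and region-moving steps, and each frame is reduced to a single number (an object position) so that the chaining is easy to trace.

```python
def interpolate_chained(cm, rm_t, estimate_motion, move_regions):
    """Generate RM(T+1) .. RM(T+N-1) by chaining from the previous frame.

    cm:   moving image frames CM(T), CM(T+1), ..., CM(T+N-1).
    rm_t: the measured distance image RM(T).
    Each RM(T+n) is derived from CM(T+n-1), RM(T+n-1) and CM(T+n),
    so the motion estimated at each step spans only one frame.
    """
    rm = [rm_t]
    for n in range(1, len(cm)):
        mv = estimate_motion(cm[n - 1], cm[n])
        rm.append(move_regions(rm[n - 1], mv))
    return rm


# Toy stand-ins: motion is a position difference, moving adds it.
cm = [0, 1, 3, 6]
chained = interpolate_chained(
    cm, 10,
    estimate_motion=lambda a, b: b - a,
    move_regions=lambda r, mv: r + mv,
)
print(chained)  # [10, 11, 13, 16]
```

A trade-off worth noting, which is a general property of chained estimation rather than a statement from the text: any error made at one step is carried into all later interpolated frames until the next measured distance image arrives.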
[0114]
(Example 3)
FIG. 28 is a schematic diagram illustrating a schematic configuration of the stereoscopic image generating apparatus according to the third embodiment of the present invention.
[0115]
As shown in FIG. 28, the stereoscopic image generating apparatus according to the third embodiment comprises: moving image data input means 1B for inputting moving image data from moving image data acquisition means 1A; auxiliary image data input means 20B for inputting auxiliary image data from auxiliary image data acquisition means 20A; control means 3 for controlling the input of each data to the moving image data input means 1B and the auxiliary image data input means 20B; moving image data recording means 4 for recording the moving image data input from the moving image data input means 1B; distance image data recording means 5 for recording the auxiliary image data input from the auxiliary image data input means 20B and the distance image data; and distance image data interpolation generation means 21 for generating distance image data by interpolation.
[0116]
In addition, the stereoscopic image generating apparatus is an apparatus that generates a stereoscopic image composed of sets of moving image data and distance image data at, for example, 30 frames or more per second. As illustrated in the drawings, a three-dimensional moving image can be obtained by processing the generated stereoscopic image with the calculating means 7 and displaying the result with the display means 8.
[0117]
FIG. 29 to FIG. 36 are schematic diagrams for explaining a stereoscopic image generation method using the stereoscopic image generation apparatus according to the third embodiment.
[0118]
In the stereoscopic image generating apparatus according to the third embodiment, unlike the first and second embodiments, auxiliary image data from which the distance image data can be generated is input, as shown in FIG. 29, and the distance image data is generated using the auxiliary image data alone or using the auxiliary image data and the moving image data.
[0119]
Here, the auxiliary image data from which the distance image data can be generated is defined as image data that does not directly represent distance information but from which distance image data, low-resolution distance image data, or pseudo distance image data can be generated by calculation, using either the auxiliary image data alone or a combination of the auxiliary image data and the moving image data.
[0120]
For example, if the auxiliary image data is an ordinary image, it does not contain distance information by itself, but when it is combined with the moving image data and the pair is regarded as a stereo image, a distance image can be generated using the stereo method. Therefore, an ordinary image can serve as auxiliary image data.
[0121]
In the third embodiment, only an example in which such a stereo image is used as the auxiliary image data is described in detail. However, the present invention does not limit the type of auxiliary image data, as long as distance image data, low-resolution distance image data, or pseudo distance image data can be generated by calculation using the auxiliary image data alone or a combination of the auxiliary image data and the moving image data.
[0122]
When generating a stereoscopic image of a moving image using the stereoscopic image generation apparatus according to the third embodiment, it is assumed that the moving image data CM for each frame and the auxiliary image data HM for every N frames, from which a distance image can be generated, are recorded in advance in a recording unit such as a semiconductor memory. If the frame time at which both moving image data and auxiliary image data exist is T, there is one piece of auxiliary image data HM (T) for the moving image data CM (T), CM (T + 1), ..., CM (T + N−1). Stereoscopic image data of the moving image can be generated by generating the distance image data RM (T), RM (T + 1), ..., RM (T + N−1) from these images.
[0123]
It is assumed that the auxiliary image data HM (T) used in the third embodiment is image data captured at a position different in the horizontal direction from the capturing position of the moving image data CM (T).
[0124]
When the distance image data RM (T) is generated using the stereoscopic image generation method of the third embodiment, first, as shown in FIG. 30, auxiliary image data HM having the same frame time as the moving image data CM (T). (T) is selected (step 2201), and distance image data RM (T) at the frame time T is generated (step 2202).
[0125]
In step 2202 for generating the distance image data RM (T) at the frame time T, as shown in FIG. 31, first, the absolute value of the difference between the moving image data CM (T) shown in FIG. 32A and the auxiliary image data HM (T) is taken to generate the image DM (T) (step 2202a). At this time, since the moving image data CM (T) and the auxiliary image data HM (T) form a stereo pair, the position of the foreground area in the image differs between the moving image data CM (T) and the auxiliary image data HM (T), and the difference value there is not zero. In particular, a significant difference value occurs in areas where the change in luminance value is large. On the other hand, in the distant view area, the position in the image hardly changes between the moving image data CM (T) and the auxiliary image data HM (T), so the difference value is very small.
[0126]
Next, each point DM (T)_{x, y} of the difference value image DM (T) is set to 1 if it is greater than or equal to a preset threshold k, and to 0 if it is less than k (steps 2202b, 2202c, 2202d). Thereafter, x or y is updated (step 2202e), and when all DM (T)_{x, y} have been binarized (step 2202f), a binarized image DM (T) is obtained, as illustrated in the drawings. As a result, the area in which the object does not appear (distant view area) becomes 0. On the other hand, the area in which the object appears (the foreground area) has a high probability of being 1 but may be 0 where the change in luminance is small, so the result is a noisy image in which 0 and 1 are mixed.
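Steps 2202a to 2202f can be sketched as follows. The example assumes single-channel (grayscale) images stored as lists of rows, which is a simplification of the RGB data mentioned earlier; the function name `binarized_difference` is chosen for illustration.

```python
def binarized_difference(cm_t, hm_t, k):
    """Absolute difference of two images, thresholded at k.

    Returns an image of the same size whose pixels are 1 where
    |CM - HM| >= k and 0 otherwise.
    """
    return [
        [1 if abs(c - h) >= k else 0 for c, h in zip(c_row, h_row)]
        for c_row, h_row in zip(cm_t, hm_t)
    ]


# One image row: a bright object (200) on a dark background (10).
# In the auxiliary image the object appears one pixel to the right.
cm_t = [[10, 10, 200, 200, 10, 10]]
hm_t = [[10, 10, 10, 200, 200, 10]]
dm = binarized_difference(cm_t, hm_t, k=50)
print(dm)  # [[0, 0, 1, 0, 1, 0]]
```

The 0 between the two 1s lies inside the object, where the luminance is uniform, which is the kind of mixed 0 and 1 noise the paragraph above describes.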
[0127]
Therefore, next, noise is removed from the binarized image DM (T) (step 2202g). For example, as shown in FIG. 34, FIG. 35 (a), and FIG. 35 (b), the noise removal examines, for every pixel DM (T)_{x, y} of the binarized image DM (T), the value of each pixel in a preset region of p pixels × p pixels (steps 2202h and 2202i). If there is a pixel having a value of 1, RM (T)_{x, y} = 1 (step 2202k); otherwise RM (T)_{x, y} = 0 (step 2202j). As a result, noise is removed, and as shown in FIG. 36, a low-resolution distance image RM (T) in which the near view is represented by 1 and the far view is represented by 0 is generated.
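The noise removal of steps 2202g to 2202k can be sketched as below. The text only states that a preset p × p region is examined; this example assumes that the region is centered on the pixel being processed, that p is odd, and that the window is clipped at the image border. Under those assumptions the operation corresponds to a binary dilation with a p × p window.

```python
def remove_noise(dm, p):
    """Set RM[y][x] = 1 if any pixel in the p x p window around (x, y) is 1.

    dm: binarized image as a list of rows of 0/1 values.
    p:  window size (assumed odd); the window is clipped at the borders.
    """
    height, width = len(dm), len(dm[0])
    r = p // 2
    rm = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            y_lo, y_hi = max(y - r, 0), min(y + r, height - 1)
            x_lo, x_hi = max(x - r, 0), min(x + r, width - 1)
            if any(dm[v][u] == 1
                   for v in range(y_lo, y_hi + 1)
                   for u in range(x_lo, x_hi + 1)):
                rm[y][x] = 1
    return rm


dm = [[0, 0, 0, 0, 0, 0],
      [0, 0, 1, 0, 1, 0],
      [0, 0, 0, 0, 0, 0]]
rm = remove_noise(dm, 3)
for row in rm:
    print(row)
# [0, 1, 1, 1, 1, 1]
# [0, 1, 1, 1, 1, 1]
# [0, 1, 1, 1, 1, 1]
```

The isolated hole inside the foreground is filled, at the cost of widening the near region by about p / 2 pixels on each side, which is acceptable for a low-resolution near/far map.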
[0128]
In addition to this, there are other methods for removing noise, such as using a median filter or applying a smoothing filter and then binarizing again. The present invention does not limit the noise removal method.
[0129]
Various other images can be considered as the auxiliary image. For example, there is also a method of using an input image with a shallow focus depth at a short distance as an auxiliary image. In the auxiliary image, the distant view area is taken out of focus, so the high frequency component is extremely small. Therefore, by calculating the spatial frequency in the vicinity of the same position for the auxiliary image and the image and determining the ratio of the high frequency components, it is possible to calculate the approximate distance at that position from the ratio. The present invention does not limit what is used for the auxiliary image as long as the distance image can be generated from the auxiliary image or a combination of the auxiliary image and the image.
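The focus-based idea above can be illustrated with a rough sketch. Everything specific here is an assumption: the high-frequency content is approximated by the sum of squared differences between horizontally adjacent pixels in a small window, the images are reduced to single rows, and the mapping from the ratio to an actual distance is left out because the text does not specify it.

```python
def high_freq_energy(row, center, radius):
    """Sum of squared neighbor differences near `center` (a crude proxy
    for the high-frequency content of the signal in that neighborhood)."""
    lo = max(center - radius, 0)
    hi = min(center + radius, len(row) - 1)
    return sum((row[x + 1] - row[x]) ** 2 for x in range(lo, hi))


def sharpness_ratio(aux_row, img_row, center, radius, eps=1e-9):
    """Ratio of high-frequency energy: auxiliary image over normal image.

    Values near 1 suggest the position is in focus in the shallow-focus
    auxiliary image (near); values near 0 suggest it is blurred (far).
    """
    aux = high_freq_energy(aux_row, center, radius)
    img = high_freq_energy(img_row, center, radius)
    return aux / (img + eps)


# The normal image is sharp everywhere; the auxiliary image is sharp on
# the left (near) and blurred to a flat gray on the right (far).
img_row = [0, 100, 0, 100, 0, 100, 0, 100]
aux_row = [0, 100, 0, 100, 50, 50, 50, 50]
near_ratio = sharpness_ratio(aux_row, img_row, center=1, radius=1)
far_ratio = sharpness_ratio(aux_row, img_row, center=6, radius=1)
print(near_ratio > 0.99, far_ratio < 0.01)  # True True
```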
[0130]
Then, after the above procedure is repeated and the distance image data RM (T) is generated from the moving image data CM (T) and the intermittently input auxiliary image data HM (T), a stereoscopic image of the moving image is generated according to, for example, the procedure described in the first embodiment.
[0131]
In addition, since a low-resolution distance image is generated in the third embodiment, it is combined with the generation method described in the first embodiment. However, when the resolution of the generated distance image is high, it may be combined with the generation method described in the second embodiment.
[0132]
As described above, according to the stereoscopic image generation method of the third embodiment, auxiliary image data from which the distance image data can be generated is acquired intermittently, and distance image data is generated from the auxiliary image data, so that a stereoscopic image of a moving image can be generated easily, as with the generation methods described in the first and second embodiments.
[0133]
Further, by intermittently acquiring auxiliary image data, it is possible to reduce the power consumption of the auxiliary image data acquiring unit 20A that acquires auxiliary image data. Therefore, it becomes easy to incorporate the stereoscopic image generating apparatus into a portable device.
[0134]
The present invention has been specifically described above based on the embodiments. However, the present invention is not limited to the above-described embodiments, and various modifications can of course be made without departing from the scope of the present invention.
[0135]
[Effects of the invention]
Of the inventions disclosed in the present application, effects obtained by typical ones will be briefly described as follows.
(1) A stereoscopic image of a moving image can be easily generated.
(2) It is possible to reduce the power consumption of a device that generates a stereoscopic image of a moving image.
[Brief description of the drawings]
FIG. 1 is a schematic diagram illustrating a schematic configuration of a stereoscopic image generation apparatus according to a first embodiment of the present invention.
FIG. 2 is a schematic diagram for explaining a stereoscopic image generation method using the stereoscopic image generation apparatus according to the first embodiment, and illustrates an overall processing procedure of the stereoscopic image generation method.
FIG. 3 is a schematic diagram for explaining a stereoscopic image generation method using the stereoscopic image generation apparatus according to the first embodiment, and illustrates an overall processing procedure of the stereoscopic image generation method.
FIG. 4 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 5 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 6 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 7 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 8 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 9 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 10 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 11 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 12 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 13 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 14 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 15 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 16 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 17 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 18 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment;
FIG. 19 is a schematic diagram for explaining characteristics of a stereoscopic image generated by the stereoscopic image generation method according to the first embodiment;
FIG. 20 is a schematic diagram for explaining an overall processing procedure of the stereoscopic image generation method according to the second embodiment of the present invention;
FIG. 21 is a schematic diagram for explaining an overall processing procedure of the stereoscopic image generation method according to the second embodiment of the present invention;
FIG. 22 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the second embodiment;
FIG. 23 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the second embodiment;
FIG. 24 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the second embodiment;
FIG. 25 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the second embodiment;
FIG. 26 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the second embodiment;
FIG. 27 is a schematic diagram for explaining a modification of the second embodiment.
FIG. 28 is a schematic diagram illustrating a schematic configuration of a stereoscopic image generating apparatus according to a third embodiment of the present invention.
FIG. 29 is a schematic diagram for explaining a stereoscopic image generation method using the stereoscopic image generation apparatus according to the third embodiment.
FIG. 30 is a schematic diagram for explaining a stereoscopic image generation method using the stereoscopic image generation apparatus according to the third embodiment.
FIG. 31 is a schematic diagram for explaining a stereoscopic image generation method using the stereoscopic image generation apparatus according to the third embodiment.
FIG. 32 is a schematic diagram for explaining a stereoscopic image generation method using the stereoscopic image generation apparatus according to the third embodiment.
FIG. 33 is a schematic diagram for explaining a stereoscopic image generation method using the stereoscopic image generation apparatus according to the third embodiment.
FIG. 34 is a schematic diagram for explaining a stereoscopic image generation method using the stereoscopic image generation apparatus according to the third embodiment.
FIG. 35 is a schematic diagram for explaining a stereoscopic image generation method using the stereoscopic image generation apparatus according to the third embodiment.
FIG. 36 is a schematic diagram for explaining a stereoscopic image generation method using the stereoscopic image generation apparatus according to the third embodiment.
[Explanation of symbols]
DESCRIPTION OF SYMBOLS 1A ... Moving image data acquisition means, 1B ... Moving image data input means, 2A ... Distance image data acquisition means, 2B ... Distance image data input means, 3 ... Control means, 4 ... Moving image data recording means, 5 ... Distance image data recording means, 6 ... Distance image data interpolation means, 7 ... Calculation means, 8 ... Display means, 10 ... Shooting object, 11 ... Foreground area, 12 ... Border area, 13 ... Contour area, 15 ... Light source, 16 ... Total reflection mirror, 17 ... Half mirror, 20A ... Auxiliary image data acquisition means, 20B ... Auxiliary image data input means, 21 ... Distance image data interpolation generation means

Claims

A stereoscopic image generation method comprising:
a step of inputting moving image data for each frame time;
a step of inputting data for generating a stereoscopic image from the moving image data (hereinafter referred to as distance image data) at intermittent frame intervals; and
a step of interpolating and generating distance image data at a frame time when the distance image data does not exist, based on the moving image data and the input distance image data,
wherein the step of generating the distance image data by interpolation includes:
a step of obtaining a boundary region between near and far from the input distance image data;
a step of obtaining, from the moving image data of the same frame time as the input distance image data, a contour region at the same position as the boundary region or near the boundary region;
a step of obtaining a region corresponding to the contour region from the moving image data at a frame time when the distance image data does not exist; and
a step of generating the distance image data by moving the boundary region and its internal region based on a movement amount of the region corresponding to the contour region.

A stereoscopic image generation method comprising:
a step of inputting moving image data for each frame time;
a step of inputting data for generating a stereoscopic image from the moving image data (hereinafter referred to as distance image data) at intermittent frame intervals; and
a step of interpolating and generating distance image data at a frame time when the distance image data does not exist, based on the moving image data and the input distance image data,
wherein the step of generating the distance image data by interpolation includes:
a step of selecting moving image data at the same frame time as the input distance image data;
a step of obtaining a motion vector of an image from the selected moving image data to moving image data at a frame time at which the distance image data does not exist; and
a step of generating the distance image data by dividing the input distance image data into a plurality of regions and moving each region based on the motion vector.

The stereoscopic image generation method according to claim 1 or claim 2, wherein the step of inputting the distance image data includes:
a step of inputting auxiliary image data capable of generating the distance image data at intermittent frame intervals; and
a step of generating the distance image data based on the auxiliary image data, or based on the auxiliary image data and the moving image data of the same frame time as the auxiliary image data.

A stereoscopic image generation apparatus comprising:
moving image data input means for inputting moving image data for each frame time;
distance image data input means for inputting data for generating a stereoscopic image from the moving image data (hereinafter referred to as distance image data) at intermittent frame intervals; and
distance image data interpolation means for interpolating and generating distance image data at a frame time when the distance image data does not exist, based on the moving image data input from the moving image data input means and the distance image data input from the distance image data input means,
wherein the distance image data interpolation means includes:
means for obtaining a boundary region between near and far from the distance image data input from the distance image data input means;
means for obtaining, from the moving image data at the same frame time as the distance image data, a contour region at the same position as the boundary region or near the boundary region;
means for obtaining a region corresponding to the contour region from the moving image data of the frame time when the distance image data does not exist; and
means for generating the distance image data by moving the boundary region and its internal region based on a movement amount between the contour region and the region corresponding to the contour region.

Moving image data input means for inputting moving image data for each frame time;
Distance image data input means for inputting data for generating a stereoscopic image from the moving image data (hereinafter referred to as distance image data) at intermittent frame intervals;
Distance image data interpolation means for interpolating and generating the distance image data at a frame time when the distance image data does not exist, based on the moving image data input from the moving image data input means and the distance image data input from the distance image data input means; in a stereoscopic image generation device comprising these means,
The distance image data interpolation means includes
Means for selecting moving image data at the same frame time as the input distance image data;
Means for obtaining a motion vector of an image from the selected moving image data to moving image data at a frame time at which the distance image data does not exist;
Means for dividing the input distance image data into a plurality of regions and moving each region based on the motion vector, thereby generating the distance image data; the stereoscopic image generation device being characterized by comprising these means.

The distance image data input means includes
Auxiliary image data input means for inputting auxiliary image data capable of generating the distance image data at intermittent frame intervals;
Means for generating the distance image data based on the auxiliary image data, or based on the auxiliary image data and the moving image data of the same frame time as the auxiliary image data; the stereoscopic image generation device according to claim 4 or claim 5, characterized by comprising these means.

A stereoscopic image generation program for causing a computer to execute each step of the stereoscopic image generation method according to any one of claims 1 to 3.

A recording medium on which the stereoscopic image generating program according to claim 7 is recorded so as to be readable by a computer.