JP2004229093A

JP2004229093A - Method, device, and program for generating stereoscopic image, and recording medium

Info

Publication number: JP2004229093A
Application number: JP2003016302A
Authority: JP
Inventors: Akihiko Hashimoto; 秋彦橋本; Hajime Noto; 肇能登; Kenji Nakazawa; 憲二中沢
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2003-01-24
Filing date: 2003-01-24
Publication date: 2004-08-12
Anticipated expiration: 2023-01-24
Also published as: JP3988879B2

Abstract

PROBLEM TO BE SOLVED: To provide a technique capable of easily generating a stereoscopic image of a moving picture.

SOLUTION: The stereoscopic image generating method comprises a step of inputting moving image data at every frame time; a step of inputting, at intermittent frame intervals, data for generating a stereoscopic image from the moving image data (hereafter called range image data); and a step of interpolating, on the basis of the moving image data and the input range image data, the range image data for frame times at which no range image data is present.

COPYRIGHT: (C)2004, JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、立体動画像データ生成方法及び立体動画像データ生成装置、ならびに立体動画像データ生成プログラム及び記録媒体に関し、特に、携帯機器等で表示する立体動画像データの生成に適用して有効な技術に関するものである。
【０００２】
【従来の技術】
従来、画像通信の分野では、より高い臨場感が得られる立体画像による動画通信が注目されており、学会発表や企業のデモ試作等も盛んになりつつある。ここで、前記立体画像とは、色や輝度を記録した二次元画像と、計測地点から撮影対象までの距離を記録した距離画像または奥行き画像と呼ばれる画像の組み合わせであるとする。また、前記距離画像とは、各画素に撮影対象までの距離を記録した画像である。また、前記動画とは、例えば、毎秒３０フレーム程度以上の画像からなるものとする。
【０００３】
前記立体画像による動画を生成するときには、例えば、ＴＶカメラ等で前記色や輝度を記録した二次元画像からなる動画像データを入力し、距離計測器等で前記距離画像データを入力する。
【０００４】
また、前記立体画像による動画を生成する方法には、例えば、視差を有する２枚の二次元画像を用いて距離画像データを生成する方法もある。
【０００５】
また、二次元画像の視差画像による立体画像を生成する方法として、距離画像から輪郭部を求めて視差領域を決定し、前後の画像と動きベクトル等を用いて視差量を生成する技術が開示されている（たとえば、特許文献１を参照。）。
【０００６】
【特許文献１】
特開２０００‐２５３４２２号公報
【０００７】
【発明が解決しようとする課題】
しかしながら、前記従来の技術では、前記立体画像による動画通信を行うことが難しいという問題があった。
【０００８】
前記立体画像による動画通信を行うためには、例えば、毎秒３０フレーム以上の距離画像データを入力しなければならないが、従来、そのような高速で距離画像データを取得できる距離計測器は非常に高価であり、携帯機器に組み込んで用いることが難しい。
【０００９】
また、従来の技術では、前記距離画像データは、前記二次元画像と同様に１フレーム時間毎に取得し、入力しているので、消費電力量が高くなる。そのため、充電池等で使用する携帯機器等への組み込みが難しいという問題があった。
【００１０】
また、携帯機器に組み込み可能な技術として、例えば、遠近の区別ができる程度の分解能の低い距離画像データを入力し、これを用いて書き割り形式の立体画像表示を行う方法がある。また、前記分解能の低い距離画像から、擬似的な距離画像データを生成して立体表示する方法もある。ここで、前記擬似的な距離画像データとは、距離を計測によってではなく、人為的に生成したデータであるとする。すなわち、前記擬似的な距離画像データは、必ずしも現実の距離と一致しているわけではなく、近似の度合いは生成のアルゴリズムに依存する。
【００１１】
しかしながら、前記各方法でも、前記各距離画像データは１フレーム時間毎に取得するので、消費電力量が高い。そのため、携帯機器で立体画像を動画入力することが難しいという問題があった。
【００１２】
また、前記特許文献１に記載されたような技術でも、前記距離画像データは１フレーム時間毎に取得するので、消費電力量が高く、携帯機器に組み込んで用いることが難しいという問題があった。
【００１３】
本発明の目的は、動画の立体画像を容易に生成することが可能な技術を提供することにある。
【００１４】
本発明の他の目的は、動画の立体画像を生成する装置の消費電力量を低減することが可能な技術を提供することにある。
【００１５】
本発明の他の目的は、立体画像による動画入力を容易にし、かつ、立体画像生成時の消費電力量を低減することが可能なプログラム及び記録媒体を提供することにある。
【００１６】
本発明の前記ならびにその他の目的と新規な特徴は、本明細書の記述及び添付図面によって明らかになるであろう。
【００１７】
【課題を解決するための手段】
本願において開示される発明の概要を説明すれば、以下の通りである。
（１）１フレーム時間毎の動画像データを入力するステップと、前記動画像データから立体画像を生成するためのデータ（以下、距離画像データと称する）を、間欠的なフレーム間隔で入力するステップと、前記動画像データ及び前記入力された距離画像データに基づいて、前記距離画像データが存在しないフレーム時刻の距離画像データを補間生成するステップとを有する立体画像生成方法である。
【００１８】
（２）前記（１）の手段において、前記距離画像データを補間生成するステップは、前記入力された距離画像データから、遠近の境目領域を求めるステップと、前記入力された距離画像データと同じフレーム時刻の動画像データから、前記境目領域と同じ位置、または前記境目領域の近傍の輪郭部領域を求めるステップと、前記距離画像データが存在しないフレーム時刻の動画像データから、前記輪郭部領域に相当する領域を求めるステップと、前記輪郭部領域と、前記輪郭部領域に相当する領域の移動量に基づいて、前記境目領域及びその内部領域を移動させて、前記距離画像データを生成するステップとを有する。
【００１９】
（３）前記（１）の手段において、前記距離画像データを補間生成するステップは、前記入力された距離画像データと同じフレーム時刻の動画像データを選択するステップと、前記選択された動画像データから、前記距離画像データが存在しないフレーム時刻の動画像データまでの間の画像の動きベクトルを求めるステップと、前記入力された距離画像データを複数の領域に分割し、前記動きベクトルに基づいて前記各領域を移動させて、距離画像データを生成するステップとを有する。
【００２０】
（４）前記（１）から（３）の各手段において、前記距離画像データを入力するステップは、前記距離画像データを生成可能な補助画像データを、間欠的なフレーム間隔で入力するステップと、前記補助画像データ、または前記補助画像データ及び前記補助画像データと同じフレーム時刻の動画像データに基づいて距離画像データを生成するステップとを有する。
【００２１】
前記（１）の手段によれば、距離計測器などの前記距離画像データを取得する手段を高速で動作させることができない場合でも、立体画像による動画を容易に得ることができる。
【００２２】
このとき、前記距離画像データを補間生成するステップは、例えば、前記（２）の手段あるいは前記（３）の手段の各ステップによる処理を行う。
【００２３】
また、前記距離画像データを入力するステップは、距離計測器等を用いて、直接距離画像データを入力してもよいし、前記（４）の手段のように、前記補助画像データから間接的に生成した距離画像データを入力してもよい。
【００２４】
（５）１フレーム時間毎の動画像データを入力する動画像データ入力手段と、前記動画像データから立体画像を生成するためのデータ（以下、距離画像データと称する）を、間欠的なフレーム間隔で入力する距離画像データ入力手段と、前記動画像データ入力手段から入力された動画像データ及び前記距離画像データ入力手段から入力された距離画像データに基づいて、前記距離画像データが存在しないフレーム時刻の距離画像データを補間生成する距離画像データ補間手段とを備える立体画像生成装置である。
【００２５】
（６）前記（５）の手段において、前記距離画像データ補間手段は、前記距離画像データ入力手段から入力された距離画像データから、遠近の境目領域を求める手段と、前記距離画像データと同じフレーム時刻の動画像データから、前記境目領域と同じ位置、または前記境目領域の近傍の輪郭部領域を求める手段と、前記距離画像データが存在しないフレーム時刻の動画像データから、前記輪郭部領域に相当する領域を求める手段と、前記輪郭部領域と、前記輪郭部領域に相当する領域の移動量に基づいて、前記境目領域及びその内部領域を移動させて、前記距離画像データを生成する手段とを有する。
【００２６】
（７）前記（５）の手段において、前記距離画像データ補間手段は、前記入力された距離画像データと同じフレーム時刻の動画像データを選択する手段と、前記選択された動画像データから、前記距離画像データが存在しないフレーム時刻の動画像データまでの間の画像の動きベクトルを求める手段と、前記入力された距離画像データを複数の領域に分割し、前記動きベクトルに基づいて前記各領域を移動させて、距離画像データを生成する手段とを備える。
【００２７】
（８）前記（５）から（７）の手段において、前記距離画像データ入力手段は、前記距離画像データを生成可能な補助画像データを、間欠的なフレーム間隔で入力する補助画像データ入力手段と、前記補助画像データ、または前記補助画像データ及び前記補助画像データと同じフレーム時刻の動画像データに基づいて距離画像データを生成する手段とを備える。
【００２８】
前記（５）の手段によれば、距離計測器などの距離画像データを取得する手段を高速で動作させることができない場合でも、立体画像による動画を容易に得ることができる。
【００２９】
また、前記距離画像データを取得する手段が、１フレーム時間毎に距離画像データを取得できる手段であっても、動作を間欠的にすることで、消費電力量を低くすることができる。
【００３０】
また、前記距離画像データ補間手段は、例えば、前記（６）の手段または前記（７）の手段の各手段を備える。
【００３１】
また、前記距離画像データ入力手段は、距離計測器等から前記距離画像データを直接入力する手段であってもよいし、前記（８）の手段のように、前記補助画像データを入力する手段と、前記補助画像データから距離画像データを生成する手段を設けて、間接的に距離画像データを入力してもよい。
【００３２】
以上のようなことから、前記（５）から（８）の手段を用いることで、前記立体画像生成装置の携帯機器への組み込みを容易にすることができる。
【００３３】
（９）前記（１）から（４）の各手段のいずれかの立体画像生成方法の、各ステップをコンピュータに実行させるための立体画像生成プログラムである。
【００３４】
前記（９）の手段によれば、コンピュータなどの汎用機器（装置）を用いて容易に立体画像を生成させることができる。
【００３５】
（１０）前記（９）の手段の立体画像生成プログラムがコンピュータで読み取り可能に記録された記録媒体。
【００３６】
前記（１０）の手段によれば、前記記録媒体をコピー、あるいはネットワークを通して提供することができる。そのため、前記立体画像生成プログラムを実行可能な装置であれば、どのような装置でも立体画像を生成させることができる。
【００３７】
以下、本発明について、図面を参照して実施の形態（実施例）とともに詳細に説明する。
なお、実施例を説明するための全図において、同一機能を有するものは、同一符号を付け、その繰り返しの説明は省略する。
【００３８】
【発明の実施の形態】
（実施例１）
図１は、本発明による実施例１の立体画像生成装置の概略構成を示す模式図である。
【００３９】
本実施例１の立体画像生成装置は、図１に示すように、動画像データ取得手段１Ａから動画像データを入力する動画像データ入力手段１Ｂと、距離画像データ取得手段２Ａから距離画像データを入力する距離画像データ入力手段２Ｂと、前記動画像データ入力手段１Ｂ及び前記距離画像データ入力手段２Ｂに入力する各データの制御をする制御手段３と、前記動画像データ入力手段１から入力された動画像データを記録する動画像データ記録手段４と、前記距離画像データ入力手段２から入力された距離画像データを記録する距離画像データ記録手段５と、距離画像データを補間生成する距離画像データ補間手段６とを備える。
【００４０】
また、前記立体画像生成装置は、例えば、毎秒３０フレーム以上の動画像データと距離画像データの組からなる立体画像を生成する装置であり、生成した前記立体画像は、図１に示したように、演算手段７で演算し、表示手段８で表示することにより立体化された動画像を得ることができる。このとき、前記動画像データ入力手段１Ｂ、前記距離画像データ入力手段２Ｂ、前記制御手段３、前記動画像データ記録手段４、前記距離画像データ記録手段５、前記距離画像データ補間手段６、前記演算手段７の各手段は、コンピュータのＣＰＵやメモリを用いることができる。
【００４１】
図２及び図３は、本実施例１の立体画像生成装置を用いた立体画像生成方法を説明するための模式図であり、立体画像生成方法の全体的な処理手順を示す図である。
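[0042]
To generate the stereoscopic image with the stereoscopic image generating apparatus of Embodiment 1, moving image data CM(T) and distance image data RM(T) are first input, as shown in FIG. 2, and recorded in the moving image data recording means 4 and the distance image data recording means 5. The moving image data CM(T) is input at every frame time, whereas the distance image data RM(T) is input at an interval of N frames, for example at an intermittent interval of every three frames as shown in FIG. 2. Here the distance image data RM(T) is, for example, distance image data with 1-bit resolution, in which the black region 11 is the foreground region, for example the region corresponding to the imaging target 10, and the white region is the background region.
[0043]
In this state, however, there are moving image frames CM(T+1) and CM(T+2) at frame times for which no distance image data RM exists, so the stereoscopic image has not yet been generated. The distance image data for the missing frame times is therefore interpolated from the moving image data and the distance image data.
[0044]
In the stereoscopic image generation method of Embodiment 1, as shown in FIGS. 2 and 3, after initialization (step 1401), the boundary region 12 between foreground and background is obtained from the distance image data RM(T), and the resulting image is denoted Q(T) (step 1402).
[0045]
Next, from the moving image data CM(T) at the same frame time T as the distance image data RM(T), the contour region 13 located at the same position as, or near, the boundary region 12 of the image Q(T) is obtained, and the resulting image is denoted R(T) (step 1403).
[0046]
Next, from the moving image data CM(T+1) at frame time T+1, for which no distance image data exists, the region matching the contour region 13 of the image R(T) is obtained, and the resulting image is denoted R(T+1) (step 1404). This image region can be found using, for example, an existing pattern matching method (Hideyuki Tamura, 「コンピュータ画像処理入門」 [Introduction to Computer Image Processing], Soken Shuppan, pp. 148-153). Pattern matching has many variations, and a suitable one can be chosen as appropriate.
[0047]
Next, the foreground region 11 in the distance image data RM(T) at frame time T is moved in accordance with the displacement of the contour region 13 in the image R(T+1) obtained in step 1404 relative to the contour region 13 in the image R(T), thereby generating the distance image data RM(T+1) at frame time T+1 (step 1405).
[0048]
It is then checked whether distance image data RM(T+2) exists for the next frame time T+2 (steps 1406, 1407, 1408, 1409). In Embodiment 1, RM(T+2) does not exist, so it is generated by the same procedure as RM(T+1). The position of the image region in the moving image data at frame time T+2 may be found, for example, by pattern matching directly against the contour region of the moving image data at frame time T, as shown in FIG. 2, or by pattern matching against the contour region of the moving image data at frame time T+1.
[0049]
By repeating the above procedure, distance image data can be interpolated in turn for every frame time at which none exists. The stereoscopic image generating apparatus of Embodiment 1 therefore does not need to acquire distance image data at every frame time, so a distance image data acquisition means 2A that can operate only at intermittent frame intervals can be used. Even when the distance image data acquisition means 2A is capable of acquiring distance image data at every frame time, operating it intermittently reduces power consumption. Although the distance image data is input every three frame times in Embodiment 1, this is not a limitation; it may be input, for example, every 5 or every 10 frame times, and widening the interval reduces power consumption substantially.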
【００５０】
図４乃至図１８は、本実施例１の立体画像生成方法の具体例を説明するための図である。
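[0051]
In the stereoscopic image generation method of Embodiment 1, as shown in FIG. 4, moving image data CM for every frame and distance image data RM for every N frames are first input and recorded in the moving image data recording means 4 and the distance image data recording means 5. Let T be a frame time at which both an image and a low-resolution distance image exist; then one RM(T) exists for the moving images CM(T), CM(T+1), ..., CM(T+N-1). Stereoscopic moving image data is generated by producing the distance image data RM(T+1), ..., RM(T+N-1) from these images. Both CM and RM have an image size of X pixels horizontally by Y pixels vertically. The data value of CM(T) at coordinates (x, y) is written CM(T)_{x,y}, and the data value of RM(T) at coordinates (x, y) is written RM(T)_{x,y}. CM(T)_{x,y} is RGB data, and RM(T)_{x,y} is binary data taking the value 0 or 1, where 1 denotes foreground and 0 denotes background.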
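[0052]
When the moving image data CM and the distance image data RM are input, the control means 3 controls, as shown in FIG. 5, a TV camera for capturing moving images (moving image data acquisition means) 1A, a TV camera for capturing distance images (distance image data acquisition means) 2A, and a light source 15 that emits pulsed light such as a flash.
[0053]
The imaging optical axis of the distance image data acquisition means 2A is aligned with that of the moving image data acquisition means 1A by a total reflection mirror 16 and a half mirror 17, so that both capture with the same angle of view and from the same viewpoint.
[0054]
When the moving image data CM is input using the moving image data acquisition means 1A, the control means 3 outputs a control signal to the moving image data acquisition means 1A at every preset frame time, as shown in FIG. 6(a); the moving image data is captured and input to the moving image data input means 1B. The frame interval ΔT is set to, for example, 1/30 second.
[0055]
When the distance image data RM is input, on the other hand, the control means 3 outputs a control signal to the distance image data acquisition means 2A at a time interval N times the preset frame interval, for example three times the frame interval of the moving image data, as shown in FIG. 6(b); intermittent distance image data is captured and input to the distance image data input means 2B. The control means 3 also outputs a control signal to the light source 15 at the same timing so that the imaging target 10 is illuminated.
[0056]
The capture start times of the moving image data acquisition means 1A and the distance image data acquisition means 2A are offset by at least the exposure time of the moving image data acquisition means 1A. Alternatively, infrared light is used as the illumination emitted from the light source 15 and an infrared camera is used as the distance image data acquisition means 2A, or a similar measure is taken, so that the illumination does not affect capture by the moving image data acquisition means 1A.
[0057]
By using a high-speed camera, a single camera can also serve as both the moving image data acquisition means 1A and the distance image data acquisition means 2A. In that case the total reflection mirror 16 and the half mirror 17 are unnecessary.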
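[0058]
The distance image data RMP(T) captured by the distance image data acquisition means 2A is converted into, for example, low-resolution distance image data RM(T) that records only the near/far distinction. First, as shown in FIG. 7, the distance image data RMP(T) whose capture time is closest to that of the moving image data CM(T) captured at time T by the moving image data acquisition means 1A is selected (step 1801). These two images can be regarded as images of the same target captured at the same time, differing only in the presence or absence of illumination.
[0059]
Next, at each coordinate (x, y) of the moving image data CM(T) and the distance image data RMP(T), the ratio CM(T)_{x,y} / RMP(T)_{x,y} is computed (step 1802). If the ratio is at least a set value k, RM(T)_{x,y} = 1 (step 1803); otherwise RM(T)_{x,y} = 0 (step 1804). The coordinates (x, y) are then updated (step 1805), and RM(T)_{x,y} is generated at every coordinate (step 1806) to give the distance image data RM(T).
[0060]
Compared with an unilluminated image, an illuminated image shows nearby objects brighter, while the brightness of the distant scene hardly changes. This test therefore yields low-resolution distance image data RM(T) in which the foreground region is 1 and the background region is 0. The low-resolution distance image data RM(T) may also be generated using the difference instead of the ratio.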
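The thresholding of steps 1802 to 1806 can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the function name `binary_range_from_flash`, the grayscale list-of-rows image layout, the default `k=1.5`, and the `eps` guard against division by zero are all assumptions. The text writes the ratio as CM(T)/RMP(T); the sketch assumes the ratio is taken as illuminated over unilluminated brightness, so that the brightened foreground is what reaches k, as the paragraph on illuminated versus unilluminated images describes.

```python
def binary_range_from_flash(unlit, lit, k=1.5, eps=1e-6):
    """Build a 0/1 near/far map from an unlit frame and a flash-lit frame.

    unlit, lit: grayscale images as lists of rows (hypothetical layout).
    Assumption: a pixel is foreground (1) when lit/unlit >= k.
    """
    height, width = len(unlit), len(unlit[0])
    rm = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # eps avoids division by zero on black pixels (an assumption).
            ratio = lit[y][x] / max(unlit[y][x], eps)
            rm[y][x] = 1 if ratio >= k else 0
    return rm
```

Replacing the ratio with `lit[y][x] - unlit[y][x]` and a difference threshold gives the difference-based variant that the text also allows.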
【００６１】
また、本実施例１では分解能の低い距離画像データＲＭ（Ｔ）の入力手段としてフラッシュなどの光源１５を用いる撮影方法を示したが、分解能の低い距離画像データＲＭ（Ｔ）を取得、あるいは入力できるのであれば、前記実施例１に示した方法に限定されるものではなく、近景に焦点のあった画像と遠景に焦点のあった画像（またはパンフォーカス画像）間で空間周波数の大小に応じて距離を算出してもよく、あるいはステレオ撮影して単純に差分を取り、ある設定値を下回る差分値を示した領域を遠景領域として分解能の低い距離画像データＲＭ（Ｔ）を生成すること等が可能である。
【００６２】
次に、図２及び図３に示した手順にしたがって、動画の立体画像データを生成してゆく。
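[0063]
In step 1402, which obtains the image Q(T) of the boundary region 12 from the distance image data RM(T), the pixel RM(T)_{x,y} at coordinates (x, y) of the distance image data RM(T) is first compared with its adjacent pixels, as shown in FIG. 8 (steps 1402a, 1402b). The pixels adjacent to RM(T)_{x,y} are the 8-connected neighbors RM(T)_{x-1,y-1}, RM(T)_{x-1,y}, RM(T)_{x-1,y+1}, RM(T)_{x,y-1}, RM(T)_{x,y+1}, RM(T)_{x+1,y-1}, RM(T)_{x+1,y}, RM(T)_{x+1,y+1}, as shown in FIG. 9(a). If all of them have the same data value, Q(T)_{x,y} = 0 (step 1402c); otherwise Q(T)_{x,y} = 1 (step 1402d). The coordinates (x, y) are then updated (step 1402e), Q(T)_{x,y} is obtained for every pixel RM(T)_{x,y}, and the image Q(T) of the boundary region 12 shown in FIG. 9(b) is generated.
[0064]
Step 1402a was described for the 8-connected neighbors of the pixel RM(T)_{x,y}, but Q(T) may instead be generated by comparison with the 4-connected neighbors RM(T)_{x-1,y}, RM(T)_{x,y-1}, RM(T)_{x,y+1}, RM(T)_{x+1,y}.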
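The neighbor comparison of step 1402 can be sketched as below, with the 8-connected and 4-connected variants selected by an offset list. The names `boundary_image`, `NEIGHBORS_8` and `NEIGHBORS_4` are hypothetical, and two points are assumptions because the text does not settle them: "the same data value" is read as "the same value as the center pixel", and neighbors that fall outside the image are ignored.

```python
NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
NEIGHBORS_4 = [(-1, 0), (0, -1), (0, 1), (1, 0)]


def boundary_image(rm, neighbors=NEIGHBORS_8):
    """Return Q: 1 where a pixel of rm differs from any listed neighbor."""
    height, width = len(rm), len(rm[0])
    q = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for dx, dy in neighbors:
                nx, ny = x + dx, y + dy
                # Assumption: neighbors outside the image are skipped.
                inside = 0 <= nx < width and 0 <= ny < height
                if inside and rm[ny][nx] != rm[y][x]:
                    q[y][x] = 1
                    break
    return q
```

With 8-connectivity the boundary band is slightly thicker at corners than with 4-connectivity, which is the only practical difference between the two options in this sketch.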
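[0065]
Next, in step 1403, which obtains from the moving image data CM(T) the contour region 13 located at the same position as or near the boundary region 12, the block Q(T)_{i,j} corresponding to the block R(T)_{i,j} of the contour region image R(T) is first extracted, as shown in FIG. 10 (step 1403a). As shown in FIG. 11(a), R(T), whose elements are R(T)_{i,j}, is a block image of X/W pixels horizontally by Y/W pixels vertically, and i, j denote block coordinates. X, Y and W are set in advance so that X/W and Y/W are integers. Each pixel of R consists of a binary data storage area Rk_{i,j} and a movement vector storage area RV_{i,j}.
[0066]
As shown in FIG. 11(b), Q(T)_{i,j} is extracted by dividing Q(T) into blocks of W pixels vertically by W pixels horizontally. The value of each pixel of Q(T)_{i,j} is examined (step 1403b); if all are 0, Rk(T)_{i,j} = 0 (step 1403c), and if any pixel is 1, Rk(T)_{i,j} = 1 (step 1403d). Then i or j is updated (step 1403e), and this is repeated until Rk(T)_{i,j} has been obtained for every R(T)_{i,j} (step 1403f).
[0067]
Rk(T)_{i,j} is data that marks, for CM(T), only the contour region located at the same position as or near the near/far boundary region. That is, if Rk_{i,j} = 1, the block whose diagonal vertices are (i×W, j×W) and ((i+1)×W-1, (j+1)×W-1) is a contour region at the same position as or near the boundary region; if Rk_{i,j} = 0, that block is not. Obtaining Rk(T)_{i,j} therefore yields the image R(T) of the contour region 13 shown in FIG. 12.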
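The reduction from the pixel-level boundary image Q(T) to the block-level flags Rk(T)_{i,j} (steps 1403a to 1403f) can be sketched as follows. The function name `contour_blocks` and the list-of-rows layout are assumptions; the result is indexed `rk[j][i]`, with j the vertical and i the horizontal block coordinate.

```python
def contour_blocks(q, w):
    """Return Rk: 1 for each W x W block of q that contains any 1."""
    height, width = len(q), len(q[0])
    if height % w or width % w:
        # The text presets X, Y, W so that X/W and Y/W are integers.
        raise ValueError("image size must be divisible by the block size")
    rk = [[0] * (width // w) for _ in range(height // w)]
    for j in range(height // w):
        for i in range(width // w):
            block = (q[y][x]
                     for y in range(j * w, (j + 1) * w)
                     for x in range(i * w, (i + 1) * w))
            rk[j][i] = 1 if any(block) else 0
    return rk
```

Only the blocks flagged 1 are passed on to the later matching step, which is what keeps the amount of matching work small.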
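[0068]
Although Embodiment 1 divides the image R(T) of the contour region 13 into rectangular blocks R(T)_{i,j}, this is not a limitation; the image may be divided and extracted in various shapes.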
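[0069]
Next, in step 1404, which obtains from the moving image data CM(T+n) at frame time T+n the contour region corresponding to the contour region 13 obtained in step 1403, the blocks of CM(T) for which Rk(T)_{i,j} = 1 are first extracted, as shown in FIG. 13 (step 1404a).
[0070]
Next, for each block with Rk(T)_{i,j} = 1, the block of CM(T) from coordinates (i×W, j×W) to ((i+1)×W-1, (j+1)×W-1) is cut out as a W×W-pixel reference image B(T)_{i,j}, as shown in FIG. 14(a) (steps 1404a, 1404b).
[0071]
Next, as shown in FIG. 14(b), the block in CM(T+n) whose image content is closest to B(T)_{i,j} is determined, and the coordinates (i×W+dx, j×W+dy) of its upper-left corner are obtained (step 1404c). Here dx and dy are the distances moved in the x and y directions by the contour region contained in the block B(T+n)_{i,j} of the image n frames later. dx and dy are stored in RV(T+n)_{i,j} (step 1404d).
[0072]
To find the block with the closest image content, a correlation measure is computed for each pair of blocks, and the block with the highest correlation in the image is chosen. Many existing formulas for the correlation measure are available, from the rigorous ones used in statistics to simple ones such as the sum of squared per-pixel differences, and many existing block search methods exist (for example, K. R. Rao et al., 「デジタル放送・インターネットのための情報圧縮技術」, Kyoritsu Shuppan, pp. 69-71); the present invention is not limited to any particular formula or method. Although this embodiment uses the block correlation method, the invention is not limited to it so long as the distances moved by the contour region in the x and y directions can be obtained.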
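One of the simple options mentioned above, an exhaustive search that minimizes the sum of squared per-pixel differences, can be sketched as follows. The function name `match_block`, the grayscale list-of-rows layout, and the bounded search window `search` are assumptions for illustration; the text searches the image for the best block without fixing a search strategy.

```python
def match_block(prev, curr, i, j, w, search=2):
    """Return (dx, dy) for block (i, j) of prev that best matches curr.

    The best match minimizes the sum of squared differences (SSD).
    Assumption: candidates are limited to +/- search pixels and must
    lie entirely inside the image.
    """
    height, width = len(curr), len(curr[0])
    x0, y0 = i * w, j * w
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x1, y1 = x0 + dx, y0 + dy
            if x1 < 0 or y1 < 0 or x1 + w > width or y1 + w > height:
                continue
            ssd = sum((prev[y0 + v][x0 + u] - curr[y1 + v][x1 + u]) ** 2
                      for v in range(w) for u in range(w))
            if best is None or ssd < best[0]:
                best = (ssd, dx, dy)
    return best[1], best[2]
```

In the procedure described above this would be called only for blocks with Rk(T)_{i,j} = 1, and the returned pair stored in RV(T+n)_{i,j}.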
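[0073]
Next, i or j is updated (step 1404e), the above processing is performed for every block with Rk(T)_{i,j} = 1 (step 1404f), and the displacement dx, dy of each block is obtained and stored in RV(T+n)_{i,j}. Moving each B(T)_{i,j} in the image R(T) according to the dx and dy in RV(T+n)_{i,j} then yields, as shown in FIG. 15, the image R(T+n) of the contour region 13 corresponding to the moving image data CM(T+n) at frame time T+n.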
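[0074]
Next, in step 1405, which generates the distance image data RM(T+n) at frame time T+n, the blocks of the distance image data RM(T) at frame time T that correspond to the blocks B(T)_{i,j} are first moved according to the dx and dy in RV(T+n)_{i,j}, as shown in FIG. 16 (step 1405a). As shown in FIGS. 17(a) and 17(b), only the blocks with Rk(T)_{i,j} = 1, that is, only the boundary region 12, are moved at this stage.
[0075]
The boundary region inside the blocks is continuous across blocks before the move, but depending on the direction and amount of movement it may be broken between blocks afterwards. Where the boundary region has become discontinuous, continuity is restored by joining the broken ends at each block boundary with a straight line (step 1405b). Although this embodiment uses linear interpolation for the connection, the invention is not limited to it.
[0076]
Next, the interior enclosed by the boundary region 12 is filled by replacing all of it with the data of the neighboring region just inside the boundary, which gives the distance image data RM(T+n) at frame time T+n as shown in FIG. 18.
[0077]
Thereafter, as shown in FIG. 3, the frame time is updated and the above steps are repeated, so that low-resolution distance image data RM(T+n) is generated for every frame time at which distance image data was not acquired, and the stereoscopic moving image data is generated.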
【００７８】
図１９は、本実施例１の立体画像生成方法で生成した立体画像の特徴を説明するための模式図である。
【００７９】
本実施例１の方法によって生成される距離画像データは、動画像データから抽出したｘ方向とｙ方向の動きに合わせて距離画像を移動させて生成するものである。そのため、撮影対象のｚ方向の動きを正確に表現することができない。例えば、図１９（ａ）に示すように、撮影対象が撮影地点に近づくと、実際には、図１９（ｂ）に示したように、撮影地点に近づくにつれて、奥行き（ｚ方向）の距離ｚが大きくなる。しかしながら、本実施例１によって生成される距離画像は図１９（ｃ）のように、奥行きの距離ｚがほとんど変わらない。したがって工業計測に使用するような動画の立体画像データの生成には不向きである。しかしながら、映像産業や通信産業で用いられる立体画像通信のような人間が見ることが目的である場合、画像と組み合わせて提示されるので、人間の視覚認識機構が合理的な解釈を行うことによって不自然さが目立たない。また、分解能の低い距離画像や擬似的な距離画像である場合、この点は元々問題にならない。
【００８０】
以上説明したように、本実施例１の立体画像生成装置を用いた立体画像生成方法によれば、１フレーム毎に距離画像データを取得することができない距離画像データ取得手段２Ａを用いても、容易に動画の立体画像を生成することができる。
【００８１】
また、１フレーム毎に距離画像データを取得することができる距離画像データ取得手段２Ａであっても、間欠的に取得することで、前記距離画像データ取得手段２Ａの消費電力量を低減することができる。そのため、前記立体画像生成装置の携帯機器への組み込みが容易になる。
【００８２】
また、前記立体画像生成方法をプログラム化すれば、前記各ステップをコンピュータに実行させることが可能である。そのため、専用の装置を用いることなく、動画の立体画像を容易に生成することができる。このとき、前記立体画像を生成するプログラムは、半導体メモリやハードディスク、ＣＤ−ＲＯＭ等の記録媒体によって提供することも出来るし、ネットワークを通して提供することも可能である。
【００８３】
なお、本実施例１では、分解能の低い距離画像の取得手段としてフラッシュを用いる撮影方法を示したが、分解能の低い距離画像を取得できるのであれば、実施例に示した方法に限定するものではなく、近景に焦点のあった画像と遠景に焦点のあった画像（またはパンフォーカス画像）間で空間周波数の大小に応じて距離を算出してもよく、あるいはステレオ撮影して単純に差分を取り、ある設定値を下回る差分値を示した領域を遠景領域として分解能の低い距離画像を生成すること等が可能である。
【００８４】
（実施例２）
図２０乃至図２１は、本発明による実施例２の立体画像生成方法の全体的な処理手順を説明するための模式図である。
【００８５】
本実施例２の立体画像生成方法は、例えば、前記実施例１で説明したような立体画像生成装置を用いて行うので、装置に関する詳細な説明は省略する。
【００８６】
本実施例２の立体画像生成方法は、まず、図２０に示すように、動画像データＣＭ（Ｔ）及び距離画像データＲＭ（Ｔ）を入力し、前記動画像データ記録手段４及び前記距離画像データ記録手段５に記録する。このとき、前記動画像データＣＭ（Ｔ）は、１フレーム時間毎に入力するが、前記距離画像データＲＭ（Ｔ）はＮフレーム間隔、例えば、図２０に示したように、３フレーム毎の間欠的なフレーム間隔で入力する。
【００８７】
しかしながら、この状態では、前記距離画像データＲＭが存在しないフレーム時刻の動画像データＣＭ（Ｔ＋１），ＣＭ（Ｔ＋２）があるので、そのままでは前記立体画像を生成したことにならない。そこで、前記動画像データ及び距離画像データを用いて、存在しないフレーム時間の距離画像データを補間生成する。
【００８８】
本実施例２の立体画像生成方法では、まず、図２０及び図２１に示したように、あるフレーム時刻の距離画像データＲＭと、同フレーム時刻の動画像データを選択する（ステップ１９０２）。ここでは、図２０に示したように、フレーム時刻Ｔの動画像データＣＭ（Ｔ）と距離画像データＲＭ（Ｔ）が選択された場合を説明する。なお、図２０に示した距離画像データＲＭ（Ｔ），ＲＭ（Ｔ＋３）は、黒い領域が撮影地点からの距離が近い領域（近景領域）１１を表しており、距離が遠くなるにしたがって白くなる。
【００８９】
次に、前記ステップ１９０２で選択した動画像データＣＭ（Ｔ）と、距離画像データのないフレーム時刻Ｔ＋１の動画像データＣＭ（Ｔ＋１）との間での画像の動きベクトルを求める（ステップ１９０３）。この求めた動きベクトルをＭ（Ｔ＋１）に示す。動きベクトルの画像Ｍ（Ｔ＋１）中で矢印の記号ＭＶが検出した動きベクトルである。なお、分かり易いように動きベクトルを矢印で図示したが、実際の画像データＭ（Ｔ＋１）は各画素に動きベクトル情報ＭＶが記録された画像であってこのような矢印で描かれた画像ではない。
【００９０】
次に、フレーム時刻Ｔの距離画像データＲＭ（Ｔ）を複数の領域に分割し、各領域を前記動きベクトルＭＶに従って別々に移動させることによって、フレーム時刻Ｔ＋１における距離画像データＲＭ（Ｔ＋１）を生成する（ステップ１９０４）。ここで前記距離画像データＲＭ（Ｔ）の領域は、動きベクトルＭＶの領域に合わせる。例えば、動きベクトルＭＶが画素毎に与えられているならば、前記距離画像データＲＭ（Ｔ）は画素単位で移動させる。また、動きベクトルＭＶがあるブロック領域の動きベクトルであるならば、各々の領域とはそのブロック領域単位である。この領域は動きベクトルを求める方法に依存するが、本発明は特定の動きベクトル生成法に限定するものではない。
【００９１】
次に、フレーム時刻を更新し（ステップ１９０５）、動画像データＣＭ（Ｔ＋２）及び距離画像データＲＭ（Ｔ＋２）が存在するか判定し（ステップ１９０６，１９０７）、動画像データＣＭ（Ｔ＋２）はあるが距離画像データＲＭ（Ｔ＋２）がない場合には、前記ステップ１９０３に戻って、距離画像データＲＭ（Ｔ＋２）を補間生成する。このとき、前記距離画像データＲＭ（Ｔ＋２）は、例えば、図２０に示したように、フレーム時刻Ｔの動画像データＣＭ（Ｔ）とフレーム時刻Ｔ＋２の動画像データＣＭ（Ｔ＋２）から動きベクトルＭＶを求め、その動きベクトルにしたがって前記フレーム時刻Ｔの距離画像データＲＭ（Ｔ）の各領域を移動させ、フレーム時刻Ｔ＋２の距離画像データＲＭ（Ｔ＋２）を生成する。
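[0092]
By repeating the above procedure, distance image data can be interpolated in turn for every frame time at which none exists. The stereoscopic image generation method of Embodiment 2 likewise does not need distance image data to be input at every frame time, so a distance image data acquisition means 2A that can operate only at intermittent frame intervals can be used. Even when the distance image data acquisition means 2A can acquire distance image data at every frame time, operating it intermittently reduces power consumption. Although the distance image data is again input every three frame times in Embodiment 2, this is not a limitation; it may be input, for example, every 5 or every 10 frame times, and widening the interval reduces power consumption substantially.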
【００９３】
図２２乃至図２６は、本実施例２の立体画像生成方法の具体例を説明するための模式図である。
【００９４】
本実施例２の立体画像生成方法では、１フレーム時間毎の動画像データＣＭと、Ｎフレーム時間毎の距離画像データＲＭが、半導体メモリ等の動画像データ記録手段４及び距離画像データ記録手段５にあらかじめ記録されているとする。このとき、前記動画像データと前記距離画像データがともに存在するフレーム時刻をＴとすると、ＣＭ（Ｔ），ＣＭ（Ｔ＋１），・・・，ＣＭ（Ｔ＋Ｎ−１）の動画像データに対して距離画像データＲＭ（Ｔ）が１枚存在する。これらの画像から、ＲＭ（Ｔ＋１），・・・，ＲＭ（Ｔ＋Ｎ−１）の距離画像データを補間生成することによって動画の立体画像データを生成する。ここで、ＣＭもＲＭも、横方向がＸ画素、縦方向がＹ画素の画像サイズであるとする。また、ＣＭ（Ｔ）の座標（ｘ，ｙ）のデータ値をＣＭ（Ｔ）_ｘ，ｙと記述し、ＲＭ（Ｔ）の座標（ｘ，ｙ）のデータ値をＲＭ（Ｔ）_ｘ，ｙと記述する。ＣＭ（Ｔ）_ｘ，ｙはＲＧＢデータであり、ＲＭ（Ｔ）_ｘ，ｙは距離を記録したスカラ値である。
【００９５】
また、前記動画像データＣＭ（Ｔ）を取得する動画像データ取得手段１Ａには、前記実施例１で説明した動画像撮影用のＴＶカメラを用いる。また、前記距離画像データＲＭ（Ｔ）を取得する距離画像データ取得手段２Ａには、たとえば、光切断法（例えば、吉澤徹著、「光三次元計測」、新技術コミュニケーションズ、ｐ．２８‐３７）、ＴＯＦ（光飛行時間計測法）（例えば、井口征士、「３次元計測の最新の動向」、計測と制御、第３４巻、第６号、ｐ．４３０や、河北真宏、「ハイビジョン３次元カメラ：Ａｘｉ−ｖｉｓｉｏｎカメラ」、平１４年度技研公開公演・研究発表会予稿集、ｐ．５８‐６３）、種々のパターン投影法（例えば、吉澤徹著、「光三次元計測（第２版）」、新技術コミュニケーションズ、ｐ．７７‐９９）、種々のステレオ法（例えば、画像電子学会著、「３次元画像用語辞典」、新技術コミュニケーションズ、ｐ．５１）等に基づく距離計測装置を使用する。
【００９６】
前記動画像データＣＭ（Ｔ）と間欠的な距離画像データＲＭ（Ｔ）を用いて距離画像データを補間生成するときには、まず、あるフレーム時刻Ｔにおける距離画像データＲＭ（Ｔ）と、同フレーム時刻の画像データＣＭ（Ｔ）を選択する（ステップ１９０２）。
【００９７】
次に、前記フレーム時刻Ｔの動画像データＣＭ（Ｔ）と、距離画像データのないフレーム時刻Ｔ＋ｎの動画像データＣＭ（Ｔ＋ｎ）との間での画像の動きベクトルを求めるステップ１９０３を行う。
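[0098]
In step 1903, the moving image data CM(T) is divided in a grid into blocks of W pixels vertically by W pixels horizontally, as shown in FIG. 22 (step 1903a). As shown in FIGS. 23(a) and 23(b), each block is W×W in size, and the number of blocks is X/W horizontally and Y/W vertically. X and Y are set in advance to values divisible by W. The block at block coordinates (i, j) is written B(T)_{i,j}.
[0099]
Next, the motion vector image M(T+n) is generated. M(T+n) has the same size as the array of blocks B(T)_{i,j}, that is, it is two-dimensional block data MV(T+n)_{i,j} of X/W elements horizontally by Y/W elements vertically, each element holding a two-dimensional vector.
[0100]
First, the block B(T)_{i,j} corresponding to each block coordinate (i, j) is selected, and the block in CM(T+n) whose image content is closest to B(T)_{i,j} is determined, as shown in FIG. 24(a) (step 1903b).
[0101]
Next, as shown in FIG. 24(b), the coordinates (i×W+dx, j×W+dy) of the upper-left corner of the block determined in CM(T+n) are obtained. Here dx and dy form the motion vector MV(T+n)_{i,j} of B(T)_{i,j} in the image n frames later. dx and dy are stored in MV(T+n)_{i,j} (step 1903c).
[0102]
The processing that obtains the motion vector MV(T+n)_{i,j} is known as block matching, a well-known technique used, for example, for motion prediction when generating MPEG-2 images. In general, a correlation measure is computed for each pair of blocks and the block with the highest correlation in the image is chosen as the closest. Many existing formulas for the correlation measure are available, from the rigorous ones used in statistics to simple ones such as the sum of squared per-pixel differences, and many existing methods also exist for block search and for removing distortion between blocks after movement. The present invention is not limited to any particular formula or method. Although Embodiment 2 uses block matching, the invention is not limited to it so long as the motion vectors in the image can be obtained. The resulting MV(T+n)_{i,j} represents the motion within the image between frame times T and T+n.
[0103]
Then i or j is updated (step 1903d), and the same processing is repeated for every block B(T)_{i,j} of CM(T) (step 1903e).
[0104]
Next, in step 1904, which moves each region of the distance image data RM(T) according to the motion vector image M(T+n), the distance image data RM(T) is first divided into blocks RM(T)_{i,j} of the same size as the blocks B(T)_{i,j}, as shown in FIG. 25 (step 1904a). Each block RM(T)_{i,j} is then moved by the motion vector MV(T+n)_{i,j}, as shown in FIG. 26(a) (step 1904b). Then i or j is updated (step 1904c), and once every block RM(T)_{i,j} has been moved, the distance image data RM(T+n) at frame time T+n is obtained as shown in FIG. 26(b).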
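Step 1904 can be sketched as a block-wise forward move of RM(T) by the per-block vectors. The function name `move_range_blocks` and the `mv[j][i] = (dx, dy)` layout are hypothetical. The text does not say how overlaps and uncovered pixels are handled, so two assumptions are made here: where moved blocks overlap, the larger value wins (appropriate only if larger means nearer, as in a 0/1 map with 1 = foreground), and pixels that no block lands on receive `fill`.

```python
def move_range_blocks(rm, mv, w, fill=0):
    """Move each W x W block of rm by its vector mv[j][i] = (dx, dy)."""
    height, width = len(rm), len(rm[0])
    out = [[None] * width for _ in range(height)]
    for j in range(height // w):
        for i in range(width // w):
            dx, dy = mv[j][i]
            for v in range(w):
                for u in range(w):
                    x, y = i * w + u + dx, j * w + v + dy
                    if 0 <= x < width and 0 <= y < height:
                        value = rm[j * w + v][i * w + u]
                        # Assumption: on overlap the larger value wins.
                        if out[y][x] is None or value > out[y][x]:
                            out[y][x] = value
    # Assumption: uncovered pixels are filled with a constant.
    return [[fill if p is None else p for p in row] for row in out]
```

For scalar distance data where a smaller value means nearer, the comparison would need to be reversed; the source leaves this policy open.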
【０１０５】
その後は、ｎ＝ｎ＋１としてフレーム時刻を更新し、距離画像データＲＭ（Ｔ＋ｎ）が存在しない場合には、前記手順を繰り返し、距離画像データＲＭ（Ｔ＋ｎ）を生成する。
【０１０６】
このとき、前記フレーム時刻Ｔ＋ｎの距離画像データＲＭ（Ｔ＋ｎ）は、例えば、図２０に示したように、フレーム時刻Ｔの動画像データＣＭ（Ｔ）及び距離画像データＲＭ（Ｔ）と、フレーム時刻Ｔ＋ｎの動画像データＣＭ（Ｔ＋ｎ）を用いて生成する。
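[0107]
Thereafter, the above processing is repeated until a frame time of the moving image data CM is reached at which distance image data RM exists, so that the distance image data RM between frame time T and frame time T+N is interpolated.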
【０１０８】
以上説明したように、本実施例２の立体画像生成方法によれば、１フレーム毎に距離画像データを取得することができない取得手段を用いても、容易に動画の立体画像を生成することができる。
【０１０９】
また、１フレーム毎に距離画像データを取得することができる取得手段であっても、間欠的に取得することで、前記距離画像データ取得手段２Ａの消費電力量を低減することができる。そのため、前記立体画像を生成する装置の携帯機器への組み込みが容易になる。
【０１１０】
また、本実施例２の立体画像生成方法では、前記手順で説明したように、距離計測器から入力した距離画像データの代わりに、前記実施例１で説明したような、遠近の区別ができる程度の分解能の低い距離画像データを用いることも出来る。その場合は、輪郭部領域のみの動きを求めればよいので、計算量が少なくなり、消費電力量をさらに低減することができる。
【０１１１】
図２７は、前記実施例２の変形例を説明するための模式図である。
【０１１２】
前記実施例２では、フレーム時刻Ｔとフレーム時刻Ｔ＋Ｎの間のフレーム時刻Ｔ＋ｎの距離画像データを補間生成するときに、図２０に示したように、前記フレーム時刻Ｔの動画像データＣＭ（Ｔ）及び距離画像データＲＭ（Ｔ）、ならびにフレーム時刻Ｔ＋ｎの動画像データＣＭ（Ｔ＋ｎ）を用いて距離画像データＲＭ（Ｔ＋ｎ）を補間生成する例を示したが、その代わりに、例えば、１フレーム前の動画像データＣＭ（Ｔ＋ｎ−１）及び補間生成した距離画像データＲＭ（Ｔ＋ｎ−１）、ならびに動画像データＣＭ（Ｔ＋ｎ）を用いて距離画像データＲＭ（Ｔ＋ｎ）を生成してもよい。
【０１１３】
前記動画像データは通常、画像フレームの更新に伴い、撮影対象の輪郭形状が徐々に変形していくため、前記実施例２のように、常に初めのフレーム時刻Ｔで得られた輪郭部領域との間でパターンマッチングを行う方法では、変形量がある一定量以上となる、時間的に離れたフレームにおいて正しいパターンマッチングが行えなくなる。一方、図２７に示した方法の場合、直前のフレームで得られた輪郭部領域との間でパターンマッチングを行うことにより前記動きベクトルを求めるので、前記距離画像データが存在しないフレームの数の多少に関わらず、撮影対象の輪郭形状の変形量は、常に１フレーム以内でありその量は微小である。したがって、図２７に示した方法の場合、距離画像データが存在しないフレームの数が多くても、前記動画の立体画像を精度よく生成することができる。
【０１１４】
（実施例３）
図２８は、本発明による実施例３の立体画像生成装置の概略構成を示す模式図である。
【０１１５】
本実施例３の立体画像生成装置は、図２８に示すように、動画像データ取得手段１Ａから動画像データを入力する動画像データ入力手段１Ｂと、補助画像データ取得手段２０Ａから補助画像データを入力する補助画像データ入力手段２０Ｂと、前記動画像データ入力手段１Ｂ及び前記補助画像データ入力手段２０Ｂに入力する各データを制御する制御手段３と、前記動画像データ入力手段１から入力された動画像データを記録する動画像データ記録手段４と、前記補助画像データ入力手段２０から入力された補助画像データ及び距離画像データを記録する距離画像データ記録手段５と、距離画像データを補間生成する距離画像データ補間生成手段２１とを備える。
【０１１６】
また、前記立体画像生成装置は、たとえば、毎秒３０フレーム以上の動画像データと距離画像データの組からなる立体画像を生成する装置であり、生成した前記立体画像は、図２８に示したように、演算手段７で演算し、表示手段８で表示することにより立体化された動画像を得ることができる。
【０１１７】
図２９乃至図３６は、本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【０１１８】
本実施例３の立体画像生成装置では、前記実施例１及び前記実施例２と異なり、図２９に示すように、前記距離画像データを生成可能な補助画像データを入力し、前記補助画像データ、または前記補助画像データと前記動画像データを用いて、距離画像データを生成する。
【０１１９】
ここで、前記距離画像データを生成可能な補助画像データとは、データそのものは直接距離情報を表現しているわけではないが、補助画像データ単独、または補助画像データと画像データの組み合わせを用いて演算することによって距離画像データまたは分解能の低い距離画像データまたは擬似的な距離画像データを生成可能な画像データと定義する。
【０１２０】
一例をあげると、補助画像データが普通の画像である場合、これ単独では距離情報を含まないが、もう一方の画像データと組み合わせてステレオ画像とみなすことによってステレオ法を用いて距離画像が生成できる。したがって普通の画像は補助画像データとなる。
【０１２１】
また、本実施例３では、このようなステレオ画像を補助画像データとして用いた例のみを詳細に説明するが、補助画像データ単独、または補助画像データと画像データの組み合わせを用いて演算することによって距離画像データまたは分解能の低い距離画像データまたは擬似的な距離画像データを生成可能な画像データであるならば、本発明は補助画像データの種類を限定するものではない。
【０１２２】
本実施例３の立体画像生成装置を用いて動画の立体画像を生成するときには、半導体メモリなどの記録手段に、１フレーム毎の動画像データＣＭと、Ｎフレーム毎に距離画像を生成可能な補助画像データＨＭがあらかじめ記録されているとする。動画像データと補助画像データがともに存在するフレーム時刻をＴとすると、ＣＭ（Ｔ），ＣＭ（Ｔ＋１），・・・，ＣＭ（Ｔ＋Ｎ−１）の動画像に対してＨＭ（Ｔ）が１枚存在する。これらの画像から、ＲＭ（Ｔ），ＲＭ（Ｔ＋１），・・・，ＲＭ（Ｔ＋Ｎ−１）の距離画像データを補間生成することによって動画の立体画像データを生成できる。
【０１２３】
なお、本実施例３で用いる補助画像データＨＭ（Ｔ）は、動画像データＣＭ（Ｔ）の撮影位置とは水平方向に異なる位置で撮影された画像データであるとする。
【０１２４】
本実施例３の立体画像生成方法を用いて前記距離画像データＲＭ（Ｔ）を生成するときには、図３０に示すように、まず、動画像データＣＭ（Ｔ）と同じフレーム時刻の補助画像データＨＭ（Ｔ）を選択し（ステップ２２０１）、前記フレーム時刻Ｔの距離画像データＲＭ（Ｔ）を生成する（ステップ２２０２）。
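[0125]
In step 2202, which generates the distance image data RM(T) at frame time T, the absolute value of the difference between the moving image data CM(T) shown in FIG. 32(a) and the auxiliary image data HM(T) shown in FIG. 32(b) is first taken, as shown in FIG. 31, to generate the image DM(T) (step 2202a). Because CM(T) and HM(T) form a stereo pair, the foreground region appears at different positions in the two images and its difference is not 0; regions with large luminance variation in particular produce pronounced differences. The background region, by contrast, is at almost the same position in CM(T) and HM(T), so its difference is very small.
[0126]
Next, each point DM(T)_{x,y} of the difference image DM(T) is set to 1 if it is at least a preset difference value k and to 0 if it is less than k (steps 2202b, 2202c, 2202d). Then x or y is updated (step 2202e) and every DM(T)_{x,y} is binarized (step 2202f), generating the binarized image DM(T) shown in FIG. 33. As a result, regions where no object appears (the background region) become 0. Regions where an object appears (the foreground region) are likely to become 1 but may become 0 where the luminance varies little, so the image is noisy, with 0s and 1s mixed.
[0127]
The noise is therefore removed from the binarized image DM(T) (step 2202g). For example, as shown in FIGS. 34, 35(a) and 35(b), for every pixel DM(T)_{x,y} of the binarized image DM(T), the values of the pixels in a preset p×p-pixel window centered on it are examined (steps 2202h, 2202i); if any pixel has the value 1, RM(T)_{x,y} = 1 (step 2202k), and otherwise it is 0 (step 2202j). This removes the noise and, as shown in FIG. 36, generates the low-resolution distance image RM(T) in which the foreground is 1 and the background is 0.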
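Steps 2202a to 2202k (absolute difference, thresholding at k, then the p×p window test) can be sketched as below. The function name `range_from_stereo_pair` and the grayscale list-of-rows layout are assumptions, as are an odd window size `p` and clipping the window at the image border; the source works on the captured images without specifying these details.

```python
def range_from_stereo_pair(cm, hm, k, p):
    """Return a 0/1 near/far map from a main image and an auxiliary image.

    Assumptions: grayscale inputs of equal size, odd p, and a window
    that is clipped at the image border.
    """
    height, width = len(cm), len(cm[0])
    # Steps 2202a-2202f: absolute difference, then binarize at k.
    dm = [[1 if abs(cm[y][x] - hm[y][x]) >= k else 0
           for x in range(width)] for y in range(height)]
    # Steps 2202g-2202k: a pixel becomes 1 if any pixel in its
    # p x p neighborhood is 1.
    r = p // 2
    rm = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = (dm[yy][xx]
                      for yy in range(max(0, y - r), min(height, y + r + 1))
                      for xx in range(max(0, x - r), min(width, x + r + 1)))
            rm[y][x] = 1 if any(window) else 0
    return rm
```

The window test acts as a morphological dilation: it closes the 0-valued holes left inside the foreground by low-texture areas, at the cost of growing the foreground outline by about p/2 pixels.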
[0128]
Other noise removal methods include applying a median filter, or applying a smoothing filter and then binarizing again. The present invention does not limit the noise removal method.
【０１２９】
また、補助画像にもこれ以外の様々な画像が考えられる。例えば、近距離でピントの合う焦点深度の浅い入力画像を補助画像として用いる方法もある。補助画像では遠景領域はボケて撮影されるので高周波成分が著しく小さい。そこで、補助画像と画像について同じ位置の近傍の空間周波数を求めて高周波成分の比を求めることによって、その位置における距離の概略を比から算出することが可能である。本発明は補助画像または補助画像と画像の組み合わせから距離画像を生成できるのであれば補助画像に何を用いるのかを限定するものではない。
【０１３０】
その後、前記手順を繰り返し、動画像データＣＭ（Ｔ）と間欠的に入力した補助画像データＨＭ（Ｔ）から距離画像データＲＭ（Ｔ）を生成した後は、例えば、前記実施例１で説明した手順に沿って、動画の立体画像を生成する。
【０１３１】
なお、本実施例３では分解能の低い距離画像が生成されるので、前記実施例１で説明した生成方法と組み合わせたが、生成される距離画像の分解能が高い場合には前記実施例２で説明した生成方法と組み合わせてもよい。
【０１３２】
以上説明したように、本実施例３の立体画像生成方法によれば、前記距離画像データを生成可能な補助画像データを間欠的に取得し、その補助画像データから距離画像データを生成することにより、前記実施例１または前記実施例２で説明した生成方法と同様に、動画の立体画像を容易に生成することができる。
【０１３３】
また、補助画像データを間欠的に取得することにより、補助画像データを取得する補助画像データ取得手段２０Ａの消費電力量を低減することができる。そのため、前記立体画像生成装置の携帯機器への組み込みが容易になる。
【０１３４】
以上、本発明を、前記実施例に基づき具体的に説明したが、本発明は、前記実施例に限定されるものではなく、その要旨を逸脱しない範囲において、種々変更可能であることはもちろんである。
【０１３５】
【発明の効果】
本願において開示される発明のうち、代表的なものによって得られる効果を簡単に説明すれば、以下の通りである。
（１）動画の立体画像を容易に生成することができる。
（２）動画の立体画像を生成する装置の消費電力量を低減することができる。
【図面の簡単な説明】
【図１】本発明による実施例１の立体画像生成装置の概略構成を示す模式図である。
【図２】本実施例１の立体画像生成装置を用いた立体画像生成方法を説明するための模式図であり、立体画像生成方法の全体的な処理手順を示す図である。
【図３】本実施例１の立体画像生成装置を用いた立体画像生成方法を説明するための模式図であり、立体画像生成方法の全体的な処理手順を示す図である。
【図４】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図５】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図６】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図７】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図８】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図９】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１０】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１１】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１２】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１３】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１４】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１５】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１６】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１７】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１８】本実施例１の立体画像生成方法の具体例を説明するための模式図である。
【図１９】本実施例１の立体画像生成方法で生成した立体画像の特徴を説明するための模式図である。
【図２０】本発明による実施例２の立体画像生成方法の全体的な処理手順を説明するための模式図である。
【図２１】本発明による実施例２の立体画像生成方法の全体的な処理手順を説明するための模式図である。
【図２２】本実施例２の立体画像生成方法の具体例を説明するための模式図である。
【図２３】本実施例２の立体画像生成方法の具体例を説明するための模式図である。
【図２４】本実施例２の立体画像生成方法の具体例を説明するための模式図である。
【図２５】本実施例２の立体画像生成方法の具体例を説明するための模式図である。
【図２６】本実施例２の立体画像生成方法の具体例を説明するための模式図である。
【図２７】前記実施例２の変形例を説明するための模式図である。
【図２８】本発明による実施例３の立体画像生成装置の概略構成を示す模式図である。
【図２９】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３０】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３１】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３２】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３３】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３４】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３５】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
【図３６】本実施例３の立体画像生成装置を用いた立体画像生成方法を説明するための模式図である。
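[Description of Reference Numerals]
1A: moving image data acquisition means; 1B: moving image data input means; 2A: distance image data acquisition means; 2B: distance image data input means; 3: control means; 4: moving image data recording means; 5: distance image data recording means; 6: distance image data interpolation means; 7: arithmetic means; 8: display means; 10: imaging target; 11: foreground region; 12: boundary region; 13: contour region; 15: light source; 16: total reflection mirror; 17: half mirror; 20A: auxiliary image data acquisition means; 20B: auxiliary image data input means; 21: distance image data interpolation and generation means.
[0001]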
[Technical field of the invention]
The present invention relates to a stereoscopic moving image data generation method, a stereoscopic moving image data generation device, a stereoscopic moving image data generation program, and a recording medium, and in particular to a technique that is effective when applied to generating stereoscopic moving image data for display on portable devices and the like.
[0002]
[Prior art]
Conventionally, in the field of image communication, video communication using stereoscopic images, which give a greater sense of presence, has attracted attention, and conference presentations and corporate demonstration prototypes are becoming increasingly common. Here, a stereoscopic image is defined as the combination of a two-dimensional image recording color and luminance and an image, called a distance image or depth image, recording the distance from the measurement point to the imaging target. A distance image is an image in which each pixel records the distance to the imaging target. A moving image is taken to consist of, for example, about 30 or more frames per second.
[0003]
When generating a moving image based on the stereoscopic image, for example, moving image data including a two-dimensional image in which the color and luminance are recorded is input by a TV camera or the like, and the distance image data is input by a distance measuring device or the like.
[0004]
In addition, as a method of generating a moving image using the stereoscopic image, for example, there is a method of generating distance image data using two two-dimensional images having parallax.
[0005]
Further, as a method of generating a stereoscopic image from parallax images of a two-dimensional image, a technique has been disclosed in which contours are obtained from a distance image to determine the parallax region, and the amount of parallax is generated using the preceding and following images, motion vectors, and the like (see, for example, Patent Document 1).
[0006]
[Patent Document 1]
JP 2000-253422 A
[0007]
[Problems to be solved by the invention]
However, the conventional technique has a problem that it is difficult to perform moving image communication using the stereoscopic image.
[0008]
In order to perform the moving image communication using the three-dimensional image, for example, distance image data of 30 frames or more per second must be input. However, conventionally, a distance measuring device capable of acquiring distance image data at such a high speed is very expensive. Therefore, it is difficult to use it by incorporating it into a portable device.
[0009]
Further, in the conventional techniques, the distance image data is acquired and input at every frame time, just like the two-dimensional image, so power consumption is high. This makes it difficult to incorporate them into portable devices and other equipment powered by rechargeable batteries.
[0010]
As a technique that can be incorporated into a portable device, there is, for example, a method of inputting distance image data whose resolution is only high enough to distinguish near from far and using it to display a stereoscopic image in a flat cutout (stage-backdrop) style. There is also a method of generating pseudo distance image data from such a low-resolution distance image for stereoscopic display. Here, pseudo distance image data is data generated artificially rather than by measuring distance. That is, it does not necessarily match the actual distance, and how closely it approximates it depends on the generation algorithm.
[0011]
However, in each of these methods as well, the distance image data is acquired at every frame time, so power consumption is high. This makes it difficult to capture stereoscopic images as video on a portable device.
[0012]
Further, even with the technique described in Patent Document 1, the distance image data is acquired at every frame time, so power consumption is high and the technique is difficult to incorporate into a portable device.
[0013]
An object of the present invention is to provide a technique capable of easily generating a three-dimensional image of a moving image.
[0014]
Another object of the present invention is to provide a technique capable of reducing the power consumption of a device that generates a stereoscopic image of a moving image.
[0015]
Another object of the present invention is to provide a program and a recording medium capable of facilitating input of a moving image using a three-dimensional image and reducing the amount of power consumption when generating a three-dimensional image.
[0016]
The above and other objects and novel features of the present invention will become apparent from the description of the present specification and the accompanying drawings.
[0017]
[Means for Solving the Problems]
The outline of the invention disclosed in the present application is as follows.
(1) A step of inputting moving image data for each frame time, and a step of inputting data for generating a stereoscopic image from the moving image data (hereinafter referred to as distance image data) at intermittent frame intervals. And a step of interpolating and generating distance image data at a frame time when the distance image data does not exist based on the moving image data and the input distance image data.
[0018]
(2) In the means of (1), the step of generating the distance image data by interpolation includes: a step of obtaining a distance boundary area from the input distance image data; a step of obtaining, from the moving image data at the same frame time as the input distance image data, a contour area located at the same position as the boundary area or near it; a step of determining, from the moving image data at a frame time at which no distance image data exists, the area corresponding to the contour area; and a step of generating the distance image data by moving the boundary area and its internal area based on the amount of movement between the contour area and the area corresponding to it.
[0019]
(3) In the means of (1), the step of generating the distance image data by interpolation includes: a step of selecting the moving image data at the same frame time as the input distance image data; a step of obtaining a motion vector of the image from the selected moving image data to the moving image data at a frame time at which no distance image data exists; and a step of dividing the input distance image data into a plurality of regions and moving each region based on the motion vector to generate distance image data.
[0020]
(4) In each of the means (1) to (3), the step of inputting the distance image data includes: a step of inputting, at intermittent frame intervals, auxiliary image data from which the distance image data can be generated; and a step of generating distance image data based on the auxiliary image data alone, or on the auxiliary image data and the moving image data at the same frame time as the auxiliary image data.
[0021]
According to the means (1), even when a means for acquiring the distance image data such as a distance measuring device cannot be operated at a high speed, a moving image of a three-dimensional image can be easily obtained.
[0022]
At this time, in the step of generating the distance image data by interpolation, for example, the processing in each step of the means of (2) or the means of (3) is performed.
[0023]
In the step of inputting the distance image data, the distance image data may be input directly using a distance measuring device or the like, or distance image data generated indirectly from the auxiliary image data, as in the means of (4), may be input.
[0024]
(5) A stereoscopic image generation device comprising: moving image data input means for inputting moving image data at every frame time; distance image data input means for inputting data for generating a three-dimensional image from the moving image data (hereinafter referred to as distance image data) at intermittent frame intervals; and distance image data interpolating means for generating by interpolation, based on the moving image data input from the moving image data input means and the distance image data input from the distance image data input means, the distance image data for each frame time at which no distance image data exists.
[0025]
(6) In the means of (5), the distance image data interpolating means includes: means for obtaining a distance boundary area from the distance image data input from the distance image data input means; means for obtaining, from the moving image data at the same frame time as the distance image data, a contour area located at the same position as the boundary area or near it; means for determining, from the moving image data at a frame time at which no distance image data exists, the area corresponding to the contour area; and means for generating the distance image data by moving the boundary area and its internal area based on the amount of movement between the contour area and the area corresponding to it.
[0026]
(7) In the means of (5), the distance image data interpolating means includes: means for selecting the moving image data at the same frame time as the input distance image data; means for obtaining a motion vector of the image from the selected moving image data to the moving image data at a frame time at which no distance image data exists; and means for dividing the input distance image data into a plurality of regions and moving each region based on the motion vector to generate distance image data.
[0027]
(8) In the means of (5) to (7), the distance image data input means includes: auxiliary image data input means for inputting, at intermittent frame intervals, auxiliary image data from which the distance image data can be generated; and means for generating distance image data based on the auxiliary image data alone, or on the auxiliary image data and the moving image data at the same frame time as the auxiliary image data.
[0028]
According to the means (5), even when a means for acquiring distance image data such as a distance measuring device cannot be operated at a high speed, a moving image of a three-dimensional image can be easily obtained.
[0029]
Further, even if the means for acquiring the distance image data is a means capable of acquiring the distance image data every frame time, the intermittent operation can reduce the power consumption.
[0030]
Further, the distance image data interpolation means includes, for example, each means of the means (6) or the means (7).
[0031]
The distance image data input means may be a means for directly inputting the distance image data from a distance measuring device or the like. Alternatively, as in the means of (8), it may comprise a means for inputting the auxiliary image data and a means for generating distance image data from the auxiliary image data, so that the distance image data is input indirectly.
[0032]
As described above, by using the means (5) to (8), it is possible to easily incorporate the stereoscopic image generation device into a portable device.
[0033]
(9) A three-dimensional image generation program for causing a computer to execute each step of the three-dimensional image generation method according to any one of (1) to (4).
[0034]
According to the means (9), a stereoscopic image can be easily generated using a general-purpose device (apparatus) such as a computer.
[0035]
(10) A recording medium on which the stereoscopic image generation program of the means (9) is recorded so as to be readable by a computer.
[0036]
According to the means (10), the recording medium can be provided by copying or through a network. Therefore, any device that can execute the stereoscopic image generation program can generate a stereoscopic image.
[0037]
Hereinafter, the present invention will be described in detail with embodiments (examples) with reference to the drawings.
In all the drawings for describing the embodiments, components having the same function are denoted by the same reference numerals, and the repeated description thereof will be omitted.
[0038]
BEST MODE FOR CARRYING OUT THE INVENTION
(Example 1)
FIG. 1 is a schematic diagram illustrating a schematic configuration of a stereoscopic image generation device according to a first embodiment of the present invention.
[0039]
As shown in FIG. 1, the three-dimensional image generation device according to the first embodiment includes moving image data input means 1B for inputting moving image data from a moving image data obtaining unit 1A, distance image data input means 2B for inputting distance image data from a distance image data obtaining unit 2A, control means 3 for controlling the input of each kind of data to the moving image data input means 1B and the distance image data input means 2B, moving image data recording means 4 for recording the moving image data input from the moving image data input means 1B, distance image data recording means 5 for recording the distance image data input from the distance image data input means 2B, and distance image data interpolation means 6 for generating distance image data.
[0040]
Further, the three-dimensional image generation device is, for example, a device that generates a three-dimensional image consisting of pairs of moving image data and distance image data at 30 frames or more per second. As illustrated in the figure, the generated three-dimensional image can be presented as a three-dimensional moving image through calculation by the calculation means 7 and display by the display means 8. A CPU or a memory of a computer can be used as the moving image data input means 1B, the distance image data input means 2B, the control means 3, the moving image data recording means 4, the distance image data recording means 5, the distance image data interpolating means 6, and the calculation means 7.
[0041]
FIGS. 2 and 3 are schematic diagrams illustrating a stereoscopic image generation method using the stereoscopic image generation device according to the first embodiment, and are diagrams illustrating an overall processing procedure of the stereoscopic image generation method.
[0042]
In order to generate the three-dimensional image using the three-dimensional image generation device of the first embodiment, first, as shown in FIG. 2, moving image data CM(T) and distance image data RM(T) are input and recorded in the moving image data recording means 4 and the distance image data recording means 5, respectively. The moving image data CM(T) is input at every frame time, whereas the distance image data RM(T) is input at intermittent intervals of N frames, for example every three frames as shown in the figure. The distance image data RM(T) is, for example, distance image data with 1-bit resolution, in which the black region 11 is a near-view region, for example the region corresponding to the shooting target 10, and the white region is a distant-view region.
[0043]
However, in this state there are moving image data CM(T+1) and CM(T+2) at frame times for which no distance image data RM exists, so a stereoscopic image cannot be generated as it is. Therefore, the distance image data for those frame times is generated by interpolation using the moving image data and the distance image data.
[0044]
In the three-dimensional image generation method according to the first embodiment, as shown in FIGS. 2 and 3, after initialization (step 1401), a boundary area 12 between a near view and a distant view is obtained from the distance image data RM (T). This image is defined as Q (T) (step 1402).
[0045]
Next, from the moving image data CM(T) at the same frame time T as the distance image data RM(T), the contour region 13 located at the same position as the boundary region 12 of the image Q(T), or in its vicinity, is obtained, and this image is defined as R(T) (step 1403).
[0046]
Next, the region matching the contour region 13 of the image R(T) is found in the moving image data CM(T+1) at the frame time T+1, for which no distance image data exists, and this image is defined as R(T+1) (step 1404). To find this region, an existing pattern matching method (for example, Hideyuki Tamura, "Introduction to Computer Image Processing", Soken Shuppan, pp. 148-153) is used. There are many variations of pattern matching, and one can be selected as appropriate.
[0047]
Next, the distance image data RM(T+1) at the frame time T+1 is generated by moving the near-view region 11 in the distance image data RM(T) at the frame time T in accordance with the amount by which the contour region 13 in the image R(T+1) obtained in step 1404 has moved relative to the contour region 13 of the image R(T) (step 1405).
[0048]
Thereafter, it is checked whether the distance image data RM(T+2) at the next frame time T+2 exists (steps 1406, 1407, 1408, 1409). In the first embodiment, the distance image data RM(T+2) at the frame time T+2 does not exist, so it is generated by the same procedure as the distance image data RM(T+1) at the frame time T+1. At this time, the position of the corresponding region in the moving image data at frame time T+2 may be obtained by pattern matching directly against the contour region of the moving image data at frame time T, as shown in FIG. 2, or by pattern matching against the contour region of the moving image data at frame time T+1.
[0049]
By repeating the above procedure, the distance image data for the frame times at which no distance image data exists can be generated sequentially by interpolation. The three-dimensional image generation device according to the first embodiment therefore does not need to acquire distance image data at every frame time, and a distance image data acquisition unit 2A that can operate only at intermittent frame intervals can be used. Further, even when the distance image data acquisition unit 2A can acquire distance image data at every frame time, power consumption can be reduced by operating it intermittently. In the first embodiment, the distance image data is input every three frame times, but the present invention is not limited to this. For example, the distance image data may be input every five or ten frame times, and lengthening the interval reduces power consumption further.
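The overall repetition described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `fill_missing_range_images` and `interpolate`, the list-of-rows image representation, and the use of `None` to mark frames without input distance image data are all introduced here for illustration. Each missing frame is interpolated from the most recent frame whose distance image was actually input, which is one of the two options the text allows.

```python
from typing import Callable, List, Optional, Sequence

Image = List[List[int]]  # hypothetical representation: rows of pixel values


def fill_missing_range_images(
    cm: Sequence[Image],
    rm: Sequence[Optional[Image]],
    interpolate: Callable[[Image, Image, Image], Image],
) -> List[Image]:
    """Return one distance image per frame.

    cm[t] is the moving image at frame t; rm[t] is the input distance image,
    or None when none was acquired at that frame.
    interpolate(cm_key, rm_key, cm_now) is a placeholder for steps 1402-1405.
    """
    if len(cm) != len(rm):
        raise ValueError("cm and rm must cover the same frames")
    out: List[Image] = []
    key = None  # index of the latest frame that has an input distance image
    for t, measured in enumerate(rm):
        if measured is not None:
            key = t
            out.append(measured)
        elif key is None:
            raise ValueError("the first frame needs an input distance image")
        else:
            # Interpolate from the key frame (assumption: always the key frame,
            # never the previously interpolated frame).
            out.append(interpolate(cm[key], rm[key], cm[t]))
    return out
```

With N = 3, only every third call site receives a measured image; the other two frames in each group go through `interpolate`.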
[0050]
FIGS. 4 to 18 are diagrams illustrating a specific example of the stereoscopic image generation method according to the first embodiment.
[0051]
In the stereoscopic image generation method of the first embodiment, first, as shown in FIG. 4, moving image data CM for every frame and distance image data RM for every N frames are input and recorded in the moving image data recording means 4 and the distance image data recording means 5. Let T be a frame time at which both the moving image and the low-resolution distance image exist; then RM(T), CM(T), CM(T+1), ..., CM(T+N−1) exist. From these images, the distance image data RM(T+1), ..., RM(T+N−1) are generated, which yields the stereoscopic image data of the moving image. Here, both CM and RM are assumed to have an image size of X pixels horizontally and Y pixels vertically. The data value of CM(T) at coordinates (x, y) is written CM(T)_{x,y}, and the data value of RM(T) at coordinates (x, y) is written RM(T)_{x,y}. CM(T)_{x,y} is RGB data, and RM(T)_{x,y} is binary data of 0 or 1, where 1 represents a near view and 0 represents a distant view.
[0052]
When the moving image data CM and the distance image data RM are input, as shown in FIG. 5, the control unit 3 controls a TV camera (moving image data acquisition unit) 1A for moving image shooting, a TV camera (distance image data acquisition unit) 2A, and the light source 15 that emits pulsed light such as a flash.
[0053]
At this time, the photographing optical axis of the distance image data acquiring means 2A is aligned with that of the moving image data acquiring means 1A by the total reflection mirror 16 and the half mirror 17, so that both can photograph with the same angle of view and from the same viewpoint.
[0054]
When the moving image data CM is input using the moving image data obtaining unit 1A, the control unit 3 outputs a control signal to the moving image data obtaining unit 1A at every preset frame time, as shown in FIG. 6A, and the moving image data is photographed and input to the moving image data input means 1B. The frame time interval ΔT is set to, for example, 1/30 seconds.
[0055]
On the other hand, when the distance image data RM is input, as shown in FIG. 6B, the control unit 3 outputs a control signal to the distance image data acquisition unit 2A at a time interval N times the preset frame interval, for example three times the frame interval of the moving image data, and the intermittent distance image data is photographed and input to the distance image data input means 2B. The control unit 3 outputs a control signal to the light source 15 at the same timing to illuminate the imaging target 10.
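The timing of the two control signals can be illustrated with a short sketch. It assumes, purely for illustration, that frame 0 is a frame at which distance image data is captured and that the light source fires on exactly those frames; the function name `capture_schedule` is hypothetical.

```python
from fractions import Fraction


def capture_schedule(num_frames, n, frame_interval=Fraction(1, 30)):
    """List (frame_index, time_in_seconds, range_capture) for each frame.

    The moving image is captured on every frame. range_capture is True on
    the frames where the distance image camera and the light source are
    triggered, i.e. every n-th frame (assumption: starting at frame 0).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [(f, f * frame_interval, f % n == 0) for f in range(num_frames)]


if __name__ == "__main__":
    for frame, t, range_capture in capture_schedule(7, 3):
        print(frame, t, "CM+RM" if range_capture else "CM only")
```

Under these assumptions the distance image camera and the light source are active on one frame in every N, which is the source of the power saving discussed in the text.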
[0056]
Note that the illumination light is prevented from affecting the image capturing of the moving image data acquisition unit 1A either by shifting the photographing start times of the moving image data acquiring unit 1A and the distance image data acquiring unit 2A by at least the exposure time of the moving image data acquiring unit 1A, or by using infrared light as the illumination light emitted from the light source 15 and an infrared camera as the distance image data acquisition unit 2A.
[0057]
Further, by using a high-speed camera, it is also possible to use the moving image data acquiring means 1A and the distance image acquiring means 2A as a single camera. In this case, the total reflection mirror 16 and the half mirror 17 are unnecessary.
[0058]
Further, the distance image data RMP(T) photographed by the distance image data acquisition means 2A is converted into, for example, low-resolution distance image data RM(T) in which only the distinction between near and far is recorded. To do so, as shown in FIG. 7, the distance image data RMP(T) whose shooting time is closest to that of the moving image data CM(T) shot by the moving image data obtaining means 1A at time T is first selected (step 1801). These two pieces of image data can be regarded as images of the same target taken at the same time, differing only in the presence or absence of illumination.
[0059]
Next, for the moving image data CM(T) and the distance image data RMP(T), the ratio CM(T)_{x,y} / RMP(T)_{x,y} at the same coordinates (x, y) is obtained (step 1802). If the ratio is equal to or greater than a set value k, RM(T)_{x,y} = 1 is set (step 1803); otherwise RM(T)_{x,y} = 0 is set (step 1804). The coordinates (x, y) are then updated (step 1805), and once RM(T)_{x,y} has been generated for all coordinates (step 1806), the result is taken as the distance image data RM(T).
[0060]
When an illuminated image is compared with a non-illuminated image, the brightness of the distant view hardly changes, whereas a target in the near view appears brighter. Using this property, the low-resolution distance image data RM(T), in which the near view is represented by 1 and the distant view by 0, is generated. Note that the low-resolution distance image data RM(T) may be generated using a difference instead of a ratio.
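Steps 1802 to 1806 amount to a per-pixel threshold on a brightness ratio, which can be sketched as follows. Several points are assumptions made only for this illustration: the inputs are scalar brightness values (the text describes CM as RGB data, so a prior conversion is assumed), a small `eps` guards against division by zero, and the function name is hypothetical. The direction of the comparison follows the text as written; depending on which of the two images is the illuminated one, the numerator and denominator may need to be swapped.

```python
def low_res_range_image(cm_lum, rmp_lum, k, eps=1e-6):
    """Binarise the per-pixel ratio cm / rmp against the set value k.

    cm_lum, rmp_lum: equally sized lists of rows of brightness values
    (assumed scalar; not the RGB data itself).
    Returns rows of 0/1 values: 1 where ratio >= k, otherwise 0.
    """
    if len(cm_lum) != len(rmp_lum):
        raise ValueError("images must have the same height")
    rm = []
    for cm_row, rmp_row in zip(cm_lum, rmp_lum):
        if len(cm_row) != len(rmp_row):
            raise ValueError("images must have the same width")
        # eps avoids dividing by zero for completely dark pixels.
        rm.append([1 if c / max(r, eps) >= k else 0
                   for c, r in zip(cm_row, rmp_row)])
    return rm
```

The difference-based variant mentioned in the text would replace the ratio `c / max(r, eps)` with a difference of the two brightness values and compare it against a suitably chosen set value.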
[0061]
In the first embodiment, an imaging method using the light source 15 such as a flash has been described as the means of inputting the low-resolution distance image data RM(T). However, any method that can acquire or input the low-resolution distance image data RM(T) may be used. For example, the distance may be determined from the magnitude of the spatial frequency in an image focused on the near view and an image focused on the distant view (or a pan-focus image), the distance may be calculated by stereo imaging, or a stereo pair may simply be photographed and differenced, with regions whose difference value is lower than a set value treated as the distant-view region to generate the low-resolution distance image data RM(T).
[0062]
Next, stereoscopic image data of a moving image is generated according to the procedure shown in FIGS.
[0063]
In step 1402, which obtains the image Q(T) of the boundary region 12 from the distance image data RM(T), first, as shown in FIG. 8, the value of the pixel RM(T)_{x,y} at coordinates (x, y) of the distance image data RM(T) is compared with the values of its adjacent pixels (steps 1402a and 1402b). As shown in FIG. 9(a), the pixels adjacent to RM(T)_{x,y} are RM(T)_{x-1,y-1}, RM(T)_{x-1,y}, RM(T)_{x-1,y+1}, RM(T)_{x,y-1}, RM(T)_{x,y+1}, RM(T)_{x+1,y-1}, RM(T)_{x+1,y}, and RM(T)_{x+1,y+1}. If all of them have the same data value as RM(T)_{x,y}, Q(T)_{x,y} = 0 is set (step 1402c); otherwise Q(T)_{x,y} = 1 is set (step 1402d). The coordinates (x, y) are then updated (step 1402e), Q(T)_{x,y} is obtained for every pixel RM(T)_{x,y}, and the image Q(T) of the boundary area 12 as shown in FIG. 9B is generated.
[0064]
In step 1402a, Q(T) may instead be generated by comparing the pixel RM(T)_{x,y} only with its four-connected neighbors RM(T)_{x-1,y}, RM(T)_{x,y-1}, RM(T)_{x,y+1}, and RM(T)_{x+1,y}.
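The boundary extraction of step 1402, in both its eight-neighbor and four-neighbor forms, can be sketched as below. This is an illustrative sketch rather than the embodiment's implementation: the function name is hypothetical, images are lists of rows indexed as `rm[y][x]`, and neighbors that fall outside the image are simply ignored, which the text does not specify.

```python
def boundary_image(rm, eight_connected=True):
    """Return Q with Q[y][x] = 1 where any neighbour differs from rm[y][x]."""
    height = len(rm)
    width = len(rm[0]) if rm else 0
    if eight_connected:
        offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                   if (dx, dy) != (0, 0)]
    else:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    q = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for dx, dy in offsets:
                nx, ny = x + dx, y + dy
                # Neighbours outside the image are skipped (an assumption).
                if (0 <= nx < width and 0 <= ny < height
                        and rm[ny][nx] != rm[y][x]):
                    q[y][x] = 1
                    break
    return q
```

The four-connected variant produces a thinner boundary because diagonal contacts between near and distant pixels are not counted.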
[0065]
Next, in step 1403, which obtains the contour region 13 located at the same position as or near the boundary region 12 from the moving image data CM(T), first, as shown in FIG. 10, the part Q(T)_{i,j} of Q(T) corresponding to block R(T)_{i,j} of the image R(T) is extracted (step 1403a). Here, R(T) is a block image having X/W pixels in the horizontal direction and Y/W pixels in the vertical direction, as shown in FIG. 11A, and i and j are block coordinates. X, Y, and W are assumed to be set in advance so that X/W and Y/W are integers. Each pixel of R is assumed to consist of a binary data storage area Rk_{i,j} and a movement vector storage area RV_{i,j}.
[0066]
Q(T)_{i,j} is obtained by dividing Q(T) into pieces of W pixels vertically by W pixels horizontally, as shown in the figure. The pixel values of Q(T)_{i,j} are checked (step 1403b); if all are 0, Rk(T)_{i,j} = 0 is set (step 1403c), and if any pixel with value 1 is included, Rk(T)_{i,j} = 1 is set (step 1403d). Then i or j is updated (step 1403e), and Rk(T)_{i,j} is obtained for all R(T)_{i,j} (step 1403f).
[0067]
Rk(T)_{i,j} is data indicating only the contour area of CM(T) located at the same position as or near the boundary between near and far. That is, if Rk_{i,j} = 1, the block area whose diagonal vertices are the coordinates (i × W, j × W) and ((i+1) × W − 1, (j+1) × W − 1) is a contour area located at or near the boundary area. If Rk_{i,j} = 0, that block area is not a contour area at or near the boundary area. Therefore, once Rk(T)_{i,j} has been obtained for all blocks, the image R(T) of the contour area 13 as shown in FIG. 12 is obtained.
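The reduction from the pixel-level boundary image Q(T) to the block-level flags Rk(T)_{i,j} in steps 1403a to 1403f can be sketched as follows. The function name and the `rk[j][i]` row-major layout are choices made for this illustration only.

```python
def block_contour_flags(q, w):
    """Return Rk as rows: rk[j][i] = 1 if block (i, j) of q contains a 1.

    q: boundary image as a list of rows of 0/1 values.
    w: block side length in pixels; width and height must be multiples of w.
    """
    height = len(q)
    width = len(q[0])
    if height % w or width % w:
        raise ValueError("image size must be a multiple of the block size")
    rk = []
    for j in range(height // w):
        row = []
        for i in range(width // w):
            # Block (i, j) covers x in [i*w, (i+1)*w - 1], y in [j*w, (j+1)*w - 1].
            has_boundary = any(
                q[y][x]
                for y in range(j * w, (j + 1) * w)
                for x in range(i * w, (i + 1) * w)
            )
            row.append(1 if has_boundary else 0)
        rk.append(row)
    return rk
```

Only blocks flagged with 1 need to be searched for in later frames, which keeps the amount of pattern matching proportional to the length of the boundary rather than to the image area.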
[0068]
In the first embodiment, the blocks R(T)_{i,j} of the image R(T) of the contour area 13 are rectangular, but the present invention is not limited to this, and the image may be divided into blocks of various shapes.
[0069]
Next, in step 1404, which obtains from the moving image data CM(T+n) at frame time T+n the contour area corresponding to the contour area 13 obtained in step 1403, first, as shown in the figure, a block with Rk(T)_{i,j} = 1 is extracted from the image R(T) (step 1404a).
[0070]
Next, for the block with Rk(T)_{i,j} = 1, the block area of CM(T) whose diagonal vertices are the coordinates (i × W, j × W) and ((i+1) × W − 1, (j+1) × W − 1) is extracted as the reference image B(T)_{i,j}, as shown in the figure (steps 1404a and 1404b).
[0071]
Next, as shown in FIG. 14B, the block in CM(T+n) whose image content is most similar to B(T)_{i,j} is found, and the upper-left coordinates (i × W + dx, j × W + dy) of that block are determined (step 1404c). Here, dx and dy represent the distances in the x direction and the y direction that the contour area contained in B(T)_{i,j} has moved in the image n frames later. dx and dy are stored in RV(T+n)_{i,j} (step 1404d).
[0072]
To find the block with similar image content, a correlation amount is computed for each pair of blocks, and the block with the highest correlation amount in the image is judged to be the one with similar content. Many existing expressions for the correlation amount are available, from exact expressions used in statistical theory to simple ones such as the sum of squared per-pixel differences, and many block search methods also exist (for example, K. R. Rao et al., "Information compression technology for digital broadcasting and the Internet", Kyoritsu Shuppan, pp. 69-71); the present invention does not limit these to specific formulas or methods. Although the present embodiment describes an example using the block correlation method, the invention is not limited to it as long as the distances the contour area has moved in the x and y directions can be obtained.
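One simple instance of the block search in steps 1404b to 1404d is an exhaustive search that minimizes the sum of squared differences, one of the simple similarity measures the text mentions. The sketch below is illustrative only: it assumes single-channel brightness images rather than RGB, a limited search window of `search` pixels, and a tie-break in favor of the smaller displacement, none of which the text prescribes. The function name is hypothetical.

```python
def find_block_motion(cm_t, cm_tn, i, j, w, search=4):
    """Return (dx, dy) of the block in cm_tn most similar to block (i, j) of cm_t.

    Similarity is the sum of squared differences (lower is more similar).
    Images are lists of rows of scalar brightness values, indexed [y][x].
    """
    height, width = len(cm_tn), len(cm_tn[0])
    x0, y0 = i * w, j * w
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            xs, ys = x0 + dx, y0 + dy
            # Skip candidates that do not lie fully inside the image.
            if xs < 0 or ys < 0 or xs + w > width or ys + w > height:
                continue
            ssd = sum(
                (cm_t[y0 + v][x0 + u] - cm_tn[ys + v][xs + u]) ** 2
                for v in range(w)
                for u in range(w)
            )
            # Ties are broken in favour of the smaller displacement.
            cost = (ssd, abs(dx) + abs(dy))
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1]
```

The returned pair corresponds to the values stored in RV(T+n)_{i,j}. A larger `search` value tolerates faster motion at the cost of more comparisons.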
[0073]
Next, i or j is updated (step 1404e), and the above processing is performed for all blocks with Rk(T)_{i,j} = 1 (step 1404f), so that the amounts of movement dx and dy are calculated for each block and stored in RV(T+n)_{i,j}. Then, by moving each block B(T)_{i,j} in the image R(T) according to RV(T+n)_{i,j}, the image R(T+n) of the contour area 13 corresponding to the moving image data CM(T+n) at the frame time T+n is obtained, as shown in FIG. 15.
[0074]
Next, in step 1405, which generates the distance image data RM(T+n) at frame time T+n, first, as shown in FIG. 16, the block of the distance image data RM(T) at frame time T that corresponds to the block B(T)_{i,j} is moved by dx and dy based on RV(T+n)_{i,j} (step 1405a). At this stage, as shown in FIGS. 17A and 17B, only the blocks with Rk(T)_{i,j} = 1, that is, only the boundary area 12, are moved.
[0075]
The boundary area contained in the blocks is continuous from block to block before the movement, but depending on the movement direction and amount it may be interrupted between blocks after the movement. Where the boundary area has become discontinuous, continuity is restored by connecting the broken points at each block boundary with a straight line (step 1405b). The present embodiment shows an example of connection by linear interpolation, but the present invention is not limited to connection by linear interpolation.
[0076]
Next, the entire inner area surrounded by the boundary area 12 is replaced with the data of the neighboring area just inside the boundary area, whereby the distance image data RM(T+n) at frame time T+n is obtained, as shown in the figure.
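Filling the interior once the moved boundary is closed again can be sketched with a flood fill. This is a substitute technique chosen for illustration, not the procedure of the embodiment, and it rests on several assumptions: the boundary is given as a closed binary mask, the distant view touches the image border, boundary pixels themselves are counted as near view, and the near view is written as 1. The function name is hypothetical.

```python
from collections import deque


def fill_inside_boundary(boundary):
    """Return a 0/1 distance image: 1 on and inside the closed boundary.

    boundary: list of rows of 0/1 values, 1 marking boundary pixels.
    Pixels reachable from the image border without crossing the boundary
    are treated as distant view (0); everything else is near view (1).
    """
    height, width = len(boundary), len(boundary[0])
    outside = [[False] * width for _ in range(height)]
    queue = deque()
    # Seed the fill with every non-boundary pixel on the image border.
    for y in range(height):
        for x in range(width):
            on_edge = x in (0, width - 1) or y in (0, height - 1)
            if on_edge and not boundary[y][x]:
                outside[y][x] = True
                queue.append((x, y))
    # Four-connected breadth-first fill of the outside region.
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if (0 <= nx < width and 0 <= ny < height
                    and not outside[ny][nx] and not boundary[ny][nx]):
                outside[ny][nx] = True
                queue.append((nx, ny))
    return [[0 if outside[y][x] else 1 for x in range(width)]
            for y in range(height)]
```

If the boundary has a gap, the fill leaks into the interior, which is why the reconnection of step 1405b has to precede this step.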
[0077]
Thereafter, as shown in FIG. 3, the frame time is updated and the above steps are repeated, so that low-resolution distance image data RM(T+n) is generated for every frame time at which no distance image data was acquired, and the stereoscopic image data of the moving image is generated.
[0078]
FIG. 19 is a schematic diagram for explaining features of a stereoscopic image generated by the stereoscopic image generation method according to the first embodiment.
[0079]
The distance image data generated by the method of the first embodiment is produced by moving the distance image according to the movement in the x and y directions extracted from the moving image data. Therefore, movement of the imaging target in the z direction cannot be represented accurately. For example, when the photographing target approaches the photographing point as shown in FIG. 19A, the depth-direction (z-direction) value z actually becomes larger as the target approaches, as shown in FIG. 19B. In the distance image generated according to the first embodiment, however, the depth value z hardly changes, as shown in the figure. The method is therefore not suitable for generating stereoscopic moving image data for industrial measurement. However, when the purpose is human viewing, such as stereoscopic image communication in the video and telecommunications industries, the distance image is presented together with the images, and the human visual recognition mechanism interprets the scene reasonably, so the unnaturalness is not noticeable. In the case of a low-resolution distance image or a pseudo distance image, this point is not a problem in the first place.
[0080]
As described above, according to the three-dimensional image generation method using the three-dimensional image generation device of the first embodiment, even if the distance image data acquisition unit 2A that cannot acquire the distance image data for each frame is used, A three-dimensional image of a moving image can be easily generated.
[0081]
In addition, even if the distance image data acquisition unit 2A can acquire the distance image data for every frame, power consumption can be reduced by acquiring the distance image data intermittently. Therefore, it is easy to incorporate the stereoscopic image generation device into a portable device.
[0082]
Further, if the stereoscopic image generation method is programmed, it is possible to cause a computer to execute each of the steps. Therefore, a stereoscopic image of a moving image can be easily generated without using a dedicated device. At this time, the program for generating the stereoscopic image can be provided by a recording medium such as a semiconductor memory, a hard disk, or a CD-ROM, or can be provided through a network.
[0083]
In the first embodiment, an imaging method using a flash has been described as the means for acquiring a low-resolution distance image. However, any method that can acquire a low-resolution distance image may be used. For example, the distance may be determined from the magnitude of the spatial frequency in an image focused on the near view and an image focused on the distant view (or a pan-focus image), or a stereo pair may simply be photographed and differenced, with regions whose difference value is smaller than a set value treated as the distant-view region to generate a low-resolution distance image.
[0084]
(Example 2)
FIGS. 20 and 21 are schematic diagrams for explaining the overall processing procedure of the stereoscopic image generation method according to the second embodiment of the present invention.
[0085]
Since the stereoscopic image generation method according to the second embodiment is performed using, for example, the stereoscopic image generation apparatus described in the first embodiment, detailed description of the apparatus will be omitted.
[0086]
In the stereoscopic image generating method according to the second embodiment, first, as shown in FIG. 20, moving image data CM(T) and distance image data RM(T) are input and recorded in the moving image data recording unit 4 and the distance image data recording means 5, respectively. The moving image data CM(T) is input at every frame time, whereas the distance image data RM(T) is input at intermittent intervals of N frames, for example as shown in the figure.
[0087]
However, in this state there are moving image data CM(T+1) and CM(T+2) at frame times for which no distance image data RM exists, so a stereoscopic image cannot be generated as it is. Therefore, the distance image data for those frame times is generated by interpolation using the moving image data and the distance image data.
[0088]
In the stereoscopic image generation method according to the second embodiment, first, as shown in FIGS. 20 and 21, the distance image data RM at a certain frame time and the moving image data at the same frame time are selected (step 1902). Here, the case where the moving image data CM(T) and the distance image data RM(T) at the frame time T are selected, as shown in FIG. 20, will be described. In the distance image data RM(T) and RM(T+3) shown in FIG. 20, a black area represents an area 11 close to the shooting point (near-view area), and the image becomes whiter as the distance increases.
[0089]
Next, a motion vector of the image between the moving image data CM(T) selected in step 1902 and the moving image data CM(T+1) at the frame time T+1, for which no distance image data exists, is obtained (step 1903). The result is shown as the motion vector image M(T+1), in which the arrow symbols MV are the detected motion vectors. The motion vectors are drawn as arrows for ease of understanding; the actual image data M(T+1) is an image in which the motion vector information MV is recorded in each pixel, not an image drawn with such arrows.
[0090]
Next, the distance image data RM (T + 1) at the frame time T + 1 is generated by dividing the distance image data RM (T) at the frame time T into a plurality of regions and moving each region separately according to the motion vector MV (step 1904). Here, the regions of the distance image data RM (T) correspond to the regions for which the motion vectors MV are defined. For example, if a motion vector MV is given for each pixel, the distance image data RM (T) is moved in pixel units; if the motion vector MV is that of a block region, each region is a block. Although the regions depend on the method used to obtain the motion vectors, the present invention is not limited to a specific method of generating motion vectors.
[0091]
Next, the frame time is updated (step 1905), and it is determined whether the moving image data CM (T + 2) and the distance image data RM (T + 2) exist (steps 1906 and 1907). If the moving image data CM (T + 2) exists but the distance image data RM (T + 2) does not, the process returns to step 1903 to generate the distance image data RM (T + 2) by interpolation. At this time, for example, as shown in FIG. 20, a motion vector is obtained from the moving image data CM (T) at the frame time T and the moving image data CM (T + 2) at the frame time T + 2, and each region of the distance image data RM (T) at the frame time T is moved according to the motion vector to generate the distance image data RM (T + 2) at the frame time T + 2.
[0092]
By repeating the above procedure, the distance image data for the frame times at which no distance image data exists can be generated sequentially by interpolation. Therefore, in the stereoscopic image generation method according to the second embodiment as well, there is no need to input distance image data every frame time, and distance image data acquisition means 2A that can operate only at intermittent frame intervals can be used. In addition, even when the distance image data acquisition means 2A can acquire distance image data every frame time, power consumption can be reduced by operating it intermittently. In the second embodiment, the distance image data is input every three frame times. However, the present invention is not limited to this. For example, the distance image data can be input every five frame times or every ten frame times. By increasing the time interval, power consumption can be significantly reduced.
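The overall loop of steps 1902 to 1907 can be sketched as a schedule that records, for each frame, whether its distance image is measured or must be interpolated. This is a minimal sketch, assuming frame indices start at 0 and a distance image arrives every `interval` frames; the function name and the tuple layout are illustrative and do not come from the specification.

```python
def interpolation_schedule(num_frames, interval):
    """For each frame, report where its distance image comes from.

    Returns a list of (frame, anchor, offset) tuples. `anchor` is the most
    recent frame with a measured distance image, and `offset` is the n in
    RM(T + n). An offset of 0 means the distance image was measured.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    schedule = []
    for frame in range(num_frames):
        anchor = frame - frame % interval
        schedule.append((frame, anchor, frame - anchor))
    return schedule


# Distance image every three frames, as in the second embodiment.
schedule = interpolation_schedule(num_frames=7, interval=3)
measured = [frame for frame, _, offset in schedule if offset == 0]
interpolated = [frame for frame, _, offset in schedule if offset != 0]
```

With an interval of three, the distance sensor is active for only one frame in three, and every other frame is filled in from the moving image data.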
[0093]
FIG. 22 to FIG. 26 are schematic diagrams for explaining a specific example of the stereoscopic image generation method according to the second embodiment.
[0094]
In the stereoscopic image generation method according to the second embodiment, the moving image data CM for every frame time and the distance image data RM for every N frame times are recorded in advance in the moving image data recording means 4 and the distance image data recording means 5, such as semiconductor memories. At this time, assuming that a frame time at which both the moving image data and the distance image data exist is T, there is one distance image data RM (T) for the moving image data CM (T), CM (T + 1), ..., CM (T + N−1). From these images, stereoscopic image data of the moving image is generated by interpolating the distance image data RM (T + 1), ..., RM (T + N−1). Here, it is assumed that both CM and RM have an image size of X pixels in the horizontal direction and Y pixels in the vertical direction. Further, the data value at the coordinates (x, y) of CM (T) is written as CM (T)_{x, y}, and the data value at the coordinates (x, y) of RM (T) is written as RM (T)_{x, y}. CM (T)_{x, y} is RGB data, and RM (T)_{x, y} is a scalar value that records the distance.
[0095]
As the moving image data acquiring means 1A for acquiring the moving image data CM (T), the TV camera for photographing moving images described in the first embodiment is used. As the distance image data acquiring means 2A for acquiring the distance image data RM (T), a distance measuring device based on, for example, one of the following methods is used: a light section method (for example, Toru Yoshizawa, "Three-dimensional optical measurement", New Technology Communications, pp. 28-37); TOF (optical time-of-flight measurement method) (for example, Seiji Iguchi, "Latest Trends in 3D Measurement", Measurement and Control, Vol. 34, No. 6, p. 430; Masahiro Kawakita, "Hi-Vision 3D Camera: Axi-vision Camera", published by Giken in 2002, Proceedings of the Research and Presentation Conference, pp. 58-63); various pattern projection methods (for example, Toru Yoshizawa, "Three-dimensional optical measurement (second edition)", New Technology Communications, pp. 77-99); or various stereo methods (for example, the Institute of Image Electronics Engineers of Japan, "3D Image Glossary", New Technology Communications, p. 51).
[0096]
When the distance image data is generated by interpolation using the moving image data CM (T) and the intermittent distance image data RM (T), first, the distance image data RM (T) at a certain frame time T and the moving image data CM (T) at the same frame time are selected (step 1902).
[0097]
Next, step 1903 for obtaining a motion vector of an image between the moving image data CM (T) at the frame time T and the moving image data CM (T + n) at the frame time T + n having no distance image data is performed.
[0098]
In step 1903, as shown in FIG. 22, the moving image data CM (T) is first divided in a grid pattern into blocks of W pixels vertically by W pixels horizontally (step 1903a). At this time, as shown in FIGS. 23A and 23B, each block size is W × W, the number of blocks in the horizontal direction is X / W, and the number of blocks in the vertical direction is Y / W. It is assumed that X and Y are set in advance so that they are divisible by W. The block at the block coordinates (i, j) is written as B (T)_{i, j}.
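The grid division of step 1903a can be sketched as follows. This is a minimal sketch, assuming the image is a list of rows of scalar pixel values (the specification stores RGB data in CM) and that block coordinate i runs horizontally and j vertically; the function name is hypothetical.

```python
def split_into_blocks(image, w):
    """Split a 2D image (list of rows) into W x W blocks.

    Returns a dict mapping block coordinates (i, j) to the block's pixels,
    where i indexes blocks horizontally and j indexes them vertically.
    """
    height = len(image)
    width = len(image[0])
    if width % w or height % w:
        raise ValueError("X and Y must be divisible by W")
    blocks = {}
    for j in range(height // w):
        for i in range(width // w):
            blocks[(i, j)] = [row[i * w:(i + 1) * w]
                              for row in image[j * w:(j + 1) * w]]
    return blocks


# 4 x 4 image whose pixel value is 10 * y + x, split into 2 x 2 blocks.
image = [[10 * y + x for x in range(4)] for y in range(4)]
blocks = split_into_blocks(image, 2)
```

Here X / W = Y / W = 2, so four blocks are produced, and the upper left pixel of block (i, j) sits at image coordinates (i × W, j × W).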
[0099]
Next, a motion vector image M (T + n) is generated. The motion vector image M (T + n) is two-dimensional data MV (T + n)_{i, j} arranged in blocks corresponding to the blocks B (T)_{i, j}, that is, X / W elements horizontally and Y / W elements vertically, and each element stores two-dimensional vector data.
[0100]
At this time, first, the block B (T)_{i, j} corresponding to each block coordinate (i, j) is selected as shown in the figure, and the block whose image content is similar to that of B (T)_{i, j} is determined from CM (T + n) (step 1903b).
[0101]
Next, as shown in FIG. 24B, the upper left coordinate value (i × W + dx, j × W + dy) of the block determined from CM (T + n) is obtained. Here, (dx, dy) is the motion vector MV (T + n)_{i, j} of B (T)_{i, j} in the image advanced by n frames. dx and dy are stored in MV (T + n)_{i, j} (step 1903c).
[0102]
This method of obtaining the motion vector MV (T + n)_{i, j} is called a block matching method, and is a known method used, for example, for motion prediction when generating MPEG2 images. Generally, a correlation amount is obtained for each pair of blocks, and the block showing the highest correlation amount in the image is determined to be the block having similar image content. Various existing formulas can be used for the correlation amount, down to simple ones such as a sum of squared differences, and various existing techniques also exist, such as block search methods and methods for removing inter-block distortion after movement. The present invention does not limit these to any particular formula or technique. In the second embodiment, an example using the block matching method is described. However, the present invention is not limited to the block matching method as long as a motion vector in an image can be obtained. The resulting MV (T + n)_{i, j} represents the motion in the image when the frame time changes from T to T + n.
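Block matching with a sum of squared differences can be sketched as an exhaustive search. This is only one possible instance, since the specification leaves the correlation formula and the search method open. The sketch assumes scalar (grayscale) pixel values and a search window of ±`search` pixels; the function names and the `search` parameter are illustrative.

```python
def ssd(image_a, ax, ay, image_b, bx, by, w):
    """Sum of squared differences between two W x W blocks."""
    total = 0
    for dy in range(w):
        for dx in range(w):
            diff = image_a[ay + dy][ax + dx] - image_b[by + dy][bx + dx]
            total += diff * diff
    return total


def match_block(cm_t, cm_tn, i, j, w, search):
    """Find the motion vector (dx, dy) of block (i, j) from cm_t to cm_tn.

    Tests every displacement within +/- `search` pixels and keeps the one
    with the smallest SSD, i.e. the most similar block.
    """
    height, width = len(cm_tn), len(cm_tn[0])
    x0, y0 = i * w, j * w
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = x0 + dx, y0 + dy
            if x < 0 or y < 0 or x + w > width or y + w > height:
                continue  # candidate block would leave the image
            cost = ssd(cm_t, x0, y0, cm_tn, x, y, w)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def make_frame(px, py, size=8):
    """Test frame: a bright 2 x 2 patch at (px, py) on a dark background."""
    frame = [[0] * size for _ in range(size)]
    for y in range(py, py + 2):
        for x in range(px, px + 2):
            frame[y][x] = 9
    return frame


cm_t = make_frame(2, 2)    # patch fills block (1, 1) when W = 2
cm_tn = make_frame(3, 4)   # same patch moved by (1, 2)
mv = match_block(cm_t, cm_tn, i=1, j=1, w=2, search=2)
```

The search cost grows with the square of `search`, which is one reason faster search strategies exist.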
[0103]
Thereafter, i or j is updated (step 1903d), and the same process is repeated for all blocks B (T)_{i, j} of CM (T) (step 1903e).
[0104]
Next, in step 1904 for moving each region of the distance image data RM (T) based on the motion vector image M (T + n), first, as shown in the figure, the distance image data RM (T) is divided into blocks RM (T)_{i, j} of the same size as the blocks B (T)_{i, j} (step 1904a). Thereafter, as shown in FIG. 26A, each block RM (T)_{i, j} is moved by the motion vector MV (T + n)_{i, j} (step 1904b). Thereafter, i or j is updated (step 1904c), and when all blocks RM (T)_{i, j} have been moved, the distance image data RM (T + n) at the frame time T + n is obtained as shown in the figure.
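Steps 1904a to 1904c can be sketched as follows. The specification does not say what happens when two moved blocks overlap or when a pixel is left uncovered, so this sketch makes two labeled assumptions: where blocks overlap, the nearer (smaller) distance wins, and uncovered pixels are filled with a `far` value. The function name and the `far` parameter are hypothetical.

```python
def move_blocks(rm_t, motion, w, far):
    """Generate RM(T + n) by shifting each W x W block of RM(T).

    `motion` maps block coordinates (i, j) to a motion vector (dx, dy).
    Assumptions of this sketch: overlapping writes keep the smaller
    (nearer) distance, and pixels no block lands on are set to `far`.
    """
    height, width = len(rm_t), len(rm_t[0])
    rm_tn = [[None] * width for _ in range(height)]
    for (i, j), (dx, dy) in motion.items():
        for y in range(j * w, (j + 1) * w):
            for x in range(i * w, (i + 1) * w):
                nx, ny = x + dx, y + dy
                if not (0 <= nx < width and 0 <= ny < height):
                    continue  # pixel moved outside the frame
                old = rm_tn[ny][nx]
                if old is None or rm_t[y][x] < old:
                    rm_tn[ny][nx] = rm_t[y][x]  # nearer surface wins
    return [[far if v is None else v for v in row] for row in rm_tn]


# 4 x 4 distance image: block (0, 0) is near (1), everything else far (5).
rm_t = [[1, 1, 5, 5],
        [1, 1, 5, 5],
        [5, 5, 5, 5],
        [5, 5, 5, 5]]
# Only the near block moves, two pixels to the right.
motion = {(0, 0): (2, 0), (1, 0): (0, 0), (0, 1): (0, 0), (1, 1): (0, 0)}
rm_tn = move_blocks(rm_t, motion, w=2, far=5)
```

In this example the near block lands on top of the stationary background block at (1, 0), and the position it vacated is filled with the far value.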
[0105]
After that, the frame time is updated with n = n + 1. If the distance image data RM (T + n) does not exist, the above procedure is repeated to generate the distance image data RM (T + n).
[0106]
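At this time, the distance image data RM (T + n) at the frame time T + n is generated, for example, as shown in the figure, using the moving image data CM (T) and the distance image data RM (T) at the frame time T and the moving image data CM (T + n) at the frame time T + n.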
[0107]
Thereafter, by repeating the above-described processing until a frame time of the moving image data CM at which distance image data RM exists is reached, the distance image data RM between the frame time T and the frame time T + N can be interpolated.
[0108]
As described above, according to the stereoscopic image generation method of the second embodiment, a stereoscopic image of a moving image can be generated easily even with acquisition means that cannot acquire distance image data for every frame.
[0109]
Further, even if the acquisition unit is capable of acquiring the distance image data for each frame, by intermittently acquiring the distance image data, the power consumption of the distance image data acquisition unit 2A can be reduced. Therefore, the device for generating the stereoscopic image can be easily incorporated into a portable device.
[0110]
Further, in the stereoscopic image generation method according to the second embodiment, instead of the distance image data input from the distance measuring device in the above procedure, it is also possible to use low-resolution distance image data that merely distinguishes near from far, as described in the first embodiment. In such a case, since only the movement of the contour region needs to be obtained, the amount of calculation is reduced, and the power consumption can be further reduced.
[0111]
FIG. 27 is a schematic diagram for explaining a modification of the second embodiment.
[0112]
In the second embodiment, an example was shown in which the distance image data RM (T + n) at a frame time T + n between the frame time T and the frame time T + N is generated by interpolation, as shown in the figure, using the moving image data CM (T) and the distance image data RM (T) at the frame time T and the moving image data CM (T + n) at the frame time T + n. Instead, for example, the distance image data RM (T + n) may be generated using the moving image data CM (T + n-1) of one frame before, the distance image data RM (T + n-1) generated by interpolation, and the moving image data CM (T + n).
[0113]
In moving image data, the contour shape of the shooting target usually changes gradually as the image frames are updated. Therefore, in a method that always performs pattern matching against the region obtained at the first frame time T, as in the second embodiment, correct pattern matching cannot be performed for temporally distant frames in which the amount of deformation exceeds a certain level. In contrast, in the method shown in FIG. 27, the motion vector is obtained by pattern matching against the region obtained in the immediately preceding frame. The deformation of the contour shape of the shooting target is therefore always that of a single frame, which is minute, regardless of how far the frame is from the frame time T. Therefore, with the method shown in FIG. 27, a stereoscopic image of the moving image can be generated accurately even if the number of frames in which no distance image data exists is large.
[0114]
(Example 3)
FIG. 28 is a schematic diagram illustrating a schematic configuration of the stereoscopic image generation device according to the third embodiment of the present invention.
[0115]
As shown in FIG. 28, the stereoscopic image generation device according to the third embodiment includes moving image data input means 1B for inputting moving image data from the moving image data acquisition means 1A; auxiliary image data input means 20B for inputting auxiliary image data from auxiliary image data acquisition means 20A; control means 3 for controlling the data input to the moving image data input means 1B and the auxiliary image data input means 20B; moving image data recording means 4 for recording the moving image data input from the moving image data input means 1B; distance image data recording means 5 for recording the auxiliary image data input from the auxiliary image data input means 20B and the distance image data; and distance image data interpolation generating means 21.
[0116]
The stereoscopic image generation device generates a stereoscopic image composed of a set of moving image data and distance image data at, for example, 30 frames or more per second. As shown in the figure, the generated stereoscopic image can be presented as a stereoscopic moving image through calculation by the calculation means 7 and display by the display means 8.
[0117]
FIGS. 29 to 36 are schematic diagrams illustrating a stereoscopic image generation method using the stereoscopic image generation device of the third embodiment.
[0118]
In the stereoscopic image generation device of the third embodiment, unlike the first and second embodiments, auxiliary image data capable of generating the distance image data is input, as shown in FIG. 29, and distance image data is generated using the auxiliary image data alone, or using the auxiliary image data and the moving image data.
[0119]
Here, auxiliary image data capable of generating the distance image data is defined as image data that does not itself directly represent distance information, but from which distance image data, low-resolution distance image data, or pseudo distance image data can be generated by calculation, using either the auxiliary image data alone or a combination of the auxiliary image data and the moving image data.
[0120]
As an example, when the auxiliary image data is a normal image, the auxiliary image data alone does not include distance information, but a distance image can be generated by the stereo method if it is treated as one of a stereo pair together with the moving image data. Therefore, a normal image can serve as auxiliary image data.
[0121]
In the third embodiment, only an example in which such a stereo image is used as the auxiliary image data will be described in detail. However, the present invention does not limit the type of auxiliary image data, as long as distance image data, low-resolution distance image data, or pseudo distance image data can be generated by calculation using the auxiliary image data alone or a combination of the auxiliary image data and the moving image data.
[0122]
When a stereoscopic image of a moving image is generated using the stereoscopic image generation device of the third embodiment, it is assumed that moving image data CM for every frame and auxiliary image data HM, from which a distance image can be generated, for every N frames have been recorded in advance in recording means such as a semiconductor memory. Assuming that the frame time at which both the moving image data and the auxiliary image data exist is T, one HM (T) exists for the moving images CM (T), CM (T + 1), ..., CM (T + N−1). From these images, stereoscopic image data of the moving image can be generated by generating and interpolating the distance image data RM (T), RM (T + 1), ..., RM (T + N−1).
[0123]
It is assumed that the auxiliary image data HM (T) used in the third embodiment is image data photographed at a position different in the horizontal direction from the photographing position of the moving image data CM (T).
[0124]
When generating the distance image data RM (T) using the stereoscopic image generation method of the third embodiment, first, as shown in FIG. 30, the auxiliary image data HM (T) at the same frame time as the moving image data CM (T) is selected (step 2201), and the distance image data RM (T) at the frame time T is generated (step 2202).
[0125]
In step 2202 for generating the distance image data RM (T) at the frame time T, as shown in FIG. 31, first, the absolute value of the difference between the moving image data CM (T) and the auxiliary image data HM (T) shown in the figures is taken to generate the difference image DM (T) (step 2202a). At this time, since the moving image data CM (T) and the auxiliary image data HM (T) form a stereo pair, the near view area appears at different positions in the moving image data CM (T) and the auxiliary image data HM (T), so the difference value there is not 0. In particular, a remarkable difference value occurs in a region where the change in the luminance value is large. On the other hand, in the distant view area, the difference value is very small because the position in the image hardly changes between the moving image data CM (T) and the auxiliary image data HM (T).
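The difference image of step 2202a can be sketched as a per-pixel operation. This is a minimal sketch, assuming both images hold scalar luminance values of the same size (the specification stores RGB data in CM, so a conversion to luminance is an assumption here); the function name is illustrative.

```python
def abs_difference(cm, hm):
    """Per-pixel absolute difference DM between two same-sized images."""
    if len(cm) != len(hm) or any(len(a) != len(b) for a, b in zip(cm, hm)):
        raise ValueError("CM and HM must have the same size")
    return [[abs(a - b) for a, b in zip(row_c, row_h)]
            for row_c, row_h in zip(cm, hm)]


# A bright near object at x = 2 in CM appears at x = 1 in HM (parallax),
# while the uniform distant background is unchanged.
cm = [[10, 10, 200, 10],
      [10, 10, 200, 10]]
hm = [[10, 200, 10, 10],
      [10, 200, 10, 10]]
dm = abs_difference(cm, hm)
```

The shifted object produces large values at both its old and new positions, while the background difference stays at 0.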
[0126]
Next, each point DM (T)_{x, y} of the difference image DM (T) is set to 1 if its difference value is equal to or greater than a predetermined difference value k, and set to 0 if it is less than k (steps 2202b, 2202c, 2202d). Thereafter, x or y is updated (step 2202e), and when all DM (T)_{x, y} have been binarized (step 2202f), a binarized image DM (T) as shown in the figure is generated. As a result, the area where the object is not shown (distant view area) is 0. On the other hand, the region where the object is captured (near view area) has a high probability of being 1, but it may be 0 in a region where the change in luminance is small, resulting in a noisy image in which 0 and 1 are mixed.
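The thresholding of steps 2202b to 2202f can be sketched in a few lines. The threshold value used in the example is arbitrary; the specification only says that k is predetermined.

```python
def binarize(dm, k):
    """Set each pixel to 1 if its difference value is >= k, else 0."""
    return [[1 if value >= k else 0 for value in row] for row in dm]


dm = [[0, 3, 190],
      [40, 39, 41]]
dm_bin = binarize(dm, k=40)
```

A value exactly equal to k maps to 1, matching the "equal to or greater than" condition in the text.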
[0127]
Therefore, next, noise is removed from the binarized image DM (T) (step 2202g). The noise removal is performed, for example, as shown in FIGS. 34, 35 (a), and 35 (b): for every pixel DM (T)_{x, y} of the binarized image DM (T), the values of the preset p pixels × p pixels centered on that pixel are checked (steps 2202h and 2202i). If there is a pixel having a value of 1, RM (T)_{x, y} is set to 1 (step 2202k); otherwise it is set to 0 (step 2202j). As a result, noise is removed, and as shown in FIG. 36, a low-resolution distance image RM (T) in which the near view is represented by 1 and the distant view is represented by 0 is generated.
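The window check of steps 2202h to 2202k can be sketched as follows. This sketch assumes p is odd so that the window has a center pixel, and clips the window at the image border; neither detail is stated in the text. The function name is hypothetical.

```python
def fill_window(dm_bin, p):
    """Set RM[y][x] to 1 if any pixel in the p x p window around it is 1."""
    if p % 2 == 0:
        raise ValueError("p is assumed odd so the window has a center pixel")
    height, width = len(dm_bin), len(dm_bin[0])
    r = p // 2
    rm = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            rows = dm_bin[max(0, y - r):y + r + 1]  # clipped at the border
            if any(v == 1 for row in rows
                   for v in row[max(0, x - r):x + r + 1]):
                rm[y][x] = 1
    return rm


# Two isolated 1s with a gap between them inside the near view area.
dm_bin = [[0, 0, 0, 0, 0],
          [0, 1, 0, 1, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0]]
rm = fill_window(dm_bin, p=3)
```

This operation is equivalent to a morphological dilation: it closes the 0-valued gaps inside the near view area, but it also enlarges the area by about p / 2 pixels and enlarges any stray 1 in the distant view.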
[0128]
In addition, as a method of removing noise, there is also a method of using a median filter or applying a smoothing filter and then binarizing again. The present invention does not limit the method of removing noise.
[0129]
Various other images are also conceivable as auxiliary images. For example, an input image with a shallow depth of focus that is in focus at a short distance can be used as an auxiliary image. In such an auxiliary image, the distant view area is photographed out of focus, so its high-frequency components are extremely small. Therefore, by obtaining the spatial frequency near the same position in the auxiliary image and the moving image and taking the ratio of the high-frequency components, the approximate distance at that position can be calculated from the ratio. The present invention does not limit what is used as an auxiliary image, as long as a distance image can be generated from the auxiliary image alone or from a combination of the auxiliary image and the moving image.
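The ratio of high-frequency components can be sketched with a crude local measure. The specification does not give a formula, so this sketch assumes that the sum of squared differences between adjacent pixels in a small window stands in for the high-frequency content; the function names are hypothetical, and converting the ratio into an actual distance would need a calibration that the text does not describe.

```python
def high_freq_energy(image, x0, y0, size):
    """Sum of squared differences between adjacent pixels in a window.

    A stand-in for the high-frequency content of the window (assumption).
    """
    energy = 0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            if x + 1 < x0 + size:
                energy += (image[y][x + 1] - image[y][x]) ** 2
            if y + 1 < y0 + size:
                energy += (image[y + 1][x] - image[y][x]) ** 2
    return energy


def sharpness_ratio(aux, main, x0, y0, size):
    """Ratio of high-frequency energy, auxiliary image over moving image."""
    main_energy = high_freq_energy(main, x0, y0, size)
    if main_energy == 0:
        return None  # flat region: the ratio carries no information
    return high_freq_energy(aux, x0, y0, size) / main_energy


def checker(low, high, size=4):
    """Checkerboard test pattern alternating between two values."""
    return [[high if (x + y) % 2 else low for x in range(size)]
            for y in range(size)]


main = checker(0, 100)        # sharp texture in the moving image
aux_focused = checker(0, 100) # same texture, in focus in the auxiliary image
aux_blurred = checker(40, 60) # same texture, defocused (lower contrast)
near_ratio = sharpness_ratio(aux_focused, main, 0, 0, 4)
far_ratio = sharpness_ratio(aux_blurred, main, 0, 0, 4)
```

A ratio near 1 suggests the position is within the auxiliary image's focal range (near view), and a ratio near 0 suggests it is out of focus (distant view).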
[0130]
After that, the above procedure is repeated to generate the distance image data RM (T) from the moving image data CM (T) and the intermittently input auxiliary image data HM (T), and a stereoscopic image of the moving image is generated according to the procedure described in the first embodiment or the second embodiment.
[0131]
In the third embodiment, a low-resolution distance image is generated, so the method is combined with the generation method described in the first embodiment. However, when the generated distance image has a high resolution, the method may be combined with the generation method described in the second embodiment.
[0132]
As described above, according to the stereoscopic image generation method of the third embodiment, auxiliary image data capable of generating the distance image data is obtained intermittently and the distance image data is generated from the auxiliary image data, so that a stereoscopic image of a moving image can be generated easily, in the same manner as with the generation method described in the first embodiment or the second embodiment.
[0133]
Further, by intermittently obtaining the auxiliary image data, the power consumption of the auxiliary image data obtaining unit 20A that obtains the auxiliary image data can be reduced. Therefore, it is easy to incorporate the stereoscopic image generation device into a portable device.
[0134]
The present invention has been specifically described above based on the embodiments. However, the present invention is of course not limited to these embodiments and can be modified in various ways without departing from its gist.
[0135]
[Effects of the Invention]
The effects obtained by typical aspects of the invention disclosed in the present application will be briefly described as follows.
(1) A three-dimensional image of a moving image can be easily generated.
(2) The power consumption of a device that generates a stereoscopic image of a moving image can be reduced.
[Brief description of the drawings]
FIG. 1 is a schematic diagram illustrating a schematic configuration of a stereoscopic image generation device according to a first embodiment of the present invention.
FIG. 2 is a schematic diagram for explaining a three-dimensional image generation method using the three-dimensional image generation device according to the first embodiment, and is a diagram illustrating an overall processing procedure of the three-dimensional image generation method.
FIG. 3 is a schematic diagram for explaining a stereoscopic image generation method using the stereoscopic image generation device according to the first embodiment, and is a diagram illustrating an overall processing procedure of the stereoscopic image generation method.
FIG. 4 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 5 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 6 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 7 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 8 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 9 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 10 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 11 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 12 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 13 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 14 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 15 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 16 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 17 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 18 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the first embodiment.
FIG. 19 is a schematic diagram for explaining characteristics of a three-dimensional image generated by the three-dimensional image generation method according to the first embodiment.
FIG. 20 is a schematic diagram for explaining the overall processing procedure of the stereoscopic image generation method according to the second embodiment of the present invention.
FIG. 21 is a schematic diagram for explaining the overall processing procedure of the stereoscopic image generation method according to the second embodiment of the present invention.
FIG. 22 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the second embodiment.
FIG. 23 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the second embodiment.
FIG. 24 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the second embodiment.
FIG. 25 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the second embodiment.
FIG. 26 is a schematic diagram for explaining a specific example of the stereoscopic image generation method according to the second embodiment.
FIG. 27 is a schematic diagram for explaining a modification of the second embodiment.
FIG. 28 is a schematic diagram illustrating a schematic configuration of a stereoscopic image generation device according to a third embodiment of the present invention.
FIG. 29 is a schematic diagram for explaining a three-dimensional image generation method using the three-dimensional image generation device of the third embodiment.
FIG. 30 is a schematic diagram for explaining a three-dimensional image generation method using the three-dimensional image generation device of the third embodiment.
FIG. 31 is a schematic diagram for explaining a three-dimensional image generation method using the three-dimensional image generation device of the third embodiment.
FIG. 32 is a schematic diagram for explaining a three-dimensional image generation method using the three-dimensional image generation device of the third embodiment.
FIG. 33 is a schematic diagram for explaining a three-dimensional image generation method using the three-dimensional image generation device of the third embodiment.
FIG. 34 is a schematic diagram for explaining a three-dimensional image generation method using the three-dimensional image generation device of the third embodiment.
FIG. 35 is a schematic diagram for explaining a three-dimensional image generation method using the three-dimensional image generation device of the third embodiment.
FIG. 36 is a schematic diagram for explaining a three-dimensional image generation method using the three-dimensional image generation device according to the third embodiment.
[Explanation of symbols]
1A: moving image data acquisition means, 1B: moving image data input means, 2A: distance image data acquisition means, 2B: distance image data input means, 3: control means, 4: moving image data recording means, 5: distance image data recording means, 6: distance image data interpolation means, 7: calculation means, 8: display means, 10: photographing object, 11: near view area, 12: boundary area, 13: outline area, 15: light source, 16: total reflection mirror, 17: half mirror, 20A: auxiliary image data acquisition means, 20B: auxiliary image data input means, 21: distance image data interpolation generation means.

Claims

A stereoscopic image generation method comprising:
a step of inputting moving image data for each frame time;
a step of inputting data for generating a stereoscopic image from the moving image data (hereinafter referred to as distance image data) at intermittent frame intervals; and
a step of interpolating and generating distance image data at a frame time when the distance image data does not exist, based on the moving image data and the input distance image data.

The stereoscopic image generation method according to claim 1, wherein the step of interpolating and generating the distance image data comprises:
a step of obtaining a near-far boundary region from the input distance image data;
a step of obtaining, from the moving image data at the same frame time as the input distance image data, a contour region at the same position as the boundary region or near the boundary region;
a step of obtaining a region corresponding to the contour region from moving image data at a frame time when the distance image data does not exist; and
a step of generating the distance image data by moving the boundary region and its inner region based on the amount of movement between the contour region and the region corresponding to the contour region.

Interpolating and generating the distance image data,
Selecting moving image data at the same frame time as the input distance image data,
Obtaining a motion vector of an image from the selected moving image data to moving image data at a frame time when the distance image data does not exist;
Dividing the input distance image data into a plurality of regions and moving each of the regions based on the motion vector to generate distance image data. 3D image generation method.

The stereoscopic image generation method according to claim 1, wherein the step of inputting the distance image data comprises:
a step of inputting auxiliary image data capable of generating the distance image data at intermittent frame intervals; and
a step of generating distance image data based on the auxiliary image data alone, or on the auxiliary image data and moving image data at the same frame time as the auxiliary image data.

A stereoscopic image generation device comprising:
moving image data input means for inputting moving image data for each frame time;
distance image data input means for inputting data for generating a stereoscopic image from the moving image data (hereinafter referred to as distance image data) at intermittent frame intervals; and
distance image data interpolation means for interpolating and generating distance image data at a frame time when the distance image data does not exist, based on the moving image data input from the moving image data input means and the distance image data input from the distance image data input means.

The stereoscopic image generation device according to claim 5, wherein the distance image data interpolation means comprises:
means for obtaining a near-far boundary region from the distance image data input from the distance image data input means;
means for obtaining, from the moving image data at the same frame time as the distance image data, a contour region at the same position as the boundary region or near the boundary region;
means for obtaining a region corresponding to the contour region from moving image data at a frame time when the distance image data does not exist; and
means for generating the distance image data by moving the boundary region and its inner region based on the amount of movement between the contour region and the region corresponding to the contour region.

The stereoscopic image generation device according to claim 5, wherein the distance image data interpolation means comprises:
means for selecting moving image data at the same frame time as the input distance image data;
means for obtaining a motion vector of an image from the selected moving image data to moving image data at a frame time when the distance image data does not exist; and
means for dividing the input distance image data into a plurality of regions and moving each of the regions based on the motion vector to generate distance image data.

The stereoscopic image generation device according to claim 5, wherein the distance image data input means comprises:
auxiliary image data input means for inputting auxiliary image data capable of generating the distance image data at intermittent frame intervals; and
means for generating distance image data based on the auxiliary image data alone, or on the auxiliary image data and the moving image data at the same frame time as the auxiliary image data.

A stereoscopic image generation program for causing a computer to execute each step of the stereoscopic image generation method according to any one of claims 1 to 4.

A recording medium in which the stereoscopic image generation program according to claim 9 is recorded so as to be readable by a computer.