JP3655065B2

JP3655065B2 - Position / attitude detection device, position / attitude detection method, three-dimensional shape restoration device, and three-dimensional shape restoration method

Info

Publication number: JP3655065B2
Application number: JP23785597A
Authority: JP
Inventors: 憲彦村田; 貴史北口
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1997-08-20
Filing date: 1997-08-20
Publication date: 2005-06-02
Anticipated expiration: 2017-08-20
Also published as: JPH1163949A

Description

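[0001]
[Technical field of the invention]
The present invention relates to a position/attitude detection apparatus and method that detect the position and attitude of a camera at the time of shooting from a plurality of consecutive images, and to a three-dimensional shape restoration apparatus and method that restore the three-dimensional shape of a photographed object, and in particular to achieving highly accurate three-dimensional shape restoration with a small amount of computation.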
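[0002]
[Prior art]
Research on restoring the three-dimensional shape of objects is under way in many fields, beginning with vision for autonomous mobile robots. In recent years in particular, dramatic advances in electronics have rapidly spread computers and electronic devices, making stereoscopic display of three-dimensional information easy to enjoy. This in turn has raised expectations for technology that restores the three-dimensional structure of real-world objects and scenes.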
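[0003]
Methods for measuring the distance to an object and its shape, in order to restore the three-dimensional structure of a real-world object, divide into active methods, which irradiate the object with light waves or ultrasonic waves, and passive methods, typified by stereo imaging. Active methods include irradiating the object with a wave such as light, radio waves, or sound and measuring the propagation time of the reflected wave to obtain the distance to the object, and light projection, in which slit light, spot light, or the like with a specific pattern is projected onto the object from a light source whose position relative to the camera is known and the distortion of the pattern is observed to obtain the shape of the object. Active methods generally make the apparatus hard to miniaturize, but they can measure distance quickly and with high accuracy.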
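[0004]
Passive methods, on the other hand, divide broadly into multi-view stereo and motion stereo. In multi-view stereo, the object is photographed with a plurality of cameras whose mutual positions and attitudes are known, feature points or regions are matched between the captured images, and the three-dimensional shape of the object is calculated by the principle of triangulation. This method tends to produce large distance measurement errors when matching errors arise from noise superimposed on the images or when sufficient parallax cannot be obtained. In motion stereo, the object is photographed while a single camera is moved, consecutive images are matched, and the position and attitude of the camera and the three-dimensional shape of the object are calculated. This method shares the problems of multi-view stereo and, unlike multi-view stereo, the camera position and attitude between images are unknown, so a complex system of nonlinear equations generally has to be solved iteratively. The amount of computation is therefore enormous and the solution tends to be unstable. To address these problems of passive methods, apparatuses that restore a three-dimensional shape at low computational cost by using a distance sensor, acceleration sensor, angular velocity sensor, magnetic sensor, or the like in addition to images are disclosed in, for example, JP-A-5-196437, JP-A-7-181024, and JP-A-9-81790.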
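[0005]
The apparatus disclosed in JP-A-5-196437 photographs one point of a subject with an orthographic-projection camera, obtains the attitude of the camera at that time with a three-axis gyro, and extracts three-dimensional information on the subject by a voting method. The apparatus disclosed in JP-A-7-181024 has movement amount detection means for detecting the amount of camera movement, uses the detected movement amount as the baseline length, and restores the three-dimensional shape of the subject from this baseline length and the result of a corresponding-point search on the image data, thereby reducing the size and weight of three-dimensional shape measuring apparatuses, which tend to be large. As the movement amount detection means, the movement of the image input means is measured directly with an angular velocity sensor that uses inertial force, or the movement of the person taking the measurement is detected with a magnetic sensor, ultrasonic sensor, optical fiber sensor, pressure sensor, or the like and the movement amount of the image input means is calculated from it. The apparatus disclosed in JP-A-9-81790 detects camera motion with an angle sensor and an acceleration sensor and adjusts the optical axis direction at each viewpoint so that the optical axes from different viewpoints intersect at an arbitrary point. This allows the viewpoints to be chosen freely at shooting time, gives the images from the viewpoints common coordinate axes so that they are easy to match, and reduces the processing load of restoring the three-dimensional shape, which raises the processing speed.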
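[0006]
[Problems to be solved by the invention]
However, an approach that presupposes orthographic projection, as in JP-A-5-196437, is not accurate enough for extracting three-dimensional information from images taken with a camera that follows the central projection model. When the amount of camera movement is calculated with sensors such as an angular velocity sensor, as in JP-A-7-181024, the signals from the sensors have to be analyzed to compute the movement amount, so the error component of the movement amount accumulates. The apparatus disclosed in JP-A-9-81790 detects the subject by comparing a motion vector estimated from sensor information on camera motion and a preset object-to-camera distance with a motion vector obtained by image processing. Because the object-to-camera distance is preset, the three-dimensional shape can be restored only under specific shooting conditions. In addition, a drive mechanism for changing the direction of the optical axis is required, which complicates the structure of the apparatus.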
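[0007]
The present invention has been made to solve these problems. Its object is to obtain a position/attitude detection apparatus and method that detect the position and attitude of a camera under arbitrary shooting conditions, and a three-dimensional shape restoration apparatus and method that reduce the computational load and achieve highly accurate three-dimensional shape restoration.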
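[0008]
[Means for solving the problems]
The position/attitude detection apparatus according to the present invention comprises image input means, distance detection means, attitude detection means, and translation component calculation means. The image input means inputs images of a subject while the shooting position and viewpoint are changed. The distance detection means detects the distance from each viewpoint to a gazing point of the subject that corresponds to one specific point on the plurality of images obtained from the image input means. The attitude detection means calculates the attitude of the image input means at each viewpoint. The translation component calculation means calculates the translation component of the image input means between viewpoints from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means.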
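[0009]
The position/attitude detection method according to the present invention inputs images of a subject with image input means while the shooting position and viewpoint are changed, detects the distance from each viewpoint to a gazing point of the subject that corresponds to one specific point on the plurality of images obtained at the different viewpoints, calculates the attitude of the image input means at each viewpoint, and calculates the translation component of the image input means between viewpoints from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means.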
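[0010]
The three-dimensional shape restoration apparatus according to the present invention comprises image input means, distance detection means, attitude detection means, translation component calculation means, correspondence detection means, and three-dimensional calculation means. The image input means inputs images of a subject while the shooting position and viewpoint are changed. The distance detection means detects the distance from each viewpoint to a gazing point of the subject that corresponds to one or more specific points on the plurality of images obtained from the image input means. The attitude detection means calculates the attitude of the image input means at each viewpoint. The translation component calculation means calculates the translation component of the image input means between viewpoints from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means. The correspondence detection means matches the plurality of images obtained at the different viewpoints using the translation component and attitude information of the image input means. The three-dimensional calculation means calculates the three-dimensional shape of the subject from the matching result and the position and attitude information of the image input means.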
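[0011]
Another three-dimensional shape restoration apparatus according to the present invention comprises image input means, distance detection means, attitude detection means, gaze region verification means, translation component calculation means, correspondence detection means, and three-dimensional calculation means. The image input means inputs images of a subject while the shooting position and viewpoint are changed. The distance detection means detects the distance from each viewpoint to a gazing point of the subject that corresponds to a point within a specific gaze region on the plurality of images obtained from the image input means. The attitude detection means calculates the attitude of the image input means at each viewpoint. The gaze region verification means confirms that the displacement of the subject within the gaze region between the images taken at the different viewpoints is at or below a predetermined threshold. The translation component calculation means calculates the translation component of the image input means between viewpoints from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means. The correspondence detection means matches the plurality of images obtained at the different viewpoints using the translation component and attitude information of the image input means. The three-dimensional calculation means calculates the three-dimensional shape of the subject from the matching result and the position and attitude information of the image input means.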
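[0012]
It is desirable to provide gaze region adjustment means that adjusts the threshold used by the gaze region verification means according to the optical system parameters of the image input means.
[0013]
An acceleration sensor, a magnetic sensor, or an angular velocity sensor may be used for the attitude detection means, either alone or in combination.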
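[0014]
The three-dimensional shape restoration method according to the present invention inputs images of a subject with image input means while the shooting position and viewpoint are changed, detects the distance from each viewpoint to a gazing point of the subject that corresponds to one or more specific points on the plurality of images obtained at the different viewpoints, calculates the attitude of the image input means at each viewpoint, calculates the translation component of the image input means between viewpoints from the image information at each viewpoint, the distance information to the one or more gazing points, and the attitude information of the image input means, matches the plurality of images obtained at the different viewpoints using the translation component and attitude information of the image input means, and calculates the three-dimensional shape of the subject from the matching result and the position and attitude information of the image input means.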
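[0015]
Another three-dimensional shape restoration method according to the present invention inputs images of a subject with image input means while the shooting position and viewpoint are changed, detects the distance from each viewpoint to a gazing point of the subject that corresponds to a point within a specific gaze region on the plurality of images obtained at the different viewpoints, calculates the attitude of the image input means at each viewpoint, confirms that the displacement of the subject within the gaze region between the images taken at the different viewpoints is at or below a predetermined threshold, calculates the translation component of the image input means between viewpoints from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means, matches the plurality of images obtained at the different viewpoints using the translation component and attitude information of the image input means, and calculates the three-dimensional shape of the subject from the matching result and the position and attitude information of the image input means. It is desirable to adjust the threshold according to the optical system parameters of the image input means.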
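[0016]
[Embodiments of the invention]
The three-dimensional shape restoration apparatus of the present invention has image input means, distance detection means, attitude detection means consisting of acceleration sensors for three orthogonal axes, translation component calculation means, correspondence detection means, and three-dimensional calculation means. When the same subject is photographed from two different viewpoints to restore its three-dimensional shape, the photographer first decides on a gazing point, a single point on the subject to which the distance will be measured from the first and second viewpoints. Once the gazing point is decided, the subject is photographed with the image input means at the first viewpoint, the distance detection means measures the distance from the first viewpoint to the gazing point, and the attitude detection means measures the attitude of the image input means at the first viewpoint. The image input means is then moved and the subject is photographed at the second viewpoint, the distance detection means measures the distance from the second viewpoint to the gazing point, and the attitude detection means measures the attitude of the image input means at the second viewpoint. The translation component calculation means calculates the translation component of the image input means between the viewpoints from the image data taken at each viewpoint, the distance from each viewpoint to the gazing point, and the attitude information of the image input means at each viewpoint. The correspondence detection means uses the translation component and attitude information of the image input means to match feature points between the two images taken at the different viewpoints. The three-dimensional calculation means calculates and restores the three-dimensional structure of the subject by the principle of triangulation from the matching result of the correspondence detection means, the translation component, and the attitude information.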
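[0017]
Because the translation component of the image input means between the viewpoints is calculated in this way from the distance from the first viewpoint to the gazing point, the distance from the second viewpoint to the gazing point, and the attitude information of the image input means at each viewpoint, the three-dimensional shape can be restored accurately with a small amount of computation.
[0018]
Because acceleration sensors that measure acceleration along three orthogonal axes are used as the attitude detection means, the direction of gravity can be detected when the subject is photographed from two viewpoints with the apparatus stationary, and the attitude of the image input means relative to the direction of gravity can be detected with high accuracy.
[0019]
If a plurality of gazing points are set on the subject, the final translation component is determined from the plurality of translation components calculated from the individual gazing points, and the three-dimensional structure of the subject is restored with the determined translation component, a more accurate three-dimensional shape can be restored.
[0020]
Furthermore, if a fixed region of the frame shot by the image input means is used as the gaze region and the subject is photographed from the different viewpoints so that the same part of the subject falls within the gaze region, the translation component of the image input means between the viewpoints can be calculated with less computation.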
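[0021]
[Examples]
FIG. 1 is a block diagram showing the configuration of one embodiment of the present invention. As shown in the figure, the three-dimensional shape restoration apparatus 1 has image input means 2 consisting of, for example, a digital camera, distance detection means 3, attitude detection means 4, translation component calculation means 5, correspondence detection means 6, and three-dimensional calculation means 7. The distance detection means 3 detects the distance from the image input means 2 to a gazing point, a single point on the subject, using for example an infrared stereo method based on the principle of triangulation, a method that projects a wave such as ultrasound and measures distance from the propagation time of the reflection from the subject, or a method that obtains the in-focus distance from an encoder installed in the optical system. The attitude detection means 4 consists of, for example, acceleration sensors for three orthogonal axes and measures the attitude of the image input means 2 by detecting the direction of gravity when the subject is photographed with the image input means 2 stationary. The translation component calculation means 5 calculates the translation component of the image input means 2 between viewpoints from the distance information to the gazing point of the subject and the attitude information of the image input means 2. The correspondence detection means 6 matches specific points between the plurality of images obtained at the different viewpoints using the translation component and attitude information of the image input means 2. The three-dimensional calculation means 7 calculates the three-dimensional shape of the subject from the matching result of the correspondence detection means 6 and the position and attitude information of the image input means 2, and outputs it to storage means 8 such as a hard disk.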
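[0022]
The operation of the three-dimensional shape restoration apparatus 1 configured as above when, as shown in FIG. 2, the same subject 9 is photographed at a first viewpoint 11 and a second viewpoint 12 to restore its three-dimensional shape is described with reference to the flowchart of FIG. 3.
[0023]
Before the subject 9 is photographed with the image input means 2, the photographer decides on a gazing point 13, a single point on the subject 9 to which the distance will be measured from the first viewpoint 11 and the second viewpoint 12 (step S1). Various techniques can be used to choose the gazing point 13, such as automatically selecting a small region, for example an edge, that shows a characteristic density distribution on the subject 9, as shown in FIG. 2. Once the gazing point 13 is decided, the subject 9 is photographed with the image input means 2 at the first viewpoint 11 and, as shown in the screen view of FIG. 4, the corresponding point 16a that corresponds to the position of the gazing point 13 is identified on the image plane 14 taken at the first viewpoint 11. The distance detection means 3 measures the distance L₁ from the first viewpoint 11 to the gazing point 13, and the attitude detection means 4 measures the attitude of the image input means 2 at the first viewpoint 11 (step S2). The image input means 2 is then moved and the subject 9 is photographed at the second viewpoint 12. As shown in FIG. 4, the corresponding point 16b that corresponds to the position of the gazing point 13 is identified on the image plane 15 taken at the second viewpoint 12, the distance detection means 3 measures the distance L₂ from the second viewpoint 12 to the gazing point 13, and the attitude detection means 4 measures the attitude of the image input means 2 at the second viewpoint 12 (step S3).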
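[0024]
Here, as shown in the explanatory diagram of FIG. 5, at the first viewpoint 11 take the x and y axes in mutually orthogonal directions on the image plane 14 and the z axis along the optical axis, and let the coordinates of the corresponding point 16a on the image plane 14 be (x, y) and those of the corresponding point 16b on the image plane 15 be (x₁, y₁). The rotation matrix R, which expresses the relative attitude of the image input means 2 between the first viewpoint 11 and the second viewpoint 12 with respect to this xyz coordinate system, is then given by equation (1) below, where α, β, and γ are the rotation angles about the x, y, and z axes, respectively, when the image input means 2 is moved from the first viewpoint 11 to the second viewpoint 12.
[0025]
[Equation 1]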

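[0026]
The rotation matrix R can be obtained from the attitude information of the image input means 2 detected by the attitude detection means 4, and the distance L₁ from the first viewpoint 11 to the gazing point 13 and the distance L₂ from the second viewpoint 12 to the gazing point 13 can be obtained by the distance detection means 3. Therefore, if optical system parameters such as the focal length f of the image input means 2 are known, the direction of the line of sight from the first viewpoint 11 to the corresponding point 16a (x, y) on the image plane 14 and the direction of the line of sight from the second viewpoint 12 to the corresponding point 16b (x₁, y₁) on the image plane 15 can be determined. For example, when the image input means 2 follows the central projection model as shown in FIG. 6, take a three-dimensional coordinate system referenced to the first viewpoint 11 and let IR be the inverse of the rotation matrix R. The unit line-of-sight vector m from the first viewpoint 11 to the gazing point 13 and the unit line-of-sight vector IRm₁ from the second viewpoint 12 to the gazing point 13 are then each given by equation (2) below.
[0027]
[Equation 2]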

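[0028]
Therefore, once the distance L₁ from the first viewpoint 11 to the gazing point 13, the distance L₂ from the second viewpoint 12 to the gazing point 13, and the rotation matrix R have been obtained, the translation component D of the image input means 2 when it is moved from the first viewpoint 11 to the second viewpoint 12 can be calculated by equation (3) below.
[0029]
[Equation 3]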
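Equations (1) to (3) are not reproduced in this text, so the following Python sketch shows one reading of them that is consistent with the surrounding definitions. It is an illustration under stated assumptions, not the patent's own formulation: R is assumed to compose the rotations about x, y, and z in the order Rz·Ry·Rx (the text does not state the order), IR is taken as the transpose of R, the unit line-of-sight vectors are assumed to be (x, y, f) and (x₁, y₁, f) normalized to length 1, and the translation component is assumed to be D = L₁m − L₂·IR·m₁. All function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(a):
    """Transpose, which is also the inverse for a rotation matrix."""
    return [list(row) for row in zip(*a)]

def rotation_matrix(alpha, beta, gamma):
    """R from rotations about x, y, z (assumed order: Rz * Ry * Rx)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    rx = [[1, 0, 0], [0, ca, -sa], [0, sa, ca]]
    ry = [[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]]
    rz = [[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]]
    return matmul(rz, matmul(ry, rx))

def unit_line_of_sight(x, y, f):
    """Unit vector through image point (x, y) for focal length f."""
    norm = math.sqrt(x * x + y * y + f * f)
    return [x / norm, y / norm, f / norm]

def translation_component(l1, l2, pt1, pt2, f, r):
    """Assumed form of equation (3): D = L1*m - L2*IR*m1."""
    m = unit_line_of_sight(pt1[0], pt1[1], f)
    m1 = unit_line_of_sight(pt2[0], pt2[1], f)
    ir_m1 = matvec(transpose(r), m1)  # second ray in first-viewpoint axes
    return [l1 * m[i] - l2 * ir_m1[i] for i in range(3)]
```

Under these assumptions the calculation is closed-form: two distance readings, one rotation matrix, and one pair of corresponding image points are enough, with no iterative solver.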

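[0030]
The translation component calculation means 5 therefore calculates the translation component D of the image input means 2 for the change from the first viewpoint 11 to the second viewpoint 12 from the image data taken at the first viewpoint 11 and the second viewpoint 12, the distances L₁ and L₂ from the viewpoints 11 and 12 to the gazing point 13, and the attitude information of the image input means 2 at the viewpoints 11 and 12 (step S4). The correspondence detection means 6 uses the translation component D and attitude information of the image input means 2 to match feature points between the two image planes 14 and 15 taken at the first viewpoint 11 and the second viewpoint 12, as shown in FIG. 4 (step S5). Because the relative position and attitude of the image input means 2 have already been obtained, this matching can use the epipolar constraint, which is widely used as a basic constraint for solving the correspondence problem in stereo methods that capture an object with two cameras. The matching itself can be done with common techniques, such as methods that use local image features (correlation, feature matching, coarse-to-fine search, and the like) or methods that compute moving regions by spatiotemporal differentiation. The three-dimensional calculation means 7 calculates and restores the three-dimensional structure of the subject 9 by the principle of triangulation from the matching result of the correspondence detection means 6, the translation component D, and the attitude information (step S6). The position and attitude information, three-dimensional information, and image data obtained in this way are recorded and saved in the storage means 8 as needed (steps S7, S8).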
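The epipolar constraint of step S5 and the triangulation of step S6 can be illustrated with a short Python sketch. It assumes that both line-of-sight vectors are already expressed in the coordinate system of the first viewpoint (m for the first ray, u = IR·m₁ for the second) and that D points from the first viewpoint to the second. The midpoint method used for triangulation is one common choice and is not necessarily the one the patent has in mind; the function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [c / n for c in v]

def epipolar_residual(m, u, d):
    """Zero when the two rays and the baseline d lie in one plane."""
    return dot(u, cross(d, m))

def triangulate_midpoint(m, u, d):
    """Midpoint of the shortest segment between rays s*m and d + t*u."""
    a, b, c = dot(m, m), dot(m, u), dot(u, u)
    e, g = dot(m, d), dot(u, d)
    den = a * c - b * b
    if abs(den) < 1e-12:
        raise ValueError("rays are parallel; no parallax to triangulate")
    s = (e * c - b * g) / den      # distance along the first ray
    t = (b * e - a * g) / den      # distance along the second ray
    p1 = [s * m[i] for i in range(3)]
    p2 = [d[i] + t * u[i] for i in range(3)]
    return [(p1[i] + p2[i]) / 2 for i in range(3)]
```

A candidate match whose residual is far from zero can be rejected before any correlation is computed, which is how a known D and R narrow the search to a line in the second image.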
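[0031]
Because the translation component D of the image input means 2 for the change from the first viewpoint 11 to the second viewpoint 12 is calculated in this way from the distance L₁ from the first viewpoint 11 to the gazing point 13, the distance L₂ from the second viewpoint 12 to the gazing point 13, and the attitude information of the image input means 2 at the viewpoints 11 and 12, the shooting position and attitude of the image input means 2 can be detected accurately and the three-dimensional shape can be restored accurately with a small amount of computation.
[0032]
Because acceleration sensors that measure acceleration along three orthogonal axes are used as the attitude detection means 4, the direction of gravity can be detected when the subject 9 is photographed from the two viewpoints 11 and 12 with the apparatus stationary, as shown in FIG. 2, and the attitude of the image input means 2 relative to the direction of gravity can be detected with high accuracy. When the subject is photographed while the image input means 2 is moving, integrating the acceleration signals output by the acceleration sensors yields the velocity and position information (translation component) of the image input means 2, which can be compared or fused with the translation component D calculated from the distance information from each viewpoint to the gazing point and the attitude information of the image input means 2.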
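As a minimal sketch of how a stationary three-axis acceleration reading yields attitude relative to gravity, the Python function below converts one reading into pitch and roll angles. The axis convention (z reads +1 g when the device is level) and the function name are assumptions made for illustration; the patent does not specify them.

```python
import math

def tilt_from_gravity(ax, ay, az):
    """Return (pitch, roll) in radians from a static accelerometer reading.

    Assumes the only sensed acceleration is gravity and that the z axis
    reads +1 g when the device is level (hypothetical convention).
    """
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.hypot(ay, az))
    return pitch, roll
```

Rotation about the gravity vector itself leaves the reading unchanged, so it cannot be recovered from the acceleration sensors alone; a second reference direction, such as a magnetic field, is needed for the full attitude.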
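[0033]
The embodiment above calculates the translation component D of the image input means 2 for the change from the first viewpoint 11 to the second viewpoint 12 from the distances L₁ and L₂ from the two viewpoints to a single gazing point 13 on the subject 9 and the attitude information of the image input means 2, and then uses the calculated translation component D and the attitude information to match feature points between the two image planes 14 and 15. However, because of errors in measuring the distances from the viewpoints 11 and 12 to the gazing point 13 and errors in matching the corresponding points 16a and 16b on the two image planes 14 and 15, the end points 131 and 132 of the calculated line-of-sight vectors L₁m and L₂IRm₁ from the viewpoints 11 and 12 to the gazing point 13 may not coincide, as shown in FIG. 7, so a translation component D obtained from a single gazing point 13 may contain error. To reduce this error, it is advisable to set a plurality of gazing points 13 on the subject 9 and obtain the final translation component from the translation components calculated from the individual gazing points 13.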
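[0034]
FIG. 8 is a block diagram showing the configuration of a three-dimensional shape restoration apparatus 1a according to a second embodiment, in which a plurality of gazing points 13a to 13n are set on the subject 9 and the final translation component is obtained from the translation components calculated from the individual gazing points 13a to 13n. As shown in the figure, in addition to the image input means 2, distance detection means 3, attitude detection means 4, translation component calculation means 5, correspondence detection means 6, and three-dimensional calculation means 7, the three-dimensional shape restoration apparatus 1a has translation component determination means 21 provided downstream of the translation component calculation means 5.
[0035]
The operation of the three-dimensional shape restoration apparatus 1a configured as above when, as shown in FIG. 9, a plurality of gazing points 13a to 13n are set on the subject 9 and the same subject 9 is photographed at the first viewpoint 11 and the second viewpoint 12 to restore its three-dimensional shape is described with reference to the flowchart of FIG. 10.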
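[0036]
Before the subject 9 is photographed with the image input means 2, the photographer decides on a plurality of gazing points 13a to 13n on the subject 9 for the first viewpoint 11 and the second viewpoint 12 (step S11). Once the gazing points 13a to 13n are decided, the subject 9 is photographed with the image input means 2 at the first viewpoint 11, the distance detection means 3 measures the distance from the first viewpoint 11 to each of the gazing points 13a to 13n, and the attitude detection means 4 measures the attitude of the image input means 2 at the first viewpoint 11 (step S12). Because the distance detection means 3 has to measure the distances to a plurality of gazing points 13a to 13n, it is configured to measure the distance to an arbitrary point using, for example, an active method or in-focus distance detection. When shooting and measurement at the first viewpoint 11 are finished, the image input means 2 is moved and the subject 9 is photographed at the second viewpoint 12, the distance detection means 3 measures the distance from the second viewpoint 12 to each of the gazing points 13a to 13n, and the attitude detection means 4 measures the attitude of the image input means 2 at the second viewpoint 12 (step S13). The translation component calculation means 5 calculates the translation components D₁ to D_n of the image input means 2 between the two viewpoints 11 and 12 according to equation (3) from the image data taken at the first viewpoint 11 and the second viewpoint 12, the distance information from the two viewpoints to each of the gazing points 13a to 13n, and the attitude information of the image input means 2 (step S14). The translation component determination means 21 determines the final translation component D from the calculated translation components D₁ to D_n (step S15). The final translation component D can be determined, for example, by weighted averaging as shown in equation (4) below, where S₁ to S_n are indices (weights) expressing how reliable the matching between the two images is for each of the gazing points 13a to 13n.
[0037]
[Equation 4]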
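Equation (4) is not reproduced in this text. The description of a weighted average with weights S₁ to S_n suggests the form D = Σ SᵢDᵢ / Σ Sᵢ, which the following Python sketch implements under that assumption. The function name is hypothetical.

```python
def fuse_translations(translations, weights):
    """Weighted average of per-gazing-point translation vectors.

    translations: list of 3-vectors D_1..D_n
    weights:      list of matching-reliability indices S_1..S_n
    """
    if len(translations) != len(weights):
        raise ValueError("one weight is required per translation vector")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [sum(w * d[i] for w, d in zip(weights, translations)) / total
            for i in range(3)]
```

For example, two estimates [1, 0, 0] and [3, 0, 0] with weights 1 and 3 give [2.5, 0, 0], so the estimate from the better-matched gazing point dominates.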

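[0038]
The indices S₁ to S_n can be, for example, cross-correlation values of the kind used in ordinary image processing. For example, as shown in FIG. 11, when the corresponding point 16ai (x_i0, y_i0) of the i-th gazing point 13i of the subject 9 on the image plane 14 taken at the first viewpoint 11 is matched with the corresponding point 16bi (x_i0 + dx, y_i0 + dy) of the i-th gazing point 13i on the image plane 15 taken at the second viewpoint 12 by block matching (template matching) with a correlation window 17 of (2N+1) × (2P+1), the index S_i is calculated by equation (5) below.
[0039]
[Equation 5]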

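[0040]
In equation (5), I₁(x, y) is the density at the corresponding point 16a (x, y) on the image plane 14, I₂(x, y) is the density at the corresponding point 16b (x, y) on the image plane 15, I₁d(x, y) is the mean density in the (2N+1) × (2P+1) correlation window 17 centered on the corresponding point 16a (x, y) on the image plane 14, I₂d(x, y) is the mean density in the (2N+1) × (2P+1) correlation window 17 centered on the corresponding point 16b (x, y) on the image plane 15, and K is a constant.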
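Equation (5) itself is not reproduced here, so its exact form, including the role of the constant K, cannot be recovered from this text. As a stand-in, the Python sketch below computes a standard zero-mean normalized cross-correlation over a (2N+1) × (2P+1) window, which uses the same ingredients the paragraph lists (densities in both images and the window means). Treating N as the horizontal half-width and P as the vertical half-width is an assumption, and the function names are hypothetical.

```python
import math

def window(img, cx, cy, n, p):
    """Densities in the (2n+1) x (2p+1) window centered on (cx, cy)."""
    return [img[y][x]
            for y in range(cy - p, cy + p + 1)
            for x in range(cx - n, cx + n + 1)]

def matching_index(img1, pt1, img2, pt2, n, p):
    """Zero-mean normalized cross-correlation between two windows.

    Returns a value in [-1, 1]; 1 means the windows match up to
    brightness offset and gain. Not necessarily the patent's eq. (5).
    """
    w1 = window(img1, pt1[0], pt1[1], n, p)
    w2 = window(img2, pt2[0], pt2[1], n, p)
    mean1 = sum(w1) / len(w1)
    mean2 = sum(w2) / len(w2)
    num = sum((a - mean1) * (b - mean2) for a, b in zip(w1, w2))
    den = math.sqrt(sum((a - mean1) ** 2 for a in w1) *
                    sum((b - mean2) ** 2 for b in w2))
    return num / den if den > 0 else 0.0
```

Images are plain nested lists indexed as img[y][x], and the caller is responsible for keeping the window inside the image bounds.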
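[0041]
The correspondence detection means 6 uses the translation component D obtained by equations (5) and (4) and the attitude information of the image input means 2 at the viewpoints 11 and 12 to match feature points between the two images (step S16). The three-dimensional calculation means 7 calculates and restores the three-dimensional structure of the subject 9 by the principle of triangulation from the matching result of the correspondence detection means 6, the translation component D, and the attitude information (step S17). The position and attitude information, three-dimensional information, and image data obtained in this way are recorded and saved in the storage means 8 as needed (steps S18, S19).
[0042]
Because a plurality of gazing points 13a to 13n are set on the subject 9, the final translation component D is determined from the translation components D₁ to D_n calculated from the individual gazing points 13a to 13n, and the three-dimensional structure of the subject is restored with the determined translation component D, a more accurate three-dimensional shape can be restored.
[0043]
The methods for matching the gazing points 13a to 13n and for calculating the indices S are not limited to those described above; various methods can be adopted.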
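[0044]
In each of the embodiments above, the gazing point 13 of the subject 9 is matched first and the distance detection means 3 then measures the distance to the gazing point 13. Alternatively, a fixed region of the frame shot by the image input means 2 may be used as the gaze region.
[0045]
FIG. 12 is a block diagram showing the configuration of a three-dimensional shape restoration apparatus 1b according to a third embodiment, which calculates the translation component with a fixed region of the frame used as the gaze region. As shown in the figure, in addition to the image input means 2, distance detection means 3, attitude detection means 4, translation component calculation means 5, correspondence detection means 6, and three-dimensional calculation means 7, the three-dimensional shape restoration apparatus 1b has gaze region verification means 22. The gaze region verification means 22 detects that the part of the subject 9 appearing in the gaze region, a fixed position on the image plane 14 taken at the first viewpoint 11 and on the image plane 15 taken at the second viewpoint 12, is the same and appears at almost the same position.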
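[0046]
The operation of the three-dimensional shape restoration apparatus 1b configured as above when, as shown in FIG. 2, the same subject 9 is photographed at the first viewpoint 11 and the second viewpoint 12 to restore its three-dimensional shape is described with reference to the flowchart of FIG. 13.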
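[0047]
First, at the first viewpoint 11, the direction of the image input means 2 is adjusted so that the subject 9 appears at approximately the center of the image plane 14, as shown in FIG. 14, and the subject 9 is photographed. The position on the subject 9 that corresponds to the center of the image plane 14 is taken as the gazing point, the distance detection means 3 measures the distance from the first viewpoint 11 to the gazing point, and the attitude detection means 4 measures the attitude of the image input means 2 at the first viewpoint 11 (step S21). The gaze region verification means 22 stores the image of the subject 9 within the gaze region 18, a fixed range that includes the center of the captured image plane 14 (step S22). The image input means 2 is then moved and the subject 9 is photographed at the second viewpoint 12. The position on the subject 9 that corresponds to the center of the image plane 15 is taken as the gazing point, the distance detection means 3 measures the distance from the second viewpoint 12 to the gazing point, and the attitude detection means 4 measures the attitude of the image input means 2 at the second viewpoint 12 (step S23). The gaze region verification means 22 checks the image taken at the second viewpoint 12 and detects whether the image of the subject 9 within the gaze region 18 of the image plane 14 taken at the first viewpoint 11 is contained in the gaze region 18 of the image plane 15 taken at the second viewpoint 12, and by how much it is displaced. For example, it measures the displacement of the subject 9 by cross-correlating the images within the gaze region 18 of the image plane 14 and the image plane 15. If the displacement is at or above a predetermined threshold, it judges that the images within the gaze region 18 of the two image planes agree poorly and instructs the photographer to change the direction of the image input means 2 at the second viewpoint 12 and shoot again (steps S24, S25). When the image of the subject 9 within the gaze region 18 of the image plane 14 is contained in the gaze region 18 of the image plane 15, the translation component calculation means 5 calculates the translation component D of the image input means 2 between the viewpoints 11 and 12 from the image data and distance information at the first viewpoint 11 and the second viewpoint 12 and the attitude information of the image input means 2 (step S26). In this calculation, the unit line-of-sight vector m from the first viewpoint 11 to the gazing point and the unit line-of-sight vector IRm₁ from the second viewpoint 12 to the gazing point are given by equation (6) below, so the translation component D can be calculated with little computation.
[0048]
[Equation 6]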
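Equation (6) is not reproduced in this text. Since the gazing point sits at the image center at both viewpoints, both line-of-sight vectors should coincide with the optical axis. A plausible reconstruction, assuming unit vectors of the form (x, y, f) normalized to length 1 with x = y = 0 and a translation component of the form D = L₁m − L₂·IR·m₁, is:

```latex
\mathbf{m} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},
\qquad
IR\,\mathbf{m}_1 = IR \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},
\qquad\text{so that}\qquad
\mathbf{D} = L_1 \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
           - L_2\, IR \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.
```

Under this reading, IR·m₁ is simply the third column of IR, so neither the focal length nor the image coordinates of the gazing point enter the calculation.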

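[0049]
The correspondence detection means 6 uses the calculated translation component D and the attitude information of the image input means 2 at the viewpoints 11 and 12 to match feature points between the two images (step S27). The three-dimensional calculation means 7 calculates and restores the three-dimensional structure of the subject 9 by the principle of triangulation from the matching result of the correspondence detection means 6, the translation component D, and the attitude information (step S28). The resulting position and attitude information, three-dimensional information, and image data are recorded and saved in the storage means 8 as needed (steps S29, S30).
[0050]
In the embodiment above, a warning and a reshoot instruction are issued when the image of the subject 9 within the gaze region 18 of the image plane 14 taken at the first viewpoint 11 is not contained in the gaze region 18 of the image plane 15 taken at the second viewpoint 12. Alternatively, an indicator showing whether the correlation of the images in the gaze region 18 is high or low may be provided in the viewfinder before shooting at the second viewpoint 12. The gaze region 18 may also be set to any region other than the center of the image plane.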
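[0051]
In the embodiment above, whether the image of the subject 9 within the gaze region 18 of the image plane 14 taken at the first viewpoint 11 is contained in the gaze region 18 of the image plane 15 taken at the second viewpoint 12 is judged by cross-correlating the images within the gaze region 18 of the two image planes to measure the displacement of the subject 9 and comparing the measured displacement with a predetermined threshold. The shooting conditions can be set more flexibly by varying this displacement threshold according to optical system parameters of the image input means 2 such as the focal length. FIG. 15 shows the configuration of a three-dimensional shape restoration apparatus 1c according to a fourth embodiment, which varies the displacement threshold according to the optical system parameters of the image input means 2. As shown in FIG. 15, the three-dimensional shape restoration apparatus 1c has the image input means 2, distance detection means 3, attitude detection means 4, translation component calculation means 5, correspondence detection means 6, three-dimensional calculation means 7, and gaze region verification means 22, together with gaze region adjustment means 23 that variably sets the threshold for the displacement of the image within the gaze region 18. When the subject 9 is photographed with the image input means 2, a longer focal length, for example, narrows the angle of view, so even a slight change in the direction of the image input means 2 produces a large displacement on the image plane. The gaze region adjustment means 23 therefore adjusts the displacement threshold according to the optical system parameters, for example setting a large threshold for the displacement of the images within the gaze region 18 of the image plane 14 and the image plane 15 when the focal length is long and a small threshold when the focal length is short. The images within the gaze region 18 of the image plane 14 and the image plane 15 can thus be compared with an optimal threshold, which further improves the accuracy of three-dimensional shape restoration.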
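The threshold adjustment can be sketched in Python as follows. The patent states only that a longer focal length gets a larger threshold; the linear scaling used here is an assumption, motivated by the fact that under a pinhole model a small camera rotation θ shifts the image by roughly f·tan θ. The reference values and function names are hypothetical.

```python
def displacement_threshold(focal_length_mm,
                           base_threshold_px=8.0,
                           base_focal_length_mm=35.0):
    """Scale the allowed gaze-region displacement with focal length.

    Linear scaling and the reference values are illustrative
    assumptions, not values taken from the patent.
    """
    if focal_length_mm <= 0:
        raise ValueError("focal length must be positive")
    return base_threshold_px * focal_length_mm / base_focal_length_mm

def needs_reshoot(displacement_px, focal_length_mm):
    """True when the measured displacement reaches the threshold."""
    return displacement_px >= displacement_threshold(focal_length_mm)
```

With these reference values, a 10-pixel displacement triggers a reshoot at 35 mm but is accepted at 70 mm, where the threshold doubles to 16 pixels.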
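[0052]
In each of the embodiments above, acceleration sensors that measure acceleration along three orthogonal axes are used as the attitude detection means 4 to detect the attitude of the image input means 2 relative to the direction of gravity when the subject 9 is photographed at the first viewpoint 11 and the second viewpoint 12, but a magnetic sensor may be used for the attitude detection means 4 instead. The magnetic field detected may be the geomagnetic field or an artificially generated field. If a magnetic sensor capable of detection along three orthogonal axes is used for the attitude detection means 4, the attitude of the image input means 2 relative to the magnetic field direction can be detected with high accuracy when the subject 9 is photographed from the two viewpoints 11 and 12 with the apparatus stationary, as shown in FIG. 2. If acceleration sensors that measure acceleration along three orthogonal axes are combined with a magnetic sensor capable of detection along two orthogonal axes, the attitude of the image input means 2 can be detected completely.
[0053]
An angular velocity sensor may also be used as the attitude detection means 4. If angular velocity sensors are installed so as to correspond to the rotation angles to be detected, the rotation angles can be calculated by integrating the sensor outputs. When the subject is photographed from a plurality of viewpoints while moving, the change in attitude of the image input means 2 between viewpoints can therefore be detected easily. In addition, when the angular velocity sensor is used together with three-axis acceleration sensors and a magnetic sensor, the offset component of the angular velocity sensor can be corrected from the attitude obtained by the acceleration sensors and the magnetic sensor while the image input means 2 is stationary or moving very slowly.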
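[0054]
In each of the embodiments above, the subject 9 is photographed and the distance and attitude information is obtained with the image input means 2 held stationary at the first viewpoint 11 and the second viewpoint 12, but the subject 9 may also be photographed and the distance and attitude information obtained while the image input means 2 is moving. Likewise, although the case of restoring the three-dimensional shape from the first viewpoint 11 and the second viewpoint 12 has been described, the invention applies in the same way when the subject 9 is photographed from three or more viewpoints.
[0055]
Furthermore, although the embodiments above process the data in real time as the subject 9 is photographed, the images, distance information, and attitude information obtained at each viewpoint may instead be stored together in the storage means and processed offline later. The images, distance information, and attitude information obtained at each viewpoint may also be transferred over a network or the like for processing.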
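[0056]
[Effects of the invention]
As described above, the present invention calculates the translation component of the image input means between viewpoints from the image data at the different viewpoints, the distance information to a single point on the subject, and the attitude information of the image input means at each viewpoint, so the shooting position and attitude of the image input means can be detected accurately and the three-dimensional shape can be restored accurately with a small amount of computation.
[0057]
By setting a plurality of gazing points on the subject, determining the final translation component from the plurality of translation components calculated from the individual gazing points, and restoring the three-dimensional structure of the subject with the determined translation component, a more accurate three-dimensional shape can be restored.
[0058]
Furthermore, by using a fixed region of the frame shot by the image input means as the gaze region and ensuring that the same part of the subject falls within the gaze region when the subject is photographed from the different viewpoints, the translation component of the image input means between viewpoints can be calculated with less computation, and the three-dimensional shape can be restored accurately with a simple configuration.
[0059]
By adjusting the threshold used to judge whether the same part of the subject lies within the gaze region according to optical system parameters such as the focal length of the image input means, the shooting conditions can be set flexibly and the accuracy of three-dimensional shape restoration can be further improved.
[0060]
By using acceleration sensors that measure acceleration along three orthogonal axes as the attitude detection means that detects the attitude of the image input means at each viewpoint, the direction of gravity can be detected when the subject is photographed with the apparatus stationary, the attitude of the image input means relative to the direction of gravity can be detected with high accuracy, and the accuracy of three-dimensional shape restoration can be improved.
[0061]
By using a magnetic sensor for the attitude detection means and detecting the geomagnetic direction or an artificially generated magnetic field during stationary shooting, the attitude of the image input means can be measured with high accuracy and the three-dimensional shape can be restored accurately.
[0062]
Furthermore, by using an angular velocity sensor for the attitude detection means and detecting the rotational angular velocity, the dynamic attitude of the image input means in particular can be measured with high accuracy, and the three-dimensional shape can be restored accurately when the subject is photographed while moving.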
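[Brief description of the drawings]
[FIG. 1] A block diagram showing the configuration of an embodiment of the present invention.
[FIG. 2] A layout diagram showing the shooting positions relative to the subject.
[FIG. 3] A flowchart showing the operation of the above embodiment.
[FIG. 4] A screen view showing images taken at different viewpoints.
[FIG. 5] An explanatory diagram showing the different viewpoints, the subject, and the image planes.
[FIG. 6] A configuration diagram of image input means following the central projection model.
[FIG. 7] An explanatory diagram showing the error between the unit line-of-sight vectors from different viewpoints.
[FIG. 8] A block diagram showing the configuration of the second embodiment.
[FIG. 9] A layout diagram showing a subject with a plurality of gazing points and the shooting positions.
[FIG. 10] A flowchart showing the operation of the second embodiment.
[FIG. 11] A screen view showing the matching of one point between images taken at different viewpoints.
[FIG. 12] A block diagram showing the configuration of the third embodiment.
[FIG. 13] A flowchart showing the operation of the third embodiment.
[FIG. 14] A screen view showing the gaze region of the image plane.
[FIG. 15] A block diagram showing the configuration of the fourth embodiment.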
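[Explanation of reference numerals]
1 Three-dimensional shape restoration apparatus
2 Image input means
3 Distance detection means
4 Attitude detection means
5 Translation component calculation means
6 Correspondence detection means
7 Three-dimensional calculation means
9 Subject
11 First viewpoint
12 Second viewpoint
13 Gazing point
18 Gaze region
21 Translation component determination means
22 Gaze region verification means
23 Gaze region adjustment means
[0001]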
BACKGROUND OF THE INVENTION
  The present invention relates to a position/attitude detection apparatus and method that detect the position and attitude of a camera at the time of shooting from a plurality of consecutive images, and to a three-dimensional shape restoration apparatus and method that restore the three-dimensional shape of a photographed object, and in particular to achieving highly accurate three-dimensional shape restoration with a small amount of computation.
[0002]
[Prior art]
Research to restore the three-dimensional shape of an object is underway in various fields, including vision of autonomous mobile robots. In particular, in recent years, computers and electronic devices have rapidly spread due to dramatic advances in electronic technology, and it has become possible to easily enjoy stereoscopic display of three-dimensional information. On the other hand, the development of technology for restoring the three-dimensional structure of real-world objects and scenes is expected.
[0003]
Methods for measuring the distance to an object and its shape, in order to restore the three-dimensional structure of a real-world object, divide into active methods, which irradiate the object with light waves or ultrasonic waves, and passive methods, typified by stereo imaging. Active methods include irradiating the object with a wave such as light, radio waves, or sound and measuring the propagation time of the reflected wave to obtain the distance to the object, and light projection, in which slit light, spot light, or the like with a specific pattern is projected onto the object from a light source whose position relative to the camera is known and the distortion of the pattern is observed to obtain the shape of the object. Active methods generally make the apparatus hard to miniaturize, but they can measure distance quickly and with high accuracy.
[0004]
Passive methods, on the other hand, divide broadly into multi-view stereo and motion stereo. In multi-view stereo, the object is photographed with a plurality of cameras whose mutual positions and attitudes are known, feature points or regions are matched between the captured images, and the three-dimensional shape of the object is calculated by the principle of triangulation. This method tends to produce large distance measurement errors when matching errors arise from noise superimposed on the images or when sufficient parallax cannot be obtained. In motion stereo, the object is photographed while a single camera is moved, consecutive images are matched, and the position and attitude of the camera and the three-dimensional shape of the object are calculated. This method shares the problems of multi-view stereo and, unlike multi-view stereo, the camera position and attitude between images are unknown, so a complex system of nonlinear equations generally has to be solved iteratively. The amount of computation is therefore enormous and the solution tends to be unstable. To address these problems of passive methods, apparatuses that restore a three-dimensional shape at low computational cost by using a distance sensor, acceleration sensor, angular velocity sensor, magnetic sensor, or the like in addition to images are disclosed in, for example, JP-A-5-196437, JP-A-7-181024, and JP-A-9-81790.
[0005]
The apparatus disclosed in JP-A-5-196437 photographs one point of a subject with an orthographic-projection camera, obtains the attitude of the camera at that time with a three-axis gyro, and extracts three-dimensional information on the subject by a voting method. The apparatus disclosed in JP-A-7-181024 has movement amount detection means for detecting the amount of camera movement, uses the detected movement amount as the baseline length, and restores the three-dimensional shape of the subject from this baseline length and the result of a corresponding-point search on the image data, thereby reducing the size and weight of three-dimensional shape measuring apparatuses, which tend to be large. As the movement amount detection means, the movement of the image input means is measured directly with an angular velocity sensor that uses inertial force, or the movement of the person taking the measurement is detected with a magnetic sensor, ultrasonic sensor, optical fiber sensor, pressure sensor, or the like and the movement amount of the image input means is calculated from it. The apparatus disclosed in JP-A-9-81790 detects camera motion with an angle sensor and an acceleration sensor and adjusts the optical axis direction at each viewpoint so that the optical axes from different viewpoints intersect at an arbitrary point. This allows the viewpoints to be chosen freely at shooting time, gives the images from the viewpoints common coordinate axes so that they are easy to match, and reduces the processing load of restoring the three-dimensional shape, which raises the processing speed.
[0006]
[Problems to be solved by the invention]
However, an approach that presupposes orthographic projection, as in JP-A-5-196437, is not accurate enough for extracting three-dimensional information from images taken with a camera that follows the central projection model. When the amount of camera movement is calculated with sensors such as an angular velocity sensor, as in JP-A-7-181024, the signals from the sensors have to be analyzed to compute the movement amount, so the error component of the movement amount accumulates. The apparatus disclosed in JP-A-9-81790 detects the subject by comparing a motion vector estimated from sensor information on camera motion and a preset object-to-camera distance with a motion vector obtained by image processing. Because the object-to-camera distance is preset, the three-dimensional shape can be restored only under specific shooting conditions. In addition, a drive mechanism for changing the direction of the optical axis is required, which complicates the structure of the apparatus.
[0007]
  The present invention has been made to solve these problems. Its object is to obtain a position/attitude detection apparatus and method that detect the position and attitude of a camera under arbitrary shooting conditions, and a three-dimensional shape restoration apparatus and method that reduce the computational load and achieve highly accurate three-dimensional shape restoration.
[0008]
[Means for Solving the Problems]
  The position/attitude detection apparatus according to the present invention comprises image input means, distance detection means, attitude detection means, and translation component calculation means. The image input means inputs images of a subject while the shooting position and viewpoint are changed. The distance detection means detects the distance from each viewpoint to a gazing point of the subject that corresponds to one specific point on the plurality of images obtained from the image input means. The attitude detection means calculates the attitude of the image input means at each viewpoint. The translation component calculation means calculates the translation component of the image input means between viewpoints from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means.
[0009]
  In the position / posture detection method according to the present invention, the image of the subject is input by the image input means while changing the shooting position and the viewpoint, and it corresponds to one specific point on a plurality of images obtained by changing the viewpoint. The distance from each viewpoint to the gazing point of the subject is detected, the attitude of the image input means at each viewpoint is calculated, and the viewpoint is determined from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means. The translation component of the image input means when it is changed is calculated.
[0010]
  The three-dimensional shape restoration apparatus according to the present invention comprises image input means, distance detection means, attitude detection means, translation component calculation means, correspondence detection means, and three-dimensional calculation means. The image input means inputs images of a subject while the shooting position and viewpoint are changed. The distance detection means detects the distance from each viewpoint to a gazing point of the subject that corresponds to one or more specific points on the plurality of images obtained from the image input means. The attitude detection means calculates the attitude of the image input means at each viewpoint. The translation component calculation means calculates the translation component of the image input means between viewpoints from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means. The correspondence detection means matches the plurality of images obtained at the different viewpoints using the translation component and attitude information of the image input means. The three-dimensional calculation means calculates the three-dimensional shape of the subject from the matching result and the position and attitude information of the image input means.
[0011]
  Another three-dimensional shape restoration apparatus according to the present invention comprises image input means, distance detection means, attitude detection means, gaze region verification means, translation component calculation means, correspondence detection means, and three-dimensional calculation means. The image input means inputs images of a subject while the shooting position and viewpoint are changed. The distance detection means detects the distance from each viewpoint to a gazing point of the subject that corresponds to a point within a specific gaze region on the plurality of images obtained from the image input means. The attitude detection means calculates the attitude of the image input means at each viewpoint. The gaze region verification means confirms that the displacement of the subject within the gaze region between the images taken at the different viewpoints is at or below a predetermined threshold. The translation component calculation means calculates the translation component of the image input means between viewpoints from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means. The correspondence detection means matches the plurality of images obtained at the different viewpoints using the translation component and attitude information of the image input means. The three-dimensional calculation means calculates the three-dimensional shape of the subject from the matching result and the position and attitude information of the image input means.
[0012]
  It is desirable to have a gaze area adjustment means for adjusting the threshold value in the gaze area verification means according to the optical system parameters of the image input means.
[0013]
  In addition, an acceleration sensor, a magnetic sensor, or an angular velocity sensor may be used alone or in combination for the posture detection means.
[0014]
  The three-dimensional shape restoration method according to the present invention inputs images of a subject with image input means while the shooting position and viewpoint are changed, detects the distance from each viewpoint to a gazing point of the subject that corresponds to one or more specific points on the plurality of images obtained at the different viewpoints, calculates the attitude of the image input means at each viewpoint, calculates the translation component of the image input means between viewpoints from the image information at each viewpoint, the distance information to the one or more gazing points, and the attitude information of the image input means, matches the plurality of images obtained at the different viewpoints using the translation component and attitude information of the image input means, and calculates the three-dimensional shape of the subject from the matching result and the position and attitude information of the image input means.
[0015]
  Another three-dimensional shape restoration method according to the present invention inputs images of a subject with image input means while the shooting position and viewpoint are changed, detects the distance from each viewpoint to a gazing point of the subject that corresponds to a point within a specific gaze region on the plurality of images obtained at the different viewpoints, calculates the attitude of the image input means at each viewpoint, confirms that the displacement of the subject within the gaze region between the images taken at the different viewpoints is at or below a predetermined threshold, calculates the translation component of the image input means between viewpoints from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means, matches the plurality of images obtained at the different viewpoints using the translation component and attitude information of the image input means, and calculates the three-dimensional shape of the subject from the matching result and the position and attitude information of the image input means. It is desirable to adjust the threshold according to the optical system parameters of the image input means.
[0016]
DETAILED DESCRIPTION OF THE INVENTION
The three-dimensional shape restoration apparatus of the present invention has image input means, distance detection means, attitude detection means consisting of acceleration sensors for three orthogonal axes, translation component calculation means, correspondence detection means, and three-dimensional calculation means. When the same subject is photographed from two different viewpoints to restore its three-dimensional shape, the photographer first decides on a gazing point, a single point on the subject to which the distance will be measured from the first and second viewpoints. Once the gazing point is decided, the subject is photographed with the image input means at the first viewpoint, the distance detection means measures the distance from the first viewpoint to the gazing point, and the attitude detection means measures the attitude of the image input means at the first viewpoint. The image input means is then moved and the subject is photographed at the second viewpoint, the distance detection means measures the distance from the second viewpoint to the gazing point, and the attitude detection means measures the attitude of the image input means at the second viewpoint. The translation component calculation means calculates the translation component of the image input means between the viewpoints from the image data taken at each viewpoint, the distance from each viewpoint to the gazing point, and the attitude information of the image input means at each viewpoint. The correspondence detection means uses the translation component and attitude information of the image input means to match feature points between the two images taken at the different viewpoints. The three-dimensional calculation means calculates and restores the three-dimensional structure of the subject by the principle of triangulation from the matching result of the correspondence detection means, the translation component, and the attitude information.
[0017]
Thus, the translation component of the image input means when the viewpoint is changed is calculated from the distance from the first viewpoint to the gazing point, the distance from the second viewpoint to the gazing point, and the attitude information of the image input means at each viewpoint. The three-dimensional shape can therefore be accurately restored with a small amount of computation.
[0018]
In addition, since acceleration sensors that measure acceleration along three orthogonal axes are used as the posture detection means, the direction of gravity can be detected when the subject is photographed from the two viewpoints in a stationary state, and the attitude of the image input means with respect to the direction of gravity can be detected with high accuracy.
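The gravity-based attitude detection described above can be illustrated with a minimal sketch. The function name `tilt_from_gravity`, the axis convention (a level device reads roughly (0, 0, +g)), and the pitch/roll parameterization are assumptions made for illustration and are not taken from the patent text.

```python
import math

def tilt_from_gravity(ax, ay, az):
    """Return (pitch, roll) in radians from a static 3-axis accelerometer reading.

    Assumption: the device is stationary, so the reading contains only the
    reaction to gravity, and a level device reads about (0, 0, +g).
    Rotation about the gravity axis (heading) cannot be observed this way.
    """
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    if norm == 0.0:
        raise ValueError("zero acceleration vector; direction of gravity undefined")
    # Normalize so that only the direction of gravity matters.
    ax, ay, az = ax / norm, ay / norm, az / norm
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.sqrt(ay * ay + az * az))
    return pitch, roll
```

Because the heading about the gravity axis is not observable from this reading alone, a second reference direction (for example a magnetic sensor, as discussed later in the description) is needed to fix the attitude completely.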
[0019]
Further, by setting a plurality of gazing points on the subject, determining a final translation component from the plurality of translation components calculated for the respective gazing points, and restoring the three-dimensional structure of the subject using the determined translation component, a more accurate three-dimensional shape can be restored.
[0020]
Furthermore, if a fixed area of the screen captured by the image input means is fixed as the gaze area and the subject is photographed from different viewpoints so that the same part of the subject falls within the gaze area, the translation component of the image input means when the viewpoint is changed can be calculated with less computation.
[0021]
【Example】
FIG. 1 is a block diagram showing the configuration of an embodiment of the present invention. As shown in the figure, the three-dimensional shape restoration apparatus 1 includes an image input means 2 composed of, for example, a digital camera, a distance detection means 3, a posture detection means 4, a translational component calculation means 5, a correspondence detection means 6, and a three-dimensional calculation means 7. The distance detection means 3 detects the distance from the image input means 2 to a certain gazing point on the subject using, for example, an infrared stereo method based on the principle of triangulation, a method that projects a wave such as an ultrasonic wave and measures the distance from the propagation time of the reflection from the object, or a method that obtains distance information at the time of focusing from an encoder in the optical system. The posture detection means 4 is composed of, for example, orthogonal three-axis acceleration sensors, detects the direction of gravity when the subject is photographed while the image input means 2 is stationary, and thereby measures the posture of the image input means 2. The translation component calculation means 5 calculates the translation component of the image input means 2 when the viewpoint is changed from the distance information to the gazing point of the subject and the attitude information of the image input means 2. The correspondence detection means 6 associates specific points between the plurality of images obtained by changing the viewpoint using the translation component of the image input means 2 and the posture information. The three-dimensional calculation means 7 calculates the three-dimensional shape of the subject based on the correspondence result of the correspondence detection means 6 and the position and orientation information of the image input means 2, and outputs the result to a storage means 8 such as a hard disk.
[0022]
As shown in FIG. 2, the three-dimensional shape restoration apparatus 1 configured as described above restores the three-dimensional shape of the subject 9 by photographing the same subject 9 from the first viewpoint 11 and the second viewpoint 12. The operation at this time will be described with reference to the flowchart of FIG. 3.
[0023]
Before photographing the subject 9 with the image input means 2, the photographer determines the gazing point 13, that is, the point on the subject 9 to which the distance from the first viewpoint 11 and the second viewpoint 12 is measured (step S1). The gazing point 13 can be determined by various methods, for example by automatically selecting a small region, such as an edge portion, that shows a characteristic density distribution on the subject 9, as shown in FIG. 2. When the gazing point 13 is determined, the subject 9 is photographed by the image input means 2 at the first viewpoint 11, and the corresponding point 16a corresponding to the position of the gazing point 13 is specified on the image plane 14 photographed at the first viewpoint 11, as shown in the screen diagram of FIG. 4. Further, the distance L₁ from the first viewpoint 11 to the gazing point 13 is measured by the distance detection means 3, and the posture of the image input means 2 at the first viewpoint 11 is measured by the posture detection means 4 (step S2). Next, the image input means 2 is moved to photograph the subject 9 at the second viewpoint 12, the corresponding point 16b corresponding to the position of the gazing point 13 is specified on the image plane 15 photographed at the second viewpoint 12, as shown in FIG. 4, the distance L₂ from the second viewpoint 12 to the gazing point 13 is measured by the distance detection means 3, and the attitude of the image input means 2 at the second viewpoint 12 is measured by the posture detection means 4 (step S3).
[0024]
Here, as shown in the explanatory diagram of FIG. 5, at the first viewpoint 11 the x-axis and y-axis are taken on the image plane 14 in mutually orthogonal directions and the z-axis is taken along the optical axis. Let the coordinates of the corresponding point 16a on the image plane 14 be (x, y) and the coordinates of the corresponding point 16b on the image plane 15 be (x₁, y₁). The rotation matrix R representing the relative attitude of the image input means 2 between the first viewpoint 11 and the second viewpoint 12 with respect to this xyz coordinate system can then be expressed by the following equation (1), where α, β, and γ are the rotation angles around the x-axis, y-axis, and z-axis, respectively, when the image input means 2 moves from the first viewpoint 11 to the second viewpoint 12.
[0025]
[Expression 1]
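Equation (1) is not reproduced in this text. As a hedged illustration only, a rotation matrix can be composed from the three elementary rotations about the x-, y-, and z-axes; the multiplication order Rz·Ry·Rx used here is an assumption and may differ from the order in the original equation. The helper names `matmul`, `transpose`, and `rotation_matrix` are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    """Transpose of a 3x3 matrix; for a rotation matrix this is its inverse."""
    return [list(row) for row in zip(*m)]

def rotation_matrix(alpha, beta, gamma):
    """Rotation by alpha, beta, gamma (radians) about the x-, y-, z-axes.

    Assumption: the elementary rotations are composed as Rz @ Ry @ Rx.
    """
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    rx = [[1.0, 0.0, 0.0], [0.0, ca, -sa], [0.0, sa, ca]]
    ry = [[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]]
    rz = [[cg, -sg, 0.0], [sg, cg, 0.0], [0.0, 0.0, 1.0]]
    return matmul(rz, matmul(ry, rx))
```

Whatever the composition order, R is orthonormal, so the inverse matrix used later in the description can be obtained as `transpose(rotation_matrix(alpha, beta, gamma))` without a general matrix inversion.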

[0026]
This rotation matrix R can be obtained from the posture information of the image input means 2 detected by the posture detection means 4, and the distance L₁ from the first viewpoint 11 to the gazing point 13 and the distance L₂ from the second viewpoint 12 to the gazing point 13 can be obtained by the distance detection means 3. Therefore, if the optical system parameters such as the focal length f of the image input means 2 are known, the direction of the line of sight from the first viewpoint 11 to the corresponding point 16a (x, y) on the image plane 14 and the direction of the line of sight from the second viewpoint 12 to the corresponding point 16b (x₁, y₁) on the image plane 15 can be obtained. For example, as shown in FIG. 6, when the image input means 2 follows a central projection model, if the three-dimensional coordinate system is taken with reference to the first viewpoint 11 and the inverse matrix of the rotation matrix R is denoted IR, the unit line-of-sight vector m from the first viewpoint 11 to the gazing point 13 and the unit line-of-sight vector IRm₁ from the second viewpoint 12 to the gazing point 13 can be expressed by the following equation (2).
[0027]
[Expression 2]
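Equation (2) is not reproduced in this text. Under the central projection model named above, with the optical axis along z and focal length f, a unit line-of-sight vector through an image point (x, y) is commonly written as (x, y, f) divided by its norm. The sketch below assumes that form; the function name `unit_line_of_sight` is hypothetical, and x, y, and f must share the same unit.

```python
import math

def unit_line_of_sight(x, y, f):
    """Unit vector from the viewpoint through image point (x, y).

    Assumption: central projection with the image plane at distance f
    along the z-axis (optical axis), f > 0.
    """
    norm = math.sqrt(x * x + y * y + f * f)
    return (x / norm, y / norm, f / norm)
```

With this helper, m would be `unit_line_of_sight(x, y, f)` and m₁ would be `unit_line_of_sight(x1, y1, f)`; m₁ is expressed in the coordinate system of the second viewpoint and is mapped into that of the first viewpoint by multiplying by IR.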

[0028]
Therefore, from the distance L₁ from the first viewpoint 11 to the gazing point 13, the distance L₂ from the second viewpoint 12 to the gazing point 13, and the rotation matrix R, the translation component D when the image input means 2 is moved from the first viewpoint 11 to the second viewpoint 12 can be calculated by the following equation (3).
[0029]
[Equation 3]
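Equation (3) is not reproduced in this text. A form consistent with the surrounding description follows from expressing the gazing point in the coordinate system of the first viewpoint in two ways: as L₁m, and as D + L₂·IR·m₁. This gives D = L₁m − L₂·IR·m₁. The sketch below assumes that form; the function name and argument layout are hypothetical.

```python
def translation_component(l1, m, l2, ir, m1):
    """Translation D of the camera from viewpoint 1 to viewpoint 2.

    Assumed form: D = L1 * m - L2 * (IR @ m1), in viewpoint-1 coordinates.
    l1, l2 : measured distances to the gazing point
    m, m1  : unit line-of-sight vectors at viewpoints 1 and 2
    ir     : 3x3 inverse of the rotation matrix R (nested lists)
    """
    # Rotate the second line-of-sight vector into the first viewpoint's frame.
    ir_m1 = [sum(ir[i][k] * m1[k] for k in range(3)) for i in range(3)]
    return [l1 * m[i] - l2 * ir_m1[i] for i in range(3)]
```

For example, with no rotation, a gazing point 10 units straight ahead of the first viewpoint, and a second viewpoint 3 units to the side, the function returns approximately (3, 0, 0).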

[0030]
Accordingly, the translation component calculation means 5 calculates the translation component D of the image input means 2 when the viewpoint is changed from the first viewpoint 11 to the second viewpoint 12 from the image data taken at the first viewpoint 11 and the second viewpoint 12, the distances L₁ and L₂ from the viewpoints 11 and 12 to the gazing point 13, and the posture information of the image input means 2 at the viewpoints 11 and 12 (step S4). The correspondence detection means 6 uses the translation component D and the posture information of the image input means 2 to associate feature points between the two image planes 14 and 15 photographed at the first viewpoint 11 and the second viewpoint 12 shown in FIG. 4 (step S5). Since the relative position and orientation of the image input means 2 between the images have been obtained, the epipolar constraint, which is often used as a basic constraint for solving the correspondence problem in stereo methods that capture an object with two cameras, can be used, and the association can be performed by general methods such as the correlation method, feature matching, methods using local image features such as density, and methods that compute moving regions using spatio-temporal differentiation. The three-dimensional calculation means 7 calculates and restores the three-dimensional structure of the subject 9 based on the triangulation principle from the correspondence result of the correspondence detection means 6, the translation component D, and the posture information (step S6). The position and orientation information, three-dimensional information, and image data obtained in this way are recorded and stored in the storage means 8 as necessary (steps S7 and S8).
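The description only states that the three-dimensional structure is computed by the triangulation principle. As one possible illustration, the sketch below uses midpoint triangulation, a common technique swapped in here and not necessarily the method of the patent: given the two viewpoints and the line-of-sight directions of a matched feature point, it returns the midpoint of the shortest segment joining the two rays. All names are hypothetical.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2.

    c1, c2 : ray origins (e.g. the origin and the translation component D)
    d1, d2 : ray directions in a common frame (e.g. m and IR @ m1)
    """
    def dot(u, v):
        return sum(p * q for p, q in zip(u, v))

    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is not recoverable")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [c1[i] + s * d1[i] for i in range(3)]
    p2 = [c2[i] + t * d2[i] for i in range(3)]
    # With noisy measurements the rays rarely meet; take the midpoint.
    return [(p1[i] + p2[i]) / 2.0 for i in range(3)]
```

The parallel-ray check reflects the observation in the background section that insufficient parallax leads to large distance errors.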
[0031]
Thus, the translation component D of the image input means 2 when the viewpoint is changed from the first viewpoint 11 to the second viewpoint 12 is calculated from the distance L₁ from the first viewpoint 11 to the gazing point 13, the distance L₂ from the second viewpoint 12 to the gazing point 13, and the posture information of the image input means 2 at each of the viewpoints 11 and 12. The shooting position and posture of the image input means 2 can therefore be detected with high accuracy, and the three-dimensional shape can be accurately restored with a small amount of computation.
[0032]
In addition, since acceleration sensors that measure acceleration in three orthogonal directions are used as the posture detection means 4, when the subject 9 is photographed at the two viewpoints 11 and 12 in a stationary state as shown in FIG. 2, the direction of gravity can be detected and the posture of the image input means 2 with respect to the direction of gravity can be detected with high accuracy. Further, when the subject is photographed while the image input means 2 is moving, velocity and position information (the translation component) of the image input means 2 can be obtained by integrating the acceleration signals output from the acceleration sensors. This result can also be compared with, or fused with, the translation component D of the image input means 2 calculated from the distance information from each viewpoint to the gazing point and the attitude information of the image input means 2.
[0033]
In the above embodiment, the case has been described where the translation component D of the image input means 2 when the viewpoint is changed from the first viewpoint 11 to the second viewpoint 12 is calculated from the distances L₁ and L₂ from the first viewpoint 11 and the second viewpoint 12 to a single gazing point 13 on the subject 9 and the attitude information of the image input means 2, and feature points are associated between the two image planes 14 and 15 photographed at the first viewpoint 11 and the second viewpoint 12 using the calculated translation component D and attitude information. However, owing to errors in measuring the distances from the viewpoints 11 and 12 to the gazing point 13 and errors in associating the corresponding points 16a and 16b on the two image planes 14 and 15, the computed line-of-sight vectors L₁m and L₂IRm₁ from the viewpoints 11 and 12 to the gazing point 13 may not coincide with each other at the gazing point, as shown in FIG. 7, so the translation component D obtained from a single gazing point 13 alone may contain an error. To solve this problem, it is preferable to set a plurality of gazing points 13 on the subject 9 and obtain the final translation component using the translation components calculated from the respective gazing points 13.
[0034]
FIG. 8 is a block diagram showing the configuration of the three-dimensional shape restoration apparatus 1a of the second embodiment, in which a plurality of gazing points 13a to 13n are set on the subject 9 and a final translation component is obtained using the translation components calculated from the gazing points 13a to 13n. As shown in the figure, the three-dimensional shape restoration apparatus 1a includes, in addition to the image input means 2, the distance detection means 3, the posture detection means 4, the translation component calculation means 5, the correspondence detection means 6, and the three-dimensional calculation means 7, a translation component determination means 21 provided at the stage following the translation component calculation means 5.
[0035]
As shown in FIG. 9, a plurality of gazing points 13a to 13n are set on the subject 9 and the same subject 9 is photographed from the first viewpoint 11 and the second viewpoint 12. The operation of the three-dimensional shape restoration apparatus 1a configured as described above when restoring the three-dimensional shape of the subject 9 in this way will be described with reference to the flowchart of FIG. 10.
[0036]
Before the subject 9 is photographed by the image input means 2, the photographer determines a plurality of gazing points 13a to 13n on the subject 9 to be measured from the first viewpoint 11 and the second viewpoint 12 (step S11). When the gazing points 13a to 13n are determined, the subject 9 is photographed by the image input means 2 at the first viewpoint 11, the distances from the first viewpoint 11 to the gazing points 13a to 13n are measured by the distance detection means 3, and the attitude of the image input means 2 at the first viewpoint 11 is measured by the posture detection means 4 (step S12). Because the distance detection means 3 measures the distances to the plurality of gazing points 13a to 13n, it is configured so that the distance to an arbitrary point can be measured by using an active method or a method based on distance detection at the time of focusing. When the photographing and measurement at the first viewpoint 11 are completed, the image input means 2 is moved to photograph the subject 9 at the second viewpoint 12, the distances from the second viewpoint 12 to the gazing points 13a to 13n are measured by the distance detection means 3, and the posture of the image input means 2 at the second viewpoint 12 is measured by the posture detection means 4 (step S13). The translation component calculation means 5 calculates the translation components D₁ to Dₙ of the image input means 2 between the two viewpoints 11 and 12 on the basis of equation (3) from the image data taken at the first viewpoint 11 and the second viewpoint 12, the distance information from the first viewpoint 11 and the second viewpoint 12 to the gazing points 13a to 13n, and the posture information of the image input means 2 (step S14). The translation component determination means 21 determines a final translation component D from the calculated translation components D₁ to Dₙ (step S15). The final translation component D is determined, for example, by weighted averaging as shown in the following equation (4), where S₁ to Sₙ are indices (weights) indicating the accuracy of the correspondence between the two images for the respective gazing points 13a to 13n.
[0037]
[Expression 4]
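Equation (4) is not reproduced in this text. The description specifies weighted averaging with the indices Sᵢ as weights, which suggests the form D = ΣSᵢDᵢ / ΣSᵢ. The sketch below assumes that form; the function name `fuse_translations` is hypothetical.

```python
def fuse_translations(components, weights):
    """Weighted average of per-gazing-point translation components.

    components : list of 3-vectors D_1 ... D_n
    weights    : list of reliability indices S_1 ... S_n (assumed non-negative)
    """
    if len(components) != len(weights):
        raise ValueError("one weight is required per translation component")
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    return [sum(w * d[i] for w, d in zip(weights, components)) / total
            for i in range(3)]
```

A gazing point whose correspondence is poorly correlated thus contributes little to the final D, which is the stated purpose of the index.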

[0038]
Here, the cross-correlation value commonly used in image processing is used as the indices S₁ to Sₙ. For example, as shown in FIG. 11, when the corresponding point 16ai (xᵢ₀, yᵢ₀) of the i-th gazing point 13i of the subject 9 on the image plane 14 photographed at the first viewpoint 11 and the corresponding point 16bi (xᵢ₀ + dx, yᵢ₀ + dy) of the i-th gazing point 13i of the subject 9 on the image plane 15 photographed at the second viewpoint 12 are associated by block matching (template matching) using a correlation window 17 of (2N + 1) × (2P + 1), the index Sᵢ is calculated by the following equation (5).
[0039]
[Equation 5]
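Equation (5) is not reproduced in this text, so its exact normalization is unknown. The sketch below is an assumption built from the quantities the description names: densities I₁ and I₂, the window means, a (2N + 1) × (2P + 1) correlation window, and a constant K. It computes a mean-subtracted cross-correlation divided by K; the function name and the `img[y][x]` indexing are hypothetical.

```python
def correlation_index(img1, img2, x0, y0, dx, dy, n, p, k=1.0):
    """Mean-subtracted cross-correlation between two windows, divided by k.

    img1, img2 : images as nested lists indexed img[y][x]
    (x0, y0)   : window center in img1; (x0 + dx, y0 + dy) in img2
    n, p       : half-widths, giving a (2n + 1) x (2p + 1) window
    Assumption: both windows lie entirely inside their images.
    """
    def window(img, cx, cy):
        return [img[cy + j][cx + i]
                for j in range(-p, p + 1) for i in range(-n, n + 1)]

    w1 = window(img1, x0, y0)
    w2 = window(img2, x0 + dx, y0 + dy)
    mean1 = sum(w1) / len(w1)
    mean2 = sum(w2) / len(w2)
    return sum((a - mean1) * (b - mean2) for a, b in zip(w1, w2)) / k
```

In block matching, this value would be evaluated over candidate displacements (dx, dy), and the displacement giving the largest value taken as the correspondence; the value at that displacement then serves as the weight Sᵢ.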

[0040]
In the above equation (5), I₁(x, y) is the density at the corresponding point 16a (x, y) on the image plane 14, I₂(x, y) is the density at the corresponding point 16b (x, y) on the image plane 15, Ī₁(x, y) is the average density in the correlation window 17 of (2N + 1) × (2P + 1) centered on the corresponding point 16a (x, y) on the image plane 14, Ī₂(x, y) is the average density in the correlation window 17 of (2N + 1) × (2P + 1) centered on the corresponding point 16b (x, y) on the image plane 15, and K is a constant.
[0041]
The correspondence detection means 6 associates feature points between the two images by using the translation component D obtained by the above equations (5) and (4) and the posture information of the image input means 2 at each of the viewpoints 11 and 12 (step S16). The three-dimensional calculation means 7 calculates and restores the three-dimensional structure of the subject 9 on the basis of the triangulation principle from the correspondence result of the correspondence detection means 6, the translation component D, and the posture information (step S17). The position and orientation information, three-dimensional information, and image data obtained in this way are recorded and stored in the storage means 8 as necessary (steps S18 and S19).
[0042]
In this way, a plurality of gazing points 13a to 13n are set on the subject 9, the final translation component D is determined using the translation components D₁ to Dₙ calculated from the respective gazing points 13a to 13n, and the determined translation component D is used to restore the three-dimensional structure of the subject. Therefore, a more accurate three-dimensional shape can be restored.
[0043]
Note that the method of associating a plurality of gazing points 13a to 13n and the method of calculating the index S are not limited to the above contents, and various methods can be employed.
[0044]
In each of the above embodiments, the case has been described where the distance to the gazing point 13 is measured by the distance detection means 3 after the gazing point 13 of the subject 9 has been associated between the images; alternatively, a fixed area of the screen may be fixed as the gaze area.
[0045]
FIG. 12 is a block diagram showing the configuration of the three-dimensional shape restoration apparatus 1b of the third embodiment, which calculates the translation component with a fixed area of the screen fixed as the gaze area. As shown in the figure, the three-dimensional shape restoration apparatus 1b includes an image input means 2, a distance detection means 3, a posture detection means 4, a translational component calculation means 5, a correspondence detection means 6, a three-dimensional calculation means 7, and a gaze area verification means 22. The gaze area verification means 22 detects that the same subject 9 appears at substantially the same position in the gaze area, which is a fixed position on the image plane 14 photographed at the first viewpoint 11 and on the image plane 15 photographed at the second viewpoint 12.
[0046]
The operation when the same subject 9 is photographed from the first viewpoint 11 and the second viewpoint 12 as shown in FIG. 2 and the three-dimensional shape of the subject 9 is restored by the three-dimensional shape restoration apparatus 1b configured as described above will be described with reference to the flowchart of FIG. 13.
[0047]
First, at the first viewpoint 11, the subject 9 is photographed with the orientation of the image input means 2 adjusted so that the subject 9 appears in the center of the image plane 14 of the image input means 2, as shown in FIG. 14. The position on the subject 9 corresponding to the center of the image plane 14 is set as the gazing point, the distance from the first viewpoint 11 to the gazing point is measured by the distance detection means 3, and the posture of the image input means 2 at the first viewpoint 11 is measured by the posture detection means 4 (step S21). The gaze area verification means 22 stores the image of the subject 9 in the gaze area 18, which is a certain range including the center of the photographed image plane 14 (step S22). Next, the image input means 2 is moved to photograph the subject 9 at the second viewpoint 12, the position on the subject 9 corresponding to the center of the image plane 15 photographed from the second viewpoint 12 is taken as the gazing point, the distance from the second viewpoint 12 to the gazing point is measured by the distance detection means 3, and the attitude of the image input means 2 at the second viewpoint 12 is measured by the posture detection means 4 (step S23). The gaze area verification means 22 checks the image taken at the second viewpoint 12 and detects whether the image of the subject 9 in the gaze area 18 of the image plane 14 taken at the first viewpoint 11 is included in the gaze area 18 of the image plane 15 taken at the second viewpoint 12, together with the amount of its displacement. For example, the amount of displacement of the subject 9 is measured by taking the cross-correlation of the images in the gaze areas 18 of the image plane 14 and the image plane 15; if the displacement amount is equal to or greater than a predetermined threshold, the degree of coincidence of the images in the gaze areas 18 of the image plane 14 and the image plane 15 is judged to be low, and the photographer is instructed to change the orientation of the image input means 2 at the second viewpoint 12 and re-take the image (steps S24 and S25). When the image of the subject 9 in the gaze area 18 of the image plane 14 photographed at the first viewpoint 11 is included in the gaze area 18 of the image plane 15 photographed at the second viewpoint 12, the translation component calculation means 5 calculates the translation component D of the image input means 2 between the viewpoints 11 and 12 from the image data and distance information at the first viewpoint 11 and the second viewpoint 12 and the attitude information of the image input means 2 (step S26). When calculating the translation component D, the unit line-of-sight vector m from the first viewpoint 11 to the gazing point and the unit line-of-sight vector IRm₁ from the second viewpoint 12 to the gazing point can be expressed by the following equation (6), so that the translation component D can be calculated with less computation.
[0048]
[Formula 6]
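Equation (6) is not reproduced in this text. When the gazing point projects to the image center at both viewpoints, the image coordinates are (x, y) = (x₁, y₁) = (0, 0), so under the central projection model both unit line-of-sight vectors reduce to the optical-axis direction. The following reconstruction is a sketch under that assumption; the symbols r₁₃, r₂₃, r₃₃ for the third column of IR are notation introduced here for illustration.

```latex
m = m_1 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},
\qquad
\mathrm{IR}\, m_1 = \begin{pmatrix} r_{13} \\ r_{23} \\ r_{33} \end{pmatrix},
\qquad
D = L_1 m - L_2\, \mathrm{IR}\, m_1
  = \begin{pmatrix} -L_2 r_{13} \\ -L_2 r_{23} \\ L_1 - L_2 r_{33} \end{pmatrix}
```

In this form no per-image normalization of the line-of-sight vectors is needed, and only the third column of IR enters the result, which is one way to read the statement that the translation component can be obtained with less computation.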

[0049]
The correspondence detection means 6 associates the feature points between the two images by using the calculated translation component D and the posture information of the image input means 2 at the respective viewpoints 11 and 12 (step S27). The three-dimensional calculation means 7 calculates and restores the three-dimensional structure of the subject 9 based on the triangulation principle from the correspondence result of the correspondence detection means 6, the translation component D, and the posture information (step S28). The obtained position, orientation information, three-dimensional information, and each image data are recorded and stored in the storage means 8 as necessary (steps S29 and S30).
[0050]
In the above embodiment, a warning is given when the image of the subject 9 in the gaze area 18 of the image plane 14 photographed at the first viewpoint 11 is not included in the gaze area 18 of the image plane 15 photographed at the second viewpoint 12. The case where the re-shooting is instructed has been described, but an indicator indicating that the correlation of the image in the gaze area 18 is high or low may be installed in the finder before shooting at the second viewpoint 12. Further, the gaze area 18 may be set to an arbitrary area other than the center of the image plane.
[0051]
In the above embodiment, the case has been described where, in determining whether the image of the subject 9 in the gaze area 18 of the image plane 14 photographed at the first viewpoint 11 is included in the gaze area 18 of the image plane 15 photographed at the second viewpoint 12, the displacement amount of the subject 9 is measured by taking the cross-correlation between the images in the gaze areas 18 of the image plane 14 and the image plane 15, and the measured displacement amount is compared with a predetermined threshold value. The photographing conditions can be set more flexibly by changing this displacement threshold according to optical system parameters such as the focal length of the image input means 2. FIG. 15 shows the configuration of the three-dimensional shape restoration apparatus 1c of the fourth embodiment, in which the displacement threshold is varied according to the optical system parameters of the image input means 2 in this way. As shown in FIG. 15, the three-dimensional shape restoration apparatus 1c includes an image input means 2, a distance detection means 3, a posture detection means 4, a translation component calculation means 5, a correspondence detection means 6, a three-dimensional calculation means 7, a gaze area verification means 22, and a gaze area adjusting means 23 that variably sets the threshold for the displacement amount of the image in the gaze area 18. When the subject 9 is photographed by the image input means 2, a longer focal length, for example, narrows the viewing angle, so even a slight change in the direction of the image input means 2 produces a large displacement on the image plane. Therefore, when the image input means 2 photographs the subject 9, the gaze area adjusting means 23 adjusts the displacement threshold according to the optical system parameters, for example by setting a large threshold for the displacement of the image in the gaze areas 18 of the image plane 14 and the image plane 15 when the focal length is long and a small threshold when the focal length is short. In this way, the images in the gaze areas 18 of the image plane 14 and the image plane 15 can be compared using the optimum threshold value, and the accuracy of restoring the three-dimensional shape can be further increased.
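The description does not give a rule for how the threshold depends on the focal length. One simple possibility, assumed here purely for illustration, is to fix a tolerance on the pointing error as an angle and convert it to an image-plane displacement, which grows in proportion to the focal length. The function names, the pixel pitch, and the default tolerance are all hypothetical.

```python
import math

def displacement_threshold_px(focal_length_mm, pixel_pitch_mm, angle_tolerance_rad):
    """Image displacement (pixels) produced by a pointing error of the given angle.

    Assumption: central projection, so an angular error theta shifts the
    image by about f * tan(theta) on the image plane.
    """
    return focal_length_mm * math.tan(angle_tolerance_rad) / pixel_pitch_mm

def needs_reshoot(displacement_px, focal_length_mm,
                  pixel_pitch_mm=0.01, angle_tolerance_rad=0.01):
    """True when the measured displacement reaches the focal-length-dependent threshold."""
    threshold = displacement_threshold_px(
        focal_length_mm, pixel_pitch_mm, angle_tolerance_rad)
    return displacement_px >= threshold
```

With these hypothetical defaults, a measured displacement of 30 pixels would trigger a re-shoot at a 25 mm focal length but be accepted at 50 mm, matching the direction of adjustment described above (larger threshold for longer focal length).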
[0052]
In each of the above-described embodiments, the case has been described where acceleration sensors that measure acceleration in three orthogonal directions are used as the posture detection means 4 and the attitude of the image input means 2 with respect to the direction of gravity is detected when the subject 9 is photographed at the first viewpoint 11 and the second viewpoint 12; however, a magnetic sensor may be used as the posture detection means 4. The magnetic field detected may be geomagnetism or an artificially generated magnetic field. When a magnetic sensor capable of detection along three mutually orthogonal axes is used as the posture detection means 4, the attitude of the image input means 2 with respect to the magnetic direction can be detected with high accuracy when the subject 9 is photographed at the two viewpoints 11 and 12 in a stationary state as shown in FIG. 2. Further, when acceleration sensors that measure acceleration in three orthogonal directions and a magnetic sensor capable of detection in two orthogonal directions are used in combination, the attitude of the image input means 2 can be completely detected.
[0053]
Further, an angular velocity sensor may be used as the posture detection means 4. That is, if an angular velocity sensor is installed for each rotation angle to be detected, the rotation angle can be calculated by integrating the sensor output. Therefore, when a subject is photographed from a plurality of viewpoints while moving, a change in posture of the image input means 2 between the viewpoints can be detected easily. Further, when the angular velocity sensor is used in combination with a three-axis acceleration sensor and a magnetic sensor, the offset component of the angular velocity sensor can be corrected from the posture determined by the acceleration sensor and the magnetic sensor while the image input means 2 is stationary or moving very slowly.
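The integration of the angular velocity sensor output can be sketched for a single axis as follows. The offset estimate used here, the mean sensor output while the device is known to be stationary, is a simplification assumed for illustration; the description instead derives the correction from the posture given by the acceleration and magnetic sensors. Function names and the fixed sampling interval are hypothetical.

```python
def estimate_offset(stationary_rates):
    """Mean angular rate (rad/s) measured while the device is at rest."""
    if not stationary_rates:
        raise ValueError("at least one stationary sample is required")
    return sum(stationary_rates) / len(stationary_rates)

def integrate_gyro(rates, dt, offset=0.0):
    """Rotation angle (rad) about one axis by rectangular integration.

    rates  : angular rate samples in rad/s, taken every dt seconds
    offset : sensor bias subtracted from every sample before integrating
    """
    angle = 0.0
    for rate in rates:
        angle += (rate - offset) * dt
    return angle
```

Without the offset correction, a constant bias accumulates linearly in the integrated angle, which is why the combination with sensors that give an absolute posture at rest is useful.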
[0054]
In each of the above-described embodiments, the case has been described where the subject 9 is photographed while the image input means 2 is stationary at the first viewpoint 11 and the second viewpoint 12 and the distance information and posture information are obtained; however, the subject 9 may also be photographed while the image input means 2 is moving to obtain the distance information and posture information. Further, although the case has been described where the subject 9 is photographed from the first viewpoint 11 and the second viewpoint 12 to restore the three-dimensional shape, the same applies when the subject 9 is photographed from three or more viewpoints to restore the three-dimensional shape.
[0055]
Further, in the above embodiments, the case has been described where processing is performed in real time when the subject 9 is photographed. However, the images photographed from the respective viewpoints, the distance information, and the posture information may be stored together in the storage means, and offline processing may be performed later on the basis of the stored information. Furthermore, the images captured at each viewpoint, the distance information, and the posture information may be transferred over a network or the like for processing.
[0056]
【The invention's effect】
As described above, the present invention calculates the translation component of the image input means when the viewpoint is changed from the image data at different viewpoints, the distance information to a certain point of the subject, and the attitude information of the image input means at each viewpoint. Therefore, the shooting position and orientation of the image input means can be detected accurately, and the three-dimensional shape can be accurately restored with a small amount of computation.
[0057]
Also, by setting a plurality of gazing points on the subject, determining a final translation component from the plurality of translation components calculated for the respective gazing points, and restoring the three-dimensional structure of the subject using the determined translation component, a more accurate three-dimensional shape can be restored.
[0058]
Furthermore, by fixing a fixed area of the screen captured by the image input means as the gaze area and photographing the subject from different viewpoints so that the same part of the subject falls within the gaze area, the translation component of the image input means when the viewpoint is changed can be calculated with less computation, and the three-dimensional shape can be accurately restored with a simple configuration.
[0059]
Further, by adjusting the threshold value for determining whether the same part of the subject is in the gaze area according to optical system parameters such as the focal length of the image input means, the shooting conditions can be set flexibly and the accuracy of restoring the three-dimensional shape can be further increased.
[0060]
In addition, by using an acceleration sensor that measures acceleration in three orthogonal directions as posture detecting means for detecting the posture of the image input means at each viewpoint, the direction of gravity is detected when the subject is photographed in a stationary state. It is possible to detect the attitude of the image input means with respect to the direction of gravity with high accuracy, and to improve the reconstruction accuracy of the three-dimensional shape.
[0061]
In addition, by using a magnetic sensor as the posture detection means to detect the geomagnetic direction or an artificially generated magnetic field during still photography, the posture of the image input means can be measured with high accuracy, and the three-dimensional shape can be accurately restored.
[0062]
Further, by using an angular velocity sensor as the posture detection means to detect the rotational angular velocity, the dynamic posture of the image input means can be measured with high accuracy, particularly when the subject is photographed while moving, and the three-dimensional shape can be accurately restored.
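As an illustration only, the sketch below tracks attitude by integrating angular velocity samples. It assumes body-frame angular velocity in rad/s sampled at a fixed interval and an attitude matrix that maps the camera frame to a reference frame; the function names are hypothetical, and the Rodrigues rotation formula is used for each step.

```python
import math

def step_rotation(w, dt):
    """Rotation matrix for turning at angular velocity w (rad/s) for dt seconds."""
    norm = math.sqrt(sum(c * c for c in w))
    theta = norm * dt
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (c / norm for c in w)
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    # Rodrigues rotation formula about the unit axis (kx, ky, kz).
    return [[c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
            [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
            [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v]]

def mat_mul(A, B):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def integrate_gyro(R0, samples, dt):
    """Update attitude R0 with successive body-frame angular velocity samples."""
    R = R0
    for w in samples:
        R = mat_mul(R, step_rotation(w, dt))
    return R

# Turning at 90 degrees per second about the z axis for one second.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
R = integrate_gyro(identity, [(0.0, 0.0, math.pi / 2)] * 100, 0.01)
```

In practice, sensor bias makes an integrated attitude drift over time, which is one reason an angular velocity sensor is often combined with an acceleration sensor or a magnetic sensor.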
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of an embodiment of the present invention.
FIG. 2 is a layout diagram showing shooting positions with respect to a subject.
FIG. 3 is a flowchart showing the operation of the embodiment.
FIG. 4 is a screen diagram showing images taken from different viewpoints.
FIG. 5 is an explanatory diagram showing different viewpoints, subjects, and image planes.
FIG. 6 is a configuration diagram of an image input unit for a central projection model.
FIG. 7 is an explanatory diagram showing an error of a unit line-of-sight vector from different viewpoints.
FIG. 8 is a block diagram showing a configuration of a second embodiment.
FIG. 9 is a layout diagram showing subjects and shooting positions for which a plurality of gazing points are set.
FIG. 10 is a flowchart showing the operation of the second embodiment.
FIG. 11 is a screen diagram showing the association of one point of images taken from different viewpoints.
FIG. 12 is a block diagram showing a configuration of a third embodiment.
FIG. 13 is a flowchart showing the operation of the third embodiment.
FIG. 14 is a screen diagram showing a gaze area on the image plane.
FIG. 15 is a block diagram showing a configuration of a fourth embodiment.
[Explanation of symbols]
1 3D shape restoration device
2 Image input means
3 Distance detection means
4 Attitude detection means
5 Translation component calculation means
6 Correspondence detection means
7 Three-dimensional calculation means
9 Subject
11 First viewpoint
13 Second viewpoint
13 Gazing point
18 Gaze area
21 Translation component determination means
22 Gaze area verification means
23 Gaze area adjustment means

Claims

A position / posture detection apparatus comprising image input means, distance detection means, posture detection means, and translation component calculation means, wherein
The image input means inputs the subject image by changing the shooting position and viewpoint,
The distance detection means detects the distance from each viewpoint to the gazing point of the subject corresponding to one specific point on the plurality of images obtained from the image input means,
The posture detection means calculates the posture of the image input means at each viewpoint, and
The translation component calculation means calculates the translation component of the image input means when the viewpoint is changed from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means.

A position / posture detection method characterized by: inputting subject images with the image input means while changing the shooting position and viewpoint; detecting the distance from each viewpoint to the gazing point of the subject corresponding to one specific point on the plurality of images obtained by changing the viewpoint; calculating the posture of the image input means at each viewpoint; and calculating the translation component of the image input means when the viewpoint is changed from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means.

Image input means, distance detection means, posture detection means, translation component calculation means, correspondence detection means and three-dimensional calculation means,
The image input means inputs the subject image by changing the shooting position and viewpoint,
The distance detection means detects the distance from each viewpoint to the gazing point of the subject corresponding to one specific point on the plurality of images obtained from the image input means,
The posture detection means calculates the posture of the image input means at each viewpoint,
The translation component calculation means calculates the translation component of the image input means when the viewpoint is changed from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means,
The correspondence detection means associates the plurality of images obtained by changing the viewpoint, using the translation component and the posture information of the image input means,
A three-dimensional shape restoration apparatus, wherein the three-dimensional calculation means calculates the three-dimensional shape of the subject from the association result and the position and posture information of the image input means.

Image input means, distance detection means, posture detection means, translation component calculation means, translation component determination means, correspondence detection means, and three-dimensional calculation means;
The image input means inputs the subject image by changing the shooting position and viewpoint,
The distance detection means detects the distance from each viewpoint to a plurality of gazing points of the subject corresponding to a plurality of specific points on the plurality of images obtained from the image input means,
The posture detection means calculates the posture of the image input means at each viewpoint,
The translation component calculation means calculates the translation components of the image input means when the viewpoint is changed from the image information at each viewpoint, the distance information to the plurality of gazing points, and the attitude information of the image input means,
The translation component determination means determines a final translation component from the plurality of translation components,
The correspondence detection means associates the plurality of images obtained by changing the viewpoint, using the determined translation component and the posture information of the image input means,
A three-dimensional shape restoration apparatus, wherein the three-dimensional calculation means calculates the three-dimensional shape of the subject from the association result and the position and posture information of the image input means.

Image input means, distance detection means, posture detection means, gaze area verification means, translation component calculation means, correspondence detection means, and three-dimensional calculation means,
The image input means inputs the subject image by changing the shooting position and viewpoint,
The distance detection means detects the distance from each viewpoint to the gazing point of the subject corresponding to a certain point in a specific gazing area on the plurality of images obtained from the image input means,
The posture detection means calculates the posture of the image input means at each viewpoint,
The gaze area verification means confirms that the amount of displacement of the subject in the gaze area of the image taken by changing the viewpoint is below a predetermined threshold,
The translation component calculation means calculates the translation component of the image input means when the viewpoint is changed from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means,
The correspondence detection means associates the plurality of images obtained by changing the viewpoint, using the translation component and the posture information of the image input means,
A three-dimensional shape restoration apparatus, wherein the three-dimensional calculation means calculates the three-dimensional shape of the subject from the association result and the position and posture information of the image input means.

The three-dimensional shape restoration apparatus according to claim 5, further comprising gaze area adjustment means that adjusts the threshold value in the gaze area verification means according to an optical system parameter of the image input means.

The three-dimensional shape restoration apparatus according to any one of claims 3 to 6, wherein an acceleration sensor, a magnetic sensor, or an angular velocity sensor is used alone or in combination as the posture detection means.

A three-dimensional shape restoration method characterized by: inputting subject images with the image input means while changing the shooting position and viewpoint; detecting the distance from each viewpoint to the gazing point of the subject corresponding to one specific point on the plurality of images obtained by changing the viewpoint; calculating the posture of the image input means at each viewpoint; calculating the translation component of the image input means when the viewpoint is changed from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means; associating the plurality of images obtained by changing the viewpoint, using the translation component and the posture information of the image input means; and calculating the three-dimensional shape of the subject from the association result and the position and posture information of the image input means.

A three-dimensional shape restoration method characterized by: inputting subject images with the image input means while changing the shooting position and viewpoint; detecting the distances from each viewpoint to a plurality of gazing points of the subject corresponding to a plurality of specific points on the plurality of images obtained by changing the viewpoint; calculating the posture of the image input means at each viewpoint; calculating the translation components of the image input means when the viewpoint is changed from the image information at each viewpoint, the distance information to the plurality of gazing points, and the attitude information of the image input means; determining a final translation component from the plurality of translation components; associating the plurality of images obtained by changing the viewpoint, using the determined translation component and the posture information of the image input means; and calculating the three-dimensional shape of the subject from the association result and the position and posture information of the image input means.

A three-dimensional shape restoration method characterized by: inputting subject images with the image input means while changing the shooting position and viewpoint; detecting the distance from each viewpoint to the gazing point of the subject corresponding to a certain point in a specific gaze area on the plurality of images obtained by changing the viewpoint; calculating the posture of the image input means at each viewpoint; confirming that the amount of displacement of the subject in the gaze area of the images captured by changing the viewpoint is less than or equal to a predetermined threshold; calculating the translation component of the image input means when the viewpoint is changed from the image information at each viewpoint, the distance information to the gazing point, and the attitude information of the image input means; associating the plurality of images obtained by changing the viewpoint, using the translation component and the posture information of the image input means; and calculating the three-dimensional shape of the subject from the association result and the position and posture information of the image input means.

The three-dimensional shape restoration method according to claim 10, wherein the threshold value is adjusted by an optical system parameter of the image input means.