JP2004013869A - Apparatus for generating three-dimensional shape, method therefor, and its program

Info

Publication number: JP2004013869A
Application number: JP2002170841A
Authority: JP
Inventors: Toshiyuki Kamiya; 神谷　俊之; Naokazu Yokoya; 横矢　直和; Tomokazu Sato; 佐藤　智和; Masayuki Kanbara; 神原　誠之
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2002-06-12
Filing date: 2002-06-12
Publication date: 2004-01-15

Abstract

PROBLEM TO BE SOLVED: To provide a three-dimensional (3-D) shape generating apparatus capable of automatically acquiring a highly realistic 3-D shape surface.

SOLUTION: A stereo matching means 11 performs stereo matching on a plurality of image sets to generate depth images. A depth image integration means 12 votes for the points in a voxel space that correspond to each parallax, based on the depth images obtained from the stereo matching means 11 and the camera position parameters corresponding to each depth image, and thereby restores the original 3-D shape as a voxel representation. A voxel color determination means 13 back-projects each voxel position onto the input images used by the stereo matching means 11, based on the voxel representation obtained from the integration means 12 and the camera position parameters, and selects the most appropriate pixel among the corresponding pixels.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は３次元形状生成装置及びそれに用いる３次元形状生成方法並びにそのプログラムに関し、特に３次元形状モデルに対応する現実の物体をカメラによって複数のアングルから撮影し、同時に撮影時のカメラ位置・姿勢を計測しておくことで、自動的に形状表面の情報を決定する３次元形状生成方法に関する。
【０００２】
【従来の技術】
実世界、例えば街並の建物、植物等を３次元ＣＧ（Ｃｏｍｐｕｔｅｒ　Ｇｒａｐｈｉｃｓ）で表現するための３次元モデルの形状を生成する手段として、３次元モデリング用のツールを用いて作成することが広く行われている。また、実際の風景を、カメラで撮影したり、３次元スキャナ装置を用いて、実際の３次元空間をそのままモデル化する手法についても多くの手法が開発されている。その中で、ボクセル（３次元空間中の微小立方体）を用いた３次元形状表現によって計算機内で３次元形状を表現する３次元形状生成手法は、対象の細かな凹凸等を表現可能であるという特徴がある。
【０００３】
従来のボクセルを用いた３次元形状生成手法の一例が、「視点位置推定による動画像からの屋外環境の三次元モデル化」（佐藤ら、第１９０回研究会講演予稿集、画像電子学会、２００１年、第４９頁〜第５５頁）に記載されている。
【０００４】
従来のボクセルを用いた３次元形状生成装置の一例を図１０に示す。図１０に示すように、３次元形状生成装置３はステレオマッチング手段３１と、対応色判定手段３２と、奥行き画像統合手段３３とから構成されている。
【０００５】
ステレオマッチング手段３１は複数の異なる３次元位置で入力されたフレーム画像間でのステレオマッチング処理を行い、奥行き推定を行う。奥行き画像統合手段３３は複数のステレオ奥行き画像と、推定されたカメラ位置パラメータとを使って、ボクセル空間への投票を行うことで、３次元空間中の対象位置の推定を行う。対応色判定手段３２は奥行き画像統合手段３３での投票に対応する画素の色を用いてボクセルの色を決定する。
【０００６】
このような構成な３次元形状生成装置３では、形状の復元及び形状表面の色決定を次のように行っている。すなわち、例えば、街並みや建物を撮影したビデオ画像から各フレームを撮影した時点でのビデオカメラ（図示せず）の３次元的な位置のパラメータを求めた後、各フレーム画像及び推定されたビデオカメラの３次元位置を入力とし、各フレーム画像に対して時間的に前後にあり、フレーム画像と異なる位置から撮影を行った複数の画像を用いてステレオマッチング処理を行い、フレーム毎に対応する奥行き画像を得る。
【０００７】
次に、得られた奥行き画像と、奥行き画像に対応するフレーム画像の推定位置とを用いて、奥行き画像の各画素の点が実際に存在すると推定される３次元空間中の微小立方体（ボクセル）に投票を行う。この投票を各奥行き画像から行うことで、一定以上の投票値を持つボクセルを、物体が実際に存在する場所であるという推定を行うことで３次元形状の復元を行う。
【０００８】
以上で、３次元の形状を推定することができるが、３次元ＣＧで用いるための３次元モデルとしては表面の色情報を得る必要があるため、さらに以下の処理を行う。すなわち、得られた各ボクセルへの投票に際して、各奥行き画像に対応するフレーム画像中の対応画素の色を調べ、ボクセルへの投票数と同時に記録する。
【０００９】
ボクセルへの投票が完了した時点で投票された色の平均値を求め、これをボクセルの色とする。このような処理によって、３次元形状とその表面の色とを求めることが可能となる。
【００１０】
【発明が解決しようとする課題】
しかしながら、上述した従来の３次元形状生成手法では、ボクセルの色の決定をボクセルに投票を行った画素の平均値で行っているので、例えば、建物の壁と窓の境目のように大きく色が異なる場合、ステレオマッチングの誤差によって、誤って異なる色の画素が投票された場合、非常に異なった色となってしまうため、上記の方法によって生成されるボクセルの色の精度が低いという問題がある。これは、上記の誤差が各ボクセルについて独立であるので、実際には一様である面において、隣り合うボクセルで色がそれぞれ微妙に異なるという現象が発生する。
【００１１】
また、従来の３次元形状生成手法では、上記の方法によって生成されるボクセルが、例えば、実空間中で１０ｃｍの精度を持つ立方体であったとしても、色データのみを投票することによって、その表面に持つ細かな模様（テクスチャ）の情報が失われるという問題がある。
【００１２】
以上のような問題のため、全体としては形状が正確であったとしても、現実感の低い形状になるという問題が生じており、建物形状等の詳細な形状を求めると同時に、現実感の高い形状表面を生成する３次元形状生成手法が求められている。
【００１３】
そこで、本発明の目的は上記の問題点を解消し、自動的にかつ現実感の高い３次元形状表面を取得することができる３次元形状生成装置及びそれに用いる３次元形状生成方法並びにそのプログラムを提供することにある。
【００１４】
【課題を解決するための手段】
本発明による３次元形状生成装置は、複数の異なる３次元位置でカメラから入力されたフレーム画像間でのステレオマッチング処理を行って奥行き推定を行うステレオマッチング手段と、前記ステレオマッチング手段で得られた複数のステレオ奥行き画像と前記カメラの位置パラメータとを基に３次元空間中の微小立方体を示すボクセルへの投票を行うことで３次元空間中の対象位置の推定を行う奥行き画像統合手段と、前記ボクセルの位置と前記カメラの位置パラメータとを用いて前記奥行き画像統合手段で得られたボクセルを元の画像系列の各画像に逆投影して当該ボクセルの最適な色を決定するボクセル色決定手段とを備えている。
【００１５】
本発明による他の３次元形状生成装置は、複数の異なる３次元位置でカメラから入力されたフレーム画像間でのステレオマッチング処理を行って奥行き推定するステレオマッチング手段と、前記ステレオマッチング手段で得られた複数のステレオ奥行き画像と前記カメラの位置パラメータとを基に３次元空間中の微小立方体を示すボクセルへの投票を行うことで３次元空間中の対象位置の推定を行う奥行き画像統合手段と、前記ボクセルの位置と前記カメラの位置パラメータとを用いて前記奥行き画像統合手段で得られたボクセルを元の画像系列の各画像に逆投影して当該ボクセルの各面に対応するテクスチャデータを決定するボクセルテクスチャ決定手段とを備えている。
【００１６】
本発明による３次元形状生成方法は、複数の異なる３次元位置でカメラから入力されたフレーム画像間でのステレオマッチング処理を行って奥行き推定する第１のステップと、前記第１のステップで得られた複数のステレオ奥行き画像と前記カメラの位置パラメータとを基にボクセル空間への投票を行うことで３次元空間中の対象位置の推定を行う第２のステップと、前記ボクセルの位置と前記カメラの位置パラメータとを用いて前記第２のステップで得られたボクセルを元の画像系列の各画像に逆投影して当該ボクセルの最適な色を決定する第３のステップとを備えている。
【００１７】
本発明による他の３次元形状生成方法は、複数の異なる３次元位置でカメラから入力されたフレーム画像間でのステレオマッチング処理を行って奥行き推定する第１のステップと、前記第１のステップで得られた複数のステレオ奥行き画像と前記カメラの位置パラメータとを基に３次元空間中の微小立方体を示すボクセルへの投票を行うことで３次元空間中の対象位置の推定を行う第２のステップと、前記ボクセルの位置と前記カメラの位置パラメータとを用いて前記第２のステップで得られたボクセルを元の画像系列の各画像に逆投影して当該ボクセルの各面に対応するテクスチャデータを決定する第３のステップとを備えている。
【００１８】
本発明による３次元形状生成方法のプログラムは、コンピュータに、複数の異なる３次元位置でカメラから入力されたフレーム画像間でのステレオマッチング処理を行って奥行き推定する第１の処理と、前記第１の処理で得られた複数のステレオ奥行き画像と前記カメラの位置パラメータとを基に３次元空間中の微小立方体を示すボクセルへの投票を行うことで３次元空間中の対象位置の推定を行う第２の処理と、前記ボクセルの位置と前記カメラの位置パラメータとを用いて前記第２の処理で得られたボクセルを元の画像系列の各画像に逆投影して当該ボクセルの最適な色を決定する第３の処理とを実行させている。
【００１９】
本発明による他の３次元形状生成方法のプログラムは、コンピュータに、複数の異なる３次元位置でカメラから入力されたフレーム画像間でのステレオマッチング処理を行って奥行き推定する第１の処理と、前記第１の処理で得られた複数のステレオ奥行き画像と前記カメラの位置パラメータとを基に３次元空間中の微小立方体を示すボクセルへの投票を行うことで３次元空間中の対象位置の推定を行う第２の処理と、前記ボクセルの位置と前記カメラの位置パラメータとを用いて前記第２の処理で得られたボクセルを元の画像系列の各画像に逆投影して当該ボクセルの各面に対応するテクスチャデータを決定する第３の処理とを実行させている。
【００２０】
すなわち、本発明の第１の３次元形状生成装置は、複数の異なる３次元位置でカメラから入力されたフレーム画像間でのステレオマッチング処理を行って奥行き推定するステレオマッチング手段と、複数のステレオ奥行き画像とカメラの位置パラメータとを使って３次元空間中の微小立方体を示すボクセルへの投票を行うことで３次元空間中の対象位置の推定を行う奥行き画像統合手段と、ボクセルの位置とカメラの位置パラメータとを用いて得られたボクセルを元の画像系列の各画像に逆投影して適当な色を決定するボクセル色決定手段とを有している。
【００２１】
本発明の第１の３次元形状生成装置では、上記のような構成をとり、複数の異なる３次元位置で入力された画像間でのステレオマッチング処理を自動的に行い、得られた視差情報による画像中の各点での撮影位置からの奥行き情報を抽出した後、それぞれの奥行き情報とカメラの位置パラメータとを使ってボクセル空間へ奥行き情報を投票し、さらに一定数以上の投票が得られたボクセルを物体表面であるとしてボクセル表現による物体の３次元形状を求め、得られた各ボクセルの位置をカメラの位置パラメータを用いて元の画像系列に逆投影し、適当な画像中の対応画素の色を得ることでボクセルの色を決定するように動作する。
【００２２】
本発明の第１の３次元形状生成装置では、このような構成を採用し、ボクセル位置とカメラの位置パラメータとからボクセルを逆投影することによって、元の画像とボクセルとの間の距離やフレーム画像の撮影順序の情報を利用し、適当なボクセル色を決定することが可能となる。したがって、本発明は３次元形状モデル生成において、自動的にかつ現実感の高い３次元形状表面色が取得可能となる。
【００２３】
また、本発明の第２の３次元形状生成方法は、複数の異なる３次元位置で入力されたフレーム画像間でのステレオマッチング処理を行って奥行き推定するステレオマッチング手段と、複数のステレオ奥行き画像とカメラの位置パラメータとを使ってボクセルへの投票を行うことで３次元空間中の対象位置の推定を行う奥行き画像統合手段と、ボクセルの位置とカメラの位置パラメータとを用いて得られたボクセルを元の画像系列の各画像に逆投影してボクセルの各面に対応するテクスチャデータを決定するボクセルテクスチャ決定手段とを有している。
【００２４】
本発明の第２の３次元形状生成装置では、上記のような構成をとり、複数の異なる３次元位置で入力された画像間でのステレオマッチング処理を自動的に行い、得られた視差情報による画像中の各点での撮影位置からの奥行き情報を抽出した後、それぞれの奥行き情報とカメラの位置パラメータとを使ってボクセル空間へ奥行き情報を投票し、さらに一定数以上の投票が得られたボクセルを物体表面であるとしてボクセル表現による物体の３次元形状を求め、得られた各ボクセルの各面を構成する頂点をカメラの位置パラメータを用いて元の画像系列に逆投影し、適当な画像中の対応領域をボクセル表面のテクスチャデータして得ることで、ボクセルの表面のテクスチャを決定するように動作する。
【００２５】
本発明の第２の３次元形状生成装置では、このような構成を採用し、ボクセルの位置とカメラの位置パラメータとからボクセルを構成する頂点を逆投影することによって、元の画像とボクセルとの間の距離やフレーム画像の撮影順序の情報を利用し、適当なボクセル表面テクスチャを決定することが可能となる。したがって、本発明では、３次元形状モデル生成において、自動的にかつ現実感の高い３次元形状表面テクスチャが取得可能となる。
【００２６】
【発明の実施の形態】
次に、本発明の実施例について図面を参照して説明する。図１は本発明の一実施例による３次元形状生成装置の構成を示すブロック図である。図１において、３次元形状生成装置１はステレオマッチング手段１１と、奥行き画像統合手段１２と、ボクセル色決定手段１３と、記録媒体１４とから構成されている。尚、記録媒体１４には３次元形状生成装置１をコンピュータで実現する際に、そのコンピュータで実行するプログラムが格納されている。
【００２７】
ステレオマッチング手段１１は複数の異なる３次元位置で図示せぬカメラから入力されたフレーム画像間でのステレオマッチング処理を行い、奥行き推定を行う。奥行き画像統合手段１２は複数のステレオ奥行き画像とカメラの位置パラメータとを使って３次元空間中の微小立方体を示すボクセルへの投票を行うことで、３次元空間中の対象位置の推定を行う。ボクセル色決定手段１３はボクセルの位置とカメラの位置パラメータとを用いて、得られたボクセルを元の画像系列の各画像に逆投影し、適当な色を決定する。
【００２８】
図２は図１の３次元形状生成装置１の動作を示すフローチャートである。これら図１及び図２を参照して本発明の一実施例による３次元形状生成装置１の動作について説明する。尚、図２に示す処理は３次元形状生成装置１のコンピュータが記録媒体１４に格納されたプログラムを実行することで実現される。
【００２９】
ステレオマッチング手段１１は複数の画像の組に対してステレオマッチング処理を行い（図２ステップＳ１）、奥行き画像を生成する（図２ステップＳ２）。奥行き画像統合手段１２はステレオマッチング手段１１から得られた奥行き画像と、それぞれの奥行き画像に対応するカメラ位置のパラメータとからボクセル空間内でそれぞれの視差に対応する点への投票を行い（図２ステップＳ３）、元の３次元形状のボクセル表現による形状を復元する（図２ステップＳ４）。
【００３０】
ボクセル色決定手段１３は奥行き画像統合手段１２で得られたボクセル表現と、カメラの位置パラメータとからボクセルの位置を、ステレオマッチング手段１１に用いた入力画像に逆投影し（図２ステップＳ５）、対応する画素のうちの最も適した画素を選択する（図２ステップＳ６）。
【００３１】
図３はボクセル表現の例を示す図であり、図４及び図５（ａ），（ｂ）はボクセルへの投票を説明するための図であり、図６はボクセルの点の画像への逆投影を説明するための図である。これら図１と図３〜図６とを参照して本発明の一実施例による３次元形状生成装置１の具体的な動作について説明する。
【００３２】
まず、３次元形状生成装置１にはビデオカメラあるいは通常のカメラ（図示せず）等によって撮影した画像の系列が入力として与えられる。この画像系列は復元対象となる建物等の物体をそれぞれ異なる位置、向きで撮影した画像である。また、それぞれの画像を撮影した際のカメラの相対的な位置関係については、例えば、上記の文献「視点位置推定による動画像からの屋外環境の三次元モデル化」（佐藤ら、第１９０回研究会講演予稿集、画像電子学会、２００１年、第４９頁〜第５５頁）中にあるカメラのパス推定やカメラ本体に固定した位置、方向を計測するセンサで測定したデータがカメラの位置パラメータとして得られるものとする。
【００３３】
ステレオマッチング手段１１では、まず同じ物体が写っている画像組を用いてステレオマッチング処理を行い、奥行き画像を生成する。その具体的な手法については、特に制限はなく、例えば特開平３−１６７６７８号公報に開示されている方法でも、上記の文献中に記述されている方法でもよい。
【００３４】
得られる奥行き画像はステレオマッチング処理対象となる画像組のうちのいずれかを基準画像として、それと同じ座標系として作成される。すなわち、奥行き画像は基準画像を撮影したカメラ位置から各画素の位置に対応する対象物体上の点の距離を示すものである。
【００３５】
画像の系列中には、同じ対象に対して異なる位置、方向から撮影したこのような画像組が多数含まれ、そのような画像組の全て、または任意に選択した一部の組に対して行い、複数の奥行き画像を得る。以上の処理で、入力となった各画像を撮影したカメラ位置から見た時の対象物体までの距離を示す画像が多数得られる。
【００３６】
次に、得られた奥行き画像とカメラの位置パラメータとから奥行き画像統合手段１２において、物体の３次元ボクセル表現への変換を行う。ボクセル表現は、図３に示すように、３次元空間を一定の大きさの立方体に区切り、その立方体の集合として形状を表現するものである。
【００３７】
奥行き画像とカメラの位置パラメータとからボクセル表現を推定するには、各カメラ位置から視差の値を用いて、ボクセル空間への投票を行う。図４及び図５（ａ），（ｂ）は２次元の場合、ボクセルへの投票の概念を示したものである。これは３次元ボクセル空間がＸＹＺからなる直交座標空間で考えた場合の特定のＺ値についての投票に相当する。
【００３８】
図４において、カメラＡにおいて撮影した入力画像それから得られた奥行き画像Ａは、２次元で図５（ａ　）のような値の分布を持っていたとする。この時、ボクセル空間への図５（ｂ）のような投票が行われる。同様に、カメラＢ，Ｃからの投票を行うことで、実際に物体がある確からしい面を得ることができる。この投票は得られた視差に対応する単一のボクセルに対して行ってもよいし、精度に応じて一定の範囲のボクセルに重み付けを行って投票を行う等、各種の方式をとることができる。最後に得られた投票空間に対して一定の閾値処理をすることで、物体の表面形状をボクセル表現として得ることができる。
【００３９】
次に、ボクセル色決定手段１３では物体のボクセル表現と入力となった画像系列及びそれぞれに対応するカメラの位置パラメータとから、各ボクセルを３次元中の点として考えた場合のそれぞれの点の画像への逆投影を行う。ここで、各ボクセルを点として考える場合の方法としては、ボクセルの中心点をそのボクセルを代表する点として用いる、ボクセル中のカメラに最も近い点を代表点として用いる等の様々な方法での選択が可能である。
【００４０】
図６はボクセルからの逆投影を２次元的に示したものである。各ボクセルからは入力画像それぞれに対応して複数の点の色が得られる。これらの点の色から最適な点を選択する。
【００４１】
この最適な点の選択基準としては、まず、（１）ボクセルから各画像の間に遮蔽物となるボクセルが存在しないことを条件とし、（２）（１）で選択した複数の画像の中からカメラ位置とボクセルとの３次元的な距離が最小となる画像中の画素の色を選択する。上記の処理によって、物体の対象とする点を最も近くから撮影した画像中の色を使うことで選択された色の確からしさを向上させることができる。
【００４２】
また、別の選択基準としては、ユーザが目視で画像系列から撮影状態が良好なものから順に画像に順序を与え、その中で上記の（１）の条件を満たす最初の画像を選択することもできる。この選択基準は、照明条件等によって色が全ての画像で必ずしも正しく得られていない場合に有効となる基準である。
【００４３】
さらに、（１）の条件を満たす全ての画像、または距離を基準として一定距離以下の画像、あるいは距離を基準として一定枚数以下の画像、ユーザが与えた順序で一定枚数の画像を選択し、その平均値、または最頻値、中央値等を用いることもできる。
【００４４】
さらにまた、隣接するボクセル間を順に処理する際に、できるだけ同じ画像を選択するという基準を設けることもできる。この基準は連続するボクセルにおいて、できるだけ同じ画像から色を取得することで、表面色の滑らかさを重視する場合の基準である。尚、選択の基準としてはこれらに限定されるものではなく、各画像に画素値の信頼度を付加する様々な基準が利用可能である。以上の処理で得られた画素の色を各ボクセルに付与することで、３次元形状表面の色を決定することができる。
【００４５】
図７は本発明の他の実施例による３次元形状生成装置の構成を示すブロック図である。図７において、３次元形状生成装置２はステレオマッチング手段１１と、奥行き画像統合手段１２と、テクスチャ決定手段２１と、記録媒体２２とから構成されている。尚、記録媒体２２には３次元形状生成装置２をコンピュータで実現する際に、そのコンピュータで実行するプログラムが格納されている。
【００４６】
ステレオマッチング手段１１は複数の異なる３次元位置で図示せぬカメラから入力されたフレーム画像間でのステレオマッチング処理を行い、奥行き推定を行う。奥行き画像統合手段１２は複数のステレオ奥行き画像とカメラの位置パラメータとを使って、ボクセルへの投票を行うことで、３次元空間中の対象位置の推定を行う。
【００４７】
テクスチャ決定手段２１はボクセルの位置とカメラの位置パラメータとを用いて、得られたボクセルの各頂点を元の画像系列の各画像に逆投影し、ボクセルの各面に対応する適当なテクスチャを決定する。
【００４８】
図８は図７の３次元形状生成装置２の動作を示すフローチャートである。これら図７及び図８を参照して本発明の他の実施例による３次元形状生成装置２の動作について説明する。尚、図８に示す処理は３次元形状生成装置２のコンピュータが記録媒体２２に格納されたプログラムを実行することで実現される。
【００４９】
ステレオマッチング手段１１は複数の画像の組に対してステレオマッチング処理を行い（図８ステップＳ１１）、奥行き画像を生成する（図８ステップＳ１２）。奥行き画像統合手段１２はステレオマッチング手段１１から得られた奥行き画像とそれぞれの奥行き画像に対応するカメラ位置のパラメータとからボクセル空間内でそれぞれの視差に対応する点への投票を行い（図８ステップＳ１３）、元の３次元形状のボクセル表現による形状を復元する（図８ステップＳ１４）。
【００５０】
テクスチャ決定手段２１は奥行き画像統合手段１２で得られたボクセル表現とカメラの位置パラメータとからボクセルの各頂点を、ステレオマッチング手段１１に用いた入力画像に逆投影し（図８ステップＳ１５）、ボクセルの各面を構成する４つの頂点の投影が生成する入力画像中の矩形領域から最も適した矩形領域を決定し（図８ステップＳ１６）、その領域の画素値をボクセルの各面のテクスチャとする（図８ステップＳ１７）。
【００５１】
図９はボクセルの面の画像への逆投影を説明するための図である。これら図７と図９とを参照して本発明の他の実施例による３次元形状生成装置２の具体的な動作について説明する。
【００５２】
入力として与えられる画像データと、ステレオマッチング手段１１及び奥行き画像統合手段１２各々における処理は上述した本発明の一実施例と同様であるので、その説明を省略する。
【００５３】
本実施例では、得られた物体のボクセル表現と、入力となった画像系列及びそれぞれに対応するカメラの位置パラメータとをテクスチャ決定手段２１に入力として与え、各ボクセルを構成する面の頂点の画像への逆投影を行う。
【００５４】
図９はボクセルの各面の頂点からの逆投影を模式的に示したものである。各ボクセルの面毎に、入力画像それぞれに対応して矩形の領域が得られる。この矩形領域のうちの最適な矩形領域を選択する。
【００５５】
この最適な矩形領域の選択基準としては、（１）ボクセルの面から各画像への逆投影の間に遮蔽物となるボクセルが存在しないことを条件とし、（２）（１）で選択した複数の画像の中からカメラ位置とボクセルとの３次元的な距離が最小となる画像中の画素の色を選択する。上記の処理によって、本実施例では、物体の対象とする点を最も近くから撮影した画像中のテクスチャから選択可能とすることで、選択されたテクスチャの確からしさを向上させることができる。
【００５６】
また、別の基準としては、ボクセルを画像に投影した時に得られる矩形の面積が最大になる面を選択することもできる。この基準は、カメラから見た時に対象となるボクセル面が正面向きに近いほどよいとする場合の基準である。また、別の基準としては、ユーザが目視で画像系列から撮影状態が良好なものから順に画像に順序を与え、その中で上記の（１）の条件を満たす最初の画像を選択することもできる。この基準は、照明条件等によって色が全ての画像で必ずしも正しく得られていない場合に有効となる基準である。
【００５７】
さらに、（１）の基準を満たす全ての画像、または距離を基準として一定距離以下の画像、あるいは距離を基準として一定枚数以下の画像、ユーザが与えた順序で一定枚数の画像を選択し、その平均値、または最頻値、中央値等を用いることもできる。
【００５８】
さらにまた、隣接するボクセル間を順に処理する際に、できるだけ同じ画像を選択するという基準を設けることもできる。この基準は連続するボクセルにおいて、できるだけ同じ画像から色を取得することで、表面色の滑らかさを重視する場合の基準である。尚、選択の基準としてはこれらに限定されるものではなく、各画像に画素値の信頼度を付加する等の様々な基準が利用可能である。本実施例では、以上の処理で得られたテクスチャを各ボクセルの面に付与することで、３次元形状表面のテクスチャを決定することができる。
【００５９】
このように、本発明では、ボクセルの位置とカメラの位置パラメータとからボクセルを逆投影することによって、元の画像とボクセルとの間の距離やフレーム画像の撮影順序の情報を利用して最適なボクセル色を選択することができるため、３次元形状モデル生成において、自動的にかつ現実感の高い３次元形状表面色を取得することができる。
【００６０】
また、本発明では、ボクセルの位置とカメラの位置パラメータとからボクセルの各頂点を逆投影することによって、元の画像とボクセルとの間の距離やフレーム画像の撮影順序の情報を利用して最適なボクセル表面テクスチャを選択することができるため、３次元形状モデル生成において、自動的にかつ現実感の高い３次元形状表面テクスチャを取得することができる。
【００６１】
【発明の効果】
以上説明したように本発明の３次元形状生成装置は、ボクセルの位置とカメラの位置パラメータとからボクセルを逆投影することによって、自動的にかつ現実感の高い３次元形状表面を取得することができるという効果が得られる。
【００６２】
また、本発明の他の３次元形状生成装置は、ボクセル位置とカメラ位置パラメータとからボクセルの各頂点を逆投影することによって、自動的にかつ現実感の高い３次元形状表面を取得することができるという効果が得られる。
【図面の簡単な説明】
【図１】本発明の一実施例による３次元形状生成装置の構成を示すブロック図である。
【図２】図１の３次元形状生成装置の動作を示すフローチャートである。
【図３】ボクセル表現の例を示す図である。
【図４】ボクセルへの投票を説明するための図である。
【図５】（ａ），（ｂ）はボクセルへの投票を説明するための図である。
【図６】ボクセルの点の画像への逆投影を説明するための図である。
【図７】本発明の他の実施例による３次元形状生成装置の構成を示すブロック図である。
【図８】図７の３次元形状生成装置の動作を示すフローチャートである。
【図９】ボクセルの面の画像への逆投影を説明するための図である。
【図１０】従来の３次元形状生成装置の構成を示すブロック図である。
【符号の説明】
１，２　３次元形状生成装置
１１　ステレオマッチング手段
１２　奥行き画像統合手段
１３　ボクセル色決定手段
１４，２２　記録媒体
２１　テクスチャ決定手段

[0001]
[Technical field of the invention]
The present invention relates to a three-dimensional shape generating apparatus, a three-dimensional shape generating method used therein, and a program therefor. More particularly, it relates to a three-dimensional shape generation method that automatically determines shape surface information by photographing a real object corresponding to a three-dimensional shape model from a plurality of angles with a camera while measuring the camera position and orientation at the time of photographing.
[0002]
[Prior art]
Three-dimensional modeling tools are widely used to create the shapes of three-dimensional models that represent the real world, for example buildings and plants along a street, in three-dimensional CG (computer graphics). Many techniques have also been developed for modeling an actual three-dimensional space directly, by photographing a real scene with a camera or by using a three-dimensional scanner. Among these, a three-dimensional shape generation method that represents shapes in a computer using voxels (small cubes in three-dimensional space) has the advantage of being able to express fine irregularities of the object.
[0003]
An example of a conventional three-dimensional shape generation method using voxels is described in "3D modeling of an outdoor environment from a moving image by estimating the viewpoint position" (Sato et al., Proceedings of the 190th Technical Conference, The Institute of Image Electronics Engineers of Japan, 2001, pp. 49-55).
[0004]
FIG. 10 shows an example of a conventional three-dimensional shape generating apparatus using voxels. As shown in FIG. 10, the three-dimensional shape generation device 3 includes stereo matching means 31, corresponding color determination means 32, and depth image integration means 33.
[0005]
The stereo matching means 31 performs stereo matching between frame images input at a plurality of different three-dimensional positions and estimates depth. The depth image integration means 33 estimates the position of the target in three-dimensional space by voting in the voxel space using the plurality of stereo depth images and the estimated camera position parameters. The corresponding color determination means 32 determines the color of each voxel using the colors of the pixels that correspond to the votes cast by the depth image integration means 33.
[0006]
In the three-dimensional shape generation device 3 configured in this way, the shape is restored and the color of the shape surface is determined as follows. First, for example, the three-dimensional position parameters of a video camera (not shown) at the time each frame was captured are obtained from a video of a cityscape or a building. Then, taking each frame image and the estimated three-dimensional position of the video camera as input, stereo matching is performed for each frame image using a plurality of images that are temporally before and after it and were captured from positions different from that of the frame image, which yields a depth image for each frame.
[0007]
Next, using the obtained depth images and the estimated positions of the corresponding frame images, votes are cast for the small cubes (voxels) in three-dimensional space in which the point at each pixel of a depth image is estimated to actually exist. Once votes have been cast from every depth image, voxels whose vote count is at or above a certain value are estimated to be places where the object actually exists, and the three-dimensional shape is restored in this way.
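The voting and thresholding described in this paragraph can be sketched as follows. This is a minimal illustration, not the cited method's implementation: it assumes the depth pixels have already been converted to 3-D points in a common world frame, that every point casts one vote for the single voxel containing it, and that the hypothetical names `voxel_index`, `cast_votes`, `occupied_voxels`, `voxel_size`, and `min_votes` stand in for details the text does not specify.

```python
import math
from collections import Counter

def voxel_index(point, voxel_size):
    """Integer (i, j, k) index of the voxel that contains a 3-D point."""
    return tuple(math.floor(c / voxel_size) for c in point)

def cast_votes(points_per_depth_image, voxel_size):
    """Accumulate one vote per reconstructed point (assumed voting rule)."""
    votes = Counter()
    for points in points_per_depth_image:
        for point in points:
            votes[voxel_index(point, voxel_size)] += 1
    return votes

def occupied_voxels(votes, min_votes):
    """Keep voxels whose vote count reaches the threshold."""
    return {index for index, count in votes.items() if count >= min_votes}

# Three depth images; most points agree on the voxel at the origin.
points_per_depth_image = [
    [(0.1, 0.2, 0.3), (1.2, 0.0, 0.0)],
    [(0.4, 0.1, 0.2)],
    [(0.3, 0.3, 0.3), (-0.1, 0.0, 0.0)],
]
votes = cast_votes(points_per_depth_image, voxel_size=0.5)
surface = occupied_voxels(votes, min_votes=2)
```

With a threshold of two votes, only the voxel that several depth images agree on survives; the isolated points, which play the role of matching errors here, are discarded.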
[0008]
The three-dimensional shape can be estimated as described above. However, a three-dimensional model for use in three-dimensional CG also needs surface color information, so the following processing is performed in addition. Each time a vote is cast for a voxel, the color of the corresponding pixel in the frame image associated with the depth image is looked up and recorded together with the vote count for that voxel.
[0009]
When voting for a voxel is completed, the average value of the colors voted is obtained, and this is used as the voxel color. By such processing, it is possible to obtain the three-dimensional shape and the color of the surface.
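As a rough sketch of this averaging step, the snippet below takes the RGB colors recorded with the votes for one voxel and returns their mean. The function name `average_color` and the sample colors are made up for illustration; the second call shows how a single pixel voted in from the wrong surface shifts the result.

```python
def average_color(colors):
    """Mean of the RGB colors recorded with the votes for one voxel."""
    n = len(colors)
    return tuple(sum(color[i] for color in colors) / n for i in range(3))

# Four votes from a light wall: the average matches the wall color.
wall_only = average_color([(200, 200, 200)] * 4)

# One of the four votes comes from a dark window pixel by mistake.
with_outlier = average_color([(200, 200, 200)] * 3 + [(0, 0, 0)])
```

Here `wall_only` is `(200.0, 200.0, 200.0)`, while `with_outlier` drops to `(150.0, 150.0, 150.0)`, a color that appears in neither the wall nor the window.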
[0010]
[Problems to be solved by the invention]
However, in the conventional three-dimensional shape generation method described above, the color of a voxel is determined as the average of the pixels that voted for it. Where colors differ greatly, for example at the boundary between a building wall and a window, a pixel of a different color that is voted in by mistake because of a stereo matching error produces a very different color, so the voxel colors generated by this method have low accuracy. Moreover, because these errors are independent for each voxel, adjacent voxels end up with slightly different colors even on a surface that is actually uniform.
[0011]
In addition, in the conventional three-dimensional shape generation method, even if the voxels generated by the above method are cubes with an accuracy of, for example, 10 cm in real space, only color data is voted, so the information on the fine pattern (texture) on the voxel surface is lost.
[0012]
Because of these problems, the resulting shape has low realism even when the shape as a whole is accurate. A three-dimensional shape generation method is therefore needed that obtains detailed shapes, such as building shapes, and at the same time generates a highly realistic shape surface.
[0013]
Accordingly, an object of the present invention is to solve the above problems and to provide a three-dimensional shape generating apparatus, a three-dimensional shape generating method used therein, and a program therefor that can automatically acquire a highly realistic three-dimensional shape surface.
[0014]
[Means for Solving the Problems]
A three-dimensional shape generation device according to the present invention comprises: stereo matching means for performing stereo matching between frame images input from a camera at a plurality of different three-dimensional positions to estimate depth; depth image integration means for estimating the position of the target in three-dimensional space by voting for voxels, which represent small cubes in three-dimensional space, based on the plurality of stereo depth images obtained by the stereo matching means and the position parameters of the camera; and voxel color determining means for back-projecting the voxels obtained by the depth image integration means onto each image of the original image sequence, using the voxel positions and the camera position parameters, to determine the optimal color of each voxel.
[0015]
Another three-dimensional shape generation device according to the present invention comprises: stereo matching means for performing stereo matching between frame images input from a camera at a plurality of different three-dimensional positions to estimate depth; depth image integration means for estimating the position of the target in three-dimensional space by voting for voxels, which represent small cubes in three-dimensional space, based on the plurality of stereo depth images obtained by the stereo matching means and the position parameters of the camera; and voxel texture determining means for back-projecting the voxels obtained by the depth image integration means onto each image of the original image sequence, using the voxel positions and the camera position parameters, to determine texture data corresponding to each face of each voxel.
[0016]
A three-dimensional shape generation method according to the present invention comprises: a first step of performing stereo matching between frame images input from a camera at a plurality of different three-dimensional positions to estimate depth; a second step of estimating the position of the target in three-dimensional space by voting in a voxel space based on the plurality of stereo depth images obtained in the first step and the position parameters of the camera; and a third step of back-projecting the voxels obtained in the second step onto each image of the original image sequence, using the voxel positions and the camera position parameters, to determine the optimal color of each voxel.
[0017]
Another three-dimensional shape generation method according to the present invention comprises: a first step of performing stereo matching between frame images input from a camera at a plurality of different three-dimensional positions to estimate depth; a second step of estimating the position of the target in three-dimensional space by voting for voxels, which represent small cubes in three-dimensional space, based on the plurality of stereo depth images obtained in the first step and the position parameters of the camera; and a third step of back-projecting the voxels obtained in the second step onto each image of the original image sequence, using the voxel positions and the camera position parameters, to determine texture data corresponding to each face of each voxel.
[0018]
A program for the three-dimensional shape generation method according to the present invention causes a computer to execute: a first process of performing stereo matching between frame images input from a camera at a plurality of different three-dimensional positions to estimate depth; a second process of estimating the position of the target in three-dimensional space by voting for voxels, which represent small cubes in three-dimensional space, based on the plurality of stereo depth images obtained in the first process and the position parameters of the camera; and a third process of back-projecting the voxels obtained in the second process onto each image of the original image sequence, using the voxel positions and the camera position parameters, to determine the optimal color of each voxel.
[0019]
A program for the other three-dimensional shape generation method according to the present invention causes a computer to execute: a first process of performing stereo matching between frame images input from a camera at a plurality of different three-dimensional positions to estimate depth; a second process of estimating the position of the target in three-dimensional space by voting for voxels, which represent small cubes in three-dimensional space, based on the plurality of stereo depth images obtained in the first process and the position parameters of the camera; and a third process of back-projecting the voxels obtained in the second process onto each image of the original image sequence, using the voxel positions and the camera position parameters, to determine texture data corresponding to each face of each voxel.
[0020]
That is, the first three-dimensional shape generation device of the present invention has: stereo matching means for performing stereo matching between frame images input from a camera at a plurality of different three-dimensional positions to estimate depth; depth image integration means for estimating the position of the target in three-dimensional space by voting for voxels, which represent small cubes in three-dimensional space, using the plurality of stereo depth images and the position parameters of the camera; and voxel color determining means for back-projecting the obtained voxels onto each image of the original image sequence, using the voxel positions and the camera position parameters, to determine an appropriate color.
[0021]
With the above configuration, the first three-dimensional shape generation device of the present invention operates as follows. It automatically performs stereo matching between images input at a plurality of different three-dimensional positions and, from the resulting parallax information, extracts the depth from the shooting position at each point in the image. It then votes this depth information into the voxel space using each set of depth information and the camera position parameters, and treats voxels that received at least a certain number of votes as the object surface, which gives the three-dimensional shape of the object as a voxel representation. Finally, it back-projects the position of each obtained voxel onto the original image sequence using the camera position parameters and determines the voxel color by taking the color of the corresponding pixel in an appropriate image.
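The back-projection of a voxel position onto an input image can be sketched as below. The text only speaks of "camera position parameters", so the pinhole model used here (rotation `R`, translation `t`, focal length `f`, principal point `cx`, `cy`) is an assumption made for illustration, as are the function names `project` and `color_at` and the nearest-pixel lookup.

```python
import math

def project(point, camera):
    """Project a world-space point to pixel coordinates (u, v).

    Assumes a pinhole camera with Xc = R * X + t. Returns None when the
    point lies behind the camera.
    """
    R, t = camera["R"], camera["t"]
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        return None
    u = camera["f"] * xc[0] / xc[2] + camera["cx"]
    v = camera["f"] * xc[1] / xc[2] + camera["cy"]
    return (u, v)

def color_at(image, uv):
    """Pixel nearest to uv, or None if uv is missing or outside the image."""
    if uv is None:
        return None
    col = math.floor(uv[0] + 0.5)
    row = math.floor(uv[1] + 0.5)
    if 0 <= row < len(image) and 0 <= col < len(image[0]):
        return image[row][col]
    return None

# Hypothetical camera at the origin looking along +Z, and a 3x3 test image.
camera = {
    "R": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "t": [0, 0, 0],
    "f": 2.0, "cx": 1.0, "cy": 1.0,
}
image = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
uv = project((1.0, 0.0, 2.0), camera)
color = color_at(image, uv)
```

Running this for every input image gives one candidate color per image for each voxel; choosing among those candidates is a separate step.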
[0022]
By adopting this configuration and back-projecting the voxels from the voxel positions and the camera position parameters, the first three-dimensional shape generation device of the present invention can determine an appropriate voxel color using information such as the distance between the original image and the voxel and the capture order of the frame images. The present invention can therefore automatically acquire a highly realistic three-dimensional shape surface color when generating a three-dimensional shape model.
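One way to use the distance information mentioned here is to take the color from the nearest camera that sees the voxel without occlusion, which the Japanese description (paragraph 0041) lists as one selection criterion. The sketch below assumes the occlusion test has already been done elsewhere and is passed in as a `visible` flag; the name `pick_color` and the candidate dictionary layout are hypothetical.

```python
import math

def pick_color(voxel_center, candidates):
    """Color from the closest camera with an unoccluded view of the voxel.

    Each candidate is a dict with 'camera_position', 'visible', and 'color'.
    Returns None when no camera has a usable view.
    """
    usable = [c for c in candidates if c["visible"] and c["color"] is not None]
    if not usable:
        return None
    nearest = min(
        usable,
        key=lambda c: math.dist(voxel_center, c["camera_position"]),
    )
    return nearest["color"]

candidates = [
    {"camera_position": (0, 0, 1), "visible": False, "color": (10, 10, 10)},
    {"camera_position": (0, 0, 5), "visible": True, "color": (20, 20, 20)},
    {"camera_position": (0, 0, 3), "visible": True, "color": (30, 30, 30)},
]
chosen = pick_color((0, 0, 0), candidates)
```

The closest camera is skipped because its view is occluded, so the color comes from the camera at distance 3. Replacing the `min` with a mean, median, or mode over the usable candidates would give the alternative criteria that the description also allows.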
[0023]
Further, the second three-dimensional shape generation method of the present invention has: stereo matching means for performing stereo matching between frame images input at a plurality of different three-dimensional positions to estimate depth; depth image integration means for estimating the position of the target in three-dimensional space by voting for voxels using the plurality of stereo depth images and the position parameters of the camera; and voxel texture determining means for back-projecting the obtained voxels onto each image of the original image sequence, using the voxel positions and the camera position parameters, to determine texture data corresponding to each face of each voxel.
[0024]
With the above configuration, the second three-dimensional shape generation device of the present invention operates as follows. It automatically performs stereo matching between images input at a plurality of different three-dimensional positions and, from the resulting parallax information, extracts the depth from the shooting position at each point in the image. It then votes this depth information into the voxel space using each set of depth information and the camera position parameters, and treats voxels that received at least a certain number of votes as the object surface, which gives the three-dimensional shape of the object as a voxel representation. Finally, it back-projects the vertices of each face of each obtained voxel onto the original image sequence using the camera position parameters and determines the texture of the voxel surface by taking the corresponding region in an appropriate image as the texture data for that surface.
[0025]
By adopting this configuration and back-projecting the vertices of each voxel from the voxel positions and the camera position parameters, the second three-dimensional shape generation device of the present invention can determine an appropriate voxel surface texture using information such as the distance between the original image and the voxel and the capture order of the frame images. The present invention can therefore automatically acquire a highly realistic three-dimensional shape surface texture when generating a three-dimensional shape model.
[0026]
[Embodiments of the invention]
Next, embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing the configuration of a three-dimensional shape generating apparatus according to one embodiment of the present invention. In FIG. 1, the three-dimensional shape generating apparatus 1 includes stereo matching means 11, depth image integrating means 12, voxel color determining means 13, and a recording medium 14. The recording medium 14 stores the program that a computer executes when the three-dimensional shape generating apparatus 1 is implemented on that computer.
[0027]
The stereo matching means 11 performs stereo matching between frame images input from a camera (not shown) at a plurality of different three-dimensional positions and estimates depth. The depth image integrating means 12 estimates the position of the target in three-dimensional space by voting for voxels, which represent small cubes in three-dimensional space, using the plurality of stereo depth images and the position parameters of the camera. The voxel color determining means 13 back-projects the obtained voxels onto each image of the original image sequence, using the voxel positions and the camera position parameters, and determines an appropriate color.
[0028]
FIG. 2 is a flowchart showing the operation of the three-dimensional shape generation device 1 of FIG. The operation of the three-dimensional shape generating apparatus 1 according to one embodiment of the present invention will be described with reference to FIGS. Note that the processing illustrated in FIG. 2 is realized by the computer of the three-dimensional shape generation device 1 executing a program stored in the recording medium 14.
[0029]
The stereo matching means 11 performs a stereo matching process on a set of a plurality of images (step S1 in FIG. 2), and generates a depth image (step S2 in FIG. 2). The depth image integrating means 12 performs voting for points corresponding to respective parallaxes in the voxel space from the depth images obtained from the stereo matching means 11 and parameters of camera positions corresponding to the respective depth images (FIG. 2 step S3), the original three-dimensional shape is restored by the voxel expression (step S4 in FIG. 2).
[0030]
The voxel color determining means 13 backprojects the position of the voxel from the voxel expression obtained by the depth image integrating means 12 and the position parameter of the camera onto the input image used for the stereo matching means 11 (step S5 in FIG. 2), The most suitable pixel is selected from the corresponding pixels (step S6 in FIG. 2).
[0031]
FIG. 3 is a diagram showing an example of a voxel expression. FIGS. 4 and 5 (a) and 5 (b) are diagrams for explaining voting for voxels, and FIG. 6 is a diagram showing the reverse of a voxel point to an image. FIG. 3 is a diagram for explaining projection. The specific operation of the three-dimensional shape generation device 1 according to one embodiment of the present invention will be described with reference to FIG. 1 and FIGS.
[0032]
First, a series of images captured by a video camera, an ordinary still camera (not shown), or the like is provided as input to the three-dimensional shape generation device 1. This image sequence consists of images of an object to be restored, such as a building, photographed from different positions and directions. The relative positional relationship between the cameras when each image was taken is assumed to be obtained, for example, by estimating the camera path as in the above-mentioned document "Three-dimensional modeling of an outdoor environment from a moving image by estimating the viewpoint position" (Sato et al., Proceedings of the 190th Research Meeting, Institute of Image Electronics Engineers of Japan, 2001, pp. 49-55), or from data measured by a sensor that is fixed to the camera body and measures its position and direction.
[0033]
The stereo matching means 11 first performs a stereo matching process using an image set in which the same object is captured, and generates a depth image. The specific method is not particularly limited, and may be, for example, the method disclosed in JP-A-3-167678 or the method described in the above-mentioned document.
[0034]
The depth image is created in the coordinate system of one image of the image set subjected to the stereo matching process, and that image serves as the reference image. That is, for each pixel position, the depth image gives the distance between the camera position at which the reference image was captured and the corresponding point on the target object.
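As an illustration of how one depth-image pixel corresponds to a point in space, the following is a minimal sketch that assumes a pinhole camera model, which the source does not specify. The function name `pixel_to_world` and the parameters `f`, `cx`, `cy`, `R`, and `C` are hypothetical and chosen only for this example.

```python
import math


def pixel_to_world(u, v, depth, f, cx, cy, R, C):
    """Return the world point seen at pixel (u, v) at the given distance.

    Assumptions (not from the source): pinhole camera with focal length f
    and principal point (cx, cy) in pixels; R is the 3x3 camera-to-world
    rotation as nested lists; C is the camera centre in world coordinates.
    depth is the Euclidean distance from the camera centre to the point.
    """
    # Ray direction in camera coordinates, normalised to unit length.
    d = ((u - cx) / f, (v - cy) / f, 1.0)
    norm = math.sqrt(sum(x * x for x in d))
    d = tuple(x / norm for x in d)
    # Rotate the ray into the world frame and step `depth` along it.
    dw = tuple(sum(R[i][j] * d[j] for j in range(3)) for i in range(3))
    return tuple(C[i] + depth * dw[i] for i in range(3))
```

With an unrotated camera at the origin, the pixel at the principal point with depth 2 maps to the point two units along the optical axis.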
[0035]
The series of images includes a large number of such image sets, taken of the same object from different positions and directions. The above processing is performed on all of these image sets, or on an arbitrarily selected subset, to obtain a plurality of depth images. This yields a large number of images, each indicating the distance to the target object as viewed from the camera position at which the corresponding input image was captured.
[0036]
Next, the depth image integrating means 12 converts the obtained depth image and camera position parameters into a three-dimensional voxel representation of the object. In the voxel expression, as shown in FIG. 3, a three-dimensional space is divided into cubes of a fixed size, and a shape is represented as a set of the cubes.
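A voxel expression of this kind can be sketched as a set of integer grid indices, one per occupied cube. This is only one possible data structure, not one named in the source; `voxel_index`, `voxelize`, `origin`, and `size` are hypothetical names.

```python
import math


def voxel_index(point, origin=(0.0, 0.0, 0.0), size=1.0):
    """Integer (i, j, k) index of the fixed-size cube containing `point`."""
    return tuple(math.floor((p - o) / size) for p, o in zip(point, origin))


def voxelize(points, origin=(0.0, 0.0, 0.0), size=1.0):
    """Represent a shape as the set of cubes that contain at least one point."""
    return {voxel_index(p, origin, size) for p in points}


# Two of the three points fall into the same unit cube.
shape = voxelize([(0.2, 0.2, 0.2), (0.8, 0.1, 0.3), (1.5, 0.0, 0.0)])
```

A set keeps the representation sparse, which suits a surface shape that occupies only a small fraction of the grid.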
[0037]
In order to estimate a voxel expression from a depth image and a camera position parameter, voting for a voxel space is performed using a parallax value from each camera position. FIGS. 4 and 5 (a) and 5 (b) show the concept of voting for voxels in the two-dimensional case. This corresponds to voting for a specific Z value when the three-dimensional voxel space is considered in a rectangular coordinate space consisting of XYZ.
[0038]
In FIG. 4, it is assumed that the input image captured by camera A and the depth image A obtained from it have a two-dimensional distribution of values as shown in FIG. 5(a). In this case, votes are cast in the voxel space as shown in FIG. 5(b). By voting in the same way from cameras B and C, the surface on which the object actually lies can be obtained. The vote may be cast for the single voxel corresponding to the obtained parallax, or in various other ways, such as weighted voting over a range of voxels according to the accuracy. Applying a threshold to the resulting voting space yields the surface shape of the object as a voxel expression.
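The voting and threshold steps can be sketched as follows, under the simplest variant mentioned above: one vote for the single voxel that each depth sample falls into. The names `vote` and `surface_voxels` are hypothetical, and the rule that each depth image votes at most once per voxel is an assumption of this sketch. The input is assumed to be the depth samples already converted to 3D points, one list per camera.

```python
import math
from collections import Counter


def vote(points_per_camera, size=1.0):
    """Accumulate votes in voxel space from several depth images."""
    votes = Counter()
    for points in points_per_camera:
        # Assumption: each depth image votes at most once per voxel.
        hit = {tuple(math.floor(c / size) for c in p) for p in points}
        votes.update(hit)
    return votes


def surface_voxels(votes, threshold):
    """Keep the voxels whose vote count reaches the threshold."""
    return {v for v, n in votes.items() if n >= threshold}


cam_a = [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5)]
cam_b = [(0.4, 0.6, 0.5)]
cam_c = [(0.6, 0.4, 0.4), (2.5, 0.5, 0.5)]
votes = vote([cam_a, cam_b, cam_c])
surface = surface_voxels(votes, threshold=2)
```

Only the voxel supported by several cameras survives the threshold; voxels hit by a single camera are treated as noise.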
[0039]
Next, the voxel color determination means 13 uses the voxel expression of the object, the input image sequence, and the corresponding camera position parameters to back-project each voxel, regarded as a point in three dimensions, onto each image. Various methods can be used to treat a voxel as a point, such as using the center point of the voxel as its representative point, or using the point in the voxel closest to the camera.
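Both choices of representative point, and the back projection itself, can be sketched as below. The pinhole model and the names `voxel_center`, `closest_point_in_voxel`, `project`, `f`, `cx`, `cy`, `R`, and `C` are assumptions made for this example and do not come from the source.

```python
def voxel_center(index, size=1.0):
    """Centre of the cube with integer index (i, j, k)."""
    return tuple((i + 0.5) * size for i in index)


def closest_point_in_voxel(index, camera, size=1.0):
    """Point of the cube nearest to the camera centre (per-axis clamping)."""
    return tuple(min(max(c, i * size), (i + 1) * size)
                 for i, c in zip(index, camera))


def project(point, f, cx, cy, R, C):
    """Back-project a world point to pixel coordinates, or None if behind.

    Assumptions: pinhole camera; R is the camera-to-world rotation (nested
    lists), so its transpose maps world to camera; C is the camera centre.
    """
    d = [point[j] - C[j] for j in range(3)]
    x, y, z = (sum(R[j][i] * d[j] for j in range(3)) for i in range(3))
    if z <= 0:
        return None  # the point is not in front of the camera
    return (f * x / z + cx, f * y / z + cy)
```

The pixel returned by `project` is where the color for that voxel would be read in the corresponding input image.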
[0040]
FIG. 6 shows two-dimensional back projection from a voxel. From each voxel, a plurality of point colors are obtained corresponding to the respective input images. An optimal point is selected from the colors of these points.
[0041]
The optimum point is selected using the following criteria: (1) there must be no voxel acting as an obstruction between the voxel and the image, and (2) from the images selected in (1), the color of the pixel in the image that minimizes the three-dimensional distance between the camera position and the voxel is selected. This processing improves the reliability of the selected color, because the color is taken from the image that photographed the target point of the object from the closest position.
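Criteria (1) and (2) can be sketched as follows. The source does not say how obstruction is tested, so the ray-marching check here is an approximation chosen for brevity. The names `is_occluded`, `select_color`, `views`, `occupied`, and `step` are hypothetical, and each view is assumed to carry the color already read at the back-projected pixel.

```python
import math


def is_occluded(voxel, camera, occupied, size=1.0, step=0.25):
    """Approximate test: march from the voxel centre towards the camera and
    report whether any sample lands in another occupied voxel."""
    start = tuple((i + 0.5) * size for i in voxel)
    dist = math.dist(start, camera)
    for k in range(1, int(dist / (step * size)) + 1):
        t = k * step * size / dist
        p = tuple(s + t * (c - s) for s, c in zip(start, camera))
        idx = tuple(math.floor(x / size) for x in p)
        if idx != voxel and idx in occupied:
            return True
    return False


def select_color(voxel, views, occupied, size=1.0):
    """views: list of (camera_centre, colour) pairs. Returns the colour from
    the nearest unoccluded camera, or None if every view is blocked."""
    center = tuple((i + 0.5) * size for i in voxel)
    visible = [(math.dist(center, cam), colour) for cam, colour in views
               if not is_occluded(voxel, cam, occupied, size)]  # criterion (1)
    if not visible:
        return None
    return min(visible, key=lambda dc: dc[0])[1]  # criterion (2)


occupied = {(0, 0, 0), (0, 0, 2)}
views = [((0.5, 0.5, 5.5), "red"),     # nearest, but blocked by voxel (0, 0, 2)
         ((8.5, 0.5, 0.5), "green"),
         ((0.5, 12.5, 0.5), "blue")]
color = select_color((0, 0, 0), views, occupied)
```

The nearest camera is rejected because another voxel lies on its line of sight, so the color from the next-nearest camera is used.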
[0042]
As another selection criterion, the user may visually rank the images of the image sequence in descending order of shooting quality, and the first image in that order that satisfies condition (1) above may be selected. This criterion is effective when colors are not obtained correctly in every image because of lighting conditions and the like.
[0043]
Further, an average value, a mode value, a median value, or the like can be taken over all images satisfying condition (1), over the images within a certain distance, over a certain number of images in order of distance, or over a certain number of images in the order given by the user.
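Combining the colors from several qualifying images can be sketched as below. Applying the statistic separately to each color channel is an assumption of this example, as is the name `combine_colors`.

```python
from statistics import mean, median, mode


def combine_colors(colors, how="median"):
    """Combine candidate colours channel by channel.

    colors: list of equal-length tuples such as (r, g, b).
    how: "mean", "median" or "mode".
    """
    reducer = {"mean": mean, "median": median, "mode": mode}[how]
    return tuple(reducer(channel) for channel in zip(*colors))


# One view suffers a highlight; the median suppresses it.
candidates = [(10, 20, 30), (12, 20, 90), (200, 20, 30)]
robust = combine_colors(candidates, "median")
```

The median and mode are less sensitive than the average to a single image with a badly wrong color.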
[0044]
Furthermore, when sequentially processing adjacent voxels, a criterion for selecting the same image as much as possible can be provided. This criterion is a criterion when importance is placed on smoothness of the surface color by acquiring colors from the same image as much as possible in consecutive voxels. Note that selection criteria are not limited to these, and various criteria for adding the reliability of the pixel value to each image can be used. By assigning the color of the pixel obtained by the above processing to each voxel, the color of the surface of the three-dimensional shape can be determined.
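The preference for reusing the same image across consecutive voxels can be sketched as a simple greedy rule. The source does not say how this criterion is weighed against the others, so the rule below is an assumption: keep the previously chosen image whenever it is still a valid candidate. The name `assign_images` is hypothetical.

```python
def assign_images(candidates_per_voxel):
    """Pick one source image per voxel, reusing the previous choice if possible.

    candidates_per_voxel: list, in processing order, of lists of image ids
    that are valid for each voxel, each sorted best-first by some other
    criterion (for example, camera distance).
    """
    chosen, previous = [], None
    for candidates in candidates_per_voxel:
        if not candidates:
            pick = None              # no usable image for this voxel
        elif previous in candidates:
            pick = previous          # stay on the same image for smoothness
        else:
            pick = candidates[0]     # fall back to the best-ranked image
        chosen.append(pick)
        if pick is not None:
            previous = pick
    return chosen


picks = assign_images([["a", "b"], ["b", "a"], ["c", "b"], []])
```

The second voxel keeps image "a" even though "b" ranks higher for it, which is the smoothness trade-off the criterion describes.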
[0045]
FIG. 7 is a block diagram showing a configuration of a three-dimensional shape generating apparatus according to another embodiment of the present invention. In FIG. 7, the three-dimensional shape generating device 2 includes a stereo matching unit 11, a depth image integrating unit 12, a texture determining unit 21, and a recording medium 22. Note that the recording medium 22 stores a program to be executed by the computer when the three-dimensional shape generation device 2 is realized by the computer.
[0046]
The stereo matching means 11 performs a stereo matching process between frame images input from a camera (not shown) at a plurality of different three-dimensional positions, and performs depth estimation. The depth image integrating means 12 estimates a target position in a three-dimensional space by voting for voxels using a plurality of stereo depth images and camera position parameters.
[0047]
The texture determining means 21 uses the voxel positions and the camera position parameters to back-project each vertex of the obtained voxels onto each image of the original image sequence, and determines an appropriate texture for each surface of each voxel.
[0048]
FIG. 8 is a flowchart showing the operation of the three-dimensional shape generation device 2 of FIG. 7. The operation of the three-dimensional shape generation device 2 according to another embodiment of the present invention will be described with reference to FIGS. 7 and 8. Note that the processing shown in FIG. 8 is realized by the computer of the three-dimensional shape generation device 2 executing a program stored in the recording medium 22.
[0049]
The stereo matching means 11 performs a stereo matching process on a plurality of sets of images (step S11 in FIG. 8), and generates depth images (step S12 in FIG. 8). Using the depth images obtained from the stereo matching means 11 and the camera position parameters corresponding to the respective depth images, the depth image integration means 12 votes for the points in the voxel space that correspond to the respective parallaxes (step S13 in FIG. 8), and restores the original three-dimensional shape as a voxel expression (step S14 in FIG. 8).
[0050]
Using the voxel expression obtained by the depth image integration means 12 and the camera position parameters, the texture determination means 21 back-projects each vertex of each voxel onto the input images used by the stereo matching means 11 (step S15 in FIG. 8). From the rectangular areas produced in the input images by projecting the four vertices that make up each surface, it determines the most suitable rectangular area (step S16 in FIG. 8), and uses the pixel values of that area as the texture of the corresponding voxel surface (step S17 in FIG. 8).
[0051]
FIG. 9 is a diagram for explaining back projection of a voxel surface onto an image. A specific operation of the three-dimensional shape generation device 2 according to another embodiment of the present invention will be described with reference to FIGS. 7 and 9.
[0052]
Since the image data given as input and the processing in each of the stereo matching means 11 and the depth image integrating means 12 are the same as those in the above-described embodiment of the present invention, the description is omitted.
[0053]
In the present embodiment, the obtained voxel expression of the object, the input image sequence, and the corresponding camera position parameters are given as inputs to the texture determining means 21, which back-projects the vertices of the surfaces constituting each voxel onto each image.
[0054]
FIG. 9 schematically shows back projection from the vertex of each surface of the voxel. For each face of each voxel, a rectangular area is obtained corresponding to each input image. An optimal rectangular area is selected from the rectangular areas.
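Back projection of one voxel face can be sketched as below. For brevity the camera is assumed to be an unrotated pinhole camera looking along +Z, which the source does not specify. The names `face_vertices`, `project`, `axis`, `side`, `f`, `cx`, `cy`, and `C` are hypothetical.

```python
def face_vertices(index, axis, side, size=1.0):
    """Four corners of one voxel face, listed in order around the face.

    axis: 0, 1 or 2, the direction of the face normal.
    side: 0 for the face at the lower coordinate, 1 for the upper one.
    """
    a, b = [k for k in range(3) if k != axis]
    corners = []
    for da, db in ((0, 0), (1, 0), (1, 1), (0, 1)):
        p = list(index)
        p[axis] += side
        p[a] += da
        p[b] += db
        corners.append(tuple(c * size for c in p))
    return corners


def project(point, f, cx, cy, C):
    """Pinhole projection for an unrotated camera at C looking along +Z."""
    x, y, z = (point[i] - C[i] for i in range(3))
    if z <= 0:
        return None  # the point is not in front of the camera
    return (f * x / z + cx, f * y / z + cy)


face = face_vertices((0, 0, 2), axis=2, side=0)
quad = [project(p, 100.0, 50.0, 50.0, (0.0, 0.0, 0.0)) for p in face]
```

The four projected corners bound the image area whose pixel values become the texture candidate for that face.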
[0055]
The criteria for selecting this optimal rectangular area are as follows: (1) there must be no voxel acting as an obstruction in the back projection from the voxel surface to the image, and (2) from the images selected in (1), the image that minimizes the three-dimensional distance between the camera position and the voxel is selected and its pixel colors are used. In the present embodiment, this processing improves the reliability of the selected texture, because the texture is taken from the image that photographed the target point of the object from the closest position.
[0056]
As another criterion, it is also possible to select the image that maximizes the area of the rectangle obtained when the voxel surface is projected onto the image. This criterion treats a view as better the more directly the camera faces the voxel surface. As a further criterion, the user may visually rank the images of the image sequence in descending order of shooting quality, and the first image in that order that satisfies condition (1) above may be selected. This criterion is effective when colors are not obtained correctly in every image because of lighting conditions and the like.
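The area criterion can be sketched with the shoelace formula applied to the four projected corners of a face. The names `quad_area`, `best_view_by_area`, and `quads_by_image` are hypothetical, and the corners are assumed to be listed in order around the quadrilateral.

```python
def quad_area(corners):
    """Area of a polygon from its ordered 2D corners (shoelace formula)."""
    twice_area = 0.0
    for (x0, y0), (x1, y1) in zip(corners, corners[1:] + corners[:1]):
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def best_view_by_area(quads_by_image):
    """Return the id of the image in which the projected face is largest."""
    return max(quads_by_image, key=lambda k: quad_area(quads_by_image[k]))


quads = {
    "front":   [(50, 50), (100, 50), (100, 100), (50, 100)],
    "oblique": [(50, 50), (70, 50), (70, 100), (50, 100)],
}
best = best_view_by_area(quads)
```

A face seen obliquely is foreshortened and projects to a smaller area, so the largest projection tends to come from the most frontal view.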
[0057]
Further, an average value, a mode value, a median value, or the like can be taken over all images satisfying condition (1), over the images within a certain distance, over a certain number of images in order of distance, or over a certain number of images in the order given by the user.
[0058]
Furthermore, when sequentially processing adjacent voxels, a criterion for selecting the same image as much as possible can be provided. This criterion is a criterion when importance is placed on smoothness of the surface color by acquiring colors from the same image as much as possible in consecutive voxels. Note that the selection criteria are not limited to these, and various criteria such as adding reliability of pixel values to each image can be used. In the present embodiment, the texture of the surface of the three-dimensional shape can be determined by applying the texture obtained by the above processing to the surface of each voxel.
[0059]
As described above, in the present invention, voxels are back-projected using the voxel positions and the camera position parameters, so that the optimum voxel color can be selected using information such as the distance between the original image and the voxel and the shooting order of the frame images. A highly realistic surface color can therefore be acquired automatically when generating a three-dimensional shape model.
[0060]
Further, in the present invention, each vertex of the voxels is back-projected using the voxel positions and the camera position parameters, so that the optimum voxel surface texture can be selected using information such as the distance between the original image and the voxel and the shooting order of the frame images. A highly realistic surface texture can therefore be acquired automatically when generating a three-dimensional shape model.
[0061]
[Effects of the Invention]
As described above, the three-dimensional shape generation device of the present invention back-projects voxels using the voxel positions and the camera position parameters, and thereby has the effect that a highly realistic three-dimensional shape surface can be acquired automatically.
[0062]
Further, another three-dimensional shape generation device of the present invention back-projects each vertex of the voxels using the voxel positions and the camera position parameters, and thereby has the effect that a highly realistic three-dimensional shape surface can be acquired automatically.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a three-dimensional shape generation device according to one embodiment of the present invention.
FIG. 2 is a flowchart showing the operation of the three-dimensional shape generation device of FIG.
FIG. 3 is a diagram illustrating an example of a voxel expression.
FIG. 4 is a diagram for explaining voting for voxels.
FIGS. 5(a) and 5(b) are diagrams for explaining voting for voxels.
FIG. 6 is a diagram for describing back projection of voxel points onto an image.
FIG. 7 is a block diagram illustrating a configuration of a three-dimensional shape generating apparatus according to another embodiment of the present invention.
FIG. 8 is a flowchart showing an operation of the three-dimensional shape generation device of FIG. 7.
FIG. 9 is a diagram for explaining back projection of a voxel plane onto an image.
FIG. 10 is a block diagram illustrating a configuration of a conventional three-dimensional shape generation device.
[Explanation of symbols]
1, 2 Three-dimensional shape generation device
11 Stereo matching means
12 Depth image integration means
13 Voxel color determination means
14,22 Recording medium
21 Texture determination means

Claims

A three-dimensional shape generation device comprising: stereo matching means for performing depth estimation by performing stereo matching between frame images input from a camera at a plurality of different three-dimensional positions; depth image integration means for estimating a target position in a three-dimensional space by voting for voxels, each indicating a small cube in the three-dimensional space, based on a plurality of stereo depth images obtained by the stereo matching means and position parameters of the camera; and voxel color determining means for back-projecting the voxels obtained by the depth image integration means onto each image of the original image sequence, using the positions of the voxels and the position parameters of the camera, to determine the optimal color of each voxel.

A three-dimensional shape generation device comprising: stereo matching means for performing depth estimation by performing stereo matching between frame images input from a camera at a plurality of different three-dimensional positions; depth image integration means for estimating a target position in a three-dimensional space by voting for voxels, each indicating a small cube in the three-dimensional space, based on a plurality of stereo depth images obtained by the stereo matching means and position parameters of the camera; and voxel texture determining means for back-projecting the voxels obtained by the depth image integration means onto each image of the original image sequence, using the positions of the voxels and the position parameters of the camera, to determine texture data corresponding to each surface of each voxel.

A three-dimensional shape generation method comprising: a first step of performing a stereo matching process between frame images input from a camera at a plurality of different three-dimensional positions to estimate depth; a second step of estimating a target position in a three-dimensional space by voting in a voxel space based on a plurality of stereo depth images obtained in the first step and position parameters of the camera; and a third step of back-projecting the voxels obtained in the second step onto each image of the original image sequence, using the positions of the voxels and the position parameters of the camera, to determine the optimal color of each voxel.

A three-dimensional shape generation method comprising: a first step of performing a stereo matching process between frame images input from a camera at a plurality of different three-dimensional positions to estimate depth; a second step of estimating a target position in a three-dimensional space by voting for voxels, each indicating a small cube in the three-dimensional space, based on a plurality of stereo depth images obtained in the first step and position parameters of the camera; and a third step of back-projecting the voxels obtained in the second step onto each image of the original image sequence, using the positions of the voxels and the position parameters of the camera, to determine texture data corresponding to each surface of each voxel.

A program for causing a computer to execute: a first process of performing depth estimation by performing a stereo matching process between frame images input from a camera at a plurality of different three-dimensional positions; a second process of estimating a target position in a three-dimensional space by voting for voxels, each indicating a small cube in the three-dimensional space, based on a plurality of stereo depth images obtained in the first process and position parameters of the camera; and a third process of back-projecting the voxels obtained in the second process onto each image of the original image sequence, using the positions of the voxels and the position parameters of the camera, to determine the optimal color of each voxel.

A program for causing a computer to execute: a first process of performing depth estimation by performing a stereo matching process between frame images input from a camera at a plurality of different three-dimensional positions; a second process of estimating a target position in a three-dimensional space by voting for voxels, each indicating a small cube in the three-dimensional space, based on a plurality of stereo depth images obtained in the first process and position parameters of the camera; and a third process of back-projecting the voxels obtained in the second process onto each image of the original image sequence, using the positions of the voxels and the position parameters of the camera, to determine texture data corresponding to each surface of each voxel.