JP2000268179A

JP2000268179A - Three-dimensional shape information obtaining method and device, two-dimensional picture obtaining method and device and record medium

Info

Publication number: JP2000268179A
Application number: JP2000007200A
Authority: JP
Inventors: Takeo Kanade (金出武雄); Yuji Uchida (内田勇治); Tatsuya Konno (今野立也); Katsuya Yasuda (安田勝也)
Original assignee: Ogis Ri Co Ltd
Current assignee: Ogis Ri Co Ltd
Priority date: 1999-01-14
Filing date: 2000-01-14
Publication date: 2000-09-29

Abstract

PROBLEM TO BE SOLVED: To provide a method for easily obtaining highly accurate three-dimensional shape information of an object from a moving image of the object captured by a camera that has not been calibrated in advance, without using any dedicated special equipment, and a method for obtaining a two-dimensional image of the object from an arbitrary viewpoint. SOLUTION: An object is photographed with a camera that has not been calibrated in advance to obtain a moving image (A). The optical parameters of the camera and the movement trajectory of the camera during capture are determined from the moving image (B). Three-dimensional shape information of the object is then obtained according to a stereo method, using the determined optical parameters and movement trajectory together with the moving image data (C). Finally, a two-dimensional image of the object from a designated arbitrary viewpoint is obtained using the determined optical parameters and movement trajectory, the moving image data, and the obtained three-dimensional shape information (D).

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD OF THE INVENTION

The present invention relates to a method and an apparatus for acquiring information on the three-dimensional shape of an object, a method and an apparatus for acquiring a two-dimensional image of an object from an arbitrary viewpoint, and a recording medium on which a program for carrying out those methods is recorded.

[0002]

DESCRIPTION OF THE RELATED ART

With the development of CAD and CG technology, it has become necessary to acquire information on the three-dimensional shape of an object and to input that information into a computer system. Among methods for acquiring such information, the stereo method, which uses stereo vision (multi-view vision), offers relatively good accuracy and a stable algorithm, and is gradually being introduced into real systems. The stereo method is broadly divided into the active stereo method and the passive stereo method.

[0003] The active stereo method actively irradiates the target object with laser slit light or the like in order to determine corresponding points between the multiple views, which is the basis of the stereo method, and it allows a high-speed, high-accuracy system to be built. However, it requires dedicated hardware and is therefore expensive. It also normally requires an environment in which the slit light is not disturbed, so the range of situations in which the system can be used is narrow. Consequently, it has not spread beyond application fields in which the environment can be restricted and high accuracy is required.

[0004] In the passive stereo method, by contrast, a plurality of cameras are fixedly arranged, and information on the three-dimensional shape of an object is acquired using only the images obtained by those cameras. It therefore requires no special hardware beyond ordinary video cameras or still cameras and places no restrictions on the environment. Because of these advantages it has begun to spread widely, and many concrete proposals have been made.

[0005] Among these proposals, the multiple-baseline stereo method proposed in IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, No. 4, April 1993, pp. 353-363, uses images from multiple viewpoints to eliminate, with a simple algorithm, the instability in corresponding-point determination that has been a problem in the passive stereo method, and yields good range images. In addition, the volumetric iterative approach to stereo matching and occlusion detection developed by Takeo Kanade, C. Lawrence Zitnick, and others detects occlusion regions and can obtain higher-quality three-dimensional shape information in voxel data format.
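
The idea behind the multiple-baseline stereo method cited above can be sketched in a few lines of Python. This is a minimal illustration under assumed simplifications, not the implementation of the cited paper or of this application: each image is a single scanline, the cameras are displaced along one axis, and the disparity in pixels is taken as baseline × focal length × inverse depth. The names `shift` and `sssd_inverse_depth` and all parameters are hypothetical.

```python
def shift(scanline, d):
    """Return a copy of scanline displaced d pixels to the right (zero fill)."""
    n = len(scanline)
    return [scanline[i - d] if 0 <= i - d < n else 0 for i in range(n)]


def sssd_inverse_depth(ref, others, baselines, focal, x, candidates, half_window=1):
    """Pick the candidate inverse depth with the smallest sum of SSDs at pixel x."""
    best, best_cost = None, float("inf")
    for zeta in candidates:
        cost = 0.0
        for img, baseline in zip(others, baselines):
            d = round(baseline * focal * zeta)  # disparity for this baseline
            for k in range(-half_window, half_window + 1):
                xr, xo = x + k, x + k + d
                if 0 <= xr < len(ref) and 0 <= xo < len(img):
                    cost += (ref[xr] - img[xo]) ** 2
                else:
                    cost += 255.0 ** 2  # penalise windows that leave the image
        if cost < best_cost:
            best, best_cost = zeta, cost
    return best


# Synthetic check: one textured scanline seen from three additional viewpoints.
ref = [(37 * i) % 101 for i in range(40)]
baselines = [1, 2, 3]
true_zeta = 2.0
others = [shift(ref, round(b * 1.0 * true_zeta)) for b in baselines]
estimate = sssd_inverse_depth(ref, others, baselines, 1.0, 20, [0.0, 1.0, 2.0, 3.0, 4.0])
```

Summing the matching cost over several baselines in terms of a shared inverse depth is what lets a false match in one image pair be outvoted by the others; in this synthetic case `estimate` recovers the inverse depth used to generate the data.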

[0006] Apart from the above methods of obtaining three-dimensional shape information of an object from captured images, a method is known that synthesizes, from a plurality of captured images, a two-dimensional image of the object as seen from an arbitrary viewpoint designated by the user (a virtual camera having an arbitrary position and orientation). This method is called image-based rendering (hereinafter abbreviated as IBR where appropriate). Whereas three-dimensional shape information acquisition methods focus on constructing a shape model of the object from the three-dimensional shape information, this two-dimensional image acquisition method mainly aims to generate a two-dimensional image of the object in the computer and present it to the user without constructing a shape model. In systems using virtual reality, training simulations, and the like in particular, being able to present an accurate two-dimensional image of the object from an arbitrary viewpoint is more important than being able to construct an accurate shape model.

[0007] One example of such IBR is the method developed by Steven M. Seitz et al. (Steven M. Seitz, "Toward Interactive Scene Walkthroughs from Images"). This method treats the voxel space containing the target object as a collection of minute rectangular parallelepipeds called voxels, and considers the projection of a given voxel onto the imaging plane of each camera. For a voxel on the surface of the object, the projections onto all cameras have the same color. Accordingly, if the projections onto all cameras have the same color, that color is assigned to the voxel; otherwise the voxel is regarded as not being on the surface of the object and no color is assigned to it. The voxels colored in this way are then projected onto the imaging plane of a virtual camera having an arbitrarily designated position and orientation, thereby synthesizing the two-dimensional image seen from the virtual camera.
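
The color-consistency test described above can be sketched as follows. This is a simplified illustration, not Seitz's algorithm as published: occlusion between voxels is ignored, and each camera is modeled as a hypothetical callable that maps a voxel to the RGB color of the pixel it projects to (or `None` when it falls outside the image). The function name `color_voxels` and the toy orthographic cameras are assumptions.

```python
def color_voxels(voxels, cameras, tol=0):
    """Assign a color to each voxel whose projections agree in every camera.

    cameras: callables mapping a voxel (x, y, z) to an RGB tuple, or None if
    the voxel projects outside the image (hypothetical interface).
    """
    colored = {}
    for voxel in voxels:
        samples = [cam(voxel) for cam in cameras]
        if any(s is None for s in samples):
            continue  # not visible in every image: leave uncolored
        first = samples[0]
        consistent = all(
            max(abs(a - b) for a, b in zip(first, s)) <= tol for s in samples[1:]
        )
        if consistent:
            colored[voxel] = first
    return colored


# Two toy orthographic cameras: one looks along z, the other along x.
front_image = {(0, 0): (200, 0, 0), (1, 0): (0, 200, 0)}
side_image = {(0, 0): (200, 0, 0), (1, 0): (0, 0, 200)}
front = lambda v: front_image.get((v[0], v[1]))
side = lambda v: side_image.get((v[2], v[1]))

voxels = [(x, 0, z) for x in range(2) for z in range(2)]
surface = color_voxels(voxels, [front, side])
```

Only the voxel at `(0, 0, 0)` projects to the same color in both toy images, so it is the only one that receives a color; rendering a virtual view would then project the colored voxels onto the virtual camera's imaging plane.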

[0008] Another conceivable approach is to construct a shape model of the object from the acquired three-dimensional shape information and then project that shape model onto the imaging plane of a virtual camera to obtain a two-dimensional image of the object from an arbitrary viewpoint. With this approach, however, topological information such as triangular patches must be supplied on the basis of the acquired three-dimensional shape information, and mismatches with the actual object may arise. IBR, by contrast, uses the very simple data structure of a voxel space with assigned colors, so mismatches with the actual object are unlikely to occur.

[0009]

PROBLEMS TO BE SOLVED BY THE INVENTION

Many passive stereo methods, including the two described above, give highly accurate results but require the internal parameters (focal length and the like) and external parameters (position, orientation, and the like) of the cameras to be calibrated in advance. This means that the cameras cannot be moved once they have been calibrated, which is a major constraint on actual information acquisition. Furthermore, equipment for holding the cameras in place is needed, so that in practice special, large-scale hardware must be used.

[0010] On the other hand, a method has also been proposed that calibrates the various camera parameters from a captured moving image of an object and acquires information on its three-dimensional shape (Marc Pollefeys, Reinhard Koch and Luc Van Gool, "Self-Calibration and Metric Reconstruction in spite of Varying and Unknown Internal Camera Parameters"). In this method, however, the user must manually indicate feature points on the images. Moreover, because the three-dimensional shape information is obtained from feature-point tracking alone, its accuracy is low, and the accurate three-dimensional shape of an object with a complicated shape cannot be recovered.

[0011] When IBR is performed, too, the internal and external parameters of each camera that acquires the captured images must be calibrated in advance, just as in the passive stereo method. IBR therefore suffers from the same problems as the passive stereo method described above.

[0012] The present invention has been made in view of these circumstances. Its object is to provide a three-dimensional shape information acquisition method and apparatus that can acquire highly accurate three-dimensional shape information of an object very easily, without prior calibration and without large hardware; a two-dimensional image acquisition method and apparatus that can acquire a highly accurate two-dimensional image of an object from an arbitrary viewpoint; and a recording medium on which a program for carrying out those acquisition methods is recorded.

[0013]

MEANS FOR SOLVING THE PROBLEMS

The three-dimensional shape information acquisition method according to claim 1 is a method for obtaining information on the three-dimensional shape of an object from images of the object captured by a camera that moves relative to the object, characterized by comprising: a first step of determining, on the basis of the captured images, the internal parameters of the camera and the movement trajectory of the camera relative to the object at the time of capture; and a second step of obtaining information on the three-dimensional shape of the object according to a stereo method, using the determined internal parameters and movement trajectory and the captured images.

[0014] The three-dimensional shape information acquisition method according to claim 2 is characterized in that, in claim 1, a stationary object is photographed with a moving camera to obtain the captured images of the object.

[0015] The three-dimensional shape information acquisition method according to claim 3 is characterized in that, in claim 1, a moving object is photographed with a stationary camera to obtain the captured images of the object.

[0016] The three-dimensional shape information acquisition method according to claim 4 is characterized in that, in any one of claims 1 to 3, the first step includes: a step of extracting feature points in an arbitrary one frame of the captured images; a step of tracking the extracted feature points over a plurality of frames of the captured images; and a step of determining the internal parameters and the movement trajectory on the basis of the tracking result.

[0017] The three-dimensional shape information acquisition method according to claim 5 is characterized in that, in claim 4, the movement direction of the feature points is predicted using the movement trajectory once determined, the prediction result is compared with the tracking result of the feature points, the tracking result is corrected on the basis of the comparison result, and the internal parameters and the movement trajectory are determined again on the basis of the corrected tracking result.

[0018] The three-dimensional shape information acquisition method according to claim 6 is characterized in that, in any one of claims 1 to 5, a factorization method is used when determining the movement trajectory of the camera on the basis of the tracking result of the feature points.

[0019] The three-dimensional shape information acquisition method according to claim 7 is characterized in that, in any one of claims 1 to 6, the multiple-baseline stereo method is used as the stereo method.

[0020] The three-dimensional shape information acquisition apparatus according to claim 8 is an apparatus for obtaining information on the three-dimensional shape of an object from images of the object captured by a camera that moves relative to the object, characterized by comprising: means for determining, on the basis of the captured images, the internal parameters of the camera and the movement trajectory of the camera relative to the object at the time of capture; and means for obtaining information on the three-dimensional shape of the object according to a stereo method, using the determined internal parameters and movement trajectory and the captured images.

[0021] The three-dimensional shape information acquisition apparatus according to claim 9 is characterized in that, in claim 8, it further comprises the camera that moves relative to the object.

[0022] The two-dimensional image acquisition method according to claim 10 is a method for acquiring a two-dimensional image of the object from an arbitrary viewpoint using the information obtained by the three-dimensional shape information acquisition method according to any one of claims 1 to 7, characterized in that a two-dimensional image of the object from an arbitrarily designated viewpoint is acquired using the internal parameters and movement trajectory determined in the first step, the captured images, and the three-dimensional shape information obtained in the second step.

[0023] The two-dimensional image acquisition method according to claim 11 is a method for acquiring a two-dimensional image of an object from an arbitrary viewpoint on the basis of images of the object captured by a camera that moves relative to the object, characterized by comprising: a first step of determining, on the basis of the captured images, the internal parameters of the camera and the movement trajectory of the camera relative to the object at the time of capture; a second step of determining three-dimensional shape information of the object using the determined internal parameters and movement trajectory and the captured images; and a third step of acquiring a two-dimensional image of the object from an arbitrarily designated viewpoint using the determined internal parameters and movement trajectory, the captured images, and the determined three-dimensional shape information.

[0024] The two-dimensional image acquisition method according to claim 12 is characterized in that, in either claim 10 or claim 11, image-based rendering is used when acquiring the two-dimensional image of the object.

[0025] The two-dimensional image acquisition apparatus according to claim 13 is an apparatus for acquiring a two-dimensional image of an object from an arbitrary viewpoint on the basis of images of the object captured by a camera that moves relative to the object, characterized by comprising: means for determining, on the basis of the captured images, the internal parameters of the camera and the movement trajectory of the camera relative to the object at the time of capture; means for determining three-dimensional shape information of the object using the determined internal parameters and movement trajectory and the captured images; and means for acquiring a two-dimensional image of the object from an arbitrarily designated viewpoint using the determined internal parameters and movement trajectory, the captured images, and the determined three-dimensional shape information.

[0026] The recording medium according to claim 14 is a computer-readable recording medium on which is recorded a program for obtaining information on the three-dimensional shape of an object from images of the object captured by a camera that moves relative to the object, characterized by having: program code means for causing the computer to determine, on the basis of the captured images, the internal parameters of the camera and the movement trajectory of the camera relative to the object at the time of capture; and program code means for causing the computer to obtain information on the three-dimensional shape of the object according to a stereo method, using the determined internal parameters and movement trajectory and the captured images.

[0027] The recording medium according to claim 15 is a computer-readable recording medium on which is recorded a program for acquiring a two-dimensional image of an object from an arbitrary viewpoint on the basis of images of the object captured by a camera that moves relative to the object, characterized by having: program code means for causing the computer to determine, on the basis of the captured images, the internal parameters of the camera and the movement trajectory of the camera relative to the object at the time of capture; program code means for causing the computer to obtain three-dimensional shape information of the object using the determined internal parameters and movement trajectory and the captured images; and program code means for causing the computer to acquire a two-dimensional image of the object from an arbitrarily designated viewpoint using the determined internal parameters and movement trajectory, the captured images, and the determined three-dimensional shape information.

[0028] FIG. 1 is a diagram showing the concept of the three-dimensional shape information acquisition method according to the present invention. In the present invention, the target object is photographed with a camera that has not been calibrated in advance to obtain a moving image (A). The internal parameters of the camera (focal length and the like) and the movement trajectory of the camera (its position, orientation, and the like at the time of capture) are then determined from the moving image (B). Using this information and the moving image, information on the three-dimensional shape of the object is acquired according to a stereo method (C).

[0029] The present invention requires no camera calibration and can acquire three-dimensional shape information from a moving image shot freely in any environment, so there is great freedom in when to shoot and no constraint on the environment. No equipment for holding the camera in place is needed either, so the processing can be carried out at low cost. Furthermore, because the three-dimensional shape information is obtained according to a stereo method, highly accurate three-dimensional shape information of the object can be obtained.

[0030] In the present invention, the following processing is performed when determining the internal parameters and movement trajectory of the camera. First, feature points are extracted in an arbitrary one frame of the moving image, and the extracted feature points are tracked over a plurality of frames to obtain a feature-point tracking result. The internal parameters of the camera are then determined on the basis of the tracking result (B1). On the basis of the determined internal parameters and the tracking result, the movement trajectory of the camera is determined using the factorization method (B2). Next, the determined movement trajectory is checked against the tracking result, tracking data that are inconsistent with it are excluded, and the internal parameters are determined again (B1). The movement trajectory of the camera is then determined again using the factorization method, on the basis of the re-determined internal parameters and the tracking result from which the inconsistent data have been excluded (B2). This calibration process is repeated a number of times to obtain the final internal parameters and movement trajectory of the camera. The accurate camera parameters required for stereo processing can thus be obtained.
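
The alternation between estimation (B1, B2) and rejection of inconsistent tracks can be sketched as a generic loop. This is only an outline under stated assumptions: `estimate` and `residual` are hypothetical callables standing in for the parameter and trajectory estimation and for the consistency check, and the toy model below (all features share one per-frame displacement) is a stand-in for the factorization step, which is not reproduced here.

```python
from statistics import mean


def refine(tracks, estimate, residual, tol, max_rounds=5):
    """Alternate estimation and outlier rejection until the track set is stable."""
    kept = dict(tracks)
    model = estimate(kept)
    for _ in range(max_rounds):
        consistent = {k: t for k, t in kept.items() if residual(model, t) <= tol}
        if len(consistent) == len(kept) or not consistent:
            break  # nothing left to reject, or everything would be rejected
        kept = consistent
        model = estimate(kept)  # re-estimate from the consistent tracks only
    return model, kept


def per_frame_step(track):
    """Average displacement per frame of one 1D feature track."""
    return (track[-1] - track[0]) / (len(track) - 1)


def toy_estimate(tracks):
    """Toy motion model: the mean per-frame displacement over all tracks."""
    return mean(per_frame_step(t) for t in tracks.values())


def toy_residual(model, track):
    """Disagreement between one track and the estimated motion."""
    return abs(per_frame_step(track) - model)


tracks = {
    "a": [0.0, 1.0, 2.0],
    "b": [5.0, 6.0, 7.0],
    "c": [2.0, 3.0, 4.0],
    "d": [1.0, 2.0, 3.0],
    "bad": [0.0, 5.0, 10.0],  # a mistracked feature
}
model, kept = refine(tracks, toy_estimate, toy_residual, tol=2.0)
```

In this toy run the first estimate is pulled toward the mistracked feature, the track `"bad"` is then rejected, and the second estimate is computed from the four consistent tracks alone, which mirrors the reason the text gives for repeating the process.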

[0031] In the present invention, the multiple-baseline stereo method is used as the stereo method. With this algorithm, the accuracy of the resulting three-dimensional shape information improves as the number of frames taken from different viewpoints increases. Because the present invention uses a moving image captured by a camera, it can supply the multiple-baseline stereo algorithm with a very large number of multi-viewpoint images; the data density increases, and improved accuracy of the three-dimensional shape information can be expected.

[0032] FIG. 2 is a diagram showing the concept of the two-dimensional image acquisition method according to the present invention. In FIG. 2, processing parts identical to those in FIG. 1 are given the same reference signs, and their description is omitted. A two-dimensional image of the object from an arbitrary viewpoint is acquired (D) using the internal parameters and movement trajectory of the camera determined in (B), the captured moving image, and the three-dimensional shape information of the object acquired in (C).

[0033] The present invention requires no camera calibration and can acquire a two-dimensional image of an object from an arbitrary viewpoint from a moving image shot freely in any environment, so there is great freedom in when to shoot and no constraint on the environment. No equipment for holding the camera in place is needed either, so the processing can be carried out at low cost.

[0034] In the present invention, IBR is used to acquire the two-dimensional image. In IBR, the smaller the parallax between a captured image and the two-dimensional image from the arbitrary viewpoint, the more accurately the two-dimensional image can be synthesized. In the present invention, a large number of captured images are obtained by a camera that moves relative to the object, so a high-quality two-dimensional image can always be acquired on the basis of captured images whose parallax with respect to the two-dimensional image from the arbitrary viewpoint is small.
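
One simple way to pick captured frames with small parallax relative to a requested viewpoint is to rank them by the angle between viewing directions. The sketch below is an assumption made for illustration; the text does not specify how such frames are chosen. The names `angle_between`, `closest_views`, and `on_circle`, and the use of a single target point as the object position, are hypothetical.

```python
import math


def angle_between(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def closest_views(virtual_pos, camera_positions, target, k=2):
    """Indices of the k captured views seen from the direction nearest the virtual one."""
    to_virtual = [p - t for p, t in zip(virtual_pos, target)]

    def angle(i):
        to_camera = [p - t for p, t in zip(camera_positions[i], target)]
        return angle_between(to_virtual, to_camera)

    return sorted(range(len(camera_positions)), key=angle)[:k]


def on_circle(radius, degrees):
    """Point on a circle around the origin in the z = 0 plane."""
    a = math.radians(degrees)
    return (radius * math.cos(a), radius * math.sin(a), 0.0)


# Four recovered camera positions around the object, and a virtual viewpoint at 40 degrees.
captured = [on_circle(2.0, d) for d in (0, 30, 60, 90)]
selected = closest_views(on_circle(3.0, 40), captured, (0.0, 0.0, 0.0))
```

Here the frames captured at 30 and 60 degrees are selected, in that order. A denser set of frames along the trajectory makes the smallest available angle smaller, which is the advantage the paragraph above attributes to using a moving camera.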

[0035]

EMBODIMENTS OF THE INVENTION

The present invention is described in detail below with reference to the drawings showing its embodiments. FIG. 3 is a block diagram showing the configuration of the three-dimensional shape information acquisition apparatus and the two-dimensional image acquisition apparatus of the present invention. In FIG. 3, the main control unit 1 is, specifically, a CPU. It is connected via a bus 2 to the hardware units described below, controls their operation, and executes various software functions including the arithmetic processing described later.

[0036] The reading unit 3 reads analog moving image data of the target object recorded on an external video tape and converts it into digital moving image data. The image memory 4 stores the digital moving image data. The ROM 5 stores in advance the various software programs required for the three-dimensional shape information acquisition processing and the two-dimensional image acquisition processing of the present invention. The RAM 6 consists of SRAM, flash memory, or the like, and stores temporary data generated while the software is running. The display unit 7 displays the finally reconstructed three-dimensional shape model of the object, the two-dimensional image of the object from a designated viewpoint, intermediate information from the three-dimensional shape information acquisition processing and the two-dimensional image acquisition processing, and the like.

[0037] Next, the operation is described. FIGS. 4 and 5 are flowcharts showing the processing procedure of the three-dimensional shape information acquisition method of the present invention. In the following example, a camera trajectory recovery technique based on the factorization method is used to determine the various camera parameters, and the multiple-baseline stereo method is used as the stereo method for obtaining the three-dimensional shape information.
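
For orientation, the factorization method in its classic orthographic form (Tomasi and Kanade) can be summarized as follows. This is a sketch of the general technique only; the text up to this point does not say which camera model or variant the embodiment uses, so the orthographic model and the symbols $\tilde{W}$, $M$, $S$, $A$, $F$, and $P$ are assumptions made for illustration.

```latex
% F frames, P tracked feature points, image coordinates (u_{fp}, v_{fp});
% \bar{u}_f and \bar{v}_f are the per-frame means over all points.
\tilde{W} =
\begin{bmatrix}
 u_{11}-\bar{u}_1 & \cdots & u_{1P}-\bar{u}_1 \\
 \vdots           &        & \vdots           \\
 u_{F1}-\bar{u}_F & \cdots & u_{FP}-\bar{u}_F \\
 v_{11}-\bar{v}_1 & \cdots & v_{1P}-\bar{v}_1 \\
 \vdots           &        & \vdots           \\
 v_{F1}-\bar{v}_F & \cdots & v_{FP}-\bar{v}_F
\end{bmatrix}
\in \mathbb{R}^{2F \times P},
\qquad
\tilde{W} = M S, \quad
M \in \mathbb{R}^{2F \times 3}, \;
S \in \mathbb{R}^{3 \times P}.

% Rank-3 truncation of the singular value decomposition gives an affine solution.
\tilde{W} \approx U_3 \Sigma_3 V_3^{\top}, \qquad
\hat{M} = U_3 \Sigma_3^{1/2}, \quad
\hat{S} = \Sigma_3^{1/2} V_3^{\top}.

% Metric upgrade: choose an invertible 3x3 matrix A with
% M = \hat{M} A and S = A^{-1} \hat{S} such that the rows
% \mathbf{i}_f^{\top}, \mathbf{j}_f^{\top} of M (camera axes in frame f) satisfy
\lVert \mathbf{i}_f \rVert = \lVert \mathbf{j}_f \rVert = 1, \qquad
\mathbf{i}_f^{\top} \mathbf{j}_f = 0 \qquad (f = 1, \dots, F).
```

Under this model, $M$ carries the camera orientation in each frame (the motion) and $S$ the three-dimensional positions of the tracked points (the shape), so a single decomposition of the tracking result yields both.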

【００３８】（ステップＳ１：ビデオカメラによる撮
影）三次元形状情報の取得対象の物体をビデオカメラに
より撮影する。この際、視差情報を与えるために、ビデ
オカメラを積極的に動かして撮影を行うことが好まし
く、急激なカメラ移動のために１フレーム内の画像に大
きなぶれが生じなければどのようにビデオカメラを動か
しても良い。但し、カメラ光軸と瞬間的なカメラの移動
方向とが同一であるような画像では後の処理を行えな
い。なお、隣接する位置で撮影された画像であまり大き
な視差がなければ、デジタルカメラ，通常のフィルムを
使用するスチルカメラで撮影した連続写真でも良い。(Step S1: Photographing by Video Camera) The object whose three-dimensional shape information is to be obtained is photographed with a video camera. To provide parallax information, it is preferable to move the video camera actively while shooting; the camera may be moved in any way, provided that rapid camera motion does not cause large blur within a single frame. However, the subsequent processing cannot be performed on images in which the camera's optical axis coincides with its instantaneous direction of movement. A sequence of photographs taken with a digital camera or with a still camera using ordinary film may also be used, provided that the parallax between images taken at adjacent positions is not too large.

【００３９】（ステップＳ２：動画像の取込み）ビデオ
カメラで撮影した物体のアナログの動画像データをその
ビデオテープなどの画像媒体から読出部３にて読み出
し、そのアナログの動画像データをディジタル化してデ
ィジタルの動画像データを得る。なお、画像媒体から動
画像データを読み出すのではなく、リアルタイムでビデ
オカメラから直接動画像データを取り込むようにしても
良い。得たディジタルの動画像データを一旦画像メモリ
４に格納する。(Step S2: Capture of Moving Image) The reading unit 3 reads the analog moving image data of the object photographed by the video camera from an image medium such as a video tape, and digitizes it to obtain digital moving image data. Instead of reading the moving image data from an image medium, the data may be captured directly from the video camera in real time. The digital moving image data thus obtained is temporarily stored in the image memory 4.

【００４０】（ステップＳ３：動画像の前処理）後述す
る特徴点抽出処理，特徴点追跡処理を円滑に行うため
に、ディジタルの動画像にいくつかの前処理を施す。(Step S3: Preprocessing of Moving Image) In order to smoothly perform a feature point extraction process and a feature point tracking process described later, some preprocessing is performed on a digital moving image.

【００４１】〔画像補間処理〕日本，アメリカなどで使
用されるビデオ映像信号規格ＮＴＳＣでは飛び越し走査
を行うので、１フィールド分の画像は櫛の歯状になって
いる。動きが激しい動画像では、奇数フィールド，偶数
フィールドでの走査時間差のために画像が水平方向に大
きく異なってしまうことがある。画像面内輝度の微分量
を用いる特徴点抽出，特徴点追跡にとって、この現象は
大きな誤差の原因となる。よって、本例では、奇数フィ
ールドまたは偶数フィールドの何れか一方のみを使用
し、他方のフィールド画像は垂直方向の補間処理によっ
て作成する。[Image Interpolation Processing] The NTSC video signal standard used in Japan, the United States, and elsewhere employs interlaced scanning, so the image of one field has a comb-like structure. In a moving image with rapid motion, the difference in scanning time between the odd and even fields can cause the two fields to differ greatly in the horizontal direction. For feature point extraction and feature point tracking, which use derivatives of the luminance in the image plane, this phenomenon is a source of large errors. In this example, therefore, only one of the odd and even fields is used, and the other field image is created by vertical interpolation.
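The field interpolation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `interpolate_missing_field`, the list-of-rows image representation, and the use of plain averaging of the rows above and below are all assumptions, since the text only states that the discarded field is rebuilt by vertical interpolation.

```python
def interpolate_missing_field(frame, keep="even"):
    """Rebuild a full frame from one field by vertical interpolation.

    frame: list of rows, each a list of luminance values (interlaced frame).
    keep:  "even" keeps rows 0, 2, 4, ...; "odd" keeps rows 1, 3, 5, ...
    Rows of the other field are replaced by the average of the kept rows
    directly above and below (or a copy of the single neighbour at a border).
    """
    height = len(frame)
    start = 0 if keep == "even" else 1
    out = [list(row) for row in frame]
    for y in range(height):
        if (y - start) % 2 == 0:
            continue  # this row belongs to the kept field
        neighbours = [frame[k] for k in (y - 1, y + 1) if 0 <= k < height]
        if not neighbours:
            continue  # single-row image: nothing to interpolate from
        out[y] = [sum(vals) / len(neighbours) for vals in zip(*neighbours)]
    return out
```

Keeping a single field halves the vertical resolution but removes the comb artifact, which is the trade-off the paragraph above accepts for the sake of stable luminance derivatives.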

【００４２】〔ピラミッド画像及びオプティカルフロー
計算〕各フレームで検出される点特徴を時間的に前後の
フレーム上の同一点と考えられる特徴点と結びつけるた
めにオプティカルフローを予め計算しておく。動画像
は、時間的に連続して撮影した画像であるので、時間に
関して微分可能である。三次元空間の同一の点の輝度Ｉ
が隣合うフレーム間で変化しないとすると、動画像内の
連続する２フレーム間では、以下の式（１）のような時
空間微分方程式が成り立つ。式（１）においてＩx ，Ｉ
y ，Ｉt はそれぞれ画像の水平方向，垂直方向，時間方
向についての偏微分であり、下記式（２）で示されるも
のは、水平方向と垂直方向との運動ベクトルであって、
オプティカルフローと呼ばれる。[Pyramid Image and Optical Flow Calculation] The optical flow is calculated in advance so that a point feature detected in each frame can be linked to the feature points regarded as the same point in the temporally preceding and following frames. Because a moving image consists of images captured continuously in time, it is differentiable with respect to time. Assuming that the luminance I of the same point in three-dimensional space does not change between adjacent frames, a spatio-temporal differential equation such as the following equation (1) holds between two consecutive frames of the moving image. In equation (1), Ix, Iy, and It are the partial derivatives of the image in the horizontal, vertical, and time directions, respectively. The quantity given by the following equation (2) is the motion vector in the horizontal and vertical directions and is called the optical flow.

【００４３】[0043]

【数１】 (Equation 1)
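The images for equations (1) and (2) are not reproduced in this text. Under the brightness-constancy assumption stated above, the constraint presumably takes the standard form below, where the component names u and v are assumed notation for the horizontal and vertical parts of the optical flow.

```latex
I_x\,u + I_y\,v + I_t = 0 \tag{1}

(u,\; v) = \left( \frac{dx}{dt},\; \frac{dy}{dt} \right) \tag{2}
```

Equation (1) gives one linear constraint per pixel on the two unknowns u and v, which is why an additional condition and an iterative solver such as the relaxation method are needed.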

【００４４】時空間微分方程式を解くために、本例で
は、relaxation法を用いる。この手法は基本的に繰り返
し漸近法であるため、初期値がある程度解に近くなけれ
ばならない。そこで、本例では、画像を１／２，１／４
などのサイズに縮小して対応点の距離を縮めておいてこ
の手法を適用する。その後、小さな画像のオプティカル
フローを初期値として次の段階の大きな画像のオプティ
カルフローを計算する。このオプティカルフローは、特
徴点追跡データを作成するフレーム全てにわたって計算
される。To solve the spatio-temporal differential equation, this example uses a relaxation method. Because this is essentially an iterative method that converges gradually, the initial value must be reasonably close to the solution. In this example, therefore, the image is first reduced to 1/2, 1/4, and so on of its original size to shorten the distance between corresponding points, and the method is applied to the reduced image. The optical flow of the smaller image is then used as the initial value for calculating the optical flow of the next larger image. The optical flow is calculated for all frames from which feature point tracking data are created.

【００４５】（ステップＳ４：特徴点抽出）各フレーム
において特徴点を抽出する。三次元空間内の物体の頂点
などの特徴点は、画像内輝度の微分値がその方向に関係
なく大きく変化している位置として検出される。(Step S4: Feature Point Extraction) Feature points are extracted in each frame. A feature point such as a vertex of an object in three-dimensional space is detected as a position where the derivative of the image luminance changes greatly regardless of direction.
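One common way to test for "large luminance change regardless of direction" is the smaller eigenvalue of the windowed gradient matrix. The sketch below uses that criterion as an assumption; the text does not name a specific detector, and the function name `corner_response`, the window size, and the central-difference gradients are hypothetical choices.

```python
def corner_response(img, y, x, half=1):
    """Smaller eigenvalue of the 2x2 gradient matrix summed over a window.

    img is a list of rows of luminance values. The pixel (y, x) must lie at
    least half + 1 pixels away from every image border. A large value means
    the luminance changes strongly in every direction (corner-like point);
    a value near zero means a flat region or a straight edge.
    """
    sxx = sxy = syy = 0.0
    for v in range(y - half, y + half + 1):
        for u in range(x - half, x + half + 1):
            ix = (img[v][u + 1] - img[v][u - 1]) / 2.0  # horizontal derivative
            iy = (img[v + 1][u] - img[v - 1][u]) / 2.0  # vertical derivative
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    trace = sxx + syy
    det = sxx * syy - sxy * sxy
    disc = max(trace * trace / 4.0 - det, 0.0)
    return trace / 2.0 - disc ** 0.5
```

Feature points would then be taken where this response exceeds a threshold and is a local maximum; both of those selection rules are likewise assumptions.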

【００４６】（ステップＳ５：特徴点追跡）ステップＳ
３で計算しておいたオプティカルフローを用いて、各フ
レームにおいて抽出した特徴点を連結する。オプティカ
ルフローから任意のフレーム上のある画素が次のフレー
ムでどの位置に移動するかを予測し、両フレーム間の特
徴点を三次元空間上の同一の点として連結する。この
際、ステップＳ４で述べた考えも利用する。(Step S5: Feature Point Tracking) The feature points extracted in each frame are linked using the optical flow calculated in step S3. The optical flow is used to predict where a pixel in a given frame moves to in the next frame, and the feature points in the two frames are linked as the same point in three-dimensional space. The idea described in step S4 is also used here.

【００４７】（ステップＳ６：カメラ内部パラメータ自
己較正）ステップＳ５までで得られた特徴点追跡データ
から、撮影に使用したカメラの内部パラメータを較正す
る。カメラの内部パラメータには、レンズ焦点距離，画
像中心座標，画素アスペクトレシオなどがある。これら
の中で、レンズ焦点距離以外は通常のレンズでは既知で
あるので、実質的にはレンズ焦点距離のみの較正を行う
ことが一般的である。但し、画像中心座標，画素アスペ
クトレシオの較正が必要な場合もある。このステップＳ
６で具体的に示すアルゴリズムは、これらの全ての内部
パラメータを決定することができる。(Step S6: Self-Calibration of Camera Internal Parameters) The internal parameters of the camera used for shooting are calibrated from the feature point tracking data obtained up to step S5. The internal parameters of a camera include the lens focal length, the image center coordinates, and the pixel aspect ratio. Of these, all but the focal length are known for ordinary lenses, so in practice it is common to calibrate only the focal length. In some cases, however, the image center coordinates and the pixel aspect ratio must also be calibrated. The algorithm described for this step S6 can determine all of these internal parameters.

【００４８】動画像内の任意の初期フレームのプロジェ
クションマトリクスをＰ₁＝〔Ｉ｜０〕と表せることが
知られている。これ以外のフレームでのプロジェクショ
ンマトリクスは、初期フレームとこのフレームとのファ
ンダメンタルマトリクスを用いて表すことが可能であ
る。注目するフレームのプロジェクションマトリクス
Ｐ′は下記式（３）で表せる。但し、マトリクスＭは下
記式（４）で示され、マトリクスｅ及びＦはそれぞれそ
のフレームのエピ極及びファンダメンタルマトリクスと
なる。また、行列〔〕x は任意の３×１ベクトルに対
して下記式（５）のように定義する。It is known that the projection matrix of an arbitrary initial frame in the moving image can be written as P₁ = [I | 0]. The projection matrix of any other frame can be expressed using the fundamental matrix between the initial frame and that frame. The projection matrix P′ of the frame of interest is given by the following equation (3), where the matrix M is given by the following equation (4), and e and F are the epipole and the fundamental matrix of that frame, respectively. The matrix [ ]x is defined for an arbitrary 3 × 1 vector as in the following equation (5).

【００４９】[0049]

【数２】 (Equation 2)

【００５０】このようにして得られたプロジェクション
マトリクス及び無限遠にある平面上の絶対円錐曲線Ωと
その射影である撮像面上の円錐曲線ωとには、以下の式
（６）のような関係があることが分かっている。また、
このマトリクスΩ，ωはそれぞれ式（７），式（８）の
ような対称行列であることが知られている。The projection matrices obtained in this way, the absolute conic Ω on the plane at infinity, and its projection, the conic ω on the image plane, are known to satisfy the relationship given by the following equation (6). The matrices Ω and ω are also known to be symmetric matrices of the forms given by equations (7) and (8), respectively.

【００５１】[0051]

【数３】 (Equation 3)

【００５２】この結果から、ω₁₁＝ω₂₂，ω₁₂＝ω₁₃＝
ω₂₃＝０などの方程式が導き出されるが、マトリクスΩ
の各成分は各フレームのプロジェクションマトリクスを
用いて条件付きの線形最小二乗法を適用することにより
得ることができる。この場合、マトリクスΩのランクは
３であることが分かっているので、一度特異値分解し、
最小の固有値を０にしてマトリクスを再構成する。これ
により、第１フレームの焦点距離が得られ、得られたマ
トリクスΩを式（６）に代入することにより、その他の
フレームの焦点距離が得られる。From this result, equations such as ω₁₁ = ω₂₂ and ω₁₂ = ω₁₃ = ω₂₃ = 0 are derived, and the components of the matrix Ω can be obtained by applying constrained linear least squares using the projection matrix of each frame. Because the rank of the matrix Ω is known to be 3, a singular value decomposition is performed once, the smallest eigenvalue is set to 0, and the matrix is reconstructed. This yields the focal length of the first frame, and substituting the resulting matrix Ω into equation (6) yields the focal lengths of the other frames.

【００５３】（ステップＳ７：弱中心射影モデルによる
因子分解法を用いたカメラ移動軌跡の復元）カメラの内
部パラメータ及び特徴点追跡データから、カメラ移動軌
跡を復元する。ここでは、米国カーネギーメロン大学の
金出武雄教授らが開発した因子分解法を使用する。因子
分解法の特徴は、カメラ移動，物体形状に移動の滑らか
さなどの拘束条件を必要としないことであり、また、数
値計算的に安定した特異値分解を使用するために結果が
安定している。フレームｊ（ｊ＝１，…，ｍ）の画像上
の点ｉ（ｉ＝１，…，ｎ）の座標（ｘ_ij，ｙ_ij）を特徴
点の重心を原点とした座標系の座標値とする。弱中心射
影モデルでは、画像上の特徴点座標（ｘ_ij，ｙ _ij）と三
次元空間上の特徴点座標（Ｘ_i，Ｙ_i）には以下の式
（９）のような関係がある。(Step S7: Restoration of Camera Movement Trajectory by the Factorization Method Using a Weak Perspective Projection Model) The camera movement trajectory is restored from the internal parameters of the camera and the feature point tracking data. The factorization method developed by Professor Takeo Kanade and colleagues at Carnegie Mellon University in the United States is used here. The factorization method does not require constraints, such as smoothness of motion, on the camera movement or the object shape, and its results are stable because it relies on singular value decomposition, which is numerically stable. Let the coordinates (x_ij, y_ij) of point i (i = 1, ..., n) on the image of frame j (j = 1, ..., m) be expressed in a coordinate system whose origin is the centroid of the feature points. In the weak perspective projection model, the feature point coordinates (x_ij, y_ij) on the image and the feature point coordinates (X_i, Y_i) in three-dimensional space are related as in the following equation (9).

【００５４】[0054]

【数４】 (Equation 4)

【００５５】ここで、ｓ_jは物体のスケールに相対する
ｊ番目のフレームの画像スケール、ベクトルｒ_1j，ｒ_2j
はそれぞれ物体座標系に相対するｊ番目のフレームのカ
メラ座標系の回転行列の１番目，２番目の行ベクトルで
ある。物体のスケールは１番目のフレームの画像と同じ
スケール（ｓ₁＝１）とし、また、物体の座標系の姿勢
も１番目のフレームのカメラ座標系と同じにする（ｒ₁₁
＝〔１，０，０〕^T，ｒ₂₁＝〔０，１，０〕^T）。Here, s_j is the image scale of the j-th frame relative to the scale of the object, and the vectors r_1j and r_2j are the first and second row vectors, respectively, of the rotation matrix of the camera coordinate system of the j-th frame relative to the object coordinate system. The scale of the object is set equal to that of the image of the first frame (s₁ = 1), and the orientation of the object coordinate system is set equal to that of the camera coordinate system of the first frame (r₁₁ = [1, 0, 0]^T, r₂₁ = [0, 1, 0]^T).

【００５６】全てのフレームの全ての特徴点を１つの計
測行列に配置すると、下記式（10）が得られる。When all the feature points of all the frames are arranged in one measurement matrix, the following equation (10) is obtained.

【００５７】[0057]

【数５】 (Equation 5)

【００５８】カメラ移動軌跡を示すマトリクスＭ及び形
状を示すマトリクスＳは何れもランク３であるので、マ
トリクスＤのランクは３でなければならない。しかし特
徴点追跡の失敗，量子化誤差によって、４番目以降の特
異値は０にならない。そこで、特異値分解すると、下記
式（11）となり、Σ４番目以降の特異値を０にし、同時
に対応する特異ベクトルも削除する。これらをそれぞれ
マトリクスＶ′，Σ′，Ｕ′とすると、式（12）とし
て、マトリクスＤを再構成したマトリクスＤ′を得る。Since the matrix M representing the camera movement trajectory and the matrix S representing the shape both have rank 3, the matrix D must also have rank 3. In practice, however, the fourth and subsequent singular values are not zero because of feature point tracking failures and quantization errors. The matrix is therefore decomposed by singular value decomposition as in the following equation (11), the fourth and subsequent singular values in Σ are set to 0, and the corresponding singular vectors are removed. Denoting the resulting matrices by V′, Σ′, and U′, the matrix D′, a reconstruction of the matrix D, is obtained as in equation (12).
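The rank-3 truncation described above can be sketched with NumPy as follows. This is an illustrative sketch only: the function name `rank3_approximation` and the variable names are assumptions, and `numpy.linalg.svd` returns its factors in the order U, singular values, Vᵀ, which need not match the symbols V′, Σ′, U′ used in equations (11) and (12).

```python
import numpy as np


def rank3_approximation(D):
    """Best rank-3 approximation of the measurement matrix D.

    D is the 2m x n matrix of centred feature point coordinates.
    Returns the reconstructed matrix and the three retained factors.
    """
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    # Keep the three largest singular values and their singular vectors;
    # everything from the fourth singular value on is discarded.
    U3, s3, Vt3 = U[:, :3], s[:3], Vt[:3, :]
    D3 = (U3 * s3) @ Vt3
    return D3, U3, s3, Vt3
```

The discarded singular values give a rough measure of how much tracking noise the measurement matrix contains, which can be useful when judging the tracking quality before the later steps.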

【００５９】[0059]

【数６】 (Equation 6)

【００６０】このマトリクスＤ′は、下記（13）を最小
にするランク３の行列である。マトリクスＤ′からマト
リクスＭ，Ｓを求めたいが、マトリクスＭ，Ｓの組合せ
は唯一ではない。なぜなら、任意の３×３の正則マトリ
クスＣが下記式（14）を満たすからである。This matrix D′ is the rank-3 matrix that minimizes the following expression (13). The matrices M and S are to be obtained from D′, but the pair M, S is not unique, because an arbitrary nonsingular 3 × 3 matrix C satisfies the following equation (14).

【００６１】[0061]

【数７】 (Equation 7)

【００６２】そこで、下記式（15）の関係を満たすマト
リクスＣを求める。弱中心投影の拘束条件から以下の式
（16）のようになることが知られている。Therefore, a matrix C satisfying the relationship of the following equation (15) is sought. The constraints of weak perspective projection are known to give the following equation (16).

【００６３】[0063]

【数８】 (Equation 8)

【００６４】この式（16）から、下記式（17）を満たす
マトリクスＢを考えると、マトリクスＢの６個の成分に
関する（２ｍ＋１）個の線形方程式が得られる。これら
の方程式から最小二乗法によりマトリクスＢの成分を決
定し、更に、ｒ_1j＝〔１，０，０〕^T，ｒ_2j＝〔０，
１，０〕^Tの条件を加えて、マトリクスＣを得ることが
できる。このマトリクスＣより、以下の式（18）のよう
にカメラ移動軌跡が決まる。Introducing a matrix B that satisfies the following equation (17), equation (16) yields (2m + 1) linear equations in the six components of B. The components of B are determined from these equations by the least squares method, and the matrix C is then obtained by adding the conditions r_1j = [1, 0, 0]^T and r_2j = [0, 1, 0]^T for the first frame (j = 1). From this matrix C, the camera movement trajectory is determined as in the following equation (18).

【００６５】[0065]

【数９】 (Equation 9)

【００６６】なお、ここで、以下の式（19）から形状を
求めることは可能であるが、この形状は特徴点の追跡結
果のみから求められたものであって、その精度は低いの
で、本例では採用しない。The shape could also be obtained here from the following equation (19). However, that shape would be derived solely from the feature point tracking results and its accuracy is low, so it is not adopted in this example.

【００６７】[0067]

【数１０】 (Equation 10)

【００６８】（ステップＳ８，Ｓ９：特徴点追跡の評
価）ステップＳ５で得られた特徴点追跡結果を、復元さ
れたカメラ移動軌跡で誤差評価し、追跡が失敗している
と考えられる追跡データを除去する。(Steps S8 and S9: Evaluation of Feature Point Tracking) The errors in the feature point tracking results obtained in step S5 are evaluated against the restored camera movement trajectory, and tracking data for which tracking is considered to have failed are removed.

【００６９】ステップＳ６で得られたカメラの内部パラ
メータ及びステップＳ７で得られたカメラ移動軌跡を用
いて、各フレーム画像間のファンダメンタルマトリクス
を求める。このファンダメンタルマトリクスＦは、具体
的には下記式（20）で求められる。ここで、マトリクス
Ａはカメラの内部パラメータを表し、具体的には下記式
（21）で表される。但し、画素中心を（０，０）、アス
ペクトレシオは１：１としている。またマトリクスＥ
は、エッセンシャルマトリクスと呼ばれ、具体的には下
記式（22）で示される。なお、マトリクスＲは、カメラ
移動の中の回転を表す。The fundamental matrix between frame images is obtained using the internal parameters of the camera obtained in step S6 and the camera movement trajectory obtained in step S7. Specifically, the fundamental matrix F is obtained from the following equation (20). Here, the matrix A represents the internal parameters of the camera and is given by the following equation (21), with the image center taken as (0, 0) and the aspect ratio as 1:1. The matrix E is called the essential matrix and is given by the following equation (22). The matrix R represents the rotation component of the camera movement.

【００７０】[0070]

【数１１】 (Equation 11)

【００７１】このようにして得られたファンダメンタル
マトリクスから、特徴点に対応する隣合うフレーム上の
エピ極線を得ることができる。いま注目している特徴点
は、隣合うフレームではこのエピ極線にのっているはず
である。そこで、隣合うフレームでの特徴点位置とその
特徴点がのっているはずのエピ極線との距離が、追跡精
度の評価基準として用いられる。それは、具体的には下
記式（23）で示される。From the fundamental matrix obtained in this way, the epipolar line on the adjacent frame corresponding to a feature point can be obtained. The feature point of interest should lie on this epipolar line in the adjacent frame. The distance between the feature point position in the adjacent frame and the epipolar line on which it should lie is therefore used as the criterion for evaluating tracking accuracy. Specifically, it is given by the following equation (23).
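The distance between a tracked point and its epipolar line can be sketched as follows. The image for equation (23) is not reproduced in this text, so this is the standard point-to-line distance rather than the patent's exact expression; the function names and the convention that the line in the adjacent frame is F applied to the homogeneous point (x, y, 1) of the current frame are assumptions.

```python
import math


def epipolar_line(F, p):
    """Epipolar line (a, b, c) in the adjacent frame for point p = (x, y).

    F is a 3x3 fundamental matrix given as a list of rows. The line
    satisfies a*x' + b*y' + c = 0 for the matching point (x', y').
    """
    x, y = p
    return tuple(F[r][0] * x + F[r][1] * y + F[r][2] for r in range(3))


def point_line_distance(line, q):
    """Perpendicular distance from point q = (x', y') to the line (a, b, c)."""
    a, b, c = line
    x, y = q
    return abs(a * x + b * y + c) / math.hypot(a, b)
```

For a camera that translates purely along its x-axis, each epipolar line is the horizontal line through the original point, so the distance reduces to the vertical displacement of the tracked point.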

【００７２】[0072]

【数１２】 (Equation 12)

【００７３】式（23）において、ωが大きいほど精度が
良い。このωの値が所定の閾値を超えた場合に、追跡失
敗としてその特徴点のデータをカメラ移動追跡データか
ら削除する。このアルゴリズムでは、マトリクスＦ自体
の計算に追跡を失敗した特徴点のデータが含まれている
ので、ステップＳ６に戻ってより正しいマトリクスＦを
求める必要がある。このようなサイクルを複数回繰り返
すことにより、後続のマルチプルベースラインステレオ
法のアルゴリズムの適用に耐えうる精度を持つデータを
得ることができる。In equation (23), the larger ω is, the better the accuracy. When the value of ω exceeds a predetermined threshold, tracking is regarded as having failed and the data for that feature point are deleted from the camera movement tracking data. In this algorithm, the calculation of the matrix F itself includes data for feature points whose tracking has failed, so it is necessary to return to step S6 and obtain a more accurate matrix F. Repeating this cycle several times yields data accurate enough for the subsequent multiple-baseline stereo algorithm.

【００７４】（ステップＳ10：パースペクティブモデル
による非線形最小二乗法を用いたカメラ移動軌跡の復
元）ステップＳ７で用いた弱中心投影モデルは、画像座
標と三次元座標とが線形の関数で表現されているが、実
際のカメラでは中心投影モデルが妥当であり、これは、
以下の式（24）のような非線形な関係を持つ。なお、ｆ
は焦点距離である。(Step S10: Restoration of Camera Movement Trajectory by Nonlinear Least Squares Using a Perspective Model) In the weak perspective projection model used in step S7, the image coordinates are expressed as a linear function of the three-dimensional coordinates. For an actual camera, however, the central (perspective) projection model is appropriate, and it has a nonlinear relationship such as the following equation (24), where f is the focal length.

【００７５】[0075]

【数１３】 (Equation 13)
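The image for equation (24) is not reproduced in this text. For a camera with focal length f, central projection is commonly written as below. The camera-frame coordinates (X, Y, Z) of the point and the placement of the image origin on the optical axis are assumptions, and the patent's own equation may additionally carry the rotation and translation of each frame.

```latex
x = f\,\frac{X}{Z}, \qquad y = f\,\frac{Y}{Z}
```

The division by the depth Z is what makes the relationship nonlinear, in contrast to the weak perspective model, where every point of a frame shares a single scale factor.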

【００７６】そこで、下記式（25）のような評価関数を
設定する。Therefore, an evaluation function such as the following equation (25) is set.

【００７７】[0077]

【数１４】 (Equation 14)

【００７８】この式（25）は、三次元物体が三次元運動
を経て射影される像と、実際に得られる画像との距離誤
差の自乗和である。ステップＳ７で得られるカメラ移動
と形状の復元結果とを初期値とし、Levenberg-Marquardt
法を用いてこの非線形最小二乗問題を解く。これによ
り、更に精度が高いカメラ移動軌跡を復元する。Equation (25) is the sum of the squared distance errors between the image obtained by projecting the three-dimensional object after its three-dimensional motion and the image actually observed. Taking the camera movement and the shape restored in step S7 as initial values, this nonlinear least squares problem is solved by the Levenberg-Marquardt method. This restores a more accurate camera movement trajectory.

【００７９】（ステップＳ11：ベース画像の決定）復元
されたカメラ移動軌跡及び特徴点の座標値に基づき、後
続するマルチプルベースラインステレオ法に用いるベー
ス画像を決定する。マルチプルベースラインステレオ法
では、距離画像を生成しようとする画像（フレーム）を
ベース画像と呼ぶ。また、その距離画像を生成するため
に参照する複数の画像（フレーム）を参照画像と呼ぶ。
ビデオカメラで撮影した場合、どの複数フレームをベー
ス画像にしても距離画像は得られるが、物体の形状によ
ってはベース画像の選択により正確な三次元モデルを期
待できない場合がある。(Step S11: Determination of Base Images) The base images used in the subsequent multiple-baseline stereo method are determined from the restored camera movement trajectory and the coordinate values of the feature points. In the multiple-baseline stereo method, an image (frame) for which a range image is to be generated is called a base image, and the multiple images (frames) referred to in generating that range image are called reference images. When shooting with a video camera, a range image can be obtained whichever frames are chosen as base images, but depending on the shape of the object, some choices of base image cannot be expected to give an accurate three-dimensional model.

【００８０】図６は、この理由を説明するための図であ
る。図６（ａ）に示す例では、対象となる物体10の面と
ビデオカメラ11の光軸とのなす角αが、直角から大きく
ずれているので、データ密度，精度が何れも悪くなる。
これに対して図６（ｂ）に示す例では、この角αが直角
に近いので、データ密度，精度が何れも良好となる。FIG. 6 illustrates the reason. In the example shown in FIG. 6(a), the angle α between the surface of the target object 10 and the optical axis of the video camera 11 deviates greatly from a right angle, so both the data density and the accuracy are poor. In the example shown in FIG. 6(b), by contrast, the angle α is close to a right angle, so both the data density and the accuracy are good.

【００８１】そこで、ステップＳ４で抽出した特徴点に
対しドロネーメッシュ生成を行う。ステップＳ10で求め
た特徴点の三次元空間での座標値をこのメッシュに対し
て与え、物体の大まかな三次元形状モデルを得る。この
三次元形状モデルとカメラ移動軌跡とから各フレームが
この物体の距離画像を生成するのにどれだけ適している
かを評価する。この際の具体的な評価値は下記式（26）
で与える。なお、ｓ_ijはｊ番目のフレームで見えるｉ番
の三角パッチが注目するフレーム中の像の面積である。
また、ベクトルｎ_ijはｊ番目のフレームで見えるｉ番の
三角パッチの代表法線ベクトルである。また、ベクトル
ｃ_jはｊ番目のフレームのカメラｚ軸を表す単位ベクト
ルである。Therefore, a Delaunay mesh is generated for the feature points extracted in step S4. The three-dimensional coordinate values of the feature points obtained in step S10 are assigned to this mesh to obtain a rough three-dimensional shape model of the object. From this three-dimensional shape model and the camera movement trajectory, each frame is evaluated for how suitable it is for generating a range image of the object. The evaluation value is given by the following equation (26). Here, s_ij is the area, in the frame of interest, of the image of the i-th triangular patch visible in the j-th frame; the vector n_ij is the representative normal vector of the i-th triangular patch visible in the j-th frame; and the vector c_j is the unit vector representing the camera z-axis of the j-th frame.

【００８２】[0082]

【数１５】 (Equation 15)

【００８３】式（26）で示される評価値は、注目するフ
レームでの物体面の見えやすさを表す。この評価値によ
りベース画像候補を決定するが、このままでは特定の方
向から撮影されたフレームばかりに偏る恐れがあるの
で、カメラ移動軌跡を所定距離で適当に分割し、それぞ
れのカメラ移動軌跡セグメント内で上記の処理を行っ
て、それぞれで評価値が高いフレームをベースフレーム
とする。The evaluation value given by equation (26) indicates how readily the object surface can be seen in the frame of interest. Base image candidates are determined from this evaluation value. Used as is, however, the selection could be biased toward frames shot from one particular direction. The camera movement trajectory is therefore divided into segments of a predetermined length, the above processing is performed within each segment, and a frame with a high evaluation value in each segment is taken as a base frame.

【００８４】（ステップＳ12：Zeta方向の範囲の決定）
後続するマルチプルベースラインステレオ処理では、Ｓ
ＳＳＤ（Sum of Sum of Squared Difference）と呼ばれ
るベース画像と参照画像との各部分の相関を評価するプ
ロファイルを作成する。ベース画像の各画素毎に奥行き
に沿ってＳＳＳＤプロファイルは作成されるが、これは
非常に計算コストが高い処理であり、計算する奥行きの
領域（カメラの中心から物体表面での距離）によって全
体の処理時間がほぼ決まる。(Step S12: Determination of Range in the Zeta Direction) In the subsequent multiple-baseline stereo processing, a profile called the SSSD (Sum of Sum of Squared Differences) is created to evaluate the correlation between corresponding parts of the base image and the reference images. An SSSD profile is created along the depth direction for each pixel of the base image. This is a computationally very expensive process, and the overall processing time is determined almost entirely by the depth range (distance from the camera center to the object surface) over which it is calculated.

【００８５】ところが、物体表面がカメラの中心からど
れほど離れているかは、通常形状復元が終了しないと分
からないので、厳密にはステレオ視に用いた視差を検出
できる奥行きまでＳＳＳＤプロファイルを作成しなけれ
ばならない。しかし、それほど遠くまでの距離画像を必
要とすることは通常考えられず、また、ＳＳＳＤプロフ
ァイル作成に要する時間が莫大すぎて現実的でない。However, how far the object surface is from the camera center is usually not known until shape restoration is complete, so strictly speaking the SSSD profile would have to be created out to the greatest depth at which the parallax used for stereo vision can be detected. A range image extending that far is rarely needed, and the time required to create such an SSSD profile would be too great to be practical.

【００８６】そこで、ステップＳ10で求めた物体の特徴
点の座標値に基づいて、このＳＳＳＤプロファイル作成
の領域を決定する。そのフレームから見て、最も近い特
徴点と最も遠い特徴点とを使い、両点の間をＳＳＳＤプ
ロファイル作成領域とする。これにより、比較的短時間
でマルチプルベースラインステレオ法による形状復元を
達成することができる。The range over which the SSSD profile is created is therefore determined from the coordinate values of the feature points of the object obtained in step S10. The feature points nearest to and farthest from the frame are identified, and the interval between them is taken as the SSSD profile creation range. This allows shape restoration by the multiple-baseline stereo method to be achieved in a relatively short time.
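The depth range selection described above can be sketched as follows. The function name `depth_search_range`, the pose convention X_cam = R X_world + t, and the use of depth along the camera z-axis (rather than Euclidean distance from the camera center) are assumptions made for illustration.

```python
def depth_search_range(points_world, R, t):
    """Nearest and farthest feature point depth as seen from one frame.

    points_world: iterable of (X, Y, Z) feature point coordinates.
    R: 3x3 rotation as a list of rows; t: translation (tx, ty, tz),
       assumed to map world coordinates into the camera frame.
    Points behind the camera (non-positive depth) are ignored.
    """
    depths = []
    for X in points_world:
        z = sum(R[2][k] * X[k] for k in range(3)) + t[2]  # depth along camera z-axis
        if z > 0:
            depths.append(z)
    if not depths:
        raise ValueError("no feature point lies in front of the camera")
    return min(depths), max(depths)
```

In practice a small margin might be added on both sides of the returned interval, since the dense surface can extend slightly beyond the sparse feature points; that margin is not part of the description above.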

【００８７】（ステップＳ13，Ｓ14：マルチプルベース
ラインステレオ法による形状復元）ステップＳ６で得ら
れたカメラの内部パラメータ、ステップＳ10で得られた
正確なカメラ移動軌跡、並びに、ステップＳ11で決定し
たベース画像及びその前後の複数の参照フレームに基づ
いて、マルチプルベースラインステレオ法のアルゴリズ
ムを実行して、各ベースフレームの密な距離画像を生成
する。(Steps S13 and S14: Shape Restoration by the Multiple-Baseline Stereo Method) The multiple-baseline stereo algorithm is executed using the internal parameters of the camera obtained in step S6, the accurate camera movement trajectory obtained in step S10, and the base images determined in step S11 together with multiple reference frames before and after each of them, to generate a dense range image for each base frame.

【００８８】ステレオ視では、ベース画像上の特定の点
とそれに対応する参照画像上の点とを検索し、これらの
２つの点から視差を求め、カメラ位置間の距離を使って
奥行きを同定する。対応点の決定と奥行きの同定とは次
のように行う。In stereo vision, a point on the reference image corresponding to a given point on the base image is searched for, the parallax is obtained from these two points, and the depth is identified using the distance between the camera positions. The corresponding point and the depth are determined as follows.

【００８９】ベース画像及び参照画像上に10×10画素程
度の大きさのマスクを設け、これらの２つのマスク内の
画素毎の輝度の差を二乗和したものをＳＳＤ（sum of s
quared difference)と呼ぶ。予め与えられたマトリクス
Ｆからベース画像上のマスクの中心位置に対応する参照
画像上のエピ極線を求め、これに沿って参照画像上のマ
スクを移動しなからＳＳＤを計算する。よって、エピ極
線に沿ったＳＳＤプロファイルが生成されるが、エピ極
線上の位置は奥行きに対応するので、これは奥行きに対
するＳＳＤプロファイルとも言える。このＳＳＤプロフ
ァイルが最小値を示すエピ極線上の位置が対応点であ
り、奥行きであると推定できる。A mask of about 10 × 10 pixels is placed on each of the base image and the reference image, and the sum of the squared per-pixel luminance differences between the two masks is called the SSD (sum of squared differences). The epipolar line on the reference image corresponding to the center of the mask on the base image is obtained from the matrix F given in advance, and the SSD is calculated while the mask on the reference image is moved along this line. This produces an SSD profile along the epipolar line; because position on the epipolar line corresponds to depth, it can also be regarded as an SSD profile over depth. The position on the epipolar line at which the SSD profile is minimal can be estimated to be the corresponding point, and hence to give the depth.

【００９０】しかしながら、例えば２つのカメラ位置が
水平な位置関係をなして、垂直な棒を何本も等間隔で並
べたようなシーンを想定した場合、ＳＳＤプロファイル
はいくつもの良く似た局所的な最小値を持つことになっ
て、奥行きを同定できない。そこで、３つ以上のカメラ
位置が水平な位置関係をなすようにして、同じシーンに
ついて処理を行う。ここで、下記式（27）で定義するよ
うなＳＳＳＤを考える。図７は、式（27）における各パ
ラメータの関係を示す図である。However, consider a scene in which, for example, two camera positions are arranged horizontally and many vertical bars stand at equal intervals. The SSD profile then has many similar local minima, and the depth cannot be identified. The same scene is therefore processed with three or more camera positions arranged horizontally, and the SSSD defined by the following equation (27) is considered. FIG. 7 shows the relationships among the parameters in equation (27).

【００９１】[0091]

【数１６】 (Equation 16)

【００９２】但し、ベクトルｍ：ベース画像上の注目する画素を表すＮベク
トルベクトルｍ′：ベース画像上の注目する画素のまわりに
設定したマスクに含まれる画素 α：ベース画像上の注目する画素からの距離Ｎ〔〕：正規化作用素ｄΩ：微小立体角ｆ_i（）：輝度関数マトリクスＲ_i：ｉフレームのカメラの回転マトリクスベクトルＢ_i：ｉフレームのカメラの平行移動ベクトル
Here:
vector m: N-vector representing the pixel of interest on the base image
vector m′: a pixel included in the mask set around the pixel of interest on the base image
α: distance from the pixel of interest on the base image
N[ ]: normalization operator
dΩ: infinitesimal solid angle
f_i( ): luminance function
matrix R_i: rotation matrix of the camera for frame i
vector B_i: translation vector of the camera for frame i

【００９３】式（27）は、ＳＳＤを複数の参照画像分だ
け加算したことを意味する。奥行きを横軸にとってＳＳ
Ｄをグラフ化した場合、各ＳＳＤは複数の局所的な最小
値を持つ。どの参照画像から得られたＳＳＤも本当の表
面のある奥行きで最小値を持つ。それ以外の局所的な最
小値は、カメラの位置によってずれるので、加算したＳ
ＳＳＤでは平均化されて消えてしまうが、正しい最小値
では鮮明な谷部を形成する。このようにして、唯一の明
確な最小値が安定して検出され、奥行きが同定される。
ベース画像の全画素に対してこの処理を行い、密な距離
画像を安定して得ることができる。Equation (27) means that the SSDs are summed over the multiple reference images. When each SSD is plotted with depth on the horizontal axis, it has multiple local minima. The SSD obtained from every reference image has a minimum at the depth of the true surface. The other local minima shift with camera position, so they are averaged out and vanish in the summed SSSD, while a sharp valley forms at the correct minimum. In this way a single, distinct minimum is detected stably and the depth is identified. Performing this processing for every pixel of the base image stably yields a dense range image.
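The way summing SSDs removes false minima can be sketched on one-dimensional scanlines. This is a deliberately simplified illustration and not equation (27): it assumes cameras displaced along a single horizontal line, integer pixel shifts equal to baseline × focal length × inverse depth, and hypothetical function names `ssd` and `sssd_profile`.

```python
def ssd(base, ref, x, shift, half=2):
    """Sum of squared differences between the window centred on x in the
    base scanline and the window displaced by `shift` in the reference."""
    return sum((base[x + k] - ref[x + k + shift]) ** 2
               for k in range(-half, half + 1))


def sssd_profile(base, refs, baselines, x, inv_depths, focal=1.0, half=2):
    """SSSD at pixel x for each candidate inverse depth.

    For every candidate, the SSD against each reference scanline is
    evaluated at the shift that candidate implies for that baseline,
    and the SSDs of all reference scanlines are added together.
    """
    profile = []
    for zeta in inv_depths:
        total = 0.0
        for ref, b in zip(refs, baselines):
            shift = int(round(b * focal * zeta))
            total += ssd(base, ref, x, shift, half)
        profile.append(total)
    return profile
```

With a periodic texture, a single reference scanline produces several equally good candidates, whereas the sum over reference scanlines with different baselines is zero only at the true inverse depth, because the false minima of the individual SSDs fall at different candidates.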

【００９４】ＳＳＳＤの生成は非常に計算コストが高い
ので、ステップＳ12で求めたＳＳＳＤ計算範囲内でも多
数の奥行きに関してＳＳＳＤを計算した場合には全体の
処理時間が莫大になってしまう。そこで、30程度の奥行
きに関してＳＳＳＤを計算し、これを通るスプライン曲
線を求め、補間により滑らかなＳＳＳＤプロファイルを
決定する。この決定したプロファイルから解析的に全体
的な最小値を検出して、奥行きを同定する。Because generating the SSSD is computationally very expensive, calculating it at a large number of depths would make the overall processing time enormous, even within the SSSD calculation range obtained in step S12. The SSSD is therefore calculated at about 30 depths, a spline curve passing through these values is obtained, and a smooth SSSD profile is determined by interpolation. The global minimum is detected analytically from this profile to identify the depth.

【００９５】（ステップＳ15：三角パッチモデル生成）
ステップＳ11で決定されたベース画像において、ステッ
プＳ13のマルチプルベースラインステレオ処理により復
元された距離画像から、三角パッチモデルを生成する。(Step S15: Triangular patch model generation)
In the base image determined in step S11, a triangular patch model is generated from the distance image restored by the multiple baseline stereo processing in step S13.

【００９６】距離画像は、基本的に三次元座標値を持つ
点の集まりに過ぎない。そこで、三次元空間上で隣合う
と思われる３個の点を一まとめにして微小な三角ポリゴ
ン（パッチ）を生成し、それらの集合によって複雑な物
体面を形成して、表示部７に表示する。簡易な手法とし
ては、一枚の距離画像内の画素の隣接関係は三次元空間
上の隣接関係をほぼ保持していることを利用して、例え
ば注目する画素とその左隣及び下隣の画素とを使って微
小な三角ポリゴンを生成し、同様の処理を距離画像内の
全ての画素に適用する。A range image is essentially nothing more than a set of points with three-dimensional coordinate values. Therefore, three points considered to be adjacent in three-dimensional space are grouped to generate a small triangular polygon (patch), and a set of such polygons forms the complex object surface, which is displayed on the display unit 7. A simple approach exploits the fact that the adjacency of pixels within a single range image largely preserves adjacency in three-dimensional space: for example, a small triangular polygon is generated from the pixel of interest and the pixels immediately to its left and below it, and the same processing is applied to every pixel in the range image.
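The simple adjacency-based triangulation can be sketched as follows. The function name `triangles_from_range_image`, the grid-of-tuples representation, and the use of `None` for pixels without a recovered depth are assumptions for illustration.

```python
def triangles_from_range_image(points):
    """Generate small triangles from pixel adjacency in one range image.

    points[r][c] is the (X, Y, Z) coordinate recovered at pixel (r, c),
    or None where no depth is available. For each pixel, one triangle is
    formed from the pixel itself, its left neighbour, and the neighbour
    below it, as in the example given in the text.
    """
    rows, cols = len(points), len(points[0])
    triangles = []
    for r in range(rows - 1):          # the last row has no neighbour below
        for c in range(1, cols):       # the first column has no left neighbour
            p, left, below = points[r][c], points[r][c - 1], points[r + 1][c]
            if p is None or left is None or below is None:
                continue
            triangles.append((p, left, below))
    return triangles
```

Triangles that span a large depth discontinuity would normally also be rejected, for instance by comparing the Z values of the three vertices; that check is omitted here because the text does not describe one.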

【００９７】この手法は、安定して高速に三角バッチモ
デルを生成できる利点がある。しかし、複数のベース画
像（視点）で生成された距離画像内の画素同士は適切な
隣接関係を持たない。そこで、エピ極線拘束などを用い
て隣合う距離画像内の画素の対応をとるなどの手法が考
えられるが、処理が複雑になり、さまざまな場合分けを
必要とする。そこで、点の隣接関係が全く与えられてい
なくても、その座標値のみから適切な３個の点をまとめ
て三角ポリゴンを生成するマーチングキューブアルゴリ
ズムを適用する。This approach has the advantage of generating a triangular patch model stably and at high speed. However, pixels in range images generated from different base images (viewpoints) have no proper adjacency relationship with one another. One conceivable remedy is to establish correspondences between pixels in neighboring range images using the epipolar constraint or the like, but this complicates the processing and requires handling many separate cases. Instead, the marching cubes algorithm is applied, which generates triangular polygons by grouping three appropriate points using only their coordinate values, even when no adjacency relationship between the points is given.

[0098] First, the triangular patch generation algorithm described above, which uses adjacency in the two-dimensional image, is applied to all distance images. Next, the minimum and maximum values along each coordinate axis of the space occupied by the data are obtained from all distance images, and a voxel space is set based on these values. "1" is placed at the center of each cell that contains a three-dimensional coordinate value from a distance image, and "0" is placed at the centers of all other cells. A cube is formed by connecting the centers of eight adjacent cells; this is called a marching cube. Since "0" or "1" is placed at each of the eight vertices of a marching cube, there are 2⁸ = 256 possible arrangement patterns. However, if arrangements that are geometrically symmetric (rotations, reflections, inversion of "0" and "1", and so on) are regarded as identical, the number can ultimately be reduced to 15. By defining the shape of the triangular polygons to be generated for each of these 15 patterns and fitting the matching pattern to every cube, triangular polygons are generated for the whole.
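The reduction from 256 vertex patterns to 15 can be checked by enumeration. The sketch below is only an illustration: it assumes each cube vertex corresponds to one bit of an 8-bit mask, and it treats two patterns as identical when one maps to the other under one of the 24 rotations of the cube, optionally combined with swapping "0" and "1". The function names are hypothetical.

```python
from itertools import permutations, product
from typing import List

# Cube vertices as sign triples; the list index is the bit position in a mask.
VERTS = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]


def _perm_sign(p) -> int:
    """Sign (+1 or -1) of a permutation of (0, 1, 2)."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def cube_rotations() -> List[List[int]]:
    """Return the 24 proper rotations as vertex-index permutations."""
    rotations = []
    for p in permutations(range(3)):
        for s in product((-1, 1), repeat=3):
            # A signed axis permutation is a rotation when its determinant is +1.
            if _perm_sign(p) * s[0] * s[1] * s[2] == 1:
                rotations.append([
                    VERTS.index((s[0] * v[p[0]], s[1] * v[p[1]], s[2] * v[p[2]]))
                    for v in VERTS
                ])
    return rotations


def apply_rotation(mask: int, rotation: List[int]) -> int:
    """Move every set bit of the 8-bit mask to its rotated vertex."""
    out = 0
    for i in range(8):
        if (mask >> i) & 1:
            out |= 1 << rotation[i]
    return out


def count_pattern_classes() -> int:
    """Count 0/1 vertex patterns up to rotation and 0/1 inversion."""
    rotations = cube_rotations()
    canonical = set()
    for mask in range(256):
        variants = []
        for rot in rotations:
            m = apply_rotation(mask, rot)
            variants.extend((m, m ^ 0xFF))  # the pattern and its 0/1 inversion
        canonical.add(min(variants))
    return len(canonical)
```

Calling `count_pattern_classes()` yields 15 under these assumptions, so a lookup table of triangle shapes only needs 15 base entries plus the symmetry that maps each cube onto its base pattern.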

[0099] Next, the operation of acquiring a two-dimensional image of the object from an arbitrary viewpoint designated by the user (a virtual camera having an arbitrary designated position and orientation) will be described. FIGS. 8 and 9 are flowcharts showing the processing procedure of the two-dimensional image acquisition method of the present invention. In the following example, IBR (image-based rendering) is used as the technique for acquiring the two-dimensional image of the object.

[0100] (Steps S1 to S14) The processing in steps S1 to S14 is the same as in the three-dimensional shape information acquisition procedure described above, so the same step numbers are used and the description is omitted.

[0101] (Steps S21 and S22: acquisition of a two-dimensional image of the object) IBR is executed using the camera parameters, the captured images, and the three-dimensional shape information of the object obtained in steps S1 to S14. This IBR processing is divided into two steps.

[0102] As the first step, a voxel space containing the object is constructed, and a color is assigned to each voxel in that voxel space. The voxel space is determined based on all the distance images obtained by the multiple baseline stereo method described above.

[0103] Specifically, a rectangular parallelepiped region is set using the maximum and minimum x, y, and z values of all three-dimensional coordinates calculated from all the distance images and the positions of the cameras that measured them, and this region is taken as the voxel space. Next, the voxel space is divided at a suitable interval to generate the individual voxels. The point cloud calculated from the distance images generated by the multiple baseline stereo method is then placed in this voxel space.
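A minimal sketch of this voxel-space setup follows, assuming the three-dimensional points have already been computed from the distance images and camera positions. The function name, the return layout, and the clamping of points that lie exactly on the upper boundary are assumptions of this sketch, not details from the text.

```python
import math
from typing import List, Set, Tuple

Vec = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def build_voxel_space(
    points: List[Vec], spacing: float
) -> Tuple[Vec, Tuple[int, int, int], Set[Voxel]]:
    """Set an axis-aligned box around all points, divide it at a fixed
    interval, and record which voxels contain at least one point.

    Returns (box minimum corner, voxel counts per axis, occupied voxel indices).
    """
    mins = tuple(min(p[i] for p in points) for i in range(3))
    maxs = tuple(max(p[i] for p in points) for i in range(3))
    # At least one voxel per axis, even for a flat or degenerate point set.
    dims = tuple(
        max(1, math.ceil((maxs[i] - mins[i]) / spacing)) for i in range(3)
    )
    occupied: Set[Voxel] = set()
    for p in points:
        index = tuple(
            # Clamp so points on the upper face fall into the last voxel.
            min(dims[i] - 1, int((p[i] - mins[i]) // spacing))
            for i in range(3)
        )
        occupied.add(index)
    return mins, dims, occupied
```

The `spacing` argument plays the role of the "suitable interval": a smaller value gives a finer model at the cost of more voxels to process.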

[0104] In the technique developed by Steven M. Seitz et al. ("Toward Interactive Scene Walkthroughs from Images"), every voxel in the voxel space is processed to select the color to be assigned, which takes an enormous amount of processing time. In this example, therefore, the distance images generated by the multiple baseline stereo method are used to shorten the processing time.

[0105] For a given three-dimensional point, a plane perpendicular to the straight line connecting the point and the camera center is set, and the size of the distance-image pixel containing the point is reflected onto this plane. Specifically, the straight lines connecting the camera center and the four corners of the rectangular pixel are intersected with the plane, producing a rectangle on the plane. Each voxel containing this rectangle is flagged as containing a surface. FIG. 10 illustrates this concept.
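The geometric construction above amounts to a ray and plane intersection. The following sketch assumes that the ray directions through the four pixel corners (`corner_dirs`) are supplied by the camera model; that input and the function name are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def pixel_footprint(
    camera_center: Vec, point: Vec, corner_dirs: Sequence[Vec]
) -> List[Vec]:
    """Intersect the rays through the pixel corners with the plane that
    passes through `point` and is perpendicular to the line from the
    camera center to `point`.
    """
    # Plane normal n = point - camera_center; plane equation n . (X - point) = 0.
    n = tuple(point[i] - camera_center[i] for i in range(3))
    nn = _dot(n, n)
    corners: List[Vec] = []
    for d in corner_dirs:
        denom = _dot(n, d)
        if denom <= 0.0:
            raise ValueError("corner ray does not reach the plane")
        # Ray X = C + t d meets the plane at t = (n . n) / (n . d).
        t = nn / denom
        corners.append(tuple(camera_center[i] + t * d[i] for i in range(3)))
    return corners
```

The returned corners delimit the rectangle on the plane; every voxel that this rectangle touches would then receive the surface flag.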

[0106] Next, the following processing is performed only for voxels whose flag is set. The straight line connecting the voxel of interest and each camera center is scanned to confirm that it does not pass through any other flagged voxel. If it does pass through another flagged voxel, occlusion must be occurring, so that camera's image is excluded from the processing that determines the voxel's color. When no occlusion occurs, the color of the pixel at the point where the line intersects the imaging plane becomes a candidate for the color assigned to the voxel. Each voxel is assigned the average of these color candidates.
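The occlusion test and color averaging can be sketched as below. This is a simplified illustration: the line is sampled at a fixed step (a fraction of the voxel size) rather than traversed exactly, so a voxel clipped only at a corner may be missed, and `sample_color` is a hypothetical callback standing in for the lookup of the pixel where the line meets a camera's imaging plane.

```python
import math
from typing import Callable, List, Optional, Set, Tuple

Vec = Tuple[float, float, float]
Voxel = Tuple[int, int, int]
Color = Tuple[float, float, float]


def voxel_center(v: Voxel, origin: Vec, spacing: float) -> Vec:
    return tuple(origin[i] + (v[i] + 0.5) * spacing for i in range(3))


def is_occluded(v: Voxel, cam: Vec, flagged: Set[Voxel], origin: Vec,
                spacing: float, step_fraction: float = 0.25) -> bool:
    """Scan the line from the voxel center to the camera center and report
    whether it passes through any other flagged voxel."""
    start = voxel_center(v, origin, spacing)
    d = tuple(cam[i] - start[i] for i in range(3))
    length = math.sqrt(sum(c * c for c in d))
    step = spacing * step_fraction
    for k in range(1, int(length / step) + 1):
        t = k * step / length
        p = tuple(start[i] + t * d[i] for i in range(3))
        cell = tuple(math.floor((p[i] - origin[i]) / spacing) for i in range(3))
        if cell != v and cell in flagged:
            return True
    return False


def voxel_color(v: Voxel, cameras: List[Vec], flagged: Set[Voxel],
                origin: Vec, spacing: float,
                sample_color: Callable[[int, Voxel], Color]) -> Optional[Color]:
    """Average the color candidates from all cameras that see the voxel."""
    candidates = [
        sample_color(i, v)
        for i, cam in enumerate(cameras)
        if not is_occluded(v, cam, flagged, origin, spacing)
    ]
    if not candidates:
        return None  # no camera sees this voxel
    n = len(candidates)
    return tuple(sum(c[ch] for c in candidates) / n for ch in range(3))
```

Restricting the loop to flagged voxels is what saves time relative to processing every voxel in the space, since both the set of voxels to color and the set of possible occluders shrink to the estimated surface.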

[0107] The distance images may contain pixels for which the multiple baseline stereo processing has failed. To handle this, an appropriate threshold is set for the evaluation function E defined by the following equation (28), which uses the standard deviation of the color candidates; voxels for which E is at or below the threshold are assigned no color, and their flags are cleared. In equation (28), s is the standard deviation of the color candidates, σ is the standard deviation of the luminance values of the camera's imaging device, and m−1 is the number of degrees of freedom; the color candidates and luminance values are assumed to follow a χ² distribution.

【０１０８】[0108]

【数１７】 [Equation 17]

[0109] In the second step, the voxel space in which a color has been assigned to each voxel as described above is used to generate the two-dimensional image that would be obtained by a virtual camera at an arbitrary position and orientation. Here, a simple perspective transformation of the colored voxel space onto the imaging plane of the virtual camera is used. That is, the position on the virtual camera's imaging plane is calculated according to the following equation (29). In equation (29), x and y are the two-dimensional coordinates on the imaging plane, f is the focal length, X, Y, and Z are the three-dimensional coordinates of the voxel center, and the matrix Tr is a 4 × 4 matrix that converts the world coordinate system into the camera coordinate system.

【０１１０】[0110]

【数１８】 (Equation 18)

[0111] Even after occluded cases are excluded, two or more voxels may be mapped to a single pixel on the imaging plane. In that case, the average of their values is used.
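The second step can be sketched as follows. Equation (29) is not reproduced in this text, so the sketch assumes the standard pinhole form x = f·Xc/Zc, y = f·Yc/Zc, where (Xc, Yc, Zc) is the voxel center after applying Tr. The placement of the pixel origin at the image center, the rounding to the nearest pixel, and the function names are further assumptions.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Vec = Tuple[float, float, float]
Color = Tuple[float, float, float]
Matrix4 = Sequence[Sequence[float]]


def project(center: Vec, Tr: Matrix4, f: float) -> Optional[Tuple[float, float]]:
    """Transform a voxel center into camera coordinates with the 4x4
    matrix Tr, then apply a perspective division scaled by f."""
    hom = (center[0], center[1], center[2], 1.0)
    xc, yc, zc = (sum(Tr[r][c] * hom[c] for c in range(4)) for r in range(3))
    if zc <= 0.0:
        return None  # behind the virtual camera
    return (f * xc / zc, f * yc / zc)


def render(voxels: List[Tuple[Vec, Color]], Tr: Matrix4, f: float,
           width: int, height: int) -> Dict[Tuple[int, int], Color]:
    """Project every colored voxel and average the colors of voxels
    that land on the same pixel."""
    sums: Dict[Tuple[int, int], List[float]] = {}
    for center, color in voxels:
        xy = project(center, Tr, f)
        if xy is None:
            continue
        u = int(round(xy[0] + width / 2))
        v = int(round(xy[1] + height / 2))
        if 0 <= u < width and 0 <= v < height:
            acc = sums.setdefault((u, v), [0.0, 0.0, 0.0, 0.0])
            for ch in range(3):
                acc[ch] += color[ch]
            acc[3] += 1.0  # number of voxels mapped to this pixel
    return {
        pix: (acc[0] / acc[3], acc[1] / acc[3], acc[2] / acc[3])
        for pix, acc in sums.items()
    }
```

Pixels that receive no voxel are simply absent from the returned dictionary; how such holes are filled is not covered by this sketch.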

[0112] Next, a specific example of the present invention will be described. FIG. 11 is a schematic diagram showing the target object 10 being photographed by the video camera 11. The object 10 in this example has a shape in which part of a sphere is attached to one face of a cube. While the video camera 11 is moved and rotated, the stationary object 10 is photographed from various angles to obtain a moving image.

[0113] FIG. 12 shows the feature points (□) extracted in step S4 described above. Feature points tend to appear at the vertices of the object 10, but on a smooth curved surface they appear randomly, depending on the surface pattern and the like.

[0114] FIG. 13 schematically shows the feature point tracking of step S5 described above; it shows the result of tracking the movement of the feature points between two frames (from □ to ○).

[0115] FIG. 14 shows the triangular patch model of the object 10 generated in step S15 described above, based on the distance images restored by the multiple baseline stereo method in step S13 described above. FIG. 15 shows, as a comparative example, a three-dimensional model restored by the factorization method using only the feature point tracking results. In the comparative example, the cube portion is restored accurately, but the sphere portion is restored as a coarse polyhedron with low accuracy. In the example of the present invention, by contrast, the object 10 is represented by fine triangular patches, so the sphere portion is restored accurately as well as the cube portion. This way of constructing a three-dimensional model is suited to measuring the three-dimensional shape of the object itself.

[0116] In the present invention, a two-dimensional image from an arbitrary viewpoint can also be obtained in step S21 described above, using the camera parameters, the captured images, and the three-dimensional shape information of the object. This way of acquiring two-dimensional images is suited to virtual reality systems, training simulations, and similar applications whose main purpose is displaying the object on a display or the like.

[0117] FIG. 16 shows a configuration example of the three-dimensional shape information acquisition device, the two-dimensional image acquisition device, and the recording medium of the present invention. The program illustrated here includes steps S2 to S15 shown in FIGS. 4 and 5, or steps S2 to S14, S21, and S22 shown in FIGS. 8 and 9, and is recorded on a recording medium described below.

[0118] In FIG. 16, the recording medium 21, which is connected online to the computer 20, is, for example, a WWW (World Wide Web) server computer installed at a location remote from the computer 20, and the program 21a described above is recorded on the recording medium 21. Under the control of the program 21a read from the recording medium 21, the computer 20, functioning as the three-dimensional shape information acquisition device and the two-dimensional image acquisition device, acquires information on the three-dimensional shape of the object or acquires a two-dimensional image of the object from a designated arbitrary viewpoint.

[0119] The recording medium 22, provided inside the computer 20, is, for example, a built-in hard disk drive or ROM, and the program 22a described above is recorded on the recording medium 22. Under the control of the program 22a read from the recording medium 22, the computer 20, functioning as the three-dimensional shape information acquisition device and the two-dimensional image acquisition device, acquires information on the three-dimensional shape of the object or acquires a two-dimensional image of the object from a designated arbitrary viewpoint. The ROM 5 in FIG. 3 is an example of this recording medium 22.

[0120] The recording medium 23, which is used by loading it into the disk drive 20a provided in the computer 20, is a portable medium such as a magneto-optical disk, a CD-ROM, or a flexible disk, and the program 23a described above is recorded on the recording medium 23. Under the control of the program 23a read from the recording medium 23, the computer 20, functioning as the three-dimensional shape information acquisition device and the two-dimensional image acquisition device, acquires information on the three-dimensional shape of the object or acquires a two-dimensional image of the object from a designated arbitrary viewpoint.

[0121] In the above example, a stationary object is photographed by a moving camera to obtain a moving image of the object. Conversely, a moving object may be photographed by a stationary camera to obtain the moving image. The moving image may also be obtained while both the object and the camera are moving.

[0122] FIG. 17 shows another embodiment of the present invention, in which a moving object is photographed by a stationary camera. As shown in FIG. 17, a standard video camera 11 is fixed on a tripod 12, and the target object 10 is placed on a turntable 13, which is then rotated. A white cloth 14, for example, is placed behind the object 10 so that the background image remains constant even as the object 10 moves. The three-dimensional shape of the object 10 is then restored, by the same method as described above, from the moving image of the rotating object 10 captured by the fixed video camera 11. Such a system requires no information such as the internal parameters and external parameters of the video camera 11 or the rotation angle of the turntable 13. The rotation of the turntable 13 is calculated as the movement trajectory of the video camera 11 relative to the object 10.
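The equivalence between a rotating object and a moving camera can be illustrated with a small sketch. It assumes, purely for illustration, that the turntable axis is the z axis through the origin and that the rotation angle is known; in the described system the angle is not an input but is recovered as part of the relative camera trajectory. The function name is hypothetical.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def relative_camera_position(camera_pos: Vec, theta: float) -> Vec:
    """Position of a fixed camera as seen from the object's own frame
    after the object has been rotated by `theta` radians about the z axis.

    A world point of the object is p_world = R(theta) p_object, so the
    camera in object coordinates is R(theta)^T C = R(-theta) C.
    """
    c, s = math.cos(-theta), math.sin(-theta)
    x, y, z = camera_pos
    return (c * x - s * y, s * x + c * y, z)
```

Evaluating this for a sequence of angles traces a circle around the object, which is the relative trajectory a moving camera would have followed around a stationary object.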

[0123] In the above example, the multiple baseline stereo method is used to obtain the three-dimensional shape model of the object, but other techniques, such as the volumetric iterative approach to stereo matching and occlusion detection, may of course be used instead.

【０１２４】[0124]

[Effects of the Invention] As described above, in the present invention, the camera parameters (internal parameters and movement trajectory) are obtained from a moving image of an object captured by a camera that has not been calibrated in advance, and information on the three-dimensional shape of the object is acquired by a stereo method using the obtained parameters and the moving image. Camera calibration is therefore unnecessary, as is any device for fixing the camera; three-dimensional shape information of the object can be obtained very easily, and the restored shape is extremely accurate.

[0125] Also in the present invention, the camera parameters (internal parameters and movement trajectory) and the three-dimensional shape information of the object are obtained from a moving image of the object captured by a camera that has not been calibrated in advance, and a two-dimensional image of the object from an arbitrary viewpoint is acquired using the obtained parameters, the three-dimensional shape information, and the moving image. Camera calibration is therefore unnecessary, as is any device for fixing the camera, and a high-quality two-dimensional image of the object from an arbitrary viewpoint can be presented very easily.

[Brief description of the drawings]

FIG. 1 is a diagram showing the concept of the three-dimensional shape information acquisition method according to the present invention.

FIG. 2 is a diagram showing the concept of the two-dimensional image acquisition method according to the present invention.

FIG. 3 is a block diagram showing the configuration of the three-dimensional shape information acquisition device and the two-dimensional image acquisition device of the present invention.

FIG. 4 is a flowchart showing the processing procedure of the three-dimensional shape information acquisition method of the present invention.

FIG. 5 is a flowchart showing the processing procedure of the three-dimensional shape information acquisition method of the present invention.

FIG. 6 is a diagram showing the positional relationship between the target object and the optical axis of the camera.

FIG. 7 is a diagram showing the parameters used in the SSSD calculation formula.

FIG. 8 is a flowchart showing the processing procedure of the two-dimensional image acquisition method of the present invention.

FIG. 9 is a flowchart showing the processing procedure of the two-dimensional image acquisition method of the present invention.

FIG. 10 is a diagram showing the relationship between a distance image and the voxel space in IBR.

FIG. 11 is a schematic diagram showing the target object being photographed by a video camera.

FIG. 12 is a diagram showing an example pattern of extracted feature points.

FIG. 13 is a diagram showing an example pattern of feature point tracking.

FIG. 14 is a diagram showing a three-dimensional model of the object restored according to the present invention.

FIG. 15 is a diagram showing a three-dimensional model of the object restored according to the comparative example.

FIG. 16 is a diagram showing a configuration example of the three-dimensional shape information acquisition device, the two-dimensional image acquisition device, and the recording medium.

FIG. 17 is a diagram showing another embodiment of the present invention.

[Explanation of symbols]

1: Main control unit (CPU); 4: Image memory; 5: ROM; 6: RAM; 10: Object; 11: Video camera; 20: Computer; 21, 22, 23: Recording medium

Continuation of the front page: (72) Inventor: Katsuya Yasuda, 4-25-9-208 Koen-Minamiyata, Higashisumiyoshi-ku, Osaka-shi, Osaka

Claims

[Claims]

1. A three-dimensional shape information acquisition method for acquiring information on the three-dimensional shape of an object from captured images of the object taken by a camera that moves relative to the object, the method comprising: a first step of obtaining, based on the captured images, internal parameters of the camera and the movement trajectory of the camera relative to the object at the time of shooting; and a second step of acquiring information on the three-dimensional shape of the object by a stereo method using the obtained internal parameters and movement trajectory and the captured images.

2. The three-dimensional shape information acquisition method according to claim 1, wherein a stationary object is photographed by a moving camera to obtain the captured images of the object.

3. The three-dimensional shape information acquisition method according to claim 1, wherein a moving object is photographed by a stationary camera to obtain the captured images of the object.

4. The three-dimensional shape information acquisition method according to claim 1, wherein the first step includes: a step of extracting feature points in an arbitrary frame of the captured images; a step of tracking the extracted feature points over a plurality of frames of the captured images; and a step of obtaining the internal parameters and the movement trajectory based on the tracking result.

5. The three-dimensional shape information acquisition method according to claim 4, wherein the movement direction of the feature points is predicted using a movement trajectory once obtained, the prediction result is compared with the feature point tracking result, the feature point tracking result is corrected based on the comparison result, and the internal parameters and the movement trajectory are obtained again based on the corrected tracking result.

6. The three-dimensional shape information acquisition method according to claim 1, wherein the factorization method is used when obtaining the movement trajectory of the camera based on the feature point tracking result.

7. The three-dimensional shape information acquisition method according to claim 1, wherein a multiple baseline stereo method is used as the stereo method.

8. A three-dimensional shape information acquisition device for acquiring information on the three-dimensional shape of an object from captured images of the object taken by a camera that moves relative to the object, the device comprising: means for obtaining, based on the captured images, internal parameters of the camera and the movement trajectory of the camera relative to the object at the time of shooting; and means for acquiring information on the three-dimensional shape of the object by a stereo method using the obtained internal parameters and movement trajectory and the captured images.

9. The three-dimensional shape information acquisition device according to claim 8, further comprising the camera that moves relative to the object.

10. A two-dimensional image acquisition method for acquiring a two-dimensional image of the object from an arbitrary viewpoint using information obtained by the three-dimensional shape information acquisition method according to claim 1, wherein a two-dimensional image of the object from an arbitrarily designated viewpoint is acquired using the internal parameters and movement trajectory obtained in the first step, the captured images, and the three-dimensional shape information obtained in the second step.

11. A two-dimensional image acquisition method for acquiring a two-dimensional image of an object from an arbitrary viewpoint based on captured images of the object taken by a camera that moves relative to the object, the method comprising: a first step of obtaining, based on the captured images, internal parameters of the camera and the movement trajectory of the camera relative to the object at the time of shooting; a second step of obtaining three-dimensional shape information of the object using the obtained internal parameters and movement trajectory and the captured images; and a third step of acquiring a two-dimensional image of the object from an arbitrarily designated viewpoint using the obtained internal parameters and movement trajectory, the captured images, and the obtained three-dimensional shape information.

12. The two-dimensional image acquisition method according to claim 10 or 11, wherein image-based rendering is used when acquiring the two-dimensional image of the object.

13. A two-dimensional image acquisition device for acquiring a two-dimensional image of an object from an arbitrary viewpoint based on captured images of the object taken by a camera that moves relative to the object, the device comprising: means for obtaining, based on the captured images, internal parameters of the camera and the movement trajectory of the camera relative to the object at the time of shooting; means for obtaining three-dimensional shape information of the object using the obtained internal parameters and movement trajectory and the captured images; and means for acquiring a two-dimensional image of the object from an arbitrarily designated viewpoint using the obtained internal parameters and movement trajectory, the captured images, and the obtained three-dimensional shape information.

14. A computer-readable recording medium on which is recorded a program for acquiring information on the three-dimensional shape of an object from captured images of the object taken by a camera that moves relative to the object, the program comprising: program code means for causing the computer to obtain, based on the captured images, internal parameters of the camera and the movement trajectory of the camera relative to the object at the time of shooting; and program code means for causing the computer to acquire three-dimensional shape information of the object by a stereo method using the obtained internal parameters and movement trajectory and the captured images.

15. A computer-readable recording medium on which is recorded a program for acquiring a two-dimensional image of an object from an arbitrary viewpoint based on captured images of the object taken by a camera that moves relative to the object, the program comprising: program code means for causing the computer to obtain, based on the captured images, internal parameters of the camera and the movement trajectory of the camera relative to the object at the time of shooting; program code means for causing the computer to obtain three-dimensional shape information of the object using the obtained internal parameters and movement trajectory and the captured images; and program code means for causing the computer to acquire a two-dimensional image of the object from an arbitrarily designated viewpoint using the obtained internal parameters and movement trajectory, the captured images, and the obtained three-dimensional shape information.