JP2007292657A

JP2007292657A - Camera motion information acquiring apparatus, camera motion information acquiring method, and recording medium

Info

Publication number: JP2007292657A
Application number: JP2006122186A
Authority: JP
Inventors: Tatsuya Osawa; 達哉大澤; Isao Miyagawa; 勲宮川; Yoshiori Wakabayashi; 佳織若林; Takayuki Yasuno; 貴之安野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2006-04-26
Filing date: 2006-04-26
Publication date: 2007-11-08

Abstract

PROBLEM TO BE SOLVED: To provide a camera motion information acquiring apparatus that automatically finds the three-dimensional position and posture of the camera at the time of capture for every image of a long image sequence taken by a time-series image input apparatus such as a video camera.

SOLUTION: The camera motion information acquiring apparatus comprises: a moving observation image sequence acquiring means 11 for acquiring the image sequence observed by the moving image input apparatus; a moving observation image sequence dividing means 12 for dividing the acquired moving observation image sequence into a plurality of sub-image sequences such that a plurality of images overlap between the sub-image sequences; a coordinate system integrating means 13 for integrating the coordinate systems, which differ between the sub-image sequences, into a common global coordinate system; and a camera motion re-estimating means 14 for re-estimating the camera motion in the common global coordinate system.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a method and an apparatus for acquiring camera motion information, which represents the three-dimensional position and orientation of the camera at the time each image of a moving observation image sequence, obtained by a time-series image input device such as a video camera, was captured.

Estimating camera motion information from an image sequence observed by a moving image input device (hereinafter, a moving observation image sequence) can be applied in many fields, such as three-dimensional model reconstruction, object recognition, robot navigation, and mixed reality. A representative technique is the factorization method. Feature points generated on an arbitrary image of the moving observation image sequence are matched on all other images of the sequence (hereinafter, feature point tracking), and a linear solution then recovers the shape of the imaged object and the camera motion information simultaneously. For example, Non-Patent Document 1 below acquires camera motion information robustly under a perspective projection camera model.

Of the techniques related to the present invention, bundle adjustment is described in Non-Patent Document 2 below, and quaternions are described in Non-Patent Document 3 below.

Non-Patent Document 1: S. Christy and R. Horaud, "Euclidean shape and motion from multiple perspective views by affine iterations," IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(11):1098-1104, 1996.
Non-Patent Document 2: B. Triggs, P. McLauchlan, R. Hartley and A. Fitzgibbon, "Bundle Adjustment - A modern synthesis," Vision Algorithms '99, LNCS 1883, pp. 298-372, 2000.
Non-Patent Document 3: B. K. P. Horn, "Closed-form solution of absolute orientation using unit quaternions," Journal of the Optical Society of America A, vol. 4, pp. 629-642, 1987.

The factorization method, however, requires feature points to be matched across every image of the moving observation image sequence. In a moving observation image sequence covering a long travel distance (hereinafter, a long-sequence image sequence), feature points leave the field of view, so the camera motion cannot be estimated.

The present invention has been made to solve the above problem of the prior art. Its object is to provide a camera motion information acquisition apparatus, a camera motion information acquisition method, and a recording medium that can automatically obtain, for every image of a long-sequence image sequence captured by a time-series image input device such as a video camera, the three-dimensional position and orientation of the camera at the time the image was captured.

To solve the above problem, the present invention is a method and an apparatus that acquire a moving observation image sequence using a time-series image input device such as a video camera and process it to obtain camera motion information representing the three-dimensional position and orientation of the camera at the time each image of the sequence was captured, characterized by comprising: means for acquiring a moving observation image sequence; means for dividing the acquired moving observation image sequence into a plurality of sub-image sequences; means for estimating camera motion information independently from each of the divided sub-image sequences; means for integrating the camera motion information estimated independently for each sub-image sequence into a common world coordinate system, using the estimated camera motion information of each sub-image sequence; and means for re-estimating the integrated camera motion information by optimizing it over the entire moving observation image sequence.

That is, the camera motion information acquisition apparatus according to claim 1 acquires, from an image sequence observed by a moving image input device, camera motion information representing the three-dimensional position and orientation of the camera at the time of image capture, and is characterized by comprising: moving observation image sequence acquisition means for acquiring the image sequence observed by the moving image input device; moving observation image sequence dividing means for dividing the acquired moving observation image sequence into a plurality of sub-image sequences such that a plurality of images overlap between the sub-image sequences; coordinate system integration means for integrating the coordinate systems, which differ between the sub-image sequences, into a common world coordinate system; and camera motion information re-estimation means for re-estimating the camera motion in the common world coordinate system.

The camera motion information acquisition apparatus according to claim 2 is the apparatus according to claim 1, characterized in that the moving observation image sequence dividing means determines the sub-image sequences using the results of feature point tracking, creates a measurement matrix from the feature point tracking results, and estimates the camera motion by the factorization method.

The camera motion information acquisition apparatus according to claim 3 is the apparatus according to claim 1, characterized in that the moving observation image sequence dividing means determines the sub-image sequences using the results of feature point tracking, and estimates the camera motion by sequentially performing projective reconstruction using the feature point tracking results.

The camera motion information acquisition apparatus according to claim 4 is the apparatus according to any one of claims 1 to 3, characterized in that the coordinate system integration means creates a three-dimensional corresponding point table by stereo processing using the overlapping frames between sub-image sequences, calculates the coordinate transformation matrices between sub-image sequences using the three-dimensional corresponding point table, and integrates the coordinate systems.

The camera motion information acquisition method according to claim 5 acquires, from an image sequence observed by a moving image input device, camera motion information representing the three-dimensional position and orientation of the camera at the time of image capture, and is characterized by comprising: a step in which moving observation image sequence acquisition means acquires the image sequence observed by the moving image input device; a dividing step in which moving observation image sequence dividing means divides the acquired moving observation image sequence into a plurality of sub-image sequences such that a plurality of images overlap between the sub-image sequences; an integration step in which coordinate system integration means integrates the coordinate systems, which differ between the sub-image sequences, into a common world coordinate system; and a step in which camera motion information re-estimation means re-estimates the camera motion in the common world coordinate system.

The camera motion information acquisition method according to claim 6 is the method according to claim 5, characterized in that the dividing step comprises: a step of determining a sub-image sequence using the results of feature point tracking; a step of creating a measurement matrix from the feature point tracking results and estimating the camera motion by the factorization method; a step of optimizing the estimated camera motion; a step of determining whether the division of the moving observation image sequence is complete; and a step of determining the number of overlapping frames.

The camera motion information acquisition method according to claim 7 is the method according to claim 5, characterized in that the dividing step comprises: a step of determining a sub-image sequence using the results of feature point tracking; a step of estimating the camera motion by sequentially performing projective reconstruction using the feature point tracking results; a step of optimizing the estimated camera motion; a step of determining whether the division of the moving observation image sequence is complete; and a step of determining the number of overlapping frames.

The camera motion information acquisition method according to claim 8 is the method according to any one of claims 5 to 7, characterized in that the integration step comprises: a step of creating a three-dimensional corresponding point table by stereo processing using the overlapping frames between sub-image sequences; and a step of calculating the coordinate transformation matrices between sub-image sequences using the three-dimensional corresponding point table and integrating the coordinate systems.

The recording medium according to claim 9 is a computer-readable recording medium on which is recorded a program for causing a computer to execute the camera motion information acquisition method according to any one of claims 5 to 8.

In the above configuration, the divided sub-image sequences have overlapping portions, and these overlapping portions are used to determine the relationship between three-dimensional corresponding points. Camera motion information can therefore be obtained by transforming into a world coordinate system referenced to one of the local coordinate systems.

According to the present invention, camera motion information in a unified coordinate system can be acquired from a long-sequence image sequence in which feature points cannot be tracked across all frames, which was not possible with the prior art. With the camera motion information obtained in this way, a wide-area three-dimensional model can be reconstructed stably and accurately using whichever image-based three-dimensional reconstruction method suits the purpose, such as the stereo method or the silhouette method.

Embodiments of the present invention are described below with reference to the drawings, but the present invention is not limited to the following embodiments. FIG. 1 shows the configuration of a camera motion information acquisition apparatus according to an embodiment of the present invention. The apparatus consists of moving observation image sequence acquisition means 11, moving observation image sequence dividing means 12, coordinate system integration means 13, and camera motion information re-estimation means 14.

The moving observation image sequence acquisition means 11 acquires time-series image data while the image input device is moving. Examples include shooting with a handheld camera while walking and a camera mounted on a vehicle.

The moving observation image sequence dividing means 12 divides the moving observation image sequence into a plurality of sub-image sequences such that a plurality of images overlap between the sub-image sequences and, at the same time, uses each sub-image sequence to estimate the camera motion expressed in the coordinate system specific to that sub-image sequence.

The coordinate system integration means 13 uses the overlapping images of adjacent sub-image sequences to integrate the camera motion information estimated for each sub-image sequence into a common world coordinate system.

The camera motion information re-estimation means 14 optimizes the camera motion information unified in the world coordinate system using all images of the moving observation image sequence, and thereby re-estimates the camera motion information.

An embodiment of the present invention is now described in detail. The object of the present invention is to obtain, from a moving observation image sequence observed by a moving image input device such as a camera, the three-dimensional position and orientation of the camera at the time each frame of the sequence was captured.

This embodiment describes an example in which a time-series image sequence is captured by a moving video camera as shown in FIG. 2, and the three-dimensional position and orientation of the camera at the time each image was captured are estimated.

The intrinsic parameters of the video camera used in this embodiment are calibrated in advance.

FIG. 3 is a flowchart showing the operation of this embodiment. When processing starts, the moving observation image sequence acquisition means 11 acquires the moving observation image sequence N observed while the camera is moving (step S201).

Next, the moving observation image sequence dividing means 12 divides the moving observation image sequence N into K sub-image sequences, with adjacent sub-image sequences overlapping as shown in FIG. 4 (step S202). Step S202 is executed according to the flowchart shown in FIG. 5.

In FIG. 5, processing starts with n, the serial number of the sub-image sequence, set to n = 1.

Next, the images that make up sub-image sequence #n are determined (step S301).

From the standpoint of stable camera motion estimation, each sub-image sequence should contain as many images as possible. If it contains too many, however, feature points disappear and feature point tracking becomes impossible. It is therefore desirable to set the number of images in each sub-image sequence to the number over which feature points can be tracked stably. This can be done, for example, by the following procedure.

Feature point tracking means obtaining, for feature points generated in an arbitrary frame of the image sequence, their image coordinates in that frame and the corresponding image coordinates in all other images. Feature points detected in a given image are tracked in both directions, forward and backward in time.

First, to choose the image in which feature points are detected, images are selected in order starting from the second image in the time series. Feature points are detected in each selected image and tracked backward in time to the first image of the sub-image sequence. When the number of detected feature points that fail to be tracked reaches a threshold, no further images are selected, and the previously selected image is adopted as the image in which feature points are detected.

Next, the same feature points are tracked forward in time. The image immediately preceding the first image at which the number of tracking failures reaches the threshold is taken as the last image of the sub-image sequence.

In this way, each sub-image sequence is given the largest number of images over which feature points can be tracked stably.
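The selection procedure above can be sketched as follows. This is only an illustration under stated assumptions: `count_lost` is a hypothetical callback standing in for a real feature tracker, and `max_lost` is the failure threshold, whose value the text leaves open.

```python
def choose_subsequence(first, num_frames, count_lost, max_lost):
    """Return (detection_frame, last_frame) for the sub-sequence starting at `first`.

    count_lost(a, b) is a hypothetical callback: the number of feature points
    detected in frame a that can no longer be tracked when followed to frame b.
    """
    # Candidate detection frames are tried from the second frame onward and
    # tracked backward to the first frame of the sub-sequence.
    detection = first
    for candidate in range(first + 1, num_frames):
        if count_lost(candidate, first) >= max_lost:
            break  # too many failures: keep the previously accepted frame
        detection = candidate

    # The same features are then tracked forward; the sub-sequence ends one
    # frame before the first frame at which too many of them are lost.
    last = detection
    for frame in range(detection + 1, num_frames):
        if count_lost(detection, frame) >= max_lost:
            break
        last = frame
    return detection, last
```

The sub-image sequence then runs from `first` to `last`, with features detected in frame `detection` and tracked in both directions.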

Next, the camera motion at the time each image of the determined sub-image sequence was captured is obtained, that is, the three-dimensional position (x_i, y_i, z_i) and orientation (φ_i, θ_i, γ_i) of the camera at the i-th frame of the sub-image sequence (step S302).

In this embodiment, the factorization method is used to estimate the camera motion, but the same can obviously be achieved with other techniques such as sequential projective reconstruction.

The factorization method is applied to the results of the feature point tracking performed in step S301. First, a measurement matrix is created from the tracking results. With F the number of frames in the sub-image sequence and N the number of feature points, the measurement matrix is the F-row, N-column matrix given by Equation (1).

Here, (u_ij, v_ij) denotes the image coordinates of the j-th feature point in the i-th frame.
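The image of Equation (1) is not reproduced in this text. A form consistent with the description (F rows, N columns, one image coordinate pair per entry) might look like the following. The symbol W is an assumption introduced here, and implementations commonly stack the u and v rows separately to obtain a 2F x N real-valued matrix.

```latex
W =
\begin{pmatrix}
(u_{11}, v_{11}) & (u_{12}, v_{12}) & \cdots & (u_{1N}, v_{1N}) \\
(u_{21}, v_{21}) & (u_{22}, v_{22}) & \cdots & (u_{2N}, v_{2N}) \\
\vdots & \vdots & \ddots & \vdots \\
(u_{F1}, v_{F1}) & (u_{F2}, v_{F2}) & \cdots & (u_{FN}, v_{FN})
\end{pmatrix}
```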

The factorization method recovers, from the measurement matrix of Equation (1), for example by singular value decomposition, the three-dimensional position of each feature point and the camera motion (the three-dimensional position (x_i, y_i, z_i) and orientation (φ_i, θ_i, γ_i) for each frame shown in FIG. 4). Using, for example, the method of Non-Patent Document 1, the camera motion can be estimated robustly under a perspective projection model.
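To make the role of the singular value decomposition concrete, the sketch below performs a rank-3 factorization of a 2F x N measurement matrix under an affine camera assumption. It is a simplified stand-in, not the perspective iteration of Non-Patent Document 1: the function name and row layout are assumptions, and the result is defined only up to an affine ambiguity because no metric upgrade is applied.

```python
import numpy as np

def affine_factorization(W):
    """Rank-3 factorization of a 2F x N measurement matrix (affine camera sketch).

    Assumed layout: each row holds one image coordinate (u or v) of one frame,
    each column holds one tracked feature point.
    """
    W = np.asarray(W, dtype=float)
    # Subtract the per-row centroid of the tracked points (removes translation).
    t = W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W - t, full_matrices=False)
    # Keep the three dominant singular values and split them between factors.
    root = np.sqrt(s[:3])
    motion = U[:, :3] * root          # 2F x 3 camera (motion) factor
    shape = root[:, None] * Vt[:3]    # 3 x N structure (shape) factor
    return motion, shape, t
```

For noise-free affine data, `motion @ shape + t` reproduces the measurement matrix exactly. With real tracks it gives the best rank-3 approximation in the least-squares sense.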

Next, the estimated camera motion is optimized (step S303).

A more accurate camera motion can be obtained by running an optimization that takes the camera motion and the three-dimensional feature point positions estimated in step S302 as the initial solution. This can be done, for example, by the well-known bundle adjustment (Non-Patent Document 2).
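The text does not state the cost function used in this optimization. Bundle adjustment is usually formulated as minimizing the total reprojection error, which under that assumption would read as follows, where P_i is the projection matrix of frame i, M_j is the j-th three-dimensional feature point in homogeneous coordinates, and the sum runs over the frames in which point j was tracked:

```latex
\min_{\{P_i\},\,\{M_j\}} \; \sum_{i} \sum_{j}
\left\|
\begin{pmatrix} u_{ij} \\ v_{ij} \end{pmatrix}
- \pi\!\left(P_i \tilde{M}_j\right)
\right\|^{2},
\qquad
\pi\!\left((a, b, c)^{\top}\right) = \left(\tfrac{a}{c}, \tfrac{b}{c}\right)^{\top}
```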

Next, it is determined whether the division is complete (step S304). If the sub-image sequence currently being processed has reached the final frame of the moving observation image sequence N, no further division is needed and processing ends. If unprocessed frames remain, processing proceeds to step S305.

In step S305, the number of overlapping frames between adjacent sub-image sequences is determined. The overlapping frames are used to integrate the camera motions estimated for each sub-image sequence, which are expressed in different coordinate systems. As described later, this integration uses a three-dimensional point cloud reconstructed by stereo processing. It is therefore important that the stereo processing on the overlapping frames reconstructs the three-dimensional point cloud accurately.

The accuracy of a three-dimensional point cloud reconstructed by stereo processing improves as the baseline, that is, the distance between the three-dimensional positions of the camera optical centers at which the image pair was captured, becomes longer. If the baseline is too long, however, the stereo processing itself becomes difficult. The number of overlapping frames is therefore determined from the baseline length.

The result of the camera motion optimization obtained in step S303 is used. Let (X_f, Y_f, Z_f) be the three-dimensional position of the camera when the last frame of the sub-image sequence was captured. Frames are selected in reverse chronological order, starting from the frame preceding the last frame, and the baseline distance B is calculated by Equation (2) from the three-dimensional position (X_s, Y_s, Z_s) of the camera when the selected frame was captured.
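The image of Equation (2) is not reproduced in this text. Since B is described as the distance between the two camera positions, it is presumably the Euclidean distance:

```latex
B = \sqrt{(X_f - X_s)^2 + (Y_f - Y_s)^2 + (Z_f - Z_s)^2}
```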

For example, a threshold T is prepared, and when B first exceeds T, the frames from the last frame back to that frame are taken as the overlapping frames. The threshold T can be set empirically to any suitable value.
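A minimal sketch of this overlap selection is given below, assuming the baseline is the Euclidean distance between camera positions. The function name is illustrative, and the fallback of returning the whole sub-sequence when the threshold is never exceeded is an assumption, since the text does not cover that case.

```python
import numpy as np

def choose_overlap_start(camera_positions, threshold):
    """Return the index of the first overlapping frame of a sub-sequence.

    camera_positions: (F, 3) optimized camera centers, in time order.
    The overlap runs from the returned index to the last frame.
    """
    positions = np.asarray(camera_positions, dtype=float)
    last = positions[-1]
    # Walk backward from the frame just before the last one.
    for s in range(len(positions) - 2, -1, -1):
        baseline = np.linalg.norm(last - positions[s])
        if baseline > threshold:
            return s  # first frame whose baseline exceeds the threshold
    # Assumed fallback: the baseline never exceeds the threshold.
    return 0
```

For a camera moving one unit per frame along a straight line over six frames, a threshold of 2.5 selects index 2, so the last four frames form the overlap.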

Once the overlapping frames have been determined, the sub-image sequence number is incremented (n = n + 1) and processing returns to step S301 to determine the next sub-image sequence. The moving observation image sequence N is divided by repeating the above processing.

After the moving observation image sequence has been divided as described above, the coordinate system integration means 13 calculates the coordinate transformation matrices between sub-image sequences and integrates the coordinate systems into a common world coordinate system.

First, using the overlapping frames, a three-dimensional corresponding point table is created (step S203). This table records the correspondence obtained when the same three-dimensional point is expressed in the two coordinate systems that originate from adjacent sub-image sequences.

In this embodiment, stereo processing is used to create the three-dimensional corresponding point table. The reference image and the comparison image used for stereo processing can be chosen freely from the overlapping frames. For example, the first and last frames of the overlap can serve as the reference image and the comparison image, respectively. Stereo processing matches each pixel (u, v) of the reference image to a pixel (u', v') of the comparison image, yielding the corresponding point data.

Next, three-dimensional points are reconstructed from the corresponding point data. For each corresponding point, the reconstruction is computed with each of the two sets of camera motion data estimated from the two adjacent sub-image sequences that contain the overlapping frames. In other words, for each pixel (u, v) of the reference image, a three-dimensional point expressed in each of the two coordinate systems is computed.

For three-dimensional point reconstruction, let P be the projection matrix of the reference image and P' that of the comparison image. A projection matrix P is a 3-row, 4-column matrix that can be computed by Equation (3) from the estimated camera motion (three-dimensional position (x, y, z) and orientation (φ, θ, γ)) and the camera intrinsic matrix A.

Each element of the matrix A is known from camera calibration.
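The image of Equation (3) is not reproduced in this text. One common form that matches the description is given below. The conventions are assumptions: (x, y, z) is taken to be the camera optical center c in the reference coordinate system, R is the rotation built from the angles (φ, θ, γ) that maps that coordinate system into the camera frame, and the entries of A use conventional names. The original may use a different rotation order or sign convention.

```latex
P = A \,\bigl[\, R \;\big|\; -R\,\mathbf{c} \,\bigr],
\qquad
\mathbf{c} = (x, y, z)^{\top},
\qquad
R = R(\phi, \theta, \gamma),
\qquad
A = \begin{pmatrix} \alpha_u & s & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{pmatrix}
```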

Using Equation (3), the projection matrix P of the reference image and the projection matrix P' of the comparison image are computed first.
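As an illustration of how a projection matrix can be assembled from a pose and an intrinsic matrix, the sketch below assumes P = A [R | -R c], with c the camera center and R composed as Rz(γ) Ry(θ) Rx(φ). Both the Euler angle order and the direction of R are assumptions, since the text does not specify them, and the function names are illustrative.

```python
import numpy as np

def rotation_from_euler(phi, theta, gamma):
    """Rotation matrix under the assumed convention R = Rz(gamma) @ Ry(theta) @ Rx(phi)."""
    cx, sx = np.cos(phi), np.sin(phi)
    cy, sy = np.cos(theta), np.sin(theta)
    cz, sz = np.cos(gamma), np.sin(gamma)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def projection_matrix(A, position, angles):
    """3 x 4 projection matrix P = A [R | -R c] for camera center c = position."""
    R = rotation_from_euler(*angles)
    c = np.asarray(position, dtype=float).reshape(3, 1)
    return A @ np.hstack([R, -R @ c])
```

Under this convention the camera center always projects to the zero vector, which gives a quick sanity check on any P built this way.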

Next, the three-dimensional point is reconstructed using the corresponding point data and the projection matrices P and P'. If the image coordinates (u, v) of the reference image correspond to the image coordinates (u', v') of the comparison image, the three-dimensional point M = (X, Y, Z) is reconstructed by Equation (4).

Here, p_ij and p'_ij denote the elements in row i and column j of P and P', respectively, and B⁺ is the pseudo-inverse of the matrix B.
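The image of Equation (4) is not reproduced in this text. The standard linear triangulation that matches the symbols defined here (p_ij, p'_ij, and the pseudo-inverse B⁺) is shown below. The name b for the right-hand-side vector is introduced here for illustration. Each row follows from clearing the denominator of the projection equations for u, v, u' and v'.

```latex
M = B^{+}\,\mathbf{b},
\qquad
B = \begin{pmatrix}
p_{31}u - p_{11} & p_{32}u - p_{12} & p_{33}u - p_{13} \\
p_{31}v - p_{21} & p_{32}v - p_{22} & p_{33}v - p_{23} \\
p'_{31}u' - p'_{11} & p'_{32}u' - p'_{12} & p'_{33}u' - p'_{13} \\
p'_{31}v' - p'_{21} & p'_{32}v' - p'_{22} & p'_{33}v' - p'_{23}
\end{pmatrix},
\qquad
\mathbf{b} = \begin{pmatrix}
p_{14} - p_{34}u \\
p_{24} - p_{34}v \\
p'_{14} - p'_{34}u' \\
p'_{24} - p'_{34}v'
\end{pmatrix}
```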

Applying this with each of the two camera motions yields the three-dimensional point expressed in the two coordinate systems. Doing so for all pixels of the reference image produces a three-dimensional corresponding point table such as that shown in FIG. 6.
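The pseudo-inverse reconstruction of a single point can be sketched as follows, assuming the usual linear triangulation in which each image coordinate contributes one linear equation in (X, Y, Z). The function names and the right-hand-side vector `b` are illustrative and do not come from the original.

```python
import numpy as np

def project(P, M):
    """Project a 3D point M with a 3 x 4 projection matrix P to pixel (u, v)."""
    x = P @ np.append(np.asarray(M, dtype=float), 1.0)
    return x[:2] / x[2]

def triangulate(P, P2, uv, uv2):
    """Least-squares 3D point from one correspondence, via the pseudo-inverse."""
    u, v = uv
    u2, v2 = uv2
    # One row per image coordinate: (p_3. * u - p_1.) . M = p_14 - p_34 * u, etc.
    B = np.array([
        P[2, :3] * u - P[0, :3],
        P[2, :3] * v - P[1, :3],
        P2[2, :3] * u2 - P2[0, :3],
        P2[2, :3] * v2 - P2[1, :3],
    ])
    b = np.array([
        P[0, 3] - P[2, 3] * u,
        P[1, 3] - P[2, 3] * v,
        P2[0, 3] - P2[2, 3] * u2,
        P2[1, 3] - P2[2, 3] * v2,
    ])
    return np.linalg.pinv(B) @ b
```

Running `triangulate` once with the projection matrices from one sub-image sequence and once with those from the neighboring sub-image sequence gives the two coordinate triples that form one row of the corresponding point table.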

The above processing is performed for the overlapping frames between all adjacent sub-image sequences. Next, in preparation for re-estimating the integrated camera motion, the obtained three-dimensional corresponding point tables are used to integrate the three-dimensional feature points obtained by the factorization method in step S201 and the projection matrices of all frames of the moving observation image sequence into a common coordinate system (step S204).

The common world coordinate system can be the local coordinate system of any sub-image sequence. For example, the local coordinate system of the sub-image sequence at the center of the time series can be used. In the following, as shown in FIG. 7, the local coordinate system of the central sub-image sequence is taken as the common world coordinate system, and the number of this sub-image sequence is denoted W. T_i is a coordinate transformation matrix: for i < W, T_i transforms from the coordinate system of sub-image sequence i to that of sub-image sequence i+1; for i ≥ W, T_i transforms from the coordinate system of sub-image sequence i+1 to that of sub-image sequence i.

Such a matrix Ti can be calculated from the three-dimensional corresponding point table obtained in step S203, for example by the quaternion-based method described in Non-Patent Document 3. The matrix Ti is calculated between all pairs of adjacent sub-image sequences.
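As an illustration of how a transformation could be estimated from the corresponding point table, the sketch below uses the well-known closed-form unit-quaternion solution for absolute orientation (Horn-style). The method of Non-Patent Document 3 is not reproduced in this passage, so treating it as this solution, and including a scale factor, are assumptions. The function name `similarity_from_correspondences` is hypothetical.

```python
import numpy as np

def similarity_from_correspondences(src, dst):
    """Estimate a 4x4 matrix T with dst ~ s * R @ src + t.

    src, dst : (N, 3) arrays of corresponding 3D points (N >= 3, not collinear),
               e.g. the two columns of a corresponding point table.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    a, b = src - c_src, dst - c_dst          # centred point sets

    S = a.T @ b                              # S[i, j] = sum_k a[k, i] * b[k, j]
    Sxx, Sxy, Sxz = S[0]
    Syx, Syy, Syz = S[1]
    Szx, Szy, Szz = S[2]
    N = np.array([
        [Sxx + Syy + Szz, Syz - Szy,        Szx - Sxz,        Sxy - Syx],
        [Syz - Szy,       Sxx - Syy - Szz,  Sxy + Syx,        Szx + Sxz],
        [Szx - Sxz,       Sxy + Syx,       -Sxx + Syy - Szz,  Syz + Szy],
        [Sxy - Syx,       Szx + Sxz,        Syz + Szy,       -Sxx - Syy + Szz],
    ])
    # eigh sorts eigenvalues in ascending order; the last eigenvector is the
    # unit quaternion (w, x, y, z) of the best rotation.
    w, x, y, z = np.linalg.eigh(N)[1][:, -1]
    R = np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])
    s = np.sqrt((b ** 2).sum() / (a ** 2).sum())   # symmetric scale estimate
    T = np.eye(4)
    T[:3, :3] = s * R
    T[:3, 3] = c_dst - s * R @ c_src
    return T

# Synthetic check data: a known rotation, scale, and translation.
rng = np.random.default_rng(0)
src = rng.normal(size=(30, 3))
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
s_true, t_true = 2.5, np.array([1.0, -2.0, 0.5])
dst = s_true * src @ R_true.T + t_true
T_est = similarity_from_correspondences(src, dst)
```

The scale term is included because reconstructions from separate sub-image sequences generally do not share a common scale; if the scales were known to agree, `s` could be fixed to 1.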

Next, using the obtained Ti, the three-dimensional points of the feature points acquired by the factorization method in each sub-image sequence are converted into the common world coordinate system. This can be calculated using the following equation (5).

Here, (Xi, Yi, Zi) is the three-dimensional coordinate value of a feature point calculated by the factorization method from sub-image sequence number i, and (Xw, Yw, Zw) is the same point expressed in the world coordinate system. This conversion is applied to every three-dimensional point obtained by the factorization method in every sub-image sequence.
The projection matrices expressed in the world coordinate system are calculated in the same way. This can be done using the following equation (6).

Here, Pi denotes the projection matrix of each image of sub-image sequence number i, and Pw denotes the converted projection matrix expressed in the world coordinate system. This conversion is applied to the projection matrices of all images in each sub-image sequence. Through the above processing, the three-dimensional points of the feature points used to estimate the camera motion of each sub-image sequence, and the projection matrices of all images of the moving observation image sequence, can be converted into the common world coordinate system.
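Equations (5) and (6) are not reproduced in this passage. The sketch below is one reading that is consistent with the definition of Ti given above: the transformations between sequence i and sequence W are chained, points are mapped with the chained matrix, and projection matrices are multiplied by its inverse so that the projected pixels stay unchanged. The function names and the dictionary `T` are hypothetical.

```python
import numpy as np

def to_world_transform(i, W, T):
    """4x4 matrix taking coordinates of sub-image sequence i to sequence W.

    T[k] follows the convention in the text:
      k <  W : T[k] maps sequence k     -> sequence k + 1
      k >= W : T[k] maps sequence k + 1 -> sequence k
    """
    M = np.eye(4)
    if i < W:
        for k in range(i, W):              # apply T[i], T[i+1], ..., T[W-1]
            M = T[k] @ M
    else:
        for k in range(i - 1, W - 1, -1):  # apply T[i-1], T[i-2], ..., T[W]
            M = T[k] @ M
    return M

def point_to_world(X_i, M):
    """Map a 3D point from local to world coordinates."""
    return (M @ np.append(X_i, 1.0))[:3]

def projection_to_world(P_i, M):
    """Re-express a 3x4 projection matrix so that it accepts world points."""
    return P_i @ np.linalg.inv(M)

def translation(tx, ty, tz):
    M = np.eye(4)
    M[:3, 3] = [tx, ty, tz]
    return M

def project(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Three sub-image sequences 0, 1, 2 with the middle one (W = 1) as the world.
W = 1
T = {0: translation(1.0, 0.0, 0.0),   # sequence 0 -> sequence 1
     1: translation(0.0, 2.0, 0.0)}   # sequence 2 -> sequence 1
M0 = to_world_transform(0, W, T)
M2 = to_world_transform(2, W, T)

X_0 = np.array([0.5, -0.2, 5.0])
P_0 = np.hstack([np.eye(3), np.zeros((3, 1))])
X_w = point_to_world(X_0, M0)
P_w = projection_to_world(P_0, M0)
```

Because the point and the projection matrix are transformed together, `project(P_w, X_w)` equals `project(P_0, X_0)`: the integration changes the coordinate system but not the predicted image positions.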

Finally, using the three-dimensional coordinate values of the feature points expressed in the common world coordinate system obtained above, the camera motion information re-estimating means 14 re-estimates the camera motion unified in the world coordinate system (step S205).

The camera motion can be re-estimated by bundle adjustment, as also described for step S302. Using the three-dimensional coordinate values of the feature points and the projection matrices expressed in the common world coordinate system obtained in step S204, together with the measurement matrices obtained by feature point tracking in each sub-image sequence in step S202, optimization is performed so as to minimize the reprojection error of the three-dimensional coordinate values of the feature points. The result is taken as the camera motion information integrated into the world coordinate system.
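The quantity minimized by bundle adjustment can be illustrated with a short sketch. It only evaluates the reprojection cost; the optimizer itself is omitted. The observation format `(frame_index, point_index, u, v)` and the function names are assumptions made for this example, not the layout of the patent's measurement matrix.

```python
import numpy as np

def project(P, X):
    """Pinhole projection of a 3D point X with a 3x4 matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def reprojection_cost(Ps, Xs, observations):
    """Sum of squared pixel residuals over all tracked observations.

    Ps           : sequence of 3x4 projection matrices (world coordinates).
    Xs           : sequence of 3D feature points (world coordinates).
    observations : iterable of (frame_index, point_index, u, v).
    """
    cost = 0.0
    for f, j, u, v in observations:
        du, dv = project(Ps[f], Xs[j]) - np.array([u, v])
        cost += du * du + dv * dv
    return cost

# Toy scene: two cameras, three points, noise-free observations.
Ps = [np.hstack([np.eye(3), np.zeros((3, 1))]),
      np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])]
Xs = [np.array([0.0, 0.0, 4.0]),
      np.array([0.5, -0.3, 5.0]),
      np.array([-0.4, 0.2, 6.0])]
observations = [(f, j, *project(Ps[f], Xs[j]))
                for f in range(len(Ps)) for j in range(len(Xs))]

cost_exact = reprojection_cost(Ps, Xs, observations)

# Perturbing one point raises the cost; an optimizer would undo this.
Xs_perturbed = [X.copy() for X in Xs]
Xs_perturbed[0] += np.array([0.1, 0.0, 0.0])
cost_perturbed = reprojection_cost(Ps, Xs_perturbed, observations)
```

In a full bundle adjustment, the camera parameters and the point coordinates would be adjusted jointly by a nonlinear least-squares method to drive this cost down, starting from the integrated values as the initial solution.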

In this embodiment, an example has been described in which the moving observation image sequence dividing means 12 uses the factorization method to estimate the camera motion. However, any means that estimates camera motion from the feature point tracking results of the moving observation images may be used; for example, sequential projective reconstruction may be used.

In this embodiment, the coordinate system integrating means 13 performs the integration using quaternions on the three-dimensional point groups restored by stereo processing. However, the invention is not limited to this, and the transformation matrix may instead be estimated by, for example, a simple least-squares approximation.

The camera parameters and the intermediate three-dimensional coordinates are, for example, stored in a memory for use.

It goes without saying that some or all of the functions of the means in the apparatus shown in FIG. 1 can be implemented as a computer program and that the present invention can be realized by executing that program on a computer, and likewise that the processing procedure shown in FIG. 3 can be implemented as a computer program and executed by a computer. A program for realizing these functions on a computer can be recorded on a computer-readable recording medium, for example an FD, MO, ROM, memory card, CD, DVD, or removable disk, and stored or distributed in that form. The program can also be provided over a network, such as the Internet or electronic mail.

(Example)
The following describes, with reference to the processing flow shown in FIG. 3, the result of processing moving observation images acquired from a single moving video camera using the camera motion information acquisition apparatus described above.

When the processing starts, a moving observation image sequence is first acquired (step S201). FIG. 8 shows part of an example of an acquired moving observation image sequence (801 to 806). The sequence was captured by walking while holding a video camera by hand, and the frames are numbered in chronological order. The area visible in the first frame (801) is no longer visible in the last frame (806), which shows that this is a long image sequence in which feature points cannot be tracked across all frames.

Next, in step S202, the moving observation image sequence is divided. This is described with reference to the processing flowchart of FIG. 5. In step S301, the sub-image sequence is determined, as the number of images over which feature point tracking can be performed stably. FIG. 9 shows part (901 to 903) of an example of the feature point tracking result in one sub-image sequence, arranged in chronological order of capture starting from 901. The center image (902) is the image located exactly at the center of the sub-image sequence in chronological order, and feature points were generated within the square frame in the image using the Harris feature. The feature points generated here are tracked both backward and forward in time and are associated across all frames of the sub-image sequence. It can be confirmed that no feature points are lost and that the tracking is stable.

In step S302, the camera motion is estimated by the factorization method using the feature point tracking result, and in step S303, optimization is performed using this estimate as the initial solution. FIG. 10 shows an example of the estimated camera motion and the restored three-dimensional positions of the feature points. The axes in FIG. 10 are the X, Y, and Z axes, and the figure shows the relative positional relationship between the three-dimensional positions of the feature points and the camera positions. The same applies to FIG. 11 and FIG. 12.
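The core of a factorization method can be sketched in a few lines. This passage does not state which variant is used, so the sketch assumes the classical affine (Tomasi-Kanade style) formulation: the centred measurement matrix is decomposed by SVD and truncated to rank 3. The metric upgrade that removes the remaining affine ambiguity is omitted, and the function name `affine_factorization` is hypothetical.

```python
import numpy as np

def affine_factorization(Wm):
    """Rank-3 factorization of a measurement matrix.

    Wm : (2F, N) matrix holding the tracked image coordinates of N feature
         points over F frames (two rows per frame).
    Returns (M, S, t) with Wm ~ M @ S + t[:, None], where M is (2F, 3) motion,
    S is (3, N) shape, and t is the per-row centroid of the image points.
    """
    t = Wm.mean(axis=1)
    Wc = Wm - t[:, None]                       # remove per-frame translation
    U, d, Vt = np.linalg.svd(Wc, full_matrices=False)
    root = np.sqrt(d[:3])                      # split singular values evenly
    M = U[:, :3] * root
    S = root[:, None] * Vt[:3]
    return M, S, t

# Synthetic tracks: 5 affine cameras observing 20 points, no noise.
rng = np.random.default_rng(0)
F, N = 5, 20
X = rng.normal(size=(3, N))
A = rng.normal(size=(2 * F, 3))                # stacked 2x3 camera matrices
c = rng.normal(size=(2 * F, 1))                # stacked image offsets
Wm = A @ X + c

M, S, t = affine_factorization(Wm)
W_rebuilt = M @ S + t[:, None]
```

The recovered `M` and `S` differ from the true motion and shape by an invertible 3x3 matrix. This is one reason a subsequent optimization step, such as the one in step S303, is useful as a refinement of this initial solution.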

In step S304, it is determined whether or not the division is complete. If the sub-image sequence currently being processed extends to the final frame of the moving observation image sequence, the division ends. Otherwise, the processing proceeds to the determination of the number of overlapping frames (step S305). Here, the optimum number of overlapping frames is determined using the estimated camera motion. The baseline distance between the overlapping frames is used so that the three-dimensional point group can be restored accurately by stereo processing. When this processing is finished, the number n of the sub-image sequence to be processed is set to n = n + 1, and the processing returns to step S301.
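The division loop of steps S301 to S305 can be sketched as follows. This is a simplified illustration: the sub-image sequence length is fixed rather than derived from tracking stability, the camera centres are given rather than estimated, and the threshold `min_baseline` is a hypothetical criterion for a sufficient baseline. The function names are also hypothetical.

```python
import math

def choose_overlap(centers, min_baseline, min_frames=2):
    """Smallest number of trailing frames spanning at least min_baseline.

    centers : list of (x, y, z) camera centres for the current sub-sequence.
    """
    last = centers[-1]
    for k in range(min_frames, len(centers) + 1):
        if math.dist(centers[-k], last) >= min_baseline:
            return k
    return len(centers)

def split_sequence(n_frames, length, overlap_for):
    """Split frame indices 0..n_frames-1 into overlapping (start, end) ranges.

    length      : frames per sub-sequence (fixed here for simplicity).
    overlap_for : callable (start, end) -> number of overlapping frames.
    The end index is exclusive.
    """
    subs, start = [], 0
    while True:
        end = min(start + length, n_frames)
        subs.append((start, end))
        if end == n_frames:                # final frame reached: division ends
            break
        overlap = min(overlap_for(start, end), end - start - 1)
        start = end - overlap              # next sub-sequence reuses the overlap
    return subs

# Toy motion: the camera advances 0.1 units per frame along the X axis.
centers = [(0.1 * i, 0.0, 0.0) for i in range(25)]
subs = split_sequence(
    n_frames=25,
    length=10,
    overlap_for=lambda s, e: choose_overlap(centers[s:e], min_baseline=0.25),
)
```

With this toy motion, four overlapping frames are needed to span a baseline of at least 0.25, so each sub-image sequence shares its last four frames with the next one.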

When the above processing has been repeated and the division of the moving observation image sequence (step S202) is complete, the three-dimensional corresponding point table is created (step S203). This is realized by performing stereo processing on the overlapping frames and, from the resulting corresponding point data, restoring three-dimensional points with the two camera motions estimated from the adjacent sub-image sequences.

FIG. 11 shows an example of a parallax image (corresponding point data) obtained by stereo processing and of the three-dimensional point groups restored using the two camera motions. Because their coordinate systems differ, the restored three-dimensional point groups can be seen to be separated into two. A correspondence has been established for each individual point of these three-dimensional point groups. This processing is performed between all adjacent sub-image sequences.

Next, in step S204, the obtained three-dimensional corresponding point tables are used to integrate the three-dimensional positions of the feature points obtained for each sub-image sequence by the factorization method in step S202, and the projection matrices of all images of the moving observation image sequence, into the common world coordinate system.

Finally, in step S205, the three-dimensional positions of the feature points integrated into the common world coordinate system in step S204, the projection matrices, and the feature point tracking results are used to perform optimization by bundle adjustment so as to minimize the reprojection error of the three-dimensional points, and the integrated camera motion is estimated. FIG. 12 shows an example of the three-dimensional point group obtained by integrating the feature point positions restored for every sub-image sequence into the common world coordinate system, together with the integrated camera motion. FIG. 12 confirms that the camera motion is restored seamlessly.

Through the processing described above, this embodiment can estimate camera motion information from a long image sequence.

FIG. 1 is a configuration diagram of a camera motion information acquisition apparatus according to an embodiment of the present invention.
FIG. 2 is an explanatory diagram showing how data is acquired in the camera motion information acquisition apparatus according to an embodiment of the present invention.
FIG. 3 is a flowchart showing the processing procedure of the camera motion information acquisition method in an embodiment of the present invention.
FIG. 4 is an explanatory diagram showing how the moving observation image sequence is divided in an embodiment of the present invention.
FIG. 5 is a flowchart showing the processing procedure for dividing the moving observation image sequence in an embodiment of the present invention.
FIG. 6 is an explanatory diagram showing an example of a corresponding point table of three-dimensional points expressed in different coordinate systems in an embodiment of the present invention.
FIG. 7 is an explanatory diagram showing the method of integrating coordinate systems into the world coordinate system in an embodiment of the present invention.
FIG. 8 is an explanatory diagram showing an example of moving observation images in an embodiment of the present invention.
FIG. 9 is an explanatory diagram showing an example of the feature point tracking result in a sub-image sequence in an embodiment of the present invention.
FIG. 10 is an explanatory diagram showing an example of the three-dimensional restoration result of the feature points and the camera motion estimated by factorization from the feature point tracking result in an embodiment of the present invention.
FIG. 11 is an explanatory diagram showing an example of a parallax image representing the stereo processing result, and of the three-dimensional point groups restored using the parallax image and the two camera motions, in an embodiment of the present invention.
FIG. 12 is an explanatory diagram showing an example of the result of integrating the three-dimensional point groups and the camera motion into the world coordinate system in an embodiment of the present invention.

Explanation of symbols

11 ... moving observation image sequence acquisition means, 12 ... moving observation image sequence dividing means, 13 ... coordinate system integrating means, 14 ... camera motion information re-estimating means.

Claims

A camera motion information acquisition apparatus that acquires camera motion information representing the three-dimensional position and posture of a camera at the time of image capture from an image sequence observed by a moving image input device, the apparatus comprising:
moving observation image sequence acquisition means for acquiring an image sequence observed by the moving image input device;
moving observation image sequence dividing means for dividing the acquired moving observation image sequence into a plurality of sub-image sequences such that a plurality of images overlap between adjacent sub-image sequences;
coordinate system integrating means for integrating the different coordinate systems of the sub-image sequences into a common world coordinate system; and
camera motion information re-estimating means for re-estimating the camera motion in the common world coordinate system.

The camera motion information acquisition apparatus according to claim 1,
wherein the moving observation image sequence dividing means
determines a sub-image sequence using a feature point tracking result, creates a measurement matrix using the feature point tracking result, and estimates the camera motion by a factorization method.

The camera motion information acquisition apparatus according to claim 1,
wherein the moving observation image sequence dividing means
determines a sub-image sequence using a feature point tracking result, and estimates the camera motion by sequential projective reconstruction using the feature point tracking result.

The camera motion information acquisition apparatus according to claim 1,
wherein the coordinate system integrating means
creates a three-dimensional corresponding point table by stereo processing using the overlapping frames between sub-image sequences, calculates a coordinate transformation matrix between the sub-image sequences using the three-dimensional corresponding point table, and integrates the coordinate systems.

A camera motion information acquisition method for acquiring camera motion information representing the three-dimensional position and posture of a camera at the time of image capture from an image sequence observed by a moving image input device, the method comprising:
an acquisition step in which moving observation image sequence acquisition means acquires an image sequence observed by the moving image input device;
a dividing step in which moving observation image sequence dividing means divides the acquired moving observation image sequence into a plurality of sub-image sequences such that a plurality of images overlap between adjacent sub-image sequences;
an integration step in which coordinate system integrating means integrates the different coordinate systems of the sub-image sequences into a common world coordinate system; and
a re-estimation step in which camera motion information re-estimating means re-estimates the camera motion in the common world coordinate system.

The camera motion information acquisition method according to claim 5,
wherein the dividing step includes:
a step of determining a sub-image sequence using a feature point tracking result; a step of creating a measurement matrix using the feature point tracking result and estimating the camera motion by a factorization method; a step of optimizing the estimated camera motion; a step of determining whether or not the division of the moving observation image sequence is complete; and a step of determining the number of overlapping frames.

The camera motion information acquisition method according to claim 5,
wherein the dividing step includes:
a step of determining a sub-image sequence using a feature point tracking result; a step of estimating the camera motion by sequential projective reconstruction using the feature point tracking result; a step of optimizing the estimated camera motion; a step of determining whether or not the division of the moving observation image sequence is complete; and a step of determining the number of overlapping frames.

The camera motion information acquisition method according to claim 5,
wherein the integration step includes: a step of creating a three-dimensional corresponding point table by stereo processing using the overlapping frames between sub-image sequences; a step of calculating a coordinate transformation matrix between the sub-image sequences using the three-dimensional corresponding point table; and a step of integrating the coordinate systems.

A computer-readable recording medium having recorded thereon a program for causing a computer to execute the camera motion information acquisition method according to claim 5.