JP2001028049A

JP2001028049A - Method and device for three-dimensional shape acquisition, and computer-readable recording medium

Info

Publication number: JP2001028049A
Application number: JP2000137825A
Authority: JP
Inventors: Yukinori Minamida; 幸紀南田; Mikio Shintani; 幹夫新谷; Tatsuki Matsuda; 達樹松田; Mikito Notomi; 幹人納富
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1999-05-11
Filing date: 2000-05-10
Publication date: 2001-01-30

Abstract

PROBLEM TO BE SOLVED: To provide a three-dimensional shape acquisition method, device, and recording medium that restore the three-dimensional shape of an object or scene with high precision even when the target three-dimensional object has a large extent in depth. SOLUTION: The method has an input step of inputting images from a camera to a computer, a first extraction step (S3) of extracting a three-dimensional shape from the images by a specific method, a generation step (S7) of generating a depth image from the three-dimensional shape, a correction step (S8) of correcting image shake caused by camera movement by using the depth image, a second extraction step of extracting a three-dimensional shape by the same specific method from the shake-corrected images, and an output step of outputting the three-dimensional shape to a storage device. The specific method is based on epipolar image analysis.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a three-dimensional shape acquisition method, apparatus, and recording medium for estimating the three-dimensional structure needed to generate highly realistic images from real images. In recent years, demand for three-dimensional computer graphics has been growing in many areas, such as games, movies, and commercial films. However, creating the three-dimensional models that serve as the source data for three-dimensional CG is labor-intensive work. There is therefore a need for methods that automatically or semi-automatically build three-dimensional models of objects and scenes that actually exist.

[0002]

2. Description of the Related Art

Various methods for inputting the three-dimensional shape of objects and scenes have been proposed. For example, R. C. Bolles, H. H. Baker and D. H. Marimont, IJCV, vol. 1, no. 1, 1987, pp. 7-55 (hereinafter Reference 1) proposes a method called epipolar image analysis. A large number of images of a scene taken with a video camera are used to construct a spatio-temporal image, and the three-dimensional shape is obtained by analyzing the straight-line patterns that appear in its x-t plane images. The analysis itself is simple, but the camera must be moved in exact uniform linear motion during capture. If the camera motion includes unexpected movement caused by camera shake or vibration, the patterns in the x-t plane images are no longer straight lines, and the accuracy of the extracted three-dimensional shape drops sharply.
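The relationship that epipolar image analysis relies on can be illustrated with a short sketch. It assumes a pinhole projection X = a·x/z and a camera translating along the x axis at constant speed v, so that a static point at depth z traces a straight line of slope -a·v/z in the x-t plane. The function names, the least-squares line fit, and the sample values are illustrative assumptions and are not taken from Reference 1.

```python
def fit_slope(ts, xs):
    """Least-squares slope of image x-coordinate versus time."""
    n = len(ts)
    t_mean = sum(ts) / n
    x_mean = sum(xs) / n
    num = sum((t - t_mean) * (x - x_mean) for t, x in zip(ts, xs))
    den = sum((t - t_mean) ** 2 for t in ts)
    return num / den


def depth_from_track(ts, xs, a, v):
    """Depth of a static point from its track in the x-t plane.

    Assumes X(t) = a * (x0 - v * t) / z, so the track has slope -a * v / z.
    """
    return -a * v / fit_slope(ts, xs)


# Synthetic track: point at x0 = 2.0 and depth z = 5.0, camera speed v = 0.1.
a, v, x0, z_true = 500.0, 0.1, 2.0, 5.0
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
xs = [a * (x0 - v * t) / z_true for t in ts]
estimated_z = depth_from_track(ts, xs, a, v)
```

When the camera shakes, the track stops being a straight line and the fitted slope no longer corresponds to a single depth, which is the loss of accuracy described above.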

[0003] Japanese Patent Application Laid-Open No. 11-339043, "Three-dimensional shape input method and recording medium recording the three-dimensional shape input method" (hereinafter Reference 2), proposes a pre-processing step for the method of Reference 1. Camera motion due to camera shake and the like is estimated from the trajectories of tracked image feature points, and the images are corrected from this estimate so that the camera appears to have moved in uniform linear motion. This removes the need to move the camera in exact uniform linear motion during capture and makes epipolar image analysis easier to carry out. Similarly, Z. Zhu, G. Xu and X. Lin, "Constructing 3D Natural Scene from Video Sequences with Vibrated Motions," Proc. IEEE VRAIS '98, 1998, pp. 105-112, proposes estimating camera motion due to camera shake and the like from the optical flow of the images and correcting the images from this estimate so that the camera appears to have moved in uniform linear motion. However, with methods such as that of Reference 2, the accuracy of the three-dimensional shape calculation falls when the target three-dimensional object has a large extent in the depth direction. In addition, C. Tomasi and T. Kanade, "Shape and Motion from Image Streams: a Factorization Method -- Full Report on the Orthographic Case," Computer Science Technical Report, CMU-CS-104, Carnegie Mellon Univ., 1992 (Reference 3) and C. J. Poelman and T. Kanade, "A Paraperspective Factorization Method for Shape and Motion Recovery," IEEE PAMI, vol. 19, no. 3, 1997, pp. 206-218, propose a method that takes multiple images of a scene as input, matches feature points between the images, and obtains the three-dimensional coordinates of the feature points by the factorization method. This method needs no special equipment or particular care at input and is easy to carry out, but because the camera model is not a perspective projection, the input error becomes large when the target object has a wide depth range.

[0004] Japanese Patent Application Laid-Open No. 7-146121, "Vision-based method for recognizing three-dimensional position and orientation and vision-based apparatus for recognizing three-dimensional position and orientation", discloses a camera calibration method that estimates the position and orientation of a camera from a single image, taken by that camera, of a three-dimensional object whose position and size are known. However, this method cannot obtain the three-dimensional shape of an unknown object or scene.

[0005] Japanese Patent Application Laid-Open No. 8-181903, "Image pickup apparatus", discloses an apparatus that takes multiple images as input, estimates the translational and rotational offsets between them, and stitches them together. However, this method cannot obtain the three-dimensional shape of an object or scene.

[0006] Japanese Patent Application Laid-Open No. 11-183139, "Cross-section and three-dimensional shape measuring apparatus", discloses an apparatus that measures the three-dimensional shape of a target object from multiple images captured while slit light is projected onto the object. Its drawback is that the slit-light projector and the camera must operate in conjunction with each other, which makes it costly to implement.

[0007]

SUMMARY OF THE INVENTION

The present invention has been made in view of the above points. Its object is to provide a three-dimensional shape acquisition method, apparatus, and recording medium that restore the three-dimensional shape of an object or scene with high accuracy even when the target three-dimensional object has a large extent in the depth direction.

[0008]

To achieve the above object, the present invention can be configured as follows.

[0009] The present invention is a three-dimensional shape acquisition method having: an input step of inputting images from a camera to a computer; a first extraction step of extracting a three-dimensional shape from the images by a predetermined method; a generation step of generating a depth image from the three-dimensional shape; a correction step of correcting image shake caused by camera fluctuation by using the depth image; a second extraction step of extracting a three-dimensional shape by the predetermined method from the shake-corrected images; and an output step of outputting the three-dimensional shape to a storage device.

[0010] According to the present invention, the three-dimensional shape is extracted from images whose shake has been accurately corrected by using the depth image, so a highly accurate three-dimensional shape can be acquired. The predetermined method may be a method based on epipolar image analysis. In that case, a highly accurate three-dimensional shape can be extracted by epipolar image analysis.

[0011] The above configuration may further include a step of generating reproduced images after the three-dimensional shape is extracted and a step of calculating the difference between the reproduced images and the shake-corrected images, and the generation step, the correction step, and the second extraction step may be repeated until the difference falls to or below a predetermined value.

[0012] According to the present invention, repeating the generation step, the correction step, and the second extraction step further improves the accuracy of the three-dimensional shape. In addition, comparison with the reproduced images makes it possible to decide when to end the processing.

[0013] The present invention also provides a three-dimensional shape acquisition apparatus suited to carrying out the above three-dimensional shape acquisition method, and a recording medium on which a three-dimensional shape acquisition program is recorded.

[0014] Other features and advantages of the present invention will become apparent from the following description with reference to the accompanying drawings.

[0015]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Before the principle of the present invention is explained, the method described in Reference 2 for estimating camera fluctuation caused by camera shake and the like is described. In that method, multiple feature points on the screen are tracked over time, and the input images are deformed so that the trajectory of each feature point lies on a single scan line.

[0016] A pinhole camera model is used as the model of the camera used for image input. When the camera is not rotated, the viewing direction is along the z axis, and the scan direction is along the x axis, the projection (X_s, Y_s) of an object point at (x, y, z) can be written, using the angular magnification a, as

X_s = ax/z ... (1)
Y_s = ay/z ... (2)

where a is a camera-specific parameter.
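As a small illustration of equations (1) and (2), the following sketch projects an object point with the pinhole model. The function name `project`, the check for points behind the camera, and the sample values are assumptions added for illustration.

```python
def project(point, a):
    """Project an object point (x, y, z) to image coordinates (X_s, Y_s).

    Implements X_s = a*x/z and Y_s = a*y/z for an unrotated camera that
    looks along the z axis.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return a * x / z, a * y / z


near = project((1.0, 0.5, 2.0), a=100.0)  # (50.0, 25.0)
far = project((1.0, 0.5, 4.0), a=100.0)   # (25.0, 12.5)
```

Doubling the depth halves both image coordinates, which is why displacements caused by camera translation depend on depth.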

[0017] Suppose that at time t the camera has rotated by -α about the x axis, -β about the y axis, and -γ about the z axis. The projected point (X'_s, Y'_s) at this time is obtained as follows.

[0018] First, the rotation matrices about the respective axes are

[0019]

(Equation 1)

and the overall rotation matrix R is

[0020]

(Equation 2)

In particular, when the rotation angles -α, -β, and -γ are small, R can be approximated as

[0021]

(Equation 3)

The projected point is

[0022]

(Equation 4)

which, using approximation (7), can be approximated as

[0023]

(Equation 5)
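The small-angle approximation of the rotation matrix can be checked numerically. The sketch below is not a reconstruction of the equations above: the composition order R = Rz·Ry·Rx, the right-handed sign convention, and all function names are assumptions. To first order the composition order does not matter, which is what makes the approximation convenient.

```python
import math


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(p, q):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def small_angle(ax, ay, az):
    """First-order rotation: identity plus the skew matrix of (ax, ay, az)."""
    return [[1, -az, ay], [az, 1, -ax], [-ay, ax, 1]]


def max_abs_diff(p, q):
    return max(abs(p[i][j] - q[i][j]) for i in range(3) for j in range(3))


# Rotations of -alpha, -beta, -gamma, with angles of a few hundredths of a radian.
alpha, beta, gamma = 0.01, 0.02, 0.015
exact = matmul(rot_z(-gamma), matmul(rot_y(-beta), rot_x(-alpha)))
approx = small_angle(-alpha, -beta, -gamma)
error = max_abs_diff(exact, approx)  # second order in the angles
```

The residual is of second order in the angles, so for shake of about a degree the linearized matrix is accurate to within a small fraction of a percent per entry.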

[0024] Next, consider the case where the camera position deviates from the x axis by -δ_y and -δ_z. In this case,

[0025]

(Equation 6)

When -δ_y and -δ_z are small, this can be approximated as

[0026]

(Equation 7)

Here the second and third terms depend on z, but they can be regarded as constants when the variation in z is small. From equations (12) and (16), the displacements of the projected point caused by fluctuations in the camera direction and position, D_xs = X_s - X'_s and D_ys = Y_s - Y'_s, can be written as

[0027]

(Equation 8)

A(t) to E(t) are constants determined for each frame t and represent the distortion of the input image. If the camera constant a is known, α(t), β(t), γ(t), δ_y, and δ_z can be obtained from them, and the x-direction displacement D_xs = X_s - X'_s can be calculated.

[0028] If A(t) to E(t) are estimated from the trajectories (X_i(t), Y_i(t)) of five or more feature points, D_ys can be obtained from them, and the input image can be deformed by

[0029]

(Equation 9)

to reduce the distortion. Here f_new denotes the corrected input image. This estimation can be performed, for example, by solving by the least squares method for the A(t) to E(t) that minimize

[0030]

(Equation 10)

Robust estimation using

[0031]

(Equation 11)

is also possible (Z. Zhang, et al., "A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry," Artificial Intelligence, vol. 78, 1995, pp. 87-119).

[0032] The above is the method described in Reference 2, but its method of estimating camera fluctuation has a drawback. According to Reference 2, the camera fluctuation parameters (the camera rotation angles α(t), β(t), γ(t) and the camera translations δ_y(t), δ_z(t)) can be estimated from the trajectories {(X'_s, Y'_s)} of feature points tracked in the images by solving the minimization problem that uses equations (25) and (26) above. Strictly speaking, however, equations (15) and (16) cannot be applied unless the depth coordinate z of the subject is known. Reference 2 assumes that the variation in the subject depth z is small and approximately treats z as a constant. It then treats δ_y/z and δ_z/z each as a single variable and estimates these values (equations (22) and (23)). Because this method uniformly treats the depth as a constant even when the subject extends in the depth direction, its estimation accuracy is low.

[0033] In the present invention, a three-dimensional shape is first input by the method of Reference 2. A depth image is generated from this three-dimensional shape (step 7 in FIG. 2, described later), and the camera pose is estimated accurately by using the depth information obtained from the depth image (step 8 in FIG. 2, described later). This makes the extraction of the three-dimensional shape more accurate.

[0034] To this end, a three-dimensional shape is first extracted from multiple images taken with a video camera by combining the image correction method of Reference 2 described above with, for example, the three-dimensional shape extraction method of Reference 1 (three-dimensional shape extraction based on epipolar image analysis) (steps 1 to 3 in FIG. 2, described later). As described above, the image correction method of Reference 2 estimates the camera fluctuation parameters from the trajectories obtained by matching feature points in the images and, based on these parameters, deforms the images so that they look as if the video camera had captured them while moving in a straight line. The three-dimensional shape extraction method of Reference 1 constructs a spatio-temporal image from multiple images taken while the video camera moves in uniform linear motion and obtains the three-dimensional shape by analyzing the straight-line patterns that appear in its x-t plane images. The three-dimensional shape is obtained relative to the camera trajectory and the camera speed. In addition, because the method of Reference 2 has already estimated the camera fluctuation, how the camera actually moved while capturing the subject is known.

[0035] Next, using the three-dimensional shape obtained as described above, a camera is virtually placed on the camera trajectory at the same position and orientation as at capture time, and the depth image that would be obtained by capturing from there is generated (step 7 in FIG. 2, described later). This can be done, for example, by rendering the three-dimensional shape with the well-known Z-buffer method and reading out the resulting Z buffer. Each pixel of the Z buffer holds the depth from the viewpoint to the object point visible at that pixel. As a result, the depth at the time each input image was captured is obtained for every pixel. This depth is taken as the first depth estimate.
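The depth image generation can be sketched with a minimal point-based Z buffer. This is a simplification under stated assumptions: the shape is represented as a list of 3-D points in camera coordinates rather than as surfaces, the principal point is assumed to lie at the image centre, and the function name `render_depth` is hypothetical. A practical implementation would rasterize a surface model, for example with graphics hardware.

```python
import math


def render_depth(points, a, width, height):
    """Return a depth image holding the nearest depth seen at each pixel.

    points: iterable of (x, y, z) in the coordinate frame of the virtual camera.
    Pixels that no point projects to keep the value math.inf.
    """
    depth = [[math.inf] * width for _ in range(height)]
    cx, cy = width / 2.0, height / 2.0  # assumed principal point
    for x, y, z in points:
        if z <= 0:
            continue  # behind the camera
        col = int(round(a * x / z + cx))
        row = int(round(a * y / z + cy))
        inside = 0 <= col < width and 0 <= row < height
        if inside and z < depth[row][col]:
            depth[row][col] = z  # Z-buffer test: keep the nearest surface
    return depth


# Two points on the optical axis (the nearer one wins) and one off-axis point.
points = [(0.0, 0.0, 5.0), (0.0, 0.0, 2.0), (0.1, 0.0, 1.0)]
depth_image = render_depth(points, a=10.0, width=8, height=8)
```

Reading `depth_image[row][col]` then gives the depth estimate for the object point seen at that pixel, which is the quantity the subsequent correction step uses.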

[0036] When the camera fluctuation was first estimated, the depth to each object point was unknown, but the depth image generation described above has now provided an estimate of that depth. This depth information is therefore used to estimate the camera fluctuation again with higher accuracy (step 8 in FIG. 2, described later). In equations (15) and (16) above, z was regarded as a constant because the depth z was unknown. In the present invention, depth information is obtained from the depth image, so a value can be assigned to z. That is, the camera fluctuation is estimated by using the following equations, which are based on equations (17) and (18) above.

[0037]

(Equation 12)

where
D_xs(X'_s, Y'_s): correction amount in the x-axis direction for the point at coordinates (X'_s, Y'_s) of the input image
D_ys(X'_s, Y'_s): correction amount in the y-axis direction for the point at coordinates (X'_s, Y'_s) of the input image
α, β, γ, δ_y, δ_z: variables representing the camera fluctuation, determined for each input image

[0038]
a: internal parameter of the camera
X'_s, Y'_s: coordinates in the input image
z': depth of the object point that appears at coordinates (X'_s, Y'_s) of the input image.

[0039] The above equations express how far each pixel in the input image should be moved in order to correct the camera shake in the input image. z' is obtained from the pixel value at coordinates (X'_s, Y'_s) of the depth image. The unknowns α, β, γ, δ_y, δ_z that represent the camera fluctuation can be estimated from the trajectories (X_i(t), Y_i(t)) of feature points in the input images by the least squares method or the like. The least squares estimation proceeds as follows. If the camera moves in a straight line along the x axis, the y coordinate of each feature point should be constant. The unknowns α, β, γ, δ_y, δ_z are therefore determined by least squares so that the corrected y coordinates of the feature points satisfy this constancy condition as closely as possible. That is, the unknowns can be determined by solving

[0040]

(Equation 13)

Alternatively, as described in Z. Zhang, et al., "A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry," Artificial Intelligence, vol. 78, 1995, pp. 87-119, robust estimation is also possible by using

[0041]

(Equation 14)

in which outliers are excluded by a median operation.
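The least squares estimation with per-point depth can be sketched as follows. The equations themselves are not reproduced in this text, so the correction model in `correction_rows` is an assumption: it is a generic first-order model of small rotations and translations under the pinhole projection, in which the depth z' enters the last two terms, and its signs and term order may differ from those of the patent's equations. The use of NumPy, the function names, the choice of a reference y coordinate per feature point, and the synthetic data are likewise illustrative.

```python
import numpy as np


def correction_rows(X, Y, z, a):
    """Matrix M such that the y correction is D_ys = M @ theta.

    theta = (alpha, beta, gamma, delta_y, delta_z). ASSUMED first-order model:
    D_ys = -alpha*(a + Y^2/a) + beta*X*Y/a + gamma*X + a*delta_y/z - Y*delta_z/z
    X, Y are feature coordinates in the input image; z is their depth.
    """
    return np.column_stack([-(a + Y**2 / a), X * Y / a, X, a / z, -Y / z])


def estimate_fluctuation(X_obs, Y_obs, Y_ref, z, a):
    """Least-squares theta so that Y_obs + D_ys is as close as possible to Y_ref.

    Y_ref stands for the constant y coordinate each feature should keep,
    here assumed to be known from a reference frame.
    """
    M = correction_rows(X_obs, Y_obs, z, a)
    theta, *_ = np.linalg.lstsq(M, Y_ref - Y_obs, rcond=None)
    return theta


# Synthetic check: generate corrections from known parameters and recover them.
rng = np.random.default_rng(0)
n, a = 40, 500.0
X_obs = rng.uniform(-200.0, 200.0, n)
Y_obs = rng.uniform(-150.0, 150.0, n)
z = rng.uniform(2.0, 20.0, n)  # per-point depths read from the depth image
theta_true = np.array([0.002, -0.001, 0.003, 0.01, -0.02])
Y_ref = Y_obs + correction_rows(X_obs, Y_obs, z, a) @ theta_true
theta_est = estimate_fluctuation(X_obs, Y_obs, Y_ref, z, a)
```

Because the model is linear in the five unknowns, five or more feature points with sufficiently varied positions and depths determine them. If all depths were equal, the columns a/z and -Y/z would lose the depth variation that separates translation from rotation.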

[0042] Using the unknowns α, β, γ, δ_y, δ_z obtained in this way to represent the camera fluctuation, D_xs and D_ys are computed from equations (27) and (28), and the image can be corrected by moving each pixel of the input image according to

[0043]

(Equation 15)

where f is the input image before correction and f_new is the input image after correction.
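Moving each pixel by its correction amount can be sketched as a forward warp. The exact form of the warping equation is not reproduced in this text, so the direction of the mapping (input pixel (X, Y) moves to (X + D_xs, Y + D_ys)), the nearest-neighbour rounding, and the function name `warp_image` are assumptions made for illustration.

```python
def warp_image(f, dx, dy, fill=0):
    """Move each pixel of image f by its per-pixel correction (dx, dy).

    f, dx, dy are nested lists of equal size, indexed [row][col].
    Target pixels that receive no source pixel keep the value `fill`.
    """
    height, width = len(f), len(f[0])
    f_new = [[fill] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            new_col = int(round(col + dx[row][col]))
            new_row = int(round(row + dy[row][col]))
            if 0 <= new_col < width and 0 <= new_row < height:
                f_new[new_row][new_col] = f[row][col]
    return f_new


# Shift a 3x3 image down by one row.
f = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
zeros = [[0.0] * 3 for _ in range(3)]
ones = [[1.0] * 3 for _ in range(3)]
f_new = warp_image(f, dx=zeros, dy=ones)
```

Forward mapping can leave holes where no source pixel lands. A practical implementation might instead sample the input at inverse-mapped positions with interpolation.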

[0044] Whereas the method of Reference 2 regards the depth as a constant, the image correction method described here includes the depth value in the calculation in equations (27) and (28) and is therefore more accurate. Consequently, if three-dimensional shape extraction is performed again with the images corrected in this way as input, the three-dimensional shape can be extracted with higher accuracy even when the target three-dimensional object has a large extent in the depth direction.

[0045] Furthermore, if the above procedure is repeated, the depth estimate improves with each repetition, and the depth accuracy increases.

[0046] The number of times the above procedure is repeated is arbitrary. The embodiments below show one method that repeats it until the difference between the reproduced images and the input images falls to or below a fixed value, and another that repeats it a predetermined number of times.
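The two termination rules can be sketched with one loop. All names here are hypothetical: `correct`, `extract`, `reproduce`, and `difference` stand for the correction, extraction, reproduced-image generation, and difference steps, and the toy stand-ins below only mimic an error that shrinks on each pass.

```python
def refine_shape(images, correct, extract, reproduce, difference,
                 tol=None, max_iters=5):
    """Repeat depth-aware correction and extraction.

    Stops when the reproduced images differ from the input images by at
    most `tol` (if given), or after `max_iters` repetitions.
    """
    corrected = correct(images, None)       # first pass: no shape or depth yet
    shape = extract(corrected)
    for _ in range(max_iters):
        if tol is not None and difference(reproduce(shape), images) <= tol:
            break                           # reproduced images are close enough
        corrected = correct(images, shape)  # correction using the current shape
        shape = extract(corrected)
    return shape


# Toy stand-ins: the "shape" is just an error value that halves on each pass.
def toy_correct(images, shape):
    return shape


def toy_extract(corrected):
    return 8.0 if corrected is None else corrected / 2.0


def toy_reproduce(shape):
    return shape


def toy_difference(reproduced, images):
    return reproduced


by_threshold = refine_shape([], toy_correct, toy_extract, toy_reproduce,
                            toy_difference, tol=1.0, max_iters=10)
by_count = refine_shape([], toy_correct, toy_extract, toy_reproduce,
                        toy_difference, max_iters=2)
```

Passing `tol` selects the threshold rule, with `max_iters` as a safety bound. Omitting `tol` gives the fixed-count rule.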

[0047] The camera fluctuation estimation algorithm of Reference 2 corrects the images by an amount of translation in the horizontal, vertical, and diagonal directions on the mapped two-dimensional pixels, regardless of the distance (depth) between the object and the camera. The present invention estimates the fluctuation by using depth information, so it takes into account how far each object is from the camera: on the two-dimensional pixels, distant objects move less and near objects move more. Equations (27) and (28) above express this. Specifically, the value z', which represents depth, appears in the fourth term on the right side of equation (27) and in the fourth and fifth terms on the right side of equation (28). This allows the fluctuation to be estimated more accurately than when the depth is regarded as a constant.

[0048] (First Embodiment) FIG. 1 shows a configuration example of a three-dimensional shape acquisition apparatus according to an embodiment of the present invention. The apparatus has a CPU (central processing unit) 1, a memory 3, an input device 5, a display device 7, a CD-ROM drive 9, and a hard disk 11. The CPU 1 controls the entire three-dimensional shape acquisition apparatus. The memory 3 holds the data and programs processed by the CPU 1. The input device 5 is a camera or the like for inputting images. The display device 7 is a device such as a display. The CD-ROM drive 9 drives a CD-ROM or the like and reads from and writes to it. The hard disk 11 stores programs and the three-dimensional shape data obtained by the processing of the present invention. The three-dimensional shape acquisition apparatus operates under a program that executes three-dimensional shape acquisition processing based on the principle of the present invention. The program may be installed in the apparatus in advance, or it may, for example, be stored on a CD-ROM and loaded onto the hard disk 11 through the CD-ROM drive 9. When the program is started, the required program portions are loaded into the memory 3 and the processing is executed. In the present invention, images are input from the input device 5, three-dimensional shape extraction is performed, and the result is output to the display device 7 and the hard disk 11.

[0049] Next, the procedure of the three-dimensional shape acquisition processing executed by the three-dimensional shape acquisition apparatus is described with reference to FIG. 2. This processing is based on the principle of the present invention.

[0050] In step 1, the subject is captured while the camera is translated perpendicular to its optical axis, and the images are input. The camera motion may include unknown movement caused by camera shake or other factors.

[0051] In step 2, camera shake correction is performed: the images are deformed to remove the effect of camera shake. This processing can be performed by using equation (24). Processing by epipolar analysis is then performed.

[0052] Epipolar image analysis presupposes that the camera captures the subject while translating perpendicular to its optical axis. Hereinafter, translation perpendicular to the optical axis is called the ideal motion. When the camera moves in an unknown way because of camera shake or other factors, the accuracy of epipolar image analysis drops sharply. To improve that accuracy, the camera shake correction in step 2 estimates how far the camera deviates from the ideal motion (hereinafter called the camera fluctuation), deforms the images based on this estimate, and thereby obtains images that look as if the camera had captured them while moving ideally.

[0053] In step 3, the three-dimensional shape extraction process based on epipolar image analysis takes the shake-corrected image as input and extracts a three-dimensional shape. This processing can be performed, for example, by the method described in Document 1.

[0054] In the reproduced image creation process of step 4, the input image is reproduced using the three-dimensional shape described above. A virtual camera is placed at each position on the actual camera trajectory, and the image that would be obtained by photographing the subject from that position is created. The image can be created with, for example, a general perspective transformation and the Z-buffer method, but the process is not limited to these. The reproduced image is used to decide when to end the loop.
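As a non-limiting illustration of step 4, the following Python sketch renders a set of object points by a perspective transformation with a Z-buffer. The point format `(x, y, z, luminance)`, the function name `render_points`, the placement of the principal point at the image centre, and the restriction of the virtual camera to a position `cam_x` on the x axis are assumptions made for this sketch; they are not taken from the specification.

```python
def render_points(points, cam_x, a, width, height):
    """Render (x, y, z, luminance) points seen from a camera at (cam_x, 0, 0).

    The camera looks along +z and a is the camera constant. Returns the
    reproduced image and the Z-buffer (the depth kept at each pixel).
    """
    image = [[0.0] * width for _ in range(height)]
    zbuf = [[float("inf")] * width for _ in range(height)]
    cx, cy = width // 2, height // 2  # assumed principal point
    for x, y, z, lum in points:
        if z <= 0:
            continue  # point is behind the camera
        # Perspective transformation onto the image plane.
        u = int(round(a * (x - cam_x) / z)) + cx
        v = int(round(a * y / z)) + cy
        # Z-buffer test: keep only the nearest point at each pixel.
        if 0 <= u < width and 0 <= v < height and z < zbuf[v][u]:
            zbuf[v][u] = z
            image[v][u] = lum
    return image, zbuf
```

Because the nearest point wins at each pixel, a nearer surface hides a farther one regardless of the order in which the points are drawn.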

[0055] In step 5, the difference between the reproduced image and the input image is calculated for the loop-end decision. If the extracted three-dimensional shape and the estimated camera fluctuation were both accurate, the reproduced image would match the input image exactly. In practice they do not match, because the estimate of the camera fluctuation is not accurate. This step therefore evaluates how large the difference is. Specifically, in this embodiment the difference is calculated by the expression

[0056]

(Equation 16)

where
$\varepsilon$: the difference,
$I_t(x, y)$: the luminance value at coordinates $(x, y)$ of the $t$-th input image,
$S_t(x, y)$: the luminance value at coordinates $(x, y)$ of the $t$-th reproduced image,

although other expressions may be used.
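The body of Equation 16 is not reproduced in this text. Since the specification permits other expressions, the following Python sketch uses the sum of absolute luminance differences over all frames and pixels as one possible difference measure. This choice and the function name `image_difference` are assumptions of the sketch and should not be read as Equation 16 itself.

```python
def image_difference(inputs, reproduced):
    """Sum |I_t(x, y) - S_t(x, y)| over all frames t and pixels (x, y).

    inputs and reproduced are lists of frames; each frame is a list of rows
    of luminance values, and both lists have the same shape.
    """
    total = 0.0
    for frame_i, frame_s in zip(inputs, reproduced):
        for row_i, row_s in zip(frame_i, frame_s):
            for i_val, s_val in zip(row_i, row_s):
                total += abs(i_val - s_val)
    return total
```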

[0057] In step 6, the loop condition is tested: whether the difference is at or below a fixed threshold, or the loop count has reached a fixed limit. If either holds, the loop is exited. Otherwise, the loop count is incremented by one and the loop continues. The threshold and the loop limit used in the test are set by the practitioner to values deemed appropriate and are supplied externally.

[0058] In the depth image creation process of step 7, a depth image corresponding to each of the reproduced images is created. A depth image is a grayscale image in which the luminance of each pixel corresponds to the distance to the object visible at that pixel. FIG. 3 shows an example of a depth image.
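As an illustration of how a depth image might be formed, the following Python sketch maps per-pixel distances to gray levels. The specification states only that luminance corresponds to distance; the linear mapping, the convention that farther is brighter, the treatment of pixels without a depth as background, and the name `depth_to_gray` are assumptions of this sketch.

```python
import math


def depth_to_gray(depth, max_level=255):
    """Map a depth map (rows of distances) to integer gray levels.

    Assumed convention: the nearest finite depth maps to 0, the farthest to
    max_level, and pixels with no recorded depth (math.inf) are treated as
    background at max_level.
    """
    finite = [d for row in depth for d in row if math.isfinite(d)]
    if not finite:
        return [[max_level] * len(row) for row in depth]
    near, far = min(finite), max(finite)
    span = far - near
    gray = []
    for row in depth:
        gray_row = []
        for d in row:
            if not math.isfinite(d):
                gray_row.append(max_level)  # background pixel
            elif span == 0:
                gray_row.append(0)  # all objects at the same distance
            else:
                gray_row.append(int(round(max_level * (d - near) / span)))
        gray.append(gray_row)
    return gray
```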

[0059] In step 8, camera shake correction using depth information is performed. That is, the depth information contained in the depth images is used to correct camera shake accurately. As explained in the principle of the present invention, this processing can be performed using equations (27) to (30).

[0060] When step 6 finds that the difference is at or below the threshold or that the loop count has reached the limit, step 9 outputs the finally obtained three-dimensional shape, for example to a file.
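The control flow of steps 2 to 9 can be sketched in Python as follows. The individual processes are passed in as callables because their internals are described elsewhere; the function name `acquire_shape`, the parameter names, and the choice to apply the depth-based correction of step 8 to the original input images are assumptions of this sketch.

```python
def acquire_shape(images, correct_initial, extract_shape, render, difference,
                  make_depth_images, correct_with_depth, max_diff, max_loops):
    """Iterate shape extraction and depth-based shake correction.

    Returns the final shape, the last difference, and the loop count.
    """
    corrected = correct_initial(images)              # step 2
    loops = 0
    while True:
        shape = extract_shape(corrected)             # step 3
        reproduced = render(shape)                   # step 4
        eps = difference(images, reproduced)         # step 5
        if eps <= max_diff or loops >= max_loops:    # step 6
            return shape, eps, loops                 # step 9 (caller stores it)
        loops += 1
        depth = make_depth_images(shape)             # step 7
        corrected = correct_with_depth(images, depth)  # step 8
```

With this structure, the threshold `max_diff` and the limit `max_loops` are supplied by the caller, matching the statement that both values are given externally.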

[0061] FIG. 4 shows results of implementing the present invention. An outdoor scene was used as the input image, and processing followed the flow of the embodiment above. For the three-dimensional shape extraction based on epipolar image analysis in step 3, the method described in Document 1 was combined with the DP-Strip method described in M. Shinya, T. Saito, T. Mori and N. Osumi, "VR Models from Epipolar Images: An Approach to Minimize Errors in Synthesized Images," LNCS 1352, Computer Vision - ACCV '98, Vol. II, pp. 471-478. The DP-Strip method determines the topological structure of the surfaces between reconstructed object points; it uses dynamic programming to select the topological structure that minimizes the difference between the reproduced image and the input image. Once the surface topology is determined, occlusion, in which a nearer surface hides a more distant object, can be reproduced.

[0062] FIG. 4(a) shows the spatiotemporal image of the input images. To show the effect of the present invention clearly, FIG. 4 also includes epipolar images: FIG. 4(b) is the epipolar image of the input images at the cutting plane indicated in FIG. 4(a). The closer the reproduced image generated from the three-dimensional shape extracted in step 3 is to this input image, the more accurately the shape has been reconstructed.
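For illustration, an epipolar image can be cut from a spatiotemporal image as in the following Python sketch. It assumes that the spatiotemporal image is stored as a list of frames indexed `frames[t][y][x]` and that the cutting plane is a fixed scanline `y`, which is the usual arrangement when the camera scans along the x axis; the function name `epipolar_image` is hypothetical.

```python
def epipolar_image(frames, y):
    """Cut the spatiotemporal volume frames[t][y][x] at scanline y.

    The result has one row per frame t and one column per image column x.
    """
    return [list(frame[y]) for frame in frames]
```

In such a slice, each object point traces a line whose slope depends on its depth, which is why occlusion differences between the input and the reproduced images are visible in FIG. 4.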

[0063] FIG. 4(c) is an epipolar image reproduced from a three-dimensional shape extracted without applying the present invention. The occlusion in the circled region differs from that of the input image, which shows that the accuracy of the three-dimensional shape extraction is low.

[0064] FIG. 4(d) shows the result for the same input data when the present invention is applied with one iteration. The occlusion in the circled region is closer to that of the input image. This is attributed to improved depth estimation of the object points.

[0065] FIG. 4(e) shows the result for the same input data when the present invention is applied with two iterations. In addition to the improvement shown in FIG. 4(d), the occlusion in the circled region is closer to that of the input image. This is attributed to a further improvement in the depth estimation of the object points.

[0066] (Second Embodiment) Next, a second embodiment of the present invention is described. FIG. 5 is a flowchart of the processing in the second embodiment. The second embodiment creates, from a single image captured by a camera, the image that would be obtained by capturing from a different camera position and orientation.

[0067] In step 10, a single image is input. Equations (27) and (28) of the present invention can be used as expressions that give how far each pixel of a captured image moves when the camera is displaced. The camera displacement is expressed by the rotation angles $\alpha$, $\beta$, $\gamma$ about the x, y, and z axes and the small translation distances $\delta_y$, $\delta_z$ along the y and z axes, but the pixel displacements $D_{xs}$, $D_{ys}$ cannot be calculated accurately unless the depth $z'$ of the object point seen at the pixel is known. Therefore, in step 11, a depth value is input for each pixel of the input image. One way to obtain the depth values, when the input image is one of many frames taken by a video camera, is to recover the object shape by the factorization method described in Document 3 and compute the depth from it.

[0068] In step 12, the image that would be obtained if the camera were displaced is created. The depth obtained in step 11 is substituted for $z'$ in equations (27) and (28) to obtain $D_{xs}$ and $D_{ys}$, and each pixel of the input image is moved according to equation (30). This yields the image that would be captured from the other camera position described by the parameters $\alpha$, $\beta$, $\gamma$, $\delta_y$, $\delta_z$. Using equations (27) and (28) of the present invention in this way, the image as captured from another camera position can be created accurately.
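The pixel displacement and the image deformation can be sketched in Python as follows, using the expressions for D_xs, D_ys and the deformed image as they are stated in the claims. Several points are assumptions of this sketch rather than part of the specification: image coordinates are measured in pixels from an assumed principal point at the image centre, the displacement is evaluated at each output pixel with the depth stored at that pixel, sampling is nearest-neighbour, and the names `displacement` and `warp_image` are hypothetical.

```python
def displacement(xs, ys, depth, a, alpha, beta, gamma, delta_y, delta_z):
    """Return (D_xs, D_ys) at image point (xs, ys) whose object depth is depth."""
    d_x = (alpha * xs * ys / a - beta * (a + xs * xs / a) - gamma * ys
           + (delta_z / depth) * xs)
    d_y = (alpha * (a + ys * ys / a) - beta * xs * ys / a + gamma * xs
           + (delta_z / depth) * ys - a * delta_y / depth)
    return d_x, d_y


def warp_image(image, depth, a, alpha, beta, gamma, delta_y, delta_z, fill=0.0):
    """Deform image so that each output pixel samples the displaced position."""
    height, width = len(image), len(image[0])
    cx, cy = width // 2, height // 2  # assumed principal point
    out = [[fill] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            xs, ys = col - cx, row - cy
            d_x, d_y = displacement(xs, ys, depth[row][col], a,
                                    alpha, beta, gamma, delta_y, delta_z)
            src_col = int(round(xs + d_x)) + cx
            src_row = int(round(ys + d_y)) + cy
            # Pixels whose source falls outside the image keep the fill value.
            if 0 <= src_row < height and 0 <= src_col < width:
                out[row][col] = image[src_row][src_col]
    return out
```

The translation terms are divided by the depth while the rotation terms are not, so nearby points shift more than distant ones for the same camera translation. This is why a per-pixel depth is needed for an accurate result.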

[0069] The present invention is also applicable to three-dimensional shape extraction methods other than those based on epipolar analysis. It is particularly effective for three-dimensional shape extraction methods that are sensitive to unexpected camera motion.

[0070] The present invention is not limited to the embodiments above, and various modifications and applications are possible within the scope of the claims.

[0071]

[Effects of the Invention] According to the present invention, a three-dimensional shape is extracted from images whose shake has been accurately corrected using depth images. The three-dimensional shape of an object or scene can therefore be reconstructed with high accuracy even when the target three-dimensional object extends over a large range in the depth direction. In addition, the method of the second embodiment makes it possible to accurately create the image that would be captured from another camera position.

[Brief description of the drawings]

FIG. 1 is a configuration example of the three-dimensional shape acquisition device according to an embodiment of the present invention.

FIG. 2 is a flowchart showing the procedure of the three-dimensional shape acquisition process in the first embodiment.

FIG. 3 is an example of a depth image obtained in step 7.

FIG. 4 is a diagram showing results of implementing the present invention.

FIG. 5 is a flowchart of the processing in the second embodiment.

[Explanation of symbols]

1: CPU, 3: memory, 5: input device, 7: display device, 9: CD-ROM drive, 11: hard disk

Continuation of front page

(72) Inventor: Tatsuki Matsuda, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo
(72) Inventor: Mikito Notomi, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo
F-term (reference): 5B057 BA02 CA12 CA16 CB13 CB16 CD02 CD03 CD14

Claims

[Claims]

1. A three-dimensional shape acquisition method comprising: an input step of inputting an image from a camera to a computer; a first extraction step of extracting a three-dimensional shape from the image by a predetermined method; a generation step of generating a depth image from the three-dimensional shape; a correction step of correcting image blur due to camera fluctuation using the depth image; a second extraction step of extracting a three-dimensional shape by the predetermined method using the blur-corrected image; and an output step of outputting the three-dimensional shape to a storage device.

2. The method according to claim 1, wherein the predetermined method is a method based on epipolar image analysis.

3. The three-dimensional shape acquisition method according to claim 1, further comprising a step of generating a reproduced image after the three-dimensional shape is extracted and a step of calculating a difference between the reproduced image and the blur-corrected image, wherein the generation step, the correction step, and the second extraction step are repeated until the difference becomes equal to or less than a predetermined value.

4. The three-dimensional shape acquisition method according to claim 2, wherein the correction step includes a step of deforming the input image $f$ into $f^{+}_{\mathrm{new}}$ by

$$D_{xs}(X'_s, Y'_s) = \frac{\alpha X'_s Y'_s}{a} - \beta\left(a + \frac{{X'_s}^2}{a}\right) - \gamma Y'_s + \frac{\delta_z}{z'} X'_s,$$

$$D_{ys}(X'_s, Y'_s) = \alpha\left(a + \frac{{Y'_s}^2}{a}\right) - \frac{\beta X'_s Y'_s}{a} + \gamma X'_s + \frac{\delta_z}{z'} Y'_s - \frac{a\,\delta_y}{z'},$$

$$f^{+}_{\mathrm{new}}(X_s, Y_s; t) = f\bigl(X'_s + D_{xs}(X'_s, Y'_s),\; Y'_s + D_{ys}(X'_s, Y'_s);\; t\bigr),$$

where the line-of-sight direction is the z axis, the camera scanning direction is the x axis, $a$ is the camera constant, $(X_s, Y_s)$ is the projection point of an object point at $(x, y, z)$, $(X'_s, Y'_s)$ is the projection point when the camera is rotated by $-\alpha$ about the x axis, $-\beta$ about the y axis, and $-\gamma$ about the z axis and displaced from the x axis by $-\delta_y$ and $-\delta_z$, $(D_{xs}, D_{ys})$ is the displacement of the projection point caused by the variation in camera orientation and position, and $z'$ is the depth at $(X'_s, Y'_s)$ obtained from the depth image.

5. An image creation method comprising: an image input step of inputting an image captured from a certain camera position to a computer; a depth input step of inputting a depth value of the image; and a creation step of creating an image as captured from a different camera position by using the depth value and applying an equation for correcting image blur due to camera fluctuation.

6. A three-dimensional shape acquisition device comprising: input means for inputting an image from a camera; first extraction means for extracting a three-dimensional shape from the image by a predetermined method; generation means for generating a depth image from the three-dimensional shape; correction means for correcting image blur due to camera fluctuation using the depth image; second extraction means for extracting a three-dimensional shape by the predetermined method using the blur-corrected image; and a storage device that stores the extracted three-dimensional shape.

7. The three-dimensional shape acquiring apparatus according to claim 6, wherein the predetermined method is a method based on epipolar image analysis.

8. The three-dimensional shape acquisition device according to claim 6, further comprising means for generating a reproduced image after the three-dimensional shape is extracted and means for calculating a difference between the reproduced image and the blur-corrected image, wherein the processing by the generation means, the correction means, and the second extraction means is repeated until the difference becomes equal to or less than a predetermined value.

9. The three-dimensional shape acquisition device according to claim 7, wherein the correction means includes means for deforming the input image $f$ into $f^{+}_{\mathrm{new}}$ by

$$D_{xs}(X'_s, Y'_s) = \frac{\alpha X'_s Y'_s}{a} - \beta\left(a + \frac{{X'_s}^2}{a}\right) - \gamma Y'_s + \frac{\delta_z}{z'} X'_s,$$

$$D_{ys}(X'_s, Y'_s) = \alpha\left(a + \frac{{Y'_s}^2}{a}\right) - \frac{\beta X'_s Y'_s}{a} + \gamma X'_s + \frac{\delta_z}{z'} Y'_s - \frac{a\,\delta_y}{z'},$$

$$f^{+}_{\mathrm{new}}(X_s, Y_s; t) = f\bigl(X'_s + D_{xs}(X'_s, Y'_s),\; Y'_s + D_{ys}(X'_s, Y'_s);\; t\bigr),$$

where the line-of-sight direction is the z axis, the camera scanning direction is the x axis, $a$ is the camera constant, $(X_s, Y_s)$ is the projection point of an object point at $(x, y, z)$, $(X'_s, Y'_s)$ is the projection point when the camera is rotated by $-\alpha$ about the x axis, $-\beta$ about the y axis, and $-\gamma$ about the z axis and displaced from the x axis by $-\delta_y$ and $-\delta_z$, $(D_{xs}, D_{ys})$ is the displacement of the projection point caused by the variation in camera orientation and position, and $z'$ is the depth at $(X'_s, Y'_s)$ obtained from the depth image.

10. An image creation device comprising: image input means for inputting an image captured from a certain camera position; depth input means for inputting a depth value of the image; and creation means for creating an image as captured from another camera position by using the depth value and applying an equation for correcting image blur due to camera fluctuation.

11. A computer-readable recording medium recording a program for causing a computer to execute a process of acquiring a three-dimensional shape, the program causing the computer to execute: an input procedure of inputting an image from a camera to the computer; a first extraction procedure of extracting a three-dimensional shape from the image by a predetermined method; a generation procedure of generating a depth image from the three-dimensional shape; a correction procedure of correcting image blur due to camera fluctuation using the depth image; a second extraction procedure of extracting a three-dimensional shape by the predetermined method using the blur-corrected image; and an output procedure of outputting the three-dimensional shape to a storage device.

12. The computer-readable recording medium according to claim 11, wherein the predetermined method is a method based on epipolar image analysis.

13. The computer-readable recording medium according to claim 11, wherein the program further causes the computer to execute a procedure of generating a reproduced image after the three-dimensional shape is extracted and a procedure of calculating a difference between the reproduced image and the blur-corrected image, and the generation procedure, the correction procedure, and the second extraction procedure are repeated until the difference becomes equal to or less than a predetermined value.

14. The computer-readable recording medium according to claim 12, wherein the correction procedure includes a procedure of deforming the input image $f$ into $f^{+}_{\mathrm{new}}$ by

$$D_{xs}(X'_s, Y'_s) = \frac{\alpha X'_s Y'_s}{a} - \beta\left(a + \frac{{X'_s}^2}{a}\right) - \gamma Y'_s + \frac{\delta_z}{z'} X'_s,$$

$$D_{ys}(X'_s, Y'_s) = \alpha\left(a + \frac{{Y'_s}^2}{a}\right) - \frac{\beta X'_s Y'_s}{a} + \gamma X'_s + \frac{\delta_z}{z'} Y'_s - \frac{a\,\delta_y}{z'},$$

$$f^{+}_{\mathrm{new}}(X_s, Y_s; t) = f\bigl(X'_s + D_{xs}(X'_s, Y'_s),\; Y'_s + D_{ys}(X'_s, Y'_s);\; t\bigr),$$

where the line-of-sight direction is the z axis, the camera scanning direction is the x axis, $a$ is the camera constant, $(X_s, Y_s)$ is the projection point of an object point at $(x, y, z)$, $(X'_s, Y'_s)$ is the projection point when the camera is rotated by $-\alpha$ about the x axis, $-\beta$ about the y axis, and $-\gamma$ about the z axis and displaced from the x axis by $-\delta_y$ and $-\delta_z$, $(D_{xs}, D_{ys})$ is the displacement of the projection point caused by the variation in camera orientation and position, and $z'$ is the depth at $(X'_s, Y'_s)$ obtained from the depth image.

15. A computer-readable recording medium recording a program for causing a computer to execute an image creation process, the program causing the computer to execute: an image input procedure of inputting an image captured from a certain camera position to the computer; a depth input procedure of inputting a depth value of the image; and a creation procedure of creating an image as captured from another camera position by using the depth value and applying an equation for correcting image blur due to camera fluctuation.