JP2017123087A - Program, device and method for calculating normal vector of planar object reflected in continuous photographic images

Info

Publication number: JP2017123087A
Application number: JP2016002279A
Authority: JP
Inventors: Tatsuya Kobayashi (小林 達也); Haruhisa Kato (加藤 晴久)
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2016-01-08
Filing date: 2016-01-08
Publication date: 2017-07-13
Anticipated expiration: 2036-01-08
Also published as: JP6515039B2

Abstract

PROBLEM TO BE SOLVED: To provide a program and the like capable of calculating the normal vector of a planar object appearing in continuous captured images with a greatly reduced amount of calculation and without major restrictions on camera work, on the precondition that the target object is a planar object (or an object that can be approximated by a plane).
SOLUTION: The program has normal vector calculation means that calculates the camera pose parameters R_i and t_i and the normal vector n_t that minimize a reprojection error function, using Nc frames i of continuous captured images, the three-dimensional coordinates X_j of Np registration points p_j detected from a registered image of the planar object, and the tracking coordinates m_ij of each of the Np registration points p_j appearing in each frame. The normal vector calculation means causes a computer to express the three-dimensional coordinates X_j of each registration point p_j of the registered image in terms of the registration point p_j, a reference point X_0 through which the object plane passes, and the normal vector n_t, and to calculate X_j by back projection of the registration point p_j onto the object plane.
SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for calculating the normal vector (normal direction) of a planar object from images captured by a camera.

Computer vision and robot vision techniques make it possible to estimate and track the position and orientation of a target object appearing in video by analyzing the video captured by a camera. For example, there is a technique for tracking a vehicle appearing in the continuous captured images of a surveillance camera (see, for example, Patent Document 1). This technique tracks the target object in the image frame by frame using template matching. However, it tracks only the position of the vehicle in the image and cannot track its orientation.

In contrast, augmented reality technology, which superimposes virtual information on a target object, estimates and tracks the six-degree-of-freedom position and orientation of the target object relative to the camera and can thereby display highly realistic augmented reality. For example, by placing a virtual object on a target object specified in an image and estimating and tracking the position and orientation of that target object, the virtual object can be displayed as if it existed in the specified region (see, for example, Patent Document 2).
There is also a technique that acquires the three-dimensional structure of a target object in real time with a 3D sensor and estimates and tracks its position and orientation (see, for example, Patent Document 3).
Furthermore, there is a technique that estimates the three-dimensional structure of captured images using only a monocular camera (see, for example, Patent Document 4).

Patent Document 1: Japanese Patent No. 3651745
Patent Document 2: JP 2013-164697 A
Patent Document 3: JP 2014-511591 A
Patent Document 4: JP 2014-149582 A
Patent Document 5: JP 2015-069354 A

Non-Patent Document 1: A. Ruiz et al., "Practical planar metric rectification," in Proc. of British Machine Vision Conference, 2006.
Non-Patent Document 2: A. Mulloni et al., "User friendly SLAM initialization," in Proc. of IEEE International Symposium on Mixed and Augmented Reality, 2013.
Non-Patent Document 3: F. Yu et al., "3D Reconstruction from Accidental Motion," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, 2014.

However, with the prior art described above, when the position and orientation of a target object are estimated and tracked from the continuous captured images of a monocular camera and the three-dimensional structure of the target object is unknown, changes in appearance caused by changes in the imaging angle cannot be predicted. Robustness is therefore significantly impaired.

In addition, when only a monocular camera is used, the constraints on processing load, estimation accuracy, and camera work (how the camera is moved during shooting) are severe. For example, the camera must be moved so as to acquire a group of images that includes large viewpoint changes.

In contrast, under the precondition that the target object appearing in the captured images is a planar object (or an object that can be approximated by a plane), the amount of calculation can be reduced and the estimation accuracy increased. Representing the plane of the target object's three-dimensional structure by a "normal vector" greatly reduces the number of unknown parameters, which determines the amount of calculation.
If the target object is planar, its orientation can also be estimated from, for example, a homography matrix between images (see, for example, Non-Patent Document 1).
There is also a technique that improves estimation accuracy when the target object includes a plurality of planes (see, for example, Patent Document 5).
However, these techniques also impose constraints on camera work.

Furthermore, as a technique for removing the camera work constraints of existing techniques, there is a technique that uses bundle adjustment to estimate the three-dimensional structure of a scene from continuous images captured with small camera work (see, for example, Non-Patent Documents 2 and 3). By assuming small camera work, these techniques constrain the parameters to be estimated and improve the accuracy of the initial parameters.
However, these techniques restrict the pattern of the specified target object: practical accuracy is obtained only when feature points (image features) are detected evenly across the target object. They also have poor robustness against feature point tracking failures.

Therefore, the present invention provides a program, an apparatus, and a method capable of calculating the normal vector of a planar object appearing in continuous captured images with as little calculation as possible and without major constraints on camera work, under the precondition that the target object is a planar object (or an object that can be approximated by a plane).

According to the present invention, a program that causes a computer to function so as to calculate the normal vector of a planar object appearing in captured images has
normal vector calculation means for calculating the camera pose parameters R_i (= [r_ix, r_iy, r_iz]^T) and t_i (= [t_ix, t_iy, t_iz]^T) and the normal vector n_t (= [x_n, y_n, z_n]^T, x_n^2 + y_n^2 + z_n^2 = 1) that minimize a reprojection error function, using
Nc frames i of continuous captured images,
the three-dimensional coordinates X_j (= [x_j, y_j, z_j]^T) of Np registration points p_j (= [u_j, v_j]^T, j = 1 to Np) detected from a registered image of the planar object, and
the tracking coordinates m_ij (= [u_ij, v_ij]^T, i = 1 to Nc) of each of the Np registration points p_j appearing in each frame,
wherein the normal vector calculation means causes the computer to express the three-dimensional coordinates X_j of each registration point p_j of the registered image in terms of the registration point p_j, a reference point X_0 (= [x_0, y_0, z_0]^T) through which the object plane passes, and the normal vector n_t, and to calculate X_j by back projection of the registration point p_j onto the object plane.

According to another embodiment of the program of the present invention, it is also preferable that the normal vector calculation means causes the computer to calculate the reference point X_0 by back projection (X_0 = 1/w_0 [u_0, v_0, 1]^T) of the centroid p_0 (= [u_0, v_0]^T) of the registration points p_j.

According to another embodiment of the program of the present invention, it is also preferable that, in the normal vector calculation means, the reprojection error function is expressed by the following equations:

$$R_i', t_i', n_t' = \arg\min_{R_i,\, t_i,\, n_t} \sum_{i=1}^{N_c} \sum_{j=1}^{N_p} \left\| m_{ij} - \mathrm{proj}(R_i, t_i, X_j) \right\|^2$$

$$X_j = \frac{n_t \cdot X_0}{n_t \cdot p_j'}\, p_j' = \frac{x_n x_0 + y_n y_0 + z_n z_0}{x_n u_j + y_n v_j + z_n}\, [u_j, v_j, 1]^T$$

$$x_n^2 + y_n^2 + z_n^2 = 1$$

$$z_n = \sqrt{1 - x_n^2 - y_n^2}$$

where
p_j' (= [u_j, v_j, 1]^T): homogeneous coordinate representation of the registration point p_j
i: index of the Nc frames of the captured images
m_ij: tracking coordinates of each of the Np registration points p_j appearing in frame i
R_i and t_i: camera pose parameters of frame i
n_t: normal vector of the planar object
X_0: reference point through which the object plane passes
proj(R_i, t_i, X_j): projection function [R_i | t_i] X_j' of the three-dimensional coordinates X_j
X_j': homogeneous coordinate representation of X_j
R_i' and t_i': estimated values of the camera pose parameters R_i and t_i of frame i
n_t': estimated value of the normal vector n_t of the planar object
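
The back projection and the error sum above can be sketched in a few lines of plain Python. This is a minimal illustration, not the patented implementation: the function names are hypothetical, R_i is passed as a 3x3 rotation matrix, and proj is assumed to include the usual perspective division by depth, which the text leaves implicit.

```python
def backproject_to_plane(p, n, x0):
    """X_j = (n . X_0 / n . p') p', where p' = [u, v, 1] is p in homogeneous form."""
    p_h = (p[0], p[1], 1.0)
    num = sum(a * b for a, b in zip(n, x0))   # n_t . X_0
    den = sum(a * b for a, b in zip(n, p_h))  # n_t . p_j'
    scale = num / den
    return tuple(scale * c for c in p_h)


def project(R, t, X):
    """Apply [R | t] to X, then divide by depth (assumed perspective division)."""
    cam = [sum(R[r][c] * X[c] for c in range(3)) + t[r] for r in range(3)]
    return (cam[0] / cam[2], cam[1] / cam[2])


def reprojection_error(Rs, ts, n, x0, reg_pts, tracks):
    """Sum over frames i and points j of |m_ij - proj(R_i, t_i, X_j)|^2."""
    total = 0.0
    for R, t, frame_tracks in zip(Rs, ts, tracks):
        for p, m in zip(reg_pts, frame_tracks):
            X = backproject_to_plane(p, n, x0)
            u, v = project(R, t, X)
            total += (m[0] - u) ** 2 + (m[1] - v) ** 2
    return total
```

Because every X_j is derived from p_j, X_0, and n_t, the 3D points never appear as independent unknowns; only the poses and the normal would be varied by an optimizer.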

According to another embodiment of the program of the present invention, it is also preferable that, in the normal vector calculation means, the number of unknown parameters in the reprojection error function is 6Nc + 2: six for the camera pose parameters R_i (= [r_ix, r_iy, r_iz]^T) and t_i (= [t_ix, t_iy, t_iz]^T) of each of the Nc frames i, plus two for the normal vector n_t (= [x_n, y_n, z_n]^T, x_n^2 + y_n^2 + z_n^2 = 1) of the registration points, whose unit-length constraint leaves two free components.
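
As a rough illustration of this count, the sketch below recovers the full normal from its two free components using z_n = sqrt(1 - x_n^2 - y_n^2) and compares 6Nc + 2 with the count for a bundle adjustment that treats every point's three coordinates as free (6Nc + 3Np). The function names are hypothetical, and the comparison figure is a general property of bundle adjustment rather than a number stated in this passage.

```python
import math


def normal_from_xy(x_n, y_n):
    """Recover the unit normal [x_n, y_n, z_n] from its two free components."""
    z_sq = 1.0 - x_n * x_n - y_n * y_n
    if z_sq < 0.0:
        raise ValueError("x_n^2 + y_n^2 must not exceed 1")
    return (x_n, y_n, math.sqrt(z_sq))


def unknowns_planar(num_frames):
    """6 pose parameters per frame plus 2 for the plane normal."""
    return 6 * num_frames + 2


def unknowns_free_points(num_frames, num_points):
    """6 pose parameters per frame plus 3 coordinates per free 3D point."""
    return 6 * num_frames + 3 * num_points
```

For example, with 30 frames and 100 points the planar formulation has 182 unknowns, while the free-point formulation has 480.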

According to another embodiment of the program of the present invention, it is also preferable that the normal vector calculation means causes the computer to calculate the normal vector, under the precondition that the camera work of the captured images is small, by taking the camera pose parameter R_i as the identity matrix and t_i as the zero vector.

According to another embodiment of the program of the present invention, it is also preferable that, in the normal vector calculation means, the initial value of the normal vector in the bundle adjustment of the reprojection error function is the direction parallel to the optical axis, n_t = [0, 0, 1]^T.

According to another embodiment of the program of the present invention, it is also preferable that the computer further functions as image feature tracking means for excluding mistracked tracking coordinates by using a homography matrix between the registration points of the registered image and the tracking coordinates of the captured image.
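
One way such an exclusion step could look is sketched below. It assumes that a homography H between the registered image and the current frame has already been estimated (for example by a robust estimator), and it keeps only the tracks whose transfer error is within a threshold. The function names and the threshold parameter are hypothetical; the text does not specify the rejection criterion.

```python
import math


def apply_homography(H, p):
    """Map the 2D point p through the 3x3 homography H."""
    u, v = p
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return (x / w, y / w)


def keep_consistent_tracks(H, reg_pts, tracked_pts, max_error):
    """Return indices j whose tracked point lies within max_error of H * p_j."""
    kept = []
    for j, (p, m) in enumerate(zip(reg_pts, tracked_pts)):
        u, v = apply_homography(H, p)
        if math.hypot(m[0] - u, m[1] - v) <= max_error:
            kept.append(j)
    return kept
```

Tracks that fail this check would simply be left out of the reprojection error sum for that frame.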

According to another embodiment of the program of the present invention, it is also preferable that the computer further functions as registered image storage means for specifying, in accordance with a user operation, a target region of the captured image in which the planar object appears, and storing that target region as the registered image.

According to another embodiment of the program of the present invention, it is also preferable that the registered image storage means causes the computer to geometrically transform the registered image into a frontalized image using the normal vector calculated by the normal vector calculation means, and to store the frontalized image as the registered image.

According to the present invention, an image processing apparatus that calculates the normal vector of a planar object appearing in captured images has
normal vector calculation means for calculating the camera pose parameters R_i (= [r_ix, r_iy, r_iz]^T) and t_i (= [t_ix, t_iy, t_iz]^T) and the normal vector n_t (= [x_n, y_n, z_n]^T, x_n^2 + y_n^2 + z_n^2 = 1) that minimize a reprojection error function, using
Nc frames i of continuous captured images,
the three-dimensional coordinates X_j (= [x_j, y_j, z_j]^T) of Np registration points p_j (= [u_j, v_j]^T, j = 1 to Np) detected from a registered image of the planar object, and
the tracking coordinates m_ij (= [u_ij, v_ij]^T, i = 1 to Nc) of each of the Np registration points p_j appearing in each frame,
wherein the normal vector calculation means expresses the three-dimensional coordinates X_j of each registration point p_j of the registered image in terms of the registration point p_j, a reference point X_0 (= [x_0, y_0, z_0]^T) through which the object plane passes, and the normal vector n_t, and calculates X_j by back projection of the registration point p_j onto the object plane.

According to the present invention, a normal vector calculation method of an apparatus that calculates the normal vector of a planar object appearing in captured images has
a step of calculating the camera pose parameters R_i (= [r_ix, r_iy, r_iz]^T) and t_i (= [t_ix, t_iy, t_iz]^T) and the normal vector n_t (= [x_n, y_n, z_n]^T, x_n^2 + y_n^2 + z_n^2 = 1) that minimize a reprojection error function, using
Nc frames i of continuous captured images,
the three-dimensional coordinates X_j (= [x_j, y_j, z_j]^T) of Np registration points p_j (= [u_j, v_j]^T, j = 1 to Np) detected from a registered image of the planar object, and
the tracking coordinates m_ij (= [u_ij, v_ij]^T, i = 1 to Nc) of each of the Np registration points p_j appearing in each frame,
wherein the step expresses the three-dimensional coordinates X_j of each registration point p_j of the registered image in terms of the registration point p_j, a reference point X_0 (= [x_0, y_0, z_0]^T) through which the object plane passes, and the normal vector n_t, and calculates X_j by back projection of the registration point p_j onto the object plane.

According to the program, apparatus, and method of the present invention, the normal vector of a planar object appearing in continuous captured images can be calculated with a greatly reduced amount of calculation and without major constraints on camera work, under the precondition that the target object is a planar object (or an object that can be approximated by a plane). Specifically, because the number of unknown parameters in the bundle adjustment used to calculate the normal vector is reduced, the processing load is reduced and the estimation accuracy is increased. In addition, using a homography matrix improves robustness against feature point tracking failures.

FIG. 1 is a functional configuration diagram of the image processing apparatus according to the present invention.
FIG. 2 is an explanatory diagram showing the normal vectors of a registered image and a frontalized image.
FIG. 3 is an explanatory diagram showing the relationship between the normal vector and the frontalizing rotation matrix.
FIG. 4 is an explanatory diagram showing the correspondence between the registration points of the registered image and the tracking coordinates of the captured images.
FIG. 5 is an image diagram showing the correspondence between registration points and tracking coordinates.
FIG. 6 is an image diagram showing feature points (tracking coordinates) expressed with different numbers of unknown parameters.
FIG. 7 is an explanatory diagram showing the geometric relationship among a registration point, the reference point, the normal vector, and the object plane.
FIG. 8 is a graph showing the geometric relationship among the optical axis center, the registration points of the registered image on the image plane, and the normal vector of the captured image.
FIG. 9 is a graph showing the convergence of the normal vector and the processing time against the number of frames under bundle adjustment.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a functional configuration diagram of the image processing apparatus according to the present invention.

The image processing apparatus of FIG. 1 has an image acquisition unit 10, a registered image storage unit 11, an image feature tracking unit 12, and a normal vector calculation unit 13. These functional components are realized by executing a program that causes a computer installed in the apparatus to function. The processing flow of these functional components can also be understood as the normal vector calculation method of the apparatus.

According to the present invention, when the normal vector of a planar object appearing in continuous captured images is calculated, the following two preconditions are assumed in order to reduce the number of unknown parameters of the three-dimensional structure and to perform the calculation with as little computation as possible.
(Precondition 1) The target object is a planar object (or an object that can be approximated by a plane).
(Precondition 2) The camera work of the captured images is small.

The "target object" appearing in the captured images is a "planar object" that is flat and rigid; objects that deform over time are not targeted. Because the normal vector is estimated by feature point tracking, it is also assumed that a relatively large number of feature points can be detected from the target object in the captured images. Feature points may be difficult to detect on a single-color plane. Assumed target objects include, for example, magazines, posters, advertisements, trading cards, and building walls.

"Camera work" means how the user moves the camera. "The camera work is small" refers to a state in which the user photographs the target object while holding the camera as still as possible. In contrast, "the camera work is large" refers to a state in which the user moves the camera widely and photographs the target object from various positions and directions.

[Image acquisition unit 10]
The image acquisition unit 10 acquires continuous captured images from the camera. The captured images may be recorded in advance or may be input in time series from outside via an interface (for example, live video). The interface may be a communication interface connected to a network or an input interface from the camera. The acquired captured images are output to the image feature tracking unit 12. When the registered image is designated by the user, the acquired captured images are also output to the registered image storage unit 11.

[Registered image storage unit 11]
When starting pose estimation of a planar object in the captured images, the user sets a registered image via the user interface and instructs the start of pose estimation. The registered image storage unit 11 stores, from among the captured images input from the image acquisition unit 10, the registered image that serves as the target region.

The registered image storage unit 11 can also accept, from the user via the user interface, an indication of the rough image range in which the planar object appears. For example, the user indicates a rectangular region or a contour that contains the planar object by a pointing operation. It is preferable that the planar object be captured relatively large in the registered image. This registered image is then tracked through the continuously input captured images.

The registered image storage unit 11 can make image feature tracking more robust against the background by trimming the registered image to the image range or by applying image processing that fills the background outside the image range with a single color. It may also hold a mask image representing the target object (an image covering the same range as the registered image in which the luminance value is 255 inside the target region and 0 elsewhere), and the image feature tracking unit 12 may use that mask image to limit the range in which image features are detected.
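
The mask image described above can be illustrated with a small sketch. It assumes a rectangular target region given as (left, top, right, bottom) in pixel coordinates and represents the mask as nested lists; both choices, and the function names, are hypothetical simplifications for illustration.

```python
def make_mask(width, height, rect):
    """Build a mask with luminance 255 inside rect and 0 elsewhere."""
    left, top, right, bottom = rect
    return [
        [255 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


def filter_points_by_mask(mask, points):
    """Keep only integer pixel points (x, y) that fall on a 255-valued mask pixel."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    return [
        (x, y)
        for (x, y) in points
        if 0 <= x < width and 0 <= y < height and mask[y][x] == 255
    ]
```

A detector would then only report features at the locations that survive `filter_points_by_mask`.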

FIG. 2 is an explanatory diagram showing the normal vectors of a registered image and a frontalized image.

In FIG. 2, the registered image is geometrically transformed using the normal vector n_t = [x_n, y_n, z_n]^T calculated by the normal vector calculation unit 13. This converts the registered image in which the planar object appears into a frontalized image (a transformed image that simulates the image obtained when the planar object is captured from the front). When the planar object is photographed from the front, the normal vector is perpendicular to the image plane and parallel to the optical axis. Conversely, when the planar object is photographed directly from the side, the normal vector is perpendicular to the optical axis and parallel to the image plane.

The registered image storage unit 11 can store the frontalized image. Converting the registered image into a frontalized image improves the accuracy of image recognition and pose tracking.

FIG. 3 is an explanatory diagram showing the relationship between the normal vector and the frontalizing rotation matrix.

According to FIG. 3, the rotation matrix Rrec transforms the normal vector n_t of the planar object so that it coincides with the vector n_Z parallel to the optical axis (Z axis).

$$R_{rec} = R_Y(\theta_Y)\, R_X(\theta_X)$$

R_X: rotation matrix about the X axis
R_Y: rotation matrix about the Y axis
θ_X: angle of n_t about the X axis
θ_Y: angle of n_t about the Y axis
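
A minimal sketch of building such a rotation follows. The text does not spell out how θ_X and θ_Y are measured, so the angle convention here is an assumption: θ_X is chosen to cancel the Y component of n_t, and θ_Y then cancels the remaining X component, using right-handed rotation matrices. All function names are hypothetical.

```python
import math


def rot_x(a):
    """Right-handed rotation matrix about the X axis."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(a):
    """Right-handed rotation matrix about the Y axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)] for r in range(3)]


def matvec(A, v):
    return [sum(A[r][k] * v[k] for k in range(3)) for r in range(3)]


def frontalizing_rotation(n):
    """Return R_Y(theta_Y) R_X(theta_X) that maps the unit normal n onto [0, 0, 1]."""
    x, y, z = n
    theta_x = math.atan2(y, z)             # zeroes the Y component
    theta_y = math.atan2(-x, math.hypot(y, z))  # then zeroes the X component
    return matmul(rot_y(theta_y), rot_x(theta_x))
```

Applying `matvec(frontalizing_rotation(n), n)` to any unit normal should return a vector numerically equal to [0, 0, 1].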

The registered image can be frontalized using this rotation matrix Rrec. For example, the registered image can be converted into a frontalized image by the following homography matrix Hrec.

$$H_{rec} = [p_1, p_2, p_4]$$

$$P_{rec} = [p_1, p_2, p_3, p_4] = A\,[R_{rec} \mid t]$$

t: translation vector for adjusting the position of the planar object in the registered image
It is desirable to set t so that the planar object in the frontalized image comes to the center of the image.
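
The construction of Hrec from the columns of Prec can be sketched as follows, with 3x3 matrices as nested lists. The function name is hypothetical, and the values of A, Rrec, and t are assumed to be supplied by the caller.

```python
def frontalizing_homography(A, R_rec, t):
    """Form P_rec = A [R_rec | t] and return H_rec = [p1, p2, p4] (columns 1, 2, 4)."""
    # 3x4 matrix [R_rec | t]
    Rt = [list(R_rec[r]) + [t[r]] for r in range(3)]
    # 3x4 projection matrix P_rec
    P = [[sum(A[r][k] * Rt[k][c] for k in range(3)) for c in range(4)] for r in range(3)]
    # Drop the third column p3
    return [[P[r][0], P[r][1], P[r][3]] for r in range(3)]
```

With an identity rotation and t = [0, 0, 1]^T the result equals A, which is a convenient sanity check.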

If image distortion is ignored, the intrinsic parameter matrix A of the camera can be expressed as follows.

$$A = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

f_x, f_y: focal lengths; c_x, c_y: offsets of the optical axis
The focal lengths f_x, f_y and the optical axis offsets c_x, c_y can be calculated in advance by calibration. Two-dimensional pixel coordinates [u, v]^T on the image plane can be converted into normalized coordinates [u', v']^T using the intrinsic parameters as follows.

$$u' = \frac{u - c_x}{f_x}, \qquad v' = \frac{v - c_y}{f_y}$$

Expressing two-dimensional coordinates as normalized coordinates in this way keeps the equations concise. Note that, for this reason, the two-dimensional coordinates of registration points and feature points in the present invention are given in normalized coordinates.
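
The normalization is simple enough to show directly. The sketch below implements the conversion and its inverse; the function names and the sample calibration values used in the note are hypothetical.

```python
def normalize_pixel(u, v, fx, fy, cx, cy):
    """Convert pixel coordinates [u, v] to normalized coordinates [u', v']."""
    return ((u - cx) / fx, (v - cy) / fy)


def denormalize_point(u_n, v_n, fx, fy, cx, cy):
    """Convert normalized coordinates back to pixel coordinates."""
    return (u_n * fx + cx, v_n * fy + cy)
```

For instance, with f_x = f_y = 500 and (c_x, c_y) = (320, 240), the pixel (420, 240) maps to the normalized point (0.2, 0.0).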

[Image feature tracking unit 12]
The image feature tracking unit 12 tracks the registration points (image features) of the registered image through the continuous captured images.

FIG. 4 is an explanatory diagram showing the correspondence between the registration points of the registered image and the tracking coordinates of the captured images.

There are generally the following two techniques for tracking image features between images.
<Feature-point-tracking-based>
<Local-feature-matching-based>
The image feature tracking unit 12 uses either or both of these to acquire the tracking coordinates m_ij (= [u_ij, v_ij]^T, i = 1 to Nc) in the captured images that correspond to the registration points p_j (= [u_j, v_j]^T, j = 1 to Np) of the registered image (normal vector n_t).

＜特徴点追跡ベース＞
特徴点追跡ベースの技術によれば、Harrisコーナー検出器や、ＦＡＳＴコーナー検出器によって、登録画像の登録点に対応する撮影画像の中の追跡座標を検出する。一般的に、ＫＬＴ(Kanade-Lucas-Tomasi)アルゴリズムや、特徴点周囲局所領域を切り出したパッチのテンプレートマッチングの技術が用いられる。テンプレートマッチングの類似度算出方法としては、ＮＣＣ(Normalized Cross Correlation)やＳＳＤ(Sum of Squared Difference)を用いることができる。 <Feature point tracking base>
According to the feature point tracking-based technique, tracking coordinates in a captured image corresponding to a registration point of a registered image are detected by a Harris corner detector or a FAST corner detector. In general, a KLT (Kanade-Lucas-Tomasi) algorithm or a template matching technique for patches obtained by cutting out local regions around feature points is used. As a template matching similarity calculation method, NCC (Normalized Cross Correlation) or SSD (Sum of Squared Difference) can be used.
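The two similarity measures named above can be sketched as follows. This is a minimal illustration, assuming each patch is given as a flat list of pixel intensities of equal length; the function names are hypothetical, and the NCC shown is the plain (not mean-subtracted) variant.

```python
import math


def ssd(patch_a, patch_b):
    """Sum of Squared Differences: 0 for identical patches, larger is worse."""
    return sum((a - b) ** 2 for a, b in zip(patch_a, patch_b))


def ncc(patch_a, patch_b):
    """Normalized Cross Correlation: 1 for patches equal up to a positive gain."""
    num = sum(a * b for a, b in zip(patch_a, patch_b))
    den = math.sqrt(sum(a * a for a in patch_a) * sum(b * b for b in patch_b))
    return num / den if den > 0 else 0.0
```

SSD is cheaper but sensitive to brightness changes, whereas NCC is unaffected by a uniform gain in intensity, which is one reason either may be chosen depending on the lighting conditions.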

＜局所特徴量のマッチングベース＞
局所特徴量のマッチングベースの技術によれば、ＳＩＦＴ(Scale-Invariant Feature Transform)やＳＵＲＦ(Speeded Up Robust Features)のような、位置や回転、歪みの変化に頑健な特徴量のマッチングによって画像間の点対応を取得する。ＳＩＦＴは、１枚の画像からは１２８次元の特徴ベクトルの集合を抽出し、スケールスペースを用いて特徴的な局所領域を解析し、そのスケール変化及び回転に不変となる特徴ベクトルを記述する。ＳＵＲＦは、積分画像を利用することによってＳＩＦＴよりも高速処理が可能であって、１枚の画像から６４次元の特徴ベクトルの集合を抽出する。また、バイナリ特徴ベクトル抽出アルゴリズムであるＦＡＳＴ(Features from Accelerated Segment Test)やＦＲＥＡＫ(Fast Retina Keypoint)の場合、ＳＩＦＴやＳＵＲＦよりも高速且つコンパクトな特徴ベクトルを抽出することができる。 <Local feature matching base>
According to the local feature matching-based technique, point correspondences between images are obtained by matching features, such as SIFT (Scale-Invariant Feature Transform) and SURF (Speeded Up Robust Features), that are robust to changes in position, rotation, and distortion. SIFT extracts a set of 128-dimensional feature vectors from one image; it analyzes characteristic local regions using a scale space and describes feature vectors that are invariant to scale change and rotation. SURF can perform higher-speed processing than SIFT by using an integral image, and extracts a set of 64-dimensional feature vectors from one image. Further, with FAST (Features from Accelerated Segment Test) or FREAK (Fast Retina Keypoint), which are binary feature vector extraction algorithms, feature vectors can be extracted faster and more compactly than with SIFT or SURF.

ここで、登録点と追跡座標との間で、背景画像が写りこむ場合や、照明変化、パターン模様等に起因して、誤追跡（アウトライア）される場合がある。後段の法線ベクトル算出部１３に入力される登録点及び追跡座標には、アウトライアの対応点が入力されないことが好ましい。ここで、画像特徴追跡部１２は、登録画像の登録点と撮影画像の追跡座標との間のホモグラフィ行列を用いて、誤追跡された追跡座標を除外する。即ち、アウトライアの追跡座標が除外され、インライアの追跡座標のみが、法線ベクトル算出部１３へ出力される。 Here, mistracking (outliers) may occur between the registration points and the tracking coordinates when a background image is captured, or because of illumination changes, patterned textures, or the like. It is preferable that outlier correspondences are not included in the registration points and tracking coordinates input to the normal vector calculation unit 13 at the subsequent stage. The image feature tracking unit 12 therefore excludes mistracked tracking coordinates using a homography matrix between the registration points of the registered image and the tracking coordinates of the captured image. That is, the outlier tracking coordinates are excluded, and only the inlier tracking coordinates are output to the normal vector calculation unit 13.

図５は、登録点と追跡座標との対応を表す画像図である。 FIG. 5 is an image diagram showing correspondence between registered points and tracking coordinates.

図５によれば、対象物体が平面であると仮定して、幾何的な検証によって誤追跡を除外することができる。平面物体上の登録点の位置の変化は、ホモグラフィ変換で表現することができる。そのために、画像特徴追跡部１２は、登録画像と撮影画像との間のホモグラフィ行列Ｈ_RCを推定し、そのＨ_RCに該当しない追跡座標を除外することによって、追跡のロバスト性を向上させることができる。 According to FIG. 5, assuming that the target object is planar, mistracking can be excluded by geometric verification. The change in position of a registration point on a planar object can be expressed by a homography transformation. Therefore, the image feature tracking unit 12 can improve the robustness of tracking by estimating the homography matrix H_RC between the registered image and the captured image and excluding the tracking coordinates that do not conform to H_RC.

具体的には、各登録点ｐ_jについて、Ｈ_RCを用いた変換位置と、実際の追跡座標ｍ_ijの距離ｄ_ijを、以下の式によって算出する。
ｄ_ij＝｜ｍ_ij−ｐ'_ij｜
Ｐ'_ij￣＝Ｈ_RCｐ_ij￣
Ｐ_ij'￣：ｐ_ijの同次表現
ｄ_ij：追跡座標ｍ_ijのホモグラフィ行列Ｈ_RCからの乖離度
ｄ_ijが一定の閾値以上の追跡座標ｍ_ijを除外することによって、追跡のロバスト性を向上させることができる。 Specifically, for each registration point p_j, the distance d_ij between its position transformed by H_RC and the actual tracking coordinate m_ij is calculated by the following equations.
d_ij = |m_ij − p′_ij|
p̄′_ij = H_RC p̄_j
p̄′_ij, p̄_j: homogeneous representations of p′_ij and p_j
d_ij: deviation of the tracking coordinate m_ij from the homography matrix H_RC
The robustness of tracking can be improved by excluding the tracking coordinates m_ij whose d_ij is equal to or greater than a certain threshold.
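The deviation test can be sketched as follows for a single frame. This is a minimal illustration, assuming that the homography H_RC has already been estimated and is given as a 3×3 nested list; the function names and the threshold value are hypothetical.

```python
import math


def apply_homography(H, p):
    """Map a 2D point p through the 3x3 homography H (homogeneous division)."""
    u, v = p
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w


def inlier_indices(H, reg_points, tracked, threshold):
    """Return the indices j whose deviation d_ij is below the threshold."""
    keep = []
    for j, (p, m) in enumerate(zip(reg_points, tracked)):
        pu, pv = apply_homography(H, p)
        d = math.hypot(m[0] - pu, m[1] - pv)  # d_ij = |m_ij - p'_ij|
        if d < threshold:  # d_ij >= threshold is treated as mistracked
            keep.append(j)
    return keep
```

Only the registration points and tracking coordinates at the returned indices would then be passed on to the normal vector calculation.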

一般に、ホモグラフィ行列の推定値は、登録点の投影誤差関数の最小化問題を解くことによって得られる。ここで、更に、ＲＡＮＳＡＣ(RANdom SAmple Consensus)のようなロバスト推定を併用することによって、Ｈ_RCに該当する追跡座標のインライアとアウトライアとを、ロバストに分離することができる。 In general, the estimate of the homography matrix is obtained by solving the minimization problem of the projection error function of the registration points. By additionally using a robust estimation method such as RANSAC (RANdom SAmple Consensus), the tracking coordinates can be robustly separated into inliers, which conform to H_RC, and outliers.

登録画像の全面に平面物体が映る場合、画像間のホモグラフィ行列を高精度に推定することができる。一方で、登録画像の一部にしか平面物体が映らない場合（背景が写る場合）や、対象物体が一部立体的構造を含む場合、画像特徴の対応からホモグラフィ行列を一意に推定することができない。このとき、正しい追跡座標であっても、ホモグラフィ推定によってアウトライアとなることが起こりうる。これらの点は平面構造に合致しないため、追跡結果に含まれると法線ベクトルの推定精度の劣化要因となる。そのため、ロバスト推定で除外されることが好ましい。 When the planar object fills the entire registered image, the homography matrix between the images can be estimated with high accuracy. On the other hand, when the planar object appears in only part of the registered image (when the background is also captured), or when the target object partially includes a three-dimensional structure, the homography matrix cannot be uniquely estimated from the image feature correspondences. In this case, even correctly tracked coordinates may become outliers in the homography estimation. Since these points do not conform to the planar structure, including them in the tracking result degrades the estimation accuracy of the normal vector. It is therefore preferable that they be excluded by robust estimation.

このような点が少数であれば、ロバスト推定で正しく除外することができる。尚、背景に対して平面物体の面積が相対的に小さい場合は、アウトライアの除外に失敗し、誤ったホモグラフィ行列を算出してしまうことに留意すべきである。 If there are few such points, they can be correctly excluded by robust estimation. It should be noted that if the area of the planar object is relatively small with respect to the background, outlier exclusion fails and an incorrect homography matrix is calculated.

＜ホモグラフィ行列を用いた登録画像の画像範囲の推定＞
ユーザの指示に応じて平面物体が映る登録画像から、ある程度、背景領域を除外することができる。しかしながら、ユーザが厳密に、平面物体を登録画像全体で指定することは難しく、登録画像の指定には誤差が含まれる。尚、本来、利便性の観点からは、対象領域をユーザに指定させる必要が無いことが好ましい。 <Estimating the image range of a registered image using a homography matrix>
The background region can be excluded to some extent from the registered image in which the planar object is reflected according to the user's instruction. However, it is difficult for the user to strictly specify a planar object for the entire registered image, and the registration image specification includes an error. Note that, from the viewpoint of convenience, it is originally preferable that the user does not need to designate the target area.

ここで、画像特徴追跡部１２は、登録画像中の対象領域を推定し、ホモグラフィ行列の算出に用いる画像特徴を制限することで、ホモグラフィ推定のロバスト性を向上させる。また、対象領域外の画像特徴に対して、ホモグラフィ行列と合致するかを検証することによって、画像特徴追跡のロバスト性を向上させる。最終的には、インライアとなった登録点のみを用いて対象領域を更新することができる。 Here, the image feature tracking unit 12 estimates the target region in the registered image and restricts the image features used for calculating the homography matrix, thereby improving the robustness of the homography estimation. Also, robustness of image feature tracking is improved by verifying whether image features outside the target region match the homography matrix. Eventually, the target area can be updated using only the registration points that have become inliers.

＜登録画像に映る平面物体の位置の推定＞
平面物体は、登録画像の中央付近に写っていることが一般的に期待されており、対象領域の初期値として、登録画像の中央付近の領域を指定することが望ましい。ホモグラフィ行列に合致するインライアの取得後、インライアを包含する領域を対象領域として更新する。 <Estimation of the position of a planar object in the registered image>
It is generally expected that the planar object appears in the vicinity of the center of the registered image, and it is desirable to designate an area near the center of the registered image as the initial value of the target area. After obtaining the inlier that matches the homography matrix, the region including the inlier is updated as the target region.

＜大きすぎるカメラワークに対する処理の中断＞
追跡に成功した追跡座標（画像特徴）の数が著しく少ない場合、複数の原因がある。例えば、カメラワークが大きすぎるか、光源変化やフォーカスの変化によって画像特徴の追跡が著しく困難な画像が入力されたか、登録画像にノイズが乗っている場合がある。これらの場合、法線ベクトルを正確に算出することができない。そのために、画像特徴追跡部１２は、追跡に成功した画像特徴の数が第１の所定閾値τ_ｎp以下の場合に、画像処理を中断して、画像登録からやり直す。 <Interruption of processing for camera work that is too large>
If the number of tracking coordinates (image features) that have been successfully tracked is significantly small, there are multiple causes. For example, there may be a case where the camera work is too large, an image whose tracking of image features is extremely difficult due to a light source change or a focus change is input, or noise is added to the registered image. In these cases, the normal vector cannot be calculated accurately. Therefore, the image feature tracking unit 12 interrupts the image processing and starts again from the image registration when the number of image features that have been successfully tracked is equal to or less than the first predetermined threshold value τ _np .

＜少なすぎる追跡座標に対する処理の中断＞
十分な数の追跡座標（画像特徴）が追跡できたにもかかわらず、ホモグラフィ行列に合致する（インライアとなる）追跡座標の数が著しく少ない場合、対象領域が平面で無い場合が想定される。この場合、画像特徴追跡部１２は、追跡に成功した画像特徴の数が第２の所定閾値（＜第１の所定閾値）以下の場合に、画像処理を中断して、「対象物体が平面で無い」旨をユーザに明示する。 <Interruption of processing for too few tracking coordinates>
If a sufficient number of tracking coordinates (image features) were tracked but the number of tracking coordinates that match the homography matrix (that is, inliers) is extremely small, the target region is presumably not planar. In this case, the image feature tracking unit 12 interrupts the image processing when the number of image features that have been successfully tracked is equal to or less than a second predetermined threshold (< first predetermined threshold), and clearly indicates to the user that "the target object is not a plane".

＜多面体の対象物体に対する繰り返し処理＞
対象物体が複数平面で構成される場合は、第１の対象領域についてアウトライアとなった追跡座標のみから再度、ＲＡＮＳＡＣ等のロバスト推定を用いて、第２の対象領域として、第１のＨ_RCとは異なる第２のホモグラフィ行列Ｈ_RCを算出し、そのインライアとなる追跡座標を抽出することも好ましい。３つ以上の対象領域についても、これを繰り返すことができる。 <Repetitive processing for polyhedral target object>
When the target object is composed of a plurality of planes, it is also preferable to apply robust estimation such as RANSAC again to only the tracking coordinates that were outliers for the first target region, calculate, for a second target region, a second homography matrix H_RC different from the first H_RC, and extract the tracking coordinates that are its inliers. This can be repeated for three or more target regions.

第２の対象領域以降の法線ベクトルは、法線ベクトル算出部１３の処理によって、第１の対象領域の法線ベクトルを推定した後で同様の処理を繰り返していくか、又は、各対象領域に対して法線ベクトル算出部１３の処理を並列に実行することもできる。並列的に実行する方が、法線ベクトルを高速に推定することができるが、処理負荷が高くなる。処理リソースに応じて、処理手順を選択することが好ましい。 The normal vectors of the second and subsequent target regions can be obtained either by repeating the same processing of the normal vector calculation unit 13 after the normal vector of the first target region has been estimated, or by executing the processing of the normal vector calculation unit 13 for the target regions in parallel. Parallel execution estimates the normal vectors faster but increases the processing load. It is preferable to select the processing procedure according to the available processing resources.

［法線ベクトル算出部１３］
法線ベクトル算出部１３には、画像特徴追跡部１２から、各撮影画像に対応する連続的な登録点ｐの群が入力される。法線ベクトル算出部１３は、撮影画像に映り込む平面物体の法線ベクトルを算出するために、以下の要素を用いる。
ｉ：連続的な撮影画像のＮc個のフレームの番数
Ｘ_j（＝[ｘ_ｊ,ｙ_ｊ,ｚ_ｊ]^T）：平面物体の登録画像から検出されたＮp個の登録点ｐ_j（＝[ｕ_ｊ,ｖ_ｊ]^T、ｊ＝１〜Ｎp）の３次元座標
ｍ_ij（＝[ｕ_ij,ｖ_ij]^T、i＝１〜Ｎc）：各フレームに映るＮp個の登録点ｐ_j毎の追跡座標
そして、法線ベクトル算出部１３は、再投影誤差関数（バンドル調整）を最小化する、カメラ姿勢パラメータＲ_i（＝[ｒ_ix,ｒ_iy,ｒ_iz]^T）及びｔ_i（＝[ｔ_ix,ｔ_iy,ｔ_iz]^T）と、法線ベクトルｎ_t（＝[ｘ_ｎ,ｙ_ｎ,ｚ_ｎ]^T,ｘ_ｎ ^２+ｙ_ｎ ^２+ｚ_ｎ ^２＝１）とを算出する。
このとき、本発明の法線ベクトル算出部１３は、登録画像の登録点ｐ_jの３次元座標Ｘ_jを、登録点ｐ_jと、物体平面の通る基準点Ｘ_０（＝[ｘ_０,ｙ_０,ｚ_０]^T）と、法線ベクトルｎ_tとによって表現し、登録点ｐ_jの物体平面への逆投影によって算出する。
これによって、Ｘ_j（＝[ｘ_ｊ,ｙ_ｊ,ｚ_ｊ]^T）を未知パラメータとすることなく、これに代えて、法線ベクトルｎ_tの２パラメータを未知パラメータとすることができる。 [Normal vector calculation unit 13]
The normal vector calculation unit 13 receives a group of continuous registration points p corresponding to each captured image from the image feature tracking unit 12. The normal vector calculation unit 13 uses the following elements in order to calculate the normal vector of a planar object reflected in the captured image.
i: index of the Nc frames of the continuous captured images
X_j (= [x_j, y_j, z_j]^T): three-dimensional coordinates of the Np registration points p_j (= [u_j, v_j]^T, j = 1 to Np) detected from the registered image of the planar object
m_ij (= [u_ij, v_ij]^T, i = 1 to Nc): tracking coordinates of each of the Np registration points p_j appearing in each frame
Then, the normal vector calculation unit 13 calculates the camera posture parameters R_i (= [r_ix, r_iy, r_iz]^T) and t_i (= [t_ix, t_iy, t_iz]^T) and the normal vector n_t (= [x_n, y_n, z_n]^T, x_n² + y_n² + z_n² = 1) that minimize the reprojection error function (bundle adjustment).
At this time, the normal vector calculation unit 13 of the present invention expresses the three-dimensional coordinates X_j of each registration point p_j of the registered image in terms of the registration point p_j, a reference point X_0 (= [x_0, y_0, z_0]^T) through which the object plane passes, and the normal vector n_t, and calculates X_j by back projection of the registration point p_j onto the object plane.
Thus, instead of treating X_j (= [x_j, y_j, z_j]^T) as unknown parameters, only the two parameters of the normal vector n_t need to be treated as unknown parameters.

また、法線ベクトル算出部１３は、基準点Ｘ_０を、登録点ｐ_jの重心ｐ_０（＝[ｕ_０,ｖ_０]^Tの逆投影（Ｘ_０＝１／ｗ_０[ｕ_０,ｖ_０,１]^T）によって算出するものであってもよい。
物体平面の通る基準点Ｘ_０（＝[ｘ_０,ｙ_０,ｚ_０]^T）は、再投影誤差関数の最小化計算の安定性の観点から、対象物平面の中心付近に設定することが好ましい。また、推定されるカメラ姿勢はスケール不定のため、Ｘ_０の奥行きは任意の値に設定してもよい。例えば、Ｘ_０＝[０,０,ｆ]^T（ｆは焦点距離ｆ_ｘ、ｆ_ｙに近い値）と設定してもよい。
但し、対象物が画像中央に映るとは限らないため、登録点ｐ_jの重心ｐ_０（＝[ｕ_０,ｖ_０]^T）を算出して、ｐ_０の逆投影点Ｘ_０（＝１／ｗ_０[ｕ_０,ｖ_０,１]^T）を算出してもよい。これにより、対象物が画面の端の方に映る場合における、法線ベクトル算出の安定性を向上することができる。この場合についても、奥行き１／ｗ_０は任意の値（例えば焦点距離ｆ_ｘ、ｆ_ｙに近い値）に設定してもよい。また、予め対象物平面までの距離が大まかに分かっている場合には、分かっている範囲に設定することも好ましい。 Further, the normal vector calculation unit 13 may calculate the reference point X_0 by back projection (X_0 = (1/w_0)[u_0, v_0, 1]^T) of the centroid p_0 (= [u_0, v_0]^T) of the registration points p_j.
The reference point X_0 (= [x_0, y_0, z_0]^T) through which the object plane passes is preferably set near the center of the object plane from the viewpoint of the stability of the minimization of the reprojection error function. Since the estimated camera posture is determined only up to scale, the depth of X_0 may be set to an arbitrary value. For example, X_0 = [0, 0, f]^T may be set (f being a value close to the focal lengths f_x, f_y).
However, since the object does not always appear in the center of the image, the centroid p_0 (= [u_0, v_0]^T) of the registration points p_j may be calculated, and the back projection point X_0 (= (1/w_0)[u_0, v_0, 1]^T) of p_0 may be calculated. This improves the stability of the normal vector calculation when the object appears toward the edge of the screen. In this case as well, the depth 1/w_0 may be set to an arbitrary value (for example, a value close to the focal lengths f_x, f_y). In addition, when the distance to the object plane is roughly known in advance, it is also preferable to set the depth within the known range.
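The centroid-based choice of the reference point can be sketched as follows. This is a minimal illustration in normalized coordinates; the function name and the `depth` argument (standing for 1/w_0) are hypothetical.

```python
def reference_point(reg_points, depth=1.0):
    """Back-project the centroid p_0 of the registration points to X_0.

    X_0 = depth * [u_0, v_0, 1]^T, where depth plays the role of 1/w_0.
    Because the scale is not determined, any positive depth may be used.
    """
    n = len(reg_points)
    u0 = sum(p[0] for p in reg_points) / n
    v0 = sum(p[1] for p in reg_points) / n
    return (depth * u0, depth * v0, depth)
```

Choosing the centroid keeps X_0 close to the middle of the tracked points even when the object is off-center in the image.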

法線ベクトル算出部１３について、再投影誤差関数（バンドル調整）は、以下の式によって表される。
Ｒ_i',ｔ_i',ｎ_t'＝arg min_Ｒi,ｔi,ntΣ_i=1 ^NcΣ_ｊ=1 ^Nｐ(ｍ_ij−proj(Ｒ_i,ｔ_i,Ｘ_j))²
Ｘ_j＝(ｎ_t・Ｘ_０／ｎ_t・ｐ_j')ｐ_j'
＝(ｘ_ｎｘ_０＋ｙ_ｎｙ_０＋ｚ_nｚ_０)／(ｘ_ｎｕ_j＋ｙ_ｎｖ_j＋ｚ_ｎ)[ｕ_j,ｖ_j,１]^T
ｘ_n ²＋ｙ_n ²＋ｚ_n ²＝１
ｚ_n＝√(１−ｘ_n ²−ｙ_n ²)
ｐ_j'（＝[ｕ_j,ｖ_j,１]^T）：登録点ｐ_jの同次座標表現
i：撮影画像のＮc個のフレームの番数
ｍ_ij：フレームiに映るＮp個の登録点ｐ_j毎の追跡座標
Ｒ_i及びｔ_i：フレームiのカメラ姿勢パラメータ
ｎ_t：平面物体の法線ベクトル
Ｘ_０：物体平面の通る基準点
proj(Ｒ_i,ｔ_i,Ｘ_j)：３次元座標Ｘ_jの投影関数 [Ｒ_i｜ｔ_i]Ｘ_j'
Ｘ_j'：Ｘ_jの同次座標表現
Ｒ_i'及びｔ_i'：フレームiのカメラ姿勢パラメータＲ_i及びｔ_iの推定値
ｎ_t'：平面物体の法線ベクトルｎ_tの推定値 Regarding the normal vector calculation unit 13, the reprojection error function (bundle adjustment) is expressed by the following equation.
R_i′, t_i′, n_t′ = arg min_{R_i, t_i, n_t} Σ_{i=1}^{Nc} Σ_{j=1}^{Np} (m_ij − proj(R_i, t_i, X_j))²
X_j = ((n_t · X_0) / (n_t · p_j′)) p_j′
= ((x_n x_0 + y_n y_0 + z_n z_0) / (x_n u_j + y_n v_j + z_n)) [u_j, v_j, 1]^T
x_n² + y_n² + z_n² = 1
z_n = √(1 − x_n² − y_n²)
p_j′ (= [u_j, v_j, 1]^T): homogeneous coordinate representation of the registration point p_j
i: index of the Nc frames of the captured images
m_ij: tracking coordinates of each of the Np registration points p_j appearing in frame i
R_i and t_i: camera posture parameters of frame i
n_t: normal vector of the planar object
X_0: reference point through which the object plane passes
proj(R_i, t_i, X_j): projection function [R_i | t_i] X_j′ of the three-dimensional coordinates X_j
X_j′: homogeneous coordinate representation of X_j
R_i′ and t_i′: estimated values of the camera posture parameters R_i and t_i of frame i
n_t′: estimated value of the normal vector n_t of the planar object
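The back projection and the reprojection error above can be sketched as follows. This is a minimal illustration in normalized coordinates, assuming that proj applies [R_i | t_i] and then divides by the depth component; all function names are hypothetical, and the minimization itself is not shown.

```python
import math


def normal_from_xy(xn, yn):
    """Recover the unit normal from its two free parameters (z_n >= 0)."""
    return (xn, yn, math.sqrt(1.0 - xn * xn - yn * yn))


def back_project(p, n, x0):
    """X_j = ((n . X_0) / (n . p_j')) p_j' with p_j' = [u_j, v_j, 1]^T."""
    u, v = p
    scale = (n[0] * x0[0] + n[1] * x0[1] + n[2] * x0[2]) / (n[0] * u + n[1] * v + n[2])
    return (scale * u, scale * v, scale)


def project(R, t, X):
    """Apply [R | t] to X and divide by depth (assumed perspective division)."""
    xc = [sum(R[r][c] * X[c] for c in range(3)) + t[r] for r in range(3)]
    return (xc[0] / xc[2], xc[1] / xc[2])


def reprojection_error(poses, n, x0, reg_points, tracks):
    """Sum over frames i and points j of |m_ij - proj(R_i, t_i, X_j)|^2."""
    total = 0.0
    for (R, t), frame_tracks in zip(poses, tracks):
        for p, m in zip(reg_points, frame_tracks):
            u, v = project(R, t, back_project(p, n, x0))
            total += (m[0] - u) ** 2 + (m[1] - v) ** 2
    return total
```

Every back-projected point satisfies n_t · (X_j − X_0) = 0, so the points are constrained to a single plane by construction and only x_n and y_n remain as structure unknowns.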

図６は、異なる未知パラメータの数で表現した特徴点（追跡座標）を表す画像図である。 FIG. 6 is an image diagram showing feature points (tracking coordinates) expressed by the number of different unknown parameters.

既存技術に基づく３次元復元で用いられるバンドル調整の場合、Ｘ_j（＝[ｘ_ｊ,ｙ_ｊ,ｚ_ｊ]^T）を未知パラメータとしている。そのために、再投影誤差関数における未知パラメータの数は、Ｎc個のフレームi毎に生じるカメラ姿勢パラメータＲ_i（＝[ｒ_ix,ｒ_iy,ｒ_iz]^T）及びｔ_i（＝[ｔ_ix,ｔ_iy,ｔ_iz]^T）の６個と、Ｎp個の３次元座標の登録点ｐ_jとなり、未知パラメータの数は、６Ｎc＋３Ｎｐ個となる。
Ｒ_i',ｔ_i',Ｘ_j'＝arg min_{Ｒi,ｔi,ｘj}Σ_i=1 ^NcΣ_j=1 ^Np(ｍ_ij−proj(Ｒ_i,ｔ_i,Ｘ_j))² In the case of bundle adjustment used in three-dimensional restoration based on the existing technology, X _j (= [x _j , y _j , z _j ] ^T ) is an unknown parameter. For this purpose, the number of unknown parameters in the reprojection error function is determined by the camera posture parameters R _i (= [r _ix , r _iy , r _iz ] ^T ) and t _i (= [t _ix , t _iy , t _iz ] ^T ) and Np three-dimensional coordinate registration points p _j , and the number of unknown parameters is 6Nc + 3Np.
R_i′, t_i′, X_j′ = arg min_{R_i, t_i, X_j} Σ_{i=1}^{Nc} Σ_{j=1}^{Np} (m_ij − proj(R_i, t_i, X_j))²

これに対し、本発明によれば、Ｘ_j（＝[ｘ_ｊ,ｙ_ｊ,ｚ_ｊ]^T）に代えて、法線ベクトルｎ_tの２パラメータを未知パラメータｎ_t’とする。そのために、再投影誤差関数における未知パラメータの数は、Ｎc個のフレームi毎に生じるカメラ姿勢パラメータＲ_i（＝[ｒ_ix,ｒ_iy,ｒ_iz]^T）及びｔ_i（＝[ｔ_ix,ｔ_iy,ｔ_iz]^T）の６個と、登録点ｐ_jに対する法線ベクトルｎ_t（＝[ｘ_ｎ,ｙ_ｎ,ｚ_ｎ]^T,ｘ_ｎ ^２+ｙ_ｎ ^２+ｚ_ｎ ^２＝１）の２個とを合計した、６Ｎc＋２個となる。
Ｒ_i',ｔ_i',ｎ_t'＝arg min_Ｒi,ｔi,ntΣ_i=1 ^NcΣ_ｊ=1 ^Nｐ(ｍ_ij−proj(Ｒ_i,ｔ_i,Ｘ_j))²
そのために、本発明によれば、パラメータ数を大幅に削減することができ、処理負荷の削減と、精度の向上との効果を得られる。 On the other hand, according to the present invention, the two parameters of the normal vector n_t are used as the unknown parameters n_t′ in place of X_j (= [x_j, y_j, z_j]^T). The number of unknown parameters in the reprojection error function is therefore 6Nc + 2: the six camera posture parameters R_i (= [r_ix, r_iy, r_iz]^T) and t_i (= [t_ix, t_iy, t_iz]^T) arising for each of the Nc frames i, plus the two parameters of the normal vector n_t (= [x_n, y_n, z_n]^T, x_n² + y_n² + z_n² = 1) for the registration points p_j.
R_i′, t_i′, n_t′ = arg min_{R_i, t_i, n_t} Σ_{i=1}^{Nc} Σ_{j=1}^{Np} (m_ij − proj(R_i, t_i, X_j))²
Therefore, according to the present invention, the number of parameters can be greatly reduced, and the effects of reducing the processing load and improving the accuracy can be obtained.
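The two parameter counts stated above (6Nc + 3Np and 6Nc + 2) can be compared with a short calculation. The function names and the sample values of Nc and Np are hypothetical and chosen only for illustration.

```python
def conventional_unknowns(nc, n_points):
    """6 posture parameters per frame plus 3 coordinates per point."""
    return 6 * nc + 3 * n_points


def proposed_unknowns(nc):
    """6 posture parameters per frame plus 2 normal-vector parameters."""
    return 6 * nc + 2


for nc in (2, 20, 100):
    print(nc, conventional_unknowns(nc, 441), proposed_unknowns(nc))
```

The second count does not depend on the number of registration points, so adding more tracked points increases the number of residuals without increasing the number of unknowns.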

図７は、登録点と基準点、法線ベクトル、対象物平面の幾何的関係を表す説明図である。 FIG. 7 is an explanatory diagram showing a geometric relationship between a registration point, a reference point, a normal vector, and an object plane.

法線ベクトルを算出できれば、平面物体の３次元構造を算出できる。ここで、法線ベクトルの推定対象となる平面物体は、事前に見た目やサイズに関する情報が全く与えられていないこととする。法線ベクトル算出部１３の推定する平面物体の法線ベクトルとは、画像平面に対する平面物体の法線ベクトルの情報である。 If the normal vector can be calculated, the three-dimensional structure of the planar object can be calculated. Here, it is assumed that no information regarding the appearance and size is given in advance to the planar object to be estimated for the normal vector. The normal vector of the planar object estimated by the normal vector calculation unit 13 is information on the normal vector of the planar object with respect to the image plane.

また、法線ベクトル算出部１３について、再投影誤差関数は、バンドル調整における法線ベクトルの初期値を、光軸と平行な方向とすることが好ましい。 In the normal vector calculation unit 13, the reprojection error function preferably sets the initial value of the normal vector in bundle adjustment to a direction parallel to the optical axis.

再投影誤差関数の最小化は、ガウス・ニュートン法に代表される、非線形最小化問題の解法を用いることができる。 The minimization of the reprojection error function can use a solution of a nonlinear minimization problem represented by the Gauss-Newton method.

法線ベクトル算出部１３は、撮影画像のカメラワークが微小であるとする前提条件の下、初期値として、カメラ姿勢パラメータのＲ_iを単位行列とし、ｔ_iを零ベクトルとし、ｎ_t＝[0,0,1]^Tとして、法線ベクトルを算出する。回転行列Ｒ_iは、以下の式で分解できるため、パラメータｒ_x,ｒ_y,ｒ_zをパラメータとして用いる。
θ＝||ｒ||_L2
ｒ＝[ｒ_x,ｒ_y,ｒ_z]^T Under the precondition that the camera work of the captured images is minute, the normal vector calculation unit 13 calculates the normal vector using, as initial values, the identity matrix for the camera posture parameter R_i, the zero vector for t_i, and n_t = [0, 0, 1]^T. Since the rotation matrix R_i can be decomposed by the following equation, r_x, r_y, r_z are used as parameters.
θ = || r || _L2
r = [r _x , r _y , r _z ] ^T
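The decomposition equation itself is not reproduced in the text above. The following minimal sketch assumes that it is the standard axis-angle (Rodrigues) formula, in which r/θ is the rotation axis and θ = ||r|| is the rotation angle; this is an assumption, and the function name is hypothetical.

```python
import math


def rodrigues(r):
    """Build a 3x3 rotation matrix from an axis-angle vector r = [rx, ry, rz]."""
    rx, ry, rz = r
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)  # theta = ||r||
    if theta < 1e-12:
        # Zero rotation: the initial value R_i = identity corresponds to r = 0.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = rx / theta, ry / theta, rz / theta
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]
```

With this parameterization each frame contributes three rotation unknowns and three translation unknowns, and the identity initial value corresponds to r = [0, 0, 0]^T.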

特に、カメラワーク（視点変化）が小さいほど、回転行列Ｒiは以下の式で近似できるため、θ_i ^x,θ_i ^y,θ_i ^zをパラメータとして用いてもよい。
最終的に、法線ベクトル算出部１３は、推定した法線ベクトルｎ_tを、所定のアプリケーションへ出力することができる。また、法線ベクトルｎ_tは、登録画像記憶部１１へ出力され、登録画像に対する正面化画像を記憶することもできる。 In particular, as the camera work (viewpoint change) is smaller, the rotation matrix Ri can be approximated by the following expression, and θ _i ^x , θ _i ^y , and θ _i ^z may be used as parameters.
Finally, the normal vector calculation unit 13 can output the estimated normal vector n _t to a predetermined application. Further, the normal vector n _t can be output to the registered image storage unit 11 to store a frontal image with respect to the registered image.

図８は、光軸中心と、画像平面上の登録画像の登録点と、撮影画像の法線ベクトルとの幾何的関係を表すグラフである。 FIG. 8 is a graph showing the geometric relationship among the optical axis center, the registration point of the registered image on the image plane, and the normal vector of the captured image.

図８によれば、400×400ピクセルの画像平面を、20ピクセル間隔でサンプリングし、441点の初期追跡点ｐ_jを取得したものである（Ｎp＝４４１）。
対象平面の法線ｎ_tを、z軸方向から角度４５度以内でランダムに生成し、登録点ｐ_jを逆投影して３次元座標Ｘ_jを取得する。焦点距離ｆは、500mmに設定している。
そして、初期のカメラ姿勢ｒ₀＝[0 0 0]^T、ｔ₀＝[0 0 0]^Tにガウシアンノイズを加え、カメラ姿勢Ｒ_i及びｔ_i（i＝1〜100）を生成する。
カメラ姿勢Ｒ_i及びｔ_iによってＸ_jを投影し、ガウシアンノイズを加えて、追跡点ｍ_ij（j＝1〜441）を生成する。
ｍ_ijを入力として、法線方向ｎ_t'を推定する。
推定に使用するフレーム数を、2枚から100枚まで徐々に増やし、精度及び処理時間を評価する（Ｎc＝2〜100）。
ｎ_tを100セット用意し、精度及び処理時間の平均を算出する。 According to FIG. 8, a 400 × 400 pixel image plane is sampled at 20 pixel intervals to obtain 441 initial tracking points p _j (Np = 441).
The normal n_t of the target plane is randomly generated within an angle of 45 degrees from the z-axis direction, and the registration points p_j are back-projected to obtain the three-dimensional coordinates X_j. The focal length f is set to 500 mm.
Then, Gaussian noise is added to the initial camera posture r ₀ = [0 0 0] ^T and t ₀ = [0 0 0] ^T to generate camera postures R _i and t _i (i = 1 to 100).
X _j is projected by camera postures R _i and t _i , and Gaussian noise is added to generate tracking points m _ij (j = 1 to 441).
Using m _ij as an input, the normal direction n _t ′ is estimated.
The number of frames used for estimation is gradually increased from 2 to 100, and accuracy and processing time are evaluated (Nc = 2 to 100).
100 sets of n_t are prepared, and the averages of accuracy and processing time are calculated.
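The first steps of this simulation setup can be sketched as follows. This is a minimal illustration under several assumptions not stated in the text: the optical axis is taken to pass through the image center, the focal length is used as 500 in pixel units for normalization, and the normal is sampled uniformly in polar angle. The function names are hypothetical.

```python
import math
import random


def grid_points(size=400, step=20, f=500.0):
    """Sample the image plane every `step` pixels and normalize the points."""
    c = size / 2.0  # assumed principal point at the image center
    ticks = range(0, size + 1, step)
    return [((u - c) / f, (v - c) / f) for v in ticks for u in ticks]


def random_normal(max_angle_deg=45.0, rng=random):
    """Draw a unit normal within max_angle_deg of the z axis."""
    theta = math.radians(rng.uniform(0.0, max_angle_deg))
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


points = grid_points()
print(len(points))
```

Sampling 0 to 400 pixels inclusive at 20-pixel intervals gives 21 positions per axis, which accounts for the 441 initial tracking points.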

図９は、バンドル調整に応じてフレーム数に対する法線ベクトルの精度及び処理時間を表すグラフである。 FIG. 9 is a graph showing the accuracy of the normal vector and the processing time with respect to the number of frames according to bundle adjustment.

図９によれば、異なるバンドル調整について比較している。
ＤＥ：従来のバンドル調整（パラメータ数６Ｎc＋Ｎp＝453〜1041）で、３次元座標Ｘ_j'を主成分分析して、法線ベクトルｎ_t'を算出する。
ＯＥ：本発明のバンドル調整（パラメータ数６Ｎc＋２＝14〜602）で、法線ベクトルを算出する。 According to FIG. 9, different bundle adjustments are compared.
DE: A normal vector n _t ′ is calculated by performing principal component analysis on the three-dimensional coordinate X _j ′ by conventional bundle adjustment (number of parameters: 6Nc + Np = 453 to 1041).
OE: A normal vector is calculated by the bundle adjustment (parameter number 6Nc + 2 = 14 to 602) of the present invention.

図９（ａ）は、フレーム数に応じた法線ベクトルの推定精度を表すグラフである。ここでは、従来技術のバンドル調整ＤＥと、本発明のバンドル調整ＯＥとを比較して、法線ベクトルの精度は同じである。即ち、２０フレーム程度で、法線ベクトルを高精度（角度誤差約２度）に算出できる。 FIG. 9A is a graph showing the normal vector estimation accuracy according to the number of frames. Here, comparing the bundle adjustment DE of the prior art and the bundle adjustment OE of the present invention, the accuracy of the normal vector is the same. That is, the normal vector can be calculated with high accuracy (angle error of about 2 degrees) in about 20 frames.

図９（ｂ）は、フレーム数に応じた法線ベクトルの処理時間を表すグラフである。ここでは、従来技術のバンドル調整ＤＥと、本発明のバンドル調整ＯＥとを比較して、法線ベクトルの推定処理時間は大きく異なっている。即ち、本発明のバンドル調整ＯＥの処理時間が、極めて高速（短時間）であることが理解できる。具体的には、Ｎc＝2で60倍、Ｎc＝20で約12倍、Ｎc＝100で約3倍の高速化が確認できる。 FIG. 9B is a graph showing the processing time of the normal vector according to the number of frames. Here, comparing the bundle adjustment DE of the prior art with the bundle adjustment OE of the present invention, the normal vector estimation processing time is greatly different. That is, it can be understood that the processing time of the bundle adjustment OE of the present invention is extremely fast (short time). Specifically, it can be confirmed that Nc = 2 is 60 times faster, Nc = 20 is about 12 times faster, and Nc = 100 is about 3 times faster.

前述した実施形態によれば、１枚の平面で構成された平面物体について説明した。勿論、本発明によれば、複数の主要平面で構成された対象物体であっても、複数平面のそれぞれについて適用することもできる。即ち、本発明は、対象物体を構成する主要な平面数を１枚に限定するものではない。平面数を増やすことで、任意形状の対象物体に対して適用することができる。 According to the above-described embodiment, the planar object constituted by one plane has been described. Of course, according to the present invention, even a target object composed of a plurality of main planes can be applied to each of a plurality of planes. That is, the present invention does not limit the number of main planes constituting the target object to one. By increasing the number of planes, it can be applied to a target object having an arbitrary shape.

以上、詳細に説明したように、本発明のプログラム、装置及び方法によれば、対象物体が平面物体（又は概ね平面で近似可能な物体）であるという前提条件の下で、カメラワークの大きな制約無しに、連続的な撮影画像に映り込む平面物体の法線ベクトルを、大幅に少ない計算量で算出することができる。具体的には、法線ベクトルを算出するためのバンドル調整における未知パラメータの数を削減しているために、処理負荷を削減し、推定精度を高めることができる。また、ホモグラフィ行列を用いることによって、特徴点の追跡失敗に対するロバスト性も向上させることができる。 As described above in detail, according to the program, the apparatus, and the method of the present invention, there is a significant limitation of camera work under the precondition that the target object is a plane object (or an object that can be approximated by a plane). Without any problem, the normal vector of a planar object reflected in a continuous captured image can be calculated with a significantly small amount of calculation. Specifically, since the number of unknown parameters in bundle adjustment for calculating a normal vector is reduced, the processing load can be reduced and the estimation accuracy can be increased. Further, by using the homography matrix, it is possible to improve the robustness against the tracking failure of the feature points.

前述した本発明の種々の実施形態について、本発明の技術思想及び見地の範囲の種々の変更、修正及び省略は、当業者によれば容易に行うことができる。前述の説明はあくまで例であって、何ら制約しようとするものではない。本発明は、特許請求の範囲及びその均等物として限定するものにのみ制約される。 Various changes, modifications, and omissions of the above-described various embodiments of the present invention can be easily made by those skilled in the art. The above description is merely an example, and is not intended to be restrictive. The invention is limited only as defined in the following claims and the equivalents thereto.

１画像処理装置
１０画像取得部
１１登録画像記憶部
１２画像特徴追跡部
１３法線ベクトル算出部 DESCRIPTION OF SYMBOLS
1 Image processing apparatus
10 Image acquisition unit
11 Registered image storage unit
12 Image feature tracking unit
13 Normal vector calculation unit

Claims

A program that causes a computer to function so as to calculate a normal vector of a planar object appearing in captured images, the program causing the computer to function as
normal vector calculation means for calculating, using
Nc frames i of continuous captured images,
three-dimensional coordinates X_j (= [x_j, y_j, z_j]^T) of Np registration points p_j (= [u_j, v_j]^T, j = 1 to Np) detected from a registered image of the planar object, and
tracking coordinates m_ij (= [u_ij, v_ij]^T, i = 1 to Nc) of each of the Np registration points p_j appearing in each frame,
camera posture parameters R_i (= [r_ix, r_iy, r_iz]^T) and t_i (= [t_ix, t_iy, t_iz]^T) and a normal vector n_t (= [x_n, y_n, z_n]^T, x_n² + y_n² + z_n² = 1) that minimize a reprojection error function,
wherein the normal vector calculation means expresses the three-dimensional coordinates X_j of the registration points p_j of the registered image in terms of the registration points p_j, a reference point X_0 (= [x_0, y_0, z_0]^T) through which the object plane passes, and the normal vector n_t, and calculates the three-dimensional coordinates X_j by back projection of the registration points p_j onto the object plane.

The program according to claim 1, wherein the normal vector calculation means causes the computer to function so as to calculate the reference point X_0 by back projection (X_0 = (1/w_0)[u_0, v_0, 1]^T) of the centroid p_0 (= [u_0, v_0]^T) of the registration points p_j.

The program according to claim 1 or 2, which causes the computer to function such that, for the normal vector calculation means, the reprojection error function is represented by the following equations:
R_i′, t_i′, n_t′ = arg min_{R_i, t_i, n_t} Σ_{i=1}^{Nc} Σ_{j=1}^{Np} (m_ij − proj(R_i, t_i, X_j))²
X_j = ((n_t · X_0) / (n_t · p_j′)) p_j′
= ((x_n x_0 + y_n y_0 + z_n z_0) / (x_n u_j + y_n v_j + z_n)) [u_j, v_j, 1]^T
x_n² + y_n² + z_n² = 1
z_n = √(1 − x_n² − y_n²)
p_j′ (= [u_j, v_j, 1]^T): homogeneous coordinate representation of the registration point p_j
i: index of the Nc frames of the captured images
m_ij: tracking coordinates of each of the Np registration points p_j appearing in frame i
R_i and t_i: camera posture parameters of frame i
n_t: normal vector of the planar object
X_0: reference point through which the object plane passes
proj(R_i, t_i, X_j): projection function [R_i | t_i] X_j′ of the three-dimensional coordinates X_j
X_j′: homogeneous coordinate representation of X_j
R_i′ and t_i′: estimated values of the camera posture parameters R_i and t_i of frame i
n_t′: estimated value of the normal vector n_t of the planar object

The program according to claim 3, which causes the computer to function such that, for the normal vector calculation means, the number of unknown parameters in the reprojection error function is 6Nc + 2, being the sum of the six camera posture parameters R_i (= [r_ix, r_iy, r_iz]^T) and t_i (= [t_ix, t_iy, t_iz]^T) arising for each of the Nc frames i and the two parameters of the normal vector n_t (= [x_n, y_n, z_n]^T, x_n² + y_n² + z_n² = 1) for the registration points.

The program according to any one of claims 1 to 4, wherein the normal vector calculation means causes the computer to function so as to calculate the normal vector, under the precondition that the camera work of the captured images is minute, with R_i of the camera posture parameters set to the identity matrix and t_i set to the zero vector.

With respect to the normal vector calculation means, the reprojection error function uses a computer so that the initial value of the normal vector in bundle adjustment is in a direction parallel to the optical axis (n _t = [0,0,1] ^T ). 6. The program according to claim 1, wherein the program is made to function.

The program according to any one of claims 1 to 6, which further causes the computer to function as image feature tracking means for excluding mistracked tracking coordinates by using a homography matrix between the registration points of the registered image and the tracking coordinates of the captured image.
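One common way to exclude mistracked coordinates with a homography is a RANSAC-style loop: fit a homography to random four-point samples and keep the points that agree with the best fit. The sketch below is an illustration under that assumption; the claim does not state how the homography is estimated. The function names, the pixel threshold and the iteration count are hypothetical.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: H such that dst ~ H @ src (at least 4 points)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return Vt[-1].reshape(3, 3)                  # null vector of the system

def transfer_error(H, src, dst):
    """Distance between H-mapped registration points and tracked coordinates."""
    src_h = np.hstack([src, np.ones((len(src), 1))])
    mapped = src_h @ H.T
    mapped = mapped[:, :2] / mapped[:, 2:3]
    return np.linalg.norm(mapped - dst, axis=1)

def exclude_mistracked(src, dst, threshold=2.0, iterations=200, seed=0):
    """Return a boolean mask: True for points consistent with the best homography."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iterations):
        sample = rng.choice(len(src), size=4, replace=False)
        H = fit_homography(src[sample], dst[sample])
        inliers = transfer_error(H, src, dst) < threshold
        if inliers.sum() > best.sum():           # keep the largest consensus set
            best = inliers
    return best
```

A homography is the natural consistency check here because points on a single plane seen in two images are always related by one, so a tracked point that violates it is either mistracked or off the plane.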

The program according to any one of the above claims, which further causes the computer to function as registered image storage means for specifying, in accordance with a user operation, a target area of the captured image in which the planar object appears, and storing the target area as a registered image.

The program according to claim 8, wherein the registered image storage means causes the computer to function so as to geometrically transform the registered image into a frontalized image by using the normal vector calculated by the normal vector calculation means, and to store the frontalized image as the registered image.
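A frontalizing transform can be built from the estimated normal alone. The sketch below shows one standard construction, not necessarily the one used in the invention: rotate the viewing direction so that the normal coincides with the optical axis, and warp the image with the homography induced by that rotation. The function names and the intrinsic matrix A are assumptions; with normalized image coordinates A is the identity.

```python
import numpy as np

def rotation_aligning_normal(n):
    """Rotation R with R @ n = [0, 0, 1]^T, built with Rodrigues' formula."""
    n = n / np.linalg.norm(n)
    z = np.array([0.0, 0.0, 1.0])
    axis = np.cross(n, z)
    s, c = np.linalg.norm(axis), n @ z           # sine and cosine of the rotation angle
    if s < 1e-12:
        # n already equals the optical axis; n = -z is not handled because
        # z_n = sqrt(1 - x_n^2 - y_n^2) is never negative.
        return np.eye(3)
    k = axis / s
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])           # cross-product matrix of the unit axis
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)

def frontalizing_homography(n, A=np.eye(3)):
    """Homography A R A^-1 that warps the registered image to a frontal view."""
    return A @ rotation_aligning_normal(n) @ np.linalg.inv(A)
```

After the rotation every point of the object plane has the same depth, which is what makes the warped image a fronto-parallel view of the planar object.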

The program according to any one of claims 1 to 9, wherein the image feature tracking means causes the computer to function so as to interrupt image processing and start again from image registration when the number of image features that have been successfully tracked is equal to or less than a first predetermined threshold.

The program according to claim 10, wherein the image feature tracking means causes the computer to function so as to interrupt image processing and explicitly notify the user that "the target object is not a plane" when the number of image features that have been successfully tracked is equal to or less than a second predetermined threshold (< the first predetermined threshold).
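The two-threshold rule in the preceding claims can be written as a small decision function. This is a sketch with hypothetical names; the actual threshold values are not given in the claims.

```python
from enum import Enum

class TrackingState(Enum):
    CONTINUE = "continue"          # enough features tracked; keep processing
    REREGISTER = "reregister"      # interrupt and start again from image registration
    NOT_PLANAR = "not_planar"      # interrupt and tell the user the target is not a plane

def check_tracking(num_tracked, first_threshold, second_threshold):
    """Classify the tracking result from the number of successfully tracked features."""
    if not second_threshold < first_threshold:
        raise ValueError("second threshold must be smaller than the first")
    # The stricter (smaller) threshold is tested first, since any count at or
    # below it is also at or below the first threshold.
    if num_tracked <= second_threshold:
        return TrackingState.NOT_PLANAR
    if num_tracked <= first_threshold:
        return TrackingState.REREGISTER
    return TrackingState.CONTINUE
```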

An image processing apparatus that calculates a normal vector of a planar object reflected in captured images, the apparatus comprising normal vector calculation means for calculating, by using
Nc frames i of continuously captured images,
three-dimensional coordinates X _j (= [x _j , y _j , z _j ] ^T ) of Np registration points p _j (= [u _j , v _j ] ^T , j = 1 to Np) detected from a registered image of the planar object, and
tracking coordinates m _ij (= [u _ij , v _ij ] ^T , i = 1 to Nc) of each of the Np registration points p _j reflected in each frame,
camera posture parameters R _i (= [r _ix , r _iy , r _iz ] ^T ) and t _i (= [t _ix , t _iy , t _iz ] ^T ) that minimize a reprojection error function, and a normal vector n _t (= [x _n , y _n , z _n ] ^T , x _n ² + y _n ² + z _n ² = 1),
wherein the normal vector calculation means expresses the three-dimensional coordinates X _j of the registration points p _j of the registered image by the registration points p _j , a reference point X ₀ (= [x ₀ , y ₀ , z ₀ ] ^T ) through which the object plane passes, and the normal vector n _t , and calculates the three-dimensional coordinates X _j by back projection of the registration points p _j onto the object plane.

A normal vector calculation method for an apparatus that calculates a normal vector of a planar object reflected in captured images, the method comprising a step of calculating, by using
Nc frames i of continuously captured images,
three-dimensional coordinates X _j (= [x _j , y _j , z _j ] ^T ) of Np registration points p _j (= [u _j , v _j ] ^T , j = 1 to Np) detected from a registered image of the planar object, and
tracking coordinates m _ij (= [u _ij , v _ij ] ^T , i = 1 to Nc) of each of the Np registration points p _j reflected in each frame,
camera posture parameters R _i (= [r _ix , r _iy , r _iz ] ^T ) and t _i (= [t _ix , t _iy , t _iz ] ^T ) that minimize a reprojection error function, and a normal vector n _t (= [x _n , y _n , z _n ] ^T , x _n ² + y _n ² + z _n ² = 1),
wherein, in the step, the three-dimensional coordinates X _j of the registration points p _j of the registered image are expressed by the registration points p _j , a reference point X ₀ (= [x ₀ , y ₀ , z ₀ ] ^T ) through which the object plane passes, and the normal vector n _t , and are calculated by back projection of the registration points p _j onto the object plane.