JP2016003930A

JP2016003930A - Image processing apparatus, image processing method, and image processing program

Info

Publication number: JP2016003930A
Application number: JP2014123735A
Authority: JP
Inventors: Kosuke Takahashi (高橋 康輔); Akira Kojima (小島 明); Kensaku Fujii (藤井 憲作); Yutaka Kunida (國田 豊); Daisuke Ochi (越智 大介)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2014-06-16
Filing date: 2014-06-16
Publication date: 2016-01-12

Abstract

PROBLEM TO BE SOLVED: To provide an image processing apparatus capable of easily estimating camera parameters. SOLUTION: An image processing apparatus that estimates, from two refocus-able images (images whose in-focus depth is variable) having an overlapping area, the camera parameters of the cameras that captured those images includes: input means which inputs the two refocus-able images, a refocused-image division number, which is the number of divisions into images focused at predetermined depths, and the internal parameters of the cameras; image processing means which detects, from the two refocus-able images, the feature points of each refocused image and the feature point correspondence groups between the refocused images, and outputs them; and geometric processing means which estimates the camera parameters from the feature point correspondence groups and the positions of the feature points.

Description

The present invention relates to an image processing apparatus, an image processing method, and an image processing program used for high-accuracy estimation of the three-dimensional shape of a subject and for generation of high-quality virtual-viewpoint video.

With the spread of light field cameras and similar devices, refocus-able images (images whose in-focus depth is variable) are increasingly captured in a variety of everyday scenes, and it is not unusual for the same subject to be captured from multiple viewpoints. Images captured from multiple viewpoints can be used in various applications, such as three-dimensional measurement of an object and image generation at a virtual viewpoint. In general, to use images captured from multiple viewpoints in these applications, the camera parameters of the camera that captured each image must be estimated.

A technique called Structure from Motion (SfM), which takes multiple images with overlapping regions as input and estimates camera parameters, has been proposed (see, for example, Non-Patent Document 1). Such methods establish correspondences between feature points detected in the images and derive the camera parameters from those correspondences. The accuracy of camera parameter estimation is known to depend heavily on the accuracy of the feature point correspondences. One factor that affects this accuracy is that the in-focus depth differs among the images, because the positions and feature descriptors of the detected feature points differ depending on whether the image is in focus.

Non-Patent Document 1: Noah Snavely, Steven M. Seitz, Richard Szeliski, "Photo Tourism: Exploring image collections in 3D", ACM Transactions on Graphics (Proceedings of SIGGRAPH 2006), 2006.

Because multiple refocused images (images focused at a particular depth) can be generated from a single refocus-able image, performing SfM with two refocus-able images requires considering every combination of the refocused images obtained from each refocus-able image.

However, most of these combinations pair refocused images whose in-focus depths do not match, and it is difficult to automatically select the SfM result obtained from a combination of refocused images with appropriate depths.

The present invention has been made in view of these circumstances, and its object is to provide an image processing apparatus, an image processing method, and an image processing program capable of easily estimating camera parameters.

The present invention is an image processing apparatus that estimates, from two refocus-able images, which are images whose in-focus depth is variable and which have an overlapping region, the camera parameters of the cameras that captured the respective refocus-able images. The apparatus comprises: input means for inputting the two refocus-able images, a refocused-image division number, which is the number of divisions into images focused at predetermined depths, and the internal parameters of the cameras; image processing means for detecting, from the two refocus-able images, the feature points of each refocused image and the feature point correspondence groups between the refocused images, and outputting them; and geometric processing means for estimating the camera parameters from the feature point correspondence groups and the positions of the feature points.

In the present invention, the image processing means comprises: means for receiving the two refocus-able images and the refocused-image division number and creating the refocused images from the refocus-able images according to the refocused-image division number; means for receiving each refocused image and detecting its feature points; means for receiving the feature point groups of the refocused images and obtaining all combinations of feature point groups; means for receiving the combinations of feature point groups and obtaining feature point correspondence groups based on the feature descriptors of the detected feature points; and means for receiving the obtained feature points and feature point correspondences, applying RANSAC with the epipolar constraint to the correspondences, and outputting the inlier feature point correspondence groups and the combinations of feature point groups.

In the present invention, the image processing means receives the inlier feature point correspondence groups for all combinations and outputs the combination of feature point groups containing the largest number of feature point correspondences, together with its inlier feature point correspondence group.

In the present invention, the geometric processing means comprises: means for receiving the feature point correspondence group and the feature points and calculating a fundamental matrix; means for receiving the fundamental matrix and the internal parameters of the cameras and calculating an essential matrix; means for receiving the essential matrix and outputting a rotation matrix and a translation vector; and means for receiving the rotation matrix, the translation vector, the internal parameters, the feature points, and the feature point correspondence group, performing bundle adjustment so as to minimize the reprojection error, and outputting the optimal solution for the rotation matrix, the translation vector, and the internal parameters, together with the number of iterations of the optimization and the reprojection error.

In the present invention, the geometric processing means receives the bundle adjustment results for all combinations and outputs the bundle adjustment result for the combination of feature point groups that required the fewest iterations.

In the present invention, the geometric processing means receives the results of the bundle adjustment performed for all combinations of feature point groups and outputs the bundle adjustment result for the combination of feature point groups with the smallest reprojection error.

The present invention is also an image processing method that estimates, from two refocus-able images, which are images whose in-focus depth is variable and which have an overlapping region, the camera parameters of the cameras that captured the respective refocus-able images. The method comprises: an input step of inputting the two refocus-able images, a refocused-image division number, which is the number of divisions into images focused at predetermined depths, and the internal parameters of the cameras; an image processing step of detecting, from the two refocus-able images, the feature points of each refocused image and the feature point correspondence groups between the refocused images, and outputting them; and a geometric processing step of estimating the camera parameters from the feature point correspondence groups and the positions of the feature points.

The present invention is also an image processing program for causing a computer to function as the image processing apparatus described above.

According to the present invention, when the cameras that acquired two refocus-able images are calibrated using those images, the camera parameter estimation result obtained using refocused images at appropriate in-focus positions can be output automatically.

FIG. 1 is a block diagram showing the configuration of the first embodiment of the present invention.
FIG. 2 is a flowchart showing the operation of the image processing apparatus according to the first embodiment.
FIG. 3 is a flowchart showing the operation of the image processing apparatus according to the second embodiment.
FIG. 4 is a flowchart showing the operation of the image processing apparatus according to the third embodiment.

<First Embodiment>
Hereinafter, an image processing apparatus according to a first embodiment of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing the configuration of this embodiment. In the figure, reference numeral 1 denotes an input unit that inputs the refocus-able images, the refocused-image division number, and the internal parameters of the cameras. Reference numeral 2 denotes an image processing unit that takes the two refocus-able images as input and outputs the feature points of each refocused image (an image focused at a particular depth) and the feature point correspondence groups between the refocused images. Reference numeral 3 denotes a geometric processing unit that estimates and outputs the camera parameters from the feature point correspondence groups and the positions of the feature points.

A feature point correspondence group between refocused images is the set of all feature point correspondences obtained for a given pair of feature point groups. A feature point group is the set of all feature points detected in one image. A feature point correspondence expresses which feature point in one image corresponds to which feature point in the other image. From the multiple candidate feature point correspondence groups and candidate camera parameters, the image processing apparatus shown in FIG. 1 automatically outputs those obtained using refocused images at appropriate in-focus positions.

Next, the operation of the image processing apparatus shown in FIG. 1 will be described with reference to FIG. 2. FIG. 2 is a flowchart showing the operation of the image processing apparatus shown in FIG. 1. The apparatus takes as input the two refocus-able images, the refocused-image division number, and the internal parameters of the camera that captured each image, and outputs the external parameters (rotation matrix and translation vector) and the internal parameters.

First, the input unit 1 inputs two refocus-able images (step S1). Next, the input unit 1 inputs the internal parameters of the cameras (step S2). The input unit 1 then inputs the number of refocused images to be created (step S3).

Next, the image processing unit 2 creates, from the refocus-able images input in step S1, as many refocused images as the number input in step S3 (step S4).

Next, the image processing unit 2 detects feature points (their positions and feature descriptors) in the refocused images created in step S4 (step S5).

Next, the image processing unit 2 prepares all combinations of feature point groups, where a feature point group is the set of all feature points detected in one refocused image (step S6). In each combination, one member is the feature point group of a refocused image created from the first refocus-able image, and the other is the feature point group of a refocused image created from the second refocus-able image.
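As an illustration of step S6, the pairing can be sketched as a Cartesian product over the two sets of refocused images. This is a minimal sketch, not the patent's implementation; the function name `feature_group_pairs` and the sample data are hypothetical.

```python
from itertools import product

def feature_group_pairs(groups_a, groups_b):
    """Return every (i, j) index pair, where i indexes the feature point groups
    from the first refocus-able image and j those from the second."""
    return list(product(range(len(groups_a)), range(len(groups_b))))

# Hypothetical feature point groups: one list of (x, y) points per refocused image.
groups_a = [[(10, 12), (40, 8)], [(11, 12)], [(39, 9), (70, 55)]]
groups_b = [[(14, 12)], [(44, 8), (15, 13)], [(74, 56)]]

pairs = feature_group_pairs(groups_a, groups_b)
print(len(pairs))  # number of combinations to evaluate
```

With N refocused images per refocus-able image, this yields N × N combinations, which is why the later steps need an automatic way to pick one.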

Next, the image processing unit 2 obtains a feature point correspondence group for every combination of feature point groups (step S7).

Next, the geometric processing unit 3 applies RANSAC (RANdom SAmple Consensus) to the feature point correspondence groups obtained in step S7 and obtains the inlier feature point correspondence groups (step S8). The geometric processing unit 3 then outputs, from the results of step S8, the one with the largest number of feature point correspondences (step S9).
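The selection in step S9 amounts to an arg-max over inlier counts. The following sketch assumes the RANSAC results are held in a dictionary keyed by the (i, j) pair of refocused images; the function name and the data layout are assumptions made for illustration.

```python
def select_most_inliers(inliers_by_pair):
    """Return the pair of refocused images with the most inlier correspondences,
    together with those correspondences. Ties go to the first pair encountered."""
    best_pair = max(inliers_by_pair, key=lambda pair: len(inliers_by_pair[pair]))
    return best_pair, inliers_by_pair[best_pair]

# Hypothetical RANSAC output: each value lists (index_in_image_1, index_in_image_2) matches.
inliers_by_pair = {
    (0, 0): [(0, 1)],
    (0, 1): [(0, 0), (1, 2), (3, 3)],
    (1, 1): [(2, 2), (4, 0)],
}
best_pair, best_inliers = select_most_inliers(inliers_by_pair)
```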

Next, the geometric processing unit 3 obtains the fundamental matrix from the inlier feature point correspondence group obtained in step S9 (step S10).

Next, the geometric processing unit 3 obtains the essential matrix from the internal parameters input in step S2 and the fundamental matrix obtained in step S10 (step S11).
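Step S11 corresponds to the standard relation E = K2ᵀ F K1 between the fundamental matrix F and the essential matrix E. The sketch below assumes the convention x2ᵀ F x1 = 0 with pixel coordinates x = K x̂, and assumes K1 and K2 are the 3×3 intrinsic matrices of the first and second camera; the patent text does not fix these conventions, and the helper names are hypothetical.

```python
def matmul(A, B):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    """Transpose a 3x3 matrix given as nested lists."""
    return [list(row) for row in zip(*A)]

def essential_from_fundamental(F, K1, K2):
    """E = K2^T F K1, assuming the convention x2^T F x1 = 0."""
    return matmul(matmul(transpose(K2), F), K1)

# Hypothetical example: both views share the intrinsics K.
K = [[2, 0, 0], [0, 2, 0], [0, 0, 1]]
F = [[0.0, 0.0, 0.0], [0.0, 0.0, -0.5], [0.0, 0.5, 0.0]]
E = essential_from_fundamental(F, K, K)
```

In this example E is the cross-product matrix of the translation (1, 0, 0), that is, a pure sideways camera motion with no rotation.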

Next, the geometric processing unit 3 calculates the rotation matrix and the translation vector from the essential matrix obtained in step S11 (step S12).

Next, the geometric processing unit 3 performs bundle adjustment, using the rotation matrix and translation vector obtained in step S12 and the internal parameters input in step S2 as initial values and using the inlier feature point correspondence group obtained in step S9, and obtains the optimal solution for the rotation matrix, the translation vector, and the internal parameters (step S13).
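The quantity that bundle adjustment minimizes is the reprojection error. The patent does not spell out its formula, so the following is only a sketch under the usual pinhole model x ~ K (R X + t), using the mean pixel distance as the error; the function names and sample values are hypothetical, and the optimizer itself is not shown.

```python
import math

def project(K, R, t, X):
    """Project the 3D point X to pixel coordinates with x ~ K (R X + t)."""
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    u = [sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    return (u[0] / u[2], u[1] / u[2])

def mean_reprojection_error(K, R, t, points_3d, observed_2d):
    """Mean Euclidean distance between projected and observed pixel positions."""
    errors = [math.dist(project(K, R, t, X), obs)
              for X, obs in zip(points_3d, observed_2d)]
    return sum(errors) / len(errors)

# Hypothetical camera at the origin looking along +Z.
K = [[100, 0, 50], [0, 100, 50], [0, 0, 1]]
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = [0, 0, 0]
points_3d = [(0, 0, 2), (1, 0, 2)]
observed_2d = [(50, 50), (103, 54)]
err = mean_reprojection_error(K, R, t, points_3d, observed_2d)
```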

Finally, the geometric processing unit 3 outputs the result obtained in step S13 (step S14).

In this way, from the multiple candidate feature point correspondence groups and candidate camera parameters, those obtained using refocused images at appropriate in-focus positions can be output automatically.

<Second Embodiment>
Next, an image processing apparatus according to a second embodiment of the present invention will be described. The apparatus configuration in the second embodiment is the same as that shown in FIG. 1, so a detailed description is omitted here.

Next, the operation of the image processing apparatus in the second embodiment will be described with reference to FIG. 3. FIG. 3 is a flowchart showing the operation of the image processing apparatus in the second embodiment. In this figure, operations identical to those shown in FIG. 2 are given the same reference signs and are described only briefly. Like the apparatus of the first embodiment, the image processing apparatus in the second embodiment takes as input the two refocus-able images, the refocused-image division number, and the internal parameters of the camera that captured each image, and outputs the external parameters (rotation matrix and translation vector) and the internal parameters.

Next, the image processing unit 2 prepares all combinations of feature point groups, where a feature point group is the set of all feature points detected in one refocused image (step S6).

Next, the image processing unit 2 selects one combination of feature point groups and obtains the feature point correspondence group for that combination (step S15).

Next, the geometric processing unit 3 applies RANSAC to the feature point correspondence group obtained in step S15 and obtains the inlier feature point correspondence group (step S8).

Next, the geometric processing unit 3 obtains the fundamental matrix from the inlier feature point correspondence group obtained in step S8 (step S10).

Next, the geometric processing unit 3 performs bundle adjustment, using the rotation matrix and translation vector obtained in step S12 and the internal parameters input in step S2 as initial values and using the inlier feature point correspondence group obtained in step S8, and obtains the optimal solution for the rotation matrix, the translation vector, and the internal parameters (step S17).

Next, the geometric processing unit 3 determines whether all combinations of feature point groups have been processed. If not, the process returns to step S15, so that every prepared combination of feature point groups is processed (step S18).

Finally, the geometric processing unit 3 outputs, from the results for all combinations, the one whose nonlinear optimization required the fewest iterations (step S19).

<Third Embodiment>
Next, an image processing apparatus according to a third embodiment of the present invention will be described. The apparatus configuration in the third embodiment is the same as that shown in FIG. 1, so a detailed description is omitted here.

Next, the operation of the image processing apparatus in the third embodiment will be described with reference to FIG. 4. FIG. 4 is a flowchart showing the operation of the image processing apparatus in the third embodiment. In this figure, operations identical to those shown in FIG. 3 are given the same reference signs and are described only briefly. Like the apparatuses of the first and second embodiments, the image processing apparatus in the third embodiment takes as input the two refocus-able images, the refocused-image division number, and the internal parameters of the camera that captured each image, and outputs the external parameters (rotation matrix and translation vector) and the internal parameters.

Finally, the geometric processing unit 3 outputs, from the results for all combinations, the one with the smallest reprojection error after nonlinear optimization (step S20).
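The second and third embodiments differ only in the criterion used to pick the final result: fewest optimizer iterations (step S19) or smallest reprojection error (step S20). A minimal sketch of both rules follows; the `BundleResult` record and its field names are hypothetical, as are the sample values.

```python
from dataclasses import dataclass

@dataclass
class BundleResult:
    pair: tuple                 # (i, j) indices of the refocused images used
    iterations: int             # iterations the nonlinear optimization needed
    reprojection_error: float   # error after optimization, in pixels

def select_fewest_iterations(results):
    """Selection rule of step S19 (second embodiment)."""
    return min(results, key=lambda r: r.iterations)

def select_smallest_error(results):
    """Selection rule of step S20 (third embodiment)."""
    return min(results, key=lambda r: r.reprojection_error)

# Hypothetical bundle adjustment outcomes for three combinations.
results = [
    BundleResult((0, 0), 12, 0.9),
    BundleResult((0, 1), 5, 1.4),
    BundleResult((1, 1), 8, 0.3),
]
```

As the sample shows, the two criteria need not agree on the same combination.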

In step S1 described above, the two refocus-able images to be input may be either of the following (1) or (2).
(1) Images captured from two viewpoints by one camera.
(2) Images captured from two viewpoints by two cameras.
The two cameras do not need to be synchronized, but their images must have an overlapping region.

In step S2 described above, the internal parameters of the camera to be input may be obtained using the technique described in Reference 1.
Reference 1: Zhang, Zhengyou. "A flexible new technique for camera calibration." Pattern Analysis and Machine Intelligence, IEEE Transactions on 22.11 (2000): 1330-1334.

In step S3 described above, the number of refocused images to be input may be set by estimating depth from the refocus-able image and choosing an appropriate number based on the depth values. A known technique (see, for example, Reference 2) is used to estimate depth from a refocus-able image.
Reference 2: Tao, Michael W., et al. "Depth from Combining Defocus and Correspondence Using Light-Field Cameras." ICCV, (2013).

As a concrete way of determining the number, for example, the depths are clustered and the number of clusters is used as the number of refocused images. In that case, however, the number of clustering iterations and the threshold must be determined in advance.

In step S4 described above, a known technique (see, for example, Reference 3) is used to create refocused images from a refocus-able image.
Reference 3: Ng, Ren, et al. "Light field photography with a hand-held plenoptic camera." Computer Science Technical Report CSTR 2.11 (2005).

The depths at which the refocused images are created may be chosen by either of the following (3) or (4).
(3) Let step = (largest depth) / (number of refocused images to create), and create refocused images at depths that are integer multiples of step, starting from the side nearest the optical center of the camera.
(4) Cluster the depths with k-means or a similar method (k is the number of refocused images to create), and create a refocused image at the depth of each cluster's centroid.

In step S5 described above, a known method such as SIFT, SURF, or CARD can be used to detect feature points.

In step S7 described above, feature points can be matched with a method based on the similarity of their feature descriptors, as proposed for SIFT and similar techniques.
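One common similarity-based matching scheme is nearest-neighbor search with a ratio test, as described in the SIFT literature. The patent does not say which variant it uses, so the sketch below is an assumption: descriptors are plain tuples of numbers, similarity is Euclidean distance, and the `ratio` threshold of 0.8 is an illustrative value.

```python
import math

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbor in desc_b,
    keeping the match only if it is clearly better than the second nearest."""
    matches = []
    for i, da in enumerate(desc_a):
        ranked = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(ranked) >= 2 and ranked[0][0] < ratio * ranked[1][0]:
            matches.append((i, ranked[0][1]))
    return matches

# Hypothetical 2-D descriptors (real SIFT descriptors have 128 dimensions).
desc_a = [(0, 0), (10, 10), (5, 5)]
desc_b = [(0.1, 0), (10, 10.1), (20, 20)]
matches = match_descriptors(desc_a, desc_b)
```

The third descriptor in `desc_a` is almost equally far from two candidates, so the ratio test discards it as ambiguous.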

In step S8 described above, the epipolar constraint may be used as the model for RANSAC.
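With the epipolar constraint as the RANSAC model, a correspondence (p1, p2) is scored by how far p2 lies from the epipolar line F p1. The sketch below shows only this inlier test, assuming the convention x2ᵀ F x1 = 0 and a hypothetical pixel threshold; the sampling loop and the estimation of F from each sample are omitted.

```python
import math

def epipolar_distance(F, p1, p2):
    """Distance in pixels from p2 to the epipolar line F * (p1, 1) in image 2.
    Assumes the line is not degenerate (its first two coefficients are not both zero)."""
    x1 = (p1[0], p1[1], 1.0)
    line = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    norm = math.hypot(line[0], line[1])
    return abs(p2[0] * line[0] + p2[1] * line[1] + line[2]) / norm

def find_inliers(F, correspondences, threshold=1.0):
    """Keep the correspondences whose epipolar distance is below the threshold."""
    return [(p1, p2) for p1, p2 in correspondences
            if epipolar_distance(F, p1, p2) < threshold]

# Hypothetical F for a purely horizontal camera shift: matches must share the same row.
F = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
correspondences = [((10, 20), (15, 20)), ((10, 20), (15, 23)), ((3, 4), (9, 4.5))]
inliers = find_inliers(F, correspondences)
```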

In step S10 described above, the eight-point algorithm may be used to obtain the fundamental matrix.

In steps S13 and S17 described above, the Levenberg-Marquardt method may be used as the nonlinear optimization method for bundle adjustment.

As described above, when SfM is performed on two refocus-able images, camera parameters can be estimated easily by automatically outputting the result of applying SfM to a combination of refocused images with appropriate depths. The estimated parameters can be used for high-accuracy estimation of the three-dimensional shape of a subject and for generation of high-quality virtual-viewpoint video.

The image processing apparatus of the embodiments described above may be implemented by a computer. In that case, a program for realizing its functions may be recorded on a computer-readable recording medium, and the program recorded on the medium may be read into a computer system and executed. The "computer system" here includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The "computer-readable recording medium" may further include media that hold the program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and media that hold the program for a fixed time, such as the volatile memory inside a computer system acting as the server or client in that case. The program may realize only part of the functions described above, may realize those functions in combination with a program already recorded in the computer system, or may be realized using hardware such as a PLD (Programmable Logic Device) or an FPGA (Field Programmable Gate Array).

Embodiments of the present invention have been described above with reference to the drawings, but these embodiments are merely examples of the present invention, and the present invention is clearly not limited to them. Components may therefore be added, omitted, replaced, or otherwise changed without departing from the technical idea and scope of the present invention.

The present invention is applicable to uses in which, when SfM is performed on two refocus-able images, it is essential to automatically output the result of applying SfM to a combination of refocused images with appropriate depths.

Description of reference numerals: 1 ... input unit, 2 ... image processing unit, 3 ... geometric processing unit

Claims

An image processing apparatus that estimates, from two refocus-able images, which are images whose in-focus depth is variable and which have an overlapping region, the camera parameters of the cameras that captured the respective refocus-able images, the apparatus comprising:
input means for inputting the two refocus-able images, a refocused-image division number, which is the number of divisions into images focused at predetermined depths, and the internal parameters of the cameras;
image processing means for detecting, from the two refocus-able images, the feature points of each refocused image and the feature point correspondence groups between the refocused images, and outputting them; and
geometric processing means for estimating the camera parameters from the feature point correspondence groups and the positions of the feature points.

The image processing apparatus according to claim 1, wherein the image processing means comprises:
Means for inputting the two refocus-able images and the refocused-image division number, and generating refocused images from each refocus-able image according to the refocused-image division number;
Means for inputting each of the refocused images and detecting the feature points of each refocused image;
Means for inputting the feature points of the refocused images and obtaining all combinations of the feature points;
Means for inputting the combinations of the feature points and obtaining a feature point correspondence group based on the feature amount of each detected feature point; and
Means for inputting each of the obtained feature points and the feature point correspondences, applying processing by the RANSAC method using the epipolar constraint to the feature point correspondences, and combining the feature point correspondence group and the feature point group that are inliers.
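
The consensus test at the core of RANSAC with the epipolar constraint can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes that a candidate matrix F relating the two images has already been estimated from a random sample of correspondences (that fitting step is omitted), that points satisfy x2^T F x1 = 0 in homogeneous pixel coordinates, and that a one-sided point-to-epipolar-line distance is an acceptable error measure. The function names and the threshold value are hypothetical.

```python
import math


def epipolar_distance(F, p1, p2):
    """Pixel distance from p2 to the epipolar line F * p1 in the second image.

    Assumes the convention x2^T F x1 = 0 with homogeneous points (x, y, 1).
    """
    x1 = (p1[0], p1[1], 1.0)
    # Epipolar line a*x + b*y + c = 0 induced in image 2 by the point in image 1.
    a, b, c = (sum(F[r][k] * x1[k] for k in range(3)) for r in range(3))
    norm = math.hypot(a, b)
    if norm == 0.0:
        return math.inf  # degenerate line: treat the match as an outlier
    return abs(a * p2[0] + b * p2[1] + c) / norm


def split_inliers(F, matches, threshold=1.0):
    """Split (p1, p2) matches into inliers and outliers for a candidate F.

    threshold is a hypothetical tolerance in pixels.
    """
    inliers, outliers = [], []
    for p1, p2 in matches:
        target = inliers if epipolar_distance(F, p1, p2) <= threshold else outliers
        target.append((p1, p2))
    return inliers, outliers
```

In a full RANSAC loop this test would be repeated for many candidate matrices, keeping the candidate that yields the largest inlier set.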

The image processing apparatus according to claim 2, wherein the image processing means inputs the feature point correspondence groups that are inliers in all of the combinations, and outputs the combination of the feature point group and the inlier feature point correspondence group that contains the largest number of feature point correspondences.
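
The selection over all combinations can be illustrated with a short sketch. It rests on assumptions that the claim text does not spell out: each combination is taken to pair one refocused image of the first input with one of the second, identified by a hypothetical index pair (i, j), and the inlier correspondences for each pair are assumed to be already available in a dictionary.

```python
from itertools import product


def all_combinations(num_divisions):
    """Index pairs (i, j): refocused image i of the first input with j of the second."""
    return list(product(range(num_divisions), repeat=2))


def select_most_inliers(inliers_by_combination):
    """Return the (i, j) key with the most inlier correspondences, and those inliers.

    inliers_by_combination maps an index pair to its list of inlier matches.
    Ties are resolved in favor of the first pair encountered.
    """
    if not inliers_by_combination:
        raise ValueError("no combinations to choose from")
    best = max(inliers_by_combination, key=lambda k: len(inliers_by_combination[k]))
    return best, inliers_by_combination[best]
```

With a division number of N this enumerates N × N pairs, so the matching and RANSAC cost grows quadratically with N.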

The image processing apparatus according to any one of claims 1 to 3, wherein the geometric processing means comprises:
Means for inputting the feature point correspondence group and the feature points and calculating a fundamental matrix;
Means for inputting the fundamental matrix and the internal parameters of the camera and calculating an essential matrix;
Means for inputting the essential matrix and outputting a rotation matrix and a translation vector; and
Means for inputting the rotation matrix, the translation vector, the internal parameters, the feature points, and the feature point correspondence group, performing bundle adjustment so that the reprojection error is minimized, and outputting the optimal solutions for the rotation matrix, the translation vector, and the internal parameters, the number of iterations in the optimization, and the reprojection error.
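
In standard two-view geometry, the matrix estimated directly from pixel correspondences is called the fundamental matrix F, and the matrix obtained by combining it with the camera's internal parameters is called the essential matrix E. A pixel point x relates to its normalized counterpart by x = K x̂, so x2^T F x1 = 0 becomes x̂2^T (K2^T F K1) x̂1 = 0, giving E = K2^T F K1. The sketch below shows only this step, under the assumed convention x2^T F x1 = 0; the function names are hypothetical, and the subsequent decomposition of E into a rotation matrix and a translation vector (usually done with an SVD) is omitted.

```python
def matmul3(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose3(A):
    """Transpose of a 3x3 matrix given as nested lists."""
    return [[A[j][i] for j in range(3)] for i in range(3)]


def essential_from_fundamental(F, K1, K2=None):
    """Compute E = K2^T F K1 (convention x2^T F x1 = 0).

    K1 and K2 are the 3x3 internal parameter matrices of the two views.
    If K2 is omitted, both views are assumed to share K1.
    """
    if K2 is None:
        K2 = K1
    return matmul3(transpose3(K2), matmul3(F, K1))
```

For example, with K = diag(2, 2, 1) and a fundamental matrix corresponding to a pure sideways translation, the result is the skew-symmetric matrix of the translation direction (1, 0, 0).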

The image processing apparatus according to claim 4, wherein the geometric processing means inputs the bundle adjustment results for all of the combinations and outputs the bundle adjustment result for the combination of feature points with the smallest number of iterations.

The image processing apparatus according to claim 4, wherein the geometric processing means inputs the bundle adjustment results for all of the combinations of feature points and outputs the bundle adjustment result for the combination of feature points with the smallest reprojection error.
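
The two selection rules above, fewest optimization iterations or smallest reprojection error, differ only in the quantity being minimized. A minimal sketch follows, assuming bundle adjustment has already been run for every combination and its outputs collected; the BundleResult record and its field names are hypothetical and are not defined in the source.

```python
from dataclasses import dataclass


@dataclass
class BundleResult:
    """Hypothetical record of one bundle adjustment run."""
    combination: tuple          # (i, j) indices of the refocused-image pair
    iterations: int             # iteration count of the optimization
    reprojection_error: float   # final reprojection error


def select_result(results, criterion="reprojection_error"):
    """Return the result that minimizes the chosen criterion.

    criterion is "iterations" or "reprojection_error".
    """
    if criterion not in ("iterations", "reprojection_error"):
        raise ValueError(f"unknown criterion: {criterion}")
    if not results:
        raise ValueError("no bundle adjustment results to choose from")
    return min(results, key=lambda r: getattr(r, criterion))
```

The two criteria can pick different combinations, since a run that converges quickly does not necessarily reach the lowest error.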

An image processing method for estimating, from two refocus-able images, which are images with variable focal depth and which have an overlapping region, the camera parameters of the camera that captured each of the refocus-able images, the method comprising:
An input step of inputting the two refocus-able images, a refocused-image division number, which is the number of divisions of an image focused at a predetermined depth, and the internal parameters of the camera;
An image processing step of detecting, from the two refocus-able images, the feature points of each refocused image and a feature point correspondence group between the refocused images, and outputting them; and
A geometric processing step of estimating the camera parameters from the feature point correspondence group and the positions of the feature points.

An image processing program for causing a computer to function as the image processing apparatus according to claim 1.