JP2015022510A

JP2015022510A - Free viewpoint image imaging device and method for the same

Info

Publication number: JP2015022510A
Application number: JP2013149902A
Authority: JP
Inventors: Takashi Watanabe (渡邉 隆史)
Original assignee: Toppan Printing Co Ltd
Current assignee: Toppan Inc
Priority date: 2013-07-18
Filing date: 2013-07-18
Publication date: 2015-02-02
Anticipated expiration: 2033-07-18
Also published as: JP6201476B2

Abstract

PROBLEM TO BE SOLVED: To provide a free viewpoint image capturing device and method that capture a photorealistic free viewpoint image by morphing, on the basis of a plurality of captured images, the captured image of a virtual camera corresponding to the viewpoint from which a subject is observed.
SOLUTION: A free viewpoint image capturing device according to the present invention has: an imaging control part that acquires first captured images of a subject from each of a plurality of imaging devices; a corresponding point retrieval part that uses a three-dimensional shape of the subject restored from the first captured images to retrieve first corresponding points, which are corresponding points of second captured images acquired from the imaging devices; a template matching part that performs template matching of the first corresponding points among the second captured images; and a virtual camera image generation part that generates, by a morphing process of the second captured images, third captured images as captured from virtual cameras arranged among the plurality of imaging devices.

Description

The present invention relates to a free viewpoint image capturing device, and a method for the same, for capturing a free viewpoint image of a subject in three dimensions.

In recent years, with the development of image processing and image communication technology, three-dimensional free viewpoint images have attracted attention as next-generation image content.
For this reason, techniques have been developed that generate an all-around free viewpoint image from multi-viewpoint images captured by video cameras arranged around the subject.
When a free viewpoint image is created from captured images, a common approach is to build the three-dimensional shape of the subject in three-dimensional space from the captured images and then render that shape.

The basic idea is to find corresponding points between two images and to measure the coordinates of each corresponding point in three-dimensional space by triangulation. This processing yields point cloud data of the corresponding points in three-dimensional space.
Next, polygons are generated from the point cloud data, and a texture object is generated by pasting the captured image onto the polygons (texture mapping). Finally, the texture object, that is, the polygons with the captured image pasted on them, is rendered using three-dimensional computer graphics techniques to generate an image.
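The triangulation step described above can be illustrated with a minimal sketch. It assumes that each corresponding image point has already been converted into a viewing ray (camera centre plus direction) and uses the midpoint method, which is one common choice and not necessarily the one used in the cited work. The function name and inputs are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(o1: Vec3, d1: Vec3, o2: Vec3, d2: Vec3) -> Vec3:
    """Midpoint of the shortest segment between two viewing rays.

    o1, o2: camera centres (assumed known from calibration).
    d1, d2: directions of the rays through a pair of corresponding points.
    """
    w = tuple(p - q for p, q in zip(o1, o2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is undefined")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = tuple(o + s * k for o, k in zip(o1, d1))
    p2 = tuple(o + t * k for o, k in zip(o2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


# Two cameras 4 units apart, both looking at the point (1, 2, 5).
point = triangulate_midpoint((0, 0, 0), (1, 2, 5), (4, 0, 0), (-3, 2, 5))
```

Repeating this for every pair of corresponding points gives the point cloud from which polygons are then generated. The parallel-ray check reflects the fact that depth cannot be recovered when the two rays do not converge.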

Another way to capture a free viewpoint image is a method based on two-dimensional image processing using morphing. Morphing generally refers to an image processing technique that makes the transition from an image of one subject to an image of another subject appear natural.
By applying morphing to a change of viewpoint with respect to a three-dimensional subject, an image with a natural change of viewpoint can be obtained.

The advantage of morphing is that a photorealistic image can easily be created for a change of viewpoint with respect to a three-dimensional subject.
As already described, in the method that first creates a three-dimensional shape from the captured images, there is normally only one texture, and the texture does not change even when the observer's viewpoint changes.
In general, however, the color of an object (subject) varies with the incident angle of the light illuminating it and with the direction from which the observer views it. It is therefore unnatural for the texture to remain unchanged when the observer's viewpoint changes. Morphing, by contrast, produces a photorealistic image because the texture in the image changes as the viewpoint changes.

One technique for generating a free viewpoint image is view morphing, which is based on morphing (see, for example, Non-Patent Document 1). When a virtual camera is placed between two cameras, view morphing creates the image seen from the viewpoint of the virtual camera by morphing the captured images of the two cameras while maintaining the geometric consistency of the cameras. However, in the technique of Non-Patent Document 1, the images that can be generated are limited to viewpoints between the two cameras, so the degree of freedom of the viewpoint is low. Moreover, although morphing can produce a photorealistic image in response to a change of viewpoint, the quality of the generated image depends on the accuracy of the corresponding point search. In general, it is difficult to perform a highly accurate and robust corresponding point search between two captured images.

For the corresponding point search, the favorable conditions for high accuracy are that the imaging directions of the two cameras are parallel and that the imaging distance is short. However, generating the captured image of a virtual camera between two cameras that have parallel imaging directions and are close together is pointless, because the captured image merely translates.
For this reason, when view morphing is performed in the technique of Non-Patent Document 1, the subject is imaged under the condition that the two cameras have a convergence angle and are separated from each other, so that the captured image of a virtual camera between them can be created.

Therefore, Non-Patent Document 1 requires a difficult corresponding point search under the condition that the two cameras have a convergence angle and are separated from each other.
When the accuracy of the corresponding point search is insufficient, only sparse corresponding points, including outliers (incorrect corresponding points that do not actually correspond), are obtained between the two captured images. If view morphing is performed with the captured images of the two cameras while the corresponding points are sparse, regions without pixels appear in the morphed image, or the image becomes blurred. Furthermore, even if a highly accurate and dense corresponding point search succeeds, corresponding points cannot be found in occlusion regions, so regions without pixels still appear when the occlusion regions are drawn.

Several methods that extend the view morphing of Non-Patent Document 1 have been proposed. For example, because the view morphing of Non-Patent Document 1 is limited to two viewpoints, a method of morphing among three viewpoints has been proposed (see, for example, Non-Patent Document 2).
This makes it possible to generate images from viewpoints that are freer than in Non-Patent Document 1, anywhere on the plane formed by the three viewpoints. However, the method of Non-Patent Document 2 does not solve the problems caused by the corresponding point search either.

A method has also been proposed for the case where the result of the corresponding point search among three viewpoints, that is, among the captured images of three cameras, is sparse or too dense: appropriate corresponding points are selected, a mesh is built by Delaunay triangulation, and morphing is then performed to draw the captured image of the virtual camera (see, for example, Non-Patent Document 3). With the morphing process of Non-Patent Document 3, the morphed image of the virtual camera can be drawn without holes. However, the method does not solve the problem of outliers among the corresponding points or the problem of occlusion.

A technique for drawing occlusion regions has also been proposed (see, for example, Patent Document 4), but in Patent Document 4 the occlusion regions must be specified manually. In addition, although occlusion regions can be drawn, parts that are hidden and were never captured cannot be drawn cleanly in fine detail.

One method for generating a free viewpoint image using a three-dimensional shape is that of Y. Furukawa et al. (see, for example, Non-Patent Document 5). In the method of Non-Patent Document 5, captured images of the object (subject) taken from a plurality of viewpoints are prepared first.
Next, features are extracted from the captured images using a feature extraction process such as SIFT (Scale Invariant Feature Transform). The extracted features are then used to run SFM (Structure from Motion) and MVS (Multi View Stereo), which restore the three-dimensional shape of the imaged subject. SFM is a technique that simultaneously restores the three-dimensional shape of a subject and the camera positions from a plurality of images captured while changing the camera viewpoint. MVS is a technique that generates an initial 3D model from the camera positions and parameters restored by SFM and from the input images.

When the SIFT computation described above succeeds, many features are obtained and a highly accurate three-dimensional shape can be restored.
On the other hand, it is difficult to find feature points with high accuracy using SIFT, and in many cases the resulting point cloud of feature points is not dense enough to create a free viewpoint image.
Furthermore, many outliers occur among the points extracted as feature points from a plurality of captured images, and these must be removed.

A technique that makes it possible to draw the captured image even when a dense point cloud of feature points cannot be obtained has also been proposed (Non-Patent Document 6).
In Non-Patent Document 6, when SFM and MVS are used, points obtained with high accuracy are drawn with high priority, and points obtained with low accuracy are drawn blurred, which achieves smooth viewpoint movement.
However, this technique merely disguises the point cloud while the viewpoint is moving, so regions where the point cloud has low accuracy are simply drawn blurred.

VH (Visual Hull) is known as a method that can restore a three-dimensional shape stably (see, for example, Non-Patent Document 7 and Non-Patent Document 8). In VH, the subject is imaged from a plurality of viewpoints, the background is removed from each captured image, and silhouette images are generated. Because the object to be restored is contained within each silhouette, the region common to the silhouettes is computed in three-dimensional space. This identifies the region (subspace) of three-dimensional space in which the subject exists, and that region is restored as the three-dimensional shape of the subject.
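The silhouette intersection at the heart of VH can be sketched as voxel carving: a voxel is kept only if it projects inside the silhouette in every view. The sketch below is a toy illustration under simplifying assumptions (a tiny voxel grid, orthographic projections given as plain functions, silhouettes given as sets of pixels); the names are hypothetical and real systems use calibrated perspective cameras.

```python
from itertools import product
from typing import Callable, Iterable, List, Set, Tuple

Voxel = Tuple[int, int, int]
Pixel = Tuple[int, int]
# A view is a projection function plus the set of silhouette pixels.
View = Tuple[Callable[[Voxel], Pixel], Set[Pixel]]


def carve_visual_hull(voxels: Iterable[Voxel], views: List[View]) -> Set[Voxel]:
    """Keep the voxels whose projection lies inside every silhouette."""
    return {
        v
        for v in voxels
        if all(project(v) in silhouette for project, silhouette in views)
    }


# Toy usage: a 3x3x3 grid seen by two orthographic "cameras".
grid = list(product(range(3), repeat=3))
views: List[View] = [
    (lambda v: (v[0], v[1]), {(1, 1)}),          # looking along the z axis
    (lambda v: (v[1], v[2]), {(1, 0), (1, 1)}),  # looking along the x axis
]
hull = carve_visual_hull(grid, views)
```

Because only silhouettes are used, the result is stable but can only be an outer bound of the true shape: concavities that never appear on any silhouette boundary cannot be carved away.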

Because the methods of Non-Patent Document 7 and Non-Patent Document 8 only compute silhouettes and do not use feature points, a three-dimensional shape can be obtained relatively stably. However, the three-dimensional shape restored by VH has low accuracy, and applying the captured images directly as textures does not work well.
To solve this problem, a method that improves accuracy by combining VH with stereo matching (see, for example, Non-Patent Document 9) and a method that deforms the three-dimensional shape on the basis of the texture (see, for example, Patent Document 1) have been proposed.

There is also a technique that creates a texture from images captured from a plurality of viewpoints by assigning priorities to the textures (see, for example, Patent Document 2).
However, because these techniques merely paste a texture onto the shape, they cannot produce images that remain photorealistic under a change of viewpoint in the way morphing can.

Patent Document 1: JP 2009-294956 A
Patent Document 2: Japanese Patent No. 4464773

Non-Patent Document 1: S. Seitz, C. Dyer, "View morphing", SIGGRAPH 1996, 1996, pp. 21-30.
Non-Patent Document 2: J. Xiao, M. Shah, "Tri-view morphing", Comput Vis Image Underst, Volume 96, Issue 3, December 2004, pp. 345-366.
Non-Patent Document 3: David Jelinek and C. J. Taylor, "Quasi-Dense Motion Stereo for 3D View Morphing", International Symposium on Virtual and Augmented Architecture (VAA01), pp. 219-229, June 2001.
Non-Patent Document 4: G. Chaurasia, O. Sorkine, G. Drettakis, "Silhouette-Aware Warping for Image-Based Rendering", Computer Graphics Forum (Proceedings of the Eurographics Symposium on Rendering), Volume 30, Number 4, 2011.
Non-Patent Document 5: Y. Furukawa and J. Ponce, "Accurate, Dense, and Robust Multi-View Stereopsis", CVPR 2007.
Non-Patent Document 6: M. Goesele, J. Ackermann, S. Fuhrmann, C. Haubold, R. Klowsky, D. Steedly, R. Szeliski, "Ambient Point Clouds for View Interpolation", ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH 2010).
Non-Patent Document 7: W. Matusik, C. Buehler, R. Raskar, S. Gortler, L. McMillan, "Image-Based Visual Hulls", SIGGRAPH 2000.
Non-Patent Document 8: S. Moezzi, L. Tai, and P. Gerard, "Virtual View Generation for 3D Digital Video", IEEE Multimedia, pp. 18-26, 1997.
Non-Patent Document 9: 冨山仁博, 片山美和, 岩舘祐一, 今泉浩幸, "A method for generating 3D dynamic objects from multi-view images using the visual volume intersection method and stereo matching" (in Japanese), 映像情報メディア 58(6), pp. 797-806, 2004-06-01.

The present invention has been made in view of these circumstances. Its object is to provide a free viewpoint image capturing device, and a method for the same, that capture a photorealistic free viewpoint image by morphing, on the basis of a plurality of captured images, the captured image of a virtual camera corresponding to the viewpoint from which the subject is observed.

The free viewpoint image capturing device of the present invention is characterized by having: an imaging control unit that acquires first captured images of a subject from each of a plurality of imaging devices; a corresponding point search unit that uses the three-dimensional shape of the subject restored from the first captured images to search for first corresponding points, which are corresponding points of second captured images acquired from the imaging devices; a template matching unit that performs template matching of the first corresponding points between the second captured images; and a virtual camera image generation unit that generates, by a morphing process of the second captured images, a third captured image as captured from a virtual camera arranged between the plurality of imaging devices.

The free viewpoint image capturing device of the present invention is characterized in that the virtual camera image generation unit performs the morphing process on a combination of the second captured images, and the virtual camera has its imaging surface on the line or plane on which the imaging devices that capture the combined second captured images are arranged.

The free viewpoint image capturing device of the present invention is characterized by further having an occlusion region search unit that detects the occlusion region of each imaging device in the combination from a fourth captured image, obtained by projecting the three-dimensional shape onto the imaging surface of the virtual camera, and supplies the detection result to the virtual camera image generation unit as occlusion information, wherein the virtual camera image generation unit generates the third captured image by a morphing process that uses the captured images of the combination and reflects the occlusion information.

The free viewpoint image capturing device of the present invention is characterized in that the resolution of the second captured images is higher than the resolution of the first captured images.

The free viewpoint image capturing device of the present invention is characterized by further having: a silhouette image generation unit that generates silhouette images from the first captured images; a VF processing unit that generates, from the silhouette images, a temporary partial three-dimensional shape that is a part of the three-dimensional shape of the subject; a stereo matching unit that, for each combination of the imaging devices, performs stereo matching of the respective first captured images and extracts common corresponding points, which are corresponding points common to those first captured images; a partial three-dimensional shape generation unit that corrects the temporary partial three-dimensional shape using the common corresponding points and takes the corrected result as a partial three-dimensional shape; and a three-dimensional shape synthesis unit that synthesizes the partial three-dimensional shapes to generate the three-dimensional shape.

The free viewpoint image capturing device of the present invention is characterized in that the three-dimensional shape synthesis unit compares a triangular mesh connecting the common corresponding points with the temporary partial three-dimensional shape, and corrects the temporary partial three-dimensional shape by deleting the regions of the temporary partial three-dimensional shape that lie closer to the imaging devices than the mesh.

The free viewpoint image capturing device of the present invention is characterized in that the three-dimensional shape synthesis unit superimposes the partial three-dimensional images obtained from all combinations of the first captured images, and generates the three-dimensional shape from the regions in which at least a preset threshold number of the partial three-dimensional images overlap.
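The threshold-based superposition can be pictured as a voting scheme. The following is a minimal sketch that assumes each partial result is represented as a set of occupied voxel coordinates; this representation and the function name are illustrative assumptions, not taken from the source.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Voxel = Tuple[int, int, int]


def merge_partial_shapes(partials: Iterable[Set[Voxel]], threshold: int) -> Set[Voxel]:
    """Keep the voxels covered by at least `threshold` partial shapes."""
    votes: Counter = Counter()
    for shape in partials:
        votes.update(set(shape))  # each partial shape votes once per voxel
    return {voxel for voxel, count in votes.items() if count >= threshold}


# Toy usage: three partial shapes, keep voxels seen in at least two of them.
partials = [
    {(0, 0, 0), (1, 0, 0)},
    {(1, 0, 0), (2, 0, 0)},
    {(1, 0, 0), (2, 0, 0), (3, 0, 0)},
]
merged = merge_partial_shapes(partials, threshold=2)
```

A higher threshold suppresses regions supported by only a few camera sets, at the cost of possibly dropping parts of the subject that few sets can see.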

The free viewpoint image capturing device of the present invention is characterized as follows. There are two imaging devices, and the virtual camera is arranged on the line on which the imaging devices are arranged. The coordinate Ev at which the virtual camera is arranged is expressed, using the coordinates Ei of the imaging devices i (1 ≦ i ≦ 2), as Ev = (1 - α)E1 + αE2 (0 ≦ α ≦ 1). For the coordinate Pv of a pixel in the third captured image of the virtual camera, the first corresponding point of the coordinate Pv is the coordinate Pi, and the first corresponding point in the coordinate system of the third captured image is expressed by a combination of Pi, Pj of the first captured image used as the template in the template matching, and the difference Dij of Pj with respect to Pi (i = 1, 2; j = 1, 2; i ≠ j). In that case, in the coordinate system of the third captured image, the first corresponding point of imaging device 1 for the coordinate Pv is P1 + αD12, and the first corresponding point of imaging device 2 is P2 + (1 - α)D21.
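The two-camera expressions Ev = (1 - α)E1 + αE2, P1 + αD12 and P2 + (1 - α)D21 can be evaluated directly. The sketch below is only a numerical illustration; the function names are hypothetical, and the sample displacements assume the ideal case D12 = P2 - P1 and D21 = P1 - P2, in which both warped points coincide.

```python
from typing import Sequence, Tuple

Vec = Tuple[float, ...]


def _axpy(p: Sequence[float], w: float, d: Sequence[float]) -> Vec:
    """Return p + w * d, component by component."""
    return tuple(pi + w * di for pi, di in zip(p, d))


def virtual_camera_position(e1: Vec, e2: Vec, alpha: float) -> Vec:
    """Ev = (1 - alpha) * E1 + alpha * E2, with 0 <= alpha <= 1."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple((1.0 - alpha) * a + alpha * b for a, b in zip(e1, e2))


def warp_two_views(p1: Vec, p2: Vec, d12: Vec, d21: Vec, alpha: float) -> Tuple[Vec, Vec]:
    """Return (P1 + alpha * D12, P2 + (1 - alpha) * D21)."""
    return _axpy(p1, alpha, d12), _axpy(p2, 1.0 - alpha, d21)


# Toy usage: virtual camera a quarter of the way from camera 1 to camera 2.
ev = virtual_camera_position((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), 0.25)
q1, q2 = warp_two_views((10.0, 10.0), (14.0, 10.0), (4.0, 0.0), (-4.0, 0.0), 0.25)
```

With consistent displacements, both cameras map their corresponding point to the same pixel of the virtual image; any disagreement between the two results indicates an error in the correspondence.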

The free viewpoint image capturing device of the present invention is characterized as follows. There are three imaging devices, and the virtual camera is arranged on the plane on which the imaging devices are arranged. The coordinate Ev at which the virtual camera is arranged is expressed, using the coordinates Ei of the imaging devices i (1 ≦ i ≦ 3), as Ev = (1 - α - β)E1 + αE2 + βE3 (0 ≦ α, 0 ≦ β, α + β ≦ 1). For the coordinate Pv of a pixel in the third captured image of the virtual camera, the first corresponding point of the coordinate Pv is the coordinate Pi, and the first corresponding point in the coordinate system of the third captured image is expressed by a combination of Pi, Pj of the first captured image used as the template in the template matching, and the difference Dij of Pj with respect to Pi (i = 1, 2, 3; j = 1, 2, 3; i ≠ j). In that case, in the coordinate system of the third captured image, the first corresponding point of imaging device 1 for the coordinate Pv is P1 + αD12 + βD13, the first corresponding point of imaging device 2 is P2 + (1 - α - β)D21 + βD23, and the first corresponding point of imaging device 3 is P3 + (1 - α - β)D31 + αD32.
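The three-camera expressions use the barycentric weights (1 - α - β, α, β). The sketch below evaluates P1 + αD12 + βD13, P2 + (1 - α - β)D21 + βD23 and P3 + (1 - α - β)D31 + αD32 for a toy configuration. The function name and the dictionary layout for Dij are hypothetical, and the sample data assume the ideal case Dij = Pj - Pi.

```python
from typing import Dict, Sequence, Tuple

Vec = Tuple[float, ...]


def warp_three_views(
    p: Sequence[Vec],
    d: Dict[Tuple[int, int], Vec],
    alpha: float,
    beta: float,
) -> Tuple[Vec, Vec, Vec]:
    """Warp the corresponding points of cameras 1..3 into the virtual view.

    p[i - 1] is Pi; d[(i, j)] is Dij, the displacement from Pi to Pj.
    """
    if alpha < 0 or beta < 0 or alpha + beta > 1:
        raise ValueError("require 0 <= alpha, 0 <= beta, alpha + beta <= 1")
    w = {1: 1.0 - alpha - beta, 2: alpha, 3: beta}  # weight of each camera
    warped = []
    for i in (1, 2, 3):
        q = list(p[i - 1])
        for j in (1, 2, 3):
            if j != i:
                # Move Pi toward Pj in proportion to camera j's weight.
                q = [qk + w[j] * dk for qk, dk in zip(q, d[(i, j)])]
        warped.append(tuple(q))
    return warped[0], warped[1], warped[2]


# Toy usage with consistent displacements Dij = Pj - Pi.
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 8.0)]
disp = {
    (i, j): tuple(b - a for a, b in zip(points[i - 1], points[j - 1]))
    for i in (1, 2, 3)
    for j in (1, 2, 3)
    if i != j
}
q1, q2, q3 = warp_three_views(points, disp, alpha=0.25, beta=0.5)
```

Each camera's point is displaced toward the other two in proportion to their weights, so the three results agree whenever the displacements are mutually consistent.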

The free viewpoint image capturing device of the present invention is characterized as follows. There are four imaging devices, and the virtual camera is arranged on the plane on which the imaging devices are arranged. The coordinate Ev at which the virtual camera is arranged is expressed, using the coordinates Ei of the imaging devices i (1 ≦ i ≦ 4), as Ev = (1 - α - αβ)E1 + α(1 - β)E2 - αβE3 + αβE4 (0 ≦ α, 0 ≦ β, α + β ≦ 1). For the coordinate Pv of a pixel in the third captured image of the virtual camera, the first corresponding point of the coordinate Pv is the coordinate Pi, and the first corresponding point in the coordinate system of the third captured image is expressed by a combination of Pi, Pj of the first captured image used as the template in the template matching, and the difference Dij of Pj with respect to Pi (i = 1, 2, 3, 4; j = 1, 2, 3, 4; i ≠ j). In that case, in the coordinate system of the third captured image, the corresponding point of imaging device i for the coordinate Pv is (1 - α + αβ)P'i1 + α(1 - β)P'i2 - αβP'i3 + αP'i4 (where P'ii = Pi).

The free viewpoint image capturing device of the present invention is characterized in that each of the imaging devices is arranged on the surface of a sphere centered on the subject, with its imaging direction pointing toward the subject, so that the distances from the imaging devices to the subject are the same.

The free viewpoint image capturing method of the present invention is characterized by including: an imaging control step in which an imaging control unit acquires first captured images of a subject from each of a plurality of imaging devices; a corresponding point search step in which a corresponding point search unit uses the three-dimensional shape of the subject restored from the first captured images to search for first corresponding points, which are corresponding points of second captured images acquired from the imaging devices; a template matching step in which a template matching unit performs template matching of the first corresponding points between the second captured images; and a virtual camera image generation step in which a virtual camera image generation unit generates, by a morphing process of the second captured images, a third captured image as captured from a virtual camera arranged between the plurality of imaging devices.

According to the present invention, when a free viewpoint image is generated by morphing a plurality of captured images, template matching is performed on local regions during the morphing process to correct the first corresponding points between the captured images. This reduces the blur caused by morphing and makes it possible to capture, as a photorealistic free viewpoint image, the captured image of the virtual camera corresponding to the viewpoint from which the subject is observed.
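The local correction by template matching can be sketched as a small window search around an initial guess. The sketch below assumes SAD (Sum of Absolute Differences) as the matching cost and grayscale images stored as nested lists; both choices, and the function names, are illustrative assumptions rather than the method fixed by this passage.

```python
from typing import Sequence, Tuple

Image = Sequence[Sequence[int]]


def sad(image: Image, template: Image, top: int, left: int) -> int:
    """Sum of absolute differences between the template and an image patch."""
    return sum(
        abs(image[top + r][left + c] - template[r][c])
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def refine_corresponding_point(
    image: Image, template: Image, guess: Tuple[int, int], radius: int
) -> Tuple[int, int]:
    """Search within `radius` pixels of `guess` (top-left corner of the patch)
    and return the position with the lowest SAD cost."""
    h, w = len(template), len(template[0])
    max_top, max_left = len(image) - h, len(image[0]) - w
    candidates = [
        (top, left)
        for top in range(max(0, guess[0] - radius), min(max_top, guess[0] + radius) + 1)
        for left in range(max(0, guess[1] - radius), min(max_left, guess[1] + radius) + 1)
    ]
    if not candidates:
        raise ValueError("search window lies outside the image")
    return min(candidates, key=lambda pos: sad(image, template, pos[0], pos[1]))


# Toy usage: the initial guess (1, 1) is one pixel off in each direction.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 8], [7, 6]]
best = refine_corresponding_point(image, template, guess=(1, 1), radius=1)
```

Restricting the search to a small window around the position predicted from the three-dimensional shape keeps the matching local, which is what lets it correct small misalignments without the robustness problems of a global corresponding point search.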

A diagram showing a configuration example of the dome-type imaging apparatus used in the free viewpoint image capturing apparatus according to an embodiment of the present invention.
A conceptual diagram showing a configuration example of the free viewpoint image capturing apparatus, in which only the column 500_3 of the frame 500 is shown.
A diagram showing examples of mounting the imaging devices on the columns that make up the dome frame 500.
A diagram showing a configuration example of the free viewpoint image capturing unit 10 in FIG. 2.
A diagram explaining the process of generating a silhouette image from the color information of a first captured image.
A diagram showing the generation of a three-dimensional shape from a plurality of silhouette images of the subject taken from different imaging directions.
A diagram explaining the generation of sets of two or more imaging devices.
A diagram showing the result of the common corresponding point extraction performed by the stereo matching unit 105 using SAD (Sum of Absolute Difference).
A diagram showing the triangular mesh created by Delaunay triangulation with the common corresponding points shown in FIG. 8(b) as vertices.
A diagram showing the triangular mesh in three-dimensional space generated by the three-dimensional shape generation unit 106.
A diagram showing the partial three-dimensional shapes generated from the comparison, performed by the three-dimensional shape generation unit 106, between the provisional three-dimensional shape and the three-dimensional shape formed by the mesh.
A diagram showing how the 16 imaging devices of the dome-type imaging apparatus are combined four at a time into nine first imaging device sets, and how the three-dimensional shape of the subject is generated from the partial three-dimensional shape of each first imaging device set.
A diagram explaining the occlusion region search performed by the occlusion region search unit 109.
A diagram explaining the template matching between first captured images performed by the template matching unit 110.
A diagram showing the range in which a virtual camera can be placed when the morphing process that generates a virtual captured image, that is, the image captured by the virtual camera, is performed for the imaging devices of a first imaging device set.
A diagram explaining the selection of the imaging devices used for the morphing process according to the arrangement of imaging devices in the dome-type imaging apparatus shown in FIG. 3.
A diagram showing the coordinates Ev of the virtual camera v when three imaging devices (imaging device 1, imaging device 2, and imaging device 3) are used.
A diagram explaining the weighting coefficients in equation (20), which gives the color Cv of the pixel Pv of the virtual captured image of the virtual camera v.
A diagram showing the relationship between corresponding points in the first captured images of the imaging devices 1, 2, and 3 and in the virtual captured image of the virtual camera v.
A diagram showing the coordinates Ev of the virtual camera v when three imaging devices (imaging device 1, imaging device 2, and imaging device 3) are used.
A diagram showing the correspondence of the first common points in the first captured images of the imaging devices 1, 2, 3, and 4 on the two-dimensional plane of the virtual captured image.
A diagram explaining the weighting coefficients in equation (27), which gives the color Cv of the pixel Pv of the virtual captured image of the virtual camera v.
A flowchart showing an operation example in which the free viewpoint image capturing apparatus of this embodiment generates a free viewpoint image by morphing a plurality of first captured images.
A diagram showing the first captured images captured by a first imaging device set of the dome-type imaging apparatus.
A diagram showing a free viewpoint image generated with ViewMorphing (registered trademark) as implemented in the open source library OpenCV (registered trademark).
A diagram showing a free viewpoint image generated by using SAD for stereo matching, creating a mesh by Delaunay triangulation, and compositing that mesh.
A diagram showing a free viewpoint image generated by morphing from a three-dimensional shape obtained with Visual Hull.
A diagram showing a free viewpoint image generated by morphing after obtaining a three-dimensional shape with Visual Hull and finding common corresponding points by stereo matching between that shape and each first captured image.
A diagram showing a free viewpoint image generated by morphing with the first corresponding points obtained as follows: a three-dimensional shape is obtained with Visual Hull, common corresponding points are found by stereo matching between that shape and each first captured image, and the resulting three-dimensional shape is used for template matching between the first captured images.

In the embodiment of the present invention, the configuration satisfies the following three conditions in order to solve the problem.
a. Between the captured images to be morphed, dense corresponding points free of outliers are obtained regardless of the result of the corresponding point search.
b. The occlusion region of each imaging device used for imaging is detected so that the free viewpoint image can be rendered.
c. The free viewpoint image, as the imaging result of a virtual camera, is generated by morphing from images captured by the imaging devices near that virtual camera.

Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a diagram showing a configuration example of the dome-type imaging apparatus used in the free viewpoint image capturing apparatus according to an embodiment of the present invention.
A subject 600 to be imaged is placed at the center of a hemispherical dome frame 500. The dome frame 500 is structured so that the imaging devices 53_1 through 59_4 are arranged on a hemispherical surface centered on the subject 600. The dome frame 500 is made up of a plurality of columns, twelve in this embodiment (columns 500_1 through 500_12). A plurality of imaging devices (four in this embodiment) are mounted on each of the columns 500_3, 500_5, 500_7, and 500_9 at predetermined intervals along the vertical direction of the column, so the dome frame 500 carries 16 imaging devices in total. The columns 500_1 through 500_n are placed at 15-degree intervals on the hemispherical dome frame 500 centered on the subject 600.

FIG. 2 is a conceptual diagram showing a configuration example of the free viewpoint image capturing apparatus, in which only the column 500_3 of the frame 500 is shown. The free viewpoint image capturing apparatus of this embodiment consists of the dome-type imaging apparatus of FIG. 1 and a free viewpoint image capturing unit 10. In FIG. 2, an LED (Light Emitting Diode) light source 800 illuminates the subject 600 through a light guide 850_3 provided on the column 500_3.
A blue back 900 is placed on the dome frame 500 side with the subject in between, so that the background of the subject 600 is blue in the image captured by each imaging device.

Each of the imaging devices 53_1, 53_2, 53_3, and 53_4 is mounted on the arc-shaped column 500_3 centered on the subject 600, so the distance between the imaging surface and the subject 600 is the same for all of them. Although not shown, each of the imaging devices 53_1, 53_2, 53_3, and 53_4 is attached to the column 500_3 via a free pan head. This makes it easy to fine-tune the optical axis of each imaging device and to aim the optical axes at the subject 600. Because the mounting position and imaging direction of each imaging device are calibrated in advance and therefore known, all imaging devices share a common world coordinate system in three-dimensional space. The coordinate values of the three-dimensional shapes described below are coordinates in this world coordinate system.

The capture of images by each of the imaging devices 53_1, 53_2, 53_3, and 53_4 is controlled by the free viewpoint image capturing unit 10.
The free viewpoint image capturing unit 10 generates a free viewpoint image from the images captured by the imaging devices provided in the dome-type imaging apparatus of FIG. 1, including the imaging devices 53_1, 53_2, 53_3, and 53_4.

FIG. 3 is a diagram showing examples of mounting the imaging devices on the columns that make up the dome frame 500. FIG. 3(a) shows a configuration in which 16 imaging devices are mounted four per column, at substantially equal intervals, on four columns (every other column). That is, in FIG. 3(a), four imaging devices are placed on each of the columns 500_3, 500_5, 500_7, and 500_9 shown in FIG. 1. The imaging devices 53_1, 53_2, 53_3, and 53_4 are on the column 500_3. The imaging devices 55_1, 55_2, 55_3, and 55_4 are on the column 500_5. The imaging devices 57_1, 57_2, 57_3, and 57_4 are on the column 500_7. The imaging devices 59_1, 59_2, 59_3, and 59_4 are on the column 500_9.

FIG. 3(b) shows a configuration in which 16 imaging devices are mounted two per column, on adjacent columns, in a staggered pattern. That is, in FIG. 3(b), two imaging devices are placed on each of the columns 500_3, 500_4, 500_5, 500_6, 500_7, 500_8, and 500_9 shown in FIG. 1. The imaging devices 53_1 and 53_2 are on the column 500_3. The imaging devices 54_1 and 54_2 are on the column 500_4. The imaging devices 55_1 and 55_2 are on the column 500_5. The imaging devices 56_1 and 56_2 are on the column 500_6. The imaging devices 57_1 and 57_2 are on the column 500_7. The imaging devices 58_1 and 58_2 are on the column 500_8. The imaging devices 59_1 and 59_2 are on the column 500_9. The spacing between the two imaging devices on a column is the same for every column. The mounting heights of the two imaging devices on the column 500_3 are the same as those of the two imaging devices on each of the columns 500_5, 500_7, and 500_9. Likewise, the mounting heights of the two imaging devices on the column 500_4 are the same as those of the two imaging devices on each of the columns 500_6 and 500_8.

FIG. 4 is a diagram showing a configuration example of the free viewpoint image capturing unit 10 in FIG. 2. In FIG. 4, the free viewpoint image capturing unit 10 includes an imaging control unit 101, a silhouette image generation unit 102, a VF processing unit 103, an imaging device selection unit 104, a stereo matching unit 105, a three-dimensional shape generation unit 106, a three-dimensional shape synthesis unit 107, a corresponding point search unit 108, an occlusion region search unit 109, a template matching unit 110, a virtual camera setting unit 111, a morphing imaging device selection unit 112, a virtual camera image generation unit 113, and a database 114.

The imaging control unit 101 reads the captured images from the imaging devices mounted on the columns of the dome frame 500 and temporarily writes them to the database 114 together with the identification information of the corresponding imaging device. Here, for each captured image the imaging control unit 101 stores a pair of images: a low-resolution first captured image obtained by thinning out pixels, and a second captured image that has the same resolution as the original captured image, with no pixels thinned out.

The silhouette image generation unit 102 uses a color filter to generate a silhouette image for every first captured image. As explained with reference to FIG. 2, the blue back 900 is placed behind the subject during imaging because the background has to be cut out, so the background of each captured image is captured in the blue of the blue back.
Because the background is captured in this blue, the silhouette image generation unit 102 can easily cut out the background from its color information when generating the silhouette image.
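The color-filter step described above can be illustrated with a minimal sketch. The source does not specify the filter; the rule used here (blue exceeding both red and green by a fixed margin) and the names `silhouette` and `blue_margin` are assumptions made only for illustration.

```python
def silhouette(image, blue_margin=40):
    """Return a binary mask: 1 for subject pixels, 0 for blue background.

    image is a list of rows of (r, g, b) tuples with 0-255 channels.
    A pixel counts as background when blue exceeds both red and green
    by at least blue_margin (an assumed threshold, not from the source).
    """
    mask = []
    for row in image:
        mask.append([0 if (b - max(r, g)) >= blue_margin else 1
                     for (r, g, b) in row])
    return mask


# Toy 3x3 image: blue backdrop with three subject-colored pixels.
BLUE = (10, 20, 230)
SKIN = (200, 160, 140)
image = [[BLUE, BLUE, BLUE],
         [BLUE, SKIN, BLUE],
         [BLUE, SKIN, SKIN]]
mask = silhouette(image)
```

A fixed margin is the simplest possible choice; a practical color filter would likely work in a hue-based color space to tolerate lighting variation.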

FIG. 5 is a diagram explaining the process of generating a silhouette image from the color information of a first captured image. FIG. 5(a) shows a first captured image in which the background of the subject 600 is blue. FIG. 5(b) shows the silhouette image obtained by applying a color filter based on the blue color information, which cuts out the background and leaves only the silhouette of the subject 600. A silhouette image like that of FIG. 5(b) is generated from every image captured by the imaging devices mounted on the columns of the dome frame 500, that is, from all 16 first captured images in this embodiment.

Returning to FIG. 4, the VF processing unit 103 uses the Visual_Hull method to generate a provisional three-dimensional shape from the 16 silhouette images generated by the silhouette image generation unit 102. The Visual_Hull is the intersection of the cones formed by the silhouette images, which are generated from images captured from many directions, and the optical axes of the respective imaging devices, so it yields a provisional three-dimensional shape that is a rough three-dimensional shape of the subject 600. Visual_Hull also produces the provisional three-dimensional shape stably, without being strongly affected by the imaging conditions.
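The intersection-of-cones idea can be sketched on a voxel grid: a voxel survives only if it projects inside every silhouette. This is a simplified illustration, not the source's implementation; the toy views below use orthographic projection (real cameras are perspective), and `visual_hull`, `front`, and `side` are hypothetical names.

```python
def visual_hull(grid_size, views):
    """Keep the voxels whose projection falls inside every silhouette.

    views is a list of (project, mask) pairs; project maps a voxel
    (x, y, z) to a pixel (row, col) of the binary silhouette mask.
    """
    kept = set()
    for x in range(grid_size):
        for y in range(grid_size):
            for z in range(grid_size):
                inside_all = True
                for project, mask in views:
                    row, col = project((x, y, z))
                    if not mask[row][col]:
                        inside_all = False
                        break
                if inside_all:
                    kept.add((x, y, z))
    return kept


# Two toy orthographic views of a 3x3x3 grid.
# front looks along z and sees a plus shape indexed as mask[y][x].
front = (lambda v: (v[1], v[0]), [[0, 1, 0], [1, 1, 1], [0, 1, 0]])
# side looks along x and sees a vertical bar indexed as mask[y][z].
side = (lambda v: (v[1], v[2]), [[0, 1, 0], [0, 1, 0], [0, 1, 0]])
hull = visual_hull(3, [front, side])
```

With only two views the hull is coarse, which mirrors the text's point that the Visual Hull result is only a rough shape.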

FIG. 6 is a diagram showing the generation of a three-dimensional shape from a plurality of silhouette images of the subject taken from different imaging directions. FIG. 6(a) shows the silhouette images of the subject, taken from different imaging directions, that were generated by the silhouette image generation unit 102. FIG. 6(b) shows the provisional three-dimensional shape generated from the silhouette images of FIG. 6(a). The feature points of the provisional three-dimensional shape obtained with Visual_Hull have low accuracy. For this reason, corresponding points between the first captured images to be morphed that are derived from this provisional three-dimensional shape do not reach the accuracy that the morphing process requires.

The provisional three-dimensional shape is therefore corrected by the stereo matching method described below. In general, however, the corresponding points found by stereo matching are sparse and include outliers, so they cannot be used as they are to correct the provisional three-dimensional shape. The stereo matching process described next is therefore performed.

Returning to FIG. 4, the imaging device selection unit 104 groups the imaging devices provided on the dome frame 500 into sets of two or more devices, generating a plurality of imaging device sets.
FIG. 7 is a diagram explaining the generation of first imaging device sets, each of which is a set of two or more imaging devices. FIG. 7(a) shows the case where four adjacent imaging devices form one set. In this embodiment there are 16 imaging devices, so nine sets are generated. FIG. 7(a) corresponds to the arrangement of imaging devices on the columns of the dome frame 500 shown in FIG. 3(a). FIG. 7(b) shows the case where three adjacent imaging devices form one first imaging device set. In this embodiment there are 16 imaging devices, so 15 sets are generated. FIG. 7(b) corresponds to the arrangement of imaging devices on the columns of the dome frame 500 shown in FIG. 3(b). The imaging devices can be combined freely, but stereo matching gives better results when the captured images come from imaging devices that are close to each other, so adjacent imaging devices are selected as a first imaging device set, as shown in FIG. 7(a) and FIG. 7(b).
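The count of nine sets for the FIG. 7(a) layout follows from sliding a 2x2 window over a 4x4 grid of cameras (4 columns, 4 cameras each), giving 3 x 3 positions. A small sketch, assuming that each set is such a 2x2 block of neighbors; the (column, height) labels are illustrative and are not the reference numerals used in the source.

```python
def adjacent_sets(columns, per_column):
    """List every 2x2 block of neighboring cameras on a grid.

    Cameras are labelled (column index, height index); these labels
    are hypothetical and only serve this illustration.
    """
    sets = []
    for c in range(columns - 1):
        for h in range(per_column - 1):
            sets.append([(c, h), (c, h + 1), (c + 1, h), (c + 1, h + 1)])
    return sets


first_sets = adjacent_sets(columns=4, per_column=4)
print(len(first_sets))  # 9
```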

For each first imaging device set generated by the imaging device selection unit 104, the stereo matching unit 105 performs stereo matching to find common corresponding points between the first captured images captured by the imaging devices in that set. Specifically, the stereo matching unit 105 selects one of the first captured images captured by the imaging devices in the set as the template, matches it against another first captured image, and takes the detected corresponding points as common corresponding points. This stereo matching is carried out between the first captured images of each first imaging device set. Various stereo matching methods have been proposed, and a more accurate matching method gives better results.
However, the method of this embodiment tolerates corresponding points that are sparse and that include outliers. To demonstrate this robustness, the experiment with the dome-type imaging apparatus used SAD (Sum of Absolute Difference), one of the most basic matching methods.

SAD cuts a local region out of one captured image as a template, compares the luminance of the template pixels with that of the pixels of another captured image pixel by pixel, and takes as the corresponding point the local region for which the sum of the absolute luminance differences is smallest. Using SAD, the stereo matching unit 105 extracts, for each first imaging device set, the common corresponding points across all captured images in the set.
For example, when a first imaging device set consists of the three imaging devices 53_A, 53_B, and 53_C shown in FIG. 7(b), the stereo matching unit 105 selects two pairs from these three devices: the imaging devices 53_A and 53_B, and the imaging devices 53_A and 53_C.
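The SAD criterion described above can be sketched in a few lines. This is a minimal exhaustive search over a grayscale image given as nested lists; the function names `sad` and `best_match` are hypothetical, and a real matcher would restrict the search to the epipolar line rather than scan the whole image.

```python
def sad(template, image, top, left):
    """Sum of absolute luminance differences between template and a patch."""
    return sum(abs(template[r][c] - image[top + r][left + c])
               for r in range(len(template))
               for c in range(len(template[0])))


def best_match(template, image):
    """Return (top, left) of the patch in image with the smallest SAD."""
    th, tw = len(template), len(template[0])
    candidates = [(sad(template, image, top, left), top, left)
                  for top in range(len(image) - th + 1)
                  for left in range(len(image[0]) - tw + 1)]
    _, top, left = min(candidates)
    return top, left


image = [[10, 10, 10, 10],
         [10, 50, 60, 10],
         [10, 70, 80, 10],
         [10, 10, 10, 10]]
template = [[50, 60],
            [70, 80]]
print(best_match(template, image))  # (1, 1): the SAD is 0 at that offset
```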

The stereo matching unit 105 uses one of the imaging devices 53_A and 53_B as the template and searches for common corresponding points along the epipolar line. Here, the stereo matching unit 105 generates the template from the first captured image of the imaging device 53_A, runs SAD against the first captured images of the imaging devices 53_B and 53_C, and extracts corresponding points. If an extracted corresponding point lies on the epipolar line in the first captured image of each of the imaging devices 53_B and 53_C, the stereo matching unit 105 takes it as a common corresponding point.

When a first imaging device set consists of the four imaging devices 53_A, 53_B, 53_C, and 53_D, as shown in FIG. 7(a), the same processing is performed as for the set of three imaging devices described above. That is, the stereo matching unit 105 generates a template from the first captured image of the imaging device 53_A, runs SAD with this template against the first captured images of the imaging devices 53_B, 53_C, and 53_D, and extracts corresponding points. If the corresponding points extracted from the first captured images of the imaging devices 53_B, 53_C, and 53_D lie on the epipolar lines, the stereo matching unit 105 takes them as common corresponding points.
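The "lies on the epipolar line" test can be illustrated as a point-to-line distance check. The sketch below assumes a fundamental matrix `F` relating image A to image B is available from the calibration; the source does not say how the test is implemented, and the tolerance `tol` and both function names are hypothetical.

```python
import math


def epipolar_distance(F, p, q):
    """Distance in pixels from q (image B) to the epipolar line of p (image A).

    F is a 3x3 fundamental matrix given as nested lists; the epipolar
    line of p in image B is l = F @ (px, py, 1).
    """
    x = (p[0], p[1], 1.0)
    a, b, c = (sum(F[i][j] * x[j] for j in range(3)) for i in range(3))
    return abs(a * q[0] + b * q[1] + c) / math.hypot(a, b)


def is_common_point(F, p, q, tol=1.0):
    """Accept the match only if q is within tol pixels of the epipolar line."""
    return epipolar_distance(F, p, q) <= tol


# Fundamental matrix of an ideal rectified pair: epipolar lines are image rows.
F_RECTIFIED = [[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]]
same_row = is_common_point(F_RECTIFIED, (120, 45), (98, 45))
off_row = is_common_point(F_RECTIFIED, (120, 45), (98, 52))
```

For a set of three or four cameras, the same check would be repeated against each non-template image, and a point kept only if it passes in all of them.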

FIG. 8 is a diagram showing the result of the common corresponding point extraction performed by the stereo matching unit 105 using SAD. FIG. 8(a) shows the first captured images 200A, 200B, 200C, and 200D of the imaging devices 53_A, 53_B, 53_C, and 53_D, respectively. FIG. 8(b) shows the common corresponding points in the first captured image 200A, extracted by generating a template from the first captured image 200A and comparing it with each of the first captured images 200B, 200C, and 200D. In FIG. 8(b), the points on the first captured image 200A that are common corresponding points with each of the first captured images 200B, 200C, and 200D are shown as black circles. FIG. 8(b) shows that the common corresponding points can only be extracted sparsely. The common corresponding points shown in FIG. 8(b) are generated for each first imaging device set.

Returning to FIG. 4, the three-dimensional shape generation unit 106 uses the common corresponding points of FIG. 8(b) to correct the provisional three-dimensional shape of FIG. 6(b), which the VF processing unit 103 generated by VH (Visual Hull), and generates a partial three-dimensional shape.
The three-dimensional shape generation unit 106 creates a triangular mesh by Delaunay triangulation, using the common corresponding points obtained by stereo matching as vertices.
FIG. 9 is a diagram showing the triangular mesh created by Delaunay triangulation with the common corresponding points shown in FIG. 8(b) as vertices. In FIG. 9, because the first captured image 200A is the captured image from which the template was generated, the triangular mesh is generated with the common corresponding points on the first captured image 200A as vertices.
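To make the Delaunay step concrete, here is a brute-force sketch based on the defining property: a triangle belongs to the triangulation when no other point lies strictly inside its circumcircle. It runs in O(n^4) time and is meant only as an illustration for a handful of points; it is not the method of the source, and a practical system would use a library routine. The function names are hypothetical.

```python
from itertools import combinations


def orientation(a, b, c):
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def circumcircle_contains(a, b, c, p):
    """True if p lies strictly inside the circle through a, b, and c."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # The sign of det depends on the winding of abc, so normalize it.
    return det * orientation(a, b, c) > 0


def delaunay(points):
    """Return index triples whose circumcircle contains no other point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        if orientation(a, b, c) == 0:
            continue  # collinear points do not form a triangle
        others = (points[m] for m in range(len(points)) if m not in (i, j, k))
        if not any(circumcircle_contains(a, b, c, p) for p in others):
            triangles.append((i, j, k))
    return triangles


# Four corners of a rectangle plus one interior point.
points = [(0, 0), (4, 0), (4, 3), (0, 3), (1, 1)]
mesh = delaunay(points)
```

In this example the interior point is connected to all four corners, so the mesh is a fan of four triangles around it.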

Returning to FIG. 4, the three-dimensional shape generation unit 106 reads the camera parameters (such as the position and focal length of each imaging device) from the camera parameter table in the database 114 and, using these parameters, obtains the coordinates of the common corresponding points in three-dimensional space by triangulation from the first captured images of the three imaging devices. The camera parameter table is written to and stored in the database 114 in advance for each combination of imaging device type and imaging device arrangement in the dome-type imaging apparatus.
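One common way to triangulate a point from calibrated cameras is the midpoint method, sketched below for two viewing rays. This is an assumption for illustration: the source only says "triangulation" and uses three imaging devices. The sketch further assumes that each matched pixel has already been back-projected, using the camera parameters, into a ray with origin `c` (camera center) and direction `d`; the function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    w = [a - b for a, b in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b  # zero when the rays are parallel
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2 for u, v in zip(p1, p2)]


# Two cameras 2 units apart, both looking at the point (0, 0, 5).
point = triangulate_midpoint((-1.0, 0.0, 0.0), (1.0, 0.0, 5.0),
                             (1.0, 0.0, 0.0), (-1.0, 0.0, 5.0))
```

With a third camera, the same idea extends to a least-squares point closest to all three rays.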

The three-dimensional shape generation unit 106 then associates the triangular mesh of FIG. 9 with the common corresponding points whose three-dimensional coordinates have been obtained, and places the triangular mesh in three-dimensional space.
FIG. 10 is a diagram showing the triangular mesh in three-dimensional space generated by the three-dimensional shape generation unit 106. In FIG. 10, the three-dimensional shape formed by this triangular mesh is a shell that surrounds the provisional three-dimensional shape generated by the VF processing unit 103.

Returning to FIG. 4, the three-dimensional shape generation unit 106 compares the three-dimensional shape formed by the triangular mesh shown in FIG. 10 with the provisional three-dimensional shape shown in FIG. 6(b). Where a region of the provisional three-dimensional shape is nearer to the imaging device than the three-dimensional shape of the mesh, as seen along the direction of the imaging device, the three-dimensional shape generation unit 106 removes from the provisional three-dimensional shape the region that protrudes toward the imaging device beyond the mesh shape (the shell formed by the triangular mesh). The three-dimensional shape generation unit 106 then outputs the provisional three-dimensional shape with the protruding regions removed as a partial three-dimensional shape, that is, a corrected provisional three-dimensional shape. Because all imaging devices are assumed to be calibrated with respect to their position coordinates in three-dimensional space, the relative coordinates and orientations of all imaging devices are known. All imaging devices therefore share a common coordinate system (the world coordinate system), and the common corresponding points of each first imaging device set are mapped into the world coordinate system as the coordinates of the vertices of the three-dimensional shape made up of the common corresponding points.
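The removal of protruding regions can be pictured as a per-ray depth comparison. The sketch below is a deliberately simplified model, not the source's procedure: the provisional shape is a set of voxels, the camera is orthographic and looks along +z, and the mesh is reduced to a depth value per viewing ray. The names `carve` and `mesh_depth` are hypothetical.

```python
def carve(voxels, mesh_depth):
    """Drop voxels that stick out of the mesh shell toward the camera.

    voxels: set of (x, y, z); the toy camera looks along +z, so a smaller
    z is nearer to the camera. mesh_depth maps (x, y) to the z of the mesh
    surface on that viewing ray; rays without mesh data are left untouched.
    """
    return {(x, y, z) for (x, y, z) in voxels
            if (x, y) not in mesh_depth or z >= mesh_depth[(x, y)]}


# Two voxel columns; the mesh covers only the first one, at depth 2.
provisional = {(0, 0, z) for z in range(4)} | {(1, 0, z) for z in range(4)}
partial = carve(provisional, {(0, 0): 2})
```

Leaving uncovered rays untouched reflects the fact that the mesh vertices are sparse, so the mesh constrains only part of the provisional shape.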

FIG. 11 shows partial three-dimensional shapes generated from the comparison, performed by the three-dimensional shape generation unit 106, between the temporary three-dimensional shape and the three-dimensional shape formed by the mesh. Each of the partial three-dimensional shapes in FIGS. 11A, 11B, and 11C is missing part of the shape, because the corresponding points obtained by stereo matching include outliers.
As described above, one piece of partial three-dimensional shape data of the kind shown in FIG. 11 is generated for each first imaging device set. If stereo matching is performed in each of n first imaging device sets, n pieces of partial three-dimensional shape data are created. The partial three-dimensional shapes shown in FIGS. 11A, 11B, and 11C were generated by different first imaging device sets, and the shape differs from set to set.

Returning to FIG. 4, the three-dimensional shape synthesis unit 107 combines the partial three-dimensional shapes that the three-dimensional shape generation unit 106 generated for the respective first imaging device sets.
The three-dimensional shape synthesis unit 107 superimposes all of the partial three-dimensional shapes generated by the three-dimensional shape generation unit 106 in the world coordinate system described above, aligning common corresponding points that have the same coordinate values. It then subdivides the space containing the superimposed partial three-dimensional shapes and detects, for each subdivided region, how many partial three-dimensional shapes overlap there.

The three-dimensional shape synthesis unit 107 then determines, for each subdivided region, whether the number of partial three-dimensional shapes overlapping in that region is equal to or greater than a preset threshold. If the number is equal to or greater than the threshold, the three-dimensional shape synthesis unit 107 determines that the three-dimensional shape exists in that region. If the number is less than the threshold, it determines that the three-dimensional shape does not exist in that region. The three-dimensional shape synthesis unit 107 then synthesizes the three-dimensional shape from the pixels of the regions determined to contain the shape, within the composite obtained by superimposing the partial three-dimensional shapes.
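The threshold test above can be sketched as a simple voting scheme. This is a minimal illustration, assuming each partial three-dimensional shape has already been discretized into a set of integer cell indices in the world coordinate system; the function name `merge_partial_shapes` and the cell representation are hypothetical and are not taken from the source.

```python
from collections import Counter


def merge_partial_shapes(partial_shapes, threshold):
    """Keep every cell covered by at least `threshold` partial shapes."""
    votes = Counter()
    for shape in partial_shapes:
        # Converting to a set ensures a shape votes at most once per cell.
        votes.update(set(shape))
    return {cell for cell, count in votes.items() if count >= threshold}


# Three toy partial shapes given as sets of (x, y, z) cell indices.
shapes = [
    {(0, 0, 0), (1, 0, 0), (2, 0, 0)},
    {(0, 0, 0), (1, 0, 0)},
    {(1, 0, 0), (5, 5, 5)},  # (5, 5, 5) is an isolated outlier
]
merged = merge_partial_shapes(shapes, threshold=2)
print(sorted(merged))  # [(0, 0, 0), (1, 0, 0)]
```

Cells supported by only one partial shape, such as the outlier in the third set, fall below the threshold and are dropped.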

FIG. 12 illustrates generation of the three-dimensional shape of the subject when the 16 imaging devices provided in the dome-type imaging device are combined four at a time into nine first imaging device sets and the partial three-dimensional shape of each set is used. FIG. 12A shows the partial three-dimensional shapes generated by the respective first imaging device sets. FIG. 12B shows the three-dimensional shape synthesized by superimposing the nine partial three-dimensional shapes and keeping the subdivided regions in which at least two partial three-dimensional shapes overlap, two being the threshold. Generating the three-dimensional shape in this way, by superimposing the partial three-dimensional shapes and combining the regions in which at least the threshold number of shapes is present, removes many outliers, in addition to those removed by the comparison between the temporary three-dimensional shape and the shape formed by the triangular mesh.

Returning to FIG. 4, the corresponding point search unit 108 uses the synthesized three-dimensional shape shown in FIG. 12B to search for first corresponding points, which are corresponding points in the second captured images. Because morphing with the second captured images is performed on the basis of these first corresponding points, the search is carried out on the high-resolution second captured images used for morphing. A second captured image has a higher resolution than a first captured image. It is the image as captured (at its original resolution) by an imaging device provided in the dome-type imaging device. The second captured images are used in the morphing process described later.
The corresponding point search unit 108 reads the camera parameters of each imaging device from the database 114 and extracts corresponding points between the second captured images of the imaging devices by three-dimensional geometric calculation (triangulation). In searching for the first corresponding points, the corresponding point search unit 108 uses virtual first captured images, each obtained by projecting the pixels of the three-dimensional shape onto the two-dimensional plane perpendicular to the imaging direction of an imaging device, that is, the two-dimensional plane corresponding to that device's imaging surface (described later).
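The projection that produces a virtual first captured image can be sketched as follows. This is a minimal illustration assuming an ideal pinhole camera without lens distortion; the function name `project_point` and the parameters `rotation`, `translation`, `focal`, and `center` are illustrative assumptions and do not appear in the source.

```python
def project_point(point, rotation, translation, focal, center):
    """Project a world-space point onto a pinhole camera's image plane.

    rotation (3x3) and translation (3) map world coordinates to camera
    coordinates; focal is in pixels and center is the principal point.
    Returns (u, v, depth), or None if the point is not in front of the camera.
    """
    cam = [
        sum(rotation[row][col] * point[col] for col in range(3)) + translation[row]
        for row in range(3)
    ]
    depth = cam[2]
    if depth <= 0:
        return None
    u = focal * cam[0] / depth + center[0]
    v = focal * cam[1] / depth + center[1]
    return (u, v, depth)


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
vertex = (1.0, 2.0, 10.0)
pixel = project_point(vertex, identity, (0, 0, 0), focal=100.0, center=(50.0, 50.0))
print(pixel)  # (60.0, 70.0, 10.0)
```

Keeping the depth alongside the pixel position preserves the link between each projected pixel and its position in three-dimensional space.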

Morphing becomes possible once the first corresponding points have been searched for as described above and the result of that search is used. However, because the accuracy of the synthesized three-dimensional shape is not high, exact corresponding points between the three-dimensional shape and the second captured images cannot be obtained. If the morphing process is performed with inaccurate corresponding points, the errors in the corresponding points cause the image generated by the morphing process to be blurred or smeared.

To solve this problem, the corresponding point search unit 108 performs template matching in a local region between the three-dimensional shape and the second captured image when searching for the first corresponding points. The synthesized three-dimensional shape is not highly accurate because it is built from the first captured images, but its outliers have already been removed by the method described above, so no large error arises. The error in the first corresponding points that the corresponding point search unit 108 finds between the first captured image and the three-dimensional shape is therefore also small, and it can be corrected by template matching within a local region.

Returning to FIG. 4, the corresponding point search unit 108 performs template matching with the three-dimensional shape on the first captured images captured by all of the imaging devices included in the first imaging device set shown in FIG. 7. To do so, the corresponding point search unit 108 projects the three-dimensional shape onto the two-dimensional plane corresponding to the imaging surface of each imaging device in the first imaging device set (the plane perpendicular to that device's imaging direction) and generates a virtual first captured image. In this way, for each imaging device in the set, the corresponding point search unit 108 obtains a virtual first captured image, which is a virtual capture of the synthesized three-dimensional shape corresponding to the first captured image actually captured by that device. The corresponding point search unit 108 then performs local template matching between the first captured image captured by an imaging device and the virtual first captured image corresponding to that device.

Through template matching between the pixels of the first captured image and the pixels of the virtual first captured image, the corresponding point search unit 108 extracts the correspondence between them. Because the coordinates in three-dimensional space of each pixel projected onto the virtual first captured image are known, the corresponding point search unit 108 can determine the coordinates in three-dimensional space of each pixel of the first captured image. The corresponding point search unit 108 performs this processing for all first imaging device sets and extracts the correspondence between each pixel of the three-dimensional shape and each pixel of the first captured images.

The occlusion area search unit 109 searches for occlusion areas while the corresponding point search unit 108 searches for corresponding points for points on the surface of the three-dimensional shape. An occlusion area is an area in which no corresponding point can be extracted because of an obstacle. When the first corresponding points between two second captured images are searched for, the occlusion area search unit 109 therefore detects an occlusion area according to whether something (the three-dimensional shape itself) lies between either imaging device and the corresponding point on the three-dimensional shape. By recording which imaging devices can capture the point and which cannot, the occlusion area search unit 109 detects where occlusion occurs.

That is, for each virtual first captured image onto which the corresponding point search unit 108 has projected the three-dimensional shape, the occlusion area search unit 109 detects the pixels that exist in the three-dimensional shape but not in that virtual first captured image, and writes them to the database 114 for each imaging device. When morphing is performed, the pixels of the three-dimensional shape that are not captured in the first captured image of a given imaging device can thus be identified.
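One common way to find surface points that are hidden from a camera is a depth-buffer test: among all points that project to the same pixel, only the nearest is visible. The sketch below uses that technique as a stand-in and is not claimed to be the method of the source. The function name `find_occluded`, the point identifiers, and the per-camera table are hypothetical.

```python
def find_occluded(projected, depth_tolerance=1e-6):
    """Return ids of points hidden behind a nearer point at the same pixel.

    projected maps a point id to (u, v, depth) in one camera's image.
    """
    nearest = {}
    for u, v, depth in projected.values():
        key = (round(u), round(v))
        if key not in nearest or depth < nearest[key]:
            nearest[key] = depth
    occluded = set()
    for point_id, (u, v, depth) in projected.items():
        if depth > nearest[(round(u), round(v))] + depth_tolerance:
            occluded.add(point_id)
    return occluded


# Projections of three surface points into two cameras (toy values).
projections = {
    "53_A": {"p70": (10.2, 20.1, 5.0), "q": (30.0, 20.0, 7.0), "r": (40.0, 8.0, 6.0)},
    "53_B": {"p70": (10.4, 19.9, 9.0), "q": (10.2, 20.1, 5.0), "r": (44.0, 9.0, 6.5)},
}
# Record, per camera, which points that camera cannot see.
occlusion_table = {cam: find_occluded(pts) for cam, pts in projections.items()}
print(occlusion_table)  # {'53_A': set(), '53_B': {'p70'}}
```

Here the point `p70` is visible from camera `53_A` but hidden behind `q` in camera `53_B`, mirroring the situation described for FIG. 13.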

FIG. 13 illustrates the occlusion area search performed by the occlusion area search unit 109. In FIG. 13, when the first corresponding points are searched for, the first corresponding point 70 (a pixel) is captured by the imaging device 53_A. From the imaging device 53_B, however, part of the three-dimensional shape 600' is an obstacle in front of the first corresponding point 70 (this is easily detected from the virtual first captured image described above). The line of sight of the imaging device along the line segment connecting the imaging device 53_B and the first corresponding point 70 is blocked, so the first corresponding point 70 is not captured by the imaging device 53_B.

Returning to FIG. 4, for each first imaging device set, the template matching unit 110 performs template matching between the first captured images captured by all of the imaging devices included in that set. The template matching unit 110 does not take the geometric consistency of the imaging devices into account, so it performs only template matching between the first captured images and does not perform stereo matching. This reduces the problems that arise when geometric consistency is enforced, namely that the consistency constraint prevents corresponding pixels from being detected, or that the colors of corresponding pixels (for example, their RGB (red/green/blue) gradation values) fail to match. Geometric consistency is desirable where it can be achieved, but it is generally difficult to satisfy it while also detecting a corresponding point for every pixel between captured images.

For example, when the first imaging device set consists of the three imaging devices 53_A, 53_B, and 53_C as shown in FIG. 5B, the template matching unit 110 performs template matching between the first captured images of the imaging devices in the following combinations:
F1: template matching against the first captured image of the imaging device 53_B, using the first captured image of the imaging device 53_A as the template
F2: template matching against the first captured image of the imaging device 53_A, using the first captured image of the imaging device 53_B as the template
F3: template matching against the first captured image of the imaging device 53_C, using the first captured image of the imaging device 53_A as the template
F4: template matching against the first captured image of the imaging device 53_A, using the first captured image of the imaging device 53_C as the template
F5: template matching against the first captured image of the imaging device 53_C, using the first captured image of the imaging device 53_B as the template
F6: template matching against the first captured image of the imaging device 53_B, using the first captured image of the imaging device 53_C as the template
Thus, for each first imaging device set, template matching is performed twice when the set consists of two imaging devices, six times when it consists of three, and twelve times when it consists of four, that is, once for every ordered pair of imaging devices in the set.
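The counts of 2, 6, and 12 correspond to the number of ordered (template, target) pairs, n(n - 1) for n imaging devices. The following short sketch enumerates them; the function name `matching_pairs` is illustrative only, and the enumeration order differs from the F1 to F6 listing.

```python
from itertools import permutations


def matching_pairs(devices):
    """List every ordered (template device, target device) pair."""
    return list(permutations(devices, 2))


pairs = matching_pairs(["53_A", "53_B", "53_C"])
for template_device, target_device in pairs:
    print(f"template {template_device} -> search in {target_device}")

# Number of matchings per set size: 2 -> 2, 3 -> 6, 4 -> 12.
counts = {n: len(matching_pairs(range(n))) for n in (2, 3, 4)}
print(counts)  # {2: 2, 3: 6, 4: 12}
```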

The template matching unit 110 uses as the template a predetermined area centered on a corresponding point (a pixel already extracted by the corresponding point search unit 108) and restricts the search range to the vicinity of the corresponding point. As described above, the corresponding point search unit 108 has already extracted the corresponding points between the three-dimensional shape and the first captured images, so it is known which pixel of the three-dimensional shape each pixel of a first captured image corresponds to. When performing template matching, the template matching unit 110 therefore takes as its reference the pixel coordinates of the corresponding points of the three-dimensional shape detected by the corresponding point search unit 108.

FIG. 14 illustrates the template matching between first captured images performed by the template matching unit 110. In FIG. 14, for example, the first captured image 601 is the first captured image captured by the imaging device 53_A, and the first captured image 602 is the first captured image captured by the imaging device 53_B. A template frame 611 is shown in the first captured image 601, and a search range 612 (a local region) is shown in the first captured image 602. The template frame 611 is a rectangular frame of predetermined size centered on the corresponding point 621. The search range 612 is a rectangular frame larger than the template frame 611, centered on the corresponding point 622. The corresponding points 621 and 622 are the corresponding points (pixels) in the first captured images 601 and 602, respectively, that the corresponding point search unit 108 extracted for the same coordinates on the three-dimensional shape. In this way, the template matching unit 110 detects the first corresponding points, which are corresponding points common to the first captured images captured by the imaging devices in the first imaging device set.
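A local template search of this kind can be sketched as below. The similarity measure (sum of squared differences), the grayscale nested-list image format, and the names `match_template_locally`, `half_template`, and `half_search` are assumptions made for illustration; the source does not specify them.

```python
def match_template_locally(template_image, target_image, template_center,
                           search_center, half_template=1, half_search=2):
    """Find the pixel near search_center in target_image whose neighbourhood
    best matches (lowest sum of squared differences) the template cut out
    around template_center in template_image. Coordinates are (row, col)."""
    t_row, t_col = template_center
    s_row, s_col = search_center
    offsets = range(-half_template, half_template + 1)
    rows, cols = len(target_image), len(target_image[0])
    best_cost, best_position = None, None
    for row in range(s_row - half_search, s_row + half_search + 1):
        for col in range(s_col - half_search, s_col + half_search + 1):
            # Skip candidates whose window would leave the target image.
            if not (half_template <= row < rows - half_template
                    and half_template <= col < cols - half_template):
                continue
            cost = sum(
                (template_image[t_row + dr][t_col + dc]
                 - target_image[row + dr][col + dc]) ** 2
                for dr in offsets for dc in offsets
            )
            if best_cost is None or cost < best_cost:
                best_cost, best_position = cost, (row, col)
    return best_position


def image_with_patch(size, patch, center):
    """Build a size x size zero image with a 3x3 patch pasted at center."""
    image = [[0] * size for _ in range(size)]
    for dr in range(3):
        for dc in range(3):
            image[center[0] - 1 + dr][center[1] - 1 + dc] = patch[dr][dc]
    return image


patch = [[1, 2, 3], [4, 9, 6], [7, 8, 5]]
image_601 = image_with_patch(7, patch, (3, 3))  # holds the template frame
image_602 = image_with_patch(7, patch, (4, 2))  # same texture, shifted
refined = match_template_locally(image_601, image_602, (3, 3), (3, 3))
print(refined)  # (4, 2)
```

The initial guess (3, 3) in the second image plays the role of the corresponding point 622 obtained from the three-dimensional shape, and the search corrects it to the position where the texture actually matches.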

Returning to FIG. 4, the virtual camera setting unit 111 selects camera parameters for the virtual camera from the camera parameters of a plurality of different types of imaging devices held in the camera parameter table stored in the database 114. In theory, there is no restriction on the camera parameters of the virtual camera.
In practice, however, because a morphing process is used, good results cannot be obtained under conditions that differ greatly from those of the imaging devices that captured the first captured images. The internal parameters (focal length, optical center, lens distortion, and so on) are therefore set to values close to those of the imaging devices used for capture. Likewise, setting the external parameters (such as the position in three-dimensional space) so that the virtual camera lies close to the line segment or plane connecting the imaging devices whose first captured images are used for morphing improves the result of the morphing process.

When the user sets the position of the virtual camera through input means (not shown), the morphing imaging device selection unit 112 sets the imaging devices to be used for morphing. That is, once the imaging direction (viewpoint direction) from which the three-dimensional shape of the subject is to be projected onto the imaging surface (a two-dimensional plane) of the virtual camera has been set, the morphing imaging device selection unit 112 sets the imaging devices whose first captured images are used for morphing. The virtual camera is placed on a line segment connecting imaging devices arranged in the dome-type imaging device, or on a surface (two-dimensional plane) formed by such imaging devices. In other words, the imaging devices used for the morphing process are limited to those that form the line segment or the surface on which the virtual camera lies. The camera parameters of the virtual camera depend on its position. When the position of the virtual camera coincides with that of an imaging device, the camera parameters of the virtual camera take the same values as those of that imaging device.

FIG. 15 shows the range in which the virtual camera can be placed when the morphing process that generates a virtual captured image, which is the captured image of the virtual camera, is performed with the imaging devices of a first imaging device set. In FIG. 15, for example, when the first imaging device set consists of the three imaging devices 53_A, 53_B, and 53_C, the virtual camera 55 can be placed on the triangular two-dimensional plane 760 formed by the imaging devices 53_A, 53_B, and 53_C. The direction from this plane toward the three-dimensional shape is the imaging direction of the virtual camera 55. Anywhere on the two-dimensional plane 760, the virtual captured image of the virtual camera 55 can be generated by morphing from the first captured images captured by the imaging devices 53_A, 53_B, and 53_C.
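One way to parametrize a position on the triangle formed by three imaging devices is a convex combination of their positions (barycentric weights that are non-negative and sum to 1). The sketch below illustrates that parametrization only; the source does not state how the position on the plane is represented, and the function name `point_on_triangle` is hypothetical.

```python
def point_on_triangle(e_a, e_b, e_c, weights):
    """Return w_a*e_a + w_b*e_b + w_c*e_c for weights inside the triangle."""
    w_a, w_b, w_c = weights
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return tuple(w_a * a + w_b * b + w_c * c for a, b, c in zip(e_a, e_b, e_c))


# Toy camera positions in world coordinates.
e_a, e_b, e_c = (0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0)
virtual_position = point_on_triangle(e_a, e_b, e_c, (0.5, 0.25, 0.25))
print(virtual_position)  # (1.0, 1.0, 0.0)
```

With weights (1, 0, 0) the result coincides with the first camera position, which matches the statement that the virtual camera takes on a device's parameters when their positions coincide.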

FIG. 16 illustrates the selection of the imaging devices used for the morphing process according to the arrangement of imaging devices in the dome-type imaging device shown in FIG. 3. FIG. 16A shows the position of the virtual camera 55 for the arrangement of imaging devices on the support columns of the dome-type imaging device shown in FIG. 3A. When the virtual camera is placed on the two-dimensional plane 701 formed by the four imaging devices 53_A, 53_B, 53_C, and 53_D of a first imaging device set, the morphing imaging device selection unit 112 selects the imaging devices 53_A, 53_B, 53_C, and 53_D as the imaging devices for morphing.
FIG. 16B shows the position of the virtual camera 55 for the arrangement of imaging devices on the support columns of the dome-type imaging device shown in FIG. 3B. When the virtual camera is placed on the two-dimensional plane 702 formed by the three imaging devices 53_A, 53_B, and 53_C of a first imaging device set, the morphing imaging device selection unit 112 selects the imaging devices 53_A, 53_B, and 53_C as the imaging devices for morphing.

Returning to FIG. 4, the virtual camera image generation unit 113 performs the morphing process that generates the virtual captured image of the virtual camera from the first captured images captured by the selected imaging devices. In the following description of the operation of the virtual camera image generation unit 113, the first imaging device set is assumed to consist of two imaging devices, imaging device 1 and imaging device 2. The virtual camera v is then placed on the line connecting imaging device 1 and imaging device 2. Let Pv be the coordinates of a pixel in the virtual captured image of the virtual camera v, and let Pi be the coordinates of the corresponding pixel, found by the corresponding point search unit 108, in the first captured image captured by imaging device i (i = 1, 2). Let Ci(x) be the color (RGB gradation data) at pixel coordinates x in the image of imaging device i, and let Ei be the position of imaging device i. The position Ev of the virtual camera v in the world coordinate system can then be expressed with a parameter α (0 ≤ α ≤ 1) by equation (1) below. The positions and coordinates in the following description are coordinates on the two-dimensional plane of the virtual captured image of the virtual camera v. The coordinate points E1 and E2, the coordinate points P1 and P2, and so on, on the two-dimensional planes of the first captured images are mapped onto these two-dimensional coordinates by coordinate transformation.
Ev = (1 - α) E1 + α E2 ... (1)

The coordinates Pv can be obtained geometrically from the coordinates Pi of the corresponding points and the camera parameters of the imaging devices i. The camera parameters of imaging device 1 and imaging device 2 are constants, and the camera parameters of the virtual camera depend on the position Ev. Because the position Ev depends on the parameter α, the coordinates Pv are given by equation (2) below, using a function F that depends on the corresponding point coordinates Pi and on α.
Pv = F(P1, P2, α) ... (2)

As a precondition for equation (2), when Ev = E1 the virtual camera v has the same camera parameters as imaging device 1, so Pv = P1. Similarly, when Ev = E2 the virtual camera v has the same camera parameters as imaging device 2, so Pv = P2. Equations (3) and (4) below therefore hold, with Pv equal to P1 in equation (3) and to P2 in equation (4).
Pv = F(P1, P2, 0) ... (3)
Pv = F(P1, P2, 1) ... (4)

The virtual camera image generation unit 113 therefore obtains the morphed color Cv in this case from equation (5) below, on the basis that the color depends on the position Ev of the virtual camera.
Cv(Pv) = (1 - α) C1(P1) + α C2(P2) ... (5)
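Equations (1) and (5) are both linear interpolations controlled by the same parameter α. A minimal sketch follows, with positions and RGB colors represented as tuples; the function names are illustrative and not taken from the source.

```python
def lerp(first, second, alpha):
    """Return (1 - alpha) * first + alpha * second, component by component."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple((1 - alpha) * a + alpha * b for a, b in zip(first, second))


def virtual_camera_position(e1, e2, alpha):
    """Equation (1): Ev = (1 - alpha) E1 + alpha E2."""
    return lerp(e1, e2, alpha)


def blend_color(c1_at_p1, c2_at_p2, alpha):
    """Equation (5): Cv(Pv) = (1 - alpha) C1(P1) + alpha C2(P2)."""
    return lerp(c1_at_p1, c2_at_p2, alpha)


alpha = 0.25
ev = virtual_camera_position((0.0, 0.0, 0.0), (4.0, 0.0, 8.0), alpha)
cv = blend_color((200, 100, 0), (100, 200, 40), alpha)
print(ev)  # (1.0, 0.0, 2.0)
print(cv)  # (175.0, 125.0, 10.0)
```

At α = 0 the blend returns the color from imaging device 1 unchanged, and at α = 1 the color from imaging device 2.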

In practice, however, the coordinates P1 and P2 contain errors as corresponding points, so Pv, P1, and P2 are not in an exact corresponding-point relationship. For this reason, as already described, the template matching unit 110 performs template matching of the first corresponding points between the first captured images used for morphing.

Next, the process by which the virtual camera image generation unit 113 obtains the color Cv of each pixel of the virtual captured image from the first captured images captured by imaging device 1 and imaging device 2, using the result of the template matching performed by the template matching unit 110, is described.
When the first captured image captured by imaging device j is used as the template and template matching is performed against the first captured image captured by imaging device i, the point found in the image of imaging device i as corresponding to the point Pj is denoted by the coordinates P'ij. The coordinate difference (distance) between the coordinates P'ij and the coordinates Pi is denoted Dij (= P'ij - Pi). For example, when the first captured image captured by imaging device 1 is used as the template and matched against the first captured image captured by imaging device 2, the coordinates P'21 are detected as the point corresponding to the coordinates P1, and the distance between the coordinates P2 and the coordinates P'21 is D21.

The template matching performed by the template matching unit 110 thus yields two corresponding-point pairs. When the first captured image captured by imaging device 1 is used as the template and matched against the first captured image captured by imaging device 2, the coordinates P'21 are obtained as the point corresponding to the coordinates P1, giving the corresponding-point pair (P1, P'21). Conversely, when the first captured image captured by imaging device 2 is used as the template and matched against the first captured image captured by imaging device 1, the coordinates P'12 are obtained as the point corresponding to the coordinates P2, giving the corresponding-point pair (P2, P'12).
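The two corresponding-point pairs and the differences D21 and D12 can be illustrated with toy pixel coordinates. The numeric values and the function name `displacement` below are made up for illustration only.

```python
def displacement(refined, initial):
    """D_ij = P'_ij - P_i, computed per coordinate."""
    return (refined[0] - initial[0], refined[1] - initial[1])


# Initial corresponding points obtained from the three-dimensional shape.
p1, p2 = (120, 80), (131, 78)
# Points found by template matching in each direction (toy values).
p21_refined = (133, 77)  # found in image 2 using a template around P1
p12_refined = (119, 82)  # found in image 1 using a template around P2

pair_from_1 = (p1, p21_refined)  # (P1, P'21)
pair_from_2 = (p2, p12_refined)  # (P2, P'12)
d21 = displacement(p21_refined, p2)
d12 = displacement(p12_refined, p1)
print(d21, d12)  # (2, -1) (-1, 2)
```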

この対応点組（Ｐ１，Ｐ’２１）及び対応点組（Ｐ２，Ｐ’１２）の各々は、以下の（６）式及び（７）式に示すように、（２）式がなりたたないため、座標Ｐｖと対応する位置関係にはない。
Ｐｖ≠Ｆ（Ｐ１，Ｐ’２１，α） …（６）
Ｐｖ≠Ｆ（Ｐ２，Ｐ’１２，α） …（７） Neither the corresponding point set (P1, P′21) nor the corresponding point set (P2, P′12) is in the positional relationship corresponding to the coordinate Pv, because equation (2) does not hold for them, as shown in the following equations (6) and (7).
Pv ≠ F (P1, P′21, α) (6)
Pv ≠ F (P2, P′12, α) (7)

（６）式及び（７）式に示すように、求めようとしている座標Ｐｖに対応する対応点の組が存在しないことが分かる。上述した座標Ｐｖに対応する対応点組を、例えば（Ｐ１＋δ，Ｐ２＋δＰ２）で表した場合、以下の（８）式が成り立つ。
Ｐｖ＝Ｆ（Ｐ１＋δ，Ｐ２＋δＰ２，α） …（８） As shown in equations (6) and (7), the template matching results contain no set of corresponding points that corresponds to the coordinate Pv to be obtained. When the corresponding point set corresponding to the coordinate Pv is expressed as, for example, (P1 + δP1, P2 + δP2), the following equation (8) holds.
Pv = F (P1 + δP1, P2 + δP2, α) (8)

また、（３）式及び（４）式で示した前提条件からα＝０、つまり仮想カメラｖが撮像装置１と重なる場合、Ｐｖ＝Ｐ１となり、一方、α＝１、つまり仮想カメラｖが撮像装置２と重なる場合、以下の（９）式及び（１０）式が成り立つ。
Ｐ１＝Ｆ（Ｐ１＋δ，Ｐ２＋δＰ２，０） …（９）
Ｐ２＝Ｆ（Ｐ１＋δ，Ｐ２＋δＰ２，１） …（１０） In addition, from the preconditions expressed by equations (3) and (4), Pv = P1 when α = 0, that is, when the virtual camera v coincides with the imaging device 1, and Pv = P2 when α = 1, that is, when the virtual camera v coincides with the imaging device 2. The following equations (9) and (10) therefore hold.
P1 = F (P1 + δP1, P2 + δP2, 0) (9)
P2 = F (P1 + δP1, P2 + δP2, 1) (10)

そして、仮想カメラｖと撮像装置１とが重なる場合、仮想カメラｖの対応点の座標Ｐｖと撮像装置１の対応点の座標Ｐ１とは同一の座標となるため、以下の（１１）式が成り立つ。
Ｐｖ＝Ｐ１＋δＰ１＝Ｐ１ …（１１）
上記（１１）式により、α＝０の場合、δＰ１＝０となる。同様に、α＝１の場合、δＰ２＝０となる。 When the virtual camera v and the imaging device 1 coincide, the coordinate Pv of the corresponding point of the virtual camera v and the coordinate P1 of the corresponding point of the imaging device 1 are the same, so the following equation (11) holds.
Pv = P1 + δP1 = P1 (11)
According to the above equation (11), when α = 0, δP1 = 0. Similarly, when α = 1, δP2 = 0.

また、座標Ｐ１と座標Ｐ’２１とが対応点の関係にあり、座標Ｐ２と座標Ｐ’１２とが対応点関係にあるため、
・α＝０の場合、Ｐ２＋δＰ２＝Ｐ’２１、すなわちδＰ２＝Ｄ２１
・α＝１の場合、Ｐ１＋δＰ１＝Ｐ’１２、すなわちδＰ１＝Ｄ１２
となる。 Further, since the coordinate P1 and the coordinate P′21 are corresponding points, and the coordinate P2 and the coordinate P′12 are corresponding points, the following relations hold:
When α = 0, P2 + δP2 = P′21, that is, δP2 = D21
When α = 1, P1 + δP1 = P′12, that is, δP1 = D12

以上の条件を満たす式として、以下の（１２）式を用いる。
δＰ１＝αＤ１２， δＰ１＝（１−α）Ｄ２１ …（１２）
この（１２）式により、座標Ｐｖは以下の（１３）式により求まる。
Ｐｖ＝Ｆ（Ｐ１＋αＤ１２，Ｐ２＋（１−α）Ｄ２１，α） …（１３）
この場合、仮想カメラｖの座標Ｐｖにおける色Ｃｖ（Ｐｖ）は、以下の（１４）式により求められる。
Ｃｖ（Ｐｖ）＝（１−α）Ｃ１（Ｐ１＋αＤ１２）
＋αＣ２（Ｐ２＋（１−α）Ｄ２１） …（１４） As an expression that satisfies the above conditions, the following expression (12) is used.
δP1 = αD12, δP2 = (1-α) D21 (12)
From this equation (12), the coordinate Pv is obtained by the following equation (13).
Pv = F (P1 + αD12, P2 + (1-α) D21, α) (13)
In this case, the color Cv (Pv) at the coordinate Pv of the virtual camera v is obtained by the following equation (14).
Cv (Pv) = (1-α) C1 (P1 + αD12) + αC2 (P2 + (1-α) D21) (14)
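As a non-limiting illustration of equations (13) and (14), the following Python sketch shifts the two sampling positions and blends the sampled colors. The names `shift` and `morph_color_two`, and the samplers `c1` and `c2` (callables assumed to return the color of a first captured image at a given coordinate), are hypothetical and are not part of the described apparatus.

```python
from typing import Callable, Tuple

Vec = Tuple[float, float]
Color = Tuple[float, float, float]
Sampler = Callable[[Vec], Color]  # hypothetical: returns the color at a coordinate


def shift(p: Vec, d: Vec, s: float) -> Vec:
    """Return p + s * d for 2D coordinates."""
    return (p[0] + s * d[0], p[1] + s * d[1])


def morph_color_two(c1: Sampler, c2: Sampler, p1: Vec, p2: Vec,
                    d12: Vec, d21: Vec, alpha: float) -> Color:
    """Color Cv(Pv) following equation (14) for two imaging devices."""
    q1 = shift(p1, d12, alpha)        # P1 + alpha * D12, as in equation (13)
    q2 = shift(p2, d21, 1.0 - alpha)  # P2 + (1 - alpha) * D21, as in equation (13)
    col1, col2 = c1(q1), c2(q2)
    # Linear blend: weight (1 - alpha) for device 1 and alpha for device 2.
    return tuple((1.0 - alpha) * a + alpha * b for a, b in zip(col1, col2))
```

With α = 0 the result is the color of the imaging device 1 at P1, and with α = 1 it is the color of the imaging device 2 at P2, which matches the boundary conditions stated above.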

また、仮想カメラｖの撮像する仮想撮像画像を生成するモーフィング処理を、３台以上の撮像装置を用いた場合であっても、上述した撮像装置が２台の場合と同様に行うことができる。すなわち、モーフィング処理に用いる撮像装置、すなわち第１撮像装置組が撮像装置１、撮像装置２及び撮像装置３の３台から構成されている場合、パラメータα及びβとを用いて、以下の（１５）式から仮想カメラｖの座標Ｅｖを求めることができる。ここで、α≧０、β≧、０≦α＋β≦１である。
Ｅｖ＝（１−α−β）Ｅ１＋αＥ２＋βＥ３ …（１５） Further, the morphing process for generating the virtual captured image captured by the virtual camera v can be performed with three or more imaging devices in the same manner as with two imaging devices. That is, when the imaging devices used for the morphing process, namely the first imaging device set, consist of the three imaging devices 1, 2, and 3, the coordinate Ev of the virtual camera v can be obtained from the following equation (15) using the parameters α and β. Here, α ≧ 0, β ≧ 0, and 0 ≦ α + β ≦ 1.
Ev = (1−α−β) E1 + αE2 + βE3 (15)

図１７は、撮像装置１、撮像装置２及び撮像装置３の３台の撮像装置を用いた場合における仮想カメラｖの座標Ｅｖを示す図である。この図１７において、撮像装置１の座標がＥ１であり、撮像装置２の座標がＥ２であり、撮像装置３の座標がＥ３であり、仮想カメラｖの座標がＥｖである。このように、仮想カメラｖは、撮像装置１、撮像装置２及び撮像装置３の形成する三角形の２次元平面上に配置される。また、図１７に示されるように、座標Ｅｖは、座標Ｅ２から座標Ｅ１を減算した距離ベクトルに対しαを乗じたα（Ｅ２−Ｅ１）と、座標Ｅ３から座標Ｅ１を減算した距離ベクトルに対しβを乗じたβ（Ｅ３−Ｅ１）とにより、座標Ｅ１を２次元平面上において平面的に移動させることにより求まる。 FIG. 17 is a diagram illustrating the coordinate Ev of the virtual camera v when three imaging devices, the imaging device 1, the imaging device 2, and the imaging device 3, are used. In FIG. 17, the coordinates of the imaging device 1 are E1, the coordinates of the imaging device 2 are E2, the coordinates of the imaging device 3 are E3, and the coordinates of the virtual camera v are Ev. In this way, the virtual camera v is arranged on the two-dimensional plane of the triangle formed by the imaging device 1, the imaging device 2, and the imaging device 3. As shown in FIG. 17, the coordinate Ev is obtained by moving the coordinate E1 within this two-dimensional plane by α (E2−E1), the distance vector from E1 to E2 multiplied by α, and by β (E3−E1), the distance vector from E1 to E3 multiplied by β.
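Purely as an illustrative sketch of equation (15), the position of the virtual camera may be computed as follows in Python. The function name `virtual_camera_position` and the representation of coordinates as 3-tuples are assumptions made for this example.

```python
from typing import Tuple

Point3 = Tuple[float, float, float]


def virtual_camera_position(e1: Point3, e2: Point3, e3: Point3,
                            alpha: float, beta: float) -> Point3:
    """Ev = (1 - alpha - beta) * E1 + alpha * E2 + beta * E3, equation (15)."""
    # The constraints keep Ev inside the triangle formed by E1, E2 and E3.
    if alpha < 0 or beta < 0 or alpha + beta > 1:
        raise ValueError("require alpha >= 0, beta >= 0 and alpha + beta <= 1")
    gamma = 1.0 - alpha - beta
    return tuple(gamma * a + alpha * b + beta * c
                 for a, b, c in zip(e1, e2, e3))
```

The same value results from E1 + α (E2−E1) + β (E3−E1), which is the in-plane movement of E1 described with reference to FIG. 17.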

この場合の仮想カメラｖの仮想撮像画像の画素の座標Ｐｖと、この座標Ｐｖとの対応点となる撮像装置１の第１撮像画像の画素の座標Ｐ１、撮像装置２の第１撮像画像の画素の座標Ｐ２及び撮像装置３の第１撮像画像の画素の座標Ｐ３との対応関係は、撮像装置が２台の場合と同様の考え方から、以下の（１６）式により求められる。
Ｐｖ＝
Ｆ（Ｐ１＋δＰ１，Ｐ２＋δＰ２，Ｐ３＋δＰ３，α，β） …（１６） In this case, the relationship between the coordinate Pv of a pixel of the virtual captured image of the virtual camera v and its corresponding points, namely the coordinate P1 of a pixel of the first captured image of the imaging device 1, the coordinate P2 of a pixel of the first captured image of the imaging device 2, and the coordinate P3 of a pixel of the first captured image of the imaging device 3, is given by the following equation (16), following the same reasoning as in the case of two imaging devices.
Pv = F (P1 + δP1, P2 + δP2, P3 + δP3, α, β) (16)

上記（１６）式において、α＝０、β＝０のとき、仮想カメラと撮像装置１との座標とが一致しＥｖ＝Ｅ１となり、α＝１、β＝０のとき、仮想カメラと撮像装置２との座標とが一致しＥｖ＝Ｅ２となり、α＝０、β＝１のとき、仮想カメラと撮像装置３との座標とが一致しＥｖ＝Ｅ３となる。これらの初期条件から、以下の（１７）式が成り立つ。
α＝０、β＝０の場合、δＰ１＝０、δＰ２＝Ｄ２１、δＰ３＝Ｄ３１
α＝１，β＝０の場合、δＰ１＝Ｄ１２、δＰ２＝０、δＰ３＝Ｄ３２
α＝０、β＝１の場合、δＰ１＝Ｄ１３、δＰ２＝Ｄ２３、δＰ３＝０
…（１７） In the above equation (16), when α = 0 and β = 0, the coordinates of the virtual camera coincide with those of the imaging device 1 and Ev = E1; when α = 1 and β = 0, they coincide with those of the imaging device 2 and Ev = E2; and when α = 0 and β = 1, they coincide with those of the imaging device 3 and Ev = E3. From these initial conditions, the following equation (17) holds.
When α = 0 and β = 0, δP1 = 0, δP2 = D21, δP3 = D31
When α = 1 and β = 0, δP1 = D12, δP2 = 0, δP3 = D32
When α = 0 and β = 1, δP1 = D13, δP2 = D23, δP3 = 0
... (17)

上記（１７）式が成り立つ条件として、以下の（１８）式の関係を用いた。
δＰ１＝αＤ１２＋βＤ１３
δＰ２＝（１−α−β）Ｄ２１＋βＤ２３
δＰ３＝（１−α−β）Ｄ３１＋αＤ３２ …（１８） As a condition for satisfying the above expression (17), the relationship of the following expression (18) was used.
δP1 = αD12 + βD13
δP2 = (1−α−β) D21 + βD23
δP3 = (1−α−β) D31 + αD32 (18)
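The relationship between equation (18) and the conditions of equation (17) can be checked numerically. The following Python sketch is offered only as an illustration; the function name `displacements_three` and the dictionary layout `d[(i, j)]` for the offsets Dij are assumptions of this example.

```python
from typing import Dict, Tuple

Vec = Tuple[float, float]


def combine(a: float, u: Vec, b: float, v: Vec) -> Vec:
    """Return a * u + b * v for 2D vectors."""
    return (a * u[0] + b * v[0], a * u[1] + b * v[1])


def displacements_three(d: Dict[Tuple[int, int], Vec],
                        alpha: float, beta: float) -> Tuple[Vec, Vec, Vec]:
    """Return (delta P1, delta P2, delta P3) following equation (18).

    d[(i, j)] holds the offset Dij = P'ij - Pi.
    """
    gamma = 1.0 - alpha - beta
    dp1 = combine(alpha, d[(1, 2)], beta, d[(1, 3)])
    dp2 = combine(gamma, d[(2, 1)], beta, d[(2, 3)])
    dp3 = combine(gamma, d[(3, 1)], alpha, d[(3, 2)])
    return dp1, dp2, dp3
```

Evaluating the function at (α, β) = (0, 0), (1, 0), and (0, 1) reproduces the three rows of equation (17).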

上記（１８）式により、仮想カメラｖの仮想撮像画像の画素Ｐｖは、以下の（１９）式により求められる。
Ｐｖ＝Ｆ（Ｐ１＋αＤ１２＋βＤ１３，Ｐ２＋（１−α−β）Ｄ２１＋βＤ２３，Ｐ３＋（１−α−β）Ｄ３１＋αＤ３２，α，β） …（１９） From the equation (18), the pixel Pv of the virtual captured image of the virtual camera v is obtained by the following equation (19).
Pv = F (P1 + αD12 + βD13, P2 + (1−α−β) D21 + βD23, P3 + (1−α−β) D31 + αD32, α, β) (19)

上記（１９）式を用いた場合、仮想カメラｖの仮想撮像画像の画素の座標Ｐｖの色Ｃｖは、座標Ｐｖに対応する第１対応点である、撮像装置１、２及び３における画素各々の座標を仮想撮像画像の座標系に座標変換した座標Ｐ１＋δＰ１、Ｐ２＋δＰ２及びＰ３＋δＰ３を用いて、以下の（２０）式により求められる。
Ｃｖ（Ｐｖ）＝ｗ１Ｃ１（Ｐ１＋αＤ１２＋βＤ１３）＋ｗ２Ｃ２（Ｐ２＋（１−α−β）Ｄ２１＋βＤ２３）＋ｗ３Ｃ３（Ｐ３＋（１−α−β）Ｄ３１＋αＤ３２）
…（２０）
上記（２０）式において、ｗｉ（ｉ＝１，２，３）は重み係数であり、ｗ１＋ｗ２＋ｗ３＝１である。
また、Ｃ１、Ｃ２及びＣ３の各々は、撮像装置１、２及び３の各々の撮像した第１撮像画像における第１対応点であるＰ１、Ｐ２、Ｐ３のそれぞれの色を表している。 When the above equation (19) is used, the color Cv at the coordinate Pv of a pixel of the virtual captured image of the virtual camera v is obtained by the following equation (20), using the coordinates P1 + δP1, P2 + δP2, and P3 + δP3, which are the coordinates of the pixels in the imaging devices 1, 2, and 3 that are the first corresponding points of the coordinate Pv, converted into the coordinate system of the virtual captured image.
Cv (Pv) = w1C1 (P1 + αD12 + βD13) + w2C2 (P2 + (1−α−β) D21 + βD23) + w3C3 (P3 + (1−α−β) D31 + αD32)
... (20)
In the above equation (20), wi (i = 1, 2, 3) is a weighting coefficient, and w1 + w2 + w3 = 1.
Each of C1, C2, and C3 represents the respective colors of P1, P2, and P3 that are the first corresponding points in the first captured images captured by the imaging devices 1, 2, and 3, respectively.

図１８は、仮想カメラｖの仮想撮像画像の画素Ｐｖの色Ｃｖを求める（２０）式における重み係数を説明する図である。この図１８において、撮像装置１の座標がＥ１であり、撮像装置２の座標がＥ２であり、撮像装置３の座標がＥ３であり、仮想カメラｖの座標がＥｖである。ここで、重み係数ｗ１、ｗ２及びｗ３の各々は、座標Ｅ１、Ｅ２及びＥ３が頂点となり形成する三角形を、座標Ｅ１、Ｅ２及びＥ３の各々と座標Ｅｖを頂点とした三角形により分割した各々の面積比である。例えば、座標Ｅ１、Ｅ２及びＥ３が頂点となり形成する三角形の面積がＳであり、座標Ｅ２、Ｅ３及びＥｖを頂点とする三角形の面積がＳｗ１であり、座標Ｅ１、Ｅ３及びＥｖを頂点とする三角形の面積がＳｗ２であり、座標Ｅ１、Ｅ２及びＥｖを頂点とする三角形の面積がＳｗ３である。すなわち、Ｓ＝Ｓｗ１＋Ｓｗ２＋Ｓｗ３である。
この場合、重み係数としては、ｗ１＝Ｓｗ１／Ｓであり、ｗ２＝Ｓｗ２／Ｓであり、ｗ３＝Ｓｗ３／Ｓである。また、重み係数ｗｉは、座標Ｅｉに対して座標Ｅｖを介して対抗する位置にある三角形の面積ＳＷｉの面積Ｓに対する比に設定される。これにより、座標Ｅｖに最も近い座標Ｅｉの撮像装置ｉの重みが最も大きくなり、一方、Ｅｖに最も遠い座標Ｅｉの撮像装置ｉの重みが最も小さくなる。 FIG. 18 is a diagram for explaining the weighting coefficients in equation (20) for obtaining the color Cv of the pixel Pv of the virtual captured image of the virtual camera v. In FIG. 18, the coordinates of the imaging device 1 are E1, the coordinates of the imaging device 2 are E2, the coordinates of the imaging device 3 are E3, and the coordinates of the virtual camera v are Ev. Here, each of the weighting coefficients w1, w2, and w3 is the area ratio of one of the triangles obtained by dividing the triangle whose vertices are the coordinates E1, E2, and E3 into triangles that each have two of these coordinates and the coordinate Ev as vertices. For example, the area of the triangle whose vertices are the coordinates E1, E2, and E3 is S, the area of the triangle whose vertices are the coordinates E2, E3, and Ev is Sw1, the area of the triangle whose vertices are the coordinates E1, E3, and Ev is Sw2, and the area of the triangle whose vertices are the coordinates E1, E2, and Ev is Sw3. That is, S = Sw1 + Sw2 + Sw3.
In this case, the weighting coefficients are w1 = Sw1 / S, w2 = Sw2 / S, and w3 = Sw3 / S. That is, the weighting coefficient wi is set to the ratio, to the area S, of the area Swi of the triangle located opposite the coordinate Ei across the coordinate Ev. As a result, the weight of the imaging device i whose coordinate Ei is closest to the coordinate Ev is the largest, while the weight of the imaging device i whose coordinate Ei is farthest from the coordinate Ev is the smallest.
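The area-ratio weighting described with reference to FIG. 18 can be sketched as follows in Python. This is an illustration only: the function names are hypothetical, and the coordinates E1, E2, E3, and Ev are assumed to be expressed as 2D coordinates within the plane of the triangle.

```python
from typing import Tuple

Point2 = Tuple[float, float]


def triangle_area(a: Point2, b: Point2, c: Point2) -> float:
    """Area of triangle abc from the 2D cross product."""
    return abs((b[0] - a[0]) * (c[1] - a[1])
               - (b[1] - a[1]) * (c[0] - a[0])) / 2.0


def area_weights(e1: Point2, e2: Point2, e3: Point2,
                 ev: Point2) -> Tuple[float, float, float]:
    """Return (w1, w2, w3), where wi = Swi / S."""
    s = triangle_area(e1, e2, e3)
    if s == 0.0:
        raise ValueError("E1, E2 and E3 must not be collinear")
    w1 = triangle_area(e2, e3, ev) / s  # sub-triangle opposite E1
    w2 = triangle_area(e1, e3, ev) / s  # sub-triangle opposite E2
    w3 = triangle_area(e1, e2, ev) / s  # sub-triangle opposite E3
    return w1, w2, w3
```

For a virtual camera placed by equation (15) inside the triangle, these ratios equal 1−α−β, α, and β respectively, so the nearest imaging device receives the largest weight.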

図４に戻り、上記（２０）式において、撮像装置１及び２の各々が撮像した第１撮像画像と仮想カメラの撮像する仮想撮像画像との対応点の組は、［Ｐ１＋αＤ１２＋βＤ１３，Ｐ２＋（１−α−β）Ｄ２１＋βＤ２３，Ｐ３＋（１−α−β）Ｄ３１＋αＤ３２］となり、この組により、仮想撮像画像の座標Ｐｖの画素のモーフィング処理が仮想カメラ画像生成部１１３により行われる。
すなわち、差分Ｄｉｊが生成する局所領域（例えば差分Ｄｉｊを辺とする立方体状の領域）がお互いの対応点を含んでいるとして、仮想カメラｖの仮想撮像画像の画素Ｐｖの色Ｃｖが算出される。 Returning to FIG. 4, in the above equation (20), the set of corresponding points between the first captured image captured by each of the image capturing apparatuses 1 and 2 and the virtual captured image captured by the virtual camera is [P1 + αD12 + βD13, P2 + (1- α−β) D21 + βD23, P3 + (1-α−β) D31 + αD32]. With this set, the morphing process of the pixel at the coordinate Pv of the virtual captured image is performed by the virtual camera image generation unit 113.
That is, the color Cv of the pixel Pv of the virtual captured image of the virtual camera v is calculated on the assumption that the local regions generated by the differences Dij (for example, cubic regions having the differences Dij as sides) contain each other's corresponding points.

図１９は、撮像装置１、２及び３の各々の撮像した第１撮像画像と、仮想カメラｖの撮像した仮想撮像画像とにおける対応点の関係を示す図である。
図１９（ａ）は、仮想カメラｖが撮像した仮想撮像画像の座標Ｐｖに対応する、撮像装置１が撮像した第１撮像画像の画素が座標変換された座標Ｐ１＋δＰ１の求め方を示している。図１９（ａ）において、座標Ｐ１は、撮像装置１の撮像した撮像画像における画素の座標を、仮想カメラｖの仮想撮像画像に座標変換した座標を示している。座標Ｐ’１２は、撮像装置２の撮像した第１撮像画像をテンプレートとし、撮像装置１の撮像した第１撮像画像をテンプレートマッチングして第１対応点として抽出された画素の座標を座標変換した座標を示している。座標Ｐ’１３は、撮像装置３の撮像した第１撮像画像をテンプレートとし、撮像装置１の撮像した第１撮像画像をテンプレートマッチングして第１対応点として抽出された画素の座標を座標変換した座標を示している。距離ベクトルＤ１２は座標Ｐ’１２から座標Ｐ１を減算して生成され、距離ベクトルＤ１３は座標Ｐ’１３から座標Ｐ１を減算して生成されている。座標Ｐ１＋δＰ１は、上記（１８）式に基づいて、距離ベクトルＤ１２に対しαを乗じたαＤ１２と、距離ベクトルＤ１３に対しβを乗じたβＤ１３とにより、座標Ｐ１を２次元平面上において平面的に移動させることにより求まる。 FIG. 19 is a diagram illustrating a relationship between corresponding points in the first captured image captured by each of the image capturing apparatuses 1, 2, and 3 and the virtual captured image captured by the virtual camera v.
FIG. 19A illustrates how to obtain the coordinate P1 + δP1, which is the coordinate-converted position of the pixel of the first captured image captured by the imaging device 1 that corresponds to the coordinate Pv of the virtual captured image captured by the virtual camera v. In FIG. 19A, the coordinate P1 is the coordinate of a pixel in the captured image captured by the imaging device 1, converted into the virtual captured image of the virtual camera v. The coordinate P′12 is the coordinate-converted position of the pixel extracted as the first corresponding point by template matching on the first captured image captured by the imaging device 1, using the first captured image captured by the imaging device 2 as a template. The coordinate P′13 is the coordinate-converted position of the pixel extracted as the first corresponding point by template matching on the first captured image captured by the imaging device 1, using the first captured image captured by the imaging device 3 as a template. The distance vector D12 is generated by subtracting the coordinate P1 from the coordinate P′12, and the distance vector D13 is generated by subtracting the coordinate P1 from the coordinate P′13. Based on the above equation (18), the coordinate P1 + δP1 is obtained by moving the coordinate P1 within the two-dimensional plane by αD12, the distance vector D12 multiplied by α, and by βD13, the distance vector D13 multiplied by β.

図１９（ｂ）は、仮想カメラｖが撮像した仮想撮像画像の座標Ｐｖに対応する、撮像装置２が撮像した第１撮像画像の画素が座標変換された座標Ｐ２＋δＰ２の求め方を示している。図１９（ｂ）において、座標Ｐ２は、撮像装置２の撮像した撮像画像における画素の座標を、仮想カメラｖの仮想撮像画像に座標変換した座標を示している。座標Ｐ’２１は、撮像装置１の撮像した第１撮像画像をテンプレートとし、撮像装置２の撮像した第１撮像画像をテンプレートマッチングして第１対応点として抽出された画素の座標を座標変換した座標を示している。座標Ｐ’２３は、撮像装置３の撮像した第１撮像画像をテンプレートとし、撮像装置２の撮像した第１撮像画像をテンプレートマッチングして第１対応点として抽出された画素の座標を座標変換した座標を示している。距離ベクトルＤ２１は座標Ｐ’２１から座標Ｐ２を減算して生成され、距離ベクトル（Ｄ２３−Ｄ２１）は、距離ベクトルＤ２３から距離ベクトルＤ２１を減算して生成されている。座標Ｐ２＋δＰ２は、上記（１８）式に基づいて、距離ベクトルＤ２１に対しαを乗じたαＤ２１と、距離ベクトルＤ２３−Ｄ２１に対しβを乗じたβ（Ｄ２３−Ｄ２１）とにより、座標Ｐ’２１を２次元平面上において平面的に移動させることにより求まる。 FIG. 19B shows how to obtain coordinates P2 + δP2 in which the pixels of the first captured image captured by the imaging apparatus 2 corresponding to the coordinates Pv of the virtual captured image captured by the virtual camera v are coordinate-converted. In FIG. 19B, a coordinate P2 indicates a coordinate obtained by coordinate-converting the pixel coordinate in the captured image captured by the imaging device 2 into the virtual captured image of the virtual camera v. The coordinate P′21 is obtained by converting the coordinates of the pixel extracted as the first corresponding point by using the first captured image captured by the image capturing apparatus 1 as a template and performing template matching on the first captured image captured by the image capturing apparatus 2. The coordinates are shown. The coordinate P′23 is obtained by converting the coordinates of the pixel extracted as the first corresponding point by using the first captured image captured by the imaging device 3 as a template and performing the template matching on the first captured image captured by the imaging device 2. The coordinates are shown. The distance vector D21 is generated by subtracting the coordinate P2 from the coordinate P'21, and the distance vector (D23-D21) is generated by subtracting the distance vector D21 from the distance vector D23. 
Based on the above equation (18), the coordinate P2 + δP2 is obtained by moving the coordinate P′21 within the two-dimensional plane using αD21, the distance vector D21 multiplied by α, and β (D23−D21), the distance vector D23−D21 multiplied by β. Expanding equation (18) gives P2 + δP2 = P′21 − αD21 + β (D23−D21).

図１９（ｃ）は、仮想カメラｖが撮像した仮想撮像画像の座標Ｐｖに対応する、撮像装置３が撮像した第１撮像画像の画素が座標変換された座標Ｐ３＋δＰ３の求め方を示している。図１９（ｃ）において、座標Ｐ３は、撮像装置３の撮像した撮像画像における画素の座標を、仮想カメラｖの仮想撮像画像に座標変換した座標を示している。座標Ｐ’３１は、撮像装置１の撮像した第１撮像画像をテンプレートとし、撮像装置３の撮像した第１撮像画像をテンプレートマッチングして第１対応点として抽出された画素の座標を座標変換した座標を示している。座標Ｐ’３２は、撮像装置２の撮像した第１撮像画像をテンプレートとし、撮像装置３の撮像した第１撮像画像をテンプレートマッチングして第１対応点として抽出された画素の座標を座標変換した座標を示している。距離ベクトルＤ３１は座標Ｐ’２１から座標Ｐ２を減算して生成され、距離ベクトル（Ｄ３２−Ｄ３１）は、距離ベクトルＤ３２から距離ベクトルＤ３１を減算して生成されている。座標Ｐ３＋δＰ３は、上記（１８）式に基づいて、距離ベクトルＤ３１に対してβを乗じたβＤ３１と、距離ベクトルＤ３２−Ｄ３１に対しαを乗じたα（Ｄ３２−Ｄ３１）とにより、座標Ｐ’３１を２次元平面上において平面的に移動させることにより求まる。 FIG. 19C illustrates how to obtain the coordinate P3 + δP3, which is the coordinate-converted position of the pixel of the first captured image captured by the imaging device 3 that corresponds to the coordinate Pv of the virtual captured image captured by the virtual camera v. In FIG. 19C, the coordinate P3 is the coordinate of a pixel in the captured image captured by the imaging device 3, converted into the virtual captured image of the virtual camera v. The coordinate P′31 is the coordinate-converted position of the pixel extracted as the first corresponding point by template matching on the first captured image captured by the imaging device 3, using the first captured image captured by the imaging device 1 as a template. The coordinate P′32 is the coordinate-converted position of the pixel extracted as the first corresponding point by template matching on the first captured image captured by the imaging device 3, using the first captured image captured by the imaging device 2 as a template. The distance vector D31 is generated by subtracting the coordinate P3 from the coordinate P′31, and the distance vector (D32−D31) is generated by subtracting the distance vector D31 from the distance vector D32.
Based on the above equation (18), the coordinate P3 + δP3 is obtained by moving the coordinate P′31 within the two-dimensional plane using βD31, the distance vector D31 multiplied by β, and α (D32−D31), the distance vector D32−D31 multiplied by α. Expanding equation (18) gives P3 + δP3 = P′31 − βD31 + α (D32−D31).

また、仮想カメラｖの撮像する仮想撮像画像を生成するモーフィング処理を、３台を超える複数の撮像装置を用いた場合であっても、上述した撮像装置が３台の場合と同様に行うことができる。本実施形態においては、モーフィング処理に用いる撮像装置、すなわち第１撮像装置組が撮像装置１、撮像装置２、撮像装置３及び撮像装置４の４台から構成されている場合、パラメータα及びβを用いて、以下の（２１）式から仮想カメラｖの座標Ｅｖを求めることができる。ここで、０≦α≦１、０≦β≦１である。
Ｅｖ＝（１−α＋αβ）Ｅ１＋α（１−β）Ｅ２
−αβＥ３＋αβＥ４ …（２１） Further, the morphing process for generating the virtual captured image captured by the virtual camera v can be performed with more than three imaging devices in the same manner as with three imaging devices. In the present embodiment, when the imaging devices used for the morphing process, namely the first imaging device set, consist of the four imaging devices 1, 2, 3, and 4, the coordinate Ev of the virtual camera v can be obtained from the following equation (21) using the parameters α and β. Here, 0 ≦ α ≦ 1 and 0 ≦ β ≦ 1.
Ev = (1-α + αβ) E1 + α (1-β) E2 − αβE3 + αβE4 (21)

図２０は、撮像装置１、撮像装置２、撮像装置３及び撮像装置４の４台の撮像装置を用いた場合における仮想カメラｖの座標Ｅｖを示す図である。この図２０において、撮像装置１の座標がＥ１であり、撮像装置２の座標がＥ２であり、撮像装置３の座標がＥ３であり、撮像装置４の座標がＥ４であり、仮想カメラｖの座標がＥｖである。ここで、座標Ｅｖは、座標Ｅ２と座標Ｅ１とをα：１−αの比で分割して生成した座標Ｅ１２と、座標Ｅ３と座標Ｅ４とをα：１−αで分割して生成した座標３４とを求める。そして、座標Ｅ１２と座標３４との間をβ：１−βの比で分割した座標β（α（Ｅ４−Ｅ３）−α（Ｅ２−Ｅ１））を座標Ｅｖとしている。 FIG. 20 is a diagram illustrating the coordinate Ev of the virtual camera v when four imaging devices, the imaging device 1, the imaging device 2, the imaging device 3, and the imaging device 4, are used. In FIG. 20, the coordinates of the imaging device 1 are E1, the coordinates of the imaging device 2 are E2, the coordinates of the imaging device 3 are E3, the coordinates of the imaging device 4 are E4, and the coordinates of the virtual camera v are Ev. Here, to obtain the coordinate Ev, the coordinate E12, generated by dividing the segment between the coordinates E2 and E1 in the ratio α: 1-α, and the coordinate E34, generated by dividing the segment between the coordinates E3 and E4 in the ratio α: 1-α, are first obtained. Then, the coordinate β (α (E4-E3) −α (E2-E1)) obtained by dividing the segment between the coordinates E12 and E34 in the ratio β: 1−β is taken as the coordinate Ev.

また、撮像装置１、撮像装置２、撮像装置３及び撮像装置４の各々の撮像する第１撮像画像における対応点は、撮像装置が３台の場合と同様に考えると、以下の（２２）式により表すことができる。以下の式においてｉ＝１、２、３、４である。
Ｐｉ＋δＰｉ＝（１−α＋αβ）Ｐ’ｉ１
＋α（１−β）Ｐ’ｉ２−αβＰ’ｉ３＋αβＰ’ｉ４
…（２２） Further, following the same reasoning as in the case of three imaging devices, the corresponding points in the first captured images captured by the imaging device 1, the imaging device 2, the imaging device 3, and the imaging device 4 can be expressed by the following equation (22), where i = 1, 2, 3, 4.
Pi + δPi = (1−α + αβ) P′i1 + α (1-β) P′i2 − αβP′i3 + αβP′i4 (22)

図２１は、仮想撮像画像の２次元平面における撮像装置１、２、３及び４の各々の第１撮像画像における第１対応点の対応を示す図である。この図２１において、座標Ｐｉ１は撮像装置１の撮像した第１撮像画像をテンプレートとして、撮像装置Ｐｉの撮像した第１撮像画像の第１対応点を、仮想撮像画像の座標系に座標変換した座標である。座標Ｐｉ２は撮像装置２の撮像した第１撮像画像をテンプレートとして、撮像装置Ｐｉの撮像した第１撮像画像の第１対応点を、仮想撮像画像の座標系に座標変換した座標である。座標Ｐｉ３は撮像装置１の撮像した第１撮像画像をテンプレートとして、撮像装置Ｐｉの撮像した第１撮像画像の第１対応点を、仮想撮像画像の座標系に座標変換した座標である。座標Ｐｉ４は撮像装置１の撮像した第１撮像画像をテンプレートとして、撮像装置Ｐｉの撮像した第１撮像画像の第１対応点を、仮想撮像画像の座標系に座標変換した座標である。例えば、ｉ＝１の場合、撮像装置１の撮像した第１撮像画像における第１共通点がＰ’１１＝Ｐ１となり、この座標Ｐ１に対応する対応点が、撮像装置２、３及び４の各々をテンプレートとした場合、それぞれ座標Ｐ’１２、座標Ｐ’１３及び座標Ｐ’１４となる。 FIG. 21 is a diagram illustrating the correspondence between the first corresponding points in the first captured images of the imaging devices 1, 2, 3, and 4 on the two-dimensional plane of the virtual captured image. In FIG. 21, the coordinate P′i1 is the first corresponding point of the first captured image captured by the imaging device i, obtained using the first captured image captured by the imaging device 1 as a template and converted into the coordinate system of the virtual captured image. Likewise, the coordinates P′i2, P′i3, and P′i4 are the first corresponding points of the first captured image captured by the imaging device i, obtained using the first captured images captured by the imaging devices 2, 3, and 4, respectively, as templates and converted into the coordinate system of the virtual captured image.
For example, when i = 1, the first corresponding point in the first captured image captured by the imaging device 1 is P′11 = P1, and the points corresponding to this coordinate P1 when the imaging devices 2, 3, and 4 are each used as the template are the coordinates P′12, P′13, and P′14, respectively.

また、図２１において、仮想カメラｖの撮像する仮想撮像における対応点Ｐｉ＋δＰｉは、図２０の仮想カメラｖの座標を求めた場合と同様に、座標Ｐ’ｉ２と座標Ｐ’ｉ１とをα：１−αの比で分割して生成した座標Ｐａ（＝α（Ｐ’ｉ２−Ｐ’ｉ１））と、座標Ｐ’ｉ４と座標Ｐ’ｉ３とをα：１−αの比で分割して生成した座標Ｐｂ（＝α（Ｐ’ｉ４−Ｐ’ｉ３））とを求める。そして、座標Ｐａと座標Ｐｂとの間をβ：１−βの比で分割した座標β（α（Ｐ’ｉ４−Ｐ’ｉ３）−α（Ｐ’ｉ２−Ｐ’ｉ１））を座標Ｐｉ＋δＰｉとしている。この座標Ｐｉ＋δＰｉが仮想カメラの撮像する仮想撮像画像における撮像装置ｉの撮像した第１撮像画像における第１対応点の座標Ｐｉとなる。 Further, in FIG. 21, the corresponding point Pi + δPi in the virtual captured image captured by the virtual camera v is obtained in the same manner as the coordinates of the virtual camera v in FIG. 20. First, the coordinate Pa (= α (P′i2−P′i1)), generated by dividing the segment between the coordinates P′i2 and P′i1 in the ratio α: 1−α, and the coordinate Pb (= α (P′i4−P′i3)), generated by dividing the segment between the coordinates P′i4 and P′i3 in the ratio α: 1−α, are obtained. Then, the coordinate β (α (P′i4−P′i3) −α (P′i2−P′i1)) obtained by dividing the segment between the coordinates Pa and Pb in the ratio β: 1−β is taken as the coordinate Pi + δPi. This coordinate Pi + δPi is the position, in the virtual captured image captured by the virtual camera, of the first corresponding point Pi in the first captured image captured by the imaging device i.

ただし、上記（２２）式において、Ｐ’ｉｉ＝Ｐｉとする。例えば、ｉ＝３の場合、上記（２２）式の結果は、以下の（２３）式となる。
Ｐ３＋δＰ３＝（１−α＋αβ）Ｐ’３１
＋α（１−β）Ｐ’３２−αβＰ’３＋αβＰ’３４
…（２３） However, in the above equation (22), P′ii = Pi. For example, when i = 3, the result of the above expression (22) becomes the following expression (23).
P3 + δP3 = (1−α + αβ) P′31 + α (1-β) P′32 − αβP′33 + αβP′34 (23)

上記（２３）式においては、以下に示す（２４）式に示す関係がある。
Ｐ’３１＝Ｐ３＋Ｄ３１
Ｐ’３２＝Ｐ３＋Ｄ３２
Ｐ’３４＝Ｐ３＋Ｄ３４ …（２４） In the above equation (23), there is a relationship represented by the following equation (24).
P'31 = P3 + D31
P'32 = P3 + D32
P′34 = P3 + D34 (24)

したがって、（２３）式に対して（２４）式を代入することにより、δＰ３は、以下に示す（２５）式として表すことができる。
δＰ３＝（１−α＋αβ）Ｄ３１＋α（１−β）Ｄ３２＋αβＤ３４
…（２５） Therefore, by substituting the equation (24) for the equation (23), δP3 can be expressed as the following equation (25).
δP3 = (1−α + αβ) D31 + α (1-β) D32 + αβD34
... (25)

したがって、撮像画像における第１対応点の座標Ｐｖは、以下の（２６）式により表すことができる。
Ｐｖ＝Ｆ（Ｐ１＋α（１−β）Ｄ１２−αβＤ１３＋αβＤ１４，Ｐ２＋（１−α＋αβ）Ｄ２１−αβＤ２３＋αβＤ２４，Ｐ３＋（１−α＋αβ）Ｄ３１＋α（１−β）Ｄ３２＋αβＤ３４，Ｐ４＋（１−α＋αβ）Ｄ４１＋α（１−β）Ｄ４２−αβＤ４３，α，β） …（２６） Therefore, the coordinates Pv of the first corresponding point in the captured image can be expressed by the following equation (26).
Pv = F (P1 + α (1-β) D12-αβD13 + αβD14, P2 + (1-α + αβ) D21-αβD23 + αβD24, P3 + (1-α + αβ) D31 + α (1-β) D32 + αβD34, P4 + (1-α + αβ) D41 + α (1-β) D42−αβD43, α, β) (26)
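As an illustrative check of how equation (22) leads to the displacements appearing in equations (25) and (26), the following Python sketch computes δPi for all four imaging devices. The function name `displacements_four` and the dictionary layout `d[(i, j)]` for the offsets Dij are assumptions made for this example.

```python
from typing import Dict, Tuple

Vec = Tuple[float, float]


def displacements_four(d: Dict[Tuple[int, int], Vec],
                       alpha: float, beta: float) -> Dict[int, Vec]:
    """Return delta Pi for i = 1..4, where d[(i, j)] holds Dij = P'ij - Pi."""
    # Coefficients of P'i1 .. P'i4 in equation (22); they sum to 1,
    # so the Pi terms cancel and only the offsets Dij remain.
    coeff = {1: 1.0 - alpha + alpha * beta,
             2: alpha * (1.0 - beta),
             3: -alpha * beta,
             4: alpha * beta}
    result: Dict[int, Vec] = {}
    for i in range(1, 5):
        dx = dy = 0.0
        for j in range(1, 5):
            if j == i:
                continue  # P'ii = Pi, so Dii = 0 contributes nothing
            dx += coeff[j] * d[(i, j)][0]
            dy += coeff[j] * d[(i, j)][1]
        result[i] = (dx, dy)
    return result
```

For i = 3 the result corresponds to (1−α+αβ) D31 + α (1−β) D32 + αβD34, which is the form of equation (25).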

上記（２６）式を用いた場合、仮想カメラｖの仮想撮像画像の画素の座標Ｐｖの色Ｃｖは、座標Ｐｖに対応する第１対応点である、撮像装置１、２、３及び４における画素各々の座標を仮想撮像画像の座標系に座標変換した座標Ｐ１＋δＰ１、Ｐ２＋δＰ２、Ｐ３＋δＰ３及びＰ４＋δＰ４を用いて、以下の（２７）式により求められる。
Ｃｖ（Ｐｖ）＝ｗ１Ｃ１（Ｐ１＋α（１−β）Ｄ１２−αβＤ１３＋αβＤ１４）＋ｗ２Ｃ２（Ｐ２＋（１−α＋αβ）Ｄ２１−αβＤ２３＋αβＤ２４）＋ｗ３Ｃ３（Ｐ３＋（１−α＋αβ）Ｄ３１＋α（１−β）Ｄ３２＋αβＤ３４）＋ｗ４Ｃ４（Ｐ４＋（１−α＋αβ）Ｄ４１＋α（１−β）Ｄ４２−αβＤ４３） …（２７）
上記（２０）式において、ｗｉ（ｉ＝１，２，３，４）は重み係数であり、ｗ１＋ｗ２＋ｗ３＋ｗ４＝１である。
また、Ｃ１、Ｃ２、Ｃ３及びＣ３の各々は、撮像装置１、２、３及び４の各々の撮像した第１撮像画像における第１対応点である座標Ｐ１、Ｐ２、Ｐ３、Ｐ４の画素それぞれの色を表している。 When the above equation (26) is used, the color Cv at the coordinate Pv of a pixel of the virtual captured image of the virtual camera v is obtained by the following equation (27), using the coordinates P1 + δP1, P2 + δP2, P3 + δP3, and P4 + δP4, which are the coordinates of the pixels in the imaging devices 1, 2, 3, and 4 that are the first corresponding points of the coordinate Pv, converted into the coordinate system of the virtual captured image.
Cv (Pv) = w1C1 (P1 + α (1-β) D12-αβD13 + αβD14) + w2C2 (P2 + (1-α + αβ) D21-αβD23 + αβD24) + w3C3 (P3 + (1-α + αβ) D31 + α (1-β) D32 + αβD34) + w4C4 (P4 + (1-α + αβ) D41 + α (1-β) D42-αβD43) (27)
In the above equation (27), wi (i = 1, 2, 3, 4) is a weighting coefficient, and w1 + w2 + w3 + w4 = 1.
Each of C1, C2, C3, and C4 represents the color of the pixel at the coordinates P1, P2, P3, and P4, respectively, which are the first corresponding points in the first captured images captured by the imaging devices 1, 2, 3, and 4.

図２２は、仮想カメラｖの仮想撮像画像の画素Ｐｖの色Ｃｖを求める（２７）式における重み係数を説明する図である。この図１８において、撮像装置１の座標がＥ１であり、撮像装置２の座標がＥ２であり、撮像装置３の座標がＥ３であり、撮像装置４の座標がＥ４であり、仮想カメラｖの座標がＥｖである。ここで、重み係数ｗ１、ｗ２、ｗ３及びｗ４の各々は、座標Ｅ１、Ｅ２、Ｅ３及びＥ４が頂点となり形成する四角形を、４つに分割した面積比となっている。すなわち、座標Ｅ２と座標Ｅ１とをα：１−αで分割した座標Ｅ１２と、座標Ｅ４と座標Ｅ３とをα：１−αで分割した座標Ｅ３４とを生成する。また、座標Ｅ３と座標Ｅ１とをβ：１−βで分割した座標Ｅ１３と、座標Ｅ４と座標Ｅ２とをβ：１−βで分割した座標Ｅ２４とを生成する。 FIG. 22 is a diagram for explaining the weighting coefficients in equation (27) for obtaining the color Cv of the pixel Pv of the virtual captured image of the virtual camera v. In FIG. 22, the coordinates of the imaging device 1 are E1, the coordinates of the imaging device 2 are E2, the coordinates of the imaging device 3 are E3, the coordinates of the imaging device 4 are E4, and the coordinates of the virtual camera v are Ev. Here, each of the weighting coefficients w1, w2, w3, and w4 is the area ratio of one of the four parts into which the quadrangle whose vertices are the coordinates E1, E2, E3, and E4 is divided. That is, the coordinate E12, which divides the segment between the coordinates E2 and E1 in the ratio α: 1−α, and the coordinate E34, which divides the segment between the coordinates E4 and E3 in the ratio α: 1−α, are generated. Further, the coordinate E13, which divides the segment between the coordinates E3 and E1 in the ratio β: 1−β, and the coordinate E24, which divides the segment between the coordinates E4 and E2 in the ratio β: 1−β, are generated.

そして、座標Ｅ１、Ｅ２、Ｅ３及びＥ４を頂点とする面積Ｓの四角形を、座標１２及び座標３４間を結ぶ線分と、座標Ｅ１３及び座標Ｅ２４間を結ぶ線分とで４分割する。ここで、座標Ｅ１、Ｅ１２、Ｅｖ及びＥ１３を頂点とする四角形の面積をＳｗ４とし、座標Ｅ１２、Ｅ２、Ｅ２４及びＥｖを頂点とする四角形の面積をＳｗ３とし、座標Ｅ１３、Ｅｖ、Ｅ３４及びＥ３を頂点とする四角形の面積をＳｗ２とし、座標Ｅｖ、Ｅ２４、Ｅ４及びＥ３４を頂点とする四角形の面積をＳｗ１とする。すなわち、Ｓ＝Ｓｗ１＋Ｓｗ２＋Ｓｗ３＋Ｓｗ４である。
この場合、重み係数としては、ｗ１＝Ｓｗ１／ｓであり、ｗ２＝Ｓｗ２／Ｓであり、ｗ３＝Ｓｗ３／Ｓであり、ｗ４＝Ｓｗ４／Ｓである。また、重み係数ｗｉは、座標Ｅｉに対して座標Ｅｖを介して対抗する位置にある四角形の面積ＳＷｉの面積Ｓに対する比に設定される。これにより、座標Ｅｖに最も近い座標Ｅｉの撮像装置ｉの重みが最も大きくなり、一方、Ｅｖに最も遠い座標Ｅｉの撮像装置ｉの重みが最も小さくなる。 Then, the quadrangle of area S whose vertices are the coordinates E1, E2, E3, and E4 is divided into four by the line segment connecting the coordinates E12 and E34 and the line segment connecting the coordinates E13 and E24. Here, the area of the quadrangle whose vertices are the coordinates E1, E12, Ev, and E13 is Sw4, the area of the quadrangle whose vertices are the coordinates E12, E2, E24, and Ev is Sw3, the area of the quadrangle whose vertices are the coordinates E13, Ev, E34, and E3 is Sw2, and the area of the quadrangle whose vertices are the coordinates Ev, E24, E4, and E34 is Sw1. That is, S = Sw1 + Sw2 + Sw3 + Sw4.
In this case, the weighting coefficients are w1 = Sw1 / S, w2 = Sw2 / S, w3 = Sw3 / S, and w4 = Sw4 / S. That is, the weighting coefficient wi is set to the ratio, to the area S, of the area Swi of the quadrangle located opposite the coordinate Ei across the coordinate Ev. As a result, the weight of the imaging device i whose coordinate Ei is closest to the coordinate Ev is the largest, while the weight of the imaging device i whose coordinate Ei is farthest from the coordinate Ev is the smallest.
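The four-way area division described with reference to FIG. 22 can be sketched as follows in Python. This is an illustration under stated assumptions, not the implementation of the embodiment: the function names are hypothetical, the coordinates are assumed to be 2D coordinates of a convex quadrangle whose boundary order is E1, E2, E4, E3, and Ev is assumed to be the intersection of the segments E12-E34 and E13-E24.

```python
from typing import Sequence, Tuple

Point2 = Tuple[float, float]


def lerp(a: Point2, b: Point2, t: float) -> Point2:
    """Point dividing the segment a-b at fraction t measured from a."""
    return ((1.0 - t) * a[0] + t * b[0], (1.0 - t) * a[1] + t * b[1])


def polygon_area(pts: Sequence[Point2]) -> float:
    """Shoelace formula; vertices must be listed in boundary order."""
    total = 0.0
    for k in range(len(pts)):
        x0, y0 = pts[k]
        x1, y1 = pts[(k + 1) % len(pts)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def quad_weights(e1: Point2, e2: Point2, e3: Point2, e4: Point2,
                 alpha: float, beta: float) -> Tuple[float, float, float, float]:
    """Return (w1, w2, w3, w4), where wi = Swi / S."""
    e12, e34 = lerp(e1, e2, alpha), lerp(e3, e4, alpha)
    e13, e24 = lerp(e1, e3, beta), lerp(e2, e4, beta)
    ev = lerp(e12, e34, beta)  # assumed: lies on both dividing segments
    s = polygon_area([e1, e2, e4, e3])
    sw4 = polygon_area([e1, e12, ev, e13])   # region adjacent to E1
    sw3 = polygon_area([e12, e2, e24, ev])   # region adjacent to E2
    sw2 = polygon_area([e13, ev, e34, e3])   # region adjacent to E3
    sw1 = polygon_area([ev, e24, e4, e34])   # region adjacent to E4
    return sw1 / s, sw2 / s, sw3 / s, sw4 / s
```

Each weight uses the region diagonally opposite its imaging device, so an imaging device close to Ev is paired with a large region and receives a large weight.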

上記（２７）式の０≦ｉ≦ｎとした一般式は、以下の（２８）式である。
Ｃｖ（Ｐｖ）＝ΣｗｉＣｉ（Ｐｉ＋δＰｉ） …（２８）
上記（２８）式において、ｗｉは、撮像装置ｉの撮像した第１撮像画像の第１対応点の画素の色の重み係数を示している。また、Ｃｉ（Ｐｉ＋δＰｉ）は、仮想カメラｖの撮像する仮想撮像座標Ｐｉ＋δＰｉに座標変換される、撮像装置ｉの撮像した第１撮像画像の第１対応点の画素の色を示している。 The general formula with 0 ≦ i ≦ n in the above equation (27) is the following equation (28).
Cv (Pv) = ΣwiCi (Pi + δPi) (28)
In the above equation (28), wi represents the weighting coefficient of the color of the pixel at the first corresponding point of the first captured image captured by the imaging device i. Ci (Pi + δPi) indicates the color of the pixel at the first corresponding point of the first captured image captured by the imaging apparatus i, which is coordinate-converted to the virtual imaging coordinates Pi + δPi captured by the virtual camera v.
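Equation (28) is a weighted sum of the colors sampled at the shifted coordinates. A minimal Python sketch is given below for illustration; the function name `blend_colors` and the representation of a color as a 3-tuple are assumptions, and the colors are assumed to have already been sampled at Pi + δPi.

```python
import math
from typing import Sequence, Tuple

Color = Tuple[float, float, float]


def blend_colors(weights: Sequence[float], colors: Sequence[Color]) -> Color:
    """Cv(Pv) = sum over i of wi * Ci(Pi + delta Pi), equation (28)."""
    if len(weights) != len(colors):
        raise ValueError("one weight is required per imaging device")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("the weighting coefficients must sum to 1")
    # Accumulate each color channel separately.
    return tuple(sum(w * c[k] for w, c in zip(weights, colors))
                 for k in range(3))
```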

Next, the operation by which the free viewpoint image capturing apparatus according to the present embodiment generates a free viewpoint image by morphing a plurality of first captured images will be described with reference to the drawings. FIG. 23 is a flowchart showing an example of this operation.
Step S1:
The imaging control unit 101 reads the captured image from each of the imaging devices provided on the columns of the dome frame 500 and temporarily writes and stores it in the database 114 together with the identification information of the corresponding imaging device. At this time, the imaging control unit 101 writes and stores in the database 114, as a pair, a low-resolution first captured image obtained by thinning out the pixels of the captured image from each imaging device, and a second captured image whose pixels are not thinned out, that is, one with the same resolution as the captured image.
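The pairing of a thinned first captured image with a full-resolution second captured image in Step S1 might look like the sketch below. The thinning step of 2 and the name `make_image_pair` are assumptions for illustration; the specification does not state the thinning ratio.

```python
import numpy as np

def make_image_pair(captured, step=2):
    """Return (first, second) captured images for one imaging device.

    first:  low-resolution image made by keeping every `step`-th pixel
            in both directions (pixel thinning; `step` is an assumption).
    second: the captured image at its original resolution.
    """
    first = captured[::step, ::step].copy()
    second = captured
    return first, second

# A small synthetic 8x8 RGB frame stands in for a camera image.
captured = np.arange(8 * 8 * 3, dtype=np.uint8).reshape(8, 8, 3)
first, second = make_image_pair(captured, step=2)
```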

Step S2:
The silhouette image generation unit 102 reads the second captured images from the database 114 and, as shown in FIG. 5, generates a silhouette image from each of them using a color filter. Because the background of each captured image is captured as the blue of a blue screen, the silhouette image generation unit 102 can generate the silhouette images easily.
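A color filter of the kind Step S2 describes could be sketched as follows. The specification does not define the filter, so the rule used here (a pixel is background when its blue channel exceeds both other channels by a margin) and the `margin` value are assumptions.

```python
import numpy as np

def silhouette_from_blue_screen(image_rgb, margin=40):
    """Return a boolean mask that is True on the subject.

    A pixel is treated as blue-screen background when its blue channel
    exceeds the larger of red and green by more than `margin`
    (an assumed threshold, not a value from the specification).
    """
    img = image_rgb.astype(np.int32)  # avoid uint8 wrap-around
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    background = (b - np.maximum(r, g)) > margin
    return ~background

# 4x4 blue frame with a single skin-toned subject pixel.
frame = np.empty((4, 4, 3), dtype=np.uint8)
frame[...] = (10, 20, 200)
frame[1, 1] = (180, 120, 90)
mask = silhouette_from_blue_screen(frame)
```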

Step S3:
The VH processing unit 103 applies the Visual Hull method to the silhouette images generated by the silhouette image generation unit 102 to generate a temporary three-dimensional shape; in the present embodiment, for example, it is generated from 16 silhouette images as shown in FIG. 6.
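The Visual Hull idea in Step S3 can be illustrated with a small voxel-carving sketch: a candidate point is kept only if it projects inside the silhouette in every view. The 3x4 projection matrices, the voxel grid, and the function name `visual_hull` are assumptions for this example; the specification does not describe its implementation at this level.

```python
import numpy as np

def visual_hull(points, views):
    """Keep the points whose projection lies inside every silhouette.

    points: (N, 3) candidate voxel centers in world coordinates.
    views:  list of (mask, P) pairs, where mask is a boolean silhouette
            image and P is a 3x4 projection matrix (assumed known).
    """
    inside = np.ones(len(points), dtype=bool)
    homog = np.hstack([points, np.ones((len(points), 1))])
    for mask, P in views:
        uvw = homog @ P.T
        u = np.rint(uvw[:, 0] / uvw[:, 2]).astype(int)  # column
        v = np.rint(uvw[:, 1] / uvw[:, 2]).astype(int)  # row
        h, w = mask.shape
        in_image = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(points), dtype=bool)
        hit[in_image] = mask[v[in_image], u[in_image]]
        inside &= hit  # carve away points outside this silhouette
    return inside

# 4x4x4 grid, two orthographic-style views with a 2x2 silhouette each.
grid = np.stack(np.meshgrid(np.arange(4), np.arange(4), np.arange(4),
                            indexing="ij"), axis=-1).reshape(-1, 3).astype(float)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
P_front = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)
P_side = np.array([[0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)
hull = visual_hull(grid, [(mask, P_front), (mask, P_side)])
```

With only silhouettes as input, concave regions cannot be carved, which is why the later steps correct this temporary shape with stereo matching.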

Step S4:
The imaging device selection unit 104 groups the plurality of imaging devices provided on the dome frame 500 into sets of two or more devices, generating first imaging device sets, for example sets of three or four imaging devices as shown in FIG. 7.

Step S5:
For each first imaging device set generated by the imaging device selection unit 104, the stereo matching unit 105 performs stereo matching of common corresponding points between the first captured images captured by the imaging devices in that set.

Step S6:
The three-dimensional shape generation unit 106 uses the common corresponding points between the first captured images of the imaging devices, which are the corresponding points extracted by the stereo matching unit 105, to correct the temporary three-dimensional shape that the VF processing unit 103 generated by VH (Visual Hull), and thereby generates a partial three-dimensional shape.
That is, the three-dimensional shape generation unit 106 generates, as the partial three-dimensional shape, the corrected temporary three-dimensional shape from which the protruding region has been removed. The three-dimensional shape generation unit 106 then maps the common coordinate points of each first imaging device set into the world coordinate system and superimposes them. Common coordinate points from different first imaging device sets coincide if they have the same coordinates.

Step S7:
The three-dimensional shape synthesis unit 107 superimposes all the partial three-dimensional shapes that the three-dimensional shape generation unit 106 generated for the first imaging device sets. In doing so, it uses the common corresponding points as a reference, aligns the coordinate positions of the partial three-dimensional shapes pixel by pixel, and overlays them.
The corresponding point search unit 108 then uses the three-dimensional shape synthesized by the three-dimensional shape synthesis unit 107 to search for first corresponding points, which are corresponding points in the second captured images.

Because the morphing process is performed on the second captured images on the basis of these first corresponding points, the corresponding point search unit 108 searches for the first corresponding points in the high-resolution second captured images used for morphing. To do so, it reads the second captured images from the database 114. The search uses a virtual first captured image obtained by projecting each pixel of the three-dimensional shape onto a two-dimensional plane perpendicular to the imaging direction of each imaging device, that is, the two-dimensional plane corresponding to the imaging surface of that imaging device.
For each virtual first captured image onto which the corresponding point search unit 108 has projected the three-dimensional shape, the occlusion area search unit 109 detects the pixels that exist in the three-dimensional shape but not in the virtual first captured image, and writes and stores them in the database 114 for each imaging device.

Step S8:
For each first imaging device set, the template matching unit 110 performs template matching between the first captured images captured by all the imaging devices in that set. Because the template matching unit 110 does not take the geometric consistency of the imaging devices into account, it performs only template matching between the first captured images, not stereo matching.
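Local template matching of the kind used in Step S8 might be sketched as below. The sum of absolute differences (SAD) cost, the search radius, and the name `match_template_sad` are assumptions chosen for illustration; the specification does not state which similarity measure the template matching unit 110 uses.

```python
import numpy as np

def match_template_sad(image, template, guess, radius):
    """Find the template's top-left position near an initial guess.

    Searches positions within `radius` pixels of `guess` (row, col)
    and returns the position with the smallest sum of absolute
    differences, together with that SAD value.
    """
    th, tw = template.shape
    tmpl = template.astype(np.int64)
    r0, c0 = guess
    best_sad, best_pos = None, None
    for r in range(max(0, r0 - radius), min(image.shape[0] - th, r0 + radius) + 1):
        for c in range(max(0, c0 - radius), min(image.shape[1] - tw, c0 + radius) + 1):
            patch = image[r:r + th, c:c + tw].astype(np.int64)
            sad = np.abs(patch - tmpl).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best_pos = sad, (r, c)
    return best_pos, best_sad

# Synthetic grayscale image; the template is cut from position (5, 6).
image = np.arange(144, dtype=np.uint8).reshape(12, 12)
template = image[5:8, 6:9]
pos, sad = match_template_sad(image, template, guess=(4, 5), radius=2)
```

Restricting the search to a small window around a position predicted from the three-dimensional shape keeps the matching local, which is the role the three-dimensional shape plays in this step.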

Step S9:
The virtual camera setting unit 111 selects and reads the camera parameters of the imaging devices used in the dome-type imaging apparatus from the camera parameter table stored in the database 114.

Step S10:
When the user sets the position of the virtual camera through input means (not shown), the morphing imaging device selection unit 112 sets the imaging devices to be used in the morphing process that generates the virtual captured image captured by the virtual camera.
That is, once the imaging direction (viewpoint direction) from which the three-dimensional shape of the subject is projected onto the imaging surface of the virtual camera (a two-dimensional plane) has been set, the morphing imaging device selection unit 112 sets the imaging devices whose first captured images are used in the morphing process that generates the virtual captured image; in other words, it selects a first imaging device set. To do so, it reads the position information from the camera parameters of each imaging device in the database 114 and selects the first imaging device set whose spatial range contains the position of the virtual camera.

Step S11:
The virtual camera image generation unit 113 reads from the database 114 the first captured images of the imaging devices selected by the morphing imaging device selection unit 112. In this description, the first imaging device set consists of, for example, three imaging devices.
Next, the virtual camera image generation unit 113 converts each of the template-matched first corresponding points of imaging devices 1, 2, and 3 from the coordinate system of its own first captured image to the coordinate system of the virtual captured image.
Then, from the colors of the pixels at the coordinates P1 + δP1, P2 + δP2, and P3 + δP3 of the first corresponding points of imaging devices 1, 2, and 3 in the coordinate system of the virtual captured image, the virtual camera image generation unit 113 obtains the color of the pixel at the coordinate Pv of the virtual captured image corresponding to these first corresponding points by the morphing process of equation (20) above.
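The coordinate conversion for a three-device set can be illustrated with the expressions given in the claims: P1 + αD12 + βD13, P2 + (1−α−β)D21 + βD23, and P3 + (1−α−β)D31 + αD32. The sketch below assumes that Dij is the offset Pj − Pi and that these expressions correspond to Pi + δPi in Step S11; both readings are assumptions, as are the sample coordinates.

```python
import numpy as np

def to_virtual_coords(P1, P2, P3, alpha, beta):
    """Map the first corresponding points of three imaging devices into
    the virtual image, for Ev = (1 - alpha - beta) E1 + alpha E2 + beta E3.

    D(a, b) = b - a is the assumed meaning of the difference Dij.
    """
    P1, P2, P3 = (np.asarray(p, dtype=float) for p in (P1, P2, P3))
    D = lambda a, b: b - a
    g = 1.0 - alpha - beta
    q1 = P1 + alpha * D(P1, P2) + beta * D(P1, P3)
    q2 = P2 + g * D(P2, P1) + beta * D(P2, P3)
    q3 = P3 + g * D(P3, P1) + alpha * D(P3, P2)
    return q1, q2, q3

q1, q2, q3 = to_virtual_coords((10, 10), (20, 10), (10, 30),
                               alpha=0.25, beta=0.5)
```

Under this reading all three expressions reduce to (1−α−β)P1 + αP2 + βP3, so the three matched points land on the same virtual pixel, which is what allows their colors to be blended without blur.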

The virtual camera image generation unit 113 performs the above processing for the coordinate Pv of every pixel of the virtual captured image, thereby generating the virtual captured image.
At this time, when the virtual camera image generation unit 113 projects the three-dimensional shape onto the coordinate system of the virtual captured image and detects a coordinate that is hidden by occlusion, it does not generate a color for that coordinate.
In addition, when the first imaging device set used for the morphing process contains an imaging device whose first captured image has no first corresponding point because of occlusion, the virtual camera image generation unit 113 performs the morphing process using the first captured images of the other imaging devices in that set.
In this case, the weight of the imaging device that has no first corresponding point because of occlusion is set to 0. For example, if imaging devices 1 to 4 are used to determine the color of a pixel Pv and imaging device 2 is occluded, the value of Sw2 is set to 0 in the processing described in paragraphs 0114 to 0116. Then S = Sw1 + Sw3 + Sw4, and the weights satisfy w2 = 0 and w1 + w3 + w4 = 1. Equation (28) is calculated using these weights.
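The occlusion handling described here, setting the area of an occluded imaging device to 0 and renormalizing by the remaining total, can be sketched as follows. The area values and the name `occlusion_aware_weights` are hypothetical.

```python
import numpy as np

def occlusion_aware_weights(areas, occluded):
    """Return weights wi = Swi / S with occluded devices forced to 0.

    areas:    the areas Sw1..Swn (hypothetical values in the example).
    occluded: booleans, True where the device has no first
              corresponding point because of occlusion.
    """
    sw = np.asarray(areas, dtype=float).copy()
    sw[np.asarray(occluded, dtype=bool)] = 0.0  # e.g. Sw2 = 0
    total = sw.sum()                            # S = Sw1 + Sw3 + Sw4
    if total == 0.0:
        raise ValueError("every imaging device is occluded at this pixel")
    return sw / total

# Imaging device 2 is occluded for this pixel.
w = occlusion_aware_weights([4.0, 3.0, 2.0, 1.0],
                            [False, True, False, False])
```

The resulting weights can be passed directly to equation (28), since they again sum to 1.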

As described above, according to the present embodiment, a first captured image and a second captured image with a higher resolution than the first captured image are generated from the image captured by each of the plurality of imaging devices, a three-dimensional shape of the subject is generated from the first captured images, and template matching is performed on the basis of this three-dimensional shape. Consequently, when the virtual captured image of the virtual camera is generated, the first corresponding points corresponding to the pixels of the virtual captured image can be extracted from the second captured image of each imaging device easily, robustly, and densely.
Further, according to the present embodiment, the positions of the points in the second captured image of each imaging device that correspond to pixels of the virtual captured image are converted to the coordinates of the virtual captured image. The color weighting of each pixel of each second captured image in the morphing process can therefore be calculated accurately, and a highly accurate morphing process can be performed.

Further, according to the present embodiment, the three-dimensional shape is projected onto the coordinate system of the virtual captured image. Therefore, even when the virtual captured image contains coordinates hidden by occlusion, or when a second captured image used in the morphing process has hidden coordinates, the pixels at those coordinates can be handled easily so that no holes or blur appear in the image, and the shape of the occlusion region can be reflected in the virtual captured image.
As described above, according to the present embodiment, when a free viewpoint image (virtual captured image) is generated by morphing the second captured images captured by the imaging devices in response to a change of viewpoint, template matching is performed in local regions during the morphing process, and the first corresponding points between the captured images are corrected in the coordinate system of the virtual captured image. The present embodiment therefore reduces the blur caused by morphing and, according to the viewpoint from which the subject is observed, can capture the image of the virtual camera corresponding to that viewpoint as a photorealistic free viewpoint image.

The following compares free viewpoint images generated by the free viewpoint image capturing apparatus according to the present embodiment with those generated by conventional methods.
FIG. 24 shows the first captured images captured by a first imaging device set of the dome-type imaging apparatus. Because FIG. 24 shows an example in which the first imaging device set consists of four imaging devices, there are four second captured images.

FIG. 25 shows a free viewpoint image generated using ViewMorphing (registered trademark) as implemented in the open-source library OpenCV (registered trademark). The free viewpoint image of FIG. 25 was generated by applying the ViewMorphing (registered trademark) morphing process to the second captured images of FIG. 24.

FIG. 26 shows a free viewpoint image generated by using SAD as the stereo matching method, creating meshes by Delaunay triangulation, and compositing the meshes. In FIG. 26, no corresponding points are found in the occlusion regions 860 and 861, so the contour is not drawn and part of the image is blurred.

FIG. 27 shows a free viewpoint image generated by the morphing process from a three-dimensional shape obtained with Visual Hull. That is, in FIG. 27 the three-dimensional shape obtained with Visual Hull is projected onto the virtual captured image corresponding to the viewpoint (imaging direction) of the virtual camera to generate the free viewpoint image. Because the three-dimensional shape from Visual Hull has low accuracy, corresponding points cannot be obtained in the morphing process and the image is blurred.

FIG. 28 shows a free viewpoint image generated by a morphing process in which a three-dimensional shape is obtained with Visual Hull and stereo matching between this three-dimensional shape and each of the first captured images is performed to obtain common corresponding points. That is, FIG. 28 omits the per-pixel template matching between the second captured images used in the present embodiment. FIG. 28(a) shows the resulting free viewpoint image, and FIGS. 28(b), 28(c), and 28(d) are enlarged views of the regions 851, 852, and 853 in FIG. 28(a), respectively.

FIG. 29 shows a free viewpoint image generated by a morphing process that uses first corresponding points obtained as follows: a three-dimensional shape is obtained with Visual Hull, stereo matching between this three-dimensional shape and each of the first captured images is performed to obtain common corresponding points, and the resulting three-dimensional shape is used to perform template matching between the first captured images.

A program for realizing the functions of the free viewpoint image generation 10 in FIG. 4 may be recorded on a computer-readable recording medium, and the generation of the free viewpoint image (virtual captured image) may be performed by loading the program recorded on this recording medium into a computer system and executing it. The "computer system" here includes an OS and hardware such as peripheral devices.

The "computer system" also includes a homepage providing environment (or display environment) when a WWW system is used.
The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. It also includes media that hold a program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or a communication line such as a telephone line, and media that hold a program for a certain period, such as the volatile memory inside a computer system serving as the server or client in that case. The program may realize only part of the functions described above, or may realize those functions in combination with a program already recorded in the computer system.

An embodiment of the present invention has been described above in detail with reference to the drawings. The specific configuration is not limited to this embodiment, however, and also includes designs and the like that do not depart from the gist of the invention.

DESCRIPTION OF SYMBOLS
10 … Free viewpoint image capturing unit
53_1, 53_2, 53_3 … Imaging device
101 … Imaging control unit
102 … Silhouette image generation unit
103 … VF processing unit
104 … Imaging device selection unit
105 … Stereo matching unit
106 … Three-dimensional shape generation unit
107 … Three-dimensional shape synthesis unit
108 … Corresponding point search unit
109 … Occlusion area search unit
110 … Template matching unit
111 … Virtual camera setting unit
112 … Morphing imaging device selection unit
113 … Virtual camera image generation unit
114 … Database
500_1, 500_2, 500_3, 500_4, 500_5, 500_6, 500_7, 500_8, 500_9, 500_10, 500_11, 500_12 … Column
600 … Subject
800 … LED light source
850_3 … Light guide
900 … Blue background (blue screen)

Claims

A free viewpoint image capturing apparatus comprising:
an imaging control unit that acquires a first captured image of a subject from each of a plurality of imaging devices;
a corresponding point search unit that searches for a first corresponding point, which is a corresponding point in a second captured image acquired from the imaging devices, using a three-dimensional shape of the subject reproduced from the first captured images;
a template matching unit that performs template matching of the first corresponding point between the second captured images; and
a virtual camera image generation unit that generates, by a morphing process of the second captured images, a third captured image as captured from a virtual camera arranged between the plurality of imaging devices.

The free viewpoint image capturing apparatus according to claim 1, wherein
the virtual camera image generation unit performs the morphing process on a combination of the second captured images, and
the virtual camera has its imaging surface on the line or plane on which the imaging devices that captured the combined second captured images are arranged.

The free viewpoint image capturing apparatus according to claim 1, further comprising an occlusion area search unit that detects, from a fourth captured image obtained by projecting the three-dimensional shape onto the imaging surface of the virtual camera, the occlusion region of each imaging device in the combination and supplies the detection result to the virtual camera image generation unit as occlusion information,
wherein the virtual camera image generation unit generates the third captured image by a morphing process that uses the captured images of the combination and reflects the occlusion information.

The free viewpoint image capturing apparatus according to claim 1, wherein the resolution of the second captured image is higher than the resolution of the first captured image.

The free viewpoint image capturing apparatus according to claim 1, further comprising:
a silhouette image generation unit that generates silhouette images from the first captured images;
a VF processing unit that generates, from the silhouette images, a temporary partial three-dimensional shape that is a part of the three-dimensional shape of the subject;
a stereo matching unit that performs stereo matching of the first captured images for each combination of the imaging devices and extracts common corresponding points, which are corresponding points common to the first captured images;
a partial three-dimensional shape generation unit that corrects the temporary partial three-dimensional shape using the common corresponding points and sets the corrected result as a partial three-dimensional shape; and
a three-dimensional shape synthesis unit that synthesizes the partial three-dimensional shapes to generate the three-dimensional shape.

The free viewpoint image capturing apparatus according to claim 5, wherein the three-dimensional shape synthesis unit compares a triangular mesh connecting the common corresponding points with the temporary partial three-dimensional shape, deletes the region of the temporary partial three-dimensional shape that lies on the imaging device side of the mesh, and thereby corrects the temporary partial three-dimensional shape.

The free viewpoint image capturing apparatus according to claim 5 or 6, wherein the three-dimensional shape synthesis unit superimposes the partial three-dimensional shapes obtained from all combinations of the first captured images and generates the three-dimensional shape from the region in which at least a predetermined threshold number of partial three-dimensional shapes overlap.

There are two imaging devices, and the virtual camera is arranged on a line where the imaging devices are arranged,
The coordinate Ev where the virtual camera is arranged uses the coordinate Ei of the imaging device i (1 ≦ i ≦ 2),
Ev = (1-α) E1 + αE2, (0 ≦ α ≦ 1)
When expressed by the formula
As for the coordinate Pv of the pixel in the third captured image of the virtual camera, the first corresponding point of the coordinate Pv is the coordinate Pi, the first corresponding point in the coordinate system of the third captured image is Pi, and the template When represented by a combination of Pj of the first captured image as a template in matching and a difference Dij (i = 1, 2, j = 1, 2, i ≠ j) with respect to Pi, the third captured image The first corresponding point of the imaging device 1 with respect to the coordinate Pv in the coordinate system is P1 + αD12, and the first corresponding point of the imaging device 2 is P2 + (1-α) D21. The free viewpoint image capturing device according to any one of the above.

The number of the imaging devices is three, and the virtual camera is arranged on the surface where the imaging device is arranged,
The coordinate Ev where the virtual camera is arranged uses the coordinate Ei of the imaging device i (1 ≦ i ≦ 3),
Ev = (1−α−β) E1 + αE2 + βE3 (0 ≦ α, 0 ≦ β, α + β ≦ 1)
When expressed by the formula
As for the coordinate Pv of the pixel in the third captured image of the virtual camera, the first corresponding point of the coordinate Pv is the coordinate Pi, the first corresponding point in the coordinate system of the third captured image is Pi, and the template When expressed by a combination of Pj of the first captured image as a template in matching and a difference Dij (i = 1, 2, 3, j = 1, 2, 3, i ≠ j) with respect to Pi, The first corresponding point of the imaging device 1 with respect to the coordinate Pv in the coordinate system of the third captured image is P1 + αD12 + βD13, the first corresponding point of the imaging device 2 is P2 + (1-α−β) D21 + βD23, and the first corresponding point of the imaging device 3 The free viewpoint image capturing apparatus according to claim 5, wherein P3 + (1−α−β) D31 + αD32.

The number of the imaging devices is four, and the virtual camera is arranged on a surface on which the imaging devices are arranged,
The coordinate Ev where the virtual camera is arranged uses the coordinate Ei of the imaging device i (1 ≦ i ≦ 4),
Ev = (1-α-αβ) E1 + α (1-β) E2-αβE3 + αβE4 (0 ≦ α, 0 ≦ β, α + β ≦ 1)
When expressed by the formula
As for the coordinate Pv of the pixel in the third captured image of the virtual camera, the first corresponding point of the coordinate Pv is the coordinate Pi, the first corresponding point in the coordinate system of the third captured image is Pi, and the template It is expressed by a combination of Pj of the first captured image as a template in matching and a difference Dij of Pj with respect to Pi (i = 1, 2, 3, 4, j = 1, 2, 3, 4, i ≠ j). The corresponding point of the imaging device i with respect to the coordinate Pv in the coordinate system of the third captured image is (1−α + αβ) P′i1 + α (1−β) P′i2−αβP′i3 + αP′i4 (where P ′ The free viewpoint image capturing apparatus according to claim 5, wherein ii = Pi).

The free viewpoint image capturing apparatus according to any one of claims 1 to 10, wherein each of the imaging devices is arranged on the surface of a sphere centered on the subject, with its imaging direction toward the subject, so that the distances from the imaging devices to the subject are the same.

A free viewpoint image capturing method comprising:
an imaging control process in which an imaging control unit acquires a first captured image of a subject from each of a plurality of imaging devices;
a corresponding point search process in which a corresponding point search unit searches for a first corresponding point, which is a corresponding point in a second captured image acquired from the imaging devices, using a three-dimensional shape of the subject reproduced from the first captured images;
a template matching process in which a template matching unit performs template matching of the first corresponding point between the second captured images; and
a virtual camera image generation process in which a virtual camera image generation unit generates, by a morphing process of the second captured images, a third captured image as captured from a virtual camera arranged between the plurality of imaging devices.