JP2020166652A

JP2020166652A - Image processing apparatus, image processing method, and program

Info

Publication number: JP2020166652A
Application number: JP2019067475A
Authority: JP
Inventors: 希名板倉; Kina Itakura
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-03-29
Filing date: 2019-03-29
Publication date: 2020-10-08

Abstract

To suppress misalignment of texture between polygons in three-dimensional polygon data with which a texture image is associated. SOLUTION: An image processing apparatus 100 that generates a texture image to be associated with three-dimensional polygon data of an object determines, on a per-pixel basis, a pixel value of each of a plurality of pixels constituting a region corresponding to the polygons of the three-dimensional polygon data on the basis of one or more captured images, and generates the texture image on the basis of the pixel values. SELECTED DRAWING: Figure 7

Description

The present invention relates to an image processing apparatus, an image processing method, and a program.

A technique is known for reconstructing, from a plurality of images obtained by capturing a subject with a plurality of cameras, the image that would be obtained when the subject is viewed from an arbitrary viewpoint (a virtual viewpoint image). This technique uses three-dimensional data in which a texture image is associated with each constituent element.

Patent Document 1 discloses a method for generating three-dimensional polygon data, which is one example of such three-dimensional data. In Patent Document 1, a captured image is selected for each polygon constituting the three-dimensional polygon data, and three-dimensional data associated with a texture image based on the selected captured images is generated.

Japanese Unexamined Patent Publication No. 2003-337953

However, when a captured image is selected for each polygon, the selected images may differ greatly between adjacent polygons, which causes a texture misalignment between the polygons.

Therefore, an object of the present invention is to suppress texture misalignment between polygons in three-dimensional polygon data with which a texture image is associated.

An image processing apparatus according to one aspect of the present invention generates a texture image corresponding to three-dimensional polygon data of an object, based on a plurality of captured images acquired by imaging the object with a plurality of imaging devices. The apparatus has a determination means that determines the pixel value of each of a plurality of pixels, or of each of a plurality of pixel groups, constituting a region corresponding to a polygon of the three-dimensional polygon data, the determination being made for each pixel or pixel group based on one or more of the plurality of captured images, and a generation means that generates the texture image based on the pixel values determined by the determination means.

According to the present invention, texture misalignment between polygons can be suppressed in three-dimensional polygon data with which a texture image is associated.

A block diagram showing an example of the hardware configuration of the image processing apparatus.
A conceptual diagram of the generation of a virtual viewpoint image.
A conceptual diagram of the correspondence between three-dimensional shape data and texture data.
An example of data representing 3D polygon data associated with a texture image.
A conceptual diagram showing the front direction of a polygon.
A diagram showing a configuration example of the system used for the texture image generation process of Embodiment 1.
A block diagram showing an example of the functional configuration of the image processing apparatus in Embodiment 1.
A flowchart showing an example of the image processing method in Embodiment 1.
A block diagram showing an example of the functional configuration of the image processing apparatus in Embodiment 2.
A flowchart showing an example of the image processing method in Embodiment 2.
A diagram explaining the concept of the texture image generation process of Embodiment 2.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. The following embodiments do not limit the present invention, and not all combinations of the features described in the embodiments are essential to the solution provided by the present invention. Identical components are described with the same reference numerals.

In the present embodiment, three-dimensional data whose constituent elements are associated with textures is called a three-dimensional model. Three-dimensional shape data (sometimes simply called shape data) is data representing the three-dimensional shape of an object (subject) and is typically expressed as polygon data. The description here assumes that the three-dimensional shape data and the shape of the three-dimensional model are expressed using polygon data. The elements of polygon data are polygons, and each polygon is defined by a plurality of vertices. A polygon is not limited to a triangle and may be a quadrangle or a polygon with more sides. In the following, a three-dimensional model whose shape is expressed with triangular polygons is used as the example.

A texture includes at least one of color information, luminance information, and saturation information. A texture image is an image having a plurality of texture regions, each of which is associated with one of the polygons. Each texture region contains at least one of the color information, luminance information, and saturation information of its corresponding polygon.

FIG. 2 is a conceptual diagram of the generation of a virtual viewpoint image. FIG. 2A shows the shape data of a subject, and FIG. 2B shows the texture image of the subject. The texture image consists of a plurality of texture regions, each corresponding to a polygon or to a region made up of a plurality of polygons. The pixel values of a texture region represent the texture of the corresponding polygon or multi-polygon region. The texture image shown in FIG. 2B has two texture regions, which correspond to the two hexagons in FIG. 2A. By combining FIGS. 2A and 2B and performing rendering based on a determined viewpoint, a virtual viewpoint image corresponding to that viewpoint is generated, as shown in FIG. 2C. The viewpoint may be determined by a user or automatically by the apparatus. Determining the viewpoint may include at least determining the position and orientation (direction) of the viewpoint.

Next, a representation for associating FIGS. 2A and 2B will be described with reference to FIG. 3. This corresponds to generating a three-dimensional model (hereinafter, a 3D polygon model) by associating texture data with each polygon of shape data composed of a plurality of triangular polygons (hereinafter, 3D polygon data). A 3D polygon model in which texture data is associated with each polygon may also be called a textured 3D polygon model.

FIG. 3A explicitly shows the polygons T0 to T11, which are the elements expressing the shape data of the subject shown in FIG. 2A, and the vertices V0 to V11 that constitute them. FIG. 3B shows the texture image of FIG. 2B together with the positions P0 to P13 that correspond to the polygon vertices of FIG. 3A. As information for associating the two, a correspondence table is generated, as shown in FIG. 3C, that lists for each polygon the IDs of its polygon vertices in three-dimensional space and the IDs of the corresponding texture vertices in texture image space. This table allows a texture to be applied to each polygon.

The coordinates in FIG. 3A are three-dimensional spatial coordinates on the x, y, and z axes, and the coordinates in FIG. 3B are two-dimensional image-space coordinates on the u and v axes. In many cases, as with polygon vertices V0 to V4 and V7 to V11 in FIG. 3C, polygon vertices and texture vertices correspond one to one, so they can be expressed with matching index numbers. However, a single polygon vertex in three-dimensional space can correspond to texture vertices at several different positions in texture image space; for example, polygon vertex V5 corresponds to both texture vertices P5 and P12. The vertex IDs and texture vertex IDs are managed independently so that the virtual viewpoint image can be generated even with such a correspondence.

FIG. 4 shows a data representation of a textured 3D polygon model. Specifically, FIG. 4A is data giving the correspondence between the polygon vertices shown in FIG. 3A and their coordinates. FIG. 4B is data giving the correspondence between the texture vertices shown in FIG. 3B and their coordinates. FIG. 4C is identical to FIG. 3C and is data giving the correspondence among each triangular polygon, its polygon vertices, and its texture vertices. FIG. 4D is the texture image shown in FIG. 2B (or FIG. 3B).
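The tables of FIG. 4 can be pictured with a small data structure. The following is only a minimal sketch: the class name `TexturedMesh`, its field names, and the toy coordinates are illustrative assumptions, not taken from the publication. It shows why polygon vertex IDs and texture vertex IDs are kept as separate index lists: one polygon vertex may map to more than one texture vertex, as V5 does with P5 and P12.

```python
from dataclasses import dataclass

Vec3 = tuple[float, float, float]
Vec2 = tuple[float, float]
Tri = tuple[int, int, int]


@dataclass
class TexturedMesh:
    """Hypothetical container mirroring the tables of FIG. 4A to 4C."""
    vertices: list[Vec3]              # polygon vertex ID -> (x, y, z)
    tex_vertices: list[Vec2]          # texture vertex ID -> (u, v)
    triangles: list[tuple[Tri, Tri]]  # (polygon vertex IDs, texture vertex IDs)

    def corners(self, t: int) -> list[tuple[Vec3, Vec2]]:
        """Return the (xyz, uv) pair for each corner of triangle t."""
        vert_ids, tex_ids = self.triangles[t]
        return [(self.vertices[v], self.tex_vertices[p])
                for v, p in zip(vert_ids, tex_ids)]

    def texture_ids_of(self, vertex_id: int) -> set[int]:
        """Collect every texture vertex ID used for one polygon vertex."""
        found = set()
        for vert_ids, tex_ids in self.triangles:
            for v, p in zip(vert_ids, tex_ids):
                if v == vertex_id:
                    found.add(p)
        return found


# Toy data (assumed): two triangles share polygon vertices 1 and 2,
# but their texture regions are placed separately in uv space.
mesh = TexturedMesh(
    vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
    tex_vertices=[(0.0, 0.0), (0.4, 0.0), (0.0, 0.4),
                  (0.6, 0.6), (1.0, 0.6), (0.6, 1.0)],
    triangles=[((0, 1, 2), (0, 1, 2)), ((1, 3, 2), (3, 4, 5))],
)
```

In this toy mesh, polygon vertex 1 is reached through texture vertices 1 and 3, which a single shared index could not express.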

The order in which the polygon vertex IDs are listed in FIG. 4A also serves to define the front direction of the face. For example, polygon T0 consists of three vertices, which can be listed in six different orders. The front direction is often defined as the direction given by the right-hand screw rule applied to the rotation obtained when the vertices are traced in order from the left. FIGS. 5A and 5B each show a vertex description order paired with the resulting front direction. In FIG. 5A the front direction points from the back of the page toward its front, and in FIG. 5B it points from the front of the page toward its back. The present embodiment can be applied with either direction.
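The relation between vertex order and front direction can be checked numerically. The sketch below assumes the common convention that the front direction is the normalized cross product of the two edges leaving the first vertex, which follows the right-hand screw rule; the function name `front_normal` is hypothetical.

```python
import math


def front_normal(v0, v1, v2):
    """Unit normal of triangle (v0, v1, v2) under the right-hand screw rule."""
    e1 = [b - a for a, b in zip(v0, v1)]  # edge v0 -> v1
    e2 = [b - a for a, b in zip(v0, v2)]  # edge v0 -> v2
    n = (e1[1] * e2[2] - e1[2] * e2[1],
         e1[2] * e2[0] - e1[0] * e2[2],
         e1[0] * e2[1] - e1[1] * e2[0])
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        raise ValueError("degenerate triangle has no front direction")
    return tuple(c / length for c in n)


# Counter-clockwise order seen from +z gives a normal along +z;
# swapping two vertices flips the front direction.
ccw = front_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))
cw = front_normal((0, 0, 0), (0, 1, 0), (1, 0, 0))
```

Swapping any two vertices reverses the traversal direction and therefore the sign of the normal, which is why the listing order carries meaning.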

The present embodiment is not bound to the data representation described above; for example, polygons with four or more sides may be used. The embodiment can also be applied to various other cases, such as when the correspondence between shape and texture is expressed by writing coordinates directly rather than by using indices.

<Embodiment 1>
An outline of the texture image generation process of the present embodiment will first be described with reference to FIG. 6. FIG. 6 shows a configuration example of the system used for the texture image generation process. In the present embodiment, the texture image is generated in two main steps.

In the first step, coordinates on the texture image 608 are determined for each polygon of the 3D polygon model of the subject. First, 3D polygon data of the subject 602 is generated based on a group of images of the subject 602 captured by a plurality of different imaging devices (viewpoints) 601. Then the captured image 604 is selected, which comes from the imaging device that captures the polygon of interest 603 in the 3D polygon data at high resolution and from nearly head-on. Next, the pixel region 607 corresponding to the polygon of interest 603 is detected in the selected captured image 604, a texture region 609 corresponding to the pixel region 607 is placed at an arbitrary position in the texture image 608, and its coordinates are determined. As a result, a texture region 609 representing the texture of the polygon of interest 603 of the subject's 3D polygon data is reserved on the texture image 608 in association with that polygon.

However, the captured image 604 is not the only captured image that contains the region on the subject corresponding to the polygon of interest 603; the captured image 605 is another. Suppose, for example, that the captured image 605 were selected to determine the coordinates on the texture image 608. The imaging direction of the imaging device 601 that acquires the captured image 605 makes a larger angle with the direction directly facing the region on the subject corresponding to the polygon of interest 603 than does the imaging direction of the imaging device 601 that acquires the captured image 604. In addition, the imaging device 601 that acquires the captured image 605 captures a wider angle of view than the one that acquires the captured image 604. The region corresponding to the polygon of interest 603 therefore has a smaller area on the captured image 605 than on the captured image 604. Consequently, if the captured image 605 is selected to determine the coordinates on the texture image 608, the texture region is smaller than when the captured image 604 is selected, and the texture region corresponding to the polygon of interest 603 has a lower resolution. Here, texture resolution is defined by the number of pixels.

In view of the above, the present embodiment selects, from among the plurality of captured images, the one in which the polygon of interest is captured at high resolution and from nearly head-on, determines the coordinates on the texture image 608, and reserves the texture region 609. This makes it possible to generate a high-resolution textured 3D polygon model. In this description the texture region 609 reserved on the texture image 608 is a triangular region of the same size as the pixel region 607, but this is not a limitation. Another method may be used, such as reserving the texture region 609 on the texture image 608 as a rectangular region that encloses the pixel region 607.
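One way to picture the first-step selection is to project the polygon into each camera and compare the projected areas in pixels, since both an oblique view and a wide angle of view shrink that area. The sketch below is an assumption made for illustration: the passage does not state the selection criterion as a formula, and the pinhole `Camera` class, `projected_area`, and `select_best_camera` are hypothetical names. Occlusion and lens distortion are ignored.

```python
from dataclasses import dataclass


@dataclass
class Camera:
    """Hypothetical ideal pinhole camera (no distortion)."""
    focal_px: float   # focal length in pixels; smaller means wider angle
    position: tuple   # camera centre in world coordinates
    rotation: tuple   # 3 rows: world-to-camera rotation matrix


def project(cam, point):
    """Project a world point to image coordinates, or None if behind the camera."""
    d = [p - c for p, c in zip(point, cam.position)]
    x, y, z = (sum(row[i] * d[i] for i in range(3)) for row in cam.rotation)
    if z <= 0:
        return None
    return (cam.focal_px * x / z, cam.focal_px * y / z)


def projected_area(cam, tri):
    """Area in square pixels of the triangle as seen by the camera."""
    pts = [project(cam, p) for p in tri]
    if any(p is None for p in pts):
        return 0.0
    (x0, y0), (x1, y1), (x2, y2) = pts
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def select_best_camera(tri, cams):
    """Index of the camera in which the triangle covers the most pixels."""
    return max(range(len(cams)), key=lambda i: projected_area(cams[i], tri))


IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
SIDE_VIEW = ((0, 0, -1), (0, 1, 0), (1, 0, 0))  # optical axis along world +x

tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
cams = [
    Camera(1000.0, (0.0, 0.0, -5.0), IDENTITY),   # frontal, narrow angle
    Camera(500.0, (0.0, 0.0, -5.0), IDENTITY),    # frontal, wide angle
    Camera(1000.0, (-5.0, 0.0, 0.0), SIDE_VIEW),  # sees the triangle edge-on
]
best = select_best_camera(tri, cams)
```

With these toy cameras the frontal narrow-angle view covers the most pixels, the wide-angle view covers a quarter as many, and the edge-on view covers none, which matches the comparison between the captured images 604 and 605 above.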

Meanwhile, a known method can be used to generate the 3D polygon model. The generated 3D polygon model may, however, contain errors. The error becomes especially large when the subject is far from the imaging devices, as when a plurality of imaging devices are installed in a stadium to generate 3D polygon models of players on the field. When the 3D polygon model contains errors, the texture associated with a polygon of interest may differ from the actual texture. This becomes a problem when generating the texture image for the 3D polygon model. Suppose, for example, that the pixel values of the texture image are determined uniformly using only the imaging device selected in the first step. If, for a polygon adjacent to the polygon of interest 603, the pixel region is taken from an image captured by a different imaging device than the one used for the polygon of interest 603, the errors in the 3D polygon data cause a texture misalignment between the adjacent polygons. The misalignment is inconspicuous where the texture is uniform, but when the subject is a person, for example, it degrades image quality by producing unnatural breaks in the face or in the number on a uniform. Thus, using a single captured image for each polygon yields a high-resolution texture image, but the errors in the 3D polygon data cause texture misalignment between polygons and therefore image quality degradation.

The present embodiment therefore performs the following second step. In the second step, the pixel value of each of the pixels contained in the texture region 609, which was reserved on the texture image 608 in the first step for the polygon of interest 603, is determined based on the images captured by a plurality of different imaging devices 601. Specifically, for each pixel constituting the texture region 609 (or the texture image 608), an evaluation value is first calculated based on factors such as the resolution on the captured image and how directly the imaging device faces the normal direction of the subject 602 at the point corresponding to the pixel of interest. Then the pixel values of two or more captured images, from two or more imaging devices chosen based on the calculated evaluation values, are combined according to their respective ratios to determine the pixel value on the texture region 609 (or the texture image 608). Determining an appropriate ratio for each pixel of the texture region 609 (or the texture image 608) in this way reduces the image quality degradation caused by texture misalignment between polygons.
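The per-pixel combination in the second step can be sketched as a weighted average. This is only an illustration under stated assumptions: the evaluation values are taken as given inputs because this passage does not specify how they are computed, the ratios are assumed to be the evaluation values normalized to sum to one, and the function name `blend_pixel` and its parameter `k` are hypothetical.

```python
def blend_pixel(samples, k=2):
    """Blend one texture pixel from several captured images.

    samples: list of (color, evaluation_value) pairs, one per captured image
             in which the surface point of this texture pixel is visible.
    k:       number of top-ranked captured images to combine (assumed).
    Returns the blended color, or None if no sample has a positive value.
    """
    ranked = sorted((s for s in samples if s[1] > 0),
                    key=lambda s: s[1], reverse=True)[:k]
    if not ranked:
        return None
    total = sum(weight for _, weight in ranked)
    channels = len(ranked[0][0])
    # Each image contributes in proportion to its evaluation value.
    return tuple(sum(color[c] * weight / total for color, weight in ranked)
                 for c in range(channels))


# Three candidate images observe the same surface point (toy RGB values).
samples = [((100, 0, 0), 3.0), ((0, 100, 0), 1.0), ((0, 0, 100), 0.5)]
blended = blend_pixel(samples)          # combines the two best images
single = blend_pixel(samples, k=1)      # falls back to the single best image
```

Because the weights are recomputed for every texture pixel, the contribution of each camera changes gradually across a polygon boundary rather than switching abruptly at it. Setting `k=1` corresponds to using a single captured image per pixel.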

In other words, in the first step the present embodiment reserves texture regions so as to guarantee the resolution of the texture image. In the second step, for each pixel contained in a high-resolution texture region, two or more captured images are combined at an appropriate ratio to determine the pixel value, which reduces the image quality degradation caused by texture misalignment between polygons. As a result, a high-quality textured 3D polygon model can be generated that has no texture misalignment between polygons and a high texture resolution.

In the second step, the pixel value does not have to be determined from two or more captured images for every individual pixel of the texture region 609. For example, the pixel value may be determined from two or more captured images for each pixel group made up of a plurality of pixels of the texture region 609. In this case, each pixel group is one of several sub-regions into which the texture region 609 is divided, so it has fewer pixels than the texture region 609. Alternatively, the pixel value may be determined from a single captured image for each pixel or pixel group of the texture region 609. In this case too, the pixel values are determined in units smaller than a polygon, so texture misalignment between adjacent polygons is reduced. That is, one or more captured images may be selected for each pixel or pixel group of the texture region 609, and the pixel value of that pixel or pixel group may be determined based on the selected image or images.

The above is an outline of the texture image generation process performed in the present embodiment. The embodiment is described using the example of imaging people such as players in a stadium-scale environment, but it is not limited to this and is also applicable to small-scale imaging environments, such as imaging an arbitrary subject in a studio.

A specific configuration of the present embodiment is described below. FIG. 1 shows an example of the hardware configuration of the image processing apparatus of the present embodiment. The image processing apparatus 100 includes a CPU 101, a RAM 102, a ROM 103, a secondary storage device 104, an input interface 105, and an output interface 106. These components are connected to one another by a system bus 107. The image processing apparatus 100 is connected to an external storage device 108 via the input interface 105, and to the external storage device 108 and a display device 109 via the output interface 106.

The CPU 101 is a processor that executes programs stored in the ROM 103 using the RAM 102 as work memory and centrally controls each component of the image processing apparatus 100 via the system bus 107. The various processes described later are executed in this way.

The secondary storage device 104 is a storage device that stores the various data handled by the image processing apparatus 100; an HDD is used in the present embodiment. The CPU 101 can write data to the secondary storage device 104 and read data stored in it via the system bus 107. Besides an HDD, various storage devices such as an optical disk drive or flash memory can be used as the secondary storage device 104.

The input interface 105 is a serial bus interface such as USB or IEEE 1394, and data, commands, and the like are input from external devices to the image processing apparatus 100 through it. The image processing apparatus 100 acquires data from the external storage device 108 (for example, a storage medium such as a hard disk, memory card, CF card, SD card, or USB memory) via the input interface 105. Input devices such as a mouse or buttons (not shown) can also be connected to the input interface 105. The output interface 106, like the input interface 105, includes a serial bus interface such as USB or IEEE 1394. A video output terminal such as DVI or HDMI (registered trademark) can also be used. Data and the like are output from the image processing apparatus 100 to external devices through the output interface 106. The image processing apparatus 100 displays images by outputting processed images and the like to the display device 109 (any of various image display devices such as a liquid crystal display) via the output interface 106. The image processing apparatus 100 has components other than those described above, but they are not the focus of the present invention and their description is omitted.

The processing performed by the image processing apparatus 100 of the present embodiment is described below with reference to FIGS. 7 and 8. FIG. 7 is a block diagram showing the functional configuration of the image processing apparatus 100. FIG. 8 is a flowchart of the image processing method performed by the image processing apparatus 100 in the present embodiment. As shown in FIG. 7, the image processing apparatus 100 has an image data acquisition unit 701, a viewpoint information acquisition unit 702, a 3D polygon acquisition unit 703, a UV coordinate acquisition unit 704, a first calculation unit 705, a region determination unit 706, a second calculation unit 707, and an image generation unit 708. When the CPU 101 executes the program stored in the ROM 103 using the RAM 102 as work memory, the image processing apparatus 100 functions as each of the components shown in FIG. 7 and executes the series of processes shown in the flowchart of FIG. 8. Not all of the processing described below needs to be executed by the CPU 101; the image processing apparatus 100 may be configured so that some or all of the processing is performed by one or more processing circuits other than the CPU 101. The flow of processing performed by each component is described below.

In step S801, the image data acquisition unit 701 acquires, via the input interface 105 or from the secondary storage device 104, a plurality of captured images of the subject taken from a plurality of different viewpoints. The captured images are, for example, images captured by a plurality of imaging devices 601 arranged so as to surround the subject 602 in FIG. 6. In the present embodiment, as shown in FIG. 6, the imaging devices 601 are arranged so as to look down on the floor from above. A different zoom lens may be mounted on each imaging device 601, so the size of the subject in the captured image may differ from one imaging device to another, as in the captured images 604 and 605. The arrangement of the imaging devices 601 and the zoom lens configuration shown in FIG. 6 are merely examples, and the plurality of captured images may be acquired using other configurations.

The image processing apparatus 100 may also be connected to a plurality of imaging devices and configured as part of an image processing system that includes the image processing apparatus 100 and the plurality of imaging devices. With such a configuration, the textured 3D polygon model can be generated while the subject is being imaged. In the present embodiment, the case where the acquired captured images are 3-channel RGB color images is described as an example. However, the present embodiment is also applicable when the captured images are 1-channel grayscale images or moving image data. When the captured images are moving image data, the image processing apparatus 100 can perform the following processing using frames captured by the plurality of imaging devices at substantially the same time. Further, in order to distinguish the captured images, the image data acquisition unit 701 stores each image in association with a number that identifies the imaging device 601 (hereinafter referred to as a viewpoint number). The image data acquisition unit 701 outputs the captured images to the image generation unit 708.

The image data acquisition unit 701 may acquire, instead of the captured image, a subject image, which is the region of the captured image that represents the subject, or it may acquire both. The subject image may be generated by extracting the region of a moving subject from the captured image using a known foreground/background separation method. In particular, it may be a subject image generated in the course of generating three-dimensional shape data from the plurality of captured images. The subject image may also be generated from the captured image and a silhouette image showing the silhouette of the subject. Alternatively, the image data acquisition unit 701 may itself generate the subject image from the captured image. In every case, however, the subject image retains the texture information of the subject.

In step S802, the viewpoint information acquisition unit 702 acquires, for each imaging device, the imaging viewpoint information of the plurality of imaging devices that captured the plurality of captured images acquired by the image data acquisition unit 701. An imaging viewpoint is the viewpoint of an imaging device 601, and the imaging viewpoint information is information about each of the plurality of imaging devices 601. The imaging viewpoint information includes the position and orientation of the imaging device 601 in a predetermined coordinate system, for example, position information of the imaging device 601 and orientation information indicating the optical axis direction. The imaging viewpoint information may also include angle-of-view information of the imaging device 601, such as its focal length or principal point position. Using this imaging viewpoint information, each pixel of a captured image can be associated with a position on the subject appearing in that image. Therefore, for a specific location on the subject, the corresponding pixel in the captured image can be identified and its color information acquired. Further, the imaging viewpoint information can include a distortion parameter indicating the distortion of the captured image acquired by the imaging device 601, as well as imaging parameters such as the F-number, shutter speed, and white balance. The viewpoint information acquisition unit 702 outputs the imaging viewpoint information to the UV coordinate acquisition unit 704, the first calculation unit 705, the area determination unit 706, and the second calculation unit 707.

In step S803, the 3D polygon acquisition unit 703 acquires 3D polygon data that represents the shape of the subject in three-dimensional space with triangular polygons. The 3D polygon data is represented in the data format shown in FIG. 4(A). The 3D polygon acquisition unit 703 applies the visual hull algorithm to acquire voxel information and reconstructs the 3D polygon data from it. Alternatively, the 3D polygon acquisition unit 703 may acquire 3D polygon data that has been generated in advance and stored in the secondary storage device 104 or the external storage device 108.

Any method may be used to generate the 3D polygon data; for example, the voxel information may be converted directly into polygon data. Alternatively, PSR (Poisson surface reconstruction) may be applied to a point cloud obtained from a depth map acquired with an infrared sensor. The point cloud may also be obtained by stereo matching that uses image features, as typified by PMVS (Patch-based Multi-view Stereo). The 3D polygon acquisition unit 703 outputs the acquired 3D polygon model to the UV coordinate acquisition unit 704, the first calculation unit 705, the area determination unit 706, and the second calculation unit 707.

In step S804, the UV coordinate acquisition unit 704 uses the acquired imaging viewpoint information to project each polygon vertex of the acquired 3D polygon model onto each imaging viewpoint, and calculates the uv coordinates of the vertex on the captured image of that imaging viewpoint. That is, for each polygon vertex, the corresponding uv coordinates on all of the captured images are acquired. However, when a projected polygon vertex is found to lie outside the angle of view of the imaging device representing the imaging viewpoint, a negative value is set in the uv coordinates as an error value. This negative value can be used in subsequent processing as a flag that marks the out-of-view imaging viewpoint as unusable. The error value is not limited to this representation; any value may be set as long as it can be distinguished from uv coordinate values obtained within the angle of view. The UV coordinate acquisition unit 704 calculates uv coordinate values for all imaging viewpoints for every polygon vertex included in the 3D polygon data, and outputs the uv coordinate data to the first calculation unit 705.
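The projection with an error value in step S804 can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: it assumes a distortion-free pinhole camera, a world-to-camera rotation matrix, and pixel-unit focal length and principal point, and the names `project_vertex` and `ERROR_UV` are hypothetical.

```python
# Hypothetical error value: any pair distinguishable from in-view uv coordinates.
ERROR_UV = (-1.0, -1.0)


def project_vertex(vertex, cam_pos, rot, focal, principal, size):
    """Project a 3D vertex into one imaging viewpoint.

    Assumptions (not from the source): pinhole model without lens distortion,
    `rot` is a 3x3 world-to-camera rotation given as three rows, `focal` and
    `principal` are in pixels, and `size` is (width, height) of the image.
    Returns (u, v), or ERROR_UV when the vertex is outside the angle of view.
    """
    # Vector from the camera position to the vertex, in world coordinates.
    d = [vertex[i] - cam_pos[i] for i in range(3)]
    # Rotate into camera coordinates.
    x, y, z = (sum(rot[r][i] * d[i] for i in range(3)) for r in range(3))
    if z <= 0.0:
        return ERROR_UV  # behind the camera
    u = focal * x / z + principal[0]
    v = focal * y / z + principal[1]
    width, height = size
    if not (0.0 <= u < width and 0.0 <= v < height):
        return ERROR_UV  # outside the image bounds
    return (u, v)


def project_all(vertices, cameras):
    """Return uv[vertex_index][viewpoint_number] for every vertex and camera."""
    return [[project_vertex(p, **cam) for cam in cameras] for p in vertices]
```

Each entry of `cameras` is assumed to be a dictionary with the keys `cam_pos`, `rot`, `focal`, `principal`, and `size`; later steps only need to test whether a stored uv coordinate is negative.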

In step S805, the first calculation unit 705 calculates an evaluation value Q at each imaging viewpoint for every polygon included in the 3D polygon data acquired from the 3D polygon acquisition unit 703, using the imaging viewpoint information acquired from the viewpoint information acquisition unit 702. The calculated evaluation value Q is used in the processing of the first step of texture image generation described above. Specifically, it serves as the index for selecting one imaging device (captured image) from the plurality of imaging devices (captured images) when determining the coordinates of the texture region in the texture image. The evaluation value Q is calculated for every imaging viewpoint for each polygon. The first calculation unit 705 outputs the calculated evaluation values Q to the area determination unit 706. The method of calculating the evaluation value Q in the present embodiment is described below.

(Method of calculating the evaluation value Q)
The method of calculating the evaluation value Q will be described in detail. The following describes the calculation for one polygon and one imaging viewpoint. The same processing is applied to every polygon included in the 3D polygon data, once for each imaging viewpoint, to calculate the evaluation values of all polygons. Since the evaluation value Q is calculated for each imaging viewpoint, each polygon holds as many evaluation values as there are imaging viewpoints acquired by the viewpoint information acquisition unit 702. In the following, the imaging viewpoint under consideration is referred to as the viewpoint of interest.

First, using the uv coordinate data acquired from the UV coordinate acquisition unit 704, it is determined whether all of the vertices constituting the polygon are within the angle of view of the viewpoint of interest. As described above, a negative value is held in the uv coordinates as an error value when a vertex is outside the angle of view, so the determination checks that the uv coordinates of all polygon vertices are positive. If the uv coordinates of all polygon vertices are positive, the calculation of the evaluation value Q proceeds. Otherwise, the evaluation value Q is set to a negative value. The evaluation value Q set in that case is not limited to a negative value; any numerical value that can be distinguished from a calculated evaluation value Q may be set.

Next, as the representative point of the polygon of interest, the centroid C of the three vertices constituting the polygon is calculated, together with the normal direction vector N of the polygon. Further, the direction vector CA toward the viewpoint of interest is calculated from the position coordinates of the viewpoint of interest in three-dimensional space and the centroid C. By taking the inner product of the normal direction vector N and the direction vector CA, the cosine cosθ of the angle θ formed by the two direction vectors is calculated. Here, the normal direction is the direction perpendicular to the polygon surface, pointing to the front side as defined by the order of the polygon vertices. The direction of the viewpoint of interest is the direction from the centroid C of the polygon toward the viewpoint of interest defined by the external parameters. The external parameters are the position information of the imaging device and the orientation information indicating its optical axis direction.

When the calculated cosθ is negative, the front surface of the polygon does not appear in the captured image of the viewpoint of interest. Therefore, when cosθ is negative, the evaluation value Q is set to a negative value so that the captured image of the viewpoint of interest is not used. When cosθ is positive, the calculation of the evaluation value Q proceeds.

Next, the resolution of the pixel region corresponding to the polygon in the captured image of the viewpoint of interest is calculated, and the evaluation value Q is calculated from it together with the calculated cosθ. Here, the resolution is defined as the area S (for example, the number of pixels) of the triangle projected onto the captured image of the viewpoint of interest. Imaging viewpoints with higher resolution are selected preferentially. However, as described above, when the shape contains errors, a large inclination of the viewpoint relative to the subject surface distorts the texture. Therefore, in addition to the resolution, cosθ is used in the evaluation value Q as a distortion index W. This makes it possible to select a high-resolution texture region while distinguishing imaging viewpoints that are likely to produce a heavily distorted mapping from those that are likely to produce a mapping with little distortion. In summary, the evaluation value Q is given by Equation 1 below.

Q = S × W   (Equation 1)
The resolution S may be defined in other ways, as long as it serves as an index of the resolving power of the imaging device (imaging viewpoint) that images the partial region of the subject corresponding to the polygon. For example, the resolution S may be calculated by a mathematical formula or a lookup table from the focal length, the distance between the imaging device and the partial region of the subject corresponding to the polygon, and the like. The texture distortion index W may also be something other than cosθ: the angle itself, or a weight set according to the angle, may be used. For example, an angle threshold may be set, with W = 1 when the angle is at or below the threshold and W = 0 when it is larger. The weight does not have to be binary; it may take three or more values, or it may be set so that it decreases linearly or nonlinearly from angle 0 to the threshold angle. In the present embodiment, the resolution index S and the distortion index W are given the same degree of influence on the evaluation value Q, but this is not a limitation. The degrees of influence may be made different, for example by squaring one of the indices.
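The calculation of the evaluation value Q described above can be sketched as follows. This is an illustrative sketch under stated assumptions: S is taken as the projected triangle area in pixels and W as cosθ, the front face is assumed to be the side from which the vertices appear counter-clockwise, and the function name `evaluation_q` and the error value `-1.0` are hypothetical.

```python
import math


def sub(a, b):
    return tuple(a[i] - b[i] for i in range(3))


def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def evaluation_q(tri, cam_pos, uv):
    """Evaluation value Q = S x W for one polygon and one viewpoint of interest.

    tri:     three 3D vertices; vertex order defines the front face (assumed
             counter-clockwise when seen from the front).
    cam_pos: 3D position of the viewpoint of interest.
    uv:      the three projected vertices on that viewpoint's captured image;
             negative coordinates mean "outside the angle of view".
    Returns a negative error value when the viewpoint must not be used.
    """
    # 1) All vertices must be inside the angle of view.
    if any(u < 0 or v < 0 for u, v in uv):
        return -1.0
    # 2) Centroid C, normal N, and direction CA from C toward the viewpoint.
    centroid = tuple(sum(p[i] for p in tri) / 3 for i in range(3))
    normal = cross(sub(tri[1], tri[0]), sub(tri[2], tri[0]))
    ca = sub(cam_pos, centroid)
    denom = math.sqrt(dot(normal, normal)) * math.sqrt(dot(ca, ca))
    if denom == 0.0:
        return -1.0  # degenerate triangle or camera located at the centroid
    cos_theta = dot(normal, ca) / denom
    if cos_theta < 0.0:
        return -1.0  # the front face is not visible from this viewpoint
    # 3) Resolution index S: area of the projected triangle in pixels.
    (u0, v0), (u1, v1), (u2, v2) = uv
    area = 0.5 * abs((u1 - u0) * (v2 - v0) - (u2 - u0) * (v1 - v0))
    # 4) Q = S x W with W = cos(theta).
    return area * cos_theta
```

Replacing the last line with, for example, `area * cos_theta ** 2` would weight the distortion index more heavily, which corresponds to the variation mentioned in the text of squaring one of the indices.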

Returning to FIG. 8, in step S806, the area determination unit 706 determines the texture region corresponding to each polygon on the texture image, based on the per-polygon evaluation values Q acquired from the first calculation unit 705. Specifically, the area determination unit 706 determines the coordinates on the texture image that define the texture region. For each polygon, the area determination unit 706 selects the imaging viewpoint whose evaluation value Q, as calculated by the first calculation unit 705, is the largest. Hereinafter, the selected imaging viewpoint is referred to as the selected viewpoint.
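Selecting the viewpoint with the largest evaluation value Q for each polygon can be sketched as below. The dictionary layout and the name `select_viewpoint` are assumptions made for illustration; negative values are treated as the error value described earlier.

```python
def select_viewpoint(q_values):
    """Return the viewpoint number with the largest valid Q for one polygon.

    q_values maps viewpoint number -> evaluation value Q (hypothetical layout).
    Negative values are error values and are skipped. Returns None when no
    viewpoint is usable.
    """
    best = None
    for viewpoint, q in q_values.items():
        if q < 0:
            continue  # out of view or back-facing
        if best is None or q > q_values[best]:
            best = viewpoint
    return best


def select_all(q_table):
    """Apply the selection to every polygon: polygon id -> selected viewpoint."""
    return {polygon: select_viewpoint(qs) for polygon, qs in q_table.items()}
```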

Next, the captured image of the selected viewpoint is projected onto the corresponding polygon, and a region of the same size as the projected pixel region of that captured image is reserved on the texture image as the region corresponding to the texture of the polygon. So that the polygon can refer to the required position on the texture image, the uv coordinates on the texture image corresponding to the reserved texture region are calculated, and data associating the polygon with the texture region is stored in the format shown in FIG. 4(C). In addition, for each pixel in the reserved texture region, the area determination unit 706 holds three items of data: the identification ID of the corresponding polygon, the viewpoint number of the selected viewpoint, and the corresponding pixel coordinates (u, v) on the captured image of the selected viewpoint. With this information, each pixel included in the texture image can be associated with a polygon, and the three-dimensional coordinates corresponding to the pixel can be calculated. The specific calculation method is described later.

The per-polygon texture regions reserved on the texture image may be arranged in any order and at any position on the texture image. When the texture image contains pixels that do not correspond to any polygon included in the 3D polygon data (for example, the white region in FIG. 2(B)), a dummy value is given as the polygon identification ID of those pixels. For example, a negative value may be given as the identification ID.

The area determination unit 706 determines the arrangement of the texture regions on the texture image for all polygons. The area determination unit 706 then outputs, to the second calculation unit 707 and the image generation unit 708, a texture image that holds for each pixel the polygon identification ID, the identification number of the selected viewpoint, and the pixel coordinates on the captured image of the selected viewpoint.

In step S807, the second calculation unit 707 sets, from among the pixels included in the texture image, the pixel of interest whose pixel value is to be determined. Here, the upper-left pixel of the texture image is selected first as the pixel of interest. Thereafter, each time the determination of the pixel value of the pixel of interest is completed (No in S810), a pixel that has not yet been selected is chosen as the new pixel of interest, proceeding toward the lower right. The selection order is not limited to this, and the pixel of interest may be determined in any order.

In step S808, the second calculation unit 707 calculates an evaluation value V of the pixel of interest for each imaging viewpoint, using the texture image acquired from the area determination unit 706. The evaluation value V calculated here is used as the index for determining the texture pixel value at the pixel of interest. The second calculation unit 707 outputs the evaluation values V calculated for all viewpoints to the image generation unit 708, together with the texture distortion index used in calculating them and the pixel coordinates held for each imaging viewpoint. The method of calculating the evaluation value V is described below.

(Method of calculating the evaluation value V)
The method of calculating the evaluation value V in S808 will be described in detail. First, as preparation for calculating the evaluation value V, the three-dimensional coordinates and the normal vector on the 3D polygon data that correspond to the pixel of interest are calculated. In the present embodiment, the three-dimensional coordinates corresponding to the pixel of interest are calculated using the polygon identification ID, the identification number of the selected viewpoint, and the pixel coordinates on the captured image of the selected viewpoint, all of which are held in the texture image. Specifically, a virtual ray is cast from the three-dimensional position of the selected viewpoint, obtained from the imaging viewpoint information of the viewpoint information acquisition unit 702, through the pixel coordinates on the captured image of the selected viewpoint. The point at which this ray hits the polygon corresponding to the polygon identification ID (hereinafter referred to as the polygon of interest) is taken as the three-dimensional coordinates corresponding to the pixel of interest. Further, the normal vector of the pixel of interest is calculated from the three-dimensional coordinates corresponding to the pixel of interest, the three-dimensional coordinates of the three vertices of the polygon of interest, and the normal vectors of those three vertices. Specifically, the normal vector at the three-dimensional coordinates corresponding to the pixel of interest is calculated as a weighted sum of the vertex normal vectors, with weights based on the distances between those three-dimensional coordinates and each of the three vertices of the polygon of interest.
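The preparation step above (finding the 3D point for a texture pixel and interpolating its normal) might look like the following sketch. Two techniques are swapped in here as assumptions: the ray/polygon hit is computed with the Möller-Trumbore intersection test, and the weights for the vertex normals are the barycentric coordinates of the hit point, where the text only states that the weights are based on distances to the vertices. All function names are hypothetical.

```python
import math


def sub(a, b):
    return tuple(a[i] - b[i] for i in range(3))


def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle(origin, direction, tri, eps=1e-9):
    """Moller-Trumbore test. Returns (hit_point, (w0, w1, w2)) or None.

    origin:    3D position of the selected viewpoint.
    direction: ray direction through the stored pixel coordinates.
    (w0, w1, w2) are the barycentric weights of the hit point for tri[0..2].
    """
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the polygon plane
    inv = 1.0 / det
    t_vec = sub(origin, v0)
    u = dot(t_vec, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(t_vec, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    if t <= eps:
        return None  # the polygon is behind the ray origin
    point = tuple(origin[i] + t * direction[i] for i in range(3))
    return point, (1.0 - u - v, u, v)


def interpolate_normal(weights, vertex_normals):
    """Weighted sum of the three vertex normals, normalized to unit length."""
    n = tuple(sum(w * nv[i] for w, nv in zip(weights, vertex_normals))
              for i in range(3))
    length = math.sqrt(dot(n, n))
    if length == 0.0:
        return None
    return tuple(c / length for c in n)
```

Barycentric weights are 1 at a vertex and fall to 0 on the opposite edge, so they give the nearer vertices more influence and avoid the division by zero that plain inverse-distance weights would hit exactly at a vertex.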

Next, the evaluation value V is calculated. The following describes the calculation of the evaluation value V for one of the plurality of imaging viewpoints; the evaluation value V for each of the other imaging viewpoints is calculated by the same processing. In the following, this one imaging viewpoint is referred to as the viewpoint of interest.

First, a virtual ray is cast from the three-dimensional coordinates corresponding to the pixel of interest, calculated as described above, in the direction of the normal vector, and it is determined whether the ray falls within the angle of view of the viewpoint of interest. If it does not, the three-dimensional coordinates are not visible from the viewpoint of interest, so an error value is set as the evaluation value V in order to exclude the viewpoint of interest from the calculation of the pixel value of the pixel of interest. In the present embodiment, a negative value is used as the error value, but the error value is not limited to this. If the ray does fall within the angle of view, the pixel coordinates at which it arrives are held, and the evaluation value V for the pixel of interest is calculated.

In the present embodiment, following the same approach as the evaluation value Q described for step S805, the evaluation value V = S × W is calculated using a resolution index S and a texture distortion index W. Here, however, the resolution index S is calculated as di/fi, using the distance di between the three-dimensional coordinates corresponding to the pixel of interest and the three-dimensional position of the viewpoint of interest, and the focal length fi of the viewpoint of interest. The subscript i identifies the viewpoint of interest among the plurality of imaging viewpoints. The method of calculating the resolution index S is not limited to the above, and any other method that determines the size of the subject in the image of the viewpoint of interest may be used. For example, the angle of view may be used instead of the focal length.

The higher the value of the resolution index S, the higher the resolution at which the three-dimensional coordinates corresponding to the pixel of interest are imaged. For the texture distortion index W, the direction vector CA toward the viewpoint of interest is calculated from the position coordinates of the viewpoint of interest in three-dimensional space and the three-dimensional coordinates corresponding to the pixel of interest, and cosθ, obtained as the inner product of the direction vector CA and the normal vector calculated in the preparation step, is used. The texture distortion index is not limited to this, however; as described for the evaluation value Q, it may instead be set using the angle itself or a weight based on the angle.

The larger the value of the texture distortion index W, the closer to head-on the viewpoint of interest images the three-dimensional coordinates corresponding to the pixel of interest. Accordingly, the larger the evaluation value V calculated as the product of the above indices, the higher the resolution and the smaller the texture distortion with which those three-dimensional coordinates are imaged from the viewpoint of interest. In the present embodiment, the evaluation value V weights the two indices equally, but this is not a limitation. When resolution is to be emphasized, adjustments may be made, for example by raising the resolution index to a power before multiplying it by the texture distortion index. The second calculation unit 707 outputs the evaluation values V calculated for all viewpoints to the image generation unit 708, together with the texture distortion index used in calculating them and the pixel coordinates (uv coordinates) held for each imaging viewpoint.
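The per-pixel evaluation value V can be sketched as follows. The sketch follows the text literally in taking the resolution index as di/fi; under a pinhole model the apparent size of the subject scales with fi/di, and the text allows other definitions of S, so the index is isolated in a separate helper that can be swapped. The visibility test is assumed to have been done elsewhere and is passed in as a flag; all names are hypothetical.

```python
import math


def resolution_index(distance, focal):
    """Resolution index S as written in the text: d_i / f_i.

    Kept as a separate helper because the text permits other definitions
    (for example, one based on the angle of view).
    """
    return distance / focal


def evaluation_v(point, normal, cam_pos, focal, visible=True):
    """Evaluation value V = S x W for one pixel of interest and one viewpoint.

    point:   3D coordinates corresponding to the pixel of interest.
    normal:  interpolated normal vector at that point.
    cam_pos: 3D position of the viewpoint of interest.
    focal:   focal length f_i of the viewpoint of interest.
    visible: result of the angle-of-view test (assumed to be computed elsewhere).
    Returns a negative error value when the viewpoint must be excluded.
    """
    if not visible:
        return -1.0
    ca = tuple(cam_pos[i] - point[i] for i in range(3))
    distance = math.sqrt(sum(c * c for c in ca))
    n_len = math.sqrt(sum(c * c for c in normal))
    if distance == 0.0 or n_len == 0.0:
        return -1.0
    # Distortion index W: cosine of the angle between the normal and CA.
    cos_theta = sum(normal[i] * ca[i] for i in range(3)) / (distance * n_len)
    if cos_theta < 0.0:
        return -1.0  # the surface faces away from this viewpoint
    return resolution_index(distance, focal) * cos_theta
```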

Returning to FIG. 8, in step S809, the image generation unit 708 determines the pixel value of the pixel of interest based on the evaluation values V of all imaging viewpoints for the pixel of interest, acquired from the second calculation unit 707. To account for the degradation in image quality caused by texture misalignment between polygons, the image generation unit 708 determines the pixel value from a plurality of captured images. The method of determining the pixel value is described below.

(Method of determining the pixel value)
The method of determining the pixel value in S809 will be described in detail. The image generation unit 708 determines the pixel value from a plurality of captured images. First, it finds the imaging viewpoint j1 with the largest evaluation value V among the evaluation values V of all viewpoints, and acquires the evaluation value Vj1. Next, from the imaging viewpoints other than j1, it finds the imaging viewpoint j2 with the largest evaluation value V (that is, the viewpoint with the second largest evaluation value V when j1 is included), and acquires the evaluation value Vj2. Then, from the evaluation values Vj1 and Vj2 of the detected imaging viewpoints j1 and j2, it determines the blend ratio (mixing ratio) B of the captured images of j1 and j2 according to Equations 2 and 3. The blend ratio of every other viewpoint is set to 0.

Ｂｊ１＝Ｖｊ１／（Ｖｊ１＋Ｖｊ２）・・・式２
Ｂｊ２＝Ｖｊ２／（Ｖｊ１＋Ｖｊ２）・・・式３
本実施形態において、撮像視点は固定されており、図６のように被写体を取り囲むように撮像されている。そのため、着目するポリゴンとそれに隣接するポリゴンとは、どちらも同一の撮像視点から及び隣接した撮像視点から撮像されている可能性が高い。また、被写体が人などである場合、被写体の３Ｄポリゴンデータにおいて隣接するポリゴン間はなめらか繋がり変化している。評価値Ｖでは、解像度と、撮像視点とポリゴンの正面度合いに基づくテクスチャ歪みとの指標を用いて設定しているため、隣接するポリゴン間において評価値が上位である撮像視点は類似する可能性が高い。そのため、評価値が上位である複数の撮像視点に高いブレンド比率を与えてブレンドすることで、ポリゴン間の急激なテクスチャの切り替わりを抑制することができる。本実施形態では、上位２視点のみを利用してブレンド比率を決定したがこれに限らず、任意の視点数を選択しブレンド比率を決定してもよい。なお、全撮像視点において評価値Ｖがエラー値、つまり負であった場合は、ブレンド比率を設定せず、事前に定めた画素値で着目画素の画素値を決定するようにしてもよい。 Bj1 = Vj1 / (Vj1 + Vj2) ... Equation 2
Bj2 = Vj2 / (Vj1 + Vj2) ... Equation 3
In the present embodiment, the imaging viewpoint is fixed, and the image is taken so as to surround the subject as shown in FIG. Therefore, it is highly possible that the polygon of interest and the polygon adjacent thereto are both imaged from the same imaging viewpoint and from adjacent imaging viewpoints. Further, when the subject is a person or the like, the adjacent polygons in the 3D polygon data of the subject are smoothly connected and changed. Since the evaluation value V is set using the index of the resolution and the texture distortion based on the degree of front of the polygon and the imaging viewpoint, the imaging viewpoints having higher evaluation values may be similar between adjacent polygons. high. Therefore, by giving a high blend ratio to a plurality of imaging viewpoints having higher evaluation values and blending them, it is possible to suppress abrupt texture switching between polygons. In the present embodiment, the blend ratio is determined using only the top two viewpoints, but the present invention is not limited to this, and an arbitrary number of viewpoints may be selected to determine the blend ratio. If the evaluation value V is an error value, that is, a negative value in all imaging viewpoints, the pixel value of the pixel of interest may be determined by a predetermined pixel value without setting the blend ratio.
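
As a non-limiting illustration, the top-two selection and Equations 2 and 3 can be sketched in Python as follows. The function name `top_two_blend_ratios`, the dictionary representation of the evaluation values, and the handling of a single valid viewpoint or of all-zero evaluation values are assumptions made for this sketch and are not specified in the description.

```python
from typing import Dict, Optional


def top_two_blend_ratios(evaluation: Dict[int, float]) -> Optional[Dict[int, float]]:
    """Return a blend ratio B per imaging viewpoint following Equations 2 and 3.

    `evaluation` maps a (hypothetical) viewpoint id to its evaluation value V.
    A negative V marks an error value, as in the description.
    Returns None when every viewpoint is in error, so that the caller can
    fall back to a predetermined pixel value.
    """
    valid = {j: v for j, v in evaluation.items() if v >= 0}
    if not valid:
        return None

    # Every viewpoint starts at ratio 0; only the top two receive a ratio.
    ratios = {j: 0.0 for j in evaluation}
    top = sorted(valid, key=lambda j: valid[j], reverse=True)[:2]
    total = sum(valid[j] for j in top)

    if total == 0:
        # Assumption: if the selected values are all zero, split evenly.
        for j in top:
            ratios[j] = 1.0 / len(top)
        return ratios

    for j in top:
        ratios[j] = valid[j] / total  # Bj = Vj / (Vj1 + Vj2)
    return ratios
```

With evaluation values of 0.6, 0.2 and 0.1 for three viewpoints, the first two viewpoints receive ratios of about 0.75 and 0.25, and the third receives 0.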

Finally, the pixel value of the pixel of interest is determined as a weighted sum using the determined blend ratios. The pixel value for each imaging viewpoint is read at the pixel coordinates acquired from the second calculation unit 707. However, the acquisition of the pixel value for each imaging viewpoint is not limited to this; surrounding pixel values may also be taken into account, for example by using the average of the pixels around the acquired pixel coordinates.
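
The weighted sum described above can be sketched as follows. This is only an illustrative sketch: the function name `blend_pixel` and the representation of each sampled pixel as a tuple of channel values are assumptions, and the sampling of each captured image at its pixel coordinates is assumed to have been done beforehand.

```python
from typing import Dict, Sequence, Tuple


def blend_pixel(ratios: Dict[int, float],
                samples: Dict[int, Sequence[float]]) -> Tuple[float, ...]:
    """Blend per-viewpoint pixel samples into one texture pixel value.

    `ratios` maps a viewpoint id to its blend ratio (expected to sum to 1).
    `samples` maps a viewpoint id to the channel values read from that
    viewpoint's captured image at the corresponding pixel coordinates.
    """
    channels = len(next(iter(samples.values())))
    blended = [0.0] * channels
    for viewpoint, ratio in ratios.items():
        if ratio == 0.0:
            continue  # viewpoints with ratio 0 do not contribute
        for c in range(channels):
            blended[c] += ratio * samples[viewpoint][c]
    return tuple(blended)
```

For example, blending the samples (100, 100, 100) and (200, 0, 100) with ratios 0.75 and 0.25 gives (125.0, 75.0, 100.0).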

Returning to FIG. 8, in step S810, the image generation unit 708 determines whether pixel values have been determined for all pixels of the texture image. If the result of the determination in step S810 is false (No), the process returns to step S807. If the result is true (Yes), the generated texture image is output to the secondary storage device 104, the external storage device 108, or the display device 109, and the series of processes is completed. The above is the process, executed by the image processing apparatus 100 of the present embodiment, of generating a textured 3D polygon model.

As described above, in the present embodiment, the generation of the texture image is broadly divided into two steps: the first step secures the resolution on the texture image, and the second step determines the pixel value of each pixel by blending the captured images of a plurality of imaging viewpoints. This makes it possible to generate a high-resolution texture image in which texture steps between polygons are inconspicuous, and thus to generate a high-quality textured 3D polygon model.

The 3D polygon model generated in the present embodiment is used to generate a virtual viewpoint image. For example, when an instruction to generate a virtual viewpoint image is given through a virtual viewpoint image generation application on a portable terminal such as a tablet or a smartphone, the 3D polygon model is transmitted to the portable terminal. The virtual viewpoint image may then be generated within the portable terminal based on the viewpoint specified by the user and the 3D polygon model. Alternatively, the 3D polygon model need not be transmitted to the portable terminal: the server may acquire information on the viewpoint specified on the portable terminal and generate the virtual viewpoint image based on the acquired viewpoint information and the 3D polygon model. The generated virtual viewpoint image may then be transmitted to the portable terminal.

<Embodiment 2>
In the first embodiment, when determining a pixel value of the texture image, the pixel values of the captured images of the imaging viewpoints with the highest evaluation values V are blended at a ratio based on the evaluation value V. This generates a high-quality textured 3D polygon model without texture steps between polygons. The second embodiment describes a form in which the evaluation value V, based on the resolution index and the texture distortion index, is used as the criterion for determining the blend ratio, and the texture distortion index is used for the rendering ratio applied when determining the pixel value. The outline and significance of the present embodiment are described below.

In the present embodiment, first, the priority for selecting the imaging viewpoints to be blended is determined from the evaluation value V, which is calculated as the product of the resolution index S and the texture distortion index W determined for each imaging viewpoint. For both the resolution index S and the texture distortion index W, a higher value indicates a higher likelihood that a high-quality texture image can be generated. Therefore, the higher the priority of an imaging viewpoint, the more likely it is that a high-quality texture image can be generated. Next, for each imaging viewpoint, the imaging viewpoints having a priority equal to or higher than its own priority are detected. Then, from among the detected imaging viewpoints, at least one imaging viewpoint with a large texture distortion index W is selected as an imaging device to be used for blending. The corresponding priority is determined as the blend weight. Finally, the blend weights and the selected imaging viewpoints for all imaging viewpoints are integrated to calculate, for each imaging viewpoint, the rendering weight used to determine the pixel value.

For example, in the arrangement of imaging devices shown in FIG. 11, imaging viewpoints 1102 and 1103 are arranged on the left and right of a polygon 1101 of a subject 1100. Assume that, for this polygon, the resolution index is about the same at both imaging viewpoints 1102 and 1103, and that the texture distortion index is also about the same, apart from the difference in left-right direction. Accordingly, the evaluation value, which is the product of the indexes, is about the same for both imaging viewpoints. In this case, according to the first embodiment, the pixel value is determined as follows. The pixel value of each pixel included in the texture region 1105 on the texture image 1104, which corresponds to the polygon 1101, is determined by blending the pixel value in the captured image of the imaging viewpoint 1102 and the pixel value in the captured image of the imaging viewpoint 1103 at roughly equal ratios. Determining the pixel value by blending makes it possible to generate a high-quality texture without texture steps between polygons or inside a polygon. However, as described above, even when the images capture the same portion, errors contained in the 3D polygon data cause the textures projected onto the polygon from different imaging viewpoints to be misaligned and differently distorted. As a result, the texture region 1105 determined by blending has a lower perceived resolution than when the imaging viewpoint 1102 or 1103 is used alone. Therefore, in the present embodiment, the imaging viewpoints are blended appropriately and unnecessary blending is suppressed, so that a texture image with no texture steps and a higher resolution is generated.

In the present embodiment, the final rendering weight used in generating the virtual viewpoint image is determined by exploiting the fact that, for a subject such as a person, the three-dimensional coordinates and normal directions of the 3D polygon data change smoothly. All imaging viewpoints whose priority, determined from the evaluation value V of the texture distortion index W and the resolution index S, is equal to or higher than a given priority are detected, and the weight used for blending each imaging viewpoint is then determined based on the texture distortion index W. A specific description is given with reference to FIG. 11. FIG. 11(b) shows a bird's-eye view of FIG. 11(a). The surface 1106 corresponds to the polygon 1101, and the arrows attached to the surface 1106 indicate the three-dimensional coordinates and the normal direction for each pixel included in the texture region 1105. For the pixels on the surface 1106, the imaging viewpoints 1102 and 1103, which have about the same evaluation value and therefore about the same priority, are detected. For the detected imaging viewpoints, based on the texture distortion index W of each imaging viewpoint with respect to each normal on the surface 1106, a larger blend weight is set for an imaging viewpoint with a larger texture distortion index W. Then, based on the set weights, the captured images of the imaging viewpoint 1102 and the imaging viewpoint 1103 are blended to determine the pixel values of the pixels included in the texture region 1105.

The normal of the surface 1106 changes smoothly, moving to the right, from a direction nearly parallel to the direction vector connecting the subject and the imaging viewpoint 1102 to a direction nearly parallel to the direction vector connecting the subject and the imaging viewpoint 1103. Therefore, the blend ratio set using the texture distortion index W, which is calculated from how directly the imaging viewpoint faces the normal, can be varied smoothly so that the blend ratio of the imaging viewpoint 1103 increases from the imaging viewpoint 1102 side toward the right. This makes it possible to suppress unnecessary blending and to generate a high-quality textured 3D polygon model with high resolution and little texture misalignment.
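
The smooth transition described above can be illustrated with a small sketch. The formula for the texture distortion index W is not given in this passage, so the sketch assumes, purely for illustration, that W is the cosine of the angle between the surface normal and the direction toward the imaging viewpoint, clamped at 0. The function names `frontness` and `ratios_along_surface` are likewise hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def frontness(normal: Sequence[float], to_camera: Sequence[float]) -> float:
    """Assumed stand-in for W: clamped cosine between normal and camera direction."""
    dot = sum(n * c for n, c in zip(normal, to_camera))
    norm = math.sqrt(sum(n * n for n in normal)) * math.sqrt(sum(c * c for c in to_camera))
    return max(0.0, dot / norm)


def ratios_along_surface(normals: Sequence[Sequence[float]],
                         dir_a: Sequence[float],
                         dir_b: Sequence[float]) -> List[Tuple[float, float]]:
    """For each normal, return the blend ratios (viewpoint A, viewpoint B)."""
    result = []
    for n in normals:
        wa, wb = frontness(n, dir_a), frontness(n, dir_b)
        total = wa + wb
        # Assumption: if neither viewpoint faces the surface, split evenly.
        result.append((wa / total, wb / total) if total > 0 else (0.5, 0.5))
    return result


# Normals sweeping from facing viewpoint A (left) to facing viewpoint B (right).
normals = [(-1.0, 0.0, 1.0), (-0.5, 0.0, 1.0), (0.0, 0.0, 1.0),
           (0.5, 0.0, 1.0), (1.0, 0.0, 1.0)]
for ratio_a, ratio_b in ratios_along_surface(normals, (-1.0, 0.0, 1.0), (1.0, 0.0, 1.0)):
    print(f"A: {ratio_a:.2f}  B: {ratio_b:.2f}")
```

Under this assumed definition of W, the ratio of viewpoint B rises monotonically from 0 to 1 as the normal turns from viewpoint A toward viewpoint B, which corresponds to the gradual change described in the text.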

Hereinafter, the processing performed by the image processing apparatus 900 of the present embodiment will be described with reference to FIGS. 9 and 10. FIG. 9 is a diagram showing the functional configuration of the image processing apparatus 900 in the present embodiment. FIG. 10 is a flowchart of the image processing method performed by the image processing apparatus 900 in the present embodiment. In the present embodiment, configurations and processes that are the same as in the first embodiment are given the same reference numerals as in the first embodiment, and their description is omitted. The block diagram showing the hardware configuration of the image processing apparatus 900 in the present embodiment is the same as FIG. 1, and its description is likewise omitted.

As shown in FIG. 9, the image processing apparatus 900 includes, as in FIG. 7, an image data acquisition unit 701, a viewpoint information acquisition unit 702, a 3D polygon acquisition unit 703, a UV coordinate acquisition unit 704, a first calculation unit 705, and a region determination unit 706. The image processing apparatus 900 further includes a second calculation unit 901, a blend ratio determination unit 902, and an image generation unit 903. The CPU 101 executes a program stored in the ROM 103 using the RAM 102 as a work memory, whereby the image processing apparatus 900 functions as each of the components shown in FIG. 9 and executes the series of processes shown in the flowchart of FIG. 10. Not all of the processes described below need to be executed by the CPU 101; the image processing apparatus 900 may be configured so that some or all of the processes are performed by one or more processing circuits other than the CPU 101. The flow of processing performed by each component is described below. Steps S1001 to S1006 are the same as steps S801 to S806 in FIG. 8 of the first embodiment, and their description is omitted.

In step S1007, the second calculation unit 901 sets, from among the pixels included in the texture image, the pixel of interest whose pixel value is to be determined.

In step S1008, the second calculation unit 901 calculates the evaluation value V of the pixel of interest for each imaging viewpoint, using the texture image acquired from the region determination unit 706. The evaluation value V calculated here is used as an index when determining the texture pixel value at the pixel of interest. The method of calculating the evaluation value V is the same as in step S808 of the first embodiment, and its description is omitted. The second calculation unit 901 outputs the calculated evaluation values V of all viewpoints to the blend ratio determination unit 902, together with the texture distortion indexes used in calculating them. Further, the second calculation unit 901 outputs the pixel coordinates it holds for each imaging viewpoint to the image generation unit 903.

In step S1009, the blend ratio determination unit 902 determines a priority for each imaging viewpoint from the evaluation value V of each imaging viewpoint acquired from the second calculation unit 901. In the present embodiment, the evaluation value V itself is used as the priority. However, the way of expressing the priority is not limited to this; for an imaging viewpoint whose priority is to be lowered in advance, an adjustment such as multiplying the evaluation value V by a value smaller than 1 may be made.
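
The priority determination of step S1009 can be sketched as follows, using the relationship stated earlier that the evaluation value V is the product of the resolution index S and the texture distortion index W. The function name `priorities` and the optional `scale` mapping (a factor smaller than 1 for viewpoints to be deprioritized) are assumptions introduced for this sketch.

```python
from typing import Dict, Optional


def priorities(resolution: Dict[int, float],
               distortion: Dict[int, float],
               scale: Optional[Dict[int, float]] = None) -> Dict[int, float]:
    """Return a priority per viewpoint: V = S * W, optionally scaled down.

    `scale` maps a viewpoint id to a factor (assumed to be below 1) for
    viewpoints whose priority should be lowered in advance; viewpoints
    not listed keep V unchanged.
    """
    scale = scale or {}
    return {j: resolution[j] * distortion[j] * scale.get(j, 1.0)
            for j in resolution}
```

For instance, with S = 0.8 and W = 0.5 the priority is 0.4, and a viewpoint with S = 0.5 and W = 1.0 scaled by 0.5 receives a priority of 0.25.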

In step S1010, the blend ratio determination unit 902 determines the blend ratio of each imaging viewpoint for each priority, using the priority of each imaging viewpoint determined in step S1009 and the texture distortion indexes acquired from the second calculation unit 901. This is described in detail below.

First, the lowest of all the priorities is selected as the priority of interest. Thereafter, each time the processing described below is completed for the current priority of interest, the next higher priority that has not yet been selected becomes the priority of interest, and the same processing is repeated until all priorities have been processed.

Next, the imaging viewpoints having a priority equal to or higher than the priority of interest are detected from among all imaging viewpoints, and the imaging viewpoints to be used for blending are selected from the detected imaging viewpoints based on the texture distortion index. In the present embodiment, the imaging viewpoints whose texture distortion index ranks within the top two are selected. However, it is not always necessary to select two imaging viewpoints; a single imaging viewpoint having the largest texture distortion index among all the imaging viewpoints may be selected instead.
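
The detection and selection for one priority of interest can be sketched as follows. The function name `select_viewpoints`, the dictionary inputs, and the tie-breaking by input order are assumptions for illustration only.

```python
from typing import Dict, List


def select_viewpoints(focus_priority: float,
                      priority: Dict[int, float],
                      distortion: Dict[int, float],
                      count: int = 2) -> List[int]:
    """Select the viewpoints used for blending at one priority of interest.

    Viewpoints whose priority is at least `focus_priority` are detected,
    then the `count` viewpoints with the largest texture distortion index W
    are kept (the top two in the embodiment).
    """
    detected = [j for j, p in priority.items() if p >= focus_priority]
    # sorted() is stable, so ties keep the input order (an assumption here).
    ranked = sorted(detected, key=lambda j: distortion[j], reverse=True)
    return ranked[:count]
```

With priorities {0: 0.9, 1: 0.5, 2: 0.7, 3: 0.2} and distortion indexes {0: 0.6, 1: 0.9, 2: 0.8, 3: 1.0}, a priority of interest of 0.5 detects viewpoints 0, 1 and 2 and selects viewpoints 1 and 2, while a priority of interest of 0.9 selects only viewpoint 0.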

Further, a blend weight is determined for each of the imaging viewpoints selected at the priority of interest. In the present embodiment, the blend weight is determined according to the texture distortion index of the selected imaging viewpoint, so that the weight decreases linearly as the texture distortion index decreases. Specifically, where Wsum is the sum of the texture distortion indexes of the selected imaging viewpoints and W1 and W2 are the index values of the respective imaging viewpoints, the blend weights are B1 = W1 / Wsum and B2 = W2 / Wsum. However, the method of calculating the blend weight is not limited to this, and various methods can be used, such as a non-linear decrease in weight or an abrupt decrease in weight beyond a certain angle.
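
The linear blend weights B = W / Wsum can be sketched as follows. The function name `blend_weights` and the even split used when Wsum is 0 are assumptions made for this sketch.

```python
from typing import Dict, Sequence


def blend_weights(selected: Sequence[int],
                  distortion: Dict[int, float]) -> Dict[int, float]:
    """Return B_i = W_i / Wsum for the viewpoints selected at one priority."""
    w_sum = sum(distortion[j] for j in selected)
    if w_sum == 0:
        # Assumption: fall back to an even split when all indexes are zero.
        return {j: 1.0 / len(selected) for j in selected}
    return {j: distortion[j] / w_sum for j in selected}
```

For two selected viewpoints with W1 = 0.75 and W2 = 0.25, Wsum is 1.0 and the blend weights are 0.75 and 0.25.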

The blend ratio determination unit 902 determines the viewpoint numbers of the imaging viewpoints selected at the priority of interest, together with the blend weight for each of those imaging viewpoints, as the blend information for the priority of interest. The same processing is performed for all priorities, and the blend ratio determination unit 902 outputs the blend information for each priority to the image generation unit 903.

In step S1011, the image generation unit 903 integrates the blend information for each priority acquired from the blend ratio determination unit 902 and determines a rendering weight for each imaging viewpoint. The pixel value of the pixel of interest is then determined according to the determined rendering weights.

In the present embodiment, first, the rendering weights of all imaging viewpoints are initialized to 0. Then, for each imaging viewpoint, the blend information at all priorities is checked, and the priorities whose blend information contains that imaging viewpoint's own number are detected. For each detected priority, the product of the priority and the blend weight of that imaging viewpoint is calculated. The sum of these products over all detected priorities is then taken as the provisional rendering weight of the imaging viewpoint. Finally, after the provisional rendering weights have been determined for all imaging viewpoints, the provisional rendering weight of each imaging viewpoint is divided by the sum of the provisional rendering weights of all imaging viewpoints, and the result is determined as the rendering weight of that imaging viewpoint.
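
The integration of the per-priority blend information into rendering weights can be sketched as follows. The function name `rendering_weights`, the representation of the blend information as a list of (priority, {viewpoint: blend weight}) pairs, and the handling of a zero total are assumptions introduced for this sketch.

```python
from typing import Dict, Iterable, List, Tuple

BlendInfo = List[Tuple[float, Dict[int, float]]]


def rendering_weights(blend_info: BlendInfo,
                      viewpoints: Iterable[int]) -> Dict[int, float]:
    """Integrate per-priority blend information into rendering weights.

    Each entry of `blend_info` is (priority, {viewpoint id: blend weight}).
    """
    # Initialize the rendering weight of every imaging viewpoint to 0.
    provisional = {j: 0.0 for j in viewpoints}

    # Sum priority * blend weight over every priority that lists the viewpoint.
    for priority, weights in blend_info:
        for j, blend_weight in weights.items():
            provisional[j] += priority * blend_weight

    # Normalize by the sum of all provisional rendering weights.
    total = sum(provisional.values())
    if total == 0:
        return provisional  # assumption: leave all weights at 0
    return {j: w / total for j, w in provisional.items()}
```

For example, if priority 0.5 selects viewpoints 0 and 1 with blend weights 0.5 each and priority 1.0 selects only viewpoint 0 with blend weight 1.0, the provisional weights are 1.25, 0.25 and 0 for viewpoints 0, 1 and 2, and the normalized rendering weights are about 0.833, 0.167 and 0.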

The image generation unit 903 determines the pixel value of the pixel of interest as a weighted sum using the determined rendering weights. The pixel value for each imaging viewpoint is read at the pixel coordinates acquired from the second calculation unit 901. However, the acquisition of the pixel value for each imaging viewpoint is not limited to this; surrounding pixel values may also be taken into account, for example by using the average of the pixels around the acquired pixel coordinates.

The above is the processing performed by the image processing apparatus 900 of the present embodiment. According to the configuration described above, unnecessary blending is suppressed, and a high-quality textured 3D polygon model with high resolution and reduced texture misalignment can be generated.

<Other Embodiments>
Embodiments of the present invention are not limited to the embodiments described above, and various embodiments are possible. For example, in the above embodiments, two indexes, the resolution index and the texture distortion index, are used to calculate the evaluation value V, but this is not limiting, and further indexes may be added. For example, the vicinity of the contour of the subject may be detected in the image of an imaging viewpoint, and an index may be added that reduces the weight of pixels near the detected contour when determining the pixel values of the texture image. In this way, when errors in the 3D polygon data make the 3D polygon data slightly larger than the actual subject, image quality degradation caused by mistakenly using pixel values from outside the subject can be suppressed.

In the second embodiment described above, when determining the blend ratio, no distinction is made as to whether each imaging viewpoint is located clockwise or counterclockwise with respect to the normal of the subject, but this is not limiting. When selecting two imaging viewpoints at each priority, a method such as selecting, from each of the two directions, an imaging viewpoint with a high texture distortion index may be used. The same applies to the determination of the blend ratio in the first embodiment.

In the above embodiments, when securing the texture regions of polygons in the texture image, processing is performed for each polygon included in the 3D polygon data, but this is not limiting. A plurality of adjacent polygons may be treated together as one large polygon when securing the texture region. Likewise, in the above embodiments, processing is performed pixel by pixel when determining the pixel value of each pixel in the texture image, but this is not limiting. A plurality of adjacent pixels may be processed together.

The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

100 Image processing apparatus
708 Image generation unit

Claims

An image processing apparatus that generates a texture image corresponding to three-dimensional polygon data of an object based on a plurality of captured images obtained by capturing the object using a plurality of imaging devices, the image processing apparatus comprising:
determination means for determining a pixel value of each of a plurality of pixels or a plurality of pixel groups constituting a region corresponding to a polygon constituting the three-dimensional polygon data, for each pixel or pixel group, based on one or more captured images of the plurality of captured images; and
generation means for generating the texture image based on the pixel value determined by the determination means.

The image processing apparatus according to claim 1, wherein the texture image includes at least one of color information, luminance information, and saturation information.

The image processing apparatus according to claim 1 or 2, wherein the determination means determines, for each pixel or pixel group, the captured image used for determining the pixel value based on the focal length or angle of view of each of the imaging devices.

The image processing apparatus according to any one of claims 1 to 3, wherein the determination means determines, for each pixel or pixel group, the captured image used for determining the pixel value based on a position in the object corresponding to the pixel or pixel group constituting the region and a position of the imaging device.

The image processing apparatus according to any one of claims 1 to 4, wherein the determination means determines, for each pixel or pixel group, the captured image used for determining the pixel value based on a distance between a position in the object corresponding to the pixel or pixel group constituting the region and a position of the imaging device.

The image processing apparatus according to any one of claims 1 to 5, wherein the determination means determines, for each pixel or pixel group, the captured image used for determining the pixel value of the pixel or pixel group constituting the region based on a direction vector connecting a position in the object corresponding to the pixel or pixel group constituting the region and a position of the imaging device.

The image processing apparatus according to any one of claims 1 to 6, wherein the determination means determines the pixel value, for each pixel or pixel group, based on two or more captured images of the plurality of captured images.

The image processing apparatus according to claim 7, further comprising first calculation means for calculating a first evaluation value of each of the plurality of captured images for each pixel or pixel group,
wherein the determination means determines, for each pixel or pixel group, the two or more captured images used for determining the pixel value based on the first evaluation value calculated by the first calculation means.

The image processing apparatus according to claim 8, wherein the determination means determines a mixing ratio of the two or more captured images used for determining the pixel value based on the first evaluation value, and determines the pixel value based on the determined mixing ratio.

The image processing apparatus according to claim 9, wherein the determination means determines the mixing ratio of the two or more captured images so that, among the two or more captured images, a captured image having a larger first evaluation value has a larger mixing ratio.

The image processing apparatus according to any one of claims 8 to 10, wherein the determination means determines two or more captured images having a first evaluation value larger than that of the other captured images as the two or more captured images used for determining the pixel value.

The image processing apparatus according to claim 11, wherein the determination means determines two or more captured images including the captured image having the maximum first evaluation value as the two or more captured images used for determining the pixel value.

The image processing apparatus according to claim 12, wherein the determination means determines two or more captured images including the captured image having the maximum first evaluation value and the captured image having the second largest evaluation value as the two or more captured images used for determining the pixel value.

The image processing apparatus according to any one of claims 1 to 13, wherein the determination means determines, for each pixel or pixel group, the captured image used for determining the pixel value of the pixel or pixel group based on a priority for determining the pixel value that is set for each of the plurality of image pickup devices.

The image processing apparatus according to claim 14, wherein the determination means determines the mixing ratio of the two or more captured images used for determining the pixel value based on the priority, and determines the pixel value based on the determined mixing ratio.

The image processing apparatus according to claim 14 or 15, wherein the determination means determines the captured image used for determining the pixel value of the pixel or pixel group further based on a direction vector connecting the position on the object corresponding to the pixel or pixel group constituting the region and the position of the image pickup device.

The image processing apparatus according to claim 15, wherein the determination means
determines, for each of the two or more captured images used for determining the pixel value, a weight used for determining the mixing ratio for each of the priorities, and
determines the mixing ratio of each of the two or more captured images based on the weight determined for each priority.

The image processing apparatus according to claim 17, wherein the weight becomes larger as the direction vector connecting the position on the object corresponding to the pixel or pixel group constituting the region and the position of the image pickup apparatus becomes closer to parallel to the normal direction at the position on the object corresponding to the pixel or pixel group constituting the region.
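
The angle-dependent weight can be illustrated with a minimal sketch. It assumes the weight is the cosine of the angle between the surface normal and the direction vector from the surface point toward the image pickup apparatus, clamped to zero for back-facing views; the claims only state that the weight grows as the two directions approach parallel. The names `angle_weight` and `mixing_ratios` are hypothetical, and the combination with the per-device priority is not modeled here.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _unit(v: Vec3) -> Vec3:
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / norm for c in v)


def angle_weight(surface_point: Vec3, normal: Vec3, camera_position: Vec3) -> float:
    """Weight that grows as the viewing direction approaches the normal."""
    # Direction vector from the point on the object toward the camera.
    direction = _unit(tuple(c - p for c, p in zip(camera_position, surface_point)))
    n = _unit(normal)
    cosine = sum(d * m for d, m in zip(direction, n))
    # Assumption: views from behind the surface get zero weight.
    return max(0.0, cosine)


def mixing_ratios(weights: Sequence[float]) -> List[float]:
    """Normalize weights so that the mixing ratios sum to one."""
    total = sum(weights)
    if total == 0.0:
        # Assumed fallback: equal ratios when every weight is zero.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]
```

A camera looking straight down the normal receives weight 1, while a camera at a grazing angle receives a weight near 0 and therefore contributes little to the blended pixel value.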

The image processing apparatus according to any one of claims 1 to 18, further comprising an associating means for associating the polygon with the region based on one of the plurality of captured images.

The image processing apparatus according to claim 19, wherein the associating means determines the one captured image based on the resolution of the portion corresponding to the polygon included in the captured image, and associates the polygon with the region based on the determined one captured image.

The image processing apparatus according to claim 20, wherein the resolution is represented by the number of pixels constituting a portion corresponding to the polygon included in the captured image.
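
One way to obtain such a pixel count is sketched below, assuming the polygon is a triangle whose vertices have already been projected into the image coordinates of a captured image and that the triangle lies inside the image. The function name `count_covered_pixels` and the pixel-center coverage test are choices made for this sketch, not details taken from the claims.

```python
import math
from typing import Tuple

Point2 = Tuple[float, float]


def count_covered_pixels(triangle: Tuple[Point2, Point2, Point2]) -> int:
    """Count pixels whose centers lie inside (or on the edge of) a 2D triangle."""
    (x0, y0), (x1, y1), (x2, y2) = triangle

    def edge(ax, ay, bx, by, px, py):
        # Signed area term: its sign tells which side of edge a->b the point is on.
        return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

    if edge(x0, y0, x1, y1, x2, y2) == 0:
        return 0  # Degenerate triangle covers no pixels.

    # Integer bounding box of the triangle in pixel coordinates.
    min_x, max_x = math.floor(min(x0, x1, x2)), math.ceil(max(x0, x1, x2))
    min_y, max_y = math.floor(min(y0, y1, y2)), math.ceil(max(y0, y1, y2))

    count = 0
    for py in range(min_y, max_y):
        for px in range(min_x, max_x):
            cx, cy = px + 0.5, py + 0.5  # Pixel center.
            w0 = edge(x1, y1, x2, y2, cx, cy)
            w1 = edge(x2, y2, x0, y0, cx, cy)
            w2 = edge(x0, y0, x1, y1, cx, cy)
            # Inside when all three terms share a sign (either winding order).
            if (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0):
                count += 1
    return count
```

A captured image in which the projected polygon covers more pixels would then be treated as having the higher resolution for that polygon.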

The image processing apparatus according to any one of claims 19 to 21, wherein the associating means determines the one captured image based on the angle formed by the direction vector connecting the position on the object corresponding to the pixel or pixel group constituting the region and the position of the image pickup apparatus and the normal direction of the polygon, and associates the polygon with the region based on the determined one captured image.

The image processing apparatus according to claim 22, wherein the associating means determines the one captured image based on the inner product of the direction vector and the normal direction of the polygon.
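
A minimal sketch of selecting the one captured image from the inner product follows. It assumes the direction vector is taken from the polygon centroid toward each image pickup apparatus and that the image with the largest inner product of unit vectors is chosen; `select_camera_by_inner_product` is a hypothetical name, and the claims do not fix either of these choices.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _unit(v: Vec3) -> Vec3:
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / norm for c in v)


def select_camera_by_inner_product(
    polygon_center: Vec3,
    polygon_normal: Vec3,
    camera_positions: Sequence[Vec3],
) -> int:
    """Return the index of the camera that views the polygon most frontally."""
    if not camera_positions:
        raise ValueError("no cameras given")
    n = _unit(polygon_normal)

    def inner_product(camera: Vec3) -> float:
        # Unit direction vector from the polygon toward the camera.
        d = _unit(tuple(c - p for c, p in zip(camera, polygon_center)))
        return sum(a * b for a, b in zip(d, n))

    # Assumption: a larger inner product (smaller angle) is preferred.
    return max(range(len(camera_positions)), key=lambda i: inner_product(camera_positions[i]))
```

With a polygon at the origin facing +z, a camera directly above it is selected over cameras placed at oblique angles.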

The image processing apparatus according to any one of claims 19 to 23, wherein the associating means determines the one captured image based on a second evaluation value that is based on the resolution of the region corresponding to the polygon included in the captured image, and associates the polygon with the region based on the determined one captured image.

The image processing apparatus according to claim 24, wherein the second evaluation value is further based on the angle formed by the direction vector connecting the position on the object corresponding to the pixel or pixel group constituting the region and the position of the image pickup apparatus and the normal direction of the polygon.

The image processing apparatus according to claim 24 or 25, further comprising a second calculation means for calculating the second evaluation value of each of the plurality of captured images,
wherein the associating means determines, as the one captured image, a captured image whose second evaluation value calculated by the second calculation means is larger than those of the other captured images.

The image processing apparatus according to claim 26, wherein the associating means determines the captured image having the maximum second evaluation value as the one captured image.

An image processing method for generating a texture image corresponding to three-dimensional polygon data of an object based on a plurality of captured images acquired by imaging the object using a plurality of imaging devices, the method comprising:
a determination step of determining, for each pixel or pixel group, the pixel value of each of a plurality of pixels or a plurality of pixel groups constituting a region corresponding to a polygon constituting the three-dimensional polygon data, based on one or more captured images of the plurality of captured images; and
a generation step of generating the texture image based on the pixel values determined in the determination step.

A program for causing a computer to function as the image processing apparatus according to any one of claims 1 to 27.