JP6736422B2 - Image processing apparatus, image processing method and program

Info

Publication number: JP6736422B2
Application number: JP2016162803A
Authority: JP
Inventors: 知宏西山
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-08-23
Filing date: 2016-08-23
Publication date: 2020-08-05
Anticipated expiration: 2036-08-23
Also published as: JP2018032938A

Description

The present invention relates to a technique for estimating the three-dimensional shape of a subject at high speed and with high accuracy.

The visual volume intersection method has long been known as a way to quickly estimate the rough three-dimensional shape of a subject from images with mutual parallax (parallax images) captured by a plurality of cameras from different viewpoints. Because the method uses only the contour information of the subject, it yields a shape stably. It has two inherent problems, however: it produces regions that are not actually part of the subject (false regions), and it cannot recover concave subject surfaces. Various methods have been proposed to overcome these problems. For example, Non-Patent Document 1 proposes determining whether each voxel is visible from each camera and, using the color information of the cameras selected by that determination, deleting voxels whose colors are not consistent. Patent Document 1 proposes expressing the consistency of color information, the smoothness of the surface, and similar properties as an energy function for the voxels on the surface of the shape estimated by the visual volume intersection method, and refining the shape by optimizing that energy. Here, a voxel is a unit cell obtained by dividing a three-dimensional space along the x, y, and z axes.

Patent Document 1: Japanese Patent Laid-Open No. 2012-208759 (JP2012-208759A)

Non-Patent Document 1: Kutulakos et al., "A Theory of Shape by Space Carving," International Journal of Computer Vision, 2000

However, the method described in Non-Patent Document 1 deletes voxels based on the color consistency of each voxel in isolation, so it may mistakenly delete voxels that belong to the actual subject. The method described in Patent Document 1 performs energy optimization, so its computational cost is enormous.

An image processing apparatus according to the present invention comprises: generating means for generating, based on a plurality of images acquired by imaging with a plurality of imaging devices, a candidate region that corresponds to the three-dimensional shape of a subject and is composed of voxels; and processing means for performing processing, including deletion, on a specific voxel constituting the candidate region, wherein the processing means determines whether to delete the specific voxel based on the density of other voxels around the specific voxel.

An image processing apparatus according to the present invention estimates the three-dimensional shape of a subject from parallax images captured from a plurality of viewpoints, and comprises: acquisition means for acquiring silhouette images of the subject in the parallax images; generating means for generating, based on the silhouette images, a candidate region for the three-dimensional shape of the subject in the three-dimensional space to be processed; deleting means for deleting, from the candidate region generated in the three-dimensional space, voxels judged unnecessary according to a predetermined criterion; and restoring means for restoring a voxel deleted by the deleting means at the position of a grid point of interest, based on the voxel density around that grid point in the candidate region.

According to the present invention, the three-dimensional shape of a subject can be estimated at high speed, with high accuracy, and with a small amount of computation.

The drawings, in order, are as follows:
A diagram showing an example of a camera arrangement for acquiring parallax images.
A diagram showing an example of the hardware configuration of the image processing apparatus according to Embodiment 1.
A functional block diagram of the image processing apparatus according to Embodiment 1.
A flowchart showing the flow of the series of image processing executed by the image processing apparatus according to Embodiment 1.
A diagram showing a subject being photographed by three cameras.
(a) an example of parallax images, and (b) an example of silhouette images.
(a) the images captured by the three cameras, and (b) a diagram showing an arbitrary voxel being projected onto each camera.
A diagram explaining the concept of the voxel restoration processing.
A flowchart showing the details of the voxel restoration processing according to Embodiment 1.
A diagram explaining the effect of the invention according to Embodiment 1.
A functional block diagram of the image processing apparatus according to Embodiment 2.
A flowchart showing the flow of the series of image processing executed by the image processing apparatus according to Embodiment 2.

[Embodiment 1]
This embodiment describes a method for restoring voxels that were deleted excessively: when the voxel density around a grid point of interest in the three-dimensional space containing the subject is at or above a certain level, a voxel is added at the position of that grid point within the candidate region of the subject shape.

FIG. 1 shows an example of a camera arrangement for acquiring images with mutual parallax captured from different viewpoints (hereinafter, parallax images). In FIG. 1, ten cameras 101 arranged around a soccer field 104 photograph players 102 and a ball 103 on the field 104. The image data captured by each camera 101 is sent to the image processing apparatus 200 as multi-viewpoint parallax image data and undergoes predetermined image processing. A sports scene is used as the example here, but the method described in this embodiment applies broadly to any scene in which a plurality of cameras are arranged around an object and the shape of that object is estimated.

In the following, the target three-dimensional space is represented discretely using the voxels described above, by dividing it with a regular grid of size d [mm]. The coordinates of each voxel are expressed as lattice vectors in x, y, z order, such as (0,0,0), (1,0,0), (3,0,1), and so on. The actual physical coordinates are obtained by multiplying these integer vectors by the grid size d. A value such as 5 mm is used for d.
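The conversion from a lattice vector to physical coordinates can be sketched as follows. This is a minimal illustration only; the function name `voxel_to_physical` and the constant name are assumptions, and the default of 5 mm is simply the example value of d given in the text.

```python
from typing import Tuple

# Grid size d in millimetres; 5 mm is the example value mentioned in the text.
VOXEL_SIZE_MM = 5.0


def voxel_to_physical(index: Tuple[int, int, int],
                      d: float = VOXEL_SIZE_MM) -> Tuple[float, float, float]:
    """Scale an integer lattice vector (x, y, z) by the grid size d [mm]."""
    x, y, z = index
    return (x * d, y * d, z * d)


print(voxel_to_physical((3, 0, 1)))  # (15.0, 0.0, 5.0)
```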

(Configuration of the image processing apparatus)
First, the configuration of the image processing apparatus 200 in this embodiment is described. FIG. 2 shows an example of the hardware configuration of the image processing apparatus 200. The image processing apparatus 200 includes a CPU 201, a RAM 202, a ROM 203, an HDD 204, an input I/F 205, and an output I/F 206. The units of the image processing apparatus 200 are interconnected by a system bus 207. The image processing apparatus 200 is connected to the cameras 101, an operation unit 210, and an external memory 211 via the input I/F 205, and to the external memory 211 and a display device 212 via the output I/F 206.

The CPU 201 executes programs stored in the ROM 203, using the RAM 202 as work memory, and controls each unit of the image processing apparatus 200 via the system bus 207. This realizes the various processes described later. The HDD 204 is a mass storage device that stores the various data handled by the image processing apparatus 200; an SSD, for example, may be used instead. The CPU 201 can write data to the HDD 204 and read data stored in the HDD 204 via the system bus 207.

The input I/F 205 is a serial bus I/F such as USB or IEEE1394, and data, commands, and the like are input from external devices to the image processing apparatus 200 through it. Via the input I/F 205, the image processing apparatus 200 acquires data from the external memory 208 (for example, a storage medium such as a hard disk, memory card, CF card, SD card, or USB memory). Also via the input I/F 205, the image processing apparatus 200 acquires user commands entered with the operation unit 210. The operation unit 210 is an input device such as a mouse or keyboard and is used to input user instructions to the image processing apparatus 200.

The output I/F 206, like the input I/F 205, includes a serial bus I/F such as USB or IEEE1394. A video output terminal such as DVI or HDMI (registered trademark) may also be used. Data and the like are output from the image processing apparatus 200 to external devices through the output I/F 206. The image processing apparatus 200 displays images by outputting processed image data and the like to the display device 209 (any of various image display devices, such as a liquid crystal display) via the output I/F 206. The image processing apparatus 200 has components other than those above, but they are not the focus of the present invention and their description is omitted.

Next, the series of image processing performed by the image processing apparatus 200 is described. FIG. 3 is a functional block diagram of the image processing apparatus 200 according to this embodiment. The image processing apparatus 200 consists of an image data acquisition unit 301, a distortion correction unit 302, a candidate region generation unit 303, a voxel deletion unit 304, a voxel restoration unit 305, a camera parameter acquisition unit 306, and a threshold setting unit 307. The functions of these units are realized when the CPU 201 reads the control program stored in the ROM 203, loads it into the RAM 202, and executes it. Alternatively, the image processing apparatus 200 may be configured with dedicated processing circuits corresponding to each of these units. FIG. 4 is a flowchart showing the flow of the series of image processing executed by the image processing apparatus 200 of this embodiment. The flow of that processing is described below.

In step 401, the image data acquisition unit 301 acquires the parallax image data and silhouette image data described above, either directly from the plurality of cameras via the input I/F 205 or from the HDD 204 or the external memory 211. A silhouette image is a binary image in which, for each image making up the parallax images, the region where the subject is present is white (pixel value = 255) and the region where it is absent is black (pixel value = 0). FIG. 5 shows a complete sphere 510 and a partially chipped sphere 520, as subjects, being photographed by three cameras 501 to 503. Images 601 to 603 in FIG. 6(a) are the parallax images obtained by photographing the two spheres 510 and 520 with the three cameras 501 to 503, and FIG. 6(b) shows the corresponding silhouette images 611 to 613. The silhouette images are assumed to be generated in advance from the parallax images using a technique such as background extraction or subject segmentation.
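The text leaves the silhouette generation method open ("background extraction or subject segmentation"). As one hypothetical illustration, the sketch below builds a binary silhouette by simple background differencing on grayscale images stored as nested lists. The function name `make_silhouette` and the threshold of 30 are assumptions, not values from the source.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values (0-255)


def make_silhouette(frame: Image, background: Image, threshold: int = 30) -> Image:
    """Return a binary image: 255 where the frame differs from the background
    by more than `threshold` (subject present), 0 elsewhere (subject absent)."""
    return [
        [255 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


background = [[10, 10], [10, 10]]
frame = [[10, 200], [12, 11]]
print(make_silhouette(frame, background))  # [[0, 255], [0, 0]]
```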

In step 402, the camera parameter acquisition unit 306 acquires predetermined parameters of the cameras that captured the parallax images: internal parameters, external parameters, and distortion parameters. The internal parameters are the coordinates of the image center and the focal length of the camera lens. The external parameters describe the position and orientation of each camera. Here, the position and orientation of each camera are described by a position vector and a rotation matrix in the world coordinate system, but another representation may be used. The distortion parameters describe the distortion of the camera lens. These camera parameters may be estimated from the parallax image data, for example by SFM (Structure from Motion), or may be computed in advance by calibration using a chart or the like.

In step 403, the distortion correction unit 302 corrects the distortion caused by the lens (distortion correction processing) in the parallax images and silhouette images acquired in step 401, based on the distortion parameters among the camera parameters acquired in step 402.

In step 404, the candidate region generation unit 303 generates a candidate region for the subject shape in the three-dimensional space, based on the camera parameters acquired in step 402 and the silhouette image data acquired in step 401. The candidate region for the subject shape is a convex polyhedral region that is a candidate for the three-dimensional shape of the subject and inside which the subject is expected to exist. This candidate region is referred to in the voxel addition processing described later. In this embodiment, the candidate region is generated from the silhouette image data using the visual volume intersection method. FIG. 5, described above, shows the candidate regions generated from the silhouette images 611 to 613 of FIG. 6(b). In FIG. 5, the fan-shaped regions 504 to 509, two extending from each of the cameras 501 to 503, are top views of cones whose vertical cross section is the silhouette of the subject (here, a circle). When the cones of regions 504 to 509 are projected into space, the polygonal regions where the cones overlap (the regions drawn with thick lines) 511, 512, 521, and 522 become the candidate regions.
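The candidate region generation can be sketched on a voxel grid as follows: a grid point is kept only if it projects inside the silhouette in every camera. This is a minimal sketch under stated assumptions, not the patent's implementation. It assumes an ideal pinhole camera after distortion correction, and the convention that a world point X maps to camera coordinates as R(X - C), where C is the camera position. The names `Camera`, `project`, `in_silhouette`, and `visual_hull` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence, Set, Tuple

Point = Tuple[float, float, float]
Image = Sequence[Sequence[int]]


@dataclass
class Camera:
    f: float                      # focal length in pixels (internal parameter)
    cx: float                     # image-center x coordinate (internal parameter)
    cy: float                     # image-center y coordinate (internal parameter)
    R: Sequence[Sequence[float]]  # 3x3 world-to-camera rotation (assumed convention)
    C: Sequence[float]            # camera position in world coordinates


def project(cam: Camera, p: Sequence[float]) -> Optional[Tuple[int, int]]:
    """Project world point p to integer pixel (u, v); None if behind the camera."""
    rel = [p[i] - cam.C[i] for i in range(3)]
    xc, yc, zc = [sum(cam.R[r][i] * rel[i] for i in range(3)) for r in range(3)]
    if zc <= 0:
        return None
    return (round(cam.f * xc / zc + cam.cx), round(cam.f * yc / zc + cam.cy))


def in_silhouette(sil: Image, uv: Optional[Tuple[int, int]]) -> bool:
    """True if pixel uv lies inside the image and on a white (255) silhouette pixel."""
    if uv is None:
        return False
    u, v = uv
    return 0 <= v < len(sil) and 0 <= u < len(sil[0]) and sil[v][u] == 255


def visual_hull(cameras: Sequence[Camera], silhouettes: Sequence[Image],
                grid_points: Iterable[Point]) -> Set[Point]:
    """Keep the grid points whose projection falls on the silhouette in all cameras."""
    return {
        p for p in grid_points
        if all(in_silhouette(s, project(c, p)) for c, s in zip(cameras, silhouettes))
    }
```

With a single camera, every point along a viewing ray through the silhouette survives. Adding cameras from other directions intersects the cones and trims the region, which is also how the small false regions described in the text arise when only a few cameras are used.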

In step 405, the voxel deletion unit 304 deletes unnecessary voxels from the candidate region generated in step 404 (voxel deletion processing). The details of the voxel deletion processing are described later.

In step 406, the voxel restoration unit 305 restores voxels that were deleted excessively by the voxel deletion processing (voxel restoration processing), based on the threshold set by the threshold setting unit 307 and the information on the candidate region generated by the candidate region generation unit 303. The details of the voxel restoration processing are described later.

The above is the series of image processing performed by the image processing apparatus 200 of this embodiment.

<Characteristics of the visual volume intersection method>
The visual volume intersection method does not use correspondences based on the color information of the subject. This makes shape estimation robust and fast, but the method also produces false regions and can estimate only convex shapes. In the shooting scene of FIG. 5, the right-hand sphere of the two subject spheres is partially chipped. As described above, cones 504 and 505 are projected into space from camera 501, cones 506 and 507 from camera 502, and cones 508 and 509 from camera 503. In this case, in addition to the convex polyhedra 511 and 521 circumscribing the two subject spheres, the relatively small convex polyhedra 512 and 522 are also generated as candidate regions. Because these four candidate regions are estimated as convex polyhedra circumscribing the subject, a concave part of a subject such as the sphere 520 cannot be reproduced. In addition, the convex polyhedra 512 and 522, which do not correspond to any actual subject, are estimated as false candidate regions. The present invention compensates for these drawbacks of the visual volume intersection method.

<Voxel deletion processing>
Next, the voxel deletion processing in step 405 of the flow of FIG. 4 is described. In this processing, whether to delete a given voxel is decided based on the consistency of its colors when it is projected onto a plurality of cameras with different viewpoints. FIG. 7 illustrates the voxel deletion processing. FIG. 7(a) shows the images 601 to 603 captured by the cameras 501 to 503, as shown in FIG. 6(a), and FIG. 7(b) shows an arbitrary voxel 700 being projected onto the cameras 501 to 503. In FIG. 7(a), the black rectangular points 701 to 703 on the images 601 to 603 indicate the positions at which the voxel 700 is projected into each image. In FIG. 7(b), the line segments 711, 712, and 713 indicate the virtual projection planes of the cameras 501, 502, and 503, respectively. The voxel deletion unit 304 calculates the similarity among the pixel values from the pixel value {I_1R, I_1G, I_1B} of point 701, the pixel value {I_2R, I_2G, I_2B} of point 702, and the pixel value {I_3R, I_3G, I_3B} of point 703, onto which the voxel 700 is projected. NCC (Normalized Cross-Correlation) is used here to evaluate similarity, but another similarity measure such as SSD (Sum of Squared Difference) or SAD (Sum of Absolute Difference) may be used. When N cameras are determined to see the voxel 700 (visibility determination), the average NCC is obtained using the following equation (1).
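Equation (1) itself is not reproduced in this text. A form consistent with the surrounding description (N visible cameras, a three-dimensional RGB vector I per camera, a sum over camera pairs that has three terms when N = 3, and an NCC that reflects the angle between two vectors) would be the following. It is offered as an assumed reconstruction, not as the equation published in the patent.

```latex
\overline{\mathrm{NCC}}
  = \frac{1}{\binom{N}{2}}
    \sum_{1 \le i < j \le N}
    \frac{\mathbf{I}_i \cdot \mathbf{I}_j}
         {\lVert \mathbf{I}_i \rVert \, \lVert \mathbf{I}_j \rVert},
\qquad
\mathbf{I}_i = (I_{iR},\, I_{iG},\, I_{iB})
```

Under this form, each summand is the cosine of the angle between two RGB vectors, and dividing by the number of pairs gives the average.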

In equation (1), the bold I is a three-dimensional vector whose elements are the R, G, and B channels. Different voxels are visible from different cameras, so the number of cameras N varies from voxel to voxel. In the example of FIG. 7, N = 3, so the sum in equation (1) is taken over three combinations.

NCC yields the angle formed by the pixel values when they are treated as vectors. Equation (1) gives the average of the NCC over the combinations of the N cameras. The visibility determination for the voxel 700 may be made, for example, by creating a depth map from the rough shape obtained by the visual volume intersection method and deciding based on that depth map, or by deciding based on the angle between the normal of the rough shape and the vector representing the optical axis direction of the camera. The method is not particularly limited.

If the average NCC obtained by equation (1) for the voxel of interest is less than a predetermined threshold D_th (for example, 0.9), the voxel is judged to lack color consistency (that is, to be an unnecessary voxel) and is deleted. Here the similarity is computed only from the pixel values at points 701 to 703, but it may also be computed from the pixel values around those points (for example, the pixel values in a 3×3 block centered on each of points 701 to 703).
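The color-consistency test can be sketched as below. This is an illustrative sketch only: it assumes the NCC of two RGB vectors is their normalized dot product (consistent with the text's remark that NCC reflects the angle between the vectors), averages it over all pairs of visible cameras, and compares the result with D_th = 0.9, the example value from the text. The function names are hypothetical. The handling of a voxel seen by fewer than two cameras (it is kept) and of all-black pixels (NCC treated as 0) are assumptions, since the source does not cover those cases.

```python
from itertools import combinations
from math import sqrt
from typing import Optional, Sequence

D_TH = 0.9  # example threshold value given in the text


def ncc(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized dot product of two RGB vectors (cosine of the angle between them)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-black pixel carries no color direction
    return dot / (norm_a * norm_b)


def average_ncc(colors: Sequence[Sequence[float]]) -> Optional[float]:
    """Average NCC over all pairs of the N visible cameras; None if N < 2."""
    pairs = list(combinations(colors, 2))
    if not pairs:
        return None
    return sum(ncc(a, b) for a, b in pairs) / len(pairs)


def should_delete(colors: Sequence[Sequence[float]], d_th: float = D_TH) -> bool:
    """True if the voxel's projected colors are judged inconsistent."""
    avg = average_ncc(colors)
    return avg is not None and avg < d_th


# Three cameras seeing nearly the same red: consistent, so the voxel is kept.
print(should_delete([(200, 10, 10), (198, 12, 9), (201, 11, 10)]))  # False
# Three cameras seeing orthogonal colors: inconsistent, so the voxel is deleted.
print(should_delete([(255, 0, 0), (0, 255, 0), (0, 0, 255)]))  # True
```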

Instead of a judgment based on color consistency, voxels may be deleted by another method, for example by judging whether a voxel is unnecessary based on its geometric information. The above is the content of the voxel deletion processing.

<Voxel restoration processing>
Next, the voxel restoration processing in step 406 of the flow of FIG. 4 is described. Before going into the specific processing, the principle of the voxel restoration processing is explained first.

As described above, a false region among the candidate regions is generated as a common region of the cones in a space where no subject exists. Its volume is therefore usually much smaller than that of the subject. In addition, voxels in a false region are likely to be deleted by the voxel deletion processing described above, so the total number of voxels in the false region after that processing is smaller still. Consequently, in the three-dimensional space containing the candidate regions, the voxel density around a specific grid point that has no corresponding voxel is expected to be relatively low inside a false region. A subject region, where a subject actually exists, has a large volume to begin with. Even if some of the voxels that are actually needed are deleted by the voxel deletion processing, the voxel density around such a specific grid point is therefore considered to be higher than in a false region. The present invention uses this difference: depending on the voxel density around the specific grid point in the candidate region, voxels are restored only in the subject region and not in the false region. Specifically, a voxel is restored at the position of the specific grid point only when the number of voxels existing in a predetermined range including that grid point in the candidate region is at least a predetermined number, denoted R_th. As a result, voxels are restored only in subject regions, which have a large volume and remain fairly densely filled with voxels even after the voxel deletion processing. The same holds when a false region is adjacent to a subject region.
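The restoration rule can be sketched as follows, under stated assumptions. The "predetermined range" is taken to be the 26 grid points adjacent to the grid point of interest. Voxels are stored as a set of integer lattice vectors. Neighbor counts are taken on the state left by the deletion processing, so restored voxels do not influence later decisions. None of these choices is confirmed by the source, and the names `neighbor_count` and `restore_voxels` are hypothetical.

```python
from itertools import product
from typing import Iterable, Set, Tuple

Voxel = Tuple[int, int, int]


def neighbor_count(p: Voxel, voxels: Set[Voxel], radius: int = 1) -> int:
    """Count voxels within `radius` grid steps of p along each axis, excluding p."""
    x, y, z = p
    offsets = product(range(-radius, radius + 1), repeat=3)
    return sum(
        1 for dx, dy, dz in offsets
        if (dx, dy, dz) != (0, 0, 0) and (x + dx, y + dy, z + dz) in voxels
    )


def restore_voxels(candidate: Iterable[Voxel], remaining: Set[Voxel],
                   r_th: int = 13) -> Set[Voxel]:
    """Add a voxel at each empty grid point of the candidate region whose
    surrounding voxel count is at least r_th."""
    restored = set(remaining)
    for p in candidate:
        if p not in remaining and neighbor_count(p, remaining) >= r_th:
            restored.add(p)
    return restored


# A 3x3x3 block with its center deleted: the center has 26 neighbors and is restored.
block = set(product(range(3), repeat=3))
print(restore_voxels(block, block - {(1, 1, 1)}) == block)  # True
```

A grid point in a sparse false region has few neighbors and stays empty, which is the behavior the paragraph above describes.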

The way the threshold setting unit 307 determines the predetermined number R_th is as follows. For example, when the voxel deletion unit 304 deletes voxels based on the voxel density around a grid point, R_th can be determined from the number of grid points adjacent to that grid point in three-dimensional space (3^3 - 1 = 26). As noted above, the voxel density around a specific grid point in a false region is lower than that around one in a subject region. Half of the number of all adjacent grid points in three-dimensional space (26/2 = 13) or one third of it (26/3 ≈ 8) is therefore set as R_th. Within the subject region, even after the voxel deletion processing, it is likely that at least as many voxels remain as there are adjacent grid points in a two-dimensional plane. R_th can therefore also be determined from the number of adjacent grid points in a two-dimensional plane. For example, that number is 8 (3^2 - 1), so a number larger than 8 may be set as R_th. Even when grid points that are not adjacent to the specific grid point are referred to, R_th can be determined from two-dimensional or three-dimensional geometric properties. Setting R_th for voxel restoration from such geometric properties makes it possible to perform the voxel restoration processing with high accuracy.
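The arithmetic behind these candidate values of R_th can be checked with a few lines. The helper name `adjacent_grid_points` and the dictionary keys are illustrative assumptions. Integer division is used for the one-third case, to match the rounded value of 8 in the text, and 9 is simply the smallest integer larger than 8.

```python
def adjacent_grid_points(dimensions: int) -> int:
    """Number of grid points adjacent to a given grid point: 3^n - 1."""
    return 3 ** dimensions - 1


n3d = adjacent_grid_points(3)  # 26 neighbors in three-dimensional space
n2d = adjacent_grid_points(2)  # 8 neighbors in a two-dimensional plane

r_th_candidates = {
    "half_of_3d_neighbors": n3d // 2,   # 13
    "third_of_3d_neighbors": n3d // 3,  # 8 (26/3 rounded down)
    "more_than_2d_neighbors": n2d + 1,  # 9, the smallest integer larger than 8
}
print(r_th_candidates)
```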

FIGS. 8(a) to 8(c) illustrate the concept of the voxel restoration processing. For convenience they are drawn in two dimensions, although the actual processing is three-dimensional. In FIG. 8(a), the thick-lined hexagon 801 is the horizontal cross section, seen from above, of the convex polyhedron circumscribing the subject sphere, and the thick-lined triangle 802 is the horizontal cross section of a false region. The thin-lined rectangles 803 are the horizontal cross sections of the voxels contained in these two convex polyhedra. The volume of an actual false region would be much smaller relative to the subject region; it is exaggerated here. The interiors of the convex polyhedra corresponding to the hexagon 801 and the triangle 802 are the candidate regions. FIG. 8(b) shows the state after some voxels have been deleted by the voxel deletion processing based on color consistency. In FIG. 8(b), voxels inside the triangle 802 representing the false region have been deleted, but too many voxels inside the hexagon 801 representing the subject region have been deleted as well. For the sake of explanation, the voxels in the false region are not completely deleted here; a method for further deleting unnecessary voxels in such a case is described in Embodiment 2. FIG. 8(c) shows the state after voxels have been added to the subject region by the voxel restoration processing. In FIG. 8(c), voxels are added only within the subject region, and the voxels once deleted from the false region are not added back. The above is the principle of the voxel restoration processing in this embodiment.

Next, the specific procedure of the voxel restoration process is described. FIG. 9 is a flowchart showing the details of the voxel restoration process according to the present embodiment.

In step 901, a three-dimensional space that reliably contains the subject whose shape is to be estimated is set as the processing target space for the voxel restoration process. For example, in the scene of FIG. 1 described above, when the players and the ball on the field are the subjects of shape estimation, the range common to the angles of view of the ten cameras 101 is set as the processing target space. A narrower range may of course be set instead; for example, in FIG. 1, only a predetermined range centered on the middle of the field may be set as the processing target space. Here, it is assumed that a rectangular parallelepiped containing the subjects (two spheres) in the parallax images of FIG. 5 has been set as the processing target space.

In step 902, the grid point of interest is initialized (its initial position is set) in the processing target space. Specifically, when the rectangular parallelepiped set as the processing target space is partitioned into unit cells of the same size as a voxel, one of the resulting grid points, for example (x,y,z)=(0,0,0), is set as the initial position of the grid point of interest.

In step 903, it is determined whether a voxel exists at the grid point of interest in the processing target space. If no voxel exists at the grid point of interest, the process proceeds to step 904. If a voxel exists there, the process proceeds to step 908.

In step 904, it is determined whether the grid point of interest is included in the candidate region described above. If it is included in the candidate region, the process proceeds to step 905. If it is not, the process proceeds to step 908.

In step 905, the voxels existing around the grid point of interest are counted, and it is determined whether the count of peripheral voxels is equal to or greater than the predetermined number R_th. The "periphery" is, for example, the 26 grid points adjacent to the grid point of interest in three-dimensional space, that is, the 3×3×3=27 grid points minus the grid point of interest at their center. A wider range may also be used as the "periphery" (for example, the 63 grid points obtained by excluding the grid point of interest from 4×4×4=64 grid points). The predetermined number R_th may be set in advance in accordance with the "principle of the voxel restoration process" described above, taking into account the extent of the "periphery", the size of the unit cell in the processing target space, and so on. Here, R_th is assumed to be 10. If the count of peripheral voxels is equal to or greater than R_th, the process proceeds to step 906. If the count is less than R_th, the process proceeds to step 908.

In step 906, a voxel is added at the position in the candidate region corresponding to the grid point of interest. Then, in step 907, it is determined whether all grid points in the processing target space have been processed. If any unprocessed grid point remains, the process proceeds to step 908. If all grid points have been processed, the process proceeds to step 909.

In step 908, the grid point of interest is updated. The update changes the position coordinates in the order x-axis, then y-axis, then z-axis, for example (0,0,0)→(1,0,0)→(2,0,0) and so on. Once a new grid point of interest has been set, the process returns to step 903 and continues with that grid point.

In step 909, it is determined whether the processing described above has been performed a prescribed number of times. If not, the iteration count is updated in step 910 and the process returns to step 902. If the prescribed number of iterations has been reached, the process ends. For example, if the prescribed number n is set to n=5, the processing described above is repeated five times.

The above is the content of the voxel restoration process. The voxels in the subject region are restored appropriately in FIG. 8C because step 904 limits the grid points at which voxels can be added to those inside the candidate region. The voxels in the false region are not restored because step 905 counts the voxels adjacent to the grid point of interest and requires that count to reach R_th.
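The flow of steps 902 to 910 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation of the embodiment: the voxel set and the candidate region are assumed to be boolean NumPy arrays indexed as `[x, y, z]` over the processing target space, the "periphery" is the 26-neighbourhood, and the names `count_neighbors` and `restore_voxels` are hypothetical. Voxels are added in place during the raster scan, which is one reading of the flowchart.

```python
from itertools import product

import numpy as np


def count_neighbors(occ, x, y, z):
    """Count occupied voxels among the (up to) 26 grid points adjacent to (x, y, z)."""
    x0, x1 = max(x - 1, 0), min(x + 2, occ.shape[0])
    y0, y1 = max(y - 1, 0), min(y + 2, occ.shape[1])
    z0, z1 = max(z - 1, 0), min(z + 2, occ.shape[2])
    block = occ[x0:x1, y0:y1, z0:z1]
    # Exclude the grid point of interest itself from the count.
    return int(block.sum()) - int(occ[x, y, z])


def restore_voxels(occ, candidate, r_th=10, n_iter=5):
    """Add voxels at empty candidate grid points whose neighbourhood is dense enough."""
    occ = occ.copy()
    nx, ny, nz = occ.shape
    for _ in range(n_iter):                                  # steps 909-910
        # x varies fastest, then y, then z (step 908).
        for z, y, x in product(range(nz), range(ny), range(nx)):
            if occ[x, y, z]:                                 # step 903: voxel already present
                continue
            if not candidate[x, y, z]:                       # step 904: outside candidate region
                continue
            if count_neighbors(occ, x, y, z) >= r_th:        # step 905
                occ[x, y, z] = True                          # step 906
    return occ
```

The defaults `r_th=10` and `n_iter=5` follow the example values given in the text. A pure-Python triple loop is slow on large grids; it is used here only because it mirrors the flowchart step by step.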

<Effect of this embodiment>
FIG. 10 is a diagram for explaining the effect of the invention according to the present embodiment. FIG. 10A is the same as FIG. 8B and shows the state after voxels have been deleted by the voxel deletion process. In this state, not only the voxels in the candidate region 1002, which is a false region, but also necessary voxels in the candidate region 1001, which represents the subject region, have been deleted. FIG. 10B shows the result of applying dilation by a known morphological operation to the post-deletion state of FIG. 10A. In this case, a voxel is added whenever even one voxel exists at a grid point adjacent to the grid point of interest, so voxels are added beyond the boundaries of both candidate regions 1001 and 1002. With conventional morphological dilation, voxels are thus added beyond the extent of the subject region, so the shape of the subject is not estimated accurately, and the false region is dilated as well. FIG. 10C shows the result of applying the method of the present embodiment to the post-deletion state of FIG. 10A. With this method, no voxels are added to the false region and voxels are added only to the subject region, so the subject shape is estimated more accurately.
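The contrast between the two approaches can be illustrated with a small sketch. It rests on assumptions not stated in the text: occupancy is a boolean NumPy array, the neighbourhood is the 26-neighbourhood, each function performs a single simultaneous pass over the grid rather than the raster scan of FIG. 9, and the function names are hypothetical.

```python
import numpy as np


def neighbor_counts(occ):
    """For every grid point, count occupied voxels among its 26 neighbours."""
    padded = np.pad(occ.astype(np.int32), 1)
    nx, ny, nz = occ.shape
    total = np.zeros(occ.shape, dtype=np.int32)
    for dx in range(3):
        for dy in range(3):
            for dz in range(3):
                if (dx, dy, dz) != (1, 1, 1):  # skip the grid point itself
                    total += padded[dx:dx + nx, dy:dy + ny, dz:dz + nz]
    return total


def dilate_once(occ):
    """Conventional dilation: add a voxel if at least one neighbour is occupied."""
    return occ | (neighbor_counts(occ) >= 1)


def density_restore_once(occ, candidate, r_th=10):
    """Add a voxel only inside the candidate region and only if r_th or more neighbours exist."""
    return occ | (candidate & (neighbor_counts(occ) >= r_th))
```

For a single isolated voxel, `dilate_once` grows it into a 3×3×3 block, whereas `density_restore_once` leaves it unchanged because no empty grid point has ten occupied neighbours.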

<Modification>
As a modification of the present embodiment, a mode is described in which a voxel that the voxel deletion process (step 405) has marked for deletion by the color-consistency determination (the determination using the threshold D_th), referred to below as a deletion target voxel, is re-examined under a different criterion before it is actually deleted. Specifically, the voxels around the deletion target voxel are counted, and the count is compared with a separately set predetermined number to decide whether the deletion target voxel is actually deleted. This predetermined number is denoted S_th to distinguish it from the predetermined number R_th of step 905 described above. Here, the value of S_th is set to 5, but other values may be used.

First, suppose that the average NCC value obtained from equation (1) above falls below the similarity threshold D_th described above (for example, 0.9; hereinafter this threshold is referred to as the first threshold D_th1), so that the voxel becomes a deletion target. Next, the voxels around the deletion target voxel are counted. The extent of the "periphery" may be the same as in the voxel restoration process (step 406). If the number of peripheral voxels is less than the predetermined number S_th (for example, 5), the deletion target voxel is judged likely to belong to a false region and is actually deleted. If the number of peripheral voxels is equal to or greater than S_th, the deletion target voxel is judged to possibly belong to the subject region, and color consistency is determined again using a different similarity threshold (for example, 0.7; hereinafter referred to as the second threshold D_th2). If the average NCC value obtained from equation (1) is less than the second threshold D_th2, the deletion target voxel is actually deleted. If the average NCC value is equal to or greater than D_th2, the deletion target voxel is judged likely to belong to the subject region and is kept without being deleted.
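The two-stage decision described above can be written as a small function. This is only a sketch: the function name and argument names are hypothetical, the average NCC value and the peripheral voxel count are assumed to have been computed elsewhere, and the defaults are the example values from the text.

```python
def should_delete(avg_ncc, neighbor_count, d_th1=0.9, d_th2=0.7, s_th=5):
    """Decide whether a voxel is actually deleted under the modified criterion."""
    if avg_ncc >= d_th1:
        # Colors are consistent: the voxel is not a deletion target at all.
        return False
    if neighbor_count < s_th:
        # Sparse surroundings: likely a false region, so delete.
        return True
    # Dense surroundings: possibly the subject region, so re-test with the
    # looser second threshold before deleting.
    return avg_ncc < d_th2
```

For example, a voxel with an average NCC of 0.8 is deleted when it has 3 peripheral voxels but kept when it has 8, while a voxel with an average NCC of 0.6 is deleted in both cases.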

The predetermined number S_th used for the re-determination in this modification is preferably set to a value smaller than the predetermined number R_th used in the voxel addition process. However, as long as the voxels deleted from the false region are not restored by the subsequent voxel addition process, S_th may be chosen freely without regard to its relationship to R_th.

This modification prevents voxels in the subject region from being deleted unnecessarily in the voxel deletion process.

As described above, according to the present embodiment, voxels that were over-deleted from the subject region can be restored with simpler processing, while unnecessary voxels in the false region are further deleted.

[Embodiment 2]
Next, a mode is described as the second embodiment in which voxel deletion processes with different criteria are performed multiple times, with voxels added and the candidate region updated each time, to achieve more accurate shape estimation. Descriptions of the parts common to the first embodiment are omitted or simplified, and the following description focuses on the differences.

FIG. 11 is a functional block diagram of the image processing apparatus 200 according to the present embodiment. In addition to the units shown in FIG. 3, the image processing apparatus 200 of the present embodiment has a candidate region updating unit 1101. FIG. 12 is a flowchart showing the flow of the series of image processing executed by the image processing apparatus 200 of the present embodiment. The flow of this image processing is described below.

Steps 1201 to 1204 are the same as steps 401 to 404 in the flow of FIG. 4 of the first embodiment. That is, parallax image data and silhouette image data are acquired in step 1201, and various parameters of the cameras 101 are acquired in the subsequent step 1202. In step 1203, distortion correction is performed based on the distortion parameters among the acquired camera parameters, and in step 1204, a "candidate region" inside which the subject is expected to exist is generated.

In step 1205, one of the multiple voxel deletion processes to be applied is performed on the candidate region generated in step 1204. In the present embodiment, the description takes as an example a case in which, in addition to the voxel deletion process based on color consistency described in the first embodiment, a process of deleting the voxels that constitute the shadow of the subject is performed. The subject portion of a silhouette image may include part of the subject's shadow, and this shadow must be removed when the shape of the subject alone is to be estimated. A process is therefore performed to delete the voxels in the part onto which the subject's shadow is cast, such as the ground or a table. For example, when the surface onto which the shadow is cast is at z=0, all voxels whose z coordinate is equal to or less than a certain threshold are deleted. Either of these two types of voxel deletion process may be executed first.
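The shadow-voxel deletion described above amounts to clearing the lowest layers of the voxel grid. The sketch below assumes a boolean NumPy occupancy array indexed as `[x, y, z]`, with layer index 0 lying on the shadow-receiving plane z=0 and layers spaced `voxel_size` apart; the function and parameter names are hypothetical.

```python
import numpy as np


def delete_shadow_voxels(occ, voxel_size, z_threshold):
    """Delete every voxel whose z coordinate is at or below z_threshold."""
    occ = occ.copy()
    # z coordinate of each layer, assuming layer 0 sits on the plane z = 0.
    z_coords = np.arange(occ.shape[2]) * voxel_size
    occ[:, :, z_coords <= z_threshold] = False
    return occ
```

The threshold trades off shadow removal against loss of the subject's lowest part, which is why the subsequent restoration step is relied on to recover feet or the bottom of an object.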

In step 1206, a voxel restoration process is performed to restore voxels that were over-deleted by the voxel deletion process. Its content is the same as step 406 in the flow of FIG. 4 of the first embodiment. If the process of deleting the voxels that constitute the subject's shadow is executed first, voxels are added so as to restore the parts other than the shadow, such as the feet if the subject is a person or the bottom if the subject is an object.

In step 1207, it is determined whether all of the multiple voxel deletion processes have been executed. If any voxel deletion process has not yet been executed, the process proceeds to step 1208, where the candidate region is updated. That is, when deletion is performed multiple times, the second and subsequent deletion processes use, as the new candidate region, a candidate region that reflects the results of the immediately preceding deletion and restoration. For example, if the voxel deletion process executed immediately before was the deletion of the voxels constituting the subject's shadow, a new candidate region from which the shadow has been removed is set. The process then returns to step 1205, and the next voxel deletion process is executed on the newly set candidate region. If all voxel deletion processes have been executed, the process ends.
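The loop over steps 1205 to 1208 can be sketched as follows. The sketch assumes that each deletion process is a callable taking the current candidate region (a boolean NumPy array) and returning the remaining voxels, and that the restoration process is a callable taking the remaining voxels and the candidate region. The name `estimate_shape` and these signatures are hypothetical, not part of the embodiment.

```python
from typing import Callable, Sequence

import numpy as np

DeleteFn = Callable[[np.ndarray], np.ndarray]
RestoreFn = Callable[[np.ndarray, np.ndarray], np.ndarray]


def estimate_shape(candidate: np.ndarray,
                   deletion_steps: Sequence[DeleteFn],
                   restore: RestoreFn) -> np.ndarray:
    """Apply each deletion process in turn, restoring and updating the candidate region."""
    occ = candidate.copy()
    for i, delete in enumerate(deletion_steps):
        occ = delete(candidate)             # step 1205: one voxel deletion process
        occ = restore(occ, candidate)       # step 1206: restore over-deleted voxels
        if i + 1 < len(deletion_steps):     # step 1207: any deletion process left?
            candidate = occ.copy()          # step 1208: update the candidate region
    return occ
```

Because the candidate region shrinks after each pass, a later restoration step cannot re-add voxels that an earlier deletion process removed and restoration did not recover.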

The above is the content of the series of image processing executed by the image processing apparatus 200 of the present embodiment. Executing multiple types of voxel deletion process in this way makes it possible to delete voxels in the false region 802 that a single type of voxel deletion process could not remove completely, as in FIG. 8C described above.

The multiple voxel deletion processes executed in the present embodiment are not limited to the two types described above, and three or more types may be performed. The number of iterations n of the voxel addition process may also be varied for each voxel deletion process. For example, setting a smaller number of iterations for the voxel addition process that follows the color-consistency deletion process makes it possible to reproduce concave subject shapes.

In the present embodiment, multiple types of voxel deletion process are executed while the candidate region is updated. Repeating the deletion and addition of voxels in this way allows the three-dimensional shape of the subject to be estimated with higher accuracy.

<Other embodiments>
The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

200 image processing apparatus
301 image data acquisition unit
303 candidate region generation unit
304 voxel deletion unit
305 voxel restoration unit

Claims

An image processing apparatus comprising:
generating means for generating, based on a plurality of images acquired by a plurality of image capturing devices, a candidate region that corresponds to a three-dimensional shape of a subject and is composed of voxels; and
processing means for performing processing, including deletion, on a specific voxel constituting the candidate region,
wherein the processing means determines whether to delete the specific voxel based on the density of other voxels around the specific voxel.

The image processing apparatus according to claim 1, wherein the processing means further processes the specific voxel based on a predetermined criterion.

The image processing apparatus according to claim 2, wherein the processing means deletes the specific voxel based on the specific voxel not satisfying the predetermined criterion and the density not satisfying a predetermined condition.

The image processing apparatus according to claim 2, wherein the processing means deletes the specific voxel based on the specific voxel not satisfying the predetermined criterion, and restores the deleted specific voxel based on the density of other voxels around the deleted specific voxel satisfying a predetermined condition.

The image processing apparatus according to claim 3, wherein the predetermined condition is that the number of other voxels existing in a predetermined range including the specific voxel is equal to or larger than a predetermined number.

The image processing apparatus according to claim 5, wherein the predetermined number is determined based on the number of voxels adjacent to one voxel.

The image processing apparatus according to any one of claims 2 to 6, wherein the predetermined criterion is that there is color consistency among the portions of the plurality of images corresponding to the specific voxel.

The image processing apparatus according to claim 7, wherein the color consistency exists when an evaluation value of the mutual similarity of pixel values in the portions of the plurality of images corresponding to the specific voxel is above a threshold.

The image processing apparatus according to any one of claims 2 to 8, wherein the predetermined criterion is that the specific voxel is not a voxel corresponding to a shadow of the subject.

An image processing method comprising:
a generation step of generating, based on a plurality of images acquired by a plurality of image capturing devices, a candidate region that corresponds to a three-dimensional shape of a subject and is composed of voxels; and
a processing step of performing processing, including deletion, on a specific voxel constituting the candidate region,
wherein, in the processing step, whether to delete the specific voxel is determined based on the density of other voxels around the specific voxel.

A program for causing a computer to function as the image processing apparatus according to any one of claims 1 to 9.