JP2022029239A

JP2022029239A - Image processing apparatus, image processing method, and program

Info

Publication number: JP2022029239A
Application number: JP2020132476A
Authority: JP
Inventors: 圭輔森澤; Keisuke Morisawa
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2020-08-04
Filing date: 2020-08-04
Publication date: 2022-02-17

Abstract

PROBLEM TO BE SOLVED: To suppress a processing load related to three-dimensional shape data.

SOLUTION: A three-dimensional model generation device 104 has: a suppression foreground selection unit 902 that selects objects in a first area in a frame and objects in a second area that is an area other than the first area; a suppression interval determination unit 903 that determines a first frame rate for generating three-dimensional shape data of the objects in the first area, and determines a second frame rate for generating three-dimensional shape data of the objects in the second area to be a frame rate different from the first frame rate; a model generation unit 905 that generates the three-dimensional shape data by using object images; and a data deletion unit 904 that performs control such that the three-dimensional shape data of the objects in the first area is generated at the first frame rate and the three-dimensional shape data of the objects in the second area is generated at the second frame rate.

SELECTED DRAWING: Figure 9

Description

The technique of the present disclosure relates to the generation of three-dimensional shape data of an object.

There is a technique in which a plurality of cameras are installed at different positions and capture images synchronously from a plurality of viewpoints, and a virtual viewpoint image is generated by using the plurality of images obtained by the imaging. In generating a virtual viewpoint image, three-dimensional shape data of a target object is generated. The amount of calculation for generating the three-dimensional shape data may become large depending on factors such as the number and size of the target objects, the number of imaging devices, the size of the voxels that make up the three-dimensional shape, and the size of the target three-dimensional space.

As a method of reducing the amount of calculation, Patent Document 1 describes a method in which the measurement target space is divided into voxels of the maximum size, and voxel division is repeated for the voxels that overlap the target object, so that the target object is reconstructed as a cluster of a large number of minimum-size voxels.

Patent Document 1: Japanese Unexamined Patent Publication No. 2001-307073

A captured image may include a region of high importance, such as the area in front of the goal in a soccer match, and a region of low importance, such as the area outside the field, and target objects may be located not only in the region of high importance but also in the region of low importance. Therefore, when three-dimensional shape data of a plurality of target objects included in the captured image is generated by the method described in Patent Document 1, the processing load for the region of low importance is not sufficiently suppressed, and the processing load related to the three-dimensional shape data may increase.

An object of the present disclosure is to suppress the processing load related to three-dimensional shape data.

An image processing apparatus according to the present disclosure generates three-dimensional shape data of an object included in a frame by using an object image that shows the shape of the object in the frame obtained by imaging with a plurality of imaging devices. The apparatus includes: selection means for selecting an object in a first region in the frame and an object in a second region, which is a region other than the first region; determination means for determining a first frame rate at which three-dimensional shape data of the object in the first region is generated, and for determining a second frame rate, at which three-dimensional shape data of the object in the second region is generated, to be a frame rate different from the first frame rate; image generation means for generating the object image corresponding to the frame; and data generation means for generating the three-dimensional shape data by using the object image generated by the image generation means. The image generation means performs control such that the three-dimensional shape data of the object in the first region is generated at the first frame rate and the three-dimensional shape data of the object in the second region is generated at the second frame rate.

According to the technique of the present disclosure, the processing load related to three-dimensional shape data can be suppressed.

A diagram showing the configuration of a virtual viewpoint image generation system.
A diagram showing an example of the camera arrangement of the virtual viewpoint image generation system.
A diagram showing foregrounds detected from a captured image.
A diagram showing rectangular textures and rectangular masks.
A diagram showing a foreground mask image.
A diagram for explaining a method of generating a three-dimensional model by the visual volume intersection method.
A diagram for explaining a three-dimensional model of a foreground represented by voxels.
A diagram showing the hardware configuration of a three-dimensional model generation device.
A diagram showing the functional configuration of the three-dimensional model generation device.
A diagram for explaining a background model and importance information.
A flowchart showing three-dimensional model generation processing.
A diagram showing importance regions and a mask image.
A diagram for explaining deletion of rectangular masks in a low-importance region.
A diagram for explaining composition of three-dimensional models.
A diagram showing the functional configuration of a three-dimensional model generation device.
A flowchart showing three-dimensional model generation processing.
A diagram for explaining importance information according to the number of voxels.
A diagram for explaining importance regions according to the distance from a ball model.

Hereinafter, the technique of the present disclosure will be described in detail based on embodiments with reference to the accompanying drawings. The following embodiments are merely concrete examples of implementing the technique of the present disclosure, and the technical scope of the present disclosure is not to be construed as limited by them. The technique of the present disclosure can be implemented in various forms without departing from its technical idea or its main features.

<First Embodiment>
In the first embodiment, in the process of generating three-dimensional models of a plurality of objects included in a series of frames, the update frequency (generation frequency) of the three-dimensional model is lowered for any object, among the plurality of objects, that is located in a preset region of low importance. This embodiment describes how this processing suppresses the processing load related to three-dimensional models.

[System configuration]
FIG. 1 is a block diagram showing the configuration of a virtual viewpoint image generation system 100 of the present embodiment for generating a virtual viewpoint image. A virtual viewpoint image is an image generated based on the position, orientation, and the like of a virtual camera that does not actually exist and differs from the real cameras, and is also called a free viewpoint image or an arbitrary viewpoint image. With the technique for generating virtual viewpoint images, for example, a highlight scene of soccer or basketball can be viewed from various angles, which gives the user a greater sense of presence than a normal image. The virtual viewpoint image may be a moving image or a still image. In the present embodiment, the virtual viewpoint image is described as a moving image.

The virtual viewpoint image generation system 100 of the present embodiment includes a camera array 101 having a plurality of cameras 101a to 101p, a foreground extraction device group 102 having a plurality of foreground extraction devices 102a to 102p, and a control device 103. The virtual viewpoint image generation system 100 further includes a three-dimensional model generation device 104 and a rendering device 105. Each of the foreground extraction devices 102a to 102p, the control device 103, the three-dimensional model generation device 104, and the rendering device 105 is realized by a general image processing apparatus including a CPU that performs arithmetic processing and a memory that stores the results of the arithmetic processing, programs, and the like.

The camera array 101 is composed of the plurality of cameras 101a to 101p, captures objects from a plurality of directions at various angles, and outputs the image data of the captured images to the foreground extraction device group 102.

FIG. 2 is a bird's-eye view, looking at the field from directly above, of the arrangement of all 16 cameras 101a to 101p constituting the camera array 101. As shown in FIG. 2, the cameras are arranged around the stadium and capture images in time synchronization from various angles toward a gazing point on the field that is common to all the cameras 101a to 101p. However, in the present disclosure, some of the cameras 101a to 101p may be directed toward a different gazing point, or all the cameras 101a to 101p may be directed toward different positions. Alternatively, some of the cameras 101a to 101p may be directed toward the same gazing point while the remaining cameras are directed toward positions that differ from one another.

The foreground extraction device group 102 is composed of the foreground extraction devices 102a to 102p, which correspond to the cameras 101a to 101p, respectively. Each of the foreground extraction devices 102a to 102p extracts, from the image data of the captured image output from the corresponding camera, a foreground region showing the silhouette of an object included in the captured image.

An object, which is the foreground of a captured image, is a subject that can be viewed from an arbitrary angle of the virtual viewpoint and is, for example, a person on the field of the stadium. Alternatively, the object may be an object whose image pattern is predetermined, such as a ball or a goal. The object may be a moving body or a stationary body.

Each of the foreground extraction devices 102a to 102p then generates a rectangular mask, which indicates the foreground region and the remaining region of the captured image obtained by the corresponding camera, and a rectangular texture, which is an image showing the texture of the object.

FIG. 3 is a diagram for explaining the foreground region extraction processing in the foreground extraction devices 102a to 102p. FIG. 3(a) is an example of a captured image obtained by the camera 101d of FIG. 2, and the captured image 300 includes foregrounds 3a to 3e showing five objects. The foreground extraction device 102d extracts the foreground regions included in the captured image 300 of the camera 101d and calculates five rectangular regions 4a to 4e corresponding to the foregrounds 3a to 3e, as shown in FIG. 3(b). As a method of extracting the foreground region, for example, a region in which the difference in luminance or color between a held background image and the captured image is large is determined to be the foreground. The background refers to the region of the captured image other than the foreground.

FIG. 4 is a diagram showing rectangular textures and rectangular masks. FIG. 4(a) shows a plurality of images obtained by cutting out the rectangular regions that contain the foregrounds shown in FIG. 3(b); each of these images is called a rectangular texture. FIG. 4(b) shows binary silhouette images 5a to 5e, one per object, in which, within each rectangular region containing a foreground shown in FIG. 3(b), the foreground region showing the object is represented in white and the background region in black. Each of these binary images is called a rectangular mask. The rectangular textures and rectangular masks are transmitted to the three-dimensional model generation device 104 together with the coordinates of the rectangular regions.
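The background-difference extraction and the cutting out of a rectangular mask described above can be illustrated with a minimal sketch. It is not the implementation of the foreground extraction devices: the function names, the grayscale list-of-rows image representation, and the threshold value are assumptions made for illustration, and the sketch treats the whole image as containing a single object, whereas the description computes one rectangle per object.

```python
from typing import List, Tuple

Image = List[List[int]]            # grayscale image as rows of luminance values
Rect = Tuple[int, int, int, int]   # (x, y, width, height)


def foreground_mask(captured: Image, background: Image, threshold: int = 30) -> Image:
    """Mark a pixel white (255) when its luminance differs from the held
    background image by more than the threshold (hypothetical value)."""
    return [
        [255 if abs(c - b) > threshold else 0 for c, b in zip(c_row, b_row)]
        for c_row, b_row in zip(captured, background)
    ]


def bounding_rectangle(mask: Image) -> Rect:
    """Smallest rectangle containing every foreground pixel of the mask."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    if not ys:
        raise ValueError("mask contains no foreground pixels")
    return min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1


def crop(image: Image, rect: Rect) -> Image:
    """Cut the rectangle out of an image; applied to the mask this yields a
    rectangular mask, applied to the captured image a rectangular texture."""
    x, y, w, h = rect
    return [row[x:x + w] for row in image[y:y + h]]
```

Cropping the binary mask and the captured image with the same rectangle keeps the two outputs aligned, which is why the rectangle coordinates can be sent along with both.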

The control device 103 calculates camera parameters indicating the positions and orientations of the cameras 101a to 101p from the image data of captured images captured in time synchronization by the cameras of the camera array 101, and outputs the camera parameters to the three-dimensional model generation device 104 and the rendering device 105.

The camera parameters consist of external parameters and internal parameters. The external parameters consist of a rotation matrix and a translation matrix and indicate the position and orientation of the camera. The internal parameters include the focal length and optical center of the camera and indicate the angle of view of the camera, the size of the image sensor, and the like. The process of calculating the camera parameters is called calibration. The parameters are obtained by using the correspondence between points in the three-dimensional world coordinate system, acquired from a plurality of images of a specific pattern such as a checkerboard, and the corresponding two-dimensional points.
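As an illustration of how the external and internal parameters are used together, the following sketch projects a world point into pixel coordinates with the standard pinhole camera model. The pinhole model, the function name `project_point`, and the 3x3 matrix layout of `K` are assumptions for illustration; the description does not specify the projection model used by the system.

```python
from typing import Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]
Vector = Sequence[float]


def project_point(K: Matrix, R: Matrix, t: Vector, X: Vector) -> Optional[Tuple[float, float]]:
    """Project world point X to pixel coordinates (u, v).

    R and t are the external parameters (rotation and translation) and
    K is the 3x3 internal parameter matrix (focal length, optical center).
    Returns None when the point lies behind the camera.
    """
    # World coordinates -> camera coordinates: Xc = R X + t
    cam = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    if cam[2] <= 0:
        return None
    # Camera coordinates -> homogeneous image coordinates: x = K Xc
    img = [sum(K[i][j] * cam[j] for j in range(3)) for i in range(3)]
    return img[0] / img[2], img[1] / img[2]
```

Calibration amounts to estimating `K`, `R`, and `t` so that known world points, such as checkerboard corners, project onto their observed pixel positions.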

The three-dimensional model generation device 104 acquires the camera parameters from the control device 103, acquires the rectangular masks from the foreground extraction device group 102, and reconstructs the plurality of rectangular masks to generate a mask image for each of the plurality of cameras.

FIG. 5 is a diagram showing a foreground mask image (also referred to as an object image) that shows the silhouettes of objects and corresponds to the captured image 300 obtained by the camera 101d. The foreground mask image is an image showing the regions of the objects in the captured image, and is reconstructed by placing each rectangular mask at the position of its object in the captured image 300 based on the rectangular masks and the coordinate information of each rectangular mask. Whereas a rectangular mask is a per-object image cut out by a rectangle containing an individual object in the captured image, as shown in FIG. 4(b), the foreground mask image is an image composed by pasting the plurality of rectangular masks at the coordinates from which they were cut out, as shown in FIG. 5. Like a rectangular mask, the foreground mask image is a binary silhouette image in which the foreground region of the captured image is represented in white and the background region in black. In the present embodiment, the foreground mask image is described as being generated by the three-dimensional model generation device 104, but it may instead be generated by the foreground extraction devices 102a to 102p.
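The reconstruction of a foreground mask image from rectangular masks and their coordinates can be sketched as follows. This is a minimal illustration under assumptions: the function name, the list-of-rows image representation, and the choice to write only white pixels (so that overlapping rectangles do not erase each other) are not taken from the description.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]


def compose_foreground_mask(
    width: int,
    height: int,
    rect_masks: Sequence[Tuple[Tuple[int, int], Sequence[Sequence[int]]]],
) -> Image:
    """Paste each rectangular mask at the (x, y) position it was cut from.

    rect_masks holds ((x, y), mask) pairs, where mask is a binary image
    with 255 for foreground and 0 for background.
    """
    image = [[0] * width for _ in range(height)]  # start from an all-black image
    for (x0, y0), mask in rect_masks:
        for dy, row in enumerate(mask):
            for dx, value in enumerate(row):
                if value:  # copy only foreground pixels
                    image[y0 + dy][x0 + dx] = 255
    return image
```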

Further, the three-dimensional model generation device 104 uses the foreground mask images of the plurality of cameras to generate, in a three-dimensional space, three-dimensional models of the objects represented as sets of voxels by the visual volume intersection method. The generation of three-dimensional models and voxels will be described later. The three-dimensional model generation device 104 outputs the data of the generated three-dimensional models of the objects to the rendering device 105.

The rendering device 105 acquires the three-dimensional models of the objects and foreground texture images. A foreground texture image is an image reconstructed by placing each rectangular texture at the position of its object in the captured image based on the rectangular textures and the coordinate information of each rectangular texture. The rendering device 105 also acquires the camera parameters from the control device 103. The rendering device 105 generates a virtual viewpoint image based on these data. Specifically, the positional relationship between the foreground texture images and the three-dimensional models of the objects is obtained from the camera parameters. Each voxel constituting a three-dimensional model is then colored based on the color of the corresponding pixel in the foreground texture image, whereby the three-dimensional space is reconstructed and an image viewed from an arbitrary viewpoint is generated.

In the present embodiment, the foreground extraction devices 102a to 102p and the three-dimensional model generation device 104 are described as being connected in a star topology. Alternatively, the foreground extraction devices 102a to 102p and the three-dimensional model generation device 104 may be connected in a ring, bus, or similar topology by daisy-chain connection.

[Method of generating a three-dimensional model]
Next, an outline of the processing for generating a three-dimensional model of an object in the three-dimensional model generation device will be described. The three-dimensional shape of an object represented by voxels in the three-dimensional shape data is also called the three-dimensional model of the object (foreground). Here, the generation of a three-dimensional model of an object by the visual volume intersection method is described.

FIG. 6 is a diagram showing the basic principle of the visual volume intersection method. FIG. 6(a) shows a target object C, which is an object, being imaged by a camera, which is an imaging device. As described above, binarizing based on the difference in color or luminance between the captured image of the target object C and the background image yields a foreground mask image that contains the two-dimensional silhouette (foreground region) of the target object C.

FIG. 6(b) shows a cone that spreads into the three-dimensional space from the projection center (Pa) of the camera so as to pass through each point on the contour of the two-dimensional silhouette Da. This cone is called the visual volume Va of that camera. FIG. 6(c) shows how the three-dimensional model of an object is obtained from a plurality of visual volumes. As shown in FIG. 6(c), a visual volume is obtained for each camera from the two-dimensional silhouettes Da based on images captured synchronously by a plurality of cameras at different positions. In the generation of a three-dimensional model by the visual volume intersection method, the three-dimensional model of the target object is generated by finding the intersection (common region) of the visual volumes of the plurality of cameras. The generated three-dimensional model is represented as a set of voxels.

FIG. 7 is a diagram for explaining voxels. A voxel is a minute cube as shown in FIG. 7(a). FIG. 7(b) represents, as a set of voxels, the target space of the cameras for which three-dimensional models are generated. When one voxel of interest, which is the voxel to be processed among the voxels in the target space, is projected onto the foreground mask image of each camera, it is determined whether the projection of the voxel of interest falls within the foreground region of the foreground mask image of each camera. If this determination shows that the projection of the voxel of interest falls outside the foreground region, the voxel of interest is deleted.

By removing the voxels that do not fall within the foreground regions of the cameras, the three-dimensional model of the quadrangular-pyramid target object shown in FIG. 7(c) is generated from voxels as shown in FIG. 7(d). By assigning to each voxel constituting the generated three-dimensional model the pixel value of the texture image corresponding to the position of that voxel, a colored three-dimensional shape of the object can be constructed in the three-dimensional space.
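The voxel deletion described above can be sketched as a simple carving loop. This is an illustrative sketch under assumptions rather than the device's implementation: the names `carve` and `is_foreground` are hypothetical, each camera is abstracted as a caller-supplied projection function paired with its foreground mask image, and a voxel is tested by projecting a single representative point instead of its full extent.

```python
from typing import Callable, Iterable, Optional, Sequence, Set, Tuple

Voxel = Tuple[int, int, int]
Pixel = Tuple[int, int]                      # (u, v) = (column, row)
Mask = Sequence[Sequence[int]]               # binary foreground mask image
Projector = Callable[[Voxel], Optional[Pixel]]


def is_foreground(mask: Mask, pixel: Optional[Pixel]) -> bool:
    """True when the pixel lies inside the image and on a white mask pixel."""
    if pixel is None:
        return False
    u, v = pixel
    return 0 <= v < len(mask) and 0 <= u < len(mask[0]) and mask[v][u] != 0


def carve(voxels: Iterable[Voxel], views: Sequence[Tuple[Projector, Mask]]) -> Set[Voxel]:
    """Keep only voxels whose projection is foreground in every camera view;
    a voxel that falls outside the foreground in any view is deleted."""
    return {
        voxel
        for voxel in voxels
        if all(is_foreground(mask, project(voxel)) for project, mask in views)
    }
```

Because every voxel is tested against every camera, the cost of this loop grows with both the voxel count and the number of cameras, which is the load the embodiment aims to reduce.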

Depending on factors such as the number and size of the target objects, the number of cameras, the voxel size, and the size of the target three-dimensional space, the amount of calculation for processing related to three-dimensional models increases, and generating the virtual viewpoint image may become difficult. In particular, the fineness of a three-dimensional model, the load of generating the three-dimensional model, and the processing load related to the three-dimensional model are affected by the number of voxels constituting the three-dimensional model.

In the present embodiment, the three-dimensional model of an object is represented by cubic voxels, but the representation is not limited to this. For example, points may be used as the elements constituting the three-dimensional space, and the three-dimensional model may be represented as a point cloud.

[Hardware configuration of the three-dimensional model generation device]
FIG. 8 is a block diagram showing an example of the hardware configuration of the three-dimensional model generation device 104 of the present embodiment. The three-dimensional model generation device 104 includes a CPU 801, a ROM 802, a RAM 803, a storage device 804, a display unit 805, an operation unit 806, and a communication unit 807.

The CPU 801 is a central processing unit and controls the entire three-dimensional model generation device 104 by executing programs stored in the ROM 802 or the RAM 803.

The ROM 802 is a read-only non-volatile memory. The RAM 803 is a memory that can be read from and written to at any time. For example, a DRAM (Dynamic Random Access Memory) can be used as the RAM 803. The storage device 804 is a large-capacity storage device composed of, for example, a hard disk. The storage device 804 can store the rectangular masks and other data obtained from the foreground extraction devices 102a to 102p.

The display unit 805 is composed of, for example, a liquid crystal display or LEDs, and displays a GUI (Graphical User Interface) and the like with which the user operates the three-dimensional model generation device 104. The operation unit 806 is composed of, for example, a keyboard, a mouse, a joystick, or a touch panel, and inputs various instructions to the CPU 801 in response to operations by the user. The CPU 801 also operates as a display control unit that controls the display unit 805 and as an operation control unit that controls the operation unit 806. In the present embodiment, the display unit 805 and the operation unit 806 are assumed to be inside the three-dimensional model generation device 104, but at least one of the display unit 805 and the operation unit 806 may exist as a separate device outside the three-dimensional model generation device 104.

The communication unit 807 controls communication between the three-dimensional model generation device 104 and external devices. For example, through the communication control performed by the communication unit 807, the three-dimensional model generation device 104 is connected to the foreground extraction device group 102 and the rendering device 105 via a network such as a LAN.

The functional units of the three-dimensional model generation device 104 shown in FIG. 9, described later, are realized by the CPU 801 executing a predetermined program, but the implementation is not limited to this. For example, hardware such as a GPU (Graphics Processing Unit) for accelerating computation or an FPGA (Field Programmable Gate Array) may be used. That is, each functional unit of the three-dimensional model generation device 104 may be realized by cooperation between software and hardware such as a dedicated IC, or some or all of the functions may be realized by hardware alone. A configuration in which a plurality of three-dimensional model generation devices 104 are used to distribute and execute the processing of the functional units is also possible.

[Functional configuration of the three-dimensional model generation device]
FIG. 9 is a block diagram showing the functional configuration of the three-dimensional model generation device 104 in the present embodiment. The three-dimensional model generation device 104 of the present embodiment includes an acquisition unit 901, a suppression foreground selection unit 902, a suppression interval determination unit 903, a data deletion unit 904, a model generation unit 905, a data management unit 906, and a data synthesis unit 907.

The acquisition unit 901 acquires, from the control device 103, the camera parameters of the cameras 101a to 101p constituting the camera array 101, a background model showing the three-dimensional shape of the imaging environment such as a stadium, and importance information indicating the importance of regions within the imaging environment. The importance information will be described later. The acquisition unit 901 further acquires, from the foreground extraction device group 102, the rectangular textures shown in FIG. 4(a) and the rectangular masks shown in FIG. 4(b), which are obtained by foreground extraction from the images captured by the cameras constituting the camera array 101. The acquisition unit 901 outputs the camera parameters, the background model, and the importance information to the suppression foreground selection unit 902. The acquisition unit 901 also outputs the camera parameters, the rectangular textures, and the rectangular masks to the data deletion unit 904.

The suppression foreground selection unit 902 sets an importance for each region in the captured images of the cameras 101a to 101p by using the background model, the camera parameters, and the importance information. For example, regions corresponding to each importance are first arranged, based on the importance information, on the background model representing the three-dimensional space of the imaging environment. Each of these regions in the three-dimensional space is then projected onto the captured image of a camera by using the camera parameters, which sets the importance of the regions in the captured image. Information indicating the importance of the regions in a captured image is called importance region information. The suppression foreground selection unit 902 then selects objects located in a region of a predetermined importance in the target frame, that is, the frame for which three-dimensional shape data is to be generated. The suppression foreground selection unit 902 outputs the importance region information for each captured image to the suppression interval determination unit 903.

The suppression interval determination unit 903 determines a time interval indicating how often the three-dimensional model of an object located in a region of low importance (low importance region) in the captured image is updated. This time interval is also referred to as the suppression interval. For example, when the input frame rate, which is the rate at which captured images are received, is 60 fps and the three-dimensional models of objects in the low importance region are to be updated at 30 fps, the suppression interval is the information that specifies this update frame rate. The frequency of updating a three-dimensional model can also be described as the frequency of generating it.

The frame rate (fps: frames per second) indicates the number of units processed per second, and the input frame rate is the number of captured images (frames) output from a camera per second. Likewise, the frame rate indicates the number of foreground texture images and foreground mask images generated per second.

The processing frame rate represents the number of three-dimensional models generated per second. Each time a frame is acquired from the cameras, three-dimensional models of the objects that are not located in a low importance region are generated by the visual volume intersection method. Therefore, when the input frame rate is 60 fps, 60 three-dimensional models are generated per second for each object not located in a low importance region. The processing frame rate representing the update frequency of these models is 60 fps, the same as the input frame rate.

On the other hand, when the interval is 2, the three-dimensional model of a foreground object located in the low importance region is generated and updated by the visual volume intersection method at a processing frame rate of 30 fps, which is 1/2 of the input frame rate (60 fps). In this case, the 60 frames input from the cameras per second include frames that are not used to generate the three-dimensional models of objects in the low importance region. Such frames are also called processing suppression frames. Similarly, when the interval is 4, the three-dimensional models of objects in the low importance region are generated and updated at 15 fps, which is 1/4 of the input frame rate.
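The relationship between the interval and the processing frame rate can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment; the function names, and the assumption that frame index 0 of each second is always processed, are hypothetical.

```python
def processing_frame_rate(input_fps: float, interval: int) -> float:
    """Update rate for a region whose models are regenerated every `interval` frames."""
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    return input_fps / interval


def suppressed_frames_per_second(input_fps: int, interval: int) -> int:
    """Count the frames in one second that are not used for that region."""
    # Assumption: frames are indexed 0..input_fps-1 and index 0 is always used.
    return sum(1 for i in range(input_fps) if i % interval != 0)
```

With a 60 fps input, an interval of 2 gives a 30 fps processing frame rate and an interval of 4 gives 15 fps, matching the values in the text.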

The suppression interval determination unit 903 determines this interval, or the processing frame rate for the low importance region, as the suppression interval. The suppression interval determination unit 903 may acquire suppression interval information and determine from it the time interval at which the three-dimensional models of objects located in the low importance region are updated. It may also determine a suppression interval for each importance level. The suppression interval determination unit 903 outputs the importance region information and the suppression interval for each importance level to the data deletion unit 904.

The suppression interval determination unit 903 may determine the suppression interval dynamically. Alternatively, it may determine the target frame to be a processing suppression frame when a predetermined condition is satisfied. For example, when the processing load of generating three-dimensional models becomes high, the target frame may be determined to be a processing suppression frame. Alternatively, when that processing load becomes high, the suppression interval may be determined so that the time interval between updates of the three-dimensional models of objects located in the low importance region is lengthened and the processing frame rate for updating them is lowered.

The data deletion unit 904 acquires the importance region information and the suppression intervals output from the suppression interval determination unit 903. When the frame from which the rectangular masks and other data acquired by the acquisition unit 901 were extracted is a processing suppression frame, the data deletion unit 904 deletes, from among the generated rectangular masks, the rectangular masks of objects located in the low importance region. The data deletion unit 904 generates the foreground mask image shown in FIG. 5 by using the rectangular masks other than the deleted ones, and outputs the generated foreground mask image to the model generation unit 905.

In the present embodiment, the foreground mask image is described as being generated by combining rectangular masks, but it may be generated without using rectangular masks. For example, it may be generated by extracting the foreground region from the captured image and binarizing the image so that the foreground region is white and the background region is black. In that case, the data deletion unit 904 may delete the foreground in the low importance region by processing the foreground mask image so that any foreground region inside the low importance region becomes background.
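The alternative just described, turning foreground pixels inside a low importance region into background, might look like the sketch below. It assumes, hypothetically, that the foreground mask image and the importance region information are both available as row-major lists of equal size; the names `erase_low_importance`, `importance_map`, and `suppressed_levels` are illustrative only.

```python
BACKGROUND = 0    # black
FOREGROUND = 255  # white


def erase_low_importance(mask, importance_map, suppressed_levels):
    """Return a copy of `mask` in which pixels inside suppressed regions are background."""
    return [
        [
            BACKGROUND if importance_map[y][x] in suppressed_levels else pixel
            for x, pixel in enumerate(row)
        ]
        for y, row in enumerate(mask)
    ]
```

The function returns a new image and leaves the input untouched, so the original mask remains available if it is needed elsewhere.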

When the data deletion unit 904 deletes the rectangular mask of an object in the low importance region, it may likewise delete the rectangular texture of that object. Deleting the rectangular textures of objects in the low importance region prevents those textures from being used for coloring in the rendering device 105.

The model generation unit 905 is a three-dimensional shape data generation unit that generates three-dimensional models of the objects included in a frame by the visual volume intersection method, using the camera parameters and the foreground mask images of the plurality of cameras output from the data deletion unit 904. The generated data is output to the data synthesis unit 907. When three-dimensional models are generated for a processing suppression frame, the foreground mask images used for model generation do not include the foreground regions corresponding to the rectangular masks deleted by the data deletion unit 904. Therefore, for a processing suppression frame, the model generation unit 905 generates three-dimensional models only of objects located outside the low importance region, and not of objects located in the low importance region.
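The idea behind the visual volume intersection method can be illustrated with a toy voxel-carving sketch: a voxel is kept only if it projects onto foreground in every view. Everything here is an assumption for illustration, including the function name `carve`, the tiny 3 x 3 x 3 grid, and the two orthographic "cameras"; an actual system would project each voxel with the camera parameters instead.

```python
from itertools import product


def carve(grid_size, views):
    """Keep the voxels whose projection is foreground in every view.

    views: list of (project, mask) pairs; project maps (x, y, z) to a pixel (u, v),
    and mask[v][u] is truthy for foreground pixels.
    """
    kept = []
    for voxel in product(range(grid_size), repeat=3):
        inside = True
        for project, mask in views:
            u, v = project(voxel)
            if not (0 <= v < len(mask) and 0 <= u < len(mask[0]) and mask[v][u]):
                inside = False
                break
        if inside:
            kept.append(voxel)
    return kept


# Toy orthographic views (an assumption, not perspective cameras).
front = (lambda p: (p[0], p[1]), [[0, 0, 0], [0, 1, 0], [0, 0, 0]])  # looks along z
side = (lambda p: (p[2], p[1]), [[0, 0, 0], [1, 1, 0], [0, 0, 0]])   # looks along x
```

If a foreground region is erased from a mask, every voxel that projected onto it is rejected, which is why deleting masks for low importance regions yields fewer voxels.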

The data synthesis unit 907 acquires the foreground mask images and the three-dimensional models of the objects generated by the model generation unit 905, and acquires the foreground texture images and the suppression intervals from the data deletion unit 904. When the target frame is a processing suppression frame, the data synthesis unit 907 combines the previously generated three-dimensional models of the objects into the low importance region of the three-dimensional space, that is, the region for which the data deletion unit 904 performed deletion. The data obtained by this combination is output to the rendering device 105 as data representing the three-dimensional shapes of the objects included in the target frame.

Similarly, the data synthesis unit 907 combines, into the low importance region of the foreground mask image, the foreground of the corresponding region of the previously generated foreground mask image. The resulting foreground mask image is used in the next frame as the mask image of the foreground located in the low importance region. In the same way, the data synthesis unit 907 combines, into the low importance region of the foreground texture image, the corresponding foreground portion of the previously generated foreground texture image. Each piece of data obtained by these combinations is output to the data management unit 906.

The data management unit 906 stores the data output by the data synthesis unit 907 in a storage unit such as the ROM 803 or the storage device 804. The stored data consists of the three-dimensional models, the foreground texture images, and the foreground mask images. The data synthesis unit 907 can therefore obtain, via the data management unit 906, the three-dimensional models of objects located in the low importance region from the three-dimensional models corresponding to the previous frame.

[About the importance of regions in the image]
FIG. 10 is a diagram for explaining the importance of regions in an image. FIG. 10(a) shows importance regions R1 to R3, which are regions for which an importance has been set, superimposed on the three-dimensional space of the background model. FIG. 10(a) also shows the arrangement of the cameras 101a to 101p based on the camera parameters.

The background model is three-dimensional shape data representing the three-dimensional space of the imaging environment; in FIG. 10(a) it represents a simplified stadium. The background model is expressed, for example, in a geometry definition file format that is also used in three-dimensional CG (computer graphics).

The importance regions R1 to R3 are displayed based on the importance information. The importance information is information for determining the importance of each region in the three-dimensional space. Based on the importance information, the floor surface of the background model is divided and an importance is assigned to each region. In the present embodiment, the importance is set in three levels, 1, 2, and 3, in descending order of importance. The region of importance 1 is set in the background model as region R1, the region of importance 2 as region R2, and the region of importance 3 as region R3. The regions other than region R1, which has the highest importance, namely regions R2 and R3, are referred to as low importance regions. The importance information is, for example, information indicating the importance of each region as set by a user instruction.
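As an illustration of how importance information could assign levels to the floor surface, the sketch below uses a purely hypothetical layout: a 100 m x 60 m field centered at the origin, with the 20 m nearest each goal line treated as region R1. None of these dimensions or names come from the embodiment.

```python
# Hypothetical floor-plane layout in metres (not taken from the embodiment).
FIELD_HALF_LENGTH = 50.0
FIELD_HALF_WIDTH = 30.0
GOAL_AREA_DEPTH = 20.0


def importance_at(x: float, y: float) -> int:
    """Return 1 (highest), 2, or 3 (lowest) for a point on the floor plane."""
    if abs(x) > FIELD_HALF_LENGTH or abs(y) > FIELD_HALF_WIDTH:
        return 3  # R3: outside the field
    if abs(x) >= FIELD_HALF_LENGTH - GOAL_AREA_DEPTH:
        return 1  # R1: in front of a goal
    return 2      # R2: around the center of the field


def is_low_importance(level: int) -> bool:
    """Every region other than R1 counts as a low importance region."""
    return level != 1
```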

In the example of FIG. 10, region R1 is the area in front of the goals, which draws a high degree of viewer attention, region R2 is the area around the center of the field, and region R3 is the area outside the field. In the present embodiment, the importance information is described as information for assigning importance to the divided floor surface, which is a two-dimensional plane, but it may instead be information for assigning importance to the three-dimensional space.

In the present embodiment, objects located in the low importance region are processed so that their three-dimensional models are updated less frequently. Lowering the update frequency for unimportant regions suppresses the overall processing load related to three-dimensional models while keeping the update frequency for important regions high.

FIG. 10(b) shows the image that would be captured when the camera 101e images the three-dimensional space of FIG. 10(a); it is obtained by perspective projection transformation of points in the three-dimensional space based on the camera parameters. Performing the perspective projection transformation for each camera yields the importance regions in the captured images of all the cameras.

The perspective projection transformation projects a three-dimensional coordinate point onto the image plane of a camera according to Equation 1.
s x = A [R | t] X ... Equation 1
In Equation 1, X is a coordinate point in the three-dimensional space, [R | t] is the external parameters of the camera, A is the internal parameters of the camera, x is the corresponding coordinate point on the two-dimensional image plane, and s is the z coordinate of the point in the camera coordinate system.
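Equation 1 can be written out in a few lines of code. The sketch below is a plain-Python illustration that assumes A is a 3 x 3 matrix whose last row is (0, 0, 1), R is a 3 x 3 rotation matrix, and t is a 3-vector; the function name `project` and the choice to return `None` for points behind the camera are assumptions.

```python
def project(A, R, t, X):
    """Project the 3D point X to pixel coordinates using s * x = A [R | t] X."""
    # Camera coordinates: Xc = R X + t
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    s = Xc[2]  # z coordinate in the camera coordinate system
    if s <= 0:
        return None  # the point is not in front of the camera
    # Homogeneous image point: s * x = A Xc
    sx = [sum(A[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    return (sx[0] / s, sx[1] / s)
```

Applying such a projection to the corner points of each importance region on the floor surface would give the outline of that region in a camera's image.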

[Processing flow for generating 3D model]
FIG. 11 is a flowchart showing the flow of generating three-dimensional shape data representing the three-dimensional models of objects in a frame. The series of processes shown in the flowchart of FIG. 11 is performed by the CPU 801 of the three-dimensional model generation device 104 loading the program code stored in the ROM 803 into the RAM 802 and executing it. Some or all of the functions of the steps in FIG. 11 may instead be realized by hardware such as an ASIC or an electronic circuit. The symbol "S" in the description of each process denotes a step in the flowchart, and the same applies to the subsequent flowcharts.

In S1101, the suppression foreground selection unit 902 determines the importance of the regions in the captured images of the cameras 101a to 101p based on the camera parameters, the background model, and the importance information.

In S1102, the suppression interval determination unit 903 acquires the suppression interval for each importance level. In the description of this flowchart, the suppression interval is expressed as a processing frame rate. The processing frame rate for objects located in region R1, which has the highest importance (importance 1), is assumed to be predetermined as 60 fps, the same as the input frame rate. The processing frame rate for objects located in region R2 (importance 2) is assumed to be predetermined as 30 fps, and that for objects located in region R3, which has the lowest importance (importance 3), as 15 fps. The suppression interval determination unit 903 therefore acquires these predetermined processing frame rates for each importance level.

S1103 to S1110 are the steps for outputting the three-dimensional models of the objects included in one frame (the target frame) output from the cameras 101a to 101p. First, in S1103, the acquisition unit 901 acquires the rectangular masks and rectangular textures of each camera from the foreground extraction device group 102. In S1104, the suppression foreground selection unit 902 determines whether any of the acquired rectangular masks is located in a low importance region.

FIG. 12 is a diagram for explaining the processing of this step. FIG. 12(a) shows a frame captured by the camera 101e of FIG. 10(a). FIG. 12(b) shows the rectangular masks generated from the frame of FIG. 12(a), placed on the image obtained when the camera 101e images the background model with the importance regions superimposed. As shown in FIG. 12(b), when there are rectangular masks located in region R2 at the center of the field and in region R3 outside the field, both of which are low importance regions, it is determined that there is a rectangular mask located in a low importance region.

For a rectangular mask that straddles two importance regions, such as the rectangular mask M1 in FIG. 12(b), which straddles region R1 and region R2, the determination of whether it is located in a low importance region is made based on the importance of the region to which the lower end of the rectangular mask belongs. That is, the rectangular mask M1 is determined to be located in region R1, a high importance region.
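The lower-end rule can be sketched as below. The text does not say which point of the lower end is examined, so using the midpoint of the bottom edge is an assumption, as are the function name `mask_importance` and the representation of the importance region information as a per-pixel grid.

```python
def mask_importance(rect, importance_map):
    """Importance of a rectangular mask, decided by the midpoint of its bottom edge.

    rect is (x, y, width, height) in pixels, with the origin at the top-left corner.
    importance_map[row][col] holds the importance level of each pixel.
    """
    x, y, width, height = rect
    bottom_row = y + height - 1
    centre_col = x + width // 2
    return importance_map[bottom_row][centre_col]
```

Because an object's feet are normally at the bottom of its rectangle, the bottom edge is a reasonable proxy for where the object stands on the floor surface.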

If there is a rectangular mask located in a low importance region, the process proceeds to S1105. In S1105, it is determined, based on the suppression interval for each importance level, whether the target frame used to generate the rectangular masks acquired in S1103 is a processing suppression frame. For example, the suppression interval determination unit 903 determines whether the target frame is a frame in which the three-dimensional models of objects located in a region of a given importance are not updated. This determination is made based on the suppression interval and the time code of the target frame.

For example, assume that the input frame rate is 60 fps, the processing frame rate for objects in region R2 is set to 30 fps, and the processing frame rate for objects in region R3 is set to 15 fps. In this case, when the frame value in the time code of the target frame is not a multiple of 2, the target frame is determined to be a processing suppression frame in which the three-dimensional models of objects located in region R2 or region R3 are not updated. When the frame value is a multiple of 2 but not a multiple of 4, the target frame is determined to be a processing suppression frame in which the three-dimensional models of objects located in region R3 are not updated.

A time code is time information expressed in the format [hours:minutes:seconds.frames]. In the remainder of this flowchart description, the processing frame rate of region R2 is 30 fps and that of region R3 is 15 fps.

In other words, for an object located in region R2, a three-dimensional model is generated using the rectangular masks generated from the target frame when the frame value of the time code of the target frame is a multiple of 2. For an object located in region R3, a three-dimensional model is generated using the rectangular masks generated from the target frame when the frame value of the time code is a multiple of 4.
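The time-code rule for S1105 can be sketched as follows, assuming time codes are strings of the form "HH:MM:SS.FF" and using the intervals 1, 2, and 4 that correspond to 60, 30, and 15 fps. The names `frame_number`, `INTERVALS`, and `suppressed_levels` are hypothetical.

```python
def frame_number(timecode: str) -> int:
    """Extract the frame field from a time code such as '12:34:56.02'."""
    return int(timecode.rsplit(".", 1)[1])


# Update interval per importance level for a 60 fps input: 60, 30, and 15 fps.
INTERVALS = {1: 1, 2: 2, 3: 4}


def suppressed_levels(timecode: str, intervals=INTERVALS) -> set:
    """Importance levels whose models are not regenerated for this frame."""
    frame = frame_number(timecode)
    return {level for level, interval in intervals.items() if frame % interval != 0}
```

A frame is a processing suppression frame exactly when the returned set is non-empty.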

When the target frame is a processing suppression frame (YES in S1105), in S1106 the rectangular masks located in the regions for which three-dimensional model generation is to be suppressed are selected from among the rectangular masks located in the low importance regions. The data deletion unit 904 then deletes the selected rectangular masks.

FIG. 13 is a diagram for explaining the processing of this step, in which rectangular masks are deleted for a processing suppression frame. FIG. 13(a) shows the rectangular masks that remain after the rectangular masks located in region R3 are deleted from those in FIG. 12(b). For example, when the frame value is a multiple of 2 but not a multiple of 4, as in the time code 12:34:56.02, the frame is a processing suppression frame in which the three-dimensional models of objects located in region R3, the region of lowest importance, are not generated. The rectangular masks located in region R3 are therefore deleted.

FIG. 13(b) shows the rectangular masks that remain after the rectangular masks located in region R2 or region R3 are deleted from those in FIG. 12(b). When the frame value is not a multiple of 2 (and therefore not a multiple of 4 either), as in the time code 12:34:56.03, the frame is a processing suppression frame in which the three-dimensional models of objects located in regions R2 and R3, that is, the regions other than the highest-importance region R1, are not generated. The rectangular masks located in regions R2 and R3 are therefore deleted.

In S1107, three-dimensional models of the objects are generated. First, the data deletion unit 904 generates foreground mask images using the rectangular masks. If rectangular masks were deleted in S1106, the data deletion unit 904 generates the foreground mask image of each camera without using the deleted rectangular masks. The model generation unit 905 then generates three-dimensional models by the visual volume intersection method using the camera parameters and the foreground mask image of each camera. As shown in FIG. 13(a) or FIG. 13(b), fewer rectangular masks are used in the model generation processing for a processing suppression frame. The foreground regions in the foreground mask images are correspondingly smaller, and the number of voxels in the generated three-dimensional models is also reduced. In this way, the present embodiment reduces the processing load related to three-dimensional models.

On the other hand, if the determination is NO in S1104 or NO in S1105, S1106 is skipped and the process proceeds to S1107. In that case, foreground mask images are generated using all of the acquired rectangular masks, and three-dimensional models of all the objects included in the frame are generated.

In S1108, it is determined whether there is any object for which a three-dimensional model was not generated in S1107. If rectangular masks were deleted in S1106, the determination is YES.

If there is an object for which a three-dimensional model was not generated (YES in S1108), in S1109 the data synthesis unit 907 acquires, from the three-dimensional models of the previous frame, the models located in the regions where no three-dimensional model was generated. It then combines them by adding the acquired models of the previous frame to the three-dimensional space in which the models generated this time are arranged. The data obtained by this combination is output to the rendering device 105 as data representing the three-dimensional models of the objects in the target frame.

FIG. 14 is a diagram for explaining the processing of the data synthesis unit 907. FIG. 14(a) shows, as white regions, the three-dimensional models of the objects generated by the processing of S1107. Specifically, FIG. 14(a) shows, superimposed on the background, the three-dimensional models generated from the foreground mask images obtained after the rectangular masks located in regions R2 and R3 were deleted as shown in FIG. 13(b). That is, as in the case where the time code of the target frame is 12:34:56.03, only the three-dimensional models of objects located in region R1, the high importance region, have been generated.

FIG. 14(b) shows, as hatched regions, the previously output three-dimensional models of the objects, which correspond to the previous frame. In this step, the three-dimensional models of objects located in the regions where no model was generated this time are acquired from the three-dimensional models of the previous frame. In the example of FIG. 14, the three-dimensional models located in regions R2 and R3 in FIG. 14(b) are acquired. Then, as shown in FIG. 14(c), the models located in regions R2 and R3 in the previous frame are added to regions R2 and R3 of the three-dimensional space in which the models generated this time are arranged. As a result, in a processing suppression frame, the three-dimensional models located in the low importance regions R2 and R3 are not updated, and only the models located in the high importance region R1 are updated.
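The combination in S1109 amounts to keeping the models generated for the current frame and carrying over, from the previous frame, the models that lie in the suppressed regions. The sketch below assumes, for illustration only, that each model is a dictionary with an "id", the importance "level" of the region it is located in, and a list of "voxels".

```python
def merge_models(generated, previous, suppressed_levels):
    """Combine this frame's generated models with models carried over from the previous frame."""
    # Only models located in regions that were suppressed this frame are carried over.
    carried = [m for m in previous if m["level"] in suppressed_levels]
    return generated + carried
```

For a frame such as 12:34:56.03, `suppressed_levels` would be {2, 3}, so the result holds fresh models for region R1 and previous-frame models for regions R2 and R3.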

Note that composition by addition of the three-dimensional models of the preceding frame is performed in the same way when, as shown in FIG. 13(a), the rectangular masks in region R2 are not deleted and only the rectangular masks located in region R3 are deleted. That is, the three-dimensional models located in region R3 are acquired from the three-dimensional models of the preceding frame, and the acquired models are added to the three-dimensional space in which the currently generated three-dimensional models are placed.

For a three-dimensional model near the boundary between regions of different importance, an erroneous position determination or the like may cause the three-dimensional model of the preceding frame to be acquired even though a three-dimensional model was generated for the current frame. In that case, composition would produce duplicate three-dimensional models. To avoid this duplication, it may be determined whether the three-dimensional model corresponding to the preceding frame overlaps a newly generated three-dimensional model, and, if they overlap, the three-dimensional model of the preceding frame may be left out of the composition.
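The composition step, including the optional overlap check, can be sketched as follows. This is only an illustration under stated assumptions: the `Model` record, the use of axis-aligned bounding boxes as the overlap test, and the function names are hypothetical and do not come from the embodiment, which does not specify how overlap is determined.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# (xmin, ymin, zmin, xmax, ymax, zmax) in world coordinates (assumed layout)
Box = Tuple[float, float, float, float, float, float]


@dataclass
class Model:
    """Hypothetical record for the 3D model of one object."""
    region: str  # importance region label, e.g. "R1", "R2", "R3"
    bbox: Box    # axis-aligned bounding box of the model


def boxes_overlap(a: Box, b: Box) -> bool:
    """Return True if two axis-aligned boxes intersect on all three axes."""
    return all(a[i] < b[i + 3] and b[i] < a[i + 3] for i in range(3))


def compose(current: List[Model], previous: List[Model],
            skipped_regions: Set[str]) -> List[Model]:
    """Add preceding-frame models from skipped regions to the current models.

    A preceding-frame model is left out when it overlaps a model generated
    for the current frame, which avoids duplicates near region boundaries.
    """
    merged = list(current)
    for old in previous:
        if old.region not in skipped_regions:
            continue  # this region was regenerated in the current frame
        if any(boxes_overlap(old.bbox, new.bbox) for new in current):
            continue  # duplicate of a newly generated model
        merged.append(old)
    return merged
```

A bounding-box test is cheap but conservative; a voxel-level intersection test would be a stricter alternative.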

On the other hand, when it is determined that there is no object for which a three-dimensional model was not generated (NO in S1108), three-dimensional models have been generated for all objects. The three-dimensional models of the preceding frame therefore do not need to be composited, and S1109 is skipped.

In S1110, the data management unit 906 stores the three-dimensional models and related data of the target frame in the storage unit. When composition was performed in S1109, the stored three-dimensional models are those obtained as a result of the composition. The stored data is used as the three-dimensional model data of the preceding frame in the composition processing of S1109 for the next frame. When the preceding frame is a processing suppression frame, the stored three-dimensional model data includes both three-dimensional models generated from the preceding frame and three-dimensional models generated from frames earlier than the preceding frame.

In S1111, it is determined whether there is a next frame. If there is, the process returns to S1103. By repeating the processing of S1103 to S1110, three-dimensional models are generated for a series of temporally continuous frames, and temporally continuous three-dimensional models are output.

As described above, in the present embodiment, the generation frequency is reduced for three-dimensional models located in low-importance regions, to which viewers of the virtual viewpoint image pay little attention. Meanwhile, the update frequency is maintained for three-dimensional models located in high-importance regions, to which viewers pay close attention, such as the area in front of a soccer goal. Consequently, even when processing related to the three-dimensional models imposes a high load, the processing load can be reduced while limiting degradation of the quality of the virtual viewpoint video.

In the above description, when the three-dimensional model of an object located in a low-importance region is not generated in a processing suppression frame, the three-dimensional model of the low-importance region from the preceding frame is added. Depending on the region within the frame, however, the absence of an object's three-dimensional model may have little effect on viewers. For such a region, when the three-dimensional model of an object located in the region is not generated, the three-dimensional model output for the preceding frame need not be added. Even in this case, the processing load can be reduced while limiting degradation of the quality of the virtual viewpoint video.

<Second embodiment>
The first embodiment described a method of suppressing updates of the three-dimensional models of objects located in low-importance regions in a frame, based on importance information determined in advance. The present embodiment describes a method of reducing the processing load by identifying objects that increase the processing load and reducing the update frequency of the three-dimensional models of the identified objects. The description focuses on the differences from the first embodiment. Unless otherwise specified, the configuration and processing are the same as in the first embodiment.

FIG. 15 is a block diagram showing the functional configuration of the three-dimensional model generation device 104 of the present embodiment. Processing blocks identical to those of the first embodiment are given the same reference numerals, and their description is omitted. The three-dimensional model generation device 104 of the present embodiment further includes an importance information determination unit 1501.

The importance information determination unit 1501 calculates the number of voxels of each three-dimensional model generated by the model generation unit 905 and identifies any three-dimensional model whose voxel count is equal to or greater than a threshold. It then performs region determination, in which it determines the region that contains the position of the identified three-dimensional model in the three-dimensional space. The position information and importance of the determined region in the three-dimensional space are output to the suppression foreground selection unit 902.

FIG. 16 is a flowchart showing the flow of generating three-dimensional shape data representing the three-dimensional models of the objects in a frame. The method of generating three-dimensional shape data in the present embodiment is described with reference to FIG. 16.

In S1601, the suppression foreground selection unit 902 sets, as an initial value, the importance of every region in the frame to the highest importance. In S1602, as in S1102, the suppression interval determination unit 903 acquires the suppression interval for each importance level.

S1603 to S1613 are steps for outputting the three-dimensional models of the objects included in one frame output from the cameras. Since the processing of S1603 to S1607 is the same as that of S1103 to S1107, only an outline is given. In S1603, the acquisition unit 901 acquires rectangular masks and rectangular textures from the foreground extraction device group 102. In S1604, it is determined whether the acquired rectangular masks include a rectangular mask of an object located in a low-importance region. For the first frame, the determination in this step is NO, because S1601 set the importance of every region to the highest importance. In S1605, it is determined whether the target frame is a processing suppression frame. In S1606, the rectangular masks of objects located in low-importance regions are deleted. In S1607, three-dimensional models are generated from a plurality of foreground mask images by the volume intersection method (shape from silhouette).

In S1608, the importance information determination unit 1501 determines whether the three-dimensional models generated in S1607 include a three-dimensional model composed of a number of voxels equal to or greater than a threshold. The threshold is, for example, a value determined in advance based on the processing performance of the three-dimensional model generation device 104 or on the required specifications of the virtual viewpoint image to be generated.

If there is a three-dimensional model composed of a number of voxels equal to or greater than the threshold (YES in S1608), the process proceeds to S1609. In S1609, the importance information determination unit 1501 sets, as a low-importance region, the region in the three-dimensional space of the background model that contains the three-dimensional model whose voxel count is equal to or greater than the threshold. The suppression foreground selection unit 902 updates the importance regions in the captured images of the cameras 101a to 101p based on the importance information set for the background model. Consequently, if the next frame is a processing suppression frame, the three-dimensional model generation processing for the next frame can suppress updates of the three-dimensional models of objects located in the low-importance region determined in this step.

FIG. 17 is a diagram for explaining how the importance information determination unit 1501 determines a low-importance region. FIG. 17(a) shows the generated three-dimensional models in the three-dimensional space; they include a three-dimensional model 1701 of a flag object, which is composed of a number of voxels equal to or greater than the threshold. FIG. 17(b) shows that the region in the three-dimensional space where the three-dimensional model 1701 is located has been set as region R3, a low-importance region. FIG. 17(c) shows the image obtained by projecting the three-dimensional space including region R3 onto the image plane of the camera 101e using the camera parameters of the camera 101e. In the next frame, therefore, whether there is an object located in a low-importance region in the frame is determined based on the importance regions shown in FIG. 17(c).
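The projection of a three-dimensional region onto a camera's image plane can be illustrated with a simple pinhole camera model. The sketch below is an assumption for illustration only: the source says camera parameters are used but does not give the camera model, and the function names, the box-shaped region, and the parameters `fx`, `fy`, `cx`, `cy` (intrinsics) and `rotation`, `translation` (extrinsics) are hypothetical. Lens distortion is ignored.

```python
from itertools import product
from typing import Sequence, Tuple


def project_point(p: Sequence[float], rotation: Sequence[Sequence[float]],
                  translation: Sequence[float],
                  fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float]:
    """Project a world point to pixel coordinates with a pinhole model."""
    # World coordinates -> camera coordinates: Xc = R * Xw + t
    xc, yc, zc = (
        sum(rotation[r][c] * p[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    # Perspective division followed by the intrinsic parameters
    return (fx * xc / zc + cx, fy * yc / zc + cy)


def project_region(box_min: Sequence[float], box_max: Sequence[float],
                   rotation, translation, fx, fy, cx, cy):
    """Project the 8 corners of a 3D box and return the enclosing 2D
    rectangle (umin, vmin, umax, vmax) on the image plane."""
    corners = product(*zip(box_min, box_max))  # all min/max combinations
    pts = [project_point(c, rotation, translation, fx, fy, cx, cy)
           for c in corners]
    us = [u for u, _ in pts]
    vs = [v for _, v in pts]
    return (min(us), min(vs), max(us), max(vs))
```

Repeating this for each camera with its own parameters would give one image-space importance region per camera.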

In this way, the present embodiment determines the importance of the regions frame by frame, according to the content of each frame, so that three-dimensional models composed of many voxels are updated less frequently. The amount of processing related to a three-dimensional model often depends on its number of voxels. Reducing the update frequency of three-dimensional models composed of many voxels therefore reduces the processing load related to the three-dimensional models.

On the other hand, if there is no three-dimensional model composed of a number of voxels equal to or greater than the threshold (NO in S1608), the process proceeds to S1610. In S1610, as in S1601, the importance regions are set so that the importance of every region in the frame is the highest importance.
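The two branches that follow the voxel-count check of S1608 can be sketched as below. This is a minimal illustration, assuming that each model is summarized by an axis-aligned bounding box and a voxel count, and that a low-importance region is the model's bounding box enlarged by a margin; the function name, the `margin` parameter, and the box representation are hypothetical and not taken from the source.

```python
from typing import List, Tuple

# (xmin, ymin, zmin, xmax, ymax, zmax) in world coordinates (assumed layout)
Box = Tuple[float, float, float, float, float, float]


def low_importance_regions(models: List[Tuple[Box, int]], threshold: int,
                           margin: float = 0.0) -> List[Box]:
    """Return the 3D regions to treat as low importance in the next frame.

    `models` holds (bounding box, voxel count) pairs for the models generated
    in the current frame. An empty result means that no model reached the
    threshold, so every region keeps the highest importance.
    """
    regions = []
    for bbox, voxel_count in models:
        if voxel_count >= threshold:
            xmin, ymin, zmin, xmax, ymax, zmax = bbox
            regions.append((xmin - margin, ymin - margin, zmin - margin,
                            xmax + margin, ymax + margin, zmax + margin))
    return regions
```

Because the result is recomputed for every frame, a region stops being low importance as soon as no large model is found in it.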

Since the processing of S1611 to S1613 is the same as that of S1108 to S1110, only an outline is given. In S1611, it is determined whether there is an object for which no three-dimensional model was generated. If there is, in S1612 the three-dimensional models located in the low-importance region that were output for the preceding frame are composited, and the result is output to the rendering device 105. In S1613, the output three-dimensional models and related data are stored. In S1614, it is determined whether there is a next frame. If there is, the process returns to S1603. By repeating the processing of S1603 to S1613, three-dimensional models are generated for a series of frames, and temporally continuous three-dimensional models are output.

Note that a three-dimensional model composed of a number of voxels equal to or greater than the threshold may exist in the current frame and again in the next frame. In this case, the region set as a low-importance region for the preceding frame may first be returned to a high-importance region, and the region where the three-dimensional model is located in the current frame may then be set as a low-importance region.

The above description covered a method of reducing the update frequency of three-dimensional models composed of many voxels, that is, a number of voxels equal to or greater than the threshold. Alternatively, the importance of the regions may be determined dynamically based on the position of a predetermined object.

FIG. 18 shows three-dimensional models generated from a frame obtained by imaging a soccer game. It illustrates an example in which the importance of the regions in the frame is determined so that a region closer to the three-dimensional model 1801 of a highly important object, such as the ball in a soccer game, has a higher importance. Determining the importance dynamically in this way allows the region that attracts the user's attention to be treated as a high-importance region even when that region changes from frame to frame, as the area around the ball does. As a result, the three-dimensional models of objects located in the region of high attention can continue to be updated at a high frame rate.
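Distance-based importance of this kind could be computed as in the following sketch. The concentric radii, the integer importance levels (1 being the highest), and the function name are assumptions made for illustration; the source states only that regions closer to the important object receive a higher importance.

```python
import math
from typing import Sequence


def importance_by_distance(region_center: Sequence[float],
                           ball_position: Sequence[float],
                           radii: Sequence[float] = (10.0, 25.0)) -> int:
    """Return an importance level for a region; 1 is the highest.

    `radii` must be sorted in ascending order. A region whose center lies
    within radii[0] of the ball gets level 1, within radii[1] level 2, and
    so on; anything farther away gets the lowest level, len(radii) + 1.
    """
    distance = math.dist(region_center, ball_position)
    for level, radius in enumerate(radii, start=1):
        if distance <= radius:
            return level
    return len(radii) + 1
```

Calling this once per region per frame with the current ball position lets the high-importance area follow the ball.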

The above description also covered a method in which, when there is a three-dimensional model composed of a number of voxels equal to or greater than the threshold, the region where that model is located is uniformly determined to be a low-importance region. Alternatively, the importance of the region where a three-dimensional model is located may be varied according to the number of voxels constituting the model. For example, a plurality of thresholds may be set. When the number of voxels is equal to or greater than a first threshold, the region where the three-dimensional model is located is determined to be region R3, which has the lowest importance. When the number of voxels is less than the first threshold but equal to or greater than a second threshold that is smaller than the first threshold, the region where the three-dimensional model is located is determined to be region R2, which has the next lowest importance.
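The two-threshold rule can be written compactly as shown below. The function name is hypothetical, and returning "R1" for models below both thresholds is an assumption: the source does not state the label for that case, only that R3 and R2 are the lowest and next lowest importance.

```python
def region_for_voxel_count(voxel_count: int, first_threshold: int,
                           second_threshold: int) -> str:
    """Map a model's voxel count to an importance region label.

    "R3" is the lowest importance and "R2" the next lowest. "R1" (high
    importance) for counts below both thresholds is an assumption.
    """
    if second_threshold >= first_threshold:
        raise ValueError("second_threshold must be smaller than first_threshold")
    if voxel_count >= first_threshold:
        return "R3"
    if voxel_count >= second_threshold:
        return "R2"
    return "R1"
```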

When an object moves little, lowering the update frequency of its three-dimensional model has little effect on viewers. The amount of movement of each object may therefore be calculated from the three-dimensional models generated for several frames. For the three-dimensional model of an object with little movement, that is, one whose amount of movement is equal to or less than a threshold, the processing frame rate may be lowered in order to reduce the update frequency.
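One possible way to measure the amount of movement is the path length of the model's centroid over the last few frames, as in the sketch below. This measure, the function names, and the example frame rates of 60 and 30 are assumptions for illustration; the source does not define how the amount of movement is computed or which frame rates are used.

```python
import math
from typing import Sequence


def movement_amount(centroids: Sequence[Sequence[float]]) -> float:
    """Sum of centroid displacements between consecutive frames."""
    return sum(math.dist(a, b) for a, b in zip(centroids, centroids[1:]))


def choose_frame_rate(centroids: Sequence[Sequence[float]], threshold: float,
                      normal_fps: int = 60, reduced_fps: int = 30) -> int:
    """Pick a lower model-update rate for objects that barely move."""
    if movement_amount(centroids) <= threshold:
        return reduced_fps
    return normal_fps
```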

In the above description, the importance of a region is set based on the number of voxels of a generated three-dimensional model. Alternatively, for a processing suppression frame, it may be determined during generation whether the number of voxels of any three-dimensional model has reached the threshold, and generation of that three-dimensional model may be aborted once its number of voxels reaches the threshold. The region where the three-dimensional model whose generation was aborted is located may then be determined to be a low-importance region.
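Aborting generation once the voxel count reaches the threshold could look like the following sketch. It assumes that a model is built by testing candidate voxels one at a time; the `is_inside` callback stands in for whatever per-voxel test the generator uses (for example, a silhouette consistency check) and, like the function name and return convention, is hypothetical.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

V = TypeVar("V")


def generate_with_cutoff(candidates: Iterable[V],
                         is_inside: Callable[[V], bool],
                         threshold: int) -> Tuple[List[V], bool]:
    """Collect voxels that pass `is_inside`, stopping at `threshold` voxels.

    Returns (voxels, aborted). When `aborted` is True the model is
    incomplete, and the caller may mark its region as low importance.
    """
    kept: List[V] = []
    for voxel in candidates:
        if is_inside(voxel):
            kept.append(voxel)
            if len(kept) >= threshold:
                return kept, True  # threshold reached: stop early
    return kept, False
```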

The importance or the position of a region may also be changed according to the time at which the cameras captured the target frame. For example, importance information associated with times may be acquired, and the importance of the regions may be switched at the specified times.
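Time-keyed importance information could be looked up as in this sketch, which assumes a schedule of (start time in seconds, importance map) entries sorted by start time. The schedule format, the region labels, and the function name are hypothetical.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

Schedule = List[Tuple[float, Dict[str, int]]]


def importance_at(schedule: Schedule, capture_time: float) -> Dict[str, int]:
    """Return the importance map in effect at `capture_time`.

    `schedule` must be sorted by start time; each entry applies from its
    start time until the start time of the next entry.
    """
    start_times = [start for start, _ in schedule]
    index = bisect_right(start_times, capture_time) - 1
    if index < 0:
        raise ValueError("capture_time precedes the first schedule entry")
    return schedule[index][1]
```

For a match in which the teams change ends at half time, for instance, the schedule could swap the importance of the two goal areas at the start of the second half.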

When the imaging target of the imaging devices is a sports competition, the importance information determination unit 1501 may identify the region where the competition takes place based on information in the captured images, such as the lines drawn on the field, and may set that region as a region of higher importance than the other regions.

As described above, according to the present embodiment, the importance of the regions in a frame can be changed dynamically. The processing load related to the three-dimensional models can therefore be reduced, and three-dimensional models can be generated with limited loss of quality even under high load.

<Other embodiments>
In the embodiments described above, the control device 103, the three-dimensional model generation device 104, and the rendering device 105 were described as separate devices. Alternatively, the functions of the control device 103, the three-dimensional model generation device 104, and the rendering device 105 may be implemented by one or two devices. For example, a single image processing apparatus may generate both the three-dimensional models of the foreground and the virtual viewpoint image.

The present invention can also be implemented by processing in which a program that implements one or more functions of the embodiments described above is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be implemented by a circuit (for example, an ASIC) that implements one or more of the functions.

104 Three-dimensional model generation device
902 Suppression foreground selection unit
904 Data deletion unit
905 Model generation unit

Claims

An image processing apparatus that generates three-dimensional shape data of an object included in a frame obtained by imaging by a plurality of imaging devices, by using an object image showing the shape of the object in the frame, the image processing apparatus comprising:
a selection means for selecting an object in a first region in the frame and an object in a second region that is a region other than the first region;
a determination means for determining a first frame rate for generating the three-dimensional shape data of the object in the first region, and determining a second frame rate for generating the three-dimensional shape data of the object in the second region to be a frame rate different from the first frame rate;
an image generation means for generating the object image corresponding to the frame; and
a data generation means for generating the three-dimensional shape data by using the object image generated by the image generation means,
wherein the image generation means performs control such that the three-dimensional shape data of the object in the first region is generated at the first frame rate and the three-dimensional shape data of the object in the second region is generated at the second frame rate.

The image processing apparatus according to claim 1, wherein the first region is a region having a lower importance than the second region.

The image processing apparatus according to claim 2, wherein the value of the first frame rate is smaller than the value of the second frame rate.

The image processing apparatus according to claim 2 or 3, wherein the importance is set based on a user's instruction.

The image processing apparatus according to claim 2 or 3, further comprising a region determining means for determining the importance of the region in the frame.

The image processing apparatus according to claim 5, wherein, when the imaging target of the plurality of imaging devices is a sports competition, the region determining means identifies the region in the frame where the competition takes place and determines a region other than the region where the competition takes place to be a region of lower importance than the region where the competition takes place.

The image processing apparatus according to claim 5, wherein the region determining means determines a region of low importance based on the processing load at the time the data generation means generates the three-dimensional shape data.

The image processing apparatus according to claim 5, wherein the region determining means determines the region of low importance based on a region in which, among the three-dimensional shapes of objects represented by the three-dimensional shape data generated by the data generation means, a three-dimensional shape whose number of constituent elements is equal to or greater than a threshold is located.

The image processing apparatus according to claim 5, wherein the region determining means determines the region of low importance based on a region including a predetermined object.

The image processing apparatus according to claim 9, wherein the region determining means determines regions corresponding to a plurality of importance levels such that a region closer to the predetermined object has a higher importance.

The image processing apparatus according to claim 5, wherein the region determining means determines the region of low importance based on the amount of movement of the object.

The image processing apparatus according to claim 5, wherein the region determining means determines the region of low importance based on the time at which the frame was captured.

The image processing apparatus according to any one of claims 1 to 12, wherein the determination means determines the first frame rate and the second frame rate based on the processing load of the image processing apparatus.

The image processing apparatus according to any one of claims 1 to 13, further comprising a processing means for performing processing such that, when the frame contains a region for which three-dimensional shape data of an object is not generated, the three-dimensional shape represented by previously generated three-dimensional shape data of the object in that region is included in the three-dimensional space of the three-dimensional shape data generated by the data generation means.

The image processing apparatus according to any one of claims 1 to 14, further comprising an acquisition means for acquiring images each showing the shape of an object included in the frame,
wherein the image generation means generates the object image by using the acquired images, and
wherein, when the frame contains a region for which three-dimensional shape data of an object is not generated, the image generation means deletes the image corresponding to that region and generates the object image by using the acquired images other than the deleted image.

An image processing method for generating three-dimensional shape data of an object included in a frame obtained by imaging by a plurality of imaging devices, by using an object image showing the shape of the object in the frame, the image processing method comprising:
a selection step of selecting an object in a first region in the frame and an object in a second region that is a region other than the first region;
a determination step of determining a first frame rate for generating the three-dimensional shape data of the object in the first region, and determining a second frame rate for generating the three-dimensional shape data of the object in the second region to be a frame rate different from the first frame rate;
an image generation step of generating the object image corresponding to the frame; and
a data generation step of generating the three-dimensional shape data by using the object image generated in the image generation step,
wherein, in the image generation step, control is performed such that the three-dimensional shape data of the object in the first region is generated at the first frame rate and the three-dimensional shape data of the object in the second region is generated at the second frame rate.

A program for making a computer function as each means of the image processing apparatus according to any one of claims 1 to 15.