JP2021051374A

JP2021051374A - Shape data generation device, shape data generation method, and program

Info

Publication number: JP2021051374A
Application number: JP2019172190A
Authority: JP
Inventors: 康文 ▲高▼間; Yasufumi Takama
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-09-20
Filing date: 2019-09-20
Publication date: 2021-04-01
Anticipated expiration: 2039-09-20
Also published as: JP7475831B2

Abstract

PROBLEM TO BE SOLVED: To appropriately perform processing related to the generation of shape data over an entire imaging space. SOLUTION: A space setting unit 110 sets, for a first subspace among a plurality of subspaces included in an imaging space, a first parameter used to generate shape data of a subject included in the first subspace. The space setting unit 110 also sets, for a second subspace among the plurality of subspaces, a second parameter used to generate shape data of a subject included in the second subspace, the second parameter differing from the first parameter. A shape estimation unit 130 generates the shape data of the subject included in the first subspace based on the first parameter and a plurality of captured images obtained from a plurality of imaging devices 20. The shape estimation unit 130 also generates the shape data of the subject included in the second subspace based on the second parameter and a plurality of captured images. SELECTED DRAWING: Figure 3

Description

The present invention relates to a technique for generating a virtual viewpoint image.

In recent years, attention has been drawn to a technique in which a plurality of imaging devices are installed at different positions to capture images synchronously from a plurality of directions, and the plurality of images obtained from those imaging devices are used to generate an image (a virtual viewpoint image) as seen from a designated viewpoint (a virtual viewpoint). With such a technique, an image seen from a designated viewpoint can be generated for various events such as sports games, concerts, and plays.

Patent Document 1 describes a method of generating shape data of a subject based on images showing the subject (hereinafter referred to as foreground images), obtained through imaging by a plurality of imaging devices installed at different positions, and of generating a virtual viewpoint image using that shape data.

Patent Document 1: JP-A-2015-45920

An imaging space imaged by a plurality of imaging devices may include subjects that differ in size, motion characteristics, and the like. For example, the subjects in a rugby game are the players and the ball, and the ball is smaller and moves faster than a player. In a first space near the ground, the subjects include both the players and the ball, whereas in a second space high above the ground, no players are included and only the ball can be present. If processing such as foreground image generation and shape data generation is performed in the same way for these first and second subspaces, the processing related to the generation of shape data may not be performed appropriately: for example, the accuracy of the foreground images may vary, or the shape data may not be generated accurately.

The present invention has been made in view of the above problem. Its object is to appropriately perform processing related to the generation of shape data over the entire imaging space.

A shape data generation device according to the present invention comprises: first generation means for generating shape data of a subject included in a first subspace, based on one or more captured images acquired from one or more of a plurality of imaging devices and on a first parameter corresponding to the first subspace, the first subspace being included in a plurality of subspaces in an imaging space imaged by the plurality of imaging devices; and second generation means for generating shape data of a subject included in a second subspace included in the plurality of subspaces, based on one or more captured images acquired from one or more of the plurality of imaging devices and on a second parameter that corresponds to the second subspace and differs from the first parameter.

According to the present invention, processing related to the generation of shape data can be performed appropriately over the entire imaging space.

A diagram showing an example in which a plurality of imaging devices 2 are arranged.
A diagram for explaining the hardware configuration of the shape estimation device 1.
A diagram for explaining the configuration of the image processing system 1000 including the shape estimation device 1.
A diagram for explaining an example of the configuration of the imaging system 2.
A diagram schematically showing an example of an imaging space divided into a plurality of subspaces.
A flowchart for explaining an example of the processing performed by the shape estimation device 1.
A diagram for explaining the configuration of the image processing system 1100 including the shape estimation device 7.
A diagram for explaining an example of the configuration of the first imaging system 8 and the second imaging system 9.
A flowchart for explaining an example of the processing performed by the shape estimation device 7.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. The components described in the following embodiments are examples of embodiments of the present invention, and the present invention is not limited to them.

(First Embodiment)
The present embodiment relates to a shape estimation device that generates shape data of a subject in an imaging space imaged by a plurality of imaging devices. First, the problem addressed by this embodiment is described. When shape data of a subject in an imaging space imaged by a plurality of imaging devices is generated, the processing related to the generation of shape data may not be performed appropriately if the imaging space includes subjects that differ in size or motion characteristics. For example, when a rugby game is imaged, the imaging space includes the players and the ball. Because the ball is smaller and moves faster than a player, its foreground image may not be generated properly. Consequently, if shape data is generated in the same way for the ball and for the players, the shape data may not be generated properly either. One conceivable countermeasure is to apply different processing when generating the shape data depending on the type of subject (here, player or ball). However, it is difficult to identify the type of subject before the processing related to the generation of shape data is performed.

In the rugby example, players and the ball can both be present as subjects in the space near the ground of the field, but a space 10 m above the ground, for example, contains no players and can contain only the ball. In other words, the imaging space may contain a subspace in which only a specific subject can be present. Focusing on this point, the shape estimation device of this embodiment is configured to vary the processing related to the generation of shape data from subspace to subspace. For example, by using different processing for the subspace near the ground and for the subspace 10 m above the ground in a rugby game, the processing related to the generation of the subject's shape data is performed appropriately over the entire imaging space.

This embodiment is not limited to the above example and can also be applied in cases such as the following. For example, the imaging space may be imaged by a plurality of imaging devices that include devices with different angles of view. In that case, one possible arrangement is to image a space in which an important scene (for example, a goal scene in soccer) can occur with both telephoto and wide-angle cameras, and to image the rest of the space with wide-angle cameras only. Another possible arrangement is to image the space near the ground with telephoto cameras in order to capture the players at high resolution, and to image the space above the field, which can contain only the ball, with wide-angle cameras only. When a subject is imaged by a wide-angle camera, the region the subject occupies in the captured image is smaller than when the same subject is imaged by a telephoto camera. As a result, the processing related to the generation of shape data may not be performed appropriately. In such cases as well, the shape estimation device of this embodiment allows the processing related to the generation of the subject's shape data to be performed appropriately over the entire imaging space.

The shape estimation device of this embodiment is described below. The shape data generated by the shape estimation device is used to generate a virtual viewpoint image. The virtual viewpoint image in this embodiment is also called a free viewpoint image, but it is not limited to an image corresponding to a viewpoint freely (arbitrarily) designated by the user; for example, an image corresponding to a viewpoint selected by the user from a plurality of candidates is also a virtual viewpoint image. The virtual viewpoint may be designated by a user operation, or automatically based on the result of image analysis or the like. This embodiment mainly describes the case in which the virtual viewpoint image is a still image, but the virtual viewpoint image may be a moving image.

The plurality of imaging devices used to generate the virtual viewpoint image can be arranged so as to surround the imaging space, like the imaging devices 20 shown in FIG. 1. The targets imaged by the plurality of imaging devices are assumed to be events such as sports games, concerts, and plays. The imaging space is the space in which such an event takes place; in the case of rugby, for example, it is a three-dimensional space defined by the ground of the stadium in which the game is played and an arbitrary height. A subject is an object in the imaging space, such as a player on the field or the ball in a ball game. The imaging devices are installed at different positions and capture images synchronously from different imaging directions. The imaging devices need not be installed around the entire circumference of the imaging space and, depending on restrictions on installation locations and the like, may be installed in only some directions around it. The number of imaging devices is not limited; for example, when the imaging space is a soccer stadium, tens to hundreds of imaging devices may be installed around the stadium.

The imaging system in this embodiment is composed of a plurality of imaging devices including telephoto cameras and wide-angle cameras. For example, imaging the players in a rugby game with telephoto cameras yields high-resolution captured images and improves the resolution of the generated virtual viewpoint image. The ball, on the other hand, moves over a wide range, so imaging it with telephoto cameras would require a large number of cameras. Imaging the ball with wide-angle cameras, which have a wide angle of view, therefore reduces the number of imaging devices to be installed. In a rugby game, for example, the players are expected to be distributed near the ground of the field, while the ball can be located above the field. It is therefore also possible to image the subspace near the ground, where the players play, and the subspace reached by the ball, for example 10 m above the ground, with imaging devices having different angles of view. Alternatively, a space in which players tend to gather and important scenes (for example, goal scenes) are likely to occur, such as the goal mouth and the penalty area in soccer, may be imaged with telephoto cameras, and the other spaces with wide-angle cameras. As described above, an imaging system that includes imaging devices with different angles of view, such as telephoto and wide-angle cameras, makes it possible to generate a high-resolution virtual viewpoint image while reducing the number of imaging devices installed. The configuration of the imaging system is not limited to this. For example, it may be composed of imaging devices having the same angle of view, or it may include imaging devices of types different from those described above.

FIG. 2 is a diagram for explaining the hardware configuration of the shape estimation device 1 in this embodiment. The shape estimation device 1 includes a CPU 511, a ROM 512, a RAM 513, an auxiliary storage device 514, a communication I/F 515, and a bus 516. The CPU 511 implements each function of the shape estimation device 1 by controlling the entire device using computer programs and data stored in the ROM 512 and the RAM 513. The shape estimation device 1 may have one or more pieces of dedicated hardware separate from the CPU 511, and the dedicated hardware may execute at least part of the processing otherwise performed by the CPU 511. Examples of dedicated hardware include ASICs (application-specific integrated circuits), FPGAs (field-programmable gate arrays), and DSPs (digital signal processors). The ROM 512 stores programs and the like that do not need to be changed. The RAM 513 temporarily stores programs and data supplied from the auxiliary storage device 514, data supplied from outside via the communication I/F 515, and the like. The auxiliary storage device 514 is composed of, for example, a hard disk drive, and stores various data such as image data and audio data.

The communication I/F 515 is used for communication with devices external to the shape estimation device 1. For example, when the shape estimation device 1 is connected to an external device by wire, a communication cable is connected to the communication I/F 515. When the shape estimation device 1 has a function for communicating wirelessly with an external device, the communication I/F 515 includes an antenna. The shape estimation device 1 in this embodiment communicates with the imaging system, the image generation device described later, and the like via the communication I/F 515. The bus 516 connects the parts of the shape estimation device 1 and carries information between them. In this embodiment the auxiliary storage device 514 is assumed to be inside the shape estimation device 1, but it may instead be connected externally.

FIG. 3 is a diagram for explaining the configuration of the image processing system 1000 including the shape estimation device 1 in this embodiment. The systems and devices included in the image processing system 1000 are described with reference to FIG. 3. The image processing system 1000 includes the shape estimation device 1, an imaging system 2, an image generation device 3, and a display device 4. The shape estimation device 1 generates shape data of a subject in the imaging space imaged by the imaging system 2 and transmits the generated shape data to the image generation device 3. The shape estimation device 1 is described in detail later.

As described above, the imaging system 2 is a system composed of a plurality of imaging devices. FIG. 4 is a diagram for explaining an example of the configuration of the imaging system 2 in this embodiment. FIG. 4 shows an imaging system 20 composed of seven imaging devices 20a to 20g. The number of imaging devices included in the imaging system 2 is not limited to this. The imaging system 20 in this embodiment has foreground/background separation units 21a to 21g, and the imaging devices 20a to 20g are connected to the foreground/background separation units 21a to 21g, respectively. In the following description, unless they need to be distinguished, the imaging devices 20a to 20g and the foreground/background separation units 21a to 21g are referred to simply as the imaging device 20 and the separation unit 21.

Each of the plurality of imaging devices 20 has an identification number that identifies the imaging device 20 and subject information that indicates the subject it mainly images. The subject information in this embodiment is information representing the type of subject, such as "player" or "ball", that is, whether the subject is a person or a moving object other than a person, such as a ball. Subject information indicating a person may carry more detailed information, such as "player A" and "player B". The identification number and the subject information are set in advance. In this embodiment, one piece of subject information is assigned to each imaging device 20. The separation unit 21 separates the region corresponding to the foreground from the region corresponding to the background in the captured image obtained when the imaging device 20 images the imaging space, and generates a foreground image and a background image. The foreground in this embodiment refers to those objects corresponding to subjects that are dynamic objects (moving bodies), meaning objects that move (whose absolute position or shape can change) when imaged from the same direction over time. Dynamic objects include, in addition to the players and the ball mentioned above, singers, musicians, performers, and hosts at concerts and entertainment events. The foreground image is an image obtained by extracting the regions corresponding to these dynamic objects from the captured image.

The background refers to those objects corresponding to subjects that are stationary, or that remain in a nearly stationary state, when imaged from the same direction over time. Examples include the stage at a concert or similar event, a stadium in which an event such as a competition is held, structures such as the goals used in ball games, and the field. In other words, the background is at least a region distinct from the objects corresponding to the foreground. The background image is obtained by removing the objects corresponding to the foreground from the captured image. The subjects imaged by the imaging devices may include other objects in addition to the foreground and background described above.

The separation unit 21 performs separation processing, which extracts the foreground and generates a foreground image, according to the subject information held by the imaging device 20. For example, when the subject information held by the imaging device 20 is "player", the separation unit 21 uses the background subtraction method as the separation processing. The background subtraction method extracts the foreground by calculating the difference between the captured image and a background image. Updating the background image at regular intervals allows the foreground to be extracted robustly even when the brightness or other conditions fluctuate. When the subject information held by the imaging device 20 is "ball", the separation unit 21 uses the inter-frame difference method as the separation processing. The inter-frame difference method extracts the foreground using consecutively captured frames. For a subject such as a ball, which moves fast and over a wide range according to the laws of physics, the subject is extracted robustly by calculating the difference between the captured image from which the foreground is to be extracted and a frame captured several frames earlier in the sequence. By using different separation processing depending on the subject that each imaging device 20 mainly images, the separation unit 21 can improve the accuracy of the foreground images even when the subjects imaged by the imaging devices 20 differ in at least one of size and motion characteristics. Alternatively, each separation unit 21 corresponding to an imaging device 20 may itself hold the subject information; in that case, the separation unit 21 performs separation processing on the captured image acquired from the imaging device 20 using the method corresponding to the subject information it holds.
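The choice of separation method by subject information can be sketched roughly as follows. This is a minimal illustration and not the implementation described in the publication: the function names `abs_diff_mask` and `extract_foreground`, the `frame_gap` and `threshold` parameters, and the representation of images as nested lists of grayscale values are all assumptions made for the example.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of pixel intensities.
Image = List[List[int]]


def abs_diff_mask(a: Image, b: Image, threshold: int) -> Image:
    """Return a binary mask that is 1 where |a - b| exceeds the threshold."""
    return [
        [1 if abs(pa - pb) > threshold else 0 for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def extract_foreground(
    subject_info: str,
    frames: List[Image],
    background: Image,
    frame_gap: int = 2,   # hypothetical: how many frames back to compare
    threshold: int = 20,  # hypothetical: intensity difference threshold
) -> Image:
    """Pick the separation method from the camera's subject information."""
    current = frames[-1]
    if subject_info == "player":
        # Background subtraction: compare the current frame with a background image.
        return abs_diff_mask(current, background, threshold)
    if subject_info == "ball":
        # Inter-frame difference: compare with a frame several frames earlier.
        earlier = frames[max(0, len(frames) - 1 - frame_gap)]
        return abs_diff_mask(current, earlier, threshold)
    raise ValueError(f"unknown subject information: {subject_info!r}")
```

Note that a plain inter-frame difference marks both the earlier and the current position of a moving object, so a practical system would likely need an additional step to keep only the current position; that step is outside the scope of this sketch.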

The separation unit 21 also performs noise removal processing on the generated foreground image. One example of noise removal processing is to reduce noise regions by filtering. The processing parameters used for filtering are also set to different values according to the subject information held by the imaging device 20. These processing parameters are used to determine whether a specific region in the foreground image is noise, and are expressed, for example, as a pixel area. The pixel area here is the area of the region corresponding to the subject in the captured image, expressed as a number of pixels. For example, when the subject information is "player", the processing parameters are set based on the standard physique of a player and the like. As a result, a region whose area differs greatly from the pixel area corresponding to the size of a player is determined to be a noise region. When the subject information is "ball", the processing parameters are set based on the size, shape, and the like of a soccer ball or rugby ball. The processing parameters used for noise removal may also be determined according to the angle of view of the imaging device 20. For example, when a subject is imaged by a wide-angle camera, the pixel area corresponding to the subject in the captured image is expected to be smaller than when it is imaged by a telephoto camera. Therefore, when the subject is imaged by a wide-angle camera, the processing parameters are determined so that a region far larger than the subject's pixel area is judged to be a noise region. Conversely, when the subject is imaged by a telephoto camera, the processing parameters are determined so that a region far smaller than the subject's pixel area is judged to be a noise region. This makes it possible to remove noise while retaining the subject region even when the imaging devices 20 have different angles of view. The separation unit 21 may also hold a plurality of processing parameters and select which one to use according to the subject information held by the imaging device 20.
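As an illustration of area-based noise filtering with parameters chosen by subject information, the sketch below labels 4-connected regions of a binary foreground mask and keeps only those whose pixel area falls within bounds selected for the subject. The `AREA_BOUNDS` table, its numeric values, and the function name `remove_noise` are hypothetical; the publication does not specify concrete values or a particular filtering algorithm.

```python
from collections import deque
from typing import Dict, List, Tuple

Mask = List[List[int]]  # assumption: binary mask, 1 = foreground pixel

# Hypothetical (min_area, max_area) bounds in pixels for each subject type.
AREA_BOUNDS: Dict[str, Tuple[int, int]] = {
    "player": (4, 1000),
    "ball": (1, 3),
}


def remove_noise(mask: Mask, subject_info: str) -> Mask:
    """Keep only 4-connected regions whose pixel area is within the bounds."""
    min_area, max_area = AREA_BOUNDS[subject_info]
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    cleaned = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search collects one connected region.
            region = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < height and 0 <= nx < width \
                            and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Regions outside the expected area range are treated as noise.
            if min_area <= len(region) <= max_area:
                for ry, rx in region:
                    cleaned[ry][rx] = 1
    return cleaned
```

The same table could equally be keyed by angle of view (for example, separate bounds for wide-angle and telephoto cameras), which corresponds to the alternative mentioned in the text.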

The separation unit 21 transmits the generated foreground image to the shape estimation device 1. The types of separation processing and noise removal processing performed by the separation unit 21 are not limited to those described above, and other processing may be performed. In this embodiment the separation unit 21 is included in the imaging system 2, but it may instead be included in each imaging device 20 or in the shape estimation device 1, or be connected externally as a separate device.

Returning to FIG. 3, the image generation device 3 generates a virtual viewpoint image using the shape data generated by the shape estimation device 1. The method of generating the virtual viewpoint image is described later. The image generation device 3 transmits the generated virtual viewpoint image to the display device 4. The display device 4 is composed of, for example, a liquid crystal display or LEDs, and displays the virtual viewpoint image transmitted from the image generation device 3. In addition to the virtual viewpoint image, the display device 4 also displays a GUI (graphical user interface) and the like with which the user performs the input operations needed to generate a virtual viewpoint image.

Next, the functional configuration of the shape estimation device 1 will be described with reference to FIG. 3. The shape estimation device 1 includes a subject information acquisition unit 100, a space setting unit 110, an imaging information acquisition unit 120, and a shape estimation unit 130. Each processing unit is described below.

The subject information acquisition unit 100 acquires, from the imaging system 2, the identification number and the subject information held by each of the plurality of imaging devices 20. The subject information acquisition unit 100 may instead acquire, from an external storage device or the like, a file in which the identification numbers of the imaging devices 20 are associated with the subject information.

The space setting unit 110 divides the imaging space captured by the imaging system 2 into a plurality of subspaces and associates subject information with each subspace. Dividing the imaging space here means dividing the space virtually into a plurality of subspaces. FIG. 5 schematically shows examples of an imaging space divided into a plurality of subspaces. The shape estimation device 1 in this embodiment generates shape data of subjects in the imaging space 300 indicated by the broken line. The imaging space and the space in which shape estimation is performed (the shape estimation space) are strictly distinct spaces, but in this embodiment they are treated as the same space.

The imaging space 300 is a three-dimensional space and is set in association with coordinates on the captured images acquired by the imaging system 2. In the example shown in FIG. 5(a), the ground of a stadium or the like where a sporting event is held is taken as the xy-plane spanned by the x-axis and the y-axis, and the z-axis is defined in a direction 302 perpendicular to the ground 301. The ground 301 is at z = 0. A boundary 310 is also set for dividing the imaging space 300 into a subspace 320 and a subspace 330. The boundary 310 is expressed using three-dimensional coordinates representing a height, for example "z = 2 m". The imaging space 300 and the boundary 310 are set by the space setting unit 110 acquiring three-dimensional coordinate information from the auxiliary storage device 514, an external storage device, or the like. The origin and the axes of the coordinate system representing the imaging space can be set at any position and in any direction.

FIG. 5(b) and FIG. 5(c) show other examples of subspaces determined from the imaging space and boundaries. The imaging space 300 in FIG. 5(b) is divided by a boundary 410 into a rectangular parallelepiped subspace 430 and a remaining subspace 420. FIG. 5(c) is an example of division into three subspaces: the imaging space 300 is divided by a boundary 411 into rectangular parallelepiped subspaces 431 and 432 and a remaining subspace 421. In this way, the space setting unit 110 divides the imaging space into two or more subspaces by arbitrary boundaries.

The space setting unit 110 associates the subject information acquired by the subject information acquisition unit 100 with each of the plurality of subspaces determined from the imaging space 300 and the boundary 310. Each subspace is associated with the subject information of the subject that accounts for the largest share of the subjects it contains. Consider, for example, the case where the subspaces shown in FIG. 5(a) are set for an imaging space in which a rugby match is captured. The subspace 330 from the ground 301 up to the boundary 310 can contain both players and the ball as subjects. Near the ground, however, players are expected to account for a larger share of the subjects than the ball, so "player" is associated as the subject information. Above the boundary 310, on the other hand, players are rarely present, while a ball kicked high can often be present. The subspace 320 is therefore associated with "ball" as the subject information.

When subspaces such as those shown in FIG. 5(b) and FIG. 5(c) are set, subject information is associated on the same principle. Subspaces such as those in FIG. 5(b) are associated with, for example, subject information indicating "infielder" and "outfielder" in baseball. Subspaces such as those in FIG. 5(c) are associated with, for example, subject information indicating "Team A player", "Team B player", and "ball" in a sport such as volleyball in which each team plays on its own side of the court and players are classified by team. The space setting unit 110 transmits, to the shape estimation unit 130, information indicating each subspace and the identification numbers of the imaging devices linked to the subject information associated with each subspace.

The space setting unit 110 also sets, according to the subject information associated with each subspace, the processing parameter that the shape estimation unit 130 (described later) uses when performing the shape estimation process. For example, the space setting unit 110 sets a first processing parameter for the subspace 320, which is a first subspace included in the imaging space 300, and sets a second processing parameter, different from the first processing parameter, for the subspace 330, which is a second subspace. Processing parameters are set in the same way for an imaging space having three or more subspaces (for example, the imaging space 300 shown in FIG. 5(c)). Accordingly, at least one pair of subspaces with different processing parameters may exist. Among a plurality of subspaces that includes the first and second subspaces, a subspace other than the first and second subspaces may be given either the first or the second processing parameter. The space setting unit 110 also transmits the set processing parameters to the shape estimation unit 130.
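The assignment of processing parameters to subspaces can be sketched as a simple lookup. This is an illustrative sketch only: the table `PARAM_BY_SUBJECT`, the function names, and the assignment of subjects to the subspaces 421, 431, and 432 are assumptions made here, and the values 1 and 0 are borrowed from the "ball" and "player" example given elsewhere in this description.

```python
# Hypothetical table: subject information -> processing parameter.
PARAM_BY_SUBJECT = {
    "ball": 1,
    "player": 0,
    "team_a_player": 0,
    "team_b_player": 0,
}


def assign_parameters(subspace_subjects, param_by_subject=PARAM_BY_SUBJECT):
    """Map each subspace id to the parameter of its associated subject information."""
    return {
        subspace_id: param_by_subject[subject]
        for subspace_id, subject in subspace_subjects.items()
    }


def has_differing_pair(params):
    """True if at least one pair of subspaces uses different parameters."""
    return len(set(params.values())) >= 2


# Two subspaces as in FIG. 5(a).
two = assign_parameters({320: "ball", 330: "player"})

# Three subspaces as in FIG. 5(c); which subspace holds which subject is assumed here.
three = assign_parameters({421: "ball", 431: "team_a_player", 432: "team_b_player"})
```

In the three-subspace case two subspaces share one parameter value, which corresponds to a further subspace reusing the first or second processing parameter.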

In this embodiment, an example has been described in which subspaces are set according to the subject that accounts for the largest share in each subspace, but subspaces may instead be set according to how likely important scenes are to occur. For example, important scenes such as goal scenes are expected to occur frequently in front of the goal in soccer. In this case, the space setting unit 110 sets a subspace that includes the area in front of the goal and a subspace that includes the remaining area, and associates information such as "importance: high" and "importance: low" with them, respectively. Subspaces may also be set according to which parts of the imaging space contain many subjects and which contain few.

The imaging information acquisition unit 120 acquires the foreground images generated in the imaging system 2. The imaging information acquisition unit 120 also acquires, from the imaging system 2, the imaging parameters of each imaging device 20 included in the imaging system 2. The imaging parameters include extrinsic parameters, which indicate the position at which the imaging device 20 is installed and its orientation, and intrinsic parameters, which indicate the focal length, optical center, lens distortion, and the like of the imaging device 20. The imaging information acquisition unit 120 can also calculate the imaging parameters from captured images obtained from the imaging system 2 using an existing calibration process. As one example of such a process, the imaging information acquisition unit 120 acquires a plurality of captured images from the imaging system 2 and uses feature points in those images to calculate corresponding points between the captured images. The imaging information acquisition unit 120 then performs an optimization that minimizes the error obtained when the calculated corresponding points are projected onto each imaging device 20, and calculates the imaging parameters by calibrating each imaging device 20. The calibration process used is not limited to the above. The imaging parameters may be acquired in advance at the preparation stage when the imaging devices 20 are installed, may be calculated each time the imaging information acquisition unit 120 acquires a captured image, or may be calculated from captured images acquired in the past. The imaging information acquisition unit 120 transmits the acquired foreground images and imaging parameters to the shape estimation unit 130 and the image generation device 3.
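The role of the extrinsic and intrinsic parameters, and the projection error that a calibration process minimizes, can be illustrated with a pinhole camera model. This sketch is an assumption-laden simplification: it ignores lens distortion, it shows only the error term and not the optimization, and the names `project` and `reprojection_error` and the parameter layout (`R`, `t`, `fx`, `fy`, `cx`, `cy`) are introduced here for illustration.

```python
import math


def project(point, R, t, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole model.

    R (3x3 rotation) and t (translation) are the extrinsic parameters;
    fx, fy (focal lengths in pixels) and cx, cy (optical center) are the
    intrinsic parameters. Lens distortion is not modeled in this sketch.
    """
    # World coordinates -> camera coordinates.
    cam = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if cam[2] <= 0:
        raise ValueError("point is not in front of the camera")
    # Camera coordinates -> pixel coordinates.
    return (fx * cam[0] / cam[2] + cx, fy * cam[1] / cam[2] + cy)


def reprojection_error(point, observed, R, t, fx, fy, cx, cy):
    """Distance in pixels between the projected point and the observed image point."""
    u, v = project(point, R, t, fx, fy, cx, cy)
    return math.hypot(u - observed[0], v - observed[1])
```

A calibration process of the kind described above would adjust the parameters of every imaging device so that the sum of such errors over all corresponding points becomes as small as possible.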

The shape estimation unit 130 performs the shape estimation process on the subject and generates shape data, based on the information indicating the subspaces and the identification numbers acquired from the space setting unit 110, and on the foreground images and imaging parameters acquired from the imaging information acquisition unit 120. The shape estimation unit 130 in this embodiment generates three-dimensional model data using the volume intersection method (shape-from-silhouette method), which is a known technique. A method other than the volume intersection method may also be used to generate the three-dimensional model data.

Based on the information indicating a subspace and the identification numbers acquired from the space setting unit 110, the shape estimation unit 130 identifies the one or more imaging devices 20 whose identification numbers correspond to that subspace. The shape estimation unit 130 then estimates the shape of the subject using the foreground images generated from the one or more captured images obtained from the identified imaging devices 20. For example, suppose that the subspace 320 and the subspace 330 shown in FIG. 5(a) are associated with subject information indicating "ball" and "player", respectively. In the subspace 320, the shape estimation unit 130 performs shape estimation using foreground images generated from the captured images obtained from the one or more imaging devices 20 whose identification numbers are linked to the subject information "ball". In the subspace 330, the shape estimation unit 130 performs shape estimation using foreground images generated from the captured images obtained from the one or more imaging devices 20 whose identification numbers are linked to the subject information "player".

The shape estimation process performed by the shape estimation unit 130 is now described. Based on a foreground image, the shape estimation unit 130 generates a silhouette image, which is a binary image that distinguishes the region of the captured image corresponding to the foreground from all other regions. In a silhouette image, for example, pixels in the foreground region have the value 1 and all other pixels have the value 0. Using the imaging parameters, the shape estimation unit 130 converts the coordinates of a representative point (for example, the center) of the voxel to be processed into the image coordinate system of the captured image acquired by each imaging device 20. A voxel is one unit in a method that represents three-dimensional space as a set of cubes of unit volume. The shape estimation unit 130 determines whether the converted coordinates fall within the foreground region. Specifically, it determines whether the pixel value on the silhouette image at those coordinates is the value corresponding to the foreground region (pixel value 1 in the above example). When the coordinates are determined to be in the foreground region in every captured image, the shape estimation unit 130 determines that the voxel corresponding to the coordinates is part of the shape of the subject. When there is any captured image in which the coordinates are determined to be outside the foreground region (the pixel value on the silhouette image at those coordinates is 0), the shape estimation unit 130 determines that the voxel corresponding to that pixel is not part of the shape of the subject. The shape estimation unit 130 performs this processing for every voxel constituting the shape estimation space corresponding to the imaging space, and estimates the shape of the subject by deleting the voxels that were not determined to be part of the shape of the subject. In this embodiment the shape estimation unit 130 generates the silhouette images, but the silhouette images may instead be generated by the separation unit 21 and acquired by the imaging information acquisition unit 120.
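The voxel test described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the implementation of the shape estimation unit 130: each camera is represented by a hypothetical dictionary holding a `project` function (standing in for the conversion based on the imaging parameters) and a `silhouette` array, and coordinates that fall outside the image are treated as background.

```python
def is_foreground(silhouette, u, v):
    """True if pixel (u, v) lies inside the image and has silhouette value 1."""
    row, col = int(round(v)), int(round(u))
    if 0 <= row < len(silhouette) and 0 <= col < len(silhouette[0]):
        return silhouette[row][col] == 1
    # Assumption: a point projecting outside the image counts as background.
    return False


def carve(voxel_centers, cameras):
    """Keep the voxels whose representative point is foreground in every camera.

    Each camera is a dict with:
      "project":    function mapping a 3D point to pixel coordinates (u, v)
      "silhouette": list of rows with 1 for foreground and 0 otherwise
    """
    kept = []
    for center in voxel_centers:
        if all(is_foreground(cam["silhouette"], *cam["project"](center))
               for cam in cameras):
            kept.append(center)
    return kept
```

Any voxel rejected by even one silhouette is removed here, which is the strict form of the rule stated in the paragraph above.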

In the shape estimation process described above, the shape estimation unit 130 performs shape estimation using the processing parameter set by the space setting unit 110. Here, an example is described in which a shape estimation parameter is used as the processing parameter. The shape estimation parameter specifies how many captured images are allowed to show the pixel corresponding to the voxel being processed as lying outside the foreground region. For example, when the shape estimation parameter is set to 1, the shape estimation unit 130 determines that the voxel is part of the shape of the subject if the number of captured images in which the corresponding pixel is determined to be outside the foreground region is one or fewer. If that number is two or more, the voxel is determined not to be part of the shape of the subject. The shape estimation unit 130 estimates the shape of the subject in each subspace using a shape estimation parameter of this kind.
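The effect of the shape estimation parameter reduces to a count comparison, sketched below. The function name `voxel_is_subject` and the list-of-booleans input are assumptions made for illustration.

```python
def voxel_is_subject(foreground_flags, shape_estimation_parameter):
    """Decide whether a voxel is part of the subject.

    foreground_flags holds one boolean per captured image: True when the
    pixel corresponding to the voxel lies in the foreground region.
    The voxel is kept when the number of images that place it outside the
    foreground does not exceed the shape estimation parameter.
    """
    misses = sum(1 for flag in foreground_flags if not flag)
    return misses <= shape_estimation_parameter
```

With the parameter set to 0 this is the strict rule in which every captured image must agree; larger values tolerate that many failed foreground extractions.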

An example of the shape estimation process is described for the case where the subspace 320 and the subspace 330 shown in FIG. 5(a) are associated with subject information indicating "ball" and "player", respectively. Because the ball is small and moves fast, foreground extraction in the separation unit 21 is likely to fail for it. The shape estimation unit 130 therefore sets the shape estimation parameter to 1 when performing the shape estimation process in the subspace 320. This allows shape estimation to succeed even when foreground extraction has failed in one captured image. Players, on the other hand, are more likely than the ball to be extracted accurately as foreground. The shape estimation unit 130 therefore sets the shape estimation parameter to 0 when performing the shape estimation process in the subspace 330. This allows the shape of a player to be estimated accurately. As described above, accurate shape estimation becomes possible by using a larger shape estimation parameter for a subject such as a ball, for which foreground extraction is likely to fail, than for a subject for which it is unlikely to fail. The shape estimation parameter is set for each subspace by the space setting unit 110. Accordingly, in the subspace 320, which is the first subspace included in the imaging space 300, the shape estimation unit 130 performs the shape estimation process using a shape estimation parameter set to 1. In the subspace 330, which is the second subspace, the shape estimation unit 130 performs the shape estimation process using a shape estimation parameter set to 0.

The value of the shape estimation parameter may also be determined according to the number of imaging devices that capture the subject. Consider, for example, the case where 10 imaging devices are associated with the subject information "ball" and 50 imaging devices are associated with the subject information "player". Because more imaging devices correspond to "player" than to "ball", it is more likely that they include an imaging device for which foreground extraction fails due to a malfunction or the like. In this case, for example, the shape estimation parameter is set to 1 for shape estimation in the subspace corresponding to "ball" and to 4 for shape estimation in the subspace corresponding to "player". This allows the shape estimation unit 130 to estimate shapes accurately while taking the number of imaging devices into account. These shape estimation parameter values are only examples, and other values may be used. The shape estimation unit 130 generates shape data based on the result of the shape estimation process and transmits it to the image generation device 3.

In this embodiment, an example has been described in which the shape estimation unit 130 generates three-dimensional model data, but the shape estimation unit 130 may also be configured to generate two-dimensional shape data. In this case, the shape estimation unit 130 generates foreground images from the captured images as its shape data generation process. The imaging information acquisition unit 120 acquires the captured images from the imaging system 2 and transmits them to the shape estimation unit 130. The space setting unit 110 sets a processing parameter that specifies the method used to generate the foreground image. For example, suppose that "1" is assigned as the parameter indicating the background subtraction method and "2" as the parameter indicating the inter-frame difference method. The space setting unit 110 sets 2 as the processing parameter for the subspace 330, whose subject information is "player", and sets 1 as the processing parameter for the subspace 320, whose subject information is "ball". This allows the shape estimation unit 130 to generate foreground images accurately according to the subject.
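Selecting the foreground generation method from the processing parameter can be sketched as a small dispatch. Only the parameter values 1 and 2 come from the text; the grayscale list-of-rows image format, the `threshold` default, and the function names are assumptions made for this illustration.

```python
BACKGROUND_SUBTRACTION = 1  # parameter value "1" in the example above
FRAME_DIFFERENCE = 2        # parameter value "2" in the example above


def _difference_mask(image_a, image_b, threshold):
    """Binary mask with 1 where two grayscale images differ by more than threshold."""
    return [
        [1 if abs(pa - pb) > threshold else 0 for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


def extract_foreground(frame, background, previous_frame, method, threshold=20):
    """Generate a foreground mask with the method named by the processing parameter.

    threshold is a hypothetical tuning value, not taken from the source.
    """
    if method == BACKGROUND_SUBTRACTION:
        # Compare the current frame with a background image captured in advance.
        return _difference_mask(frame, background, threshold)
    if method == FRAME_DIFFERENCE:
        # Compare the current frame with the immediately preceding frame.
        return _difference_mask(frame, previous_frame, threshold)
    raise ValueError(f"unknown processing parameter: {method}")
```

The two methods can give different masks for the same frame: an object that has stopped moving remains visible to background subtraction but disappears from the inter-frame difference.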

As a further processing parameter for the case where the shape estimation unit 130 generates two-dimensional shape data, the space setting unit 110 sets the processing parameter used in the foreground image noise removal processing described above. In this case too, different processing parameters are set according to the subject information (for example, "ball" and "player") corresponding to the plurality of subspaces (for example, the subspace 320 and the subspace 330). When generating two-dimensional shape data, the shape estimation unit 130 can thereby remove, as a noise region, any region that was extracted as foreground but whose size (pixel area) differs extremely from that of the foreground region, which improves the accuracy of the foreground image.

FIG. 6 is a flowchart illustrating the processing performed by the shape estimation device 1. The processing shown in FIG. 6 is executed by the CPU 511 reading out and executing a program stored in the ROM 512 or the auxiliary storage device 514. The following description takes as an example the processing of the shape estimation device 1 when the subspace 320 and the subspace 330 included in the imaging space 300 shown in FIG. 5(a) are associated with subject information indicating "ball" and "player", respectively. The processing starts when the shape estimation device 1 communicates data such as subject information with the imaging system 2. In the following description, a processing step is denoted simply by S.

In S600, the subject information acquisition unit 100 acquires, from the imaging system 2, the identification number and the subject information held by each of the plurality of imaging devices 20. The subject information acquisition unit 100 transmits the acquired identification numbers and subject information to the space setting unit 110. In S601, the space setting unit 110 acquires information for determining the imaging space 300 and the boundary 310 from the auxiliary storage device 514, an external storage device, or the like, and sets the subspaces based on the acquired information. In this processing, the boundary 310 is "z = 2 m", and the subspace 320 and the subspace 330 are set.

In S602, the space setting unit 110 associates subject information with each of the subspace 320 and the subspace 330 set in S601, based on preset information. The preset information here includes information that associates "ball" as the subject information with the subspace at or above the z-coordinate of the boundary 310, and information that associates "player" as the subject information with the subspace below the z-coordinate of the boundary 310. In this processing, therefore, "ball" is associated with the subspace 320 above "z = 2 m" indicated by the boundary 310, and "player" is associated with the subspace 330 below "z = 2 m". The space setting unit 110 transmits, to the shape estimation unit 130, information indicating each subspace and the identification numbers of the imaging devices linked to the subject information associated with each subspace. The preset information need only specify, in accordance with the information for determining the boundary 310, which subject information is to be associated with which subspace.

In S603, the imaging information acquisition unit 120 acquires the foreground images generated in the imaging system 2 and the imaging parameters of each imaging device 20. If the installation state of the imaging devices 20 does not change, the imaging parameters need only be acquired once and stored in the auxiliary storage device 514 or the like. However, the installation state of an imaging device 20 may change due to wind, contact with an obstacle, or the like. In that case, the imaging information acquisition unit 120 can recalculate the imaging parameters as needed. The imaging information acquisition unit 120 transmits the acquired foreground images and imaging parameters to the shape estimation unit 130 and the image generation device 3. Alternatively, the imaging information acquisition unit 120 may acquire captured images from the imaging system 2 and transmit them to the shape estimation unit 130 and the image generation device 3. In this case, the shape estimation unit 130 generates the foreground images from the acquired captured images. The foreground images are generated by the imaging device 20 or the shape estimation unit 130 using processing parameters appropriate to the subject information linked to the corresponding imaging device.

In S604, the shape estimation unit 130 identifies the imaging devices 20 to be used for the shape estimation process, based on the information indicating the subspaces and the identification numbers of the imaging devices 20 linked to each subspace. In this processing, the boundary 310 is set to "z = 2 m", so the shape estimation unit 130 determines whether the coordinates of the representative point of the voxel subject to the shape estimation process satisfy z > 2 m. If z > 2 m is satisfied, the shape estimation unit 130 determines that the voxel is included in the subspace 320 and proceeds to S605. If z > 2 m is not satisfied, the shape estimation unit 130 determines that the voxel is included in the subspace 330 and proceeds to S606. The shape estimation unit 130 may instead determine whether the voxel to be processed is included in the subspace 320, proceeding to S605 if it is and to S606 if it is not.

In S605, the shape estimation unit 130 determines that the imaging devices 20 whose identification numbers are associated with the same subject information as that of the subspace 320 (in this case, "ball") are to be used for the shape estimation process.

In S606, the shape estimation unit 130 determines that the imaging devices 20 whose identification numbers are associated with the same subject information as that of the subspace 330 (in this case, "player") are to be used for the shape estimation process.
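The subspace determination in S604 and the device selection in S605 and S606 can be sketched roughly as follows. This is only an illustrative sketch: the `Camera` class, the function names, and the sample device list are hypothetical and do not appear in the specification; only the boundary z = 2 m and the labels "ball" and "player" come from the example above.

```python
from dataclasses import dataclass

BOUNDARY_Z_M = 2.0  # boundary 310 in the example: z = 2 m


@dataclass(frozen=True)
class Camera:
    """Hypothetical record for one imaging device 20."""
    camera_id: int   # identification number
    subject: str     # subject information tied to this device


def subspace_subject(voxel_center):
    """S604: label of the subspace containing the voxel's representative point."""
    _, _, z = voxel_center
    # z > 2 m -> subspace 320 ("ball"); otherwise subspace 330 ("player")
    return "ball" if z > BOUNDARY_Z_M else "player"


def cameras_for_voxel(voxel_center, cameras):
    """S605/S606: keep devices whose subject information matches the subspace."""
    subject = subspace_subject(voxel_center)
    return [c for c in cameras if c.subject == subject]


# Hypothetical device list, for illustration only.
cameras = [Camera(0, "player"), Camera(1, "player"),
           Camera(2, "ball"), Camera(3, "ball")]
high = cameras_for_voxel((0.0, 0.0, 3.5), cameras)
low = cameras_for_voxel((0.0, 0.0, 2.0), cameras)
```

A voxel whose representative point lies exactly on the boundary (z = 2 m) does not satisfy z > 2 m, so in this sketch it is assigned to the subspace 330.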

In S607, the shape estimation unit 130 generates silhouette images from the foreground images acquired from the imaging devices 20 selected in S605 or S606, and performs the shape estimation process using the generated silhouette images. In doing so, the shape estimation unit 130 determines whether a voxel is part of the shape of the subject using a shape estimation parameter that differs according to the subject information associated with each subspace. For example, if the imaging devices 20 corresponding to "ball" were selected in S605, 1 is used as the shape estimation parameter; if the imaging devices 20 corresponding to "player" were selected in S606, 0 is used. The shape estimation unit 130 determines whether the pixel value on each silhouette image, at the coordinates obtained by converting the voxel being processed into the captured image coordinate system, is the value corresponding to the foreground region (pixel value 1 in the above example). When the shape estimation parameter is 0, the shape estimation unit 130 determines that the voxel is part of the shape of the subject if those coordinates are in the foreground region in every captured image coordinate system. Conversely, if there is any captured image in which the coordinates are outside the foreground region (the pixel value on the silhouette image at those coordinates is 0), the shape estimation unit 130 determines that the voxel is not part of the shape of the subject. When the shape estimation parameter is 1, the shape estimation unit 130 determines that the voxel is part of the subject if the number of captured images in which the coordinates are outside the foreground region is one or fewer, and that the voxel is not part of the subject if that number is two or more.
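The per-voxel decision in S607 can be illustrated with a small sketch. It assumes that the shape estimation parameter acts as the maximum number of selected silhouette images in which the projected voxel may fall outside the foreground; this reading matches the two values given in the text (0 and 1), but extending it to other values is an assumption. The function name and the list-based input are hypothetical.

```python
def is_part_of_subject(silhouette_values, shape_estimation_parameter):
    """Decide whether a voxel belongs to the subject's shape.

    silhouette_values: one value per selected imaging device, sampled at the
        voxel's projected coordinates (1 = foreground, 0 = not foreground).
    shape_estimation_parameter: number of non-foreground images tolerated
        (assumed interpretation of the parameter).
    """
    misses = sum(1 for value in silhouette_values if value == 0)
    return misses <= shape_estimation_parameter


# Parameter 0 ("player"): every image must show foreground.
strict_all_fg = is_part_of_subject([1, 1, 1, 1], 0)
strict_one_miss = is_part_of_subject([1, 0, 1, 1], 0)
# Parameter 1 ("ball"): one non-foreground image is tolerated.
loose_one_miss = is_part_of_subject([1, 0, 1, 1], 1)
loose_two_miss = is_part_of_subject([0, 0, 1, 1], 1)
```

Projecting the voxel into each image and sampling the silhouette are left to the caller in this sketch, since they depend on the imaging parameters.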

In S608, the shape estimation unit 130 determines whether all voxels in the imaging space have been processed. If an unprocessed voxel remains, the shape estimation unit 130 repeats the processing from S604. If all voxels have been processed, the shape estimation unit 130 proceeds to S609.

In S609, the shape estimation unit 130 generates shape data based on the result of the shape estimation process and transmits it to the image generation device 3. The process ends when the shape estimation unit 130 has transmitted the shape data.

The above is an example of the processing performed by the shape estimation device 1. The shape estimation device 1 can perform the same processing when the subject types or the boundary differ from the above example, and when there are three or more subspaces. In that case, the boundary determination in S604 is performed according to the number of subspaces. Based on the result of the determination in S604, the shape estimation unit 130 performs the shape estimation process using the imaging devices whose subject information corresponds to the subspace containing the voxel. The shape estimation parameter used in the shape estimation process can likewise take any value according to the subject type and the subspace.

As described above, the space setting unit 110 in the present embodiment sets different processing parameters, used in generating the shape data of a subject, for the plurality of subspaces included in the imaging space. With this configuration, shape data can be generated using processing parameters appropriate to each subspace, so the shape data of subjects can be generated accurately throughout the imaging space.

<Method of generating a virtual viewpoint image>
The virtual viewpoint image generation process performed by the image generation device 3 will be described. Using the foreground images, imaging parameters, and shape data transmitted from the shape estimation device 1, the image generation device 3 executes a process of generating a foreground virtual viewpoint image and a process of generating a background virtual viewpoint image. The image generation device 3 then generates the virtual viewpoint image by combining the generated foreground and background virtual viewpoint images. Each generation process is described below.

A method of generating the foreground virtual viewpoint image will be described. The foreground virtual viewpoint image is generated by coloring each voxel in the shape data. The color assigned to a voxel is calculated as follows. Based on the shape data and the imaging parameters, the image generation device 3 calculates the distance d from an imaging device 20 to the voxels on the surface of the subject's shape data, and generates a distance image in which the calculated distances are associated with pixel values. For a given coordinate Xw in the imaging space, the image generation device 3 converts Xw into a coordinate Xi on the captured image (distance image), using the imaging parameters of an imaging device 20 whose angle of view contains Xw. The image generation device 3 also calculates the distance dx between the coordinate Xw and that imaging device 20. It then compares the distance d with the distance dx by referring to the pixel value at the coordinate Xi on the distance image. If the difference between d and dx is equal to or less than a predetermined threshold, the coordinate Xw is determined to be visible from that imaging device 20. This threshold is referred to below as the visibility determination parameter. If the coordinate Xw is visible, the pixel value at the corresponding coordinate Xi on the captured image is calculated as the color of the surface of the shape data.
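The visibility test can be sketched as below. The sketch assumes the caller has already projected Xw to Xi and read the stored distance d from the distance image at Xi; the function and argument names are hypothetical, and the absolute difference is used as one plausible reading of "the difference between d and dx".

```python
import math


def is_visible(point_w, camera_position, depth_at_pixel, visibility_parameter):
    """Return True if the world coordinate Xw is visible from the device.

    point_w: coordinate Xw in the imaging space, as (x, y, z).
    camera_position: position of the imaging device, as (x, y, z).
    depth_at_pixel: distance d read from the distance image at Xi.
    visibility_parameter: threshold on |d - dx|.
    """
    d_x = math.dist(point_w, camera_position)  # distance dx from device to Xw
    return abs(depth_at_pixel - d_x) <= visibility_parameter


# Xw lies 5 m from the device; the distance image also stores 5 m there.
seen = is_visible((3.0, 4.0, 0.0), (0.0, 0.0, 0.0), 5.0, 0.01)
# The distance image stores 3 m: some nearer surface occludes Xw.
occluded = is_visible((3.0, 4.0, 0.0), (0.0, 0.0, 0.0), 3.0, 0.01)
```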

By performing the above processing for a plurality of imaging devices 20, a plurality of coordinates determined to be visible may be identified for a single voxel on the surface of the shape data. The image generation device 3 determines the color of the voxel by calculating the average of (blending) the pixel values corresponding to the identified coordinates, and colors the voxel accordingly. Performing this processing for all voxels in the shape data colors the shape data and generates the foreground virtual viewpoint image. The visibility determination parameter used in the above processing can take a different value for each subspace. The smaller the visibility determination parameter, the fewer coordinates are determined to be visible, so fewer colors are blended and the resulting color can be expected to be sharper. Therefore, when coloring shape data in the subspace corresponding to "player", for example, setting the visibility determination parameter to a small value makes the players' colors sharper and yields a high-quality virtual viewpoint image. Conversely, the larger the visibility determination parameter, the more coordinates are determined to be visible, so more colors are blended. For a subject such as a ball, which is small and can move at high speed, the color need not be calculated precisely. Therefore, when coloring shape data in the subspace corresponding to "ball", the visibility determination parameter is set to a large value.
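The effect of the visibility determination parameter on blending can be illustrated with a short sketch. The observation tuples, the function name, and the numeric values are hypothetical; the averaging step follows the description above.

```python
def color_voxel(observations, visibility_parameter):
    """Blend the colors of all devices from which the voxel is judged visible.

    observations: list of (d, dx, (r, g, b)) tuples, one per imaging device,
        where d is read from the distance image and dx is the distance from
        the device to the voxel.
    Returns the averaged (r, g, b), or None if no device sees the voxel.
    """
    visible = [rgb for d, dx, rgb in observations
               if abs(d - dx) <= visibility_parameter]
    if not visible:
        return None
    count = len(visible)
    return tuple(sum(rgb[i] for rgb in visible) / count for i in range(3))


observations = [
    (5.0, 5.0, (200, 0, 0)),   # distances agree exactly
    (5.0, 5.2, (100, 0, 0)),   # 0.2 m discrepancy
]
sharp = color_voxel(observations, 0.05)   # small value, e.g. for "player"
blended = color_voxel(observations, 0.5)  # large value, e.g. for "ball"
```

With the small value only the first observation contributes, so the color is taken from a single image; with the large value both observations are averaged.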

The same visibility determination parameter may also be used for the entire imaging space. The color calculation method is not limited to the above, and various methods may be used, such as using the color of the captured image acquired by the imaging device 20 closest to the designated virtual viewpoint. In the present embodiment, the above processing is performed using those imaging devices 20, among all the imaging devices 20, whose angle of view contains the coordinates being processed; however, only the imaging devices 20 corresponding to the subspace that contains the coordinates being processed may be used instead.

Next, a method of generating the background virtual viewpoint image will be described. The background virtual viewpoint image is generated using three-dimensional shape data of the background. As this data, a CG model of a stadium or the like that is generated in advance and stored in a storage device or the like is used. The CG model reproduces the shape of the background with a plurality of faces. For each face, the image generation device 3 compares the normal vector of the face with the imaging direction of each imaging device 20 to identify the imaging device 20 that contains the face in its angle of view and faces it most directly. The image generation device 3 generates a texture image for the face from the captured image acquired by the identified imaging device 20, and applies the texture to the face using an existing texture mapping method. Performing this processing for every face generates the background virtual viewpoint image.
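One way to compare a face normal with the imaging directions is a dot product, as in the sketch below. The specification does not state the comparison metric, so the dot product, the unit-vector assumption, and the precomputed `face_in_view` flag are all assumptions made for illustration.

```python
def most_facing_camera(face_normal, cameras):
    """Pick the device that sees the face most head-on.

    face_normal: unit normal of the face, pointing out of the surface.
    cameras: list of (camera_id, view_direction, face_in_view) tuples, where
        view_direction is a unit vector along the imaging direction and
        face_in_view says whether the face lies inside the angle of view.
    Returns the chosen camera_id, or None if no device sees the front side.
    """
    best_id, best_dot = None, 0.0
    for camera_id, view_direction, face_in_view in cameras:
        if not face_in_view:
            continue
        dot = sum(n * v for n, v in zip(face_normal, view_direction))
        # A device looking straight at the face has dot == -1.
        if dot < best_dot:
            best_id, best_dot = camera_id, dot
    return best_id


cameras = [
    (0, (0.0, 0.0, -1.0), False),  # head-on, but face is outside its view
    (1, (0.0, -0.6, -0.8), True),  # oblique view
    (2, (0.0, 0.0, -1.0), True),   # head-on view
    (3, (0.0, 0.0, 1.0), True),    # looks at the back of the face
]
chosen = most_facing_camera((0.0, 0.0, 1.0), cameras)
```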

The image generation device 3 generates the virtual viewpoint image by combining the foreground virtual viewpoint image and the background virtual viewpoint image generated by the processing described above. The image generation device 3 either accepts a user operation for designating the position of the virtual viewpoint, the line-of-sight direction from the virtual viewpoint, and the like, via an input device or the like that is built into or externally connected to the image generation device 3, or acquires information for designating the virtual viewpoint from a storage device or the like. Through the above processing, the image generation device 3 can generate a virtual viewpoint image as seen from the designated viewpoint and display it on the display device 4.

(Second embodiment)
In the present embodiment, a shape estimation device 7 that generates shape data using a plurality of imaging systems will be described. Parts of the hardware configuration, functional configuration, and processing that are the same as in the first embodiment are given the same reference numerals, and their description is omitted.

FIG. 7 is a diagram for explaining the configuration of an image processing system 1100 including the shape estimation device 7. The image processing system 1100 includes the shape estimation device 7, a first imaging system 8, a second imaging system 9, the image generation device 3, and the display device 4. Since the hardware configuration of the shape estimation device 7 is the same as in the first embodiment, its description is omitted.

The first imaging system 8 and the second imaging system 9 are each a system composed of a plurality of imaging devices having the same subject information. FIG. 8 is a diagram for explaining an example of the configuration of the first imaging system 8 and the second imaging system 9. FIG. 8 shows the first imaging system 8, composed of four imaging devices 80a to 80d, and the second imaging system 9, composed of three imaging devices 90a to 90c. The number of imaging devices included in each imaging system is not limited to these. The imaging devices 80a to 80d included in the first imaging system 8 have the same subject information (for example, "player") and are installed so as to mainly capture the subject indicated by that subject information. Similarly, the imaging devices 90a to 90c included in the second imaging system 9 have the same subject information (for example, "ball") and are installed so as to mainly capture the subject indicated by that subject information. The first imaging system 8 and the second imaging system 9 generate foreground images in the separation units 81a to 81d and the separation units 91a to 91c within the respective systems, and transmit them to the shape estimation device 7. In the following description, the imaging devices 80a to 80d and the imaging devices 90a to 90c are referred to simply as the imaging devices 80 and the imaging devices 90 unless they need to be distinguished.

Returning to FIG. 7, the functional configuration of the shape estimation device 7 will be described. The shape estimation device 7 includes the subject information acquisition unit 100, the space setting unit 110, a first imaging information acquisition unit 700, a second imaging information acquisition unit 710, and the shape estimation unit 130. The processing units that differ from those of the shape estimation device 1 in the first embodiment are described here.

The first imaging information acquisition unit 700 acquires the foreground images generated by the first imaging system 8 and the imaging parameters of each imaging device 80 included in the first imaging system 8. The second imaging information acquisition unit 710 acquires the foreground images generated by the second imaging system 9 and the imaging parameters of each imaging device 90 included in the second imaging system 9. The first imaging information acquisition unit 700 and the second imaging information acquisition unit 710 each transmit the acquired foreground images and imaging parameters to the shape estimation unit 130 and the image generation device 3.

FIG. 9 is a flowchart for explaining the processing performed by the shape estimation device 7. S900, S901, and S902 are performed in place of S603, S605, and S606 in the flowchart shown in FIG. 6. As in FIG. 6, the following description covers the processing of the shape estimation device 7 in the case where subject information indicating "ball" and "player" is associated with the subspace 320 and the subspace 330, respectively, of the imaging space 300 shown in FIG. 5(a). It is also assumed that the imaging devices 80 included in the first imaging system 8 have "player" as their subject information and the imaging devices 90 included in the second imaging system 9 have "ball" as their subject information. The points that differ from the processing shown in FIG. 6 are described below.

In S900, the first imaging information acquisition unit 700 acquires the foreground images generated by the first imaging system 8 and the imaging parameters of each imaging device 80. The second imaging information acquisition unit 710 acquires the foreground images generated by the second imaging system 9 and the imaging parameters of each imaging device 90. The first imaging information acquisition unit 700 and the second imaging information acquisition unit 710 each transmit the acquired foreground images and imaging parameters to the shape estimation unit 130.

In S604, as in FIG. 6, it is determined which subspace contains the voxel. When the boundary is set to "z = 2 m", the shape estimation unit 130 determines whether the coordinates of the representative point of the voxel being processed satisfy z > 2 m. If z > 2 m is satisfied, the shape estimation unit 130 determines that the voxel is included in the subspace 320 and proceeds to S901. If z > 2 m is not satisfied, the shape estimation unit 130 determines that the voxel is included in the subspace 330 and proceeds to S902. In S901, since the subject information corresponding to the subspace 320 is "ball", the shape estimation unit 130 determines that the imaging devices 90 included in the second imaging system 9 are to be used for the shape estimation process. In S902, since the subject information corresponding to the subspace 330 is "player", the shape estimation unit 130 determines that the imaging devices 80 included in the first imaging system 8 are to be used for the shape estimation process.

Thereafter, as in the first embodiment, the shape estimation process is performed using a different shape estimation parameter for each subspace, and the shape data of the subject is generated. As in the first embodiment, the processing shown in FIG. 9 is also applicable when the subject types or the boundary differ from the above example, and when there are three or more subspaces. As described in the present embodiment, using imaging systems each composed of imaging devices having the same subject information makes it easy to identify the imaging devices to be used for the shape estimation process. In addition, since the imaging devices in an imaging system share the same subject information, the same processing parameters can be assigned per imaging system. Although the present embodiment describes the case where the image processing system 1100 includes two imaging systems, it may include any number of imaging systems according to the types of subjects.

(Other embodiments)
The embodiments above describe an example in which the imaging space is divided into a plurality of subspaces based on a boundary and processing parameters are set for each subspace. Another example of setting processing parameters is described here. For example, in the imaging space 300 shown in FIG. 5(a), when a subject is captured while straddling the boundary 310, shape data is generated using different processing parameters for the voxels included in the subspace 320 and the voxels included in the subspace 330. As a result, the generated shape of the subject may be distorted near the boundary 310. To address this, the space setting unit 110 sets the processing parameters so that they change gradually near the boundary 310, for example. Specifically, when the boundary 310 is at z = 3 m, the space setting unit 110 sets the shape estimation parameter to 1 for the space near the boundary (hereinafter, the boundary space), z = 3 ± 0.3 m. The space setting unit 110 also sets the shape estimation parameter to 2 for the subspace 320 and to 0 for the subspace 330. Grading the parameter in this way reduces distortion of the shape data near the boundary. The setting of the boundary space is not limited to the above values.
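The graded parameter setting in this example can be written as a small lookup. The function name and default arguments are hypothetical; the values 2, 1, and 0, the boundary z = 3 m, and the ±0.3 m boundary space come from the example above, and the sketch assumes, as in the first embodiment, that the subspace 320 lies above the boundary.

```python
def shape_estimation_parameter(z, boundary_z=3.0, margin=0.3):
    """Return the shape estimation parameter for a voxel at height z (metres)."""
    if abs(z - boundary_z) <= margin:
        return 1  # boundary space: intermediate value
    # Above the boundary space: subspace 320; below it: subspace 330.
    return 2 if z > boundary_z else 0


upper = shape_estimation_parameter(4.0)
near_boundary = shape_estimation_parameter(3.2)
lower = shape_estimation_parameter(1.0)
```

Stepping through 0, 1, and 2 means that adjacent regions never differ by more than one in the parameter value, which is the gradual change the text describes.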

In addition, if it can be determined during shape data generation, from the position of the subject or the like, that the subject straddles the boundary, the shape data near the boundary may be generated using a single set of processing parameters. In that case, for example, the processing parameters of the subspace that contains the larger portion of the straddling subject are used.
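A rough sketch of choosing a single parameter for a straddling subject is shown below. Measuring "the larger portion" by counting the subject's voxels on each side of the boundary is an assumption, as are the function name and the tie-breaking rule (ties fall to the lower subspace); the specification leaves these details open.

```python
def parameter_for_straddling_subject(voxel_heights, boundary_z,
                                     upper_parameter, lower_parameter):
    """Pick the parameter of the subspace holding more of the subject.

    voxel_heights: z coordinates (metres) of the voxels attributed to the
        straddling subject.
    """
    upper_count = sum(1 for z in voxel_heights if z > boundary_z)
    lower_count = len(voxel_heights) - upper_count
    return upper_parameter if upper_count > lower_count else lower_parameter


# Most of the subject is below z = 2 m, so the lower subspace's value is used.
mostly_low = parameter_for_straddling_subject([1.5, 1.8, 2.1], 2.0, 1, 0)
mostly_high = parameter_for_straddling_subject([2.5, 2.8, 1.9], 2.0, 1, 0)
```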

The present invention can also be realized by processing in which a program that implements one or more functions of the above embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

1 Shape estimation device
110 Space setting unit
130 Shape estimation unit

Claims

A shape data generation device comprising: a first generation means for generating shape data of a subject included in a first subspace, based on one or more captured images acquired from one or more imaging devices among a plurality of imaging devices and on a first parameter corresponding to the first subspace, the first subspace being included in a plurality of subspaces in an imaging space imaged by the plurality of imaging devices; and
a second generation means for generating shape data of a subject included in a second subspace, based on one or more captured images acquired from one or more imaging devices among the plurality of imaging devices and on a second parameter that corresponds to the second subspace included in the plurality of subspaces and that differs from the first parameter.

The shape data generation device according to claim 1, wherein the first parameter and the second parameter include a parameter for determining whether an element constituting the imaging space is part of the shape of a subject.

The shape data generation device according to claim 1 or 2, wherein the first generation means and the second generation means generate the shape data of the subject based on image data that is generated from the one or more captured images and that indicates a region of the subject in the one or more captured images.

The shape data generation device according to any one of claims 1 to 3, wherein the one or more imaging devices that image the first subspace include an imaging device whose angle of view differs from that of the one or more imaging devices that image the second subspace.

The shape data generation device according to any one of claims 1 to 4, wherein the imaging space is divided, based on information indicating a boundary, into a plurality of subspaces including the first subspace and the second subspace.

The shape data generation device according to claim 5, wherein the information indicating the boundary includes information indicating a height in the imaging space.

The shape data generation device according to any one of claims 1 to 6, wherein a subject that can be included in the first subspace and a subject that can be included in the second subspace differ in at least one of size and motion characteristics.

The shape data generation device according to any one of claims 1 to 7, wherein the number of subjects included in the first subspace is different from the number of subjects included in the second subspace.

A shape data generation method comprising: a first generation step of generating shape data of a subject included in a first subspace, based on one or more captured images acquired from one or more imaging devices among a plurality of imaging devices and on a first parameter corresponding to the first subspace, the first subspace being included in a plurality of subspaces in an imaging space imaged by the plurality of imaging devices; and
a second generation step of generating shape data of a subject included in a second subspace, based on one or more captured images acquired from one or more imaging devices among the plurality of imaging devices and on a second parameter that corresponds to the second subspace included in the plurality of subspaces and that differs from the first parameter.

The shape data generation method according to claim 9, wherein the one or more imaging devices that image the first subspace include an imaging device whose angle of view differs from that of the one or more imaging devices that image the second subspace.

The shape data generation method according to claim 9 or 10, wherein the imaging space is divided, based on information indicating a boundary, into a plurality of subspaces including the first subspace and the second subspace.

The shape data generation method according to claim 11, wherein the information indicating the boundary includes information indicating a height in the imaging space.

The shape data generation method according to any one of claims 9 to 12, wherein a subject that can be included in the first subspace and a subject that can be included in the second subspace differ in at least one of size and motion characteristics.

The shape data generation method according to any one of claims 9 to 13, wherein the number of subjects included in the first subspace is different from the number of subjects included in the second subspace.

A computer program for causing a computer to function as the shape data generation device according to any one of claims 1 to 8.