JP2022094907A

JP2022094907A - Generation device, generation method, and program

Info

Publication number: JP2022094907A
Application number: JP2021146788A
Authority: JP
Inventors: 祥吾水野; Shogo Mizuno
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2020-12-15
Filing date: 2021-09-09
Publication date: 2022-06-27
Anticipated expiration: 2041-09-09
Also published as: JP7494153B2

Abstract

To generate a virtual viewpoint image capable of addressing a problem that a position of a subject is different from a desired position. SOLUTION: An image generation device 120 includes: a subject shape generation section 405 for obtaining shape data representing a shape of a subject based on a plurality of photographed images obtained by photographing the subject in a photographing area by a plurality of photographing devices; a subject shape center-of-gravity calculation section 406 for identifying a subject position; a subject position setting section 408 for identifying a referential position representing a reference for the position of the subject; and a virtual viewpoint image generation section 412 for generating a virtual viewpoint image corresponding to a deviation between the subject position and the referential position based on the shape data. SELECTED DRAWING: Figure 4

Description

The present disclosure relates to a technique for generating a virtual viewpoint image.

In recent years, attention has been drawn to a technique in which a plurality of photographing devices are arranged around a shooting area, and the plurality of captured images acquired from the respective photographing devices are used to generate an image (virtual viewpoint image) as viewed from a designated viewpoint (virtual viewpoint). With this technique, sports such as soccer and rugby, concerts, dance performances, and the like can be viewed from an arbitrary viewpoint, which gives the user a strong sense of presence.

Patent Document 1 discloses a technique for generating and displaying an arbitrary virtual camera image using images captured by a plurality of cameras arranged so as to surround a subject. According to Patent Document 1, in viewing content that uses virtual camera image generation, the dance, acting, and the like of a photographed performer (subject) can be viewed from various angles.

Japanese Unexamined Patent Publication No. 2008-15756

However, in dance or acting, for example, the position of a performer who is the subject may differ from the desired position at the time of shooting. As a result, the performer's position also differs from the desired position in the virtual viewpoint image generated from the captured images. In addition, when the position of the subject is off at the time of shooting, the user may not notice from the virtual viewpoint image that the subject is away from the desired position, depending on the position of the designated virtual viewpoint.

The present disclosure has been made in view of the above problem. Its purpose is to generate a virtual viewpoint image that can cope with the position of a subject being different from the desired position.

The generation device according to the present disclosure includes: acquisition means for acquiring shape data representing the shape of a subject based on a plurality of captured images obtained by a plurality of photographing devices photographing the subject in a shooting area; first specifying means for specifying a subject position, which is the position of the subject in the shooting area; second specifying means for specifying a reference position that serves as a reference for the position of the subject; and generation means for generating, based on the shape data acquired by the acquisition means, a virtual viewpoint image according to the deviation between the subject position specified by the first specifying means and the reference position specified by the second specifying means.

According to the present disclosure, it is possible to generate a virtual viewpoint image that can cope with the position of a subject being different from the desired position.

A diagram for explaining the configuration of the image processing system.
A diagram showing an example of the installation of a plurality of photographing devices.
A diagram for explaining the hardware configuration of the image generation device.
A diagram for explaining the functional configuration of the image generation device in the first embodiment.
A flowchart for explaining the processing performed by the image generation device in the first embodiment.
A diagram showing an example of the arrangement of the shape data of subjects in the first embodiment.
A diagram showing an example of the virtual viewpoint image generated in the first embodiment.
A diagram for explaining the functional configuration of the image generation device in the second embodiment.
A flowchart for explaining the processing performed by the image generation device in the second embodiment.
A diagram showing an example of the arrangement of the shape data of subjects in the second embodiment.
A diagram for explaining the functional configuration of the image generation device in the third embodiment.
A flowchart for explaining the processing performed by the image generation device in the third embodiment.
A diagram showing an example of the arrangement of the shape data of subjects in the third embodiment.
A diagram for explaining the functional configuration of the image generation device in the fourth embodiment.
A diagram showing an example of the installation of virtual cameras in the fourth embodiment.
A diagram showing an example of photographed subjects and the generated virtual viewpoint images.
A diagram showing an example of arranging the shape data of a subject outside the shooting area in the first embodiment.

Hereinafter, embodiments of the present disclosure will be described with reference to the drawings. The components described in the following embodiments are merely examples, and the present disclosure is not limited to them.

(First Embodiment)
FIG. 1 is a diagram showing an image processing system 100 according to the present embodiment. The image processing system 100 includes a plurality of photographing devices 110, an image generation device 120, and a terminal device 130. Each photographing device 110 is connected to the image generation device 120 via a communication cable such as a LAN (Local Area Network) cable. In the present embodiment the communication cable is assumed to be a LAN cable, but the communication cable is not limited to that of the embodiment.

The photographing device 110 is, for example, a digital camera capable of capturing images (still images and moving images). The photographing devices 110 are installed so as to surround a shooting area in a shooting studio or the like, and capture images (video). The present embodiment is described using an example in which a plurality of performers, as in a dance scene, are photographed as subjects. The captured images are transmitted from the photographing devices 110 to the image generation device 120. FIG. 2 is a diagram showing an installation example of the photographing devices 110. In the present embodiment, each of the plurality of photographing devices 110 is assumed to be installed so as to capture all or part of the shooting studio. That is, the image processing system 100 of the present embodiment includes a plurality of photographing devices 110 for photographing a subject from a plurality of directions.

The image generation device 120 accumulates the captured images obtained by the photographing devices 110. When virtual viewpoint information and reproduction time information are input by a user operation on the terminal device 130, it generates a virtual viewpoint image based on the captured images and the virtual viewpoint. Here, the virtual viewpoint information includes information indicating the position of a virtual viewpoint in the virtual space constructed from the captured images and the line-of-sight direction from the virtual viewpoint. The information included in the virtual viewpoint information is not limited to this. For example, information on the width of the field of view (angle of view) of the virtual viewpoint may be included. Alternatively, the virtual viewpoint information may include information representing at least one of the position of the virtual viewpoint, the line-of-sight direction from the virtual viewpoint, and the angle of view of the virtual viewpoint. The reproduction time information is time information measured from the recording start time of the captured images. By operating the terminal device 130 described later to specify a reproduction time, for example, the user can have a virtual viewpoint image generated for the scene in the recorded captured images that corresponds to the specified reproduction time.

The image generation device 120 is, for example, a server device, and has a database function and an image processing function. The database holds, as a background image, an image captured in advance via the photographing devices 110 of a scene in which no subject is present, for example before the start of the actual shoot. For a scene in which a subject is present, the image generation device 120 separates, by image processing, the region of the captured image corresponding to a specific object such as a performer or a tool used by the performer (hereinafter also referred to as a foreground image), and holds it as a foreground image. The specific object may be an object whose image pattern is predetermined, such as a prop.

The virtual viewpoint image corresponding to the virtual viewpoint information is generated from the background image and the specific object image managed in the database. As a method of generating the virtual viewpoint image, model-based rendering (MBR) is used, for example. MBR is a method of generating a virtual viewpoint image using a three-dimensional shape generated from a plurality of captured images of a subject photographed from a plurality of directions. Specifically, it uses the three-dimensional shape (model) of the target scene obtained by a three-dimensional shape reconstruction method such as the visual hull (volume intersection) method or Multi-View Stereo (MVS), and generates the appearance of the scene from the virtual viewpoint as an image. A rendering method other than MBR may also be used to generate the virtual viewpoint image. The generated virtual viewpoint image is transmitted to the terminal device 130 via a LAN cable or the like.

The terminal device 130 is, for example, a PC (Personal Computer) or a tablet. The controller 131 is, for example, a mouse, a keyboard, a 6-axis controller, or a touch panel; the user operates it to display still images and moving images on the screen. The terminal device 130 displays, for example, the virtual viewpoint image received from the image generation device 120 on the display unit 132. The terminal device 130 further accepts, in response to user operations on the connected controller 131, a reproduction time and an instruction to move the virtual viewpoint (an instruction regarding the amount and direction of movement), and transmits to the image generation device 120 a transmission signal carrying instruction information corresponding to the accepted instruction.

FIG. 3 is a diagram showing the hardware configuration of the image generation device 120.

The image generation device 120 includes a CPU 301, a ROM 302, a RAM 303, an HDD 304, a display unit 305, an input unit 306, and a communication unit 307. The CPU 301 reads the control program stored in the ROM 302 and executes various processes. The RAM 303 is used as the main memory of the CPU 301 and as a temporary storage area such as a work area. The HDD 304 stores various data, various programs, and the like. The display unit 305 displays various information. The input unit 306 has a keyboard and a mouse, and accepts various operations by the user. The communication unit 307 performs communication processing with external devices such as the photographing devices 110 via a network. An example of the network is Ethernet (registered trademark). As another example, the communication unit 307 may communicate with external devices wirelessly.

In the example shown in FIG. 3, the HDD 304, the display unit 305, and the input unit 306 are included in the image generation device 120, but the configuration is not limited to this. For example, at least one of the HDD 304, the display unit 305, and the input unit 306 may be connected externally to the image generation device 120 as a separate device. The functions and processes of the image generation device 120 described later are realized by the CPU 301 reading a program stored in the ROM 302 or the HDD 304 and executing it. The hardware configuration of the terminal device 130 is the same as that of the image generation device 120.

FIG. 4 is a diagram showing the functional configuration of the image generation device 120. The processing performed by the image generation device 120 in the present embodiment is described here. The present embodiment assumes that a dance scene is photographed and a virtual viewpoint image is generated based on the captured images. However, the standing position of a subject during the dance may differ from the desired standing position. FIG. 16(a) is a conceptual diagram of subjects in the virtual space, generated by photographing the shooting area enclosed by the line with a plurality of cameras, and of virtual cameras corresponding to virtual viewpoints (hereinafter referred to as virtual cameras). A virtual camera can be designated to capture the subjects from any angle: from the front, as with the virtual camera 2101, or from the side, as with the virtual camera 2102. FIG. 16(c) is a screen display of the virtual viewpoint image corresponding to the virtual camera 2102. Suppose the performers want to perform in standing positions that line up in a straight line when captured from the virtual camera 2102. When the scene is actually shot, however, the standing positions may be off, resulting in a virtual viewpoint image in which the subjects are not aligned on a straight line, as shown in FIG. 16(c).

Another conceivable use is for the performers to check their standing positions by looking at the virtual viewpoint image. In this case, depending on the designated virtual camera, the deviation in standing position may not be easy to recognize. FIG. 16(b) is a screen display of the virtual viewpoint image corresponding to the virtual camera 2101. In this image, the standing positions of the subjects appear to be aligned. In the virtual viewpoint image captured from the virtual camera 2102, however, the standing positions are in fact off. The image generation device 120 in the present embodiment aims to solve the above problems.

The image generation device 120 includes a captured image input unit 401, a foreground/background separation unit 402, a captured image data storage unit 403, a camera parameter holding unit 404, a subject shape generation unit 405, a subject shape center-of-gravity calculation unit 406, a subject shape moving unit 407, and a subject position setting unit 408. The image generation device 120 also includes a user input unit 409, a virtual viewpoint information setting unit 410, a coloring information calculation unit 411, a virtual viewpoint image generation unit 412, and an image output unit 413. Each processing unit is described below.

The captured image input unit 401 converts the transmission signal input from the photographing devices 110 via the LAN cable into captured image data, and outputs the captured image data to the foreground/background separation unit 402. Of the captured images input from the captured image input unit 401, the foreground/background separation unit 402 outputs an image captured in advance of a scene in which no subject is present, for example before the subjects start performing, to the captured image data storage unit 403 as background image data. It also extracts the subjects from the images captured during the performance, and outputs them to the captured image data storage unit 403 as foreground image data.

The captured image data storage unit 403 is a database. Of the captured image data input from the foreground/background separation unit 402, it stores in the HDD 304, as background image data, the images captured in advance with no subject present. It also stores in the HDD 304, as foreground image data, the difference data between the background image data and the captured image data in which a subject is present. The captured image data storage unit 403 outputs the foreground image data to the subject shape generation unit 405, and outputs the background image data and foreground image data designated by the coloring information calculation unit 411 to the coloring information calculation unit 411.
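The background-difference idea described above can be sketched as follows. This is only an illustration under simplifying assumptions: the images are small grayscale grids held as nested lists, and the function name `extract_foreground` and the `threshold` parameter are hypothetical and do not appear in the publication, which does not specify how the difference is computed.

```python
def extract_foreground(frame, background, threshold=30):
    """Return a binary mask that is 1 where the frame differs from the background.

    Assumption: both images are equally sized grids of grayscale values.
    The threshold value is illustrative only.
    """
    mask = []
    for frame_row, bg_row in zip(frame, background):
        mask.append([1 if abs(f - b) > threshold else 0
                     for f, b in zip(frame_row, bg_row)])
    return mask


# Background captured with no subject present
background = [[10, 10, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]]

# Frame captured while a subject is in the shooting area
frame = [[10, 12, 10, 10],
         [10, 200, 190, 10],
         [10, 180, 10, 11]]

mask = extract_foreground(frame, background)
```

Small differences such as sensor noise fall below the threshold and stay in the background, while pixels covered by the subject are marked as foreground.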

The camera parameter holding unit 404 holds, as camera parameter information, the shooting position information of the plurality of photographing devices 110 and camera setting information such as the focal length of the lens and the shutter speed of each photographing device 110. The plurality of photographing devices 110 are assumed to be installed at predetermined positions, with the camera parameter information acquired in advance. The camera parameter holding unit 404 outputs the camera parameter information to the subject shape generation unit 405 and the coloring information calculation unit 411.

The subject shape generation unit 405 generates shape data representing the shape of the subject using the foreground image data and the camera parameter information. The subject shape generation unit 405 is assumed to generate the shape data using a three-dimensional shape reconstruction method such as the visual hull (volume intersection) method. It outputs the shape data to the subject shape center-of-gravity calculation unit 406 and the coloring information calculation unit 411.
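As a rough illustration of the volume intersection idea, the sketch below keeps only those candidate voxels whose projection falls inside the silhouette seen by every camera. It is a toy under stated assumptions, not the device's implementation: the cameras are orthographic and axis-aligned, each is represented by a hypothetical `(project, silhouette)` pair, and real camera parameters (position, focal length) are not modeled.

```python
def carve_visual_hull(voxels, cameras):
    """Keep the voxels whose projection lies inside every camera's silhouette.

    cameras: list of (project, silhouette) pairs, where project maps a voxel
    to a 2D pixel and silhouette is the set of foreground pixels.
    """
    hull = []
    for voxel in voxels:
        if all(project(voxel) in silhouette for project, silhouette in cameras):
            hull.append(voxel)
    return hull


# Candidate voxels: a 3 x 3 x 3 grid
voxels = [(x, y, z) for x in range(3) for y in range(3) for z in range(3)]

# Toy orthographic cameras (an assumption made for brevity)
front = (lambda v: (v[0], v[2]), {(1, 0), (1, 1)})  # looks along y, sees (x, z)
side = (lambda v: (v[1], v[2]), {(1, 0), (1, 1)})   # looks along x, sees (y, z)

hull = carve_visual_hull(voxels, [front, side])
```

With more cameras from more directions, the carved volume approximates the subject more tightly, which is why the photographing devices surround the shooting area.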

The subject shape center-of-gravity calculation unit 406 specifies the position of the subject in the shooting area. Specifically, it uses the shape data input from the subject shape generation unit 405 to specify the center of gravity of the subject's shape as the subject position. In doing so, the subject shape center-of-gravity calculation unit 406 calculates the position of the center of gravity as seen from a predetermined viewpoint position. For example, it calculates the position of the center of gravity as seen from a viewpoint looking down on the subject from directly above, and uses it as the position of the subject. The subject shape center-of-gravity calculation unit 406 outputs subject center-of-gravity information, consisting of the center-of-gravity position of the subject, to the subject shape moving unit 407.
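A minimal sketch of the top-down center-of-gravity calculation, assuming the shape data is a list of `(x, y, z)` points with the z axis pointing up (the axis convention and the function name `top_view_centroid` are assumptions made for illustration):

```python
def top_view_centroid(points):
    """Return the (x, y) centroid of a point cloud as seen from directly above.

    Assumption: z is the vertical axis, so looking straight down simply
    discards the z coordinate of every point.
    """
    if not points:
        raise ValueError("shape data contains no points")
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    return (cx, cy)


shape = [(1.0, 2.0, 0.0), (3.0, 2.0, 1.0), (2.0, 5.0, 2.0)]
centroid = top_view_centroid(shape)  # (2.0, 3.0)
```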

The subject shape moving unit 407 determines the position at which the shape data of the subject is to be placed, based on the subject center-of-gravity information input from the subject shape center-of-gravity calculation unit 406 and the subject movement destination position information input from the subject position setting unit 408 described later. Here, the subject movement destination position information is information representing a reference position at which the subject should be placed (hereinafter also referred to as the reference position). The subject shape moving unit 407 determines the placement of the shape data according to the deviation between the reference position and the position of the subject.

When the subject movement destination position information is information on grid points set on the floor surface at a predetermined interval (for example, 3 meters), the subject shape moving unit 407 takes the position of a grid point as the reference position and places the shape data of the subject so that the position of the subject's center of gravity coincides with the position of the grid point. The shape data of the subject may instead be regenerated based on the subject movement destination position information. Alternatively, when the deviation between the reference position and the position of the subject is equal to or greater than a predetermined threshold, the subject shape moving unit 407 may change the position of the shape data so that the deviation becomes smaller than the threshold, and place the shape data at the changed position. The subject shape moving unit 407 outputs the moved shape data to the virtual viewpoint image generation unit 412.
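The placement step can be sketched as follows, assuming a square grid on the floor and shape data given as `(x, y, z)` points with z vertical. The helper names `nearest_grid_point` and `move_shape`, and the choice of the nearest grid point as the reference position, are illustrative assumptions; the publication does not state how a grid point is chosen for each subject.

```python
import math


def nearest_grid_point(position, spacing=3.0):
    """Snap an (x, y) floor position to the nearest grid point."""
    return tuple(round(c / spacing) * spacing for c in position)


def move_shape(points, centroid, reference, threshold=0.0):
    """Translate the shape so that its centroid lands on the reference position.

    If the deviation is below the threshold, the shape is left where it is.
    With threshold=0.0 the shape is always moved.
    """
    dx = reference[0] - centroid[0]
    dy = reference[1] - centroid[1]
    if math.hypot(dx, dy) < threshold:
        return list(points)
    # Only the floor-plane coordinates change; height is preserved.
    return [(x + dx, y + dy, z) for x, y, z in points]


shape = [(2.0, 3.5, 0.0), (3.0, 3.5, 1.5)]
centroid = (2.5, 3.5)                      # top-view centroid of the shape
reference = nearest_grid_point(centroid)   # (3.0, 3.0) on a 3 m grid
moved = move_shape(shape, centroid, reference)
unmoved = move_shape(shape, centroid, reference, threshold=1.0)
```

In the thresholded call the deviation is about 0.71 m, which is below the 1 m threshold, so the shape data stays at its captured position.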

The subject position setting unit 408 outputs subject movement destination position information in three-dimensional space, preset by the user, to the subject shape moving unit 407. For example, it outputs information on grid points at 3-meter intervals on the floor surface so that a plurality of subjects are placed in a predetermined section of a predetermined straight line. The subject movement destination position information is not limited to grid points, and may be, for example, a straight line or a curve.

The user input unit 409 converts the transmission signal input from the terminal device 130 via the LAN cable into user input data. When the user input data consists of reproduction time information and virtual viewpoint information, it outputs the reproduction time information and the virtual viewpoint information to the virtual viewpoint information setting unit 410.

The virtual viewpoint information setting unit 410 updates the current position and direction in the virtual space and the reproduction time, based on the reproduction time information and the virtual viewpoint information input from the user input unit 409.

It then outputs the reproduction time information and the virtual viewpoint information to the subject shape generation unit 405, the coloring information calculation unit 411, and the virtual viewpoint image generation unit 412. The origin of the virtual space is set in advance, for example at the center of the stadium.

The coloring information calculation unit 411 receives from the captured image data storage unit 403 the foreground image data and background image data that correspond to the reproduction time information and virtual viewpoint information input from the virtual viewpoint information setting unit 410. It also receives the camera parameters from the camera parameter holding unit 404 and the shape data from the subject shape generation unit 405. It then renders (colors) the subject shape as seen from the virtual viewpoint position with the color information of the image data captured by the real cameras at the corresponding time, and holds the resulting coloring information of the subject shape. For example, when the subject based on the shape data is visible from the virtual viewpoint and the position information of a real camera lies within a predetermined range of the virtual viewpoint position, the foreground image data of that real camera is used as the color of the shape. The coloring information calculation unit 411 outputs the coloring information to the virtual viewpoint image generation unit 412.
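The camera-selection rule in the example above might look like the following sketch. The function name `pick_color_camera`, the dictionary of camera positions, and the tie-breaking choice of the nearest camera among those within range are assumptions for illustration; the publication only states that a real camera within a predetermined range of the virtual viewpoint supplies the color.

```python
import math


def pick_color_camera(virtual_position, camera_positions, max_distance):
    """Return the id of the real camera closest to the virtual viewpoint,
    or None if no camera lies within max_distance of it."""
    best_id, best_dist = None, max_distance
    for cam_id, position in camera_positions.items():
        d = math.dist(virtual_position, position)
        if d <= best_dist:
            best_id, best_dist = cam_id, d
    return best_id


# Hypothetical real camera positions (x, y, z) in meters
cameras = {
    "cam1": (0.0, -5.0, 2.0),
    "cam2": (5.0, 0.0, 2.0),
    "cam3": (0.0, 5.0, 2.0),
}

chosen = pick_color_camera((4.0, -1.0, 2.0), cameras, max_distance=3.0)
none_in_range = pick_color_camera((4.0, -1.0, 2.0), cameras, max_distance=1.0)
```

Here `chosen` is `"cam2"`, whose foreground image data would then be sampled to color the shape, and `none_in_range` is `None` because no real camera lies within 1 m of the virtual viewpoint.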

The virtual viewpoint image generation unit 412 receives from the captured image data storage unit 403 the foreground image data and background image data that correspond to the reproduction time information and virtual viewpoint information input from the virtual viewpoint information setting unit 410. It also receives the camera parameters from the camera parameter holding unit 404 and the moved shape data from the subject shape moving unit 407. It then applies projective transformation and image processing to the background image data so that it appears as the background seen from the virtual viewpoint position, and uses the result as the background of the virtual viewpoint image. Next, for the moved subject shape as seen from the virtual viewpoint position, it receives from the coloring information calculation unit 411 the coloring information based on the image data captured by the photographing devices at the corresponding time, and renders (colors) the shape with that color information to generate the virtual viewpoint image. Finally, the virtual viewpoint image generated by the virtual viewpoint image generation unit 412 is output to the image output unit 413. The image output unit 413 converts the image data input from the virtual viewpoint image generation unit 412 into a transmission signal that can be transmitted to the terminal device 130, and outputs it to the terminal device 130.

Next, the operation of the image generation device 120 will be described. FIG. 5 is a flowchart showing the operation of the image generation device 120 according to the first embodiment. The following processing is performed by the CPU 301 reading and executing a program stored in the ROM 302 or the HDD 304. The processing starts when an input specifying the virtual viewpoint information and the reproduction time is made from the terminal device 130.

The virtual viewpoint information setting unit 410 determines whether virtual viewpoint information and reproduction time information have been input via the user input unit 409 (S501). If they have not been input (No in S501), it waits. If they have been input (Yes in S501), it outputs the virtual viewpoint information and the reproduction time information to the subject shape generation unit 405, the coloring information calculation unit 411, and the virtual viewpoint image generation unit 412.

Next, the subject shape generation unit 405 reads the foreground image data input from the captured image data storage unit 403 based on the reproduction time information input from the virtual viewpoint information setting unit 410, and the camera parameter information input from the camera parameter holding unit 404 (S502). The subject shape generation unit 405 then estimates the three-dimensional shape of the subject (S503). For example, the shape data of the subject is generated using a three-dimensional shape reconstruction method such as the visual hull (volume intersection) method. Here, the shape data consists of a plurality of points, each of which includes position information.

Next, the coloring information calculation unit 411 receives, from the captured image data storage unit 403, the foreground image data and background image data corresponding to the reproduction time information and virtual viewpoint information input from the virtual viewpoint information setting unit 410. It also receives the camera parameters from the camera parameter holding unit 404 and the shape data from the subject shape generation unit 405. It then renders (colors) the subject shape as viewed from the virtual viewpoint position using the color information of the image data captured by the real cameras at the corresponding time, and retains the coloring information of the subject shape (S504).

Next, the subject shape center-of-gravity calculation unit 406 identifies, as the subject position, the center-of-gravity position of the subject shape input from the subject shape generation unit 405 as seen from a predetermined viewpoint position (S505). In this embodiment, the center-of-gravity position as seen from directly above is used as the subject shape center-of-gravity information. FIG. 6(a) is a conceptual diagram of the shape data of the subjects viewed from directly above in the virtual space, together with the center-of-gravity positions of the shape data. In this embodiment, because a plurality of subjects are to be rearranged onto predetermined grid points, the center-of-gravity position is calculated from a viewpoint looking straight down on the subject, for example by calculating the center position along the subject's front-back axis and the center position along its left-right axis. The center-of-gravity position is indicated by the black dot on each subject, and it is this position that is placed on a straight line or on a grid point.
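As a rough illustration of S505, the top-down center position could be computed from the point cloud as below. This is a sketch under assumptions: the function name `top_down_center` is hypothetical, z is taken as the height axis, x and y are taken as the left-right and front-back axes, and the "center" of each axis is interpreted as the midpoint of its extent (an arithmetic mean of the points would be another reasonable reading).

```python
def top_down_center(points):
    """Center of a point cloud as seen from directly above.

    points: iterable of (x, y, z) tuples; z (height) is ignored.
    Assumption: the center along each horizontal axis is the midpoint
    between the smallest and largest coordinate on that axis.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return ((min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0)


# Made-up shape data for one subject (meters).
shape = [(1.0, 2.0, 0.0), (3.0, 2.5, 1.7), (2.0, 4.0, 0.9)]
print(top_down_center(shape))  # (2.0, 3.0)
```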

Returning to FIG. 5, the subject shape moving unit 407 determines whether the subject position setting unit 408 has provided subject destination position information in the three-dimensional space (S506). If it has (Yes in S506), the subject shape moving unit 407 receives the subject destination position information (S507). If there is no subject destination position information (No in S506), the processing proceeds to S509. FIG. 6(b) is a conceptual diagram of grid point positions on the floor surface, which represent the reference positions of the subjects based on the subject destination position information. For example, grid point information at 3-meter intervals on the floor surface is output so that the plurality of subjects are placed in predetermined sections on predetermined straight lines.

Returning to FIG. 5, the subject shape moving unit 407 then moves the shape data based on the subject center-of-gravity information input from the subject shape center-of-gravity calculation unit 406 and the subject destination position information input from the subject position setting unit 408 (S508). FIG. 6(c) is a conceptual diagram of the subject shapes after the move. It shows that, from a viewpoint looking straight down on the subject shapes, each subject position has been changed so that the center-of-gravity position of the subject shape coincides with a grid point position based on the subject destination position information, and that the shape data has been moved to the changed subject position. As a result, the moved subject shapes are positioned at uniform distances from one another. Alternatively, the subject position may be changed, and the shape data arranged, so that the deviation between the reference position and the subject position is smaller than a predetermined threshold.
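The move in S508 amounts to a horizontal translation of every point by the offset between the subject position and its reference position. The sketch below illustrates this under assumptions: the function names are hypothetical, and choosing the nearest 3-meter grid point as the destination is an assumption, since in the disclosure the destination is supplied by the subject position setting unit 408.

```python
def nearest_grid_point(position, spacing=3.0):
    """Snap a top-down (x, y) position to the nearest grid point.

    Assumption: the grid is axis-aligned with its origin at (0, 0).
    """
    x, y = position
    return (round(x / spacing) * spacing, round(y / spacing) * spacing)


def move_shape(points, subject_pos, reference_pos):
    """Translate (x, y, z) points so subject_pos lands on reference_pos.

    Only x and y change; heights are preserved.
    """
    dx = reference_pos[0] - subject_pos[0]
    dy = reference_pos[1] - subject_pos[1]
    return [(x + dx, y + dy, z) for x, y, z in points]


# Made-up subject: top-down center at (2.0, 3.5).
shape = [(1.0, 3.0, 0.0), (3.0, 4.0, 1.7)]
subject_pos = (2.0, 3.5)
reference_pos = nearest_grid_point(subject_pos)   # (3.0, 3.0)
print(move_shape(shape, subject_pos, reference_pos))
```

Because the same offset is applied to every point, the subject's pose is unchanged; only its standing position moves.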

Returning to FIG. 5, the virtual viewpoint image generation unit 412 generates a virtual viewpoint image by rendering (coloring) the moved shape data, as viewed from the virtual viewpoint position, using the image data captured by the photographing devices 110 at the specified time (S509). In other words, the color information of the subject that was determined and retained before the subject was moved is rendered after the move, so that the subject is displayed at its moved position. The virtual viewpoint image generation unit 412 outputs the generated virtual viewpoint image to the image output unit 413 (S510). The processing described above may be performed on recorded and stored image data, or in real time in parallel with capture by the photographing devices 110.

Although the description of FIG. 5 above states that the color information of the subject determined and retained before the move is rendered after the move, the configuration is not limited to this. For example, the target subject may be moved to the specified position while the shooting positions of all the photographing devices 110 and the positions of everything other than the target subject are moved along with it so as to maintain their positional relationship with the target subject, and only the target subject is rendered. The virtual viewpoint image may then be generated by performing this for each subject in turn and compositing the results. That is, instead of determining the color information before the subject is moved, the color information may be calculated and rendered after the move.

FIG. 7 shows virtual viewpoint images generated by performing the processing shown in FIG. 5 when the subjects shown in FIG. 16(a) are photographed. FIG. 7(a) shows the positions of the subjects' shape data in the three-dimensional space corresponding to the shooting area. As FIG. 7(a) shows, while the subjects perform, for example, a dance, they are moved to predetermined grid point positions that differ from where they actually stood during the performance, and the virtual cameras capture them performing the same movements at standing positions a fixed distance apart. Whether the subjects are captured from the position of the virtual camera 901 or from the position of the virtual camera 902, their standing positions differ from those where they actually performed.

FIG. 7(b) is a display example of the virtual viewpoint image corresponding to the virtual camera 901, displayed on the terminal device 130. Even when captured from the front, the subjects can be shown performing while always keeping fixed positions. FIG. 7(c) is a display example of the virtual viewpoint image corresponding to the virtual camera 902. Even when captured from the side, the subjects can be rearranged and displayed in a straight line, so they can be shown always performing at the same positions.

As described above, according to the first embodiment, when a plurality of subjects are photographed by a plurality of cameras and displayed from a virtual viewpoint, a virtual viewpoint image in which the subjects are rearranged at desired positions can be generated. Compared with the performance as it was shot, the virtual camera video lets the viewer watch a well-aligned performance. In the example above, the reference positions are represented by grid points, but they may instead be represented by a predetermined straight line or curve. In that case, when a subject's position deviates from the line, the shape data is arranged so that the subject position coincides with some point on the line (for example, the point on the line closest to the subject's current position). The reference position may also be a point (coordinates) at an arbitrary position. When there are multiple subjects, a reference point may be set for each subject.

The reference position may also be identified based on the subject positions of a plurality of subjects. For example, suppose three subjects are photographed and their standing positions are to be set so that the three stand in a straight line. In this case, a straight line connecting the subject positions of two of the three is calculated, and the position of the remaining subject is changed using a point on that line as the reference position. This yields a virtual viewpoint image in which the shape data is arranged so that the three subjects lie on a straight line. The same approach applies when the number of subjects is other than three.
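The three-subject case can be sketched as a point-to-line projection. This is an illustration under assumptions: the function name `project_onto_line` is hypothetical, positions are top-down (x, y) coordinates, and the closest point on the line is chosen as the reference position, which is only one of the options the text allows.

```python
def project_onto_line(p, a, b):
    """Closest point to p on the infinite line through a and b (2D).

    a and b are the subject positions of two subjects; p is the position
    of the remaining subject. The result serves as p's reference position.
    """
    abx, aby = b[0] - a[0], b[1] - a[1]
    length_sq = abx * abx + aby * aby
    if length_sq == 0.0:
        raise ValueError("a and b must be distinct positions")
    # Parameter t of the orthogonal projection of p onto the line a + t*(b - a).
    t = ((p[0] - a[0]) * abx + (p[1] - a[1]) * aby) / length_sq
    return (a[0] + t * abx, a[1] + t * aby)


# Made-up positions (meters): two subjects define the line, the third is off it.
print(project_onto_line((2.0, 1.0), (0.0, 0.0), (4.0, 0.0)))  # (2.0, 0.0)
```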

The reference position is not limited to a position within the three-dimensional space corresponding to the shooting area as shown in FIG. 7(a); it may be set to any three-dimensional position outside the shooting area. FIG. 17 shows an example in which the shape data of the subjects is arranged at three-dimensional positions outside the shooting area. In the example of FIG. 17, the reference positions for moving the shape data of the target subjects are set outside the shooting area. As FIG. 17 shows, even though the subjects actually performed, for example, a dance inside the shooting area, from the position of the virtual camera 901 they are captured as if performing over an area wider than the shooting area.

(Second Embodiment)
This embodiment is an example in which the position of a subject is changed based on a predetermined feature of the subject. FIG. 8 is a diagram showing the functional configuration of the image generation device 1100 according to the second embodiment. The image generation device 1100 has a subject feature generation unit 1101 in place of the subject shape center-of-gravity calculation unit 406 of the image generation device 120 according to the first embodiment shown in FIG. 4. The hardware configuration of the image generation device 1100 is the same as that of the image generation device 120 according to the first embodiment. Components that are the same as those of the image generation device 120 are given the same reference numerals, and their description is omitted.

The subject feature generation unit 1101 recognizes a predetermined feature of the subject and calculates its position from the shape data input from the subject shape generation unit 405 and the coloring information corresponding to that shape data. For example, when the faces of a plurality of subjects are used as the feature for rearranging the subjects on a straight line or on predetermined grid points on the floor surface, face recognition is performed on each subject using the shape data and the coloring information, and the position of the face is identified. In the subsequent processing, the shape data is then arranged so that, from a viewpoint looking straight down, the face position lies on the predetermined straight line or grid point. That is, the subject feature generation unit 1101 identifies a predetermined part of the subject's shape as the subject position. The subject feature generation unit 1101 outputs the subject feature position information to the subject shape moving unit 407.

FIG. 9 is a flowchart showing image processing by the image generation device 1100 according to the second embodiment. S501 to S504 are the same as described for FIG. 5, so their description is omitted. S506 to S510 are likewise the same as described for FIG. 5, so their description is omitted.

In S1201, the subject shape generation unit 405 identifies the position of a predetermined part as the subject position. At this time, the subject shape generation unit 405 identifies a position having a predetermined feature in the subject's shape data using image analysis such as face recognition.

FIG. 10 shows an example in which the subject's face is recognized by analyzing the shape data and the coloring information, and the identified face position is used as the subject position. As shown in FIG. 10(a), the subject feature generation unit 1101 identifies the position of the subject's face. The subject shape moving unit 407 then arranges the shape data based on the subject position identified by the subject feature generation unit 1101. As a result, as shown in FIG. 10(b), the shape data is arranged so that the face position coincides with a grid point when viewed from above.

Here, feature recognition of the subject (for example, face recognition) and position identification are assumed to be performed by extracting the feature (face) from the plurality of captured images obtained by the photographing devices 110 and then calculating the position of the feature in three-dimensional space based on the camera parameter information. However, the configuration is not limited to this. For example, a virtual camera may be set at a predetermined position to generate a virtual viewpoint image, and feature recognition and position identification may be performed using the generated virtual viewpoint image.

As described above, the image generation device 1100 of this embodiment identifies the position of a predetermined part of the subject, arranges the shape data according to the deviation between the identified position and the reference position, and generates a virtual viewpoint image. The predetermined part is not limited to the face and may be a hand, a foot, a shoe, or the like. For example, when shooting a commercial for shoes, the feature of the shoe can be identified and the shape data arranged so that the position of the shoe coincides with the reference position, or so that the deviation between the two is smaller than a predetermined threshold. In this way, a virtual viewpoint image in which the shoe is at the desired position can be generated.

(Third Embodiment)
The third embodiment is an example in which the position of a subject is changed based on the subject's center-of-gravity position along the time axis. FIG. 11 is a diagram showing the functional configuration of the image generation device 1500 according to the third embodiment. The image generation device 1500 has a subject shape average center-of-gravity calculation unit 1501 in place of the subject shape center-of-gravity calculation unit 406 of the image generation device 120 according to the first embodiment shown in FIG. 4. The hardware configuration of the image generation device 1500 in this embodiment is the same as in the embodiments described above. Functional components that are the same are given the same reference numerals, and their description is omitted.

The subject shape average center-of-gravity calculation unit 1501 calculates the center-of-gravity position of the subject in each video frame of the captured images obtained by the photographing devices 110. For example, when a 10-minute dance scene has been shot, it identifies the center-of-gravity position in each frame from a viewpoint looking straight down on the subject, and then calculates the average of the center-of-gravity positions over the video frames. The subject shape average center-of-gravity calculation unit 1501 uses the calculated average position as the subject position and outputs the average position information to the subject shape moving unit 407.

FIG. 12 is a flowchart showing image processing by the image generation device 1500 according to the third embodiment. Of the processes shown in FIG. 12, those that are the same as the processes shown in FIG. 5 are not described again.

After S504, the subject shape average center-of-gravity calculation unit 1501 calculates, for the subject shapes of the plurality of captured frames input from the subject shape generation unit 405, the center-of-gravity position as seen from a predetermined viewpoint position. It then calculates the average position of the center of gravity based on the subject's center of gravity in each video frame, and outputs the average position information to the subject shape moving unit 407 (S1601).

FIG. 13 illustrates the average position of the subject's center of gravity. For example, suppose that during a dance sequence the subject moves in the direction of the arrow shown in FIG. 13(a). Here, the subject's center-of-gravity position is calculated for the frame at the time before the move and for the frame after the move during the performance. The average of the per-frame center-of-gravity positions is then calculated. In FIG. 13(a), the point 1301 is identified as the average position of the center of gravity.

FIG. 13(b) shows the result of moving the average center-of-gravity position based on the reference position. From a viewpoint looking straight down on the subject, the shape data is arranged so that the subject's average center-of-gravity position coincides with the grid point position based on the subject destination position information. As a result, over the course of the shoot the subject is, on average, located at a fixed position and moves around that position.
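The third embodiment's idea, one constant offset derived from the time-averaged position and applied to every frame, can be sketched as follows. The function names and the sample positions are hypothetical, and per-frame positions are assumed to be top-down (x, y) coordinates that have already been computed.

```python
def average_position(per_frame_positions):
    """Mean of the per-frame top-down (x, y) subject positions."""
    n = len(per_frame_positions)
    return (
        sum(p[0] for p in per_frame_positions) / n,
        sum(p[1] for p in per_frame_positions) / n,
    )


def offset_to_reference(per_frame_positions, reference_pos):
    """Single (dx, dy) offset that moves the average position onto reference_pos."""
    avg = average_position(per_frame_positions)
    return (reference_pos[0] - avg[0], reference_pos[1] - avg[1])


# Made-up track: the subject drifts along x during the sequence.
track = [(1.0, 2.0), (2.0, 2.0), (3.0, 2.0)]
dx, dy = offset_to_reference(track, reference_pos=(3.0, 3.0))
moved = [(x + dx, y + dy) for x, y in track]
print(moved)  # [(2.0, 3.0), (3.0, 3.0), (4.0, 3.0)]
```

Because the offset is the same for all frames, the subject's movement during the sequence is preserved and only the point it moves around is relocated.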

As described above, according to this embodiment, the average position of the subject's center of gravity over the shooting time is calculated and rearranged onto a predetermined grid point, so that the subject can be repositioned without the result looking unnatural even when the subject moves during the scene. Although the average center-of-gravity position is calculated in this embodiment, the configuration is not limited to this, and the subject's center-of-gravity position in any captured frame may be used as the base position. For example, the subject's center-of-gravity position in the frame at the start of shooting may be aligned with a grid point position on the floor surface, or the subject's center-of-gravity position in the frame immediately before the end of shooting may be used. Also, although in this embodiment the subject's center of gravity is identified and the average center-of-gravity position is used as the subject position, the configuration is not limited to this. For example, the position of the predetermined part described in the second embodiment may be identified, and its average position may be used as the subject position.

(Fourth Embodiment)
The fourth embodiment describes a method in which, instead of moving the subject, a virtual camera is moved relatively based on the positional relationship between the subject's positions before and after the move, the subject is captured from that camera, and the result is composited with the virtual camera image before the move. The configuration shown in FIG. 14 includes a subject position difference calculation unit 1901, a multiple virtual viewpoint information setting unit 1902, and a multiple virtual viewpoint image generation unit 1903. The hardware configuration of the image generation device 1100 is the same as in the embodiments described above. Components that are the same are given the same reference numerals, and their description is omitted.

The subject position difference calculation unit 1901 receives the subject center-of-gravity information from the subject shape center-of-gravity calculation unit 406. It also receives the information on the subject's reference position from the subject position setting unit 408, and calculates the difference between the reference position and the subject's center-of-gravity position.

The multiple virtual viewpoint information setting unit 1902 updates the current position, direction, and reproduction time in the virtual space based on the reproduction time information and virtual viewpoint information input from the user input unit 409, and outputs the reproduction time information and the virtual viewpoint information to the multiple virtual viewpoint image generation unit 1903. When it receives from the subject position difference calculation unit 1901, as subject difference information, the positional relationship between the subject's position before the move and its position after the move, it generates difference virtual viewpoint information based on that subject difference information.

The multiple virtual viewpoint image generation unit 1903 receives, from the captured image data storage unit 403, the foreground image data and background image data corresponding to the reproduction time information and virtual viewpoint information input from the multiple virtual viewpoint information setting unit 1902. It also receives the camera parameters from the camera parameter holding unit 404 and the moved shape data from the subject shape moving unit 407. It then applies projective transformation and image processing to the background image data so that it appears as the background seen from the virtual viewpoint position, and uses the result as the background of the virtual viewpoint image. Next, it renders (colors) the moved subject shape as viewed from the virtual viewpoint position using color information from the image data captured by the real cameras at the corresponding time, and generates a virtual viewpoint image. When difference virtual viewpoint information, resulting from the subject position difference calculation unit 1901 having calculated subject difference information, is input, it generates a virtual viewpoint image at the position specified by the difference virtual viewpoint information and composites it with the virtual viewpoint image based on the user input. Finally, it outputs the virtual viewpoint image to the image output unit 413.

FIG. 15 is a conceptual diagram of the virtual camera position relative to the position of the subject to be moved in the virtual space. With the center of the floor surface (0, 0) as the origin, the subject position after the move is likewise taken to be the floor center (0, 0). The unit is meters. If the subject's position before the move is (0, 2) and the subject is to be placed at the reference position (0, 0), the virtual camera 2001 is moved from its position (x, y) by (0, 2) to the position (x, y + 2). The subject is then captured from the moved virtual camera 2001 to generate a virtual viewpoint image. The generated image is then composited with the virtual viewpoint image corresponding to the virtual camera 2001 that captures everything other than the subject to be moved.
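The arithmetic in the FIG. 15 example can be checked with a small sketch. The function name `shifted_camera` and the sample camera position are hypothetical; only the subject position (0, 2), the reference position (0, 0), and the resulting shift of (0, 2) come from the text.

```python
def shifted_camera(camera_pos, subject_pos, reference_pos):
    """Move the virtual camera instead of the subject.

    Shifting the camera by (subject_pos - reference_pos) makes the subject
    appear, relative to the camera, as if it stood at reference_pos.
    """
    dx = subject_pos[0] - reference_pos[0]
    dy = subject_pos[1] - reference_pos[1]
    return (camera_pos[0] + dx, camera_pos[1] + dy)


# Subject at (0, 2), reference position (0, 0): the camera moves by (0, 2).
camera = (5.0, -4.0)  # made-up (x, y) of the virtual camera, in meters
print(shifted_camera(camera, (0.0, 2.0), (0.0, 0.0)))  # (5.0, -2.0)
```

The subject's offset from the shifted camera, (0 - 5, 2 - (-2)) = (-5, 4), equals the offset a subject standing at (0, 0) would have from the original camera, which is why compositing the two renders places the subject at the reference position.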

As described above, according to the present embodiment, a virtual viewpoint image is generated by moving the virtual camera for the subject to be moved, and this image is combined with the virtual viewpoint image corresponding to the virtual camera that captures the subjects other than the one to be moved. This makes it possible to move the position of the subject solely by combining virtual viewpoint images.
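The disclosure does not specify how the two virtual viewpoint images are combined. One possible approach, shown here purely as an assumption, is a per-pixel mask composite in which pixels covered by the moved subject are taken from the image rendered with the shifted camera. The names `composite`, `base`, `overlay`, and `mask` are hypothetical, and images are represented as nested lists for simplicity.

```python
def composite(base, overlay, mask):
    """Combine two images of equal size pixel by pixel.

    base    : image from the original virtual camera (other subjects)
    overlay : image from the shifted virtual camera (moved subject)
    mask    : True where the moved subject covers the pixel
    This masking scheme is an assumed illustration, not the method
    stated in the disclosure.
    """
    return [
        [o if m else b for b, o, m in zip(base_row, overlay_row, mask_row)]
        for base_row, overlay_row, mask_row in zip(base, overlay, mask)
    ]


base = [[0, 0], [0, 0]]
overlay = [[9, 9], [9, 9]]
mask = [[True, False], [False, True]]
combined = composite(base, overlay, mask)
```

A practical implementation would also need to handle occlusion between the moved subject and the remaining subjects, which this sketch ignores.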

(Other embodiments)
In the above-described embodiments, examples were described in which the shape data of the subject is rearranged according to the deviation between the subject position and the reference position to generate a virtual viewpoint image. However, the present disclosure is not limited to this, and a virtual viewpoint image may instead be generated that includes information making it possible to identify that the subject position deviates from the reference position. For example, the shape data may be placed at the specified subject position while information indicating the reference position (for example, grid points or straight lines) is displayed. Alternatively, the shape data may be placed at the subject position together with a display that makes a subject deviating from the reference position identifiable, for example by outlining the shape data of the deviating subject or by changing the color or brightness of the subject to emphasize it. With these methods, a user viewing the virtual viewpoint image can recognize that the subject position deviates from the reference position. For example, when a performer wants to check his or her standing position by viewing the generated virtual viewpoint image, the deviation of the performer's position can be easily identified regardless of the position of the virtual viewpoint.
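As a rough illustration of identifying subjects that deviate from the reference position, the sketch below takes the nearest grid point as the reference position and flags subjects whose deviation is at or above a threshold, so that they can be outlined or emphasized. The grid spacing, the threshold value, and all function names are assumptions introduced for illustration only.

```python
import math


def nearest_grid_point(pos, spacing):
    """Snap a floor position to the nearest point of a regular grid."""
    return tuple(round(c / spacing) * spacing for c in pos)


def deviation(subject_xy, reference_xy):
    """Euclidean distance between subject position and reference position."""
    return math.hypot(subject_xy[0] - reference_xy[0],
                      subject_xy[1] - reference_xy[1])


def subjects_to_highlight(subject_positions, spacing, threshold):
    """Return the names of subjects whose deviation reaches the threshold."""
    flagged = []
    for name, pos in subject_positions.items():
        ref = nearest_grid_point(pos, spacing)
        if deviation(pos, ref) >= threshold:
            flagged.append(name)
    return flagged


positions = {"a": (0.1, 1.9), "b": (0.8, 2.0), "c": (0.0, 2.6)}
flagged = subjects_to_highlight(positions, spacing=2.0, threshold=0.5)
```

The flagged names would then drive whatever emphasis the renderer applies, such as an outline or a change in color or brightness.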

The present disclosure can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

120 Image generation device
405 Subject shape generation unit
406 Subject shape center-of-gravity calculation unit
408 Subject position setting unit
412 Virtual viewpoint image generation unit

Claims

A generation device comprising:
acquisition means for acquiring shape data representing the shape of a subject, based on a plurality of captured images obtained by a plurality of photographing devices photographing the subject in a photographing area;
first specifying means for specifying a subject position, which is the position of the subject in the photographing area;
second specifying means for specifying a reference position that serves as a reference for the position of the subject; and
generation means for generating, based on the shape data acquired by the acquisition means, a virtual viewpoint image corresponding to the deviation between the subject position specified by the first specifying means and the reference position specified by the second specifying means.

The generation device according to claim 1, wherein the generation means generates a virtual viewpoint image by arranging the shape data based on the reference position when the deviation is equal to or greater than a predetermined threshold value.

The generation device according to claim 1, wherein the generation means
changes the subject position specified by the first specifying means so that the deviation becomes smaller than a predetermined threshold value, and
generates a virtual viewpoint image by arranging the shape data based on the changed subject position.

The generation device according to claim 1, wherein the generation means
changes the subject position, based on the deviation, so that the subject position specified by the first specifying means matches the reference position, and
generates a virtual viewpoint image by arranging the shape data based on the changed subject position.

The generation device according to any one of claims 1 to 4, wherein the reference position is specified based on grid points arranged at predetermined intervals.

The generation device according to any one of claims 1 to 5, wherein the reference position is a position outside the photographing area.

The generation device according to any one of claims 1 to 6, wherein
the first specifying means specifies a plurality of subject positions corresponding to a plurality of subjects in the photographing area, and
the second specifying means specifies the reference position based on the plurality of subject positions specified by the first specifying means.

The generation device according to claim 7, wherein the second specifying means specifies the reference position so that the plurality of subject positions are at predetermined intervals.

The generation device according to claim 6 or 7, wherein the second specifying means specifies the reference position so that the plurality of subject positions are positioned on a predetermined straight line.

The generation device according to any one of claims 1 to 9, wherein the first specifying means specifies the subject position based on the shape of the subject represented by the shape data acquired by the acquisition means.

The generation device according to claim 10, wherein the first specifying means specifies the position of the center of gravity in the shape of the subject represented by the shape data as the subject position.

The generation device according to claim 10, wherein the first specifying means specifies the position of a predetermined portion in the shape of the subject represented by the shape data as the subject position.

The generation device according to any one of claims 1 to 12, wherein the generation means generates a virtual viewpoint image including information that can identify the deviation.

The generation device according to claim 13, wherein the generation means generates, as the information that can identify the deviation, a virtual viewpoint image including information representing the subject position specified by the first specifying means and the reference position specified by the second specifying means.

The generation device according to claim 13 or 14, wherein the generation means generates, as the information that can identify the deviation, a virtual viewpoint image including information that can identify a subject for which a position deviating from the reference position has been specified as the subject position by the first specifying means.

A generation method comprising:
an acquisition step of acquiring shape data representing the shape of a subject, based on a plurality of captured images obtained by a plurality of photographing devices photographing the subject in a photographing area;
a first specifying step of specifying a subject position, which is the position of the subject in the photographing area;
a second specifying step of specifying a reference position that serves as a reference for the position of the subject; and
a generation step of generating, based on the shape data acquired in the acquisition step, a virtual viewpoint image corresponding to the deviation between the subject position specified in the first specifying step and the reference position specified in the second specifying step.

A program for causing a computer to function as the generation device according to any one of claims 1 to 15.