JP2021033347A - Image processing device, image processing method, and program

Info

Publication number: JP2021033347A
Application number: JP2019148819A
Authority: JP
Inventors: 元河西; Hajime Kawanishi
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-08-14
Filing date: 2019-08-14
Publication date: 2021-03-01

Abstract

PROBLEM TO BE SOLVED: To suppress the data volume of 3D models generated by interpolating existing 3D models in order to generate virtual viewpoint image data at an arbitrary frame rate. SOLUTION: A model analysis unit 304 acquires the basic 3D models at two consecutive basic times that are targets of interpolated 3D model generation, divided into parts, that is, the basic 3D models and divided part information. The model analysis unit 304 calculates a motion vector for each part of the acquired basic 3D models M(T1) and M(T2). An interpolation time determination unit 305 calculates, for each part of the basic 3D model of each object, the interpolation times at which interpolation is to be performed. A model interpolation unit 306 generates an interpolated 3D model for each part for which an interpolation time is set, and stores it in a model storage unit 303 together with the divided part information, model joining information, and a time code indicating the interpolation time. SELECTED DRAWING: Figure 5

Description

The present invention relates to a technique for generating virtual viewpoint image data.

There is a technique in which a plurality of imaging devices are installed at different positions and virtual viewpoint image data representing the view from a virtual viewpoint is generated from the plurality of captured image data (multi-viewpoint image data) obtained by synchronously imaging from multiple viewpoints. The virtual viewpoint image data is generated by rendering with a 3D model, which is three-dimensional shape data of an object generated from the multi-viewpoint image data, and the captured image data. When the virtual viewpoint image data is moving image data, rendering is performed on the 3D model corresponding to each frame image of the virtual viewpoint image data.

When virtual viewpoint image data is generated using only the multi-viewpoint image data, it is generated at the same frame rate as the multi-viewpoint image data. Distribution of virtual viewpoint image data, however, also calls for various other frame rates. This makes it possible, for example, to adjust the frame rate to the processing capability of each device that plays back the data, or to keep the same frame rate for viewing even when the playback speed is changed.

To meet such needs, Non-Patent Document 1 tracks the vertices of the mesh constituting the existing 3D models over time, interpolates those vertices in the time periods for which no existing 3D model is available, and thereby generates 3D models at shorter time intervals. This makes it possible to generate a 3D model corresponding to each frame image of virtual viewpoint image data at an arbitrary frame rate.

Alvaro Collet, et al., “High-quality streamable free-viewpoint video.”, ACM Transactions on Graphics (TOG), 34 (4), 2015.

However, using this conventional technique to generate in advance 3D models for every frame rate that a user might request is not realistic in terms of storage capacity and processing cost. On the other hand, if 3D models for the specified frame rate are generated each time a frame rate change is requested, and virtual viewpoint image data is then generated by rendering those 3D models, the delay from the request to playback is too large to be practical.

Therefore, an object of the present invention is to appropriately generate a 3D model using existing 3D models.

One aspect of the present invention is an image processing apparatus comprising: acquisition means for acquiring first 3D models, each representing the three-dimensional shape of an object and associated with a time code; and generation means for generating, based on two first 3D models associated with two consecutive time codes and on the moving speed of those two first 3D models, a second 3D model corresponding to a time between the times indicated by the two consecutive time codes.

According to the present invention, a 3D model can be appropriately generated using existing 3D models.

A diagram showing a configuration example of the virtual viewpoint image generation system in the embodiment.
A diagram showing the hardware configuration of the information processing device in the embodiment.
A block diagram showing the configuration for the virtual viewpoint image data generation processing in the embodiment.
A flowchart showing the basic model generation processing in the embodiment.
A flowchart showing the interpolated 3D model generation processing in the embodiment.
A diagram showing basic 3D models at consecutive model times in the embodiment.
A diagram showing an interpolated 3D model in the embodiment.
A flowchart showing the distribution virtual viewpoint image data generation processing in the embodiment.
A diagram explaining how 3D models are acquired for an arbitrary frame rate in the embodiment.
A diagram explaining how 3D models are acquired for a predetermined frame rate in the embodiment.
A diagram showing an interpolated 3D model in the embodiment.
A diagram showing an interpolated 3D model after combination in the embodiment.
A diagram explaining the method of combining interpolated models in the embodiment.
A flowchart showing the interpolation time calculation processing in the embodiment.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. The following embodiments do not limit the present invention, and not all combinations of the features described in the embodiments are essential to the solution provided by the present invention. Identical components are described using the same reference numerals.

The present invention is applicable regardless of the objects included in the virtual viewpoint image data. This embodiment describes a virtual viewpoint image generation system in which a soccer match is the imaging target and the stadium, people, the ball, and the like are the objects.

The virtual viewpoint image generation system is used to generate virtual viewpoint image data. It is a system that generates virtual viewpoint image data representing the view from a designated virtual viewpoint, based on a plurality of image data obtained by synchronous imaging with a plurality of imaging devices and on the designated virtual viewpoint. The virtual viewpoint image represented by the virtual viewpoint image data in this embodiment is also called a free viewpoint video. The virtual viewpoint image is not limited to an image corresponding to a viewpoint that the user specifies freely (arbitrarily); it also includes, for example, an image corresponding to a viewpoint that the user selects from a plurality of candidates. This embodiment mainly describes the case where the virtual viewpoint is specified by a user operation, but the virtual viewpoint may also be specified automatically based on, for example, the result of image analysis.

The configuration of the virtual viewpoint image generation system of this embodiment will be described with reference to FIG. 1. A plurality of cameras 101 are arranged inside a stadium. An object 102 is a subject of the system, for example a player, a referee, or the ball. The plurality of cameras 101 transmit synchronously captured images to an information processing device 103.

In FIG. 1 the plurality of cameras 101 and the information processing device 103 are connected in a star topology, but they may instead be connected in a daisy chain. Connected to the information processing device 103 are a display device 104, which displays the transmitted images and the processing results of the information processing device 103, and an input device 105 for providing input to the information processing device 103.

The information processing device 103 can distribute the generated virtual viewpoint image data in response to requests received over a network from user terminals 106 and 107 such as PCs, tablets, and smartphones. The user terminals 106 and 107 can specify to the information processing device 103 the virtual viewpoint image data to be played back, the playback section, the virtual viewpoint, the frame rate, and so on.

The hardware configuration of the information processing device 103 will be described with reference to FIG. 2. The information processing device 103 includes a CPU 211, a ROM 212, a RAM 213, an auxiliary storage device 214, a display I/F 215, an operation I/F 216, a communication I/F 217, and a bus 218. Via the display I/F 215 and the operation I/F 216, the CPU 211 operates as a display control unit that controls the display device 104 and as an operation control unit that controls the operation unit 216.

The CPU 211 realizes each function of the information processing device 103 shown in FIG. 1 by controlling the entire information processing device 103 using computer programs and data stored in the ROM 212 and the RAM 213. The information processing device 103 may have one or more pieces of dedicated hardware separate from the CPU 211, and the dedicated hardware may execute at least part of the processing otherwise performed by the CPU 211. Examples of dedicated hardware include an ASIC (application specific integrated circuit), an FPGA (field programmable gate array), and a DSP (digital signal processor). The ROM 212 stores programs and the like that do not need to be changed. The RAM 213 temporarily stores programs and data supplied from the auxiliary storage device 214, data supplied from outside via the communication I/F 217, and the like. The auxiliary storage device 214 is, for example, a hard disk drive, and stores various data such as image data and audio data.

The display device 104 is, for example, a liquid crystal display or an LED display, and displays a GUI (Graphical User Interface) and the like for operating the information processing device 103. The input device 105 is, for example, a keyboard, a mouse, a joystick, or a touch panel, and inputs various instructions to the CPU 211 in response to user operations.

The communication I/F 217 is used for communication with devices external to the information processing device 103. For example, when the information processing device 103 is connected to an external device by wire, a communication cable is connected to the communication I/F 217. When the information processing device 103 has a function for wireless communication with an external device, the communication I/F 217 includes an antenna. The communication I/F 217 is also used to connect to a network 108 and exchange data over the network 108 with external devices such as the user terminals 106 and 107.

The bus 218 connects the units of the information processing device 103 and carries information between them.

In this embodiment the display device 104 and the input device 105 are separate devices outside the information processing device 103, but at least one of them may be located inside the information processing device 103. In that case, the CPU 211 may operate as a display control unit that controls the display device 104 and as an operation control unit that controls the operation unit 216.

FIG. 3 is a block diagram showing the functional configuration of the information processing device 103 that performs the virtual viewpoint image data generation processing in this embodiment.

The multi-viewpoint image capturing unit 301 captures images with the plurality of cameras 101 synchronized in time and acquires multi-viewpoint image data. The imaging time at which each frame of the multi-viewpoint image data was generated is a time on the imaging time axis, measured from the start of imaging. The times used in the processing described later (model times, that is, basic times and interpolation times, and frame times) lie on a time axis different from the imaging time axis, so in this disclosure they are referred to as times in content time. Each basic time corresponds one-to-one to a frame of the multi-viewpoint image data, and each interpolation time corresponds to a time between frames.

The basic model generation unit 302 acquires the captured image data obtained by synchronous imaging from the multi-viewpoint image capturing unit 301 and, through the basic model generation processing described later, generates basic 3D models, which are three-dimensional shape data at the same frame rate as the captured image data.

The model storage unit 303 stores the basic 3D models and model joining information generated by the basic model generation unit 302, the interpolated 3D models generated by the model interpolation unit 306, and the like.

The model analysis unit 304 acquires the basic 3D models from the model storage unit 303, performs mesh tracking and motion vector calculation, and obtains velocity information for each 3D model.

The interpolation time determination unit 305 determines, for each 3D model, the interpolation times in content time at which interpolation is to be performed. In this embodiment, whether interpolation is performed depends on whether an interpolation time has been set, so the interpolation time determination unit 305 also serves to decide whether to interpolate.

The model interpolation unit 306 generates an interpolated 3D model for each interpolation time determined by the interpolation time determination unit 305, based on the basic 3D models and the motion vectors that serve as velocity information, and stores the generated interpolated 3D model in the model storage unit 303 together with a time code representing its interpolation time.

The model acquisition unit 307 receives an arbitrary frame rate specified by the user via the input/output unit 309 and, from the 3D models stored in the model storage unit 303, including the interpolated 3D models, acquires the 3D models corresponding to the frame image data required for the specified frame rate.

The virtual viewpoint image data generation unit 308 uses the 3D models acquired by the model acquisition unit 307 to generate distribution virtual viewpoint image data at the specified frame rate and sends it to the input/output unit 309.

In this embodiment, the distribution virtual viewpoint image data is data in which the 3D model information for each frame is recorded in OBJ format at the frame rate specified by the user. The user views it by playing it back on a user terminal with virtual viewpoint image data playback software or the like. The 3D model information may instead be described in a format other than OBJ, the 3D model information may be compressed, or the data may be delivered by streaming.

FIG. 4 is a flowchart of the basic model generation processing of this embodiment. This flow is executed when the information processing device 103 receives multi-viewpoint image data obtained by a new synchronous capture by the multi-viewpoint image capturing unit 301.

In S401, the basic model generation unit 302 acquires from the multi-viewpoint image capturing unit 301 the multi-viewpoint image data captured by the plurality of cameras 101 in time synchronization.

In S402, the basic model generation unit 302 generates a basic 3D model by an image-based modeling technique using the multi-viewpoint image data acquired in S401. A 3D model, including a basic 3D model, is represented by a list of the positions of the vertices constituting the 3D model and a list of mesh information about the vertices, edges, and faces constituting the mesh. The basic 3D model is stored in the model storage unit 303, together with a time code indicating its original model time in content time, as the source 3D model from which virtual viewpoint image data at an arbitrary frame rate is generated.
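
As a rough illustration of the representation described in S402, the following sketch stores a 3D model as a vertex position list plus mesh information, together with a time code. The class name `Model3D`, its field names, the use of triangular faces, and seconds as the time unit are all assumptions made for this example; the embodiment does not specify them.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Face = Tuple[int, int, int]  # assumption: triangles given as vertex indices


@dataclass
class Model3D:
    """Hypothetical container for one 3D model at one model time."""
    time_code: float        # model time in content time (assumed to be seconds)
    vertices: List[Vertex]  # list of vertex positions
    faces: List[Face]       # mesh information: faces as index triples

    def edges(self) -> List[Tuple[int, int]]:
        """Derive the undirected edges of the mesh from the face list."""
        found = set()
        for a, b, c in self.faces:
            for u, v in ((a, b), (b, c), (c, a)):
                found.add((min(u, v), max(u, v)))
        return sorted(found)


# A single triangle stored as the basic 3D model at original model time 0.0.
basic = Model3D(
    time_code=0.0,
    vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    faces=[(0, 1, 2)],
)
```

Storing only faces and deriving edges on demand is one possible design; an implementation could equally keep explicit vertex, edge, and face lists as the text describes.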

Which of the objects appearing in the captured images are subjected to 3D model generation by image-based modeling can be chosen freely. Because this embodiment is a virtual viewpoint image data generation system installed in a soccer stadium, 3D models are generated for "players" and the "soccer ball".

In S403, the basic model generation unit 302 divides the basic 3D model generated in S402. Each basic 3D model is divided into parts that move together, and each part is handled as an individual 3D model. By interpolating only parts whose movement is at or above a certain level, the amount of processing needed to generate interpolated 3D models can be reduced. This makes it possible to efficiently generate the 3D models corresponding to virtual viewpoint image data at an arbitrary frame rate.

One way to detect the parts into which a basic 3D model is divided is to use machine learning to learn a 3D model of each part for each kind of object and to detect each part based on the learning result. Alternatively, instead of detecting the 3D model of each part directly, cross sections of the 3D model can be examined from one end to the other, and the joints between parts can be detected from changes in the size of the cross section, which in turn identifies each part.

As for which parts to use, a person's arms and torso, for example, often differ in the fineness and speed of their movement, so it is efficient to treat them as separate divided models. In this embodiment, the ball is not divided and is treated as a single part, and a person is divided into six parts: head, torso, left arm, right arm, left leg, and right leg. The number of parts is not limited to six. For example, each arm may be further divided into three parts (shoulder to elbow, elbow to wrist, and wrist to fingertips) and each leg likewise into three parts, giving 14 parts in total. The number of parts and the division positions may also be changed to suit the movement characteristics of the object.

Furthermore, instead of fixing the parts in advance as described above, the model may be divided into a portion consisting of vertices whose movement amount is at or above a threshold and a portion consisting of vertices whose movement amount is below the threshold.
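
The threshold-based division just described could look roughly like the following sketch. It assumes that a per-vertex displacement between two model times is already available and that the movement amount is the Euclidean length of that displacement; the function name `split_by_movement` and both assumptions are illustrative and not taken from the embodiment.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, float, float]


def split_by_movement(displacements: Sequence[Vector],
                      threshold: float) -> Tuple[List[int], List[int]]:
    """Return (moving, static) vertex index lists for a given threshold.

    A vertex is 'moving' when the length of its displacement is at or
    above the threshold, and 'static' otherwise.
    """
    moving: List[int] = []
    static: List[int] = []
    for index, vec in enumerate(displacements):
        amount = math.sqrt(sum(c * c for c in vec))
        if amount >= threshold:
            moving.append(index)
        else:
            static.append(index)
    return moving, static


# Vertex 1 moves by 5.0 units; vertices 0 and 2 move by 0.0 and 0.5 units.
moving, static = split_by_movement(
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.5, 0.0, 0.0)], threshold=1.0)
```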

Concretely, the division processing of the basic 3D model consists of storing divided part information and model joining information, which describes the joints between the divided parts, in the model storage unit 303 together with the basic 3D model. The divided part information is a list of vertex indices for each part, and the model joining information is a list of the indices of the vertices shared between the parts that are joined. This model joining information allows the model acquisition unit 307, described later, to properly join (combine) the interpolated 3D models of the individual parts.

In this embodiment, when a human-shaped basic 3D model is divided, model joining information between the torso and the other five parts (head, left arm, right arm, left leg, right leg) is stored in the model storage unit 303. In this case, every joint connects exactly two parts. The head, left arm, right arm, left leg, and right leg each have one joint, and a list of the indices of the vertices shared between the parts is created as the model joining information. The torso has five joints, and a list of the indices of the shared vertices is created for all five joints as the model joining information.
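
The relationship between divided part information and model joining information can be sketched as follows. The example assumes that each part's index list includes the vertices on its joints, so that the shared vertices can be found as the intersection of two index lists; that derivation, the function name `model_joining_info`, and the toy vertex indices are assumptions for illustration only.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def model_joining_info(
        parts: Dict[str, List[int]]) -> Dict[Tuple[str, str], List[int]]:
    """Map each pair of joined parts to the vertex indices they share."""
    joints: Dict[Tuple[str, str], List[int]] = {}
    for (name_a, idx_a), (name_b, idx_b) in combinations(parts.items(), 2):
        shared = sorted(set(idx_a) & set(idx_b))
        if shared:  # only pairs with shared vertices form a joint
            joints[(name_a, name_b)] = shared
    return joints


# Hypothetical divided part information: vertex indices per part.
parts = {
    "torso": [0, 1, 2, 3, 4, 5],
    "head": [4, 5, 6, 7],
    "left_arm": [2, 3, 8, 9],
}
joints = model_joining_info(parts)
```

In this toy data the torso has two joints while the head and the left arm have one each, mirroring the structure described above in which the torso is the only part with multiple joints.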

FIG. 5 is a flowchart of the interpolation model generation processing of this embodiment. This flow is executed when the basic model generation unit 302 stores a new basic 3D model with a new original model time in the model storage unit 303, and it operates on that new basic 3D model and the basic 3D model at the immediately preceding original model time. Interpolated 3D models are generated per object, and the number of interpolated 3D models generated may differ from object to object.

In S501, the model analysis unit 304 acquires the basic 3D models, divided into parts, at the two consecutive original model times that are the target of interpolated 3D model generation; that is, it acquires the basic 3D models and the divided part information. Of the two basic 3D models, the one at the earlier original model time T1 is referred to as the basic 3D model M(T1), and the one at the later original model time T2 as the basic 3D model M(T2).

FIG. 6 is a diagram showing an example of basic 3D models at consecutive model times in the embodiment. In this embodiment, the basic 3D model of a player is divided, based on the divided part information, into a head 11, a right arm 12, a left arm 13, a torso 14, a right leg 15, and a left leg 16.

In S502, the model analysis unit 304 calculates motion vectors for each part of the basic 3D models M(T1) and M(T2), including their divided parts, acquired in S501. In this embodiment, mesh tracking is performed by simple distance-based vertex matching. Specifically, given the vertex group P(T1) of the basic 3D model M(T1) and the vertex group P(T2) of the basic 3D model M(T2), each vertex of P(T1) is associated with the vertex of P(T2) that is closest to it. Mesh tracking may instead use, for example, the method described in Non-Patent Document 1. A motion vector is a vector representing the movement between vertices of the basic 3D models that have been associated by mesh tracking. It is obtained by computing, along each coordinate axis, the displacement from the three-dimensional position of each vertex of P(T1) to the three-dimensional position of the corresponding vertex of P(T2).
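
The distance-based vertex matching and motion vector calculation of S502 can be sketched as below. This is a minimal brute-force version for illustration; the function names `match_nearest` and `motion_vectors` are hypothetical, and a practical implementation would likely use a spatial index rather than comparing every pair of vertices.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def match_nearest(p_t1: Sequence[Point], p_t2: Sequence[Point]) -> List[int]:
    """For each vertex of P(T1), return the index of the closest vertex in P(T2)."""
    return [
        min(range(len(p_t2)), key=lambda j: math.dist(p, p_t2[j]))
        for p in p_t1
    ]


def motion_vectors(p_t1: Sequence[Point], p_t2: Sequence[Point]) -> List[Point]:
    """Per-axis displacement from each P(T1) vertex to its matched P(T2) vertex."""
    matches = match_nearest(p_t1, p_t2)
    return [
        (q[0] - p[0], q[1] - p[1], q[2] - p[2])
        for p, q in zip(p_t1, (p_t2[j] for j in matches))
    ]


# Two vertices that both shift by 0.25 along x; P(T2) is listed in a different order.
p_t1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
p_t2 = [(1.25, 0.0, 0.0), (0.25, 0.0, 0.0)]
vectors = motion_vectors(p_t1, p_t2)
```

Note that nearest-vertex matching does not guarantee a one-to-one correspondence: several vertices of P(T1) may be matched to the same vertex of P(T2).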

In S503, the interpolation time determination unit 305 calculates, for each part of the basic 3D model of each object, the interpolation times at which interpolation is to be performed. To calculate the interpolation times, the number of interpolations N to be performed between each pair of adjacent basic times is determined first. N is determined from the motion vectors obtained by the model analysis unit 304, the time interval between the adjacent basic times, a permissible speed Va set by the builder of the system, and a minimum time resolution Tmin. In the example shown in FIG. 6, the time interval between adjacent basic times is the interval between the basic times T1 and T2 of the basic 3D models M(T1) and M(T2).

In the present embodiment, the average of the motion vectors of all vertices belonging to a part is used as the representative motion vector of that part. Let the representative speed V be the magnitude of the representative motion vector per second, that is, the moving speed representing the part. The number of interpolations N between adjacent basic times is then given by Equation (1).
N = min(Rounddown(V / Va), (T2 − T1) / Tmin)    Equation (1)

Here, Rounddown denotes a function that truncates the fractional part and returns an integer. By Equation (1), interpolation is performed when the representative speed V is equal to or higher than the permissible speed Va, and the interpolation interval is kept from falling below the minimum time resolution Tmin. N is obtained for each pair of adjacent basic times and for each divided part: the larger the movement in an interval, the larger N becomes, and the larger the movement of a part, the larger N becomes.
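Equation (1) can be sketched as below. The function names are hypothetical, and flooring the second term (T2 − T1) / Tmin so that N is always an integer is an assumption of this sketch; the text states only that Rounddown is applied to V / Va.

```python
import math

def representative_speed(vectors, dt):
    """Magnitude per second of the mean motion vector of one part."""
    n = len(vectors)
    mean = [sum(v[k] for v in vectors) / n for k in range(3)]
    return math.hypot(*mean) / dt

def interpolation_count(v, va, t1, t2, tmin):
    """Number of interpolations N between basic times t1 and t2, per Equation (1)."""
    by_speed = math.floor(v / va)                 # Rounddown(V / Va)
    by_resolution = math.floor((t2 - t1) / tmin)  # assumption: floored to an integer
    return min(by_speed, by_resolution)

# Hypothetical values: basic times 0.5 s apart, Va = 1.0, Tmin = 0.125 s.
n_slow = interpolation_count(0.5, 1.0, 0.0, 0.5, 0.125)   # V < Va
n_fast = interpolation_count(2.5, 1.0, 0.0, 0.5, 0.125)   # V >= Va
```

With these values a part moving slower than Va gets no interpolation, while a very fast part is capped by the resolution term.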

The permissible speed Va and the minimum time resolution Tmin are determined in consideration of the processing capacity of the information processing apparatus 103, the complexity of the basic 3D models, the time resolution users are expected to request, and the like. This also fixes the maximum number of interpolation times that can be set between each pair of basic times. When the number of interpolations N determined by Equation (1) is not 0 (V ≥ Va), the interpolation times T1-i (i = 1, ..., N) are set using the basic times T1 and T2 and N, so N interpolation times are set between each pair of adjacent basic times. When N is 0 (V < Va), no interpolation is performed and no interpolation time T1-i is set. An interpolation time is a time in the same content time as the original model times. The basic times and interpolation times are collectively referred to simply as model times.

In the present embodiment, interpolation is performed by linear interpolation, and the interpolation times T1-i are placed at equal intervals between the basic times T1 and T2. Each is calculated by adding to the basic time T1 a multiple of the interval obtained by dividing the time between the basic times by N + 1, as in Equation (2).
T1-i = T1 + (T2 − T1) / (N + 1) × i    Equation (2)
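As a worked instance of Equation (2), the sketch below lists the interpolation times for one pair of basic times. The function name `interpolation_times` and the use of exact fractions are choices made for this example only.

```python
from fractions import Fraction

def interpolation_times(t1, t2, n):
    """Interpolation times T1-i (i = 1..N), equally spaced between t1 and t2."""
    step = (t2 - t1) / (n + 1)
    return [t1 + step * i for i in range(1, n + 1)]

# Basic models at 60 fps (T2 - T1 = 1/60 s) with N = 3.
times = interpolation_times(Fraction(0), Fraction(1, 60), 3)
```

For N = 3 the three interpolation times fall at 1/240, 2/240, and 3/240 of a second after T1; for N = 0 the list is empty, matching the case where no interpolation time is set.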

FIG. 14 shows a flowchart of the interpolation time calculation process of S503. In S1401, the interpolation time determination unit 305 selects one unselected part. In S1402, it calculates the number of interpolations N by Equation (1) from the motion vectors, the time interval between the adjacent basic times, the permissible speed Va, and the minimum time resolution Tmin. In S1403, it determines whether N is 0; if N is not 0 the process proceeds to S1404, and if N is 0 it proceeds to S1405. In S1404, the interpolation time determination unit 305 calculates the interpolation times by Equation (2). In S1405, it determines whether all parts have been selected; if an unselected part remains the process returns to S1401, and if all parts have been selected the process ends.

As described above, whether interpolation is performed depends on whether an interpolation time is set, and therefore on whether the number of interpolations N is 0. Because N is 0 or not according to whether the moving speed of the basic 3D model is at least the predetermined speed (V ≥ Va) or below it (V < Va), whether interpolation is performed is ultimately determined by the moving speed of the basic 3D model.

In S504, the model interpolation unit 306 generates an interpolated 3D model for each part for which an interpolation time was set in S503, and stores it in the model storage unit 303 together with the divided part information, the model joining information, and a time code indicating the interpolation model time. No interpolated 3D model is generated for parts for which no interpolation time was set in S503.

FIG. 7 shows the interpolated 3D model M(T1-1) generated when N = 1 for the basic 3D models M(T1) and M(T2) shown in FIG. 6. When the number of interpolations N was calculated for M(T1) and M(T2) of FIG. 6, only the left foot 16 and the soccer ball 17 had a representative speed V equal to or higher than the permissible speed Va. Consequently, as shown in FIG. 7(b), interpolation times were set only for the left foot 16'' and the soccer ball 17''. For the head 11, right arm 12, left arm 13, torso 14, and right foot 15 of the interpolated 3D model M(T1-1), those of M(T1) or M(T2) are used as they are. Generating interpolated 3D models only for parts and objects with large movement in this way suppresses the data amount of the 3D models after interpolation.

In the present embodiment, the interpolated 3D model is generated by linearly interpolating the vertex coordinates using the motion vectors calculated for each vertex in S502. That is, linear interpolation is used to generate interpolated 3D models between the basic 3D models of the two original model times T1 and T2. The technique of the present disclosure is not limited to this embodiment, however: the number of basic 3D models referenced may be increased to three or more, and a nonlinear vertex coordinate interpolation method such as quadratic interpolation may be used.
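The linear interpolation of vertex coordinates can be sketched as follows, assuming the i-th of N interpolated models sits at the fraction i / (N + 1) of the way from T1 to T2, consistent with Equation (2). The function name and the sample data are hypothetical.

```python
def interpolate_vertices(p1, vectors, i, n):
    """Vertex positions of the i-th of n interpolated models between T1 and T2.

    p1: vertex positions at T1; vectors: motion vector of each vertex (T1 -> T2).
    """
    alpha = i / (n + 1)  # fraction of the way from T1 to T2
    return [tuple(c + alpha * d for c, d in zip(v, mv))
            for v, mv in zip(p1, vectors)]

# One vertex moving by (2, 4, 0) between T1 and T2, with N = 1.
midway = interpolate_vertices([(0.0, 0.0, 0.0)], [(2.0, 4.0, 0.0)], 1, 1)
```

With N = 1 the single interpolated model places each vertex halfway along its motion vector.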

FIG. 8 is a flowchart of the distribution virtual viewpoint image data generation process of the present embodiment. This flow is executed when a request to play back specific virtual viewpoint image data at a specific frame rate f is received from a user via the input/output unit 309. The frame rate f may be specified by the user via the user terminals 106 and 107, or may be determined according to the processing capacity of the user terminals 106 and 107 or the state of their connection with the information processing apparatus 103.

In S801, the model acquisition unit 307 acquires time codes representing the times of the playback section (playback start time and playback end time) of the distribution virtual viewpoint image data to be generated in this process, and the frame rate f of that data. The time code acquired in S801 may consist of the playback start time only; in that case playback may end when the user requests it, or the playback end time may be the end of the distribution virtual viewpoint image data. The frame rate f may be determined based on user input via the input device 105 or the like, or may be determined automatically from the processing capacity of the user terminals 106 and 107. An automatically determined frame rate f may be, for example, the maximum frame rate of the user terminals 106 and 107, or a frame rate at which playback is not interrupted given the state of the communication line.

In S802, the model acquisition unit 307 acquires the model time T closest to the specified playback start time and, taking that model time T as the reference, determines the frame time Tf of each piece of frame image data at the time interval corresponding to the specified frame rate f.

In S803, the model acquisition unit 307 acquires, from the 3D models stored in advance in the model storage unit 303 (basic 3D models and interpolated 3D models), the 3D model whose model time T is close to, and preferably closest to, each frame time Tf determined in S802. When the acquired 3D model is an interpolated 3D model, the divided part information and the model joining information are acquired as well. Furthermore, for any part included in the model joining information for which no interpolated 3D model exists, the corresponding part of the basic 3D model whose model time T is closest to the model time T of the interpolated 3D model is acquired.

FIG. 9 illustrates how 3D models are acquired for an arbitrary frame rate f in the embodiment. The case described here generates virtual viewpoint image data at 40 fps and at 100 fps using 3D models obtained by interpolating, with N = 3, basic 3D models generated from 60 fps multi-viewpoint image data.

To generate 40 fps virtual viewpoint image data, the basic 3D model M(T1) is used as the 3D model for frame time Tf1, and the interpolated 3D model M(T2-2), whose model time coincides with frame time Tf2, is used as the 3D model for Tf2. Thus, when a model time T that coincides with the frame time Tf exists, the 3D model of that model time T is used to generate the frame image data for Tf.

To generate 100 fps virtual viewpoint image data, on the other hand, the basic 3D model M(T1) is again used for frame time Tf1, but no model time coincides with frame time Tf2. The 3D model for Tf2 is therefore the interpolated 3D model M(T1-2), whose model time T1-2 is closest to Tf2. Likewise, the 3D model for Tf3 is M(T2-1), whose model time T2-1 is closest to Tf3, and the 3D model for Tf4 is M(T2-3), whose model time T2-3 is closest to Tf4.

Thus, when no model time T coincides with the frame time Tf, the 3D model of the model time T closest to Tf is used. This makes it possible to generate virtual viewpoint image data at an arbitrary frame rate f. Higher-quality virtual viewpoint image data is obtained, and is therefore preferable, when 3D models whose model times T coincide with the frame times Tf are used, or when the model times T are spaced more finely than the frame times Tf.
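The selection rule of FIG. 9 can be checked with a short sketch. It assumes 60 fps basic models with N = 3, so model times fall every 1/240 s, and frame times that start at T1. The function names and the label "T3" for the third basic time are assumptions of this example; the other labels follow the text.

```python
from fractions import Fraction

def frame_times(start, fps, count):
    """Frame times Tf at intervals of 1/fps, starting from a model time."""
    return [start + Fraction(k, fps) for k in range(count)]

def nearest_model_time(tf, model_times):
    """Model time T closest to the frame time Tf (exact match if one exists)."""
    return min(model_times, key=lambda t: abs(t - tf))

# Model times every 1/240 s: T1, T1-1, T1-2, T1-3, T2, T2-1, T2-2, T2-3, T3.
model_times = [Fraction(k, 240) for k in range(9)]
labels = ["T1", "T1-1", "T1-2", "T1-3", "T2", "T2-1", "T2-2", "T2-3", "T3"]

def pick(fps, count):
    """Labels of the model times chosen for the first `count` frames."""
    return [labels[model_times.index(nearest_model_time(tf, model_times))]
            for tf in frame_times(Fraction(0), fps, count)]

models_40fps = pick(40, 2)    # Tf2 coincides with T2-2
models_100fps = pick(100, 4)  # no exact matches after Tf1; nearest is used
```

At 40 fps the second frame time equals 6/240 s and therefore coincides with T2-2, while at 100 fps the frame times 2.4/240, 4.8/240, and 7.2/240 s snap to T1-2, T2-1, and T2-3.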

FIG. 10 illustrates how 3D models are acquired for predetermined frame rates f in the embodiment. When the selectable frame rates f of the virtual viewpoint image data are fixed in advance, setting the number of interpolations N appropriately makes it possible to generate high-quality virtual viewpoint image data without holding needlessly many 3D models. As shown in FIG. 10, when the selectable frame rates f are 120 fps and 180 fps, setting N = 5 ensures that a 3D model with a matching model time T exists for every frame time Tf. High-quality virtual viewpoint image data can thus be generated while holding fewer 3D models than with N ≥ 6.
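One way to arrive at such an N is to make the model rate the least common multiple of the basic frame rate and all selectable frame rates. The sketch below assumes this approach, which the text does not state explicitly, and assumes 60 fps basic models as in the FIG. 9 example; the function names are hypothetical.

```python
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def min_interpolations(base_fps, selectable_fps):
    """Smallest N such that every frame time of every selectable rate
    coincides with a model time (frame times aligned to a model time)."""
    model_rate = reduce(lcm, selectable_fps, base_fps)
    return model_rate // base_fps - 1

n = min_interpolations(60, [120, 180])
```

For 120 fps and 180 fps over 60 fps basic models, the common model rate is 360 per second, which corresponds to N = 5.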

Although not illustrated, when movement is small and the number of interpolations N is therefore small, so that the model times T are spaced more widely than the frame times Tf, the 3D model of the same model time T is used for multiple frame times Tf. In portions where the object changes little and interpolation would have little effect, the data amount of the 3D models can thus be reduced by not generating interpolated 3D models.

In S804, the model acquisition unit 307 generates a synthesized 3D model by combining the per-part interpolated 3D models acquired in S803, based on the divided part information and the model joining information, into a 3D model of the whole object as it was before division. For a part with no interpolated 3D model, the corresponding part of the basic 3D model at a time close to the interpolation time of the interpolated 3D model being combined is used. The number of per-object interpolated 3D models after synthesis between adjacent basic times equals the largest number of interpolations N among the parts.
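The fallback to basic-model parts in S804 can be sketched as follows. The dictionary layout, the function name `compose_parts`, and the tie-breaking toward the earlier basic time are assumptions of this example; the text says only that a basic 3D model at a close time is used.

```python
def compose_parts(model_time, interpolated, base_models, part_names):
    """Collect one mesh per part for the object at `model_time`.

    interpolated: {part: {time: mesh}} for parts that were interpolated.
    base_models:  {basic_time: {part: mesh}}, inserted in time order.
    """
    # Nearest basic time; on a tie, min() keeps the first (earlier) key.
    nearest_base = min(base_models, key=lambda t: abs(t - model_time))
    result = {}
    for part in part_names:
        per_time = interpolated.get(part, {})
        if model_time in per_time:
            result[part] = per_time[model_time]          # interpolated part
        else:
            result[part] = base_models[nearest_base][part]  # reuse basic part
    return result

# Hypothetical data: only the left foot was interpolated at time 0.5.
base = {0.0: {"torso": "torso@T1", "left_foot": "foot@T1"},
        1.0: {"torso": "torso@T2", "left_foot": "foot@T2"}}
interp = {"left_foot": {0.5: "foot@T1-1"}}
composed = compose_parts(0.5, interp, base, ["torso", "left_foot"])
```

The joint vertices of the collected parts would still need to be merged using the model joining information before the result is a connected model.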

In the present embodiment, referring to the model joining information allows the parts to be synthesized so that their joints do not come apart. For example, when vertices that were a single vertex in the basic 3D model before division have moved to different positions in the respective parts after division, all of those vertices are gathered at their midpoint and joined. Details are described later.

In S805, the model acquisition unit 307 acquires the 3D models within the playback section, including the synthesized 3D models generated in S804 and the basic 3D models. The virtual viewpoint image data generation unit 308 then acquires the virtual viewpoint information input by the user on the user terminals 106 and 107, performs rendering, and generates distribution virtual viewpoint image data representing the view from the virtual viewpoint. The virtual viewpoint information is typically acquired at a default playback frame rate fd predetermined on the user terminals 106 and 107. For example, a user terminal whose default playback frame rate fd is 60 fps acquires virtual viewpoint information once every 1/60 second.

When the default playback frame rate fd matches the frame rate f specified by the user, virtual viewpoint image data representing the view from the changed virtual viewpoint is generated using the 3D model whose model time T coincides with the frame time Tf at which the virtual viewpoint information was changed. The default playback frame rate fd may, however, differ from the frame rate f of the distribution virtual viewpoint image data specified by the user. In that case, the 3D model whose model time T is closest to the time at which the user terminal acquired the virtual viewpoint information is used to generate the virtual viewpoint image data representing the view from the changed virtual viewpoint.

In S806, the model acquisition unit 307 distributes the virtual viewpoint image data generated in S805 from the input/output unit 309 to the user terminals 106 and 107 via the network 108.

If a change of the frame rate f is requested while the virtual viewpoint image data is being distributed, the processes of S801 to S806 may be performed with the time of the request as the playback start time.

The 3D model synthesis process in S804 is now described. FIG. 11 shows the interpolated 3D model of FIG. 7(b) in the state before vertex B1 of the torso 14 and vertex L1 of the left foot 16'', which were the same vertex in the basic 3D model, are merged. FIG. 12 shows the state after vertex B1 and vertex L1 of FIG. 11 have been merged at their midpoint I1. In FIG. 12, the torso 14 and the left foot 16'' are deformed into the torso 14'' and the left foot 16''', respectively, so that vertex B1 and vertex L1 move to the midpoint I1.

The 3D model synthesis process is described in further detail with reference to FIG. 13. FIG. 13 illustrates the details of the method of synthesizing the models of the divided parts, using a simplified 2D model for convenience.

FIG. 13(a) shows a left arm consisting of vertices v1, v2, and v3 of the basic 3D model and a torso consisting of vertices v2, v3, v4, and v5. The vertices shared by the left arm and the torso are v2 and v3; in this case, the vertex indices v2 and v3 are stored as the model joining information of the torso and the left arm in S403 of FIG. 4. Although there are two joint vertices in the example of FIG. 13, S403 detects all vertices shared between different parts and stores all of their vertex indices as the model joining information.

FIG. 13(b) shows the left arm and the torso after each has been interpolated separately. The vertices v2 and v3 that the two parts shared have moved to v2' and v3' in the left arm and to v2'' and v3'' in the torso, so the left arm and the torso have come apart after interpolation.

FIG. 13(c) shows the interpolated left arm and torso after synthesis. When, as in FIG. 13(b), the coordinates of the vertices v2 and v3 stored in the model joining information differ between the left arm and the torso, vertices v2' and v2'' are moved to their midpoint v2''', and vertices v3' and v3'' are moved to their midpoint v3'''. Moving v2', v2'', v3', and v3'' in this way deforms the interpolated left arm and torso so that they are joined and synthesized.
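The midpoint merge of FIG. 13 can be sketched with the simplified 2D model used there. The dictionary representation, the function name `merge_joints`, and the coordinates are assumptions made for illustration.

```python
def merge_joints(part_a, part_b, joint_indices):
    """Move each shared vertex of two parts to the midpoint of its two copies.

    part_a, part_b: {vertex_index: (x, y)}; joint_indices: the shared indices
    stored as model joining information. Returns new dictionaries.
    """
    merged_a, merged_b = dict(part_a), dict(part_b)
    for idx in joint_indices:
        mid = tuple((a + b) / 2 for a, b in zip(part_a[idx], part_b[idx]))
        merged_a[idx] = mid
        merged_b[idx] = mid
    return merged_a, merged_b

# After separate interpolation, the shared vertices v2 and v3 have drifted apart.
left_arm = {"v1": (0.0, 2.0), "v2": (1.0, 1.0), "v3": (1.0, 3.0)}
torso = {"v2": (2.0, 1.0), "v3": (2.0, 3.0), "v4": (4.0, 3.0), "v5": (4.0, 1.0)}
arm_joined, torso_joined = merge_joints(left_arm, torso, ["v2", "v3"])
```

Only the joint vertices move; the remaining vertices of each part keep their interpolated positions, so the parts deform slightly near the joint.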

By referring to the model joining information in this way, parts that have separated can be synthesized by rejoining them at the joints they had when divided.

The above is a configuration example of an image processing apparatus that provides virtual viewpoint image data at an arbitrary frame rate f efficiently in terms of storage and processing cost. It keeps the data amount of the 3D models held to the necessary minimum while allowing virtual viewpoint image data to be distributed shortly after a playback request from a user.

In FIG. 8, S804, the step of synthesizing the interpolated 3D models, is performed after S801 to S803, but S804 may instead be performed before S801. That is, after the interpolated 3D models are generated in the interpolation model generation process of FIG. 5, all of the generated interpolated 3D models may be synthesized. Synthesizing the interpolated 3D models in advance further shortens the delay from the user's request to playback.

The present embodiment includes the multi-viewpoint image capturing unit 301, the basic model generation unit 302, the virtual viewpoint image data generation unit 308, and the input/output unit 309, but these may be replaced by external devices having equivalent functions. That is, the information processing apparatus 103 may be configured not to perform S402, S501, S805, and S806 among the processes shown in FIGS. 4, 5, and 8.

(Other Embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiment is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

305 Interpolation time determination unit
306 Model interpolation unit
307 Model acquisition unit

Claims

An image processing apparatus comprising:
acquisition means for acquiring first 3D models, each representing a three-dimensional shape of an object and associated with a time code; and
generation means for generating, based on two first 3D models associated with two consecutive time codes and on a moving speed of the two first 3D models, a second 3D model corresponding to a time between the times indicated by the two consecutive time codes.

The image processing apparatus according to claim 1, wherein the generation means generates the second 3D model based on the movement speed being equal to or higher than a predetermined speed.

The image processing apparatus according to claim 1, wherein the generation means does not generate the second 3D model based on the two first 3D models when the moving speed is less than a predetermined speed.

The image processing apparatus according to any one of claims 1 to 3, wherein the generation means determines the time code associated with the second 3D model by adding, to the time of the time code corresponding to the first 3D model immediately before the interpolation, a time determined according to the time interval between the two consecutive time codes corresponding to the first 3D models and the number of second 3D models interpolated between those time codes.

The image processing apparatus according to claim 4, wherein the generation means determines the maximum number of second 3D models interpolated between the two consecutive time codes corresponding to the first 3D models based on a minimum time resolution of the generation means.

The image processing apparatus according to any one of claims 1 to 5, wherein the generation means determines, according to the magnitude of the moving speed, the number of second 3D models to generate for each pair of two consecutive time codes corresponding to the first 3D models.

The image processing apparatus according to any one of claims 1 to 6, wherein the acquisition means acquires the first 3D model in which the first 3D model corresponding to a predetermined object has been divided into a plurality of parts that are treated as individual objects, and joining information specifying the joints between the plurality of parts.

The image processing apparatus according to claim 7, wherein the acquisition means acquires the first 3D model in which the first 3D model corresponding to the predetermined object has been divided into a plurality of parts according to the moving speed, the parts being treated as the individual objects.

The image processing apparatus according to claim 7 or 8, further comprising synthesis means for generating a third 3D model by synthesizing, based on the joining information, the second 3D models of the divided parts into a 3D model of the whole object as it was before division.

The image processing apparatus according to claim 9, wherein, when the second 3D model is generated for only some of the parts of the per-object 3D model before division, the synthesis means synthesizes with the second 3D model the parts, absent from the second 3D model, of the first 3D model associated with a time code close to the time code of the second 3D model.

The image processing apparatus according to claim 9 or 10, further comprising selection means for selecting, from the first 3D models and the third 3D models, a 3D model corresponding to frame image data of virtual viewpoint image data based on a specified playback section, virtual viewpoint, and frame rate.

The image processing apparatus according to claim 11, wherein, when neither a first 3D model nor a third 3D model is associated with a time code that matches a playback time determined based on the specified playback section, virtual viewpoint, and frame rate, the selection means selects the first 3D model or the third 3D model associated with the time code closest to the playback time as the 3D model corresponding to the frame image data of the virtual viewpoint image data.

The image processing apparatus according to claim 11 or 12, further comprising a generation means for generating the virtual viewpoint image data based on a 3D model corresponding to the frame image data selected by the selection means.

The image processing device according to any one of claims 11 to 13, wherein the designated frame rate is specified based on the terminal device on which the virtual viewpoint image data is reproduced.

The image processing apparatus according to any one of claims 1 to 14, wherein the first 3D model acquired by the acquisition means is a 3D model generated based on a plurality of pieces of image data obtained by synchronous imaging using a plurality of imaging devices.

An image processing method comprising:
a step of acquiring first 3D models, each representing a three-dimensional shape of an object and associated with a time code; and
a step of generating, based on two first 3D models associated with two consecutive time codes and on a moving speed of the two first 3D models, a second 3D model corresponding to a time between the times indicated by the two consecutive time codes.

A program for operating a computer as the image processing device according to any one of claims 1 to 15.