JP2023163133A - Image processing system, image processing method, and computer program

Info

Publication number: JP2023163133A
Application number: JP2023038750A
Authority: JP
Inventors: 裕尚伊藤; Hironao Ito; 充前田; Mitsuru Maeda; 衛田苗; Mamoru Tanae
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2022-04-27
Filing date: 2023-03-13
Publication date: 2023-11-09

Abstract

To provide an image processing system that displays attractive digital content including a virtual viewpoint image and other images. SOLUTION: A graphical user interface (GUI) image shown on a user device contains a first region 911 and a second region 912. The first region 911 includes a first display region 901 to a sixth display region 906, which have a three-dimensional shape and display images showing information associated with the digital content. Each display region shows the images and information allocated to it. The second region 912 contains a seventh display region 907, which shows an image indicating information associated with the display region that the user has selected from among the first to sixth display regions. When the second display region 902, with which three virtual viewpoint images are associated, is selected, GUIs 908 to 910 corresponding to the virtual viewpoint images of the respective viewpoints are displayed in the second region 912. The GUIs 908 to 910 may be displayed superimposed on the seventh display region 907. SELECTED DRAWING: Figure 9

Description

The present disclosure relates to an image processing system, an image processing method, a computer program, and the like.

A technique that uses a plurality of images captured by a plurality of imaging devices to generate a virtual viewpoint image from a specified virtual viewpoint is attracting attention. Patent Document 1 describes a method in which a plurality of imaging devices are installed at different positions to image a subject, and a virtual viewpoint image is generated using the three-dimensional shape of the subject estimated from the captured images.

Patent Document 1: JP2015-45920A

However, it has not been possible to provide attractive digital content that includes a virtual viewpoint image and other images.

An object of the present disclosure is to provide an image processing system that offers a technique for displaying attractive digital content including a virtual viewpoint image and other images.

An image processing system according to one embodiment of the present disclosure includes:
identifying means for identifying a virtual viewpoint image that is associated with a first surface of three-dimensionally shaped digital content and is generated based on a virtual viewpoint and a plurality of images captured by a plurality of imaging devices, and an image that is associated with a second surface of the digital content and is taken from a viewpoint different from the virtual viewpoint corresponding to the virtual viewpoint image; and
display control means for performing control to display, in a display area, an image corresponding to the virtual viewpoint image and an image corresponding to the image from the viewpoint different from the virtual viewpoint.

According to the present disclosure, attractive digital content including a virtual viewpoint image and other images can be displayed.

FIG. 1 is a diagram showing an example of the device configuration of the image processing system 100 according to Embodiment 1.
FIG. 2 is a diagram showing the hardware configuration of the image processing system 100 according to Embodiment 1.
FIG. 3 is a flowchart for explaining the operation flow of the image processing system 100 of Embodiment 1.
FIGS. 4(A) to 4(C) are diagrams showing examples of a three-dimensional image as content generated by the content generation unit 4 in Embodiment 1.
FIG. 5 is a flowchart for explaining the operation flow of the image processing system 100 of Embodiment 5.
FIG. 6 is a flowchart for explaining the operation flow of the image processing system 100 of Embodiment 3.
FIG. 7 is a flowchart continuing from FIG. 6.
FIG. 8 is a flowchart continuing from FIGS. 6 and 7.
FIG. 9 is a diagram showing an example of a graphical user interface displayed on a user device according to Embodiment 4.
FIG. 10 is a flowchart for explaining the operation flow of Embodiment 4.
FIG. 11 is a diagram showing an example of a graphical user interface displayed on a user device according to Embodiment 5.
FIG. 12 is a flowchart for explaining the operation flow of Embodiment 5.
FIG. 13 is a diagram showing an example of a graphical user interface displayed on a user device according to Embodiment 6.
FIG. 14 is a flowchart for explaining the operation flow of Embodiment 6.
FIG. 15 is a diagram explaining each surface of the three-dimensionally shaped digital content according to Embodiment 7.
FIG. 16 is a diagram explaining the shooting directions relative to a player according to Embodiment 7.
FIG. 17 is a diagram showing an example of the three-dimensionally shaped digital content generated by the content generation unit 4 in Embodiment 7.
FIG. 18 is a diagram showing an example of the device configuration of the image processing system 101 of Embodiment 7.
FIG. 19 is a flowchart for explaining the operation flow of the image processing system 101 of Embodiment 7.
FIG. 20 is a flowchart for explaining the operation flow of the image processing system 101 of Embodiment 8.
FIG. 21 is a diagram showing the system configuration of the image processing system 103 of Embodiment 9.
FIG. 22 is a diagram showing the flow of data transmission in Embodiment 9.

Embodiments of the present disclosure are described below with reference to the drawings. The present disclosure is not, however, limited to the following embodiments. In the figures, identical members or elements are given identical reference numerals, and duplicate descriptions are omitted or simplified.

(Embodiment 1)
The image processing system of Embodiment 1 generates a virtual viewpoint image as seen from a virtual viewpoint, based on captured images obtained by a plurality of imaging devices (cameras) imaging from different directions, the states of the imaging devices, and a specified virtual viewpoint. It then displays that virtual viewpoint image on a surface of a virtual three-dimensional image. An imaging device may include, in addition to a camera, a functional unit that performs image processing. An imaging device may also include, in addition to a camera, a sensor that acquires distance information.

The plurality of cameras image an imaging area from a plurality of directions. The imaging area is, for example, a region bounded by the field of a stadium and an arbitrary height. The imaging area may correspond to the three-dimensional space in which the three-dimensional shape of the subject, described above, is estimated. That three-dimensional space may be all or part of the imaging area. The imaging area may also be a concert venue, a photography studio, or the like.

The cameras are installed at different positions and in different directions (orientations) so as to surround the imaging area, and they capture images in synchronization. The cameras need not be installed around the entire circumference of the imaging area; depending on restrictions on installation locations, they may be installed in only some directions around it. The number of cameras is not limited. For example, when the imaging area is a rugby stadium, several tens to several hundreds of cameras may be installed around the stadium.

The cameras may include cameras with different angles of view, such as telephoto cameras and wide-angle cameras. For example, imaging a player at high resolution with a telephoto camera improves the resolution of the generated virtual viewpoint image. In ball games, where the ball moves over a wide range, imaging with wide-angle cameras reduces the number of cameras required. Combining the imaging areas of wide-angle and telephoto cameras also increases the freedom in choosing installation positions. The cameras are synchronized to a common time, and imaging time information is attached to each frame of the captured images.
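Because every frame carries imaging time information, frames from different cameras can be matched by timestamp before any multi-view processing. The following is a minimal sketch of that idea, not the system's actual implementation; the tuple layout `(camera_id, timestamp, image)` and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def group_frames_by_time(frames):
    """Group (camera_id, timestamp, image) tuples by capture timestamp."""
    groups = defaultdict(dict)
    for camera_id, timestamp, image in frames:
        groups[timestamp][camera_id] = image
    return dict(groups)


def complete_sets(groups, camera_ids):
    """Keep only timestamps for which every listed camera delivered a frame."""
    wanted = set(camera_ids)
    return {t: imgs for t, imgs in sorted(groups.items()) if set(imgs) == wanted}


# Hypothetical input: camera 2 has no frame at timestamp 1.
frames = [
    (1, 0, "cam1-f0"), (2, 0, "cam2-f0"),
    (1, 1, "cam1-f1"),
    (1, 2, "cam1-f2"), (2, 2, "cam2-f2"),
]
synced = complete_sets(group_frames_by_time(frames), camera_ids=[1, 2])
```

Here `synced` keeps only timestamps 0 and 2, the instants at which both cameras contributed a frame.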

A virtual viewpoint image, also called a free viewpoint image, lets an operator monitor an image corresponding to a freely (arbitrarily) specified viewpoint. An image corresponding to a viewpoint that the operator selects from a limited set of candidate viewpoints is also included in the term virtual viewpoint image. The virtual viewpoint may be specified by an operator's operation, or automatically by AI based on, for example, the results of image analysis. A virtual viewpoint image may be a video or a still image.

The virtual viewpoint information used to generate a virtual viewpoint image includes the position and direction (orientation) of the virtual viewpoint, as well as the angle of view (focal length) and the like. Specifically, the virtual viewpoint information includes parameters representing the three-dimensional position of the virtual viewpoint, parameters representing the direction from the virtual viewpoint (line-of-sight direction) in the pan, tilt, and roll directions, focal length information, and so on. The content of the virtual viewpoint information is not, however, limited to the above.

The virtual viewpoint information may also have parameters for each of a plurality of frames. That is, the virtual viewpoint information may have parameters corresponding to each of the frames that make up a virtual viewpoint video, and may indicate the position and direction of the virtual viewpoint at each of a plurality of consecutive points in time.
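One way to picture per-frame virtual viewpoint information is a list holding one parameter record per frame. The sketch below is illustrative only: the class name `VirtualViewpoint`, its field names, the units, and the use of plain linear interpolation between two keyframes are all assumptions, not something the disclosure prescribes.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class VirtualViewpoint:
    # Three-dimensional position of the virtual viewpoint.
    x: float
    y: float
    z: float
    # Line-of-sight direction, in degrees.
    pan: float
    tilt: float
    roll: float
    # Angle of view, expressed as a focal length.
    focal_length_mm: float


def lerp_viewpoint(a, b, t):
    """Blend every parameter linearly (naive: angles are not wrapped at 360)."""
    return VirtualViewpoint(*(p + (q - p) * t for p, q in zip(astuple(a), astuple(b))))


def camera_path(start, end, n_frames):
    """Return one VirtualViewpoint per frame, moving from start to end."""
    if n_frames < 2:
        raise ValueError("a path needs at least two frames")
    return [lerp_viewpoint(start, end, i / (n_frames - 1)) for i in range(n_frames)]


start = VirtualViewpoint(0.0, 0.0, 10.0, 0.0, -30.0, 0.0, 35.0)
end = VirtualViewpoint(20.0, 0.0, 10.0, 90.0, -30.0, 0.0, 35.0)
path = camera_path(start, end, n_frames=5)
```

Each element of `path` is the viewpoint for one frame of the video; a still image would need only a single `VirtualViewpoint`.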

A virtual viewpoint image is generated, for example, as follows. First, multi-camera images are acquired by imaging with the cameras from different directions. Next, from the multi-camera images, a foreground image is obtained by extracting the foreground region corresponding to subjects such as people and the ball, and a background image is obtained by extracting the background region outside the foreground region. The foreground and background images carry texture information (such as color information).

Then, a foreground model representing the three-dimensional shape of the subject, and texture data for coloring the foreground model, are generated based on the foreground image. Texture data for coloring a background model, which represents the three-dimensional shape of the background such as a stadium, is generated based on the background image. The texture data is then mapped onto the foreground and background models, and rendering is performed according to the virtual viewpoint indicated by the virtual viewpoint information, producing the virtual viewpoint image.
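The first stage of this pipeline, separating foreground from background, can be sketched as follows. The disclosure does not say how the foreground region is extracted; simple background differencing with a fixed threshold is used here purely as an assumed stand-in, and the function names and grayscale list-of-lists image format are hypothetical.

```python
def foreground_mask(frame, background, threshold=30):
    """Return a binary mask: 1 where the frame differs from the background."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def split_layers(frame, mask, fill=0):
    """Split a frame into a foreground image and a background image."""
    foreground = [
        [p if m else fill for p, m in zip(row, mask_row)]
        for row, mask_row in zip(frame, mask)
    ]
    background = [
        [fill if m else p for p, m in zip(row, mask_row)]
        for row, mask_row in zip(frame, mask)
    ]
    return foreground, background


# A 3x4 grayscale scene: empty background plus two bright "subject" pixels.
background = [[100] * 4 for _ in range(3)]
frame = [row[:] for row in background]
frame[1][1] = 200
frame[1][2] = 210

mask = foreground_mask(frame, background)
fg_image, bg_image = split_layers(frame, mask)
```

The pixel values kept in `fg_image` and `bg_image` play the role of the texture information from which foreground and background texture data would later be built.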

The method of generating a virtual viewpoint image is not limited to this, however. Various methods can be used, such as generating the virtual viewpoint image by projective transformation of captured images without using foreground or background models.
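A projective transformation maps image points through a 3x3 homography matrix followed by a perspective division. The sketch below shows only that point mapping; the matrix values are made up for illustration, and a real system would estimate the matrix from camera parameters and resample whole images.

```python
def apply_homography(h, point):
    """Map a 2D point through a 3x3 homography (list of three rows)."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    # Perspective division turns homogeneous coordinates back into 2D.
    return (u / w, v / w)


# Hypothetical homography: scales by 2, with a perspective term along x.
H = [
    [2, 0, 0],
    [0, 2, 0],
    [1, 0, 1],
]
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
warped = [apply_homography(H, corner) for corner in unit_square]
```

With this matrix the unit square becomes a trapezoid whose left edge is twice as tall as its right edge, which is the kind of foreshortening seen when a flat image is viewed at an angle.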

A foreground image is an image obtained by extracting the region of a subject (the foreground region) from a captured image acquired by a camera. A subject extracted as a foreground region is a dynamic subject (moving object) that shows movement, meaning its absolute position or shape can change, when imaged from the same direction over time. In a sporting event, for example, subjects include people such as players and referees on the field where the event takes place and, in a ball game, the ball as well. In concerts and other entertainment, the foreground subjects are singers, musicians, performers, presenters, and the like.

A background image is an image of a region (the background region) that is at least different from the foreground subjects. Specifically, a background image is the captured image with the foreground subjects removed. The background refers to imaged objects that remain stationary, or nearly stationary, when imaged from the same direction over time.

Such imaged objects include, for example, a stage for a concert, a stadium where an event such as a sporting competition is held, structures such as the goals used in ball games, and the field. In any case, the background is a region at least different from the foreground subjects. The imaging targets may include other objects in addition to subjects and background.

FIG. 1 is a diagram showing the image processing system 100 of this embodiment. Some of the functional blocks shown in FIG. 1 are realized by having a computer included in the image processing system 100 execute a computer program stored in a memory serving as a storage medium. Some or all of them may, however, be realized in hardware. A dedicated circuit (ASIC), a processor (reconfigurable processor, DSP), or the like can be used as the hardware.

The functional blocks of the image processing system 100 need not be housed in the same enclosure; they may be separate devices connected to one another via signal paths. The image processing system 100 is connected to a plurality of cameras 1. The image processing system 100 includes a shape estimation unit 2, an image generation unit 3, a content generation unit 4, a storage unit 5, a display unit 115, an operation unit 116, and so on. The shape estimation unit 2 is connected to the cameras 1 and to the image generation unit 3, and the display unit 115 is connected to the image generation unit 3. Each functional block may be implemented in a separate device, or all or some of the functional blocks may be implemented in the same device.

The cameras 1 are placed at different positions around a concert stage, a stadium where an event such as a sporting competition is held, a structure such as a goal used in a ball game, a field, or the like, and each captures images from a different viewpoint. Each camera has an identification number (camera number) that identifies it. A camera 1 may also include other functions, such as extracting a foreground image from a captured image, and the hardware (circuits, devices, and the like) that implements them. Camera numbers may be assigned based on the installation positions of the cameras 1 or on some other criterion.

The image processing system 100 may be located inside the venue where the cameras 1 are installed, or outside the venue, for example at a broadcasting station. The image processing system 100 is connected to the cameras 1 via a network.

The shape estimation unit 2 acquires images from the cameras 1 and, based on those images, estimates the three-dimensional shape of the subject. Specifically, the shape estimation unit 2 generates three-dimensional shape data expressed in a known representation. The three-dimensional shape data may be point cloud data composed of points, mesh data composed of polygons, or voxel data composed of voxels.
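As one concrete illustration of voxel data and point cloud data, the sketch below carves a voxel grid from two silhouettes and converts the result into points. Shape-from-silhouette with two orthographic views is an assumption chosen for brevity; the disclosure does not specify the estimation method, and the function names are hypothetical.

```python
def carve(size, silhouette_front, silhouette_side):
    """Keep voxel (x, y, z) only if it projects inside both silhouettes.

    The front view looks along z and sees pixel [y][x];
    the side view looks along x and sees pixel [y][z].
    """
    return {
        (x, y, z)
        for x in range(size)
        for y in range(size)
        for z in range(size)
        if silhouette_front[y][x] and silhouette_side[y][z]
    }


def voxels_to_point_cloud(voxels, voxel_size=1.0):
    """Represent each occupied voxel by the coordinates of its center."""
    return sorted(
        ((x + 0.5) * voxel_size, (y + 0.5) * voxel_size, (z + 0.5) * voxel_size)
        for x, y, z in voxels
    )


# Front view: a vertical bar at x = 1. Side view: a horizontal bar at y = 1.
front = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
side = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]

voxels = carve(3, front, side)
points = voxels_to_point_cloud(voxels)
```

Only the voxels consistent with both views survive: a row of three voxels at x = 1, y = 1. Mesh data would add polygons connecting surface points, which this sketch does not attempt.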

The image generation unit 3 acquires, from the shape estimation unit 2, information indicating the position and orientation of the subject's three-dimensional shape data, and can generate a virtual viewpoint image that includes the two-dimensional shape of the subject as it appears when the three-dimensional shape is viewed from a virtual viewpoint. To generate a virtual viewpoint image, the image generation unit 3 can also accept a designation of virtual viewpoint information (such as the position of the virtual viewpoint and the line-of-sight direction from it) from the operator and generate the virtual viewpoint image based on that information. Here, the image generation unit 3 functions as virtual viewpoint image generation means that generates a virtual viewpoint image based on a plurality of images obtained from a plurality of cameras.

The virtual viewpoint image is sent to the content generation unit 4, which generates, for example, three-dimensionally shaped digital content as described later. The digital content generated by the content generation unit 4, which includes the virtual viewpoint image, is output to the display unit 115.

The content generation unit 4 can also receive images directly from the cameras and supply each camera's images to the display unit 115. Based on instructions from the operation unit 116, it can also switch which surface of the virtual three-dimensional image displays each camera's image and which displays the virtual viewpoint image.

The display unit 115 consists of, for example, a liquid crystal display or LEDs. It acquires digital content including a virtual viewpoint image from the content generation unit 4 and displays it. It also displays a GUI (Graphical User Interface) and the like with which the operator operates each camera 1.

The operation unit 116 consists of a joystick, jog dial, touch panel, keyboard, mouse, and the like, and is used by the operator to operate the cameras 1 and so on. The operator also uses the operation unit 116 to select the images displayed on the surfaces of the digital content (three-dimensional image) generated by the content generation unit 4. In addition, it can be used to specify the position and orientation of the virtual viewpoint with which the image generation unit 3 generates a virtual viewpoint image.

The position and orientation of the virtual viewpoint may be specified directly on the screen through the operator's operation instructions. Alternatively, when the operator designates a particular subject on the screen, that subject may be recognized and tracked in the images, and virtual viewpoint information for a viewpoint located at the subject, or for viewpoints at positions on an arc around and centered on the subject, may be specified automatically.

Furthermore, a subject that satisfies conditions specified in advance through the operator's operation instructions may be recognized in the images, and virtual viewpoint information for a viewpoint located at that subject, or for viewpoints at positions on an arc around and centered on it, may be specified automatically. The specified conditions include, for example, the name of a particular athlete, the person who took a shot, the person who made an outstanding play, and the position of the ball.
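Placing viewpoints on an arc centered on a tracked subject comes down to basic trigonometry. The sketch below is one possible way to do it under stated assumptions: a z-up coordinate system, a horizontal arc at a fixed height above the subject, and pan and tilt angles chosen so each viewpoint looks at the subject. The function name, parameters, and dictionary keys are hypothetical.

```python
import math


def arc_viewpoints(subject, radius, height, start_deg, end_deg, count):
    """Place `count` viewpoints on a horizontal arc around `subject`.

    Each viewpoint sits `height` above the subject and is aimed at it.
    """
    sx, sy, sz = subject
    views = []
    for i in range(count):
        t = i / (count - 1) if count > 1 else 0.0
        angle = math.radians(start_deg + (end_deg - start_deg) * t)
        px = sx + radius * math.cos(angle)
        py = sy + radius * math.sin(angle)
        pz = sz + height
        # Pan: horizontal heading from the viewpoint toward the subject.
        pan = math.degrees(math.atan2(sy - py, sx - px))
        # Tilt: negative values look downward at the subject.
        tilt = math.degrees(math.atan2(sz - pz, radius))
        views.append({"position": (px, py, pz), "pan_deg": pan, "tilt_deg": tilt})
    return views


# Three viewpoints on a half circle of radius 10, raised 10 above the subject.
views = arc_viewpoints((0.0, 0.0, 0.0), radius=10.0, height=10.0,
                       start_deg=0.0, end_deg=180.0, count=3)
```

In a tracking scenario, `subject` would be updated every frame from the image recognition result, and the arc recomputed around the new position.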

The storage unit 5 includes memory for storing the digital content generated by the content generation unit 4, virtual viewpoint images, camera images, and the like. The storage unit 5 may also have a removable recording medium. The removable recording medium may hold, for example, multi-camera images captured at other venues or of other sports scenes, virtual viewpoint images generated from them, and digital content generated by combining them.

The storage unit 5 may also be able to store multi-camera images downloaded over a network from an external server or the like, virtual viewpoint images generated from them, and digital content generated by combining them. These camera images, virtual viewpoint images, and digital content may have been created by a third party.

FIG. 2 is a diagram showing the hardware configuration of the image processing system 100 according to Embodiment 1, which is described below with reference to FIG. 2.

The image processing system 100 has a CPU 111, a ROM 112, a RAM 113, an auxiliary storage device 114, the display unit 115, the operation unit 116, a communication I/F 117, a bus 118, and so on. The CPU 111 controls the entire image processing system 100 using computer programs and the like stored in the ROM 112, the RAM 113, the auxiliary storage device 114, and so on, thereby realizing the functional blocks of the image processing system shown in FIG. 1.

The RAM 113 temporarily stores computer programs and data supplied from the auxiliary storage device 114, data supplied from outside via the communication I/F 117, and the like. The auxiliary storage device 114 consists of, for example, a hard disk drive, and stores various data such as image data, audio data, and digital content from the content generation unit 4 that includes virtual viewpoint images.

As described above, the display unit 115 displays digital content including virtual viewpoint images, the GUI, and the like. As described above, the operation unit 116 receives operation input from the operator and inputs various instructions to the CPU 111. The CPU 111 operates as a display control unit that controls the display unit 115 and as an operation control unit that controls the operation unit 116.

The communication I/F 117 is used for communication with devices external to the image processing system 100 (for example, the cameras 1 and external servers). When the image processing system 100 is connected to an external device by wire, a communication cable is connected to the communication I/F 117. When the image processing system 100 has a function for communicating wirelessly with external devices, the communication I/F 117 includes an antenna. The bus 118 connects the parts of the image processing system 100 and carries information between them.

Although this embodiment shows an example in which the display unit 115 and the operation unit 116 are included inside the image processing system 100, at least one of them may exist as a separate device outside the image processing system 100. The image processing system 100 may take the form of, for example, a PC terminal.

FIG. 3 is a flowchart for explaining the operation flow of the image processing system 100 of Embodiment 1. FIGS. 4(A) to 4(C) are diagrams showing examples of the three-dimensionally shaped digital content generated by the content generation unit 4 in Embodiment 1.

The operation of each step in the flowchart of FIG. 3 is performed by the CPU 111, acting as the computer of the image processing system 100, executing a computer program stored in a memory such as the ROM 112 or the auxiliary storage device 114.

In this embodiment, the image processing system 100 is installed at a broadcasting station or the like, and may produce cube-shaped digital content 200 as shown in FIG. 4(A) and broadcast it, or provide it via the Internet. In doing so, an NFT can be attached to the digital content 200.

That is, to increase asset value, the content can be given scarcity, for example by limiting the quantity distributed and managing each copy by serial number. NFT stands for non-fungible token, a token issued and circulated on a blockchain. Examples of NFT formats include the token standards known as ERC-721 and ERC-1155. Tokens are typically stored in association with a wallet managed by the operator.
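The scarcity mechanism described here, a capped quantity with serial numbers, can be sketched independently of any blockchain. The code below is a plain in-memory illustration under that assumption: it does not mint tokens or call any ERC-721 or ERC-1155 contract, and the class names `Edition` and `LimitedRelease` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edition:
    content_id: str
    serial: int  # 1-based serial number of this copy
    total: int   # total number of copies that will ever be issued

    def label(self):
        return f"{self.serial}/{self.total}"


class LimitedRelease:
    """Issues at most `total` serial-numbered copies of one piece of content."""

    def __init__(self, content_id, total):
        if total < 1:
            raise ValueError("total must be at least 1")
        self.content_id = content_id
        self.total = total
        self._issued = 0

    def issue(self):
        if self._issued >= self.total:
            raise RuntimeError("all editions have been issued")
        self._issued += 1
        return Edition(self.content_id, self._issued, self.total)


release = LimitedRelease("goal-scene", total=3)
editions = [release.issue() for _ in range(3)]
labels = [e.label() for e in editions]
```

A fourth call to `release.issue()` raises `RuntimeError`, which is what enforces the cap. In an actual NFT deployment the serial number and cap would be recorded by the token contract rather than by a local object.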

In step S31, the CPU 111 associates the main camera image (first image) with the first surface 201 of the three-dimensional digital content 200, for example as shown in FIG. 4(A). The main camera image associated with the first surface 201 may be displayed so that the operator can check it. Furthermore, as shown in FIG. 4(A), when the line-of-sight direction from the viewpoint from which the digital content is virtually viewed (specifically, the direction perpendicular to the page of FIG. 4(A)) is not parallel to the normal direction of the first surface 201, the following display may be performed. That is, the main camera image displayed on the first surface 201 may be generated by applying a projective transformation according to the angle of the normal direction of the first surface 201 with respect to the display surface of the digital content. Here, the main camera image (main image, first image) is the image selected for TV broadcasting or the like from among the plurality of images obtained from the plurality of cameras installed at the sports venue. The main image is an image that includes a predetermined subject within its angle of view. The main camera image does not have to be captured by a camera installed at the sports venue. For example, it may be an image captured by a handheld camera brought in by a photographer. Alternatively, it may be an image captured by a camera brought in by a spectator at the venue, or by an electronic device equipped with a camera, such as a smartphone. Further, the camera that captures the main camera image may be one of the plurality of cameras used to generate the virtual viewpoint image, or may be a camera not included in that plurality of cameras.
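How the angle of the surface normal leads to a projective distortion can be sketched as follows. This is a minimal illustration, assuming a square surface of side 2 rotated about the vertical axis and a simple pinhole viewer; the function name `project_face_corners` and the parameters `distance` and `focal` are hypothetical and do not appear in the disclosure. The image would then be warped so that its corners land on the projected quadrilateral.

```python
import math


def project_face_corners(theta_deg, distance=5.0, focal=1.0):
    """Project the corners of a 2x2 square surface rotated by theta about the vertical (y) axis.

    The viewer is assumed to sit at z = distance and to look toward -z (pinhole model).
    Corners are returned in the order: bottom-left, bottom-right, top-right, top-left.
    """
    t = math.radians(theta_deg)
    corners = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
    projected = []
    for x, y in corners:
        # Rotate the surface point (x, y, 0) about the y axis.
        rotated_x = x * math.cos(t)
        rotated_z = -x * math.sin(t)
        depth = distance - rotated_z  # distance from the viewer along the viewing axis
        projected.append((focal * rotated_x / depth, focal * y / depth))
    return projected


flat = project_face_corners(0.0)     # surface normal parallel to the line of sight
tilted = project_face_corners(60.0)  # surface turned by 60 degrees
```

With the surface turned, the edge nearer the viewer projects taller than the far edge, so the square becomes a trapezoid; mapping the rectangular main camera image onto that trapezoid is a projective transformation.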

どのカメラの画像をメイン画像として放送或いはネットで配信するかは放送局等のオペレータが操作部１１６を用いて逐次選択する。例えばゴールの瞬間を放映、配信する場合にはゴール近傍のカメラからの画像をメイン画像として放送する場合が多い。 An operator of a broadcasting station or the like uses the operation unit 116 to sequentially select which camera's image is to be broadcast or distributed over the Internet as the main image. For example, when broadcasting or distributing the moment of a goal, an image from a camera near the goal is often broadcast as the main image.

In this embodiment, as shown in FIGS. 4(A) to 4(C), the surface visible on the left is the first surface, the surface visible on the right is the second surface, and the surface visible on top is the third surface. However, the arrangement is not limited to this; which surfaces serve as the first to third surfaces can be set arbitrarily in advance.

In step S32, the content generation unit 4 associates accompanying data with the third surface 203 of the digital content 200, for example the name of the player who took the shot on goal, the name of that player's team, and the final result of the match in which the goal was scored. The accompanying data associated with the third surface 203 may be displayed so that the operator can check it. When an NFT is attached, data indicating rarity, such as the number of copies issued, may be displayed on the third surface 203 as accompanying data. The number of copies issued may be decided by the operator who uses the image generation system to generate the digital content, or may be decided automatically by the image generation system.

In step S33, the image generation unit 3 acquires, from among the images from the plurality of cameras 1, an image whose viewpoint direction differs by a predetermined angle (for example, 90 degrees) from the viewpoint direction of the camera that captures the main camera image and that includes, for example, the goal or the shooter. Because the positions, orientations, and so on of the plurality of cameras are known in advance, the CPU 111 can determine from which camera such an image, whose viewpoint direction differs from that of the main camera image by the predetermined angle, can be obtained. In the following, the expression "viewpoint of an image" refers either to the viewpoint of the camera that captures the image or to the virtual viewpoint designated for generating the image.
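Because the camera orientations are known in advance, the choice of a camera at the predetermined angle can be sketched as below. This is a minimal sketch under assumptions: camera orientations are given as 3D direction vectors, the names `angle_between_deg` and `pick_camera` and the camera ids are hypothetical, and the additional condition that the image contains the goal or the shooter is not modeled.

```python
import math


def angle_between_deg(a, b):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    cos_value = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos_value))


def pick_camera(main_dir, camera_dirs, target_deg=90.0):
    """Return the id of the camera whose direction is closest to target_deg away from main_dir."""
    return min(
        camera_dirs,
        key=lambda cam_id: abs(angle_between_deg(main_dir, camera_dirs[cam_id]) - target_deg),
    )


# Hypothetical camera directions; the main camera looks along +x.
cameras = {
    "cam_a": (1.0, 0.0, 0.0),   # same direction as the main camera
    "cam_b": (0.0, 1.0, 0.0),   # 90 degrees away
    "cam_c": (-1.0, 0.2, 0.0),  # roughly opposite
}
selected = pick_camera((1.0, 0.0, 0.0), cameras)
```

The predetermined angle is a parameter (`target_deg`), in line with the statement that the angle can be set in advance.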

Alternatively, in step S33, the image generation unit 3 may acquire a virtual viewpoint image from a predetermined virtual viewpoint (for example, one whose viewpoint direction differs by 90 degrees as described above) that includes the subject identified by image recognition. In that case, the image may be acquired by designating the predetermined virtual viewpoint (whose viewpoint direction, that is, orientation, differs by 90 degrees as described above) to the image generation unit 3 and having it generate the virtual viewpoint image.

或いは、予め画像生成部３で予め複数の視点の仮想視点画像を生成しておき、その中から該当するものを選択することで取得しても良い。尚、本実施形態ではメインカメラ画像に対して所定角度だけ視点の異なる画像を例えば９０度視点の異なる画像としているが、この角度は予め設定できるものとする。 Alternatively, virtual viewpoint images of a plurality of viewpoints may be generated in advance by the image generation unit 3, and a corresponding one may be selected from among the virtual viewpoint images to obtain the virtual viewpoint images. Note that in this embodiment, an image whose viewpoint differs from the main camera image by a predetermined angle is, for example, an image whose viewpoint differs by 90 degrees, but this angle can be set in advance.

又、仮想視点画像は、メインカメラ画像に含まれる被写体の向き（例えば、人物であれば顔や体の向き）に基づいて特定される仮想視点に対応する画像であってもよい。なお、メインカメラ画像に含まれる被写体が複数の場合には、そのうちの一つの被写体に対して、仮想視点を設定してもよいし、複数の被写体に対して仮想視点を設定してもよい。 Further, the virtual viewpoint image may be an image corresponding to a virtual viewpoint specified based on the orientation of the subject included in the main camera image (for example, the orientation of the face or body in the case of a person). Note that when there are multiple subjects included in the main camera image, a virtual viewpoint may be set for one of the subjects, or a virtual viewpoint may be set for a plurality of subjects.

The above describes an example in which a viewpoint at a predetermined angle to the main image is selected. However, a virtual viewpoint image from one predetermined viewpoint may instead be selected and acquired from among, for example, the subject viewpoint, a viewpoint from behind the subject, and a virtual viewpoint at a position on an arc centered on the subject.

被写体視点とは、被写体の位置を仮想視点の位置として、被写体の向きを仮想視点からの視線方向とする仮想視点である。例えば、人物を被写体とする場合、被写体視点は、人物の顔の位置を仮想視点の位置とし、人物の顔の向きを仮想視点からの視線方向とした視点である。あるいは、その人物の視線方向を仮想視点からの視線方向としてもよい。 The subject viewpoint is a virtual viewpoint in which the position of the subject is the position of the virtual viewpoint, and the orientation of the subject is the line of sight direction from the virtual viewpoint. For example, when the subject is a person, the subject viewpoint is a viewpoint in which the position of the person's face is the position of the virtual viewpoint, and the direction of the person's face is the line of sight from the virtual viewpoint. Alternatively, the line-of-sight direction of the person may be set as the line-of-sight direction from the virtual viewpoint.

A viewpoint from behind the subject is a virtual viewpoint whose position is a predetermined distance behind the subject and whose line-of-sight direction is the direction from that position toward the position of the subject. Alternatively, the line-of-sight direction from the virtual viewpoint may be determined according to the orientation of the subject. For example, when the subject is a person, the viewpoint from behind the subject is a virtual viewpoint whose position is a certain distance behind and a certain distance above the person's back, and whose line-of-sight direction is the direction in which the person's face is pointing.
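The subject viewpoint and the viewpoint from behind the subject can be sketched as follows. This is only an illustration under assumptions: z is taken as the vertical axis, the offsets `back` and `up` are arbitrary example values, and the function names are hypothetical.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def subject_viewpoint(face_pos, face_dir):
    """Viewpoint located at the person's face, looking where the face points."""
    return face_pos, normalize(face_dir)


def behind_viewpoint(subject_pos, facing_dir, back=3.0, up=1.0):
    """Viewpoint 'back' units behind and 'up' units above the subject.

    The line of sight follows the subject's facing direction (z is assumed vertical).
    """
    f = normalize(facing_dir)
    pos = (
        subject_pos[0] - back * f[0],
        subject_pos[1] - back * f[1],
        subject_pos[2] - back * f[2] + up,
    )
    return pos, f


# A person at (10, 5, 0) facing along +x.
pos, direction = behind_viewpoint((10.0, 5.0, 0.0), (2.0, 0.0, 0.0))
```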

A virtual viewpoint at a position on an arc centered on the subject is a virtual viewpoint whose position lies on the spherical surface defined by a predetermined radius centered on the position of the subject, and whose line-of-sight direction is the direction from that position toward the position of the subject.

For example, when the subject is a person, the position of the virtual viewpoint is a position on the spherical surface defined by a predetermined radius centered on the person's position, and the line-of-sight direction from the virtual viewpoint is the direction from that position toward the position of the subject.
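A viewpoint on the sphere around the subject can be sketched as below. This is a minimal sketch assuming the position on the sphere is parameterized by azimuth and elevation angles with z as the vertical axis; that parameterization and the name `orbit_viewpoint` are assumptions introduced for illustration.

```python
import math


def orbit_viewpoint(center, radius, azimuth_deg, elevation_deg):
    """Return (position, line-of-sight) for a viewpoint on a sphere around `center`.

    The line of sight is the unit vector pointing from the viewpoint to the center.
    `radius` must be positive.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    pos = (
        center[0] + radius * math.cos(el) * math.cos(az),
        center[1] + radius * math.cos(el) * math.sin(az),
        center[2] + radius * math.sin(el),
    )
    direction = tuple((c - p) / radius for c, p in zip(center, pos))
    return pos, direction


# A viewpoint 5 units from a subject at the origin, at azimuth 90 degrees and ground level.
pos, direction = orbit_viewpoint((0.0, 0.0, 0.0), 5.0, 90.0, 0.0)
```

Here the viewpoint lies at approximately (0, 5, 0) and looks along approximately (0, -1, 0), that is, toward the subject.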

ステップＳ３３は、このように、第１の画像と所定の関係を有する視点の仮想視点画像を取得して第２の画像とする、仮想視点画像生成ステップとして機能している。尚、ここで、第１の画像と所定の関係を有する視点の仮想視点画像は、第１の画像の視点と同じ時刻（撮像タイミング）の仮想視点画像とする。又、本実施形態では、第１の画像と所定の関係を有する前記視点は、上記のように第１の画像の視点と所定の角度関係又は所定の位置関係にある視点とする。 Step S33 thus functions as a virtual viewpoint image generation step that acquires a virtual viewpoint image of a viewpoint that has a predetermined relationship with the first image and uses it as a second image. Here, it is assumed that the virtual viewpoint image of a viewpoint that has a predetermined relationship with the first image is a virtual viewpoint image at the same time (imaging timing) as the viewpoint of the first image. Further, in this embodiment, the viewpoint having a predetermined relationship with the first image is a viewpoint having a predetermined angular relationship or a predetermined positional relationship with the viewpoint of the first image, as described above.

Then, in step S34, the CPU 111 associates the second image with the second surface 202 of the digital content 200. The second image may be displayed so that the operator can check it. At this time, synchronization is controlled so that the main image associated with the first surface 201 and the second image associated with the second surface 202 are images captured at the same time (time code), as described above. In this way, steps S31 to S34 associate the first image with the first surface, described later, of the three-dimensional digital content, and associate the virtual viewpoint image of a virtual viewpoint having a predetermined relationship with the first image with the second surface 202. Steps S31 to S34 also function as a content generation step (content generation means).
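The time-code synchronization between the two surfaces can be illustrated with a short sketch. It assumes, purely for illustration, that each image source is a dictionary keyed by time code; the function name `synchronized_pair` and the keys `surface_201` and `surface_202` are hypothetical.

```python
def synchronized_pair(main_frames, second_frames, timecode):
    """Return the frames for the first and second surfaces captured at the same time code.

    Both arguments map a time code string to a frame (any object).
    """
    first = main_frames.get(timecode)
    second = second_frames.get(timecode)
    if first is None or second is None:
        raise KeyError(f"no synchronized pair for time code {timecode}")
    return {"surface_201": first, "surface_202": second}


main_frames = {"00:12:03:10": "main_f310", "00:12:03:11": "main_f311"}
virtual_frames = {"00:12:03:10": "virtual_f310", "00:12:03:11": "virtual_f311"}
pair = synchronized_pair(main_frames, virtual_frames, "00:12:03:11")
```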

Next, in step S35, the CPU 111 determines whether an operation to change the viewpoint of the second image displayed on the second surface 202 has been performed via the operation unit 116. That is, while watching the constantly changing sports scene, the operator may change the viewpoint of the second image to be displayed on the second surface, for example by selecting the camera image of a desired viewpoint from among the plurality of cameras 1.

或いは、仮想視点の中から所望の視点を画像生成部３に対して指定することによってその視点からの仮想視点画像を取得する場合がある。ステップＳ３５では、そのような視点変更操作がなされた場合には、Ｙｅｓとなり、ステップＳ３６に進む。 Alternatively, by specifying a desired viewpoint among the virtual viewpoints to the image generation unit 3, a virtual viewpoint image from that viewpoint may be obtained. In step S35, if such a viewpoint changing operation has been performed, the answer is Yes, and the process proceeds to step S36.

In step S36, the CPU 111 selects the viewpoint image after the viewpoint change from among the plurality of cameras 1, or acquires the virtual viewpoint image after the viewpoint change from the image generation unit 3. The virtual viewpoint image may be acquired by obtaining a virtual viewpoint image generated in advance, or by newly generating a virtual viewpoint image based on the changed viewpoint. The acquired image is then set as the second image, and the process moves to step S34, where it is associated with the second surface. In this state, the first image, the second image, and the accompanying data are associated with the first to third surfaces of the digital content 200, respectively. Here, the operator may confirm this state on the display unit 115. In that case, the surface numbers may also be displayed so that it is clear which surfaces are the first to third surfaces.

ステップＳ３５で視点変更がない場合にはステップＳ３７に進み、ＣＰＵ１１１は、デジタルコンテンツ２００にＮＦＴを付与するか否か判別する。そのために例えば表示部１１５にデジタルコンテンツ２００にＮＦＴを付与するか否かを問う表示画像（ＧＵＩ）を表示する。そして、オペレータがＮＦＴを付与すると選択した場合には、それを判別してステップＳ３８に進み、デジタルコンテンツ２００にＮＦＴを付与して、暗号化してからステップＳ３９に進む。 If there is no viewpoint change in step S35, the process proceeds to step S37, and the CPU 111 determines whether or not to add an NFT to the digital content 200. For this purpose, for example, a display image (GUI) is displayed on the display unit 115 asking whether or not to add NFT to the digital content 200. If the operator selects to add an NFT, it is determined and the process proceeds to step S38, where the digital content 200 is provided with an NFT and encrypted, and then the process proceeds to step S39.

ステップＳ３７でＮｏと判別された場合には、そのままステップＳ３９に進む。 If the determination in step S37 is No, the process directly advances to step S39.

尚、ステップＳ３７におけるデジタルコンテンツ２００は、図４（Ｂ）、（Ｃ）のような形状の立体画像であっても良い。又、多面体の場合に、図４（Ａ）のような６面体に限定されず例えば８面体などであっても良い。 Note that the digital content 200 in step S37 may be a stereoscopic image having a shape as shown in FIGS. 4(B) and 4(C). Further, in the case of a polyhedron, it is not limited to a hexahedron as shown in FIG. 4(A), but may be an octahedron, for example.

ステップＳ３９において、ＣＰＵ１１１は、図３のデジタルコンテンツ２００を生成するためのフローを終了するか否か判別する。そして、オペレータが操作部１１６を操作して終了にしていなければ、ステップＳ３１に戻って上記の処理を繰り替えし、終了であれば図３のフローを終了する。尚、オペレータが操作部１１６を操作して終了にしていなくても、操作部１１６の最後の操作から所定期間（例えば３０分）経過したら自動的に終了しても良い。 In step S39, the CPU 111 determines whether to end the flow for generating the digital content 200 in FIG. 3. If the operator has not operated the operation unit 116 to end the process, the process returns to step S31 and repeats the above process, and if the process has ended, the flow in FIG. 3 ends. Note that even if the operator does not operate the operation unit 116 to terminate the process, the process may be automatically terminated after a predetermined period (for example, 30 minutes) has elapsed since the last operation of the operation unit 116.

尚、図４（Ｂ）、（Ｃ）はデジタルコンテンツ２００の変形例を示した図であり、図４（Ｂ）は図４（Ａ）のデジタルコンテンツ２００を球としたものである。球２００の例えば正面から見て左側の球面である第１面２０１に第１画像を表示し、右側の球面である第２面２０２に第２画像を表示する。又、上側の球面である第３面２０３に上記のような付随データを表示する。 Note that FIGS. 4(B) and 4(C) are diagrams showing modified examples of the digital content 200, and FIG. 4(B) shows the digital content 200 of FIG. 4(A) in the form of a sphere. A first image is displayed on the first surface 201, which is the spherical surface on the left side of the sphere 200 when viewed from the front, and a second image is displayed on the second surface 202, which is the spherical surface on the right side. Additionally, the above-mentioned accompanying data is displayed on the third surface 203, which is the upper spherical surface.

図４（Ｃ）は図４（Ａ）のデジタルコンテンツ２００の各平面を所望の曲率の曲面にした例を示す図である。このように、本実施形態におけるデジタルコンテンツの表示は図４（Ｂ）、（Ｃ）のような球体や、表面が球面の立方体などを使って画像を表示するものであっても良い。 FIG. 4C is a diagram showing an example in which each plane of the digital content 200 in FIG. 4A is a curved surface with a desired curvature. In this way, the digital content in this embodiment may be displayed using a sphere as shown in FIGS. 4(B) and 4(C), a cube with a spherical surface, or the like.

(Embodiment 2)
Next, Embodiment 2 will be described with reference to FIG. 5.

図５は、実施形態２の画像処理システム１００の動作フローを説明するためのフローチャートである。尚、画像処理システム１００のコンピュータとしてのＣＰＵ１１１が例えばＲＯＭ１１２や補助記憶装置１１４等のメモリに記憶されたコンピュータプログラムを実行することによって図５のフローチャートの各ステップの動作が行われる。 FIG. 5 is a flowchart for explaining the operation flow of the image processing system 100 according to the second embodiment. Note that each step in the flowchart of FIG. 5 is performed by the CPU 111 as a computer of the image processing system 100 executing a computer program stored in a memory such as the ROM 112 or the auxiliary storage device 114.

尚、図５において図３と同じ符号のステップは同じ処理であり説明を省略する。 Note that in FIG. 5, steps with the same reference numerals as in FIG. 3 are the same processes, and a description thereof will be omitted.

図５のステップＳ５１において、ＣＰＵ１１１は、オペレータにより指定された視点のカメラ画像又はオペレータにより指定された仮想視点からの仮想視点画像を画像生成部３から取得する。そして、取得された画像を第２の画像とする。それ以外は図３のフローと同じである。 In step S51 in FIG. 5, the CPU 111 obtains from the image generation unit 3 a camera image from a viewpoint specified by the operator or a virtual viewpoint image from a virtual viewpoint specified by the operator. The acquired image is then set as a second image. Other than that, the flow is the same as that of FIG. 3.

実施形態１では、メイン画像（第１の画像）に対して所定の関係を有する（所定角度異なる）第２画像を取得するようにしていた。しかし、実施形態２では、オペレータが所望のカメラを選択するか、所望の被写体について所望の視点の仮想視点画像を取得して第２画像としている。 In the first embodiment, a second image having a predetermined relationship (different by a predetermined angle) from the main image (first image) is acquired. However, in the second embodiment, the operator selects a desired camera or obtains a virtual viewpoint image of a desired viewpoint of a desired subject as the second image.

尚、ステップＳ５１でオペレータにより選択されるカメラ画像又は仮想視点画像は、例えばスポーツ会場に対して斜め上方の俯瞰的な視点の画像或いは斜め下方からの視点の画像などを含む。このように、実施形態２においては、第２面に表示される仮想視点画像をオペレータが選択可能である。 Note that the camera image or virtual viewpoint image selected by the operator in step S51 includes, for example, an image from a bird's-eye view diagonally above the sports venue, or an image from a viewpoint diagonally from below. In this way, in the second embodiment, the operator can select the virtual viewpoint image displayed on the second surface.

又、ステップＳ５１でオペレータにより選択される仮想視点画像は、例えばズームアウトしたような被写体から離れた位置からの視点の仮想視点画像であっても良い。 Further, the virtual viewpoint image selected by the operator in step S51 may be a virtual viewpoint image taken from a position distant from the subject, such as a zoomed-out virtual viewpoint image.

Camera images generated in the past, virtual viewpoint images generated from them, and the like may also be stored in the storage unit 5, read out, and displayed on the first to third surfaces as the first image, the second image, and the accompanying data, respectively.

Between step S38 and step S39, a step may be inserted in which the CPU 111 automatically switches to a default stereoscopic image display when, for example, a predetermined period (for example, 30 minutes) has elapsed since the last operation of the operation unit 116. In the default stereoscopic image display, for example, the main image is displayed on the first surface, the accompanying data on the third surface, and a camera image or virtual viewpoint image from, for example, the viewpoint used most frequently according to past statistics on the second surface.
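The inactivity check and the default layout can be sketched as follows. This is a minimal sketch; the 30-minute value is the example given above, while the names `is_idle` and `default_layout` and the dictionary keys are hypothetical.

```python
from datetime import datetime, timedelta

IDLE_TIMEOUT = timedelta(minutes=30)  # example period from the description


def is_idle(last_operation, now, timeout=IDLE_TIMEOUT):
    """True when at least `timeout` has elapsed since the last operation."""
    return now - last_operation >= timeout


def default_layout(main_image, frequent_view_image, accompanying_data):
    """Default display: main image, most frequently used view, and accompanying data."""
    return {
        "first_surface": main_image,
        "second_surface": frequent_view_image,
        "third_surface": accompanying_data,
    }


last_op = datetime(2023, 3, 13, 12, 0)
layout = None
if is_idle(last_op, datetime(2023, 3, 13, 12, 45)):
    layout = default_layout("main.png", "frequent_view.png", {"player": "example"})
```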

(Embodiment 3)
Embodiment 3 will be described with reference to FIGS. 6 to 8. FIG. 6 is a flowchart for explaining the operation flow of the image processing system 100 of Embodiment 3, FIG. 7 is a flowchart continuing from FIG. 6, and FIG. 8 is a flowchart continuing from FIGS. 6 and 7. The operation of each step in the flowcharts of FIGS. 6 to 8 is performed by the CPU 111, serving as the computer of the image processing system 100, executing a computer program stored in a memory such as the ROM 112 or the auxiliary storage device 114.

実施形態３では、オペレータが仮想視点数を１～３の中から選択すると、それに応じてデジタルコンテンツ２００の第１～第３面の表示を自動的に切り替える。 In the third embodiment, when the operator selects the number of virtual viewpoints from 1 to 3, the display of the first to third surfaces of the digital content 200 is automatically switched accordingly.

ステップＳ６１において、オペレータが仮想視点の数を１～３の中から選択し、それをＣＰＵ１１１は受け付ける。ステップＳ６２において、ＣＰＵ１１１は、選択された数の仮想視点画像を画像生成部３から取得する。 In step S61, the operator selects the number of virtual viewpoints from 1 to 3, and the CPU 111 accepts it. In step S62, the CPU 111 acquires the selected number of virtual viewpoint images from the image generation unit 3.

At this time, representative virtual viewpoints are selected automatically. That is, the scene is analyzed and, for that scene, the virtual viewpoint used most frequently according to, for example, past statistics is set as the first virtual viewpoint, the next most frequently used virtual viewpoint as the second virtual viewpoint, and the one after that as the third virtual viewpoint. Alternatively, the viewpoints may be set in advance so that the second virtual viewpoint differs in angle from the first virtual viewpoint by, for example, +90°, and the third virtual viewpoint by, for example, -90°. Here, +90° and -90° are examples, and the angles are not limited to these.
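Both ways of choosing the representative viewpoints can be sketched briefly. This is only an illustration: the usage history is assumed to be a list of viewpoint labels, angles are assumed to be azimuths in degrees, and the names `representative_viewpoints` and `offset_viewpoints` are hypothetical.

```python
from collections import Counter


def representative_viewpoints(past_choices, count):
    """Return up to `count` viewpoint labels, most frequently used first."""
    return [label for label, _ in Counter(past_choices).most_common(count)]


def offset_viewpoints(first_azimuth_deg, offsets=(0.0, 90.0, -90.0)):
    """Return azimuths of the first, second, and third viewpoints using preset angular offsets."""
    return [(first_azimuth_deg + d) % 360.0 for d in offsets]


# Hypothetical usage history for one kind of scene.
history = ["behind_goal", "side", "behind_goal", "overhead", "side", "behind_goal"]
top = representative_viewpoints(history, 3)
angles = offset_viewpoints(30.0)
```

In this example `top` is `["behind_goal", "side", "overhead"]` and `angles` is `[30.0, 120.0, 300.0]`.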

In step S63, the CPU 111 determines whether the number of selected virtual viewpoints is one, and if so, the process proceeds to step S64. In step S64, the CPU 111 acquires the main image from the main camera among the plurality of cameras 1 and associates it with the first surface 201 of the digital content 200.

Then, in step S65, the CPU 111 associates the accompanying data with the third surface 203 of the digital content 200. The accompanying data may be the same kind of data as that displayed in step S32 of FIG. 3 in Embodiment 1, for example the name of the player who took the shot on goal.

ステップＳ６６において、ＣＰＵ１１１は、デジタルコンテンツ２００の第２面に前述の第１の仮想視点からの第１の仮想視点画像を対応付け、その後、図８のステップＳ８１に進む。 In step S66, the CPU 111 associates the first virtual viewpoint image from the first virtual viewpoint with the second surface of the digital content 200, and then proceeds to step S81 in FIG. 8.

ステップＳ６３でＮｏと判別された場合には、ステップＳ６７において、ＣＰＵ１１１は、選択された仮想視点の数が２か否か判別し、２であれば、ステップＳ６８に進む。 If the determination in step S63 is No, the CPU 111 determines in step S67 whether the number of selected virtual viewpoints is 2 or not, and if it is 2, the process proceeds to step S68.

In step S68, the CPU 111 associates the accompanying data with the third surface 203 of the digital content 200. The accompanying data may be the same kind of data as that associated in step S65, for example the name of the player who took the shot on goal.

Then, in step S69, the CPU 111 associates the first virtual viewpoint image from the first virtual viewpoint with the first surface 201 of the digital content 200. It also associates the second virtual viewpoint image from the second virtual viewpoint with the second surface 202 of the digital content 200. Thereafter, the process proceeds to step S81 in FIG. 8.

If the determination in step S67 is No, the process proceeds to step S71 in FIG. 7. In step S71, the CPU 111 determines whether the operator has selected to associate accompanying data with the third surface 203. If Yes, the process proceeds to step S72; if No, the process proceeds to step S73.

In step S72, the CPU 111 associates the virtual viewpoint image from the first virtual viewpoint with the first surface 201 of the digital content 200, the virtual viewpoint image from the second virtual viewpoint with the second surface 202, and the virtual viewpoint image from the third virtual viewpoint with the third surface. Thereafter, the process proceeds to step S81 in FIG. 8.

If the determination in step S71 is No, then in step S73 the CPU 111 associates the accompanying data with the third surface 203 of the digital content 200. The accompanying data may be the same kind of data as that associated in step S65, for example the name of the player who took the shot on goal.

Then, in step S74, the CPU 111 associates the first virtual viewpoint image from the first virtual viewpoint with the first surface 201 of the digital content 200. In step S75, the CPU 111 associates the second virtual viewpoint image from the second virtual viewpoint and the third virtual viewpoint image from the third virtual viewpoint with the second surface 202 of the digital content 200 so that they can be displayed side by side. That is, the second surface 202 is divided into two regions, one displaying the second virtual viewpoint image and the other displaying the third virtual viewpoint image, and a virtual viewpoint image is associated with each region. Thereafter, the process proceeds to step S81 in FIG. 8.
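The surface assignments of steps S63 to S75 can be summarized in a short sketch. It is a minimal illustration, not the implementation of the flowchart: the function name `assign_surfaces` and the flag `three_on_three_surfaces` are hypothetical, and the flag merely names which of the two three-viewpoint layouts is produced rather than modeling the decision in step S71 itself.

```python
def assign_surfaces(num_viewpoints, main_image, vp_images, data, three_on_three_surfaces=False):
    """Map surface numbers (201, 202, 203) to the content associated with them."""
    if num_viewpoints == 1:
        # Steps S64 to S66: main image, one virtual viewpoint image, accompanying data.
        return {201: main_image, 202: vp_images[0], 203: data}
    if num_viewpoints == 2:
        # Steps S68 and S69: two virtual viewpoint images plus accompanying data.
        return {201: vp_images[0], 202: vp_images[1], 203: data}
    if num_viewpoints == 3:
        if three_on_three_surfaces:
            # Step S72: one virtual viewpoint image per surface.
            return {201: vp_images[0], 202: vp_images[1], 203: vp_images[2]}
        # Steps S73 to S75: the second surface is split into two regions.
        return {201: vp_images[0], 202: (vp_images[1], vp_images[2]), 203: data}
    raise ValueError("number of virtual viewpoints must be 1, 2, or 3")


views = ["vp1.png", "vp2.png", "vp3.png"]
layout = assign_surfaces(3, "main.png", views, {"player": "example"})
```

In this call the second surface holds the pair `("vp2.png", "vp3.png")`, which corresponds to dividing that surface into two display regions.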

図８のステップＳ８１において、ＣＰＵ１１１は、デジタルコンテンツ２００にＮＦＴを付与するか否か判別する。そのために例えば表示部１１５にデジタルコンテンツ２００にＮＦＴを付与するか否かを問う表示画像（ＧＵＩ）を表示する。そして、オペレータがＮＦＴを付与すると選択した場合には、ステップＳ８２に進み、デジタルコンテンツ２００にＮＦＴを付与し、暗号化してからステップＳ８３に進む。 In step S81 of FIG. 8, the CPU 111 determines whether to add an NFT to the digital content 200. For this purpose, for example, a display image (GUI) that asks whether to add NFT to the digital content 200 is displayed on the display unit 115. If the operator selects to add an NFT, the process proceeds to step S82, where the digital content 200 is provided with an NFT and encrypted, and then the process proceeds to step S83.

ステップＳ８１でＮｏと判別された場合には、そのままステップＳ８３に進む。尚、ステップＳ８１におけるデジタルコンテンツ２００は、前述と同様に、図４（Ｂ）、（Ｃ）のような形状のものでもよい。 If the determination in step S81 is No, the process directly advances to step S83. Note that the digital content 200 in step S81 may have a shape as shown in FIGS. 4(B) and 4(C), as described above.

ステップＳ８３において、ＣＰＵ１１１は、図６～図８のフローを終了するか否か判別し、オペレータが操作部１１６を操作して終了にしていなければ、ステップＳ８４に進む。 In step S83, the CPU 111 determines whether or not to end the flow of FIGS. 6 to 8, and if the operator has not operated the operation unit 116 to end the flow, the process proceeds to step S84.

ステップＳ８４では、ＣＰＵ１１１は、仮想視点の数が変更されていないか判別する。数が変更されていればステップＳ６１に戻る。数が変更されていなければステップＳ６２に戻る。ステップＳ８３でＹｅｓと判別された場合には図６～図８のフローを終了する。 In step S84, the CPU 111 determines whether the number of virtual viewpoints has been changed. If the number has been changed, the process returns to step S61. If the number has not been changed, the process returns to step S62. If the determination in step S83 is Yes, the flow of FIGS. 6 to 8 ends.

Embodiment 3 describes an example in which, when the operator selects the number of virtual viewpoints from 1 to 3, the images associated with the first to third surfaces of the digital content 200 are switched automatically. However, the operator may instead select, from among the images of the plurality of cameras, the number of camera images to be associated with the surfaces constituting the digital content 200. Predetermined cameras may then be selected automatically from among the plurality of cameras 1 accordingly, and their camera images may be associated automatically with the first to third surfaces of the digital content 200. The maximum number of viewpoints need not be three. For example, the number of viewpoints may be determined within a range whose upper limit is the number of surfaces constituting the digital content, or the number of surfaces with which an image can be associated. Furthermore, if a plurality of images can be associated with one surface, the maximum number of viewpoints can be increased further.

Between step S82 and step S83, a step may also be inserted in which the CPU 111 automatically switches to content consisting of the default stereoscopic image display when, for example, a predetermined period (for example, 30 minutes) has elapsed since the last operation of the operation unit 116. In the default stereoscopic image display, for example, the main image is displayed on the first surface, and a camera image or virtual viewpoint image from the viewpoint used most frequently according to past statistics is displayed on the second surface. The third surface shows, for example, the accompanying data.

以上のように実施形態３においては、ステップＳ６９、ステップＳ７２、ステップＳ９５において、第２面に表示させる仮想視点画像とは異なる仮想視点画像を、第１面に対応付けることを可能にしている。 As described above, in the third embodiment, in steps S69, S72, and S95, it is possible to associate a virtual viewpoint image different from the virtual viewpoint image displayed on the second surface with the first surface.

(Embodiment 4)
Next, Embodiment 4 will be described with reference to FIGS. 9 and 10. In this embodiment, the system configuration is the same as the configuration described in Embodiment 1, so its description is omitted. The hardware configuration of the system is also the same as that in FIG. 2, and its description is likewise omitted.

This embodiment concerns a display image (GUI) for displaying, on a user device, the three-dimensional digital content generated by any of the methods of Embodiments 1 to 3. The user device may be, for example, a PC, a smartphone, or a tablet terminal with a touch panel (not shown). In this embodiment, a tablet terminal with a touch panel is used as an example. The GUI is generated by the image processing system 100 and transmitted to the user device. Alternatively, the GUI may be generated by the user device once it has acquired the necessary information.

画像処理システムは、ＣＰＵ、ＲＯＭ、ＲＡＭ、補助記憶装置、表示部、操作部、通信Ｉ／Ｆ、及びバス等を有する（不図示）。ＣＰＵは、ＲＯＭやＲＡＭや補助記憶装置等に記憶されているコンピュータプログラム等を用いて画像処理システムの全体を制御する。 The image processing system includes a CPU, a ROM, a RAM, an auxiliary storage device, a display section, an operation section, a communication I/F, a bus, etc. (not shown). The CPU controls the entire image processing system using computer programs stored in the ROM, RAM, auxiliary storage, and the like.

画像処理システムは、立体形状のデジタルコンテンツから、複数の撮像装置（カメラ）により異なる方向から撮像して取得される撮像画像、指定された仮想視点に基づいた仮想視点映像、仮想視点映像に関連付けられた音情報、撮像画像および仮想視点映像に含まれる被写体に関する情報を特定する。 From the three-dimensional digital content, the image processing system identifies the captured images obtained by imaging from different directions with a plurality of imaging devices (cameras), the virtual viewpoint video based on a specified virtual viewpoint, the sound information associated with the virtual viewpoint video, and the information regarding the subject included in the captured images and the virtual viewpoint video.

本実施形態では、実施形態３にて生成され、仮想視点の数を３つとし、３つの仮想視点画像が第２面に対応付けられる立体形状のデジタルコンテンツを例に説明する。３つの仮想視点画像は映像であり、以下、仮想視点映像と呼称する。本実施形態では、立体形状のデジタルコンテンツを６面体とするが、球体や８面体であってもよい。 The present embodiment will be described using, as an example, a three-dimensional digital content generated in the third embodiment, in which the number of virtual viewpoints is three, and three virtual viewpoint images are associated with the second surface. The three virtual viewpoint images are videos, and are hereinafter referred to as virtual viewpoint videos. In this embodiment, the three-dimensional digital content is a hexahedron, but it may also be a sphere or an octahedron.

仮想視点映像に関連付けられた音情報とは、画像撮像時の会場で取得される音情報である。あるいは、仮想視点に基づいて補正された音情報を用いてもよい。仮想視点に基づいて補正された音情報とは、例えば、画像撮像時に会場で取得される音情報を、仮想視点の位置で仮想視点からの視線方向を向いている際に聞こえるように調整された音情報である。なお、別途音情報を用意してもよい。 The sound information associated with the virtual viewpoint video is sound information acquired at the venue at the time of image capture. Alternatively, sound information corrected based on the virtual viewpoint may be used. Sound information corrected based on the virtual viewpoint is, for example, sound information acquired at the venue during image capture and adjusted so that it sounds as it would be heard at the position of the virtual viewpoint while facing the line-of-sight direction from the virtual viewpoint. Note that separate sound information may also be prepared.

図９は、ユーザデバイスに表示される本実施形態のグラフィックユーザーインターフェース（ＧＵＩ）を示す図である。ＧＵＩ画像には、第１領域９１１と第２領域９１２が存在する。第１領域９１１内に、立体形状のデジタルコンテンツに関連付けられた情報を示す画像を表示する第１表示領域９０１、第２表示領域９０２、第３表示領域９０３、第４表示領域９０４、第５表示領域９０５、第６表示領域９０６が含まれる。各表示領域には、それぞれに割り当てられた画像や情報が表示される。第２領域９１２内に、第１表示領域９０１から第６表示領域９０６のうちユーザにより選択された表示領域に関連付けられた情報を示す画像を表示する第７表示領域９０７がある。尚、第１表示領域９０１から第６表示領域９０６に関連付けられた情報を示す画像とは静止画像であってもよいし、映像であってもよい。 FIG. 9 is a diagram showing a graphic user interface (GUI) of this embodiment displayed on a user device. A first area 911 and a second area 912 exist in the GUI image. The first area 911 includes a first display area 901, a second display area 902, a third display area 903, a fourth display area 904, a fifth display area 905, and a sixth display area 906, each of which displays an image indicating information associated with the three-dimensional digital content. Each display area displays the images and information assigned to it. Within the second area 912, there is a seventh display area 907 that displays an image showing information associated with the display area selected by the user from among the first display area 901 to the sixth display area 906. Note that the images indicating information associated with the first display area 901 to the sixth display area 906 may be still images or videos.

図９では、３つの仮想視点映像が対応付けられた第２表示領域９０２が選択された例を示している。３つの仮想視点映像に対応付けられた表示領域が選択された場合、各視点の仮想視点映像に対応するＧＵＩ９０８、ＧＵＩ９０９、ＧＵＩ９１０を第２領域内に表示する。尚、ＧＵＩ９０８、ＧＵＩ９０９、ＧＵＩ９１０を第７表示領域９０７に重畳表示してもよい。 FIG. 9 shows an example in which the second display area 902, with which three virtual viewpoint videos are associated, is selected. When a display area associated with three virtual viewpoint videos is selected, GUI 908, GUI 909, and GUI 910 corresponding to the virtual viewpoint videos of the respective viewpoints are displayed in the second area. Note that the GUI 908, GUI 909, and GUI 910 may be superimposed on the seventh display area 907.

本実施形態では、立体形状のデジタルコンテンツの各面と各表示領域を対応付けられている。すなわち、各表示領域にはデジタルコンテンツの各面に対応付けられた情報を示す画像が表示される。なお、デジタルコンテンツの形状に依らず、デジタルコンテンツに関連付けられた情報を示す画像を各表示領域に表示してもよい。 In this embodiment, each surface of the three-dimensional digital content is associated with each display area. That is, an image indicating information associated with each side of the digital content is displayed in each display area. Note that an image indicating information associated with the digital content may be displayed in each display area regardless of the shape of the digital content.

尚、デジタルコンテンツの表示面と表示領域の数が異なっていてもよい。例えば、６面体のデジタルコンテンツに対し、第１表示領域９０１から第４表示領域９０４のみをユーザデバイスに表示してもよい。第１表示領域９０１から第３表示領域９０３にはデジタルコンテンツに関連付けられた情報の一部が表示され、ユーザにより選択された表示領域に関連付けられた情報を第４表示領域９０４に表示する。 Note that the number of display surfaces and display areas for digital content may be different. For example, for hexahedral digital content, only the first to fourth display areas 901 to 904 may be displayed on the user device. Part of the information associated with the digital content is displayed in the first display area 901 to the third display area 903, and the information associated with the display area selected by the user is displayed in the fourth display area 904.

立体形状のデジタルコンテンツから特定した情報を、各表示領域に対応付け、特定した情報を示す画像を表示する。本実施形態では、被写体をバスケットの選手とし、第１表示領域９０１に選手を映すメイン画像を表示する。第１表示領域９０１に表示される画像は、仮想視点画像でもよいし、撮影画像であってもよい。第２表示領域９０２には、メイン画像に表示される選手に関連する３つの仮想視点映像を示す画像および仮想視点映像を示すアイコン９１３を重畳表示する。第２表示領域９０２には、第１表示領域９０１に表示される画像とは異なる視点に対応する仮想視点画像が表示される。第３表示領域９０３にメイン画像に表示される選手の所属するチームの情報を示す画像が表示される。第４表示領域９０４に、メイン画像に表示される選手の、メイン画像が撮像されたシーズンの成績情報を示す画像が表示される。第５表示領域９０５に撮影時の試合の最終スコアを示す画像が表示される。第６表示領域９０６にデジタルコンテンツの著作権情報を示す画像が表示される。 Information identified from three-dimensional digital content is associated with each display area, and an image representing the identified information is displayed. In this embodiment, the subject is a basketball player, and a main image showing the player is displayed in the first display area 901. The image displayed in the first display area 901 may be a virtual viewpoint image or a photographed image. In the second display area 902, an image showing three virtual viewpoint videos related to the player displayed in the main image and an icon 913 showing the virtual viewpoint video are displayed in a superimposed manner. In the second display area 902, a virtual viewpoint image corresponding to a different viewpoint from the image displayed in the first display area 901 is displayed. An image showing information about the team to which the player displayed in the main image belongs is displayed in the third display area 903. In the fourth display area 904, an image showing performance information of the player displayed in the main image in the season in which the main image was captured is displayed. An image showing the final score of the match at the time of shooting is displayed in the fifth display area 905. An image showing copyright information of the digital content is displayed in the sixth display area 906.

第１表示領域９０１から第６表示領域９０６に関連付けられている情報に映像を示す情報が含まれている場合、映像に対応する表示領域の画像上にイラストやアイコンを重畳表示してもよい。この場合、１つの撮像装置により撮影された画像により生成される映像と、複数の撮像装置により生成された仮想視点画像により生成される仮想視点映像に対しそれぞれ異なるアイコンを用いる。本実施形態では、第２表示領域９０２の仮想視点映像を示す画像上にアイコン９１３を重畳表示する。また、第２表示領域９０２がユーザによって選択された場合、第７表示領域９０７に表示される仮想視点映像上にアイコン９１３を表示する。尚、イラストやアイコンは表示領域の周辺に配置してもよい。 If the information associated with the first display area 901 to the sixth display area 906 includes information indicating a video, an illustration or an icon may be displayed superimposed on the image in the display area corresponding to the video. In this case, different icons are used for a video generated from an image captured by one imaging device and a virtual viewpoint video generated from virtual viewpoint images generated by a plurality of imaging devices. In this embodiment, an icon 913 is displayed superimposed on the image showing the virtual viewpoint video in the second display area 902. Further, when the second display area 902 is selected by the user, an icon 913 is displayed on the virtual viewpoint video displayed in the seventh display area 907. Note that illustrations and icons may be placed around the display area.
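The icon rule above (different icons for a video from a single imaging device and for a virtual viewpoint video) can be sketched as a small lookup. The enum, the icon identifiers, and the choice to show no icon on still images are assumptions for illustration only; the text names only icon 913 for virtual viewpoint videos.

```python
from enum import Enum
from typing import Optional


class MediaKind(Enum):
    STILL_IMAGE = "still_image"
    CAMERA_VIDEO = "camera_video"                        # video from one imaging device
    VIRTUAL_VIEWPOINT_VIDEO = "virtual_viewpoint_video"  # video from virtual viewpoint images


# Hypothetical icon identifiers; "icon_913" stands in for icon 913 of FIG. 9.
ICONS = {
    MediaKind.CAMERA_VIDEO: "icon_camera_video",
    MediaKind.VIRTUAL_VIEWPOINT_VIDEO: "icon_913",
}


def icon_for(kind: MediaKind) -> Optional[str]:
    """Return the icon to superimpose on a display area, or None for still images."""
    return ICONS.get(kind)
```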

第１表示領域９０１から第６表示領域９０６のうち、１つの表示領域に複数の画像または複数の映像が対応付けられてよい。本実施形態では、第２表示領域９０２に視点の異なる３つの仮想視点映像が対応付けられている場合を記載する。この場合、ＧＵＩ９０８、ＧＵＩ９０９、ＧＵＩ９１０にそれぞれ視点の異なる仮想視点映像が対応付けられる。ＧＵＩ９０８には被写体視点（視点１）の仮想視点映像が対応付けられる。ＧＵＩ９０９には被写体の後方からの視点（視点２）に対応する仮想視点映像が対応付けられる。ＧＵＩ９１０には被写体を中心とした球面上の位置の中の仮想視点（視点３）の仮想視点映像が対応付けられる。尚、１つの表示領域に複数の画像または複数の映像が対応付けられない場合は、ＧＵＩ９０８、ＧＵＩ９０９、ＧＵＩ９１０は表示しなくてよい。 A plurality of images or a plurality of videos may be associated with one display area among the first display area 901 to the sixth display area 906. In this embodiment, a case will be described in which three virtual viewpoint videos having different viewpoints are associated with the second display area 902. In this case, virtual viewpoint videos having different viewpoints are associated with the GUI 908, GUI 909, and GUI 910, respectively. A virtual viewpoint image of the subject viewpoint (viewpoint 1) is associated with the GUI 908. A virtual viewpoint video corresponding to a viewpoint from behind the subject (viewpoint 2) is associated with the GUI 909. The GUI 910 is associated with a virtual viewpoint image of a virtual viewpoint (viewpoint 3) among positions on a spherical surface centered on the subject. Note that if multiple images or multiple videos are not associated with one display area, GUI 908, GUI 909, and GUI 910 do not need to be displayed.

第７表示領域９０７には、ユーザが第１表示領域９０１から第６表示領域９０６を選択する前に表示する情報として、初期画像が設定される。初期画像は第１表示領域９０１から第６表示領域９０６に関連付けられた画像であってもよいし、第１表示領域９０１から第６表示領域９０６に関連付けられた画像と異なる画像であってもよい。本実施形態では、第１表示領域９０１のメイン画像を初期画像と設定する。 An initial image is set for the seventh display area 907 as the information to be displayed before the user selects any of the first display area 901 to the sixth display area 906. The initial image may be an image associated with one of the first display area 901 to the sixth display area 906, or may be an image different from those images. In this embodiment, the main image in the first display area 901 is set as the initial image.

図１０は、本実施形態における、画像処理システムの動作フローを説明するためのフローチャートである。具体的には、図２のＣＰＵ１１１が実行する処理である。ステップＳ１００１において、立体形状のデジタルコンテンツの第１面から第６面に対応付けられたコンテンツ情報を特定する。 FIG. 10 is a flowchart for explaining the operation flow of the image processing system in this embodiment. Specifically, this is a process executed by the CPU 111 in FIG. In step S1001, content information associated with the first to sixth sides of the three-dimensional digital content is identified.

ステップＳ１００２において、ステップＳ１００１にて特定したコンテンツ情報を第１表示領域９０１から第６表示領域９０６に対応付ける。その後、対応付けた情報を示す画像を第１表示領域９０１から第６表示領域９０６に表示する。また、第７表示領域９０７には予め設定された初期画像を表示する。本実施形態では、第１表示領域９０１のメイン画像を初期画像として第７表示領域９０７に表示する。 In step S1002, the content information identified in step S1001 is associated with the first display area 901 to the sixth display area 906. Thereafter, images indicating the associated information are displayed in the first display area 901 to the sixth display area 906. Further, a preset initial image is displayed in the seventh display area 907. In this embodiment, the main image in the first display area 901 is displayed in the seventh display area 907 as the initial image.
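Steps S1001 and S1002 amount to building a mapping from the surfaces of the digital content to the display areas and choosing what the seventh display area shows first. The sketch below illustrates this under assumed data structures: the `DisplayArea` class, the list-of-strings content representation, and the `initial_surface` parameter are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DisplayArea:
    area_id: int          # 901..906 in FIG. 9
    contents: List[str]   # one or more images/videos associated with the area


def build_display_areas(surface_contents: List[List[str]],
                        initial_surface: int = 0) -> Tuple[List[DisplayArea], str]:
    """Associate each surface's content with a display area (S1002).

    Returns the display areas and the initial image for the seventh display
    area, taken here from the first item of `initial_surface`.
    """
    areas = [DisplayArea(901 + i, list(c)) for i, c in enumerate(surface_contents)]
    initial_image = areas[initial_surface].contents[0]
    return areas, initial_image
```

For the hexahedral example of this embodiment, `surface_contents` would hold six entries, with the second entry listing the three virtual viewpoint videos.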

ステップＳ１００３において、最新の入力を受け付けて所定の時間（例えば３０分）が経過したか否か判定する。Ｙｅｓの場合、ステップＳ１０１７に進む。Ｎｏの場合、ステップＳ１００４に進む。 In step S1003, it is determined whether a predetermined time (for example, 30 minutes) has elapsed since the latest input was received. If Yes, the process advances to step S1017. If No, the process advances to step S1004.

ステップＳ１００４において、ユーザによる第１表示領域９０１から第６表示領域９０６の何れかを選択する入力を受け付けたか否か判定する。Ｙｅｓの場合、受け付けた入力により進むステップが異なる。第１表示領域９０１を選択する入力を受け付けた場合、ステップＳ１００５に進む。第２表示領域９０２を選択する入力を受け付けた場合、ステップＳ１００６に進む。第３表示領域９０３を選択する入力を受け付けた場合、ステップＳ１００７に進む。第４表示領域９０４を選択する入力を受け付けた場合、ステップＳ１００８に進む。第５表示領域９０５を選択する入力を受け付けた場合、ステップＳ１００９に進む。第６表示領域９０６を選択する入力を受け付けた場合、ステップＳ１０１０に進む。Ｎｏの場合、ステップＳ１００３に戻る。 In step S1004, it is determined whether an input by the user to select any one of the first display area 901 to the sixth display area 906 has been received. If Yes, the steps to proceed will differ depending on the received input. If an input to select the first display area 901 is accepted, the process advances to step S1005. If an input to select the second display area 902 is accepted, the process advances to step S1006. If an input to select the third display area 903 is accepted, the process advances to step S1007. If an input to select the fourth display area 904 is accepted, the process advances to step S1008. If an input to select the fifth display area 905 is accepted, the process advances to step S1009. If an input to select the sixth display area 906 is accepted, the process advances to step S1010. If No, the process returns to step S1003.

ステップＳ１００５において、第１表示領域９０１に対応する選手を映すメイン画像を第７表示領域９０７に表示する。尚、すでに第７表示領域９０７に第１表示領域９０１に関連付けられた選手を映すメイン画像が表示されている場合、第７表示領域９０７にメイン画像を表示したままステップＳ１００３に戻る。ユーザが選択した表示領域に関連する情報がすでに第７表示領域９０７に表示されている場合は、ステップＳ１００７からステップＳ１０１０においても同様であるため以下省略する。また、メイン画像が映像であり、すでに第７表示領域９０７にメイン画像が表示されていた場合、再度既定の再生時刻から映像を再生してもよいし、表示されている映像をそのまま再生してもよい。 In step S1005, the main image showing the player corresponding to the first display area 901 is displayed in the seventh display area 907. Note that if the main image showing the player associated with the first display area 901 is already displayed in the seventh display area 907, the process returns to step S1003 with the main image still displayed in the seventh display area 907. The same applies in steps S1007 to S1010 when the information related to the display area selected by the user is already displayed in the seventh display area 907, so that description is omitted below. Further, if the main image is a video and is already displayed in the seventh display area 907, the video may be played again from the default playback time, or the video being displayed may simply continue playing.

ステップＳ１００６において、第２表示領域９０２に対応する選手に関連する仮想視点映像を第７表示領域９０７に表示する。なお、複数の仮想視点映像が第２表示領域９０２に関連付けられている場合、予め設定した仮想視点映像を第７表示領域９０７に表示する。本実施形態では、被写体視点（視点１）の仮想視点映像を第７表示領域９０７に表示する。あらかじめ設定した仮想視点映像を第７表示領域９０７に表示した後、ステップＳ１０１１に進む。 In step S1006, a virtual viewpoint video related to the player corresponding to the second display area 902 is displayed in the seventh display area 907. Note that when a plurality of virtual viewpoint videos are associated with the second display area 902, a preset virtual viewpoint video is displayed in the seventh display area 907. In this embodiment, the virtual viewpoint video from the subject viewpoint (viewpoint 1) is displayed in the seventh display area 907. After the preset virtual viewpoint video is displayed in the seventh display area 907, the process advances to step S1011.

ステップＳ１０１１において、最新の入力を受け付けて所定の時間（例えば３０分）が経過したか否か判定する。Ｙｅｓの場合、ステップＳ１０１７に進む。Ｎｏの場合、ステップＳ１０１２に進む。 In step S1011, it is determined whether a predetermined time (for example, 30 minutes) has elapsed since the latest input was received. If Yes, the process advances to step S1017. If No, the process advances to step S1012.

ステップＳ１０１２において、複数の仮想視点映像のうち、ユーザが見たい仮想視点の仮想視点映像を選択する入力を受け付けたか否か判定する。具体的には、図９の第７表示領域９０７のように各仮想視点を示すＧＵＩを表示し、ユーザがＧＵＩを選択することにより各視点の仮想視点映像が選択される。Ｙｅｓの場合、選択されたＧＵＩに応じて次のステップに進む。ＧＵＩ９０８が選択されるとステップＳ１０１３、ＧＵＩ９０９が選択されるとステップＳ１０１４、ＧＵＩ９１０が選択されるとステップＳ１０１５に進む。Ｎｏの場合、ステップＳ１０１６に進む。 In step S1012, it is determined whether an input for selecting a virtual viewpoint video of a virtual viewpoint that the user wants to see from among a plurality of virtual viewpoint videos is received. Specifically, a GUI indicating each virtual viewpoint is displayed as in the seventh display area 907 in FIG. 9, and when the user selects the GUI, the virtual viewpoint video of each viewpoint is selected. If Yes, proceed to the next step depending on the selected GUI. If the GUI 908 is selected, the process proceeds to step S1013, if the GUI 909 is selected, the process proceeds to step S1014, and if the GUI 910 is selected, the process proceeds to step S1015. If No, the process advances to step S1016.

尚、各仮想視点を示すＧＵＩを設けず、第７表示領域９０７に対するフリック操作やタッチ操作により各視点の仮想視点映像が選択されてもよい。また、複数の仮想視点映像が第２表示領域９０２に関連付けられている場合、複数の仮想視点映像を繋げ１つの仮想視点映像とし、連続的に再生されるようにしてもよい。その場合、仮想視点を選択するステップＳ１０１２を経由せず、ステップＳ１０１６に進む。 Note that the virtual viewpoint video of each viewpoint may be selected by a flick operation or a touch operation on the seventh display area 907 without providing a GUI indicating each virtual viewpoint. Further, when a plurality of virtual viewpoint videos are associated with the second display area 902, the plurality of virtual viewpoint videos may be connected to form one virtual viewpoint video, and the virtual viewpoint video may be continuously played. In that case, the process proceeds to step S1016 without going through step S1012 for selecting a virtual viewpoint.

ステップＳ１０１３において、被写体視点（視点１）の仮想視点映像を第７表示領域９０７に表示する。その後、ステップＳ１０１１に戻る。すでに第７表示領域９０７に被写体視点の仮想視点映像が表示されている場合は、再度既定の再生時刻から映像を再生してもよいし、表示されている映像をそのまま再生してもよい。すでに第７表示領域９０７に表示されている場合は、ステップＳ１０１４とステップＳ１０１５においても同様であるため以下省略する。 In step S1013, a virtual viewpoint image of the subject viewpoint (viewpoint 1) is displayed in the seventh display area 907. After that, the process returns to step S1011. If the virtual viewpoint video from the subject's viewpoint is already displayed in the seventh display area 907, the video may be played back from the predetermined playback time, or the displayed video may be played back as is. If it is already displayed in the seventh display area 907, the same applies to steps S1014 and S1015, so the description thereof will be omitted below.

ステップＳ１０１４において、被写体の後方からの視点（視点２）に対応する仮想視点映像を第７表示領域９０７に表示する。その後、ステップＳ１０１１に戻る。 In step S1014, a virtual viewpoint video corresponding to a viewpoint from behind the subject (viewpoint 2) is displayed in the seventh display area 907. After that, the process returns to step S1011.

ステップＳ１０１５において、被写体を中心とした球面上の位置の中の仮想視点（視点３）の仮想視点映像を第７表示領域９０７に表示する。その後、ステップＳ１０１１に戻る。 In step S1015, a virtual viewpoint image of a virtual viewpoint (viewpoint 3) among positions on a spherical surface centered on the subject is displayed in the seventh display area 907. After that, the process returns to step S1011.

ステップＳ１０１６において、ユーザによる第１表示領域９０１から第６表示領域９０６の何れかを選択する入力を受け付けたか判定する。Ｙｅｓの場合、ステップＳ１００４と同様の処理を行う。Ｎｏの場合、ステップＳ１０１１に戻る。 In step S1016, it is determined whether an input by the user to select any one of the first display area 901 to the sixth display area 906 has been received. If Yes, the same process as step S1004 is performed. If No, the process returns to step S1011.

ステップＳ１００７において、第３表示領域９０３に対応する選手の所属するチーム情報を第７表示領域９０７に表示する。 In step S1007, team information to which the player corresponding to the third display area 903 belongs is displayed in the seventh display area 907.

ステップＳ１００８において、第４表示領域９０４に対応する選手の今シーズンの成績情報を第７表示領域９０７に表示する。 In step S1008, this season's performance information of the player corresponding to the fourth display area 904 is displayed in the seventh display area 907.

ステップＳ１００９において、第５表示領域９０５に対応する試合の最終スコアを第７表示領域９０７に表示する。 In step S1009, the final score of the match corresponding to the fifth display area 905 is displayed in the seventh display area 907.

ステップＳ１０１０において、第６表示領域９０６に対応する著作権情報を第７表示領域９０７に表示する。 In step S1010, copyright information corresponding to the sixth display area 906 is displayed in the seventh display area 907.

ステップＳ１０１７において、第７表示領域に初期画像を表示する。本実施形態では、第１表示領域９０１のメイン画像を初期画像として第７表示領域９０７に表示する。その後、処理フローを終了する。 In step S1017, the initial image is displayed in the seventh display area. In this embodiment, the main image in the first display area 901 is displayed in the seventh display area 907 as an initial image. After that, the processing flow ends.
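The flow of FIG. 10 described in steps S1001 to S1017 can be read as a small event-driven state machine. The sketch below replays a list of input events and records what the seventh display area 907 would show. It is a simplified illustration under several assumptions: events are modeled as `(kind, value)` tuples, a timeout is delivered as an explicit event rather than measured with a clock, the flow returns to the waiting loop after steps S1007 to S1010 (the text states this explicitly only for step S1005), and the content labels are placeholders.

```python
from typing import List, Optional, Tuple

# Content associated with display areas 901..906 (labels are placeholders).
AREA_CONTENT = {
    901: "main image",
    902: "virtual viewpoint video",
    903: "team information",
    904: "season results",
    905: "final score",
    906: "copyright information",
}
# GUI 908..910 select one of the three virtual viewpoints.
VIEWPOINT_GUI = {908: "viewpoint 1", 909: "viewpoint 2", 910: "viewpoint 3"}
INITIAL_IMAGE = AREA_CONTENT[901]

Event = Tuple[str, Optional[int]]


def run_flow(events: List[Event]) -> List[str]:
    """Return the sequence of contents shown in display area 907."""
    shown = [INITIAL_IMAGE]                      # S1002: show the initial image
    viewpoint_mode = False                       # True while area 902 is selected
    for kind, value in events:
        if kind == "timeout":                    # S1003 / S1011 -> S1017
            shown.append(INITIAL_IMAGE)
            break                                # the flow ends after S1017
        if kind == "area" and value in AREA_CONTENT:        # S1004 / S1016
            viewpoint_mode = (value == 902)
            if viewpoint_mode:
                shown.append(VIEWPOINT_GUI[908])            # S1006: preset viewpoint 1
            else:
                shown.append(AREA_CONTENT[value])           # S1005, S1007..S1010
        elif kind == "viewpoint" and viewpoint_mode and value in VIEWPOINT_GUI:
            shown.append(VIEWPOINT_GUI[value])              # S1012 -> S1013..S1015
    return shown
```

Viewpoint selections are ignored unless the second display area is the current selection, which mirrors the fact that GUI 908 to GUI 910 are only offered after step S1006.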

（実施形態５） (Embodiment 5)
次に実施形態５について図１１および図１２を用いて説明する。なお、本実施形態において、システム構成は実施形態１で説明した構成と同じであるため、説明を省略する。また、システムのハードウェア構成も、図２と同じであり、その説明も省略する。 Next, Embodiment 5 will be described using FIGS. 11 and 12. Note that in this embodiment, the system configuration is the same as the configuration described in Embodiment 1, so a description thereof will be omitted. Furthermore, the hardware configuration of the system is also the same as that in FIG. 2, and its description will be omitted.

図１１は、本実施形態のグラフィックユーザーインターフェース（ＧＵＩ）を示す図である。実施形態４と同様に、実施形態３にて生成され、仮想視点の数を３つとし、３つの仮想視点画像が第２面に対応付けられる立体形状のデジタルコンテンツを例に説明する。また、実施形態４と異なり、第１領域１１０７に表示される表示領域の数はデジタルコンテンツの面数と異なる。具体的には、第１領域１１０７に表示される表示領域の数が５つに対し、デジタルコンテンツの面数は６つである。このＧＵＩは、画像処理システム１００で生成され、ユーザデバイスに送信される。なお、このＧＵＩは、必要な情報を取得したユーザデバイスによって、生成されてもよい。 FIG. 11 is a diagram showing the graphic user interface (GUI) of this embodiment. As in the fourth embodiment, a three-dimensional digital content that is generated in the third embodiment, has three virtual viewpoints, and has three virtual viewpoint images associated with the second surface will be described as an example. Unlike the fourth embodiment, the number of display areas displayed in the first area 1107 differs from the number of surfaces of the digital content. Specifically, five display areas are displayed in the first area 1107, whereas the digital content has six surfaces. This GUI is generated by the image processing system 100 and sent to the user device. Note that this GUI may be generated by the user device that has acquired the necessary information.

本実施形態では、デジタルコンテンツが視点の異なる３つの仮想視点映像を含み、各視点の仮想視点映像をそれぞれ第２表示領域１１０２から第４表示領域１１０４に対応付ける。３つの仮想視点は、実施形態４と同様に、被写体視点（視点１）の仮想視点映像、被写体の後方からの視点（視点２）、被写体を中心とした球面上の位置の中の仮想視点（視点３）である。３つの仮想視点映像を第２表示領域１１０２から第４表示領域１１０４に対応付けるため、仮想視点映像を示すアイコン１１０９も第２表示領域１１０２から第４表示領域１１０４に重畳表示する。 In this embodiment, the digital content includes three virtual viewpoint videos with different viewpoints, and the virtual viewpoint video of each viewpoint is associated with one of the second display area 1102 to the fourth display area 1104. As in the fourth embodiment, the three virtual viewpoints are the subject viewpoint (viewpoint 1), a viewpoint from behind the subject (viewpoint 2), and a virtual viewpoint at a position on a spherical surface centered on the subject (viewpoint 3). Because the three virtual viewpoint videos are associated with the second display area 1102 to the fourth display area 1104, an icon 1109 indicating a virtual viewpoint video is also superimposed on each of the second display area 1102 to the fourth display area 1104.

第１表示領域１１０１には選手を映す画像、第５表示領域１１０５には著作権情報を対応付ける。第１表示領域１１０１には第２表示領域１１０２から第４表示領域１１０４までの仮想視点画像とは異なる視点の仮想視点映像あるいは撮影画像であってもよい。なお、表示領域に表示する情報はこれらの限りではなく、デジタルコンテンツと対応付けられた情報であればよい。 The first display area 1101 is associated with an image showing a player, and the fifth display area 1105 is associated with copyright information. The first display area 1101 may contain a virtual viewpoint video or a photographed image from a different viewpoint from the virtual viewpoint images from the second display area 1102 to the fourth display area 1104. Note that the information displayed in the display area is not limited to these, and may be any information that is associated with digital content.

図１２は、本実施形態の画像処理システム１００の動作フローを説明するためのフローチャートである。尚、画像処理システム１００のコンピュータとしてのＣＰＵ１１１が例えばＲＯＭ１１２や補助記憶装置１１４等のメモリに記憶されたコンピュータプログラムを実行することによって図１２のフローチャートの各ステップの動作が行われる。尚、図１２において図１０と同じ符号のステップは同じ処理であり説明を省略する。 FIG. 12 is a flowchart for explaining the operation flow of the image processing system 100 of this embodiment. Note that each step in the flowchart of FIG. 12 is performed by the CPU 111 as a computer of the image processing system 100 executing a computer program stored in a memory such as the ROM 112 or the auxiliary storage device 114. Note that in FIG. 12, steps with the same reference numerals as in FIG. 10 are the same processes, and a description thereof will be omitted.

図１２のステップＳ１００４において、第２表示領域１１０２を選択する入力を受け付けた場合、ステップＳ１２０１に進む。第３表示領域１１０３を選択する入力を受け付けた場合、ステップＳ１２０２に進む。第４表示領域１１０４を選択する入力を受け付けた場合、ステップＳ１２０３に進む。 If an input to select the second display area 1102 is received in step S1004 of FIG. 12, the process advances to step S1201. If an input to select the third display area 1103 is accepted, the process advances to step S1202. If an input to select the fourth display area 1104 is accepted, the process advances to step S1203.

ステップＳ１２０１において、被写体視点（視点１）の仮想視点映像を第６表示領域１１０６に表示する。その後、ステップＳ１００３に戻る。 In step S1201, a virtual viewpoint image of the subject viewpoint (viewpoint 1) is displayed in the sixth display area 1106. After that, the process returns to step S1003.

ステップＳ１２０２において、被写体の後方からの視点（視点２）に対応する仮想視点映像を第６表示領域１１０６に表示する。その後、ステップＳ１００３に戻る。 In step S1202, a virtual viewpoint video corresponding to a viewpoint from behind the subject (viewpoint 2) is displayed in the sixth display area 1106. After that, the process returns to step S1003.

ステップＳ１２０３において、被写体を中心とした球面上の位置の中の仮想視点（視点３）の仮想視点映像を第６表示領域１１０６に表示する。その後、ステップＳ１００３に戻る。 In step S1203, a virtual viewpoint image of a virtual viewpoint (viewpoint 3) among positions on a spherical surface centered on the subject is displayed in the sixth display area 1106. After that, the process returns to step S1003.
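Because each virtual viewpoint has its own display area in this embodiment, the selection handling of FIG. 12 reduces to a flat lookup with no nested viewpoint loop. The following sketch shows that simplification; the mapping labels and the function name are assumptions for illustration.

```python
# Content associated with display areas 1101..1105 (labels are placeholders).
AREA_TO_CONTENT = {
    1101: "player image",
    1102: "viewpoint 1 (subject viewpoint)",            # S1201
    1103: "viewpoint 2 (behind the subject)",            # S1202
    1104: "viewpoint 3 (on a sphere around the subject)",  # S1203
    1105: "copyright information",
}


def content_for_area_1106(selected_area: int, current: str) -> str:
    """Return what the sixth display area 1106 shows after a selection.

    An unknown area id leaves the current content unchanged, after which the
    flow returns to the waiting loop (step S1003).
    """
    return AREA_TO_CONTENT.get(selected_area, current)
```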

（実施形態６） (Embodiment 6)
次に実施形態６について図１３および図１４を用いて説明する。なお、本実施形態において、システム構成は実施形態１で説明した構成と同じであるため、説明を省略する。また、システムのハードウェア構成も、図２と同じであり、その説明も省略する。 Next, Embodiment 6 will be described using FIGS. 13 and 14. Note that in this embodiment, the system configuration is the same as the configuration described in Embodiment 1, so a description thereof will be omitted. Furthermore, the hardware configuration of the system is also the same as that in FIG. 2, and its description will be omitted.

図１３は、実施形態６のグラフィックユーザーインターフェース（ＧＵＩ）を示す図である。本実施形態では、実施形態３にて生成され、仮想視点の数を６つとし、６つの仮想視点画像が第２面に対応付けられる立体形状のデジタルコンテンツを例に説明する。６つの仮想視点画像は映像であり、以下、仮想視点映像と呼称する。このＧＵＩは、画像処理システム１００で生成され、ユーザデバイスに送信される。なお、このＧＵＩは、必要な情報を取得したユーザデバイスによって、生成されてもよい。 FIG. 13 is a diagram showing a graphic user interface (GUI) according to the sixth embodiment. The present embodiment will be described using as an example the digital content in a three-dimensional shape that is generated in the third embodiment, has six virtual viewpoints, and has six virtual viewpoint images associated with the second surface. The six virtual viewpoint images are videos, and are hereinafter referred to as virtual viewpoint videos. This GUI is generated by the image processing system 100 and sent to the user device. Note that this GUI may be generated by the user device that has acquired the necessary information.

本実施形態は、実施形態４と異なり、第１領域１３０７に表示される表示領域の数はデジタルコンテンツの面数と異なる形態である。具体的には、第１領域１３０７に表示される表示領域の数が３つに対し、デジタルコンテンツの面数は６つである。また、６つの仮想視点映像が第２表示領域１３０２に対応付けられている。 This embodiment differs from the fourth embodiment in that the number of display areas displayed in the first area 1307 differs from the number of surfaces of the digital content. Specifically, three display areas are displayed in the first area 1307, whereas the digital content has six surfaces. In addition, six virtual viewpoint videos are associated with the second display area 1302.

第１表示領域１３０１には選手を映すメイン画像、第３表示領域１３０３には著作権情報を対応付ける。第１表示領域１３０１には第２表示領域１３０２の仮想視点画像とは異なる視点の仮想視点映像あるいは撮影画像であってもよい。なお、表示領域に表示する情報はこれらの限りではなく、デジタルコンテンツと対応付けられた情報であればよい。 The first display area 1301 is associated with a main image showing a player, and the third display area 1303 is associated with copyright information. The first display area 1301 may contain a virtual viewpoint video or a photographed image from a different viewpoint from the virtual viewpoint image in the second display area 1302. Note that the information displayed in the display area is not limited to these, and may be any information that is associated with digital content.

本実施形態は、実施形態４および実施形態５と異なり、第２領域１３０８に第４表示領域１３０４、第５表示領域１３０５、第６表示領域１３０６を含む。ユーザにより第１表示領域１３０１から第３表示領域１３０３の何れかが選択された場合、選択された表示領域に対応する画像を第５表示領域に表示する。 This embodiment differs from Embodiments 4 and 5 in that the second area 1308 includes a fourth display area 1304, a fifth display area 1305, and a sixth display area 1306. When the user selects any one of the first display area 1301 to the third display area 1303, an image corresponding to the selected display area is displayed in the fifth display area.

第５表示領域１３０５は常に第２領域１３０８に表示される。一方、第４表示領域１３０４および第６表示領域１３０６は、仮想視点映像を第５表示領域に表示する場合に第２領域１３０８に表示される。 The fifth display area 1305 is always displayed in the second area 1308. On the other hand, the fourth display area 1304 and the sixth display area 1306 are displayed in the second area 1308 when the virtual viewpoint video is displayed in the fifth display area.

第２領域１３０８の中心に位置する表示領域とそれ以外の表示領域は形状が異なる。具体的には、第４表示領域１３０４および第６表示領域１３０６は第５表示領域１３０５と形状または大きさが異なる。本実施形態では、第５表示領域が長方形の形状であるのに対し、第４表示領域と第６表示領域は台形の形状になっている。このようにすることにより、第２領域１３０８の中心に位置する第５表示領域１３０５の視認性を向上することができる。 The display area located at the center of the second area 1308 and the other display areas have different shapes. Specifically, the fourth display area 1304 and the sixth display area 1306 are different in shape or size from the fifth display area 1305. In this embodiment, the fifth display area has a rectangular shape, whereas the fourth display area and the sixth display area have a trapezoidal shape. By doing so, the visibility of the fifth display area 1305 located at the center of the second area 1308 can be improved.

６つの仮想視点映像はそれぞれ仮想視点が異なる。３つの被写体が存在し、３つの被写体の位置を仮想視点の位置とした３つの仮想視点映像と、３つの被写体の位置から一定距離後方かつ一定距離高い位置を仮想視点の位置とした３つの仮想視点映像である。例えば、バスケの試合において、オフェンスを行うＡ選手とディフェンスを行うＢ選手、バスケットボールを被写体とした場合には、以下の例が挙げられる。１つは、Ａ選手の顔の位置を仮想視点の位置とし、Ａ選手の顔の向きを仮想視点からの視線方向とした第１仮想視点映像である。別の１つは、Ａ選手の顔の位置から一定距離後方（例えば、３ｍ後方）かつ一定距離高い位置（例えば、１ｍ高い位置）を仮想視点の位置とし、Ａ選手を画角内に含むように設定される方向を仮想視点からの視線方向とした第２仮想視点映像である。また別の１つは、Ｂ選手の顔の位置を仮想視点の位置とし、Ｂ選手の顔の向きを仮想視点からの視線方向とした第３仮想視点映像である。また別の１つは、Ｂ選手の顔の位置から一定距離後方かつ一定距離高い位置を仮想視点の位置とし、Ｂ選手を画角内に含むように設定される方向を仮想視点からの視線方向とした第４仮想視点映像である。また別の１つは、バスケットボールの重心位置を仮想視点の位置とし、バスケットボールの進行方向を仮想視点からの視線方向とした第５仮想視点映像である。また、別の１つはバスケットボールの重心位置から一定距離後方かつ一定距離高い位置を仮想視点の位置とし、バスケットボールの進行方向を仮想視点の位置とした第６仮想視点映像である。なお、被写体の位置から一定距離後方かつ一定距離高い位置とは、撮影シーンに基づいて決定されたり、被写体が仮想視点画像の画角に占める割合を用いて決定されたりしてもよい。仮想視点からの視線方向は、被写体の姿勢または被写体の進行方向、画角に占める被写体の位置の何れか一つに基づいて設定される。 Each of the six virtual viewpoint videos has a different virtual viewpoint. There are three subjects: three of the videos use the positions of the three subjects as the virtual viewpoint positions, and the other three use positions a certain distance behind and a certain distance above the positions of the three subjects as the virtual viewpoint positions. For example, in a basketball game in which player A on offense, player B on defense, and the basketball are the subjects, the videos are as follows. One is a first virtual viewpoint video in which the position of player A's face is the position of the virtual viewpoint and the orientation of player A's face is the line-of-sight direction from the virtual viewpoint. Another is a second virtual viewpoint video in which a position a certain distance behind (for example, 3 m behind) and a certain distance above (for example, 1 m above) the position of player A's face is the position of the virtual viewpoint, and a direction set so that player A is within the angle of view is the line-of-sight direction from the virtual viewpoint. Another is a third virtual viewpoint video in which the position of player B's face is the position of the virtual viewpoint and the orientation of player B's face is the line-of-sight direction from the virtual viewpoint. Another is a fourth virtual viewpoint video in which a position a certain distance behind and a certain distance above the position of player B's face is the position of the virtual viewpoint, and a direction set so that player B is within the angle of view is the line-of-sight direction from the virtual viewpoint. Another is a fifth virtual viewpoint video in which the position of the center of gravity of the basketball is the position of the virtual viewpoint and the direction of travel of the basketball is the line-of-sight direction from the virtual viewpoint. The last is a sixth virtual viewpoint video in which a position a certain distance behind and a certain distance above the center of gravity of the basketball is the position of the virtual viewpoint and the direction of travel of the basketball is the line-of-sight direction from the virtual viewpoint. Note that the position a certain distance behind and a certain distance above the subject may be determined based on the shooting scene, or by using the proportion of the angle of view of the virtual viewpoint image that the subject occupies. The line-of-sight direction from the virtual viewpoint is set based on any one of the posture of the subject, the direction of travel of the subject, and the position of the subject within the angle of view.
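The "behind and above" viewpoints (the second, fourth, and sixth videos) can be illustrated with a short geometric sketch. It assumes a coordinate system with the z axis pointing up, treats "behind" as the opposite of the horizontal component of the subject's facing or travel direction, and aims the line of sight at the subject so that the subject falls within the angle of view. These conventions, the function names, and the default distances (3 m and 1 m, taken from the example in the text) are assumptions for illustration, not the method of the disclosure.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def normalize(v: Vec3) -> Vec3:
    """Scale a vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector has no direction")
    return (v[0] / n, v[1] / n, v[2] / n)


def behind_and_above(subject_pos: Vec3, facing: Vec3,
                     back: float = 3.0, up: float = 1.0) -> Tuple[Vec3, Vec3]:
    """Return (virtual viewpoint position, line-of-sight direction).

    The viewpoint sits `back` metres behind the subject along the horizontal
    facing direction and `up` metres higher (z is assumed to be the up axis).
    The line of sight points from the viewpoint toward the subject.
    """
    hx, hy, _ = normalize((facing[0], facing[1], 0.0))
    x, y, z = subject_pos
    camera = (x - back * hx, y - back * hy, z + up)
    gaze = normalize((x - camera[0], y - camera[1], z - camera[2]))
    return camera, gaze
```

For the first, third, and fifth videos no offset is needed: the viewpoint position is the subject position itself and the line of sight is the facing or travel direction.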

In this embodiment, the six virtual viewpoint videos are assumed to have the same playback time. Note that the virtual viewpoint videos may instead have different playback times.

When the second display area 1302, which corresponds to the six virtual viewpoint videos, is selected, the fourth display area 1304 through the sixth display area 1306 are displayed in the second area 1308. Each of the three display areas displayed in the second area 1308 is associated with the virtual viewpoint videos showing one of the three subjects. Specifically, the first and second virtual viewpoint videos, which have player A as the subject, are associated with the fourth display area 1304; the third and fourth virtual viewpoint videos, which have player B as the subject, are associated with the fifth display area 1305; and the fifth and sixth virtual viewpoint videos, which have the basketball as the subject, are associated with the sixth display area 1306.

The videos displayed in the display areas are not all played back at the same time. Only the video displayed in the display area located at the center of the second area 1308 is played back. In this embodiment, only the virtual viewpoint video displayed in the fifth display area 1305 is played back.

FIG. 14 is a flowchart for explaining the operation flow of the image processing system 100 according to the sixth embodiment. The operation of each step in the flowchart of FIG. 14 is performed by the CPU 111, serving as the computer of the image processing system 100, executing a computer program stored in a memory such as the ROM 112 or the auxiliary storage device 114. Steps in FIG. 14 that have the same reference numerals as in FIG. 10 are the same processes, and their description is omitted.

In step S1401, preset virtual viewpoint videos are displayed in the fourth display area 1304, the fifth display area 1305, and the sixth display area 1306. In this embodiment, the three virtual viewpoint videos whose virtual viewpoint positions are the positions of the three subjects are displayed. Specifically, the fourth display area 1304 displays the first virtual viewpoint video, whose virtual viewpoint position is the position of player A's face; the fifth display area 1305 displays the third virtual viewpoint video, whose virtual viewpoint position is the position of player B's face; and the sixth display area 1306 displays the fifth virtual viewpoint video, whose virtual viewpoint position is the center of gravity of the basketball. At this time, only the fifth display area 1305, located at the center of the second area 1308, plays back its video; the fourth display area 1304 and the sixth display area 1306 do not. After the preset virtual viewpoint videos are displayed in the fourth display area 1304, the fifth display area 1305, and the sixth display area 1306, the process advances to step S1011.

In step S1402, it is determined whether an operation to change the subject of the virtual viewpoint video displayed in the fifth display area 1305 has been input. Specifically, it is determined whether a horizontal slide operation on the fifth display area 1305, which switches to a virtual viewpoint video of a different subject, has been input. If Yes, the next step depends on the direction of the slide: when input information for a leftward slide operation is received, the process advances to step S1403, and when input information for a rightward slide operation is received, the process advances to step S1404. If No, the process advances to step S1405.

In step S1403, the virtual viewpoint videos associated with each display area are re-associated with the display area to its left. For example, when a leftward slide operation is received on the third virtual viewpoint video corresponding to the fifth display area 1305, the third and fourth virtual viewpoint videos corresponding to the fifth display area 1305 are associated with the fourth display area 1304, which is to the left of the fifth display area 1305. The fifth and sixth virtual viewpoint videos associated with the sixth display area 1306 are associated with the fifth display area 1305, which is to the left of the sixth display area 1306. Because no display area exists to the left of the fourth display area 1304 in the second area 1308, the first and second virtual viewpoint videos corresponding to the fourth display area 1304 are associated with the sixth display area 1306, which has no display area to its right. After the re-association, among the virtual viewpoint videos now associated with the fifth display area 1305, the one whose virtual viewpoint position relative to the subject is the same as that of the video that was being played back in the fifth display area 1305 before the re-association is played back. For example, suppose the video being played back in the fifth display area 1305 before the re-association was the third virtual viewpoint video. After the re-association, the fifth and sixth virtual viewpoint videos are associated with the fifth display area 1305. Because the third virtual viewpoint video places the virtual viewpoint at the position of the subject, the fifth virtual viewpoint video, which likewise places the virtual viewpoint at the position of the subject, is displayed in the fifth display area 1305. In this way, the user can intuitively switch between virtual viewpoint videos of different subjects. After the above processing, the process returns to step S1011.

In step S1404, the virtual viewpoint videos associated with each display area are re-associated with the display area to its right. For example, when a rightward slide operation is received on the third virtual viewpoint video corresponding to the fifth display area 1305, the third and fourth virtual viewpoint videos corresponding to the fifth display area 1305 are associated with the sixth display area 1306, which is to the right of the fifth display area 1305. The first and second virtual viewpoint videos associated with the fourth display area 1304 are associated with the fifth display area 1305, which is to the right of the fourth display area 1304. Because no display area exists to the right of the sixth display area 1306 in the second area 1308, the fifth and sixth virtual viewpoint videos corresponding to the sixth display area 1306 are associated with the fourth display area 1304, which has no display area to its left. After the re-association, among the virtual viewpoint videos now associated with the fifth display area 1305, the one whose virtual viewpoint position relative to the subject is the same as that of the video that was being played back in the fifth display area 1305 before the re-association is played back. After the above processing, the process returns to step S1011.
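The re-association in steps S1403 and S1404 amounts to a cyclic shift of the video pairs across the three display areas, followed by selecting the video of the same viewpoint type in the center area. A minimal sketch, assuming each display area holds a dictionary keyed by hypothetical viewpoint types `"at_subject"` and `"behind_above"` (these names, and the function `slide`, are illustrative and do not appear in the disclosure):

```python
from collections import deque

CENTER = 1  # index of the fifth display area 1305 in [1304, 1305, 1306]

def slide(areas, playing_kind, direction):
    """Cyclically re-associate video pairs and pick the video to play.

    areas        -- video pairs ordered left to right (1304, 1305, 1306)
    playing_kind -- viewpoint type that was playing in the center area
    direction    -- "left" (S1403) or "right" (S1404)
    """
    if direction not in ("left", "right"):
        raise ValueError("direction must be 'left' or 'right'")
    shifted = deque(areas)
    # A leftward slide moves every pair one area to the left; the leftmost
    # pair wraps around to the rightmost area (and vice versa for "right").
    shifted.rotate(-1 if direction == "left" else 1)
    shifted = list(shifted)
    return shifted, shifted[CENTER][playing_kind]

areas = [
    {"at_subject": "video1", "behind_above": "video2"},  # player A
    {"at_subject": "video3", "behind_above": "video4"},  # player B
    {"at_subject": "video5", "behind_above": "video6"},  # basketball
]
after_left, now_playing_left = slide(areas, "at_subject", "left")
after_right, now_playing_right = slide(areas, "at_subject", "right")
```

Starting from the third virtual viewpoint video, a leftward slide selects the fifth virtual viewpoint video, which matches the example given for step S1403.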

In step S1405, it is determined whether an operation to change the virtual viewpoint position of the virtual viewpoint video displayed in the fifth display area 1305 has been input. Specifically, a double-tap operation on the fifth display area 1305 is accepted as an input to switch to a virtual viewpoint video of the same subject but with a different virtual viewpoint position. If Yes, the process advances to step S1406; if No, the process advances to step S1016.

In step S1406, processing is performed to change the virtual viewpoint position of the virtual viewpoint video. Specifically, the display is switched to a virtual viewpoint video of the same subject but with a different virtual viewpoint position. For example, suppose the fifth display area 1305 is associated with the third virtual viewpoint video, whose virtual viewpoint position is the position of player B's face, and the fourth virtual viewpoint video, whose virtual viewpoint position is a certain distance behind and a certain distance above the position of player B's face. When a double-tap operation is received while the third virtual viewpoint video is displayed in the fifth display area 1305, the display is switched to the fourth virtual viewpoint video in the fifth display area 1305. In this way, the position of the virtual viewpoint for the same subject can be switched intuitively. After the above processing, the process returns to step S1011.

Note that when the virtual viewpoint video is switched by a slide operation or a double-tap operation, the time code of the virtual viewpoint video being played back in the fifth display area 1305 may be recorded, and the virtual viewpoint video after switching may be played back from the time of the recorded time code.
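The double-tap switch of step S1406, combined with the optional time-code carry-over, can be sketched as a small state holder. The class name `CenterAreaPlayer`, its fields, and the use of seconds as the time-code unit are assumptions made only for illustration.

```python
from dataclasses import dataclass

@dataclass
class CenterAreaPlayer:
    """Playback state of the fifth display area 1305 (hypothetical model)."""
    videos: tuple          # videos of the same subject, e.g. ("video3", "video4")
    index: int = 0         # which of the videos is currently shown
    timecode: float = 0.0  # current playback position, in seconds (assumed unit)

    def tick(self, seconds):
        """Advance playback by the given number of seconds."""
        self.timecode += seconds

    def double_tap(self):
        """Switch to the next viewpoint of the same subject.

        The recorded time code is kept, so the new video resumes from the
        moment at which the previous one was interrupted.
        """
        self.index = (self.index + 1) % len(self.videos)
        return self.videos[self.index], self.timecode

player = CenterAreaPlayer(videos=("video3", "video4"))
player.tick(12.5)
first_switch = player.double_tap()
second_switch = player.double_tap()
```

Because the time code is stored on the player rather than per video, the same carry-over would apply to a slide operation that replaces `videos` with another subject's pair.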

Note that although this embodiment uses a double-tap operation to switch the position of the virtual viewpoint for the same subject, other operations may be used. For example, a pinch-in or pinch-out operation, or a vertical slide operation, on the fifth display area 1305 may be used.

In this embodiment, a plurality of virtual viewpoint videos showing the same subject are associated with one display area, but they may instead be associated with a plurality of display areas. Specifically, a seventh display area, an eighth display area, and a ninth display area may be newly provided above the fourth display area 1304, the fifth display area 1305, and the sixth display area 1306, respectively (not shown). In this case, the first virtual viewpoint video is displayed in the fourth display area 1304, the third in the fifth display area 1305, the fifth in the sixth display area 1306, the second in the seventh display area, the fourth in the eighth display area, and the sixth in the ninth display area. In the sixth embodiment, virtual viewpoint videos showing the same subject are switched by a tap gesture (the double-tap operation); in this variation, the display area is switched by a vertical slide operation instead. In this way, the user can operate the virtual viewpoint intuitively.

(Embodiment 7)
In the first embodiment, a second image having a predetermined relationship with the main image (first image) associated with the first surface 201 of the digital content 200 is associated with the second surface 202 of the digital content 200. In this embodiment, a plurality of virtual viewpoint videos having the same time code are associated with the surfaces of three-dimensional digital content. Specifically, an example is described in which a virtual viewpoint video is associated with each surface of the three-dimensional digital content based on the line-of-sight direction of the virtual viewpoint relative to the subject.

FIG. 15 is a diagram showing the image processing system 101 of this embodiment. Blocks that are the same as in FIG. 1 are given the same numbers, and their description is omitted.

The image generation unit 117 analyzes the correspondence between the position of the virtual viewpoint, the line-of-sight direction from the virtual viewpoint, and the coordinates of the objects, based on the virtual viewpoint designated via the operation unit 116 and the coordinate information of the objects displayed in the virtual viewpoint video. The image generation unit 117 identifies an object of interest from the virtual viewpoint video seen from the virtual viewpoint designated via the operation unit 116. In this embodiment, the object that appears at the center of the virtual viewpoint video, or that is closest to the center, is identified, but the identification is not limited to this. For example, the object that occupies the largest proportion of the virtual viewpoint video may be identified, or an object may be selected without generating a virtual viewpoint video. Next, the image generation unit 117 determines the shooting directions from which to capture the identified object of interest, and generates a plurality of virtual viewpoints corresponding to the respective shooting directions. The shooting directions are top, bottom, left, right, front, and back. Note that the generated virtual viewpoints correspond to the same time code as the virtual viewpoint designated by the operator. After generating the virtual viewpoint videos corresponding to the generated virtual viewpoints, the image generation unit 117 determines from which shooting direction (top, bottom, left, right, front, or back) each virtual viewpoint video captures the object of interest, and attaches shooting direction information to that virtual viewpoint video. The shooting direction information indicates the direction from which the object of interest was captured, relative to the orientation of the object of interest. The shooting direction is determined from the positional relationship, at the start of the virtual viewpoint video, between the object captured at the center of that video and a predetermined point. Details are described with reference to FIG. 17.

The content generation unit 118 determines, based on the attached shooting direction information, which surface of the three-dimensional content each virtual viewpoint video received from the image generation unit 117 is to be associated with, and generates the three-dimensional digital content.

FIG. 16 is a diagram illustrating the surfaces of the three-dimensional digital content in this embodiment. In the three-dimensional digital content 1600, surface 1601 is defined as the front, surface 1602 as the right side, surface 1603 as the top, surface 1604 as the left side, surface 1605 as the back, and surface 1606 as the bottom.

FIG. 17 is a diagram illustrating the shooting directions relative to a player. A goal 1701, a goal 1702, and a player 1703 are present on the court 1700. The player 1703 is assumed to attack toward the goal 1701. Here, along the line connecting the player 1703 and the goal 1701, the direction from the goal 1701 toward the player 1703, parallel to the ground, is defined as "front". The front is determined as follows. First, the line segment connecting the player 1703 and the goal 1701 is derived, where the goal is a predetermined point and the player is represented by the center of gravity of the player's 3D model. Next, a plane that is orthogonal to the derived line segment and in contact with the player's 3D model is derived, and this plane is taken as the front. After the front is determined, a bounding box surrounding the player is determined. The top, bottom, right, left, and back are determined with reference to the front of the bounding box. In FIG. 17, the direction in which the player is captured from the front is represented by arrow 1704, and a virtual viewpoint video created with this direction as the line-of-sight direction from the virtual viewpoint has "front" as its shooting direction information. A virtual viewpoint video to which the "front" shooting direction information is attached is associated with surface 1601 in FIG. 16. Similarly, a virtual viewpoint video created with the direction of arrow 1706, which captures the player from the right, as the line-of-sight direction has "right side" as its shooting direction information and is associated with surface 1602 in FIG. 16. A virtual viewpoint video created with the direction of arrow 1707, which captures the player from the left, has "left side" as its shooting direction information and is associated with surface 1604 in FIG. 16. A virtual viewpoint video created with the direction of arrow 1708, which captures the player from above, has "top" as its shooting direction information and is associated with surface 1603 in FIG. 16. A virtual viewpoint video created with the direction of arrow 1709, which captures the player from below, has "bottom" as its shooting direction information and is associated with surface 1606 in FIG. 16. A virtual viewpoint video created with the direction of arrow 1705, which captures the player from behind, has "back" as its shooting direction information and is associated with surface 1605 in FIG. 16. In this embodiment, the orientation is determined from the relationship between the player and the goal at one specific moment, but because the player moves around the field, the orientation may be updated according to the changing relationship between the player and the goal.
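The relationship between the player, the goal, the six shooting directions, and the surfaces of FIG. 16 can be sketched as follows. Only the surface numbers and the definition of "front" (from the goal toward the player, parallel to the ground) come from the description above. The z-up coordinate system, the assumption that the player faces the goal, and the resulting signs of the "right" and "left" vectors are illustrative assumptions, as are the names `SURFACE_FOR_DIRECTION` and `viewing_directions`.

```python
import math

# Surface of the three-dimensional digital content 1600 for each shooting
# direction label (taken from FIG. 16).
SURFACE_FOR_DIRECTION = {
    "front": 1601, "right": 1602, "top": 1603,
    "left": 1604, "back": 1605, "bottom": 1606,
}

def viewing_directions(player_xy, goal_xy):
    """Return a unit line-of-sight vector for each shooting direction.

    player_xy -- horizontal position of the centre of gravity of the 3D model
    goal_xy   -- horizontal position of the predetermined goal point
    Assumption: z is up and the player faces the goal, so the camera that
    captures the player "from the right" looks from the player's right-hand
    side toward the player.
    """
    fx = player_xy[0] - goal_xy[0]
    fy = player_xy[1] - goal_xy[1]
    norm = math.hypot(fx, fy)
    if norm == 0.0:
        raise ValueError("player and goal coincide horizontally")
    fx, fy = fx / norm, fy / norm  # "front": from the goal toward the player
    return {
        "front": (fx, fy, 0.0),
        "back": (-fx, -fy, 0.0),
        "right": (fy, -fx, 0.0),
        "left": (-fy, fx, 0.0),
        "top": (0.0, 0.0, -1.0),     # looking down at the player
        "bottom": (0.0, 0.0, 1.0),   # looking up at the player
    }

directions = viewing_directions(player_xy=(5.0, 0.0), goal_xy=(0.0, 0.0))
```

Each label returned by `viewing_directions` can then be looked up in `SURFACE_FOR_DIRECTION` to find the surface that receives the corresponding video.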

Note that although the shooting directions are determined from the positions of the player and the goal in this embodiment, the determination is not limited to this. For example, the direction in which the player is viewed from the player's direction of travel may be set as "front", and the directions obtained by rotating the "front" direction by ±90 degrees in the XY plane may be set as "left side" and "right side". The directions obtained by rotating the "front" direction by ±90 degrees in the YZ plane are set as "top" and "bottom". The direction obtained by rotating the "front" direction by +180 degrees or -180 degrees in the XY plane is set as "back". As other examples of setting the front, the direction viewed from the orientation of the player's face may be set as "front", or the direction viewed from the goal position closest to the straight line along the player's direction of travel may be set as "front".

Note that the shooting directions may be changed according to the positional relationship with the goal each time the player moves. For example, the front may be redefined when the player has moved a certain distance (for example, 3 meters) or more from the position at which the front was previously determined, or the shooting directions may be changed after a certain period of time has elapsed. Alternatively, when the initially determined front is compared with the front calculated after the movement, the front may be redefined if the angle between the two, as seen from above, has changed by 45 degrees or more. Furthermore, the moment the ball moves to another player by a pass may be used as a trigger, and the front may be redefined each time from the relationship between the player who received the pass and the goal position.
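Two of the re-determination triggers above (movement of 3 meters or more, and a change of 45 degrees or more as seen from above) can be sketched as a single check. The description presents them as alternatives; combining them with a logical OR, and the function name `should_redefine_front`, are assumptions made for this illustration.

```python
import math

def should_redefine_front(ref_pos, cur_pos, ref_front, cur_front,
                          min_move=3.0, min_angle_deg=45.0):
    """Decide whether the "front" should be determined again.

    ref_pos, cur_pos     -- player positions (x, y, ...) when the front was
                            last determined and now
    ref_front, cur_front -- horizontal front directions (x, y, ...) then and now
    """
    # Trigger 1: the player moved at least min_move metres on the ground plane.
    moved = math.dist(ref_pos[:2], cur_pos[:2]) >= min_move

    # Trigger 2: the front direction, seen from above, turned by at least
    # min_angle_deg. atan2 of (cross, dot) gives the signed angle between
    # the two horizontal vectors.
    dot = ref_front[0] * cur_front[0] + ref_front[1] * cur_front[1]
    cross = ref_front[0] * cur_front[1] - ref_front[1] * cur_front[0]
    angle = abs(math.degrees(math.atan2(cross, dot)))

    return moved or angle >= min_angle_deg
```

The time-based and pass-based triggers mentioned in the text would need additional inputs (elapsed time, ball possession) and are left out of this sketch.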

FIG. 18 is a diagram illustrating an example of the three-dimensional digital content generated by the content generation unit 4. A plurality of virtual viewpoint videos, seen from a plurality of virtual viewpoints that correspond to the same time code as the virtual viewpoint set by the operator, are associated with the surfaces of the three-dimensional digital content. Displaying the content in this manner allows the user to intuitively understand from which position each virtual viewpoint video captures the object.

FIG. 19 is a flowchart for explaining the operation flow of the image processing system 101 according to the seventh embodiment. Steps that are the same as S37 to S39 in FIG. 3 are given the same numbers, and their description is omitted.

In S1901, the image generation unit 117 acquires virtual viewpoint information indicating the position of the virtual viewpoint designated by the user via the operation unit 116 and the line-of-sight direction from that virtual viewpoint.

In S1902, the image generation unit 117 identifies the object of interest in the virtual viewpoint image seen from the virtual viewpoint corresponding to the acquired virtual viewpoint information. In this embodiment, the object that appears at the center of the virtual viewpoint video, or that is closest to the center, is identified.

In S1903, the image generation unit 117 determines the shooting directions for the object of interest. In this embodiment, the plane that is orthogonal to the straight line connecting the position of the object of interest and a predetermined position, and that is in contact with the 3D model of the object of interest, is taken as the front. The shooting directions for the top, bottom, left, right, and back are determined with reference to this front.

In S1904, the image generation unit 117 generates a plurality of virtual viewpoints corresponding to the plurality of shooting directions determined in S1903. In this embodiment, shooting directions corresponding to the front, back, top, bottom, left, and right of the object of interest are determined, so a virtual viewpoint is generated for each of them. Note that it is sufficient for each generated virtual viewpoint to have its line-of-sight direction set to the same orientation as the shooting direction; the object of interest need not lie on the optical axis of the virtual viewpoint. The position of each generated virtual viewpoint is set at a predetermined distance from the position of the object of interest. In this embodiment, the virtual viewpoint is assumed to be set at a position 3 m away from the object of interest.
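The viewpoint generation of S1904 can be sketched as stepping back from the object of interest along each line-of-sight vector by the predetermined distance. This sketch places the object exactly on the optical axis, which the text says is permitted but not required. The function name `make_viewpoints` and the dictionary layout are hypothetical.

```python
def make_viewpoints(object_pos, directions, distance=3.0):
    """Create one virtual viewpoint per shooting direction.

    object_pos -- (x, y, z) position of the object of interest
    directions -- mapping from a direction label to a unit line-of-sight vector
    distance   -- predetermined distance from the object (3 m in the embodiment)
    """
    viewpoints = {}
    for label, view_dir in directions.items():
        # Stepping back along the line of sight keeps the object straight ahead.
        position = tuple(p - distance * d for p, d in zip(object_pos, view_dir))
        viewpoints[label] = {"position": position, "view_direction": view_dir}
    return viewpoints

# Two of the six directions, for an object at (5, 0, 1) with z pointing up
# (the coordinate convention is an assumption).
viewpoints = make_viewpoints(
    (5.0, 0.0, 1.0),
    {"front": (1.0, 0.0, 0.0), "top": (0.0, 0.0, -1.0)},
)
```

All viewpoints generated this way would share the time code of the operator-designated virtual viewpoint; time codes are not modelled here.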

In S1905, the image generation unit 117 generates a virtual viewpoint image corresponding to a generated virtual viewpoint. It then attaches, to the generated virtual viewpoint image, shooting direction information indicating the shooting direction that corresponds to that virtual viewpoint.

In S1906, the image generation unit 117 determines whether a virtual viewpoint video has been generated for every virtual viewpoint generated in S1904. If all of the virtual viewpoint videos have been generated, the generated virtual viewpoint videos are sent to the content generation unit 118, and the process advances to S1907. If not all of the virtual viewpoint videos have been generated, the process returns to S1905 and loops until all of them have been generated.

In S1907, the content generation unit 118 associates each received virtual viewpoint video with a surface of the three-dimensional digital content based on the shooting direction information of that video.

In S1908, the content generation unit 118 determines whether the received virtual viewpoint videos have been associated with all surfaces of the three-dimensional digital content. If so, the process advances to S37; if not, the process returns to S1907. Note that although this embodiment assumes that videos are associated with all surfaces, the association is not limited to this, and videos may be associated with specific surfaces only. In that case, S1908 determines whether virtual viewpoint videos have been associated with those specific surfaces.

Through the above processing, a virtual viewpoint video corresponding to each shooting direction can be associated with each surface of the three-dimensional digital content. As a result, when a user viewing virtual viewpoint videos through the digital content wants to switch between them, the user can intuitively grasp which virtual viewpoint video corresponds to each surface.
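The association and completeness check of S1907 and S1908 can be sketched as follows. The surface numbers follow FIG. 16; the function name `associate_with_surfaces`, the video records, and the `required` parameter (which models the variation where only specific surfaces must be filled) are illustrative assumptions.

```python
# Surface numbers of the three-dimensional digital content (FIG. 16).
SURFACES = {
    "front": 1601, "right": 1602, "top": 1603,
    "left": 1604, "back": 1605, "bottom": 1606,
}

def associate_with_surfaces(videos, required=None):
    """Associate videos with surfaces by their shooting direction information.

    videos   -- iterable of dicts with hypothetical keys "name" and "direction"
    required -- direction labels that must be filled (default: all six)
    Returns (surface -> video name, set of required labels still missing).
    """
    required = set(SURFACES) if required is None else set(required)
    mapping = {}
    for video in videos:                      # S1907: one association per video
        mapping[SURFACES[video["direction"]]] = video["name"]
    # S1908: check whether every required surface has a video.
    missing = {label for label in required if SURFACES[label] not in mapping}
    return mapping, missing

videos = [
    {"name": "clip_front", "direction": "front"},
    {"name": "clip_top", "direction": "top"},
]
mapping, missing = associate_with_surfaces(videos)
_, missing_specific = associate_with_surfaces(videos, required=["front", "top"])
```

An empty `missing` set corresponds to the "Yes" branch of S1908, after which the flow would continue to S37.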

（実施形態８） (Embodiment 8)
実施形態７では、オペレータが指定した仮想視点を基準に、同一のタイムコードに対応する複数の仮想視点を生成し、デジタルコンテンツの各面に各仮想視点の撮影方向に応じた仮想視点映像を対応付けた。しかしながら、オペレータが指定した仮想視点から見た仮想視点映像を撮影方向に応じた面に対応付けたいケースも考えられる。実施形態８では、オペレータが指定した仮想視点から見た仮想視点映像が、注目オブジェクトをどの撮影方向から撮影しているのか特定し、撮影方向に応じてデジタルコンテンツの各面に対応付ける。 In the seventh embodiment, a plurality of virtual viewpoints corresponding to the same time code were generated with reference to the virtual viewpoint specified by the operator, and each surface of the digital content was associated with a virtual viewpoint video according to the shooting direction of each virtual viewpoint. However, there may be cases where it is desired to associate the virtual viewpoint video seen from the operator-specified virtual viewpoint itself with the surface corresponding to its shooting direction. In the eighth embodiment, the system identifies from which shooting direction the virtual viewpoint video seen from the operator-specified virtual viewpoint captures the object of interest, and associates the video with each surface of the digital content according to that shooting direction.

図２０は、実施形態８の画像処理システム１０１の動作フローを説明するためのフローチャートである。なお、図３のＳ３７～Ｓ３９、図１９のＳ１９０１およびＳ１９０２、Ｓ１９０７およびＳ１９０８と同じフローについては同様の番号を付して説明を省略する。 FIG. 20 is a flowchart for explaining the operation flow of the image processing system 101 according to the eighth embodiment. Steps identical to S37 to S39 in FIG. 3 and to S1901, S1902, S1907, and S1908 in FIG. 19 are given the same numbers, and their description is omitted.

ステップＳ２００１において、画像生成部１１７は、Ｓ１９０１において取得した仮想視点情報に基づいて、仮想視点映像を生成する。 In step S2001, the image generation unit 117 generates a virtual viewpoint video based on the virtual viewpoint information acquired in S1901.

ステップＳ２００２において、画像生成部１１７は、仮想視点映像のフレームごとに注目オブジェクトに対する撮影方向を決定する。例えば、仮想視点映像が全体で１０００フレームとした場合、「正面」の撮影方向情報が付与されたフレームが８００フレームとする。「背面」の撮影方向情報が付与されたフレームが１００フレームとする。「左面」の撮影方向情報が付与されたフレームが５０フレームとする。「右面」の撮影方向情報が付与されたフレームが３０フレームとする。「上面」の撮影方向情報が付与されたフレームが２０フレームとする。「下面」の撮影方向情報が付与されたフレームが１０フレームとする。このようにタイムコードの異なるフレームごとに撮影方向情報を付与する。 In step S2002, the image generation unit 117 determines, for each frame of the virtual viewpoint video, the shooting direction with respect to the object of interest. For example, suppose the virtual viewpoint video has 1000 frames in total, of which 800 frames are given "front" shooting direction information, 100 frames "back", 50 frames "left side", 30 frames "right side", 20 frames "top", and 10 frames "bottom". In this way, shooting direction information is assigned to each frame, each of which has a different time code.

なお、本実施形態では、ユーザがデジタルコンテンツを視聴する際、撮影方向が異なるフレームを表示する際、対応するデジタルコンテンツの面に回転するように切り替わることを想定している。このようにすることにより、立体形状を生かし、仮想視点映像に躍動感を持たせることができる。 Note that in this embodiment, when a user views digital content, it is assumed that when frames with different shooting directions are displayed, the frame is rotated to the corresponding digital content plane. By doing so, it is possible to take advantage of the three-dimensional shape and give a sense of dynamism to the virtual viewpoint video.
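As a rough sketch of how per-frame shooting direction information could drive the face switching described above, the per-frame labels can be collapsed into runs, so that each run boundary marks a point where the content would rotate to another surface. The function name `direction_segments` and the string labels are assumptions for illustration; the patent does not specify a data format.

```python
from itertools import groupby


def direction_segments(frame_directions):
    """Collapse per-frame labels into (direction, start_frame, frame_count) runs.

    Each boundary between consecutive runs is a point at which the displayed
    surface of the digital content would change.
    """
    segments = []
    start = 0
    for direction, run in groupby(frame_directions):
        length = sum(1 for _ in run)
        segments.append((direction, start, length))
        start += length
    return segments


labels = ["front", "front", "left", "left", "left", "front"]
segments = direction_segments(labels)
```

Here `segments` holds three runs, so a viewer built on it would rotate the content twice during playback.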

また、本実施形態では、撮影方向に対して立体形状のデジタルコンテンツに対応付ける面が予め定まっていたが、これに限定されない。例えば、オブジェクトを捉えた各仮想視点映像のフレーム割合が最も多い順に立体形状のデジタルコンテンツに対応付ける面を決定してもよい。具体的には、フレーム割合が多い順に第二方向～第六方向までを設定する。仮想視点映像が、１０００フレーム中、正面１６０４からが８００フレーム、背面１６０５が１００フレーム、左１６０６からのフレームが５０、右１６０７からが３０フレーム、上１６０８からが２０フレーム，下１６０９からが１０フレームとする。この場合、第一方向が正面、以降第二方向～第六方向が、背面、左、右、上、下という順に決定される。そして、撮影方向情報を付与した仮想視点映像をコンテンツ生成部１１８に出力する。 Further, in the present embodiment, the surface of the three-dimensional digital content to be associated with each shooting direction is determined in advance, but this is not limiting. For example, the surfaces to be associated may be determined in descending order of the proportion of frames in which the virtual viewpoint video captures the object from each direction. Specifically, the second to sixth directions are set in descending order of frame proportion. Suppose that, of the 1000 frames of the virtual viewpoint video, 800 frames are from the front 1604, 100 frames from the back 1605, 50 frames from the left 1606, 30 frames from the right 1607, 20 frames from the top 1608, and 10 frames from the bottom 1609. In this case, the first direction is determined to be the front, and the second to sixth directions are determined to be the back, left, right, top, and bottom, in that order. The virtual viewpoint video with the shooting direction information attached is then output to the content generation unit 118.
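The frame-count ordering in this variation amounts to a sort. The sketch below uses the per-direction frame counts given in the text; the function name `rank_directions` and the use of integer ranks 1 to 6 for the first to sixth directions are assumptions for illustration.

```python
def rank_directions(frame_counts):
    """Order shooting directions by frame count, largest first.

    Returns a dict mapping rank (1 = first direction) to direction name.
    Directions with equal counts keep their input order, because Python's
    sort is stable.
    """
    ordered = sorted(frame_counts, key=frame_counts.get, reverse=True)
    return {rank: direction for rank, direction in enumerate(ordered, start=1)}


# Per-direction frame counts taken from the example in the text.
counts = {"front": 800, "back": 100, "left": 50, "right": 30, "top": 20, "bottom": 10}
ranking = rank_directions(counts)
```

With these counts, `ranking[1]` is `"front"` and `ranking[6]` is `"bottom"`, matching the order stated above.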

（実施形態９） (Embodiment 9)
実施形態４では、デジタルコンテンツを保存する保存部５が画像処理装置１００に組み込まれている例を説明した。実施形態９では、デジタルコンテンツを外部装置２１０２に保存する例を説明する。なお、画像処理装置１０２は、画像処理装置１００から保存部５が除かれた装置である（不図示）。 In the fourth embodiment, an example has been described in which the storage unit 5 that stores digital content is incorporated in the image processing device 100. In the ninth embodiment, an example will be described in which digital content is stored in the external device 2102. Note that the image processing device 102 is a device obtained by removing the storage unit 5 from the image processing device 100 (not shown).

図２１は、実施形態９における画像処理システム１０３のシステム構成を示す図である。画像処理システム１０３は、画像処理装置１０２、ユーザデバイス２１０１、外部装置２１０２を含む。 FIG. 21 is a diagram showing the system configuration of the image processing system 103 in the ninth embodiment. The image processing system 103 includes an image processing apparatus 102, a user device 2101, and an external device 2102.

画像処理装置１０２は、実施形態１～３のいずれかに記載の方法により生成されたデジタルコンテンツを生成する。生成したデジタルコンテンツおよび生成に用いた仮想視点画像等のメディアデータ、各仮想視点画像を示すアイコン、各仮想視点画像のメタデータ等は、外部装置２１０２に送信される。また、実施形態４～６のいずれかに記載の表示画像を生成する。生成された表示画像は、ユーザデバイス２１０１に送信される。 The image processing device 102 generates digital content by the method described in any of the first to third embodiments. The generated digital content, the media data used to generate it such as virtual viewpoint images, the icons representing each virtual viewpoint image, the metadata of each virtual viewpoint image, and the like are transmitted to the external device 2102. The image processing device 102 also generates a display image as described in any of the fourth to sixth embodiments. The generated display image is transmitted to the user device 2101.

ユーザデバイス２１０１は、例えば、ＰＣやスマートフォン、タッチパネルを有するタブレット端末でもよい（不図示）。本実施形態では、タッチパネルを有するタブレット端末を例に説明する。 The user device 2101 may be, for example, a PC, a smartphone, or a tablet terminal with a touch panel (not shown). In this embodiment, a tablet terminal having a touch panel will be described as an example.

外部装置２１０２は、実施形態１～３のいずれかに記載の方法により生成されたデジタルコンテンツを保存する。なお、図１の保存部５と同様に、デジタルコンテンツのほか、仮想視点画像や、カメラ画像、各デジタルコンテンツに表示される仮想視点画像に対応するアイコン等も保存する。画像処理装置１０２から、特定のデジタルコンテンツを要求された場合、該当するデジタルコンテンツを画像処理装置１０２に送信する。なお、デジタルコンテンツだけでなく、仮想視点画像や各仮想視点画像のメタデータ等を画像処理装置１０２に送信してもよい。 The external device 2102 stores digital content generated by the method described in any of the first to third embodiments. Note that, like the storage unit 5 in FIG. 1, in addition to digital content, it also stores virtual viewpoint images, camera images, icons corresponding to virtual viewpoint images displayed in each digital content, and the like. When specific digital content is requested from the image processing apparatus 102, the corresponding digital content is transmitted to the image processing apparatus 102. Note that not only digital content but also virtual viewpoint images, metadata of each virtual viewpoint image, and the like may be transmitted to the image processing apparatus 102.

図２２は、実施形態９におけるデータ伝送の流れを示す図である。なお、本実施形態では、ユーザからの指示を受けて表示画像を生成し、さらに表示画像に対する操作に応じて表示する仮想視点画像を変更する流れを説明する。 FIG. 22 is a diagram showing the flow of data transmission in the ninth embodiment. In this embodiment, a flow will be described in which a display image is generated in response to an instruction from a user, and a virtual viewpoint image to be displayed is changed in response to an operation on the display image.

Ｓ２２０１において、ユーザデバイス２１０１は、ユーザからの入力を経て、画像処理装置１０２にデジタルコンテンツを視聴する指示を送信する。なお、この指示には、視聴したいデジタルコンテンツを特定するための情報が含まれる。具体的には、デジタルコンテンツのＮＦＴや、外部装置２１０２に記憶されているアドレス等の情報である。 In S2201, the user device 2101 transmits an instruction to view digital content to the image processing apparatus 102 through input from the user. Note that this instruction includes information for specifying the digital content that the user wants to view. Specifically, it is information such as an NFT of digital content and an address stored in the external device 2102.

Ｓ２２０２において、画像処理装置１０２は、取得した視聴指示に基づいて、視聴対象のデジタルコンテンツを外部装置２１０２に要求する。 In S2202, the image processing apparatus 102 requests the external device 2102 for digital content to be viewed based on the acquired viewing instruction.

Ｓ２２０３において、外部装置２１０２は、取得した要求に対応するデジタルコンテンツを特定する。なお、要求内容に応じてデジタルコンテンツのみならず、デジタルコンテンツのメタデータや関連する仮想視点画像を特定する。 In S2203, the external device 2102 identifies digital content corresponding to the obtained request. Note that not only the digital content but also metadata of the digital content and related virtual viewpoint images are specified according to the request content.

Ｓ２２０４において、外部装置２１０２は、特定したデジタルコンテンツを画像処理装置１０２に送信する。なお、デジタルコンテンツのほかに特定した情報があれば、あわせて送信する。 In S2204, the external device 2102 transmits the identified digital content to the image processing device 102. In addition to the digital content, if there is any specified information, it will also be sent.

Ｓ２２０５において、画像処理装置１０２は、取得したデジタルコンテンツに対応する表示画像を生成する。例えば、取得したデジタルコンテンツに３つの仮想視点映像が対応付けられている場合は、実施形態４の図９に示す表示画像を作成する。なお、実施形態５の図１１に示す表示画像を作成してもよいし、実施形態６の図１３に示す表示画像を生成してもよい。本実施形態では、生成する表示画像は予め実施形態４の図９に示す表示画像であると設定されているものとする。なお、デジタルコンテンツに予め対応する表示画像が設定されていてもよい。その場合、デジタルコンテンツの作成時に作成者が設定し、デジタルコンテンツのメタデータに格納する。また、他の例として、ユーザが生成する表示画像を指示してもよい。その場合、Ｓ２２０１のデジタルコンテンツの視聴指示の際に、あわせて表示する表示画像の種類を指示する。そして、生成した表示画像に取得したデジタルコンテンツの各面を対応付ける。本実施形態では、被写体をバスケットの選手とし、第１表示領域９０１に選手を映すメイン画像に対応する画像を表示する。第２表示領域９０２には、メイン画像に表示される選手に関連する３つの仮想視点映像を示す画像および仮想視点映像を示すアイコン９１３を重畳表示する。第３表示領域９０３にメイン画像に表示される選手の所属するチームの情報を示す画像が表示される。第４表示領域９０４に、メイン画像に表示される選手の、メイン画像が撮像されたシーズンの成績情報を示す画像が表示される。第５表示領域９０５に撮影時の試合の最終スコアを示す画像が表示される。第６表示領域９０６にデジタルコンテンツの著作権情報を示す画像が表示される。なお、第１表示領域９０１～第６表示領域９０６は、ユーザ操作により選択可能な画像である。Ｓ２２０５では、第７表示領域９０７には、初期画像として第１表示領域９０１に表示した選手を映すメイン画像が表示される。 In S2205, the image processing device 102 generates a display image corresponding to the acquired digital content. For example, if three virtual viewpoint videos are associated with the acquired digital content, a display image shown in FIG. 9 of the fourth embodiment is created. Note that the display image shown in FIG. 11 of the fifth embodiment may be created, or the display image shown in FIG. 13 of the sixth embodiment may be created. In this embodiment, it is assumed that the display image to be generated is set in advance to be the display image shown in FIG. 9 of the fourth embodiment. Note that a display image corresponding to the digital content may be set in advance. In that case, the creator sets it when creating the digital content and stores it in the metadata of the digital content. Furthermore, as another example, the user may instruct a display image to be generated. In that case, when instructing to view digital content in S2201, the type of display image to be displayed is also instructed. Each aspect of the acquired digital content is then associated with the generated display image. 
In this embodiment, the subject is a basketball player, and an image corresponding to a main image showing the player is displayed in the first display area 901. In the second display area 902, an image showing three virtual viewpoint videos related to the player displayed in the main image and an icon 913 showing the virtual viewpoint video are displayed in a superimposed manner. An image showing information about the team to which the player displayed in the main image belongs is displayed in the third display area 903. In the fourth display area 904, an image showing performance information of the player displayed in the main image in the season in which the main image was captured is displayed. An image showing the final score of the match at the time of shooting is displayed in the fifth display area 905. An image showing copyright information of the digital content is displayed in the sixth display area 906. Note that the first display area 901 to the sixth display area 906 are images that can be selected by user operation. In S2205, the main image showing the player displayed in the first display area 901 is displayed in the seventh display area 907 as an initial image.

Ｓ２２０６において、画像処理装置１０２は、Ｓ２２０５にて生成した表示画像をユーザデバイス２１０１に送信する。 In S2206, the image processing apparatus 102 transmits the display image generated in S2205 to the user device 2101.

Ｓ２２０７において、ユーザデバイス２１０１は、受信した表示画像を表示する。 In S2207, the user device 2101 displays the received display image.

Ｓ２２０８において、ユーザデバイス２１０１は、ユーザ操作により表示画像の表示領域を選択する操作を受信した場合、選択された表示領域を指定する情報を画像処理装置１０２に送信する。例えば、実施形態４の図９に示す表示画像を表示し、ユーザが表示領域９０２に対応する画像を選択した場合、表示領域９０２が選択されたことを示す情報を画像処理装置１０２に送信する。なお、本実施形態では、表示領域を指定する情報を画像処理装置１０２に送信した。これは第１表示領域９０１～第６表示領域９０６に表示されている画像と異なる画像または映像を第７表示領域９０７に表示できるようにするためである。なお、これに限定されず、ユーザ操作により選択された画像を第７表示領域９０７に表示するようにしてもよい。その場合、選択された表示領域を示す情報ではなく、選択された画像を示す情報を画像処理装置１０２に送信する。 In S2208, when the user device 2101 receives a user operation to select a display area of a display image, the user device 2101 transmits information specifying the selected display area to the image processing apparatus 102. For example, when the display image shown in FIG. 9 of the fourth embodiment is displayed and the user selects the image corresponding to the display area 902, information indicating that the display area 902 has been selected is transmitted to the image processing apparatus 102. Note that in this embodiment, information specifying the display area is transmitted to the image processing apparatus 102. This is so that an image or video different from the images displayed in the first display area 901 to the sixth display area 906 can be displayed in the seventh display area 907. Note that the present invention is not limited to this, and an image selected by a user operation may be displayed in the seventh display area 907. In that case, information indicating the selected image is sent to the image processing device 102 instead of information indicating the selected display area.

Ｓ２２０９において、画像処理装置１０２は、選択された表示部に対応するアイコンから、デジタルコンテンツのどの画像が選択されたのか判定する。Ｓ２２０８において、表示領域９０２が選択されたため、表示領域９０２に対応する仮想視点映像が選択されたことになる。したがって、第７表示領域９０７に、デジタルコンテンツに含まれる仮想視点映像を表示する。なお、デジタルコンテンツに複数の仮想視点映像が含まれている場合、初めに表示する仮想視点映像を予め設定しておく。上記処理により、第１表示領域９０１～第６表示領域９０６のうちユーザ操作により選択された画像に対応する表示領域を指定する情報を受信し、選択された表示領域に対応する画像または映像を第７表示領域９０７に表示することで表示画面を更新する。また、実施形態６の図１３が示す表示画面のように、第２領域１３０８に複数の仮想視点映像を表示する場合、それら表示した仮想視点映像に対するタッチ入力やフリック入力を受信したか否か判定する。タッチ入力を受信した場合、再生されている仮想視点映像を一時停止する。フリック入力を受信した場合、第５表示領域１３０５に表示され、現在再生されている仮想視点映像を異なる仮想視点映像に切り替える。 In S2209, the image processing apparatus 102 determines, from the icon corresponding to the selected display area, which image of the digital content has been selected. Because the display area 902 was selected in S2208, the virtual viewpoint video corresponding to the display area 902 has been selected. The virtual viewpoint video included in the digital content is therefore displayed in the seventh display area 907. If the digital content includes a plurality of virtual viewpoint videos, the one to be displayed first is set in advance. Through the above processing, the apparatus receives information specifying the display area that corresponds to the image selected by user operation from among the first display area 901 to the sixth display area 906, and updates the display screen by displaying the image or video corresponding to the selected display area in the seventh display area 907. When a plurality of virtual viewpoint videos are displayed in the second area 1308, as in the display screen shown in FIG. 13 of the sixth embodiment, the apparatus also determines whether a touch input or a flick input on the displayed virtual viewpoint videos has been received. When a touch input is received, the virtual viewpoint video being played is paused. When a flick input is received, the virtual viewpoint video displayed in the fifth display area 1305 and currently being played is switched to a different virtual viewpoint video.
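The touch and flick handling described for S2209 can be sketched as a small state update. This is an illustrative model only: the `PlaybackState` class, the event strings, the choice of the next video in the list as the "different" video, and leaving the pause flag untouched on a flick are assumptions not stated in the text.

```python
from dataclasses import dataclass


@dataclass
class PlaybackState:
    videos: list          # identifiers of the virtual viewpoint videos on display
    current: int = 0      # index of the video now playing in the main area
    paused: bool = False


def handle_input(state, event):
    """Apply a "touch" or "flick" event and return the updated state."""
    if event == "touch":
        # Touch input pauses the video being played.
        state.paused = True
    elif event == "flick":
        # Flick input switches to a different video; stepping to the next one
        # with wrap-around is an assumption, as is leaving `paused` unchanged.
        state.current = (state.current + 1) % len(state.videos)
    return state


state = PlaybackState(videos=["view_a", "view_b", "view_c"])
handle_input(state, "flick")
handle_input(state, "touch")
```

After these two events, the second video is selected and playback is paused.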

Ｓ２２１０において、画像処理装置１０２は、更新した表示画像をユーザデバイス２１０１に送信する。 In S2210, the image processing apparatus 102 transmits the updated display image to the user device 2101.

上記処理により、ユーザの所望のデジタルコンテンツを表示する表示画像を作成し、ユーザデバイス上に表示することができる。 Through the above processing, a display image displaying digital content desired by the user can be created and displayed on the user device.
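The exchange in S2201 to S2210 can be summarized with a toy model of the three parties. Everything here is a simplification under stated assumptions: the class and method names, the dictionary keyed by display area number, and the content identifier `"content-001"` are hypothetical, and real transport, NFT lookup, and image rendering are omitted.

```python
class ExternalDevice:
    """Stands in for external device 2102: stores content and returns it on request."""

    def __init__(self, store):
        self._store = store

    def fetch(self, content_id):
        # S2203-S2204: identify the requested content and send it back.
        return self._store[content_id]


class ImageProcessingApparatus:
    """Stands in for image processing apparatus 102."""

    def __init__(self, external):
        self._external = external
        self._content = None

    def view(self, content_id):
        # S2202: request the content; S2205: build the display image.
        self._content = self._external.fetch(content_id)
        areas = dict(self._content["areas"])
        # Area 907 initially shows the main image assigned to area 901.
        areas[907] = areas[901]
        return areas  # S2206: sent to the user device

    def select(self, area_id):
        # S2209: resolve the selected area and update area 907.
        areas = dict(self._content["areas"])
        shown = areas[area_id]
        # For an area holding several videos, show the preset first one.
        areas[907] = shown[0] if isinstance(shown, list) else shown
        return areas  # S2210: updated display image


content = {"areas": {901: "main image", 902: ["video A", "video B", "video C"],
                     903: "team info", 904: "season results",
                     905: "final score", 906: "copyright"}}
apparatus = ImageProcessingApparatus(ExternalDevice({"content-001": content}))
initial = apparatus.view("content-001")   # triggered by S2201 on the user device
updated = apparatus.select(902)           # triggered by S2208 on the user device
```

In this model the user device only forwards the content identifier and the selected area number, which mirrors the design in which the image processing apparatus 102, not the user device, decides what appears in the seventh display area 907.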

以上、本開示を複数の実施形態に基づいて詳述してきたが、本開示は上記実施形態に限定されるものではなく、本開示の主旨に基づき種々の変形が可能であり、それらを本開示の範囲から除外するものではない。例えば、以上の実施形態１～７を適宜組み合わせても良い。 Although the present disclosure has been described in detail based on a plurality of embodiments, the present disclosure is not limited to the above embodiments. Various modifications can be made based on the gist of the present disclosure, and such modifications are not excluded from the scope of the present disclosure. For example, the above embodiments 1 to 7 may be combined as appropriate.

尚、本実施形態における制御の一部又は全部を上述した実施形態の機能を実現するコンピュータプログラムをネットワーク又は各種記憶媒体を介して画像処理システム等に供給するようにしてもよい。そしてその画像処理システム等におけるコンピュータ（又はＣＰＵやＭＰＵ等）がプログラムを読み出して実行するようにしてもよい。その場合、そのプログラム、及び該プログラムを記憶した記憶媒体は本開示を構成することとなる。 Note that a computer program that implements part or all of the control in the present embodiments, that is, the functions of the above-described embodiments, may be supplied to an image processing system or the like via a network or various storage media. A computer (or a CPU, MPU, or the like) in the image processing system or the like may then read and execute the program. In that case, the program and the storage medium storing the program constitute the present disclosure.

尚、本実施形態の開示は、以下の構成、方法およびプログラムを含む。 Note that the disclosure of this embodiment includes the following configuration, method, and program.

（構成１）立体形状のデジタルコンテンツの第１面に対応付けられた仮想視点画像であって、複数の撮像装置で撮像されることにより得られた複数の画像と仮想視点とに基づいて生成される仮想視点画像と、前記デジタルコンテンツの第２面に対応付けられた前記仮想視点画像に対応する前記仮想視点と異なる視点の画像とを特定する特定手段と、
前記仮想視点画像に対応する画像と、前記仮想視点と異なる視点の画像に対応する画像とを表示領域に表示する制御を行う表示制御手段と、
を有することを特徴とする装置。 (Configuration 1) A device comprising:
identifying means for identifying a virtual viewpoint image that is associated with a first surface of three-dimensional digital content and is generated based on a virtual viewpoint and a plurality of images obtained by imaging with a plurality of imaging devices, and an image that is associated with a second surface of the digital content and is of a viewpoint different from the virtual viewpoint corresponding to the virtual viewpoint image; and
display control means for controlling display, in a display area, of an image corresponding to the virtual viewpoint image and an image corresponding to the image of the viewpoint different from the virtual viewpoint.

（構成２）更に、前記表示領域を選択する入力情報を取得する取得手段を有し、
前記表示制御手段は、前記取得手段により前記取得された入力情報に基づいて、前記選択された表示領域に対応する画像を前記選択された表示領域と異なる表示領域である選択画像表示領域に表示すること
を特徴とする構成１に記載の装置。 (Configuration 2) The device according to configuration 1, further comprising acquisition means for acquiring input information for selecting the display area,
wherein the display control means displays, based on the input information acquired by the acquisition means, an image corresponding to the selected display area in a selected image display area that is a display area different from the selected display area.

（構成３）前記デジタルコンテンツは、前記複数の画像と前記仮想視点を含む複数の仮想視点とに基づいて生成される複数の仮想視点画像を含むことを特徴とする構成１又は２に記載の装置。 (Configuration 3) The device according to configuration 1 or 2, wherein the digital content includes a plurality of virtual viewpoint images generated based on the plurality of images and a plurality of virtual viewpoints including the virtual viewpoint.

（構成４）前記表示制御手段は、前記複数の仮想視点画像に対応する画像をそれぞれ異なる表示領域に表示することを特徴とする構成１乃至３のいずれか１項に記載の装置。 (Configuration 4) The device according to any one of configurations 1 to 3, wherein the display control means displays images corresponding to the plurality of virtual viewpoint images in respectively different display areas.

（構成５）前記複数の仮想視点画像に対応する画像を特定の表示領域に対応付け、
前記表示制御手段は、前記取得手段により前記複数の仮想視点画像に対応する画像を表示する表示領域を選択する入力を取得した場合、前記複数の仮想視点画像のうち、特定の仮想視点画像を前記選択画像表示領域に表示することを特徴とする構成１乃至４のいずれか１項に記載の装置。 (Configuration 5) The device according to any one of configurations 1 to 4, wherein images corresponding to the plurality of virtual viewpoint images are associated with a specific display area,
and the display control means displays a specific virtual viewpoint image among the plurality of virtual viewpoint images in the selected image display area when the acquisition means acquires an input selecting the display area that displays the images corresponding to the plurality of virtual viewpoint images.

（構成６）前記複数の仮想視点画像に対応する複数の仮想視点うち少なくとも一つの仮想視点は、前記複数の撮像装置で撮像されることにより得られた複数の画像のうち少なくとも一つの画像に含まれる被写体に基づいて決定されることを特徴とする構成１乃至５のいずれか１項に記載の装置。 (Configuration 6) The device according to any one of configurations 1 to 5, wherein at least one virtual viewpoint among the plurality of virtual viewpoints corresponding to the plurality of virtual viewpoint images is determined based on a subject included in at least one of the plurality of images obtained by imaging with the plurality of imaging devices.

（構成７）前記仮想視点の位置は、前記被写体を表す３次元形状の位置に基づいて決定されることを特徴とする構成１乃至６のいずれか１項に記載の装置。 (Configuration 7) The device according to any one of configurations 1 to 6, wherein the position of the virtual viewpoint is determined based on the position of a three-dimensional shape representing the subject.

（構成８）前記複数の仮想視点画像に対応する複数の仮想視点うち少なくとも一つの仮想視点の位置は被写体の位置に基づいて決定され、当該仮想視点からの視線方向は前記被写体の向きに基づいて決定されることを特徴とする構成１乃至７のいずれか１項に記載の装置。 (Configuration 8) The device according to any one of configurations 1 to 7, wherein the position of at least one virtual viewpoint among the plurality of virtual viewpoints corresponding to the plurality of virtual viewpoint images is determined based on the position of a subject, and the line-of-sight direction from that virtual viewpoint is determined based on the orientation of the subject.

（構成９）前記複数の仮想視点画像に対応する複数の仮想視点うち少なくとも一つの仮想視点の位置は被写体の後方に所定の距離離れた位置に基づいて決定され、当該仮想視点からの視線方向は前記被写体の向きに基づいて決定されることを特徴とする構成１乃至７のいずれか１項に記載の装置。 (Configuration 9) The device according to any one of configurations 1 to 7, wherein the position of at least one virtual viewpoint among the plurality of virtual viewpoints corresponding to the plurality of virtual viewpoint images is determined based on a position a predetermined distance behind a subject, and the line-of-sight direction from that virtual viewpoint is determined based on the orientation of the subject.

（構成１０）前記複数の仮想視点画像に対応する複数の仮想視点うち少なくとも一つの仮想視点の位置は、被写体を中心とする球面上の位置に基づいて決定され、当該仮想視点からの視線方向は当該仮想視点の位置から前記被写体に向かう方向に基づいて決定されることを特徴とする構成１乃至７のいずれか１項に記載の装置。 (Configuration 10) The device according to any one of configurations 1 to 7, wherein the position of at least one virtual viewpoint among the plurality of virtual viewpoints corresponding to the plurality of virtual viewpoint images is determined based on a position on a spherical surface centered on a subject, and the line-of-sight direction from that virtual viewpoint is determined based on the direction from the position of that virtual viewpoint toward the subject.

（構成１１）前記被写体は人物であり、
前記被写体の向きは、前記被写体の顔の向きであることを特徴とする構成１乃至８のいずれか１項に記載の装置。 (Configuration 11) The device according to any one of configurations 1 to 8, wherein the subject is a person, and the orientation of the subject is the orientation of the face of the subject.

（構成１２）前記表示制御手段は、前記選択画像表示領域に対し特定の操作情報が入力された場合、前記選択画像表示領域に表示している仮想視点画像と、前記複数の仮想視点画像のうち前記選択画像表示領域に表示している仮想視点画像と異なる仮想視点画像と、を切り替えることを特徴とする構成１乃至５のいずれか１項に記載の装置。 (Configuration 12) The device according to any one of configurations 1 to 5, wherein, when specific operation information is input to the selected image display area, the display control means switches between the virtual viewpoint image displayed in the selected image display area and another of the plurality of virtual viewpoint images that differs from the one displayed in the selected image display area.

（構成１３）前記特定の操作情報は、キーボードのタイピング操作、マウスのクリック操作、マウスによるスクロール操作、仮想視点画像が表示されている表示装置に対するタッチ操作、スライド操作、フリックジェスチャ、ピンチイン・ピンチアウト操作の少なくとも何れか一つに関する操作情報である構成１２に記載の装置。 (Configuration 13) The device according to configuration 12, wherein the specific operation information is operation information regarding at least one of a keyboard typing operation, a mouse click operation, a mouse scroll operation, a touch operation on a display device on which a virtual viewpoint image is displayed, a slide operation, a flick gesture, and a pinch-in or pinch-out operation.

（構成１４）前記表示制御手段は、前記複数の仮想視点画像それぞれに対応するアイコンを前記選択画像表示領域に重畳表示し、前記アイコンを選択する入力を受け付けると、前記選択画像表示領域に表示している仮想視点画像と、前記入力された前記アイコンに対応する前記仮想視点画像とを切り替えることを特徴とする構成１２に記載の装置。 (Configuration 14) The device according to configuration 12, wherein the display control means superimposes icons corresponding to the respective virtual viewpoint images on the selected image display area and, upon receiving an input selecting one of the icons, switches from the virtual viewpoint image displayed in the selected image display area to the virtual viewpoint image corresponding to the selected icon.

（構成１５）前記表示制御手段は、前記仮想視点画像を示すアイコンを、前記仮想視点画像を示す画像に重畳表示することを特徴とする構成１乃至１４のいずれか１項に記載の装置。 (Configuration 15) The device according to any one of configurations 1 to 14, wherein the display control means displays an icon representing the virtual viewpoint image in a superimposed manner on an image representing the virtual viewpoint image.

（方法）立体形状のデジタルコンテンツの第１面に対応付けられた仮想視点画像であって、複数の撮像装置で撮像されることにより得られた複数の画像と仮想視点とに基づいて生成される仮想視点画像と、前記デジタルコンテンツの第２面に対応付けられた前記仮想視点画像に対応する前記仮想視点と異なる視点の画像とを特定する特定工程と、
前記仮想視点画像に対応する画像と、前記仮想視点と異なる視点の画像に対応する画像とを表示領域に表示する制御を行う表示制御工程と、
を有することを特徴とする画像処理方法。 (Method) An image processing method comprising:
an identifying step of identifying a virtual viewpoint image that is associated with a first surface of three-dimensional digital content and is generated based on a virtual viewpoint and a plurality of images obtained by imaging with a plurality of imaging devices, and an image that is associated with a second surface of the digital content and is of a viewpoint different from the virtual viewpoint corresponding to the virtual viewpoint image; and
a display control step of controlling display, in a display area, of an image corresponding to the virtual viewpoint image and an image corresponding to the image of the viewpoint different from the virtual viewpoint.

（プログラム）構成１乃至１５のいずれか１項に記載の装置に記載の各手段をコンピュータにより制御するためのプログラム。 (Program) A program for controlling each means described in the apparatus described in any one of Configurations 1 to 15 by a computer.

１カメラ 1 Camera
２形状推定部 2 Shape estimation unit
３画像生成部 3 Image generation unit
４コンテンツ生成部 4 Content generation unit
５保存部 5 Storage unit
１１５表示部 115 Display unit
１１６操作部 116 Operation unit
１００画像処理装置 100 Image processing device

Claims

An image processing system comprising:
identifying means for identifying a virtual viewpoint image that is associated with a first surface of three-dimensional digital content and is generated based on a virtual viewpoint and a plurality of images obtained by imaging with a plurality of imaging devices, and an image that is associated with a second surface of the digital content and is of a viewpoint different from the virtual viewpoint corresponding to the virtual viewpoint image; and
display control means for controlling display, in a display area, of an image corresponding to the virtual viewpoint image and an image corresponding to the image of the viewpoint different from the virtual viewpoint.

The image processing system according to claim 1, further comprising acquisition means for acquiring input information for selecting the display area,
wherein the display control means displays, based on the input information acquired by the acquisition means, an image corresponding to the selected display area in a selected image display area that is a display area different from the selected display area.

The image processing system according to claim 2, wherein the digital content includes a plurality of virtual viewpoint images generated based on the plurality of images and a plurality of virtual viewpoints including the virtual viewpoint.

The image processing system according to claim 3, wherein the display control means displays images corresponding to the plurality of virtual viewpoint images in respectively different display areas.

The image processing system according to claim 3, wherein images corresponding to the plurality of virtual viewpoint images are associated with a specific display area,
and the display control means displays a specific virtual viewpoint image among the plurality of virtual viewpoint images in the selected image display area when the acquisition means acquires an input selecting the display area that displays the images corresponding to the plurality of virtual viewpoint images.

The image processing system according to claim 5, wherein at least one virtual viewpoint among the plurality of virtual viewpoints corresponding to the plurality of virtual viewpoint images is determined based on a subject included in at least one of the plurality of images obtained by imaging with the plurality of imaging devices.

The image processing system according to claim 6, wherein the position of the virtual viewpoint is determined based on the position of a three-dimensional shape representing the subject.

The image processing system according to claim 3, wherein the position of at least one virtual viewpoint among the plurality of virtual viewpoints corresponding to the plurality of virtual viewpoint images is determined based on the position of a subject, and the line-of-sight direction from that virtual viewpoint is determined based on the orientation of the subject.

The image processing system according to claim 3, wherein the position of at least one virtual viewpoint among the plurality of virtual viewpoints corresponding to the plurality of virtual viewpoint images is determined based on a position a predetermined distance behind a subject, and the line-of-sight direction from that virtual viewpoint is determined based on the orientation of the subject.

The image processing system according to claim 3, wherein the position of at least one virtual viewpoint among the plurality of virtual viewpoints corresponding to the plurality of virtual viewpoint images is determined based on a position on a spherical surface centered on a subject, and the line-of-sight direction from that virtual viewpoint is determined based on the direction from the position of that virtual viewpoint toward the subject.

The image processing system according to claim 8, wherein the subject is a person,
and the orientation of the subject is the orientation of the face of the subject.

The image processing system according to claim 5, wherein, when specific operation information is input to the selected image display area, the display control means switches between the virtual viewpoint image displayed in the selected image display area and another of the plurality of virtual viewpoint images that differs from the one displayed in the selected image display area.

The image processing system according to claim 12, wherein the specific operation information is operation information regarding at least one of a keyboard typing operation, a mouse click operation, a mouse scroll operation, a touch operation on a display device on which a virtual viewpoint image is displayed, a slide operation, a flick gesture, and a pinch-in or pinch-out operation.

The image processing system according to claim 12, wherein the display control means superimposes icons corresponding to the respective virtual viewpoint images on the selected image display area and, upon receiving an input selecting one of the icons, switches from the virtual viewpoint image displayed in the selected image display area to the virtual viewpoint image corresponding to the selected icon.

The image processing system according to claim 1, wherein the display control means displays an icon representing the virtual viewpoint image in a superimposed manner on the image representing the virtual viewpoint image.

An image processing method comprising:
an identifying step of identifying a virtual viewpoint image that is associated with a first surface of three-dimensional digital content and is generated based on a virtual viewpoint and a plurality of images obtained by imaging with a plurality of imaging devices, and an image that is associated with a second surface of the digital content and is of a viewpoint different from the virtual viewpoint corresponding to the virtual viewpoint image; and
a display control step of controlling display, in a display area, of an image corresponding to the virtual viewpoint image and an image corresponding to the image of the viewpoint different from the virtual viewpoint.

A computer program for controlling each means of the image processing system according to any one of claims 1 to 15 by a computer.