JP7530206B2 - Information processing device, information processing method, and program

Info

Publication number: JP7530206B2
Application number: JP2020083365A
Authority: JP
Inventors: 智昭新井
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2020-05-11
Filing date: 2020-05-11
Publication date: 2024-08-07
Anticipated expiration: 2040-05-11
Also published as: JP2021179687A

Description

The present invention relates to an information processing device, an information processing method, and a program for generating a virtual viewpoint image.

Attention has been drawn to technology that generates a virtual viewpoint image for a specified virtual viewpoint from multi-viewpoint images captured synchronously by a plurality of cameras installed at different positions. With virtual viewpoint images, a specific scene (for example, a goal scene) in soccer, basketball, or the like can be viewed from various angles. A virtual viewpoint image can therefore give the user a greater sense of presence than a conventional captured image.

A technology is known that moves a virtual viewpoint along an arbitrary path and sequentially generates virtual viewpoint images corresponding to the position, orientation, and time of the virtual viewpoint set on that path. Hereinafter, the path along which the virtual viewpoint moves is referred to as a flow line, and a set of the position, orientation, and time of the virtual viewpoint set on the flow line is referred to as flow line information. One method for generating such flow line information is the key frame method. In the key frame method, the position and orientation of the virtual viewpoint are specified for a plurality of base frames on the flow line along which the virtual viewpoint moves. The position and orientation of the virtual viewpoint for the intermediate frames located between base frames are generated automatically by interpolation. Patent Document 1 discloses a technology that generates the motion path of a camera in a three-dimensional virtual space by interpolating the intermediate frames with a spline function.
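The interpolation step of the key frame method can be sketched as follows. This is only an illustration under stated assumptions: it interpolates positions with a uniform Catmull-Rom spline, which is one common spline and not necessarily the function used in Patent Document 1, and it leaves orientation out for brevity. The names `catmull_rom` and `interpolate_base_frames` and the `steps` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def catmull_rom(p0: Vec3, p1: Vec3, p2: Vec3, p3: Vec3, t: float) -> Vec3:
    """Point on a uniform Catmull-Rom segment between p1 and p2, t in [0, 1]."""
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (2.0 * b
               + (c - a) * t
               + (2.0 * a - 5.0 * b + 4.0 * c - d) * t2
               + (3.0 * b - a - 3.0 * c + d) * t3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def interpolate_base_frames(base: Sequence[Vec3], steps: int) -> List[Vec3]:
    """Return the base-frame positions plus `steps - 1` intermediate
    positions per segment (assumption: uniform spacing in t)."""
    if len(base) < 2 or steps < 1:
        raise ValueError("need at least two base frames and steps >= 1")
    path: List[Vec3] = []
    last = len(base) - 1
    for i in range(last):
        # Repeat the end points so the first and last segments have neighbours.
        p0, p1 = base[max(i - 1, 0)], base[i]
        p2, p3 = base[i + 1], base[min(i + 2, last)]
        for k in range(steps):
            path.append(catmull_rom(p0, p1, p2, p3, k / steps))
    path.append(tuple(float(v) for v in base[last]))
    return path
```

The curve passes through every base frame, but nothing in this computation looks at where an object is while the viewpoint moves between base frames.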

Patent Document 1: JP 2007-25979 A

However, in the key frame method disclosed in Patent Document 1 and elsewhere, the position and orientation of the virtual viewpoint for the intermediate frames are interpolated from the position and orientation set for the base frames, and the position of an object while the virtual viewpoint moves along the flow line is not taken into account. As a result, an object of interest may fall outside the generated virtual viewpoint image. Ensuring that the desired object always appears in the virtual viewpoint image requires setting base frames at finer intervals, which increases the effort required to generate the flow line information of the virtual viewpoint.

An object of the present invention is to provide a technology that facilitates setting a virtual viewpoint for generating a virtual viewpoint image that includes a predetermined object.

An information processing device according to one aspect of the present invention has the following configuration. That is, it is
an information processing device that generates a position of a virtual viewpoint and a line-of-sight direction from the virtual viewpoint for generating a virtual viewpoint image from a plurality of images obtained by capturing a predetermined space with a plurality of image capturing devices, the device comprising:
an acquisition means for acquiring a first position and a second position of the virtual viewpoint, and a first time and a second time that correspond to the first position and the second position and indicate capture times of the predetermined space;
a generating means for generating a flow line that connects the first position and the second position and along which the virtual viewpoint moves;
a setting means for setting a plurality of positions of the virtual viewpoint between the first position and the second position on the flow line, and setting a capture time of the predetermined space corresponding to each of the plurality of positions based on the first time and the second time; and
a determination means for determining the line-of-sight direction from the virtual viewpoint at each of the plurality of positions based on the position of a specific object in the predetermined space at the capture time set for that position,
wherein the acquisition means acquires first flow line information including a plurality of positions of the virtual viewpoint that include the first position, together with a line-of-sight direction from the virtual viewpoint and a capture time at each of those positions, and second flow line information including a plurality of positions of the virtual viewpoint that include the second position, together with a line-of-sight direction from the virtual viewpoint and a capture time at each of those positions, and
wherein the generating means generates the flow line connecting the first position and the second position as the flow line of third flow line information along which the virtual viewpoint moves.
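The setting and determination steps described above can be sketched as follows. This is a minimal sketch under several assumptions that the text does not fix: the connecting path is a straight line, capture times are frame indices spread linearly between the first and second times, and the line of sight is expressed as a pan angle (in the x-y plane, measured from the +x axis) and a tilt angle (elevation from the x-y plane). The names `ViewpointSample`, `look_at`, `connect`, and the callback `object_position_at` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ViewpointSample:
    frame: int        # capture time as a frame index (assumed representation)
    position: Vec3
    pan_deg: float
    tilt_deg: float


def look_at(position: Vec3, target: Vec3) -> Tuple[float, float]:
    """Pan and tilt (degrees) that point the viewpoint at the target."""
    dx, dy, dz = (t - p for t, p in zip(target, position))
    pan = math.degrees(math.atan2(dy, dx))
    tilt = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    return pan, tilt


def connect(first_pos: Vec3, first_frame: int,
            second_pos: Vec3, second_frame: int,
            object_position_at: Callable[[int], Vec3],
            count: int) -> List[ViewpointSample]:
    """Place `count` viewpoints between the two positions, assign each a
    capture time, and aim each at the object's position at that time."""
    samples = []
    for k in range(1, count + 1):
        s = k / (count + 1)
        pos = tuple(a + (b - a) * s for a, b in zip(first_pos, second_pos))
        frame = round(first_frame + (second_frame - first_frame) * s)
        pan, tilt = look_at(pos, object_position_at(frame))
        samples.append(ViewpointSample(frame, pos, pan, tilt))
    return samples
```

Because the direction is computed from the object's position at each sample's own capture time, the object stays in view even if it moves while the viewpoint travels along the connecting path.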

According to the present invention, a virtual viewpoint for generating a virtual viewpoint image that includes a predetermined object can be set easily.

FIG. 1 is a diagram schematically showing an example of an image processing system.
FIG. 2 is a diagram showing an example of the configuration of the image processing system.
FIG. 3 is a diagram showing an example of the hardware configuration of each device.
FIG. 4 is a diagram showing an example of the functional configuration of an image generating device.
FIG. 5 is a diagram showing an example of the functional configuration of an information processing device.
FIG. 6 is a diagram showing an example of a virtual viewpoint operation.
FIG. 7 is a diagram showing an example of the configuration of virtual viewpoint parameters.
FIG. 8 is a diagram showing an example of the configuration of metadata.
FIG. 9 is a flowchart showing the flow of a series of processes in the information processing device.
FIG. 10 is a diagram illustrating a virtual viewpoint parameter group.
FIG. 11 is a diagram showing an example of a connecting flow line generated by a flow line generating unit.
FIG. 12 is an explanatory diagram relating to the determination of an object to be followed.
FIG. 13 is a diagram showing an example of a virtual viewpoint moving on a connecting flow line.
FIG. 14 is a diagram showing an example of a state in which three flow lines are joined.
FIG. 15 is a diagram showing an example of a condition for automatically joining two flow lines.

Embodiments are described in detail below with reference to the attached drawings. Note that the following embodiments do not limit the invention according to the claims. Although the embodiments describe a plurality of features, not all of these features are necessarily essential to the invention, and the features may be combined in any manner. Furthermore, in the attached drawings, the same reference numbers are given to the same or similar configurations, and duplicate explanations are omitted.

(System Configuration)
FIG. 1 is a diagram schematically illustrating an example of an image processing system 100 according to an embodiment. The image processing system 100 has a plurality of image capturing devices (hereinafter, cameras 104) installed in a stadium 101. The stadium 101 includes spectator seats 102 and a field 103 where a game or the like is actually played. Each of the plurality of cameras 104 captures at least a portion of the field 103, which is the capture target. The plurality of cameras 104 are arranged so that the angles of view of at least two cameras overlap.

FIG. 2 is a diagram showing an example of the schematic device configuration of the image processing system 100. In one example, the image processing system 100 includes the plurality of cameras 104 installed in the stadium 101, an image generating device 201, an information processing device 202, and a user terminal 203. The plurality of cameras 104 are connected to each other via, for example, a transmission cable, and are also connected to the image generating device 201. Each of the cameras 104 transmits the images it acquires by capturing the field 103 to the image generating device 201 via the transmission cable. In the example of FIG. 1, the plurality of cameras 104 are arranged so that all or part of the stadium 101, such as a soccer stadium, is captured. The cameras 104 may be configured to capture still images, moving images, or both. In this embodiment, unless otherwise specified, the term "image" covers both still images and moving images.

The image generating device 201 aggregates and accumulates the plurality of images captured by the plurality of cameras 104, and generates a virtual viewpoint image corresponding to a specified virtual viewpoint. Hereinafter, an image captured by a camera 104 may be referred to as a captured image. A plurality of virtual viewpoint images generated for a plurality of virtual viewpoints according to the positions, orientations, and times specified by a virtual viewpoint parameter group is referred to as a virtual viewpoint image group. In this specification, a virtual viewpoint parameter group including sets of the position, orientation, and time of the virtual viewpoint set along the path (flow line) of the virtual viewpoint is referred to as flow line information. The image generating device 201 generates a virtual viewpoint image group based on the flow line information received from the information processing device 202, and transmits the generated virtual viewpoint image group to the information processing device 202. The image generating device 201 also transmits metadata to the information processing device 202. The metadata includes, for example, position information indicating the positions of players and the like playing on the field 103. Details of the metadata will be described later. The image generating device 201 has a database function for storing the plurality of captured images and the generated virtual viewpoint image group, and an image processing function for generating the virtual viewpoint image group. The plurality of cameras 104 in the stadium 101 and the image generating device 201 are connected by, for example, a wired or wireless communication network line, or a cable line such as SDI (Serial Digital Interface). The image generating device 201 receives captured images from the plurality of cameras 104 through this line, and stores the received captured images in a database.

The information processing device 202 is a device that edits the virtual viewpoint (position, orientation, and the like). The information processing device 202 stores the virtual viewpoint parameter group received from the user terminal 203 in a storage unit 502 (FIG. 5). The information processing device 202 transmits flow line information to the image generating device 201 and receives the virtual viewpoint image group corresponding to the transmitted flow line information. The storage unit 502 stores a plurality of virtual viewpoint parameter groups corresponding to a plurality of pieces of flow line information. The information processing device 202 acquires two pieces of flow line information (virtual viewpoint parameter groups) stored in the storage unit 502 and joins them. More specifically, the information processing device 202 generates a flow line that connects the two flow lines indicated by the two pieces of flow line information (hereinafter sometimes referred to as a connecting flow line), and generates connecting flow line information by generating position and orientation information for the virtual viewpoint moving along the connecting flow line. In doing so, the information processing device 202 generates the orientation information of the virtual viewpoint moving along the connecting flow line so that the virtual viewpoint follows a predetermined object. The information processing device 202 joins the two pieces of flow line information by means of the connecting flow line information to generate one new piece of flow line information. That is, the information processing device 202 connects three pieces of flow line information (the two acquired pieces and the connecting flow line information) to generate new flow line information (a virtual viewpoint parameter group). The information processing device 202 is, for example, a personal computer. The information processing device 202 may be incorporated into the image generating device 201 or the user terminal 203. That is, the information processing device 202 may be realized as a device integrated with at least one of the image generating device 201 and the user terminal 203.

The user terminal 203 is an information processing device possessed by a user of the image processing system 100. The user terminal 203 can accept user operation instructions relating to the position and orientation of the virtual viewpoint. FIG. 6 shows an example of a virtual viewpoint operation. For example, the user can move the position of a virtual viewpoint 602 so that it is diagonally forward and to the right of a player 601 while directing the line of sight of the virtual viewpoint 602 toward the player 601. The flow line along which the virtual viewpoint 602 has moved is referred to as a flow line 603. Each time the position or orientation of the virtual viewpoint changes, the user terminal 203 transmits a virtual viewpoint parameter indicating the content of the operation instruction to the information processing device 202. Alternatively, when a series of operations specifying the position and orientation of the virtual viewpoint as shown in FIG. 6 is completed, the user terminal 203 can transmit a virtual viewpoint parameter group indicating the content of that series of operation instructions to the information processing device 202.

FIG. 7 shows an example of the configuration of the virtual viewpoint parameters that make up the flow line information. A virtual viewpoint parameter includes time information, position information, and orientation information. The time information has the form HH (hours):MM (minutes):SS (seconds).FF (frames). The position information can be expressed as three-dimensional Cartesian coordinates in a coordinate system whose three axes (x-axis, y-axis, and z-axis) intersect orthogonally at the origin. The origin can be any position in the capture space. For example, the origin can be the center of the center circle of a soccer field. The orientation information can be expressed as angles about the three axes of pan (horizontal direction), tilt (vertical direction), and roll (the direction in which the camera rotates). A virtual viewpoint parameter group is data in which virtual viewpoint parameters are arranged in time series.
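The parameter layout described above can be sketched as a small data structure. This is a sketch under assumptions, not the format used in the embodiment: the frame rate (`FPS = 60`), the two-digit frame field, angles in degrees, and the names `VirtualViewpointParameter`, `timecode_to_frames`, and `as_parameter_group` are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Tuple

FPS = 60  # assumed frame rate; the text does not state one

_TIMECODE = re.compile(r"^(\d{2}):(\d{2}):(\d{2})\.(\d{2})$")


def timecode_to_frames(timecode: str, fps: int = FPS) -> int:
    """Convert "HH:MM:SS.FF" into a frame count from 00:00:00.00."""
    match = _TIMECODE.match(timecode)
    if match is None:
        raise ValueError(f"not an HH:MM:SS.FF timecode: {timecode!r}")
    hh, mm, ss, ff = (int(part) for part in match.groups())
    if mm >= 60 or ss >= 60 or ff >= fps:
        raise ValueError(f"field out of range in {timecode!r}")
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


@dataclass(frozen=True)
class VirtualViewpointParameter:
    timecode: str                          # "HH:MM:SS.FF"
    position: Tuple[float, float, float]   # (x, y, z) from the chosen origin
    pan: float                             # degrees (assumed unit)
    tilt: float
    roll: float


def as_parameter_group(
    params: Iterable[VirtualViewpointParameter], fps: int = FPS
) -> List[VirtualViewpointParameter]:
    """Arrange parameters in time series to form a parameter group."""
    return sorted(params, key=lambda p: timecode_to_frames(p.timecode, fps))
```

Converting the timecode to a frame count gives a single integer per parameter, which makes ordering and interpolating between two times straightforward.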

The user terminal 203 receives a virtual viewpoint image generated based on the virtual viewpoint parameters from the information processing device 202 and displays it on a display device. The display device may be built into the user terminal 203 or may belong to an external device. The user terminal 203 is, for example, a personal computer or a mobile terminal such as a smartphone or tablet. The user terminal 203 has an interface for accepting user operations, such as at least one of a mouse, a keyboard, a six-axis controller, and a touch panel.

The image generating device 201, the information processing device 202, and the user terminal 203 are configured to be able to exchange information with one another via a network such as the Internet. Communication between the devices may be performed by wireless communication, wired communication, or a combination of the two.

(Hardware Configuration of the Devices)
Next, an example of the hardware configuration of the image generating device 201, the information processing device 202, and the user terminal 203 described above will be explained with reference to FIG. 3. In this embodiment, each device has the hardware configuration shown in the block diagram of FIG. 3, and includes a controller unit 300, an operation unit 309, and a display device 310. Note that the image generating device 201, the information processing device 202, and the user terminal 203 need not have the same hardware configuration, and need not have all of the components described below.

The controller unit 300 includes a CPU 301, a ROM 302, a RAM 303, an HDD 304, an operation unit I/F 305, a display unit I/F 306, and a communication I/F 307. These hardware blocks are connected so that they can communicate with one another via, for example, a system bus 308. CPU stands for Central Processing Unit, HDD stands for Hard Disk Drive, and I/F stands for interface.

The CPU 301 controls the operation of the ROM 302, the RAM 303, the HDD 304, the operation unit I/F 305, the display unit I/F 306, and the communication I/F 307 via the system bus 308. The CPU 301 starts the OS (Operating System) using a boot program stored in the ROM 302. On this OS, the CPU 301 executes an application program stored in, for example, the HDD 304. The various processes of each device are realized by the CPU 301 executing the application program. The ROM 302 is a read-only semiconductor memory and stores various programs executed by the CPU 301 as well as parameters. The RAM 303 is a semiconductor memory that can be read and written at any time and provides, for example, an area for storing temporary information and a work area for the CPU 301. The HDD 304 is a large-capacity storage device such as a hard disk and stores application programs, image data, and the like. Note that the CPU 301 may be replaced by one or more arbitrary processors (for example, an ASIC (application-specific integrated circuit), a DSP (digital signal processor), or an FPGA (field-programmable gate array)).

The operation unit I/F 305 is an interface that connects the operation unit 309 to the controller unit 300. The operation unit I/F 305 transfers information on user operations accepted by the operation unit 309 to the CPU 301. The operation unit 309 includes devices capable of accepting user operations, such as a mouse, a keyboard, and a touch panel. The display unit I/F 306 is an interface that connects the display device 310 to the controller unit 300. For example, the CPU 301 outputs image data to be displayed to the display device 310 via the display unit I/F 306. The display device 310 includes a display such as a liquid crystal display. The communication unit I/F 307 is an interface for performing communication over, for example, Ethernet (registered trademark). The communication unit I/F 307 has a connector or the like for accepting the connection of a transmission cable. The communication unit I/F 307 may instead be a wireless communication interface, in which case it has, for example, a baseband circuit, an RF circuit, and an antenna. The controller unit 300 exchanges information with external devices via the communication unit I/F 307.

Note that the display device 310 of each device may be an external device. That is, a device can also perform display control that causes an external display device 310, connected via a cable or a network, to display an image. In this case, the device executes display control for outputting display data to the display device 310. The configuration in FIG. 3 is an example: some components may be omitted, components not shown may be added, and the components shown may be combined. For example, the image generating device 201 need not have the display device 310.

(Functional Configuration of the Image Generating Device 201)
FIG. 4 shows an example of the functional configuration of the image generating device 201. The functional configuration shown in FIG. 4 is realized by the CPU 301 of the image generating device 201 reading various programs recorded in the ROM 302 or the like and controlling each unit. Note that part or all of the functional configuration shown in FIG. 4 may be realized by dedicated hardware (for example, an ASIC or an FPGA).

In one example, the image generating device 201 has a control unit 401, a storage unit 402, an image input unit 403, an image storage unit 404, an image generating unit 405, a metadata generating unit 406, and a data transmitting/receiving unit 407. These functional units are interconnected by an internal bus 408 and can exchange data with one another under the control of the control unit 401. The control unit 401 controls the operation of the entire image generating device 201, including each of the functional units described below. The storage unit 402 includes the ROM 302, the RAM 303, and the HDD 304, and stores and reads out various programs and various data such as captured images.

The image input unit 403 acquires, at a predetermined frame rate and via the communication unit I/F 307, images (captured images) captured by the plurality of cameras 104 installed in the stadium 101. The image input unit 403 acquires the captured images from the cameras 104 through, for example, a wired or wireless communication module, or an image transmission module such as SDI. The image storage unit 404 stores the captured images acquired by the image input unit 403, the virtual viewpoint image group generated based on those captured images, and metadata indicating the positions of objects in, for example, the HDD 304. The virtual viewpoint image group is generated by the image generating unit 405 described later, and the metadata is generated by the metadata generating unit 406 described later. The image storage unit 404 can be realized by, for example, a magnetic disk, an optical disk, or a semiconductor memory. The image storage unit 404 may be realized by a device built into the image generating device 201 (the RAM 303 or the HDD 304) or by an external device physically separate from the image generating device 201. The captured images and the virtual viewpoint image group stored in the image storage unit 404 are stored in, for example, the MXF (Material eXchange Format) image format, and are compressed in, for example, the MPEG-2 format. However, the image format and the data compression method are not limited to these, and any image format and data compression method may be used.

The image generating unit 405 generates a virtual viewpoint image group corresponding to a plurality of virtual viewpoints from the plurality of captured images stored in the image storage unit 404. The virtual viewpoint image group can be generated using, for example, image-based rendering (hereinafter, IBR). IBR is a rendering method that generates a virtual viewpoint image from images captured from a plurality of actual viewpoints, without performing modeling (the process of creating the shape of an object using geometric figures). However, the method of generating the virtual viewpoint image group is not limited to IBR. For example, the virtual viewpoint image group may be generated using model-based rendering (hereinafter, MBR). MBR is a method that generates a virtual viewpoint image using a three-dimensional model generated from a plurality of captured images obtained by capturing a subject from a plurality of directions. In MBR, the appearance of the scene from the virtual viewpoint is generated as an image by using the three-dimensional shape (model) of the target scene obtained by a three-dimensional shape reconstruction technique such as the visual volume intersection method or multi-view stereo (MVS).

なお、画像生成部４０５により生成される仮想視点画像群は、様々な仮想視点の位置及び視線の方向の仮想視点画像を含み、１つの画像ストリームとして空間方向及び時間方向に圧縮符号化される。但し、仮想視点画像群の形態はこのような画像ストリームに限られるものではなく、それぞれが独立した複数の画像によって構成されていてもよい。また、仮想視点画像群は、圧縮符号化されていなくてもよい。また、本実施形態では画像生成装置２０１が仮想視点画像群を生成するが、これに限られるものではない。例えば、画像生成装置２０１は、三次元形状データ等の三次元モデルを示す情報や、その情報によって示される三次元モデルにマッピングされる画像等の、仮想視点画像を生成するための情報を生成してもよい。すなわち、画像生成部４０５は、レンダリングされた仮想視点画像を生成することに代えて、情報処理装置２０２またはユーザ端末２０３で仮想視点画像をレンダリングするために必要な情報を生成してもよい。 The virtual viewpoint image group generated by the image generation unit 405 includes virtual viewpoint images of various virtual viewpoint positions and line of sight directions, and is compressed and encoded in the spatial direction and the temporal direction as one image stream. However, the form of the virtual viewpoint image group is not limited to such an image stream, and each may be composed of a plurality of independent images. Also, the virtual viewpoint image group does not have to be compressed and encoded. Also, in this embodiment, the image generation device 201 generates the virtual viewpoint image group, but this is not limited to this. For example, the image generation device 201 may generate information for generating a virtual viewpoint image, such as information indicating a three-dimensional model such as three-dimensional shape data, or an image mapped to a three-dimensional model indicated by the information. That is, instead of generating a rendered virtual viewpoint image, the image generation unit 405 may generate information necessary for rendering the virtual viewpoint image in the information processing device 202 or the user terminal 203.

メタデータ生成部４０６は、画像記憶部４０４に記憶された撮影画像を解析し、オブジェクトの位置情報を取得する。メタデータ生成部４０６は、例えば、ビジュアルハルなどの技術を用いて特定のオブジェクトの位置情報を取得する。メタデータ生成部４０６は、画像記憶部４０４に記憶された撮影画像の撮影開始から撮影終了までのすべての時刻における、オブジェクトの位置情報を取得する。メタデータ生成部４０６は、オブジェクトについて取得した位置情報を含むメタデータを生成する。図８に、メタデータ生成部４０６により生成されるメタデータの構成例を示す。メタデータは、例えば、時刻情報と各オブジェクトの位置情報とを含む。時刻情報は、ＨＨ（時間）：ＭＭ（分）：ＳＳ（秒）．ＦＦ（フレーム）で構成される。位置情報は、３次元直交座標を用いて、オブジェクト（例えば、ボール、選手、審判）の位置を示す。オブジェクト名は、任意の名称とオブジェクトＩＤで構成される。オブジェクトＩＤは、アルファベットや数字で構成され、それぞれのオブジェクトを区別できるようにオブジェクトごとに割り振られた識別記号である。なお、メタデータは、メタデータ生成部４０６による撮影画像の解析により取得されるものに限定されず、画像生成装置２０１や情報処理装置２０２に事前に登録されたものであってもよい。 The metadata generating unit 406 analyzes the captured images stored in the image storage unit 404 and acquires the position information of the object. The metadata generating unit 406 acquires the position information of a specific object using, for example, a technique such as visual hull. The metadata generating unit 406 acquires the position information of the object at all times from the start of shooting to the end of shooting of the captured images stored in the image storage unit 404. The metadata generating unit 406 generates metadata including the acquired position information for the object. FIG. 8 shows an example of the configuration of metadata generated by the metadata generating unit 406. The metadata includes, for example, time information and position information of each object. The time information is composed of HH (hours): MM (minutes): SS (seconds). FF (frames). The position information indicates the position of an object (for example, a ball, a player, a referee) using three-dimensional orthogonal coordinates. The object name is composed of an arbitrary name and an object ID. The object ID is an identification symbol composed of alphabets and numbers and assigned to each object so that each object can be distinguished. 
Note that the metadata is not limited to metadata obtained by the metadata generating unit 406 analyzing the captured images; it may instead be metadata registered in advance in the image generating device 201 or the information processing device 202.
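As an illustration of the metadata layout described above (time information in HH:MM:SS.FF form plus a three-dimensional position per object ID), one record could be modeled as follows. This is only a sketch under assumptions: the class names `Timecode` and `MetadataRecord`, the field names, and the `fps` argument are hypothetical and do not appear in the specification, which only states that a predetermined frame rate is used.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]  # three-dimensional orthogonal coordinates (x, y, z)


@dataclass(frozen=True)
class Timecode:
    """Time information in the HH:MM:SS.FF form described for FIG. 8."""
    hours: int
    minutes: int
    seconds: int
    frames: int

    @classmethod
    def parse(cls, text: str) -> "Timecode":
        # Split "HH:MM:SS.FF" into its four integer fields.
        hms, ff = text.split(".")
        hh, mm, ss = hms.split(":")
        return cls(int(hh), int(mm), int(ss), int(ff))

    def to_frame_index(self, fps: int) -> int:
        # fps is an assumed parameter standing in for the predetermined frame rate.
        total_seconds = (self.hours * 60 + self.minutes) * 60 + self.seconds
        return total_seconds * fps + self.frames


@dataclass
class MetadataRecord:
    """One metadata entry: a time plus the position of every tracked object."""
    time: Timecode
    positions: Dict[str, Vec3]  # object ID -> position
    names: Dict[str, str]       # object ID -> arbitrary display name


# Hypothetical example record with one player and the ball.
record = MetadataRecord(
    time=Timecode.parse("00:12:34.15"),
    positions={"P01": (10.0, 5.0, 0.0), "B01": (12.5, 4.0, 0.3)},
    names={"P01": "player", "B01": "ball"},
)
```

Converting the time information to a single frame index makes it straightforward to look up the position of an object ID for any frame on a flow line.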

データ送受信部４０７は、画像記憶部４０４に記憶される仮想視点画像群を情報処理装置２０２へ所定のフレームレートで出力する。また、データ送受信部４０７は、仮想視点画像群を出力する前に画像記憶部４０４に記憶されるメタデータを情報処理装置２０２へ出力する。また、データ送受信部４０７は、情報処理装置２０２から動線情報（仮想視点パラメータ群）を受信する。 The data transmission/reception unit 407 outputs the group of virtual viewpoint images stored in the image storage unit 404 to the information processing device 202 at a predetermined frame rate. In addition, the data transmission/reception unit 407 outputs metadata stored in the image storage unit 404 to the information processing device 202 before outputting the group of virtual viewpoint images. In addition, the data transmission/reception unit 407 receives movement line information (group of virtual viewpoint parameters) from the information processing device 202.

（情報処理装置２０２の機能構成）
図５は、情報処理装置２０２の機能構成例を示すブロック図である。図５に示される機能構成は、情報処理装置２０２のＣＰＵ３０１が、ＲＯＭ３０２等に記録された各種プログラムを読み出して各部の制御を実行することにより実現される。なお、図５で示される機能部の一部または全部が専用のハードウェア（例えばＡＳＩＣやＦＰＧＡ）により実現されてもよい。 (Functional configuration of information processing device 202)
Fig. 5 is a block diagram showing an example of a functional configuration of the information processing device 202. The functional configuration shown in Fig. 5 is realized by the CPU 301 of the information processing device 202 reading various programs recorded in the ROM 302 or the like and executing control of each unit. Note that some or all of the functional units shown in Fig. 5 may be realized by dedicated hardware (e.g., ASIC or FPGA).

情報処理装置２０２は、一例において、制御部５０１、記憶部５０２、データ送受信部５０３、画像記憶部５０４、動線生成部５０５、姿勢生成部５０６、動線結合部５０７、画像取得部５０８、及び、入出力部５０９を有する。これらの機能部は、内部バス５１０によって相互に接続され、制御部５０１による制御の下で、相互にデータを送受信することができる。制御部５０１は、以下に説明する各機能部を含む、情報処理装置２０２全体の動作を制御する。記憶部５０２は、ＲＯＭ３０２、ＲＡＭ３０３、ＨＤＤ３０４を含み、種々のプログラム、撮影画像等の種々のデータの格納、読み出しを行う。 In one example, the information processing device 202 has a control unit 501, a memory unit 502, a data transmission/reception unit 503, an image memory unit 504, a movement line generation unit 505, a posture generation unit 506, a movement line connection unit 507, an image acquisition unit 508, and an input/output unit 509. These functional units are connected to each other by an internal bus 510, and can transmit and receive data to each other under the control of the control unit 501. The control unit 501 controls the operation of the entire information processing device 202, including each of the functional units described below. The memory unit 502 includes a ROM 302, a RAM 303, and a HDD 304, and stores and reads out various programs, various data such as captured images, etc.

データ送受信部５０３は、画像生成装置２０１から仮想視点画像群とメタデータを受信する。またデータ送受信部５０３は、画像生成装置２０１に動線情報（仮想視点パラメータ群）を送信する。画像記憶部５０４は、データ送受信部５０３によって受信された仮想視点画像群およびメタデータ、動線結合部５０７および入出力部５０９が取得した動線情報（仮想視点パラメータ群）などを記憶する。画像記憶部５０４は、例えば、磁気ディスク、光ディスク、半導体メモリ等によって実現される。なお、画像記憶部５０４は、情報処理装置２０２に内蔵された装置（ＲＯＭ３０２、ＲＡＭ３０３、ＨＤＤ３０４）によって実現されてもよいし、情報処理装置２０２とは物理的に切り離された外部の装置によって実現されてもよい。 The data transmission/reception unit 503 receives a group of virtual viewpoint images and metadata from the image generation device 201. The data transmission/reception unit 503 also transmits movement line information (group of virtual viewpoint parameters) to the image generation device 201. The image storage unit 504 stores the group of virtual viewpoint images and metadata received by the data transmission/reception unit 503, the movement line information (group of virtual viewpoint parameters) acquired by the movement line connection unit 507 and the input/output unit 509, and the like. The image storage unit 504 is realized, for example, by a magnetic disk, an optical disk, a semiconductor memory, or the like. The image storage unit 504 may be realized by a device (ROM 302, RAM 303, HDD 304) built into the information processing device 202, or may be realized by an external device physically separated from the information processing device 202.

動線生成部５０５、姿勢生成部５０６、動線結合部５０７は、画像記憶部５０４に記憶されている２つの動線情報を接続するための接続動線情報を生成する。動線生成部５０５は、例えば、結合の対象となる２つの動線のうち、一方の動線（以下、第１の動線）上の点を接続動線の始点として決定し、もう一方の動線（以下、第２の動線）上の点を接続動線の終点として決定する。本実施形態では、第１の動線の終点、第２の動線の始点が用いられる。動線生成部５０５は、決定した接続動線の始点から終点へ仮想視点が移動するように直線状の接続動線を生成し、接続動線の線上を移動する仮想視点の位置情報を時刻単位で補間する。なお、本実施形態では始点と終点の間を直線により補間する例を示すが、これに限られるものではない。例えば、スプライン関数などにより表される曲線が接続動線として用いられてもよい。また、結合の対象となる２つの動線のうちのいずれを第１の動線、第２の動線とするかは、動線生成部５０５により自動的に決定されてもよいし、ユーザ端末２０３を介してユーザにより明示的に指定されてもよい。例えば、動線生成部５０５は、結合対象の２つの動線のうち、仮想視点パラメータ群の先頭に位置する仮想視点パラメータの時刻情報（開始時刻）が早い方を第１の動線に決定する。 The flow line generating unit 505, the posture generating unit 506, and the flow line connecting unit 507 generate connecting flow line information for connecting two pieces of flow line information stored in the image storage unit 504. For example, the flow line generating unit 505 determines a point on one of the two flow lines to be connected (hereinafter, the first flow line) as the start point of the connecting flow line, and determines a point on the other flow line (hereinafter, the second flow line) as the end point of the connecting flow line. In this embodiment, the end point of the first flow line and the start point of the second flow line are used. The flow line generating unit 505 generates a linear connecting flow line so that the virtual viewpoint moves from the start point to the end point of the determined connecting flow line, and interpolates the position information of the virtual viewpoint moving on the connecting flow line in units of time. Note that in this embodiment, an example is shown in which a straight line is used to interpolate between the start point and the end point, but this is not limited to this. For example, a curve expressed by a spline function or the like may be used as the connecting flow line. 
Which of the two flow lines to be combined is treated as the first flow line and which as the second flow line may be determined automatically by the flow line generation unit 505, or may be explicitly specified by the user via the user terminal 203. For example, the flow line generation unit 505 selects as the first flow line whichever of the two flow lines has the earlier time information (start time) in the virtual viewpoint parameter at the head of its virtual viewpoint parameter group.
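The ordering rule and the straight-line connection described above can be sketched as follows. This is a minimal illustration, assuming each flow line is a list of parameters ordered by playback; `ViewpointParam`, `order_flow_lines`, `connecting_endpoints`, and `lerp` are hypothetical names, and time is simplified to an integer frame index.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ViewpointParam:
    frame: int      # time information, simplified to a frame index (assumption)
    position: Vec3  # position of the virtual viewpoint


def order_flow_lines(a: List[ViewpointParam], b: List[ViewpointParam]):
    """Return (first, second): the flow line whose head parameter is earlier comes first."""
    return (a, b) if a[0].frame <= b[0].frame else (b, a)


def connecting_endpoints(first: List[ViewpointParam], second: List[ViewpointParam]):
    """Start of the connecting flow line = end of first; end = start of second."""
    return first[-1], second[0]


def lerp(p: Vec3, q: Vec3, t: float) -> Vec3:
    """Point on the straight line from p to q at parameter t in [0, 1]."""
    return tuple(pi + (qi - pi) * t for pi, qi in zip(p, q))


# Hypothetical flow lines A and B that overlap in time.
line_a = [ViewpointParam(100, (0.0, 0.0, 0.0)), ViewpointParam(200, (10.0, 0.0, 0.0))]
line_b = [ViewpointParam(150, (0.0, 10.0, 0.0)), ViewpointParam(260, (5.0, 5.0, 0.0))]

first, second = order_flow_lines(line_b, line_a)
start, end = connecting_endpoints(first, second)
midpoint = lerp(start.position, end.position, 0.5)
```

A spline could replace `lerp` without changing the rest of the sketch, which matches the remark that the connecting flow line need not be straight.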

更に、動線生成部５０５は、接続動線上の仮想視点画像を表示する際の再生速度を決定する。動線生成部５０５は、接続動線の始点と終点の距離または時刻の差分（時間差）によって再生速度を調整する。例えば、動線生成部５０５は、接続動線の始点と終点の間の距離が所定値よりも大きい場合に、通常の再生速度よりも早い再生速度を設定する。これにより、仮想視点が接続動線を移動する間は早送り再生が行われる。また、接続動線の始点と終点の間の距離が大きいほど速い再生速度が設定されるようにしてもよい。また、例えば、動線生成部５０５は、接続動線の始点と終点の時間差が所定値よりも大きい場合に通常の再生速度よりも早い再生速度を設定するようにしてもよい。また、接続動線の始点と終点の時間差が大きいほど早い再生速度が設定されるようにしてもよい。なお、接続動線の始点と終点の距離や時間差に関係なく、無条件で、所定速度の早送り再生が行われるようにしてもよい。 Furthermore, the flow line generating unit 505 determines the playback speed when displaying the virtual viewpoint image on the connecting flow line. The flow line generating unit 505 adjusts the playback speed according to the distance or time difference (time difference) between the start point and the end point of the connecting flow line. For example, when the distance between the start point and the end point of the connecting flow line is greater than a predetermined value, the flow line generating unit 505 sets a playback speed faster than the normal playback speed. As a result, fast-forward playback is performed while the virtual viewpoint moves along the connecting flow line. Also, the playback speed may be set faster as the distance between the start point and the end point of the connecting flow line is greater. Also, for example, the flow line generating unit 505 may set a playback speed faster than the normal playback speed when the time difference between the start point and the end point of the connecting flow line is greater than a predetermined value. Also, the playback speed may be set faster as the time difference between the start point and the end point of the connecting flow line is greater. Note that fast-forward playback at a predetermined speed may be performed unconditionally, regardless of the distance or time difference between the start point and the end point of the connecting flow line.
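The threshold-based speed selection described above could look like the following sketch. The function names, the threshold values, and the fast-forward factor are all assumptions chosen for illustration; the specification only says that a faster-than-normal speed is set when the distance or the time difference exceeds a predetermined value, and that the speed may grow with the distance or time difference.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def playback_speed(start_pos: Vec3, end_pos: Vec3,
                   start_frame: int, end_frame: int,
                   distance_threshold: float = 20.0,  # assumed value
                   frame_threshold: int = 120,        # assumed value
                   fast_speed: float = 2.0) -> float:  # assumed value
    """Return a playback speed multiplier for the connecting flow line (1.0 = normal)."""
    distance = math.dist(start_pos, end_pos)
    frame_gap = abs(end_frame - start_frame)
    if distance > distance_threshold or frame_gap > frame_threshold:
        return fast_speed
    return 1.0


def scaled_playback_speed(start_pos: Vec3, end_pos: Vec3,
                          distance_threshold: float = 20.0) -> float:
    """Variant in which the speed grows with the distance, never below normal speed."""
    return max(1.0, math.dist(start_pos, end_pos) / distance_threshold)
```

For example, with the assumed defaults, endpoints 50 units apart yield `playback_speed(...) == 2.0` and `scaled_playback_speed(...) == 2.5`.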

姿勢生成部５０６は、動線生成部５０５が生成した接続動線の線上を移動する仮想視点の姿勢情報を生成する。姿勢生成部５０６は、動線生成部５０５が結合の対象とした２つの動線のうちのどちらか一方の動線の始点または終点に位置する仮想視点パラメータを取得する。姿勢生成部５０６は、取得した仮想視点パラメータを画像取得部５０８へ提供することにより画像取得部５０８に仮想視点画像を取得させる。姿勢生成部５０６は、画像取得部５０８が取得した仮想視点画像の中央または中央付近に位置するオブジェクトを検出し、検出したオブジェクトの３次元直交座標を算出する。姿勢生成部５０６は、仮想視点パラメータの時刻情報と算出した３次元直交座標を用いて、画像記憶部５０４に記憶されるメタデータと照合し、検出したオブジェクトのオブジェクトＩＤを特定する。姿勢生成部５０６は、接続動線の動線上を移動する仮想視点の時刻情報と、特定したオブジェクトＩＤに基づいて、その時々のオブジェクトの位置情報をメタデータから抽出する。姿勢生成部５０６は、接続動線の動線上を移動する仮想視点の位置情報と、メタデータから抽出されたオブジェクトの位置情報に基づいて、仮想視点の姿勢を決定する。すなわち、姿勢生成部５０６は、仮想視点の位置とオブジェクトの位置を結ぶ直線がその仮想視点の方向と一致するように当該仮想始点の姿勢情報を生成する。 The posture generation unit 506 generates posture information of a virtual viewpoint moving on the line of the connecting flow line generated by the flow line generation unit 505. The posture generation unit 506 acquires a virtual viewpoint parameter located at the start point or end point of one of the two flow lines that the flow line generation unit 505 has selected as the target of joining. The posture generation unit 506 provides the acquired virtual viewpoint parameter to the image acquisition unit 508, thereby causing the image acquisition unit 508 to acquire a virtual viewpoint image. The posture generation unit 506 detects an object located at the center or near the center of the virtual viewpoint image acquired by the image acquisition unit 508, and calculates the three-dimensional orthogonal coordinates of the detected object. The posture generation unit 506 uses the time information of the virtual viewpoint parameter and the calculated three-dimensional orthogonal coordinates to collate with the metadata stored in the image storage unit 504, and identifies the object ID of the detected object. The posture generation unit 506 extracts the position information of the object at each time from the metadata based on the time information of the virtual viewpoint moving on the flow line of the connecting flow line and the identified object ID. 
The posture generation unit 506 determines the posture of the virtual viewpoint based on the position information of the virtual viewpoint moving along the connecting flow line and the position information of the object extracted from the metadata. In other words, the posture generation unit 506 generates the posture information of the virtual viewpoint so that the direction (line of sight) of the virtual viewpoint coincides with the straight line connecting the position of the virtual viewpoint and the position of the object.
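The posture rule above amounts to a "look-at" computation: the viewing direction is the unit vector from the viewpoint position to the object position. The sketch below shows one way to compute it. The function names and the pan/tilt convention (z axis up, pan measured in the x-y plane from the x axis) are assumptions; the specification does not state how posture is parameterized.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def look_at_direction(viewpoint_pos: Vec3, object_pos: Vec3) -> Vec3:
    """Unit vector pointing from the virtual viewpoint toward the object."""
    d = tuple(o - v for v, o in zip(viewpoint_pos, object_pos))
    norm = math.sqrt(sum(c * c for c in d))
    if norm == 0.0:
        raise ValueError("viewpoint and object positions coincide")
    return tuple(c / norm for c in d)


def pan_tilt(direction: Vec3) -> Tuple[float, float]:
    """Convert a direction into (pan, tilt) in degrees, assuming z is up."""
    x, y, z = direction
    pan = math.degrees(math.atan2(y, x))
    tilt = math.degrees(math.atan2(z, math.hypot(x, y)))
    return pan, tilt


# Hypothetical positions: the object is straight along +y from the viewpoint.
direction = look_at_direction((0.0, 0.0, 0.0), (0.0, 5.0, 0.0))
pan, tilt = pan_tilt(direction)
```

Evaluating this once per frame, with the object position taken from the metadata at that frame's time, keeps the object on the line of sight along the whole connecting flow line.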

動線結合部５０７は、動線生成部５０５が画像記憶部５０４から取得した、結合対象の２つの動線情報と、動線生成部５０５と姿勢生成部５０６により生成された接続動線情報（仮想視点の位置、姿勢情報が付与された接続動線）とを結合する。動線結合部５０７は、結合により得られた動線情報（仮想視点パラメータ群）を画像記憶部５０４へ記憶する。その際、動線結合部５０７は、動線生成部５０５が決定した再生速度を付与する。 The movement line combination unit 507 combines two pieces of movement line information to be combined, acquired by the movement line generation unit 505 from the image storage unit 504, with the connected movement line information (connected movement line to which the position and posture information of the virtual viewpoint have been added) generated by the movement line generation unit 505 and the posture generation unit 506. The movement line combination unit 507 stores the movement line information (a group of virtual viewpoint parameters) obtained by the combination in the image storage unit 504. At that time, the movement line combination unit 507 applies the playback speed determined by the movement line generation unit 505.

画像取得部５０８は、姿勢生成部５０６から提供された仮想視点パラメータ、または入出力部５０９から取得された仮想視点パラメータに基づいて、画像記憶部５０４が記憶している仮想視点画像群から仮想視点画像を選択する。また、画像取得部５０８は、画像記憶部５０４から仮想視点パラメータ群（動線情報）を取得し、仮想視点パラメータ群の先頭から順に抽出した仮想視点パラメータに基づいて仮想視点画像群の中から仮想視点画像を選択することもできる。すなわち、画像取得部５０８は、動線情報に従って画像記憶部５０４から仮想視点画像を順次に取得し、取得した仮想視点画像を、入出力部５０９を介してユーザ端末２０３に提供することができる。入出力部５０９は、画像取得部５０８によって取得された仮想視点画像をユーザ端末２０３へ出力する。また、入出力部５０９は、操作指示、仮想視点パラメータや仮想視点パラメータ群をユーザ端末２０３から入力する。 The image acquisition unit 508 selects a virtual viewpoint image from the group of virtual viewpoint images stored in the image storage unit 504 based on the virtual viewpoint parameters provided by the posture generation unit 506 or the virtual viewpoint parameters acquired from the input/output unit 509. The image acquisition unit 508 can also acquire a group of virtual viewpoint parameters (flow line information) from the image storage unit 504 and select virtual viewpoint images from the group of virtual viewpoint images based on the virtual viewpoint parameters extracted in order from the head of the group of virtual viewpoint parameters. That is, the image acquisition unit 508 can sequentially acquire virtual viewpoint images from the image storage unit 504 according to the flow line information and provide the acquired virtual viewpoint images to the user terminal 203 via the input/output unit 509. The input/output unit 509 outputs the virtual viewpoint images acquired by the image acquisition unit 508 to the user terminal 203. The input/output unit 509 also receives operation instructions, virtual viewpoint parameters, and virtual viewpoint parameter groups from the user terminal 203.

（情報処理装置２０２の動作）
図９は、情報処理装置２０２における一連の処理の流れを示すフローチャートである。このフローチャートは、２つの動線情報を接続する接続動線情報を生成し、これらを結合して１つの動線情報を生成する処理を示す。このフローチャートは、例えば、ユーザ端末２０３から既に作成済みの２つ動線情報を結合する要求を受け付けた際に実行される。 (Operation of information processing device 202)
Fig. 9 is a flowchart showing a series of processing steps in the information processing device 202. This flowchart shows a process of generating connecting flow line information that connects two pieces of flow line information, and combining them to generate one piece of flow line information. This flowchart is executed, for example, when a request to combine two pieces of already created flow line information is received from the user terminal 203.

Ｓ９０１において、動線生成部５０５は、画像記憶部５０４に記憶されている２つの動線情報を取得する。図１０は、動線情報に含まれている仮想視点パラメータ群を図示化したものである。図１０において、１００１および１００２は選手であり、破線矢印に沿って矢印方向へ選手が移動している様子を示す。また、１０１１および１０１２は仮想視点であり、実線矢印に沿って矢印方向へ仮想視点が移動している様子を示す。また、１０２１および１０２２はそれぞれ仮想視点１０１１および仮想視点１０１２の動線を示す。仮想視点１０１１は、選手１００１の斜め前方を選手１００１の動きに合わせて移動し、常に選手１００１を撮影できるように姿勢が調整される。仮想視点１０１２は、選手１００１に向けて遠くから徐々に近づき、最後に選手１００１を中心に旋回するように移動し、常に選手１００１を撮影できるように姿勢が調整される。なお、説明の都合上、選手および仮想視点は、間引いて図示しているが、実際はフレームごとに存在する。また、動線１０２１および動線１０２２の撮影時刻を図１５（ａ）に示す。動線１０２１の撮影時刻は図１５（ａ）の動線Ａであって、動線１０２２の撮影時刻は図１５（ａ）の動線Ｂである。動線Ａと動線Ｂは、重複する期間があって、動線Ａの方が動線Ｂよりも早く撮影が開始されており、撮影の終了時刻は動線Ｂの方が遅い。 In S901, the movement line generating unit 505 acquires two pieces of movement line information stored in the image storage unit 504. FIG. 10 illustrates a group of virtual viewpoint parameters included in the movement line information. In FIG. 10, 1001 and 1002 are players, and show how the players move in the direction of the dashed arrows. Also, 1011 and 1012 are virtual viewpoints, and show how the virtual viewpoints move in the direction of the solid arrows. Also, 1021 and 1022 show the movement lines of the virtual viewpoints 1011 and 1012, respectively. The virtual viewpoint 1011 moves diagonally in front of the player 1001 in accordance with the movement of the player 1001, and its posture is adjusted so that the player 1001 can always be photographed. The virtual viewpoint 1012 gradually approaches the player 1001 from a distance, and finally moves so as to rotate around the player 1001, and its posture is adjusted so that the player 1001 can always be photographed. For ease of explanation, the players and virtual viewpoints are thinned out in the illustration, but in reality they exist in each frame. Also, the shooting times of the movement lines 1021 and 1022 are shown in FIG. 15(a). The shooting time of the movement line 1021 is the movement line A in FIG. 15(a), and the shooting time of the movement line 1022 is the movement line B in FIG. 15(a). 
The movement lines A and B overlap in time: shooting for the movement line A starts earlier than for the movement line B, and shooting for the movement line B ends later than for the movement line A.

図９に戻り、Ｓ９０２において、動線生成部５０５は、Ｓ９０１で取得した２つの動線情報に含まれている動線を取得し、これら２つの動線を接続する接続動線を生成する。図１１は、動線生成部５０５が生成する接続動線の一例である。図１１において、動線１０２１，１０２２は図１０で示した動線である。１１０１は動線１０２１の終点を、１１０２は動線１０２２の始点を示している。また、接続動線１１１０は、終点１１０１から始点１１０２へ向けて生成された動線である。動線生成部５０５は、取得した２つの動線のうち動線１０２１を先方の動線として決定し、動線１０２２を後方の動線として決定する。動線生成部５０５は、例えば、仮想視点パラメータ群の先頭に位置する仮想視点パラメータの時刻情報を比較し、撮影の開始時刻が早い方を先方の動線として決定する。動線生成部５０５は、先方の動線１０２１の終点１１０１を接続動線１１１０の始点として決定し、後方の動線１０２２の始点１１０２を接続動線１１１０の終点として決定する。動線生成部５０５は、終点１１０１から始点１１０２へ向けた動線を接続動線１１１０として生成する。 Returning to FIG. 9, in S902, the movement line generation unit 505 acquires the movement lines contained in the two pieces of movement line information acquired in S901, and generates a connecting movement line that connects these two movement lines. FIG. 11 is an example of a connecting movement line generated by the movement line generation unit 505. In FIG. 11, movement lines 1021 and 1022 are the movement lines shown in FIG. 10. 1101 indicates the end point of movement line 1021, and 1102 indicates the start point of movement line 1022. Also, the connecting movement line 1110 is a movement line generated from the end point 1101 to the start point 1102. Of the two acquired movement lines, the movement line generation unit 505 determines movement line 1021 as the forward movement line, and movement line 1022 as the backward movement line. The flow line generation unit 505, for example, compares the time information of the virtual viewpoint parameter located at the top of the virtual viewpoint parameter group, and determines the one with the earlier shooting start time as the forward flow line. The flow line generation unit 505 determines the end point 1101 of the forward flow line 1021 as the start point of the connecting flow line 1110, and determines the start point 1102 of the rear flow line 1022 as the end point of the connecting flow line 1110. The flow line generation unit 505 generates a flow line from the end point 1101 to the start point 1102 as the connecting flow line 1110.

なお、本実施形態では、動線生成部５０５は、接続動線１１１０を終点１１０１と始点１１０２を結ぶ直線とし、接続動線１１１０の線上を移動する仮想視点の位置情報を時刻単位で補間する。もちろん、接続動線１１１０は、スプライン関数などの所定の関数により表される曲線であってもよい。また、終点１１０１の時刻よりも始点１１０２の時刻の方が過去になるので、接続動線１１１０の線上を移動する仮想視点の時刻情報は過去に遡るように時刻が戻る。したがって、接続動線１１１０による仮想視点画像の再生は逆再生となる。また、この時点における仮想視点の姿勢情報は空状態であって、Ｓ９０５で設定される。なお、補間方法については、スプライン関数によるものであってもよい。また、取得した２つの動線のうち先方の動線として選ばれる動線は、ユーザ端末２０３からユーザによって選択されるものであってもよい。 In this embodiment, the flow line generating unit 505 sets the connecting flow line 1110 as a straight line connecting the end point 1101 and the start point 1102, and interpolates the position information of the virtual viewpoint moving on the connecting flow line 1110 in units of time. Of course, the connecting flow line 1110 may be a curve represented by a predetermined function such as a spline function. Also, since the time of the start point 1102 is earlier than the time of the end point 1101, the time information of the virtual viewpoint moving on the connecting flow line 1110 goes back in time. Therefore, the playback of the virtual viewpoint image by the connecting flow line 1110 is reverse playback. Also, the posture information of the virtual viewpoint at this time is empty and is set in S905. The interpolation method may be a spline function. Also, the flow line selected as the forward flow line from the two acquired flow lines may be selected by the user from the user terminal 203.

Ｓ９０３において、動線生成部５０５は、Ｓ９０２で生成された接続動線１１１０の仮想視点画像を表示する際の再生速度を決定する。動線生成部５０５は、Ｓ９０２で生成された接続動線１１１０の再生方向を逆方向（逆再生）に設定し、再生速度を早送りに決定する。なお、上述したように、再生速度は、接続動線の始点と終点の距離または時間の差分によって調整されてもよい。また、距離または時間の差分が大きくなるにつれて早送りの速度を速めるようにしてもよい。また、ユーザ端末２０３からユーザによって再生速度が設定されてもよい。また、逆再生ならば早送りが、順方向の再生ならば通常再生が設定されるようにしてもよい。再生速度が決定されると、接続動線１１１０上における仮想視点の位置が決定される。例えば、始点と終点の時間差と再生速度から接続動線１１１０に対する再生時間を決定し、再生時間に基づいて決定されるフレーム数分の位置を接続動線１１１０上に等距離間隔で配置することで、仮想視点の位置が決定され得る。 In S903, the flow line generating unit 505 determines the playback speed when displaying the virtual viewpoint image of the connecting flow line 1110 generated in S902. The flow line generating unit 505 sets the playback direction of the connecting flow line 1110 generated in S902 to reverse (reverse playback) and determines the playback speed to fast forward. As described above, the playback speed may be adjusted according to the distance or time difference between the start point and the end point of the connecting flow line. Also, the fast forward speed may be increased as the distance or time difference increases. Also, the playback speed may be set by the user from the user terminal 203. Also, fast forward may be set for reverse playback, and normal playback may be set for forward playback. Once the playback speed is determined, the position of the virtual viewpoint on the connecting flow line 1110 is determined. For example, the playback time for the connecting flow line 1110 is determined from the time difference between the start point and the end point and the playback speed, and the position of the number of frames determined based on the playback time is arranged at equal distance intervals on the connecting flow line 1110, thereby determining the position of the virtual viewpoint.
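The last step of S903 (deriving a frame count from the time difference and the playback speed, then placing that many viewpoint positions at equal intervals on the connecting flow line) can be sketched as below. The function name, the use of integer frame indices for time, and the rounding rule are assumptions made for illustration; the straight-line case is shown.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def sample_connecting_line(start_pos: Vec3, end_pos: Vec3,
                           start_frame: int, end_frame: int,
                           speed: float) -> List[Tuple[int, Vec3]]:
    """Return (source frame, viewpoint position) pairs along a straight connecting line.

    The source time gap divided by the playback speed gives the number of
    output frames; positions are spaced at equal distances. When end_frame is
    earlier than start_frame, the source frames decrease (reverse playback).
    """
    gap = abs(end_frame - start_frame)
    count = max(2, round(gap / speed) + 1)  # both endpoints are included
    step = 1 if end_frame >= start_frame else -1
    samples = []
    for i in range(count):
        t = i / (count - 1)
        position = tuple(s + (e - s) * t for s, e in zip(start_pos, end_pos))
        frame = start_frame + step * round(gap * t)
        samples.append((frame, position))
    return samples


# Hypothetical case: go back 100 source frames at double speed.
samples = sample_connecting_line((0.0, 0.0, 0.0), (10.0, 0.0, 0.0), 200, 100, 2.0)
```

In this hypothetical case the 100-frame gap is covered in 51 output frames, so each output frame skips two source frames while the viewpoint advances by the same distance each time.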

次に、Ｓ９０４において、姿勢生成部５０６は、Ｓ９０２で生成された接続動線１１１０の線上を移動する仮想視点が追従するオブジェクトを決定する。図１２は、追従するオブジェクトの決定方法を説明する図である。図１２（ａ）において、動線１０２１と１０２２、選手１００１と１００２は図１０で説明したとおりである。１２０１は動線１０２１の終点に位置する仮想視点を示す。また、１２１１は仮想視点１２０１の撮影範囲を示す。なお、選手１００１および選手１００２の斜線丸印は、仮想視点１２０１が動線１０２１の終点に位置する時刻に、選手がプレーしている場所を示す。図１２（ｂ）は、仮想視点１２０１について取得される仮想視点画像である。 Next, in S904, the posture generation unit 506 determines the object to be followed by the virtual viewpoint moving along the connecting flow line 1110 generated in S902. FIG. 12 is a diagram for explaining a method for determining the object to be followed. In FIG. 12(a), flow lines 1021 and 1022 and players 1001 and 1002 are as described for FIG. 10. 1201 indicates the virtual viewpoint located at the end point of flow line 1021, and 1211 indicates the shooting range of virtual viewpoint 1201. Note that the hatched circles for players 1001 and 1002 indicate where each player is playing at the time when virtual viewpoint 1201 is located at the end point of flow line 1021. FIG. 12(b) is the virtual viewpoint image acquired for virtual viewpoint 1201.

姿勢生成部５０６は、Ｓ９０２で決定された先方の動線１０２１の終点に位置する仮想視点１２０１の仮想視点パラメータを取得する。姿勢生成部５０６は、取得した仮想視点パラメータを画像取得部５０８へ送り、画像取得部５０８から仮想視点画像を取得する。この時、画像取得部５０８から取得した仮想視点画像が図１２（ｂ）である。姿勢生成部５０６は、仮想視点画像の中央または中央付近に位置するオブジェクトを検出し、検出したオブジェクトの３次元直交座標を算出する。姿勢生成部５０６は、仮想視点１２０１の仮想視点パラメータの時刻情報と算出した３次元直交座標を用いて、画像記憶部５０４に記憶されるメタデータと照合し、オブジェクトＩＤを特定する。図１２（ａ）,（ｂ）の例では、仮想視点１２０１の仮想視点画像から選手１００１の位置が検出され、メタデータとの照合により選手１００１のオブジェクトＩＤが特定される。なお、動線１０２１の終点における仮想視点画像から追従すべきオブジェクトを検出したがこれに限られるものではない。例えば、動線１０２２の始点における仮想視点画像から追従すべきオブジェクトが検出されてもよい。また、追従すべきオブジェクトがユーザ端末２０３から指定されてもよい。 The posture generation unit 506 acquires the virtual viewpoint parameters of the virtual viewpoint 1201 located at the end point of the forward movement line 1021 determined in S902. The posture generation unit 506 sends the acquired virtual viewpoint parameters to the image acquisition unit 508 and acquires a virtual viewpoint image from the image acquisition unit 508. At this time, the virtual viewpoint image acquired from the image acquisition unit 508 is shown in FIG. 12(b). The posture generation unit 506 detects an object located at the center or near the center of the virtual viewpoint image and calculates the three-dimensional orthogonal coordinates of the detected object. The posture generation unit 506 uses the time information of the virtual viewpoint parameters of the virtual viewpoint 1201 and the calculated three-dimensional orthogonal coordinates to match them with the metadata stored in the image storage unit 504 and identify the object ID. In the example of FIG. 12(a) and (b), the position of the player 1001 is detected from the virtual viewpoint image of the virtual viewpoint 1201, and the object ID of the player 1001 is identified by matching it with the metadata. Note that, although the object to be followed is detected from the virtual viewpoint image at the end point of the flow line 1021, this is not limited to the above. 
For example, the object to be followed may be detected from the virtual viewpoint image at the start point of the flow line 1022. Also, the object to be followed may be specified from the user terminal 203.
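The collation step in S904 (matching the three-dimensional coordinates computed for the detected object against the metadata at the same time to obtain an object ID) could be realized as a nearest-neighbor search, as sketched here. This is an assumption: the specification does not say how the match is performed, and the function name, the metadata layout (frame index to a mapping of object ID to position), and the `max_distance` tolerance are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

Vec3 = Tuple[float, float, float]
Metadata = Dict[int, Dict[str, Vec3]]  # frame index -> {object ID: position}


def identify_object(metadata: Metadata, frame: int, detected_pos: Vec3,
                    max_distance: float = 1.0) -> Optional[str]:
    """Return the ID of the object closest to detected_pos at the given frame.

    Returns None when the frame has no metadata or no object lies within
    max_distance (an assumed tolerance) of the detected position.
    """
    best_id: Optional[str] = None
    best_distance = max_distance
    for object_id, position in metadata.get(frame, {}).items():
        distance = math.dist(position, detected_pos)
        if distance <= best_distance:
            best_id, best_distance = object_id, distance
    return best_id


# Hypothetical metadata for the frame at the end point of the first flow line.
metadata = {200: {"P1001": (10.0, 5.0, 0.0), "P1002": (30.0, 8.0, 0.0)}}
followed_id = identify_object(metadata, 200, (10.2, 5.1, 0.0))
```

Once the ID is known, the same metadata can be queried at every frame of the connecting flow line to obtain the followed object's position over time.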

Ｓ９０５において、姿勢生成部５０６は、Ｓ９０２で生成された接続動線１１１０の線上を移動する仮想視点の各位置の姿勢を生成する。図１３は、接続動線１１１０の線上を移動する仮想視点の一例である。図１３において、１３０１は接続動線１１１０の線上を移動する仮想視点を示す。仮想視点１３０１は、実線矢印に沿って矢印の方向へ移動しながら、常に選手１００１を撮影できるように姿勢が調整される。なお、接続動線１１１０の線上を移動する仮想視点１３０１の時刻情報は過去に遡るように時刻が戻るので、選手１００１も破線矢印に沿って矢印方向を逆行する。姿勢生成部５０６は、接続動線１１１０の線上を移動する仮想視点１３０１の各位置の時刻情報と、Ｓ９０４で決定したオブジェクトＩＤに基づいて、メタデータの中から選手１００１の位置情報を取得する。姿勢生成部５０６は、仮想視点の１３０１の視線がメタデータから抽出した選手１００１の位置を向くように各位置の仮想視点の姿勢を調整することにより、仮想視点の姿勢情報を生成する。 In S905, the posture generation unit 506 generates the posture of each position of the virtual viewpoint moving on the line of the connecting flow line 1110 generated in S902. FIG. 13 is an example of a virtual viewpoint moving on the line of the connecting flow line 1110. In FIG. 13, 1301 indicates a virtual viewpoint moving on the line of the connecting flow line 1110. The posture of the virtual viewpoint 1301 is adjusted so that the player 1001 can always be photographed while moving in the direction of the arrow along the solid arrow. Note that the time information of the virtual viewpoint 1301 moving on the line of the connecting flow line 1110 goes back in time so that the player 1001 also moves backward in the direction of the arrow along the dashed arrow. The posture generation unit 506 acquires the position information of the player 1001 from the metadata based on the time information of each position of the virtual viewpoint 1301 moving on the line of the connecting flow line 1110 and the object ID determined in S904. The posture generation unit 506 generates posture information of the virtual viewpoint by adjusting the posture of the virtual viewpoint at each position so that the line of sight of the virtual viewpoint 1301 faces the position of the player 1001 extracted from the metadata.

Ｓ９０６において、動線結合部５０７は、Ｓ９０１で取得した２つの動線情報と、Ｓ９０２～Ｓ９０５により生成された接続動線情報とを結合する。図１４は、これら３つの動線情報が結合された状態の一例を示す。動線結合部５０７は、Ｓ９０２で決定した先方の動線１０２１、Ｓ９０５で姿勢情報が付与された接続動線１１１０、Ｓ９０２で決定した後方の動線１０２２の順番で接続する。動線結合部５０７は、結合により得られた動線情報（仮想視点パラメータ群）を画像記憶部５０４へ記憶する。 In S906, the flow line combination unit 507 combines the two pieces of flow line information acquired in S901 with the connecting flow line information generated in S902 to S905. FIG. 14 shows an example of a state in which these three pieces of flow line information are combined. The flow line combination unit 507 connects in the following order: the forward flow line 1021 determined in S902, the connecting flow line 1110 to which posture information has been added in S905, and the rearward flow line 1022 determined in S902. The flow line combination unit 507 stores the flow line information (virtual viewpoint parameter group) obtained by the combination in the image storage unit 504.
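The combination in S906 is, in essence, an ordered concatenation of three parameter groups with the playback speed attached to the connecting part. A minimal sketch follows, assuming each virtual viewpoint parameter is a dictionary; the function name and the `segment` and `speed` keys are hypothetical and only serve to show the ordering.

```python
from typing import Any, Dict, List

Param = Dict[str, Any]  # one virtual viewpoint parameter (time, position, posture, ...)


def combine_flow_lines(first: List[Param], connecting: List[Param],
                       second: List[Param],
                       connecting_speed: float) -> List[Param]:
    """Concatenate first -> connecting -> second and tag each parameter.

    The first and second flow lines keep normal speed (1.0); the connecting
    flow line carries the playback speed decided for it.
    """
    combined: List[Param] = []
    for segment, params, speed in (
        ("first", first, 1.0),
        ("connecting", connecting, connecting_speed),
        ("second", second, 1.0),
    ):
        for param in params:
            # Copy so that the stored source flow lines are not modified.
            combined.append({**param, "segment": segment, "speed": speed})
    return combined


# Hypothetical parameter groups, reduced to their frame index.
first = [{"frame": 100}, {"frame": 200}]
connecting = [{"frame": 190}, {"frame": 160}]
second = [{"frame": 150}, {"frame": 260}]
combined = combine_flow_lines(first, connecting, second, connecting_speed=2.0)
```

The source frames of the combined group run forward, then backward through the connecting part, then forward again, which corresponds to the reverse playback described for the connecting flow line 1110.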

As described above, according to this embodiment, when two pieces of already created flow line information are combined, a connecting flow line that connects the two flow lines is generated, and the line of sight of the virtual viewpoint moving along the connecting flow line can be kept directed at the moving object. Therefore, even if the user performs no operation on the position or posture of the virtual viewpoint, flow line information connecting the two flow lines can be generated so that the object of interest does not leave the range of the virtual viewpoint image. For example, in a soccer match, consider moving the virtual viewpoint so that it follows, as the object of interest, a player who performs a series of actions from dribbling to shooting, and then moving the virtual viewpoint to another position while rewinding time (reverse playback). In this case, if the flow line of the virtual viewpoint for reverse playback is generated by simply interpolating the positions and postures set on the flow line for the basic frames, the position of the player who is the object of interest is not taken into consideration. The player may therefore move out of the field of view of the virtual viewpoint during reverse playback. In contrast, according to this embodiment, the flow line is interpolated taking into account the position of the object of interest, so the risk of the player moving out of the field of view of the virtual viewpoint can be reduced.

In the above embodiment, an example of generating a connecting flow line that connects two flow lines has been described, but the present invention is not limited to this. For example, the present invention can be applied to a configuration in which two virtual viewpoints, each with a time, position, and posture set, are specified and flow line information connecting these virtual viewpoints is generated automatically. That is, the above-described method of generating flow line information can be applied in the key frame method to determine the position and posture of the virtual viewpoint for each intermediate frame between two basic frames. This makes it possible to determine the virtual viewpoints of the intermediate frames so that a specified object does not leave the virtual viewpoint image.
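The key frame application described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the two basic frames are joined by a straight line, positions and shooting times are interpolated linearly, and the function names `look_at` and `intermediate_viewpoints` as well as the `object_position_at` callback (which returns the position of the object of interest at a given shooting time) are hypothetical.

```python
import math
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]


def look_at(viewpoint: Vec3, target: Vec3) -> Vec3:
    """Unit line-of-sight vector pointing from the viewpoint to the target."""
    dx, dy, dz = (t - v for t, v in zip(target, viewpoint))
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    if norm == 0.0:
        raise ValueError("viewpoint and target coincide")
    return (dx / norm, dy / norm, dz / norm)


def intermediate_viewpoints(
    pos_a: Vec3, time_a: float,
    pos_b: Vec3, time_b: float,
    n: int,
    object_position_at: Callable[[float], Vec3],
) -> List[Tuple[Vec3, Vec3, float]]:
    """Return n (position, direction, time) samples between two basic frames.

    The direction is not interpolated from the basic frames; it is computed
    at each sample from the object position at that sample's shooting time.
    """
    samples = []
    for i in range(1, n + 1):
        s = i / (n + 1)  # interpolation parameter in (0, 1)
        pos = tuple(a + s * (b - a) for a, b in zip(pos_a, pos_b))
        t = time_a + s * (time_b - time_a)  # time_b < time_a gives reverse playback
        samples.append((pos, look_at(pos, object_position_at(t)), t))
    return samples
```

Because the direction is recomputed from the object position at every intermediate frame, the object stays centered on the line of sight regardless of how the two basic-frame postures differ, which is the property the paragraph above relies on.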

(Other Embodiments)
In the above embodiment, the process of the flowchart shown in FIG. 9 is described as being executed when a request to combine two pieces of already created flow line information is received from the user terminal 203. However, the present invention is not limited to this. For example, the control unit 501 may perform control so that the process of the flowchart shown in FIG. 9 is executed every time new flow line information (a virtual viewpoint parameter group) is added to the image storage unit 504.

In this case, the control unit 501 may analyze the newly added flow line information and the existing flow line information stored in the image storage unit 504, and determine whether or not to combine these pieces of flow line information. For example, if the ranges of the shooting times of the two flow lines at least partially overlap as shown in FIG. 15(a), and a common object is captured by the two pieces of flow line information as shown in FIG. 15(b), the control unit 501 determines that the two pieces of flow line information are to be combined. Note that the conditions for deciding whether to execute the combination are not limited to these. In this way, two closely related pieces of flow line information can be combined automatically even if the user does not issue an instruction to combine the two flow lines. Note that whether the combination of flow line information is executed may be set by the user from the user terminal 203.
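The example condition above (overlapping shooting-time ranges and a common object) can be expressed as a short Python sketch. The function names, the representation of a shooting-time range as a `(start, end)` pair with `start <= end`, and the representation of captured objects as sets of identifiers are all assumptions for illustration; this section does not state how objects are identified or whether ranges that merely touch count as overlapping (the sketch treats them as overlapping).

```python
from typing import Set, Tuple

TimeRange = Tuple[float, float]  # (start, end) of shooting times, start <= end


def time_ranges_overlap(range_a: TimeRange, range_b: TimeRange) -> bool:
    """True if the two closed shooting-time ranges share at least one instant."""
    (start_a, end_a), (start_b, end_b) = range_a, range_b
    return max(start_a, start_b) <= min(end_a, end_b)


def should_combine(
    range_a: TimeRange, objects_a: Set[str],
    range_b: TimeRange, objects_b: Set[str],
) -> bool:
    """Decide whether two pieces of flow line information should be combined.

    Both example conditions must hold: the shooting-time ranges overlap at
    least partially, and at least one object is captured by both flow lines.
    """
    return time_ranges_overlap(range_a, range_b) and bool(objects_a & objects_b)
```

For instance, `should_combine((0.0, 10.0), {"player7", "ball"}, (8.0, 20.0), {"player7"})` evaluates to `True`, while changing the second range to `(11.0, 20.0)` makes it `False`.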

In addition, while the above embodiment illustrates the case of shooting a soccer match, the shooting target is not necessarily limited to this. For example, this embodiment can also be applied to shooting matches in other sports such as rugby, tennis, ice skating, and basketball, as well as live performances and concerts.

As described above, according to each embodiment, when the virtual viewpoint of an intermediate frame is interpolated using the key frame method, the posture of the virtual viewpoint is determined so as to follow the object that is the shooting target of the virtual viewpoint, so that the object does not move out of the shooting range of the virtual viewpoint.

The present invention can also be realized by a process in which a program that realizes one or more of the functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. It can also be realized by a circuit (e.g., an ASIC) that realizes one or more of the functions.

The present invention is not limited to the above-described embodiments, and various modifications and variations are possible without departing from the spirit and scope of the present invention. Therefore, the following claims are appended to make the scope of the present invention public.

202: Information processing device, 501: Control unit, 502: Storage unit, 503: Data transmission/reception unit, 504: Image storage unit, 505: Flow line generation unit, 506: Posture generation unit, 507: Flow line combination unit, 508: Image acquisition unit, 509: Input/output unit, 510: Internal bus

Claims

An information processing device that generates a position of a virtual viewpoint and a line of sight direction from the virtual viewpoint for generating a virtual viewpoint image from a plurality of images obtained by shooting a predetermined space with a plurality of imaging devices, comprising:
an acquisition means for acquiring a first position and a second position of the virtual viewpoint, and a first time and a second time corresponding to the first position and the second position and indicating a shooting time of the predetermined space;
a generating means for generating a flow line along which the virtual viewpoint moves by connecting the first position and the second position;
a setting means for setting a plurality of positions of the virtual viewpoint between the first position and the second position on the flow line, and setting a shooting time of the predetermined space corresponding to each of the plurality of positions based on the first time and the second time; and
a determination means for determining a line of sight direction from the virtual viewpoint at each of the plurality of positions based on a position of a specific object in the predetermined space at the shooting time set for each of the plurality of positions,
wherein the acquisition means acquires first flow line information including a plurality of positions including the first position of the virtual viewpoint, a line of sight direction from the virtual viewpoint at each of the plurality of positions, and a shooting time, and second flow line information including a plurality of positions including the second position of the virtual viewpoint, a line of sight direction from the virtual viewpoint at each of the plurality of positions, and a shooting time, and
the generating means generates the flow line connecting the first position and the second position as a flow line of third flow line information along which the virtual viewpoint moves.

The information processing device according to claim 1, further comprising a combining means for combining the first flow line information, the second flow line information, and the third flow line information to generate combined flow line information.

The information processing device according to claim 1 or 2, further comprising a selection means for selecting the specific object based on a virtual viewpoint image obtained from the first flow line information or the second flow line information.

The information processing device according to claim 3, characterized in that the selection means selects, as the specific object, an object located closest to the center in a virtual viewpoint image obtained from a virtual viewpoint at the first position or the second position.

The information processing device according to any one of claims 1 to 4, characterized in that the generating means generates a flow line whose starting point is the end point of the virtual viewpoint represented by the first flow line information and whose ending point is the start point of the virtual viewpoint represented by the second flow line information.

The information processing device according to any one of claims 1 to 5, characterized in that the generating means generates the flow line by connecting the first position and the second position with a straight line.

The information processing device according to any one of claims 1 to 5, characterized in that the generating means generates the flow line by connecting the first position and the second position with a curve represented by a predetermined function.

The information processing device according to any one of claims 1 to 7, characterized in that the setting means sets a playback speed based on a distance between the first position and the second position or a time difference between the first time and the second time, and sets the position of the virtual viewpoint and the shooting time on the flow line based on the playback speed.

The information processing device according to claim 8, characterized in that the setting means sets a playback speed faster than the normal playback speed when the distance is greater than a predetermined value or when the time difference is greater than a predetermined value.

The information processing device according to claim 8 or 9, characterized in that the setting means sets a faster playback speed the greater the distance or the greater the time difference.

The information processing device according to any one of claims 1 to 5, further comprising a determination means for determining whether or not to generate the third flow line information based on the first flow line information and the second flow line information.

The information processing device according to claim 11, characterized in that the determination means determines to generate the third flow line information when the range of shooting times included in the first flow line information and the range of shooting times included in the second flow line information at least partially overlap.

The information processing device according to claim 11 or 12, characterized in that the determination means determines to generate the third flow line information when a virtual viewpoint image generated based on the first flow line information and a virtual viewpoint image generated based on the second flow line information include a common object.

An information processing method for generating a position of a virtual viewpoint and a line of sight direction from the virtual viewpoint for generating a virtual viewpoint image from a plurality of images obtained by shooting a predetermined space with a plurality of imaging devices, comprising:
an acquisition step of acquiring a first position and a second position of the virtual viewpoint, and a first time and a second time corresponding to the first position and the second position and indicating a shooting time of the predetermined space;
a generating step of connecting the first position and the second position to generate a flow line along which the virtual viewpoint moves;
a setting step of setting a plurality of positions of the virtual viewpoint between the first position and the second position on the flow line, and setting a shooting time of the predetermined space corresponding to each of the plurality of positions based on the first time and the second time; and
a determination step of determining a line of sight direction from the virtual viewpoint at each of the plurality of positions based on a position of a specific object in the predetermined space at the shooting time set for each of the plurality of positions,
wherein the acquisition step acquires first flow line information including a plurality of positions including the first position of the virtual viewpoint, a line of sight direction from the virtual viewpoint at each of the plurality of positions, and a shooting time, and second flow line information including a plurality of positions including the second position of the virtual viewpoint, a line of sight direction from the virtual viewpoint at each of the plurality of positions, and a shooting time, and
the generating step generates the flow line connecting the first position and the second position as a flow line of third flow line information along which the virtual viewpoint moves.

A program for causing a computer to function as each of the means of an information processing device according to any one of claims 1 to 13.