JP6853013B2

JP6853013B2 - Information processing systems, information processing methods, and information processing programs

Info

Publication number: JP6853013B2
Application number: JP2016209910A
Authority: JP
Inventors: 暁彦白井; 久貴鈴木; 裕太山口; 一造米澤
Original assignee: GREE Inc
Current assignee: GREE Holdings Inc
Priority date: 2016-10-26
Filing date: 2016-10-26
Publication date: 2021-03-31
Anticipated expiration: 2036-10-26
Also published as: JP2021101557A; JP7135141B2; JP2018074294A; JP2022177053A; JP7368886B2; JP2023181217A

Description

The present disclosure relates to an information processing system, an information processing method, and an information processing program.

In recent years, with the development of ICT (Information and Communication Technology), various kinds of content using a virtual three-dimensional space rendered with CG (Computer Graphics) have been proposed. Examples of such content include VR (Virtual Reality) content, game content, and e-sports (electronic sports) content.

A person who uses such content (hereinafter also referred to as a user) can, for example, wear a device capable of displaying three-dimensional video, such as an HMD (Head-Mounted Display), and thereby view the CG-rendered virtual three-dimensional space stereoscopically from the user's own viewpoint. Furthermore, when sensors that detect the movement and position of the user's body, such as the head and arms, are worn, the user can, for example, move within the three-dimensional space and act on objects placed in it in accordance with the user's motions. The objects placed in the three-dimensional space include, for example, objects (such as characters and avatars) that can be operated by other users of the content.

In content such as games and e-sports, a user can, for example, respond to events caused by the user's own actions, converse, compete, and play matches with objects operated by other users. Users participating in the content can each enjoy, from their own viewpoint, the collisions and physical interactions in the three-dimensional space that result from events caused by those actions, in accordance with predetermined rules defined by the content.

The following patent documents are prior art documents that describe techniques related to those described in this specification.

Japanese Unexamined Patent Publication No. 2011-150402
Japanese Unexamined Patent Publication No. 2001-300131
Japanese Unexamined Patent Publication No. H11-123280

One envisioned form of content renders a three-dimensional space such as a conference hall, office, school, or shopping mall with CG and lets multiple users participate in that space in real time, as in remote conferences, VR offices, e-learning, and e-commerce.

In such forms, demand is expected not only for participating in real time, for example as a speaker, in the exchanges between users in the rendered three-dimensional space, but also for distribution to users who watch those exchanges as video after the fact or as listeners, and for the associated viewing environments, viewing formats, and media conversion.

When distributed video of the three-dimensional space is viewed after the fact, burdens are expected, for example, in purchasing a dedicated device such as an HMD for viewing three-dimensional video and in setting up a viewing environment for receiving three-dimensional video.

An object of the present disclosure is to provide a technique that can reproduce video from an accepted viewpoint based on multi-viewpoint three-dimensional video data of a shooting target.

One aspect of the disclosed technique is exemplified by an information processing system. This information processing system includes: video data generating means for generating multi-viewpoint three-dimensional video data of a shooting target; means for accepting a viewpoint setting for the multi-viewpoint three-dimensional video data; video generating means for generating video of the shooting target as seen from the accepted viewpoint; means for providing the generated video to a viewer device; and recording means for recording the set viewpoint.

According to one aspect of the disclosed technique, it is possible to provide a technique that can reproduce video from an accepted viewpoint based on multi-viewpoint three-dimensional video data of a shooting target.

A configuration diagram showing an example of an information processing system.
A configuration diagram showing an example of the hardware configuration of a computer.
A diagram showing an example of the data structure stored for objects in the 3DCG space.
A flowchart illustrating the scene manager's process of recording phenomena related to events.
A flowchart illustrating the scene manager's virtual camera setting process.
An explanatory diagram of how the camera direction of a virtual camera is set.
A configuration diagram showing an example of an information processing system that includes a camera for shooting a conference hall in real space.
A diagram explaining the shooting of a subject in real space when chroma key compositing is used.
A diagram explaining the shooting of a subject in real space when chroma key compositing is used.
A diagram showing an example of a composite video.
A diagram showing an example of a composite video displayed on a 2D video device.
An explanatory diagram of a form in which a subject wearing an HMD participates in a remote conference.
An explanatory diagram of a form in which a subject not wearing an HMD participates in a remote conference.
An explanatory diagram of a form in which a subject not wearing an HMD participates in a remote conference that is played back after the fact.
A flowchart illustrating the scene manager's process of recording phenomena related to events when a camera for shooting real space is provided.

Hereinafter, an information processing system according to one embodiment will be described with reference to the drawings. The configuration of the following embodiment is an example, and the information processing system is not limited to that configuration.

<Embodiment>
[1. System configuration]
FIG. 1 is a configuration diagram showing an example of an information processing system according to the present embodiment. First, an outline of the information processing system 10 shown in FIG. 1 will be described. In the form of FIG. 1, for example, a conference hall is provided in a three-dimensional space created with CG (hereinafter also referred to as the "3DCG space"), and content such as a remote conference is provided. Here, a "remote conference" is a conference held in real time in which a plurality of conference participants connected to a network share a conference hall in the 3DCG space. The conference participants connected to the network speak, converse, act on others, move, and otherwise take part in the conference through objects (avatars and the like) that can be operated in the 3DCG space. The progress of the conference is provided to the HMDs or similar devices of the conference participants as 3DCG video seen from the viewpoint of each participant's avatar.
However, the content provided by the information processing system 10 may be a VR office, a game, e-learning, or the like formed virtually in the 3DCG space. It suffices that the content can provide, in accordance with predetermined rules defined by the content, the collisions and physical interactions that result from events caused by user actions, movements, and the like in the 3DCG space.
In the following, the content provided by the information processing system 10 is described as a remote conference using a conference hall provided in the 3DCG space. Video of the three-dimensional space created with CG is also referred to as a "stereoscopic image".

In the 3DCG space, objects for the various components that make up the conference hall of the remote conference are placed, such as desks, chairs, whiteboards, walls, floors, windows, doors, and ceilings. Similarly, avatars for the plurality of conference participants taking part in the remote conference are placed in the 3DCG space. An avatar is an object that a conference participant can operate. An object placed in the 3DCG space is, for example, a collection of polygons, which are drawing elements. The polygons that make up an object share common coordinate axes. The shape, placement position, orientation, and the like of an object in the world coordinate system of the 3DCG space are expressed by three-dimensional coordinates (X, Y, Z) on those axes.
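The object representation described above can be sketched as follows. This is a minimal illustration, assuming that each polygon is stored as a list of world-space (X, Y, Z) vertices; the class names `Polygon` and `SceneObject` and the methods `centroid` and `translate` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Polygon:
    # Vertices expressed directly in the shared world coordinate system.
    vertices: List[Vec3]


@dataclass
class SceneObject:
    # A desk, wall, avatar, etc. modeled as a collection of polygons.
    name: str
    polygons: List[Polygon] = field(default_factory=list)

    def centroid(self) -> Vec3:
        # Average of all vertices; a simple stand-in for "placement position".
        pts = [v for p in self.polygons for v in p.vertices]
        n = len(pts)
        return (sum(p[0] for p in pts) / n,
                sum(p[1] for p in pts) / n,
                sum(p[2] for p in pts) / n)

    def translate(self, d: Vec3) -> None:
        # Move every vertex by the same offset, e.g. when an avatar walks.
        for p in self.polygons:
            p.vertices = [(x + d[0], y + d[1], z + d[2])
                          for (x, y, z) in p.vertices]


desk = SceneObject("desk", [Polygon([(0.0, 1.0, 0.0), (2.0, 1.0, 0.0),
                                     (2.0, 1.0, 2.0), (0.0, 1.0, 2.0)])])
desk.translate((1.0, 0.0, 0.0))
```

Because all vertices live on the same axes, moving an object is a uniform offset applied to every polygon, and no per-object local frame is needed in this simplified model.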

A conference participant takes part in the remote conference equipped with, for example, a device on which stereoscopic images can be viewed, such as an HMD (hereinafter also referred to as a 3D video device), a sensor device that detects body movement and position, and audio devices such as a microphone and headphones. The state of the ongoing remote conference is provided to each conference participant in real time as a stereoscopic image seen from that participant's avatar viewpoint. The conference participant watches the stereoscopic image of the conference as it proceeds from the avatar viewpoint and, while watching, operates the avatar to move around the conference hall and to converse with and act on other conference participants. Statements in the remote conference are made, for example, through audio devices such as a microphone and headphones. Events corresponding to the actions of each conference participant that arise as the conference proceeds (for example, conversations, proposals, and leaving one's seat) are reflected in the stereoscopic image being viewed.

The information processing system 10 according to the present embodiment has a function of relaying the 3DCG space in which the remote conference is in progress as a stereoscopic image seen from an arbitrary viewpoint position. The viewpoint position in the 3DCG space is specified, for example, by a virtual camera. The virtual camera can be moved to any position in the 3DCG space in which the remote conference is in progress. The virtual camera relays the objects and avatars placed in the conference hall, together with phenomena such as events that occur as the conference proceeds, as a stereoscopic image seen from an arbitrary viewpoint position. When there is a relay operator who relays the 3DCG space in which the remote conference is in progress, the viewpoint position of the virtual camera is set according to the relay operator's operations.

The information processing system 10 according to the present embodiment has a function of converting the stereoscopic image relayed by the virtual camera into two-dimensional video. The converted two-dimensional video is distributed, for example, in real time or after the fact to viewers who watch the remote conference. A viewer watches the distributed video of the remote conference on a device capable of displaying two-dimensional video (hereinafter also referred to as a 2D video device), such as a smartphone, a mobile phone, or a tablet PC (Personal Computer).

The information processing system 10 according to the present embodiment has a function of accepting messages (reaction signals such as text, voice, and gestures) from viewers watching the two-dimensional video. For example, the information processing system 10 moves the viewpoint position of the virtual camera in the 3DCG space in which the remote conference is in progress based on an accepted message. Alternatively, for example, the information processing system 10 notifies the relay operator of the accepted message and moves the viewpoint position of the virtual camera according to the operations of the relay operator who received the notification. While the remote conference is in progress, the stereoscopic image relayed by the virtual camera is updated as the viewpoint position moves, so that the viewpoint position accepted from the viewer is reflected in the two-dimensional video being viewed.

The information processing system 10 according to the present embodiment has a function of recording the phenomena in the 3DCG space in which the remote conference is in progress and the virtual camera information (viewpoint position, focal length, rotation, zoom parameters, and the like). The phenomena in the 3DCG space include the coordinates of each object that moves in the 3DCG space in response to events accompanying the actions of the conference participants as the conference proceeds. The phenomena in the 3DCG space also include the audio data of the conference participants. For example, minutes of the remote conference are created based on the recorded audio data and the like. The information processing system 10 generates a stereoscopic image from the viewpoint position of the virtual camera based on the recorded information, and distributes two-dimensional video based on the generated stereoscopic image to viewers after the fact. Messages from viewers are also accepted during after-the-fact viewing, and the viewpoint position chosen by such a viewer is reflected in the two-dimensional video being viewed.
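One way to picture the recording and after-the-fact reproduction described above is a log keyed by a monotonically increasing time code, from which the most recent state at any requested time can be looked up. The sketch below is only an illustration under that assumption; the class `TimecodeLog`, its methods, and the integer time codes are hypothetical and are not taken from the specification.

```python
import bisect
from typing import Any, Dict, List, Optional


class TimecodeLog:
    """Stores values per key (object ID, camera ID, ...) ordered by time code."""

    def __init__(self) -> None:
        self._times: Dict[str, List[int]] = {}
        self._values: Dict[str, List[Any]] = {}

    def record(self, key: str, timecode: int, value: Any) -> None:
        # Insert while keeping each key's time codes sorted.
        times = self._times.setdefault(key, [])
        values = self._values.setdefault(key, [])
        i = bisect.bisect_right(times, timecode)
        times.insert(i, timecode)
        values.insert(i, value)

    def state_at(self, key: str, timecode: int) -> Optional[Any]:
        # Latest value recorded at or before the requested time code.
        times = self._times.get(key, [])
        i = bisect.bisect_right(times, timecode)
        return self._values[key][i - 1] if i else None


log = TimecodeLog()
log.record("avatar:P1", 0, (0.0, 0.0, 0.0))    # object coordinates
log.record("avatar:P1", 10, (1.0, 0.0, 0.0))
log.record("camera:VC1", 5, {"position": (0.0, 2.0, 5.0), "zoom": 1.0})
```

Replaying a conference then amounts to stepping the time code forward and querying `state_at` for every object and camera, which is why the viewpoint can still be changed during after-the-fact viewing in this model.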

Next, the configuration of the information processing system 10 shown in FIG. 1 will be described.
The information processing system 10 shown in FIG. 1 mainly includes a scene manager SM, a virtual camera VC, a camera processor C, a renderer R, and a distribution server CH. These components are connected via a network such as a LAN (Local Area Network). A plurality of each of the virtual camera VC, the camera processor C, and the renderer R may be connected to the network. FIG. 1 illustrates virtual cameras VC1, VC2, and VC3, camera processors C1, C2, and C3, and renderers R1, R2, and R3. In the following description, the virtual cameras VC1, VC2, and VC3 are collectively referred to as the virtual camera VC. The same applies to the camera processor C and the renderer R.

The scene manager SM and the distribution server CH are also connected to an external network. The external network includes public networks such as the Internet, mobile phone networks, and wired and wireless LANs (Local Area Networks). A plurality of 2D video devices of viewers (Viewer) who watch the two-dimensional video are connected to the external network. The information processing system 10 may include a plurality of scene managers SM and distribution servers CH. The information processing system 10 may also integrate the functions of the virtual camera VC, the camera processor C, and the renderer R.

In FIG. 1, A1 represents the stereoscopic image being viewed by a distributor U1 who operates the viewpoint position of the virtual camera VC1. The stereoscopic image A1 being viewed by the distributor U1 includes the avatars of the conference participants (P1, P2, P3) of the remote conference as seen from the viewpoint position of the virtual camera VC1. Note that the distributor U1 who operates the viewpoint position of the virtual camera VC1 is not drawn in the stereoscopic images seen from the avatars' viewpoint positions.

The virtual cameras VC2 and VC3 in the stereoscopic image A1 are virtual cameras VC placed as objects in the 3DCG space. The virtual camera VC2 follows, for example, the movement of the conference participant P2. The virtual camera VC3 follows, for example, the movement of the conference participant P3.

The distributor U1 wears an HMD1 for viewing the stereoscopic image of the virtual camera VC1 that the distributor operates. The HMD1 worn by the distributor U1 includes audio devices such as headphones and an intercom. The distributor U1 also has, for example, a controller CN for specifying the viewpoint position of the virtual camera VC1 in the 3DCG space. The controller CN is a user input device for operating the two viewpoint positions used to view the 3DCG space stereoscopically. The controller CN includes sensors (an acceleration sensor, a gyro sensor, a geomagnetic sensor, an image sensor, and the like) that detect the rotation angle of the virtual camera VC1. The controller CN may also have a function of switching among the stereoscopic images being relayed by the virtual cameras VC1, VC2, and VC3. There may be a plurality of distributors who operate the viewpoint position of the virtual camera VC1.

The scene manager SM is a platform on which an application program runs that creates stereoscopic images of a 3DCG space, such as that of a remote conference. The platform is, for example, a computer system made up of information processing devices such as servers and workstations (WS).

By executing the application program, the scene manager SM creates video of the 3DCG space and provides the remote conference participants, in real time, with stereoscopic images seen from their avatar viewpoints. Similarly, by executing the application program, the scene manager SM creates a stereoscopic image of the 3DCG space as seen from the viewpoint position of the virtual camera VC and transmits it to the renderer R. The scene manager SM outputs, to the distributor U1 who operates the viewpoint position of the virtual camera VC1, stereoscopic images of the 3DCG space as seen from the viewpoint positions of the virtual cameras VC1, VC2, and VC3. The distributor U1 views these stereoscopic images through the HMD or the like. The stereoscopic image displayed on the HMD or the like is switched via the controller CN, which accepts operation input from the distributor U1.

The scene manager SM stores and manages the data of objects such as the structures and avatars that make up the conference room in the 3DCG space in association with a time code. Here, the "time code" is a counter that expresses absolute time in the 3DCG space. All phenomena related to events that occur as the remote conference proceeds are stored and managed by time code.

The scene manager SM receives messages transmitted from 2D video devices, such as smartphones, connected to the external network. The messages include reaction signals such as text, voice, and gestures from viewers of the two-dimensional video distributed after the fact or in real time. Based on an accepted message, the scene manager SM moves the viewpoint position of the virtual camera VC in the 3DCG space in which the remote conference is in progress. Alternatively, the scene manager SM notifies the distributor U1, who operates the viewpoint position of the virtual camera VC1 in the 3DCG space, of the viewpoint position based on the accepted message. The distributor accepts the instruction from the scene manager SM through, for example, a notification of the viewpoint movement displayed on the HMD (text, a marker, or the like) or a voice notification, and moves the viewpoint position of the virtual camera VC1. The messages received by the scene manager SM are stored in association with the time code.
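The two handling paths described above (moving the camera directly, or notifying the distributor who operates it) can be sketched as below. This is a hedged illustration: the specification does not say how a message is turned into a viewpoint, so the sketch simply assumes each message may carry a requested viewpoint. `ViewerMessage`, `SceneManagerSketch`, and `handle` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ViewerMessage:
    timecode: int                      # time code at which the message arrived
    text: str                          # reaction text (voice/gesture omitted)
    requested_viewpoint: Optional[Vec3] = None  # assumption: already parsed


@dataclass
class SceneManagerSketch:
    camera_positions: Dict[str, Vec3]
    operated_cameras: Set[str] = field(default_factory=set)  # human-operated
    message_log: List[ViewerMessage] = field(default_factory=list)
    notifications: List[Tuple[str, Vec3]] = field(default_factory=list)

    def handle(self, camera_id: str, msg: ViewerMessage) -> str:
        # Every message is stored together with its time code.
        self.message_log.append(msg)
        if msg.requested_viewpoint is None:
            return "logged"
        if camera_id in self.operated_cameras:
            # A distributor operates this camera: notify instead of moving it.
            self.notifications.append((camera_id, msg.requested_viewpoint))
            return "notified"
        # No operator: the scene manager moves the camera itself.
        self.camera_positions[camera_id] = msg.requested_viewpoint
        return "moved"


sm = SceneManagerSketch(
    camera_positions={"VC1": (0.0, 0.0, 0.0), "VC2": (1.0, 0.0, 0.0)},
    operated_cameras={"VC1"},
)
```

Keeping the notification path separate leaves the final decision with the distributor, which matches the description of the distributor receiving an on-screen or voice prompt before moving VC1.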

The function that the scene manager SM provides by executing a computer program is an example of "video data generating means for generating multi-viewpoint three-dimensional video data obtained by shooting a shooting target from multiple viewpoints". Similarly, the function that the scene manager SM provides by executing a computer program is an example of "means for accepting a reaction to the video from a viewer device". Similarly, the function that the scene manager SM provides by executing a computer program is an example of "means for recording the set viewpoint, the time in the multi-viewpoint three-dimensional video data, and the reaction".

The virtual camera VC is a virtual device that specifies the viewpoint position for relaying a stereoscopic image of the 3DCG space in which the remote conference is in progress. To relay the 3DCG space as a stereoscopic image, the virtual camera VC holds, as camera information, two viewpoint positions and the convergence point determined from those two viewpoint positions. For a virtual camera VC that is not operated by the distributor U1, the placement position in the 3DCG space is determined by the scene manager SM. The virtual camera VC may, for example, be drawn in the 3DCG space as an object visible to the conference participants, or may be made invisible to them. The object of a virtual camera VC may also be drawn only in the stereoscopic image viewed by the distributor U1, as with the virtual cameras VC2 and VC3 in the stereoscopic image A1, and not shown to viewers. The virtual camera information of the virtual camera VC, such as focal length, zoom parameters, rotation, and viewpoint position (hereinafter also simply referred to as "camera information"), is controlled by the scene manager SM or by the distributor U1 who operates the virtual camera VC. The virtual camera VC is an example of "means for accepting a viewpoint setting for the multi-viewpoint three-dimensional video data".
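As a rough illustration of camera information made up of two viewpoint positions and a convergence point, the sketch below derives all three from a rig center, a viewing direction, an eye separation, and a convergence distance. These input parameters, the right-handed coordinate convention, and the function name `stereo_rig` are assumptions for the example; the specification does not state how the values are obtained.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def normalize(v: Vec3) -> Vec3:
    length = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    return (v[0] / length, v[1] / length, v[2] / length)


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def stereo_rig(center: Vec3, forward: Vec3, up: Vec3,
               eye_separation: float,
               convergence_distance: float) -> Tuple[Vec3, Vec3, Vec3]:
    """Return (left viewpoint, right viewpoint, convergence point)."""
    f = normalize(forward)
    r = normalize(cross(f, up))        # points to the camera's right
    half = eye_separation / 2.0
    left = (center[0] - half * r[0], center[1] - half * r[1],
            center[2] - half * r[2])
    right = (center[0] + half * r[0], center[1] + half * r[1],
             center[2] + half * r[2])
    # Both lines of sight are aimed at this point in front of the rig.
    conv = (center[0] + convergence_distance * f[0],
            center[1] + convergence_distance * f[1],
            center[2] + convergence_distance * f[2])
    return left, right, conv


left_eye, right_eye, convergence = stereo_rig(
    center=(0.0, 1.6, 0.0), forward=(0.0, 0.0, -1.0), up=(0.0, 1.0, 0.0),
    eye_separation=0.064, convergence_distance=2.0)
```

With the rig looking down the negative Z axis, the two viewpoints end up 3.2 cm to either side of the center and the convergence point lies 2 m ahead.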

The camera processor C is a computer that records and stores the camera information of the virtual camera VC in association with the time code. Based on the camera information of the virtual camera VC, the camera processor C cooperates with the scene manager SM to generate a stereoscopic image of the 3DCG space in which the remote conference is in progress. The camera processor C outputs the camera information of the virtual camera VC and the stereoscopic image of the 3DCG space as seen from the viewpoint position of the virtual camera VC to the renderer R.

The renderer R is a computer that performs so-called rendering, converting the stereoscopic image of the 3DCG space whose viewpoint position is specified by the virtual camera VC into two-dimensional video. In rendering, coordinate transformations such as an affine transformation of the objects in the 3DCG space onto a two-dimensional plane, and drawing processes such as lighting, are performed, and the polygon-based data placed in the 3DCG space is converted into pixel-based data as seen from the viewpoint position of the virtual camera VC. The renderer R creates two-dimensional video from the frame-by-frame two-dimensional images drawn from the converted pixel-based data, and outputs the created two-dimensional video to the distribution server CH. The camera processor C and the renderer R are an example of "means for generating video of the shooting target as seen from the accepted viewpoint".
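The coordinate-conversion step of rendering can be illustrated with a simple pinhole model: a translation into camera space (an affine transformation) followed by a perspective divide and a mapping to pixel coordinates. This is a simplified sketch, assuming an unrotated camera looking down the negative Z axis and a focal length given in pixels; the function `project_point` and its parameters are hypothetical, and lighting is omitted.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def project_point(point: Vec3, camera_pos: Vec3, focal_length: float,
                  width: int, height: int) -> Optional[Tuple[float, float]]:
    """Project a world-space point to pixel coordinates (u, v)."""
    # View transform: translate so the camera sits at the origin.
    x = point[0] - camera_pos[0]
    y = point[1] - camera_pos[1]
    z = point[2] - camera_pos[2]
    depth = -z                       # camera looks down the -Z axis
    if depth <= 0:
        return None                  # point is behind the camera
    # Perspective divide, then shift the origin to the image center.
    u = width / 2 + focal_length * x / depth
    v = height / 2 - focal_length * y / depth   # pixel rows grow downward
    return (u, v)


center_pixel = project_point((0.0, 0.0, -5.0), (0.0, 0.0, 0.0), 500.0, 640, 480)
offset_pixel = project_point((1.0, 1.0, -5.0), (0.0, 0.0, 0.0), 500.0, 640, 480)
```

Applying this to every polygon vertex, once per viewpoint, is what turns polygon-based scene data into the pixel-based data of one frame; a real renderer would also handle rotation, clipping, and rasterization.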

The distribution server CH is a computer that distributes the two-dimensional video output from the renderer R to 2D video devices, such as smartphones, connected to the external network. In response to requests from viewers watching through those 2D video devices, either after the fact or in real time, the distribution server CH switches which of the two-dimensional videos output from the plurality of renderers R1, R2, R3, and so on is distributed. The distribution server CH is an example of "means for providing the generated video to a viewer device".
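The switching behavior described above can be pictured as a per-viewer selection among renderer outputs. The sketch below is illustrative only: the class `DistributionServerSketch`, the per-viewer selection table, and the representation of frames as strings are assumptions, and actual video transport is not modeled.

```python
from typing import Dict, Iterable


class DistributionServerSketch:
    """Chooses, per viewer, which renderer's 2D video is delivered."""

    def __init__(self, renderer_ids: Iterable[str], default: str) -> None:
        self._renderers = set(renderer_ids)
        if default not in self._renderers:
            raise ValueError("default renderer is not registered")
        self._default = default
        self._selected: Dict[str, str] = {}   # viewer ID -> renderer ID

    def switch(self, viewer_id: str, renderer_id: str) -> None:
        # Handle a viewer's request to change the distributed video.
        if renderer_id not in self._renderers:
            raise ValueError(f"unknown renderer: {renderer_id}")
        self._selected[viewer_id] = renderer_id

    def frame_for(self, viewer_id: str, frames: Dict[str, str]) -> str:
        # frames maps each renderer ID to its latest output frame.
        return frames[self._selected.get(viewer_id, self._default)]


server = DistributionServerSketch(["R1", "R2", "R3"], default="R1")
latest = {"R1": "frame-from-R1", "R2": "frame-from-R2", "R3": "frame-from-R3"}
server.switch("viewer-a", "R3")
```

Viewers who have not made a request keep receiving the default renderer's output, while `viewer-a` now receives the video rendered from the third virtual camera.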

〔２．装置構成〕 [2. Device configuration]
図２は、コンピュータのハードウェア構成の一例を示す構成図である。図１に示すシーンマネージャＳＭ、カメラプロセッサＣ、レンダラＲ、配信サーバＣＨは、図２に示すコンピュータ１００を用いて実現される。図２に例示のコンピュータ１００は、接続バス１０６によって相互に接続されたＣＰＵ（Central Processing Unit）１０１、主記憶装置１０２、補助記憶装置１０３、通信ＩＦ（Interface）１０４、入出力ＩＦ１０５を備える。ＣＰＵ１０１はプロセッサとも呼ばれる。ただし、ＣＰＵ１０１は、単一のプロセッサに限定される訳ではなく、マルチプロセッサ構成であってもよい。また、単一のソケットで接続される単一のＣＰＵ１０１がマルチコア構成であってもよい。また、上記構成要素はそれぞれ複数に設けられてもよいし、一部の構成要素を設けないようにしてもよい。 FIG. 2 is a configuration diagram showing an example of a computer hardware configuration. The scene manager SM, the camera processor C, the renderer R, and the distribution server CH shown in FIG. 1 are realized by using the computer 100 shown in FIG. 2. The computer 100 illustrated in FIG. 2 includes a CPU (Central Processing Unit) 101, a main storage device 102, an auxiliary storage device 103, a communication IF (Interface) 104, and an input/output IF 105, which are connected to each other by a connection bus 106. The CPU 101 is also called a processor. However, the CPU 101 is not limited to a single processor and may have a multiprocessor configuration. A single CPU 101 connected by a single socket may also have a multi-core configuration. Each of the above components may be provided in plurality, and some of the components may be omitted.

ＣＰＵ１０１は、コンピュータ１００全体の制御を行う中央処理演算装置である。ＣＰＵ１０１は、補助記憶装置１０３に記憶されたプログラムを主記憶装置１０２の作業領域に実行可能に展開し、プログラムの実行を通じて周辺機器の制御を行うことで所定の目的に合致した機能を提供する。主記憶装置１０２は、ＣＰＵ１０１がプログラムやデータをキャッシュしたり、作業領域を展開したりする記憶媒体である。主記憶装置１０２は、例えば、フラッシュメモリ、ＲＡＭ（Random Access Memory）やＲＯＭ（Read Only Memory）を含む。 The CPU 101 is a central processing arithmetic unit that controls the entire computer 100. The CPU 101 executably deploys the program stored in the auxiliary storage device 103 in the work area of the main storage device 102, and controls peripheral devices through the execution of the program to provide a function that meets a predetermined purpose. The main storage device 102 is a storage medium in which the CPU 101 caches programs and data and expands a work area. The main storage device 102 includes, for example, a flash memory, a RAM (Random Access Memory), and a ROM (Read Only Memory).

補助記憶装置１０３は、ＯＳ（Operating System）、ＣＰＵ１０１により実行される各種プログラムや、動作の設定情報などを記憶する記憶媒体である。補助記憶装置１０３は、例えば、ＨＤＤ（Hard-disk Drive）やＳＳＤ（Solid State Drive）、ＥＰＲＯＭ（Erasable Programmable ROM）、フラッシュメモリ、ＵＳＢ（Universal Serial Bus）メモリ等である。補助記憶装置１０３には、ＣＤ（Compact Disc）ドライブ装置、ＤＶＤ（Digital Versatile Disc）ドライブ装置、ＢＤ（Blu-ray（登録商標） Disc）ドライブ装置等が含まれる。記録媒体としては、例えば、不揮発性半導体メモリ（フラッシュメモリ）を含むシリコンディスク、ハードディスク、ＣＤ、ＤＶＤ、ＢＤ、ＵＳＢ（Universal Serial Bus）メモリ、ＳＤ（Secure Digital）メモリカード等がある。 The auxiliary storage device 103 is a storage medium that stores the OS (Operating System), various programs executed by the CPU 101, operation setting information, and the like. The auxiliary storage device 103 is, for example, an HDD (Hard-disk Drive), an SSD (Solid State Drive), an EPROM (Erasable Programmable ROM), a flash memory, a USB (Universal Serial Bus) memory, or the like. The auxiliary storage device 103 also includes a CD (Compact Disc) drive device, a DVD (Digital Versatile Disc) drive device, a BD (Blu-ray (registered trademark) Disc) drive device, and the like. Examples of the recording medium include a silicon disk including a non-volatile semiconductor memory (flash memory), a hard disk, a CD, a DVD, a BD, a USB (Universal Serial Bus) memory, and an SD (Secure Digital) memory card.

通信ＩＦ１０４は、コンピュータ１００に接続するネットワーク、外部ネットワークとのインターフェースである。通信ＩＦ１０４には、所定の規格に基づいて通信を行う無線通信モジュールが含まれる。入出力ＩＦ１０５は、コンピュータ１００に接続する他の装置との間でデータの入出力を行うインターフェースである。 The communication IF 104 is an interface with a network connected to the computer 100 and an external network. The communication IF 104 includes a wireless communication module that communicates based on a predetermined standard. The input / output IF 105 is an interface for inputting / outputting data to / from another device connected to the computer 100.

ＣＰＵ１０１のプログラムの実行により、図１に示すシーンマネージャＳＭ、カメラプロセッサＣ、レンダラＲ、配信サーバＣＨの処理が提供される。但し、上記それぞれの処理の少なくとも一部が、ＤＳＰ（Digital Signal Processor）、ＡＳＩＣ（Application Specific Integrated Circuit）、ＧＰＵ（Graphics Processing Unit）等によって提供されてもよい。同様にして、上記それぞれの処理の少なくとも一部が、ＦＰＧＡ（Field-Programmable Gate Array）、数値演算プロセッサ、ベクトルプロセッサ、画像処理プロセッサ等の専用ＬＳＩ（large scale integration）、その他のデジタル回路であってもよい。また、シーンマネージャＳＭ、カメラプロセッサＣ、レンダラＲ、配信サーバＣＨの少なくとも一部にアナログ回路を含むとしてもよい。 Execution of programs by the CPU 101 provides the processing of the scene manager SM, the camera processor C, the renderer R, and the distribution server CH shown in FIG. 1. However, at least a part of each of the above processes may be provided by a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), a GPU (Graphics Processing Unit), or the like. Similarly, at least a part of each of the above processes may be implemented by an FPGA (Field-Programmable Gate Array), a dedicated LSI (large scale integration) such as a numerical arithmetic processor, a vector processor, or an image processor, or other digital circuits. Further, at least a part of the scene manager SM, the camera processor C, the renderer R, and the distribution server CH may include an analog circuit.

〔３．データ構成〕 [3. Data structure]
図３は、シーンマネージャＳＭにおいて記録・保存される３ＤＣＧ空間内のオブジェクトについてのデータ構成の一例を示す図である。図３に示すように、シーンマネージャＳＭにおいて記録・保存される３ＤＣＧ空間内のオブジェクトについてのデータは、タイムコード毎のレコードとして記録される。上記レコードは、「ＴＣ」、「Ｏｂｊｅｃｔ」、「Ｐｏｓ（ｘ，ｙ，ｚ）」、「Ｒｏｔ（ｐ，ｙ，ｒ）」、「ＦＯＶ（ｆ）」、「Ｓｃａｌｅ（ｓ）」、「Ｏｐｔｉｏｎ｛Ｔｒｉｇｇｅｒ，Ｔａｒｇｅｔ，…｝」といったフィールドを有する。 FIG. 3 is a diagram showing an example of the data structure for objects in the 3DCG space that are recorded and saved by the scene manager SM. As shown in FIG. 3, the data about the objects in the 3DCG space recorded and saved in the scene manager SM is recorded as one record per time code. Each record has fields such as "TC", "Object", "Pos (x, y, z)", "Rot (p, y, r)", "FOV (f)", "Scale (s)", and "Option {Trigger, Target, ...}".

「ＴＣ」フィールドには、３ＤＣＧ空間内における絶対的な時間を表現するカウンタのカウンタ値が格納される。カウンタ値は、例えば、０．１ｍｓといった周期で計測される。なお、「ＴＣ」フィールドには、計測されたカウンタ値から算出された時刻情報が格納されるとしてもよい。図３においては、“ｈ：ｍｍ：ｓｓ”といった時暦、分歴、秒歴の形式の時刻情報が例示される。 The "TC" field stores the value of a counter representing absolute time in the 3DCG space. The counter value is measured at a period of, for example, 0.1 ms. The "TC" field may instead store time information calculated from the measured counter value. FIG. 3 illustrates time information in an hours, minutes, and seconds format such as "h:mm:ss".
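As a rough illustration of how time information might be calculated from the counter value, the following Python sketch converts a tick count into the "h:mm:ss" form. The function name `format_timecode` and the constant `TICKS_PER_SECOND` are hypothetical; the 0.1 ms period is the example value given above, and truncation to whole seconds is an assumption.

```python
TICKS_PER_SECOND = 10_000  # assumes one tick every 0.1 ms, the example period in the text


def format_timecode(counter: int, ticks_per_second: int = TICKS_PER_SECOND) -> str:
    """Convert a time code counter value to an "h:mm:ss" string (fractions of a second are dropped)."""
    total_seconds = counter // ticks_per_second
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


elapsed = format_timecode(36_610_000)  # 3661 seconds after the counter was reset
```

Integer arithmetic is used so that the conversion does not depend on floating-point rounding of the 0.1 ms period.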

「Ｏｂｊｅｃｔ」フィールドには、３ＤＣＧ空間内に配置されるオブジェクトを一意に識別する識別情報が格納される。また、「Ｏｂｊｅｃｔ」フィールドに格納されるデータ例としては、バーチャルカメラＶＣを一意に識別する識別情報が挙げられる。図３においては、リモート会議の参加者の操作可能なアバター（Ｐ１，Ｐ２）が例示される。同様に、配信者Ｕ１が操作するバーチャルカメラＶＣ１（ＨＩＤ１、ＨＩＤ２）、アバターを追従するバーチャルカメラＶＣ２（ＣＡＭ１）が例示される。バーチャルカメラＶＣ１は、３ＤＣＧ空間を立体視するための２つの視点位置（ＨＩＤ１、ＨＩＤ２）を有する。バーチャルカメラＶＣ１の視点位置（ＨＩＤ１、ＨＩＤ２）は、配信者Ｕ１の備えるユーザ入力デバイスを介して操作される。 The "Object" field stores identification information that uniquely identifies an object arranged in the 3DCG space. Further, as an example of the data stored in the "Object" field, there is identification information that uniquely identifies the virtual camera VC. In FIG. 3, avatars (P1, P2) that can be operated by the participants of the remote conference are illustrated. Similarly, the virtual camera VC1 (HID1, HID2) operated by the distributor U1 and the virtual camera VC2 (CAM1) that follows the avatar are exemplified. The virtual camera VC1 has two viewpoint positions (HID1 and HID2) for stereoscopically viewing the 3DCG space. The viewpoint positions (HID1, HID2) of the virtual camera VC1 are operated via the user input device included in the distributor U1.

「Ｐｏｓ（ｘ，ｙ，ｚ）」フィールドには、「Ｏｂｊｅｃｔ」フィールドに格納されたオブジェクトの３ＤＣＧ空間の座標が格納される。「Ｏｂｊｅｃｔ」フィールドに格納されたオブジェクトが、会議参加者のアバターの場合には、そのオブジェクトを構成する複数のポリゴンの重心座標が格納される。「Ｏｂｊｅｃｔ」フィールドに格納されたオブジェクトが、バーチャルカメラＶＣの場合には、視点位置の座標が格納される。配信者Ｕ１が操作するバーチャルカメラＶＣ１の場合には、ＨＩＤ１、ＨＩＤ２のそれぞれについての視点位置の座標が格納される。 The "Pos (x, y, z)" field stores the coordinates of the object stored in the "Object" field in the 3DCG space. If the object stored in the "Object" field is the avatar of a conference participant, the coordinates of the center of gravity of a plurality of polygons constituting the object are stored. If the object stored in the "Object" field is a virtual camera VC, the coordinates of the viewpoint position are stored. In the case of the virtual camera VC1 operated by the distributor U1, the coordinates of the viewpoint position for each of HID1 and HID2 are stored.

「Ｒｏｔ（ｐ，ｙ，ｒ）」フィールドには、「Ｏｂｊｅｃｔ」フィールドに格納されたオブジェクトの移動に伴う、ローカル座標の回転を表すパラメータ座標が格納される。バーチャルカメラＶＣの場合には、視点位置のピッチ軸、ヨー軸、ロール軸についての回転方向を表すパラメータ座標が格納される。配信者Ｕ１が操作するバーチャルカメラＶＣ１の場合には、ＨＩＤ１、ＨＩＤ２のそれぞれについての上記座標が格納される。 In the "Rot (p, y, r)" field, parameter coordinates representing the rotation of local coordinates due to the movement of the object stored in the "Object" field are stored. In the case of the virtual camera VC, the parameter coordinates representing the rotation direction of the pitch axis, yaw axis, and roll axis of the viewpoint position are stored. In the case of the virtual camera VC1 operated by the distributor U1, the above coordinates for each of HID1 and HID2 are stored.

「ＦＯＶ（ｆ）」フィールドには、バーチャルカメラＶＣのカメラ情報として画角を表すパラメータが格納される。「Ｓｃａｌｅ（ｓ）」フィールドには、「Ｏｂｊｅｃｔ」フィールドに格納されたオブジェクトの３ＤＣＧ空間の移動に伴う拡大／縮小のパラメータが格納される。「Ｏｐｔｉｏｎ｛Ｔｒｉｇｇｅｒ，Ｔａｒｇｅｔ，…｝」フィールドには、例えば、３ＤＣＧ空間におけるオブジェクトに質感を与えるためのパラメータ、動作等を示すパラメータが格納される。バーチャルカメラＶＣの場合には、例えば、ズーム対象標や、配信者Ｕ１が操作するバーチャルカメラＶＣ１の起動・停止等の情報が格納される。 In the "FOV (f)" field, a parameter representing the angle of view is stored as camera information of the virtual camera VC. In the "Scale (s)" field, the parameters of enlargement / reduction accompanying the movement of the object stored in the "Object" field in the 3DCG space are stored. In the "Option {Trigger, Target, ...}" Field, for example, a parameter for giving a texture to an object in the 3DCG space, a parameter indicating an operation, and the like are stored. In the case of the virtual camera VC, for example, information such as a zoom target mark and start / stop of the virtual camera VC1 operated by the distributor U1 is stored.
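The record layout described for FIG. 3 could be modeled along the following lines. This is only a sketch: the class name `SceneRecord`, the Python types, the default values, and the sample coordinates are assumptions, while the field names follow the "TC", "Object", "Pos", "Rot", "FOV", "Scale", and "Option" fields described above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class SceneRecord:
    tc: int                                # "TC": time code counter value
    obj: str                               # "Object": identifier such as "P1", "HID1", "CAM1"
    pos: Tuple[float, float, float]        # "Pos (x, y, z)": coordinates in the 3DCG space
    rot: Tuple[float, float, float]        # "Rot (p, y, r)": pitch, yaw, roll
    fov: Optional[float] = None            # "FOV (f)": angle of view, meaningful for virtual cameras
    scale: float = 1.0                     # "Scale (s)": enlargement / reduction parameter
    option: Dict[str, Any] = field(default_factory=dict)  # "Option {Trigger, Target, ...}"


# Illustrative values only; they are not taken from FIG. 3.
records = [
    SceneRecord(tc=0, obj="P1", pos=(0.0, 0.0, 0.0), rot=(0.0, 0.0, 0.0)),
    SceneRecord(tc=0, obj="HID1", pos=(-0.5, 1.5, 2.0), rot=(0.0, 180.0, 0.0),
                fov=60.0, option={"Trigger": "ON"}),
]
```

Keeping the optional camera-only fields on the same record type mirrors the single table of FIG. 3, in which avatars and virtual cameras share one set of columns.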

〔４．処理フロー〕 [4. Processing flow]
以下、図４、５に示すフローチャートを参照し、情報処理システム１０におけるシーンマネージャＳＭの処理を主に説明する。なお、図１に示すシーンマネージャＳＭ、カメラプロセッサＣ、レンダラＲ、配信サーバＣＨは、ＣＰＵ１０１が補助記憶装置１０３に記憶されている各種プログラムや各種データを主記憶装置１０２に読み出して実行することで、図４に示す処理を行う。なお、図５で後述するバーチャルカメラＶＣ１の設定処理についても同様である。 Hereinafter, the processing of the scene manager SM in the information processing system 10 will be mainly described with reference to the flowcharts shown in FIGS. 4 and 5. The scene manager SM, the camera processor C, the renderer R, and the distribution server CH shown in FIG. 1 perform the processing shown in FIG. 4 by having the CPU 101 read various programs and data stored in the auxiliary storage device 103 into the main storage device 102 and execute them. The same applies to the setting processing of the virtual camera VC1 described later with reference to FIG. 5.

図４は、シーンマネージャＳＭのイベントに係る事象の記録処理を示すフローチャートである。シーンマネージャＳＭは、情報処理システム１０を介して提供するリモート会議の進行に伴うプロセスで発生した会議参加者（アバター）間のやり取り、バーチャルカメラＶＣのカメラ情報をタイムコードＴＣに対応付けて記録する。シーンマネージャＳＭは、例えば、図３を用いて説明したデータ構成により、３ＤＣＧ空間内に配置されるオブジェクト、バーチャルカメラＶＣのカメラ情報を補助記憶装置１０３に記録する。同様にして、シーンマネージャＳＭは、配信サーバＣＨから配信された２Ｄ映像の視聴者からのメッセージをタイムコードＴＣに対応付けて記録する。シーンマネージャＳＭでは、アプリケーションプログラムの実行により作成された、３ＤＣＧ空間の映像は、リモート会議参加者にアバター視点を介した立体映像としてリアルタイムに提供される。 FIG. 4 is a flowchart showing the processing by which the scene manager SM records events. The scene manager SM records the exchanges between conference participants (avatars) that occur as the remote conference provided via the information processing system 10 progresses, together with the camera information of the virtual camera VC, in association with the time code TC. For example, the scene manager SM records the objects arranged in the 3DCG space and the camera information of the virtual camera VC in the auxiliary storage device 103 using the data structure described with reference to FIG. 3. Similarly, the scene manager SM records messages from viewers of the 2D video distributed from the distribution server CH in association with the time code TC. In the scene manager SM, the video of the 3DCG space created by executing the application program is provided to the remote conference participants in real time as a stereoscopic image from the avatar's viewpoint.

図４のフローチャートにおいて、処理の開始は、リモート会議といったコンテンツを提供するアプリケーションプログラムの実行のときが例示される。シーンマネージャＳＭは、３ＤＣＧ空間内における絶対的な時間を表現するカウンタであるタイムコードＴＣを初期化（リセット）する（Ｓ１）。シーンマネージャＳＭは、例えば、会議場を構成する様々なオブジェクトを３ＤＣＧ空間内のワールド座標系に配置する。そして、シーンマネージャＳＭは、会議参加者のログインを受け付けるまで待機する（Ｓ２）。但し、リモート会議に参加する会議参加者の人数や役職等が事前に通知されている場合には、予め会議参加者に割当てられたアバターを３ＤＣＧ空間内に配置するとしてもよい。なお、上記オブジェクトは、例えば、シーンマネージャＳＭが参照する補助記憶装置１０３に予め登録される。図４の、待機状態（Ｓ２）においては、シーンマネージャＳＭは、Ｓ３からＳ７の処理で例示される複数のイベント待ちの状態にある。 In the flowchart of FIG. 4, the start of the process is exemplified when the application program that provides the content such as the remote conference is executed. The scene manager SM initializes (reset) the time code TC, which is a counter representing the absolute time in the 3DCG space (S1). The scene manager SM, for example, arranges various objects constituting the conference hall in the world coordinate system in the 3DCG space. Then, the scene manager SM waits until the login of the conference participant is accepted (S2). However, if the number of conference participants and their job titles participating in the remote conference are notified in advance, the avatars assigned to the conference participants in advance may be arranged in the 3DCG space. The object is registered in advance in the auxiliary storage device 103 referred to by the scene manager SM, for example. In the standby state (S2) of FIG. 4, the scene manager SM is in a state of waiting for a plurality of events exemplified by the processes of S3 to S7.

シーンマネージャＳＭは、３ＤＣＧ空間を用いて提供されるリモート会議の会議参加者（プレーヤー）のログイン、ログアウトを受け付ける（Ｓ３）。シーンマネージャＳＭは、会議参加者のログイン、ログアウトを受け付けた場合には（Ｓ３、受け付け）、受け付けた会議参加者のログイン、ログアウトをタイムコードＴＣに対応付けられたキー記録として記録する（Ｓ８）。キー記録には、例えば、会議参加者を一意に識別する識別情報、会議参加者に対応付けられた３Ｄ空間内のアバターの識別情報等が含まれる。 The scene manager SM accepts the login and logout of the conference participants (players) of the remote conference provided using the 3DCG space (S3). When the conference participant's login and logout are accepted (S3, accepted), the scene manager SM records the accepted conference participant's login and logout as a key record associated with the time code TC (S8). .. The key record includes, for example, identification information that uniquely identifies the conference participant, identification information of the avatar in the 3D space associated with the conference participant, and the like.

シーンマネージャＳＭは、ログインを受け付けた会議参加者に、アバター視点から見たリモート会議が行われる３ＤＣＧ空間の立体映像を提供する。ログインした会議参加者は、例えば、ＨＭＤ等を介して立体映像を視聴しながらアバターを操作し、会議場の移動や他の会議参加者への対話や働きかけを行う。３ＤＣＧ空間で生じた会議参加者の行動によるイベントは、視聴中の立体映像に反映される。 The scene manager SM provides the conference participants who have accepted the login with a stereoscopic image of the 3DCG space where the remote conference is performed from the viewpoint of the avatar. The logged-in conference participant operates the avatar while viewing the stereoscopic image via the HMD or the like, moves the conference hall, and interacts with and works with other conference participants. Events caused by the actions of conference participants in the 3DCG space are reflected in the stereoscopic image being viewed.

シーンマネージャＳＭは、リモート会議が進行する３ＤＣＧ空間を中継する複数の配信者Ｕｎのログイン、ログアウトを受け付ける（Ｓ４）。シーンマネージャＳＭは、配信者Ｕｎのログイン、ログアウトを受け付けた場合には（Ｓ４、受け付け）、受け付けた配信者Ｕｎのログイン、ログアウトをタイムコードＴＣに対応付けられたキー記録として記録する（Ｓ８）。キー記録には、例えば、配信者Ｕｎを一意に識別する識別情報、配信者Ｕｎが操作するバーチャルカメラＶＣｎの識別情報が含まれる。なお、３ＤＣＧ空間内における、配信者Ｕｎが操作するバーチャルカメラＶＣｎの初期の配置位置は、シーンマネージャＳＭにより決定される。 The scene manager SM accepts logins and logouts of a plurality of distributors Un that relay the 3DCG space in which the remote conference is proceeding (S4). When the scene manager SM accepts the login and logout of the distributor Un (S4, accept), the scene manager SM records the login and logout of the accepted distributor Un as a key record associated with the time code TC (S8). .. The key recording includes, for example, identification information that uniquely identifies the distributor Un, and identification information of the virtual camera VCn operated by the distributor Un. The initial placement position of the virtual camera VCn operated by the distributor Un in the 3DCG space is determined by the scene manager SM.

配信者Ｕｎは、バーチャルカメラＶＣｎの視点位置（ＨＩＤ１、ＨＩＤ２）を操作し、リモート会議が進行する３ＤＣＧ空間を中継する。バーチャルカメラＶＣｎの視点位置（ＨＩＤ１、ＨＩＤ２）は、３ＤＣＧ空間内の任意の位置に設定される。シーンマネージャＳＭは、ログインを受け付けた配信者Ｕｎに、バーチャルカメラＶＣｎの視点位置から見た３ＤＣＧ空間の立体映像を出力する。配信者Ｕｎは、ＨＭＤ等に表示された立体映像を視聴しながらバーチャルカメラＶＣｎの視点位置を操作し、リモート会議が進行する３ＤＣＧ空間を中継する。バーチャルカメラＶＣｎの操作を介して中継された立体映像には、例えば、リモート会議の進行に伴うイベントに応じた３ＤＣＧ空間内の移動、特定のアバターへのズームアップ等が反映される。 The distributor Un operates the viewpoint positions (HID1, HID2) of the virtual camera VCn, and relays the 3DCG space in which the remote conference is proceeding. The viewpoint positions (HID1, HID2) of the virtual camera VCn are set to arbitrary positions in the 3DCG space. The scene manager SM outputs a stereoscopic image of the 3DCG space seen from the viewpoint position of the virtual camera VCn to the distributor Un who has received the login. The distributor Un operates the viewpoint position of the virtual camera VCn while viewing the stereoscopic image displayed on the HMD or the like, and relays the 3DCG space in which the remote conference is proceeding. The stereoscopic image relayed through the operation of the virtual camera VCn reflects, for example, movement in the 3DCG space according to an event accompanying the progress of the remote conference, zooming up to a specific avatar, and the like.

シーンマネージャＳＭは、会議参加者によるアバターへの操作入力を受け付ける（Ｓ５）。シーンマネージャＳＭは、受け付けた操作入力により、リモート会議が進行する３ＤＣＧ空間に配置されたオブジェクト移動、回転、拡大・縮小が生じた場合には（Ｓ５、Ｙｅｓ）、タイムコードＴＣに対応付けて上記オブジェクト移動、回転、拡大・縮小を表すパラメータをキー記録として記録する（Ｓ９）。シーンマネージャＳＭは、例えば、図３に示すデータ構成を用いて対象オブジェクトについての各種パラメータを記録する。 The scene manager SM accepts operation input to the avatar by the conference participants (S5). When the received operation input causes the movement, rotation, enlargement / reduction of the object arranged in the 3DCG space where the remote conference proceeds (S5, Yes), the scene manager SM associates the object with the time code TC. Parameters representing object movement, rotation, and enlargement / reduction are recorded as key records (S9). The scene manager SM records various parameters for the target object using, for example, the data structure shown in FIG.

シーンマネージャＳＭは、配信者ＵｎによるバーチャルカメラＶＣｎへの操作入力を受け付ける（Ｓ６）。シーンマネージャＳＭは、配信者ＵｎによるバーチャルカメラＶＣｎへの操作入力を受け付けた場合には（Ｓ６、Ｙｅｓ）、バーチャルカメラＶＣｎのカメラ情報をタイムコードＴＣに対応付けてキー記録として記録する（Ｓ１０）。シーンマネージャＳＭは、例えば、図３に示すデータ構成を用いて操作対象のバーチャルカメラＶＣｎについてのカメラ情報を記録する。 The scene manager SM accepts an operation input to the virtual camera VCn by the distributor Un (S6). When the scene manager SM accepts the operation input to the virtual camera VCn by the distributor Un (S6, Yes), the scene manager SM records the camera information of the virtual camera VCn as a key record in association with the time code TC (S10). .. The scene manager SM records camera information about the virtual camera VCn to be operated, for example, using the data structure shown in FIG.

シーンマネージャＳＭは、配信サーバＣＨから配信された２Ｄ映像の視聴者からのメッセージ（視聴者反応）を受け付ける（Ｓ７）。視聴者から受け付けたメッセージには、文字、音声、動作等のリアクション信号が含まれる。シーンマネージャＳＭは、配信サーバＣＨから配信された２Ｄ映像の視聴者からのメッセージを受け付けた場合には（Ｓ７、Ｙｅｓ）、受け付けたメッセージをタイムコードＴＣに対応付けて視聴者反応として記録する（Ｓ１１）。 The scene manager SM accepts messages (viewer reactions) from viewers of the 2D video distributed from the distribution server CH (S7). The messages accepted from viewers include reaction signals such as text, voice, and actions. When the scene manager SM accepts a message from a viewer of the 2D video distributed from the distribution server CH (S7, Yes), it records the accepted message as a viewer reaction in association with the time code TC (S11).
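The event recording of FIG. 4 (events S3 to S7 leading to the records of S8 to S11) can be sketched as a small dispatcher. The event type names, the `SceneLog` class, and the dictionary layout of each entry are hypothetical; only the mapping from event kinds to key records or viewer reactions, each stamped with the time code TC, comes from the description.

```python
from typing import Any, Dict, List

# Hypothetical event names mapped to the kind of record written in S8 to S11.
RECORD_KIND = {
    "participant_login": "key",           # S3 -> S8
    "participant_logout": "key",          # S3 -> S8
    "distributor_login": "key",           # S4 -> S8
    "distributor_logout": "key",          # S4 -> S8
    "avatar_operation": "key",            # S5 -> S9
    "camera_operation": "key",            # S6 -> S10
    "viewer_message": "viewer_reaction",  # S7 -> S11
}


class SceneLog:
    def __init__(self) -> None:
        self.tc = 0  # S1: the time code counter is reset
        self.entries: List[Dict[str, Any]] = []

    def tick(self, n: int = 1) -> None:
        """Advance the time code counter by n ticks."""
        self.tc += n

    def record(self, event_type: str, payload: Dict[str, Any]) -> None:
        """Store the event together with the current time code; unknown events are ignored."""
        kind = RECORD_KIND.get(event_type)
        if kind is None:
            return
        self.entries.append(
            {"tc": self.tc, "kind": kind, "event": event_type, "payload": payload}
        )


log = SceneLog()
log.record("participant_login", {"participant": "user-1", "avatar": "P1"})
log.tick(10)
log.record("viewer_message", {"text": "zoom in on P1"})
log.record("unsupported_event", {})
```

Because every entry carries the time code at which it was accepted, the same log can later be read back in time order.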

コンテンツの提供者（サービス事業者）は、例えば、シーンマネージャＳＭに記録されたメッセージに含まれる、配信された２Ｄ映像についての評価や要望等を提供するコンテンツやサービス内容に反映する。 The content provider (service provider) reflects, for example, in the content or service content that provides the evaluation or request for the delivered 2D video included in the message recorded in the scene manager SM.

次に、図５に示すフローチャートを参照し、シーンマネージャＳＭのバーチャルカメラＶＣ１の設定処理を説明する。図５は、シーンマネージャＳＭのバーチャルカメラＶＣ１の設定処理を示すフローチャートである。なお、図５に示すフローチャートの処理の開始およびＳ１の処理は、図４に示す処理と同様である。 Next, the setting process of the virtual camera VC1 of the scene manager SM will be described with reference to the flowchart shown in FIG. FIG. 5 is a flowchart showing a setting process of the virtual camera VC1 of the scene manager SM. The start of the processing of the flowchart shown in FIG. 5 and the processing of S1 are the same as the processing shown in FIG.

シーンマネージャＳＭは、コントローラＣＮを介し、配信者Ｕｎの操作するバーチャルカメラＶＣｎの操作入力を受け付ける（Ｓ２１）。操作入力には、バーチャルカメラＶＣｎのカメラ情報、例えば、３ＤＣＧ空間内の視点位置（ＨＩＤ１、ＨＩＤ２の各座標）、ズームイン・ズームアウト（拡大・縮小）、カメラの向き（回転取得センサ）、画角、映像取得開始（ＨＩＤトリガーＯＮ／ＯＦＦ）等を表すパラメータが含まれる。シーンマネージャＳＭは、受け付けたバーチャルカメラＶＣｎのカメラ情報を取得し、主記憶装置１０２の所定の領域に一時的に記憶する。シーンマネージャＳＭの処理は、Ｓ２２の処理に移行する。 The scene manager SM receives the operation input of the virtual camera VCn operated by the distributor Un via the controller CN (S21). The operation input includes camera information of the virtual camera VCn, for example, the viewpoint position in 3DCG space (each coordinate of HID1 and HID2), zoom-in / zoom-out (enlargement / reduction), camera orientation (rotation acquisition sensor), and angle of view. , A parameter indicating the start of video acquisition (HID trigger ON / OFF) and the like is included. The scene manager SM acquires the camera information of the received virtual camera VCn and temporarily stores it in a predetermined area of the main storage device 102. The processing of the scene manager SM shifts to the processing of S22.

シーンマネージャＳＭは、操作入力に映像取得開始を表すパラメータ（ＨＩＤトリガーＯＮ）が含まれている場合には（Ｓ２２、Ｙｅｓ）、Ｓ２３の処理に移行する。シーンマネージャＳＭは、操作入力に映像取得開始を表すパラメータ（ＨＩＤトリガーＯＮ）が含まれていない場合には（Ｓ２２、Ｎｏ）、Ｓ２１の処理に戻る。 When the operation input includes a parameter (HID trigger ON) indicating the start of video acquisition (S22, Yes), the scene manager SM shifts to the process of S23. If the operation input does not include the parameter (HID trigger ON) indicating the start of video acquisition, the scene manager SM returns to the process of S21 (S22, No).

シーンマネージャＳＭは、リモート会議が提供される３ＤＣＧ空間内を任意の視点で中継するバーチャルカメラＶＣｎが既に配置されているかを判定する（Ｓ２３）。シーンマネージャＳＭは、上記のバーチャルカメラＶＣｎが既に配置されている場合には（Ｓ２３，Ｙｅｓ）、Ｓ２６の処理に移行する。シーンマネージャＳＭは、上記のバーチャルカメラＶＣｎが未だ配置されていない場合には（Ｓ２３，Ｎｏ）、Ｓ２４の処理に移行する。 The scene manager SM determines whether or not a virtual camera VCn that relays from an arbitrary viewpoint in the 3DCG space where the remote conference is provided is already arranged (S23). When the virtual camera VCn is already arranged (S23, Yes), the scene manager SM shifts to the process of S26. If the virtual camera VCn is not yet arranged (S23, No), the scene manager SM shifts to the process of S24.

Ｓ２４の処理では、シーンマネージャＳＭは、図６を用いて後述するカメラ方向設定アルゴリズムに従って、バーチャルカメラＶＣｎの視点位置（ＨＩＤ１、ＨＩＤ２）を設定する。シーンマネージャＳＭは、Ｓ２４の処理で設定された視点位置（ＨＩＤ１、ＨＩＤ２）に、３ＤＣＧ空間内を任意の視点で中継するバーチャルカメラＶＣｎを配置する（Ｓ２５）。３ＤＣＧ空間におけるバーチャルカメラＶＣｎの配置位置は、２つの視点位置（ＨＩＤ１、ＨＩＤ２）の中間点により表される。シーンマネージャＳＭの処理は、Ｓ２７の処理に移行する。 In the process of S24, the scene manager SM sets the viewpoint positions (HID1, HID2) of the virtual camera VCn according to the camera direction setting algorithm described later with reference to FIG. The scene manager SM arranges the virtual camera VCn that relays in the 3DCG space from an arbitrary viewpoint at the viewpoint positions (HID1, HID2) set in the process of S24 (S25). The arrangement position of the virtual camera VCn in the 3DCG space is represented by the midpoint between the two viewpoint positions (HID1 and HID2). The processing of the scene manager SM shifts to the processing of S27.

Ｓ２６の処理では、シーンマネージャＳＭは、バーチャルカメラＶＣｎの位置、角度等を更新する。シーンマネージャＳＭは、Ｓ２１の処理で受け付けたＨＩＤ１、ＨＩＤ２の各座標からバーチャルカメラＶＣｎの位置を更新し、カメラの向きからカメラの回転角度を更新する。シーンマネージャＳＭの処理は、Ｓ２７の処理に移行する。 In the process of S26, the scene manager SM updates the position, angle, and the like of the virtual camera VCn. The scene manager SM updates the position of the virtual camera VCn from the coordinates of HID1 and HID2 received in the process of S21, and updates the rotation angle of the camera from the direction of the camera. The processing of the scene manager SM shifts to the processing of S27.

Ｓ２７の処理では、シーンマネージャＳＭは、Ｓ２５、Ｓ２６の処理で設定されたバーチャルカメラＶＣｎのカメラ情報等を補助記憶装置１０３に記録する。上記カメラ情報は、例えば、図３を用いて説明したデータ構成により記録される。シーンマネージャＳＭの処理は、Ｓ２８の処理に移行する。 In the process of S27, the scene manager SM records the camera information of the virtual camera VCn set in the processes of S25 and S26 in the auxiliary storage device 103. The camera information is recorded, for example, by the data structure described with reference to FIG. The processing of the scene manager SM shifts to the processing of S28.

Ｓ２８の処理では、シーンマネージャＳＭは、タイムコードＴＣのカウント値が記録可能時間に到達したかを判定する。ここで、記録可能時間とは、予め定められた設定時間であり、例えば、補助記憶装置１０３等のメモリ容量等から求めることができる。シーンマネージャＳＭは、タイムコードＴＣのカウント値が記録可能時間に到達している場合には（Ｓ２８，Ｙｅｓ）、図５に示す処理を終了する。また、シーンマネージャＳＭは、タイムコードＴＣのカウント値が記録可能時間に到達していない場合には（Ｓ２８，Ｎｏ）、Ｓ２１の処理に移行し、図５に示す処理を継続する。 In the process of S28, the scene manager SM determines whether the count value of the time code TC has reached the recordable time. Here, the recordable time is a predetermined set time, and can be obtained from, for example, the memory capacity of the auxiliary storage device 103 or the like. When the count value of the time code TC has reached the recordable time (S28, Yes), the scene manager SM ends the process shown in FIG. If the count value of the time code TC has not reached the recordable time (S28, No), the scene manager SM shifts to the process of S21 and continues the process shown in FIG.
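The control flow of FIG. 5 (S21 to S28) can be summarized in a simplified Python sketch. The function `run_camera_setup`, the dictionary keys, and the use of one operation input per time code tick are assumptions for illustration; in particular, S24 is reduced here to placing the camera at the midpoint of HID1 and HID2, and the camera direction setting algorithm of FIG. 6 is not modeled.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def midpoint(a: Vec3, b: Vec3) -> Vec3:
    """Camera placement position: the midpoint between the two viewpoint positions."""
    return tuple((x + y) / 2 for x, y in zip(a, b))


def run_camera_setup(inputs: Iterable[Dict], recordable_ticks: int) -> List[Dict]:
    camera: Optional[Dict] = None
    recorded: List[Dict] = []
    for tc, op in enumerate(inputs):           # S21: accept one operation input per tick (assumption)
        if not op.get("trigger_on"):           # S22: no HID trigger ON, wait for the next input
            continue
        position = midpoint(op["hid1"], op["hid2"])
        if camera is None:                     # S23 No -> S24, S25: place a new camera
            camera = {"pos": position, "rot": op.get("rot", (0.0, 0.0, 0.0))}
        else:                                  # S23 Yes -> S26: update position and angle
            camera["pos"] = position
            camera["rot"] = op.get("rot", camera["rot"])
        recorded.append({"tc": tc, **camera})  # S27: record the camera information
        if tc >= recordable_ticks:             # S28: stop once the recordable time is reached
            break
    return recorded


inputs = [
    {"trigger_on": False},
    {"trigger_on": True, "hid1": (0.0, 0.0, 0.0), "hid2": (2.0, 0.0, 0.0)},
    {"trigger_on": True, "hid1": (2.0, 0.0, 0.0), "hid2": (4.0, 0.0, 0.0), "rot": (0.0, 90.0, 0.0)},
]
recorded = run_camera_setup(inputs, recordable_ticks=10)
```

Each recorded entry is a copy of the camera state at that tick, so later updates in S26 do not alter earlier records.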

次に、図６を参照し、カメラ方向設定アルゴリズムを説明する。図６は、バーチャルカメラＶＣｎのカメラ方向の設定を説明する説明図である。図６においては、バーチャルカメラＶＣｎの視点方向の、右方向への回転（時計回り）を上面視した上面図が例示される。図６に示すバーチャルカメラＶＣｎの状態は、（１）→（２）→（３）の順で変化する。なお、（１）に示す状態は、初期状態を表す。 Next, the camera direction setting algorithm will be described with reference to FIG. FIG. 6 is an explanatory diagram illustrating the setting of the camera direction of the virtual camera VCn. In FIG. 6, a top view is illustrated in which the rotation (clockwise) of the virtual camera VCn in the viewpoint direction is viewed from above. The state of the virtual camera VCn shown in FIG. 6 changes in the order of (1) → (2) → (3). The state shown in (1) represents an initial state.

図６において、ＨＩＤ１、ＨＩＤ２は、３ＤＣＧ空間内におけるバーチャルカメラＶＣｎの視点位置を表し、ＨＩＤ１とＨＩＤ２の中間点Ｐ１が、バーチャルカメラＶＣｎの配置位置を表す。上記視点位置、配置位置は、ワールド座標系で表される。ＨＩＤ１、ＨＩＤ２の視点位置は、バーチャルカメラＶＣｎを操作する配信者Ｕｎによって設定される。なお、ＨＩＤ１とＨＩＤ２との間の距離は、バーチャルカメラＶＣｎの画角（ＦＯＶ）、または、焦点処理（ｆ）に対応付けられる。バーチャルカメラＶＣｎの画角（ＦＯＶ）は、例えば、配信者Ｕｎの操作入力、或いは、シーンマネージャＳＭの制御によって、ＨＩＤ１とＨＩＤ２との間の距離を広げることで、バーチャルカメラＶＣｎの画角（ＦＯＶ）の広角化が可能になる。 In FIG. 6, HID1 and HID2 represent the viewpoint positions of the virtual camera VCn in the 3DCG space, and the midpoint P1 between HID1 and HID2 represents the placement position of the virtual camera VCn. The viewpoint positions and the placement position are expressed in the world coordinate system. The viewpoint positions of HID1 and HID2 are set by the distributor Un who operates the virtual camera VCn. The distance between HID1 and HID2 is associated with the angle of view (FOV) of the virtual camera VCn or with the focus processing (f). For example, widening the distance between HID1 and HID2, through an operation input by the distributor Un or under the control of the scene manager SM, makes it possible to widen the angle of view (FOV) of the virtual camera VCn.

中間点Ｐ１に配置されたバーチャルカメラＶＣｎの、３ＤＣＧ空間内の注視点は「ＶＤ」で表され、中間点Ｐ１から注視点ＶＤに向かう矢印が３ＤＣＧ空間内におけるバーチャルカメラＶＣｎの視軸方向を表す。「ＶＤ１」はＨＩＤ１による視軸方向を表し、「ＶＤ２」はＨＩＤ２による視軸方向を表す。なお、図６に示すＰ２、Ｐ３は、配信者Ｕｎの左右の視点を表す。 The gazing point in the 3DCG space of the virtual camera VCn arranged at the midpoint P1 is represented by "VD", and the arrow from the midpoint P1 to the gazing point VD indicates the visual axis direction of the virtual camera VCn in the 3DCG space. "VD1" represents the visual axis direction of HID1, and "VD2" represents the visual axis direction of HID2. Note that P2 and P3 shown in FIG. 6 represent the left and right viewpoints of the distributor Un.

図６に示すカメラ方向設定アルゴリズムにおいては、ＨＩＤ１の視軸方向の単位ベクトルを定数倍（ｈ）した終点座標ＶＰ１と、ＨＩＤ２の視軸方向の単位ベクトルを定数倍（ｈ）した終点座標ＶＰ２との中間点を用いて、カメラ方向が移動するバーチャルカメラＶＣｎの注視点ＶＤを設定する。 In the camera direction setting algorithm shown in FIG. 6, the gazing point VD of the virtual camera VCn, whose camera direction moves, is set using the midpoint between the end point coordinate VP1, obtained by multiplying the unit vector in the visual axis direction of HID1 by a constant (h), and the end point coordinate VP2, obtained by multiplying the unit vector in the visual axis direction of HID2 by the constant (h).

図６（１）に示すように、初期状態においては、終点座標ＶＰ１、ＶＰ２、注視点ＶＤの座標は、バーチャルカメラＶＣｎの視軸に直交する同一平面上に存在する。図６（１）の状態で、配信者Ｕｎの操作入力よってバーチャルカメラＶＣｎの回転角度のセンサ値が入力される。シーンマネージャＳＭは、例えば、ＨＩＤ１の視軸の方向を維持した状態で、ＨＩＤ２の視軸を入力された回転角度に沿って右方向に回転させる（図６（２））。 As shown in FIG. 6 (1), in the initial state, the coordinates of the end point coordinates VP1, VP2, and the gazing point VD exist on the same plane orthogonal to the visual axis of the virtual camera VCn. In the state of FIG. 6 (1), the sensor value of the rotation angle of the virtual camera VCn is input by the operation input of the distributor Un. For example, the scene manager SM rotates the visual axis of HID2 to the right along the input rotation angle while maintaining the direction of the visual axis of HID1 (FIG. 6 (2)).

図６（２）に示すように、シーンマネージャＳＭは、例えば、ＨＩＤ１の終点座標ＶＰ１と、回転移動したＨＩＤ２の終点座標ＶＰ２との中間点にバーチャルカメラＶＣｎの注視点ＶＤを設定する。３ＤＣＧ空間におけるバーチャルカメラＶＣｎの視軸方向は、ＨＩＤ１とＨＩＤ２との中間点Ｐ１と、回転移動後の上記注視点ＶＤとを結ぶ直線方向となる。 As shown in FIG. 6 (2), the scene manager SM sets the gazing point VD of the virtual camera VCn at the midpoint between the end point coordinate VP1 of the HID1 and the end point coordinate VP2 of the rotated HID2, for example. The visual axis direction of the virtual camera VCn in the 3DCG space is a linear direction connecting the intermediate point P1 between HID1 and HID2 and the gaze point VD after rotational movement.

シーンマネージャＳＭは、図６（２）に示す状態から、ＨＩＤ２の視軸を入力された回転角度に沿って右方向に回転させると共に、ＨＩＤ１の視軸を右方向に回転させる（図６（３））。このときの、ＨＩＤ１の視軸の右方向への回転移動量は、図６（２）に示すＨＩＤ２の視軸の回転移動量に相当する。 From the state shown in FIG. 6 (2), the scene manager SM rotates the visual axis of HID2 to the right along the input rotation angle and also rotates the visual axis of HID1 to the right (FIG. 6 (3)). The amount by which the visual axis of HID1 rotates to the right at this time corresponds to the amount of rotational movement of the visual axis of HID2 shown in FIG. 6 (2).

図６（３）に示すように、シーンマネージャＳＭは、回転移動後のＨＩＤ１の終点座標ＶＰ１と、ＨＩＤ２の終点座標ＶＰ２との中間点にバーチャルカメラＶＣｎの注視点ＶＤを設定する。３ＤＣＧ空間におけるバーチャルカメラＶＣｎの視軸方向は、ＨＩＤ１とＨＩＤ２との中間点Ｐ１と、回転移動によって更新された注視点ＶＤとを結ぶ直線方向になる。 As shown in FIG. 6 (3), the scene manager SM sets the gazing point VD of the virtual camera VCn at the midpoint between the end point coordinate VP1 of HID1 and the end point coordinate VP2 of HID2 after rotational movement. The visual axis direction of the virtual camera VCn in the 3DCG space is a linear direction connecting the midpoint P1 between HID1 and HID2 and the gazing point VD updated by rotational movement.

シーンマネージャＳＭは、入力された回転角度のセンサ値が反映されるまで、上述した処理を連続して実行し、バーチャルカメラＶＣｎの注視点ＶＤと視軸方向を回転させる。シーンマネージャＳＭにおいては、図６を用いて説明したカメラ方向設定アルゴリズムを実行することで、バーチャルカメラＶＣｎのより自然な回転操作を立体映像に反映する。 The scene manager SM continuously executes the above-described processing until the input sensor value of the rotation angle is reflected, and rotates the gazing point VD and the visual axis direction of the virtual camera VCn. In the scene manager SM, by executing the camera direction setting algorithm described with reference to FIG. 6, a more natural rotation operation of the virtual camera VCn is reflected in the stereoscopic image.
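The camera direction setting algorithm of FIG. 6 can be sketched in the top-view plane as follows. The function names, the two-dimensional simplification, the yaw convention (0 degrees pointing along +y, positive angles turning clockwise), and the sample values for HID1, HID2, h, and the 60 degree rotation are all assumptions; the sketch also assumes that each end point VP is measured from its own HID position.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def unit_from_yaw(yaw_deg: float) -> Vec2:
    """Unit vector in the top-view plane for a yaw angle (clockwise positive, 0 = +y)."""
    rad = math.radians(yaw_deg)
    return (math.sin(rad), math.cos(rad))


def midpoint(a: Vec2, b: Vec2) -> Vec2:
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)


def camera_axis(hid1: Vec2, hid2: Vec2, yaw1_deg: float, yaw2_deg: float,
                h: float) -> Tuple[Vec2, Vec2, float]:
    """Return placement P1, gazing point VD, and the yaw of the axis from P1 to VD."""
    u1, u2 = unit_from_yaw(yaw1_deg), unit_from_yaw(yaw2_deg)
    vp1 = (hid1[0] + h * u1[0], hid1[1] + h * u1[1])  # end point VP1
    vp2 = (hid2[0] + h * u2[0], hid2[1] + h * u2[1])  # end point VP2
    p1 = midpoint(hid1, hid2)                         # camera placement position
    vd = midpoint(vp1, vp2)                           # gazing point VD
    yaw = math.degrees(math.atan2(vd[0] - p1[0], vd[1] - p1[1]))
    return p1, vd, yaw


HID1, HID2, H, THETA = (-1.0, 0.0), (1.0, 0.0), 5.0, 60.0
state1 = camera_axis(HID1, HID2, 0.0, 0.0, H)      # FIG. 6 (1): initial state
state2 = camera_axis(HID1, HID2, 0.0, THETA, H)    # FIG. 6 (2): only HID2 rotated
state3 = camera_axis(HID1, HID2, THETA, THETA, H)  # FIG. 6 (3): HID1 rotated by the same amount
```

Under these assumptions the vector from P1 to VD equals h(u1 + u2)/2, so the camera axis bisects the two HID axes: it turns by half the input angle in state (2) and by the full angle in state (3), which is one way to read the intermediate step as smoothing the rotation.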

以上、説明したように、情報処理システム１０は、リモート会議が進行する３ＤＣＧ空間の任意の位置にバーチャルカメラを設定できる。情報処理システム１０は、バーチャルカメラを介し、リモート会議が進行するプロセスで発生した会議参加者間のやり取り等の事象を、任意の視点位置から見た立体映像として中継できる。情報処理システム１０は、中継された立体映像を２次元映像に変換できる。情報処理システム１０は、リモート会議の様子を視聴する視聴者の有する、スマートフォン等の２Ｄ映像デバイスにリアルタイムに配信できる。 As described above, the information processing system 10 can set the virtual camera at an arbitrary position in the 3DCG space where the remote conference proceeds. The information processing system 10 can relay an event such as an exchange between conference participants that occurs in the process of proceeding with a remote conference as a stereoscopic image viewed from an arbitrary viewpoint position via a virtual camera. The information processing system 10 can convert the relayed stereoscopic image into a two-dimensional image. The information processing system 10 can deliver in real time to a 2D video device such as a smartphone owned by a viewer who watches the state of a remote conference.

情報処理システム１０においては、３次元映像で視聴するための専用デバイスや通信回線の帯域拡張といった環境整備を視聴者に負担させることなく、３ＤＣＧ空間で進行中のリモート会議の様子がリアルタイムに視聴できる。 With the information processing system 10, a viewer can watch a remote conference in progress in the 3DCG space in real time without bearing the burden of preparing an environment, such as a dedicated device for viewing 3D images or expanded communication-line bandwidth.

情報処理システム１０は、２次元映像を視聴する視聴者からのメッセージ（文字、音声、動作等のリアクション信号）を受け付けることができる。情報処理システム１０は、例えば、受け付けたメッセージに基づいて、リモート会議が進行する３ＤＣＧ空間内のバーチャルカメラの中継する視点位置を移動できる。視点位置が更新されたバーチャルカメラの中継する立体映像は、２次元映像に変換されて視聴者に配信される。情報処理システム１０は、配信中の２次元映像に視聴者の意図を反映できる。情報処理システム１０においては、視聴者の反応や視点を考慮した２次元映像が配信できる。すなわち、例えば、会議に発言者として参加するユーザ間のやり取りを、事後あるいはリアルタイムに視聴する視聴者の反応や視点を考慮した映像の配信が望まれるが、これを、情報処理システム１０により実現できる。 The information processing system 10 can receive messages (reaction signals such as text, voice, and actions) from a viewer who views the two-dimensional image. For example, based on a received message, the information processing system 10 can move the viewpoint position relayed by the virtual camera in the 3DCG space where the remote conference is proceeding. The stereoscopic image relayed by the virtual camera whose viewpoint position has been updated is converted into a two-dimensional image and distributed to the viewer. The information processing system 10 can thus reflect the viewer's intention in the two-dimensional image being distributed, and can distribute a two-dimensional image that takes the viewer's reactions and viewpoint into account. That is, it is desirable, for example, to distribute video that takes into account the reactions and viewpoints of viewers who watch, after the fact or in real time, the exchanges between users participating in the conference as speakers, and the information processing system 10 can realize this.

情報処理システム１０は、リモート会議が進行する３ＤＣＧ空間のオブジェクトの状態、バーチャルカメラの状態を示す情報を、タイムコードに関連付けて記録・保存することができる。情報処理システム１０は、タイムコードに関連付けて記録・保存された情報に基づいて、リモート会議の立体映像を再現することができる。情報処理システム１０は、再現されたリモート会議の２次元映像を、事後に視聴する視聴者の２Ｄ映像デバイスに配信できる。情報処理システム１０は、事後に配信する２次元映像についても視聴者からのメッセージを受け付けることができる。情報処理システム１０においては、事後においても視聴者の反応や視点を考慮した２次元映像が配信できる。 The information processing system 10 can record and store information indicating the state of an object in the 3DCG space in which a remote conference is proceeding and the state of a virtual camera in association with a time code. The information processing system 10 can reproduce a stereoscopic image of a remote conference based on the information recorded and stored in association with the time code. The information processing system 10 can deliver the reproduced two-dimensional image of the remote conference to the viewer's 2D image device to be viewed after the fact. The information processing system 10 can also receive a message from the viewer regarding the two-dimensional video to be delivered after the fact. In the information processing system 10, it is possible to deliver a two-dimensional image in consideration of the reaction and viewpoint of the viewer even after the fact.
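
The time-code-keyed recording and reproduction described above might be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the class `SceneRecorder`, its methods, and the dictionary layout of a snapshot are all hypothetical, and time codes are assumed to be numbers recorded in non-decreasing order.

```python
import bisect
from dataclasses import dataclass, field

@dataclass
class SceneRecorder:
    """Stores object and virtual-camera state keyed by time code (hypothetical)."""
    timecodes: list = field(default_factory=list)
    snapshots: list = field(default_factory=list)

    def record(self, tc, objects, camera):
        """Append a snapshot; time codes must not go backwards."""
        if self.timecodes and tc < self.timecodes[-1]:
            raise ValueError("time codes must be recorded in non-decreasing order")
        self.timecodes.append(tc)
        self.snapshots.append({"objects": dict(objects), "camera": dict(camera)})

    def state_at(self, tc):
        """Return the latest snapshot recorded at or before time code tc."""
        i = bisect.bisect_right(self.timecodes, tc) - 1
        if i < 0:
            raise KeyError("no snapshot recorded at or before this time code")
        return self.snapshots[i]

rec = SceneRecorder()
rec.record(0.0, {"avatar": (0, 0, 0)}, {"pos": (0, 1, -3)})
rec.record(1.0, {"avatar": (1, 0, 0)}, {"pos": (2, 1, -3)})
replayed = rec.state_at(0.5)
```

Looking up a time code between two recorded snapshots returns the earlier one, which is one simple way to reproduce the scene for after-the-fact viewing; interpolation between snapshots would be an alternative design.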

本実施形態に係る情報処理システム１０によれば、ＣＧで描画された３次元空間内における事象の、２次元映像デバイスを用いたリアルタイムおよび非同期の双方向参加が可能な技術が提供できる。 According to the information processing system 10 according to the present embodiment, it is possible to provide a technique capable of real-time and asynchronous bidirectional participation of events in a three-dimensional space drawn by CG using a two-dimensional video device.

〔変形例〕 [Modification example]
実施形態においては、情報処理システム１０の提供するコンテンツは、３ＤＣＧ空間を用いたコンテンツとして説明した。情報処理システム１０は、例えば、実際の会議場をＣＣＤ（Charge-Coupled Device）、ＣＭＯＳ（Complementary Metal-Oxide Semiconductor）等の撮像素子を備えるカメラで撮影し、撮影された会議場にアバター等のオブジェクトを配置する形態としてもよい。実際の会議場を用いることで、リモート会議における臨場感や没入感の向上が期待できる。 In the embodiment, the content provided by the information processing system 10 has been described as content using the 3DCG space. The information processing system 10 may instead, for example, capture an actual conference hall with a camera equipped with an image sensor such as a CCD (Charge-Coupled Device) or CMOS (Complementary Metal-Oxide Semiconductor) sensor, and place objects such as avatars in the captured conference hall. Using an actual conference hall is expected to improve the sense of presence and immersion in remote conferences.

図７は、実空間の会議場Ａ２を撮影するカメラＲＣを備える情報処理システムの一例を示す構成図である。カメラＲＣは、外部ネットワークに接続し、リモート会議の会議場となる実空間の映像を取得する。実空間の会議場の映像を撮影するカメラＲＣは、複数に存在し得る。 FIG. 7 is a configuration diagram showing an example of an information processing system including a camera RC for photographing a conference hall A2 in a real space. The camera RC connects to an external network and acquires real-space images that serve as a conference hall for remote conferences. There may be a plurality of camera RCs that capture images of a conference hall in real space.

実空間の会議場の映像を撮影するカメラＲＣは、会議場に対して固定されているものとする。シーンマネージャＳＭは、例えば、カメラＲＣの位置、傾き、画角等のカメラ情報を用いて撮像された映像の動画フレームを解析し、会議場を撮影するカメラの視軸を推定する。そして、シーンマネージャＳＭは、固定されたカメラの視軸に基づいて、３ＤＣＧで描画されたアバター等のオブジェクトを配置するワールド座標系を特定する。シーンマネージャＳＭは、ワールド座標系に３ＤＣＧで作成されたオブジェクトを配置するための基準となるマーカ等を設定する。そして、シーンマネージャＳＭは、マーカ位置に３ＤＣＧで作成されたアバター等のオブジェクトを配置し、実施形態で説明したリモート会議のコンテンツを提供するとすればよい。なお、実空間を撮影するカメラＲＣが固定でない場合には、実空間内に撮影位置の推定基準となるマーカ（実マーカ）を設置すればよい。 It is assumed that the camera RC that captures the image of the conference hall in the real space is fixed to the conference hall. The scene manager SM, for example, analyzes a moving image frame of an image captured using camera information such as the position, tilt, and angle of view of the camera RC, and estimates the visual axis of the camera that shoots the conference hall. Then, the scene manager SM specifies a world coordinate system in which an object such as an avatar drawn by 3DCG is arranged based on the fixed visual axis of the camera. The scene manager SM sets a marker or the like as a reference for arranging an object created by 3DCG in the world coordinate system. Then, the scene manager SM may arrange an object such as an avatar created by 3DCG at the marker position and provide the contents of the remote conference described in the embodiment. If the camera RC that shoots the real space is not fixed, a marker (real marker) that serves as an estimation reference for the shooting position may be installed in the real space.

リモート会議の会議参加者には、例えば、実空間で撮影中の会議場と３ＤＣＧで作成されたアバター等が合成された３次元空間が、アバター視点から見た立体映像として提供される。実空間から撮影された会議場においても、会議参加者は自身のアバターを操作し、会議における発言や対話、他者への働きかけや会議場内の移動等の行動が可能になる。リモート会議の様子を中継する配信者Ｕｎは、実施形態と同様にしてバーチャルカメラＶＣｎを操作し、合成された３次元空間の任意の視点位置から見た立体映像を中継することができる。 The conference participants of the remote conference are provided, for example, with a three-dimensional space in which the conference hall being captured in real space is combined with avatars and the like created by 3DCG, as a stereoscopic image viewed from the avatar's viewpoint. Even in a conference hall captured from real space, conference participants can operate their own avatars to speak and converse in the conference, interact with others, and move within the conference hall. The distributor Un, who relays the remote conference, can operate the virtual camera VCn in the same manner as in the embodiment and relay a stereoscopic image viewed from an arbitrary viewpoint position in the synthesized three-dimensional space.

なお、情報処理システム１０は、会議参加者や配信者といった被写体を撮影し、撮影された被写体の実空間における映像を３ＤＣＧ空間に合成するとしてもよい。実空間における被写体の映像と３ＤＣＧ空間の映像との合成として、クロマキー（Chroma key）合成が例示される。図８、図９は、クロマキー合成を用いる際の、実空間における被写体の撮影を説明する図である。 The information processing system 10 may photograph a subject such as a conference participant or a distributor, and synthesize an image of the photographed subject in the real space into the 3DCG space. Chroma key composition is an example of how the image of the subject in the real space and the image of the 3DCG space may be combined. FIGS. 8 and 9 are diagrams for explaining shooting of a subject in real space when chroma key composition is used.

図８、図９において、背景２０はグリーンバック、あるいは、ブルーバックといった所定色の背景である。クロマキー合成においては、所定色の背景２０が撮影された領域に、３ＤＣＧの映像が合成される。被写体２１の映像は、背景２０で指定される所定色の実空間を背景として、カメラ２２によって撮影される。カメラ２２は、例えば、カメラの配置位置、傾き、画角といったカメラ情報と共に被写体２１の実空間における映像を撮影する。被写体２１の位置は、センサ２３によって特定される。なお、カメラ２２は、例えば、撮影対象になる空間の奥行きを取得可能な奥行きカメラであってもよい。 In FIGS. 8 and 9, the background 20 is a background of a predetermined color such as a green background or a blue background. In chroma key compositing, a 3DCG image is composited in the area where the background 20 of a predetermined color is captured. The image of the subject 21 is photographed by the camera 22 against the background of a real space of a predetermined color designated by the background 20. The camera 22 captures an image of the subject 21 in the real space together with camera information such as the arrangement position, tilt, and angle of view of the camera, for example. The position of the subject 21 is specified by the sensor 23. The camera 22 may be, for example, a depth camera capable of acquiring the depth of the space to be photographed.

図９に示すように、被写体２１は、自身の頭部の動きを検知可能なＨＭＤ２５や腕部の動きを検知するモーションセンサ２４を備えるとしてもよい。被写体２１が、情報処理システム１０の提供する立体映像の配信者の場合には、実施形態で説明したように、合成された３ＤＣＧ空間を任意の視点位置から中継するバーチャルカメラＶＣの操作デバイスを備える。ＨＭＤ２５、モーションセンサ２４を介して検知された被写体２１の動きは、合成される３ＤＣＧ空間内の動作や行動に反映される。 As shown in FIG. 9, the subject 21 may be equipped with an HMD 25 capable of detecting the movement of the subject's head and a motion sensor 24 that detects the movement of the subject's arm. When the subject 21 is a distributor of a stereoscopic image provided by the information processing system 10, the subject 21 has an operation device for the virtual camera VC that relays the synthesized 3DCG space from an arbitrary viewpoint position, as described in the embodiment. The movement of the subject 21 detected via the HMD 25 and the motion sensor 24 is reflected in the movements and actions in the synthesized 3DCG space.

図１０は、合成映像の一例を示す図である。図１０においては、図９に示す所定色の背景２０に対する映像内の部分領域に対して、３ＤＣＧ空間の映像２６を合成した状態が例示される。情報処理システム１０においては、被写体２１が会議参加者の場合には、被写体２１と３ＤＣＧ空間で作成された映像２６とを合成した立体映像がリモート会議の会議参加者に提供される。 FIG. 10 is a diagram showing an example of a composite video. In FIG. 10, a state in which the image 26 in the 3DCG space is synthesized with respect to the partial region in the image with respect to the background 20 of the predetermined color shown in FIG. 9 is illustrated. In the information processing system 10, when the subject 21 is a conference participant, a stereoscopic image obtained by synthesizing the subject 21 and the image 26 created in the 3DCG space is provided to the conference participant in the remote conference.

図１１は、２Ｄ映像デバイスに表示された合成映像の一例を示す図である。図１１においては、２Ｄ映像デバイス２７に表示された画面内には、ＨＭＤ等を装着した被写体２１と３ＤＣＧ空間の映像２６とが合成された状態で表示される。なお、図１１の２Ｄ映像デバイス２７は、ＨＭＤ等を装着した被写体２１と３ＤＣＧ空間の映像２６とが合成された立体映像を、裸眼で視聴可能なデバイスであってもよい。被写体２１が、情報処理システム１０の提供する立体映像の配信者の場合には、例えば、合成された３ＤＣＧ空間で進行中の事象を任意の視点位置から中継すると共に、事象の解説や事象に至る背景を音声にて視聴者に通知できる。 FIG. 11 is a diagram showing an example of a composite video displayed on a 2D video device. In FIG. 11, in the screen displayed on the 2D video device 27, the subject 21 equipped with the HMD or the like and the video 26 in the 3DCG space are displayed in a combined state. The 2D video device 27 in FIG. 11 may be a device that allows the naked eye to view a stereoscopic video in which the subject 21 equipped with the HMD or the like and the video 26 in the 3DCG space are combined. When the subject 21 is a distributor of a stereoscopic image provided by the information processing system 10, for example, an event in progress in the synthesized 3DCG space is relayed from an arbitrary viewpoint position, and an explanation of the event or an event is reached. The background can be notified to the viewer by voice.

次に、カメラで撮影された被写体がリモート会議に参加する形態について説明する。図１２は、ＨＭＤを装着した被写体がリモート会議に参加する形態の説明図である。図１２において、「Ｃ１１」は、図８、図９で説明した被写体Ｕａを撮影するカメラを表す。また、「Ｒ１１」は、被写体Ｕａの装着するＨＭＤにリアルタイムに表示される映像を表し、「Ｒ１２」は、外部ネットワークに接続する視聴者に配信される映像を表す。「Ｒ１３」は、カメラＣ１１で撮影された被写体Ｕａが合成されたリモート会議のリアルタイム映像を表す。被写体Ｕａは、ＨＭＤに表示される映像を視聴しながらリモートに参加する。ＨＭＤには、例えば、ＨＭＤを介して検知された被写体Ｕａの頭部の動きに対応する視点位置の映像が表示される。 Next, a mode in which a subject photographed by the camera participates in a remote conference will be described. FIG. 12 is an explanatory diagram of a mode in which a subject wearing an HMD participates in a remote conference. In FIG. 12, “C11” represents a camera that captures the subject Ua described with reference to FIGS. 8 and 9. Further, "R11" represents an image displayed in real time on the HMD worn by the subject Ua, and "R12" represents an image distributed to a viewer connected to an external network. “R13” represents a real-time image of a remote conference in which the subject Ua taken by the camera C11 is combined. Subject Ua participates remotely while watching the image displayed on the HMD. For example, the HMD displays an image of the viewpoint position corresponding to the movement of the head of the subject Ua detected via the HMD.

図８、図９を用いて説明したように、被写体Ｕａは、グリーンバック等のクロマキー合成が可能な背景を設置し、ＲＧＢ映像を撮影可能なカメラＣ１１、あるいは、撮影された映像領域から背景の切り出しが可能な奥行きカメラＣ１１で自身を撮影する。カメラＣ１１は、少なくともカメラ位置（Ｐｏｓ（ｘ，ｙ，ｚ））、回転（Ｒｏｔ（ｐ，ｙ，ｒ））を含むカメラ情報をタイムコードＴＣに対応付けて記録すると共に、上記カメラ情報と被写体Ｕａの映像をシーンマネージャＳＭに出力する。 As described with reference to FIGS. 8 and 9, the subject Ua sets up a background that permits chroma key composition, such as a green background, and photographs himself or herself with a camera C11 capable of capturing RGB images, or with a depth camera C11 capable of separating the background from the captured image area. The camera C11 records camera information including at least the camera position (Pos (x, y, z)) and rotation (Rot (p, y, r)) in association with the time code TC, and outputs the camera information and the image of the subject Ua to the scene manager SM.
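
As a rough illustration of camera information associated with a time code, the sketch below pairs each camera frame with the 3DCG frame that carries the same time code. The names `CameraInfo` and `pair_by_timecode` are hypothetical, the time code is assumed to be an integer frame count, and reading Rot (p, y, r) as pitch, yaw, and roll is an assumption that the text does not state.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CameraInfo:
    """Camera information recorded per frame (hypothetical layout)."""
    tc: int      # time code TC, assumed here to be a frame count
    pos: tuple   # Pos (x, y, z)
    rot: tuple   # Rot (p, y, r), assumed to mean pitch, yaw, roll

def pair_by_timecode(camera_frames, cg_frames):
    """Pair camera frames with 3DCG frames that share the same time code.

    camera_frames: list of (CameraInfo, image) tuples from the real camera.
    cg_frames:     dict mapping time code -> rendered 3DCG frame.
    Camera frames without a matching 3DCG frame are skipped.
    """
    pairs = []
    for info, image in camera_frames:
        cg = cg_frames.get(info.tc)
        if cg is not None:
            pairs.append((info, image, cg))
    return pairs

camera_frames = [
    (CameraInfo(tc=100, pos=(0.0, 1.5, -2.0), rot=(0.0, 0.0, 0.0)), "cam-100"),
    (CameraInfo(tc=101, pos=(0.0, 1.5, -2.0), rot=(0.0, 5.0, 0.0)), "cam-101"),
]
cg_frames = {101: "cg-101", 102: "cg-102"}
pairs = pair_by_timecode(camera_frames, cg_frames)
```

Only the frame with time code 101 exists on both sides here, so only that pair would be handed on to the compositing step.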

シーンマネージャＳＭは、カメラＣ１１の映像と３ＤＣＧのオブジェクトとをタイムコードＴＣを合致させて合成する。シーンマネージャＳＭによって合成されたカメラＣ１１の映像と３ＤＣＧ空間のオブジェクトは、映像Ｒ１３に示すように、被写体Ｕａを含むクロマキー合成されたリアルタイム映像が生成される。映像Ｒ１３の生成処理は、例えば、グリーンバック等で撮影されたフレーム毎のＲＧＢ画像の色空間を回転し、ＨＳＢ（Hue、Saturation、Brightness）空間に変換する。そして、上記生成処理においては、変換された画像に対する背景除去処理、および、カメラＣ１１の視軸方向から視た３ＤＣＧ映像との合成処理がリアルタイムで行われる。生成された映像Ｒ１３は、配信サーバＣＨを介し、外部ネットワークに接続する視聴者の２Ｄ映像デバイスに２次元映像として配信される。 The scene manager SM synthesizes the image of the camera C11 and the 3DCG objects by matching their time codes TC. From the image of the camera C11 and the objects in the 3DCG space, the scene manager SM generates a chroma-key-composited real-time image including the subject Ua, as shown in the image R13. In the process of generating the image R13, for example, the color space of the RGB image of each frame captured against a green background or the like is rotated and converted into the HSB (Hue, Saturation, Brightness) space. Then, background removal on the converted image and compositing with the 3DCG image viewed from the visual axis direction of the camera C11 are performed in real time. The generated image R13 is distributed as a two-dimensional image, via the distribution server CH, to the 2D video devices of viewers connected to the external network.
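
The per-frame conversion to HSB followed by background removal and compositing can be sketched with the Python standard library. This is a simplified illustration, not the disclosed processing: frames are assumed to be nested lists of 8-bit RGB tuples, the key hue and tolerance values are arbitrary assumptions, and the function names are hypothetical. `colorsys.rgb_to_hsv` is used because HSV is the same model as HSB.

```python
import colorsys

def is_background(rgb, key_hue=1.0 / 3.0, hue_tol=0.08, min_sat=0.4, min_val=0.3):
    """Return True if an 8-bit RGB pixel is close to the key colour (green by default)."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)      # HSV is equivalent to HSB
    diff = abs(h - key_hue)
    hue_dist = min(diff, 1.0 - diff)            # hue is circular in [0, 1)
    return hue_dist <= hue_tol and s >= min_sat and v >= min_val

def chroma_key(camera_frame, cg_frame):
    """Replace key-coloured pixels of the camera frame with the 3DCG frame's pixels.

    Both frames are lists of rows of (R, G, B) tuples with identical dimensions.
    """
    return [
        [cg if is_background(cam) else cam for cam, cg in zip(cam_row, cg_row)]
        for cam_row, cg_row in zip(camera_frame, cg_frame)
    ]

camera_frame = [[(0, 255, 0), (200, 150, 120)]]   # green backdrop, skin-like pixel
cg_frame = [[(10, 20, 30), (40, 50, 60)]]
composited = chroma_key(camera_frame, cg_frame)
```

A real-time pipeline would more likely operate on whole image arrays on a GPU; the pixel-by-pixel loop here only shows the decision made for each pixel.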

図１３は、ＨＭＤを装着しない被写体がリモート会議に参加する形態の説明図である。被写体がＨＭＤを装着しない形態においても、図１２と同様の処理が行われる。但し、被写体Ｕａは、リアルタイムで進行するリモート会議の立体映像を裸眼で視聴可能な表示デバイスに表示された映像、あるいは、情報処理システム１０によって変換された２Ｄ映像を視聴しながら参加すればよい。シーンマネージャＳＭは、例えば、補助記憶装置１０３に、カメラＣ１１の映像と３ＤＣＧ空間のオブジェクトを合成したリモート会議のデータを、オフライン動画ファイル（ＳＶ）として保存するとしてもよい。なお、シーンマネージャＳＭは、図１２の形態についても同様にして、上記リモート会議のデータを補助記憶装置１０３に保存できる。 FIG. 13 is an explanatory diagram of a mode in which a subject without an HMD participates in a remote conference. Even when the subject does not wear the HMD, the same processing as in FIG. 12 is performed. However, the subject Ua may participate while viewing the image displayed on the display device capable of viewing the stereoscopic image of the remote conference progressing in real time with the naked eye, or the 2D image converted by the information processing system 10. For example, the scene manager SM may save the data of the remote conference in which the image of the camera C11 and the object in the 3DCG space are combined in the auxiliary storage device 103 as an offline moving image file (SV). The scene manager SM can store the data of the remote conference in the auxiliary storage device 103 in the same manner for the form shown in FIG.

図１３に示す形態は、例えば、事後に再生されるリモート会議の映像についても適用が可能である。図１４は、ＨＭＤを装着しない被写体が事後に再生されるリモート会議に参加する形態の説明図である。図１４においては、補助記憶装置１０３等にオフライン動画ファイル（ＳＶ）として保存された映像が、被写体Ｕａの視聴可能な映像として再生される。但し、再生される事後の映像は、他の資料映像やシーンを記録した動画やシーケンスといった互換を有する動画映像であってもよい。 The form shown in FIG. 13 can also be applied to, for example, a video of a remote conference to be reproduced after the fact. FIG. 14 is an explanatory diagram of a mode in which a subject without an HMD participates in a remote conference that is reproduced after the fact. In FIG. 14, an image stored as an offline moving image file (SV) in the auxiliary storage device 103 or the like is reproduced as a viewable image of the subject Ua. However, the post-reproduced video may be a compatible moving image such as another document video or a video or sequence in which a scene is recorded.

図１４において、情報処理システム１０のシーンマネージャＳＭは、補助記憶装置１０３に保存されたオフライン動画ファイル（ＳＶ）を読み出して再生する。再生されたオフライン動画ファイル（ＳＶ）は、被写体Ｕａが視聴可能な表示デバイスに映像Ｒ１１として表示される。ここで、映像Ｒ１１は、被写体Ｕａが視聴可能な形態であれば、立体映像であってもよく、２Ｄ映像であってもよい。 In FIG. 14, the scene manager SM of the information processing system 10 reads and reproduces an offline moving image file (SV) stored in the auxiliary storage device 103. The reproduced offline moving image file (SV) is displayed as the image R11 on a display device that the subject Ua can view. Here, the image R11 may be a stereoscopic image or a 2D image, as long as it is in a form that the subject Ua can view.

図１４の被写体Ｕａは、グリーンバック等のクロマキー合成が可能な背景を設置し、ＲＧＢ映像を撮影可能なカメラＣ１１、あるいは、撮影された映像領域から背景の切り出しが可能な奥行きカメラＣ１１で自身を撮影する。カメラＣ１１は、少なくともカメラ位置（Ｐｏｓ（ｘ，ｙ，ｚ））、回転（Ｒｏｔ（ｐ，ｙ，ｒ））を含むカメラ情報をタイムコードＴＣに対応付けて記録すると共に、上記カメラ情報と被写体Ｕａの映像をシーンマネージャＳＭに出力する。シーンマネージャＳＭは、カメラＣ１１の映像と３ＤＣＧのオブジェクトとをタイムコードＴＣを合致させて合成し、被写体Ｕａがクロマキー合成された映像Ｒ１３に示す映像を生成する。なお、映像Ｒ１３の生成処理については、図１３を用いて説明した。 The subject Ua in FIG. 14 sets up a background that permits chroma key composition, such as a green background, and photographs himself or herself with a camera C11 capable of capturing RGB images, or with a depth camera C11 capable of separating the background from the captured image area. The camera C11 records camera information including at least the camera position (Pos (x, y, z)) and rotation (Rot (p, y, r)) in association with the time code TC, and outputs the camera information and the image of the subject Ua to the scene manager SM. The scene manager SM synthesizes the image of the camera C11 and the 3DCG objects by matching their time codes TC, and generates the image shown as the image R13, in which the subject Ua is chroma-key-composited. The process of generating the image R13 has been described with reference to FIG. 13.

オフライン動画ファイル（ＳＶ）を用いて生成された映像Ｒ１３は、配信サーバＣＨを介し、外部ネットワークに接続する視聴者の２Ｄ映像デバイスに２次元映像として配信される。視聴者には、オフライン動画ファイル（ＳＶ）を用いて再生された立体映像内に、事後の参加者としてリモート会議に参加する被写体Ｕａを含む２次元映像が配信される。シーンマネージャＳＭは、再生されたオフライン動画ファイル（ＳＶ）に対して被写体Ｕａが合成された映像データを補助記憶装置１０３に保存できる。 The video R13 generated using the offline video file (SV) is distributed as a two-dimensional video to the viewer's 2D video device connected to the external network via the distribution server CH. A two-dimensional image including the subject Ua who participates in the remote conference as a subsequent participant is delivered to the viewer in the stereoscopic image reproduced using the offline moving image file (SV). The scene manager SM can save the video data in which the subject Ua is combined with the reproduced offline moving image file (SV) in the auxiliary storage device 103.

図１４に示す形態では、例えば、被写体Ｕａの身振りや手振り等のアクションを含む演技が、再生されたオフライン動画ファイル（ＳＶ）に対して反映される。被写体Ｕａの演技によっては、例えば、過去に進行されたリモート会議内に被写体Ｕａが存在しているかのような映像を合成し、タイムコードＴＣに対応付けて記録することが可能になる。この結果、図１４に示す形態の情報処理システム１０においては、リアルタイムで会議に参加できなかった人物が、ＨＭＤの装着無しに会議に参加し、発言を記録するといったデータの共有化が可能になる。補助記憶装置１０３に保存されたオフライン動画ファイル（ＳＶ）は、再帰的に使用できるため、複数の会議参加者がタイムコード上において前後して参加する「時間を超えた会議」が可能になる。 In the form shown in FIG. 14, for example, a performance by the subject Ua that includes actions such as body and hand gestures is reflected in the reproduced offline moving image file (SV). Depending on the performance of the subject Ua, it becomes possible, for example, to synthesize an image in which the subject Ua appears to be present in a remote conference held in the past, and to record that image in association with the time code TC. As a result, the information processing system 10 in the form shown in FIG. 14 enables data sharing in which a person who could not participate in the conference in real time participates in the conference without wearing an HMD and has his or her remarks recorded. Since the offline moving image file (SV) stored in the auxiliary storage device 103 can be used recursively, a "meeting beyond time" becomes possible, in which a plurality of conference participants join at earlier or later points on the time code.

図１５は、実空間における被写体を撮影するカメラを備える場合の、イベントに係る事象の記録処理を示すフローチャートである。情報処理システム１０が実空間における被写体を撮影するカメラを備える場合には、例えば、シーンマネージャＳＭは、図４に示すＳ１の処理の実行前に、図１５に示すＳ３１からＳ３３の処理を実行するとすればよい。 FIG. 15 is a flowchart showing the process of recording occurrences related to an event when a camera for photographing a subject in real space is provided. When the information processing system 10 includes a camera that photographs a subject in real space, the scene manager SM may, for example, execute the processes S31 to S33 shown in FIG. 15 before executing the process S1 shown in FIG. 4.

図１５のＳ３１の処理では、シーンマネージャＳＭは、実空間における被写体を撮影するカメラ（実カメラ）の存在を判定する。シーンマネージャＳＭは、実カメラが存在する場合には（Ｓ３１，有）、Ｓ３３の処理に移行する。一方、シーンマネージャＳＭは、実カメラが存在しない場合には（Ｓ３１，無）、Ｓ３２の処理に移行する。Ｓ３２の処理では、シーンマネージャＳＭは、実施形態で説明した３ＤＣＧ空間のコンテンツ映像を生成する。また、Ｓ３３の処理では、シーンマネージャＳＭは、変形例で説明した実カメラで撮影された被写体の映像と３ＤＣＧ空間のオブジェクトとの座標系を合致させて合成されたコンテンツ映像（合成映像）を生成する。図１５に示す処理を実行するシーンマネージャＳＭにおいては、Ｓ３２あるいはＳ３３の処理で生成された映像コンテンツのオブジェクトを対象として、図４に示すイベントに係る事象の記録処理が行われる。 In the process of S31 of FIG. 15, the scene manager SM determines whether a camera that photographs a subject in real space (a real camera) is present. If a real camera is present (S31, Yes), the scene manager SM proceeds to the process of S33. If no real camera is present (S31, No), the scene manager SM proceeds to the process of S32. In the process of S32, the scene manager SM generates the content image of the 3DCG space described in the embodiment. In the process of S33, the scene manager SM generates a content image (composite image) by matching the coordinate systems of the image of the subject photographed by the real camera, as described in the modification example, and the objects in the 3DCG space, and synthesizing them. The scene manager SM that executes the process shown in FIG. 15 then performs the recording process shown in FIG. 4 on the objects of the video content generated in the process of S32 or S33.
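
The branch from S31 to S32 or S33 can be sketched as follows. The function names and the dictionary returned for each kind of content are hypothetical placeholders; the actual generation of 3DCG or composite images is outside the scope of this sketch, and the presence of a real camera is assumed to be signalled simply by whether a camera frame is supplied.

```python
def generate_cg_content(scene_objects):
    """S32: generate content from the 3DCG space only (placeholder)."""
    return {"kind": "3dcg", "objects": list(scene_objects)}

def generate_composite_content(scene_objects, camera_frame):
    """S33: generate composite content from a real-camera image and 3DCG objects (placeholder)."""
    return {
        "kind": "composite",
        "objects": list(scene_objects),
        "camera_frame": camera_frame,
    }

def prepare_content(scene_objects, camera_frame=None):
    """S31: choose the generation path depending on whether a real camera is present.

    A camera_frame of None is taken to mean that no real camera exists (assumption).
    """
    if camera_frame is not None:
        return generate_composite_content(scene_objects, camera_frame)  # S31: Yes -> S33
    return generate_cg_content(scene_objects)                           # S31: No  -> S32

without_camera = prepare_content(["avatar_P1", "avatar_P2"])
with_camera = prepare_content(["avatar_P1"], camera_frame="frame-0001")
```

Whichever branch is taken, the returned content is what the subsequent recording process of FIG. 4 would operate on.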

なお、図１５のＳ３１に示す処理は、例えば、コンテンツ提供者、あるいは、配信者の操作入力であってもよい。例えば、Ｓ３２の処理で生成されるコンテンツとＳ３３の処理で生成されるコンテンツのそれぞれが提供可能な場合に、シーンマネージャＳＭは、操作入力に従って生成するコンテンツの選択が可能になる。 The process shown in S31 of FIG. 15 may be, for example, an operation input of a content provider or a distributor. For example, when each of the content generated by the process of S32 and the content generated by the process of S33 can be provided, the scene manager SM can select the content to be generated according to the operation input.

他の変形例として、情報処理システム１０は、視聴者からのメッセージを収集して解析し、配信者の視点を評価するとしてもよい。評価の高い配信者には、配信希望者が相対的に集まるため、例えば、配信映像に含まれる広告等の広告効果が期待できる。また、情報処理システム１０においては、視聴者からのメッセージを収集して解析することで、視聴者の興味の傾向を把握することが可能になる。また、情報処理システム１０の提供する、事後における映像配信のサービス形態として、視聴者を配信者に採用することも可能である。配信映像に対する、視聴者の視点から見た演出・編集が可能になる。 As another modification, the information processing system 10 may collect and analyze messages from viewers and evaluate the viewpoint of the distributor. Because people seeking distribution tend to gather around highly rated distributors, an advertising effect can be expected, for example, for advertisements included in the distributed video. Further, by collecting and analyzing messages from viewers, the information processing system 10 makes it possible to grasp trends in viewers' interests. It is also possible to adopt a viewer as a distributor as a service form of after-the-fact video distribution provided by the information processing system 10. This makes it possible to direct and edit the distributed video from the viewer's point of view.

１０情報処理システム
２０背景
２１被写体
２２カメラ
２３センサ
２４モーションセンサ
２５ＨＭＤ
２６映像
２７２Ｄ映像デバイス
１００コンピュータ
１０１ＣＰＵ
１０２主記憶装置
１０３補助記憶装置
１０４入出力ＩＦ
１０５通信ＩＦ
１０６接続バス
Ｃ、Ｃ１、Ｃ２、Ｃ３カメラプロセッサ
Ｃ１１カメラ
ＶＣ、ＶＣ１、ＶＣ２、ＶＣ３バーチャルカメラ（仮想カメラ）
Ｐ１、Ｐ２アバター
Ｒ、Ｒ１、Ｒ２、Ｒ３レンダラ
ＲＣカメラ（実カメラ）
ＳＭシーンマネージャ
ＳＶオフライン動画ファイル

10 Information processing system
20 Background
21 Subject
22 Camera
23 Sensor
24 Motion sensor
25 HMD
26 Video
27 2D video device
100 Computer
101 CPU
102 Main storage device
103 Auxiliary storage device
104 Input/output IF
105 Communication IF
106 Connection bus
C, C1, C2, C3 Camera processor
C11 Camera
VC, VC1, VC2, VC3 Virtual camera
P1, P2 Avatar
R, R1, R2, R3 Renderer
RC Camera (real camera)
SM Scene manager
SV Offline moving image file

Claims

An information processing system comprising:
a moving image data generating means for generating multi-viewpoint three-dimensional moving image data of a shooting target including an avatar assigned to a person in real space;
a means for accepting a viewpoint setting for the multi-viewpoint three-dimensional moving image data in accordance with an operation on an input device by the person to whom the avatar is assigned;
a moving image generating means for generating a moving image of the shooting target as viewed from the accepted viewpoint;
a means for providing the generated moving image to a viewer device; and
a recording means for recording the set viewpoint,
wherein the moving image data generating means operates the avatar in response to acquiring a detection result of a sensor.

The information processing system according to claim 1, wherein the moving image data generating means generates the multi-viewpoint three-dimensional moving image data including the viewpoint of the avatar.

An information processing system comprising:
a moving image data generating means for generating multi-viewpoint three-dimensional moving image data of a shooting target person;
a means for accepting a viewpoint setting for the multi-viewpoint three-dimensional moving image data in accordance with an operation on an input device by the shooting target person;
a moving image generating means for generating a moving image of the shooting target as viewed from the accepted viewpoint;
a means for providing the generated moving image to a viewer device; and
a recording means for recording the set viewpoint,
wherein the moving image data generating means generates the three-dimensional moving image data by synthesizing a real-space image including the shooting target person, captured by an imaging device, with a virtual three-dimensional space.

The information processing system according to any one of claims 1 to 3, wherein the moving image generating means generates the moving image by converting the multi-viewpoint three-dimensional moving image data into a two-dimensional image.

The information processing system according to claim 3 or 4, wherein the providing means provides the generated moving image to the viewer device for displaying a two-dimensional image.

The information processing system according to claim 3, wherein the shooting target person is a distributor who distributes a moving image.

The information processing system according to any one of claims 1 to 6, further comprising means for receiving a reaction to the moving image from the viewer device.

The information processing system according to claim 7, wherein the recording means further records the time in the multi-viewpoint three-dimensional moving image data and the reaction.

The information processing system according to claim 8, wherein the moving image generating means reproduces the moving image based on the viewpoint recorded in the recording means and the time in the multi-viewpoint three-dimensional moving image data recorded in the recording means.

The information processing system according to claim 9, wherein the moving image generating means changes the viewpoint related to the moving image to a viewpoint different from the viewpoint recorded in the recording means based on the reaction.

The information processing system according to any one of claims 1 to 10, further comprising means for synthesizing the multi-viewpoint three-dimensional moving image data with an image obtained after the time when the multi-viewpoint three-dimensional moving image data was generated.

The information processing system according to any one of claims 1 to 11, wherein each means is realized by computers connected to each other.

An information processing method in which computers connected to each other execute:
a moving image data generating step of generating multi-viewpoint three-dimensional moving image data of a shooting target including an avatar assigned to a person in real space;
a step of generating, based on a viewpoint setting for the multi-viewpoint three-dimensional moving image data made in accordance with an operation on an input device by the person to whom the avatar is assigned, a moving image of the shooting target as viewed from the accepted viewpoint;
a step of providing the generated moving image to a viewer device; and
a step of recording the set viewpoint,
wherein the moving image data generating step operates the avatar in response to acquiring a detection result of a sensor.

An information processing method in which computers connected to each other execute:
a step of generating multi-viewpoint three-dimensional moving image data of a shooting target person by synthesizing a real-space image including the shooting target person, captured by an imaging device, with a virtual three-dimensional space;
a step of generating, based on a viewpoint setting for the multi-viewpoint three-dimensional moving image data made in accordance with an operation on an input device by the shooting target person, a moving image of the shooting target as viewed from the accepted viewpoint;
a step of providing the generated moving image to a viewer device; and
a step of recording the set viewpoint.

An information processing program that causes a computer to execute processing of:
generating multi-viewpoint three-dimensional moving image data of a shooting target including an avatar assigned to a person in real space;
generating, based on a viewpoint setting for the multi-viewpoint three-dimensional moving image data made in accordance with an operation on an input device by the person to whom the avatar is assigned, a moving image of the shooting target as viewed from the accepted viewpoint;
providing the generated moving image to a viewer device; and
recording the set viewpoint,
wherein the generating of the multi-viewpoint three-dimensional moving image data operates the avatar in response to acquiring a detection result of a sensor.

An information processing program that causes a computer to execute processing of:
generating multi-viewpoint three-dimensional moving image data of a shooting target person by synthesizing a real-space image including the shooting target person, captured by an imaging device, with a virtual three-dimensional space;
generating, based on a viewpoint setting for the multi-viewpoint three-dimensional moving image data made in accordance with an operation on an input device by the shooting target person, a moving image of the shooting target as viewed from the accepted viewpoint;
providing the generated moving image to a viewer device; and
recording the set viewpoint.