JP7194125B2

JP7194125B2 - Methods and systems for generating virtualized projections of customized views of real-world scenes for inclusion within virtual reality media content

Info

Publication number: JP7194125B2
Application number: JP2019566146A
Authority: JP
Inventors: マイケル・ロダト; パイ・ムードラジリ; ルネ・セプルベダ; オリバー・エス・カスタネダ
Original assignee: Verizon Patent and Licensing Inc
Current assignee: Verizon Patent and Licensing Inc
Priority date: 2017-05-31
Filing date: 2018-05-24
Publication date: 2022-12-21
Anticipated expiration: 2038-05-24
Also published as: US11055917B2; EP3631767A1; US20190206138A1; WO2018222498A1; US20180350147A1; US10269181B2; CN110663067A; JP2020522801A; CN110663067B; KR102499904B1; KR20200013672A

Description

Related Applications
This application claims priority to U.S. Patent Application No. 15/610,595, filed May 31, 2017, and entitled "METHODS AND SYSTEMS FOR GENERATING A VIRTUALIZED PROJECTION OF A CUSTOMIZED VIEW OF A REAL-WORLD SCENE FOR INCLUSION WITHIN VIRTUAL REALITY MEDIA," which is hereby incorporated by reference in its entirety.

Virtual reality media content may be presented to users (i.e., viewers) of the virtual reality media content to immerse the users in interactive virtual reality worlds that the users may experience by directing their attention to any of a variety of things being presented at the same time. For example, at any time during the presentation of the virtual reality media content, a user experiencing the virtual reality media content may look around the immersive virtual reality world in any direction, giving the user a sense that he or she is actually present in the immersive virtual reality world and is experiencing it from a particular location and perspective (e.g., orientation, viewpoint, etc.) within the immersive virtual reality world.

In some examples, it may be desirable for an immersive virtual reality world to be based on a real-world scene. For example, some or all of an immersive virtual reality world represented within virtual reality media content may model scenery, locations, events, objects, and/or other subjects that exist in the real world, as opposed to existing only in a virtual or imaginary world. As such, capture devices (e.g., image and/or video capture devices such as cameras, video cameras, etc.) may be used to detect, record, and/or capture data representative of the real-world scene so that the data may be included within virtual reality media content from which a representation of the real-world scene may be generated. Unfortunately, it may be impossible or impractical to position physical capture devices with respect to the real-world scene so as to capture data from every location, orientation, field of view, and so forth that may be desirable.

Moreover, even if a large number of physical capture devices were employed to capture data from a large number of locations, orientations, fields of view, and the like, it may be impractical and/or inefficient to include all of the data captured by these capture devices within the virtual reality media content presented to a user. For example, data distribution limitations (e.g., network bandwidth, device decoding capabilities, etc.), significant redundancy in the captured data, data describing different details of the real-world scene that have different relevance to different users of the virtual reality media content at the same time, and other factors may each contribute to the impracticality and/or inefficiency of using large numbers of physical capture devices to capture and distribute data representative of a real-world scene.

The accompanying drawings illustrate various embodiments and are a part of the specification. The illustrated embodiments are merely examples and do not limit the scope of the disclosure. Throughout the drawings, identical or similar reference numbers designate identical or similar elements.

FIG. 1 illustrates an exemplary virtualized projection generation system for generating a virtualized projection of a customized view of a real-world scene for inclusion within virtual reality media content according to principles described herein.

FIG. 2 illustrates an exemplary configuration in which data representative of an exemplary real-world scene is captured from different views of the real-world scene according to principles described herein.

FIG. 3A illustrates an exemplary capture device capturing color and depth frames for inclusion within a surface data frame sequence representative of the real-world scene of FIG. 2 according to principles described herein.

FIG. 3B illustrates an exemplary graphical depiction of color data represented in a color frame captured by the capture device of FIG. 3A according to principles described herein.

FIG. 3C illustrates an exemplary graphical depiction of depth data represented in a depth frame captured by the capture device of FIG. 3A according to principles described herein.

FIGS. 4A and 4B illustrate different representations of an exemplary surface data frame sequence representative of the real-world scene of FIG. 2 and generated by the capture device of FIG. 3A according to principles described herein.

FIG. 5 illustrates an exemplary configuration based on the configuration of FIG. 2 in which data representative of the real-world scene of FIG. 2 is additionally generated for a customized view of the real-world scene according to principles described herein.

FIG. 6 illustrates an exemplary virtualized surface data frame sequence including color and depth frames for an exemplary virtualized projection of the customized view of the real-world scene of FIG. 5 according to principles described herein.

FIG. 7 illustrates a graphical representation of an exemplary transport stream that includes an exemplary plurality of surface data frame sequences according to principles described herein.

FIG. 8 illustrates a data structure representation of the exemplary transport stream of FIG. 7 according to principles described herein.

FIG. 9 illustrates a graphical representation of an exemplary transport stream that includes an exemplary frame sequence implementing a tile map according to principles described herein.

FIG. 10 illustrates a data structure representation of the exemplary transport stream of FIG. 9 according to principles described herein.

FIG. 11 illustrates an exemplary configuration in which an exemplary virtual reality media content provider system generates virtual reality media content based on a real-world scene and provides the virtual reality media content to an exemplary client-side media player device used by a user to experience a representation of the real-world scene according to principles described herein.

FIG. 12 illustrates various exemplary types of media player devices that may be used by a user to experience virtual reality media content according to principles described herein.

FIG. 13 illustrates an exemplary virtual reality experience in which a user is presented with exemplary virtual reality media content based on a real-world scene as experienced from a dynamically selectable virtual viewpoint corresponding to an exemplary arbitrary virtual location with respect to the real-world scene according to principles described herein.

FIGS. 14 and 15 illustrate exemplary methods for generating a virtualized projection of a customized view of a real-world scene for inclusion within virtual reality media content according to principles described herein.

FIG. 16 illustrates an exemplary computing device according to principles described herein.

Methods and systems for generating a virtualized projection of a customized view of a real-world scene for inclusion within virtual reality media content are described herein. For example, as will be described in more detail below, a virtualized projection generation system may receive (e.g., request, acquire, access, etc.) a plurality of captured surface data frame sequences. Each surface data frame sequence in the plurality of captured surface data frame sequences may include color and depth frames depicting the real-world scene in accordance with a respective set of capture parameters included in a plurality of sets of capture parameters associated with different views of the real-world scene. For example, each set of capture parameters associated with each view of the real-world scene may include parameters representative of a capture location, orientation, field of view, depth mapping, depth range, quality level, format, source, dynamic range, and/or other characteristics by which the respective surface data frame sequence represents the view of the real-world scene. Each surface data frame sequence in the plurality of captured surface data frame sequences may be captured by a different capture device in a plurality of capture devices disposed at different locations with respect to the real-world scene so as to capture the different views of the real-world scene. For instance, each different capture device may be associated with (e.g., configured to capture the real-world scene in accordance with) one of the different sets of capture parameters in the plurality of sets of capture parameters.
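
By way of illustration only, the relationship between a set of capture parameters and a surface data frame sequence might be modeled as in the following sketch. The class names, field names, units, and the choice of which parameters to include are assumptions made for this example and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Color = Tuple[int, int, int]


@dataclass(frozen=True)
class CaptureParameters:
    """One set of capture parameters for one view of the real-world scene."""
    location: Vec3                    # capture location (units are an assumption)
    orientation: Vec3                 # e.g. yaw, pitch, roll in degrees (assumed convention)
    field_of_view_deg: float
    depth_range: Tuple[float, float]  # nearest and farthest representable depth
    quality_level: str = "high"       # illustrative label only


@dataclass
class SurfaceDataFrame:
    """A color frame and a depth frame that depict the scene at one instant."""
    color: List[List[Color]]
    depth: List[List[float]]


@dataclass
class SurfaceDataFrameSequence:
    """Frames captured (or rendered) in accordance with one set of capture parameters."""
    parameters: CaptureParameters
    frames: List[SurfaceDataFrame] = field(default_factory=list)
    captured: bool = True             # False would mark a virtualized sequence


def parameter_sets_are_distinct(sequences: List[SurfaceDataFrameSequence]) -> bool:
    """Check that every sequence is associated with a different parameter set."""
    seen = {seq.parameters for seq in sequences}
    return len(seen) == len(sequences)
```

Making the parameter set immutable and hashable is a design choice of this sketch. It allows a simple set-based check that each capture device is associated with a different set of capture parameters.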

In addition to receiving the plurality of surface data frame sequences associated with the different sets of capture parameters, the virtualized projection generation system may identify an additional set of capture parameters distinct from the sets of capture parameters included in the plurality of sets of capture parameters. The additional set of capture parameters may be associated with a customized view of the real-world scene that is distinct from the different views of the real-world scene captured by the plurality of capture devices. Based on the surface data frame sequences in the plurality of captured surface data frame sequences and on the additional set of capture parameters, the virtualized projection generation system may render color and depth frames for a virtualized projection of the customized view of the real-world scene.
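
The rendering technique is not specified in this passage. As one hedged illustration, the sketch below forward-warps a single captured color and depth frame into a virtual camera that differs from the capturing camera only by a translation. The function name, the pinhole camera model, the shared focal length, and the absence of rotation are all assumptions of this example. A practical renderer would typically combine several captured sequences and fill the holes that warping leaves behind.

```python
from typing import List, Optional, Tuple

Color = Tuple[int, int, int]
Vec3 = Tuple[float, float, float]


def render_virtual_view(
    color: List[List[Color]],
    depth: List[List[float]],
    focal: float,
    src_pos: Vec3,
    dst_pos: Vec3,
) -> Tuple[List[List[Optional[Color]]], List[List[float]]]:
    """Forward-warp one captured frame into a translated virtual camera.

    Both cameras are assumed to be pinhole cameras looking down +z, with the
    same focal length (in pixels) and the principal point at the image center.
    """
    height, width = len(depth), len(depth[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    out_color: List[List[Optional[Color]]] = [[None] * width for _ in range(height)]
    out_depth = [[float("inf")] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            z = depth[v][u]
            if z <= 0:
                continue  # no valid depth sample at this pixel
            # Unproject pixel (u, v) to a 3D point in world coordinates.
            x = (u - cx) * z / focal + src_pos[0]
            y = (v - cy) * z / focal + src_pos[1]
            zw = z + src_pos[2]
            # Express the point relative to the virtual camera.
            xr, yr, zr = x - dst_pos[0], y - dst_pos[1], zw - dst_pos[2]
            if zr <= 0:
                continue  # point lies behind the virtual camera
            u2 = round(focal * xr / zr + cx)
            v2 = round(focal * yr / zr + cy)
            inside = 0 <= u2 < width and 0 <= v2 < height
            # Z-buffer test: keep only the surface nearest to the virtual camera.
            if inside and zr < out_depth[v2][u2]:
                out_depth[v2][u2] = zr
                out_color[v2][u2] = color[v][u]
    return out_color, out_depth
```

Pixels of the virtual view that receive no sample keep a color of `None` and an infinite depth, which makes the gaps left by a single source view explicit.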

The virtualized projection generation system may provide a virtualized surface data frame sequence that includes the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene to one or more other systems (e.g., to one or more media player devices associated with users, to one or more downstream systems in a virtual reality media content provider pipeline, etc.). For example, the virtualized projection generation system may provide the virtualized surface data frame sequence for inclusion within virtual reality media content for a media player device (e.g., virtual reality media content configured to be streamed by way of a virtual reality media provider pipeline to a media player device associated with a user experiencing the virtual reality media content).

The systems and methods described herein for generating a virtualized projection of a customized view of a real-world scene for inclusion within virtual reality media content may provide various advantages and benefits. As one example, the systems and methods described herein may allow data representative of virtualized projections (e.g., virtualized surface data frame sequences, etc.) of customized views of a real-world scene to be generated based on arbitrary capture parameters (e.g., arbitrary capture locations, orientations, fields of view, depth mappings, quality levels, sources, dynamic ranges, etc.). As such, virtualized surface data frame sequences may be generated and provided alongside captured surface data frame sequences to cover a robust set of views of the real-world scene as may serve a particular implementation. For example, rather than attempting to position a large number of physical capture devices at different locations with respect to the real-world scene (e.g., to provide various levels of detail of different objects with various different bit depths, etc.), the methods and systems described herein may allow a relatively small number of physical capture devices to capture data from which a large number of virtualized surface data frame sequences may be generated to represent customized views of the real-world scene.

Additionally, by generating capture data (e.g., surface data frame sequences) associated with various different sets of capture parameters (e.g., parameters representative of various different vantage points, various different capture resolutions, etc.), the systems and methods described herein may facilitate practical and efficient distribution of data captured by the physical capture devices and included within virtual reality media content provided to end users. For example, high-resolution data captured by eight capture devices disposed around a real-world scene may be used not only to generate eight high-resolution captured surface data frame sequences depicting the real-world scene from the respective views of the eight capture devices, but also to generate a relatively large number (e.g., three hundred) of lower-resolution virtualized surface data frame sequences associated with various customized views that differ from (e.g., are unaligned with) the views of the capture devices.

A virtual reality media content provider system that receives the virtualized surface data frame sequences may benefit from increased flexibility in what data the system provides to (i.e., includes within virtual reality media content provided to) particular media player devices at particular times. As such, the virtual reality media content provider may benefit from an increased ability to optimize the data provided to the media player devices, such as by not sending large amounts of data that is relatively irrelevant to a media player device (e.g., based on various aspects of the specific virtual reality experience the media player device is providing to a user), and by providing data using bit depths optimized in the depth representations of surfaces to optimize depth precision and/or depth resolution, as will be described in more detail below.

As an example, rather than distributing all of the high-resolution data captured by the eight physical capture devices to every media player device (which may be impractical or impossible due to the large amount of data), customized data may be distributed more selectively and flexibly. Specifically, for example, data customized for a first media player device (e.g., data representative of a few virtualized surface data frame sequences selected from a robust set of virtualized surface data frame sequences) may be distributed to the first media player device to provide a high level of detail of one part of the real-world scene relevant to a user of the first media player device, while data customized for a second media player device (e.g., data representative of a few different virtualized surface data frame sequences) may be distributed to the second media player device to provide a high level of detail of another part of the real-world scene relevant to a user of the second media player device. As such, a virtual reality media content provider system may provide both the first and second media player devices with the data that is relevant to their respective users (e.g., localized data customized for the respective part of the real-world scene that each user is experiencing), while neither media player device (nor any distribution channel used to communicate with the media player devices) is burdened with excessive amounts of redundant data or detailed data about parts of the real-world scene that are less relevant to the respective user. In this way, data distribution may be improved and made more efficient and effective by requiring less data to be distributed to client-side media player devices, even as user experiences improve through higher resolution and more realistic and immersive content. This improvement results from the customization of the virtual reality media content to dynamically include high-quality representations of only the most relevant parts of the real-world scene.
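
The selective distribution described above can be sketched as follows. This is a minimal illustration that assumes relevance is approximated by the distance between each view's capture location and the user's current virtual location. The function name, the identifiers, and the distance-based criterion are assumptions of this example. The passage itself does not prescribe how relevant sequences are chosen.

```python
import math
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def select_sequences(
    view_locations: Dict[str, Vec3],
    user_location: Vec3,
    k: int = 3,
) -> List[str]:
    """Return the ids of the k sequences whose views lie nearest to the user.

    view_locations maps a (hypothetical) sequence id to the capture location
    of the view that the sequence represents.
    """
    ranked = sorted(
        view_locations,
        # Ties in distance are broken by id so that the result is deterministic.
        key=lambda seq_id: (math.dist(view_locations[seq_id], user_location), seq_id),
    )
    return ranked[:k]
```

Under these assumptions, two media player devices whose users occupy different parts of the scene would receive different small subsets of the available sequences, rather than the full set.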

Various embodiments will now be described in more detail with reference to the figures. The disclosed methods and systems may provide one or more of the benefits mentioned above and/or various additional and/or alternative benefits that will be made apparent herein.

FIG. 1 illustrates an exemplary virtualized projection generation system 100 ("system 100") for generating a virtualized projection of a customized view of a real-world scene for inclusion within virtual reality media content. As shown, system 100 may include, without limitation, a communication facility 102, a surface data frame sequence management facility 104, a virtualized projection generation facility 106, and a storage facility 108 selectively and communicatively coupled to one another. It will be recognized that although facilities 102 through 108 are shown to be separate facilities in FIG. 1, facilities 102 through 108 may be combined into fewer facilities, such as into a single facility, or divided into more facilities as may serve a particular implementation. In some examples, each of facilities 102 through 108 may be distributed between multiple devices and/or multiple locations as may serve a particular implementation. Additionally, it will be understood that in certain implementations of system 100, certain facilities shown in FIG. 1 (and the functionality associated with such facilities) may be omitted from system 100. Each of facilities 102 through 108 will now be described in more detail with reference to certain other figures included herein.

Communication facility 102 may include one or more physical computing devices (e.g., hardware and/or software components such as processors, memories, communication interfaces, instructions stored in memory for execution by the processors, etc.) that perform various operations associated with transmitting and receiving data used and/or provided by system 100. For example, communication facility 102 may receive (or facilitate receiving) a plurality of captured surface data frame sequences, each including color and depth frames depicting a real-world scene in accordance with a respective set of capture parameters included in a plurality of sets of capture parameters associated with different views of the real-world scene.

Communication facility 102 may receive the plurality of captured surface data frame sequences in any way as may serve a particular implementation. For instance, in certain embodiments, each surface data frame sequence in the plurality of captured surface data frame sequences may be captured (e.g., generated) by a different capture device in a plurality of capture devices disposed at different locations with respect to the real-world scene so as to capture the different views of the real-world scene. As such, communication facility 102 may receive data (e.g., captured surface data frame sequences) directly from the plurality of capture devices by, for example, requesting and receiving data transmitted by the capture devices or otherwise accessing or acquiring the data from the capture devices. In other examples, one or more other systems (e.g., a real-world scene capture system) may intermediate between the capture devices and system 100 such that communication facility 102 may receive the captured surface data frame sequences by way of the one or more other systems.

Additionally or alternatively, communication facility 102 may provide data (e.g., virtualized surface data frame sequences or other data received and/or generated by system 100) to other server-side systems in a virtual reality media content provider pipeline and/or to client-side media player devices used by end users. As used herein, "server-side" may refer to the server side (e.g., the provider's side) of a server-client transaction, such as a transaction in which a content provider system provides content (e.g., virtual reality media content) to a client device used by an end user. For example, as will be described in more detail below, a virtual reality media content provider system may provide virtual reality media content to a media player device associated with a user. As such, server-side systems and components may refer to systems and components that are associated with (e.g., included within, implemented by, interoperating with, etc.) the content provider system to provide data (e.g., virtual reality media content) to the media player device (e.g., by way of a network). In contrast, "client-side" devices may be associated with the client device (e.g., the media player device) used by the user on the other side of the network, and may include devices that facilitate the client device in receiving the data from the content provider system (e.g., the media player device and/or other computer components operated by the user on the user's side of the network).

Communication facility 102 may be configured to communicate with server-side and/or client-side systems using any communication interfaces, protocols, and/or technologies as may serve a particular implementation. For example, communication facility 102 may be configured to communicate by way of one or more networks (e.g., wired or wireless local area networks, wide area networks, provider networks, the Internet, etc.), wired communication interfaces (e.g., Universal Serial Bus ("USB")), wireless communication interfaces, or any other suitable communication interfaces, protocols, and/or technologies.

Surface data frame sequence management facility 104 may include one or more physical computing components (e.g., hardware and/or software components separate from those of communication facility 102 or shared with communication facility 102) that perform various operations associated with organizing, synchronizing, maintaining, tracking, and/or otherwise managing surface data frame sequences that have been received or generated by system 100, along with the respective sets of capture parameters associated with those surface data frame sequences. For example, surface data frame sequence management facility 104 may maintain sets of capture parameters associated with captured surface data frame sequences (e.g., surface data frame sequences captured by capture devices and received by communication facility 102 as described above) and/or may identify (or facilitate identifying) one or more additional sets of capture parameters distinct from the sets of capture parameters associated with the captured surface data frame sequences. For instance, surface data frame sequence management facility 104 may identify one or more sets of capture parameters respectively associated with one or more customized views of the real-world scene that are distinct from the different views of the real-world scene captured by the plurality of capture devices. Surface data frame sequence management facility 104 may further perform other operations described herein and/or as may serve a particular implementation of system 100.

Virtualized projection generation facility 106 may include one or more physical computing components (e.g., hardware and/or software components separate from those of facilities 102 and/or 104 or shared with facilities 102 and/or 104) that perform various operations associated with preparing, forming, rendering, or otherwise generating virtualized projections of views (e.g., customized views) of the real-world scene and/or data associated therewith. For example, virtualized projection generation facility 106 may render color and depth frames for a virtualized projection of a customized view of the real-world scene. More specifically, virtualized projection generation facility 106 may render the color and depth frames based on at least one of the surface data frame sequences received by communication facility 102 and further based on an additional set of capture parameters identified by surface data frame sequence management facility 104. Virtualized projection generation facility 106 may also generate a virtualized surface data frame sequence based on (e.g., that includes) the rendered color and depth frames for the virtualized projection. Once the virtualized surface data frame sequence is generated, virtualized projection generation facility 106 may provide it, for inclusion within virtual reality media content, to a media player device associated with a user. Alternatively, as mentioned above, the virtualized surface data frame sequence may, in certain implementations, be provided for inclusion within the virtual reality media content by communication facility 102. Virtualized projection generation facility 106 may further perform other operations described herein and/or as may serve a particular implementation of system 100.

Storage facility 108 may store and/or maintain any suitable data received, generated, managed, tracked, maintained, used, and/or transmitted by facilities 102 through 106 in a particular implementation. For example, as shown, storage facility 108 may include surface data frame sequence data 110 and/or capture parameter data 112, which may be received, generated, managed, tracked, maintained, used, and/or transmitted (e.g., provided to other systems) in any of the ways described herein. Additionally, storage facility 108 may include other types of data used by particular implementations of system 100, such as instructions (e.g., programming instructions) for performing the operations described herein and/or other data used by facilities 102 through 106 to perform those operations. Storage facility 108 may be implemented in any of the ways described herein and may include hardware and/or software for any transitory or non-transitory mode of storing data, including, but not limited to, random access memory ("RAM"), non-transitory storage (e.g., disk storage, flash memory storage, etc.), and the like.

In some examples, system 100 may perform one or more of the operations described herein in real time as events occur within the real-world scene. Accordingly, in implementations where system 100 is used within a virtual reality media content provider pipeline in which other systems also operate in real time, virtual reality media content (e.g., virtual reality media content that includes virtualized surface data frame sequences generated by system 100 in real time) may be provided to media player devices. As a result, respective users of the media player devices, who may not be physically located near the real-world scene but who may wish to experience it (e.g., events occurring within it), may use their respective media player devices to virtually experience the real-world scene and the events occurring within it live (e.g., in real time as the events occur). Data processing and data distribution take a finite amount of time, so a user cannot experience the real-world scene at precisely the moment events occur within it. As used herein, however, an operation is considered to be performed in "real time" when it is performed immediately and without undue delay. Accordingly, a user may be said to experience the real-world scene in real time even if the user experiences particular events within the real-world scene after a delay (e.g., a few seconds or minutes after the occurrences actually take place).

As described above, in certain implementations, system 100 may generate data representative of a relatively large number of virtualized projections. Such data may provide flexibility as to how virtual reality media content (e.g., virtual reality media content that uses the data) is generated and distributed to client-side media player devices. For example, by generating data representative of a large number of localized virtualized projections of the real-world scene, detail relevant to one user's experience may be provided to the media player device associated with that user without being provided to the media player device of another user for whom that detail is less relevant.

In one particular implementation of system 100, for example, communication facility 102 may receive (e.g., in real time as events occur within a real-world scene) a plurality of captured surface data frame sequences. Each captured surface data frame sequence includes color and depth frames depicting the real-world scene in accordance with a respective set of capture parameters, and each such set is included in a first plurality of sets of capture parameters associated with different views of the real-world scene. As described above, each surface data frame sequence in the plurality of captured surface data frame sequences may be captured by a different capture device in a plurality of capture devices disposed at different locations with respect to the real-world scene so as to capture the different views of the real-world scene. In addition to the receiving of the captured surface data frame sequences, surface data frame sequence management facility 104 may identify a second plurality of sets of capture parameters distinct from the sets of capture parameters included in the first plurality. For instance, each set of capture parameters in the second plurality may be associated with a respective customized view of the real-world scene that is distinct from the different views captured by the plurality of capture devices. The second plurality may include a relatively large number of sets of capture parameters (e.g., a greater number of sets than is included in the first plurality). This identifying operation may also be performed in real time as the events occur within the real-world scene.
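
The relationship between the first and second pluralities of capture parameter sets can be sketched in a few lines. The following is a minimal illustration only, assuming that each set of capture parameters is reduced to a single angle (in whole degrees) around the scene; the function name `customized_view_angles` and the 15-degree step are hypothetical and do not come from the specification.

```python
def customized_view_angles(captured_deg, step_deg):
    """Return candidate angles for customized views.

    Assumption: a view is identified only by its angle around the scene.
    Every angle on a regular grid is kept unless a capture device already
    sits there, so the result is distinct from the captured views.
    """
    captured = {angle % 360 for angle in captured_deg}
    return [angle for angle in range(0, 360, step_deg) if angle not in captured]


# First plurality: eight captured views spaced 45 degrees apart (as in FIG. 2).
captured_views = list(range(0, 360, 45))

# Second plurality: customized views on a finer grid, excluding captured ones.
customized_views = customized_view_angles(captured_views, step_deg=15)
```

With these assumed inputs the second plurality is both distinct from and larger than the first, which mirrors the example given in the paragraph above.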

In response to the identifying of the second plurality of sets of capture parameters, virtualized projection generation facility 106 may render color and depth frames for virtualized projections of each respective customized view of the real-world scene, based on the plurality of captured surface data frame sequences and on the second plurality of sets of capture parameters. In some examples, virtualized projection generation facility 106 may package the rendered color and depth frames to be included within respective virtualized surface data frame sequences, which may be transmitted by way of one or more transport streams or the like, as will be described in more detail below. This rendering and/or data packaging may also be performed in real time as the events occur within the real-world scene. As such, communication facility 102 may provide (e.g., in real time as the events occur within the real-world scene) a plurality of virtualized surface data frame sequences that include the rendered color and depth frames for the virtualized projections of each respective customized view of the real-world scene. For example, the plurality of virtualized surface data frame sequences may be provided for inclusion within virtual reality media content to a media player device (e.g., virtual reality media content configured to be streamed in real time, by way of a virtual reality media provider pipeline, to a media player device associated with a user experiencing the virtual reality media content).
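
As a rough illustration of the packaging step, the sketch below interleaves frames from several virtualized surface data frame sequences into one time-ordered list of packets. It is a toy model under stated assumptions, not an implementation of any real transport stream format: the function `multiplex`, the packet layout `(timestamp, view_id, payload)`, and the string payloads are all hypothetical.

```python
import heapq


def multiplex(sequences):
    """Interleave frames from several sequences into one time-ordered list.

    `sequences` maps a view identifier to a list of (timestamp, payload)
    pairs that is already sorted by timestamp. Each output packet is
    (timestamp, view_id, payload).
    """
    tagged = [
        [(timestamp, view_id, payload) for timestamp, payload in frames]
        for view_id, frames in sequences.items()
    ]
    # heapq.merge lazily merges already-sorted inputs; the key avoids
    # comparing payloads when two packets share a timestamp.
    return list(heapq.merge(*tagged, key=lambda packet: (packet[0], packet[1])))


packets = multiplex({
    "custom-a": [(0.0, "a0"), (1.0, "a1")],
    "custom-b": [(0.5, "b0")],
})
```

A receiver could route each packet back to its sequence using the view identifier, which is one simple way several sequences might share a single stream.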

Data representative of a real-world scene (e.g., surface data frame sequences received by system 100) may be captured by any suitable systems and/or devices arranged in any suitable configuration as may serve a particular implementation. For example, as described above, each surface data frame sequence in the plurality of captured surface data frame sequences received by system 100 may be captured by a different capture device in a plurality of capture devices disposed at different locations with respect to the real-world scene so as to capture different views of the real-world scene.

To illustrate, FIG. 2 shows an exemplary configuration 200 in which data representative of an exemplary real-world scene is captured from different views of the real-world scene. Specifically, as shown in configuration 200, a real-world scene 202 that includes a real-world object 204 may be surrounded by a plurality of views 206 (e.g., views 206-1 through 206-8) of real-world scene 202.

Real-world scene 202 may represent any real-world scenery, real-world location, real-world event (e.g., live event, etc.), or other subject existing in the real world (e.g., as opposed to existing only in a virtual or imaginary world) as may serve a particular implementation. As illustrated by the circle representing real-world scene 202 in FIG. 2, real-world scene 202 may be a specifically delineated area such as a stage, an arena, or the like. Conversely, in other examples, real-world scene 202 may not be so well defined or delineated. For example, real-world scene 202 may include any indoor or outdoor real-world location such as a city street, a museum, a scenic landscape, or the like. In certain examples, real-world scene 202 may be associated with a real-world event such as a sporting event, a musical event, a dramatic or theatrical presentation, a large-scale celebration (e.g., New Year's Eve in Times Square, Mardi Gras, etc.), a political event, or any other real-world event. In the same or other examples, real-world scene 202 may be associated with a setting for a fictionalized scene (e.g., a set of a live-action virtual reality television show or movie) and/or any other scene at any other indoor or outdoor real-world location as may serve a particular implementation.

Accordingly, real-world object 204 may represent any real-world object, whether living or inanimate, that is associated with real-world scene 202 (e.g., located within or around real-world scene 202) and that is detectable (e.g., viewable, etc.) from at least one of views 206. While real-world object 204 is drawn as a relatively simple geometric shape for the sake of clarity, it will be understood that real-world object 204 may represent various types of objects having various levels of complexity. Rather than a geometric shape, for instance, real-world object 204 could represent any animate or inanimate object or surface, such as a person or another living thing, a non-transparent solid, liquid, or gas, a less discrete object such as a wall, a ceiling, or a floor, or any other type of object described herein or as may serve a particular implementation.

Real-world object 204 may include various surfaces that may each reflect light (e.g., ambient light in real-world scene 202, infrared light in a structured light pattern emitted by a depth capture device, etc.) to be detected by capture devices disposed at different locations with respect to real-world scene 202 so as to capture real-world scene 202 from views 206. While real-world object 204 is depicted to be relatively simple, the depth and/or appearance of its surfaces may differ depending on which view 206 of real-world scene 202 the surfaces are detected from, as will be illustrated below. In other words, real-world object 204 may look different based on the perspective (e.g., position, vantage point, etc.) from which it is viewed.

As mentioned above, views 206 of real-world scene 202 may provide different perspectives, vantage points, etc., from which real-world scene 202 (e.g., including real-world object 204) may be viewed. As will be described below, using color and depth data of real-world scene 202 captured from various different views 206 (e.g., views 206 that surround real-world scene 202 so as to capture it from various perspectives), system 100 may generate a virtualized projection of any arbitrary view of real-world scene 202. In other words, using color and depth data captured from one or more of views 206, system 100 may render color and depth data for a virtualized projection of a customized view of real-world scene 202 (e.g., an arbitrary view of real-world scene 202 from a location, orientation, etc., distinct from views 206).
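
The specification does not say how the color and depth data for a customized view are rendered. One common way to move a captured sample into a different view is to unproject a pixel using its depth, then project the resulting 3D point into the target view. The sketch below shows that idea for a single pixel. It is an assumption-laden illustration: the pinhole camera model, the pose format `(x, y, z, yaw_rad)`, and the function `reproject_pixel` are hypothetical, and a real renderer would also need occlusion handling, hole filling, and blending of several captured views.

```python
import math


def reproject_pixel(u, v, depth, focal_px, center, source_pose, target_pose):
    """Map one source pixel with known depth into the target view.

    Poses are (x, y, z, yaw_rad): a camera position in the scene plus a
    rotation about the vertical axis. Cameras look along +z, with x to the
    right and y down. Returns (u, v, depth) in the target view, or None
    when the point lies behind the target camera.
    """
    cx, cy = center
    # 1. Unproject: the pixel plus its depth gives a point in source-camera space.
    xc = (u - cx) * depth / focal_px
    yc = (v - cy) * depth / focal_px
    zc = depth
    # 2. Source-camera space to scene space (rotate by yaw, then translate).
    sx, sy, sz, syaw = source_pose
    xw = math.cos(syaw) * xc + math.sin(syaw) * zc + sx
    yw = yc + sy
    zw = -math.sin(syaw) * xc + math.cos(syaw) * zc + sz
    # 3. Scene space to target-camera space (inverse of the target pose).
    tx, ty, tz, tyaw = target_pose
    dx, dy, dz = xw - tx, yw - ty, zw - tz
    xt = math.cos(tyaw) * dx - math.sin(tyaw) * dz
    yt = dy
    zt = math.sin(tyaw) * dx + math.cos(tyaw) * dz
    if zt <= 0:
        return None
    # 4. Project the point into the target image.
    return (focal_px * xt / zt + cx, focal_px * yt / zt + cy, zt)


# A point 4 m straight ahead of the source camera, seen from 1 m to its right.
moved = reproject_pixel(320, 240, 4.0, 500.0, (320, 240),
                        source_pose=(0.0, 0.0, 0.0, 0.0),
                        target_pose=(1.0, 0.0, 0.0, 0.0))
```

Repeating this for every pixel of a captured color and depth frame yields a sparse color and depth image for the customized view, which is why depth data is captured alongside color in the first place.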

Views 206 may each be fixed with respect to real-world scene 202. For example, both real-world scene 202 and views 206 may be stationary, or real-world scene 202 and views 206 may be in motion together. In some examples, such as shown in configuration 200, views 206 may surround real-world scene 202 along at least two dimensions associated with real-world scene 202 (e.g., along a plane such as the ground). In certain examples, views 206 may surround real-world scene 202 along three dimensions (e.g., by also including views 206 above and below real-world scene 202).

As illustrated by the different positions surrounding real-world scene 202 at which views 206 are disposed, each view 206 may be associated with a particular location with respect to real-world scene 202. Additionally, views 206 may further be associated with other aspects of how real-world scene 202 is to be captured. For example, as illustrated by the dotted lines emanating from each view 206, views 206 may be associated with particular capture orientations (e.g., particular directions that the capture devices corresponding to views 206 are facing), particular fields of view of capture (e.g., areas of real-world scene 202 that are captured by the capture devices based on, for example, how narrow- or wide-angle the lenses of the capture devices are, the zoom level of the capture devices, etc.), and the like. Each view 206 may further be associated with aspects of capture that are not explicitly illustrated in FIG. 2. For instance, each view 206 may be associated with a particular quality level (e.g., image resolution, frame rate, etc.) at which data is captured by the capture device associated with the view 206, a particular format with which the captured data is to be encoded, and/or any other aspects of data capture as may serve a particular implementation.
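
The aspects of a view listed above (location, orientation, field of view, quality level, and encoding format) map naturally onto a small record type. The following sketch shows one possible shape for a set of capture parameters. All field names, units, and example values are hypothetical; the specification names the aspects but does not define a data layout.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CaptureParameters:
    """One set of capture parameters describing a view (hypothetical fields)."""
    location: Tuple[float, float, float]   # position relative to the scene, in metres
    orientation_deg: Tuple[float, float]   # (yaw, pitch) the capture device faces
    field_of_view_deg: float               # horizontal field of view
    resolution: Tuple[int, int]            # image size as (width, height) in pixels
    frame_rate: float                      # frames per second
    encoding: str                          # format in which captured data is encoded


# Example values for a view such as view 206-1; the numbers are illustrative only.
view_206_1 = CaptureParameters(
    location=(10.0, 0.0, 1.5),
    orientation_deg=(180.0, 0.0),
    field_of_view_deg=90.0,
    resolution=(1920, 1080),
    frame_rate=30.0,
    encoding="h264",
)
```

Making the record immutable and comparable by value means two sets of capture parameters can be checked for equality directly, which is convenient when a customized view must be distinct from every captured view.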

In some examples, as shown in configuration 200, the vantage points (e.g., orientations, fields of view, etc.) associated with each view 206 may be angled inwardly toward real-world scene 202 so as to capture real-world scene 202 from enough perspectives to be able to later recreate real-world scene 202 from customized views that may not align with views 206. Additionally, in the same or other examples, one or more of the vantage points associated with views 206 may be angled outwardly (i.e., away from real-world scene 202) to capture data representative of objects surrounding real-world scene 202 or the like. For instance, a 360-degree capture device with a spherical, outward-facing vantage point may be placed at a position within real-world scene 202 (not explicitly shown) to capture objects included within real-world scene 202 from additional perspectives and/or to capture devices outside of real-world scene 202. Additionally or alternatively, in certain examples, a plurality of outward-facing views may allow for capture of a panoramic, wide-angle, or 360-degree view of the real-world scene.
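
The inward-facing arrangement of configuration 200 can be sketched geometrically. The example below places a number of views evenly on a circle around the scene and computes, for each, the yaw that points back at the center. It assumes a flat, two-dimensional layout with the scene centered at the origin; the function `ring_of_views` and the 10-metre radius are hypothetical.

```python
import math


def ring_of_views(count, radius):
    """Return (x, y, yaw_deg) for `count` views evenly spaced on a circle.

    Each view faces the centre of the circle, so its yaw is the direction
    from its own position back toward the origin.
    """
    views = []
    for index in range(count):
        angle = 2 * math.pi * index / count
        x = radius * math.cos(angle)
        y = radius * math.sin(angle)
        # Facing inward means pointing from (x, y) toward (0, 0).
        yaw_deg = math.degrees(math.atan2(-y, -x)) % 360.0
        views.append((x, y, yaw_deg))
    return views


# Eight views, as with views 206-1 through 206-8 in FIG. 2.
views = ring_of_views(count=8, radius=10.0)
```

An outward-facing vantage point would simply add 180 degrees to each yaw.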

In order to capture real-world scene 202 from the perspective of each view 206, a different capture device in a plurality of capture devices may be disposed at each different location of views 206. To illustrate, FIG. 3A shows an exemplary capture device 302 that captures color and depth frames for inclusion within a surface data frame sequence representative of real-world scene 202.

As shown in FIG. 3A, capture device 302 may be associated with view 206-1 and, as such, may be disposed at the location corresponding to view 206-1 with respect to real-world scene 202 and real-world object 204. FIG. 3A illustrates that capture device 302 may include a two-dimensional ("2D") color capture device 304 configured to capture color data (e.g., 2D video data representative of full color or grayscale images) representative of real-world scene 202 (e.g., including real-world object 204 and/or other objects included therein), and a depth capture device 306 configured to capture depth data representative of real-world scene 202.

2D color capture device 304 may be implemented by any suitable 2D color capture device (e.g., a camera, a video camera, etc.) and may capture 2D color data in any manner as may serve a particular implementation. In some examples, 2D color capture device 304 may be a separate device from depth capture device 306. Collectively, such separate devices (e.g., along with any communication interfaces and/or other hardware or software mechanisms used to functionally merge the devices) may be referred to as a capture device (e.g., capture device 302). In other examples, as shown in FIG. 3A, 2D color capture device 304 and depth capture device 306 may be integrated into a single device (i.e., capture device 302) that captures both color data and depth data, as will be described below.

Whether implemented as a separate device or integrated with 2D color capture device 304, depth capture device 306 may capture depth data representative of real-world scene 202 in any manner as may serve a particular implementation. For instance, depth capture device 306 may employ one or more depth map capture techniques such as a structured light depth map capture technique, a stereoscopic depth map capture technique, a time-of-flight depth map capture technique, another suitable depth map capture technique, or any combination of depth map capture techniques as may serve a particular implementation.
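
Two of the depth map capture techniques named above rest on simple, well-known relations, which the sketch below writes out. For a rectified stereo pair, depth equals focal length times baseline divided by disparity; for time of flight, depth is half the distance light travels during the measured round trip. The specification does not give these formulas, and the function and parameter names here are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def stereo_depth_m(focal_length_px, baseline_m, disparity_px):
    """Depth from a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


def time_of_flight_depth_m(round_trip_s):
    """Depth from a time-of-flight measurement: half the round-trip distance."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


# Example inputs (illustrative values only).
stereo_example = stereo_depth_m(focal_length_px=1000.0, baseline_m=0.1, disparity_px=20.0)
tof_example = time_of_flight_depth_m(round_trip_s=2e-8)
```

Either function produces one depth value per measurement; applying it per pixel gives the kind of depth frame that accompanies each color frame.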

Regardless of the type and number of depth map capture techniques used to capture depth data, capture device 302 may capture, from view 206-1, both color data (e.g., color frames) and depth data (e.g., depth frames) representative of the surfaces of real-world object 204 and/or other objects included within real-world scene 202. As used herein, a color frame and a depth frame that are captured at approximately the same time by capture device 302 may be collectively referred to as a "surface data frame" or a "color and depth frame," because the data included in these frames describes the surfaces (i.e., both the visible appearance of the surfaces and the depth geometries of the surfaces) of real-world objects included in a real-world scene.

Accordingly, as used herein, a surface data frame or a color and depth frame may refer to a dataset that represents various types of data associated with the surfaces of real-world objects that are visible within a real-world scene from a particular view of the real-world scene at a particular point in time. For example, a surface data frame may include color data (i.e., image data) as well as depth data representative of the objects as viewed from a particular view with respect to the real-world scene. As such, a plurality of related surface data frames may be sequenced together to create a video-like representation (representing not only color data but also depth data) of the real-world scene as viewed from the particular view. In certain examples, a surface data frame may further be associated with other types of data such as audio data, metadata (e.g., metadata including a set of capture parameters describing the view from which the surface data frame is captured, information about specific real-world objects represented in the surface data frame, etc.), and/or other types of data as may serve a particular implementation. As will be described and illustrated below, such a sequence of surface data frames may be referred to herein as a "surface data frame sequence."

As used herein, "color data" may broadly include any image data, video data, or the like, whether represented in color or in grayscale (i.e., "black and white"), that represents how a subject (e.g., a real-world object included within a real-world scene) appears at a particular point in time or over a particular time period from the perspective of a particular view. Color data is not limited to any particular format, file type, frame rate, resolution, quality level, or other characteristic that may be associated with the various definitions and/or standards defining image data and/or video data in the art. Similarly, as used herein, "depth data" may include any data representative of the position and/or geometry of a subject in space. For example, depth data representative of a real-world object may include coordinates, with respect to a coordinate system (e.g., a coordinate system associated with a particular capture device, a global coordinate system associated with the real-world scene, etc.), of different points on the surfaces of the real-world object.

As with capture device 302, which captures color and depth frames from view 206-1, it will be understood that other capture devices may be associated with other views 206 (e.g., views 206-2 through 206-8 in FIG. 2) and may likewise capture color and depth frames from the respective vantage points associated with the other views 206. In some examples, surface data frames may be captured by the different capture devices associated with the different views 206 at a same particular point in time so as to be synchronous with one another. As used herein, surface data frames may be said to be captured "at a same particular point in time" when the surface data frames are captured close enough in time to effectively represent a subject (e.g., a real-world object within a real-world scene) at a moment in time (i.e., as opposed to representing the subject over a range of time), even if the surface data frames are not captured at precisely the same instant. For instance, depending on how dynamic a particular subject is (e.g., how fast one or more real-world objects move through a real-world scene, etc.), surface data frames may be considered to be captured at the same particular point in time when captured within, for example, several tens or hundreds of milliseconds of one another, or within another suitable timeframe (e.g., within microseconds, milliseconds, seconds, etc.) as may serve a particular implementation. As such, each surface data frame may represent color data and depth data of the surfaces of a real-world object included within the real-world scene as those surfaces appear, at the particular point in time, from the respective vantage point of the view 206 with which the respective capture device is associated.
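The notion of frames being captured "at a same particular point in time" can be illustrated with a minimal sketch. The names `FrameStamp` and `is_same_time_point`, and the default tolerance of 50 milliseconds, are hypothetical choices made for illustration; the description leaves the acceptable timeframe to the particular implementation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FrameStamp:
    """Hypothetical record pairing a view with the capture time of one frame."""
    view_id: str         # e.g. "206-1"
    timestamp_ms: float  # capture time in milliseconds


def is_same_time_point(frames: Iterable[FrameStamp], tolerance_ms: float = 50.0) -> bool:
    """Return True if all frames were captured within tolerance_ms of one another.

    The tolerance is an assumed, implementation-specific value: a fast-moving
    subject may call for a much smaller tolerance than a static one.
    """
    times = [frame.timestamp_ms for frame in frames]
    if not times:
        return False
    return max(times) - min(times) <= tolerance_ms


frames = [
    FrameStamp("206-1", 1000.0),
    FrameStamp("206-2", 1012.5),
    FrameStamp("206-3", 1031.0),
]
```

With these sample timestamps the frames span 31 milliseconds, so they count as synchronous under the assumed 50 millisecond tolerance but not under a 10 millisecond tolerance.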

FIGS. 3B-3C illustrate exemplary graphical depictions of data captured by capture device 302 and included within color and depth frames (i.e., within a surface data frame). Specifically, as shown, a color frame incorporated into the surface data frame may include color data 308 (shown in FIG. 3B), while a depth frame incorporated into the surface data frame may include depth data 310 (shown in FIG. 3C).

In FIG. 3B, color data 308 depicts real-world scene 202 (e.g., including real-world object 204) as viewed from view 206-1 by 2D color capture device 304 within capture device 302. Because color data 308 may represent a single video frame in a sequence of video frames, the depiction of real-world object 204 represented by color data 308 may represent how real-world object 204 (e.g., as well as other objects associated with real-world scene 202) appears from the vantage point of view 206-1 at a particular point in time. While illustrated as an image in FIG. 3B, it will be understood that color data 308 may be captured, encoded, formatted, transmitted, and represented in any suitable form. For example, color data 308 may be digital data that is formatted according to a standard video encoding protocol, a standard image format, or the like. In some examples, color data 308 may represent a color image (e.g., similar to a color photograph) of the objects in real-world scene 202. Alternatively, in other examples, color data 308 may be a grayscale image (e.g., similar to a black and white photograph) representative of the objects.

In FIG. 3C, depth data 310 also (like color data 308) depicts real-world scene 202 (including real-world object 204) from the perspective of view 206-1. However, rather than representing the visible appearance of the objects within real-world scene 202 (e.g., representing in color or grayscale how light interacts with the surfaces of real-world object 204), depth data 310 may represent the depth (i.e., the distance or position) of each point on the surfaces of the objects (e.g., real-world object 204 as well as other objects within real-world scene 202) relative to, for example, depth capture device 306 within capture device 302. As with color data 308, depth data 310 may be captured, encoded, formatted, transmitted, and represented in any suitable form. For example, as shown, depth data 310 may be represented by grayscale image data (e.g., six or eight bits for each pixel captured by depth capture device 306). However, rather than representing how light reflects from the surfaces of real-world object 204 (i.e., as represented in color data 308), the grayscale image of depth data 310 may represent, for each pixel in the image, how far the point represented by that pixel is from depth capture device 306. For example, points that are closer to depth capture device 306 may be represented with values that represent darker shades of gray (e.g., binary values closer to 0b111111 in the case of a six-bit implementation where 0b111111 represents black). Conversely, points that are farther from depth capture device 306 may be represented with values that represent lighter shades of gray (e.g., binary values closer to 0b000000 in the case of a six-bit implementation where 0b000000 represents white).
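The grayscale depth encoding described above can be sketched as follows. The function name `encode_depth`, the `near` and `far` range limits, and the linear mapping between them are assumptions made for illustration only; the description states only that closer points map toward 0b111111 (black) and farther points toward 0b000000 (white) in a six-bit implementation.

```python
def encode_depth(distance: float, near: float, far: float, bits: int = 6) -> int:
    """Quantize a distance into an n-bit depth code.

    Points at or nearer than `near` map to the maximum code (0b111111 for six
    bits, treated as black); points at or beyond `far` map to 0 (white).
    The linear mapping in between is an assumption, not taken from the source.
    """
    if far <= near:
        raise ValueError("far must be greater than near")
    max_code = (1 << bits) - 1
    clamped = min(max(distance, near), far)
    closeness = (far - clamped) / (far - near)  # 1.0 at near, 0.0 at far
    return round(closeness * max_code)


# A point at the near limit receives the darkest code, one at the far limit the lightest.
nearest_code = encode_depth(1.0, near=1.0, far=10.0)
farthest_code = encode_depth(10.0, near=1.0, far=10.0)
```

Passing `bits=8` gives the eight-bit variant mentioned in the text, with codes ranging from 0 to 255.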

In certain examples, system 100 (e.g., communication facility 102) may be communicatively coupled to capture device 302 and to other capture devices associated with other views 206 by way of one or more networks and/or any other suitable communication interfaces, protocols, and technologies. Accordingly, in these examples, communication facility 102 may receive the captured surface data frame sequences directly from the capture devices by way of the one or more networks and/or other communication interfaces, protocols, and technologies. In other examples, a real-world scene capture system separate from system 100 may be communicatively coupled with each of the capture devices and may be configured to manage the capture of surface data frames by each of the capture devices and to provide the surface data frame sequences to system 100 (e.g., after synchronizing and/or otherwise processing the surface data frame sequences). Regardless, communications between the capture devices, system 100, and/or an intermediate real-world scene capture system may be implemented by way of a network (e.g., a wired or wireless local area network, a wide area network, a provider network, the Internet, etc.), by way of a wired communication interface (e.g., Universal Serial Bus ("USB")), by way of a wireless communication interface, or by way of any other communication interface, protocol, and/or technology as may serve a particular implementation.

In other examples, the plurality of capture devices may be integrated within or otherwise included as part of system 100 (e.g., as part of surface data frame sequence management facility 104 or another facility of system 100). As such, in these examples, surface data frame sequence management facility 104 may receive the surface data frame sequences by capturing them using the integrated capture devices.

FIGS. 4A-4B illustrate an exemplary surface data frame sequence 400-1 representative of real-world scene 202 (e.g., from the perspective of view 206-1) as generated by capture device 302. Specifically, FIG. 4A shows a detailed graphical view of surface data frame sequence 400-1 depicting certain specific data that may be included in surface data frame sequence 400-1, while FIG. 4B shows a consolidated graphical view of surface data frame sequence 400-1 that does not specifically depict many details of the content of surface data frame sequence 400-1.

As shown in FIG. 4A, surface data frame sequence 400-1 may include various types of data including color data, depth data, and metadata. Specifically, surface data frame sequence 400-1 is shown to include a color frame sequence 402, a depth frame sequence 404, and a set of capture parameters 406. It will be understood that surface data frame sequence 400-1 may further include other types of data (e.g., captured audio data, other metadata besides the set of capture parameters 406, etc.) not explicitly shown in FIG. 4A. Additionally, it will be understood that the data included within surface data frame sequence 400-1 may be arranged or formatted in any suitable way. For example, as shown, the data included within surface data frame sequence 400-1 may be arranged as one color frame sequence and one depth frame sequence. In other examples, a single capture device may output multiple color frame sequences and/or multiple depth frame sequences (e.g., to cover different parts of the field of view of the real-world scene being captured). In yet other examples, the data of surface data frame sequence 400-1 may be arranged as a sequence of integrated surface data frames, each including a particular color frame, a particular depth frame, and certain metadata (e.g., data representative of the set of capture parameters 406), or may be arranged in other ways as may serve a particular implementation.
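Two of the arrangements described above (separate color and depth frame sequences versus a sequence of integrated surface data frames) can be sketched as a simple data structure. The class `SurfaceDataFrame`, the function `integrate`, and the use of nested lists for image data and a dictionary for metadata are hypothetical choices for illustration, not a format defined by the description.

```python
from dataclasses import dataclass
from typing import Dict, List

Image = List[List[int]]  # rows of pixel values; a stand-in for real image data


@dataclass
class SurfaceDataFrame:
    """One integrated frame: a color frame, a depth frame, and its metadata."""
    color: Image
    depth: Image
    metadata: Dict[str, object]


def integrate(color_seq: List[Image], depth_seq: List[Image],
              capture_parameters: Dict[str, object]) -> List[SurfaceDataFrame]:
    """Pair up separately arranged color and depth sequences, frame by frame.

    Each integrated frame receives its own copy of the capture parameters, so
    the parameters are repeated for every frame in the sequence.
    """
    if len(color_seq) != len(depth_seq):
        raise ValueError("color and depth sequences must have the same length")
    return [
        SurfaceDataFrame(color, depth, dict(capture_parameters))
        for color, depth in zip(color_seq, depth_seq)
    ]


integrated = integrate(
    color_seq=[[[1]], [[2]]],
    depth_seq=[[[10]], [[20]]],
    capture_parameters={"view": "206-1"},
)
```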

The data included within each color frame of color frame sequence 402 may be similar to color data 308, described above in relation to FIG. 3. However, each color frame within color frame sequence 402 may be captured at a slightly different time such that color frame sequence 402 may form a video-like representation of real-world scene 202 from view 206-1. Similarly, the data included within each depth frame of depth frame sequence 404 may be similar to depth data 310, except that each depth frame within depth frame sequence 404 may be captured at a slightly different time (e.g., a time synchronous with the time at which the color frames of color frame sequence 402 are captured) such that depth frame sequence 404 may form another video-like representation of real-world scene 202 from view 206-1.

The set of capture parameters 406 included within surface data frame sequence 400-1 may include metadata describing the view from which surface data frame sequence 400-1 is captured (i.e., in this case, view 206-1). For example, the set of capture parameters 406 may include various parameters indicating various aspects of where and/or how the surface data frames included within surface data frame sequence 400-1 are captured. The capture parameters included within the set of capture parameters 406 may include any suitable capture parameters associated with the respective view of the real-world scene as may serve a particular implementation.

For example, the set of capture parameters 406 may include a capture parameter representative of a location, with respect to real-world scene 202, from which the color and depth frames corresponding to view 206-1 of real-world scene 202 are captured. As another example, the set of capture parameters 406 may include a capture parameter representative of an orientation (e.g., a capture orientation associated with different angles, in different dimensions, at which the capture device is pointing) from which the color and depth frames corresponding to view 206-1 of real-world scene 202 are captured. Similarly, as another example, the set of capture parameters 406 may include a capture parameter representative of a field of view with which the color and depth frames corresponding to view 206-1 of real-world scene 202 are captured. Additionally, as yet another example, the set of capture parameters 406 may include a capture parameter representative of an image quality with which the color and depth frames corresponding to view 206-1 of real-world scene 202 are captured. In still other examples, the set of capture parameters 406 may include any other suitable capture parameters representative of other aspects by which the color and depth frames corresponding to view 206-1 of real-world scene 202 may be captured. For instance, the set of capture parameters 406 may include parameters representative of a depth mapping and/or a depth range by which the depth frames corresponding to view 206-1 are captured; parameters representative of a particular encoding, format, frame rate, dynamic range, or the like with which the color and depth frames corresponding to view 206-1 are captured; a source of the capture (e.g., identification information for the capture device that captures the color and depth frames corresponding to view 206-1); or other suitable parameters.

The set of capture parameters 406 may be represented by, and integrated with, the other data included within surface data frame sequence 400-1 in any manner as may serve a particular implementation. For example, while in some implementations capture parameters 406 may be represented explicitly in data (e.g., variables, etc.) representative of the capture parameters, in other implementations capture parameters 406 may be represented implicitly by the format in which orientation, location, and/or projection information is represented (e.g., field of view and depth mappings for frame sequences; left/right/top/bottom/near/far for orthographic frame sequences; etc.). Data representative of certain capture parameters 406 may be combined, for example, into a single abstract matrix (e.g., a 4×4 matrix) that represents a complete transformation from a particular image space to homogeneous coordinates in world space. As such, in these examples, individual components may not be specified explicitly but may rather be included within a more generalized transformation.
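The idea of folding several capture parameters into a single 4×4 matrix can be illustrated with a small sketch in plain Python. Only a location (translation) and one orientation angle (rotation about the vertical axis) are combined here; this is an assumed simplification, and a full image-space-to-world transformation would also fold in projection terms. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(tx: float, ty: float, tz: float) -> Matrix:
    """Matrix encoding the capture location."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_y(degrees: float) -> Matrix:
    """Matrix encoding a capture orientation as a rotation about the y axis."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, 0.0, s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-s, 0.0, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def apply(m: Matrix, point: Tuple[float, float, float]) -> Tuple[float, float, float]:
    """Transform a 3D point through homogeneous coordinates."""
    v = [point[0], point[1], point[2], 1.0]
    out = [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
    return (out[0] / out[3], out[1] / out[3], out[2] / out[3])


# Location and orientation are no longer stored separately: only the product is kept.
device_to_world = mat_mul(translation(2.0, 0.0, 5.0), rotation_y(90.0))
world_point = apply(device_to_world, (0.0, 0.0, 1.0))
```

In this sketch a point one unit in front of the device lands at approximately (3, 0, 5) in world space, and neither the location nor the angle can be read off directly without decomposing the matrix, which mirrors the remark that individual components need not be specified explicitly.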

Furthermore, in some examples, the set of capture parameters 406 may be integrated with (e.g., repeated for) each color frame and/or depth frame included, respectively, within color frame sequence 402 and depth frame sequence 404. In other examples, the set of capture parameters 406 may be integrated with each individual surface data frame (e.g., each combination of a color frame and a depth frame). In these ways, the set of capture parameters 406 may flexibly describe the capture parameters for each and every frame, even if view 206 changes dynamically during the time period represented by surface data frame sequence 400-1. In other examples, the set of capture parameters 406 may be static throughout the time period represented by surface data frame sequence 400-1. In these examples, the set of capture parameters 406 may be transmitted separately from the frames of frame sequences 402 and 404. For example, the set of capture parameters 406 may be transmitted separately from the transmission of the color and depth frames (e.g., prior to the transmission of the color and depth frames, at the start of the transmission of the color and depth frames, after the transmission of the color and depth frames, and/or at another suitable time).
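The difference between per-frame and static capture parameters can be sketched as a lookup. The function `parameters_for_frame` and the dictionary layout are hypothetical; the sketch assumes that a per-frame entry, when present, takes precedence over a static set delivered separately from the frames.

```python
from typing import Dict, Optional

Parameters = Dict[str, object]


def parameters_for_frame(index: int,
                         static_parameters: Optional[Parameters] = None,
                         per_frame_parameters: Optional[Dict[int, Parameters]] = None
                         ) -> Parameters:
    """Resolve the capture parameters that apply to the frame at `index`.

    Per-frame parameters (useful when the view changes over time) win over the
    static set; the static set is used for every frame that has no entry.
    """
    if per_frame_parameters is not None and index in per_frame_parameters:
        return per_frame_parameters[index]
    if static_parameters is None:
        raise LookupError(f"no capture parameters available for frame {index}")
    return static_parameters


static = {"field_of_view_deg": 90}
per_frame = {2: {"field_of_view_deg": 60}}  # the view changed for frame 2 only
```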

As mentioned above, FIG. 4B illustrates a consolidated graphical view of surface data frame sequence 400-1. Specifically, the view of surface data frame sequence 400-1 in FIG. 4B shows surface data frame sequence 400-1 as a block with a depiction of real-world scene 202 (i.e., including real-world object 204) on the front of the block, as viewed from a particular view (i.e., view 206-1). This type of surface data frame sequence view will be useful for illustrating additional surface data frame sequences in the figures described below. However, it will be understood that any surface data frame sequence represented using a consolidated graphical view such as that shown in FIG. 4B may include all of the same types of data shown and/or described in relation to FIG. 4A, in any of the arrangements described above.

Based on one or more of surface data frame sequences 400 (e.g., surface data frame sequence 400-1 explicitly shown in FIG. 4 and other similar surface data frame sequences not explicitly shown in FIG. 4, such as a surface data frame sequence 400-2 corresponding to view 206-2, a surface data frame sequence 400-3 corresponding to view 206-3, and so forth), system 100 may render color and depth frames for a virtualized projection of a customized view of real-world scene 202. For example, system 100 may identify an additional set of capture parameters (e.g., a set distinct from the sets of capture parameters associated with surface data frame sequences 400) that is associated with a customized view of real-world scene 202 (e.g., a view distinct from the views 206 shown in FIG. 2). System 100 may then render color and depth frames for a virtualized projection of the customized view based on at least one of the surface data frame sequences 400 and based on the additional set of capture parameters.

As used herein, a "customized view" of a real-world scene may refer to any view of the real-world scene that is distinct from the views associated with the physical capture devices that capture data representative of the real-world scene. For example, a customized view may be customized in terms of a location within the real-world scene near where a particular real-world object is located (e.g., to provide improved depth resolution or depth accuracy for the real-world object); a location within the real-world scene where no capture device is positioned; an orientation different from that which any view associated with a capture device can provide; a field of view different from that which any view associated with a capture device can provide (e.g., a field of view associated with a different zoom level, a wider-angle or narrower-angle lens, etc.); a level of detail (e.g., image resolution, etc.) different from that which any view associated with a capture device can provide; and so forth. Accordingly, as used herein, a "virtualized projection" of a customized view of a real-world scene may refer to data representative of a projection (e.g., a perspective projection, an orthographic projection, etc.) associated with the customized view. For instance, in certain examples, a virtualized projection may include a perspective projection that virtually simulates the data that would be captured by a physical capture device if such a capture device were associated with the customized view (i.e., if the capture device were to capture data using the set of capture parameters that define the customized view). As another example, a virtualized projection may include a non-perspective projection (e.g., an orthographic projection, etc.) that is not generated by simulating a virtual capture device but is instead generated by a depth peeling technique for generating depth data, or by another suitable technique as may serve a particular implementation.
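The distinction between a perspective projection and an orthographic projection can be illustrated with a minimal sketch. The function names and the pinhole model with a `focal_length` parameter are assumptions for illustration; the description does not prescribe a particular projection model.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def perspective_project(point: Point3D, focal_length: float = 1.0) -> Point2D:
    """Project a point the way a pinhole capture device would see it.

    Image coordinates shrink as the point moves farther away (larger z).
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the projection center")
    return (focal_length * x / z, focal_length * y / z)


def orthographic_project(point: Point3D) -> Point2D:
    """Project a point along parallel rays; its depth does not affect the result."""
    x, y, _ = point
    return (x, y)


near_point = (2.0, 4.0, 2.0)
far_point = (2.0, 4.0, 4.0)  # same x and y, twice as far away
```

Under the perspective projection the farther point lands at half the image offset of the nearer one, whereas under the orthographic projection both points land at the same image position.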

As described above, virtualized projections of customized views may provide new perspectives on aspects of a real-world scene, added flexibility for improved depth resolution, and various other benefits not available without virtualized projections. However, it will be understood that virtualized projections may be based on data captured by physical capture devices and, as such, may not provide any additional data beyond what the physical capture devices have captured. For example, while a virtualized projection may be associated with a customized view of the real-world scene at which no physical capture device is located, the virtualized projection may not provide any new information that is not already available from the views at which physical capture devices are located.

In certain examples, the customized view of real-world scene 202 may be aligned with a particular capture device being used to capture data representative of real-world scene 202. For instance, the additional set of capture parameters associated with (e.g., defining) the customized view of real-world scene 202 may include one or more capture parameters that call for data captured by only one capture device (e.g., that call for a subset of the data captured by the capture device).

For example, the additional set of capture parameters may include a capture parameter representative of a customized field of view associated with the customized view of real-world scene 202, where the customized field of view may be narrower than the captured field of view associated with the surface data frame sequence captured by the capture device. For instance, the additional set of capture parameters may call for a cropped (i.e., zoomed-in) portion of the data captured by a particular physical capture device.

As another example of a capture parameter that calls for data captured by only one capture device, the additional set of capture parameters may include a capture parameter representative of a customized image quality associated with the customized view of real-world scene 202, where the customized image quality is lower than the captured image quality associated with the surface data frame sequence captured by the capture device. For instance, the additional set of capture parameters may call for a lower-resolution version of the data captured by a particular physical capture device.
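By way of illustration only, the two single-device cases described above (a narrower field of view and a lower image quality) can be sketched in a few lines of Python. The sketch assumes that a frame is a plain grid of pixel values. The names `crop` and `downsample`, the 8x8 frame size, and the use of simple decimation rather than a filtered resize are assumptions made for this example and are not taken from the specification.

```python
from typing import List

# Hypothetical stand-in for one color or depth frame: rows of pixel values.
Frame = List[List[int]]


def crop(frame: Frame, top: int, left: int, height: int, width: int) -> Frame:
    """Return a sub-rectangle of the frame, i.e., a narrower field of view."""
    return [row[left:left + width] for row in frame[top:top + height]]


def downsample(frame: Frame, factor: int) -> Frame:
    """Keep every `factor`-th pixel in each dimension (a lower-resolution version)."""
    return [row[::factor] for row in frame[::factor]]


# An 8x8 captured frame whose pixel value encodes its own position.
captured = [[r * 8 + c for c in range(8)] for r in range(8)]

zoomed = crop(captured, top=2, left=2, height=4, width=4)   # 4x4 center region
low_res = downsample(captured, 2)                           # 4x4 decimated frame
```

Both results are derived entirely from the data of one captured frame, which is consistent with the statement that such capture parameters call only for a subset of the data captured by a single device.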

In other examples, the customized view of real-world scene 202 may be unaligned with the different views of real-world scene 202 captured by the plurality of capture devices disposed at different locations with respect to real-world scene 202 (i.e., the capture devices associated with views 206). As such, rendering the color and depth frames for the virtualized projection of the customized view of real-world scene 202 may include rendering the color and depth frames based on at least two of the surface data frame sequences 400. As with the examples described above in which the virtualized projection is based on data from a single capture device, the additional set of capture parameters in these examples may include capture parameters that call for a narrower field of view, a lower image quality, and so forth with respect to the captured surface data frame sequences. However, the additional set of capture parameters in examples in which the virtualized projection is based on data from a plurality of capture devices may further call for a customized location, a customized orientation, and so forth, which may differ from any actual location, orientation, etc., from which data is captured by the physical capture devices.

To illustrate, FIG. 5 shows an exemplary configuration 500 that is based on configuration 200 (i.e., that includes all the same elements illustrated and described above in relation to configuration 200), but in which data representative of real-world scene 202 is additionally generated for a customized view 502 of real-world scene 202. Specifically, as shown in configuration 500, customized view 502 may be located within real-world scene 202 near real-world object 204, with an orientation and field of view facing real-world object 204. While, as described above, certain customized views may be aligned with one of views 206 (e.g., located in the same location as one of views 206, providing the same orientation and/or field of view as one of views 206, etc.), customized view 502 in configuration 500 is shown to be unaligned with views 206. As used herein, a view is "unaligned" with another view if the views differ from one another in the respective locations associated with the views, the respective orientations associated with the views, and/or the respective fields of view associated with the views. In the case of customized view 502 and views 206, for example, customized view 502 is unaligned with all of views 206 because customized view 502 is disposed at a location different from that of any of views 206 (i.e., a location inside real-world scene 202), has an orientation different from that of any of views 206 (i.e., an orientation between the respective orientations of views 206-1 and 206-2), and has a field of view different from that of any of views 206 (i.e., a field of view providing a closer perspective on real-world object 204).
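The comparison of views described above can be sketched as a simple predicate. The following Python example is a minimal sketch under stated assumptions: the `View` type, its coordinate and angle conventions, the numeric tolerance, and the example values for views 206-1 and 502 are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class View:
    """Hypothetical description of a view of the real-world scene."""
    location: Tuple[float, float, float]      # x, y, z in scene units (assumed)
    orientation: Tuple[float, float, float]   # yaw, pitch, roll in degrees (assumed)
    field_of_view: float                      # horizontal angle in degrees (assumed)


def _same(a: Sequence[float], b: Sequence[float], tol: float = 1e-6) -> bool:
    """Compare two tuples of numbers component by component within a tolerance."""
    return all(math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b))


def is_unaligned(a: View, b: View) -> bool:
    """Views are unaligned if location, orientation, and/or field of view differ."""
    return not (
        _same(a.location, b.location)
        and _same(a.orientation, b.orientation)
        and _same((a.field_of_view,), (b.field_of_view,))
    )


# Illustrative values only; the specification does not give coordinates.
view_206_1 = View((0.0, 0.0, 10.0), (180.0, 0.0, 0.0), 90.0)
view_502 = View((0.0, 0.0, 2.0), (200.0, 0.0, 0.0), 40.0)
narrow_206_1 = View(view_206_1.location, view_206_1.orientation, 45.0)
```

Under this definition a difference in any one of the three properties is sufficient, so `narrow_206_1` is treated as unaligned with `view_206_1` even though only the field of view differs.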

As such, based on the identification of the additional set of capture parameters defining customized view 502, and based on a plurality of the surface data frame sequences 400 captured by the capture devices associated with views 206 (e.g., based on surface data frame sequences 400-1 and 400-2), system 100 may render color and depth frames for a virtualized projection of customized view 502 of real-world scene 202, and may provide a virtualized surface data frame sequence that includes these color and depth frames.

To illustrate, FIG. 6 shows an exemplary virtualized surface data frame sequence 600 that may be generated and provided by system 100 for the virtualized projection of customized view 502. Specifically, virtualized surface data frame sequence 600 may be generated to include the rendered color and depth frames for a virtualized projection 602 of customized view 502. As illustrated by virtualized projection 602, the color and depth frames included within virtualized surface data frame sequence 600 may be associated with (e.g., may appear to have been captured from) a location, an orientation, and a field of view different from those of any of the color and depth frames included within surface data frame sequences 400.

Specifically, as shown, virtualized projection 602 represents a close-up of real-world object 204 from a particular orientation. This may provide various advantages for downstream systems and devices that provide virtual reality media content representative of real-world object 204 from the location, orientation, and/or field of view provided by virtualized projection 602. For example, in implementations where depth data is processed using processing resources (e.g., video codec solutions) that operate on a finite number of bits (e.g., 8 to 12 bits for an off-the-shelf video codec configured to process color data), depth quantization issues such as undesirable "layering" of depth data representing relatively large areas may be mitigated by using depth data that represents a more localized area. A localized area may involve a shorter range of depths to be represented by the available bits than a larger (e.g., less localized) area, which allows the finite number of available bits to represent depth with high precision so as to reduce or eliminate layering effects. As such, the bit depth with which surfaces of objects are represented (e.g., the number of bits used to represent depth at different distances from a vantage point) may be optimized for virtualized projection 602 to provide a high level of depth precision and/or depth resolution from the vantage point of virtualized projection 602.
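The precision benefit can be checked with simple arithmetic. The following Python sketch assumes uniform (linear) quantization of depth between a near plane and a far plane. The 0 to 30 m and 1 to 2.5 m ranges, the 8-bit depth, and the function names are hypothetical values chosen for illustration, as the specification does not fix a quantization scheme.

```python
def depth_step(near_m: float, far_m: float, bits: int) -> float:
    """Smallest depth difference representable with uniform quantization."""
    return (far_m - near_m) / (2 ** bits - 1)


def quantize(depth_m: float, near_m: float, far_m: float, bits: int) -> int:
    """Map a depth in meters to an integer code in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    t = (depth_m - near_m) / (far_m - near_m)
    t = min(max(t, 0.0), 1.0)  # clamp depths outside the represented range
    return round(t * levels)


# Wide view covering the whole scene versus a localized close-up (assumed ranges).
wide_step = depth_step(0.0, 30.0, 8)
local_step = depth_step(1.0, 2.5, 8)

# Two surfaces 5 cm apart, encoded under each range.
wide_codes = (quantize(2.00, 0.0, 30.0, 8), quantize(2.05, 0.0, 30.0, 8))
local_codes = (quantize(2.00, 1.0, 2.5, 8), quantize(2.05, 1.0, 2.5, 8))
```

Under these assumptions the step size is roughly 11.8 cm for the wide range and roughly 0.6 cm for the localized range, a factor of 20. The two surfaces collapse to the same code in the wide range, which is the layering effect, but receive distinct codes in the localized range.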

While a particular additional set of capture parameters associated with a particular customized view (i.e., customized view 502) has been described and illustrated in detail with respect to FIGS. 5 and 6, it will be understood that large numbers of sets of capture parameters associated with large numbers of customized views (e.g., a number significantly greater than the number of physical capture devices employed to capture data representative of real-world scene 202) may be identified in certain implementations, allowing large numbers of virtualized surface data frame sequences to be generated and provided for possible inclusion within virtual reality media content. As described above, this large number of virtualized surface data frame sequences may allow for increased flexibility in the generation and distribution of virtual reality media content, so that different details about the same real-world scene can be provided to different media player devices without overwhelming any media player device with large amounts of redundant or relatively irrelevant data.

The one or more additional sets of capture parameters (i.e., sets of capture parameters other than the sets of capture parameters associated with views 206) may be identified in any manner as may serve a particular implementation. For example, in certain implementations, an additional set of capture parameters (e.g., the set of capture parameters associated with customized view 502) may be identified by analyzing real-world scene 202 with respect to one or more geometric properties of the real-world scene, generating (e.g., based on the analysis of the real-world scene) a plurality of additional sets of capture parameters distinct from the sets of capture parameters associated with views 206, and identifying the additional set of capture parameters from the plurality of additional sets of capture parameters. Specifically, for instance, system 100 may determine geometric properties of real-world scene 202 (e.g., properties associated with the shape of real-world scene 202, various parts of real-world scene 202 or manners in which real-world scene 202 may be divided, the locations and/or trajectories of particular objects within real-world scene 202, etc.). Based on these and/or other properties, system 100 may determine that various customized views (e.g., including customized view 502) are relevant to the generation of virtual reality media content and, as a result, may generate a respective set of capture parameters for each of these relevant customized views.

Once a plurality of captured surface data frame sequences has been received for the views of the real-world scene associated with the physical capture devices, and one or more virtualized surface data frame sequences have been generated for the virtualized projections of customized views of the real-world scene, system 100 may provide one or more of the captured and/or virtualized surface data frame sequences for inclusion within virtual reality media content. For example, as will be described in more detail below, system 100 may provide data in the captured and/or virtualized surface data frame sequences to server-side systems (e.g., downstream systems in a virtual reality media provider pipeline) and/or to client-side systems (e.g., media player devices associated with users experiencing virtual reality media content, such as virtual reality media content based on the data included within the surface data frame sequences).

System 100 may provide the data included within the surface data frame sequences (e.g., color and depth frames, as well as other types of data such as audio data, metadata, etc.) in any manner and to any other system or device as may serve a particular implementation. For example, in certain implementations, system 100 may provide color and depth frames (as well as audio, metadata, and so forth) to an encoding system that encodes the data to generate video data streams (e.g., 2D video streams compressed in a standardized format such as H.264 or H.265). Thus, for example, the data included within a particular surface data frame sequence may be included in one or more video data streams (e.g., a color video data stream, a depth video data stream, etc.). Other data included within the surface data frame sequence (e.g., audio data, metadata, etc.) may also be included within the color video data stream and/or the depth video data stream, or may be included within a different data stream.

Regardless of which system the surface data frame sequence data is provided to and/or whether the surface data frame sequence data has been encoded into one or more video data streams, the surface data frame sequence data may be packaged and/or multiplexed for transport over a network. This packaging may be performed in any suitable manner and/or using any suitable data structure as may serve a particular implementation. As one example, each surface data frame sequence may be packaged into its own unique transport stream. Specifically, for instance, virtualized surface data frame sequence 600 may be packaged such that the rendered color and depth frames for virtualized projection 602 of customized view 502 of real-world scene 202 are included within a transport stream that does not include color and depth frames representative of any surface data frame sequence other than virtualized surface data frame sequence 600 (e.g., additional captured or virtualized surface data frame sequences).

As another example, multiple surface data frame sequences may be packaged together (e.g., multiplexed) into a shared transport stream. Specifically, for instance, virtualized surface data frame sequence 600 may be packaged such that the rendered color and depth frames for virtualized projection 602 of customized view 502 of real-world scene 202 are included within a transport stream that further includes color and depth frames representative of at least one additional surface data frame sequence other than virtualized surface data frame sequence 600 (e.g., at least one additional captured or virtualized surface data frame sequence).
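The two packaging options described above (one transport stream per sequence, or one shared transport stream) can be contrasted with a minimal Python sketch. Here a transport stream is modeled simply as a list of sequence identifiers; the function name `package` and this list-based model are assumptions made for illustration and do not reflect any particular transport stream format.

```python
from typing import List


def package(sequence_ids: List[str], shared: bool) -> List[List[str]]:
    """Group surface data frame sequences into transport streams.

    shared=False: each sequence gets its own transport stream.
    shared=True:  all sequences are multiplexed into one transport stream.
    """
    if shared:
        return [list(sequence_ids)]
    return [[sequence_id] for sequence_id in sequence_ids]


sequences = ["400-1", "400-2", "600"]
separate_streams = package(sequences, shared=False)
shared_stream = package(sequences, shared=True)
```

In the separate case the stream carrying sequence 600 contains no other sequence, whereas in the shared case sequence 600 travels alongside at least one additional sequence.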

To illustrate, FIG. 7 shows a graphical representation of an exemplary transport stream 700 that includes an exemplary plurality of surface data frame sequences. Specifically, transport stream 700 may include various captured surface data frame sequences 400 (e.g., surface data frame sequence 400-1 shown in FIGS. 4A and 4B, and surface data frame sequences 400-2 through 400-8 similarly captured by the capture devices associated with views 206-2 through 206-8, respectively), as well as various virtualized surface data frame sequences (e.g., virtualized surface data frame sequence 600 shown in FIG. 6 and other virtualized surface data frame sequences 702-1 through 702-N).

As used herein, a "transport stream" may refer to a data structure used to package data for purposes of facilitating the transmission (i.e., transport) of the data from one device or system to another, rendering or otherwise processing or analyzing the data, and/or other purposes as may serve a particular implementation. In some examples, a transport stream may incorporate one or more data streams (e.g., one or more video data streams) and/or other data (e.g., metadata, etc.). Transport streams may be implemented as any type of transport stream as may serve a particular implementation. For example, certain transport streams described herein (e.g., transport stream 700) may be implemented by way of an MPEG transport stream, an MPEG-2 transport stream, or another suitable data structure that facilitates the transport of data such as surface data frame sequences, video data streams, and the like. A transport stream may be configured in accordance with any suitable data format, container format, and/or transport protocol as may serve a particular implementation.

While transport stream 700 is shown to include both captured and virtualized surface data frame sequences, it will be understood that, in certain implementations, transport stream 700 may include only captured surface data frame sequences or only virtualized surface data frame sequences. Additionally, transport stream 700 may include any suitable number of surface data frame sequences and any combination of surface data frame sequences as may serve a particular implementation. For example, as described above, in certain examples transport stream 700 may include a single surface data frame sequence (e.g., virtualized surface data frame sequence 600), and other surface data frame sequences, if they are transported at all, may be transported by way of other transport streams. It will also be understood that, while FIG. 7 illustrates transport stream 700 as including the surface data frame sequences described and illustrated above (e.g., surface data frame sequences 400 and 600), these surface data frame sequences may be included within data structures such as encoded video data streams (not explicitly shown) and, as such, may refer to different versions of the data than the versions described above as being received and/or generated by system 100 (e.g., versions of the data that have been encoded and/or compressed into video data streams).

FIG. 8 illustrates a data structure representation 800 of transport stream 700. As shown, representation 800 includes sections for different types of data, such as a section of metadata 802, a section of audio data 804, and a section of video data 806. It will be understood that the sections illustrated in representation 800 may be conceptual only, and that the data illustrated in representation 800 may be multiplexed, organized, arranged, transmitted, etc., within transport stream 700 in any way as may serve a particular implementation.

As shown, metadata 802 includes various sets of capture parameters (i.e., "Capture Parameter Set 1" through "Capture Parameter Set M") associated with each surface data frame sequence included within transport stream 700. For example, the sets of capture parameters included within metadata 802 may include a respective set of capture parameters for each of the captured and virtualized surface data frame sequences shown in FIG. 7 (i.e., surface data frame sequences 400-1 through 400-8, 600, and 702-1 through 702-N). Metadata 802 may further include any other metadata describing the surface data frame sequences (or, e.g., the video data streams in which the surface data frame sequences are encoded) as may serve a particular implementation.

Similarly, audio data 804 may include audio source data associated with each surface data frame sequence included within transport stream 700. For example, "Audio Source 1" through "Audio Source M" may be associated with surface data frame sequences 400-1 through 400-8, 600, and 702-1 through 702-N, respectively. In other examples, where the audio sources are not associated with the surface data frame sequences, there may be more or fewer audio sources than there are surface data frame sequences (e.g., the number of audio sources may be unrelated to the number of surface data frame sequences).

As further shown in FIG. 8, video data 806 may include a color video data stream and a depth video data stream associated with each of the surface data frame sequences shown in FIG. 7 as being included within transport stream 700. For example, "Color Video Data Stream 1" and "Depth Video Data Stream 1" may represent the color and depth frames included within surface data frame sequence 400-1, "Color Video Data Stream 2" and "Depth Video Data Stream 2" may represent the color and depth frames included within surface data frame sequence 400-2, and so forth, such that each of surface data frame sequences 400-1 through 400-8, 600, and 702-1 through 702-N corresponds to both a color video data stream and a depth video data stream within video data 806.
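The three conceptual sections of representation 800 can be mirrored in a small data structure. The following Python sketch is illustrative only: the class name `TransportStream`, its fields, the placeholder byte strings, and the choice of N = 3 for sequences 702-1 through 702-N are assumptions, and the sketch models the one-to-one case in which each surface data frame sequence has its own capture parameter set, audio source, and pair of video data streams.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TransportStream:
    """Conceptual layout mirroring representation 800 (all names hypothetical)."""
    metadata: List[Dict[str, object]] = field(default_factory=list)  # capture parameter sets
    audio: List[bytes] = field(default_factory=list)                 # audio sources
    video: List[Tuple[bytes, bytes]] = field(default_factory=list)   # (color, depth) stream pairs

    def add_sequence(self, capture_parameters: Dict[str, object],
                     audio_source: bytes, color_stream: bytes,
                     depth_stream: bytes) -> int:
        """Append one surface data frame sequence; return its 1-based index."""
        self.metadata.append(capture_parameters)
        self.audio.append(audio_source)
        self.video.append((color_stream, depth_stream))
        return len(self.video)


# Captured sequences 400-1..400-8, virtualized sequence 600, and 702-1..702-N (N = 3 assumed).
sequence_ids = [f"400-{i}" for i in range(1, 9)] + ["600"] + [f"702-{i}" for i in range(1, 4)]

stream_700 = TransportStream()
for sequence_id in sequence_ids:
    stream_700.add_sequence(
        {"sequence": sequence_id},           # placeholder capture parameter set
        f"audio:{sequence_id}".encode(),     # placeholder audio source
        f"color:{sequence_id}".encode(),     # placeholder encoded color stream
        f"depth:{sequence_id}".encode(),     # placeholder encoded depth stream
    )
```

With these assumptions M equals 12, and index 9 in each section corresponds to virtualized surface data frame sequence 600. An actual transport stream would multiplex these sections rather than store them as separate lists.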

As mentioned above, in certain implementations it may be useful to provide a relatively large number of virtualized surface data frame sequences to allow for flexibility in generating different versions of virtual reality media content that can be customized with different details relevant to different media player devices (i.e., associated with different users having different virtual reality experiences) at different times. For example, in one implementation, eight physical capture devices may generate eight high-resolution captured surface data frame sequences, and system 100 may generate, based on the eight captured surface data frame sequences, hundreds of virtualized surface data frame sequences for hundreds of virtualized projections of hundreds of customized views.

Providing such a large number of virtualized surface data frame sequences may allow for significant flexibility in efficiently generating and distributing virtual reality media content, but may be difficult to handle with available hardware and software resources that may not be equipped to handle so many individual streams of data (e.g., video data streams, etc.). As a result, it may be desirable to package a plurality of color and/or depth frame sequences into a single surface data frame sequence. For example, virtualized surface data frame sequence 600 may be packaged such that each of the rendered color and depth frames for virtualized projection 602 of customized view 502 of real-world scene 202 is represented as a tile in a video data stream implementing a tile map that represents a plurality of tiles in each frame of the video data stream. For example, a tile mapping technique (e.g., a texture atlas technique, a sprite sheet technique, etc.) may be used to pack multiple color and/or depth frames together into a single frame, such that a sequence of such frames (or, e.g., a video data stream representative of these frames) may be treated essentially as a single frame sequence while including data associated with multiple frame sequences (e.g., representative of multiple views, including virtualized projections of customized views).
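The tile mapping idea can be sketched as a simple grid atlas. The following Python example is a minimal sketch that assumes all tiles have the same dimensions and are laid out row by row in a fixed number of columns; the function names, the 2x2 tile size, and the 3x3 layout of nine tiles are assumptions made for illustration, and a practical texture atlas may use a different layout.

```python
import math
from typing import List, Tuple

# Hypothetical stand-in for one color or depth frame: rows of pixel values.
Frame = List[List[int]]


def pack_tiles(frames: List[Frame], columns: int) -> Tuple[Frame, List[Tuple[int, int]]]:
    """Pack equally sized frames into one tiled frame.

    Returns the tiled frame and the (top, left) origin of each tile.
    """
    tile_h, tile_w = len(frames[0]), len(frames[0][0])
    if any(len(f) != tile_h or any(len(r) != tile_w for r in f) for f in frames):
        raise ValueError("all frames must have the same dimensions")
    rows = math.ceil(len(frames) / columns)
    atlas = [[0] * (tile_w * columns) for _ in range(tile_h * rows)]
    origins = []
    for index, frame in enumerate(frames):
        top = (index // columns) * tile_h
        left = (index % columns) * tile_w
        for y, row in enumerate(frame):
            atlas[top + y][left:left + tile_w] = row  # copy one tile row into place
        origins.append((top, left))
    return atlas, origins


def unpack_tile(atlas: Frame, origin: Tuple[int, int], tile_h: int, tile_w: int) -> Frame:
    """Recover one tile from the tiled frame given its origin."""
    top, left = origin
    return [row[left:left + tile_w] for row in atlas[top:top + tile_h]]


# Nine 2x2 frames (e.g., eight captured views plus one virtualized projection).
frames = [[[i + 1] * 2 for _ in range(2)] for i in range(9)]
atlas, origins = pack_tiles(frames, columns=3)
ninth = unpack_tile(atlas, origins[8], 2, 2)
```

Because the tile origins are known, a downstream system can treat the sequence of tiled frames as one video stream and still recover each constituent frame, which is the property that lets many frame sequences be handled as a single sequence.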

To illustrate, FIG. 9 shows a graphical representation of an exemplary transport stream 900 that includes an exemplary tiled frame sequence 902 implementing a tile map. Although tiled frame sequence 902 is drawn as a block similar to the surface data frame sequences illustrated herein (e.g., surface data frame sequence 400-1 in FIG. 4B, virtualized surface data frame sequence 600 in FIG. 6, etc.), it will be understood that tiled frame sequence 902 may represent different data than some of the other surface data frame sequences herein. Specifically, for example, tiled frame sequence 902 may not represent multiple frame sequences (e.g., color frame sequence 402 and depth frame sequence 404, shown in FIG. 4A), but rather may include a single sequence of frames, each of which includes a plurality of tiles 904 (i.e., tiles 904-1-C through 904-9-C), such as those shown on the front of tiled frame sequence 902.

The tiles included on each frame of tiled frame sequence 902 may include any color or depth frames associated with any captured or virtualized surface data frame sequence as may serve a particular implementation. For example, as shown in FIG. 9, each frame of tiled frame sequence 902 may include a tile 904-1-C corresponding to a color ("C") frame captured from view 206-1, a tile 904-2-C corresponding to a color frame captured from view 206-2, and so forth up to a tile 904-8-C corresponding to a color frame captured from view 206-8. As further shown, each frame of tiled frame sequence 902 may include a tile 904-9-C associated with a color frame generated for virtualized projection 602 of customized view 502. While FIG. 9 explicitly illustrates only nine tiles, it will be understood that additional tiles may also be packed into each frame of tiled frame sequence 902. For example, tiles corresponding to other virtualized projections, to depth frames (e.g., depth frames captured from views 206 and/or generated for virtualized projections), or the like may further be included within the tile map as may serve a particular implementation. Additionally, while transport stream 900 in FIG. 9 illustrates only one tiled frame sequence employing a tile map, it will be understood that transport stream 900 may be used to package multiple frame sequences, including tiled frame sequences (e.g., tiled frame sequence 902), surface data frame sequences (e.g., frame sequences 400, 600, or 702), or other data as may serve a particular implementation. In some examples, tiled frame sequence 902 may be transmitted without being included in a transport stream (e.g., transport stream 900).

FIG. 10 illustrates a data structure representation 1000 of transport stream 900. As shown, like representation 800, representation 1000 includes sections for different types of data: a section of metadata 1002, a section of audio data 1004, and a section of video data 1006. Also like representation 800, the sections illustrated in representation 1000 will be understood to be conceptual only; the data illustrated in representation 1000 may be multiplexed, organized, arranged, transmitted, etc., within transport stream 900 in any way as may serve a particular implementation.

As shown, metadata 1002 includes two different types of metadata for each tile (e.g., tiles 904-1-C through 904-9-C and/or other tiles not explicitly shown in FIG. 9). Specifically, for each tile ("tile 1" through "tile M"), metadata 1002 includes tile coordinates (e.g., "tile coordinates 1" through "tile coordinates M") indicating the section of each frame that is dedicated to data associated with that particular tile. For example, the tile coordinates for tile 1 may include coordinates indicating the upper left corner of the frame, where tile 904-1-C is shown in FIG. 9, and so forth. Metadata 1002 also includes, for each of tiles 1 through M, a respective set of capture parameters associated with the tile (i.e., "capture parameter set 1" through "capture parameter set M"). For example, the sets of capture parameters included within metadata 1002 may include a respective set of capture parameters for each of the tiles shown in FIG. 9 (i.e., tiles 904-1-C through 904-9-C). Metadata 1002 may further include any other metadata describing the tiles of tiled frame sequence 902 (or a video data stream into which tiled frame sequence 902 is encoded) as may serve a particular implementation.
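As a rough illustration of how the two per-tile metadata types might be used together, the sketch below pairs a rectangle of tile coordinates with a set of capture parameters and uses the rectangle to cut one tile out of a tiled frame. The class name `TileMetadata`, its field names, and the rectangle encoding are hypothetical; the specification does not define a concrete format for tile coordinates or capture parameters.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TileMetadata:
    """Hypothetical per-tile metadata: where the tile sits, and how it was captured."""
    x: int                 # left edge of the tile within the tiled frame
    y: int                 # top edge of the tile within the tiled frame
    width: int
    height: int
    # Assumed free-form mapping, e.g. {"view": "206-1", "kind": "color"}.
    capture_parameters: dict = field(default_factory=dict)


def extract_tile(tiled_frame, meta):
    """Return the rows of `tiled_frame` (a list of rows) covered by `meta`."""
    rows = tiled_frame[meta.y:meta.y + meta.height]
    return [row[meta.x:meta.x + meta.width] for row in rows]
```

A receiver following this sketch would look up the metadata entry for the tile it needs, crop that region from each decoded frame, and interpret the cropped pixels according to the associated capture parameters.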

Similar to audio data 804 in representation 800, audio data 1004 may include audio source data associated with each tile included within tiled frame sequence 902 (i.e., within transport stream 900). For example, "audio source 1" through "audio source M" may be associated, respectively, with tiles 904-1-C through 904-9-C and/or other tiles not explicitly shown in FIG. 9. In other examples, there may be more or fewer audio sources if the tiles are not specifically associated with audio sources.

In contrast to representation 800 in FIG. 8, which includes multiple color and depth video data streams associated with each surface data frame sequence, representation 1000 shows that video data 1006 includes only one video data stream, the one associated with tiled frame sequence 902. This is because, as illustrated in FIG. 9, all of the images associated with each color and/or depth frame represented by tiled frame sequence 902 are packed together into each frame of tiled frame sequence 902. In certain examples, transport stream 900 may include one frame sequence dedicated to color data tiles (e.g., tiled frame sequence 902) and a second frame sequence dedicated to depth data tiles (not explicitly shown in FIG. 9). In such examples, video data 1006 may include both a color video data stream and a depth video data stream. In still other examples, as mentioned above, transport stream 900 may include other frame sequences, video streams, etc. (whether tiled or untiled) along with tiled frame sequence 902. In such examples, video data 1006 may include other video data streams not explicitly shown in FIG. 10.
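The reduction in stream count can be made concrete with a small worked calculation. The sketch assumes that the untiled arrangement carries exactly one color stream and one depth stream per surface data frame sequence, which is a simplification adopted for illustration; the helper `video_stream_count` is hypothetical.

```python
def video_stream_count(num_sequences, tiled, separate_depth_stream=False):
    """Count video data streams under the simplified assumptions stated above.

    Untiled: one color and one depth stream per surface data frame sequence.
    Tiled: one stream in total, or two when color tiles and depth tiles are
    carried in separate tiled frame sequences.
    """
    if num_sequences < 1:
        raise ValueError("num_sequences must be at least 1")
    if tiled:
        return 2 if separate_depth_stream else 1
    return 2 * num_sequences


# Eight captured views plus one virtualized projection, as in the FIG. 9 example.
untiled = video_stream_count(9, tiled=False)
single = video_stream_count(9, tiled=True)
split = video_stream_count(9, tiled=True, separate_depth_stream=True)
```

Under these assumptions, nine sequences need 18 streams when untiled, one stream when fully tiled, and two streams when color and depth tiles are kept in separate tiled frame sequences.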

As mentioned above, in some examples, system 100 and/or other systems (e.g., other server-side systems) and devices described herein may be used to generate virtual reality media content to be experienced by users. For example, in addition to the operations described above, a virtual reality media content provider system (e.g., one within which system 100 and/or other devices and systems described herein may be included, or with which they may otherwise be associated) may further generate and provide virtual reality media content based on data that system 100 provides. The virtual reality media content may be representative of a real-world scene (e.g., real-world scene 202) and may be presentable to a user so as to be experienced from a dynamically selectable virtual viewpoint corresponding to an arbitrary virtual location with respect to the real-world scene. For example, the dynamically selectable virtual viewpoint may be selected (e.g., determined, placed, etc.) by a user of a media player device while the user is using the media player device to virtually experience the real-world scene. In some examples, the virtual viewpoint may be selected to be at any location along a two- or three-dimensional continuum, as opposed to being selectable only from a discrete set of viewpoints. Moreover, the virtual reality media content may be provided to the media player device (e.g., by a virtual reality media content provider system that includes or is otherwise associated with system 100) to allow the user to experience the real-world scene from a dynamically selectable virtual viewpoint corresponding to an arbitrary virtual location within the real-world scene.

To illustrate, FIG. 11 shows an exemplary configuration 1100 in which an exemplary virtual reality media content provider system 1102 ("provider system 1102"), which includes system 100 and one or more additional virtual reality media provider pipeline systems 1104, generates virtual reality media content that is provided by way of a network 1106 to an exemplary client-side media player device 1108 ("media player device 1108") used by a user 1110 to experience real-world scene 202.

As described above, after virtualized surface data frame sequence 600 has been generated and packaged into a transport stream (e.g., transport stream 700, transport stream 900, etc.), provider system 1102 may further encode, package, encrypt, or otherwise process one or more transport streams to form virtual reality media content that media player device 1108 may be configured to render. For example, the virtual reality media content may include or be representative of a plurality of 2D video data streams (e.g., 2D video data streams associated with color and depth data for each view and virtualized projection) that may be rendered by media player device 1108 so as to present a view of real-world scene 202 from any arbitrary virtual viewpoint within real-world scene 202 (e.g., including virtual viewpoints that do not align with any capture device view or customized view but that may be of interest to user 1110), as will be described below. The virtual reality media content may then be distributed by way of network 1106 to one or more media player devices, such as media player device 1108 associated with user 1110. For example, provider system 1102 may provide the virtual reality media content to media player device 1108 so that user 1110 may virtually experience real-world scene 202 using media player device 1108.

In some examples, it may be undesirable for user 1110 to be limited to one or more discrete positions within the immersive virtual reality world represented by the virtual reality media content (e.g., representative of real-world scene 202). Accordingly, provider system 1102 may provide sufficient data within the virtual reality media content representative of real-world scene 202 to allow real-world scene 202 to be represented not only from views 206 and/or 502, but from any dynamically selectable virtual viewpoint corresponding to an arbitrary virtual location within real-world scene 202. For example, the dynamically selectable virtual viewpoint may be selected by user 1110 while user 1110 is experiencing real-world scene 202 using media player device 1108.

As used herein, an "arbitrary virtual location" may refer to any virtual point in space associated with a representation of a real-world scene. For example, arbitrary virtual locations are not limited to fixed positions surrounding the real-world scene (e.g., fixed positions associated with views 206 and/or customized view 502), but also include all the positions between the positions associated with views 206 and customized view 502. In some examples, such arbitrary virtual locations may correspond to the most desirable virtual viewpoints within real-world scene 202. For instance, if real-world scene 202 includes a basketball game, user 1110 may dynamically select virtual viewpoints from which to experience the game that are at any arbitrary virtual location on the basketball court. For example, the user may dynamically select his or her virtual viewpoint to move up and down the basketball court and experience the basketball game as if standing on the court while the game is in progress.

Network 1106 may include a provider-specific wired or wireless network (e.g., a cable or satellite carrier network or a mobile telephone network), the Internet, a wide area network, a content delivery network, or any other suitable network. Data may flow between provider system 1102 and media player device 1108 (as well as other media player devices not explicitly shown) using any communication technologies, devices, media, and protocols as may serve a particular implementation.

Media player device 1108 may be used by user 1110 to access and experience virtual reality media content received from provider system 1102. For example, media player device 1108 may be configured to generate a 3D virtual representation of real-world scene 202 to be experienced by user 1110 from an arbitrary virtual viewpoint (e.g., a dynamically selectable virtual viewpoint selected by the user and corresponding to an arbitrary virtual location within real-world scene 202). To this end, media player device 1108 may include or be implemented by any device capable of presenting a field of view of an immersive virtual reality world (e.g., an immersive virtual reality world representative of real-world scene 202) and detecting user input from user 1110 to dynamically update the immersive virtual reality world presented within the field of view as user 1110 experiences the immersive virtual reality world.

FIG. 12 shows various exemplary types of media player devices 1108 that may be used by user 1110 to experience virtual reality media content. Specifically, as shown, media player device 1108 may take one of several different form factors, such as a head-mounted virtual reality device 1202 (e.g., a virtual reality gaming device) that includes a head-mounted display screen, a personal computer device 1204 (e.g., a desktop computer, laptop computer, etc.), a mobile or wireless device 1206 (e.g., a smartphone, a tablet device, etc., possibly mounted to the head of user 1110 by means of a head mount apparatus), or any other device or configuration of devices that may serve a particular implementation to facilitate receiving and/or presenting virtual reality media content. Different types of media player devices (e.g., head-mounted virtual reality devices, personal computer devices, mobile devices, etc.) may provide different types of virtual reality experiences having different levels of immersiveness for user 1110.

FIG. 13 illustrates an exemplary virtual reality experience 1300 in which user 1110 is presented with exemplary virtual reality media content representative of a real-world scene as experienced from a dynamically selectable virtual viewpoint corresponding to an exemplary arbitrary virtual location with respect to the real-world scene. Specifically, virtual reality media content 1302 is presented within a field of view 1304 that shows the real-world scene from a virtual viewpoint corresponding to an arbitrary virtual location directly beneath a basketball goal within the representation of the real-world scene, where a shot is being made. An immersive virtual reality world 1306 based on the real-world scene may be available for the viewer to experience by providing user input (e.g., head movements, keyboard input, etc.) to look around and/or to move around (i.e., dynamically select a virtual viewpoint from which to experience) immersive virtual reality world 1306.

For example, field of view 1304 may provide a window through which user 1110 may easily and naturally look around immersive virtual reality world 1306. Field of view 1304 may be presented by media player device 1108 (e.g., on a display screen of media player device 1108) and may include video depicting objects surrounding the user within immersive virtual reality world 1306. Additionally, field of view 1304 may dynamically change in response to user input provided by user 1110 as user 1110 experiences immersive virtual reality world 1306. For example, media player device 1108 may detect user input (e.g., moving or turning the display screen upon which field of view 1304 is presented). In response, field of view 1304 may display different objects and/or objects seen from a different virtual viewpoint or virtual location in place of the objects seen from the previous virtual viewpoint or virtual location.

In FIG. 13, immersive virtual reality world 1306 is illustrated as a semi-sphere, indicating that user 1110 may look in any direction within immersive virtual reality world 1306 that is substantially forward, backward, left, right, and/or up from the virtual viewpoint at the location under the basketball goal that user 1110 has currently selected. In other examples, immersive virtual reality world 1306 may include a full 360° sphere that adds the remaining 180° portion, such that user 1110 may also look down. Additionally, user 1110 may move around to other locations within immersive virtual reality world 1306 (i.e., dynamically select different dynamically selectable virtual viewpoints within the representation of the real-world scene). For example, user 1110 may select a virtual viewpoint at half court, a virtual viewpoint from the free-throw line facing the basketball goal, a virtual viewpoint suspended above the basketball goal, or the like.

FIG. 14 illustrates an exemplary method 1400 for generating a virtualized projection of a customized view of a real-world scene for inclusion within virtual reality media content. While FIG. 14 illustrates exemplary operations according to one embodiment, other embodiments may omit, add to, reorder, and/or modify any of the operations shown in FIG. 14. One or more of the operations shown in FIG. 14 may be performed by system 100, an implementation thereof, and/or another system described above as being associated with (e.g., communicatively coupled to, configured to interoperate with, etc.) system 100.

In operation 1402, a virtualized projection generation system may receive a plurality of captured surface data frame sequences, each including color and depth frames depicting a real-world scene in accordance with a respective set of capture parameters included in a plurality of sets of capture parameters associated with different views of the real-world scene. In some examples, each surface data frame sequence in the plurality of captured surface data frame sequences may be captured by a different capture device in a plurality of capture devices disposed at different locations with respect to the real-world scene so as to capture the different views of the real-world scene. Operation 1402 may be performed in any of the ways described herein.

In operation 1404, the virtualized projection generation system may identify an additional set of capture parameters distinct from the sets of capture parameters included in the plurality of sets of capture parameters associated with the captured surface data frame sequences received in operation 1402. In some examples, the additional set of capture parameters may be associated with a customized view of the real-world scene distinct from the different views of the real-world scene captured by the plurality of capture devices. Operation 1404 may be performed in any of the ways described herein.

In operation 1406, the virtualized projection generation system may render color and depth frames for a virtualized projection of the customized view of the real-world scene. For example, the virtualized projection generation system may render the color and depth frames for the virtualized projection based on the captured surface data frame sequences received in operation 1402 and based on the additional set of capture parameters identified in operation 1404. Operation 1406 may be performed in any of the ways described herein.

In operation 1408, the virtualized projection generation system may provide a virtualized surface data frame sequence that includes the color and depth frames rendered in operation 1406 for the virtualized projection of the customized view of the real-world scene. In some examples, the virtualized projection generation system may provide the virtualized surface data frame sequence to a media player device for inclusion within virtual reality media content. Operation 1408 may be performed in any of the ways described herein.
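The four operations of method 1400 can be summarized in a short control-flow sketch. Everything concrete here is an assumption made for illustration: sequences are modeled as dictionaries with `capture_parameters` and `frames` keys, and the actual rendering is delegated to a caller-supplied `render_frame` function, because the specification leaves the rendering technique open.

```python
def generate_virtualized_sequence(captured_sequences, custom_parameters, render_frame):
    """Hypothetical outline of operations 1402 through 1408.

    captured_sequences: list of {"capture_parameters": ..., "frames": [...]}
    custom_parameters:  the additional capture parameter set (customized view)
    render_frame:       callable(source_frames, custom_parameters) -> frame
    """
    # Operation 1402: receive the captured surface data frame sequences.
    if not captured_sequences:
        raise ValueError("at least one captured sequence is required")

    # Operation 1404: the additional set must differ from every captured set.
    captured_sets = [seq["capture_parameters"] for seq in captured_sequences]
    if custom_parameters in captured_sets:
        raise ValueError("custom parameters must differ from all captured sets")

    # Operation 1406: render one frame of the virtualized projection per time
    # step, using the frames all capture devices produced for that time step.
    frame_count = min(len(seq["frames"]) for seq in captured_sequences)
    rendered = []
    for t in range(frame_count):
        source_frames = [
            (seq["capture_parameters"], seq["frames"][t])
            for seq in captured_sequences
        ]
        rendered.append(render_frame(source_frames, custom_parameters))

    # Operation 1408: provide the virtualized surface data frame sequence.
    return {"capture_parameters": custom_parameters, "frames": rendered}
```

Passing the renderer in as a parameter keeps the outline independent of any particular rendering technique, so the sketch shows only how the received sequences, the additional capture parameter set, and the rendered output relate to one another.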

FIG. 15 illustrates an exemplary method 1500 for generating a virtualized projection of a customized view of a real-world scene for inclusion within virtual reality media content. While FIG. 15 illustrates exemplary operations according to one embodiment, other embodiments may omit, add to, reorder, and/or modify any of the operations shown in FIG. 15. One or more of the operations shown in FIG. 15 may be performed by system 100, an implementation thereof, and/or another system described above as being associated with (e.g., communicatively coupled to, configured to interoperate with, etc.) system 100.

At operation 1502, the virtualized projection generation system may receive a plurality of captured surface data frame sequences, each including color and depth frames that depict a real-world scene in accordance with a respective set of capture parameters included in a first plurality of sets of capture parameters associated with different views of the real-world scene. For example, each surface data frame sequence in the plurality of captured surface data frame sequences may be captured by a different capture device in a plurality of capture devices disposed at different locations with respect to the real-world scene so as to capture the different views of the real-world scene. In certain examples, operation 1502 may be performed in real time as events occur within the real-world scene. Operation 1502 may be performed in any of the ways described herein.

At operation 1504, the virtualized projection generation system may identify a second plurality of sets of capture parameters that is distinct from the sets of capture parameters included in the first plurality of sets of capture parameters. For example, each set of capture parameters in the second plurality may be associated with a respective customized view of the real-world scene that is distinct from the different views of the real-world scene captured by the plurality of capture devices discussed above in relation to operation 1502. In some examples, the second plurality of sets of capture parameters may include a greater number of sets than the first plurality of sets of capture parameters associated with the captured surface data frame sequences received in operation 1502. As with operation 1502, in certain implementations operation 1504 may be performed in real time as events occur within the real-world scene. Operation 1504 may be performed in any of the ways described herein.

At operation 1506, the virtualized projection generation system may render color and depth frames for a virtualized projection of each respective customized view of the real-world scene associated with a set of capture parameters in the second plurality of sets of capture parameters identified in operation 1504. In some examples, the virtualized projection generation system may render the color and depth frames based on the plurality of captured surface data frame sequences received in operation 1502 and on the second plurality of sets of capture parameters identified in operation 1504. As with operations 1502 and 1504, in certain implementations operation 1506 may be performed in real time as events occur within the real-world scene. Operation 1506 may be performed in any of the ways described herein.

At operation 1508, the virtualized projection generation system may provide a plurality of virtualized surface data frame sequences that include the color and depth frames rendered in operation 1506 for the virtualized projection of each respective customized view of the real-world scene. For example, the virtualized projection generation system may provide the plurality of virtualized surface data frame sequences for inclusion within virtual reality media content for a media player device. As with operations 1502 through 1506, operation 1508 may be performed in real time as events occur within the real-world scene. Operation 1508 may be performed in any of the ways described herein.
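As a non-limiting sketch, the real-time flow of operations 1502 through 1508 could be modeled in Python as a lazy generator that handles one time step at a time. The names `virtualize_stream` and `average_depth_render` are hypothetical, as are the use of string identifiers for sets of capture parameters and of a single `depth` value for each frame. The renderer is a placeholder that averages depth values and is not the rendering described herein.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Frame = Dict[str, float]  # hypothetical stand-in for one color-and-depth frame
ViewId = str              # hypothetical key naming one set of capture parameters


def average_depth_render(captured: Dict[ViewId, Frame], view: ViewId) -> Frame:
    """Placeholder renderer: averages the depth values of all captured frames."""
    depths = [frame["depth"] for frame in captured.values()]
    return {"depth": sum(depths) / len(depths)}


def virtualize_stream(
    captured_ticks: Iterable[Dict[ViewId, Frame]],
    customized_views: List[ViewId],
    render: Callable[[Dict[ViewId, Frame], ViewId], Frame] = average_depth_render,
) -> Iterator[Dict[ViewId, Frame]]:
    """Lazily yield one frame per customized view for each captured time step."""
    for captured in captured_ticks:  # operation 1502: frames arrive over time
        # Operation 1504 precondition: customized views differ from captured views.
        if set(customized_views) & set(captured):
            raise ValueError("customized views must differ from captured views")
        # Operations 1506 and 1508: render every customized view and hand it on.
        yield {view: render(captured, view) for view in customized_views}
```

Because the generator consumes its input one time step at a time, virtualized frames can be handed downstream while later frames are still being captured, and the number of customized views may exceed the number of capture devices.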

In certain embodiments, one or more of the systems, components, and/or processes described herein may be implemented and/or performed by one or more appropriately configured computing devices. To this end, one or more of the systems and/or components described above may include or be implemented by any computer hardware and/or computer-implemented instructions (e.g., software) embodied on at least one non-transitory computer-readable medium configured to perform one or more of the processes described herein. In particular, system components may be implemented on one physical computing device or on more than one physical computing device. Accordingly, system components may include any number of computing devices and may employ any of a number of computer operating systems.

In certain embodiments, one or more of the processes described herein may be implemented at least in part as instructions embodied in a non-transitory computer-readable medium and executable by one or more computing devices. In general, a processor (e.g., a microprocessor) receives instructions from a non-transitory computer-readable medium (e.g., a memory) and executes those instructions, thereby performing one or more processes, including one or more of the processes described herein. Such instructions may be stored and/or transmitted using any of a variety of known computer-readable media.

A computer-readable medium (also referred to as a processor-readable medium) includes any non-transitory medium that participates in providing data (e.g., instructions) that may be read by a computer (e.g., by a processor of a computer). Such a medium may take many forms, including, but not limited to, non-volatile media and/or volatile media. Non-volatile media may include, for example, optical or magnetic disks and other persistent memory. Volatile media may include, for example, dynamic random access memory (“DRAM”), which typically constitutes a main memory. Common forms of computer-readable media include, for example, a disk, a hard disk, magnetic tape, any other magnetic medium, a compact disc read-only memory (“CD-ROM”), a digital video disc (“DVD”), any other optical medium, random access memory (“RAM”), programmable read-only memory (“PROM”), erasable programmable read-only memory (“EPROM”), FLASH-EEPROM, any other memory chip or cartridge, or any other tangible medium from which a computer can read.

FIG. 16 illustrates an exemplary computing device 1600 that may be specifically configured to perform one or more of the processes described herein. As shown in FIG. 16, computing device 1600 may include a communication interface 1602, a processor 1604, a storage device 1606, and an input/output (“I/O”) module 1608 communicatively connected via a communication infrastructure 1610. While an exemplary computing device 1600 is shown in FIG. 16, the components illustrated in FIG. 16 are not intended to be limiting. Additional or alternative components may be used in other embodiments. The components of computing device 1600 shown in FIG. 16 are described in additional detail below.

Communication interface 1602 may be configured to communicate with one or more computing devices. Examples of communication interface 1602 include, without limitation, a wired network interface (such as a network interface card), a wireless network interface (such as a wireless network interface card), a modem, an audio/video connection, and any other suitable interface.

Processor 1604 generally represents any type or form of processing unit (e.g., a central processing unit and/or a graphics processing unit) capable of processing data or of interpreting, executing, and/or directing execution of one or more of the instructions, processes, and/or operations described herein. Processor 1604 may direct execution of operations in accordance with one or more applications 1612 or other computer-readable instructions, such as may be stored in storage device 1606 or another computer-readable medium.

Storage device 1606 may include one or more data storage media, devices, or configurations and may employ any type, form, and combination of data storage media and/or devices. For example, storage device 1606 may include, but is not limited to, a hard drive, a network drive, a flash drive, a magnetic disk, an optical disk, RAM, dynamic RAM, other non-volatile and/or volatile data storage units, or a combination or sub-combination thereof. Electronic data, including data described herein, may be temporarily and/or permanently stored in storage device 1606. For example, data representative of one or more executable applications 1612 configured to direct processor 1604 to perform any of the operations described herein may be stored within storage device 1606. In some examples, data may be arranged in one or more databases residing within storage device 1606.

I/O module 1608 may include one or more I/O modules configured to receive user input and provide user output. One or more I/O modules may be used to receive input for a single virtual reality experience. I/O module 1608 may include any hardware, firmware, software, or combination thereof supportive of input and output capabilities. For example, I/O module 1608 may include hardware and/or software for capturing user input, including, but not limited to, a keyboard or keypad, a touchscreen component (e.g., a touchscreen display), a receiver (e.g., an RF or infrared receiver), motion sensors, and/or one or more input buttons.

I/O module 1608 may include one or more devices for presenting output to a user, including, but not limited to, a graphics engine, a display (e.g., a display screen), one or more output drivers (e.g., display drivers), one or more audio speakers, and one or more audio drivers. In certain embodiments, I/O module 1608 is configured to provide graphical data to a display for presentation to a user. The graphical data may be representative of one or more graphical user interfaces and/or any other graphical content as may serve a particular implementation.

In some examples, any of the facilities described herein may be implemented by or within one or more components of computing device 1600. For example, one or more applications 1612 residing within storage device 1606 may be configured to direct processor 1604 to perform one or more operations or functions associated with communication facility 102, surface data frame sequence management facility 104, or virtualized projection generation facility 106 of system 100 (see FIG. 1). Likewise, storage facility 108 of system 100 may be implemented by or within storage device 1606.

To the extent the aforementioned embodiments collect, store, and/or employ personal information provided by individuals, it should be understood that such information shall be used in accordance with all applicable laws concerning protection of personal information. Additionally, the collection, storage, and use of such information may be subject to consent of the individual to such activity, for example, through well-known “opt-in” or “opt-out” processes as may be appropriate for the situation and type of information. Storage and use of personal information may be in an appropriately secure manner reflective of the type of information, for example, through various encryption and anonymization techniques for particularly sensitive information.

In the preceding description, various exemplary embodiments have been described with reference to the accompanying drawings. It will, however, be evident that various modifications and changes may be made thereto, and additional embodiments may be implemented, without departing from the scope of the invention as set forth in the claims that follow. For example, certain features of one embodiment described herein may be combined with or substituted for features of another embodiment described herein. The description and drawings are accordingly to be regarded in an illustrative rather than a restrictive sense.
In one aspect, the present invention includes the following inventions.
(Invention 1)
A method comprising:
receiving, by a virtualized projection generation system, a plurality of captured surface data frame sequences, each including color and depth frames that depict a real-world scene in accordance with a respective set of capture parameters included in a plurality of sets of capture parameters associated with different views of the real-world scene, each surface data frame sequence in the plurality of captured surface data frame sequences being captured by a different capture device in a plurality of capture devices disposed at different locations with respect to the real-world scene so as to capture the different views of the real-world scene;
identifying, by the virtualized projection generation system, an additional set of capture parameters that is distinct from the sets of capture parameters included in the plurality of sets of capture parameters and that is associated with a customized view of the real-world scene, the customized view being distinct from the different views of the real-world scene captured by the plurality of capture devices;
rendering, by the virtualized projection generation system, color and depth frames for a virtualized projection of the customized view of the real-world scene, based on at least one of the surface data frame sequences in the plurality of captured surface data frame sequences and on the additional set of capture parameters; and
providing, by the virtualized projection generation system to a media player device for inclusion within virtual reality media content, a virtualized surface data frame sequence that includes the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene.
(Invention 2)
The method of Invention 1, wherein:
the customized view of the real-world scene is unaligned with the different views of the real-world scene captured by the plurality of capture devices disposed at the different locations with respect to the real-world scene; and
the rendering of the color and depth frames for the virtualized projection of the customized view of the real-world scene includes rendering color and depth frames based on at least two of the surface data frame sequences in the plurality of captured surface data frame sequences.
(Invention 3)
The method of Invention 1, wherein each set of capture parameters included in the plurality of sets of capture parameters associated with the different views of the real-world scene includes at least one of:
a capture parameter representing a location, with respect to the real-world scene, from which color and depth frames corresponding to a particular view of the real-world scene are captured;
a capture parameter representing an orientation from which the color and depth frames corresponding to the particular view of the real-world scene are captured;
a capture parameter representing a field of view with which the color and depth frames corresponding to the particular view of the real-world scene are captured; and
a capture parameter representing an image quality with which the color and depth frames corresponding to the particular view of the real-world scene are captured.
(Invention 4)
The method of invention 1, wherein said additional set of capture parameters associated with said customized view of said real-world scene includes at least one of:
a capture parameter representing a customized field of view associated with the customized view of the real-world scene, the customized field of view being narrower than a captured field of view associated with at least one of the surface data frame sequences on which the rendering of the color and depth frames for the virtualized projection is based; and
a capture parameter representing a customized image quality associated with the customized view of the real-world scene, the customized image quality being lower than a captured image quality associated with at least one of the surface data frame sequences on which the rendering of the color and depth frames for the virtualized projection is based.
(Invention 5)
The method of Invention 1, wherein identifying said additional set of capture parameters includes:
analyzing the real-world scene with respect to one or more geometric properties of the real-world scene;
generating, based on the analysis of the real-world scene, a plurality of additional sets of capture parameters that are distinct from the sets of capture parameters included in the plurality of sets of capture parameters and that include the additional set of capture parameters; and
Identifying the additional set of capture parameters from the plurality of additional sets of capture parameters.
(Invention 6)
The method of Invention 1, wherein the virtualized surface data frame sequence provided for inclusion within the virtual reality media content is packaged such that the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene are included within a transport stream that does not include color and depth frames representing any additional surface data frame sequence other than the virtualized surface data frame sequence.
(Invention 7)
The method of Invention 1, wherein the virtualized surface data frame sequence provided for inclusion within the virtual reality media content is packaged such that the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene are included within a transport stream that further includes color and depth frames representing at least one additional surface data frame sequence other than the virtualized surface data frame sequence.
(Invention 8)
The method of Invention 1, wherein the virtualized surface data frame sequence provided for inclusion within the virtual reality media content is packaged such that each of the color and depth frames rendered for the virtualized projection of the customized view of the real-world scene is represented as a tile in a tiled video data stream, the tiled video data stream implementing a tile map that represents a plurality of tiles in each frame of the tiled video data stream.
(Invention 9)
The method of invention 1, embodied as computer-readable instructions on at least one non-transitory computer-readable medium.
(Invention 10)
A method comprising:
receiving, by a virtualized projection generation system in real time as an event occurs within a real-world scene, a plurality of captured surface data frame sequences, wherein each of the plurality of captured surface data frame sequences includes color and depth frames depicting the real-world scene in accordance with a respective set of capture parameters included in a first plurality of sets of capture parameters associated with different views of the real-world scene, and each surface data frame sequence in the plurality of captured surface data frame sequences is captured by a different capture device in a plurality of capture devices positioned at different locations with respect to the real-world scene so as to capture the different views of the real-world scene;
identifying, in real time as the event occurs within the real-world scene, a second plurality of sets of capture parameters, wherein the second plurality of sets of capture parameters are different from the sets of capture parameters included within the first plurality of sets of capture parameters, each set in the second plurality of sets of capture parameters is associated with a respective customized view of the real-world scene that differs from the different views of the real-world scene captured by the plurality of capture devices, and the second plurality of sets of capture parameters includes a greater number of sets than are included in the first plurality of sets of capture parameters;
rendering, by the virtualized projection generation system in real time as the event occurs within the real-world scene, color and depth frames for a virtualized projection of each customized view of the real-world scene, wherein the rendering is based on the plurality of captured surface data frame sequences and on the second plurality of sets of capture parameters; and
providing, by the virtualized projection generation system in real time as the event occurs within the real-world scene, a plurality of virtualized surface data frame sequences to a media player device for inclusion within virtual reality media content, wherein the plurality of virtualized surface data frame sequences include the color and depth frames rendered for the virtualized projection of each customized view of the real-world scene.
(Invention 11)
The method of invention 10, wherein:
a particular customized view of the real-world scene, associated with a particular set of capture parameters in the second plurality of sets of capture parameters, is misaligned with the different views of the real-world scene captured by the plurality of capture devices positioned at the different locations with respect to the real-world scene; and
the rendering of the color and depth frames for the virtualized projection of the particular customized view of the real-world scene includes rendering color and depth frames based on at least two of the surface data frame sequences in the plurality of captured surface data frame sequences.
(Invention 12)
The method of invention 10, embodied as computer-readable instructions on at least one non-transitory computer-readable medium.
(Invention 13)
A system comprising:
at least one physical computing device that performs the following:
receiving a plurality of captured surface data frame sequences, wherein each of the plurality of captured surface data frame sequences includes color and depth frames depicting a real-world scene in accordance with a respective set of capture parameters included in a plurality of sets of capture parameters associated with different views of the real-world scene, and each surface data frame sequence in the plurality of captured surface data frame sequences is captured by a different capture device in a plurality of capture devices positioned at different locations with respect to the real-world scene so as to capture the different views of the real-world scene;
identifying an additional set of capture parameters, wherein the additional set of capture parameters is different from the sets of capture parameters included within the plurality of sets of capture parameters and is associated with a customized view of the real-world scene, the customized view being different from the different views of the real-world scene captured by the plurality of capture devices;
rendering color and depth frames for a virtualized projection of the customized view of the real-world scene based on at least one of the surface data frame sequences in the plurality of captured surface data frame sequences and on the additional set of capture parameters; and
providing a virtualized surface data frame sequence to a media player device for inclusion within virtual reality media content, wherein the virtualized surface data frame sequence includes the color and depth frames rendered for the virtualized projection of the customized view of the real-world scene.
(Invention 14)
The system of invention 13, wherein:
the customized view of the real-world scene is misaligned with the different views of the real-world scene captured by the plurality of capture devices positioned at the different locations with respect to the real-world scene; and
the rendering of the color and depth frames for the virtualized projection of the customized view of the real-world scene includes rendering color and depth frames based on at least two of the surface data frame sequences in the plurality of captured surface data frame sequences.
(Invention 15)
The system of invention 13, wherein each set of capture parameters included within the plurality of sets of capture parameters associated with the different views of the real-world scene includes at least one of:
a capture parameter representing a location with respect to the real-world scene at which color and depth frames corresponding to a particular view of the real-world scene are captured;
a capture parameter representing an orientation in which the color and depth frames corresponding to the particular view of the real-world scene are captured;
a capture parameter representing a field of view in which the color and depth frames corresponding to the particular view of the real-world scene are captured; and
a capture parameter representing an image quality with which the color and depth frames corresponding to the particular view of the real-world scene are captured.
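As a non-limiting illustration, one set of capture parameters of the kind listed above might be represented as a small record. The sketch below is an assumption made for readability only: the class name `CaptureParameters`, the field names, the units, and the use of a pixel resolution to stand in for image quality do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CaptureParameters:
    """One set of capture parameters for a view of the scene (hypothetical layout)."""
    location: Tuple[float, float, float]     # position relative to the scene (assumed meters)
    orientation: Tuple[float, float, float]  # yaw, pitch, roll (assumed degrees)
    field_of_view_deg: float                 # horizontal field of view (assumed degrees)
    image_quality: Tuple[int, int]           # resolution as (width, height), an assumed proxy


# Two physical capture devices at different locations yield two different sets.
captured_views = [
    CaptureParameters((0.0, 1.5, -5.0), (0.0, 0.0, 0.0), 90.0, (1920, 1080)),
    CaptureParameters((5.0, 1.5, 0.0), (-90.0, 0.0, 0.0), 90.0, (1920, 1080)),
]
```

Making the record immutable lets each set be compared and used as a dictionary key, which is convenient when checking that a customized set differs from every captured set.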
(Invention 16)
The system of invention 13, wherein the additional set of capture parameters associated with the customized view of the real-world scene includes at least one of:
a capture parameter representing a customized field of view associated with the customized view of the real-world scene, wherein the customized field of view is narrower than the captured field of view associated with at least one of the surface data frame sequences on which the rendering of the color and depth frames for the virtualized projection is based; and
a capture parameter representing a customized image quality associated with the customized view of the real-world scene, wherein the customized image quality is lower than the captured image quality associated with at least one of the surface data frame sequences on which the rendering of the color and depth frames for the virtualized projection is based.
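As a non-limiting illustration, the two comparisons recited above (a customized field of view narrower than at least one captured field of view, and a customized image quality lower than at least one captured image quality) might be checked as follows. The function names are hypothetical, and measuring image quality by pixel count is an assumption of this sketch, not something the specification prescribes.

```python
from typing import Sequence, Tuple

Resolution = Tuple[int, int]  # (width, height); an assumed stand-in for image quality


def pixel_count(resolution: Resolution) -> int:
    """Total number of pixels for a (width, height) resolution."""
    return resolution[0] * resolution[1]


def has_narrower_fov(custom_fov_deg: float, source_fovs_deg: Sequence[float]) -> bool:
    """True if the customized field of view is narrower than at least one source's."""
    return any(custom_fov_deg < source for source in source_fovs_deg)


def has_lower_quality(custom: Resolution, sources: Sequence[Resolution]) -> bool:
    """True if the customized quality is lower than at least one source's."""
    return any(pixel_count(custom) < pixel_count(source) for source in sources)
```

For example, a 60-degree customized view rendered from 90-degree captures satisfies the first comparison, while a 1920x1080 customized view rendered from 1920x1080 captures does not satisfy the second.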
(Invention 17)
The system of invention 13, wherein said at least one physical computing device identifies said additional set of capture parameters by:
analyzing the real-world scene with respect to one or more geometric properties of the real-world scene;
generating a plurality of additional sets of capture parameters based on the analysis of the real-world scene, wherein the plurality of additional sets of capture parameters are different from the sets of capture parameters included within the plurality of sets of capture parameters and include the additional set of capture parameters; and
identifying the additional set of capture parameters from the plurality of additional sets of capture parameters.
(Invention 18)
The system of invention 13, wherein the virtualized surface data frame sequence provided for inclusion within the virtual reality media content is packaged such that the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene are included within a transport stream, the transport stream not including color and depth frames representing additional surface data frame sequences other than the virtualized surface data frame sequence.
(Invention 19)
The system of invention 13, wherein the virtualized surface data frame sequence provided for inclusion within the virtual reality media content is packaged such that the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene are included within a transport stream, the transport stream further including color and depth frames representing at least one additional surface data frame sequence other than the virtualized surface data frame sequence.
(Invention 20)
The system of invention 13, wherein the virtualized surface data frame sequence provided for inclusion within the virtual reality media content is packaged such that each of the color and depth frames rendered for the virtualized projection of the customized view of the real-world scene is represented as a tile in a tiled video data stream, the tiled video data stream implementing a tile map representing a plurality of tiles in each frame of the tiled video data stream.

Claims

1. A method comprising:
receiving, by a virtualized projection generation system, a plurality of captured surface data frame sequences, wherein each surface data frame sequence includes color and depth frames depicting a real-world scene in accordance with a respective set of capture parameters included in a plurality of sets of capture parameters associated with different views of the real-world scene, and each surface data frame sequence in the plurality of captured surface data frame sequences is captured by a different capture device in a plurality of capture devices positioned at different locations with respect to the real-world scene so as to capture the different views of the real-world scene;
identifying an additional set of capture parameters, wherein the additional set of capture parameters is different from the sets of capture parameters included within the plurality of sets of capture parameters and is associated with a customized view of the real-world scene, the customized view being different from the different views of the real-world scene captured by the plurality of capture devices, and wherein identifying the additional set of capture parameters includes:
analyzing the real-world scene with respect to one or more geometric properties of the real-world scene;
generating a plurality of additional sets of capture parameters based on the analysis of the real-world scene, wherein the plurality of additional sets of capture parameters are different from the sets of capture parameters included within the plurality of sets of capture parameters and include the additional set of capture parameters; and
identifying the additional set of capture parameters from the plurality of additional sets of capture parameters;
rendering, by the virtualized projection generation system, color and depth frames for a virtualized projection of the customized view of the real-world scene based on at least one of the surface data frame sequences in the plurality of captured surface data frame sequences and on the additional set of capture parameters; and
providing, by the virtualized projection generation system, a virtualized surface data frame sequence to a media player device for inclusion within virtual reality media content, wherein the virtualized surface data frame sequence includes the color and depth frames rendered for the virtualized projection of the customized view of the real-world scene.
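
As a non-limiting illustration, the three identification sub-steps of the method claim above (analyzing geometric properties, generating several candidate sets, and identifying one of them) might be sketched as follows. Everything specific here is an assumption: the scene is reduced to 2D footprint points, the only geometric properties used are a centroid and a bounding radius, candidates are placed on a circle, and the selection rule (farthest from every physical camera) is merely one plausible heuristic. None of the function names come from the specification.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def analyze_scene(points: List[Point]) -> Tuple[Point, float]:
    """Return the centroid and bounding radius of the scene footprint."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    radius = max(math.hypot(p[0] - cx, p[1] - cy) for p in points)
    return (cx, cy), radius


def generate_candidates(center: Point, radius: float, count: int,
                        margin: float = 2.0) -> List[Point]:
    """Place candidate viewpoints evenly on a circle just outside the scene."""
    r = radius + margin
    return [(center[0] + r * math.cos(2 * math.pi * i / count),
             center[1] + r * math.sin(2 * math.pi * i / count))
            for i in range(count)]


def identify_additional(candidates: List[Point], cameras: List[Point],
                        min_separation: float = 0.5) -> Point:
    """Pick the candidate whose nearest physical camera is farthest away."""
    def nearest_camera(c: Point) -> float:
        return min(math.hypot(c[0] - k[0], c[1] - k[1]) for k in cameras)

    best = max(candidates, key=nearest_camera)
    if nearest_camera(best) < min_separation:
        raise ValueError("no candidate is distinct from the captured views")
    return best


scene = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
center, radius = analyze_scene(scene)
candidates = generate_candidates(center, radius, count=8)
cameras = [candidates[0], candidates[2]]  # pretend two candidates coincide with cameras
chosen = identify_additional(candidates, cameras)
```

The separation check mirrors the claim's requirement that the additional set differ from every captured set; a real system would compare full parameter sets rather than 2D positions.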

2. The method of claim 1, wherein:
the customized view of the real-world scene is misaligned with the different views of the real-world scene captured by the plurality of capture devices positioned at the different locations with respect to the real-world scene; and
the rendering of the color and depth frames for the virtualized projection of the customized view of the real-world scene includes rendering color and depth frames based on at least two of the surface data frame sequences in the plurality of captured surface data frame sequences.
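
As a non-limiting illustration of rendering from at least two captured surface data frame sequences, the sketch below merges frames with a conventional depth test (the nearest surface wins at each pixel). It assumes, without support in the claim language, that a frame from each captured sequence has already been reprojected into the customized view by some step not shown, and that `None` marks pixels a given capture device could not see. The function name `merge_reprojected` is hypothetical.

```python
from typing import List, Optional, Tuple

Color = Tuple[int, int, int]
ColorFrame = List[List[Optional[Color]]]
DepthFrame = List[List[Optional[float]]]


def merge_reprojected(
    frames: List[Tuple[ColorFrame, DepthFrame]],
) -> Tuple[ColorFrame, DepthFrame]:
    """Combine reprojected color/depth frames, keeping the nearest surface per pixel."""
    height = len(frames[0][1])
    width = len(frames[0][1][0])
    out_color: ColorFrame = [[None] * width for _ in range(height)]
    out_depth: DepthFrame = [[None] * width for _ in range(height)]
    for color, depth in frames:
        for y in range(height):
            for x in range(width):
                d = depth[y][x]
                if d is None:
                    continue  # this source has no data for the pixel
                if out_depth[y][x] is None or d < out_depth[y][x]:
                    out_depth[y][x] = d
                    out_color[y][x] = color[y][x]
    return out_color, out_depth


# Source A sees only the left pixel; source B sees both and is nearer on the left.
frame_a = ([[(255, 0, 0), None]], [[2.0, None]])
frame_b = ([[(0, 0, 255), (0, 255, 0)]], [[1.0, 3.0]])
merged_color, merged_depth = merge_reprojected([frame_a, frame_b])
```

Using two or more sources in this way lets one capture device fill in regions that are occluded or out of frame for another, which is why a misaligned customized view generally needs more than one captured sequence.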

3. The method of claim 1, wherein each set of capture parameters included within the plurality of sets of capture parameters associated with the different views of the real-world scene includes at least one of:
a capture parameter representing a location with respect to the real-world scene at which color and depth frames corresponding to a particular view of the real-world scene are captured;
a capture parameter representing an orientation in which the color and depth frames corresponding to the particular view of the real-world scene are captured;
a capture parameter representing a field of view in which the color and depth frames corresponding to the particular view of the real-world scene are captured; and
a capture parameter representing an image quality with which the color and depth frames corresponding to the particular view of the real-world scene are captured.

4. The method of claim 1, wherein the additional set of capture parameters associated with the customized view of the real-world scene includes at least one of:
a capture parameter representing a customized field of view associated with the customized view of the real-world scene, wherein the customized field of view is narrower than the captured field of view associated with at least one of the surface data frame sequences on which the rendering of the color and depth frames for the virtualized projection is based; and
a capture parameter representing a customized image quality associated with the customized view of the real-world scene, wherein the customized image quality is lower than the captured image quality associated with at least one of the surface data frame sequences on which the rendering of the color and depth frames for the virtualized projection is based.

5. The method of claim 1, wherein the providing of the virtualized surface data frame sequence that includes the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene includes packaging a plurality of surface data frame sequences together into a common transport stream by using a tile mapping technique to pack frames from each of the plurality of surface data frame sequences together into a plurality of tiled frames of a single tiled frame sequence, the plurality of surface data frame sequences comprising:
the virtualized surface data frame sequence that includes the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene; and
at least one additional captured surface data frame sequence or virtualized surface data frame sequence other than the virtualized surface data frame sequence.
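
As a non-limiting illustration of the tile mapping technique recited above, the sketch below packs one frame from each of several surface data frame sequences into a single tiled frame and records a tile map giving each tile's position. The grid layout, the requirement that all frames share one size, the `(left, top, width, height)` tile map format, and the function name `pack_tiles` are assumptions of this sketch rather than details taken from the specification.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]           # 2D grid of pixel values
Tile = Tuple[int, int, int, int]  # (left, top, width, height) within the tiled frame


def pack_tiles(frames: Dict[str, Frame], columns: int) -> Tuple[Frame, Dict[str, Tile]]:
    """Pack equally sized frames into one tiled frame and return it with its tile map."""
    names = list(frames)
    tile_h = len(frames[names[0]])
    tile_w = len(frames[names[0]][0])
    rows = -(-len(names) // columns)  # ceiling division
    tiled: Frame = [[0] * (tile_w * columns) for _ in range(tile_h * rows)]
    tile_map: Dict[str, Tile] = {}
    for index, name in enumerate(names):
        frame = frames[name]
        if len(frame) != tile_h or any(len(row) != tile_w for row in frame):
            raise ValueError(f"frame {name!r} does not match the tile size")
        top = (index // columns) * tile_h
        left = (index % columns) * tile_w
        for y in range(tile_h):
            tiled[top + y][left:left + tile_w] = frame[y]
        tile_map[name] = (left, top, tile_w, tile_h)
    return tiled, tile_map


tiled_frame, tile_map = pack_tiles(
    {
        "virtualized_color": [[1, 1], [1, 1]],
        "virtualized_depth": [[2, 2], [2, 2]],
        "captured_color": [[3, 3], [3, 3]],
    },
    columns=2,
)
```

Repeating this for every frame time yields the single tiled frame sequence, and a receiver can use the tile map to cut each tile back out of the tiled frame.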

6. The method of claim 1, wherein the virtualized surface data frame sequence provided for inclusion within the virtual reality media content is packaged such that the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene are included within a transport stream, the transport stream not including color and depth frames representing additional surface data frame sequences other than the virtualized surface data frame sequence.

7. The method of claim 1, wherein the virtualized surface data frame sequence provided for inclusion within the virtual reality media content is packaged such that the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene are included within a transport stream, the transport stream further including color and depth frames representing at least one additional surface data frame sequence other than the virtualized surface data frame sequence.

8. The method of claim 1, wherein the virtualized surface data frame sequence provided for inclusion within the virtual reality media content is packaged such that each of the color and depth frames rendered for the virtualized projection of the customized view of the real-world scene is represented as a tile in a tiled video data stream, the tiled video data stream implementing a tile map representing a plurality of tiles in each frame of the tiled video data stream.

9. The method of claim 1, wherein the method is performed through computer execution of instructions residing on at least one non-transitory computer-readable medium.

10. A method comprising:
receiving, by a virtualized projection generation system in real time as an event occurs within a real-world scene, a plurality of captured surface data frame sequences, wherein each of the plurality of captured surface data frame sequences includes color and depth frames depicting the real-world scene in accordance with a respective set of capture parameters included in a first plurality of sets of capture parameters associated with different views of the real-world scene, and each surface data frame sequence in the plurality of captured surface data frame sequences is captured by a different capture device in a plurality of capture devices positioned at different locations with respect to the real-world scene so as to capture the different views of the real-world scene;
identifying, in real time as the event occurs within the real-world scene, a second plurality of sets of capture parameters, wherein the second plurality of sets of capture parameters are different from the sets of capture parameters included within the first plurality of sets of capture parameters, each set in the second plurality of sets of capture parameters is associated with a respective customized view of the real-world scene that differs from the different views of the real-world scene captured by the plurality of capture devices, and the second plurality of sets of capture parameters includes a greater number of sets than are included in the first plurality of sets of capture parameters;
rendering, by the virtualized projection generation system in real time as the event occurs within the real-world scene, color and depth frames for a virtualized projection of each customized view of the real-world scene, wherein the rendering is based on the plurality of captured surface data frame sequences and on the second plurality of sets of capture parameters; and
providing, by the virtualized projection generation system in real time as the event occurs within the real-world scene, a plurality of virtualized surface data frame sequences to a media player device for inclusion within virtual reality media content, wherein the plurality of virtualized surface data frame sequences include the color and depth frames rendered for the virtualized projection of each customized view of the real-world scene.

11. The method of claim 10, wherein:
a particular customized view of the real-world scene, associated with a particular set of capture parameters in the second plurality of sets of capture parameters, is misaligned with the different views of the real-world scene captured by the plurality of capture devices positioned at the different locations with respect to the real-world scene; and
the rendering of the color and depth frames for the virtualized projection of the particular customized view of the real-world scene includes rendering color and depth frames based on at least two of the surface data frame sequences in the plurality of captured surface data frame sequences.

12. The method of claim 10, wherein the method is performed through computer execution of instructions residing on at least one non-transitory computer-readable medium.

13. A system comprising:
at least one physical computing device that performs the following:
receiving a plurality of captured surface data frame sequences, wherein each of the plurality of captured surface data frame sequences includes color and depth frames depicting a real-world scene in accordance with a respective set of capture parameters included in a plurality of sets of capture parameters associated with different views of the real-world scene, and each surface data frame sequence in the plurality of captured surface data frame sequences is captured by a different capture device in a plurality of capture devices positioned at different locations with respect to the real-world scene so as to capture the different views of the real-world scene;
identifying an additional set of capture parameters, wherein the additional set of capture parameters is different from the sets of capture parameters included within the plurality of sets of capture parameters and is associated with a customized view of the real-world scene, the customized view being different from the different views of the real-world scene captured by the plurality of capture devices, and wherein the at least one physical computing device identifies the additional set of capture parameters by:
analyzing the real-world scene with respect to one or more geometric properties of the real-world scene;
generating a plurality of additional sets of capture parameters based on the analysis of the real-world scene, wherein the plurality of additional sets of capture parameters are different from the sets of capture parameters included within the plurality of sets of capture parameters and include the additional set of capture parameters; and
identifying the additional set of capture parameters from the plurality of additional sets of capture parameters;
rendering color and depth frames for a virtualized projection of the customized view of the real-world scene based on at least one of the surface data frame sequences in the plurality of captured surface data frame sequences and on the additional set of capture parameters; and
providing a virtualized surface data frame sequence to a media player device for inclusion within virtual reality media content, wherein the virtualized surface data frame sequence includes the color and depth frames rendered for the virtualized projection of the customized view of the real-world scene.

14. The system of claim 13, wherein:
the customized view of the real-world scene is misaligned with the different views of the real-world scene captured by the plurality of capture devices positioned at the different locations with respect to the real-world scene; and
the rendering of the color and depth frames for the virtualized projection of the customized view of the real-world scene includes rendering color and depth frames based on at least two of the surface data frame sequences in the plurality of captured surface data frame sequences.

15. The system of claim 13, wherein each set of capture parameters included within the plurality of sets of capture parameters associated with the different views of the real-world scene includes at least one of:
a capture parameter representing a location with respect to the real-world scene at which color and depth frames corresponding to a particular view of the real-world scene are captured;
a capture parameter representing an orientation in which the color and depth frames corresponding to the particular view of the real-world scene are captured;
a capture parameter representing a field of view in which the color and depth frames corresponding to the particular view of the real-world scene are captured; and
a capture parameter representing an image quality with which the color and depth frames corresponding to the particular view of the real-world scene are captured.

16. The system of claim 13, wherein the additional set of capture parameters associated with the customized view of the real-world scene includes at least one of:
a capture parameter representing a customized field of view associated with the customized view of the real-world scene, wherein the customized field of view is narrower than the captured field of view associated with at least one of the surface data frame sequences on which the rendering of the color and depth frames for the virtualized projection is based; and
a capture parameter representing a customized image quality associated with the customized view of the real-world scene, wherein the customized image quality is lower than the captured image quality associated with at least one of the surface data frame sequences on which the rendering of the color and depth frames for the virtualized projection is based.

17. The system of claim 13, wherein the at least one physical computing device provides the virtualized surface data frame sequence that includes the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene by packaging a plurality of surface data frame sequences together into a common transport stream, using a tile mapping technique to pack frames from each of the plurality of surface data frame sequences together into a plurality of tiled frames of a single tiled frame sequence.

18. The system of claim 13, wherein the virtualized surface data frame sequence provided for inclusion within the virtual reality media content is packaged such that the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene are included within a transport stream, the transport stream not including color and depth frames representing additional surface data frame sequences other than the virtualized surface data frame sequence.

19. The system of claim 13, wherein the virtualized surface data frame sequence provided for inclusion within the virtual reality media content is packaged such that the rendered color and depth frames for the virtualized projection of the customized view of the real-world scene are included within a transport stream, the transport stream further including color and depth frames representing at least one additional surface data frame sequence other than the virtualized surface data frame sequence.

20. The system of claim 13, wherein the virtualized surface data frame sequence provided for inclusion within the virtual reality media content is packaged such that each of the color and depth frames rendered for the virtualized projection of the customized view of the real-world scene is represented as a tile in a tiled video data stream, the tiled video data stream implementing a tile map representing a plurality of tiles in each frame of the tiled video data stream.