JP6672327B2

JP6672327B2 - Method and apparatus for reducing spherical video bandwidth to a user headset

Info

Publication number: JP6672327B2
Application number: JP2017550903A
Authority: JP
Inventors: Weaver, Joshua; Gefen, Noam; Bengali, Husain; Adams, Riley
Original assignee: Google LLC
Current assignee: Google LLC
Priority date: 2015-05-27
Filing date: 2016-05-27
Publication date: 2020-03-25
Anticipated expiration: 2036-05-27
Also published as: EP3304895A1; KR20170122791A; WO2016191702A1; CN107409203A; JP2018522430A; KR101969943B1

Description

CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Application Serial No. 62/167,121, filed May 27, 2015 and entitled "Method and Apparatus to Reduce Spherical Video Bandwidth to User Headset," which is incorporated herein by reference in its entirety.

FIELD
Embodiments relate to streaming spherical video.

BACKGROUND
Streaming spherical video (or other three-dimensional video) can consume a significant amount of system resources. For example, an encoded spherical video can include a large number of bits for transmission, which can consume a significant amount of bandwidth as well as processing and memory associated with encoders and decoders.

SUMMARY
Example embodiments describe systems and methods to optimize streaming video, streaming 3D video, and/or streaming spherical video.

In one general aspect, a method includes determining at least one preferred view perspective associated with a three-dimensional (3D) video, encoding a first portion of the 3D video corresponding to the at least one preferred view perspective at a first quality, and encoding a second portion of the 3D video at a second quality, the first quality being a higher quality than the second quality.

In another general aspect, a server and/or streaming server includes a controller configured to determine at least one preferred view perspective associated with a three-dimensional (3D) video, and an encoder configured to encode a first portion of the 3D video corresponding to the at least one preferred view perspective at a first quality and to encode a second portion of the 3D video at a second quality, the first quality being a higher quality than the second quality.

In still another general aspect, a method includes receiving a request for a streaming video, the request including an indication of a user view perspective associated with a three-dimensional (3D) video; determining whether the user view perspective is stored in a view perspective data store; upon determining that the user view perspective is stored in the view perspective data store, incrementing a ranking value associated with the user view perspective; and, upon determining that the user view perspective is not stored in the view perspective data store, adding the user view perspective to the view perspective data store and setting the ranking value associated with the user view perspective to one (1).

Implementations can include one or more of the following features. For example, the method (or implementation on a server) can further include storing the first portion of the 3D video in a data store, storing the second portion of the 3D video in the data store, receiving a request for a streaming video, and streaming the first portion of the 3D video and the second portion of the 3D video from the data store as the streaming video. The method (or implementation on a server) can further include receiving a request for a streaming video, the request including an indication of a user view perspective, selecting 3D video corresponding to the user view perspective as the encoded first portion of the 3D video, and streaming the selected first portion of the 3D video and the second portion of the 3D video as the streaming video.

The method (or implementation on a server) can further include receiving a request for a streaming video, the request including an indication of a user view perspective associated with the 3D video; determining whether the user view perspective is stored in a view perspective data store; upon determining that the user view perspective is stored in the view perspective data store, incrementing a counter associated with the user view perspective; and, upon determining that the user view perspective is not stored in the view perspective data store, adding the user view perspective to the view perspective data store and setting the counter associated with the user view perspective to one (1). In the method (or implementation on a server), encoding the second portion of the 3D video can include using at least one first Quality of Service (QoS) parameter in a first pass encoding operation, and encoding the first portion of the 3D video can include using at least one second QoS parameter in a second pass encoding operation.

For example, determining the at least one preferred view perspective associated with the 3D video can be based on at least one of a historically viewed reference point and a historically viewed view perspective. The at least one preferred view perspective associated with the 3D video can be based on at least one of an orientation of a viewer of the 3D video, a position of a viewer of the 3D video, a point of a viewer of the 3D video, and a focal point of a viewer of the 3D video. Determining the at least one preferred view perspective associated with the 3D video can be based on a default view perspective, the default view perspective being based on at least one of a characteristic of a user of a display device, a characteristic of a group associated with the user of the display device, a director's cut, and a characteristic of the 3D video. For example, the method (or implementation on a server) can further include iteratively encoding at least one portion of the second portion of the 3D video at the first quality, and streaming the at least one portion of the second portion of the 3D video.

Example embodiments will become more fully understood from the detailed description given herein below and the accompanying drawings, in which like elements are represented by like reference numerals. The drawings are given by way of illustration only and thus do not limit the example embodiments.
FIG. 1A illustrates a two-dimensional (2D) representation of a sphere according to at least one example embodiment.
FIG. 1B illustrates an unwrapped cylindrical representation of the 2D representation of the sphere as a 2D rectangular representation.
FIGS. 2 to 5 illustrate methods for encoding streaming spherical video according to at least one example embodiment.
FIG. 6A illustrates a video encoder system according to at least one example embodiment.
FIG. 6B illustrates a video decoder system according to at least one example embodiment.
FIG. 7A illustrates a flow diagram for a video encoder system according to at least one example embodiment.
FIG. 7B illustrates a flow diagram for a video decoder system according to at least one example embodiment.
FIG. 8 illustrates a system according to at least one example embodiment.
FIG. 9 is a schematic block diagram of a computer device and a mobile computer device that can be used to implement the techniques described herein.

It should be noted that these figures are intended to illustrate the general characteristics of methods, structure, and/or materials utilized in certain example embodiments and to supplement the written description provided below. These drawings are not, however, to scale and may not precisely reflect the structural or performance characteristics of any given embodiment, and should not be interpreted as defining or limiting the range of properties encompassed by the example embodiments. For example, the positioning of structural elements may be reduced or exaggerated for clarity. The use of similar or identical reference numbers in the various drawings is intended to indicate the presence of a similar or identical element or feature.

DETAILED DESCRIPTION OF EMBODIMENTS
While example embodiments may include various modifications and alternative forms, embodiments thereof are shown by way of example in the drawings and are described in detail herein. There is, however, no intent to limit the example embodiments to the particular forms disclosed. On the contrary, the example embodiments are to cover all modifications, equivalents, and alternatives falling within the scope of the claims. Like numbers refer to like elements throughout the description of the figures.

Example embodiments describe systems and methods configured to optimize streaming of video, streaming of 3D video, and streaming of spherical video (and/or other three-dimensional video) based on portions of the spherical video that are preferentially viewed by viewers of the video (e.g., a director's cut, historical viewings, and the like). For example, a director's cut can be a view perspective as selected by a director or maker of a video. The director's cut may be based on a view of the camera (or of a plurality of cameras) selected or viewed by the director or maker of the video as the video was recorded.

A spherical video, a frame of a spherical video, and/or a spherical image can have a perspective. For example, a spherical image could be an image of a globe. An internal perspective could be a view from the center of the globe looking outward, or a view from the surface of the globe looking out to space. An external perspective could be a view from space looking down toward the globe. As another example, a perspective can be based on the portion of the image that is visible. In other words, a visible perspective can be that which can be seen by a viewer, such as the portion of the spherical image that is in front of the viewer. For example, when viewing from an internal perspective, a viewer could be lying on the ground (e.g., the earth) and looking out to space. The viewer may see, in the image, the moon, the sun, or specific stars. However, although the ground the viewer is lying on is included in the spherical image, the ground is outside the current visible perspective. In this example, the viewer could turn her head and the ground would be included in a peripheral visible perspective. If the viewer turns to lie face down, the ground is within the visible perspective, whereas the moon, the sun, and the stars are not.

A visible perspective from an external perspective may be a portion of the spherical image that is not blocked (e.g., by another portion of the image) and/or a portion of the spherical image that has not curved out of view. Another portion of the spherical image may be brought into the visible perspective from the external perspective by moving (e.g., rotating) the spherical image and/or by movement of the spherical image. Therefore, the visible perspective is the portion of the spherical image that is within the visible range of a viewer of the spherical image.

A spherical image is an image that does not change with respect to time. For example, a spherical image from an internal perspective as relates to the earth may show the moon and the stars in one position. A spherical video (or sequence of images), by contrast, may change with respect to time. For example, a spherical video from an internal perspective as relates to the earth may show the moon and the stars moving (e.g., because of the earth's rotation) and/or an airplane trail streaking across the image (e.g., the sky).

FIG. 1A is a two-dimensional (2D) representation of a sphere. As shown in FIG. 1A, the sphere 100 (e.g., as a spherical image or a frame of a spherical video) illustrates the directions of internal perspectives 105 and 110, external perspective 115, and visible perspectives 120, 125, and 130. The visible perspective 120 may be a portion of the spherical image as viewed from internal perspective 110. The visible perspective 120 may be a portion of the sphere 100 as viewed from internal perspective 105. The visible perspective 125 may be a portion of the sphere 100 as viewed from external perspective 115.

FIG. 1B illustrates an unwrapped cylindrical representation 150 of the 2D representation of the sphere 100 as a 2D rectangular representation. An equirectangular projection of an image, shown as the unwrapped cylindrical representation 150, may appear as a stretched image as the image progresses vertically (up and down as shown in FIG. 1B) away from a mid line between points A and B. The 2D rectangular representation can be decomposed as a C×R matrix of N×N blocks. For example, as shown in FIG. 1B, the illustrated unwrapped cylindrical representation 150 is a 30×16 matrix of N×N blocks. However, other C×R dimensions are within the scope of this disclosure. The blocks may be 2×2, 2×4, 4×4, 4×8, 8×8, 8×16, 16×16, and similar blocks (or blocks of pixels).
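The decomposition described above can be sketched in code. The following is a minimal illustration, not the patented implementation: it maps a latitude/longitude position on the sphere to a (column, row) block index in the 30×16 grid of FIG. 1B. The function name `block_index` and the axis conventions (longitude −180° at the left edge, latitude +90° at the top edge) are assumptions made for this example.

```python
def block_index(lat_deg, lon_deg, cols=30, rows=16):
    """Map a point on the sphere to (column, row) in a cols x rows block grid.

    Assumed conventions (not from the source): longitude -180..180 runs left
    to right, latitude +90..-90 runs top to bottom of the equirectangular frame.
    """
    if not -90.0 <= lat_deg <= 90.0:
        raise ValueError("latitude must be within [-90, 90]")
    u = ((lon_deg + 180.0) % 360.0) / 360.0  # horizontal position in [0, 1)
    v = (90.0 - lat_deg) / 180.0             # vertical position in [0, 1]
    col = min(int(u * cols), cols - 1)       # clamp guards against rounding
    row = min(int(v * rows), rows - 1)       # clamp handles the south pole
    return col, row
```

Under these conventions, the center of the frame (latitude 0, longitude 0) falls in block column 15, row 8, and longitude wraps so that +180° and −180° land in the same column.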

A spherical image is an image that is continuous in all directions. Accordingly, if the spherical image were decomposed into a plurality of blocks, the plurality of blocks would be contiguous over the spherical image. In other words, there are no edges or boundaries as in a 2D image. In example implementations, an adjacent end block may be adjacent to a boundary of the 2D representation. In addition, an adjacent end block may be a block contiguous to a block on a boundary of the 2D representation. For example, the adjacent end block is associated with two or more boundaries of the two-dimensional representation. In other words, because a spherical image is continuous in all directions, an adjacent end can be associated with a top boundary and a bottom boundary (e.g., of a column of blocks) in an image or frame, and/or with a left boundary and a right boundary (e.g., of a row of blocks) in an image or frame.

For example, if an equirectangular projection is used, an adjacent end block may be the block on the other end of the column or row. For example, as shown in FIG. 1B, blocks 160 and 170 may be respective adjacent end blocks (by column) to each other. Further, blocks 180 and 185 may be respective adjacent end blocks (by column) to each other. Still further, blocks 165 and 175 may be respective adjacent end blocks (by row) to each other. A view perspective 192 may include (and/or overlap) at least one block. Blocks may be encoded as a region of the image, a region of the frame, a portion or subset of the image or frame, a group of blocks, and the like. Hereinafter, this group of blocks may be referred to as a tile or a group of tiles. For example, tiles 190 and 195 are illustrated in FIG. 1B as groups of four blocks. Tile 195 is illustrated as being within view perspective 192.
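The notion of an adjacent end block in an equirectangular layout can be illustrated with a short sketch. It assumes, as the passage above states, that the adjacent end block is the block at the other end of the same column or row. The function name `adjacent_end_blocks` and the (column, row) indexing are hypothetical and chosen only for this example.

```python
def adjacent_end_blocks(col, row, cols=30, rows=16):
    """Return the adjacent end blocks of a boundary block, keyed by axis.

    "column": block at the other end of the same column (top <-> bottom).
    "row":    block at the other end of the same row (left <-> right).
    Interior blocks touch no boundary and yield an empty dict.
    """
    result = {}
    if row in (0, rows - 1):
        result["column"] = (col, rows - 1 - row)
    if col in (0, cols - 1):
        result["row"] = (cols - 1 - col, row)
    return result
```

A corner block touches two boundaries, so it has both a by-column and a by-row adjacent end block, which matches the statement that an adjacent end block can be associated with two or more boundaries of the 2D representation.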

In example embodiments, in addition to streaming the frames of the encoded spherical video, a view perspective, as a tile (or group of tiles) selected based on at least one reference point frequently viewed by viewers (e.g., at least one historically viewed reference point or view perspective), can be encoded at, for example, a higher quality (e.g., higher resolution and/or less distortion) and streamed together with (or as a portion of) the encoded frame of the spherical video. Accordingly, during playback, the viewer can view the decoded tiles (at the higher quality) while the entire spherical video is being played back, and the entire spherical video remains available should the viewer's view perspective change to a view perspective frequently viewed by viewers. The viewer can also change a viewing position or switch to another view perspective. If the other view perspective is included in the at least one reference point frequently viewed by viewers, the played back video can be of a higher quality (e.g., higher resolution) than some other view perspective (e.g., one that is not among the at least one reference point frequently viewed by viewers). One advantage of encoding and streaming only a selected portion or subset of an image or frame at the higher quality is that the selected images or frames of the spherical video can be decoded and played back at the higher quality without necessarily encoding, streaming, and decoding the entire spherical video at the higher quality, which improves efficiency in bandwidth usage and in the processing and memory resources associated with the encoder and decoder.

In a head mount display (HMD), a viewer experiences a visual virtual reality through the use of a left (e.g., left eye) display and a right (e.g., right eye) display that project a perceived three-dimensional (3D) video or image. According to example embodiments, a spherical (e.g., 3D) video or image is stored on a server. The video or image can be encoded and streamed to the HMD from the server. The spherical video or image can be encoded as a left image and a right image. The left image and the right image are packaged (e.g., in a data packet) together with metadata about the left image and the right image. The left image and the right image are then decoded and displayed by the left (e.g., left eye) display and the right (e.g., right eye) display.

The systems and methods described herein are applicable to both the left image and the right image, which are referred to throughout this disclosure as an image, a frame, a portion of an image, a portion of a frame, a tile, and the like, depending on the use case. In other words, the encoded data that is communicated from a server (e.g., a streaming server) to a user device (e.g., an HMD) and then decoded for display can be a left image and/or a right image associated with a 3D video or image.

FIGS. 2 to 5 are flowcharts of methods according to example embodiments. The steps described with regard to FIGS. 2 to 5 may be performed through the execution of software code stored in a memory (e.g., at least one memory 610) associated with an apparatus (e.g., as shown in FIGS. 6A, 6B, 7A, 7B, and 8, described below) and executed by at least one processor (e.g., at least one processor 605) associated with the apparatus. However, alternative embodiments are contemplated, such as a system embodied as a special purpose processor. Although the steps described below are described as being executed by a processor, the steps are not necessarily executed by the same processor. In other words, at least one processor may execute the steps described below with regard to FIGS. 2 to 5.

FIG. 2 illustrates a method for storing a historical view perspective, where "historical" refers to a view perspective previously requested by a user. For example, FIG. 2 can illustrate building a database of view perspectives that are commonly viewed in a spherical video stream. As shown in FIG. 2, in step S205 an indication of a view perspective is received. For example, a tile can be requested by a device including a decoder. The tile request can include information based on a perspective or view perspective related to an orientation, a position, a point, or a focal point of a viewer on a spherical video. The perspective or view perspective can be a user view perspective, that is, the view perspective of a user of an HMD. For example, the view perspective (e.g., user view perspective) could be a latitude and longitude position on the spherical video (e.g., as an internal perspective or an external perspective). The view, perspective, or view perspective can be determined as a side of a cube based on the spherical video. The indication of a view perspective can also include spherical video information. In an example implementation, the indication of a view perspective can include information about a frame (e.g., a frame sequence) associated with the view perspective. For example, the view (e.g., a latitude and longitude position, or a side) can be communicated from (a controller associated with) a user device including an HMD to a streaming server using, for example, the Hypertext Transfer Protocol (HTTP).
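As a rough illustration of how an indication of a view perspective might be carried in an HTTP request, the sketch below encodes a latitude/longitude position and a frame sequence number into a URL query string and parses it back on the receiving side. The source does not specify a wire format, so the endpoint and the parameter names (`video`, `lat`, `lon`, `frame`) are purely hypothetical. Only Python's standard `urllib.parse` module is used.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_tile_request(base_url, video_id, lat_deg, lon_deg, frame_seq):
    """Build a request URL carrying a view perspective (hypothetical format)."""
    query = urlencode({
        "video": video_id,
        "lat": f"{lat_deg:.2f}",   # latitude of the view perspective
        "lon": f"{lon_deg:.2f}",   # longitude of the view perspective
        "frame": frame_seq,        # frame sequence associated with the view
    })
    return f"{base_url}?{query}"


def parse_tile_request(url):
    """Recover the view perspective indication on the streaming-server side."""
    params = parse_qs(urlsplit(url).query)
    return {
        "video": params["video"][0],
        "lat": float(params["lat"][0]),
        "lon": float(params["lon"][0]),
        "frame": int(params["frame"][0]),
    }
```

A client would call `build_tile_request` when the viewer's orientation changes, and the server would pass the parsed result to whatever logic handles step S205.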

In step S210, whether the view perspective (e.g., user view perspective) is stored in a view perspective data store is determined. For example, a data store (e.g., view perspective data store 815) can be queried or filtered based on information associated with the view perspective or user view perspective. For example, the data store may be queried or filtered based on the latitude and longitude position of the view perspective on the spherical video and on a timestamp in the spherical video at which the view perspective was viewed. The timestamp can be a time and/or a range of times associated with playback of the spherical video. The query or filter can be based on spatial proximity (e.g., how close the current view perspective is to a given stored view perspective) and/or temporal proximity (e.g., how close the current timestamp is to a given stored timestamp). If the query or filter returns a result, the view perspective is stored in the data store. Otherwise, the view perspective is not stored in the data store. In step S215, if the view perspective is stored in the view perspective data store, processing continues to step S220. Otherwise, processing continues to step S225.
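The proximity-based lookup of step S210 can be sketched as follows. This is one possible reading under stated assumptions: stored view perspectives are held in a simple list, spatial proximity is tested per axis in degrees (with longitude wrapping at ±180°), and the thresholds `max_angle_deg` and `max_time_s` are illustrative values that do not come from the source. The names `StoredPerspective` and `find_match` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StoredPerspective:
    lat_deg: float
    lon_deg: float
    timestamp_s: float  # playback time at which the perspective was viewed
    count: int = 1      # how many times it has been requested


def lon_distance(a_deg, b_deg):
    """Shortest angular distance between two longitudes, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def find_match(store, lat_deg, lon_deg, timestamp_s,
               max_angle_deg=10.0, max_time_s=2.0):
    """Return the first stored perspective close in space and time, else None."""
    for entry in store:
        near_in_space = (abs(entry.lat_deg - lat_deg) <= max_angle_deg
                         and lon_distance(entry.lon_deg, lon_deg) <= max_angle_deg)
        near_in_time = abs(entry.timestamp_s - timestamp_s) <= max_time_s
        if near_in_space and near_in_time:
            return entry
    return None
```

A non-`None` result corresponds to the "stored" branch of step S215, and `None` corresponds to the "not stored" branch.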

In step S220, a counter or ranking (or ranking value) associated with the received view perspective is incremented. For example, the data store may include a data table containing historical view perspectives (e.g., the data store may be a database including a plurality of data tables). The data table may be keyed on (e.g., unique to each) view perspective. The data table may include an identification of the view perspective, information associated with the view perspective, and a counter indicating how many times the view perspective has been requested. The counter may be incremented each time the view perspective is requested. The data stored in the data table may be anonymized. In other words, the data can be stored such that there is no reference to (or identification of) a user, a device, a session, and the like. As such, the data stored in the data table is indistinguishable based on the users or viewers of the video. In an example implementation, the data stored in the data table may be categorized based on the user without identifying the user. For example, the data may include an age, age range, sex, and type or role (e.g., musician or audience) of the user.

In step S225, the view perspective is added to the view perspective data store. For example, an identification of the view perspective, information associated with the view perspective, and a counter (or ranking value) set to one (1) may be stored in the data table containing historical view perspectives.
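Steps S210 through S225 can be illustrated with a minimal sketch. This is not the patented implementation: the `StoredView` record, the `record_view` function, and the proximity tolerances `max_angle` and `max_dt` are hypothetical names and values chosen only for illustration, and an in-memory list stands in for the view perspective data store.

```python
import math
from dataclasses import dataclass


@dataclass
class StoredView:
    lat: float        # latitude on the sphere, degrees
    lon: float        # longitude on the sphere, degrees
    timestamp: float  # playback time in seconds
    count: int = 1    # how many times this view perspective was requested


def angular_distance(lat1, lon1, lat2, lon2):
    """Great-circle angle in degrees between two points on the sphere."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(p1) * math.sin(p2)
                 + math.cos(p1) * math.cos(p2) * math.cos(dlon))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def record_view(store, lat, lon, timestamp, max_angle=5.0, max_dt=1.0):
    """S210/S215: search for a stored view close in space and time.
    S220: increment its counter if found.
    S225: otherwise add the view with its counter set to 1."""
    for view in store:
        if (angular_distance(lat, lon, view.lat, view.lon) <= max_angle
                and abs(timestamp - view.timestamp) <= max_dt):
            view.count += 1
            return view
    view = StoredView(lat, lon, timestamp)
    store.append(view)
    return view
```

No user, device, or session identifier is stored in the record, which matches the anonymization described for the data table.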

In an exemplary embodiment, tiles associated with the at least one preferred view perspective may be encoded with a higher QoS. The QoS may be an implementation of the quality described above (e.g., encoder variable inputs that define the quality). For example, an encoder (e.g., video encoder 625) can encode the tiles associated with the 3D video individually. Tiles associated with the at least one preferred view perspective may be encoded with a higher QoS than tiles associated with the remainder of the 3D video. In an exemplary implementation, the 3D video may be encoded using a first QoS parameter (e.g., in a first pass), or at least one first QoS parameter used in a first encoding pass. In addition, the tiles associated with the at least one preferred view perspective may be encoded using a second QoS parameter (e.g., in a second pass), or at least one second QoS parameter used in a second encoding pass. In this exemplary implementation, the second QoS is higher than the first QoS. In another exemplary implementation, the 3D video may be encoded as a plurality of tiles representing the 3D video. The tiles associated with the at least one preferred view perspective may be encoded using the second QoS parameter, and the remaining tiles may be encoded using the first QoS parameter.
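As a rough sketch of the per-tile QoS assignment described above, the following assumes that the QoS parameter is a quantization parameter (QP), where a lower QP means higher quality. The text does not say which encoder variable is used, so the QP choice, the function name `assign_tile_qp`, and the default values are assumptions made only for illustration.

```python
def assign_tile_qp(tile_ids, preferred_ids, base_qp=38, preferred_qp=24):
    """Return a {tile_id: qp} map.

    Tiles tied to a preferred view perspective get the second (higher-QoS)
    parameter; all remaining tiles get the first parameter. A lower QP is
    used here as a stand-in for a higher QoS.
    """
    preferred = set(preferred_ids)
    return {t: (preferred_qp if t in preferred else base_qp) for t in tile_ids}
```

For example, `assign_tile_qp(range(4), [1, 3])` gives tiles 1 and 3 the higher-quality setting and leaves tiles 0 and 2 at the baseline.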

In an alternative implementation (and/or an additional implementation), the encoder can project the tiles associated with the at least one preferred view perspective using a projection technique or algorithm different from the one used to generate the 2D representation of the remainder of the 3D video frame. Some projections have distortion in certain areas of the frame. Accordingly, projecting a tile differently from the spherical frame can improve the quality of the final image and/or use pixels more efficiently (e.g., with less computation or less strain on the user's eyes). In one exemplary implementation, the spherical image can be rotated before the tile is projected so that the tile is oriented to a position where, for the projection algorithm in use, distortion is minimal. In another exemplary implementation, the tile can use (and/or modify) a projection algorithm based on the position of the tile. For example, projecting the spherical video frame to the 2D representation can use an equirectangular projection, whereas projecting the spherical video frame to a representation that includes the portion to be selected as a tile can use a cube projection.
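The position-dependent distortion mentioned above can be made concrete for the equirectangular case. The sketch below is illustrative only, and the function names are hypothetical. It maps a latitude/longitude pair to pixel coordinates and computes the horizontal stretch factor of an equirectangular projection, which is 1/cos(latitude). The factor is 1 at the equator and grows toward the poles, which is why rotating the sphere so that a tile lands near the equator reduces distortion for this projection.

```python
import math


def equirect_pixel(lat, lon, width, height):
    """Map latitude/longitude (degrees) to equirectangular pixel coordinates."""
    x = (lon + 180.0) / 360.0 * width
    y = (90.0 - lat) / 180.0 * height
    return x, y


def horizontal_stretch(lat):
    """Horizontal stretch of an equirectangular projection at a latitude.

    Valid for latitudes strictly between -90 and 90 degrees.
    """
    return 1.0 / math.cos(math.radians(lat))
```

At a latitude of 60 degrees the stretch factor is about 2, so a tile centered there spends roughly twice as many pixels per unit of horizontal content as a tile at the equator.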

FIG. 3 illustrates a method for streaming 3D video. FIG. 3 describes a scenario in which streaming 3D video is encoded on demand, for example during a live streaming event. As shown in FIG. 3, in step S305, a request to stream 3D video is received. For example, a 3D video, a portion of a 3D video, or a tile that is available for streaming may be requested by a device that includes a decoder (e.g., via user interaction with a media application). The request may include information based on a perspective or view perspective relating to the viewer's orientation, position, point, or focus on the spherical video. The information based on the perspective or view perspective may be based on a current orientation or a default (e.g., initialization) orientation. The default orientation may be, for example, a director's cut for the 3D video.

In step S310, at least one preferred view perspective is determined. For example, a data store (e.g., view perspective data store 815) may be queried or filtered based on information associated with a view perspective. The data store may be queried or filtered based on the latitude and longitude position of the view perspective on the spherical video. In an exemplary implementation, the at least one preferred view perspective may be based on historical view perspectives. Accordingly, the data store may include a data table containing historical view perspectives. A preference may be indicated by how many times a view perspective has been requested. The query or filter may therefore include removing results that fall below a threshold counter value. In other words, the parameters set for the query of the data table containing historical view perspectives may include a value for the counter or ranking, such that the results of the query must be above the threshold for the counter. The results of the query of the data table containing historical view perspectives may be set as the at least one preferred view perspective.
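The threshold query in step S310 can be sketched as follows. The row layout (dictionaries with `lat`, `lon`, and `count` keys) and the function name `preferred_views` are assumptions for illustration. The sketch keeps rows strictly above the threshold; whether a count equal to the threshold qualifies is not specified in the text.

```python
def preferred_views(table, threshold):
    """Keep view perspectives requested more than `threshold` times,
    ordered from most to least requested."""
    kept = [row for row in table if row["count"] > threshold]
    return sorted(kept, key=lambda row: row["count"], reverse=True)
```

The returned rows would then be treated as the at least one preferred view perspective for the video.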

In addition, a default preferred view perspective (or a plurality of such view perspectives) may be associated with the 3D video. The default preferred view perspective may be a director's cut, a point of interest (e.g., the horizon, a moving object, a priority object), and the like. For example, the objective of a game may be to destroy an object (e.g., a building or a vehicle). That object may be labeled as a priority object, and a view perspective that includes the priority object may be indicated as a preferred view perspective. Default preferred view perspectives may be included in addition to, or instead of, historical view perspectives. The default orientation may be, for example, an initial set of preferred view perspectives based on, for example, an automatic computer vision algorithm (e.g., because no historical data exists when a video is first uploaded). The vision algorithm may determine, as preferred view perspectives, portions of the video that have motion or intricate detail, nearby objects in stereo from which to infer what might be interesting, and/or features that were present in the preferred views of other, earlier videos.

Other factors can be used in determining the at least one preferred view perspective. For example, the at least one preferred view perspective may be a historical view perspective that is within range of (e.g., close to) the current view perspective. For example, the at least one preferred view perspective may be a historical view perspective that is within range of (e.g., close to) the historical view perspectives of the current user, or of a group (type or category) to which the current user belongs. In other words, the at least one preferred view perspective may include view perspectives (or tiles) that are close in distance and/or close in time to stored historical view perspectives. The default preferred view perspectives may be stored in the data store 815 that contains the historical view perspectives, or in a separate (e.g., additional) data store that is not shown.

In step S315, the 3D video is encoded using at least one encoding parameter based on the at least one preferred view perspective. For example, the 3D video (or a portion thereof) may be encoded such that the portion that includes the at least one preferred view perspective is encoded differently from the remainder of the 3D video. Accordingly, the portion that includes the at least one preferred view perspective may be encoded with a higher QoS than the remainder of the 3D video. As a result, when rendered on an HMD, the portion that includes the at least one preferred view perspective may have a higher resolution than the remainder of the 3D video.

In step S320, the encoded 3D video is streamed. For example, the tiles may be included in a packet for transmission. The packet may include compressed video bits 10A. The packet may include the encoded 2D representation of the spherical video frame and the encoded tile (or plurality of tiles). The packet may include a header for transmission. The header may include, among other things, information indicating the mode or scheme used by the encoder in intra-frame coding. The header may include information indicating the parameters used to convert a frame of the spherical video into the 2D rectangular representation. The header may include information indicating the parameters used to achieve the QoS of the encoded 2D rectangular representation and of the encoded tile. As described above, the QoS of the tiles associated with the at least one preferred view perspective may differ from (e.g., be higher than) that of the tiles not associated with the at least one preferred view perspective.
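One way to picture the packet contents listed above is as a small record type. The sketch below is purely illustrative: the class names, field names, and example values are hypothetical, and the text does not define any wire format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StreamHeader:
    intra_mode: str   # mode or scheme used for intra-frame coding
    projection: str   # how the spherical frame was converted to a 2D rectangle
    frame_qos: int    # QoS parameter of the encoded 2D representation
    tile_qos: int     # QoS parameter of the encoded tile(s)


@dataclass
class StreamPacket:
    header: StreamHeader
    frame_2d: bytes                                   # encoded 2D representation
    tiles: List[bytes] = field(default_factory=list)  # encoded tile(s)


# Example packet: the tile QoS is higher than the QoS of the full frame.
packet = StreamPacket(
    header=StreamHeader("intra_dc", "equirectangular", frame_qos=1, tile_qos=2),
    frame_2d=b"\x00\x01",
    tiles=[b"\x02\x03"],
)
```

A decoder receiving such a packet could read the header first to learn which projection and QoS parameters apply before decoding the frame and tile payloads.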

Streaming the 3D video may be implemented through the use of priority stages. For example, in a first priority stage, video data encoded at a low (or minimum baseline) QoS may be streamed. This allows the user of the HMD to begin the virtual reality experience. Higher-QoS video can then be streamed to the HMD to replace the previously streamed video data encoded at the low (or minimum baseline) QoS (e.g., the data stored in buffer 830). As an example, in a second stage, higher-quality video or image data may be streamed based on the current view perspective. In a subsequent stage, higher-QoS video or image data may be streamed based on one or more preferred view perspectives. This may continue until the HMD buffer contains substantially only high-QoS video or image data. In addition, this staged streaming may loop with video or image data of progressively higher QoS. In other words, after a first iteration the HMD contains video or image data encoded at a first QoS, after a second iteration the HMD contains video or image data encoded at a second QoS, after a third iteration the HMD contains video or image data encoded at a third QoS, and so on. In an exemplary implementation, the second QoS is higher than the first QoS, the third QoS is higher than the second QoS, and so on.
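The staged replacement of buffered data can be simulated with a short sketch. The function name `stream_in_stages`, the integer QoS levels, and the choice to upgrade the remaining tiles last within each iteration are assumptions made for illustration and are not taken from the text.

```python
def stream_in_stages(tiles, current, preferred, qos_levels=(1, 2, 3)):
    """Return the buffer state ({tile: qos}) after each streaming stage.

    Stage 1 fills the buffer at the baseline QoS so playback can start.
    Each later iteration upgrades the current view first, then the
    preferred views, then everything else (assumed ordering).
    """
    buffer = {t: qos_levels[0] for t in tiles}
    history = [dict(buffer)]
    for qos in qos_levels[1:]:
        for group in (current, preferred, tiles):
            for t in group:
                buffer[t] = max(buffer[t], qos)  # replace lower-QoS data only
            history.append(dict(buffer))
    return history
```

With tiles `["a", "b", "c"]`, current view `["a"]`, and preferred view `["b"]`, the second snapshot shows only tile `a` upgraded, and the last snapshot shows every tile at the highest QoS level.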

The encoder 625 may operate offline as part of a setup procedure for making the spherical video available for streaming. Each of the plurality of tiles may be stored in view frame storage 795. Each of the plurality of tiles may be indexed such that it is stored with reference to a frame (e.g., a time dependence) and with reference to a view (e.g., a view dependence). Accordingly, each of the plurality of tiles is dependent on time and on a view, perspective, or view perspective, and can be recalled based on the time dependence and the view dependence.

Accordingly, in an exemplary implementation, the encoder 625 may be configured to execute a loop in which a frame is selected and a portion of the frame is selected as a tile based on a view perspective. The tile is then encoded and stored. The loop continues to cycle through a plurality of view perspectives. Once a desired number of view perspectives (e.g., every 5 degrees around the vertical and every 5 degrees around the horizontal of the spherical image) have been saved as tiles, a new frame is selected, and the process repeats until all frames of the spherical video have the desired number of tiles saved for them. In an exemplary embodiment, tiles associated with the at least one preferred view perspective may be encoded with a higher QoS than tiles that are not associated with the at least one preferred view perspective. This is only one exemplary implementation for encoding and saving tiles.
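The encoder loop described above can be sketched as two nested loops, one over frames and one over view perspectives sampled every 5 degrees. The names `view_grid` and `encode_all`, the `encode_tile` callback, and the integer QoS values are hypothetical; a real encoder would replace the callback with actual tile extraction and compression.

```python
def view_grid(step_deg=5):
    """View perspective centers every `step_deg` degrees of latitude and longitude."""
    return [(lat, lon)
            for lat in range(-90, 91, step_deg)
            for lon in range(-180, 180, step_deg)]


def encode_all(frames, preferred, encode_tile, step_deg=5, base_qos=1, high_qos=2):
    """For each frame, cycle through the view perspectives, encode one tile
    per view, and store it keyed by (frame index, view)."""
    stored = {}
    for frame_index, frame in enumerate(frames):
        for view in view_grid(step_deg):
            qos = high_qos if view in preferred else base_qos
            stored[(frame_index, view)] = encode_tile(frame, view, qos)
    return stored
```

With a 5-degree step this grid holds 37 x 72 = 2664 view perspectives per frame, which gives a sense of how much storage the offline approach trades for lower streaming latency.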

FIG. 4 illustrates a method for storing encoded 3D video. FIG. 4 describes a scenario in which streaming 3D video is encoded in advance and stored for future streaming. As shown in FIG. 4, in step S405, at least one preferred view perspective for the 3D video is determined. For example, a data store (e.g., view perspective data store 815) may be queried or filtered based on information associated with a view perspective. The data store may be queried or filtered based on the latitude and longitude position of the view perspective on the spherical video. In an exemplary implementation, the at least one preferred view perspective may be based on historical view perspectives. Accordingly, the data table contains historical view perspectives. A preference may be indicated by how many times a view perspective has been requested. The query or filter may therefore include removing results that fall below a threshold counter value. In other words, the parameters set for the query of the data table containing historical view perspectives may include a value for the counter, such that the results of the query must be above the threshold for the counter. The results of the query of the data table containing historical view perspectives may be set as the at least one preferred view perspective.

In addition, a default preferred view perspective (or a plurality of such view perspectives) may be associated with the 3D video. The default preferred view perspective may be a director's cut, a point of interest (e.g., the horizon, a moving object, a priority object), and the like. For example, the objective of a game may be to destroy an object (e.g., a building or a vehicle). That object may be labeled as a priority object, and a view perspective that includes the priority object may be indicated as a preferred view perspective. Default preferred view perspectives may be included in addition to, or instead of, historical view perspectives. Other factors can be used in determining the at least one preferred view perspective. For example, the at least one preferred view perspective may be a historical view perspective that is within range of (e.g., close to) the current view perspective. For example, the at least one preferred view perspective may be a historical view perspective that is within range of (e.g., close to) the historical view perspectives of the current user, or of a group (type or category) to which the current user belongs. The default preferred view perspectives may be stored in the data store that contains the historical view perspectives, or in a separate (e.g., additional) data table.

In step S410, the 3D video is encoded using at least one encoding parameter based on the at least one preferred view perspective. For example, a frame of the 3D video may be selected, and a portion of the frame may be selected as a tile based on a view perspective. The tile is then encoded. In an exemplary embodiment, tiles associated with the at least one preferred view perspective may be encoded with a higher QoS than tiles associated with the remainder of the 3D video.

In an alternative implementation (and/or an additional implementation), the encoder can project the tiles associated with the at least one preferred view perspective using a projection technique or algorithm different from the one used to generate the 2D representation of the remainder of the 3D video frame. Some projections have distortion in certain areas of the frame. Accordingly, projecting a tile differently from the spherical frame can improve the quality of the final image and/or use pixels more efficiently. In one exemplary implementation, the spherical image can be rotated before the tile is projected so that the tile is oriented to a position where, for the projection algorithm in use, distortion is minimal. In another exemplary implementation, the tile can use (and/or modify) a projection algorithm based on the position of the tile. For example, projecting the spherical video frame to the 2D representation can use an equirectangular projection, whereas projecting the spherical video frame to a representation that includes the portion to be selected as a tile can use a cube projection.

In step S415, the encoded 3D video is stored. For example, each of the plurality of tiles may be stored in view frame storage 795. Each of the plurality of tiles associated with the 3D video may be indexed such that it is stored with reference to a frame (e.g., a time dependence) and with reference to a view (e.g., a view dependence). Accordingly, each of the plurality of tiles is dependent on time and on a view, perspective, or view perspective, and can be recalled based on the time dependence and the view dependence.
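The time-and-view indexing of stored tiles can be sketched as a dictionary keyed by frame index and a quantized view direction. The `TileStore` class, its method names, and the 5-degree snapping are assumptions for illustration and do not describe the actual layout of view frame storage 795.

```python
class TileStore:
    """Stores encoded tiles keyed by frame (time) and view direction."""

    def __init__(self, step_deg=5):
        self.step = step_deg
        self.tiles = {}

    def _key(self, frame_index, lat, lon):
        # Snap the requested orientation to the nearest stored grid point.
        def snap(angle):
            return int(round(angle / self.step)) * self.step
        return (frame_index, snap(lat), snap(lon))

    def put(self, frame_index, lat, lon, data):
        self.tiles[self._key(frame_index, lat, lon)] = data

    def get(self, frame_index, lat, lon):
        """Recall a tile by time dependence (frame) and view dependence."""
        return self.tiles.get(self._key(frame_index, lat, lon))
```

A request for frame 3 at latitude 11.2 and longitude 43.9 would be served by the tile stored for frame 3 at (10, 45), while the same view in a different frame is a separate entry.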

In an exemplary implementation, the 3D video (e.g., the tiles associated with it) may be encoded using variable encoding parameters and stored. Accordingly, the 3D video may be stored in different encoded states. These states may vary based on QoS. For example, the 3D video may be stored as a plurality of tiles each encoded with the same QoS. For example, the 3D video may be stored as a plurality of tiles each encoded with a different QoS. For example, the 3D video may be stored as a plurality of tiles, some of which are encoded with a QoS based on the at least one preferred view perspective.

FIG. 5 illustrates a method for determining preferred view perspectives for a 3D video. The preferred view perspectives for the 3D video may be in addition to preferred view perspectives based on historical viewing of the 3D video. As shown in FIG. 5, in step S505, at least one default view perspective is determined. For example, default preferred view perspectives may be stored in a data table included in a data store (e.g., view perspective data store 815). The data store may be queried or filtered based on a default indication for the 3D video. If the query or filter returns a result, the 3D video has an associated default view perspective; if it returns no result, the 3D video has no associated default view perspective. The default preferred view perspective may be a director's cut, a point of interest (e.g., the horizon, a moving object, a priority object), and the like. For example, the objective of a game may be to destroy an object (e.g., a building or a vehicle). That object may be labeled as a priority object, and a view perspective that includes the priority object may be indicated as a preferred view perspective.

In step S510, at least one view perspective based on user characteristics, preferences, or categories is determined. For example, a user of an HMD may have characteristics based on previous use of the HMD. These characteristics may be based on statistical viewing preferences (e.g., a preference for looking at nearby objects rather than distant ones). For example, a user of an HMD may have stored user preferences associated with the HMD. These preferences may be selected by the user as part of a setup process. The preferences may be general (e.g., being drawn to motion) or specific to a video (e.g., tending to focus on the guitarist in a musical performance). For example, a user of an HMD may belong to a group or category (e.g., males aged 15 to 22). For example, the user characteristics, preferences, or categories may be stored in a data table included in a data store (e.g., view perspective data store 815). The data store may be queried or filtered based on a default indication for the 3D video. If the query or filter returns a result, the 3D video has at least one associated preferred view perspective based on the characteristics, preferences, or categories associated with the user; if it returns no result, the 3D video has no associated view perspective based on the user.

In step S515, at least one view perspective based on a region of interest is determined. For example, the region of interest may be the current view perspective. For example, the at least one preferred view perspective may be a historical view perspective that is within range of (e.g., close to) the current view perspective. For example, the at least one preferred view perspective may be a historical view perspective that is within range of (e.g., close to) the historical view perspectives of the current user, or of a group (type or category) to which the current user belongs.

In step S520, at least one view perspective based on at least one system characteristic is determined. For example, the HMD may have features that can enhance the user experience. One such feature may be enhanced audio. In a virtual reality environment, the user may therefore be drawn to particular sounds (e.g., a game user may be drawn to the sound of an explosion). The preferred view perspectives may be based on view perspectives that include these audible cues. In step S525, at least one preferred view perspective for the 3D video is determined based on each of the foregoing view perspective determinations and/or on combinations or sub-combinations of them. For example, the at least one preferred view perspective may be generated by merging or joining the results of the foregoing queries.
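The merge in step S525 can be sketched as an order-preserving union of the results from steps S505 through S520. The function name `combine_preferred_views` and the representation of a view perspective as a `(lat, lon)` tuple are assumptions for illustration; the text leaves open whether results are merged, joined, or weighted.

```python
def combine_preferred_views(*query_results):
    """Union the view perspectives returned by each query, dropping
    duplicates while keeping the order in which they were first seen."""
    combined = []
    seen = set()
    for result in query_results:
        for view in result:
            if view not in seen:
                seen.add(view)
                combined.append(view)
    return combined
```

Passing the default, user-based, region-of-interest, and system-based results in that order keeps default view perspectives at the front of the combined list.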

In the example of FIG. 6A, the video encoder system 600 may be, or may include, at least one computing device, and may represent virtually any computing device configured to perform the methods described herein. Accordingly, the video encoder system 600 may include various components that can be used to implement the techniques described herein, or different or future versions of them. By way of example, the video encoder system 600 is illustrated as including at least one processor 605 and at least one memory 610 (e.g., a non-transitory computer-readable storage medium).

FIG. 6A illustrates a video encoder system according to at least one exemplary embodiment. As shown in FIG. 6A, the video encoder system 600 includes at least one processor 605, at least one memory 610, a controller 620, and a video encoder 625. The at least one processor 605, the at least one memory 610, the controller 620, and the video encoder 625 are communicatively coupled via a bus 615.

The at least one processor 605 may be used to execute instructions stored on the at least one memory 610, thereby implementing the various features and functions described herein, or additional or alternative features and functions. The at least one processor 605 and the at least one memory 610 may be used for various other purposes. In particular, the at least one memory 610 may represent an example of the various types of memory, and the related hardware and software, that can be used to implement any one of the modules described herein.

The at least one memory 610 may be configured to store data and/or information associated with video encoder system 600. For example, the at least one memory 610 may be configured to store a codec associated with encoding spherical video. For example, the at least one memory may be configured to store code associated with selecting a portion of a frame of the spherical video as a tile to be encoded separately from the encoding of the spherical video. The at least one memory 610 may be a shared resource. As described in more detail below, a tile may be a plurality of pixels selected based on a viewer's view perspective during playback on a spherical video viewer (e.g., an HMD). The plurality of pixels may be a block, a plurality of blocks, or a macroblock that may include a portion of the spherical image that can be seen by the user. For example, video encoder system 600 may be an element of a larger system (e.g., a server, a personal computer, a mobile device, and the like). Accordingly, the at least one memory 610 may be configured to store data and/or information associated with other elements within the larger system (e.g., image/video serving, web browsing, or wired/wireless communication).

Controller 620 may be configured to generate various control signals and communicate the control signals to various blocks in video encoder system 600. Controller 620 may be configured to generate the control signals in order to implement the techniques described below. According to example embodiments, controller 620 may be configured to control video encoder 625 to encode an image, a sequence of images, a video frame, a video sequence, and the like. For example, controller 620 may generate control signals corresponding to parameters for encoding spherical video. Further details regarding the functions and operation of video encoder 625 and controller 620 are described below in connection with at least FIGS. 7A, 4A, 5A, 5B, and 6-9.

Video encoder 625 may be configured to receive a video stream input 5 and output compressed (e.g., encoded) video bits 10. Video encoder 625 may convert the video stream input 5 into discrete video frames. The video stream input 5 may also be an image; accordingly, the compressed (e.g., encoded) video bits 10 may also be compressed image bits. Video encoder 625 may further convert each discrete video frame (or image) into a matrix of blocks (hereinafter referred to as blocks). For example, a video frame (or image) may be converted to a 16×16, 16×8, 8×8, 8×4, 4×4, 4×2, 2×2, or similar matrix of blocks, each block having a number of pixels. Although these example matrices are listed, example embodiments are not limited thereto.
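As a rough illustration of the block conversion described above, the following Python sketch splits a frame into a matrix of blocks. It is only a sketch under stated assumptions: the function name `split_into_blocks` is hypothetical, the frame is modeled as a list of pixel rows, and the listed sizes are read as block dimensions in pixels, which is one possible reading of the text.

```python
def split_into_blocks(frame, block_h=16, block_w=16):
    """Split a 2D frame (a list of pixel rows) into a matrix of blocks.

    Assumption: block_h and block_w are block dimensions in pixels.
    Edge blocks are smaller when the frame size is not a multiple
    of the block size.
    """
    height = len(frame)
    width = len(frame[0]) if height else 0
    blocks = []
    for top in range(0, height, block_h):
        row_of_blocks = []
        for left in range(0, width, block_w):
            # Slice block_h rows, then block_w columns from each row.
            block = [row[left:left + block_w] for row in frame[top:top + block_h]]
            row_of_blocks.append(block)
        blocks.append(row_of_blocks)
    return blocks


# Hypothetical 64x32 frame with a simple gradient pattern.
frame = [[(x + y) % 256 for x in range(64)] for y in range(32)]
blocks = split_into_blocks(frame)
```

With a 64×32 frame and 16×16 blocks, the result is a matrix of 2 rows by 4 columns of blocks.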

The compressed video bits 10 may represent the output of video encoder system 600. For example, the compressed video bits 10 may represent an encoded video frame (or an encoded image). For example, the compressed video bits 10 may be ready for transmission to a receiving device (not shown). For example, the video bits may be sent to a system transceiver (not shown) for transmission to the receiving device.

The at least one processor 605 may be configured to execute computer instructions associated with controller 620 and/or video encoder 625. The at least one processor 605 may be a shared resource. For example, video encoder system 600 may be an element of a larger system (e.g., a mobile device). Accordingly, the at least one processor 605 may be configured to execute computer instructions associated with other elements within the larger system (e.g., image/video serving, web browsing, or wired/wireless communication).

In the example of FIG. 6B, video decoder system 650 may be at least one computing device and may represent virtually any computing device configured to perform the methods described herein. As such, video decoder system 650 may include various components that may be utilized to implement the techniques described herein, or different or future versions thereof. By way of example, video decoder system 650 is illustrated as including at least one processor 655 and at least one memory 660 (e.g., a computer-readable storage medium).

Thus, the at least one processor 655 may be utilized to execute instructions stored on the at least one memory 660, thereby implementing the various features and functions described herein, or additional or alternative features and functions. The at least one processor 655 and the at least one memory 660 may be utilized for various other purposes. In particular, the at least one memory 660 may represent an example of various types of memory and related hardware and software that may be used to implement any one of the modules described herein. According to example embodiments, video encoder system 600 and video decoder system 650 may be included in the same larger system (e.g., a personal computer, a mobile device, and the like). According to example embodiments, video decoder system 650 may be configured to implement techniques that are the inverse or opposite of those described with regard to video encoder system 600.

The at least one memory 660 may be configured to store data and/or information associated with video decoder system 650. For example, the at least one memory 660 may be configured to store a codec associated with decoding encoded spherical video data. For example, the at least one memory may be configured to store code associated with decoding an encoded tile and a separately encoded spherical video frame, as well as code for replacing pixels in the decoded spherical video frame with the decoded tile. The at least one memory 660 may be a shared resource. For example, video decoder system 650 may be an element of a larger system (e.g., a personal computer, a mobile device, and the like). Accordingly, the at least one memory 660 may be configured to store data and/or information associated with other elements within the larger system (e.g., web browsing or wireless communication).

Controller 670 may be configured to generate various control signals and communicate the control signals to various blocks in video decoder system 650. Controller 670 may be configured to generate the control signals in order to implement the video decoding techniques described below. According to example embodiments, controller 670 may be configured to control video decoder 675 to decode a video frame. Controller 670 may be configured to generate control signals corresponding to decoding video. Further details regarding the functions and operation of video decoder 675 and controller 670 are described below.

Video decoder 675 may be configured to receive the compressed (e.g., encoded) video bits 10 as input and output a video stream 5. Video decoder 675 may convert discrete video frames of the compressed video bits 10 into the video stream 5. The compressed (e.g., encoded) video bits 10 may also be compressed image bits; accordingly, the video stream 5 may also be an image.

The at least one processor 655 may be configured to execute computer instructions associated with controller 670 and/or video decoder 675. The at least one processor 655 may be a shared resource. For example, video decoder system 650 may be an element of a larger system (e.g., a personal computer, a mobile device, and the like). Accordingly, the at least one processor 655 may be configured to execute computer instructions associated with other elements within the larger system (e.g., web browsing or wireless communication).

FIGS. 7A and 7B illustrate flow diagrams for the video encoder 625 shown in FIG. 6A and the video decoder 675 shown in FIG. 6B, respectively, according to at least one example embodiment. Video encoder 625 (described above) includes a spherical-to-2D representation block 705, a prediction block 710, a transform block 715, a quantization block 720, an entropy encoding block 725, an inverse quantization block 730, an inverse transform block 735, a reconstruction block 740, a loop filter block 745, a tile representation block 790, and a view frame storage 795. Other structural variations of video encoder 625 may be used to encode the input video stream 5. As shown in FIG. 7A, dashed lines represent a reconstruction path among several blocks and solid lines represent a forward path among several blocks.

Each of the foregoing blocks may be executed as software code that is stored in a memory (e.g., at least one memory 610) associated with a video encoder system (e.g., as shown in FIG. 6A) and executed by at least one processor (e.g., at least one processor 605) associated with the video encoder system. However, alternative embodiments are contemplated, such as a video encoder embodied as a special-purpose processor. For example, each of the foregoing blocks (alone and/or in combination) may be an application-specific integrated circuit, or ASIC. For example, an ASIC may be configured as the transform block 715 and/or the quantization block 720.

The spherical-to-2D representation block 705 may be configured to map a spherical frame or image to a 2D representation of the spherical frame or image. For example, the sphere may be projected onto the surface of another shape (e.g., a square, a rectangle, a cylinder, and/or a cube). The projection may be, for example, equirectangular or semi-equirectangular.
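An equirectangular mapping of this kind can be sketched as follows. This is a minimal sketch, not the mapping used by block 705: the function names, the angle conventions (yaw in [-π, π), pitch in [-π/2, π/2]), and the choice of pixel centers for the inverse direction are all assumptions made for illustration.

```python
import math


def sphere_to_equirect(yaw, pitch, width, height):
    """Map a viewing direction on the sphere to (x, y) pixel coordinates.

    Assumed conventions: yaw in [-pi, pi) grows to the right,
    pitch in [-pi/2, pi/2] grows upward, and y = 0 is the top row.
    """
    u = (yaw + math.pi) / (2 * math.pi)   # 0..1 across the image
    v = (math.pi / 2 - pitch) / math.pi   # 0..1 down the image
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return x, y


def equirect_to_sphere(x, y, width, height):
    """Inverse mapping: the direction through the center of pixel (x, y)."""
    yaw = (x + 0.5) / width * 2 * math.pi - math.pi
    pitch = math.pi / 2 - (y + 0.5) / height * math.pi
    return yaw, pitch


# Looking straight ahead lands in the middle of a hypothetical 3840x1920 frame.
center = sphere_to_equirect(0.0, 0.0, 3840, 1920)
```

The inverse function corresponds to the 2D-to-sphere direction that a decoder-side mapping would need.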

The prediction block 710 may be configured to utilize video frame coherence (e.g., pixels that have not changed as compared to previously encoded pixels). Prediction may include two types. For example, prediction may include intra-frame prediction and inter-frame prediction. Intra-frame prediction relates to predicting the pixel values in a block of an image relative to reference samples in neighboring, previously coded blocks of the same image. In intra-frame prediction, a sample is predicted from reconstructed pixels within the same frame in order to reduce the residual that is coded by the transform (e.g., transform block 715) and entropy coding (e.g., entropy encoding block 725) parts of a predictive transform codec. Inter-frame prediction relates to predicting the pixel values in a block of an image relative to data of a previously coded image.
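To make the idea of a residual concrete, the sketch below uses DC intra prediction, one simple and widely used intra mode, chosen here only for illustration; the text does not say which modes prediction block 710 uses. The function names and the fallback value of 128 are assumptions.

```python
def dc_intra_predict(top_row, left_column, size):
    """Predict every pixel of a size x size block as the mean of the
    already reconstructed neighbors above and to the left.

    Falls back to mid-gray (128) when no neighbors exist (assumption).
    """
    neighbours = list(top_row) + list(left_column)
    dc = round(sum(neighbours) / len(neighbours)) if neighbours else 128
    return [[dc] * size for _ in range(size)]


def residual(block, prediction):
    """Residual = original block minus prediction, element by element."""
    return [
        [b - p for b, p in zip(block_row, pred_row)]
        for block_row, pred_row in zip(block, prediction)
    ]


# Hypothetical neighbor samples; their mean is 102.
prediction = dc_intra_predict([100, 102, 104, 106], [98, 100, 102, 104], 4)
original = [[103] * 4 for _ in range(4)]
res = residual(original, prediction)
```

When the prediction is close to the original, the residual values are small, which is what makes the later transform and entropy coding stages effective.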

The transform block 715 may be configured to convert the values of pixels from the spatial domain to transform coefficients in a transform domain. The transform coefficients may correspond to a two-dimensional matrix of coefficients that is ordinarily the same size as the original block. In other words, there may be as many transform coefficients as pixels in the original block. However, due to the transform, a portion of the transform coefficients may have values equal to zero.

The transform block 715 may be configured to transform the residual (from the prediction block 710) into transform coefficients in, for example, the frequency domain. Typically, transforms include the Karhunen-Loève Transform (KLT), the Discrete Cosine Transform (DCT), the Singular Value Decomposition Transform (SVD), and the asymmetric discrete sine transform (ADST).
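Of the transforms listed, the DCT is the easiest to sketch. The following is a plain, unoptimized orthonormal 2D DCT-II in Python, offered as an illustration rather than as the transform block's implementation; the function names are hypothetical.

```python
import math


def dct_1d(values):
    """Orthonormal 1D DCT-II of a list of numbers."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(
            v * math.cos(math.pi * (i + 0.5) * k / n)
            for i, v in enumerate(values)
        )
        out.append(scale * total)
    return out


def dct_2d(block):
    """Separable 2D DCT: transform each row, then each column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # cols is indexed [column][frequency]; transpose back to [frequency][column].
    return [list(row) for row in zip(*cols)]


# A flat 4x4 block: all energy ends up in the single DC coefficient.
flat_block = [[10.0] * 4 for _ in range(4)]
coefficients = dct_2d(flat_block)
```

For the flat block, only the top-left coefficient is non-zero, which illustrates the earlier remark that a portion of the transform coefficients may equal zero. The output has the same dimensions as the input block.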

The quantization block 720 may be configured to reduce the data in each transform coefficient. Quantization may involve mapping values within a relatively large range to values in a relatively small range, thus reducing the amount of data needed to represent the quantized transform coefficients. The quantization block 720 may convert the transform coefficients into discrete quantum values, which are referred to as quantized transform coefficients or quantization levels. For example, the quantization block 720 may be configured to add zeros to the data associated with a transform coefficient. For example, an encoding standard may define 128 quantization levels in a scalar quantization process.
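A uniform scalar quantizer is one simple way to realize this mapping from a large range to a small one. The sketch below assumes a single step size for the whole block and hypothetical function names; real codecs typically vary the step per coefficient.

```python
def quantize(coefficients, step):
    """Map each transform coefficient to an integer quantization level."""
    return [[int(round(c / step)) for c in row] for row in coefficients]


def dequantize(levels, step):
    """Approximate inverse: scale each level back by the step size."""
    return [[level * step for level in row] for row in levels]


# Hypothetical coefficients and step size.
coefficients = [[40.0, 3.2], [-7.9, 0.4]]
levels = quantize(coefficients, step=4)
restored = dequantize(levels, step=4)
```

Quantization is lossy: `restored` is close to, but not equal to, `coefficients`, and small coefficients such as 0.4 collapse to zero.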

The quantized transform coefficients are then entropy encoded by the entropy encoding block 725. The entropy-encoded coefficients, together with the information required to decode the block, such as the type of prediction used, motion vectors, and the quantizer value, are then output as the compressed video bits 10. The compressed video bits 10 can be formatted using various techniques, such as run-length encoding (RLE) and zero-run coding.
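Zero-run coding exploits the long runs of zeros that quantization tends to produce. The sketch below shows one possible pair-based form, assuming a flattened list of levels; the function names and the convention of carrying the total length separately are illustrative choices, and the entropy coding stage itself is not shown.

```python
def zero_run_encode(levels):
    """Encode a flat list of levels as (zeros_before, nonzero_value) pairs.

    Trailing zeros are implied by the returned total length.
    """
    pairs = []
    run = 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs, len(levels)


def zero_run_decode(pairs, length):
    """Rebuild the flat list of levels from the pairs and total length."""
    levels = []
    for run, level in pairs:
        levels.extend([0] * run)
        levels.append(level)
    levels.extend([0] * (length - len(levels)))  # restore trailing zeros
    return levels


flat_levels = [10, 0, 0, 1, -2, 0, 0, 0]
pairs, length = zero_run_encode(flat_levels)
decoded = zero_run_decode(pairs, length)
```

Eight levels are represented by three pairs here, and decoding recovers the input exactly.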

The reconstruction path in FIG. 7A is present to ensure that both the video encoder 625 and the video decoder 675 (described below with regard to FIG. 7B) use the same reference frames to decode the compressed video bits 10 (or compressed image bits). The reconstruction path performs functions that are similar to functions that take place during the decoding process, which are described in more detail below. These functions include dequantizing the quantized transform coefficients at the inverse quantization block 730 and inverse transforming the dequantized transform coefficients at the inverse transform block 735 in order to produce a derivative residual block (derivative residual). At the reconstruction block 740, the prediction block that was predicted at the prediction block 710 can be added to the derivative residual to create a reconstructed block. A loop filter 745 can then be applied to the reconstructed block to reduce distortion such as blocking artifacts.
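The final addition step of the reconstruction path can be sketched in a few lines. The clamping to an 8-bit sample range and the function name are assumptions made for illustration; the loop filter is not modeled.

```python
def reconstruct_block(prediction, derivative_residual, max_value=255):
    """Add the derivative residual to the prediction, clamping each
    sample to the valid range [0, max_value] (8-bit range assumed)."""
    return [
        [max(0, min(max_value, p + r)) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, derivative_residual)
    ]


# Hypothetical 2x2 prediction and residual.
prediction = [[102, 102], [102, 102]]
derivative_residual = [[0, 4], [-4, 200]]
reconstructed = reconstruct_block(prediction, derivative_residual)
```

Because the encoder runs the same step on the same quantized data as the decoder, both sides arrive at identical reference samples.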

The tile representation block 790 can be configured to convert an image and/or a frame into a plurality of tiles. A tile can be a grouping of pixels. A tile may be a plurality of pixels selected based on a view or view perspective. The plurality of pixels may be a block, a plurality of blocks, or a macroblock that can include a portion of the spherical image that can be seen (or is anticipated to be seen) by a user. The portion of the spherical image (as the tile) may have a length and a width. The portion of the spherical image may be two-dimensional or substantially two-dimensional. A tile can have a variable size (e.g., how much of the sphere the tile covers). For example, the size of the tile can be encoded and streamed based on, for example, how wide the viewer's field of view is, the proximity to another tile, and/or how quickly the user is rotating their head. For example, if the viewer is continually looking around, then larger, lower-quality tiles may be selected. However, if the viewer is focusing on one perspective, smaller, more detailed tiles may be selected.
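The trade-off between tile size and quality can be illustrated with a small selection rule. Everything specific in this sketch is hypothetical: the function name, the 60 degrees-per-second threshold, and the doubling of coverage are placeholder values, not values from the text.

```python
def select_tile(fov_degrees, head_speed_deg_per_s, fast_threshold=60.0):
    """Pick a tile coverage and quality label from viewer behavior.

    A fast-moving head gets a larger, lower-quality tile; a steady
    gaze gets a smaller, more detailed one.
    """
    if head_speed_deg_per_s >= fast_threshold:
        coverage = min(360.0, fov_degrees * 2.0)
        return {"coverage_degrees": coverage, "quality": "low"}
    return {"coverage_degrees": fov_degrees, "quality": "high"}


looking_around = select_tile(fov_degrees=90.0, head_speed_deg_per_s=120.0)
focused = select_tile(fov_degrees=90.0, head_speed_deg_per_s=5.0)
```

A fuller version might also weigh proximity to neighboring tiles, which the text names as another input.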

In one implementation, the tile representation block 790 initiates an instruction to the spherical-to-2D representation block 705 that causes the spherical-to-2D representation block 705 to generate tiles. In another implementation, the tile representation block 790 generates the tiles. In either implementation, each tile is then individually encoded. In yet another implementation, the tile representation block 790 initiates an instruction to the view frame storage 795 that causes the view frame storage 795 to store encoded images and/or video frames as tiles. The tile representation block 790 can initiate an instruction to the view frame storage 795 that causes the view frame storage 795 to store the tile together with information or metadata about the tile. For example, the information or metadata about the tile may include an indication of the tile's position within the image or frame, information associated with encoding the tile (e.g., resolution, bandwidth, and/or a 3D-to-2D projection algorithm), an association with one or more regions of interest, and the like.

According to an example implementation, the encoder 625 may encode a frame, a portion of a frame, and/or a tile at different qualities (or Quality of Service (QoS)). According to example embodiments, the encoder 625 may encode a frame, a portion of a frame, and/or a tile a plurality of times, each at a different QoS. Accordingly, the view frame storage 795 can store a frame, a portion of a frame, and/or a tile representing the same position within an image or frame at different QoS. As such, the aforementioned information or metadata about the tile may include an indication of the QoS at which the frame, the portion of the frame, and/or the tile was encoded.
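One way to picture a store that holds the same tile position at several QoS levels is a dictionary keyed by frame and position, with one entry per QoS. The class below is a hypothetical sketch, not the structure of view frame storage 795; the class name, method names, and string QoS labels are assumptions.

```python
class ViewFrameStore:
    """Keeps encoded tiles, allowing several QoS variants per position."""

    def __init__(self):
        # (frame_index, tile_position) -> {qos_label: encoded_bytes}
        self._variants = {}

    def put(self, frame_index, tile_position, qos, encoded_bytes):
        key = (frame_index, tile_position)
        self._variants.setdefault(key, {})[qos] = encoded_bytes

    def available_qos(self, frame_index, tile_position):
        """Return the QoS labels stored for this position, sorted by name."""
        return sorted(self._variants.get((frame_index, tile_position), {}))

    def get(self, frame_index, tile_position, qos):
        return self._variants[(frame_index, tile_position)][qos]


store = ViewFrameStore()
store.put(0, (3, 1), "low", b"\x01\x02")
store.put(0, (3, 1), "high", b"\x01\x02\x03\x04")
```

Storing the QoS label alongside the tile mirrors the metadata described above, so a server can pick a variant without decoding anything.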

The QoS can be based on a compression algorithm, a resolution, a transmission rate, and/or an encoding scheme. Accordingly, the encoder 625 may use a different compression algorithm and/or encoding scheme for each frame, portion of a frame, and/or tile. For example, an encoded tile may be at a higher QoS than the frame (associated with the tile) encoded by the encoder 625. As discussed above, the encoder 625 may be configured to encode a 2D representation of a spherical video frame. Accordingly, the tile (as a viewable perspective including a portion of the spherical video frame) can be encoded with a higher QoS than the 2D representation of the spherical video frame. The QoS may affect the resolution of the frame when decoded. Accordingly, the tile (as a viewable perspective including a portion of the spherical video frame) can be encoded such that the tile, when decoded, has a higher resolution than the decoded 2D representation of the spherical video frame. The tile representation block 790 may indicate a QoS at which the tile should be encoded. The tile representation block 790 may select the QoS based on whether or not the frame, portion of the frame, and/or tile is a region of interest, is within a region of interest, is associated with a seed region, and the like. Regions of interest and seed regions are described in more detail below.

The video encoder 625 described above with regard to FIG. 7A includes the blocks shown. However, example embodiments are not limited thereto. Additional blocks may be added based on the different video encoding configurations and/or techniques used. Further, each of the blocks shown in the video encoder 625 described above with regard to FIG. 7A may be an optional block based on the different video encoding configurations and/or techniques used.

FIG. 7B is a schematic block diagram of a decoder 675 configured to decode the compressed video bits 10 (or compressed image bits). The decoder 675, similar to the reconstruction path of the encoder 625 discussed previously, includes an entropy decoding block 750, an inverse quantization block 755, an inverse transform block 760, a reconstruction block 765, a loop filter block 770, a prediction block 775, a deblocking filter block 780, and a 2D representation-to-spherical block 785.

The data elements within the compressed video bits 10 can be decoded by the entropy decoding block 750 (using, for example, Context Adaptive Binary Arithmetic Decoding) to produce a set of quantized transform coefficients. The inverse quantization block 755 dequantizes the quantized transform coefficients, and the inverse transform block 760 inverse transforms (using ADST) the dequantized transform coefficients to produce a derivative residual that can be identical to that created by the reconstruction stage in the encoder 625.

Using header information decoded from the compressed video bits 10, the decoder 675 can use the prediction block 775 to create the same prediction block as was created in the encoder 625. The prediction block can be added to the derivative residual to create a reconstructed block by the reconstruction block 765. The loop filter block 770 can be applied to the reconstructed block to reduce blocking artifacts. The deblocking filter block 780 can be applied to the reconstructed block to reduce blocking distortion. The result is output as the video stream 5.

The 2D representation-to-spherical block 785 may be configured to map a 2D representation of a spherical frame or image to a spherical frame or image. For example, mapping the 2D representation of the spherical frame or image to the spherical frame or image can be the inverse of the 3D-to-2D mapping performed by the encoder 625.

The video decoder 675 described above with regard to FIG. 7B includes the blocks shown. However, example embodiments are not limited thereto. Additional blocks may be added based on the different video encoding configurations and/or techniques used. Further, each of the blocks shown in the video decoder 675 described above with regard to FIG. 7B may be an optional block based on the different video encoding configurations and/or techniques used.

The encoder 625 and the decoder 675 may be configured to encode spherical video and/or images and to decode spherical video and/or images, respectively. A spherical image is an image that includes a plurality of pixels spherically organized. In other words, a spherical image is an image that is continuous in all directions. Accordingly, a viewer of a spherical image can reposition or reorient (e.g., move their head or eyes) in any direction (e.g., up, down, left, right, or any combination thereof) and continuously see a portion of the image.

In an example implementation, parameters used in and/or determined by the encoder 625 can be used by other elements of the encoder 405. For example, motion vectors (e.g., as used in prediction) used to encode the 2D representation could be used to encode the tile. Further, parameters used in and/or determined by the prediction block 710, the transform block 715, the quantization block 720, the entropy encoding block 725, the inverse quantization block 730, the inverse transform block 735, the reconstruction block 740, and the loop filter block 745 could be shared between the encoder 625 and the encoder 405.

A portion of the spherical video frame or image may be processed as an image. Accordingly, the portion of the spherical video frame may be converted (or decomposed) into a C×R matrix of blocks (hereinafter referred to as blocks). For example, the portion of the spherical video frame may be converted to a C×R matrix such as a 16×16, 16×8, 8×8, 8×4, 4×4, 4×2, 2×2, or similar matrix of blocks, each block having a number of pixels.

FIG. 8 illustrates a system 800 according to at least one example embodiment. As shown in FIG. 8, the system 800 includes the controller 620, the controller 670, the video encoder 625, the view frame storage 795, and an orientation sensor 835. The controller 620 further includes a view position control module 805, a tile control module 810, and a view perspective data store 815. The controller 670 further includes a view position determination module 820, a tile request module 825, and a buffer 830.

According to an example implementation, the orientation sensor 835 detects an orientation (or change in orientation) of the viewer's eyes (or head), the view position determination module 820 determines a view, perspective, or view perspective based on the detected orientation, and the tile request module 825 communicates the view, perspective, or view perspective as part of a request for a tile or a plurality of tiles (in addition to the spherical video). According to another example implementation, the orientation sensor 835 detects an orientation (or change in orientation) based on an image panning orientation as rendered on an HMD or a display. For example, a user of the HMD may change a depth of focus. In other words, the user of the HMD may change his or her focus from an object that was far away to an object that is close (or vice versa), with or without a change in orientation. For example, a user may use a mouse, a trackpad, or a gesture (e.g., on a touch-sensitive display) to select, move, drag, expand, and/or the like a portion of the spherical video or image as rendered on the display.

The request for the tile may be communicated together with a request for a frame of the spherical video. Alternatively, the request for the tile may be communicated separately from the request for a frame of the spherical video. For example, the request for the tile may be made in response to a changed view, perspective, or view perspective, resulting in a need to replace previously requested and/or queued tiles.
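One way to picture the replacement of previously requested or queued tiles is the sketch below. It is an illustration under stated assumptions rather than the described system: the TileRequest fields, the TileRequestQueue class, and the rule that a newer request for the same frame supersedes the queued one are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass(frozen=True)
class TileRequest:
    """Hypothetical request record: a frame index plus the view it was made for."""
    frame: int
    latitude: float
    longitude: float


class TileRequestQueue:
    """Queue of pending tile requests (hypothetical helper, not from the source)."""

    def __init__(self) -> None:
        self._pending: Deque[TileRequest] = deque()

    def request(self, req: TileRequest) -> None:
        # Assumed policy: a request for a frame that is already queued was made
        # for a stale view, so it is dropped and replaced by the new request.
        self._pending = deque(r for r in self._pending if r.frame != req.frame)
        self._pending.append(req)

    def pending(self) -> List[TileRequest]:
        return list(self._pending)


queue = TileRequestQueue()
queue.request(TileRequest(frame=10, latitude=0.0, longitude=0.0))
queue.request(TileRequest(frame=11, latitude=0.0, longitude=0.0))
# The viewer turns before frame 10 is served, so its queued tile is replaced.
queue.request(TileRequest(frame=10, latitude=5.0, longitude=30.0))
```

After the third call the queue holds the request for frame 11 followed by the updated request for frame 10.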

The view position control module 805 receives and processes the request for the tile. For example, the view position control module 805 may determine a frame and a position of the tile or plurality of tiles in the frame based on the view. The view position control module 805 may then instruct the tile control module 810 to select the tile or plurality of tiles. Selecting the tile or plurality of tiles may include passing a parameter to the video encoder 625. The parameter may be used by the video encoder 625 during the encoding of the spherical video and/or tile. Alternatively, selecting the tile or plurality of tiles may include selecting the tile or plurality of tiles from the view frame storage 795.

Accordingly, the tile control module 810 may be configured to select a tile (or plurality of tiles) based on a view, perspective, or view perspective of a user watching the spherical video. The tile may be a plurality of pixels selected based on the view. The plurality of pixels may be a block, a plurality of blocks, or a macroblock that may include a portion of the spherical image that can be seen by the user. The portion of the spherical image may have a length and a width. The portion of the spherical image may be two-dimensional or substantially two-dimensional. The tile may have a variable size (e.g., how much of the sphere the tile covers). For example, the size of the tile may be encoded and streamed based on, for example, how wide the viewer's field of view is and/or how quickly the user is rotating his or her head. For example, if the viewer is continually looking around, then larger, lower quality tiles may be selected. However, if the viewer is focusing on one perspective, smaller, more detailed tiles may be selected.
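The trade-off between tile size and quality can be sketched as a small selection rule. This is a hedged illustration only: the speed threshold, the widening margin, the "low"/"high" quality labels, and the names TileChoice and choose_tile are assumptions introduced here, not values from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileChoice:
    coverage_deg: float  # horizontal extent of the sphere covered by the tile
    quality: str         # "low" or "high" (hypothetical labels)


def choose_tile(fov_deg: float,
                head_speed_deg_per_s: float,
                fast_threshold: float = 60.0,   # assumed cut-off for "looking around"
                margin: float = 1.5) -> TileChoice:
    """Pick a tile size and quality from field of view and head rotation speed."""
    if head_speed_deg_per_s >= fast_threshold:
        # Viewer is looking around: cover more of the sphere at lower quality.
        return TileChoice(min(360.0, fov_deg * margin), "low")
    # Viewer is focused on one perspective: smaller, more detailed tile.
    return TileChoice(min(360.0, fov_deg), "high")


looking_around = choose_tile(fov_deg=90.0, head_speed_deg_per_s=120.0)
focused = choose_tile(fov_deg=90.0, head_speed_deg_per_s=5.0)
```

With these assumed defaults, a fast-moving viewer gets a tile wider than the field of view at low quality, while a steady viewer gets a tile matching the field of view at high quality.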

Accordingly, the orientation sensor 835 may be configured to detect an orientation (or change in orientation) of the viewer's eyes (or head). For example, the orientation sensor 835 may include an accelerometer to detect movement and a gyroscope to detect orientation. Alternatively, or in addition, the orientation sensor 835 may include a camera or an infrared sensor focused on the eyes or head of the viewer in order to determine the orientation of the viewer's eyes or head. Alternatively, or in addition, the orientation sensor 835 may determine a portion of the spherical video or image as rendered on the display in order to detect an orientation of the spherical video or image. The orientation sensor 835 may be configured to communicate orientation and change-in-orientation information to the view position determination module 820.

The view position determination module 820 may be configured to determine a view or perspective view (e.g., a portion of the spherical video that the viewer is currently looking at) in relation to the spherical video. The view, perspective, or view perspective may be determined as a position, a point, or a focal point on the spherical video. For example, the view may be a latitude and longitude position on the spherical video. The view, perspective, or view perspective may also be determined as a side of a cube based on the spherical video. The view (e.g., the latitude and longitude position, or the side) may be communicated to the view position control module 805 using, for example, the Hypertext Transfer Protocol (HTTP).
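The two view representations mentioned above (a latitude and longitude position, or a side of a cube) can be related with a short sketch. It assumes a particular axis convention (latitude 0, longitude 0 faces the "front" side; positive longitude turns toward "right"), and the side names, the functions cube_side and view_query, and the query parameter names lat, lon, and side are hypothetical choices made for illustration.

```python
import math
from urllib.parse import urlencode


def cube_side(lat_deg: float, lon_deg: float) -> str:
    """Return the cube side that a latitude/longitude view direction points at."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Unit view vector under the assumed convention: x forward, y right, z up.
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    # The side is chosen by the axis with the largest absolute component.
    candidates = [
        (abs(x), "front" if x >= 0 else "back"),
        (abs(y), "right" if y >= 0 else "left"),
        (abs(z), "top" if z >= 0 else "bottom"),
    ]
    return max(candidates, key=lambda item: item[0])[1]


def view_query(lat_deg: float, lon_deg: float) -> str:
    """Encode the view as an HTTP query string (parameter names are assumed)."""
    return urlencode({"lat": lat_deg, "lon": lon_deg,
                      "side": cube_side(lat_deg, lon_deg)})


query = view_query(0.0, 90.0)
```

A query string of this kind could be appended to a tile request URL; the URL itself is left out because the description does not specify one.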

The view position control module 805 may be configured to determine a view position (e.g., a frame and a position within the frame) of a tile or plurality of tiles within the spherical video. For example, the view position control module 805 may select a rectangle centered on the view position, point, or focal point (e.g., the latitude and longitude position, or the side). The tile control module 810 may be configured to select the rectangle as a tile or plurality of tiles. The tile control module 810 may be configured to instruct the video encoder 625 (e.g., via a parameter or configuration setting) to encode the selected tile or plurality of tiles. In addition, or alternatively, the tile control module 810 may be configured to select the tile or plurality of tiles from the view frame storage 795.
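Selecting a rectangle centered on the view position can be sketched as below. The sketch assumes the frame is stored as an equirectangular 2D image (longitude mapped to x, latitude mapped to y), wraps the rectangle horizontally, and clamps it vertically; these choices and the function name tile_rect are assumptions for illustration.

```python
from typing import Tuple


def tile_rect(lat_deg: float, lon_deg: float,
              frame_w: int, frame_h: int,
              tile_w: int, tile_h: int) -> Tuple[int, int, int, int]:
    """Return (left, top, width, height) of a tile centered on the view position.

    Assumes an equirectangular frame: longitude -180..180 maps to x 0..frame_w
    and latitude 90..-90 maps to y 0..frame_h.
    """
    center_x = (lon_deg + 180.0) / 360.0 * frame_w
    center_y = (90.0 - lat_deg) / 180.0 * frame_h
    # Wrap horizontally, because the left and right frame edges meet on the sphere.
    left = int(round(center_x - tile_w / 2)) % frame_w
    # Clamp vertically so the rectangle stays inside the frame near the poles.
    top = min(max(int(round(center_y - tile_h / 2)), 0), frame_h - tile_h)
    return left, top, tile_w, tile_h


center_tile = tile_rect(0.0, 0.0, 3600, 1800, 400, 200)
seam_tile = tile_rect(90.0, -180.0, 3600, 1800, 400, 200)
```

A rectangle whose left edge wraps past the seam would in practice be served as two tiles, which is consistent with the description's "tile or plurality of tiles".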

As will be appreciated, the system 600 illustrated in FIG. 6A, the system 650 illustrated in FIG. 6B, and/or the system 800 illustrated in FIG. 8 may be implemented as an element of and/or an extension of the generic computer device 900 and/or the generic mobile computer device 950 described below with regard to FIG. 9. Alternatively, or in addition, the system 600 illustrated in FIG. 6A, the system 650 illustrated in FIG. 6B, and/or the system 800 illustrated in FIG. 8 may be implemented in a system separate from the generic computer device 900 and/or the generic mobile computer device 950 that has some or all of the features described below with regard to the generic computer device 900 and/or the generic mobile computer device 950.

FIG. 9 is a schematic block diagram of a computer device and a mobile computer device that may be used to implement the techniques described herein. FIG. 9 shows an example of a generic computer device 900 and a generic mobile computer device 950 that may be used with the techniques described herein. Computing device 900 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Computing device 950 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed in this document.

Computing device 900 includes a processor 902, memory 904, a storage device 906, a high-speed interface 908 connecting to memory 904 and high-speed expansion ports 910, and a low-speed interface 912 connecting to low-speed bus 914 and storage device 906. Each of the components 902, 904, 906, 908, 910, and 912 is interconnected using various buses, and may be mounted on a common motherboard or in other manners as appropriate. The processor 902 can process instructions for execution within the computing device 900, including instructions stored in the memory 904 or on the storage device 906 to display graphical information for a GUI on an external input/output device, such as display 916 coupled to high-speed interface 908. In other implementations, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. Also, multiple computing devices 900 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).

The memory 904 stores information within the computing device 900. In one implementation, the memory 904 is a volatile memory unit or units. In another implementation, the memory 904 is a non-volatile memory unit or units. The memory 904 may also be another form of computer-readable medium, such as a magnetic or optical disk.

The storage device 906 is capable of providing mass storage for the computing device 900. In one implementation, the storage device 906 may be or contain a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, a tape device, a flash memory or other similar solid state memory device, or an array of devices, including devices in a storage area network or other configurations. A computer program product can be tangibly embodied in an information carrier. The computer program product may also contain instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer-readable or machine-readable medium, such as the memory 904, the storage device 906, or memory on processor 902.

The high-speed controller 908 manages bandwidth-intensive operations for the computing device 900, while the low-speed controller 912 manages lower bandwidth-intensive operations. Such allocation of functions is exemplary only. In one implementation, the high-speed controller 908 is coupled to memory 904, to display 916 (e.g., through a graphics processor or accelerator), and to high-speed expansion ports 910, which may accept various expansion cards (not shown). In this implementation, low-speed controller 912 is coupled to storage device 906 and low-speed expansion port 914. The low-speed expansion port, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled to one or more input/output devices, such as a keyboard, a pointing device, or a scanner, or to a networking device such as a switch or router, e.g., through a network adapter.

The computing device 900 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a standard server 920, or multiple times in a group of such servers. It may also be implemented as part of a rack server system 924. In addition, it may be implemented in a personal computer such as a laptop computer 922. Alternatively, components from computing device 900 may be combined with other components in a mobile device (not shown), such as device 950. Each of such devices may contain one or more of computing devices 900, 950, and an entire system may be made up of multiple computing devices 900, 950 communicating with each other.

Computing device 950 includes a processor 952, memory 964, an input/output device such as a display 954, a communication interface 966, and a transceiver 968, among other components. The device 950 may also be provided with a storage device, such as a microdrive or other device, to provide additional storage. Each of the components 950, 952, 964, 954, 966, and 968 is interconnected using various buses, and several of the components may be mounted on a common motherboard or in other manners as appropriate.

The processor 952 can execute instructions within the computing device 950, including instructions stored in the memory 964. The processor may be implemented as a chipset of chips that include separate and multiple analog and digital processors. The processor may provide, for example, for coordination of the other components of the device 950, such as control of user interfaces, applications run by device 950, and wireless communication by device 950.

Processor 952 may communicate with a user through control interface 958 and display interface 956 coupled to a display 954. The display 954 may be, for example, a TFT LCD (Thin-Film-Transistor Liquid Crystal Display) or an OLED (Organic Light Emitting Diode) display, or other appropriate display technology. The display interface 956 may include appropriate circuitry for driving the display 954 to present graphical and other information to a user. The control interface 958 may receive commands from a user and convert them for submission to the processor 952. In addition, an external interface 962 may be provided in communication with processor 952, so as to enable near-area communication of device 950 with other devices. External interface 962 may provide, for example, for wired communication in some implementations, or for wireless communication in other implementations, and multiple interfaces may also be used.

The memory 964 stores information within the computing device 950. The memory 964 can be implemented as one or more of a computer-readable medium or media, a volatile memory unit or units, or a non-volatile memory unit or units. Expansion memory 974 may also be provided and connected to device 950 through expansion interface 972, which may include, for example, a SIMM (Single In Line Memory Module) card interface. Such expansion memory 974 may provide extra storage space for device 950, or may also store applications or other information for device 950. Specifically, expansion memory 974 may include instructions to carry out or supplement the processes described above, and may also include secure information. Thus, for example, expansion memory 974 may be provided as a security module for device 950, and may be programmed with instructions that permit secure use of device 950. In addition, secure applications may be provided via the SIMM cards, along with additional information, such as placing identifying information on the SIMM card in a non-hackable manner.

The memory may include, for example, flash memory and/or NVRAM memory, as discussed below. In one implementation, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer-readable or machine-readable medium, such as the memory 964, expansion memory 974, or memory on processor 952, and may be received, for example, over transceiver 968 or external interface 962.

Device 950 may communicate wirelessly through communication interface 966, which may include digital signal processing circuitry where necessary. Communication interface 966 may provide for communications under various modes or protocols, such as GSM voice calls, SMS, EMS, or MMS messaging, CDMA, TDMA, PDC, WCDMA, CDMA2000, or GPRS, among others. Such communication may occur, for example, through radio-frequency transceiver 968. In addition, short-range communication may occur, such as using a Bluetooth, Wi-Fi, or other such transceiver (not shown). In addition, a GPS (Global Positioning System) receiver module 970 may provide additional navigation-related and location-related wireless data to device 950, which may be used as appropriate by applications running on device 950.

Device 950 may also communicate audibly using audio codec 960, which may receive spoken information from a user and convert it to usable digital information. Audio codec 960 may likewise generate audible sound for a user, such as through a speaker, e.g., in a handset of device 950. Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, etc.), and may also include sound generated by applications operating on device 950.

The computing device 950 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a cellular telephone 980. It may also be implemented as part of a smartphone 982, personal digital assistant, or other similar mobile device.

Some of the above example embodiments are described as processes or methods depicted as flowcharts. Although the flowcharts describe the operations as sequential processes, many of the operations may be performed in parallel, concurrently, or simultaneously. In addition, the order of operations may be rearranged. The processes may be terminated when their operations are completed, but may also have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, subprograms, etc.

The methods discussed above, some of which are illustrated by the flowcharts, may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segments that perform the necessary tasks may be stored in a machine-readable or computer-readable medium such as a storage medium. A processor may perform the necessary tasks.

Specific structural and functional details disclosed herein are merely representative for purposes of describing example embodiments. Example embodiments may, however, be embodied in many alternate forms and should not be construed as limited to only the embodiments set forth herein.

It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.

It will be understood that when an element is referred to as being connected or coupled to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. In contrast, when an element is referred to as being directly connected or directly coupled to another element, there are no intervening elements present. Other words used to describe the relationship between elements should be interpreted in a like fashion (e.g., "between" versus "directly between", "adjacent" versus "directly adjacent", etc.).

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises", "comprising", "includes", and/or "including", when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It should also be noted that in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may in fact be executed concurrently or may sometimes be executed in the reverse order, depending upon the functionality/acts involved.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which example embodiments belong. It will be further understood that terms, e.g., those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

Portions of the above example embodiments and the corresponding detailed description are presented in terms of software, or algorithms and symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the ones by which those of ordinary skill in the art effectively convey the substance of their work to others of ordinary skill in the art. An algorithm, as the term is used here, and as it is used generally, is conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of optical, electrical, or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.

In the above illustrative embodiments, reference to acts and symbolic representations of operations (e.g., in the form of flowcharts) that may be implemented as program modules or functional processes includes routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types and that may be described and/or implemented using existing hardware at existing structural elements. Such existing hardware may include one or more central processing units (CPUs), digital signal processors (DSPs), application-specific integrated circuits, field programmable gate array (FPGA) computers, or the like.

It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise, or as is apparent from the discussion, terms such as processing, computing, calculating, determining, or displaying refer to the actions and processes of a computer system, or similar electronic computing device, that manipulates data represented as physical, electronic quantities within the computer system's registers and memories and transforms that data into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission, or display devices.

Note also that the software-implemented aspects of the exemplary embodiments are typically encoded on some form of non-transitory program storage medium or implemented over some type of transmission medium. The program storage medium may be magnetic (e.g., a floppy disk or a hard drive) or optical (e.g., a compact disc read-only memory, or CD-ROM), and may be read-only or random-access. Similarly, the transmission medium may be twisted wire pairs, coaxial cable, optical fiber, or some other suitable transmission medium known in the art. The exemplary embodiments are not limited by these aspects of any given implementation.

Finally, although the appended claims set out particular combinations of the features described herein, the scope of the present disclosure is not limited to the particular combinations claimed. Instead, it extends to encompass any combination of the features or embodiments disclosed herein, irrespective of whether that particular combination is specifically enumerated in the appended claims at this time.

Claims

A method comprising:
determining at least one preferred view perspective associated with a three-dimensional (3D) video, wherein the view perspective corresponds to a region selected based on at least one reference point viewed by a viewer of the 3D video;
encoding a first portion of the 3D video corresponding to the at least one preferred view perspective at a first quality;
encoding a second portion of the 3D video at a second quality, wherein the first quality is higher than the second quality; and
receiving a request for streaming video, wherein the request includes an indication of a user view perspective associated with the 3D video,
the method further comprising:
determining whether the user view perspective is stored in a view perspective data store;
upon determining that the user view perspective is stored in the view perspective data store, incrementing a counter associated with the user view perspective; and
upon determining that the user view perspective is not stored in the view perspective data store, adding the user view perspective to the view perspective data store and setting the counter associated with the user view perspective to 1.
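
The counting steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class `ViewPerspectiveStore`, the helper `quantize`, and the choice of representing a view perspective as a (yaw, pitch) pair bucketed into 30-degree steps are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical representation: a view perspective bucketed to (yaw, pitch) degrees.
ViewPerspective = Tuple[int, int]


def quantize(yaw_deg: float, pitch_deg: float, step: int = 30) -> ViewPerspective:
    """Bucket raw angles so that nearby viewing directions share one key (assumption)."""
    yaw = int((yaw_deg % 360) // step) * step
    pitch = int(pitch_deg // step) * step
    return (yaw, pitch)


@dataclass
class ViewPerspectiveStore:
    """In-memory stand-in for the view perspective data store."""
    counters: Dict[ViewPerspective, int] = field(default_factory=dict)

    def record(self, perspective: ViewPerspective) -> int:
        """Increment the counter if the perspective is stored, else add it with 1."""
        if perspective in self.counters:
            self.counters[perspective] += 1
        else:
            self.counters[perspective] = 1
        return self.counters[perspective]


store = ViewPerspectiveStore()
store.record(quantize(95, 10))   # first request for this bucket
store.record(quantize(100, 5))   # falls in the same bucket, so the counter is incremented
store.record(quantize(200, 0))   # a different bucket, added with a counter of 1
```

Bucketing is a design choice of this sketch; without some tolerance, two requests would rarely report exactly the same view perspective.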

A method comprising:
determining at least one preferred view perspective associated with a three-dimensional (3D) video, wherein the view perspective corresponds to a region selected based on at least one reference point viewed by a viewer of the 3D video;
encoding a first portion of the 3D video corresponding to the at least one preferred view perspective at a first quality;
encoding a second portion of the 3D video at a second quality, wherein the first quality is higher than the second quality;
iteratively encoding at least a portion of the second portion of the 3D video at the first quality; and
streaming the at least a portion of the second portion of the 3D video.

A method comprising:
storing, based on a previously viewed view perspective associated with a three-dimensional (3D) video, the view perspective as one of a plurality of historical view perspectives for the 3D video, wherein the view perspective corresponds to a region selected based on at least one reference point viewed by a viewer of the 3D video;
determining whether a ranking value associated with the plurality of historical view perspectives is greater than a threshold and satisfies a condition with respect to the rest of the plurality of historical view perspectives;
in response to determining that the ranking value is greater than the threshold and satisfies the condition, selecting one of the plurality of historical view perspectives as at least one preferred view perspective associated with the 3D video;
encoding a first portion of the 3D video corresponding to the at least one preferred view perspective at a first quality; and
encoding a second portion of the 3D video at a second quality, wherein the first quality is higher than the second quality.
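
The selection step above can be sketched as follows. The claim does not say what the condition with respect to the remaining historical view perspectives is; this sketch assumes, purely for illustration, that the candidate's ranking value must be strictly greater than every other ranking value. The function name `select_preferred` and the (yaw, pitch) key type are likewise hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key type: a view perspective as (yaw, pitch) in degrees.
ViewPerspective = Tuple[int, int]


def select_preferred(
    rankings: Dict[ViewPerspective, int], threshold: int
) -> Optional[ViewPerspective]:
    """Return a historical view perspective to treat as preferred, or None.

    Assumed condition: the candidate's ranking value exceeds the threshold
    and is strictly greater than the ranking value of every other entry.
    """
    if not rankings:
        return None
    best = max(rankings, key=rankings.get)
    best_rank = rankings[best]
    rest = [rank for view, rank in rankings.items() if view != best]
    exceeds_threshold = best_rank > threshold
    beats_rest = all(best_rank > rank for rank in rest)
    return best if exceeds_threshold and beats_rest else None


history = {(90, 0): 12, (180, 0): 3}
preferred = select_preferred(history, threshold=10)
```

With this history and a threshold of 10, the perspective (90, 0) qualifies; raising the threshold to 20, or adding a second entry tied at 12, would leave no preferred view perspective under the assumed condition.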

The method of any of claims 1 to 3, further comprising:
storing the first portion of the 3D video in a data store;
storing the second portion of the 3D video in the data store;
receiving a request for streaming video; and
streaming the first portion of the 3D video and the second portion of the 3D video from the data store as the streaming video.

The method of any of the preceding claims, further comprising receiving a request for streaming video, wherein the request includes an indication of a user view perspective, the method further comprising:
selecting 3D video corresponding to the user view perspective as the encoded first portion of the 3D video; and
streaming the selected first portion of the 3D video and the second portion of the 3D video as the streaming video.

The method according to claim 1, further comprising receiving a request for streaming video, wherein the request includes an indication of a user view perspective associated with the 3D video, the method further comprising:
determining whether the user view perspective is stored in a view perspective data store;
upon determining that the user view perspective is stored in the view perspective data store, incrementing a counter associated with the user view perspective; and
upon determining that the user view perspective is not stored in the view perspective data store, adding the user view perspective to the view perspective data store and setting the counter associated with the user view perspective to 1.

The method according to any one of the preceding claims, wherein:
encoding the second portion of the 3D video includes using at least one first quality of service (QoS) parameter in a first pass encoding operation; and
encoding the first portion of the 3D video includes using at least one second QoS parameter in a second pass encoding operation.

The method according to claim 1, wherein the at least one preferred view perspective associated with the 3D video is based on at least one of an orientation of a viewer of the 3D video, a position of a viewer of the 3D video, and a focus of a viewer of the 3D video.

The method according to any of the preceding claims, wherein determining the at least one preferred view perspective associated with the 3D video is based on a default view perspective, and
the default view perspective is based on at least one of:
a characteristic of a user of a display device,
a characteristic of a group associated with the user of the display device,
a director's cut, and
a characteristic of the 3D video.

A streaming server comprising:
a controller configured to determine at least one preferred view perspective associated with a three-dimensional (3D) video, wherein the view perspective corresponds to a region selected based on at least one reference point viewed by a viewer of the 3D video; and
an encoder configured to:
encode a first portion of the 3D video corresponding to the at least one preferred view perspective at a first quality; and
encode a second portion of the 3D video at a second quality, wherein the first quality is higher than the second quality,
wherein the controller is further configured to receive a request for streaming video, the request including an indication of a user view perspective associated with the 3D video, and
wherein the controller is further configured to:
determine whether the user view perspective is stored in a view perspective data store;
upon determining that the user view perspective is stored in the view perspective data store, increment a counter associated with the user view perspective; and
upon determining that the user view perspective is not stored in the view perspective data store, add the user view perspective to the view perspective data store and set the counter associated with the user view perspective to 1.

A streaming server comprising:
a controller configured to determine at least one preferred view perspective associated with a three-dimensional (3D) video, wherein the view perspective corresponds to a region selected based on at least one reference point viewed by a viewer of the 3D video; and
an encoder configured to:
encode a first portion of the 3D video corresponding to the at least one preferred view perspective at a first quality; and
encode a second portion of the 3D video at a second quality, wherein the first quality is higher than the second quality,
wherein the controller is further configured to:
cause at least a portion of the second portion of the 3D video to be iteratively encoded at the first quality; and
cause the at least a portion of the second portion of the 3D video to be streamed.

A streaming server configured to store, based on a previously viewed view perspective associated with a three-dimensional (3D) video, the view perspective as one of a plurality of historical view perspectives for the 3D video, wherein the plurality of historical view perspectives are associated with a plurality of users of the 3D video, and wherein the view perspective corresponds to a region selected based on at least one reference point viewed by a viewer of the 3D video, the streaming server comprising:
a controller configured to:
determine whether a ranking value associated with the plurality of historical view perspectives is greater than a threshold and satisfies a condition with respect to the rest of the plurality of historical view perspectives; and
in response to determining that the ranking value is greater than the threshold and satisfies the condition, select one of the plurality of historical view perspectives as at least one preferred view perspective associated with the 3D video; and
an encoder configured to:
encode a first portion of the 3D video corresponding to the at least one preferred view perspective at a first quality; and
encode a second portion of the 3D video at a second quality, wherein the first quality is higher than the second quality.

The streaming server according to any of claims 10 to 12, wherein the controller is further configured to:
store the first portion of the 3D video in a data store;
store the second portion of the 3D video in the data store;
receive a request for streaming video; and
stream the first portion of the 3D video and the second portion of the 3D video from the data store as the streaming video.

The streaming server according to any of claims 10 to 12, wherein the controller is further configured to receive a request for streaming video, the request including an indication of a user view perspective, and wherein the controller is further configured to:
select 3D video corresponding to the user view perspective as the encoded first portion of the 3D video; and
stream the selected first portion of the 3D video and the second portion of the 3D video as the streaming video.

The streaming server according to any one of the preceding claims, wherein:
encoding the second portion of the 3D video includes using at least one first quality of service (QoS) parameter in a first pass encoding operation; and
encoding the first portion of the 3D video includes using at least one second QoS parameter in a second pass encoding operation.

The streaming server according to any of claims 10 to 12, wherein the at least one preferred view perspective associated with the 3D video is based on at least one of an orientation of a viewer of the 3D video, a position of a viewer of the 3D video, and a focus of a viewer of the 3D video.

The streaming server according to claim 10, wherein determining the at least one preferred view perspective associated with the 3D video is based on a default view perspective, and
the default view perspective is based on at least one of:
a characteristic of a user of a display device,
a characteristic of a group associated with the user of the display device,
a director's cut, and
a characteristic of the 3D video.

The streaming server according to claim 10 or 12, wherein the controller is further configured to:
cause at least a portion of the second portion of the 3D video to be iteratively encoded at the first quality; and
cause the at least a portion of the second portion of the 3D video to be streamed.

A method comprising:
receiving a request for streaming video, wherein the request includes an indication of a user view perspective associated with a three-dimensional (3D) video, and wherein the user view perspective corresponds to a region selected based on at least one reference point,
the method further comprising:
determining whether the user view perspective is stored in a view perspective data store;
upon determining that the user view perspective is stored in the view perspective data store, incrementing a ranking value associated with the user view perspective; and
upon determining that the user view perspective is not stored in the view perspective data store, adding the user view perspective to the view perspective data store and setting the ranking value associated with the user view perspective to 1.

The method, further comprising:
determining at least one preferred view perspective associated with the 3D video based on the ranking value associated with the stored user view perspective;
encoding a first portion of the 3D video corresponding to the at least one preferred view perspective at a first quality; and
encoding a second portion of the 3D video at a second quality, wherein the first quality is higher than the second quality.
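
As a rough illustration of how a preferred view perspective could drive the split into a first-quality and a second-quality portion, the sketch below labels tiles of a spherical frame by their angular distance from the preferred yaw. The tiling by yaw, the `half_fov` window, the quality labels, and the function names are all assumptions made for this example; the claims do not prescribe how the portions are delimited, and no real encoder is invoked.

```python
from typing import Dict, Iterable


def yaw_distance(a: float, b: float) -> float:
    """Smallest absolute angle in degrees between two yaw values."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def plan_tile_quality(
    tile_yaws: Iterable[float],
    preferred_yaw: float,
    half_fov: float = 60.0,          # assumed half-width of the preferred region
    first_quality: str = "high",     # placeholder label for the first quality
    second_quality: str = "low",     # placeholder label for the second quality
) -> Dict[float, str]:
    """Map each tile (keyed by its centre yaw) to the quality it would be encoded at."""
    return {
        yaw: first_quality if yaw_distance(yaw, preferred_yaw) <= half_fov else second_quality
        for yaw in tile_yaws
    }


plan = plan_tile_quality([0, 90, 180, 270], preferred_yaw=45)
```

Here the tiles centred at 0 and 90 degrees fall inside the assumed window around the preferred yaw of 45 degrees and would form the first portion, while the tiles at 180 and 270 degrees would form the second portion.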

A program executed by a computer, the program causing the computer to execute the method according to any one of claims 1 to 9, 19, and 20.