JP2020506614A

JP2020506614A - Predicted Bit Rate Selection for 360 Video Streaming

Info

Publication number: JP2020506614A
Application number: JP2019541128A
Authority: JP
Inventors: ベラン、エリック; トクボ、トッド
Original assignee: Sony Interactive Entertainment Inc
Current assignee: Sony Interactive Entertainment Inc
Priority date: 2017-04-06
Filing date: 2018-03-13
Publication date: 2020-02-27
Anticipated expiration: 2038-03-13
Also published as: JP7129517B2; EP3607739A1; CN110832849B; CN110832849A; WO2018187003A1; US11128730B2; US20180295205A1; JP6867501B2; US20200162757A1; EP3607739A4; JP2021108481A; EP3607739B1; US10547704B2

Abstract

Kind Code: A1

Predictive stream prefetching for 360 degree video is described. User view orientation metadata is obtained for a 360 degree video stream that includes data for a plurality of viewports. Data corresponding to one or more high resolution frames for a particular one of the viewports is prefetched based on the user view orientation metadata, and those frames are displayed. The high resolution frames are characterized by a higher resolution than that of the remaining viewports. [Selected drawing] FIG. 1

Description

[Priority claim]
This application claims the benefit of priority of US Patent Application No. 15/481,324, filed April 6, 2017, the entire contents of which are incorporated herein by reference.

Aspects of the present disclosure relate to video streaming. In particular, the present disclosure relates to streaming of 360 degree video.

A 360 degree video is created by taking video streams from multiple cameras arranged around a single point and stitching the video streams together to create a single continuous video image. Modern encoders split the continuous video image into multiple video streams of frames. To watch 360 degree video over a network, a server sends these multiple streams of frames to a client. The client decodes the streams and reassembles them into a continuous image that is presented on a display.

A system may send a single request for frames, download the requested frames, and then assemble them for display. This combination of actions is sometimes called a fetch action. In general, to ensure uninterrupted video playback, the client must also prefetch video. This means that the system must download frames and process them before the previously downloaded frames are displayed. In this way, the system builds a buffer of processed frames between the processed frames being displayed and the subsequent frames that still need to be downloaded and processed.

Buffering can be very costly in terms of system resources, especially when processing and storing high resolution video. To conserve bandwidth and reduce the amount of buffering required, the client may request high resolution video stream frames only within the client's field of view, also called the viewport. In this case, the client receives low resolution video streams for everything except the client's current view. The problem with such a system is that the client can often move the field of view faster than a high quality stream can be requested, delivered, and buffered. Accordingly, there is a need in the art for a system that allows a client to predict where the field of view may be directed within a 360 degree video stream and to fetch the corresponding high resolution video stream before the field of view is moved.

The disadvantages associated with the prior art are overcome by aspects of the present disclosure relating to a method of prefetching 360 degree video, the method including obtaining user view orientation metadata for a 360 degree video stream, prefetching frames determined by the user view orientation metadata, and displaying higher resolution frames of the 360 degree video stream according to the user view orientation metadata.

The teachings of the present disclosure can be readily understood by considering the following detailed description in conjunction with the accompanying drawings.

FIG. 1 is a diagram of a virtual camera field of view in a 360 degree video, according to aspects of the present disclosure.
FIG. 2 is a diagram of predictive video fetching in an equirectangular projection, according to aspects of the present disclosure.
FIG. 3 is a diagram of predictive video fetching in a cube mapping projection, according to aspects of the present disclosure.
FIG. 4 is a block diagram of a system for displaying 360 degree video, according to aspects of the present disclosure.
FIG. 5 is a simplified circular diagram of a Markov chain for determining when to change frames, according to aspects of the present invention.
FIG. 6 is a flow diagram illustrating a method for displaying 360 degree video, according to aspects of the present disclosure.

Although the following detailed description contains many specific details for purposes of illustration, anyone of ordinary skill in the art will appreciate that many variations and alterations to the following details are within the scope of the invention. Accordingly, the exemplary embodiments of the invention described below are set forth without any loss of generality to, and without imposing limitations upon, the claimed invention.

[Introduction]
Typically, streaming 360 degree video over a network involves receiving a set of video streams that are all of a single quality. Newer streaming techniques make it possible to reduce bandwidth usage by loading high quality streams only in the region on which the viewer is focused. This technique has the additional benefit of allowing the viewer to load higher resolution video streams without requiring as much time or as many buffer resources.

While the disclosed technique allows a viewer to watch higher quality video streams, the user may experience a jarring drop in resolution if the viewport is suddenly moved away from the high quality stream. Aspects of the present disclosure have been developed to eliminate this jarring experience.

In other cases, the author of a 360 degree video may have some artistic vision of what the viewer should see in a given scene of the 360 degree video. With prior art methods, such details in the scene displayed to the viewer may be lost because of the display of low resolution video or because the viewer is looking in another direction. Accordingly, aspects of the present disclosure have been developed to improve the perceived quality of 360 degree video streams and to allow authors to define the viewport and the video quality for the viewer.

[Author-driven prefetch]
Within a 360 degree video there may be many orientations that a camera can take to view the image, as can be seen in FIG. 1. The camera 101 is fixed at a point 108, and the scene 100 wraps around the camera 101 to create a 360 degree viewable area. The camera may rotate about the fixed point 108 to view an image 103 in the scene. The scene may be broken up into different regions called viewports 102. Each viewport 102 may be a different video stream loaded by the client. As seen in FIG. 2, each section of the video may be divided into these viewports. The camera 101 may change orientation in any direction, for example, up 104, down 106, left 105, or right 107, to shift from one viewport 102 to another.

FIG. 2 shows an equirectangular projection according to aspects of the present disclosure. The 360 degree scene 200 is composed of a series of equally sized rectangles 201-208. Each rectangle may be a separate video stream that the client device loads and stitches together for display. Each rectangle may be large enough to encompass the camera's view, in which case the rectangle is a viewport 201. Alternatively, several rectangles may together represent a single viewport (not shown).

Authors of video and images typically have some idea of what they want the viewers of their content to see. Creators of 360 degree video are no different. As described above, 360 degree video displayed in the prior art had only one resolution. At lower resolutions, important aspects of the video image could be lost on the viewer.

According to aspects of the present disclosure, an author may define a location 202 of high resolution frames for the client to load. The author may also define metadata 209 within the video stream that the client can use to predictively load high resolution video content for the streams corresponding to the portions of the 360 degree video that the user is likely to view. By way of example, and not by way of limitation, the metadata 209 may be in the form of a vector having both a magnitude, which represents importance, and a direction. In some implementations, time may be encoded alongside the metadata 209, or the vectors may be placed in the stream at fixed time step intervals. The defined metadata is also referred to as user view orientation metadata.

In some implementations, the metadata may be generated on the back end or server side without explicit transmission of view orientation information by the client. By way of example, and not by way of limitation, the server may build a probability field based on which streams the client requests and when, and then map which streams belong to which viewport. This assumes that the client selects the highest quality stream for the current client viewport.
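
The server-side approach described above can be sketched roughly as follows. This is only an illustrative sketch, not the disclosed implementation: the request log format, the `stream_to_viewport` mapping, and the one-second time bucket are all assumptions introduced here. The sketch counts which viewport each high quality request maps to within each time bucket and normalizes the counts into probabilities.

```python
from collections import Counter, defaultdict


def build_probability_field(request_log, stream_to_viewport, bucket_seconds=1.0):
    """Estimate per-time-bucket viewport probabilities from stream requests.

    request_log: iterable of (time_seconds, stream_id) pairs for the highest
        quality stream each client requested (hypothetical log format).
    stream_to_viewport: dict mapping stream_id -> viewport id (hypothetical).
    Returns {bucket_index: {viewport_id: probability}}.
    """
    counts = defaultdict(Counter)
    for time_seconds, stream_id in request_log:
        bucket = int(time_seconds // bucket_seconds)
        counts[bucket][stream_to_viewport[stream_id]] += 1

    field = {}
    for bucket, counter in counts.items():
        total = sum(counter.values())
        field[bucket] = {vp: n / total for vp, n in counter.items()}
    return field
```

The result relies on the same assumption the text states: a request for the highest quality stream is treated as evidence of where the client's viewport was at that time.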

The metadata 209 may relate to the likelihood of movement of the viewport 201 within the video stream by the user. Alternatively, the metadata 209, in the form of an author-defined prefetch vector, may be an idealized movement vector for the viewer according to the author's artistic vision. The client device may track both the actual movement 210 of the viewport and the metadata 209. The client device may prefetch the frames in the actual viewport 201 as well as the frames 202, 203 along the author-defined prefetch vector. Alternatively, the client may fetch only the high resolution frames 202, 203 along the author-defined vector in order to encourage the viewer to move the viewport to a different location within the 360 degree stream.

The author-defined prefetch metadata need not be a vector. The author may simply define frames 202 that the author wants to be in high resolution during a certain time of the display. The client may then fetch the author-defined high resolution frames at the time defined by the author.

The author may also define certain regions of a frame as high resolution for a zoom function. The author may provide a level of detail for a subsection of the frame so that the subsection is encoded in high resolution. Metadata informing the client of the location of this high resolution subsection may be sent to the client so that the client can prefetch that stream.

The metadata 209 may also be used to control the viewport 201 during display of the video. The author may choose to have the viewport 201 move along the metadata 209 without viewer input. In this way, a virtual cameraman function can be realized, and the author can better present an artistic vision on a 360 degree display.

[Production-defined prefetch]
During the addition of effects to the frames of a video stream, also known as production, it may be desirable for the client to prefetch high resolution frames to match the production effects. By way of example, and not by way of limitation, it may be desirable for the client to prefetch high resolution frames in the apparent direction of a loud sound. Alternatively, during production it may be desirable for the client to prefetch frames in locations where there are many special effects or special camera movements.

According to aspects of the present disclosure, the client may receive metadata that causes the client to prefetch certain frames defined during production. These defined frames may correspond to special effects and sound cues rather than to an artistic vision as described above. Production-defined prefetch metadata may also be referred to as user orientation metadata.

[Predictive prefetch]
According to an alternative aspect of the present disclosure, the client may use predictive metadata to prefetch streams determined to be in a potential future viewport, as seen in FIGS. 2 and 3. The predictive metadata may be generated during production to ensure that end users receive higher resolution frames within their field of view than in a system that employs single resolution streams.

A studio may use screening data collected from viewers of the 360 degree video to generate a probabilistic model of where a viewer is likely to be looking in the video at any given time. This probabilistic model may define the likelihood that a user moves from the current frame 201 to another frame 202, or stays in the current frame 201, based on variables such as the current view orientation in the 360 degree video, the time code within the video, and past views of the video. The probability of changing frames may be represented by a probability attached to each frame in the currently displayed 360 degree video. This is represented by the percentages in each of the frames 201-208 seen in FIG. 2. Predictive prefetch data may be referred to as user orientation metadata.

Alternatively, the predictive prefetch metadata may be negative, or inverse/opposite, data. In other words, instead of probabilities of where the user is likely to look, the prefetch metadata may represent one or more probabilities of where the viewer is unlikely to look.

Prefetching of video frames is not limited to high and low resolution streams. As seen in FIG. 2, the system may choose to prefetch an intermediate resolution for frames 205, 206 that have a certain threshold level of probability. Likewise, frames 204, 207, 208 that have a low probability of being the viewport are prefetched only as low resolution images for display. As an additional aspect, and by way of example and not by way of limitation, the low viewport probability frames 204, 207, 208 may receive fewer updates than the higher probability frames. A low probability frame may, for example, be updated only once for every two updates of a high probability stream. More generally, any factor of the update rate of the optimal/high probability streams may be used to update the low probability streams, depending on whether or not they are in view. In some implementations, low probability streams that are still in view may be updated at the full high probability update rate to avoid noticeable desynchronization between the stitched frames.
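
The tiering described above can be illustrated with a short sketch. The threshold values, the tier names, and the function names below are hypothetical and chosen only for illustration; the source does not specify any of them. The second function mirrors the example of updating a low probability stream once for every two updates of a high probability stream, while keeping in-view streams at the full rate.

```python
def select_quality(probability, high_threshold=0.5, mid_threshold=0.2):
    """Map a frame's viewport probability to a resolution tier.

    The thresholds are illustrative assumptions, not values from the source.
    """
    if probability >= high_threshold:
        return "high"
    if probability >= mid_threshold:
        return "medium"
    return "low"


def should_update(quality, tick, in_view, low_update_divisor=2):
    """Decide whether a stream is refreshed on this update tick.

    High and medium tiers, and any stream currently in view, update on every
    tick. Low tier streams out of view update once every
    `low_update_divisor` ticks (2 matches the example in the text).
    """
    if quality != "low" or in_view:
        return True
    return tick % low_update_divisor == 0
```

Keeping in-view streams at the full update rate corresponds to the stated goal of avoiding visible desynchronization between stitched frames.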

To determine whether to prefetch a stream using this probability metadata, the client may have a defined threshold probability level. When the probability metadata indicates that the probability of the viewer moving the viewport into a certain frame exceeds the threshold probability level, the client prefetches that frame. In an alternative embodiment, the client may prefetch frames based on the fidelity of the probability metadata.

[Prediction fidelity check]
FIG. 3 shows a cube mapping projection according to aspects of the present disclosure. In this example, the prediction data, represented as percentages in each frame, do not match the actual movement 305 of the viewport 301. The author-predicted prefetch vector 303 likewise does not match the actual movement vector 305 of the viewport 301. In the case presented in FIG. 3, the actual viewport ends up at 302, while the system would prefetch either frame 301, based on the prediction data, or frame 304, based on the author-defined prefetch vector. Thus, in the case of FIG. 3, the system may decide whether to continue following the viewer orientation metadata or to default to a single quality level.

The client may perform continuous or intermittent checks of the fidelity of the probabilistic and author-defined metadata. The client may initially prefetch high resolution frames based on the probability metadata and the actual orientation of the viewport 301. The client may then display high resolution frames according to the viewport orientation and the metadata. The client may check whether the viewer has moved the viewport to a high resolution frame in accordance with the probability metadata or the author-defined metadata.

Upon determining that the viewport is not within a frame that was prefetched according to the probability metadata or the author-defined prefetch vector, the client may stop using the metadata for prefetching and fetch only the high resolution frames within the current field of view of the viewport. In an alternative embodiment, the client may continuously check the movement of the viewport against the metadata for correlation. The client may have a tolerance level for viewport movement that does not follow the probability metadata. This tolerance level may be, by way of example and not by way of limitation, the ratio of missed predicted frames to viewed predicted frames. In that case, as the ratio grows toward 1, the client may shift to fetching only the frames within the actual viewport. More generally, the tolerance level may be determined by statistically quantifying the amount of variation in a set of values. Any suitable measure of variability, for example, the standard deviation, may be applied over a fixed or sliding window of user data and metadata.
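
A minimal sketch of such a tolerance check is shown below, assuming a sliding window of recent observations. The class name, the window length, and the hard cutoff at a ratio of 1.0 are assumptions made for illustration; the text describes the client shifting away from the metadata as the ratio approaches 1, which a real client might implement gradually rather than as a single switch.

```python
from collections import deque


class FidelityChecker:
    """Track how often the viewport lands in a predicted frame.

    The window size and max_ratio are illustrative assumptions.
    """

    def __init__(self, window=30, max_ratio=1.0):
        # True means the actual viewport frame was among the predicted frames.
        self.history = deque(maxlen=window)
        self.max_ratio = max_ratio

    def record(self, predicted_frames, actual_frame):
        self.history.append(actual_frame in predicted_frames)

    def miss_ratio(self):
        """Ratio of missed predicted frames to viewed predicted frames."""
        viewed = sum(self.history)
        missed = len(self.history) - viewed
        if viewed == 0:
            return float("inf") if missed else 0.0
        return missed / viewed

    def use_metadata(self):
        """Keep prefetching from metadata while the ratio stays below the cutoff."""
        return self.miss_ratio() < self.max_ratio
```

When `use_metadata()` returns False, the client in this sketch would fall back to fetching only the frames in the actual viewport.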

[Post-production prefetch generation]
Another aspect of the present disclosure is the generation of predictive metadata based on end viewer data. The client device may collect data from the viewer regarding the orientation of the viewport. The client device may use this viewer data to generate predictive data and prefetch video streams according to the predictive data. The client device may also share the predictive data with other clients or with a server in order to generate or refine probability metadata for the video stream.

The client device may use viewport orientation data to generate predictive data. By way of example, and not by way of limitation, the client device may use the movement vector 305 of the viewport to predictively fetch the high resolution video streams that lie on the vector, taking the speed of the movement into account. User-generated predictive prefetch data may be referred to as user orientation metadata.
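
One way to picture prediction from a movement vector is to extrapolate the viewport orientation by its angular velocity and look up the tile it would land in. The sketch below assumes an equirectangular layout split into a hypothetical grid of 4 columns by 2 rows, with yaw measured in degrees from 0 to 360 and pitch from -90 to 90; the grid, the units, and the function name are assumptions, not details from the source.

```python
def predict_tile(yaw_deg, pitch_deg, yaw_rate, pitch_rate, lookahead_s,
                 cols=4, rows=2):
    """Extrapolate the viewport orientation and return the (row, col) tile.

    yaw_rate and pitch_rate are in degrees per second. Yaw wraps around at
    360 degrees; pitch is clamped to [-90, 90]. Row 0 is the top of the
    projection.
    """
    yaw = (yaw_deg + yaw_rate * lookahead_s) % 360.0
    pitch = max(-90.0, min(90.0, pitch_deg + pitch_rate * lookahead_s))
    col = int(yaw // (360.0 / cols)) % cols
    row = min(rows - 1, int((90.0 - pitch) // (180.0 / rows)))
    return row, col
```

A client could call this with several lookahead values to collect the tiles that lie along the movement vector, so that faster movement reaches further ahead.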

The client may send viewport orientation data to a data collection center in order to generate better probability metadata for future viewing. The data collection center may use the viewport orientation data, in addition to the author-defined prefetch data, to generate metadata for the video stream. The generated metadata may weight the author-defined prefetch data so that prefetching of author-defined streams takes priority unless the client is closely following past user-defined viewing vectors.

[Server-side prefetch]
In an alternative embodiment of the present disclosure, the server uses the metadata to select the high resolution streams to send to the client device. According to aspects of this embodiment, the metadata used by the server may be author-defined data, probability data generated from viewers, or other predictive data. The server may receive a request for a 360 degree video stream from a client device. The server may check for viewer orientation metadata for the requested stream. Upon finding such metadata, the server may send high resolution streams to the client according to the metadata. The server may also receive actual viewer orientation data and send the high resolution streams in the view of the actual viewport. Further, the server may perform a prediction fidelity check as described above to determine whether it should continue sending high resolution video streams based on the metadata.

[Implementation]
FIG. 4 shows a system according to aspects of the present disclosure. The system may include a computing device 400 coupled to a display 401 and a user input device 402. The display device 401 may be in the form of a cathode ray tube (CRT), a flat panel screen, a touch screen, or another device that displays text, numerals, graphical symbols, or other visual objects. The user input device 402 may be a controller, a touch screen, or another device that allows the user to interact with the user view orientation and to select a 360 degree video stream. In some implementations, the display 401 may be a 360 degree display configured to display multiple viewports of the 360 degree video simultaneously for an immersive 360 degree video experience. In other implementations, the display 401 may be a conventional two-dimensional display. In such implementations, the user may be able to determine the viewport by interacting with the computing device.

The computing device 400 may include one or more processor units 403, which may be configured according to well-known architectures such as single-core, dual-core, quad-core, multi-core, processor-coprocessor, cell processor, and the like. The computing device may also include one or more memory units 404 (for example, random access memory (RAM), dynamic random access memory (DRAM), read-only memory (ROM), and the like).

The processor unit 403 may execute one or more programs, portions of which may be stored in the memory 404, and the processor 403 may be operatively coupled to the memory, for example, by accessing the memory via a data bus 405. The programs may be configured to request frames for a video stream based on metadata 410 received for that video stream. When executed by the processor, the programs may cause the system to decode 408 high resolution frames and to store frames that are likely to be in the viewer's viewport in a buffer 409.

The computing device 400 may also include well-known support circuits, such as input/output (I/O) circuits 407, a power supply (P/S) 411, a clock (CLK) 412, and a cache 413, which may communicate with other components of the system, e.g., via the bus 405. The computing device may include a network interface 414. The processor unit 403 and the network interface 414 may be configured to implement a local area network (LAN) or a personal area network (PAN) via a suitable network protocol, e.g., Bluetooth® for a PAN. The computing device may optionally include a mass storage device 415, such as a disk drive, CD-ROM drive, tape drive, or flash memory. The mass storage device may store programs and/or data. The computing device may also include a user interface 416 to facilitate interaction between the system and a user. The user interface may include a keyboard, mouse, light pen, game control pad, touch interface, or other device. In some implementations, the user may use the interface 416 to change the viewport, e.g., by scrolling with a mouse or manipulating a joystick. In some implementations, the display 401 may be a hand-held display, as in a smartphone, tablet computer, or portable game device. In such implementations, the user interface 416 may include an accelerometer that is part of the display 401, and the processor 403 may be configured, e.g., through suitable programming, to detect changes in the orientation of the display and to use this information to determine the viewport. The user can therefore change the viewport simply by moving the display.
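The orientation-to-viewport step described above can be illustrated with a minimal Python sketch. It assumes a cube-mapped stream, an orientation already reduced to yaw and pitch angles in degrees, and hypothetical face names ("front", "right", and so on); none of these specifics come from the disclosure, which leaves the programming open.

```python
import math


def viewport_from_orientation(yaw_deg: float, pitch_deg: float) -> str:
    """Return the cube face that the view direction points at.

    Face names are hypothetical labels chosen for this sketch.
    """
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    # Unit view vector: x to the right, y up, z forward.
    x = math.cos(pitch) * math.sin(yaw)
    y = math.sin(pitch)
    z = math.cos(pitch) * math.cos(yaw)
    # The dominant axis of the view vector selects the cube face.
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return "right" if x > 0 else "left"
    if ay >= az:
        return "top" if y > 0 else "bottom"
    return "front" if z > 0 else "back"
```

A device could call such a function each time its motion sensors report a new orientation and treat a change in the returned face as a viewport change.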

In some implementations, the prediction metadata may be configured to implement a Markov chain. FIG. 5 shows a graphical representation of a Markov chain according to aspects of the present disclosure. Each circle represents a state 501-506, e.g., a viewport orientation in a 360 degree video. This Markov chain has six states, representing a cube-mapped video stream. Each arrow represents the probability that the viewport will change to another of the viewports of the video stream. These transition probabilities may be set manually by a director or producer, or determined from focus group data or from actual viewer data. Each frame may have its own Markov probabilities at each time point in the video stream. According to one embodiment of the present disclosure, the system may receive only the metadata representing the Markov probabilities for transitions from the current viewport orientation to a different orientation. For example, a client displaying the current viewport 501 receives only the probabilities of moving from the current viewport 501 to a different one of the viewports 502-506, or of remaining in the current viewport 501. In this way, the system may reduce initial load time and bandwidth usage by receiving, while the stream is being received, only the probabilities of transitions from the current state. Alternatively, the system may receive the entire Markov probability model for the video stream, covering every time point, at the start of the video stream. This embodiment increases the initial load time for the stream but reduces the total amount of information that must be processed during streaming. In another embodiment, the system may receive the Markov probability model for each time point as the 360 degree video is streamed. This embodiment reduces up-front load time at the cost of additional stream processing, but it also provides additional Markov probability data in the event of a desynchronization between the client and the server.
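As a rough illustration of the first embodiment, the following Python sketch represents the metadata a client might hold for one time point: a single row of transition probabilities out of the current state. The state numbers follow FIG. 5, but the probability values, the dictionary layout, and the function names are assumptions made for this sketch only.

```python
from typing import Dict

# Six states for a cube-mapped stream, numbered as in FIG. 5.
STATES = [501, 502, 503, 504, 505, 506]

# Hypothetical probabilities of moving out of (or staying in) state 501
# at one time point; the values are invented for illustration.
ROW_501: Dict[int, float] = {
    501: 0.60, 502: 0.20, 503: 0.10, 504: 0.05, 505: 0.03, 506: 0.02,
}


def is_valid_row(row: Dict[int, float]) -> bool:
    """A transition row must cover every state and sum to 1."""
    return set(row) == set(STATES) and abs(sum(row.values()) - 1.0) < 1e-9


def most_likely_next(row: Dict[int, float], current: int) -> int:
    """Return the most probable viewport other than the current one."""
    candidates = {s: p for s, p in row.items() if s != current}
    return max(candidates, key=candidates.get)
```

Under the alternative embodiments, the client would instead hold one such row per state and per time point, either delivered once at the start of the stream or delivered incrementally during streaming.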

To implement the author-defined prefetch metadata described above using a Markov chain model, the system may apply weights to the probabilities of transitioning to states that lie along the author-defined prefetch vector. The client then begins prefetching a state when the weighted probability of moving to that state exceeds a threshold.
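The weighting and threshold test can be sketched in Python as follows. The sketch assumes that the weight is a simple multiplier applied to each state's transition probability and that states without an author-defined weight use a factor of 1.0; the disclosure does not specify the form of the weighting, and the function and parameter names are hypothetical.

```python
from typing import Dict, List


def viewports_to_prefetch(
    probs: Dict[int, float],
    author_weights: Dict[int, float],
    threshold: float,
) -> List[int]:
    """Return the states whose weighted transition probability exceeds the threshold.

    Assumption: weights multiply the Markov probability; states without
    an author-defined weight keep their original probability.
    """
    weighted = {s: p * author_weights.get(s, 1.0) for s, p in probs.items()}
    return sorted(s for s, w in weighted.items() if w > threshold)
```

For example, with a probability of 0.10 for state 503, an author weight of 3.0 on that state, and a threshold of 0.25, state 503 would be prefetched even though its unweighted probability falls below the threshold.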

According to aspects of the present disclosure, the client may receive viewer orientation metadata for a requested video stream independently of the video stream. If the metadata is received independently of the video stream, it must be time-synchronized to the stream. In alternative aspects of the present disclosure, the metadata may be received as part of the video stream itself, by way of example and not by way of limitation, in a header of the 360 degree video stream.
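One simple way to keep separately delivered metadata aligned with playback is to timestamp each metadata entry and look up the entry in effect at the current playback time. The Python sketch below assumes entries sorted by start time in seconds; the data layout and the function name are illustrative and are not taken from the disclosure.

```python
import bisect
from typing import List, Optional, Sequence


def metadata_at(
    timestamps: List[float],
    entries: Sequence[object],
    playback_time: float,
) -> Optional[object]:
    """Return the metadata entry in effect at playback_time.

    timestamps must be sorted in ascending order, with timestamps[i]
    giving the start time of entries[i]. Returns None before the first entry.
    """
    i = bisect.bisect_right(timestamps, playback_time) - 1
    return entries[i] if i >= 0 else None
```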

FIG. 6 shows a flow diagram for predictive bit rate selection according to aspects of the present disclosure. The system may request 601 a video stream from a network. Upon requesting the video stream, the system may obtain 602 user view orientation metadata from the network, from local storage in the mass storage device 415, from a storage medium such as a compact disc, or from any suitable type of data storage known in the art. As noted above, the user view orientation metadata may also be obtained during streaming. The viewer orientation metadata may inform the system which frames to fetch before display, including frames for potential viewports that the user has not yet entered, as discussed above. Accordingly, the system prefetches 603 frames for the potential or desired viewports, in addition to the frames for the initial starting viewport at higher resolution and the remaining frames in the 360 degree video stream at lower resolution, as discussed above. The system then displays 604 the prefetched frames, e.g., through the display device 401 described above. While displaying 604 the prefetched frames, the system may continue 605 prefetching 603 frames to create an uninterrupted 360 degree video streaming experience with high resolution frames in the user's field of view. Optionally, the system may use 606 a prediction fidelity check 607 to determine 608 whether to continue prefetching frames according to the view orientation metadata or to enter a current load state, in which only frames in the current viewport are prefetched at higher resolution. In the current load state, the system may continue 609 checking prediction fidelity until the predictions become accurate, as discussed above. At that point, the system may resume 608 prefetching 603 based on the user view orientation metadata.
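The switch between metadata-driven prefetching and the current load state can be sketched as a small state machine. The Python example below assumes that prediction fidelity is judged over a sliding window of recent frames, that the system leaves predictive mode once a set number of misses accumulates in the window, and that it returns once the window contains no misses. The window size, the miss limit, and all names are assumptions for illustration; the disclosure does not fix these criteria.

```python
from collections import deque
from typing import Iterable, List


class PrefetchController:
    """Toggle between metadata-driven prefetch and a current-viewport-only state."""

    def __init__(self, window: int = 5, max_misses: int = 3) -> None:
        self.recent = deque(maxlen=window)  # True marks a missed prediction
        self.max_misses = max_misses
        self.predictive = True

    def update(self, predicted_viewport: int, actual_viewport: int) -> bool:
        """Record one fidelity check and return whether predictive mode is on."""
        self.recent.append(predicted_viewport != actual_viewport)
        misses = sum(self.recent)
        if self.predictive and misses >= self.max_misses:
            self.predictive = False  # enter the current load state
        elif not self.predictive and misses == 0:
            self.predictive = True   # predictions are accurate again
        return self.predictive

    def high_res_targets(
        self, predicted_viewports: Iterable[int], actual_viewport: int
    ) -> List[int]:
        """Viewports to prefetch at higher resolution for the next frames."""
        if self.predictive:
            return sorted(set(predicted_viewports) | {actual_viewport})
        return [actual_viewport]
```

In this sketch the current viewport is always fetched at higher resolution, and the predicted viewports are added only while the predictions are judged reliable.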

While the above is a complete description of the preferred embodiments of the present invention, it is possible to use various alternatives, modifications, and equivalents. Therefore, the scope of the present invention should be determined not with reference to the above description but should, instead, be determined with reference to the appended claims, along with their full scope of equivalents. Any feature, whether preferred or not, may be combined with any other feature, whether preferred or not. In the claims that follow, the indefinite article "A" or "An" refers to a quantity of one or more of the item following the article, except where expressly stated otherwise. The appended claims are not to be interpreted as including means-plus-function limitations, unless such a limitation is explicitly recited in a given claim using the phrase "means for". Any element in a claim that does not explicitly state "means for" performing a specified function is not to be interpreted as a "means" or "step" clause as specified in 35 USC § 112, ¶ 6.

Claims

A method comprising:
a) obtaining user view orientation metadata for a 360 degree video stream including data for a plurality of viewports;
b) prefetching data corresponding to one or more high resolution frames for a particular viewport of the plurality of viewports as determined by the user view orientation metadata; and
c) displaying the one or more high resolution frames, wherein the one or more high resolution frames are characterized by a higher resolution than the remaining viewports of the plurality of viewports.

The method of claim 1, wherein the user view orientation metadata is an author-defined prefetch vector.

The method of claim 1, wherein the user view orientation metadata is a prefetch vector created to match staging effects in scenes of the 360 ° video stream.

The method of claim 1, wherein the view orientation metadata is a predicted probability of a viewport frame change.

The method of claim 4, wherein the predicted probabilities are generated by a focus group.

The method of claim 4, wherein the predicted probabilities are generated using user generated frame position data.

The method of claim 4, wherein the predicted probabilities also include an author-defined prefetch vector.

The method of claim 7, wherein the author-defined prefetch vector is a weight applied to the probability of a viewport frame change.

The method of claim 4, wherein prefetching a frame comprises applying a first threshold to the user view orientation metadata to determine the particular viewport of the plurality of viewports.

The method of claim 4, wherein the predicted probability of a viewport frame change is a probability for a Markov model.

The method of claim 1, wherein the user view orientation metadata is a motion vector.

The method of claim 9, wherein b) comprises prefetching an intermediate resolution frame when the predicted probability exceeds a second threshold.

The method of claim 1, wherein the user view orientation metadata represents a probability of where a user is likely to look.

The method of claim 1, wherein the user view orientation metadata represents a probability of a location unlikely to be seen by a user.

The method of claim 1, further comprising:
checking the user view orientation metadata against the actual user view orientation; and
disabling prefetching when the user view orientation metadata does not match the actual user view orientation being displayed.

The method of claim 15, wherein disabling prefetching when the user view orientation metadata does not match the actual user view orientation being displayed comprises applying a missed-frame threshold to determine whether to disable prefetching.

The method of claim 1, wherein prefetching data corresponding to one or more high resolution frames determined by the user view orientation metadata comprises prefetching frames having higher resolution sections than the remaining frames.

A non-transitory computer-readable medium containing program instructions for causing a computer to perform a method comprising:
a) obtaining user view orientation metadata for a 360 degree video stream including data for a plurality of viewports;
b) prefetching data corresponding to one or more high resolution frames for a particular viewport of the plurality of viewports as determined by the user view orientation metadata; and
c) displaying the one or more high resolution frames, wherein the one or more high resolution frames are characterized by a higher resolution than the remaining viewports of the plurality of viewports.

A system comprising:
a processor;
a display coupled to the processor; and
a memory coupled to the processor and having processor-executable instructions embodied therein, the instructions being configured to implement a method when executed by the processor, the method comprising:
a) obtaining user view orientation metadata for a 360 degree video stream including data for a plurality of viewports;
b) prefetching data corresponding to one or more high resolution frames for a particular viewport of the plurality of viewports as determined by the user view orientation metadata; and
c) displaying the one or more high resolution frames using the display, wherein the one or more high resolution frames are characterized by a higher resolution than the remaining viewports of the plurality of viewports.

The system of claim 19, wherein the display is a 360 degree display configured to simultaneously display the plurality of viewports.