JP2024050857A

JP2024050857A - Encoder tuning to improve the trade-off between latency and video quality in cloud gaming applications

Info

Publication number: JP2024050857A
Application number: JP2024017687A
Authority: JP
Inventors: イー．サーニーマーク; エム．ヨンケルビン
Original assignee: Sony Interactive Entertainment Inc
Current assignee: Sony Interactive Entertainment Inc
Priority date: 2019-10-01
Filing date: 2024-02-08
Publication date: 2024-04-10
Also published as: CN114746157A; US20210093960A1; US11524230B2; US20230115947A1; US11344799B2; WO2021067516A1; US20210093959A1; JP2022550443A; JP7436644B2; EP4037790A1

Abstract

【課題】クラウドゲーミングサービスのユーザに高品質のエクスペリエンスを提供する。【解決手段】クラウドゲーミングのための方法である。この方法では、クラウドゲーミングサーバにおいてビデオゲームを実行するときに、複数のビデオフレームを生成する。この方法では、複数のビデオフレームをエンコーダビットレートでエンコードし、圧縮された複数のビデオフレームが、クラウドゲーミングサーバのストリーマからクライアントに伝送される。また、この方法では、クライアントの最大受信帯域幅を測定し、ストリーマにおける複数のビデオフレームのエンコードを監視し、さらに、エンコードの監視に基づいてエンコーダのパラメータを動的に調節する。【選択図】図６ [Problem] To provide a high-quality experience to users of a cloud gaming service. [Solution] A method for cloud gaming is provided. The method includes generating a plurality of video frames when executing a video game at a cloud gaming server. The method includes encoding the plurality of video frames at an encoder bit rate, the compressed video frames being transmitted from a streamer at the cloud gaming server to a client. The method also includes measuring a maximum receive bandwidth of the client, monitoring the encoding of the plurality of video frames at the streamer, and dynamically adjusting parameters of the encoder based on the monitoring of the encoding. [Selected Figure] FIG. 6

Description

本開示は、ネットワークを介してコンテンツをストリーミングするように構成されたストリーミングシステムに関し、より具体的には、クラウドゲーミングシステム用の高性能エンコーダ及びデコーダ、ならびにネットワーク伝送速度及び信頼性、ならびに全体的なレイテンシ目標を認識したエンコーダ調節のために構成されたストリーミングシステムに関連する。 The present disclosure relates to streaming systems configured to stream content over a network, and more specifically to high performance encoders and decoders for cloud gaming systems, and streaming systems configured for encoder adjustments that are aware of network transmission speed and reliability, as well as overall latency targets.

近年、ネットワークを介して接続されたクラウドゲーミングサーバとクライアントとの間でストリーミングフォーマットのオンラインまたはクラウドゲーミングを可能にするオンラインサービスが継続的に推進されている。ストリーミングフォーマットは、オンデマンドでのゲームタイトルの利用可能性、マルチプレイヤーゲームのためにプレイヤー間をネットワークする機能、プレイヤー間での資産の共有、プレイヤー間及び／または観客間でのインスタントエクスペリエンスの共有、友人達が、友人がビデオゲームをプレイするのを見ることを可能にすること、友人が進行中のゲームプレイに参加する友人を有すること、などによって、増々人気が高まっている。
残念ながら、その需要はまた、ネットワーク接続の能力と、クライアントに配信されると高品質画像をレンダリングするのに十分な応答性を備えたサーバ及びクライアントにおいて実行される処理の制限と、に対しても押し上がっている。例えば、サーバで実行されるすべてのゲームアクティビティの結果は、最高のユーザエクスペリエンスのために、圧縮されて低いミリ秒のレイテンシでクライアントに戻すように伝送される必要がある。ラウンドトリップレイテンシは、ユーザのコントローラ入力とクライアントにおけるビデオフレームの表示との間の全体的な時間として定義されることができ、これは、コントローラからクライアントへの制御情報の処理と伝送と、クライアントからサーバへの制御情報の処理と伝送と、入力に応答するビデオフレームを生成するためのサーバでのその入力の使用と、ビデオフレームの処理とエンコーディングユニットへの転送（例えば、スキャンアウト）と、ビデオフレームのエンコードと、クライアントへ戻すエンコードされたビデオフレームの伝送と、ビデオフレームの受信とデコードと、表示前のビデオフレームの処理またはステージングのいずれかと、を含み得る。
一方向レイテンシは、ビデオフレームのサーバにあるエンコーディングユニットへの転送（例えば、スキャンアウト）の開始からクライアントにおけるビデオフレームの表示の開始までの時間で構成されるラウンドトリップレイテンシの一部として定義され得る。ラウンドトリップレイテンシ及び一方向レイテンシの一部分は、データストリームが、通信ネットワークを介してクライアントからサーバに送信され、サーバからクライアントに送信されるのにかかる時間に関連付けられている。別の部分は、クライアントとサーバでの処理に関連付けられており、フレームのデコードと表示に関する高度な方策などの、これらの動作での改善は、サーバとクライアント間のラウンドトリップレイテンシ及び一方向レイテンシの実質的な低減になり得、クラウドゲーミングサービスのユーザに高品質のエクスペリエンスを提供する。 In recent years, there has been a continuous push for online services that allow online or cloud gaming in a streaming format between cloud gaming servers and clients connected over a network. The streaming format has become increasingly popular due to the availability of game titles on demand, the ability to network between players for multiplayer gaming, sharing of assets between players, sharing of instant experiences between players and/or spectators, allowing friends to watch friends play video games, having friends join in on-going gameplay, etc.
Unfortunately, the demands are also pushing up against the capacity of network connections and limitations of processing performed on the server and client with sufficient responsiveness to render high quality images when delivered to the client. For example, the results of all game activity performed on the server need to be compressed and transmitted back to the client with low millisecond latency for the best user experience. Round-trip latency can be defined as the overall time between a user's controller input and the display of a video frame at the client, which may include the processing and transmission of control information from the controller to the client, the processing and transmission of control information from the client to the server, the use of that input at the server to generate a video frame responsive to the input, the processing and transfer of the video frame to an encoding unit (e.g., scan-out), the encoding of the video frame, the transmission of the encoded video frame back to the client, the receipt and decoding of the video frame, and either the processing or staging of the video frame prior to display.
One-way latency may be defined as a portion of the round-trip latency consisting of the time from the start of the transfer (e.g., scan-out) of a video frame to an encoding unit at the server to the start of the display of the video frame at the client. A portion of the round-trip latency and one-way latency is associated with the time it takes for a data stream to be transmitted from the client to the server and back across a communication network. Another portion is associated with the processing at the client and server, and improvements in these operations, such as advanced strategies for decoding and displaying frames, can result in a substantial reduction in the round-trip latency and one-way latency between the server and the client, providing a higher quality experience to users of the cloud gaming service.
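The two latency definitions above can be written out as simple sums. The following is a minimal sketch, not part of the disclosure: the stage names and the sample millisecond values are illustrative assumptions, chosen only to show which stages fall inside the one-way portion.

```python
from dataclasses import dataclass


@dataclass
class LatencyBreakdown:
    """Per-stage times in milliseconds. All field names are illustrative."""
    controller_to_client: float  # controller input processed and sent to client
    client_to_server: float      # control information sent to the server
    render: float                # server uses the input to generate the frame
    scan_out: float              # frame transferred to the encoding unit
    encode: float                # frame encoded at the server
    server_to_client: float      # encoded frame sent back to the client
    decode: float                # frame received and decoded at the client
    display_staging: float       # processing or staging before display

    def round_trip_ms(self) -> float:
        # Controller input through to display of the responsive frame.
        return (self.controller_to_client + self.client_to_server
                + self.render + self.scan_out + self.encode
                + self.server_to_client + self.decode + self.display_staging)

    def one_way_ms(self) -> float:
        # Start of scan-out at the server through to start of display.
        return (self.scan_out + self.encode + self.server_to_client
                + self.decode + self.display_staging)


if __name__ == "__main__":
    sample = LatencyBreakdown(2, 8, 10, 4, 5, 12, 4, 3)  # made-up values
    print(sample.round_trip_ms(), sample.one_way_ms())
```

In this sketch the one-way figure is always a subset of the round-trip figure, which matches the definition of one-way latency as a portion of round-trip latency.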

本開示の実施形態は、このような背景のもとになされたものである。 The embodiments of the present disclosure have been made against this background.

本開示の実施形態は、ネットワークを介してコンテンツ（例えば、ゲーム）をストリーミングするように構成されたストリーミングシステムに関し、より具体的には、クラウドゲーミングシステムにおける一方向レイテンシとビデオ品質との間のトレードオフを改善するためにエンコーダ調節を提供するように構成されたストリーミングシステムに関し、エンコーダ調節は、クライアント帯域幅、スキップされたフレーム、エンコードされたＩフレームの数、シーン変化の数、及び／または目標フレームサイズを超えるビデオフレームの数の監視に基づき得、調節されたパラメータは、エンコーダビットレート、目標フレームサイズ、最大フレームサイズ、及び量子化パラメータ（ＱＰ）値を含み得、高性能のエンコーダとデコーダは、クラウドゲーミングサーバとクライアントとの間の全体的な一方向レイテンシを低減する働きをする。 Embodiments of the present disclosure relate to a streaming system configured to stream content (e.g., games) over a network, and more particularly to a streaming system configured to provide encoder adjustments to improve the tradeoff between one-way latency and video quality in a cloud gaming system, where the encoder adjustments may be based on monitoring of client bandwidth, skipped frames, number of encoded I-frames, number of scene changes, and/or number of video frames exceeding a target frame size, where the adjusted parameters may include encoder bitrate, target frame size, maximum frame size, and quantization parameter (QP) value, and where high performance encoders and decoders serve to reduce the overall one-way latency between the cloud gaming server and the client.

本開示の実施形態は、クラウドゲーミングのための方法を開示する。本方法は、クラウドゲーミングサーバにおいてビデオゲームを実行するときに、複数のビデオフレームを生成することを含んでいる。本方法は、複数のビデオフレームをエンコーダビットレートでエンコードすることを含み、圧縮された複数のビデオフレームが、クラウドゲーミングサーバのストリーマからクライアントに伝送されている。本方法は、クライアントの最大受信帯域幅を測定することを含んでいる。本方法は、ストリーマにおける複数のビデオフレームのエンコードを監視することを含んでいる。本方法は、エンコードの監視に基づいてエンコーダのパラメータを動的に調節することを含んでいる。 Embodiments of the present disclosure disclose a method for cloud gaming. The method includes generating a plurality of video frames when executing a video game at a cloud gaming server. The method includes encoding the plurality of video frames at an encoder bit rate, where the compressed plurality of video frames are transmitted from a streamer at the cloud gaming server to a client. The method includes measuring a maximum receive bandwidth of the client. The method includes monitoring encoding of the plurality of video frames at the streamer. The method includes dynamically adjusting parameters of the encoder based on the monitoring of the encoding.
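One way to picture the adjustment step of this method is a function that derives encoder settings from the measured maximum receive bandwidth. This is a hedged sketch only: the function name, the 80% headroom, the rule that target frame size equals bitrate divided by frame rate, and the 2x maximum frame size are all assumptions made for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class EncoderParams:
    bitrate_bps: int         # encoder bitrate in bits per second
    target_frame_bytes: int  # size the encoder aims for per frame
    max_frame_bytes: int     # upper bound the encoder should not exceed


def tune_for_bandwidth(max_receive_bps: int, fps: int = 60,
                       headroom_pct: int = 80,
                       max_frame_multiple: int = 2) -> EncoderParams:
    """Derive encoder parameters from the client's measured bandwidth.

    Assumed policy: keep the bitrate below the measured bandwidth by a
    fixed headroom, and spread that bitrate evenly over the frames.
    """
    bitrate = max_receive_bps * headroom_pct // 100
    target = bitrate // (8 * fps)  # bits per second -> bytes per frame
    return EncoderParams(bitrate, target, target * max_frame_multiple)


if __name__ == "__main__":
    print(tune_for_bandwidth(15_000_000))
```

A streamer could call such a function whenever a new bandwidth measurement arrives, then pass the result to whatever interface the encoder actually exposes.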

別の実施形態では、クラウドゲーミング用のコンピュータプログラムを格納する非一時的なコンピュータ可読媒体が開示されている。コンピュータ可読媒体は、クラウドゲーミングサーバにおいてビデオゲームを実行するときに複数のビデオフレームを生成するためのプログラム命令を含んでいる。コンピュータ可読媒体は、複数のビデオフレームをエンコーダビットレートでエンコードするためのプログラム命令を含んでおり、圧縮された複数のビデオフレームは、クラウドゲーミングサーバのストリーマからクライアントに伝送されている。コンピュータ可読媒体は、クライアントの最大受信帯域幅を測定するためのプログラム命令を含んでいる。コンピュータ可読媒体は、ストリーマにおける複数のビデオフレームのエンコードを監視するためのプログラム命令を含んでいる。コンピュータ可読媒体は、エンコードの監視に基づいてエンコーダのパラメータを動的に調節するためのプログラム命令を含んでいる。 In another embodiment, a non-transitory computer readable medium storing a computer program for cloud gaming is disclosed. The computer readable medium includes program instructions for generating a plurality of video frames when executing a video game at a cloud gaming server. The computer readable medium includes program instructions for encoding the plurality of video frames at an encoder bit rate, and the compressed plurality of video frames are transmitted from a streamer at the cloud gaming server to a client. The computer readable medium includes program instructions for measuring a maximum receive bandwidth of the client. The computer readable medium includes program instructions for monitoring encoding of the plurality of video frames at the streamer. The computer readable medium includes program instructions for dynamically adjusting parameters of the encoder based on the monitoring of the encoding.

さらに別の実施形態では、コンピュータシステムは、プロセッサと、プロセッサに結合され、命令をその中に格納したメモリと、を含み、命令は、コンピュータシステムによって実行された場合、コンピュータシステムにクラウドゲーミングの方法を実行させる。本方法は、クラウドゲーミングサーバにおいてビデオゲームを実行するときに、複数のビデオフレームを生成することを含んでいる。本方法は、複数のビデオフレームをエンコーダビットレートでエンコードすることを含み、圧縮された複数のビデオフレームが、クラウドゲーミングサーバのストリーマからクライアントに伝送されている。本方法は、クライアントの最大受信帯域幅を測定することを含んでいる。本方法は、ストリーマにおける複数のビデオフレームのエンコードを監視することを含んでいる。本方法は、エンコードの監視に基づいてエンコーダのパラメータを動的に調節することを含んでいる。 In yet another embodiment, a computer system includes a processor and a memory coupled to the processor having instructions stored therein, the instructions, when executed by the computer system, causing the computer system to perform a method of cloud gaming. The method includes generating a plurality of video frames when executing a video game at a cloud gaming server. The method includes encoding the plurality of video frames at an encoder bit rate, the compressed plurality of video frames being transmitted from a streamer at the cloud gaming server to a client. The method includes measuring a maximum receive bandwidth of the client. The method includes monitoring encoding of the plurality of video frames at the streamer. The method includes dynamically adjusting parameters of the encoder based on the monitoring of the encoding.

さらに別の実施形態では、クラウドゲーミングのための方法が開示されている。本方法は、クラウドゲーミングサーバにおいてビデオゲームを実行するときに、複数のビデオフレームを生成することを含んでいる。本方法は、ビデオゲームの第１のビデオフレームのシーン変化を予測することを含んでおり、シーン変化は、第１のビデオフレームが生成される前に予測されている。本方法は、第１のビデオフレームがシーン変化である、シーン変化ヒントを生成することを含んでいる。本方法は、シーン変化ヒントをエンコーダに送信することを含んでいる。本方法は、第１のビデオフレームをエンコーダに送達することを含んでおり、第１のビデオフレームは、シーン変化ヒントに基づいてＩフレームとしてエンコードされている。本方法は、クライアントの最大受信帯域幅を測定することを含んでいる。本方法は、クライアントの最大受信帯域幅とクライアントディスプレイの目標解像度に基づいて、エンコーダで受信された第２のビデオフレームをエンコードするかエンコードしないかを判定することを含んでいる。 In yet another embodiment, a method for cloud gaming is disclosed. The method includes generating a plurality of video frames when executing a video game on a cloud gaming server. The method includes predicting a scene change for a first video frame of the video game, the scene change being predicted before the first video frame is generated. The method includes generating a scene change hint, where the first video frame is a scene change. The method includes sending the scene change hint to an encoder. The method includes delivering the first video frame to the encoder, where the first video frame is encoded as an I-frame based on the scene change hint. The method includes measuring a maximum receive bandwidth of a client. The method includes determining whether to encode or not encode a second video frame received at the encoder based on the maximum receive bandwidth of the client and a target resolution of the client display.

別の実施形態では、クラウドゲーミング用のコンピュータプログラムを格納する非一時的なコンピュータ可読媒体が開示されている。コンピュータ可読媒体は、クラウドゲーミングサーバにおいてビデオゲームを実行するときに複数のビデオフレームを生成するためのプログラム命令を含んでいる。コンピュータ可読媒体は、ビデオゲームの第１のビデオフレームのシーン変化を予測するためのプログラム命令を含んでおり、シーン変化は、第１のビデオフレームが生成される前に予測されている。コンピュータ可読媒体は、第１のビデオフレームがシーン変化であるシーン変化ヒントを生成するためのプログラム命令を含んでいる。コンピュータ可読媒体は、シーン変化ヒントをエンコーダに送信するためのプログラム命令を含んでいる。コンピュータ可読媒体は、第１のビデオフレームをエンコーダに配信するためのプログラム命令を含んでおり、第１のビデオフレームは、シーン変化ヒントに基づいてＩフレームとしてエンコードされている。コンピュータ可読媒体は、クライアントの最大受信帯域幅を測定するためのプログラム命令を含んでいる。コンピュータ可読媒体は、エンコーダで受信した第２のビデオフレームをエンコードするかエンコードしないかをクライアントの最大受信帯域幅とクライアントディスプレイの目標解像度に基づいて判定するためのプログラム命令を含んでいる。 In another embodiment, a non-transitory computer readable medium storing a computer program for cloud gaming is disclosed. The computer readable medium includes program instructions for generating a plurality of video frames when executing a video game on a cloud gaming server. The computer readable medium includes program instructions for predicting a scene change in a first video frame of the video game, the scene change being predicted before the first video frame is generated. The computer readable medium includes program instructions for generating a scene change hint, the first video frame being a scene change. The computer readable medium includes program instructions for sending the scene change hint to an encoder. The computer readable medium includes program instructions for delivering the first video frame to the encoder, the first video frame being encoded as an I-frame based on the scene change hint. The computer readable medium includes program instructions for measuring a maximum receive bandwidth of a client. The computer readable medium includes program instructions for determining whether to encode or not encode a second video frame received at the encoder based on the maximum receive bandwidth of the client and a target resolution of the client display.

さらに別の実施形態では、コンピュータシステムは、プロセッサと、プロセッサに結合され、コンピュータシステムによって実行されると、コンピュータシステムにクラウドゲーミングの方法を実行させる命令をその中に格納したメモリと、を含む。本方法は、クラウドゲーミングサーバにおいてビデオゲームを実行するときに、複数のビデオフレームを生成することを含んでいる。本方法は、ビデオゲームの第１のビデオフレームのシーン変化を予測することを含んでおり、シーン変化は、第１のビデオフレームが生成される前に予測されている。本方法は、第１のビデオフレームがシーン変化である、シーン変化ヒントを生成することを含んでいる。
本方法は、シーン変化ヒントをエンコーダに送信することを含んでいる。本方法は、第１のビデオフレームをエンコーダに送達することを含んでおり、第１のビデオフレームは、シーン変化ヒントに基づいてＩフレームとしてエンコードされている。本方法は、クライアントの最大受信帯域幅を測定することを含んでいる。本方法は、クライアントの最大受信帯域幅とクライアントディスプレイの目標解像度に基づいて、エンコーダで受信された第２のビデオフレームをエンコードするかエンコードしないかを判定することを含んでいる。 In yet another embodiment, a computer system includes a processor and a memory coupled to the processor having instructions stored therein that, when executed by the computer system, cause the computer system to perform a method of cloud gaming. The method includes generating a plurality of video frames when executing a video game on a cloud gaming server. The method includes predicting a scene change for a first video frame of the video game, the scene change being predicted before the first video frame is generated. The method includes generating a scene change hint, where the first video frame is a scene change.
The method includes sending the scene change hint to an encoder. The method includes delivering the first video frame to the encoder, the first video frame being encoded as an I-frame based on the scene change hint. The method includes measuring a maximum receive bandwidth of a client. The method includes determining whether to encode or not encode a second video frame received at the encoder based on the maximum receive bandwidth of the client and a target resolution of the client display.
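The scene change hint and the encode-or-skip decision in these embodiments can be sketched as two small functions. Everything named here is hypothetical: the `COMFORTABLE_BPS` table, its threshold values, and the function names are assumptions used only to show how a bandwidth measurement and a target resolution could feed the decision.

```python
# Hypothetical bandwidth (bits per second) treated as sufficient for each
# target display resolution. The values are placeholders, not measurements.
COMFORTABLE_BPS = {
    "720p": 5_000_000,
    "1080p": 10_000_000,
    "4k": 25_000_000,
}


def frame_type(scene_change_hint: bool) -> str:
    """Encode the hinted frame as an I-frame, otherwise as a P-frame."""
    return "I" if scene_change_hint else "P"


def should_encode_next(max_receive_bps: int, target_resolution: str,
                       previous_was_iframe: bool) -> bool:
    """Decide whether to encode the frame that follows the current one.

    Assumed policy: only the frame after an I-frame is a skip candidate,
    and it is skipped when bandwidth is low for the target resolution.
    """
    if not previous_was_iframe:
        return True
    return max_receive_bps >= COMFORTABLE_BPS[target_resolution]


if __name__ == "__main__":
    print(frame_type(True), should_encode_next(3_000_000, "1080p", True))
```

When the function returns `True` after an I-frame, the second frame would still be encoded, possibly after a delay while the I-frame finishes.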

本開示の他の態様は、本開示の原理を例として示す、添付の図面と併せて取られる以下の詳細な説明から明らかになるであろう。 Other aspects of the present disclosure will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, which illustrate, by way of example, the principles of the present disclosure.

本開示は、添付の図面と併せて以下の説明を参照することによって最もよく理解することができる。 The present disclosure can be best understood by referring to the following description in conjunction with the accompanying drawings.

本開示の一実施形態による、フレーム期間の開始時のＶＳＹＮＣ信号の図である。 FIG. 1A is a diagram of a VSYNC signal at the beginning of a frame period, according to one embodiment of the present disclosure.
本開示の一実施形態による、ＶＳＹＮＣ信号の周波数の図である。 FIG. 1B is a diagram of the frequency of a VSYNC signal, according to one embodiment of the present disclosure.
本開示の一実施形態による、様々な構成において、１つ以上のクラウドゲーミングサーバと１つ以上のクライアントデバイスとの間でネットワークを介してゲームを提供するためのシステムの図であり、ＶＳＹＮＣ信号は、同期及びオフセットされ、一方向レイテンシを低減することができる。 FIG. 2A is a diagram of a system for providing gaming over a network between one or more cloud gaming servers and one or more client devices, in various configurations, where the VSYNC signals can be synchronized and offset to reduce one-way latency, according to one embodiment of the present disclosure.
本開示の一実施形態による、２つ以上のピアデバイス間でゲームを提供するための図であり、ＶＳＹＮＣ信号は、同期及びオフセットされ、コントローラ及びデバイス間の他の情報の受信の最適なタイミングを達成することができる。 FIG. 2B is a diagram for providing gaming between two or more peer devices, where the VSYNC signals can be synchronized and offset to achieve optimal timing for receiving controller and other information between the devices, according to one embodiment of the present disclosure.
本開示の一実施形態による、ソースデバイスとターゲットデバイスとの間のＶＳＹＮＣ信号の適切な同期及びオフセットから恩恵を得る様々なネットワーク構成を示す。 FIG. 2C illustrates various network configurations that benefit from proper synchronization and offsetting of VSYNC signals between a source device and a target device, according to one embodiment of the present disclosure.
本開示の一実施形態による、ソースデバイスとターゲットデバイスとの間のＶＳＹＮＣ信号の適切な同期及びオフセットから恩恵を得る、クラウドゲーミングサーバと複数のクライアントとの間のマルチテナンシ構成を示す。 FIG. 2D illustrates a multi-tenancy configuration between a cloud gaming server and multiple clients that benefits from proper synchronization and offsetting of VSYNC signals between a source device and a target device, according to one embodiment of the present disclosure.
本開示の一実施形態による、サーバ上で実行されるビデオゲームから生成されたビデオフレームをストリーミングするときのクロックドリフトによるクラウドゲーミングサーバとクライアントとの間の一方向レイテンシの変動を示す。 FIG. 3 illustrates the variation in one-way latency between a cloud gaming server and a client due to clock drift when streaming video frames generated from a video game executing on the server, according to one embodiment of the present disclosure.
サーバ上で実行されるビデオゲームから生成されたビデオフレームをストリーミングするときのクラウドゲーミングサーバとクライアントを含むネットワーク構成を示し、サーバとクライアントとの間のＶＳＹＮＣ信号は、同期かつオフセットされており、サーバ及びクライアントでの操作の重複を可能にし、サーバとクライアント間の一方向レイテンシを低減する。 FIG. 4 illustrates a network configuration including a cloud gaming server and a client when streaming video frames generated from a video game executing on the server, where the VSYNC signals between the server and the client are synchronized and offset to allow overlapping of operations at the server and the client and to reduce one-way latency between the server and the client.
本開示の一実施形態による、クラウドゲーミングの方法を示すフロー図であり、ビデオフレームのエンコードが、ネットワーク伝送速度及び信頼性、ならびに全体的なレイテンシ目標を認識したエンコーダパラメータの調節を含む。 FIG. 5 is a flow diagram illustrating a method for cloud gaming in which encoding of video frames includes adjusting encoder parameters with awareness of network transmission speed and reliability and of overall latency targets, according to one embodiment of the present disclosure.
本開示の一実施形態による、アプリケーション層で動作するストリーマ構成要素によるクライアントの帯域幅の測定を示す図であり、ストリーマは、エンコーダを監視し、調節するように構成されており、圧縮されたビデオフレームが、クライアントの測定された帯域幅内の速度で伝送され得るようになっている。 FIG. 6 is a diagram illustrating measurement of a client's bandwidth by a streamer component operating at the application layer, where the streamer is configured to monitor and tune the encoder so that compressed video frames can be transmitted at a rate within the measured bandwidth of the client, according to one embodiment of the present disclosure.
Ａは本開示の一実施形態による、クライアントにおける品質及びバッファ利用を最適化するためのエンコーダの量子化パラメータ（ＱＰ）の設定を示す図、Ｂは本開示の一実施形態による、クライアントによってサポートされる真の目標フレームサイズを超えるＩフレームの発生を低減するための、目標フレームサイズ、最大フレームサイズ、及び／またはＱＰ（例えば、最小ＱＰ及び／または最大ＱＰ）エンコーダ設定の調節を示す図である。 FIG. 7A is a diagram illustrating the setting of an encoder's quantization parameter (QP) to optimize quality and buffer utilization at the client, according to one embodiment of the present disclosure; and FIG. 7B is a diagram illustrating the adjustment of target frame size, maximum frame size, and/or QP (e.g., minimum QP and/or maximum QP) encoder settings to reduce the occurrence of I-frames that exceed the true target frame size supported by the client, according to one embodiment of the present disclosure.
本開示の一実施形態による、クラウドゲーミングの方法を示すフロー図であり、ビデオフレームのエンコードは、Ｉフレームをエンコードするときなど、エンコードが長い場合、または生成されているビデオフレームが大きい場合に、ビデオフレームをスキップするか、ビデオフレームのエンコード及び伝送を遅延させるかを決定することを含む。 FIG. 8 is a flow diagram illustrating a method for cloud gaming in which encoding of video frames includes deciding whether to skip a video frame or to delay the encoding and transmission of a video frame when encoding runs long or when the video frame being generated is large, such as when encoding an I-frame, according to one embodiment of the present disclosure.
Ａは本開示の一実施形態による、エンコーダによって圧縮されているビデオフレームのシーケンスを示しており、エンコーダは、クライアントのディスプレイの目標解像度に対してクライアント帯域幅が低いとき、Ｉフレームをエンコードした後にビデオフレームのエンコードをドロップし、Ｂは本開示の一実施形態による、エンコーダによって圧縮されているビデオフレームのシーケンスを示し、各シーケンスでは、ビデオフレームがＩフレームとしてエンコードされており、後続のビデオフレームも、クライアント帯域幅が、クライアントのディスプレイの目標解像度に対して中程度または高いときに、Ｉフレームのエンコードの遅延後にエンコードされ、Ｃは本開示の一実施形態による、エンコーダによって圧縮されているビデオフレームのシーケンスを示し、各シーケンスでは、ビデオフレームがＩフレームとしてエンコードされており、後続のビデオフレームも、クライアント帯域幅が、クライアントのディスプレイの目標解像度に対して中程度または高いときに、Ｉフレームのエンコードの遅延後にエンコードされている。 FIG. 9A illustrates a sequence of video frames being compressed by an encoder, where the encoder drops the encoding of a video frame after encoding an I-frame when the client bandwidth is low relative to the target resolution of the client's display, according to one embodiment of the present disclosure; FIG. 9B illustrates sequences of video frames being compressed by an encoder, where in each sequence a video frame is encoded as an I-frame and the subsequent video frames are also encoded, after the delay caused by encoding the I-frame, when the client bandwidth is medium or high relative to the target resolution of the client's display, according to one embodiment of the present disclosure; and FIG. 9C illustrates sequences of video frames being compressed by an encoder, where in each sequence a video frame is encoded as an I-frame and the subsequent video frames are also encoded, after the delay caused by encoding the I-frame, when the client bandwidth is medium or high relative to the target resolution of the client's display, according to one embodiment of the present disclosure.
本開示の様々な実施形態の態様を実行するために使用されることができる例示的なデバイスの構成要素を示している。 FIG. 10 illustrates components of an example device that can be used to perform aspects of the various embodiments of the present disclosure.

以下の詳細な説明は、例示の目的で多くの具体的な詳細を含むが、当業者は、以下の詳細に対する多くの変形及び変更が本開示の範囲内にあることを理解するであろう。従って、以下に説明する本開示の態様は、この説明に続く特許請求の範囲に対する一般性を失うことなく、また制限を課すことなく説明されている。 Although the following detailed description includes many specific details for purposes of illustration, one of ordinary skill in the art will appreciate that many variations and modifications to the following details are within the scope of the disclosure. Accordingly, the aspects of the disclosure described below are set forth without loss of generality to, and without imposing limitations on, the claims that follow this description.

一般的に言えば、本開示の様々な実施形態は、メディアコンテンツをストリーミングするとき（例えば、ビデオゲームからオーディオ及びビデオをストリーミングするとき）、ソースデバイスとターゲットデバイスとの間のレイテンシ及び／またはレイテンシの不安定性を低減するように構成された方法及びシステムを説明する。サーバにおいて複雑なフレーム（例えば、シーン変化）を生成するために必要になる追加の時間、サーバにおいて複雑なフレームのエンコード／圧縮にかかる時間増加、ネットワークを介した可変通信経路、及びクライアントにおいて複雑なフレームをデコードにかかる時間増加により、サーバとクライアント間の一方向レイテンシにレイテンシ不安定性が生じる可能性がある。レイテンシの不安定性はまた、サーバとクライアントのＶＳＹＮＣ信号の間にドリフトを生じさせるサーバとクライアントにおけるクロックの違いによっても導入され得る。
本開示の実施形態では、サーバとクライアントとの間の一方向レイテンシは、高性能のエンコード及びデコードを提供することによって、クラウドゲーミングアプリケーションにおいて低減され得る。ストリーミングメディア（例えば、ストリーミングビデオ、映画、クリップ、コンテンツ）を解凍するとき、解凍されたビデオのかなりの量をバッファリングすることが可能であり、従って、ストリーミングされたコンテンツを表示するときに、平均のデコード能力及びメトリクスに依存（例えば、６０Ｈｚで４Ｋメディアをサポートするには、デコードリソースの平均量に依存）することが可能である。
しかしながら、クラウドゲーミングでは、（単一フレームであっても）エンコード及び／またはデコード操作を実行する時間が長くなると、それに応じて一方向レイテンシがより高くなる。従って、クラウドゲーミングの場合、ストリーミングビデオアプリケーションの必要性と比較すると不要と見える強力なデコードリソース及びエンコードリソースを供給することが有益であり、リソースは、より長いか、または最も長い処理を必要とするフレームを処理する時間のために最適化される必要がある。本開示の他の実施形態では、クラウドゲーミングアプリケーションにおいてレイテンシとビデオ品質との間のトレードオフを改善するためにエンコーダ調節が実行され得る。
エンコーダ調節は、ネットワークの伝送速度と信頼性の認識、及び全体的なレイテンシ目標のなかで実行される。実施形態では、エンコードが長く実行されるか、または生成されたデータが大きい（例えば、圧縮されたＩフレームで両方の条件が発生し得る）場合、後続のフレームのエンコード及び伝送を遅延させるか、またはそれらをスキップするかどうかを判定するための方法が実行される。
実施形態では、量子化パラメータ（ＱＰ）値、目標フレームサイズ、及び最大フレームサイズの調節は、クライアントへの利用可能なネットワーク速度に基づいて実行されている。例えば、ネットワーク速度がより速い場合、ＱＰは低減され得る。
他の実施形態では、Ｉフレーム発生率の監視が実行され、ＱＰの設定に使用される。例えば、Ｉフレームの頻度が低い場合、ＱＰは低減されることができ（例えば、より高いエンコード精度、またはエンコードのより高い品質をもたらす）、ビデオ再生品質を犠牲にする一方で、一方向レイテンシを低く抑えるためにビデオフレームのエンコードがスキップされ得るようになっている。このように、高性能のエンコードとデコード、及びクラウドゲーミングアプリケーションのレイテンシとビデオ品質の間のトレードオフを改善するために実行されるエンコーダ調節は、クラウドゲーミングサーバとクライアントとの間の一方向レイテンシの低減、よりスムーズなフレームレート、及びより信頼性の高い及び／または一貫した一方向レイテンシへと繋がる。 Generally speaking, various embodiments of the present disclosure describe methods and systems configured to reduce latency and/or latency instability between a source device and a target device when streaming media content (e.g., streaming audio and video from a video game). Latency instability can occur in one-way latency between the server and the client due to the additional time required to generate a complex frame (e.g., a scene change) at the server, the increased time it takes to encode/compress the complex frame at the server, variable communication paths through the network, and the increased time it takes to decode the complex frame at the client. Latency instability can also be introduced by clock differences at the server and the client that cause drift between the server and the client's VSYNC signals.
In embodiments of the present disclosure, one-way latency between a server and a client can be reduced in cloud gaming applications by providing high performance encoding and decoding. When decompressing streaming media (e.g., streaming video, movies, clips, content), a significant amount of the decompressed video can be buffered, so displaying the streamed content can rely on average decode capability and metrics (e.g., relying on the average amount of decode resources to support 4K media at 60 Hz).
However, in cloud gaming, the longer it takes to perform encoding and/or decoding operations (even for a single frame), the higher the one-way latency will be. Therefore, for cloud gaming, it is beneficial to provide powerful decoding and encoding resources that may seem unnecessary compared to the needs of streaming video applications, and the resources need to be optimized for the time to process the frames that require longer or the longest processing. In other embodiments of the present disclosure, encoder tuning may be performed to improve the tradeoff between latency and video quality in cloud gaming applications.
Encoder adjustments are made with knowledge of the network transmission speed and reliability, and overall latency goals. In an embodiment, if the encoding is taking long or the data generated is large (e.g., both conditions can occur with compressed I-frames), methods are implemented to determine whether to delay the encoding and transmission of subsequent frames or to skip them.
In an embodiment, adjustments to the quantization parameter (QP) value, the target frame size, and the maximum frame size are performed based on the available network speed to the client, e.g., if the network speed is faster, the QP may be reduced.
In other embodiments, monitoring of the I-frame occurrence rate is performed and used to set the QP. For example, if the frequency of I-frames is low, the QP can be reduced (e.g., resulting in higher encoding accuracy or higher quality of encoding) such that encoding of video frames can be skipped to keep one-way latency low while sacrificing video playback quality. Thus, high performance encoding and decoding, and encoder adjustments performed to improve the tradeoff between latency and video quality for cloud gaming applications, can lead to reduced one-way latency between cloud gaming servers and clients, smoother frame rates, and more reliable and/or consistent one-way latency.
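The two QP rules described above (lower the QP when the network is faster, and lower it when I-frames are infrequent) can be sketched as a single adjustment function. This is an illustrative sketch under stated assumptions: the 0 to 51 QP range follows the convention of H.264/HEVC-style encoders, and the step size, the I-frame rarity threshold, and all names are hypothetical rather than taken from the disclosure.

```python
QP_MIN, QP_MAX = 0, 51  # assumed range, as in H.264/HEVC-style encoders


def adjust_qp(qp: int, network_bps: int, encoder_bps: int,
              iframes_per_100: int, rare_iframes: int = 2,
              step: int = 2) -> int:
    """Return an adjusted QP. A lower QP means finer quantization,
    which gives higher quality and larger encoded frames."""
    if network_bps > encoder_bps:
        # Network is faster than the current encoder bitrate:
        # spend the spare bandwidth on quality.
        qp -= step
    if iframes_per_100 < rare_iframes:
        # I-frames are infrequent, so large frames are rare:
        # quality can be raised further.
        qp -= step
    # Clamp to the valid range.
    return max(QP_MIN, min(QP_MAX, qp))


if __name__ == "__main__":
    print(adjust_qp(30, 20_000_000, 12_000_000, iframes_per_100=1))
```

The sketch only ever lowers the QP because those are the two cases the text describes; a fuller policy would presumably also raise it when bandwidth drops, which is left out here.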

上述の様々な実施形態の一般的な理解を伴って、実施形態の例示的な詳細が、様々な図面を参照して説明される。 With a general understanding of the various embodiments discussed above, exemplary details of the embodiments will now be described with reference to the various drawings.

明細書全体を通して、「ゲーム」または「ビデオゲーム」または「ゲーミングアプリケーション」への言及は、入力コマンドの実行を通じて指示される任意タイプのインタラクティブアプリケーションを表すことを意味する。例示的目的としてだけに、インタラクティブアプリケーションは、ゲーム、ワードプロセッシング、ビデオプロセッシング、ビデオゲームプロセッシングなどのためのアプリケーションを含む。さらに、上記で紹介した用語は互換性があるものである。 Throughout the specification, references to a "game" or a "video game" or a "gaming application" are meant to represent any type of interactive application that is directed through the execution of input commands. For illustrative purposes only, interactive applications include applications for games, word processing, video processing, video game processing, and the like. Furthermore, the terms introduced above are intended to be interchangeable.

クラウドゲーミングは、サーバでビデオゲームを実行して、ゲームでレンダリングされたビデオフレームを生成することを含み、これらは、次いで表示のためにクライアントに送信される。サーバとクライアントの両方での操作のタイミングは、それぞれの垂直同期（ＶＳＹＮＣ）パラメータに関連付けられ得る。ＶＳＹＮＣ信号が、サーバ及び／またはクライアント間で適切に同期及び／またはオフセットされると、サーバで実行される操作（例えば、１つ以上のフレーム期間にわたるビデオフレームの生成と伝送）は、クライアントで実行される操作（例えば、フレーム期間に対応する表示フレームまたはリフレッシュレートでディスプレイにビデオフレームを表示する）と同期される。特に、サーバで生成されたサーバＶＳＹＮＣ信号とクライアントで生成されたクライアントＶＳＹＮＣ信号は、サーバとクライアントでの操作を同期させるために使用され得る。つまり、サーバとクライアントのＶＳＹＮＣ信号が同期及び／またはオフセットされると、サーバは、クライアントがそれらのビデオフレームを表示する方法と同期してビデオフレームを生成及び送信する。 Cloud gaming involves running a video game on a server to generate game-rendered video frames, which are then transmitted to a client for display. The timing of operations on both the server and the client may be associated with respective vertical synchronization (VSYNC) parameters. When the VSYNC signals are properly synchronized and/or offset between the server and/or client, operations performed on the server (e.g., generating and transmitting video frames over one or more frame periods) are synchronized with operations performed on the client (e.g., displaying video frames on a display at a display frame or refresh rate corresponding to the frame periods). In particular, a server VSYNC signal generated on the server and a client VSYNC signal generated on the client may be used to synchronize operations on the server and the client. That is, when the server and client VSYNC signals are synchronized and/or offset, the server generates and transmits video frames in sync with the way the client will display those video frames.

サーバとクライアントとの間でメディアコンテンツをストリーミングするときにビデオフレームを生成し、それらのビデオフレームを表示するために、ＶＳＹＮＣ信号と垂直帰線区間（ＶＢＩ）が組み込まれている。例えば、サーバは、対応するサーバＶＳＹＮＣ信号で規定された１つ以上のフレーム期間でゲーム用にレンダリングされたビデオフレームを生成しようとし（例えば、フレーム期間が１６．７ｍｓの場合、フレーム期間ごとにビデオフレームを生成すると、６０Ｈｚの操作になり、２つのフレーム期間ごとに１つのビデオフレームを生成すると、３０Ｈｚの操作になる）、その後、そのビデオフレームをエンコードしてクライアントに伝送する。クライアントでは、受信したエンコードされたビデオフレームがデコードされて表示され、クライアントは、対応するクライアントＶＳＹＮＣで始まる表示用にレンダリングされた各ビデオフレームを表示する。 The VSYNC signal and vertical blanking interval (VBI) are incorporated to generate and display video frames when streaming media content between the server and the client. For example, the server will attempt to generate rendered video frames for a game at one or more frame periods defined by the corresponding server VSYNC signal (e.g., if the frame period is 16.7 ms, generating a video frame every frame period results in 60 Hz operation, while generating one video frame every two frame periods results in 30 Hz operation), and then encode and transmit the video frames to the client. At the client, the received encoded video frames are decoded and displayed, and the client displays each video frame rendered for display beginning with the corresponding client VSYNC.
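The relationship between frame period and frame rate in the example above is a reciprocal, which a short helper makes explicit. The function name is an assumption introduced for illustration.

```python
def frame_rate_hz(frame_period_ms: float, periods_per_frame: int = 1) -> float:
    """Frame rate when one video frame is generated every
    `periods_per_frame` frame periods of `frame_period_ms` each."""
    return 1000.0 / (frame_period_ms * periods_per_frame)


if __name__ == "__main__":
    # One frame per 16.7 ms period is roughly 60 Hz operation;
    # one frame per two periods is roughly 30 Hz operation.
    print(round(frame_rate_hz(16.7)), round(frame_rate_hz(16.7, 2)))
```

The results are rounded because 16.7 ms is itself a rounded value of 1000/60 ms.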

例示のために、図１Ａは、ＶＳＹＮＣ信号１１１が、フレーム期間の開始をどのように示し得るかを示しており、ここで、様々な操作が、サーバ及び／またはクライアントにおいて、対応するフレーム期間中に実行され得る。メディアコンテンツをストリーミングするとき、サーバは、ビデオフレームを生成及びエンコードするためにサーバＶＳＹＮＣ信号を使用することができ、クライアントは、ビデオフレームを表示するためにクライアントＶＳＹＮＣ信号を使用することができる。ＶＳＹＮＣ信号１１１は、図１Ｂに示されるように、規定されたフレーム期間１１０に対応する規定された周波数で生成される。さらに、ＶＢＩ１０５は、最後のラスターラインが、前のフレーム期間のためにディスプレイ上に描かれたときと、最初のラスターライン（例えば、上）が、ディスプレイに描かれたときとの間の期間を規定する。図示されるように、ＶＢＩ１０５の後、表示のためにレンダリングされたビデオフレームは、ラスタースキャンライン１０６を介して表示される（例えば、左から右へのラスターラインごとのラスターライン）。 For illustration purposes, FIG. 1A shows how a VSYNC signal 111 may indicate the start of a frame period, where various operations may be performed during the corresponding frame period at the server and/or client. When streaming media content, the server may use the server VSYNC signal to generate and encode video frames, and the client may use the client VSYNC signal to display the video frames. The VSYNC signal 111 is generated at a specified frequency corresponding to a specified frame period 110, as shown in FIG. 1B. Additionally, the VBI 105 defines the period between when the last raster line is drawn on the display for the previous frame period and when the first raster line (e.g., top) is drawn on the display. As shown, after the VBI 105, the video frame rendered for display is displayed via raster scan line 106 (e.g., raster line by raster line from left to right).

さらに、本開示の様々な実施形態は、メディアコンテンツ（例えば、ビデオゲームコンテンツ）をストリーミングするときなどの、ソースデバイスとターゲットデバイスとの間の一方向レイテンシ及び／またはレイテンシ不安定性を低減するために開示される。説明のみを目的として、一方向レイテンシ及び／またはレイテンシ不安定性を低減するための様々な実施形態が、サーバとクライアントのネットワーク構成の中で説明されている。しかしながら、図２Ａ～図２Ｄに示されるように、一方向レイテンシ及び／またはレイテンシ不安定性を低減するために開示された様々な技術は、他のネットワーク構成内で、及び／またはピアツーピアネットワーク上で実装され得ることが理解される。例えば、一方向レイテンシ及び／またはレイテンシ不安定性を低減するために開示される様々な実施形態は、（例えば、サーバとクライアント、サーバとサーバ、サーバと複数のクライアント、サーバと複数のサーバ、クライアントとクライアント、クライアントと複数のクライアント、など）様々な構成で１つ以上のサーバとクライアントデバイスとの間に実装され得る。 Furthermore, various embodiments of the present disclosure are disclosed for reducing one-way latency and/or latency instability between a source device and a target device, such as when streaming media content (e.g., video game content). For purposes of illustration only, the various embodiments for reducing one-way latency and/or latency instability are described within a server and client network configuration. However, it is understood that the various techniques disclosed for reducing one-way latency and/or latency instability may be implemented within other network configurations and/or over peer-to-peer networks, as shown in FIGS. 2A-2D. For example, the various embodiments disclosed for reducing one-way latency and/or latency instability may be implemented between one or more server and client devices in various configurations (e.g., server and client, server and server, server and multiple clients, server and multiple servers, client and client, client and multiple clients, etc.).

図２Ａは、様々な構成で、ネットワーク２５０を介して１つ以上のクラウドゲーミングネットワーク２９０及び／またはサーバ２６０と１つ以上のクライアントデバイス２１０との間にゲームを提供するためのシステム２００Ａの図であり、本開示の一実施形態によれば、サーバ及びクライアントのＶＳＹＮＣ信号は同期及びオフセットされることができ、及び／または、動的バッファリングがクライアント上で実行されており、及び／または、サーバ上でのエンコード及び伝送操作が重複されることができ、及び／または、クライアントでの受信及びデコード操作が重複されることができ、及び／または、クライアント上でのデコード及び表示操作が重複され、サーバ２６０とクライアント２１０との間の一方向レイテンシを短縮することができる。
特に、本開示の一実施形態によれば、システム２００Ａは、クラウドゲームネットワーク２９０を介してゲームを提供し、ゲームは、ゲームをプレイしている対応するユーザのクライアントデバイス２１０（例えば、シンクライアント）から遠隔で実行されている。システム２００Ａは、シングルプレイヤーモードまたはマルチプレイヤーモードのいずれかで、ネットワーク２５０を介してクラウドゲームネットワーク２９０を通じて１つ以上のゲームをプレイする１人以上のユーザにゲーム制御を提供することができる。いくつかの実施形態では、クラウドゲーミングネットワーク２９０は、ホストマシンのハイパーバイザー上で実行される複数の仮想マシン（ＶＭ）を含み得、１つ以上の仮想マシンは、ホストマシンのハイパーバイザーに利用可能なハードウェアリソースを利用してゲームプロセッサモジュールを実行するように構成される。ネットワーク２５０は、１つ以上の通信技術を含み得る。いくつかの実施形態では、ネットワーク２５０は、高度な無線通信システムを有する第５世代（５Ｇ）ネットワーク技術を含み得る。 FIG. 2A is a diagram of a system 200A for providing gaming over a network 250 between one or more cloud gaming networks 290 and/or servers 260 and one or more client devices 210, in various configurations, where, according to one embodiment of the present disclosure, server and client VSYNC signals can be synchronized and offset, and/or dynamic buffering can be performed on the client, and/or encode and transmit operations on the server can be overlapped, and/or receive and decode operations on the client can be overlapped, and/or decode and display operations on the client can be overlapped, to reduce one-way latency between the server 260 and the client 210.
In particular, according to one embodiment of the present disclosure, the system 200A provides games via a cloud gaming network 290, the games being executed remotely from the client devices 210 (e.g., thin clients) of corresponding users playing the games. The system 200A can provide game control to one or more users playing one or more games through the cloud gaming network 290 via the network 250, in either single player or multiplayer mode. In some embodiments, the cloud gaming network 290 can include multiple virtual machines (VMs) executing on a hypervisor of a host machine, the one or more virtual machines being configured to execute a game processor module utilizing hardware resources available to the hypervisor of the host machine. The network 250 can include one or more communication technologies. In some embodiments, the network 250 can include a fifth generation (5G) network technology having an advanced wireless communication system.

いくつかの実施形態では、通信は、無線技術を使用して促進され得る。そのような技術は、例えば、５Ｇ無線通信技術を含み得る。５Ｇは、セルラーネットワーク技術の第５世代である。５Ｇネットワークは、デジタルセルラーネットワークであり、プロバイダーがカバーするサービスエリアは、セルと呼ばれる小さな地理的エリアに分割されている。音と画像を表すアナログ信号は、電話でデジタル化され、アナログ－デジタルコンバータによって変換され、ビットのストリームとして伝送される。セル内のすべての５Ｇ無線デバイスは、他のセルで再利用される周波数のプールからトランシーバによって割り当てられた周波数チャネルを介して、セル内のローカルアンテナアレイ及び低電力自動トランシーバ（伝送機及び受信機）を用いて電波で通信する。ローカルアンテナは、高帯域幅の光ファイバまたは無線バックホール接続によって電話網及びインターネットに接続される。他のセルネットワークと同様に、あるセルから別のセルに移動するモバイルデバイスは、新しいセルに自動的に転送される。５Ｇネットワークは、通信ネットワークの単なる一例示的タイプであり、本開示の実施形態は、５Ｇに続く後続世代の有線技術または無線技術と同様に、前世代の無線通信または有線通信を利用することができることを理解されたい。 In some embodiments, communication may be facilitated using wireless technology. Such technologies may include, for example, 5G wireless communication technology. 5G is the fifth generation of cellular network technology. 5G networks are digital cellular networks, where the service area covered by a provider is divided into small geographic areas called cells. Analog signals representing sound and images are digitized by the telephone, converted by an analog-to-digital converter, and transmitted as a stream of bits. All 5G wireless devices within a cell communicate over the airwaves using a local antenna array and a low-power automatic transceiver (transmitter and receiver) within the cell over a frequency channel assigned by the transceiver from a pool of frequencies reused by other cells. The local antennas are connected to the telephone network and the Internet by high-bandwidth optical fiber or wireless backhaul connections. As with other cellular networks, mobile devices moving from one cell to another are automatically transferred to the new cell. It should be understood that 5G networks are merely one exemplary type of communication network, and that embodiments of the present disclosure may utilize previous generations of wireless or wired communication, as well as subsequent generations of wired or wireless technologies following 5G.

図示されるように、クラウドゲーミングネットワーク２９０は、複数のビデオゲームへのアクセスを提供するゲームサーバ２６０を含む。ゲームサーバ２６０は、クラウドで利用可能な任意タイプのサーバコンピューティングデバイスであり得、そして１つ以上のホスト上で実行される１つ以上の仮想マシンとして構成され得る。例えば、ゲームサーバ２６０は、ユーザのためにゲームのインスタンスをインスタンス化するゲームプロセッサをサポートする仮想マシンを管理することができる。このように、複数の仮想マシンに関連付けられたゲームサーバ２６０の複数のゲームプロセッサは、複数のユーザのゲームプレイに関連付けられた１つ以上のゲームの複数のインスタンスを実行するように構成されている。
そのようにして、バックエンドサーバサポートは、複数のゲームアプリケーションのゲームプレイのメディア（例えば、ビデオ、オーディオなど）のストリーミングを複数の対応するユーザに提供する。つまり、ゲームサーバ２６０は、ネットワーク２５０を介してデータ（例えば、対応するゲームプレイのレンダリングされた画像及び／またはフレーム）を対応するクライアントデバイス２１０にストリーミングして戻すように構成されている。そのようにして、計算が複雑なゲームアプリケーションは、クライアントデバイス２１０によって受信及び転送されたコントローラ入力に応答して、バックエンドサーバで実行され得る。各サーバは、画像やフレームをレンダリングすることができ、これらは次いでエンコード（例えば、圧縮）され、対応するクライアントデバイスに表示のためにストリーミングされ得る。 As shown, cloud gaming network 290 includes game server 260 that provides access to multiple video games. Game server 260 may be any type of server computing device available in the cloud and may be configured as one or more virtual machines running on one or more hosts. For example, game server 260 may manage virtual machines that support game processors that instantiate instances of games for users. In this manner, multiple game processors of game server 260 associated with multiple virtual machines are configured to run multiple instances of one or more games associated with gameplay for multiple users.
In that manner, the backend server support provides streaming of gameplay media (e.g., video, audio, etc.) for multiple game applications to multiple corresponding users. That is, game server 260 is configured to stream data (e.g., rendered images and/or frames of the corresponding gameplay) back to the corresponding client device 210 over the network 250. In that manner, a computationally complex game application may be executed at the backend server in response to controller inputs received and forwarded by the client device 210. Each server may render images and/or frames that are then encoded (e.g., compressed) and streamed to the corresponding client device for display.

例えば、複数のユーザは、ストリーミングメディアを受信するように構成された対応するクライアントデバイス２１０を使用して、通信ネットワーク２５０を介してクラウドゲームネットワーク２９０にアクセスすることができる。一実施形態では、クライアントデバイス２１０は、計算機能（例えば、ゲームタイトル処理エンジン２１１を含む）を提供するように構成されたバックエンドサーバ（例えば、クラウドゲームネットワーク２９０のゲームサーバ２６０）とのインターフェースを提供するシンクライアントとして構成され得る。
別の実施形態では、クライアントデバイス２１０は、ビデオゲームの少なくともいくつかのローカル処理のためのゲームタイトル処理エンジン及びゲームロジックで構成され得、バックエンドで実行されるビデオゲームによって生成されるストリーミングコンテンツを受信するために、または、バックエンドサーバサポートによって提供されるその他のコンテンツのためにさらに利用され得る。ローカル処理に対して、ゲームタイトル処理エンジンは、ビデオゲームを実行するための基本的なプロセッサベースの機能と、ビデオゲームに関連するサービスと、を含む。ゲームロジックは、ローカルクライアントデバイス２１０に格納され、ビデオゲームを実行するために使用される。 For example, multiple users may access cloud gaming network 290 over communications network 250 using corresponding client devices 210 configured to receive streaming media. In one embodiment, client devices 210 may be configured as thin clients that interface with backend servers (e.g., game servers 260 of cloud gaming network 290) configured to provide computing functionality (e.g., including game title processing engine 211).
In another embodiment, client device 210 may be configured with a game title processing engine and game logic for at least some local processing of a video game, and may be further utilized to receive streaming content generated by the video game running in a back-end, or for other content provided by back-end server support. For local processing, the game title processing engine includes basic processor-based functionality for running the video game and services related to the video game. The game logic is stored on the local client device 210 and is used to run the video game.

特に、対応するユーザ（図示せず）のクライアントデバイス２１０は、インターネットなどの通信ネットワーク２５０を介してゲームへのアクセスを要求するように、及びゲームサーバ２６０によって実行されるビデオゲームによって生成される表示画像をレンダリングするように構成され、エンコードされた画像は、対応するユーザに関連付けて表示するためにクライアントデバイス２１０に配信されている。
例えば、ユーザは、クライアントデバイス２１０を介して、ゲームサーバ２６０のゲームプロセッサ上で実行されているビデオゲームのインスタンスと対話し得る。より具体的には、ビデオゲームのインスタンスは、ゲームタイトル処理エンジン２１１によって実行されている。ビデオゲームを実装する対応するゲームロジック（例えば、実行可能コード）２１５は、データストア（図示せず）に格納され、データストアを介してアクセス可能であり、ビデオゲームを実行するために使用されている。ゲームタイトル処理エンジン２１１は、複数のゲームロジックを使用して複数のビデオゲームをサポートすることができ、それらの各々がユーザによって選択可能である。 In particular, a client device 210 of a corresponding user (not shown) is configured to request access to the game over a communications network 250, such as the Internet, and to render display images generated by the video game executed by a game server 260, the encoded images being delivered to the client device 210 for display in association with the corresponding user.
For example, a user may interact, via a client device 210, with an instance of a video game running on a game processor of a game server 260. More specifically, the instance of the video game is executed by a game title processing engine 211. Corresponding game logic (e.g., executable code) 215 implementing the video game is stored in and accessible via a data store (not shown) and is used to execute the video game. The game title processing engine 211 may support multiple video games using multiple game logics, each of which is selectable by a user.

例えば、クライアントデバイス２１０は、ゲームプレイを駆動するために使用される入力コマンドを介してなど、対応するユーザのゲームプレイに関連してゲームタイトル処理エンジン２１１と相互作用するように構成される。特に、クライアントデバイス２１０は、ゲームコントローラ、タブレットコンピュータ、キーボード、ビデオカメラよってキャプチャされたジェスチャ、マウス、タッチパッドなどの様々なタイプの入力デバイスからの入力を受信することができる。
クライアントデバイス２１０は、ネットワーク２５０を介してゲームサーバ２６０に接続することができる少なくともメモリ及びプロセッサモジュールを有する任意タイプのコンピューティングデバイスであり得る。バックエンドゲームタイトル処理エンジン２１１は、レンダリングされた画像を生成するように構成され、レンダリングされた画像は、クライアントデバイス２１０に関連付けられた対応するディスプレイに表示するためにネットワーク２５０を介して配信される。
例えば、クラウドベースのサービスを通じて、ゲーム用にレンダリングされた画像は、ゲームサーバ２６０のゲーム実行エンジン２１１上で実行される対応するゲームのインスタンスによって配信され得る。つまり、クライアントデバイス２１０は、エンコードされた画像（例えば、ビデオゲームの実行を通じて生成されたゲームレンダリング画像からエンコードされた）を受信し、表示１１のためにレンダリングされた画像を表示するように構成される。一実施形態では、ディスプレイ１１は、ＨＭＤを含む（例えば、ＶＲコンテンツを表示する）。いくつかの実施形態では、レンダリングされた画像は、クラウドベースのサービスから直接的に、またはクライアントデバイス２１０（例えば、プレイステーション（登録商標）リモートプレイ）を介して、無線または有線でスマートフォンまたはタブレットにストリーミングされ得る。 For example, client device 210 is configured to interact with game title processing engine 211 in connection with a corresponding user's gameplay, such as via input commands used to drive the gameplay. In particular, client device 210 can receive input from various types of input devices, such as a game controller, a tablet computer, a keyboard, gestures captured by a video camera, a mouse, a touchpad, etc.
Client device 210 may be any type of computing device having at least a memory and a processor module that can connect to game server 260 via network 250. Backend game title processing engine 211 is configured to generate rendered images that are distributed over network 250 for display on corresponding displays associated with client device 210.
For example, through a cloud-based service, game-rendered images may be delivered by an instance of the corresponding game running on the game execution engine 211 of the game server 260. That is, the client device 210 is configured to receive encoded images (e.g., encoded from the game-rendered images generated through execution of the video game) and to display the rendered images on the display 11. In one embodiment, the display 11 includes an HMD (e.g., displaying VR content). In some embodiments, the rendered images may be streamed to a smartphone or tablet, wirelessly or over a wired connection, either directly from the cloud-based service or via the client device 210 (e.g., PlayStation® Remote Play).

一実施形態では、ゲームサーバ２６０及び／またはゲームタイトル処理エンジン２１１は、ゲーム及びゲームアプリケーションに関連するサービスを実行するための基本的なプロセッサベースの機能を含む。例えば、プロセッサベースの機能は、２Ｄまたは３Ｄレンダリング、物理学、物理学シミュレーション、スクリプト、オーディオ、アニメーション、グラフィックス処理、ライティング、シェーディング、ラスター化、レイトレーシング、シャドウイング、カリング、変換、人工知能、等を含む。さらに、ゲームアプリケーションのサービスは、メモリ管理、マルチスレッド管理、サービス品質（ＱｏＳ）、帯域幅テスト、ソーシャルネットワーキング、ソーシャルフレンドの管理、フレンドのソーシャルネットワークとの通信、通信チャネル、テキストメッセージ、インスタントメッセージング、チャットサポートなどを含む。 In one embodiment, the game server 260 and/or the game title processing engine 211 include basic processor-based functionality for executing services related to games and game applications. For example, the processor-based functionality includes 2D or 3D rendering, physics, physics simulation, scripting, audio, animation, graphics processing, lighting, shading, rasterization, ray tracing, shadowing, culling, transformation, artificial intelligence, etc. Additionally, the game application services include memory management, multi-thread management, quality of service (QoS), bandwidth testing, social networking, social friend management, communication with friends' social networks, communication channels, text messaging, instant messaging, chat support, etc.

一実施形態では、クラウドゲーミングネットワーク２９０は、分散型ゲームサーバシステム及び／またはアーキテクチャである。特に、ゲームロジックを実行する分散型ゲームエンジンは、対応するゲームの対応するインスタンスとして構成される。一般に、分散型ゲームエンジンは、ゲームエンジンの各機能を取得し、多数の処理エンティティによる実行のために、それらの機能を分散する。個々の機能は、１つ以上の処理エンティティにわたって、さらに分散され得る。処理エンティティは、物理ハードウェアを含むさまざまな構成で、及び／または、仮想構成要素または仮想マシンとして、及び／または、仮想コンテナとして構成され得、コンテナは、仮想化されたオペレーティングシステム上で実行中のゲームアプリケーションのインスタンスを仮想化することから、仮想マシンとは異なっている。
処理エンティティは、クラウドゲーミングネットワーク２９０の１つ以上のサーバ（コンピューティングノード）上のサーバ及びその基礎となるハードウェアを利用及び／または依存することができ、サーバは、１つ以上のラックに配置され得る。様々な処理エンティティへのこれらの機能の実行の調整、割り当て、及び管理は、分散同期層によって実行されている。そのようにして、それらの機能の実行が分散同期層によって制御されて、プレイヤーによるコントローラ入力に応答して、ゲームアプリケーション用のメディア（例えば、ビデオフレーム、オーディオなど）を生成することが可能になる。分散同期層は、分散処理エンティティにわたって、（例えば、負荷均衡を通じて）これらの機能を効率的に実行でき、重要なゲームエンジン構成要素／機能が分散され、より効率的な処理のために再構築されるようになっている。 In one embodiment, cloud gaming network 290 is a distributed game server system and/or architecture. In particular, a distributed game engine that executes game logic is configured as a corresponding instance of a corresponding game. In general, a distributed game engine takes each function of the game engine and distributes them for execution by multiple processing entities. Individual functions may be further distributed across one or more processing entities. The processing entities may be configured in a variety of configurations, including physical hardware, and/or as virtual components or virtual machines, and/or as virtual containers, which differ from virtual machines because containers virtualize instances of game applications running on a virtualized operating system.
The processing entities may utilize and/or rely on one or more servers (computing nodes) of the cloud gaming network 290 and their underlying hardware, and the servers may be located in one or more racks. Coordination, assignment, and management of the execution of these functions across the various processing entities are performed by a distributed synchronization layer. In that manner, execution of those functions is controlled by the distributed synchronization layer to enable generation of media (e.g., video frames, audio, etc.) for the game application in response to controller inputs by the player. The distributed synchronization layer can execute these functions efficiently across the distributed processing entities (e.g., through load balancing), such that critical game engine components/functions are distributed and reassembled for more efficient processing.
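As one way to picture how a synchronization layer might assign game engine functions to processing entities through load balancing, the sketch below uses a greedy least-loaded assignment. This is an assumption made for illustration only: the disclosure does not specify a balancing algorithm, and the function name `assign_functions`, the cost values, and the entity names are hypothetical.

```python
import heapq


def assign_functions(function_costs, entity_names):
    """Greedy least-loaded placement (illustrative assumption, not the
    disclosed method): the most expensive function is placed first, always
    on the processing entity with the smallest accumulated load."""
    heap = [(0.0, name) for name in entity_names]  # (accumulated load, entity)
    heapq.heapify(heap)
    placement = {}
    for func, cost in sorted(function_costs.items(), key=lambda kv: -kv[1]):
        load, name = heapq.heappop(heap)           # least-loaded entity so far
        placement[func] = name
        heapq.heappush(heap, (load + cost, name))
    return placement


# Hypothetical per-frame costs (arbitrary units) for a few engine functions.
costs = {"rendering": 8.0, "physics": 5.0, "audio": 2.0, "ai": 3.0}
print(assign_functions(costs, ["node-a", "node-b"]))
# {'rendering': 'node-a', 'physics': 'node-b', 'ai': 'node-b', 'audio': 'node-a'}
```

With these example costs, both entities end up with a load of 10 units, which is the kind of even distribution load balancing aims for.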

ゲームタイトル処理エンジン２１１は、中央処理装置（ＣＰＵ）と、マルチテナンシＧＰＵ機能を実行するように構成され得るグラフィックス処理装置（ＧＰＵ）グループと、を含む。別の実施形態では、複数のＧＰＵデバイスが組み合わせられて、対応するＣＰＵ上で実行されている単一アプリケーションのグラフィックス処理を実行する。 The game title processing engine 211 includes a central processing unit (CPU) and a group of graphics processing units (GPUs) that may be configured to perform multi-tenancy GPU functions. In another embodiment, multiple GPU devices are combined to perform graphics processing for a single application running on a corresponding CPU.

図２Ｂは、本開示の一実施形態による、２つ以上のピアデバイス間でゲームを提供するための図であり、ＶＳＹＮＣ信号は、同期及びオフセットされ、コントローラ及びデバイス間の他の情報の受信の最適なタイミングを達成することができる。例えば、直接対決ゲームは、ネットワーク２５０を介して、またはピアツーピア通信（例えば、Ｂｌｕｅｔｏｏｔｈ（登録商標）、ローカルエリアネットワーキングなど）を介して直接接続された２つ以上のピアデバイスを使用して実行され得る。 FIG. 2B is a diagram for providing gaming between two or more peer devices, according to one embodiment of the present disclosure, where the VSYNC signals can be synchronized and offset to achieve optimal timing of the receipt of controller and other information between the devices. For example, a head-to-head game can be played using two or more peer devices connected via network 250 or directly via peer-to-peer communication (e.g., Bluetooth, local area networking, etc.).

図示されるように、ゲームは、ビデオゲームをプレイしている対応するユーザのクライアントデバイス２１０（例えば、ゲームコンソール）の各々でローカルに実行されており、クライアントデバイス２１０は、ピアツーピアネットワーキングを介して通信する。例えば、ビデオゲームのインスタンスは、対応するクライアントデバイス２１０のゲームタイトル処理エンジン２１１によって実行されている。ビデオゲームを実装するゲームロジック２１５（例えば、実行可能コード）は、対応するクライアントデバイス２１０に格納され、ゲームを実行するために使用されている。説明の目的で、ゲームロジック２１５は、ポータブル媒体（例えば、光媒体）を介して、またはネットワークを介して（例えば、インターネットを介してゲームプロバイダからダウンロードされる）、対応するクライアントデバイス２１０に配信され得る。 As shown, the game is running locally on each of the client devices 210 (e.g., game consoles) of the corresponding users playing the video game, and the client devices 210 communicate via peer-to-peer networking. For example, an instance of the video game is executed by a game title processing engine 211 of the corresponding client device 210. Game logic 215 (e.g., executable code) implementing the video game is stored on the corresponding client device 210 and is used to run the game. For purposes of illustration, the game logic 215 may be distributed to the corresponding client device 210 via a portable medium (e.g., optical medium) or over a network (e.g., downloaded from a game provider via the Internet).

一実施形態では、対応するクライアントデバイス２１０のゲームタイトル処理エンジン２１１は、ゲーム及びゲームアプリケーションに関連するサービスを実行するための基本的なプロセッサベースの機能を含む。例えば、プロセッサベースの機能は、２Ｄまたは３Ｄレンダリング、物理学、物理学シミュレーション、スクリプト、オーディオ、アニメーション、グラフィックス処理、ライティング、シェーディング、ラスター化、レイトレーシング、シャドウイング、カリング、変換、人工知能、等を含む。さらに、ゲームアプリケーションのサービスは、メモリ管理、マルチスレッド管理、サービス品質（ＱｏＳ）、帯域幅テスト、ソーシャルネットワーキング、ソーシャルフレンドの管理、フレンドのソーシャルネットワークとの通信、通信チャネル、テキストメッセージ、インスタントメッセージング、チャットサポートなどを含む。 In one embodiment, the game title processing engine 211 of the corresponding client device 210 includes basic processor-based functionality for executing services related to games and game applications. For example, the processor-based functionality includes 2D or 3D rendering, physics, physics simulation, scripting, audio, animation, graphics processing, lighting, shading, rasterization, ray tracing, shadowing, culling, transformation, artificial intelligence, and the like. Additionally, the game application services include memory management, multi-thread management, quality of service (QoS), bandwidth testing, social networking, social friend management, communication with friends' social network, communication channels, text messaging, instant messaging, chat support, and the like.

クライアントデバイス２１０は、ゲームコントローラ、タブレットコンピュータ、キーボード、ビデオカメラによってキャプチャされたジェスチャ、マウス、タッチパッドなど、様々なタイプの入力デバイスから入力を受信することができる。クライアントデバイス２１０は、少なくともメモリ及びプロセッサモジュールを有する任意タイプのコンピューティングデバイスであり得、ゲームタイトル処理エンジン２１１によって実行されるレンダリング画像を生成し、ディスプレイ（例えば、ディスプレイ１１、またはヘッドマウントディスプレイ（ＨＭＤ）を含むディスプレイ１１など）にレンダリング画像を表示するように構成される。
例えば、レンダリングされた画像は、クライアントデバイス２１０上でローカルに実行されるゲームのインスタンスに関連付けられ得、ゲームプレイを駆動するために使用される入力コマンドなどを介して、対応するユーザのゲームプレイを実装する。クライアントデバイス２１０のいくつかの例には、パーソナルコンピュータ（ＰＣ）、ゲームコンソール、ホームシアターデバイス、汎用コンピュータ、モバイルコンピューティングデバイス、タブレット、電話、またはゲームのインスタンスを実行することができる任意の他のタイプのコンピューティングデバイスが含まれる。 Client device 210 can receive input from various types of input devices, such as a game controller, a tablet computer, a keyboard, gestures captured by a video camera, a mouse, a touchpad, etc. Client device 210 can be any type of computing device having at least a memory and a processor module, and is configured to generate rendered images that are executed by game title processing engine 211 and display the rendered images on a display (e.g., display 11, or display 11 including a head mounted display (HMD), etc.).
For example, the rendered images may be associated with an instance of a game running locally on client device 210, implementing the corresponding user's gameplay, such as via input commands used to drive the gameplay. Some examples of client device 210 include a personal computer (PC), a game console, a home theater device, a general purpose computer, a mobile computing device, a tablet, a phone, or any other type of computing device capable of running an instance of a game.

図２Ｃは、本開示の実施形態による、ソースデバイスとターゲットデバイスとの間のＶＳＹＮＣ信号の適切な同期及びオフセットから恩恵を得る、図２Ａ～図２Ｂに示される構成を含む、様々なネットワーク構成を示している。特に、様々なネットワーク構成は、サーバとクライアントのＶＳＹＮＣ信号の周波数の適切なアライメント、及びサーバとクライアント間の一方向レイテンシ及び／またはレイテンシ変動性を低減する目的のサーバとクライアントのＶＳＹＮＣ信号のタイミングオフセットの恩恵を得る。例えば、１つのネットワークデバイス構成は、クラウドゲーミングサーバ（例えば、ソース）からクライアント（ターゲット）への構成を含む。
一実施形態では、クライアントは、ウェブブラウザ内でオーディオ及びビデオ通信を提供するように構成されたウェブＲＴＣクライアントを含み得る。別のネットワーク構成は、クライアント（例えば、ソース）からサーバ（ターゲット）への構成を含む。さらに別のネットワーク構成は、サーバ（例えば、ソース）からサーバ（例えば、ターゲット）への構成を含む。別のネットワークデバイス構成は、クライアント（例えば、ソース）からクライアント（ターゲット）への構成を含み、クライアントは各々が、例えば、直接対決ゲームを提供するためのゲームコンソールであり得る。 FIG. 2C illustrates various network configurations, including those illustrated in FIGS. 2A-2B, that benefit from proper synchronization and offsetting of VSYNC signals between source and target devices in accordance with embodiments of the present disclosure. In particular, the various network configurations benefit from proper alignment of the frequency of the server and client VSYNC signals and timing offsets of the server and client VSYNC signals to reduce one-way latency and/or latency variability between the server and client. For example, one network device configuration includes a cloud gaming server (e.g., source) to client (target) configuration.
In one embodiment, the client may include a WebRTC client configured to provide audio and video communication within a web browser. Another network configuration includes a client (e.g., source) to server (target) configuration. Yet another network configuration includes a server (e.g., source) to server (e.g., target) configuration. Another network device configuration includes a client (e.g., source) to client (target) configuration, where each client may be, for example, a game console for providing head-to-head gaming.

特に、ＶＳＹＮＣ信号のアラインメントは、サーバＶＳＹＮＣ信号とクライアントＶＳＹＮＣ信号の周波数の同期を含み得、ドリフトを除去する目的のため、及び／または、一方向レイテンシ及び／またはレイテンシ変動性を低減する目的で、サーバとクライアントのＶＳＹＮＣ信号間の理想的な関係を維持するために、クライアントＶＳＹＮＣ信号とサーバＶＳＹＮＣ信号との間のタイミングオフセットを調整することも含み得る。
適切なアライメントを達成するために、一実施形態では、サーバ２６０とクライアント２１０のペアとの間の適切なアライメントを実施するために、サーバＶＳＹＮＣ信号が調節され得る。
別の実施形態では、クライアントＶＳＹＮＣ信号は、サーバ２６０とクライアント２１０のペアとの間の適切なアライメントを実施するために調節され得る。クライアントとサーバのＶＳＹＮＣ信号がアライメントされると、サーバＶＳＹＮＣ信号とクライアントＶＳＹＮＣ信号は、実質的に同一周波数で発生し、随時調整できるタイミングオフセットによって互いにオフセットされる。
別の実施形態では、ＶＳＹＮＣ信号のアライメントは、２つのクライアントのＶＳＹＮＣの周波数を同期させることを含み得、ドリフトを除去する目的で、それらのＶＳＹＮＣ信号の間のタイミングオフセットを調整すること、及び／またはコントローラ及びその他の情報の受信の適切なタイミングを達成することも含み得、いずれのＶＳＹＮＣ信号も、このアライメントを達成するように調節され得る。
さらに別の実施形態では、アラインメントは、複数のサーバのＶＳＹＮＣの周波数を同期させることを含み得、また、サーバＶＳＹＮＣ信号及びクライアントＶＳＹＮＣ信号の周波数を同期し、クライアントＶＳＹＮＣ信号とサーバＶＳＹＮＣ信号との間のタイミングオフセットを、例えば、直接対決クラウドゲーミングのために調整することを含み得る。サーバからクライアントへの構成、及びクライアントからクライアントへの構成では、アラインメントは、サーバＶＳＹＮＣ信号とクライアントＶＳＹＮＣ信号の間の周波数の同期、及びサーバＶＳＹＮＣ信号とクライアントＶＳＹＮＣ信号との間の適切なタイミングオフセットの提供の両方が含まれ得る。サーバからサーバへの構成では、アライメントは、タイミングオフセットを設定することを伴わないサーバＶＳＹＮＣ信号とクライアントＶＳＹＮＣ信号との間の周波数の同期を含み得る。 In particular, alignment of the VSYNC signals may include synchronizing the frequencies of the server VSYNC signal and the client VSYNC signal, and may also include adjusting the timing offset between the client VSYNC signal and the server VSYNC signal to maintain an ideal relationship between the server and client VSYNC signals for the purposes of eliminating drift and/or reducing one-way latency and/or latency variability.
To achieve proper alignment, in one embodiment, the server VSYNC signal may be adjusted to implement proper alignment between the server 260 and client 210 pair.
In another embodiment, the client VSYNC signal may be adjusted to implement proper alignment between the server 260 and client 210 pair. Once the client and server VSYNC signals are aligned, the server VSYNC signal and the client VSYNC signal occur at substantially the same frequency and are offset from one another by a timing offset that can be adjusted from time to time.
In another embodiment, alignment of the VSYNC signals may include synchronizing the frequency of the VSYNC of two clients, adjusting the timing offset between their VSYNC signals to eliminate drift, and/or achieving proper timing of receipt of controller and other information, and either VSYNC signal may be adjusted to achieve this alignment.
In yet another embodiment, alignment may include synchronizing the frequency of VSYNC of multiple servers, and may also include synchronizing the frequency of the server VSYNC signal and the client VSYNC signal and adjusting the timing offset between the client VSYNC signal and the server VSYNC signal, for example, for head-to-head cloud gaming. In server-to-client and client-to-client configurations, alignment may include both synchronizing the frequency between the server VSYNC signal and the client VSYNC signal and providing a proper timing offset between the server VSYNC signal and the client VSYNC signal. In a server-to-server configuration, alignment may include synchronizing the frequency between the server VSYNC signal and the client VSYNC signal without setting a timing offset.
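The two parts of the alignment described above (frequency synchronization and a timing offset) can be sketched as a small calculation over observed VSYNC timestamps. This is a minimal sketch under stated assumptions: the timestamps are taken to be expressed on a shared clock, the client VSYNC is the signal being adjusted, and the names `mean_period` and `alignment_correction` are hypothetical. The disclosure does not prescribe how the corrections are computed or applied.

```python
from statistics import mean


def mean_period(ticks):
    """Average spacing (ms) between consecutive VSYNC timestamps."""
    return mean(later - earlier for earlier, later in zip(ticks, ticks[1:]))


def alignment_correction(server_ticks, client_ticks, target_offset_ms):
    """Return (period_correction_ms, phase_correction_ms) for the client VSYNC.

    Assumes both timestamp lists are on a common clock (an assumption of
    this sketch, not something the text guarantees)."""
    server_period = mean_period(server_ticks)
    client_period = mean_period(client_ticks)
    # Frequency synchronization: remove drift by matching the server period.
    period_correction = server_period - client_period
    # Current offset of the latest client VSYNC after the latest server VSYNC,
    # wrapped into one server frame period.
    offset = (client_ticks[-1] - server_ticks[-1]) % server_period
    # Timing offset adjustment: shift needed to reach the desired offset.
    phase_correction = target_offset_ms - offset
    return period_correction, phase_correction


server = [0.0, 16.0, 32.0, 48.0]   # hypothetical server VSYNC times (ms)
client = [5.0, 21.5, 38.0, 54.5]   # hypothetical client VSYNC times (ms), drifting
print(alignment_correction(server, client, target_offset_ms=8.0))  # (-0.5, 1.5)
```

In this example the client period is 0.5 ms too long, and its VSYNC currently trails the server VSYNC by 6.5 ms, so a further shift of 1.5 ms would reach the 8 ms target offset.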

図２Ｄは、本開示の一実施形態による、ソースデバイスとターゲットデバイスとの間のＶＳＹＮＣ信号の適切な同期及びオフセットから恩恵がもたらされるクラウドゲーミングサーバ２６０と１つ以上のクライアント２１０との間のマルチテナンシ構成を示す。サーバからクライアントへの構成では、アライメントは、サーバＶＳＹＮＣ信号とクライアントＶＳＹＮＣ信号との間の周波数の同期、及びサーバＶＳＹＮＣ信号とクライアントＶＳＹＮＣ信号との間の適切なタイミングオフセットの提供の両方を含み得る。マルチテナンシ構成では、一実施形態では、サーバ２６０とクライアント２１０のペアとの間の適切なアライメントを実施するために、クライアントＶＳＹＮＣ信号が各クライアント２１０において調節されている。 FIG. 2D illustrates a multi-tenancy configuration between a cloud gaming server 260 and one or more clients 210 that benefits from proper synchronization and offset of VSYNC signals between source and target devices, according to one embodiment of the present disclosure. In a server-to-client configuration, alignment may include both synchronizing the frequency between the server VSYNC signal and the client VSYNC signal, and providing a proper timing offset between the server VSYNC signal and the client VSYNC signal. In a multi-tenancy configuration, in one embodiment, the client VSYNC signal is adjusted at each client 210 to achieve proper alignment between the server 260 and client 210 pair.

例えば、グラフィックスサブシステムは、マルチテナンシＧＰＵ機能を実行するように構成され得、一実施形態では、１つのグラフィックスサブシステムは、グラフィックスの実装、及び／または複数のゲームのためのパイプラインのレンダリングである場合がある。つまり、グラフィックサブシステムは、実行されている複数のゲーム間で共有される。特に、ゲームタイトル処理エンジンは、マルチテナンシＧＰＵ機能を実行するように構成されたＣＰＵ及びＧＰＵグループを含み得、一実施形態では、１つのＣＰＵ及びＧＰＵグループは、グラフィックスの実装、及び／または複数のゲームのためのパイプラインのレンダリングである場合がある。つまり、ＣＰＵ及びＧＰＵグループは、実行されている複数のゲーム間で共有される。ＣＰＵ及びＧＰＵグループは、１つ以上の処理デバイスとして構成されることができる。別の実施形態では、複数のＧＰＵデバイスが組み合わせられて、対応するＣＰＵ上で実行されている単一アプリケーションのグラフィックス処理を実行する。 For example, the graphics subsystem may be configured to perform multi-tenancy GPU functions, and in one embodiment, one graphics subsystem may be the implementation of the graphics and/or rendering pipeline for multiple games. That is, the graphics subsystem is shared between multiple games being executed. In particular, the game title processing engine may include a CPU and GPU group configured to perform multi-tenancy GPU functions, and in one embodiment, one CPU and GPU group may be the implementation of the graphics and/or rendering pipeline for multiple games. That is, the CPU and GPU group is shared between multiple games being executed. The CPU and GPU group may be configured as one or more processing devices. In another embodiment, multiple GPU devices are combined to perform graphics processing for a single application running on a corresponding CPU.

図３は、サーバでビデオゲームを実行してゲーム用にレンダリングされたビデオフレームを生成し、それらのビデオフレームを表示のためにクライアントに送信する一般的なプロセスを示している。従来的に、ゲームサーバ２６０及びクライアント２１０における、いくつかの操作は、それぞれのＶＳＹＮＣ信号によって規定されるフレーム期間内に実行される。例えば、サーバ２６０は、対応するサーバＶＳＹＮＣ信号３１１によって規定されるような１つ以上のフレーム期間において、３０１においてゲーム用にレンダリングされたビデオフレームを生成しようと努める。
ビデオフレームは、操作３５０で入力デバイスから配信される制御情報（例えば、ユーザの入力コマンド）、または制御情報によって駆動されないゲームロジックのいずれかに応答して、ゲームによって生成される。伝送ジッター３５１は、制御情報をサーバ２６０に送信するときに存在し得、ジッター３５１は、クライアントからサーバへのネットワークレイテンシの変動（例えば、入力コマンドを送信するとき）を測定する。
図示されるように、太い矢印は、制御情報をサーバ２６０に送信するときの現在の遅延を示しているが、ジッターのために、サーバ２６０において制御情報のための到着時間の範囲（例えば、点線矢印で囲まれた範囲）があり得る。フリップ時間３０９で、ＧＰＵは、対応するビデオフレームが完全に生成され、サーバ２６０のフレームバッファに配置されたことを示すフリップコマンドに到達する。その後、サーバ２６０は、サーバＶＳＹＮＣ信号３１１（ＶＢＩは明確化のために省略されている）によって規定される後続のフレーム期間にわたって、そのビデオフレームに対してスキャンアウト／スキャンイン（操作３０２、そこでスキャンアウトはＶＳＹＮＣ信号３１１とアライメントされ得る）を実行する。続いて、ビデオフレームがエンコード（操作３０３）され（例えば、ＶＳＹＮＣ信号３１１の発生後にエンコードが開始され、エンコードの終了はＶＳＹＮＣ信号とアライメントされなくてもよい）、クライアント２１０へ伝送（操作３０４、ここでは伝送はＶＳＹＮＣ信号３１１とアライメントされなくてもよい）される。
クライアント２１０において、エンコードされたビデオフレームは、受信され（操作３０５、ここでは受信はクライアントＶＳＹＮＣ信号３１２とアライメントされなくてもよい）、デコードされ（操作３０６、ここではデコードはクライアントＶＳＹＮＣ信号３１２とアライメントされなくてもよい）、バッファリングされ、表示される（操作３０７、ここでは表示の開始は、クライアントのＶＳＹＮＣ信号３１２とアライメントされなくてもよい）。特に、クライアント２１０は、クライアントＶＳＹＮＣ信号３１２の対応する発生で開始する表示のためにレンダリングされた各ビデオフレームを表示する。 FIG. 3 illustrates a general process for running a video game on a server to generate rendered video frames for the game and send those video frames to a client for display. Traditionally, a number of the operations at the game server 260 and client 210 are performed within a frame period defined by the respective VSYNC signal. For example, the server 260 seeks to generate a rendered video frame for the game at 301 in one or more frame periods as defined by a corresponding server VSYNC signal 311.
The video frames are generated by the game in response to either control information delivered from an input device (e.g., a user's input command) in operation 350, or game logic that is not driven by control information. Transmission jitter 351 may be present when sending control information to the server 260, where jitter 351 measures the variation in network latency from the client to the server (e.g., when sending an input command).
As shown, the thick arrow indicates the current delay in sending the control information to the server 260, but due to jitter there may be a range of arrival times for the control information at the server 260 (e.g., the range bounded by the dotted arrows). At flip time 309, the GPU reaches a flip command indicating that the corresponding video frame has been completely generated and placed in the frame buffer at the server 260. The server 260 then performs scan-out/scan-in (operation 302, where the scan-out may be aligned with the VSYNC signal 311) for that video frame over the subsequent frame period defined by the server VSYNC signal 311 (the VBI is omitted for clarity). The video frame is then encoded (operation 303) (e.g., encoding begins after an occurrence of the VSYNC signal 311, and the end of encoding need not be aligned with the VSYNC signal) and transmitted (operation 304, where the transmission need not be aligned with the VSYNC signal 311) to the client 210.
At client 210, the encoded video frames are received (operation 305, where reception may not be aligned with client VSYNC signal 312), decoded (operation 306, where decoding may not be aligned with client VSYNC signal 312), buffered, and displayed (operation 307, where start of display may not be aligned with client's VSYNC signal 312). In particular, client 210 displays each video frame rendered for display beginning with the corresponding occurrence of client VSYNC signal 312.
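Because each decoded frame is displayed beginning with an occurrence of the client VSYNC signal, the display start can be pictured as the first client VSYNC at or after decode completion. The sketch below illustrates that rounding only; it ignores any additional buffering, and the function name `display_start_ms` and its parameters are hypothetical.

```python
import math


def display_start_ms(decode_done_ms, client_vsync0_ms, frame_period_ms):
    """Time of the first client VSYNC at or after decode completion.

    `client_vsync0_ms` is the time of some reference client VSYNC; later
    VSYNCs are assumed to occur at exact multiples of the frame period."""
    periods = math.ceil((decode_done_ms - client_vsync0_ms) / frame_period_ms)
    return client_vsync0_ms + max(periods, 0) * frame_period_ms


# Client VSYNCs at 2, 18, 34, 50, ... ms (hypothetical 16 ms frame period).
print(display_start_ms(40.0, 2.0, 16.0))  # 50.0 (waits 10 ms for the next VSYNC)
print(display_start_ms(34.0, 2.0, 16.0))  # 34.0 (decode finished exactly on a VSYNC)
```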

One-way latency 315 may be defined as the latency from the start of the transfer of a video frame to an encoding unit (e.g., scanout 302) at the server to the start of display 307 of the video frame at the client. That is, one-way latency is the time from server scanout to client display, taking into account client buffering. An individual frame has a latency from the start of scanout 302 to the completion of decode 306, and this latency may vary from frame to frame due to high variability in server operations such as encoding 303 and transmission 304, network transmission with jitter 352 between the server 260 and the client 210, and client receive 305.
As shown, the thick straight arrow indicates the current latency in transmitting the corresponding video frame to the client 210, but due to jitter 352, there may be a range of arrival times of the video frames at the client 210 (e.g., the range demarcated by the dotted arrows). Because one-way latency needs to be relatively stable (e.g., fairly consistent) to achieve a good playback experience, traditionally, buffering 320 is performed, with the result that the display of individual frames with low latency (e.g., from the start of scanout 302 to the completion of decode 306) is delayed for several frame periods. That is, in the presence of network instability or unpredictable encoding/decoding times, additional buffering is required to ensure that one-way latency remains constant.
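By way of non-limiting illustration only, the traditional buffering described above can be sketched in Python. The sketch assumes a 60 Hz display and assumes that the constant one-way latency is rounded up to a whole number of frame periods; the function name, the sample latencies, and the rounding rule are hypothetical and are not taken from the disclosure.

```python
import math

FRAME_PERIOD_MS = 1000.0 / 60.0  # assumed 60 Hz display


def buffered_display_delay(latencies_ms, frame_period_ms=FRAME_PERIOD_MS):
    """Return (constant one-way latency, per-frame buffer wait), both in ms.

    The constant latency is the smallest whole number of frame periods that
    covers the slowest frame, so every frame is decoded before it is shown.
    """
    periods = math.ceil(max(latencies_ms) / frame_period_ms)
    fixed_latency = periods * frame_period_ms
    waits = [fixed_latency - latency for latency in latencies_ms]
    return fixed_latency, waits


# Hypothetical scanout-start to decode-complete latencies for four frames.
latencies = [12.0, 14.5, 31.0, 13.2]
fixed, waits = buffered_display_delay(latencies)
```

In this simplified model a single slow frame (31 ms) forces every low-latency frame to wait in the buffer, which is the cost that the embodiments described below seek to reduce.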

According to an embodiment of the present disclosure, one-way latency between a cloud gaming server and a client may vary due to clock drift when streaming video frames generated from a video game running on the server. That is, differences in frequency between the server VSYNC signal 311 and the client VSYNC signal 312 may result in the client VSYNC signal drifting relative to the frames arriving from the server 260. This drift may be due to very slight differences in the crystal oscillators used in the respective clocks at the server and the client. Additionally, embodiments of the present disclosure reduce one-way latency by performing one or more of synchronization and offsetting of the VSYNC signals for alignment between the server and the client, by providing dynamic buffering at the client, by overlapping the encoding and transmission of video frames at the server, by overlapping the receiving and decoding of video frames at the client, and by overlapping the decoding and display of video frames at the client.
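By way of non-limiting illustration only, the effect of a small frequency difference between the two VSYNC signals can be estimated with the following Python sketch. The frequencies used (60.00 Hz and 59.99 Hz) and the function names are assumptions chosen for the example and do not appear in the disclosure.

```python
def drift_ms_per_frame(server_hz, client_hz):
    """Difference between the client and server frame periods, in ms."""
    return 1000.0 / client_hz - 1000.0 / server_hz


def seconds_to_slip_one_frame(server_hz, client_hz):
    """Time until the two VSYNC signals differ by one full frame count."""
    diff = abs(server_hz - client_hz)
    return float("inf") if diff == 0 else 1.0 / diff


per_frame = drift_ms_per_frame(60.0, 59.99)
slip_time = seconds_to_slip_one_frame(60.0, 59.99)
```

With these assumed values the per-frame drift is only a few microseconds, yet the accumulated drift reaches one full frame after roughly 100 seconds, which is why the frequencies are synchronized rather than left to free-run.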

Further, during encoding of the video frames (operation 303), in previous techniques, the encoder determines how much change there is between the current video frame being encoded and one or more previously encoded frames to determine whether there is a scene change (e.g., a complex image in the corresponding generated video frame). That is, a scene change hint can be inferred from the difference between the current frame being encoded and the previous frames already encoded. When streaming content from a server to a client over a network, the encoder at the server can decide to encode a video frame that is detected as a scene change with higher complexity. Otherwise, the encoder encodes a video frame that is not detected as a scene change with lower complexity.
However, scene change detection in the encoder may take up to one frame period (e.g., adding jitter), such that a video frame is initially encoded (in a first frame period) with lower complexity, but then re-encoded (in a second frame period) with higher complexity once it is determined that there is a scene change. Similarly, scene change detection may be triggered unnecessarily (such as by a small explosion in the image), because the difference between the currently encoded video frame and the previously encoded video frame may exceed a threshold difference value even if there is no scene change. Thus, when a scene change is detected in the encoder, additional latency due to jitter is introduced in the encoder to accommodate performing scene change detection and re-encoding the video frame with higher complexity.
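By way of non-limiting illustration only, the following Python sketch shows how a threshold on the frame-to-frame difference can be triggered without a true scene change. The difference metric (mean absolute luma difference), the threshold value, and the tiny 100-pixel frames are assumptions made for the example; actual encoders may use different measures.

```python
def mean_abs_diff(prev_frame, cur_frame):
    """Average absolute luma difference between two equally sized frames."""
    total = sum(abs(a - b) for a, b in zip(prev_frame, cur_frame))
    return total / len(cur_frame)


def is_scene_change(prev_frame, cur_frame, threshold=30.0):
    """Flag a scene change when the difference exceeds the threshold."""
    return mean_abs_diff(prev_frame, cur_frame) > threshold


base = [50] * 100                    # previous frame, uniform luma
explosion = [255] * 16 + [50] * 84   # same scene, small bright region
new_scene = [200] * 100              # genuinely different scene

false_trigger = is_scene_change(base, explosion)
true_trigger = is_scene_change(base, new_scene)
```

Both comparisons exceed the assumed threshold, so a detector of this kind cannot distinguish the small bright region from the genuine scene change.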

FIG. 4 illustrates the flow of data through a network configuration including a highly optimized cloud gaming server 260 and a highly optimized client 210 when streaming video frames generated from a video game running on the server, where, according to an embodiment of the present disclosure, overlapping of server and client operations reduces one-way latency, and synchronization and offsetting of VSYNC signals between the server and client reduces one-way latency while also reducing the variability of one-way latency between the server and client.
In particular, FIG. 4 illustrates a desired alignment between the server VSYNC signal and the client VSYNC signal. In one embodiment, adjustment of the server VSYNC signal 311 is performed to obtain proper alignment between the server VSYNC signal and the client VSYNC signal, such as in a server and client network configuration.
In another embodiment, adjustment of the client VSYNC signal 312 is performed to obtain proper alignment between the server VSYNC signal and the client VSYNC signal, such as in a multi-tenant server to multiple client network configuration. For purposes of illustration, adjustment of the server VSYNC signal 311 is described in FIG. 4 for synchronizing the frequencies of the server and client VSYNC signals and/or adjusting the timing offset between corresponding client and server VSYNC signals, but it will be understood that the client VSYNC signal 312 may be used for the adjustment instead.
In the context of this patent application, "synchronization" should be interpreted to mean adjusting the signals so that their frequencies match, although their phases may differ, and "offset" should be interpreted to mean the time delay between the signals, such as the time between when one signal reaches its maximum value and when the other signal reaches its maximum value.

As illustrated, FIG. 4 shows an improved process for executing a video game at a server to generate rendered video frames and send those video frames to a client for display, in an embodiment of the present disclosure. The process is illustrated with respect to the generation and display of a single video frame at the server and the client. In particular, the server generates a rendered video frame for the game at 401. For example, the server 260 includes a CPU (e.g., game title processing engine 211) configured to execute the game. The CPU generates one or more draw calls for the video frame, which include commands that are placed in a command buffer for execution by a corresponding GPU of the server 260 in a graphics pipeline.
The graphics pipeline may include one or more shader programs that operate on vertices of objects in a scene to generate texture values that are rendered into a video frame for display, with this operation being performed in parallel across the GPU for efficiency. At flip time 409, the GPU reaches a flip command in the command buffer, indicating that the corresponding video frame has been completely generated and/or rendered and placed into a frame buffer at the server 260.

At 402, the server performs a scanout of the rendered video frame for the game to the encoder. In particular, the scanout is performed scanline by scanline or in groups of consecutive scanlines, where a scanline refers to a single horizontal line of the display, for example, from one edge of the screen to the other. These scanlines or groups of consecutive scanlines are sometimes referred to as slices, and are referred to herein as screen slices. In particular, the scanout 402 may include several processes that modify the rendered frame for the game, including overlaying it with another frame buffer, or shrinking it in order to surround it with information from another frame buffer. During the scanout 402, the modified video frame is then scanned into the encoder for compression. In one embodiment, the scanout 402 is performed at the occurrence 311a of the VSYNC signal 311. In other embodiments, the scanout 402 may be performed before the occurrence of the VSYNC signal 311, such as at flip time 409.

At 403, the rendered video frame for the game (possibly modified) is encoded at the encoder, encoder slice by encoder slice, to generate one or more encoded slices, where the encoded slices are unrelated to scanlines or screen slices. In this manner, the encoder generates one or more encoded (e.g., compressed) slices. In one embodiment, the encoding process begins before the scanout 402 process is fully completed for the corresponding video frame.
Additionally, the start and/or end of the encode 403 may or may not be aligned with the server VSYNC signal 311. The boundaries of an encoded slice are not limited to a single scanline; an encoded slice may consist of a single scanline or multiple scanlines. Additionally, the end of an encoded slice and/or the start of the next encoder slice need not occur at the edge of the display screen (e.g., it may occur in the middle of the screen or in the middle of a scanline), and an encoded slice need not traverse the display screen completely from edge to edge. As illustrated, one or more encoded slices may be compressed and/or encoded, including a compressed "encoded slice A" shown with hash marks.

At 404, the encoded video frame is transmitted from the server to the client, and this transmission may be done encoded slice by encoded slice, with each encoded slice being a compressed encoder slice. In one embodiment, the transmission process 404 begins before the encoding process 403 is fully completed for the corresponding video frame. Furthermore, the start and/or end of the transmission 404 may or may not be aligned with the server VSYNC signal 311. As shown, compressed encoded slice A is transmitted to the client independently of the other compressed encoder slices for the rendered video frame. The encoder slices may be transmitted one at a time or in parallel.

At 405, the client receives the compressed video frame, again encoded slice by encoded slice. Additionally, the start and/or end of reception 405 may or may not be aligned with the client VSYNC signal 312. As shown, compressed encoded slice A is received by the client. Transmission jitter 452 may exist between the server 260 and the client 210, where jitter 452 measures the variation in network latency from the server 260 to the client 210.
A lower jitter value indicates a more stable connection. As shown, the thick straight arrow indicates the current latency in transmitting the corresponding video frame to the client 210, but due to jitter there may be a range of arrival times for the video frame at the client 210 (e.g., the range enclosed by the dotted arrow). The variation in latency may also be due to one or more operations at the server, such as encoding 403 and transmitting 404, as well as to network issues that introduce latency when transmitting the video frames to the client 210.

At 406, the client decodes the compressed video frame, again encoded slice by encoded slice, to generate a decoded slice A (shown without hash marks) that is now ready for display. In one embodiment, the decoding process 406 begins before the receiving process 405 is fully completed for the corresponding video frame. Furthermore, the start and/or end of the decoding 406 may or may not be aligned with the client VSYNC signal 312. At 407, the client displays the decoded rendered video frame on a display at the client. That is, the decoded video frame is placed in a display buffer that is streamed out to a display device, e.g., scanline by scanline.
In one embodiment, the display process 407 (i.e., streaming out to a display device) begins after the decode process 406 is fully completed for the corresponding video frame, i.e., after the decoded video frame is fully resident in the display buffer.
In another embodiment, the display process 407 starts before the decode process 406 is fully completed for the corresponding video frame. That is, the stream-out to the display device starts at an address in the display buffer at a time when only a portion of the decoded frame resides in the display buffer. The display buffer is then updated or filled with the remaining portions of the corresponding video frame in time for display, such that each display buffer update is performed before the corresponding portion is streamed out to the display. Additionally, the start and/or end of display 407 is aligned with the client VSYNC signal 312.
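By way of non-limiting illustration only, the benefit of overlapping slice-based operations can be estimated with the following Python sketch. It compares a frame-at-a-time flow, in which each stage waits for the whole frame, with a slice-pipelined flow, in which each stage starts on a slice as soon as the previous stage has finished that slice. The per-slice stage costs, the number of slices, and the omission of network latency are simplifying assumptions made for the example.

```python
def pipeline_finish_time(num_slices, stage_ms, overlapped=True):
    """Time at which the last slice leaves the last stage, in ms.

    stage_ms lists the per-slice cost of each stage in order, e.g.
    [encode, transmit, receive, decode].
    """
    if not overlapped:
        # Each stage processes every slice before the next stage starts.
        return num_slices * sum(stage_ms)
    ready = [0.0] * num_slices  # when each slice is available to a stage
    for cost in stage_ms:
        stage_free_at = 0.0
        for i in range(num_slices):
            # A slice starts once it has arrived and the stage is idle.
            start = max(ready[i], stage_free_at)
            stage_free_at = start + cost
            ready[i] = stage_free_at
    return ready[-1]


stages = [2.0, 1.0, 1.0, 2.0]  # hypothetical per-slice costs in ms
sequential = pipeline_finish_time(4, stages, overlapped=False)
overlapped = pipeline_finish_time(4, stages, overlapped=True)
```

Under these assumed costs the overlapped flow finishes the frame in half the time of the sequential flow, because later stages work on early slices while earlier stages are still processing later slices.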

In one embodiment, one-way latency 416 between the server 260 and the client 210 may be defined as the elapsed time between when scanout 402 begins and when display 407 begins. Embodiments of the present disclosure may align the VSYNC signals between the server and client (e.g., synchronize frequency and adjust offset) to reduce the one-way latency between the server and client and to reduce the variability of that one-way latency.
For example, embodiments of the present disclosure can calculate an optimal adjustment to the offset 430 between the server VSYNC signal 311 and the client VSYNC signal 312, so that even with near worst-case times required for server processing such as encoding 403 and transmitting 404, near worst-case network latency between the server 260 and the client 210, and near worst-case client processing such as receiving 405 and decoding 406, the decoded and rendered video frames are available in time for the display process 407. In other words, it is not necessary to determine the absolute offset between the server VSYNC and the client VSYNC; it is sufficient to adjust the offset so that the decoded and rendered video frames are available in time for the display process.
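By way of non-limiting illustration only, one possible way of choosing such an offset adjustment from relative measurements is sketched below in Python. The sketch assumes that the client records, for each frame, the time at which decoding completed relative to the client VSYNC occurrence at which the frame was to be displayed (negative values mean the frame was ready early). The quantile, the safety margin, and the function names are hypothetical and are not specified by the disclosure.

```python
def near_worst_case(samples, quantile=0.99):
    """Return the sample at the given quantile of the sorted samples."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def offset_adjustment_ms(ready_minus_vsync_ms, margin_ms=1.0, quantile=0.99):
    """Suggested change to the server-to-client VSYNC offset, in ms.

    Positive: frames risk arriving late, so increase the offset.
    Negative: frames arrive with slack, so the offset can be reduced.
    """
    return near_worst_case(ready_minus_vsync_ms, quantile) + margin_ms


# Hypothetical measurements: decode-complete time minus target VSYNC time.
samples = [-6.0, -5.5, -4.0, -5.0, -1.5, -5.2, -4.8, -3.9, -5.1, -4.4]
adjustment = offset_adjustment_ms(samples)
```

Because only the time of readiness relative to the client VSYNC is used, the sketch never needs the absolute offset between the two VSYNC signals, consistent with the observation above.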

In particular, the frequencies of the server VSYNC signal 311 and the client VSYNC signal 312 may be adjusted by synchronization. The synchronization may be achieved by adjusting the server VSYNC signal 311 or the client VSYNC signal 312. For illustrative purposes, the adjustment is described with respect to the server VSYNC signal 311, but it is understood that the adjustment may be performed on the client VSYNC signal 312 instead. For example, as shown in FIG. 4, the server frame period 410 (e.g., the time between two occurrences 311c and 311d of the server VSYNC signal 311) is substantially equal to the client frame period 415 (e.g., the time between two occurrences 312a and 312b of the client VSYNC signal 312), which indicates that the frequencies of the server VSYNC signal 311 and the client VSYNC signal 312 are also substantially equal.

To maintain synchronization of the frequencies of the server and client VSYNC signals, the timing of the server VSYNC signal 311 may be manipulated. For example, the vertical blanking interval (VBI) of the server VSYNC signal 311 may be increased or decreased over a period of time, for example, to account for drift between the server VSYNC signal 311 and the client VSYNC signal 312. Manipulation of the vertical blanking (VBLANK) lines in the VBI provides for adjusting the number of scanlines used for VBLANK for one or more frame periods of the server VSYNC signal 311. A decrease in the number of VBLANK scanlines shortens the corresponding frame period (e.g., time interval) between two occurrences of the server VSYNC signal 311. Conversely, an increase in the number of VBLANK scanlines lengthens the corresponding frame period (e.g., time interval) between two occurrences of the server VSYNC signal 311. In that way, the frequency of the server VSYNC signal 311 is adjusted so that the server and client VSYNC signals 311 and 312 have substantially the same frequency. Also, the offset between the server and client VSYNC signals can be adjusted by increasing or decreasing the VBI for a short period of time before returning it to its original value.
In one embodiment, the server VBI is adjusted. In another embodiment, the client VBI is adjusted. In yet another embodiment, instead of two devices (server and client), there may be multiple connected devices, each of which may have a corresponding VBI that is adjusted. In one embodiment, each of the multiple connected devices may be an independent peer device (e.g., without a server device). In another embodiment, the multiple devices may include one or more server devices and/or one or more client devices arranged in one or more server/client architectures, a multi-tenant server/client(s) architecture, or some combination thereof.
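By way of non-limiting illustration only, the relationship between the number of VBLANK scanlines and the frame period can be sketched in Python as follows. The numbers correspond to the common 1080p 60 Hz video timing (148.5 MHz pixel clock, 2200 total pixels per line, 1080 active lines, 45 blanking lines) and are used here purely as an assumed example; the function names are hypothetical.

```python
def refresh_hz(pixel_clock_hz, h_total, v_active, v_blank):
    """VSYNC frequency for a given timing: clock / (pixels per frame)."""
    return pixel_clock_hz / (h_total * (v_active + v_blank))


def line_time_us(pixel_clock_hz, h_total):
    """Duration of one scanline in microseconds."""
    return h_total / pixel_clock_hz * 1e6


CLOCK = 148.5e6  # assumed pixel clock in Hz
nominal = refresh_hz(CLOCK, 2200, 1080, 45)
slower = refresh_hz(CLOCK, 2200, 1080, 46)  # one extra VBLANK line
faster = refresh_hz(CLOCK, 2200, 1080, 44)  # one fewer VBLANK line
step = line_time_us(CLOCK, 2200)            # phase shift per added line
```

In this assumed timing, each added or removed VBLANK line changes the frame period by one line time (about 15 microseconds), so holding the change for many frames adjusts the frequency, while applying it briefly and then restoring the original value shifts only the offset.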

Alternatively, the server's pixel clock (e.g., located in the southbridge of the server's northbridge/southbridge core logic chipset, or, in the case of a discrete GPU, generated by the GPU itself using its own hardware) can, in one embodiment, be manipulated to perform coarse and/or fine adjustments to the frequency of the server VSYNC signal 311 over a period of time to bring the frequencies of the server VSYNC signal 311 and the client VSYNC signal 312 back into synchronization.
Specifically, the pixel clock in the server southbridge may be overclocked or underclocked to adjust the overall frequency of the server VSYNC signal 311. In that way, the frequency of the server VSYNC signal 311 is adjusted so that the server and client VSYNC signals 311 and 312 have substantially the same frequency. The offset between the server and client VSYNC signals can be adjusted by briefly increasing or decreasing the server pixel clock before restoring the pixel clock to its original value. In one embodiment, the server pixel clock is adjusted.
In another embodiment, the client pixel clock is adjusted. In yet another embodiment, instead of two devices (a server and a client), there may be multiple connected devices, each of which may have a corresponding pixel clock that is adjusted. In one embodiment, each of the multiple connected devices may be an independent peer device (e.g., without a server device). In another embodiment, the multiple connected devices may include one or more server devices and one or more client devices arranged in one or more server/client architectures, a multi-tenant server/client(s) architecture, or some combination thereof.
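By way of non-limiting illustration only, the pixel clock needed to reach a given VSYNC frequency can be computed as in the following Python sketch. The frame geometry (2200 total pixels per line, 1125 total lines) and the measured client frequency of 59.99 Hz are assumed values for the example, and the function name is hypothetical.

```python
def pixel_clock_for(target_hz, h_total, v_total):
    """Pixel clock (Hz) that yields target_hz for a fixed frame geometry."""
    return target_hz * h_total * v_total


nominal_clock = pixel_clock_for(60.0, 2200, 1125)
matched_clock = pixel_clock_for(59.99, 2200, 1125)  # assumed client rate
underclock_ratio = matched_clock / nominal_clock
```

Under these assumptions, matching a client running at 59.99 Hz requires underclocking the server pixel clock by less than 0.02 percent, which suggests why fine adjustment of the clock is sufficient to remove drift.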

In one embodiment, high performance codecs (e.g., encoders and/or decoders) may be used to further reduce one-way latency between the cloud gaming server and the client. In traditional streaming systems involving streaming of compressed media (e.g., streaming movies, television shows, videos, etc.), when the streaming media is decompressed at the end target (e.g., the client), a significant amount of the decompressed video may be buffered at the client to deal with variability in the encoding operation (e.g., longer encoding times), jitter that degrades transmission quality, and variability in the decoding operation (e.g., longer decoding times). Thus, traditional streaming systems may rely on average decoding capabilities and metrics (e.g., average decoding resources), because the buffered decoded content absorbs latency variability, allowing video frames to be displayed at the desired rate (e.g., supporting 4K media at 60 Hz, or displaying a video frame at every occurrence of the client VSYNC signal).

However, in cloud gaming environments, buffering is highly limited (e.g., moving toward zero buffering) so that real-time gaming can be realized. As a result, any variability introduced into the one-way latency between the cloud gaming server and the client can negatively impact downstream operations. For example, if encoding or decoding a complex frame takes longer, the one-way latency becomes correspondingly longer (even for a single frame), ultimately resulting in longer response times to the user and negatively impacting the user's real-time experience.

In one embodiment, for cloud gaming, it is beneficial to provide more powerful decoding and encoding resources than would seem necessary when compared to the needs of streaming video applications. Furthermore, as described in more detail below, the encoder resources need to be optimized for the time taken to process the frames that require long or the longest processing. That is, in embodiments, the encoder can be tuned to improve the tradeoff between one-way latency and video quality in the cloud gaming system. The tuning may be based on monitoring of client bandwidth, skipped frames, the number of encoded I-frames, the number of scene changes, and/or the number of video frames that exceed a target frame size. The tuned parameters may include the encoder bitrate, target frame size, maximum frame size, and quantization parameter (QP) value. In this way, high performance encoders and decoders help reduce the overall one-way latency between the cloud gaming server and the client.
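By way of non-limiting illustration only, one hypothetical tuning rule that uses some of the monitored quantities and adjusted parameters listed above is sketched below in Python. The headroom factor, the limit on oversized frames, the QP step, and the ratio between maximum and target frame size are all assumptions introduced for the example and are not the tuning method of the disclosure; the QP ceiling of 51 is the upper bound used by common 8-bit H.264/HEVC coding.

```python
from dataclasses import dataclass, replace

MAX_QP = 51  # upper QP bound in common 8-bit H.264/HEVC coding


@dataclass(frozen=True)
class EncoderParams:
    bitrate_bps: float
    target_frame_bytes: float
    max_frame_bytes: float
    min_qp: int


def tune_encoder(params, client_bandwidth_bps, frame_rate_hz,
                 oversize_fraction, headroom=0.9, oversize_limit=0.2):
    """Return adjusted parameters for one monitoring window (assumed rule)."""
    # Keep the encoder bitrate below the measured client bandwidth.
    bitrate = min(params.bitrate_bps, client_bandwidth_bps * headroom)
    # Average byte budget per frame at this bitrate and frame rate.
    target = bitrate / 8.0 / frame_rate_hz
    # If too many frames overshoot the target, quantize more coarsely.
    min_qp = params.min_qp
    if oversize_fraction > oversize_limit:
        min_qp = min(MAX_QP, min_qp + 2)
    return replace(params, bitrate_bps=bitrate, target_frame_bytes=target,
                   max_frame_bytes=2.0 * target, min_qp=min_qp)


start = EncoderParams(20e6, 20e6 / 8 / 60, 2 * 20e6 / 8 / 60, 20)
tuned = tune_encoder(start, client_bandwidth_bps=15e6,
                     frame_rate_hz=60.0, oversize_fraction=0.3)
```

In this sketch a smaller target frame size shortens the time needed to transmit each frame, at the cost of coarser quantization, which illustrates the latency versus video quality tradeoff referred to above.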

With the detailed description of the various client devices 210 and/or the cloud gaming network 290 (e.g., within game server 260) of FIGS. 2A-2D, flow diagram 500 of FIG. 5 illustrates a method of cloud gaming according to one embodiment of the present disclosure, in which encoding of video frames includes adjusting encoder parameters with awareness of network transmission speed and reliability, as well as overall latency targets. The cloud gaming server is configured to stream content to one or more client devices over a network. This process results in smoother frame rates and more reliable latency, so that one-way latency between the cloud gaming server and the client is reduced and made more consistent, thereby improving the smoothness of the client's display of the video.

５１０では、クラウドゲーミングサーバでビデオゲームを実行するときに、複数のビデオフレームが生成される。一般に、クラウドゲーミングサーバは、複数のゲーム用にレンダリングされたビデオフレームを生成する。例えば、ビデオゲームのゲームロジックは、ゲームエンジンまたはゲームタイトル処理エンジンに基づいて構築されている。ゲームエンジンは、ビデオゲームのゲーム環境を構築するためにゲームロジックによって使用され得るコア機能を含んでいる。例えば、ゲームエンジンのいくつかの機能は、ゲーム環境内のオブジェクトに対する物理的な力と衝突をシミュレートするための物理エンジン、２Ｄまたは３Ｄグラフィックス用のレンダリングエンジン、衝突検出、サウンド、アニメーション、人工知能、ネットワーキング、ストリーミング、等を含み得る。そのようにして、ゲームロジックは、ゲームエンジンによって提供されるコア機能を最初から構築する必要はない。 At 510, multiple video frames are generated when executing a video game on a cloud gaming server. Typically, the cloud gaming server generates rendered video frames for multiple games. For example, the game logic of the video game is built based on a game engine or a game title processing engine. The game engine includes core functions that can be used by the game logic to build a game environment of the video game. For example, some functions of the game engine may include a physics engine for simulating physical forces and collisions for objects in the game environment, a rendering engine for 2D or 3D graphics, collision detection, sound, animation, artificial intelligence, networking, streaming, etc. In that way, the game logic does not need to build from scratch the core functions provided by the game engine.

ゲームエンジンと組み合わせたゲームロジックは、ＣＰＵ及びＧＰＵによって実行され、ＣＰＵ及びＧＰＵは、加速処理装置（ＡＰＵ）内で構成できる。つまり、共有メモリと共にＣＰＵ及びＧＰＵは、ゲーム用にレンダリングされたビデオフレームを生成するためのレンダリングパイプラインとして構成され得、レンダリングパイプラインが、目標化、及び／または仮想化された表示の各ピクセルに対して対応する色情報を含む、ゲーム用にレンダリングされた画像を表示に適したビデオまたは画像フレームとして出力するようになっている。
特に、ＣＰＵは、ビデオフレームに対して１つ以上のドローコールを生成するように構成され得、各ドローコールは、ＧＰＵパイプラインのＧＰＵによって実行される対応するコマンドバッファに格納されたコマンドを含んでいる。一般に、グラフィックスパイプラインは、シーン内のオブジェクトの頂点に対してシェーダ操作を実行して、表示のピクセルのテクスチャ値を生成することができる。特に、グラフィックスパイプラインは、入力ジオメトリ（例えば、ゲーミング環境のオブジェクトの頂点）を受信し、頂点シェーダは、オブジェクトを構成するプリミティブまたはポリゴンを構築する。頂点シェーダプログラムは、プリミティブに対してライティング、シェーディング、シャドウイング、及びその他の操作を実行することができる。
深度バッファリングまたはＺバッファリングは、対応する視点からレンダリングされたときに、どのオブジェクトが見えるかを判定するために実行される。ラスタライズは、３Ｄゲーム環境内のオブジェクトを視点によって定義された２Ｄ平面に投影するために実行される。ピクセルサイズのフラグメントがオブジェクトに対して生成され、１つ以上のフラグメントが、画像のピクセルの色に寄与することができる。フラグメントは、対応するビデオの各ピクセルの組み合わせられた色を決定するためにマージ及び／またはブレンドされることができ、フレームバッファ内に格納されることができる。後続のビデオフレームは、同様に構成されたコマンドバッファを使用して表示用に生成され、及び／またはレンダリングされ、複数のビデオフレームが、ＧＰＵパイプラインから出力されている。 The game logic in combination with the game engine is executed by the CPU and GPU, which may be configured in an Accelerated Processing Unit (APU), such that the CPU and GPU, along with shared memory, may be configured as a rendering pipeline for generating rendered video frames for the game, such that the rendering pipeline outputs the rendered images for the game, including corresponding color information for each pixel of a targeted and/or virtualized display, as video or image frames suitable for display.
In particular, the CPU may be configured to generate one or more draw calls for a video frame, with each draw call including a command stored in a corresponding command buffer that is executed by the GPU in the GPU pipeline. In general, the graphics pipeline may perform shader operations on vertices of objects in a scene to generate texture values for pixels of a display. In particular, the graphics pipeline receives input geometry (e.g., vertices of objects in a gaming environment), and vertex shaders construct the primitives or polygons that make up the objects. The vertex shader programs may perform lighting, shading, shadowing, and other operations on the primitives.
Depth or Z-buffering is performed to determine which objects are visible when rendered from the corresponding viewpoint. Rasterization is performed to project objects in the 3D game environment onto the 2D plane defined by the viewpoint. Pixel-sized fragments are generated for the objects, and one or more fragments may contribute to the color of a pixel in the image. The fragments may be merged and/or blended to determine the combined color of each pixel of the corresponding video and may be stored in a frame buffer. Subsequent video frames are generated and/or rendered for display using a similarly configured command buffer, and multiple video frames are output from the GPU pipeline.

At 520, the method includes encoding the plurality of video frames at an encoder bitrate. In particular, the video frames are scanned into an encoder, where they are compressed before being streamed to the client by a streamer operating at the application layer. In one embodiment, each of the game-rendered video frames may be composited and blended with additional user interface features into a corresponding modified video frame, which is then scanned into the encoder; the encoder compresses the modified video frame for streaming to the client.
For brevity and clarity, the method of adjusting encoder parameters disclosed in FIG. 5 is described with respect to encoding the plurality of video frames, but it is understood to also support encoding modified video frames. The encoder is configured to compress the plurality of video frames based on a described format. For example, a Moving Picture Experts Group (MPEG) or H.264 standard may be implemented when streaming media content from the cloud gaming server to the client. In particular, the encoder may perform compression per video frame or per encoder slice of a video frame, where, as mentioned above, each video frame may be compressed as one or more encoded slices. In general, when streaming media, video frames are compressed as I-frames (intra frames) or P-frames (predicted frames), each of which may be divided into encoded slices.

５３０では、クライアントの最大受信帯域幅が測定される。一実施形態では、クライアントが経験する最大帯域幅は、クライアントからのフィードバックメカニズムによって判定されている。図６は、本開示の一実施形態による、クラウドゲーミングサーバのストリーマによるクライアント２１０の帯域幅の測定を示しており、ストリーマ６２０は、エンコーダ６１０を監視及び調節するように構成されており、圧縮されたビデオフレームが、クライアントの測定された帯域幅の範囲内の速度で伝送され得るようになっている。
図示されるように、圧縮されたビデオフレーム、エンコードされたスライス、及び／またはパケットは、エンコーダ６１０からバッファ６３０（例えば、先入れ先出し－ＦＩＦＯ）へ配信される。エンコーダは、エンコーダ充填速度６１５で圧縮されたビデオフレームを配信する。例えば、バッファは、エンコーダが、圧縮されたビデオフレーム、エンコードされたスライス６５０、及び／またはエンコードされたスライスのパケット６５５を生成することができるのと同じ速さで充填され得る。さらに、圧縮されたビデオフレームは、ネットワーク２５０を介してクライアント２１０に配信するために、バッファ排出速度６３５でバッファから排出される。
一実施形態では、バッファ排出速度６３５は、クライアントの測定された最大受信帯域幅に動的に調節される。例えば、バッファ排出速度６３５は、クライアントの測定された最大受信帯域幅にほぼ等しくなるように調整され得る。一実施形態では、パケットのエンコードは、それらが伝送されるのと同じ速度で実行され、両方の操作は、クライアントが利用できる最大の利用可能な帯域幅に動的に調節されるようになっている。 At 530, the client's maximum receive bandwidth is measured. In one embodiment, the maximum bandwidth experienced by the client is determined by a feedback mechanism from the client. Figure 6 illustrates measuring the bandwidth of a client 210 by a streamer of a cloud gaming server, according to one embodiment of the disclosure, where the streamer 620 is configured to monitor and adjust the encoder 610 such that compressed video frames can be transmitted at a rate within the client's measured bandwidth.
As shown, compressed video frames, encoded slices, and/or packets are delivered from an encoder 610 to a buffer 630 (e.g., first in, first out - FIFO). The encoder delivers the compressed video frames at an encoder fill rate 615. For example, the buffer may be filled as fast as the encoder can generate compressed video frames, encoded slices 650, and/or packets of encoded slices 655. Additionally, the compressed video frames are drained from the buffer at a buffer drain rate 635 for delivery over network 250 to client 210.
In one embodiment, the buffer drain rate 635 is dynamically adjusted to the client's measured maximum receive bandwidth. For example, the buffer drain rate 635 may be adjusted to be approximately equal to the client's measured maximum receive bandwidth. In one embodiment, the encoding of packets is performed at the same rate that they are transmitted, with both operations dynamically adjusted to the maximum available bandwidth available to the client.
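The interaction between the encoder fill rate and the buffer drain rate can be illustrated with a minimal sketch. It is not taken from the disclosure: the function name `simulate_buffer` and the per-frame-period accounting are assumptions made only for illustration, and the buffer is modeled as a simple bit counter rather than a packet FIFO.

```python
def simulate_buffer(frame_bits, drain_bps, fps=60):
    """Return the buffer occupancy (in bits) after each frame period.

    frame_bits: sizes of successive compressed frames, in bits (hypothetical input).
    drain_bps:  buffer drain rate in bits per second, e.g. the client's
                measured maximum receive bandwidth.
    """
    drain_per_period = drain_bps / fps  # bits that can leave the buffer per frame period
    occupancy = 0.0
    history = []
    for bits in frame_bits:
        occupancy += bits                                    # encoder fills the buffer
        occupancy = max(0.0, occupancy - drain_per_period)   # network drains it
        history.append(occupancy)
    return history


# One oversized frame leaves a backlog that persists while later frames
# merely match the drain rate.
backlog = simulate_buffer([250_000, 500_000, 250_000], drain_bps=15_000_000)
```

In this toy model, any non-zero backlog corresponds to data that arrives at the client later than its frame period, which is the kind of added one-way latency the drain-rate matching is meant to avoid.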

In particular, the streamer 620 operating at the application layer measures the maximum bandwidth of the client 210, such as by using a bandwidth tester 625. The application layer is used in the User Datagram Protocol/Internet Protocol (UDP/IP) suite of protocols used to interconnect network devices over the Internet. For example, the application layer specifies the communication protocols and interface methods used for communication between devices over an IP communication network. During a test, the streamer 620 provides additional buffered packets 640 (e.g., forward error correction (FEC) packets) so that the buffer 630 can stream packets at a predefined bitrate, such as the maximum bandwidth being tested.
In one embodiment, the client returns to the streamer 620, as feedback 690, the number of packets it received over a range of incremental sequence identifiers (IDs), such as a range of video frames. For example, the client may report something like 145 of 150 video frames received for sequence IDs 100-250 (e.g., 150 video frames). In this way, the streamer 620 at the server 260 can calculate packet loss, and because the streamer 620 knows the amount of bandwidth that was being sent (e.g., tested) during that sequence of packets, the streamer 620 can dynamically determine what the client's maximum bandwidth is at a particular point in time.
The measured maximum bandwidth of the client may be delivered as control information 627 from the streamer 620 to the buffer 630, so that the buffer 630 can dynamically transmit packets at a rate approximately equal to the client's maximum bandwidth. In this manner, the transmission rate of compressed video frames, encoded slices, and/or packets may be dynamically adjusted according to the currently measured maximum bandwidth of the client.
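The feedback-based measurement can be sketched as follows. The disclosure does not give a formula, so the estimate used here (tested bitrate scaled by the delivered fraction) is an assumption, as are the function and parameter names.

```python
def estimate_max_bandwidth(sent_packets, received_packets, tested_bps):
    """Estimate packet loss and client bandwidth from client feedback.

    Assumption (not from the source): the client's maximum bandwidth is
    approximated as the tested bitrate times the fraction of packets delivered.
    """
    if sent_packets <= 0:
        raise ValueError("sent_packets must be positive")
    received = min(received_packets, sent_packets)  # guard against over-reporting
    loss = 1 - received / sent_packets
    estimated_bps = tested_bps * received / sent_packets
    return loss, estimated_bps


# The example from the text: 145 of 150 received, with an assumed 15 Mbps test rate.
loss, bandwidth = estimate_max_bandwidth(150, 145, 15_000_000)
```

A production streamer would likely smooth this estimate over several feedback reports; that refinement is left out of the sketch.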

At 540, the encoding process is monitored by the streamer. That is, the encoding of the plurality of video frames is monitored. In one embodiment, the monitoring is performed at the client 210, and feedback and/or adjustment control signals are provided back to the encoder. In another embodiment, the monitoring is performed at the cloud gaming server 260, such as by the streamer 620. For example, monitoring of the encoding of the video frames may be performed by the monitoring and adjustment unit 629 of the streamer 620. Various encoding characteristics and/or operations can be tracked and/or monitored.
For example, in one embodiment, the rate of occurrence of I-frames within the plurality of video frames may be tracked and/or monitored. Additionally, in one embodiment, the rate of occurrence of scene changes within the plurality of video frames may be tracked and/or monitored. Also, in one embodiment, the number of video frames that exceed a target frame size may be tracked and/or monitored. Also, in one embodiment, the encoder bitrate used to encode one or more video frames may be tracked and/or monitored.

５５０では、エンコーダのパラメータは、ビデオフレームのエンコーディングの監視に基づいて、動的に調節される。つまり、ビデオフレームのエンコーディングの監視は、エンコーダで受信される現在及び将来のビデオフレームを圧縮するときに、エンコーダがどのように動作するかに影響を及ぼす。特に、監視及び調整ユニット６２９は、ビデオフレームのエンコードの監視、及び監視された情報に対して実行される分析に応答して、どのエンコーダパラメータを調節すべきかを判定するように構成されている。制御信号６２１は、エンコーダを構成するために使用される監視及び調整ユニット６２９からエンコーダ６１０に返信される。調節のためのエンコーダパラメータには、量子化パラメータ（ＱＰ）（例えば、最小ＱＰ、最大ＱＰ）または品質パラメータ、目標フレームサイズ、最大フレームサイズ、などが含まれる。 At 550, the encoder parameters are dynamically adjusted based on monitoring of the encoding of the video frames. That is, monitoring of the encoding of the video frames affects how the encoder operates when compressing current and future video frames received at the encoder. In particular, the monitoring and adjustment unit 629 is configured to determine which encoder parameters to adjust in response to monitoring the encoding of the video frames and analysis performed on the monitored information. Control signals 621 are transmitted back to the encoder 610 from the monitoring and adjustment unit 629 that are used to configure the encoder. Encoder parameters for adjustment include quantization parameters (QPs) (e.g., minimum QP, maximum QP) or quality parameters, target frame size, maximum frame size, etc.

調節は、ネットワークの伝送速度及び信頼性、ならびに全体的なレイテンシ目標を認識して実行される。一実施形態では、ビデオ再生のスムーズさは、低レイテンシまたは画像品質よりも優先される。例えば、１つ以上のビデオフレームのエンコードのスキップは、無効になっている。具体的には、画像解像度または画像品質（６０Ｈｚなど）とレイテンシのバランスは、様々なエンコーダパラメータを使用して調節される。特に、クラウドゲーミングサーバ及びクライアントにおけるＶＳＹＮＣ信号は、同期及びオフセットされることができることから、クラウドゲーミングサーバとクライアント間の一方向レイテンシは、低減されることができ、それによって、低レイテンシを促進するためにビデオフレームをスキップする必要性が減少する。ＶＳＹＮＣ信号の同期とオフセットはまた、クラウドゲーミングサーバにおける重複操作（スキャンアウト、エンコード、及び伝送）、クライアントにおける重複操作（受信、デコード、レンダリング、表示）、及び／またはクラウドゲーミングサーバとクライアントにおける重複操作も提供し、これらはすべて、一方向レイテンシの低減、一方向レイテンシの変動性の低減、ビデオコンテンツのリアルタイムでの生成と表示、及びクライアントにおける一貫したビデオ再生を促進する。 The adjustments are made with awareness of the network transmission speed and reliability, as well as the overall latency goal. In one embodiment, smoothness of video playback is prioritized over low latency or image quality. For example, skipping the encoding of one or more video frames is disabled. Specifically, the balance between image resolution or image quality (e.g., 60 Hz) and latency is adjusted using various encoder parameters. In particular, the VSYNC signals at the cloud gaming server and the client can be synchronized and offset, so that one-way latency between the cloud gaming server and the client can be reduced, thereby reducing the need to skip video frames to promote low latency. Synchronization and offsetting of the VSYNC signals also provides overlapping operations at the cloud gaming server (scanout, encode, and transmit), overlapping operations at the client (receive, decode, render, display), and/or overlapping operations at the cloud gaming server and the client, all of which facilitate reduced one-way latency, reduced variability in one-way latency, real-time generation and display of video content, and consistent video playback at the client.

In one embodiment, the encoder bitrate is monitored with consideration of upcoming frames and their complexity (e.g., predicted scene changes) in order to anticipate demand on the client bandwidth, and the encoder bitrate can be adjusted according to the anticipated demand. For example, when prioritizing smoothness of video playback, the encoder monitoring and adjustment unit 629 may be configured to determine that the encoder bitrate in use exceeds the measured maximum receive bandwidth. In response, the encoder bitrate may be lowered, so that frame sizes are lowered as well.
When smoothness is a priority, it is desirable to use an encoder bitrate lower than the maximum receive bandwidth (e.g., an encoder bitrate of 10 megabits per second for a maximum receive bandwidth of 15 megabits per second). That way, even if an encoded frame spikes above the maximum frame size, it can still be sent out at the 60 Hz (hertz) frame rate. In particular, an encoder bitrate can be translated into a frame size: a given bitrate and target frame rate of the video game (e.g., 60 frames per second) translate into an average size of an encoded video frame. For example, with an encoder bitrate of 15 megabits per second and a target rate of 60 frames per second, 60 encoded frames share 15 megabits, so that each encoded frame has approximately 250k encoded bits.
In this way, controlling the encoder bitrate also controls the frame size of the encoded video frames: increasing the encoder bitrate gives more bits for encoding (higher precision), and lowering the encoder bitrate gives fewer bits for encoding (lower precision). Similarly, when the encoder bitrate used to encode a group of video frames is within the measured maximum receive bandwidth, the encoder bitrate may be increased, so that frame sizes are increased as well.
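The bitrate-to-frame-size arithmetic above can be checked with a short sketch. The helper names `average_frame_bits` and `spike_headroom` are hypothetical, and the headroom ratio is simply one way to express how far a frame can spike above its average size before it no longer fits in a single frame period at the measured bandwidth.

```python
def average_frame_bits(bitrate_bps, fps):
    """Average number of encoded bits available per frame."""
    return bitrate_bps / fps


def spike_headroom(encoder_bps, max_receive_bps):
    """How many times larger than average a frame may be and still be
    sent within one frame period (illustrative ratio, not from the source)."""
    return max_receive_bps / encoder_bps


# 15 Mbps shared by 60 frames per second gives about 250k bits per frame.
avg_bits = average_frame_bits(15_000_000, 60)

# Encoding at 10 Mbps against a 15 Mbps link leaves 1.5x headroom for spikes.
headroom = spike_headroom(10_000_000, 15_000_000)
```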

In one embodiment, when smoothness of video playback is a priority, the encoder monitoring and adjustment unit 629 may be configured to determine whether the encoder bitrate used to encode a group of video frames from the plurality of video frames exceeds the measured maximum receive bandwidth. For example, the encoder bitrate may be detected to be 15 megabits per second (Mbps) while the maximum receive bandwidth is currently 10 Mbps. In that case, the encoder is pushing out more bits than can be sent to the client without increasing one-way latency.
As mentioned above, when smoothness is a priority, it may be desirable to use an encoder bitrate lower than the maximum receive bandwidth. In the above example, it may be acceptable to set the encoder bitrate at or below 10 Mbps for the 10 Mbps maximum receive bandwidth. That way, if an encoded frame spikes above the maximum frame size, it can still be sent out at the 60 Hz frame rate. In response, the QP value may be adjusted, with or without a reduction in the encoder bitrate, where the QP controls the precision used when compressing the video frames.
That is, the QP controls the amount of quantization that is performed (e.g., compressing a variable range of values in a video frame into a single quantum value). In H.264, the QP ranges from "0" to "51". A QP value of "0" means less quantization, less compression, more precision, and higher quality. A QP value of "51" means more quantization, more compression, less precision, and lower quality. Specifically, the QP value may be increased, such that the encoding of the video frames is performed with less precision.
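A minimal sketch of this QP adjustment is given below. The fixed step size, the function name `adjust_qp`, and the decision to lower QP whenever the bitrate is under the bandwidth are illustrative assumptions; only the 0 to 51 range comes from H.264.

```python
QP_MIN, QP_MAX = 0, 51  # H.264 quantization parameter range


def adjust_qp(qp, encoder_bps, max_receive_bps, step=1):
    """Nudge QP according to how the encoder bitrate compares with the
    client's measured maximum receive bandwidth.

    Higher QP means coarser quantization (fewer bits, lower precision);
    lower QP means finer quantization (more bits, higher precision).
    """
    if encoder_bps > max_receive_bps:
        qp += step   # over budget: encode with less precision
    elif encoder_bps < max_receive_bps:
        qp -= step   # excess bandwidth: encode with more precision
    return max(QP_MIN, min(QP_MAX, qp))  # clamp to the valid range


# Encoder at 15 Mbps against a 10 Mbps link: QP is raised.
new_qp = adjust_qp(30, 15_000_000, 10_000_000)
```

Clamping keeps the result valid even when the encoder is already at the coarsest or finest setting.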

一実施形態では、ビデオ再生のスムーズさを優先するとき、監視及び調節ユニット６２９によるエンコーダ監視は、複数のビデオフレームからのビデオフレームのグループをエンコードするために使用されるエンコーダビットレートが、最大受信帯域幅内にあることを判定するように構成され得る。以前に紹介したように、スムーズさを優先するとき、最大受信帯域幅よりも低いエンコーダビットレートを使用することが望ましくなり得る。このように、ビデオフレームのグループを送信するときに利用可能な超過帯域幅がある。超過帯域幅が判定されることができる。それに応じて、ＱＰ値は調節されることができ、ここで、ＱＰは、ビデオフレームを圧縮するときに使用される精度を制御する。特に、ＱＰ値は、超過帯域幅に基づいて低減されることができ、エンコードがより正確に実行されるようになっている。 In one embodiment, when prioritizing smoothness of video playback, the encoder monitoring by the monitoring and adjusting unit 629 may be configured to determine that the encoder bit rate used to encode a group of video frames from the plurality of video frames is within the maximum reception bandwidth. As previously introduced, when prioritizing smoothness, it may be desirable to use an encoder bit rate that is lower than the maximum reception bandwidth. In this manner, there is excess bandwidth available when transmitting the group of video frames. The excess bandwidth may be determined. Accordingly, a QP value may be adjusted, where the QP controls the precision used when compressing the video frames. In particular, the QP value may be reduced based on the excess bandwidth, such that the encoding is performed more precisely.

別の実施形態では、個々のビデオゲームの特性は、Ｉフレーム処理及びＱＰ設定を決定するときに、特にビデオ再生のスムーズさを優先するときに考慮される。例えば、ビデオゲームの「シーン変化」が頻繁に行われない場合（例えば、カメラカットのみ）、Ｉフレームをより大きくすることが望ましくなり得る（ＱＰがより低いか、エンコーダビットレートがより高い）。つまり、圧縮されている複数のビデオフレームからのビデオフレームのグループ内で、シーン変化を有していると識別されたビデオフレームの数は、シーン変化の閾値数よりも少ないと判定される。つまり、ストリーミングシステムは、現在の状態（例えば、測定されたクライアント帯域幅、必要とされるレイテンシ、など）のシーン変化の数を処理できる。それに応じて、ＱＰ値は調節されることができ、ここで、ＱＰは、ビデオフレームを圧縮するときに使用される精度を制御する。特に、ＱＰ値は、エンコードがより高い精度で実行されるように、低減されることができる。 In another embodiment, the characteristics of an individual video game are taken into account when determining I-frame processing and QP settings, especially when prioritizing smoothness of video playback. For example, if "scene changes" in a video game are infrequent (e.g., only camera cuts), it may be desirable to have larger I-frames (lower QP or higher encoder bitrate). That is, the number of video frames identified as having a scene change within a group of video frames from the multiple video frames being compressed is determined to be less than a threshold number of scene changes. That is, the streaming system can handle the number of scene changes for the current conditions (e.g., measured client bandwidth, required latency, etc.). The QP value can be adjusted accordingly, where the QP controls the precision used when compressing the video frames. In particular, the QP value can be reduced so that the encoding is performed with greater precision.

一方で、ビデオゲームでゲームプレイ中に頻繁に「シーン変化」が発生する場合は、Ｉフレームサイズを小さく保つことが望ましくなり得る（例えば、ＱＰをより高くするか、エンコーダビットレートをより低くする）。つまり、圧縮されている複数のビデオフレームからのビデオフレームのグループ内で、シーン変化を有していると識別されたビデオフレームの数は、シーン変化の閾値数を満たすか、または超えると判定される。つまり、ビデオゲームは、現在の状態（例えば、測定されたクライアント帯域幅、必要とされるレイテンシ、など）に対して多過ぎるシーン変化を生成している。それに応じて、ＱＰ値は調節されることができ、ここで、ＱＰは、ビデオフレームを圧縮するときに使用される精度を制御する。特に、ＱＰ値は、エンコードがより低い精度で実行されるように、増加されることができる。 On the other hand, if a video game experiences frequent "scene changes" during gameplay, it may be desirable to keep the I-frame size small (e.g., higher QP or lower encoder bitrate). That is, the number of video frames identified as having scene changes within a group of video frames from the plurality of video frames being compressed is determined to meet or exceed a threshold number of scene changes. That is, the video game is generating too many scene changes for the current conditions (e.g., measured client bandwidth, required latency, etc.). In response, the QP value can be adjusted, where the QP controls the precision used when compressing the video frames. In particular, the QP value can be increased so that encoding is performed with less precision.

In another embodiment, encoding patterns may be taken into account when determining I-frame handling and QP settings, especially when prioritizing smoothness of video playback. For example, if the encoder generates I-frames infrequently, it may be desirable to allow larger I-frames (lower QP or higher encoder bitrate). That is, within a group of video frames from the plurality of video frames being compressed, the number of video frames compressed as I-frames is at or below a threshold number of I-frames. In other words, the streaming system can handle the number of I-frames under the current conditions (e.g., measured client bandwidth, required latency, etc.). In response, the QP value may be adjusted, where the QP controls the precision used when compressing the video frames. In particular, the QP value may be reduced so that the encoding is performed with higher precision.

If the encoder generates I-frames frequently, it may be desirable to keep the I-frame size small (e.g., higher QP or lower encoder bitrate). That is, within a group of video frames from the plurality of video frames being compressed, the number of video frames compressed as I-frames meets or exceeds a threshold number of I-frames. In other words, the encoding of the video game is producing too many I-frames for the current conditions (e.g., measured client bandwidth, required latency, etc.). In response, the QP value may be adjusted, where the QP controls the precision used when compressing the video frames. In particular, the QP value may be increased so that the encoding is performed with less precision.
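The threshold test on I-frame frequency can be sketched as below. The frame-type labels, the step size, and the function name are hypothetical. The text leaves the case where the count exactly equals the threshold ambiguous, so this sketch assumes equality counts as "frequent". The same comparison would apply to a count of scene-change frames.

```python
def qp_delta_for_iframes(frame_types, threshold, step=2):
    """Return a QP change for a group of frames based on its I-frame count.

    frame_types: sequence of "I" or "P" labels for the group (assumed input).
    A positive result raises QP (smaller I-frames, lower precision);
    a negative result lowers QP (larger I-frames, higher precision).
    """
    i_frames = sum(1 for t in frame_types if t == "I")
    if i_frames >= threshold:
        return step    # too many I-frames for current conditions
    return -step       # I-frames are rare enough to afford more bits


frequent = qp_delta_for_iframes(list("IPPPIPPPIP"), threshold=3)
rare = qp_delta_for_iframes(list("IPPPPPPPPP"), threshold=3)
```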

In another embodiment, encoding patterns may be taken into account when tuning the encoder, especially when smoothness of video playback is a priority. For example, if the encoder frequently comes in below the target frame size, it may be desirable to make the target frame size larger. That is, within a group of video frames from the plurality of video frames that are compressed and transmitted at the transmission rate, a number of video frames is determined to be lower than a threshold. Each of the number of video frames is within the target frame size (i.e., equal to or smaller than the target frame size). In response, at least one of the target frame size and the maximum frame size is increased.

一方で、エンコーダが頻繁に目標フレームサイズを超える場合は、目標フレームサイズを小さくすることが望ましくなり得る。つまり、圧縮されて伝送速度で送信される複数のビデオフレームからのビデオフレームのグループ内で、ビデオフレームの数が閾値を満たすか、または超えるかが、判定される。ビデオフレームの数の各々は、目標フレームサイズを超えている。それに応じて、目標フレームサイズ及び最大フレームサイズのうちの少なくとも１つが低減される。 On the other hand, if the encoder frequently exceeds the target frame size, it may be desirable to reduce the target frame size. That is, it is determined whether within a group of video frames from a plurality of video frames to be compressed and transmitted at the transmission rate, a number of video frames meets or exceeds a threshold value. Each of the number of video frames exceeds the target frame size. At least one of the target frame size and the maximum frame size is reduced accordingly.
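The two target-size rules can be combined in one sketch. This is only one reading of the text: the count of frames exceeding the target is compared against a threshold, and the target is scaled by a fixed percentage. The 10 percent step and all names are assumptions.

```python
def tune_target_frame_size(frame_sizes, target, threshold, step_percent=10):
    """Return a new target frame size for the next group of frames.

    frame_sizes: encoded sizes of the frames in the monitored group.
    target:      current target frame size (same unit as frame_sizes).
    threshold:   number of oversized frames that triggers a reduction.
    """
    oversized = sum(1 for size in frame_sizes if size > target)
    if oversized >= threshold:
        # Encoder frequently overshoots: shrink the target.
        return target * (100 - step_percent) // 100
    # Encoder mostly stays within the target: allow larger frames.
    return target * (100 + step_percent) // 100


smaller = tune_target_frame_size([300_000, 310_000, 200_000, 320_000], 250_000, threshold=3)
larger = tune_target_frame_size([200_000, 210_000, 300_000, 190_000], 250_000, threshold=3)
```

Integer arithmetic is used so that the resulting sizes stay exact; a real tuner would also bound the target by the measured client bandwidth.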

FIG. 7A illustrates setting the quantization parameter (QP) of an encoder to optimize quality and buffer utilization at a client, according to one embodiment of the present disclosure. Graph 700A shows the frame size (in bytes) on the vertical axis for each generated frame along the horizontal axis. The target frame size and the maximum frame size are static. In particular, line 711 shows the maximum frame size and line 712 shows the target frame size, where the maximum frame size is larger than the target frame size. As shown in graph 700A, there are multiple peaks that include compressed video frames exceeding the target frame size at line 712. Video frames that exceed the target frame size may take multiple frame periods to encode and/or transmit from the cloud gaming server, and therefore risk introducing playback jitter (e.g., increased one-way latency).

グラフ７００Ｂは、クライアントにおけるエンコードの品質とバッファ利用を最適化するための、目標フレームサイズ、最大フレームサイズ、ＱＰ範囲（最小ＱＰや最大ＱＰなど）に基づいてＱＰが設定された後のエンコーダ応答を示す。例えば、ＱＰは、前述のように、エンコーダビットレートのエンコーダ監視、シーン変化の頻度、及びＩフレーム生成の頻度に基づいて調整及び／または調節され得る。グラフ７００Ｂは、水平方向に示されているように、生成された各フレームに対する垂直方向のフレームサイズ（バイト単位）を示している。
ライン７１２の目標フレームサイズ及びライン７１１の最大フレームサイズは、グラフ７００Ａにあるものと同じ位置のままである。ＱＰ調節及び／または調整の後、グラフ７００Ａと比較すると、ライン７１２の目標フレームサイズを超える圧縮されたビデオフレームを含むピーク数が減少する。つまり、ＱＰは、現在の状態（例えば、測定されたクライアント帯域幅、必要とされるレイテンシ、等）に対してビデオフレームのエンコードを最適化する（すなわち、目標フレームサイズ内に収まる）ために調節されている。 Graph 700B shows the encoder response after the QP is set based on the target frame size, maximum frame size, and QP range (e.g., minimum QP and maximum QP) to optimize encoding quality and buffer utilization at the client. For example, the QP may be adjusted and/or tuned based on encoder monitoring of encoder bitrate, frequency of scene changes, and frequency of I-frame generation, as discussed above. Graph 700B shows the frame size (in bytes) vertically for each frame generated, as shown horizontally.
The target frame size at line 712 and the maximum frame size at line 711 remain in the same position as they were in graph 700A. After the QP adjustment and/or tuning, the number of peaks containing compressed video frames exceeding the target frame size at line 712 is reduced as compared to graph 700A. That is, the QP has been adjusted to optimize the encoding of video frames (i.e., to fit within the target frame size) for the current conditions (e.g., measured client bandwidth, required latency, etc.).

図７Ｂは、本開示の一実施形態による、クライアントによってサポートされる真の目標フレームサイズを超えるＩフレームの発生を低減するための、目標フレームサイズ、最大フレームサイズ、及び／またはＱＰ（例えば、最小ＱＰ及び／または最大ＱＰ）エンコーダ設定の調節を示す図である。例えば、ＱＰは、前述のように、エンコーダビットレートのエンコーダ監視、シーン変化の頻度、及びＩフレーム生成の頻度に基づいて調整及び／または調節され得る。 FIG. 7B illustrates the adjustment of target frame size, maximum frame size, and/or QP (e.g., minimum QP and/or maximum QP) encoder settings to reduce the occurrence of I-frames that exceed the true target frame size supported by the client, according to one embodiment of the present disclosure. For example, the QP may be adjusted and/or tuned based on encoder monitoring of the encoder bitrate, frequency of scene changes, and frequency of I-frame generation, as described above.

Graph 720A plots frame size (in bytes) on the vertical axis for each generated frame along the horizontal axis. For purposes of illustration, graph 720A of FIG. 7B and graph 700A of FIG. 7A may reflect similar encoder conditions and are used for encoder tuning. In graph 720A, the target frame size and the maximum frame size are static. In particular, line 711 shows the maximum frame size and line 712 shows the target frame size, with the maximum frame size being larger than the target frame size.
As shown in graph 720A, there are multiple peaks containing compressed video frames that exceed the target frame size at line 712. Video frames that exceed the target frame size may take multiple frame periods to encode and/or transmit from the cloud gaming server, and therefore risk introducing playback jitter (e.g., increased one-way latency). For example, the peaks that reach the maximum frame size at line 711 may be I-frames that take 16 milliseconds or more to transmit to the client, which creates playback jitter by increasing the one-way latency between the cloud gaming server and the client.
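The relationship between frame size and transmission time can be worked through with a short calculation. This is a minimal sketch that ignores protocol overhead and assumes a 60 Hz frame rate (a frame period of about 16.7 ms); the function names and the example sizes are illustrative, not taken from the disclosure.

```python
import math


def transmit_time_ms(frame_bytes: int, bandwidth_mbps: float) -> float:
    """Time to send one compressed frame, with bandwidth in megabits per second."""
    return frame_bytes * 8 / (bandwidth_mbps * 1_000_000) * 1000.0


def frame_periods_needed(frame_bytes: int, bandwidth_mbps: float,
                         fps: float = 60.0) -> int:
    """Number of whole frame periods occupied by the transmission."""
    period_ms = 1000.0 / fps
    return max(1, math.ceil(transmit_time_ms(frame_bytes, bandwidth_mbps) / period_ms))
```

Under these assumptions a 25,000-byte I-frame on a 10 Mbps link takes about 20 ms, which spans two frame periods, whereas a 5,000-byte P-frame takes about 4 ms and fits within one.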

Graph 720B shows the encoder response after at least one of the target frame size and the maximum frame size has been adjusted to reduce the occurrence of I-frames that exceed the true target frame size supported by the client. The true target frame size may be adjusted based on the measured client bandwidth and/or on encoder monitoring, including monitoring of the encoder bitrate, the frequency of scene changes, and the frequency of I-frame generation, as previously described.

Graph 720B plots frame size (in bytes) on the vertical axis for each generated frame along the horizontal axis. Compared to graph 720A, the target frame size at line 712' and the maximum frame size at line 711' have lower values. For example, the target frame size at line 712' is smaller than that at line 712, and the maximum frame size at line 711' is smaller than that at line 711.
After the target frame size and/or maximum frame size are adjusted, the maximum size of the peaks of compressed video frames exceeding the target frame size 712' is reduced for better transmission. In addition, the number of peaks containing compressed video frames exceeding the target frame size 712' is also reduced when compared to graph 700A. For example, only one peak is shown in graph 720B. That is, the target frame size and/or maximum frame size have been adjusted to optimize the encoding of video frames (i.e., to keep them within the target frame size) for the current conditions (e.g., measured client bandwidth, required latency, etc.).
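One way to picture how a target frame size might follow from the measured client bandwidth is the calculation below. It is a sketch under stated assumptions: the 60 Hz frame rate, the 0.8 headroom factor, and the rule that the maximum frame size is a whole multiple of the target are all hypothetical choices, not values given in the disclosure.

```python
def target_frame_size_bytes(bandwidth_mbps: float, fps: float = 60.0,
                            headroom: float = 0.8) -> int:
    """Bytes that fit in one frame period, leaving some bandwidth unused."""
    bytes_per_second = bandwidth_mbps * 1_000_000 / 8
    return int(bytes_per_second * headroom / fps)


def max_frame_size_bytes(target_bytes: int, allowed_periods: int = 2) -> int:
    """Largest frame tolerated, expressed as a multiple of the target size."""
    return target_bytes * allowed_periods
```

With these assumptions, a drop in measured bandwidth from 10 Mbps to 5 Mbps halves the target frame size, which corresponds to lowering lines 712 and 711 to 712' and 711'.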

In conjunction with the detailed description of the various client devices 210 and/or the cloud gaming network 290 (e.g., in the game server 260) of FIGS. 2A-2D, flow diagram 800 of FIG. 8 illustrates a method of cloud gaming according to one embodiment of the present disclosure, in which encoding video frames includes determining when to skip a video frame, or when to delay the encoding and transmission of a video frame, if encoding takes a long time or if the resulting video frame is large (e.g., when encoding an I-frame).
In particular, the decision to skip a video frame is made with awareness of network transmission speed and reliability, as well as overall latency targets. This process yields a smoother frame rate and more reliable latency, so that the one-way latency between the cloud gaming server and the client is reduced and made more consistent, thereby improving the smoothness of the video displayed at the client.

At 810, a plurality of video frames is generated when executing a video game at a cloud gaming server operating in a streaming mode. In general, the cloud gaming server generates a plurality of game-rendered video frames. For example, the generation of game-rendered video frames is described at 510 of FIG. 5 and is applicable to the generation of video frames in FIG. 8.
For example, the game logic of a video game is built upon a game engine or game title processing engine. The game logic in combination with the game engine is executed by a CPU and a GPU, and the CPU and GPU together with shared memory may be configured as a rendering pipeline for generating game-rendered video frames, such that the rendering pipeline outputs game-rendered images as video or image frames suitable for display, including color information corresponding to each of the pixels of a targeted and/or virtualized display.

At 820, a scene change is predicted for a first video frame of the video game, the scene change being predicted before the first video frame is generated. In one embodiment, the game logic may be made aware of scene changes while the CPU is executing the video game. For example, the game logic or add-on logic may include code (e.g., scene change logic) that predicts a scene change when generating video frames, such as predicting that a range of video frames includes at least one scene change, or predicting that a particular video frame is a scene change.
In particular, game logic or add-on logic configured for scene change prediction analyzes game state data collected during execution of the video game to determine, anticipate, and/or predict when a scene change will occur, such as within the next X frames (e.g., a range) or for an identified video frame. For example, a scene change may be predicted when a character moves from one scene to another in a virtualized gaming environment, when a character finishes a level and transitions to another level in the video game, or when there is a transition between two video frames (e.g., a scene cut in a cinematic sequence, or the start of interactive gameplay after a series of menus). A scene change may be represented by a video frame that includes a large and complex scene of the virtualized game world or environment.
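As a rough illustration of scene change prediction from game state, the sketch below compares a current game-state snapshot with an anticipated one. The dictionary layout and the keys "level" and "scene" are hypothetical; the disclosure does not define a data structure for game state.

```python
def predict_scene_change(current_state: dict, upcoming_state: dict,
                         keys=("level", "scene")) -> bool:
    """Return True if any watched game-state field is about to change."""
    return any(current_state.get(k) != upcoming_state.get(k) for k in keys)
```

For example, `predict_scene_change({"level": 1, "scene": "cave"}, {"level": 2, "scene": "cave"})` returns True because the level changes, while two snapshots with the same level and scene return False.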

Game state data may define the state of the game at that point in time, and may include game characters, game objects, game object attributes, game attributes, game object states, graphic overlays, the location of a character within the game world of the player's gameplay, the scene or gaming environment of the gameplay, the level of the gaming application, the assets of the character (e.g., weapons, tools, bombs, etc.), loadout, the skill set of the character, game level, character attributes, character location, number of lives left, the total possible number of lives available, armor, trophies, time counter values, and other asset information.

At 830, a scene change hint is generated and sent to the encoder, the hint indicating that the first video frame is a scene change. In this manner, notification of the upcoming scene change may be provided to the encoder, so that the encoder can adjust its encoding operations when compressing the identified video frame. The notification provided as a scene change hint may be delivered through an API used to communicate between components, or between applications running on components, of the cloud gaming server 260.
In one embodiment, the API may be a GPU API. For example, the API may run on, or be called by, game logic and/or add-on logic configured to detect scene changes, in order to communicate with the encoder.
In one embodiment, the scene change hint may be provided as a data control packet formatted such that every component receiving the data control packet can understand what type of information it contains and can identify the corresponding rendered video frame to which it refers.
In one implementation, the communication protocol used for the API and the format of the data control packets may be defined in the corresponding software development kit (SDK) for the video game.
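A data control packet of the kind described could be sketched as follows. The field layout (a one-byte type code followed by a four-byte frame identifier in network byte order), the type code value, and the class name are assumptions made for illustration; the actual format would be defined by the SDK.

```python
import struct
from dataclasses import dataclass

HINT_SCENE_CHANGE = 1    # hypothetical type code for "scene change"
_PACKET_FORMAT = "!BI"   # 1-byte type, 4-byte unsigned frame id, network order


@dataclass(frozen=True)
class SceneChangeHint:
    frame_id: int                        # rendered frame the hint refers to
    hint_type: int = HINT_SCENE_CHANGE   # tells receivers what the packet means

    def pack(self) -> bytes:
        """Serialize the hint so any receiving component can parse it."""
        return struct.pack(_PACKET_FORMAT, self.hint_type, self.frame_id)

    @classmethod
    def unpack(cls, data: bytes) -> "SceneChangeHint":
        """Rebuild a hint from its serialized form."""
        hint_type, frame_id = struct.unpack(_PACKET_FORMAT, data)
        return cls(frame_id=frame_id, hint_type=hint_type)
```

Carrying both a type code and a frame identifier reflects the two requirements stated above: receivers must know what kind of information the packet holds and which rendered video frame it refers to.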

At 840, the first video frame is delivered to the encoder. As previously described, a game-generated video frame may be composited and blended with additional user interface features into a modified video frame that is scanned to the encoder. The encoder is configured to compress the first video frame based on a desired format, such as the MPEG or H.264 standards used for streaming media content from the cloud gaming server to the client.
When streaming, video frames are encoded as P-frames until there is a scene change, or until the frame currently being encoded can no longer reference a key frame (e.g., a previous I-frame), at which point the next video frame is encoded as another I-frame. In this case, the first video frame is encoded as an I-frame based on the scene change hint, and an I-frame may be encoded without reference to any other video frame (e.g., standalone as a key image).
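The frame-type rule in the preceding paragraph can be expressed compactly. The function and parameter names below are hypothetical, and a real encoder would apply additional criteria; this sketch only captures the two conditions named in the text.

```python
def choose_frame_type(frame_id: int, scene_change_hints: set,
                      reference_available: bool) -> str:
    """Return "I" for a hinted scene change or a lost reference, else "P"."""
    if frame_id in scene_change_hints or not reference_available:
        return "I"   # encoded standalone, without referencing other frames
    return "P"       # encoded relative to a previous key frame
```

For instance, with a hint registered for frame 7, `choose_frame_type(7, {7}, True)` returns "I" while `choose_frame_type(8, {7}, True)` returns "P".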

At 850, the maximum receive bandwidth of the client is measured. As previously described, the maximum bandwidth experienced by the client may be determined by means of a feedback mechanism from the client, as shown in operation 530 of FIGS. 5 and 6. In particular, a streamer of the cloud gaming server may be configured to measure the bandwidth of the client.

At 860, the encoder receives a second video frame. That is, the second video frame is received after the scene change and is compressed after the first video frame has been compressed. The encoder also decides either not to encode the second video frame (or subsequent video frames) or to delay encoding the second video frame (or subsequent video frames). This decision is made based on the maximum receive bandwidth of the client and the target resolution of the client display. That is, the decision to skip or delay encoding takes into account the bandwidth available to the client.
In general, if the current bandwidth experienced by the client is sufficient for video frames generated and encoded for the client's target display to return quickly to low one-way latency after a latency hit (e.g., generating a large I-frame for a scene change), the second video frame (and/or subsequent video frames) may still be encoded, with a delay.
On the other hand, if the current bandwidth experienced by the client is not sufficient, the second video frame (and/or subsequent video frames) may be skipped during the encoding process and not delivered to the client. In this way, it is possible to have fewer skipped frames (and lower latency) when the bandwidth to the client exceeds the bandwidth required to support the target resolution of the display at the client.
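The skip-or-delay decision at 860 might be sketched as below. The comparison against a single `required_mbps` value (the bandwidth needed to support the client display's target resolution) is an assumption; the disclosure does not state how that requirement is computed, and the names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    ENCODE_AFTER_DELAY = "encode_after_delay"
    SKIP = "skip"


def decide_second_frame(max_receive_mbps: float, required_mbps: float) -> Action:
    """Choose how to treat the frame that follows a large I-frame."""
    if max_receive_mbps > required_mbps:
        # Spare bandwidth lets latency recover, so the frame is still encoded.
        return Action.ENCODE_AFTER_DELAY
    # No spare bandwidth: drop the frame rather than let latency build up.
    return Action.SKIP
```

For example, `decide_second_frame(15.0, 13.0)` yields `Action.ENCODE_AFTER_DELAY`, whereas `decide_second_frame(8.0, 15.0)` yields `Action.SKIP`.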

In one embodiment, compressed video frames are transmitted from the server to the client at a rate based on the maximum bitrate or bandwidth available over the network at a particular point in time. In this way, the transmission rate of encoded slices of compressed video frames and/or packets of encoded slices is dynamically adjusted according to the currently measured maximum bandwidth. Video frames may be transmitted as they are encoded, such that transmission occurs as soon as encoding is complete, without waiting for the next occurrence of a server VSYNC signal and without waiting for the entire video frame to be encoded.

Furthermore, in one embodiment, packets are encoded at the same rate at which they are transmitted, such that both operations are dynamically tuned to the maximum bandwidth available to the client. Also, the encoder bitrate may be monitored with consideration of upcoming frames and their complexity (e.g., predicted scene changes) in order to anticipate demand on the client bandwidth, and the encoder bitrate may be adjusted according to the anticipated demand. Furthermore, the encoder bitrate may be communicated to the client, so that the client can adjust its decoding speed accordingly to match the encoder bitrate.

In one embodiment, the second video frame is skipped by the encoder when the transmission rate to the client is low relative to the target resolution of the client display. That is, the second video frame is not encoded. In particular, the transmission rate to the client for a group of compressed video frames exceeds the maximum receive bandwidth. For example, the transmission rate to the client may be 15 megabits per second (Mbps), while the measured receive bandwidth of the client may currently be 5-10 Mbps. As such, if every video frame continues to be pushed to the client, the one-way latency between the cloud gaming server and the client increases. To promote low latency, the second and/or subsequent video frames may be skipped by the encoder.
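The latency growth described here follows from simple arithmetic: data sent faster than the client can receive it accumulates as a backlog, and the backlog divided by the receive rate is the extra delay. The sketch below assumes constant rates over the interval and uses a hypothetical function name.

```python
def added_latency_ms(send_mbps: float, receive_mbps: float,
                     seconds: float) -> float:
    """Extra one-way delay after `seconds` of sending faster than receiving."""
    backlog_megabits = max(0.0, send_mbps - receive_mbps) * seconds
    return backlog_megabits / receive_mbps * 1000.0
```

With the figures from the example above, sending at 15 Mbps to a client that receives 10 Mbps builds up 5 megabits of backlog after one second, which corresponds to 500 ms of added latency if no frames are skipped.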

FIG. 9A illustrates a sequence 900A of video frames being compressed by an encoder, in which the encoder drops the encoding of a second video frame 920 after encoding a first I-frame 905 when the client bandwidth is low relative to the target resolution of the client's display, in accordance with one embodiment of the present disclosure. The encode blocks and transmission blocks of the video frames are shown in relation to a VSYNC signal 950. In particular, when no extra bandwidth is available, an I-frame that takes longer to encode causes one or more frames to be skipped in order to keep one-way latency low, where the one-way latency may include the time to display a video frame at the client.
As shown, skipping one or more video frames after the I-frame allows a quick return to low one-way latency (e.g., within one or two frame periods). Otherwise, without skipping the encoding of any video frames, it would take several frame periods to return to low one-way latency.

For example, the sequence of video frames 900A includes one encoded I-frame 905, with the remaining frames encoded as P-frames. For purposes of illustration, encode block 901 and encode block 902 are encoded as P-frames before encode block 905 is encoded as an I-frame.
Thereafter, the encoder compresses video frames as P-frames until the next scene change, or until a video frame can no longer reference a previous key frame (e.g., an I-frame). In general, the encoding time for an I-frame block may be longer than for a P-frame block. For example, the encoding time for I-frame block 905 may exceed one frame period. In some cases, the encoding times for P-frames and I-frames may be approximately the same, especially when a high-powered encoder is used.

However, the transmission times for I-frames and P-frames differ significantly. As shown, the various transmission times are depicted in relation to the corresponding encoded video frames. For example, transmission block 911 for encoded P-frame block 901 is shown with low latency, such that encode block 901 and transmission block 911 can be performed within one frame period. Similarly, transmission block 912 for encoded P-frame block 902 is shown with low one-way latency, such that encode block 902 and transmission block 912 can also be performed within one frame period.

On the other hand, transmission block 915A for encoded I-frame block 905 is shown with higher one-way latency, such that encode block 905 and transmission block 915A occur over several frame periods, thereby introducing jitter into the one-way latency between the cloud gaming server and the client. In order to give the user a real-time experience with low one-way latency, a buffer at the client may not be used to correct for the jitter. In that case, the encoder may decide to skip the encoding of one or more video frames after the I-frame is encoded.
For example, video frame 920 is dropped by the encoder. In that case, the transmission of encoded video frames returns to low one-way latency around highlighted region 910, such as after five subsequent video frames have been encoded as P-frames and transmitted to the client. That is, the fourth or fifth P-frame encoded after I-frame block 905 is also transmitted to the client within the same frame period, thereby returning to low one-way latency between the cloud gaming server and the client.
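The effect of frame skipping on latency recovery can be explored with a toy model of the link. The sketch assumes that frame i becomes ready at time i (in frame periods), that the link drains a fixed number of bytes per frame period in order, and that encode, decode, and display times are ignored. The frame sizes and capacity are made-up values, not figures from FIG. 9A.

```python
def one_way_latencies(sizes, capacity, skipped=frozenset()):
    """Per-frame latency in frame periods for a simplified in-order link."""
    link_free = 0.0          # time at which the link finishes its current work
    latencies = {}
    for i, size in enumerate(sizes):
        if i in skipped:
            continue         # dropped by the encoder, never transmitted
        start = max(float(i), link_free)
        link_free = start + size / capacity
        latencies[i] = link_free - i
    return latencies


# Two P-frames, one large I-frame (index 2), then more P-frames (bytes).
SIZES = [5_000, 5_000, 40_000, 5_000, 5_000, 5_000, 5_000, 5_000]
no_skip = one_way_latencies(SIZES, capacity=10_000)
with_skip = one_way_latencies(SIZES, capacity=10_000, skipped={3, 4, 5})
```

In this model, frame 6 arrives with a latency of 2.0 frame periods when nothing is skipped, but with 0.5 frame periods (the same as the frames before the I-frame) when frames 3 to 5 are dropped.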

In one embodiment, when the transmission rate to the client is high relative to the target resolution of the client display, the second video frame is still compressed by the encoder after a delay (i.e., after waiting until the I-frame has been encoded). In particular, the transmission rate to the client for a group of compressed video frames is within the maximum receive bandwidth. For example, the transmission rate to the client may be 13 megabits per second (Mbps), while the measured receive bandwidth of the client may currently be 15 Mbps. As such, there is no increase in the one-way latency between the cloud gaming server and the client, because no delay is incurred in receiving the encoded video frames at the client.

Furthermore, because the VSYNC signals at the cloud gaming server and the client can be synchronized and offset, the one-way latency between the cloud gaming server and the client can be reduced, thereby compensating for latency variability caused by jitter at the server, during transmission over the network, or at the client.
In addition, the synchronization and offset of the VSYNC signals provide for overlapping operations at the cloud gaming server (scan-out, encode, and transmit), overlapping operations at the client (receive, decode, render, display), and/or overlapping operations across the cloud gaming server and the client, all of which promote compensation for latency variability introduced by server, network, or client jitter, reduced one-way latency, reduced variability in one-way latency, real-time generation and display of video content, and consistent video playback at the client.

FIG. 9B illustrates a sequence 900B of video frames being compressed by an encoder that takes into account the bandwidth available to the client, such that when the bandwidth exceeds what is needed to support the target resolution of the client display, it is possible to have no skipped frames, or fewer skipped frames, while still having lower latency, in accordance with one embodiment of the present disclosure.
In particular, in sequence 900B, a video frame is encoded as an I-frame, and subsequent video frames are also encoded normally, after a delay caused by encoding the I-frame, when the client bandwidth is moderate relative to the target resolution of the client display. Because bandwidth availability is moderate, a moderate amount of excess bandwidth is available to compensate for latency variability (e.g., jitter) between the cloud gaming server and the client, such that frame skipping can be avoided and a return to low one-way latency can be achieved relatively quickly (e.g., within two to four frame periods). The encode blocks and transmission blocks of the video frames are shown in relation to the VSYNC signal 950.

The sequence of video frames 900B includes one encoded I-frame 905, with the remaining frames encoded as P-frames. For purposes of illustration, encode block 901 and encode block 902 are encoded as P-frames before encode block 905 is encoded as an I-frame. Thereafter, the encoder compresses video frames as P-frames until the next scene change, or until a video frame can no longer reference a previous key frame (e.g., an I-frame). In general, the encoding time for an I-frame block may be longer than for a P-frame block, and the transmission of an I-frame may take longer than one frame period. For example, the encoding time and transmission time for I-frame block 905 exceed one frame period.
Also, various transmission times are shown in relation to the corresponding encoded video frames. For example, the encoding and transmission of the video frames before I-frame block 905 are shown with low one-way latency, such that the corresponding encode and transmission blocks can be performed within one frame period. However, transmission block 915B for encoded I-frame block 905 is shown with higher one-way latency, such that encode block 905 and transmission block 915B occur over two or more frame periods, thereby introducing jitter into the one-way latency between the cloud gaming server and the client.
As previously described, the encoding time may be further reduced by adjusting one or more encoder parameters (e.g., QP, target frame size, maximum frame size, encoder bitrate, etc.). That is, when the transmission rate to the client is moderate relative to the target resolution of the client display, the second or subsequent video frames after the I-frame are encoded with less precision, and specifically with less precision than when the transmission rate is high relative to the target resolution.

After I-frame block 905, the encoder continues to compress video frames, although they may be temporarily delayed by the encoding of the I-frame. Again, the synchronization and offset of the VSYNC signals provide for overlapping operations at the cloud gaming server (scan-out, encode, and transmit), overlapping operations at the client (receive, decode, render, display), and/or overlapping operations across the cloud gaming server and the client, all of which promote compensation for variability in one-way latency introduced by server, network, or client jitter, reduced one-way latency, reduced variability in one-way latency, real-time generation and display of video content, and consistent video playback at the client.

Because the client bandwidth is moderate relative to the target resolution of the client display, the transmission of encoded video frames returns to low one-way latency around highlighted region 940, such as after two or three subsequent video frames have been encoded as P-frames and transmitted to the client. Within region 940, P-frames encoded after I-frame block 905 are also transmitted to the client within the same frame period, thereby returning to low one-way latency between the cloud gaming server and the client.
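How quickly latency recovers without skipping depends on how much spare capacity remains after each P-frame is sent. The estimate below is a sketch under simplifying assumptions: constant P-frame size, a fixed number of bytes the link can carry per frame period, and no encode or decode time. The function name and example numbers are illustrative.

```python
from typing import Optional


def periods_to_recover(iframe_bytes: int, pframe_bytes: int,
                       capacity_per_period: int) -> Optional[int]:
    """Frame periods needed to drain the backlog left by one large I-frame.

    Returns None when no spare capacity exists, i.e. when latency cannot
    recover unless frames are skipped.
    """
    spare = capacity_per_period - pframe_bytes
    if spare <= 0:
        return None
    # Bytes still queued once the I-frame's own frame period has elapsed.
    excess = max(0, iframe_bytes - capacity_per_period)
    return -(-excess // spare)   # integer ceiling division
```

For a 40,000-byte I-frame followed by 5,000-byte P-frames, this estimate gives 6 frame periods at 10,000 bytes per period, 2 at 20,000, and 1 at 30,000, which is in line with the qualitative ordering of the low, moderate, and high bandwidth cases.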

FIG. 9C illustrates a sequence 900C of video frames being compressed by an encoder that takes into account the bandwidth available to the client, such that when the bandwidth exceeds what is needed to support the target resolution of the client display, it is possible to have no skipped frames, or fewer skipped frames, while still having lower one-way latency, in accordance with one embodiment of the present disclosure. In particular, in sequence 900C, a video frame is encoded as an I-frame, and subsequent video frames are also encoded normally, after a delay caused by encoding the I-frame, when the client bandwidth is high relative to the target resolution of the client display.
Because bandwidth availability is high, a large amount of excess bandwidth is available to compensate for variability (e.g., jitter) in one-way latency between the cloud gaming server and the client, such that frame skipping can be avoided and a return to low one-way latency can be achieved quickly (e.g., within one to two frame periods). The encode blocks and transmission blocks of the video frames are shown in relation to the VSYNC signal 950.

Similar to FIG. 9B, the sequence 900C of video frames in FIG. 9C includes one I-frame, encoded in block 905, with the remaining frames encoded as P-frames. For purposes of illustration, encode block 901 and encode block 902 are encoded as P-frames before encode block 905 is encoded as an I-frame. Thereafter, the encoder compresses video frames as P-frames until the next scene change, or until a video frame can no longer reference a previous key frame (e.g., an I-frame). In general, encoding an I-frame block may take longer than encoding a P-frame block. For example, the encode time for I-frame block 905 may exceed one frame period.
Various transmission times are also shown for the corresponding encoded video frames. For example, the encoding and transmission of the video frames preceding I-frame block 905 are shown with low latency, such that the corresponding encode and transmission blocks can be performed within one frame period. However, transmission block 915C for encoded I-frame block 905 is shown with higher latency, such that encode block 905 and transmission block 915C together span two or more frame periods, thereby introducing jitter into the one-way latency between the cloud gaming server and the client. As previously described, the encode time can be further reduced by adjusting one or more encoder parameters (e.g., QP, target frame size, maximum frame size, encoder bit rate, etc.).

After I-frame block 905, the encoder continues to compress video frames, although they may be temporarily delayed by the encoding of the I-frame. Again, the synchronization and offset of the VSYNC signals provide for overlapping operations at the cloud gaming server (scan-out, encode, and transmit), overlapping operations at the client (receive, decode, render, display), and/or overlapping operations across the cloud gaming server and the client, all of which facilitate compensation for variability in latency introduced by server, network, or client jitter, reduction in one-way latency, reduction in the variability of one-way latency, real-time generation and display of video content, and consistent video playback at the client.
Because the client bandwidth is high with respect to the target resolution of the client display, the transmission of encoded video frames returns to low one-way latency around highlighted region 970, such as after one or two subsequent video frames have been encoded as P-frames and transmitted to the client. Within region 970, the P-frames encoded after I-frame block 905 are also transmitted to the client within one frame period (although they may straddle an occurrence of the VSYNC signal), thereby returning to low one-way latency between the cloud gaming server and the client.
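
The difference between the moderate-bandwidth and high-bandwidth cases can be illustrated with a small queueing model. The following Python sketch is not taken from the disclosure: the function names, the frame sizes, the 60 Hz frame rate, and the two bandwidth figures are all assumptions chosen for illustration, and encode time is ignored so that only the transmission backlog is modeled. It treats the link as a single in-order queue and reports, in frame periods, how long each frame takes from becoming ready to being fully sent.

```python
def transmit_delays(frame_sizes_bits, bandwidth_bps, fps=60):
    """Delay of each frame, in frame periods, from ready-to-send to fully sent.

    Hypothetical model: frame i becomes ready at the start of period i and the
    link sends frames one at a time, in order.
    """
    bits_per_period = bandwidth_bps / fps
    link_free_at = 0.0  # time (in frame periods) at which the link goes idle
    delays = []
    for i, size in enumerate(frame_sizes_bits):
        start = max(float(i), link_free_at)
        link_free_at = start + size / bits_per_period
        delays.append(link_free_at - i)
    return delays


def frames_until_recovered(delays, i_frame_index, limit=1.0):
    """Count frames after the I-frame until one is sent within `limit` periods."""
    for offset, delay in enumerate(delays[i_frame_index + 1:], start=1):
        if delay <= limit:
            return offset
    return None


# Assumed sizes: 100 kbit P-frames and one 500 kbit I-frame at index 2.
sizes = [100_000, 100_000, 500_000] + [100_000] * 6
moderate = transmit_delays(sizes, bandwidth_bps=12_000_000)
high = transmit_delays(sizes, bandwidth_bps=24_000_000)
print(frames_until_recovered(moderate, 2), frames_until_recovered(high, 2))
```

Under these assumed numbers the sketch prints `3 1`: with the lower bandwidth the backlog created by the I-frame drains over three P-frames, while with twice the bandwidth the first P-frame after the I-frame is already sent within one frame period.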

FIG. 10 illustrates components of an exemplary device 1000 that can be used to perform aspects of the various embodiments of the present disclosure. For example, FIG. 10 illustrates an exemplary hardware system suitable for streaming media content and/or receiving streamed media content, including providing encoder tuning that improves the trade-off between one-way latency and video quality in a cloud gaming system, for the purposes of reducing latency and providing more consistent latency between clouds, and of improving the smoothness of the client display of video. The encoder tuning may be based on monitoring of client bandwidth, skipped frames, the number of encoded I-frames, the number of scene changes, and/or the number of video frames that exceed a target frame size. The tuned parameters may include the encoder bit rate, target frame size, maximum frame size, and quantization parameter (QP) value. According to embodiments of the present disclosure, high-performance encoders and decoders serve to reduce the overall one-way latency between the cloud gaming server and the client.
The block diagram illustrates device 1000, which can incorporate or be a personal computer, a server computer, a game console, a mobile device, or another digital device, each of which is suitable for practicing embodiments of the invention. Device 1000 includes a central processing unit (CPU) 1002 for running software applications and, optionally, an operating system. CPU 1002 may comprise one or more homogeneous or heterogeneous processing cores.

According to various embodiments, CPU 1002 is one or more general-purpose microprocessors having one or more processing cores. Further embodiments can be implemented using one or more CPUs with microprocessor architectures specifically adapted for highly parallel and computationally intensive applications, such as media and interactive entertainment applications, or applications configured for graphics processing during execution of a game.

Memory 1004 stores applications and data for use by CPU 1002 and GPU 1016. Storage 1006 provides non-volatile storage and other computer-readable media for applications and data, and may include fixed disk drives, removable disk drives, flash memory devices, CD-ROM, DVD-ROM, Blu-ray (registered trademark), HD-DVD, or other optical storage devices, as well as signal transmission and storage media. User input devices 1008 communicate user inputs from one or more users to device 1000; examples of user input devices 1008 may include keyboards, mice, joysticks, touchpads, touchscreens, still or video recorders/cameras, and/or microphones.
Network interface 1009 allows device 1000 to communicate with other computer systems via an electronic communications network, and may include wired or wireless communication over local area networks and wide area networks such as the Internet. Audio processor 1012 is adapted to generate analog or digital audio output from instructions and/or data provided by CPU 1002, memory 1004, and/or storage 1006. The components of device 1000, including CPU 1002, the graphics subsystem including GPU 1016, memory 1004, data storage 1006, user input devices 1008, network interface 1009, and audio processor 1012, are connected via one or more data buses 1022.

Graphics subsystem 1014 is further connected to data bus 1022 and to the components of device 1000. Graphics subsystem 1014 includes a graphics processing unit (GPU) 1016 and graphics memory 1018. Graphics memory 1018 includes a display memory (e.g., a frame buffer) used for storing pixel data for each pixel of an output image. Graphics memory 1018 can be integrated in the same device as GPU 1016, connected to GPU 1016 as a separate device, and/or implemented within memory 1004.
Pixel data can be provided to graphics memory 1018 directly from CPU 1002. Alternatively, CPU 1002 provides GPU 1016 with data and/or instructions defining the desired output images, from which GPU 1016 generates the pixel data of one or more output images. The data and/or instructions defining the desired output images can be stored in memory 1004 and/or graphics memory 1018. In one embodiment, GPU 1016 includes 3D rendering capabilities for generating pixel data for output images from instructions and data defining the geometry, lighting, shading, texturing, motion, and/or camera parameters for a scene. GPU 1016 can further include one or more programmable execution units capable of executing shader programs.

Graphics subsystem 1014 periodically outputs pixel data for an image from graphics memory 1018, to be displayed on display device 1010 or projected by a projection system (not shown). Display device 1010 can be any device capable of displaying visual information in response to a signal from device 1000, including CRT, LCD, plasma, and OLED displays. Device 1000 can provide display device 1010 with, for example, an analog or digital signal.

Other embodiments for optimizing graphics subsystem 1014 can include multi-tenancy GPU operations, in which a GPU instance is shared among multiple applications, and distributed GPUs supporting a single game. Graphics subsystem 1014 can be configured as one or more processing devices.

For example, graphics subsystem 1014 may be configured to perform multi-tenancy GPU functionality, in which, in one embodiment, one graphics subsystem implements the graphics and/or rendering pipelines for multiple games. That is, graphics subsystem 1014 is shared among the multiple games being executed.

In other embodiments, graphics subsystem 1014 includes multiple GPU devices, which are combined to perform graphics processing for a single application executing on a corresponding CPU. For example, the multiple GPUs can perform alternate forms of frame rendering, in which GPU1 renders a first frame and GPU2 renders a second frame in sequential frame periods, and so on until the last GPU is reached, whereupon the first GPU renders the next video frame (e.g., if there are only two GPUs, GPU1 renders the third frame). That is, the GPUs rotate when rendering frames. The rendering operations can overlap, such that GPU2 may begin rendering the second frame before GPU1 finishes rendering the first frame.
In another implementation, the multiple GPU devices can be assigned different shader operations in the rendering and/or graphics pipeline, with a master GPU performing the main rendering and compositing. For example, in a group of three GPUs, master GPU1 can perform the main rendering (e.g., a first shader operation) and composite the outputs from slave GPU2 and slave GPU3, while slave GPU2 performs a second shader operation (e.g., fluid effects, such as a river) and slave GPU3 performs a third shader operation (e.g., particle smoke); master GPU1 then composites the results from each of GPU1, GPU2, and GPU3. In this manner, different GPUs can be assigned to perform different shader operations (e.g., flag waving, wind, smoke generation, fire, etc.) to render a video frame.
In still another embodiment, each of the three GPUs can be assigned to different objects and/or parts of a scene corresponding to a video frame. In the above embodiments and implementations, these operations can be performed in the same frame period (simultaneously in parallel) or in different frame periods (sequentially in parallel).
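
The frame-by-frame GPU rotation described above amounts to a round-robin assignment. As a minimal sketch (the function name and the 1-based numbering of frames and GPUs are assumptions made for illustration, not part of the disclosure), the GPU responsible for a given frame could be computed as follows:

```python
def gpu_for_frame(frame_number: int, num_gpus: int) -> int:
    """Return the 1-based GPU that renders the given 1-based frame number."""
    if frame_number < 1 or num_gpus < 1:
        raise ValueError("frame_number and num_gpus must be at least 1")
    # Rotate through the GPUs and wrap back to GPU1 after the last GPU.
    return (frame_number - 1) % num_gpus + 1


schedule = [gpu_for_frame(n, num_gpus=2) for n in range(1, 6)]
print(schedule)  # [1, 2, 1, 2, 1]
```

With two GPUs, the third frame is assigned to GPU1 again, which matches the two-GPU example given in the text. The sketch covers only the assignment; overlapping the rendering of consecutive frames would be handled by whatever scheduling mechanism the system uses.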

Accordingly, the present disclosure describes methods and systems configured for streaming media content and/or receiving streamed media content, including providing encoder tuning to improve the trade-off between one-way latency and video quality in a cloud gaming system. The encoder tuning may be based on monitoring of client bandwidth, skipped frames, the number of encoded I-frames, the number of scene changes, and/or the number of video frames that exceed a target frame size. The tuned parameters may include the encoder bit rate, target frame size, maximum frame size, and quantization parameter (QP) value. High-performance encoders and decoders serve to reduce the overall one-way latency between the cloud gaming server and the client.
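
The kind of tuning summarized above can be sketched as a function that maps statistics monitored over a group of frames to adjusted encoder parameters. The following Python sketch is only an illustration under stated assumptions: the class and field names, the thresholds, the step size, the 10% size reduction, and the way the individual conditions are combined into one policy are all hypothetical, and the QP limits of 0 to 51 are borrowed from H.264 as an assumed range.

```python
from dataclasses import dataclass

QP_MIN, QP_MAX = 0, 51  # assumed limits; 0..51 is the QP range used by H.264


@dataclass(frozen=True)
class EncoderParams:
    qp: int
    target_frame_size: int  # bytes
    max_frame_size: int     # bytes


@dataclass(frozen=True)
class GroupStats:
    """Hypothetical measurements over one group of encoded video frames."""
    encoder_bitrate_bps: int
    max_receive_bandwidth_bps: int
    i_frame_count: int
    oversized_frame_count: int  # frames larger than target_frame_size


def tune(params, stats, i_frame_threshold=2, oversized_threshold=3, qp_step=1):
    """Return adjusted parameters for the next group of frames."""
    over_budget = stats.encoder_bitrate_bps > stats.max_receive_bandwidth_bps
    many_i_frames = stats.i_frame_count >= i_frame_threshold
    qp = params.qp
    if over_budget or many_i_frames:
        qp += qp_step  # coarser quantization, so smaller frames
    elif stats.encoder_bitrate_bps < stats.max_receive_bandwidth_bps:
        qp -= qp_step  # excess bandwidth, so finer quantization
    qp = max(QP_MIN, min(QP_MAX, qp))

    target, maximum = params.target_frame_size, params.max_frame_size
    if stats.oversized_frame_count >= oversized_threshold:
        # Shrink both size limits by 10% using integer arithmetic.
        target, maximum = target * 9 // 10, maximum * 9 // 10
    return EncoderParams(qp, target, maximum)


current = EncoderParams(qp=30, target_frame_size=100_000, max_frame_size=150_000)
congested = GroupStats(20_000_000, 15_000_000, i_frame_count=1, oversized_frame_count=4)
print(tune(current, congested))
```

For the congested group in this example, the encoder bit rate exceeds the measured maximum receive bandwidth and four frames exceed the target size, so the sketch raises QP from 30 to 31 and reduces the target and maximum frame sizes to 90,000 and 135,000 bytes. A real system would choose thresholds and step sizes empirically.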

It should be understood that the various embodiments defined herein may be combined or assembled into specific implementations using the various features disclosed herein. Thus, the examples provided are only some possible examples, and do not limit the various implementations that are possible by combining the various elements to define many more implementations. In some examples, an implementation may include fewer elements without departing from the spirit of the disclosed or equivalent implementations.

Embodiments of the present disclosure may be practiced with various computer system configurations, including handheld devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like. Embodiments of the present disclosure can also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a wire-based or wireless network.

With the above embodiments in mind, it should be understood that embodiments of the present disclosure can employ various computer-implemented operations involving data stored in computer systems. These operations are those requiring physical manipulation of physical quantities. Any of the operations described herein that form part of embodiments of the present disclosure are useful machine operations. Embodiments of the disclosure also relate to a device or an apparatus for performing these operations. The apparatus can be specially constructed for the required purpose, or the apparatus can be a general-purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general-purpose machines can be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.

The disclosure can also be embodied as computer-readable code on a computer-readable medium. The computer-readable medium is any data storage device that can store data, which can thereafter be read by a computer system. Examples of the computer-readable medium include hard drives, network-attached storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-Rs, CD-RWs, magnetic tapes, and other optical and non-optical data storage devices. The computer-readable medium can include computer-readable tangible media distributed over network-coupled computer systems, so that the computer-readable code is stored and executed in a distributed fashion.

Although the method operations are described in a specific order, it should be understood that other housekeeping operations may be performed between operations, or that operations may be adjusted so that they occur at slightly different times, or may be distributed in a system that allows the processing operations to occur at various intervals associated with the processing, as long as the processing of the overlapping operations is performed in the desired way.

Although the foregoing disclosure has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications can be practiced within the scope of the appended claims. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and embodiments of the present disclosure are not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.

Claims

1. A cloud gaming method, comprising:
generating a plurality of video frames when executing a video game on a cloud gaming server;
encoding the plurality of video frames at an encoder bit rate, and transmitting the plurality of compressed video frames from a streamer of the cloud gaming server to a client;
measuring a maximum receive bandwidth of the client;
monitoring, at the streamer, the encoding of the plurality of video frames; and
dynamically adjusting parameters of the encoder based on the monitoring of the encoding.

The method of claim 1, wherein the dynamic adjustment of the parameters includes:
determining that the encoder bit rate used to encode a group of video frames from the plurality of video frames exceeds the maximum receive bandwidth; and
increasing a value of a QP parameter such that encoding is performed with less precision, the parameter being the QP parameter.

The method of claim 1, wherein the dynamic adjustment of the parameters includes:
determining that the encoder bit rate used to encode a group of video frames from the plurality of video frames is within the maximum receive bandwidth;
determining that there is excess bandwidth when transmitting the group of video frames; and
reducing a value of a QP parameter based on the excess bandwidth such that encoding is performed with greater precision, the parameter being the QP parameter.

The method of claim 1, wherein the dynamic adjustment of the parameters includes:
determining that a number of video frames compressed as I-frames from a group of video frames from the plurality of compressed video frames meets or exceeds a threshold number of I-frames; and
increasing a value of a QP parameter such that encoding is performed with less precision, the parameter being the QP parameter.

The method of claim 1, wherein the dynamic adjustment of the parameters includes:
determining that a number of video frames compressed as I-frames from a group of video frames from the plurality of compressed video frames is below a threshold number of I-frames; and
reducing a value of a QP parameter such that encoding is performed with greater precision, the parameter being the QP parameter.

The method of claim 1, wherein the dynamic adjustment of the parameters includes:
determining that a group of video frames from the plurality of video frames encoded and transmitted at the transmission rate includes a number of video frames each exceeding a target frame size, the number of video frames meeting or exceeding a threshold; and
reducing at least one of the target frame size and a maximum frame size as the parameter.

The method of claim 6, wherein the target frame size and the maximum frame size are equal.

The method of claim 1, wherein the dynamic adjustment of the parameters includes:
determining that a group of video frames from the plurality of video frames encoded and transmitted at the transmission rate includes a number of video frames each within a target frame size, the number of video frames being below a threshold; and
increasing at least one of the target frame size and a maximum frame size as the parameter.

The method of claim 1, wherein the dynamic adjustment of the parameters includes:
determining that a number of video frames identified as having a scene change from a group of video frames from the plurality of compressed video frames meets or exceeds a threshold number of scene changes; and
increasing a value of a QP parameter such that encoding is performed with less precision, the parameter being the QP parameter.

The method of claim 1, wherein the dynamic adjustment of the parameters includes:
determining that a number of video frames identified as having a scene change from a group of video frames from the plurality of compressed video frames is below a threshold number of scene changes; and
reducing a value of a QP parameter such that encoding is performed with greater precision, the parameter being the QP parameter.

The method of claim 1, further comprising prioritizing smooth playback at the client by disabling the skipping of encoding of video frames.

The method of claim 1, further comprising: at the encoder, dynamically adjusting an encoder bit rate speed based on the maximum receiving bandwidth of the client.

A non-transitory computer-readable medium storing a computer program for cloud gaming, the computer-readable medium comprising:
program instructions for generating a plurality of video frames when executing a video game on a cloud gaming server;
program instructions for measuring a maximum receive bandwidth of a client;
program instructions for encoding the plurality of video frames at an encoder bit rate, wherein the plurality of compressed video frames is transmitted from a streamer of the cloud gaming server to the client;
program instructions for monitoring, at the streamer, the encoding of the plurality of video frames; and
program instructions for dynamically adjusting parameters of the encoder based on the monitoring of the encoding.

The non-transitory computer-readable medium of claim 13, wherein the program instructions for dynamically adjusting the parameters include:
program instructions for determining that the encoder bit rate used to encode a group of video frames from the plurality of video frames exceeds the maximum receive bandwidth; and
program instructions for increasing a value of a QP parameter such that encoding is performed with less precision, the parameter being the QP parameter.

The non-transitory computer-readable medium of claim 13, wherein the program instructions for dynamically adjusting the parameters include:
program instructions for determining that the encoder bit rate used to encode a group of video frames from the plurality of video frames is within the maximum receive bandwidth;
program instructions for determining that there is excess bandwidth when transmitting the group of video frames; and
program instructions for reducing a value of a QP parameter based on the excess bandwidth such that encoding is performed with greater precision, the parameter being the QP parameter.

The non-transitory computer-readable medium of claim 13, wherein the program instructions for dynamically adjusting the parameters include:
program instructions for determining that a number of video frames identified as having a scene change from a group of video frames from the plurality of compressed video frames meets or exceeds a threshold number of scene changes; and
program instructions for increasing a value of a QP parameter such that encoding is performed with less precision, the parameter being the QP parameter.

The non-transitory computer-readable medium of claim 13, wherein the program instructions for dynamically adjusting the parameters include:
program instructions for determining that a number of video frames compressed as I-frames from a group of video frames from the plurality of compressed video frames is below a threshold number of I-frames; and
program instructions for reducing a value of a QP parameter such that encoding is performed with greater precision, the parameter being the QP parameter.

The non-transitory computer-readable medium of claim 13, wherein the program instructions for dynamically adjusting the parameters include:
program instructions for determining that a group of video frames from the plurality of video frames encoded and transmitted at the transmission rate includes a number of video frames each exceeding a target frame size, the number of video frames meeting or exceeding a threshold; and
program instructions for reducing at least one of the target frame size and a maximum frame size as the parameter.

The non-transitory computer-readable medium of claim 18, wherein, in the computer program for cloud gaming, the target frame size and the maximum frame size are equal.

The non-transitory computer-readable medium of claim 13, wherein the program instructions for dynamically adjusting the parameters include:
program instructions for determining that a group of video frames from the plurality of video frames encoded and transmitted at the transmission rate includes a number of video frames each within a target frame size, the number of video frames being below a threshold; and
program instructions for increasing at least one of the target frame size and a maximum frame size as the parameter.

The non-transitory computer-readable medium of claim 13, wherein the program instructions for dynamically adjusting the parameters include:
program instructions for determining that a number of video frames identified as having a scene change from a group of video frames from the plurality of compressed video frames is below a threshold number of scene changes; and
program instructions for reducing a value of a QP parameter such that encoding is performed with greater precision, the parameter being the QP parameter.

The non-transitory computer-readable medium of claim 13, further comprising program instructions for prioritizing smooth playback at the client by disabling skipping of encoding of video frames.

The non-transitory computer-readable medium of claim 13, further comprising program instructions for dynamically adjusting an encoder bit rate at the encoder based on the maximum receiving bandwidth of the client.

A computer system comprising:
a processor; and
a memory coupled to the processor and having instructions stored therein, the instructions, when executed by the computer system, causing the computer system to perform a cloud gaming method, the cloud gaming method comprising:
generating a plurality of video frames when executing a video game on a cloud gaming server;
encoding the plurality of video frames at an encoder bit rate, the compressed plurality of video frames being transmitted from a streamer of the cloud gaming server to a client;
measuring a maximum receive bandwidth of the client;
monitoring, at the streamer, the encoding of the plurality of video frames; and
dynamically adjusting parameters of the encoder based on the monitoring of the encoding.

The computer system of claim 25, wherein, in the method, dynamically adjusting the parameters includes:
determining that the encoder bit rate used to encode a group of video frames from the plurality of video frames exceeds the maximum receive bandwidth; and
increasing a value of a QP parameter such that encoding is performed with less precision, the parameter being the QP parameter.

The computer system of claim 25, wherein, in the method, dynamically adjusting the parameters includes:
determining that the encoder bit rate used to encode a group of video frames from the plurality of video frames is within the maximum receive bandwidth;
determining that there is excess bandwidth when transmitting the group of video frames; and
reducing a value of a QP parameter based on the excess bandwidth such that encoding is performed with greater precision, the parameter being the QP parameter.
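
The two bandwidth-driven QP adjustments recited above (raise QP when the encoder bit rate exceeds the client's maximum receive bandwidth, lower it when there is excess bandwidth) can be illustrated with a minimal sketch. The claims name no codec, step size, or headroom value; the QP range of 0 to 51 (as in 8-bit H.264), the `step` of 1, and the `headroom` fraction are assumptions chosen only for illustration, and `adjust_qp_for_bandwidth` is a hypothetical helper name.

```python
QP_MIN, QP_MAX = 0, 51  # assumed range (8-bit H.264 style); not stated in the claims


def adjust_qp_for_bandwidth(qp, encoder_bitrate_bps, max_receive_bps,
                            step=1, headroom=0.10):
    """Return a new QP value given the measured bit rate and client bandwidth.

    step and headroom are hypothetical tuning knobs.
    """
    if encoder_bitrate_bps > max_receive_bps:
        # Bit rate exceeds what the client can receive: encode with less precision.
        return min(QP_MAX, qp + step)
    excess_fraction = (max_receive_bps - encoder_bitrate_bps) / max_receive_bps
    if excess_fraction > headroom:
        # Excess bandwidth is available: encode with greater precision.
        return max(QP_MIN, qp - step)
    return qp  # within bandwidth, but no meaningful excess


if __name__ == "__main__":
    print(adjust_qp_for_bandwidth(30, 12_000_000, 10_000_000))
    print(adjust_qp_for_bandwidth(30, 5_000_000, 10_000_000))
```

A real streamer would presumably evaluate this per group of video frames and smooth the measurements; the sketch shows only the direction of each adjustment.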

The computer system of claim 25, wherein, in the method, dynamically adjusting the parameters includes:
determining that a number of video frames compressed as I-frames from a group of video frames from the plurality of compressed video frames meets or exceeds a threshold number of I-frames; and
increasing a value of a QP parameter such that encoding is performed with less precision, the parameter being the QP parameter.

The computer system of claim 25, wherein, in the method, dynamically adjusting the parameters includes:
determining that a number of video frames compressed as I-frames from a group of video frames from the plurality of compressed video frames is below a threshold number of I-frames; and
reducing a value of a QP parameter such that encoding is performed with greater precision, the parameter being the QP parameter.
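
As a rough illustration of the I-frame-count rule in the two claims above, the following sketch counts I-frames in a group of compressed frames and moves QP accordingly. The frame-type labels, the `step` size, and the QP bounds are assumptions, and `adjust_qp_for_iframes` is a hypothetical name, not something the claims define.

```python
def adjust_qp_for_iframes(qp, frame_types, iframe_threshold,
                          step=1, qp_min=0, qp_max=51):
    """Raise QP when a group has many I-frames, lower it otherwise.

    frame_types is a list such as ["I", "P", "P"]; bounds and step are assumed.
    """
    iframe_count = sum(1 for t in frame_types if t == "I")
    if iframe_count >= iframe_threshold:
        # Many large I-frames in the group: encode with less precision.
        return min(qp_max, qp + step)
    # Few I-frames: there is room to encode with greater precision.
    return max(qp_min, qp - step)


if __name__ == "__main__":
    print(adjust_qp_for_iframes(30, ["I", "P", "I", "P"], iframe_threshold=2))
    print(adjust_qp_for_iframes(30, ["I", "P", "P", "P"], iframe_threshold=2))
```

The same shape would apply to the scene-change-count claims, with frames flagged as scene changes counted in place of I-frames.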

The computer system of claim 25, wherein, in the method, dynamically adjusting the parameters includes:
determining that a group of video frames from the plurality of video frames encoded and transmitted at the transmission rate includes a number of video frames, the number of video frames meeting or exceeding a threshold, each of the number of video frames exceeding a target frame size; and
reducing at least one of the target frame size and a maximum frame size, the parameter being the at least one of the target frame size and the maximum frame size.

The computer system of claim 30, wherein the target frame size and the maximum frame size are equal.

The computer system of claim 25, wherein, in the method, dynamically adjusting the parameters includes:
determining that a group of video frames from the plurality of video frames encoded and transmitted at the transmission rate includes a number of video frames, the number of video frames being below a threshold, each of the number of video frames being within a target frame size; and
increasing at least one of the target frame size and a maximum frame size, the parameter being the at least one of the target frame size and the maximum frame size.
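
The frame-size claims above can be sketched as follows. The translated wording of the "increase" condition is ambiguous, so this sketch takes one reading: count the frames in the group that exceed the target frame size, shrink both sizes when that count meets the threshold, and grow them otherwise. The percentage factors and the name `adjust_frame_sizes` are assumptions for illustration only.

```python
def adjust_frame_sizes(target_size, max_size, frame_sizes, count_threshold,
                       shrink_pct=90, grow_pct=110):
    """Return (target_size, max_size) after one adjustment step, in bytes.

    shrink_pct and grow_pct are hypothetical; integer math avoids float drift.
    """
    oversized = sum(1 for size in frame_sizes if size > target_size)
    pct = shrink_pct if oversized >= count_threshold else grow_pct
    return target_size * pct // 100, max_size * pct // 100


if __name__ == "__main__":
    # Two of three frames exceed the 100000-byte target: sizes are reduced.
    print(adjust_frame_sizes(100_000, 150_000, [120_000, 130_000, 80_000], 2))
    # Only one frame exceeds the target: sizes are increased.
    print(adjust_frame_sizes(100_000, 150_000, [120_000, 80_000, 80_000], 2))
```

Setting `target_size` equal to `max_size` on input keeps them equal on output, which matches the dependent claim in which the two sizes are equal.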

The computer system of claim 25, wherein, in the method, dynamically adjusting the parameters includes:
determining that a number of video frames identified as having a scene change from a group of video frames from the plurality of compressed video frames meets or exceeds a threshold number of scene changes; and
increasing a value of a QP parameter such that encoding is performed with less precision, the parameter being the QP parameter.

The computer system of claim 25, wherein, in the method, dynamically adjusting the parameters includes:
determining that a number of video frames identified as having a scene change from a group of video frames from the plurality of compressed video frames is below a threshold number of scene changes; and
reducing a value of a QP parameter such that encoding is performed with greater precision, the parameter being the QP parameter.

The computer system of claim 25, wherein the method further comprises disabling skipping of encoding of video frames to prioritize smooth playback at the client.

The computer system of claim 25, wherein the method further comprises dynamically adjusting, at the encoder, an encoder bit rate based on the maximum receiving bandwidth of the client.

A cloud gaming method, comprising:
generating a plurality of video frames when executing a video game on a cloud gaming server;
predicting a scene change for a first video frame for the video game, the scene change being predicted before the first video frame is generated;
generating a scene change hint indicating that the first video frame is a scene change;
sending the scene change hint to an encoder;
delivering the first video frame to the encoder, the first video frame being encoded as an I-frame based on the scene change hint;
measuring a maximum receive bandwidth of a client; and
determining whether to encode or not encode a second video frame received at the encoder based on the maximum receive bandwidth of the client and a target resolution of a client display.

The method of claim 37, further comprising:
dynamically adjusting, at the encoder, an encoder bit rate based on the maximum receiving bandwidth of the client; and
transmitting the video frames to the client once the video frames are encoded.

The method of claim 37, wherein determining whether to encode or not encode the second video frame includes:
skipping encoding of the second video frame when the transmission rate to the client is low relative to the target resolution of the client display, such that the transmission rate to the client of a group of video frames from the compressed plurality of video frames exceeds the maximum receive bandwidth.

The method of claim 37, wherein determining whether to encode or not encode the second video frame includes:
encoding the second video frame normally when the transmission rate to the client is high relative to the target resolution of the client display, such that the transmission rate to the client for a group of video frames from the plurality of compressed video frames is within the maximum receive bandwidth.

The method of claim 40, wherein determining whether to encode or not encode the second video frame includes:
encoding the second video frame at a lower precision when the transmission rate to the client is medium relative to the target resolution of the client display.
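
The three cases in the claims above (low, medium, and high transmission rate relative to the target resolution of the client display) amount to a three-way decision, sketched below. The claims do not say how "low", "medium", and "high" are quantified; here the transmission rate is compared with a hypothetical bit rate required for the target resolution, and the `low` and `high` ratio cut-offs are assumptions.

```python
from enum import Enum


class EncodeAction(Enum):
    SKIP = "skip"                      # do not encode the frame
    LOW_PRECISION = "low_precision"    # encode with a higher QP
    NORMAL = "normal"                  # encode normally


def decide_encoding(transmission_rate_bps, required_bps_for_resolution,
                    low=0.5, high=1.0):
    """Pick an action for the second video frame.

    required_bps_for_resolution, low, and high are illustrative assumptions.
    """
    ratio = transmission_rate_bps / required_bps_for_resolution
    if ratio < low:
        return EncodeAction.SKIP
    if ratio >= high:
        return EncodeAction.NORMAL
    return EncodeAction.LOW_PRECISION


if __name__ == "__main__":
    for rate in (4_000_000, 7_000_000, 12_000_000):
        print(rate, decide_encoding(rate, 10_000_000).value)
```

A system that disables frame skipping to prioritize smooth playback, as in the earlier claims, could simply map the `SKIP` outcome to `LOW_PRECISION`.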

The method of claim 37, wherein predicting a scene change for the first video frame includes:
executing, at the cloud gaming server, game logic built on a game engine of the video game to generate the plurality of video frames;
executing scene change logic to predict the scene change for the first video frame, the prediction being based on a game state collected during execution of the game logic;
generating the scene change hint using the scene change logic; and
sending the scene change hint to the encoder before the encoder receives the first video frame.

The method of claim 42, wherein the scene change hints are delivered from the scene change logic to the encoder via an API.
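
The scene change hint flow in the claims above (predict from game state, send the hint to the encoder through an API before the frame arrives, then encode the hinted frame as an I-frame) might look like the following sketch. The `Encoder` class, `notify_scene_change`, the game-state keys, and `run_frames` are all hypothetical names invented for illustration; the claims do not specify the API.

```python
from typing import Dict, List, Set


class Encoder:
    """Toy encoder that only records which frame type it would emit."""

    def __init__(self) -> None:
        self._hinted_frames: Set[int] = set()

    def notify_scene_change(self, frame_id: int) -> None:
        # Hypothetical hint API: called before the frame itself is delivered.
        self._hinted_frames.add(frame_id)

    def encode(self, frame_id: int) -> str:
        if frame_id in self._hinted_frames:
            self._hinted_frames.discard(frame_id)
            return "I"  # hinted scene change: encode as an I-frame
        return "P"      # simplification: a real encoder also starts with an I-frame


def predict_scene_change(game_state: Dict[str, bool]) -> bool:
    # Assumed game-state signals; real scene change logic is game-specific.
    return bool(game_state.get("camera_cut") or game_state.get("level_load"))


def run_frames(encoder: Encoder, game_states: List[Dict[str, bool]]) -> List[str]:
    frame_types = []
    for frame_id, state in enumerate(game_states):
        if predict_scene_change(state):
            encoder.notify_scene_change(frame_id)  # hint precedes the frame
        # Rendering of the frame would happen here, after the hint is sent.
        frame_types.append(encoder.encode(frame_id))
    return frame_types


if __name__ == "__main__":
    print(run_frames(Encoder(), [{}, {"camera_cut": True}, {}]))
```

Because the hint reaches the encoder before the frame does, the encoder does not need to detect the scene change itself, which is the latency benefit the hint is meant to provide.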

The method of claim 37, wherein the second video frame is compressed after the first video frame is compressed by the encoder.

A non-transitory computer-readable medium storing a computer program for cloud gaming, the computer-readable medium comprising:
program instructions for generating a plurality of video frames when executing a video game on a cloud gaming server;
program instructions for predicting a scene change for a first video frame for the video game, the scene change being predicted before the first video frame is generated;
program instructions for generating a scene change hint indicating that the first video frame is a scene change;
program instructions for transmitting the scene change hint to an encoder;
program instructions for delivering the first video frame to the encoder, the first video frame being encoded as an I-frame based on the scene change hint;
program instructions for measuring a maximum receive bandwidth of a client; and
program instructions for determining whether to encode or not encode a second video frame received at the encoder based on the maximum receiving bandwidth of the client and a target resolution of a client display.

The non-transitory computer-readable medium of claim 45, further comprising:
program instructions for dynamically adjusting, at the encoder, an encoder bit rate based on the maximum receiving bandwidth of the client; and
program instructions for transmitting the video frames to the client once the video frames are encoded.

The non-transitory computer-readable medium of claim 45, wherein the program instructions for determining whether to encode or not encode the second video frame include:
program instructions for skipping encoding of the second video frame when the transmission rate to the client is low relative to the target resolution of the client display, such that the transmission rate to the client of a group of video frames from the plurality of compressed video frames exceeds the maximum receive bandwidth.

The non-transitory computer-readable medium of claim 45, wherein the program instructions for determining whether to encode or not encode the second video frame include:
program instructions for encoding the second video frame normally when the transmission rate to the client is high relative to the target resolution of the client display, such that the transmission rate to the client for a group of video frames from the plurality of compressed video frames is within the maximum receive bandwidth.

The non-transitory computer-readable medium of claim 48, wherein the program instructions for determining whether to encode or not encode the second video frame include:
program instructions for encoding the second video frame at a lower precision when the transmission rate to the client is medium relative to the target resolution of the client display.

The non-transitory computer-readable medium of claim 45, wherein the program instructions for predicting a scene change for the first video frame include:
program instructions for executing, at the cloud gaming server, game logic built on a game engine of the video game to generate the plurality of video frames;
program instructions for executing scene change logic to predict the scene change for the first video frame, the prediction being based on a game state collected during execution of the game logic;
program instructions for generating the scene change hint using the scene change logic; and
program instructions for transmitting the scene change hint to the encoder before the encoder receives the first video frame.

The non-transitory computer-readable medium of claim 50, wherein in the computer program for cloud gaming, the scene change hint is delivered from the scene change logic to the encoder via an API.

The non-transitory computer-readable medium of claim 45, wherein in the computer program for cloud gaming, the second video frame is compressed after the first video frame is compressed by the encoder.

A computer system comprising:
a processor; and
a memory coupled to the processor and having instructions stored therein, the instructions, when executed by the computer system, causing the computer system to perform a cloud gaming method, the cloud gaming method comprising:
generating a plurality of video frames when executing a video game on a cloud gaming server;
predicting a scene change for a first video frame for the video game, the scene change being predicted before the first video frame is generated;
generating a scene change hint indicating that the first video frame is a scene change;
sending the scene change hint to an encoder;
delivering the first video frame to the encoder, the first video frame being encoded as an I-frame based on the scene change hint;
measuring a maximum receive bandwidth of a client; and
determining whether to encode or not encode a second video frame received at the encoder based on the maximum receiving bandwidth of the client and a target resolution of a client display.

The computer system of claim 53, wherein the method further comprises:
dynamically adjusting, at the encoder, an encoder bit rate based on the maximum receiving bandwidth of the client; and
transmitting the video frames to the client once the video frames are encoded.

The computer system of claim 53, wherein, in the method, determining whether to encode or not encode the second video frame includes:
skipping encoding of the second video frame when the transmission rate to the client is low relative to the target resolution of the client display, such that the transmission rate to the client of a group of video frames from the compressed plurality of video frames exceeds the maximum receive bandwidth.

The computer system of claim 53, wherein, in the method, determining whether to encode or not encode the second video frame includes:
encoding the second video frame normally when the transmission rate to the client is high relative to the target resolution of the client display, such that the transmission rate to the client for a group of video frames from the plurality of compressed video frames is within the maximum receive bandwidth.

The computer system of claim 56, wherein, in the method, determining whether to encode or not encode the second video frame includes:
encoding the second video frame at a lower precision when the transmission rate to the client is medium relative to the target resolution of the client display.

The computer system of claim 53, wherein, in the method, predicting a scene change for the first video frame includes:
executing, at the cloud gaming server, game logic built on a game engine of the video game to generate the plurality of video frames;
executing scene change logic to predict the scene change for the first video frame, the prediction being based on a game state collected during execution of the game logic;
generating the scene change hint using the scene change logic; and
sending the scene change hint to the encoder before the encoder receives the first video frame.

The computer system of claim 58, wherein the scene change hints are delivered from the scene change logic to the encoder via an API.

The computer system of claim 53, wherein the second video frame is compressed after the first video frame is compressed by the encoder.