JPWO2021234132A5

Info

Publication number: JPWO2021234132A5
Application number: JP2022571129A
Authority: JP
Publication date: 2023-07-03

Description

The present invention relates to video encoding and video decoding, and in particular to a video encoder, a video decoder, methods for encoding and decoding, and a video data stream for realizing advanced video coding concepts.

H.265/HEVC (HEVC = High Efficiency Video Coding) is a video codec that already provides tools for improving, or even enabling, parallel processing at the encoder and/or decoder. For example, HEVC supports subdividing a picture into an array of tiles that are encoded independently of each other. Another concept supported by HEVC is WPP (wavefront parallel processing), according to which the CTU rows or CTU lines of a picture (CTU = coding tree unit) can be processed in parallel from left to right, for example in stripes, provided that some minimum CTU offset is observed in the processing of consecutive CTU lines. Nevertheless, it would be desirable to have a video codec that supports the parallel processing capabilities of video encoders and/or video decoders even more efficiently.
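The WPP dependency described above can be illustrated with a small scheduling sketch. It is not taken from the HEVC specification: it assumes that every CTU takes one time unit, that each CTU line has its own worker, and that the minimum offset between consecutive CTU lines is 2 CTUs (an illustrative default). The function name `wpp_start_times` is hypothetical.

```python
def wpp_start_times(rows: int, cols: int, min_offset: int = 2) -> list[list[int]]:
    """Earliest start time of each CTU under wavefront-style processing.

    Assumptions (illustrative only): one time unit per CTU, one worker per
    CTU line, and the CTU at (r, c) may start only after the CTU at
    (r - 1, c + min_offset - 1) in the line above has finished.
    """
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            t = 0
            if c > 0:
                # The left neighbour in the same line must be finished.
                t = start[r][c - 1] + 1
            if r > 0:
                # The line above must be far enough ahead (clamped at line end).
                dep = min(c + min_offset - 1, cols - 1)
                t = max(t, start[r - 1][dep] + 1)
            start[r][c] = t
    return start


if __name__ == "__main__":
    for line in wpp_start_times(3, 4):
        print(line)
```

Under these assumptions, a picture of 3 CTU lines with 4 CTUs each finishes after 8 time units (the last CTU starts at time 7), compared with 12 time units for strictly sequential processing.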

The following describes the introduction of VCL partitioning according to the state of the art (VCL = video coding layer).

In video coding, the process of encoding picture samples generally requires smaller partitions, in which the samples are divided into rectangular regions for joint processing such as prediction or transform coding. A picture is therefore partitioned into blocks of a particular size that remains constant during the encoding of the video sequence. The H.264/AVC standard (AVC = Advanced Video Coding) uses fixed-size blocks of 16×16 samples, so-called macroblocks.

The state-of-the-art HEVC standard (see [1]) uses coding tree blocks (CTBs) or coding tree units (CTUs) with a maximum size of 64×64 samples. In the further description of HEVC, the more common term CTU is used for blocks of this kind.

CTUs are processed in raster scan order, starting with the top-left CTU and proceeding line by line through the picture down to the bottom-right CTU.
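The partitioning into CTUs and their raster scan order can be sketched as follows. This is a minimal illustration that assumes a CTU size of 64×64 samples and partial CTUs at the right and bottom picture borders; the helper names `ctu_grid` and `raster_scan` are hypothetical.

```python
from typing import Iterator


def ctu_grid(pic_width: int, pic_height: int, ctu_size: int = 64) -> tuple[int, int]:
    """Number of CTU columns and rows needed to cover the picture."""
    cols = -(-pic_width // ctu_size)   # ceiling division
    rows = -(-pic_height // ctu_size)
    return cols, rows


def raster_scan(pic_width: int, pic_height: int,
                ctu_size: int = 64) -> Iterator[tuple[int, int, int]]:
    """Yield (ctu_address, x, y) from the top-left to the bottom-right CTU."""
    cols, rows = ctu_grid(pic_width, pic_height, ctu_size)
    for addr in range(cols * rows):
        yield addr, (addr % cols) * ctu_size, (addr // cols) * ctu_size


if __name__ == "__main__":
    print(ctu_grid(1920, 1080))
    print(list(raster_scan(128, 128)))
```

For a 1920×1080 picture this gives 30 CTU columns and 17 CTU rows, where the last row is only partially covered by picture samples.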

The encoded CTU data is organized into a kind of container called a slice. Originally, in earlier video coding standards, a slice meant a segment comprising one or more consecutive CTUs of a picture. Slices are used to segment the encoded data. From another point of view, the complete picture can also be defined as one large segment, so the term slice still applies for historical reasons. In addition to the encoded picture samples, a slice also comprises additional information related to the coding process of the slice itself, which is placed in a so-called slice header.

According to the state of the art, the VCL (video coding layer) also comprises techniques for fragmentation and spatial partitioning. Such partitioning may be applied in video coding for various reasons, among them load balancing in parallel processing, CTU size matching in network transmission, and error mitigation.

Other examples relate to RoI (RoI = region of interest) coding, where there is, for example, a region in the middle of the picture that a viewer can select, e.g. by a zoom-in operation (decoding only the RoI), or to gradual decoder refresh (GDR), in which intra data (which is typically put into one frame of a video sequence) is temporally distributed over several successive frames, e.g. as a column of intra blocks that swipes over the picture plane and resets the temporal prediction chain locally in the same way as an intra picture does for the whole picture plane. In the latter case, two regions exist in each picture: one that was recently reset, and one that is potentially affected by errors and error propagation.
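The GDR behaviour described above can be sketched as follows. The sketch assumes that the intra column sweeps from left to right and that the refresh is spread evenly over a fixed number of frames; neither assumption comes from the source, and the name `gdr_regions` is hypothetical.

```python
def gdr_regions(frame_idx: int, ctu_cols: int,
                refresh_period: int) -> tuple[range, range, range]:
    """Split CTU columns of one frame into (clean, intra, dirty) ranges.

    clean: columns already reset by intra columns of earlier frames
    intra: columns coded as intra blocks in this frame
    dirty: columns not yet reset, possibly affected by error propagation
    Assumes a left-to-right sweep completed within `refresh_period` frames.
    """
    cols_per_frame = -(-ctu_cols // refresh_period)  # ceiling division
    intra_start = min(frame_idx * cols_per_frame, ctu_cols)
    intra_end = min(intra_start + cols_per_frame, ctu_cols)
    return (range(0, intra_start),
            range(intra_start, intra_end),
            range(intra_end, ctu_cols))


if __name__ == "__main__":
    for k in range(4):
        clean, intra, dirty = gdr_regions(k, ctu_cols=8, refresh_period=4)
        print(k, list(clean), list(intra), list(dirty))
```

The clean and intra columns together correspond to the recently reset region; the dirty columns correspond to the region that may still be affected by errors.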

Reference picture resampling (RPR) is a technique used in video coding to adapt the quality/rate of the video not only by using a coarser quantization parameter but also by adapting the resolution of potentially each transmitted picture. Thus, a reference used for inter prediction may have a different size than the picture that is currently being predicted for encoding. Basically, RPR requires a resampling process in the prediction loop, e.g. upsampling and downsampling filters to be defined.
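The size mismatch between a reference picture and the current picture can be illustrated with a simple scaling sketch. It only computes exact scale ratios and maps a sample position; it is not the normative resampling process of any standard, which would additionally define fixed-point arithmetic and interpolation filters. The function names are hypothetical.

```python
from fractions import Fraction


def rpr_scale(ref_size: tuple[int, int],
              cur_size: tuple[int, int]) -> tuple[Fraction, Fraction]:
    """Horizontal and vertical ratio of reference size to current size."""
    (ref_w, ref_h), (cur_w, cur_h) = ref_size, cur_size
    return Fraction(ref_w, cur_w), Fraction(ref_h, cur_h)


def map_to_reference(x: int, y: int, ref_size: tuple[int, int],
                     cur_size: tuple[int, int]) -> tuple[Fraction, Fraction]:
    """Map a sample position of the current picture into the reference picture."""
    sx, sy = rpr_scale(ref_size, cur_size)
    return x * sx, y * sy


if __name__ == "__main__":
    # Current picture is 1920x1080, reference picture was coded at 960x540.
    print(map_to_reference(100, 50, ref_size=(960, 540), cur_size=(1920, 1080)))
```

A mapped position that is not an integer is where an interpolation (upsampling or downsampling) filter would be needed.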

Depending on the flavor, RPR can result in a change of the coded picture size at any picture, or it can be restricted to occur only at some particular pictures, e.g. only at particular positions bound, for instance, to the segment boundaries of adaptive HTTP streaming.

The object of the present invention is to provide improved concepts for video encoding and video decoding.

The object of the present invention is solved by the subject-matter of the independent claims.

According to a first aspect of the invention, an apparatus for receiving an input video data stream is provided. The input video data stream has a video encoded into it. The apparatus is configured to generate an output video data stream from the input video data stream. Moreover, the apparatus is to determine whether or not a picture of the video preceding a dependent random access picture shall be output.

Furthermore, a video data stream is provided. The video data stream has a video encoded into it. The video data stream comprises an indication of whether or not a picture of the video preceding a dependent random access picture shall be output.

Furthermore, a video encoder is provided. The video encoder is configured to encode a video into a video data stream. Moreover, the video encoder is configured to generate the video data stream such that it comprises an indication of whether or not a picture of the video preceding a dependent random access picture shall be output.

Furthermore, a video decoder for receiving a video data stream having a video stored therein is provided. The video decoder is configured to decode the video from the video data stream. The video decoder is configured to decode the video depending on an indication of whether or not a picture of the video preceding a dependent random access picture shall be output.
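How a decoder might act on such an indication can be sketched as follows. This is only an illustration under stated assumptions: pictures are given as labels in decoding order, `drap_index` marks the dependent random access picture, and a boolean stands in for the indication. The function name and the data model are hypothetical and do not reproduce any bitstream syntax.

```python
def pictures_to_output(decoding_order: list[str], drap_index: int,
                       output_preceding: bool) -> list[str]:
    """Select the pictures to output around a dependent random access picture.

    If `output_preceding` is False, pictures preceding the picture at
    `drap_index` are decoded as needed but withheld from output.
    """
    if output_preceding:
        return list(decoding_order)
    return list(decoding_order[drap_index:])


if __name__ == "__main__":
    stream = ["IRAP", "DRAP", "P1", "P2"]
    print(pictures_to_output(stream, drap_index=1, output_preceding=False))
```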

Furthermore, a method for receiving an input video data stream is provided. The input video data stream has a video encoded into it. The method comprises generating an output video data stream from the input video data stream. Moreover, the method comprises determining whether or not a picture of the video preceding a dependent random access picture shall be output.

Furthermore, a method for encoding a video into a video data stream is provided. The method comprises generating the video data stream such that it comprises an indication of whether or not a picture of the video preceding a dependent random access picture shall be output.

Furthermore, a method for receiving a video data stream having a video stored therein is provided. The method comprises decoding the video from the video data stream. Decoding the video is conducted depending on an indication of whether or not a picture of the video preceding a dependent random access picture shall be output.

Furthermore, a computer program is provided for implementing one of the above-described methods when being executed on a computer or signal processor.

According to a second aspect of the invention, an apparatus for receiving one or more input video data streams is provided. Each of the one or more input video data streams has an input video encoded into it. The apparatus is configured to generate an output video data stream from the one or more input video data streams, the output video data stream encoding an output video, such that the output video is the input video encoded within one of the one or more input video data streams, or such that the output video depends on the input video of at least one of the one or more input video data streams. Moreover, the apparatus is configured to determine an access unit removal time, from a coded picture buffer, of a current picture of a plurality of pictures of the output video. The apparatus is configured to determine whether or not to use coded picture buffer delay offset information for determining the access unit removal time of the current picture from the coded picture buffer.
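The decision whether to use the coded picture buffer delay offset can be illustrated with a small timing sketch. It assumes, for illustration only, that the removal time is a base time plus a clock tick multiplied by a removal delay, and that the delay offset is subtracted from that delay when it is used. The function and all parameter names are hypothetical.

```python
def au_removal_time(base_time: float, clock_tick: float,
                    cpb_removal_delay: int, cpb_delay_offset: int = 0,
                    use_offset: bool = False) -> float:
    """Access unit removal time from the coded picture buffer (sketch).

    Assumed model: removal_time = base_time + clock_tick * delay, where
    delay = cpb_removal_delay - cpb_delay_offset if the offset is used,
    and delay = cpb_removal_delay otherwise.
    """
    delay = cpb_removal_delay - (cpb_delay_offset if use_offset else 0)
    return base_time + clock_tick * delay


if __name__ == "__main__":
    # Example: pictures were dropped upstream, so the offset compensates
    # for the delay that those pictures would have contributed.
    print(au_removal_time(1.0, 0.25, 10, cpb_delay_offset=2, use_offset=False))
    print(au_removal_time(1.0, 0.25, 10, cpb_delay_offset=2, use_offset=True))
```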

Furthermore, a video data stream is provided. The video data stream has a video encoded into it. The video data stream comprises coded picture buffer delay offset information.

Furthermore, a video decoder for receiving a video data stream having a video stored therein is provided. The video decoder is configured to decode the video from the video data stream. Moreover, the video decoder is configured to decode the video depending on an access unit removal time, from a coded picture buffer, of a current picture of a plurality of pictures of the video. The video decoder is configured to decode the video depending on an indication of whether or not coded picture buffer delay offset information is to be used for determining the access unit removal time of the current picture from the coded picture buffer.

Furthermore, a method for receiving one or more input video data streams is provided. Each of the one or more input video data streams has an input video encoded into it. The method comprises generating an output video data stream from the one or more input video data streams, the output video data stream encoding an output video, wherein generating the output video data stream is conducted such that the output video is the input video encoded within one of the one or more input video data streams, or such that the output video depends on the input video of at least one of the one or more input video data streams. Moreover, the method comprises determining an access unit removal time, from a coded picture buffer, of a current picture of a plurality of pictures of the output video. Furthermore, the method comprises determining whether or not to use coded picture buffer delay offset information for determining the access unit removal time of the current picture from the coded picture buffer.

Furthermore, a method for encoding a video into a video data stream according to an embodiment is provided. The method comprises generating the video data stream such that it comprises coded picture buffer delay offset information.

Furthermore, a method for receiving a video data stream having a video stored therein is provided. The method comprises decoding the video from the video data stream. Decoding the video is conducted depending on an access unit removal time, from a coded picture buffer, of a current picture of a plurality of pictures of the video. Moreover, decoding the video is conducted depending on an indication of whether or not coded picture buffer delay offset information is to be used for determining the access unit removal time of the current picture from the coded picture buffer.

According to a third aspect of the invention, a video data stream is provided. The video data stream has a video encoded into it. Moreover, the video data stream comprises an initial coded picture buffer removal delay. Furthermore, the video data stream comprises an initial coded picture buffer removal offset. Moreover, the video data stream comprises information that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods.
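The property that this information signals can be checked with a few lines of code. The sketch below assumes that each buffering period is represented by a pair of integers (initial removal delay, initial removal offset) in the same time unit; the function name `sum_is_constant` and the sample values are hypothetical.

```python
def sum_is_constant(buffering_periods: list[tuple[int, int]]) -> bool:
    """True if delay + offset is identical for all given buffering periods.

    Each entry is (initial_cpb_removal_delay, initial_cpb_removal_offset).
    """
    sums = {delay + offset for delay, offset in buffering_periods}
    return len(sums) <= 1


if __name__ == "__main__":
    # Two buffering periods whose delay and offset trade off against each other.
    print(sum_is_constant([(90000, 0), (60000, 30000)]))
    # After concatenating streams, the sum may differ between periods.
    print(sum_is_constant([(90000, 0), (60000, 0)]))
```

A device that concatenates two streams could, under these assumptions, run such a check over the buffering periods of the output stream before writing the corresponding information.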

Furthermore, a video encoder is provided. The video encoder is configured to encode a video into a video data stream. Moreover, the video encoder is configured to generate the video data stream such that it comprises an initial coded picture buffer removal delay. Furthermore, the video encoder is configured to generate the video data stream such that it comprises an initial coded picture buffer removal offset. Moreover, the video encoder is configured to generate the video data stream such that it comprises information that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods.

Furthermore, an apparatus for receiving two input video data streams, being a first input video data stream and a second input video data stream, is provided. Each of the two input video data streams has an input video encoded into it. The apparatus is configured to generate an output video data stream from the two input video data streams, the output video data stream encoding an output video, wherein the apparatus is configured to generate the output video data stream by concatenating the first input video data stream and the second input video data stream. Moreover, the apparatus is configured to generate the output video data stream such that it comprises an initial coded picture buffer removal delay. Furthermore, the apparatus is configured to generate the output video data stream such that it comprises an initial coded picture buffer removal offset. Moreover, the apparatus is configured to generate the output video data stream such that it comprises information that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods.

Furthermore, a video decoder for receiving a video data stream having a video stored therein is provided. The video decoder is configured to decode the video from the video data stream. Moreover, the video data stream comprises an initial coded picture buffer removal delay. Furthermore, the video data stream comprises an initial coded picture buffer removal offset. Moreover, the video data stream comprises information that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods. Furthermore, the video decoder is configured to decode the video depending on the information that indicates whether or not that sum is defined to be constant across the two or more buffering periods.

Furthermore, a method for encoding a video into a video data stream is provided. The method comprises generating the video data stream such that it comprises an initial coded picture buffer removal delay. Moreover, the method comprises generating the video data stream such that it comprises an initial coded picture buffer removal offset. Furthermore, the method comprises generating the video data stream such that it comprises information that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods.

Furthermore, a method for receiving two input video data streams, being a first input video data stream and a second input video data stream, is provided. Each of the two input video data streams has an input video encoded into it. The method comprises generating an output video data stream from the two input video data streams, the output video data stream encoding an output video, wherein the output video data stream is generated by concatenating the first input video data stream and the second input video data stream. Moreover, the method comprises generating the output video data stream such that it comprises an initial coded picture buffer removal delay. Furthermore, the method comprises generating the output video data stream such that it comprises an initial coded picture buffer removal offset. Moreover, the method comprises generating the output video data stream such that it comprises information that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods.

Furthermore, a method for receiving a video data stream having a video stored therein is provided. The method comprises decoding the video from the video data stream. The video data stream comprises an initial coded picture buffer removal delay. Moreover, the video data stream comprises an initial coded picture buffer removal offset. Furthermore, the video data stream comprises information that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods. The method comprises decoding the video depending on the information that indicates whether or not that sum is defined to be constant across the two or more buffering periods.

According to a fourth aspect of the invention, a video data stream is provided. The video data stream has a video encoded into it. Moreover, the video data stream comprises an indication of whether or not a non-scalable-nested picture timing supplemental enhancement information (SEI) message of a network abstraction layer (NAL) unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of said access unit. If the indication has a first value, the non-scalable-nested picture timing SEI message of said NAL unit of said access unit is defined to apply to all output layer sets of the plurality of output layer sets of said access unit. If the indication has a value different from the first value, the indication does not define whether or not the non-scalable-nested picture timing SEI message of said NAL unit of said access unit applies to all output layer sets of the plurality of output layer sets of said access unit.
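The asymmetric semantics of this indication (one value makes a definite statement, any other value makes no statement) can be captured with a three-valued result. The sketch below is illustrative only: the choice of 1 as the first value and the function name are assumptions, not taken from the source.

```python
from typing import Optional


def pt_sei_applies_to_all_ols(indication: int, first_value: int = 1) -> Optional[bool]:
    """Interpret the indication for a non-scalable-nested picture timing SEI message.

    Returns True if the message is defined to apply to all output layer sets
    of the access unit, and None if the indication leaves this undefined.
    The result is never False, because the indication cannot rule it out.
    """
    if indication == first_value:
        return True
    return None


if __name__ == "__main__":
    print(pt_sei_applies_to_all_ols(1))
    print(pt_sei_applies_to_all_ols(0))
```

Returning `None` rather than `False` keeps callers from treating the second case as a guarantee that the message does not apply to all output layer sets.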

Furthermore, a video encoder is provided. The video encoder is configured to encode a video into a video data stream. Moreover, the video encoder is configured to generate the video data stream such that it comprises an indication of whether or not a non-scalable-nested picture timing supplemental enhancement information (SEI) message of a network abstraction layer (NAL) unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of said access unit. If the indication has a first value, the non-scalable-nested picture timing SEI message of said NAL unit of said access unit is defined to apply to all output layer sets of the plurality of output layer sets of said access unit. If the indication has a value different from the first value, the indication does not define whether or not the non-scalable-nested picture timing SEI message of said NAL unit of said access unit applies to all output layer sets of the plurality of output layer sets of said access unit.

Furthermore, an apparatus for receiving an input video data stream is provided. The input video data stream has a video encoded into it. The apparatus is configured to generate a processed video data stream from the input video data stream. Moreover, the apparatus is configured to generate the processed video data stream such that it comprises an indication of whether or not a non-scalable-nested picture timing supplemental enhancement information (SEI) message of a network abstraction layer (NAL) unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the processed video data stream is defined to apply to all output layer sets of a plurality of output layer sets of said access unit. If the indication has a first value, the non-scalable-nested picture timing SEI message of said NAL unit of said access unit is defined to apply to all output layer sets of the plurality of output layer sets of said access unit. If the indication has a value different from the first value, the indication does not define whether or not the non-scalable-nested picture timing SEI message of said NAL unit of said access unit applies to all output layer sets of the plurality of output layer sets of said access unit.

Furthermore, a video decoder for receiving a video data stream having a video stored therein is provided. The video decoder is configured to decode the video from the video data stream. The video data stream comprises an indication of whether or not a non-scalable-nested picture timing supplemental enhancement information (SEI) message of a network abstraction layer (NAL) unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of said access unit. If the indication has a first value, the non-scalable-nested picture timing SEI message of said NAL unit of said access unit is defined to apply to all output layer sets of the plurality of output layer sets of said access unit. If the indication has a value different from the first value, the indication does not define whether or not the non-scalable-nested picture timing SEI message of said NAL unit of said access unit applies to all output layer sets of the plurality of output layer sets of said access unit.

Furthermore, a method for encoding a video into a video data stream is provided. The method comprises generating the video data stream such that it comprises an indication of whether or not a non-scalable-nested picture timing supplemental enhancement information (SEI) message of a network abstraction layer (NAL) unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of said access unit. If the indication has a first value, the non-scalable-nested picture timing SEI message of said NAL unit of said access unit is defined to apply to all output layer sets of the plurality of output layer sets of said access unit. If the indication has a value different from the first value, the indication does not define whether or not the non-scalable-nested picture timing SEI message of said NAL unit of said access unit applies to all output layer sets of the plurality of output layer sets of said access unit.

更に、入力ビデオデータストリームを受信するための方法が提供される。入力ビデオデータストリームには、ビデオが符号化されている。本方法は、入力ビデオデータストリームから処理済みビデオデータストリームを生成するステップを含む。更に、本方法は、処理済みビデオデータストリームの１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの複数のアクセスユニットのうちの１つのアクセスユニットのネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用するように定義されているか否かを示す表示を処理済みビデオデータストリームが含むように、処理済みビデオデータストリームを生成するステップを含む。表示が第１の値を有する場合、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージは、前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義される。表示が第１の値とは異なる値を有する場合、表示は、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用されるか否かを定義しない。 Additionally, a method is provided for receiving an input video data stream. The input video data stream has video encoded therein. The method includes generating a processed video data stream from the input video data stream. Further, the method includes generating the processed video data stream such that the processed video data stream includes an indication of whether a scalable non-nested picture timing supplemental enhancement information message of a network abstraction layer unit of one access unit of a plurality of access units of one encoded video sequence of one or more encoded video sequences of the processed video data stream is defined to apply to all output layer sets of a plurality of output layer sets of said access unit. If the indication has a first value, the scalable non-nested picture timing supplemental enhancement information message of said network abstraction layer unit of said access unit is defined to apply to all output layer sets of the plurality of output layer sets of said access unit. If the indication has a value different from the first value, the indication does not define whether or not the scalable non-nested picture timing supplemental enhancement information message of said network abstraction layer unit of said access unit applies to all output layer sets of the plurality of output layer sets of said access unit.

更に、ビデオを格納したビデオデータストリームを受信するための方法が提供される。本方法は、ビデオデータストリームからビデオを復号するステップを含む。ビデオデータストリームは、ビデオデータストリームの１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの複数のアクセスユニットのうちの１つのアクセスユニットのネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義されているか否かを示す表示を含む。表示が第１の値を有する場合、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージは、前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義される。表示が第１の値とは異なる値を有する場合、表示は、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用されるか否かを定義しない。 Additionally, a method is provided for receiving a video data stream containing video. The method includes decoding the video from the video data stream. The video data stream includes an indication of whether a scalable non-nested picture timing supplemental enhancement information message of a network abstraction layer unit of one access unit of a plurality of access units of one encoded video sequence of one or more encoded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of said access unit. If the indication has a first value, the scalable non-nested picture timing supplemental enhancement information message of said network abstraction layer unit of said access unit is defined to apply to all output layer sets of the plurality of output layer sets of said access unit. If the indication has a value different from the first value, the indication does not define whether or not the scalable non-nested picture timing supplemental enhancement information message of said network abstraction layer unit of said access unit applies to all output layer sets of the plurality of output layer sets of said access unit.
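
The two-way semantics of the indication described above can be illustrated with a minimal Python sketch. The names `AccessUnit`, `pt_sei_applies_to_all_flag` and `FIRST_VALUE`, as well as the choice of 1 as the first value, are assumptions made for illustration only and do not appear in the description.

```python
from dataclasses import dataclass
from typing import List, Optional

FIRST_VALUE = 1  # assumed encoding of the "first value" of the indication


@dataclass
class AccessUnit:
    ols_indices: List[int]           # output layer sets of this access unit
    pt_sei_applies_to_all_flag: int  # hypothetical name for the indication


def non_nested_pt_sei_scope(au: AccessUnit) -> Optional[List[int]]:
    """Return the output layer sets the scalable non-nested picture timing
    SEI message is defined to apply to, or None when the indication leaves
    this undefined."""
    if au.pt_sei_applies_to_all_flag == FIRST_VALUE:
        # First value: the message applies to every output layer set.
        return list(au.ols_indices)
    # Any other value: the indication makes no statement either way.
    return None
```

Returning `None` rather than an empty list reflects that a value other than the first value does not say the message is inapplicable; it only leaves the question open.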

本発明の第５の態様によれば、ビデオデータストリームが提供される。このビデオデータストリームには、ビデオが符号化されている。更に、ビデオデータストリームは、１つ以上のスケーラブルネストされた補足拡張情報メッセージを含む。１つ以上のスケーラブルネストされた補足拡張情報メッセージは、複数のシンタックス要素を含む。複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素は、ビデオデータストリーム又はビデオデータストリームの一部のスケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される。 According to a fifth aspect of the invention, a video data stream is provided. The video data stream has video encoded therein. Further, the video data stream includes one or more scalable nested supplemental enhancement information messages. The one or more scalable nested supplemental enhancement information messages include a plurality of syntax elements. Each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream.

更に、ビデオエンコーダが提供される。ビデオエンコーダは、ビデオをビデオデータストリームに符号化するように構成される。更に、ビデオエンコーダは、ビデオデータストリームが１つ以上のスケーラブルネストされた補足拡張情報メッセージを備えるように、ビデオデータストリームを生成するように構成される。更に、ビデオエンコーダは、１つ以上のスケーラブルネストされた補足拡張情報メッセージが複数のシンタックス要素を含むように、ビデオデータストリームを生成するように構成される。更に、ビデオエンコーダは、複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素がビデオデータストリームの又はビデオデータストリームの一部のスケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有すると定義されるように、ビデオデータストリームを生成するように構成される。 Additionally, a video encoder is provided. A video encoder is configured to encode video into a video data stream. Further, the video encoder is configured to generate the video data stream such that the video data stream comprises one or more scalable nested supplemental enhancement information messages. Further, the video encoder is configured to generate the video data stream such that one or more scalable nested supplemental enhancement information messages include multiple syntax elements. Further, the video encoder may determine that each syntax element of the one or more syntax elements of the plurality of syntax elements is in each scalable nested supplemental enhancement information message of the video data stream or part of the video data stream. It is configured to generate video data streams, defined as having the same size.

更に、入力ビデオデータストリームを受信するための装置が提供される。入力ビデオデータストリームには、ビデオが符号化されている。装置は、入力ビデオデータストリームから出力ビデオデータストリームを生成するように構成される。ビデオデータストリームは、１つ以上のスケーラブルネストされた補足拡張情報メッセージを含む。１つ以上のスケーラブルネストされた補足拡張情報メッセージは、複数のシンタックス要素を含む。複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素は、ビデオデータストリーム又はビデオデータストリームの一部のスケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される。装置は、１つ以上のスケーラブルネストされた補足拡張情報メッセージを処理するように構成される。 Also provided is an apparatus for receiving an input video data stream. The input video data stream has video encoded therein. The apparatus is configured to generate an output video data stream from the input video data stream. The video data stream includes one or more scalable nested supplemental enhancement information messages. The one or more scalable nested supplemental enhancement information messages include a plurality of syntax elements. Each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream. The apparatus is configured to process the one or more scalable nested supplemental enhancement information messages.

更に、ビデオを格納したビデオデータストリームを受信するためのビデオデコーダが提供される。ビデオデコーダは、ビデオデータストリームからビデオを復号するように構成される。ビデオデータストリームは、１つ以上のスケーラブルネストされた補足拡張情報メッセージを含む。１つ以上のスケーラブルネストされた補足拡張情報メッセージは、複数のシンタックス要素を含む。複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素は、ビデオデータストリーム又はビデオデータストリームの一部のスケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される。ビデオデコーダは、複数のシンタックス要素のうちの１つ以上のシンタックス要素に応じてビデオを復号するように構成される。 A video decoder is also provided for receiving a video data stream containing video. The video decoder is configured to decode the video from the video data stream. The video data stream includes one or more scalable nested supplemental enhancement information messages. The one or more scalable nested supplemental enhancement information messages include a plurality of syntax elements. Each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream. The video decoder is configured to decode the video depending on the one or more syntax elements of the plurality of syntax elements.

更に、ビデオをビデオデータストリームに符号化するための方法が提供される。本方法は、ビデオデータストリームが１つ以上のスケーラブルネストされた補足拡張情報メッセージを含むようにビデオデータストリームを生成するステップを含む。更に、本方法は、１つ以上のスケーラブルネストされた補足拡張情報メッセージが複数のシンタックス要素を含むように、ビデオデータストリームを生成するステップを含む。更に、本方法は、複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素がビデオデータストリーム又はビデオデータストリームの一部のスケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有すると定義されるように、ビデオデータストリームを生成するステップを含む。 Additionally, a method is provided for encoding video into a video data stream. The method includes generating a video data stream such that the video data stream includes one or more scalable nested supplemental enhancement information messages. Additionally, the method includes generating the video data stream such that one or more scalable nested supplemental enhancement information messages include multiple syntax elements. Further, the method is characterized in that each syntax element of the one or more syntax elements of the plurality of syntax elements is the same in each of the scalable nested supplemental enhancement information messages of the video data stream or part of the video data stream. Generating a video data stream as defined to have a size.

更に、入力ビデオデータストリームを受信するための方法が提供される。入力ビデオデータストリームには、ビデオが符号化されている。本方法は、入力ビデオデータストリームから出力ビデオデータストリームを生成するステップを含む。ビデオデータストリームは、１つ以上のスケーラブルネストされた補足拡張情報メッセージを含む。１つ以上のスケーラブルネストされた補足拡張情報メッセージは、複数のシンタックス要素を含む。複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素は、ビデオデータストリーム又はビデオデータストリームの一部のスケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される。本方法は、１つ以上のスケーラブルネストされた補足拡張情報メッセージを処理するステップを含む。 Additionally, a method is provided for receiving an input video data stream. The input video data stream has video encoded therein. The method includes generating an output video data stream from the input video data stream. The video data stream includes one or more scalable nested supplemental enhancement information messages. The one or more scalable nested supplemental enhancement information messages include a plurality of syntax elements. Each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream. The method includes processing the one or more scalable nested supplemental enhancement information messages.

更に、ビデオを格納したビデオデータストリームを受信するための方法が提供される。本方法は、ビデオデータストリームからビデオを復号するステップを含む。ビデオデータストリームは、１つ以上のスケーラブルネストされた補足拡張情報メッセージを含む。１つ以上のスケーラブルネストされた補足拡張情報メッセージは、複数のシンタックス要素を含む。複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素は、ビデオデータストリーム又はビデオデータストリームの一部のスケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される。ビデオを復号するステップは、複数のシンタックス要素のうちの１つ以上のシンタックス要素に応じて行われる。 Additionally, a method is provided for receiving a video data stream containing video. The method includes decoding the video from the video data stream. The video data stream includes one or more scalable nested supplemental enhancement information messages. The one or more scalable nested supplemental enhancement information messages include a plurality of syntax elements. Each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream. Decoding the video is performed depending on the one or more syntax elements of the plurality of syntax elements.
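
The same-size constraint of the fifth aspect can be pictured as a consistency check over parsed messages. The following Python sketch assumes a purely hypothetical representation in which each scalable nested SEI message is a dictionary mapping a syntax element name to its coded length in bits; neither this representation nor the element names used in the check come from the description.

```python
from typing import Dict, Iterable, List


def fixed_size_violations(messages: List[Dict[str, int]],
                          fixed_elements: Iterable[str]) -> List[str]:
    """Return the names of syntax elements that were expected to have the
    same size in every message but were found with differing bit lengths."""
    violations = []
    for name in fixed_elements:
        # Collect every distinct coded length observed for this element.
        sizes = {msg[name] for msg in messages if name in msg}
        if len(sizes) > 1:
            violations.append(name)
    return violations
```

An empty result means that, for the messages examined (the whole stream or only a portion of it), each listed element has one single size.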

好ましい実施形態は、従属請求項に提供される。 Preferred embodiments are provided in the dependent claims.

以下では、本発明の実施形態を、図面を参照して詳細に説明する。 Embodiments of the invention are described in detail below with reference to the drawings.
一実施形態に係るビデオをビデオデータストリームに符号化するためのビデオエンコーダを示す。一実施形態に係る入力ビデオデータストリームを受信するための装置を示す。一実施形態に係るビデオを格納したビデオデータストリームを受信するためのビデオデコーダを示す。元のビットストリーム（図４の上部に示される）と、一実施形態に係るピクチャをドロップした後のビットストリーム（図４の下部に示される）を示す。一実施形態に係る２つのビットストリームのうちの１つからピクチャがドロップされた後の２つのビットストリームのスプライシングを示す。他の実施形態に係る２つのビットストリームのスプライシングを示す。一実施形態に係る２層ビットストリームにおける２セットのＨＲＤＳＥＩ、スケーラブルネストされたＳＥＩ及びスケーラブルネストされないＳＥＩを示す。ビデオエンコーダを示す。ビデオデコーダを示す。一方では再構成ピクチャなどの再構成信号と、他方ではデータストリームでシグナリングされる予測残差信号と予測信号との組み合わせとの関係を示す。 FIG. 1 shows a video encoder for encoding video into a video data stream according to one embodiment. FIG. 2 shows an apparatus for receiving an input video data stream according to one embodiment. FIG. 3 shows a video decoder for receiving a video data stream containing video according to one embodiment. FIG. 4 shows the original bitstream (shown at the top of FIG. 4) and the bitstream after dropping pictures according to one embodiment (shown at the bottom of FIG. 4). FIG. 5 shows splicing of two bitstreams after pictures have been dropped from one of the two bitstreams according to one embodiment. FIG. 6 shows splicing of two bitstreams according to another embodiment. FIG. 7 shows two sets of HRD SEI, scalable nested SEI and scalable non-nested SEI, in a two-layer bitstream according to one embodiment. FIG. 8 shows a video encoder. FIG. 9 shows a video decoder. FIG. 10 shows the relationship between a reconstructed signal, such as a reconstructed picture, on the one hand, and the combination of a prediction residual signal signaled in a data stream and a prediction signal, on the other hand.

以下の図の説明は、本発明の実施形態を組み込むことができる符号化フレームワークの例を形成するために、ビデオのピクチャを符号化するためのブロックベースの予測コーデックのエンコーダ及びデコーダの説明の提示から始まる。図８乃至図１０を参照して、エンコーダ及びデコーダのそれぞれについて説明する。以下、本発明の概念の実施形態の説明を、そのような概念を図８及び図９のエンコーダ及びデコーダにそれぞれどのように組み込むことができるかに関する説明と共に提示するが、図１～図３及び以下で説明する実施形態は、図８及び図９のエンコーダ及びデコーダの基礎となる符号化フレームワークに従って動作しないエンコーダ及びデコーダを形成するために使用することもできる。 The following description of the figures starts with a description of an encoder and a decoder of a block-based predictive codec for coding pictures of a video, in order to form an example of a coding framework into which embodiments of the present invention may be built. The respective encoder and decoder are described with reference to FIGS. 8 to 10. Thereafter, embodiments of the concept of the present invention are described along with a description of how such concepts could be built into the encoder and decoder of FIGS. 8 and 9, respectively, although the embodiments described with FIGS. 1 to 3 and below may also be used to form encoders and decoders that do not operate according to the coding framework underlying the encoder and decoder of FIGS. 8 and 9.

図８は、変換ベースの残差符号化を例示的に使用してピクチャ１２をデータストリーム１４に予測的に符号化するための装置であるビデオエンコーダを示す。装置又はエンコーダは、参照符号１０を使用して示されている。図９は、対応するビデオデコーダ２０、例えば、変換ベースの残差復号も使用してデータストリーム１４からピクチャ１２’を予測的に復号するように構成された装置２０を示し、アポストロフィは、デコーダ２０によって再構成されたピクチャ１２’が、予測残差信号の量子化によって導入される符号化損失に関して、装置１０によって最初に符号化されたピクチャ１２から逸脱していることを示すために使用されている。図８及び図９は、変換ベースの予測残差符号化を例示的に使用するが、本出願の実施形態はこの種の予測残差符号化に限定されない。これは、以下に概説されるように、図８及び図９に関して説明される他の詳細についても同様である。 FIG. 8 shows a video encoder, an apparatus for predictively coding a picture 12 into a data stream 14, exemplarily using transform-based residual coding. The apparatus, or encoder, is indicated using reference sign 10. FIG. 9 shows a corresponding video decoder 20, e.g. an apparatus 20 configured to predictively decode the picture 12' from the data stream 14, also using transform-based residual decoding. The apostrophe is used to indicate that the picture 12' as reconstructed by decoder 20 deviates from the picture 12 originally encoded by apparatus 10 in terms of the coding loss introduced by quantization of the prediction residual signal. FIGS. 8 and 9 exemplarily use transform-based prediction residual coding, although embodiments of the present application are not restricted to this kind of prediction residual coding. The same holds for other details described with respect to FIGS. 8 and 9, as outlined below.

エンコーダ１０は、予測残差信号を空間スペクトル変換し、このようにして得られた予測残差信号をデータストリーム１４に符号化するように構成される。同様に、デコーダ２０は、データストリーム１４からの予測残差信号を復号し、このようにして得られた予測残差信号をスペクトル空間変換するように構成される。 Encoder 10 is configured to spatial-spectrum transform the prediction residual signal and encode the prediction residual signal thus obtained into data stream 14 . Similarly, decoder 20 is configured to decode the prediction residual signal from data stream 14 and spectral-space transform the prediction residual signal thus obtained.

内部的に、エンコーダ１０は、例えばピクチャ１２からの元の信号からの予測信号２６の偏差を測定するために予測残差２４を生成する予測残差信号形成器２２を備えることができる。予測残差信号形成器２２は、例えば、元の信号から、例えばピクチャ１２から予測信号を減算する減算器であってもよい。次いで、エンコーダ１０は、同じくエンコーダ１０に含まれる量子化器３２によって量子化されるスペクトル領域予測残差信号２４’を取得するために、予測残差信号２４を空間スペクトル変換する変換器２８を更に備える。このように量子化された予測残差信号２４’’は、ビットストリーム１４に符号化される。この目的のために、エンコーダ１０は、任意選択的に、データストリーム１４に変換及び量子化される予測残差信号をエントロピー符号化するエントロピーコーダ３４を備えることができる。予測信号２６は、データストリーム１４に符号化され、そこから復号可能な予測残差信号２４’’に基づいて、エンコーダ１０の予測段３６によって生成される。この目的のために、予測段３６は、図８に示すように、量子化損失以外の信号２４’に対応するスペクトル領域予測残差信号２４’’を得るように予測残差信号２４’’を逆量子化する逆量子化器３８と、量子化損失以外の元の予測残差信号２４に対応する予測残差信号２４’’’を得るために、後者の予測残差信号２４’’を逆変換、例えばスペクトル空間変換にかける逆変換器４０とを内部に備えることができる。次いで、予測段３６の結合器４２は、例えば元の信号１２の再構成などの再構成信号４６を取得するために、加算などによって予測信号２６及び予測残差信号２４’’’’を再結合する。再構成信号４６は、信号１２’に対応することができる。次に、予測段３６の予測モジュール４４は、例えば、空間予測、例えば、ピクチャ内予測、及び／又は時間予測、例えば、ピクチャ間予測を使用することによって、信号４６に基づいて予測信号２６を生成する。 Internally, encoder 10 may comprise a prediction residual signal former 22 which generates a prediction residual 24 so as to measure the deviation of a prediction signal 26 from the original signal, e.g. from picture 12. The prediction residual signal former 22 may, for instance, be a subtractor which subtracts the prediction signal from the original signal, e.g. from picture 12. Encoder 10 further comprises a transformer 28 which subjects the prediction residual signal 24 to a spatial-to-spectral transformation to obtain a spectral-domain prediction residual signal 24', which is then quantized by a quantizer 32 also comprised by encoder 10. The thus quantized prediction residual signal 24'' is coded into bitstream 14. To this end, encoder 10 may optionally comprise an entropy coder 34 which entropy codes the prediction residual signal, as transformed and quantized, into data stream 14. The prediction signal 26 is generated by a prediction stage 36 of encoder 10 on the basis of the prediction residual signal 24'' encoded into, and decodable from, data stream 14. To this end, the prediction stage 36 may internally comprise, as shown in FIG. 8, a dequantizer 38 which dequantizes the prediction residual signal 24'' so as to obtain a spectral-domain prediction residual signal 24'', which corresponds to signal 24' except for quantization loss, followed by an inverse transformer 40 which subjects the latter prediction residual signal 24'' to an inverse transformation, e.g. a spectral-to-spatial transformation, to obtain a prediction residual signal 24''', which corresponds to the original prediction residual signal 24 except for quantization loss. A combiner 42 of the prediction stage 36 then recombines, such as by addition, the prediction signal 26 and the prediction residual signal 24'''' so as to obtain a reconstructed signal 46, e.g. a reconstruction of the original signal 12. The reconstructed signal 46 may correspond to signal 12'. A prediction module 44 of the prediction stage 36 then generates the prediction signal 26 on the basis of signal 46 by using, for instance, spatial prediction, e.g. intra-picture prediction, and/or temporal prediction, e.g. inter-picture prediction.
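
The signal path through subtractor 22, transformer 28, quantizer 32, dequantizer 38, inverse transformer 40 and combiner 42 can be traced with a small one-dimensional Python sketch. It is an illustration under assumptions, not the codec of the description: the orthonormal floating-point DCT-II, the uniform quantizer with step `qstep`, and all function names are chosen here for brevity, and entropy coding is omitted.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a 1-D list (spatial-to-spectral transform)."""
    n = len(x)
    return [math.sqrt((1 if k == 0 else 2) / n)
            * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
            for k in range(n)]


def idct2(c):
    """Inverse of dct2 (spectral-to-spatial transform)."""
    n = len(c)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c[k]
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k in range(n))
            for i in range(n)]


def encode_block(original, prediction, qstep):
    residual = [o - p for o, p in zip(original, prediction)]  # former 22
    coeffs = dct2(residual)                                   # transformer 28
    return [round(c / qstep) for c in coeffs]                 # quantizer 32


def reconstruct_block(levels, prediction, qstep):
    coeffs = [lvl * qstep for lvl in levels]                  # dequantizer 38 / 52
    residual = idct2(coeffs)                                  # inverse transformer 40 / 54
    return [p + r for p, r in zip(prediction, residual)]      # combiner 42 / 56
```

Because `reconstruct_block` uses only the quantized levels and the prediction, the encoder's prediction stage and the decoder arrive at the same reconstruction, which differs from the original only by the quantization loss.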

同様に、デコーダ２０は、図９に示すように、予測段３６に対応する構成要素から内部的に構成され、予測段に対応する方法で相互接続されてもよい。特に、デコーダ２０のエントロピーデコーダ５０は、データストリームから量子化されたスペクトル領域予測残差信号２４’’をエントロピー復号することができ、その際、逆量子化器５２、逆変換器５４、結合器５６、及び予測モジュール５８は、予測段３６のモジュールに関して前述した方法で相互接続されて協働し、予測残差信号２４’’に基づいて再構成された信号を回復し、それにより、図９に示すように、結合器５６の出力は、再構成された信号、すなわちピクチャ１２’をもたらす。 Similarly, the decoder 20 may be internally constructed from components corresponding to the prediction stages 36 and interconnected in a manner corresponding to the prediction stages, as shown in FIG. In particular, entropy decoder 50 of decoder 20 is capable of entropy decoding quantized spectral-domain prediction residual signal 24'' from the data stream, with inverse quantizer 52, inverse transformer 54, combiner 56 and prediction module 58 are interconnected and cooperate in the manner previously described with respect to the modules of prediction stage 36 to recover a reconstructed signal based on prediction residual signal 24'', thereby providing , the output of combiner 56 provides the reconstructed signal, picture 12'.

上記では具体的に説明されていないが、エンコーダ１０は、例えば、符号化コストなどの幾つかのレート及び歪み関連基準を最適化する方法などの幾つかの最適化方式に従って、例えば、予測モード、動きパラメータなどを含む幾つかの符号化パラメータを設定することができることは容易に明らかである。例えば、エンコーダ１０及びデコーダ２０並びに対応するモジュール４４、５８はそれぞれ、イントラ符号化モード及びインター符号化モードなどの異なる予測モードをサポートすることができる。エンコーダ及びデコーダがこれらの予測モードタイプを切り替える粒度は、それぞれピクチャ１２及び１２’の符号化セグメント又は符号化ブロックへの細分化に対応し得る。これらの符号化セグメントの単位で、例えば、ピクチャは、イントラ符号化されているブロックとインター符号化されているブロックとに細分され得る。イントラ符号化ブロックは、以下により詳細に概説されるように、それぞれのブロックの空間的な既に符号化／復号された近傍に基づいて予測される。指向性イントラ符号化モード又は角度イントラ符号化モードを含む幾つかのイントラ符号化モードが存在し、これらのモードがそれぞれのイントラ符号化セグメントに関して選択されてもよく、それらのモードに従って、それぞれの指向性イントラ符号化モードに固有の特定の方向に沿った近傍のサンプル値をそれぞれのイントラ符号化セグメントに外挿することによってそれぞれのセグメントが満たされる。イントラ符号化モードは、例えば、それぞれのイントラ符号化されたブロックの予測がそれぞれのイントラ符号化されたセグメント内の全てのサンプルにＤＣ値を割り当てるＤＣ符号化モード、及び／又はそれぞれのブロックの予測が、隣接するサンプルに基づいて２次元線形関数によって定義された平面の駆動傾斜及びオフセットを有するそれぞれのイントラ符号化されたブロックのサンプル位置にわたる２次元線形関数によって記述されたサンプル値の空間分布であると近似又は決定される平面イントラ符号化モードなどの１つ以上の更なるモードも含むことができる。これと比較して、インター符号化されたブロックは、例えば時間的に予測され得る。インター符号化されたブロックの場合、データストリーム内で動きベクトルをシグナルすることができ、動きベクトルは、ピクチャ１２が属するビデオの以前に符号化されたピクチャの部分の空間変位を示し、以前に符号化／復号されたピクチャは、それぞれのインター符号化されたブロックの予測信号を取得するためにサンプリングされる。これは、量子化スペクトル領域予測残差信号２４’’を表すエントロピー符号化された変換係数レベルなど、データストリーム１４に含まれる残差信号符号化に加えて、データストリーム１４は、符号化モードを様々なブロックに割り当てるための符号化モードパラメータ、インター符号化されたセグメントの動きパラメータなど、ブロックの幾つかの予測パラメータ、及びピクチャ１２及び１２’のそれぞれのセグメントへの再分割を制御及びシグナルするためのパラメータなどのオプションの更なるパラメータを符号化してもよいことを意味する。デコーダ２０は、これらのパラメータを使用して、エンコーダが行ったのと同じ方法でピクチャを再分割し、セグメントに同じ予測モードを割り当て、同じ予測を実行して同じ予測信号をもたらす。 Although not specifically described above, it is readily apparent that encoder 10 may set several coding parameters, including, for instance, prediction modes, motion parameters and the like, according to some optimization scheme, such as one that optimizes some rate- and distortion-related criterion, e.g. a coding cost. For example, encoder 10 and decoder 20 and the corresponding modules 44 and 58, respectively, may support different prediction modes such as intra-coding modes and inter-coding modes. The granularity at which encoder and decoder switch between these prediction mode types may correspond to a subdivision of pictures 12 and 12', respectively, into coding segments or coding blocks. In units of these coding segments, for instance, a picture may be subdivided into blocks that are intra-coded and blocks that are inter-coded. Intra-coded blocks are predicted on the basis of a spatial, already coded/decoded neighborhood of the respective block, as outlined in more detail below. Several intra-coding modes may exist and be selected for a respective intra-coded segment, including directional or angular intra-coding modes according to which the respective segment is filled by extrapolating the sample values of the neighborhood into the respective intra-coded segment along a certain direction that is specific to the respective directional intra-coding mode. The intra-coding modes may, for instance, also comprise one or more further modes, such as a DC coding mode, according to which the prediction for the respective intra-coded block assigns a DC value to all samples within the respective intra-coded segment, and/or a planar intra-coding mode, according to which the prediction of the respective block is approximated or determined to be a spatial distribution of sample values described by a two-dimensional linear function over the sample positions of the respective intra-coded block, with the tilt and offset of the plane defined by the two-dimensional linear function being derived from the neighboring samples. Compared thereto, inter-coded blocks may be predicted, for instance, temporally. For inter-coded blocks, motion vectors may be signaled within the data stream, the motion vectors indicating the spatial displacement of the portion of a previously coded picture of the video to which picture 12 belongs, at which the previously coded/decoded picture is sampled in order to obtain the prediction signal for the respective inter-coded block. This means that, in addition to the residual signal coding comprised by data stream 14, such as the entropy-coded transform coefficient levels representing the quantized spectral-domain prediction residual signal 24'', data stream 14 may have encoded thereinto coding mode parameters for assigning the coding modes to the various blocks, prediction parameters for some of the blocks, such as motion parameters for inter-coded segments, and optional further parameters, such as parameters for controlling and signaling the subdivision of pictures 12 and 12', respectively, into the segments. Decoder 20 uses these parameters to subdivide the picture in the same manner as the encoder did, to assign the same prediction modes to the segments, and to perform the same prediction so as to result in the same prediction signal.
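
As an illustration of the intra-coding modes mentioned above, the following Python sketch fills a square block from already reconstructed neighboring samples in three ways: a DC mode and two directional modes that extrapolate purely horizontally or vertically. The function names, the rounding of the DC value and the restriction to two directions are assumptions of this sketch; the description does not fix these details.

```python
def predict_dc(left, top, size):
    """DC mode: every sample of the block receives the mean of the
    neighboring samples to the left of and above the block."""
    neighbours = list(left) + list(top)
    dc = round(sum(neighbours) / len(neighbours))
    return [[dc] * size for _ in range(size)]


def predict_horizontal(left, size):
    """Directional mode, horizontal: each row repeats its left neighbor."""
    return [[left[row]] * size for row in range(size)]


def predict_vertical(top, size):
    """Directional mode, vertical: each column repeats its top neighbor."""
    return [list(top[:size]) for _ in range(size)]
```

Since all three functions read only samples outside the block, a decoder holding the same reconstructed neighborhood obtains the same prediction once the selected mode has been signaled.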

図１０は、一方では再構成ピクチャ１２’などの再構成信号と、他方ではデータストリーム１４でシグナリングされる予測残差信号２４’’’と予測信号２６との組み合わせとの関係を示している。既に前述したように、組み合わせは加算であってもよい。図１０において、予測信号２６は、ピクチャ領域を、ハッチングを用いて例示的に示されるイントラ符号化ブロックと、ハッチングを用いずに例示的に示されるインター符号化ブロックとに細分化したものとして示されている。再分割は、正方形ブロック又は非正方形ブロックの行及び列へのピクチャエリアの規則的な再分割、又は４分木再分割などのような、ツリールートブロックから様々なサイズの複数のリーフブロックへのピクチャ１２の多分木再分割などの任意の再分割であってもよく、その混合は、図１０に示されており、ピクチャエリアは、ツリールートブロックの行及び列に最初に再分割され、次いで、再帰的多分木再分割に従って、１つ以上のリーフブロックに更に再分割される。 FIG. 10 shows the relationship between a reconstructed signal such as a reconstructed picture 12 ′ on the one hand and a combination of a prediction residual signal 24 ′″ and a prediction signal 26 signaled in the data stream 14 on the other hand. As already mentioned above, the combination may be additive. In FIG. 10, the prediction signal 26 is shown as subdividing the picture area into intra-coded blocks exemplarily shown with hatching and inter-coded blocks exemplarily shown without hatching. It is Subdivision may be a regular subdivision of the picture area into rows and columns of square or non-square blocks, or a tree root block into multiple leaf blocks of various sizes, such as a quadtree subdivision. It could be any subdivision, such as the multi-tree subdivision of picture 12, a mixture of which is shown in FIG. 10, where the picture area is first subdivided into rows and columns of tree root blocks, then , is further subdivided into one or more leaf blocks according to recursive multi-tree subdivision.
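
The mixed subdivision described for FIG. 10, first a regular grid of tree root blocks and then a recursive split of each root block, can be sketched in Python for the quadtree case. The callback `should_split` stands in for whatever decision the encoder makes and signals; it, the function names, and the assumption that the picture dimensions are multiples of `root_size` are hypothetical simplifications.

```python
def quadtree_leaves(x, y, size, min_size, should_split):
    """Recursively split a square block into four quadrants; return the
    leaf blocks as (x, y, size) tuples."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves += quadtree_leaves(x + dx, y + dy, half,
                                          min_size, should_split)
        return leaves
    return [(x, y, size)]


def subdivide_picture(width, height, root_size, min_size, should_split):
    """Split the picture area into rows and columns of tree root blocks,
    then subdivide each root block by quadtree splitting."""
    leaves = []
    for y in range(0, height, root_size):
        for x in range(0, width, root_size):
            leaves += quadtree_leaves(x, y, root_size, min_size, should_split)
    return leaves
```

For a 16×8 picture with 8×8 root blocks in which only the first root block is split once, the result is four 4×4 leaves followed by one 8×8 leaf.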

ここでも、データストリーム１４は、イントラ符号化されたブロック８０のために符号化されたイントラ符号化モードを有してもよく、これは、サポートされている幾つかのイントラ符号化モードのうちの１つを、それぞれのイントラ符号化されたブロック８０へ割り当てる。インター符号化されたブロック８２の場合、データストリーム１４には１つ以上の動きパラメータが符号化される。一般的に言えば、インター符号化されたブロック８２は、時間的に符号化されることに限定されない。或いは、インター符号化されたブロック８２は、ピクチャ１２が属するビデオの以前に符号化されたピクチャ、又はエンコーダ及びデコーダがそれぞれスケーラブルなエンコーダ及びデコーダである場合には、別のビュー又は階層的に下位のレイヤのピクチャなど、現在のピクチャ１２自体を超える以前に符号化された部分から予測された任意のブロックであってもよい。 Again, data stream 14 may have an intra-coding mode encoded for intra-coded block 80, which is one of several supported intra-coding modes. One is assigned to each intra-coded block 80 . For inter-coded blocks 82 , one or more motion parameters are encoded in data stream 14 . Generally speaking, inter-coded blocks 82 are not limited to being temporally coded. Alternatively, inter-coded block 82 may be a previously encoded picture of the video to which picture 12 belongs, or another view or hierarchically below if the encoder and decoder are respectively scalable encoders and decoders. It may be any block predicted from a previously coded portion beyond the current picture 12 itself, such as a picture of a layer of .

図１０の予測残差信号２４’’’’も、ピクチャ領域のブロック８４への細分化として示されている。これらのブロックは、符号化ブロック８０及び８２と区別するために、変換ブロックと呼ばれる場合がある。実際には、図１０は、エンコーダ１０及びデコーダ２０が、ピクチャ１２及びピクチャ１２’のブロックへの２つの異なる細分割、すなわち、符号化ブロック８０及び８２への一方の細分割、及び変換ブロック８４への他方の細分割を使用し得ることを示している。両方の細分化は同じであってもよく、例えば、各符号化ブロック８０及び８２は同時に変換ブロック８４を形成してもよいが、図１０は、例えば、変換ブロック８４への細分化が符号化ブロック８０、８２への細分化の拡張を形成し、その結果、ブロック８０及び８２の２つのブロック間の任意の境界が２つのブロック８４間の境界を覆い、或いは、各ブロック８０、８２は、変換ブロック８４のうちの一方と一致するか、又は変換ブロック８４のクラスタと一致する場合を示している。しかしながら、変換ブロック８４が代替的にブロック８０、８２間のブロック境界を横切ることができるように、これらの区画はまた、互いに独立して決定又は選択されてもよい。したがって、変換ブロック８４への細分化に関する限り、ブロック８０、８２への細分化に関して提示したのと同様の記述が当てはまり、例えば、ブロック８４は、（行及び列への配置の有無にかかわらず）ブロックへのピクチャエリアの規則的な細分化の結果、ピクチャエリアの再帰的マルチツリー細分化の結果、又はそれらの組み合わせ、又は任意の他の種類のブロック化であり得る。ちょうど傍らとして、ブロック８０、８２、及び８４は、二次、長方形、又は任意の他の形状に限定されないことに留意されたい。 The prediction residual signal 24'''' in FIG. 10 is also illustrated as a subdivision of the picture area into blocks 84. These blocks may be called transform blocks in order to distinguish them from the coding blocks 80 and 82. In effect, FIG. 10 illustrates that encoder 10 and decoder 20 may use two different subdivisions of picture 12 and picture 12', respectively, into blocks, namely one subdivision into coding blocks 80 and 82 and another subdivision into transform blocks 84. Both subdivisions may be the same, i.e. each coding block 80 and 82 may concurrently form a transform block 84, but FIG. 10 illustrates the case where, for instance, the subdivision into transform blocks 84 forms an extension of the subdivision into coding blocks 80, 82, so that any border between two blocks of blocks 80 and 82 overlays a border between two blocks 84, or, put differently, each block 80, 82 either coincides with one of the transform blocks 84 or coincides with a cluster of transform blocks 84. However, the subdivisions may also be determined or selected independently of each other, so that transform blocks 84 could alternatively cross block borders between blocks 80, 82. As far as the subdivision into transform blocks 84 is concerned, similar statements are thus true as those brought forward with respect to the subdivision into blocks 80, 82, i.e. the blocks 84 may be the result of a regular subdivision of the picture area into blocks (with or without arrangement into rows and columns), the result of a recursive multi-tree subdivision of the picture area, or a combination thereof, or any other sort of blockation. As an aside, it is noted that blocks 80, 82 and 84 are not restricted to being square, rectangular or any other shape.

図１０は更に、予測信号２６と予測残差信号２４’’’’の組み合わせが再構成信号１２’を直接もたらすことを示している。しかしながら、代替実施形態によれば、複数の予測信号２６を予測残差信号２４’’’と組み合わせてピクチャ１２’にすることができることに留意されたい。 FIG. 10 further shows that the combination of the prediction signal 26 and the prediction residual signal 24'''' directly yields the reconstructed signal 12'. However, it should be noted that according to an alternative embodiment, multiple prediction signals 26 can be combined with prediction residual signals 24''' into picture 12'.

図１０において、変換ブロック８４は、以下の意味を持つ。変換器２８及び逆変換器５４は、これらの変換ブロック８４単位で変換を行う。例えば、多くのコーデックは、全ての変換ブロック８４に対して何らかの種類のＤＳＴ又はＤＣＴを使用する。幾つかのコーデックは、変換ブロック８４の幾つかについて、予測残差信号が空間領域において直接符号化されるように、変換をスキップすることを可能にする。しかしながら、後述する実施形態によれば、エンコーダ１０及びデコーダ２０は、それらが幾つかの変換をサポートするように構成される。例えば、エンコーダ１０及びデコーダ２０によってサポートされる変換は、
ｏＤＣＴ－ＩＩ（又はＤＣＴ－ＩＩＩ）、ここで、ＤＣＴは離散コサイン変換を表す
ｏＤＳＴ－ＩＶ、ここで、ＤＳＴは離散サイン変換を表す
ｏＤＣＴ－ＩＶ
ｏＤＳＴ－ＶＩＩ
ｏアイデンティティ変換（ＩＴ）
を含むことができる。 In FIG. 10, the transform blocks 84 have the following significance. Transformer 28 and inverse transformer 54 perform their transformations in units of these transform blocks 84. For instance, many codecs use some sort of DST or DCT for all transform blocks 84. Some codecs allow the transformation to be skipped so that, for some of the transform blocks 84, the prediction residual signal is coded directly in the spatial domain. However, in accordance with the embodiments described below, encoder 10 and decoder 20 are configured in such a manner that they support several transforms. For example, the transforms supported by encoder 10 and decoder 20 may include:
o DCT-II (or DCT-III), where DCT stands for Discrete Cosine Transform
o DST-IV, where DST stands for Discrete Sine Transform
o DCT-IV
o DST-VII
o Identity Transformation (IT)

当然ながら、変換器２８はこれらの変換の順変換バージョンの全てをサポートするが、デコーダ２０又は逆変換器５４は以下の対応する逆方向又は逆バージョンをサポートする。
ｏ逆ＤＣＴ－ＩＩ（又は逆ＤＣＴ－ＩＩＩ）
ｏ逆ＤＳＴ－ＩＶ
ｏ逆ＤＣＴ－ＩＶ
ｏ逆ＤＳＴ－ＶＩＩ
ｏアイデンティティ変換（ＩＴ）。 Naturally, while transformer 28 would support all of the forward versions of these transforms, decoder 20 or inverse transformer 54 would support the corresponding backward or inverse versions thereof:
o Inverse DCT-II (or inverse DCT-III)
o Inverse DST-IV
o Inverse DCT-IV
o Inverse DST-VII
o Identity Transformation (IT)
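
The pairing of forward and inverse transforms listed above can be made concrete with a floating-point Python sketch. It builds orthonormal basis matrices for DCT-IV and DST-VII using their textbook definitions, plus the identity transform, and obtains each inverse as the matrix transpose. The normalization, the function names and the use of floating point are assumptions of this sketch; practical codecs typically use scaled integer approximations, which the description does not detail here.

```python
import math


def dct4_matrix(n):
    """Orthonormal DCT-IV basis; row k, column i."""
    return [[math.sqrt(2 / n)
             * math.cos(math.pi * (2 * k + 1) * (2 * i + 1) / (4 * n))
             for i in range(n)] for k in range(n)]


def dst7_matrix(n):
    """Orthonormal DST-VII basis; row k, column i."""
    return [[math.sqrt(4 / (2 * n + 1))
             * math.sin(math.pi * (2 * k + 1) * (i + 1) / (2 * n + 1))
             for i in range(n)] for k in range(n)]


def identity_matrix(n):
    """Identity transform: the residual is passed through unchanged."""
    return [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)]


def apply(matrix, x):
    """Multiply a transform matrix with a vector of samples or coefficients."""
    return [sum(row[i] * x[i] for i in range(len(x))) for row in matrix]


def transpose(matrix):
    """For an orthonormal basis, the transpose is the inverse transform."""
    return [list(col) for col in zip(*matrix)]
```

A decoder-side inverse transformer would then apply `transpose(m)` to the dequantized coefficients of a transform block for whichever matrix `m` the encoder selected.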

以下の説明は、どの変換がエンコーダ１０及びデコーダ２０によってサポートされ得るかについての更なる詳細を提供する。いずれの場合でも、サポートされる変換のセットは、１つのスペクトルから空間への変換又は空間からスペクトルへの変換などの１つの変換のみを含むことができることに留意されたい。 The subsequent description provides more details on which transforms could be supported by encoder 10 and decoder 20. In any case, it should be noted that the set of supported transforms may comprise merely one transform, such as one spectral-to-spatial or spatial-to-spectral transform.

As already outlined above, FIGS. 8 to 10 are presented as examples in which the inventive concepts described further below may be implemented to form specific examples of encoders and decoders according to the present application. To that extent, the encoder and decoder of FIGS. 8 and 9, respectively, may represent possible implementations of the encoders and decoders described hereinafter. FIGS. 8 and 9 are, however, only examples. An encoder according to embodiments of the present application may perform block-based encoding of a picture 12 using the concepts outlined in more detail below while differing from the encoder of FIG. 8, for instance, in that it is not a video encoder but a still picture encoder, in that it does not support inter prediction, or in that the subdivision into blocks 80 is performed in a manner different from that exemplified in FIG. 10. Likewise, a decoder according to embodiments of the present application may perform block-based decoding of a picture 12' from data stream 14 using the coding concepts outlined further below while differing from decoder 20 of FIG. 9, for instance, in that it is not a video decoder but a still picture decoder, in that it does not support intra prediction, in that it subdivides picture 12' into blocks in a manner different from that described with respect to FIG. 10, and/or in that it derives the prediction residual from data stream 14 not in the transform domain but, for instance, in the spatial domain.

FIG. 1 shows a video encoder 100 for encoding a video into a video data stream according to an embodiment. Video encoder 100 is configured to generate the video data stream such that it comprises an indication of whether a picture of the video that precedes a dependent random access picture shall be output.

FIG. 2 shows an apparatus 200 for receiving an input video data stream according to an embodiment. A video is encoded in the input video data stream. Apparatus 200 is configured to generate an output video data stream from the input video data stream.

FIG. 3 shows a video decoder 300 for receiving a video data stream in which a video is stored, according to an embodiment. Video decoder 300 is configured to decode the video from the video data stream. Video decoder 300 is configured to decode the video depending on an indication of whether a picture of the video that precedes a dependent random access picture shall be output.

Furthermore, a system according to an embodiment is provided. The system comprises the apparatus of FIG. 2 and the video decoder of FIG. 3. Video decoder 300 of FIG. 3 is configured to receive the output video data stream of apparatus 200 of FIG. 2 and to decode the video from that output video data stream.

In an embodiment, the system may further comprise video encoder 100 of FIG. 1. Apparatus 200 of FIG. 2 may then be configured to receive the video data stream from video encoder 100 of FIG. 1 as the input video data stream.

The (optional) intermediate device 210 of apparatus 200 may be configured to receive the video data stream from video encoder 100 as the input video data stream and to generate the output video data stream from it. For example, the intermediate device may be configured to modify (header/metadata) information of the input video data stream, and/or to delete pictures from the input video data stream, and/or to mix/splice the input video data stream with a further, second bitstream in which a second video is encoded.

The (optional) video decoder 221 may be configured to decode the video from the output video data stream.

The (optional) hypothetical reference decoder 222 may be configured to determine timing information of the video depending on the output video data stream, or to determine buffer information for a buffer in which the video or a portion of the video is to be stored.

The system comprises video encoder 101 of FIG. 1 and video decoder 151 of FIG. 2.

Video encoder 101 is configured to generate an encoded video signal. Video decoder 151 is configured to decode the encoded video signal in order to reconstruct the pictures of the video.

The first aspect of the invention is set out in aspects 1 to 38.

The second aspect of the invention is set out in aspects 39 to 78.

The third aspect of the invention is set out in aspects 79 to 108.

The fourth aspect of the invention is set out in aspects 109 to 134.

The fifth aspect of the invention is set out in aspects 135 to 188.

In the following, the first aspect of the invention is described in detail.

According to the first aspect of the invention, an apparatus 200 for receiving an input video data stream is provided. A video is encoded in the input video data stream. Apparatus 200 is configured to generate an output video data stream from the input video data stream. Furthermore, apparatus 200 determines whether a picture of the video that precedes a dependent random access picture shall be output.

According to an embodiment, apparatus 200 may be configured to determine a first variable (e.g., NoOutputBeforeDrapFlag) indicating whether the picture of the video that precedes the dependent random access picture shall be output.

In an embodiment, apparatus 200 may be configured to generate the output video data stream such that it comprises an indication of whether the picture of the video that precedes the dependent random access picture shall be output.

According to an embodiment, apparatus 200 may be configured to generate the output video data stream such that it comprises supplemental enhancement information containing an indication of whether the picture of the video that precedes the dependent random access picture shall be output.

In an embodiment, the picture of the video that precedes the dependent random access picture may be an independent random access picture. Apparatus 200 may be configured to generate the output video data stream such that it comprises a flag (e.g., ph_pic_output_flag) having a predefined value (e.g., 0) in the picture header of the independent random access picture, where the predefined value (e.g., 0) of the flag indicates, for the independent random access picture that directly precedes said dependent random access picture within the video data stream, that said independent random access picture shall not be output.

According to an embodiment, the flag may be a first flag, and apparatus 200 may be configured to generate the output video data stream such that it comprises a further flag in a picture parameter set of the video data stream, the further flag indicating whether the first flag (e.g., ph_pic_output_flag) is present in the picture header of the independent random access picture.

In an embodiment, apparatus 200 may be configured to generate the output video data stream such that it comprises, as the indication of whether the picture of the video that precedes the dependent random access picture shall be output, a supplemental enhancement information flag within supplemental enhancement information of the output video data stream, or a picture parameter set flag within a picture parameter set of the output video data stream, or a sequence parameter set flag within a sequence parameter set of the output video data stream, or an external means flag, where the value of the external means flag may be set by an external unit outside apparatus 200.

According to an embodiment, apparatus 200 may be configured to determine, depending on the first variable (e.g., NoOutputBeforeDrapFlag), the value of a second variable (e.g., PictureOutputFlag) for the picture of the video that precedes the dependent random access picture. The second variable indicates for said picture whether it shall be output, and apparatus 200 may be configured to output or not to output said picture depending on the second variable.

In an embodiment, the picture of the video that precedes the dependent random access picture may be an independent random access picture. The first variable (e.g., NoOutputBeforeDrapFlag) may indicate that the independent random access picture shall not be output.

According to an embodiment, the picture of the video that precedes the dependent random access picture may be an independent random access picture. Apparatus 200 may be configured to set the first variable (e.g., NoOutputBeforeDrapFlag) such that it indicates that the independent random access picture shall be output.

In an embodiment, apparatus 200 may be configured to signal to video decoder 300 whether the picture of the video that precedes the dependent random access picture shall be output.

According to an embodiment, the video data stream may comprise supplemental enhancement information containing an indication of whether a picture of the video that precedes the dependent random access picture shall be output.

In an embodiment, the picture of the video that precedes the dependent random access picture may be an independent random access picture. The video data stream may comprise a flag (e.g., ph_pic_output_flag) having a predefined value (e.g., 0) in the picture header of the independent random access picture, where the predefined value (e.g., 0) of the flag indicates, for the independent random access picture that directly precedes said dependent random access picture within the video data stream, that said independent random access picture shall not be output.

According to an embodiment, the flag may be a first flag, and the video data stream may comprise a further flag in a picture parameter set of the video data stream, the further flag indicating whether the first flag (e.g., ph_pic_output_flag) is present in the picture header of the independent random access picture.

In an embodiment, the video data stream may comprise, as the indication of whether the picture of the video that precedes the dependent random access picture shall be output, a supplemental enhancement information flag within supplemental enhancement information of the output video data stream, or a picture parameter set flag within a picture parameter set of the output video data stream, or a sequence parameter set flag within a sequence parameter set of the output video data stream, or an external means flag, where the value of the external means flag may be set by an external unit outside apparatus 200.

Moreover, a video encoder 100 is provided. Video encoder 100 may be configured to encode a video into a video data stream. Furthermore, video encoder 100 may be configured to generate the video data stream such that it comprises an indication of whether a picture of the video that precedes a dependent random access picture shall be output.

According to an embodiment, video encoder 100 may be configured to generate the video data stream such that it comprises supplemental enhancement information containing an indication of whether the picture of the video that precedes the dependent random access picture shall be output.

In an embodiment, the picture of the video that precedes the dependent random access picture may be an independent random access picture. Video encoder 100 may be configured to generate the video data stream such that it comprises a flag (e.g., ph_pic_output_flag) having a predefined value (e.g., 0) in the picture header of the independent random access picture, where the predefined value (e.g., 0) of the flag indicates, for the independent random access picture that directly precedes said dependent random access picture within the video data stream, that said independent random access picture shall not be output.

According to an embodiment, the flag may be a first flag, and video encoder 100 may be configured to generate the video data stream such that it comprises a further flag in a picture parameter set of the video data stream, the further flag indicating whether the first flag (e.g., ph_pic_output_flag) is present in the picture header of the independent random access picture.

In an embodiment, video encoder 100 may be configured to generate the video data stream such that it comprises, as the indication of whether the picture of the video that precedes the dependent random access picture shall be output, a supplemental enhancement information flag within supplemental enhancement information of the output video data stream, or a picture parameter set flag within a picture parameter set of the output video data stream, or a sequence parameter set flag within a sequence parameter set of the output video data stream, or an external means flag, where the value of the external means flag may be set by an external unit outside apparatus 200.

Furthermore, a video decoder 300 for receiving a video data stream in which a video is stored is provided. Video decoder 300 is configured to decode the video from the video data stream. Video decoder 300 is configured to decode the video depending on an indication of whether a picture of the video that precedes a dependent random access picture shall be output.

According to an embodiment, video decoder 300 may be configured to decode the video depending on a first variable (e.g., NoOutputBeforeDrapFlag) indicating whether the picture of the video that precedes the dependent random access picture shall be output.

In an embodiment, the video data stream may comprise the indication of whether the picture of the video that precedes the dependent random access picture shall be output. Video decoder 300 may be configured to decode the video depending on the indication within the video data stream.

According to an embodiment, the video data stream may comprise supplemental enhancement information containing the indication of whether the picture of the video that precedes the dependent random access picture shall be output. Video decoder 300 may be configured to decode the video depending on the supplemental enhancement information.

In an embodiment, the picture of the video that precedes the dependent random access picture may be an independent random access picture. The video data stream may comprise a flag (e.g., ph_pic_output_flag) having a predefined value (e.g., 0) in the picture header of the independent random access picture, where the predefined value (e.g., 0) of the flag indicates, for the independent random access picture that directly precedes said dependent random access picture within the video data stream, that said independent random access picture shall not be output. Video decoder 300 may be configured to decode the video depending on the flag.

According to an embodiment, the flag may be a first flag, and the video data stream may comprise a further flag in a picture parameter set of the video data stream, the further flag indicating whether the first flag (e.g., ph_pic_output_flag) is present in the picture header of the independent random access picture. Video decoder 300 may be configured to decode the video depending on the further flag.

In an embodiment, the video data stream may comprise, as the indication of whether the picture of the video that precedes the dependent random access picture shall be output, a supplemental enhancement information flag within supplemental enhancement information of the output video data stream, or a picture parameter set flag within a picture parameter set of the output video data stream, or a sequence parameter set flag within a sequence parameter set of the output video data stream, or an external means flag, where the value of the external means flag may be set by an external unit outside apparatus 200. Video decoder 300 may be configured to decode the video depending on the indication within the video data stream.

According to an embodiment, video decoder 300 may be configured to reconstruct the video from the video data stream. Video decoder 300 may be configured to output or not to output the picture of the video that precedes the dependent random access picture depending on the first variable (e.g., NoOutputBeforeDrapFlag).

In an embodiment, video decoder 300 may be configured to determine, depending on the first variable (e.g., NoOutputBeforeDrapFlag), the value of a second variable (e.g., PictureOutputFlag) for the picture of the video that precedes the dependent random access picture. The second variable indicates for said picture whether it shall be output, and apparatus 200 may be configured to output or not to output said picture depending on the second variable.

According to an embodiment, the picture of the video that precedes the dependent random access picture may be an independent random access picture. Video decoder 300 may be configured to decode the video depending on the first variable (e.g., NoOutputBeforeDrapFlag) indicating that the independent random access picture shall not be output.

In an embodiment, the picture of the video that precedes the dependent random access picture may be an independent random access picture. Video decoder 300 may be configured to decode the video depending on the first variable (e.g., NoOutputBeforeDrapFlag) indicating that the independent random access picture shall be output.

Furthermore, a system is provided. The system comprises an apparatus 200 as described above and a video decoder 300 as described above. Video decoder 300 is configured to receive the output video data stream of apparatus 200 and to decode the video from that output video data stream.

According to an embodiment, the system may further comprise a video encoder 100. Apparatus 200 may be configured to receive the video data stream from video encoder 100 as the input video data stream.

In particular, the first aspect of the invention relates to CVSs that start at a DRAP and to omitting the IDR output in decoding and conformance testing.

When a bitstream comprises pictures marked as DRAP (i.e., pictures that use only the preceding IRAP as a reference, both for the DRAP itself and from there on in the bitstream), these DRAP pictures can be used for random access functionality at a lower rate overhead. However, when some target DRAP is used to randomly access the stream, it is undesirable to display any initial picture before the target DRAP (i.e., the IRAP associated with the target DRAP) at the decoder output. The temporal distance between these pictures would lead to shaky, jittery playback when the video is played at its original rate, until playback becomes smooth from the target DRAP onward.

Therefore, it is desirable to omit the output of the pictures that precede the DRAP picture. This aspect of the invention presents means for controlling a decoder accordingly.
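
The size of the playback gap can be illustrated with a small calculation. The sketch below assumes hypothetical picture order counts and a constant frame rate, neither of which is taken from this description, and computes the longest interval between two consecutively output pictures when the stream is accessed at a DRAP.

```python
def longest_display_gap(output_pocs, frame_rate):
    """Return the longest interval, in seconds, between consecutively output pictures.

    output_pocs lists the picture order counts of the output pictures in output
    order; one POC step is assumed to correspond to one frame of the original video.
    """
    steps = [later - earlier for earlier, later in zip(output_pocs, output_pocs[1:])]
    return max(steps) / frame_rate


# Hypothetical stream: IRAP at POC 0, target DRAP at POC 48, regular pictures after it.
with_irap_output = [0, 48, 49, 50, 51]
without_irap_output = [48, 49, 50, 51]

gap_with = longest_display_gap(with_irap_output, frame_rate=50)
gap_without = longest_display_gap(without_irap_output, frame_rate=50)
```

Under these assumed numbers, outputting the IRAP would hold it on screen for almost a second before the DRAP appears, whereas suppressing it leaves only the regular one-frame interval.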

In one embodiment, an external means for setting the PicOutputFlag variable of an IRAP picture is made available for implementations to use as follows:
- If some external means not specified in this specification is available to set the variable NoOutputBeforeDrapFlag for the picture to a value, NoOutputBeforeDrapFlag for the picture is set equal to the value provided by the external means.
[...]
- The variable PictureOutputFlag of the current picture is derived as follows:
  - If sps_video_parameter_set_id is greater than 0 and the current layer is not an output layer (i.e., nuh_layer_id is not equal to OutputLayerIdInOls[TargetOlsIdx][i] for any value of i in the range of 0 to NumOutputLayersInOls[TargetOlsIdx] - 1, inclusive), or one of the following conditions is true, PictureOutputFlag is set equal to 0:
    - The current picture is a RASL picture and NoOutputBeforeRecoveryFlag of the associated IRAP picture is equal to 1.
    - The current picture is a GDR picture with NoOutputBeforeRecoveryFlag equal to 1, or is a recovering picture of a GDR picture with NoOutputBeforeRecoveryFlag equal to 1.
    - The current picture is an IRAP picture with NoOutputBeforeDrapFlag equal to 1.
  - Otherwise, PictureOutputFlag is set equal to ph_pic_output_flag.
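
The derivation above can be sketched as a single function. The following Python code is an illustration only: the function name and the boolean parameters are hypothetical stand-ins for decoder state (for example, `in_output_layer` summarizes the comparison of nuh_layer_id against OutputLayerIdInOls), and `no_output_before_recovery_flag` is assumed to already hold the value of the relevant associated IRAP or GDR picture.

```python
def derive_picture_output_flag(
    *,
    ph_pic_output_flag,
    sps_video_parameter_set_id=0,
    in_output_layer=True,
    is_rasl=False,
    is_gdr_or_recovering=False,
    is_irap=False,
    no_output_before_recovery_flag=0,
    no_output_before_drap_flag=0,
):
    """Return PictureOutputFlag (0 or 1) for the current picture."""
    # Pictures of a non-output layer are never output in a multi-layer stream.
    if sps_video_parameter_set_id > 0 and not in_output_layer:
        return 0
    # RASL picture whose associated IRAP picture suppresses output before recovery.
    if is_rasl and no_output_before_recovery_flag == 1:
        return 0
    # GDR picture, or recovering picture of a GDR picture, before recovery.
    if is_gdr_or_recovering and no_output_before_recovery_flag == 1:
        return 0
    # Condition added by this aspect: IRAP picture preceding the DRAP used for access.
    if is_irap and no_output_before_drap_flag == 1:
        return 0
    # Otherwise the picture header decides.
    return ph_pic_output_flag


# An IRAP picture whose output is suppressed by external means.
suppressed = derive_picture_output_flag(
    ph_pic_output_flag=1, is_irap=True, no_output_before_drap_flag=1
)
# The same picture during regular playback.
regular = derive_picture_output_flag(ph_pic_output_flag=1, is_irap=True)
```

Keeping the DRAP-related test as its own branch mirrors the list structure of the derivation: it is one more condition that forces the flag to 0 before the picture header value is consulted.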

In another embodiment, NoOutputBeforeDrapFlag is set by external means only for the first IRAP picture in a CVS and is set to 0 otherwise.
- If some external means not specified in this specification is available to set the variable NoOutputBeforeDrapFlag for the picture to a value, NoOutputBeforeDrapFlag for the first picture in the CVS is set equal to the value provided by the external means. Otherwise, NoOutputBeforeDrapFlag is set to 0.
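
A minimal sketch of this rule follows, assuming that the absence of an external means is modelled by `None`. The function and parameter names are hypothetical and not part of any specification.

```python
def set_no_output_before_drap_flag(is_first_picture_in_cvs, external_value=None):
    """Return NoOutputBeforeDrapFlag for a picture.

    external_value is the value supplied by the external means, or None when no
    external means is available (a modelling assumption of this sketch).
    """
    if is_first_picture_in_cvs and external_value is not None:
        return external_value
    return 0


first_picture_flag = set_no_output_before_drap_flag(True, external_value=1)
later_picture_flag = set_no_output_before_drap_flag(False, external_value=1)
```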

The flag NoOutputBeforeDrapFlag described above can also be tied to the use of alternative HRD timings conveyed in the bitstream for the case in which the pictures between the IRAP picture and the DRAP picture are removed, e.g., the flag UseAltCpbParamsFlag in the VVC specification.

In an alternative embodiment, it is a constraint that an IRAP picture that directly precedes a DRAP picture, with no non-DRAP picture in between, has a value of 0 in the output flag ph_pic_output_flag in the picture header. In this case, whenever an extractor or player uses the DRAP for random access, i.e., removes the intermediate pictures between the IRAP and the DRAP from the bitstream, it is also required to verify or adjust that the respective output flag is set to 0 so that the output of the IRAP is omitted.

To keep this operation simple, the original bitstream needs to be prepared accordingly. More specifically, pps_output_flag_present_flag, which determines the presence of the flag ph_pic_output_flag in the picture header, shall be equal to 1, so that the picture header can be changed easily without also having to change parameter sets. That is:
It is a requirement of bitstream conformance that the value of pps_output_flag_present_flag shall be equal to 1 if the PPS is referred to by a picture in a CVSS AU that has an associated DRAP AU.

In addition to the above options, in another embodiment a parameter set (PPS or SPS) indicates whether the first AU in the bitstream, i.e., the CRA or IDR that constitutes a CLVS start, is to be output after decoding. System integration is thus simpler, because only parameter sets need to be adjusted, e.g., when parsing a file in the ISOBMFF file format, instead of also requiring comparatively low-level syntax such as the PH to be changed.

An example is shown below.

sps_pic_in_cvss_au_no_output_flag equal to 1 specifies that pictures in a CVSS AU that refer to the SPS are not output. sps_pic_in_cvss_au_no_output_flag equal to 0 specifies that pictures in a CVSS AU that refer to the SPS may or may not be output.

It is a requirement of bitstream conformance that the value of sps_pic_in_cvss_au_no_output_flag shall be the same for any SPS referred to by any output layer in an OLS.

In 8.1.2:
- The variable PictureOutputFlag of the current picture is derived as follows:
- If sps_video_parameter_set_id is greater than 0 and the current layer is not an output layer (i.e., nuh_layer_id is not equal to OutputLayerIdInOls[TargetOlsIdx][i] for any value of i in the range of 0 to NumOutputLayersInOls[TargetOlsIdx] - 1, inclusive), or if one of the following conditions is true, PictureOutputFlag is set equal to 0:
- The current picture is a RASL picture and NoOutputBeforeRecoveryFlag of the associated IRAP picture is equal to 1.
- The current picture is a GDR picture with NoOutputBeforeRecoveryFlag equal to 1 or is a recovering picture of a GDR picture with NoOutputBeforeRecoveryFlag equal to 1.
- Otherwise, if the current AU is a CVSS AU and sps_pic_in_cvss_au_no_output_flag is equal to 1, PictureOutputFlag is set equal to 0.
- Otherwise, PictureOutputFlag is set equal to ph_pic_output_flag.
Note: In one implementation, the decoder may output a picture that does not belong to an output layer. For example, when there is only one output layer and the picture of the output layer is not available in an AU, e.g., due to a loss or layer down-switching, the decoder may set PictureOutputFlag equal to 1 for the picture that has the highest value of nuh_layer_id among all pictures of the AU available to the decoder and that has ph_pic_output_flag equal to 1, and may set PictureOutputFlag equal to 0 for all other pictures of the AU available to the decoder.
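The derivation of PictureOutputFlag above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the Picture class, its field names, and the function signature are hypothetical and merely mirror the variables named in the text; the optional behavior described in the note is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Picture:
    # Hypothetical container; the fields mirror variables named in the text.
    nuh_layer_id: int
    ph_pic_output_flag: int
    is_rasl: bool = False
    is_gdr_or_recovering: bool = False
    # NoOutputBeforeRecoveryFlag of the associated IRAP or GDR picture.
    no_output_before_recovery_flag: int = 0
    in_cvss_au: bool = False


def picture_output_flag(pic, sps_video_parameter_set_id, output_layer_ids,
                        sps_pic_in_cvss_au_no_output_flag):
    """Return PictureOutputFlag following the order of checks in the text."""
    not_output_layer = (sps_video_parameter_set_id > 0
                        and pic.nuh_layer_id not in output_layer_ids)
    rasl_case = pic.is_rasl and pic.no_output_before_recovery_flag == 1
    gdr_case = (pic.is_gdr_or_recovering
                and pic.no_output_before_recovery_flag == 1)
    if not_output_layer or rasl_case or gdr_case:
        return 0
    # Added check of this embodiment: suppress output for CVSS AU pictures.
    if pic.in_cvss_au and sps_pic_in_cvss_au_no_output_flag == 1:
        return 0
    return pic.ph_pic_output_flag
```

In this sketch, a picture in a CVSS AU whose SPS carries sps_pic_in_cvss_au_no_output_flag equal to 1 is never output, even if its picture header requests output, which is the property that lets a system layer control the output through the parameter set alone.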

In another embodiment, the requirement may, e.g., be defined as follows:
It is a requirement of bitstream conformance that the value of ph_pic_output_flag shall be equal to 0 if the picture belongs to an IRAP AU and the IRAP AU immediately precedes a DRAP AU.
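A conformance checker for the requirement above could look roughly like the following sketch. The representation of an AU as a pair of a type label and a list of ph_pic_output_flag values is an assumption made purely for illustration, as are the labels "IRAP", "DRAP" and "OTHER".

```python
def irap_before_drap_constraint_ok(access_units):
    """Check that every IRAP AU directly followed by a DRAP AU has
    ph_pic_output_flag equal to 0 for all of its pictures.

    access_units: list of (au_type, [ph_pic_output_flag, ...]) tuples in
    decoding order, with au_type one of "IRAP", "DRAP", "OTHER"
    (hypothetical labels).
    """
    for (prev_type, prev_flags), (cur_type, _) in zip(access_units,
                                                      access_units[1:]):
        if prev_type == "IRAP" and cur_type == "DRAP":
            if any(flag != 0 for flag in prev_flags):
                return False
    return True
```

An extractor that removes the pictures between an IRAP AU and a DRAP AU could run such a check on its output to confirm that the IRAP pictures are not output.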

The second aspect of the invention is described in detail below.

According to the second aspect of the invention, an apparatus 200 for receiving one or more input video data streams is provided. An input video is encoded in each of the one or more input video data streams. The apparatus 200 is configured to generate an output video data stream from the one or more input video data streams, the output video data stream encoding an output video, wherein the apparatus is configured to generate the output video data stream such that the output video is the input video encoded in one of the one or more input video data streams, or such that the output video depends on the input video of at least one of the one or more input video data streams. Furthermore, the apparatus 200 is configured to determine an access unit removal time, from a coded picture buffer, of a current picture of a plurality of pictures of the output video. The apparatus 200 is configured to determine whether or not to use coded picture buffer delay offset information for determining the access unit removal time of the current picture from the coded picture buffer.

According to an embodiment, the apparatus 200 may, e.g., be configured to drop a group of one or more pictures of the input video of a first video data stream of the one or more input video data streams in order to generate the output video data stream. The apparatus 200 may, e.g., be configured to determine the access unit removal time, from the coded picture buffer, for at least one of the plurality of pictures of the output video depending on the coded picture buffer delay offset information.

In an embodiment, the first video received by the apparatus 200 may, e.g., be a preprocessed video that results from an original video from which a group of one or more pictures has been dropped to generate the preprocessed video. The apparatus 200 may, e.g., be configured to determine the access unit removal time, from the coded picture buffer, for at least one of the plurality of pictures of the output video depending on the coded picture buffer delay offset information.

According to an embodiment, the buffer delay offset information depends on the number of pictures of the input video that have been dropped.

In an embodiment, the one or more input video data streams are two or more input video data streams. The apparatus 200 may, e.g., be configured to splice the processed video and the input video of a second video data stream of the two or more input video data streams to obtain the output video, and may, e.g., be configured to encode the output video into the output video data stream.

According to an embodiment, the apparatus 200 may, e.g., be configured to determine whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of the current picture depending on the position of the current picture within the output video. Alternatively, the apparatus 200 may, e.g., be configured to determine whether or not to set a coded picture buffer delay offset value of the coded picture buffer delay offset information to 0 for determining the access unit removal time of the current picture depending on the position of the current picture within the output video.

In an embodiment, the apparatus 200 may, e.g., be configured to determine whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of the current picture depending on the position of a previous non-discardable picture that precedes the current picture within the output video.

According to an embodiment, the apparatus 200 may, e.g., be configured to determine whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of the current picture depending on whether or not the previous non-discardable picture that precedes the current picture within the output video is the first picture in a previous buffering period.

In an embodiment, the apparatus 200 may, e.g., be configured to determine whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of the current picture depending on a concatenation flag, the current picture being the first picture of the input video of the second video data stream.

According to an embodiment, the apparatus 200 may, e.g., be configured to determine the access unit removal time of the current picture depending on the removal time of a preceding picture.

In an embodiment, the apparatus 200 may, e.g., be configured to determine the access unit removal time of the current picture depending on initial coded picture buffer removal delay information.

According to an embodiment, the apparatus 200 may, e.g., be configured to update the initial coded picture buffer removal delay information depending on a clock tick to obtain temporary coded picture buffer removal delay information for determining the access unit removal time of the current picture.

According to an embodiment, if the concatenation flag is set to a first value, the apparatus 200 is configured to use the coded picture buffer delay offset information for determining one or more removal times. If the concatenation flag is set to a second value that is different from the first value, the apparatus 200 is configured not to use the coded picture buffer delay offset information for determining the one or more removal times.

In an embodiment, the apparatus 200 may, e.g., be configured to signal to the video decoder 300 whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of the current picture from the coded picture buffer.

According to an embodiment, the current picture may, e.g., be located at a splicing point of the output video at which two input videos have been spliced.

According to an embodiment, the video data stream may, e.g., comprise the concatenation flag.

In an embodiment, the video data stream may, e.g., comprise the initial coded picture buffer removal delay information.

According to an embodiment, if the concatenation flag is set to a first value (e.g., 0), the concatenation flag indicates that the coded picture buffer delay offset information needs to be used for determining one or more (picture or access unit) removal times, e.g., when it is known that some pictures (e.g., RASL pictures) have been dropped. If the concatenation flag is set to a second value (e.g., 1) that is different from the first value, the concatenation flag indicates that the indicated offset is not used for determining the one or more (picture or access unit) removal times, e.g., irrespective of the offset signaling and, e.g., irrespective of whether RASL pictures have been dropped. If no pictures are dropped, the offset is, e.g., not used.
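The rule in the preceding paragraph amounts to a small decision function. The following Python sketch is one possible reading of it, assuming the example values given in the text (0 as the first value, 1 as the second value); the function and parameter names are hypothetical.

```python
def effective_cpb_delay_offset(concatenation_flag, pictures_dropped,
                               cpb_delay_offset):
    """Return the offset that enters the removal time computation.

    Assumes the example encoding from the text: flag 0 means the offset is
    used when pictures are known to have been dropped, flag 1 means the
    offset is never used.
    """
    if concatenation_flag == 1:
        # Second value: offset ignored regardless of signaling or drops.
        return 0
    # First value: offset applies only if pictures were actually dropped.
    return cpb_delay_offset if pictures_dropped else 0
```

Returning 0 rather than a separate "unused" marker matches the wording elsewhere in the description, where the offset is either not used or considered to be equal to 0.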

Furthermore, a video encoder 100 is provided. The video encoder 100 is configured to encode a video into a video data stream. The video encoder 100 is configured to generate the video data stream such that the video data stream comprises coded picture buffer delay offset information.

According to an embodiment, the video encoder 100 may, e.g., be configured to generate the video data stream such that the video data stream comprises a concatenation flag.

In an embodiment, the video encoder 100 may, e.g., be configured to generate the video data stream such that the video data stream comprises coded picture buffer delay offset information.

Moreover, a video decoder 300 for receiving a video data stream having a video stored therein is provided. The video decoder 300 is configured to decode the video from the video data stream. Furthermore, the video decoder 300 is configured to decode the video depending on an access unit removal time, from a coded picture buffer, of a current picture of a plurality of pictures of the video. The video decoder 300 is configured to decode the video depending on an indication that indicates whether or not to use coded picture buffer delay offset information for determining the access unit removal time of the current picture from the coded picture buffer.

According to an embodiment, the access unit removal time, from the coded picture buffer, for at least one of the plurality of pictures of the video depends on the coded picture buffer delay offset information.

In an embodiment, the video decoder 300 is configured to decode the video depending on whether or not the coded picture buffer delay offset information is used for determining the access unit removal time of the current picture depending on the position of the current picture within the video.

According to an embodiment, the video decoder 300 may, e.g., be configured to decode the video depending on whether or not a coded picture buffer delay offset value of the coded picture buffer delay offset information is set to 0.

In an embodiment, the video decoder 300 may, e.g., be configured to determine whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of the current picture depending on the position of a previous non-discardable picture that precedes the current picture within the video.

According to an embodiment, the video decoder 300 may, e.g., be configured to determine whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of the current picture depending on whether or not the previous non-discardable picture that precedes the current picture within the video is the first picture in a previous buffering period.

In an embodiment, the video decoder 300 may, e.g., be configured to determine whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of the current picture depending on a concatenation flag, the current picture being the first picture of the input video of the second video data stream.

According to an embodiment, the video decoder 300 may, e.g., be configured to determine the access unit removal time of the current picture depending on the removal time of a preceding picture.

In an embodiment, the video decoder 300 may, e.g., be configured to determine the access unit removal time of the current picture depending on initial coded picture buffer removal delay information.

According to an embodiment, the video decoder 300 may, e.g., be configured to update the initial coded picture buffer removal delay information depending on a clock tick to obtain temporary coded picture buffer removal delay information for determining the access unit removal time of the current picture.

According to an embodiment, if the concatenation flag is set to a first value, the video decoder 300 is configured to use the coded picture buffer delay offset information for determining one or more removal times. If the concatenation flag is set to a second value that is different from the first value, the video decoder 300 is configured not to use the coded picture buffer delay offset information for determining the one or more removal times.

In particular, the second aspect of the invention provides that, for an AU with concatenation_flag == 1, CpbDelayOffset should be temporarily set to 0, because in the case of alternative timing prevNonDiscardable may already include the alternative offset (CpbDelayOffset) (when it is not a BP start).

When two bitstreams are spliced, the removal time of an AU from the CPB is derived differently than for a non-spliced bitstream. At the splicing point, the buffering period SEI message (BP SEI message; SEI = supplemental enhancement information) contains a concatenationFlag equal to 1. The decoder then needs to check two values and take the larger of the two:
- the removal time of the previous non-discardable Pic (prevNonDiscardablePic) plus a delta signaled in the BP SEI message (auCpbRemovalDelayDeltaMinus1 + 1), or
- the removal time of the preceding Pic plus InitCpbRemovalDelay.

However, if the previous Pic with a BP SEI message was an AU for which alternative timing was used for the derivation of the removal time (i.e., the second timing information, which is used when RASL pictures or the pictures up to a DRAP are dropped), an offset (CpbDelayOffset) is used in computing each removal time, which is calculated as a delta relative to the previous Pic with a buffering period, i.e., AuNominalRemovalTime[firstPicInPrevBuffPeriod] + AuCpbRemovalDelayVal - CpbDelayOffset, as shown in FIG. 4.

FIG. 4 shows the original bitstream (top of FIG. 4) and the bitstream after pictures have been dropped (bottom of FIG. 4). The offset is incorporated into the computation of the removal delay after AUs (lines 1, 2, and 3 in the original bitstream) have been dropped.

The offset is added because the removal time is computed using a delta to the removal time of the picture referred to as firstPicInPrevBuffPeriod, after which some AUs have been dropped; CpbDelayOffset is therefore needed to account (compensate) for the dropped AUs.

FIG. 5 shows the splicing of two bitstreams (at different positions), namely a first bitstream (FIG. 5, middle, left) and a second bitstream (FIG. 5, middle, right), after pictures have been dropped from the original first bitstream (FIG. 5, middle, left).

The example that uses the removal time of the previous Pic as an anchor, instead of the previous non-discardable picture, is similar and does not require the "-3" correction factor (CpbDelayOffset) either.

However, in the splicing case shown in FIG. 5, the two derivations do not necessarily use the removal time of the AU associated with a BP SEI message (firstPicInPrevBuffPeriod). As discussed, in the splicing case the delta is added to the removal time of either prevNonDiscardablePic or the immediately preceding Pic. This means that, when prevNonDiscardablePic is not firstPicInPrevBuffPeriod, CpbDelayOffset cannot be used for deriving the removal time of the current AU from the CPB, because the removal time of prevNonDiscardablePic already accounts for the dropped AUs and no AUs are dropped between prevNonDiscardablePic and the AU for which the removal time is computed. Now assume that the removal time of the preceding Pic is used instead, as in the case where the current AU (i.e., the splicing point with a new BP SEI message) has an InitialCpbRemovalDelay that forces the removal time of the current AU to be later than the desired removal time, i.e., the time that would achieve equidistant removal times (if prevNonDiscardablePic were used instead). In such a case, the removal time of the current AU cannot be smaller than the time computed using the removal time of the preceding Pic plus InitCpbRemovalDelay, since this could lead to a buffer underrun (the AU is not in the buffer by the time it needs to be removed). Therefore, as part of the invention, CpbDelayOffset is not used in the computation, or is considered to be equal to 0, in this case.

In summary, an embodiment herein is to use CpbDelayOffset for the computation of AU removal times when RASL AUs are dropped from the bitstream, or when AUs between an IRAP AU and a DRAP AU are dropped, depending on a check. The check for determining whether CpbDelayOffset is not used, or is considered to be equal to 0, is one of the following:
- prevNonDiscardablePic is not firstPicInPrevBuffPeriod;
- the removal time of the preceding Pic plus InitCpbRemovalDelay is used for computing the removal time of the current AU.

An implementation of the embodiments herein may be as follows.
- If AU n is the first AU of a BP that does not initialize the HRD, the following applies:
The nominal removal time of AU n from the CPB is specified by:
if (!concatenationFlag) {
    baseTime = AuNominalRemovalTime[firstPicInPrevBuffPeriod]
    tmpCpbRemovalDelay = AuCpbRemovalDelayVal
    tmpCpbDelayOffset = CpbDelayOffset
} else {
    baseTime1 = AuNominalRemovalTime[prevNonDiscardablePic]
    tmpCpbRemovalDelay1 = (auCpbRemovalDelayDeltaMinus1 + 1)
    baseTime2 = AuNominalRemovalTime[n - 1]
    tmpCpbRemovalDelay2 = Ceil((InitCpbRemovalDelay[Htid][ScIdx] ÷ 90000 +
        AuFinalArrivalTime[n - 1] - AuNominalRemovalTime[n - 1]) ÷ ClockTick)    (C.10)
    if (baseTime1 + ClockTick * tmpCpbRemovalDelay1 <
            baseTime2 + ClockTick * tmpCpbRemovalDelay2) {
        baseTime = baseTime2
        tmpCpbRemovalDelay = tmpCpbRemovalDelay2
        tmpCpbDelayOffset = 0
    } else {
        baseTime = baseTime1
        tmpCpbRemovalDelay = tmpCpbRemovalDelay1
        tmpCpbDelayOffset = ((prevNonDiscardablePic == firstPicInPrevBuffPeriod) ? CpbDelayOffset : 0)
    }
}
AuNominalRemovalTime[n] = baseTime + (ClockTick * tmpCpbRemovalDelay - tmpCpbDelayOffset)
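For experimentation, the pseudocode above can be transliterated into runnable Python roughly as follows. This is a sketch under stated assumptions, not a normative implementation: all parameter names are hypothetical stand-ins for the specification variables noted in the comments, times are plain floating-point seconds, and the final expression is copied from the pseudocode as written, including the way the offset is subtracted.

```python
import math


def au_nominal_removal_time(*, concatenation_flag, t_first_in_prev_bp,
                            t_prev_non_discardable, t_prev_au,
                            arrival_prev_au, removal_delay_val, delta_minus1,
                            init_cpb_removal_delay, cpb_delay_offset,
                            clock_tick,
                            prev_non_discardable_is_first_in_prev_bp):
    """Nominal CPB removal time of AU n (first AU of a BP, HRD not initialized).

    t_first_in_prev_bp      ~ AuNominalRemovalTime[firstPicInPrevBuffPeriod]
    t_prev_non_discardable  ~ AuNominalRemovalTime[prevNonDiscardablePic]
    t_prev_au               ~ AuNominalRemovalTime[n - 1]
    arrival_prev_au         ~ AuFinalArrivalTime[n - 1]
    init_cpb_removal_delay  ~ InitCpbRemovalDelay[Htid][ScIdx] (90 kHz units)
    """
    if not concatenation_flag:
        base_time = t_first_in_prev_bp
        delay = removal_delay_val
        offset = cpb_delay_offset
    else:
        base1 = t_prev_non_discardable
        delay1 = delta_minus1 + 1
        base2 = t_prev_au
        delay2 = math.ceil((init_cpb_removal_delay / 90000
                            + arrival_prev_au - t_prev_au) / clock_tick)
        if base1 + clock_tick * delay1 < base2 + clock_tick * delay2:
            # Anchor on the preceding AU: the offset is never applied here.
            base_time, delay, offset = base2, delay2, 0
        else:
            # Anchor on prevNonDiscardablePic: apply the offset only if that
            # picture is the first picture of the previous buffering period.
            base_time, delay = base1, delay1
            offset = (cpb_delay_offset
                      if prev_non_discardable_is_first_in_prev_bp else 0)
    return base_time + (clock_tick * delay - offset)
```

Passing the identity test prevNonDiscardablePic == firstPicInPrevBuffPeriod as a boolean keeps the sketch independent of how pictures are indexed; a real decoder would derive it from its own picture bookkeeping.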

Alternatively, in another embodiment illustrated in FIG. 6, CpbDelayOffset is used for computing AU removal times when RASL AUs are dropped from the bitstream, or when AUs between an IRAP AU and a DRAP AU are dropped, depending on a different check, which includes checking the concatenationFlag.

In that case, the delta in the bitstream when concatenationFlag is set to 1 needs to match the proper value as if CpbDelayOffset had been taken into account (as is evident from comparing FIG. 5 and FIG. 6), since for that picture CpbDelayOffset is not applied or is considered to be 0.

Implementations herein may be as follows.
– If AU n is the first AU of a BP that does not initialize the HRD, the following applies.
The nominal removal time of AU n from the CPB is specified by:
if( !concatenationFlag ) {
    baseTime = AuNominalRemovalTime[ firstPicInPrevBuffPeriod ]
    tmpCpbRemovalDelay = AuCpbRemovalDelayVal
    tmpCpbDelayOffset = CpbDelayOffset
} else {
    baseTime1 = AuNominalRemovalTime[ prevNonDiscardablePic ]
    tmpCpbRemovalDelay1 = ( auCpbRemovalDelayDeltaMinus1 + 1 )
    baseTime2 = AuNominalRemovalTime[ n - 1 ]
    tmpCpbRemovalDelay2 = Ceil( ( InitCpbRemovalDelay[ Htid ][ ScIdx ] ÷ 90000 +
            AuFinalArrivalTime[ n - 1 ] - AuNominalRemovalTime[ n - 1 ] ) ÷ ClockTick )
    if( baseTime1 + ClockTick * tmpCpbRemovalDelay1 <
            baseTime2 + ClockTick * tmpCpbRemovalDelay2 ) {
        baseTime = baseTime2
        tmpCpbRemovalDelay = tmpCpbRemovalDelay2
    } else {
        baseTime = baseTime1
        tmpCpbRemovalDelay = tmpCpbRemovalDelay1
    }
    tmpCpbDelayOffset = 0
}
AuNominalRemovalTime[ n ] = baseTime + ( ClockTick * tmpCpbRemovalDelay - tmpCpbDelayOffset )    (C.10)
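The derivation above can be sketched in Python as follows. This is only an illustrative sketch, not a normative implementation: the container `RemovalInputs`, its field names, and the function `nominal_removal_time` are hypothetical, and the array lookups of the pseudocode (e.g. AuNominalRemovalTime[n-1]) are assumed to have been resolved to plain numbers by the caller. Times are assumed to be in seconds and InitCpbRemovalDelay in units of a 90 kHz clock, as the division by 90000 suggests.

```python
import math
from dataclasses import dataclass


@dataclass
class RemovalInputs:
    """Hypothetical bundle of the quantities used by the pseudocode."""
    clock_tick: float                    # ClockTick
    concatenation_flag: bool             # concatenationFlag
    nominal_first_prev_bp: float         # AuNominalRemovalTime[firstPicInPrevBuffPeriod]
    au_cpb_removal_delay_val: int        # AuCpbRemovalDelayVal
    cpb_delay_offset: float              # CpbDelayOffset
    nominal_prev_non_discardable: float  # AuNominalRemovalTime[prevNonDiscardablePic]
    delta_minus1: int                    # auCpbRemovalDelayDeltaMinus1
    nominal_prev_au: float               # AuNominalRemovalTime[n-1]
    final_arrival_prev_au: float         # AuFinalArrivalTime[n-1]
    init_cpb_removal_delay: int          # InitCpbRemovalDelay[Htid][ScIdx]


def nominal_removal_time(x: RemovalInputs) -> float:
    """Return AuNominalRemovalTime[n] for the first AU of a BP that does not
    initialize the HRD, following the two branches of the pseudocode."""
    if not x.concatenation_flag:
        base_time = x.nominal_first_prev_bp
        delay = x.au_cpb_removal_delay_val
        offset = x.cpb_delay_offset
    else:
        # Candidate 1: delta relative to the previous non-discardable picture.
        base1 = x.nominal_prev_non_discardable
        delay1 = x.delta_minus1 + 1
        # Candidate 2: derived from the initial removal delay and the arrival
        # of the previous AU.
        base2 = x.nominal_prev_au
        delay2 = math.ceil(
            (x.init_cpb_removal_delay / 90000
             + x.final_arrival_prev_au - x.nominal_prev_au) / x.clock_tick
        )
        # Keep whichever candidate yields the later removal time.
        if base1 + x.clock_tick * delay1 < base2 + x.clock_tick * delay2:
            base_time, delay = base2, delay2
        else:
            base_time, delay = base1, delay1
        offset = 0  # CpbDelayOffset is not applied at a concatenation point
    # The offset is subtracted exactly as written in the formula above.
    return base_time + (x.clock_tick * delay - offset)
```

In the concatenation branch the sketch makes the point of the preceding paragraph visible: the offset is forced to 0, so the signalled delta itself has to carry the correct value.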

The third aspect of the invention is described in detail below.

According to one embodiment, the initial coded picture buffer removal delay may, e.g., indicate the time that needs to pass for the first access unit of a picture of the video data stream that initializes the video decoder 300 before that first access unit is sent to the video decoder 300.

In one embodiment, the video data stream may, e.g., comprise a single indication that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods.

According to one embodiment, the video data stream may, e.g., comprise a concatenation flag as the single indication, which indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods. If the concatenation flag is equal to a first value, that sum is constant across the two or more buffering periods. If the concatenation flag is different from the first value, the concatenation flag does not define whether or not that sum is constant across the two or more buffering periods.

In one embodiment, if the single indication does not indicate that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods, the video data stream may, e.g., comprise continuously updated information on the initial coded picture buffer removal delay and continuously updated information on the initial coded picture buffer removal offset.

According to one embodiment, if the video data stream comprises the information indicating that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods, that sum may, e.g., be defined to be constant starting from the current position within the video data stream.

Moreover, a video encoder 100 is provided. The video encoder 100 is configured to encode a video into a video data stream. The video encoder 100 is configured to generate the video data stream such that the video data stream comprises an initial coded picture buffer removal delay, such that the video data stream comprises an initial coded picture buffer removal offset, and such that the video data stream comprises information that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods.

In one embodiment, the video encoder 100 may, e.g., be configured to generate the video data stream such that the video data stream comprises a single indication that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods.

According to one embodiment, the video encoder 100 may, e.g., be configured to generate the video data stream such that the video data stream comprises a concatenation flag as the single indication. The concatenation flag may, e.g., indicate whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods. If the concatenation flag is equal to a first value, that sum is constant across the two or more buffering periods. If the concatenation flag is different from the first value, the concatenation flag does not define whether or not that sum is constant across the two or more buffering periods.

In one embodiment, if the single indication does not indicate that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods, the video encoder 100 may, e.g., be configured to generate the video data stream such that the video data stream comprises continuously updated information on the initial coded picture buffer removal delay and continuously updated information on the initial coded picture buffer removal offset.

According to one embodiment, if the video data stream comprises the information indicating that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods, that sum is defined to be constant starting from the current position within the video data stream.

Furthermore, an apparatus 200 for receiving two input video data streams, namely a first input video data stream and a second input video data stream, is provided. Each of the two input video data streams has an input video encoded into it. The apparatus 200 is configured to generate an output video data stream from the two input video data streams, the output video data stream having an output video encoded into it, wherein the apparatus 200 is configured to generate the output video data stream by concatenating the first input video data stream and the second input video data stream. The apparatus 200 is configured to generate the output video data stream such that the output video data stream comprises an initial coded picture buffer removal delay, such that the output video data stream comprises an initial coded picture buffer removal offset, and such that the output video data stream comprises information that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods.

According to one embodiment, the initial coded picture buffer removal delay may, e.g., indicate the time that needs to pass for the first access unit of a picture of the output video data stream that initializes the video decoder 300 before that first access unit is sent to the video decoder 300.

In one embodiment, the apparatus 200 may, e.g., be configured to generate the output video data stream such that the output video data stream comprises a single indication that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods.

According to one embodiment, the apparatus 200 may, e.g., be configured to generate the output video data stream such that the output video data stream comprises a concatenation flag as the single indication, which indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods. If the concatenation flag is equal to a first value, that sum is constant across the two or more buffering periods. If the concatenation flag is different from the first value, the concatenation flag does not define whether or not that sum is constant across the two or more buffering periods.

In one embodiment, if the single indication does not indicate that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods, the apparatus 200 is configured to generate the output video data stream such that the output video data stream comprises continuously updated information on the initial coded picture buffer removal delay and continuously updated information on the initial coded picture buffer removal offset.

According to one embodiment, if the video data stream comprises the information indicating that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods, that sum is defined to be constant starting from the current position within the video data stream.

Moreover, a video decoder 300 for receiving a video data stream having a video stored therein is provided. The video decoder 300 is configured to decode the video from the video data stream. The video data stream comprises an initial coded picture buffer removal delay, an initial coded picture buffer removal offset, and information that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods. The video decoder 300 is configured to decode the video depending on that information.

In one embodiment, the video data stream may, e.g., comprise a single indication that indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods. The video decoder 300 may, e.g., be configured to decode the video depending on the single indication.

According to one embodiment, the video data stream may, e.g., comprise a concatenation flag as the single indication, which indicates whether or not the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods. If the concatenation flag is equal to a first value, that sum is constant across the two or more buffering periods. If the concatenation flag is different from the first value, the concatenation flag does not define whether or not that sum is constant across the two or more buffering periods. The video decoder 300 is configured to decode the video depending on the concatenation flag.

In one embodiment, if the single indication does not indicate that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant across two or more buffering periods, the video data stream comprises continuously updated information on the initial coded picture buffer removal delay and continuously updated information on the initial coded picture buffer removal offset. The video decoder 300 is configured to decode the video depending on both pieces of continuously updated information.

According to one embodiment, the system may, e.g., further comprise a video encoder 100. The apparatus 200 according to any one of aspects 221 to 226 may, e.g., be configured to receive the video data stream from the video encoder 100 according to any one of aspects 211 to 216 as an input video data stream.

In particular, the third aspect of the invention relates to splicing, the initial CPB removal delay, and the initial CPB removal offset.

Currently, the specification states that the sum of the initial CPB removal delay and the initial CPB removal offset is constant within a CVS. The same constraint is expressed for the alternative timing. The initial CPB removal delay indicates the time that needs to pass for the first AU in the bitstream that initializes the decoder before that first AU is sent for decoding. The initial CPB removal offset is a property of the bitstream which means that the earliest arrival times of the AUs at the decoder are not necessarily equidistant with respect to time 0, at which the first AU arrives at the decoder. It helps to determine when the first bit of an AU can arrive at the decoder at the earliest.

The current constraint in the VVC draft specification states that the sum of these two values must be constant within a CVS:
Over the entire CVS, for each value pair of i and j, the sum of nal_initial_cpb_removal_delay[i][j] and nal_initial_cpb_removal_offset[i][j] shall be constant, and the sum of nal_initial_alt_cpb_removal_delay[i][j] and nal_initial_alt_cpb_removal_offset[i][j] shall be constant.
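The constraint quoted above can be checked mechanically. The following Python sketch assumes a hypothetical data layout that is not part of the source: `delays[k][i][j]` and `offsets[k][i][j]` hold the values signalled in the k-th buffering period of one CVS. The function name is likewise illustrative. The same check would be run a second time on the `alt` variants of the two syntax elements.

```python
from typing import List, Sequence

Table = Sequence[Sequence[int]]  # one buffering period's values, indexed [i][j]


def sums_constant_within_cvs(delays: Sequence[Table],
                             offsets: Sequence[Table]) -> bool:
    """Return True if, for every (i, j), delay + offset is the same in all
    buffering periods of the CVS."""
    if len(delays) != len(offsets):
        raise ValueError("need one offset table per delay table")
    if not delays:
        return True  # nothing signalled, nothing to violate

    def sums(d: Table, o: Table) -> List[List[int]]:
        # Element-wise delay + offset for one buffering period.
        return [[dv + ov for dv, ov in zip(drow, orow)]
                for drow, orow in zip(d, o)]

    reference = sums(delays[0], offsets[0])
    return all(sums(d, o) == reference
               for d, o in zip(delays[1:], offsets[1:]))
```

Individual delays and offsets may change from one buffering period to the next; only their sum per (i, j) pair is pinned down.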

The problem arises when bitstreams are edited or spliced to form a new joint bitstream. It would be desirable to be able to indicate whether this property is also fulfilled across the CVS boundaries of the bitstream, since different values of the sum could cause buffer underruns or overflows.

Thus, in one embodiment, an indication is carried in the bitstream that, from a certain point in the bitstream onwards (e.g., a splicing point), the value constraint on the constant sum of InitCpbRemovalDelay and InitCpbRemovalDelayOffset is reset, so that the sums before and after that point may differ. As long as this indication is not present in the bitstream, the sum remains constant.

For example:
When concatenationFlag is equal to 0, it is a constraint of bitstream conformance that the sum of InitCpbRemovalDelay and InitCpbRemovalDelayOffset is constant across buffering periods.

Otherwise, the sum of InitCpbRemovalDelay and InitCpbRemovalDelayOffset need not be constant across buffering periods. The values of InitCpbRemovalDelay and InitCpbRemovalDelayOffset are updated to take the arrival times into account.

In one embodiment, when several bitstreams are spliced, the concatenation flag may, e.g., define at each splicing point whether or not the sum remains constant.
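The reset behaviour at splicing points can be illustrated with a small conformance check. This is a sketch under stated assumptions: the `BufferingPeriod` record and the function name are hypothetical, and the interpretation that a set concatenation flag starts a new constant-sum run is one reading of the embodiment above, not a quotation of normative text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class BufferingPeriod:
    """Hypothetical record of the values signalled for one buffering period."""
    init_cpb_removal_delay: int
    init_cpb_removal_offset: int
    concatenation_flag: bool = False  # True marks a splicing point


def sum_constraint_holds(periods: Iterable[BufferingPeriod]) -> bool:
    """Return True if delay + offset stays constant between splicing points."""
    expected: Optional[int] = None
    for bp in periods:
        current = bp.init_cpb_removal_delay + bp.init_cpb_removal_offset
        if expected is None or bp.concatenation_flag:
            # Start of the stream or a signalled splice: the constraint is
            # reset and the current sum becomes the new reference.
            expected = current
        elif current != expected:
            return False
    return True
```

A spliced stream whose second part uses a different sum passes this check only if the first buffering period after the splice carries the flag.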

The fourth aspect of the invention is described in detail below.

According to the fourth aspect of the invention, a video data stream is provided. The video data stream has a video encoded into it. Moreover, the video data stream comprises an indication (e.g., general_same_pic_timing_in_all_ols_flag) that indicates whether or not a non-scalable-nested picture timing supplemental enhancement information message of a network abstraction layer unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of said access unit. If the indication has a first value, the non-scalable-nested picture timing supplemental enhancement information message of said network abstraction layer unit of said access unit is defined to apply to all output layer sets of the plurality of output layer sets of said access unit. If the indication has a value different from the first value, the indication does not define whether or not that message applies to all output layer sets of the plurality of output layer sets of said access unit.

According to one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, said network abstraction layer unit does, e.g., not comprise any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message.

In one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, said network abstraction layer unit does, e.g., not comprise any other supplemental enhancement information message.

According to one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, then, e.g., for each network abstraction layer unit that comprises a non-scalable-nested picture timing supplemental enhancement information message, of each access unit of the plurality of access units of a coded video sequence of the one or more coded video sequences, said network abstraction layer unit does not comprise any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not comprise any other supplemental enhancement information message at all.

In one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, then, for each network abstraction layer unit that comprises a non-scalable-nested picture timing supplemental enhancement information message, of each access unit of the plurality of access units of each of the one or more coded video sequences of the video data stream, said network abstraction layer unit does not comprise any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not comprise any other supplemental enhancement information message at all.
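The restriction on SEI-carrying network abstraction layer units can be sketched as a simple validator. The classes `SeiMessage` and `SeiNalUnit`, the payload name strings, and the function name are hypothetical stand-ins for a parsed bitstream, and the sketch implements the stricter of the two variants described above (no other supplemental enhancement information message at all in the same unit).

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class SeiMessage:
    payload_name: str            # e.g. "picture_timing" (illustrative label)
    scalable_nested: bool = False


@dataclass
class SeiNalUnit:
    messages: List[SeiMessage] = field(default_factory=list)


def pt_sei_units_are_exclusive(same_pic_timing_in_all_ols: bool,
                               nal_units: Iterable[SeiNalUnit]) -> bool:
    """Return True if every NAL unit that carries a non-scalable-nested
    picture timing SEI message carries nothing else."""
    if not same_pic_timing_in_all_ols:
        return True  # the indication imposes no restriction in this case
    for nal in nal_units:
        has_pt = any(m.payload_name == "picture_timing" and not m.scalable_nested
                     for m in nal.messages)
        if has_pt and len(nal.messages) != 1:
            return False
    return True
```

Units that carry no non-scalable-nested picture timing message are left unconstrained by this check, whatever else they contain.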

Furthermore, a video encoder 100 may, e.g., be provided. The video encoder 100 is configured to encode a video into a video data stream. Moreover, the video encoder 100 is configured to generate the video data stream such that the video data stream comprises an indication (e.g., general_same_pic_timing_in_all_ols_flag) that indicates whether or not a non-scalable-nested picture timing supplemental enhancement information message of a network abstraction layer unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of said access unit. If the indication has a first value, the non-scalable-nested picture timing supplemental enhancement information message of said network abstraction layer unit of said access unit is defined to apply to all output layer sets of the plurality of output layer sets of said access unit. If the indication has a value different from the first value, the indication does not define whether or not that message applies to all output layer sets of the plurality of output layer sets of said access unit.

According to one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, the video encoder 100 is, e.g., configured to generate the video data stream such that said network abstraction layer unit does not comprise any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message.

In one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, the video encoder 100 is, e.g., configured to generate the video data stream such that said network abstraction layer unit does not comprise any other supplemental enhancement information message.

According to one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, the video encoder 100 may, e.g., be configured to generate the video data stream such that, for each network abstraction layer unit that comprises a non-scalable-nested picture timing supplemental enhancement information message, of each access unit of the plurality of access units of a coded video sequence of the one or more coded video sequences, said network abstraction layer unit does not comprise any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not comprise any other supplemental enhancement information message at all.

In one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, the video encoder 100 may, e.g., be configured to generate the video data stream such that, for each network abstraction layer unit that comprises a non-scalable-nested picture timing supplemental enhancement information message, of each access unit of the plurality of access units of each of the one or more coded video sequences of the video data stream, said network abstraction layer unit does not comprise any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not comprise any other supplemental enhancement information message at all.

Furthermore, an apparatus 200 for receiving an input video data stream is provided. The input video data stream has a video encoded into it. The apparatus 200 is configured to generate a processed video data stream from the input video data stream. Moreover, the apparatus 200 is configured to generate the processed video data stream such that the processed video data stream comprises an indication (e.g., general_same_pic_timing_in_all_ols_flag) that indicates whether or not a non-scalable-nested picture timing supplemental enhancement information message of a network abstraction layer unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the processed video data stream is defined to apply to all output layer sets of a plurality of output layer sets of said access unit. If the indication has a first value, the non-scalable-nested picture timing supplemental enhancement information message of said network abstraction layer unit of said access unit is defined to apply to all output layer sets of the plurality of output layer sets of said access unit. If the indication has a value different from the first value, the indication does not define whether or not that message applies to all output layer sets of the plurality of output layer sets of said access unit.

According to one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, the apparatus 200 is, for example, configured to generate the processed video data stream such that said network abstraction layer unit does not comprise any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message.

In one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, the apparatus 200 is, for example, configured to generate the processed video data stream such that said network abstraction layer unit does not comprise any other supplemental enhancement information message.

According to one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, the apparatus 200 may, for example, be configured to generate the processed video data stream such that, for each network abstraction layer unit that comprises a non-scalable-nested picture timing supplemental enhancement information message, in each access unit of the plurality of access units of an encoded video sequence of the one or more encoded video sequences, said network abstraction layer unit does not comprise any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not comprise any other supplemental enhancement information message.

In one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, the apparatus 200 may, for example, be configured to generate the processed video data stream such that, for each network abstraction layer unit that comprises a non-scalable-nested picture timing supplemental enhancement information message, in each access unit of the plurality of access units of each of the one or more encoded video sequences of the processed video data stream, said network abstraction layer unit does not comprise any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not comprise any other supplemental enhancement information message.

Furthermore, a video decoder 300 for receiving a video data stream having a video stored therein is provided. The video decoder 300 is configured to decode the video from the video data stream. The video data stream comprises an indication (e.g., general_same_pic_timing_in_all_ols_flag) that indicates whether a non-scalable-nested picture timing supplemental enhancement information message of a network abstraction layer unit of an access unit of a plurality of access units of an encoded video sequence of one or more encoded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of said access unit. If the indication (e.g., general_same_pic_timing_in_all_ols_flag) has a first value, the non-scalable-nested picture timing supplemental enhancement information message of said network abstraction layer unit of said access unit is defined to apply to all output layer sets of the plurality of output layer sets of said access unit. If the indication (e.g., general_same_pic_timing_in_all_ols_flag) has a value different from the first value, the indication does not define whether or not the non-scalable-nested picture timing supplemental enhancement information message of said network abstraction layer unit of said access unit applies to all output layer sets of the plurality of output layer sets of said access unit. The video decoder 300 is configured to decode the video depending on said indication.

In one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, said network abstraction layer unit does not comprise any other supplemental enhancement information message. The video decoder 300 is configured to decode the video depending on said indication.

In one embodiment, if the indication (e.g., general_same_pic_timing_in_all_ols_flag) has the first value, then, for each network abstraction layer unit that comprises a non-scalable-nested picture timing supplemental enhancement information message, in each access unit of the plurality of access units of each of the one or more encoded video sequences of the video data stream, said network abstraction layer unit does not comprise any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not comprise any other supplemental enhancement information message.

Furthermore, a system is provided. The system comprises an apparatus 200 as described above and a video decoder 300 as described above. The video decoder 300 is configured to receive the processed video data stream of the apparatus 200. Moreover, the video decoder 300 is configured to decode the video from the output video data stream of the apparatus 200.

In particular, the fourth aspect of the invention relates to constraining PT SEI messages so that they are not paired with other HRD SEI messages when general_same_pic_timing_in_all_ols_flag is equal to 1.

The VVC draft specification includes a flag called general_same_pic_timing_in_all_ols_flag in the general HRD parameters structure, with the following semantics:
general_same_pic_timing_in_all_ols_flag equal to 1 specifies that the non-scalable-nested PT SEI message in each AU applies to the AU for any OLS in the bitstream and that no scalable-nested PT SEI messages are present. general_same_pic_timing_in_all_ols_flag equal to 0 specifies that the non-scalable-nested PT SEI message in each AU may or may not apply to the AU for any OLS in the bitstream and that scalable-nested PT SEI messages may be present.

In general, when an OLS sub-bitstream is extracted from an original bitstream (which contains OLS data and non-OLS data), the corresponding HRD-related timing and buffer information of the target OLS, in the form of buffering period (BP), picture timing (PT), and decoding unit information (DUI) SEI messages encapsulated in so-called scalable nesting SEI messages, is decapsulated. The decapsulated SEI messages are then used to replace the non-scalable-nested HRD SEI information in the original bitstream. In many scenarios, however, the content of some of these messages, for instance the picture timing SEI message, can remain the same when layers are dropped, i.e., when going from one OLS to a subset of it. general_same_pic_timing_in_all_ols_flag therefore provides a shortcut: only the BP and DUI SEI messages are replaced, while the PT SEI message in the original bitstream remains valid, i.e., it is not removed during extraction when general_same_pic_timing_in_all_ols_flag is equal to 1. Consequently, no replacement PT SEI message needs to be encapsulated in the scalable nesting SEI message that carries the replacement BP and DUI SEI messages, and no bitrate overhead is incurred for this information.
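The extraction shortcut described above can be illustrated with a minimal Python sketch. It is not the normative extraction process: the `Nal` record, its `kind` labels, and the `extract` function are hypothetical, and the sketch assumes that each SEI NAL unit has already been parsed into a tuple of message labels.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass
class Nal:
    """Hypothetical, already-parsed NAL unit (illustration only)."""
    kind: str                       # "VCL", "SEI" or "NESTING_SEI" (assumed labels)
    layer_id: int = 0
    messages: Tuple[str, ...] = ()  # SEI message labels, e.g. ("PT",) or ("BP", "DUI")
    target_ols: int = -1            # for NESTING_SEI: OLS the nested messages apply to


def extract(nals: Iterable[Nal], target_ols: int, ols_layers: Set[int],
            same_pic_timing_flag: int) -> List[Nal]:
    """Sketch of OLS sub-bitstream extraction that works on whole NAL units."""
    out = []
    for n in nals:
        if n.kind == "VCL":
            # Keep only the layers that belong to the target OLS.
            if n.layer_id in ols_layers:
                out.append(n)
        elif n.kind == "NESTING_SEI":
            # De-nest the replacement HRD SEI messages of the target OLS.
            if n.target_ols == target_ols:
                out.append(Nal("SEI", n.layer_id, n.messages))
        elif n.kind == "SEI":
            # With the flag equal to 1, a PT-only SEI NAL unit stays valid
            # and is forwarded unchanged; no payload rewriting is needed.
            if same_pic_timing_flag == 1 and n.messages == ("PT",):
                out.append(n)
            # Other non-nested HRD SEI NAL units are dropped, because the
            # de-nested messages replace them.
    return out
```

The sketch only forwards or drops whole NAL units, which is the property that the constraint of this aspect is intended to make possible.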

In the state of the art, however, PT SEI messages are allowed to be carried in one SEI NAL unit (NAL unit = network abstraction layer unit) together with other HRD SEI messages, i.e., BP, PT, and SEI messages may all be encapsulated in the same prefix SEI NAL unit. An extractor therefore has to inspect such SEI NAL units more deeply to understand which messages they contain, and, if only one of the contained messages (the PT SEI message) is to be kept during the extraction procedure, it actually has to rewrite such SEI NAL units, i.e., remove the non-PT SEI messages. To avoid this cumbersome low-level processing and to allow an extractor to operate on the non-parameter-set part of the bitstream entirely at the NAL unit level, it is part of the invention that a bitstream constraint disallows such a bitstream construction. In one embodiment, the constraint is expressed as follows:
general_same_pic_timing_in_all_ols_flag equal to 1 specifies that the non-scalable-nested PT SEI message in each AU applies to the AU for any OLS in the bitstream and that no scalable-nested PT SEI messages are present. general_same_pic_timing_in_all_ols_flag equal to 0 specifies that the non-scalable-nested PT SEI message in each AU may or may not apply to the AU for any OLS in the bitstream and that scalable-nested PT SEI messages may be present. When general_same_pic_timing_in_all_ols_flag is equal to 1, it is a constraint of bitstream conformance that all general SEI messages in the bitstream that contain an SEI message with payload_type equal to 1 (picture timing) shall not contain SEI messages with payload_type not equal to 1.
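As an illustration of the constraint above, the following Python sketch checks a list of SEI containers for conformance. It is a simplified model under stated assumptions: the `SeiNalUnit` record and both function names are hypothetical, and each container is modelled only as the list of payload_type values it carries. The only payload type taken from the text is 1 (picture timing).

```python
from dataclasses import dataclass
from typing import Iterable, List

PT_PAYLOAD_TYPE = 1  # picture timing, as stated in the constraint text


@dataclass
class SeiNalUnit:
    """Hypothetical parsed SEI container: only its payload types are modelled."""
    payload_types: List[int]


def violates_pt_constraint(nal: SeiNalUnit) -> bool:
    """True if a PT SEI message shares its container with a non-PT SEI message."""
    has_pt = PT_PAYLOAD_TYPE in nal.payload_types
    has_other = any(t != PT_PAYLOAD_TYPE for t in nal.payload_types)
    return has_pt and has_other


def bitstream_conforms(same_pic_timing_flag: int,
                       sei_nal_units: Iterable[SeiNalUnit]) -> bool:
    """Check the constraint; it only applies when the flag is equal to 1."""
    if same_pic_timing_flag != 1:
        return True
    return not any(violates_pt_constraint(n) for n in sei_nal_units)
```

A stream in which every PT SEI message travels alone passes the check, while a container that mixes payload_type 1 with any other payload type fails it when the flag is equal to 1.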

In the following, the fifth aspect of the invention is described in detail.

According to one embodiment, the video data stream may, for example, comprise one or more non-scalable-nested supplemental enhancement information messages. The one or more scalable-nested supplemental enhancement information messages and the one or more non-scalable-nested supplemental enhancement information messages comprise a plurality of syntax elements. Each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream and in each of the non-scalable-nested supplemental enhancement information messages of the video data stream or of the portion of the video data stream.

In one embodiment, the video data stream may, for example, comprise a plurality of access units, wherein each access unit of the plurality of access units may, for example, be assigned to one of a plurality of pictures of the video. The portion of the video data stream may, for example, be an access unit of the plurality of access units of the video data stream. Each syntax element of the one or more syntax elements of the plurality of syntax elements may, for example, be defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the access unit.

According to one embodiment, the video data stream may, for example, comprise one or more non-scalable-nested supplemental enhancement information messages. The one or more scalable-nested supplemental enhancement information messages and the one or more non-scalable-nested supplemental enhancement information messages comprise a plurality of syntax elements. Each syntax element of the one or more syntax elements of the plurality of syntax elements may, for example, be defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the access unit and in each of the non-scalable-nested supplemental enhancement information messages of the access unit.

In one embodiment, the portion of the video data stream may, for example, be an encoded video sequence of the video data stream. Each syntax element of the one or more syntax elements of the plurality of syntax elements may, for example, be defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the encoded video sequence.

According to one embodiment, the video data stream may, for example, comprise one or more non-scalable-nested supplemental enhancement information messages. The one or more scalable-nested supplemental enhancement information messages and the one or more non-scalable-nested supplemental enhancement information messages comprise a plurality of syntax elements. Each syntax element of the one or more syntax elements of the plurality of syntax elements may, for example, be defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the encoded video sequence and in each of the non-scalable-nested supplemental enhancement information messages of the encoded video sequence.

In one embodiment, each syntax element of the one or more syntax elements of the plurality of syntax elements may, for example, be defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the video data stream.

According to one embodiment, each syntax element of the one or more syntax elements of the plurality of syntax elements may, for example, be defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the video data stream and in each of the non-scalable-nested supplemental enhancement information messages of the video data stream.
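The scope variants above (access unit, encoded video sequence, whole video data stream, with or without the non-scalable-nested messages) can be sketched as one small Python check. Everything in the sketch is an assumption made for illustration: the `SeiMessage` record, the `cvs_id` and `au_id` indices, and the representation of a message as a mapping from syntax element name to its size in bits.

```python
from typing import Dict, Iterable, NamedTuple, Set


class SeiMessage(NamedTuple):
    """Hypothetical parsed SEI message (illustration only)."""
    nested: bool                  # True: scalable-nested, False: non-scalable-nested
    cvs_id: int                   # assumed index of the encoded video sequence
    au_id: int                    # assumed index of the access unit
    element_bits: Dict[str, int]  # size in bits of each syntax element


def uniform_sizes(messages: Iterable[SeiMessage], constrained: Set[str],
                  scope: str = "stream", include_non_nested: bool = True) -> bool:
    """True if every constrained syntax element has one size within each scope."""
    seen = {}
    for m in messages:
        if not m.nested and not include_non_nested:
            continue
        # The scope decides which messages are compared with each other.
        prefix = {"stream": (), "cvs": (m.cvs_id,), "au": (m.cvs_id, m.au_id)}[scope]
        for name in constrained:
            if name not in m.element_bits:
                continue
            size = m.element_bits[name]
            if seen.setdefault(prefix + (name,), size) != size:
                return False
    return True
```

For example, two messages that use 10 bits in one encoded video sequence and 12 bits in another satisfy the check for `scope="cvs"` but not for `scope="stream"`.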

In one embodiment, the video data stream or the portion of the video data stream may, for example, comprise at least one buffering period supplemental enhancement information message, wherein said buffering period supplemental enhancement information message defines the size of each syntax element of the one or more syntax elements of the plurality of syntax elements.

According to one embodiment, to define the size of each syntax element of the one or more syntax elements of the plurality of syntax elements, said buffering period supplemental enhancement information message comprises at least one of:
a bp_cpb_initial_removal_delay_length_minus1 element,
a bp_cpb_removal_delay_length_minus1 element,
a bp_dpb_output_delay_length_minus1 element,
a bp_du_cpb_removal_delay_increment_length_minus1 element,
a bp_dpb_output_delay_du_length_minus1 element.
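The role of these length elements can be illustrated with a short Python sketch. It assumes the usual `_minus1` naming convention, under which the signalled value plus 1 gives a length in bits; the dictionary representation of a buffering period SEI message and the `bit_widths` helper are hypothetical.

```python
from typing import Dict

# Length elements listed in the text above.
BP_LENGTH_FIELDS = (
    "bp_cpb_initial_removal_delay_length_minus1",
    "bp_cpb_removal_delay_length_minus1",
    "bp_dpb_output_delay_length_minus1",
    "bp_du_cpb_removal_delay_increment_length_minus1",
    "bp_dpb_output_delay_du_length_minus1",
)


def bit_widths(bp_sei: Dict[str, int]) -> Dict[str, int]:
    """Map each length element present in the message to a width in bits.

    Assumes the '_minus1' convention: width = signalled value + 1.
    """
    suffix = "_minus1"
    return {
        name[: -len(suffix)]: bp_sei[name] + 1
        for name in BP_LENGTH_FIELDS
        if name in bp_sei
    }
```

If nested and non-nested messages signalled different widths, a parser would have to track which buffering period message governs each dependent syntax element; requiring equal sizes within the chosen scope removes that bookkeeping.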

In one embodiment, for each access unit of a plurality of access units of the video data stream that comprises a scalable-nested buffering period supplemental enhancement information message, said access unit may, for example, also comprise a non-scalable-nested buffering period supplemental enhancement information message.

According to one embodiment, for each single-layer access unit of a plurality of single-layer access units of the video data stream that comprises a scalable-nested buffering period supplemental enhancement information message, said single-layer access unit may, for example, also comprise a non-scalable-nested buffering period supplemental enhancement information message.
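The two presence conditions above can be expressed as a small Python check. The `AccessUnit` record and its boolean fields are hypothetical simplifications; a real implementation would derive them from the parsed SEI messages of each access unit.

```python
from typing import Iterable, NamedTuple


class AccessUnit(NamedTuple):
    """Hypothetical summary of one access unit (illustration only)."""
    has_nested_bp: bool        # contains a scalable-nested BP SEI message
    has_non_nested_bp: bool    # contains a non-scalable-nested BP SEI message
    single_layer: bool = False


def nested_bp_accompanied(aus: Iterable[AccessUnit],
                          only_single_layer: bool = False) -> bool:
    """True if every checked AU with a nested BP SEI also has a non-nested one.

    With only_single_layer=True, only single-layer access units are checked.
    """
    return all(
        au.has_non_nested_bp
        for au in aus
        if au.has_nested_bp and (au.single_layer or not only_single_layer)
    )
```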

Furthermore, a video encoder 100 is provided. The video encoder 100 is configured to encode a video into a video data stream. Moreover, the video encoder 100 is configured to generate the video data stream such that the video data stream comprises one or more scalable-nested supplemental enhancement information messages. Moreover, the video encoder 100 is configured to generate the video data stream such that the one or more scalable-nested supplemental enhancement information messages comprise a plurality of syntax elements. Moreover, the video encoder 100 is configured to generate the video data stream such that each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream.

According to one embodiment, the video encoder 100 may, for example, be configured to generate the video data stream such that the video data stream comprises one or more non-scalable-nested supplemental enhancement information messages. The video encoder 100 may, for example, be configured to generate the video data stream such that the one or more scalable-nested supplemental enhancement information messages and the one or more non-scalable-nested supplemental enhancement information messages comprise a plurality of syntax elements. The video encoder 100 may, for example, be configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the video data stream or of the portion of the video data stream and in each of the non-scalable-nested supplemental enhancement information messages of the video data stream or of the portion of the video data stream.

In one embodiment, the video encoder 100 may, for example, be configured to generate the video data stream such that the video data stream comprises a plurality of access units, wherein each access unit of the plurality of access units may, for example, be assigned to one of a plurality of pictures of the video. The portion of the video data stream may, for example, be an access unit of the plurality of access units of the video data stream. The video encoder 100 may, for example, be configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the access unit.

According to one embodiment, the video encoder 100 may, for example, be configured to generate the video data stream such that the video data stream comprises one or more non-scalable-nested supplemental enhancement information messages. The video encoder 100 may, for example, be configured to generate the video data stream such that the one or more scalable-nested supplemental enhancement information messages and the one or more non-scalable-nested supplemental enhancement information messages comprise a plurality of syntax elements. The video encoder 100 may, for example, be configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the access unit and in each of the non-scalable-nested supplemental enhancement information messages of the access unit.

In one embodiment, the portion of the video data stream may, for example, be an encoded video sequence of the video data stream. The video encoder 100 may, for example, be configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the encoded video sequence.

According to one embodiment, the video encoder 100 may, for example, be configured to generate the video data stream such that the video data stream comprises one or more non-scalable-nested supplemental enhancement information messages. The video encoder 100 may, for example, be configured to generate the video data stream such that the one or more scalable-nested supplemental enhancement information messages and the one or more non-scalable-nested supplemental enhancement information messages comprise a plurality of syntax elements. The video encoder 100 may, for example, be configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the encoded video sequence and in each of the non-scalable-nested supplemental enhancement information messages of the encoded video sequence.

In one embodiment, the video encoder 100 may, for example, be configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the video data stream.

According to one embodiment, the video encoder 100 may, for example, be configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the video data stream and in each of the non-scalable-nested supplemental enhancement information messages of the video data stream.

In one embodiment, the video encoder 100 may, for example, be configured to generate the video data stream such that the video data stream or the portion of the video data stream comprises at least one buffering period supplemental enhancement information message, wherein said buffering period supplemental enhancement information message defines the size of each syntax element of the one or more syntax elements of the plurality of syntax elements.

According to one embodiment, the video encoder 100 may, for example, be configured to generate the video data stream such that, to define the size of each syntax element of the one or more syntax elements of the plurality of syntax elements, said buffering period supplemental enhancement information message comprises at least one of:
a bp_cpb_initial_removal_delay_length_minus1 element,
a bp_cpb_removal_delay_length_minus1 element,
a bp_dpb_output_delay_length_minus1 element,
a bp_du_cpb_removal_delay_increment_length_minus1 element,
a bp_dpb_output_delay_du_length_minus1 element.

In one embodiment, the video encoder 100 may, for example, be configured to generate the video data stream such that, for each access unit of a plurality of access units of the video data stream that comprises a scalable-nested buffering period supplemental enhancement information message, said access unit also comprises a non-scalable-nested buffering period supplemental enhancement information message.

According to one embodiment, the video encoder 100 may, for example, be configured to generate the video data stream such that, for each single-layer access unit of a plurality of single-layer access units of the video data stream that comprises a scalable-nested buffering period supplemental enhancement information message, said single-layer access unit also comprises a non-scalable-nested buffering period supplemental enhancement information message.

Furthermore, an apparatus 200 for receiving an input video data stream is provided. The input video data stream has a video encoded into it. The apparatus 200 is configured to generate an output video data stream from the input video data stream. The video data stream comprises one or more scalable-nested supplemental enhancement information messages. The one or more scalable-nested supplemental enhancement information messages comprise a plurality of syntax elements. Each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable-nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream. The apparatus 200 is configured to process the one or more scalable-nested supplemental enhancement information messages.

According to one embodiment, the video data stream may, for example, include one or more scalable non-nested supplemental enhancement information messages. The one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages include the plurality of syntax elements. Each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of the portion of the video data stream, and in each of the scalable non-nested supplemental enhancement information messages of the video data stream or of the portion of the video data stream. The apparatus 200 is configured to process the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages.

According to one embodiment, the video data stream may, for example, include one or more scalable non-nested supplemental enhancement information messages. The one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages include the plurality of syntax elements. Each syntax element of the one or more syntax elements of the plurality of syntax elements may, for example, be defined to have the same size in each of the scalable nested supplemental enhancement information messages of an access unit and in each of the scalable non-nested supplemental enhancement information messages of the access unit. The apparatus 200 may, for example, be configured to process the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages.

According to one embodiment, the video data stream may, for example, include one or more scalable non-nested supplemental enhancement information messages. The one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages include the plurality of syntax elements. Each syntax element of the one or more syntax elements of the plurality of syntax elements may, for example, be defined to have the same size in each of the scalable nested supplemental enhancement information messages of a coded video sequence and in each of the scalable non-nested supplemental enhancement information messages of the coded video sequence. The apparatus 200 may, for example, be configured to process the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages.

According to one embodiment, each syntax element of the one or more syntax elements of the plurality of syntax elements may, for example, be defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream and in each of the scalable non-nested supplemental enhancement information messages of the video data stream. The apparatus 200 may, for example, be configured to process the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages.

In one embodiment, the video data stream or the portion of the video data stream may, for example, include at least one buffering period supplemental enhancement information message, wherein said buffering period supplemental enhancement information message defines the size of the one or more syntax elements of the plurality of syntax elements. The apparatus 200 may, for example, be configured to process the at least one buffering period supplemental enhancement information message.

According to one embodiment, to define the size of the one or more syntax elements of the plurality of syntax elements, said buffering period supplemental enhancement information message includes at least one of:
a bp_cpb_initial_removal_delay_length_minus1 element,
a bp_cpb_removal_delay_length_minus1 element,
a bp_dpb_output_delay_length_minus1 element,
a bp_du_cpb_removal_delay_increment_length_minus1 element,
a bp_dpb_output_delay_du_length_minus1 element.

In one embodiment, for each access unit of a plurality of access units of the video data stream that includes a scalable nested buffering period supplemental enhancement information message, said access unit may, for example, also include a scalable non-nested buffering period supplemental enhancement information message. The apparatus 200 may, for example, be configured to process the scalable nested supplemental enhancement information messages and the scalable non-nested supplemental enhancement information messages.

According to one embodiment, for each single-layer access unit of a plurality of single-layer access units of the video data stream that includes a scalable nested buffering period supplemental enhancement information message, said single-layer access unit may, for example, also include a scalable non-nested buffering period supplemental enhancement information message. The apparatus 200 may, for example, be configured to process the scalable nested supplemental enhancement information messages and the scalable non-nested supplemental enhancement information messages.

Furthermore, a video decoder 300 for receiving a video data stream having a video stored therein is provided. The video decoder 300 is configured to decode the video from the video data stream. The video data stream includes one or more scalable nested supplemental enhancement information messages. The one or more scalable nested supplemental enhancement information messages include a plurality of syntax elements. Each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream. The video decoder 300 is configured to decode the video depending on the one or more syntax elements of the plurality of syntax elements.

According to one embodiment, the video data stream may, for example, include one or more scalable non-nested supplemental enhancement information messages. The one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages include the plurality of syntax elements. Each syntax element of the one or more syntax elements of the plurality of syntax elements may, for example, be defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of the portion of the video data stream, and in each of the scalable non-nested supplemental enhancement information messages of the video data stream or of the portion of the video data stream.

According to one embodiment, to define the size of each syntax element of the one or more syntax elements of the plurality of syntax elements, said buffering period supplemental enhancement information message includes at least one of:
a bp_cpb_initial_removal_delay_length_minus1 element,
a bp_cpb_removal_delay_length_minus1 element,
a bp_dpb_output_delay_length_minus1 element,
a bp_du_cpb_removal_delay_increment_length_minus1 element,
a bp_dpb_output_delay_du_length_minus1 element.

In particular, the fifth aspect of the invention relates to constraining all BP SEI messages in a bitstream to indicate the same length for certain variable-length coded syntax elements, and to constraining them not to be scalable nested without a scalable non-nested variant in the same AU.

Buffering period (BP) SEI messages, picture timing (PT) SEI messages, and decoding unit information (DUI) SEI messages provide precise timing information for the NAL units in a bitstream, in order to control their transition through the decoder buffers in conformance tests. Some syntax elements in the PT and DUI SEI messages are coded with variable length, and the lengths of these syntax elements are conveyed in the BP SEI message. This parsing dependency is a design trade-off: at the cost of not being able to parse PT and DUI SEI messages without first parsing the associated BP SEI message, it avoids sending those length syntax elements in every PT or DUI SEI message. Because BP SEI messages (once per several frames) are sent far less often than PT SEI messages (once per frame) or DUI SEI messages (several times per frame), this common design trade-off saves bits, much as the picture header structure can reduce the bit cost of slice headers when many slices are used.

More specifically, the BP SEI message in the current VVC draft specification includes the following syntax elements, which are the root of the parsing dependencies:
• bp_cpb_initial_removal_delay_length_minus1, which specifies the coded length of the alternative-timing initial CPB removal delays of AUs in the PT SEI message;
• bp_cpb_removal_delay_length_minus1, which specifies the coded length of the CPB removal delays and removal delay offsets of AUs in the PT SEI message;
• bp_dpb_output_delay_length_minus1, which specifies the coded length of the DPB output delays of AUs in the PT SEI message;
• bp_du_cpb_removal_delay_increment_length_minus1, which specifies the coded length of the individual and common CPB removal delays of DUs in the PT SEI message and of the CPB removal delays of DUs in the DUI SEI message;
• bp_dpb_output_delay_du_length_minus1, which specifies the coded length of the DPB output delays of AUs in the PT SEI message and in the DUI SEI message.
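The parsing dependency described above can be illustrated with a minimal Python sketch. It is not the VVC syntax: the two-field payload layout, the `BpLengths` container, and the `parse_toy_pt_payload` function are assumptions made for illustration, and only two of the five length elements are modeled. The point is that the same payload bytes decode to different values unless the parser already holds the lengths signaled in the associated BP SEI message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BpLengths:
    """Two of the length fields carried in a BP SEI message (stored as length minus 1)."""
    bp_cpb_removal_delay_length_minus1: int
    bp_dpb_output_delay_length_minus1: int


class BitReader:
    """Reads unsigned big-endian fields of arbitrary bit width from a byte string."""

    def __init__(self, data: bytes) -> None:
        self._value = int.from_bytes(data, "big")
        self._remaining = len(data) * 8

    def read(self, num_bits: int) -> int:
        if num_bits > self._remaining:
            raise ValueError("not enough bits left in payload")
        self._remaining -= num_bits
        return (self._value >> self._remaining) & ((1 << num_bits) - 1)


def parse_toy_pt_payload(payload: bytes, bp: BpLengths) -> dict:
    """Parse a toy PT payload: a CPB removal delay followed by a DPB output delay.

    Hypothetical layout for illustration only; the field widths come from the
    BP SEI message, which is what creates the parsing dependency.
    """
    reader = BitReader(payload)
    return {
        "cpb_removal_delay": reader.read(bp.bp_cpb_removal_delay_length_minus1 + 1),
        "dpb_output_delay": reader.read(bp.bp_dpb_output_delay_length_minus1 + 1),
    }
```

For example, the single byte `0x9D` parses as a 5-bit delay followed by a 3-bit delay under one BP SEI message, and as two 4-bit delays under another, so a parser that picks the wrong BP SEI message silently misreads the timing values.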

However, a problem arises when a bitstream comprises multiple OLSs. The BP/PT/DUI SEI messages that apply to the OLS representing the bitstream are carried verbatim in the bitstream, so tracking the parsing dependencies is trivial. The other BP/PT/DUI SEI messages, which correspond to OLSs representing (sub-)bitstreams, are carried in encapsulated form in so-called scalable nesting SEI messages. The parsing dependencies still apply, and given that the number of OLSs can be very high, it is a considerable burden for a decoder or parser to keep track of the correct encapsulated BP SEI message when processing encapsulated PT and DUI SEI messages. In particular, these messages can also be encapsulated in different scalable nesting SEI messages.

Therefore, as part of the invention, in one embodiment a bitstream constraint is established that the coded values of the respective syntax elements describing the lengths must be the same in all scalable nested and scalable non-nested BP SEI messages in an AU. A decoder or parser then only needs to store the respective length values when parsing the first scalable non-nested BP SEI message in an AU, and can resolve the parsing dependencies of all PT and DUI SEI messages in the buffering period that starts at that AU, whether or not they are encapsulated in scalable nesting SEI messages. The following is an example of the respective specification text:
It is a requirement of bitstream conformance that all scalable nested and scalable non-nested buffering period SEI messages in an AU have the same respective values of the syntax elements bp_cpb_initial_removal_delay_length_minus1, bp_cpb_removal_delay_length_minus1, bp_dpb_output_delay_length_minus1, bp_du_cpb_removal_delay_increment_length_minus1, and bp_dpb_output_delay_du_length_minus1.
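A conformance checker for the AU-level constraint above could look roughly like the following Python sketch. The representation of a BP SEI message as a mapping from syntax element name to value, and the function name `au_bp_lengths_consistent`, are assumptions for illustration; only the five element names come from the text.

```python
from typing import Iterable, Mapping

# The five length-signaling syntax elements named in the constraint.
LENGTH_ELEMENTS = (
    "bp_cpb_initial_removal_delay_length_minus1",
    "bp_cpb_removal_delay_length_minus1",
    "bp_dpb_output_delay_length_minus1",
    "bp_du_cpb_removal_delay_increment_length_minus1",
    "bp_dpb_output_delay_du_length_minus1",
)


def au_bp_lengths_consistent(bp_messages: Iterable[Mapping[str, int]]) -> bool:
    """Return True if every BP SEI message of one AU signals the same five lengths.

    The messages are passed in without regard to whether they were scalable
    nested or not, since the constraint covers both kinds together.
    """
    signatures = {tuple(msg[name] for name in LENGTH_ELEMENTS) for msg in bp_messages}
    return len(signatures) <= 1
```

When the check holds, a parser can keep a single set of length values per AU instead of one set per OLS.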

In another embodiment, the constraint is expressed only for those scalable nested BP SEI messages that lie within the buffering period determined by the current scalable non-nested BP SEI message, as follows:
It is a requirement of bitstream conformance that all scalable nested buffering period SEI messages in a buffering period have the same respective values of the syntax elements bp_cpb_initial_removal_delay_length_minus1, bp_cpb_removal_delay_length_minus1, bp_dpb_output_delay_length_minus1, bp_du_cpb_removal_delay_increment_length_minus1, and bp_dpb_output_delay_du_length_minus1 as the scalable non-nested buffering period SEI message of the buffering period.

Here, the BP of the bitstream defines the scope of the constraint for the scalable nested BPs, from one scalable nested BP to the next scalable nested BP.

In another embodiment, the constraint is expressed for all AUs of the bitstream, for example as follows:
It is a requirement of bitstream conformance that all scalable nested and scalable non-nested buffering period SEI messages in the bitstream have the same respective values of the syntax elements bp_cpb_initial_removal_delay_length_minus1, bp_cpb_removal_delay_length_minus1, bp_dpb_output_delay_length_minus1, bp_du_cpb_removal_delay_increment_length_minus1, and bp_dpb_output_delay_du_length_minus1.

In another embodiment, the constraint is expressed only for the AUs in a CVS, so that smart encoders can still exploit differences in the duration of BPs in the bitstream when coding the relevant delay and offset syntax elements. The specification text is as follows:
It is a requirement of bitstream conformance that all scalable nested and scalable non-nested buffering period SEI messages in a CVS have the same respective values of the syntax elements bp_cpb_initial_removal_delay_length_minus1, bp_cpb_removal_delay_length_minus1, bp_dpb_output_delay_length_minus1, bp_du_cpb_removal_delay_increment_length_minus1, and bp_dpb_output_delay_du_length_minus1.

Here, the scope of the constraint is the CVS.
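The embodiments above differ only in the scope over which the length values must agree (AU, buffering period, CVS, or whole bitstream). As a rough Python sketch under stated assumptions, a single checker can take the scope as a parameter. The `scope_of` callback, the `cvs_index` and `au_index` keys used in the usage note, and the function name are hypothetical and not taken from any specification.

```python
from typing import Callable, Hashable, Iterable, List, Mapping

LENGTH_ELEMENTS = (
    "bp_cpb_initial_removal_delay_length_minus1",
    "bp_cpb_removal_delay_length_minus1",
    "bp_dpb_output_delay_length_minus1",
    "bp_du_cpb_removal_delay_increment_length_minus1",
    "bp_dpb_output_delay_du_length_minus1",
)


def scopes_with_mismatch(
    bp_messages: Iterable[Mapping[str, int]],
    scope_of: Callable[[Mapping[str, int]], Hashable],
) -> List[Hashable]:
    """Return the scope keys in which BP SEI messages disagree on a length value.

    `scope_of` maps a message to the scope it belongs to, for example its CVS.
    Scopes are reported in the order in which the first mismatch is seen.
    """
    first_seen = {}
    mismatched = []
    for msg in bp_messages:
        key = scope_of(msg)
        signature = tuple(msg[name] for name in LENGTH_ELEMENTS)
        if key not in first_seen:
            first_seen[key] = signature
        elif first_seen[key] != signature and key not in mismatched:
            mismatched.append(key)
    return mismatched
```

Passing `lambda m: m["cvs_index"]` checks the CVS-scoped variant, while a constant such as `lambda m: "bitstream"` checks the bitstream-wide variant. A stream whose lengths change only at CVS boundaries passes the former and fails the latter, which is the extra encoder freedom the CVS-scoped embodiment is meant to preserve.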

Specifically, the buffering period (BP) SEI message defines a so-called buffering period, in which the timing of each picture is anchored to the picture at the start of the buffering period. The start of a buffering period is useful, for example, for testing the conformance of random-access functionality in a bitstream.

FIG. 7 shows two sets of HRD SEI messages, scalable nested SEI and scalable non-nested SEI, in a two-layer bitstream according to one embodiment.

For example, in a multi-layer scenario as shown in FIG. 7, the scalable nested HRD SEI provides a different buffering period setup (through BPs at POC 0 and POC 3) than the scalable non-nested SEI (a BP at POC 0 only), for use when only layer L0 is extracted and played back from POC 3 onward.

However, as explained above, this also increases the complexity of tracking the parsing dependencies between PT messages and the individual BP messages, which is undesirable. Therefore, as part of the invention, in one embodiment it is prohibited to have a scalable nested BP SEI message in an AU without a scalable non-nested BP SEI message, as follows:
It is a requirement of bitstream conformance that no scalable nested BP SEI message is present in an AU that does not contain a scalable non-nested BP SEI message.

Because the usage scenario above arises only in multi-layer bitstreams, in another embodiment the related constraint is limited to single-layer bitstreams, as follows:
It is a requirement of bitstream conformance that no scalable nested BP SEI message is present in a single-layer AU that does not contain a scalable non-nested BP SEI message.
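The two presence constraints above can be sketched as one Python predicate with a switch for the single-layer variant. The `BpSei` record, the `au_is_single_layer` argument, and the `single_layer_only` switch are illustrative assumptions. How an AU is classified as single-layer is left to the caller.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BpSei:
    """Minimal stand-in for a BP SEI message found in an AU."""
    scalable_nested: bool


def au_bp_presence_ok(
    bp_messages: Iterable[BpSei],
    au_is_single_layer: bool,
    single_layer_only: bool = False,
) -> bool:
    """Check that a scalable nested BP SEI is accompanied by a non-nested one.

    With single_layer_only=True the rule is enforced only for single-layer AUs,
    which mirrors the second, narrower embodiment.
    """
    if single_layer_only and not au_is_single_layer:
        return True  # the narrower constraint does not apply to this AU
    messages = list(bp_messages)
    has_nested = any(m.scalable_nested for m in messages)
    has_non_nested = any(not m.scalable_nested for m in messages)
    return has_non_nested or not has_nested
```

Under the narrower variant, a multi-layer AU that carries only a scalable nested BP SEI message remains conformant, so the multi-layer buffering setup of FIG. 7 stays possible.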

Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block, item, or feature of a corresponding apparatus. Some or all of the method steps may be executed by (or using) a hardware apparatus, such as a microprocessor, a programmable computer, or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.

Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software, or at least partially in hardware or at least partially in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a Blu-ray disc, a CD, a ROM, a PROM, an EPROM, an EEPROM, or a flash memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.

Some embodiments according to the invention comprise a data carrier having electronically readable control signals that are capable of cooperating with a programmable computer system such that one of the methods described herein is performed.

Generally, embodiments of the present invention can be implemented as a computer program product with program code, the program code being operative to perform one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine-readable carrier.

Other embodiments comprise a computer program for performing one of the methods described herein, stored on a machine-readable carrier.

In other words, an embodiment of the inventive method is therefore a computer program having program code for performing one of the methods described herein when the computer program runs on a computer.

A further embodiment of the inventive method is therefore a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein. The data carrier, the digital storage medium, or the recorded medium is typically tangible and/or non-transitory.

A further embodiment of the inventive method is therefore a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example via the Internet.

A further embodiment comprises a processing means, for example a computer or a programmable logic device, configured or adapted to perform one of the methods described herein.

A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.

A further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device, or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.

In some embodiments, a programmable logic device (for example, a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware apparatus.

The apparatus described herein may be implemented using a hardware apparatus, or using a computer, or using a combination of a hardware apparatus and a computer.

The methods described herein may be performed using a hardware apparatus, or using a computer, or using a combination of a hardware apparatus and a computer.

The embodiments described above are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to others skilled in the art. It is the intent, therefore, to be limited only by the scope of the impending claims and not by the specific details presented by way of description and explanation of the embodiments herein.

References
[1] ISO/IEC, ITU-T. High efficiency video coding. ITU-T Recommendation H.265 | ISO/IEC 23008-2 (HEVC), edition 1, 2013; edition 2, 2014.

Aspects of the present invention are exemplified below.

<Aspect 1>
An apparatus (200) for receiving an input video data stream,
wherein a video is encoded into the input video data stream,
wherein the apparatus (200) is configured to generate an output video data stream from the input video data stream,
wherein the apparatus (200) is to determine whether a picture of the video preceding a dependent random access picture should be output or not.

<Aspect 2>
The apparatus (200) according to aspect 1, wherein the apparatus (200) is configured to determine a first variable (NoOutputBeforeDrapFlag) indicating whether the picture of the video that precedes the dependent random access picture should be output or not.

<Aspect 3>
The apparatus (200) according to aspect 2, wherein the apparatus (200) is configured to generate the output video data stream such that the output video data stream comprises an indication of whether the picture of the video that precedes the dependent random access picture should be output or not.

<Aspect 4>
The apparatus (200) according to aspect 3, wherein the apparatus (200) is configured to generate the output video data stream such that the output video data stream comprises supplemental enhancement information comprising the indication of whether the picture of the video that precedes the dependent random access picture should be output or not.

<Aspect 5>
The apparatus (200) according to aspect 3,
wherein the picture of the video that precedes the dependent random access picture is an independent random access picture,
wherein the apparatus (200) is configured to generate the output video data stream such that the output video data stream comprises a flag (ph_pic_output_flag) having a predefined value (0) in a picture header of the independent random access picture, such that the predefined value (0) of the flag (ph_pic_output_flag) indicates, for the independent random access picture that directly precedes said dependent random access picture within the video data stream, that said independent random access picture should not be output.

＜態様６＞<Aspect 6>
前記フラグが第１のフラグであり、前記装置（２００）は、前記出力ビデオデータストリームが前記ビデオデータストリームのピクチャパラメータセット内に更なるフラグを含むように前記出力ビデオデータストリームを生成するように構成され、前記更なるフラグは、前記第１のフラグ（ｐｈ＿ｐｉｃ＿ｏｕｔｐｕｔ＿ｆｌａｇ）が前記独立ランダムアクセスピクチャの前記ピクチャヘッダ内に存在するか否かを示す、態様５に記載の装置（２００）。 The apparatus (200) of aspect 5, wherein the flag is a first flag, and the apparatus (200) is configured to generate the output video data stream such that it includes a further flag in a picture parameter set of the video data stream, the further flag indicating whether the first flag (ph_pic_output_flag) is present in the picture header of the independent random access picture.
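
The presence signalling described in aspects 5 and 6 can be illustrated with a minimal Python sketch. The function name `read_ph_pic_output_flag`, the parameter `output_flag_present` (standing in for the unnamed further flag in the picture parameter set), the bit-iterator input, and the default of 1 when the flag is absent are all assumptions made for illustration; the aspects above do not specify them.

```python
from typing import Iterator


def read_ph_pic_output_flag(header_bits: Iterator[int],
                            output_flag_present: bool) -> int:
    """Return the value of ph_pic_output_flag for one picture header.

    header_bits: iterator over the remaining picture header bits (hypothetical layout).
    output_flag_present: value of the further flag carried in the picture
        parameter set (hypothetical name).
    """
    if output_flag_present:
        # The first flag is present in the picture header, so read one bit.
        return next(header_bits)
    # Assumption: when the first flag is absent, the picture is treated as
    # "to be output" (value 1).
    return 1
```

With this sketch, a value of 0 read from the header of the independent random access picture would mark that picture as not to be output, which is the case aspect 5 describes.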

＜態様７＞<Aspect 7>
前記装置（２００）は、前記出力ビデオデータストリームが、前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが出力されるべきか否かを示す前記表示として、 the apparatus (200) is configured such that, as the indication of whether the picture of the video preceding the dependent random access picture is to be output, the output video data stream includes
前記出力ビデオデータストリームの補足拡張情報内の補足拡張情報フラグ、又は、 a supplemental enhancement information flag within supplemental enhancement information of the output video data stream, or
前記出力ビデオデータストリームのピクチャパラメータセット内のピクチャパラメータセットフラグ、又は、 a picture parameter set flag within a picture parameter set of the output video data stream, or
前記出力ビデオデータストリームのシーケンスパラメータセット内のシーケンスパラメータセットフラグ、又は、 a sequence parameter set flag within a sequence parameter set of the output video data stream, or
外部手段フラグであって、該外部手段フラグの値が前記装置（２００）の外部にある外部ユニットによって設定される、外部手段フラグ、 an external means flag whose value is set by an external unit outside the apparatus (200),
を含むように、前記出力ビデオデータストリームを生成するように構成される、 the output video data stream being generated so as to include one of the foregoing,
態様３に記載の装置（２００）。 The apparatus (200) of aspect 3.

＜態様８＞<Aspect 8>
前記装置（２００）は、前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャに関する第２の変数（ＰｉｃｔｕｒｅＯｕｔｐｕｔＦｌａｇ）の値を前記第１の変数（ＮｏＯｕｔｐｕｔＢｅｆｏｒｅＤｒａｐＦｌａｇ）に応じて決定するように構成され、前記第２の変数（ＰｉｃｔｕｒｅＯｕｔｐｕｔＦｌａｇ）は、前記ピクチャが出力されるべきか否かを前記ピクチャに関して示し、前記装置（２００）は、前記第２の変数（ＰｉｃｔｕｒｅＯｕｔｐｕｔＦｌａｇ）に応じて前記ピクチャを出力し又は出力しないように構成される、 the apparatus (200) is configured to determine, depending on the first variable (NoOutputBeforeDrapFlag), a value of a second variable (PictureOutputFlag) for the picture of the video preceding the dependent random access picture, the second variable (PictureOutputFlag) indicating for the picture whether the picture is to be output, and the apparatus (200) is configured to output or not output the picture depending on the second variable (PictureOutputFlag),
態様２～７のいずれかに記載の装置（２００）。 The apparatus (200) of any of aspects 2-7.
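
The dependency between the two variables in aspect 8 can be sketched as follows. This is only an illustration under stated assumptions: the `Picture` class, its fields, the function name, and the convention that a true `no_output_before_drap_flag` means "do not output pictures preceding the dependent random access picture" are hypothetical and are inferred from the variable name rather than stated in the aspects.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Picture:
    poc: int                      # picture order count (illustrative identifier)
    is_drap: bool = False         # True for the dependent random access picture
    ph_pic_output_flag: int = 1   # value signalled in the picture header


def derive_picture_output_flags(pictures: List[Picture],
                                no_output_before_drap_flag: bool
                                ) -> List[Tuple[int, int]]:
    """Return (poc, PictureOutputFlag) pairs for pictures in decoding order."""
    # Index of the first dependent random access picture, if there is one.
    drap_index = next((i for i, p in enumerate(pictures) if p.is_drap), None)
    result = []
    for i, pic in enumerate(pictures):
        precedes_drap = drap_index is not None and i < drap_index
        if precedes_drap and no_output_before_drap_flag:
            # Assumed convention: suppress output of pictures before the DRAP.
            result.append((pic.poc, 0))
        else:
            # Otherwise fall back to the value carried in the picture header.
            result.append((pic.poc, pic.ph_pic_output_flag))
    return result
```

For example, with an independent random access picture followed by a dependent random access picture, setting `no_output_before_drap_flag` to true yields a PictureOutputFlag of 0 for the independent picture only, while the dependent picture and later pictures keep their signalled values.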

＜態様９＞<Aspect 9>
前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが独立ランダムアクセスピクチャであり、 the picture of the video preceding the dependent random access picture is an independent random access picture;
前記第１の変数（ＮｏＯｕｔｐｕｔＢｅｆｏｒｅＤｒａｐＦｌａｇ）は、前記独立ランダムアクセスピクチャが出力されるべきでないことを示す、 the first variable (NoOutputBeforeDrapFlag) indicates that the independent random access picture should not be output;
態様２～８のいずれかに記載の装置（２００）。 The apparatus (200) of any of aspects 2-8.

＜態様１０＞<Aspect 10>
前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが独立ランダムアクセスピクチャであり、 the picture of the video preceding the dependent random access picture is an independent random access picture;
前記装置（２００）は、前記独立ランダムアクセスピクチャが出力されるべきであることを前記第１の変数（ＮｏＯｕｔｐｕｔＢｅｆｏｒｅＤｒａｐＦｌａｇ）が示すように、前記第１の変数（ＮｏＯｕｔｐｕｔＢｅｆｏｒｅＤｒａｐＦｌａｇ）を設定するように構成される、 The apparatus (200) is configured to set the first variable (NoOutputBeforeDrapFlag) such that the first variable (NoOutputBeforeDrapFlag) indicates that the independent random access picture is to be output. ,
態様２～８のいずれかに記載の装置（２００）。 The apparatus (200) of any of aspects 2-8.

＜態様１１＞<Aspect 11>
前記装置（２００）は、従属ランダムアクセスピクチャに先行する前記ビデオのピクチャが出力されるべきか否かをビデオデコーダ（３００）へシグナルするように構成される、態様１～１０のいずれかに記載の装置（２００）。 The apparatus (200) of any of aspects 1-10, wherein the apparatus (200) is configured to signal to a video decoder (300) whether the picture of the video preceding the dependent random access picture is to be output.

＜態様１２＞<Aspect 12>
ビデオデータストリームであって、 a video data stream,
前記ビデオデータストリームにはビデオが符号化され、 a video is encoded in the video data stream;
前記ビデオデータストリームは、従属ランダムアクセスピクチャに先行する前記ビデオのピクチャが出力されるべきか否かを示す表示を含む、 the video data stream includes an indication of whether pictures of the video preceding dependent random access pictures are to be output;
ビデオデータストリーム。 video data stream.

＜態様１３＞<Aspect 13>
前記出力ビデオデータストリームは、前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが出力されるべきか否かを示す前記表示を含む補足拡張情報を含む、態様１２に記載のビデオデータストリーム。 13. The video data stream of aspect 12, wherein the output video data stream includes supplemental enhancement information including the indication of whether the pictures of the video preceding the dependent random access pictures should be output.

＜態様１４＞<Aspect 14>
前記従属ランダムアクセスピクチャに先行する前記ビデオのピクチャが独立ランダムアクセスピクチャであり、 a picture of the video preceding the dependent random access picture is an independent random access picture;
前記ビデオデータストリームは、前記独立ランダムアクセスピクチャのピクチャヘッダ内に所定の値（０）を有するフラグ（ｐｈ＿ｐｉｃ＿ｏｕｔｐｕｔ＿ｆｌａｇ）を含み、それにより、前記フラグ（ｐｈ＿ｐｉｃ＿ｏｕｔｐｕｔ＿ｆｌａｇ）の前記所定の値（０）は、前記ビデオデータストリーム内の前記従属ランダムアクセスピクチャに直接先行する前記独立ランダムアクセスピクチャについて、前記独立ランダムアクセスピクチャが出力されるべきでないことを示す、 the video data stream includes, in a picture header of the independent random access picture, a flag (ph_pic_output_flag) having a predetermined value (0), so that the predetermined value (0) of the flag (ph_pic_output_flag) indicates, for the independent random access picture that directly precedes the dependent random access picture in the video data stream, that the independent random access picture is not to be output,
態様１２に記載のビデオデータストリーム。 A video data stream according to aspect 12.

＜態様１５＞<Aspect 15>
前記フラグが第１のフラグであり、前記ビデオデータストリームは、前記ビデオデータストリームのピクチャパラメータセット内に更なるフラグを含み、前記更なるフラグは、前記第１のフラグ（ｐｈ＿ｐｉｃ＿ｏｕｔｐｕｔ＿ｆｌａｇ）が前記独立ランダムアクセスピクチャの前記ピクチャヘッダ内に存在するか否かを示す、態様１４に記載のビデオデータストリーム。 The video data stream of aspect 14, wherein the flag is a first flag, and the video data stream includes a further flag in a picture parameter set of the video data stream, the further flag indicating whether the first flag (ph_pic_output_flag) is present in the picture header of the independent random access picture.

＜態様１６＞<Aspect 16>
ビデオデータストリームは、前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが出力されるべきか否かを示す前記表示として、 a video data stream comprising: as said indication indicating whether said pictures of said video preceding said dependent random access pictures should be output;
前記出力ビデオデータストリームの補足拡張情報内の補足拡張情報フラグ、又は、 a supplemental enhancement information flag within the supplemental enhancement information of said output video data stream; or
前記出力ビデオデータストリームのピクチャパラメータセット内のピクチャパラメータセットフラグ、又は、 a picture parameter set flag within a picture parameter set of said output video data stream; or
前記出力ビデオデータストリームのシーケンスパラメータセット内のシーケンスパラメータセットフラグ、又は、 a sequence parameter set flag within a sequence parameter set of said output video data stream; or
外部手段フラグであって、該外部手段フラグの値が装置（２００）の外部にある外部ユニットによって設定される、外部手段フラグ、 an external means flag, the value of the external means flag being set by an external unit external to the device (200);
を含む、 including,
態様１２に記載のビデオデータストリーム。 A video data stream according to aspect 12.

＜態様１７＞<Aspect 17>
ビデオエンコーダ（１００）であって、 A video encoder (100) comprising:
前記ビデオエンコーダ（１００）は、ビデオをビデオデータストリームに符号化するように構成され、 said video encoder (100) configured to encode video into a video data stream;
前記ビデオエンコーダ（１００）は、従属ランダムアクセスピクチャに先行する前記ビデオのピクチャが出力されるべきか否かを示す表示を前記ビデオデータストリームが含むように、前記ビデオデータストリームを生成するように構成される、 the video encoder (100) is configured to generate the video data stream such that it includes an indication of whether a picture of the video preceding a dependent random access picture is to be output,
ビデオエンコーダ（１００）。 A video encoder (100).

＜態様１８＞<Aspect 18>
前記ビデオエンコーダ（１００）は、前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが出力されるべきか否かを示す前記表示を含む補足拡張情報を前記ビデオデータストリームが含むように、前記ビデオデータストリームを生成するように構成される、態様１７に記載のビデオエンコーダ（１００）。 The video encoder (100) of aspect 17, wherein the video encoder (100) is configured to generate the video data stream such that it includes supplemental enhancement information comprising the indication of whether the picture of the video preceding the dependent random access picture is to be output.

＜態様１９＞<Aspect 19>
前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが独立ランダムアクセスピクチャであり、 the picture of the video preceding the dependent random access picture is an independent random access picture;
前記ビデオエンコーダ（１００）は、前記独立ランダムアクセスピクチャのピクチャヘッダ内に所定の値（０）を有するフラグ（ｐｈ＿ｐｉｃ＿ｏｕｔｐｕｔ＿ｆｌａｇ）を前記出力ビデオデータストリームが含むように、前記ビデオデータストリームを生成するように構成され、それにより、前記フラグ（ｐｈ＿ｐｉｃ＿ｏｕｔｐｕｔ＿ｆｌａｇ）の前記所定の値（０）は、前記ビデオデータストリーム内の前記従属ランダムアクセスピクチャに直接先行する前記独立ランダムアクセスピクチャについて、前記独立ランダムアクセスピクチャが出力されるべきでないことを示す、 the video encoder (100) is configured to generate the video data stream such that the output video data stream includes, in a picture header of the independent random access picture, a flag (ph_pic_output_flag) having a predetermined value (0), so that the predetermined value (0) of the flag (ph_pic_output_flag) indicates, for the independent random access picture that directly precedes the dependent random access picture in the video data stream, that the independent random access picture is not to be output,
態様１７に記載のビデオエンコーダ（１００）。 18. A video encoder (100) according to aspect 17.

＜態様２０＞<Aspect 20>
前記フラグが第１のフラグであり、前記ビデオエンコーダ（１００）は、前記ビデオデータストリームが前記ビデオデータストリームのピクチャパラメータセット内に更なるフラグを含むように前記ビデオデータストリームを生成するように構成され、前記更なるフラグは、前記第１のフラグ（ｐｈ＿ｐｉｃ＿ｏｕｔｐｕｔ＿ｆｌａｇ）が前記独立ランダムアクセスピクチャの前記ピクチャヘッダ内に存在するか否かを示す、態様１９に記載のビデオエンコーダ（１００）。 The video encoder (100) of aspect 19, wherein the flag is a first flag, and the video encoder (100) is configured to generate the video data stream such that it includes a further flag in a picture parameter set of the video data stream, the further flag indicating whether the first flag (ph_pic_output_flag) is present in the picture header of the independent random access picture.

＜態様２１＞<Aspect 21>
前記ビデオエンコーダ（１００）は、前記ビデオデータストリームが、前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが出力されるべきか否かを示す前記表示として、 The video encoder (100), as the indication of whether the video data stream is to output the picture of the video that precedes the dependent random access picture,
前記出力ビデオデータストリームの補足拡張情報内の補足拡張情報フラグ、又は、 a supplemental enhancement information flag within the supplemental enhancement information of said output video data stream; or
前記出力ビデオデータストリームのピクチャパラメータセット内のピクチャパラメータセットフラグ、又は、 a picture parameter set flag within a picture parameter set of said output video data stream; or
前記出力ビデオデータストリームのシーケンスパラメータセット内のシーケンスパラメータセットフラグ、又は、 a sequence parameter set flag within a sequence parameter set of said output video data stream; or
外部手段フラグであって、該外部手段フラグの値が装置（２００）の外部にある外部ユニットによって設定される、外部手段フラグ、 an external means flag, the value of the external means flag being set by an external unit external to the device (200);
を含むように、前記ビデオデータストリームを生成するように構成される、 configured to generate said video data stream to include
態様１７に記載のビデオエンコーダ（１００）。 18. A video encoder (100) according to aspect 17.

＜態様２２＞<Aspect 22>
ビデオを格納したビデオデータストリームを受信するためのビデオデコーダ（３００）であって、 A video decoder (300) for receiving a video data stream containing video, comprising:
前記ビデオデコーダ（３００）は、前記ビデオデータストリームから前記ビデオを復号するように構成され、 said video decoder (300) configured to decode said video from said video data stream;
前記ビデオデコーダ（３００）は、従属ランダムアクセスピクチャに先行する前記ビデオのピクチャが出力されるべきか否かを示す表示に応じて前記ビデオを復号するように構成される、 wherein said video decoder (300) is configured to decode said video in response to an indication indicating whether a picture of said video preceding a dependent random access picture is to be output;
ビデオデコーダ（３００）。 A video decoder (300).

＜態様２３＞<Aspect 23>
前記ビデオデコーダ（３００）は、前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが出力されるべきか否かを示す第１の変数（ＮｏＯｕｔｐｕｔＢｅｆｏｒｅＤｒａｐＦｌａｇ）に応じて前記ビデオを復号するように構成される、態様２２に記載のビデオデコーダ（３００）。 The video decoder (300) is configured to decode the video according to a first variable (NoOutputBeforeDrapFlag) indicating whether the picture of the video preceding the dependent random access picture is to be output. 23. The video decoder (300) of aspect 22.

＜態様２４＞<Aspect 24>
前記ビデオデータストリームは、前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが出力されるべきか否かを示す前記表示を含み、 said video data stream includes said indication indicating whether said pictures of said video preceding said dependent random access pictures are to be output;
前記ビデオデコーダ（３００）は、前記ビデオデータストリーム内の前記表示に応じて前記ビデオを復号するように構成される、態様２３に記載のビデオデコーダ（３００）。 24. The video decoder (300) of aspect 23, wherein said video decoder (300) is configured to decode said video in response to said representation in said video data stream.

＜態様２５＞<Aspect 25>
前記ビデオデータストリームは、前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが出力されるべきか否かを示す前記表示を含む補足拡張情報を含み、 the video data stream includes supplemental enhancement information including the indication of whether the picture of the video preceding the dependent random access picture is to be output;
前記ビデオデコーダ（３００）は、前記補足拡張情報に応じて前記ビデオを復号するように構成される、態様２４に記載のビデオデコーダ（３００）。 25. The video decoder (300) of aspect 24, wherein the video decoder (300) is configured to decode the video in response to the supplemental enhancement information.

＜態様２６＞<Aspect 26>
前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが独立ランダムアクセスピクチャであり、 the picture of the video preceding the dependent random access picture is an independent random access picture;
前記ビデオデータストリームは、前記独立ランダムアクセスピクチャのピクチャヘッダ内に所定の値（０）を有するフラグ（ｐｈ＿ｐｉｃ＿ｏｕｔｐｕｔ＿ｆｌａｇ）を含み、それにより、前記フラグ（ｐｈ＿ｐｉｃ＿ｏｕｔｐｕｔ＿ｆｌａｇ）の前記所定の値（０）は、前記ビデオデータストリーム内の前記従属ランダムアクセスピクチャに直接先行する前記独立ランダムアクセスピクチャについて、前記独立ランダムアクセスピクチャが出力されるべきでないことを示し、 the video data stream includes, in a picture header of the independent random access picture, a flag (ph_pic_output_flag) having a predetermined value (0), so that the predetermined value (0) of the flag (ph_pic_output_flag) indicates, for the independent random access picture that directly precedes the dependent random access picture in the video data stream, that the independent random access picture is not to be output,
前記ビデオデコーダ（３００）は、前記フラグに応じて前記ビデオを復号するように構成される、 the video decoder (300) is configured to decode the video in response to the flag;
態様２４に記載のビデオデコーダ（３００）。 25. The video decoder (300) of aspect 24.

＜態様２７＞<Aspect 27>
前記フラグが第１のフラグであり、前記ビデオデータストリームは、前記ビデオデータストリームのピクチャパラメータセット内に更なるフラグを含み、前記更なるフラグは、前記第１のフラグ（ｐｈ＿ｐｉｃ＿ｏｕｔｐｕｔ＿ｆｌａｇ）が前記独立ランダムアクセスピクチャの前記ピクチャヘッダ内に存在するか否かを示し、 the flag is a first flag, and the video data stream includes a further flag in a picture parameter set of the video data stream, the further flag indicating whether the first flag (ph_pic_output_flag) is present in the picture header of the independent random access picture,
前記ビデオデコーダ（３００）は、前記更なるフラグに応じて前記ビデオを復号するように構成される、 said video decoder (300) being configured to decode said video in response to said further flag;
態様２６に記載のビデオデコーダ（３００）。 27. The video decoder (300) of aspect 26.

＜態様２８＞<Aspect 28>
前記ビデオデータストリームは、前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが出力されるべきか否かを示す前記表示として、 wherein the video data stream includes, as the indication of whether the picture of the video preceding the dependent random access picture is to be output,
前記出力ビデオデータストリームの補足拡張情報内の補足拡張情報フラグ、又は、 a supplemental enhancement information flag within the supplemental enhancement information of said output video data stream; or
前記出力ビデオデータストリームのピクチャパラメータセット内のピクチャパラメータセットフラグ、又は、 a picture parameter set flag within a picture parameter set of said output video data stream; or
前記出力ビデオデータストリームのシーケンスパラメータセット内のシーケンスパラメータセットフラグ、又は、 a sequence parameter set flag within a sequence parameter set of said output video data stream; or
外部手段フラグであって、前記外部手段フラグの値が装置（２００）の外部にある外部ユニットによって設定される、外部手段フラグ、 an external means flag, the value of said external means flag being set by an external unit external to the device (200);
を含み、 including
前記ビデオデコーダ（３００）は、前記ビデオデータストリーム内の前記表示に応じて前記ビデオを復号するように構成される、 the video decoder (300) is configured to decode the video in response to the representation in the video data stream;
態様２４に記載のビデオデコーダ（３００）。 25. The video decoder (300) of aspect 24.

＜態様２９＞<Aspect 29>
前記ビデオデコーダ（３００）は、前記ビデオデータストリームから前記ビデオを再構成するように構成され、 said video decoder (300) configured to reconstruct said video from said video data stream;
前記ビデオデコーダ（３００）は、前記第１の変数（ＮｏＯｕｔｐｕｔＢｅｆｏｒｅＤｒａｐＦｌａｇ）に応じて、前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャを出力し又は出力しないように構成される、 said video decoder (300) is configured to output or not output said picture of said video preceding said dependent random access picture, depending on said first variable (NoOutputBeforeDrapFlag);
態様２３～２８のいずれかに記載のビデオデコーダ（３００）。 29. A video decoder (300) according to any of aspects 23-28.

＜態様３０＞<Aspect 30>
前記ビデオデコーダ（３００）は、前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャに関する第２の変数（ＰｉｃｔｕｒｅＯｕｔｐｕｔＦｌａｇ）の値を前記第１の変数（ＮｏＯｕｔｐｕｔＢｅｆｏｒｅＤｒａｐＦｌａｇ）に応じて決定するように構成され、前記第２の変数（ＰｉｃｔｕｒｅＯｕｔｐｕｔＦｌａｇ）は、前記ピクチャが出力されるべきか否かを前記ピクチャについて示し、前記装置（２００）は、前記第２の変数（ＰｉｃｔｕｒｅＯｕｔｐｕｔＦｌａｇ）に応じて前記ピクチャを出力し又は出力しないように構成される、態様２３～２９のいずれかに記載のビデオデコーダ（３００）。 said video decoder (300) is configured to determine a value of a second variable (PictureOutputFlag) for said picture of said video preceding said dependent random access picture as a function of said first variable (NoOutputBeforeDrapFlag); The second variable (PictureOutputFlag) indicates for the picture whether the picture should be output, and the device (200) outputs the picture depending on the second variable (PictureOutputFlag) or 30. The video decoder (300) of any of aspects 23-29, wherein the video decoder (300) is configured not to output.

＜態様３１＞<Aspect 31>
前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが独立ランダムアクセスピクチャであり、 the picture of the video preceding the dependent random access picture is an independent random access picture;
前記ビデオデコーダ（３００）は、前記独立ランダムアクセスピクチャが出力されるべきでないことを示す前記第１の変数（ＮｏＯｕｔｐｕｔＢｅｆｏｒｅＤｒａｐＦｌａｇ）に応じて前記ビデオを復号するように構成される、態様２３～３０のいずれかに記載のビデオデコーダ（３００）。 The video decoder (300) of any of aspects 23-30, wherein the video decoder (300) is configured to decode the video depending on the first variable (NoOutputBeforeDrapFlag), which indicates that the independent random access picture is not to be output.

＜態様３２＞<Aspect 32>
前記従属ランダムアクセスピクチャに先行する前記ビデオの前記ピクチャが独立ランダムアクセスピクチャであり、 the picture of the video preceding the dependent random access picture is an independent random access picture;
前記ビデオデコーダ（３００）は、前記独立ランダムアクセスピクチャが出力されるべきであることを示す前記第１の変数（ＮｏＯｕｔｐｕｔＢｅｆｏｒｅＤｒａｐＦｌａｇ）に応じて前記ビデオを復号するように構成される、態様２３～３０のいずれかに記載のビデオデコーダ（３００）。 The video decoder (300) of any of aspects 23-30, wherein the video decoder (300) is configured to decode the video depending on the first variable (NoOutputBeforeDrapFlag), which indicates that the independent random access picture is to be output.

＜態様３３＞<Aspect 33>
態様１～１１のいずれかに記載の装置（２００）と、 A device (200) according to any of aspects 1-11;
態様２２～３２のいずれかに記載のビデオデコーダ（３００）と、 A video decoder (300) according to any of aspects 22-32;
を備え、 with
態様２２～３２のいずれかに記載のビデオデコーダ（３００）は、態様１～１１のいずれかに記載の装置（２００）の出力ビデオデータストリームを受信するように構成され、 A video decoder (300) according to any of aspects 22-32, configured to receive an output video data stream of the apparatus (200) according to any of aspects 1-11,
態様２２～３２のいずれかに記載のビデオデコーダ（３００）は、態様１～１１のいずれかに記載の装置（２００）の前記出力ビデオデータストリームから前記ビデオを復号するように構成される、 the video decoder (300) according to any of aspects 22-32 is configured to decode the video from the output video data stream of the apparatus (200) according to any of aspects 1-11,
システム。 system.

＜態様３４＞<Aspect 34>
前記システムは、態様１７～２１のいずれかに記載のビデオエンコーダ（１００）を更に備え、 The system further comprising a video encoder (100) according to any of aspects 17-21;
態様１～１１のいずれかに記載の装置（２００）は、前記入力ビデオデータストリームとして、態様１７～２１のいずれかに記載のビデオエンコーダ（１００）から前記ビデオデータストリームを受信するように構成される、 the apparatus (200) according to any of aspects 1-11 is configured to receive, as the input video data stream, the video data stream from the video encoder (100) according to any of aspects 17-21,
態様３３に記載のシステム。 34. The system of aspect 33.

＜態様３５＞<Aspect 35>
入力ビデオデータストリームを受信するための方法であって、前記入力ビデオデータストリームにはビデオが符号化され、 A method for receiving an input video data stream, wherein a video is encoded in the input video data stream,
前記方法は、前記入力ビデオデータストリームから出力ビデオデータストリームを生成するステップを含み、 The method includes generating an output video data stream from the input video data stream;
前記方法は、従属ランダムアクセスピクチャに先行する前記ビデオのピクチャが出力されるべきか否かを決定するステップを含む、 The method includes determining whether pictures of the video preceding dependent random access pictures should be output.
方法。 Method.

＜態様３６＞<Aspect 36>
ビデオをビデオデータストリームに符号化するための方法であって、 A method for encoding video into a video data stream, comprising:
前記方法は、従属ランダムアクセスピクチャに先行する前記ビデオのピクチャが出力されるべきか否かを示す表示を前記ビデオデータストリームが含むように前記ビデオデータストリームを生成するステップを含む、 The method includes generating the video data stream such that the video data stream includes an indication of whether pictures of the video preceding dependent random access pictures are to be output.
方法。 Method.

＜態様３７＞<Aspect 37>
ビデオを格納したビデオデータストリームを受信するための方法であって、 A method for receiving a video data stream containing video, comprising:
前記方法は、前記ビデオデータストリームから前記ビデオを復号するステップを含み、 The method includes decoding the video from the video data stream;
前記ビデオの復号する前記ステップは、従属ランダムアクセスピクチャに先行する前記ビデオのピクチャが出力されるべきか否かを示す表示に応じて行なわれる、 said step of decoding said video is performed in response to an indication indicating whether pictures of said video preceding dependent random access pictures are to be output;
方法。 Method.

＜態様３８＞<Aspect 38>
コンピュータ又は信号プロセッサで実行されるときに態様３５～３７のいずれかに記載の方法を実施するためのコンピュータプログラム。 A computer program for implementing the method of any of aspects 35-37 when executed on a computer or signal processor.

＜態様３９＞<Aspect 39>
１つ以上の入力ビデオデータストリームを受信するための装置（２００）であって、前記１つ以上の入力ビデオデータストリームのそれぞれには入力ビデオが符号化され、 An apparatus (200) for receiving one or more input video data streams, each of said one or more input video data streams having an encoded input video,
前記装置（２００）は、前記１つ以上の入力ビデオデータストリームから出力ビデオデータストリームを生成するように構成され、前記出力ビデオデータストリームが出力ビデオを符号化し、前記装置は、前記出力ビデオが前記１つ以上の入力ビデオデータストリームのうちの１つで符号化されている前記入力ビデオであるように、又は、前記出力ビデオが前記１つ以上の入力ビデオデータストリームのうちの少なくとも１つの前記入力ビデオに依存するように、前記出力ビデオデータストリームを生成するように構成され、 the apparatus (200) is configured to generate an output video data stream from the one or more input video data streams, the output video data stream encoding an output video, wherein the apparatus is configured to generate the output video data stream such that the output video is the input video encoded in one of the one or more input video data streams, or such that the output video depends on the input video of at least one of the one or more input video data streams,
前記装置（２００）は、符号化ピクチャバッファからの前記出力ビデオの複数のピクチャの現在のピクチャのアクセスユニット除去時間を決定するように構成され、 The apparatus (200) is configured to determine a current picture access unit removal time of a plurality of pictures of the output video from a coded picture buffer;
前記装置（２００）は、前記符号化ピクチャバッファからの前記現在のピクチャの前記アクセスユニット除去時間を決定するために符号化ピクチャバッファ遅延オフセット情報を使用するか否かを決定するように構成される、 The apparatus (200) is configured to determine whether to use coded picture buffer delay offset information to determine the access unit removal time of the current picture from the coded picture buffer. ,
装置（２００）。 A device (200).

＜態様４０＞<Aspect 40>
前記装置（２００）は、前記出力ビデオデータストリームを生成するために、前記１つ以上の入力ビデオデータストリームの第１ビデオデータストリームの前記入力ビデオの１つ以上のピクチャのグループをドロップするように構成され、 The apparatus (200) is configured to drop groups of one or more pictures of the input video of a first video data stream of the one or more input video data streams to generate the output video data stream. configured,
前記装置（２００）は、前記符号化ピクチャバッファ遅延オフセット情報に応じて、前記符号化ピクチャバッファからの前記出力ビデオの前記複数のピクチャのうちの少なくとも１つに関するアクセスユニット除去時間を決定するように構成される、 the apparatus (200) is configured to determine, depending on the coded picture buffer delay offset information, an access unit removal time from the coded picture buffer for at least one of the plurality of pictures of the output video,
態様３９に記載の装置（２００）。 40. Apparatus (200) according to aspect 39.

＜態様４１＞<Aspect 41>
前記装置（２００）によって受信された前記第１のビデオは、処理されたビデオを生成するために１つ以上のピクチャのグループがドロップされた元のビデオから生じる前処理されたビデオであり、 said first video received by said device (200) being a preprocessed video resulting from an original video in which one or more groups of pictures have been dropped to produce a processed video;
前記装置（２００）は、前記符号化ピクチャバッファ遅延オフセット情報に応じて前記符号化ピクチャバッファからの前記出力ビデオの前記複数のピクチャのうちの少なくとも１つのアクセスユニット除去時間を決定するように構成される、 the apparatus (200) is configured to determine, depending on the coded picture buffer delay offset information, an access unit removal time from the coded picture buffer for at least one of the plurality of pictures of the output video,
態様３９に記載の装置（２００）。 40. Apparatus (200) according to aspect 39.

＜態様４２＞<Aspect 42>
前記バッファ遅延オフセット情報は、ドロップされた前記入力ビデオのピクチャ数に依存する、態様４０又は４１に記載の装置（２００）。 42. The apparatus (200) of aspect 40 or 41, wherein the buffer delay offset information is dependent on the number of dropped pictures of the input video.

＜態様４３＞<Aspect 43>
前記１つ以上の入力ビデオデータストリームが２つ以上の入力ビデオデータストリームであり、 said one or more input video data streams being two or more input video data streams;
前記装置（２００）は、前記出力ビデオを取得するために、前記処理されたビデオと、前記２つ以上の入力ビデオデータストリームのうちの第２のビデオデータストリームの前記入力ビデオとをスプライシングするように構成されるとともに、前記出力ビデオを前記出力ビデオデータストリームに符号化するように構成される、 The apparatus (200) is configured to splice the processed video and the input video of a second one of the two or more input video data streams to obtain the output video. and configured to encode the output video into the output video data stream;
態様４０～４２のいずれかに記載の装置（２００）。 43. The apparatus (200) of any of aspects 40-42.

＜態様４４＞<Aspect 44>
前記装置（２００）は、前記出力ビデオ内の前記現在のピクチャの位置に応じて前記現在のピクチャの前記アクセスユニット除去時間を決定するために符号化ピクチャバッファ遅延オフセット情報を使用するか否かを決定するように構成され、又は、 The apparatus (200) determines whether to use coded picture buffer delay offset information to determine the access unit removal time of the current picture according to the position of the current picture within the output video. configured to determine, or
前記装置（２００）は、前記出力ビデオ内の前記現在のピクチャの前記位置に応じて前記現在のピクチャの前記アクセスユニット除去時間を決定するために前記符号化ピクチャバッファ遅延オフセット情報の符号化ピクチャバッファ遅延オフセット値を０に設定するか否かを決定するように構成される、 the apparatus (200) is configured to determine, depending on the position of the current picture within the output video, whether to set a coded picture buffer delay offset value of the coded picture buffer delay offset information to 0 for determining the access unit removal time of the current picture,
態様４３に記載の装置（２００）。 44. Apparatus (200) according to aspect 43.

＜態様４５＞<Aspect 45>
前記装置（２００）は、前記出力ビデオ内の前記現在のピクチャに先行する前の廃棄不可能ピクチャの位置に応じて、前記現在のピクチャの前記アクセスユニット除去時間を決定するために符号化ピクチャバッファ遅延オフセット情報を使用するか否かを決定するように構成される、態様４３又は４４に記載の装置（２００）。 The apparatus (200) of aspect 43 or 44, wherein the apparatus (200) is configured to determine, depending on the position of a previous non-discardable picture preceding the current picture within the output video, whether to use the coded picture buffer delay offset information to determine the access unit removal time of the current picture.

＜態様４６＞<Aspect 46>
前記装置（２００）は、前記出力ビデオ内の前記現在のピクチャに先行する前の前記廃棄不可能ピクチャが前のバッファリング期間内の最初のピクチャであるか否かに応じて、前記現在のピクチャの前記アクセスユニット除去時間を決定するために符号化ピクチャバッファ遅延オフセット情報を使用するか否かを決定するように構成される、態様４５に記載の装置（２００）。 The apparatus (200) of aspect 45, wherein the apparatus (200) is configured to determine, depending on whether the previous non-discardable picture preceding the current picture within the output video is the first picture in a previous buffering period, whether to use the coded picture buffer delay offset information to determine the access unit removal time of the current picture.

＜態様４７＞<Aspect 47>
前記装置（２００）は、連結フラグに応じて前記現在のピクチャの前記アクセスユニット除去時間を決定するために符号化ピクチャバッファ遅延オフセット情報を使用するか否かを決定するように構成され、前記現在のピクチャが前記第２のビデオデータストリームの前記入力ビデオの最初のピクチャである、態様４３～４６のいずれかに記載の装置（２００）。 The apparatus (200) of any of aspects 43-46, wherein the apparatus (200) is configured to determine, depending on a concatenation flag, whether to use the coded picture buffer delay offset information to determine the access unit removal time of the current picture, the current picture being the first picture of the input video of the second video data stream.

＜態様４８＞<Aspect 48>
前記装置（２００）は、前のピクチャの除去時間に応じて前記現在のピクチャの前記アクセスユニット除去時間を決定するように構成される、態様３９～４７のいずれかに記載の装置（２００）。 48. The apparatus (200) according to any of aspects 39-47, wherein the apparatus (200) is configured to determine the access unit removal time for the current picture according to a removal time for a previous picture.

＜態様４９＞<Aspect 49>
前記装置（２００）は、初期符号化ピクチャバッファ除去遅延情報に応じて前記現在のピクチャの前記アクセスユニット除去時間を決定するように構成される、態様３９～４８のいずれかに記載の装置（２００）。 49. The apparatus (200) according to any of aspects 39-48, wherein the apparatus (200) is configured to determine the access unit removal time for the current picture in dependence on initial coded picture buffer removal delay information. ).

＜態様５０＞<Aspect 50>
前記装置（２００）は、一時的な符号化ピクチャバッファ除去遅延情報を取得して前記現在のピクチャの前記アクセスユニット除去時間を決定するために、クロックティックに応じて前記初期符号化ピクチャバッファ除去遅延情報を更新するように構成される、態様４９に記載の装置（２００）。 The apparatus (200) is configured to obtain temporary coded picture buffer removal delay information to determine the access unit removal time of the current picture, the initial coded picture buffer removal delay according to clock ticks. 50. Apparatus (200) according to aspect 49, configured to update information.

＜態様５１＞<Aspect 51>
前記連結フラグが第１の値に設定される場合、前記装置（２００）は、前記符号化ピクチャバッファ遅延オフセット情報を使用して１つ以上の除去時間を決定するように構成され、 if said concatenation flag is set to a first value, said apparatus (200) is configured to determine one or more removal times using said coded picture buffer delay offset information;
前記連結フラグが前記第１の値とは異なる第２の値に設定される場合、前記装置（２００）は、前記１つ以上の除去時間を決定するために前記符号化ピクチャバッファ遅延オフセット情報を使用しないように構成される、態様４７に記載の装置（２００）。 if the concatenation flag is set to a second value different from the first value, the apparatus (200) is configured not to use the coded picture buffer delay offset information to determine the one or more removal times. The apparatus (200) of aspect 47.
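
The flag-controlled use of the offset in aspects 47 to 51 can be sketched in Python. This is a simplified illustration, not the normative timing derivation: the function name, all parameter names, the choice of 1 as the "first value", and the formula `base_time + clock_tick * (removal_delay - offset)` are assumptions, loosely modelled on the way aspects 48 to 50 combine a previous removal time, removal delay information, and clock ticks.

```python
def au_removal_time(base_time: int,
                    removal_delay: int,
                    clock_tick: int,
                    cpb_delay_offset: int,
                    concatenation_flag: int,
                    first_value: int = 1) -> int:
    """Return an access unit removal time in arbitrary integer time units.

    base_time: removal time of the reference (previous) picture.
    removal_delay: CPB removal delay of the current picture, in clock ticks.
    clock_tick: duration of one clock tick.
    cpb_delay_offset: coded picture buffer delay offset, in clock ticks.
    concatenation_flag: flag deciding whether the offset is applied.
    first_value: flag value that enables the offset (assumed to be 1 here).
    """
    # Apply the offset only when the flag carries the "first value";
    # for any other value the offset is ignored (treated as 0).
    offset = cpb_delay_offset if concatenation_flag == first_value else 0
    return base_time + clock_tick * (removal_delay - offset)
```

With a base time of 100, a removal delay of 10 ticks, a tick length of 2 and an offset of 3 ticks, the sketch gives 114 when the flag has the first value and 120 otherwise, so the offset shortens the wait by the time attributed to dropped pictures.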

＜態様５２＞<Aspect 52>
前記装置（２００）は、前記符号化ピクチャバッファからの前記現在のピクチャの前記アクセスユニット除去時間を決定するために前記符号化ピクチャバッファ遅延オフセット情報を使用するか否かをビデオデコーダ（３００）にシグナルするように構成される、態様３９～５１のいずれかに記載の装置（２００）。 The apparatus (200) instructs a video decoder (300) whether to use the coded picture buffer delay offset information to determine the access unit removal time of the current picture from the coded picture buffer. 52. Apparatus (200) according to any of aspects 39-51, configured to signal.

＜態様５３＞<Aspect 53>
前記現在のピクチャは、２つの入力ビデオがスプライシングされた前記出力ビデオのスプライシングポイントに位置される、態様５２に記載の装置（２００）。 53. The apparatus (200) of aspect 52, wherein the current picture is located at a splicing point of the output video to which two input videos were spliced.

＜態様５４＞<Aspect 54>
ビデオデータストリームであって、 a video data stream,
前記ビデオデータストリームにはビデオが符号化され、 a video is encoded in the video data stream;
前記ビデオデータストリームが符号化ピクチャバッファ遅延オフセット情報を含む、 wherein the video data stream includes coded picture buffer delay offset information;
ビデオデータストリーム。 video data stream.

＜態様５５＞<Aspect 55>
前記ビデオデータストリームが連結フラグを含む、態様５４に記載のビデオデータストリーム。 55. A video data stream according to aspect 54, wherein said video data stream includes a concatenation flag.

＜態様５６＞<Aspect 56>
前記ビデオデータストリームが初期符号化ピクチャバッファ除去遅延情報を含む、態様５４又は５５に記載のビデオデータストリーム。 56. A video data stream according to aspect 54 or 55, wherein said video data stream comprises initial coded picture buffer removal delay information.

＜態様５７＞<Aspect 57>
前記連結フラグが第１の値に設定される場合、前記連結フラグは、１つ以上の除去時間を決定するために前記符号化ピクチャバッファ遅延オフセット情報を使用する必要があることを示し、 if the concatenation flag is set to a first value, the concatenation flag indicates that the coded picture buffer delay offset information needs to be used to determine one or more removal times,
前記連結フラグが前記第１の値とは異なる第２の値に設定される場合、前記連結フラグは、前記１つ以上の除去時間を決定するために前記示されたオフセットが使用されないことを示す、 When the concatenation flag is set to a second value different from the first value, the concatenation flag indicates that the indicated offset is not used to determine the one or more removal times. ,
態様５５に記載のビデオデータストリーム。 56. A video data stream according to aspect 55.

＜態様５８＞<Aspect 58>
ビデオエンコーダ（１００）であって、 A video encoder (100) comprising:
前記ビデオエンコーダ（１００）がビデオをビデオデータストリームに符号化するように構成され、 said video encoder (100) configured to encode video into a video data stream;
前記ビデオエンコーダ（１００）は、前記ビデオデータストリームが符号化ピクチャバッファ遅延オフセット情報を含むように前記ビデオデータストリームを生成するように構成される、 The video encoder (100) is configured to generate the video data stream such that the video data stream includes encoded picture buffer delay offset information.
ビデオエンコーダ（１００）。 A video encoder (100).

＜態様５９＞<Aspect 59>
前記ビデオエンコーダ（１００）は、前記ビデオデータストリームが連結フラグを含むように前記ビデオデータストリームを生成するように構成される、態様５８に記載のビデオエンコーダ（１００）。 59. The video encoder (100) of aspect 58, wherein the video encoder (100) is configured to generate the video data stream such that the video data stream includes a concatenation flag.

＜態様６０＞<Aspect 60>
前記ビデオエンコーダ（１００）は、前記ビデオデータストリームが符号化ピクチャバッファ遅延オフセット情報を含むように前記ビデオデータストリームを生成するように構成される、態様５８又は５９に記載のビデオエンコーダ（１００）。 60. The video encoder (100) according to aspect 58 or 59, wherein the video encoder (100) is configured to generate the video data stream such that the video data stream includes coded picture buffer delay offset information.

＜態様６１＞<Aspect 61>
前記前記連結フラグが第１の値に設定される場合、前記連結フラグは、１つ以上の除去時間を決定するために前記符号化ピクチャバッファ遅延オフセット情報を使用する必要があることを示し、 if the concatenation flag is set to a first value, the concatenation flag indicates that the coded picture buffer delay offset information is to be used to determine one or more removal times, and
前記連結フラグが前記第１の値とは異なる第２の値に設定される場合、前記連結フラグは、前記１つ以上の除去時間を決定するために前記示されたオフセットが使用されないことを示す、 if the concatenation flag is set to a second value different from the first value, the concatenation flag indicates that the indicated offset is not used to determine the one or more removal times,
態様５９に記載のビデオエンコーダ（１００）。 61. The video encoder (100) according to aspect 59.

＜態様６２＞<Aspect 62>
ビデオが格納されているビデオデータストリームを受信するためのビデオデコーダ（３００）であって、 A video decoder (300) for receiving a video data stream in which video is stored, comprising:
前記ビデオデコーダ（３００）は、前記ビデオデータストリームから前記ビデオを復号するように構成され、 said video decoder (300) configured to decode said video from said video data stream;
前記ビデオデコーダ（３００）は、符号化ピクチャバッファからの前記ビデオの複数のピクチャの現在のピクチャのアクセスユニット除去時間に応じて前記ビデオを復号するように構成され、 the video decoder (300) is configured to decode the video depending on an access unit removal time of a current picture of a plurality of pictures of the video from a coded picture buffer,
前記ビデオデコーダ（３００）は、前記符号化ピクチャバッファからの前記現在のピクチャの前記アクセスユニット除去時間を決定するために前記符号化ピクチャバッファ遅延オフセット情報を使用するか否かを示す表示に応じて前記ビデオを復号するように構成される、 the video decoder (300) is configured to decode the video depending on an indication of whether to use the coded picture buffer delay offset information to determine the access unit removal time of the current picture from the coded picture buffer,
ビデオデコーダ（３００）。 A video decoder (300).

＜態様６３＞<Aspect 63>
前記符号化ピクチャバッファからの前記ビデオの前記複数のピクチャのうちの少なくとも１つの前記アクセスユニット除去時間は、前記符号化ピクチャバッファ遅延オフセット情報に依存する、態様６２に記載のビデオデコーダ（３００）。 63. The video decoder (300) of aspect 62, wherein the access unit removal time of at least one of the plurality of pictures of the video from the coded picture buffer is dependent on the coded picture buffer delay offset information.

＜態様６４＞<Aspect 64>
前記ビデオデコーダ（３００）は、前記ビデオ内の前記現在のピクチャの位置に応じて前記現在のピクチャの前記アクセスユニット除去時間を決定するために符号化ピクチャバッファ遅延オフセット情報を使用するか否かに応じて前記ビデオを復号するように構成される、態様６２又は６３に記載のビデオデコーダ（３００）。 64. The video decoder (300) according to aspect 62 or 63, wherein the video decoder (300) is configured to decode the video depending on whether coded picture buffer delay offset information is used to determine the access unit removal time of the current picture, depending on the position of the current picture within the video.

＜態様６５＞<Aspect 65>
前記ビデオデコーダ（３００）は、前記符号化ピクチャバッファ遅延オフセット情報の符号化ピクチャバッファ遅延オフセット値が０に設定されるか否かに応じて前記ビデオを復号するように構成される、態様６２又は６３に記載のビデオデコーダ（３００）。 65. The video decoder (300) according to aspect 62 or 63, wherein the video decoder (300) is configured to decode the video depending on whether a coded picture buffer delay offset value of the coded picture buffer delay offset information is set to 0.

＜態様６６＞<Aspect 66>
前記ビデオデコーダ（３００）は、前記ビデオ内の前記現在のピクチャに先行する前の廃棄不可能ピクチャの位置に応じて、前記現在のピクチャの前記アクセスユニット除去時間を決定するために符号化ピクチャバッファ遅延オフセット情報を使用するか否かを決定するように構成される、態様６２～６５のいずれかに記載のビデオデコーダ（３００）。 66. The video decoder (300) according to any of aspects 62-65, wherein the video decoder (300) is configured to determine whether to use coded picture buffer delay offset information to determine the access unit removal time of the current picture, depending on the position of a previous non-discardable picture that precedes the current picture within the video.

＜態様６７＞<Aspect 67>
前記ビデオデコーダ（３００）は、前記ビデオ内の前記現在のピクチャに先行する前の廃棄不可能ピクチャが前のバッファリング期間内の最初のピクチャであるか否かに応じて、前記現在のピクチャの前記アクセスユニット除去時間を決定するために符号化ピクチャバッファ遅延オフセット情報を使用するか否かを決定するように構成される、態様６６に記載のビデオデコーダ（３００）。 67. The video decoder (300) according to aspect 66, wherein the video decoder (300) is configured to determine whether to use coded picture buffer delay offset information to determine the access unit removal time of the current picture, depending on whether the previous non-discardable picture that precedes the current picture within the video is the first picture in a previous buffering period.

＜態様６８＞<Aspect 68>
前記ビデオデコーダ（３００）は、連結フラグに応じて、前記現在のピクチャの前記アクセスユニット除去時間を決定するために符号化ピクチャバッファ遅延オフセット情報を使用するか否かを決定するように構成され、前記現在のピクチャは、前記第２のビデオデータストリームの前記入力ビデオの最初のピクチャである、態様６２～６７のいずれかに記載のビデオデコーダ（３００）。 68. The video decoder (300) according to any of aspects 62-67, wherein the video decoder (300) is configured to determine, depending on a concatenation flag, whether to use coded picture buffer delay offset information to determine the access unit removal time of the current picture, the current picture being the first picture of the input video of the second video data stream.

＜態様６９＞<Aspect 69>
前記ビデオデコーダ（３００）は、前のピクチャの除去時間に応じて前記現在のピクチャの前記アクセスユニット除去時間を決定するように構成される、態様６２～６８のいずれかに記載のビデオデコーダ（３００）。 69. The video decoder (300) according to any of aspects 62-68, wherein the video decoder (300) is configured to determine the access unit removal time of the current picture depending on a removal time of a previous picture.

＜態様７０＞<Aspect 70>
前記ビデオデコーダ（３００）は、初期符号化ピクチャバッファ除去遅延情報に応じて前記現在のピクチャの前記アクセスユニット除去時間を決定するように構成される、態様６２～６９のいずれかに記載のビデオデコーダ（３００）。 70. The video decoder (300) according to any of aspects 62-69, wherein the video decoder (300) is configured to determine the access unit removal time of the current picture depending on initial coded picture buffer removal delay information.

＜態様７１＞<Aspect 71>
前記ビデオデコーダ（３００）は、クロックティックに応じて前記初期符号化ピクチャバッファ除去遅延情報を更新して、一時的な符号化ピクチャバッファ除去遅延情報を取得し、前記現在のピクチャの前記アクセスユニット除去時間を決定するように構成される、態様７０に記載のビデオデコーダ（３００）。 71. The video decoder (300) according to aspect 70, wherein the video decoder (300) is configured to update the initial coded picture buffer removal delay information depending on a clock tick to obtain temporary coded picture buffer removal delay information, and to determine the access unit removal time of the current picture.

＜態様７２＞<Aspect 72>
前記連結フラグが第１の値に設定される場合、前記ビデオデコーダ（３００）は、１つ以上の除去時間を決定するために前記符号化ピクチャバッファ遅延オフセット情報を使用するように構成され、 72. The video decoder (300) according to aspect 68, wherein, if the concatenation flag is set to a first value, the video decoder (300) is configured to use the coded picture buffer delay offset information to determine one or more removal times, and
前記連結フラグが前記第１の値とは異なる第２の値に設定される場合、前記ビデオデコーダ（３００）は、前記１つ以上の除去時間を決定するために前記符号化ピクチャバッファ遅延オフセット情報を使用しないように構成される、態様６８に記載のビデオデコーダ（３００）。 if the concatenation flag is set to a second value different from the first value, the video decoder (300) is configured not to use the coded picture buffer delay offset information to determine the one or more removal times.
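
The flag-dependent behaviour of aspects 68 and 72 can be illustrated with a minimal sketch. All names below (`TimingInfo`, `removal_time`, the field names) are hypothetical, the encoding of the first value as 1 is an assumption, and subtracting the offset from the signalled delay is an illustrative choice loosely modelled on HEVC-style HRD timing; the aspects themselves only state whether the offset is used.

```python
from dataclasses import dataclass


@dataclass
class TimingInfo:
    """Hypothetical container for timing fields parsed from the stream."""
    concatenation_flag: int  # assumed encoding: 1 = first value, 0 = second value
    cpb_delay_offset: int    # coded picture buffer delay offset, in clock ticks
    removal_delay: int       # signalled removal delay, in clock ticks


def removal_time(prev_removal_time: float, info: TimingInfo, clock_tick: float) -> float:
    """Return an access unit removal time relative to a previous picture.

    The offset is applied only when the flag has the first value; how it is
    applied (here: subtracted from the delay) is an assumption of this sketch.
    """
    delay = info.removal_delay
    if info.concatenation_flag == 1:
        delay -= info.cpb_delay_offset
    return prev_removal_time + clock_tick * delay
```

With a previous removal time of 1.0, a clock tick of 0.5, a delay of 5 ticks and an offset of 2 ticks, the sketch yields a different removal time depending only on the flag, which is the distinction the aspects draw.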

＜態様７３＞<Aspect 73>
態様３９～５３のいずれかに記載の装置（２００）と、 A device (200) according to any of aspects 39-53;
態様６２～７２のいずれかに記載のビデオデコーダ（３００）と、 A video decoder (300) according to any of aspects 62-72;
を備え、 with
態様６２～７２のいずれかに記載のビデオデコーダ（３００）は、態様３９～５３のいずれかに記載の装置（２００）の出力ビデオデータストリームを受信するように構成され、 A video decoder (300) according to any of aspects 62-72, configured to receive an output video data stream of an apparatus (200) according to any of aspects 39-53,
態様６２～７２のいずれかに記載のビデオデコーダ（３００）は、態様３９～５３のいずれかに記載の装置（２００）の前記出力ビデオデータストリームから前記ビデオを復号するように構成される、 The video decoder (300) according to any of aspects 62-72, configured to decode said video from said output video data stream of the apparatus (200) according to any of aspects 39-53,
システム。 system.

＜態様７４＞<Aspect 74>
前記システムは、態様５８～６１のいずれかに記載のビデオエンコーダ（１００）を更に備え、 The system further comprising a video encoder (100) according to any of aspects 58-61;
態様３９～５３のいずれかに記載の装置（２００）は、前記入力ビデオデータストリームとして、態様５８～６１のいずれかに記載のビデオエンコーダ（１００）から前記ビデオデータストリームを受信するように構成される、 the apparatus (200) according to any of aspects 39-53 is configured to receive, as the input video data stream, the video data stream from the video encoder (100) according to any of aspects 58-61,
態様７３に記載のシステム。 74. The system of aspect 73.

＜態様７５＞<Aspect 75>
１つ以上の入力ビデオデータストリームを受信するための方法であって、前記１つ以上の入力ビデオデータストリームのそれぞれには入力ビデオが符号化され、 A method for receiving one or more input video data streams, each of said one or more input video data streams having an encoded input video;
前記方法は、前記１つ以上の入力ビデオデータストリームから出力ビデオデータストリームを生成するステップを含み、前記出力ビデオデータストリームが出力ビデオを符号化し、前記出力ビデオデータストリームを生成するステップは、前記出力ビデオが前記１つ以上の入力ビデオデータストリームのうちの１つの入力ビデオデータストリーム内で符号化されている前記入力ビデオであるように、又は、前記出力ビデオが前記１つ以上の入力ビデオデータストリームのうちの少なくとも１つの前記入力ビデオに依存するように行なわれ、 the method includes generating an output video data stream from the one or more input video data streams, wherein the output video data stream has an output video encoded therein, and wherein generating the output video data stream is performed such that the output video is the input video encoded within one of the one or more input video data streams, or such that the output video depends on the input video of at least one of the one or more input video data streams,
前記方法は、符号化ピクチャバッファからの前記出力ビデオの複数のピクチャのうちの現在のピクチャのアクセスユニット除去時間を決定するステップを含み、 The method includes determining an access unit removal time for a current picture of a plurality of pictures of the output video from a coded picture buffer;
前記方法は、前記符号化ピクチャバッファからの前記現在のピクチャの前記アクセスユニット除去時間を決定するために符号化ピクチャバッファ遅延オフセット情報を使用するか否かを決定するステップを含む、 The method includes determining whether to use coded picture buffer delay offset information to determine the access unit removal time of the current picture from the coded picture buffer;
方法。 Method.

＜態様７６＞<Aspect 76>
ビデオをビデオデータストリームに符号化するための方法であって、 A method for encoding video into a video data stream, comprising:
前記方法は、前記ビデオデータストリームが符号化ピクチャバッファ遅延オフセット情報を含むように前記ビデオデータストリームを生成するステップを含む、 the method includes generating the video data stream such that the video data stream includes coded picture buffer delay offset information,
方法。 Method.

＜態様７７＞<Aspect 77>
ビデオが格納されたビデオデータストリームを受信するための方法であって、 A method for receiving a video data stream in which video is stored, comprising:
前記方法は、前記ビデオデータストリームから前記ビデオを復号するステップを含み、 The method includes decoding the video from the video data stream;
前記ビデオを復号するステップは、符号化ピクチャバッファからの前記ビデオの複数のピクチャのうちの現在のピクチャのアクセスユニット除去時間に応じて行なわれ、 decoding the video is performed depending on an access unit removal time of a current picture of a plurality of pictures of the video from a coded picture buffer,
前記ビデオを復号するステップは、前記符号化ピクチャバッファからの前記現在のピクチャの前記アクセスユニット除去時間を決定するために符号化ピクチャバッファ遅延オフセット情報を使用するか否かを示す表示に応じて行なわれる、 decoding the video is performed depending on an indication of whether to use coded picture buffer delay offset information to determine the access unit removal time of the current picture from the coded picture buffer,
方法。 Method.

＜態様７８＞<Aspect 78>
コンピュータ又は信号プロセッサで実行されるときに態様７５～７７のいずれかに記載の方法を実施するためのコンピュータプログラム。 78. A computer program for implementing the method of any of aspects 75-77 when run on a computer or signal processor.

＜態様７９＞<Aspect 79>
ビデオデータストリームであって、 a video data stream,
前記ビデオデータストリームにはビデオが符号化され、 a video is encoded in the video data stream;
前記ビデオデータストリームが初期符号化ピクチャバッファ除去遅延を含み、 wherein the video data stream includes an initial coded picture buffer removal delay;
前記ビデオデータストリームが初期符号化ピクチャバッファ除去オフセットを含み、 the video data stream includes an initial coded picture buffer removal offset;
前記ビデオデータストリームは、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの和が２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す情報を含む、 the video data stream includes information indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over two or more buffering periods,
ビデオデータストリーム。 video data stream.

＜態様８０＞<Aspect 80>
前記初期符号化ピクチャバッファ除去遅延は、前記ビデオデコーダ（３００）を初期化する前記ビデオデータストリームのピクチャの最初のアクセスユニットに関して前記最初のアクセスユニットを前記ビデオデコーダ（３００）に送信する前に経過する必要がある時間を示す、態様７９に記載のビデオデータストリーム。 80. The video data stream according to aspect 79, wherein the initial coded picture buffer removal delay indicates, for the first access unit of a picture of the video data stream that initializes the video decoder (300), the time that needs to elapse before the first access unit is sent to the video decoder (300).

＜態様８１＞<Aspect 81>
前記ビデオデータストリームは、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す単一の表示を含む、態様８０に記載のビデオデータストリーム。 81. The video data stream according to aspect 80, wherein the video data stream includes a single indication of whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods.

＜態様８２＞<Aspect 82>
前記ビデオデータストリームは、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す連結フラグを前記単一の表示として含み、 the video data stream includes, as the single indication, a concatenation flag indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods,
前記連結フラグが第１の値に等しい場合、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和は、前記２つ以上のバッファリング期間にわたって一定であり、 if the concatenation flag is equal to a first value, the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is constant over the two or more buffering periods, and
前記連結フラグが前記第１の値と異なる場合、前記連結フラグは、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２以上のバッファリング期間にわたって一定であるか否かを定義しない、 if the concatenation flag differs from the first value, the concatenation flag does not define whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is constant over the two or more buffering periods,
態様８１に記載のビデオデータストリーム。 82. A video data stream according to aspect 81.
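
The condition signalled in aspects 79 to 82 can be sketched as a simple check. The function names and the representation of each buffering period as a `(delay, offset)` pair are assumptions made for illustration, as is the encoding of the first value as 1; the aspects do not prescribe how an encoder derives the flag.

```python
from typing import Iterable, Tuple


def sum_is_constant(periods: Iterable[Tuple[int, int]]) -> bool:
    """Check whether delay + offset is the same for every buffering period.

    Each element is a hypothetical (initial_removal_delay, initial_removal_offset)
    pair, one per buffering period.
    """
    sums = {delay + offset for delay, offset in periods}
    return len(sums) <= 1


def choose_flag(periods: Iterable[Tuple[int, int]]) -> int:
    """Return 1 (assumed first value) only when the sum is constant.

    Any other value leaves the constancy of the sum undefined, so 0 is
    returned as a conservative default in this sketch.
    """
    return 1 if sum_is_constant(periods) else 0
```

For example, the periods `(90000, 0)` and `(60000, 30000)` share the sum 90000, so the sketch would signal the first value, whereas `(90000, 0)` and `(60000, 0)` would not.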

＜態様８３＞<Aspect 83>
前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されることを前記単一の表示が示さない場合、前記ビデオデータストリームは、前記初期符号化ピクチャバッファ除去遅延情報に関する連続的に更新された情報と、前記初期符号化ピクチャバッファ除去オフセット情報に関する連続的に更新された情報とを含む、態様８１又は８２に記載のビデオデータストリーム。 83. The video data stream according to aspect 81 or 82, wherein, if the single indication does not indicate that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods, the video data stream includes continuously updated information on the initial coded picture buffer removal delay information and continuously updated information on the initial coded picture buffer removal offset information.

＜態様８４＞<Aspect 84>
前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されることを示す前記情報を前記ビデオデータストリームが含む場合、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和は、前記ビデオデータストリーム内の現在の位置を発端として一定であると定義される、態様７９～８３のいずれかに記載のビデオデータストリーム。 84. The video data stream according to any of aspects 79-83, wherein, if the video data stream includes the information indicating that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods, the sum is defined to be constant starting from a current position within the video data stream.

＜態様８５＞<Aspect 85>
ビデオエンコーダ（１００）であって、 A video encoder (100) comprising:
前記ビデオエンコーダ（１００）は、ビデオをビデオデータストリームに符号化するように構成され、 said video encoder (100) configured to encode video into a video data stream;
前記ビデオエンコーダ（１００）は、前記ビデオデータストリームが初期符号化ピクチャバッファ除去遅延を含むように前記ビデオデータストリームを生成するように構成され、 the video encoder (100) is configured to generate the video data stream such that the video data stream includes an initial coded picture buffer removal delay;
前記ビデオエンコーダ（１００）は、前記ビデオデータストリームが初期符号化ピクチャバッファ除去オフセットを含むように前記ビデオデータストリームを生成するように構成され、 the video encoder (100) is configured to generate the video data stream such that the video data stream includes an initial coded picture buffer removal offset;
前記ビデオエンコーダ（１００）は、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの和が２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す情報を前記ビデオデータストリームが含むように、前記ビデオデータストリームを生成するように構成される、 the video encoder (100) is configured to generate the video data stream such that the video data stream includes information indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over two or more buffering periods,
ビデオエンコーダ（１００）。 A video encoder (100).

＜態様８６＞<Aspect 86>
前記初期符号化ピクチャバッファ除去遅延は、前記ビデオデコーダ（３００）を初期化する前記ビデオデータストリームのピクチャの最初のアクセスユニットに関して前記最初のアクセスユニットを前記ビデオデコーダ（３００）に送信する前に経過する必要がある時間を示す、態様８５に記載のビデオエンコーダ（１００）。 86. The video encoder (100) according to aspect 85, wherein the initial coded picture buffer removal delay indicates, for the first access unit of a picture of the video data stream that initializes the video decoder (300), the time that needs to elapse before the first access unit is sent to the video decoder (300).

＜態様８７＞<Aspect 87>
前記ビデオエンコーダ（１００）は、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す単一の表示を前記ビデオデータストリームが含むように、前記ビデオデータストリームを生成するように構成される、態様８６に記載のビデオエンコーダ（１００）。 87. The video encoder (100) according to aspect 86, wherein the video encoder (100) is configured to generate the video data stream such that the video data stream includes a single indication of whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods.

＜態様８８＞<Aspect 88>
前記ビデオエンコーダ（１００）は、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す連結フラグを前記単一の表示として前記ビデオデータストリームが含むように、前記ビデオデータストリームを生成するように構成され、 88. The video encoder (100) according to aspect 87, wherein the video encoder (100) is configured to generate the video data stream such that the video data stream includes, as the single indication, a concatenation flag indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods, wherein
前記連結フラグが第１の値に等しい場合、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和は、前記２つ以上のバッファリング期間にわたって一定であり、 if the concatenation flag is equal to a first value, the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is constant over the two or more buffering periods, and
前記連結フラグが前記第１の値と異なる場合、前記連結フラグは、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２以上のバッファリング期間にわたって一定であるか否かを定義しない、態様８７に記載のビデオエンコーダ（１００）。 if the concatenation flag differs from the first value, the concatenation flag does not define whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is constant over the two or more buffering periods.

＜態様８９＞<Aspect 89>
前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されることを前記単一の表示が示さない場合、前記ビデオエンコーダ（１００）は、前記初期符号化ピクチャバッファ除去遅延情報に関する連続的に更新された情報と前記初期符号化ピクチャバッファ除去オフセット情報に関する連続的に更新された情報とを前記ビデオデータストリームが含むように、前記ビデオデータストリームを生成するように構成される、態様８７又は８８に記載のビデオエンコーダ（１００）。 89. The video encoder (100) according to aspect 87 or 88, wherein, if the single indication does not indicate that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods, the video encoder (100) is configured to generate the video data stream such that the video data stream includes continuously updated information on the initial coded picture buffer removal delay information and continuously updated information on the initial coded picture buffer removal offset information.

＜態様９０＞<Aspect 90>
前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されることを示す前記情報を前記ビデオデータストリームが含む場合、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和は、前記ビデオデータストリーム内の現在の位置を発端として一定であると定義される、態様８５～８９のいずれかに記載のビデオエンコーダ（１００）。 90. The video encoder (100) according to any of aspects 85-89, wherein, if the video data stream includes the information indicating that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods, the sum is defined to be constant starting from a current position within the video data stream.

＜態様９１＞<Aspect 91>
第１の入力ビデオデータストリーム及び第２の入力ビデオデータストリームである、２つの入力ビデオデータストリームを受信するための装置（２００）であって、前記２つの入力ビデオデータストリームのそれぞれには入力ビデオが符号化され、 An apparatus (200) for receiving two input video data streams, namely a first input video data stream and a second input video data stream, wherein each of the two input video data streams has an input video encoded therein,
前記装置（２００）は、前記２つの入力ビデオデータストリームから出力ビデオデータストリームを生成するように構成され、前記出力ビデオデータストリームが出力ビデオを符号化し、前記装置は、前記第１入力ビデオデータストリームと前記第２入力ビデオデータストリームとを連結することによって出力ビデオデータストリームを生成するように構成され、 the apparatus (200) is configured to generate an output video data stream from the two input video data streams, the output video data stream having an output video encoded therein, wherein the apparatus is configured to generate the output video data stream by concatenating the first input video data stream and the second input video data stream,
前記装置（２００）は、前記出力ビデオデータストリームが初期符号化ピクチャバッファ除去遅延を含むように前記出力ビデオデータストリームを生成するように構成され、 The apparatus (200) is configured to generate the output video data stream such that the output video data stream includes an initial coded picture buffer removal delay;
前記装置（２００）は、前記出力ビデオデータストリームが初期符号化ピクチャバッファ除去オフセットを含むように前記出力ビデオデータストリームを生成するように構成され、 The apparatus (200) is configured to generate the output video data stream such that the output video data stream includes an initial coded picture buffer removal offset;
前記装置（２００）は、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの和が２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す情報を前記出力ビデオデータストリームが含むように、前記出力ビデオデータストリームを生成するように構成される、 the apparatus (200) is configured to generate the output video data stream such that the output video data stream includes information indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over two or more buffering periods,
装置（２００）。 A device (200).

＜態様９２＞<Aspect 92>
前記初期符号化ピクチャバッファ除去遅延は、前記ビデオデコーダ（３００）を初期化する前記出力ビデオデータストリームのピクチャの最初のアクセスユニットに関して前記最初のアクセスユニットを前記ビデオデコーダ（３００）に送信する前に経過する必要がある時間を示す、態様９１に記載の装置（２００）。 92. The apparatus (200) according to aspect 91, wherein the initial coded picture buffer removal delay indicates, for the first access unit of a picture of the output video data stream that initializes the video decoder (300), the time that needs to elapse before the first access unit is sent to the video decoder (300).

＜態様９３＞<Aspect 93>
前記装置（２００）は、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す単一の表示を前記出力ビデオデータストリームが含むように、前記出力ビデオデータストリームを生成するように構成される、態様９２に記載の装置（２００）。 93. The apparatus (200) according to aspect 92, wherein the apparatus (200) is configured to generate the output video data stream such that the output video data stream includes a single indication of whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods.

＜態様９４＞<Aspect 94>
前記装置（２００）は、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す連結フラグを前記単一の表示として前記出力ビデオデータストリームが含むように前記出力ビデオデータストリームを生成するように構成され、 the apparatus (200) is configured to generate the output video data stream such that the output video data stream includes, as the single indication, a concatenation flag indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods,
前記連結フラグが第１の値に等しい場合、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和は、前記２つ以上のバッファリング期間にわたって一定であり、 if the concatenation flag is equal to a first value, the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is constant over the two or more buffering periods, and
前記連結フラグが前記第１の値と異なる場合、前記連結フラグは、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２以上のバッファリング期間にわたって一定であるか否かを定義しない、 if the concatenation flag differs from the first value, the concatenation flag does not define whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is constant over the two or more buffering periods,
態様９３に記載の装置（２００）。 94. Apparatus (200) according to aspect 93.

＜態様９５＞<Aspect 95>
前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されることを前記単一の表示が示さない場合、前記装置（２００）は、前記初期符号化ピクチャバッファ除去遅延情報に関する連続的に更新された情報と前記初期符号化ピクチャバッファ除去オフセット情報に関する連続的に更新された情報とを前記出力ビデオデータストリームが含むように、前記出力ビデオデータストリームを生成するように構成される、態様９３又は９４に記載の装置（２００）。 95. The apparatus (200) according to aspect 93 or 94, wherein, if the single indication does not indicate that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods, the apparatus (200) is configured to generate the output video data stream such that the output video data stream includes continuously updated information on the initial coded picture buffer removal delay information and continuously updated information on the initial coded picture buffer removal offset information.

＜態様９６＞<Aspect 96>
前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されることを示す前記情報を前記ビデオデータストリームが含む場合、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和は、前記ビデオデータストリーム内の現在の位置を発端として一定であると定義される、態様９１～９５のいずれかに記載の装置（２００）。 96. The apparatus (200) according to any of aspects 91-95, wherein, if the video data stream includes the information indicating that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods, the sum is defined to be constant starting from a current position within the video data stream.

＜態様９７＞<Aspect 97>
ビデオが格納されているビデオデータストリームを受信するためのビデオデコーダ（３００）であって、 A video decoder (300) for receiving a video data stream in which video is stored, comprising:
前記ビデオデコーダ（３００）は、前記ビデオデータストリームから前記ビデオを復号するように構成され、 said video decoder (300) configured to decode said video from said video data stream;
前記ビデオデータストリームが初期符号化ピクチャバッファ除去遅延を含み、 wherein the video data stream includes an initial coded picture buffer removal delay;
前記ビデオデータストリームが初期符号化ピクチャバッファ除去オフセットを含み、 the video data stream includes an initial coded picture buffer removal offset;
前記ビデオデータストリームは、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの和が２以上のバッファリング期間にわたって一定であると定義されるか否かを示す情報を含み、 the video data stream includes information indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over two or more buffering periods;
前記ビデオデコーダ（３００）は、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す情報に応じて、前記ビデオを復号するように構成される、 the video decoder (300) is configured to decode the video depending on the information indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods,
ビデオデコーダ（３００）。 A video decoder (300).

＜態様９８＞<Aspect 98>
前記初期符号化ピクチャバッファ除去遅延は、前記ビデオデコーダ（３００）を初期化する前記出力ビデオデータストリームのピクチャの最初のアクセスユニットに関して前記ビデオデコーダ（３００）に前記最初のアクセスユニットを送信する前に経過する必要がある時間を示す、態様９７に記載のビデオデコーダ（３００）。 98. The video decoder (300) according to aspect 97, wherein the initial coded picture buffer removal delay indicates, for the first access unit of a picture of the output video data stream that initializes the video decoder (300), the time that needs to elapse before the first access unit is sent to the video decoder (300).

＜態様９９＞<Aspect 99>
前記ビデオデータストリームは、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す単一の表示を含み、 the video data stream includes a single indication of whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods,
前記ビデオデコーダ（３００）は、前記単一の表示に応じて前記ビデオを復号するように構成される、 the video decoder (300) is configured to decode the video depending on the single indication,
態様９８に記載のビデオデコーダ（３００）。 99. A video decoder (300) according to aspect 98.

＜態様１００＞<Aspect 100>
前記ビデオデータストリームは、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す連結フラグを前記単一の表示として含み、 the video data stream includes, as the single indication, a concatenation flag indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods,
前記連結フラグが第１の値に等しい場合、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和は、前記２つ以上のバッファリング期間にわたって一定であり、 if the concatenation flag is equal to a first value, the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is constant over the two or more buffering periods,
前記連結フラグが前記第１の値と異なる場合、前記連結フラグは、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２以上のバッファリング期間にわたって一定であるか否かを定義せず、 if the concatenation flag differs from the first value, the concatenation flag does not define whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is constant over the two or more buffering periods, and
前記ビデオデコーダ（３００）は、前記連結フラグに応じて前記ビデオを復号するように構成される、 the video decoder (300) is configured to decode the video in response to the concatenation flag;
態様９９に記載のビデオデコーダ（３００）。 100. A video decoder (300) according to aspect 99.

＜態様１０１＞<Aspect 101>
前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されることを前記単一の表示が示さない場合、前記ビデオデータストリームは、前記初期符号化ピクチャバッファ除去遅延情報に関する連続的に更新された情報と前記初期符号化ピクチャバッファ除去オフセット情報に関する連続的に更新された情報とを含み、 if the single indication does not indicate that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods; the video data stream includes continuously updated information about the initial coded picture buffer removal delay information and continuously updated information about the initial coded picture buffer removal offset information;
前記ビデオデコーダ（３００）は、前記初期符号化ピクチャバッファ除去遅延情報に関する前記連続的に更新された情報と、前記初期符号化ピクチャバッファ除去オフセット情報に関する前記連続的に更新された情報とに応じて、前記ビデオを復号するように構成される、 the video decoder (300) is configured to decode the video depending on the continuously updated information on the initial coded picture buffer removal delay information and the continuously updated information on the initial coded picture buffer removal offset information,
態様９９又は１００に記載のビデオデコーダ（３００）。 101. The video decoder (300) according to aspect 99 or 100.

＜態様１０２＞<Aspect 102>
前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されることを示す前記情報を前記ビデオデータストリームが含む場合、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和は、前記ビデオデータストリーム内の現在の位置を発端として一定であると定義される、態様９７～１０１のいずれかに記載のビデオデコーダ（３００）。 102. The video decoder (300) according to any of aspects 97-101, wherein, if the video data stream includes the information indicating that the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods, the sum is defined to be constant starting from a current position within the video data stream.

＜態様１０３＞<Aspect 103>
態様９１～９６のいずれかに記載の装置（２００）と、 A device (200) according to any of aspects 91-96;
態様９７～１０２のいずれかに記載のビデオデコーダ（３００）と、 A video decoder (300) according to any of aspects 97-102;
を備え、 with
態様９７～１０２のいずれかに記載のビデオデコーダ（３００）は、態様９１～９６のいずれかに記載の装置（２００）の出力ビデオデータストリームを受信するように構成され、 A video decoder (300) according to any of aspects 97-102, configured to receive an output video data stream of an apparatus (200) according to any of aspects 91-96,
態様９７～１０２のいずれかに記載のビデオデコーダ（３００）は、態様９１～９６のいずれかに記載の装置（２００）の前記出力ビデオデータストリームから前記ビデオを復号するように構成される、 the video decoder (300) according to any of aspects 97-102 is configured to decode the video from the output video data stream of the apparatus (200) according to any of aspects 91-96,
システム。 system.

＜態様１０４＞<Aspect 104>
前記システムは、態様８５～９０のいずれかに記載のビデオエンコーダ（１００）を更に備え、 The system further comprising a video encoder (100) according to any of aspects 85-90,
態様９１～９６のいずれかに記載の装置（２００）は、前記入力ビデオデータストリームとして、態様８５～９０のいずれかに記載のビデオエンコーダ（１００）から前記ビデオデータストリームを受信するように構成される、 the apparatus (200) according to any of aspects 91-96 is configured to receive, as the input video data stream, the video data stream from the video encoder (100) according to any of aspects 85-90,
態様１０３に記載のシステム。 The system according to aspect 103.

＜態様１０５＞<Aspect 105>
ビデオをビデオデータストリームに符号化するための方法であって、 A method for encoding video into a video data stream, comprising:
前記方法は、前記ビデオデータストリームが初期符号化ピクチャバッファ除去遅延を含むように前記ビデオデータストリームを生成するステップを含み、 The method includes generating the video data stream such that the video data stream includes an initial coded picture buffer removal delay;
前記方法は、前記ビデオデータストリームが初期符号化ピクチャバッファ除去オフセットを含むように前記ビデオデータストリームを生成するステップを含み、 The method includes generating the video data stream such that the video data stream includes an initial coded picture buffer removal offset;
前記方法は、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの和が２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す情報を前記ビデオデータストリームが含むように、前記ビデオデータストリームを生成するステップを含む、 the method includes generating the video data stream such that the video data stream includes information indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over two or more buffering periods,
方法。 Method.

＜態様１０６＞<Aspect 106>
第１の入力ビデオデータストリーム及び第２の入力ビデオデータストリームである２つの入力ビデオデータストリームを受信するための方法であって、前記２つの入力ビデオデータストリームのそれぞれには入力ビデオが符号化され、 A method for receiving two input video data streams, being a first input video data stream and a second input video data stream, wherein an input video is encoded in each of the two input video data streams,
前記方法は、前記２つの入力ビデオデータストリームから出力ビデオデータストリームを生成するステップを含み、前記出力ビデオデータストリームが出力ビデオを符号化し、装置が、前記第１の入力ビデオデータストリームと前記第２の入力ビデオデータストリームとを連結することによって出力ビデオデータストリームを生成するように構成され、 the method includes generating an output video data stream from the two input video data streams, wherein an output video is encoded in the output video data stream, and wherein an apparatus is configured to generate the output video data stream by concatenating the first input video data stream and the second input video data stream,
前記方法は、前記出力ビデオデータストリームが初期符号化ピクチャバッファ除去遅延を含むように前記出力ビデオデータストリームを生成するステップを含み、 The method includes generating the output video data stream such that the output video data stream includes an initial coded picture buffer removal delay;
前記方法は、前記出力ビデオデータストリームが初期符号化ピクチャバッファ除去オフセットを含むように前記出力ビデオデータストリームを生成するステップを含み、 The method includes generating the output video data stream such that the output video data stream includes an initial coded picture buffer removal offset;
前記方法は、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの和が２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す情報を前記出力ビデオデータストリームが含むように、前記出力ビデオデータストリームを生成するステップを含む、 the method includes generating the output video data stream such that the output video data stream includes information indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over two or more buffering periods,
方法。 Method.
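
The concatenation in aspect 106 can be illustrated with a minimal sketch. It assumes, purely for illustration, that each input stream is reduced to a list of buffering periods carrying two integer fields; the names `BufferingPeriod` and `splice` are hypothetical and do not come from the aspects. The sketch sets the constant-sum information only when the delay-plus-offset sum is identical across every buffering period of the concatenated output, which is one possible way an apparatus could derive it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BufferingPeriod:
    # Illustrative integer values in arbitrary clock ticks (assumption).
    init_cpb_removal_delay: int
    init_cpb_removal_offset: int


def splice(first: List[BufferingPeriod],
           second: List[BufferingPeriod]) -> Tuple[List[BufferingPeriod], bool]:
    """Concatenate two streams and derive the constant-sum information.

    Returns the concatenated buffering periods and a boolean that is True
    only if delay + offset is the same value in every buffering period.
    An empty output yields False because there is no sum to report.
    """
    output = list(first) + list(second)
    sums = {p.init_cpb_removal_delay + p.init_cpb_removal_offset for p in output}
    return output, len(sums) == 1
```

If the two inputs use different sums, the sketch reports `False`, in which case the output stream would have to carry per-period delay and offset values instead.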

＜態様１０７＞<Aspect 107>
ビデオが格納されているビデオデータストリームを受信するための方法であって、 A method for receiving a video data stream in which video is stored, comprising:
前記方法は、前記ビデオデータストリームから前記ビデオを復号するステップを含み、 The method includes decoding the video from the video data stream;
前記ビデオデータストリームが初期符号化ピクチャバッファ除去遅延を含み、 wherein the video data stream includes an initial coded picture buffer removal delay;
前記ビデオデータストリームが初期符号化ピクチャバッファ除去オフセットを含み、 the video data stream includes an initial coded picture buffer removal offset;
前記ビデオデータストリームは、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの和が２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す情報を含み、 the video data stream includes information indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over two or more buffering periods,
前記方法は、前記初期符号化ピクチャバッファ除去遅延と前記初期符号化ピクチャバッファ除去オフセットとの前記和が前記２つ以上のバッファリング期間にわたって一定であると定義されるか否かを示す前記情報に応じて前記ビデオを復号するステップを含む、 the method includes decoding the video depending on the information indicating whether the sum of the initial coded picture buffer removal delay and the initial coded picture buffer removal offset is defined to be constant over the two or more buffering periods,
方法。 Method.
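
A minimal sketch of how a decoder might act on this information is given below. It is an assumption-laden illustration, not the decoding process of any standard: buffering periods are modelled as `(delay, offset)` integer pairs, and the function name `delay_offset_sums` is hypothetical. When the constant-sum information is set, the sum from the first buffering period is reused; otherwise the per-period values carried in the stream are used.

```python
from typing import List, Tuple


def delay_offset_sums(periods: List[Tuple[int, int]],
                      sum_is_constant: bool) -> List[int]:
    """Return the delay + offset sum assumed for each buffering period.

    periods: (init_cpb_removal_delay, init_cpb_removal_offset) pairs,
             modelled here as plain integers (assumption).
    sum_is_constant: the information signalled in the video data stream.
    """
    if not periods:
        return []
    if sum_is_constant:
        # The stream defines the sum as constant, so the first sum is reused.
        delay, offset = periods[0]
        return [delay + offset] * len(periods)
    # No such definition: rely on the values updated for each period.
    return [delay + offset for delay, offset in periods]
```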

＜態様１０８＞<Aspect 108>
コンピュータ又は信号プロセッサで実行されるときに態様１０５～１０７のいずれかに記載の方法を実施するためのコンピュータプログラム。 A computer program for implementing the method according to any of aspects 105-107 when executed on a computer or signal processor.

＜態様１０９＞<Aspect 109>
ビデオデータストリームであって、 a video data stream,
前記ビデオデータストリームにはビデオが符号化され、 a video is encoded in the video data stream;
前記ビデオデータストリームは、前記ビデオデータストリームの１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの前記複数のアクセスユニットのうちの１つのアクセスユニットのネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義されるか否かを示す表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）を含み、 wherein the video data stream includes an indication (general_same_pic_timing_in_all_ols_flag) indicating whether a non-scalable-nested picture timing supplemental enhancement information message of a network abstraction layer unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が第１の値を有する場合、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージは、前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用するように定義され、 wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has a first value, the non-scalable-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit is defined to apply to all output layer sets of the plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値とは異なる値を有する場合、前記表示は、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用されるか否かを定義しない、 wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has a value different from the first value, the indication does not define whether the non-scalable-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit applies to all output layer sets of the plurality of output layer sets of the access unit,
ビデオデータストリーム。 video data stream.
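
The two-way semantics of the indication in aspect 109 can be sketched as follows. The sketch assumes that the "first value" is 1, which the aspect itself does not state, and the function name `pt_sei_applies_to_all_ols` is hypothetical. It returns `True` when the indication has the first value and `None` otherwise, to reflect that any other value defines nothing either way rather than asserting the opposite.

```python
from typing import Optional

FIRST_VALUE = 1  # assumption: the "first value" of the indication is 1


def pt_sei_applies_to_all_ols(
        general_same_pic_timing_in_all_ols_flag: int) -> Optional[bool]:
    """Interpret the indication for a non-scalable-nested picture timing SEI.

    True  -> the message is defined to apply to all output layer sets.
    None  -> the indication makes no statement (it does not mean "False").
    """
    if general_same_pic_timing_in_all_ols_flag == FIRST_VALUE:
        return True
    return None
```

Returning `None` rather than `False` is a deliberate design choice here: a consumer must fall back to other signalling when the indication does not have the first value.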

＜態様１１０＞<Aspect 110>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記ネットワーク抽象化レイヤユニットは、ピクチャタイミング補足拡張情報メッセージとは異なるいかなる他の補足拡張情報メッセージも含まない、態様１０９に記載のビデオデータストリーム。 The video data stream according to aspect 109, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, the network abstraction layer unit does not include any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message.

＜態様１１１＞<Aspect 111>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記ネットワーク抽象化レイヤユニットはいかなる他の補足拡張情報メッセージも含まない、態様１０９又は１１０に記載のビデオデータストリーム。 The video data stream according to aspect 109 or 110, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, the network abstraction layer unit does not include any other supplemental enhancement information message.

＜態様１１２＞<Aspect 112>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの前記複数のアクセスユニットの各アクセスユニットの、スケーラブルネストされないピクチャタイミング補足拡張情報メッセージを含むそれぞれのネットワーク抽象化レイヤユニットごとに、前記ネットワーク抽象化レイヤユニットは、ピクチャタイミング補足拡張情報メッセージとは異なる任意の他の補足拡張情報メッセージを含まず、又は、いかなる他の補足拡張情報メッセージも含まない、態様１０９～１１１のいずれかに記載のビデオデータストリーム。 The video data stream according to any of aspects 109-111, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, then, for each network abstraction layer unit that includes a non-scalable-nested picture timing supplemental enhancement information message, of each access unit of the plurality of access units of a coded video sequence of the one or more coded video sequences, the network abstraction layer unit does not include any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not include any other supplemental enhancement information message.

＜態様１１３＞<Aspect 113>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記ビデオデータストリームの前記１つ以上の符号化ビデオシーケンスのそれぞれの前記複数のアクセスユニットの各アクセスユニットの、スケーラブルネストされないピクチャタイミング補足拡張情報メッセージを含むそれぞれのネットワーク抽象化レイヤユニットごとに、前記ネットワーク抽象化レイヤユニットは、ピクチャタイミング補足拡張情報メッセージとは異なる任意の他の補足拡張情報メッセージを含まない又はいかなる他の補足拡張情報メッセージも含まない、態様１０９～１１１のいずれかに記載のビデオデータストリーム。 The video data stream according to any of aspects 109-111, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, then, for each network abstraction layer unit that includes a non-scalable-nested picture timing supplemental enhancement information message, of each access unit of the plurality of access units of each of the one or more coded video sequences of the video data stream, the network abstraction layer unit does not include any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not include any other supplemental enhancement information message.
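
The constraint on network abstraction layer units in aspects 110 to 113 can be checked with a short sketch. It assumes each NAL unit is represented as a list of string labels, one per supplemental enhancement information message, and that a non-scalable-nested picture timing message carries the label `"pic_timing"`; these labels and the function names are hypothetical. The check implements the variant in which no message of a different type may share the NAL unit; the stricter variant (no other message at all) would additionally require the list to have length one.

```python
from typing import List

PIC_TIMING = "pic_timing"  # hypothetical label for a picture timing SEI message


def sei_nal_unit_ok(messages: List[str], flag_has_first_value: bool) -> bool:
    """Check one NAL unit, given as the labels of its SEI messages."""
    if not flag_has_first_value:
        return True  # the constraint only holds for the first value
    if PIC_TIMING not in messages:
        return True  # only NAL units carrying picture timing are constrained
    # No message of a different type may share the NAL unit.
    return all(label == PIC_TIMING for label in messages)


def stream_ok(nal_units: List[List[str]], flag_has_first_value: bool) -> bool:
    """Apply the per-NAL-unit check to every NAL unit of every access unit."""
    return all(sei_nal_unit_ok(unit, flag_has_first_value) for unit in nal_units)
```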

＜態様１１４＞<Aspect 114>
ビデオエンコーダ（１００）であって、 A video encoder (100) comprising:
前記ビデオエンコーダ（１００）は、ビデオをビデオデータストリームに符号化するように構成され、 said video encoder (100) configured to encode video into a video data stream;
前記ビデオエンコーダ（１００）は、前記ビデオデータストリームの１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの前記複数のアクセスユニットのうちの１つのアクセスユニットのネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用されると定義されるか否かを示す表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）を前記ビデオデータストリームが含むように、前記ビデオデータストリームを生成するように構成され、 the video encoder (100) is configured to generate the video data stream such that the video data stream includes an indication (general_same_pic_timing_in_all_ols_flag) indicating whether a non-scalable-nested picture timing supplemental enhancement information message of a network abstraction layer unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が第１の値を有する場合、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージは、前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義され、 wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has a first value, the non-scalable-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit is defined to apply to all output layer sets of the plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値とは異なる値を有する場合、前記表示は、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用されるか否かを定義しない、 wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has a value different from the first value, the indication does not define whether the non-scalable-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit applies to all output layer sets of the plurality of output layer sets of the access unit,
ビデオエンコーダ（１００）。 A video encoder (100).

＜態様１１５＞<Aspect 115>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記ネットワーク抽象化レイヤユニットは、ピクチャタイミング補足拡張情報メッセージとは異なるいかなる他の補足拡張情報メッセージも含まない、態様１１４に記載のビデオエンコーダ（１００）。 The video encoder (100) according to aspect 114, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, the network abstraction layer unit does not include any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message.

＜態様１１６＞<Aspect 116>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記ネットワーク抽象化レイヤユニットはいかなる他の補足拡張情報メッセージも含まない、態様１１４又は１１５に記載のビデオエンコーダ（１００）。 The video encoder (100) according to aspect 114 or 115, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, the network abstraction layer unit does not include any other supplemental enhancement information message.

＜態様１１７＞<Aspect 117>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記ビデオエンコーダ（１００）は、前記１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの前記複数のアクセスユニットの各アクセスユニットの、スケーラブルネストされないピクチャタイミング補足拡張情報メッセージを含むそれぞれのネットワーク抽象化レイヤユニットごとに、前記ネットワーク抽象化レイヤユニットがピクチャタイミング補足拡張情報メッセージとは異なる任意の他の補足拡張情報メッセージを含まない又はいかなる他の補足拡張情報メッセージも含まないように前記ビデオデータストリームを生成するように構成される、態様１１４～１１６のいずれかに記載のビデオエンコーダ（１００）。 The video encoder (100) according to any of aspects 114-116, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, the video encoder (100) is configured to generate the video data stream such that, for each network abstraction layer unit that includes a non-scalable-nested picture timing supplemental enhancement information message, of each access unit of the plurality of access units of a coded video sequence of the one or more coded video sequences, the network abstraction layer unit does not include any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not include any other supplemental enhancement information message.

＜態様１１８＞<Aspect 118>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記ビデオエンコーダ（１００）は、前記ビデオデータストリームの前記１つ以上の符号化ビデオシーケンスのそれぞれの前記複数のアクセスユニットの各アクセスユニットの、スケーラブルネストされないピクチャタイミング補足拡張情報メッセージを含むそれぞれのネットワーク抽象化レイヤユニットごとに、前記ネットワーク抽象化レイヤユニットがピクチャタイミング補足拡張情報メッセージと異なる他の補足拡張情報メッセージを含まない又はいかなる他の補足拡張情報メッセージも含まないように、前記ビデオデータストリームを生成するように構成される、態様１１４～１１６のいずれかに記載のビデオエンコーダ（１００）。 The video encoder (100) according to any of aspects 114-116, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, the video encoder (100) is configured to generate the video data stream such that, for each network abstraction layer unit that includes a non-scalable-nested picture timing supplemental enhancement information message, of each access unit of the plurality of access units of each of the one or more coded video sequences of the video data stream, the network abstraction layer unit does not include any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not include any other supplemental enhancement information message.

＜態様１１９＞<Aspect 119>
入力ビデオデータストリームを受信するための装置（２００）であって、前記入力ビデオデータストリームにはビデオが符号化され、 An apparatus (200) for receiving an input video data stream, wherein a video is encoded in the input video data stream,
前記装置（２００）は、前記入力ビデオデータストリームから処理済みビデオデータストリームを生成するように構成され、 said apparatus (200) being configured to generate a processed video data stream from said input video data stream;
前記装置（２００）は、前記処理済みビデオデータストリームの１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの前記複数のアクセスユニットのうちの１つのアクセスユニットのネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義されるか否かを示す表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）を前記処理済みビデオデータストリームが含むように、前記処理済みビデオデータストリームを生成するように構成され、 the apparatus (200) is configured to generate the processed video data stream such that the processed video data stream includes an indication (general_same_pic_timing_in_all_ols_flag) indicating whether a non-scalable-nested picture timing supplemental enhancement information message of a network abstraction layer unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the processed video data stream is defined to apply to all output layer sets of a plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が第１の値を有する場合、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージは、前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義され、 wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has a first value, the non-scalable-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit is defined to apply to all output layer sets of the plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値とは異なる値を有する場合、前記表示は、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用されるか否かを定義しない、 wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has a value different from the first value, the indication does not define whether the non-scalable-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit applies to all output layer sets of the plurality of output layer sets of the access unit,
装置（２００）。 A device (200).

＜態様１２０＞<Aspect 120>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記装置（２００）は、ピクチャタイミング補足拡張情報メッセージとは異なるいかなる他の補足拡張情報メッセージも前記ネットワーク抽象化レイヤユニットが含まないように、前記処理済みビデオデータストリームを生成するように構成される、態様１１９に記載の装置（２００）。 The apparatus (200) according to aspect 119, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, the apparatus (200) is configured to generate the processed video data stream such that the network abstraction layer unit does not include any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message.

＜態様１２１＞<Aspect 121>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記装置（２００）は、前記ネットワーク抽象化レイヤユニットが他のいかなる補足拡張情報メッセージも含まないように、前記処理済みビデオデータストリームを生成するように構成されている、態様１１９又は１２０に記載の装置（２００）。 The apparatus (200) according to aspect 119 or 120, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, the apparatus (200) is configured to generate the processed video data stream such that the network abstraction layer unit does not include any other supplemental enhancement information message.

＜態様１２２＞<Aspect 122>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記装置（２００）は、前記１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの前記複数のアクセスユニットの各アクセスユニットの、スケーラブルネストされないピクチャタイミング補足拡張情報メッセージを含むそれぞれのネットワーク抽象化レイヤユニットごとに、前記ネットワーク抽象化レイヤユニットがピクチャタイミング補足拡張情報メッセージとは異なる他の補足拡張情報メッセージを含まない又はいかなる他の補足拡張情報メッセージも含まないように、前記処理済みビデオデータストリームを生成するように構成される、態様１１９～１２１のいずれかに記載の装置（２００）。 The apparatus (200) according to any of aspects 119-121, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, the apparatus (200) is configured to generate the processed video data stream such that, for each network abstraction layer unit that includes a non-scalable-nested picture timing supplemental enhancement information message, of each access unit of the plurality of access units of a coded video sequence of the one or more coded video sequences, the network abstraction layer unit does not include any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not include any other supplemental enhancement information message.

＜態様１２３＞<Aspect 123>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記装置（２００）は、前記処理済みビデオデータストリームの前記１つ以上の符号化ビデオシーケンスのそれぞれの前記複数のアクセスユニットの各アクセスユニットの、スケーラブルネストされないピクチャタイミング補足拡張情報メッセージを含むそれぞれのネットワーク抽象化レイヤユニットごとに、前記ネットワーク抽象化レイヤユニットがピクチャタイミング補足拡張情報メッセージとは異なる任意の他の補足拡張情報メッセージを含まない又はいかなる他の補足拡張情報メッセージも含まないように、前記処理済みビデオデータストリームを生成するように構成される、態様１１９～１２１のいずれかに記載の装置（２００）。 The apparatus (200) according to any of aspects 119-121, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, the apparatus (200) is configured to generate the processed video data stream such that, for each network abstraction layer unit that includes a non-scalable-nested picture timing supplemental enhancement information message, of each access unit of the plurality of access units of each of the one or more coded video sequences of the processed video data stream, the network abstraction layer unit does not include any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not include any other supplemental enhancement information message.

＜態様１２４＞<Aspect 124>
ビデオを格納したビデオデータストリームを受信するためのビデオデコーダ（３００）であって、 A video decoder (300) for receiving a video data stream containing video, comprising:
前記ビデオデコーダ（３００）は、前記ビデオデータストリームから前記ビデオを復号するように構成され、 said video decoder (300) configured to decode said video from said video data stream;
前記ビデオデータストリームは、前記ビデオデータストリームの１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの前記複数のアクセスユニットのうちの１つのアクセスユニットのネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義されるか否かを示す表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）を含み、 wherein the video data stream includes an indication (general_same_pic_timing_in_all_ols_flag) indicating whether a non-scalable-nested picture timing supplemental enhancement information message of a network abstraction layer unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が第１の値を有する場合、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージは、前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用するように定義され、 wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has a first value, the non-scalable-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit is defined to apply to all output layer sets of the plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値と異なる値を有する場合、前記表示は、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用されるか否かを定義せず、 wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has a value different from the first value, the indication does not define whether the non-scalable-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit applies to all output layer sets of the plurality of output layer sets of the access unit,
前記ビデオデコーダ（３００）は、前記表示に応じて前記ビデオを復号するように構成される、 wherein the video decoder (300) is configured to decode the video depending on the indication,
ビデオデコーダ（３００）。 A video decoder (300).

＜態様１２５＞<Aspect 125>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記ネットワーク抽象化レイヤユニットは、ピクチャタイミング補足拡張情報メッセージとは異なるいかなる他の補足拡張情報メッセージも含まない、態様１２４に記載のビデオデコーダ（３００）。 The video decoder (300) according to aspect 124, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, the network abstraction layer unit does not include any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message.

＜態様１２６＞<Aspect 126>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記ネットワーク抽象化レイヤユニットはいかなる他の補足拡張情報メッセージも含まない、態様１２４又は１２５に記載のビデオデコーダ（３００）。 The video decoder (300) according to aspect 124 or 125, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, the network abstraction layer unit does not include any other supplemental enhancement information message.

＜態様１２７＞<Aspect 127>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの前記複数のアクセスユニットの各アクセスユニットの、スケーラブルネストされないピクチャタイミング補足拡張情報メッセージを含むそれぞれのネットワーク抽象化レイヤユニットごとに、前記ネットワーク抽象化レイヤは、ピクチャタイミング補足拡張情報メッセージとは異なる他のいかなる補足拡張情報メッセージも含まない又は他のいかなる補足拡張情報メッセージも含まない、態様１２４～１２６のいずれかに記載のビデオデコーダ（３００）。 The video decoder (300) according to any of aspects 124-126, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, then, for each network abstraction layer unit that includes a non-scalable-nested picture timing supplemental enhancement information message, of each access unit of the plurality of access units of a coded video sequence of the one or more coded video sequences, the network abstraction layer unit does not include any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not include any other supplemental enhancement information message.

＜態様１２８＞<Aspect 128>
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値を有する場合、前記ビデオデータストリームの前記１つ以上の符号化ビデオシーケンスのそれぞれの前記複数のアクセスユニットの各アクセスユニットの、スケーラブルネストされないピクチャタイミング補足拡張情報メッセージを含むそれぞれのネットワーク抽象化レイヤユニットごとに、前記ネットワーク抽象化レイヤユニットは、ピクチャタイミング補足拡張情報メッセージとは異なる他の補足拡張情報メッセージを含まない又はいかなる他の補足拡張情報メッセージも含まない、態様１２４～１２６のいずれかに記載のビデオデコーダ（３００）。 The video decoder (300) according to any of aspects 124-126, wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has the first value, then, for each network abstraction layer unit that includes a non-scalable-nested picture timing supplemental enhancement information message, of each access unit of the plurality of access units of each of the one or more coded video sequences of the video data stream, the network abstraction layer unit does not include any other supplemental enhancement information message that is different from a picture timing supplemental enhancement information message, or does not include any other supplemental enhancement information message.

＜態様１２９＞<Aspect 129>
態様１１９～１２３のいずれかに記載の装置（２００）と、 A device (200) according to any of aspects 119-123;
態様１２４～１２８のいずれかに記載のビデオデコーダ（３００）と、 A video decoder (300) according to any of aspects 124-128;
を備え、 with
態様１２４～１２８のいずれかに記載のビデオデコーダ（３００）は、態様１１９～１２３のいずれかに記載の装置（２００）の処理済みビデオデータストリームを受信するように構成され、 The video decoder (300) according to any of aspects 124-128 is configured to receive the processed video data stream of the apparatus (200) according to any of aspects 119-123,
態様１２４～１２８のいずれかに記載のビデオデコーダ（３００）は、態様１１９～１２３のいずれかに記載の装置（２００）の出力ビデオデータストリームからビデオを復号するように構成される、 The video decoder (300) according to any of aspects 124-128 is configured to decode video from an output video data stream of the apparatus (200) according to any of aspects 119-123,
システム。 system.

＜態様１３０＞<Aspect 130>
前記システムは、態様１１４～１１８のいずれかに記載のビデオエンコーダ（１００）を更に備え、 the system further comprises a video encoder (100) according to any of aspects 114-118,
態様１１９～１２３のいずれかに記載の装置（２００）は、前記入力ビデオデータストリームとして、態様１１４～１１８のいずれかに記載のビデオエンコーダ（１００）から前記ビデオデータストリームを受信するように構成される、態様１２９に記載のシステム。 the apparatus (200) according to any of aspects 119-123 is configured to receive, as the input video data stream, the video data stream from the video encoder (100) according to any of aspects 114-118; the system according to aspect 129.

＜態様１３１＞<Aspect 131>
ビデオをビデオデータストリームに符号化するための方法であって、 A method for encoding video into a video data stream, comprising:
前記方法は、前記ビデオデータストリームの１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの前記複数のアクセスユニットのうちの１つのアクセスユニットのネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義されるか否かを示す表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）を前記ビデオデータストリームが含むように、前記ビデオデータストリームを生成するステップを含み、 the method includes generating the video data stream such that the video data stream includes an indication (general_same_pic_timing_in_all_ols_flag) indicating whether a non-scalable-nested picture timing supplemental enhancement information message of a network abstraction layer unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が第１の値を有する場合、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージは、前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義され、 wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has a first value, the non-scalable-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit is defined to apply to all output layer sets of the plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値とは異なる値を有する場合、前記表示は、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用されるか否かを定義しない、 wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has a value different from the first value, the indication does not define whether the non-scalable-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit applies to all output layer sets of the plurality of output layer sets of the access unit,
方法。 Method.

＜態様１３２＞<Aspect 132>
入力ビデオデータストリームを受信するための方法であって、前記入力ビデオデータストリームにはビデオが符号化され、 A method for receiving an input video data stream, wherein a video is encoded in the input video data stream,
前記方法は、前記入力ビデオデータストリームから処理済みビデオデータストリームを生成するステップを含み、 The method includes generating a processed video data stream from the input video data stream;
前記方法は、前記処理済みビデオデータストリームの１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの前記複数のアクセスユニットのうちの１つのアクセスユニットのネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義されるか否かを示す表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）を前記処理済みビデオデータストリームが含むように、前記処理済みビデオデータストリームを生成するステップを含み、 the method includes generating the processed video data stream such that the processed video data stream includes an indication (general_same_pic_timing_in_all_ols_flag) indicating whether a non-scalable-nested picture timing supplemental enhancement information message of a network abstraction layer unit of an access unit of a plurality of access units of a coded video sequence of one or more coded video sequences of the processed video data stream is defined to apply to all output layer sets of a plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が第１の値を有する場合、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージは、前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義され、 wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has a first value, the non-scalable-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit is defined to apply to all output layer sets of the plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値とは異なる値を有する場合、前記表示は、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用されるか否かを定義しない、 wherein, if the indication (general_same_pic_timing_in_all_ols_flag) has a value different from the first value, the indication does not define whether the non-scalable-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit applies to all output layer sets of the plurality of output layer sets of the access unit,
方法。 Method.

＜態様１３３＞<Aspect 133>
ビデオを格納したビデオデータストリームを受信するための方法であって、 A method for receiving a video data stream containing video, comprising:
前記方法は、前記ビデオデータストリームから前記ビデオを復号するステップを含み、 The method includes decoding the video from the video data stream;
前記ビデオデータストリームは、前記ビデオデータストリームの１つ以上の符号化ビデオシーケンスのうちの１つの符号化ビデオシーケンスの前記複数のアクセスユニットのうちの１つのアクセスユニットのネットワーク抽象化レイヤユニットのスケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義されるか否かを示す表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）を含み、 the video data stream includes an indication (general_same_pic_timing_in_all_ols_flag) indicating whether a scalable non-nested picture timing supplemental enhancement information message of a network abstraction layer unit of one access unit of the plurality of access units of one encoded video sequence of one or more encoded video sequences of the video data stream is defined to apply to all output layer sets of a plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が第１の値を有する場合、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージは、前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用されるように定義され、 if the indication (general_same_pic_timing_in_all_ols_flag) has a first value, the scalable non-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit is defined to apply to all output layer sets of the plurality of output layer sets of the access unit,
前記表示（ｇｅｎｅｒａｌ＿ｓａｍｅ＿ｐｉｃ＿ｔｉｍｉｎｇ＿ｉｎ＿ａｌｌ＿ｏｌｓ＿ｆｌａｇ）が前記第１の値と異なる値を有する場合、前記表示は、前記アクセスユニットの前記ネットワーク抽象化レイヤユニットの前記スケーラブルネストされないピクチャタイミング補足拡張情報メッセージが前記アクセスユニットの前記複数の出力レイヤセットの全ての出力レイヤセットに適用されるか否かを定義せず、 if the indication (general_same_pic_timing_in_all_ols_flag) has a value different from the first value, the indication does not define whether the scalable non-nested picture timing supplemental enhancement information message of the network abstraction layer unit of the access unit applies to all output layer sets of the plurality of output layer sets of the access unit,
前記ビデオを復号するステップは前記表示に応じて行なわれる、 decoding the video is performed depending on the indication,
方法。 Method.

＜態様１３４＞<Aspect 134>
コンピュータ又は信号プロセッサで実行されるときに態様１３１～１３３のいずれかに記載の方法を実施するためのコンピュータプログラム。 A computer program for implementing the method of any of aspects 131-133 when run on a computer or signal processor.

＜態様１３５＞<Aspect 135>
ビデオデータストリームであって、 a video data stream,
前記ビデオデータストリームにはビデオが符号化され、 a video is encoded in the video data stream;
前記ビデオデータストリームは、１つ以上のスケーラブルネストされた補足拡張情報メッセージを含み、 the video data stream includes one or more scalable nested Supplemental Enhancement Information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージが複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages include a plurality of syntax elements;
前記複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリーム又は前記ビデオデータストリームの一部の前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream,
ビデオデータストリーム。 video data stream.

＜態様１３６＞<Aspect 136>
前記ビデオデータストリームは、１つ以上のスケーラブルネストされない補足拡張情報メッセージを含み、 the video data stream includes one or more scalable non-nested supplemental enhancement information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージが前記複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages comprise the plurality of syntax elements;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリーム又は前記ビデオデータストリームの前記一部の前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいてかつ前記ビデオデータストリーム又は前記ビデオデータストリームの前記一部の前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages and in each of the scalable non-nested supplemental enhancement information messages of the video data stream or of the portion of the video data stream,
態様１３５に記載のビデオデータストリーム。 The video data stream according to aspect 135.

＜態様１３７＞<Aspect 137>
前記ビデオデータストリームが複数のアクセスユニットを含み、前記複数のアクセスユニットの各アクセスユニットが前記ビデオの複数のピクチャのうちの１つに割り当てられ、 said video data stream comprising a plurality of access units, each access unit of said plurality of access units being assigned to one of a plurality of pictures of said video;
前記ビデオデータストリームの前記一部は、前記ビデオデータストリームの前記複数のアクセスユニットのうちのアクセスユニットであり、 said part of said video data stream being an access unit among said plurality of access units of said video data stream;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記アクセスユニットの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the access unit;
態様１３５に記載のビデオデータストリーム。 The video data stream according to aspect 135.

＜態様１３８＞<Aspect 138>
前記ビデオデータストリームは、１つ以上のスケーラブルネストされない補足拡張情報メッセージを含み、 the video data stream includes one or more scalable non-nested supplemental enhancement information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージが前記複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages comprise the plurality of syntax elements;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記アクセスユニットの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれ及び前記アクセスユニットの前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the access unit and in each of the scalable non-nested supplemental enhancement information messages of the access unit,
態様１３７に記載のビデオデータストリーム。 The video data stream according to aspect 137.

＜態様１３９＞<Aspect 139>
前記ビデオデータストリームの前記一部が前記ビデオデータストリームの符号化ビデオシーケンスであり、 said portion of said video data stream is an encoded video sequence of said video data stream;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記符号化ビデオシーケンスの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the encoded video sequence,
態様１３５に記載のビデオデータストリーム。 The video data stream according to aspect 135.

＜態様１４０＞<Aspect 140>
前記ビデオデータストリームは、１つ以上のスケーラブルネストされない補足拡張情報メッセージを含み、 the video data stream includes one or more scalable non-nested supplemental enhancement information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージが前記複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages comprise the plurality of syntax elements;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記符号化ビデオシーケンスの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれ及び前記符号化ビデオシーケンスの前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the encoded video sequence and in each of the scalable non-nested supplemental enhancement information messages of the encoded video sequence,
態様１３９に記載のビデオデータストリーム。 A video data stream according to aspect 139.

＜態様１４１＞<Aspect 141>
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリームの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、態様１３５に記載のビデオデータストリーム。 The video data stream according to aspect 135, wherein each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream.

＜態様１４２＞<Aspect 142>
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリームの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれ及び前記ビデオデータストリームの前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、態様１４１に記載のビデオデータストリーム。 The video data stream according to aspect 141, wherein each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream and in each of the scalable non-nested supplemental enhancement information messages of the video data stream.

＜態様１４３＞<Aspect 143>
前記ビデオデータストリーム又は前記ビデオデータストリームの前記一部は、少なくとも１つのバッファリング期間補足拡張情報メッセージを含み、前記バッファリング期間補足拡張情報メッセージは、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素の前記サイズを定義する、態様１３５～１４２のいずれかに記載のビデオデータストリーム。 The video data stream according to any of aspects 135-142, wherein the video data stream or the portion of the video data stream includes at least one buffering period supplemental enhancement information message, and the buffering period supplemental enhancement information message defines the size of each syntax element of the one or more syntax elements of the plurality of syntax elements.

＜態様１４４＞<Aspect 144>
前記バッファリング期間補足拡張情報メッセージは、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素のそれぞれのシンタックス要素ごとに前記サイズを定義するために、 wherein, to define the size for each syntax element of the one or more syntax elements of the plurality of syntax elements, the buffering period supplemental enhancement information message includes at least one of
ｂｐ＿ｃｐｂ＿ｉｎｉｔｉａｌ＿ｒｅｍｏｖａｌ＿ｄｅｌａｙ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 a bp_cpb_initial_removal_delay_length_minus1 element,
ｂｐ＿ｃｐｂ＿ｒｅｍｏｖａｌ＿ｄｅｌａｙ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 a bp_cpb_removal_delay_length_minus1 element,
ｂｐ＿ｄｐｂ＿ｏｕｔｐｕｔ＿ｄｅｌａｙ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 a bp_dpb_output_delay_length_minus1 element,
ｂｐ＿ｄｕ＿ｃｐｂ＿ｒｅｍｏｖａｌ＿ｄｅｌａｙ＿ｉｎｃｒｅｍｅｎｔ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 a bp_du_cpb_removal_delay_increment_length_minus1 element, and
ｂｐ＿ｄｐｂ＿ｏｕｔｐｕｔ＿ｄｅｌａｙ＿ｄｕ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素 a bp_dpb_output_delay_du_length_minus1 element,
のうちの少なくとも１つを含む、態様１４３に記載のビデオデータストリーム。 The video data stream according to aspect 143.
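
As a non-normative illustration of aspects 143 and 144, the following Python sketch shows how a `*_length_minus1` value signalled in a buffering period supplemental enhancement information message could fix the bit width of a timing field, so that every scalable nested and scalable non-nested message carrying that field can be parsed with the same width. The names `BpLengths`, `BitReader`, and `read_fixed_width` are hypothetical, only three of the five length elements are modelled, and the sample payload layout is invented for the example; it is an assumption, not the syntax of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BpLengths:
    """Subset of the length elements carried in a buffering period SEI message."""
    bp_cpb_initial_removal_delay_length_minus1: int
    bp_cpb_removal_delay_length_minus1: int
    bp_dpb_output_delay_length_minus1: int

    def bits(self, name: str) -> int:
        # A "*_length_minus1" element codes the field width minus one.
        return getattr(self, name) + 1


class BitReader:
    """Hypothetical MSB-first bit reader over a bytes object."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0  # current position, in bits

    def read(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self._data[self._pos // 8]
            bit = (byte >> (7 - self._pos % 8)) & 1
            value = (value << 1) | bit
            self._pos += 1
        return value


def read_fixed_width(reader: BitReader, lengths: BpLengths, length_name: str) -> int:
    """Read one field whose width is given by the named length element."""
    return reader.read(lengths.bits(length_name))


# Illustrative payload: two 4-bit fields followed by one 8-bit field.
lengths = BpLengths(3, 7, 7)
reader = BitReader(bytes([0xAB, 0xCD]))
first = read_fixed_width(reader, lengths, "bp_cpb_initial_removal_delay_length_minus1")
second = read_fixed_width(reader, lengths, "bp_cpb_initial_removal_delay_length_minus1")
third = read_fixed_width(reader, lengths, "bp_cpb_removal_delay_length_minus1")
```

Because the widths come from one place, a device that rewrites or extracts a sub-bitstream does not need to re-derive field sizes per message, which is the property the aspects above rely on.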

＜態様１４５＞<Aspect 145>
スケーラブルネストされたバッファリング期間補足拡張情報メッセージを含む、前記ビデオデータストリームの複数のアクセスユニットのそれぞれのアクセスユニットごとに、前記アクセスユニットは、スケーラブルネストされないバッファリング期間補足拡張情報メッセージも含む、態様１４３又は１４４に記載のビデオデータストリーム。 The video data stream according to aspect 143 or 144, wherein each access unit of a plurality of access units of the video data stream that includes a scalable nested buffering period supplemental enhancement information message also includes a scalable non-nested buffering period supplemental enhancement information message.

＜態様１４６＞<Aspect 146>
スケーラブルネストされたバッファリング期間補足拡張情報メッセージを含む、前記ビデオデータストリームの複数の単層アクセスユニットのそれぞれの単層アクセスユニットごとに、前記単層アクセスユニットは、スケーラブルネストされないバッファリング期間補足拡張情報メッセージも含む、態様１４３又は１４４に記載のビデオデータストリーム。 The video data stream according to aspect 143 or 144, wherein each single-layer access unit of a plurality of single-layer access units of the video data stream that includes a scalable nested buffering period supplemental enhancement information message also includes a scalable non-nested buffering period supplemental enhancement information message.

＜態様１４７＞<Aspect 147>
ビデオエンコーダ（１００）であって、 A video encoder (100) comprising:
前記ビデオエンコーダ（１００）は、ビデオをビデオデータストリームに符号化するように構成され、 said video encoder (100) configured to encode video into a video data stream;
前記ビデオエンコーダ（１００）は、前記ビデオデータストリームが１つ以上のスケーラブルネストされた補足拡張情報メッセージを含むように前記ビデオデータストリームを生成するように構成され、 the video encoder (100) is configured to generate the video data stream such that the video data stream includes one or more scalable nested supplemental enhancement information messages;
前記ビデオエンコーダ（１００）は、前記１つ以上のスケーラブルネストされた補足拡張情報メッセージが複数のシンタックス要素を含むように、前記ビデオデータストリームを生成するように構成され、 the video encoder (100) is configured to generate the video data stream such that the one or more scalable nested supplemental enhancement information messages include a plurality of syntax elements;
前記ビデオエンコーダ（１００）は、前記複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素が前記ビデオデータストリーム又は前記ビデオデータストリームの一部の前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有すると定義されるように、前記ビデオデータストリームを生成するように構成される、 the video encoder (100) is configured to generate the video data stream such that each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream,
ビデオエンコーダ（１００）。 A video encoder (100).

＜態様１４８＞<Aspect 148>
前記ビデオエンコーダ（１００）は、前記ビデオデータストリームが１つ以上のスケーラブルネストされない補足拡張情報メッセージを含むように前記ビデオデータストリームを生成するように構成され、 the video encoder (100) is configured to generate the video data stream such that the video data stream includes one or more scalable non-nested supplemental enhancement information messages;
前記ビデオエンコーダ（１００）は、前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージが前記複数のシンタックス要素を含むように、前記ビデオデータストリームを生成するように構成され、 The video encoder (100) is configured to convert the video data stream such that the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages include the plurality of syntax elements. is configured to generate
前記ビデオエンコーダ（１００）は、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素が前記ビデオデータストリーム又は前記ビデオデータストリームの前記一部の前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいてかつ前記ビデオデータストリーム又は前記ビデオデータストリームの前記一部の前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有すると定義されるように、前記ビデオデータストリームを生成するように構成される、 the video encoder (100) is configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages and in each of the scalable non-nested supplemental enhancement information messages of the video data stream or of the portion of the video data stream,
態様１４７に記載のビデオエンコーダ（１００）。 The video encoder (100) according to aspect 147.

＜態様１４９＞<Aspect 149>
前記ビデオエンコーダ（１００）は、前記ビデオデータストリームが複数のアクセスユニットを含むように前記ビデオデータストリームを生成するように構成され、前記複数のアクセスユニットの各アクセスユニットは、前記ビデオの複数のピクチャのうちの１つに割り当てられ、 The video encoder (100) is configured to generate the video data stream such that the video data stream comprises a plurality of access units, each access unit of the plurality of access units being a plurality of pictures of the video. is assigned to one of
前記ビデオデータストリームの前記一部は、前記ビデオデータストリームの前記複数のアクセスユニットのうちの１つのアクセスユニットであり、 said portion of said video data stream is one access unit of said plurality of access units of said video data stream;
前記ビデオエンコーダ（１００）は、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素が前記アクセスユニットの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有すると定義されるように、前記ビデオデータストリームを生成するように構成される、 the video encoder (100) is configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the access unit,
態様１４７に記載のビデオエンコーダ（１００）。 The video encoder (100) according to aspect 147.

＜態様１５０＞<Aspect 150>
前記ビデオエンコーダ（１００）は、前記ビデオデータストリームが１つ以上のスケーラブルネストされない補足拡張情報メッセージを含むように前記ビデオデータストリームを生成するように構成され、 the video encoder (100) is configured to generate the video data stream such that the video data stream includes one or more scalable non-nested supplemental enhancement information messages;
前記ビデオエンコーダ（１００）は、前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージが前記複数のシンタックス要素を含むように、前記ビデオデータストリームを生成するように構成され、 The video encoder (100) is configured to convert the video data stream such that the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages include the plurality of syntax elements. is configured to generate
前記ビデオエンコーダ（１００）は、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素が前記アクセスユニットの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいてかつ前記アクセスユニットの前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有すると定義されるように、前記ビデオデータストリームを生成するように構成される、 the video encoder (100) is configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the access unit and in each of the scalable non-nested supplemental enhancement information messages of the access unit,
態様１４９に記載のビデオエンコーダ（１００）。 The video encoder (100) according to aspect 149.

＜態様１５１＞<Aspect 151>
前記ビデオデータストリームの前記一部は、前記ビデオデータストリームの符号化ビデオシーケンスであり、 said portion of said video data stream is an encoded video sequence of said video data stream;
前記ビデオエンコーダ（１００）は、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素が前記符号化ビデオシーケンスの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有すると定義されるように、前記ビデオデータストリームを生成するように構成される、 the video encoder (100) is configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the encoded video sequence,
態様１４７に記載のビデオエンコーダ（１００）。 The video encoder (100) according to aspect 147.

＜態様１５２＞<Aspect 152>
前記ビデオエンコーダ（１００）は、前記ビデオデータストリームが１つ以上のスケーラブルネストされない補足拡張情報メッセージを含むように前記ビデオデータストリームを生成するように構成され、 the video encoder (100) is configured to generate the video data stream such that the video data stream includes one or more scalable non-nested supplemental enhancement information messages;
前記ビデオエンコーダ（１００）は、前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージが前記複数のシンタックス要素を含むように、前記ビデオデータストリームを生成するように構成され、 The video encoder (100) is configured to convert the video data stream such that the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages include the plurality of syntax elements. is configured to generate
前記ビデオエンコーダ（１００）は、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素が前記符号化ビデオシーケンスの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいてかつ前記符号化ビデオシーケンスの前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有すると定義されるように、前記ビデオデータストリームを生成するように構成される、 the video encoder (100) is configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the encoded video sequence and in each of the scalable non-nested supplemental enhancement information messages of the encoded video sequence,
態様１５１に記載のビデオエンコーダ（１００）。 The video encoder (100) according to aspect 151.

＜態様１５３＞<Aspect 153>
前記ビデオエンコーダ（１００）は、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素が前記ビデオデータストリームの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有すると定義されるように、前記ビデオデータストリームを生成するように構成される、態様１４７に記載のビデオエンコーダ（１００）。 The video encoder (100) according to aspect 147, wherein the video encoder (100) is configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream.

＜態様１５４＞<Aspect 154>
前記ビデオエンコーダ（１００）は、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素が前記ビデオデータストリームの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいてかつ前記ビデオデータストリームの前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有すると定義されるように、前記ビデオデータストリームを生成するように構成される、態様１５３に記載のビデオエンコーダ（１００）。 The video encoder (100) according to aspect 153, wherein the video encoder (100) is configured to generate the video data stream such that each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream and in each of the scalable non-nested supplemental enhancement information messages of the video data stream.

＜態様１５５＞<Aspect 155>
前記ビデオエンコーダ（１００）は、前記ビデオデータストリーム又は前記ビデオデータストリームの前記一部が少なくとも１つのバッファリング期間補足拡張情報メッセージを含むように前記ビデオデータストリームを生成するように構成され、前記バッファリング期間補足拡張情報メッセージは、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素の前記サイズを定義する、態様１４７～１５４のいずれかに記載のビデオエンコーダ（１００）。 The video encoder (100) according to any of aspects 147-154, wherein the video encoder (100) is configured to generate the video data stream such that the video data stream or the portion of the video data stream includes at least one buffering period supplemental enhancement information message, and the buffering period supplemental enhancement information message defines the size of each syntax element of the one or more syntax elements of the plurality of syntax elements.

＜態様１５６＞<Aspect 156>
前記ビデオエンコーダ（１００）は、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素のそれぞれのシンタックス要素ごとに前記サイズを定義するために前記バッファリング期間補足拡張情報メッセージが、 the video encoder (100) is configured to generate the video data stream such that, to define the size for each syntax element of the one or more syntax elements of the plurality of syntax elements, the buffering period supplemental enhancement information message includes at least one of
ｂｐ＿ｃｐｂ＿ｉｎｉｔｉａｌ＿ｒｅｍｏｖａｌ＿ｄｅｌａｙ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 a bp_cpb_initial_removal_delay_length_minus1 element,
ｂｐ＿ｃｐｂ＿ｒｅｍｏｖａｌ＿ｄｅｌａｙ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 a bp_cpb_removal_delay_length_minus1 element,
ｂｐ＿ｄｐｂ＿ｏｕｔｐｕｔ＿ｄｅｌａｙ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 a bp_dpb_output_delay_length_minus1 element,
ｂｐ＿ｄｕ＿ｃｐｂ＿ｒｅｍｏｖａｌ＿ｄｅｌａｙ＿ｉｎｃｒｅｍｅｎｔ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 a bp_du_cpb_removal_delay_increment_length_minus1 element, and
ｂｐ＿ｄｐｂ＿ｏｕｔｐｕｔ＿ｄｅｌａｙ＿ｄｕ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素 a bp_dpb_output_delay_du_length_minus1 element,
のうちの少なくとも１つを含むように前記ビデオデータストリームを生成するように構成される、態様１５５に記載のビデオエンコーダ（１００）。 The video encoder (100) according to aspect 155.

＜態様１５７＞<Aspect 157>
前記ビデオエンコーダ（１００）は、スケーラブルネストされたバッファリング期間補足拡張情報メッセージを含む、前記ビデオデータストリームの複数のアクセスユニットのそれぞれのアクセスユニットごとに、前記アクセスユニットがスケーラブルネストされないバッファリング期間補足拡張情報メッセージも含むように、前記ビデオデータストリームを生成するように構成される、態様１５５又は１５６に記載のビデオエンコーダ（１００）。 The video encoder (100) according to aspect 155 or 156, wherein the video encoder (100) is configured to generate the video data stream such that each access unit of a plurality of access units of the video data stream that includes a scalable nested buffering period supplemental enhancement information message also includes a scalable non-nested buffering period supplemental enhancement information message.

＜態様１５８＞<Aspect 158>
前記ビデオエンコーダ（１００）は、スケーラブルネストされたバッファリング期間補足拡張情報メッセージを含む、前記ビデオデータストリームの複数の単層アクセスユニットのそれぞれの単層アクセスユニットごとに、前記単層アクセスユニットがスケーラブルネストされないバッファリング期間補足拡張情報メッセージも含むように、前記ビデオデータストリームを生成するように構成される、態様１５５又は１５６に記載のビデオエンコーダ（１００）。 The video encoder (100) according to aspect 155 or 156, wherein the video encoder (100) is configured to generate the video data stream such that each single-layer access unit of a plurality of single-layer access units of the video data stream that includes a scalable nested buffering period supplemental enhancement information message also includes a scalable non-nested buffering period supplemental enhancement information message.

＜態様１５９＞<Aspect 159>
入力ビデオデータストリームを受信するための装置（２００）であって、前記入力ビデオデータストリームにはビデオが符号化され、 An apparatus (200) for receiving an input video data stream, wherein a video is encoded in the input video data stream,
前記装置（２００）は、前記入力ビデオデータストリームから出力ビデオデータストリームを生成するように構成され、 said device (200) being configured to generate an output video data stream from said input video data stream;
前記ビデオデータストリームは、１つ以上のスケーラブルネストされた補足拡張情報メッセージを含み、 the video data stream includes one or more scalable nested Supplemental Enhancement Information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージが複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages include a plurality of syntax elements;
前記複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリーム又は前記ビデオデータストリームの一部の前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義され、 each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream,
前記装置（２００）は、前記１つ以上のスケーラブルネストされた補足拡張情報メッセージを処理するように構成される、 said apparatus (200) is configured to process said one or more scalable nested supplemental enhancement information messages;
装置（２００）。 A device (200).

＜態様１６０＞<Aspect 160>
前記ビデオデータストリームは、１つ以上のスケーラブルネストされない補足拡張情報メッセージを含み、 the video data stream includes one or more scalable non-nested supplemental enhancement information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージが前記複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages comprise the plurality of syntax elements;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリーム又は前記ビデオデータストリームの前記一部の前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいてかつ前記ビデオデータストリーム又は前記ビデオデータストリームの前記一部の前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義され、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages and in each of the scalable non-nested supplemental enhancement information messages of the video data stream or of the portion of the video data stream,
前記装置（２００）は、前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージを処理するように構成される、 said apparatus (200) is configured to process said one or more scalable nested supplemental enhancement information messages and said one or more scalable non-nested supplemental enhancement information messages;
態様１５９に記載の装置（２００）。 The apparatus (200) according to aspect 159.

＜態様１６１＞<Aspect 161>
前記ビデオデータストリームが複数のアクセスユニットを含み、前記複数のアクセスユニットの各アクセスユニットが前記ビデオの複数のピクチャのうちの１つに割り当てられ、 said video data stream comprising a plurality of access units, each access unit of said plurality of access units being assigned to one of a plurality of pictures of said video;
前記ビデオデータストリームの前記一部は、前記ビデオデータストリームの前記複数のアクセスユニットのうちの１つのアクセスユニットであり、 said portion of said video data stream is one access unit of said plurality of access units of said video data stream;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記アクセスユニットの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the access unit;
態様１５９に記載の装置（２００）。 The apparatus (200) according to aspect 159.

＜態様１６２＞<Aspect 162>
前記ビデオデータストリームは、１つ以上のスケーラブルネストされない補足拡張情報メッセージを含み、 the video data stream includes one or more scalable non-nested supplemental enhancement information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージが前記複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages comprise the plurality of syntax elements;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記アクセスユニットの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて及び前記アクセスユニットの前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義され、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the access unit and in each of the scalable non-nested supplemental enhancement information messages of the access unit,
前記装置（２００）は、前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージを処理するように構成される、 said apparatus (200) is configured to process said one or more scalable nested supplemental enhancement information messages and said one or more scalable non-nested supplemental enhancement information messages;
態様１６１に記載の装置（２００）。 The apparatus (200) according to aspect 161.

＜態様１６３＞<Aspect 163>
前記ビデオデータストリームの前記一部は、前記ビデオデータストリームの符号化ビデオシーケンスであり、 said portion of said video data stream is an encoded video sequence of said video data stream;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記符号化ビデオシーケンスの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the encoded video sequence,
態様１５９に記載の装置（２００）。 The apparatus (200) according to aspect 159.

＜態様１６４＞<Aspect 164>
前記ビデオデータストリームは、１つ以上のスケーラブルネストされない補足拡張情報メッセージを含み、 the video data stream includes one or more scalable non-nested supplemental enhancement information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージが前記複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages comprise the plurality of syntax elements;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記符号化ビデオシーケンスの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて及び前記符号化ビデオシーケンスの前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義され、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the encoded video sequence and in each of the scalable non-nested supplemental enhancement information messages of the encoded video sequence,
前記装置（２００）は、前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージを処理するように構成される、 said apparatus (200) is configured to process said one or more scalable nested supplemental enhancement information messages and said one or more scalable non-nested supplemental enhancement information messages;
態様１６３に記載の装置（２００）。 The apparatus (200) according to aspect 163.

＜態様１６５＞<Aspect 165>
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリームの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、態様１５９に記載の装置（２００）。 The apparatus (200) according to aspect 159, wherein each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream.

＜態様１６６＞<Aspect 166>
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリームの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて及び前記ビデオデータストリームの前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義され、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream and in each of the scalable non-nested supplemental enhancement information messages of the video data stream,
前記装置（２００）は、前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージを処理するように構成される、 said apparatus (200) is configured to process said one or more scalable nested supplemental enhancement information messages and said one or more scalable non-nested supplemental enhancement information messages;
態様１６５に記載の装置（２００）。 The apparatus (200) according to aspect 165.

＜態様１６７＞<Aspect 167>
前記ビデオデータストリーム又は前記ビデオデータストリームの前記一部は、少なくとも１つのバッファリング期間補足拡張情報メッセージを含み、前記バッファリング期間補足拡張情報メッセージは、前記複数のシンタックス要素のうちの前記１つ以上のサイズを定義し、 the video data stream or the portion of the video data stream includes at least one buffering period supplemental enhancement information message, and the buffering period supplemental enhancement information message defines the size of the one or more of the plurality of syntax elements,
前記装置（２００）は、前記少なくとも１つのバッファリング期間補足拡張情報メッセージを処理するように構成される、 said device (200) is configured to process said at least one buffering period supplemental enhancement information message;
態様１５９～１６６のいずれかに記載の装置（２００）。 The apparatus (200) according to any of aspects 159-166.

＜態様１６８＞<Aspect 168>
前記バッファリング期間補足拡張情報メッセージは、前記複数のシンタックス要素のうちの前記１つ以上のサイズを定義するために、 wherein, to define the size of the one or more of the plurality of syntax elements, the buffering period supplemental enhancement information message comprises at least one of the following:
ｂｐ＿ｃｐｂ＿ｉｎｉｔｉａｌ＿ｒｅｍｏｖａｌ＿ｄｅｌａｙ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 bp_cpb_initial_removal_delay_length_minus1 element,
ｂｐ＿ｃｐｂ＿ｒｅｍｏｖａｌ＿ｄｅｌａｙ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 bp_cpb_removal_delay_length_minus1 element,
ｂｐ＿ｄｐｂ＿ｏｕｔｐｕｔ＿ｄｅｌａｙ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 bp_dpb_output_delay_length_minus1 element,
ｂｐ＿ｄｕ＿ｃｐｂ＿ｒｅｍｏｖａｌ＿ｄｅｌａｙ＿ｉｎｃｒｅｍｅｎｔ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 bp_du_cpb_removal_delay_increment_length_minus1 element,
ｂｐ＿ｄｐｂ＿ｏｕｔｐｕｔ＿ｄｅｌａｙ＿ｄｕ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素 bp_dpb_output_delay_du_length_minus1 element.
のうちの少なくとも１つを含む、態様１６７に記載の装置（２００）。 The apparatus (200) according to aspect 167, comprising at least one of the elements listed above.
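As an illustration of why a single size definition is convenient, the following is a minimal sketch of a parser that takes the bit width of a field from the length elements listed above. It assumes the usual convention that an element named `..._length_minus1` codes the field width in bits minus one; the classes `BufferingPeriodLengths` and `BitReader` and the function `read_fixed_size_field` are hypothetical names introduced only for this example and are not taken from the publication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferingPeriodLengths:
    """Hypothetical holder for the *_length_minus1 elements listed above."""
    bp_cpb_initial_removal_delay_length_minus1: int
    bp_cpb_removal_delay_length_minus1: int
    bp_dpb_output_delay_length_minus1: int
    bp_du_cpb_removal_delay_increment_length_minus1: int
    bp_dpb_output_delay_du_length_minus1: int

    def bit_length(self, element: str) -> int:
        # Assumed "minus1" convention: coded value + 1 is the width in bits.
        return getattr(self, element) + 1


class BitReader:
    """Minimal MSB-first bit reader (illustrative only)."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0  # current bit position in the payload

    def read_bits(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self._data[self._pos // 8]
            value = (value << 1) | ((byte >> (7 - self._pos % 8)) & 1)
            self._pos += 1
        return value


def read_fixed_size_field(reader: BitReader,
                          lengths: BufferingPeriodLengths,
                          element: str) -> int:
    """Read one field whose width is given by the named length element."""
    return reader.read_bits(lengths.bit_length(element))
```

Because the same widths apply to every message in scope, a device that rewrites or extracts a sub-stream can reuse one `BufferingPeriodLengths` instance for nested and non-nested messages alike instead of tracking a separate width per message.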

＜態様１６９＞<Aspect 169>
スケーラブルネストされたバッファリング期間補足拡張情報メッセージを含む、前記ビデオデータストリームの複数のアクセスユニットのそれぞれのアクセスユニットごとに、前記アクセスユニットは、スケーラブルネストされないバッファリング期間補足拡張情報メッセージも含み、 for each access unit of a plurality of access units of said video data stream comprising a scalable nested buffering period supplemental enhancement information message, said access unit also comprising a scalable non-nested buffering period supplemental enhancement information message;
前記装置（２００）は、前記スケーラブルネストされた補足拡張情報メッセージ及び前記スケーラブルネストされない補足拡張情報メッセージを処理するように構成される、 said apparatus (200) is configured to process said scalable nested supplemental enhancement information message and said scalable non-nested supplemental enhancement information message;
態様１６７又は１６８に記載の装置（２００）。 The apparatus (200) according to aspect 167 or 168.
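The pairing condition of this aspect can be checked mechanically. The sketch below is only an illustration under assumed data structures: `SeiMessage`, `AccessUnit`, and `find_unpaired_access_units` are hypothetical names, and the string `"buffering_period"` is an arbitrary label chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SeiMessage:
    kind: str      # hypothetical label, e.g. "buffering_period"
    nested: bool   # True if carried inside a scalable nesting SEI message


@dataclass
class AccessUnit:
    sei_messages: List[SeiMessage] = field(default_factory=list)


def find_unpaired_access_units(access_units: List[AccessUnit]) -> List[int]:
    """Return indices of access units that carry a scalable nested buffering
    period SEI message but no scalable non-nested one."""
    unpaired = []
    for index, au in enumerate(access_units):
        bp = [m for m in au.sei_messages if m.kind == "buffering_period"]
        has_nested = any(m.nested for m in bp)
        has_non_nested = any(not m.nested for m in bp)
        if has_nested and not has_non_nested:
            unpaired.append(index)
    return unpaired
```

An empty result means every access unit with a nested buffering period message also carries a non-nested one, which is the condition stated in the aspect.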

＜態様１７０＞<Aspect 170>
スケーラブルネストされたバッファリング期間補足拡張情報メッセージを含む、前記ビデオデータストリームの複数の単層アクセスユニットのそれぞれの単層アクセスユニットごとに、前記単層アクセスユニットは、スケーラブルネストされないバッファリング期間補足拡張情報メッセージも含み、 for each single-layer access unit of a plurality of single-layer access units of said video data stream that comprises a scalable nested buffering period supplemental enhancement information message, said single-layer access unit also comprises a scalable non-nested buffering period supplemental enhancement information message;
前記装置（２００）は、前記スケーラブルネストされた補足拡張情報メッセージ及び前記スケーラブルネストされない補足拡張情報メッセージを処理するように構成される、 said apparatus (200) is configured to process said scalable nested supplemental enhancement information message and said scalable non-nested supplemental enhancement information message;
態様１６７又は１６８に記載の装置（２００）。 The apparatus (200) according to aspect 167 or 168.

＜態様１７１＞<Aspect 171>
ビデオを格納したビデオデータストリームを受信するためのビデオデコーダ（３００）であって、 A video decoder (300) for receiving a video data stream containing video, comprising:
前記ビデオデコーダ（３００）は、前記ビデオデータストリームから前記ビデオを復号するように構成され、 said video decoder (300) configured to decode said video from said video data stream;
前記ビデオデータストリームは、１つ以上のスケーラブルネストされた補足拡張情報メッセージを含み、 the video data stream includes one or more scalable nested supplemental enhancement information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージは複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages comprise a plurality of syntax elements;
前記複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリーム又は前記ビデオデータストリームの一部の前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義され、 each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream;
前記ビデオデコーダ（３００）は、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素に応じて前記ビデオを復号するように構成される、 the video decoder (300) is configured to decode the video in response to the one or more syntax elements of the plurality of syntax elements;
ビデオデコーダ（３００）。 A video decoder (300).

＜態様１７２＞<Aspect 172>
前記ビデオデータストリームは、１つ以上のスケーラブルネストされない補足拡張情報メッセージを含み、 the video data stream includes one or more scalable non-nested supplemental enhancement information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージが前記複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages comprise the plurality of syntax elements;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリーム又は前記ビデオデータストリームの前記一部の前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいてかつ前記ビデオデータストリーム又は前記ビデオデータストリームの前記一部の前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of the one or more syntax elements of the plurality of syntax elements in each of the scalable nested supplemental enhancement information messages of the video data stream or the portion of the video data stream; and defined to have the same size in each of the scalable non-nested Supplemental Enhancement Information messages of the video data stream or the portion of the video data stream;
態様１７１に記載のビデオデコーダ（３００）。 The video decoder (300) according to aspect 171.

＜態様１７３＞<Aspect 173>
前記ビデオデータストリームが複数のアクセスユニットを含み、前記複数のアクセスユニットの各アクセスユニットが前記ビデオの複数のピクチャのうちの１つに割り当てられ、 said video data stream comprising a plurality of access units, each access unit of said plurality of access units being assigned to one of a plurality of pictures of said video;
前記ビデオデータストリームの前記一部は、前記ビデオデータストリームの前記複数のアクセスユニットのうちの１つのアクセスユニットであり、 said portion of said video data stream is one access unit of said plurality of access units of said video data stream;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記アクセスユニットの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the access unit;
態様１７１に記載のビデオデコーダ（３００）。 The video decoder (300) according to aspect 171.

＜態様１７４＞<Aspect 174>
前記ビデオデータストリームは、１つ以上のスケーラブルネストされない補足拡張情報メッセージを含み、 the video data stream includes one or more scalable non-nested supplemental enhancement information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージが前記複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages comprise the plurality of syntax elements;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記アクセスユニットの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて及び前記アクセスユニットの前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of said one or more syntax elements of said plurality of syntax elements is defined to have the same size in each of said scalable nested supplemental enhancement information messages of said access unit and in each of said scalable non-nested supplemental enhancement information messages of said access unit,
態様１７３に記載のビデオデコーダ（３００）。 The video decoder (300) according to aspect 173.

＜態様１７５＞<Aspect 175>
前記ビデオデータストリームの前記一部は、前記ビデオデータストリームの符号化ビデオシーケンスであり、 said portion of said video data stream is an encoded video sequence of said video data stream;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記符号化ビデオシーケンスの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the encoded video sequence,
態様１７１に記載のビデオデコーダ（３００）。 The video decoder (300) according to aspect 171.

＜態様１７６＞<Aspect 176>
前記ビデオデータストリームは、１つ以上のスケーラブルネストされない補足拡張情報メッセージを含み、 the video data stream includes one or more scalable non-nested supplemental enhancement information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージ及び前記１つ以上のスケーラブルネストされない補足拡張情報メッセージが前記複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages and the one or more scalable non-nested supplemental enhancement information messages comprise the plurality of syntax elements;
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記符号化ビデオシーケンスの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて及び前記符号化ビデオシーケンスの前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、 each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the encoded video sequence and in each of the scalable non-nested supplemental enhancement information messages of the encoded video sequence,
態様１７５に記載のビデオデコーダ（３００）。 The video decoder (300) according to aspect 175.

＜態様１７７＞<Aspect 177>
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリームの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、態様１７１に記載のビデオデコーダ（３００）。 The video decoder (300) according to aspect 171, wherein each syntax element of the one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream.

＜態様１７８＞<Aspect 178>
前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリームの前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて及び前記ビデオデータストリームの前記スケーラブルネストされない補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義される、態様１７７に記載のビデオデコーダ（３００）。 The video decoder (300) according to aspect 177, wherein each syntax element of said one or more syntax elements of said plurality of syntax elements is defined to have the same size in each of said scalable nested supplemental enhancement information messages of said video data stream and in each of said scalable non-nested supplemental enhancement information messages of said video data stream.

＜態様１７９＞<Aspect 179>
前記ビデオデータストリーム又は前記ビデオデータストリームの前記一部は、少なくとも１つのバッファリング期間補足拡張情報メッセージを含み、前記バッファリング期間補足拡張情報メッセージは、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素の各シンタックス要素の前記サイズを定義する、態様１７１～１７８のいずれかに記載のビデオデコーダ（３００）。 The video decoder (300) according to any of aspects 171-178, wherein the video data stream or the portion of the video data stream includes at least one buffering period supplemental enhancement information message, and the buffering period supplemental enhancement information message defines the size of each syntax element of the one or more syntax elements of the plurality of syntax elements.

＜態様１８０＞<Aspect 180>
前記バッファリング期間補足拡張情報メッセージは、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素のそれぞれのシンタックス要素ごとに前記サイズを定義するために、 wherein, to define the size for each syntax element of the one or more syntax elements of the plurality of syntax elements, the buffering period supplemental enhancement information message comprises at least one of the following:
ｂｐ＿ｃｐｂ＿ｉｎｉｔｉａｌ＿ｒｅｍｏｖａｌ＿ｄｅｌａｙ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 bp_cpb_initial_removal_delay_length_minus1 element,
ｂｐ＿ｃｐｂ＿ｒｅｍｏｖａｌ＿ｄｅｌａｙ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 bp_cpb_removal_delay_length_minus1 element,
ｂｐ＿ｄｐｂ＿ｏｕｔｐｕｔ＿ｄｅｌａｙ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 bp_dpb_output_delay_length_minus1 element,
ｂｐ＿ｄｕ＿ｃｐｂ＿ｒｅｍｏｖａｌ＿ｄｅｌａｙ＿ｉｎｃｒｅｍｅｎｔ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素、 bp_du_cpb_removal_delay_increment_length_minus1 element,
ｂｐ＿ｄｐｂ＿ｏｕｔｐｕｔ＿ｄｅｌａｙ＿ｄｕ＿ｌｅｎｇｔｈ＿ｍｉｎｕｓ１要素 bp_dpb_output_delay_du_length_minus1 element.
のうちの少なくとも１つを含む、態様１７９に記載のビデオデコーダ（３００）。 The video decoder (300) according to aspect 179, comprising at least one of the elements listed above.

＜態様１８１＞<Aspect 181>
スケーラブルネストされたバッファリング期間補足拡張情報メッセージを含む、前記ビデオデータストリームの複数のアクセスユニットのそれぞれのアクセスユニットごとに、前記アクセスユニットは、スケーラブルネストされないバッファリング期間補足拡張情報メッセージも含む、態様１７９又は１８０に記載のビデオデコーダ（３００）。 The video decoder (300) according to aspect 179 or 180, wherein for each access unit of a plurality of access units of said video data stream that comprises a scalable nested buffering period supplemental enhancement information message, said access unit also comprises a scalable non-nested buffering period supplemental enhancement information message.

＜態様１８２＞<Aspect 182>
スケーラブルネストされたバッファリング期間補足拡張情報メッセージを含む、前記ビデオデータストリームの複数の単層アクセスユニットのそれぞれの単層アクセスユニットごとに、前記単層アクセスユニットは、スケーラブルネストされないバッファリング期間補足拡張情報メッセージも含む、態様１７９又は１８０に記載のビデオデコーダ（３００）。 The video decoder (300) according to aspect 179 or 180, wherein for each single-layer access unit of a plurality of single-layer access units of said video data stream that comprises a scalable nested buffering period supplemental enhancement information message, said single-layer access unit also comprises a scalable non-nested buffering period supplemental enhancement information message.

＜態様１８３＞<Aspect 183>
態様１５９～１７０のいずれかに記載の装置（２００）と、 A system comprising: an apparatus (200) according to any of aspects 159-170; and
態様１７１～１８２のいずれかに記載のビデオデコーダ（３００）と、 a video decoder (300) according to any of aspects 171-182,
を備え、 wherein
態様１７１～１８２のいずれかに記載のビデオデコーダ（３００）は、態様１５９～１７０のいずれかに記載の装置（２００）の前記出力ビデオデータストリームを受信するように構成され、 the video decoder (300) according to any of aspects 171-182 is configured to receive said output video data stream of the apparatus (200) according to any of aspects 159-170, and
態様１７１～１８２のいずれかに記載のビデオデコーダ（３００）は、態様１５９～１７０のいずれかに記載の装置（２００）の前記出力ビデオデータストリームから前記ビデオを復号するように構成される、 the video decoder (300) according to any of aspects 171-182 is configured to decode said video from said output video data stream of the apparatus (200) according to any of aspects 159-170.
システム。 system.

＜態様１８４＞<Aspect 184>
前記システムは、態様１４７～１５８のいずれかに記載のビデオエンコーダ（１００）を更に備え、 the system further comprises a video encoder (100) according to any of aspects 147-158,
態様１５９～１７０のいずれかに記載の装置（２００）は、前記入力ビデオデータストリームとして、態様１４７～１５８のいずれかに記載のビデオエンコーダ（１００）から前記ビデオデータストリームを受信するように構成される、 the apparatus (200) according to any of aspects 159-170 is configured to receive, as said input video data stream, said video data stream from the video encoder (100) according to any of aspects 147-158,
態様１８３に記載のシステム。 The system according to aspect 183.

＜態様１８５＞<Aspect 185>
ビデオをビデオデータストリームに符号化するための方法であって、 A method for encoding video into a video data stream, comprising:
前記方法は、前記ビデオデータストリームが１つ以上のスケーラブルネストされた補足拡張情報メッセージを含むように前記ビデオデータストリームを生成するステップを含み、 The method includes generating the video data stream such that the video data stream includes one or more scalable nested supplemental enhancement information messages;
前記方法は、前記１つ以上のスケーラブルネストされた補足拡張情報メッセージが複数のシンタックス要素を含むように、前記ビデオデータストリームを生成するステップを含み、 The method includes generating the video data stream such that the one or more scalable nested supplemental enhancement information messages include a plurality of syntax elements;
前記方法は、前記複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素が前記ビデオデータストリーム又は前記ビデオデータストリームの一部の前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有すると定義されるように、前記ビデオデータストリームを生成するステップを含む、 the method comprises generating the video data stream such that each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream;
方法。 Method.

＜態様１８６＞<Aspect 186>
入力ビデオデータストリームを受信するための方法であって、前記入力ビデオデータストリームにはビデオが符号化され、 A method for receiving an input video data stream, wherein a video is encoded into said input video data stream,
前記方法は、前記入力ビデオデータストリームから出力ビデオデータストリームを生成するステップを含み、 The method includes generating an output video data stream from the input video data stream;
前記ビデオデータストリームは、１つ以上のスケーラブルネストされた補足拡張情報メッセージを含み、 the video data stream includes one or more scalable nested supplemental enhancement information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージは複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages comprise a plurality of syntax elements;
前記複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリーム又は前記ビデオデータストリームの一部の前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義され、 each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream;
前記方法は、前記１つ以上のスケーラブルネストされた補足拡張情報メッセージを処理するステップを含む、 the method includes processing the one or more scalable nested supplemental enhancement information messages;
方法。 Method.

＜態様１８７＞<Aspect 187>
ビデオを格納したビデオデータストリームを受信するための方法であって、 A method for receiving a video data stream containing video, comprising:
前記方法は、前記ビデオデータストリームから前記ビデオを復号するステップを含み、 The method includes decoding the video from the video data stream;
前記ビデオデータストリームは、１つ以上のスケーラブルネストされた補足拡張情報メッセージを含み、 the video data stream includes one or more scalable nested supplemental enhancement information messages;
前記１つ以上のスケーラブルネストされた補足拡張情報メッセージは複数のシンタックス要素を含み、 the one or more scalable nested supplemental enhancement information messages comprise a plurality of syntax elements;
前記複数のシンタックス要素のうちの１つ以上のシンタックス要素の各シンタックス要素は、前記ビデオデータストリーム又は前記ビデオデータストリームの一部の前記スケーラブルネストされた補足拡張情報メッセージのそれぞれにおいて同じサイズを有するように定義され、 each syntax element of one or more syntax elements of the plurality of syntax elements is defined to have the same size in each of the scalable nested supplemental enhancement information messages of the video data stream or of a portion of the video data stream;
前記ビデオを復号するステップは、前記複数のシンタックス要素のうちの前記１つ以上のシンタックス要素に応じて行なわれる、 decoding the video is performed in response to the one or more syntax elements of the plurality of syntax elements;
方法。 Method.

＜態様１８８＞<Aspect 188>
コンピュータ又は信号プロセッサで実行されるときに態様１８５～１８７のいずれかに記載の方法を実施するためのコンピュータプログラム。 A computer program for implementing the method of any of aspects 185-187 when run on a computer or signal processor.

Claims

1. A video decoder (300) for receiving a video data stream having a video stored therein, wherein
the video decoder (300) is configured to decode the video from the video data stream,
the video decoder (300) is configured to decode the video depending on an access unit removal time of a current picture of a plurality of pictures of the video from a coded picture buffer, and
the video decoder (300) is configured to decode the video depending on an indication of whether or not to use coded picture buffer delay offset information for determining the access unit removal time of the current picture from the coded picture buffer.

2. The video decoder (300) of claim 1, wherein the video decoder (300) is configured to determine, depending on a concatenation flag, whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of the current picture.

3. The video decoder (300) of claim 2, wherein
if the concatenation flag is set to a first value, the video decoder (300) is configured to use the coded picture buffer delay offset information to determine one or more access unit removal times, and
if the concatenation flag is set to a second value different from the first value, the video decoder (300) is configured not to use the coded picture buffer delay offset information to determine the one or more access unit removal times.
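The flag-dependent behaviour recited here can be pictured with a short sketch. It is only an illustration under stated assumptions: the claims do not say which flag value is the "first value" (so it is a parameter below), and subtracting the offset from the removal delay is an assumed way of "using" the offset information, not a formula taken from this publication. The function names `uses_cpb_delay_offset` and `effective_removal_delay` are hypothetical.

```python
def uses_cpb_delay_offset(concatenation_flag: int, first_value: int = 0) -> bool:
    """Decide whether the offset information is used.

    `first_value` is an assumption: the claims only require two distinct
    flag values, one selecting use of the offset and one not.
    """
    return concatenation_flag == first_value


def effective_removal_delay(cpb_removal_delay: int,
                            cpb_delay_offset: int,
                            concatenation_flag: int,
                            first_value: int = 0) -> int:
    """Return the removal delay (in clock ticks) used for the current picture.

    Assumption for illustration: "using" the offset means subtracting it
    from the signalled removal delay.
    """
    if uses_cpb_delay_offset(concatenation_flag, first_value):
        return cpb_removal_delay - cpb_delay_offset
    return cpb_removal_delay
```

For example, with a signalled delay of 10 ticks and an offset of 3 ticks, the sketch yields 7 ticks when the flag equals the assumed first value and 10 ticks otherwise.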

4. The video decoder (300) of any of claims 1 to 3, wherein the access unit removal time of at least one of the plurality of pictures of the video from the coded picture buffer depends on the coded picture buffer delay offset information.

5. The video decoder (300) of any of the preceding claims, wherein the video decoder (300) is configured to decode the video depending on whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of the current picture, and wherein the access unit removal time of the current picture depends on the position of the current picture within the video.

6. A video encoder (100), wherein
the video encoder (100) is configured to encode a video into a video data stream,
the video encoder (100) is configured to generate the video data stream such that the video data stream comprises coded picture buffer delay offset information, and
the video encoder (100) is configured to generate the video data stream such that the video data stream comprises an indication of whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of a current picture from a coded picture buffer used to decode the video at the decoder side.

7. The video encoder (100) of claim 6, wherein the video encoder (100) is configured to generate the video data stream such that the video data stream includes a concatenation flag.

8. The video encoder (100) of claim 7, wherein
if the concatenation flag is set to a first value, the concatenation flag indicates that the coded picture buffer delay offset information is to be used to determine one or more access unit removal times, and
if the concatenation flag is set to a second value different from the first value, the concatenation flag indicates that the indicated offset is not used to determine the one or more access unit removal times.

9. The video encoder (100) of any of claims 6 to 8, wherein the video encoder (100) is configured to generate the video data stream such that the video data stream comprises the coded picture buffer delay offset information, which can be used to determine an access unit removal time of at least one of the plurality of pictures of the video from the coded picture buffer.

10. The video encoder (100) of any of claims 6 to 9, wherein the video encoder (100) is configured to generate the video data stream such that the video data stream comprises information on the position of the current picture within the video, which can be used to determine the access unit removal time of the current picture.

11. A video data stream, wherein
a video is encoded in the video data stream,
the video data stream comprises coded picture buffer delay offset information, and
the video data stream comprises an indication of whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of a current picture from a coded picture buffer used to decode the video at the decoder side.

12. The video data stream of claim 11, wherein the video data stream includes a concatenation flag.

13. The video data stream of claim 12, wherein
if the concatenation flag is set to a first value, the concatenation flag indicates that the coded picture buffer delay offset information is to be used to determine one or more access unit removal times, and
if the concatenation flag is set to a second value different from the first value, the concatenation flag indicates that the indicated offset is not used to determine the one or more access unit removal times.

14. The video data stream of claim 13, wherein the video data stream comprises the coded picture buffer delay offset information, which can be used to determine an access unit removal time of at least one of the plurality of pictures of the video from the coded picture buffer.

15. The video data stream of any of claims 11 to 14, wherein the video data stream comprises information on the position of the current picture within the video, which can be used to determine the access unit removal time of the current picture.

16. A method for receiving a video data stream having a video stored therein, wherein
the method comprises decoding the video from the video data stream,
decoding the video is performed depending on an access unit removal time of a current picture of a plurality of pictures of the video from a coded picture buffer, and
decoding the video is performed depending on an indication of whether or not to use coded picture buffer delay offset information for determining the access unit removal time of the current picture from the coded picture buffer.

17. A method for encoding a video into a video data stream, wherein
the method comprises generating the video data stream such that the video data stream comprises coded picture buffer delay offset information, and
the method comprises generating the video data stream such that the video data stream comprises an indication of whether or not to use the coded picture buffer delay offset information for determining the access unit removal time of a current picture from a coded picture buffer used to decode the video at the decoder side.

18. A computer program for implementing the method of claim 16 or 17 when run on a computer or signal processor.