JP2022516114A

JP2022516114A - Video encoders, video decoders, and corresponding methods

Info

Publication number: JP2022516114A
Application number: JP2021538019A
Authority: JP
Inventors: Ye-Kui Wang; FNU Hendry; Maxim Sychev
Original assignee: Huawei Technologies Co., Ltd.
Priority date: 2018-12-28
Filing date: 2019-12-27
Publication date: 2022-02-24
Anticipated expiration: 2039-12-27
Also published as: KR20210107090A; JP2023090749A; SG11202107047UA; JP7285934B2; BR112021012649A2; MX2021007926A; EP3903277A1; AU2019414459A1; WO2020140059A1; AU2019414459B2; WO2020140057A1; CN113261030A; EP3903277A4

Abstract

A video coding mechanism is disclosed. The mechanism includes partitioning a picture into a plurality of first level tiles. A subset of the first level tiles is partitioned into a plurality of second level tiles. The first level tiles and the second level tiles are assigned to one or more tile groups such that all second level tiles created from a single first level tile are assigned to the same tile group. The first level tiles and the second level tiles are encoded into a bitstream. The bitstream is stored for communication toward a decoder.

Description

Cross-reference to related applications
This patent application claims the benefit of U.S. Provisional Patent Application No. 62/786,167, filed December 28, 2018 by Ye-Kui Wang et al. and titled "Flexible Tiling in Video Coding," which is hereby incorporated by reference.

The present disclosure relates generally to video coding, and specifically to a flexible video tiling scheme that supports multiple tiles with different resolutions in the same picture.

The amount of video data needed to depict even a relatively short video can be substantial, which may cause difficulties when the data is to be streamed or otherwise communicated across a communications network with limited bandwidth capacity. Video data is therefore generally compressed before being communicated across modern telecommunications networks. The size of a video can also be an issue when the video is stored on a storage device, because memory resources may be limited. Video compression devices often use software and/or hardware at the source to code the video data prior to transmission or storage, thereby decreasing the quantity of data needed to represent digital video images. The compressed data is then received at the destination by a video decompression device that decodes the video data. With limited network resources and ever-increasing demand for higher video quality, improved compression and decompression techniques that improve the compression ratio with little sacrifice in image quality are desirable.

In one embodiment, the disclosure includes a method implemented in an encoder. The method comprises: partitioning, by a processor of the encoder, a picture into a plurality of first level tiles; partitioning, by the processor, a subset of the first level tiles into a plurality of second level tiles; assigning, by the processor, the first level tiles and the second level tiles to one or more tile groups such that each tile group contains a number of first level tiles, one or more consecutive sequences of second level tiles where each sequence of second level tiles is split from a single first level tile, or combinations thereof; encoding, by the processor, the first level tiles and the second level tiles into a bitstream; and storing the bitstream in a memory of the encoder for communication toward a decoder. Some streaming applications (e.g., virtual reality (VR) and teleconferencing) can be improved if a single picture containing multiple regions encoded at different resolutions can be sent. Some slicing and tiling mechanisms, such as raster scan based slicing and/or tiling, may not support such functionality because tiles at different resolutions may be treated differently. For example, a tile at a first resolution may contain a single slice of data, while a tile at a second resolution may carry multiple slices of data due to differences in pixel density. To support this functionality, a flexible tiling scheme including first level tiles and second level tiles may be employed. Second level tiles are created by partitioning first level tiles. This tiling scheme allows a first level tile to contain one slice of data at a first resolution, and a first level tile that contains second level tiles to contain a plurality of slices at a second resolution. Such a flexible tiling scheme allows an encoder/decoder (codec) to support pictures containing multiple resolutions, and hence increases the functionality of both the encoder and the decoder. The present disclosure describes a mechanism for integrating tile groups into the flexible tiling scheme. A tile group can contain first level tiles and/or complete sets of the second level tiles partitioned from one or more first level tiles. This approach prevents second level tiles from a single first level tile from being split into different tile groups. Accordingly, the disclosed mechanism allows the tiles of the flexible tiling scheme to be included in tile groups, which in turn allows coding tools to be applied to the various tiles on a tile group basis. Preventing second level tiles from a single first level tile from being split into different groups reduces the complexity of the resulting flexible tiling scheme with tile groups. Hence, the disclosure further increases the functionality of both the encoder and the decoder while reducing processor and/or memory resource usage.

In one embodiment, the disclosure includes a method implemented in an encoder. The method comprises: partitioning, by a processor of the encoder, a picture into a plurality of first level tiles; partitioning, by the processor, a subset of the first level tiles into a plurality of second level tiles; assigning, by the processor, the first level tiles and the second level tiles to one or more tile groups such that all second level tiles created from a single first level tile are assigned to the same tile group; encoding, by the processor, the first level tiles and the second level tiles into a bitstream; and storing the bitstream in a memory of the encoder for communication toward a decoder. Some streaming applications (e.g., VR and teleconferencing) can be improved if a single picture containing multiple regions encoded at different resolutions can be sent. Some slicing and tiling mechanisms, such as raster scan based slicing and/or tiling, may not support such functionality because tiles at different resolutions may be treated differently. For example, a tile at a first resolution may contain a single slice of data, while a tile at a second resolution may carry multiple slices of data due to differences in pixel density. To support this functionality, a flexible tiling scheme including first level tiles and second level tiles may be employed. Second level tiles are created by partitioning first level tiles. This tiling scheme allows a first level tile to contain one slice of data at a first resolution, and a first level tile that contains second level tiles to contain a plurality of slices at a second resolution. Such a flexible tiling scheme allows a codec to support pictures containing multiple resolutions, and hence increases the functionality of both the encoder and the decoder. The present disclosure describes a mechanism for integrating tile groups into the flexible tiling scheme. A tile group can contain first level tiles and/or complete sets of the second level tiles partitioned from one or more first level tiles. This approach prevents second level tiles from a single first level tile from being split into different tile groups. Accordingly, the disclosed mechanism allows the tiles of the flexible tiling scheme to be included in tile groups, which in turn allows coding tools to be applied to the various tiles on a tile group basis. Preventing second level tiles from a single first level tile from being split into different groups reduces the complexity of the resulting flexible tiling scheme with tile groups. Hence, the disclosure further increases the functionality of both the encoder and the decoder while reducing processor and/or memory resource usage.
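The assignment rule in the embodiment above can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the `Tile` record, the `assign_tile_groups` and `siblings_share_group` helpers, and the numeric ids are all hypothetical names chosen for illustration. The sketch assumes a tile group is chosen per first level tile and that every second level tile inherits the group of the first level tile it was split from, which is one simple way to satisfy the constraint.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Tile:
    tile_id: int
    level: int                    # 1 = first level tile, 2 = second level tile
    parent: Optional[int] = None  # id of the first level tile a level-2 tile was split from


def assign_tile_groups(tiles: List[Tile], group_of_first_level: Dict[int, int]) -> Dict[int, int]:
    """Return a mapping tile_id -> tile group id.

    Each second level tile inherits the group chosen for its parent first
    level tile, so siblings can never end up in different groups.
    """
    assignment: Dict[int, int] = {}
    for tile in tiles:
        owner = tile.tile_id if tile.level == 1 else tile.parent
        assignment[tile.tile_id] = group_of_first_level[owner]
    return assignment


def siblings_share_group(tiles: List[Tile], assignment: Dict[int, int]) -> bool:
    """Check that all second level tiles with the same parent share one group."""
    group_of_parent: Dict[int, int] = {}
    for tile in tiles:
        if tile.level != 2:
            continue
        group = assignment[tile.tile_id]
        if group_of_parent.setdefault(tile.parent, group) != group:
            return False
    return True


# First level tile 1 has been split into second level tiles 10 and 11.
tiles = [Tile(0, 1), Tile(10, 2, parent=1), Tile(11, 2, parent=1), Tile(2, 1)]
groups = assign_tile_groups(tiles, {0: 0, 1: 0, 2: 1})
```

Here `siblings_share_group(tiles, groups)` holds by construction, and moving tile 11 into a different group than tile 10 makes the check fail.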

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the first level tiles outside the subset contain picture data at a first resolution, and the second level tiles contain picture data at a second resolution different from the first resolution.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that each first level tile in the subset of first level tiles includes two or more complete second level tiles.
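As a rough illustration of a first level tile containing two or more complete second level tiles, the following Python sketch splits a rectangle into a grid of smaller rectangles. The `split_tile` helper and the `(x, y, width, height)` representation are hypothetical, and the uniform grid split is an assumption made for brevity; the text above does not require second level tiles to be equal in size.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), e.g. in CTU units


def split_tile(tile: Rect, sub_rows: int, sub_cols: int) -> List[Rect]:
    """Split a first level tile into sub_rows x sub_cols second level tiles.

    The result is returned in raster order. At least two second level
    tiles are required, and each one lies completely inside the parent.
    """
    x, y, w, h = tile
    if sub_rows < 1 or sub_cols < 1 or sub_rows * sub_cols < 2:
        raise ValueError("a split must produce two or more second level tiles")
    if w % sub_cols or h % sub_rows:
        raise ValueError("this sketch assumes the tile divides evenly")
    sw, sh = w // sub_cols, h // sub_rows
    return [
        (x + c * sw, y + r * sh, sw, sh)
        for r in range(sub_rows)
        for c in range(sub_cols)
    ]


second_level = split_tile((4, 0, 4, 4), 2, 2)
```

The four resulting rectangles tile the parent exactly, so their areas sum to the parent's area.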

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the first level tiles and the second level tiles are encoded according to a scan order, where encoding according to the scan order includes: encoding first level tiles in raster scan order; pausing raster scan order encoding of first level tiles when one of the second level tiles is encountered; and encoding all consecutive second level tiles in raster scan order before continuing raster scan order encoding of first level tiles.
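The scan order described above can be sketched as a nested raster traversal. This Python example is only an illustration under stated assumptions: first level tiles form a regular `rows` x `cols` grid, `splits` maps the grid position of a split first level tile to the shape of its second level grid, and the string labels are invented for readability. The function name `coding_order` is hypothetical.

```python
from typing import Dict, List, Tuple


def coding_order(rows: int, cols: int,
                 splits: Dict[Tuple[int, int], Tuple[int, int]]) -> List[str]:
    """List tiles in the order they would be coded.

    First level tiles are visited in raster order. When the visited
    position holds second level tiles, the first level scan pauses, all of
    those second level tiles are emitted in their own raster order, and the
    first level scan then resumes.
    """
    order: List[str] = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in splits:
                sub_rows, sub_cols = splits[(r, c)]
                for sr in range(sub_rows):
                    for sc in range(sub_cols):
                        order.append(f"L2({r},{c}).({sr},{sc})")
            else:
                order.append(f"L1({r},{c})")
    return order


# A 2x2 grid of first level tiles whose top-right tile is split 2x1.
order = coding_order(2, 2, {(0, 1): (2, 1)})
```

In this example the two second level tiles of position (0, 1) are coded back to back, between the first level tiles at (0, 0) and (1, 0).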

Optionally, in any of the preceding aspects, another implementation of the aspect provides that all second level tiles partitioned from a current first level tile are encoded before any second level tiles partitioned from a subsequent first level tile are encoded.
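One way to read the ordering constraint above is that, in coding order, the second level tiles of a given first level tile form one uninterrupted run. The following Python sketch checks that property. It is an interpretation offered for illustration, not the disclosure's procedure; the function name `parents_are_contiguous` and the encoding of the order as a list of parent ids (with `None` for an unsplit first level tile) are hypothetical.

```python
from typing import List, Optional


def parents_are_contiguous(order: List[Optional[int]]) -> bool:
    """Check that second level tiles sharing a parent are coded back to back.

    order[i] is the parent id of the i-th coded tile, or None when that
    tile is an unsplit first level tile.
    """
    finished = set()                # parents whose run has already ended
    current: Optional[int] = None   # parent of the run in progress, if any
    for parent in order:
        if parent != current:
            if current is not None:
                finished.add(current)
            if parent in finished:
                return False        # a finished parent's tiles show up again
            current = parent
    return True


ok = parents_are_contiguous([None, 1, 1, None, 2, 2])
interleaved = parents_are_contiguous([1, 2, 1])
```

The first order keeps each parent's tiles together, while the second interleaves tiles from parents 1 and 2 and is rejected.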

Optionally, in any of the preceding aspects, another implementation of the aspect provides that each of the one or more tile groups is constrained such that all tiles in an assigned tile group cover a rectangular portion of the picture.
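A simple way to test the rectangular constraint is to compare the total area of a group's tiles with the area of their bounding box. The Python sketch below does this under the assumption that tiles never overlap, which holds for tiles produced by partitioning a picture. The `covers_rectangle` name and the `(x, y, width, height)` tuple layout are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def covers_rectangle(tiles: List[Rect]) -> bool:
    """Return True if non-overlapping tiles exactly fill their bounding box."""
    if not tiles:
        return False
    left = min(x for x, _, _, _ in tiles)
    top = min(y for _, y, _, _ in tiles)
    right = max(x + w for x, _, w, _ in tiles)
    bottom = max(y + h for _, y, _, h in tiles)
    covered = sum(w * h for _, _, w, h in tiles)
    # With no overlaps, equal areas mean the bounding box has no holes.
    return covered == (right - left) * (bottom - top)


side_by_side = covers_rectangle([(0, 0, 2, 2), (2, 0, 2, 2)])
l_shaped = covers_rectangle([(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 2, 2)])
```

Two tiles placed side by side pass the check, while three tiles arranged in an L shape leave a gap in the bounding box and fail it.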

In one embodiment, the disclosure includes a method implemented in a decoder. The method comprises: receiving, by a processor of the decoder via a receiver, a bitstream including a picture partitioned into a plurality of first level tiles, wherein a subset of the first level tiles is further partitioned into a plurality of second level tiles, and wherein the first level tiles and the second level tiles are assigned to one or more tile groups such that each tile group contains a number of first level tiles, one or more consecutive sequences of second level tiles where each sequence of second level tiles is split from a single first level tile, or combinations thereof; decoding, by the processor, the first level tiles and the second level tiles based on the one or more tile groups; and generating, by the processor, a reconstructed video sequence for display based on the decoded first level tiles and second level tiles. Some streaming applications (e.g., VR and teleconferencing) can be improved if a single picture containing multiple regions encoded at different resolutions can be sent. Some slicing and tiling mechanisms, such as raster scan based slicing and/or tiling, may not support such functionality because tiles at different resolutions may be treated differently. For example, a tile at a first resolution may contain a single slice of data, while a tile at a second resolution may carry multiple slices of data due to differences in pixel density. To support this functionality, a flexible tiling scheme including first level tiles and second level tiles may be employed. Second level tiles are created by partitioning first level tiles. This tiling scheme allows a first level tile to contain one slice of data at a first resolution, and a first level tile that contains second level tiles to contain a plurality of slices at a second resolution. Such a flexible tiling scheme allows a codec to support pictures containing multiple resolutions, and hence increases the functionality of both the encoder and the decoder. The present disclosure describes a mechanism for integrating tile groups into the flexible tiling scheme. A tile group can contain first level tiles and/or complete sets of the second level tiles partitioned from one or more first level tiles. This approach prevents second level tiles from a single first level tile from being split into different tile groups. Accordingly, the disclosed mechanism allows the tiles of the flexible tiling scheme to be included in tile groups, which in turn allows coding tools to be applied to the various tiles on a tile group basis. Preventing second level tiles from a single first level tile from being split into different groups reduces the complexity of the resulting flexible tiling scheme with tile groups. Hence, the disclosure further increases the functionality of both the encoder and the decoder while reducing processor and/or memory resource usage.

In one embodiment, the disclosure includes a method implemented in a decoder. The method comprises: receiving, by a processor of the decoder via a receiver, a bitstream including a picture partitioned into a plurality of first level tiles, wherein a subset of the first level tiles is further partitioned into a plurality of second level tiles, and wherein the first level tiles and the second level tiles are assigned to one or more tile groups such that all second level tiles created from a single first level tile are assigned to the same tile group; decoding, by the processor, the first level tiles and the second level tiles based on the one or more tile groups; and generating, by the processor, a reconstructed video sequence for display based on the decoded first level tiles and second level tiles. Some streaming applications (e.g., VR and teleconferencing) can be improved if a single picture containing multiple regions encoded at different resolutions can be sent. Some slicing and tiling mechanisms, such as raster scan based slicing and/or tiling, may not support such functionality because tiles at different resolutions may be treated differently. For example, a tile at a first resolution may contain a single slice of data, while a tile at a second resolution may carry multiple slices of data due to differences in pixel density. To support this functionality, a flexible tiling scheme including first level tiles and second level tiles may be employed. Second level tiles are created by partitioning first level tiles. This tiling scheme allows a first level tile to contain one slice of data at a first resolution, and a first level tile that contains second level tiles to contain a plurality of slices at a second resolution. Such a flexible tiling scheme allows a codec to support pictures containing multiple resolutions, and hence increases the functionality of both the encoder and the decoder. The present disclosure describes a mechanism for integrating tile groups into the flexible tiling scheme. A tile group can contain first level tiles and/or complete sets of the second level tiles partitioned from one or more first level tiles. This approach prevents second level tiles from a single first level tile from being split into different tile groups. Accordingly, the disclosed mechanism allows the tiles of the flexible tiling scheme to be included in tile groups, which in turn allows coding tools to be applied to the various tiles on a tile group basis. Preventing second level tiles from a single first level tile from being split into different groups reduces the complexity of the resulting flexible tiling scheme with tile groups. Hence, the disclosure further increases the functionality of both the encoder and the decoder while reducing processor and/or memory resource usage.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the first level tiles and the second level tiles are decoded according to a scan order, where decoding according to the scan order includes: decoding first level tiles in raster scan order; pausing raster scan order decoding of first level tiles when one of the second level tiles is encountered; and decoding all consecutive second level tiles in raster scan order before continuing raster scan order decoding of first level tiles.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that all second level tiles partitioned from a current first level tile are decoded before any second level tiles partitioned from a subsequent first level tile are decoded.

In one embodiment, the disclosure includes a video coding device comprising a processor, a receiver coupled to the processor, and a transmitter coupled to the processor, wherein the processor, receiver, and transmitter are configured to perform the method of any of the preceding aspects.

In one embodiment, the disclosure includes a non-transitory computer-readable medium comprising a computer program product for use by a video coding device, the computer program product comprising computer-executable instructions stored on the non-transitory computer-readable medium such that, when executed by a processor, they cause the video coding device to perform the method of any of the preceding aspects.

In one embodiment, the disclosure includes an encoder comprising: a partitioning means for partitioning a picture into a plurality of first level tiles and partitioning a subset of the first level tiles into a plurality of second level tiles; an assigning means for assigning the first level tiles and the second level tiles to one or more tile groups such that all second level tiles created from a single first level tile are assigned to the same tile group; an encoding means for encoding the first level tiles and the second level tiles into a bitstream; and a storing means for storing the bitstream for communication toward a decoder.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the encoder is further configured to perform the method of any of the preceding aspects.

In one embodiment, the disclosure includes a decoder comprising: a receiving means for receiving a bitstream including a picture partitioned into a plurality of first level tiles, wherein a subset of the first level tiles is further partitioned into a plurality of second level tiles, and wherein the first level tiles and the second level tiles are assigned to one or more tile groups such that all second level tiles created from a single first level tile are assigned to the same tile group; a decoding means for decoding the first level tiles and the second level tiles based on the one or more tile groups; and a generating means for generating a reconstructed video sequence for display based on the decoded first level tiles and second level tiles.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the decoder is further configured to perform the method of any of the preceding aspects.

For the purpose of clarity, any one of the foregoing embodiments may be combined with any one or more of the other foregoing embodiments to create a new embodiment within the scope of the present disclosure.

These and other features will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings and claims.

For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.

A flowchart of an example method of coding a video signal.
A schematic diagram of an example coding and decoding (codec) system for video coding.
A schematic diagram illustrating an example video encoder.
A schematic diagram illustrating an example video decoder.
A schematic diagram illustrating an example bitstream containing an encoded video sequence.
Five diagrams illustrating an example mechanism for creating an extractor track for combining sub-pictures of multiple resolutions from different bitstreams into a single picture for use in virtual reality (VR) applications.
A diagram illustrating an example video conferencing application that splices pictures of multiple resolutions from different bitstreams into a single picture for display.
Four schematic diagrams illustrating an example flexible video tiling scheme capable of supporting multiple tiles with different resolutions in the same picture.
A schematic diagram of an example video coding device.
A flowchart of an example method of encoding an image by employing a flexible tiling scheme.
A flowchart of an example method of decoding an image by employing a flexible tiling scheme.
A schematic diagram of an example system for coding a video sequence by employing a flexible tiling scheme.

It should be understood at the outset that, although an illustrative implementation of one or more embodiments is provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or in existence. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the exemplary designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.

Various acronyms are employed herein, such as coding tree block (CTB), coding tree unit (CTU), coding unit (CU), coded video sequence (CVS), Joint Video Experts Team (JVET), motion constrained tile set (MCTS), maximum transfer unit (MTU), network abstraction layer (NAL), picture order count (POC), raw byte sequence payload (RBSP), sequence parameter set (SPS), versatile video coding (VVC), and working draft (WD).

Many video compression techniques can be employed to reduce the size of video files with minimal loss of data. For example, video compression techniques can include performing spatial (e.g., intra-picture) prediction and/or temporal (e.g., inter-picture) prediction to reduce or remove data redundancy in video sequences. For block-based video coding, a video slice (e.g., a video picture or a portion of a video picture) may be partitioned into video blocks, which may also be referred to as treeblocks, coding tree blocks (CTBs), coding tree units (CTUs), coding units (CUs), and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are coded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded unidirectional prediction (P) or bidirectional prediction (B) slice of a picture may be coded by employing spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames and/or images, and reference pictures may be referred to as reference frames and/or reference images. Spatial or temporal prediction results in a predictive block representing an image block. Residual data represents pixel differences between the original image block and the predictive block. Accordingly, an inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, and residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain. This results in residual transform coefficients, which may be quantized. The quantized transform coefficients may initially be arranged in a two-dimensional array. The quantized transform coefficients may be scanned in order to produce a one-dimensional vector of transform coefficients. Entropy coding may be applied to achieve even more compression. Such video compression techniques are discussed in greater detail below.
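As a non-limiting illustration of the residual, quantization, and scanning steps described above, the following Python sketch may be considered. The function names, the uniform quantization step, and the diagonal scan pattern are assumptions made for illustration and are not taken from the disclosure; the transform to the transform domain is deliberately omitted to keep the sketch short.

```python
def residual_block(original, prediction):
    """Sample-wise difference between the original block and the predictive block."""
    return [[o - p for o, p in zip(o_row, p_row)]
            for o_row, p_row in zip(original, prediction)]


def quantize(values, step):
    """Uniform quantization (assumed): divide by the step and truncate toward zero."""
    return [[int(v / step) for v in row] for row in values]


def diagonal_scan(block):
    """Flatten a square 2-D array into a 1-D vector, one anti-diagonal at a time.

    The scan pattern is a hypothetical choice; real codecs define their own.
    """
    n = len(block)
    order = sorted(((r, c) for r in range(n) for c in range(n)),
                   key=lambda rc: (rc[0] + rc[1], rc[0]))
    return [block[r][c] for r, c in order]


original = [[10, 12], [9, 20]]
prediction = [[8, 12], [10, 11]]
residual = residual_block(original, prediction)
vector = diagonal_scan(quantize(residual, step=4))
```

A well-chosen prediction leaves small residual values, most of which quantize to zero, which is what makes the subsequent entropy coding effective.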

To ensure that an encoded video can be accurately decoded, video is encoded and decoded according to corresponding video coding standards. Video coding standards include International Telecommunication Union (ITU) Standardization Sector (ITU-T) H.261, International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Motion Picture Experts Group (MPEG)-1 Part 2, ITU-T H.262 or ISO/IEC MPEG-2 Part 2, ITU-T H.263, ISO/IEC MPEG-4 Part 2, Advanced Video Coding (AVC), also known as ITU-T H.264 or ISO/IEC MPEG-4 Part 10, and High Efficiency Video Coding (HEVC), also known as ITU-T H.265 or MPEG-H Part 2. AVC includes extensions such as Scalable Video Coding (SVC), Multiview Video Coding (MVC), Multiview Video Coding plus Depth (MVC+D), and three-dimensional (3D) AVC (3D-AVC). HEVC includes extensions such as Scalable HEVC (SHVC), Multiview HEVC (MV-HEVC), and 3D HEVC (3D-HEVC). The Joint Video Experts Team (JVET) of ITU-T and ISO/IEC has begun developing a video coding standard referred to as Versatile Video Coding (VVC). VVC is included in a Working Draft (WD), which includes JVET-L1001-v5.

In order to code a video image, the image is first partitioned, and the partitions are coded into a bitstream. Various picture partitioning schemes are available. For example, an image can be partitioned into regular slices, dependent slices, and tiles, and/or according to Wavefront Parallel Processing (WPP). For simplicity, HEVC restricts encoders so that only regular slices, dependent slices, tiles, WPP, and combinations thereof can be used when partitioning a slice into groups of CTBs for video coding. Such partitioning can be applied to support maximum transfer unit (MTU) size matching, parallel processing, and reduced end-to-end delay. The MTU denotes the maximum amount of data that can be transmitted in a single packet. If a packet payload exceeds the MTU, that payload is split into multiple packets through a process called fragmentation.

A regular slice, also referred to simply as a slice, is a partitioned portion of an image that can be reconstructed independently from other regular slices within the same picture, notwithstanding some interdependencies due to loop filtering operations. Each regular slice is encapsulated in its own network abstraction layer (NAL) unit for transmission. Further, in-picture prediction (intra sample prediction, motion information prediction, coding mode prediction) and entropy coding dependency across slice boundaries may be disabled to support independent reconstruction. Such independent reconstruction supports parallelization. For example, regular slice based parallelization employs minimal inter-processor or inter-core communication. However, as each regular slice is independent, each slice is associated with a separate slice header. The use of regular slices can incur substantial coding overhead due to the bit cost of the slice header for each slice and due to the lack of prediction across the slice boundaries. Further, regular slices may be employed to support matching for MTU size requirements. Specifically, as a regular slice is encapsulated in a separate NAL unit and can be independently coded, each regular slice should be smaller than the MTU in MTU schemes to avoid breaking the slice into multiple packets. As such, the goal of parallelization and the goal of MTU size matching may place contradicting demands on the slice layout in a picture.
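The tension between MTU size matching and slice header overhead can be sketched as follows. This is a minimal illustration only: the function name `pack_slices`, the greedy packing strategy, and the byte counts are hypothetical and are not part of any standard or of the disclosure.

```python
def pack_slices(ctu_sizes, header_bytes, mtu):
    """Greedily group consecutive CTUs into slices whose size does not exceed the MTU.

    ctu_sizes    -- coded size in bytes of each CTU, in coding order (assumed known)
    header_bytes -- assumed fixed cost of one slice header
    mtu          -- maximum payload of one packet
    Returns a list of slices, each a list of CTU indices.
    """
    slices, current, used = [], [], header_bytes
    for index, size in enumerate(ctu_sizes):
        if header_bytes + size > mtu:
            raise ValueError("a single CTU does not fit in one packet")
        if current and used + size > mtu:
            slices.append(current)              # close the slice before it overflows
            current, used = [], header_bytes    # every new slice pays for a header
        current.append(index)
        used += size
    if current:
        slices.append(current)
    return slices


layout = pack_slices([400, 500, 300, 700, 200], header_bytes=100, mtu=1000)
overhead = len(layout) * 100  # bytes spent on slice headers alone
```

A smaller MTU forces more slices, and each additional slice adds a header and removes prediction across one more boundary, which is the overhead described above.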

Dependent slices are similar to regular slices, but have shortened slice headers and allow partitioning at image treeblock boundaries without breaking in-picture prediction. Accordingly, dependent slices allow a regular slice to be fragmented into multiple NAL units, which provides reduced end-to-end delay by allowing a part of a regular slice to be sent out before the encoding of the entire regular slice is complete.

A tile is a partitioned portion of an image created by horizontal and vertical boundaries that create columns and rows of tiles. Tiles may be coded in raster scan order (left to right and top to bottom). The scan order of CTBs is local within a tile. Accordingly, CTBs in a first tile are coded in raster scan order before proceeding to the CTBs in the next tile. Similar to regular slices, tiles break in-picture prediction dependencies as well as entropy decoding dependencies. However, tiles need not be included in individual NAL units, and hence tiles may not be used for MTU size matching. Each tile can be processed by one processor/core, and the inter-processor/inter-core communication employed for in-picture prediction between processing units decoding neighboring tiles may be limited to conveying a shared slice header (when adjacent tiles are in the same slice) and performing loop filtering related sharing of reconstructed samples and metadata. When more than one tile is included in a slice, the entry point byte offset for each tile other than the first one in the slice may be signaled in the slice header. For each slice and tile, at least one of the following conditions should be fulfilled: 1) all coded treeblocks in a slice belong to the same tile; and 2) all coded treeblocks in a tile belong to the same slice.
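One possible reading of the slice and tile condition stated above can be checked with a short Python sketch. The function name and the per-CTB input representation are hypothetical, and the pairwise interpretation (the condition is evaluated for every slice and tile that share at least one treeblock) is an assumption.

```python
def slice_tile_constraint_ok(ctb_slice, ctb_tile):
    """Check the slice/tile condition for every slice and tile that share a treeblock.

    ctb_slice[i] and ctb_tile[i] give the slice id and tile id of treeblock i.
    """
    tiles_of_slice, slices_of_tile = {}, {}
    for s, t in zip(ctb_slice, ctb_tile):
        tiles_of_slice.setdefault(s, set()).add(t)
        slices_of_tile.setdefault(t, set()).add(s)
    for s, tiles in tiles_of_slice.items():
        for t in tiles:
            slice_in_one_tile = len(tiles_of_slice[s]) == 1   # condition 1
            tile_in_one_slice = len(slices_of_tile[t]) == 1   # condition 2
            if not (slice_in_one_tile or tile_in_one_slice):
                return False
    return True
```

Under this reading, several slices inside one tile and several complete tiles inside one slice are both permitted, while a slice that straddles a tile boundary inside a tile shared with another slice is not.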

In WPP, the image is partitioned into single rows of CTBs. Entropy decoding and prediction mechanisms may use data from CTBs in other rows. Parallel processing is made possible through parallel decoding of CTB rows. For example, a current row may be decoded in parallel with a preceding row. However, decoding of the current row is delayed from the decoding process of the preceding row by two CTBs. This delay ensures that data related to the CTB above and the CTB above and to the right of the current CTB in the current row is available before the current CTB is coded. This approach appears as a wavefront when represented graphically. This staggered start allows for parallelization with up to as many processors/cores as the image contains CTB rows. Because in-picture prediction between neighboring treeblock rows within a picture is permitted, the inter-processor/inter-core communication required to enable in-picture prediction can be substantial. WPP partitioning does not take NAL unit sizes into account. Hence, WPP does not support MTU size matching. However, regular slices can be used in conjunction with WPP, with some coding overhead, to implement MTU size matching as desired.
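The two-CTB delay between consecutive rows can be illustrated by computing the earliest time at which each CTB may start. The sketch below assumes one time unit per CTB and one processor per CTB row; these assumptions, and the function name, are introduced for illustration only.

```python
def wpp_start_times(rows, cols):
    """Earliest start time of each CTB when every CTB row has its own processor.

    A CTB waits for its left neighbor and for the CTB above and to the right
    (or directly above, in the last column). Each CTB is assumed to take one
    time unit.
    """
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ready = []
            if c > 0:
                ready.append(start[r][c - 1] + 1)
            if r > 0:
                ready.append(start[r - 1][min(c + 1, cols - 1)] + 1)
            start[r][c] = max(ready, default=0)
    return start


times = wpp_start_times(rows=3, cols=4)
```

With at least two CTB columns, the CTB at row r and column c starts at time c + 2r under these assumptions, so each row trails the row above it by two CTBs and the schedule forms the diagonal wavefront described above.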

Tiles may also include motion constrained tile sets. A motion constrained tile set (MCTS) is a tile set designed such that associated motion vectors are restricted to point to full-sample locations inside the MCTS and to fractional-sample locations that require only full-sample locations inside the MCTS for interpolation. Further, the usage of motion vector candidates for temporal motion vector prediction derived from blocks outside the MCTS is disallowed. In this way, each MCTS may be independently decoded without the existence of tiles not included in the MCTS. Temporal MCTS supplemental enhancement information (SEI) messages may be used to indicate the existence of MCTSs in the bitstream and to signal the MCTSs. The MCTS SEI message provides supplemental information that can be used in the MCTS sub-bitstream extraction (specified as part of the semantics of the SEI message) to generate a conforming bitstream for an MCTS. The information includes a number of extraction information sets, each defining a number of MCTSs and containing raw byte sequence payload (RBSP) bytes of the replacement video parameter sets (VPSs), sequence parameter sets (SPSs), and picture parameter sets (PPSs) to be used during the MCTS sub-bitstream extraction process. When extracting a sub-bitstream according to the MCTS sub-bitstream extraction process, parameter sets (VPSs, SPSs, and PPSs) may be rewritten or replaced, and slice headers may be updated, because one or all of the slice address related syntax elements (including first_slice_segment_in_pic_flag and slice_segment_address) may take different values in the extracted sub-bitstream.
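The motion vector restriction of an MCTS can be sketched as a bounds check. In the following Python illustration, the function name, the quarter-sample motion vector units, and the 8-tap interpolation filter (which reads 3 samples before and 4 samples after a fractional position) are assumptions chosen for the example rather than values stated in the disclosure.

```python
def mv_inside_mcts(block, mv_qpel, mcts, taps=8):
    """Return True if a motion vector only touches samples inside the MCTS.

    block   -- (x, y, width, height) of the current block in full samples
    mv_qpel -- (mvx, mvy) in quarter-sample units (assumed precision)
    mcts    -- (x0, y0, x1, y1) inclusive bounds of the tile set
    taps    -- assumed interpolation filter length
    """
    x, y, w, h = block
    mvx, mvy = mv_qpel
    x0, y0, x1, y1 = mcts
    left, top = x + (mvx >> 2), y + (mvy >> 2)     # full-sample part (floor)
    right, bottom = left + w - 1, top + h - 1
    if mvx & 3:                                    # fractional horizontal part
        left -= taps // 2 - 1
        right += taps // 2
    if mvy & 3:                                    # fractional vertical part
        top -= taps // 2 - 1
        bottom += taps // 2
    return x0 <= left and right <= x1 and y0 <= top and bottom <= y1
```

A full-sample vector may reach the edge of the tile set, whereas a fractional vector that lands at the same position is rejected because its interpolation would read samples outside the MCTS.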

Various tiling schemes may be employed when partitioning a picture for further encoding. As a particular example, tiles can be assigned to tile groups, which can take the place of slices in some examples. In some examples, each tile group can be extracted independently of other tile groups. Accordingly, tile grouping may support parallelization by allowing each tile group to be assigned to a different processor. Tile groups can be assigned in raster scan order or can be constrained to form a rectangular area within the picture. Explicit tile identifier (ID) signaling may be used to support such tile groups. In some systems, tile IDs are always assigned to be the same as the tile index. Explicit tile ID signaling allows the tile ID to differ from the tile index. Explicit tile ID signaling supports extraction of an MCTS from a bitstream without the need to update tile group headers. It should be noted that the explicit signaling of tile IDs and the corresponding use as addresses of tile groups may be specific to HEVC-style tile structure definitions and signaling. If the tile structure definition and/or signaling is modified, signaling of tile IDs by the explicit tile ID mechanism may be inaccurate and/or inapplicable in some examples. Tile grouping and explicit tile ID signaling can be employed, for example, in cases where a decoder may not wish to decode an entire image. As a particular example, video coding schemes may be employed to support virtual reality (VR) video, which may be encoded according to the Omnidirectional Media Application Format (OMAF).

In VR video, one or more cameras may record the environment around the cameras. A user can then view the VR video as if the user were present in the same location as the camera. In VR video, a picture encompasses the entire environment around the user. The user then views a sub-portion of the picture. For example, a user may employ a head mounted display that changes the displayed sub-portion of the picture based on the head movements of the user. The portion of the video being displayed may be referred to as a viewport.

Accordingly, a distinct feature of omnidirectional video is that only a viewport is displayed at any particular time. This is in contrast to other video applications that may display an entire video. This feature may be utilized to improve the performance of omnidirectional video systems, for example through selective delivery depending on the user's viewport (or any other criteria, such as recommended viewport timed metadata). Viewport-dependent delivery may be enabled, for example, by employing region-wise packing and/or viewport-dependent video coding. The performance improvement may result in lower transmission bandwidth, lower decoding complexity, or both when compared to other omnidirectional video systems employing the same video resolution/quality.

An exemplary viewport-dependent operation is an MCTS-based approach for achieving an effective equirectangular projection (ERP) resolution of five thousand samples (5K), e.g., 5120×2560 luma samples, with an HEVC-based viewport-dependent OMAF video profile. This approach is described in greater detail below. In general, however, this approach partitions the VR video into tile groups and encodes the video at a plurality of resolutions. The decoder can indicate the viewport currently used by the user during streaming. The video server providing the VR video data can then forward the tile groups associated with the viewport at a high resolution and forward the non-viewed tile groups at a lower resolution. This allows the user to view the VR video at a high resolution without requiring the entire picture to be sent at the high resolution. The non-viewed sub-portions are discarded, and hence the user may be unaware of the lower resolutions. However, if the user changes viewports, the lower resolution tile groups may be displayed to the user. The resolution of the new viewport may then be increased as the video proceeds. In order to implement such a system, pictures should be created that contain both higher resolution tile groups and lower resolution tile groups.
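The server-side selection described above can be sketched in a few lines of Python. The function name, the tile group identifiers, and the per-group bitrates are hypothetical values chosen only to show the bandwidth effect; they do not come from the disclosure or from OMAF.

```python
def select_tile_group_streams(tile_groups, viewport, bitrate):
    """Choose a resolution for every tile group and total the resulting bitrate.

    tile_groups -- iterable of tile group ids covering the whole picture
    viewport    -- set of tile group ids currently visible to the user
    bitrate     -- assumed cost per tile group, e.g. {"high": 800, "low": 200} kbit/s
    """
    choice = {g: ("high" if g in viewport else "low") for g in tile_groups}
    total = sum(bitrate[resolution] for resolution in choice.values())
    return choice, total


rates = {"high": 800, "low": 200}
choice, total = select_tile_group_streams(range(6), viewport={1, 2}, bitrate=rates)
all_high = 6 * rates["high"]
```

With these illustrative numbers, sending only the two viewed tile groups at high resolution halves the bitrate relative to sending all six at high resolution.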

In another example, a video conferencing application may be designed to forward pictures that include multiple resolutions. For example, a video conference may include multiple participants. The participant currently speaking may be displayed at a higher resolution, and the other participants may be displayed at lower resolutions. In order to implement such a system, pictures should be created that contain both higher resolution tile groups and lower resolution tile groups.

Disclosed herein are various flexible tiling mechanisms to support creating pictures with sub-pictures coded at multiple resolutions. For example, a video can be coded at a plurality of resolutions. The video can also be coded by employing sub-pictures at each resolution. The lower resolution sub-pictures are smaller than the higher resolution sub-pictures. In order to create a picture with multiple resolutions, the picture can be partitioned into first level tiles. The sub-pictures from the highest resolution can be included directly into the first level tiles. Further, the first level tiles can be partitioned into second level tiles that are smaller than the first level tiles. Accordingly, the smaller second level tiles can directly accept the lower resolution sub-pictures. In this way, the slices from each resolution can be compressed into a single picture via a tile index relationship, without requiring tiles of different resolutions to be dynamically readdressed in order to use a consistent addressing scheme. The first level tiles and second level tiles may be implemented as MCTSs, and hence may accept motion constrained image data at different resolutions. The present disclosure includes many aspects. As a particular example, first level tiles are split into second level tiles. The first level tiles and second level tiles can then be included in tile groups. A tile group can be constrained to contain an integer number of first level tiles and/or one or more consecutive sequences of second level tiles, where each sequence of second level tiles is split from a single first level tile. This approach may ensure that all second level tiles created from a single first level tile are assigned to the same tile group. In another particular example, a scan order for coding the flexible tiling scheme is described. In this example, the first level tiles are coded in raster scan order relative to the picture and/or tile group boundaries. Upon encountering second level tiles, the first level tile scan is paused. The consecutive sequence of second level tiles is then scanned in raster scan order relative to the first level tile from which such second level tiles were partitioned. The scan order then proceeds to the next consecutive sequence of second level tiles, if any. Otherwise, the first level tile scan continues. This process continues until the tile group and/or picture is encoded or decoded, depending on the example.
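The scan order and the tile group constraint described above can be sketched as follows. This is a simplified model under stated assumptions: first level tiles form a regular grid, each split first level tile is divided into a regular sub-grid, and tiles are identified by hypothetical (parent, child) tuples. None of these names or data structures are defined by the disclosure.

```python
def tile_scan_order(rows, cols, splits):
    """Return tiles in coding order for a two-level tiling.

    rows, cols -- grid of first level tiles
    splits     -- maps (row, col) of a split first level tile to its
                  (sub_rows, sub_cols) grid of second level tiles
    Each entry is (parent, child); child is None for an unsplit first level tile.
    """
    order = []
    for r in range(rows):                 # raster scan over first level tiles
        for c in range(cols):
            if (r, c) in splits:          # pause and scan the second level tiles
                sub_rows, sub_cols = splits[(r, c)]
                order.extend(((r, c), (i, j))
                             for i in range(sub_rows) for j in range(sub_cols))
            else:
                order.append(((r, c), None))
    return order


def tile_groups_valid(assignment):
    """Check that all second level tiles of one parent share a tile group.

    assignment -- maps (parent, child) to a tile group id
    """
    group_of_parent = {}
    for (parent, child), group in assignment.items():
        if child is not None and group_of_parent.setdefault(parent, group) != group:
            return False
    return True


order = tile_scan_order(1, 2, {(0, 1): (2, 1)})
```

In this model the second level tiles of a split first level tile always appear consecutively in the scan, which is what allows a tile group to be described as a run of tiles in coding order.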

FIG. 1 is a flowchart of an exemplary operating method 100 of coding a video signal. Specifically, a video signal is encoded at an encoder. The encoding process compresses the video signal by employing various mechanisms to reduce the video file size. A smaller file size allows the compressed video file to be transmitted toward a user while reducing the associated bandwidth overhead. The decoder then decodes the compressed video file to reconstruct the original video signal for display to an end user. The decoding process generally mirrors the encoding process to allow the decoder to consistently reconstruct the video signal.

At step 101, the video signal is input into the encoder. For example, the video signal may be an uncompressed video file stored in memory. As another example, the video file may be captured by a video capture device, such as a video camera, and encoded to support live streaming of the video. The video file may include both an audio component and a video component. The video component contains a series of image frames that, when viewed in a sequence, give the visual impression of motion. The frames contain pixels that are expressed in terms of light, referred to herein as luma components (or luma samples), and color, referred to as chroma components (or color samples). In some examples, the frames may also contain depth values to support three-dimensional viewing.

At step 103, the video is partitioned into blocks. Partitioning includes subdividing the pixels in each frame into square and/or rectangular blocks for compression. For example, in High Efficiency Video Coding (HEVC) (also known as H.265 and MPEG-H Part 2), the frame can first be divided into coding tree units (CTUs), which are blocks of a predefined size (e.g., 64 pixels by 64 pixels). The CTUs contain both luma and chroma samples. Coding trees may be employed to divide the CTUs into blocks and then recursively subdivide the blocks until configurations are achieved that support further encoding. For example, luma components of a frame may be subdivided until the individual blocks contain relatively homogeneous lighting values. Further, chroma components of a frame may be subdivided until the individual blocks contain relatively homogeneous color values. Accordingly, partitioning mechanisms vary depending on the content of the video frames.
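The recursive subdivision described above can be illustrated with a quadtree split driven by a simple homogeneity test. The function name, the max-minus-min homogeneity measure, and the threshold are assumptions made for this sketch; practical encoders typically choose splits by rate-distortion optimization instead.

```python
def split_block(frame, x, y, size, min_size, threshold):
    """Recursively split a square block until its samples are nearly homogeneous.

    frame     -- 2-D list of luma (or chroma) sample values
    x, y      -- top-left corner of the block
    size      -- current block width and height (a power of two)
    threshold -- assumed maximum allowed range (max - min) within a leaf block
    Returns a list of leaf blocks as (x, y, size) tuples.
    """
    samples = [frame[j][i] for j in range(y, y + size) for i in range(x, x + size)]
    if size <= min_size or max(samples) - min(samples) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):          # visit the four quadrants in raster order
        for dx in (0, half):
            leaves += split_block(frame, x + dx, y + dy, half, min_size, threshold)
    return leaves


frame = [[0, 9, 5, 5],
         [9, 0, 5, 5],
         [5, 5, 5, 5],
         [5, 5, 5, 5]]
leaves = split_block(frame, 0, 0, 4, min_size=1, threshold=2)
```

Only the textured top-left quadrant is split down to single samples, while the three flat quadrants remain whole, which shows how the partitioning follows the content of the frame.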

At step 105, various compression mechanisms are employed to compress the image blocks partitioned at step 103. For example, inter-prediction and/or intra-prediction may be employed. Inter-prediction is designed to take advantage of the fact that objects in a common scene tend to appear in successive frames. Accordingly, a block depicting an object in a reference frame need not be repeatedly described in adjacent frames. Specifically, an object, such as a table, may remain in a constant position over multiple frames. Hence the table is described once, and adjacent frames can refer back to the reference frame. Pattern matching mechanisms may be employed to match objects over multiple frames. Further, moving objects may be represented across multiple frames, for example due to object movement or camera movement. As a particular example, a video may show an automobile that moves across the screen over multiple frames. Motion vectors can be employed to describe such movement. A motion vector is a two-dimensional vector that provides an offset from the coordinates of an object in a frame to the coordinates of the object in a reference frame. As such, inter-prediction can encode an image block in a current frame as a set of motion vectors indicating an offset from a corresponding block in a reference frame.
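A minimal sketch of motion-compensated prediction follows, assuming full-sample motion vectors and a single reference frame. The function names are hypothetical, and fractional-sample interpolation and motion search are omitted.

```python
def predict_block(reference, x, y, size, mv):
    """Copy the block that the motion vector points to in the reference frame."""
    dx, dy = mv
    return [[reference[y + dy + j][x + dx + i] for i in range(size)]
            for j in range(size)]


def encode_block(current, reference, x, y, size, mv):
    """Return the residual between the current block and its prediction."""
    prediction = predict_block(reference, x, y, size, mv)
    return [[current[y + j][x + i] - prediction[j][i] for i in range(size)]
            for j in range(size)]


def decode_block(reference, x, y, size, mv, residual):
    """Rebuild the block from the motion vector and the residual."""
    prediction = predict_block(reference, x, y, size, mv)
    return [[prediction[j][i] + residual[j][i] for i in range(size)]
            for j in range(size)]


# An object (value 7) sits one sample further right in the current frame.
reference = [[0, 7, 7, 0], [0, 7, 7, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
current = [[0, 0, 7, 7], [0, 0, 7, 7], [0, 0, 0, 0], [0, 0, 0, 0]]
residual = encode_block(current, reference, x=2, y=0, size=2, mv=(-1, 0))
rebuilt = decode_block(reference, x=2, y=0, size=2, mv=(-1, 0), residual=residual)
```

Because the motion vector (-1, 0) points back to where the object was in the reference frame, the residual is all zeros and only the vector itself needs to be coded for this block.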

Intra-prediction encodes blocks in a common frame. Intra-prediction takes advantage of the fact that luma and chroma components tend to cluster in a frame. For example, a patch of green in a portion of a tree tends to be positioned adjacent to similar patches of green. Intra-prediction employs multiple directional prediction modes (e.g., thirty-three in HEVC), a planar mode, and a direct current (DC) mode. The directional modes indicate that a current block is similar to or the same as the samples of a neighbor block in a corresponding direction. Planar mode indicates that a series of blocks along a row/column (e.g., a plane) can be interpolated based on neighbor blocks at the edges of the row. Planar mode, in effect, indicates a smooth transition of light/color across a row/column by employing a relatively constant slope in changing values. DC mode is employed for boundary smoothing and indicates that a block is similar to or the same as an average value associated with the samples of all the neighbor blocks associated with the angular directions of the directional prediction modes. Accordingly, intra-prediction blocks can represent image blocks as various relational prediction mode values instead of the actual values. Further, inter-prediction blocks can represent image blocks as motion vector values instead of the actual values. In either case, the prediction blocks may not exactly represent the image blocks in some cases. Any differences are stored in residual blocks. Transforms may be applied to the residual blocks to further compress the file.
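The relationship between a prediction mode and its residual block can be illustrated with a simplified sketch. It is not the HEVC prediction process: the functions `dc_predict`, `horizontal_predict`, `residual`, and `energy`, as well as the sample values, are assumptions for the example, and only a DC mode and a single (horizontal) directional mode are modeled.

```python
def dc_predict(top, left, size):
    """DC mode: fill the block with the rounded mean of the neighbouring samples."""
    neighbours = top + left
    dc = (sum(neighbours) + len(neighbours) // 2) // len(neighbours)
    return [[dc] * size for _ in range(size)]


def horizontal_predict(left, size):
    """One directional mode: copy each left neighbour across its row."""
    return [[left[r]] * size for r in range(size)]


def residual(block, prediction):
    """Sample-wise difference between the original block and its prediction."""
    return [[b - p for b, p in zip(brow, prow)] for brow, prow in zip(block, prediction)]


def energy(res):
    """Sum of squared residual values; smaller means a better prediction."""
    return sum(v * v for row in res for v in row)


block = [[50, 50, 51, 50],
         [60, 60, 60, 61],
         [70, 70, 70, 70],
         [80, 81, 80, 80]]
top = [40, 40, 40, 40]    # reconstructed row above the block
left = [50, 60, 70, 80]   # reconstructed column to the left of the block

res_dc = residual(block, dc_predict(top, left, 4))
res_h = residual(block, horizontal_predict(left, 4))
```

For this block, whose rows are nearly constant, the horizontal mode leaves a far smaller residual than the DC mode, so signaling the mode plus a small residual is cheaper than signaling the sample values themselves.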

In step 107, various filtering techniques may be applied. In HEVC, the filters are applied according to an in-loop filtering scheme. The block-based prediction discussed above may result in the creation of blocky images at the decoder. Further, the block-based prediction scheme may encode a block and then reconstruct the encoded block for later use as a reference block. The in-loop filtering scheme iteratively applies noise suppression filters, de-blocking filters, adaptive loop filters, and sample adaptive offset (SAO) filters to the blocks/frames. These filters mitigate such blocking artifacts so that the encoded file can be accurately reconstructed. Further, these filters mitigate artifacts in the reconstructed reference blocks so that the artifacts are less likely to create additional artifacts in subsequent blocks that are encoded based on the reconstructed reference blocks.

Once the video signal has been partitioned, compressed, and filtered, the resulting data is encoded in a bitstream at step 109. The bitstream includes the data discussed above as well as any signaling data desired to support proper video signal reconstruction at the decoder. For example, such data may include partition data, prediction data, residual blocks, and various flags providing coding instructions to the decoder. The bitstream may be stored in memory for transmission toward a decoder upon request. The bitstream may also be broadcast and/or multicast toward a plurality of decoders. The creation of the bitstream is an iterative process. Accordingly, steps 101, 103, 105, 107, and 109 may occur continuously and/or simultaneously over many frames and blocks. The order shown in FIG. 1 is presented for clarity and ease of discussion and is not intended to limit the video coding process to a particular order.

At step 111, the decoder receives the bitstream and begins the decoding process. Specifically, the decoder employs an entropy decoding scheme to convert the bitstream into corresponding syntax and video data. At step 111, the decoder employs the syntax data from the bitstream to determine the partitions for the frames. The partitioning should match the results of block partitioning at step 103. Entropy encoding/decoding as employed in step 111 is now described. The encoder makes many choices during the compression process, such as selecting a block partitioning scheme from several possible choices based on the spatial positioning of values in the input image(s). Signaling the exact choices may employ a large number of bins. As used herein, a bin is a binary value that is treated as a variable (e.g., a bit value that may vary depending on context). Entropy coding allows the encoder to discard any options that are clearly not viable for a particular case, leaving a set of allowable options. Each allowable option is then assigned a code word. The length of the code words is based on the number of allowable options (e.g., one bin for two options, two bins for three to four options, etc.). The encoder then encodes the code word for the selected option. This scheme reduces the size of the code words, because the code words are only as big as desired to uniquely indicate a selection from a small subset of allowable options, as opposed to uniquely indicating the selection from a potentially large set of all possible options. The decoder then decodes the selection by determining the set of allowable options in a manner similar to the encoder. By determining the set of allowable options, the decoder can read the code word and determine the selection made by the encoder.
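The relationship between the number of allowable options and the code word length can be illustrated as follows. This is a sketch under stated assumptions: fixed-length code words are used for simplicity, and the option names in `all_modes` and the reduced set `allowed` are hypothetical.

```python
def bins_needed(num_options):
    """Bins required to tell num_options choices apart with a fixed-length code word."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1; at least one bin is used.
    return max(1, (num_options - 1).bit_length())


def encode_choice(allowed, choice):
    """Encoder side: the code word is the index of the choice within the allowed set."""
    width = bins_needed(len(allowed))
    return format(allowed.index(choice), "0{}b".format(width))


def decode_choice(allowed, codeword):
    """Decoder side: rebuild the same allowed set, then read the index back."""
    return allowed[int(codeword, 2)]


# Hypothetical full set of partitioning options versus the subset still viable here.
all_modes = ["none", "quad", "bin_h", "bin_v", "tri_h", "tri_v"]
allowed = ["none", "quad"]

codeword = encode_choice(allowed, "quad")
decoded = decode_choice(allowed, codeword)
```

With six possible options a fixed-length code word would need three bins, whereas the two-option subset needs only one. The decoder can interpret the single bin because it derives the same allowable set as the encoder.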

At step 113, the decoder performs block decoding. Specifically, the decoder employs inverse transforms to generate residual blocks. The decoder then employs the residual blocks and corresponding prediction blocks to reconstruct the image blocks according to the partitioning. The prediction blocks may include both intra-prediction blocks and inter-prediction blocks as generated at the encoder at step 105. The reconstructed image blocks are then positioned into frames of a reconstructed video signal according to the partitioning data determined at step 111. Syntax for step 113 may also be signaled in the bitstream via entropy coding as discussed above.

At step 115, filtering is performed on the frames of the reconstructed video signal in a manner similar to step 107 at the encoder. For example, noise suppression filters, de-blocking filters, adaptive loop filters, and SAO filters may be applied to the frames to remove blocking artifacts. Once the frames are filtered, the video signal can be output to a display at step 117 for viewing by an end user.

FIG. 2 is a schematic diagram of an example coding and decoding (codec) system 200 for video coding. Specifically, codec system 200 provides functionality to support the implementation of operating method 100. Codec system 200 is generalized to depict components employed in both an encoder and a decoder. Codec system 200 receives and partitions a video signal as discussed with respect to steps 101 and 103 in operating method 100, which results in a partitioned video signal 201. Codec system 200 then compresses the partitioned video signal 201 into a coded bitstream when acting as an encoder, as discussed with respect to steps 105, 107, and 109 in method 100. When acting as a decoder, codec system 200 generates an output video signal from the bitstream as discussed with respect to steps 111, 113, 115, and 117 in operating method 100. The codec system 200 includes a general-purpose coder control component 211, a transform scaling and quantization component 213, an intra-picture estimation component 215, an intra-picture prediction component 217, a motion compensation component 219, a motion estimation component 221, a scaling and inverse transform component 229, a filter control analysis component 227, an in-loop filter component 225, a decoded picture buffer component 223, and a header formatting and context adaptive binary arithmetic coding (CABAC) component 231. Such components are coupled as shown. In FIG. 2, black lines indicate movement of data to be encoded/decoded, while dashed lines indicate movement of control data that controls the operation of other components. The components of codec system 200 may all be present in the encoder. The decoder may include a subset of the components of codec system 200. For example, the decoder may include the intra-picture prediction component 217, the motion compensation component 219, the scaling and inverse transform component 229, the in-loop filter component 225, and the decoded picture buffer component 223. These components are now described.

The partitioned video signal 201 is a captured video sequence that has been partitioned into blocks of pixels by a coding tree. A coding tree employs various split modes to subdivide a block of pixels into smaller blocks of pixels. These blocks can then be further subdivided into smaller blocks. The blocks may be referred to as nodes on the coding tree. Larger parent nodes are split into smaller child nodes. The number of times a node is subdivided is referred to as the depth of the node/coding tree. The divided blocks can be included in coding units (CUs) in some cases. For example, a CU can be a sub-portion of a CTU that contains a luma block, a red difference chroma (Cr) block, and a blue difference chroma (Cb) block along with corresponding syntax instructions for the CU. The split modes may include a binary tree (BT), a triple tree (TT), and a quad tree (QT), employed to partition a node into two, three, or four child nodes, respectively, of varying shapes depending on the split mode employed. The partitioned video signal 201 is forwarded to the general-purpose coder control component 211, the transform scaling and quantization component 213, the intra-picture estimation component 215, the filter control analysis component 227, and the motion estimation component 221 for compression.
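As one possible illustration of the split modes, the child node shapes can be computed as below. The mode labels (`"QT"`, `"BT_VER"`, and so on) and the function `split` are hypothetical. The 1:2:1 proportion used for the triple tree split is an assumption taken from common practice and is not stated in this description.

```python
def split(x, y, w, h, mode):
    """Return child rectangles (x, y, w, h) for one split of a coding tree node."""
    if mode == "QT":        # quad tree: four equal quadrants
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "BT_VER":    # binary tree, vertical boundary: left and right halves
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "BT_HOR":    # binary tree, horizontal boundary: top and bottom halves
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "TT_VER":    # triple tree, assumed 1:2:1 widths
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    if mode == "TT_HOR":    # triple tree, assumed 1:2:1 heights
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    raise ValueError("unknown split mode: " + mode)


children_qt = split(0, 0, 64, 64, "QT")
children_bt = split(0, 0, 64, 64, "BT_HOR")
children_tt = split(0, 0, 64, 64, "TT_VER")
```

Each child rectangle can itself be passed back to `split`, which corresponds to descending one level deeper in the coding tree.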

The general-purpose coder control component 211 is configured to make decisions related to coding the images of the video sequence into the bitstream according to application constraints. For example, the general-purpose coder control component 211 manages optimization of bitrate/bitstream size versus reconstruction quality. Such decisions may be made based on storage space/bandwidth availability and image resolution requests. The general-purpose coder control component 211 also manages buffer utilization in light of transmission speed to mitigate buffer underrun and overrun issues. To manage these issues, the general-purpose coder control component 211 manages partitioning, prediction, and filtering by the other components. For example, the general-purpose coder control component 211 may dynamically increase compression complexity to increase resolution and bandwidth usage, or decrease compression complexity to decrease resolution and bandwidth usage. Hence, the general-purpose coder control component 211 controls the other components of codec system 200 to balance video signal reconstruction quality with bit rate concerns. The general-purpose coder control component 211 creates control data, which controls the operation of the other components. The control data is also forwarded to the header formatting and CABAC component 231 to be encoded in the bitstream in order to signal parameters for decoding at the decoder.

The partitioned video signal 201 is also sent to the motion estimation component 221 and the motion compensation component 219 for inter-prediction. A frame or slice of the partitioned video signal 201 may be divided into multiple video blocks. The motion estimation component 221 and the motion compensation component 219 perform inter-predictive coding of the received video block relative to one or more blocks in one or more reference frames to provide temporal prediction. Codec system 200 may perform multiple coding passes, for example, to select an appropriate coding mode for each block of video data.

The motion estimation component 221 and the motion compensation component 219 may be highly integrated, but are illustrated separately for conceptual purposes. Motion estimation, performed by the motion estimation component 221, is the process of generating motion vectors, which estimate motion for video blocks. A motion vector, for example, may indicate the displacement of a coded object relative to a prediction block. A prediction block is a block that is found to closely match the block to be coded in terms of pixel difference. A prediction block may also be referred to as a reference block. Such pixel difference may be determined by sum of absolute difference (SAD), sum of square difference (SSD), or other difference metrics. HEVC employs several coded objects, including a CTU, coding tree blocks (CTBs), and CUs. For example, a CTU can be divided into CTBs, which can then be divided into coding blocks (CBs) for inclusion in CUs. A CU can be encoded as a prediction unit (PU) containing prediction data and/or a transform unit (TU) containing transformed residual data for the CU. The motion estimation component 221 generates motion vectors, PUs, and TUs by using a rate-distortion analysis as part of a rate-distortion optimization process. For example, the motion estimation component 221 may determine multiple reference blocks, multiple motion vectors, etc. for a current block/frame, and may select the reference blocks, motion vectors, etc. having the best rate-distortion characteristics. The best rate-distortion characteristics balance both the quality of video reconstruction (e.g., the amount of data loss due to compression) and coding efficiency (e.g., the size of the final encoding).
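The search for a closely matching prediction block using SAD can be sketched as an exhaustive integer-sample search. This is an illustrative simplification rather than the behavior of the motion estimation component 221: the names `sad`, `crop`, and `full_search`, the search window, and the toy frames are assumptions, and rate is ignored so that only the difference metric decides the match.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def crop(frame, x, y, size):
    """Extract the size x size block whose top-left sample is at (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def full_search(current, reference, x, y, size, search_range):
    """Return ((dx, dy), sad) for the best integer offset within search_range."""
    target = crop(current, x, y, size)
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > len(reference[0]) or ry + size > len(reference):
                continue
            cost = sad(target, crop(reference, rx, ry, size))
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best


# Reference frame with unique sample values; the current frame is shifted right by two.
reference = [[r * 8 + c for c in range(8)] for r in range(8)]
current = [[reference[r][c - 2] if c >= 2 else 0 for c in range(8)] for r in range(8)]

best_mv, best_sad = full_search(current, reference, 4, 2, 2, 2)
```

Replacing `abs(p - q)` with `(p - q) ** 2` would turn the metric into SSD without changing the search structure.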

In some examples, codec system 200 may calculate values for sub-integer pixel positions of reference pictures stored in the decoded picture buffer component 223. For example, video codec system 200 may interpolate values at one-quarter pixel positions, one-eighth pixel positions, or other fractional pixel positions of the reference picture. Therefore, the motion estimation component 221 may perform a motion search relative to the full pixel positions and fractional pixel positions and output a motion vector with fractional pixel precision. The motion estimation component 221 calculates a motion vector for a PU of a video block in an inter-coded slice by comparing the position of the PU to the position of a prediction block of a reference picture. The motion estimation component 221 outputs the calculated motion vector as motion data to the header formatting and CABAC component 231 for encoding, and to the motion compensation component 219.
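Interpolation at fractional pixel positions can be illustrated with a simple bilinear filter. This is an assumption made for brevity: practical codecs such as HEVC use longer interpolation filters, and the function `sample_at`, its quarter-sample coordinate convention, and the toy picture are hypothetical. Non-negative coordinates inside the picture are assumed.

```python
def sample_at(ref, x4, y4):
    """Bilinear sample of ref at quarter-sample coordinates (x4 / 4, y4 / 4)."""
    x0, fx = divmod(x4, 4)   # integer position and quarter-sample fraction (0..3)
    y0, fy = divmod(y4, 4)
    # Clamp the right/bottom neighbours at the picture edge.
    x1 = min(x0 + 1, len(ref[0]) - 1)
    y1 = min(y0 + 1, len(ref) - 1)
    top = ref[y0][x0] * (4 - fx) + ref[y0][x1] * fx
    bottom = ref[y1][x0] * (4 - fx) + ref[y1][x1] * fx
    # Weights sum to 16; add 8 before dividing so the result is rounded.
    return (top * (4 - fy) + bottom * fy + 8) // 16


ref = [[0, 40],
       [80, 120]]

full_pel = sample_at(ref, 0, 0)      # exactly on the top-left sample
quarter_pel = sample_at(ref, 1, 0)   # one quarter of the way to the right neighbour
half_pel = sample_at(ref, 2, 0)      # halfway to the right neighbour
centre = sample_at(ref, 2, 2)        # halfway in both directions
```

A motion search with fractional precision would evaluate a difference metric against blocks built from such interpolated samples in addition to the full pixel positions.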

Motion compensation, performed by the motion compensation component 219, may involve fetching or generating the prediction block based on the motion vector determined by the motion estimation component 221. Again, the motion estimation component 221 and the motion compensation component 219 may be functionally integrated in some examples. Upon receiving the motion vector for the PU of the current video block, the motion compensation component 219 may locate the prediction block to which the motion vector points. A residual video block is then formed by subtracting the pixel values of the prediction block from the pixel values of the current video block being coded, forming pixel difference values. In general, the motion estimation component 221 performs motion estimation relative to luma components, and the motion compensation component 219 uses motion vectors calculated based on the luma components for both chroma components and luma components. The prediction block and the residual block are forwarded to the transform scaling and quantization component 213.

The partitioned video signal 201 is also sent to the intra-picture estimation component 215 and the intra-picture prediction component 217. As with the motion estimation component 221 and the motion compensation component 219, the intra-picture estimation component 215 and the intra-picture prediction component 217 may be highly integrated, but are illustrated separately for conceptual purposes. The intra-picture estimation component 215 and the intra-picture prediction component 217 intra-predict a current block relative to blocks in the current frame, as an alternative to the inter-prediction performed between frames by the motion estimation component 221 and the motion compensation component 219, as described above. In particular, the intra-picture estimation component 215 determines an intra-prediction mode to use to encode the current block. In some examples, the intra-picture estimation component 215 selects an appropriate intra-prediction mode to encode the current block from multiple tested intra-prediction modes. The selected intra-prediction mode is then forwarded to the header formatting and CABAC component 231 for encoding.

For example, the intra-picture estimation component 215 calculates rate-distortion values using a rate-distortion analysis for the various tested intra-prediction modes, and selects the intra-prediction mode having the best rate-distortion characteristics among the tested modes. Rate-distortion analysis generally determines the amount of distortion (or error) between an encoded block and the original unencoded block that was encoded to produce the encoded block, as well as the bitrate (e.g., the number of bits) used to produce the encoded block. The intra-picture estimation component 215 calculates ratios from the distortions and rates for the various encoded blocks to determine which intra-prediction mode exhibits the best rate-distortion value for the block. In addition, the intra-picture estimation component 215 may be configured to code depth blocks of a depth map using a depth modeling mode (DMM) based on rate-distortion optimization (RDO).
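One way to picture the mode selection is the sketch below. Note the substitution: instead of the ratio-based comparison mentioned above, it uses the widely used Lagrangian cost J = D + λ·R, which is an assumption and not taken from this description. The mode names, distortion values, bit counts, and the functions `rd_cost` and `pick_mode` are hypothetical.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed formulation)."""
    return distortion + lam * bits


def pick_mode(candidates, lam):
    """candidates maps mode name -> (distortion, bits); return the lowest-cost mode."""
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))


# Hypothetical measurements for three tested intra-prediction modes.
candidates = {
    "dc": (400, 6),            # cheap to signal, poor match
    "planar": (250, 9),
    "angular_10": (120, 20),   # good match, expensive to signal
}

best_low_lambda = pick_mode(candidates, 10)    # quality weighted heavily
best_mid_lambda = pick_mode(candidates, 30)
best_high_lambda = pick_mode(candidates, 100)  # bits weighted heavily
```

Raising λ shifts the choice toward modes that cost fewer bits, which is one way the balance between reconstruction quality and bit rate can be steered.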

The intra-picture prediction component 217 may generate a residual block from the prediction block based on the selected intra-prediction mode determined by the intra-picture estimation component 215 when implemented on an encoder, or may read the residual block from the bitstream when implemented on a decoder. The residual block contains the difference in values between the prediction block and the original block, represented as a matrix. The residual block is then forwarded to the transform scaling and quantization component 213. The intra-picture estimation component 215 and the intra-picture prediction component 217 may operate on both luma and chroma components.

The transform scaling and quantization component 213 is configured to further compress the residual block. The transform scaling and quantization component 213 applies a transform, such as a discrete cosine transform (DCT), a discrete sine transform (DST), or a conceptually similar transform, to the residual block, producing a video block comprising residual transform coefficient values. Wavelet transforms, integer transforms, sub-band transforms, or other types of transforms could also be used. The transform may convert the residual information from a pixel value domain to a transform domain, such as a frequency domain. The transform scaling and quantization component 213 is also configured to scale the transformed residual information, for example based on frequency. Such scaling involves applying a scale factor to the residual information so that different frequency information is quantized at different granularities, which may affect the final visual quality of the reconstructed video. The transform scaling and quantization component 213 is also configured to quantize the transform coefficients to further reduce the bit rate. The quantization process may reduce the bit depth associated with some or all of the coefficients. The degree of quantization may be modified by adjusting a quantization parameter. In some examples, the transform scaling and quantization component 213 may then perform a scan of the matrix containing the quantized transform coefficients. The quantized transform coefficients are forwarded to the header formatting and CABAC component 231 to be encoded in the bitstream.
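The transform and quantization of a residual block can be sketched with a floating-point DCT and a uniform quantizer. This is only an illustration: practical codecs use integer approximations of the transform and frequency-dependent scaling, neither of which is modeled here. The functions `dct_1d`, `dct_2d`, and `quantize`, and the single step size `step`, are assumptions.

```python
import math


def dct_1d(v):
    """Orthonormal DCT-II of a one-dimensional list of samples."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)))
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def quantize(coeffs, step):
    """Uniform quantization: a larger step gives coarser levels and fewer bits."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


# A flat 4x4 residual block: all of its energy lands in the DC coefficient.
flat = [[4] * 4 for _ in range(4)]
coeffs = dct_2d(flat)
levels = quantize(coeffs, 8)
```

For this flat residual the sixteen sample values collapse into a single non-zero level, which is why smooth residuals are cheap to code after the transform.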

The scaling and inverse transform component 229 applies the reverse operation of the transform scaling and quantization component 213 to support motion estimation. The scaling and inverse transform component 229 applies inverse scaling, transformation, and/or quantization to reconstruct the residual block in the pixel domain, e.g., for later use as a reference block, which may become a prediction block for another current block. The motion estimation component 221 and/or the motion compensation component 219 may calculate a reference block by adding the residual block back to a corresponding prediction block for use in motion estimation of a later block/frame. Filters are applied to the reconstructed reference blocks to mitigate artifacts created during scaling, quantization, and transform. Such artifacts could otherwise cause inaccurate prediction (and create additional artifacts) when subsequent blocks are predicted.

The filter control analysis component 227 and the in-loop filter component 225 apply filters to the residual blocks and/or to the reconstructed image blocks. For example, the transformed residual block from the scaling and inverse transform component 229 may be combined with a corresponding prediction block from the intra-picture prediction component 217 and/or the motion compensation component 219 to reconstruct the original image block. The filters may then be applied to the reconstructed image block. In some examples, the filters may instead be applied to the residual blocks. As with other components in FIG. 2, the filter control analysis component 227 and the in-loop filter component 225 may be highly integrated and implemented together, but are depicted separately for conceptual purposes. Filters applied to the reconstructed reference blocks are applied to particular spatial regions and include multiple parameters to adjust how such filters are applied. The filter control analysis component 227 analyzes the reconstructed reference blocks to determine where such filters should be applied and sets the corresponding parameters. Such data is forwarded to the header formatting and CABAC component 231 as filter control data for encoding. The in-loop filter component 225 applies such filters based on the filter control data. The filters may include a de-blocking filter, a noise suppression filter, an SAO filter, and an adaptive loop filter. Such filters may be applied in the spatial/pixel domain (e.g., on a reconstructed pixel block) or in the frequency domain, depending on the example.

When operating as an encoder, the filtered reconstructed image blocks, residual blocks, and/or predictive blocks are stored in the decoded picture buffer component 223 for later use in motion estimation, as discussed above. When operating as a decoder, the decoded picture buffer component 223 stores the reconstructed and filtered blocks and forwards them toward a display as part of an output video signal. The decoded picture buffer component 223 may be any memory device capable of storing predictive blocks, residual blocks, and/or reconstructed image blocks.

The header formatting and CABAC component 231 receives data from the various components of the codec system 200 and encodes such data into a coded bitstream for transmission toward a decoder. Specifically, the header formatting and CABAC component 231 generates various headers to encode control data, such as general control data and filter control data. Further, prediction data, including intra-prediction and motion data, as well as residual data in the form of quantized transform coefficient data, are all encoded in the bitstream. The final bitstream includes all the information desired by the decoder to reconstruct the original partitioned video signal 201. Such information may also include intra-prediction mode index tables (also referred to as codeword mapping tables), definitions of encoding contexts for various blocks, indications of most probable intra-prediction modes, an indication of partition information, and the like. Such data may be encoded by employing entropy coding. For example, the information may be encoded by employing context adaptive variable length coding (CAVLC), CABAC, syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding technique. Following the entropy coding, the coded bitstream may be transmitted to another device (e.g., a video decoder) or archived for later transmission or retrieval.

FIG. 3 is a block diagram illustrating an example video encoder 300. The video encoder 300 may be employed to implement the encoding functions of the codec system 200 and/or to implement steps 101, 103, 105, 107, and/or 109 of operating method 100. The encoder 300 partitions an input video signal, resulting in a partitioned video signal 301 that is substantially similar to the partitioned video signal 201. The partitioned video signal 301 is then compressed and encoded into a bitstream by the components of the encoder 300.

Specifically, the partitioned video signal 301 is forwarded to an intra-picture prediction component 317 for intra-prediction. The intra-picture prediction component 317 may be substantially similar to the intra-picture estimation component 215 and the intra-picture prediction component 217. The partitioned video signal 301 is also forwarded to a motion compensation component 321 for inter-prediction based on reference blocks in a decoded picture buffer component 323. The motion compensation component 321 may be substantially similar to the motion estimation component 221 and the motion compensation component 219. The predictive blocks and residual blocks from the intra-picture prediction component 317 and the motion compensation component 321 are forwarded to a transform and quantization component 313 for transformation and quantization of the residual blocks. The transform and quantization component 313 may be substantially similar to the transform scaling and quantization component 213. The transformed and quantized residual blocks and the corresponding predictive blocks (along with associated control data) are forwarded to an entropy coding component 331 for coding into a bitstream. The entropy coding component 331 may be substantially similar to the header formatting and CABAC component 231.

The transformed and quantized residual blocks and/or the corresponding predictive blocks are also forwarded from the transform and quantization component 313 to an inverse transform and quantization component 329 for reconstruction into reference blocks for use by the motion compensation component 321. The inverse transform and quantization component 329 may be substantially similar to the scaling and inverse transform component 229. In-loop filters in an in-loop filter component 325 are also applied to the residual blocks and/or reconstructed reference blocks, depending on the example. The in-loop filter component 325 may be substantially similar to the filter control analysis component 227 and the in-loop filter component 225. The in-loop filter component 325 may include multiple filters, as discussed with respect to the in-loop filter component 225. The filtered blocks are then stored in the decoded picture buffer component 323 for use as reference blocks by the motion compensation component 321. The decoded picture buffer component 323 may be substantially similar to the decoded picture buffer component 223.

FIG. 4 is a block diagram illustrating an example video decoder 400. The video decoder 400 may be employed to implement the decoding functions of the codec system 200 and/or to implement steps 111, 113, 115, and/or 117 of operating method 100. The decoder 400 receives a bitstream, for example from an encoder 300, and generates a reconstructed output video signal based on the bitstream for display to an end user.

The bitstream is received by an entropy decoding component 433. The entropy decoding component 433 is configured to implement an entropy decoding scheme, such as CAVLC, CABAC, SBAC, PIPE coding, or another entropy coding technique. For example, the entropy decoding component 433 may employ header information to provide a context for interpreting additional data encoded as codewords in the bitstream. The decoded information includes any information desired to decode the video signal, such as general control data, filter control data, partition information, motion data, prediction data, and quantized transform coefficients from residual blocks. The quantized transform coefficients are forwarded to an inverse transform and quantization component 429 for reconstruction into residual blocks. The inverse transform and quantization component 429 may be similar to the inverse transform and quantization component 329.

The reconstructed residual blocks and/or predictive blocks are forwarded to an intra-picture prediction component 417 for reconstruction into image blocks based on intra-prediction operations. The intra-picture prediction component 417 may be similar to the intra-picture estimation component 215 and the intra-picture prediction component 217. Specifically, the intra-picture prediction component 417 employs prediction modes to locate a reference block in the frame and applies a residual block to the result to reconstruct intra-predicted image blocks. The reconstructed intra-predicted image blocks and/or the residual blocks and corresponding inter-prediction data are forwarded to a decoded picture buffer component 423 via an in-loop filter component 425, which may be substantially similar to the decoded picture buffer component 223 and the in-loop filter component 225, respectively. The in-loop filter component 425 filters the reconstructed image blocks, residual blocks, and/or predictive blocks, and such information is stored in the decoded picture buffer component 423. Reconstructed image blocks from the decoded picture buffer component 423 are forwarded to a motion compensation component 421 for inter-prediction. The motion compensation component 421 may be substantially similar to the motion estimation component 221 and/or the motion compensation component 219. Specifically, the motion compensation component 421 employs motion vectors from a reference block to generate a predictive block and applies a residual block to the result to reconstruct an image block. The resulting reconstructed blocks may also be forwarded via the in-loop filter component 425 to the decoded picture buffer component 423. The decoded picture buffer component 423 continues to store additional reconstructed image blocks, which can be reconstructed into frames via the partition information. Such frames may also be placed in a sequence. The sequence is output toward a display as a reconstructed output video signal.

FIG. 5 is a schematic diagram illustrating an example bitstream 500 containing an encoded video sequence. For example, the bitstream 500 can be generated by a codec system 200 and/or an encoder 300 for decoding by a codec system 200 and/or a decoder 400. As another example, the bitstream 500 may be generated by an encoder at step 109 of method 100 for use by a decoder at step 111.

The bitstream 500 includes a sequence parameter set (SPS) 510, a plurality of picture parameter sets (PPSs) 512, tile group headers 514, and image data 520. The SPS 510 contains sequence data common to all the pictures in the video sequence contained in the bitstream 500. Such data can include picture sizing, bit depth, coding tool parameters, bit rate restrictions, and the like. The PPS 512 contains parameters that are specific to one or more corresponding pictures. Hence, each picture in a video sequence may refer to one PPS 512. The PPS 512 can indicate coding tools available for tiles in corresponding pictures, quantization parameters, offsets, picture-specific coding tool parameters (e.g., filter controls), and the like. The tile group header 514 contains parameters that are specific to each tile group in a picture. Hence, there may be one tile group header 514 per tile group in the video sequence. The tile group header 514 may contain tile group information, picture order counts (POCs), reference picture lists, prediction weights, tile entry points, deblocking parameters, and the like. It should be noted that some systems refer to the tile group header 514 as a slice header and use such information to support slices instead of tile groups.

The image data 520 contains video data encoded according to inter-prediction and/or intra-prediction, as well as the corresponding transformed and quantized residual data. Such image data 520 is sorted according to the partitioning used to partition the image prior to encoding. For example, the image in the image data 520 is divided into tiles 523. The tiles 523 are further divided into coding tree units (CTUs). The CTUs are further divided into coding blocks based on coding trees. The coding blocks can then be encoded/decoded according to prediction mechanisms. An image/picture can contain one or more tiles 523.

A tile 523 is a partitioned portion of a picture created by horizontal and vertical boundaries. Tiles 523 may be rectangular and/or square. Specifically, a tile 523 includes four sides that are connected at right angles. The four sides include two pairs of parallel sides, and the sides in each parallel pair are of equal length. As such, a tile 523 may be any rectangular shape, where a square is a special case of a rectangle in which all four sides are of equal length. A picture may be arranged into rows and columns of tiles 523. A tile row is a set of tiles 523 positioned horizontally adjacent to one another so as to create a continuous line from the left boundary to the right boundary of a picture (or vice versa). A tile column is a set of tiles 523 positioned vertically adjacent to one another so as to create a continuous line from the top boundary to the bottom boundary of the picture (or vice versa). Tiles 523 may or may not allow prediction based on other tiles 523, depending on the example. Each tile 523 may have a unique tile index in the picture. A tile index is a procedurally selected numerical identifier that can be used to distinguish one tile 523 from another. For example, tile indices may increase numerically in raster scan order, which is left to right and top to bottom. It should be noted that, in some examples, tiles 523 may also be assigned tile identifiers (IDs). A tile ID is an assigned identifier that can be used to distinguish one tile 523 from another. In some examples, computations may employ tile IDs instead of tile indices. Further, in some examples, tile IDs can be assigned the same values as the tile indices. Tile indices and/or tile IDs may be signaled to indicate the tile groups that contain the tiles 523. For example, the tile indices and/or tile IDs may be employed to map picture data associated with a tile 523 to the proper position for display. A tile group is a related set of tiles 523 that can be separately extracted and coded, for example to support display of a region of interest and/or to support parallel processing. Tiles 523 in a tile group can be coded without reference to tiles 523 outside of the tile group. Each tile 523 may be assigned to a corresponding tile group, and therefore a picture can contain a plurality of tile groups.
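The raster-scan numbering of tile indices described above can be illustrated with a short sketch. The helper names `raster_scan_tile_indices` and `tile_position` are hypothetical, and zero-based indexing is an assumption made for this example rather than something this description specifies.

```python
def raster_scan_tile_indices(num_tile_cols, num_tile_rows):
    """Return a grid where grid[row][col] is the tile index.

    Indices increase left to right within a tile row, then top to bottom,
    which is raster scan order. Zero-based indexing is assumed here.
    """
    return [
        [row * num_tile_cols + col for col in range(num_tile_cols)]
        for row in range(num_tile_rows)
    ]


def tile_position(tile_index, num_tile_cols):
    """Map a raster-scan tile index back to its (row, col) position."""
    return divmod(tile_index, num_tile_cols)


# A picture with four tile columns and two tile rows.
grid = raster_scan_tile_indices(num_tile_cols=4, num_tile_rows=2)
position_of_tile_6 = tile_position(6, num_tile_cols=4)
```

Because the index is derived purely from the tile's position, a decoder that knows the number of tile columns could recover where each tile belongs in the picture without any extra signaling. A separately assigned tile ID would instead need an explicit mapping.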

FIGS. 6A-6E illustrate an example mechanism 600 for creating an extractor track 610 that combines sub-pictures of multiple resolutions from different bitstreams into a single picture for use in virtual reality (VR) applications. The mechanism 600 may be employed to support an example use case of method 100. For example, the mechanism 600 can be employed to generate a bitstream 500 for transmission from a codec system 200 and/or an encoder 300 toward a codec system 200 and/or a decoder 400. As a specific example, the mechanism 600 can be employed for use in conjunction with VR, OMAF, 360 degree video, and the like.

In VR, only a portion of the video is displayed to the user. For example, VR video may be filmed to include a sphere surrounding the user. The user may employ a head mounted display (HMD) to view the VR video and may point the HMD toward a region of interest. The region of interest is displayed to the user and the other video data is discarded. In this way, the user views only a user-selected portion of the VR video at any instant. This approach mimics the user's perception, and hence lets the user experience a virtual environment in a manner that mimics a real environment. One of the issues with this approach is that the entire VR video may be transmitted to the user, yet only the current viewport of the video is actually used and the rest is discarded. To increase signaling efficiency for streaming applications, the user's current viewport can be transmitted at a higher first resolution while the other viewports are transmitted at a lower second resolution. In this way, the viewports that are likely to be discarded take up less bandwidth than the viewport(s) likely to be viewed by the user. When the user selects a new viewport, the lower resolution content can be shown until the decoder can request that a different current viewport be transmitted at the higher first resolution. The mechanism 600 can be employed to create an extractor track 610, as shown in FIG. 6E, to support this functionality. An extractor track 610 is a track of image data that encapsulates pictures at multiple resolutions for use as described above.

The mechanism 600 encodes the same video content at a first resolution 611 and at a second resolution 612, as shown in FIGS. 6A and 6B, respectively. As a specific example, the first resolution 611 may be 5120×2560 luma samples and the second resolution 612 may be 2560×1280 luma samples. The pictures of the video may be partitioned into tiles 601 at the first resolution 611 and tiles 603 at the second resolution 612, respectively. In the example shown, the tiles 601 and 603 are each arranged in a 4×2 grid. Further, an MCTS can be coded for each tile 601 and 603 position. The pictures at the first resolution 611 and at the second resolution 612 each result in an MCTS sequence that represents the video over time at the corresponding resolution. Each coded MCTS sequence is stored as a sub-picture track or a tile track. The mechanism 600 can then use the pictures to create segments that support viewport-adaptive MCTS selection. For example, each range of viewing orientations that causes a different selection of high-resolution and low-resolution MCTSs is considered. In the example shown, four tiles 601 containing MCTSs at the first resolution 611 and four tiles 603 containing MCTSs at the second resolution 612 are obtained.
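The tile dimensions implied by the example resolutions can be worked out with a short sketch. It assumes that the 4×2 grid means four tile columns by two tile rows and that all tiles in a grid are the same size. Neither assumption is stated explicitly in this description, and the helper name `tile_size` is hypothetical.

```python
def tile_size(pic_width, pic_height, cols, rows):
    """Return (width, height) of one tile in a grid of equally sized tiles."""
    if pic_width % cols or pic_height % rows:
        raise ValueError("picture does not divide evenly into the tile grid")
    return pic_width // cols, pic_height // rows


FIRST_RESOLUTION = (5120, 2560)   # luma samples, first resolution 611
SECOND_RESOLUTION = (2560, 1280)  # luma samples, second resolution 612

# Assumed reading of "4x2 grid": 4 tile columns and 2 tile rows.
high_res_tile = tile_size(*FIRST_RESOLUTION, cols=4, rows=2)
low_res_tile = tile_size(*SECOND_RESOLUTION, cols=4, rows=2)
```

Under these assumptions each tile 601 would cover 1280×1280 luma samples and each tile 603 would cover 640×640 luma samples, so a low-resolution tile has half the width and half the height of a high-resolution tile.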

The mechanism 600 can then create an extractor track 610 for each possible viewport-adaptive MCTS selection. FIGS. 6C and 6D illustrate an example viewport-adaptive MCTS selection. Specifically, sets of selected tiles 605 and 607 are chosen at the first resolution 611 and the second resolution 612, respectively. The selected tiles 605 and 607 are illustrated with gray shading. In the example shown, the selected tiles 605 are the tiles 601 at the first resolution 611 that are to be shown to the user, and the selected tiles 607 are the tiles 603 at the second resolution 612 that are likely to be discarded but are retained to support display in the event the user selects a new viewport. The selected tiles 605 and 607 are then combined into a single picture containing image data at both the first resolution 611 and the second resolution 612. Such pictures are combined to create an extractor track 610. FIG. 6E illustrates a single picture from a corresponding extractor track 610 for purposes of illustration. As shown, the picture in the extractor track 610 contains the selected tiles 605 and 607 from the first resolution 611 and the second resolution 612. As noted above, FIGS. 6C-6E illustrate a single viewport-adaptive MCTS selection. To allow the user to select any viewport, an extractor track 610 should be created for each possible combination of selected tiles 605 and 607.

In the example shown, each selection of tiles 603 encapsulating content from the second resolution 612 bitstream contains two slices. A RegionWisePackingBox may be included in the extractor track 610 to create a mapping between the packed picture and a projected picture in equirectangular projection (ERP) format. In the presented example, the bitstreams resolved from the extractor tracks 610 have a resolution of 3200×2560. Consequently, a four-thousand-sample (4K)-capable decoder may decode content in which the viewport is extracted from a coded bitstream with a five-thousand-sample (5K) resolution of 5120×2560.
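The 3200×2560 figure can be checked with a small sketch under several assumptions that this description does not spell out: that high-resolution tiles are 1280×1280 and low-resolution tiles are 640×640 (equal division of the stated resolutions by a 4-column, 2-row grid), and that the packed picture places a 2×2 block of high-resolution tiles beside a single column of four low-resolution tiles. The function name `packed_picture_size` is hypothetical.

```python
def packed_picture_size(high_tile, low_tile, high_grid, low_grid):
    """Size of a picture that places a grid of high-resolution tiles
    side by side with a grid of low-resolution tiles.

    Tiles are (width, height); grids are (cols, rows). Both regions must
    have the same height so that the packed picture is rectangular.
    """
    (hw, hh), (lw, lh) = high_tile, low_tile
    (hcols, hrows), (lcols, lrows) = high_grid, low_grid
    high_w, high_h = hw * hcols, hh * hrows
    low_w, low_h = lw * lcols, lh * lrows
    if high_h != low_h:
        raise ValueError("regions must have equal height to sit side by side")
    return high_w + low_w, high_h


# Assumed tile sizes and layout (see the lead-in for the assumptions).
packed = packed_picture_size(
    high_tile=(1280, 1280), low_tile=(640, 640),
    high_grid=(2, 2), low_grid=(1, 4),
)
full_samples = 5120 * 2560
packed_samples = packed[0] * packed[1]
ratio = packed_samples / full_samples
```

With this assumed layout the packed picture comes to 3200×2560 luma samples, matching the resolution stated above, and contains 62.5% of the samples of the full 5120×2560 picture, which is consistent with a decoder of lower capability being able to handle it.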

As shown, the extractor track 610 contains two rows of high-resolution tiles 601 and four rows of low-resolution tiles 603. Accordingly, the extractor track 610 contains two slices of high-resolution content and four slices of low-resolution content. Uniform tiling may not support such a use case. Uniform tiling is defined by a set of tile columns and a set of tile rows. Tile columns extend from the top of the picture to the bottom of the picture. Likewise, tile rows extend from the left of the picture to the right of the picture. While such a structure can be simply defined, it cannot effectively support advanced use cases such as the use case described by the mechanism 600. In the example shown, different numbers of rows are employed in different sections of the extractor track 610. If uniform tiling is employed, the tiles on the right side of the extractor track 610 must be rewritten to accept two slices each. This approach is inefficient and computationally complex.

The present disclosure includes a flexible tiling scheme, as described below, that does not require tiles to be rewritten to include different numbers of slices. The flexible tiling scheme allows a tile 601 to contain content at the first resolution 611. The flexible tiling scheme also allows a tile 601 to be partitioned into smaller tiles, each of which can be mapped directly to a tile 603 at the second resolution 612. This direct mapping is more efficient because such an approach does not require the tiles to be rewritten/re-addressed when different resolutions are combined as described above.

FIG. 7 illustrates an example video conferencing application 700 that splices pictures of multiple resolutions from different bitstreams into a single picture for display. The application 700 may be employed to support an example use case of method 100. For example, the application 700 can be employed at a codec system 200 and/or a decoder 400 to display video content from a bitstream 500 from a codec system 200 and/or an encoder 300. The video conferencing application 700 displays a video sequence to a user. The video sequence contains pictures displaying a speaking participant 701 and other participants 703. The speaking participant 701 is displayed at a higher/larger first resolution and the other participants 703 are displayed at a lower/smaller second resolution. To code such a picture, the picture should contain a portion with a single row and a portion with three rows. To support such a scenario with uniform tiling, the picture is partitioned into a left tile and a right tile. The right tile is then rewritten/re-addressed to include three rows. Such re-addressing results in both a compression penalty and a performance penalty. The flexible tiling scheme described below allows a single tile to be partitioned into smaller tiles that are mapped to tiles in the sub-picture bitstreams associated with the other participants 703. In this way, the speaking participant 701 can be mapped directly to a first level tile, and the other participants 703 can be mapped to second level tiles split from a first level tile, without such rewriting/re-addressing.

FIGS. 8A-8D are schematic diagrams illustrating an example flexible video tiling scheme 800 capable of supporting multiple tiles with different resolutions in the same picture. The flexible video tiling scheme 800 can be employed to support the more efficient coding mechanism 600 and application 700. Hence, the flexible video tiling scheme 800 can be employed as part of method 100. Further, the flexible video tiling scheme 800 can be employed by a codec system 200, an encoder 300, and/or a decoder 400. The results of the flexible video tiling scheme 800 can be stored in a bitstream 500 for transmission between the encoder and the decoder.

As shown in FIG. 8A, a picture (e.g., a frame, an image, etc.) can be partitioned into first level tiles 801, also known as level one tiles. As shown in FIG. 8B, the first level tiles 801 can be selectively partitioned to create second level tiles 803, also known as level two tiles. The first level tiles 801 and the second level tiles 803 can then be employed to create a picture with sub-pictures coded at multiple resolutions. A first level tile 801 is a tile generated by completely partitioning a picture into a set of columns and a set of rows. A second level tile 803 is a tile generated by partitioning a first level tile 801.

As described above, in various scenarios, for example in VR and/or teleconferencing, video can be coded at multiple resolutions. Video can also be coded by employing slices at each resolution. The lower resolution slices are smaller than the higher resolution slices. To create a picture with multiple resolutions, the picture can be partitioned into first level tiles 801. The slices from the highest resolution can be included directly in the first level tiles 801. Further, the first level tiles 801 can be partitioned into second level tiles 803 that are smaller than the first level tiles 801. Accordingly, the smaller second level tiles 803 can directly accept the lower resolution slices. In this way, the slices from each resolution can be compressed into a single picture, for example via a tile index relationship, without requiring tiles of different resolutions to be dynamically readdressed in order to use a consistent addressing scheme. The first level tiles 801 and the second level tiles 803 may be implemented as MCTSs, and hence may accept motion constrained image data at different resolutions.

The present disclosure includes many aspects. As a particular example, the first level tiles 801 are split into second level tiles 803. The second level tiles 803 may then be constrained to each contain a single rectangular slice of picture data (e.g., at the lower resolution). A rectangular slice is a slice that is constrained to maintain a rectangular shape, and is therefore coded based on horizontal and vertical picture boundaries. Accordingly, a rectangular slice is not coded based on a raster scan group (which contains CTUs in lines running from left to right and from top to bottom, and may not maintain a rectangular shape). A slice is a spatially distinct region of a picture/frame that is encoded separately from any other region in the same frame/picture. In another example, a first level tile 801 can be split into two or more complete second level tiles 803. In such a case, the first level tile 801 may not contain partial second level tiles 803. In another example, the configuration of the first level tiles 801 and the second level tiles 803 can be signaled in a parameter set in the bitstream, such as a PPS associated with the picture partitioned to create the tiles. In one example, a split indication, such as a flag, can be coded in the parameter set for each first level tile 801. The indication denotes which first level tiles 801 are further split into second level tiles 803. In another example, the configuration of the second level tiles 803 can be signaled as a number of second level tile columns and a number of second level tile rows.

In another example, the first level tiles 801 and the second level tiles 803 can be assigned to tile groups. Such tile groups can be constrained so that all tiles in a corresponding tile group cover a rectangular portion of the picture (e.g., as opposed to a raster scan). For example, some systems may add tiles to a tile group in raster scan order. This includes adding an initial tile in a current row, proceeding to add each tile in the row until the right picture boundary of the current row is reached, proceeding to the left boundary of the next row, and adding each tile in the next row, and so on, until a final tile is reached. This approach may result in non-rectangular shapes that extend across the picture. Such shapes may not be useful for creating pictures with multiple resolutions as described herein. Instead, the present example may constrain tile groups such that any first level tile 801 and/or second level tile 803 may be added to the tile group (e.g., in any order), but the resulting tile group must be a rectangle or a square (e.g., include four sides connected at right angles). This constraint may ensure that second level tiles 803 partitioned from a single first level tile 801 are not placed in different tile groups.
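
The rectangular tile group constraint above can be illustrated with a minimal sketch. The tile representation (x, y, width, height in CTU units) and the function name `is_rectangular_group` are assumptions made for illustration only and are not part of the disclosure. The check assumes the tiles of a valid partition never overlap, so a group is rectangular exactly when its tiles fill their bounding box.

```python
from typing import List, Tuple

# Hypothetical tile representation: (x, y, width, height) in CTU units.
Tile = Tuple[int, int, int, int]

def is_rectangular_group(tiles: List[Tile]) -> bool:
    """Return True if the (non-overlapping) tiles exactly fill their bounding box."""
    if not tiles:
        return False
    left = min(x for x, _, _, _ in tiles)
    top = min(y for _, y, _, _ in tiles)
    right = max(x + w for x, _, w, _ in tiles)
    bottom = max(y + h for _, y, _, h in tiles)
    covered = sum(w * h for _, _, w, h in tiles)
    # Without overlaps, equal areas mean there are no holes in the bounding box.
    return covered == (right - left) * (bottom - top)

# A 2x2 block of unit tiles forms a rectangle.
square_group = [(0, 0, 1, 1), (1, 0, 1, 1), (0, 1, 1, 1), (1, 1, 1, 1)]
# A raster scan run that wraps from the end of one row to the start of the next does not.
raster_group = [(1, 0, 1, 1), (2, 0, 1, 1), (0, 1, 1, 1)]
print(is_rectangular_group(square_group), is_rectangular_group(raster_group))
```

The second group is the kind of shape a raster scan tile group can produce when it wraps across a row boundary, which is what the constraint excludes.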

In another example, data explicitly indicating the number of second level tile columns and the number of second level tile rows can be omitted from the bitstream when the first level tile width is less than twice a minimum width threshold and the first level tile height is less than twice a minimum height threshold. This is because a first level tile 801 that meets such conditions cannot be split into more than one column or more than one row, respectively, and hence such information can be inferred by the decoder. In another example, the split indication denoting which first level tiles 801 are partitioned into second level tiles 803 can be omitted from the bitstream for certain first level tiles 801. For example, such data can be omitted when a first level tile 801 has a first level tile width that is less than the minimum width threshold and a first level tile height that is less than the minimum height threshold. This is because a first level tile 801 that meets such conditions is too small to be split into second level tiles 803, and hence such information can be inferred by the decoder.

In another aspect, the flexible video tiling scheme 800 may employ tile groups 805 as shown in FIG. 8C. The tile groups 805 are depicted as bounded by bold lines. The picture is partitioned into first level tiles 801. A subset of the first level tiles 801 is partitioned into second level tiles 803. The first level tiles 801 and the second level tiles 803 can then be assigned to tile groups 805. A tile group 805 is a related set of tiles that can be separately extracted and coded, for example to support display of a region of interest and/or to support parallel processing. Tile groups 805 may be generated as raster scan tile groups or as rectangular tile groups. A raster scan tile group contains tiles assigned in raster scan order, which proceeds from left to right and from top to bottom. For example, raster scan order proceeds from a first tile toward the right picture boundary, then from the left picture boundary toward the right picture boundary in the next row, and so on, until a last tile is reached. In contrast, a rectangular tile group contains a rectangular sub-portion of the picture. The tile groups 805 are rectangular tile groups, although raster scan tile groups may also be used in some examples.

In some aspects, the first level tiles 801 and the second level tiles 803 can be assigned to tile groups 805 such that each tile group 805 contains either a number of first level tiles 801 or one or more consecutive sequences of second level tiles 803. As used herein, a sequence of second level tiles 803 is a group of second level tiles 803 split from a single first level tile 801, and which therefore have consecutive tile indices. This approach allows all second level tiles 803 created from a single first level tile 801 to be assigned to the same tile group 805. In the example shown in FIG. 8C, one tile group 805 contains four first level tiles 801, and four other tile groups 805 each contain second level tiles 803 split from a single first level tile 801. However, many combinations of tiles and tile groups 805 may be used, depending on the type of video to be encoded and decoded.

In another aspect, the flexible video tiling scheme 800 may employ a scan order 807 as shown in FIG. 8D. The scan order 807 is the order in which the tiles are encoded at the encoder and/or decoded at the decoder or at a hypothetical reference decoder (at the encoder), depending on the example. In the scan order 807 shown, the first level tiles 801 are coded in raster scan order. When one of the second level tiles 803 is encountered, the raster scan order coding of the first level tiles 801 is paused. All of the consecutive second level tiles 803 are then coded in raster scan order before the raster scan order coding of the first level tiles 801 continues. This process continues until all tiles are coded. In the example shown, the second level tile 803 labeled tile 1 is encountered first. Accordingly, the raster scan order coding of the first level tiles 801 is paused, and the consecutive second level tiles 803 labeled tiles 1 and 2 are coded. Once all of the consecutive second level tiles 803 are coded, the raster scan order coding of the first level tiles 801 continues. Hence, the first level tiles 801 labeled tiles 3 and 4 are coded next. The second level tile 803 labeled 5 is then encountered, and the consecutive second level tiles 803 are coded. Hence, the tiles labeled 5 through 8 are coded. Once all of the consecutive second level tiles 803 are coded, the raster scan order coding of the first level tiles 801 continues again. This results in coding the first level tiles 801 labeled tiles 9 and 10. The second level tile 803 labeled 11 is then encountered, and the consecutive second level tiles 803 are coded. Hence, the tiles labeled 11 and 12 are coded. Stated more formally, the first level tiles 801 are coded in raster scan order relative to the boundaries of the picture and/or tile group. Further, all second level tiles 803 partitioned from a current first level tile 801 (e.g., a sequence of consecutive second level tiles 803) are coded before any second level tile 803 partitioned from a subsequent first level tile 801 is coded. In addition, all second level tiles 803 partitioned from the current first level tile 801 are coded in raster scan order relative to the boundaries of the current first level tile 801.
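
The scan order described above can be sketched as a nested raster traversal. This is a minimal sketch under stated assumptions: the function name `tile_scan_order`, the input layout (one entry per first level tile in raster order, holding `None` or a hypothetical `(rows, cols)` split), and the split shapes used in the example are all illustrative and are not taken from FIG. 8D.

```python
from typing import List, Optional, Tuple

def tile_scan_order(
    splits: List[Optional[Tuple[int, int]]]
) -> List[Tuple[int, int, int]]:
    """Return (first_level_index, sub_row, sub_col) triples in coding order.

    splits[i] is None for an unsplit first level tile, or (rows, cols) for a
    first level tile split into second level tiles. An unsplit tile is
    reported as (i, 0, 0).
    """
    order = []
    for i, split in enumerate(splits):          # first level tiles in raster order
        rows, cols = split if split is not None else (1, 1)
        for r in range(rows):                    # second level tiles in raster order
            for c in range(cols):                # within the current first level tile
                order.append((i, r, c))
    return order

# Assumed layout reproducing the 1..12 labeling of the walkthrough:
# split(2 tiles), unsplit, unsplit, split(4 tiles), unsplit, unsplit, split(2 tiles).
layout = [(1, 2), None, None, (2, 2), None, None, (1, 2)]
for label, entry in enumerate(tile_scan_order(layout), start=1):
    print(label, entry)
```

Because the inner loops finish before the outer loop advances, every second level tile of the current first level tile is visited before any tile of a subsequent first level tile, which matches the formal statement of the order.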

As described above, the flexible video tiling scheme 800 supports merging sub-pictures from different bitstreams into a picture containing multiple resolutions. The following describes various embodiments that support such functionality. In general, this disclosure describes methods for signaling and coding tiles in video coding that partition pictures in a more flexible manner than the tiling scheme in HEVC. More specifically, this disclosure describes several tiling schemes in which tile columns need not extend uniformly from the top to the bottom of a coded picture and, likewise, tile rows need not extend uniformly from the left to the right of the coded picture.

For example, based on the HEVC tiling approach, some tiles should be further split into multiple tile rows to support the functionality described in FIGS. 6A-6E and FIG. 7. Further, depending on how the tiles are arranged, a tile should be further split into tile columns. For example, in FIG. 7, participants two through four may in some cases be positioned below participant one, which can be supported by splitting a tile into columns. To satisfy these scenarios, a first level tile may be split into tile rows and tile columns of second level tiles, as described below.

For example, the tile structure can be relaxed as follows. Tiles in the same picture are not required to have a particular number of tile rows. Further, tiles in the same picture are not required to have a particular number of tile columns. The following steps may be used for signaling flexible tiles. A first level tile structure may be defined by tile columns and tile rows as defined in HEVC. The tile columns and tile rows may be uniform or non-uniform in size. Each of these tiles may be referred to as a first level tile. A flag may be signaled to specify whether each first level tile is further split into one or more tile columns and one or more tile rows. When a first level tile is further split, the tile columns and tile rows may be either uniform or non-uniform in size. The new tiles resulting from the split of a first level tile are referred to as second level tiles. The flexible tile structure may be limited to second level tiles only, and hence, in some examples, no further split of any second level tile is allowed. In other examples, further splitting of the second level tiles can be applied to create subsequent level tiles, in a manner similar to the creation of second level tiles from first level tiles.

In the preceding example, a picture may contain zero or more first level tiles that are not further split and zero or more second level tiles. A first level tile that is further split may exist only conceptually and need not be counted in the total number of tiles in the picture. An example scan order is specified as follows. For simplicity, a tile group may be constrained to contain either a number of complete first level tiles or a complete subset of first level tiles. The first level tiles may be ordered according to a tile raster scan of the picture. When a first level tile that is further split is referenced, the set of second level tiles resulting from the split may be referenced collectively. The second level tiles of any current first level tile may be referenced before any second level tile of a subsequent first level tile that follows the current first level tile. The second level tiles of the current first level tile are referenced in raster scan order within the current first level tile. The CTUs in any current tile may be referenced in CTU raster scan order within the current tile.

For simplicity, when a first level tile is split into two or more second level tiles, the split may always use tile columns and tile rows of uniform size. In such an example, there is no need to signal a flag specifying whether the level two tile columns and level two tile rows are uniform. Further, no syntax elements are needed to specify tile row heights and tile column widths. In some examples, the first level tiles may be constrained to always use uniform sizes for tile columns and tile rows. In such an example, there is no need to signal a flag specifying whether the level one tile columns and level one tile rows are uniform. Further, no syntax elements are needed to specify tile row heights and tile column widths. In another example, both the first level tiles and the second level tiles may be constrained to always use tile columns and tile rows of uniform size. In such an example, there is no need to signal flags specifying whether the level one and level two tile columns and tile rows are uniform. Further, no syntax elements are needed to specify tile row heights and tile column widths.
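
When the split is always uniform, a decoder can derive the individual column widths and row heights from the tile size and the column/row counts alone. The sketch below uses an integer rule modeled on the uniform-spacing derivation used for HEVC tiles; applying the same rule to second level tiles is an assumption made for illustration, and the function name `uniform_sizes` is hypothetical.

```python
from typing import List

def uniform_sizes(total_ctus: int, parts: int) -> List[int]:
    """Split total_ctus into `parts` near-equal integer sizes that sum to total_ctus."""
    if not 1 <= parts <= total_ctus:
        raise ValueError("parts must be between 1 and total_ctus")
    # Size of part i is the difference between consecutive integer boundaries.
    return [((i + 1) * total_ctus) // parts - (i * total_ctus) // parts
            for i in range(parts)]

# A first level tile 10 CTUs wide split into 3 uniform columns.
print(uniform_sizes(10, 3))  # [3, 3, 4]
```

Because the sizes follow from two integers, no per-column or per-row size syntax elements are needed in this case, which is the saving the paragraph above describes.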

The derivation of tile location, size, index, and scan order for flexible tiles defined by this approach is described below. In some examples, when such a flexible tile structure is used, a tile group may contain only a number of complete first level tiles, or only a consecutive sequence of complete second level tiles of one single first level tile. Further, when such a flexible tile structure is used, a tile group may be constrained to contain one or more complete first level tiles. In this example, when a tile group contains a second level tile, all second level tiles originating from the split of the same first level tile should be included in the tile group. When such a flexible tile structure is used, it may be further constrained that a tile group contains one or more tiles and that all tiles belonging to the tile group together cover a rectangular region of the picture. In another aspect, when such a flexible tile structure is used, a tile group contains one or more first level tiles, and all tiles belonging to the tile group together cover a rectangular region of the picture.

In one example, the signaling of flexible tiles can be as follows. A minimum tile width and a minimum tile height are defined values. The first level tile structure can be defined by tile columns and tile rows. The tile columns and tile rows may be uniform or non-uniform in size. Each of these tiles can be referred to as a first level tile. A flag may be signaled to specify whether any of the first level tiles may be further split. This flag need not be present when the width of each first level tile is not greater than twice the minimum tile width and the height of each first level tile is not greater than twice the minimum tile height. When not present, the value of the flag is inferred to be equal to 0.

In one example, the following applies for each first level tile. A flag can be signaled to specify whether the first level tile is further split into one or more tile columns and one or more tile rows. The presence of the flag can be constrained as follows. When the first level tile width is greater than the minimum tile width, or when the first level tile height is greater than the minimum tile height, the flag is present/signaled. Otherwise, the flag is not present, and the value of the flag is inferred to be equal to 0, indicating that the first level tile is not further split.
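
The per-tile presence rule above can be sketched as a small parsing helper. This is a minimal sketch, assuming a hypothetical `read_bit` callable that returns the next bit from the bitstream; the function names are illustrative and do not correspond to any named syntax element in the disclosure.

```python
from typing import Callable

def split_flag_present(tile_w: int, tile_h: int, min_w: int, min_h: int) -> bool:
    """The flag is signaled when the tile exceeds the minimum in either dimension."""
    return tile_w > min_w or tile_h > min_h

def read_split_flag(tile_w: int, tile_h: int, min_w: int, min_h: int,
                    read_bit: Callable[[], int]) -> int:
    """Parse the split flag if present; otherwise infer 0 (tile is not split)."""
    if split_flag_present(tile_w, tile_h, min_w, min_h):
        return read_bit()
    return 0  # inferred value, no bit is consumed

# A tile at the minimum size never consumes a bit, whatever the stream holds.
print(read_split_flag(4, 4, 4, 4, lambda: 1))  # 0
print(read_split_flag(8, 4, 4, 4, lambda: 1))  # 1
```

The encoder and decoder must evaluate the same condition so that both agree on whether a bit was written.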

When a first level tile is further split, the number of tile columns and the number of tile rows for the split may further be signaled. The tile columns and tile rows may be either uniform or non-uniform in size. The tiles resulting from the split of a first level tile are referred to as second level tiles. The presence of the number of tile columns and the number of tile rows can be constrained as follows. When the first level tile width is less than twice the minimum tile width, the number of tile columns need not be signaled, and the number of tile columns can be inferred to be equal to one. The signaling may employ a _minus1 syntax element, so that the signaled syntax element value may be zero and the number of tile columns is the value of the syntax element plus one. This approach may further compress the signaling data. When the first level tile height is less than twice the minimum tile height, the number of tile rows need not be signaled, and the value of the corresponding syntax element can be inferred to be equal to zero. The signaled syntax element value may be zero, and the number of tile rows may be the value of the syntax element plus one, to further compress the signaling data. The flexible tile structure may be limited to second level tiles only, such that no further split of any second level tile is allowed. In other examples, further splitting of the second level tiles can be applied in a manner similar to splitting a first level tile into second level tiles.
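
The inference of the second level column and row counts can be sketched as follows. This is a minimal sketch under stated assumptions: `read_ue` is a hypothetical callable returning the next decoded `_minus1` value, and the parse order (columns before rows) is assumed rather than taken from the disclosure.

```python
from typing import Callable, Tuple

def second_level_grid(tile_w: int, tile_h: int, min_w: int, min_h: int,
                      read_ue: Callable[[], int]) -> Tuple[int, int]:
    """Return (columns, rows) for a first level tile that is being split."""
    # A count is only signaled when the tile can hold at least two
    # minimum-size columns (or rows); otherwise the count is inferred as 1.
    cols = read_ue() + 1 if tile_w >= 2 * min_w else 1
    rows = read_ue() + 1 if tile_h >= 2 * min_h else 1
    return cols, rows

# Both counts signaled as _minus1 values 1 and 2, giving 2 columns and 3 rows.
values = iter([1, 2])
print(second_level_grid(8, 12, 4, 4, lambda: next(values)))  # (2, 3)
```

Coding the counts as `_minus1` values lets the smallest legal count, one, be represented by the shortest codeword.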

In one example, the signaling of the flexible tile structure can be as follows. When a picture contains more than one tile, a signal, such as a flag, can be employed in a parameter set that is directly or indirectly referenced by the corresponding tile group. The flag can specify whether the corresponding tile structure is a uniform tile structure or a non-uniform tile structure (e.g., a flexible tile structure as described herein). The flag may be called uniform_tile_structure_flag. When uniform_tile_structure_flag is equal to 1, signaling of an HEVC-style uniform tile structure is employed, for example by signaling num_tile_columns_minus1 and num_tile_rows_minus1 to indicate a single level of uniform tiles. When uniform_tile_structure_flag is equal to 0, the following information may also be signaled. The number of tiles in the picture can be signaled by a syntax element num_tiles_minus2, which indicates that the number of tiles in the picture (NumTilesInPic) is equal to num_tiles_minus2 + 2. This may result in bit savings during signaling, since a picture may be considered to be a tile by default. For each tile, excluding the last one, the addresses of the first coding block (e.g., CTU) and the last coding block of the tile are signaled. The address of a coding block can be the index of the block in the picture (e.g., the index of the CTU in the picture). The syntax elements for such coding blocks may be tile_first_block_address[i] and tile_last_block_address[i]. These syntax elements may be coded as ue(v) or u(v). When the syntax elements are coded as u(v), the number of bits used to represent each of the syntax elements is ceil(log2(maximum number of coding blocks in the picture)). The addresses of the first and last coding blocks of the last tile need not be signaled, and may instead be derived based on the picture size in luma samples and the aggregation of all other tiles in the picture.
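
The two derivations above, NumTilesInPic and the u(v) code length, can be worked through in a short sketch. It assumes that the maximum number of coding blocks equals the number of CTUs needed to cover the picture; the function names and the 1920x1080 picture with 128x128 CTUs are illustrative assumptions.

```python
def num_tiles_in_pic(num_tiles_minus2: int) -> int:
    """NumTilesInPic = num_tiles_minus2 + 2."""
    return num_tiles_minus2 + 2

def block_address_bits(pic_w: int, pic_h: int, ctu_size: int) -> int:
    """ceil(log2(number of CTUs in the picture)), computed with integers only."""
    ctus_wide = -(-pic_w // ctu_size)   # ceiling division
    ctus_high = -(-pic_h // ctu_size)
    total_ctus = ctus_wide * ctus_high
    # For n >= 1, (n - 1).bit_length() equals ceil(log2(n)).
    return (total_ctus - 1).bit_length()

print(num_tiles_in_pic(0))                  # 2
print(block_address_bits(1920, 1080, 128))  # 15 x 9 = 135 CTUs -> 8 bits
```

The integer form avoids floating-point rounding when the CTU count is an exact power of two.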

In one example, for each tile, excluding the last one, instead of signaling the addresses of the first and last coding blocks of the tile, the address of the first coding block of the tile and the width and height of the tile may be signaled. In another example, for each tile, excluding the last one, instead of signaling the addresses of the first and last coding blocks of the tile, the offset of the top-left point of the tile relative to the origin of the picture (e.g., the top-left of the picture) and the width and height of the tile may be signaled. In yet another example, for each tile, excluding the last one, instead of signaling the addresses of the first and last coding blocks of the tile, the following information can be signaled. The width and height of the tile may be signaled. Further, the location of each tile need not be signaled. Instead, a flag may be signaled to specify whether the tile should be positioned immediately to the right of or immediately below the previous tile. This flag need not be present when the tile can only be to the right of, or can only be below, the previous tile. The top-left offset of the first tile may be set to always be the origin/top-left of the picture (e.g., x = 0 and y = 0).
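
The last variant, in which each tile carries only a size and a right/below flag, can be sketched as follows. This is one possible reading, stated as an assumption: "immediately to the right" is taken to keep the previous tile's y offset, and "immediately below" to keep its x offset. The function name `place_tiles` and the flag representation are hypothetical.

```python
from typing import List, Tuple

def place_tiles(sizes: List[Tuple[int, int]],
                below_flags: List[bool]) -> List[Tuple[int, int]]:
    """Derive top-left (x, y) offsets from tile sizes and right/below flags.

    sizes[i] is (width, height) of tile i. below_flags[i - 1] is True when
    tile i sits directly below tile i - 1, and False when it sits directly
    to its right. The first tile is always placed at the picture origin.
    """
    if len(below_flags) != len(sizes) - 1:
        raise ValueError("one flag is needed for every tile after the first")
    positions = [(0, 0)]
    for i in range(1, len(sizes)):
        prev_x, prev_y = positions[i - 1]
        prev_w, prev_h = sizes[i - 1]
        if below_flags[i - 1]:
            positions.append((prev_x, prev_y + prev_h))
        else:
            positions.append((prev_x + prev_w, prev_y))
    return positions

# One large tile on the left and three small tiles stacked on the right,
# similar in spirit to the video conferencing layout of FIG. 7.
print(place_tiles([(4, 6), (2, 2), (2, 2), (2, 2)], [False, True, True]))
```

The sketch does not handle layouts in which a tile must return to the left picture boundary; a complete derivation would need an additional rule for that case.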

For signaling efficiency, a set of unique tile sizes (e.g., widths and heights) may be signaled. This list of unique tile sizes may then be referenced by index from the loop that signals the size of each tile. In some examples, the tile locations and tile sizes derived from the signaled tile structure must constrain the partitioning so that no gaps and no overlaps occur between any tiles.

The following constraints may also apply. Tile shapes may be required to be rectangular (e.g., not raster scan shapes). The union of the tiles in a picture must cover the picture without any gaps and without any overlaps between tiles. When decoding is performed with only one core, for coding of a current coding block (e.g., CTU) that is not at the left edge of the picture, the left neighboring coding block must be decoded before the current coding block. When decoding is performed with only one core, for coding of a current coding block (e.g., CTU) that is not at the top edge of the picture, the above neighboring coding block must be decoded before the current coding block. When two tiles have tile indices that are adjacent to each other (e.g., idx 3 and idx 4), one of the following is true: the two tiles share a vertical edge; and/or, when the first tile has its top-left location at (Xa, Ya) with size Wa and Ha (its width and height) and the second tile has its top-left location at (Xb, Yb), then Yb = Ya + Ha.
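The coverage constraint (no gaps, no overlaps) can be checked mechanically. The Python sketch below assumes each tile is given as a hypothetical (x, y, width, height) tuple in coding-block units; it is a brute-force check for illustration, not a conformance procedure from the disclosure.

```python
def tiles_cover_picture(tiles, pic_w, pic_h):
    """Return True if rectangular tiles cover the picture exactly once."""
    covered = [[0] * pic_w for _ in range(pic_h)]
    for x, y, w, h in tiles:
        # Reject empty tiles and tiles that extend outside the picture.
        if w <= 0 or h <= 0 or x < 0 or y < 0 or x + w > pic_w or y + h > pic_h:
            return False
        for yy in range(y, y + h):
            for xx in range(x, x + w):
                covered[yy][xx] += 1
    # A count of 0 is a gap; a count above 1 is an overlap.
    return all(count == 1 for row in covered for count in row)


ok = tiles_cover_picture([(0, 0, 2, 4), (2, 0, 2, 2), (2, 2, 2, 2)], 4, 4)
overlap = tiles_cover_picture([(0, 0, 3, 4), (2, 0, 2, 4)], 4, 4)
gap = tiles_cover_picture([(0, 0, 2, 4)], 4, 4)
```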

The following constraints may also apply. When a tile has two or more left neighboring tiles, the height of the tile must be equal to the sum of the heights of all its left neighboring tiles. When a tile has two or more right neighboring tiles, the height of the tile must be equal to the sum of the heights of all its right neighboring tiles. When a tile has two or more above neighboring tiles, the width of the tile must be equal to the sum of the widths of all its above neighboring tiles. When a tile has two or more below neighboring tiles, the width of the tile must be equal to the sum of the widths of all its below neighboring tiles. In addition, the signaling of tile IDs, including the mapping between tile index and tile ID, may be based on the number of tiles in the picture. Accordingly, the mapping may be based on the number of tiles in the picture rather than on tile columns and tile rows. For example, a loop may be employed so that a tile ID is assigned to each tile index, from the first index to the last index, where the first index is 0 and the last index is the number of tiles in the picture minus 1.
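One way to read the left-neighbor rule is sketched below in Python. It assumes tiles are hypothetical (x, y, width, height) tuples and that a "left neighboring tile" is any tile whose right edge touches the tile's left edge over a non-empty vertical range; the disclosure does not define adjacency this precisely, so this interpretation is an assumption.

```python
def left_neighbors(tile, tiles):
    """Tiles whose right edge abuts the left edge of `tile`."""
    x, y, w, h = tile
    return [t for t in tiles
            if t[0] + t[2] == x          # right edge meets our left edge
            and t[1] < y + h and y < t[1] + t[3]]  # vertical ranges overlap


def satisfies_left_height_rule(tile, tiles):
    """With two or more left neighbors, their heights must sum to ours."""
    neighbors = left_neighbors(tile, tiles)
    return len(neighbors) < 2 or sum(t[3] for t in neighbors) == tile[3]


good = [(0, 0, 2, 2), (0, 2, 2, 2), (2, 0, 2, 4)]
bad = [(0, 0, 2, 3), (0, 3, 2, 3), (2, 0, 2, 4), (2, 4, 2, 2)]
good_result = satisfies_left_height_rule((2, 0, 2, 4), good)
bad_result = satisfies_left_height_rule((2, 0, 2, 4), bad)
```

The right, above, and below rules would follow the same pattern with the edges and dimensions swapped.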

The following is a specific example embodiment of the aspects described above. The CTB raster and tile scanning process may be as follows. The list ColWidth[i] for i ranging from 0 to num_level1_tile_columns_minus1, inclusive, specifying the width of the i-th first level tile column in units of CTBs, can be derived as follows.
if (uniform_level1_tile_spacing_flag)
    for (i = 0; i <= num_level1_tile_columns_minus1; i++)
        ColWidth[i] = ((i + 1) * PicWidthInCtbsY) / (num_level1_tile_columns_minus1 + 1) - (i * PicWidthInCtbsY) / (num_level1_tile_columns_minus1 + 1)
else {
    ColWidth[num_level1_tile_columns_minus1] = PicWidthInCtbsY    (6-1)
    for (i = 0; i < num_level1_tile_columns_minus1; i++) {
        ColWidth[i] = level1_tile_column_width_minus1[i] + 1
        ColWidth[num_level1_tile_columns_minus1] -= ColWidth[i]
    }
}
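The column-width derivation can be sketched in Python as follows. The function name and argument layout are assumptions for illustration; the integer division mirrors the "/" of the pseudo-code, and the same arithmetic applies to row heights when PicHeightInCtbsY is used in place of PicWidthInCtbsY.

```python
def level1_sizes(pic_size_in_ctbs, num_minus1, uniform, explicit_minus1=()):
    """Sizes of level 1 tile columns (or rows) in CTBs."""
    n = num_minus1 + 1
    if uniform:
        # Spread the CTBs as evenly as integer division allows.
        return [((i + 1) * pic_size_in_ctbs) // n - (i * pic_size_in_ctbs) // n
                for i in range(n)]
    # Explicit case: all but the last size are signaled as "minus1" values;
    # the last size is whatever remains of the picture.
    sizes = [v + 1 for v in explicit_minus1[:num_minus1]]
    sizes.append(pic_size_in_ctbs - sum(sizes))
    return sizes


uniform_cols = level1_sizes(10, 2, True)            # 10 CTBs, 3 columns
explicit_cols = level1_sizes(10, 2, False, [1, 4])  # widths 2 and 5 signaled
```

In the uniform case the remainder CTBs end up in the later columns, so column widths differ by at most one CTB.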

The list RowHeight[j] for j ranging from 0 to num_level1_tile_rows_minus1, inclusive, specifying the height of the j-th tile row in units of CTBs, can be derived as follows.
if (uniform_level1_tile_spacing_flag)
    for (j = 0; j <= num_level1_tile_rows_minus1; j++)
        RowHeight[j] = ((j + 1) * PicHeightInCtbsY) / (num_level1_tile_rows_minus1 + 1) - (j * PicHeightInCtbsY) / (num_level1_tile_rows_minus1 + 1)
else {
    RowHeight[num_level1_tile_rows_minus1] = PicHeightInCtbsY    (6-2)
    for (j = 0; j < num_level1_tile_rows_minus1; j++) {
        RowHeight[j] = level1_tile_row_height_minus1[j] + 1
        RowHeight[num_level1_tile_rows_minus1] -= RowHeight[j]
    }
}

The list colBd[i] for i ranging from 0 to num_level1_tile_columns_minus1 + 1, inclusive, specifying the location of the i-th tile column boundary in units of CTBs, can be derived as follows.
for (colBd[0] = 0, i = 0; i <= num_level1_tile_columns_minus1; i++)
    colBd[i + 1] = colBd[i] + ColWidth[i]    (6-3)

The list rowBd[j] for j ranging from 0 to num_level1_tile_rows_minus1 + 1, inclusive, specifying the location of the j-th tile row boundary in units of CTBs, can be derived as follows.
for (rowBd[0] = 0, j = 0; j <= num_level1_tile_rows_minus1; j++)
    rowBd[j + 1] = rowBd[j] + RowHeight[j]    (6-4)
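Both boundary lists are running sums of the corresponding sizes with a leading zero. A short Python sketch, with a hypothetical helper name, is:

```python
from itertools import accumulate


def boundaries(sizes):
    """Boundary positions in CTBs: 0 followed by the running sum of sizes."""
    return [0] + list(accumulate(sizes))


col_bd = boundaries([3, 3, 4])   # column widths from a 10-CTB-wide picture
row_bd = boundaries([2, 2])      # row heights from a 4-CTB-high picture
```

The list has one more entry than there are columns or rows, and its last entry equals the picture width or height in CTBs.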

The variable NumTilesInPic, specifying the number of tiles in a picture referring to the PPS, and the lists TileColBd[i], TileRowBd[i], TileWidth[i], and TileHeight[i] for i ranging from 0 to NumTilesInPic - 1, inclusive, specifying the location of the i-th tile column boundary in units of CTBs, the location of the i-th tile row boundary in units of CTBs, the width of the i-th tile in units of CTBs, and the height of the i-th tile in units of CTBs, respectively, can be derived as follows.
for (tileIdx = 0, i = 0; i < NumLevel1Tiles; i++) {
    tileX = i % (num_level1_tile_columns_minus1 + 1)
    tileY = i / (num_level1_tile_columns_minus1 + 1)
    if (!level2_tile_split_flag[i]) {    (6-5)
        TileColBd[tileIdx] = colBd[tileX]
        TileRowBd[tileIdx] = rowBd[tileY]
        TileWidth[tileIdx] = ColWidth[tileX]
        TileHeight[tileIdx] = RowHeight[tileY]
        tileIdx++
    } else {
        for (k = 0; k <= num_level2_tile_columns_minus1[i]; k++)
            colWidth2[k] = ((k + 1) * ColWidth[tileX]) / (num_level2_tile_columns_minus1[i] + 1) - (k * ColWidth[tileX]) / (num_level2_tile_columns_minus1[i] + 1)
        for (k = 0; k <= num_level2_tile_rows_minus1[i]; k++)
            rowHeight2[k] = ((k + 1) * RowHeight[tileY]) / (num_level2_tile_rows_minus1[i] + 1) - (k * RowHeight[tileY]) / (num_level2_tile_rows_minus1[i] + 1)
        for (colBd2[0] = 0, k = 0; k <= num_level2_tile_columns_minus1[i]; k++)
            colBd2[k + 1] = colBd2[k] + colWidth2[k]
        for (rowBd2[0] = 0, k = 0; k <= num_level2_tile_rows_minus1[i]; k++)
            rowBd2[k + 1] = rowBd2[k] + rowHeight2[k]
        numSplitTiles = (num_level2_tile_columns_minus1[i] + 1) * (num_level2_tile_rows_minus1[i] + 1)
        for (k = 0; k < numSplitTiles; k++) {
            tileX2 = k % (num_level2_tile_columns_minus1[i] + 1)
            tileY2 = k / (num_level2_tile_columns_minus1[i] + 1)
            TileColBd[tileIdx] = colBd[tileX] + colBd2[tileX2]
            TileRowBd[tileIdx] = rowBd[tileY] + rowBd2[tileY2]
            TileWidth[tileIdx] = colWidth2[tileX2]
            TileHeight[tileIdx] = rowHeight2[tileY2]
            tileIdx++
        }
    }
}
NumTilesInPic = tileIdx
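The two-level derivation can be sketched in Python as follows. The function names and the tuple layout (column boundary, row boundary, width, height) are assumptions for illustration, and the l2_cols and l2_rows arguments hold the actual counts of level 2 columns and rows (that is, the "minus1" syntax element values plus 1).

```python
def uniform_split(total, n):
    """Split `total` CTBs into n parts as evenly as integer division allows."""
    return [((k + 1) * total) // n - (k * total) // n for k in range(n)]


def bounds(sizes):
    """Boundary positions: 0 followed by the running sum of sizes."""
    out = [0]
    for s in sizes:
        out.append(out[-1] + s)
    return out


def derive_tiles(col_width, row_height, split, l2_cols, l2_rows):
    """Return one (col_bd, row_bd, width, height) tuple per tile, in tile order."""
    n_cols = len(col_width)
    col_bd, row_bd = bounds(col_width), bounds(row_height)
    tiles = []
    for i in range(n_cols * len(row_height)):      # level 1 tiles in raster order
        tx, ty = i % n_cols, i // n_cols
        if not split[i]:
            tiles.append((col_bd[tx], row_bd[ty], col_width[tx], row_height[ty]))
            continue
        w2 = uniform_split(col_width[tx], l2_cols[i])
        h2 = uniform_split(row_height[ty], l2_rows[i])
        cb2, rb2 = bounds(w2), bounds(h2)
        for k in range(l2_cols[i] * l2_rows[i]):   # level 2 tiles in raster order
            x2, y2 = k % l2_cols[i], k // l2_cols[i]
            tiles.append((col_bd[tx] + cb2[x2], row_bd[ty] + rb2[y2], w2[x2], h2[y2]))
    return tiles


# 8x4 CTB picture, two level 1 columns; the right one is split 2x2.
tiles = derive_tiles([4, 4], [4], [0, 1], [1, 2], [1, 2])
num_tiles_in_pic = len(tiles)
```

Level 2 tiles take consecutive tile indices immediately after the preceding level 1 tile, so all tiles created from one level 1 tile stay contiguous in tile order.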

The list CtbAddrRsToTs[ctbAddrRs] for ctbAddrRs ranging from 0 to PicSizeInCtbsY - 1, inclusive, specifying the conversion from a CTB address in the CTB raster scan of a picture to a CTB address in the tile scan, can be derived as follows.
for (ctbAddrRs = 0; ctbAddrRs < PicSizeInCtbsY; ctbAddrRs++) {
    tbX = ctbAddrRs % PicWidthInCtbsY
    tbY = ctbAddrRs / PicWidthInCtbsY
    tileFound = FALSE
    for (tileIdx = NumTilesInPic - 1, i = 0; i < NumTilesInPic - 1 && !tileFound; i++) {    (6-6)
        tileFound = tbX < (TileColBd[i] + TileWidth[i]) && tbY < (TileRowBd[i] + TileHeight[i])
        if (tileFound)
            tileIdx = i
    }
    CtbAddrRsToTs[ctbAddrRs] = 0
    for (i = 0; i < tileIdx; i++)
        CtbAddrRsToTs[ctbAddrRs] += TileHeight[i] * TileWidth[i]
    CtbAddrRsToTs[ctbAddrRs] += (tbY - TileRowBd[tileIdx]) * TileWidth[tileIdx] + tbX - TileColBd[tileIdx]
}
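A Python sketch of the raster-scan to tile-scan conversion follows. It assumes tiles are supplied in tile-index order as hypothetical (col_bd, row_bd, width, height) tuples; the search relies on that ordering, as the pseudo-code does, because the first tile whose right and bottom edges lie beyond the CTB is the tile that contains it.

```python
def ctb_addr_rs_to_ts(tiles, pic_w, pic_h):
    """Map each raster-scan CTB address to its tile-scan address."""
    result = []
    for addr in range(pic_w * pic_h):
        tb_x, tb_y = addr % pic_w, addr // pic_w
        tile_idx = len(tiles) - 1              # default: the last tile
        for i, (cx, ry, w, h) in enumerate(tiles[:-1]):
            if tb_x < cx + w and tb_y < ry + h:
                tile_idx = i
                break
        # CTBs of all earlier tiles come first in tile scan.
        ts = sum(w * h for (_, _, w, h) in tiles[:tile_idx])
        cx, ry, w, _ = tiles[tile_idx]
        # Then raster order inside the containing tile.
        ts += (tb_y - ry) * w + (tb_x - cx)
        result.append(ts)
    return result


# 3x2 CTB picture: a 1-wide tile on the left, a 2-wide tile on the right.
rs_to_ts = ctb_addr_rs_to_ts([(0, 0, 1, 2), (1, 0, 2, 2)], 3, 2)
```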

The list CtbAddrTsToRs[ctbAddrTs] for ctbAddrTs ranging from 0 to PicSizeInCtbsY - 1, inclusive, specifying the conversion from a CTB address in the tile scan to a CTB address in the CTB raster scan of a picture, can be derived as follows.
for (ctbAddrRs = 0; ctbAddrRs < PicSizeInCtbsY; ctbAddrRs++)    (6-7)
    CtbAddrTsToRs[CtbAddrRsToTs[ctbAddrRs]] = ctbAddrRs
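Because the raster-to-tile mapping is a permutation of the CTB addresses, the reverse list is simply its inverse. A minimal Python sketch, using a hypothetical 3x2 CTB picture with a 1-wide left tile and a 2-wide right tile as input:

```python
def invert_mapping(rs_to_ts):
    """Build CtbAddrTsToRs from CtbAddrRsToTs by inverting the permutation."""
    ts_to_rs = [0] * len(rs_to_ts)
    for rs, ts in enumerate(rs_to_ts):
        ts_to_rs[ts] = rs
    return ts_to_rs


rs_to_ts = [0, 2, 3, 1, 4, 5]      # assumed example mapping
ts_to_rs = invert_mapping(rs_to_ts)
```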

The list TileId[ctbAddrTs] for ctbAddrTs ranging from 0 to PicSizeInCtbsY - 1, inclusive, specifying the conversion from a CTB address in the tile scan to a tile ID, and the list FirstCtbAddrTs[tileIdx] for tileIdx ranging from 0 to NumTilesInPic - 1, inclusive, specifying the conversion from a tile ID to the CTB address in the tile scan of the first CTB in the tile, can be derived as follows.
for (i = 0, tileIdx = 0; i < NumTilesInPic; i++, tileIdx++) {
    for (y = TileRowBd[i]; y < TileRowBd[i] + TileHeight[i]; y++)    (6-8)
        for (x = TileColBd[i]; x < TileColBd[i] + TileWidth[i]; x++)
            TileId[CtbAddrRsToTs[y * PicWidthInCtbsY + x]] = tileIdx
    FirstCtbAddrTs[tileIdx] = CtbAddrRsToTs[TileRowBd[tileIdx] * PicWidthInCtbsY + TileColBd[tileIdx]]
}
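The two lists can be sketched in Python as follows. The sketch assumes tiles are hypothetical (col_bd, row_bd, width, height) tuples in tile-index order and that the raster-to-tile-scan mapping has already been computed; the example mapping is hard-coded for a 3x2 CTB picture with a 1-wide left tile and a 2-wide right tile.

```python
def tile_id_and_first_ctb(tiles, rs_to_ts, pic_w):
    """Return (TileId per tile-scan address, FirstCtbAddrTs per tile)."""
    tile_id = [0] * len(rs_to_ts)
    first_ctb_addr_ts = []
    for idx, (cx, ry, w, h) in enumerate(tiles):
        # Label every CTB inside the tile with the tile's index.
        for y in range(ry, ry + h):
            for x in range(cx, cx + w):
                tile_id[rs_to_ts[y * pic_w + x]] = idx
        # The top-left CTB of the tile is its first CTB in tile scan.
        first_ctb_addr_ts.append(rs_to_ts[ry * pic_w + cx])
    return tile_id, first_ctb_addr_ts


tiles = [(0, 0, 1, 2), (1, 0, 2, 2)]
rs_to_ts = [0, 2, 3, 1, 4, 5]      # assumed precomputed mapping
tile_id, first_ctb = tile_id_and_first_ctb(tiles, rs_to_ts, 3)
```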

The list NumCtusInTile[tileIdx] for tileIdx ranging from 0 to NumTilesInPic - 1, inclusive, specifying the conversion from a tile index to the number of CTUs in the tile, can be derived as follows.
for (i = 0, tileIdx = 0; i < NumTilesInPic; i++, tileIdx++)    (6-9)
    NumCtusInTile[tileIdx] = TileWidth[tileIdx] * TileHeight[tileIdx]

An exemplary picture parameter set RBSP syntax is as follows.

Exemplary picture parameter set RBSP semantics are as follows. num_level1_tile_columns_minus1 + 1 specifies the number of level 1 tile columns partitioning the picture. num_level1_tile_columns_minus1 must be in the range of 0 to PicWidthInCtbsY - 1, inclusive. When not present, the value of num_level1_tile_columns_minus1 is inferred to be equal to 0. num_level1_tile_rows_minus1 + 1 specifies the number of level 1 tile rows partitioning the picture. num_level1_tile_rows_minus1 must be in the range of 0 to PicHeightInCtbsY - 1, inclusive. When not present, the value of num_level1_tile_rows_minus1 is inferred to be equal to 0. The variable NumLevel1Tiles is set equal to (num_level1_tile_columns_minus1 + 1) * (num_level1_tile_rows_minus1 + 1). When single_tile_in_pic_flag is equal to 0, NumTilesInPic must be greater than 1. uniform_level1_tile_spacing_flag equal to 1 specifies that level 1 tile column boundaries, and likewise level 1 tile row boundaries, are distributed uniformly across the picture. uniform_level1_tile_spacing_flag equal to 0 specifies that level 1 tile column boundaries, and likewise level 1 tile row boundaries, are not distributed uniformly across the picture but are signaled explicitly using the syntax elements level1_tile_column_width_minus1[i] and level1_tile_row_height_minus1[i]. When not present, the value of uniform_level1_tile_spacing_flag is inferred to be equal to 1.

level1_tile_column_width_minus1[i] + 1 specifies the width of the i-th level 1 tile column in units of CTBs. level1_tile_row_height_minus1[i] + 1 specifies the height of the i-th level 1 tile row in units of CTBs. level2_tile_present_flag specifies that one or more level 1 tiles are divided into multiple level 2 tiles. level2_tile_split_flag[i] specifies that the i-th level 1 tile is divided into two or more tiles. num_level2_tile_columns_minus1[i] + 1 specifies the number of tile columns partitioning the i-th tile. num_level2_tile_columns_minus1[i] must be in the range of 0 to ColWidth[i], inclusive. When not present, the value of num_level2_tile_columns_minus1[i] is inferred to be equal to 0. num_level2_tile_rows_minus1[i] + 1 specifies the number of tile rows partitioning the i-th tile. num_level2_tile_rows_minus1[i] must be in the range of 0 to RowHeight[i], inclusive. When not present, the value of num_level2_tile_rows_minus1[i] is inferred to be equal to 0.

When level2_tile_split_flag[i] is equal to 1, the value of (num_level2_tile_columns_minus1[i] + 1) * (num_level2_tile_rows_minus1[i] + 1) must be greater than 1. A picture may contain zero or more level 1 tiles that have level2_tile_split_flag[i] equal to 0, and zero or more level 2 tiles. A level 1 tile that has level2_tile_split_flag[i] equal to 1 need not be counted in the total number of tiles in the picture. When such a tile is referenced, the set of level 2 tiles obtained from the split may be referenced collectively.
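Under the reading that a split level 1 tile contributes its level 2 tiles rather than itself to the tile count, the total can be sketched in Python as follows. The function name is hypothetical, and the arguments stand for the per-tile values of level2_tile_split_flag, num_level2_tile_columns_minus1, and num_level2_tile_rows_minus1.

```python
def count_tiles(split_flags, l2_cols_minus1, l2_rows_minus1):
    """Total number of tiles in the picture across both levels."""
    total = 0
    for i, split in enumerate(split_flags):
        if not split:
            total += 1                 # unsplit level 1 tile counts once
            continue
        n = (l2_cols_minus1[i] + 1) * (l2_rows_minus1[i] + 1)
        if n <= 1:
            # A split must produce more than one level 2 tile.
            raise ValueError(f"level 1 tile {i} is flagged as split but yields {n} tile")
        total += n                     # split tile counts as its level 2 tiles
    return total


# Four level 1 tiles: unsplit, split 2x2, unsplit, split 1x2.
total = count_tiles([0, 1, 0, 1], [0, 1, 0, 0], [0, 1, 0, 1])
```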

Invoking the CTB raster and tile scanning conversion process derives the following variables:
the list ColWidth[i] for i ranging from 0 to num_level1_tile_columns_minus1, inclusive, specifying the width of the i-th level 1 tile column in units of CTBs;
the list RowHeight[j] for j ranging from 0 to num_level1_tile_rows_minus1, inclusive, specifying the height of the j-th level 1 tile row in units of CTBs;
the variable NumTilesInPic, specifying the number of tiles in a picture referring to the PPS;
the list TileWidth[i] for i ranging from 0 to NumTilesInPic - 1, inclusive, specifying the width of the i-th tile in units of CTBs;
the list TileHeight[i] for i ranging from 0 to NumTilesInPic - 1, inclusive, specifying the height of the i-th tile in units of CTBs;
the list TileColBd[i] for i ranging from 0 to NumTilesInPic - 1, inclusive, specifying the location of the i-th tile column boundary in units of CTBs;
the list TileRowBd[i] for i ranging from 0 to NumTilesInPic - 1, inclusive, specifying the location of the i-th tile row boundary in units of CTBs;
the list CtbAddrRsToTs[ctbAddrRs] for ctbAddrRs ranging from 0 to PicSizeInCtbsY - 1, inclusive, specifying the conversion from a CTB address in the CTB raster scan of a picture to a CTB address in the tile scan;
the list CtbAddrTsToRs[ctbAddrTs] for ctbAddrTs ranging from 0 to PicSizeInCtbsY - 1, inclusive, specifying the conversion from a CTB address in the tile scan to a CTB address in the CTB raster scan of a picture;
the list TileId[ctbAddrTs] for ctbAddrTs ranging from 0 to PicSizeInCtbsY - 1, inclusive, specifying the conversion from a CTB address in the tile scan to a tile ID;
the list NumCtusInTile[tileIdx] for tileIdx ranging from 0 to NumTilesInPic - 1, inclusive, specifying the conversion from a tile index to the number of CTUs in the tile; and
the list FirstCtbAddrTs[tileIdx] for tileIdx ranging from 0 to NumTilesInPic - 1, inclusive, specifying the conversion from a tile ID to the CTB address in the tile scan of the first CTB in the tile.

Exemplary tile group header semantics are as follows. A tile group may contain only a number of complete level 1 tiles, or only a number of complete level 2 tiles of a single level 1 tile. tile_group_address specifies the tile address of the first tile in the tile group, where the tile address is equal to TileId[firstCtbAddrTs] as specified by Equation 6-8, and firstCtbAddrTs is the CTB address in the tile scan of the CTBs of the first CTU in the tile group. The length of tile_group_address is Ceil(Log2(NumTilesInPic)) bits. The value of tile_group_address must be in the range of 0 to NumTilesInPic - 1, inclusive, and must not be equal to the value of tile_group_address of any other coded tile group NAL unit of the same coded picture. When tile_group_address is not present, it is inferred to be equal to 0.
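The length and value constraints on tile_group_address can be sketched in Python as follows. The function names are hypothetical, and the validity check covers only the range and uniqueness rules stated above.

```python
def tile_group_address_bits(num_tiles_in_pic):
    """Ceil(Log2(NumTilesInPic)), computed with integers."""
    return (num_tiles_in_pic - 1).bit_length()


def addresses_are_valid(addresses, num_tiles_in_pic):
    """Each address is in 0..NumTilesInPic-1 and no two tile groups share one."""
    in_range = all(0 <= a < num_tiles_in_pic for a in addresses)
    unique = len(set(addresses)) == len(addresses)
    return in_range and unique


bits_for_5_tiles = tile_group_address_bits(5)
valid = addresses_are_valid([0, 2, 4], 5)
duplicate = addresses_are_valid([0, 2, 2], 5)
out_of_range = addresses_are_valid([0, 5], 5)
```

When the picture holds a single tile the length is zero bits, which matches the rule that an absent tile_group_address is inferred to be 0.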

The following is a second specific example embodiment of the aspects described above. An exemplary CTB raster and tile scanning process is as follows. The variable NumTilesInPic, specifying the number of tiles in a picture referring to the PPS, and the lists TileColBd[i], TileRowBd[i], TileWidth[i], and TileHeight[i] for i ranging from 0 to NumTilesInPic - 1, inclusive, specifying the location of the i-th tile column boundary in units of CTBs, the location of the i-th tile row boundary in units of CTBs, the width of the i-th tile in units of CTBs, and the height of the i-th tile in units of CTBs, respectively, are derived as follows.
for (tileIdx = 0, i = 0; i < NumLevel1Tiles; i++) {
    tileX = i % (num_level1_tile_columns_minus1 + 1)
    tileY = i / (num_level1_tile_columns_minus1 + 1)
    if (!level2_tile_split_flag[i]) {    (6-5)
        TileColBd[tileIdx] = colBd[tileX]
        TileRowBd[tileIdx] = rowBd[tileY]
        TileWidth[tileIdx] = ColWidth[tileX]
        TileHeight[tileIdx] = RowHeight[tileY]
        tileIdx++
    } else {
        if (uniform_level2_tile_spacing_flag[i]) {
            for (k = 0; k <= num_level2_tile_columns_minus1[i]; k++)
                colWidth2[k] = ((k + 1) * ColWidth[tileX]) / (num_level2_tile_columns_minus1[i] + 1) - (k * ColWidth[tileX]) / (num_level2_tile_columns_minus1[i] + 1)
            for (k = 0; k <= num_level2_tile_rows_minus1[i]; k++)
                rowHeight2[k] = ((k + 1) * RowHeight[tileY]) / (num_level2_tile_rows_minus1[i] + 1) - (k * RowHeight[tileY]) / (num_level2_tile_rows_minus1[i] + 1)
        } else {
            colWidth2[num_level2_tile_columns_minus1[i]] = ColWidth[tileX]
            for (k = 0; k < num_level2_tile_columns_minus1[i]; k++) {
                colWidth2[k] = level2_tile_column_width_minus1[k] + 1
                colWidth2[num_level2_tile_columns_minus1[i]] -= colWidth2[k]
            }
            rowHeight2[num_level2_tile_rows_minus1[i]] = RowHeight[tileY]
            for (k = 0; k < num_level2_tile_rows_minus1[i]; k++) {
                rowHeight2[k] = level2_tile_row_height_minus1[k] + 1
                rowHeight2[num_level2_tile_rows_minus1[i]] -= rowHeight2[k]
            }
        }
        for (colBd2[0] = 0, k = 0; k <= num_level2_tile_columns_minus1[i]; k++)
            colBd2[k + 1] = colBd2[k] + colWidth2[k]
        for (rowBd2[0] = 0, k = 0; k <= num_level2_tile_rows_minus1[i]; k++)
            rowBd2[k + 1] = rowBd2[k] + rowHeight2[k]
        numSplitTiles = (num_level2_tile_columns_minus1[i] + 1) * (num_level2_tile_rows_minus1[i] + 1)
        for (k = 0; k < numSplitTiles; k++) {
            tileX2 = k % (num_level2_tile_columns_minus1[i] + 1)
            tileY2 = k / (num_level2_tile_columns_minus1[i] + 1)
            TileColBd[tileIdx] = colBd[tileX] + colBd2[tileX2]
            TileRowBd[tileIdx] = rowBd[tileY] + rowBd2[tileY2]
            TileWidth[tileIdx] = colWidth2[tileX2]
            TileHeight[tileIdx] = rowHeight2[tileY2]
            tileIdx++
        }
    }
}
NumTilesInPic = tileIdx
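What distinguishes this embodiment is that level 2 sizes inside a level 1 tile may be signaled explicitly. A Python sketch of that step follows, assuming the last level 2 column (or row) takes whatever remains of the parent level 1 tile, by analogy with the level 1 derivation; the function name and the validity check are additions for illustration.

```python
def level2_sizes(parent_size, num_minus1, uniform, explicit_minus1=()):
    """Sizes of level 2 columns (or rows) inside one level 1 tile, in CTBs."""
    n = num_minus1 + 1
    if uniform:
        return [((k + 1) * parent_size) // n - (k * parent_size) // n
                for k in range(n)]
    # All sizes but the last are signaled as "minus1" values.
    sizes = [explicit_minus1[k] + 1 for k in range(num_minus1)]
    last = parent_size - sum(sizes)
    if last < 1:
        raise ValueError("explicit sizes leave no room for the last column or row")
    return sizes + [last]


# Level 1 tile 8 CTBs wide, divided into three level 2 columns.
explicit = level2_sizes(8, 2, False, [0, 4])   # widths 1 and 5 signaled
uniform = level2_sizes(8, 2, True)
```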

An exemplary picture parameter set RBSP syntax is as follows.

Exemplary picture parameter set RBSP semantics are as follows. uniform_level2_tile_spacing_flag[i] equal to 1 specifies that level 2 tile column boundaries of the i-th level 1 tile, and likewise level 2 tile row boundaries of the i-th level 1 tile, are distributed uniformly across the picture. uniform_level2_tile_spacing_flag[i] equal to 0 specifies that level 2 tile column boundaries of the i-th level 1 tile, and likewise level 2 tile row boundaries of the i-th level 1 tile, are not distributed uniformly across the picture but are signaled explicitly using the syntax elements level2_tile_column_width_minus1[j] and level2_tile_row_height_minus1[j]. When not present, the value of uniform_level2_tile_spacing_flag[i] is inferred to be equal to 1. level2_tile_column_width_minus1[j] + 1 specifies the width of the j-th level 2 tile column of the i-th level 1 tile in units of CTBs. level2_tile_row_height_minus1[j] + 1 specifies the height of the j-th level 2 tile row of the i-th level 1 tile in units of CTBs.

The following is a third specific example embodiment of the aspects described above. The CTB raster and tile scanning process is as follows. The list ColWidth[i] for i ranging from 0 to num_level1_tile_columns_minus1, inclusive, specifying the width of the i-th first level tile column in units of CTBs, may be derived as follows.
for (i = 0; i <= num_level1_tile_columns_minus1; i++)
    ColWidth[i] = ((i + 1) * PicWidthInCtbsY) / (num_level1_tile_columns_minus1 + 1) - (i * PicWidthInCtbsY) / (num_level1_tile_columns_minus1 + 1)

The list RowHeight[j] for j ranging from 0 to num_level1_tile_rows_minus1, inclusive, specifying the height of the j-th tile row in units of CTBs, may be derived as follows.
for (j = 0; j <= num_level1_tile_rows_minus1; j++)
    RowHeight[j] = ((j + 1) * PicHeightInCtbsY) / (num_level1_tile_rows_minus1 + 1) - (j * PicHeightInCtbsY) / (num_level1_tile_rows_minus1 + 1)
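The uniform derivation of ColWidth and RowHeight can be illustrated with a short Python sketch. The function name uniform_sizes and the example picture dimensions are assumptions made for illustration; the division is taken to be integer division, as is usual for this style of specification pseudocode.

```python
def uniform_sizes(total_ctbs: int, count: int) -> list[int]:
    """Split total_ctbs CTBs into `count` spans that differ by at most one CTB.

    Mirrors ((i + 1) * total) / count - (i * total) / count with integer division.
    """
    return [((i + 1) * total_ctbs) // count - (i * total_ctbs) // count
            for i in range(count)]


# Hypothetical picture: PicWidthInCtbsY = 10 with 3 tile columns,
# PicHeightInCtbsY = 7 with 2 tile rows.
col_width = uniform_sizes(10, 3)    # [3, 3, 4]
row_height = uniform_sizes(7, 2)    # [3, 4]
```

Because each span is the difference of two consecutive rounded boundaries, the spans always add up to the full picture dimension.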

An exemplary picture parameter set RBSP syntax is as follows.

The following is a fourth specific example embodiment of the above aspects. An exemplary picture parameter set RBSP syntax is as follows.

Exemplary picture parameter set RBSP semantics are as follows. Bitstream conformance may require that the following constraints apply. The value MinTileWidth specifies the minimum tile width and shall be equal to 256 luma samples. The value MinTileHeight specifies the minimum tile height and shall be equal to 64 luma samples. The values of the minimum tile width and minimum tile height may vary according to profile and level definitions. The variable Level1TilesMayBeFurtherSplit may be derived as follows.
Level1TilesMayBeFurtherSplit = 0
for (i = 0; !Level1TilesMayBeFurtherSplit && i < NumLevel1Tiles; i++)
    if ((ColWidth[i] * CtbSizeY >= (2 * MinTileWidth)) || (RowHeight[i] * CtbSizeY >= (2 * MinTileHeight)))
        Level1TilesMayBeFurtherSplit = 1
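As a rough illustration of this derivation, the following Python sketch checks whether any level 1 tile is large enough to hold two minimum-size tiles side by side or stacked. It assumes that each level 1 tile is described by its own (width, height) pair in CTBs, which is one reading of the ColWidth[i] and RowHeight[i] indexing; the function name and argument layout are hypothetical.

```python
MIN_TILE_WIDTH = 256    # luma samples, per the constraint stated in the text
MIN_TILE_HEIGHT = 64    # luma samples, per the constraint stated in the text


def level1_tiles_may_be_further_split(tile_sizes_in_ctbs, ctb_size_y):
    """Return True if at least one level 1 tile could be split again.

    tile_sizes_in_ctbs: iterable of (width_in_ctbs, height_in_ctbs) pairs,
    one per level 1 tile (an assumption of this sketch).
    """
    return any(
        w * ctb_size_y >= 2 * MIN_TILE_WIDTH or h * ctb_size_y >= 2 * MIN_TILE_HEIGHT
        for w, h in tile_sizes_in_ctbs
    )


# With 64x64 CTBs, a 4x1 CTB tile is 256x64 samples and cannot be split,
# while an 8x1 CTB tile is 512 samples wide and can.
small_only = level1_tiles_may_be_further_split([(4, 1)], 64)
with_wide = level1_tiles_may_be_further_split([(4, 1), (8, 1)], 64)
```

The early exit of any() plays the same role as the loop condition that stops once the variable becomes 1.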

level2_tile_present_flag specifies that one or more level 1 tiles are split into more tiles. When not present, the value of level2_tile_present_flag is inferred to be equal to 0. level2_tile_split_flag[i] + 1 specifies that the i-th level 1 tile is split into two or more tiles. When not present, the value of level2_tile_split_flag[i] is inferred to be equal to 0.

The following is a fifth specific example embodiment of the above aspects. Each tile location and each tile size may be signaled. The syntax supporting such tile structure signaling may be as tabulated below. tile_top_left_address[i] and tile_bottom_right_address[i] are CTU indices within the picture that indicate the rectangular area covered by the tile. The number of bits used to signal these syntax elements should be sufficient to represent the maximum number of CTUs in the picture.
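A minimal Python sketch of how a decoder might turn the two CTU indices into a rectangle is given below. It assumes the indices are in picture raster-scan order and that the addresses use a fixed-length code; the helper names address_bits and tile_rect are hypothetical and not part of any syntax.

```python
def address_bits(pic_size_in_ctus: int) -> int:
    """Smallest fixed bit length that can represent every CTU index."""
    return max(1, (pic_size_in_ctus - 1).bit_length())


def tile_rect(top_left: int, bottom_right: int, pic_width_in_ctus: int):
    """Return (x, y, width, height) in CTUs for the area between two CTU indices."""
    x0, y0 = top_left % pic_width_in_ctus, top_left // pic_width_in_ctus
    x1, y1 = bottom_right % pic_width_in_ctus, bottom_right // pic_width_in_ctus
    if x1 < x0 or y1 < y0:
        raise ValueError("bottom-right CTU lies above or to the left of top-left CTU")
    return x0, y0, x1 - x0 + 1, y1 - y0 + 1


# Hypothetical picture that is 10 CTUs wide and 7 CTUs high (70 CTUs in total).
rect = tile_rect(top_left=23, bottom_right=47, pic_width_in_ctus=10)   # (3, 2, 5, 3)
bits = address_bits(70)                                                # 7
```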

Each tile location and each tile size may be signaled. The syntax supporting such tile structure signaling may be as tabulated below. tile_top_left_address[i] is the CTU index of the first CTU in the tile, in the order of the CTU raster scan of the picture. The tile width and tile height specify the size of the tile. Some bits may be saved when signaling these two syntax elements by first signaling a unit common to the tile sizes.

Alternatively, the signaling may be as follows.

In another example, each tile size may be signaled as follows. To signal a flexible tile structure, the location of each tile need not be signaled. Instead, a flag may be signaled to specify whether the tile is to be placed immediately to the right of or immediately below the previous tile. This flag may be absent when the tile can only be to the right of, or can only be below, the current tile.

The values of tile_x_offset[i] and tile_y_offset[i] may be derived by the following ordered steps.
tile_x_offset[0] and tile_y_offset[0] are set equal to 0.
maxWidth is set equal to tile_width[0] and maxHeight is set equal to tile_height[0].
runningWidth is set equal to tile_width[0] and runningHeight is set equal to tile_height[0].
lastNewRowHeight is set equal to 0.
TilePositionCannotBeInferred is set equal to false.
For i > 0, the following applies:
  The value isRight is set as follows:
    If runningWidth + tile_width[i] <= PictureWidth, isRight is set equal to 1.
    Otherwise, isRight is set equal to 0.
  The value isBelow is set as follows:
    If runningHeight + tile_height[i] <= PictureHeight, isBelow is set equal to 1.
    Otherwise, isBelow is set equal to 0.
  If isRight == 1 && isBelow == 1, TilePositionCannotBeInferred is set equal to true.
  If isRight == 1 && isBelow == 0, the following applies:
    right_tile_flag[i] = 1
    tile_x_offset[i] = runningWidth
    tile_y_offset[i] = (runningWidth == maxWidth) ? 0 : lastNewRowHeight
    lastNewRowHeight = (runningWidth == maxWidth) ? 0 : lastNewRowHeight
  Otherwise, if isRight == 0 && isBelow == 1, the following applies:
    right_tile_flag[i] = 0
    tile_y_offset[i] = runningHeight
    tile_x_offset[i] = (runningHeight == maxHeight) ? 0 : tile_x_offset[i-1]
    lastNewRowHeight = (runningHeight == maxHeight && runningWidth == maxWidth) ? runningHeight : lastNewRowHeight
  Otherwise, if isRight == 1 && isBelow == 1 && right_tile_flag[i] == 1, the following applies:
    tile_x_offset[i] = runningWidth
    tile_y_offset[i] = (runningWidth == maxWidth) ? 0 : lastNewRowHeight
    lastNewRowHeight = (runningWidth == maxWidth) ? 0 : lastNewRowHeight
  Otherwise (i.e., isRight == 1 && isBelow == 1 && right_tile_flag[i] == 0), the following applies:
    tile_y_offset[i] = runningHeight
    tile_x_offset[i] = (runningHeight == maxHeight) ? 0 : tile_x_offset[i-1]
    lastNewRowHeight = (runningHeight == maxHeight && runningWidth == maxWidth) ? runningHeight : lastNewRowHeight
  If right_tile_flag[i] == 1, the following applies:
    runningWidth = runningWidth + tile_width[i]
    If runningWidth > maxWidth, maxWidth is set equal to runningWidth.
    runningHeight is set equal to tile_y_offset[i] + tile_height[i].
  Otherwise (i.e., right_tile_flag[i] == 0), the following applies:
    runningHeight = runningHeight + tile_height[i]
    If runningHeight > maxHeight, maxHeight is set equal to runningHeight.
    runningWidth is set equal to tile_x_offset[i] + tile_width[i].

The foregoing may be described in pseudocode as follows.
tile_x_offset[0] = 0
tile_y_offset[0] = 0
maxWidth = tile_width[0]
maxHeight = tile_height[0]
runningWidth = tile_width[0]
runningHeight = tile_height[0]
lastNewRowHeight = 0
isRight = false
isBelow = false
TilePositionCannotBeInferred = false
for (i = 1; i < num_tiles_minus2 + 2; i++) {
    TilePositionCannotBeInferred = false
    isRight = (runningWidth + tile_width[i] <= PictureWidth) ? true : false
    isBelow = (runningHeight + tile_height[i] <= PictureHeight) ? true : false
    if (!isRight && !isBelow)
        // Error. This case shall not occur.
    if (isRight && isBelow)
        TilePositionCannotBeInferred = true    // right_tile_flag[i] is taken from the bitstream
    if (isRight && !isBelow) {
        right_tile_flag[i] = true
        tile_x_offset[i] = runningWidth
        tile_y_offset[i] = (runningWidth == maxWidth) ? 0 : lastNewRowHeight
        lastNewRowHeight = tile_y_offset[i]
    }
    else if (!isRight && isBelow) {
        right_tile_flag[i] = false
        tile_y_offset[i] = runningHeight
        tile_x_offset[i] = (runningHeight == maxHeight) ? 0 : tile_x_offset[i-1]
        lastNewRowHeight = (runningHeight == maxHeight && runningWidth == maxWidth) ? runningHeight : lastNewRowHeight
    }
    else if (right_tile_flag[i]) {
        tile_x_offset[i] = runningWidth
        tile_y_offset[i] = (runningWidth == maxWidth) ? 0 : lastNewRowHeight
        lastNewRowHeight = tile_y_offset[i]
    }
    else {
        tile_y_offset[i] = runningHeight
        tile_x_offset[i] = (runningHeight == maxHeight) ? 0 : tile_x_offset[i-1]
        lastNewRowHeight = (runningHeight == maxHeight && runningWidth == maxWidth) ? runningHeight : lastNewRowHeight
    }
    // Update the running extents for the tile just placed.
    if (right_tile_flag[i]) {
        runningWidth += tile_width[i]
        if (runningWidth > maxWidth) maxWidth = runningWidth
        runningHeight = tile_y_offset[i] + tile_height[i]
    }
    else {
        runningHeight += tile_height[i]
        if (runningHeight > maxHeight) maxHeight = runningHeight
        runningWidth = tile_x_offset[i] + tile_width[i]
    }
}
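For readers who want to trace the derivation, the pseudocode can be ported to Python roughly as follows. This is a sketch under stated assumptions rather than a normative implementation: the running-extent update is treated as part of each loop iteration, the placement and update branches are merged because both depend only on right_tile_flag[i], and the signaled_right_flag argument (a hypothetical name) stands in for flags parsed from the bitstream when the position cannot be inferred.

```python
def derive_tile_offsets(tile_width, tile_height, picture_width, picture_height,
                        signaled_right_flag=None):
    """Return (tile_x_offset, tile_y_offset, right_tile_flag) as lists.

    signaled_right_flag maps a tile index to the flag value that would be read
    from the bitstream; it is consulted only when a tile fits both ways.
    """
    signaled_right_flag = signaled_right_flag or {}
    n = len(tile_width)
    x_off, y_off = [0] * n, [0] * n
    right_flag = [None] * n                      # tile 0 carries no flag
    max_w, max_h = tile_width[0], tile_height[0]
    run_w, run_h = tile_width[0], tile_height[0]
    last_new_row_height = 0

    for i in range(1, n):
        is_right = run_w + tile_width[i] <= picture_width
        is_below = run_h + tile_height[i] <= picture_height
        if not is_right and not is_below:
            raise ValueError(f"tile {i} fits neither to the right nor below")
        if is_right and is_below:
            # Position cannot be inferred, so the flag must have been signaled.
            if i not in signaled_right_flag:
                raise ValueError(f"right_tile_flag[{i}] is required but missing")
            go_right = bool(signaled_right_flag[i])
        else:
            go_right = is_right
        right_flag[i] = go_right

        if go_right:
            x_off[i] = run_w
            y_off[i] = 0 if run_w == max_w else last_new_row_height
            last_new_row_height = y_off[i]
            run_w += tile_width[i]
            max_w = max(max_w, run_w)
            run_h = y_off[i] + tile_height[i]
        else:
            y_off[i] = run_h
            x_off[i] = 0 if run_h == max_h else x_off[i - 1]
            if run_h == max_h and run_w == max_w:
                last_new_row_height = run_h
            run_h += tile_height[i]
            max_h = max(max_h, run_h)
            run_w = x_off[i] + tile_width[i]
    return x_off, y_off, right_flag


# Four 2x2 tiles in a 4x4 picture; only tile 1 needs an explicit flag.
xs, ys, flags = derive_tile_offsets([2, 2, 2, 2], [2, 2, 2, 2], 4, 4, {1: True})
# xs == [0, 2, 0, 2], ys == [0, 0, 2, 2]
```

In this example the positions of tiles 2 and 3 are inferred without any flag, which is the bit saving the flag-based signaling aims for.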

The following is one implementation, in pseudocode, that derives the size of the last tile.
tile_x_offset[0] = 0
tile_y_offset[0] = 0
maxWidth = tile_width[0]
maxHeight = tile_height[0]
runningWidth = tile_width[0]
runningHeight = tile_height[0]
lastNewRowHeight = 0
isRight = false
isBelow = false
TilePositionCannotBeInferred = false
for (i = 1; i < num_tiles_minus2 + 2; i++) {
    // The size of the last tile is derived rather than signaled.
    currentTileWidth = (i == num_tiles_minus2 + 1) ? (PictureWidth - runningWidth) % PictureWidth : tile_width[i]
    currentTileHeight = (i == num_tiles_minus2 + 1) ? (PictureHeight - runningHeight) % PictureHeight : tile_height[i]
    isRight = (runningWidth + currentTileWidth <= PictureWidth) ? true : false
    isBelow = (runningHeight + currentTileHeight <= PictureHeight) ? true : false
    if (!isRight && !isBelow)
        // Error. This case shall not occur.
    if (isRight && isBelow)
        TilePositionCannotBeInferred = true
    if (isRight && !isBelow) {
        right_tile_flag[i] = true
        tile_x_offset[i] = runningWidth
        tile_y_offset[i] = (runningWidth == maxWidth) ? 0 : lastNewRowHeight
        lastNewRowHeight = tile_y_offset[i]
    }
    else if (!isRight && isBelow) {
        right_tile_flag[i] = false
        tile_y_offset[i] = runningHeight
        tile_x_offset[i] = (runningHeight == maxHeight) ? 0 : tile_x_offset[i-1]
        lastNewRowHeight = (runningHeight == maxHeight && runningWidth == maxWidth) ? runningHeight : lastNewRowHeight
    }
    else if (right_tile_flag[i]) {
        tile_x_offset[i] = runningWidth
        tile_y_offset[i] = (runningWidth == maxWidth) ? 0 : lastNewRowHeight
        lastNewRowHeight = tile_y_offset[i]
    }
    else {
        tile_y_offset[i] = runningHeight
        tile_x_offset[i] = (runningHeight == maxHeight) ? 0 : tile_x_offset[i-1]
        lastNewRowHeight = (runningHeight == maxHeight && runningWidth == maxWidth) ? runningHeight : lastNewRowHeight
    }
    // Update the running extents for the tile just placed.
    if (right_tile_flag[i]) {
        runningWidth += currentTileWidth
        if (runningWidth > maxWidth) maxWidth = runningWidth
        runningHeight = tile_y_offset[i] + currentTileHeight
    }
    else {
        runningHeight += currentTileHeight
        if (runningHeight > maxHeight) maxHeight = runningHeight
        runningWidth = tile_x_offset[i] + currentTileWidth
    }
}

For further signaling bit savings, the number of unique tile sizes may be signaled to support tabulation of unit tile sizes. A tile size may then be referenced by index only.
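The index-based referencing can be pictured with the following Python sketch. The table contents, the function names, and the use of a fixed-length index are all assumptions for illustration; the text does not specify how the index is coded.

```python
def resolve_tile_sizes(unique_sizes, size_idx):
    """Look up each tile's (width, height) in the table of unique sizes."""
    return [unique_sizes[k] for k in size_idx]


def index_bits(num_unique_sizes: int) -> int:
    """Bits needed for a fixed-length index into the table (an assumed coding)."""
    return max(1, (num_unique_sizes - 1).bit_length())


# Hypothetical table with two unique sizes (in CTBs) shared by four tiles.
unique_sizes = [(4, 4), (2, 2)]
tile_sizes = resolve_tile_sizes(unique_sizes, [0, 1, 1, 0])
# tile_sizes == [(4, 4), (2, 2), (2, 2), (4, 4)], one index bit per tile
```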

The following is a sixth specific example embodiment of the above aspects. An exemplary CTB raster and tile scanning process is as follows. The list TileId[ctbAddrTs] for ctbAddrTs ranging from 0 to PicSizeInCtbsY - 1, inclusive, specifying the conversion from a CTB address in tile scan to a tile ID, and the list FirstCtbAddrTs[tileIdx] for tileIdx ranging from 0 to NumTilesInPic - 1, inclusive, specifying the conversion from a tile ID to the CTB address in tile scan of the first CTB in the tile, may be derived as follows.
for (i = 0, tileIdx = 0; i < NumTilesInPic; i++, tileIdx++) {    (6-8)
    for (y = TileRowBd[i]; y < TileRowBd[i + 1]; y++)
        for (x = TileColBd[i]; x < TileColBd[i + 1]; x++)
            TileId[CtbAddrRsToTs[y * PicWidthInCtbsY + x]] = explicit_tile_id_flag ? tile_id_val[i] : tileIdx
    FirstCtbAddrTs[tileIdx] = CtbAddrRsToTs[TileRowBd[tileIdx] * PicWidthInCtbsY + TileColBd[tileIdx]]
}

The set TileIdToIdx[tileId] for a set of NumTilesInPic tileId values, specifying the conversion from a tile ID to a tile index, may be derived as follows.
for (ctbAddrTs = 0, tileIdx = 0, tileStartFlag = 1; ctbAddrTs < PicSizeInCtbsY; ctbAddrTs++) {
    if (tileStartFlag) {
        TileIdToIdx[TileId[ctbAddrTs]] = tileIdx
        tileStartFlag = 0
    }
    tileEndFlag = (ctbAddrTs == PicSizeInCtbsY - 1) || (TileId[ctbAddrTs + 1] != TileId[ctbAddrTs])
    if (tileEndFlag) {
        tileIdx++
        tileStartFlag = 1
    }
}
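The TileIdToIdx derivation translates almost line for line into Python. The sketch below assumes TileId is available as a list indexed by CTB address in tile scan; the function name and the sample tile IDs are illustrative only.

```python
def tile_id_to_idx(tile_id):
    """Map each tile ID to its tile index, given TileId[] in tile-scan order."""
    mapping = {}
    tile_idx = 0
    tile_start = True
    last = len(tile_id) - 1
    for ctb_addr_ts, current_id in enumerate(tile_id):
        if tile_start:
            mapping[current_id] = tile_idx
            tile_start = False
        # A tile ends at the last CTB or where the next CTB has a different ID.
        tile_end = ctb_addr_ts == last or tile_id[ctb_addr_ts + 1] != current_id
        if tile_end:
            tile_idx += 1
            tile_start = True
    return mapping


# Three tiles with explicit IDs 7, 3 and 9 covering 3, 2 and 1 CTBs.
idx_of = tile_id_to_idx([7, 7, 7, 3, 3, 9])   # {7: 0, 3: 1, 9: 2}
```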

Exemplary picture parameter set RBSP syntax and semantics are as follows.

In some examples, the semantics may be as follows. tile_id_len_minus1 + 1 specifies the number of bits used to represent the syntax element tile_id_val[i] and the syntax element tile_group_address in tile group headers referring to the PPS. The value of tile_id_len_minus1 may be in the range of Ceil(Log2(NumTilesInPic)) to 15, inclusive. In other examples, the semantics may be as follows. tile_id_len_minus1 + 1 may specify the number of bits used to represent the syntax element tile_id_val[i] and the syntax elements that refer to tile ID values in tile group headers referring to the PPS. The value of tile_id_len_minus1 may be in the range of Ceil(Log2(NumTilesInPic)) to 15, inclusive.

explicit_tile_id_flag may be set equal to 1 to specify that the tile ID for each tile is explicitly signaled. explicit_tile_id_flag may be set equal to 0 to specify that tile IDs are not explicitly signaled. tile_id_val[i] may specify the tile ID of the i-th tile in the picture. The length of tile_id_val[i] may be tile_id_len_minus1 + 1 bits. For any integers i and j in the range of 0 to NumTilesInPic - 1, inclusive, tile_id_val[i] is not equal to tile_id_val[j] when i is not equal to j, and tile_id_val[i] is less than tile_id_val[j] when j is greater than i.
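The tile ID constraints lend themselves to a simple conformance check. The Python sketch below is illustrative only: the function names are hypothetical, and the lower bound on tile_id_len_minus1 is taken literally from the stated range Ceil(Log2(NumTilesInPic)) to 15.

```python
def min_tile_id_len_minus1(num_tiles_in_pic: int) -> int:
    """Ceil(Log2(NumTilesInPic)), computed with integer arithmetic."""
    return (num_tiles_in_pic - 1).bit_length()


def tile_ids_are_valid(tile_id_val, tile_id_len_minus1: int) -> bool:
    """Check the length range, the value range, and strictly increasing IDs."""
    num_tiles = len(tile_id_val)
    if not min_tile_id_len_minus1(num_tiles) <= tile_id_len_minus1 <= 15:
        return False
    limit = 1 << (tile_id_len_minus1 + 1)        # IDs occupy tile_id_len_minus1 + 1 bits
    fits = all(0 <= v < limit for v in tile_id_val)
    # Strictly increasing values are automatically pairwise distinct.
    increasing = all(a < b for a, b in zip(tile_id_val, tile_id_val[1:]))
    return fits and increasing


ok = tile_ids_are_valid([1, 5, 9, 12], tile_id_len_minus1=3)
duplicate = tile_ids_are_valid([1, 5, 5], tile_id_len_minus1=3)
```

Checking only adjacent pairs is sufficient here because the ordering constraint already implies uniqueness.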

Exemplary tile group header RBSP syntax and semantics are as follows.

tile_group_address may specify the tile address of the first tile in the tile group. The length of tile_group_address may be tile_id_len_minus1 + 1 bits. The value of tile_group_address may be in the range of 0 to 2^{tile_id_len_minus1 + 1} - 1, inclusive, and the value of tile_group_address is not equal to the value of tile_group_address of any other coded tile group NAL unit of the same coded picture. When not present, tile_group_address may be inferred to be equal to 0.

FIG. 9 is a schematic diagram of an example video coding device 900. The video coding device 900 is suitable for implementing the disclosed examples/embodiments as described herein. The video coding device 900 comprises downstream ports 920, upstream ports 950, and/or transceiver units (Tx/Rx) 910 including transmitters and/or receivers for communicating data upstream and/or downstream over a network. The video coding device 900 also includes a processor 930, including a logic unit and/or central processing unit (CPU) to process the data, and a memory 932 for storing the data. The video coding device 900 may also comprise electrical components, optical-to-electrical (OE) components, electrical-to-optical (EO) components, and/or wireless communication components coupled to the upstream ports 950 and/or downstream ports 920 for communication of data via electrical, optical, or wireless communication networks. The video coding device 900 may also include input and/or output (I/O) devices 960 for communicating data to and from a user. The I/O devices 960 may include output devices such as a display for displaying video data and speakers for outputting audio data. The I/O devices 960 may also include input devices, such as a keyboard, mouse, or trackball, and/or corresponding interfaces for interacting with such output devices.

The processor 930 is implemented by hardware and software. The processor 930 may be implemented as one or more CPU chips, cores (e.g., as a multi-core processor), field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and digital signal processors (DSPs). The processor 930 is in communication with the downstream ports 920, the Tx/Rx 910, the upstream ports 950, and the memory 932. The processor 930 comprises a coding module 914. The coding module 914 implements the disclosed embodiments described herein, such as methods 100, 1000, and 1100, mechanism 600, and/or application 700, which may employ a bitstream 500 and/or an image partitioned according to the flexible video tiling scheme 800. The coding module 914 may also implement any other method/mechanism described herein. Further, the coding module 914 may implement a codec system 200, an encoder 300, and/or a decoder 400. For example, the coding module 914 can partition a picture into first level tiles and partition first level tiles into second level tiles. The coding module 914 can also assign the first level tiles and second level tiles to tile groups such that each tile group contains a number of first level tiles, or a consecutive sequence of second level tiles of a single first level tile, so that all second level tiles created from a single first level tile are assigned to the same tile group. The coding module 914 can also encode and/or decode such tiles in a scan order in which first level tiles are coded in raster scan order and second level tiles are coded in raster scan order within the first level tile from which they are partitioned. The coding module 914 further supports employing such mechanisms to combine sub-pictures at different resolutions into a single picture for various use cases as described herein. Hence, the coding module 914 improves the functionality of the video coding device 900 and addresses problems that are specific to the video coding arts. Further, the coding module 914 effects a transformation of the video coding device 900 to a different state. Alternatively, the coding module 914 can be implemented as instructions stored in the memory 932 and executed by the processor 930 (e.g., as a computer program product stored on a non-transitory medium).

The memory 932 comprises one or more memory types such as disks, tape drives, solid-state drives, read-only memory (ROM), random access memory (RAM), flash memory, ternary content-addressable memory (TCAM), and static random-access memory (SRAM). The memory 932 may be used as an overflow data storage device, to store programs when such programs are selected for execution, and to store instructions and data that are read during program execution.

FIG. 10 is a flowchart of an example method 1000 of encoding an image by employing a flexible tiling scheme, such as the flexible video tiling scheme 800. Method 1000 may be employed by an encoder, such as a codec system 200, an encoder 300, and/or a video coding device 900, when performing method 100, mechanism 600, and/or supporting application 700. Further, method 1000 may be employed to generate a bitstream 500 for transmission to a decoder, such as the decoder 400.

Method 1000 may begin when an encoder receives a video sequence including a plurality of images and determines to encode that video sequence into a bitstream, for example based on user input. As an example, the video sequence, and hence the images, may be encoded at a plurality of resolutions. At step 1001, a picture is partitioned into a plurality of first level tiles. A subset of the first level tiles may also be partitioned into a plurality of second level tiles. In some examples, the first level tiles outside the subset may contain picture data at a first resolution. Further, the second level tiles may contain picture data at a second resolution that is different from the first resolution. In some examples, each first level tile in the subset of first level tiles may contain two or more complete second level tiles.

At step 1003, the first level tiles and the second level tiles are assigned to one or more tile groups. The assignment may be performed such that each tile group contains a number of first level tiles, one or more consecutive sequences of second level tiles where each sequence of second level tiles is split from a single first level tile, or a combination thereof. As a specific example, the assignment may be performed such that all second level tiles created from a single first level tile are assigned to the same tile group. In some examples, each of the one or more tile groups may be constrained such that all tiles in the assigned tile group cover a rectangular portion of the picture.
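The grouping rule in step 1003 can be expressed as a small check. The following Python sketch assumes a hypothetical representation in which every coded tile records the first level tile it came from (an unsplit first level tile is its own parent); the tile names and group numbers are made up for illustration.

```python
def groups_respect_parents(tile_parent, tile_group) -> bool:
    """True if all tiles sharing a first level parent are in the same tile group.

    tile_parent: dict mapping each coded tile to its first level tile.
    tile_group:  dict mapping each coded tile to its tile group ID.
    """
    group_of_parent = {}
    for tile, parent in tile_parent.items():
        group = tile_group[tile]
        # Remember the first group seen for this parent, then compare.
        if group_of_parent.setdefault(parent, group) != group:
            return False
    return True


# First level tile B is split into B0 and B1; A is left unsplit.
parents = {"A": "A", "B0": "B", "B1": "B"}
valid = groups_respect_parents(parents, {"A": 0, "B0": 1, "B1": 1})
invalid = groups_respect_parents(parents, {"A": 0, "B0": 0, "B1": 1})
```

The rectangular-coverage constraint mentioned for some examples is not checked by this sketch.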

At step 1005, the first level tiles and the second level tiles are encoded into a bitstream. For example, the first level tiles and the second level tiles may be encoded according to a scan order. In a specific example, encoding according to the scan order may include encoding the first level tiles in raster scan order. When one of the second level tiles is encountered, the raster scan order encoding of the first level tiles may be paused. All consecutive second level tiles may then be encoded in raster scan order before the raster scan order encoding of the first level tiles continues. For example, all second level tiles partitioned from a current first level tile may be encoded prior to encoding any second level tiles partitioned from a subsequent first level tile. At step 1007, the bitstream may be stored for communication toward a decoder.
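The scan order described in step 1005 can be sketched in Python as a nested raster traversal. The data layout (rows of first level tile names plus a dictionary of splits) and the tile names are assumptions chosen for illustration only.

```python
def tile_coding_order(level1_rows, splits):
    """Return tile names in coding order.

    level1_rows: rows of first level tile names, top to bottom, left to right.
    splits: dict mapping a split first level tile to rows of its second level tiles.
    """
    order = []
    for row in level1_rows:                  # raster scan over first level tiles
        for tile in row:
            if tile in splits:
                # Pause the first level scan and code every second level tile
                # of this first level tile in raster scan order.
                for sub_row in splits[tile]:
                    order.extend(sub_row)
            else:
                order.append(tile)
    return order


# A 2x2 grid of first level tiles in which only B is split into four tiles.
order = tile_coding_order(
    [["A", "B"], ["C", "D"]],
    {"B": [["B0", "B1"], ["B2", "B3"]]},
)
# order == ["A", "B0", "B1", "B2", "B3", "C", "D"]
```

All second level tiles of B are emitted before C, which matches the requirement that the tiles of the current first level tile precede those of any later one.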

FIG. 11 is a flowchart of an example method 1100 of decoding an image by employing a flexible tiling scheme, such as flexible video tiling scheme 800. Method 1100 may be employed by a decoder, such as codec system 200, decoder 400, and/or video coding device 900, when performing method 100, mechanism 600, and/or supporting application 700. Further, method 1100 may be employed upon receiving bitstream 500 from an encoder, such as encoder 300.

Method 1100 may begin when a decoder begins receiving a bitstream of coded data representing a video sequence, for example as a result of method 1000. The bitstream may contain video data from video sequences coded at a plurality of resolutions. In step 1101, the bitstream is received. The bitstream includes a picture partitioned into a plurality of first level tiles. A subset of the first level tiles may be further partitioned into a plurality of second level tiles. In some examples, first level tiles outside the subset may contain picture data at a first resolution. Further, the second level tiles may contain picture data at a second resolution different from the first resolution. In another example, each first level tile in the subset of first level tiles may contain two or more complete second level tiles. The first level tiles and the second level tiles are assigned to one or more tile groups. The assignment may be performed such that each tile group contains a number of first level tiles, one or more consecutive sequences of second level tiles where each sequence of second level tiles is split from a single first level tile, or a combination thereof. As a specific example, the assignment may be performed such that all second level tiles created from a single first level tile are assigned to the same tile group. In some examples, each of the one or more tile groups may be constrained such that all tiles in the assigned tile group cover a rectangular portion of the picture.
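As a hedged illustration of the rectangular-coverage constraint, the sketch below tests whether the tiles of one tile group together cover exactly a rectangle. It assumes that each tile is given as a hypothetical `(x, y, width, height)` tuple in picture samples and that the tiles do not overlap, as is the case for tiles obtained by partitioning a picture. Under that assumption, the tiles fill their bounding box exactly when their total area equals the area of the bounding box.

```python
def covers_rectangle(tiles):
    """Return True if non-overlapping tiles exactly fill their bounding box.

    tiles: list of (x, y, width, height) tuples, one per tile in the group.
    """
    if not tiles:
        return False
    left = min(x for x, y, w, h in tiles)
    top = min(y for x, y, w, h in tiles)
    right = max(x + w for x, y, w, h in tiles)
    bottom = max(y + h for x, y, w, h in tiles)
    covered_area = sum(w * h for x, y, w, h in tiles)
    return covered_area == (right - left) * (bottom - top)


# One first level tile next to two second level tiles: a 128x64 rectangle.
rect_group = [(0, 0, 64, 64), (64, 0, 32, 64), (96, 0, 32, 64)]
# Three first level tiles in an L shape: not a rectangle.
l_shaped_group = [(0, 0, 64, 64), (64, 0, 64, 64), (0, 64, 64, 64)]
print(covers_rectangle(rect_group))      # True
print(covers_rectangle(l_shaped_group))  # False
```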

In step 1105, the first level tiles and the second level tiles may be decoded based on the one or more tile groups. In some examples, the first level tiles and the second level tiles are decoded according to a scan order. In a particular example, decoding according to the scan order may include decoding the first level tiles in raster scan order. Upon encountering one of the second level tiles, the raster scan order decoding of the first level tiles may be paused. All consecutive second level tiles may then be decoded in raster scan order before the raster scan order decoding of the first level tiles continues. For example, all second level tiles partitioned from a current first level tile may be decoded before decoding any second level tiles partitioned from a subsequent second level tile.

In step 1107, a reconstructed video sequence may be generated for display based on the decoded first level tiles and second level tiles.
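As one simplified possibility, a single picture of the reconstructed video sequence may be assembled by placing the samples of each decoded tile at that tile's position. The sketch below is illustrative only: the function `compose_picture`, the `(x, y, samples)` tuples, and the use of nested lists for sample arrays are assumptions, and any resampling or filtering that an actual decoder might apply is omitted.

```python
def compose_picture(width, height, decoded_tiles):
    """Paste decoded tile samples into one picture.

    decoded_tiles: list of (x, y, samples), where samples is a list of
                   rows and (x, y) is the tile's top-left position.
    Positions not covered by any tile remain None.
    """
    picture = [[None] * width for _ in range(height)]
    for x, y, samples in decoded_tiles:
        for dy, row in enumerate(samples):
            # Slice assignment with an equal-length row keeps the width fixed.
            picture[y + dy][x:x + len(row)] = row
    return picture


# A 4x2 picture: one first level tile (value 1) and two second level
# tiles (values 2 and 3) split from the neighboring first level tile.
tiles = [
    (0, 0, [[1, 1], [1, 1]]),
    (2, 0, [[2], [2]]),
    (3, 0, [[3], [3]]),
]
print(compose_picture(4, 2, tiles))
# [[1, 1, 2, 3], [1, 1, 2, 3]]
```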

FIG. 12 is a schematic diagram of an example system 1200 for coding a video sequence by employing a flexible tiling scheme, such as flexible video tiling scheme 800. System 1200 may be implemented by an encoder and a decoder, such as codec system 200, encoder 300, decoder 400, and/or video coding device 900. Further, system 1200 may be employed when implementing methods 100, 1000, and 1100, mechanism 600, and/or application 700. System 1200 may also encode data into a bitstream, such as bitstream 500, and decode such a bitstream for display to a user.

System 1200 includes a video encoder 1202. Video encoder 1202 comprises a partitioning module 1201 for partitioning a picture into a plurality of first level tiles and partitioning a subset of the first level tiles into a plurality of second level tiles. Video encoder 1202 further comprises an assignment module 1203 for assigning the first level tiles and the second level tiles to one or more tile groups such that all second level tiles created from a single first level tile are assigned to the same tile group. Video encoder 1202 further comprises an encoding module 1205 for encoding the first level tiles and the second level tiles into a bitstream. Video encoder 1202 further comprises a storing module 1207 for storing the bitstream for communication toward a decoder. Video encoder 1202 further comprises a transmitting module 1209 for transmitting the bitstream toward the decoder. Video encoder 1202 may be further configured to perform any of the steps of method 1000.

System 1200 also includes a video decoder 1210. Video decoder 1210 comprises a receiving module 1211 for receiving a bitstream including a picture partitioned into a plurality of first level tiles, wherein a subset of the first level tiles is further partitioned into a plurality of second level tiles, and the first level tiles and the second level tiles are assigned to one or more tile groups such that all second level tiles created from a single first level tile are assigned to the same tile group. Video decoder 1210 further comprises a decoding module 1213 for decoding the first level tiles and the second level tiles based on the one or more tile groups. Video decoder 1210 further comprises a generating module 1215 for generating a reconstructed video sequence for display based on the decoded first level tiles and second level tiles. Video decoder 1210 may be further configured to perform any of the steps of method 1100.

A first component is directly coupled to a second component when there are no intervening components, except for a line, a trace, or another medium, between the first component and the second component. The first component is indirectly coupled to the second component when there are intervening components other than a line, a trace, or another medium between the first component and the second component. The term "coupled" and its variants include both directly coupled and indirectly coupled. The use of the term "about" means a range including ±10% of the subsequent number unless otherwise stated.

It should also be understood that the steps of the example methods set forth herein are not necessarily required to be performed in the order described, and the order of the steps of such methods should be understood to be merely an example. Likewise, additional steps may be included in such methods, and certain steps may be omitted or combined, in methods consistent with various embodiments of the present disclosure.

While several embodiments have been provided in the present disclosure, it may be understood that the disclosed systems and methods might be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system, or certain features may be omitted or not implemented.

In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, components, techniques, or methods without departing from the scope of the present disclosure. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and may be made without departing from the spirit and scope disclosed herein.

100 Method of operation
200 Coding and decoding (codec) system
201 Partitioned video signal
211 General coder control component
213 Transform scaling and quantization component
215 Intra-picture estimation component
217 Intra-picture prediction component
219 Motion compensation component
221 Motion estimation component
223 Decoded picture buffer component
225 In-loop filter component
227 Filter control analysis component
229 Scaling and inverse transform component
231 Header formatting and context adaptive binary arithmetic coding (CABAC) component
300 Video encoder
301 Partitioned video signal
313 Transform and quantization component
317 Intra-picture prediction component
321 Motion compensation component
323 Decoded picture buffer component
325 In-loop filter component
329 Inverse transform and quantization component
331 Entropy coding component
400 Video decoder
417 Intra-picture prediction component
421 Motion compensation component
423 Decoded picture buffer component
425 In-loop filter component
429 Inverse transform and quantization component
433 Entropy decoding component
500 Bitstream
510 Sequence parameter set (SPS)
512 Picture parameter set (PPS)
514 Tile group header
520 Image data
523, 601, 603, 605, 607 Tile
600 Mechanism
610 Extractor track
611 First resolution
612 Second resolution
700 Video conferencing application
701 Speaking participant
703 Other participants
800 Flexible video tiling scheme
801 First level tile
803 Second level tile
805 Tile group
807 Scan order
900 Video coding device
910 Transceiver unit (Tx/Rx)
914 Coding module
920 Downstream port
930 Processor
932 Memory
950 Upstream port
960 Input and/or output (I/O) device
1000 Method of operation
1100 Method of operation
1200 System
1201 Partitioning module
1202 Video encoder
1203 Assignment module
1205 Encoding module
1207 Storing module
1209 Transmitting module, transmitter
1210 Video decoder
1211 Receiving module, receiver
1213 Decoding module
1215 Generating module

Claims

A method implemented in an encoder, the method comprising:
partitioning, by a processor of the encoder, a picture into a plurality of first level tiles;
partitioning, by the processor, a subset of the first level tiles into a plurality of second level tiles;
assigning, by the processor, the first level tiles and the second level tiles to one or more tile groups such that each tile group contains a number of first level tiles, one or more consecutive sequences of second level tiles where each sequence of second level tiles is split from a single first level tile, or a combination thereof;
encoding, by the processor, the first level tiles and the second level tiles into a bitstream; and
storing the bitstream in a memory of the encoder for communication toward a decoder.

A method implemented in an encoder, the method comprising:
partitioning, by a processor of the encoder, a picture into a plurality of first level tiles;
partitioning, by the processor, a subset of the first level tiles into a plurality of second level tiles;
assigning, by the processor, the first level tiles and the second level tiles to one or more tile groups such that all second level tiles created from a single first level tile are assigned to the same tile group;
encoding, by the processor, the first level tiles and the second level tiles into a bitstream; and
storing the bitstream in a memory of the encoder for communication toward a decoder.

The method according to claim 1 or 2, wherein first level tiles outside the subset contain picture data at a first resolution, and the second level tiles contain picture data at a second resolution different from the first resolution.

The method according to any one of claims 1 to 3, wherein each first level tile in the subset of first level tiles contains two or more complete second level tiles.

The method according to any one of claims 1 to 4, wherein the first level tiles and the second level tiles are encoded according to a scan order, and encoding according to the scan order comprises:
encoding the first level tiles in raster scan order;
upon encountering one of the second level tiles, pausing the raster scan order encoding of the first level tiles; and
encoding all consecutive second level tiles in raster scan order before continuing the raster scan order encoding of the first level tiles.

The method according to any one of claims 1 to 5, wherein all second level tiles partitioned from a current first level tile are encoded before any second level tiles partitioned from a subsequent second level tile are encoded.

The method according to any one of claims 1 to 6, wherein each of the one or more tile groups is constrained such that all tiles in the assigned tile group cover a rectangular portion of the picture.

A method implemented in a decoder, the method comprising:
receiving, by a processor of the decoder via a receiver, a bitstream including a picture partitioned into a plurality of first level tiles, wherein a subset of the first level tiles is further partitioned into a plurality of second level tiles, and wherein the first level tiles and the second level tiles are assigned to one or more tile groups such that each tile group contains a number of first level tiles, one or more consecutive sequences of second level tiles where each sequence of second level tiles is split from a single first level tile, or a combination thereof;
decoding, by the processor, the first level tiles and the second level tiles based on the one or more tile groups; and
generating, by the processor, a reconstructed video sequence for display based on the decoded first level tiles and second level tiles.

A method implemented in a decoder, the method comprising:
receiving, by a processor of the decoder via a receiver, a bitstream including a picture partitioned into a plurality of first level tiles, wherein a subset of the first level tiles is further partitioned into a plurality of second level tiles, and wherein the first level tiles and the second level tiles are assigned to one or more tile groups such that all second level tiles created from a single first level tile are assigned to the same tile group;
decoding, by the processor, the first level tiles and the second level tiles based on the one or more tile groups; and
generating, by the processor, a reconstructed video sequence for display based on the decoded first level tiles and second level tiles.

The method according to claim 8 or 9, wherein first level tiles outside the subset contain picture data at a first resolution, and the second level tiles contain picture data at a second resolution different from the first resolution.

The method according to any one of claims 8 to 10, wherein each first level tile in the subset of first level tiles contains two or more complete second level tiles.

The method according to any one of claims 8 to 11, wherein the first level tiles and the second level tiles are decoded according to a scan order, and decoding according to the scan order comprises:
decoding the first level tiles in raster scan order;
upon encountering one of the second level tiles, pausing the raster scan order decoding of the first level tiles; and
decoding all consecutive second level tiles in raster scan order before continuing the raster scan order decoding of the first level tiles.

The method according to any one of claims 8 to 12, wherein all second level tiles partitioned from a current first level tile are decoded before any second level tiles partitioned from a subsequent second level tile are decoded.

The method according to any one of claims 8 to 13, wherein each of the one or more tile groups is constrained such that all tiles in the assigned tile group cover a rectangular portion of the picture.

A video coding device,
13. Configured to perform the method of
Video coding device.

A non-transitory computer-readable medium comprising a computer program product for use by a video coding device, the computer program product comprising computer-executable instructions stored on the non-transitory computer-readable medium such that, when executed by a processor, the instructions cause the video coding device to perform the method according to any one of claims 1 to 14.

An encoder comprising:
partitioning means for partitioning a picture into a plurality of first level tiles and partitioning a subset of the first level tiles into a plurality of second level tiles;
assigning means for assigning the first level tiles and the second level tiles to one or more tile groups such that all second level tiles created from a single first level tile are assigned to the same tile group;
encoding means for encoding the first level tiles and the second level tiles into a bitstream; and
storing means for storing the bitstream for communication toward a decoder.

The encoder according to claim 17, further configured to perform the method of any one of claims 1 to 7.

A decoder comprising:
receiving means for receiving a bitstream including a picture partitioned into a plurality of first level tiles, wherein a subset of the first level tiles is further partitioned into a plurality of second level tiles, and wherein the first level tiles and the second level tiles are assigned to one or more tile groups such that all second level tiles created from a single first level tile are assigned to the same tile group;
decoding means for decoding the first level tiles and the second level tiles based on the one or more tile groups; and
generating means for generating a reconstructed video sequence for display based on the decoded first level tiles and second level tiles.

The decoder according to claim 19, further configured to perform the method of any one of claims 9 to 16.