JP2023123508A

JP2023123508A - Improved attribute layer and signaling in point cloud coding

Info

Publication number: JP2023123508A
Application number: JP2023093167A
Authority: JP
Inventors: Ye-Kui Wang; Fnu Hendry; Vladyslav Zakharchenko
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2018-09-14
Filing date: 2023-06-06
Publication date: 2023-09-05
Also published as: KR102589477B1; US20210203989A1; US20210201539A1; CN116708799A; EP3841750A4; CN116847105A; CN113016184A; KR20210057161A; EP3841750A1; CN112690002A; KR20210057143A; MX2021003050A; US11979605B2; WO2020055869A1; EP3844963A1; SG11202102620XA; JP2021536204A; JP7130853B2; WO2020055865A1; CN113016184B

Abstract

To provide an encoder, a decoder, and methods implemented by them. SOLUTION: A method for decoding a point cloud coding (PCC) video sequence includes receiving a bitstream comprising a plurality of coded sequences of PCC frames. The plurality of coded sequences of PCC frames represent a plurality of PCC attributes including geometry, texture, and one or more of reflectance, transparency, and normal. Each coded PCC frame is represented by one or more PCC network abstraction layer (NAL) units. The method also includes parsing the bitstream to obtain, for each PCC attribute, an indication of one of a plurality of video coder-decoders (codecs) used to code the corresponding PCC attribute, and decoding the bitstream based on the indicated video codecs for the PCC attributes. SELECTED DRAWING: Figure 11

Description

This patent application claims the benefit of U.S. Provisional Patent Application No. 62/731,693, entitled "High-Level Syntax Designs for Point Cloud Coding," filed September 14, 2018 by Ye-Kui Wang et al., which is incorporated herein by reference.

The present disclosure relates generally to video coding, and specifically to the coding of video attributes for point cloud coding (PCC) video frames.

The amount of video data needed to depict even a relatively short video can be substantial, which may result in difficulties when the data is to be streamed or otherwise communicated across a communications network with limited bandwidth capacity. Video data is therefore generally compressed before being communicated across modern telecommunications networks. The size of a video can also be an issue when the video is stored on a storage device, because memory resources may be limited. Video compression devices often use software and/or hardware at the source to code the video data prior to transmission or storage, thereby decreasing the quantity of data needed to represent digital video images. The compressed data is then received at the destination by a video decompression device that decodes the video data. With limited network resources and ever-increasing demands for higher video quality, improved compression and decompression techniques that improve compression ratio with little to no sacrifice in image quality are desirable.

In one embodiment, the disclosure includes a method implemented by a video decoder. The method comprises receiving, by a receiver, a bitstream comprising a plurality of coded sequences of point cloud coding (PCC) frames, wherein the plurality of coded sequences of PCC frames represent a plurality of PCC attributes including geometry, texture, and one or more of reflectance, transparency, and normal, and wherein each coded PCC frame is represented by one or more PCC network abstraction layer (NAL) units. The method further comprises parsing, by a processor, the bitstream to obtain, for each PCC attribute, an indication of one of a plurality of video coder-decoders (codecs) used to code the corresponding PCC attribute. The method further comprises decoding, by the processor, the bitstream based on the indicated video codecs for the PCC attributes. In some video coding systems, an entire sequence of PCC frames is coded using a single codec. A PCC frame may contain multiple PCC attributes, and some video codecs may be more efficient than others at encoding particular PCC attributes. The present embodiment allows different video codecs to encode different PCC attributes for the same sequence of PCC frames. The present embodiment also provides various syntax elements to support coding flexibility when the PCC frames in a sequence use multiple PCC attributes (e.g., three or more). By providing for more attributes, the encoder can encode more complex PCC frames, and the decoder can decode and display them. Further, by allowing different codecs to be employed for different attributes, the coding process can be optimized based on codec selection. This may reduce processor resource usage at both the encoder and the decoder. It may also support increased compression and coding efficiency, which reduces memory usage and network resource usage when the bitstream is transmitted between the encoder and the decoder.
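The parse-then-decode flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the bitstream syntax of the disclosure: `AttributeUnit`, `decode_by_attribute`, and the codec names `codec_a` and `codec_b` are hypothetical, and the "decoders" are stand-in functions that only tag the payload rather than real video decoders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AttributeUnit:
    """Hypothetical coded data unit carrying data for one PCC attribute."""
    attribute: str   # e.g. "geometry", "texture", "reflectance"
    payload: bytes   # coded data for that attribute


def decode_by_attribute(
    codec_for_attribute: Dict[str, str],
    units: List[AttributeUnit],
    decoders: Dict[str, Callable[[bytes], bytes]],
) -> Dict[str, List[bytes]]:
    """Decode each unit with the codec indicated for its attribute."""
    decoded: Dict[str, List[bytes]] = {}
    for unit in units:
        # Look up the per-attribute codec indication parsed from the bitstream.
        codec_name = codec_for_attribute[unit.attribute]
        decode = decoders[codec_name]
        decoded.setdefault(unit.attribute, []).append(decode(unit.payload))
    return decoded


# Stand-in decoders (assumption): they only prefix the payload with a tag.
decoders = {
    "codec_a": lambda data: b"A:" + data,
    "codec_b": lambda data: b"B:" + data,
}
# Per-attribute indications, as they might look after parsing.
codec_for_attribute = {
    "geometry": "codec_a",
    "texture": "codec_b",
    "reflectance": "codec_a",
}
units = [
    AttributeUnit("geometry", b"g0"),
    AttributeUnit("texture", b"t0"),
    AttributeUnit("reflectance", b"r0"),
    AttributeUnit("geometry", b"g1"),
]
frames = decode_by_attribute(codec_for_attribute, units, decoders)
```

Keeping the indication as a per-attribute lookup means that adding an attribute only adds an entry to the table; the decoding loop itself does not change.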

Optionally, in any of the preceding aspects, another implementation of the aspect provides that each sequence of PCC frames is associated with a sequence-level data unit containing sequence-level parameters, wherein the sequence-level data unit comprises a first syntax element indicating that a first attribute is coded by a first video codec and that a second attribute is coded by a second video codec.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the first syntax element is an identified_codec_for_attribute element contained in a group of frames header in the bitstream.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the first attribute is organized into a plurality of streams, and a second syntax element indicates stream membership for data units of the bitstream associated with the first attribute.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the first attribute is organized into a plurality of layers, and a third syntax element indicates layer membership for data units of the bitstream associated with the first attribute.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the second syntax element is a num_streams_for_attribute element contained in a group of frames header in the bitstream, and the third syntax element is a num_layers_for_attribute element contained in the group of frames header in the bitstream.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that a fourth syntax element indicates that a first layer of the plurality of layers contains data associated with an irregular point cloud.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the fourth syntax element is a regular_points_flag element contained in a group of frames header in the bitstream.
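To make the relationship between the four named syntax elements easier to follow, the sketch below groups them into one per-attribute record. The field names are the element names from the text; everything else is an assumption made for illustration: the class name `AttributeHeaderInfo`, the use of one `regular_points_flag` value per layer, the flag polarity (a true value is taken to mean the layer carries only regular points), and the check that the stream count lies between zero and four, a range mentioned later in the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AttributeHeaderInfo:
    """Hypothetical per-attribute record in a group of frames header."""
    identified_codec_for_attribute: int   # which video codec codes this attribute
    num_streams_for_attribute: int        # number of streams carrying the attribute
    num_layers_for_attribute: int         # number of layers of the attribute
    # Assumption: one flag per layer; True means "regular points only".
    regular_points_flag: List[bool] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the fields are mutually inconsistent."""
        if not 0 <= self.num_streams_for_attribute <= 4:
            raise ValueError("expected zero to four streams per attribute")
        if len(self.regular_points_flag) != self.num_layers_for_attribute:
            raise ValueError("expected one regular_points_flag per layer")

    def irregular_layers(self) -> List[int]:
        """Indices of layers that carry irregular point cloud data."""
        return [i for i, regular in enumerate(self.regular_points_flag) if not regular]


info = AttributeHeaderInfo(
    identified_codec_for_attribute=1,
    num_streams_for_attribute=2,
    num_layers_for_attribute=3,
    regular_points_flag=[True, False, True],
)
info.validate()
```

A decoder built along these lines could consult `irregular_layers()` to decide which layers need the separate handling that irregular point cloud patches require.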

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the bitstream is decoded into a decoded sequence of PCC frames, and that the method further comprises forwarding, by the processor, the decoded sequence of PCC frames toward a display for presentation.

In one embodiment, the disclosure includes a method implemented in a video encoder. The method comprises encoding, by a processor, a plurality of PCC attributes of a sequence of PCC frames into a bitstream with a plurality of coder-decoders (codecs), wherein the plurality of PCC attributes includes geometry, texture, and one or more of reflectance, transparency, and normal, and wherein each coded PCC frame is represented by one or more PCC network abstraction layer (NAL) units. The method further comprises encoding, by the processor, for each PCC attribute, an indication of the one of the video codecs used to code the corresponding PCC attribute. The method further comprises transmitting, by a transmitter, the bitstream toward a decoder. In some video coding systems, an entire sequence of PCC frames is coded using a single codec. A PCC frame may contain multiple PCC attributes, and some video codecs may be more efficient than others at encoding particular PCC attributes. The present embodiment allows different video codecs to encode different PCC attributes for the same sequence of PCC frames. The present embodiment also provides various syntax elements to support coding flexibility when the PCC frames in a sequence use multiple PCC attributes (e.g., three or more). By providing for more attributes, the encoder can encode more complex PCC frames, and the decoder can decode and display them. Further, by allowing different codecs to be employed for different attributes, the coding process can be optimized based on codec selection. This may reduce processor resource usage at both the encoder and the decoder. It may also support increased compression and coding efficiency, which reduces memory usage and network resource usage when the bitstream is transmitted between the encoder and the decoder.
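The idea of optimizing the coding process through codec selection can be illustrated with a small encoder-side sketch. It is only an analogy under stated assumptions: general-purpose compressors from the Python standard library (`zlib`, `bz2`, `lzma`) stand in for video codecs, the selection rule (smallest coded size) is one possible criterion rather than the one used by the disclosure, and the names `CANDIDATES`, `encode_attributes`, and `decode_attributes` are hypothetical.

```python
import bz2
import lzma
import zlib

# Stand-in "codecs" (assumption): name -> (encode function, decode function).
CANDIDATES = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def encode_attributes(attributes):
    """Code each attribute with whichever candidate gives the smallest output.

    Returns (indications, coded), where indications maps each attribute name
    to the chosen codec name and plays the role of the per-attribute
    indication written into the bitstream.
    """
    indications = {}
    coded = {}
    for name, raw in attributes.items():
        outputs = {c: enc(raw) for c, (enc, _) in CANDIDATES.items()}
        best = min(outputs, key=lambda c: len(outputs[c]))
        indications[name] = best
        coded[name] = outputs[best]
    return indications, coded


def decode_attributes(indications, coded):
    """Decode each attribute with the codec named in its indication."""
    return {
        name: CANDIDATES[indications[name]][1](data)
        for name, data in coded.items()
    }


# Synthetic attribute data, for illustration only.
attributes = {
    "geometry": bytes(range(256)) * 8,
    "texture": b"\x10\x20\x30" * 500,
    "reflectance": b"\x00" * 2000,
}
indications, coded = encode_attributes(attributes)
restored = decode_attributes(indications, coded)
```

Because the indication travels with the coded data, the decoder does not need to guess which codec was used for which attribute.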

Optionally, in any of the preceding aspects, another implementation of the aspect provides that each sequence of PCC frames is associated with a sequence-level data unit containing sequence-level parameters, wherein the sequence-level data unit comprises a first syntax element indicating that a first PCC attribute is coded by a first video codec and that a second PCC attribute is coded by a second video codec.

In one embodiment, the disclosure includes a processor, a receiver coupled to the processor, and a transmitter coupled to the processor, wherein the processor, the receiver, and the transmitter are configured to perform the method of any of the preceding aspects.

In one embodiment, the disclosure includes a non-transitory computer-readable medium comprising a computer program product for use by a video coding device, the computer program product comprising computer-executable instructions stored on the non-transitory computer-readable medium such that, when executed by a processor, they cause the video coding device to perform the method of any of the preceding aspects.

In one embodiment, the disclosure includes an encoder comprising a first attribute encoding means and a second attribute encoding means for encoding a plurality of PCC attributes of a sequence of PCC frames into a bitstream with a plurality of coder-decoders (codecs), wherein the plurality of PCC attributes includes geometry, texture, and one or more of reflectance, transparency, and normal, and wherein each coded PCC frame is represented by one or more PCC network abstraction layer (NAL) units. The encoder further comprises a syntax encoding means for encoding, for each PCC attribute, an indication of the one of the video codecs used to code the corresponding PCC attribute. The encoder further comprises a transmitting means for transmitting the bitstream toward a decoder.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the encoder is further configured to perform the method of any of the preceding aspects.

In one embodiment, the disclosure includes a decoder comprising a receiving means for receiving a bitstream comprising a plurality of coded sequences of PCC frames, wherein the plurality of coded sequences of PCC frames represent a plurality of PCC attributes including geometry, texture, and one or more of reflectance, transparency, and normal, and wherein each coded PCC frame is represented by one or more PCC network abstraction layer (NAL) units. The decoder further comprises a parsing means for parsing the bitstream to obtain, for each PCC attribute, an indication of one of a plurality of video coder-decoders (codecs) used to code the corresponding PCC attribute. The decoder further comprises a decoding means for decoding the bitstream based on the indicated video codecs for the PCC attributes.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the decoder is further configured to perform the method of any of the preceding aspects.

For the purpose of clarity, any one of the foregoing embodiments may be combined with any one or more of the other foregoing embodiments to create a new embodiment within the scope of the present disclosure.

These and other features will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings and claims.

For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.

FIG. 1 is a flowchart of an example method of coding a video signal.
FIG. 2 is a schematic diagram of an example coding and decoding (codec) system for video coding.
FIG. 3 is a schematic diagram illustrating an example video encoder.
FIG. 4 is a schematic diagram illustrating an example video decoder.
FIG. 5 is an example of point cloud media that can be coded according to PCC mechanisms.
FIG. 6 is an example of data segmentation and packing for a point cloud media frame.
FIG. 7 is a schematic diagram illustrating an example PCC video stream with an extended attribute set.
FIG. 8 is a schematic diagram illustrating an example mechanism of encoding PCC attributes with multiple codecs.
FIG. 9 is a schematic diagram illustrating an example of attribute layers.
FIG. 10 is a schematic diagram illustrating an example of attribute streams.
FIG. 11 is a flowchart of an example method of encoding a PCC video sequence with multiple codecs.
FIG. 12 is a flowchart of an example method of decoding a PCC video sequence with multiple codecs.
FIG. 13 is a schematic diagram of an example video coding device.
FIG. 14 is a schematic diagram of an example system for coding a PCC video sequence with multiple codecs.
FIG. 15 is a flowchart of another example method of encoding a PCC video sequence with multiple codecs.
FIG. 16 is a flowchart of another example method of decoding a PCC video sequence with multiple codecs.

It should be understood at the outset that, although an illustrative implementation of one or more embodiments is provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or in existence. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the exemplary designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.

Many video compression techniques can be employed to reduce the size of video files with minimal loss of data. For example, video compression techniques can include performing spatial (e.g., intra-picture) prediction and/or temporal (e.g., inter-picture) prediction to reduce or remove data redundancy in video sequences. For block-based video coding, a video slice (e.g., a video picture or a portion of a video picture) may be partitioned into video blocks, which may also be referred to as treeblocks, coding tree blocks (CTBs), coding tree units (CTUs), coding units (CUs), and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are coded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may be coded by employing spatial prediction with respect to reference samples in neighboring blocks in the same picture, or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames. Spatial or temporal prediction results in a predictive block representing an image block. Residual data represents pixel differences between the original image block and the predictive block. Accordingly, an inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, together with residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain. This results in residual transform coefficients, which may be quantized. The quantized transform coefficients may initially be arranged in a two-dimensional array and then scanned to produce a one-dimensional vector of transform coefficients. Entropy coding may be applied to achieve even more compression. Such video compression techniques are discussed in greater detail below.
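The residual, quantization, and scanning steps described above can be sketched on a small block. This is a simplified illustration under stated assumptions: the transform step is omitted, so the residual is quantized directly; the quantizer is a plain division that truncates toward zero; the scan is a common zigzag order, which the text does not prescribe; and the function names and sample values are hypothetical.

```python
def residual_block(original, prediction):
    """Pixel-wise difference between the original block and the predictive block."""
    return [
        [o - p for o, p in zip(orow, prow)]
        for orow, prow in zip(original, prediction)
    ]


def quantize(block, step):
    """Uniform quantization with truncation toward zero (illustrative choice)."""
    return [[int(v / step) for v in row] for row in block]


def zigzag_scan(block):
    """Scan a square 2D block into a 1D list along alternating anti-diagonals."""
    n = len(block)
    out = []
    for s in range(2 * n - 1):
        # Cells on anti-diagonal s satisfy row + col == s.
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            # Even diagonals are walked from bottom-left to top-right.
            rows = reversed(rows)
        out.extend(block[r][s - r] for r in rows)
    return out


# Made-up sample values for a 3x3 block and a flat prediction.
original = [[52, 55, 61], [63, 59, 55], [62, 59, 68]]
prediction = [[56, 56, 56], [56, 56, 56], [56, 56, 56]]

res = residual_block(original, prediction)
levels = quantize(res, step=4)
vector = zigzag_scan(levels)
```

The one-dimensional `vector` is the kind of input an entropy coder would then compress further.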

To ensure that an encoded video can be accurately decoded, video is encoded and decoded according to corresponding video coding standards. Video coding standards include International Telecommunication Union (ITU) Standardization Sector (ITU-T) H.261, International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group (MPEG)-1 Part 2, ITU-T H.262 or ISO/IEC MPEG-2 Part 2, ITU-T H.263, ISO/IEC MPEG-4 Part 2, Advanced Video Coding (AVC), also known as ITU-T H.264 or ISO/IEC MPEG-4 Part 10, and High Efficiency Video Coding (HEVC), also known as ITU-T H.265 or MPEG-H Part 2. AVC includes extensions such as Scalable Video Coding (SVC), Multiview Video Coding (MVC), Multiview Video Coding plus Depth (MVC+D), and three-dimensional (3D) AVC (3D-AVC). HEVC includes extensions such as Scalable HEVC (SHVC), Multiview HEVC (MV-HEVC), and 3D HEVC (3D-HEVC). The Joint Video Experts Team (JVET) of ITU-T and ISO/IEC has begun developing a video coding standard referred to as Versatile Video Coding (VVC). VVC is included in a Working Draft (WD), which includes JVET-K1001-v4 and JVET-K1002-v1.

PCC is a mechanism for encoding video of 3D objects. A point cloud is a set of data points in 3D space. Such data points include parameters that determine, for example, a position in space and a color. Point clouds may be used in various applications, such as real-time 3D immersive telepresence, virtual reality (VR) content viewing with interactive parallax, 3D free-viewpoint sports replay broadcasting, geographic information systems, cultural heritage, autonomous navigation based on large-scale 3D dynamic maps, and automotive applications. The ISO/IEC MPEG codec for PCC may operate on lossless and/or lossy compressed point cloud data with substantial coding efficiency and robustness to network environments. Use of this codec allows point clouds to be manipulated as a form of computer data, stored on various storage media, transmitted and received over networks, and distributed on broadcast channels. PCC coding environments are classified into PCC category 1, PCC category 2, and PCC category 3. The present disclosure is directed to PCC category 2, which relates to MPEG output documents N17534 and N17533. The design of PCC category 2 codecs seeks to leverage other video codecs to compress the geometry and texture information of a dynamic point cloud by compressing the point cloud data as a set of different video sequences. For example, two video sequences, one representing the geometry information of the point cloud data and another representing the texture information, can be generated and compressed using one or more video codecs. Additional metadata that supports interpretation of the video sequences (e.g., an occupancy map and auxiliary patch information) can also be generated and compressed separately.
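The following sketch illustrates, in a heavily simplified form, how a point cloud can give rise to separate geometry, texture, and occupancy images that a 2D video codec could then compress. It is an assumption-laden toy: all points are projected along the z axis onto a single plane, the nearest point wins when several points land on the same pixel, and the function name `project_points` and the sample points are hypothetical. The patch segmentation and packing actually used in PCC category 2 is considerably more involved.

```python
def project_points(points, width, height):
    """Project (x, y, z, color) points along z into three 2D grids.

    Returns (geometry, texture, occupancy):
      geometry  - depth (z) of the nearest point at each pixel
      texture   - color of that point
      occupancy - 1 where a point was projected, 0 elsewhere
    """
    geometry = [[0] * width for _ in range(height)]
    texture = [[(0, 0, 0)] * width for _ in range(height)]
    occupancy = [[0] * width for _ in range(height)]
    for x, y, z, color in points:
        # Keep the point closest to the projection plane (smallest z).
        if occupancy[y][x] == 0 or z < geometry[y][x]:
            geometry[y][x] = z
            texture[y][x] = color
            occupancy[y][x] = 1
    return geometry, texture, occupancy


# Made-up points: two of them project onto the same pixel (0, 0).
points = [
    (0, 0, 5, (255, 0, 0)),
    (1, 0, 2, (0, 255, 0)),
    (0, 0, 3, (0, 0, 255)),
    (2, 1, 7, (9, 9, 9)),
]
geometry, texture, occupancy = project_points(points, width=3, height=2)
```

The occupancy grid plays the role of the metadata mentioned above: it tells a decoder which pixels of the geometry and texture images correspond to actual points.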

A PCC system may support a geometry PCC attribute containing position data and a texture PCC attribute containing color data. However, some video applications may include other types of data, such as reflectance, transparency, and normal vectors. Some of these types of data may be coded more efficiently with certain codecs than with others. Nevertheless, a PCC system may require that the entire PCC stream, and hence all PCC attributes, be encoded by the same codec. Further, PCC attributes may be split into multiple layers. Such layers may then be combined and/or coded into one or more PCC attribute streams. For example, the layers of an attribute can be coded according to a temporally interleaved coding scheme, in which a first layer is coded in PCC access units (AUs) with even values of picture output order and a second layer is coded in PCC AUs with odd values of picture output order. Because there can be zero to four streams for each attribute, and various combinations of layers within such streams, properly identifying streams and layers can be challenging. However, a PCC system may be unable to determine how many layers are coded or combined in a PCC attribute stream in a given PCC bitstream. Further, a PCC system may have no mechanism for indicating the manner in which layers are combined and/or the correspondence between such layers and the PCC attribute streams. Finally, patches are employed to code PCC video data. For example, a three-dimensional (3D) PCC object can be represented as a set of two-dimensional (2D) patches. This allows PCC to operate in conjunction with video codecs designed to encode 2D video frames. In some cases, however, some points in the point cloud may not be captured by any patch. For example, isolated points in 3D space may be difficult to code as part of a patch. The only patch that makes sense in such a case is a one-pixel-by-one-pixel patch containing a single point, which significantly increases signaling overhead when there are many such points. Instead, an irregular point cloud patch can be used, which is a special patch that contains multiple isolated points. Attributes for an irregular point cloud patch are signaled using a different approach than for other patch types. However, a PCC system may be unable to indicate that a PCC attribute layer carries an irregular point cloud patch.
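The temporally interleaved scheme mentioned above, in which one layer occupies access units with even output order and the other occupies those with odd output order, can be sketched as follows. The function names and the frame labels are hypothetical, and the sketch assumes exactly the two-layer even/odd arrangement given as an example in the text.

```python
def interleave_layers(layer0_frames, layer1_frames):
    """Place layer 0 in even output-order positions and layer 1 in odd ones."""
    access_units = []
    for f0, f1 in zip(layer0_frames, layer1_frames):
        access_units.append(f0)  # even picture output order
        access_units.append(f1)  # odd picture output order
    return access_units


def deinterleave_layers(access_units, num_layers=2):
    """Recover per-layer frame lists from the parity of the output order."""
    layers = [[] for _ in range(num_layers)]
    for order, frame in enumerate(access_units):
        layers[order % num_layers].append(frame)
    return layers


# Made-up frame labels for two layers of one attribute.
layer0 = ["L0_f0", "L0_f1", "L0_f2"]
layer1 = ["L1_f0", "L1_f1", "L1_f2"]

stream = interleave_layers(layer0, layer1)
recovered = deinterleave_layers(stream)
```

The de-interleaving step only works if the decoder knows both the number of layers and the interleaving mode, which is the kind of information the text notes a PCC system may be unable to determine without explicit signaling.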

Disclosed herein are mechanisms for improving PCC by addressing the problems described above. In one embodiment, a PCC system may employ different codecs to code different PCC attributes. Specifically, a separate syntax element can be employed to identify the video codec for each attribute. In another embodiment, the PCC system explicitly signals the number of layers that are coded and/or combined to represent each PCC attribute stream. Additionally, the PCC system may employ a syntax element to signal the mode used to code and/or combine the layers of a PCC attribute in a PCC attribute stream. Further, the PCC system may employ a syntax element to specify the layer index of the layer associated with each data unit of the corresponding PCC attribute stream. In yet another embodiment, a flag can be employed for each PCC attribute layer to indicate whether that layer carries any irregular point cloud points. Such embodiments may be used alone or in combination. Moreover, such embodiments allow a PCC system to employ more complex coding mechanisms in a manner that is recognizable by, and therefore decodable by, a decoder. These and other examples are described in detail below.

FIG. 1 is a flowchart of an example operating method 100 of coding a video signal. Specifically, a video signal is encoded at an encoder. The encoding process compresses the video signal by employing various mechanisms to reduce the video file size. A smaller file size allows the compressed video file to be transmitted to a user while reducing the associated bandwidth overhead. A decoder then decodes the compressed video file to reconstruct the original video signal for display to an end user. The decoding process generally mirrors the encoding process, which allows the decoder to reconstruct the video signal consistently.

At step 101, the video signal is input into the encoder. For example, the video signal may be an uncompressed video file stored in memory. As another example, the video file may be captured by a video capture device, such as a video camera, and encoded to support live streaming of the video. The video file may include both an audio component and a video component. The video component contains a series of image frames that, when viewed in sequence, give the visual impression of motion. The frames contain pixels that are expressed in terms of light, referred to herein as luminance components (or luminance samples), and in terms of color, referred to as chroma components (or color samples). In some examples, the frames may also contain depth values to support three-dimensional viewing.

At step 103, the video is partitioned into blocks. Partitioning includes subdividing the pixels in each frame into square and/or rectangular blocks for compression. For example, in High Efficiency Video Coding (HEVC) (also known as H.265 and MPEG-H Part 2), a frame can first be divided into coding tree units (CTUs), which are blocks of a predefined size (e.g., 64 pixels by 64 pixels). A CTU contains both luminance and chroma samples. A coding tree can be employed to divide the CTU into blocks and then recursively subdivide those blocks until a configuration is reached that supports further encoding. For example, the luminance components of a frame may be subdivided until the individual blocks contain relatively uniform lighting values. Further, the chroma components of a frame may be subdivided until the individual blocks contain relatively uniform color values. Accordingly, the partitioning mechanism varies depending on the content of the video frames.
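As a rough illustration of content-dependent partitioning, the following Python sketch recursively splits a square block into four quadrants until each leaf holds relatively uniform values. The uniformity test (maximum minus minimum sample value against a `threshold`), the `min_size` limit, and the function name `partition` are assumptions made for this example; a real encoder chooses splits by other criteria, such as rate-distortion cost.

```python
def partition(block, x=0, y=0, threshold=8, min_size=2):
    """Recursively quad-split a square block until each leaf is roughly uniform.

    block is a list of rows of sample values whose side length is a power of two.
    Returns a list of (x, y, size) leaves in raster order of the quadrants.
    """
    size = len(block)
    values = [value for row in block for value in row]
    # Stop when the block is small enough or its samples are nearly uniform.
    if size <= min_size or max(values) - min(values) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            quadrant = [row[dx:dx + half] for row in block[dy:dy + half]]
            leaves += partition(quadrant, x + dx, y + dy, threshold, min_size)
    return leaves


flat = [[50] * 4 for _ in range(4)]
busy = [[100, 100, 0, 0], [100, 100, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(partition(flat))  # a uniform block stays as one leaf
print(partition(busy))  # a non-uniform block is split into four leaves
```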

At step 105, various compression mechanisms are employed to compress the image blocks partitioned at step 103. For example, inter prediction and/or intra prediction may be employed. Inter prediction is designed to take advantage of the fact that objects in a common scene tend to appear in successive frames. Accordingly, a block depicting an object in a reference frame need not be described repeatedly in adjacent frames. Specifically, an object such as a table may remain in a constant position over multiple frames. Hence, the table is described once, and adjacent frames can refer back to the reference frame. Pattern matching mechanisms may be employed to match objects over multiple frames. Further, moving objects may be represented across multiple frames, for example due to object movement or camera movement. As a particular example, a video may show an automobile that moves across the screen over multiple frames. Motion vectors can be employed to describe such movement. A motion vector is a two-dimensional vector that provides an offset from the coordinates of an object in a frame to the coordinates of the object in a reference frame. As such, inter prediction can encode an image block in a current frame as a set of motion vectors indicating an offset from a corresponding block in a reference frame.

Intra prediction encodes blocks within a common frame. Intra prediction takes advantage of the fact that luminance and chroma components tend to cluster within a frame. For example, a patch of green in a portion of a tree tends to be positioned adjacent to similar patches of green. Intra prediction employs multiple directional prediction modes (e.g., thirty-three in HEVC), a planar mode, and a direct current (DC) mode. A directional mode indicates that the current block is similar to or the same as the samples of a neighboring block in the corresponding direction. Planar mode indicates that a series of blocks along a row/column (e.g., a plane) can be interpolated based on the neighboring blocks at the ends of the row. Planar mode, in effect, indicates a smooth transition of light/color across a row/column by employing a relatively constant slope in changing values. DC mode is employed for boundary smoothing and indicates that a block is similar to or the same as an average value associated with the samples of all the neighboring blocks associated with the angular directions of the directional prediction modes. Accordingly, intra-prediction blocks can represent image blocks as various relational prediction mode values instead of the actual values. Further, inter-prediction blocks can represent image blocks as motion vector values instead of the actual values. In either case, the prediction block may not represent the image block exactly in some cases. Any differences are stored in residual blocks. Transforms may be applied to the residual blocks to further compress the file.
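The relationship between a prediction block and its residual can be sketched for the DC case. The Python example below is a simplified illustration, not HEVC's exact DC derivation: it assumes the neighboring samples are one reconstructed row above and one column to the left of the block, and the function names `dc_predict` and `residual` are hypothetical.

```python
def dc_predict(above, left, size):
    """Fill a size x size prediction block with the rounded mean of the neighbors."""
    neighbors = list(above) + list(left)
    count = len(neighbors)
    dc = (sum(neighbors) + count // 2) // count  # integer mean with rounding
    return [[dc] * size for _ in range(size)]


def residual(block, prediction):
    """Return the sample-wise difference between the actual block and its prediction."""
    return [
        [actual - predicted for actual, predicted in zip(block_row, pred_row)]
        for block_row, pred_row in zip(block, prediction)
    ]


prediction = dc_predict(above=[10, 10], left=[12, 12], size=2)
print(prediction)                                 # [[11, 11], [11, 11]]
print(residual([[11, 12], [10, 11]], prediction))  # [[0, 1], [-1, 0]]
```

Only the mode choice and the small residual values need to be coded, rather than the original sample values.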

At step 107, various filtering techniques may be applied. In HEVC, the filters are applied according to an in-loop filtering scheme. The block-based prediction discussed above may result in blocky images at the decoder. Further, a block-based prediction scheme may encode a block and then reconstruct the encoded block for later use as a reference block. The in-loop filtering scheme iteratively applies noise suppression filters, de-blocking filters, adaptive loop filters, and sample adaptive offset (SAO) filters to the blocks/frames. These filters mitigate such blocking artifacts so that the encoded file can be reconstructed accurately. Further, these filters mitigate artifacts in the reconstructed reference blocks, so that those artifacts are less likely to create additional artifacts in subsequent blocks that are encoded based on the reconstructed reference blocks.

Once the video signal has been partitioned, compressed, and filtered, the resulting data is encoded into a bitstream at step 109. The bitstream includes the data discussed above as well as any signaling data desired to support proper video signal reconstruction at the decoder. For example, such data may include partition data, prediction data, residual blocks, and various flags that provide coding instructions to the decoder. The bitstream may be stored in memory for transmission toward a decoder upon request. The bitstream may also be broadcast and/or multicast toward a plurality of decoders. The creation of the bitstream is an iterative process. Accordingly, steps 101, 103, 105, 107, and 109 may occur continuously and/or simultaneously over many frames and blocks. The order shown in FIG. 1 is presented for clarity and ease of discussion and is not intended to limit the video coding process to a particular order.

The decoder receives the bitstream and begins the decoding process at step 111. Specifically, the decoder employs an entropy decoding scheme to convert the bitstream into corresponding syntax and video data. The decoder employs the syntax data from the bitstream to determine the partitions for the frames at step 111. The partitioning should match the results of block partitioning at step 103. Entropy encoding/decoding as employed in step 111 is now described. The encoder makes many choices during the compression process, such as selecting a block partitioning scheme from several possible choices based on the spatial positioning of values in the input image. Signaling the exact choices may require a large number of bins. As used herein, a bin is a binary value that is treated as a variable (e.g., a bit value that may vary depending on context). Entropy coding allows the encoder to discard any options that are clearly not viable for a particular case, leaving a set of allowable options. Each allowable option is then assigned a codeword. The length of the codewords is based on the number of allowable options (e.g., one bin for two options, two bins for three to four options, and so on). The encoder then encodes the codeword for the selected option. This scheme reduces the size of the codewords, because a codeword only needs to be large enough to uniquely indicate a selection from a small subset of allowable options, rather than a selection from a potentially large set of all possible options. The decoder then decodes the selection by determining the set of allowable options in the same manner as the encoder. By determining the set of allowable options, the decoder can read the codeword and determine the selection made by the encoder.
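The codeword-length rule above (one bin for two options, two bins for three to four options) can be checked with a short Python sketch. This is a fixed-length illustration only; it does not model the context-adaptive arithmetic coding that an actual codec applies, and the names `codeword_bins`, `encode_choice`, and `decode_choice` are hypothetical.

```python
def codeword_bins(num_options: int) -> int:
    """Return the number of bins needed to distinguish num_options choices."""
    if num_options < 1:
        raise ValueError("at least one allowable option is required")
    # Equivalent to ceil(log2(num_options)) without floating-point rounding.
    return (num_options - 1).bit_length()


def encode_choice(allowed, choice):
    """Encode a choice as its index among the allowable options."""
    bins = codeword_bins(len(allowed))
    return format(allowed.index(choice), "b").zfill(bins) if bins else ""


def decode_choice(allowed, codeword):
    """Recover the choice; the decoder must derive the same allowed list."""
    return allowed[int(codeword, 2)] if codeword else allowed[0]


allowed = ["QT", "BT", "TT"]            # options left after discarding non-viable ones
codeword = encode_choice(allowed, "TT")
print(codeword)                          # 10
print(decode_choice(allowed, codeword))  # TT
```

The round trip works only because both sides derive the same list of allowable options, which is the point made in the paragraph above.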

At step 113, the decoder performs block decoding. Specifically, the decoder employs inverse transforms to generate residual blocks. The decoder then employs the residual blocks and the corresponding prediction blocks to reconstruct the image blocks according to the partitioning. The prediction blocks may include both intra-prediction blocks and inter-prediction blocks, as generated at the encoder at step 105. The reconstructed image blocks are then positioned into the frames of the reconstructed video signal according to the partitioning data determined at step 111. The syntax for step 113 may also be signaled in the bitstream via entropy coding, as discussed above.

At step 115, filtering is performed on the frames of the reconstructed video signal in a manner similar to step 107 at the encoder. For example, noise suppression filters, de-blocking filters, adaptive loop filters, and SAO filters may be applied to the frames to remove blocking artifacts. Once the frames are filtered, the video signal is output to a display at step 117 for viewing by an end user.

FIG. 2 is a schematic diagram of an example coding and decoding (codec) system 200 for video coding. Specifically, codec system 200 provides functionality to support the implementation of operating method 100. Codec system 200 is generalized to depict components employed in both an encoder and a decoder. Codec system 200 receives and partitions a video signal, as discussed with respect to steps 101 and 103 of operating method 100, which results in a partitioned video signal 201. When acting as an encoder, codec system 200 then compresses the partitioned video signal 201 into a coded bitstream, as discussed with respect to steps 105, 107, and 109 of method 100. When acting as a decoder, codec system 200 generates an output video signal from the bitstream, as discussed with respect to steps 111, 113, 115, and 117 of operating method 100. Codec system 200 includes a general coder control component 211, a transform scaling and quantization component 213, an intra-picture estimation component 215, an intra-picture prediction component 217, a motion compensation component 219, a motion estimation component 221, a scaling and inverse transform component 229, a filter control analysis component 227, an in-loop filters component 225, a decoded picture buffer component 223, and a header formatting and context adaptive binary arithmetic coding (CABAC) component 231. Such components are coupled as shown. In FIG. 2, black lines indicate movement of data to be encoded/decoded, while dashed lines indicate movement of control data that controls the operation of other components. The components of codec system 200 may all be present in the encoder. The decoder may include a subset of the components of codec system 200. For example, the decoder may include the intra-picture prediction component 217, the motion compensation component 219, the scaling and inverse transform component 229, the in-loop filters component 225, and the decoded picture buffer component 223. These components are now described.

The partitioned video signal 201 is a captured video sequence that has been partitioned into blocks of pixels by a coding tree. A coding tree employs various split modes to subdivide a block of pixels into smaller blocks of pixels. These blocks can then be further subdivided into smaller blocks. The blocks may be referred to as nodes on the coding tree. Larger parent nodes are split into smaller child nodes. The number of times a node is subdivided is referred to as the depth of the node/coding tree. The divided blocks can be included in coding units (CUs) in some cases. For example, a CU can be a sub-portion of a CTU that contains a luminance block, a red-difference chroma (Cr) block, and a blue-difference chroma (Cb) block, along with corresponding syntax instructions for the CU. The split modes may include a binary tree (BT), a triple tree (TT), and a quad tree (QT), which are employed to partition a node into two, three, or four child nodes, respectively, of varying shapes depending on the split mode employed. The partitioned video signal 201 is forwarded to the general coder control component 211, the transform scaling and quantization component 213, the intra-picture estimation component 215, the filter control analysis component 227, and the motion estimation component 221 for compression.
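The three split modes can be sketched as a function that divides a rectangular node into child rectangles. The geometry below is an assumption made for illustration: the binary split halves the node, the triple split uses a 1:2:1 ratio, and the quad split produces four quadrants. The text only states that the modes yield two, three, or four child nodes of varying shapes, and the function name `split_node` is hypothetical.

```python
def split_node(x, y, width, height, mode, vertical=True):
    """Split a node given as (x, y, width, height) into child nodes.

    mode is "BT" (two children), "TT" (three children), or "QT" (four children).
    vertical selects the split direction for BT and TT and is ignored for QT.
    """
    if mode == "QT":
        half_w, half_h = width // 2, height // 2
        return [
            (x, y, half_w, half_h),
            (x + half_w, y, width - half_w, half_h),
            (x, y + half_h, half_w, height - half_h),
            (x + half_w, y + half_h, width - half_w, height - half_h),
        ]
    if mode == "BT":
        if vertical:
            half_w = width // 2
            return [(x, y, half_w, height), (x + half_w, y, width - half_w, height)]
        half_h = height // 2
        return [(x, y, width, half_h), (x, y + half_h, width, height - half_h)]
    if mode == "TT":
        if vertical:
            quarter = width // 4  # assumed 1:2:1 ratio
            return [
                (x, y, quarter, height),
                (x + quarter, y, width - 2 * quarter, height),
                (x + width - quarter, y, quarter, height),
            ]
        quarter = height // 4
        return [
            (x, y, width, quarter),
            (x, y + quarter, width, height - 2 * quarter),
            (x, y + height - quarter, width, quarter),
        ]
    raise ValueError(f"unknown split mode: {mode}")


print(split_node(0, 0, 64, 64, "TT"))
```

Applying `split_node` repeatedly to the resulting children builds the coding tree, and the number of applications along a path gives the node depth.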

The general coder control component 211 is configured to make decisions related to coding the images of the video sequence into the bitstream according to application constraints. For example, the general coder control component 211 manages the optimization of bitrate/bitstream size versus reconstruction quality. Such decisions may be made based on storage space/bandwidth availability and image resolution requests. The general coder control component 211 also manages buffer utilization in light of transmission speed to mitigate buffer underrun and overrun issues. To manage these issues, the general coder control component 211 manages partitioning, prediction, and filtering by the other components. For example, the general coder control component 211 may dynamically increase compression complexity to increase resolution and bandwidth usage, or decrease compression complexity to decrease resolution and bandwidth usage. Hence, the general coder control component 211 controls the other components of codec system 200 to balance video signal reconstruction quality against bitrate concerns. The general coder control component 211 creates control data, which controls the operation of the other components. The control data is also forwarded to the header formatting and CABAC component 231 to be encoded in the bitstream, where it signals parameters for decoding at the decoder.

The partitioned video signal 201 is also sent to the motion estimation component 221 and the motion compensation component 219 for inter prediction. A frame or slice of the partitioned video signal 201 may be divided into multiple video blocks. The motion estimation component 221 and the motion compensation component 219 perform inter-predictive coding of the received video block relative to one or more blocks in one or more reference frames to provide temporal prediction. Codec system 200 may perform multiple coding passes, for example to select an appropriate coding mode for each block of video data.

The motion estimation component 221 and the motion compensation component 219 may be highly integrated, but are illustrated separately for conceptual purposes. Motion estimation, performed by the motion estimation component 221, is the process of generating motion vectors, which estimate motion for video blocks. A motion vector, for example, may indicate the displacement of a coded object relative to a prediction block. A prediction block is a block that is found to closely match the block to be coded in terms of pixel difference. A prediction block may also be referred to as a reference block. Such pixel difference may be determined by sum of absolute differences (SAD), sum of squared differences (SSD), or other difference metrics. HEVC employs several coded objects, including a CTU, coding tree blocks (CTBs), and CUs. For example, a CTU can be divided into CTBs, which can then be divided into coding blocks (CBs) for inclusion in CUs. A CU can be encoded as a prediction unit (PU) containing prediction data and/or a transform unit (TU) containing transformed residual data for the CU. The motion estimation component 221 generates motion vectors, PUs, and TUs by using rate-distortion analysis as part of a rate-distortion optimization process. For example, the motion estimation component 221 may determine multiple reference blocks, multiple motion vectors, and so on for a current block/frame, and may select the reference blocks, motion vectors, and so on that have the best rate-distortion characteristics. The best rate-distortion characteristics balance the quality of the video reconstruction (e.g., the amount of data lost to compression) against coding efficiency (e.g., the size of the final encoding).
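A minimal full-search block matcher shows how SAD can be used to pick a prediction block and its motion vector. This Python sketch is an illustration under simplifying assumptions (integer-pixel positions only, a small square search window, no rate term), and the names `sad`, `get_block`, and `motion_search` are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(frame, x, y, size):
    """Extract the size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def motion_search(current, reference, x, y, size, search_range=2):
    """Return ((dx, dy), cost) of the best match for the current block.

    The motion vector is the offset from the block position in the current
    frame to the matching block position in the reference frame.
    """
    target = get_block(current, x, y, size)
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ref_x, ref_y = x + dx, y + dy
            if ref_x < 0 or ref_y < 0 or ref_x + size > width or ref_y + size > height:
                continue  # candidate falls outside the reference frame
            cost = sad(target, get_block(reference, ref_x, ref_y, size))
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dx, dy), cost
    return best_vector, best_cost


reference = [[0, 0, 0, 0], [0, 9, 9, 0], [0, 9, 9, 0], [0, 0, 0, 0]]
current = [[0, 0, 0, 0], [0, 0, 9, 9], [0, 0, 9, 9], [0, 0, 0, 0]]
# The bright 2x2 object moved one pixel to the right between the two frames.
print(motion_search(current, reference, x=2, y=1, size=2))  # ((-1, 0), 0)
```

Replacing `sad` with a squared-difference sum gives the SSD variant mentioned in the text.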

In some examples, codec system 200 may calculate values for sub-integer pixel positions of reference pictures stored in the decoded picture buffer component 223. For example, video codec system 200 may interpolate values for one-quarter pixel positions, one-eighth pixel positions, or other fractional pixel positions of the reference picture. Therefore, the motion estimation component 221 may perform a motion search relative to the full pixel positions and the fractional pixel positions, and output a motion vector with fractional pixel precision. The motion estimation component 221 calculates a motion vector for a PU of a video block in an inter-coded slice by comparing the position of the PU to the position of a prediction block of a reference picture. The motion estimation component 221 outputs the calculated motion vector as motion data to the header formatting and CABAC component 231 for encoding, and outputs the motion to the motion compensation component 219.

Motion compensation, performed by the motion compensation component 219, may involve fetching or generating the prediction block based on the motion vector determined by the motion estimation component 221. Again, the motion estimation component 221 and the motion compensation component 219 may be functionally integrated in some examples. Upon receiving the motion vector for the PU of the current video block, the motion compensation component 219 may locate the prediction block to which the motion vector points. A residual video block is then formed by subtracting the pixel values of the prediction block from the pixel values of the current video block being coded, which yields pixel difference values. In general, the motion estimation component 221 performs motion estimation relative to the luminance components, and the motion compensation component 219 uses the motion vectors calculated from the luminance components for both the chroma components and the luminance components. The prediction block and the residual block are forwarded to the transform scaling and quantization component 213.

The partitioned video signal 201 is also sent to the intra-picture estimation component 215 and the intra-picture prediction component 217. As with the motion estimation component 221 and the motion compensation component 219, the intra-picture estimation component 215 and the intra-picture prediction component 217 may be highly integrated, but are illustrated separately for conceptual purposes. The intra-picture estimation component 215 and the intra-picture prediction component 217 intra-predict a current block relative to blocks in the current frame, as an alternative to the inter prediction performed between frames by the motion estimation component 221 and the motion compensation component 219, as described above. In particular, the intra-picture estimation component 215 determines the intra-prediction mode to use to encode the current block. In some examples, the intra-picture estimation component 215 selects an appropriate intra-prediction mode for encoding the current block from multiple tested intra-prediction modes. The selected intra-prediction mode is then forwarded to the header formatting and CABAC component 231 for encoding.

For example, the intra-picture estimation component 215 calculates rate-distortion values using a rate-distortion analysis for the various tested intra-prediction modes, and selects the intra-prediction mode having the best rate-distortion characteristics among the tested modes. Rate-distortion analysis generally determines the amount of distortion (or error) between an encoded block and the original unencoded block that was encoded to produce the encoded block, as well as the bitrate (e.g., the number of bits) used to produce the encoded block. The intra-picture estimation component 215 calculates ratios from the distortions and rates for the various encoded blocks to determine which intra-prediction mode exhibits the best rate-distortion value for the block. In addition, the intra-picture estimation component 215 may be configured to code the depth blocks of a depth map using a depth modeling mode (DMM) based on rate-distortion optimization (RDO).
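One common way to combine distortion and rate into a single figure of merit is a Lagrangian cost, J = D + λR. The text above speaks of ratios computed from distortion and rate and does not prescribe this formula, so the Python sketch below substitutes the Lagrangian form for illustration. The mode names, the distortion and bit counts, and the function names `rd_cost` and `select_mode` are all made up for the example.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost: distortion plus lambda times rate."""
    return distortion + lam * bits


def select_mode(candidates, lam=1.0):
    """Pick the mode with the lowest cost.

    candidates maps a mode name to a (distortion, bits) pair measured by
    trial-encoding the block in that mode.
    """
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))


# Hypothetical measurements for one block: (distortion, bits).
tested = {"DC": (100, 10), "planar": (80, 20), "angular_26": (40, 70)}
print(select_mode(tested, lam=1.0))  # planar
print(select_mode(tested, lam=0.1))  # angular_26: quality dominates when bits are cheap
print(select_mode(tested, lam=5.0))  # DC: rate dominates when bits are expensive
```

The multiplier λ plays the role of the balance between reconstruction quality and bitrate that the general coder control component manages.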

The intra-picture prediction component 217 may generate a residual block from the prediction block based on the selected intra-prediction mode determined by the intra-picture estimation component 215 when implemented on an encoder, or may read the residual block from the bitstream when implemented on a decoder. The residual block includes the difference in values between the prediction block and the original block, represented as a matrix. The residual block is then forwarded to the transform scaling and quantization component 213. The intra-picture estimation component 215 and the intra-picture prediction component 217 may operate on both luminance and chroma components.

Transform scaling and quantization component 213 is configured to further compress the residual block. Transform scaling and quantization component 213 applies a transform, such as a discrete cosine transform (DCT), a discrete sine transform (DST), or a conceptually similar transform, to the residual block, producing a video block comprising residual transform coefficient values. Wavelet transforms, integer transforms, sub-band transforms, or other types of transforms may also be used. The transform may convert the residual information from a pixel value domain to a transform domain, such as a frequency domain. Transform scaling and quantization component 213 is also configured to scale the transformed residual information, for example based on frequency. Such scaling involves applying a scale factor to the residual information so that different frequency information is quantized at different granularities, which may affect the final visual quality of the reconstructed video. Transform scaling and quantization component 213 is also configured to quantize the transform coefficients to further reduce the bitrate. The quantization process may reduce the bit depth associated with some or all of the coefficients. The degree of quantization may be modified by adjusting a quantization parameter. In some examples, transform scaling and quantization component 213 may then perform a scan of the matrix including the quantized transform coefficients. The quantized transform coefficients are forwarded to header formatting and CABAC component 231 to be encoded in the bitstream.
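
As a non-limiting illustration, quantization controlled by a quantization parameter can be sketched as uniform scalar quantization. The sketch assumes a step size that doubles every six parameter values, as in H.264/HEVC-style codecs; this disclosure does not specify the mapping, and the function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def quant_step(qp: int) -> float:
    """Assumed mapping: the step size doubles every 6 QP values."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeffs: Matrix, qp: int) -> List[List[int]]:
    """Map each transform coefficient to the nearest integer level.

    Python's round() resolves exact ties to the even neighbor.
    """
    step = quant_step(qp)
    return [[int(round(c / step)) for c in row] for row in coeffs]


def dequantize(levels: List[List[int]], qp: int) -> Matrix:
    """Inverse operation: scale the levels back by the step size."""
    step = quant_step(qp)
    return [[level * step for level in row] for row in levels]
```

Raising the quantization parameter enlarges the step, so fewer distinct levels remain and the difference between the original and dequantized coefficients grows.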

Scaling and inverse transform component 229 applies the inverse operations of transform scaling and quantization component 213 to support motion estimation. Scaling and inverse transform component 229 applies inverse scaling, transformation, and/or quantization to reconstruct the residual block in the pixel domain, e.g., for later use as a reference block that may become a prediction block for another current block. Motion estimation component 221 and/or motion compensation component 219 may calculate a reference block by adding the residual block back to the corresponding prediction block for use in motion estimation of a later block/frame. Filters are applied to the reconstructed reference blocks to mitigate artifacts created during scaling, quantization, and transformation. Such artifacts could otherwise cause inaccurate prediction (and create additional artifacts) when subsequent blocks are predicted.
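
As a non-limiting illustration, the relationship between the original block, the prediction block, the residual block, and the reconstructed reference block can be sketched as follows. Clipping the result to the valid sample range is an assumption of the sketch, and the function names are hypothetical.

```python
from typing import List

Block = List[List[int]]


def residual_block(original: Block, prediction: Block) -> Block:
    """Encoder side: residual = original - prediction, sample by sample."""
    return [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, prediction)
    ]


def reconstruct_block(prediction: Block, residual: Block, bit_depth: int = 8) -> Block:
    """Add the residual back to the prediction and clip to the sample range."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```

When the residual is passed through unchanged, the reconstruction equals the original block. When the residual has been quantized, the reconstruction only approximates the original, which is why filters are applied afterwards.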

Filter control analysis component 227 and in-loop filter component 225 apply filters to the residual blocks and/or to reconstructed image blocks. For example, the transformed residual block from scaling and inverse transform component 229 may be combined with a corresponding prediction block from intra-picture prediction component 217 and/or motion compensation component 219 to reconstruct the original image block. The filters may then be applied to the reconstructed image block. In some examples, the filters may instead be applied to the residual blocks. As with the other components in FIG. 2, filter control analysis component 227 and in-loop filter component 225 are highly integrated and may be implemented together, but are depicted separately for conceptual purposes. Filters applied to the reconstructed reference blocks are applied to particular spatial regions and include multiple parameters to adjust how such filters are applied. Filter control analysis component 227 analyzes the reconstructed reference blocks to determine where such filters should be applied and sets the corresponding parameters. Such data is forwarded to header formatting and CABAC component 231 as filter control data for encoding. In-loop filter component 225 applies such filters based on the filter control data. The filters may include a deblocking filter, a noise suppression filter, a sample adaptive offset (SAO) filter, and an adaptive loop filter. Such filters may be applied in the spatial/pixel domain (e.g., on a reconstructed pixel block) or in the frequency domain, depending on the example.

When operating as an encoder, the filtered reconstructed image block, residual block, and/or prediction block are stored in decoded picture buffer component 223 for later use in motion estimation as discussed above. When operating as a decoder, decoded picture buffer component 223 stores the reconstructed and filtered blocks and forwards them toward a display as part of an output video signal. Decoded picture buffer component 223 may be any memory device capable of storing prediction blocks, residual blocks, and/or reconstructed image blocks.

Header formatting and CABAC component 231 receives data from the various components of codec system 200 and encodes such data into a coded bitstream for transmission toward a decoder. Specifically, header formatting and CABAC component 231 generates various headers to encode control data, such as general control data and filter control data. Further, prediction data, including intra-prediction and motion data, as well as residual data in the form of quantized transform coefficient data, are all encoded in the bitstream. The final bitstream includes all of the information desired by the decoder to reconstruct the original partitioned video signal 201. Such information may also include intra-prediction mode index tables (also referred to as codeword mapping tables), definitions of encoding contexts for various blocks, indications of most probable intra-prediction modes, indications of partition information, and the like. Such data may be encoded by employing entropy coding. For example, the information may be encoded by employing context adaptive variable length coding (CAVLC), CABAC, syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding technique. Following the entropy coding, the coded bitstream may be transmitted to another device (e.g., a video decoder) or archived for later transmission or retrieval.

FIG. 3 is a block diagram illustrating an exemplary video encoder 300. Video encoder 300 may be employed to implement the encoding functions of codec system 200 and/or to implement steps 101, 103, 105, 107, and/or 109 of operating method 100. Encoder 300 partitions an input video signal, resulting in partitioned video signal 301, which is substantially similar to partitioned video signal 201. Partitioned video signal 301 is then compressed and encoded into a bitstream by the components of encoder 300.

Specifically, partitioned video signal 301 is forwarded to intra-picture prediction component 317 for intra prediction. Intra-picture prediction component 317 may be substantially similar to intra-picture estimation component 215 and intra-picture prediction component 217. Partitioned video signal 301 is also forwarded to motion compensation component 321 for inter prediction based on reference blocks in decoded picture buffer component 323. Motion compensation component 321 may be substantially similar to motion estimation component 221 and motion compensation component 219. The prediction blocks and residual blocks from intra-picture prediction component 317 and motion compensation component 321 are forwarded to transform and quantization component 313 for transformation and quantization of the residual blocks. Transform and quantization component 313 may be substantially similar to transform scaling and quantization component 213. The transformed and quantized residual blocks and the corresponding prediction blocks (along with associated control data) are forwarded to entropy coding component 331 for coding into the bitstream. Entropy coding component 331 may be substantially similar to header formatting and CABAC component 231.

The transformed and quantized residual blocks and/or the corresponding prediction blocks are also forwarded from transform and quantization component 313 to inverse transform and quantization component 329 for reconstruction into reference blocks for use by motion compensation component 321. Inverse transform and quantization component 329 may be substantially similar to scaling and inverse transform component 229. In-loop filters in in-loop filter component 325 are also applied to the residual blocks and/or reconstructed reference blocks, depending on the example. In-loop filter component 325 may be substantially similar to filter control analysis component 227 and in-loop filter component 225. In-loop filter component 325 may include multiple filters, as discussed with respect to in-loop filter component 225. The filtered blocks are then stored in decoded picture buffer component 323 for use as reference blocks by motion compensation component 321. Decoded picture buffer component 323 may be substantially similar to decoded picture buffer component 223.

FIG. 4 is a block diagram illustrating an exemplary video decoder 400. Video decoder 400 may be employed to implement the decoding functions of codec system 200 and/or to implement steps 111, 113, 115, and/or 117 of operating method 100. Decoder 400 receives a bitstream, for example from encoder 300, and generates a reconstructed output video signal based on the bitstream for display to an end user.

The bitstream is received by entropy decoding component 433. Entropy decoding component 433 is configured to implement an entropy decoding scheme, such as CAVLC, CABAC, SBAC, PIPE coding, or another entropy coding technique. For example, entropy decoding component 433 may employ header information to provide a context for interpreting additional data encoded as codewords in the bitstream. The decoded information includes any desired information for decoding the video signal, such as general control data, filter control data, partition information, motion data, prediction data, and quantized transform coefficients from residual blocks. The quantized transform coefficients are forwarded to inverse transform and quantization component 429 for reconstruction into residual blocks. Inverse transform and quantization component 429 may be similar to inverse transform and quantization component 329.

The reconstructed residual blocks and/or prediction blocks are forwarded to intra-picture prediction component 417 for reconstruction into image blocks based on intra-prediction operations. Intra-picture prediction component 417 may be similar to intra-picture estimation component 215 and intra-picture prediction component 217. Specifically, intra-picture prediction component 417 employs prediction modes to locate a reference block in the frame and applies a residual block to the result to reconstruct intra-predicted image blocks. The reconstructed intra-predicted image blocks and/or the residual blocks and corresponding inter-prediction data are forwarded to decoded picture buffer component 423 via in-loop filter component 425, which may be substantially similar to decoded picture buffer component 223 and in-loop filter component 225, respectively. In-loop filter component 425 filters the reconstructed image blocks, residual blocks, and/or prediction blocks, and such information is stored in decoded picture buffer component 423. Reconstructed image blocks from decoded picture buffer component 423 are forwarded to motion compensation component 421 for inter prediction. Motion compensation component 421 may be substantially similar to motion estimation component 221 and/or motion compensation component 219. Specifically, motion compensation component 421 employs motion vectors from a reference block to generate a prediction block and applies a residual block to the result to reconstruct an image block. The resulting reconstructed blocks may also be forwarded via in-loop filter component 425 to decoded picture buffer component 423. Decoded picture buffer component 423 continues to store additional reconstructed image blocks, which can be reconstructed into frames via the partition information. Such frames may also be placed in a sequence. The sequence is output toward a display as a reconstructed output video signal.

FIG. 5 is an example of point cloud media 500 that can be coded according to a PCC mechanism. A point cloud is a set of data points in space. A point cloud may be produced by a 3D scanner, which measures a large number of points on the external surfaces of surrounding objects. A point cloud may be described in terms of geometry attributes, texture attributes, reflectance attributes, transparency attributes, normal attributes, and the like. Each attribute can be coded by a codec, such as video codec system 200, encoder 300, and/or decoder 400, as part of method 100. Specifically, each attribute of a PCC frame can be coded separately at the encoder, then decoded and recombined at the decoder to recreate the PCC frame.

Point cloud media 500 includes three bounding boxes 502, 504, and 506. Each of bounding boxes 502, 504, and 506 represents a portion or segment of a 3D image from a current frame. Bounding boxes 502, 504, and 506 contain a 3D image of a person, although other objects may be included in the bounding boxes in practical applications. Each of bounding boxes 502, 504, and 506 includes an x-axis, a y-axis, and a z-axis that indicate the number of pixels occupied by the 3D image in the x, y, and z directions, respectively. For example, the x-axis and the y-axis depict about 400 pixels (e.g., from about 0 to 400 pixels), while the z-axis depicts about 1000 pixels (e.g., from about 0 to 1000 pixels).

Each of bounding boxes 502, 504, and 506 contains one or more patches 508, which are represented by the cubes or boxes in FIG. 5. Each patch 508 contains a portion of the overall object within one of bounding boxes 502, 504, or 506 and may be described or represented by patch information. The patch information may include, for example, two-dimensional (2D) and/or three-dimensional (3D) coordinates describing the position of patch 508 within bounding box 502, 504, or 506. The patch information may also include other parameters. For example, the patch information may include parameters, such as normalAxis, that are inherited by the current patch information from reference patch information. That is, one or more parameters from the patch information of a reference frame may be inherited by the patch information of the current frame. In addition, one or more metadata portions (e.g., patch rotation, scale parameters, material identifiers, etc.) from the reference frame may be inherited by the current frame. Patches 508 may be interchangeably referred to herein as 3D patches or patch data units. A list of the patches 508 in each bounding box 502, 504, or 506 may be generated and stored in a patch buffer in descending order from the largest patch to the smallest patch. The patches can then be encoded by an encoder and/or decoded by a decoder.
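
As a non-limiting illustration, a patch buffer ordered from the largest patch to the smallest, together with inheritance of a parameter from reference patch information, can be sketched as follows. The `Patch` fields, the use of width times height as the size measure, and the function names are assumptions made for the sketch; only the parameter name normalAxis appears in this disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Patch:
    patch_id: int
    width: int        # 2D extent of the patch, in pixels (hypothetical field)
    height: int
    normal_axis: int  # corresponds to the normalAxis parameter


def build_patch_buffer(patches: List[Patch]) -> List[Patch]:
    """Return the patches in descending order of size (largest first)."""
    return sorted(patches, key=lambda p: p.width * p.height, reverse=True)


def inherit_patch_info(reference: Patch, patch_id: int, width: int, height: int) -> Patch:
    """Create current patch information, inheriting normalAxis from the reference."""
    return Patch(patch_id, width, height, reference.normal_axis)
```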

Patches 508 can describe various attributes of point cloud media 500. Specifically, the position of each pixel on the x-axis, the y-axis, and the z-axis is the geometry of that pixel. Patches 508 containing the positions of all pixels in the current frame can be coded to capture the geometry attribute for the current frame of point cloud media 500. Further, each pixel may include a color value in the red, green, and blue (RGB) and/or luminance and chrominance (YUV) spectrum. Patches 508 containing the colors of all pixels in the current frame can be coded to capture the texture attribute for the current frame of point cloud media 500.

In addition, each pixel may (or may not) include some reflectance. Reflectance is the amount of light (e.g., colored light) that projects from a pixel onto adjacent pixels. Shiny objects have high reflectance and hence spread the light/color of their corresponding pixels onto other nearby pixels. Matte objects, on the other hand, have little or no reflectance and may not affect the color/light levels of adjacent pixels. Patches 508 containing the reflectance of all pixels in the current frame can be coded to capture the reflectance attribute for the current frame of point cloud media 500. Some pixels may also be partially or completely transparent (e.g., glass, clear plastic, etc.). Transparency is the amount of light/color of adjacent pixels that can pass through the current pixel. Patches 508 containing the levels of transparency of all pixels in the current frame can be coded to capture the transparency attribute for the current frame of point cloud media 500. Further, the points of the point cloud media may create surfaces. A surface can be associated with a normal vector, which is a vector that is perpendicular to the surface. Normal vectors may be useful when describing object motion and/or interaction. Accordingly, in some cases, a user may wish to encode normal vectors for the surfaces to support additional functionality. Patches 508 containing the normal vectors for the surfaces in the current frame can be coded to capture the normal attribute for the current frame of point cloud media 500.

The geometry, texture, reflectance, transparency, and normal attributes can include data describing some or all of the data points in point cloud media 500, depending on the example. For example, the reflectance, transparency, and normal attributes are optional, and hence may occur individually or in combination for some examples of point cloud media 500 and not for others, even within the same bitstream. Thus, the number of patches 508, and hence the number of attributes, may vary from frame to frame and from video to video based on the filmed subject matter, video settings, and the like.

FIG. 6 is an example of data segmentation and packing for a point cloud media frame 600. Specifically, the example of FIG. 6 depicts a 2D representation of the patches 508 of point cloud media 500. Point cloud media frame 600 includes a bounding box 602 corresponding to a current frame from a video sequence. Bounding box 602 is 2D, in contrast to bounding boxes 502, 504, and 506 of FIG. 5, which are 3D. As shown, bounding box 602 contains numerous patches 604. Patches 604 may be interchangeably referred to herein as 2D patches or patch data units. Collectively, the patches 604 in FIG. 6 are a representation of the image within bounding box 504 of FIG. 5. Thus, the 3D image in bounding box 504 of FIG. 5 is projected onto bounding box 602 via patches 604. The portions of bounding box 602 that do not contain one of the patches 604 are referred to as empty space 606. Empty space 606 may also be referred to as void space, empty samples, and the like.

With the above in mind, it is noted that a video-based point cloud compression (PCC) codec solution is based on segmentation of 3D point cloud data (e.g., patches 508 of FIG. 5) into 2D projection patches (e.g., patches 604 of FIG. 6). Indeed, the coding methods or processes described above may be beneficially implemented for various types of technologies, such as immersive six degrees of freedom (6 DoF), dynamic augmented reality/virtual reality (AR/VR) objects, cultural heritage, graphic information systems (GIS), computer aided design (CAD), autonomous navigation, and the like.

The position of each patch (e.g., one of patches 604 of FIG. 6) within a bounding box (e.g., bounding box 602) may be determined by the size of the patch alone. For example, the largest of the patches 604 in FIG. 6 is projected onto bounding box 602 first, starting from the top left corner (0, 0). After the largest of the patches 604 has been projected onto bounding box 602, the next largest patch 604 is projected onto (also referred to as filled into) bounding box 602, and so on until the smallest of the patches 604 has been projected onto bounding box 602. Again, only the size of each patch 604 is considered in this process. In some cases, a patch 604 with a smaller size may occupy the space between larger patches and may therefore end up at a position closer to the top left corner of bounding box 602 than a larger patch 604. During encoding, this process may be repeated for each relevant attribute until the patches for each attribute in the frame are encoded into one or more corresponding attribute streams. The group of data units in the attribute streams that are used to recreate a single frame can then be stored in the bitstream in a PCC AU. At the decoder, these attribute streams are obtained from the PCC AU and decoded to generate patches 604. Such patches 604 can then be combined to recreate the PCC media. Thus, point cloud media frame 600 can be coded by a codec, such as video codec system 200, encoder 300, and/or decoder 400, as part of method 100 to compress point cloud media 500 for transmission.
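
As a non-limiting illustration, size-ordered packing of patches into a 2D bounding box can be sketched as follows. The sketch assumes rectangular patches, a raster-scan search for the first free position starting at the top left corner (0, 0), and width times height as the size measure; none of these details is specified by this disclosure, and the function names are hypothetical. The returned grid marks occupied samples with 1 and empty space with 0.

```python
from typing import Dict, List, Optional, Tuple


def first_free_spot(
    occupancy: List[List[int]], w: int, h: int
) -> Optional[Tuple[int, int]]:
    """Raster-scan from (0, 0) for the first position where a w x h patch fits."""
    frame_h, frame_w = len(occupancy), len(occupancy[0])
    for y0 in range(frame_h - h + 1):
        for x0 in range(frame_w - w + 1):
            if all(
                occupancy[y][x] == 0
                for y in range(y0, y0 + h)
                for x in range(x0, x0 + w)
            ):
                return x0, y0
    return None


def pack_patches(
    patch_sizes: List[Tuple[int, int]], frame_w: int, frame_h: int
) -> Tuple[Dict[int, Tuple[int, int]], List[List[int]]]:
    """Place patches largest first; return their positions and the occupancy grid."""
    occupancy = [[0] * frame_w for _ in range(frame_h)]
    positions: Dict[int, Tuple[int, int]] = {}
    order = sorted(
        range(len(patch_sizes)),
        key=lambda i: patch_sizes[i][0] * patch_sizes[i][1],
        reverse=True,
    )
    for i in order:
        w, h = patch_sizes[i]
        spot = first_free_spot(occupancy, w, h)
        if spot is None:
            raise ValueError(f"patch {i} does not fit in the bounding box")
        x0, y0 = spot
        for y in range(y0, y0 + h):
            for x in range(x0, x0 + w):
                occupancy[y][x] = 1  # mark samples covered by this patch
        positions[i] = (x0, y0)
    return positions, occupancy
```

Because each position depends only on patch sizes and the placement order, a decoder that knows the sizes could repeat the same procedure to recover the positions.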

FIG. 7 is a schematic diagram illustrating an exemplary PCC video stream 700 with an extended attribute set. For example, PCC video stream 700 may be generated when point cloud media frames 600 from point cloud media 500 are encoded according to method 100, e.g., by employing video codec system 200, encoder 300, and/or decoder 400.

PCC video stream 700 includes a sequence of PCC AUs 710. A PCC AU 710 includes sufficient data to reconstruct a single PCC frame. The data is placed in the PCC AU 710 in NAL units 720. A NAL unit 720 is a packet-sized data container. For example, a single NAL unit 720 is generally sized to allow for simple network transmission. A NAL unit 720 may include a header that indicates the type of the NAL unit 720 and a payload that contains the associated video data. PCC video stream 700 is designed for an extended attribute set, and hence contains several attribute-specific NAL units 720.

PCC video stream 700 includes a group of frames (GOF) header 721, an auxiliary information frame 722, an occupancy map frame 723, a geometry NAL unit 724, a texture NAL unit 725, a reflectance NAL unit 726, a transparency NAL unit 727, and a normal NAL unit 728, each of which is a type of NAL unit 720. GOF header 721 contains various syntax elements that describe the corresponding PCC AU 710, the frame associated with the corresponding PCC AU 710, and/or the other NAL units 720 in the PCC AU 710. A PCC AU 710 may contain a single GOF header 721 or no GOF header 721, depending on the example. Auxiliary information frame 722 may contain metadata related to the frame, such as information related to the patches used to encode the attributes. Occupancy map frame 723 may contain further metadata related to the frame, such as an occupancy map that indicates the regions of the frame that are occupied with data versus the regions of the frame that are empty. The remaining NAL units 720 contain attribute data for the PCC AU 710. Specifically, geometry NAL unit 724, texture NAL unit 725, reflectance NAL unit 726, transparency NAL unit 727, and normal NAL unit 728 contain the geometry attribute, texture attribute, reflectance attribute, transparency attribute, and normal attribute, respectively.
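
As a non-limiting illustration, the organization of a PCC AU into typed NAL units can be modeled as follows. The class names and the numeric type values are hypothetical and are not taken from this disclosure or from any specification; only the set of NAL unit kinds and the rule of at most one GOF header per PCC AU follow the description above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class PccNalType(Enum):
    # Numeric values are placeholders chosen for the sketch.
    GOF_HEADER = 0
    AUX_INFO = 1
    OCCUPANCY_MAP = 2
    GEOMETRY = 3
    TEXTURE = 4
    REFLECTANCE = 5
    TRANSPARENCY = 6
    NORMAL = 7


@dataclass
class PccNalUnit:
    nal_type: PccNalType  # carried in the NAL unit header
    payload: bytes        # associated video data or metadata


@dataclass
class PccAccessUnit:
    nal_units: List[PccNalUnit] = field(default_factory=list)

    def units_of(self, nal_type: PccNalType) -> List[PccNalUnit]:
        """Return all NAL units of one type; the list may be empty."""
        return [u for u in self.nal_units if u.nal_type is nal_type]

    def has_valid_gof_header_count(self) -> bool:
        """A PCC AU carries a single GOF header or none."""
        return len(self.units_of(PccNalType.GOF_HEADER)) <= 1
```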

As noted above, attributes can be organized into streams. For example, there may be zero to four streams for each attribute. A stream may contain a logically separate portion of the PCC video data. For example, attributes for different objects may be encoded into multiple attribute streams of the same type (e.g., a first geometry stream for a first 3D bounding box, a second attribute stream for a second 3D bounding box, etc.). In another example, attributes associated with different frames may be encoded into multiple attribute streams (e.g., a transparency attribute stream for even frames and a transparency attribute stream for odd frames). In yet another example, patches may be placed in layers to represent a 3D object. Accordingly, separate layers may be included in separate streams (e.g., a first texture attribute stream for the top layer, a second texture attribute stream for the second layer, etc.). Regardless of the example, a PCC AU 710 may contain zero, one, or multiple NAL units for the corresponding attribute.

This disclosure supports increased flexibility for coding the various attributes (e.g., as contained in the geometry NAL unit 724, texture NAL unit 725, reflection NAL unit 726, transparency NAL unit 727, and/or normal NAL unit 728). In a first example, different codecs can be employed to code different PCC attributes. As a specific example, a first codec can be employed to code the geometry of the PCC video into the geometry NAL units 724, and a second codec can be employed to code the reflection of the PCC video into the reflection NAL units 726. As another example, up to five codecs (e.g., one codec per attribute) may be employed when coding the PCC video. The codec used for each attribute can then be signaled as a syntax element in the PCC video stream 700, for example in the GOF header 721.

Further, as noted above, PCC attributes may employ many combinations of layers and/or streams. Accordingly, syntax elements (e.g., in the GOF header 721) can be used to signal the combination of layers and/or streams used by the encoder when encoding each attribute, so that the decoder can determine the layer and/or stream combination for each attribute when decoding. Syntax elements (e.g., in the GOF header 721) can also be used to signal the mode used to code and/or combine the layers of a PCC attribute within a PCC attribute stream. Additionally, syntax elements (e.g., in the GOF header 721) can be used to specify the layer index of the layer associated with each NAL unit 720 that corresponds to a PCC attribute stream. For example, the GOF header 721 can be used to signal the number of layers and streams associated with the geometry attribute, the manner in which such layers and streams are arranged, and the layer index for each geometry NAL unit 724, so that the decoder can assign each geometry NAL unit 724 to the appropriate layer when decoding a PCC frame.

Finally, a flag (e.g., in the GOF header 721) can indicate whether any PCC attribute layer contains any irregular point cloud points. An irregular point cloud is a set of one or more data points that are not contiguous with neighboring data points and therefore cannot be represented in a 2D patch such as patch 604. Instead, such points are represented as part of an irregular point cloud patch that contains the coordinates and/or transformation parameters associated with the irregular point cloud points. Because an irregular point cloud is represented using a different data structure than a 2D patch, the flag allows the decoder to properly recognize the presence of an irregular point cloud and to select the appropriate mechanism for decoding such data.

The following are exemplary mechanisms for implementing the aspects described above. Definition: a video NAL unit is a PCC NAL unit with PccNalUnitType equal to GMTRY_NALU, TEXTURE_NALU, REFLECT_NALU, TRANSP_NALU, or NORMAL_NALU.

Bitstream format: This clause specifies the relationship between the NAL unit stream and the byte stream, either of which is referred to as the bitstream. The bitstream can be in one of two formats: the NAL unit stream format or the byte stream format. The NAL unit stream format is conceptually the more basic type and consists of a sequence of syntax structures called PCC NAL units. This sequence is ordered in decoding order. Constraints are imposed on the decoding order (and contents) of the PCC NAL units in the NAL unit stream. The byte stream format can be constructed from the NAL unit stream format by ordering the NAL units in decoding order and prefixing each NAL unit with a start code prefix and zero or more zero-valued bytes to form a stream of bytes. The NAL unit stream format can be extracted from the byte stream format by searching for the location of the unique start code prefix pattern within this stream of bytes. The byte stream format is similar to the format employed in HEVC and AVC.
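The relationship between the two formats can be illustrated with a minimal Python sketch. The three-byte start code prefix 0x000001 is an assumption borrowed from the HEVC and AVC byte stream formats, which the text describes only as similar. The function names are hypothetical. The sketch further assumes that no NAL unit contains the start code pattern or ends in a zero byte.

```python
# Assumed start code prefix, borrowed from the HEVC/AVC byte stream format.
START_CODE_PREFIX = b"\x00\x00\x01"


def to_byte_stream(nal_units, leading_zero_bytes=1):
    """Prefix each NAL unit (in decoding order) with zero bytes and a start code."""
    out = bytearray()
    for nal in nal_units:
        out += b"\x00" * leading_zero_bytes + START_CODE_PREFIX + nal
    return bytes(out)


def from_byte_stream(stream):
    """Recover the NAL units by searching for the start code prefix pattern."""
    nal_units = []
    pos = stream.find(START_CODE_PREFIX)
    while pos != -1:
        start = pos + len(START_CODE_PREFIX)
        nxt = stream.find(START_CODE_PREFIX, start)
        end = len(stream) if nxt == -1 else nxt
        # Zero bytes in front of the next start code are padding, not payload
        # (valid only under the assumption that no NAL unit ends in 0x00).
        nal_units.append(stream[start:end].rstrip(b"\x00"))
        pos = nxt
    return nal_units
```

A real byte stream would also need a mechanism that keeps the start code pattern from appearing inside a payload. That mechanism is outside the scope of this sketch.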

The PCC NAL unit header syntax may be implemented as set forth in Table 1 below.

The PCC profile and level syntax may be implemented as set forth in Table 3 below.

The PCC NAL unit header semantics may be implemented as follows. forbidden_zero_bit may be set equal to 0. pcc_nal_unit_type_plus1 minus 1 specifies the value of the variable PccNalUnitType, which specifies the type of RBSP data structure contained in the PCC NAL unit, as specified in Table 4 below. The variable PccNalUnitType is specified as follows:

PccNalUnitType = pcc_nal_unit_type_plus1 - 1 (7-1)

PCC NAL units that have nal_unit_type in the range of UNSPEC25 to UNSPEC30, for which semantics are not specified, do not affect the decoding process specified herein. Note that PCC NAL unit types in the range of UNSPEC25 to UNSPEC30 may be used as determined by the application. No decoding process for these values of PccNalUnitType is specified in this disclosure. Because different applications might use these PCC NAL unit types for different purposes, particular care should be exercised in the design of encoders that generate PCC NAL units with these PccNalUnitType values and in the design of decoders that interpret the content of PCC NAL units with these PccNalUnitType values. This disclosure does not define any management for these values. These PccNalUnitType values may only be suitable for use in contexts in which collisions of usage (e.g., different definitions of the meaning of the PCC NAL unit content for the same PccNalUnitType value) are unimportant, are not possible, or are managed, e.g., defined or managed in the controlling application or transport specification, or by controlling the environment in which bitstreams are distributed.
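Equation (7-1) can be illustrated with a short Python sketch. Table 1 is not reproduced in this text, so the one-byte layout used here (1 bit for forbidden_zero_bit, 5 bits for pcc_nal_unit_type_plus1, 2 bits for pcc_stream_id) is purely an assumption made for illustration. The names PccNalHeader and parse_pcc_nal_header are likewise hypothetical.

```python
from typing import NamedTuple


class PccNalHeader(NamedTuple):
    pcc_nal_unit_type: int  # the variable PccNalUnitType of equation (7-1)
    pcc_stream_id: int


def parse_pcc_nal_header(first_byte: int) -> PccNalHeader:
    """Parse a one-byte header under an ASSUMED 1 + 5 + 2 bit layout."""
    forbidden_zero_bit = (first_byte >> 7) & 0x1
    if forbidden_zero_bit != 0:
        raise ValueError("forbidden_zero_bit must be equal to 0")
    pcc_nal_unit_type_plus1 = (first_byte >> 2) & 0x1F
    pcc_stream_id = first_byte & 0x3
    # Equation (7-1): PccNalUnitType = pcc_nal_unit_type_plus1 - 1
    return PccNalHeader(pcc_nal_unit_type_plus1 - 1, pcc_stream_id)
```

Under the assumed layout, the byte 0b00001101 carries pcc_nal_unit_type_plus1 equal to 3 and pcc_stream_id equal to 1, so PccNalUnitType is 2.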

For purposes other than determining the amount of data in the PCC AUs of the bitstream, decoders may ignore (i.e., remove from the bitstream and discard) the contents of all PCC NAL units that use reserved values of PccNalUnitType. This requirement may allow future definition of compatible extensions to this disclosure.

The identified video codec (e.g., HEVC or AVC) is indicated in the group of frames header NAL unit that is present in the first PCC AU of each cloud point stream (CPS). pcc_stream_id specifies the PCC stream identifier (ID) of the PCC NAL unit. When PccNalUnitType is equal to GOF_HEADER, AUX_INFO, or OCP_MAP, the value of pcc_stream_id is set equal to 0. In the definition of one or more sets of PCC profiles and levels, the value of pcc_stream_id may be constrained to be less than 4.

The order of PCC NAL units and their association to PCC AUs are described below. A PCC AU contains zero or one group of frames header NAL unit, one auxiliary information frame NAL unit, one occupancy map frame NAL unit, and one or more video AUs carrying the data units of a PCC attribute, such as geometry, texture, reflection, transparency, or normal. Let video_au(i, j) denote the video AU with pcc_stream_id equal to j for the PCC attribute with PCC attribute ID equal to attribute_type[i]. The video AUs present in a PCC AU may be ordered as follows. If attributes_first_ordering_flag is equal to 1, the following applies for any two video AUs video_au(i1, j1) and video_au(i2, j2) present in a PCC AU. If i1 is less than i2, video_au(i1, j1) shall precede video_au(i2, j2), regardless of the values of j1 and j2. Otherwise, if i1 is equal to i2 and j1 is greater than j2, video_au(i1, j1) shall follow video_au(i2, j2).

Otherwise (e.g., attributes_first_ordering_flag is equal to 0), the following applies for any two video AUs video_au(i1, j1) and video_au(i2, j2) present in a PCC AU. If j1 is less than j2, video_au(i1, j1) shall precede video_au(i2, j2), regardless of the values of i1 and i2. Otherwise, if j1 is equal to j2 and i1 is greater than i2, video_au(i1, j1) shall follow video_au(i2, j2). The above order of video AUs results in the following.

If attributes_first_ordering_flag is equal to 1, the order of the video AUs, when present, within a PCC AU is as follows (in the order listed), where within a PCC AU all PCC NAL units of each particular PCC attribute, when present, are contiguous in decoding order without being interleaved with PCC NAL units of other PCC attributes:

video_au(0, 0), video_au(0, 1), ..., video_au(0, num_streams_for_attribute[0]), video_au(1, 0), video_au(1, 1), ..., video_au(1, num_streams_for_attribute[1]), video_au(num_attributes - 1, 0), video_au(num_attributes - 1, 1), ..., video_au(num_attributes - 1, num_streams_for_attribute[1]).

Otherwise (attributes_first_ordering_flag is equal to 0), the order of the video AUs, when present, within a PCC AU is as follows (in the order listed), where within a PCC AU all PCC NAL units of each particular pcc_stream_id value, when present, are contiguous in decoding order without being interleaved with PCC NAL units of other pcc_stream_id values:

video_au(0, 0), video_au(1, 0), ..., video_au(num_attributes - 1, 0), video_au(0, 1), video_au(1, 1), ..., video_au(num_attributes - 1, 1), video_au(0, num_streams_for_attribute[1]), video_au(1, num_streams_for_attribute[1]), ..., video_au(num_attributes - 1, num_streams_for_attribute[1]).
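The two orderings can be compared side by side with a small Python sketch. It assumes, for illustration only, that every attribute i carries streams with pcc_stream_id values 0 through n - 1, where n is a per-attribute stream count supplied by the caller, and that all of those video AUs are present. The function name video_au_order is hypothetical.

```python
def video_au_order(stream_counts, attributes_first_ordering_flag):
    """Return (i, j) index pairs of video_au(i, j) in PCC AU order.

    stream_counts[i] is the assumed number of streams present for
    attribute i, with stream IDs 0 .. stream_counts[i] - 1.
    """
    pairs = [(i, j)
             for i, count in enumerate(stream_counts)
             for j in range(count)]
    if attributes_first_ordering_flag:
        # Attribute index first: all streams of one attribute are contiguous.
        return sorted(pairs, key=lambda p: (p[0], p[1]))
    # Stream ID first: all attributes of one stream ID are contiguous.
    return sorted(pairs, key=lambda p: (p[1], p[0]))
```

For two streams of attribute 0 and one stream of attribute 1, the flag value 1 places both video AUs of attribute 0 before the video AU of attribute 1, while the flag value 0 places the two video AUs with stream ID 0 first.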

The association of NAL units to a video AU and the order of NAL units within a video AU are specified in the specification of the identified video codec, e.g., HEVC or AVC. The identified video codec is indicated in the group of frames header NAL unit that is present in the first PCC AU of each CPS.

The first PCC AU of each CPS starts with a group of frames header NAL unit, and each group of frames header NAL unit specifies the start of a new PCC AU.

Other PCC AUs start with an auxiliary information frame NAL unit. In other words, an auxiliary information frame NAL unit starts a new PCC AU when it is not preceded by a group of frames header NAL unit.
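The two boundary rules above can be sketched in Python as a pass over the sequence of NAL unit types in decoding order. The type names follow the PccNalUnitType values mentioned in this description. The function name split_into_pcc_aus is hypothetical, and "preceded" is interpreted here as "immediately preceded", which is an assumption.

```python
def split_into_pcc_aus(nal_types):
    """Group a decoding-order list of PccNalUnitType names into PCC AUs."""
    aus = []
    prev = None
    for nal_type in nal_types:
        starts_new_au = (
            nal_type == "GOF_HEADER"
            # An auxiliary information frame opens an AU only when no
            # GOF header came directly before it (assumed interpretation).
            or (nal_type == "AUX_INFO" and prev != "GOF_HEADER")
        )
        if starts_new_au or not aus:
            aus.append([])
        aus[-1].append(nal_type)
        prev = nal_type
    return aus
```

With this interpretation, a sequence that begins with a GOF header and later contains a second auxiliary information frame is split into two PCC AUs, the second of which has no GOF header.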

The group of frames header RBSP semantics are as follows. num_attributes specifies the maximum number of PCC attributes (geometry, texture, etc.) that may be carried in the CPS. Note that in the definition of one or more sets of PCC profiles and levels, the value of num_attributes may be constrained to be 5 or less. attributes_first_ordering_flag, when set equal to 1, specifies that, within a PCC AU, all PCC NAL units of each PCC attribute, when present, are contiguous in decoding order without being interleaved with PCC NAL units of other PCC attributes. attributes_first_ordering_flag, when set equal to 0, specifies that, within a PCC AU, all PCC NAL units of each particular pcc_stream_id value, when present, are contiguous in decoding order without being interleaved with PCC NAL units of other pcc_stream_id values. attribute_type[i] specifies the PCC attribute type of the i-th PCC attribute. The interpretation of the different PCC attribute types is specified in Table 5 below. In the definition of one or more sets of PCC profiles and levels, the values of attribute_type[0] and attribute_type[1] may be constrained to be equal to 0 and 1, respectively.

identified_codec_for_attribute[i] specifies the identified video codec used for coding the i-th PCC attribute, as shown in Table 6 below.

num_streams_for_attribute[i] specifies the maximum number of PCC streams for the i-th PCC attribute. Note that in the definition of one or more sets of PCC profiles and levels, the value of num_streams_for_attribute[i] may be constrained to be 4 or less. num_layers_for_attribute[i] specifies the number of attribute layers for the i-th PCC attribute. Note that in the definition of one or more sets of PCC profiles and levels, the value of num_layers_for_attribute[i] may be constrained to be 4 or less. max_attribute_layer_idx[i][j] specifies the maximum value of the attribute layer index for the PCC stream with pcc_stream_id equal to j for the i-th PCC attribute. The value of max_attribute_layer_idx[i][j] should be less than num_layers_for_attribute[i]. attribute_layers_combination_mode[i][j] specifies the attribute layer combination mode for the attribute layers carried in the PCC stream with pcc_stream_id equal to j for the i-th PCC attribute. The interpretation of the different values of attribute_layers_combination_mode[i][j] is specified in Table 7 below.
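The constraints on these syntax elements can be collected into a small consistency check. The following Python sketch is illustrative only. The function name check_gof_header_limits is hypothetical, and the limits of 5 attributes, 4 streams, and 4 layers are the values that the text says a profile or level may impose, not fixed requirements.

```python
def check_gof_header_limits(num_streams_for_attribute,
                            num_layers_for_attribute,
                            max_attribute_layer_idx):
    """Return a list of violations (empty when every check passes).

    The three arguments are lists indexed by attribute i;
    max_attribute_layer_idx[i] is itself a list indexed by stream ID j.
    """
    problems = []
    num_attributes = len(num_streams_for_attribute)
    if num_attributes > 5:  # assumed profile limit
        problems.append("num_attributes exceeds 5")
    for i in range(num_attributes):
        if num_streams_for_attribute[i] > 4:  # assumed profile limit
            problems.append(f"num_streams_for_attribute[{i}] exceeds 4")
        if num_layers_for_attribute[i] > 4:  # assumed profile limit
            problems.append(f"num_layers_for_attribute[{i}] exceeds 4")
        for j, max_idx in enumerate(max_attribute_layer_idx[i]):
            # The layer index bound stated in the semantics above.
            if max_idx >= num_layers_for_attribute[i]:
                problems.append(
                    f"max_attribute_layer_idx[{i}][{j}] is not less than "
                    f"num_layers_for_attribute[{i}]")
    return problems
```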

When attribute_layers_combination_mode[i][j] is present and equal to 0, the variable attrLayerIdx[i][j], which indicates the attribute layer index for the attribute layer of the PCC stream with pcc_stream_id equal to j for the i-th PCC attribute, where the PCC NAL units of the attribute layer are carried in the video AU with picture order count value equal to PicOrderCntVal as specified in the specification of the identified video codec, is derived as follows.

regular_points_flag[i][j], when equal to 1, specifies that the attribute layer with layer index equal to j for the i-th PCC attribute carries regular points of the point cloud signal. regular_points_flag[i][j], when equal to 0, specifies that the attribute layer with layer index equal to j for the i-th PCC attribute carries irregular points of the point cloud signal. Note that in the definition of one or more sets of PCC profiles and levels, the value of regular_points_flag[i][j] may be constrained to be 0. frame_width indicates the frame width, in pixels, of the geometry and texture videos. The frame width should be a multiple of occupancyResolution. frame_height indicates the frame height, in pixels, of the geometry and texture videos. The frame height should be a multiple of occupancyResolution. occupancy_resolution indicates the horizontal and vertical resolution, in pixels, at which patches are packed in the geometry and texture videos. occupancy_resolution should be an even multiple of occupancyPrecision. radius_to_smoothing indicates the radius used to detect neighbors for smoothing. The value of radius_to_smoothing should be in the range of 0 to 255.
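As a rough illustration of the frame size and range constraints, the following Python sketch checks a set of candidate values. The function name check_frame_parameters is hypothetical. The sketch also assumes that the variable occupancyResolution takes the value of the occupancy_resolution syntax element and that this value is positive.

```python
def check_frame_parameters(frame_width, frame_height,
                           occupancy_resolution, radius_to_smoothing):
    """Return a list of violated constraints (empty when all are met)."""
    problems = []
    # Frame dimensions should be multiples of the occupancy resolution.
    if frame_width % occupancy_resolution != 0:
        problems.append("frame_width is not a multiple of occupancyResolution")
    if frame_height % occupancy_resolution != 0:
        problems.append("frame_height is not a multiple of occupancyResolution")
    # radius_to_smoothing should lie in the range 0 to 255.
    if not 0 <= radius_to_smoothing <= 255:
        problems.append("radius_to_smoothing is outside the range 0 to 255")
    return problems
```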

neighbor_count_smooting indicates the maximum number of neighbors used for smoothing. The value of neighbor_count_smooting should be in the range of 0 to 255. radius2_boundary_detection indicates the radius for boundary point detection. The value of radius2_boundary_detection should be in the range of 0 to 255. threshold_smooting indicates the smoothing threshold. The value of threshold_smooting should be in the range of 0 to 255. lossless_geometry indicates lossless geometry coding. lossless_geometry equal to 1 indicates that the point cloud geometry information is coded losslessly. lossless_geometry equal to 0 indicates that the point cloud geometry information is coded in a lossy manner. lossless_texture indicates lossless texture encoding. lossless_texture equal to 1 indicates that the point cloud texture information is coded losslessly. lossless_texture equal to 0 indicates that the point cloud texture information is coded in a lossy manner. lossless_geometry_444 indicates whether the 4:2:0 or the 4:4:4 video format is used for geometry frames. lossless_geometry_444 equal to 1 indicates that the geometry video is coded in 4:4:4 format. lossless_geometry_444 equal to 0 indicates that the geometry video is coded in 4:2:0 format.

absolute_d1_coding indicates how the geometry layers other than the layer nearest to the projection plane are coded. absolute_d1_coding equal to 1 indicates that the actual geometry values are coded for the geometry layers other than the layer nearest to the projection plane. absolute_d1_coding equal to 0 indicates that the geometry layers other than the layer nearest to the projection plane are coded differentially. bin_agithmetic_coding indicates whether binary arithmetic coding is used. bin_agithmetic_coding equal to 1 indicates that binary arithmetic coding is used for all syntax elements. bin_agithmetic_coding equal to 0 indicates that non-binary arithmetic coding is used for some syntax elements. gof_header_extension_flag equal to 0 specifies that no gof_header_extension_data_flag syntax elements are present in the group of frames header RBSP syntax structure. gof_header_extension_flag equal to 1 specifies that gof_header_extension_data_flag syntax elements are present in the group of frames header RBSP syntax structure. Decoders may ignore all data that follow the value 1 for gof_header_extension_flag in a group of frames header NAL unit. gof_header_extension_data_flag may have any value, and its presence and value do not affect decoder conformance. Decoders may ignore all gof_header_extension_data_flag syntax elements.

The PCC profile and level semantics are as follows. pcc_profile_idc indicates the profile to which the CPS conforms. pcc_pl_reserved_zero_19bits is equal to 0 in bitstreams conforming to this version of this disclosure. Other values of pcc_pl_reserved_zero_19bits are reserved for future use by ISO/IEC. Decoders may ignore the value of pcc_pl_reserved_zero_19bits. pcc_level_idc indicates the level to which the CPS conforms. hevc_ptl_12bytes_attribute[i] may be equal to the value of the 12 bytes from general_profile_idc to general_level_idc in the active SPS when the HEVC bitstream for the PCC attribute type equal to attribute_type[i], extracted as specified by the sub-bitstream extraction process, is decoded by a conforming HEVC decoder. avc_pl_3bytes_attribute[i] may be equal to the value of the 3 bytes from profile_idc to level_idc in the active SPS when the AVC bitstream for the PCC attribute type equal to attribute_type[i], extracted as specified by the sub-bitstream extraction process, is decoded by a conforming AVC decoder.

The sub-bitstream extraction process is as follows. The inputs to this process are a PCC bitstream inBitstream, a target PCC attribute type targetAttrType, and a target PCC stream ID value targetStreamId. The output of this process is a sub-bitstream. It may be a requirement of bitstream conformance for the input bitstream that any output sub-bitstream that is the output of the process specified in this clause, with a conforming PCC bitstream inBitstream, targetAttrType indicating any type of PCC attribute present in inBitstream, and targetStreamId less than or equal to the maximum PCC stream ID value of the PCC streams present in inBitstream for the attribute type targetAttrType, shall be a video bitstream that conforms to the specification of the identified video codec for the attribute type targetAttrType.

The output sub-bitstream is derived by the following ordered steps. Depending on the value of targetAttrType, the following applies. If targetAttrType is equal to ATTR_GEOMETRY, all PCC NAL units with PccNalUnitType not equal to GMTRY_NALU or with pcc_stream_id not equal to targetStreamId are removed. Otherwise, if targetAttrType is equal to ATTR_TEXTURE, all PCC NAL units with PccNalUnitType not equal to TEXTURE_NALU or with pcc_stream_id not equal to targetStreamId are removed. Otherwise, if targetAttrType is equal to ATTR_REFLECT, all PCC NAL units with PccNalUnitType not equal to REFLECT_NALU or with pcc_stream_id not equal to targetStreamId are removed. Otherwise, if targetAttrType is equal to ATTR_TRANSP, all PCC NAL units with PccNalUnitType not equal to TRANSP_NALU or with pcc_stream_id not equal to targetStreamId are removed. Otherwise, if targetAttrType is equal to ATTR_NORMAL, all PCC NAL units with PccNalUnitType not equal to NORMAL_NALU or with pcc_stream_id not equal to targetStreamId are removed. For each remaining PCC NAL unit, the first byte may be removed.
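The extraction steps can be sketched in Python as a filter over already-parsed PCC NAL units. The representation of the input as a list of (PccNalUnitType, pcc_stream_id, bytes) tuples and the function name extract_sub_bitstream are assumptions made for illustration. The attribute and NAL unit type names are the ones used in the steps above.

```python
# Mapping from target attribute type to the video NAL unit type that is kept.
ATTR_TO_NALU = {
    "ATTR_GEOMETRY": "GMTRY_NALU",
    "ATTR_TEXTURE": "TEXTURE_NALU",
    "ATTR_REFLECT": "REFLECT_NALU",
    "ATTR_TRANSP": "TRANSP_NALU",
    "ATTR_NORMAL": "NORMAL_NALU",
}


def extract_sub_bitstream(in_bitstream, target_attr_type, target_stream_id):
    """Return the payloads of the matching PCC NAL units, first byte removed.

    in_bitstream is an assumed representation: a decoding-order list of
    (PccNalUnitType name, pcc_stream_id, raw NAL unit bytes) tuples.
    """
    wanted_type = ATTR_TO_NALU[target_attr_type]
    out = []
    for nal_type, stream_id, nal_bytes in in_bitstream:
        if nal_type != wanted_type or stream_id != target_stream_id:
            continue  # this PCC NAL unit is removed
        out.append(nal_bytes[1:])  # drop the first byte of the NAL unit
    return out
```

The returned list contains, in decoding order, the NAL units that would be handed to the decoder of the identified video codec for that attribute and stream.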

In an alternative embodiment of the first set of methods summarized above, the PCC NAL unit header is designed to use more bits for pcc_stream_id and to allow more than four streams for each attribute. In that case, one more type is added to the PCC NAL unit header.

FIG. 8 is a schematic diagram illustrating an exemplary mechanism 800 for encoding PCC attributes 841 and 842 with multiple codecs 843 and 844. For example, mechanism 800 may be employed to encode and/or decode the attributes of PCC video stream 700. Accordingly, mechanism 800 can be employed to encode and/or decode point cloud media frames 600 based on point cloud media 500. As such, mechanism 800 may be used by encoder 300 when generating a bitstream from a PCC sequence and by decoder 400 when reconstructing the PCC sequence from the bitstream. Mechanism 800 can therefore be employed by codec system 200 and may also be employed to support method 100.

Mechanism 800 can be applied to multiple PCC attributes 841 and 842. For example, PCC attributes 841 and 842 may be any two attributes selected from the group including a geometry attribute, a texture attribute, a reflectance attribute, a transparency attribute, and a normal attribute. As shown in FIG. 8, mechanism 800 depicts the encoding process when proceeding from left to right and the decoding process when proceeding from right to left. Codecs 843 and 844 may be any two codecs, such as HEVC, AVC, or VVC, or any versions thereof. A particular codec 843 or 844, or a version thereof, may be more efficient than others when encoding a particular PCC attribute 841 or 842. In this example, codec 843 is used to encode attribute 841 and codec 844 is used to encode attribute 842. The results of such encoding are combined to produce a PCC video stream 845 containing both PCC attributes 841 and 842. At the decoder, codec 843 is used to decode attribute 841 and codec 844 is used to decode attribute 842. The decoded attributes 841 and 842 can then be recombined to produce the decoded PCC video stream 845.

An advantage of employing mechanism 800 is that the most efficient codecs 843 and 844 can be selected for the corresponding attributes 841 and 842. Mechanism 800 is not limited to two attributes 841 and 842 and two codecs 843 and 844. For example, each attribute (geometry, texture, reflectance, transparency, and normal) can be encoded by a separate codec. To ensure that the proper codecs 843 and 844 can be selected to decode the corresponding attributes 841 and 842, the encoder may signal codecs 843 and 844 and their correspondence to attributes 841 and 842, respectively. For example, the encoder can include syntax elements in a GOF header to indicate the codec-to-attribute correspondence. The decoder can then read the relevant syntax, select the correct codecs 843 and 844 for attributes 841 and 842, and decode the PCC video stream 845. As a specific example, an identified_codec_for_attribute syntax element may be employed to indicate codecs 843 and 844 for attributes 841 and 842, respectively.
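The codec-to-attribute signaling described above can be pictured as a small lookup carried in the GOF header. The Python sketch below is illustrative only: the numeric codec identifiers in `CODEC_NAMES`, the dictionary form of the header, and the function names are assumptions, since the text names the identified_codec_for_attribute syntax element but does not give its coding here.

```python
from typing import Callable, Dict

# Assumed numeric identifiers; the description does not define these values.
CODEC_NAMES = {0: "AVC", 1: "HEVC", 2: "VVC"}


def build_gof_header(codec_for_attribute: Dict[str, int]) -> Dict[str, Dict[str, int]]:
    """Encoder side: record which codec id was used for each attribute."""
    for attribute, codec_id in codec_for_attribute.items():
        if codec_id not in CODEC_NAMES:
            raise ValueError(f"unknown codec id {codec_id} for {attribute}")
    return {"identified_codec_for_attribute": dict(codec_for_attribute)}


def select_decoders(
    gof_header: Dict[str, Dict[str, int]],
    decoders: Dict[str, Callable[[bytes], bytes]],
) -> Dict[str, Callable[[bytes], bytes]]:
    """Decoder side: read the header and pick the matching decoder per attribute."""
    chosen = {}
    for attribute, codec_id in gof_header["identified_codec_for_attribute"].items():
        chosen[attribute] = decoders[CODEC_NAMES[codec_id]]
    return chosen
```

In use, an encoder would call `build_gof_header({"geometry": 1, "texture": 0})`, and a decoder holding one decode function per codec name would call `select_decoders` to obtain the function to apply to each attribute's data.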

FIG. 9 is a schematic diagram 900 illustrating examples of attribute layers 931, 932, 933, and 934. For example, attribute layers 931, 932, 933, and 934 may be employed to carry attributes of PCC video stream 700. Accordingly, layers 931, 932, 933, and 934 can be used when encoding and/or decoding point cloud media frames 600 based on point cloud media 500. As such, layers 931, 932, 933, and 934 may be used by encoder 300 to generate a bitstream from a PCC sequence and by decoder 400 when reconstructing the PCC sequence from the bitstream. Hence, layers 931, 932, 933, and 934 can be employed by codec system 200 and may also be employed to support method 100. Additionally, attribute layers 931, 932, 933, and 934 may be used to carry one or more of attributes 841 and 842.

Attribute layers 931, 932, 933, and 934 are groupings of data related to an attribute that can be stored and/or modified independently of other groups of data related to the same attribute. As such, each attribute layer 931, 932, 933, and 934 can be modified and/or represented without affecting the remaining attribute layers 931, 932, 933, and/or 934. In some examples, attribute layers 931, 932, 933, and/or 934 may be visually represented on top of each other, as shown in FIG. 9. For example, a texture covering an entire object (e.g., a more general texture) can be stored in attribute layer 931, while more detailed textures (e.g., more specific textures) are included in attribute layers 932, 933, and/or 934. In another example, attribute layers 931 and/or 932 may be applied to odd-numbered frames and attribute layers 933 and/or 934 may be applied to even-numbered frames. This may allow some layers to be omitted in response to a change in frame rate. Each attribute may have zero to four attribute layers 931, 932, 933, and/or 934. To signal the configuration employed, the encoder may employ a syntax element, such as num_layers_for_attribute[i], in sequence-level data such as a GOF header. The decoder can read the syntax element and determine the number of attribute layers 931, 932, 933, and/or 934 employed for each attribute. Additional syntax elements, such as attribute_layers_combination_mode[i][j] and attrLayerIdx[i][j], may also be employed to indicate, respectively, the combination of attribute layers employed in the PCC video stream and the index of each layer used by the corresponding attribute.
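The layer count signaling and the odd/even frame example above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: layers are represented by their reference numerals, the helper names and the `drop_even` switch are hypothetical, and the real syntax coding of num_layers_for_attribute[i] is not reproduced.

```python
from typing import Dict, List

MAX_LAYERS = 4  # the description allows zero to four layers per attribute


def num_layers_for_attribute(layers_by_attribute: Dict[str, List[int]]) -> Dict[str, int]:
    """Encoder side: derive the per-attribute layer count to be signaled."""
    counts = {}
    for attribute, layers in layers_by_attribute.items():
        if len(layers) > MAX_LAYERS:
            raise ValueError(f"{attribute} uses more than {MAX_LAYERS} layers")
        counts[attribute] = len(layers)
    return counts


def layers_for_frame(
    frame_number: int,
    odd_layers: List[int],
    even_layers: List[int],
    drop_even: bool = False,
) -> List[int]:
    """Return the layers applied to a frame in the odd/even arrangement.

    Setting drop_even models omitting the even-frame layers when the
    frame rate is reduced.
    """
    if frame_number % 2 == 1:
        return list(odd_layers)
    return [] if drop_even else list(even_layers)
```

With layers 931 and 932 assigned to odd frames and 933 and 934 to even frames, passing `drop_even=True` leaves only the odd-frame layers, which is one way to picture the frame rate adaptation mentioned in the text.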

As yet another example, some attribute layers (e.g., attribute layers 931, 932, and 933) can carry data related to regular patches, while other attribute layers (e.g., attribute layer 934) carry data associated with irregular point cloud patches. This is useful because irregular point clouds may be described using different data than regular point cloud patches. To signal that a particular layer carries data associated with an irregular point cloud, the encoder can encode another syntax element in the sequence-level data. As a specific example, a regular_points_flag in the GOF header can be used to indicate that an attribute layer carries at least one irregular point cloud point. The decoder can then read the syntax element and decode the corresponding attribute layer accordingly.

FIG. 10 is a schematic diagram 1000 illustrating examples of attribute streams 1031, 1032, 1033, and 1034. For example, attribute streams 1031, 1032, 1033, and 1034 may be employed to carry attributes of PCC video stream 700. Accordingly, attribute streams 1031, 1032, 1033, and 1034 can be employed when encoding and/or decoding point cloud media frames 600 based on point cloud media 500. As such, attribute streams 1031, 1032, 1033, and 1034 may be used by encoder 300 to create a bitstream from a PCC sequence and by decoder 400 when reconstructing the PCC sequence from the bitstream. Hence, attribute streams 1031, 1032, 1033, and 1034 can be employed by codec system 200 and may also be employed to support method 100. Additionally, attribute streams 1031, 1032, 1033, and 1034 may be used to carry one or more of attributes 841 and 842. Further, attribute streams 1031, 1032, 1033, and 1034 may be employed to carry attribute layers 931, 932, 933, and 934.

Attribute streams 1031, 1032, 1033, and 1034 are sequences of attribute data over time. Specifically, attribute streams 1031, 1032, 1033, and 1034 are sub-streams of a PCC video stream. Each attribute stream 1031, 1032, 1033, and 1034 carries a sequence of attribute-specific NAL units and hence acts as a storage and/or transmission data structure. Each attribute stream 1031, 1032, 1033, and 1034 may carry one or more attribute layers 931, 932, 933, and 934 of data. For example, attribute stream 1031 can carry attribute layers 931 and 932, while attribute stream 1032 carries attribute layers 931 and 932 (attribute streams 1033 and 1034 are omitted). In another example, each attribute stream 1031, 1032, 1033, and 1034 carries a single corresponding attribute layer 931, 932, 933, and 934. In other examples, some of attribute streams 1031, 1032, 1033, and 1034 carry multiple attribute layers 931, 932, 933, and 934, while others carry a single attribute layer or are omitted. As can be seen, many combinations and permutations of attribute streams 1031, 1032, 1033, and 1034 and attribute layers 931, 932, 933, and 934 are possible. Accordingly, the encoder can employ a syntax element, such as num_streams_for_attribute, in sequence-level data such as a GOF header to indicate the number of attribute streams 1031, 1032, 1033, and 1034 used to encode each attribute. The decoder can then use such information, for example in combination with the attribute layer information, to decode attribute streams 1031, 1032, 1033, and 1034 in order to reconstruct the PCC sequence.
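The relationship between attribute streams and attribute layers described above can be sketched as a simple demultiplexing step. The Python below is only an illustration: the tuple form `(stream_id, layer_id, payload)`, the function names, and the particular stream-to-layer assignment used in the usage note are assumptions, not the signaled syntax.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed unit form: (stream id, layer id, payload bytes) for one attribute.
Unit = Tuple[int, int, bytes]


def demux_attribute_streams(units: List[Unit]) -> Dict[int, List[Tuple[int, bytes]]]:
    """Group units into per-stream sequences, preserving arrival order."""
    streams = defaultdict(list)
    for stream_id, layer_id, payload in units:
        streams[stream_id].append((layer_id, payload))
    return dict(streams)


def num_streams_for_attribute(streams: Dict[int, List[Tuple[int, bytes]]]) -> int:
    """Count of streams in use, the value an encoder would signal."""
    return len(streams)


def layers_in_stream(streams: Dict[int, List[Tuple[int, bytes]]], stream_id: int) -> List[int]:
    """Sorted list of distinct layer ids carried by one stream."""
    return sorted({layer_id for layer_id, _ in streams[stream_id]})
```

For instance, units tagged `(0, 931, ...)`, `(1, 933, ...)`, `(0, 932, ...)`, `(1, 934, ...)` demultiplex into two streams of two layers each, so the signaled stream count would be 2.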

FIG. 11 is a flowchart of an example method 1100 of encoding a PCC video sequence with multiple codecs. For example, method 1100 can organize data into a bitstream according to mechanism 800 while employing attribute layers 931, 932, 933, and 934 and/or streams 1031, 1032, 1033, and/or 1034. Method 1100 may also specify, in the GOF header, the mechanisms used to encode the attributes. Further, method 1100 may generate PCC video stream 700 by encoding point cloud media frames 600 based on point cloud media 500. Additionally, method 1100 may be employed by codec system 200 and/or encoder 300 while performing the encoding steps of method 100.

Method 1100 may begin when an encoder receives a sequence of PCC frames containing point cloud media. The encoder may determine to encode such frames, for example in response to receiving a user command. In method 1100, the encoder may determine that a first attribute should be encoded by a first codec while a second attribute should be encoded by a second codec. This determination may be made based on predetermined conditions, for example when the first codec is more efficient for the first attribute and the second codec is more efficient for the second attribute, and/or based on user input. Accordingly, at step 1101, the encoder encodes a first attribute of the sequence of PCC frames into a bitstream with a first codec. Further, at step 1103, the encoder encodes a second attribute of the sequence of PCC frames into the bitstream with a second codec that is different from the first codec.

At step 1105, the encoder encodes various syntax elements into the bitstream along with the encoded video data. For example, the syntax elements can be coded into a sequence-level data unit containing sequence-level parameters in order to indicate to the decoder the decisions made during encoding, so that the PCC frames can be properly reconstructed. Specifically, the encoder encodes the sequence-level data unit to include a first syntax element indicating that the first attribute is coded by the first codec and that the second attribute is coded by the second codec. As a particular example, the PCC frames may include a plurality of attributes including the first attribute and the second attribute. Also, the plurality of attributes of the PCC frames may include geometry, texture, and one or more of reflectance, transparency, and normal. Further, the first syntax element may be an identified_codec_for_attribute element included in a GOF header in the bitstream.

In some examples, the first attribute may be organized into a plurality of streams. In such a case, a second syntax element can be employed to indicate stream membership for data units of the bitstream associated with the first attribute. In some examples, the first attribute may also be organized into a plurality of layers. In such a case, a third syntax element may indicate layer membership for the data units of the bitstream associated with the first attribute. As a particular example, the second syntax element may be a num_streams_for_attribute element and the third syntax element may be a num_layers_for_attribute element, each of which may be included in a group of frames (GOF) header in the bitstream. In yet another example, a fourth syntax element may be employed to indicate that a first layer of the plurality of layers contains data associated with an irregular point cloud. As a particular example, the fourth syntax element may be a regular_points_flag element included in the GOF header in the bitstream.

By including such information in the sequence-level data, the encoder ensures that the decoder has sufficient information to decode the PCC video sequence. Accordingly, at step 1107, the encoder may transmit the bitstream to support generation of a decoded sequence of PCC frames based on the first attribute coded by the first codec and the second attribute coded by the second codec, as well as other attributes and/or syntax elements described herein.

FIG. 12 is a flowchart of an example method 1200 of decoding a PCC video sequence with multiple codecs. For example, method 1200 can read data from a bitstream according to mechanism 800 while employing attribute layers 931, 932, 933, and 934 and/or streams 1031, 1032, 1033, and/or 1034. Method 1200 may also determine the mechanisms used to code the attributes by reading the GOF header. Further, method 1200 may read PCC video stream 700 in order to reconstruct point cloud media frames 600 and point cloud media 500. Additionally, method 1200 may be employed by codec system 200 and/or decoder 400 while performing the decoding steps of method 100.

Method 1200 may begin at step 1201 when a decoder receives a bitstream containing a sequence of PCC frames. The decoder can then parse the bitstream, or portions thereof, at step 1205. For example, the decoder can parse the bitstream to obtain a sequence-level data unit containing sequence-level parameters. The sequence-level data unit may contain various syntax elements that describe the encoding process. Accordingly, the decoder can parse the video data from the bitstream and use the syntax elements to determine the proper processes for decoding the video data.

For example, the sequence-level data unit can contain a first syntax element indicating that a first attribute is coded by a first codec and that a second attribute is coded by a second codec. As a particular example, the PCC frames may include a plurality of attributes including the first attribute and the second attribute. Also, the plurality of attributes of the PCC frames may include geometry, texture, and one or more of reflectance, transparency, and normal. Additionally, the first syntax element may be an identified_codec_for_attribute element included in a GOF header in the bitstream.

In some examples, the first attribute may be organized into a plurality of streams. In such a case, a second syntax element can be employed to indicate stream membership for data units of the bitstream associated with the first attribute. In some examples, the first attribute may also be organized into a plurality of layers. In such a case, a third syntax element may indicate layer membership for the data units of the bitstream associated with the first attribute. As a particular example, the second syntax element may be a num_streams_for_attribute element and the third syntax element may be a num_layers_for_attribute element, each of which may be included in a group of frames (GOF) header in the bitstream. In yet another example, a fourth syntax element may be employed to indicate that a first layer of the plurality of layers contains data associated with an irregular point cloud. As a particular example, the fourth syntax element may be a regular_points_flag element included in the GOF header in the bitstream.

Accordingly, at step 1207, the decoder can decode the first attribute with the first codec and decode the second attribute with the second codec in order to generate a decoded sequence of PCC frames. The decoder may also employ other attributes and/or syntax elements as described herein when determining the proper mechanisms to employ when decoding the various attributes of the PCC video sequence based on the codecs.
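The encoding steps of method 1100 and the decoding steps of method 1200 can be illustrated together as a round trip. The Python sketch below rests on loudly stated assumptions: `zlib` and `lzma` are general-purpose compressors used purely as stand-ins for two different video codecs, the JSON header with a `sizes` field is a hypothetical container format, and the codec names `codec_a` and `codec_b` are invented for the example. Only the idea of recording the codec per attribute in sequence-level data comes from the text.

```python
import json
import lzma
import zlib
from typing import Dict

# Stand-in codecs (NOT video codecs): name -> (encode function, decode function).
CODECS = {
    "codec_a": (zlib.compress, zlib.decompress),
    "codec_b": (lzma.compress, lzma.decompress),
}


def encode_sequence(attributes: Dict[str, bytes], codec_for_attribute: Dict[str, str]) -> bytes:
    """Encode each attribute with its own codec and prepend a header
    recording the codec used per attribute (compare steps 1101 to 1105)."""
    header = {"identified_codec_for_attribute": codec_for_attribute, "sizes": {}}
    payloads = []
    for name in sorted(attributes):
        encode, _ = CODECS[codec_for_attribute[name]]
        coded = encode(attributes[name])
        header["sizes"][name] = len(coded)  # hypothetical field for demultiplexing
        payloads.append(coded)
    header_bytes = json.dumps(header, sort_keys=True).encode("utf-8")
    # Assumed framing: 4-byte big-endian header length, header, then payloads.
    return len(header_bytes).to_bytes(4, "big") + header_bytes + b"".join(payloads)


def decode_sequence(bitstream: bytes) -> Dict[str, bytes]:
    """Parse the header, then decode each attribute with the indicated codec
    (compare steps 1201 to 1207)."""
    header_len = int.from_bytes(bitstream[:4], "big")
    header = json.loads(bitstream[4:4 + header_len].decode("utf-8"))
    position = 4 + header_len
    decoded = {}
    for name in sorted(header["sizes"]):
        size = header["sizes"][name]
        _, decode = CODECS[header["identified_codec_for_attribute"][name]]
        decoded[name] = decode(bitstream[position:position + size])
        position += size
    return decoded
```

The decoder never needs to guess which codec applies: it reads the per-attribute indication from the header first and dispatches accordingly, which is the point of signaling the correspondence at the sequence level.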

FIG. 13 is a schematic diagram of an example video coding device 1300. Video coding device 1300 is suitable for implementing the disclosed examples/embodiments as described herein. Video coding device 1300 includes downstream ports 1320, upstream ports 1350, and/or a transceiver unit (Tx/Rx) 1310 including transmitters and/or receivers for communicating data upstream and/or downstream over a network. Video coding device 1300 also includes a processor 1330, including a logic unit and/or central processing unit (CPU), and a memory 1332 for storing data. Video coding device 1300 may also include electrical components, optical-to-electrical (OE) components, electrical-to-optical (EO) components, and/or wireless communication components coupled to upstream ports 1350 and/or downstream ports 1320 for communicating data via electrical, optical, or wireless communication networks. Video coding device 1300 may also include input and/or output (I/O) devices 1360 for communicating data to and from a user. I/O devices 1360 may include output devices such as a display for displaying video data and speakers for outputting audio data. I/O devices 1360 may also include input devices, such as a keyboard, mouse, or trackball, and/or corresponding interfaces for interacting with such output devices.

Processor 1330 is implemented by hardware and software. Processor 1330 may be implemented as one or more CPU chips, cores (e.g., as a multi-core processor), field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and digital signal processors (DSPs). Processor 1330 is in communication with downstream ports 1320, Tx/Rx 1310, upstream ports 1350, and memory 1332. Processor 1330 comprises a coding module 1314. Coding module 1314 implements the disclosed embodiments described above, such as methods 100, 1100, 1200, 1500, and 1600 and mechanism 800, which may employ point cloud media 500, point cloud media frames 600, and/or PCC video stream 700 coded into layers 931-934 and/or streams 1031-1034. Coding module 1314 may also implement any other method/mechanism described herein. Further, coding module 1314 may implement codec system 200, encoder 300, and/or decoder 400. For example, coding module 1314 can employ an extended set of attributes for PCC with multiple streams and layers, and can signal the use of such an attribute set in the sequence-level data to support decoding. Hence, coding module 1314 causes video coding device 1300 to provide additional functionality and/or flexibility when coding PCC video data. As such, coding module 1314 improves the functionality of video coding device 1300 and addresses problems that are specific to the video coding arts. Further, coding module 1314 effects a transformation of video coding device 1300 to a different state. Alternatively, coding module 1314 can be implemented as instructions stored in memory 1332 and executed by processor 1330 (e.g., as a computer program product stored on a non-transitory medium).

Memory 1332 includes one or more memory types such as disks, tape drives, solid-state drives, read-only memory (ROM), random-access memory (RAM), flash memory, ternary content-addressable memory (TCAM), and static random-access memory (SRAM). Memory 1332 may be used as an overflow data storage device, to store programs when such programs are selected for execution, and to store instructions and data that are read during program execution.

FIG. 14 is a schematic diagram of an example system 1400 for coding a PCC video sequence with multiple codecs. System 1400 includes a video encoder 1402 comprising a first attribute encoding module 1401 for encoding a first attribute of a sequence of PCC frames into a bitstream with a first codec. Video encoder 1402 further comprises a second attribute encoding module 1403 for encoding a second attribute of the sequence of PCC frames into the bitstream with a second codec different from the first codec. Video encoder 1402 further comprises a syntax encoding module 1405 for encoding into the bitstream a sequence-level data unit containing sequence-level parameters, where the sequence-level data unit contains a first syntax element indicating that the first attribute is coded by the first codec and that the second attribute is coded by the second codec. Video encoder 1402 further comprises a transmitting module 1407 for transmitting the bitstream to support generation of a decoded sequence of PCC frames based on the first attribute coded by the first codec and the second attribute coded by the second codec. The modules of video encoder 1402 may also be employed to perform any of the steps/items described above with respect to methods 1100 and/or 1500.

System 1400 also includes a video decoder 1410 comprising a receiving module 1411 for receiving a bitstream containing a sequence of PCC frames. Video decoder 1410 further comprises a parsing module 1413 for parsing the bitstream to obtain a sequence-level data unit containing sequence-level parameters, where the sequence-level data unit contains a first syntax element indicating that a first attribute of the PCC frames is coded by a first codec and that a second attribute of the PCC frames is coded by a second codec. Video decoder 1410 further comprises a decoding module 1415 for decoding the first attribute with the first codec and decoding the second attribute with the second codec to generate a decoded sequence of PCC frames. The modules of video decoder 1410 may also be employed to perform any of the steps/items described above with respect to methods 1200 and/or 1600.

FIG. 15 is a flowchart of another example method 1500 of encoding a PCC video sequence with multiple codecs. For example, method 1500 can organize data into a bitstream according to mechanism 800 while employing attribute layers 931, 932, 933, and 934 and/or streams 1031, 1032, 1033, and/or 1034. Method 1500 may also specify, in the GOF header, the mechanisms used to encode the attributes. Further, method 1500 may generate PCC video stream 700 by encoding point cloud media frames 600 based on point cloud media 500. Additionally, method 1500 may be employed by codec system 200 and/or encoder 300 while performing the encoding steps of method 100.

At step 1501, a plurality of PCC attributes are encoded into a bitstream as part of a sequence of PCC frames. The PCC attributes are encoded with a plurality of codecs. The PCC attributes include geometry and texture. The PCC attributes also include one or more of reflectance, transparency, and normals. Each coded PCC frame is represented by one or more PCC NAL units. At step 1503, an indication is encoded for each PCC attribute. The indication identifies the video codec used to code the corresponding PCC attribute. At step 1505, the bitstream is transmitted toward a decoder.
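As a non-limiting sketch of step 1503, the per-attribute codec indications might be serialized as shown below. The numeric attribute and codec identifiers, the one-byte count prefix, and the function name `encode_codec_indications` are all assumptions made for illustration; the actual syntax and coding of the indication are not reproduced here.

```python
import struct
from typing import Dict

# Hypothetical identifier tables (not taken from the disclosure).
ATTRIBUTE_IDS = {
    "geometry": 0,
    "texture": 1,
    "reflectance": 2,
    "transparency": 3,
    "normal": 4,
}
CODEC_IDS = {"hevc": 0, "avc": 1, "vvc": 2}


def encode_codec_indications(codec_for_attribute: Dict[str, str]) -> bytes:
    """Serialize one (attribute id, codec id) byte pair per PCC attribute."""
    # Geometry and texture are always among the coded attributes.
    for required in ("geometry", "texture"):
        if required not in codec_for_attribute:
            raise ValueError(f"missing required attribute: {required}")
    out = bytearray([len(codec_for_attribute)])  # one-byte attribute count
    for attribute, codec in codec_for_attribute.items():
        out += struct.pack("BB", ATTRIBUTE_IDS[attribute], CODEC_IDS[codec])
    return bytes(out)


header = encode_codec_indications({"geometry": "hevc", "texture": "avc"})
```

A header produced this way could be placed ahead of the coded attribute data, so that a receiver learns which codec applies to each attribute before decoding any payload.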

FIG. 16 is a flowchart of another example method 1600 of decoding a PCC video sequence with multiple codecs. For example, method 1600 may read data from a bitstream according to mechanism 800 while using attribute layers 931, 932, 933, and 934 and/or streams 1031, 1032, 1033, and/or 1034. Method 1600 may also determine the mechanism used to code the attributes by reading a GOF header. Further, method 1600 may read PCC video stream 700 in order to reconstruct point cloud media frames 600 and point cloud media 500. Additionally, method 1600 may be employed by codec system 200 and/or decoder 400 while performing the decoding steps of method 100.

At step 1601, a bitstream is received. The bitstream includes a plurality of coded sequences of PCC frames. The coded sequences of PCC frames represent a plurality of PCC attributes. The PCC attributes include geometry and texture. The PCC attributes also include one or more of reflectance, transparency, and normals. Each coded PCC frame is represented by one or more PCC NAL units. At step 1603, the bitstream is parsed to obtain, for each PCC attribute, an indication of the codec used to code the corresponding PCC attribute. At step 1605, the bitstream is decoded based on the video codecs indicated for the PCC attributes.
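As a non-limiting sketch of the parsing at step 1603, the per-attribute codec indications might be recovered from a small header as shown below. The byte layout (a one-byte count followed by attribute id and codec id pairs), the identifier tables, and the function name `parse_codec_indications` are assumptions made for illustration only and do not reproduce the actual bitstream syntax.

```python
from typing import Dict

# Hypothetical identifier tables (not taken from the disclosure).
ATTRIBUTE_NAMES = {
    0: "geometry",
    1: "texture",
    2: "reflectance",
    3: "transparency",
    4: "normal",
}
CODEC_NAMES = {0: "hevc", 1: "avc", 2: "vvc"}


def parse_codec_indications(header: bytes) -> Dict[str, str]:
    """Return a mapping of attribute name -> indicated codec name."""
    if not header:
        raise ValueError("empty header")
    count = header[0]  # one-byte attribute count
    if len(header) < 1 + 2 * count:
        raise ValueError("truncated header")
    indications = {}
    for i in range(count):
        attribute_id = header[1 + 2 * i]
        codec_id = header[2 + 2 * i]
        indications[ATTRIBUTE_NAMES[attribute_id]] = CODEC_NAMES[codec_id]
    return indications


# Example header: two attributes, geometry -> codec 0, texture -> codec 1.
indications = parse_codec_indications(bytes([2, 0, 0, 1, 1]))
```

The resulting mapping is what step 1605 would consult to select a decoder for each attribute.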

A first component is directly coupled to a second component when there are no intervening components, except for a line, a trace, or another medium, between the first component and the second component. The first component is indirectly coupled to the second component when there are intervening components other than a line, a trace, or another medium between the first component and the second component. The term "coupled" and its variants include both directly coupled and indirectly coupled. The use of the term "about" means a range including ±10% of the subsequent number unless otherwise stated.

While several embodiments have been provided in the present disclosure, it will be understood that the disclosed systems and methods may be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system, or certain features may be omitted or not implemented.

In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, components, techniques, or methods without departing from the scope of the present disclosure. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and may be made without departing from the spirit and scope disclosed herein.

Claims

A method implemented by a decoder, comprising:
receiving, by the decoder, a bitstream comprising an encoded sequence of three-dimensional (3D) content frames, wherein the encoded sequence of 3D content frames represents a plurality of 3D content attributes, each encoded 3D content frame being represented by one or more 3D content network abstraction layer (NAL) units;
parsing, by the decoder, the bitstream to obtain, for each 3D content attribute, an indication of one of a plurality of video coder-decoders (codecs) used to encode the corresponding 3D content attribute; and
decoding, by the decoder, the bitstream based on the indicated video codec for the 3D content attributes.

Each sequence of three-dimensional (3D) content frames is associated with a sequence-level data unit including sequence-level parameters, the sequence-level data unit including a first syntax element indicating that a first attribute was encoded by a first video codec and that a second attribute was encoded by a second video codec.

3. The method of claim 2, wherein the first syntax element is an identified_codec_for_attribute element included in a group of frames header in the bitstream.

The first attribute is organized into a plurality of streams, a second syntax element indicates stream membership for data units of the bitstream associated with the first attribute, and the second syntax element is a num_streams_for_attribute element contained in a group of frames header in the bitstream.

5. The method of any one of claims 1-4, wherein the first attribute is organized into a plurality of layers and a third syntax element indicates layer membership for data units of the bitstream associated with the first attribute.

6. The method of claim 5, wherein the third syntax element is a num_layers_for_attribute element included in a group of frames header in the bitstream.

The method according to any one of claims 1 to 6, wherein a fourth syntax element indicates that a first layer of the plurality of layers contains data associated with an irregular point cloud.

8. The method of claim 7, wherein the fourth syntax element is a regular_points_flag element included in a group of frames header in the bitstream.

The method according to any one of claims 1 to 8, wherein the bitstream is decoded into a decoded sequence of 3D content frames, and the method further comprises forwarding, by the decoder, the decoded sequence of 3D content frames toward a display for presentation.

A method implemented in an encoder, comprising:
encoding, by the encoder, a plurality of 3D content attributes of a sequence of three-dimensional (3D) content frames into a bitstream with a plurality of coder-decoders (codecs), wherein each encoded 3D content frame is represented by one or more 3D content network abstraction layer (NAL) units;
encoding, by the encoder, for each 3D content attribute an indication of one of the video codecs used to encode the corresponding 3D content attribute;
and transmitting, by the encoder, the bitstream towards a decoder.

Each sequence of 3D content frames is associated with a sequence-level data unit including sequence-level parameters, the sequence-level data unit including a first syntax element indicating that a first 3D content attribute was encoded by a first video codec and that a second 3D content attribute was encoded by a second video codec.

The method according to any one of claims 10 to 11, wherein the first syntax element is an identified_codec_for_attribute element contained in a group of frames header in the bitstream.

14. The method of claim 13, wherein the first attribute is organized into a plurality of layers and a third syntax element indicates layer membership for data units of the bitstream associated with the first attribute.

The method according to any one of claims 10 to 14, wherein the third syntax element is a num_layers_for_attribute element included in a group of frames header in the bitstream.

The method according to any one of claims 10 to 15, wherein a fourth syntax element indicates that a first layer of the plurality of layers contains data associated with an irregular point cloud.

17. The method of claim 16, wherein the fourth syntax element is a regular_points_flag element included in a group of frames header in the bitstream.

A decoding device, comprising:
a non-transitory memory storage device configured to store video data in the form of a bitstream; and
a decoder configured to perform the method according to any one of claims 1-9.

An encoding device, comprising:
a non-transitory memory storage device configured to store video data in the form of a bitstream; and
an encoder configured to perform the method of any one of claims 10-17.

A non-transitory computer-readable medium containing computer-executable instructions stored on the non-transitory computer-readable medium, the instructions being operable to cause a video coding device to carry out the method of any one of claims 1 to 3.

A decoder comprising processing circuitry for performing the method of any one of claims 1-9.

An encoder comprising processing circuitry for performing the method of any one of claims 10-17.

A non-transitory storage medium comprising a bitstream, the bitstream comprising an encoded sequence of three-dimensional (3D) content frames, wherein the encoded sequence of 3D content frames represents a plurality of 3D content attributes, each encoded 3D content frame is represented by one or more 3D content network abstraction layer (NAL) units, and the bitstream is used to be parsed by a decoder to obtain, for each 3D content attribute, an indication of one of a plurality of video coder-decoders (codecs) used to encode the corresponding 3D content attribute, and to be decoded by the decoder based on the indicated video codecs for the 3D content attributes.

A non-transitory storage medium comprising a bitstream, the bitstream being generated by: encoding, by an encoder, a plurality of 3D content attributes of a plurality of three-dimensional (3D) content frames into the bitstream, wherein each encoded 3D content frame is represented by one or more 3D content network abstraction layer (NAL) units; and encoding, for each 3D content attribute, an indication of one of the video codecs used to encode the corresponding 3D content attribute.

An encoder, comprising:
first attribute encoding means and second attribute encoding means for encoding a plurality of 3D content attributes of a sequence of three-dimensional (3D) content frames into a bitstream with a plurality of coder-decoders (codecs), wherein each encoded 3D content frame is represented by one or more 3D content network abstraction layer (NAL) units;
syntax encoding means for encoding, for each 3D content attribute, an indication of one of the video codecs used to encode the corresponding 3D content attribute; and
transmitting means for transmitting the bitstream toward a decoder.

The encoder of claim 25, wherein the encoder is further configured to perform the method of any one of claims 10-17.

A decoder, comprising:
receiving means for receiving a bitstream comprising an encoded sequence of three-dimensional (3D) content frames, wherein the encoded sequence of 3D content frames represents a plurality of 3D content attributes, and each encoded 3D content frame is represented by one or more 3D content network abstraction layer (NAL) units;
parsing means for parsing the bitstream to obtain, for each 3D content attribute, an indication of one of a plurality of video coder-decoders (codecs) used to encode the corresponding 3D content attribute; and
decoding means for decoding the bitstream based on the indicated video codecs for the 3D content attributes.

The decoder according to claim 27, wherein the decoder is further configured to perform the method according to any one of claims 1-9.