TW202234894A

TW202234894A - Hybrid-tree coding for inter and intra prediction for geometry coding

Info

Publication number: TW202234894A
Application number: TW110149192A
Authority: TW
Inventors: Bappaditya Ray; Adarsh Krishnan Ramasubramonian; Luong Pham Van; Geert Van der Auwera; Marta Karczewicz
Original assignee: Qualcomm Incorporated
Priority date: 2020-12-29
Filing date: 2021-12-28
Publication date: 2022-09-01
Also published as: EP4272166A1; WO2022147015A1; JP2024501966A; KR20230127219A

Abstract

A device for decoding a bitstream that includes point cloud data is configured to determine an octree that defines an octree-based splitting of a space containing the point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and directly decode positions of each of the one or more points in the leaf node, wherein to directly decode the positions of each of the one or more points in the leaf node, the one or more processors are further configured to: generate a prediction of the one or more points; and determine the one or more points based on the prediction.

Description

Hybrid tree coding for inter and intra prediction for geometry coding

This application claims the benefit of U.S. Provisional Patent Application No. 63/131,546, filed December 29, 2020, the entire content of which is incorporated herein by reference.

This disclosure relates to point cloud encoding and decoding.

In general, this disclosure describes a hybrid-tree coding method that combines octree coding and predictive coding for enhanced inter/intra prediction at the block level for point cloud compression.

In one example, this disclosure describes a method of coding a point cloud, the method comprising: determining an octree that defines an octree-based splitting of a space containing the point cloud, wherein a leaf node of the octree contains one or more points of the point cloud, and the position of each of the one or more points in the leaf node is directly signaled; generating a prediction of the one or more points using intra prediction or inter prediction; and coding a syntax element that indicates whether the one or more points are predicted using intra prediction or inter prediction.

According to one example of this disclosure, a device for decoding a bitstream that includes point cloud data includes: a memory to store the point cloud data; and one or more processors coupled to the memory and implemented in circuitry, the one or more processors configured to: determine an octree that defines an octree-based splitting of a space containing the point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and directly decode the position of each of the one or more points in the leaf node, wherein to directly decode the position of each of the one or more points in the leaf node, the one or more processors are further configured to: generate a prediction of the one or more points; and determine the one or more points based on the prediction.

According to another example of this disclosure, a method of decoding point cloud data includes: determining an octree that defines an octree-based splitting of a space containing the point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and directly decoding the position of each of the one or more points in the leaf node, wherein directly decoding the position of each of the one or more points in the leaf node comprises: generating a prediction of the one or more points; and determining the one or more points based on the prediction.

According to another example of this disclosure, a computer-readable storage medium stores instructions that, when executed by one or more processors, cause the one or more processors to: determine an octree that defines an octree-based splitting of a space containing a point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and directly decode the position of each of the one or more points in the leaf node, wherein to directly decode the position of each of the one or more points in the leaf node, the instructions cause the one or more processors to: generate a prediction of the one or more points; and determine the one or more points based on the prediction.

According to another example of this disclosure, an apparatus includes: means for determining an octree that defines an octree-based splitting of a space containing a point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and means for directly decoding the position of each of the one or more points in the leaf node, wherein the means for directly decoding the position of each of the one or more points in the leaf node comprises: means for generating a prediction of the one or more points; and means for determining the one or more points based on the prediction.

The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.

A point cloud is a collection of points in three-dimensional (3D) space. The points may correspond to points on objects within the 3D space. Thus, a point cloud may be used to represent the physical content of the 3D space. Point clouds may have utility in a wide variety of situations. For example, point clouds may be used in the context of autonomous vehicles to represent the positions of objects on a roadway. In another example, point clouds may be used to represent the physical content of an environment for the purpose of positioning virtual objects in an augmented reality (AR) or mixed reality (MR) application. Point cloud compression is a process for encoding and decoding point clouds. Encoding a point cloud may reduce the amount of data required to store and transmit the point cloud.

There have previously been two primary proposals for signaling the positions of points in a point cloud: octree coding and predictive tree coding. As part of encoding point cloud data using octree coding, a G-PCC encoder may generate an octree. Each node of the octree corresponds to a cubic space. A node of the octree may have zero child nodes or eight child nodes. In other examples, nodes may be divided into child nodes according to other tree structures. The child nodes of a parent node correspond to equally sized cubes within the cube corresponding to the parent node. The positions of individual points of the point cloud may be signaled relative to the origin of a node. If a node does not contain any points of the point cloud, the node is said to be unoccupied. If the node is unoccupied, it may be unnecessary to signal additional data regarding the node. Conversely, if a node contains one or more points of the point cloud, the node is said to be occupied.
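For purposes of illustration only, the occupancy of the eight child cubes of a node may be sketched as follows. The function names, the bit ordering of the mask, and the use of integer coordinates are assumptions made for this sketch and are not taken from this disclosure or from any G-PCC specification.

```python
from typing import Iterable, Tuple

Point = Tuple[int, int, int]


def child_index(point: Point, origin: Point, half: int) -> int:
    """Return 0..7, identifying which of the eight equal child cubes holds the point."""
    x, y, z = point
    ox, oy, oz = origin
    # One bit per axis: set when the point lies in the upper half along that axis.
    # (This bit ordering is an assumption of the sketch.)
    return (int(x >= ox + half) << 2) | (int(y >= oy + half) << 1) | int(z >= oz + half)


def occupancy_mask(points: Iterable[Point], origin: Point, size: int) -> int:
    """Return an 8-bit mask; bit i is set when child cube i contains at least one point."""
    half = size // 2
    mask = 0
    for p in points:
        mask |= 1 << child_index(p, origin, half)
    return mask


# A node of size 8 at the origin with points in two opposite corners.
mask = occupancy_mask([(0, 0, 0), (7, 7, 7)], origin=(0, 0, 0), size=8)
print(f"{mask:08b}")  # prints 10000001
```

A mask of zero corresponds to an unoccupied node, for which no further data would need to be signaled in this sketch.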

When encoding point cloud data using predictive tree coding, a G-PCC encoder determines a prediction mode for each point of the point cloud. The prediction mode for a point may be one of the following:
(1) No prediction/zero prediction (0)
(2) Delta prediction (p0)
(3) Linear prediction (2*p0 - p1)
(4) Parallelogram prediction (2*p0 + p1 - p2)

Where the prediction mode for a point is "no prediction/zero prediction," the point is treated as a root point (i.e., a root vertex), and the coordinates of the point (e.g., x, y, z coordinates) are signaled in the bitstream. Where the prediction mode for a point is "delta prediction," the G-PCC encoder determines the difference (i.e., the delta) between the coordinates of the point and the coordinates of a parent point (such as the root point or another point). Where the prediction mode is "linear prediction," the G-PCC encoder determines predicted coordinates of the point using a linear prediction from the coordinates of two parent points. The G-PCC encoder signals the difference between the predicted coordinates determined using linear prediction and the actual coordinates of the point. Where the prediction mode is "parallelogram prediction," the G-PCC encoder uses three parent points to determine the predicted coordinates. The G-PCC encoder then signals the difference between the predicted coordinates and the actual coordinates of the point (e.g., a "primary residual"). The prediction relationships between points essentially define a tree of points.
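As a non-limiting sketch, the four prediction modes and the resulting primary residual may be expressed as follows. The mode numbering, function names, and tuple representation are assumptions of this sketch. The parallelogram expression is reproduced exactly as written in the list above; the conventional geometric parallelogram rule would instead be p0 + p1 - p2.

```python
from typing import Optional, Tuple

Point = Tuple[int, int, int]


def predict(mode: int,
            p0: Optional[Point] = None,
            p1: Optional[Point] = None,
            p2: Optional[Point] = None) -> Point:
    """Return the predicted coordinates for a point from its ancestor points."""
    if mode == 0:  # no prediction / zero prediction: the point is a root point
        return (0, 0, 0)
    if mode == 1:  # delta prediction: predictor is the parent point p0
        return p0
    if mode == 2:  # linear prediction: 2*p0 - p1
        return tuple(2 * a - b for a, b in zip(p0, p1))
    if mode == 3:  # parallelogram prediction, as written in the text: 2*p0 + p1 - p2
        return tuple(2 * a + b - c for a, b, c in zip(p0, p1, p2))
    raise ValueError(f"unknown prediction mode {mode}")


def primary_residual(point: Point, mode: int, *ancestors: Point) -> Point:
    """Encoder side: actual coordinates minus predicted coordinates."""
    pred = predict(mode, *ancestors)
    return tuple(a - b for a, b in zip(point, pred))


def reconstruct(residual: Point, mode: int, *ancestors: Point) -> Point:
    """Decoder side: predicted coordinates plus the signaled residual."""
    pred = predict(mode, *ancestors)
    return tuple(a + b for a, b in zip(pred, residual))


p0, p1 = (10, 10, 10), (8, 9, 10)
res = primary_residual((13, 11, 9), 2, p0, p1)
print(res)                          # prints (1, 0, -1)
print(reconstruct(res, 2, p0, p1))  # prints (13, 11, 9)
```

With mode 0, the residual equals the coordinates themselves, which corresponds to signaling the root point directly.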

It has been observed experimentally that octree coding may be better suited to dense point clouds than predictive tree coding. Point clouds acquired using 3D modeling are typically dense enough that octree coding works better. However, point clouds acquired using LiDAR, for example in automotive applications, tend to be somewhat coarse, and predictive coding may therefore work better for these applications.

In some examples, an angular mode may be used to represent the coordinates of points in a spherical coordinate system. Because the conversion process between the spherical coordinate system and the Cartesian coordinate system (e.g., x, y, z) is not perfect, information may be lost. However, because the G-PCC encoder can itself perform the conversion process, the G-PCC encoder may signal a "secondary residual" for a point, which indicates the difference between the Cartesian coordinates of the point that result from applying the conversion process to the spherical coordinates of the point and the original Cartesian coordinates of the point.
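The role of the secondary residual may be illustrated with the following simplified sketch. The spherical representation used here (rounded radius, azimuth, and elevation with a hypothetical quantization of `ANGLE_STEPS` steps per turn) is an assumption chosen for illustration and is not the representation used by any particular G-PCC angular mode.

```python
import math

ANGLE_STEPS = 4096  # hypothetical angular quantization: steps per full turn


def to_spherical(p):
    """Lossy conversion of integer Cartesian coordinates to quantized spherical ones."""
    x, y, z = p
    r = round(math.sqrt(x * x + y * y + z * z))
    azimuth = round(math.atan2(y, x) * ANGLE_STEPS / (2 * math.pi))
    elevation = round(math.atan2(z, math.hypot(x, y)) * ANGLE_STEPS / (2 * math.pi))
    return r, azimuth, elevation


def to_cartesian(s):
    """Convert quantized spherical coordinates back to integer Cartesian coordinates."""
    r, azimuth, elevation = s
    phi = azimuth * 2 * math.pi / ANGLE_STEPS
    theta = elevation * 2 * math.pi / ANGLE_STEPS
    return (round(r * math.cos(theta) * math.cos(phi)),
            round(r * math.cos(theta) * math.sin(phi)),
            round(r * math.sin(theta)))


def secondary_residual(original, spherical):
    """Encoder side: original Cartesian coordinates minus the converted ones."""
    converted = to_cartesian(spherical)
    return tuple(o - c for o, c in zip(original, converted))


def reconstruct(spherical, residual):
    """Decoder side: repeat the same conversion, then add the secondary residual."""
    converted = to_cartesian(spherical)
    return tuple(c + d for c, d in zip(converted, residual))


point = (1000, 370, -52)
sph = to_spherical(point)
sec = secondary_residual(point, sph)
assert reconstruct(sph, sec) == point  # the conversion loss is fully compensated
```

Because the encoder and the decoder in this sketch run the identical conversion, the secondary residual restores the original coordinates exactly, whatever the rounding error of the conversion.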

This disclosure relates to a hybrid coding model in which a point cloud is coded using both octree coding and direct coding. For example, octree coding may initially be used to divide a space into nodes down to a specific level. The nodes at the specific level (and other occupied nodes of the octree that are not further split) may be referred to as "leaf nodes." The points within the volume of a leaf node may be coded using a "direct" coding mode.
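One way to picture the octree stage of the hybrid model is to group points by the leaf node that contains them once the space has been split down to a chosen level. The sketch below assumes non-negative integer coordinates, a root cube anchored at (0, 0, 0), and a power-of-two root size; the function name and parameters are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Point = Tuple[int, int, int]


def split_into_leaves(points: Iterable[Point],
                      root_size: int,
                      leaf_depth: int) -> Dict[Point, List[Point]]:
    """Group points by the origin of the leaf cube reached after leaf_depth splits."""
    leaf_size = root_size >> leaf_depth  # each split halves the cube edge
    leaves: Dict[Point, List[Point]] = defaultdict(list)
    for p in points:
        # Snap each coordinate down to the origin of the containing leaf cube.
        origin = tuple((c // leaf_size) * leaf_size for c in p)
        leaves[origin].append(p)
    return dict(leaves)


leaves = split_into_leaves([(1, 2, 3), (5, 6, 7), (9, 1, 1)], root_size=16, leaf_depth=1)
print(leaves)
# prints {(0, 0, 0): [(1, 2, 3), (5, 6, 7)], (8, 0, 0): [(9, 1, 1)]}
```

Only occupied leaves appear in the result; the points listed under each leaf are the ones that would then be coded with the "direct" coding mode.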

When encoding the points of a leaf node in the "direct" coding mode, the G-PCC encoder may select either an intra prediction mode or an inter prediction mode for the leaf node. The G-PCC encoder may signal whether the points of the leaf node are encoded using the intra prediction mode or the inter prediction mode.

If the G-PCC encoder selects the intra prediction mode for the leaf node, the G-PCC encoder may encode the points in the leaf node using predictive tree coding in much the same way as described above. That is, the G-PCC encoder may select from among the four prediction modes and signal the coordinates of the points accordingly. However, rather than signaling coordinates relative to the origin of the entire space associated with the octree, the G-PCC encoder may signal coordinates relative to the origin of the leaf node. This may increase coding efficiency, especially for the root node of the prediction tree, whose coordinates are signaled without prediction.
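The potential saving from signaling coordinates relative to the leaf node origin can be illustrated with a small calculation. The sizes below (a 4096-unit space and a 16-unit leaf node) and the fixed-length bit count are hypothetical values chosen for this sketch; an actual encoder would typically use entropy coding rather than fixed-length fields.

```python
def to_local(point, leaf_origin):
    """Express a point relative to the origin of its leaf node."""
    return tuple(c - o for c, o in zip(point, leaf_origin))


def to_global(local, leaf_origin):
    """Recover the coordinates in the full space from leaf-relative coordinates."""
    return tuple(c + o for c, o in zip(local, leaf_origin))


def bits_per_component(size):
    """Fixed-length bits needed to address positions 0 .. size-1 along one axis."""
    return (size - 1).bit_length()


root_point = (1000, 2003, 3001)
leaf_origin = (992, 2000, 2992)  # origin of a hypothetical 16-unit leaf node

local = to_local(root_point, leaf_origin)
print(local)                     # prints (8, 3, 9)
print(bits_per_component(4096))  # prints 12 (relative to the whole space)
print(bits_per_component(16))    # prints 4  (relative to the leaf node)
```

In this sketch a root point costs 3 x 4 bits instead of 3 x 12 bits, which is consistent with the observation that the gain is largest for points whose coordinates are signaled without prediction.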

If the G-PCC encoder selects the inter prediction mode for the leaf node, the G-PCC encoder may encode the points in the leaf node relative to a set of points in a reference frame. The reference frame may be a previously coded frame, analogous to a previous frame of video. The G-PCC encoder may perform motion estimation to identify, in the reference frame, a set of points having a spatial arrangement similar to that of the points in the leaf node. A motion vector for the leaf node indicates the displacement between the points of the leaf node and the set of points identified in the reference frame.

The G-PCC encoder may signal a set of parameters for the leaf node. The parameters for the leaf node may include a reference index that identifies the reference frame. The parameters for the leaf node may also include a value indicating the number of points in the leaf node.

The parameters for the leaf node may also include a residual value for each point in the leaf node. The residual value for a point in the leaf node indicates the difference between the predicted coordinates of the point (as determined by adding the motion vector of the leaf node to the point in the reference frame that corresponds to the point in the leaf node) and the actual coordinates of the point. In examples where the angular mode is used, the G-PCC encoder may also signal a secondary residual for the point.
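A minimal sketch of inter prediction for the points of a leaf node is given below. It assumes that the correspondence between leaf points and reference points is simply their order in two equal-length lists, which is an assumption of this sketch; how corresponding points are identified is not addressed here.

```python
from typing import List, Tuple

Point = Tuple[int, int, int]


def add(a: Point, b: Point) -> Point:
    return tuple(x + y for x, y in zip(a, b))


def sub(a: Point, b: Point) -> Point:
    return tuple(x - y for x, y in zip(a, b))


def inter_encode(points: List[Point], ref_points: List[Point], mv: Point) -> List[Point]:
    """Residual per point: actual position minus (reference point + motion vector)."""
    return [sub(p, add(r, mv)) for p, r in zip(points, ref_points)]


def inter_decode(residuals: List[Point], ref_points: List[Point], mv: Point) -> List[Point]:
    """Reconstruct each point from its reference point, the motion vector, and the residual."""
    return [add(add(r, mv), d) for d, r in zip(residuals, ref_points)]


ref = [(10, 10, 10), (12, 10, 10)]      # points identified in the reference frame
cur = [(13, 10, 9), (15, 11, 9)]        # points in the current leaf node
mv = (3, 0, -1)                         # motion vector for the leaf node

res = inter_encode(cur, ref, mv)
print(res)                              # prints [(0, 0, 0), (0, 1, 0)]
print(inter_decode(res, ref, mv))       # prints [(13, 10, 9), (15, 11, 9)]
```

When the motion vector describes the displacement well, most residual components are zero or small, which is what makes them inexpensive to signal.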

In some examples, the parameters for the leaf node also include a motion vector difference (MVD). The MVD indicates the difference between the motion vector of the leaf node and a predicted motion vector. The predicted motion vector is the motion vector of a neighboring node of the octree. The parameters for the leaf node may include an index that identifies the neighboring node.

In other examples, similar to merge mode in conventional video coding, the parameters for the leaf node do not include an MVD, and the motion vector of the leaf node may be assumed to be the same as the motion vector of the identified neighboring node.
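The two ways of conveying the motion vector of a leaf node described above may be sketched as follows. The function name, the list of candidate neighbor motion vectors, and the use of `None` to represent an absent MVD are assumptions of this sketch.

```python
from typing import Optional, Sequence, Tuple

Vector = Tuple[int, int, int]


def reconstruct_motion_vector(neighbor_mvs: Sequence[Vector],
                              neighbor_index: int,
                              mvd: Optional[Vector] = None) -> Vector:
    """Derive the motion vector of a leaf node from a neighboring node's motion vector."""
    predicted = neighbor_mvs[neighbor_index]  # the signaled index selects the neighbor
    if mvd is None:
        # Merge-like case: no MVD is signaled, so the neighbor's vector is reused as is.
        return predicted
    # MVD case: the signaled difference is added to the predicted motion vector.
    return tuple(p + d for p, d in zip(predicted, mvd))


candidates = [(1, 0, 0), (4, -2, 3)]
print(reconstruct_motion_vector(candidates, 1, mvd=(1, 1, -1)))  # prints (5, -1, 2)
print(reconstruct_motion_vector(candidates, 1))                  # prints (4, -2, 3)
```

The merge-like case trades accuracy of the motion vector for the bits that the MVD would otherwise cost.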

In some examples, signaling of the residuals may be skipped. In some such examples where the angular mode is used, signaling of the primary residual may be skipped while the secondary residual is still signaled.

FIG. 1 is a block diagram illustrating an example encoding and decoding system 100 that may perform the techniques of this disclosure. The techniques of this disclosure are generally directed to coding (encoding and/or decoding) point cloud data, i.e., to support point cloud compression. In general, point cloud data includes any data for processing a point cloud. The coding may be effective in compressing and/or decompressing point cloud data.

As shown in FIG. 1, system 100 includes a source device 102 and a target device 116. Source device 102 provides encoded point cloud data to be decoded by target device 116. In particular, in the example of FIG. 1, source device 102 provides the point cloud data to target device 116 via a computer-readable medium 110. Source device 102 and target device 116 may comprise any of a wide range of devices, including desktop computers, notebook (i.e., laptop) computers, tablet computers, set-top boxes, telephone handsets (such as smartphones), televisions, cameras, display devices, digital media players, video gaming consoles, video streaming devices, terrestrial or marine vehicles, spacecraft, aircraft, robots, LIDAR devices, satellites, and the like. In some cases, source device 102 and target device 116 may be equipped for wireless communication.

In the example of FIG. 1, source device 102 includes a data source 104, a memory 106, a G-PCC encoder 200, and an output interface 108. Target device 116 includes an input interface 122, a G-PCC decoder 300, a memory 120, and a data consumer 118. In accordance with this disclosure, G-PCC encoder 200 of source device 102 and G-PCC decoder 300 of target device 116 may be configured to apply the techniques of this disclosure related to a hybrid-tree coding method that combines octree coding and predictive coding for enhanced inter/intra prediction at the block level for point cloud compression. Thus, source device 102 represents an example of an encoding device, while target device 116 represents an example of a decoding device. In other examples, source device 102 and target device 116 may include other components or arrangements. For example, source device 102 may receive data (e.g., point cloud data) from an internal or external source. Likewise, target device 116 may interface with an external data consumer, rather than including a data consumer in the same device.

System 100 as shown in FIG. 1 is merely one example. In general, other digital encoding and/or decoding devices may perform the techniques of this disclosure related to combining octree coding and predictive coding for enhanced inter/intra prediction at the block level for point cloud compression. Source device 102 and target device 116 are merely examples of such devices, in which source device 102 generates coded data for transmission to target device 116. This disclosure refers to a "coding" device as a device that performs coding (encoding and/or decoding) of data. Thus, G-PCC encoder 200 and G-PCC decoder 300 represent examples of coding devices, in particular, an encoder and a decoder, respectively. In some examples, source device 102 and target device 116 may operate in a substantially symmetrical manner such that each of source device 102 and target device 116 includes encoding and decoding components. Hence, system 100 may support one-way or two-way transmission between source device 102 and target device 116, e.g., for streaming, playback, broadcasting, telephony, navigation, and other applications.

In general, data source 104 represents a source of data (i.e., raw, unencoded point cloud data) and may provide a sequential series of "frames" of the data to G-PCC encoder 200, which encodes the data for the frames. Data source 104 of source device 102 may include a point cloud capture device, such as any of a variety of cameras or sensors, e.g., a 3D scanner or a light detection and ranging (LIDAR) device, one or more video cameras, an archive containing previously captured data, and/or a data feed interface to receive data from a data content provider. Alternatively or additionally, point cloud data may be computer-generated from scanner, camera, sensor, or other data. For example, data source 104 may generate computer graphics-based data as the source data, or produce a combination of live data, archived data, and computer-generated data. In each case, G-PCC encoder 200 encodes the captured, pre-captured, or computer-generated data. G-PCC encoder 200 may rearrange the frames from the received order (sometimes referred to as "display order") into a coding order for coding. G-PCC encoder 200 may generate one or more bitstreams including the encoded data. Source device 102 may then output the encoded data via output interface 108 onto computer-readable medium 110 for reception and/or retrieval by, e.g., input interface 122 of target device 116.

Memory 106 of source device 102 and memory 120 of target device 116 may represent general-purpose memories. In some examples, memory 106 and memory 120 may store raw data, e.g., raw data from data source 104 and raw, decoded data from G-PCC decoder 300. Additionally or alternatively, memory 106 and memory 120 may store software instructions executable by, e.g., G-PCC encoder 200 and G-PCC decoder 300, respectively. Although memory 106 and memory 120 are shown separately from G-PCC encoder 200 and G-PCC decoder 300 in this example, it should be understood that G-PCC encoder 200 and G-PCC decoder 300 may also include internal memories for functionally similar or equivalent purposes. Furthermore, memory 106 and memory 120 may store encoded data, e.g., data output from G-PCC encoder 200 and input to G-PCC decoder 300. In some examples, portions of memory 106 and memory 120 may be allocated as one or more buffers, e.g., to store raw, decoded, and/or encoded data. For instance, memory 106 and memory 120 may store data representing a point cloud.

Computer-readable medium 110 may represent any type of medium or device capable of transporting the encoded data from source device 102 to target device 116. In one example, computer-readable medium 110 represents a communication medium that enables source device 102 to transmit encoded data directly to target device 116 in real time, e.g., via a radio frequency network or a computer-based network. Output interface 108 may modulate a transmission signal including the encoded data, and input interface 122 may demodulate the received transmission signal, according to a communication standard, such as a wireless communication protocol. The communication medium may comprise any wireless or wired communication medium, such as the radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may form part of a packet-based network, such as a local area network, a wide-area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations, or any other equipment that may be useful to facilitate communication from source device 102 to target device 116.

In some examples, source device 102 may output encoded data from output interface 108 to a storage device 112. Similarly, target device 116 may access encoded data from storage device 112 via input interface 122. Storage device 112 may include any of a variety of distributed or locally accessed data storage media, such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded data.

In some examples, source device 102 may output encoded data to a file server 114 or another intermediate storage device that may store the encoded data generated by source device 102. Target device 116 may access stored data from file server 114 via streaming or download. File server 114 may be any type of server device capable of storing encoded data and transmitting that encoded data to target device 116. File server 114 may represent a web server (e.g., for a website), a File Transfer Protocol (FTP) server, a content delivery network device, or a network attached storage (NAS) device. Target device 116 may access encoded data from file server 114 through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., digital subscriber line (DSL), cable modem, etc.), or a combination of both that is suitable for accessing encoded data stored on file server 114. File server 114 and input interface 122 may be configured to operate according to a streaming transmission protocol, a download transmission protocol, or a combination thereof.

Output interface 108 and input interface 122 may represent wireless transmitters/receivers, modems, wired networking components (e.g., Ethernet cards), wireless communication components that operate according to any of a variety of IEEE 802.11 standards, or other physical components. In examples where output interface 108 and input interface 122 comprise wireless components, output interface 108 and input interface 122 may be configured to transfer data, such as encoded data, according to a cellular communication standard, such as 4G, 4G-LTE (Long-Term Evolution), LTE Advanced, 5G, or the like. In some examples where output interface 108 comprises a wireless transmitter, output interface 108 and input interface 122 may be configured to transfer data, such as encoded data, according to other wireless standards, such as an IEEE 802.11 specification, an IEEE 802.15 specification (e.g., ZigBee™), a Bluetooth™ standard, or the like. In some examples, source device 102 and/or target device 116 may include respective system-on-a-chip (SoC) devices. For example, source device 102 may include an SoC device to perform the functionality attributed to G-PCC encoder 200 and/or output interface 108, and target device 116 may include an SoC device to perform the functionality attributed to G-PCC decoder 300 and/or input interface 122.

The techniques of this disclosure may be applied to encoding and decoding in support of any of a variety of applications, such as communication between autonomous vehicles, communication between scanners, cameras, sensors, and processing devices such as local or remote servers, geographic mapping, or other applications.

Input interface 122 of target device 116 receives an encoded bitstream from computer-readable medium 110 (e.g., a communication medium, storage device 112, file server 114, or the like). The encoded bitstream may include signaling information defined by G-PCC encoder 200, which is also used by G-PCC decoder 300, such as syntax elements having values that describe characteristics and/or processing of coded units (e.g., slices, pictures, groups of pictures, sequences, or the like). Data consumer 118 uses the decoded data. For example, data consumer 118 may use the decoded data to determine the locations of physical objects. In some examples, data consumer 118 may comprise a display to present imagery based on the point cloud.

G-PCC encoder 200 and G-PCC decoder 300 each may be implemented as any of a variety of suitable encoder and/or decoder circuitry, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware, or any combinations thereof. When the techniques are implemented partially in software, a device may store instructions for the software in a suitable non-transitory computer-readable medium and execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of G-PCC encoder 200 and G-PCC decoder 300 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device. A device including G-PCC encoder 200 and/or G-PCC decoder 300 may comprise one or more integrated circuits, microprocessors, and/or other types of devices.

G-PCC encoder 200 and G-PCC decoder 300 may operate according to a coding standard, such as a video point cloud compression (V-PCC) standard or a geometry point cloud compression (G-PCC) standard. This disclosure may generally refer to coding (e.g., encoding and decoding) of pictures to include the process of encoding or decoding data. An encoded bitstream generally includes a series of values for syntax elements that represent coding decisions (e.g., coding modes).

This disclosure may generally refer to "signaling" certain information, such as syntax elements. The term "signaling" may generally refer to the communication of values for syntax elements and/or other data used to decode encoded data. That is, G-PCC encoder 200 may signal values for syntax elements in the bitstream. In general, signaling refers to generating a value in the bitstream. As noted above, source device 102 may transport the bitstream to target device 116 substantially in real time, or not in real time, such as might occur when syntax elements are stored to storage device 112 for later retrieval by target device 116.

ISO/IEC MPEG (JTC 1/SC 29/WG 11) is studying the potential need for standardization of point cloud coding technology with a compression capability that significantly exceeds that of current approaches. The group is working together on this exploration activity in a collaborative effort known as the 3D Graphics Group (3DG) to evaluate compression technology designs proposed by its experts in this area.

Point cloud compression activities are categorized into two different approaches. The first approach is "video point cloud compression" (V-PCC), which segments the 3D object and projects the segments onto multiple 2D planes (which are represented as "patches" in the 2D frame); the patches are further coded by a conventional 2D video codec such as a High Efficiency Video Coding (HEVC) (ITU-T H.265) codec. The second approach is "geometry-based point cloud compression" (G-PCC), which directly compresses the 3D geometry (i.e., the positions of a set of points in 3D space) and the associated attribute values (for each point associated with the 3D geometry). G-PCC addresses the compression of point clouds in both Category 1 (static point clouds) and Category 3 (dynamically acquired point clouds). A recent draft of the G-PCC standard is available in "Text of ISO/IEC FDIS 23090-9 Geometry-based Point Cloud Compression", ISO/IEC JTC 1/SC29/WG 7 MDS19617, teleconference, October 2020, and a description of the codec is available in G-PCC Codec Description, ISO/IEC JTC 1/SC29/WG 7 MDS19620, teleconference, October 2020.

A point cloud contains a set of points in 3D space and may have attributes associated with the points. The attributes may be color information (such as R, G, B or Y, Cb, Cr), reflectance information, or other attributes. Point clouds may be captured by a variety of cameras or sensors, such as LIDAR sensors and 3D scanners, and may also be computer-generated. Point cloud data are used in a variety of applications including, but not limited to, construction (modeling), graphics (3D models for visualization and animation), and the automotive industry (LIDAR sensors used to help in navigation).

The 3D space occupied by point cloud data may be enclosed by a virtual bounding box. The positions of the points in the bounding box may be represented with a certain precision; therefore, the positions of one or more points may be quantized based on that precision. At the smallest level, the bounding box is split into voxels, which are the smallest unit of space, each represented by a unit cube. A voxel in the bounding box may be associated with zero, one, or more than one point. The bounding box may be split into multiple cube/cuboid regions, which may be called tiles. Each tile may be coded into one or more slices. The partitioning of the bounding box into slices and tiles may be based on the number of points in each partition or on other considerations (e.g., a particular region may be coded as a tile). The slice regions may be further partitioned using splitting decisions similar to those in video codecs.

FIG. 2 provides an overview of G-PCC encoder 200. FIG. 3 provides an overview of G-PCC decoder 300. The modules shown are logical and do not necessarily correspond one-to-one to implemented code in the reference implementation of the G-PCC codec (i.e., the TMC13 test model software studied by ISO/IEC MPEG (JTC 1/SC 29/WG 11)).

In both G-PCC encoder 200 and G-PCC decoder 300, point cloud positions are coded first. Attribute coding depends on the decoded geometry. Surface approximation analysis unit 212 and RAHT unit 218 of FIG. 2, and surface approximation synthesis unit 310 and RAHT unit 314 of FIG. 3, are options typically used for Category 1 data. LOD generation unit 220 and lifting unit 222 of FIG. 2, and LOD generation unit 316 and inverse lifting unit 318 of FIG. 3, are options typically used for Category 3 data. All other modules are common to Categories 1 and 3.

For geometry, two different types of coding techniques exist: octree coding and predictive-tree coding. In the following, this disclosure focuses on octree coding. For Category 3 data, the compressed geometry is typically represented as an octree from the root all the way down to a leaf level of individual voxels. For Category 1 data, the compressed geometry is typically represented by a pruned octree (i.e., an octree from the root down to a leaf level of blocks larger than voxels) plus a model that approximates the surface within each leaf of the pruned octree. In this way, both Category 1 and Category 3 data share the octree coding mechanism, while Category 1 data may additionally approximate the voxels within each leaf with a surface model. The surface model used is a triangulation comprising 1-10 triangles per block, resulting in a triangle soup. The Category 1 geometry codec is therefore known as the Trisoup geometry codec, while the Category 3 geometry codec is known as the Octree geometry codec.

FIG. 4 is a conceptual diagram illustrating an example octree split for geometry coding. Octree 400 includes eight child nodes. Some of those child nodes, such as node 402, have no children. Other child nodes, such as node 404, do have children, and some children of node 404 in turn have children, and so on.

At each node of an octree, an occupancy is signaled (when not inferred) for one or more of its child nodes (up to eight nodes). Multiple neighborhoods are specified, including (a) nodes that share a face with the current octree node, (b) nodes that share a face, edge, or vertex with the current octree node, and so on. Within each neighborhood, the occupancy of a node and/or its children may be used to predict the occupancy of the current node or its children. For points that are sparsely populated in certain nodes of the octree, the codec also supports a direct coding mode in which the 3D position of the point is encoded directly. A flag may be signaled to indicate that the direct mode is used. At the lowest level, the number of points associated with the octree node/leaf node may also be coded.
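As a rough illustration of the per-node occupancy described above, the following Python sketch derives an 8-bit occupancy mask for a single octree node. The function names and the bit ordering (x as the most significant bit of the child index) are assumptions made for this example; they are not the ordering defined by the G-PCC specification, and no neighbor-based prediction or entropy coding is shown.

```python
def child_index(point, origin, half):
    """Return the 0..7 index of the child octant that contains `point`."""
    x, y, z = point
    ox, oy, oz = origin
    # One bit per axis: set when the point lies in the upper half of the node.
    return (int(x >= ox + half) << 2) | (int(y >= oy + half) << 1) | int(z >= oz + half)


def occupancy_byte(points, origin, size):
    """Build an 8-bit mask with bit k set when child k holds at least one point."""
    half = size // 2
    occupancy = 0
    for point in points:
        occupancy |= 1 << child_index(point, origin, half)
    return occupancy


# Two points in opposite corners of a 4x4x4 node occupy children 0 and 7.
mask = occupancy_byte([(0, 0, 0), (3, 3, 3)], origin=(0, 0, 0), size=4)
```

An encoder would recurse into each occupied child until it reaches the leaf level or switches to the direct coding mode for sparsely populated nodes.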

Once the geometry is coded, the attributes corresponding to the geometry points are coded. When multiple attribute points correspond to one reconstructed/decoded geometry point, an attribute value that is representative of the reconstructed point may be derived.

There are three attribute coding methods in G-PCC: Region Adaptive Hierarchical Transform (RAHT) coding, interpolation-based hierarchical nearest-neighbor prediction (Predicting Transform), and interpolation-based hierarchical nearest-neighbor prediction with an update/lifting step (Lifting Transform). RAHT and Lifting are typically used for Category 1 data, while Predicting is typically used for Category 3 data. However, any method may be used for any data, and, as with the geometry codecs in G-PCC, the attribute coding method used to code the point cloud is specified in the bitstream.

The coding of the attributes may be conducted in levels of detail (LoD), where each level of detail provides a finer representation of the point cloud attributes. Each level of detail may be specified based on a distance metric from neighboring nodes or based on a sampling distance.

At G-PCC encoder 200, the residuals obtained as the output of the coding methods for the attributes are quantized. The quantized residuals may be coded using context-adaptive arithmetic coding.

In the example of FIG. 2, G-PCC encoder 200 may include a coordinate transform unit 202, a color transform unit 204, a voxelization unit 206, an attribute transfer unit 208, an octree analysis unit 210, a surface approximation analysis unit 212, an arithmetic encoding unit 214, a geometry reconstruction unit 216, a RAHT unit 218, a LOD generation unit 220, a lifting unit 222, a coefficient quantization unit 224, and an arithmetic encoding unit 226.

As shown in the example of FIG. 2, G-PCC encoder 200 may receive a set of positions and a set of attributes. The positions may include coordinates of points in a point cloud. The attributes may include information about the points in the point cloud, such as colors associated with the points.

Coordinate transform unit 202 may apply a transform to the coordinates of the points to transform the coordinates from an initial domain to a transform domain. This disclosure may refer to the transformed coordinates as transform coordinates. Color transform unit 204 may apply a transform to convert the color information of the attributes to a different domain. For example, color transform unit 204 may transform color information from an RGB color space to a YCbCr color space.

Furthermore, in the example of FIG. 2, voxelization unit 206 may voxelize the transform coordinates. Voxelization of the transform coordinates may include quantizing and removing some points of the point cloud. In other words, multiple points of the point cloud may be subsumed within a single "voxel," which may thereafter be treated in some respects as one point. Furthermore, octree analysis unit 210 may generate an octree based on the voxelized transform coordinates. Additionally, in the example of FIG. 2, surface approximation analysis unit 212 may analyze the points to potentially determine a surface representation of sets of the points. Arithmetic encoding unit 214 may entropy encode syntax elements that represent the information of the octree and/or the surfaces determined by surface approximation analysis unit 212. G-PCC encoder 200 may output these syntax elements in a geometry bitstream.
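The voxelization step can be pictured with a minimal Python sketch. It assumes a uniform `voxel_size` and keeps one representative entry per occupied voxel; both the function name and the "first point wins" merging rule are illustrative choices for this example, not behavior taken from the G-PCC reference software.

```python
def voxelize(points, voxel_size):
    """Quantize positions to integer voxel indices and merge duplicates."""
    seen = set()
    voxels = []
    for x, y, z in points:
        # Floor division maps every point inside a voxel to the same index.
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        if key not in seen:
            seen.add(key)
            voxels.append(key)
    return voxels


# The first two points fall into the same unit voxel and collapse into one.
occupied = voxelize([(0.1, 0.2, 0.3), (0.4, 0.4, 0.4), (1.2, 0.0, 0.0)], 1.0)
```

Because several input points can collapse into one voxel, the number of reconstructed points may be smaller than the number of original points.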

Geometry reconstruction unit 216 may reconstruct the transform coordinates of points in the point cloud based on the octree, data indicating the surfaces determined by surface approximation analysis unit 212, and/or other information. Because of voxelization and surface approximation, the number of transform coordinates reconstructed by geometry reconstruction unit 216 may differ from the original number of points of the point cloud. This disclosure may refer to the resulting points as reconstructed points. Attribute transfer unit 208 may transfer attributes of the original points of the point cloud to the reconstructed points of the point cloud.

Furthermore, RAHT unit 218 may apply RAHT coding to the attributes of the reconstructed points. Alternatively or additionally, LOD generation unit 220 and lifting unit 222 may apply LOD processing and lifting, respectively, to the attributes of the reconstructed points. RAHT unit 218 and lifting unit 222 may generate coefficients based on the attributes. Coefficient quantization unit 224 may quantize the coefficients generated by RAHT unit 218 or lifting unit 222. Arithmetic encoding unit 226 may apply arithmetic coding to syntax elements representing the quantized coefficients. G-PCC encoder 200 may output these syntax elements in an attribute bitstream.

In the example of FIG. 3, G-PCC decoder 300 may include a geometry arithmetic decoding unit 302, an attribute arithmetic decoding unit 304, an octree synthesis unit 306, an inverse quantization unit 308, a surface approximation synthesis unit 310, a geometry reconstruction unit 312, a RAHT unit 314, a LOD generation unit 316, an inverse lifting unit 318, an inverse transform coordinate unit 320, and an inverse transform color unit 322.

G-PCC decoder 300 may obtain a geometry bitstream and an attribute bitstream. Geometry arithmetic decoding unit 302 of G-PCC decoder 300 may apply arithmetic decoding (e.g., context-adaptive binary arithmetic coding (CABAC) or another type of arithmetic decoding) to syntax elements in the geometry bitstream. Similarly, attribute arithmetic decoding unit 304 may apply arithmetic decoding to syntax elements in the attribute bitstream.

Octree synthesis unit 306 may synthesize an octree based on syntax elements parsed from the geometry bitstream. In instances where surface approximation is used in the geometry bitstream, surface approximation synthesis unit 310 may determine a surface model based on syntax elements parsed from the geometry bitstream and based on the octree.

Furthermore, geometry reconstruction unit 312 may perform a reconstruction to determine the coordinates of points in the point cloud. Inverse transform coordinate unit 320 may apply an inverse transform to the reconstructed coordinates to convert the reconstructed coordinates (positions) of the points in the point cloud from the transform domain back to the initial domain.

Additionally, in the example of FIG. 3, inverse quantization unit 308 may inverse quantize attribute values. The attribute values may be based on syntax elements obtained from the attribute bitstream (e.g., including syntax elements decoded by attribute arithmetic decoding unit 304).

Depending on how the attribute values are encoded, RAHT unit 314 may perform RAHT coding to determine, based on the inverse-quantized attribute values, color values for points of the point cloud. Alternatively, LOD generation unit 316 and inverse lifting unit 318 may determine color values for points of the point cloud using a level-of-detail-based technique.

Furthermore, in the example of FIG. 3, inverse transform color unit 322 may apply an inverse color transform to the color values. The inverse color transform may be the inverse of the color transform applied by color transform unit 204 of G-PCC encoder 200. For example, color transform unit 204 may transform color information from an RGB color space to a YCbCr color space. Accordingly, inverse transform color unit 322 may transform color information from the YCbCr color space to the RGB color space.

The various units of FIG. 2 and FIG. 3 are illustrated to assist with understanding the operations performed by G-PCC encoder 200 and G-PCC decoder 300. The units may be implemented as fixed-function circuits, programmable circuits, or a combination thereof. Fixed-function circuits are circuits that provide particular functionality and are preset in the operations they can perform. Programmable circuits are circuits that can be programmed to perform various tasks and that provide flexible functionality in the operations they can perform. For instance, programmable circuits may execute software or firmware that causes the programmable circuits to operate in the manner defined by the instructions of the software or firmware. Fixed-function circuits may execute software instructions (e.g., to receive parameters or output parameters), but the types of operations that the fixed-function circuits perform are generally immutable. In some examples, one or more of the units may be distinct circuit blocks (fixed-function or programmable), and in some examples, one or more of the units may be integrated circuits.

Predictive geometry coding was introduced as an alternative to octree geometry coding. In predictive geometry coding, the nodes are arranged in a tree structure (which defines the prediction structure), and various prediction strategies are used to predict the coordinates of each node in the tree relative to its predictors.

FIG. 5 shows an example of a prediction tree 401, represented as a directed graph in which the arrows point in the prediction direction. Node 412 is the root vertex and has no predictor. Nodes 414A and 414B have two children. The dashed node has three children. The white-filled nodes have one child, and nodes 418A-418E are leaf nodes with no children. Each node has only one parent node.

Four prediction strategies are specified for each node, based on its parent (p0), grandparent (p1), and great-grandparent (p2):
(1) No prediction/zero prediction (0)
(2) Delta prediction (p0)
(3) Linear prediction (2*p0 - p1)
(4) Parallelogram prediction (2*p0 + p1 - p2)
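The four strategies can be written down directly. The Python sketch below is illustrative only: the mode numbering 0..3, the tuple representation of positions, and the function names are assumptions for this example, and the parallelogram formula is implemented exactly as it is stated in the text above.

```python
def predict(mode, p0, p1, p2):
    """Return the predicted position for a node from its ancestors.

    p0, p1, p2 are the parent, grandparent, and great-grandparent positions.
    """
    if mode == 0:  # no prediction / zero prediction
        return (0, 0, 0)
    if mode == 1:  # delta prediction
        return tuple(p0)
    if mode == 2:  # linear prediction: 2*p0 - p1
        return tuple(2 * a - b for a, b in zip(p0, p1))
    if mode == 3:  # parallelogram prediction, as stated in the text: 2*p0 + p1 - p2
        return tuple(2 * a + b - c for a, b, c in zip(p0, p1, p2))
    raise ValueError("unknown prediction mode")


def residual(position, prediction):
    """Residual that would be coded for the node."""
    return tuple(a - b for a, b in zip(position, prediction))


linear = predict(2, (4, 4, 4), (3, 3, 3), (1, 1, 1))
res = residual((6, 5, 5), linear)
```

An encoder would typically evaluate each available mode and keep the one with the cheapest residual; that selection policy is not specified here.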

G-PCC encoder 200 may employ any algorithm to generate the prediction tree; the algorithm used may be determined based on the application/use case, and several strategies may be used. Some strategies are described in G-PCC Codec Description, ISO/IEC JTC 1/SC29/WG 7 MDS19620, teleconference, October 2020.

For each node, the residual coordinate values are coded in the bitstream, starting from the root node, in a depth-first manner.

Predictive geometry coding is useful mainly for Category 3 (LIDAR-acquired) point cloud data, e.g., for low-latency applications.

An angular mode may be used in predictive geometry coding, in which the characteristics of LIDAR sensors are used to code the prediction tree more efficiently. The coordinates of the positions are converted to $(r, \phi, i)$ (radius, azimuth, and laser index), and prediction is performed in this domain (the residuals are coded in the $(r, \phi, i)$ domain). Because of rounding errors, coding in $(r, \phi, i)$ is not lossless, and therefore a second set of residuals, corresponding to the Cartesian coordinates, is coded. A description of the encoding and decoding strategies for the angular mode of predictive geometry coding is reproduced below from G-PCC Codec Description, ISO/IEC JTC 1/SC29/WG 7 MDS19620, teleconference, October 2020.

The method focuses on point clouds acquired using a spinning LIDAR model. Here, the LIDAR has $N$ lasers (e.g., $N = 16, 32, 64$) spinning around the Z axis according to an azimuth angle $\phi$ (see FIG. 6). Each laser may have a different elevation $\theta(i)_{i=1 \ldots N}$ and height $\varsigma(i)_{i=1 \ldots N}$. Suppose that laser $i$ hits a point $M$ with Cartesian integer coordinates $(x, y, z)$, defined according to the coordinate system described in FIG. 6.

The method models the position of $M$ with three parameters $(r, \phi, i)$, which are computed as follows:
(1) $r = \sqrt{x^2 + y^2}$
(2) $\phi = \mathrm{atan2}(y, x)$
(3) $i = \arg\min_{j = 1 \ldots N} \{ z + \varsigma(j) - r \times \tan(\theta(j)) \}$
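The conversion from Cartesian coordinates to radius, azimuth, and laser index can be sketched in Python as follows. The function name, the 0-based laser index, and the use of an absolute value inside the minimization (so that the laser whose modeled height is closest to z is selected) are assumptions made for this illustration.

```python
import math


def to_angular(x, y, z, elevations, heights):
    """Map a Cartesian point to (radius, azimuth, laser index).

    elevations[j] is the elevation angle (radians) and heights[j] the height
    offset of laser j; the laser index returned here is 0-based.
    """
    r = math.hypot(x, y)
    phi = math.atan2(y, x)
    # Pick the laser whose beam model best explains the observed z.
    i = min(
        range(len(elevations)),
        key=lambda j: abs(z + heights[j] - r * math.tan(elevations[j])),
    )
    return r, phi, i


# A point on the x axis at z = 0 is best explained by the horizontal laser.
r, phi, laser = to_angular(10, 0, 0, elevations=[-0.1, 0.0, 0.1], heights=[0.0, 0.0, 0.0])
```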

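More precisely, the method uses a quantized version of $(r, \phi, i)$, denoted $(\tilde{r}, \tilde{\phi}, i)$, where the three integers $\tilde{r}$, $\tilde{\phi}$, and $i$ are computed as follows:
(1) $\tilde{r} = \mathrm{floor}\left( \frac{\sqrt{x^2 + y^2}}{q_r} + o_r \right)$
(2) $\tilde{\phi} = \mathrm{sign}(\mathrm{atan2}(y, x)) \times \mathrm{floor}\left( \frac{|\mathrm{atan2}(y, x)|}{q_\phi} + o_\phi \right)$
(3) $i = \arg\min_{j = 1 \ldots N} \{ z + \varsigma(j) - r \times \tan(\theta(j)) \}$
where:
(1) $(q_r, o_r)$ and $(q_\phi, o_\phi)$ are quantization parameters controlling the precision of $\tilde{r}$ and $\tilde{\phi}$, respectively.
(2) $\mathrm{sign}(t)$ is a function that returns 1 if $t$ is positive and (-1) otherwise.
(3) $|t|$ is the absolute value of $t$.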
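A small Python sketch of this sign/floor quantization, applied to the radius and the azimuth, is given below. The helper names and the example step sizes and offsets are hypothetical; only the sign-times-floor rule itself comes from the description above.

```python
import math


def sign(t):
    """Return 1 if t is positive and -1 otherwise, as defined in the text."""
    return 1 if t > 0 else -1


def quantize(value, q, o):
    """Quantize `value` with step q and rounding offset o."""
    return sign(value) * math.floor(abs(value) / q + o)


def quantize_position(x, y, q_r, o_r, q_phi, o_phi):
    """Return the integer pair (r_tilde, phi_tilde) for a point (x, y)."""
    r_q = math.floor(math.hypot(x, y) / q_r + o_r)
    phi_q = quantize(math.atan2(y, x), q_phi, o_phi)
    return r_q, phi_q


# Radius 5 with unit step; azimuth of about 0.927 rad with a 0.01 rad step.
r_q, phi_q = quantize_position(3, 4, q_r=1.0, o_r=0.5, q_phi=0.01, o_phi=0.5)
```

Mirroring the point across the x axis flips only the sign of the quantized azimuth, which is the purpose of the explicit sign term.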

To avoid reconstruction mismatches due to the use of floating-point operations, the values of $\varsigma(i)_{i=1 \ldots N}$ and $\tan(\theta(i))_{i=1 \ldots N}$ are precomputed and quantized as follows:
(1) $\tilde{z}(i) = \mathrm{sign}(\varsigma(i)) \times \mathrm{floor}\left( \frac{|\varsigma(i)|}{q_\varsigma} + o_\varsigma \right)$
(2) $\tilde{t}(i) = \mathrm{sign}(\tan(\theta(i))) \times \mathrm{floor}\left( \frac{|\tan(\theta(i))|}{q_\theta} + o_\theta \right)$
where $(q_\varsigma, o_\varsigma)$ and $(q_\theta, o_\theta)$ are quantization parameters controlling the precision of $\tilde{z}(i)$ and $\tilde{t}(i)$, respectively.

The reconstructed Cartesian coordinates are obtained as follows:
(1) $\hat{x} = \mathrm{round}(\tilde{r} \times q_r \times \mathrm{app\_cos}(\tilde{\phi} \times q_\phi))$
(2) $\hat{y} = \mathrm{round}(\tilde{r} \times q_r \times \mathrm{app\_sin}(\tilde{\phi} \times q_\phi))$
(3) $\hat{z} = \mathrm{round}(\tilde{r} \times q_r \times \tilde{t}(i) \times q_\theta - \tilde{z}(i) \times q_\varsigma)$
where $\mathrm{app\_cos}(\cdot)$ and $\mathrm{app\_sin}(\cdot)$ are approximations of $\cos(\cdot)$ and $\sin(\cdot)$. The calculations may use a fixed-point representation, a look-up table, and linear interpolation.

Note that $(\hat{x}, \hat{y}, \hat{z})$ may differ from $(x, y, z)$ for various reasons:
(1) quantization
(2) approximations
(3) model imprecision
(4) model parameter imprecision

Let $(r_x, r_y, r_z)$ be the reconstruction residuals, defined as follows:
(1) $r_x = x - \hat{x}$
(2) $r_y = y - \hat{y}$
(3) $r_z = z - \hat{z}$

In this method, the encoder (e.g., G-PCC encoder 200) proceeds as follows:
(1) Encode the model parameters $\tilde{t}(i)$ and $\tilde{z}(i)$ and the quantization parameters $q_r$, $q_\varsigma$, $q_\theta$, and $q_\phi$.
(2) Apply the geometry prediction scheme described in the following document to the representation $(\tilde{r}, \tilde{\phi}, i)$: Text of ISO/IEC FDIS 23090-9 Geometry-based Point Cloud Compression, ISO/IEC JTC 1/SC29/WG 7 MDS19617, teleconference, October 2020.
<1> A new predictor that leverages the characteristics of LIDAR may be introduced. For instance, the rotation speed of the LIDAR scanner around the z-axis is usually constant. Therefore, the current $\tilde{\phi}(j)$ may be predicted as follows: $\tilde{\phi}(j) = \tilde{\phi}(j-1) + n(j) \times \delta_\phi(k)$, where $(\delta_\phi(k))_{k=1 \ldots K}$ is a set of potential speeds from which the encoder may choose. The index $k$ may be explicitly written to the bitstream or may be inferred from context based on a deterministic strategy applied by both the encoder and the decoder, and $n(j)$ is the number of skipped points, which may likewise be explicitly written to the bitstream or inferred from context based on a deterministic strategy applied by both the encoder and the decoder. $n(j)$ may also be referred to later as the "phi multiplier" and, in some implementations, may be used only with the delta predictor.
(3) Encode the reconstruction residuals $(r_x, r_y, r_z)$ with each node.
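The encoder-side reconstruction, the residual computation, and the constant-speed azimuth predictor can be sketched together in Python. This is a simplified illustration under stated assumptions: `math.cos` and `math.sin` stand in for the fixed-point approximations of cosine and sine that a real codec would need in order to avoid encoder/decoder mismatch, the laser index is 0-based, and all function and parameter names are hypothetical.

```python
import math


def reconstruct(r_q, phi_q, i, q_r, q_phi, t_q, z_q, q_theta, q_zeta):
    """Rebuild integer Cartesian coordinates from the quantized (r, phi, i)."""
    radius = r_q * q_r
    # Floating-point cos/sin are used here only for readability.
    x_hat = round(radius * math.cos(phi_q * q_phi))
    y_hat = round(radius * math.sin(phi_q * q_phi))
    z_hat = round(radius * t_q[i] * q_theta - z_q[i] * q_zeta)
    return x_hat, y_hat, z_hat


def residuals(point, recon):
    """Second-stage Cartesian residuals (r_x, r_y, r_z)."""
    return tuple(p - h for p, h in zip(point, recon))


def predict_phi(phi_prev_q, n, delta_phi):
    """Predict the current quantized azimuth from the previous one."""
    return phi_prev_q + n * delta_phi


recon = reconstruct(100, 157, 0, q_r=1.0, q_phi=0.01,
                    t_q=[10], z_q=[2], q_theta=0.01, q_zeta=1.0)
res = residuals((1, 99, 9), recon)
# A decoder adds the residuals back to the same reconstruction.
decoded = tuple(r + h for r, h in zip(res, recon))
```

Because both sides derive the same reconstruction from the same quantized parameters, adding the unquantized residuals back recovers the original point exactly.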

G-PCC decoder 300 proceeds as follows:
(1) Decode the model parameters $\tilde{t}(i)$ and $\tilde{z}(i)$ and the quantization parameters $q_r$, $q_\varsigma$, $q_\theta$, and $q_\phi$.
(2) Decode the $(\tilde{r}, \tilde{\phi}, i)$ parameters associated with the nodes according to the geometry prediction scheme described in the following document: Text of ISO/IEC FDIS 23090-9 Geometry-based Point Cloud Compression, ISO/IEC JTC 1/SC29/WG 7 MDS19617, teleconference, October 2020.
(3) Compute the reconstructed coordinates $(\hat{x}, \hat{y}, \hat{z})$ as described above.
(4) Decode the residuals $(r_x, r_y, r_z)$.
<1> As discussed below, lossy compression may be supported by quantizing the reconstruction residuals $(r_x, r_y, r_z)$.
(5) Compute the original coordinates $(x, y, z)$ as follows:
<1> $x = r_x + \hat{x}$
<2> $y = r_y + \hat{y}$
<3> $z = r_z + \hat{z}$

Lossy compression may be achieved by applying quantization to the reconstruction residuals $(r_x, r_y, r_z)$ or by dropping points.

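The quantized reconstruction residuals are computed as follows:
(1) $\tilde{r}_x = \mathrm{sign}(r_x) \times \mathrm{floor}\left( \frac{|r_x|}{q_x} + o_x \right)$
(2) $\tilde{r}_y = \mathrm{sign}(r_y) \times \mathrm{floor}\left( \frac{|r_y|}{q_y} + o_y \right)$
(3) $\tilde{r}_z = \mathrm{sign}(r_z) \times \mathrm{floor}\left( \frac{|r_z|}{q_z} + o_z \right)$
where $(q_x, o_x)$, $(q_y, o_y)$, and $(q_z, o_z)$ are quantization parameters controlling the precision of $\tilde{r}_x$, $\tilde{r}_y$, and $\tilde{r}_z$, respectively.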

Trellis quantization may be used to further improve the rate-distortion (RD) performance results.

The quantization parameters may change at the sequence/frame/slice/block level to achieve region-adaptive quality and for rate control purposes.

G-PCC encoder 200 may be configured to perform motion estimation for inter prediction. The motion estimation (global and local) process applied in the InterEM software is described below. InterEM is an octree-based coding extension for inter prediction. Although the motion estimation is applied in an octree-based framework, a similar process (or at least a portion of it) may also be applicable to predictive geometry coding.

Two kinds of motion are involved in the G-PCC InterEM software: a global motion matrix and local node motion vectors. The global motion parameters include a rotation matrix and a translation vector that may be applied to all of the points in the prediction (reference) frame. The local node motion vector of a node of the octree is a motion vector that is applied only to the points within that node in the prediction (reference) frame. Details of the motion estimation algorithm in InterEM are described below.
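Applying global motion parameters to a reference frame amounts to a rigid transform of every point. The following Python sketch illustrates that step under the assumption that the rotation is given as a 3x3 row-major matrix and the translation as a 3-vector; the function name is hypothetical and is not taken from the InterEM software.

```python
def apply_global_motion(points, rotation, translation):
    """Return the points transformed by p' = R * p + t."""
    moved = []
    for p in points:
        moved.append(tuple(
            sum(rotation[row][col] * p[col] for col in range(3)) + translation[row]
            for row in range(3)
        ))
    return moved


# A 90-degree rotation about the z axis followed by a shift of +1 along x.
rot_z_90 = [[0, -1, 0],
            [1, 0, 0],
            [0, 0, 1]]
compensated = apply_global_motion([(1, 0, 0), (0, 2, 5)], rot_z_90, (1, 0, 0))
```

A local node motion vector would be applied in the same way, but with a translation only and restricted to the points inside one octree node.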

FIG. 7 is a flowchart illustrating the motion estimation process. The inputs to the process include a prediction frame 420 and a current frame 422. G-PCC encoder 200 first estimates global motion at a global scale (424). After applying the estimated global motion to prediction frame 420 (426), G-PCC encoder 200 estimates local motion at a finer scale, at the node level in the octree (428). Finally, G-PCC encoder 200 applies the estimated local node motion in motion compensation and encodes the determined motion vectors and points (430).

Aspects of FIG. 7 are explained in more detail below. G-PCC encoder 200 may perform a process for estimating the global motion matrix and translation vector. In the InterEM software, the global motion matrix is defined to match feature points between the prediction (reference) frame and the current frame.

FIG. 8 shows an example of a global motion estimation process that may be performed by G-PCC encoder 200. In the example of FIG. 8, G-PCC encoder 200 finds feature points (432), samples the feature points (434), and performs motion estimation using a least mean squares (LMS) algorithm (436).

In the algorithm shown in FIG. 8, points that have a large position change between the predicted frame and the current frame may be defined as feature points. For each point in the current frame, the closest point in the predicted frame is found, and a point pair is established between the current frame and the predicted frame. If the distance between the paired points is greater than a threshold, the paired points are regarded as feature points.
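The pairing rule above can be sketched as follows. This is a minimal illustration, not the InterEM implementation: the function name, the brute-force nearest-neighbor search, and the threshold value are all assumptions made for the example.

```python
import math

def find_feature_points(current, predicted, threshold):
    """Pair each current-frame point with its nearest predicted-frame point
    and keep only the pairs whose distance exceeds `threshold`."""
    pairs = []
    for c in current:
        # Brute-force nearest-neighbor search; a real encoder would likely
        # use a spatial index, but that is outside this sketch.
        nearest = min(predicted, key=lambda p: math.dist(c, p))
        if math.dist(c, nearest) > threshold:
            pairs.append((c, nearest))
    return pairs

# Illustrative data (hypothetical): only the third point moved noticeably.
current = [(0, 0, 0), (10, 0, 0), (20, 5, 0)]
predicted = [(0, 0, 1), (10, 0, 0), (26, 5, 0)]
features = find_feature_points(current, predicted, threshold=2.0)
```

With these inputs only the pair ((20, 5, 0), (26, 5, 0)) is kept, because its distance of 6 exceeds the threshold of 2.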

After the feature points are found, the feature points are sampled to reduce the scale of the problem (e.g., by selecting a subset of the feature points to lower the complexity of motion estimation). The LMS algorithm is then applied to derive the motion parameters by attempting to reduce the error between the corresponding feature points in the predicted frame and the current frame.

FIG. 9 shows an example process for estimating local node motion vectors. In the local node estimation algorithm shown in FIG. 9, the motion vectors are estimated recursively. The cost function used to select the best-suited motion vector may be based on the rate-distortion cost. In FIG. 9, path 440 shows the process for a current node that is not split into 8 child nodes, and path 442 shows the process for a current node that is split into 8 child nodes.

If the current node is not split into 8 child nodes (440), the motion vector that results in the lowest cost between the current node and the predicted node is determined. If the current node is split into 8 child nodes (442), the motion estimation algorithm is applied to each child node, and the total cost under the split condition is obtained by adding the estimated cost values of the child nodes. The decision whether to split is made by comparing the cost of splitting with the cost of not splitting. If the node is split, each child node is assigned its respective motion vector (or may be further split into its own children); if the node is not split, the motion vector is assigned to the current node.
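The split decision above can be sketched as a short recursion. This is a hedged illustration only: `cost_fn` is a hypothetical stand-in for the encoder's rate-distortion search (it returns the best motion vector and its cost for a node), the tuple-based plan layout is invented for the example, and `min_pu_size` plays the role of MinPUSize as a lower bound on node size.

```python
def estimate_node_motion(origin, size, cost_fn, min_pu_size):
    """Return (cost, plan). plan is ('leaf', origin, size, mv) when the node
    keeps one motion vector, or ('split', [child plans]) when it is split."""
    mv, no_split_cost = cost_fn(origin, size)
    half = size // 2
    if half < min_pu_size:
        # Children would be smaller than the minimum prediction unit.
        return no_split_cost, ('leaf', origin, size, mv)

    split_cost = 0
    children = []
    for dx in (0, half):
        for dy in (0, half):
            for dz in (0, half):
                child_origin = (origin[0] + dx, origin[1] + dy, origin[2] + dz)
                cost, plan = estimate_node_motion(child_origin, half, cost_fn, min_pu_size)
                split_cost += cost  # total cost is the sum over the 8 children
                children.append(plan)

    if split_cost < no_split_cost:
        return split_cost, ('split', children)
    return no_split_cost, ('leaf', origin, size, mv)

def make_toy_cost(table):
    """Hypothetical cost function: cost depends only on node size."""
    def cost_fn(origin, size):
        return (0, 0, 0), table[size]
    return cost_fn

# Splitting wins: 8 children at cost 10 each (80) beat one node at cost 100.
cost_a, plan_a = estimate_node_motion((0, 0, 0), 8, make_toy_cost({8: 100, 4: 10}), min_pu_size=4)
# Not splitting wins: one node at cost 50 beats 8 children at cost 10 each.
cost_b, plan_b = estimate_node_motion((0, 0, 0), 8, make_toy_cost({8: 50, 4: 10}), min_pu_size=4)
```

In this sketch the caller would start the recursion at nodes of the largest size for which motion vectors are estimated, which corresponds to BlockSize.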

Two parameters that affect the performance of motion vector estimation are the block size (BlockSize) and the minimum prediction unit size (MinPUSize). BlockSize defines the upper bound of the node size to which motion vector estimation is applied, and MinPUSize defines the lower bound.

The InterEM software, being fundamentally an octree coder, performs occupancy prediction and uses the global/local motion and reference point cloud information when doing so. The InterEM software therefore does not perform direct motion compensation of points (which may, for example, include applying motion to points in a reference frame to project those points into the current frame). With direct motion compensation, the difference between the actual points and the predicted points could then be coded, which may be more efficient when performing inter prediction.

One or more of the techniques disclosed in this document may be applied individually or in combination. This disclosure proposes techniques for performing direct motion compensation while still benefiting from a flexible coding structure based on octree partitioning. In the following, these techniques are shown mainly in the context of octree splitting, but they may also be extended to OTQTBT (octree-quadtree-binary tree) splitting scenarios.

G-PCC encoder 200 and/or G-PCC decoder 300 may be configured to perform high-level splitting and to process mode flags. In one example of this disclosure, G-PCC encoder 200 and/or G-PCC decoder 300 may be configured to perform octree-based splitting (as used for occupancy prediction) on the current point cloud. However, it is possible to stop the octree splitting at a certain level and then directly code the points within the octree leaf volume (hereinafter referred to as "direct prediction") instead of coding occupancy. A leaf node size or an octree depth value may be signaled to specify the level at which octree splitting stops and the points are coded as octree leaf volumes.

For each such octree leaf node (where octree splitting stops and "direct prediction" is activated), a flag may be signaled to indicate whether the set of points within the octree leaf volume is intra-predicted or inter-predicted. The maximum and minimum sizes of an octree leaf may be defined in the geometry parameter set.

FIG. 10 is a conceptual diagram illustrating high-level octree splitting of octree 444. FIG. 10 shows an example of an octree leaf node containing 13 points (O0 to O12) that are coded with direct prediction. As a special case, the root node of the octree (no splitting) may be coded using "direct prediction".

G-PCC encoder 200 and/or G-PCC decoder 300 may be configured to perform intra prediction. When the flag value is set to intra, all points within the volume are intra-predicted. For this purpose, a "local prediction tree" is generated. The generation of this tree is non-normative (the points may be traversed in different orders, such as azimuth, Morton, radial, or some other order). For each point, its prediction mode (0, 1, 2, 3), the number-of-children information, the primary residual, and the secondary residual (if angular mode is enabled) are signaled. In summary, intra prediction is therefore functionally similar to predictive geometry coding.

Alternatively or additionally, a single prediction mode may be signaled for all points in the octree leaf volume, which may reduce the associated signaling cost. The radius value (if angular mode is enabled) or the (x, y, z) value (if angular mode is disabled) used for the zero predictor may be set to, for example, the left vertex of the octree leaf volume. Alternatively, the zero predictor may be signaled for the octree leaf volume, or an index may be signaled that indicates which point within the octree leaf volume is to be used as the zero predictor. In addition, clipping may be performed after prediction/reconstruction if a value falls outside the octree leaf volume.

The syntax table for intra prediction may be similar to the syntax described for the prediction tree in the following document: Text of ISO/IEC FDIS 23090-9 Geometry-based Point Cloud Compression, ISO/IEC JTC 1/SC29/WG 7 MDS19617, teleconference, October 2020, the entire content of which is incorporated by reference.

G-PCC encoder 200 and/or G-PCC decoder 300 may be configured to perform inter prediction. Assume that an octree leaf has N points within it: (O(0), ..., O(N-1)). For inter prediction, motion estimation is performed at the encoder side using the current set of points in the octree leaf volume, and the best match with a similar set of points in the reference point cloud frame is found (where the reference point cloud may be non-motion-compensated or global-motion-compensated). For inter prediction of an octree leaf, the following are signaled:
i. A reference index (if there are multiple reference point cloud frames to predict from).
ii. A motion vector difference (MVD), that is, the difference between the actual MV and the predicted MV (MV prediction from neighbors is described elsewhere in this disclosure).
iii. The number of points in the octree leaf (N).
iv. The primary residuals (and, if angular mode is enabled, the secondary residuals) for the N points (R'i), each being a tuple of differences between 3D coordinates.

In the following, this disclosure describes the motion compensation process given the signaled reference index (if applicable) and the MV for an octree leaf node.
a. The current octree leaf has its top-left point at (X0, Y0, Z0) and has dimensions (Sx, Sy, Sz), and the motion vector is MV = (MVx, MVy, MVz). The corresponding reference block in the reference point cloud frame therefore has its top-left point at (Xr, Yr, Zr) = (X0 - MVx, Y0 - MVy, Z0 - MVz) and has size (Sx, Sy, Sz).
b. All points within the reference block are extracted and arranged as a 1D array; the ordering may be predetermined/fixed or signaled (e.g., for the octree leaf). Say there are M such points with coordinates (in the reference frame) (R0, ..., R(M-1)), where Ri is a triple giving the 3D coordinates of the i-th point for i = 0...(M-1), as shown in FIG. 12.
c. All points are motion compensated by applying the signaled MV, and the results serve as the predicted geometry positions (Pi), i.e., Pi = Ri + MV, as shown in FIG. 13.
d. If angular mode is enabled, for all M points, derive the corresponding [...].
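Steps a to c above can be sketched as follows. This is a minimal illustration under stated assumptions: the half-open block bounds and the lexicographic (x, y, z) ordering are choices made for the example (the text only says the ordering may be fixed or signaled), the function name is hypothetical, and step d (angular mode) is left out.

```python
def motion_compensate_leaf(reference_points, origin, size, mv):
    """Return the predicted positions Pi = Ri + MV for the reference points Ri
    that lie inside the reference block of the current octree leaf."""
    x0, y0, z0 = origin
    sx, sy, sz = size
    # Step a: reference block corner (Xr, Yr, Zr) = (X0 - MVx, Y0 - MVy, Z0 - MVz).
    xr, yr, zr = x0 - mv[0], y0 - mv[1], z0 - mv[2]

    # Step b: extract the points inside the block (half-open bounds assumed)
    # and arrange them as a 1D array in an assumed fixed order.
    inside = [p for p in reference_points
              if xr <= p[0] < xr + sx
              and yr <= p[1] < yr + sy
              and zr <= p[2] < zr + sz]
    inside.sort()

    # Step c: apply the signaled MV to every extracted point.
    return [(p[0] + mv[0], p[1] + mv[1], p[2] + mv[2]) for p in inside]

# Illustrative values (hypothetical): a 4x4x4 leaf at (8, 8, 8), MV = (2, 0, -1).
reference = [(9, 11, 12), (10, 8, 9), (6, 8, 9), (7, 9, 10)]
predicted = motion_compensate_leaf(reference, origin=(8, 8, 8), size=(4, 4, 4), mv=(2, 0, -1))
```

Here the reference block spans x in [6, 10), y in [8, 12), z in [9, 13), so (10, 8, 9) is excluded and the three remaining points map into the current leaf.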

In FIG. 12, the current set of points is labeled O0 to O12 and the reference set of points is labeled R0 to R12, where N = M = 13. In FIG. 13, the current set of points is labeled O0 to O12 and the motion-compensated reference set of points is labeled P0 to P12, where N = M = 13.

Three scenarios are now possible:
i. N = M (the current octree leaf node and the reference block have the same number of points).
ii. N > M.
iii. N < M.

The first scenario, where N = M, will now be described. The residuals (the primary residual and, if applicable, the secondary residual) are added directly to the motion-compensated points to generate the reconstruction = Pi + R'i.

The second scenario, where N > M, will now be described. The motion-compensated points in the 1D array (Pi) are extended using the last value P(M-1), i.e., [P'0, ..., P'(M-1), P'(M), ..., P'(N-1)] = [P0, ..., P(M-1), P(M-1), ..., P(M-1)], and the residuals are then added directly to generate the reconstruction = P'i + R'i. Alternatively, the zero predictor is used for the extension.

The third scenario, where N < M, will now be described. The residuals are added directly to the first N points (i.e., [P0, ..., P(N-1)]) to generate the reconstruction = Pi + R'i.
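The three scenarios can be covered by one small routine. The sketch below is an illustration under assumptions rather than the codec's reconstruction process: it works on plain (x, y, z) tuples, treats each residual R'i as a single 3D tuple, ignores the angular-mode split into primary and secondary residuals, and uses hypothetical names.

```python
def reconstruct_leaf(predicted, residuals, zero_predictor=None):
    """Reconstruct N points from M motion-compensated predictions Pi and
    N residuals R'i, handling N = M, N > M, and N < M."""
    n, m = len(residuals), len(predicted)
    predicted = list(predicted)
    if n > m:
        if zero_predictor is not None:
            pad = zero_predictor      # alternative: extend with the zero predictor
        elif m > 0:
            pad = predicted[-1]       # extend with the last value P(M-1)
        else:
            raise ValueError("no prediction available to extend")
        predicted += [pad] * (n - m)
    # For N < M only the first N predictions are used.
    return [tuple(p + r for p, r in zip(point, residual))
            for point, residual in zip(predicted[:n], residuals)]

preds = [(8, 8, 8), (9, 9, 9)]                                    # M = 2
same = reconstruct_leaf(preds, [(0, 1, 0), (1, 0, 0)])            # N = M
more = reconstruct_leaf(preds, [(0, 1, 0), (1, 0, 0), (0, 0, 2)]) # N > M
fewer = reconstruct_leaf(preds, [(1, 1, 1)])                      # N < M
```

In the N > M call the third point is predicted from the repeated last value (9, 9, 9), so its reconstruction is (9, 9, 11).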

G-PCC encoder 200 and/or G-PCC decoder 300 may be configured to perform MV prediction from neighbors. The MV of the current octree leaf may be predicted from the MVs of spatio-temporally neighboring inter-predicted octree leaves, and the corresponding MV difference may be signaled. Where there are multiple spatio-temporal candidates, an MV prediction index may be signaled. In addition to the spatio-temporal neighbor candidates, previously used MV candidates may also be added to a recent-history-based MV candidate list.

It is also possible to merge the MV information with that of a spatio-temporal neighbor by signaling a "merge flag". Where there are multiple spatio-temporal candidates, a merge index may be signaled.
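How the decoder side might derive the MV from these signaled elements can be sketched as follows. The parameter names mirror the syntax elements merge_flag, merge_idx, mvp_idx, and the MVD, but the function itself and the use of one shared candidate list for both merge and MV prediction are assumptions made for the example.

```python
def derive_mv(candidates, merge_flag, merge_idx=0, mvp_idx=0, mvd=(0, 0, 0)):
    """Derive the MV of an octree leaf from a list of candidate MVs.

    candidates: MVs of spatio-temporal neighbors (and, optionally,
                history-based entries), in an order known to both sides.
    """
    if merge_flag:
        # Merge: reuse the selected neighbor's MV; no MVD is signaled.
        return candidates[merge_idx]
    # Otherwise: MV = predicted MV + signaled difference, per component.
    mvp = candidates[mvp_idx]
    return tuple(p + d for p, d in zip(mvp, mvd))

candidates = [(2, 0, -1), (4, 1, 0)]  # hypothetical neighbor MVs
merged = derive_mv(candidates, merge_flag=True, merge_idx=1)
predicted_plus_diff = derive_mv(candidates, merge_flag=False, mvp_idx=0, mvd=(1, -1, 0))
```

Signaling only an index plus a small difference is cheaper than signaling the full MV whenever neighboring leaves move similarly.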

G-PCC encoder 200 and/or G-PCC decoder 300 may be configured to skip the primary residual.

With good inter prediction, the primary residual, which applies when angular mode is enabled, is usually small or even close to zero. In that case, it is possible to skip the primary residual entirely for all points in the octree leaf volume. Accordingly, a primary_residual_skip_flag may be signaled for the octree leaf volume. In this case, the difference between the original points and the predicted points is coded entirely with the secondary residual.

Alternatively, primary_residual_skip_flag may be signaled at an octree level higher than the octree leaf volume and may apply to the one or more octree leaves associated with that octree level.

The following table is the syntax table for an inter-predicted octree leaf.

Octree_leaf(X0, Y0, Z0) {
  if(number_of_references > 1)
    ref_idx                                        ae(v)
  num_points_minus1                                ae(v)
  merge_flag                                       ae(v)
  if(!merge_flag) {
    mvp_idx                                        ae(v)
    for(i = 0; i < 3; i++) {
      abs_mvd[i]                                   ae(v)
      if(!abs_mvd[i])
        mvd_sign[i]                                ae(v)
    }
  } else
    merge_idx                                      ae(v)
  if(geom_angular_enabled_flag)
    primary_residual_skip_flag                     ae(v)
  for(n = 0; n <= num_points_minus1; n++) {
    for(i = 0; i < 3; i++) {
      if(!primary_residual_skip_flag) {
        abs_primary_residual[n][i]                 ae(v)
        if(!abs_primary_residual[n][i])
          primary_residual_sign[n][i]              ae(v)
      }
      abs_secondary_residual[n][i]                 ae(v)
      if(!abs_secondary_residual[n][i])
        secondary_residual_sign[n][i]              ae(v)
    }
  }
}

The examples in the various aspects of this disclosure may be used individually or in any combination.

FIG. 14 is a conceptual diagram illustrating an example ranging system 600 that may be used with one or more techniques of this disclosure. In the example of FIG. 14, ranging system 600 includes an illuminator 602 and a sensor 604. Illuminator 602 may emit light 606. In some examples, illuminator 602 may emit light 606 as one or more laser beams. Light 606 may be in one or more wavelengths, such as an infrared wavelength or a visible light wavelength. In other examples, light 606 is not coherent laser light. When light 606 encounters an object, such as object 608, light 606 creates returning light 610. Returning light 610 may include backscattered and/or reflected light. Returning light 610 may pass through a lens 611 that directs returning light 610 to create an image 612 of object 608 on sensor 604. Sensor 604 generates signals 618 based on image 612. Image 612 may comprise a set of points (e.g., as represented by the dots in image 612 of FIG. 14).

In some examples, illuminator 602 and sensor 604 may be mounted on a spinning structure so that illuminator 602 and sensor 604 capture a 360-degree view of the environment. In other examples, ranging system 600 may include one or more optical components (e.g., mirrors, collimators, diffraction gratings, etc.) that enable illuminator 602 and sensor 604 to detect objects within a specific range (e.g., up to 360 degrees). Although the example of FIG. 14 shows only a single illuminator 602 and sensor 604, ranging system 600 may include multiple sets of illuminators and sensors.

In some examples, illuminator 602 generates a structured light pattern. In such examples, ranging system 600 may include multiple sensors 604 upon which respective images of the structured light pattern are formed. Ranging system 600 may use disparities between the images of the structured light pattern to determine the distance to an object 608 from which the structured light pattern backscatters. Structured-light-based ranging systems may have a high level of accuracy (e.g., accuracy in the sub-millimeter range) when object 608 is relatively close to sensor 604 (e.g., 0.2 meters to 2 meters). This high level of accuracy may be useful in facial recognition applications, such as unlocking mobile devices (e.g., mobile phones, tablet computers, etc.), and in security applications.

In some examples, ranging system 600 is a time-of-flight (ToF) based system. In some examples where ranging system 600 is a ToF-based system, illuminator 602 generates pulses of light. In other words, illuminator 602 may modulate the amplitude of emitted light 606. In such examples, sensor 604 detects returning light 610 from the pulses of light 606 generated by illuminator 602. Ranging system 600 may then determine the distance to object 608, from which light 606 backscatters, based on the delay between when light 606 was emitted and when it was detected, and on the known speed of light in air. In some examples, rather than (or in addition to) modulating the amplitude of emitted light 606, illuminator 602 may modulate the phase of emitted light 606. In such examples, sensor 604 may detect the phase of returning light 610 from object 608 and determine the distances to points on object 608 using the speed of light and the time difference between when illuminator 602 generated light 606 at a specific phase and when sensor 604 detected returning light 610 at that specific phase.
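The pulsed ToF relation described above can be worked through numerically. The sketch below assumes the measured delay covers the round trip from illuminator to object and back, so the one-way distance is half of speed times delay; it uses the vacuum speed of light as an approximation for air, and the function name is hypothetical.

```python
# Speed of light in vacuum (exact by definition); air is slower by roughly 0.03%.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def tof_distance_m(round_trip_delay_s):
    """One-way distance to the object from a round-trip pulse delay."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_delay_s / 2.0

# A 20 ns round trip corresponds to an object about 3 m away.
distance = tof_distance_m(20e-9)
```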

In other examples, point clouds may be generated without using illuminator 602. For instance, in some examples, sensor 604 of ranging system 600 may include two or more optical cameras. In such examples, ranging system 600 may use the optical cameras to capture stereo images of the environment, including object 608. Ranging system 600 (e.g., point cloud generator 620) may then calculate the disparities between locations in the stereo images. Ranging system 600 may then use the disparities to determine the distances to the locations shown in the stereo images. From these distances, point cloud generator 620 may generate a point cloud.
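One common way to turn a disparity into a distance is the rectified pinhole stereo relation Z = f * B / d. The text does not say which model ranging system 600 uses, so the sketch below is an assumption: it presumes two rectified cameras with focal length f (in pixels) and baseline B (in meters), and all names and values are illustrative.

```python
def stereo_depth_m(focal_length_px, baseline_m, disparity_px):
    """Depth of a point seen by two rectified pinhole cameras: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_length_px * baseline_m / disparity_px

# f = 700 px, B = 0.12 m, d = 21 px gives a depth of about 4 m.
depth = stereo_depth_m(focal_length_px=700.0, baseline_m=0.12, disparity_px=21.0)
```

Because depth is inversely proportional to disparity, nearby objects produce large disparities and are measured more precisely than distant ones.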

Sensor 604 may also detect other attributes of object 608, such as color and reflectance information. In the example of FIG. 14, a point cloud generator 620 may generate a point cloud based on signals 618 generated by sensor 604. Ranging system 600 and/or point cloud generator 620 may form part of data source 104 (FIG. 1).

FIG. 15 is a conceptual diagram illustrating an example vehicle-based scenario in which one or more techniques of this disclosure may be used. In the example of FIG. 15, a vehicle 700 includes a laser package 702, such as a LIDAR system. Although not shown in the example of FIG. 15, vehicle 700 may also include a data source and a G-PCC encoder, such as G-PCC encoder 200 (FIG. 1). In the example of FIG. 15, laser package 702 emits laser beams 704 that reflect off pedestrians 706 or other objects in a roadway. The data source of vehicle 700 may generate a point cloud based on signals generated by laser package 702. The G-PCC encoder of vehicle 700 may encode the point cloud to generate bitstream 708. Bitstream 708 may include many fewer bits than the unencoded point cloud obtained by the G-PCC encoder. An output interface of vehicle 700 (e.g., output interface 108 (FIG. 1)) may transmit bitstream 708 to one or more other devices. Vehicle 700 may thus be able to transmit bitstream 708 to other devices more quickly than the unencoded point cloud data. Additionally, bitstream 708 may require less data storage capacity.

In the example of FIG. 15, vehicle 700 may transmit bitstream 708 to another vehicle 710. Vehicle 710 may include a G-PCC decoder, such as G-PCC decoder 300 (FIG. 1). The G-PCC decoder of vehicle 710 may decode bitstream 708 to reconstruct the point cloud. Vehicle 710 may use the reconstructed point cloud for various purposes. For instance, vehicle 710 may determine, based on the reconstructed point cloud, that pedestrians 706 are in the roadway ahead of vehicle 700 and therefore start slowing down (e.g., even before a driver of vehicle 710 realizes that pedestrians 706 are in the roadway). Thus, in some examples, vehicle 710 may perform an autonomous navigation operation, generate a notification or warning, or perform another action based on the reconstructed point cloud.

Additionally or alternatively, vehicle 700 may transmit bitstream 708 to a server system 712. Server system 712 may use bitstream 708 for various purposes. For example, server system 712 may store bitstream 708 for subsequent reconstruction of the point clouds. In this example, server system 712 may use the point clouds along with other data (e.g., vehicle telemetry data generated by vehicle 700) to train an autonomous driving system. In other examples, server system 712 may store bitstream 708 for subsequent reconstruction for forensic crash investigations (e.g., if vehicle 700 collides with pedestrians 706), or may transmit notifications or instructions for navigation to vehicle 700 or vehicle 710.

FIG. 16 is a conceptual diagram illustrating an example extended reality system in which one or more techniques of this disclosure may be used. Extended reality (XR) is a term used to cover a range of technologies that includes augmented reality (AR), mixed reality (MR), and virtual reality (VR). In the example of FIG. 16, a user 800 is located in a first location 802. User 800 wears an XR headset 804. As an alternative to XR headset 804, user 800 may use a mobile device (e.g., a mobile phone, tablet computer, etc.). XR headset 804 includes a depth detection sensor, such as a LIDAR system, that detects the positions of points on objects 806 at first location 802. A data source of XR headset 804 may use the signals generated by the depth detection sensor to generate a point cloud representation of objects 806 at location 802. XR headset 804 may include a G-PCC encoder (e.g., G-PCC encoder 200 of FIG. 1) that is configured to encode the point cloud to generate bitstream 808.

XR headset 804 may transmit bitstream 808 (e.g., via a network such as the Internet) to an XR headset 810 worn by a user 812 at a second location 814. XR headset 810 may decode bitstream 808 to reconstruct the point cloud. XR headset 810 may use the point cloud to generate an XR visualization (e.g., an AR, MR, or VR visualization) representing objects 806 at first location 802. Thus, in some examples, such as when XR headset 810 generates a VR visualization, user 812 at location 814 may have a 3D immersive experience of first location 802. In some examples, XR headset 810 may determine a position of a virtual object based on the reconstructed point cloud. For instance, XR headset 810 may determine, based on the reconstructed point cloud, that an environment (e.g., first location 802) includes a flat surface, and then determine that a virtual object (e.g., a cartoon character) is to be positioned on the flat surface. XR headset 810 may generate an XR visualization in which the virtual object is at the determined position. For instance, XR headset 810 may show the cartoon character sitting on the flat surface.

FIG. 17 is a conceptual diagram illustrating an example mobile device system in which one or more techniques of this disclosure may be used. In the example of FIG. 17, a mobile device 900, such as a mobile phone or tablet computer, includes a depth detection sensor, such as a LIDAR system, that detects the positions of points on objects 902 in an environment of mobile device 900. A data source of mobile device 900 may use the signals generated by the depth detection sensor to generate a point cloud representation of objects 902. Mobile device 900 may include a G-PCC encoder (e.g., G-PCC encoder 200 of FIG. 1) that is configured to encode the point cloud to generate bitstream 904. In the example of FIG. 17, mobile device 900 may transmit bitstream 904 to a remote device 906, such as a server system or another mobile device. Remote device 906 may decode bitstream 904 to reconstruct the point cloud. Remote device 906 may use the point cloud for various purposes. For example, remote device 906 may use the point cloud to generate a map of the environment of mobile device 900. For instance, remote device 906 may generate a map of the interior of a building based on the reconstructed point cloud. In another example, remote device 906 may generate imagery (e.g., computer graphics) based on the point cloud. For instance, remote device 906 may use the points of the point cloud as vertices of polygons and use the color attributes of the points as the basis for shading the polygons. In some examples, remote device 906 may use the point cloud to perform facial recognition.

FIG. 18 is a flowchart illustrating an example operation for decoding a bitstream that includes point cloud data. G-PCC decoder 300 may perform the operation of FIG. 18 as part of decoding a point cloud. In the example of FIG. 18, G-PCC decoder 300 determines an octree that defines an octree-based splitting of a space containing the point cloud (1000). A leaf node of the octree contains one or more points of the point cloud.

G-PCC decoder 300 directly decodes the position of each of the one or more points in the leaf node (1002). To directly decode the position of each of the one or more points in the leaf node, G-PCC decoder 300 generates a prediction of the one or more points (1004) and determines the one or more points based on the prediction (1006). To directly decode the position of each of the one or more points in the leaf node, G-PCC decoder 300 may be configured to receive a flag, where a first value of the flag indicates that the prediction of the one or more points is generated by intra prediction and a second value of the flag indicates that the prediction of the one or more points is generated by inter prediction, and to decode the one or more points using intra prediction or inter prediction based on the value of the flag.

G-PCC decoder 300 may be configured to receive, in the bitstream for the point cloud, an octree leaf volume that specifies the volume of the leaf node. For example, suppose the entire point cloud is enclosed in a W x W x W cube. The point cloud can be split recursively, and for a given split depth d, the octree leaf volume is W/2^d x W/2^d x W/2^d. At this level, a binary occupancy flag may be signaled; a value of 1 indicates that the cube contains at least one point. When the occupancy flag is 1, an additional intra-or-inter flag may be signaled to indicate whether the points within the cube are intra predicted or inter predicted.
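The leaf-volume arithmetic above can be illustrated with a short sketch. It assumes integer point coordinates in the range 0 to W-1 and a W that is divisible by 2^d; the function names `leaf_size`, `leaf_index`, and `occupied_leaves` are hypothetical and are not part of any G-PCC syntax.

```python
from typing import Iterable, Set, Tuple

Point = Tuple[int, int, int]


def leaf_size(W: int, d: int) -> int:
    """Edge length W / 2^d of an octree node at split depth d."""
    if d < 0 or W % (1 << d) != 0:
        raise ValueError("W must be divisible by 2^d (assumption of this sketch)")
    return W >> d


def leaf_index(point: Point, W: int, d: int) -> Point:
    """Integer (i, j, k) index of the depth-d leaf cube that contains the point."""
    s = leaf_size(W, d)
    return tuple(c // s for c in point)


def occupied_leaves(points: Iterable[Point], W: int, d: int) -> Set[Point]:
    """Leaves whose occupancy flag would equal 1 (they hold at least one point)."""
    return {leaf_index(p, W, d) for p in points}


if __name__ == "__main__":
    pts = [(130, 5, 1000), (131, 6, 1001), (900, 900, 900)]
    print(leaf_size(1024, 3))                     # edge length of each leaf cube
    print(sorted(occupied_leaves(pts, 1024, 3)))  # only these leaves carry an intra/inter flag
```

In this sketch only the leaves returned by `occupied_leaves` would go on to carry the additional intra-or-inter flag; every other leaf would be signaled with an occupancy flag of 0.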

To generate the prediction of the one or more points, G-PCC decoder 300 may further be configured to use intra prediction, and to generate the prediction using intra prediction, G-PCC decoder 300 may further be configured to determine a local prediction tree for the one or more points.
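As a rough illustration of how a local prediction tree can yield a predicted position, the sketch below predicts a point from its ancestors in the tree. The four predictor modes (none, delta, linear, parallelogram) and the name `predict_position` are assumptions chosen for illustration; this paragraph does not specify the predictor set.

```python
from typing import Optional, Tuple

Point = Tuple[int, int, int]


def predict_position(mode: int,
                     p0: Optional[Point] = None,
                     p1: Optional[Point] = None,
                     p2: Optional[Point] = None) -> Point:
    """Predict a point from its parent (p0), grandparent (p1) and
    great-grandparent (p2) in a local prediction tree.

    The mode numbering is hypothetical:
      0: no prediction, 1: delta, 2: linear, 3: parallelogram.
    """
    if mode == 0:
        return (0, 0, 0)
    if mode == 1 and p0 is not None:
        return p0
    if mode == 2 and p0 is not None and p1 is not None:
        # Extrapolate along the line from grandparent to parent.
        return tuple(2 * a - b for a, b in zip(p0, p1))
    if mode == 3 and None not in (p0, p1, p2):
        # Complete the parallelogram spanned by the three ancestors.
        return tuple(a + b - c for a, b, c in zip(p0, p1, p2))
    raise ValueError("mode needs ancestors that are not available")


if __name__ == "__main__":
    print(predict_position(2, (4, 4, 4), (3, 2, 1)))
    print(predict_position(3, (4, 4, 4), (3, 2, 1), (1, 1, 1)))
```

Because the tree is local to one octree leaf, the ancestors used here would all lie within the same leaf volume in this sketch.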

To determine the one or more points based on the prediction, G-PCC decoder 300 may be configured to receive, in the bitstream for the point cloud, at least one of a prediction mode, a primary residual, and a secondary residual for each of the one or more points. To generate the prediction of the one or more points, G-PCC decoder 300 may be configured to use inter prediction, and to generate the prediction using inter prediction, G-PCC decoder 300 may further be configured to perform motion estimation with the one or more points to determine a similar set of points in a reference point cloud frame.
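The step of determining points from a prediction plus signaled residuals can be sketched as follows. This is a deliberate simplification: it assumes that the primary and secondary residuals are both additive corrections in the same Cartesian coordinate system, which the text does not state, and `reconstruct_points` is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int, int]


def reconstruct_points(predictions: Sequence[Point],
                       primary: Sequence[Point],
                       secondary: Sequence[Point]) -> List[Point]:
    """Add per-point residuals to per-point predictions.

    Assumption of this sketch: reconstructed = prediction + primary + secondary,
    applied component-wise.
    """
    if not (len(predictions) == len(primary) == len(secondary)):
        raise ValueError("one residual pair is required per predicted point")
    return [
        tuple(p + r1 + r2 for p, r1, r2 in zip(pred, res1, res2))
        for pred, res1, res2 in zip(predictions, primary, secondary)
    ]


if __name__ == "__main__":
    preds = [(10, 10, 10), (20, 20, 20)]
    res1 = [(1, 0, -1), (0, 0, 0)]
    res2 = [(0, 0, 0), (0, 2, 0)]
    print(reconstruct_points(preds, res1, res2))
```

The same reconstruction would apply whether the prediction came from intra prediction or from inter prediction; only the source of `predictions` differs.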

To generate the prediction of the one or more points, G-PCC decoder 300 may further be configured to use inter prediction, and to generate the prediction using inter prediction, G-PCC decoder 300 may further be configured to perform motion compensation based on a set of points in a reference point cloud frame to predict the one or more points. To perform motion compensation, G-PCC decoder 300 may further be configured to apply a motion vector to the set of points in the reference point cloud frame to determine the prediction of the one or more points. G-PCC decoder 300 may be configured to predict the motion vector based on motion vectors of spatio-temporally neighboring octree leaves.
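A minimal sketch of motion compensation and motion vector prediction follows. It assumes a purely translational motion vector applied to every reference point, and it uses a component-wise median of the neighboring leaves' motion vectors as the predictor; the median rule and the names `motion_compensate` and `predict_motion_vector` are illustrative assumptions, not taken from the text.

```python
from statistics import median_low
from typing import List, Sequence, Tuple

Vec3 = Tuple[int, int, int]


def motion_compensate(ref_points: Sequence[Vec3], mv: Vec3) -> List[Vec3]:
    """Translate each point of the reference set by the motion vector."""
    return [tuple(c + m for c, m in zip(p, mv)) for p in ref_points]


def predict_motion_vector(neighbor_mvs: Sequence[Vec3]) -> Vec3:
    """Component-wise median of neighboring leaves' motion vectors.

    Falls back to the zero vector when no neighbor has a motion vector.
    """
    if not neighbor_mvs:
        return (0, 0, 0)
    return tuple(median_low(component) for component in zip(*neighbor_mvs))


if __name__ == "__main__":
    neighbors = [(1, 0, 0), (3, 2, 0), (2, 5, 0)]
    mv_pred = predict_motion_vector(neighbors)
    mv_delta = (0, -1, 1)  # hypothetical signaled difference
    mv = tuple(a + b for a, b in zip(mv_pred, mv_delta))
    print(motion_compensate([(1, 2, 3), (4, 5, 6)], mv))
```

Predicting the motion vector from neighbors means that, in a scheme like this, only the difference from the predictor would need to be signaled for the current leaf.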

G-PCC decoder 300 may further be configured to reconstruct the point cloud from the point cloud data. As part of reconstructing the point cloud, G-PCC decoder 300 may further be configured to determine positions of one or more points of the point cloud based on a planar position.

It is to be recognized that, depending on the example, certain acts or events of any of the techniques described herein can be performed in a different sequence, and may be added, merged, or left out altogether (e.g., not all described acts or events are necessary for the practice of the techniques). Moreover, in certain examples, acts or events may be performed concurrently rather than sequentially, e.g., through multi-threaded processing, interrupt processing, or multiple processors.

In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media that is non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.

By way of example, and not limitation, such computer-readable storage media can include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the terms "processor" and "processing circuitry," as used herein, may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.

The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.

The following numbered clauses illustrate one or more aspects of the devices and techniques described in this disclosure.

Clause 1A: A method of coding a point cloud, the method comprising: determining an octree that defines an octree-based splitting of a space containing the point cloud, wherein: a leaf node of the octree contains one or more points of the point cloud, and a position of each of the one or more points in the leaf node is directly signaled; generating a prediction of the one or more points using intra prediction or inter prediction; and coding a syntax element that indicates whether the one or more points are predicted using intra prediction or inter prediction.

Clause 2A: The method of clause 1A, wherein an octree leaf volume specifying a volume of the leaf node is signaled in a bitstream.

Clause 3A: The method of clause 1A or 2A, wherein: generating the prediction of the one or more points comprises generating the prediction of the one or more points using intra prediction, and generating the prediction of the one or more points using intra prediction comprises determining a local prediction tree for the one or more points.

Clause 4A: The method of clause 3A, wherein at least one of a prediction mode, a primary residual, and a secondary residual is signaled for each of the one or more points.

Clause 5A: The method of clause 1A or 2A, wherein: generating the prediction of the one or more points comprises generating the prediction of the one or more points using inter prediction, and generating the prediction of the one or more points using inter prediction comprises performing motion estimation with the one or more points to determine a similar set of points in a reference point cloud frame.

Clause 6A: The method of any of clauses 1A, 2A, or 5A, wherein: generating the prediction of the one or more points comprises generating the prediction of the one or more points using inter prediction, and generating the prediction of the one or more points using inter prediction comprises performing motion compensation based on a set of points in a reference point cloud frame to predict the one or more points.

Clause 7A: The method of clause 6A, wherein performing motion compensation comprises applying a motion vector to the set of points in the reference point cloud frame to determine the prediction of the one or more points.

Clause 8A: The method of clause 7A, further comprising predicting the motion vector based on motion vectors of spatio-temporally neighboring octree leaves.

Clause 9A: The method of any of clauses 1A-8A, further comprising generating the point cloud.

Clause 10A: A device for processing a point cloud, the device comprising one or more means for performing the method of any of clauses 1A-9A.

Clause 11A: The device of clause 10A, wherein the one or more means comprise one or more processors implemented in circuitry.

Clause 12A: The device of clause 10A or clause 11A, further comprising a memory to store the data representing the point cloud.

Clause 13A: The device of any of clauses 10A-12A, wherein the device comprises a decoder.

Clause 14A: The device of any of clauses 10A-13A, wherein the device comprises an encoder.

Clause 15A: The device of any of clauses 10A-14A, further comprising a device to generate the point cloud.

Clause 16A: The device of any of clauses 10A-15A, further comprising a display to present imagery based on the point cloud.

Clause 17A: A computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors to perform the method of any of clauses 1A-9A.

Clause 1B: A device for decoding a bitstream that includes point cloud data, the device comprising: a memory to store the point cloud data; and one or more processors coupled to the memory and implemented in circuitry, the one or more processors configured to: determine an octree that defines an octree-based splitting of a space containing the point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and directly decode a position of each of the one or more points in the leaf node, wherein to directly decode the position of each of the one or more points in the leaf node, the one or more processors are further configured to: generate a prediction of the one or more points; and determine the one or more points based on the prediction.

Clause 2B: The device of clause 1B, wherein to directly decode the position of each of the one or more points in the leaf node, the one or more processors are further configured to: receive a flag, wherein a first value of the flag indicates that the prediction of the one or more points is generated by intra prediction and a second value of the flag indicates that the prediction of the one or more points is generated by inter prediction; and decode the one or more points using intra prediction or inter prediction based on a value of the flag.

Clause 3B: The device of clause 1B, wherein the one or more processors are further configured to receive, in the bitstream that includes the point cloud, an octree leaf volume specifying a volume of the leaf node.

Clause 4B: The device of clause 1B, wherein: to generate the prediction of the one or more points, the one or more processors are further configured to generate the prediction of the one or more points using intra prediction, and to generate the prediction of the one or more points using intra prediction, the one or more processors are further configured to determine a local prediction tree for the one or more points.

Clause 5B: The device of clause 1B, wherein to determine the one or more points based on the prediction, the one or more processors are further configured to receive, in the bitstream that includes the point cloud, at least one of a prediction mode, a primary residual, and a secondary residual for each of the one or more points.

Clause 6B: The device of clause 1B, wherein: to generate the prediction of the one or more points, the one or more processors are further configured to generate the prediction of the one or more points using inter prediction, and to generate the prediction of the one or more points using inter prediction, the one or more processors are further configured to perform motion estimation with the one or more points to determine a similar set of points in a reference point cloud frame.

Clause 7B: The device of clause 1B, wherein: to generate the prediction of the one or more points, the one or more processors are further configured to generate the prediction of the one or more points using inter prediction, and to generate the prediction of the one or more points using inter prediction, the one or more processors are further configured to perform motion compensation based on a set of points in a reference point cloud frame to predict the one or more points.

Clause 8B: The device of clause 7B, wherein to perform motion compensation, the one or more processors are further configured to apply a motion vector to the set of points in the reference point cloud frame to determine the prediction of the one or more points.

Clause 9B: The device of clause 8B, wherein the one or more processors are further configured to predict the motion vector based on motion vectors of spatio-temporally neighboring octree leaves.

Clause 10B: The device of clause 1B, wherein the one or more processors are further configured to reconstruct the point cloud from the point cloud data.

Clause 11B: The device of clause 10B, wherein the one or more processors are configured to, as part of reconstructing the point cloud, determine positions of one or more points of the point cloud based on the planar position.

Clause 11B: The device of clause 10B, wherein the one or more processors are configured to, as part of reconstructing the point cloud, determine positions of the one or more points of the point cloud based on the directly decoded position of each of the one or more points in the leaf node.

Clause 12B: The device of clause 11B, wherein the one or more processors are further configured to generate a map of an interior of a building based on the point cloud.

Clause 13B: The device of clause 11B, wherein the one or more processors are further configured to perform an autonomous navigation operation based on the point cloud.

Clause 14B: The device of clause 11B, wherein the one or more processors are further configured to generate computer graphics based on the point cloud.

Clause 15B: The device of clause 11B, wherein the one or more processors are configured to: determine a position of a virtual object based on the point cloud; and generate an extended reality (XR) visualization in which the virtual object is at the determined position.

Clause 16B: The device of clause 11B, further comprising a display to present imagery based on the point cloud.

Clause 17B: The device of clause 1B, wherein the device is a mobile phone or a tablet computer.

Clause 18B: The device of clause 1B, wherein the device is a vehicle.

Clause 19B: The device of clause 1B, wherein the device is an extended reality device.

Clause 20B: A method of decoding point cloud data, the method comprising: determining an octree that defines an octree-based splitting of a space containing the point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and directly decoding a position of each of the one or more points in the leaf node, wherein directly decoding the position of each of the one or more points in the leaf node comprises: generating a prediction of the one or more points; and determining the one or more points based on the prediction.

Clause 21B: The method of clause 20B, wherein directly decoding the position of each of the one or more points in the leaf node further comprises: receiving a flag, wherein a first value of the flag indicates that the prediction of the one or more points is generated by intra prediction and a second value of the flag indicates that the prediction of the one or more points is generated by inter prediction; and decoding the one or more points using intra prediction or inter prediction based on a value of the flag.

Clause 22B: The method of clause 20B, further comprising receiving, in a bitstream for the point cloud, an octree leaf volume specifying a volume of the leaf node.

Clause 23B: The method of clause 20B, wherein: generating the prediction of the one or more points comprises generating the prediction of the one or more points using intra prediction, and generating the prediction of the one or more points using intra prediction comprises determining a local prediction tree for the one or more points.

Clause 24B: The method of clause 20B, wherein determining the one or more points based on the prediction comprises receiving, in a bitstream for the point cloud, at least one of a prediction mode, a primary residual, and a secondary residual for each of the one or more points.

Clause 25B: The method of clause 20B, wherein: generating the prediction of the one or more points comprises generating the prediction of the one or more points using inter prediction, and generating the prediction of the one or more points using inter prediction comprises performing motion estimation with the one or more points to determine a similar set of points in a reference point cloud frame.

Clause 26B: The method of clause 20B, wherein: generating the prediction of the one or more points comprises generating the prediction of the one or more points using inter prediction, and generating the prediction of the one or more points using inter prediction comprises performing motion compensation based on a set of points in a reference point cloud frame to predict the one or more points.

Clause 27B: The method of clause 26B, wherein performing motion compensation comprises applying a motion vector to the set of points in the reference point cloud frame to determine the prediction of the one or more points.

Clause 28B: The method of clause 27B, further comprising predicting the motion vector based on motion vectors of spatio-temporally neighboring octree leaves.

Clause 29B: A computer-readable storage medium storing instructions that, when executed by one or more processors, cause the one or more processors to: determine an octree that defines an octree-based splitting of a space containing a point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and directly decode a position of each of the one or more points in the leaf node, wherein to directly decode the position of each of the one or more points in the leaf node, the instructions cause the one or more processors to: generate a prediction of the one or more points; and determine the one or more points based on the prediction.

Clause 30B: An apparatus comprising: means for determining an octree that defines an octree-based splitting of a space containing a point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and means for directly decoding a position of each of the one or more points in the leaf node, wherein the means for directly decoding the position of each of the one or more points in the leaf node comprises: means for generating a prediction of the one or more points; and means for determining the one or more points based on the prediction.

條款1C：一種用於對包括點雲資料的位元流進行解碼的設備，所述設備包括：用於儲存所述點雲資料的記憶體；以及耦合到所述記憶體並且在電路中實現的一個或多個處理器，所述一個或多個處理器被配置為：確定定義包含所述點雲的空間的基於八叉樹的拆分的八叉樹，其中，所述八叉樹的葉節點包含所述點雲的一個或多個點；以及直接對在所述葉節點中的所述一個或多個點中的每個點的位置進行解碼，其中，為了直接對在所述葉節點中的所述一個或多個點中的每個點的所述位置進行解碼，所述一個或多個處理器還被配置為：生成所述一個或多個點的預測；以及基於所述預測來確定所述一個或多個點。Clause 1C: An apparatus for decoding a bitstream comprising point cloud data, the apparatus comprising: a memory for storing the point cloud data; and a memory coupled to the memory and implemented in a circuit one or more processors configured to: determine an octree that defines an octree-based split of a space containing the point cloud, wherein the leaves of the octree a node containing one or more points of the point cloud; and directly decoding the position of each of the one or more points in the leaf node, wherein in order to directly decode the point in the leaf node decoding the position of each of the one or more points in the one or more processors, the one or more processors are further configured to: generate a prediction of the one or more points; and based on the prediction to determine the one or more points.

條款2C：根據條款1C所述的設備，其中，為了直接對在所述葉節點中的所述一個或多個點中的每個點的所述位置進行解碼，所述一個或多個處理器還被配置為：接收標誌，其中，所述標誌的第一值指示所述一個或多個點的所述預測是通過幀內預測而生成的，並且所述標誌的第二值指示所述一個或多個點的所述預測是通過幀間預測而生成的；以及基於所述標誌的值，使用幀內預測或幀間預測來對所述一個或多個點進行解碼。Clause 2C: The apparatus of Clause 1C, wherein, to directly decode the location of each of the one or more points in the leaf node, the one or more processors is further configured to: receive a flag, wherein a first value of the flag indicates that the prediction of the one or more points was generated by intra prediction, and a second value of the flag indicates the one The prediction of the point or points is generated by inter prediction; and the one or more points are decoded using intra prediction or inter prediction based on the value of the flag.

條款3C：根據條款1C或2C所述的設備，其中，所述一個或多個處理器還被配置為：在包括所述點雲的所述位元流中接收指定所述葉節點的體積的八叉樹葉體積。Clause 3C: The apparatus of Clause 1C or 2C, wherein the one or more processors are further configured to receive, in the bitstream including the point cloud, a volume specifying the leaf node Octopus leaf volume.

條款4C：根據條款1C-3C中任一項所述的設備，其中：為了生成所述一個或多個點的所述預測，所述一個或多個處理器還被配置為使用幀內預測來生成所述一個或多個點的所述預測，以及為了使用幀內預測來生成所述一個或多個點的所述預測，所述一個或多個處理器還被配置為確定用於所述一個或多個點的局部預測樹。Clause 4C: The apparatus of any of clauses 1C-3C, wherein: to generate the prediction for the one or more points, the one or more processors are further configured to use intra prediction to generating the prediction for the one or more points, and in order to generate the prediction for the one or more points using intra prediction, the one or more processors are further configured to determine for the A local prediction tree for one or more points.

條款5C：根據條款1C-4C中任一項所述的設備，其中，為了基於所述預測來確定所述一個或多個點，所述一個或多個處理器還被配置為：在包括所述點雲的所述位元流中接收用於所述一個或多個點中的每個點的預測模式、主殘差和次殘差中的至少一項。Clause 5C: The apparatus of any of Clauses 1C-4C, wherein, to determine the one or more points based on the prediction, the one or more processors are further configured to: At least one of a prediction mode, a primary residual, and a secondary residual for each of the one or more points is received in the bitstream of the point cloud.

條款6C：根據條款1C-5C中任一項所述的設備，其中：為了生成所述一個或多個點的所述預測，所述一個或多個處理器還被配置為使用幀間預測來生成所述一個或多個點的所述預測，並且為了使用幀間預測來生成所述一個或多個點的所述預測，所述一個或多個處理器還被配置為利用所述一個或多個點來執行運動估計，以確定在參考點雲幀中的類似點集合。Clause 6C: The apparatus of any of clauses 1C-5C, wherein: to generate the prediction of the one or more points, the one or more processors are further configured to use inter prediction to generating the prediction of the one or more points, and in order to generate the prediction of the one or more points using inter prediction, the one or more processors are further configured to utilize the one or more Motion estimation is performed at multiple points to determine a set of similar points in the reference point cloud frame.

條款7C：根據條款1C-5C中任一項所述的設備，其中：為了生成所述一個或多個點的所述預測，所述一個或多個處理器還被配置為使用幀間預測來生成所述一個或多個點的所述預測，並且為了使用幀間預測來生成所述一個或多個點的所述預測，所述一個或多個處理器還被配置為基於在參考點雲幀中的點集合來執行運動補償以預測所述一個或多個點。Clause 7C: The apparatus of any of clauses 1C-5C, wherein: to generate the prediction of the one or more points, the one or more processors are further configured to use inter prediction to generating the prediction of the one or more points, and in order to generate the prediction of the one or more points using inter prediction, the one or more processors are further configured to generate the prediction of the one or more points based on the reference point cloud A set of points in the frame to perform motion compensation to predict the one or more points.

Clause 8C: The apparatus of Clause 7C, wherein, to perform motion compensation, the one or more processors are further configured to: apply a motion vector to the set of points in the reference point cloud frame to determine the prediction of the one or more points.

Clause 9C: The apparatus of Clause 8C, wherein the one or more processors are further configured to: predict the motion vector based on a motion vector of a spatio-temporally neighboring inter octree leaf.

Clause 10C: The apparatus of any of Clauses 1C-9C, wherein the one or more processors are further configured to: reconstruct a point cloud from the point cloud data.

Clause 11C: The apparatus of Clause 10C, wherein the one or more processors are configured to: as part of reconstructing the point cloud, determine positions of the one or more points of the point cloud based on the directly decoded position of each of the one or more points in the leaf node.

Clause 12C: The apparatus of Clause 11C, wherein the one or more processors are further configured to: generate a map of an interior of a building based on the point cloud.

Clause 13C: The apparatus of Clause 11C, wherein the one or more processors are further configured to: perform an autonomous navigation operation based on the point cloud.

Clause 14C: The apparatus of Clause 11C, wherein the one or more processors are further configured to: generate computer graphics based on the point cloud.

Clause 15C: The apparatus of Clause 11C, wherein the one or more processors are configured to: determine a position of a virtual object based on the point cloud; and generate an extended reality (XR) visualization in which the virtual object is located at the determined position.

Clause 16C: The apparatus of any of Clauses 11C-15C, further comprising: a display to present imagery based on the point cloud.

Clause 17C: The apparatus of any of Clauses 1C-16C, wherein the apparatus is a mobile phone or a tablet computer.

Clause 18C: The apparatus of any of Clauses 1C-16C, wherein the apparatus is a vehicle.

Clause 19C: The apparatus of any of Clauses 1C-16C, wherein the apparatus is an extended reality device.

Clause 20C: A method of decoding point cloud data, the method comprising: determining an octree that defines an octree-based splitting of a space containing the point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and directly decoding a position of each of the one or more points in the leaf node, wherein directly decoding the position of each of the one or more points in the leaf node comprises: generating a prediction of the one or more points; and determining the one or more points based on the prediction.

Clause 21C: The method of Clause 20C, wherein directly decoding the position of each of the one or more points in the leaf node further comprises: receiving a flag, wherein a first value of the flag indicates that the prediction of the one or more points is generated by intra prediction, and a second value of the flag indicates that the prediction of the one or more points is generated by inter prediction; and decoding the one or more points using intra prediction or inter prediction based on the value of the flag.

Clause 22C: The method of Clause 20C or 21C, further comprising: receiving, in a bitstream for the point cloud, an octree-leaf volume specifying a volume of the leaf node.

Clause 23C: The method of any of Clauses 20C-22C, wherein: generating the prediction of the one or more points comprises generating the prediction of the one or more points using intra prediction, and generating the prediction of the one or more points using intra prediction comprises determining a local prediction tree for the one or more points.

Clause 24C: The method of any of Clauses 20C-23C, wherein determining the one or more points based on the prediction comprises: receiving, in a bitstream for the point cloud, at least one of a prediction mode, a primary residual, and a secondary residual for each of the one or more points.

Clause 25C: The method of any of Clauses 20C-24C, wherein: generating the prediction of the one or more points comprises generating the prediction of the one or more points using inter prediction, and generating the prediction of the one or more points using inter prediction comprises performing motion estimation using the one or more points to determine a set of similar points in a reference point cloud frame.

Clause 26C: The method of any of Clauses 20C-25C, wherein: generating the prediction of the one or more points comprises generating the prediction of the one or more points using inter prediction, and generating the prediction of the one or more points using inter prediction comprises performing motion compensation based on a set of points in a reference point cloud frame to predict the one or more points.

Clause 27C: The method of Clause 26C, wherein performing motion compensation comprises: applying a motion vector to the set of points in the reference point cloud frame to determine the prediction of the one or more points.

Clause 28C: The method of Clause 27C, further comprising: predicting the motion vector based on a motion vector of a spatio-temporally neighboring inter octree leaf.
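
As a non-limiting illustration of Clauses 20C-28C, the following Python sketch shows one way the per-leaf decoding could be organized. All names (`LeafPayload`, `decode_leaf`, `predict_motion_vector`) and the data layout are hypothetical and are not bitstream syntax defined by this disclosure. For brevity, the sketch makes three simplifying assumptions: intra prediction is reduced to a chain in which each point is predicted from the previously reconstructed point (a degenerate local prediction tree), inter prediction pairs each current point with exactly one reference point, and the motion vector predictor is a component-wise median of neighboring motion vectors.

```python
from dataclasses import dataclass
from statistics import median_low
from typing import List, Tuple

Point = Tuple[int, int, int]


@dataclass
class LeafPayload:
    """Hypothetical container for the values parsed for one octree leaf."""
    inter_flag: bool                   # assumed mapping: False = intra, True = inter
    residuals: List[Point]             # one position residual per point in the leaf
    motion_vector: Point = (0, 0, 0)   # only used when inter_flag is True


def add(a: Point, b: Point) -> Point:
    """Component-wise sum of two 3D integer vectors."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def predict_motion_vector(neighbor_mvs: List[Point]) -> Point:
    """Assumed predictor: component-wise median of neighboring leaves' motion vectors."""
    if not neighbor_mvs:
        return (0, 0, 0)
    xs, ys, zs = zip(*neighbor_mvs)
    return (median_low(xs), median_low(ys), median_low(zs))


def decode_leaf(payload: LeafPayload, leaf_origin: Point,
                reference_points: List[Point]) -> List[Point]:
    """Reconstruct the points of one leaf as prediction plus residual."""
    points: List[Point] = []
    if payload.inter_flag:
        # Motion compensation: shift every reference point by the motion vector.
        predictions = [add(r, payload.motion_vector) for r in reference_points]
        if len(predictions) != len(payload.residuals):
            raise ValueError("sketch assumes one reference point per current point")
        for prediction, residual in zip(predictions, payload.residuals):
            points.append(add(prediction, residual))
    else:
        # Simplified intra prediction: each point is predicted from the previous
        # reconstructed point, starting from the leaf origin.
        previous = leaf_origin
        for residual in payload.residuals:
            previous = add(previous, residual)
            points.append(previous)
    return points
```

In this sketch the flag selects the prediction path once per leaf, and both paths end in the same reconstruction step of adding a residual to a prediction. A practical decoder would additionally handle per-point prediction modes, primary and secondary residuals, and unequal numbers of current and reference points.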

Various examples have been described. These and other examples are within the scope of the following claims.

100: Encoding and decoding system
102: Source device
104: Data source
106: Memory
108: Output interface
110: Computer-readable medium
112: Storage device
114: File server
116: Destination device
118: Data consumer
120: Memory
122: Input interface
200: G-PCC encoder
202: Coordinate transform unit
204: Color transform unit
206: Voxelization unit
208: Attribute transform unit
210: Octree analysis unit
212: Surface approximation analysis unit
214: Arithmetic encoding unit
216: Geometry reconstruction unit
218: RAHT unit
220: LOD generation unit
222: Lifting unit
224: Coefficient quantization unit
226: Arithmetic encoding unit
300: G-PCC decoder
302: Geometry arithmetic decoding unit
304: Attribute arithmetic decoding unit
306: Octree synthesis unit
308: Inverse quantization unit
310: Surface approximation synthesis unit
312: Geometry reconstruction unit
314: RAHT unit
316: LOD generation unit
318: Inverse lifting unit
320: Inverse coordinate transform unit
322: Inverse color transform unit
400: Octree
402: Node
404: Node
412: Node
414A, 414B: Nodes
418A, 418B, 418C, 418D, 418E: Nodes
420: Predicted frame
422: Previous frame
424: Global motion estimation
426: Apply the estimated global motion to the predicted frame
428: Local node motion estimation
430: Motion compensation; encode motion vectors and points
432: Find feature points
434: Sample feature point pairs
436: Perform LMS
440: If not split into 8 child nodes
442: If split into 8 child nodes, similar to octree partitioning
444: Octree
O0 to O12: Points or current point set
R0 to R12: Reference point set
P0 to P12: Motion-compensated reference point set
600: Ranging system
602: Illuminator
604: Sensor
606: Emitted light
608: Object
610: Returned light
611: Lens
612: Image
618: Signal
620: Point cloud generator
700: Vehicle
702: Laser package
704: Laser beam
706: Pedestrian
708: Bitstream
710: Vehicle
712: Server system
800: User
802: First location
804: XR headset
806: Object
808: Bitstream
810: XR headset
812: User
814: Second location
900: Mobile device
902: Object
904: Bitstream
906: Remote device
1000: Determine an octree that defines an octree-based splitting of a space containing the point cloud
1002: Step
1004: Step
1006: Step

FIG. 1 is a block diagram illustrating an example encoding and decoding system that may perform the techniques of this disclosure.

FIG. 2 is a block diagram illustrating an example Geometry Point Cloud Compression (G-PCC) encoder.

FIG. 3 is a block diagram illustrating an example G-PCC decoder.

FIG. 4 is a conceptual diagram illustrating an example octree split for geometry coding.

FIG. 5 is a conceptual diagram illustrating an example of a prediction tree.

FIG. 6 is a conceptual diagram illustrating an example rotating LIDAR acquisition model.

FIG. 7 is a conceptual diagram illustrating an example motion estimation flowchart for InterEM.

FIG. 8 is a conceptual diagram illustrating an example algorithm for estimation of global motion.

FIG. 9 is a conceptual diagram illustrating an example algorithm for estimation of a local node motion vector.

FIG. 10 is a conceptual diagram illustrating an example of high-level octree splitting.

FIG. 11 is a conceptual diagram illustrating an example of local prediction tree generation.

FIG. 12 is a conceptual diagram illustrating an example current point set (O0 to O12) and reference point set (R0 to R12), where N = M = 13.

FIG. 13 is a conceptual diagram illustrating an example current point set and motion-compensated reference point set, where N = M = 13.

FIG. 14 is a conceptual diagram illustrating an example ranging system that may be used with one or more techniques of this disclosure.

FIG. 15 is a conceptual diagram illustrating an example vehicle-based scenario in which one or more techniques of this disclosure may be used.

FIG. 16 is a conceptual diagram illustrating an example extended reality system in which one or more techniques of this disclosure may be used.

FIG. 17 is a conceptual diagram illustrating an example mobile device system in which one or more techniques of this disclosure may be used.

FIG. 18 is a flowchart illustrating example operations for decoding a bitstream that includes point cloud data.

Claims

An apparatus for decoding a bitstream comprising point cloud data, the apparatus comprising: a memory for storing the point cloud data; and one or more processors coupled to the memory and implemented in circuitry, the one or more processors configured to: determine an octree that defines an octree-based splitting of a space containing a point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and directly decode a position of each of the one or more points in the leaf node, wherein, to directly decode the position of each of the one or more points in the leaf node, the one or more processors are further configured to: generate a prediction of the one or more points; and determine the one or more points based on the prediction.

The apparatus of claim 1, wherein, to directly decode the position of each of the one or more points in the leaf node, the one or more processors are further configured to: receive a flag, wherein a first value of the flag indicates that the prediction of the one or more points is generated by intra prediction, and a second value of the flag indicates that the prediction of the one or more points is generated by inter prediction; and decode the one or more points using intra prediction or inter prediction based on the value of the flag.

The apparatus of claim 1, wherein the one or more processors are further configured to receive, in the bitstream including the point cloud, an octree-leaf volume specifying a volume of the leaf node.

The apparatus of claim 1, wherein: to generate the prediction of the one or more points, the one or more processors are further configured to generate the prediction of the one or more points using intra prediction, and to generate the prediction of the one or more points using intra prediction, the one or more processors are further configured to determine a local prediction tree for the one or more points.

The apparatus of claim 1, wherein, to determine the one or more points based on the prediction, the one or more processors are further configured to: receive, in the bitstream, at least one of a prediction mode, a primary residual, and a secondary residual for each of the one or more points.

The apparatus of claim 1, wherein: to generate the prediction of the one or more points, the one or more processors are further configured to generate the prediction of the one or more points using inter prediction, and to generate the prediction of the one or more points using inter prediction, the one or more processors are further configured to perform motion estimation using the one or more points to determine a set of similar points in a reference point cloud frame.

The apparatus of claim 1, wherein: to generate the prediction of the one or more points, the one or more processors are further configured to generate the prediction of the one or more points using inter prediction, and to generate the prediction of the one or more points using inter prediction, the one or more processors are further configured to perform motion compensation based on a set of points in a reference point cloud frame to predict the one or more points.

The apparatus of claim 7, wherein, to perform motion compensation, the one or more processors are further configured to apply a motion vector to the set of points in the reference point cloud frame to determine the prediction of the one or more points.

The apparatus of claim 8, wherein the one or more processors are further configured to predict the motion vector based on a motion vector of a spatio-temporally neighboring inter octree leaf.

The apparatus of claim 1, wherein the one or more processors are further configured to reconstruct the point cloud from the point cloud data.

The apparatus of claim 10, wherein the one or more processors are configured to: as part of reconstructing the point cloud, determine positions of the one or more points of the point cloud based on the directly decoded position of each of the one or more points in the leaf node.

The apparatus of claim 11, wherein the one or more processors are further configured to generate a map of the interior of a building based on the point cloud.

The apparatus of claim 11, wherein the one or more processors are further configured to perform autonomous navigation operations based on the point cloud.

The apparatus of claim 11, wherein the one or more processors are further configured to generate computer graphics based on the point cloud.

The apparatus of claim 11, wherein the one or more processors are configured to: determine a position of a virtual object based on the point cloud; and generate an extended reality (XR) visualization in which the virtual object is located at the determined position.

The apparatus of claim 11, further comprising a display for rendering imagery based on the point cloud.

The apparatus of claim 1, wherein the apparatus is a mobile phone or a tablet computer.

The apparatus of claim 1, wherein the apparatus is a vehicle.

The apparatus of claim 1, wherein the apparatus is an extended reality device.

A method of decoding point cloud data, the method comprising: determining an octree that defines an octree-based splitting of a space containing the point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and directly decoding a position of each of the one or more points in the leaf node, wherein directly decoding the position of each of the one or more points in the leaf node comprises: generating a prediction of the one or more points; and determining the one or more points based on the prediction.

The method of claim 20, wherein directly decoding the position of each of the one or more points in the leaf node further comprises: receiving a flag, wherein a first value of the flag indicates that the prediction of the one or more points is generated by intra prediction, and a second value of the flag indicates that the prediction of the one or more points is generated by inter prediction; and decoding the one or more points using intra prediction or inter prediction based on the value of the flag.

The method of claim 20, further comprising receiving, in a bitstream for the point cloud, an octree leaf volume specifying the volume of the leaf node.

The method of claim 20, wherein: generating the prediction of the one or more points comprises generating the prediction of the one or more points using intra prediction, and generating the prediction of the one or more points using intra prediction comprises determining a local prediction tree for the one or more points.

The method of claim 20, wherein determining the one or more points based on the prediction comprises receiving, in a bitstream for the point cloud, at least one of a prediction mode, a primary residual, and a secondary residual for each of the one or more points.

The method of claim 20, wherein: generating the prediction of the one or more points comprises generating the prediction of the one or more points using inter prediction, and generating the prediction of the one or more points using inter prediction comprises performing motion estimation using the one or more points to determine a set of similar points in a reference point cloud frame.

The method of claim 20, wherein: generating the prediction of the one or more points comprises generating the prediction of the one or more points using inter prediction, and generating the prediction of the one or more points using inter prediction comprises performing motion compensation based on a set of points in a reference point cloud frame to predict the one or more points.

The method of claim 26, wherein performing motion compensation comprises applying a motion vector to the set of points in the reference point cloud frame to determine a prediction of the one or more points.

The method of claim 27, further comprising predicting the motion vector based on a motion vector of a spatio-temporally neighboring inter octree leaf.

A computer-readable storage medium storing instructions that, when executed by one or more processors, cause the one or more processors to: determine an octree that defines an octree-based splitting of a space containing a point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and directly decode a position of each of the one or more points in the leaf node, wherein, to directly decode the position of each of the one or more points in the leaf node, the instructions cause the one or more processors to: generate a prediction of the one or more points; and determine the one or more points based on the prediction.

A device comprising: means for determining an octree that defines an octree-based splitting of a space containing a point cloud, wherein a leaf node of the octree contains one or more points of the point cloud; and means for directly decoding a position of each of the one or more points in the leaf node, wherein the means for directly decoding the position of each of the one or more points in the leaf node comprises: means for generating a prediction of the one or more points; and means for determining the one or more points based on the prediction.