WO2023107868A1 - Adaptive attribute coding for geometry point cloud coding - Google Patents


Info

Publication number
WO2023107868A1
Authority
WO
WIPO (PCT)
Prior art keywords
coding order
residual coding
attribute
geometry
point cloud
Prior art date
Application number
PCT/US2022/080859
Other languages
French (fr)
Inventor
Yue Yu
Original Assignee
Innopeak Technology, Inc.
Priority date
Filing date
Publication date
Application filed by Innopeak Technology, Inc. filed Critical Innopeak Technology, Inc.
Publication of WO2023107868A1 publication Critical patent/WO2023107868A1/en


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 9/00 Image coding
    • G06T 9/001 Model-based coding, e.g. wire frame
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/129 Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • This disclosure relates generally to computer-implemented methods and systems for geometry point cloud coding. Specifically, the present disclosure involves adaptive attribute coding for geometry point cloud coding.
  • Embodiments of the present disclosure relate to point cloud coding.
  • Point clouds are one of the major three-dimensional (3D) data representations, which provide, in addition to spatial coordinates, attributes associated with the points in a 3D world.
  • Point clouds in their raw format require a huge amount of memory for storage or bandwidth for transmission.
  • Advances in point cloud capture technology impose, in turn, an even higher requirement on the size of point clouds.
  • Compression is therefore necessary.
  • Two compression technologies have been proposed for point cloud compression/coding (PCC) standardization activities: video-based PCC (V-PCC) and geometry-based PCC (G-PCC).
  • The V-PCC approach is based on three-dimension (3D) to two-dimension (2D) projections, while G-PCC, by contrast, encodes the content directly in 3D space.
  • Some embodiments involve adaptive attribute coding for geometry point cloud coding.
  • a method for decoding a point cloud from a point cloud bitstream which includes a geometry bitstream and an attribute bitstream includes parsing the attribute bitstream to identify a residual coding order flag configured to specify a residual coding order; determining a value of the residual coding order flag; in response to determining that the value of the residual coding order flag is a first value, determining that the residual coding order for at least a portion of the attribute stream is a first residual coding order; in response to determining that the value of the residual coding order flag is a second value, determining that the residual coding order for at least the portion of the attribute stream is a second residual coding order; decoding at least the portion of the attribute bitstream into attribute residuals according to the determined residual coding order; and reconstructing the point cloud based at least in part upon the decoded attribute residuals.
  • a non-transitory computer-readable medium has program code that is stored thereon.
  • the program code is executable by one or more processing devices for performing operations.
  • the operations include parsing an attribute bitstream of a point cloud bitstream for a point cloud to identify a residual coding order flag configured to specify a residual coding order.
  • the point cloud bitstream includes the attribute bitstream and a geometry bitstream.
  • the operations further include determining a value of the residual coding order flag; in response to determining that the value of the residual coding order flag is a first value, determining that the residual coding order for at least a portion of the attribute stream is a first residual coding order; in response to determining that the value of the residual coding order flag is a second value, determining that the residual coding order for at least the portion of the attribute stream is a second residual coding order; decoding at least the portion of the attribute bitstream into attribute residuals according to the determined residual coding order; and reconstructing the point cloud based at least in part upon the decoded attribute residuals.
  • a system includes a processing device and a non-transitory computer-readable medium communicatively coupled to the processing device.
  • the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations comprising: parsing an attribute bitstream of a point cloud bitstream for a point cloud to identify a residual coding order flag configured to specify a residual coding order, the point cloud bitstream comprising the attribute bitstream and a geometry bitstream; determining a value of the residual coding order flag; in response to determining that the value of the residual coding order flag is a first value, determining that the residual coding order for at least a portion of the attribute stream is a first residual coding order; in response to determining that the value of the residual coding order flag is a second value, determining that the residual coding order for at least the portion of the attribute stream is a second residual coding order; decoding at least the portion of the attribute bitstream into attribute residuals according to the determined residual coding order
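As a minimal sketch of the decoder-side behavior described above, the flag simply selects between the two residual coding orders. The constant names, order tuples, and helper below are illustrative assumptions, not normative G-PCC syntax:

```python
# Illustrative sketch only: the flag values follow the description above
# (first value -> first order, second value -> second order); the order
# tuples and function name are assumptions for illustration.
YUV_RGB_ORDER = ("Y", "U", "V")   # first residual coding order
UYV_GRB_ORDER = ("U", "Y", "V")   # second residual coding order

def select_residual_coding_order(residual_coding_order_flag: int):
    """Map the parsed residual_coding_order_flag to the residual coding
    order used for decoding at least a portion of the attribute bitstream."""
    if residual_coding_order_flag == 0:
        return YUV_RGB_ORDER
    return UYV_GRB_ORDER
```

The decoder would then entropy-decode the attribute residuals component by component in the returned order before reconstructing the point cloud.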
  • FIG. 1 illustrates a block diagram of an exemplary encoding system, according to some embodiments of the present disclosure.
  • FIG. 2 illustrates a block diagram of an exemplary decoding system, according to some embodiments of the present disclosure.
  • FIG. 3 illustrates a detailed block diagram of an exemplary encoder in the encoding system in FIG. 1, according to some embodiments of the present disclosure.
  • FIG. 4 illustrates a detailed block diagram of an exemplary decoder in the decoding system in FIG. 2, according to some embodiments of the present disclosure.
  • FIGS. 5A and 5B illustrate an exemplary octree structure of G-PCC and the corresponding digital representation, respectively, according to some embodiments of the present disclosure.
  • FIG. 6 illustrates an exemplary structure of cube and the relationship with neighboring cubes in an octree structure of G-PCC, according to some embodiments of the present disclosure.
  • FIG. 7 depicts an example of a process for decoding a point cloud from a point cloud bitstream, according to some embodiments of the present disclosure.
  • a point cloud is composed of a collection of points in a 3D space. Each point in the 3D space is associated with a geometry position together with the associated attribute information (e.g., color, reflectance, etc.).
  • the geometry of a point cloud can be compressed first, and then the corresponding attributes, including color or reflectance, can be compressed based upon the geometry information according to a point cloud coding technique, such as G-PCC.
  • G-PCC has been widely used in virtual reality /augmented reality (VR/AR), telecommunication, autonomous vehicle, etc., for entertainment and industrial applications, e.g., light detection and ranging (LiDAR) sweep compression for automotive or robotics and high-definition (HD) map for navigation.
  • The Moving Picture Experts Group (MPEG) released the first version of the G-PCC standard, and the Audio Video Coding Standard (AVS) workgroup has also been developing its own G-PCC technology.
  • the residual coding order can be determined adaptively for the point cloud or for each partition of the point cloud by using the residual coding order that leads to a smaller number of bits in the compressed bitstream.
  • the partition can be a sequence, a frame, a slice, or any type of partition processed by the encoder as a unit when performing the encoding.
  • a residual coding order flag can be used at the corresponding level, such as in the sequence parameter set (SPS) or the slice or attribute header, to indicate which residual coding order will be used for the partition of the point cloud.
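The encoder-side selection described above can be sketched as follows. This is a hedged illustration under the assumption that the encoder simply codes the partition's residuals under both candidate orders and compares sizes; the function name and byte-string inputs are hypothetical:

```python
def choose_residual_coding_order(bits_order0: bytes, bits_order1: bytes):
    """Given the same partition's residuals entropy-coded under each
    candidate residual coding order, keep whichever bitstream is smaller
    and return the flag value (0 -> first order, 1 -> second order) to be
    signaled, e.g., in the SPS or the slice/attribute header."""
    if len(bits_order0) <= len(bits_order1):
        return 0, bits_order0
    return 1, bits_order1
```

Because the flag travels at the partition level, the decision can differ per sequence, frame, or slice at negligible signaling cost.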
  • a unique identifier can be used to indicate the active geometry parameter set in the geometry slice header.
  • this identifier can be used to identify the slice header when the parameters defined in the slice header are used by another partition of the point cloud, such as another slice.
  • another unique identifier can be used and added to the geometry slice header of the slice.
  • FIG. 1 illustrates a block diagram of an exemplary encoding system 100, according to some embodiments of the present disclosure.
  • FIG. 2 illustrates a block diagram of an exemplary decoding system 200, according to some embodiments of the present disclosure.
  • Each system 100 or 200 may be applied or integrated into various systems and apparatus capable of data processing, such as computers and wireless communication devices.
  • system 100 or 200 may be the entirety or part of a mobile phone, a desktop computer, a laptop computer, a tablet, a vehicle computer, a gaming console, a printer, a positioning device, a wearable electronic device, a smart sensor, a virtual reality (VR) device, an augmented reality (AR) device, or any other suitable electronic device having data processing capability.
  • system 100 or 200 may include a processor 102, a memory 104, and an interface 106. These components are shown as connected to one another by a bus, but other connection types are also permitted. It is understood that system 100 or 200 may include any other suitable components for performing functions described here.
  • Processor 102 may include microprocessors, such as graphic processing unit (GPU), image signal processor (ISP), central processing unit (CPU), digital signal processor (DSP), tensor processing unit (TPU), vision processing unit (VPU), neural processing unit (NPU), synergistic processing unit (SPU), or physics processing unit (PPU), microcontroller units (MCUs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), programmable logic devices (PLDs), state machines, gated logic, discrete hardware circuits, and other suitable hardware configured to perform the various functions described throughout the present disclosure.
  • Processor 102 may be a hardware device having one or more processing cores.
  • Processor 102 may execute software.
  • Software shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise.
  • Software can include computer instructions written in an interpreted language, a compiled language, or machine code. Other techniques for instructing hardware are also permitted under the broad category of software.
  • Memory 104 can broadly include both memory (a.k.a, primary/system memory) and storage (a.k.a., secondary memory).
  • memory 104 may include random-access memory (RAM), read-only memory (ROM), static RAM (SRAM), dynamic RAM (DRAM), ferroelectric RAM (FRAM), electrically erasable programmable ROM (EEPROM), compact disc read-only memory (CD-ROM) or other optical disk storage, hard disk drive (HDD), such as magnetic disk storage or other magnetic storage devices, Flash drive, solid-state drive (SSD), or any other medium that can be used to carry or store desired program code in the form of instructions that can be accessed and executed by processor 102.
  • Interface 106 can broadly include a data interface and a communication interface that is configured to receive and transmit a signal in a process of receiving and transmitting information with other external network elements.
  • interface 106 may include input/output (I/O) devices and wired or wireless transceivers.
  • Although one interface 106 is shown in FIGS. 1 and 2, it is understood that multiple interfaces can be included.
  • Processor 102, memory 104, and interface 106 may be implemented in various forms in system 100 or 200 for performing point cloud coding functions.
  • processor 102, memory 104, and interface 106 of system 100 or 200 are implemented (e.g., integrated) on one or more system-on-chips (SoCs).
  • processor 102, memory 104, and interface 106 may be integrated on an application processor (AP) SoC that handles application processing in an operating system (OS) environment, including running point cloud encoding and decoding applications.
  • processor 102, memory 104, and interface 106 may be integrated on a specialized processor chip for point cloud coding, such as a GPU or ISP chip dedicated to graphic processing in a real-time operating system (RTOS).
  • processor 102 may include one or more modules, such as an encoder 101.
  • FIG. 1 shows that encoder 101 is within one processor 102, it is understood that encoder 101 may include one or more sub-modules that can be implemented on different processors located closely or remotely with each other.
  • Encoder 101 (and any corresponding sub-modules or sub-units) can be hardware units (e.g., portions of an integrated circuit) of processor 102 designed for use with other components or software units implemented by processor 102 through executing at least part of a program, i.e., instructions.
  • the instructions of the program may be stored on a computer-readable medium, such as memory 104, and, when executed by processor 102, may perform a process having one or more functions related to point cloud encoding, such as voxelization, transformation, quantization, arithmetic encoding, etc., as described below in detail.
  • processor 102 may include one or more modules, such as a decoder 201.
  • FIG. 2 shows that decoder 201 is within one processor 102, it is understood that decoder 201 may include one or more submodules that can be implemented on different processors located closely or remotely with each other.
  • Decoder 201 (and any corresponding sub-modules or sub-units) can be hardware units (e.g., portions of an integrated circuit) of processor 102 designed for use with other components or software units implemented by processor 102 through executing at least part of a program, i.e., instructions.
  • FIG. 3 illustrates a detailed block diagram of exemplary encoder 101 in encoding system 100 in FIG. 1, according to some embodiments of the present disclosure.
  • encoder 101 may include a coordinate transform module 302, a voxelization module 304, a geometry analysis module 306, and an arithmetic encoding module 308, together configured to encode positions associated with points of a point cloud into a geometry bitstream (i.e., geometry encoding).
  • encoder 101 may also include a color transform module 310, an attribute transform module 312, a quantization module 314, and an arithmetic encoding module 316, together configured to encode attributes associated with points of a point cloud into an attribute bitstream (i.e., attribute encoding). It is understood that each of the elements shown in FIG. 3 is independently shown to represent characteristic functions different from each other in a point cloud encoder, and it does not mean that each component is formed by the configuration unit of separate hardware or single software. That is, each element is included to be listed as an element for convenience of explanation, and at least two of the elements may be combined to form a single element, or one element may be divided into a plurality of elements to perform a function. It is also understood that some of the elements are not necessary elements that perform functions described in the present disclosure but instead may be optional elements for improving performance. It is further understood that these elements may be implemented using electronic hardware, firmware, computer software, or any combination thereof. Whether such elements are implemented as hardware, firmware, or software depends upon the particular application and design constraints imposed on encoder 101. It is still further understood that the modules shown in FIG. 3 are for illustrative purposes only, and in some examples, different modules may be included in encoder 101 for point cloud encoding.
  • attribute coding depends on decoded geometry. Consequently, point cloud positions may be coded first.
  • coordinate transform module 302 and a voxelization module 304 may be configured to perform a coordinate transformation followed by voxelization that quantizes and removes duplicate points. The process of position quantization, duplicate point removal, and assignment of attributes to the remaining points is called voxelization.
  • the voxelized point cloud may be represented using, for example, an octree structure in a lossless manner.
  • Geometry analysis module 306 may be configured to perform geometry analysis using, for example, the octree or trisoup scheme.
  • Arithmetic encoding module 308 may be configured to arithmetically encode the resulting structure from geometry analysis module 306 into the geometry bitstream.
  • geometry analysis module 306 is configured to perform geometry analysis using the octree scheme.
  • a cubical axis-aligned bounding box B may be defined by the two extreme points (0, 0, 0) and (2^d, 2^d, 2^d), where d is the maximum size of the given point cloud along the x, y, or z direction. All point cloud points may be included in this defined cube.
  • a cube may be divided into eight sub-cubes, which creates the octree structure allowing one parent to have eight children, and an octree structure may then be built by recursively subdividing sub-cubes, as shown in FIG. 5A.
  • As shown in FIG. 5B, an 8-bit code may be generated by associating a 1-bit value with each sub-cube to indicate whether it contains points (i.e., full and has value 1) or not (i.e., empty and has value 0). Only full sub-cubes with a size greater than 1 (i.e., non-voxels) may be further subdivided.
  • the geometry information (x, y, z) for one position may be represented by this defined octree structure. Since points may be duplicated, multiple points may be mapped to the same sub-cube of size 1 (i.e., the same voxel). To handle such a situation, the number of points for each sub-cube of dimension 1 is also arithmetically encoded.
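The recursive subdivision and 8-bit occupancy coding described above can be sketched as follows. The child-indexing and bit-ordering conventions here are illustrative choices (the standard fixes its own), and the function name is hypothetical:

```python
def octree_occupancy(points, origin, size):
    """Return the 8-bit occupancy codes of an octree over a cube of side
    `size` (a power of two) anchored at `origin`, in depth-first order.
    Bit (7 - i) of each code is 1 iff child sub-cube i contains at least
    one point; only full sub-cubes larger than a voxel are subdivided."""
    codes = []

    def recurse(pts, ox, oy, oz, s):
        half = s // 2
        buckets = [[] for _ in range(8)]
        for (x, y, z) in pts:
            # Child index from the three coordinate comparisons.
            i = ((x >= ox + half) << 2) | ((y >= oy + half) << 1) | int(z >= oz + half)
            buckets[i].append((x, y, z))
        code = 0
        for i, b in enumerate(buckets):
            if b:
                code |= 1 << (7 - i)
        codes.append(code)
        if half > 1:  # sub-cubes of size 1 are voxels (leaves)
            for i, b in enumerate(buckets):
                if b:
                    recurse(b,
                            ox + half * ((i >> 2) & 1),
                            oy + half * ((i >> 1) & 1),
                            oz + half * (i & 1),
                            half)

    recurse(list(points), *origin, size)
    return codes
```

For a size-2 cube containing points at opposite corners, a single occupancy byte with its first and last bits set is emitted; deeper cubes produce one byte per non-empty internal node.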
  • a current cube associated with a current node may be surrounded by six cubes of the same depth sharing a face with it. Depending on the location of the current cube, one cube may have up to six same-sized cubes sharing one face, as shown in FIG. 6. In addition, the current cube may also have some neighboring cubes that share an edge or a point with the current cube. In some examples, the point cloud may be divided into different partitions at various granularity levels, such as sequences, frames, and slices, with one sequence containing multiple frames and one frame containing multiple slices, and so on.
  • color transform module 310 may be configured to convert red/green/blue (RGB) color attributes of each point to YCbCr (or YUV) color attributes if the attributes include color.
  • Attribute transform module 312 may be configured to perform attribute transformation based on the results from geometry analysis module 306 (e.g., using the octree scheme), including but not limited to, the region adaptive hierarchical transform (RAHT), interpolation-based hierarchical nearest-neighbor prediction (predicting transform), and interpolation-based hierarchical nearest-neighbor prediction with an update/lifting step (lifting transform).
  • RAHT region adaptive hierarchical transform
  • predicting transform interpolation-based hierarchical nearest-neighbor prediction
  • lifting transform interpolation-based hierarchical nearest-neighbor prediction with an update/lifting step
  • quantization module 314 may be configured to quantize the transformed coefficients of attributes from attribute transform module 312 to generate quantization levels of the attributes associated with each point to reduce the dynamic range.
  • Arithmetic encoding module 316 may be configured to arithmetically encode the resulting transformed coefficients of attributes associated with each point or the quantization levels thereof into the attribute bitstream.
  • a prediction may be formed from neighboring coded attributes, for example, in predicting transform and lifting transform by attribute transform module 312. Then, the difference between the current attribute and the prediction (i.e., the residual) may be coded.
  • Morton code or Hilbert code may be used to convert a point cloud in a 3D space (e.g., a point cloud cube) into a 1D array. Each position in the cube will have a corresponding Morton or Hilbert code, but some positions may not have any corresponding point cloud attribute. In other words, some positions may be empty.
  • the attribute coding may follow the predefined Morton order or Hilbert order.
  • a predictor may be generated from the previously coded points in the 1D array following the Morton order or Hilbert order.
  • the attribute difference between the current point and its prediction points may be encoded into the bitstream.
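A common way to compute such a Morton code is to interleave the bits of the three coordinates; the particular bit-plane assignment below (x highest, z lowest) is one illustrative convention among several:

```python
def morton3d(x, y, z, bits=21):
    """Interleave the low `bits` bits of (x, y, z) into one Morton code,
    so points that are close in 3D tend to be close in the 1D order.
    The x-highest bit assignment is an illustrative choice."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)  # x bit -> highest of the triple
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)      # z bit -> lowest of the triple
    return code
```

Sorting points by this code yields the 1D coding order from which the attribute predictor is drawn; Hilbert codes serve the same purpose with better locality at the cost of a more involved mapping.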
  • a pre-defined number has been specified to limit the number of neighboring points that can be used in generating the prediction. For example, only M data points among the previous N consecutively coded points may be used for coding the current attribute. That is, a set of n candidate points may be used as the candidates to select a set of m prediction points (m ≤ n) for predicting the current point in attribute coding.
  • the number n of candidate points in the set is equal to or smaller than the maximum number N of candidate points (n ≤ N), and the number m of prediction points in the set is equal to or smaller than the maximum number M of prediction points (m ≤ M).
  • M and N are set as fixed numbers of 3 and 128, respectively.
  • if more than 128 points have already been coded before the current point, only 3 out of the 128 previously coded neighboring points can be used to form the attribute predictor according to a pre-defined order. If there are fewer than 128 coded points before the current point, all such coded points will be used as candidates to establish the attribute predictor.
  • new Morton or Hilbert codes for these N points will be re-calculated by adding a fixed shift (e.g., 1) to coordinates (x, y, z) of these N data points.
  • the new Morton or Hilbert code for the current position is X.
  • a P-point set before and a Q-point set after the current position according to the new Morton or Hilbert code order will be selected.
  • M points are selected with M closest “distance” between these coded points and the current point.
  • a full search method based on Hilbert code can be applied in the AVS G-PCC attribute coding.
  • the search range can be set as 128 and the number of previous points used to form the predictor is set as M. If more than 128 points have already been coded before the current point, only M out of the 128 previously-coded neighboring points can be used to form the attribute predictor according to the Hilbert order. If there are fewer than 128 coded points before the current one, all such coded points will be used as candidates to form the attribute predictor.
  • M points are selected with M closest “distance” between these coded points and the current point.
  • the distance d defined in Eqn. (1) can be used as one example while other distance metrics can also be used. Once M closest points have been selected, a weighted average of attributes from these M points is formed as the predictor to code the attribute of the current point.
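The selection of the M closest coded points and the weighted-average predictor can be sketched as follows. Manhattan distance stands in for the metric of Eqn. (1), and the inverse-distance weights are an illustrative choice rather than the codec's normative weighting:

```python
def attribute_predictor(current_pos, coded_points, m=3):
    """Pick the m already-coded points closest to the current position and
    return an inverse-distance weighted average of their attributes.
    `coded_points` is a list of ((x, y, z), attribute) pairs; the Manhattan
    metric and 1/(1+d) weights are illustrative stand-ins."""
    def dist(p, q):
        return abs(p[0] - q[0]) + abs(p[1] - q[1]) + abs(p[2] - q[2])
    # Keep the m candidates with the smallest distance to the current point.
    nearest = sorted(coded_points, key=lambda pa: dist(current_pos, pa[0]))[:m]
    weights = [1.0 / (1.0 + dist(current_pos, p)) for p, _ in nearest]
    return sum(w * a for w, (_, a) in zip(weights, nearest)) / sum(weights)
```

The residual actually coded is then the difference between the current point's attribute and this predictor value.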
  • the attribute difference (i.e., the residual) between the current point and its prediction points may be encoded into the bitstream through the run length coding, in which runs of zeros between non-zero residuals are stored as a single data value and count.
  • the non-zero residual may be defined differently depending on the attribute type. For example, if the attribute is a single-value attribute, such as reflectance, the residual for the attribute is non-zero if the residual value is non-zero. If the attribute is a multi-value attribute, such as a YUV/RGB color attribute, the residual for the attribute is non-zero if any of the three components is non-zero. In other words, the residual for a color attribute is zero only when all three components are zero. Each value of the three components in a non-zero residual needs to be encoded into the bitstream.
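This run-length scheme, including the multi-component zero test, can be sketched as follows; the (run, value) pair format is an illustrative serialization, not the codec's actual syntax:

```python
def is_zero(residual):
    """A multi-value (e.g., color) residual is zero only when every
    component is zero; a scalar residual is zero when its value is zero."""
    if isinstance(residual, tuple):
        return all(c == 0 for c in residual)
    return residual == 0

def run_length_encode(residuals):
    """Store each run of zero residuals as a single count paired with the
    non-zero residual that ends it; a trailing all-zero run is emitted as
    (run, None) in this sketch."""
    out, run = [], 0
    for r in residuals:
        if is_zero(r):
            run += 1
        else:
            out.append((run, r))
            run = 0
    if run:
        out.append((run, None))
    return out
```

Each non-zero residual in the output would then have its components entropy-coded in the signaled residual coding order.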
  • FIG. 4 illustrates a detailed block diagram of exemplary decoder 201 in decoding system 200 in FIG. 2, according to some embodiments of the present disclosure.
  • decoder 201 may include an arithmetic decoding module 402, a geometry synthesis module 404, a reconstruction module 406, and a coordinate inverse transform module 408, together configured to decode positions associated with points of a point cloud from the geometry bitstream (i.e., geometry decoding).
  • decoder 201 may also include an arithmetic decoding module 410, a dequantization module 412, an attribute inverse transform module 414, and a color inverse transform module 416, together configured to decode attributes associated with points of a point cloud from the attribute bitstream (i.e., attribute decoding).
  • each of the elements shown in FIG. 4 is independently shown to represent characteristic functions different from each other in a point cloud decoder, and it does not mean that each component is formed by the configuration unit of separate hardware or single software. That is, each element is included to be listed as an element for convenience of explanation, and at least two of the elements may be combined to form a single element, or one element may be divided into a plurality of elements to perform a function.
  • upon receiving a point cloud bitstream (e.g., a geometry bitstream or an attribute bitstream generated by encoder 101), the bitstream may be decoded by decoder 201 in a procedure opposite to that of the point cloud encoder.
  • Arithmetic decoding modules 402 and 410 may be configured to decode the geometry bitstream and attribute bitstream, respectively, to obtain various information encoded into the bitstream.
  • arithmetic decoding module 410 may decode the attribute bitstream to obtain the attribute information associated with each point, such as the quantization levels or the coefficients of the attributes or attribute residuals associated with each point.
  • dequantization module 412 may be configured to dequantize the quantization levels of attributes or attribute residuals associated with each point to obtain the coefficients of attributes or attribute residuals associated with each point.
  • arithmetic decoding module 410 may decode various other information, such as an index for setting the maximum number of candidate points for attribute coding.
  • Attribute inverse transform module 414 may be configured to perform inverse attribute transformation, such as inverse RAHT, inverse predicting transform, or inverse lifting transform, to transform the data from the transform domain (e.g., coefficients) back to the attribute domain (e.g., luma and/or chroma information for color attributes).
  • color inverse transform module 416 may be configured to convert YCbCr (or YUV) color attributes to RGB color attributes.
  • geometry synthesis module 404, reconstruction module 406, and coordinate inverse transform module 408 of decoder 201 may be configured to perform the inverse operations of geometry analysis module 306, voxelization module 304, and coordinate transform module 302 of encoder 101, respectively.
  • the residual between the attribute and its prediction for some or all of the three components of the color attribute may be coded in a certain order.
  • a flag flagy_r may be used to indicate whether the residual for the Y (of the YUV) or R (of the RGB) component of the attribute is zero. If the residual for the Y or R component of the attribute is zero, then the flag flagy_r is zero; otherwise, the flag flagy_r is one. If the flag flagy_r is one, the residuals for the three components in YUV or RGB will be coded in this order (i.e., Y, U, V or R, G, B) accordingly.
  • in some examples, if the flag flagy_r is one, the residual for the Y or R component is non-zero and the residual minus one is coded. If the flag flagy_r is zero, another flag flagyu_rg may be used to indicate whether the residuals for both the YU or RG components are zero. If the flag flagyu_rg is zero, the residual for the V or B component is non-zero and will be coded accordingly (e.g., the residual for the V or B component minus one is coded). If the flag flagyu_rg is one, the residuals for the two components UV/GB will be coded in this order (i.e., U and V, or G and B) accordingly. Such a residual coding order is referred to as the YUV/RGB residual coding order.
  • a flag flagu_g may be used to indicate whether the residual for the U or G color component of the attribute is zero. If the residual for the U or G component of the attribute is zero, the flag flagu_g is zero; otherwise, the flag flagu_g is one. If the flag flagu_g is one, the residuals for the three components UYV/GRB will be coded in this order (i.e., U, Y, V or G, R, B) accordingly. In some examples, if the flag flagu_g is one, the residual for the U or G component is non-zero and the residual minus one is coded.
  • another flag flaguy_gr indicates whether the residuals for both the UY or GR components are zero. If the flag flaguy_gr is zero, the residual for the V or B component will be coded accordingly (e.g., the residual for the V or B component minus one is coded). If the flag flaguy_gr is one, the residuals for the two components YV/RB will be coded in the order of Y and V, or R and B, accordingly. This residual coding order is referred to as the UYV/GRB residual coding order.
  • a one-bit flag can be used to indicate which residual coding order will be used.
  • the proposed flag could be signaled at several levels, such as in the sequence parameter set (SPS), the slice or attribute header.
  • SPS sequence parameter set
  • FIG. 1 shows the syntax table which includes the proposed flag residual_coding_order signaled in the attribute header of the specification of the audio video coding standard (AVS) “GPCC WD5.0”, N3170. Added changes are underlined, and text with strikethrough is removed from the current specification.
  • residual_coding_order equal to 1 specifies that the residual coding order is the YUV/RGB residual coding order.
  • residual_coding_order equal to 0 specifies that the residual coding order is the UYV/GRB residual coding order.
  • the encoder can choose the residual coding order that leads to a smaller number of bits for each sequence or slice and signal the flag residual_coding_order at the proper level such as the SPS, the slice or attribute header, and so on.
  • the residual coding order is determined for the entire point cloud and the flag residual_coding_order indicating the determined order can be signaled at any level, such as the SPS, the slice or attribute header, and so on.
  • the flag can be a multi-bit flag and thus have more than two values to indicate more than two residual coding orders.
  • orders such as the VYU/BGR residual coding order and the UVY/RBG residual coding order may also be used and selected by the encoder according to the characteristics of the point cloud.
  • the selected residual coding order can be indicated by a corresponding value of the flag residual_coding_order in the attribute bitstream.
  • a unique identifier can be used to indicate the active geometry parameter set in the geometry slice header.
  • this identifier can be used to identify the slice header when the parameters defined in the slice header are used by another partition of the point cloud, such as another slice.
  • there might be multiple geometry sequence parameter sets (e.g., defined in multiple geometry headers) for the point cloud.
  • another unique identifier can be used and added to the geometry slice header of the slice.
  • the syntax table which includes the two proposed unique identifiers in the slice header of the specification of the AVS “GPCC WD5.0”, N3170.
  • the two identifiers include gsh_geometry_parameter_set_id, indicating the active geometry parameter set, and gsh_geometry_sequence_parameter_set_id, indicating the geometry sequence parameter set used by the slice out of the multiple geometry sequence parameter sets defined for the point cloud. Added changes are underlined, and text with strikethrough is removed from the current specification.
  • gsh_geometry_parameter_set_id indicates the id of the active geometry parameter set.
  • gsh_geometry_sequence_parameter_set_id specifies the value of the sequence parameter set id of the geometry sequence parameter set used by the slice.
  • gsh_bounding_box_present_flag equal to 1 specifies that the geometry slice bounding box is present in the bitstream.
  • gsh_bounding_box_present_flag equal to 0 specifies that the geometry slice bounding box is not present in the bitstream. If gsh_bounding_box_present_flag is equal to 0, the size of the bounding box is equal to the size of the GPCC frame.
  • a syntax element gsh_bounding_box_present_flag is also added to the slice header to indicate whether the geometry slice bounding box is present in the bitstream. If so (e.g., gsh_bounding_box_present_flag equals 1), the bounding box is determined according to the bounding box parameters defined in the slice header, such as gsh_bounding_box_offset_x_upper, gsh_bounding_box_offset_x_lower, etc.
  • if gsh_bounding_box_present_flag indicates that the geometry slice bounding box is not present in the bitstream (e.g., gsh_bounding_box_present_flag equals 0), the size of the bounding box is equal to a default value, such as the size of the GPCC frame.
  • the reflectance and colour attribute data coding in the specification of the AVS “GPCC WD5.0”, N3170 may be modified as follows. Added changes are underlined, and text with strikethrough is removed from the current specification.
  • FIG. 7 depicts an example of a process 700 for decoding a point cloud from a point cloud bitstream, according to some embodiments of the present disclosure.
  • the process 700 may be implemented to decode a video following the G-PCC standard with the proposed changes as discussed above.
  • One or more computing devices implement operations depicted in FIG. 7 by executing suitable program code.
  • the computing device 200 in FIG. 2 may implement the operations depicted in FIG. 7 by executing the corresponding program code.
  • the process 700 is described with reference to some examples depicted in the figures. Other implementations, however, are possible.
  • the process 700 involves decoding a point cloud from a point cloud bitstream.
  • the point cloud bitstream includes a geometry bitstream containing the geometry information of the point cloud and an attribute bitstream containing attribute information of the points in the point cloud.
  • the process 700 involves parsing the attribute bitstream to identify a residual coding order flag.
  • the residual coding order flag is configured to specify the residual coding order for the attribute encoding.
  • the residual coding order flag is a one-bit flag. In other examples, the residual coding order flag can be a multi-bit flag.
  • the process 700 involves determining the value of the residual coding order flag. If the value of the residual coding order flag is a first value, such as 1, the process 700 involves, at block 708, determining that the residual coding order for at least a portion of the attribute stream is a first residual coding order, such as the YUV/RGB residual coding order discussed above. If the value of the residual coding order flag is a second value, such as 0, the process 700 involves, at block 710, determining that the residual coding order for at least the portion of the attribute stream is a second residual coding order, such as the UYV/GRB residual coding order. If the residual coding order flag can take more than two values, the process 700 may involve more operations to determine the residual coding order to be another coding order based on the value of the residual coding order flag.
  • the process 700 involves decoding at least the portion of the attribute bitstream into attribute residuals according to the determined residual coding order.
  • the residual coding flag indicates the residual coding order for the entire attribute stream, and thus the decoding includes decoding the attribute bitstream according to the determined order.
  • the residual coding flag indicates the residual coding order for a partition of the point cloud, such as a sequence, a frame, a slice or any type of partition processed by the encoder as a unit when performing the encoding, and thus the decoding includes decoding the partition of the point cloud according to the determined order.
  • the process 700 involves reconstructing the point cloud based on the decoded attribute residuals as discussed in detail above with respect to FIG. 4.
  • reconstructing the point cloud further involves parsing the geometry bitstream to identify an identifier indicating the geometry sequence parameter set used in a slice. Based on the identifier, the geometry information can be decoded based on the parameters specified in the geometry sequence parameter set indicated by the identifier. Likewise, the flag indicating whether a geometry slice bounding box is present for the slice can also be identified from the geometry bitstream. If the flag has a value indicating the bounding box is present, the decoding of the slice involves determining the bounding box according to the parameters specified in the geometry bitstream, such as the portion of the bitstream representing the slice header. If the flag has a value indicating the bounding box is not present, the decoding is performed using a default bounding box.
  • the functions described herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored as instructions on a non-transitory computer-readable medium.
  • Computer-readable media includes computer storage media. Storage media may be any available media that can be accessed by a processor, such as processor 102 in FIGS. 1 and 2.
  • such computer-readable media can include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, HDD, such as magnetic disk storage or other magnetic storage devices, Flash drive, SSD, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a processing system, such as a mobile device or a computer.
  • Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital video disc (DVD), and floppy disk, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • a computing device can include any suitable arrangement of components that provide a result conditioned on one or more inputs.
  • Suitable computing devices include multi-purpose microprocessor-based computer systems accessing stored software that programs or configures the computing system from a general purpose computing apparatus to a specialized computing apparatus implementing one or more embodiments of the present subject matter. Any suitable programming, scripting, or other type of language or combinations of languages may be used to implement the teachings contained herein in software to be used in programming or configuring a computing device.
  • Embodiments of the methods disclosed herein may be performed in the operation of such computing devices. The order of the blocks presented in the examples above can be varied — for example, blocks can be re-ordered, combined, and/or broken into sub-blocks. Some blocks or processes can be performed in parallel.
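The flag-based residual coding order scheme described in the bullets above can be sketched in code. This is an illustrative simplification, not the normative AVS G-PCC syntax: the function name, the (name, value) tuples standing in for arithmetic-coded syntax elements, and the assumption of magnitude-coded (non-negative) residual values are choices made for the sketch.

```python
def encode_color_residual(res, order="YUV"):
    """Sketch of the flag-based coding of a non-zero 3-component color
    residual. `res` is (Y, U, V) or (R, G, B); `order` selects the
    YUV/RGB or UYV/GRB residual coding order. Returns a list of
    (name, value) tuples standing in for entropy-coded syntax elements."""
    if order == "YUV":            # YUV/RGB order: first coded component is Y (or R)
        c0, c1, c2 = res[0], res[1], res[2]
    else:                         # UYV/GRB order: first coded component is U (or G)
        c0, c1, c2 = res[1], res[0], res[2]

    symbols = []
    flag0 = 1 if c0 != 0 else 0   # flagy_r (YUV order) or flagu_g (UYV order)
    symbols.append(("flag0", flag0))
    if flag0:
        # First component is non-zero: code its value minus one,
        # then the remaining two components in order.
        symbols += [("res", c0 - 1), ("res", c1), ("res", c2)]
    else:
        # First component is zero: signal whether the second is zero too.
        flag01 = 1 if c1 != 0 else 0   # flagyu_rg or flaguy_gr
        symbols.append(("flag01", flag01))
        if flag01:
            # Second component non-zero: code the remaining two components.
            symbols += [("res", c1), ("res", c2)]
        else:
            # First two components are zero, so the third must be non-zero
            # (the overall residual is non-zero): code it minus one.
            symbols.append(("res", c2 - 1))
    return symbols
```

A residual of (3, 0, 1) under the YUV/RGB order, for example, codes one flag and three values, while (0, 0, 5) collapses to two flags and a single value minus one.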

Abstract

In some embodiments, a point cloud decoder decodes a point cloud from a point cloud bitstream. The point cloud bitstream includes a geometry bitstream and an attribute bitstream. The decoder parses the attribute bitstream to identify a residual coding order flag configured to specify a residual coding order. If the value of the residual coding order flag is a first value, the decoder determines that the residual coding order for at least a portion of the attribute stream is a first order; if the value of the flag is a second value, the decoder determines that the residual coding order for at least the portion of the attribute stream is a second order. The decoder decodes the portion of the attribute bitstream into attribute residuals according to the determined residual coding order and reconstructs the point cloud based at least in part on the decoded attribute residuals.

Description

ADAPTIVE ATTRIBUTE CODING FOR GEOMETRY POINT CLOUD CODING
Cross-Reference to Related Applications
[0001] This application claims priority to U.S. Provisional Application No. 63/265,027, entitled “Adaptive Attribute Coding for Geometry Point Cloud Coding,” filed on December 6, 2021, which is hereby incorporated in its entirety by this reference.
Technical Field
[0002] This disclosure relates generally to computer-implemented methods and systems for geometry point cloud coding. Specifically, the present disclosure involves adaptive attribute coding for geometry point cloud coding.
Background
[0003] Embodiments of the present disclosure relate to point cloud coding. Point clouds are one of the major three-dimensional (3D) data representations, which provide, in addition to spatial coordinates, attributes associated with the points in a 3D world. Point clouds in their raw format require a huge amount of memory for storage or bandwidth for transmission. Furthermore, the emergence of higher-resolution point cloud capture technology imposes, in turn, an even higher requirement on the size of point clouds. In order to make point clouds usable, compression is necessary. Two compression technologies have been proposed for point cloud compression/coding (PCC) standardization activities: video-based PCC (V-PCC) and geometry-based PCC (G-PCC). The V-PCC approach is based on 3D to two-dimensional (2D) projections, while G-PCC, on the contrary, encodes the content directly in 3D space.
Summary
[0004] Some embodiments involve adaptive attribute coding for geometry point cloud coding. In one example, a method for decoding a point cloud from a point cloud bitstream which includes a geometry bitstream and an attribute bitstream is disclosed. The method includes parsing the attribute bitstream to identify a residual coding order flag configured to specify a residual coding order; determining a value of the residual coding order flag; in response to determining that the value of the residual coding order flag is a first value, determining that the residual coding order for at least a portion of the attribute stream is a first residual coding order; in response to determining that the value of the residual coding order flag is a second value, determining that the residual coding order for at least the portion of the attribute stream is a second residual coding order; decoding at least the portion of the attribute bitstream into attribute residuals according to the determined residual coding order; and reconstructing the point cloud based at least in part upon the decoded attribute residuals.
[0005] In another example, a non-transitory computer-readable medium has program code that is stored thereon. The program code is executable by one or more processing devices for performing operations. The operations include parsing an attribute bitstream of a point cloud bitstream for a point cloud to identify a residual coding order flag configured to specify a residual coding order. The point cloud bitstream includes the attribute bitstream and a geometry bitstream. The operations further include determining a value of the residual coding order flag; in response to determining that the value of the residual coding order flag is a first value, determining that the residual coding order for at least a portion of the attribute stream is a first residual coding order; in response to determining that the value of the residual coding order flag is a second value, determining that the residual coding order for at least the portion of the attribute stream is a second residual coding order; decoding at least the portion of the attribute bitstream into attribute residuals according to the determined residual coding order; and reconstructing the point cloud based at least in part upon the decoded attribute residuals.
[0006] In yet another example, a system includes a processing device and a non-transitory computer-readable medium communicatively coupled to the processing device. The processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations comprising: parsing an attribute bitstream of a point cloud bitstream for a point cloud to identify a residual coding order flag configured to specify a residual coding order, the point cloud bitstream comprising the attribute bitstream and a geometry bitstream; determining a value of the residual coding order flag; in response to determining that the value of the residual coding order flag is a first value, determining that the residual coding order for at least a portion of the attribute stream is a first residual coding order; in response to determining that the value of the residual coding order flag is a second value, determining that the residual coding order for at least the portion of the attribute stream is a second residual coding order; decoding at least the portion of the attribute bitstream into attribute residuals according to the determined residual coding order; and reconstructing the point cloud based at least in part upon the decoded attribute residuals.
[0007] These illustrative embodiments are mentioned not to limit or define the disclosure, but to provide examples to aid understanding thereof. Additional embodiments are discussed in the Detailed Description, and further description is provided there.
Brief Description of the Drawings
[0008] Features, embodiments, and advantages of the present disclosure are better understood when the following Detailed Description is read with reference to the accompanying drawings.
[0009] FIG. 1 illustrates a block diagram of an exemplary encoding system, according to some embodiments of the present disclosure.
[0010] FIG. 2 illustrates a block diagram of an exemplary decoding system, according to some embodiments of the present disclosure.
[0011] FIG. 3 illustrates a detailed block diagram of an exemplary encoder in the encoding system in FIG. 1, according to some embodiments of the present disclosure.
[0012] FIG. 4 illustrates a detailed block diagram of an exemplary decoder in the decoding system in FIG. 2, according to some embodiments of the present disclosure.
[0013] FIGS. 5A and 5B illustrate an exemplary octree structure of G-PCC and the corresponding digital representation, respectively, according to some embodiments of the present disclosure.
[0014] FIG. 6 illustrates an exemplary structure of a cube and its relationship with neighboring cubes in an octree structure of G-PCC, according to some embodiments of the present disclosure.
[0015] FIG. 7 depicts an example of a process for decoding a point cloud from a point cloud bitstream, according to some embodiments of the present disclosure.
Detailed Description
[0016] Various embodiments provide adaptive attribute coding for geometry point cloud coding. A point cloud is composed of a collection of points in a 3D space. Each point in the 3D space is associated with a geometry position together with the associated attribute information (e.g., color, reflectance, etc.). In order to compress the point cloud data efficiently, the geometry of a point cloud can be compressed first, and then the corresponding attributes, including color or reflectance, can be compressed based upon the geometry information according to a point cloud coding technique, such as G-PCC. G-PCC has been widely used in virtual reality/augmented reality (VR/AR), telecommunication, autonomous vehicles, etc., for entertainment and industrial applications, e.g., light detection and ranging (LiDAR) sweep compression for automotive or robotics and high-definition (HD) maps for navigation. The Moving Picture Experts Group (MPEG) released the first version of the G-PCC standard, and the Audio Video Coding Standard (AVS) workgroup is also developing a G-PCC standard.
[0017] In G-PCC standards, because neighboring points in a point cloud may have a strong correlation, prediction-based coding methods have been developed and used to compress and code point cloud attributes. For example, a prediction of an attribute of a point can be formed from neighboring coded attributes. Then, the difference (i.e., residual) between the current attribute and the prediction is coded. To encode a non-zero residual of a color attribute, the three components of the attribute residual (e.g., Y, U, V or R, G, B) are to be encoded. In existing G-PCC standards, the order of encoding the three components is fixed to be one of two alternative orders: the YUV/RGB residual coding order and the UYV/GRB residual coding order. However, different input point clouds may have different distributions of the YUV/RGB components, and attribute coding for G-PCC with a fixed residual coding order may not work well across a wide range of input point clouds, which leads to low coding efficiency.
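The prediction step described above can be illustrated with a minimal sketch. The rounded-average predictor here is a simplified stand-in for the distance-weighted prediction over nearest coded neighbors that actual G-PCC predictors use; the function names are illustrative.

```python
def predict_attribute(neighbor_attrs):
    """Predict one attribute component as the rounded average of the
    attributes of already-coded neighboring points (a simplified stand-in
    for the weighted prediction used in G-PCC)."""
    return round(sum(neighbor_attrs) / len(neighbor_attrs))

def attribute_residual(value, neighbor_attrs):
    """Residual = current attribute minus its prediction; the residual,
    not the attribute itself, is what gets entropy coded."""
    return value - predict_attribute(neighbor_attrs)
```

Because neighboring attributes are strongly correlated, the residuals are typically small or zero, which is what makes the flag-based residual coding effective.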
[0018] Various embodiments described herein address these problems by allowing the residual coding order to be adaptively chosen for different partitions of the point cloud. The following non-limiting examples are provided to introduce some embodiments. In one embodiment, the residual coding order can be determined adaptively for the point cloud or for each partition of the point cloud by using the residual coding order that leads to a smaller number of bits in the compressed bitstream. The partition can be a sequence, a frame, a slice, or any type of partition processed by the encoder as a unit when performing the encoding. A residual coding order flag can be used at the corresponding level, such as in the sequence parameter set (SPS) or the slice or attribute header, to indicate which residual coding order will be used for the partition of the point cloud. By adaptively determining the residual coding order for the point cloud or each partition of the point cloud, the size of the encoded bitstream is reduced, thereby increasing the coding efficiency.
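The encoder-side decision described above can be sketched as a trial over both candidate orders. The cost model here (one unit per flag, one plus bit-length units per coded value) is a deliberately crude stand-in for real arithmetic-coded bit counts, and the function name and flag-value convention (1 for YUV/RGB, 0 for UYV/GRB) follow the semantics given for residual_coding_order above.

```python
def choose_residual_coding_order(residuals):
    """Encoder-side sketch: estimate the cost of coding a partition's
    color residuals under both orders and return the flag value to signal
    (1 = YUV/RGB order, 0 = UYV/GRB order)."""
    def value_cost(v):
        # Crude proxy for the entropy-coded size of one residual value.
        return 1 + abs(v).bit_length()

    def residual_cost(res, order_flag):
        # Reorder so c0 is the first-coded component for this order.
        c0, c1, c2 = ((res[0], res[1], res[2]) if order_flag == 1
                      else (res[1], res[0], res[2]))
        if c0 != 0:                       # flag0 + (c0 - 1), c1, c2
            return 1 + value_cost(abs(c0) - 1) + value_cost(c1) + value_cost(c2)
        if c1 != 0:                       # flag0 + flag01 + c1, c2
            return 2 + value_cost(c1) + value_cost(c2)
        return 2 + value_cost(abs(c2) - 1)  # flag0 + flag01 + (c2 - 1)

    costs = {f: sum(residual_cost(r, f) for r in residuals) for f in (1, 0)}
    return 1 if costs[1] <= costs[0] else 0
```

A partition whose U/G residuals are frequently zero tends to favor the UYV/GRB order, since a zero first-coded component needs only flags for two of the three components.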
[0019] In further examples, a unique identifier can be used to indicate the active geometry parameter set in the geometry slice header. For example, this identifier can be used to identify the slice header when the parameters defined in the slice header are used by another partition of the point cloud, such as another slice. In addition, there might be multiple geometry sequence parameter sets (e.g., defined in multiple geometry headers) for the point cloud. To specify the geometry sequence parameter set used in a slice out of the multiple geometry sequence parameter sets, another unique identifier can be used and added to the geometry slice header of the slice.
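A decoder's use of the two proposed slice-header identifiers and the bounding-box flag can be sketched as follows. The field names mirror the syntax elements discussed above, but the data class and the helper function are schematic assumptions; real bitstream parsing is omitted.

```python
from dataclasses import dataclass

@dataclass
class GeometrySliceHeader:
    gsh_geometry_parameter_set_id: int           # id of the active geometry parameter set
    gsh_geometry_sequence_parameter_set_id: int  # which geometry sequence parameter set the slice uses
    gsh_bounding_box_present_flag: int           # 1: box coded in the header, 0: use default
    bounding_box: tuple = None                   # (x, y, z) size when present

def slice_bounding_box(header, frame_size):
    """Return the bounding box for a slice: the explicitly coded box when
    gsh_bounding_box_present_flag is 1, otherwise the default value
    (the size of the GPCC frame), as described above."""
    if header.gsh_bounding_box_present_flag == 1:
        return header.bounding_box
    return frame_size
```

A slice header with the flag set to 0 thus never carries bounding-box offsets, and the decoder falls back to the frame-sized box.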
[0020] FIG. 1 illustrates a block diagram of an exemplary encoding system 100, according to some embodiments of the present disclosure. FIG. 2 illustrates a block diagram of an exemplary decoding system 200, according to some embodiments of the present disclosure. Each system 100 or 200 may be applied or integrated into various systems and apparatus capable of data processing, such as computers and wireless communication devices. For example, system 100 or 200 may be the entirety or part of a mobile phone, a desktop computer, a laptop computer, a tablet, a vehicle computer, a gaming console, a printer, a positioning device, a wearable electronic device, a smart sensor, a virtual reality (VR) device, an augmented reality (AR) device, or any other suitable electronic devices having data processing capability. As shown in FIGS. 1 and 2, system 100 or 200 may include a processor 102, a memory 104, and an interface 106. These components are shown as connected to one another by a bus, but other connection types are also permitted. It is understood that system 100 or 200 may include any other suitable components for performing functions described herein.
[0021] Processor 102 may include microprocessors, such as graphic processing unit (GPU), image signal processor (ISP), central processing unit (CPU), digital signal processor (DSP), tensor processing unit (TPU), vision processing unit (VPU), neural processing unit (NPU), synergistic processing unit (SPU), or physics processing unit (PPU), microcontroller units (MCUs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), programmable logic devices (PLDs), state machines, gated logic, discrete hardware circuits, and other suitable hardware configured to perform the various functions described throughout the present disclosure. Although only one processor is shown in FIGS. 1 and 2, it is understood that multiple processors can be included. Processor 102 may be a hardware device having one or more processing cores. Processor 102 may execute software. Software shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise. Software can include computer instructions written in an interpreted language, a compiled language, or machine code. Other techniques for instructing hardware are also permitted under the broad category of software.
[0022] Memory 104 can broadly include both memory (a.k.a., primary/system memory) and storage (a.k.a., secondary memory). For example, memory 104 may include random-access memory (RAM), read-only memory (ROM), static RAM (SRAM), dynamic RAM (DRAM), ferro-electric RAM (FRAM), electrically erasable programmable ROM (EEPROM), compact disc read-only memory (CD-ROM) or other optical disk storage, hard disk drive (HDD), such as magnetic disk storage or other magnetic storage devices, Flash drive, solid-state drive (SSD), or any other medium that can be used to carry or store desired program code in the form of instructions that can be accessed and executed by processor 102. Broadly, memory 104 may be embodied by any computer-readable medium, such as a non-transitory computer-readable medium. Although only one memory is shown in FIGS. 1 and 2, it is understood that multiple memories can be included.
[0023] Interface 106 can broadly include a data interface and a communication interface that is configured to receive and transmit a signal in a process of receiving and transmitting information with other external network elements. For example, interface 106 may include input/output (I/O) devices and wired or wireless transceivers. Although only one interface is shown in FIGS. 1 and 2, it is understood that multiple interfaces can be included.
[0024] Processor 102, memory 104, and interface 106 may be implemented in various forms in system 100 or 200 for performing point cloud coding functions. In some embodiments, processor 102, memory 104, and interface 106 of system 100 or 200 are implemented (e.g., integrated) on one or more system-on-chips (SoCs). In one example, processor 102, memory 104, and interface 106 may be integrated on an application processor (AP) SoC that handles application processing in an operating system (OS) environment, including running point cloud encoding and decoding applications. In another example, processor 102, memory 104, and interface 106 may be integrated on a specialized processor chip for point cloud coding, such as a GPU or ISP chip dedicated to graphic processing in a real-time operating system (RTOS).
[0025] As shown in FIG. 1, in encoding system 100, processor 102 may include one or more modules, such as an encoder 101. Although FIG. 1 shows that encoder 101 is within one processor 102, it is understood that encoder 101 may include one or more sub-modules that can be implemented on different processors located closely or remotely with each other. Encoder 101 (and any corresponding sub-modules or sub-units) can be hardware units (e.g., portions of an integrated circuit) of processor 102 designed for use with other components or software units implemented by processor 102 through executing at least part of a program, i.e., instructions. The instructions of the program may be stored on a computer-readable medium, such as memory 104, and when executed by processor 102, it may perform a process having one or more functions related to point cloud encoding, such as voxelization, transformation, quantization, arithmetic encoding, etc., as described below in detail.
[0026] Similarly, as shown in FIG. 2, in decoding system 200, processor 102 may include one or more modules, such as a decoder 201. Although FIG. 2 shows that decoder 201 is within one processor 102, it is understood that decoder 201 may include one or more sub-modules that can be implemented on different processors located closely or remotely with each other. Decoder 201 (and any corresponding sub-modules or sub-units) can be hardware units (e.g., portions of an integrated circuit) of processor 102 designed for use with other components or software units implemented by processor 102 through executing at least part of a program, i.e., instructions. The instructions of the program may be stored on a computer-readable medium, such as memory 104, and when executed by processor 102, it may perform a process having one or more functions related to point cloud decoding, such as arithmetic decoding, dequantization, inverse transformation, reconstruction, synthesis, as described below in detail.
[0027] FIG. 3 illustrates a detailed block diagram of exemplary encoder 101 in encoding system 100 in FIG. 1, according to some embodiments of the present disclosure. As shown in FIG. 3, encoder 101 may include a coordinate transform module 302, a voxelization module 304, a geometry analysis module 306, and an arithmetic encoding module 308, together configured to encode positions associated with points of a point cloud into a geometry bitstream (i.e., geometry encoding). As shown in FIG. 3, encoder 101 may also include a color transform module 310, an attribute transform module 312, a quantization module 314, and an arithmetic encoding module 316, together configured to encode attributes associated with points of a point cloud into an attribute bitstream (i.e., attribute encoding). It is understood that each of the elements shown in FIG.
3 is independently shown to represent characteristic functions different from each other in a point cloud encoder, and it does not mean that each component is formed by the configuration unit of separate hardware or single software. That is, each element is included to be listed as an element for convenience of explanation, and at least two of the elements may be combined to form a single element, or one element may be divided into a plurality of elements to perform a function. It is also understood that some of the elements are not necessary elements that perform functions described in the present disclosure but instead may be optional elements for improving performance. It is further understood that these elements may be implemented using electronic hardware, firmware, computer software, or any combination thereof. Whether such elements are implemented as hardware, firmware, or software depends upon the particular application and design constraints imposed on encoder 101. It is still further understood that the modules shown in FIG. 3 are for illustrative purposes only, and in some examples, different modules may be included in encoder 101 for point cloud encoding.
[0028] As shown in FIG. 3, geometry positions and attributes associated with points may be encoded separately. A point cloud may be a collection of points with positions Xk = (xk, yk, zk), k = 1, ..., K, where K is the number of points in the point cloud, and attributes Ak = (A1k, A2k, ..., ADk), k = 1, ..., K, where D is the number of attributes for each point. In some embodiments, attribute coding depends on decoded geometry. Consequently, point cloud positions may be coded first. Since geometry positions may be represented by floating-point numbers in an original coordinate system, coordinate transform module 302 and voxelization module 304 may be configured to perform a coordinate transformation followed by voxelization that quantizes and removes duplicate points. The process of position quantization, duplicate point removal, and assignment of attributes to the remaining points is called voxelization. The voxelized point cloud may be represented using, for example, an octree structure in a lossless manner. Geometry analysis module 306 may be configured to perform geometry analysis using, for example, the octree or trisoup scheme. Arithmetic encoding module 308 may be configured to arithmetically encode the resulting structure from geometry analysis module 306 into the geometry bitstream.
[0029] In some embodiments, geometry analysis module 306 is configured to perform geometry analysis using the octree scheme. Under the octree scheme, a cubical axis-aligned bounding box B may be defined by the two extreme points (0, 0, 0) and (2^d, 2^d, 2^d), where d is the maximum size of the given point cloud along the x, y, or z direction. All point cloud points may be included in this defined cube. A cube may be divided into eight sub-cubes, which creates the octree structure allowing one parent to have eight children, and an octree structure may then be built by recursively subdividing sub-cubes, as shown in FIG. 5A. As shown in FIG. 5B, an 8-bit code may be generated by associating a 1-bit value with each sub-cube to indicate whether it contains points (i.e., full and has value 1) or not (i.e., empty and has value 0). Only full sub-cubes with a size greater than 1 (i.e., non-voxels) may be further subdivided. The geometry information (x, y, z) for one position may be represented by this defined octree structure. Since points may be duplicated, multiple points may be mapped to the same sub-cube of size 1 (i.e., the same voxel). In order to handle such a situation, the number of points for each sub-cube of dimension 1 is also arithmetically encoded. By construction of the octree, a current cube associated with a current node may be surrounded by six cubes of the same depth sharing a face with it. Depending on the location of the current cube, one cube may have up to six same-sized cubes sharing one face with it, as shown in FIG. 6. In addition, the current cube may also have some neighboring cubes that share edges or points with the current cube. In some examples, the point cloud may be divided into different partitions at various granularity levels, such as sequences, frames, and slices, with one sequence containing multiple frames and one frame containing multiple slices, and so on. [0030] Referring back to FIG. 
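The occupancy coding described above can be illustrated with a short sketch (illustrative only, not the AVS/G-PCC reference implementation): each child sub-cube of a node contributes one bit of the 8-bit code, set to 1 when the child contains at least one point. The child indexing convention used here is an assumption for illustration.

```python
def occupancy_code(points, origin, size):
    """Compute the 8-bit occupancy code of one octree node.

    points: list of (x, y, z) integer positions inside the cube at `origin`
    with edge length `size` (a power of two). Bit i is 1 when child octant i
    contains at least one point (hypothetical child ordering: x, y, z bits).
    """
    half = size // 2
    code = 0
    for i in range(8):
        # Child i covers the octant selected by bits (x, y, z) of i.
        ox = origin[0] + (half if i & 4 else 0)
        oy = origin[1] + (half if i & 2 else 0)
        oz = origin[2] + (half if i & 1 else 0)
        full = any(ox <= x < ox + half and oy <= y < oy + half
                   and oz <= z < oz + half for (x, y, z) in points)
        code |= full << i
    return code
```

Only children whose bit is 1 and whose size is still greater than 1 would be subdivided further, mirroring the recursion described above.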
3, as to attribute encoding, optionally, color transform module 310 may be configured to convert red/green/blue (RGB) color attributes of each point to YCbCr (or YUV) color attributes if the attributes include color. Attribute transform module 312 may be configured to perform attribute transformation based on the results from geometry analysis module 306 (e.g., using the octree scheme), including but not limited to, the region adaptive hierarchical transform (RAHT), interpolation-based hierarchical nearest-neighbor prediction (predicting transform), and interpolation-based hierarchical nearest-neighbor prediction with an update/lifting step (lifting transform). Optionally, quantization module 314 may be configured to quantize the transformed coefficients of attributes from attribute transform module 312 to generate quantization levels of the attributes associated with each point to reduce the dynamic range. Arithmetic encoding module 316 may be configured to arithmetically encode the resulting transformed coefficients of attributes associated with each point or the quantization levels thereof into the attribute bitstream.
[0031] In some embodiments, a prediction may be formed from neighboring coded attributes, for example, in the predicting transform and lifting transform by attribute transform module 312. Then, the difference between the current attribute and the prediction (i.e., the residual) may be coded. After the geometry positions are coded, a Morton code or Hilbert code may be used to convert a point cloud in a 3D space (e.g., a point cloud cube) into a 1D array. Each position in the cube will have a corresponding Morton or Hilbert code, but some positions may not have any corresponding point cloud attribute. In other words, some positions may be empty. The attribute coding may follow the predefined Morton order or Hilbert order. A predictor may be generated from the previously coded points in the 1D array following the Morton order or Hilbert order. The attribute difference between the current point and its prediction points may be encoded into the bitstream.
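As an illustration of the 3D-to-1D Morton mapping described above (a sketch; the bit-interleaving convention and precision here are assumptions, not the exact AVS/G-PCC definition), the code can be computed by interleaving the coordinate bits:

```python
def morton3d(x, y, z, bits=10):
    """Interleave the bits of x, y, z into a single Morton code.

    Hypothetical convention: x contributes the most significant bit of each
    3-bit group. `bits` is the coordinate precision in bits.
    """
    code = 0
    for b in range(bits - 1, -1, -1):
        code = (code << 3) | (((x >> b) & 1) << 2) \
               | (((y >> b) & 1) << 1) | ((z >> b) & 1)
    return code

# Sorting points by their Morton code yields the 1D coding order.
points = [(1, 0, 1), (0, 0, 0), (1, 1, 1)]
order = sorted(points, key=lambda p: morton3d(*p))
```

Positions with no point simply never appear in the sorted array, matching the "empty position" remark above.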
[0032] To reduce the memory usage, some pre-defined number has been specified to limit the number of neighboring points that can be used in generating the prediction. For example, only M data points among the previous N consecutively coded points may be used for coding the current attribute. That is, a set of n candidate points may be used as the candidates to select a set of m prediction points (m ≤ n) for predicting the current point in attribute coding. The number n of candidate points in the set is equal to or smaller than the maximum number N of candidate points (n ≤ N), and the number m of prediction points in the set is equal to or smaller than the maximum number M of prediction points (m ≤ M). For example, in an AVS G-PCC software, M and N are set as fixed numbers of 3 and 128, respectively. Suppose more than 128 points have already been coded before the current point. In that case, only 3 out of the 128 previously-coded neighboring points could be used to form the attribute predictor according to a pre-defined order. If there are fewer than 128 coded points before the current point, all such coded points will be used as candidates to establish the attribute predictor.
[0033] More specifically, the previous K points (e.g., K = 6) before the current point are selected according to a pre-defined Morton or Hilbert order. Then new Morton or Hilbert codes for these K points will be re-calculated by adding a fixed shift (e.g., 1) to the coordinates (x, y, z) of these K data points. Suppose that the new Morton or Hilbert code for the current position is X. A P-point set before and a Q-point set after the current position according to the new Morton or Hilbert code order will be selected. Among these pre-defined K, P, and Q point sets, M points are selected with the M closest "distances" between these coded points and the current point. The distance d as one example is defined as follows, while other distance metrics can also be used in other examples: d = |x1 − x2| + |y1 − y2| + |z1 − z2| (1), where (x1, y1, z1) and (x2, y2, z2) are the coordinates of the current point and the candidate point along the Morton order or Hilbert order, respectively.
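The distance of Eqn. (1) and the selection of the M closest coded points can be sketched as follows (illustrative only; the gathering of candidates from the K, P, and Q point sets is abstracted into a single candidate list):

```python
def manhattan(p1, p2):
    # Eqn. (1): d = |x1 - x2| + |y1 - y2| + |z1 - z2|
    return sum(abs(a - b) for a, b in zip(p1, p2))

def select_m_closest(current, candidates, m=3):
    """Pick the m candidate positions with the smallest Eqn. (1) distance
    to the current point. `candidates` stands in for the union of the
    pre-defined K, P, and Q point sets."""
    return sorted(candidates, key=lambda c: manhattan(current, c))[:m]
```

Other distance metrics could be substituted for `manhattan` without changing the selection structure.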
[0034] In another example, a full search method based on Hilbert code can be applied in the AVS G-PCC attribute coding. For example, the search range can be set as 128 and the number of previous points used to form the predictor is set as M. If more than 128 points have already been coded before the current point, only M out of the 128 previously-coded neighboring points can be used to form the attribute predictor according to the Hilbert order. If there are fewer than 128 coded points before the current one, all such coded points will be used as candidates to form the attribute predictor. Among the up to 128 previously-coded points, M points are selected with the M closest "distances" between these coded points and the current point. The distance d defined in Eqn. (1) can be used as one example, while other distance metrics can also be used. Once the M closest points have been selected, a weighted average of attributes from these M points is formed as the predictor to code the attribute of the current point.
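The full-search predictor described above can be sketched as follows; the inverse-distance weighting is an assumption for illustration, as the exact weight definition in the AVS G-PCC software may differ:

```python
def manhattan(p1, p2):
    # Eqn. (1) distance between two positions.
    return sum(abs(a - b) for a, b in zip(p1, p2))

def predict_attribute(current_pos, coded_points, m=3):
    """coded_points: list of (position, attribute) pairs for previously coded
    points within the search range (up to 128). The m closest points under
    the Eqn. (1) distance are combined into a weighted average; weights here
    are 1/distance (clamped at distance 1), a hypothetical choice."""
    nearest = sorted(coded_points,
                     key=lambda cp: manhattan(current_pos, cp[0]))[:m]
    weights = [1.0 / max(manhattan(current_pos, p), 1) for p, _ in nearest]
    return sum(w * a for w, (_, a) in zip(weights, nearest)) / sum(weights)
```

The residual actually coded would then be the current attribute minus this predictor.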
[0035] In some examples, the attribute difference (i.e., the residual) between the current point and its prediction points may be encoded into the bitstream through run-length coding, in which runs of zeros between non-zero residuals are stored as a single data value and count. Depending on the type of the attribute, the non-zero residual may be defined differently. For example, if the attribute is a single-value attribute, such as the reflectance, the residual for the attribute is non-zero if the residual value is non-zero. If the attribute is a multi-value attribute, such as a YUV/RGB color attribute, the residual for the attribute is non-zero if any of the three components is non-zero. In other words, the residual for a color attribute is zero only when all three components are zeros. Each value of the three components in a non-zero residual needs to be encoded into the bitstream.
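The run-length layer described above can be sketched as follows (illustrative; symbol packing and the arithmetic coder are omitted). A residual tuple counts as zero only when all of its components are zero, matching the multi-value rule above:

```python
def run_length_encode(residuals):
    """residuals: list of residual tuples, one per point (e.g., (Y, U, V), or
    a one-element tuple for reflectance). Runs of all-zero residuals are
    collapsed into a single ('run', count) symbol; each non-zero residual is
    emitted after the run of zeros that precedes it."""
    out, zero_run = [], 0
    for r in residuals:
        if all(c == 0 for c in r):
            zero_run += 1
        else:
            out.append(('run', zero_run))
            out.append(('residual', r))
            zero_run = 0
    out.append(('run', zero_run))  # trailing zeros
    return out
```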
[0036] FIG. 4 illustrates a detailed block diagram of exemplary decoder 201 in decoding system 200 in FIG. 2, according to some embodiments of the present disclosure. As shown in FIG. 4, decoder 201 may include an arithmetic decoding module 402, a geometry synthesis module 404, a reconstruction module 406, and a coordinate inverse transform module 408, together configured to decode positions associated with points of a point cloud from the geometry bitstream (i.e., geometry decoding). As shown in FIG. 4, decoder 201 may also include an arithmetic decoding module 410, a dequantization module 412, an attribute inverse transform module 414, and a color inverse transform module 416, together configured to decode attributes associated with points of a point cloud from the attribute bitstream (i.e., attribute decoding). It is understood that each of the elements shown in FIG. 4 is independently shown to represent characteristic functions different from each other in a point cloud decoder, and it does not mean that each component is formed by the configuration unit of separate hardware or single software. That is, each element is included to be listed as an element for convenience of explanation, and at least two of the elements may be combined to form a single element, or one element may be divided into a plurality of elements to perform a function. It is also understood that some of the elements are not necessary elements that perform functions described in the present disclosure but instead may be optional elements for improving performance. It is further understood that these elements may be implemented using electronic hardware, firmware, computer software, or any combination thereof. Whether such elements are implemented as hardware, firmware, or software depends upon the particular application and design constraints imposed on decoder 201. It is still further understood that the modules shown in FIG. 
4 are for illustrative purposes only, and in some examples, different modules may be included in decoder 201 for point cloud decoding.
[0037] When a point cloud bitstream (e.g., a geometry bitstream or an attribute bitstream generated by encoder 101) is received or otherwise obtained, the bitstream may be decoded by decoder 201 in a procedure opposite to that of the point cloud encoder. Thus, the details of decoding that are described above with respect to encoding may be skipped for ease of description. Arithmetic decoding modules 402 and 410 may be configured to decode the geometry bitstream and attribute bitstream, respectively, to obtain various information encoded into the bitstream. For example, arithmetic decoding module 410 may decode the attribute bitstream to obtain the attribute information associated with each point, such as the quantization levels or the coefficients of the attributes or attribute residuals associated with each point. Optionally, dequantization module 412 may be configured to dequantize the quantization levels of attributes or attribute residuals associated with each point to obtain the coefficients of attributes or attribute residuals associated with each point. Besides the attribute information, arithmetic decoding module 410 may decode various other information, such as an index for setting the maximum number of candidate points for attribute coding.
[0038] Attribute inverse transform module 414 may be configured to perform inverse attribute transformation, such as the inverse RAHT, inverse predicting transform, or inverse lifting transform, to transform the data from the transform domain (e.g., coefficients) back to the attribute domain (e.g., luma and/or chroma information for color attributes). Optionally, color inverse transform module 416 may be configured to convert YCbCr (or YUV) color attributes to RGB color attributes. As to the geometry decoding, geometry synthesis module 404, reconstruction module 406, and coordinate inverse transform module 408 of decoder 201 may be configured to perform the inverse operations of geometry analysis module 306, voxelization module 304, and coordinate transform module 302 of encoder 101, respectively.
[0039] Attribute Coding [0040] In some examples, to further compress an attribute such as the color, the residual between the attribute and its prediction for some or all three components of the color attribute may be coded in a certain order. For a color attribute, for example, a flag flagy_r may be used to represent whether the residual for the Y (of the YUV) or R (of the RGB) component of the attribute is zero. If the residual for the Y or R component of the attribute is zero, then the flag flagy_r is zero; otherwise, the flag flagy_r is one. If the flag flagy_r is one, residuals for the three components in YUV or RGB will be coded in this order (i.e., Y, U, V or R, G, B) accordingly. In some examples, if the flag flagy_r is one, the residual for the Y or R component is non-zero and the residual minus one is coded. If the flag flagy_r is zero, another flag flagyu_rg may be used to represent whether the residuals for both the YU or RG components are zeros. If the flag flagyu_rg is zero, the residual for the V or B component is non-zero and will be coded accordingly (e.g., the residual for the V or B component minus one is coded). If the flag flagyu_rg is one, the residuals for the two components UV/GB will be coded in this order (i.e., U and V or G and B) accordingly. Such a residual coding order is referred to as the YUV/RGB residual coding order.
[0041] In another example, a flag flagu_g may be used to represent whether the residual for the U or G color component of the attribute is zero. If the residual for the U or G component of the attribute is zero, the flag flagu_g is zero; otherwise, the flag flagu_g is one. If the flag flagu_g is one, the residuals for the three components UYV/GRB will be coded in this order (i.e., U, Y, V or G, R, B) accordingly. In some examples, if the flag flagu_g is one, the residual for the U or G component is non-zero and the residual minus one is coded. If the flag flagu_g is zero, another flag, namely flaguy_gr, represents whether the residuals for both the UY/GR components are zeros. If the flag flaguy_gr is zero, the residual for the V or B component will be coded accordingly (e.g., the residual for the V or B component minus one is coded). If the flag flaguy_gr is one, the residuals for the two components YV/RB will be coded in the order of Y and V or R and B accordingly. This residual coding order is referred to as the UYV/GRB residual coding order.
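The two flag-driven orders above can be sketched as follows under one plausible reading of the scheme (residuals are treated as non-negative magnitudes; sign handling and the arithmetic coder are omitted, and the all-zero residual is assumed to be absorbed by the run-length layer):

```python
def code_color_residual(res, order='YUV'):
    """res: non-negative residual magnitudes (Y, U, V) (or (R, G, B)).
    order: 'YUV' for the YUV/RGB order, 'UYV' for the UYV/GRB order.
    Returns the list of flags and values emitted for one non-zero residual."""
    y, u, v = res
    first, second, third = (y, u, v) if order == 'YUV' else (u, y, v)
    if first != 0:
        # First flag = 1: code first-1, then the remaining two components.
        return [1, first - 1, second, third]
    if second != 0:
        # First flag = 0, second flag = 1: code second-1, then the third.
        return [0, 1, second - 1, third]
    # Both flags 0: only the third component is non-zero; code third-1.
    return [0, 0, third - 1]
```

A decoder would invert this by reading the flags first and then the remaining values in the same order.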
[0042] However, different PCC inputs may have different distributions for the YUV/RGB components, and the attribute coding for G-PCC with a fixed residual coding order (the YUV/RGB residual coding order or the UYV/GRB residual coding order) may not work well for a wide range of PCC inputs, leading to low coding efficiency. To overcome this problem, the attribute residuals can be coded using different orders adaptively to achieve better coding efficiency.
[0043] For example, a one-bit flag can be used to indicate which residual coding order will be used. The proposed flag could be signaled at several levels, such as in the sequence parameter set (SPS), the slice header, or the attribute header. As one example, the syntax table below includes the proposed flag residual_coding_order signaled in the attribute header of the specification of the audio video coding standard (AVS) "GPCC WD5.0", N3170. Added changes are underlined, and text with strikethrough is removed from the current specification.
[Syntax table of the attribute header in "GPCC WD5.0" with the proposed residual_coding_order flag added; table not reproduced in this text.]
residual_coding_order equal to 1 specifies that the residual coding order is the YUV/RGB residual coding order. residual_coding_order equal to 0 specifies that the residual coding order is the UYV/GRB residual coding order.
[0044] With the flag residual_coding_order, the encoder can choose the residual coding order that leads to a smaller number of bits for each sequence or slice and signal the flag residual_coding_order at the proper level, such as the SPS, the slice header, or the attribute header, and so on. In some examples, the residual coding order is determined for the entire point cloud, and the flag residual_coding_order indicating the determined order can be signaled at any level, such as the SPS, the slice header, or the attribute header, and so on.
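The encoder-side choice of the residual_coding_order flag can be sketched with a toy cost model (an assumption for illustration; a real encoder would compare the actual coded bits of the two candidate streams):

```python
def choose_residual_coding_order(residuals):
    """Pick the order that spends fewer flag bits on the non-zero residuals
    of a slice or sequence. Simplified cost model: a residual costs 1 flag
    when its first-coded component is non-zero and 2 flags otherwise."""
    first_index = {'YUV': 0, 'UYV': 1}  # Y/R coded first vs. U/G coded first
    def flag_bits(order):
        i = first_index[order]
        return sum(1 if r[i] != 0 else 2
                   for r in residuals if any(c != 0 for c in r))
    return 'YUV' if flag_bits('YUV') <= flag_bits('UYV') else 'UYV'
```

Intuitively, content whose luma residuals are usually non-zero favors the YUV/RGB order, while chroma-dominated content favors the UYV/GRB order.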
[0045] While the above examples focus on a binary flag residual_coding_order, the flag can be a multi-bit flag and thus have more than two values to indicate more than two residual coding orders. For example, in addition to the two residual coding orders discussed above, orders such as a VYU/BGR residual coding order and a UVY/RBG residual coding order may also be used and selected by the encoder according to the characteristics of the point cloud. The selected residual coding order can be indicated by a corresponding value of the flag residual_coding_order in the attribute bitstream.
[0046] In further examples, a unique identifier can be used to indicate the active geometry parameter set in the geometry slice header. For example, this identifier can be used to identify the slice header when the parameters defined in the slice header are used by another partition of the point cloud, such as another slice. In addition, there might be multiple geometry sequence parameter sets (e.g., defined in multiple geometry headers) for the point cloud. To specify the geometry sequence parameter set used in a slice out of the multiple geometry sequence parameter sets, another unique identifier can be used and added to the geometry slice header of the slice. As one example, the syntax table below includes the two proposed unique identifiers in the slice header of the specification of the AVS "GPCC WD5.0", N3170. In this example, the two identifiers include gsh_geometry_parameter_set_id, indicating the active geometry parameter set, and gsh_geometry_sequence_parameter_set_id, indicating the geometry sequence parameter set used by the slice out of the multiple geometry sequence parameter sets defined for the point cloud. Added changes are underlined, and text with strikethrough is removed from the current specification.
[Syntax table of the geometry slice header in "GPCC WD5.0" with the proposed gsh_geometry_parameter_set_id and gsh_geometry_sequence_parameter_set_id identifiers added; table not reproduced in this text.]
gsh_geometry_parameter_set_id indicates the id of the active geometry parameter set. gsh_geometry_sequence_parameter_set_id specifies the value of sequence_parameter_set_id of the geometry sequence parameter set. gsh_bounding_box_present_flag equal to 1 specifies that the geometry slice bounding box is present in the bitstream. gsh_bounding_box_present_flag equal to 0 specifies that the geometry slice bounding box is not present in the bitstream. If gsh_bounding_box_present_flag is equal to 0, the size of the bounding box is equal to the size of the GPCC frame.
[0047] In the above example, a syntax element gsh_bounding_box_present_flag is also added to the slice header to indicate whether the geometry slice bounding box is present in the bitstream. If so (e.g., gsh_bounding_box_present_flag equals 1), the bounding box is determined according to the bounding box parameters defined in the slice header, such as gsh_bounding_box_offset_x_upper, gsh_bounding_box_offset_x_lower, etc. If gsh_bounding_box_present_flag indicates that the geometry slice bounding box is not present in the bitstream (e.g., gsh_bounding_box_present_flag equals 0), the size of the bounding box is equal to a default value, such as the size of the GPCC frame.
[0048] In some examples, the reflectance and colour attribute data coding in the specification of the AVS "GPCC WD5.0", N3170 may be modified as follows. Added changes are underlined, and text with strikethrough is removed from the current specification.
[Modified syntax tables for reflectance and colour attribute data coding in "GPCC WD5.0"; tables not reproduced in this text.]
[0049] Referring now to FIG. 7, FIG. 7 depicts an example of a process 700 for decoding a point cloud from a point cloud bitstream, according to some embodiments of the present disclosure. For example, the process 700 may be implemented to decode a point cloud bitstream following the G-PCC standard with the proposed changes as discussed above. One or more computing devices implement the operations depicted in FIG. 7 by executing suitable program code. For example, the decoding system 200 in FIG. 2 may implement the operations depicted in FIG. 7 by executing the corresponding program code. For illustrative purposes, the process 700 is described with reference to some examples depicted in the figures. Other implementations, however, are possible.
[0050] The process 700 decodes a point cloud from a point cloud bitstream. As discussed above, the point cloud bitstream includes a geometry bitstream containing the geometry information of the point cloud and an attribute bitstream containing the attribute information of the points in the point cloud. At block 702, the process 700 involves parsing the attribute bitstream to identify a residual coding order flag. The residual coding order flag is configured to specify the residual coding order for the attribute encoding. In some examples, the residual coding order flag is a one-bit flag. In other examples, the residual coding order flag can be a multi-bit flag.
[0051] At block 704, the process 700 involves determining the value of the residual coding order flag. If the value of the residual coding order flag is a first value, such as 1, the process 700 involves, at block 708, determining that the residual coding order for at least a portion of the attribute stream is a first residual coding order, such as the YUV/RGB residual coding order discussed above. If the value of the residual coding order flag is a second value, such as 0, the process 700 involves, at block 710, determining that the residual coding order for at least the portion of the attribute stream is a second residual coding order, such as the UYV/GRB residual coding order. If the residual coding order flag can take more than two values, the process 700 may involve more operations to determine the residual coding order to be another coding order based on the value of the residual coding order flag.
[0052] At block 712, the process 700 involves decoding at least the portion of the attribute bitstream into attribute residuals according to the determined residual coding order. In some examples, the residual coding order flag indicates the residual coding order for the entire attribute stream, and thus the decoding includes decoding the attribute bitstream according to the determined order. In other examples, the residual coding order flag indicates the residual coding order for a partition of the point cloud, such as a sequence, a frame, a slice, or any type of partition processed by the encoder as a unit when performing the encoding, and thus the decoding includes decoding the partition of the point cloud according to the determined order. At block 714, the process 700 involves reconstructing the point cloud based on the decoded attribute residuals as discussed in detail above with respect to FIG. 4.
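Blocks 704 through 712 can be sketched as follows (the helper names and symbol layout are hypothetical, not the AVS bitstream syntax):

```python
def determine_residual_coding_order(flag_value):
    """Map the parsed residual_coding_order flag to a component order."""
    if flag_value == 1:      # first value -> block 708
        return ('Y', 'U', 'V')
    if flag_value == 0:      # second value -> block 710
        return ('U', 'Y', 'V')
    # A multi-bit flag would map further values to other orders here.
    raise ValueError('unsupported residual_coding_order value')

def decode_residual(symbols, flag_value):
    """symbols: the three residual values as they appear in coded order.
    Returns them keyed by component, ready for reconstruction (block 714)."""
    order = determine_residual_coding_order(flag_value)
    return dict(zip(order, symbols))
```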
[0053] In some examples, reconstructing the point cloud further involves parsing the geometry bitstream to identify an identifier indicating the geometry sequence set used in a slice. Based on the identifier, the geometry information can be decoded based on parameters specified in the geometry sequence set specified by the identifier. Likewise, the flag indicating whether a geometry slice bounding box is present for the slice can also be identified from the geometry bitstream. If the flag has a value indicating the bounding box is present, the decoding of the slice involves determining the bounding box according to the parameters specified in the geometry bitstream, such as the portion of the bitstream representing the slice header. If the flag has a value indicating the bounding box is not present, the decoding is performed using a default bounding box.
[0054] In various aspects of the present disclosure, the functions described herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored as instructions on a non-transitory computer-readable medium. Computer-readable media includes computer storage media. Storage media may be any available media that can be accessed by a processor, such as processor 102 in FIGS. 1 and 2. By way of example, and not limitation, such computer-readable media can include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, HDD, such as magnetic disk storage or other magnetic storage devices, Flash drive, SSD, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a processing system, such as a mobile device or a computer. Disk and disc, as used herein, include CD, laser disc, optical disc, digital video disc (DVD), and floppy disk, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
[0055] Numerous specific details are set forth herein to provide a thorough understanding of the claimed subject matter. However, those skilled in the art will understand that the claimed subject matter may be practiced without these specific details. In other instances, methods, apparatuses, or systems that would be known by one of ordinary skill have not been described in detail so as not to obscure claimed subject matter.
[0056] Unless specifically stated otherwise, it is appreciated that throughout this specification discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining,” and “identifying” or the like refer to actions or processes of a computing device, such as one or more computers or a similar electronic computing device or devices, that manipulate or transform data represented as physical electronic or magnetic quantities within memories, registers, or other information storage devices, transmission devices, or display devices of the computing platform.
[0057] The system or systems discussed herein are not limited to any particular hardware architecture or configuration. A computing device can include any suitable arrangement of components that provide a result conditioned on one or more inputs. Suitable computing devices include multi-purpose microprocessor-based computer systems accessing stored software that programs or configures the computing system from a general purpose computing apparatus to a specialized computing apparatus implementing one or more embodiments of the present subject matter. Any suitable programming, scripting, or other type of language or combinations of languages may be used to implement the teachings contained herein in software to be used in programming or configuring a computing device. [0058] Embodiments of the methods disclosed herein may be performed in the operation of such computing devices. The order of the blocks presented in the examples above can be varied — for example, blocks can be re-ordered, combined, and/or broken into sub-blocks. Some blocks or processes can be performed in parallel.
[0059] The use of “adapted to” or “configured to” herein is meant as an open and inclusive language that does not foreclose devices adapted to or configured to perform additional tasks or steps. Additionally, the use of “based on” is meant to be open and inclusive, in that a process, step, calculation, or other action “based on” one or more recited conditions or values may, in practice, be based on additional conditions or values beyond those recited. Headings, lists, and numbering included herein are for ease of explanation only and are not meant to be limiting.
[0060] While the present subject matter has been described in detail with respect to specific embodiments thereof, it will be appreciated that those skilled in the art, upon attaining an understanding of the foregoing, may readily produce alterations to, variations of, and equivalents to such embodiments. Accordingly, it should be understood that the present disclosure has been presented for purposes of example rather than limitation, and does not preclude the inclusion of such modifications, variations, and/or additions to the present subject matter as would be readily apparent to one of ordinary skill in the art.

Claims

1. A method for decoding a point cloud from a point cloud bitstream, the point cloud bitstream comprising a geometry bitstream and an attribute bitstream, the method comprising: parsing the attribute bitstream to identify a residual coding order flag configured to specify a residual coding order; determining a value of the residual coding order flag; in response to determining that the value of the residual coding order flag is a first value, determining that the residual coding order for at least a portion of the attribute stream is a first residual coding order; in response to determining that the value of the residual coding order flag is a second value, determining that the residual coding order for at least the portion of the attribute stream is a second residual coding order; decoding at least the portion of the attribute bitstream into attribute residuals according to the determined residual coding order; and reconstructing the point cloud based at least in part upon the decoded attribute residuals.
2. The method of claim 1, wherein the first residual coding order is a YUV/RGB residual coding order, and the second residual coding order is a UYV/GRB residual coding order.
3. The method of claim 1, wherein the residual coding order flag is included in an attribute header of the attribute stream, a slice header of the attribute stream, or a sequence parameter set of the attribute stream.
4. The method of claim 1, further comprising: in response to determining that the value of the residual coding order flag is a third value, determining that the residual coding order for at least the portion of the attribute stream is a third residual coding order.
5. The method of claim 1, further comprising: parsing the geometry bitstream to identify a flag configured to indicate a geometry sequence set used in a slice; wherein reconstructing the point cloud further comprises reconstructing geometry information of the slice in the point cloud based on the geometry sequence set.
6. The method of claim 5, wherein the flag is included in a geometry slice header for the slice.
7. The method of claim 1, further comprising: parsing the geometry bitstream to identify a flag configured to indicate whether a geometry slice bounding box is present in the bitstream; determining a size of the geometry slice bounding box based on a value of the flag; wherein reconstructing the point cloud further comprises reconstructing geometry information of the point cloud based on the size of the geometry slice bounding box.
8. A non-transitory computer-readable medium having program code that is stored thereon, the program code executable by one or more processing devices for performing operations comprising:
parsing an attribute bitstream of a point cloud bitstream for a point cloud to identify a residual coding order flag configured to specify a residual coding order, the point cloud bitstream comprising the attribute bitstream and a geometry bitstream;
determining a value of the residual coding order flag;
in response to determining that the value of the residual coding order flag is a first value, determining that the residual coding order for at least a portion of the attribute bitstream is a first residual coding order;
in response to determining that the value of the residual coding order flag is a second value, determining that the residual coding order for at least the portion of the attribute bitstream is a second residual coding order;
decoding at least the portion of the attribute bitstream into attribute residuals according to the determined residual coding order; and
reconstructing the point cloud based at least in part upon the decoded attribute residuals.
9. The non-transitory computer-readable medium of claim 8, wherein the first residual coding order is a YUV/RGB residual coding order, and the second residual coding order is a UYV/GRB residual coding order.
10. The non-transitory computer-readable medium of claim 8, wherein the residual coding order flag is included in an attribute header of the attribute bitstream, a slice header of the attribute bitstream, or a sequence parameter set of the attribute bitstream.
11. The non-transitory computer-readable medium of claim 8, wherein the operations further comprise:
in response to determining that the value of the residual coding order flag is a third value, determining that the residual coding order for at least the portion of the attribute bitstream is a third residual coding order.
12. The non-transitory computer-readable medium of claim 8, wherein the operations further comprise:
parsing the geometry bitstream to identify a flag configured to indicate a geometry sequence set used in a slice;
wherein reconstructing the point cloud further comprises reconstructing geometry information of the slice in the point cloud based on the geometry sequence set.
13. The non-transitory computer-readable medium of claim 12, wherein the flag is included in a geometry slice header for the slice.
14. The non-transitory computer-readable medium of claim 8, wherein the operations further comprise:
parsing the geometry bitstream to identify a flag configured to indicate whether a geometry slice bounding box is present in the bitstream;
determining a size of the geometry slice bounding box based on a value of the flag;
wherein reconstructing the point cloud further comprises reconstructing geometry information of the point cloud based on the size of the geometry slice bounding box.
15. A system comprising:
a processing device; and
a non-transitory computer-readable medium communicatively coupled to the processing device, wherein the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations comprising:
parsing an attribute bitstream of a point cloud bitstream for a point cloud to identify a residual coding order flag configured to specify a residual coding order, the point cloud bitstream comprising the attribute bitstream and a geometry bitstream;
determining a value of the residual coding order flag;
in response to determining that the value of the residual coding order flag is a first value, determining that the residual coding order for at least a portion of the attribute bitstream is a first residual coding order;
in response to determining that the value of the residual coding order flag is a second value, determining that the residual coding order for at least the portion of the attribute bitstream is a second residual coding order;
decoding at least the portion of the attribute bitstream into attribute residuals according to the determined residual coding order; and
reconstructing the point cloud based at least in part upon the decoded attribute residuals.
16. The system of claim 15, wherein the first residual coding order is a YUV/RGB residual coding order, and the second residual coding order is a UYV/GRB residual coding order.
17. The system of claim 15, wherein the residual coding order flag is included in an attribute header of the attribute bitstream, a slice header of the attribute bitstream, or a sequence parameter set of the attribute bitstream.
18. The system of claim 15, wherein the operations further comprise:
in response to determining that the value of the residual coding order flag is a third value, determining that the residual coding order for at least the portion of the attribute bitstream is a third residual coding order.
19. The system of claim 15, wherein the operations further comprise:
parsing the geometry bitstream to identify a flag configured to indicate a geometry sequence set used in a slice;
wherein reconstructing the point cloud further comprises reconstructing geometry information of the slice in the point cloud based on the geometry sequence set.
20. The system of claim 15, wherein the operations further comprise:
parsing the geometry bitstream to identify a flag configured to indicate whether a geometry slice bounding box is present in the bitstream;
determining a size of the geometry slice bounding box based on a value of the flag;
wherein reconstructing the point cloud further comprises reconstructing geometry information of the point cloud based on the size of the geometry slice bounding box.
PCT/US2022/080859 2021-12-06 2022-12-02 Adaptive attribute coding for geometry point cloud coding WO2023107868A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202163265027P 2021-12-06 2021-12-06
US63/265,027 2021-12-06

Publications (1)

Publication Number Publication Date
WO2023107868A1 true WO2023107868A1 (en) 2023-06-15

Family

ID=86731266

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/080859 WO2023107868A1 (en) 2021-12-06 2022-12-02 Adaptive attribute coding for geometry point cloud coding

Country Status (1)

Country Link
WO (1) WO2023107868A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200314435A1 (en) * 2019-03-25 2020-10-01 Apple Inc. Video based point cloud compression-patch alignment and size determination in bounding box
US20210385303A1 (en) * 2020-06-09 2021-12-09 Qualcomm Incorporated Attribute residual coding in g-pcc
Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22905255

Country of ref document: EP

Kind code of ref document: A1