WO2020123469A1 - Hierarchical tree attribute coding by median points in point cloud coding - Google Patents

Info

Publication number
WO2020123469A1
WO2020123469A1 PCT/US2019/065413 US2019065413W WO2020123469A1 WO 2020123469 A1 WO2020123469 A1 WO 2020123469A1 US 2019065413 W US2019065413 W US 2019065413W WO 2020123469 A1 WO2020123469 A1 WO 2020123469A1
Authority
WO
WIPO (PCT)
Prior art keywords
points
point
bitstream
parent node
hierarchical tree
Prior art date
Application number
PCT/US2019/065413
Other languages
English (en)
Inventor
Birendra KATHARIYA
Vladyslav ZAKHARCHENKO
Jianle Chen
Original Assignee
Futurewei Technologies, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Futurewei Technologies, Inc.
Publication of WO2020123469A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/001 Model-based coding, e.g. wire frame
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/40 Tree coding, e.g. quadtree, octree
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode

Definitions

  • the present disclosure is generally related to media coding, and is specifically related to coding attribute values of points in a sparse point cloud.
  • the disclosure includes a method implemented in an encoder, the method comprising: storing, in a memory of the encoder, an image of a media sequence, the image comprising a sparse point cloud containing a plurality of points; applying, by a processor of the encoder, a hierarchical structure to the points to encode a geometry bitstream describing positions of the points; applying, by the processor, a hierarchical tree to the points to encode an attribute bitstream describing attribute values of the points, wherein applying the hierarchical tree includes: selecting a median point of a parent node as a closest point to a center coordinate of a tight bounding box around the parent node, and splitting the parent node into a plurality of child nodes at the median point of the parent node; and storing, in the memory, a point cloud coding (PCC) bitstream containing the geometry bitstream and the attribute bitstream to support reconstruction of the media sequence.
  • PCC: point cloud coding
  • PCC systems employ images with clouds of points.
  • the points have geometry, which is a position of the point in three dimensional (3D) space.
  • The points also have one or more attributes, such as color and/or reflectance (e.g., color and light).
  • Some systems use a highly processor intensive distance based algorithm to encode such attributes.
  • the disclosed examples instead employ a hierarchical tree that splits points into nodes for attribute coding based on point position.
  • the points are sorted into child nodes based on the relative distance between each point and a median point of the parent node.
  • the median point is the closest point to a center point of a tight bounding box around the parent node.
  • By employing a tight bounding box, empty space created by node splitting is omitted from nodes when possible. This shrinks the nodes and hence reduces the area to be searched, which reduces processor resource usage during both encoding and decoding.
  • applying the hierarchical tree further includes selecting a split axis for splitting the parent node as a selected axis that results in a maximum variance between the median point and other points of the parent node relative to the selected axis.
  • a split axis is selected to maximize node variance. This has the benefit of separating nodes with the highest variance into different child nodes. Nodes with high relative variance are likely to contain divergent attribute values. Separating such points may result in the selection of better predictors for the points, and hence may increase coding efficiency.
  • applying the hierarchical tree further includes distributing the points from the parent node into the child nodes based on a position of the median point and a direction of the split axis, and wherein points greater than the median point are included in a first of the child nodes and points less than the median point are included in a second of the child nodes.
  • applying the hierarchical tree further includes determining features for corresponding nodes resulting from applying the hierarchical tree, wherein the features for the corresponding nodes include a median point, a split axis index, a start index of points, an end index of points, a minimum point coordinate value, a maximum point coordinate value, or combinations thereof.
  • applying the hierarchical tree further includes determining the tight bounding box around the parent node based on a minimum point coordinate value and a maximum point coordinate value associated with points in the parent node.
  • applying the hierarchical tree further includes: assigning nodes resulting from the hierarchical tree into a plurality of layers of detail (LOD), selecting predictors for LODs containing child nodes, wherein the predictors are selected for a current point in a child node in a current LOD as a distance weighted average of N nearest points selected from a parent node in a previous LOD, and encoding point attribute values as a difference between actual values and corresponding predictor values.
  • LOD: layers of detail
  • Because each point receives a separate predictor, the difference between the predictor and the actual value of the point is likely to be small, which increases coding efficiency. Because the predictor is procedurally generated, allowing each point to have a separate predictor may not increase the amount of coded data in the bitstream, and hence may not negatively impact coding efficiency.
  • applying the hierarchical tree further includes generating refinement layers for the LODs by including points with minimum distances to node centroids into the refinement layers, and wherein a number of LODs are compressed into a number of refinement layers based on a value of an adaptation prediction threshold.
  • PCC systems may have a predefined and/or specified number of refinement layers. However, the number of LODs may be generated based on the nature of the corresponding point cloud data.
  • The adaptation prediction threshold may serve to compress the LODs into refinement layers as desired in order to automatically match the predefined/specified number of refinement layers.
  • another implementation of the aspect provides, wherein the hierarchical tree is a binary tree.
  • another implementation of the aspect provides, further comprising setting a flag in the bitstream to indicate the sparse point cloud is coded by the binary tree.
  • another implementation of the aspect provides, wherein the hierarchical tree is an octree.
  • the disclosure includes a method implemented in a decoder, the method comprising: receiving, by a receiver of the decoder, a PCC bitstream containing a geometry bitstream and an attribute bitstream that include an encoding of an image of a media sequence, the image comprising a sparse point cloud containing a plurality of points; applying, by a processor of the decoder, a hierarchical structure to the geometry bitstream to decode positions of the points in the image; applying, by the processor, a hierarchical tree to the points based on the positions of the points to decode attribute values of the points, wherein applying the hierarchical tree includes: selecting a median point of a parent node as a closest point to a center coordinate of a tight bounding box around the parent node, and splitting the parent node into a plurality of child nodes at the median point of the parent node; and reconstructing, by the processor, the sparse point cloud to reconstruct the image as part of the media sequence, wherein the
  • PCC systems employ images with clouds of points.
  • the points have geometry, which is a position of the point in 3D space.
  • The points also have one or more attributes, such as color and/or reflectance (e.g., color and light).
  • Some systems use a highly processor intensive distance based algorithm to encode such attributes.
  • the disclosed examples instead employ a hierarchical tree that splits points into nodes for attribute coding based on point position.
  • the points are sorted into child nodes based on the relative distance between each point and a median point of the parent node.
  • the median point is the closest point to a center point of a tight bounding box around the parent node.
  • applying the hierarchical tree further includes selecting a split axis for splitting the parent node as a selected axis that results in a maximum variance between the median point and other points of the parent node relative to the selected axis.
  • a split axis is selected to maximize node variance. This has the benefit of separating nodes with the highest variance into different child nodes. Nodes with high relative variance are likely to contain divergent attribute values. Separating such points may result in the selection of better predictors for the points, and hence may increase coding efficiency.
  • applying the hierarchical tree further includes distributing the points from the parent node into the child nodes based on a position of the median point and a direction of the split axis, and wherein points greater than the median point are included in a first of the child nodes and points less than the median point are included in a second of the child nodes.
  • applying the hierarchical tree further includes determining features for corresponding nodes resulting from applying the hierarchical tree, wherein the features for the corresponding nodes include a median point, a split axis index, a start index of points, an end index of points, a minimum point coordinate value, a maximum point coordinate value, or combinations thereof.
  • applying the hierarchical tree further includes determining the tight bounding box around the parent node based on a minimum point coordinate value and a maximum point coordinate value associated with points in the parent node.
  • applying the hierarchical tree further includes: assigning nodes resulting from the hierarchical tree into a plurality of LOD, selecting predictors for LODs containing child nodes, wherein the predictors are selected for a current point in a child node in a current LOD as a distance weighted average of N nearest points selected from a parent node in a previous LOD, and determining point attribute values by comparing coded values and corresponding predictor values.
  • Because each point receives a separate predictor, the difference between the predictor and the actual value of the point is likely to be small, which increases coding efficiency. Because the predictor is procedurally generated, allowing each point to have a separate predictor may not increase the amount of coded data in the bitstream, and hence may not negatively impact coding efficiency.
  • applying the hierarchical tree further includes generating refinement layers for the LODs by including points with minimum distances to node centroids into the refinement layers, and wherein a number of LODs are compressed into a number of refinement layers based on a value of an adaptation prediction threshold.
  • PCC systems may have a predefined and/or specified number of refinement layers. However, the number of LODs may be generated based on the nature of the corresponding point cloud data.
  • The adaptation prediction threshold may serve to compress the LODs into refinement layers as desired in order to automatically match the predefined/specified number of refinement layers.
  • another implementation of the aspect provides, wherein the hierarchical tree is a binary tree.
  • another implementation of the aspect provides, further comprising determining the sparse point cloud is coded by the binary tree by obtaining a flag from the bitstream.
  • another implementation of the aspect provides, wherein the hierarchical tree is an octree.
  • the disclosure includes a volumetric media coding device comprising: a processor, a receiver coupled to the processor, a memory coupled to the processor, and a transmitter coupled to the processor, wherein the processor, receiver, memory, and transmitter are configured to perform the method of any of the preceding aspects.
  • The disclosure includes a non-transitory computer readable medium comprising a computer program product for use by a volumetric media coding device, the computer program product comprising computer executable instructions stored on the non-transitory computer readable medium such that when executed by a processor cause the volumetric media coding device to perform the method of any of the preceding aspects.
  • the disclosure includes an encoder comprising: an image storing means for storing an image of a media sequence, the image comprising a sparse point cloud containing a plurality of points; a geometry bitstream means for applying a hierarchical structure to the points to encode a geometry bitstream describing positions of the points; an attribute bitstream means for applying a hierarchical tree to the points to encode an attribute bitstream describing attribute values of the points, wherein applying the hierarchical tree includes: selecting a median point of a parent node as a closest point to a center coordinate of a tight bounding box around the parent node, and splitting the parent node into a plurality of child nodes at the median point of the parent node; and a bitstream storing means for storing a PCC bitstream containing the geometry bitstream and the attribute bitstream to support reconstruction of the media sequence.
  • the disclosure includes a decoder comprising: a receiving means for receiving a PCC bitstream containing a geometry bitstream and an attribute bitstream that include an encoding of an image of a media sequence, the image comprising a sparse point cloud containing a plurality of points; a geometry bitstream means for applying a hierarchical structure to the geometry bitstream to decode positions of the points in the image; an attribute bitstream means for applying a hierarchical tree to the points based on the positions of the points to decode attribute values of the points, wherein applying the hierarchical tree includes: selecting a median point of a parent node as a closest point to a center coordinate of a tight bounding box around the parent node, and splitting the parent node into a plurality of child nodes at the median point of the parent node; and a
  • any one of the foregoing embodiments may be combined with any one or more of the other foregoing embodiments to create a new embodiment within the scope of the present disclosure.
  • FIG. 1 is a schematic diagram of an encoder for encoding sparse point clouds.
  • FIG. 2 is a schematic diagram of a decoder for decoding sparse point clouds.
  • FIG. 3 is a flowchart of an example method of coding point geometry in a sparse point cloud.
  • FIG. 4 is a flowchart of an example distance based method of coding point attributes in a sparse point cloud.
  • FIG. 5 is a schematic diagram of an example hierarchical tree based method of coding point attributes in a sparse point cloud.
  • FIG. 6 is a schematic diagram of an example mechanism for splitting a node in a tight bounding box at a median point according to a hierarchical tree based method of coding point attributes.
  • FIG. 7 is a schematic diagram of another example hierarchical tree based method of coding point attributes in a sparse point cloud.
  • FIG. 8 is a schematic diagram of an example mechanism of adaptively compressing layers of detail (LOD) into refinement layers.
  • FIG. 9 is a schematic diagram of an example volumetric media coding device.
  • FIG. 10 is a flowchart of an example method of encoding point attributes in a sparse point cloud.
  • FIG. 11 is a flowchart of an example method of applying a hierarchical tree to support encoding point attributes in a sparse point cloud.
  • FIG. 12 is a flowchart of an example method of decoding point attributes in a sparse point cloud.
  • FIG. 13 is a flowchart of an example method of applying a hierarchical tree to support decoding point attributes in a sparse point cloud.
  • FIG. 14 is a schematic diagram of an example system for coding point attributes in a sparse point cloud according to a hierarchical tree.
  • Media can include many types of data.
  • One example type of media data is the point cloud.
  • a point cloud is a group of points, typically in three dimensional (3D) space, that describes one or more objects.
  • a cloud of points can be used to represent a 3D object, such as a person, a machine component, a rendering of a room, etc.
  • a point cloud can be represented across a sequence of frames that may or may not be temporally related, depending on the example.
  • Representations of object(s) may be generated by a 3D scanner with a high sampling rate. Such representations may employ a correspondingly high number of points. Such representations may be referred to as dense point clouds.
  • Sparse point clouds are point clouds that are generated using a lower sampling rate, and hence fewer points.
  • A dense point cloud may be used to represent a detailed rendering of an object or space with textures and detailed outlines.
  • a sparse point cloud may be employed to represent the general outline of an object or space.
  • LIDAR: light detection and ranging
  • sparse point clouds may be used to represent LIDAR data.
  • Sparse point clouds can then be used to control an autonomous vehicle, stored and reviewed for testing purposes, etc.
  • Sparse point clouds may include significant amounts of empty space, and dense point clouds may not. Accordingly, compression algorithms for sparse point clouds may be different than those applied to dense point clouds.
  • the present disclosure is specifically related to sparse point clouds.
  • Points in a point cloud are described by geometry and one or more attributes. Compressing the geometry and attributes of the point cloud to reduce the file size is referred to as point cloud coding (PCC).
  • Geometry describes the position of a point in space.
  • An attribute describes something about the point. While many attributes are possible, color and reflectance (e.g., light intensity) are the most common use cases.
  • The geometry of points in a sparse point cloud may be encoded/decoded by applying an octree, which is discussed in greater detail below.
  • the attributes of the points may be encoded/decoded according to a distance based algorithm, which is also discussed in detail below.
  • the distance based algorithm selects a set of random root points to encode at a first layer of detail (LOD).
  • LOD: layer of detail
  • the attribute values for the root points are encoded in an attribute bitstream.
  • the algorithm also sorts the remaining points into refinement layers that make up additional levels of detail.
  • the attribute values of each level are encoded as the difference between the attribute value of the point(s) in a node at a current level of detail and the attribute value of a predictor at a previous level of detail. By only encoding the difference in values instead of the actual values, the attribute data is compressed. Further, the resulting image can be displayed at varying levels of detail based on the particular needs of the user.
  • The distance based algorithm places nodes into the refinement layers by recursively using search algorithms relative to the nodes. For example, a search algorithm can be applied to a set of root nodes to determine predictors for a first refinement layer and sort a subset of the nodes into the first refinement layer. The search algorithm can then be used iteratively on the second LOD to create a third LOD, on the third LOD to create a fourth LOD, etc.
  • media includes a sequence of pictures (also referred to as frames), with each picture including a point cloud to be compressed by the recursive search algorithms. Accordingly, the distance based search algorithm can take an immense amount of processor resources to complete.
  • the distance based algorithm for coding sparse point cloud attributes may be considered a non-deterministic polynomial-time hard (NP hard) problem. NP hard problems are known to be undesirable due to excessive computational resource usage.
  • Disclosed herein are mechanisms to encode and decode sparse point cloud attributes with a hierarchical tree. Such a mechanism is not NP hard, and hence significantly reduces computational resource usage during encoding and decoding.
  • the disclosure describes attribute coding using a binary tree and an octree.
  • the disclosed mechanisms may be modified to support other hierarchical tree structures.
  • Attribute coding using a binary tree is shown to increase coding speed by a factor of four to a factor of five when compared to the distance based approach. This indicates binary tree attribute coding can reduce coding time to between twenty percent and twenty five percent of the time used to execute the distance based approach on the same hardware.
  • the binary tree attribute coding is shown to increase compression by fifteen percent when compared to the distance based approach. Accordingly, the hierarchical tree based attribute coding schemes of the present disclosure significantly improve the function of the corresponding encoders and decoders (codecs) in terms of completion speed, processor resource usage, memory usage, and network resource usage when communicating the compressed data.
  • the hierarchical tree based attribute coding schemes are discussed in greater detail below. But generally, the hierarchical tree based attribute coding schemes assign the points in a sparse point cloud to a parent node based on the positions of the nodes from the geometry data. A tree is then built by recursively sub-dividing the parent node into child nodes, grand-child nodes, etc.
  • Each node is divided by determining a centroid of the node.
  • a centroid is a center coordinate of the node. In some examples, the centroid may be calculated as a center coordinate of the group of points in the node.
  • a median point is then selected, where the median point is the closest point in the node to the centroid.
  • the node is then split at the median point.
  • the node may be split along an x axis, a y axis, or a z axis.
  • the node may be split along the axis that maximizes variance between the resulting child nodes. This may result in separating points with different characteristics, and hence may increase prediction accuracy (and hence compression) when the predictor is selected.
  • the node may be split by axes in a predefined order or by selecting the axis with the longest edge value.
  • a tight bounding box is used for the nodes.
  • a tight bounding box is defined as extending far enough to include all points in the node, but terminating immediately beyond such points. This is in contrast to a loose bounding box that inherits dimensions from a parent node. The tight bounding box is smaller than the loose bounding box, and hence reduces the area to be searched by searching algorithms looking for points.
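The node split described above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function names are hypothetical, points are assumed to be (x, y, z) tuples held in a list, and the tie-breaking rule (points equal to the median point on the split axis go to the "greater" child) is an assumption the source does not specify.

```python
def tight_bounding_box(points):
    """Return (min_corner, max_corner) that just encloses the points."""
    mins = tuple(min(p[a] for p in points) for a in range(3))
    maxs = tuple(max(p[a] for p in points) for a in range(3))
    return mins, maxs


def median_index(points):
    """Index of the point closest to the center of the tight bounding box."""
    mins, maxs = tight_bounding_box(points)
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    return min(
        range(len(points)),
        key=lambda i: sum((points[i][a] - center[a]) ** 2 for a in range(3)),
    )


def split_axis(points, median):
    """Axis (0=x, 1=y, 2=z) with the largest squared spread about the median point."""
    return max(range(3), key=lambda a: sum((p[a] - median[a]) ** 2 for p in points))


def split_node(points):
    """Split a node at its median point along the maximum-variance axis."""
    m = median_index(points)
    median = points[m]
    axis = split_axis(points, median)
    rest = points[:m] + points[m + 1:]          # the median point itself is set aside
    greater = [p for p in rest if p[axis] >= median[axis]]  # tie rule: assumption
    lesser = [p for p in rest if p[axis] < median[axis]]
    return median, axis, greater, lesser


node = [(0, 0, 0), (10, 0, 0), (4, 1, 0), (9, 1, 1), (1, 0, 1)]
median, axis, greater, lesser = split_node(node)
print(median, axis, greater, lesser)
```

Because the box is recomputed from the points actually present in each node, a child node never inherits empty space from its parent, which is the property the tight bounding box is meant to provide.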
  • Points are assigned to LODs by including the median points in corresponding LODs when such points are used for node splits and assigning remaining points in leaf nodes at a lowest LOD.
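As a rough sketch of this LOD assignment, the following Python builds the binary tree recursively and records each median point at the LOD matching its tree depth. The names `build_lods` and `leaf_size`, the choice of tree depth as the LOD index, and the placement of leaf points in one final LOD are illustrative assumptions rather than details taken from the disclosure.

```python
def _split(points):
    """Split at the point nearest the tight-bounding-box center (see text)."""
    mins = [min(p[a] for p in points) for a in range(3)]
    maxs = [max(p[a] for p in points) for a in range(3)]
    center = [(lo + hi) / 2 for lo, hi in zip(mins, maxs)]
    m = min(
        range(len(points)),
        key=lambda i: sum((points[i][a] - center[a]) ** 2 for a in range(3)),
    )
    median = points[m]
    rest = points[:m] + points[m + 1:]
    axis = max(range(3), key=lambda a: sum((p[a] - median[a]) ** 2 for p in points))
    greater = [p for p in rest if p[axis] >= median[axis]]
    lesser = [p for p in rest if p[axis] < median[axis]]
    return median, greater, lesser


def build_lods(points, leaf_size=1):
    """Return a list of LODs; LOD d holds the median points chosen at tree depth d,
    and the last LOD holds the points left over in leaf nodes."""
    median_lods = []
    leaf_points = []

    def recurse(node, depth):
        if not node:
            return
        if len(node) <= leaf_size:      # leaf node: no further split
            leaf_points.extend(node)
            return
        median, greater, lesser = _split(node)
        while len(median_lods) <= depth:
            median_lods.append([])
        median_lods[depth].append(median)
        recurse(greater, depth + 1)
        recurse(lesser, depth + 1)

    recurse(list(points), 0)
    return median_lods + ([leaf_points] if leaf_points else [])


cloud = [(0, 0, 0), (10, 0, 0), (4, 1, 0), (9, 1, 1), (1, 0, 1)]
lods = build_lods(cloud)
for level, pts in enumerate(lods):
    print(level, pts)
```

Each split removes the median point from the node, so every child is strictly smaller than its parent and the recursion always terminates.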
  • the LODs are then grouped into refinement layers.
  • an adaptation prediction threshold parameter is used to compress LODs into refinement layers as desired to cause the number of refinement layers to match a predefined/specified value.
  • Predictors can then be selected to encode each point. Such predictors for a current refinement layer may be taken from a previous refinement layer.
  • a predictor is determined for each point, where the predictor is a distance weighted average of attribute values for N nearest points from a previous LOD and/or refinement layer.
  • the attribute(s) for each point can then be encoded as a difference between the actual attribute value and the predictor value.
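The prediction and residual steps can be illustrated as follows. The source only says the predictor is a "distance weighted average"; the inverse-distance weighting used here, the function names, and the shortcut for a zero distance are assumptions made for the sketch.

```python
import math


def predict_attribute(position, reference_points, n_nearest=3):
    """Predict an attribute from the N nearest points of the previous LOD.

    reference_points is a non-empty list of (position, attribute_value) pairs.
    Weights are 1/distance, which is an assumed weighting scheme.
    """
    nearest = sorted(reference_points, key=lambda r: math.dist(position, r[0]))
    nearest = nearest[:n_nearest]
    weighted = []
    for ref_pos, value in nearest:
        d = math.dist(position, ref_pos)
        if d == 0.0:
            return value            # co-located reference: use its value directly
        weighted.append((1.0 / d, value))
    total = sum(w for w, _ in weighted)
    return sum(w * v for w, v in weighted) / total


def encode_residual(actual, predictor):
    """Encoder side: only the difference is written to the attribute bitstream."""
    return actual - predictor


def decode_value(residual, predictor):
    """Decoder side: the same predictor is rebuilt and the residual is added back."""
    return residual + predictor


previous_lod = [((0, 0, 0), 10.0), ((2, 0, 0), 20.0), ((100, 0, 0), 99.0)]
pred = predict_attribute((1, 0, 0), previous_lod, n_nearest=2)
res = encode_residual(17.0, pred)
print(pred, res, decode_value(res, pred))
```

Since the decoder already knows the point positions from the geometry bitstream, it can rebuild the same predictor without any extra signaling, which is why a per-point predictor need not add data to the bitstream.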
  • The preceding discussion generally describes the application of a binary tree. However, the principles described herein can apply to other hierarchical tree structures, such as octree or others known in the art, by altering the node splitting and predictor selection as desired.
  • a use binary tree LOD flag may also be employed to indicate that the binary tree attribute encoding mechanism is employed instead of the distance based attribute encoding mechanism.
  • FIG. 1 is a schematic diagram of an encoder 100 for encoding sparse point clouds.
  • the encoder 100 receives positions 101 and attributes 103 of the points in the sparse point cloud as inputs.
  • the encoder 100 encodes the positions 101 and attributes 103 into a geometry bitstream 105 and an attribute bitstream 107, respectively, which are outputs of the encoder.
  • the geometry bitstream 105 and the attribute bitstream 107 are combined into a combined bitstream for transmission toward a decoder.
  • a sparse point cloud is a cloud of points that describes an object or space and is captured according to a process with a reduced number of samples (e.g., in comparison to a dense point cloud).
  • a dense point cloud and a sparse point cloud may have similar resolutions, but can be differentiated based on a sampling rate (e.g., number of samples over a distance value).
  • the sampling rate may be selected/set based on the acquisition process used to generate the point cloud.
  • a sparse point cloud may include a sampling density between 3D points that is an inverse quadratic proportion of the distance between the points.
  • Such a sparse point cloud may have a density of:
  • the positions 101 are the locations of such points in space (e.g., 3D space) and may be represented as coordinate points or other spatial representations.
  • the attributes 103 are features of the points, such as color, reflectance, etc.
  • A color attribute 103 may be a red, green, and blue (RGB) value, a luma, red difference, and blue difference (YCrCb) value, etc.
  • a reflectance attribute 103 may be an amount/intensity of light that bounces back from the point.
  • the geometry bitstream 105 is a series of compressed point positions 101 from one or more pictures (e.g., frames).
  • the attribute bitstream 107 is a series of compressed point attributes 103 from one or more pictures (e.g., frames).
  • the encoder 100 includes a geometry section for encoding the positions 101 and an attribute section for encoding the attributes 103.
  • the geometry section includes a transform coordinates module 111, a quantize points module 113, an analyze octree module 115, an analyze surface approximation module 119, and an arithmetic encode module 117.
  • the transform coordinates module 111 is applied to the positions 101 upon entering the encoder 100.
  • the positions 101 may not have any particular structure, may be floating point numbers, and may be presented relative to some coordinate system (e.g., an application specific or standardized coordinate system).
  • the transform coordinates module 111 transforms the positions 101 into a format usable by the encoder 100.
  • The parameters T and s are such that the point positions lie in a bounding cube [0, 2^d)^3 for some non-negative integer d.
  • N_out is the number of points in the decoded point cloud. N_out may not be the same as N.
  • s is specified by the triSoupIntToOrigScale parameter, T is [0, 0, 0], and d is specified by the triSoupDepth parameter.
  • Ceil(x) is the least integer greater than or equal to x.
  • Log2(x) is the base-2 logarithm of x.
  • Max(x1, ..., xN) is the maximum of x1, ..., xN.
  • the positions 101 are forwarded to the quantize points module 113.
  • the encoder 100 operates on non-negative integer values.
  • the quantize points module 113 performs rounding to create integer values and merges any resulting duplicate points.
  • Round(·) is a function that rounds the components of a vector to the nearest integer.
  • the duplicate point removal process is optional. If enabled, this process removes points with the same quantized coordinates. Multiple points with the same quantized position and different attributes are merged in a single point.
  • the attributes 103 associated with the single point are computed by the transfer attributes module 125 as described below.
  • the process of position quantization, duplicate point removal, and assignment of attributes 103 to the remaining points is called voxelization.
  • Voxelization is the process of grouping points together into voxels.
  • The set of voxels are the unit cubes [i - 0.5, i + 0.5) x [j - 0.5, j + 0.5) x [k - 0.5, k + 0.5) for integer values of i, j, and k between 0 and 2^d - 1.
  • the locations of all the points within a voxel are quantized to the voxel centre, and the attributes 103 of all the points within the voxel are combined (e.g., averaged) and assigned to the voxel.
  • a voxel is occupied when the voxel contains any point of the point cloud.
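A minimal Python sketch of voxelization as described above follows. The function name and the scalar attribute type are assumptions, and averaging is only one of the combinations the text allows. Note that the half-open voxel interval [i - 0.5, i + 0.5) corresponds to floor(x + 0.5) rather than Python's built-in round(), which rounds halves to the nearest even integer.

```python
import math
from collections import defaultdict


def voxelize(positions, attributes):
    """Quantize positions to voxel centers and average attributes per voxel.

    Returns a dict mapping each occupied voxel (i, j, k) to its mean attribute.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pos, attr in zip(positions, attributes):
        # A coordinate x in [i - 0.5, i + 0.5) maps to voxel index i.
        voxel = tuple(math.floor(c + 0.5) for c in pos)
        sums[voxel] += attr
        counts[voxel] += 1
    return {voxel: sums[voxel] / counts[voxel] for voxel in sums}


positions = [(0.4, 1.6, 2.0), (0.49, 2.4, 1.5), (3.5, 0.0, 0.0)]
reflectance = [10.0, 20.0, 5.0]
voxels = voxelize(positions, reflectance)
print(voxels)
```

The first two points fall in the same voxel, so they are merged into a single point whose attribute is the average of the two inputs.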
  • the analyze octree module 115 is configured to encode the positions 101 of the points and/or voxels by employing an octree.
  • An octree is a hierarchical structure used to recursively split nodes into sub-nodes. Specifically, an octree recursively splits a parent node into eight child nodes.
  • the octree can contain any number of layers of nodes. When used to encode a sparse point cloud, the octree denotes when a node contains at least one point. The node is split into eight sub-nodes.
  • Those sub-nodes that contain points are marked and further split into eight sub-nodes, etc. Nodes that contain no points are discarded. Such sub-division may continue until the nodes reach a size that is comparable to (e.g., larger than/equal to) a minimum threshold, such as a minimum scanning capability size (e.g., a maximum effective sampling capability) associated with a point scanner/LIDAR that generated the point cloud.
  • This approach compresses the actual position of the points/voxels to a location in a leaf node of the octree.
  • An example implementation of such a process can be formally described as follows.
  • a cubical axis-aligned bounding box B is defined by the two extreme points $(0, 0, 0)$ and $(2^d, 2^d, 2^d)$.
  • An octree structure is then built by recursively subdividing B. At each stage, a cube is subdivided into 8 sub-cubes.
  • An 8-bit code, named an occupancy code, is then generated by associating a 1-bit value with each sub-cube in order to indicate whether the sub-cube contains points (e.g., full and has a value of one) or not (e.g., empty and has a value of zero). Only full sub-cubes with a size greater than one (e.g., non-voxels) are further subdivided.
  • Because points may be duplicated, multiple points may be mapped to the same sub-cube of size one (e.g., the same voxel).
  • the number of points for each sub-cube of dimension one is also arithmetically encoded. For example, the occupancy of the octree as well as the number of points in each of the occupied smallest nodes can be sent to the arithmetic encode module 117 for encoding.
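The occupancy coding described above can be sketched as follows. This is a minimal illustration under stated assumptions: points are already quantized to integers in $[0, 2^d)$, nodes are visited breadth-first, and the mapping from child position to bit index is a hypothetical choice. The arithmetic coding stage is omitted.

```python
def encode_octree(points, d):
    """Return (occupancy_codes, leaf_counts) for integer points in [0, 2**d)^3."""
    codes = []
    layer = [((0, 0, 0), list(points))]  # (node origin, points inside the node)
    size = 1 << d
    while size > 1:                      # only sub-cubes larger than one are split
        half = size // 2
        next_layer = []
        for (ox, oy, oz), pts in layer:
            children = [[] for _ in range(8)]
            for (x, y, z) in pts:
                # Assumed bit order: x selects bit 2, y bit 1, z bit 0.
                idx = ((x >= ox + half) << 2) | ((y >= oy + half) << 1) | int(z >= oz + half)
                children[idx].append((x, y, z))
            code = 0
            for idx, child in enumerate(children):
                if child:                # full sub-cube: set its bit and keep it
                    code |= 1 << idx
                    origin = (ox + half * ((idx >> 2) & 1),
                              oy + half * ((idx >> 1) & 1),
                              oz + half * (idx & 1))
                    next_layer.append((origin, child))
            codes.append(code)           # empty sub-cubes are simply discarded
        layer, size = next_layer, half
    # Duplicated points may share a voxel, so the count per leaf is kept too.
    counts = [len(pts) for _, pts in layer]
    return codes, counts
```

For example, `encode_octree([(0, 0, 0), (3, 3, 3), (3, 3, 3)], 2)` yields one code for the root, one code for each of its two occupied children, and a point count for each occupied voxel.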
  • the resulting geometry data created from the positions 101 may be forwarded to the analyze surface approximation module 119.
  • the analyze surface approximation module 119 is an optional module that can be employed to encode an approximation of the surfaces of an object represented by a point cloud.
  • the analyze surface approximation module 119 may be used for dense point clouds.
  • the analyze surface approximation module 119 may use a trisoup algorithm that represents the surface of an object as a mesh of triangles. Data related to the triangle mesh can then be forwarded to the arithmetic encode module 117 for encoding.
  • the arithmetic encode module 117 is configured to encode positions 101 related data, such as octree data, node occupancy data, and/or object surface data, into the geometry bitstream 105.
  • the arithmetic encode module 117 can encode such data by using a binary code.
  • the positions 101 data can be received as a number of symbols.
  • the arithmetic encode module 117 can employ a look up table (LUT) with the most commonly used symbols and a cache of recently used symbols. If a symbol is in the LUT, the index of the location of the symbol in the LUT is encoded. If a symbol is in the cache, the index of the location of the symbol in the cache is encoded.
  • LUT: look up table
  • CAVLC: context adaptive variable length coding
  • CABAC: context-adaptive binary arithmetic coding
  • SBAC: syntax-based context-adaptive binary arithmetic coding
  • PIPE: probability interval partitioning entropy coding
  • the attribute section includes a transform colors module 123, a transfer attributes module 125, a Region Adaptive Hierarchical Transform (RAHT) module 127, a generate LOD module 129, a lifting module 131, a quantize coefficients module 133, and an arithmetic encode module 135.
  • the transform colors module 123 is an optional module.
  • the transform colors module 123 may be configured to convert RGB attributes 103 into a YCbCr form or vice versa. This is optional because quantization of color components may be agnostic to the color space of the components. This is because the components may be processed independently. In the event the transform colors module 123 is used, the transform colors module 123 ensures the attributes 103 are converted to the desired format for further processing.
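As an illustration of the color conversion, the sketch below converts between RGB and YCbCr. The text does not state which conversion matrix is used; the full-range BT.601 coefficients (as used in JFIF) are assumed here, and the function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to YCbCr (assumed full-range BT.601 coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse conversion, as a decoder-side module would apply it."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b
```

Each output component is computed independently of the others once the conversion is done, which is consistent with the components being processed independently in later stages.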
  • the transfer attributes module 125 receives the attributes 103 of the points from the transform colors module 123 and/or as input to the encoder 100. Also, the positions 101 of the points are forwarded to the transfer attributes module 125. Further, the output from the analyze octree module 115 and/or the analyze surface approximation module 119 is forwarded to the reconstruct geometry module 121.
  • the reconstruct geometry module 121 is configured to reconstruct positions 101a of the points based on the results of the encoding processes used by the octree and/or surface approximation algorithms. Accordingly, the reconstruct geometry module 121 generates positions 101a data in a manner similar to a decoder. The output of the reconstruct geometry module 121 is then forwarded to the transfer attributes module 125.
  • the transfer attributes module 125 may compare the positions 101a as encoded with the original positions 101 to determine the distortion created by the encoding process.
  • the transfer attributes module 125 is configured to alter the attributes 103 of the points, as desired, to offset the distortion created by the encoding of the geometry bitstream 105 from the positions 101, where the distortion is the difference between the positions 101 and the positions 101a. This process may also be called recoloring.
  • the transfer attributes module 125 may operate as follows. Given the input point cloud positions 101, the attributes 103, and the reconstructed positions 101a the objective of the attributes transfer procedure of the transfer attributes module 125 may be to determine the attribute 103 values that minimize the attribute 103 distortions.
  • $X^*$ is the nearest neighbour in the original point cloud and $a^*$ is the attribute 103 value associated with $X^*$.
  • $Q^+(i)$ describes the set of points in the original point cloud that share $\tilde{X}_i$ as their nearest neighbour in the reconstructed point cloud.
  • $H(i)$ is the number of elements in $Q^+(i)$.
  • the transfer attributes module 125 is coupled to the RAHT module 127 and the generate LOD module 129 via a switch. Accordingly, the transfer attributes module 125 can output results to the RAHT module 127 or the generate LOD module 129 (e.g., but may not output to both simultaneously in some examples).
  • the recolored attributes 103 and the reconstructed positions 101a may be forwarded to the RAHT module 127.
  • the RAHT module 127 provides an optional mechanism for encoding the attributes 103.
  • the RAHT module 127 traverses a set of blocks that contain the points of the point cloud in 3D space. The RAHT module 127 then applies a transform to such blocks to transform attributes 103 of the points/voxels contained in each block. The transform may vary depending on the spatial position of the block.
  • the transform of a parent block may be the concatenation of the two sibling blocks, with the exception that a first direct current (DC) component of the transforms of the two sibling blocks may be replaced by their weighted sum and difference.
  • the transforms of the two sibling blocks may be copied from the first and last parts of the transform of the parent block, with the exception that the DC components of the transforms of the two sibling blocks are replaced by their weighted difference and sum.
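The weighted sum and difference of the two DC components can be sketched as a two-by-two orthonormal rotation. The text does not give the weights; the sketch assumes the commonly published RAHT form in which each block's weight is its point count and the rotation uses square roots of the weights. Function names are hypothetical.

```python
import math

def raht_merge(dc1, w1, dc2, w2):
    """Combine the DC coefficients of two sibling blocks (forward direction).

    w1 and w2 are assumed to be the number of points in each sibling.
    Returns (low, high, w1 + w2).
    """
    s = math.sqrt(w1 + w2)
    a, b = math.sqrt(w1) / s, math.sqrt(w2) / s
    low = a * dc1 + b * dc2      # weighted sum: DC component of the parent
    high = -b * dc1 + a * dc2    # weighted difference: detail coefficient
    return low, high, w1 + w2

def raht_split(low, high, w1, w2):
    """Recover the sibling DC coefficients (inverse direction)."""
    s = math.sqrt(w1 + w2)
    a, b = math.sqrt(w1) / s, math.sqrt(w2) / s
    # The forward matrix is orthonormal, so its inverse is its transpose.
    return a * low - b * high, b * low + a * high
```

Because the rotation is orthonormal, the inverse direction needs no extra side information beyond the weights, which can be derived from the decoded geometry.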
  • the recolored attributes 103 and the reconstructed positions 101a may also be forwarded to the generate LOD module 129.
  • the generate LOD module 129 is another example attribute 103 encoding mechanism.
  • the generate LOD module 129 is configured to organize the points into various LODs based on their positions 101a. For example, the points may be organized into a first layer and one or more refinement layers. This allows the attributes 103 to be decoded and displayed in varying levels of detail as desired by the end user.
  • the generate LOD module 129 selects a set of random root points to encode at the first LOD. The remaining points are organized into refinement layers based on distance to the points of the first layer.
  • all the points are first marked as non-visited and the set of visited points, denoted as V, is set as empty.
  • the generate LOD module 129 may then proceed iteratively. At each iteration 1, the refinement layer R
  • The level of detail, $LOD_j$, is obtained by taking the union of the refinement layers $R_0, R_1, \ldots, R_j$. This process is repeated until all the LODs are generated or until all the vertices have been visited. Iterative use of a distance based search algorithm is enormously processor intensive, for example when multiple pictures are to be encoded at many LODs. This may be considered an NP-hard problem.
  • the present disclosure includes a modified generate LOD module 129 that significantly reduces coding time and increases attribute compression.
  • the disclosed generate LOD module 129 applies a hierarchical tree, such as a binary tree, an octree, and/or a hybrid binary tree, to sort the points for attribute encoding.
  • By employing a hierarchical tree, such as a binary tree, coding speed may be increased by a factor of four to a factor of five when compared to the iterative distance based search approach. This indicates that hierarchical tree attribute coding can reduce coding time to between twenty percent and twenty five percent of the time used to execute the distance based approach on the same hardware. Further, hierarchical tree attribute coding may increase compression by fifteen percent when compared to the distance based approach.
  • the generate LOD module 129 assigns the points in the point cloud to a parent node based on positions 101a.
  • the parent node is then recursively sub-divided into child nodes to create multiple LODs.
  • the points are divided into groups by position 101a without employing recursive search algorithms.
  • the output of the generate LOD module 129 (e.g., the points sorted according to position) can then be forwarded to the lifting module 131.
  • the lifting module 131 is configured to encode each LOD and/or refinement layer based on a previous LOD and/or refinement layer. For example, the attribute values 103 for the points in the first LOD are encoded.
  • the attribute values 103 for the first refinement layer are encoded as the difference between the actual value and the value of a predictor from the first LOD.
  • the attribute values 103 of further refinement layers are then iteratively encoded as the difference between the actual value and the value of a predictor from the previous refinement layer. Such differences may be referred to as residuals.
  • a residual is a difference between a predictor value and an actual value.
  • the lifting module 131 may also apply a transform to the predictors and residuals in order to convert such values into a series of coefficients for further encoding. It should be noted that the predictors may be determined by the generate LOD module 129 in some examples.
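The layer-by-layer residual coding described above can be sketched as follows. This is a simplified, lossless illustration: the transform and quantization stages are left out, attribute values are scalars, and the `predictor_index` table (which point of the previous layer predicts each point) is a hypothetical input standing in for whatever predictor selection the generate LOD module performs.

```python
def encode_residuals(lod_values, predictor_index):
    """lod_values[k] holds the attribute values of layer k, coarsest first.

    predictor_index[k][n] is the index, in layer k - 1, of the predictor for
    point n of layer k (entry 0 is unused). Returns (base, residuals).
    """
    base = list(lod_values[0])  # first LOD is coded directly
    residuals = []
    for k in range(1, len(lod_values)):
        prev = lod_values[k - 1]
        # Residual = actual value minus predictor value.
        residuals.append([v - prev[predictor_index[k][n]]
                          for n, v in enumerate(lod_values[k])])
    return base, residuals

def decode_residuals(base, residuals, predictor_index):
    """Rebuild every layer by adding residuals back onto predictor values."""
    lods = [list(base)]
    for k, layer in enumerate(residuals, start=1):
        prev = lods[k - 1]
        lods.append([prev[predictor_index[k][n]] + r for n, r in enumerate(layer)])
    return lods
```

When neighbouring points have similar attributes the residuals are small, which is what makes them cheaper to code than the raw values.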
  • the output of the RAHT module 127 and/or the lifting module 131 may be forwarded to the quantize coefficients module 133.
  • the quantize coefficients module 133 is configured to compress the predictors and residuals of the data corresponding to the attributes 103 for further encoding.
  • Many mechanisms can be employed to quantize the attribute 103 data.
  • the predictors and residuals may be received at the quantize coefficients module 133 as various transform coefficients.
  • the quantize coefficients module 133 may then multiply such coefficients by a weighting function to reduce the size of the coefficients for further encoding.
  • Different mechanisms can employ different weighting functions, such as a step function, a rounding function, a normalization function, etc.
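One simple instance of such a weighting function is uniform scalar quantization, sketched below. The single `step` parameter and the function names are assumptions; the text leaves the choice of weighting function open.

```python
def quantize(coeffs, step):
    """Scale each coefficient by 1 / step and round to the nearest integer."""
    # Note: Python's round() resolves exact .5 ties to the nearest even integer.
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    """Inverse weighting: scale the integer levels back by step."""
    return [q * step for q in levels]
```

With this choice the reconstruction error of each coefficient is at most half a step, so a larger step trades more distortion for smaller values to encode.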
  • the output of the quantize coefficients module 133 is a set of quantized transformed coefficients representing the attributes 103 of the points in the point cloud. Such output is received by the arithmetic encode module 135, which may be substantially similar to the arithmetic encode module 117. However, the arithmetic encode module 135 is dedicated to encoding attributes 103 into the attribute bitstream 107. In some examples, the arithmetic encode module 135 and the arithmetic encode module 117 may be implemented in a common module.
  • the preceding encoder 100 includes many optional modules and can be implemented by selectively applying various function(s), operator(s), options, etc. As noted above, different selections result in benefits and/or detriments for different cases. The only requirement is that a corresponding decoder must apply a copy and/or an inverse of any optional modules, function(s), operator(s), options, etc. that are selected at the encoder 100. This is because the encoded data is unreadable at the decoder if the decoder is unable to properly reverse the encoding to reconstruct the media sequence. Accordingly, the relevant modules of the encoder 100 can signal such choices used in encoding the geometry bitstream 105 and/or the attribute bitstream 107 to the decoder. These choices can be signaled in the geometry bitstream 105, the attribute bitstream 107, and/or the combined bitstream, depending on the example.
  • FIG. 2 is a schematic diagram of a decoder 200 for decoding sparse point clouds, for example as encoded in a combined bitstream by an encoder 100.
  • the decoder 200 includes various modules to reverse the encoding process from the encoder 100.
  • the decoder 200 receives a geometry bitstream 205 and an attribute bitstream 207 as inputs. Such bitstreams are substantially similar to the geometry bitstream 105 and the attribute bitstream 107, respectively, and may be received as part of a combined bitstream (e.g., including relevant signaling syntax).
  • the decoder 200 is configured to decode the geometry bitstream 205 and the attribute bitstream 207 to generate positions 201 and attributes 203 of points in a point cloud for display.
  • the point cloud can then be reconstructed and displayed or otherwise employed based on the positions 201 and attributes 203 of the points.
  • the positions 201 and attributes 203 should be approximately equivalent to positions 101 and attributes 103, respectively. Any differences in such values are perceived as distortion. While distortion is undesirable, distortion is a side effect of lossy compression. Accordingly, compression and distortion may be considered design tradeoffs.
  • Beneficial encoder 100 and decoder 200 designs seek to balance compression and distortion along with other design constraints, such as hardware resource usage, etc. For example, when coding sparse point clouds for use by machines, such as autonomous vehicles, distortion may be less of a detractor, in which case compression can be emphasized.
  • the decoder 200 includes a geometry section for decoding the positions 201 of the points.
  • the geometry section includes an arithmetic decode module 217, a synthesize octree module 215, a synthesize surface approximation module 219, a reconstruct geometry module 221, and an inverse transform coordinates module 211.
  • Such modules are configured to reverse the encoding processes performed by corresponding modules at the encoder.
  • the arithmetic decode module 217 is configured to receive the geometry bitstream 205 and convert a binary representation of the geometry bitstream 205 into quantized transformed coefficients representing the geometry.
  • the arithmetic decode module 217 is configured to perform an inverse function of the function provided by an arithmetic encode module 117.
  • the arithmetic decode module 217 may employ a LUT with a list of most commonly used symbols and a cache of recently used symbols. The arithmetic decode module 217 can then read indices from the geometry bitstream 205 to obtain coefficients from the LUT and/or cache in order to reverse a similar process performed by the encoder.
  • the value can be converted directly into a coefficient.
  • the arithmetic decode module 217 may employ CAVLC, CABAC, SBAC, PIPE coding, or other entropy coding technique as desired.
  • the transformed coefficients representing the geometry are output from the arithmetic decode module 217 and forwarded to the synthesize octree module 215 and the synthesize surface approximation module 219.
  • the synthesize octree module 215 is configured to recreate the octree from the transformed coefficients.
  • the synthesize octree module 215 may apply an inverse transform to recover the octree data, which can then be used to reconstruct the octree and determine the point positions based on the octree.
  • the decoding process starts by reading from the bitstream the dimensions of a bounding box (B) that contains the point cloud.
  • the octree structure is recreated by subdividing B according to the occupancy codes from the geometry bitstream 205.
  • For a sub-cube of dimension one (e.g., a lowest level of the octree), the number of points for that sub-cube, denoted as c, is arithmetically decoded from the geometry bitstream 205.
  • a number of points c are then generated at the origin of the sub-cube. This results in a reconstruction of a number of points c that are positioned in a location that approximates the reconstructed point cloud.
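The decoding steps above can be sketched as follows. The sketch assumes the occupancy codes arrive in breadth-first order, that the bounding box has side $2^d$, and a hypothetical bit order in which x selects bit 2, y bit 1, and z bit 0 of each code. Arithmetic decoding is omitted, so the codes and counts are passed in as plain lists.

```python
def decode_octree(codes, counts, d):
    """Rebuild point positions from occupancy codes and per-voxel point counts."""
    layer = [(0, 0, 0)]          # origins of the occupied nodes in this layer
    size = 1 << d                # side length of the bounding box B
    code_iter = iter(codes)
    while size > 1:
        half = size // 2
        next_layer = []
        for (ox, oy, oz) in layer:
            code = next(code_iter)
            for idx in range(8):
                if code & (1 << idx):   # sub-cube idx is occupied
                    next_layer.append((ox + half * ((idx >> 2) & 1),
                                       oy + half * ((idx >> 1) & 1),
                                       oz + half * (idx & 1)))
        layer, size = next_layer, half
    points = []
    for origin, c in zip(layer, counts):
        points.extend([origin] * c)     # c points generated at the sub-cube origin
    return points
```

For example, `decode_octree([129, 1, 128], [1, 2], 2)` places one point at (0, 0, 0) and two coincident points at (3, 3, 3).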
  • the synthesize surface approximation module 219 is an optional module configured to reverse the encoding processes of the analyze surface approximation module 119. Accordingly, when used, the synthesize surface approximation module 219 may be configured to employ a trisoup algorithm to decode a triangular mesh in order to approximate the surfaces of the point cloud from the geometry bitstream 205. For example, the synthesize surface approximation module 219 may read various vertices from the geometry bitstream 205 and form triangles from the vertices by determining a centroid, mean-removed coordinates, and scaled variances for the triangles. Such information can then be used to reconstruct the triangular mesh to approximate the surface, for example by projecting the triangles onto corresponding planes based on the angles of the triangles.
  • the output of the synthesize octree module 215 and/or the synthesize surface approximation module 219 is received at the reconstruct geometry module 221, which is substantially similar to reconstruct geometry module 121 (e.g., reconstruct geometry module 121 is designed to predict the results generated by the reconstruct geometry module 221).
  • the reconstruct geometry module 221 assigns internal coordinates to the points based on the cube of the octree in which such points reside.
  • the triangular mesh can also be used to adjust the coordinates of such points into positions indicated by the triangular mesh.
  • the output of the reconstruct geometry module 221 can then be output as positions 201, or optionally forwarded to an inverse transform coordinates module 211.
  • the inverse transform coordinates module 211 is an optional module that reverses the coordinate transform mechanism of the transform coordinates module 111.
  • the inverse transform coordinates module 211 may apply an inverse transform to convert the coordinates of the points to an external coordinate system.
  • the decoder 200 also includes an attribute section for decoding the attributes 203 of the points.
  • the attribute section includes an arithmetic decode module 235, an inverse quantize module 233, a RAHT module 227, a generate LOD module 229, an inverse lifting module 231, and an inverse transform colors module 223.
  • Such modules are configured to reverse the encoding processes performed by corresponding modules at the encoder.
  • the arithmetic decode module 235 is similar to the arithmetic decode module 217, but is configured to decode both the geometry bitstream 205 and the attribute bitstream 207. In some examples, the arithmetic decode module 235 and the arithmetic decode module 217 are implemented in the same module. The arithmetic decode module 235 is configured to convert a binary representation of the geometry bitstream 205 and the attribute bitstream 207 into quantized transformed coefficients representing the geometry and the attributes of the points.
  • the resulting data from the arithmetic decode module 235 is forwarded to the inverse quantize module 233, which is configured to remove some or all of the quantization from the data (e.g., depending on whether the quantization process is lossy or not).
  • the inverse quantize module 233 may apply an inverse weighting function to the coefficients.
  • the inverse weighting function is selected to reverse the compression applied by the quantize coefficients module 133. This may result in uncompressed coefficients for predictors and residuals of the attribute 203 data (e.g., and relevant data from the geometry bitstream 205).
  • An inverse transform can also be applied to convert such data into a form useable by the other attribute 203 related modules.
  • As shown in FIG. 2, the inverse quantize module 233 is coupled to the RAHT module 227 and the generate LOD module 229 via a switch. Accordingly, the inverse quantize module 233 can output results to the RAHT module 227 or the generate LOD module 229 (e.g., but may not output to both simultaneously in some examples).
  • the RAHT module 227 is an optional module that receives reconstructed positions 201 from the reconstruct geometry module 221 and attribute 203 data from the inverse quantize module 233.
  • the RAHT module 227 is designed to reverse the operation of the RAHT module 127.
  • the generate LOD module 229 receives reconstructed positions 201 from the reconstruct geometry module 221 and attribute 203 data from the inverse quantize module 233.
  • the generate LOD module 229 is substantially similar to the generate LOD module 129 of the encoder 100, and functions in substantially the same manner.
  • the generate LOD module 229 employs the reconstructed positions 201 to include the points in a node and applies a hierarchical tree to recursively divide the node into sub-nodes in order to generate a first LOD and a plurality of refinement layers.
  • the resulting LODs can then be forwarded to the inverse lifting module 231.
  • the generate LOD module 229 may also determine the predictors for the points, for example based on a predetermined mechanism that selects predictors for points at a lower LOD/refinement layer from a higher LOD/refinement layer.
  • the mechanism for selecting predictors may be the same at the encoder 100 and decoder 200 to result in an accurate reconstruction of the point cloud.
  • the inverse lifting module 231 performs the inverse operation of the lifting module 131.
  • the inverse lifting module 231 receives the LODs, refinement layers, and/or predictors from the generate LOD module 229.
  • the inverse lifting module 231 can also determine the residuals for the attributes 203 from the attribute bitstream 207 (e.g., after arithmetic decoding, inverse quantization, etc. by other modules as discussed above). For example, the inverse lifting module 231 may apply an inverse transform to coefficients in order to recover the residuals.
  • the inverse lifting module 231 may then determine the attribute 203 values for the points of the various LODs based on the determined predictors and the coded residuals from the bitstream.
  • the attribute 203 values for points at the first LOD can be set as indicated in the attribute bitstream 207
  • attribute 203 values for the points at the second LOD (e.g., first refinement layer)
  • attribute 203 values for the points at the third LOD can be set by adding attribute 203 values of a second set of predictors associated with the second LOD to a second set of residuals associated with the third LOD
  • the reconstructed attributes 203 may be forwarded from the inverse lifting module 231 and/or the RAHT module 227 to an optional inverse transform colors module 223.
  • the inverse transform colors module 223 is configured to reverse the operations of the transform colors module 123. Accordingly, the inverse transform colors module 223 may transform the attributes 203 from YCbCr form to RGB form, or vice versa, as desired.
  • the point cloud can then be reconstructed by applying the attributes 203 to the points at positions 201.
  • the resulting point cloud can also be included in a picture, also known as a frame.
  • Multiple pictures of the point cloud can be reconstructed and related to create a media sequence.
  • Such pictures may be related by different factors in different examples. For example, pictures of the point cloud may be related based on the relative position of a scanning device/LIDAR when such pictures were captured.
  • the media sequence can be forwarded to a display for viewing by a user and/or stored in memory for review by a machine, for example for use in controlling and/or debugging actions of a machine that captured the point cloud.
  • FIG. 3 is a flowchart of an example method 300 of coding point geometry in a sparse point cloud.
  • method 300 may be employed by an encoder 100 to generate a geometry bitstream 105.
  • Method 300 begins when a point cloud is received for encoding.
  • the points of the point cloud are assigned to a parent node in a first layer.
  • the parent node and the first layer may be considered to be the current node and the current layer.
  • the method 300 then proceeds to step 303.
  • At step 303, the method 300 determines whether the current node is occupied. A node is occupied when the node contains at least one point in the point cloud. If the node is not occupied, the method 300 proceeds to step 307. At step 307, the current node is omitted from the octree and the method 300 proceeds to step 309. If the node is occupied at step 303, the method 300 proceeds to step 305. At step 305, the method 300 indicates the current node is occupied in memory. The current node is split into eight child nodes which are included in a next layer. It should be noted that an octree employs eight nodes. However, other hierarchical trees can be employed that use different numbers of nodes. The method 300 then proceeds to step 309.
  • At step 309, the method 300 determines whether there are more nodes in the current layer. If there are more nodes in the current layer, then the method 300 proceeds to step 311. At step 311, the next node in the current layer is set as the current node. The method 300 then returns to step 303. If there are no more nodes in the current layer at step 309, the method 300 proceeds to step 313.
  • At step 313, the method 300 determines whether the lower threshold for the octree has been reached.
  • the lower threshold may be associated with the sampling limit of the scanner that created the point cloud.
  • the threshold may be set by default and/or set by a user and encoded in the bitstream to signal the decoder. If the threshold has not been reached, the method 300 proceeds to step 315 and moves to the next layer.
  • At step 315, the method 300 makes the first child node in the next layer the current node. The next layer is also made the current layer. The method 300 then returns to step 303. If the lower threshold is reached at step 313, then the current layer is the last layer of the octree. Accordingly, the method 300 proceeds to step 317.
  • At step 317, the method 300 indicates the number of points in each of the occupied nodes in memory and ends. The occupied nodes and number of points in the last layer of occupied nodes can then be encoded in the geometry bitstream.
  • the method 300 can be modified to operate on a decoder 200 to decode the geometry bitstream 205 by reading from the geometry bitstream to determine whether a node is occupied instead of writing to the bitstream. Further, the number of points are assigned to the last layer of occupied nodes based on the data in bitstream instead of writing such data to the bitstream.
  • FIG. 4 is a flowchart of an example distance based method 400 of coding point attributes in a sparse point cloud.
  • method 400 may be employed to implement the generate LOD module 129 in the encoder 100 or the generate LOD module 229 in the decoder 200.
  • the method 400 may be used to encode or decode attributes of a point cloud to/from an attribute bitstream.
  • the method 400 sorts points into varying LODs so that each LOD can be coded based on a lower or higher LOD, depending on implementation.
  • An attribute signal 403 is received at the generate LOD module.
  • the attribute signal 403 includes the attributes of points in a point cloud and is substantially similar to attributes 103 as modified by intervening components.
  • the points and corresponding attributes are split into a group of higher level nodes (H(N)) and lower level nodes (L(N)). This split is performed by employing a distance threshold, which may be user defined. For example, a set of root points is randomly selected. A search algorithm is then employed from each of the root points. Points that are within the threshold distance of a root point are assigned to L(N) and root points and any other points that are not within the threshold of a root point are placed into H(N). The attributes for points in H(N) are coded directly.
  • a prediction is performed. Specifically, a set of predictors P(N) is selected for use in encoding the next LOD. The predictors may be selected as the closest point to each root point that is beyond the distance threshold.
  • the predictors are encoded as a difference between the attribute value of the predictor and the attribute value of the associated root point. The encoded attribute values for the root points and the predictors are output as D(N).
  • an update is determined. Specifically, the attribute values of the predictors are obtained from D(N) and prepared to update the attribute values of the next LOD. The attribute values of the predictors are stored in U(N). At block 409, U(N) is applied to L(N).
  • the values of L(N) are then sent to the block 401 to be further split into additional levels. This process is applied recursively until the attributes of the points are encoded into the desired number of LODs. As can be seen from the foregoing, method 400 compresses the attribute values of the points into varying LODs. However, method 400 performs this task at the cost of recursively calling a search algorithm for sets of points at each LOD. This process is extremely processor intensive, particularly when multiple pictures each including a point cloud are to be coded. From a complexity standpoint, this approach may be considered an NP-hard problem, which is generally disfavored in the art.
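To make the cost of the distance based approach concrete, the sketch below sorts points into layers with a greedy distance test. It is a simplified stand-in rather than method 400 itself: root points are taken in input order instead of being selected randomly, one hypothetical threshold is supplied per layer, and prediction and update steps are omitted.

```python
import math

def distance_based_lods(points, thresholds):
    """Sort point indices into layers using one distance threshold per layer.

    A non-visited point joins the current layer if it lies at least the
    threshold away from every visited point; leftovers form the last layer.
    """
    visited = []                       # indices of all points accepted so far
    layers = []
    remaining = list(range(len(points)))
    for t in thresholds:               # largest threshold first
        layer, still_remaining = [], []
        for i in remaining:
            # Distance search against every visited point: this nested scan
            # is what makes the approach expensive for large clouds.
            if all(math.dist(points[i], points[j]) >= t for j in visited):
                visited.append(i)
                layer.append(i)
            else:
                still_remaining.append(i)
        layers.append(layer)
        remaining = still_remaining
    layers.append(remaining)
    return layers
```

Every candidate point is compared against all previously accepted points at every layer, so the number of distance evaluations grows roughly quadratically with the number of points in this sketch.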
  • FIG. 5 is a schematic diagram of an example hierarchical tree based method 500 of coding point attributes in a sparse point cloud.
  • the hierarchical tree shown in FIG. 5 is a binary tree.
  • Method 500 may be employed to implement the generate LOD module 129 in the encoder 100 or the generate LOD module 229 in the decoder 200. Accordingly, the method 500 may be used to encode or decode attributes of a point cloud to/from an attribute bitstream.
  • the method 500 is applied to a picture containing a point cloud.
  • the method 500 sorts the points into a plurality of LODs for encoding by a lifting algorithm.
  • the method 500 initially assigns the points, denoted as P, to a first layer node 501 in a first LOD 511.
  • the method 500 divides the first layer node 501 into second layer nodes 503.
  • the first layer node 501 may be divided into second layer nodes 503 by a plane that bisects the first layer node 501.
  • the points P in the first layer node 501 are split into groups P1 and P2 based on the position of such points relative to the plane.
  • the second layer nodes 503 make up a second LOD 512.
  • method 500 may determine a centroid of the first layer node 501.
  • a centroid may be a coordinate at the exact center of the corresponding node, or a coordinate in the center of the group of points in the node.
  • the method 500 may then select a median point, which is a cloud point in the corresponding node that is closest to the centroid.
  • the first layer node 501 can then be split by a plane that bisects the first layer node 501 via the median point.
  • the plane may be selected to maximize variance, based on a predefined order, and/or based on a maximum edge value, depending on the example.
  • the median point may be retained in the first layer node 501, and the remaining points P1 and P2 may be sorted into the second layer nodes 503 based on their positions relative to the plane splitting the first layer node 501.
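The split of a single node described above can be sketched as follows. This is a minimal illustration, not the codec's implementation: the function name `split_at_median` is hypothetical, the centroid is taken as the mean of the point coordinates, and "maximize variance" is read as choosing the axis along which the node's coordinates have the largest variance.

```python
def split_at_median(points):
    """Split one node by an axis-aligned plane through its median point.

    points: list of (x, y, z) tuples. Returns (median, axis, left, right).
    """
    n = len(points)
    # Centroid: mean coordinate of the points in the node (one of the two
    # centroid definitions mentioned in the text).
    centroid = [sum(p[a] for p in points) / n for a in range(3)]

    def dist2(p):
        return sum((p[a] - centroid[a]) ** 2 for a in range(3))

    # Median point: the cloud point closest to the centroid.
    idx = min(range(n), key=lambda i: dist2(points[i]))
    median = points[idx]

    def variance(a):
        return sum((p[a] - centroid[a]) ** 2 for p in points) / n

    # Split axis (assumption): the axis with the largest coordinate variance.
    axis = max(range(3), key=variance)

    # The median point stays in the parent; the rest go to the child nodes.
    rest = points[:idx] + points[idx + 1:]
    left = [p for p in rest if p[axis] < median[axis]]
    right = [p for p in rest if p[axis] >= median[axis]]
    return median, axis, left, right
```

The returned `left` and `right` lists play the role of the groups P1 and P2 that populate the second layer nodes.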
  • FIG. 5 depicts a tree of depth four with four LODs.
  • a tree can have j LODs where j is a default or user defined value (e.g., based on the maximum sampling capability of a scanner that created the point cloud).
  • the second layer nodes 503 are split into third layer nodes 505 in a third LOD 513, for example by a plane bisecting the median points of the second layer nodes 503.
  • the points P1 and P2 of the second layer nodes 503 are further split based on their position relative to the planes that split the second layer nodes 503 to create the third layer nodes 505.
  • points P1 are split into points P1a and P1b
  • points P2 are split into points P2a and P2b.
  • the third layer nodes 505 are split into fourth layer nodes 507 in a final LOD 514, for example by a plane bisecting the median points of the third layer nodes 505.
  • the points P1a, P1b, P2a, and P2b of the third layer nodes 505 are further split based on their position relative to the planes that split the third layer nodes 505 to create the fourth layer nodes 507.
  • points P1a are split into points P1aa and P1ab
  • points P1b are split into points P1ba and P1bb
  • points P2a are split into points P2aa and P2ab
  • points P2b are split into points P2ba and P2bb.
  • the hierarchical tree based method 500 sorts points into nodes based on position.
  • the method 500 may extract the point cloud points into the first layer node 501, build a binary tree, determine attributes of each node in the binary tree, and generate refinement layers from the LODs 511-514.
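A rough sketch of how the recursive splitting could populate the LODs is given below. It rests on assumptions that the text leaves open: each non-leaf layer contributes the median points it retains, the leaf layer keeps all remaining points, and the split axis is the longest edge of the node (one of the alternatives the description mentions). The names `split_node` and `build_lods` are hypothetical.

```python
def split_node(points):
    """Split one node; returns (median_point, left_points, right_points)."""
    n = len(points)
    centroid = [sum(p[a] for p in points) / n for a in range(3)]
    # Median point: the point closest to the centroid of the node.
    idx = min(
        range(n),
        key=lambda i: sum((points[i][a] - centroid[a]) ** 2 for a in range(3)),
    )
    # Split axis (assumption): the longest edge of the node's extent.
    axis = max(
        range(3),
        key=lambda a: max(p[a] for p in points) - min(p[a] for p in points),
    )
    median = points[idx]
    rest = points[:idx] + points[idx + 1:]
    left = [p for p in rest if p[axis] < median[axis]]
    right = [p for p in rest if p[axis] >= median[axis]]
    return median, left, right


def build_lods(points, depth):
    """Return `depth` LODs, coarsest first, covering every input point once."""
    lods = [[] for _ in range(depth)]
    nodes = [list(points)]  # the first layer node holds all points P
    for level in range(depth):
        children = []
        for node in nodes:
            if not node:
                continue
            if level == depth - 1:
                lods[level].extend(node)  # leaf layer keeps what remains
                continue
            median, left, right = split_node(node)
            lods[level].append(median)  # median is retained at this layer
            children.extend([left, right])
        nodes = children
    return lods
```

With a depth of four, the four returned lists would correspond to LODs 511 through 514 under the stated assumptions.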
  • the points of the nodes 501, 503, 505, and 507 can then be encoded by the lifting algorithm.
  • a refinement layer may contain a plurality of LODs 511-514 in some cases. Example mechanisms for refinement layer generation are discussed in greater detail with respect to the FIGs. below.
  • a refinement layer may contain multiple points, which include the median points used to split the corresponding nodes in the refinement layer.
  • the lifting algorithm may then select predictors for points in a current refinement layer. Such predictors may be selected from a previous refinement layer and/or LOD. As a specific example, LOD 514 and LOD 513 may be assigned to different refinement layers. In such a case, points from P1a, P1b, P2a, and P2b in LOD 513 can be selected to act as predictors for points P1aa, P1ab, P1ba, P1bb, P2aa, P2ab, P2ba, and P2bb in the final LOD 514. For instance, a predictor for a current point in LOD 514 may be selected as a weighted average of N points from LOD 513 that are nearest to the position of the current point.
  • N may be a predefined and/or user selected value.
  • the lifting algorithm may encode attribute values from the highest LOD 511/refinement layer and then use such values as predictors for lower refinement layers. Attribute values for points in the lower refinement layers are coded as a residual, which is a difference between the predictor values and the actual attribute values.
  • points from P may predict P1 and P2
  • points from P1 and P2 may predict points in P1a, P1b, P2a, P2b, etc. depending on the compression of corresponding LODs 511-514 into refinement layers. This scheme ensures that only the attribute values of the first LOD 511 are completely encoded and all remaining values are encoded as residuals.
  • the predictors may be procedurally selected, and hence may not be coded in the bitstream to save space.
  • the attributes may be coded as attribute values in the first refinement layer.
  • Method 500 is discussed in terms of encoding, but operates in a substantially similar manner at a decoder.
  • the nodes are split into LODs 511-514, which sorts points based on positions that are obtained from decoding the geometry bitstream.
  • the difference is that the inverse lifting algorithm is applied instead of the lifting algorithm.
  • the attribute value(s) of the point(s) in the first LOD 511/refinement layer are obtained from the attribute bitstream.
  • the attribute values of the remaining LODs/refinement layers are obtained by adding the residuals from the bitstream to the corresponding predictors.
  • the attribute values for each current refinement layer are determined based on residuals for the current refinement layer and attribute values from the preceding refinement layer/LOD (as determined from the attribute bitstream).
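The predictor/residual relationship shared by the encoder and decoder can be illustrated with a simplified sketch. It is not the full lifting transform: quantization, entropy coding, and the lifting update step are omitted, inverse-distance weighting is an assumed choice for the weighted average, and the names `predict`, `encode_layer`, and `decode_layer` are hypothetical.

```python
def predict(position, ref_points, ref_values, n=3):
    """Weighted average of the n reference points nearest to `position`.

    ref_points/ref_values come from the previous LOD or refinement layer.
    Inverse-distance weights are an assumption made for illustration.
    """
    def dist(q):
        return sum((a - b) ** 2 for a, b in zip(position, q)) ** 0.5

    nearest = sorted(range(len(ref_points)), key=lambda i: dist(ref_points[i]))[:n]
    weights = [1.0 / (dist(ref_points[i]) + 1e-9) for i in nearest]
    total = sum(weights)
    return sum(w * ref_values[i] for w, i in zip(weights, nearest)) / total


def encode_layer(points, values, ref_points, ref_values, n=3):
    """Encoder side: residual = actual attribute value - predictor."""
    return [v - predict(p, ref_points, ref_values, n) for p, v in zip(points, values)]


def decode_layer(points, residuals, ref_points, ref_values, n=3):
    """Decoder side: attribute value = residual + predictor."""
    return [r + predict(p, ref_points, ref_values, n) for p, r in zip(points, residuals)]
```

Because the predictor depends only on point positions and previously decoded values, the decoder can recompute it without any predictor data being carried in the bitstream.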
  • Method 500 sorts the points into layers for coding based on point positions, but not based on relative distance between points. Accordingly, method 500 may not employ a recursive search algorithm as is done in method 400. Omission of the recursive search algorithm significantly reduces processor resource usage during encoding and decoding.
  • attribute coding using the binary tree of method 500 may increase coding speed by a factor of four to a factor of five when compared to the distance based approach of method 400. This indicates binary tree attribute coding according to method 500 can reduce coding time to between twenty percent and twenty five percent of the time used to execute the distance based approach of method 400 on the same hardware.
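The relationship between the stated speed-up and the stated coding time can be written out as follows, where the symbols T_400 and T_500 (coding times of methods 400 and 500) and s (the speed-up factor) are introduced here only for illustration:

```latex
T_{500} = \frac{T_{400}}{s}, \qquad s \in [4, 5]
\;\Longrightarrow\;
\frac{T_{500}}{T_{400}} = \frac{1}{s} \in \left[\tfrac{1}{5}, \tfrac{1}{4}\right] = [0.20,\ 0.25]
```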
  • the binary tree attribute coding of method 500 may increase compression by fifteen percent when compared to the distance based approach of method 400. Accordingly, the hierarchical tree based attribute coding scheme of method 500 significantly improves the function of the corresponding codecs in terms of completion speed, processor resource usage, memory usage, and network resource usage when communicating the compressed data.
  • attribute data of the point cloud may be encoded as a binary tree to represent LODs.
  • One of the implementations to build the binary tree for attribute LODs employs the following steps. All of the points N from a point cloud data can be extracted and included in a node of a binary tree. A binary tree can then be built.
  • the binary tree depth can be set according to:
  • the binary tree may be built by determining split axes for the binary subdivision of each node.
  • a split axis may be determined as a plane that bisects the node at a median point Pma and results in a maximum variance at the median point.
  • other subdivision mechanisms may be employed such as splitting the nodes along the axis with maximum edge value at the median point of the axis, splitting in a predefined order (e.g., x, y, z ... x, y, z axis), etc.
  • Pi is a current point denoted by an index i
  • Pla is a left node
  • Pra is a right node
  • Pma is a median point. It should be noted that the index of the node may follow a similar parity index, for example where left nodes have odd index and right nodes have even index.
  • the resulting nodes may contain the following features: a median point associated with a split axis (e.g., expressed as a position coordinate), a split axis index indicating the split axis (e.g., zero, one, or two) corresponding to an axis (e.g., x, y, or z), a start index corresponding to a first point of the point cloud in the corresponding node, and an end index corresponding to a last point of the point cloud in the corresponding node.
  • a node can be designated as having a tight bounding box or a loose bounding box. A node having a loose bounding box maintains boundaries from a parent node as modified by a split.
  • a node having a tight bounding box maintains boundaries that are minimally sufficient to encompass all of the points assigned to the node. Hence, a node with a tight bounding box may contain smaller boundaries than the parent node even for walls that are not split by the split axis for the parent. Based on the foregoing, nodes may also contain the following features: a minimum coordinate value and a maximum coordinate value, which may represent either a loose bounding box or a tight bounding box. A loose bounding box may be inferred from the intersection of a median plane with the parent node bounding box. A tight bounding box may be calculated from the minimum and maximum point coordinates in a corresponding node (e.g., of the child node resulting from a split).
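The two kinds of bounding box can be contrasted with a short sketch. The function names and the `(min_corner, max_corner)` tuple representation are assumptions made for illustration.

```python
def tight_box(points):
    """Smallest axis-aligned box containing every point of the node."""
    lo = tuple(min(p[a] for p in points) for a in range(3))
    hi = tuple(max(p[a] for p in points) for a in range(3))
    return lo, hi


def loose_boxes(parent_box, axis, split_value):
    """Child boxes inferred by cutting the parent box with the median plane.

    Only the wall on the split axis moves; every other wall is inherited
    from the parent, even if no child point lies near it.
    """
    lo, hi = parent_box
    left_hi = tuple(split_value if a == axis else hi[a] for a in range(3))
    right_lo = tuple(split_value if a == axis else lo[a] for a in range(3))
    return (lo, left_hi), (right_lo, hi)
```

A loose box needs no pass over the child's points, whereas a tight box costs one scan per node but can never be larger than the loose box for the same node.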
  • the method 500 may compute and store centroids for each of the binary tree nodes for all layers of binary tree.
  • a center point of the node may be used instead of or in addition to the centroid.
  • a binary search may be employed. For example, a binary search may be employed to determine a query point for each node. For each query point, the binary search is employed to look for the nearest (e.g., closest) leaf node to the query point. The binary search may employ the binary tree generated as discussed above.
  • the query point denoted as qi(x,y,z) may be generated as an average of the coordinates in corresponding child node i by employing the following equation:
  • qi is a query point with coordinates (x,y,z) and Pi is the closest child/leaf node with average coordinates denoted as (<X>, <Y>, <Z>).
  • the query points may be generated as a centroid of loose or tight bounding box as described by the following equation: qi((xBB_min + xBB_max) / 2, (yBB_min + yBB_max) / 2, (zBB_min + zBB_max) / 2);
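Both ways of generating a query point can be sketched briefly. The first function follows the prose description (an average of the point coordinates in the node); the second follows the bounding box formula given above. The function names are hypothetical.

```python
def query_point_mean(points):
    """Query point as the average of the point coordinates in a node."""
    n = len(points)
    return tuple(sum(p[a] for p in points) / n for a in range(3))


def query_point_box_center(box_min, box_max):
    """Query point as the center of a loose or tight bounding box:
    ((xBB_min + xBB_max) / 2, (yBB_min + yBB_max) / 2, (zBB_min + zBB_max) / 2).
    """
    return tuple((lo + hi) / 2 for lo, hi in zip(box_min, box_max))
```

The box center needs only the stored minimum and maximum coordinates of the node, while the coordinate average requires visiting every point in the node.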
  • LOD points for the corresponding nodes may be stored as non-visited points of a leaf node with a minimum distance from a centroid to all the points in the nearest leaf node. Such points may be marked as visited and excluded from further search. This process may be repeated for all nodes in a corresponding LOD to generate a refinement layer. In some examples, this process may be repeated for alternate layers of the binary tree in such a way that the leaf node (e.g., bottom) layer is always retained. This may result in compressing a plurality of LODs into a refinement layer.
  • predictors may be determined for each LOD layer j as the N nearest points in the LOD(j-1) for each of the points in the refinement layer R(j).
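The visited-point bookkeeping can be sketched as below. This is a simplification under stated assumptions: each node is given directly as a list of point indices, so the binary search for the nearest leaf node is replaced by a linear scan over the node's own points, and the centroid is the mean of the node's points. The name `refinement_layer` is hypothetical.

```python
def refinement_layer(nodes, points, visited):
    """Pick, for each node of one LOD, the unvisited point nearest its centroid.

    nodes:   list of lists of point indices, one list per node in the LOD
    points:  list of (x, y, z) tuples for the whole cloud
    visited: set of indices already emitted; updated in place
    """
    layer = []
    for members in nodes:
        candidates = [i for i in members if i not in visited]
        if not candidates:
            continue
        n = len(members)
        centroid = [sum(points[i][a] for i in members) / n for a in range(3)]

        def dist2(i):
            return sum((points[i][a] - centroid[a]) ** 2 for a in range(3))

        best = min(candidates, key=dist2)
        visited.add(best)  # excluded from every later search
        layer.append(best)
    return layer
```

Calling the function once per tree layer, from the root downward with a shared `visited` set, yields one list of point indices per refinement layer.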
  • the predictors may be employed in the lifting algorithm to encode the attributes and store the residuals in an output bitstream using entropy coding.
  • the encoder may also encode a use_binary_tree_LOD_flag in syntax to indicate the method 500 has been employed to encode the attributes.
  • the same method 500 for LOD generation based on a binary tree can be used to reconstruct the point cloud based on the geometry.
  • the decoder can parse syntax elements to determine whether use_binary_tree_LOD_flag is set (e.g., equal to one).
  • the decoder can then extract all of the points from a point cloud data and include the points in a node of a binary tree.
  • the binary tree depth can be set and the tree built starting from the root as in the encoder.
  • Binary search can then be employed on the points based on the geometry data to generate the refinement layers from the LODs and determine predictors in a similar manner to the encoder.
  • the inverse lifting algorithm can then be employed to reconstruct the attributes of the points based on the predictors (e.g., including the stored attributes of the first refinement layer) and the coded residuals.
  • Attribute bitstream semantics are as follows.
  • the variable use_binary_tree_LOD_flag may be set equal to one to indicate that a binary tree shall be used to generate the level of detail structure for attribute coding in the lifting transform or predictive lifting transform. Otherwise, the distance based method shall be used to generate the level of detail structure.
  • FIG. 6 is a schematic diagram of an example mechanism 600 for splitting a node in a tight bounding box at a median point according to a hierarchical tree based method of coding point attributes, such as method 500. Accordingly, mechanism 600 can be employed in the generate LOD module 129 in the encoder 100 or the generate LOD module 229 in the decoder 200. As such, the mechanism 600 may be used to encode or decode attributes of a point cloud to/from an attribute bitstream.
  • Mechanism 600 shows a node 601.
  • the node 601 is an example of nodes 501, 503, and/or 505.
  • the node 601 can be split into two child nodes.
  • mechanism 600 can review points in the node 601 and select a median point 602.
  • the median point 602 is a point determined to be the closest to the centroid of a group of points 603 contained in the node 601.
  • the node 601 is bisected by a plane that traverses the median point 602.
  • the plane can be selected from an XY plane, a YZ plane, and an XZ plane.
  • the plane can be selected to maximize variance at the median point.
  • an XY plane includes X and Y components and no Z components.
  • An XY plane divides the node 601 into a top node above a bottom node.
  • a YZ plane includes Y and Z components and no X components.
  • a YZ plane divides the node 601 into a left node horizontally adjacent to a right node.
  • An XZ plane includes X and Z components and no Y components.
  • An XZ plane divides the node 601 into a front node in front of a back node.
  • the median point 602 may be removed from the resulting nodes when the node 601 is divided. For example, the median point 602 may be coded as part of a current LOD and may not remain when coding the next LOD.
  • the node 601 may include a plurality of points 603.
  • the node 601 also includes a tight bounding box 605.
  • a tight bounding box 605 is a bounding box of a minimum size that is capable of containing all of the points 603 of the node 601.
  • a tight bounding box 605 has minimum and maximum x, y, and z coordinates that are set based on the minimum and maximum x, y, and z coordinates used by the points 603.
  • a tight bounding box 605 may be unlike a loose bounding box, which inherits boundaries from a parent node.
  • the tight bounding box 605 is equal to or smaller in size than a loose bounding box. Accordingly, a search algorithm has less area to search when employing a tight bounding box 605. As such, use of a tight bounding box 605 may reduce processor resource usage at both an encoder and a decoder.
  • FIG. 7 is a schematic diagram of another example hierarchical tree based method 700 of coding point attributes in a sparse point cloud.
  • the hierarchical tree shown in FIG. 7 is an octree.
  • Method 700 may be employed to implement the generate LOD module 129 in the encoder 100 or the generate LOD module 229 in the decoder 200. Accordingly, the method 700 may be used to encode or decode attributes of a point cloud to/from an attribute bitstream.
  • Method 700 is substantially similar to method 500, but splits each current node into eight child nodes.
  • points are assigned to a first layer node 701 in a first LOD 711.
  • the first layer node 701 is split into second layer nodes 703 in a second LOD 712.
  • the second layer nodes 703 are split into third layer nodes 705 in a third LOD 713.
  • the third layer nodes 705 are split into fourth layer nodes 707 in a final LOD 714.
  • the nodes 701, 703, 705, and 707 of method 700 are split in a manner that is similar to the nodes of method 500. However, each node is split into eight child nodes instead of two child nodes.
  • the first layer node 701 can be split at a median point as in mechanism 600.
  • the first layer node 701 is split by an XY plane, a YZ plane, and an XZ plane (e.g., instead of by a single plane). This creates eight sub-nodes that become the second layer nodes 703.
  • the same process can be used to split the second layer nodes 703 into the third layer nodes 705 and the third layer nodes 705 into the fourth layer nodes 707.
  • the lifting algorithm can then be applied to method 700 in substantially the same manner as in method 500. For example, a predictor for each child node can be selected as a nearest point in a neighboring node, and hence can be refined according to a selected context of a neighboring node.
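The eight-way split can be sketched by testing each point against the three planes through the median point. The function name `octant_split` and the bit layout of the child index are assumptions made for illustration.

```python
def octant_split(points, median_index):
    """Split a node into eight children using the three axis-aligned planes
    that pass through the median point.

    Child index bits (assumed layout): bit 0 = x side, bit 1 = y side,
    bit 2 = z side, where a set bit means "coordinate >= median coordinate".
    """
    median = points[median_index]
    children = [[] for _ in range(8)]
    for i, p in enumerate(points):
        if i == median_index:
            continue  # the median point stays with the parent node
        idx = (
            int(p[0] >= median[0])
            | (int(p[1] >= median[1]) << 1)
            | (int(p[2] >= median[2]) << 2)
        )
        children[idx].append(p)
    return children
```

Sparse clouds will often leave several of the eight children empty, so a caller would typically skip empty lists before recursing.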
  • FIG. 8 is a schematic diagram of an example mechanism 800 of adaptively compressing LODs, such as LODs 511-514, into refinement layers.
  • Mechanism 800 may be employed to support methods 500 and/or 700 and/or employed in conjunction with mechanism 600.
  • Mechanism 800 may be employed in the generate LOD module 129 in the encoder 100 or the generate LOD module 229 in the decoder 200. Accordingly, the mechanism 800 may be used to encode or decode attributes of a point cloud to/from an attribute bitstream.
  • Mechanism 800 can be applied to compress LODs into refinement layers.
  • a binary tree may generate LODs based on a calculated depth of the binary tree. The depth may be determined based on the number of points. Hence, the number of LODs may vary based on the point cloud being coded.
  • a predefined, specified, and/or default number of refinement layers may be specified.
  • Mechanism 800 may be employed to compress the LODs into the predefined, specified, and/or default number of refinement layers.
  • application of a binary tree may result in LODs 810, 811, 812, 813, 814, and 815.
  • refinement layers 820, 821, 822, and 823 may be employed to display the point cloud.
  • An adaptive prediction threshold may be employed to perform the compression of mechanism 800.
  • the adaptive prediction threshold may include a dynamically determined value that indicates the compression to be employed to reduce the number of LODs 810, 811, 812, 813, 814, and 815 to align with the number of refinement layers 820, 821, 822, and 823.
  • the lowest LOD 810 is always retained in a separate refinement layer 820. This ensures that at least one refinement layer 820 contains the complete data set for the point cloud. Other LODs are compressed into corresponding refinement layers.
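One possible way to map a variable number of LODs onto a fixed number of refinement layers is sketched below. The description does not define how the adaptive prediction threshold is computed, so the fixed grouping ratio used here is purely an assumption, as is the function name `compress_lods`. The list is ordered by reference numeral, so the first entry plays the role of LOD 810.

```python
import math


def compress_lods(lods, num_layers):
    """Group LODs into at most `num_layers` refinement layers.

    The first LOD is kept in its own layer; the remaining LODs are merged
    in consecutive groups whose size is a ratio derived from the counts
    (an assumed stand-in for the adaptive prediction threshold).
    """
    if num_layers >= len(lods):
        return [list(lod) for lod in lods]  # nothing to compress
    if num_layers < 2:
        raise ValueError("need at least two refinement layers")
    rest = lods[1:]
    ratio = math.ceil(len(rest) / (num_layers - 1))
    layers = [list(lods[0])]  # retained as a separate refinement layer
    for start in range(0, len(rest), ratio):
        merged = []
        for lod in rest[start:start + ratio]:
            merged.extend(lod)
        layers.append(merged)
    return layers
```

With six LODs and four refinement layers, the ratio is two, so the five remaining LODs fall into groups of two, two, and one.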
  • FIG. 9 is a schematic diagram of an example volumetric media coding device 900.
  • the volumetric media coding device 900 is suitable for implementing the disclosed examples/embodiments as described herein.
  • the volumetric media coding device 900 comprises downstream ports 920, upstream ports 950, and/or transceiver units (Tx/Rx) 910, including transmitters and/or receivers for communicating data upstream and/or downstream over a network.
  • the volumetric media coding device 900 also includes a processor 930 including a logic unit and/or central processing unit (CPU) to process the data and a memory 932 for storing the data.
  • the volumetric media coding device 900 may also comprise electrical, optical-to-electrical (OE) components, electrical-to-optical (EO) components, and/or wireless communication components coupled to the upstream ports 950 and/or downstream ports 920 for communication of data via electrical, optical, or wireless communication networks.
  • the volumetric media coding device 900 may also include input and/or output (I/O) devices 960 for communicating data to and from a user.
  • the I/O devices 960 may include output devices such as a display for displaying media data, speakers for outputting audio data, etc.
  • the I/O devices 960 may also include input devices, such as a keyboard, mouse, trackball, etc., and/or corresponding interfaces for interacting with such output devices.
  • the processor 930 is implemented by hardware and software.
  • the processor 930 may be implemented as one or more CPU chips, cores (e.g., as a multi-core processor), field-programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), and digital signal processors (DSPs).
  • the processor 930 is in communication with the downstream ports 920, Tx/Rx 910, upstream ports 950, and memory 932.
  • the processor 930 comprises a coding module 914.
  • the coding module 914 implements the disclosed embodiments described above, such as methods 300, 400, 500, 700, 1000, 1100, 1200, and/or 1300 which may employ a mechanism 600 and/or 800.
  • the coding module 914 may also implement any other method/mechanism described herein. Further, the coding module 914 may implement an encoder 100, and/or a decoder 200. For example, the coding module 914 can employ a hierarchical tree to split points in a point cloud into nodes based on point position. The coding module 914 can then apply a lifting algorithm or inverse lifting algorithm to encode or decode, respectively, attribute values for the points. Hence, coding module 914 allows the volumetric media coding device 900 to code point cloud attribute data using significantly fewer processing resources while providing increased coding efficiency when compared to other systems. As such, the coding module 914 improves the functionality of the volumetric media coding device 900 as well as addresses problems that are specific to the media coding arts.
  • the coding module 914 effects a transformation of the volumetric media coding device 900 to a different state.
  • the coding module 914 can be implemented as instructions stored in the memory 932 and executed by the processor 930 (e.g., as a computer program product stored on a non-transitory medium).
  • the memory 932 comprises one or more memory types such as disks, tape drives, solid-state drives, read only memory (ROM), random access memory (RAM), flash memory, ternary content addressable memory (TCAM), static random-access memory (SRAM), etc.
  • the memory 932 may be used as an over-flow data storage device, to store programs when such programs are selected for execution, and to store instructions and data that are read during program execution.
  • FIG. 10 is a flowchart of an example method 1000 of encoding point attributes in a sparse point cloud.
  • Method 1000 can be employed to implement method 300, 500, and/or 700, for example by employing mechanism 600 and/or 800. Accordingly, method 1000 may be employed to implement the generate LOD module 129 and/or the lifting module 131 in the encoder 100. As such, the method 1000 may be used to encode positions and attributes of points in a point cloud to a geometry bitstream and an attribute bitstream to create a combined bitstream. Method 1000 may also be applied by a coding module 914 of a volumetric media coding device 900.
  • Method 1000 may be initiated when an encoder receives input indicating that a media sequence containing a sparse point cloud should be encoded.
  • one or more images of a media sequence can be stored in memory. At least one of the images comprises a sparse point cloud.
  • the sparse point cloud contains a plurality of points.
  • an octree is applied to the points to encode a geometry bitstream describing positions of the points.
  • applying the octree can be accomplished by applying method 300 to the points as described above.
  • a hierarchical tree is applied to sort the points into nodes based on the position of the points.
  • a lifting algorithm can also be employed to encode an attribute bitstream describing attribute values of the points.
  • the attributes of the points can be encoded as a series of predictors and residuals to compress the encoding of the attribute values.
  • the hierarchical tree may be a binary tree and/or an octree as discussed in methods 500 and 700, respectively. Other hierarchical tree structures may also be used as desired.
  • the hierarchical tree can be applied according to method 1100 as described below.
  • a variable, such as a flag, may be set in the bitstream to indicate that the sparse point cloud is coded by a hierarchical tree, such as a binary tree. This indicates the hierarchical tree is used to determine the LODs and attribute values.
  • a use_binary_tree_LOD_flag may be set in the bitstream to indicate the sparse point cloud is coded by the binary tree.
  • a PCC bitstream containing the geometry bitstream and the attribute bitstream is transmitted to support reconstructing the media sequence.
  • a plurality of point cloud states can be decoded into a plurality of pictures, which can be arranged to create a media sequence for display or other computational analysis.
  • FIG. 11 is a flowchart of an example method 1100 of applying a hierarchical tree to support encoding point attributes in a sparse point cloud.
  • Method 1100 can be employed to apply a hierarchical tree at step 1005 of method 1000. Accordingly, method 1100 can be employed when implementing a method 300, 500, and/or 700, for example by employing a mechanism 600 and/or 800. Further, method 1100 may be employed to implement the generate LOD module 129 and/or the lifting module 131 in the encoder 100. As such, the method 1100 may be used to encode positions and attributes of points in a point cloud to a geometry bitstream and an attribute bitstream to create a combined bitstream. Method 1100 may also be applied by a coding module 914 of a volumetric media coding device 900.
  • Method 1100 is initiated when encoding attributes for points in a point cloud.
  • a median point of a parent node is selected.
  • the median point may be selected as the closest point to a center coordinate of a tight bounding box around the parent node.
  • a split axis is also selected for splitting the parent node.
  • the split axis may be a selected axis that results in a maximum variance between the median point and other points of the parent node relative to the selected axis.
  • the parent node is split into a plurality of child nodes at the median point of the parent node. Points from the parent node can also be sorted into the child nodes based on a position of the median point and a direction of the split axis.
  • features can be determined for corresponding nodes resulting from applying the hierarchical tree.
  • the features for the corresponding nodes may include a median point, a split axis index, a start index of points, an end index of points, a minimum point coordinate value, a maximum point coordinate value, or combinations thereof.
  • a tight bounding box around the parent node may also be determined based on a minimum point coordinate value and a maximum point coordinate value associated with points in the parent node.
  • nodes resulting from the hierarchical tree are assigned into LODs.
  • refinement layers are generated for the LODs.
  • refinement layers may be generated for the LODs by adding points with minimum distances to node centroids into the refinement layers.
  • a number of LODs may be compressed into a number of refinement layers based on a value of an adaptation prediction threshold, which may be dynamically determined as a compression ratio based on the number of LODs and a predefined, specified, and/or default number of refinement layers.
  • predictors can be selected for LODs containing child nodes.
  • the predictors can be selected from previous LOD/refinement layers.
  • the predictors can be selected for a current point in a child node in a current LOD as a distance weighted average of N nearest points selected from a parent node in a previous LOD.
  • Point attribute values can then be encoded based on actual values and predictor values.
  • attribute values may be encoded as a difference between actual values and corresponding predictor values.
  • FIG. 12 is a flowchart of an example method 1200 of decoding point attributes in a sparse point cloud.
  • Method 1200 can be employed to implement method 300, 500, and/or 700, for example by employing mechanism 600 and/or 800. Accordingly, method 1200 may be employed to implement the generate LOD module 229 and/or the inverse lifting module 231 in the decoder 200. As such, the method 1200 may be used to decode positions and attributes of points in a point cloud from a geometry bitstream and an attribute bitstream to reconstruct a media sequence. For example, method 1200 may be used to decode an attribute bitstream created according to method 1000 and/or method 1100. Method 1200 may also be applied by a coding module 914 of a volumetric media coding device 900.
  • Method 1200 may be initiated when a decoder receives PCC data.
  • a PCC bitstream is received.
  • the PCC bitstream contains a geometry bitstream and an attribute bitstream.
  • the geometry bitstream and the attribute bitstream collectively include an encoding of an image of a media sequence.
  • the image includes a sparse point cloud containing a plurality of points.
  • an octree is applied to the geometry bitstream in order to decode positions of the points in the image.
  • applying the octree can be accomplished by applying method 300 to the points as described above.
  • a variable is obtained from the attribute bitstream, the geometry bitstream, and/or the PCC bitstream.
  • the variable can be used to determine that the sparse point cloud is coded by a hierarchical tree, such as a binary tree, and hence that the hierarchical tree is to be used to determine LODs and attribute values.
  • the variable may be a use_binary_tree_LOD_flag set to indicate that a binary tree should be used to determine the LODs and attribute values.
  • a hierarchical tree is applied based on the value of the variable determined at step 1205. The hierarchical tree is applied to sort the points in the point cloud into nodes based on the position of the points.
  • An inverse lifting algorithm can also be employed to decode the attribute values of the points based on data in the attribute bitstream and the occupancy of the nodes of the hierarchical tree. This may result in decoding the attribute values of the points.
  • the inverse lifting algorithm can decode attribute values of the points that are encoded as a series of LOD/refinement layer based predictors and residuals.
  • the hierarchical tree may be a binary tree and/or an octree as discussed in methods 500 and 700, respectively. Other hierarchical tree structures may also be used as desired.
  • the hierarchical tree can be applied according to method 1300 as described below.
  • the sparse point cloud is reconstructed as part of a reconstructed image.
  • One or more images may be reconstructed and placed in order as part of a reconstructed media sequence. Based on the foregoing, the sparse point cloud is reconstructed based on the positions of the points and the attribute values of the points.
  • FIG. 13 is a flowchart of an example method 1300 of applying a hierarchical tree to support decoding point attributes in a sparse point cloud.
  • Method 1300 can be employed to apply a hierarchical tree at step 1207 of method 1200. Accordingly, method 1300 can be employed when implementing method 300, 500, and/or 700, for example by employing mechanism 600 and/or 800. Further, method 1300 may be employed to implement the generate LOD module 229 and/or the inverse lifting module 231 in the decoder 200. As such, the method 1300 may be used to decode positions and attributes of points in a point cloud from a geometry bitstream and an attribute bitstream to reconstruct a media sequence. For example, method 1300 may be used to decode an attribute bitstream created according to method 1000 and/or method 1100. Method 1300 may also be applied by a coding module 914 of a volumetric media coding device 900.
  • Method 1300 is initiated when decoding attributes for points in a point cloud.
  • a median point of a parent node is selected.
  • the median point may be selected as the closest point to a center coordinate of a tight bounding box around the parent node.
  • a split axis is also selected for splitting the parent node.
  • the split axis may be a selected axis that results in a maximum variance between the median point and other points of the parent node relative to the selected axis.
  • the parent node is split into a plurality of child nodes at the median point of the parent node. Points from the parent node can also be sorted into the child nodes based on a position of the median point and a direction of the split axis.
  • Features can be determined for corresponding nodes resulting from applying the hierarchical tree.
  • The features for the corresponding nodes may include a median point, a split axis index, a start index of points, an end index of points, a minimum point coordinate value, a maximum point coordinate value, or combinations thereof.
  • A tight bounding box around the parent node may also be determined based on a minimum point coordinate value and a maximum point coordinate value associated with points in the parent node.
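  • The per-node features listed above could be held in a small record such as the one sketched here. The class name, field names, and the half-open index range are illustrative assumptions; only the list of features comes from the description.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float, float]


@dataclass
class NodeFeatures:
    median_point: Point  # Point chosen as the node's median
    split_axis: int      # 0 = x, 1 = y, 2 = z
    start: int           # Index of the node's first point in a shared point array
    end: int             # One past the node's last point (half-open range is an assumption)
    min_coord: Point     # Per-axis minimum over the node's points
    max_coord: Point     # Per-axis maximum over the node's points

    def box_center(self) -> Point:
        """Center of the tight bounding box spanned by min_coord and max_coord."""
        return tuple((lo + hi) / 2.0 for lo, hi in zip(self.min_coord, self.max_coord))

    def point_count(self) -> int:
        """Number of points covered by the node's index range."""
        return self.end - self.start
```

  • Storing the minimum and maximum coordinates with each node means the tight bounding box, and hence its center, can be recovered without revisiting the node's points.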
  • Nodes resulting from the hierarchical tree are assigned into LODs.
  • Refinement layers are generated for the LODs.
  • Refinement layers may be generated for the LODs by adding points with minimum distances to node centroids into the refinement layers.
  • A number of LODs may be compressed into a number of refinement layers based on a value of an adaptation prediction threshold, which may be dynamically determined as a compression ratio based on the number of LODs and a predefined, specified, and/or default number of refinement layers.
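  • The minimum-distance rule for filling a refinement layer can be illustrated as below. The sketch assumes the centroid is the arithmetic mean of a node's points and that one point is taken per node; the function names are hypothetical, and the compression of LODs into fewer refinement layers is not modeled.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Arithmetic mean of the points (assumed definition of the node centroid)."""
    n = len(points)
    return tuple(sum(p[a] for p in points) / n for a in range(3))


def closest_to_centroid(points: Sequence[Point]) -> int:
    """Index of the point with the minimum squared distance to the node centroid."""
    c = centroid(points)
    return min(
        range(len(points)),
        key=lambda i: sum((points[i][a] - c[a]) ** 2 for a in range(3)),
    )


def build_refinement_layer(nodes: Sequence[Sequence[Point]]) -> List[Point]:
    """Collect, for each non-empty node, the point nearest its centroid."""
    return [node[closest_to_centroid(node)] for node in nodes if node]
```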
  • Predictors can be selected for LODs containing child nodes.
  • The predictors can be selected from previous LOD/refinement layers.
  • The predictors can be selected for a current point in a child node in a current LOD as a distance weighted average of N nearest points selected from a parent node in a previous LOD.
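  • A distance weighted average over the N nearest points might be computed as in this sketch. Inverse-distance weighting and Euclidean distance are assumptions, since the description only says "distance weighted"; the function name, the scalar attribute, and the default N are also hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def predict_attribute(
    current: Point, candidates: Sequence[Tuple[Point, float]], n: int = 3
) -> float:
    """Predict an attribute from the n nearest (position, value) pairs of a previous LOD."""
    if not candidates:
        raise ValueError("at least one candidate point is required")
    # Keep the n candidates closest to the current point.
    nearest = sorted(candidates, key=lambda c: math.dist(current, c[0]))[:n]
    weights = []
    for pos, value in nearest:
        d = math.dist(current, pos)
        if d == 0.0:
            return float(value)  # A coincident point predicts the value exactly.
        weights.append(1.0 / d)  # Inverse-distance weight (assumed weighting).
    total = sum(weights)
    return sum(w * value for w, (_, value) in zip(weights, nearest)) / total
```

  • Because the prediction depends only on positions and already decoded values, a decoder can repeat the same computation once the geometry is available.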
  • Point attribute values can then be decoded/determined by comparing coded values and corresponding predictor values.
  • Attribute values may be determined by adding coded residual attribute values from the bitstream to corresponding predictor values.
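  • The residual step can be shown with a short round trip. This sketch assumes integer attribute values and omits quantization and entropy coding, which a real codec would apply between the two functions; both function names are hypothetical.

```python
from typing import List, Sequence


def encode_residuals(values: Sequence[int], predictors: Sequence[int]) -> List[int]:
    """Encoder side: residual = actual attribute value minus predictor value."""
    if len(values) != len(predictors):
        raise ValueError("values and predictors must have the same length")
    return [v - p for v, p in zip(values, predictors)]


def decode_attributes(residuals: Sequence[int], predictors: Sequence[int]) -> List[int]:
    """Decoder side: attribute value = predictor value plus coded residual."""
    if len(residuals) != len(predictors):
        raise ValueError("residuals and predictors must have the same length")
    return [p + r for p, r in zip(predictors, residuals)]
```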
  • FIG. 14 is a schematic diagram of an example system 1400 for coding point attributes in a sparse point cloud according to a hierarchical tree.
  • System 1400 may be implemented by an encoder and a decoder, such as encoder 100 and decoder 200.
  • System 1400 may also be implemented on a volumetric media coding device 900. Further, system 1400 may be employed when implementing a mechanism 600 and/or 800 and/or a method 300, 500, 700, 1000, 1100, 1200, and/or 1300.
  • The system 1400 includes a media encoder 1402.
  • The media encoder 1402 comprises an image storing module 1401 for storing an image of a media sequence, the image comprising a sparse point cloud containing a plurality of points.
  • The media encoder 1402 further comprises a geometry bitstream module 1403 for applying an octree to the points to encode a geometry bitstream describing positions of the points.
  • The media encoder 1402 further comprises an attribute bitstream module 1405 for applying a hierarchical tree to the points to encode an attribute bitstream describing attribute values of the points, wherein applying the hierarchical tree includes: selecting a median point of a parent node as a closest point to a center coordinate of a tight bounding box around the parent node, and splitting the parent node into a plurality of child nodes at the median point of the parent node.
  • The media encoder 1402 further comprises a bitstream storing module 1406 for storing a PCC bitstream containing the geometry bitstream and the attribute bitstream to support reconstruction of the media sequence.
  • The media encoder 1402 may also comprise a transmitting module 1407 for transmitting the PCC bitstream containing the geometry bitstream and the attribute bitstream to support reconstructing the media sequence.
  • The media encoder 1402 may be further configured to perform any of the steps of method 1000 and/or 1100.
  • The system 1400 also includes a media decoder 1410.
  • The media decoder 1410 comprises a receiving module 1411 for receiving a PCC bitstream containing a geometry bitstream and an attribute bitstream that include an encoding of an image of a media sequence, the image comprising a sparse point cloud containing a plurality of points.
  • The media decoder 1410 further comprises a geometry bitstream module 1413 for applying an octree to the geometry bitstream to decode positions of the points in the image.
  • The media decoder 1410 further comprises an attribute bitstream module 1415 for applying a hierarchical tree to the points based on the positions of the points to decode attribute values of the points, wherein applying the hierarchical tree includes: selecting a median point of a parent node as a closest point to a center coordinate of a tight bounding box around the parent node, and splitting the parent node into a plurality of child nodes at the median point of the parent node.
  • The media decoder 1410 further comprises a reconstruction module 1417 for reconstructing the sparse point cloud to reconstruct the image as part of the media sequence, wherein the sparse point cloud is reconstructed based on the positions of the points and the attribute values of the points.
  • The media decoder 1410 may be further configured to perform any of the steps of method 1200 and/or 1300.
  • A first component is directly coupled to a second component when there are no intervening components, except for a line, a trace, or another medium between the first component and the second component.
  • The first component is indirectly coupled to the second component when there are intervening components other than a line, a trace, or another medium between the first component and the second component.
  • The term “coupled” and its variants include both directly coupled and indirectly coupled.
  • The use of the term “about” means a range including ±10% of the subsequent number unless otherwise stated.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to a media coding mechanism. The mechanism includes storing an image of a media sequence, the image comprising a sparse point cloud containing a plurality of points. A hierarchical tree structure is applied to the points to encode a geometry bitstream describing positions of the points. A hierarchical tree is applied to the points to encode an attribute bitstream describing attribute values of the points. Applying the hierarchical tree may include selecting a median point of a parent node as a closest point to a center coordinate of a tight bounding box around the parent node. The parent node is split into a plurality of child nodes at the median point of the parent node. A point cloud coding (PCC) bitstream containing the geometry bitstream and the attribute bitstream is stored to support reconstruction of the media sequence.
PCT/US2019/065413 2018-12-11 2019-12-10 Hierarchical tree attribute coding by median points in point cloud coding WO2020123469A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862778161P 2018-12-11 2018-12-11
US62/778,161 2018-12-11

Publications (1)

Publication Number Publication Date
WO2020123469A1 true WO2020123469A1 (fr) 2020-06-18

Family

ID=71076628

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2019/065413 WO2020123469A1 (fr) 2018-12-11 2019-12-10 Hierarchical tree attribute coding by median points in point cloud coding

Country Status (1)

Country Link
WO (1) WO2020123469A1 (fr)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112699223A (zh) * 2021-01-13 2021-04-23 腾讯科技(深圳)有限公司 数据搜索方法、装置、电子设备及存储介质
WO2021258325A1 (fr) * 2020-06-24 2021-12-30 Zte Corporation Procédés et appareil de traitement de contenu tridimensionnel
WO2022120542A1 (fr) * 2020-12-07 2022-06-16 浙江大学 Procédé et appareil de codage de nuage de points, procédé et appareil de décodage de nuage de points, et support de stockage lisible par ordinateur
WO2022141461A1 (fr) * 2020-12-31 2022-07-07 Oppo广东移动通信有限公司 Procédé de codage et de décodage de nuage de points, codeur, décodeur et support de stockage informatique
CN115086716A (zh) * 2021-03-12 2022-09-20 腾讯科技(深圳)有限公司 点云中邻居点的选择方法、装置及编解码器
US20220327742A1 (en) * 2019-06-26 2022-10-13 Lg Electronics Inc. Point cloud data transmission device, point cloud data transmission method, point cloud data reception device and point cloud data reception method
WO2022225333A1 (fr) * 2021-04-21 2022-10-27 엘지전자 주식회사 Dispositif de transmission de données de nuage de points, procédé de transmission de données de nuage de points, dispositif de réception de données de nuage de points et procédé de réception de données de nuage de points
WO2022257970A1 (fr) * 2021-06-11 2022-12-15 维沃移动通信有限公司 Procédé de traitement de codage d'informations géométriques de nuage de points, procédé de traitement de décodage et dispositif associé
CN116458158A (zh) * 2020-12-03 2023-07-18 Oppo广东移动通信有限公司 帧内预测方法及装置、编解码器、设备、存储介质
WO2023191605A1 (fr) * 2022-04-01 2023-10-05 엘지전자 주식회사 Dispositif de transmission de données de nuage de points, procédé de transmission de données de nuage de points, dispositif de réception de données de nuage de points et procédé de réception de données de nuage de points
US11956470B2 (en) 2020-04-07 2024-04-09 Qualcomm Incorporated Predictor index signaling for predicting transform in geometry-based point cloud compression

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1267309A2 (fr) * 2001-06-11 2002-12-18 Canon Kabushiki Kaisha Appareil de modellisation tridimensionnelle par ordinateur
US20040217956A1 (en) * 2002-02-28 2004-11-04 Paul Besl Method and system for processing, compressing, streaming, and interactive rendering of 3D color image data
US20170347122A1 (en) * 2016-05-28 2017-11-30 Microsoft Technology Licensing, Llc Scalable point cloud compression with transform, and corresponding decompression

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220327742A1 (en) * 2019-06-26 2022-10-13 Lg Electronics Inc. Point cloud data transmission device, point cloud data transmission method, point cloud data reception device and point cloud data reception method
US11956470B2 (en) 2020-04-07 2024-04-09 Qualcomm Incorporated Predictor index signaling for predicting transform in geometry-based point cloud compression
WO2021258325A1 (fr) * 2020-06-24 2021-12-30 Zte Corporation Procédés et appareil de traitement de contenu tridimensionnel
CN116458158A (zh) * 2020-12-03 2023-07-18 Oppo广东移动通信有限公司 帧内预测方法及装置、编解码器、设备、存储介质
WO2022120542A1 (fr) * 2020-12-07 2022-06-16 浙江大学 Procédé et appareil de codage de nuage de points, procédé et appareil de décodage de nuage de points, et support de stockage lisible par ordinateur
WO2022141461A1 (fr) * 2020-12-31 2022-07-07 Oppo广东移动通信有限公司 Procédé de codage et de décodage de nuage de points, codeur, décodeur et support de stockage informatique
CN112699223B (zh) * 2021-01-13 2023-09-01 腾讯科技(深圳)有限公司 数据搜索方法、装置、电子设备及存储介质
CN112699223A (zh) * 2021-01-13 2021-04-23 腾讯科技(深圳)有限公司 数据搜索方法、装置、电子设备及存储介质
CN115086716B (zh) * 2021-03-12 2023-09-08 腾讯科技(深圳)有限公司 点云中邻居点的选择方法、装置及编解码器
CN115086716A (zh) * 2021-03-12 2022-09-20 腾讯科技(深圳)有限公司 点云中邻居点的选择方法、装置及编解码器
WO2022225333A1 (fr) * 2021-04-21 2022-10-27 엘지전자 주식회사 Dispositif de transmission de données de nuage de points, procédé de transmission de données de nuage de points, dispositif de réception de données de nuage de points et procédé de réception de données de nuage de points
WO2022257970A1 (fr) * 2021-06-11 2022-12-15 维沃移动通信有限公司 Procédé de traitement de codage d'informations géométriques de nuage de points, procédé de traitement de décodage et dispositif associé
WO2023191605A1 (fr) * 2022-04-01 2023-10-05 엘지전자 주식회사 Dispositif de transmission de données de nuage de points, procédé de transmission de données de nuage de points, dispositif de réception de données de nuage de points et procédé de réception de données de nuage de points

Similar Documents

Publication Publication Date Title
WO2020072665A1 (fr) Hierarchical tree attribute coding in point cloud coding
WO2020123469A1 (fr) Hierarchical tree attribute coding by median points in point cloud coding
US11711545B2 (en) Arithmetic coding information for parallel octree coding
US11743498B2 (en) Implicit quadtree or binary-tree geometry partition for point cloud coding
US10904564B2 (en) Method and apparatus for video coding
US11010931B2 (en) Method and apparatus for video coding
US11166048B2 (en) Method and apparatus for video coding
WO2020070191A1 (fr) Methods and devices for binary entropy coding of point clouds
JP6178798B2 (ja) Terminable spatial tree-based position coding and decoding
AU2021206683B2 (en) Techniques and apparatus for alphabet-partition coding of transform coefficients for point cloud compression
WO2022131948A1 (fr) Devices and methods for sequential coding for point cloud compression
WO2024082152A1 (fr) Encoding and decoding methods and apparatuses, encoder and decoder, code stream, device, and storage medium
WO2024074121A1 (fr) Method, apparatus, and medium for point cloud coding
WO2023173238A1 (fr) Encoding method, decoding method, code stream, encoder, decoder, and storage medium
WO2024074122A1 (fr) Method, apparatus, and medium for point cloud coding
WO2023179706A1 (fr) Encoding method, decoding method, and terminal
WO2023197122A1 (fr) Encoding and decoding method for trisoup vertex positions
WO2023240662A1 (fr) Encoding method, decoding method, encoder, decoder, and storage medium
WO2023249999A1 (fr) System and method for geometry point cloud coding
EP4244813A1 (fr) Devices and methods for scalable coding for point cloud compression
WO2024085936A1 (fr) System and method for geometry point cloud coding
WO2022131946A2 (fr) Devices and methods for spatial quantization for point cloud compression
WO2023096973A1 (fr) Geometry point cloud coding

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19896176

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19896176

Country of ref document: EP

Kind code of ref document: A1