WO2019068259A1

WO2019068259A1 - Point cloud coding

Info

Publication number: WO2019068259A1
Application number: PCT/CN2018/109296
Authority: WO
Inventors: Zhu Li; Shan Liu; Jose Alvarez
Original assignee: Huawei Technologies Co., Ltd.
Priority date: 2017-10-02
Filing date: 2018-10-08
Publication date: 2019-04-11

Abstract

A method comprises obtaining a point cloud of an object, wherein the point cloud comprises points; generating an initial k-d tree of the point cloud, wherein the initial k-d tree comprises an initial depth and is balanced; generating a final k-d tree of the point cloud, wherein the final k-d tree comprises a final depth and is unbalanced, and wherein the final depth is greater than the initial depth; and encoding an encoded point cloud, wherein the encoded point cloud comprises a depth difference between the final depth and the initial depth.

Description

Point Cloud Coding

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to United States provisional patent application number 62/734,831, filed September 21, 2018 by Futurewei Technologies, Inc. and titled “Point Cloud Coding” and United States provisional patent application number 62/566,761 filed October 2, 2017 by Futurewei Technologies, Inc. and titled “Method and Apparatus for Lossless Point Cloud Geometry Compression, ” which are incorporated by reference.

TECHNICAL FIELD

The disclosed embodiments relate to video coding in general and point cloud coding in particular.

BACKGROUND

Videos use a relatively large amount of data, so communication of videos uses a relatively large amount of bandwidth. However, many networks operate at or near their bandwidth capacities. In addition, customers demand high video quality, which requires using even more data. There is therefore a desire to both reduce the amount of data videos use and improve video quality. One solution is to compress videos during an encoding process and decompress the videos during a decoding process.

SUMMARY

In one embodiment, a method comprises: obtaining a point cloud of an object, wherein the point cloud comprises points; generating an initial k-d tree of the point cloud, wherein the initial k-d tree comprises an initial depth and is balanced; generating a final k-d tree of the point cloud, wherein the final k-d tree comprises a final depth and is unbalanced, and wherein the final depth is greater than the initial depth; and encoding an encoded point cloud, wherein the encoded point cloud comprises a depth difference between the final depth and the initial depth.

In any of the preceding embodiments, the method further comprises further encoding a bounding box describing diagonal corner points of the object.

In any of the preceding embodiments, the method further comprises further encoding the initial depth.

In any of the preceding embodiments, the method further comprises further encoding the final k-d tree.

In any of the preceding embodiments, the method further comprises transmitting the encoded point cloud.

In any of the preceding embodiments, the method further comprises determining the initial depth based on a total number of the points.

In any of the preceding embodiments, the method further comprises further determining the initial depth based on a predetermined maximum number of levels that can be efficiently encoded given a constraint.

In any of the preceding embodiments, the constraint is an amount of memory or a processing power.

In any of the preceding embodiments, the method further comprises generating additional nodes beyond the initial depth.

In any of the preceding embodiments, the method further comprises further generating the additional nodes based on an MST operation.

In any of the preceding embodiments, the method further comprises further generating the additional nodes based on an average residual from the MST operation.

In any of the preceding embodiments, the method further comprises further generating the final k-d tree using the initial k-d tree and the additional nodes.

In any of the preceding embodiments, the point cloud, the initial k-d tree, and the final k-d tree are 3D.

In another embodiment, an apparatus comprises a memory; and a processor coupled to the memory and configured to: obtain a point cloud of an object, wherein the point cloud comprises points, generate an initial k-d tree of the point cloud, wherein the initial k-d tree comprises an initial depth and is balanced, generate a final k-d tree of the point cloud, wherein the final k-d tree comprises a final depth and is unbalanced, and wherein the final depth is greater than the initial depth, and encode an encoded point cloud, wherein the encoded point cloud comprises a depth difference between the final depth and the initial depth.

In any of the preceding embodiments, the processor is further configured to further encode a bounding box describing diagonal corner points of the object.

In any of the preceding embodiments, the processor is further configured to further encode the initial depth.

In any of the preceding embodiments, the processor is further configured to further encode the final k-d tree.

In any of the preceding embodiments, the apparatus further comprises a transmitter coupled to the processor and configured to transmit the encoded point cloud.

In any of the preceding embodiments, the processor is further configured to determine the initial depth based on a total number of the points.

In any of the preceding embodiments, the processor is further configured to further determine the initial depth based on a predetermined maximum number of levels that can be efficiently encoded given a constraint.

In any of the preceding embodiments, the processor is further configured to generate additional nodes beyond the initial depth.

In any of the preceding embodiments, the processor is further configured to further generate the additional nodes based on an MST operation.

In any of the preceding embodiments, the processor is further configured to further generate the additional nodes based on an average residual from the MST operation.

In any of the preceding embodiments, the processor is further configured to further generate the final k-d tree using the initial k-d tree and the additional nodes.

In yet another embodiment, a computer program product comprises computer executable instructions stored on a non-transitory medium that when executed by a processor cause an apparatus to obtain a point cloud of an object, wherein the point cloud comprises points; generate an initial k-d tree of the point cloud, wherein the initial k-d tree comprises an initial depth and is balanced; generate a final k-d tree of the point cloud, wherein the final k-d tree comprises a final depth and is unbalanced, and wherein the final depth is greater than the initial depth; and encode an encoded point cloud, wherein the encoded point cloud comprises a depth difference between the final depth and the initial depth.

In any of the preceding embodiments, the instructions further cause the apparatus to further encode a bounding box describing diagonal corner points of the object.

In any of the preceding embodiments, the instructions further cause the apparatus to further encode the initial depth.

In any of the preceding embodiments, the instructions further cause the apparatus to further encode the final k-d tree.

In any of the preceding embodiments, the instructions further cause the apparatus to transmit the encoded point cloud.

In any of the preceding embodiments, the instructions further cause the apparatus to determine the initial depth based on a total number of the points.

In any of the preceding embodiments, the instructions further cause the apparatus to further determine the initial depth based on a predetermined maximum number of levels that can be efficiently encoded given a constraint.

In any of the preceding embodiments, the instructions further cause the apparatus to generate additional nodes beyond the initial depth.

In any of the preceding embodiments, the instructions further cause the apparatus to further generate the additional nodes based on an MST operation.

In any of the preceding embodiments, the instructions further cause the apparatus to further generate the additional nodes based on an average residual from the MST operation.

In any of the preceding embodiments, the instructions further cause the apparatus to further generate the final k-d tree using the initial k-d tree and the additional nodes.

In yet another embodiment, a method comprises receiving an encoded point cloud comprising characteristics, wherein the characteristics comprise a bounding box describing diagonal corner points of an object, an initial depth of an initial k-d tree, a final k-d tree associated with a final depth, and a depth difference between the final depth; extracting the characteristics from the encoded point cloud; and generating a point cloud based on the characteristics.

In yet another embodiment, an apparatus comprises a memory; and a processor coupled to the memory and configured to: receive an encoded point cloud comprising characteristics, wherein the characteristics comprise a bounding box describing diagonal corner points of an object, an initial depth of an initial k-d tree, a final k-d tree associated with a final depth, and a depth difference between the final depth, extract the characteristics from the encoded point cloud, and generate a point cloud based on the characteristics.

In yet another embodiment, a computer program product comprises computer executable instructions stored on a non-transitory medium that when executed by a processor cause an apparatus to: receive an encoded point cloud comprising characteristics, wherein the characteristics comprise a bounding box describing diagonal corner points of an object, an initial depth of an initial k-d tree, a final k-d tree associated with a final depth, and a depth difference between the final depth; extract the characteristics from the encoded point cloud; and generate a point cloud based on the characteristics.

The preceding embodiments provide for generating an initial k-d tree that is balanced but lossy, then generating nodes beyond the initial k-d tree to create a final k-d tree that may be unbalanced but lossless. The embodiments provide geometric and scalable coding. By increasing some processing at the encoding stage, the embodiments provide for more efficient coding and thus more efficient communication.

Any of the above embodiments may be combined with any of the other above embodiments to create a new embodiment. These and other features will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.

FIG. 1 is a schematic diagram of a coding system.

FIG. 2 is a flowchart illustrating a method of point cloud encoding and communication according to an embodiment of the disclosure.

FIGS. 3A-3D are diagrams demonstrating building of a k-d tree of dimension two.

FIG. 4 is a flowchart illustrating a method of point cloud communication and decoding according to an embodiment of the disclosure.

FIG. 5 is a schematic diagram of an apparatus according to an embodiment of the disclosure.

DETAILED DESCRIPTION

It should be understood at the outset that, although an illustrative implementation of one or more embodiments are provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or in existence. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the exemplary designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.

The following abbreviations apply:

ASIC: application-specific integrated circuit

CPU: central processing unit

DSP: digital signal processor

EO: electrical-to-optical

FPGA: field-programmable gate array

LCD: liquid crystal display

MST: minimum spanning tree

OE: optical-to-electrical

RAM: random-access memory

RF: radio frequency

ROM: read-only memory

RX: receiver unit

SRAM: static RAM

TCAM: ternary content-addressable memory

TSP: traveling salesman problem

TX: transmitter unit

2D: two-dimensional

3D: three-dimensional.

FIG. 1 is a schematic diagram of a coding system 100. The coding system 100 comprises a source device 110, a medium 150, and a destination device 160. The source device 110 and the destination device 160 are mobile phones, tablet computers, desktop computers, notebook computers, or other suitable devices. The medium 150 is a local network, a radio network, the Internet, or another suitable medium.

The source device 110 comprises a video generator 120, an encoder 130, and an output interface 140. The video generator 120 is a camera or another device suitable for generating video. Videos include any visual representations of volumetric spaces or other multidimensional data. The encoder 130 may be referred to as a codec. The encoder 130 performs encoding according to a set of rules, for instance as described in “High Efficiency Video Coding, ” ITU-T H. 265, December 2016 ( “H. 265” ) . The output interface 140 is an antenna or another component suitable for transmitting data to the destination device 160. Alternatively, the video generator 120, the encoder 130, and the output interface 140 are in any suitable combination of devices.

The destination device 160 comprises an input interface 170, a decoder 180, and a display 190. The input interface 170 is an antenna or another component suitable for receiving data from the source device 110. The decoder 180 may also be referred to as a codec. The decoder 180 performs decoding according to a set of rules, for instance as described in H. 265. The display 190 is an LCD screen or another component suitable for displaying videos. Alternatively, the input interface 170, the decoder 180, and the display 190 are in any suitable combination of devices.

In operation, in the source device 110, the video generator 120 captures a video, the encoder 130 encodes the video to create an encoded video, and the output interface 140 transmits the encoded video over the medium 150 and towards the destination device 160. The source device 110 may locally store the video or the encoded video, or the source device 110 may instruct storage of the video or the encoded video on another device. The encoded video comprises data defined at various levels, including slices and blocks. A slice is a spatially distinct region of a video frame that the encoder 130 encodes separately from any other region in the video frame. A block is a group of pixels arranged in a rectangle. Blocks may also be referred to as units or coding units. In the destination device 160, the input interface 170 receives the encoded video from the source device 110, the decoder 180 decodes the encoded video to obtain a decoded video, and the display 190 displays the decoded video. The decoder 180 may decode the encoded video in a reverse manner compared to how the encoder 130 encodes the video. The destination device 160 locally stores the encoded video or the decoded video, or the destination device 160 instructs storage of the encoded video or the decoded video on another device. Though the coding system 100 is described as coding and communicating videos, which are simply series of images, the same concepts apply to single images.

The video generator 120 may be a traditional camera, an infrared camera, a time-of-flight camera, a laser system, a scanner, or another device that scans objects and generates point clouds representing the objects. The objects and the point clouds may be 3D. The point clouds comprise points, which may be more abundant in regions of objects that are more complex and may be less abundant in regions of objects that are less complex. For instance, a point cloud representing a human comprises more points in a facial region and fewer points in a torso region covered by a uniformly-colored shirt. The point clouds comprise hundreds of thousands or millions of points, so the point clouds require significant data to encode and significant bandwidth to communicate. There is therefore a desire to efficiently code, and thus communicate, the point clouds.

Disclosed herein are embodiments for point cloud coding. The embodiments provide for generating an initial k-d tree that is balanced but lossy, then generating nodes beyond the initial k-d tree to create a final k-d tree that may be unbalanced but lossless. The embodiments provide geometric and scalable coding. Geometric coding refers to coding of spatial positions of points, as opposed to attribute coding, which refers to coding values of points. Scalable coding refers to coding that works for all levels of partitioning a point cloud. The embodiments are discussed in the context of 3D point clouds and k-d trees of dimension three, but apply to point clouds and k-d trees of any dimension. In addition, the embodiments may apply in similar manners to data structures other than point clouds. By increasing some processing at the encoding stage, the embodiments provide for more efficient coding and thus more efficient communication.

FIG. 2 is a flowchart illustrating a method 200 of point cloud encoding and communication according to an embodiment of the disclosure. The source device 110 performs the method 200. At step 210, the encoder 130 obtains a point cloud of an object. For instance, the video generator 120 generates the point cloud and the encoder 130 receives the point cloud from the video generator 120, or the encoder 130 receives the point cloud from another device. The point cloud comprises N points, where N is a positive integer. The object is a 3D object such as a human, so the point cloud is also 3D.

At step 220, the encoder 130 determines an initial depth of an initial k-d tree. For instance, the encoder 130 determines the initial depth based on the following inequality:

N/2 ^D < M. (1)

N is the total number of points in the point cloud, D is the initial depth of the initial k-d tree, and M is a predetermined maximum number of levels the encoder 130 can efficiently encode given a constraint such as an amount of memory or processing power. Depth and levels are described below. A manufacturer of the source device 110 predetermines M and stores M in a memory of the source device 110. M may be about 1,000. The manufacturer, a user of the source device 110, or another entity may adjust M.

At step 230, the encoder 130 generates an initial k-d tree of the point cloud. The initial point cloud comprises the initial depth. Because the point cloud is 3D, the k-d tree is a k-d tree of dimension three. Though the initial k-d tree is of dimension three, for ease of understanding generation of a k-d tree, FIGS. 3A-3D demonstrate building of a k-d tree of dimension two.

FIGS. 3A-3D are diagrams 300 demonstrating building of a k-d tree of dimension two. The k-d tree is a binary tree in which every node is a k-dimensional point. A binary tree is a data structure in which each node has at most two children. The k-d tree describes N points in a point cloud. For FIGS. 3A-3D, the k-d tree is 2D because its nodes cut in both the x direction and the y direction.

FIG. 3A is a diagram showing a root node 310 at a zeroth level. The root node 310 is called a root node because it is the only node in the k-d tree at a zeroth level. A level refers to a number of cuts in the k-d tree. The root node 310 comprises all N points.

FIG. 3B is a diagram showing the root node 310 cut into a node 320 and a node 330 at a first level. The root node 310 is a parent of the

nodes

320, 330, and the

nodes

320, 330 are children of the root node 310. The

nodes

320, 330 are cut in the y direction. Each of the

nodes

320, 330 comprises N/2 points.

FIG. 3C is a diagram showing the node 320 cut into a node 340 and a node 350 at a second level. The node 320 is a parent of the

nodes

340, 350, and the

nodes

340, 350 are children of the node 320. The

nodes

340, 350 are cut in the x direction. Each of the

nodes

340, 350 comprises N/4 points.

FIG. 3D is a diagram showing the node 350 cut into a leaf node 360 and a leaf node 370 at a third level. The

leaf nodes

360, 370 are called leaf nodes because they are the lowest-level nodes in the k-d tree. The node 350 is a parent of the

leaf nodes

360, 370, and the

leaf nodes

360, 370 are children of the node 350. The

leaf nodes

360, 370 are cut in the y direction. Each of the

leaf nodes

360, 370 comprise N/8 points.

In FIGS. 3A-3D, the k-d tree has a depth of three because its nodes split to a third level. Depth may also be referred to as height. The k-d tree is an unbalanced k-d tree because the node 320 splits to a second level, but the node 330 does not, and because the node 350 splits to a third level, but the node 340 does not. The k-d tree would be balanced if each branch of the k-d tree comprised the same number of levels or within one of the same number of levels. A branch is a progression of parent and child relationships. For instance, one branch comprises the root node 310; the

nodes

320, 350; and the

leaf nodes

360, 370.

The nodes are not cut based on their size. For instance, the node 340 is larger than the node 350. Rather, the nodes are cut based on the locations of the pixels so that each child node associated with the same parent node comprises the same or substantially the same number of nodes. Thus, the equal sizes of the

nodes

320, 330 in FIG. 3B indicate the points are evenly distributed or substantially evenly distributed between the left side and the right side of the root node 310 in FIG. 3A. In contrast, the unequal sizes of the

nodes

340, 350 in FIG. 3C indicate the points are unevenly distributed between the top side and the bottom side of the node 320 in FIG. 3B. Specifically, within the node 320, the points are less heavily distributed in the node 340 and more heavily distributed in the node 350. Similarly, the unequal sizes of the

leaf nodes

360, 370 in FIG. 3D indicate the points are unevenly distributed between the left side and the right side of the node 350 in FIG. 3C. Specifically, within the node 350, the points are more heavily distributed in the leaf node 360 and less heavily distributed in the leaf node 370.

Returning to step 230 in FIG. 2, the initial k-d tree is similar to the k-d tree shown in FIGS. 3A-3D. For instance, the initial k-d tree is balanced. However, the initial k-d tree is of dimension three instead of dimension two, and the initial k-d tree is of the initial depth determined at step 220 instead of four. Thus, the nodes in the initial k-d tree are rectangular prisms instead of squares like in FIGS. 3A-3D.

At step 240, the encoder 130 generates additional nodes beyond the initial depth. To do so, the encoder 130 samples m sampling nodes, which are a percentage of the leaf nodes from the initial k-d tree, meaning the nodes that are the lowest-level nodes in the initial k-d tree at level D. m is a positive integer. A manufacturer of the source device 110 determines the percentage and stores the percentage in a memory of the source device 110. The manufacturer, a user of the source device 110, or another entity may adjust the percentage. The number of nodes in the initial k-d tree is 2 ^D, and the percentage may be about 5%–10%of 2 ^D. The encoder 130 performs an MST operation on all of the sampling nodes to connect points in the sampling nodes. The encoder 130 calculates a residual R for each point in a sampling node, where the residual is a number of bits needed to describe a distance of a currently-examined point from a previously-examined point in the MST operation. The encoder 130 calculates a sum of all of the residuals and averages the sum to obtain an average residual R _avg. The encoder 130 then determines whether to split each sampling node based on the following inequality for each of the m sampling nodes:

R _leaf, i < α R _avg. (2)

R _leaf is a residual of a sampling node i; i = 1, 2, ... m; α is a performance factor comparing MST to TSP; and R _avg is an average residual as described above. A manufacturer of the source device 110 determines α and stores α in a memory of the source device 110. The manufacturer, a user of the source device 110, or another entity may adjust α. α may be about 1.2 –2.0. If the inequality is true, then the encoder 130 splits the sampling node into two additional nodes, specifically two child nodes. If the inequality is false, then the encoder 130 does not split the sampling node into two child nodes. The encoder 130 continues splitting the sampling nodes until the inequality is false for every sampling node. Once the encoder 130 has done so for each sampling node, the encoder 130 calculates the following:

Δ = D’–D. (3) .

Δ is a depth difference, D’is a final depth of the sampling node with the highest level, and D is an initial depth of the initial k-d tree.

At step 250, the encoder 130 generates a final k-d tree of the point cloud. The final k-d tree comprises the initial k-d tree and the additional nodes. The final k-d tree comprises the final depth and is unbalanced.

At step 260, the encoder 130 encodes the point cloud as an encoded point cloud. The encoded point cloud may be referred to as a bitstream. The encoded point cloud comprises a bounding box, D, Δ, and the final k-d tree. The bounding box describes diagonal corner points of the object, for instance a top-right point of the object and a bottom-left point of the object. D is the initial depth of the initial k-d tree determined at step 220. Δ is the depth difference calculated at step 240. For levels 0 –D, the final k-d tree comprises, for each node, a dimension, a cut value, and a description of the points using the MST operation. The dimension is a direction by which the node being encoded is cut. For instance, in FIG. 3B, the root node 310 is cut into the

nodes

320, 330 in the y direction. The dimension may be encoded with the following bit map:

00: no split

01: x direction

10: y direction

11: z direction.

For levels 0 –D, each node has a dimension value indicating a split along the x direction, the y direction, or the z direction. Thus, each node will not have a dimension value indicating no split. The value is K bits, where K is a resolution of the point cloud. For instance, K is 10 bits. For levels (D+1) –D’, the final k-d tree comprises, for each node, a dimension, a cut value, an indication of whether the node is a parent, an indication of whether the node’s child is to the left or the right, and a description of the points using a TSP operation. The TSP operation produces an optimal traversal of all the points in the node, creating the smallest differential residual for the node. The dimension and the cut value are described above. However, for levels (D+1) , some nodes may not have children and therefore may have a dimension value indicating no split. A 0 or 1 bit may provide both the indication of whether the node is a parent and the indication of whether the node’s child is to the left or the right.

The encoder 130 encodes the point cloud using entropy encoding such as arithmetic entropy coding or machine-learning-based entropy compression. One type of machine-learning-based entropy compression is PAQ. PAQ uses lossless data compression archivers, which use a context mixing algorithm trained on representative point-cloud data streams. One type of PAQ is version 8, which combines predictions of various models by a weighted summation via a shallow neural network. An adaptive probability map may reduce a prediction error before PAQ. After encoding every bit, neural network weights are adjusted along a cost gradient.

Finally, at step 270, the output interface 140 transmits the encoded point cloud. Specifically, the output interface 140 transmits the encoded point cloud to the input interface 170 of the destination device 160 over the medium 150. The source device 110 may perform the method 200 at time intervals such as 60 times per second or 120 times per second. In that case, the collection of point clouds, which are 3D, include a fourth dimension of time.

FIG. 4 is a flowchart illustrating a method 400 of point cloud communication and decoding according to an embodiment of the disclosure. The destination device 160 performs the method 400. At step 410, the input interface 170 receives an encoded point cloud comprising characteristics. The input interface 170 may do so in response to the output interface 140 transmitting the encoded point cloud from step 270 of FIG. 2. The characteristics may comprise the bounding box, D, Δ, and the final k-d tree described at step 270 in FIG. 2. At step 420, the decoder 180 extracts the characteristics from the encoded point cloud. Finally, at step 430, the decoder 180 generates a point cloud based on the characteristics. The point cloud may be the point cloud described at step 210 in FIG. 2.

FIG. 5 is a schematic diagram of an apparatus 500 according to an embodiment of the disclosure. The apparatus 500 may implement the disclosed embodiments. The apparatus 500 comprises ingress ports 510 and an RX 520 for receiving data; a processor, logic unit, baseband unit, or CPU 530 to process the data; a TX 540 and egress ports 550 for transmitting the data; and a memory 560 for storing the data. The apparatus 500 may also comprise OE components, EO components, or RF components coupled to the ingress ports 510, the RX 520, the TX 540, and the egress ports 550 for ingress or egress of optical, electrical signals, or RF signals.

The processor 530 is any combination of hardware, middleware, firmware, or software. The processor 530 comprises any combination of one or more CPU chips, cores, FPGAs, ASICs, or DSPs. The processor 530 communicates with the ingress ports 510, the RX 520, the TX 540, the egress ports 550, and the memory 560. The processor 530 comprises a point cloud coding component 570, which implements the disclosed embodiments. The inclusion of the point cloud coding component 570 therefore provides a substantial improvement to the functionality of the apparatus 500 and effects a transformation of the apparatus 500 to a different state. Alternatively, the memory 560 stores the point cloud coding component 570 as instructions, and the processor 530 executes those instructions.

The memory 560 comprises any combination of disks, tape drives, or solid-state drives. The apparatus 500 may use the memory 560 as an over-flow data storage device to store programs when the apparatus 500 selects those programs for execution and to store instructions and data that the apparatus 500 reads during execution of those programs. The memory 560 may be volatile or non-volatile and may be any combination of ROM, RAM, TCAM, or SRAM.

An apparatus comprises a memory means and a processor means coupled to the memory means and configured to obtain a point cloud of an object, wherein the point cloud comprises points, generate an initial k-d tree of the point cloud, wherein the initial k-d tree comprises an initial depth and is balanced, generate a final k-d tree of the point cloud, wherein the final k-d tree comprises a final depth and is unbalanced, and wherein the final depth is greater than the initial depth, and encode an encoded point cloud, wherein the encoded point cloud comprises a depth difference between the final depth and the initial depth.

The term “about” means a range including ±10%of the subsequent number unless otherwise stated. The term “substantially” means within ±10%. While several embodiments have been provided in the present disclosure, it may be understood that the disclosed systems and methods might be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system or certain features may be omitted, or not implemented.

In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, components, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled or directly coupled or communicating with each other may be indirectly coupled or communicating through some interface, device, or intermediate component whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and may be made without departing from the spirit and scope disclosed herein.

Claims

A method comprising:

obtaining a point cloud of an object, wherein the point cloud comprises points;

generating an initial k-d tree of the point cloud, wherein the initial k-d tree comprises an initial depth and is balanced;

generating a final k-d tree of the point cloud, wherein the final k-d tree comprises a final depth and is unbalanced, and wherein the final depth is greater than the initial depth; and

encoding an encoded point cloud,

wherein the encoded point cloud comprises a depth difference between the final depth and the initial depth.
The method of claim 1, further comprising further encoding a bounding box describing diagonal corner points of the object.
The method of any of claims 1-2, further comprising further encoding the initial depth.
The method of any of claims 1-3, further comprising further encoding the final k-d tree.
The method of any of claims 1-4, further comprising transmitting the encoded point cloud.
The method of any of claims 1-5, further comprising determining the initial depth based on a total number of the points.
The method of any of claims 1-6, further comprising further determining the initial depth based on a predetermined maximum number of levels that can be efficiently encoded given a constraint.
The method of any of claims 1-7, wherein the constraint is an amount of memory or a processing power.
The method of any of claims 1-8, further comprising generating additional nodes beyond the initial depth.
The method of any of claims 1-9, further comprising further generating the additional nodes based on a minimum spanning tree (MST) operation.
The method of any of claims 1-10, further comprising further generating the additional nodes based on an average residual from the MST operation.
The method of any of claims 1-11, further comprising further generating the final k-d tree using the initial k-d tree and the additional nodes.
The method of any of claims 1-12, wherein the point cloud, the initial k-d tree, and the final k-d tree are three-dimensional (3D) .
An apparatus comprising:

a memory; and

a processor coupled to the memory and configured to:

obtain a point cloud of an object, wherein the point cloud comprises points,

generate an initial k-d tree of the point cloud, wherein the initial k-d tree comprises an initial depth and is balanced,

generate a final k-d tree of the point cloud, wherein the final k-d tree comprises a final depth and is unbalanced, and wherein the final depth is greater than the initial depth, and

encode an encoded point cloud,

wherein the encoded point cloud comprises a depth difference between the final depth and the initial depth.
The apparatus of claim 14, wherein the processor is further configured to further encode a bounding box describing diagonal corner points of the object.
The apparatus of any of claims 13-15, wherein the processor is further configured to further encode the initial depth.
The apparatus of any of claims 13-16, wherein the processor is further configured to further encode the final k-d tree.
The apparatus of any of claims 13-17, further comprising a transmitter coupled to the processor and configured to transmit the encoded point cloud.
The apparatus of any of claims 13-18, wherein the processor is further configured to determine the initial depth based on a total number of the points.
The apparatus of any of claims 13-19, wherein the processor is further configured to further determine the initial depth based on a predetermined maximum number of levels that can be efficiently encoded given a constraint.
The apparatus of any of claims 13-20, wherein the constraint is an amount of memory or a processing power.
The apparatus of any of claims 13-21, wherein the processor is further configured to generate additional nodes beyond the initial depth.
The apparatus of any of claims 13-22, wherein the processor is further configured to further generate the additional nodes based on a minimum spanning tree (MST) operation.
The apparatus of any of claims 13-23, wherein the processor is further configured to further generate the additional nodes based on an average residual from the MST operation.
The apparatus of any of claims 13-24, wherein the processor is further configured to further generate the final k-d tree using the initial k-d tree and the additional nodes.
The apparatus of any of claims 13-25, wherein the point cloud, the initial k-d tree, and the final k-d tree are three-dimensional (3D) .
A computer program product comprising computer executable instructions stored on a non-transitory medium that when executed by a processor cause an apparatus to:

obtain a point cloud of an object, wherein the point cloud comprises points;

generate an initial k-d tree of the point cloud, wherein the initial k-d tree comprises an initial depth and is balanced;

generate a final k-d tree of the point cloud, wherein the final k-d tree comprises a final depth and is unbalanced, and wherein the final depth is greater than the initial depth; and

encode an encoded point cloud,

wherein the encoded point cloud comprises a depth difference between the final depth and the initial depth.
The computer program product of claim 27, wherein the instructions further cause the apparatus to further encode a bounding box describing diagonal corner points of the object.
The computer program product of any of claims 27-28, wherein the instructions further cause the apparatus to further encode the initial depth.
The computer program product of any of claims 27-29, wherein the instructions further cause the apparatus to further encode the final k-d tree.
The computer program product of any of claims 27-30, wherein the instructions further cause the apparatus to transmit the encoded point cloud.
The computer program product of any of claims 27-31, wherein the instructions further cause the apparatus to determine the initial depth based on a total number of the points.
The computer program product of any of claims 27-32, wherein the instructions further cause the apparatus to further determine the initial depth based on a predetermined maximum number of levels that can be efficiently encoded given a constraint.
The computer program product of any of claims 27-33, wherein the constraint is an amount of memory or a processing power.
The computer program product of claim any of claims 27-34, wherein the instructions further cause the apparatus to generate additional nodes beyond the initial depth.
The computer program product of any of claims 27-35, wherein the instructions further cause the apparatus to further generate the additional nodes based on a minimum spanning tree (MST) operation.
The computer program product of any of claims 27-36, wherein the instructions further cause the apparatus to further generate the additional nodes based on an average residual from the MST operation.
The computer program product of any of claims 27-37, wherein the instructions further cause the apparatus to further generate the final k-d tree using the initial k-d tree and the additional nodes.
The computer program product of any of claims 27-38, wherein the point cloud, the initial k-d tree, and the final k-d tree are three-dimensional (3D) .
A method comprising:

receiving an encoded point cloud comprising characteristics, wherein the characteristics comprise a bounding box describing diagonal corner points of an object, an initial depth of an initial k-d tree, a final k-d tree associated with a final depth, and a depth difference between the final depth;

extracting the characteristics from the encoded point cloud; and

generating a point cloud based on the characteristics.
The method of claim 40, wherein the point cloud, the initial k-d tree, and the final k-d tree are three-dimensional (3D) .
An apparatus comprising:

a memory; and

a processor coupled to the memory and configured to:

receive an encoded point cloud comprising characteristics, wherein the characteristics comprise a bounding box describing diagonal corner points of an object, an initial depth of an initial k-d tree, a final k-d tree associated with a final depth, and a depth difference between the final depth,

extract the characteristics from the encoded point cloud, and

generate a point cloud based on the characteristics.
The apparatus of claim 42, wherein the point cloud, the initial k-d tree, and the final k-d tree are three-dimensional (3D) .
A computer program product comprising computer executable instructions stored on a non-transitory medium that when executed by a processor cause an apparatus to:

receive an encoded point cloud comprising characteristics, wherein the characteristics comprise a bounding box describing diagonal corner points of an object, an initial depth of an initial k-d tree, a final k-d tree associated with a final depth, and a depth difference between the final depth;

extract the characteristics from the encoded point cloud; and

generate a point cloud based on the characteristics.
The computer program product of claim 44, wherein the point cloud, the initial k-d tree, and the final k-d tree are three-dimensional (3D) .