WO2022148903A1 - Improved bounding volumes compression for lossless g-pcc data compression - Google Patents


Info

Publication number
WO2022148903A1
Authority
WO
WIPO (PCT)
Prior art keywords
points
point cloud
encoding
image
feasible region
Prior art date
Application number
PCT/FI2022/050001
Other languages
French (fr)
Inventor
Ioan Tabus
Emre Can KAYA
Sebastian Schwarz
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy filed Critical Nokia Technologies Oy
Publication of WO2022148903A1 publication Critical patent/WO2022148903A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/001Model-based coding, e.g. wire frame
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/40Tree coding, e.g. quadtree, octree
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96Tree coding, e.g. quad-tree coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N2013/0074Stereoscopic image analysis
    • H04N2013/0081Depth or disparity estimation from stereoscopic image signals

Definitions

  • the examples and non-limiting embodiments relate generally to video coding, and more particularly, to improved bounding volumes compression for lossless G-PCC data compression.
  • an apparatus includes means for encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; means for encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and means for encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
  • an apparatus includes at least one processor; and at least one memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: encode a front projection of a point cloud as a first depthmap image, and encode a back projection of the point cloud as a second depthmap image; encode an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encode an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
  • a method includes encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
  • a non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations, the operations comprising: encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
  • FIG. 1A is an overview of an example G-PCC encoder.
  • FIG. 1B is an overview of an example G-PCC decoder.
  • FIG. 2B shows the feasible region points, obtained by uniting at each column index the lowest and the highest points by a vertical line.
  • FIG. 4 is an example apparatus to implement improved bounding volumes compression for lossless G-PCC data compression, based on the examples described herein.
  • FIG. 5 is an example method to implement improved bounding volumes compression for lossless G-PCC data compression, based on the examples described herein.
  • FIG. 6 illustrates two depthmaps transmitted in Stage I, showing the maximal depth image Z max on the left, and the minimal depth image Z min on the right.
  • FIG. 7 shows the lossy reconstruction of the point cloud, after Stage I, only from the minimal and maximal points transmitted with the depthmaps.
  • FIG. 8A shows the original point cloud to be encoded.
  • FIG. 8D shows the points that remain to be encoded in Stages II and III.
  • G-PCC Geometry-based point cloud coding
  • ISO/IEC MPEG JTC 1/SC 29/WG 11
  • the group is working together on this exploration activity in a collaborative effort known as the 3-Dimensional Graphics Team (3DG) to evaluate compression technology designs proposed by their experts in this area.
  • 3DG 3-Dimensional Graphics Team
  • G-PCC Geometry based Point Cloud Compression
  • FIG. 1A provides an overview of an example G-PCC encoder 100
  • FIG. 1B provides an overview of an example G-PCC decoder 150
  • the analyze surface approximation 114, the RAHT 124, the synthesize surface approximation 160, and the RAHT 172 are options typically used for static data.
  • the generate LOD 126, lifting 128, generate LOD 174, and inverse lifting 176 are options typically used for dynamically acquired data.
  • the compressed geometry is typically represented as an octree from the root all the way down to a leaf level of individual voxels.
  • the compressed geometry is typically represented by a pruned octree (i.e., an octree from the root down to a leaf level of blocks larger than voxels) plus a model that approximates the surface within each leaf of the pruned octree.
  • a pruned octree i.e., an octree from the root down to a leaf level of blocks larger than voxels
  • static data may in addition approximate the voxels within each leaf with a surface model.
  • the surface model used is a triangulation comprising 1-10 triangles per block, resulting in a triangle soup.
  • the static geometry codec is therefore known as the Trisoup geometry codec, while the dynamically acquired geometry codec is known as the Octree geometry codec.
  • RAHT Region Adaptive Hierarchical Transform
  • Predicting Transform interpolation- based hierarchical nearest-neighbor prediction
  • Lifting Transform interpolation-based hierarchical nearest- neighbor prediction with an update/lifting step
  • RAHT and Lifting are typically used for Category 1 data
  • Predicting is typically used for Category 3 data.
  • either method may be used for any data, and, just like with the geometry codecs in G-PCC, the user has the option to choose which of the 3 attribute codecs they would like to use.
  • the inputs to the G-PCC encoder 100 include positions 102 and attributes 104. Positions 102 are provided to transform coordinates 106 and to transfer attributes 120. Following coordinate transformation 106, points are quantized and removed (voxelized) at 108, then the octree is analyzed at 110, the output of which is provided to geometry reconstruction 112, analyze surface approximation 114 (the output of which is also provided to geometry reconstruction 112), and arithmetic encode 116, which also takes as input the output of analyze surface approximation 114.
  • the geometry bitstream 134 output of the G-PCC encoder 100 is provided by arithmetic encode 116.
  • Attributes 104 are provided to transform colors 118, the output of which is provided to transfer attributes 120 which also takes as input the result of the geometry reconstruction 112.
  • Switch 122 is configured to provide the result of transfer attributes 120, for example as input to the RAHT 124 and generate LOD 126.
  • the result of the geometry reconstruction 112 is provided to the RAHT 124 and to the generate LOD 126, the output of which is provided to lifting 128.
  • Coefficients are quantized at 130 that takes as input the result of the RAHT 124 and the lifting 128.
  • the quantized coefficients 130 are provided to arithmetic encode 132 which then provides the attribute bitstream 136 as another output of the G-PCC encoder 100.
  • the G-PCC decoder 150 takes as input the output of the G-PCC encoder 100, namely geometry bitstream 134 and attribute bitstream 136.
  • Geometry bitstream 134 is provided to arithmetic decode 156, the output of which is provided to synthesize octree 158 and synthesize surface approximation 160, which also takes as input the result of the octree synthesizing 158.
  • the result of synthesize surface approximation 160 is provided to geometry reconstruction 162, the output of which is provided to inverse transform coordinates 164, the RAHT 172, and to generate LOD 174.
  • the result of the inverse transform coordinates 164 is positions 180, which is one of the outputs of the G-PCC decoder 150.
  • Attribute bitstream 136 is provided to arithmetic decode 166, the output of which is provided to inverse quantize 168.
  • Switch 170 is configured to provide the output of inverse quantize 168 to the RAHT 172 and to the generate LOD 174, the output of which is provided to inverse lifting 176.
  • the output of the RAHT 172 and the inverse lifting 176 is provided to inverse transform colors 178, the output of which is attributes 182, another one of the outputs of the G-PCC decoder 150.
  • the G-PCC encoder 100 and the G-PCC decoder 150 may be configured to implement the methods described herein related to improved bounding volumes compression for lossless G-PCC data compression
  • lossless compression of point clouds has been studied and is currently under standardization within the G-PCC project of MPEG.
  • the current G-PCC lossless method is based on octree representations: a point cloud is represented as an octree that can be parsed from the root to the leaves; at each depth level in the tree one obtains a lossy reconstruction of the point cloud at a certain resolution, while the lossless reconstruction is obtained at the final resolution level in the octree.
  • an octree node corresponds to a cube at a particular 3D location and the octree node is labeled by one if within that cube there is at least one final point of the point cloud.
  • the encoding process is done by traversing the tree from root to the leaves and specifying at each node the pattern of splitting of a node, where the occupancy pattern is a byte, with one bit for each of the possible eight children, where a 1 specifies that a child node is occupied.
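As a sketch of the occupancy-pattern idea described above, the 8-bit pattern of a node can be computed from the points falling inside its cube. The child-bit ordering below (x major, then y, then z) is an assumption for illustration, not the normative G-PCC ordering:

```python
def occupancy_byte(points, origin, size):
    """Compute the 8-bit occupancy pattern of an octree node.

    `points` is an iterable of (x, y, z) integer voxels inside the node's
    cube, `origin` is the cube's minimum corner, and `size` its edge length
    (a power of two). Bit i is set to 1 when child octant i contains at
    least one point of the point cloud.
    """
    half = size // 2
    byte = 0
    for (x, y, z) in points:
        # Each coordinate contributes one bit of the child index.
        i = (((x - origin[0]) >= half) << 2) \
          | (((y - origin[1]) >= half) << 1) \
          |  ((z - origin[2]) >= half)
        byte |= 1 << i
    return byte

# A 4x4x4 node occupied only near the origin and at the opposite corner:
print(occupancy_byte([(0, 1, 0), (3, 3, 3)], (0, 0, 0), 4))  # 129 (bits 0 and 7)
```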
  • the examples herein disclose a new point cloud lossless compression algorithm, based on bounding volumes for G-PCC [refer to I. Tabus, E. Kaya, S. Schwarz, "Successive Refinement of Bounding Volumes for Point Cloud Coding” IEEE International Workshop on Multimedia Signal Processing, MMSP 2020, Tampere, September 2020; also refer to Finnish patent application number 20205300, Method for Geometry-Based Point Cloud Coding].
  • in the bounding volumes method, all reconstructed points truly belonged to the point cloud, but some points, specifically some inner points in the transverse sections of the point cloud, were not encoded at all.
  • the examples herein describe a complete lossless process that overlaps in the first stage with the bounding volumes method (encoding a front and a back projection), but diverges from the previous method already at the second stage, that of encoding the true boundary of the feasible region, making it less restrictive by removing the requirement of decomposing the point cloud into tubes whose sections have a single connected component.
  • an exclusion process similar to or better than that of the octree method is achieved by processing only the final-resolution representation of the point cloud, with the additional advantage that the context information is easier to interpret and to operate with in 2D section images, and one can exploit important long-distance regularities of the point cloud, e.g. regularities of the geodesic shapes in projection images.
  • the disclosed encoding process includes additional primitives that can provide finally a lossless reconstruction of the point cloud.
  • the primitives are based on carefully selected context coding methods, making intensive use of the exclusion process to avoid unnecessarily testing for occupancy at locations whose status is already known.
  • the encoding of large connected components that cannot be predicted well from the already encoded voxels can be done using chain codes for their boundaries and context coding for the inner points.
  • the disclosed solution outperforms the current G-PCC solution in terms of compression efficiency by 4-9%.
  • the examples herein describe an original lossless coding scheme, where all the coding inference is done for the representation at the final resolution of a given point cloud, without using representations at intermediate resolutions.
  • the coding progresses along the axis Oy of the 3D coordinate system, drawing transverse sections of the point cloud parallel to the plane zOx, and encoding losslessly each such section, in Stages II and III, using two major primitives.
  • efficient context models are utilized relying on a model of watertight outer surface of the point cloud.
  • Such primitives are utilized in the encoding process, which leads to exclusion of sets of non-occupied points as large as possible within the current section.
  • the regularity of the geometric shapes including smoothness of the surface and the continuity of the edges, are exploited by using context models, where the occupancy probability of a point is determined by the occupancy of the neighbor points in the 3D space, by including the 2D neighbors from the current section, and the 3D-neighbor points from the past section.
  • Stage I Encoding a front and a back projection
  • the first encoding stage is intended for defining and encoding exclusion sets, containing large parts of the space, and for that the maximal and minimal depthmaps are utilized in the plane Oxy, where z represents the depth (i.e., height above the Oxy plane).
  • the minimal depth image Z min has at the pixel (x,y) the value Z min (x,y), equal to the minimum z for which (x,y,z) is a point of the original point cloud.
  • the maximal depth image Z max has at the pixel (x,y) the value Z max (x,y), equal to the maximum z for which (x,y,z) is a point of the original point cloud.
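The two Stage I projections can be sketched as follows; the image layout and the use of `None` for pixels never hit by a point are assumptions of this illustration, not the codec's actual representation:

```python
def depth_images(points, width, height):
    """Build the minimal and maximal depth images Z_min and Z_max.

    `points` is a list of (x, y, z) voxels. Z_min(x, y) is the minimum z
    for which (x, y, z) is a point of the cloud; Z_max(x, y) is the
    maximum such z. Empty pixels stay None.
    """
    z_min = [[None] * width for _ in range(height)]
    z_max = [[None] * width for _ in range(height)]
    for (x, y, z) in points:
        if z_min[y][x] is None or z < z_min[y][x]:
            z_min[y][x] = z
        if z_max[y][x] is None or z > z_max[y][x]:
            z_max[y][x] = z
    return z_min, z_max

pts = [(1, 0, 2), (1, 0, 5), (0, 1, 3)]
zmin, zmax = depth_images(pts, width=2, height=2)
print(zmin[0][1], zmax[0][1])  # 2 5
```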
  • processing of the set of feasible points from the image F, obtained in light of the information encoded in Stage I, is started.
  • the points on the boundary of the feasible region are marked as ones in the binary image B (the points (z, x) for which at least one of F(z - 1, x), F(z + 1, x), F(z, x - 1), F(z, x + 1) is zero).
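The boundary-marking rule above can be sketched directly; treating out-of-image neighbors as zero is an assumption of this illustration:

```python
def boundary_image(F):
    """Mark the boundary of the feasible region: a pixel (z, x) with
    F(z, x) = 1 is set in B when at least one of F(z-1, x), F(z+1, x),
    F(z, x-1), F(z, x+1) is zero."""
    H, W = len(F), len(F[0])

    def f(z, x):
        # Out-of-image neighbors count as zero (an assumed convention).
        return F[z][x] if 0 <= z < H and 0 <= x < W else 0

    B = [[0] * W for _ in range(H)]
    for z in range(H):
        for x in range(W):
            if F[z][x] and 0 in (f(z - 1, x), f(z + 1, x), f(z, x - 1), f(z, x + 1)):
                B[z][x] = 1
    return B

# A 3x3 block of feasible points: the center pixel is interior,
# the eight surrounding ones are boundary.
F = [[0, 0, 0, 0, 0],
     [0, 1, 1, 1, 0],
     [0, 1, 1, 1, 0],
     [0, 1, 1, 1, 0],
     [0, 0, 0, 0, 0]]
B = boundary_image(F)
print(B[2][2], B[1][1])  # 0 1
```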
  • in Stage II, the status of the unknown points from the set of points of B that are marked as ones is transmitted to the decoder.
  • the probability distribution used in the arithmetic coder is stored and updated in the counts array C(x, 0), C(x, 1) indexed by the context x. There are multiple ways to select the context, one of them being presented in Section II-D.
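A minimal sketch of such a context-indexed counts model follows; the Laplace-style initialization of both counts to 1 is an assumption for illustration, not necessarily the coder's actual initialization:

```python
class CountsModel:
    """Adaptive binary probability model indexed by context.

    For each context x the counts C(x, 0) and C(x, 1) are kept; the
    probability fed to the arithmetic coder is the smoothed ratio, and
    the count of the observed bit is incremented after coding.
    """
    def __init__(self):
        self.C = {}  # context -> (count of 0s, count of 1s)

    def prob_one(self, ctx):
        c0, c1 = self.C.get(ctx, (1, 1))
        return c1 / (c0 + c1)

    def update(self, ctx, bit):
        c0, c1 = self.C.get(ctx, (1, 1))
        self.C[ctx] = (c0 + (bit == 0), c1 + (bit == 1))

m = CountsModel()
for b in (1, 1, 0, 1):
    m.update(7, b)      # four bits observed under the same context
print(m.prob_one(7))    # (1 + 3) / (2 + 4)
```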
  • the feasible region is determined after Stage I (where the first set of points were encoded by the two depthmaps).
  • the feasible region consists, in light of the already transmitted information, of the points that are possible to be true points (they may be or not be true points), while the points outside the feasible region are known to not be true points.
  • the already recovered true points belonging to the boundary of the feasible region form a set F of pixels that are set to 1, possibly comprising several connected components.
  • a morphological dilation of the set F is performed, using the 3×3 square as the structuring element.
  • This obtained set of points is traversed along the rows of the 2D image and is stored in a list of pixels.
  • a binary image is also initialized to store the marked pixels information. After this initialization step, the list of pixels is processed sequentially, starting from the top of the list, processing a current pixel (z, x).
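The dilation step can be sketched as follows; clipping at the image border is an assumption of this illustration:

```python
def dilate3x3(F):
    """Morphological dilation of binary image F with the full 3x3
    structuring element: a pixel is 1 in the result when any pixel of
    its 3x3 neighborhood is 1 in F."""
    H, W = len(F), len(F[0])
    D = [[0] * W for _ in range(H)]
    for z in range(H):
        for x in range(W):
            for dz in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    zz, xx = z + dz, x + dx
                    if 0 <= zz < H and 0 <= xx < W and F[zz][xx]:
                        D[z][x] = 1
    return D

F = [[0, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]
print(dilate3x3(F))  # the single 1 grows into the full 3x3 block
```

The row-wise traversal into a list of pixels, as described above, is then simply `pixels = [(z, x) for z, row in enumerate(dilate3x3(F)) for x, v in enumerate(row) if v]`.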
  • all voxels have been encoded that are connected to the boundary of the feasible region by a path of connected pixels (in 8-connectivity).
  • all voxels are encoded that are connected to the voxels contained in the initial two depthmap images.
  • This shell of voxels contains the majority of voxels in the point cloud.
  • the remaining voxels are encoded by processing additional shells, as described in Section E.
  • Encoding the status of the inner points inside the feasible region may also be conceptually referred to as tracing the inner points inside the feasible region.
  • the value T(z, x) needs to be encoded.
  • the first part of the context uses the values of the already reconstructed pixels that are 8-neighbors of (z, x), and also additionally, the information about which of the pixels were already known (note that R and K are distinct images after Stage I).
  • the second part of the context is the 3×3 binary matrix B formed from a 3×3 template located at (z, x), selected from the binary image P of past true pixels.
  • the information from A and B is used to form the context.
  • the context is formed as (I(A), I(B)), e.g. by concatenating the binary representations of I(A) and of I(B).
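A sketch of one such concatenation follows; the row-major bit packing used for I(·) is an assumed choice for illustration:

```python
def index_3x3(M):
    """Pack a 3x3 binary matrix into a 9-bit integer, row by row
    (the scan order is an assumption of this sketch)."""
    i = 0
    for row in M:
        for v in row:
            i = (i << 1) | v
    return i

def context(A, B):
    """Concatenate the binary representations I(A) and I(B) of the two
    3x3 context matrices into one 18-bit context index."""
    return (index_3x3(A) << 9) | index_3x3(B)

A = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]   # only the top-left neighbor is set
B = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]   # only the bottom-right past pixel
print(context(A, B))  # I(A) * 2**9 + I(B) = 131073
```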
  • the context information in A and B is further normalized, in the following way. A context collapsing operation is performed, so that if a rotation by α ∈ {0, π/2, π, 3π/2} is applied to each of the images R, T and K, around the pixel (z, x), the value of the resulting normalized context is the same.
  • consider the 3×3 matrix A: apply the rotation by α around the middle pixel and denote by Aα the resulting 3×3 matrix.
  • this process of collapsing the four matrices to a single one is performed off-line once, resulting in a tabulated mapping A → α*(A) and another mapping A → I*(A), which realize the mapping of the context to the canonical one, stored in look-up tables.
  • the normalized context is found in the following way.
  • the matrix A is formed from R+K and the canonical rotation index α* for this matrix is computed (or picked from the lookup table, at the address formed from the 9 elements of A).
  • the corresponding rotated matrix Aα* is computed.
  • the second part of the context is the 3×3 matrix B formed from P at (z, x).
  • the matrix B is rotated by the previously determined α* around its center, obtaining a matrix Bα*.
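The off-line collapsing of the four rotations to one canonical context can be sketched as below. Choosing the rotation whose packed integer is smallest as the canonical form is an assumption of this sketch; the scheme only requires that all four rotations map to the same normalized context:

```python
def index_3x3(M):
    """Pack a 3x3 binary matrix into a 9-bit integer, row by row."""
    i = 0
    for row in M:
        for v in row:
            i = (i << 1) | v
    return i

def rot90(M):
    """Rotate a 3x3 matrix by 90 degrees clockwise around its center."""
    return [[M[2 - c][r] for c in range(3)] for r in range(3)]

def canonical(A):
    """Return (alpha_star, I_star): the rotation count in {0, 1, 2, 3}
    whose rotated matrix packs to the smallest integer, and that
    smallest integer I*. All four rotations of A yield the same I*."""
    best = None
    R = A
    for a in range(4):
        i = index_3x3(R)
        if best is None or i < best[1]:
            best = (a, i)
        R = rot90(R)
    return best

A = [[0, 0, 1],
     [0, 0, 0],
     [0, 0, 0]]         # a single corner neighbor set
print(canonical(A))      # (1, 1): one rotation reaches the canonical form
```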
  • Table 1 below shows the rate in BPV of the proposed codec compared to MPEG G-PCC TMC13 (Build 26.2.2020).
  • the reconstruction obtained from the disclosed method contains all the voxels forming the outer surfaces of the point cloud and all the inner voxels connected to these outer surface voxels, i.e., all voxels that are connected by a path in the 3D space (connected in the 26-voxel connectivity), to the initial voxels recovered in Stage I from the two depthmap images.
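The notion of voxels joined by a 3D path in 26-connectivity to the Stage I voxels can be sketched with a flood fill; the breadth-first traversal over coordinate sets is an illustration, not the codec's actual section-by-section processing:

```python
from collections import deque

def connected_to_seeds(occupied, seeds):
    """Collect every occupied voxel joined to a seed voxel by a path of
    occupied voxels in 26-connectivity (neighbors differ by at most 1
    in each coordinate). `occupied` and `seeds` are sets of (x, y, z)
    integer tuples."""
    reached = set(s for s in seeds if s in occupied)
    queue = deque(reached)
    while queue:
        x, y, z = queue.popleft()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    n = (x + dx, y + dy, z + dz)
                    if n in occupied and n not in reached:
                        reached.add(n)
                        queue.append(n)
    return reached

occ = {(0, 0, 0), (1, 1, 1), (2, 2, 2), (9, 9, 9)}  # (9,9,9) is isolated
print(sorted(connected_to_seeds(occ, {(0, 0, 0)})))
```

An isolated object, like the voxel (9, 9, 9) above, is exactly the kind of content that requires the additional peel-off shells described in Section E.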
  • there are very complex point clouds, for example those representing a building and all the furniture and objects inside, where some objects are not connected by a 3D path to the outermost voxels.
  • FIG. 2B shows the feasible region points, obtained by uniting at each column index the lowest and the highest points by a vertical line.
  • FIG. 3A in white are shown the points recovered from Stage I (the same as in FIG. 2A) and in gray are shown the points recovered in Stage II, when encoding all true points from the boundary of the feasible region.
  • in white are shown all true points recovered after Stage II, while in gray are shown the true points added at Stage III.
  • FIG. 4 is an apparatus 200 which may be implemented in hardware, configured to implement improved bounding volumes compression for lossless G-PCC data compression, based on the examples described herein.
  • the apparatus comprises a processor 202, at least one memory 204 (e.g. non-transitory memory or transitory memory) including computer program code 205, wherein the at least one memory 204 and the computer program code 205 are configured to, with the at least one processor 202, cause the apparatus to utilize circuitry configured to implement improved bounding volumes compression for lossless G-PCC data compression, based on the examples described herein.
  • the computer program code 205 includes projection encode 206 configured at least to encode a front projection of a point cloud as a first depthmap image, and encode a back projection of the point cloud as a second depthmap image.
  • the computer program code 205 further includes feasible region encode 208 configured at least to encode an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of the at least one feasible region are known already to not belong to the point cloud.
  • the inner points encode 210 is configured to encode an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
  • the apparatus 200 is further configured, using the projection encode 206 and/or the feasible region encode 208, to construct an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied, to mark one or more points on the boundary of the image of the at least one feasible region, to encode an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region, and to transmit to a decoder (such as decode 216) the occupancy status of the unknown points, wherein the encoding of the occupancy status of the unknown points is performed utilizing a context 212 of the unknown points.
  • a decoder such as decode 216
  • the apparatus 200 is further configured, using the outer shell encode 214, to encode at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell.
  • Outer shell encode 214 is the implementation of the repetitive peel off process described herein in Section E.
  • Interface 222 is configured to facilitate communication between the encoding and decoding items, as shown, and may be a bus or function call or other programming interface or hardware interface.
  • the apparatus 200 includes a display and/or I/O interface 218 that may be used to display an output (e.g., reconstructed point cloud) of a result of the encoding and/or decoding.
  • the display and/or I/O interface 218 may also be configured to receive input such as an input point cloud or user input.
  • the apparatus 200 also includes one or more network (NW) interfaces (I/F(s)) 220.
  • NW I/F(s) 220 may be wired and/or wireless and communicate over the Internet/other network(s) via any communication technique.
  • the NW I/F(s) 220 may comprise one or more transmitters and one or more receivers.
  • the NW I/F(s) 220 may comprise standard well-known components such as an amplifier, filter, frequency-converter, (de)modulator, encoder/decoder circuitry(ies) and one or more antennas.
  • the apparatus 200 may be a remote, virtual or cloud apparatus.
  • the apparatus 200 may be either a coder or a decoder, or both a coder and a decoder.
  • the memory 204 may be implemented using any suitable data storage technology, such as semiconductor based memory devices, flash memory, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory.
  • the memory 204 may comprise a database for storing data.
  • Interface 224 is configured to facilitate communication between items of apparatus 200, as shown in FIG. 4, and may be a bus or other software or hardware interface.
  • the apparatus 200 need not comprise each of the features mentioned, or may comprise other features as well.
  • the apparatus 200 may be an example of the G-PCC encoder 100 and/or the G-PCC decoder 150.
  • FIG. 5 is an example method 300 to implement improved bounding volumes compression for lossless G-PCC data compression, based on the examples described herein.
  • the method includes encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image.
  • the method includes encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud.
  • the method includes encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
  • Method 300 may be implemented using the apparatus 200.
  • a feasible region is a set of points that form a connected component (i.e., each of its points is connected, in 8-connectivity, with at least one other point from the set), and hence there may be many such feasible regions in the section (for example, in FIG. 2B there are two such connected components, namely items 101 and 103).
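Extracting the feasible regions as the 8-connected components of a binary section image can be sketched as follows (a depth-first traversal is used here purely for illustration):

```python
def components8(F):
    """Label the 8-connected components of binary image F; returns a
    list of components, each a set of (z, x) pixels."""
    H, W = len(F), len(F[0])
    seen = set()
    comps = []
    for z in range(H):
        for x in range(W):
            if F[z][x] and (z, x) not in seen:
                stack, comp = [(z, x)], set()
                seen.add((z, x))
                while stack:
                    cz, cx = stack.pop()
                    comp.add((cz, cx))
                    # All eight neighbors, including diagonals.
                    for dz in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            nz, nx = cz + dz, cx + dx
                            if (0 <= nz < H and 0 <= nx < W and F[nz][nx]
                                    and (nz, nx) not in seen):
                                seen.add((nz, nx))
                                stack.append((nz, nx))
                comps.append(comp)
    return comps

F = [[1, 1, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]
print(len(components8(F)))  # 2: two feasible regions in this section
```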
  • FIG. 6 shows two depthmaps transmitted in Stage I, showing the maximal depth image Z max on the left (402), and the minimal depth image Z min on the right (404).
  • FIG. 7 shows the lossy reconstruction of the point cloud, after Stage I, only from the minimal and maximal points transmitted with the depthmaps.
  • in FIG. 8A, FIG. 8B, FIG. 8C, and FIG. 8D, the origin and coordinate axes Oxyz are drawn.
  • the ground truth points are obtained in the section, represented in FIG. 8B.
  • FIG. 8D shows the points that remain to be encoded in Stages II and III.
  • references to a 'computer', 'processor', etc. should be understood to encompass not only computers having different architectures such as single/multi-processor architectures and sequential (Von Neumann)/parallel architectures but also specialized circuits such as field-programmable gate arrays (FPGAs), application specific circuits (ASICs), signal processing devices and other processing circuitry.
  • References to computer program, instructions, code etc. should be understood to encompass software for a programmable processor or firmware such as, for example, the programmable content of a hardware device such as instructions for a processor, or configuration settings for a fixed-function device, gate array or programmable logic device, etc.
  • the term 'circuitry' may refer to any of the following: (a) hardware circuit implementations, such as implementations in analog and/or digital circuitry; (b) combinations of circuits and software (and/or firmware), such as (as applicable): (i) a combination of processor(s) or (ii) portions of processor(s)/software including digital signal processor(s), software, and memory(ies) that work together to cause an apparatus to perform various functions; and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present.
  • This description of 'circuitry' applies to uses of this term in this application.
  • the term 'circuitry' would also cover an implementation of merely a processor (or multiple processors) or a portion of a processor and its (or their) accompanying software and/or firmware.
  • the term 'circuitry' would also cover, for example and if applicable to the particular element, a baseband integrated circuit or applications processor integrated circuit for a mobile phone or a similar integrated circuit in a server, a cellular network device, or another network device.
  • An example apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: encode a front projection of a point cloud as a first depthmap image, and encode a back projection of the point cloud as a second depthmap image; encode an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encode an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
  • the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: encode at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell.
  • the front and back projections of the point cloud may be projections along an axis; and the front and back projections may be encoded using crack-edge-region-value.
  • the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: encode the front and back projections along an axis of a three-dimensional coordinate system; draw transverse sections of the point cloud parallel to a plane of the three-dimensional coordinate system; and encode losslessly the transverse sections.
  • the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: construct an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied; mark one or more points on the boundary of the image of the at least one feasible region; encode an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region; and transmit to a decoder the occupancy status of the unknown points; wherein the encoding of the occupancy status of the unknown points is performed utilizing a context of the unknown points.
  • the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: store a probability distribution of the occupancy status of the unknown points, where the probability distribution is based on neighboring points and is used in an arithmetic coder.
  • the context of the unknown points may be a normalized context using information from: a matrix generated from cropping a sum of a known points array and a reconstructed points array based respectively on the front projection and back projection; and a matrix formed from an image comprising the true points on the boundary of the at least one feasible region of the point cloud.
  • the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: rotate the matrix generated from cropping the sum of the known points array and the reconstructed points array; determine, during the rotation, a weighted score of the elements of the matrix generated from cropping the sum of the known points array and the reconstructed points array, where elements close to one of the corners are given a higher score; and rotate the matrix formed from the image comprising the true points on the boundary of the at least one feasible region of the point cloud, based on the weighted score.
  • the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: repeat the encoding of the at least one outermost shell.
  • the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: obtain the set of points of the at least one feasible region after uniting at each column index a lowest and highest set of points of the at least one feasible region with a vertical line.
  • the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: transmit the encoded point cloud to a decoder.
  • the axis may be an Oz axis.
  • the axis may be an Oy axis.
  • the plane may be a zOx plane.
  • An example apparatus includes means for encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; means for encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and means for encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
  • the apparatus may further include means for encoding at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell.
  • the apparatus may further include wherein: the front and back projections of the point cloud are projections along an axis; and the front and back projections are encoded using crack-edge-region-value.
  • the apparatus may further include means for encoding the front and back projections along an axis of a three-dimensional coordinate system; means for drawing transverse sections of the point cloud parallel to a plane of the three-dimensional coordinate system; and means for encoding losslessly the transverse sections.
  • the apparatus may further include means for constructing an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied; means for marking one or more points on the boundary of the image of the at least one feasible region; means for encoding an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region; and means for transmitting to a decoder the occupancy status of the unknown points; wherein the encoding of the occupancy status of the unknown points is performed utilizing a context of the unknown points.
  • the apparatus may further include means for storing a probability distribution of the occupancy status of the unknown points, where the probability distribution is based on neighboring points and is used in an arithmetic coder.
  • the apparatus may further include wherein the context of the unknown points is a normalized context using information from: a matrix generated from cropping a sum of a known points array and a reconstructed points array based respectively on the front projection and back projection; and a matrix formed from an image comprising the true points on the boundary of the at least one feasible region of the point cloud.
  • the apparatus may further include means for rotating the matrix generated from cropping the sum of the known points array and the reconstructed points array; means for determining, during the rotation, a weighted score of the elements of the matrix generated from cropping the sum of the known points array and the reconstructed points array, where elements close to one of the corners are given a higher score; and means for rotating the matrix formed from the image comprising the true points on the boundary of the at least one feasible region of the point cloud, based on the weighted score.
  • the apparatus may further include means for repeating the encoding of the at least one outermost shell.
  • the apparatus may further include means for obtaining the set of points of the at least one feasible region after uniting at each column index a lowest and highest set of points of the at least one feasible region with a vertical line.
  • the apparatus may further include means for transmitting the encoded point cloud to a decoder.
  • the apparatus may further include wherein the axis is an Oz axis.
  • the apparatus may further include wherein the axis is an Oy axis.
  • the apparatus may further include wherein the plane is a zOx plane.
  • An example method includes encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
  • the method may further include encoding at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell.
  • the method may further include wherein: the front and back projections of the point cloud are projections along an axis; and the front and back projections are encoded using crack-edge-region-value.
  • the method may further include encoding the front and back projections along an axis of a three-dimensional coordinate system; drawing transverse sections of the point cloud parallel to a plane of the three-dimensional coordinate system; and encoding losslessly the transverse sections.
  • the method may further include constructing an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied; marking one or more points on the boundary of the image of the at least one feasible region; encoding an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region; and transmitting to a decoder the occupancy status of the unknown points; wherein the encoding of the occupancy status of the unknown points is performed utilizing a context of the unknown points.
  • the method may further include storing a probability distribution of the occupancy status of the unknown points, where the probability distribution is based on neighboring points and is used in an arithmetic coder.
  • the method may further include wherein the context of the unknown points is a normalized context using information from: a matrix generated from cropping a sum of a known points array and a reconstructed points array based respectively on the front projection and back projection; and a matrix formed from an image comprising the true points on the boundary of the at least one feasible region of the point cloud.
  • the method may further include rotating the matrix generated from cropping the sum of the known points array and the reconstructed points array; determining, during the rotation, a weighted score of the elements of the matrix generated from cropping the sum of the known points array and the reconstructed points array, where elements close to one of the corners are given a higher score; and rotating the matrix formed from the image comprising the true points on the boundary of the at least one feasible region of the point cloud, based on the weighted score.
  • the method may further include repeating the encoding of the at least one outermost shell.
  • the method may further include obtaining the set of points of the at least one feasible region after uniting at each column index a lowest and highest set of points of the at least one feasible region with a vertical line.
  • the method may further include transmitting the encoded point cloud to a decoder.
  • the method may further include wherein the axis is an Oz axis.
  • the method may further include wherein the axis is an Oy axis.
  • the method may further include wherein the plane is a zOx plane.
  • An example non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations comprising: encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
  • the operations of the non-transitory program storage device may further include encoding at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell.
  • the non-transitory program storage device may further include wherein: the front and back projections of the point cloud are projections along an axis; and the front and back projections are encoded using crack-edge-region-value.
  • the operations of the non-transitory program storage device may further include encoding the front and back projections along an axis of a three-dimensional coordinate system; drawing transverse sections of the point cloud parallel to a plane of the three-dimensional coordinate system; and encoding losslessly the transverse sections.
  • the operations of the non-transitory program storage device may further include constructing an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied; marking one or more points on the boundary of the image of the at least one feasible region; encoding an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region; and transmitting to a decoder the occupancy status of the unknown points; wherein the encoding of the occupancy status of the unknown points is performed utilizing a context of the unknown points.
  • the operations of the non-transitory program storage device may further include storing a probability distribution of the occupancy status of the unknown points, where the probability distribution is based on neighboring points and is used in an arithmetic coder.
  • the non-transitory program storage device may further include wherein the context of the unknown points is a normalized context using information from: a matrix generated from cropping a sum of a known points array and a reconstructed points array based respectively on the front projection and back projection; and a matrix formed from an image comprising the true points on the boundary of the at least one feasible region of the point cloud.
  • the operations of the non-transitory program storage device may further include rotating the matrix generated from cropping the sum of the known points array and the reconstructed points array; determining, during the rotation, a weighted score of the elements of the matrix generated from cropping the sum of the known points array and the reconstructed points array, where elements close to one of the corners are given a higher score; and rotating the matrix formed from the image comprising the true points on the boundary of the at least one feasible region of the point cloud, based on the weighted score.
  • the operations of the non-transitory program storage device may further include repeating the encoding of the at least one outermost shell.
  • the operations of the non-transitory program storage device may further include obtaining the set of points of the at least one feasible region after uniting at each column index a lowest and highest set of points of the at least one feasible region with a vertical line.
  • the operations of the non-transitory program storage device may further include transmitting the encoded point cloud to a decoder.
  • the non-transitory program storage device may further include wherein the axis is an Oz axis.
  • the non-transitory program storage device may further include wherein the axis is an Oy axis.
  • the non-transitory program storage device may further include wherein the plane is a zOx plane.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

An apparatus includes means for encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image (302); means for encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud (304); and means for encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud (306).

Description

Improved Bounding Volumes Compression For Lossless G-PCC Data Compression
TECHNICAL FIELD
[0001] The examples and non-limiting embodiments relate generally to video coding, and more particularly, to improved bounding volumes compression for lossless G-PCC data compression.
BACKGROUND
[0002] It is known to perform video coding and decoding.
SUMMARY
[0003] In an aspect, an apparatus includes means for encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; means for encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and means for encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
[0004] In an aspect, an apparatus includes at least one processor; and at least one memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: encode a front projection of a point cloud as a first depthmap image, and encode a back projection of the point cloud as a second depthmap image; encode an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encode an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
[0005] In an aspect, a method includes encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
[0006] In an aspect, a non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations is provided, the operations comprising: encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
BRIEF DESCRIPTION OF THE DRAWINGS
[0007] The foregoing aspects and other features are explained in the following description, taken in connection with the accompanying drawings, wherein:
[0008] FIG. 1A is an overview of an example G-PCC encoder.
[0009] FIG. 1B is an overview of an example G-PCC decoder.
[0010] FIG. 2A shows the points in the section y=132 of the "longdress" point cloud, obtained after stage I (after encoding the two depthmaps).
[0011] FIG. 2B shows the feasible region points, obtained by uniting at each column index the lowest and the highest points by a vertical line.
[0012] FIG. 3A shows the points in the section y=132 of the "longdress" point cloud obtained after stage II.
[0013] FIG. 3B shows all the points of the section with y=132 of the "longdress" point cloud encoded after Stage III.
[0014] FIG. 4 is an example apparatus to implement improved bounding volumes compression for lossless G-PCC data compression, based on the examples described herein.
[0015] FIG. 5 is an example method to implement improved bounding volumes compression for lossless G-PCC data compression, based on the examples described herein.
[0016] FIG. 6 illustrates two depthmaps transmitted in Stage I, showing the maximal depth image Zmax on the left, and the minimal depth image Zmin on the right.
[0017] FIG. 7 shows the lossy reconstruction of the point cloud, after Stage I, only from the minimal and maximal points transmitted with the depthmaps.
[0018] FIG. 8A shows the original point cloud to be encoded.
[0019] FIG. 8B shows the ground truth points at the intersection of the plane y=y0 with the point cloud.
[0020] FIG. 8C shows the points in the section y=y0 which were transmitted in Stage I, by encoding two depthmaps with the CERV method.
[0021] FIG. 8D shows the points that remain to be encoded in Stages II and III.
DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS
[0022] Geometry-based point cloud coding (G-PCC)
[0023] ISO/IEC MPEG (JTC 1/SC 29/WG 11) is studying the potential need for standardization of point cloud coding technology with a compression capability that significantly exceeds that of current approaches, with the target of creating a standard. The group is working together on this exploration activity in a collaborative effort known as the 3-Dimensional Graphics Team (3DG) to evaluate compression technology designs proposed by their experts in this area.
[0024] Part of this effort is the G-PCC standard (Geometry-based Point Cloud Compression). G-PCC addresses the compression of highly complex and sparse point clouds, for both static and dynamic PC acquisition use cases.
[0025] A full overview of the codec design is given in [MPEG G-PCC Codec: ISO/IEC JTC1/SC29/WG11 N18891 "G-PCC codec description v5", October 2019] and summarized below. FIG. 1A provides an overview of an example G-PCC encoder 100, and FIG. 1B provides an overview of an example G-PCC decoder 150. In FIG. 1A and FIG. 1B, the analyze surface approximation 114, the RAHT 124, the synthesize surface approximation 160, and the RAHT 172 are options typically used for static data. The generate LOD 126, lifting 128, generate LOD 174, and inverse lifting 176 are options typically used for dynamically acquired data.
[0026] For dynamic PC data, the compressed geometry is typically represented as an octree from the root all the way down to a leaf level of individual voxels. For static PC data, the compressed geometry is typically represented by a pruned octree (i.e., an octree from the root down to a leaf level of blocks larger than voxels) plus a model that approximates the surface within each leaf of the pruned octree. In this way, both data types share the octree coding mechanism, while static data may in addition approximate the voxels within each leaf with a surface model. The surface model used is a triangulation comprising 1-10 triangles per block, resulting in a triangle soup. The static geometry codec is therefore known as the Trisoup geometry codec, while the dynamically acquired geometry codec is known as the Octree geometry codec.
[0027] There are 3 attribute coding methods in G-PCC: Region Adaptive Hierarchical Transform (RAHT) coding, interpolation-based hierarchical nearest-neighbor prediction (Predicting Transform), and interpolation-based hierarchical nearest-neighbor prediction with an update/lifting step (Lifting Transform). RAHT and Lifting are typically used for Category 1 data, while Predicting is typically used for Category 3 data. However, any of these methods may be used for any data, and, just like with the geometry codecs in G-PCC, the user has the option to choose which of the 3 attribute codecs they would like to use.
[0028] As further shown in FIG. 1A, the inputs to the G-PCC encoder 100 include positions 102 and attributes 104. Positions 102 are provided to transform coordinates 106 and to transfer attributes 120. Following coordinate transformation 106, points are quantized and removed (are voxelized) at 108, then the octree analyzed at 110, the output of which is provided to geometry reconstruction 112, analyze surface approximation 114 (the output of which is also provided to geometry reconstruction 112), and arithmetic encode 116 which also takes as input the output of analyze surface approximation 114. The geometry bitstream 134 output of the G-PCC encoder 100 is provided by arithmetic encode 116.
[0029] Attributes 104 are provided to transform colors 118, the output of which is provided to transfer attributes 120 which also takes as input the result of the geometry reconstruction 112. Switch 122 is configured to provide the result of transfer attributes 120, for example as input to the RAHT 124 and generate LOD 126. The result of the geometry reconstruction 112 is provided to the RAHT 124 and to the generate LOD 126, the output of which is provided to lifting 128. Coefficients are quantized at 130 that takes as input the result of the RAHT 124 and the lifting 128. The quantized coefficients 130 are provided to arithmetic encode 132 which then provides the attribute bitstream 136 as another output of the G-PCC encoder 100.
[0030] As further shown in FIG. 1B, the G-PCC decoder 150 takes as input the output of the G-PCC encoder 100, namely geometry bitstream 134 and attribute bitstream 136. Geometry bitstream 134 is provided to arithmetic decode 156, the output of which is provided to synthesize octree 158 and synthesize surface approximation 160, which also takes as input the result of the octree synthesizing 158. The result of synthesize surface approximation 160 is provided to geometry reconstruction 162, the output of which is provided to inverse transform coordinates 164, the RAHT 172, and to generate LOD 174. The result of the inverse transform coordinates 164 is positions 180, which is one of the outputs of the G-PCC decoder 150.
[0031] Attribute bitstream 136 is provided to arithmetic decode 166, the output of which is provided to inverse quantize 168. Switch 170 is configured to provide the output of inverse quantize 168 to the RAHT 172 and to the generate LOD 174, the output of which is provided to inverse lifting 176. The output of the RAHT 172 and the inverse lifting 176 is provided to inverse transform colors 178, the output of which is attributes 182, another one of the outputs of the G-PCC decoder 150.
[0032] The G-PCC encoder 100 and the G-PCC decoder 150 may be configured to implement the methods described herein related to improved bounding volumes compression for lossless G-PCC data compression.
[0033] Lossless compression of G-PCC data
[0034] The lossless compression of point clouds was studied and is currently under standardization under the G-PCC project of MPEG. The current G-PCC lossless method is based on octree representations, where a point cloud is represented as an octree that can be parsed from the root to the leaves; at each depth level in the tree one obtains a lossy reconstruction of the point cloud at a certain resolution, while the lossless reconstruction is obtained at the final resolution level in the octree. At each resolution level an octree node corresponds to a cube at a particular 3D location, and the octree node is labeled with a one if within that cube there is at least one final point of the point cloud. The encoding process is done by traversing the tree from root to leaves and specifying at each node its splitting pattern, where the occupancy pattern is a byte with one bit for each of the possible eight children, a 1 specifying that a child node is occupied. By traversing in a breadth-first way one can obtain a lossy reconstruction at each depth of the tree, while the lossless reconstruction is obtained at the final resolution. This octree method is attractive for providing a progressive-in-resolution reconstruction. On the other hand, it has the disadvantage that the lossless reconstruction is obtained by necessarily performing all encodings and decodings at each depth level of the tree. Very well engineered solutions were provided in G-PCC for conditional coding of each bit of the occupancy pattern at each node of the octree, exploiting the neighbors of the nodes in the octree by acquiring the needed context from the parent node and its 3D neighbor nodes, or from the already encoded children of the parent's neighbors.
One might look at the process of octree encoding as specifying at each resolution stage which cubes are occupied, or equivalently which cubes are non-occupied (which leads to exclusion of that part of the space from the next resolution's search for occupied sub-cubes).
[0035] Lossy and lossless compression of static and dynamically acquired point clouds using G-PCC can be improved to enhance coding performance. The examples described herein provide a method for bounding volumes (a.k.a. "Bounding Volumes") compression of G-PCC data.
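The octree occupancy coding described above can be sketched as follows. This is a minimal illustrative implementation of breadth-first occupancy-byte generation only; the function name and structure are hypothetical, and the actual G-PCC codec additionally context-codes each occupancy bit with an arithmetic coder.

```python
import numpy as np

def encode_octree_bytes(points, depth):
    """Breadth-first octree traversal emitting one 8-bit occupancy
    pattern per occupied node. `points` is an (N, 3) integer array with
    coordinates in [0, 2**depth). Illustrative sketch only."""
    occupancy_bytes = []
    level_nodes = [np.asarray(points)]   # all points start in the root cube
    for d in range(depth):
        shift = depth - 1 - d            # bit selecting the child octant
        next_nodes = []
        for node_pts in level_nodes:
            # Child index: one bit per axis -> value in 0..7.
            child_idx = (((node_pts[:, 0] >> shift) & 1) << 2 |
                         ((node_pts[:, 1] >> shift) & 1) << 1 |
                         ((node_pts[:, 2] >> shift) & 1))
            byte = 0
            for c in range(8):
                sel = child_idx == c
                if sel.any():
                    byte |= 1 << c       # child c is occupied
                    next_nodes.append(node_pts[sel])
            occupancy_bytes.append(byte)
        level_nodes = next_nodes
    return occupancy_bytes
```

For two diagonally opposite points in a depth-2 cube, the root byte marks children 0 and 7, and each child cube then contributes its own single-bit byte, giving the progressive-in-resolution behavior described above.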
[0036] In the bounding volumes compression approach introduced in [Finnish patent application number 20205300, Method for Geometry-Based Point Cloud Coding] all reconstructed points truly belonged to the point cloud, but some points, specifically some inner points in the transversal sections of the point cloud, were not encoded at all. Thus true lossless compression was not achievable.
[0037] The examples herein describe improved bounding volume compression for G-PCC data to allow for true lossless compression. Lossless compression is one of the aspects of the upcoming G-PCC standard.
[0038] The examples herein disclose a new point cloud lossless compression algorithm, based on bounding volumes for G-PCC [refer to I. Tabus, E. Kaya, S. Schwarz, "Successive Refinement of Bounding Volumes for Point Cloud Coding," IEEE International Workshop on Multimedia Signal Processing, MMSP 2020, Tampere, September 2020; also refer to Finnish patent application number 20205300, Method for Geometry-Based Point Cloud Coding]. As mentioned, in the bounding volumes approach all reconstructed points truly belonged to the point cloud, but some points, specifically some inner points in the transversal sections of the point cloud, were not encoded at all. The examples herein provide a complete lossless process that overlaps with the bounding volumes method in the first stage (encoding a front and a back projection), but diverges from the previous method already at the second stage, that of encoding the true boundary of the feasible region, making that stage less restrictive and removing the requirement of decomposing the point cloud into tubes having single connected component sections.
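The notion of a per-section feasible region (cf. FIG. 2B, where the lowest and highest known points at each column index are united by a vertical line) can be sketched as follows. This is a simplified illustration under the assumption that the section is given as a 2D boolean occupancy array; the function name is hypothetical.

```python
import numpy as np

def feasible_region(section):
    """Build the feasible-region mask for one transverse section:
    at each column index, fill the vertical span between the lowest
    and highest known points. Points outside the mask are excluded
    from further occupancy coding. Illustrative sketch only."""
    mask = np.zeros_like(section, dtype=bool)
    for col in range(section.shape[1]):
        rows = np.flatnonzero(section[:, col])  # occupied row indices
        if rows.size:
            mask[rows[0]:rows[-1] + 1, col] = True
    return mask
```

Only points inside the mask remain "unknown" after Stage I; everything outside is already known not to belong to the point cloud, which is the exclusion property the text relies on.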
[0039] An exclusion process similar to, or better than, that of the octree method is achieved by processing only the last-resolution representation of the point cloud, with the additional advantage that the context information is easier to interpret and operate with in 2D section images, and that important long-distance regularities of the point cloud can be exploited, e.g. regularities of the geodesic shapes in the projection images.
[0040] The disclosed encoding process includes additional primitives that finally provide a lossless reconstruction of the point cloud. The primitives are based on carefully selected context coding methods and make intensive use of the exclusion process, to avoid unnecessary testing for occupancy of locations already known. As an additional option, large connected components that cannot be predicted well from the already encoded voxels can be encoded using chain codes for their boundaries and context coding for their inner points.
[0041] The disclosed solution outperforms the current G-PCC solution in terms of compression efficiency by 4-9%. The examples herein describe an original lossless coding scheme, where all the coding inference is done for the representation at the final resolution of a given point cloud, without using representations at intermediate resolutions.
[0042] Consider a point cloud having resolution Nx, Ny, Nz along the axes x, y, z respectively, where shifts along the axes might first be applied to ensure that all points (x, y, z) in the point cloud have strictly positive coordinates. The points are encoded in three stages. In the first stage two projections of the point cloud are encoded: the front and the back projections along the Oz axis, see FIG. 6 and FIG. 7. These projections are two depthmap images, each with Nx × Ny pixels. After this stage, the coding progresses along the axis Oy of the 3D coordinate system, drawing transverse sections of the point cloud parallel to the plane zOx, and encoding losslessly each such section, in Stages II and III, using two major primitives. During encoding, efficient context models are utilized, relying on a model of the watertight outer surface of the point cloud. These primitives are utilized in the encoding process so as to exclude sets of non-occupied points as large as possible within the current section. The regularity of the geometric shapes, including the smoothness of the surface and the continuity of the edges, is exploited by using context models, where the occupancy probability of a point is determined by the occupancy of its neighbor points in 3D space, including the 2D neighbors from the current section and the 3D-neighbor points from the past section.
[0043] A. Stage I: Encoding a front and a back projection
[0044] The first encoding stage is intended for defining and encoding exclusion sets containing large parts of the space, and for that the maximal and minimal depthmaps are utilized in the plane Oxy, where z represents the depth (i.e., height above the Oxy plane). The minimal depth image, Zmin, has at the pixel (x, y) the value Zmin(x, y) equal to the minimum z for which (x, y, z) is a point of the original point cloud. The maximal depth image, Zmax, has at the pixel (x, y) the value Zmax(x, y) equal to the maximum z for which (x, y, z) is a point of the original point cloud. If no point (x, y, z) exists in the point cloud, then Zmax(x, y) = 0 and Zmin(x, y) = 0 are set. The encoding of these depthmaps is performed by CERV [refer to I. Tabus, I. Schiopu, J. Astola, "Context coding of depth map images under the piecewise-constant image model representation," IEEE Trans. Image Processing, 22:11, pp. 4195-4210, Nov. 2013], which encodes the geodesic lines using contexts very rich in information. After the two depthmaps are encoded, encoding starts, at each y0, of the transversal section through the point cloud, which is a 2D binary image of size Nz × Nx having ones at the coordinates (z, x) if (x, y0, z) is a point in the point cloud. Some of the voxels are already known to the decoder after Stage I, and two binary arrays of size Nz × Nx are initialized with them, one called the known points array K and one called the reconstructed points array R (at this initial stage the arrays are identical). Note that in the binary image R it is known at each column x which are the lowest, (Zmin(x, y0), x), and the highest, (Zmax(x, y0), x), occupied points (from the two already transmitted depthmaps). Consequently, the Nz × Nx binary image, F, of feasible regions is constructed by filling with 1 all elements F(z, x) with Zmin(x, y0) ≤ z ≤ Zmax(x, y0), i.e., the points in the column x that are possible to be occupied. All the points outside the feasible regions, i.e., where F(z, x) = 0, are thus known at this stage to not be feasible, so they will be excluded from all further testing for occupancy. For that, K(z, x) = 1 is set whenever F(z, x) = 0, i.e., the occupancy status of those points is already known. The true points in the section, which need to be losslessly encoded, are marked in the Nz × Nx binary image denoted T, and the true points in the past section (the section at the plane y = y0 − 1, which is already known to the decoder) are marked in the Nz × Nx binary image denoted P.

[0045] The original point cloud needs to be encoded losslessly, so that the reconstruction at the decoder is identical to the original. This original point cloud is called the "ground truth". The method proceeds by sequentially reconstructing the ground truth. The points of the ground truth point cloud are also called "true points". During the reconstruction algorithm the encoding primitives are used in order to tell if a point in space (or in the current section at y = y0) is a true point or not.
[0046] B. Stage II: Encoding the true points on the boundary of the feasible region
[0047] In the second encoding stage, processing starts of the set of feasible points from the image F, obtained in light of the information encoded in Stage I. The points on the boundary of the feasible region are marked as ones in the binary image B (the points (z, x) for which at least one of F(z − 1, x), F(z + 1, x), F(z, x − 1), F(z, x + 1) is zero). In Stage II, the status of the unknown points from the set of points of B that are marked as ones is transmitted to the decoder. All points of B are traversed in a given order (for example, they are scanned row-wise) and the occupancy status is encoded for each of them for which K(z, x) = 0 was set in the image K (i.e., the status of (z, x) is not yet known). When encoding each point, the context of the point is utilized, using a context collection function described in the next section. If B(z, x) = 1 and K(z, x) = 0, the symbol T(z, x) is encoded into the arithmetic code stream, and then the reconstructed and known binary images are set as R(z, x) = T(z, x) and K(z, x) = 1, respectively. The probability distribution used in the arithmetic coder is stored and updated in the counts array C(χ, 0), C(χ, 1), indexed by the context χ. There are multiple ways to select the context, one of them being presented in Section II-D.
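The boundary image B and the per-context counts of Stage II can be sketched as follows. ContextModel and its Laplace-smoothed estimate are illustrative stand-ins for the arithmetic coder's probability model, whose exact form the text does not spell out.

```python
def boundary(f):
    # B(z, x) = 1 where F(z, x) = 1 but at least one 4-neighbor
    # (up/down/left/right) is 0 or falls outside the image.
    nz, nx = len(f), len(f[0])
    b = [[0] * nx for _ in range(nz)]
    for z in range(nz):
        for x in range(nx):
            if not f[z][x]:
                continue
            for zz, xx in ((z - 1, x), (z + 1, x), (z, x - 1), (z, x + 1)):
                if not (0 <= zz < nz and 0 <= xx < nx and f[zz][xx]):
                    b[z][x] = 1
                    break
    return b


class ContextModel:
    """Adaptive counts C(ctx, 0), C(ctx, 1) with add-one smoothing.

    A real codec feeds p_one(ctx) to an arithmetic coder; here the counts
    alone illustrate how the coding distribution is stored and updated.
    """

    def __init__(self):
        self.counts = {}

    def p_one(self, ctx):
        c0, c1 = self.counts.get(ctx, (1, 1))
        return c1 / (c0 + c1)

    def update(self, ctx, bit):
        c0, c1 = self.counts.get(ctx, (1, 1))
        self.counts[ctx] = (c0 + (bit == 0), c1 + (bit == 1))
```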
[0048] Thus, the feasible region is determined after Stage I (where the first set of points was encoded by the two depthmaps). The feasible region consists, in light of the already transmitted information, of the points that are possible to be true points (they may or may not be true points), while the points outside the feasible region are known to not be true points.
[0049] C. Stage III: Encoding the status of the inner points inside the feasible region
[0050] In the image R, the already recovered true points belonging to the boundary of the feasible region form a set Γ of pixels that are set to 1, possibly formed of several connected components. A morphological dilation of the set Γ is performed, using the 3×3 square as structural element. The obtained set of points is traversed along the rows of the 2D image and stored in a list of pixels. The binary image of known pixels, K, is updated to contain all points with F(z, x) = 0, since the points outside the feasible region are known to be 0 (non-occupied). A binary image M is also initialized to store the marked-pixels information. After this initialization step, the list of pixels is processed sequentially, starting from the top of the list, processing a current pixel (z, x). Both the encoder and decoder check if K(z, x) = 1, and if yes, the point is removed from the list, since its occupancy status is already known. Otherwise, if K(z, x) = 0, the value of T(z, x) is transmitted using arithmetic coding with the coding distribution stored at the context χ. After that, the counts of symbols associated to the context χ are updated. Then the pixel is marked as known, K(z, x) = 1, and the reconstructed image is updated as R(z, x) = T(z, x). If the value T(z, x) = 1, each neighbor (z0, x0) of (z, x) in 8-connectivity is considered one by one, and if its occupancy status is not known, i.e., if K(z0, x0) = 0, and if it is not marked yet, i.e., if M(z0, x0) = 0, then the neighbor (z0, x0) is added at the bottom of the list. After inclusion of (z0, x0) in the list, its marked status is set to 1, M(z0, x0) = 1. The processing of the list continues with the topmost entry that is not processed yet, until the list becomes empty. At the end, all voxels have been encoded that are connected to the boundary of the feasible region by a path of connected pixels (in 8-connectivity).
When all sections through the point cloud are finished, all voxels have been encoded that are connected to the voxels contained in the initial two depthmap images. This shell of voxels contains the majority of the voxels in the point cloud. The remaining voxels are encoded by processing additional shells, as described in Section E.
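A minimal sketch of the Stage III list processing, assuming plain Python lists for the images T and K. trace_inner and the returned decision list are hypothetical names; the call to an actual arithmetic coder is replaced by recording (z, x, T(z, x)) tuples, and the seeds would come from dilating the boundary true points recovered in Stage II.

```python
from collections import deque


def trace_inner(seeds, t, k):
    # t: ground-truth section image; k: known-status image (1 = decided).
    # Returns the (z, x, value) decisions in visiting order; a real codec
    # would arithmetic-code each value under its context instead.
    nz, nx = len(t), len(t[0])
    marked = [[0] * nx for _ in range(nz)]   # the image M in the text
    queue = deque()
    for z, x in seeds:
        if not marked[z][x]:
            marked[z][x] = 1
            queue.append((z, x))
    coded = []
    while queue:
        z, x = queue.popleft()
        if k[z][x]:
            continue                          # status already known: skip
        k[z][x] = 1
        coded.append((z, x, t[z][x]))
        if t[z][x]:                           # occupied pixels grow the front
            for dz in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    z0, x0 = z + dz, x + dx
                    if (0 <= z0 < nz and 0 <= x0 < nx
                            and not k[z0][x0] and not marked[z0][x0]):
                        marked[z0][x0] = 1    # M(z0, x0) = 1
                        queue.append((z0, x0))
    return coded
```

Because only true points enqueue their 8-neighbors, the traversal reaches exactly the pixels connected (in 8-connectivity) to the seed set, mirroring the exclusion behavior described above.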
[0051] Encoding the status of the inner points inside the feasible region may also be conceptually referred to as tracing the inner points inside the feasible region.
[0052] D. Normalized contexts
[0053] One of the most efficient ways of utilizing the contextual information needed in Stage II and Stage III is described here. Consider that T(z, x) needs to be encoded. The first part of the context uses the values of the already reconstructed pixels that are 8-neighbors of (z, x), and additionally the information about which of those pixels are already known (note that R and K are distinct images after Stage I). The values of the pixels in the Nz × Nx ternary image Rk = R + K have the following significance: Rk(z, x) = 0 if the value of T(z, x) is not known yet; Rk(z, x) = 1 if the value of T(z, x) was encoded and T(z, x) = 0; Rk(z, x) = 2 if the value of T(z, x) was encoded and T(z, x) = 1. Consider first the 3×3 square centered at (z, x), cropped from the image R + K, which is denoted as a 3×3 matrix A. The elements of A belong by construction to {0, 1, 2}. The second part of the context is the 3×3 binary matrix B formed from a 3×3 template located at (z, x), selected from the binary image P of past true pixels. The information from A and B is used to form the context. One illustration of such a process is shown, but many similar processes can be considered. For example, scanning A by columns into the elements a_1, ..., a_9, a one-to-one correspondence is observed between A and I(A) = Σ_{i=1}^{9} a_i 3^(i−1). Similarly, scanning B by columns into b_1, ..., b_9, there is a one-to-one correspondence between B and J(B) = Σ_{i=1}^{9} b_i 2^(i−1). These are combined into a context label Context = (I(A), J(B)), e.g. by concatenating the binary representations of I(A) and of J(B).
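The one-to-one mappings I(A) and J(B) amount to reading the two neighborhoods as a base-3 and a base-2 number, respectively. A sketch, where the column-scan digit order is an assumption (any fixed bijection would serve):

```python
def context_label(a, b):
    # a: 3x3 neighborhood from R + K (values 0..2); b: 3x3 binary
    # neighborhood from the past-section image P.
    ia = jb = 0
    i = 0
    for col in range(3):          # scan by columns, as in the text
        for row in range(3):
            ia += a[row][col] * 3 ** i   # ternary digit of I(A)
            jb += b[row][col] * 2 ** i   # binary digit of J(B)
            i += 1
    return ia * 512 + jb          # concatenate: 2^9 = 512 values of J(B)
```

Concatenating the two indices yields a single integer context usable to address the counts array C(χ, 0), C(χ, 1).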
[0054] The context information in A and B is further normalized in the following way. A context collapsing operation is performed, so that if each of the images R, T and K is rotated by α ∈ {0, π/2, 2π/2, 3π/2} around the pixel (z, x), the value of the resulting normalized context is the same. Consider first the 3×3 matrix A. Apply the rotation by α around the middle pixel and denote by Aα the resulting 3×3 matrix. Compute, for each α ∈ {0, π/2, 2π/2, 3π/2}, the matrix Aα and its weighted score W(Aα) (favoring for example the values near the element (0, 0)), and pick as canonical rotation the α* for which the weighted score W(Aα) is the largest. Hence, the four rotated matrices Aα with α ∈ {0, π/2, 2π/2, 3π/2} are represented only by Aα*. This process of collapsing the four matrices into a single one is performed off-line once, resulting in a tabulated mapping A → α*(A) and another mapping A → Aα*, which realize the mapping of each context to its canonical one, stored in look-up tables. As an example of the weighting score W(A), consider the vector v = [A00 A01 A10 A02 A11 A20 A12 A21 A22] and form W(A) = Σ_{i=1}^{9} 3^(9−i) v_i,
[0055] The normalized context is found in the following way. At each point (z, x) the matrix A is formed from R + K and the canonical rotation index α* for this matrix is computed (or picked from the look-up table, at the address formed from the 9 elements of A). Also the corresponding rotated matrix Aα* is computed. The second part of the context is the 3×3 matrix B formed from P at (z, x). The matrix B is rotated by the previously determined α* around its center, obtaining a matrix Bα*. Now the context to be used for encoding T(z, x) is constructed from Aα* and Bα* as Context = χ = (I(Aα*), J(Bα*)), e.g. by concatenating the binary representations of I(Aα*) and of J(Bα*).
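The canonical-rotation normalization can be sketched as below. The anti-diagonal base-3 score in weight() is one plausible reading of the (image-only) W(A) formula; any score favoring the (0, 0) corner works. In practice the mappings would be tabulated off-line into look-up tables rather than recomputed per pixel, as the text notes.

```python
def rotate90(m):
    # Quarter-turn rotation of a 3x3 matrix around its center.
    return [[m[c][2 - r] for c in range(3)] for r in range(3)]


def weight(m):
    # Read the anti-diagonal scan [A00 A01 A10 A02 A11 A20 A12 A21 A22]
    # as a base-3 number with A00 most significant, so elements near the
    # (0, 0) corner dominate the score (an assumed concretization of W).
    order = [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1),
             (2, 0), (1, 2), (2, 1), (2, 2)]
    w = 0
    for r, c in order:
        w = w * 3 + m[r][c]
    return w


def canonical(a, b):
    # Try the four rotations of A, keep the one with the largest score,
    # and apply the same rotation to B: the context becomes invariant to
    # rotating the section images around (z, x).
    best_a, best_b, best_w = a, b, weight(a)
    ra, rb = a, b
    for _ in range(3):
        ra, rb = rotate90(ra), rotate90(rb)
        w = weight(ra)
        if w > best_w:
            best_a, best_b, best_w = ra, rb, w
    return best_a, best_b
```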
[0056] Table 1 below shows the rate in bits per voxel (BPV) of the proposed codec compared to MPEG G-PCC TMC13 (build of 26.2.2020).
Table 1
[Table 1 is provided as an image in the original document; the per-sequence BPV figures are not reproduced in this text version.]
[0057] E. Repetitive peeling-off process for complete reconstruction of complex point clouds
[0058] The reconstruction obtained from the disclosed method contains all the voxels forming the outer surfaces of the point cloud and all the inner voxels connected to these outer surface voxels, i.e., all voxels that are connected by a path in the 3D space (connected in the 26-voxel connectivity), to the initial voxels recovered in Stage I from the two depthmap images. However, there are very complex point clouds, for example those representing a building and all furniture and objects inside, where some objects are not connected by a 3D path to the outermost voxels. In that case one can repeat the encoding process shell by shell, in a peeling-off operation, where first the outermost shell is encoded, defined by the voxels represented in the front and back projections and all voxels connected to these voxels, and then the same process is reapplied to the remaining un-encoded voxels. If needed, this peeling-off process can be applied several times.
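The peeling-off loop can be sketched with a toy stand-in for one pass of the pipeline. Here each "shell" is simply the 26-connected component of the currently outermost voxel, whereas the real Stages I-III would encode every voxel reachable from the two depthmaps; peel_shells and component26 are illustrative names, not from the patent.

```python
from collections import deque


def component26(voxels, seed):
    # All voxels connected to seed by a path in 26-voxel connectivity.
    comp, queue = {seed}, deque([seed])
    while queue:
        x, y, z = queue.popleft()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    n = (x + dx, y + dy, z + dz)
                    if n in voxels and n not in comp:
                        comp.add(n)
                        queue.append(n)
    return comp


def peel_shells(voxels):
    # Encode one shell, remove it, and repeat on the remaining voxels
    # until every voxel belongs to some shell.
    shells, remaining = [], set(voxels)
    while remaining:
        seed = min(remaining, key=lambda v: v[2])   # an outermost voxel
        shell = component26(remaining, seed)
        shells.append(shell)
        remaining -= shell
    return shells
```

Objects with no 3D path to the outer surface simply end up in a later shell, which is exactly the situation (furniture inside a building) that motivates the repetition.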
[0059] FIG. 2A shows the points in the section y=132 of the "longdress" point cloud, obtained after stage I (after encoding the two depthmaps). FIG. 2B shows the feasible region points, obtained by uniting at each column index the lowest and the highest points by a vertical line.
[0060] FIG. 3A shows the points in the section y=132 of the "longdress" point cloud obtained after stage II. In FIG. 3A, in white are shown the points recovered from Stage I (the same as in FIG. 2A) and in gray are shown the points recovered in Stage II, when encoding all true points from the boundary of the feasible region. FIG. 3B shows all the points of the section with y=132 of the "longdress" point cloud encoded after Stage III. In FIG. 3B, in white are shown all true points recovered after Stage II, while in gray are shown the true points added at Stage III.
[0061] The benefits and technical effects of the examples described herein include lossless compression of G-PCC data, and outperforming the current G-PCC solution in terms of compression efficiency by 7-11% for full-body 8i sequences.
[0062] FIG. 4 is an apparatus 200 which may be implemented in hardware, configured to implement improved bounding volumes compression for lossless G-PCC data compression, based on the examples described herein. The apparatus comprises a processor 202, at least one memory 204 (e.g. non-transitory memory or transitory memory) including computer program code 205, wherein the at least one memory 204 and the computer program code 205 are configured to, with the at least one processor 202, cause the apparatus to utilize circuitry configured to implement improved bounding volumes compression for lossless G-PCC data compression, based on the examples described herein.
[0063] The computer program code 205 includes projection encode 206 configured at least to encode a front projection of a point cloud as a first depthmap image, and encode a back projection of the point cloud as a second depthmap image. The computer program code 205 further includes feasible region encode 208 configured at least to encode an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of the at least one feasible region are known already to not belong to the point cloud. The inner points encode 210 is configured to encode an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
[0064] The apparatus 200 is further configured, using the projection encode 206 and/or the feasible region encode 208, to construct an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied, to mark one or more points on the boundary of the image of the at least one feasible region, to encode an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region, and to transmit to a decoder (such as decode 216) the occupancy status of the unknown points, wherein the encoding of the occupancy status of the unknown points is performed utilizing a context 212 of the unknown points. The apparatus 200 is further configured, using the outer shell encode 214, to encode at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell. Outer shell encode 214 is the implementation of the repetitive peeling-off process described herein in Section E. Interface 222 is configured to facilitate communication between the encoding and decoding items, as shown, and may be a bus, a function call, or another programming or hardware interface.
[0065] The apparatus 200 includes a display and/or I/O interface 218 that may be used to display an output (e.g., a reconstructed point cloud) of a result of the encoding and/or decoding. The display and/or I/O interface 218 may also be configured to receive input such as an input point cloud or user input. The apparatus 200 also includes one or more network (NW) interfaces (I/F(s)) 220. The NW I/F(s) 220 may be wired and/or wireless and communicate over the Internet/other network(s) via any communication technique. The NW I/F(s) 220 may comprise one or more transmitters and one or more receivers. The NW I/F(s) 220 may comprise standard well-known components such as an amplifier, filter, frequency-converter, (de)modulator, encoder/decoder circuitry(ies), and one or more antennas.
[0066] The apparatus 200 may be a remote, virtual or cloud apparatus. The apparatus 200 may be either a coder or a decoder, or both a coder and a decoder. The memory 204 may be implemented using any suitable data storage technology, such as semiconductor based memory devices, flash memory, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory. The memory 204 may comprise a database for storing data. Interface 224 is configured to facilitate communication between items of apparatus 200, as shown in FIG. 4, and may be a bus or other software or hardware interface. The apparatus 200 need not comprise each of the features mentioned, or may comprise other features as well. The apparatus 200 may be an example of the G-PCC encoder 100 and/or the G-PCC decoder 150.
[0067] FIG. 5 is an example method 300 to implement improved bounding volumes compression for lossless G-PCC data compression, based on the examples described herein. At 302, the method includes encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image. At 304, the method includes encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud. At 306, the method includes encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud. Method 300 may be implemented using the apparatus 200.
[0068] Only when all feasible regions in the section are known is it possible to claim that the points outside all of them are known to not be true points (this is not true if only one feasible region is known and there are several of them in the section). A feasible region is a set of points that form a connected component (i.e., each of its points is connected in 8-connectivity with at least one other point from the set), and hence there may be many such feasible regions in the section (for example, in FIG. 2B there are two such connected components, namely items 101 and 103).
[0069] FIG. 6 shows two depthmaps transmitted in Stage I, showing the maximal depth image Zmax on the left (402), and the minimal depth image Zmin on the right (404). FIG. 7 shows the lossy reconstruction of the point cloud, after Stage I, only from the minimal and maximal points transmitted with the depthmaps.
[0070] In FIG. 8A, FIG. 8B, FIG. 8C, and FIG. 8D, the origin and coordinate axes Oxyz are drawn. As shown in FIG. 8A, the original point cloud 502 to be encoded is processed in Stages II and III by iterating over the coordinate y, drawing a plane y = y0 at each y0, and interpreting this section as a binary image. At the intersection of the plane y = y0 with the point cloud, the ground truth points in the section are obtained, represented in FIG. 8B. FIG. 8C shows the points in the section y = y0 which were transmitted in Stage I, by encoding two depthmaps with the CERV method. FIG. 8D shows the points that remain to be encoded in Stages II and III.
[0071] References to a 'computer', 'processor', etc. should be understood to encompass not only computers having different architectures such as single/multi-processor architectures and sequential (Von Neumann)/parallel architectures but also specialized circuits such as field-programmable gate arrays (FPGAs), application specific circuits (ASICs), signal processing devices and other processing circuitry. References to computer program, instructions, code etc. should be understood to encompass software for a programmable processor or firmware such as, for example, the programmable content of a hardware device such as instructions for a processor, or configuration settings for a fixed-function device, gate array or programmable logic device, etc.
[0072] As used herein, the term 'circuitry' may refer to any of the following: (a) hardware circuit implementations, such as implementations in analog and/or digital circuitry, and (b) combinations of circuits and software (and/or firmware), such as (as applicable): (i) a combination of processor(s) or (ii) portions of processor(s)/software including digital signal processor(s), software, and memory(ies) that work together to cause an apparatus to perform various functions, and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present. This description of 'circuitry' applies to uses of this term in this application. As a further example, as used herein, the term 'circuitry' would also cover an implementation of merely a processor (or multiple processors) or a portion of a processor and its (or their) accompanying software and/or firmware. The term 'circuitry' would also cover, for example and if applicable to the particular element, a baseband integrated circuit or applications processor integrated circuit for a mobile phone or a similar integrated circuit in a server, a cellular network device, or another network device.
[0073] An example apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: encode a front projection of a point cloud as a first depthmap image, and encode a back projection of the point cloud as a second depthmap image; encode an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encode an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
[0074] Other aspects of the apparatus may include the following. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: encode at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell. The front and back projections of the point cloud may be projections along an axis; and the front and back projections may be encoded using crack-edge-region-value. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: encode the front and back projections along an axis of a three-dimensional coordinate system; draw transverse sections of the point cloud parallel to a plane of the three-dimensional coordinate system; and encode losslessly the transverse sections. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: construct an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied; mark one or more points on the boundary of the image of the at least one feasible region; encode an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region; and transmit to a decoder the occupancy status of the unknown points; wherein the encoding of the occupancy status of the unknown points is performed utilizing a context of the unknown points. 
The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: store a probability distribution of the occupancy status of the unknown points, where the probability distribution is based on neighboring points and is used in an arithmetic coder. The context of the unknown points may be a normalized context using information from: a matrix generated from cropping a sum of a known points array and a reconstructed points array based respectively on the front projection and back projection; and a matrix formed from an image comprising the true points on the boundary of the at least one feasible region of the point cloud. The at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: rotate the matrix generated from cropping the sum of the known points array and the reconstructed points array; determine, during the rotation, a weighted score of the elements of the matrix generated from cropping the sum of the known points array and the reconstructed points array, where elements close to one of the corners are given a higher score; and rotate the matrix formed from the image comprising the true points on the boundary of the at least one feasible region of the point cloud, based on the weighted score. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: repeat the encoding of the at least one outermost shell. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: obtain the set of points of the at least one feasible region after uniting at each column index a lowest and highest set of points of the at least one feasible region with a vertical line. 
The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: transmit the encoded point cloud to a decoder. The axis may be an Oz axis. The axis may be an Oy axis. The plane may be a zOx plane.
[0075] An example apparatus includes means for encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; means for encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and means for encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
[0076] The apparatus may further include means for encoding at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell.
[0077] The apparatus may further include wherein: the front and back projections of the point cloud are projections along an axis; and the front and back projections are encoded using crack-edge-region-value.
[0078] The apparatus may further include means for encoding the front and back projections along an axis of a three-dimensional coordinate system; means for drawing transverse sections of the point cloud parallel to a plane of the three-dimensional coordinate system; and means for encoding losslessly the transverse sections.
[0079] The apparatus may further include means for constructing an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied; means for marking one or more points on the boundary of the image of the at least one feasible region; means for encoding an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region; and means for transmitting to a decoder the occupancy status of the unknown points; wherein the encoding of the occupancy status of the unknown points is performed utilizing a context of the unknown points.
[0080] The apparatus may further include means for storing a probability distribution of the occupancy status of the unknown points, where the probability distribution is based on neighboring points and is used in an arithmetic coder.
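A minimal sketch of such a stored, context-conditioned probability model follows. The per-context counter pair and Laplace smoothing are illustrative choices for how a probability distribution over the occupancy bit could be maintained and handed to an arithmetic coder; the class and method names are hypothetical, not taken from the specification.

```python
class ContextModel:
    """Adaptive per-context probability of the occupancy bit (sketch).

    One pair of counters per context, Laplace-smoothed, so prob_one() can
    be given to an arithmetic coder as the model's estimate for the next
    unknown point seen in that context.
    """

    def __init__(self):
        self.counts = {}  # context key -> [zeros seen, ones seen]

    def prob_one(self, context):
        c0, c1 = self.counts.get(context, (1, 1))  # smoothed prior
        return c1 / (c0 + c1)

    def update(self, context, bit):
        pair = self.counts.setdefault(context, [1, 1])
        pair[bit] += 1
```

Each neighbourhood pattern around an unknown boundary point would map to a context key; the coder codes the bit with `prob_one(key)` and then calls `update(key, bit)`, so the encoder and decoder models stay synchronized.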
[0081] The apparatus may further include wherein the context of the unknown points is a normalized context using information from: a matrix generated from cropping a sum of a known points array and a reconstructed points array based respectively on the front projection and back projection; and a matrix formed from an image comprising the true points on the boundary of the at least one feasible region of the point cloud.
[0082] The apparatus may further include means for rotating the matrix generated from cropping the sum of the known points array and the reconstructed points array; means for determining, during the rotation, a weighted score of the elements of the matrix generated from cropping the sum of the known points array and the reconstructed points array, where elements close to one of the corners are given a higher score; and means for rotating the matrix formed from the image comprising the true points on the boundary of the at least one feasible region of the point cloud, based on the weighted score.
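The rotation step above can be sketched as selecting, among the four 90-degree rotations of the cropped known/reconstructed matrix, the one maximizing a corner-weighted score, then applying the same rotation to the companion boundary matrix. The particular weight function below (decaying with distance from the top-left corner, square crop assumed) is an illustrative stand-in for the "elements close to one of the corners are given a higher score" rule.

```python
import numpy as np

def canonical_rotation(known_mask):
    """Return k in 0..3: the 90-degree rotation count that concentrates
    known elements toward the top-left corner (square input assumed)."""
    n = known_mask.shape[0]
    idx = np.arange(n)
    # weights decay with distance from the top-left corner (assumption)
    weights = 1.0 / (1.0 + idx[:, None] + idx[None, :])
    scores = [(np.rot90(known_mask, k) * weights).sum() for k in range(4)]
    return int(np.argmax(scores))

def normalize_pair(known_mask, boundary_mask):
    """Rotate both matrices by the rotation chosen from the known mask,
    so the context fed to the coder is orientation-normalized."""
    k = canonical_rotation(known_mask)
    return np.rot90(known_mask, k), np.rot90(boundary_mask, k)
```

Because the rotation is derived from data the decoder has already reconstructed, the decoder can repeat the same choice without side information.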
[0083] The apparatus may further include means for repeating the encoding of the at least one outermost shell.
[0084] The apparatus may further include means for obtaining the set of points of the at least one feasible region after uniting at each column index a lowest and highest set of points of the at least one feasible region with a vertical line.
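The "uniting with a vertical line" step can be read as filling, in each column, the whole segment between the lowest and highest marked points. An illustrative sketch (function name assumed):

```python
import numpy as np

def unite_with_vertical_lines(boundary):
    """Fill, in every column, the segment between the lowest and highest
    marked points, yielding a candidate feasible-region image."""
    filled = np.zeros_like(boundary, dtype=bool)
    for j in range(boundary.shape[1]):
        rows = np.flatnonzero(boundary[:, j])
        if rows.size:
            filled[rows[0]:rows[-1] + 1, j] = True  # vertical segment
    return filled
```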
[0085] The apparatus may further include means for transmitting the encoded point cloud to a decoder.
[0086] The apparatus may further include wherein the axis is an Oz axis.
[0087] The apparatus may further include wherein the axis is an Oy axis.
[0088] The apparatus may further include wherein the plane is a zOx plane.
[0089] An example method includes encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
[0090] The method may further include encoding at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell.
[0091] The method may further include wherein: the front and back projections of the point cloud are projections along an axis; and the front and back projections are encoded using crack-edge-region-value.
[0092] The method may further include encoding the front and back projections along an axis of a three-dimensional coordinate system; drawing transverse sections of the point cloud parallel to a plane of the three-dimensional coordinate system; and encoding losslessly the transverse sections.
[0093] The method may further include constructing an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied; marking one or more points on the boundary of the image of the at least one feasible region; encoding an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region; and transmitting to a decoder the occupancy status of the unknown points; wherein the encoding of the occupancy status of the unknown points is performed utilizing a context of the unknown points.
[0094] The method may further include storing a probability distribution of the occupancy status of the unknown points, where the probability distribution is based on neighboring points and is used in an arithmetic coder.
[0095] The method may further include wherein the context of the unknown points is a normalized context using information from: a matrix generated from cropping a sum of a known points array and a reconstructed points array based respectively on the front projection and back projection; and a matrix formed from an image comprising the true points on the boundary of the at least one feasible region of the point cloud.
[0096] The method may further include rotating the matrix generated from cropping the sum of the known points array and the reconstructed points array; determining, during the rotation, a weighted score of the elements of the matrix generated from cropping the sum of the known points array and the reconstructed points array, where elements close to one of the corners are given a higher score; and rotating the matrix formed from the image comprising the true points on the boundary of the at least one feasible region of the point cloud, based on the weighted score.
[0097] The method may further include repeating the encoding of the at least one outermost shell.
[0098] The method may further include obtaining the set of points of the at least one feasible region after uniting at each column index a lowest and highest set of points of the at least one feasible region with a vertical line.
[0099] The method may further include transmitting the encoded point cloud to a decoder.
[00100] The method may further include wherein the axis is an Oz axis.
[00101] The method may further include wherein the axis is an Oy axis.
[00102] The method may further include wherein the plane is a zOx plane.
[00103] An example non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations is provided, the operations comprising: encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
[00104] The operations of the non-transitory program storage device may further include encoding at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell.
[00105] The non-transitory program storage device may further include wherein: the front and back projections of the point cloud are projections along an axis; and the front and back projections are encoded using crack-edge-region-value.
[00106] The operations of the non-transitory program storage device may further include encoding the front and back projections along an axis of a three-dimensional coordinate system; drawing transverse sections of the point cloud parallel to a plane of the three-dimensional coordinate system; and encoding losslessly the transverse sections.
[00107] The operations of the non-transitory program storage device may further include constructing an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied; marking one or more points on the boundary of the image of the at least one feasible region; encoding an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region; and transmitting to a decoder the occupancy status of the unknown points; wherein the encoding of the occupancy status of the unknown points is performed utilizing a context of the unknown points.
[00108] The operations of the non-transitory program storage device may further include storing a probability distribution of the occupancy status of the unknown points, where the probability distribution is based on neighboring points and is used in an arithmetic coder.
[00109] The non-transitory program storage device may further include wherein the context of the unknown points is a normalized context using information from: a matrix generated from cropping a sum of a known points array and a reconstructed points array based respectively on the front projection and back projection; and a matrix formed from an image comprising the true points on the boundary of the at least one feasible region of the point cloud.
[00109] The operations of the non-transitory program storage device may further include rotating the matrix generated from cropping the sum of the known points array and the reconstructed points array; determining, during the rotation, a weighted score of the elements of the matrix generated from cropping the sum of the known points array and the reconstructed points array, where elements close to one of the corners are given a higher score; and rotating the matrix formed from the image comprising the true points on the boundary of the at least one feasible region of the point cloud, based on the weighted score.

[00111] The operations of the non-transitory program storage device may further include repeating the encoding of the at least one outermost shell.
[00112] The operations of the non-transitory program storage device may further include obtaining the set of points of the at least one feasible region after uniting at each column index a lowest and highest set of points of the at least one feasible region with a vertical line.
[00113] The operations of the non-transitory program storage device may further include transmitting the encoded point cloud to a decoder.
[00114] The non-transitory program storage device may further include wherein the axis is an Oz axis.
[00115] The non-transitory program storage device may further include wherein the axis is an Oy axis.
[00116] The non-transitory program storage device may further include wherein the plane is a zOx plane.
[00117] It should be understood that the foregoing description is only illustrative. Various alternatives and modifications may be devised by those skilled in the art. For example, features recited in the various dependent claims could be combined with each other in any suitable combination(s). In addition, features from different embodiments described above could be selectively combined into a new embodiment. Accordingly, the description is intended to embrace all such alternatives, modifications and variances which fall within the scope of the appended claims.

[00118] The following acronyms and abbreviations that may be found in the specification and/or the drawing figures are defined as follows:
2D two dimensional
3D three-dimensional
3DG three-dimensional graphics team
ASIC application-specific integrated circuit
BPV bits per vertex
CERV crack-edge-region-value
FPGA field-programmable gate array
G-PCC or GPCC geometry-based point cloud coding/compression
IEC International Electrotechnical Commission
I/F interface
I/O input/output
ISO International Organization for Standardization
JTC joint technical committee
LOD level of detail
MPEG moving picture experts group
NW network
O origin
PC point cloud
RAHT region adaptive hierarchical transform
SC subcommittee
TMC13 test model for categories 1 and 3
WG working group
x coordinate axis x
y coordinate axis y
z coordinate axis z

Claims

CLAIMS

What is claimed is:
1. An apparatus comprising: means for encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; means for encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and means for encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
2. The apparatus of claim 1, further comprising: means for encoding at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell.
3. The apparatus of any one of claims 1 to 2, wherein: the front and back projections of the point cloud are projections along an axis; and the front and back projections are encoded using crack-edge-region-value.
4. The apparatus of any one of claims 1 to 3, further comprising: means for encoding the front and back projections along an axis of a three-dimensional coordinate system; means for drawing transverse sections of the point cloud parallel to a plane of the three-dimensional coordinate system; and means for encoding losslessly the transverse sections.
5. The apparatus of any one of claims 1 to 4, further comprising: means for constructing an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied; means for marking one or more points on the boundary of the image of the at least one feasible region; means for encoding an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region; and means for transmitting to a decoder the occupancy status of the unknown points; wherein the encoding of the occupancy status of the unknown points is performed utilizing a context of the unknown points.
6. The apparatus of claim 5, further comprising: means for storing a probability distribution of the occupancy status of the unknown points, where the probability distribution is based on neighboring points and is used in an arithmetic coder.
7. The apparatus of any one of claims 5 to 6, wherein the context of the unknown points is a normalized context using information from: a matrix generated from cropping a sum of a known points array and a reconstructed points array based respectively on the front projection and back projection; and a matrix formed from an image comprising the true points on the boundary of the at least one feasible region of the point cloud.
8. The apparatus of claim 7, further comprising: means for rotating the matrix generated from cropping the sum of the known points array and the reconstructed points array; means for determining, during the rotation, a weighted score of the elements of the matrix generated from cropping the sum of the known points array and the reconstructed points array, where elements close to one of the corners are given a higher score; and means for rotating the matrix formed from the image comprising the true points on the boundary of the at least one feasible region of the point cloud, based on the weighted score.
9. The apparatus of any one of claims 2 to 8, further comprising: means for repeating the encoding of the at least one outermost shell.
10. The apparatus of any one of claims 1 to 9, further comprising: means for obtaining the set of points of the at least one feasible region after uniting at each column index a lowest and highest set of points of the at least one feasible region with a vertical line.
11. The apparatus of any one of claims 1 to 10, further comprising: means for transmitting the encoded point cloud to a decoder.
12. The apparatus of any one of claims 3 to 11, wherein the axis is an Oz axis.
13. The apparatus of any one of claims 4 to 12, wherein the axis is an Oy axis.
14. The apparatus of any one of claims 4 to 13, wherein the plane is a zOx plane.
15. An apparatus comprising: at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: encode a front projection of a point cloud as a first depthmap image, and encode a back projection of the point cloud as a second depthmap image; encode an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encode an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
16. The apparatus of claim 15, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: encode at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell.
17. The apparatus of any one of claims 15 to 16, wherein: the front and back projections of the point cloud are projections along an axis; and the front and back projections are encoded using crack-edge-region-value.
18. The apparatus of any one of claims 15 to 17, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: encode the front and back projections along an axis of a three-dimensional coordinate system; draw transverse sections of the point cloud parallel to a plane of the three-dimensional coordinate system; and encode losslessly the transverse sections.
19. The apparatus of any one of claims 15 to 18, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: construct an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied; mark one or more points on the boundary of the image of the at least one feasible region; encode an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region; and transmit to a decoder the occupancy status of the unknown points; wherein the encoding of the occupancy status of the unknown points is performed utilizing a context of the unknown points.
20. The apparatus of claim 19, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: store a probability distribution of the occupancy status of the unknown points, where the probability distribution is based on neighboring points and is used in an arithmetic coder.
21. The apparatus of any one of claims 19 to 20, wherein the context of the unknown points is a normalized context using information from: a matrix generated from cropping a sum of a known points array and a reconstructed points array based respectively on the front projection and back projection; and a matrix formed from an image comprising the true points on the boundary of the at least one feasible region of the point cloud.
22. The apparatus of claim 21, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: rotate the matrix generated from cropping the sum of the known points array and the reconstructed points array; determine, during the rotation, a weighted score of the elements of the matrix generated from cropping the sum of the known points array and the reconstructed points array, where elements close to one of the corners are given a higher score; and rotate the matrix formed from the image comprising the true points on the boundary of the at least one feasible region of the point cloud, based on the weighted score.
23. The apparatus of any one of claims 16 to 22, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: repeat the encoding of the at least one outermost shell.
24. The apparatus of any one of claims 15 to 23, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: obtain the set of points of the at least one feasible region after uniting at each column index a lowest and highest set of points of the at least one feasible region with a vertical line.
25. The apparatus of any one of claims 15 to 24, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: transmit the encoded point cloud to a decoder.
26. The apparatus of any one of claims 17 to 25, wherein the axis is an Oz axis.
27. The apparatus of any one of claims 18 to 26, wherein the axis is an Oy axis.
28. The apparatus of any one of claims 18 to 27, wherein the plane is a zOx plane.
29. A method comprising: encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
30. The method of claim 29, further comprising: encoding at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell.
31. The method of any one of claims 29 to 30, wherein: the front and back projections of the point cloud are projections along an axis; and the front and back projections are encoded using crack-edge-region-value.
32. The method of any one of claims 29 to 31, further comprising: encoding the front and back projections along an axis of a three-dimensional coordinate system; drawing transverse sections of the point cloud parallel to a plane of the three-dimensional coordinate system; and encoding losslessly the transverse sections.
33. The method of any one of claims 29 to 32, further comprising: constructing an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied; marking one or more points on the boundary of the image of the at least one feasible region; encoding an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region; and transmitting to a decoder the occupancy status of the unknown points; wherein the encoding of the occupancy status of the unknown points is performed utilizing a context of the unknown points.
34. The method of claim 33, further comprising: storing a probability distribution of the occupancy status of the unknown points, where the probability distribution is based on neighboring points and is used in an arithmetic coder.
35. The method of any one of claims 33 to 34, wherein the context of the unknown points is a normalized context using information from: a matrix generated from cropping a sum of a known points array and a reconstructed points array based respectively on the front projection and back projection; and a matrix formed from an image comprising the true points on the boundary of the at least one feasible region of the point cloud.
36. The method of claim 35, further comprising: rotating the matrix generated from cropping the sum of the known points array and the reconstructed points array; determining, during the rotation, a weighted score of the elements of the matrix generated from cropping the sum of the known points array and the reconstructed points array, where elements close to one of the corners are given a higher score; and rotating the matrix formed from the image comprising the true points on the boundary of the at least one feasible region of the point cloud, based on the weighted score.
37. The method of any one of claims 30 to 36, further comprising: repeating the encoding of the at least one outermost shell.
38. The method of any one of claims 29 to 37, further comprising: obtaining the set of points of the at least one feasible region after uniting at each column index a lowest and highest set of points of the at least one feasible region with a vertical line.
39. The method of any one of claims 29 to 38, further comprising: transmitting the encoded point cloud to a decoder.
40. The method of any one of claims 31 to 39, wherein the axis is an Oz axis.
41. The method of any one of claims 32 to 40, wherein the axis is an Oy axis.
42. The method of any one of claims 32 to 41, wherein the plane is a zOx plane.
43. A non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations, the operations comprising: encoding a front projection of a point cloud as a first depthmap image, and encoding a back projection of the point cloud as a second depthmap image; encoding an occupancy status of points on a boundary of at least one feasible region of the point cloud, the at least one feasible region comprising a set of points that are possible to be occupied, while the points outside of all feasible regions are known already to not belong to the point cloud; and encoding an occupancy status of at least some inner points inside the at least one feasible region, comprising encoding voxels connected to voxels within the first depthmap image and the second depthmap image, to generate an encoded point cloud.
44. The non-transitory program storage device of claim 43, the operations further comprising: encoding at least one outermost shell comprising voxels represented in the front and back projections and voxels connected to the voxels represented in the front and back projections, to encode objects not encoded in a three-dimensional path to the outermost shell.
45. The non-transitory program storage device of any one of claims 43 to 44, wherein: the front and back projections of the point cloud are projections along an axis; and the front and back projections are encoded using crack-edge-region-value.
46. The non-transitory program storage device of any one of claims 43 to 45, the operations further comprising: encoding the front and back projections along an axis of a three-dimensional coordinate system; drawing transverse sections of the point cloud parallel to a plane of the three-dimensional coordinate system; and encoding losslessly the transverse sections.
47. The non-transitory program storage device of any one of claims 43 to 46, the operations further comprising: constructing an image of the at least one feasible region of the point cloud comprising the set of points that are possible to be occupied; marking one or more points on the boundary of the image of the at least one feasible region; encoding an occupancy status of unknown points from the one or more points marked on the boundary of the at least one feasible region; and transmitting to a decoder the occupancy status of the unknown points; wherein the encoding of the occupancy status of the unknown points is performed utilizing a context of the unknown points.
48. The non-transitory program storage device of claim 47, the operations further comprising: storing a probability distribution of the occupancy status of the unknown points, where the probability distribution is based on neighboring points and is used in an arithmetic coder.
49. The non-transitory program storage device of any one of claims 47 to 48, wherein the context of the unknown points is a normalized context using information from: a matrix generated from cropping a sum of a known points array and a reconstructed points array based respectively on the front projection and back projection; and a matrix formed from an image comprising the true points on the boundary of the at least one feasible region of the point cloud.
50. The non-transitory program storage device of claim 49, the operations further comprising: rotating the matrix generated from cropping the sum of the known points array and the reconstructed points array; determining, during the rotation, a weighted score of the elements of the matrix generated from cropping the sum of the known points array and the reconstructed points array, where elements close to one of the corners are given a higher score; and rotating the matrix formed from the image comprising the true points on the boundary of the at least one feasible region of the point cloud, based on the weighted score.
51. The non-transitory program storage device of any one of claims 44 to 50, the operations further comprising: repeating the encoding of the at least one outermost shell.
52. The non-transitory program storage device of any one of claims 43 to 51, the operations further comprising: obtaining the set of points of the at least one feasible region after uniting at each column index a lowest and highest set of points of the at least one feasible region with a vertical line.
53. The non-transitory program storage device of any one of claims 43 to 52, the operations further comprising: transmitting the encoded point cloud to a decoder.
54. The non-transitory program storage device of any one of claims 45 to 53, wherein the axis is an Oz axis.
55. The non-transitory program storage device of any one of claims 46 to 54, wherein the axis is an Oy axis.
56. The non-transitory program storage device of any one of claims 46 to 55, wherein the plane is a zOx plane.
PCT/FI2022/050001 2021-01-05 2022-01-03 Improved bounding volumes compression for lossless g-pcc data compression WO2022148903A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202163133867P 2021-01-05 2021-01-05
US63/133,867 2021-01-05

Publications (1)

Publication Number Publication Date
WO2022148903A1 true WO2022148903A1 (en) 2022-07-14

Family

ID=82357827

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/FI2022/050001 WO2022148903A1 (en) 2021-01-05 2022-01-03 Improved bounding volumes compression for lossless g-pcc data compression

Country Status (1)

Country Link
WO (1) WO2022148903A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160086353A1 (en) * 2014-09-24 2016-03-24 University of Maribor Method and apparatus for near-lossless compression and decompression of 3d meshes and point clouds


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"G-PCC codec description, MPEG document w18891", MPEG DOCUMENT MANAGEMENT SYSTEM, 128TH MEETING, 18 December 2019 (2019-12-18), Retrieved from the Internet <URL:http://dms.mpeg.expert> [retrieved on 20220406] *
GABEUR, V. ET AL.: "Moulding Humans: Non-Parametric 3D Human Shape Estimation From Single Images", IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, 27 October 2019 (2019-10-27), XP033723160, Retrieved from the Internet <URL:http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9008390> [retrieved on 20220524], DOI: 10.1109/ICCV.2019.00232 *
TABUS, I. ET AL.: "Successive Refinement of Bounding Volumes for Point Cloud Coding", IEEE INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 21 September 2020 (2020-09-21), XP055958126, Retrieved from the Internet <URL:http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9287106> [retrieved on 20220519], DOI: 10.1109/MMSP48831.2020.9287106 *

Similar Documents

Publication Publication Date Title
US11741638B2 (en) Methods and devices for entropy coding point clouds
US20240005565A1 (en) Methods and devices for binary entropy coding of point clouds
KR100450823B1 (en) Node structure for representing 3-dimensional objects using depth image
JP4832975B2 (en) A computer-readable recording medium storing a node structure for representing a three-dimensional object based on a depth image
JP4629005B2 (en) 3D object representation device based on depth image, 3D object representation method and recording medium thereof
EP3926962A1 (en) Apparatus and method for processing point cloud data
JP2004185628A (en) Coding and decoding methods for three-dimensional object data, and device for the methods
US20220108483A1 (en) Video based mesh compression
US11902348B2 (en) Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
US20230171431A1 (en) Device for transmitting point cloud data, method for transmitting point cloud data, device for receiving point cloud data, and method for receiving point cloud data
WO2022131948A1 (en) Devices and methods for sequential coding for point cloud compression
EP4340363A1 (en) Point cloud data transmission method, point cloud data transmission device, point cloud data reception method, and point cloud data reception device
WO2022148903A1 (en) Improved bounding volumes compression for lossless g-pcc data compression
WO2024074122A1 (en) Method, apparatus, and medium for point cloud coding
WO2024074121A1 (en) Method, apparatus, and medium for point cloud coding
US20230342987A1 (en) Occupancy coding using inter prediction with octree occupancy coding based on dynamic optimal binary coder with update on the fly (obuf) in geometry-based point cloud compression
WO2024074123A1 (en) Method, apparatus, and medium for point cloud coding
US20230206510A1 (en) Point cloud data processing device and processing method
WO2024012381A1 (en) Method, apparatus, and medium for point cloud coding
US20240020885A1 (en) Point cloud data transmission method, point cloud data transmission device, point cloud data reception method, and point cloud data reception device
US20230018907A1 (en) Occupancy coding using inter prediction in geometry point cloud compression
WO2023001623A1 (en) V3c patch connectivity signaling for mesh compression
WO2022131947A1 (en) Devices and methods for scalable coding for point cloud compression
CN115914651A (en) Point cloud coding and decoding method, device, equipment and storage medium
CN117795554A (en) Method for decoding and encoding 3D point cloud

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22736671

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22736671

Country of ref document: EP

Kind code of ref document: A1