WO2024082101A1

WO2024082101A1 - Encoding method, decoding method, decoder, encoder, code stream, and storage medium

Info

Publication number: WO2024082101A1
Application number: PCT/CN2022/125722
Authority: WO
Inventors: 马展; 王剑强; 魏红莲
Original assignee: Oppo广东移动通信有限公司
Priority date: 2022-10-17
Filing date: 2022-10-17
Publication date: 2024-04-25

Abstract

Embodiments of the present application provide an encoding method, a decoding method, a decoder, an encoder, a code stream, and a computer readable storage medium. The decoding method comprises: parsing a code stream, and determining attribute encoding information corresponding to a first scale point cloud; performing first scale prediction on the basis of decoded attribute information of a second scale point cloud, and determining attribute prediction information of the first scale point cloud, wherein the second scale point cloud is decoded point cloud data of a parent node of a previous scale of the first scale point cloud; performing probability prediction on the attribute prediction information by using an attribute probability prediction network, to determine an occupation probability of the first scale point cloud; and decoding the attribute encoding information on the basis of the occupation probability, and determining attribute information of the first scale point cloud.

Description

Coding and decoding method, decoder, encoder, code stream and storage medium

Technical Field

The present application relates to point cloud compression coding and decoding technology, and in particular to a coding and decoding method, a decoder, an encoder, a bit stream and a storage medium.

Background

A point cloud is a collection of points that stores the geometric position and related attribute information of each point, and thereby describes objects in space accurately and in three dimensions. The amount of point cloud data is huge: a single frame can contain millions of points, which makes effective storage and transmission difficult. Compression is therefore used to reduce redundant information in point cloud storage and to facilitate subsequent processing. Depending on the object being compressed, point cloud compression is divided into geometry compression and attribute compression, which compress the coordinate information and the attribute information respectively. The two are compressed separately: the coordinate information is first compressed using a geometry compression algorithm, and the attribute information is then compressed using a separate attribute compression algorithm with the coordinates as known information.

At present, attribute compression algorithms are all implemented based on traditional coding and decoding methods, which fall into two main categories: prediction and transformation. Taking the attribute compression in G-PCC (Geometry-based Point Cloud Compression), the standard point cloud compression algorithm developed by the Moving Picture Experts Group (MPEG), as an example, there are two different modes: Predlift and RAHT. Predlift is a compression technology with prediction at its core. Its basic principle is to divide the point cloud into different levels of detail (LoD) according to the distance between points, predict layer by layer, and then quantize and entropy encode the prediction residuals. RAHT is a compression technology with transformation at its core. Its basic principle is to decompose the point cloud into high-frequency and low-frequency components by a wavelet transform at each scale and in each direction, and then quantize and entropy encode the components according to their importance. Both methods use rule-based calculations for prediction or transformation, such as a distance-based weighted average or an orthogonal transformation matrix constructed from the number of points.
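The rule-based, distance-weighted prediction mentioned above can be illustrated with a minimal sketch. It assumes inverse-distance weights over a given set of already-coded neighbors; the function name `idw_predict`, the `eps` guard and the sample values are illustrative assumptions, not the exact G-PCC rule.

```python
import numpy as np

def idw_predict(target_xyz, neighbor_xyz, neighbor_attr, eps=1e-9):
    """Predict an attribute as the inverse-distance weighted mean of neighbors."""
    target_xyz = np.asarray(target_xyz, dtype=float)
    neighbor_xyz = np.asarray(neighbor_xyz, dtype=float)
    neighbor_attr = np.asarray(neighbor_attr, dtype=float)
    dist = np.linalg.norm(neighbor_xyz - target_xyz, axis=1)
    weights = 1.0 / (dist + eps)      # closer neighbors get larger weights
    weights /= weights.sum()          # normalize so the weights sum to 1
    return float(weights @ neighbor_attr)

# Two neighbors at distances 1 and 2 with attribute values 90 and 120.
pred = idw_predict([0, 0, 0], [[1, 0, 0], [0, 2, 0]], [90.0, 120.0])
# A predictive codec would quantize and entropy encode this residual.
residual = 100.0 - pred
```

In a prediction-based scheme only the residual is coded, so a better prediction leaves less information to transmit.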

However, the encoding and decoding complexity of traditional encoding and decoding methods is high, and the encoding and decoding performance needs to be improved.

Summary of the invention

The embodiments of the present application provide a coding and decoding method, a decoder, an encoder, a bit stream and a storage medium, which can reduce the complexity of coding and decoding and improve the coding and decoding performance.

The technical solution of this application is implemented as follows:

In a first aspect, an embodiment of the present application provides a decoding method, including:

parsing a bitstream to determine attribute encoding information corresponding to a first-scale point cloud;

performing first-scale prediction based on decoded attribute information of a second-scale point cloud to determine attribute prediction information of the first-scale point cloud, wherein the second-scale point cloud is the decoded point cloud data of the parent node, at the previous scale, of the first-scale point cloud;

performing probability prediction on the attribute prediction information using an attribute probability prediction network to determine an occupancy probability of the first-scale point cloud; and

decoding the attribute encoding information based on the occupancy probability to determine attribute information of the first-scale point cloud.

In a second aspect, an embodiment of the present application provides an encoding method, including:

sequentially performing voxel downsampling on point cloud data until a single voxel is obtained, thereby obtaining point clouds of a plurality of scales, wherein the point clouds of the plurality of scales include a first-scale point cloud and a second-scale point cloud, and the second-scale point cloud is the point cloud data of the parent node, at the previous scale, of the first-scale point cloud;

in the process of encoding the first-scale point cloud, performing first-scale prediction based on attribute information of the second-scale point cloud to determine attribute prediction information of the first-scale point cloud;

performing probability prediction on the attribute prediction information using an attribute probability prediction network to determine an occupancy probability of the first-scale point cloud; and

encoding attribute information of the first-scale point cloud based on the occupancy probability to determine attribute encoding information of the first-scale point cloud.
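The successive downsampling into point clouds of several scales can be sketched as follows. This is a minimal illustration that assumes non-negative integer voxel coordinates and a downsampling factor of 2 per axis (an octree-like parent relationship); the function name `build_scales` is hypothetical, and how parent attributes are derived is deliberately left out.

```python
import numpy as np

def build_scales(coords):
    """Halve integer voxel coordinates repeatedly until one voxel remains.

    Returns a list of (N_i, 3) arrays ordered from the finest scale down to
    the single-voxel scale.
    """
    current = np.unique(np.asarray(coords, dtype=np.int64), axis=0)
    if current.min() < 0:
        raise ValueError("this sketch assumes non-negative coordinates")
    scales = [current]
    while len(current) > 1:
        # Each voxel maps to its parent voxel; duplicate parents are merged.
        current = np.unique(current >> 1, axis=0)
        scales.append(current)
    return scales

points = np.array([[0, 0, 0], [1, 0, 0], [3, 2, 1], [3, 3, 1]])
scales = build_scales(points)
sizes = [len(s) for s in scales]  # number of occupied voxels per scale
```

Each entry of `scales` plays the role of one scale point cloud; the entry that follows it holds the parent nodes at the previous (coarser) scale.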

In a third aspect, an embodiment of the present application provides a decoder, including:

a decoding part, configured to parse the bitstream and determine the attribute encoding information corresponding to the first-scale point cloud;

a first prediction part, configured to perform first-scale prediction based on the decoded attribute information of the second-scale point cloud and determine the attribute prediction information of the first-scale point cloud, wherein the second-scale point cloud is the decoded point cloud data of the parent node, at the previous scale, of the first-scale point cloud; and

the decoding part being further configured to decode the attribute encoding information based on the occupancy probability to determine the attribute information of the first-scale point cloud.

In a fourth aspect, an embodiment of the present application provides an encoder, including:

a division part, configured to sequentially perform voxel downsampling on the point cloud data until a single voxel is obtained, so as to obtain point clouds of a plurality of scales, wherein the point clouds of the plurality of scales include a first-scale point cloud and a second-scale point cloud, and the second-scale point cloud is the point cloud data of the parent node, at the previous scale, of the first-scale point cloud;

a second prediction part, configured to perform first-scale prediction based on the attribute information of the second-scale point cloud during encoding of the first-scale point cloud and determine the attribute prediction information of the first-scale point cloud; and

an encoding part, configured to encode the attribute information of the first-scale point cloud based on the occupancy probability and determine the attribute encoding information of the first-scale point cloud.

In a fifth aspect, an embodiment of the present application further provides a decoder, including:

a first memory, configured to store executable instructions; and

a first processor, configured to implement the decoding method described in the first aspect when executing the executable instructions stored in the first memory.

In a sixth aspect, an embodiment of the present application further provides an encoder, including:

a second memory, configured to store executable instructions; and

a second processor, configured to implement the encoding method described in the second aspect when executing the executable instructions stored in the second memory.

In a seventh aspect, an embodiment of the present application provides a code stream, including:

The code stream is generated by bit encoding according to information to be encoded; wherein the information to be encoded includes at least: attribute encoding information of multiple scale point clouds, multiple quantization errors and reconstructed geometric information of multiple scale point clouds, wherein the reconstructed geometric information of the multiple scale point clouds includes the reconstructed geometric information of the first scale point cloud.

In an eighth aspect, an embodiment of the present application provides a computer-readable storage medium storing executable instructions for causing a first processor to execute and implement the decoding method described in the first aspect, or for causing a second processor to execute and implement the encoding method described in the second aspect.

The embodiments of the present application provide an encoding and decoding method, a decoder, an encoder, a code stream and a storage medium. When decoding a first-scale point cloud, the decoder performs first-scale prediction from the decoded attribute information of the second-scale point cloud to obtain the attribute prediction information of the first-scale point cloud. The attribute prediction information is then input into an attribute probability prediction network, which performs probability prediction to determine the occupancy probability of the first-scale point cloud. Because the attribute probability prediction network is used within a lossless decoding process and determines the occupancy probability directly, many processing operations are avoided, decoding complexity is reduced, and the occupancy probability is predicted faster, which improves the processing efficiency of the prediction process. Decoding the attribute encoding information with this occupancy probability can therefore improve decoding efficiency and decoding performance.

In the encoder, processing is performed between scales: first-scale prediction is performed from the attribute information of the second-scale point cloud to obtain the attribute prediction information of the first-scale point cloud. The attribute prediction information is then input into the attribute probability prediction network, which performs probability prediction to determine the occupancy probability of the first-scale point cloud. Because the attribute probability prediction network is used within a lossless encoding process and determines the occupancy probability directly, many processing operations are avoided, encoding complexity is reduced, and the occupancy probability is predicted faster, which improves the processing efficiency of the prediction process. Encoding the attribute information with this occupancy probability can therefore improve encoding efficiency and encoding performance.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow chart of G-PCC encoding;

FIG. 2 is a flow chart of G-PCC decoding;

FIG. 3 is a schematic diagram of an optional flow chart of a decoding method provided in an embodiment of the present application;

FIG. 4 is a schematic diagram showing attribute information of a voxel provided in an embodiment of the present application;

FIG. 5A is a schematic diagram of an optional implementation of voxel upsampling provided in an embodiment of the present application;

FIG. 5B is an optional schematic diagram showing attribute information and attribute prediction information in a voxel upsampling process provided in an embodiment of the present application;

FIG. 6 is a schematic diagram of an optional structure of an attribute probability prediction network provided in an embodiment of the present application;

FIG. 7 is a schematic diagram of another optional flow chart of a decoding method provided in an embodiment of the present application;

FIG. 8 is a schematic diagram of an optional process of dividing into groups provided in an embodiment of the present application;

FIG. 9 is an optional schematic diagram of a process of sequentially decoding groups provided in an embodiment of the present application;

FIG. 10 is a schematic diagram of another optional flow chart of a decoding method provided in an embodiment of the present application;

FIG. 11 is an optional schematic diagram showing a process of predicting occupancy probabilities between different color components provided in an embodiment of the present application;

FIG. 12 is a schematic diagram of another optional flow chart of the decoding method provided in an embodiment of the present application;

FIG. 13 is a schematic diagram of an optional flow chart of an encoding method provided in an embodiment of the present application;

FIG. 14 is a schematic diagram of another optional flow chart of the encoding method provided in an embodiment of the present application;

FIG. 15 is a schematic diagram of another optional flow chart of the encoding method provided in an embodiment of the present application;

FIG. 16 is a schematic diagram of another optional flow chart of the encoding method provided in an embodiment of the present application;

FIG. 17 is a schematic diagram of an optional process of applying the encoding and decoding method provided in an embodiment of the present application to an actual scenario;

FIG. 18 is a schematic diagram of an optional structure of a decoder provided in an embodiment of the present application;

FIG. 19 is a schematic diagram of an optional structure of a decoder provided in an embodiment of the present application;

FIG. 20 is a schematic diagram of an optional structure of an encoder provided in an embodiment of the present application;

FIG. 21 is a schematic diagram of an optional structure of an encoder provided in an embodiment of the present application.

Detailed Description

In order to make the purpose, technical solutions and advantages of the present application clearer, the present application is described in further detail below in conjunction with the accompanying drawings. The described embodiments should not be regarded as limiting the present application. All other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the scope of protection of this application.

In the following description, reference is made to “some embodiments”, which describe a subset of all possible embodiments, but it will be understood that “some embodiments” may be the same subset or different subsets of all possible embodiments and may be combined with each other without conflict.

In the following description, the terms "first/second/third" are merely used to distinguish similar objects and do not represent a specific ordering of the objects. It can be understood that "first/second/third" may be interchanged in a specific order or sequence where permitted, so that the embodiments of the present application described here can be implemented in an order other than that illustrated or described here.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as those commonly understood by those skilled in the art to which this application belongs. The terms used herein are only for the purpose of describing the embodiments of this application and are not intended to limit this application.

Before further describing the embodiments of the present application in detail, the nouns and terms involved in the embodiments of the present application are described first. The nouns and terms involved in the embodiments of the present application are subject to the following interpretations:

Geometry-based Point Cloud Compression (G-PCC or GPCC);

Video-based Point Cloud Compression (V-PCC or VPCC);

Octree;

Bounding Box;

Level of Detail (LOD);

Predicting Transform;

Lifting Transform;

Region Adaptive Hierarchical Transform (RAHT);

Luminance component (Luminance, Luma or Y);

Blue chromaticity component (Chroma blue, Cb);

Red chromaticity component (Chroma red, Cr);

1) Voxel: short for volume element, the smallest unit of digital data in a partition of three-dimensional space. Voxels can be used to divide 3D space into a grid and to assign features to each grid cell. For example, a voxel can be a cubic block of fixed size in three-dimensional space. Voxels are widely used in fields such as three-dimensional imaging, scientific data and medical imaging.

2) Point Cloud refers to a set of irregularly distributed discrete points in space that express the spatial structure and surface properties of a three-dimensional object or three-dimensional scene.

Point cloud is a three-dimensional representation of the surface of an object. Point cloud (data) of the surface of an object can be collected through acquisition equipment such as photoelectric radar, lidar, laser scanner, and multi-view camera.

A point cloud is a collection of a massive number of three-dimensional points, and each point may carry location information and attribute information. The location information of a point may be its three-dimensional coordinates and may also be referred to as the geometric information of the point. The attribute information of a point may include, for example, color information and/or reflectivity. The color information may be expressed in any color space. For example, it may be RGB information, where R represents red, G represents green, and B represents blue. As another example, it may be brightness and chromaticity (YCbCr, YUV) information, where Y represents brightness, Cb (U) represents blue chromaticity, and Cr (V) represents red chromaticity. Alternatively, it may be brightness and chromaticity (YCoCg) information, where Y represents brightness, Co represents orange chromaticity, and Cg represents green chromaticity. The embodiments of the present application place no limitation on this.

Point clouds can be divided into the following categories according to the way they are obtained:

Type 1, static point cloud: the object is stationary, and the device that acquires the point cloud is also stationary;

Type 2, dynamic point cloud: the object is moving, but the device that acquires the point cloud is stationary;

Type 3, dynamically acquired point cloud: the device that acquires the point cloud is moving.

The uses of point clouds can be divided into two categories:

Category 1: Machine perception point cloud, which can be used in autonomous navigation systems, real-time inspection systems, geographic information systems, visual sorting robots, emergency rescue robots, etc.

Category 2: Point cloud perceived by the human eye, which can be used in point cloud application scenarios such as digital cultural heritage, free viewpoint broadcasting, 3D immersive communication, and 3D immersive interaction.

Since point clouds are a collection of massive points, storing point clouds not only consumes a lot of memory, but is also not conducive to transmission. There is also not enough bandwidth to support direct transmission of point clouds at the network layer without compression. Therefore, point clouds need to be compressed.

Point cloud compression algorithms include two solutions developed by the Moving Picture Experts Group (MPEG), namely Video-based Point Cloud Compression (V-PCC) and Geometry-based Point Cloud Compression (G-PCC). Geometry compression in G-PCC is mainly implemented through an octree model and/or a triangle surface model, whereas V-PCC is mainly implemented through three-dimensional to two-dimensional projection and video compression. The G-PCC codec framework can be used to compress the first type (static point clouds) and the third type (dynamically acquired point clouds), and the V-PCC codec framework can be used to compress the second type (dynamic point clouds). The embodiments of the present application are described on the basis of the G-PCC codec framework.

A point cloud is a three-dimensional data format that uses a large number of discrete three-dimensional coordinate points (x, y, z) and their attributes (such as the color components Y, Co, Cg) to represent the three-dimensional information of objects and scenes. Its data volume is huge, so a high-performance point cloud compression algorithm is required. Depending on the object being compressed, point cloud compression is divided into geometry compression and attribute compression, which compress the coordinate information and the attribute information respectively. The two are usually compressed separately: the coordinate information is first compressed using a geometry compression algorithm, and the attribute information is then compressed using a separate attribute compression algorithm with the coordinates as known information. The present application belongs to the attribute compression algorithms of this mode.

The current attribute compression algorithms are all based on traditional, non-neural-network methods, which can be roughly divided into two types: prediction and transformation. Taking the attribute compression in the point cloud compression algorithm G-PCC as an example, there are two different modes: Predlift and RAHT. Predlift is a compression technology with prediction at its core. Its basic principle is to divide the point cloud into different levels of detail (LoD) according to the distance between points, predict layer by layer, and then quantize and entropy encode the prediction residuals. RAHT is a compression technology with transformation at its core. Its basic principle is to decompose the point cloud into high-frequency and low-frequency components by a wavelet transform at each scale and in each direction, and then quantize and entropy encode the components according to their importance. Both methods use rule-based calculations for prediction or transformation, such as a distance-based weighted average or an orthogonal transformation matrix constructed from the number of points.
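The "orthogonal transformation matrix constructed based on the number of points" can be illustrated with the two-point RAHT butterfly as it is commonly described in the literature. The sketch below works in floating point and the function names are hypothetical; G-PCC itself uses a fixed-point variant that is not reproduced here.

```python
import numpy as np

def raht_butterfly(a1, w1, a2, w2):
    """Merge two neighboring coefficients whose weights are point counts w1, w2."""
    s = np.sqrt(w1 + w2)
    c1, c2 = np.sqrt(w1) / s, np.sqrt(w2) / s
    low = c1 * a1 + c2 * a2      # low-frequency coefficient, passed up a level
    high = -c2 * a1 + c1 * a2    # high-frequency coefficient, quantized and coded
    return low, high, w1 + w2

def raht_inverse(low, high, w1, w2):
    """Invert the butterfly using the transpose of the orthogonal matrix."""
    s = np.sqrt(w1 + w2)
    c1, c2 = np.sqrt(w1) / s, np.sqrt(w2) / s
    return c1 * low - c2 * high, c2 * low + c1 * high

# One node summarizing 3 points (value 10) and one summarizing 1 point (value 20).
low, high, w = raht_butterfly(10.0, 3, 20.0, 1)
a1, a2 = raht_inverse(low, high, 3, 1)
```

Because the 2x2 matrix is orthogonal, the transform preserves energy and is exactly invertible before quantization; the weights only depend on how many points each node contains.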

However, although neural networks have, with the development of artificial intelligence technology, been applied to point cloud geometry compression and to lossy attribute compression, there is a lack of technology for lossless attribute compression.

The embodiment of the present application provides a coding and decoding method, a decoder, an encoder, a code stream and a computer-readable storage medium, which can improve the coding and decoding efficiency and improve the coding and decoding performance. In order to facilitate the understanding of the technical solution provided by the embodiment of the present application, a flow chart of G-PCC encoding and a flow chart of G-PCC decoding are first provided. It should be noted that the flow chart of G-PCC encoding and the flow chart of G-PCC decoding described in the embodiment of the present application are only for more clearly illustrating the technical solution of the embodiment of the present application, and do not constitute a limitation on the technical solution provided by the embodiment of the present application. It is known to those skilled in the art that with the evolution of point cloud compression technology and the emergence of new business scenarios, the technical solution provided in the embodiment of the present application is also applicable to point cloud coding and decoding architectures similar to G-PCC. The point cloud compressed in the embodiment of the present application can be a point cloud in a video, but is not limited to this.

In the point cloud G-PCC encoder framework, the point cloud of the input 3D image model is sliced and each slice is encoded independently.

FIG. 1 shows the flow chart of G-PCC encoding, which is applied to the encoder. The point cloud data to be encoded is first divided into multiple slices by slice partitioning. In each slice, the geometric information and the attribute information of the point cloud are encoded separately. In the geometry encoding process, the geometric information is coordinate-transformed so that the whole point cloud is contained in a bounding box, and is then quantized. Quantization mainly serves to scale the coordinates. Because quantization rounds the coordinates, some points end up with the same geometric information, and a parameter determines whether these duplicate points are removed. The process of quantization and duplicate-point removal is also called voxelization. The bounding box is then divided by an octree. In octree-based geometry encoding, the bounding box is divided into 8 sub-cubes, and each non-empty sub-cube (one that contains points of the point cloud) is again divided into 8 equal parts, until the leaf nodes obtained are 1x1x1 unit cubes. Division then stops, and the points in the leaf nodes are arithmetically encoded to generate a binary geometry bitstream, that is, the geometry code stream. In geometry encoding based on triangle soup (trisoup), octree division is also performed first. Unlike octree-based geometry encoding, however, trisoup does not divide the point cloud step by step down to unit cubes with a side length of 1; instead, division stops when the side length of a block is W. Based on the surface formed by the distribution of the point cloud in each block, the at most twelve intersections (vertices) between that surface and the twelve edges of the block are obtained, and the vertices are arithmetically encoded (with surface fitting based on the intersections) to generate a binary geometry bitstream, that is, the geometry code stream. The vertices are also used in geometry reconstruction, and the reconstructed geometric information is used when encoding the attributes of the point cloud.
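The voxelization and octree division described above can be sketched as follows. This is a simplified illustration: the function names, the `scale` parameter, and the bit order used for the eight children are assumptions, and the arithmetic coding of the resulting occupancy codes is omitted.

```python
import numpy as np

def voxelize(xyz, scale, remove_duplicates=True):
    """Shift points into the bounding box, quantize, optionally drop duplicates."""
    xyz = np.asarray(xyz, dtype=float)
    q = np.round((xyz - xyz.min(axis=0)) * scale).astype(np.int64)
    return np.unique(q, axis=0) if remove_duplicates else q

def occupancy_codes(voxels, depth):
    """Return one 8-bit occupancy code per non-empty node, level by level.

    Assumes every coordinate is in [0, 2**depth).
    """
    codes = []
    for level in range(depth):
        child = voxels >> (depth - 1 - level)   # cell index at this level
        parent = child >> 1                     # node being subdivided
        octant = ((child[:, 0] & 1) << 2) | ((child[:, 1] & 1) << 1) | (child[:, 2] & 1)
        level_codes = {}
        for p, o in zip(map(tuple, parent.tolist()), octant.tolist()):
            level_codes[p] = level_codes.get(p, 0) | (1 << o)  # mark child as occupied
        codes.append([level_codes[k] for k in sorted(level_codes)])
    return codes

# Rounding maps the first two points to the same voxel, which is then removed.
vox = voxelize([[0.0, 0.0, 0.0], [0.4, 0.0, 0.0], [1.0, 1.0, 1.0]], scale=1.0)
codes = occupancy_codes(np.array([[0, 0, 0], [1, 1, 1], [3, 3, 3]]), depth=2)
```

Only non-empty nodes are subdivided, so the number of codes per level grows with the occupied volume rather than with the size of the bounding box.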

In the attribute encoding process, color conversion is first performed to convert the color information (i.e., attribute information) from the RGB color space to the YUV color space or the YCoCg color space. The point cloud is then recolored using the reconstructed geometric information, so that the not-yet-encoded attribute information corresponds to the reconstructed geometric information. There are two main transformation methods for encoding the color information: the distance-based lifting transform, which relies on level of detail (LOD) division, and the region adaptive hierarchical transform (RAHT), which is applied directly. Both methods convert the color information from the spatial domain to the frequency domain, obtain high-frequency and low-frequency coefficients through the transformation, and then quantize the coefficients to obtain quantized coefficients. Finally, the geometry encoding data obtained after octree division and surface fitting and the attribute encoding data obtained from the quantized coefficients are combined by slice synthesis, and the vertex coordinates of each block are encoded in turn (i.e., arithmetically encoded) to generate a binary attribute bitstream, i.e., the attribute code stream.
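As an illustration of the color conversion step, the sketch below uses the reversible integer variant YCoCg-R. This is a substituted technique chosen for illustration because it round-trips exactly, which matters for lossless attribute coding; the text above does not state which YCoCg variant is used, and the function names are hypothetical.

```python
import numpy as np

def rgb_to_ycocg_r(rgb):
    """Forward YCoCg-R lifting transform on integer RGB values."""
    rgb = np.asarray(rgb, dtype=np.int64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return np.stack([y, co, cg], axis=-1)

def ycocg_r_to_rgb(ycocg):
    """Inverse transform: undo the lifting steps in reverse order."""
    ycocg = np.asarray(ycocg, dtype=np.int64)
    y, co, cg = ycocg[..., 0], ycocg[..., 1], ycocg[..., 2]
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return np.stack([r, g, b], axis=-1)

colors = np.array([[255, 0, 0], [12, 200, 34], [0, 0, 0], [255, 255, 255]])
restored = ycocg_r_to_rgb(rgb_to_ycocg_r(colors))
```

Each lifting step is an integer operation that the inverse subtracts back out, so no rounding error accumulates. Note that Co and Cg need one more bit of range than the input components.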

FIG. 2 shows the flow chart of G-PCC decoding, which is applied to the decoder. The decoder obtains a binary code stream and independently decodes the geometry bitstream (i.e., geometry code stream) and the attribute bitstream in it. When decoding the geometry bitstream, the geometric information of the point cloud is obtained through arithmetic decoding, octree synthesis, surface fitting, geometry reconstruction and inverse coordinate transformation. When decoding the attribute bitstream, the attribute information of the point cloud is obtained through arithmetic decoding, inverse quantization, LOD-based inverse lifting or RAHT-based inverse transformation, and inverse color conversion. The three-dimensional image model of the point cloud data is then restored based on the geometric information and the attribute information.

The encoding method provided in the embodiments of the present application can be understood as being applied to the attribute encoding process of G-PCC shown in FIG. 1. After recoloring is completed, and in the process of encoding the divided first-scale point cloud, first-scale prediction is performed based on the attribute information of the second-scale point cloud to determine the attribute prediction information of the first-scale point cloud. Probability prediction is then performed on the attribute prediction information using the attribute probability prediction network to determine the occupancy probability of the first-scale point cloud, and the attribute information of the first-scale point cloud is encoded based on the occupancy probability to determine the attribute encoding information of the first-scale point cloud. This replaces the branch of generating LOD, lifting, quantization and arithmetic coding in FIG. 1 and achieves lossless coding, that is, lossless compression. The encoding process of the embodiments of the present application can adopt the entropy coding of FIG. 1, such as arithmetic coding: the attribute information of the first-scale point cloud is entropy encoded to determine its attribute encoding information, which is written into the attribute bitstream (code stream).
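The role of the predicted probability in entropy coding can be illustrated with a rough sketch. It assumes that the occupancy probability is a distribution over the possible values of each attribute, and it replaces the attribute probability prediction network with a hand-written placeholder (a distribution peaked at the prediction). The names `stand_in_probability` and `estimated_bits` and the sample values are hypothetical, and no real arithmetic coder is included: only the ideal code length is computed.

```python
import numpy as np

def stand_in_probability(pred, scale=4.0, levels=256):
    """Placeholder for the attribute probability prediction network.

    Returns an (N, levels) array whose rows are probability distributions
    over attribute values, each peaked at the corresponding prediction.
    """
    values = np.arange(levels)[None, :]
    logits = -np.abs(values - np.asarray(pred, dtype=float)[:, None]) / scale
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def estimated_bits(attr, prob):
    """Ideal entropy-coded size: the sum of -log2 p(true value)."""
    attr = np.asarray(attr)
    picked = prob[np.arange(len(attr)), attr]
    return float(-np.log2(picked).sum())

pred = np.array([100, 100, 37])   # attribute prediction from the second scale
attr = np.array([101, 98, 37])    # true attributes of the first scale
prob = stand_in_probability(pred)
bits_model = estimated_bits(attr, prob)
bits_uniform = 8.0 * len(attr)    # cost of 8-bit values with no prediction
```

The coding is lossless whatever the quality of the probabilities: a sharper and more accurate distribution only shortens the bitstream. The decoder must reproduce the same probabilities from the same already-decoded data in order to decode.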

The decoding method provided in the embodiments of the present application can be understood as being applied to the attribute decoding process of G-PCC shown in FIG. 2. The attribute bitstream is parsed to determine the attribute encoding information corresponding to the first-scale point cloud. First-scale prediction is performed based on the decoded attribute information of the second-scale point cloud to determine the attribute prediction information of the first-scale point cloud, where the second-scale point cloud is the decoded point cloud data of the parent node, at the previous scale, of the first-scale point cloud. Probability prediction is performed on the attribute prediction information using the attribute probability prediction network to determine the occupancy probability of the first-scale point cloud, and the attribute encoding information is decoded based on the occupancy probability to determine the attribute information of the first-scale point cloud. This replaces the branch of arithmetic decoding, inverse quantization, generating LOD and inverse lifting in FIG. 2 and realizes lossless decoding. The decoding process of the embodiments of the present application can adopt the entropy decoding of FIG. 2, such as arithmetic decoding.

It should be noted that the encoding method and decoding method of the embodiments of the present application can also be used in other point cloud encoding and decoding processes other than G-PCC to achieve lossless compression of attribute information.

The decoding method applied to a decoder provided in an embodiment of the present application is described below.

Referring to FIG. 3 , FIG. 3 is a schematic diagram of an optional flow chart of a decoding method provided in an embodiment of the present application, the method comprising:

S101, parsing a bit stream to determine attribute coding information corresponding to a first-scale point cloud.

In the embodiment of the present application, the decoding object of the decoder is a point cloud, or a point or node in the point cloud data. During encoding, the encoder can spatially divide the point cloud data into multiple strips, i.e., slices, and the encoding method or decoding method provided in the embodiment of the present application is performed on the point cloud in each slice. Therefore, the decoder decodes the point cloud in each slice and finally combines the slices to obtain all the decoded information, thereby obtaining a three-dimensional image model.

In the embodiment of the present application, the encoder divides the point cloud to be encoded of each slice or point cloud data (when strip division is not performed) into point cloud structures of different scales corresponding to the attribute information, and performs attribute encoding on point clouds of different scales starting from the point cloud with the smallest scale. Therefore, in the decoding process of the attribute information, the decoder also decodes the points in the point clouds of different scales in order from small to large (or from low to high), determines the attribute information of different scales, and finally obtains the decoded information of all scales of all slices, and then obtains the three-dimensional image model.

The detailed decoding process for the point cloud at one scale (the first-scale point cloud) is described below.

It should be noted that the decoder can parse the attribute coding information (encoded attribute information) corresponding to the first-scale point cloud from the code stream. In an embodiment of the present application, the code stream usually contains encoding information (such as attribute coding information) corresponding to at least one scale of the point cloud transmitted by the encoder. When the decoder decodes, it decodes in order from low scale to high scale. In other words, before the decoder decodes the attribute coding information corresponding to the first-scale point cloud, it has completed the decoding of the attribute coding information of the previous scale point cloud of the first-scale point cloud, that is, the second-scale point cloud, and determined the attribute information of the decoded second-scale point cloud. In this way, the decoder can use the attribute information of the decoded second-scale point cloud of the previous scale to decode the attribute coding information corresponding to the first-scale point cloud of the current scale.

It should be noted that in the embodiment of the present application, since the encoder can voxelize each scale point cloud when performing attribute encoding, the points are represented in the form of a voxel grid.

In an embodiment of the present application, for the voxelization process, at least one point in the point cloud may correspond to an occupied voxel (i.e., a non-empty voxel), and an unoccupied voxel (i.e., an empty voxel) indicates that there is no point of the point cloud at the voxel position. In some embodiments, occupied voxels may be marked with different attribute information, and unoccupied voxels may have no attribute information. In this way, the voxelized point clouds of different scales may represent the attribute information of the point clouds of different scales through the information of the voxels at each position in the voxel grid. Accordingly, during the decoding process, when the decoder decodes the first-scale point cloud, it may parse each first-scale voxel in turn to complete the decoding.

In the embodiment of the present application, parsing the code stream to determine the attribute coding information corresponding to the first-scale point cloud can also be understood as parsing the code stream to obtain the attribute coding information of each first-scale voxel.

S102, predicting the first scale based on the decoded attribute information of the second scale point cloud to determine the attribute prediction information of the first scale point cloud; wherein the second scale point cloud is the decoded point cloud data of the previous scale parent node of the first scale point cloud.

In an embodiment of the present application, when the decoder decodes the first-scale point cloud, it can obtain the decoded attribute information of the decoded second-scale point cloud. The second-scale point cloud is the decoded point cloud data of the previous-scale parent node of the first-scale point cloud, and the second scale is smaller than the first scale.

In the embodiment of the present application, the scale of the point cloud corresponds to the scale of the voxels in the point cloud, that is, the voxels contained in the first scale point cloud correspond to the first scale voxels, and the voxels contained in the second scale point cloud correspond to the second scale voxels.

It should be noted that, since the second-scale point cloud is obtained by voxel downsampling of the first-scale point cloud, a point contained in a second-scale voxel of the second-scale point cloud can be used as a parent node of points contained in multiple corresponding first-scale voxels in the first-scale point cloud.

In the embodiment of the present application, the decoder may infer the undecoded attribute information of the point cloud at the first scale based on the decoded attribute information of the point cloud at the second scale, and obtain the predicted attribute prediction information.

In the embodiment of the present application, prediction or inference is performed by unpooling operation between different scales. The voxel at the first scale is the upsampled voxel corresponding to the voxel at the second scale.

In some embodiments of the present application, the decoder performs voxel upsampling on the second-scale point cloud to determine n first voxels of the first scale, where n is an integer greater than 1. The decoder determines attribute prediction information corresponding to the n first voxels of the first scale based on the decoded attribute information of the second-scale point cloud, and further obtains the attribute prediction information of the first-scale point cloud.

The first voxel in the present application is a first-scale voxel, and the second voxel is a second-scale voxel.

It should be noted that the decoder can perform voxel upsampling on the second-scale voxels in the second-scale point cloud to obtain the corresponding first-scale voxels. Exemplarily, the second-scale voxels representing occupied points in the second-scale point cloud are upsampled to obtain multiple first-scale voxels for each second-scale voxel. In other words, when the second-scale point cloud includes multiple second voxels, each second voxel can be upsampled into multiple first voxels at the first scale, thereby obtaining the n first voxels of the first scale corresponding to the multiple second voxels.

In an embodiment of the present application, the decoder upsamples the second-scale point cloud to the first scale. It can be understood that, since the second scale is lower than the first scale, upsampling the second-scale point cloud yields multiple upsampled voxels (i.e., multiple first voxels) for each voxel of the second scale. Whether each of these upsampled voxels contains a point of the point cloud can be predicted through the subsequent occupancy probability prediction process. Here, the decoder divides the upsampled voxels (the n first voxels) into m groups of first voxels and performs the subsequent prediction processing in a group-based manner, where m is an integer greater than or equal to 1.

It should be noted that when the encoder downsamples the first-scale point cloud to determine the second-scale point cloud, averaging the attribute information produces fractional values, which are quantized by rounding. The fractional part removed by rounding is the quantization error. Therefore, each time the encoder downsamples between adjacent scales, a quantization error between those scales is generated. During upsampling, the decoder can therefore first perform dequantization and then upsample, so as to achieve lossless decoding.

In some embodiments of the present application, when decoding the first-scale point cloud, the decoder can determine the decoded attribute information of the second-scale point cloud; it can also parse the quantization error when the first-scale point cloud is downsampled to the second-scale point cloud from the bitstream. The decoder uses the quantization error to dequantize the decoded attribute information of the second-scale point cloud to determine the corrected attribute information of the second-scale point cloud; and then determines the corrected attribute information of the second voxel of the second-scale point cloud to which each first voxel of the n first voxels of the first scale belongs as the attribute prediction information corresponding to each first voxel.

It should be noted that the decoder may superimpose the decoded attribute information of the second-scale point cloud with the corresponding quantization error to obtain the corrected attribute information of the second-scale point cloud. The second-scale point cloud may include multiple second-scale voxels (such as K second voxels, K is an integer greater than or equal to 1), and each second-scale voxel corresponds to a quantization error. Each second-scale voxel superimposes the corresponding quantization error to determine the corrected attribute information corresponding to each second-scale voxel of the second-scale point cloud. Since each second-scale voxel corresponds to multiple first-scale voxels (i.e., multiple first voxels), the corrected attribute information corresponding to each second-scale voxel is the mean of the attribute information of the multiple first-scale voxels corresponding to it. In other words, for the n first voxels corresponding to the first scale (i.e., n first-scale voxels), the corrected attribute information of the parent node of the previous scale corresponding to each first voxel is determined as the mean of the attribute information of each first voxel.
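The averaging, rounding, and superposition steps described above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the function names, the 2×2×2 parent/child mapping by integer division of coordinates, and the round-half-up rule are assumptions made for illustration. Exact fractions are used so that adding the quantization error back recovers the mean without floating-point loss.

```python
from collections import defaultdict
from fractions import Fraction
import math


def downsample_with_error(first_attrs):
    """Average the attributes of occupied first-scale voxels per parent voxel.

    first_attrs maps (x, y, z) first-scale coordinates to integer attributes.
    Assumption: the parent of (x, y, z) is (x // 2, y // 2, z // 2).
    Returns (second_attrs, quant_errors), both keyed by parent coordinates.
    """
    buckets = defaultdict(list)
    for (x, y, z), value in first_attrs.items():
        buckets[(x // 2, y // 2, z // 2)].append(value)

    second_attrs, quant_errors = {}, {}
    for parent, values in buckets.items():
        mean = Fraction(sum(values), len(values))
        rounded = math.floor(mean + Fraction(1, 2))  # assumed rounding rule
        second_attrs[parent] = rounded
        quant_errors[parent] = mean - rounded  # part lost by rounding
    return second_attrs, quant_errors


def dequantize(second_attrs, quant_errors):
    """Superimpose each quantization error to recover the exact mean."""
    return {p: second_attrs[p] + quant_errors[p] for p in second_attrs}


if __name__ == "__main__":
    attrs = {(0, 0, 0): 8, (1, 0, 0): 5, (0, 1, 1): 9, (2, 0, 0): 7}
    coarse, errors = downsample_with_error(attrs)
    print(coarse, errors, dequantize(coarse, errors))
```

In this sketch the corrected attribute of a second-scale voxel equals the mean of the attributes of its occupied first-scale voxels, which matches the role the text gives to the corrected attribute information.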

In the embodiment of the present application, the decoder determines the average value of the attribute information of the second-scale point cloud corresponding to each first voxel of n first voxels at the first scale as the attribute prediction information corresponding to each first voxel.

Since each second-scale voxel or second voxel corresponds to multiple first-scale voxels or multiple first voxels, the attribute prediction information corresponding to at least one first voxel belonging to the same second voxel as a parent node among the n first voxels is consistent.

It should be noted that a first voxel does not necessarily contain a point of the point cloud, so attribute prediction information is determined only for first voxels that are occupied.

Exemplarily, the attribute information or attribute value of each voxel is represented by numbers or color depth or different marks, which is not limited in the embodiments of the present application. Occupied voxels in the point cloud are represented by solid cubes, which represent the attribute information of the points in the point cloud. Unoccupied voxels in the point cloud are represented by empty cubes, which means that there are no points in the point cloud and no attribute information, or the attribute information is empty. As shown in Figure 4, each cube represents a voxel, and the attribute information corresponding to each voxel containing the points in the point cloud is represented by numbers. As shown in Figure 4, 8, 5, 9, 7, 3, and 5 are used to represent the attribute information of each voxel.

It should be noted that the attribute information in the embodiment of the present application may include color components, specifically color information of any color space. For example, the attribute information may be color information of an RGB space, color information of a YUV space, color information of a YCbCr space, or YCoCg, etc., and the embodiment of the present application does not impose any limitation thereto.

In the embodiment of the present application, the color component may include at least one of the following: a first color component, a second color component, and a third color component. Thus, taking the attribute information as a color component as an example, if the color component conforms to the RGB color space, then the first color component, the second color component, and the third color component can be determined to be: R component, G component, B component in sequence; if the color component conforms to the YUV color space, then the first color component, the second color component, and the third color component can be determined to be: Y component, U component, V component in sequence; if the color component conforms to the YCbCr color space, then the first color component, the second color component, and the third color component can be determined to be: Y component, Cb component, Cr component in sequence. If the color component conforms to the YCoCg color space, then the first color component, the second color component, and the third color component can be determined to be: Y component, Co component, Cg component in sequence.

Exemplarily, when the attribute information is a color component, the value of the attribute information can be represented by a value between 0 and 255, which is not limited in the embodiments of the present application.

It can also be understood that in the embodiment of the present application, for each point in the point cloud, the attribute information of the point may include not only the color component, but also the reflectivity, refractive index or other attributes, which is not specifically limited here.

Exemplarily, the voxel containing the point, that is, the occupied voxel can use the mean of the attribute information of each point as the attribute information of the voxel. As shown in FIG5A, the second scale point cloud includes 3 occupied second voxels, and each second voxel contains 8 first voxels (first scale voxels) after voxel upsampling. As shown in FIG5B, 3 second voxels are occupied in the second voxels of the second scale (Scale p-1, p is an integer greater than 1), and the attribute information corresponding to these 3 second voxels is 8, 7 and 4 respectively. The quantization error between Scale p-1 and Scale p is 0, so that the corrected attribute information of the 3 second voxels is still 8, 7 and 4. After voxel upsampling, the attribute prediction information of the first voxel of the first scale (Scale p) corresponding to the second voxel with the corrected attribute information of 8 is 8, the attribute prediction information of the first voxel corresponding to the second voxel with the corrected attribute information of 7 is 7, and the attribute prediction information of the first voxel corresponding to the second voxel with the corrected attribute information of 4 is 4.

In some embodiments of the present application, the decoder may also determine the reconstructed geometric information of the first-scale point cloud from the bitstream.

In an embodiment of the present application, the decoder uses a quantization error to dequantize the decoded attribute information of the second-scale point cloud to determine the corrected attribute information of the second-scale point cloud; then, a first-scale prediction is performed based on the corrected attribute information of the second-scale point cloud and the reconstructed geometric information of the first-scale point cloud to determine the attribute prediction information of the first-scale point cloud.

It should be noted that when the decoder performs attribute decoding, it has already completed the geometric decoding process and can obtain the reconstructed geometric information of point clouds of different scales. Therefore, while upsampling the second-scale point cloud, the decoder can determine, based on the reconstructed geometric information of the first-scale point cloud, which of the first voxels obtained by the upsampling are occupied and which are not.

The reconstructed geometric information of the first-scale point cloud can determine the distribution position of the points in the first-scale point cloud, that is, determine the position information of the points corresponding to the first-scale voxels. The distribution position of the points contained in the first voxels corresponding to the first-scale point cloud can be determined through the position information of the points. In this way, among the n first voxels obtained by upsampling the voxels based on the second-scale point cloud, it is possible to determine which first voxels are occupied. For the occupied first voxels, the attribute prediction information of these occupied first voxels can be determined based on the corrected attribute information of the second-scale point cloud. After all the first voxels are determined, the attribute prediction information of the first-scale point cloud is determined.

For example, as shown in FIG5B , the second voxel with attribute information of 8 corresponds to 1 first voxel, and its attribute prediction information is 8. The second voxel with attribute information of 7 corresponds to 3 first voxels, and its attribute prediction information is 7. The second voxel with attribute information of 4 corresponds to 2 first voxels, and its attribute prediction information is 4.
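As a hedged illustration of this geometry-guided prediction, the following Python sketch copies the corrected attribute of each second voxel to the occupied first voxels that belong to it. The coordinates, the function name, and the 2×2×2 parent mapping are hypothetical; only the attribute values 8, 7, and 4 and the child counts 1, 3, and 2 are taken from the example above.

```python
def predict_first_scale(corrected_second_attrs, occupied_first_voxels):
    """Predict attributes of occupied first-scale voxels from their parents.

    corrected_second_attrs maps second-scale coordinates to corrected attributes.
    occupied_first_voxels lists the first-scale coordinates that the
    reconstructed geometry marks as occupied.
    Assumption: the parent of (x, y, z) is (x // 2, y // 2, z // 2).
    """
    prediction = {}
    for (x, y, z) in occupied_first_voxels:
        parent = (x // 2, y // 2, z // 2)
        prediction[(x, y, z)] = corrected_second_attrs[parent]
    return prediction


if __name__ == "__main__":
    # Three occupied second voxels with corrected attributes 8, 7, and 4.
    second = {(0, 0, 0): 8, (1, 0, 0): 7, (0, 1, 0): 4}
    # Hypothetical occupied first voxels: 1, 3, and 2 children respectively.
    occupied = [(0, 0, 1), (2, 0, 0), (3, 1, 0), (2, 1, 1), (0, 2, 0), (1, 3, 1)]
    for voxel, value in predict_first_scale(second, occupied).items():
        print(voxel, value)
```

Unoccupied first voxels never appear in the result, so no attribute prediction information is produced for them.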

S103: Using an attribute probability prediction network, perform probability prediction on the attribute prediction information to determine the occupancy probability of the first scale point cloud.

In an embodiment of the present application, after the decoder determines the attribute prediction information of the first voxel of the first-scale point cloud, an attribute probability prediction network can be used to perform probability prediction on the attribute prediction information to determine the occupancy probability of the first voxel.

In some embodiments of the present application, the attribute probability prediction network can be a sparse-convolution-based attribute probability approximation network (SparseCNN-based Attribute Probability Approximation, SAPA). The input of the attribute probability prediction network is attribute prediction information or attribute information values, and the output is probability model parameters. The decoder can determine the occupancy probability of the first voxel based on the probability model parameters of the first voxel, thereby completing the determination of the occupancy probability of the first-scale point cloud.

It should be noted that when the attribute probability prediction network is used for probability prediction, the probability model parameters of each point contained in the voxel can be determined based on the attribute prediction information of multiple points in the first voxel in units of voxels.

In an embodiment of the present application, the probability model parameters may include: the mean and variance representing the Laplace distribution, expressed by (μ, σ), which is not limited in the embodiment of the present application.

Exemplarily, the probability model parameters may include: (μ, σ). Assume the attribute probability prediction network structure as shown in FIG6. X is the input attribute prediction information or attribute information, and the output is (μ, σ). Its network structure consists of multiple layers of sparse convolution (Sparse Convolution, SConv), and the output of the network is combined with X to determine the probability model parameters.

In some embodiments of the present application, the decoder inputs the attribute prediction information of each first voxel into the attribute probability prediction network for probability prediction, and determines at least one probability model parameter of each first voxel of the first-scale point cloud; the first voxel is a voxel corresponding to the first-scale point cloud; based on at least one probability model parameter of each first voxel of the first-scale point cloud, the occupancy probability of the first-scale point cloud is determined.

In the embodiment of the present application, the first scale corresponds to n first voxels. After the decoder determines the attribute prediction information of the n first voxels, the decoder can perform probability prediction on each first voxel in turn. The decoder inputs the attribute prediction information of the current first voxel into the attribute probability prediction network for probability prediction, and determines at least one probability model parameter of the current first voxel; and determines the occupancy probability of the current first voxel according to at least one probability model parameter of the current first voxel. Continue to perform probability prediction of the next first voxel until all first voxel predictions are completed, and obtain the occupancy probability of each first voxel of the first scale point cloud. Alternatively, the decoder inputs the attribute prediction information of the current first voxel into the attribute probability prediction network for probability prediction, and determines at least one probability model parameter of the current first voxel; and continues to perform probability prediction of the next first voxel until all first voxel predictions are completed, and obtains at least one probability model parameter of each first voxel of the first scale point cloud. According to at least one probability model parameter of each first voxel, the occupancy probability of each first voxel is determined respectively, thereby completing the determination of the occupancy probability of the first scale point cloud. That is, the occupancy probability of the first scale point cloud includes the occupancy probability of each first voxel.

It should be noted that the at least one probability model parameter may be two probability model parameters, or may be one or more other parameters that ensure the overall numerical performance, and the embodiments of the present application are not limited thereto.

It should be noted that, in the embodiment of the present application, each first voxel may contain at least one point. When the decoder performs probability prediction, the prediction is based on each point of each first voxel, and what is obtained is at least one probability model parameter of each point.

In some embodiments of the present application, the decoder may determine the probability density distribution of each first voxel based on at least one probability model parameter of at least one point contained in each first voxel of the first-scale point cloud; based on the probability density distribution of each first voxel, an integral operation may be performed within a preset range centered on each first voxel to determine the occupancy probability of each first voxel, thereby determining the occupancy probability of the first-scale point cloud.

It should be noted that, in the embodiment of the present application, after the probability prediction of each first voxel, at least one probability model parameter of each point on each first voxel is determined, and based on the at least one probability model parameter of each point on each first voxel, the probability density distribution of each first voxel can be determined. Taking each first voxel as the center, an integral calculation is performed within its preset range to obtain the occupancy probability of each first voxel.

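Exemplarily, as shown in formula (1), assuming that the attribute prediction information follows the Laplace distribution and that the at least one probability model parameter is (μ, σ), the probability density distribution function of each first voxel $x_i$ is determined according to the mean and variance $(\mu_i, \sigma_i)$ predicted for the points on that first voxel, and the integral operation within the preset range centered on $x_i$, written here as $[x_i - \delta, x_i + \delta]$, obtains its probability $p(x_i)$, i.e., the occupancy probability of the first voxel:

$$p(x_i) = \int_{x_i - \delta}^{x_i + \delta} \frac{1}{2\sigma_i} \exp\left(-\frac{\lvert x - \mu_i \rvert}{\sigma_i}\right) \mathrm{d}x \tag{1}$$

Here $\sigma_i$ is used as the scale parameter of the Laplace density.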
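The integral of a Laplace density over an interval has a closed form as a difference of cumulative distribution values, which the following Python sketch uses. It is an illustration under stated assumptions rather than the embodiment's implementation: `sigma` is treated as the Laplace scale parameter, and `half_width=0.5` (a unit-width bin around an integer attribute value) is an assumed choice for the preset range.

```python
import math


def laplace_cdf(x, mu, sigma):
    """Cumulative distribution of a Laplace law with location mu and scale sigma."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / sigma)
    return 1.0 - 0.5 * math.exp(-(x - mu) / sigma)


def interval_probability(x_i, mu, sigma, half_width=0.5):
    """Probability mass of the interval centered on x_i.

    half_width is an assumption: 0.5 gives one unit-width bin per integer value.
    """
    upper = laplace_cdf(x_i + half_width, mu, sigma)
    lower = laplace_cdf(x_i - half_width, mu, sigma)
    return upper - lower


if __name__ == "__main__":
    # Probability assigned to the value 8 when the network predicts mu=8, sigma=1.
    print(interval_probability(8, mu=8.0, sigma=1.0))
```

With unit-width bins over all integers, the probabilities sum to one, so they can be handed to an entropy coder as a probability table.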

S104: Decode the attribute encoding information based on the occupancy probability to determine attribute information of the first-scale point cloud.

In the embodiment of the present application, after the decoder obtains the occupancy probability information of each first voxel, it uses the occupancy probability of each first voxel to respectively decode the attribute encoding information corresponding to each first voxel of the first-scale point cloud to determine the attribute information corresponding to each first voxel.

In some embodiments of the present application, the decoder may decode the first-scale point cloud in units of voxels, and further decode the attribute information of each point contained in the first voxel.

It should be noted that the decoder can decode the attribute coding information of each first voxel based on the occupancy probability of the first voxel. The attribute coding information of the first-scale point cloud includes the attribute coding information corresponding to each first voxel.
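To make the link between the predicted probability and the coded size concrete, the following Python sketch computes the ideal code length of a set of voxels. It relies on the standard property that an ideal entropy coder spends about −log2(p) bits on a symbol assigned probability p; the function name is hypothetical and the sketch does not implement the arithmetic decoder itself.

```python
import math


def ideal_code_length_bits(probabilities):
    """Sum of -log2(p) over the probabilities assigned to the true attribute values."""
    return sum(-math.log2(p) for p in probabilities)


if __name__ == "__main__":
    # A sharper prediction (higher probability for the true value) costs fewer bits.
    print(ideal_code_length_bits([0.5, 0.25]))
    print(ideal_code_length_bits([0.9, 0.8]))
```

The decoder must derive exactly the same probabilities as the encoder, which is why both sides predict from already decoded lower-scale information.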

In an embodiment of the present application, the processing performed by the decoder between scales is as described in the aforementioned embodiment. In addition, within the same scale, the decoder can decode group by group according to a grouping of the voxels, can decode channel by channel according to the different color components, or can combine voxel grouping and color components for decoding. The embodiment of the present application imposes no limitation on this, and these options are described in the following embodiments.

It should be noted that the decoding method in the embodiment of the present application can be applied to scalable encoding and decoding. That is, for the attribute coding information of point clouds at multiple scales sent by the encoder, the decoder can decode in order from low scale to high scale and reconstruct the point cloud at any scale according to the actual decoding accuracy requirements. Exemplarily, the encoder writes the attribute coding information of the first-scale point cloud, the second-scale point cloud, and so on up to the (S+1)-th scale point cloud into the bitstream and sends it. According to the preset accuracy requirements, the decoder can decode from the (S+1)-th scale point cloud to the third-scale point cloud using the decoding method provided in the embodiment of the present application, end the decoding after obtaining the attribute information of the third-scale point cloud, and no longer decode the attribute coding information corresponding to the second-scale and first-scale point clouds. The specific selection is made according to the actual situation, and the embodiment of the present application is not limited thereto.

It can be understood that the decoding method provided in the embodiment of the present application can be repeatedly applied between multiple adjacent scales, and the decoding between each group of adjacent scales is independent of each other, so scale-scalable decoding can be flexibly implemented.

It should be noted that each decoding process of the above decoder uses the decoded low-scale point cloud as known information to decode the attribute encoding information of the high-scale point cloud. For the first decoding process of the decoder, the known information may be a preset number of unencoded point cloud information sent by the encoder side. The encoder may send a preset number of point cloud information, such as the attribute information of 100 points in the point cloud, as the first known information directly to the decoding end in an unencoded manner, so that the decoder does not need to decode the first known information, but directly uses the attribute information of the preset number of points sent by the encoder to predict the point cloud of the corresponding scale to continue the subsequent decoding process.
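The scale-by-scale control flow described above can be sketched as a simple loop. All names here are hypothetical, the payload list is assumed to be ordered from the lowest scale to the highest, and `decode_one_scale` stands in for the per-scale procedure S101 to S104; the sketch only shows how decoding can stop early at a target scale, starting from known information that was sent unencoded.

```python
def decode_scalable(initial_known, payloads_low_to_high, target_index, decode_one_scale):
    """Decode scales in order from low to high and stop at target_index.

    initial_known: attribute information sent unencoded by the encoder.
    payloads_low_to_high: per-scale attribute coding information, lowest scale first.
    target_index: index of the last scale to decode (0 = lowest scale).
    decode_one_scale: callable (payload, known_lower_scale) -> decoded scale.
    """
    known = initial_known
    for index, payload in enumerate(payloads_low_to_high):
        if index > target_index:
            break  # higher scales are left undecoded
        known = decode_one_scale(payload, known)
    return known


if __name__ == "__main__":
    # Toy stand-in: each "decoded scale" records which payloads were consumed.
    def toy_decode(payload, known):
        return known + [payload]

    print(decode_scalable([], ["scale_a", "scale_b", "scale_c"], 1, toy_decode))
```

Because each scale depends only on the scale immediately below it, stopping the loop early leaves a complete reconstruction at the target scale.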

It can be understood that, at the decoder, when decoding the first-scale point cloud, the attribute information of the decoded second-scale point cloud is used to perform first-scale prediction and determine the attribute prediction information of the first-scale point cloud. Then, the attribute prediction information is input into the attribute probability prediction network for probability prediction to determine the occupancy probability of the first-scale point cloud. The attribute probability prediction network is used in the lossless decoding process, and the occupancy probability is determined by the attribute probability prediction network during decoding. The use of the attribute probability prediction network avoids many processing operations, reduces the decoding complexity, speeds up the prediction of the occupancy probability, and improves the processing efficiency of the prediction process. In this way, when the occupancy probability is used to decode the attribute coding information, the decoding efficiency and decoding performance can be improved.

When decoding the first-scale point cloud, a plurality of first voxels of the first-scale point cloud may be grouped to implement the decoding process.

In some embodiments of the present application, as shown in FIG7 , the decoding method provided in the embodiment of the present application may further include:

S201, parsing a bit stream to determine attribute coding information corresponding to a first-scale point cloud.

It should be noted that the implementation principle of S201 is consistent with that of S101, and will not be repeated here.

S202, performing voxel upsampling on the second-scale point cloud to determine n first voxels of the first scale; n is an integer greater than 1; wherein the second-scale point cloud is decoded point cloud data of a parent node of a previous scale of the first-scale point cloud.

S203 . Determine attribute prediction information corresponding to n first voxels of the first scale according to the decoded attribute information of the second-scale point cloud, and then obtain attribute prediction information of the first-scale point cloud.

It should be noted that the implementation principle of S202 - S203 is consistent with the implementation principle of S102 and will not be repeated here.

S204, grouping voxels at the same position of each of the n first voxels to determine m groups of first voxels; m is an integer greater than or equal to 1.

In the embodiment of the present application, the decoder may group the n first voxels and carry out the processing that follows the coarse prediction group by group.

In an embodiment of the present application, the decoder can determine a numbering range based on the number of first voxels that each second voxel of the second-scale point cloud corresponds to after upsampling. Using this numbering range, the first voxels obtained by upsampling each second voxel are numbered in the same manner, so that each of the n first voxels obtains a number.

In some embodiments of the present application, among n first voxels, the decoder may group first voxels with the same number into a group according to the number of each first voxel, and determine m groups of first voxels, where m is an integer greater than or equal to 1.

Exemplarily, as shown in FIG8 , the p-1 scale point cloud, i.e., the second scale point cloud, is upsampled by 2×2×2 to obtain n first voxels of the p scale, i.e., n first voxels of the first scale. It can be seen that a second voxel of the p-1 scale point cloud corresponds to 8 first voxels of the p scale after upsampling, and the numbering range can be determined to be 1-8. For a second voxel in the p-1 scale point cloud, the 8 first voxels corresponding to the voxel in the p scale point cloud are numbered using numbers 1-8 respectively; for three second voxels in the p-1 scale point cloud, the 8 first voxels corresponding to the second voxel in each p-1 scale point cloud are numbered using the same numbering method, obtaining 3 first voxels numbered 1, 3 first voxels numbered 2, ..., to 3 first voxels numbered 8. The first grouping method: the decoder groups the first voxels according to the number of each first voxel, and divides the first voxels with the same number into a group, that is, the voxels at the same position of each first voxel are grouped as shown in Figure 8. For example, each first voxel numbered 1 is divided into a group, each first voxel numbered 2 is divided into a group, and so on, to obtain 8 groups of first voxels as m groups of first voxels, namely 8 groups from G1 to G8.

In the embodiment of the present application, other grouping methods may also be used, as follows:

The second grouping method: determine that number 1 and number 2 are consecutive numbers, number 3 and number 4 are consecutive numbers, and number 5 to number 8 are consecutive numbers; take the first voxels corresponding to number 1 and number 2 as a group, take the first voxels corresponding to number 3 and number 4 as a group, take the first voxels corresponding to number 5 to number 8 as a group, and obtain 3 groups of first voxels as the m groups of first voxels, namely the three groups G1G2, G3G4, and G5G6G7G8 as shown in FIG8.

The third grouping method: taking all n (n=24) first voxels as one group, which is equivalent to determining the numbers 1-8 as consecutive numbers, and obtaining one group of first voxels as the m groups of first voxels. That is, as shown in FIG8, G1G2G3G4G5G6G7G8 is a single group.
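The three grouping methods can be sketched as follows. This is an illustrative sketch only: the way a number 1-8 is derived from the coordinate bits (x, then y, then z) is an assumption, since the application only requires that the first voxels of every second voxel be numbered in the same manner.

```python
from collections import defaultdict

def child_number(coord):
    """Number 1-8 of a first voxel inside its parent's 2x2x2 block.
    The bit order (x, then y, then z) is an assumption of this sketch."""
    x, y, z = coord
    return 1 + (x & 1) * 4 + (y & 1) * 2 + (z & 1)

def group_voxels(coords, partition):
    """partition: list of tuples of numbers that share a group."""
    slot = {num: g for g, nums in enumerate(partition) for num in nums}
    groups = defaultdict(list)
    for c in coords:
        groups[slot[child_number(c)]].append(c)
    return [groups[g] for g in range(len(partition))]

FIRST = [(k,) for k in range(1, 9)]          # G1 ... G8
SECOND = [(1, 2), (3, 4), (5, 6, 7, 8)]      # G1G2, G3G4, G5G6G7G8
THIRD = [tuple(range(1, 9))]                 # all numbers in one group

coords = [(0, 0, 0), (0, 0, 1), (2, 1, 0), (3, 1, 1), (2, 0, 0)]
by_first = group_voxels(coords, FIRST)
by_second = group_voxels(coords, SECOND)
by_third = group_voxels(coords, THIRD)
```

With the first method, voxels that share a number land in the same group even when they belong to different second voxels; groups whose number is unoccupied are simply empty.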

S205, using an attribute probability prediction network, sequentially performing probability prediction on the attribute prediction information of the m groups of first voxels, and determining the m groups of occupancy probabilities, where the m groups of occupancy probabilities are the occupancy probabilities of the first-scale point cloud.

S206, decoding the attribute coding information corresponding to the first voxels in the m groups one by one according to the m groups of occupancy probabilities, and determining the attribute information of each of the first voxels in the m groups.

In the embodiment of the present application, the decoder divides the multiple first voxels corresponding to each second voxel into m groups of first voxels, and finally divides the n first voxels into m groups of first voxels. The decoder can decode in order from the first group to the last group in the grouping order.

In an embodiment of the present application, the decoder can use an attribute probability prediction network to perform probability prediction on the attribute prediction information of the m groups of first voxels in turn to determine the m groups of occupancy probabilities. The decoder decodes the attribute encoding information corresponding to the m groups of first voxels based on the m groups of occupancy probabilities, determines the attribute information of each of the m groups of first voxels, and completes the decoding of the first-scale point cloud.

In some embodiments of the present application, the decoder uses the attribute probability prediction network to perform probability prediction on the attribute prediction information of the m groups of first voxels in sequence to determine the m groups of occupancy probabilities. Within each group, the decoder processes the first voxels in turn, in either of two ways. In the first way, the decoder inputs the attribute prediction information of the current first voxel into the attribute probability prediction network, determines at least one probability model parameter of the current first voxel, and determines the occupancy probability of the current first voxel from that parameter; it then continues with the next first voxel in the group until all first voxels of all groups have been predicted, obtaining the occupancy probability of each first voxel in the m groups of the first-scale point cloud. In the second way, the decoder first determines at least one probability model parameter for every first voxel in the m groups in the same manner, and then determines the occupancy probability of each first voxel from its probability model parameters. Either way, the occupancy probabilities of the m groups of first voxels of the first-scale point cloud are determined. That is, the occupancy probability of the first-scale point cloud includes the occupancy probability of each first voxel of the m groups of first voxels.

It should be noted that the principle for determining the occupation probability of each first voxel in each group of first voxels is consistent with the implementation principle in S103, and will not be repeated here.
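To illustrate how probability model parameters can be turned into a probability for each candidate attribute value, the sketch below assumes, purely for illustration, that the network outputs a location `mu` and a scale `scale` of a Laplace distribution and that attribute values are 8-bit integers. The application only states that at least one probability model parameter is determined; the choice of distribution, the parameter names and the value range are assumptions.

```python
import math

def laplace_cdf(x, mu, scale):
    """Cumulative distribution function of a Laplace(mu, scale) variable."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / scale)
    return 1.0 - 0.5 * math.exp(-(x - mu) / scale)

def value_probabilities(mu, scale, levels=256):
    """Probability of each integer attribute value 0..levels-1."""
    mass = [laplace_cdf(k + 0.5, mu, scale) - laplace_cdf(k - 0.5, mu, scale)
            for k in range(levels)]
    total = sum(mass)
    return [m / total for m in mass]   # renormalise over the valid range

probs = value_probabilities(mu=7.2, scale=1.5)
most_likely = max(range(len(probs)), key=probs.__getitem__)
```

An entropy decoder would consume a table such as `probs` to recover the actual value from the bit stream; the more mass the table places on the true value, the fewer bits are needed.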

In some embodiments of the present application, the decoder decodes the attribute coding information of the first voxel in the i-th group according to the occupancy probability of the i-th group to determine the attribute information of the first voxel in the i-th group; i is an integer greater than or equal to 1 and less than or equal to m; the decoding of the next group is continued based on the occupancy probability of the i+1-th group until the decoding of the attribute coding information of the first voxel in the m-th group is completed, and the attribute information of each of the m groups of first voxels is determined.

It should be noted that in the embodiment of the present application, the decoder decodes the attribute coding information of the i-th group of first voxels using the i-th group of occupancy probabilities to parse the attribute information of each group of first voxels. However, when decoding first voxels at the same scale, an undecoded first voxel can also be determined from known information, namely the already decoded first voxels in other groups that correspond to the same second voxel, and the voxel of the parent node at the previous scale (the corresponding second voxel).

In an embodiment of the present application, the second-scale point cloud includes: K second voxels; K is an integer greater than or equal to 1 and less than the total number of voxels in the first-scale point cloud; the i-th group of first voxels includes: H first voxels; the i-th group of occupancy probabilities includes: H groups of sub-occupancy probabilities; H is an integer greater than or equal to 1 and less than or equal to a preset number.

It should be noted that the preset number is related to the sampling number of upsampling or downsampling, for example, if 2×2×2 upsampling is performed, the preset number is 8. The embodiment of the present application does not limit the sampling method.

In the embodiment of the present application, according to different grouping methods, each group of first voxels may include first voxels corresponding to multiple second voxels, for example, H first voxels, so the occupancy probability of the i-th group also corresponds to the H-group sub-occupancy probability.

In some embodiments of the present application, the decoder decodes the attribute encoding information of the first voxel in the i-th group according to the i-th group occupancy probability to determine the attribute information of the first voxel in the i-th group, including:

If the number of undecoded first voxels corresponding to the second voxel to which the jth first voxel belongs is greater than 1, the jth first voxel is decoded according to the jth group sub-occupancy probability to obtain the attribute information of the jth first voxel; the second voxel is the voxel where the parent node of the second scale corresponding to the point in the jth first voxel is located;

If the number of undecoded first voxels corresponding to the second voxel to which the j-th first voxel belongs is 1, then determining the attribute mean of the undecoded first voxel corresponding to the second voxel, the attribute mean being the attribute information of the j-th first voxel; the attribute mean is determined based on the attribute information of the decoded first voxel corresponding to the second voxel and the decoded attribute information of the second voxel;

The decoding of the j+1th first voxel is continued until j=H, and the attribute information of the i-th group of first voxels is determined.

In an embodiment of the present application, when decoding the j-th first voxel in the i-th group, the j-th first voxel corresponds to one second voxel of the second scale, and this second voxel corresponds to multiple first voxels of the first scale, of which the j-th first voxel is one. If at least two of these first voxels, including the j-th first voxel, are still undecoded, the j-th first voxel must be decoded based on the j-th group of sub-occupancy probabilities to obtain its attribute information, because its value cannot be inferred directly while several first voxels of the same second voxel remain undecoded. If the j-th first voxel is the only one left undecoded, its attribute information can be inferred directly from the attribute information of the second voxel and of the already decoded first voxels corresponding to that second voxel. That is, the attribute mean of the undecoded first voxels corresponding to the second voxel is the attribute information of the j-th first voxel, and there is no need to decode the j-th first voxel according to the j-th group of sub-occupancy probabilities. The decoder then continues with the (j+1)-th first voxel until j=H, at which point the attribute information of the i-th group of first voxels is determined.

It should be noted that in the embodiment of the present application, if some of the first voxels corresponding to a second voxel have already been decoded, the decoder can determine the attribute mean of the remaining undecoded first voxels from the attribute information of the decoded first voxels and the decoded attribute information of the second voxel. When only one first voxel corresponding to the second voxel is undecoded, this attribute mean is the attribute information of that first voxel, and there is no need to decode it based on the sub-occupancy probability. When more than one first voxel corresponding to the second voxel is undecoded, the attribute mean alone does not determine the individual values, so the decoder continues to decode based on the sub-occupancy probabilities of the undecoded first voxels until only one undecoded first voxel remains, whose attribute information can then be deduced directly.

Exemplarily, the decoder can determine the attribute mean of the undecoded first voxels corresponding to the second voxel according to the attribute information of the decoded first voxels corresponding to the second voxel and the decoded attribute information of the second voxel as follows: the attribute information of the second voxel (the parent node at the second scale) is known to be X, and it is assumed to correspond to a+b first voxels of the first scale, of which a are decoded and b are undecoded. If the values of the a decoded child voxels are {x_1, x_2, …, x_a}, the attribute mean of the remaining b undecoded first voxels can be inferred as

$$\bar{x} = \frac{(a+b)\,X - \sum_{i=1}^{a} x_i}{b}$$

At this time, if b=1, the attribute mean is the attribute information of the single undecoded first voxel.
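The inference of the attribute mean can be checked numerically with a short sketch. It assumes that the attribute X of the second voxel is the exact mean of its a+b first voxels, so that the mean of the b undecoded ones is ((a+b)·X − (x_1 + … + x_a))/b; the function name is hypothetical, and the sample values are those used in the FIG9 example.

```python
from fractions import Fraction

def undecoded_mean(parent_value, decoded_values, num_undecoded):
    """Mean of the undecoded first voxels of one second voxel.
    Assumes the parent value is the exact mean of all its children."""
    total_children = len(decoded_values) + num_undecoded
    remaining_sum = parent_value * total_children - sum(decoded_values)
    return Fraction(remaining_sum, num_undecoded)

m1 = undecoded_mean(7, [5], 2)      # two children still undecoded
m2 = undecoded_mean(4, [3], 1)      # one child left, value is implied
m3 = undecoded_mean(7, [5, 9], 1)   # one child left, value is implied
```

`Fraction` is used so that a non-integer mean is kept exact when b is greater than 1; when b=1 the result is the attribute value itself.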

Exemplarily, as shown in FIG9, assume that m=8. After the decoder upsamples the three second voxels of Scale p-1 and performs prediction, the attribute prediction information of the six first voxels (occupied voxels) of Scale p is determined, namely 8, 7, 7, 7, 4, 4. The n first voxels are divided into G1 (Group 1), G2 (Group 2), G3 (Group 3), G4 (Group 4), G5 (Group 5), G6 (Group 6), G7 (Group 7) and G8 (Group 8). Only some of these 8 groups contain first voxels; the voxels at the other positions are empty. The decoder performs probability prediction on each group of first voxels through the attribute probability prediction network (SAPA) to obtain the occupancy probability of each group (including the sub-occupancy probability of each first voxel). When decoding Group 1, the attribute coding information of each first voxel in the group is decoded according to the sub-occupancy probability of that first voxel. As shown in FIG9, after Group 1 is decoded, 5 and 3 are decoded first voxels. When decoding Group 2, it is known that 8 is a decoded first voxel. When decoding the second first voxel (at a) in Group 2, this first voxel corresponds to the second second voxel (at b). At this time, two first voxels of the second second voxel, the one at a and the one at c, have not been decoded, and their attribute mean is (7×3-5)/2=8, where 5 is the decoded first voxel corresponding to the second second voxel. Because more than one first voxel remains undecoded, the decoder needs to decode the second first voxel according to its sub-occupancy probability, obtaining attribute information 9. The first voxel at d corresponds to the third second voxel (at e) and is the only undecoded first voxel corresponding to it. Therefore, the attribute information of the first voxel at d is (4×2-3)/1=5, where 3 is the decoded first voxel corresponding to the third second voxel. When decoding Group 2, after the attribute information of the second first voxel (at a) has been decoded as 9, the first voxel at c is the only undecoded first voxel corresponding to the second second voxel (at b). Therefore, the attribute information of the first voxel at c is (7×3-(5+9))/1=7, where 5 and 9 are the decoded first voxels corresponding to the second second voxel. There are no occupied voxels in the other groups. At this point, the attribute information of Scale p has been obtained by decoding.

It should be noted that if a second voxel corresponds to a first voxel, the attribute information of the second voxel is the attribute information of the first voxel, such as the first voxel with attribute information 8 as shown in FIG. 9 .
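The decode-or-infer rule can be sketched for the first voxels of a single second voxel. This is a simplified illustration: `entropy_decode` is a hypothetical callable standing in for decoding with the sub-occupancy probability, the second voxel's attribute is assumed to be the exact mean of its first voxels, and the interleaving of different second voxels across groups is ignored.

```python
def decode_children(parent_value, num_children, entropy_decode):
    """Decode the first voxels of one second voxel in decoding order.
    Returns the values and, per value, whether the bit stream was used."""
    decoded, used_bitstream = [], []
    for _ in range(num_children):
        remaining = num_children - len(decoded)
        if remaining > 1:
            # Several children still unknown: decode from the bit stream.
            decoded.append(entropy_decode())
            used_bitstream.append(True)
        else:
            # Last undecoded child: its value is implied by the parent mean.
            decoded.append(parent_value * num_children - sum(decoded))
            used_bitstream.append(False)
    return decoded, used_bitstream

stream = iter([5, 9])                      # values carried by the bit stream
values, coded = decode_children(7, 3, lambda: next(stream))
single, single_coded = decode_children(8, 1, lambda: next(stream))
```

For the second voxel with attribute 7 and three first voxels, only 5 and 9 are read from the stream and 7 is inferred; for a second voxel with a single first voxel, nothing is read at all.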

In the embodiment of the present application, when the decoding of the m groups of first voxels is completed, the decoding of the first scale point cloud is ended, and the decoding process of the next scale point cloud is continued until all decoding is completed.

It can be understood that the group prediction and decoding method in the embodiment of the present application, compared with the method of predicting and decoding the occupancy probability of each voxel based on the parent node and neighbor nodes of each voxel, uses an attribute probability prediction network for decoding. On the one hand, it greatly reduces the decoding complexity and improves the decoding speed and compression performance; on the other hand, as the decoding proceeds, the attribute information of the first voxel that has been decoded can be used in the decoding inference process of the undecoded first voxel corresponding to the same second voxel, helping the decoding inference of the undecoded first voxel, thereby improving the decoding accuracy and efficiency.

When decoding the first-scale point cloud, different color components of the first-scale point cloud may be processed to implement the decoding process.

In some embodiments of the present application, as shown in FIG10 , the decoding method provided in the embodiment of the present application may further include:

S301, parsing a bit stream to determine attribute coding information corresponding to a first-scale point cloud.

It should be noted that the implementation principle of S301 is consistent with that of S101, and will not be repeated here.

S302 , predicting the first scale based on the decoded attribute information of the second scale point cloud to determine the attribute prediction information of the first scale point cloud; wherein the second scale point cloud is the decoded point cloud data of the previous scale parent node of the first scale point cloud.

It should be noted that the prediction of the first scale is performed based on the attribute information of the decoded different color components of the second scale point cloud, and the attribute prediction information of the different color components of the first scale is determined respectively. The implementation principle of performing the prediction of the first scale based on the attribute information of each color component decoded by the second scale point cloud and determining the attribute prediction information of each color component of the first scale is consistent with the implementation principle of S102, and will not be repeated here.

S303, using an attribute probability prediction network to perform probability prediction on the first color component of the attribute prediction information, and determining the occupancy probability of the first-scale point cloud under the first color component.

S304, based on the occupancy probability of the first color component, decoding the attribute encoding information of the first color component to obtain attribute information corresponding to the first color component of the first-scale point cloud.

S305, using the attribute probability prediction network, combined with the attribute information corresponding to the first color component of the first-scale point cloud, performing probability prediction on the second color component and the third color component of the attribute prediction information to determine the occupancy probability of the first-scale point cloud under the second color component and the occupancy probability under the third color component.

S306, based on the occupancy probability of the second color component, decoding the attribute encoding information of the second color component to obtain attribute information corresponding to the second color component of the first-scale point cloud.

S307, based on the occupancy probability of the third color component, decoding the attribute encoding information of the third color component to obtain attribute information corresponding to the third color component of the first-scale point cloud.

It should be noted that the attribute probability prediction network is used to perform probability prediction on the first color component of the attribute prediction information of each first voxel to determine the occupancy probability of each first voxel in the first scale point cloud under the first color component.

When determining the occupancy probability of each first voxel, the decoder can handle the color components of that first voxel separately: it obtains an occupancy probability for each of the three color channels (color components), and then decodes the attribute coding information of each color channel based on the occupancy probability of that channel.

In the embodiment of the present application, since the attribute information can be divided into information of three color components, the attribute information can be encoded according to different color components. Correspondingly, when the decoder performs decoding, it can be decoded based on different color components to obtain the attribute information of each first voxel under different color components.

It should be noted that, in the embodiment of the present application, the implementation principle of each first voxel under one color component is consistent with the implementation principle of S303 and the implementation principle of S103 determining the occupancy probability of the first voxel, and will not be repeated here.

The difference is that the decoder needs to determine, for each first voxel, a separate occupancy probability under each color component. After the first color component (brightness component) has been decoded, the decoder can also feed the attribute information of the first color component into the attribute probability prediction network as an additional input for the other color components, perform probability prediction to obtain the probability model parameters, and then obtain the occupancy probabilities corresponding to the other color components. That is to say, the attribute probability prediction network at this time has one more input parameter than the attribute probability prediction network in S103, but the principle of determining the occupancy probability from the probability model parameters is the same.

In some embodiments of the present application, the decoder first decodes the brightness component. For a first voxel, the decoder uses an attribute probability prediction network to perform a probability prediction on the first color component of the attribute prediction information of the first voxel, and determines the occupancy probability of the first voxel under the first color component; based on the occupancy probability under the first color component, the attribute encoding information of the first voxel under the first color component is decoded to obtain the attribute information corresponding to the first color component of the first voxel.

After obtaining the attribute information of the first voxel in the first color component, the decoder continues to use the attribute probability prediction network, combines the attribute information of the first voxel in the first color component, and performs probability prediction on the second color component and the third color component of the attribute prediction information of the first voxel, and determines the occupancy probability of the first voxel in the second color component and the occupancy probability of the first voxel in the third color component. The occupancy probability of the first voxel in the second color component and the occupancy probability of the first voxel in the third color component are continued to be used to decode the attribute information of the second color component and the attribute information of the third color component, thereby decoding the attribute information of the first voxel.

In the embodiment of the present application, after the decoder decodes the attribute information of the three color components of each first voxel, the decoding of the first-scale point cloud is completed, and the attribute information of the first-scale point cloud is obtained.
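The channel order of S303-S307 for one first voxel can be sketched as below. The callables `predict_probs` and `entropy_decode` are hypothetical stand-ins for the attribute probability prediction network and the entropy decoder, and the demo replaces them with trivial stubs so that the control flow can be followed.

```python
def decode_voxel_colors(prediction, predict_probs, entropy_decode):
    """prediction: dict with keys 'Y', 'Co', 'Cg' (attribute prediction info).
    predict_probs(channel, predicted_value, context) -> probability table.
    entropy_decode(channel, probs) -> decoded value."""
    decoded = {}
    # Luma first, with no additional context.
    p_y = predict_probs("Y", prediction["Y"], None)
    decoded["Y"] = entropy_decode("Y", p_y)
    # Chroma next; the decoded luma is an additional network input.
    probs = {ch: predict_probs(ch, prediction[ch], decoded["Y"])
             for ch in ("Co", "Cg")}
    for ch in ("Co", "Cg"):
        decoded[ch] = entropy_decode(ch, probs[ch])
    return decoded

calls = []

def fake_predict(channel, predicted, context):
    calls.append((channel, context))
    return {predicted: 1.0}              # degenerate table for the demo

def fake_decode(channel, probs):
    return max(probs, key=probs.get)     # picks the most probable value

out = decode_voxel_colors({"Y": 120, "Co": -3, "Cg": 4}, fake_predict, fake_decode)
```

The recorded `calls` show that the luma prediction receives no context, while both chroma predictions receive the decoded luma value.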

In an embodiment of the present application, the first color component is a brightness component, the second color component and the third color component are chromaticity components. For example, the first color component is a Y component, the second color component is a Co component, and the third color component is a Cg component. This embodiment of the present application does not limit this.

Exemplarily, as shown in FIG11 , within the same scale, the color space is converted from RGB to YCoCg space, and then the attribute prediction information of the Y component of Scale p is determined based on the attribute information of the decoded Y component of the second scale of Scale p-1, and the occupancy probability of the Y channel (Y component) is predicted by SAPA, and the attribute information under the Y component is decoded based on the occupancy probability. According to the attribute information of the decoded Co component and Cg component of the second scale of Scale p-1, the attribute prediction information of the Co component and Cg component of Scale p is determined respectively, and the occupancy probability of the Co component and the Cg component is predicted by SAPA in combination with the attribute information under the Y component, and the attribute information under the Co component and the Cg component is decoded based on the occupancy probability of the Co component and the Cg component.
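Because the scheme is lossless, the color conversion has to be exactly invertible on integers. The application only names the YCoCg space; the sketch below uses the well-known reversible lifting form (often called YCoCg-R) as one possible choice, which is an assumption rather than something stated in the application.

```python
def rgb_to_ycocg_r(r, g, b):
    """Forward reversible transform using integer lifting steps."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, co, cg

def ycocg_r_to_rgb(y, co, cg):
    """Inverse transform: undoes the lifting steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b

samples = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (12, 200, 77), (1, 2, 3)]
round_trip = [ycocg_r_to_rgb(*rgb_to_ycocg_r(*s)) for s in samples]
white = rgb_to_ycocg_r(255, 255, 255)
```

Each lifting step adds or subtracts a quantity that the inverse can recompute exactly, so every RGB triple is recovered bit for bit. Co and Cg can be negative and need one more bit than the RGB inputs.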

It can be understood that, compared with a method that predicts and decodes the occupancy probability without ordering the channels, the color channel prediction and decoding method in the embodiment of the present application greatly reduces decoding complexity by decoding with the attribute probability prediction network. As decoding proceeds, the attribute information of the decoded Y color component can be used in the decoding inference of the other, not yet decoded color components of the same first voxel, thereby improving decoding accuracy and efficiency.

When decoding the first-scale point cloud, multiple first voxels of the first-scale point cloud may be grouped, and different color components may be processed for each group of first voxels to implement the decoding process.

In some embodiments of the present application, as shown in FIG12 , the decoding method provided in the embodiment of the present application may further include:

S401, parsing a bit stream to determine attribute coding information corresponding to a first-scale point cloud.

It should be noted that the implementation principle of S401 is consistent with that of S101, and will not be repeated here.

S402, performing voxel upsampling on the second-scale point cloud to determine n first voxels of the first scale; n is an integer greater than 1; wherein the second-scale point cloud is decoded point cloud data of a parent node of a previous scale of the first-scale point cloud.

It should be noted that the implementation principle of S402 is consistent with that of S102, and will not be repeated here.

S403, determining attribute prediction information corresponding to the n first voxels of the first scale according to the decoded attribute information of the second-scale point cloud, and thereby obtaining attribute prediction information of the first-scale point cloud.

It should be noted that the prediction of the first scale is performed based on the attribute information of different color components decoded in the second scale point cloud, and the attribute prediction information of each of the n first voxels in the first scale in different color components is determined respectively. The implementation principle of performing the prediction of the first scale based on the attribute information of each color component decoded in the second scale point cloud and determining the attribute prediction information of each first voxel in the first scale in each color component is consistent with the implementation principle of S102, and will not be repeated here.

S404, grouping voxels at the same position of each of the n first voxels to determine m groups of first voxels; m is an integer greater than or equal to 1.

It should be noted that the implementation principle of S404 is consistent with that of S204, and will not be repeated here.

S405, using an attribute probability prediction network, under the first color component, sequentially performing probability prediction on the m groups of first voxels to determine the m groups of occupancy probabilities corresponding to the first color component.

S406, according to the m groups of occupancy probabilities corresponding to the first color component, decoding the attribute encoding information of the m groups of first voxels under the first color component to determine the attribute information corresponding to the first color component of each of the m groups of first voxels.

The attribute information corresponding to the first color component is used to determine the m-group occupancy probabilities of the second color component and the m-group occupancy probabilities of the third color component.

S407, using the attribute probability prediction network, combined with the attribute information corresponding to the decoded first color component, performing probability prediction on the m groups of first voxels in turn under the second color component and under the third color component respectively, and determining the m groups of occupancy probabilities corresponding to the second color component and the m groups of occupancy probabilities corresponding to the third color component.

S408, decoding the attribute encoding information of the m groups of first voxels under the second color component according to the m groups of occupancy probabilities corresponding to the second color component, and determining the attribute information corresponding to the second color component of each of the m groups of first voxels.

S409, according to the m groups of occupancy probabilities corresponding to the third color component, decoding the attribute encoding information of the m groups of first voxels under the third color component to determine the attribute information corresponding to the third color component of each of the m groups of first voxels.

It should be noted that the implementation principles of S405-S409 are consistent with those of S303-S307, and will not be repeated here.

The difference is that each first voxel here belongs to one of the m groups of first voxels, and the decoder decodes the m groups in sequence. The Y channel is decoded first using the group processing method in S402-S404 above; the Co and Cg channels are then decoded using the same grouping, with the decoded Y channel serving as auxiliary information for decoding the Co and Cg channels.
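The Y and CoCg channels referred to here come from a luma/chroma color transform. As a minimal sketch, assuming the reversible integer YCoCg-R variant is used so that attribute coding remains lossless (this passage does not state which variant applies), the forward and inverse transforms could look as follows. The function names are illustrative and are not taken from the application.

```python
def rgb_to_ycocg_r(r, g, b):
    """Forward YCoCg-R transform on integer RGB (assumed variant)."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, co, cg


def ycocg_r_to_rgb(y, co, cg):
    """Inverse YCoCg-R transform; exactly undoes rgb_to_ycocg_r."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b
```

Because every step is an integer addition or shift that the inverse undoes in reverse order, the round trip is exact, which is what a lossless attribute codec would require of its color transform.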

It should be noted that the decoder divides the scale point cloud to be decoded into a multi-layer structure, including: 1) downsampling into multiple scales; 2) the point cloud of the same scale is divided into multiple groups of voxels (for example, 8 groups) according to coordinates or positions; 3) the point cloud of the same scale is divided into a brightness channel (Y) and two chrominance channels (CoCg) according to the color channel. The decoder progressively predicts the probability distribution of each layer of point cloud in sequence from low scale to high scale, from group 1 to group 8, and from brightness channel to chrominance channel, and uses it for lossless entropy decoding. In this progressive prediction process, the decoded layer is known information and can be used as context. The probability prediction model based on sparse tensor neural network helps predict the remaining undecoded groups.
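The progressive order described above (scale, then channel, then group) can be sketched as a simple nested loop. This is only an illustration of the ordering, assuming 8 groups per scale and treating the two chrominance channels as one stage that follows Y; `progressive_order` is a hypothetical helper name.

```python
def progressive_order(num_scales, num_groups=8):
    """List (scale, stage, group) tuples in the assumed coding order.

    Scale 0 is the lowest scale; within a scale the Y stage is processed
    for all groups before the CoCg stage.
    """
    order = []
    for scale in range(num_scales):              # low scale to high scale
        for stage in ("Y", "CoCg"):              # luminance before chrominance
            for group in range(1, num_groups + 1):   # group 1 to group 8
                order.append((scale, stage, group))
    return order
```

Everything that appears earlier in this list is already decoded when a given layer is processed, so it is available as context for the probability prediction of that layer.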

It can be understood that the learning-based lossless attribute compression algorithm, with the grouping, channel prediction and decoding method of the embodiment of the present application, layers the point cloud in multiple ways (by scale, by group and by channel) according to the characteristics of point cloud data and neural network tools. Compared with predicting and decoding the occupancy probability of each voxel based on the parent node and neighbor nodes of each voxel, decoding with the attribute probability prediction network on the one hand greatly reduces the decoding complexity and improves the decoding speed and compression performance. On the other hand, as decoding proceeds, the attribute information of a decoded first voxel can be used in the decoding inference of the undecoded first voxels corresponding to the same second voxel, and the attribute information of the decoded Y color component can be input into the attribute probability prediction network and used in the decoding inference of the other undecoded color components of the same first voxel to help predict the occupancy probability of the current first voxel, thereby improving the accuracy and efficiency of decoding.

The following describes an encoding method applied to an encoder provided in an embodiment of the present application.

Referring to FIG. 13 , FIG. 13 is a schematic diagram of an optional flow chart of an encoding method provided in an embodiment of the present application, the method comprising:

S501, down-sampling the point cloud data in sequence until it is divided into a single voxel, to obtain multiple scale point clouds; the multiple scale point clouds include: a first scale point cloud and a second scale point cloud; the second scale point cloud is the point cloud data of the previous scale parent node of the first scale point cloud.

In the embodiment of the present application, the encoding object of the encoder is a point cloud or a point or node in the point cloud data. Since the encoder can obtain multiple strips, i.e., slices, by spatially dividing the point cloud data during encoding, the encoding method provided in the embodiment of the present application is performed on the point cloud in each slice.

In an embodiment of the present application, during encoding the encoder divides the point cloud to be encoded of each slice, or the point cloud data as a whole when slice division is not performed, into point cloud structures of different scales with corresponding attribute information. Attribute encoding is therefore performed scale by scale, starting from the point cloud with the smallest scale.

In an embodiment of the present application, when encoding point cloud data, the encoder first performs voxel downsampling, dividing the point cloud data to be encoded into successively lower scales to obtain point clouds of multiple scales.

In some embodiments of the present application, when downsampling the point cloud data in sequence until it is divided into a single voxel, the encoder determines a quantization error each time the point cloud data is downsampled to the next scale, thereby obtaining point clouds of multiple scales and multiple quantization errors; the multiple quantization errors include the quantization error produced when the first-scale point cloud is downsampled to the second-scale point cloud.

It should be noted that the encoder performs voxel downsampling on the point cloud data to obtain a first-scale point cloud, continues to perform voxel downsampling on the first-scale point cloud to obtain a next-scale point cloud, and continues to perform downsampling until only one voxel remains, thereby obtaining multiple-scale point clouds.
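The repeated voxel downsampling can be sketched on the geometry alone. The example below assumes non-negative integer voxel coordinates and a factor of 2 per axis at every step (the 2×2×2 case); `build_scales` is a hypothetical name used only for illustration.

```python
def build_scales(voxels):
    """Return the list of voxel sets from the input scale down to one voxel.

    `voxels` is an iterable of (x, y, z) tuples with non-negative integer
    coordinates. Each step halves the coordinates, so up to 2x2x2 child
    voxels collapse into one parent voxel.
    """
    scales = [set(voxels)]
    while len(scales[-1]) > 1:
        coarser = {(x // 2, y // 2, z // 2) for (x, y, z) in scales[-1]}
        scales.append(coarser)
    return scales
```

For example, `build_scales({(0, 0, 0), (1, 1, 1), (3, 2, 0)})` yields three scales, the last of which contains the single voxel `(0, 0, 0)`. Any two adjacent entries of the returned list play the roles of the first-scale and second-scale point clouds.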

In the embodiment of the present application, the first scale point cloud and the second scale point cloud are two adjacent scale point clouds among the multiple scale point clouds.

The following describes a process of dividing a first-scale point cloud into a second-scale point cloud.

In an embodiment of the present application, the encoder performs voxel downsampling on the first-scale point cloud to obtain a second-scale point cloud. The attribute information of each second voxel of the second-scale point cloud is the mean of the attributes of the first voxels within the corresponding downsampling range at the first scale. This mean is generally a fractional value. The embodiment of the present application quantizes it with a rounding operation, and the fractional part that is removed is the quantization error. Therefore, a quantization error is generated each time a high-scale point cloud is downsampled and quantized to obtain the attribute information of the low-scale point cloud.

Exemplarily, the encoder can perform 2×2×2 voxel downsampling on the first-scale point cloud to obtain a second-scale point cloud. The attribute information of each second voxel of the second-scale point cloud is the mean of the attributes of the first voxels within the corresponding 2×2×2 range at the first scale. Since the number of points in each 2×2×2 voxel set is between 1 and 8, the corresponding quantization error is restricted to a finite number of cases: {{0},{0,1/2},{0,1/3,2/3},{0,1/4,2/4,3/4},…,{0,1/8,2/8,3/8,4/8,5/8,6/8,7/8}}.
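The finite set of cases listed above follows from the fact that the mean of k integer attribute values is an integer sum divided by k, so its fractional part is one of 0, 1/k, …, (k-1)/k. A small sketch that enumerates these sets with exact fractions is given below; `possible_fractional_parts` is an illustrative name.

```python
from fractions import Fraction


def possible_fractional_parts(max_children=8):
    """For k = 1..max_children occupied children, list the fractional
    parts that the mean of k integer attribute values can take."""
    return [
        sorted({Fraction(r, k) for r in range(k)})
        for k in range(1, max_children + 1)
    ]
```

Note that `Fraction` reduces values automatically, so 2/4 appears as 1/2; the number of distinct cases for k children is still k.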

In the embodiment of the present application, when the encoder obtains point clouds at multiple scales, multiple quantization errors are generated accordingly.

In the embodiment of the present application, each quantization error is the value of the attribute mean of the high-scale voxels minus the value of the low-scale attribute information finally determined, which may be positive or negative.
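A signed quantization error of this kind can be sketched as follows. The sketch assumes round-half-up as the rounding rule, which the text does not specify, and uses exact fractions so that the error can later be added back without loss; `quantize_mean` is a hypothetical helper.

```python
import math
from fractions import Fraction


def quantize_mean(child_values):
    """Return (quantized_mean, error) for the integer attributes of the
    occupied first voxels under one second voxel.

    error = exact mean - quantized mean, so it can be positive or negative.
    """
    mean = Fraction(sum(child_values), len(child_values))
    quantized = math.floor(mean + Fraction(1, 2))  # assumed round-half-up
    return quantized, mean - quantized
```

For instance, `quantize_mean([10, 11, 13])` gives a quantized value of 11 with error 1/3, while `quantize_mean([10, 11])` gives 11 with error -1/2.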

In an embodiment of the present application, in the voxelization process, a voxel that contains at least one point of the point cloud is an occupied voxel (i.e., a non-empty voxel), and an unoccupied voxel (i.e., an empty voxel) indicates that there is no point of the point cloud at that voxel position. In some embodiments, occupied voxels may be marked with different attribute information, and unoccupied voxels may have no attribute information. In this way, point clouds of different scales after voxelization can represent the attribute information of point clouds of different scales through the information of voxels at various positions in the voxel grid.

S502. In the process of encoding the first-scale point cloud, predict the first scale based on the attribute information of the second-scale point cloud to determine attribute prediction information of the first-scale point cloud.

In the embodiment of the present application, the encoder performs inter-scale encoding in order from low scale to high scale. When the encoder encodes the first-scale point cloud, the encoding of the second-scale point cloud has already been completed. The second-scale point cloud is the decoded point cloud data of the previous-scale parent node of the first-scale point cloud, and the second scale is smaller than the first scale.

In some embodiments of the present application, the encoder determines reconstructed geometric information of the first-scale point cloud.

It should be noted that when the encoder performs attribute encoding, it has already completed the geometric encoding process and can obtain the reconstructed geometric information of point clouds of different scales. Therefore, while upsampling the second-scale point cloud during attribute encoding, the encoder can determine, based on the reconstructed geometric information of the first-scale point cloud, which of the first voxels obtained by the upsampling are occupied and which are not.

In some embodiments of the present application, the encoder uses quantization error to dequantize the decoded attribute information of the second-scale point cloud to determine the revised attribute information of the second-scale point cloud; predicts the first scale based on the revised attribute information of the second-scale point cloud and the reconstructed geometric information of the first-scale point cloud to determine the attribute prediction information of the first-scale point cloud.

In some embodiments of the present application, the encoder writes multiple quantization errors into a bitstream.

In some embodiments of the present application, the encoder writes the reconstructed geometric information of the first-scale point cloud into a bitstream.

It should be noted that the encoder can use the attribute information of the second-scale point cloud to predict the first scale and determine the attribute prediction information of the first-scale point cloud. The process is consistent with the implementation and description process of using the decoded attribute information of the second-scale point cloud to predict the first scale and determine the attribute prediction information of the first-scale point cloud in S102 on the decoder side, and will not be repeated here.

S503: Using an attribute probability prediction network, perform probability prediction on the attribute prediction information to determine the occupancy probability of the first scale point cloud.

In some embodiments of the present application, the encoder inputs the attribute prediction information of each first voxel into the attribute probability prediction network for probability prediction, and determines at least one probability model parameter of each first voxel of the first-scale point cloud; the first voxel is a voxel corresponding to the first-scale point cloud; based on at least one probability model parameter of each first voxel of the first-scale point cloud, the occupancy probability of the first-scale point cloud is determined.

In some embodiments of the present application, the probability density distribution of each first voxel is determined based on at least one probability model parameter of at least one point contained in each first voxel of the first-scale point cloud; based on the probability density distribution of each first voxel, an integral operation is performed within a preset range centered on each first voxel to determine the occupancy probability of each first voxel, thereby determining the occupancy probability of the first-scale point cloud.
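The integration step can be illustrated with a concrete density. The sketch below assumes, purely for illustration, that the network outputs the location `mu` and scale `b` of a Laplace distribution and that the preset range is half a quantization step on either side of an integer attribute value; neither choice is stated in this passage.

```python
import math


def laplace_cdf(x, mu, b):
    """Cumulative distribution function of a Laplace(mu, b) density."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1.0 - 0.5 * math.exp(-(x - mu) / b)


def symbol_probability(value, mu, b, half_width=0.5):
    """Probability mass of `value`: the density integrated over
    [value - half_width, value + half_width]."""
    return laplace_cdf(value + half_width, mu, b) - laplace_cdf(value - half_width, mu, b)
```

Integrating over unit-width intervals makes the masses of all integer values sum to one, so the result can be used directly as the symbol probability for entropy coding.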

It should be noted that the encoder can use an attribute probability prediction network to perform probability prediction on the attribute prediction information and determine the occupancy probability of the first-scale point cloud. The process is consistent with the implementation and description process of using an attribute probability prediction network in S103 on the decoder side to perform probability prediction on the attribute prediction information and determine the occupancy probability of the first-scale point cloud, and will not be repeated here.

S504: Encode the attribute information of the first-scale point cloud based on the occupancy probability to determine the attribute encoding information of the first-scale point cloud.

In the embodiment of the present application, after the encoder obtains the occupancy probability information of each first voxel, it uses the occupancy probability of each first voxel to encode the attribute information corresponding to each first voxel of the first-scale point cloud respectively, and determines the attribute coding information corresponding to each first voxel.
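Why an accurate occupancy probability matters can be seen from the ideal cost of entropy coding: a symbol coded with probability p costs about -log2(p) bits under an ideal arithmetic coder. The sketch below only estimates that cost; it is not an entropy coder, and `ideal_code_length_bits` is a hypothetical helper.

```python
import math


def ideal_code_length_bits(probabilities):
    """Sum of -log2(p) over the probabilities assigned to the actual
    attribute values of the coded voxels."""
    return sum(-math.log2(p) for p in probabilities)
```

The closer the predicted probability of each true attribute value is to 1, the fewer bits the attribute encoding information of the first voxels requires.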

In some embodiments of the present application, the encoder may encode the first-scale point cloud in units of voxels, and further encode attribute encoding information of each point contained in the first voxel.

It should be noted that the encoder can encode the attribute information of each first voxel based on the occupancy probability of the first voxel. The attribute information of the first-scale point cloud includes the attribute information corresponding to each first voxel.

In some embodiments of the present application, the encoder continues to encode the third-scale point cloud until the encoding of multiple-scale point clouds is completed, thereby obtaining attribute encoding information of the multiple-scale point clouds.

In the embodiment of the present application, the encoder writes the attribute encoding information of multiple scale point clouds into the bitstream respectively for use by the decoder during decoding.

In an embodiment of the present application, the processing between scales performed by the encoder is as described in the aforementioned embodiment. In addition, the encoder can perform group-wise encoding according to the grouping of voxels at the same scale, can perform channel-wise encoding of different color channels according to the different color components, and can also combine voxel grouping and different color components for encoding. The embodiment of the present application does not limit this, and these variants are described in the following embodiments.

It can be understood that the encoding method provided in the embodiment of the present application can be repeatedly applied between multiple adjacent scales, and the encoding between each group of adjacent scales is independent of each other, so scale-scalable encoding can be flexibly implemented.

It should be noted that each encoding process of the above encoder uses the attribute information of the encoded low-scale point cloud as known information to encode the attribute information of the high-scale point cloud. For the first encoding process of the encoder, the known information may be the attribute information of a preset number of unencoded point clouds sent by the encoder side. The encoder may use a preset number of point cloud information, such as the attribute information of 100 points in the point cloud, as the first known information, and send it directly to the decoding end in an unencoded manner, so that the decoder does not need to decode the first known information, but directly uses the attribute information of the preset number of points sent by the encoder to predict the point cloud of the corresponding scale to continue the subsequent decoding process.

It can be understood that, in the encoder, through processing between scales, the first scale is predicted based on the attribute information of the second-scale point cloud to obtain the attribute prediction information of the first-scale point cloud, and the attribute prediction information is then input into the attribute probability prediction network for probability prediction to determine the occupancy probability of the first-scale point cloud. Because the occupancy probability used in the lossless coding process is determined by the attribute probability prediction network, many processing operations are avoided and the coding complexity is reduced. At the same time, the occupancy probability is predicted faster, which improves the processing efficiency of the prediction process. Encoding the attribute information with this occupancy probability can therefore improve coding efficiency and coding performance.

In some embodiments of the present application, as shown in FIG. 14, the encoding method provided in the embodiment of the present application may further include:

S601, down-sampling the point cloud data in sequence until it is divided into a single voxel, to obtain multiple-scale point clouds; the multiple-scale point clouds include: a first-scale point cloud and a second-scale point cloud; the second-scale point cloud is the point cloud data of the previous-scale parent node of the first-scale point cloud.

It should be noted that the implementation principle and description of S601 are consistent with those of S501, and will not be repeated here.

S602, performing voxel upsampling on the second-scale point cloud to determine n first voxels of the first scale; n is an integer greater than 1.

S603. Determine attribute prediction information corresponding to n first voxels of the first scale according to the attribute information of the second-scale point cloud, and then obtain the attribute prediction information of the first-scale point cloud.

In some embodiments of the present application, the encoder uses quantization error to dequantize the attribute information of the second-scale point cloud to determine the revised attribute information of the second-scale point cloud; the revised attribute information of the second voxel of the second-scale point cloud to which each first voxel of the n first voxels of the first scale belongs is determined as the attribute prediction information corresponding to each first voxel.
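The dequantize-then-predict step can be sketched as follows, assuming 2×2×2 upsampling, exact fractional quantization errors stored per second voxel, and first-voxel coordinates taken from the reconstructed geometry. The function and argument names are hypothetical.

```python
from fractions import Fraction


def predict_first_scale(second_attrs, second_errors, first_coords):
    """Assign to every occupied first voxel the revised (dequantized)
    attribute of the second voxel it belongs to.

    second_attrs / second_errors map a parent coordinate to its quantized
    attribute and its quantization error; first_coords lists the occupied
    (x, y, z) voxels of the first scale.
    """
    prediction = {}
    for (x, y, z) in first_coords:
        parent = (x // 2, y // 2, z // 2)
        prediction[(x, y, z)] = second_attrs[parent] + second_errors[parent]
    return prediction
```

All first voxels under the same second voxel receive the same prediction, namely the exact attribute mean of that second voxel before quantization.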

S604, grouping voxels at the same position of each of the n first voxels to determine m groups of first voxels; m is an integer greater than or equal to 1.

S605, using an attribute probability prediction network, sequentially perform probability prediction on the attribute prediction information of the m groups of first voxels to determine the m groups of occupancy probabilities, where the m groups of occupancy probabilities are the occupancy probabilities corresponding to the first scale point cloud.

It should be noted that the implementation principle and description of S602-S605 are consistent with those of S202-S205, and will not be repeated here.

S606, encoding the attribute information corresponding to the first voxels in the m groups one by one according to the m groups of occupancy probabilities, and determining the attribute encoding information of each of the first voxels in the m groups.

It should be noted that, in the embodiment of the present application, the encoder divides the multiple first voxels corresponding to each second voxel into m groups of first voxels, and finally divides the n first voxels into m groups of first voxels. The encoder can encode in sequence from the first group to the last group in the order of grouping.
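Grouping by position can be sketched for the 2×2×2 case, where each first voxel occupies one of 8 positions inside its second voxel and voxels at the same position form one group (m = 8). The numbering of the positions below is an assumption made for illustration, as are the function names.

```python
def group_index(coord):
    """Group number (1..8) from the position of a first voxel inside its
    parent second voxel; the numbering convention is assumed."""
    x, y, z = coord
    return 1 + (x % 2) * 4 + (y % 2) * 2 + (z % 2)


def group_voxels(first_coords):
    """Split occupied first voxels into 8 groups, preserving input order."""
    groups = {g: [] for g in range(1, 9)}
    for coord in first_coords:
        groups[group_index(coord)].append(coord)
    return groups
```

Each group then contains at most one first voxel per second voxel, which is what allows a whole group to be predicted in one pass while the earlier groups serve as context.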

In an embodiment of the present application, the encoder may use an attribute probability prediction network to perform probability prediction on the attribute prediction information of the m groups of first voxels in sequence to determine the m groups of occupancy probabilities. The encoder encodes the attribute information corresponding to the m groups of first voxels according to the m groups of occupancy probabilities, determines the attribute coding information of each of the m groups of first voxels, completes the encoding of the first-scale point cloud, and continues to encode the next scale until the point clouds of all scales are encoded.

In some embodiments of the present application, the encoder uses an attribute probability prediction network to perform probability prediction on the attribute prediction information of the m groups of first voxels in sequence to determine the m groups of occupancy probabilities. The probability prediction for the first voxels in each group can proceed in either of two ways.

In the first way, the encoder performs probability prediction on each first voxel in each group in sequence: it inputs the attribute prediction information of the current first voxel into the attribute probability prediction network for probability prediction, determines at least one probability model parameter of the current first voxel, and determines the occupancy probability of the current first voxel according to the at least one probability model parameter. It then continues with the next first voxel in the group until the prediction of all first voxels of all groups is completed, obtaining the occupancy probability of each first voxel in the m groups of the first-scale point cloud.

Alternatively, the encoder inputs the attribute prediction information of the current first voxel into the attribute probability prediction network for probability prediction, determines at least one probability model parameter of the current first voxel, and continues with the next first voxel in the group until the prediction of all first voxels of all groups is completed, obtaining at least one probability model parameter of each first voxel in the m groups of the first-scale point cloud. The occupancy probability of each first voxel is then determined from its at least one probability model parameter, which completes the determination of the occupancy probabilities of the m groups of first voxels of the first-scale point cloud. That is, the occupancy probability of the first-scale point cloud includes the occupancy probability of each first voxel of the m groups of first voxels.

In some embodiments of the present application, the encoder encodes the attribute information of the first voxel in the i-th group according to the occupancy probability of the i-th group to determine the attribute coding information of the first voxel in the i-th group; i is an integer greater than or equal to 1 and less than or equal to m; the encoding of the next group is continued based on the occupancy probability of the i+1-th group until the encoding of the attribute information of the first voxel in the m-th group is completed, and the attribute coding information of each of the m groups of first voxels is determined.

It should be noted that, in the embodiment of the present application, the encoder encodes the attribute information of the i-th group of first voxels using the i-th group occupancy probability, and in this way encodes the attribute information of every group of first voxels. However, in the process of encoding the first voxels of the same scale, the unencoded first voxels in different groups that correspond to the same second voxel can also be handled based on the known information.

In some embodiments of the present application, the second-scale point cloud includes: K second voxels; K is an integer greater than or equal to 1 and less than the total number of voxels in the first-scale point cloud; the i-th group of first voxels includes: H first voxels; the i-th group of occupancy probabilities includes: H groups of sub-occupancy probabilities; H is an integer greater than or equal to 1 and less than or equal to a preset number.

In some embodiments of the present application, the encoder encodes the attribute information of the first voxel of the i-th group according to the i-th group occupancy probability, and determines the attribute encoding information of the first voxel of the i-th group, including:

If the number of unencoded first voxels corresponding to the second voxel to which the jth first voxel belongs is greater than 1, the jth first voxel is encoded according to the jth group sub-occupancy probability to obtain the attribute encoding information of the jth first voxel; the second voxel is the voxel where the parent node of the second scale corresponding to the point in the jth first voxel is located;

The encoding of the j+1th first voxel is continued until j=H, and the attribute encoding information of the i-th group of first voxels is determined.

In some embodiments of the present application, before the encoding of the j+1th first voxel is continued until j=H and the attribute encoding information of the i-th group of first voxels is determined, if the number of unencoded first voxels corresponding to the second voxel to which the jth first voxel belongs is 1, the jth first voxel is not encoded.

In an embodiment of the present application, for the encoding of the jth first voxel in the i-th group, the jth first voxel corresponds to one second voxel of the second scale, and that second voxel corresponds to multiple first voxels of the first scale, of which the jth first voxel is one. If at least two of these first voxels, including the jth first voxel, are still unencoded, the jth first voxel must still be encoded based on the jth group sub-occupancy probability to obtain its attribute encoding information. This is because, with several first voxels of the second voxel still unencoded, the decoder cannot recover them by direct inference, so the encoder must continue to encode the first voxel according to the sub-occupancy probability for use in decoding. If the jth first voxel is the only one of these first voxels left unencoded, the decoder can directly infer its attribute information from the attribute information of the second voxel and of the already decoded first voxels corresponding to that second voxel. That is, the attribute mean of the undecoded first voxels corresponding to the second voxel is then exactly the attribute information of the jth first voxel, so the jth first voxel does not need to be decoded according to the jth group sub-occupancy probability and therefore does not need to be encoded. Encoding then continues with the j+1th first voxel until j=H, and the attribute encoding information of the i-th group of first voxels is determined.

It should be noted that, in the embodiment of the present application, when some of the first voxels corresponding to a second voxel have already been encoded, the decoder can determine the attribute mean of the undecoded first voxels corresponding to that second voxel from the attribute information of the decoded first voxels and the decoded attribute information of the second voxel. Therefore, when only one unencoded first voxel remains for the second voxel, the decoder can infer it during decoding and it does not need to be encoded based on the sub-occupancy probability. If more than one unencoded first voxel remains, the decoder cannot infer them individually, so encoding based on the sub-occupancy probabilities continues until only one unencoded first voxel remains; that voxel is not encoded, and the decoder infers its attribute information directly.
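The skip rule and the corresponding inference on the decoder side can be sketched as follows. The sketch assumes 2×2×2 grouping, that the revised (dequantized) attribute of a second voxel is the exact mean of its occupied first voxels, and that the number of occupied first voxels is known from the geometry; `plan_coding` and `infer_last_child` are hypothetical names.

```python
from collections import Counter
from fractions import Fraction


def plan_coding(groups):
    """Walk the groups in coding order and decide, per first voxel,
    whether it is entropy coded or left for the decoder to infer."""
    remaining = Counter(
        (x // 2, y // 2, z // 2) for group in groups for (x, y, z) in group
    )
    coded, skipped = [], []
    for group in groups:
        for (x, y, z) in group:
            parent = (x // 2, y // 2, z // 2)
            # More than one uncoded sibling left: this voxel must be coded.
            (coded if remaining[parent] > 1 else skipped).append((x, y, z))
            remaining[parent] -= 1
    return coded, skipped


def infer_last_child(parent_mean, num_children, decoded_values):
    """Recover the single remaining first voxel from the exact parent mean
    and the already decoded siblings."""
    return int(parent_mean * num_children - sum(decoded_values))
```

With a parent mean of 34/3 over three occupied first voxels and decoded siblings 10 and 11, `infer_last_child` returns 13 without any bits being spent on the last voxel.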

It can be understood that the group prediction and encoding method in the embodiment of the present application, compared with the method of predicting and encoding the occupancy probability of each voxel based on the parent node and neighbor nodes of each voxel, uses the attribute probability prediction network for encoding. On the one hand, it greatly reduces the encoding complexity and improves the encoding speed and compression performance; on the other hand, the group encoding method during encoding can improve the accuracy and efficiency of encoding.

In some embodiments of the present application, as shown in FIG. 15, the encoding method provided in the embodiment of the present application may further include:

S701, down-sampling the point cloud data in sequence until it is divided into a single voxel, to obtain multiple-scale point clouds; the multiple-scale point clouds include: a first-scale point cloud and a second-scale point cloud; the second-scale point cloud is the point cloud data of the previous-scale parent node of the first-scale point cloud.

It should be noted that the implementation principle and description of S701 are consistent with those of S501, and will not be repeated here.

S702: In the process of encoding the first-scale point cloud, predict the first scale based on the attribute information of the second-scale point cloud to determine the attribute prediction information of the first-scale point cloud.

It should be noted that the implementation principle and description of S702 are consistent with those of S502, and will not be repeated here.

S703: Use an attribute probability prediction network to perform probability prediction on the first color component of the attribute prediction information, and determine the occupancy probability of the first scale point cloud under the first color component.

S704. Encode the attribute information of the first color component based on the occupancy probability of the first color component to obtain attribute encoding information corresponding to the first color component of the first-scale point cloud.

S705. Using an attribute probability prediction network, combined with the attribute information corresponding to the first color component of the first-scale point cloud, probability prediction is performed on the second color component and the third color component of the attribute prediction information to determine the occupancy probability of the first-scale point cloud under the second color component and the occupancy probability under the third color component.

S706. Encode the attribute information of the second color component based on the occupancy probability of the second color component to obtain attribute encoding information corresponding to the second color component of the first-scale point cloud.

S707. Encode the attribute information of the third color component based on the occupancy probability of the third color component to obtain attribute encoding information corresponding to the third color component of the first-scale point cloud.

It should be noted that the implementation principles of S703-S707 are consistent with those of S303-S307, and will not be repeated here.

The difference is that in the embodiment of the present application, after the occupancy probabilities of different color components are determined, the corresponding attribute information is encoded based on the occupancy probabilities to obtain the attribute encoding information.

It can be understood that, compared with predicting and encoding the occupancy probability without channel ordering, the color channel prediction and encoding method in the embodiment of the present application greatly reduces the encoding complexity when encoding with the attribute probability prediction network. Moreover, as the encoding proceeds, the attribute information of the encoded Y color component can be used in the encoding inference of the other unencoded color components corresponding to the same first voxel, thereby realizing the encoding of the unencoded first voxel and improving the accuracy and efficiency of the encoding.

In some embodiments of the present application, as shown in FIG. 16, the encoding method provided in the embodiment of the present application may further include:

S801, down-sampling the point cloud data in sequence until it is divided into a single voxel, to obtain multiple-scale point clouds; the multiple-scale point clouds include: a first-scale point cloud and a second-scale point cloud; the second-scale point cloud is the point cloud data of the previous-scale parent node of the first-scale point cloud.

It should be noted that the implementation principle and description of S801 are consistent with those of S501, and will not be repeated here.

S802, performing voxel upsampling on the second-scale point cloud to determine n first voxels of the first scale; n is an integer greater than 1.

It should be noted that the implementation principle and description of S802 are consistent with those of S602, and will not be repeated here.

S803. Determine attribute prediction information corresponding to n first voxels at the first scale according to the attribute information of the second-scale point cloud, and then obtain the attribute prediction information of the first-scale point cloud.

It should be noted that the implementation principle and description of S803 are consistent with those of S603, and will not be repeated here.

S804, grouping voxels at the same position of each of the n first voxels to determine m groups of first voxels; m is an integer greater than or equal to 1.

It should be noted that the implementation principle and description of S804 are consistent with those of S604, and will not be repeated here.
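The grouping in S804 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the parent of a voxel at (x, y, z) is the voxel at (x // 2, y // 2, z // 2), so that the "same position" within a parent is given by the parity of each coordinate; the function name `group_by_child_position` and the sample coordinates are hypothetical and do not come from the embodiment.

```python
from collections import defaultdict


def group_by_child_position(voxels):
    """Group voxel coordinates by their position inside the 2x2x2 parent cell.

    Assumption: the parent of (x, y, z) is (x // 2, y // 2, z // 2), so the
    position inside the parent is the parity of each coordinate.
    """
    groups = defaultdict(list)
    for x, y, z in voxels:
        index = ((x & 1) << 2) | ((y & 1) << 1) | (z & 1)  # value in 0..7
        groups[index].append((x, y, z))
    # Return only the non-empty groups, in a fixed order, so that an encoder
    # and a decoder would traverse them identically.
    return [groups[i] for i in sorted(groups)]


voxels = [(0, 0, 0), (1, 0, 0), (2, 2, 0), (3, 2, 1), (2, 3, 1)]
groups = group_by_child_position(voxels)
m = len(groups)  # number of non-empty groups for this toy input
```

With a 2x2x2 split there are at most 8 groups, which matches the "for example, 8 groups" case mentioned in the description; other grouping rules are equally possible.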

S805, using an attribute probability prediction network, under the first color component, sequentially perform probability prediction on the m groups of first voxels to determine the m groups of occupancy probabilities corresponding to the first color component.

S806, encoding attribute information of the m groups of first voxels under the first color component according to the m groups of occupancy probabilities corresponding to the first color component, and determining attribute coding information corresponding to the first color component of each of the m groups of first voxels.

The attribute information corresponding to the first color component is used to determine the m-group occupancy probability of the second color component and the m-group occupancy probability of the third color component.

S807, using an attribute probability prediction network, combined with the attribute information corresponding to the first color component, respectively, under the second color component and the third color component, perform probability prediction on the m groups of first voxels in turn to determine the m groups of occupancy probabilities corresponding to the second color component and the m groups of occupancy probabilities corresponding to the third color component.

S808. Encode the attribute information of the m groups of first voxels under the second color component according to the m groups of occupancy probabilities corresponding to the second color component, and determine the attribute coding information corresponding to the second color component of each of the m groups of first voxels.

S809, encoding the attribute information of the m groups of first voxels under the third color component according to the m groups of occupancy probabilities corresponding to the third color component, and determining the attribute coding information corresponding to the third color component of each of the m groups of first voxels.

It should be noted that the implementation principle of S805-S809 is consistent with that of S405-S409, which will not be repeated here.

It should be noted that the encoder divides the scale point cloud to be encoded into a multi-layer structure, including: 1) downsampling into multiple scales; 2) the point cloud of the same scale is divided into multiple groups of voxels (for example, 8 groups) according to coordinates or positions; 3) the point cloud of the same scale is divided into a brightness channel (Y) and two chroma channels (CoCg) according to the color channel. The encoder progressively predicts the probability distribution of each layer of point cloud in sequence from low scale to high scale, from group 1 to group 8, and from brightness channel to chroma channel, and uses it for lossless entropy coding. In this progressive prediction process, the encoded layer is known information and can be used as context. The probability prediction model based on sparse tensor neural network helps predict the remaining unencoded groups.
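The progressive order described above can be sketched as a simple traversal. The nesting used here (scale outermost, then color channel, then group) is one reading of S805-S809, in which all m groups are processed under the first color component before the chroma components; whether the two chroma components are handled one after the other or jointly is not fixed by this sketch. The function name `coding_order` and the channel labels are hypothetical.

```python
def coding_order(num_scales, num_groups=8, channels=("Y", "Co", "Cg")):
    """Yield (scale, channel, group) tuples in one possible coding order."""
    for scale in range(num_scales):                  # low scale first
        for channel in channels:                     # luma before chroma
            for group in range(1, num_groups + 1):   # group 1 .. group m
                yield scale, channel, group


order = list(coding_order(num_scales=2, num_groups=2))
# Every layer that appears earlier in `order` is already coded when a later
# layer is reached, so it may serve as context for the probability prediction.
```

Because the same traversal can be reproduced on the decoding side, the context available at each step is identical for the encoder and the decoder.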

It can be understood that the learning-based lossless attribute compression algorithm, with the grouping and channel prediction and encoding method in the embodiment of the present application, layers the point cloud in multiple ways (by scale, by group and by channel) according to the characteristics of point cloud data and neural network tools. Compared with predicting and encoding the occupancy probability of each voxel from its parent node and neighbor nodes, encoding with the attribute probability prediction network in this way, on the one hand, greatly reduces the encoding complexity and improves the encoding speed and compression performance. On the other hand, as the encoding proceeds, the already-encoded attribute information of the Y color component can be input into the attribute probability prediction network during the encoding inference for the other, unencoded color components of the same first voxel, to help predict the occupancy probability of the current first voxel, thereby improving the accuracy and efficiency of the encoding.

Below, in conjunction with Figure 17, the application of the encoding and decoding method provided in an embodiment of the present application in an actual scenario is explained.

Take the adjacent scales Scale p-1, Scale p, and Scale p+1 as examples for explanation. Scale p+1 is downsampled and quantized to obtain Scale p, and the quantization error 1 is saved. Scale p is further downsampled and quantized to obtain Scale p-1, and the quantization error 2 is saved. This continues until all of the multiple scales are obtained.

Encode from the lowest scale point cloud. When encoding to Scale p, Scale p-1 has been encoded. The encoder can perform dequantization based on the attribute information of Scale p-1 and the quantization error 2 to obtain the modified attribute prediction information of Scale p: 8, 7, 7, 7, 4, 4. The encoder performs probability prediction based on the modified attribute prediction information and the attribute probability prediction network, and then encodes the attribute information of Scale p based on the occupancy probability. Continue to encode Scale p+1. The encoder can perform dequantization based on the attribute information of Scale p and the quantization error 1 to obtain the modified attribute prediction information of Scale p+1: 7.5, 7.5, 5, 5, 8.5, 8.5, 7, 7, 3.5, 3.5, 4.5, 4.5. The encoder performs probability prediction based on the modified attribute prediction information and the attribute probability prediction network, and then encodes the attribute information of Scale p+1 based on the occupancy probability until all scale point clouds are encoded, and writes the quantization error and attribute encoding information into the bitstream.

In the decoding process, decoding is performed from the lowest scale point cloud. When decoding to Scale p, Scale p-1 has been decoded. The decoder can perform dequantization based on the decoded attribute information of Scale p-1 according to the quantization error 2 to obtain the modified attribute prediction information of Scale p: 8, 7, 7, 7, 4, 4. The decoder performs probability prediction based on the modified attribute prediction information and the attribute probability prediction network, and then decodes the attribute coding information of Scale p obtained from the bitstream based on the occupancy probability. Continuing to decode Scale p+1, the decoder can perform dequantization based on the decoded attribute information of Scale p according to the quantization error 1 to obtain the modified attribute prediction information of Scale p+1: 7.5, 7.5, 5, 5, 8.5, 8.5, 7, 7, 3.5, 3.5, 4.5, 4.5. The decoder performs probability prediction based on the modified attribute prediction information and the attribute probability prediction network, and then decodes the attribute coding information of Scale p+1 obtained from the bitstream based on the occupancy probability until all scale point clouds are decoded.
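The role of the saved quantization error can be illustrated with a small sketch. It assumes that downsampling averages the child attributes and that quantization rounds the mean down to an integer; the description does not fix either choice, and the sample values below are hypothetical rather than those of Figure 17. The function names `downsample` and `predict_children` are likewise illustrative only.

```python
import math


def downsample(children):
    """Average the child attributes and quantize the mean to an integer.

    Returns the quantized parent attribute and the quantization error.
    Assumption: quantization rounds down; the description does not say so.
    """
    mean = sum(children) / len(children)
    quantized = math.floor(mean)
    return quantized, mean - quantized


def predict_children(quantized, error, num_children):
    """Dequantize the parent and use it as the prediction for every child."""
    return [quantized + error] * num_children


children = [9, 6]                     # hypothetical attributes at Scale p+1
parent, error = downsample(children)  # kept for Scale p: attribute and error
prediction = predict_children(parent, error, len(children))
```

Under these assumptions the dequantized parent value is the exact mean of its children, which would explain why the modified prediction information can take half-integer values that repeat for sibling voxels.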

An embodiment of the present application provides a code stream, which is generated by bit encoding according to information to be encoded; wherein the information to be encoded includes at least: attribute encoding information of multiple scale point clouds, multiple quantization errors, and reconstructed geometric information of multiple scale point clouds, wherein the reconstructed geometric information of the multiple scale point clouds includes the reconstructed geometric information of the first scale point cloud.

Based on the aforementioned inventive concept, an embodiment of the present application provides a decoder 1, as shown in FIG18 , comprising:

The decoding part 10 is configured to parse the code stream and determine the attribute coding information corresponding to the first scale point cloud;

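The first prediction part 11 is configured to perform a prediction of the first scale based on the decoded attribute information of the second scale point cloud, and determine the attribute prediction information of the first scale point cloud, wherein the second scale point cloud is the decoded point cloud data of the previous scale parent node of the first scale point cloud; and to perform probability prediction on the attribute prediction information using an attribute probability prediction network to determine the occupancy probability of the first scale point cloud;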

The decoding part 10 is further configured to decode the attribute encoding information based on the occupancy probability to determine the attribute information of the first-scale point cloud.

In some embodiments of the present application, the first-scale point cloud includes: n first voxels; n is an integer greater than 1;

The first prediction part 11 is further configured to group voxels at the same position of each of the n first voxels to determine m groups of first voxels; m is an integer greater than or equal to 1;

The attribute probability prediction network is used to sequentially perform probability prediction on the attribute prediction information of the m groups of first voxels to determine m groups of occupancy probabilities, where the m groups of occupancy probabilities are the occupancy probabilities corresponding to the first-scale point cloud.

In some embodiments of the present application, the decoding part 10 is further configured to decode the attribute encoding information corresponding to the m groups of first voxels one by one according to the m groups of occupancy probabilities, and determine the attribute information of each of the m groups of first voxels.

In some embodiments of the present application, the decoding part 10 is further configured to decode the attribute encoding information of the first voxel of the i-th group according to the i-th group occupancy probability to determine the attribute information of the first voxel of the i-th group; i is an integer greater than or equal to 1 and less than or equal to m;

The next group of decoding is continued based on the (i+1)th group of occupancy probabilities until the attribute coding information of the first voxels of the mth group is decoded, and the attribute information of each of the first voxels of the m groups is determined.

In some embodiments of the present application, the second-scale point cloud includes: K second voxels; K is an integer greater than or equal to 1 and less than the total number of voxels in the first-scale point cloud; the i-th group of first voxels includes: H first voxels; the i-th group of occupancy probabilities includes: H groups of sub-occupancy probabilities; H is an integer greater than or equal to 1 and less than or equal to a preset number;

The decoding part 10 is further configured to decode the jth first voxel according to the jth group sub-occupancy probability to obtain the attribute information of the jth first voxel if the number of undecoded first voxels corresponding to the second voxel to which the jth first voxel belongs is greater than 1; the second voxel is the voxel where the parent node of the second scale corresponding to the point in the jth first voxel is located;

If the number of undecoded first voxels corresponding to the second voxel to which the j-th first voxel belongs is 1, determine the attribute mean of the undecoded first voxel corresponding to the second voxel, wherein the attribute mean is the attribute information of the j-th first voxel; the attribute mean is determined according to the attribute information of the decoded first voxels corresponding to the second voxel and the decoded attribute information of the second voxel.
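The case in which only one first voxel of a second voxel remains undecoded can be sketched as follows. The sketch assumes that the corrected attribute of the second voxel is the exact mean of the attributes of all its first voxels, so that the last one follows by subtraction and no bits need to be read for it; the function name `infer_last_child` and the numbers are hypothetical.

```python
def infer_last_child(parent_mean, decoded_children, num_children):
    """Recover the one remaining child attribute from the parent's mean.

    Assumption: parent_mean is the exact (dequantized) mean of all
    num_children child attributes.
    """
    if len(decoded_children) != num_children - 1:
        raise ValueError("exactly one child must remain undecoded")
    return parent_mean * num_children - sum(decoded_children)


last = infer_last_child(parent_mean=7.5, decoded_children=[9, 8, 6], num_children=4)
```

In this toy case the four children sum to 4 x 7.5 = 30, and the three decoded values sum to 23, so the remaining attribute is 7.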

In some embodiments of the present application, the first prediction part 11 is further configured to use the attribute probability prediction network to perform probability prediction on the first color component of the attribute prediction information, and determine the occupancy probability of the first scale point cloud under the first color component.

In some embodiments of the present application, the decoding part 10 is further configured to decode the attribute encoding information under the first color component based on the occupancy probability under the first color component to obtain the attribute information corresponding to the first color component of the first-scale point cloud.

In some embodiments of the present application, the first prediction part 11 is further configured to adopt the attribute probability prediction network, combine the attribute information corresponding to the first color component of the first scale point cloud, perform probability prediction on the second color component and the third color component of the attribute prediction information, and determine the occupancy probability of the first scale point cloud under the second color component and the occupancy probability under the third color component;

The decoding part 10 is further configured to decode the attribute encoding information under the second color component based on the occupancy probability under the second color component to obtain the attribute information corresponding to the second color component of the first-scale point cloud;

Based on the occupancy probability of the third color component, the attribute encoding information of the third color component is decoded to obtain attribute information corresponding to the third color component of the first-scale point cloud.

In some embodiments of the present application, the first prediction part 11 is further configured to use the attribute probability prediction network to sequentially perform probability prediction on the m groups of first voxels under the first color component to determine the m groups of occupancy probabilities corresponding to the first color component;

The attribute probability prediction network is used, combined with the attribute information corresponding to the decoded first color component, to perform probability prediction on the m groups of first voxels in turn under the second color component and the third color component, respectively, to determine the m groups of occupancy probabilities corresponding to the second color component and the m groups of occupancy probabilities corresponding to the third color component.

In some embodiments of the present application, the decoding part 10 is further configured to decode the attribute encoding information of the m groups of first voxels under the first color component according to the m groups of occupancy probabilities corresponding to the first color component, and determine the attribute information corresponding to the first color component of each of the m groups of first voxels; the attribute information corresponding to the first color component is used to determine the m groups of occupancy probabilities of the second color component and the m groups of occupancy probabilities of the third color component;

According to the m groups of occupancy probabilities corresponding to the second color component, the attribute encoding information of the m groups of first voxels is decoded under the second color component to determine the attribute information corresponding to the second color component of each of the m groups of first voxels;

According to the m groups of occupancy probabilities corresponding to the third color component, the attribute encoding information of the m groups of first voxels is decoded under the third color component to determine the attribute information corresponding to the third color component of each of the m groups of first voxels.

In some embodiments of the present application, the first prediction part 11 is further configured to input the attribute prediction information of each first voxel into the attribute probability prediction network for probability prediction, and determine at least one probability model parameter of each first voxel of the first scale point cloud; the first voxel is a voxel corresponding to the first scale point cloud;

An occupancy probability of the first-scale point cloud is determined according to at least one probability model parameter of each first voxel of the first-scale point cloud.

In some embodiments of the present application, the first prediction part 11 is further configured to determine the probability density distribution of each first voxel according to at least one probability model parameter of at least one point included in each first voxel of the first scale point cloud;

According to the probability density distribution of each first voxel, an integral operation is performed within a preset range centered on each first voxel to determine the occupation probability of each first voxel, thereby determining the occupation probability of the first-scale point cloud.
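As an illustration of the integral operation, the following sketch turns probability model parameters into a probability for each candidate attribute value. It assumes a Laplace density with location `mu` and scale `b`, an integration range of plus or minus 0.5 around each integer value, and 8-bit attribute values; none of these choices is stated in this passage, and the parameter values are hypothetical stand-ins for network outputs.

```python
import math


def laplace_cdf(x, mu, b):
    """Cumulative distribution function of a Laplace(mu, b) distribution."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1.0 - 0.5 * math.exp(-(x - mu) / b)


def value_probability(value, mu, b, half_width=0.5):
    """Integrate the density over [value - half_width, value + half_width]."""
    upper = laplace_cdf(value + half_width, mu, b)
    lower = laplace_cdf(value - half_width, mu, b)
    return upper - lower


# Hypothetical parameters that a prediction network might output for one voxel.
mu, b = 7.5, 1.0
probs = [value_probability(v, mu, b) for v in range(256)]
```

The resulting table of probabilities is what an entropy coder would consume; values close to the predicted location `mu` receive the highest probability and therefore the shortest codes.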

In some embodiments of the present application, the decoding part 10 is further configured to determine the decoded attribute information of the second scale point cloud;

Determine, from the bitstream, a quantization error when downsampling the first-scale point cloud to the second-scale point cloud;

From the bitstream, the reconstructed geometric information of the first scale point cloud is determined.

In some embodiments of the present application, the first prediction part 11 is further configured to use the quantization error to dequantize the decoded attribute information of the second scale point cloud to determine the corrected attribute information of the second scale point cloud;

Prediction of the first scale is performed according to the corrected attribute information of the second-scale point cloud and the reconstructed geometric information of the first-scale point cloud to determine attribute prediction information of the first-scale point cloud.

In some embodiments of the present application, the first prediction part 11 is further configured to perform voxel upsampling on the second scale point cloud to determine n first voxels of the first scale; n is an integer greater than 1;

According to the decoded attribute information of the second-scale point cloud, attribute prediction information corresponding to the n first voxels of the first scale is determined, thereby obtaining the attribute prediction information of the first-scale point cloud.

The corrected attribute information of the second voxel of the second-scale point cloud to which each of the n first voxels of the first scale belongs is determined as the attribute prediction information corresponding to each first voxel.

The embodiment of the present application provides a decoder, as shown in FIG19 , including:

A first memory 12, configured to store executable instructions;

The first processor 13 is configured to implement a decoding method of the decoder when executing the executable instructions stored in the first memory 12.

The embodiment of the present application provides an encoder 2, as shown in FIG20 , including:

The division part 20 is configured to sequentially perform voxel downsampling on the point cloud data until a single voxel is divided to obtain a plurality of scale point clouds; the plurality of scale point clouds include: a first scale point cloud and a second scale point cloud; the second scale point cloud is the point cloud data of the previous scale parent node of the first scale point cloud;

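The second prediction part 21 is configured to perform a first-scale prediction based on the attribute information of the second-scale point cloud during encoding of the first-scale point cloud, and determine the attribute prediction information of the first-scale point cloud; and to perform probability prediction on the attribute prediction information using an attribute probability prediction network to determine the occupancy probability of the first-scale point cloud;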

The encoding part 22 is configured to encode the attribute information of the first-scale point cloud based on the occupancy probability, and determine the attribute encoding information of the first-scale point cloud.

The second prediction part 21 is further configured to group voxels at the same position of each of the n first voxels to determine m groups of first voxels; m is an integer greater than or equal to 1;

In some embodiments of the present application, the encoding part 22 is further configured to encode the attribute information corresponding to the m groups of first voxels one by one according to the m groups of occupancy probabilities, and determine the attribute encoding information of each of the m groups of first voxels.

In some embodiments of the present application, the encoding part 22 is further configured to encode the attribute information of the first voxel of the i-th group according to the i-th group occupancy probability to determine the attribute encoding information of the first voxel of the i-th group; i is an integer greater than or equal to 1 and less than or equal to m;

The next group of encoding is continued based on the (i+1)th group of occupancy probabilities until the encoding of the attribute information of the first voxels of the mth group is completed, and the attribute encoding information of each of the first voxels of the m groups is determined.

The encoding part 22 is further configured to encode the jth first voxel according to the jth group sub-occupancy probability to obtain attribute encoding information of the jth first voxel if the number of unencoded first voxels corresponding to the second voxel to which the jth first voxel belongs is greater than 1; the second voxel is the voxel where the parent node of the second scale corresponding to the point in the jth first voxel is located;

In some embodiments of the present application, the encoding part 22 is further configured to continue encoding the j+1th first voxel until j=H, and before determining the attribute encoding information of the i-th group of first voxels, if the number of unencoded first voxels corresponding to the second voxel to which the j-th first voxel belongs is 1, encoding of the j-th first voxel is not performed.

In some embodiments of the present application, the second prediction part 21 is further configured to use the attribute probability prediction network to perform probability prediction on the first color component of the attribute prediction information to determine the occupancy probability of the first scale point cloud under the first color component.

In some embodiments of the present application, the encoding part 22 is further configured to encode the attribute information under the first color component based on the occupancy probability under the first color component to obtain the attribute encoding information corresponding to the first color component of the first-scale point cloud.

In some embodiments of the present application, the second prediction part 21 is further configured to adopt the attribute probability prediction network, combine the attribute information corresponding to the first color component of the first scale point cloud, perform probability prediction on the second color component and the third color component of the attribute prediction information, and determine the occupancy probability of the first scale point cloud under the second color component and the occupancy probability under the third color component;

The encoding part 22 is further configured to encode the attribute information under the second color component based on the occupancy probability under the second color component to obtain the attribute encoding information corresponding to the second color component of the first-scale point cloud;

Based on the occupancy probability of the third color component, the attribute information of the third color component is encoded to obtain attribute encoding information corresponding to the third color component of the first-scale point cloud.

In some embodiments of the present application, the second prediction part 21 is further configured to use the attribute probability prediction network to sequentially perform probability prediction on the m groups of first voxels under the first color component to determine the m groups of occupancy probabilities corresponding to the first color component;

The attribute probability prediction network is used, combined with the attribute information corresponding to the first color component, to perform probability prediction on the m groups of first voxels in turn under the second color component and the third color component, respectively, to determine the m groups of occupancy probabilities corresponding to the second color component and the m groups of occupancy probabilities corresponding to the third color component.

In some embodiments of the present application, the encoding part 22 is further configured to encode the attribute information of the m groups of first voxels under the first color component according to the m groups of occupancy probabilities corresponding to the first color component, and determine the attribute encoding information corresponding to the first color component of each of the m groups of first voxels; the attribute information corresponding to the first color component is used to determine the m groups of occupancy probabilities of the second color component and the m groups of occupancy probabilities of the third color component;

According to the m groups of occupancy probabilities corresponding to the second color component, encoding the attribute information of the m groups of first voxels under the second color component, and determining the attribute encoding information corresponding to the second color component of each of the m groups of first voxels;

According to the m groups of occupancy probabilities corresponding to the third color component, the attribute information of the m groups of first voxels is encoded under the third color component, and the attribute encoding information corresponding to the third color component of each of the m groups of first voxels is determined.

In some embodiments of the present application, the second prediction part 21 is further configured to input the attribute prediction information of each first voxel into the attribute probability prediction network for probability prediction, and determine at least one probability model parameter of each first voxel of the first scale point cloud; the first voxel is a voxel corresponding to the first scale point cloud;

In some embodiments of the present application, the second prediction part 21 is further configured to determine the probability density distribution of each first voxel according to at least one probability model parameter of at least one point included in each first voxel of the first scale point cloud;

In some embodiments of the present application, the encoder 2 further includes: a determination part 23;

The determination part 23 is also configured to determine multiple quantization errors each time the point cloud data is downsampled to the next scale in the process of sequentially downsampling the point cloud data until it is divided into a single voxel, thereby obtaining the multiple-scale point clouds; the multiple quantization errors include: the quantization error when the first-scale point cloud is downsampled to the second-scale point cloud.

The determining part 23 is further configured to determine the reconstructed geometric information of the first-scale point cloud.

In some embodiments of the present application, the second prediction part 21 is further configured to use the quantization error to dequantize the decoded attribute information of the second scale point cloud to determine the corrected attribute information of the second scale point cloud;

In some embodiments of the present application, the encoding part 22 is further configured to continue encoding the third-scale point cloud until the encoding of the multiple-scale point clouds is completed, thereby obtaining attribute encoding information of the multiple-scale point clouds.

In some embodiments of the present application, the encoder 2 further includes: a writing part 24;

The writing part 24 is further configured to write the attribute encoding information of the multiple scale point clouds into the code stream respectively.

The writing part 24 is further configured to write the plurality of quantization errors into the bit stream.

The writing part 24 is further configured to write the reconstructed geometric information of the first-scale point cloud into the bitstream.

In some embodiments of the present application, the second prediction part 21 is further configured to perform voxel upsampling on the second scale point cloud to determine n first voxels of the first scale; n is an integer greater than 1;

According to the attribute information of the second-scale point cloud, attribute prediction information corresponding to the n first voxels of the first scale is determined, and then the attribute prediction information of the first-scale point cloud is obtained.

In some embodiments of the present application, the second prediction part 21 is further configured to use the quantization error to dequantize the attribute information of the second scale point cloud to determine the corrected attribute information of the second scale point cloud;

The embodiment of the present application provides an encoder, as shown in FIG21, including:

A second memory 25, configured to store executable instructions;

The second processor 26 is configured to implement the encoding method of the encoder when executing the executable instructions stored in the second memory 25.

An embodiment of the present application provides a computer-readable storage medium storing executable instructions. When the executable instructions are executed by a first processor, the first processor will be caused to execute any one of the decoding methods provided by the embodiments of the present application; or, when the executable instructions are executed by a second processor, the second processor will be caused to execute any one of the encoding methods provided by the embodiments of the present application.

In some embodiments, the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface storage, optical disk, or CD-ROM; or it may be various devices including one or any combination of the above memories.

In some embodiments, executable instructions may be in the form of a program, software, software module, script or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine or other unit suitable for use in a computing environment.

As an example, executable instructions may, but do not necessarily, correspond to a file in a file system. They may be stored as part of a file that holds other programs or data, for example in one or more scripts in a Hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files storing one or more modules, subroutines, or code portions).

By way of example, executable instructions may be deployed to be executed on one computing device, or on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.

Those skilled in the art will appreciate that the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the present application may adopt the form of hardware embodiments, software embodiments, or embodiments in combination with software and hardware. Moreover, the present application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage, etc.) that contain computer-usable program code.

The present application is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present application. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and any combination of processes and/or blocks in the flowcharts and/or block diagrams, can be realized by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for realizing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured product including an instruction device that implements the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device so that a series of operational steps are executed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

The above is only a preferred embodiment of the present application and is not intended to limit the protection scope of the present application. Any modifications, equivalent replacements and improvements made within the spirit and scope of the present application are included in the protection scope of the present application.

Industrial Applicability

In an embodiment of the present application, when decoding a first-scale point cloud, the decoder uses the attribute information of a decoded second-scale point cloud to perform first-scale prediction and obtain the attribute prediction information of the first-scale point cloud. The attribute prediction information is then input into an attribute probability prediction network for probability prediction, and the occupancy probability of the first-scale point cloud is determined. The attribute probability prediction network is used in a lossless decoding process, and the occupancy probability is determined by the network during decoding. Using the attribute probability prediction network avoids many processing operations, reduces decoding complexity, and speeds up the prediction of the occupancy probability, thereby improving the processing efficiency of the prediction process. Decoding the attribute coding information with this occupancy probability can therefore improve decoding efficiency and decoding performance.

Claims

A decoding method, comprising:

Parse the bitstream to determine the attribute coding information corresponding to the first-scale point cloud;

Predicting the first scale based on the decoded attribute information of the second scale point cloud, and determining the attribute prediction information of the first scale point cloud; wherein the second scale point cloud is the decoded point cloud data of the parent node of the previous scale of the first scale point cloud;

Using an attribute probability prediction network, performing probability prediction on the attribute prediction information to determine the occupancy probability of the first scale point cloud;

The attribute encoding information is decoded based on the occupancy probability to determine attribute information of the first-scale point cloud.
The method according to claim 1, wherein the first-scale point cloud comprises: n first voxels; n is an integer greater than 1;

Using the attribute probability prediction network to perform probability prediction on the attribute prediction information to determine the occupancy probability of the first-scale point cloud includes:

Based on grouping voxels at the same position of each of the n first voxels, m groups of first voxels are determined; m is an integer greater than or equal to 1;

The attribute probability prediction network is used to sequentially perform probability prediction on the attribute prediction information of the m groups of first voxels to determine m groups of occupancy probabilities, where the m groups of occupancy probabilities are the occupancy probabilities corresponding to the first-scale point cloud.
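The grouping step can be illustrated with a minimal sketch. It assumes, as one possible reading of "voxels at the same position", that the parent of a voxel (x, y, z) is (x // 2, y // 2, z // 2) and that the position inside the parent is the octant index formed by the low bit of each coordinate. The function name `group_by_child_position` and the sample coordinates are hypothetical and do not come from the application.

```python
from collections import defaultdict

def group_by_child_position(voxels):
    """Group first-scale voxels by their position inside the parent voxel.

    Assumption (not stated in the application): the position inside the
    parent is the octant index built from the low bit of x, y and z.
    """
    groups = defaultdict(list)
    for x, y, z in voxels:
        octant = ((x & 1) << 2) | ((y & 1) << 1) | (z & 1)
        groups[octant].append((x, y, z))
    # Return only the non-empty groups, in a fixed order, so that an encoder
    # and a decoder would traverse them identically.
    return [groups[k] for k in sorted(groups)]

first_voxels = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 1, 0), (2, 1, 1)]
groups = group_by_child_position(first_voxels)
m = len(groups)  # number of groups; at most 8 under the octant assumption
```

Under this assumption m never exceeds 8, and every voxel in a group has a different parent, so one group can be passed to the network in a single step.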
The method according to claim 2, wherein the decoding of the attribute encoding information based on the occupancy probability to determine the attribute information of the first-scale point cloud comprises:

The attribute encoding information corresponding to the m groups of first voxels is decoded group by group according to the m groups of occupancy probabilities to determine the attribute information of each of the m groups of first voxels.
The method according to claim 2 or 3, wherein the decoding of the attribute encoding information based on the occupancy probability to determine the attribute information of the first-scale point cloud comprises:

Decoding the attribute encoding information of the first voxel of the i-th group according to the i-th group occupancy probability to determine the attribute information of the first voxel of the i-th group; i is an integer greater than or equal to 1 and less than or equal to m;

The next group of decoding is continued based on the (i+1)th group of occupancy probabilities until the attribute coding information of the first voxels of the mth group is decoded, and the attribute information of each of the first voxels of the m groups is determined.
The method according to claim 4, wherein the second-scale point cloud comprises: K second voxels; K is an integer greater than or equal to 1 and less than the total number of voxels in the first-scale point cloud; the i-th group of first voxels comprises: H first voxels; the i-th group of occupancy probabilities comprises: H groups of sub-occupancy probabilities; H is an integer greater than or equal to 1 and less than or equal to a preset number;

The step of decoding the attribute encoding information of the first voxel of the i-th group according to the i-th group occupancy probability to determine the attribute information of the first voxel of the i-th group includes:

If the number of undecoded first voxels corresponding to the second voxel to which the jth first voxel belongs is greater than 1, the jth first voxel is decoded according to the jth group sub-occupancy probability to obtain the attribute information of the jth first voxel; the second voxel is the voxel where the parent node of the second scale corresponding to the point in the jth first voxel is located;

If the number of undecoded first voxels corresponding to the second voxel to which the j-th first voxel belongs is 1, determining the attribute mean of the undecoded first voxels corresponding to the second voxel, wherein the attribute mean is the attribute information of the j-th first voxel; the attribute mean is determined according to the attribute information of the decoded first voxel corresponding to the second voxel and the decoded attribute information of the second voxel;

The decoding of the j+1th first voxel is continued until j=H, and the attribute information of the i-th group of first voxels is determined.
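The case in which only one first voxel of a second voxel remains undecoded can be sketched as follows. The sketch assumes that the second voxel stores the integer floor of the mean of its child attributes and that the quantization error is the remainder of that division; the application does not spell out these details here, so `infer_last_child` and its parameters are hypothetical.

```python
def infer_last_child(parent_mean, remainder, num_children, decoded_siblings):
    """Recover the one child attribute that does not need to be decoded.

    Assumptions: parent_mean == sum(children) // num_children and
    remainder == sum(children) % num_children.
    """
    total = parent_mean * num_children + remainder
    return total - sum(decoded_siblings)

# Encoder-side values, used here only to build a consistent example.
children = [120, 131, 128]
parent_mean, remainder = divmod(sum(children), len(children))

# Decoder side: the first two children are decoded, the third is inferred.
last = infer_last_child(parent_mean, remainder, len(children), children[:-1])
```

Because the last value follows from the parent and its decoded siblings, no bits have to be read for it, which matches the encoder-side rule of not encoding that voxel.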
The method according to claim 1, wherein the adopting an attribute probability prediction network to perform probability prediction on the attribute prediction information to determine the occupancy probability of the first-scale point cloud comprises:

The attribute probability prediction network is used to perform probability prediction on the first color component of the attribute prediction information to determine the occupancy probability of the first scale point cloud under the first color component.
The method according to claim 6, wherein the decoding of the attribute encoding information based on the occupancy probability to determine the attribute information of the first-scale point cloud comprises:

Based on the occupancy probability of the first color component, the attribute encoding information of the first color component is decoded to obtain attribute information corresponding to the first color component of the first-scale point cloud.
The method according to claim 7, wherein the method further comprises:

The attribute probability prediction network is used to perform probability prediction on the second color component and the third color component of the attribute prediction information in combination with the attribute information corresponding to the first color component of the first scale point cloud, and determine the occupancy probability of the first scale point cloud under the second color component and the occupancy probability under the third color component;

Based on the occupancy probability of the second color component, the attribute encoding information of the second color component is decoded to obtain the attribute information corresponding to the second color component of the first-scale point cloud;

Based on the occupancy probability of the third color component, the attribute encoding information of the third color component is decoded to obtain attribute information corresponding to the third color component of the first-scale point cloud.
The method according to any one of claims 2 to 4, wherein the adopting an attribute probability prediction network to sequentially perform probability prediction on the m groups of first voxels to determine the m groups of occupancy probabilities comprises:

Using the attribute probability prediction network, under the first color component, sequentially perform probability prediction on the m groups of first voxels to determine the m groups of occupancy probabilities corresponding to the first color component;

The attribute probability prediction network is used, combined with the attribute information corresponding to the decoded first color component, to perform probability prediction on the m groups of first voxels in turn under the second color component and the third color component, respectively, to determine the m groups of occupancy probabilities corresponding to the second color component and the m groups of occupancy probabilities corresponding to the third color component.
The method according to claim 9, wherein the step of decoding the attribute encoding information of the m groups of first voxels one by one according to the m groups of occupancy probabilities to determine the attribute information of each of the m groups of first voxels comprises:

According to the m groups of occupancy probabilities corresponding to the first color component, under the first color component, the attribute encoding information of the m groups of first voxels is decoded to determine the attribute information corresponding to the first color component of each of the m groups of first voxels; the attribute information corresponding to the first color component is used to determine the m groups of occupancy probabilities of the second color component and the m groups of occupancy probabilities of the third color component;

According to the m groups of occupancy probabilities corresponding to the second color component, the attribute encoding information of the m groups of first voxels is decoded under the second color component to determine the attribute information corresponding to the second color component of each of the m groups of first voxels;

According to the m groups of occupancy probabilities corresponding to the third color component, the attribute encoding information of the m groups of first voxels is decoded under the third color component to determine the attribute information corresponding to the third color component of each of the m groups of first voxels.
The method according to any one of claims 1 to 7, wherein the adopting an attribute probability prediction network to perform probability prediction on the attribute prediction information to determine the occupancy probability of the first-scale point cloud comprises:

Inputting the attribute prediction information of each first voxel into the attribute probability prediction network for probability prediction, and determining at least one probability model parameter of each first voxel of the first scale point cloud; the first voxel is a voxel corresponding to the first scale point cloud;

An occupancy probability of the first-scale point cloud is determined according to at least one probability model parameter of each first voxel of the first-scale point cloud.
The method according to claim 11, wherein determining the occupancy probability of the first-scale point cloud according to at least one probability model parameter of each first voxel of the first-scale point cloud comprises:

Determining a probability density distribution of each first voxel according to at least one probability model parameter of at least one point included in each first voxel of the first-scale point cloud;

According to the probability density distribution of each first voxel, an integral operation is performed within a preset range centered on each first voxel to determine the occupancy probability of each first voxel, thereby determining the occupancy probability of the first-scale point cloud.
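The integral operation can be sketched for a single attribute value. The sketch assumes a Laplace density with location `mu` and scale `b` as the probability model parameters, and a preset range of half a quantization step on either side of each integer value; both choices are illustrative assumptions rather than the model fixed by the application.

```python
import math

def laplace_cdf(x, mu, b):
    """Cumulative distribution function of a Laplace(mu, b) density."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1.0 - 0.5 * math.exp(-(x - mu) / b)

def symbol_probability(value, mu, b, half_width=0.5):
    """Integrate the density over [value - half_width, value + half_width]."""
    return laplace_cdf(value + half_width, mu, b) - laplace_cdf(value - half_width, mu, b)

# Probability table for an 8-bit attribute; mu and b stand in for the
# parameters that the probability prediction network would output.
probs = [symbol_probability(v, mu=128.3, b=2.0) for v in range(256)]
```

A table of this kind is what an arithmetic coder would consume: the closer the predicted location is to the true value and the smaller the scale, the fewer bits the value costs.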
The method according to any one of claims 1 to 10, wherein the method further comprises:

Determining decoded attribute information of the second scale point cloud;

Determine, from the bitstream, a quantization error when downsampling the first-scale point cloud to the second-scale point cloud;

From the bitstream, the reconstructed geometric information of the first scale point cloud is determined.
The method according to claim 13, wherein the predicting the first scale based on the decoded attribute information of the second scale point cloud to determine the attribute prediction information of the first scale point cloud comprises:

Dequantizing the decoded attribute information of the second scale point cloud using the quantization error to determine the corrected attribute information of the second scale point cloud;

Prediction of the first scale is performed according to the corrected attribute information of the second-scale point cloud and the reconstructed geometric information of the first-scale point cloud to determine attribute prediction information of the first-scale point cloud.
The method according to any one of claims 1 to 10, wherein the predicting the first scale based on the decoded attribute information of the second scale point cloud to determine the attribute prediction information of the first scale point cloud comprises:

Perform voxel upsampling on the second-scale point cloud to determine n first voxels of the first scale; n is an integer greater than 1;

According to the decoded attribute information of the second-scale point cloud, attribute prediction information corresponding to the n first voxels of the first scale is determined, thereby obtaining the attribute prediction information of the first-scale point cloud.
The method according to claim 15, wherein determining the attribute prediction information corresponding to the n first voxels at the first scale according to the decoded attribute information of the second-scale point cloud comprises:

Dequantizing the decoded attribute information of the second scale point cloud using the quantization error to determine the corrected attribute information of the second scale point cloud;

The corrected attribute information of the second voxel of the second-scale point cloud to which each of the n first voxels of the first scale belongs is determined as the attribute prediction information corresponding to each first voxel.
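The prediction step, in which each first voxel takes the corrected attribute of the second voxel it belongs to, can be sketched as below. The sketch assumes that the parent of (x, y, z) is (x // 2, y // 2, z // 2) and takes the corrected second-scale attributes as given input; `predict_from_parent` and the sample data are hypothetical.

```python
def predict_from_parent(first_voxels, parent_attrs):
    """Assign to each first-scale voxel the corrected attribute of its parent.

    parent_attrs maps second-scale coordinates to an attribute tuple.
    """
    return {
        v: parent_attrs[(v[0] // 2, v[1] // 2, v[2] // 2)]
        for v in first_voxels
    }

# Two second-scale voxels carrying three color components each.
parent_attrs = {(0, 0, 0): (200, 10, 10), (1, 0, 0): (20, 180, 40)}
# Occupied first-scale voxels, known from the reconstructed geometry.
first_voxels = [(0, 1, 1), (1, 0, 0), (2, 1, 0), (3, 1, 1)]
prediction = predict_from_parent(first_voxels, parent_attrs)
```

The resulting prediction is what would be fed to the attribute probability prediction network as the attribute prediction information.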
A coding method, comprising:

The point cloud data is sequentially downsampled until it is divided into a single voxel, thereby obtaining a plurality of scale point clouds; the plurality of scale point clouds include: a first scale point cloud and a second scale point cloud; the second scale point cloud is point cloud data of a previous scale parent node of the first scale point cloud;

In the process of encoding the first-scale point cloud, performing a first-scale prediction based on the attribute information of the second-scale point cloud to determine the attribute prediction information of the first-scale point cloud;

Using an attribute probability prediction network, performing probability prediction on the attribute prediction information to determine the occupancy probability of the first scale point cloud;

Attribute information of the first-scale point cloud is encoded based on the occupancy probability to determine attribute encoding information of the first-scale point cloud.
The method according to claim 17, wherein the first-scale point cloud comprises: n first voxels; n is an integer greater than 1;

Using the attribute probability prediction network to perform probability prediction on the attribute prediction information to determine the occupancy probability of the first-scale point cloud includes:

Based on grouping voxels at the same position of each of the n first voxels, m groups of first voxels are determined; m is an integer greater than or equal to 1;

The attribute probability prediction network is used to sequentially perform probability prediction on the attribute prediction information of the m groups of first voxels to determine m groups of occupancy probabilities, where the m groups of occupancy probabilities are the occupancy probabilities corresponding to the first-scale point cloud.
The method according to claim 18, wherein encoding the attribute information based on the occupancy probability to determine the attribute encoding information of the first-scale point cloud comprises:

The attribute information corresponding to the m groups of first voxels is encoded group by group according to the m groups of occupancy probabilities to determine the attribute encoding information of each of the m groups of first voxels.
The method according to claim 18 or 19, wherein encoding the attribute information based on the occupancy probability to determine the attribute encoding information of the first-scale point cloud comprises:

Encoding the attribute information of the first voxel of the i-th group according to the i-th group occupancy probability to determine the attribute encoding information of the first voxel of the i-th group; i is an integer greater than or equal to 1 and less than or equal to m;

The next group of encoding is continued based on the (i+1)th group of occupancy probabilities until the encoding of the attribute information of the first voxels of the mth group is completed, and the attribute encoding information of each of the first voxels of the m groups is determined.
The method according to claim 20, wherein the second-scale point cloud comprises: K second voxels; K is an integer greater than or equal to 1 and less than the total number of voxels in the first-scale point cloud; the i-th group of first voxels comprises: H first voxels; the i-th group of occupancy probabilities comprises: H groups of sub-occupancy probabilities; H is an integer greater than or equal to 1 and less than or equal to a preset number;

The step of encoding the attribute information of the first voxel in the i-th group according to the i-th group occupancy probability to determine the attribute encoding information of the first voxel in the i-th group includes:

If the number of uncoded first voxels corresponding to the second voxel to which the jth first voxel belongs is greater than 1, the jth first voxel is encoded according to the jth group sub-occupancy probability to obtain the attribute encoding information of the jth first voxel; the second voxel is the voxel where the parent node of the second scale corresponding to the point in the jth first voxel is located;

The encoding of the j+1th first voxel is continued until j=H, and the attribute encoding information of the i-th group of first voxels is determined.
The method according to claim 21, wherein, before the encoding of the (j+1)-th first voxel is continued until j=H and the attribute encoding information of the i-th group of first voxels is determined, the method further comprises:

If the number of uncoded first voxels corresponding to the second voxel to which the j-th first voxel belongs is 1, the j-th first voxel is not encoded.
The method according to claim 17, wherein the adopting an attribute probability prediction network to perform probability prediction on the attribute prediction information to determine the occupancy probability of the first-scale point cloud comprises:

The attribute probability prediction network is used to perform probability prediction on the first color component of the attribute prediction information to determine the occupancy probability of the first scale point cloud under the first color component.
The method according to claim 23, wherein encoding the attribute information based on the occupancy probability to determine the attribute encoding information of the first-scale point cloud comprises:

Based on the occupancy probability of the first color component, the attribute information of the first color component is encoded to obtain attribute encoding information corresponding to the first color component of the first-scale point cloud.
The method according to claim 23, wherein the method further comprises:

The attribute probability prediction network is used to perform probability prediction on the second color component and the third color component of the attribute prediction information in combination with the attribute information corresponding to the first color component of the first scale point cloud, and determine the occupancy probability of the first scale point cloud under the second color component and the occupancy probability under the third color component;

Based on the occupancy probability of the second color component, the attribute information of the second color component is encoded to obtain the attribute encoding information corresponding to the second color component of the first-scale point cloud;

Based on the occupancy probability of the third color component, the attribute information of the third color component is encoded to obtain attribute encoding information corresponding to the third color component of the first-scale point cloud.
The method according to any one of claims 18 to 20, wherein the adopting an attribute probability prediction network to sequentially perform probability prediction on the m groups of first voxels to determine the m groups of occupancy probabilities comprises:

Using the attribute probability prediction network, under the first color component, sequentially perform probability prediction on the m groups of first voxels to determine the m groups of occupancy probabilities corresponding to the first color component;

The attribute probability prediction network is used, combined with the attribute information corresponding to the first color component, to perform probability prediction on the m groups of first voxels in turn under the second color component and the third color component, respectively, to determine the m groups of occupancy probabilities corresponding to the second color component and the m groups of occupancy probabilities corresponding to the third color component.
The method according to claim 26, wherein encoding the attribute information of the m groups of first voxels one by one according to the m groups of occupancy probabilities to determine the attribute encoding information of each of the m groups of first voxels comprises:

According to the m groups of occupancy probabilities corresponding to the first color component, under the first color component, the attribute information of the m groups of first voxels is encoded to determine the attribute encoding information corresponding to the first color component of each of the m groups of first voxels; the attribute information corresponding to the first color component is used to determine the m groups of occupancy probabilities of the second color component and the m groups of occupancy probabilities of the third color component;

According to the m groups of occupancy probabilities corresponding to the second color component, encoding the attribute information of the m groups of first voxels under the second color component, and determining the attribute encoding information corresponding to the second color component of each of the m groups of first voxels;

According to the m groups of occupancy probabilities corresponding to the third color component, the attribute information of the m groups of first voxels is encoded under the third color component, and the attribute encoding information corresponding to the third color component of each of the m groups of first voxels is determined.
The method according to any one of claims 17 to 24, wherein the adopting an attribute probability prediction network to perform probability prediction on the attribute prediction information to determine the occupancy probability of the first-scale point cloud comprises:

Inputting the attribute prediction information of each first voxel into the attribute probability prediction network for probability prediction, and determining at least one probability model parameter of each first voxel of the first scale point cloud; the first voxel is a voxel corresponding to the first scale point cloud;

An occupancy probability of the first-scale point cloud is determined according to at least one probability model parameter of each first voxel of the first-scale point cloud.
The method of claim 28, wherein determining the occupancy probability of the first-scale point cloud based on at least one probability model parameter of each first voxel of the first-scale point cloud comprises:

Determining a probability density distribution of each first voxel according to at least one probability model parameter of at least one point included in each first voxel of the first-scale point cloud;

According to the probability density distribution of each first voxel, an integral operation is performed within a preset range centered on each first voxel to determine the occupancy probability of each first voxel, thereby determining the occupancy probability of the first-scale point cloud.
The method according to any one of claims 17 to 27, wherein the method further comprises:

In the process of sequentially downsampling the point cloud data by voxels until the point cloud data is divided into a single voxel, a plurality of quantization errors are determined each time the point cloud is downsampled to the next scale; the plurality of quantization errors include: a quantization error when a first-scale point cloud is downsampled to a second-scale point cloud.
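The sequential downsampling with per-scale quantization errors can be sketched as follows. The sketch assumes non-negative integer voxel coordinates, a single integer attribute per voxel, a parent attribute equal to the floor of the mean of its children, and a quantization error equal to the remainder of that division. These are assumptions made for illustration; `downsample_once` and `build_scales` are hypothetical names.

```python
from collections import defaultdict

def downsample_once(cloud):
    """Halve the resolution of cloud, a dict mapping (x, y, z) -> int attribute.

    Returns the coarser cloud and, per parent voxel, the quantization error
    (assumed here to be the remainder of the integer mean).
    """
    buckets = defaultdict(list)
    for (x, y, z), attr in cloud.items():
        buckets[(x // 2, y // 2, z // 2)].append(attr)
    coarse, errors = {}, {}
    for parent, attrs in buckets.items():
        coarse[parent], errors[parent] = divmod(sum(attrs), len(attrs))
    return coarse, errors

def build_scales(cloud):
    """Downsample repeatedly until a single voxel remains.

    Assumes non-negative coordinates, so the loop always terminates.
    """
    scales, all_errors = [cloud], []
    while len(scales[-1]) > 1:
        coarse, errors = downsample_once(scales[-1])
        scales.append(coarse)
        all_errors.append(errors)
    return scales, all_errors

cloud = {(0, 0, 0): 10, (1, 0, 0): 13, (2, 2, 0): 7, (3, 3, 1): 9}
scales, all_errors = build_scales(cloud)
```

In this example `scales` holds three point clouds of 4, 2 and 1 voxels, and `all_errors[k]` holds the errors produced when going from scale k to scale k+1.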
The method according to any one of claims 17 to 27, wherein the method further comprises:

Determine the reconstructed geometric information of the first scale point cloud.
The method according to claim 31, wherein the predicting the first scale based on the attribute information of the second-scale point cloud to determine the attribute prediction information of the first-scale point cloud comprises:

Dequantizing the decoded attribute information of the second scale point cloud using the quantization error to determine the corrected attribute information of the second scale point cloud;

Prediction of the first scale is performed according to the corrected attribute information of the second-scale point cloud and the reconstructed geometric information of the first-scale point cloud to determine attribute prediction information of the first-scale point cloud.
The method according to claim 17, wherein the method further comprises:

The encoding of the third-scale point cloud is continued until the encoding of the multiple-scale point clouds is completed, thereby obtaining attribute encoding information of the multiple-scale point clouds.
The method according to claim 17 or 32, further comprising:

The attribute encoding information of point clouds at multiple scales is written into the bitstream respectively.
The method according to claim 30, further comprising:

The multiple quantization errors are written into a bitstream.
The method according to claim 31, further comprising:

The reconstructed geometric information of the first-scale point cloud is written into the bitstream.
The method according to any one of claims 17 to 27, wherein the predicting the first scale based on the attribute information of the second-scale point cloud to determine the attribute prediction information of the first-scale point cloud comprises:

Perform voxel upsampling on the second-scale point cloud to determine n first voxels of the first scale; n is an integer greater than 1;

According to the attribute information of the second-scale point cloud, attribute prediction information corresponding to the n first voxels of the first scale is determined, and then the attribute prediction information of the first-scale point cloud is obtained.
The method according to any one of claims 17 to 27, wherein determining the attribute prediction information corresponding to the n first voxels at the first scale according to the attribute information of the second-scale point cloud comprises:

Dequantizing the attribute information of the second-scale point cloud using the quantization error to determine the corrected attribute information of the second-scale point cloud;

The corrected attribute information of the second voxel of the second-scale point cloud to which each first voxel of the n first voxels of the first scale belongs is determined as the attribute prediction information corresponding to each first voxel.
A decoder, comprising:

a decoding part, configured to parse a bitstream and determine attribute encoding information corresponding to a first-scale point cloud;

a first prediction part, configured to perform prediction of the first scale based on decoded attribute information of a second-scale point cloud to determine attribute prediction information of the first-scale point cloud, wherein the second-scale point cloud is decoded point cloud data of a parent node, at the previous scale, of the first-scale point cloud;

the first prediction part being further configured to perform, using an attribute probability prediction network, probability prediction on the attribute prediction information to determine an occupancy probability of the first-scale point cloud; and

wherein the decoding part is further configured to decode the attribute encoding information based on the occupancy probability to determine attribute information of the first-scale point cloud.
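The role of the predicted probability in decoding can be illustrated with a short sketch. The claim does not name a particular entropy coder, so the code below is not a decoder; it only computes the ideal number of bits that an entropy coder such as an arithmetic coder would need, about -log2 p per symbol, when each symbol is coded under a predicted distribution. The function name and the dictionary representation of a distribution are hypothetical.

```python
import math

def ideal_code_length_bits(symbols, distributions):
    """Sum of -log2 p(symbol), where distributions[i] maps each possible
    value of symbol i to its predicted probability."""
    total = 0.0
    for symbol, dist in zip(symbols, distributions):
        total += -math.log2(dist[symbol])
    return total

# Two symbols: the first has predicted probability 0.5, the second 0.25.
bits = ideal_code_length_bits(
    [3, 1],
    [{3: 0.5, 1: 0.5}, {1: 0.25, 0: 0.75}],
)
```

Because the probabilities are derived only from the already decoded second-scale point cloud, a decoder can recompute the same distributions as the encoder before reading each symbol.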
An encoder, comprising:

a division part, configured to successively perform voxel downsampling on point cloud data until a single voxel is obtained, so as to obtain point clouds of multiple scales, the point clouds of multiple scales including a first-scale point cloud and a second-scale point cloud, wherein the second-scale point cloud is point cloud data of a parent node, at the previous scale, of the first-scale point cloud;

a second prediction part, configured to perform, during encoding of the first-scale point cloud, prediction of the first scale based on attribute information of the second-scale point cloud to determine attribute prediction information of the first-scale point cloud;

the second prediction part being further configured to perform, using an attribute probability prediction network, probability prediction on the attribute prediction information to determine an occupancy probability of the first-scale point cloud; and

an encoding part, configured to encode attribute information of the first-scale point cloud based on the occupancy probability to determine attribute encoding information of the first-scale point cloud.
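The division step, repeated voxel downsampling until a single voxel remains, can be sketched as below. The sketch assumes non-negative integer voxel coordinates, a downsampling factor of 2 per axis, and that a parent's attribute is the mean of its children's attributes. None of these choices is stated in the claim, and the function name `build_scales` is hypothetical.

```python
import numpy as np

def build_scales(coords, attrs):
    """Return [(coords, attrs), ...] ordered from the finest scale down to one voxel."""
    coords = np.asarray(coords, dtype=np.int64)    # shape (N, 3)
    attrs = np.asarray(attrs, dtype=np.float64)    # shape (N, C)
    if coords.min() < 0:
        raise ValueError("this sketch assumes non-negative voxel coordinates")
    scales = [(coords, attrs)]
    while len(coords) > 1:
        # Halve the coordinates and merge voxels that fall into the same parent.
        parents, inverse = np.unique(coords // 2, axis=0, return_inverse=True)
        inverse = inverse.reshape(-1)  # guard against NumPy version differences
        sums = np.zeros((len(parents), attrs.shape[1]))
        np.add.at(sums, inverse, attrs)            # accumulate child attributes per parent
        counts = np.bincount(inverse, minlength=len(parents))[:, None]
        coords, attrs = parents, sums / counts     # assumed rule: parent = mean of children
        scales.append((coords, attrs))
    return scales

scales = build_scales(
    [[0, 0, 0], [1, 1, 1], [3, 2, 0], [6, 7, 5]],
    [[10.0], [20.0], [30.0], [40.0]],
)
```

In the returned list, `scales[i + 1]` plays the role of the second-scale (parent) point cloud for `scales[i]`.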
A code stream, wherein:

the code stream is generated by bit encoding of information to be encoded, the information to be encoded including at least: attribute encoding information of point clouds of multiple scales, multiple quantization errors, and reconstructed geometric information of the point clouds of multiple scales, wherein the reconstructed geometric information of the point clouds of multiple scales includes reconstructed geometric information of a first-scale point cloud.
A decoder, comprising:

a first memory configured to store executable instructions; and

a first processor configured to implement the method according to any one of claims 1 to 16 when executing the executable instructions stored in the first memory.
An encoder, comprising:

a second memory configured to store executable instructions; and

a second processor configured to implement the method according to any one of claims 17 to 38 when executing the executable instructions stored in the second memory.
A computer-readable storage medium storing executable instructions which, when executed, cause a first processor to perform the method according to any one of claims 1 to 16, or cause a second processor to perform the method according to any one of claims 17 to 38.