WO2024082105A1 - Encoding method, decoding method, decoder, encoder and computer-readable storage medium - Google Patents

Info

Publication number
WO2024082105A1
Authority
WO
WIPO (PCT)
Prior art keywords
scale
point cloud
voxel
local density
voxels
Prior art date
Application number
PCT/CN2022/125742
Other languages
French (fr)
Chinese (zh)
Inventor
马展
薛瑞翔
魏红莲
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Priority to PCT/CN2022/125742
Publication of WO2024082105A1

Definitions

  • the present application relates to point cloud compression coding and decoding technology, and in particular to a coding and decoding method, a decoder, an encoder and a computer-readable storage medium.
  • Point cloud is a collection of points that can store the geometric position and related attribute information of each point, so as to accurately describe objects in space.
  • Point cloud data is huge, and a frame of point cloud can contain millions of points, which also brings great difficulties and challenges to the effective storage and transmission of point clouds. Therefore, compression technology is used to reduce redundant information in point cloud storage, so as to facilitate subsequent processing.
  • representative point cloud compression algorithms include: Video-based Point Cloud Compression (V-PCC) and Geometry-based Point Cloud Compression (G-PCC).
  • the geometric compression in G-PCC is mainly implemented through octree models and/or triangular surface models.
  • V-PCC is mainly implemented through three-dimensional to two-dimensional projection and video compression.
  • the structural information of the real environment expressed by the point cloud is restored by reconstructing the geometric information of the point cloud.
  • the process of reconstructing the geometric information of the point cloud includes: using a sparse convolutional neural network, taking the voxelized geometric information of the low-scale point cloud as input, and predicting the occupancy probability of each high-scale voxel in the high-scale point cloud.
  • the occupancy symbol of each high-scale voxel is determined, and the geometric information of the high-scale point cloud is reconstructed according to the occupancy symbol representing the occupied high-scale voxels.
  • the method of geometric reconstruction based on occupancy probability prediction is prone to inaccurate occupancy symbol determination when the occupancy probabilities of multiple high-scale voxels are close or the occupancy probability threshold is set unreasonably, thereby reducing the quality of geometric reconstruction and further reducing the encoding and decoding performance.
  • the embodiments of the present application provide a coding and decoding method, a decoder, an encoder and a computer-readable storage medium, which can improve the quality of geometric reconstruction of point cloud coding and decoding, thereby improving coding and decoding performance.
  • the present application provides a decoding method, including:
  • the first-scale point cloud is the previously decoded point cloud data corresponding to the second-scale point cloud;
  • a local density prediction is performed to determine a local density corresponding to a first-scale voxel in the first-scale point cloud, and an occupation probability prediction is performed on a second-scale voxel to determine an occupation probability corresponding to the second-scale voxel;
  • the second-scale voxel is an upsampled voxel corresponding to the first-scale voxel;
  • the local density represents the number of occupied second-scale voxels in the second-scale voxel corresponding to the first-scale voxel;
  • the encoded information corresponding to the second-scale point cloud is decoded and reconstructed to determine the reconstructed geometric information corresponding to the second-scale point cloud.
  • the present application provides an encoding method, including:
  • the local density represents the number of occupied second-scale voxels in the second-scale voxels corresponding to the first-scale voxels;
  • Encoding is performed based on the reconstructed geometric information corresponding to the second-scale point cloud, encoding information corresponding to the second-scale point cloud is determined, and the encoding information is written into a bitstream.
  • the present application provides a decoder, including:
  • a parsing part configured to parse the bitstream and determine the encoding information corresponding to the second scale point cloud
  • the determining part is configured to determine a first-scale point cloud; the first-scale point cloud is the previously decoded point cloud data corresponding to the second-scale point cloud;
  • a local density prediction part is configured to perform local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxel in the first-scale point cloud;
  • the occupancy probability prediction part is configured to perform occupancy probability prediction on a second-scale voxel based on the first-scale point cloud, and determine the occupancy probability corresponding to the second-scale voxel;
  • the second-scale voxel is an upsampled voxel corresponding to the first-scale voxel;
  • the decoding and reconstruction part is configured to decode and reconstruct the encoded information corresponding to the second-scale point cloud based on the occupancy probability corresponding to the second-scale voxel and the local density corresponding to the first-scale voxel, and determine the reconstructed geometric information corresponding to the second-scale point cloud.
  • the present application provides an encoder, including:
  • a downsampling part configured to perform voxel downsampling on the second scale point cloud to determine a first scale point cloud
  • a local density prediction part configured to perform local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxel
  • the occupancy probability prediction part is configured to upsample the first scale voxels in the first scale point cloud to the second scale, determine the second scale voxels corresponding to the first scale voxels; and perform occupancy probability prediction on the second scale voxels to determine the occupancy probability corresponding to the second scale voxels;
  • a reconstruction part configured to determine the reconstructed geometric information corresponding to the second-scale point cloud according to the local density corresponding to the first-scale voxel and the occupancy probability corresponding to the second-scale voxel;
  • the encoding part is configured to perform encoding based on the reconstructed geometric information corresponding to the second-scale point cloud, determine the encoding information corresponding to the second-scale point cloud, and write the encoding information into a bitstream.
  • the embodiment of the present application provides a code stream, including:
  • the code stream is generated by bit encoding according to the coding information; wherein the coding information at least includes: coding information corresponding to the second-scale point cloud.
  • the present application provides a decoder, including:
  • a first memory configured to store executable instructions
  • the first processor is configured to implement any of the decoding methods described above when executing the executable instructions stored in the first memory.
  • the present application provides an encoder, including:
  • a second memory configured to store executable instructions
  • the second processor is configured to implement any of the encoding methods described above when executing the executable instructions stored in the second memory.
  • An embodiment of the present application provides a computer-readable storage medium storing executable instructions which, when executed by a first processor, implement the above-mentioned decoding method, or which, when executed by a second processor, implement the above-mentioned encoding method.
  • An embodiment of the present application provides a computer program product, including a computer program or instructions. When the computer program or instructions are executed by a first processor, the decoding method provided by the embodiment of the present application is implemented; or, when the computer program or instructions are executed by a second processor, the encoding method provided by the embodiment of the present application is implemented.
  • the embodiment of the present application provides a coding and decoding method, a decoder, an encoder and a computer-readable storage medium.
  • the decoder can determine, by predicting the local density, the number of occupied second-scale voxels among the second-scale voxels obtained by upsampling each first-scale voxel. In this way, the occupancy probabilities corresponding to the second-scale voxels can be screened in combination with the local density to determine the occupancy of the second-scale voxels, reconstruct the second-scale point cloud according to that occupancy, and determine the reconstructed geometric information of the second-scale point cloud.
  • the occupancy of the determined second-scale voxel can be made more accurate, and the accuracy of the reconstructed geometric information of the second-scale point cloud can be improved, that is, the reconstructed geometric quality of the decoder is improved, and the decoding performance is improved.
  • the occupancy probability corresponding to the second-scale voxel is screened, which can improve the accuracy of determining the occupancy of the second-scale voxel, and then improve the accuracy of encoding the reconstructed geometric information of the second-scale point cloud determined based on the occupancy of the second-scale voxel, thereby improving the encoding performance.
  • FIG1 is a flow chart of G-PCC coding
  • FIG2 is a flow chart of G-PCC decoding
  • FIG3 is a schematic diagram of an optional flow chart of a decoding method provided in an embodiment of the present application.
  • FIG4 is a schematic diagram of an optional process of voxel upsampling provided in an embodiment of the present application.
  • FIG5 is a schematic diagram of an optional flow chart of a decoding method provided in an embodiment of the present application.
  • FIG6 is a schematic diagram of an optional flow chart of an occupancy probability prediction and local density prediction process provided in an embodiment of the present application.
  • FIG7 is a schematic diagram of an optional flow chart of an occupancy probability prediction and local density prediction process provided in an embodiment of the present application.
  • FIG8 is a schematic diagram of an optional flow chart of an occupancy probability prediction and local density prediction process provided in an embodiment of the present application.
  • FIG9 is a schematic diagram of an optional structure of a local density prediction network provided in an embodiment of the present application.
  • FIG10 is a schematic diagram of an optional structure of a feature extraction network provided in an embodiment of the present application.
  • FIG11 is a schematic diagram of an optional structure of a residual layer in a feature extraction network provided in an embodiment of the present application.
  • FIG12 is a schematic diagram of an optional structure of an occupancy probability prediction network provided in an embodiment of the present application.
  • FIG13 is a schematic diagram of an occupancy probability corresponding to a second-scale voxel provided in an embodiment of the present application.
  • FIG. 14 is a schematic diagram of an optional flow chart of a decoding method provided in an embodiment of the present application
  • FIG15 is a schematic diagram of an optional process for reconstructing point cloud geometric information based on occupancy probability and local density provided in an embodiment of the present application;
  • FIG16 is a schematic diagram of an optional flow chart of an encoding method provided in an embodiment of the present application.
  • FIG17 is a schematic diagram of an optional process of voxel downsampling provided in an embodiment of the present application.
  • FIG18 is a schematic diagram of an occupancy symbol obtained by voxel downsampling according to an embodiment of the present application.
  • FIG19 is a schematic diagram of an optional process of applying the decoding method provided in an embodiment of the present application to an actual scenario
  • FIG20 is a schematic diagram of an optional structure of a decoder provided in an embodiment of the present application.
  • FIG21 is a schematic diagram of an optional structure of an encoder provided in an embodiment of the present application.
  • FIG22 is a schematic diagram of an optional structure of a decoder provided in an embodiment of the present application.
  • FIG. 23 is a schematic diagram of an optional structure of an encoder provided in an embodiment of the present application.
  • The terms "first/second/third" involved are merely used to distinguish similar objects and do not represent a specific ordering of the objects. It can be understood that "first/second/third" can be interchanged with a specific order or sequence where permitted, so that the embodiments of the present application described here can be implemented in an order other than that illustrated or described here.
  • Voxel is short for volume element, the smallest unit of digital data in a partition of three-dimensional space. Voxels can be used to divide 3D space into a grid and to assign features to each grid cell. For example, a voxel can be a cubic block of fixed size in three-dimensional space. Voxels are widely used in fields such as three-dimensional imaging, scientific data and medical imaging.
  • Point cloud compression algorithms include: Video-based Point Cloud Compression (V-PCC) and Geometry-based Point Cloud Compression (G-PCC).
  • the geometry compression in G-PCC is mainly implemented through the octree model and/or the triangle surface model.
  • V-PCC is mainly implemented through 3D to 2D projection and video compression.
  • neural networks are applied to geometry-based point cloud compression technology.
  • the point cloud geometry compression technology based on neural networks can be roughly divided into geometric lossy compression and lossless compression.
  • the lossless compression algorithm mainly revolves around the design of the prediction model of voxel occupancy probability.
  • the data representation of voxels usually uses octree models, volume models, sparse tensor representations, etc.
  • On the encoder side, the surrounding context, such as parent nodes and neighbor nodes, is typically used as input. After processing by neural network layers (such as convolutional or fully connected layers), the occupancy probability of each voxel in the geometric data of the point cloud is output, and an entropy encoder then converts the voxel occupancy symbols into a bitstream according to these occupancy probabilities.
  • On the decoder side, the occupancy probability of each voxel is predicted by the same process, and the voxel occupancy symbols are decoded from the bitstream based on the predicted occupancy probabilities to reconstruct the geometric data of the point cloud.
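  • To illustrate why the accuracy of the predicted occupancy probabilities matters for the bitstream size, the following minimal sketch computes the ideal entropy-coded length of a sequence of occupancy symbols, that is, the sum of -log2 of the probability assigned to each symbol that actually occurred. The function name and the sample values are hypothetical, and a real arithmetic coder only approaches this bound.

```python
import math


def ideal_code_length_bits(occupancy, probabilities):
    """Ideal code length: sum of -log2(p) over the symbols that occurred.

    occupancy     -- list of 0/1 occupancy symbols
    probabilities -- predicted probability that each voxel is occupied
    """
    total = 0.0
    for symbol, p_occupied in zip(occupancy, probabilities):
        # Probability the model assigned to the symbol that actually occurred.
        p_symbol = p_occupied if symbol == 1 else 1.0 - p_occupied
        total += -math.log2(p_symbol)
    return total


# Hypothetical example: four voxels, two different probability models.
symbols = [1, 0, 1, 1]
bits_confident = ideal_code_length_bits(symbols, [0.9, 0.1, 0.9, 0.9])
bits_uninformed = ideal_code_length_bits(symbols, [0.5, 0.5, 0.5, 0.5])
```

  • With the uninformed model every symbol costs one bit, while the confident and correct model needs well under one bit for all four symbols together.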
  • However, if the occupancy symbol of a voxel is determined from the occupancy probability alone, the determination is prone to error when the occupancy probabilities of multiple adjacent voxels are close or when the occupancy probability threshold is set unreasonably. Especially when the point cloud density is unevenly distributed, the number of voxels marked as occupied can easily differ from the actual number of occupied voxels, so that the reconstructed geometric information contains too many or too few points. This reduces the quality of geometric reconstruction and, in turn, the encoding and decoding performance.
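  • The sensitivity to the threshold can be seen in a small sketch. The probabilities below are hypothetical values for the eight child voxels of one parent voxel, and the function name is an assumption made for illustration only.

```python
def threshold_occupancy(probabilities, threshold=0.5):
    """Mark a voxel as occupied (1) when its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


# Hypothetical occupancy probabilities of eight sibling voxels, several of them close.
probs = [0.55, 0.52, 0.51, 0.49, 0.48, 0.10, 0.05, 0.02]

# Number of voxels marked as occupied for three nearby thresholds.
counts = {t: sum(threshold_occupancy(probs, t)) for t in (0.45, 0.50, 0.53)}
```

  • Moving the threshold by only a few hundredths changes the number of reconstructed points from five to three to one, which is the kind of inconsistency described above.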
  • the embodiments of the present application provide a coding and decoding method, a decoder, an encoder and a computer-readable storage medium, which can improve coding and decoding efficiency and improve coding and decoding performance.
  • a flow chart of G-PCC encoding and a flow chart of G-PCC decoding are first provided. It should be noted that the flow chart of G-PCC encoding and the flow chart of G-PCC decoding described in the embodiments of the present application are only for more clearly illustrating the technical solution of the embodiments of the present application, and do not constitute a limitation on the technical solution provided by the embodiments of the present application.
  • the point cloud compressed in the embodiments of the present application can be a point cloud in a video, but is not limited to this.
  • the point cloud of the input 3D image model is sliced and each slice is encoded independently.
  • The point cloud data is first divided into multiple slices by slice division. In each slice, the geometric information and attribute information of the point cloud are encoded separately. In the geometric coding process, the coordinates are transformed so that all points are contained in a bounding box, and then quantized. Quantization mainly serves to scale the coordinates. Because quantization rounds coordinates, some points end up with the same geometric information, and a parameter determines whether such duplicate points are removed. The process of quantization and removal of duplicate points is also called voxelization. The bounding box is then divided by an octree.
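  • As a minimal sketch of voxelization (coordinate shift into a bounding box, scaling, rounding, optional duplicate removal), assuming round-half-up quantization and a bounding box whose origin is the per-axis minimum; the function names and the `remove_duplicates` flag are hypothetical and do not correspond to a specific G-PCC syntax element:

```python
import math


def quantize(value, origin, scale):
    """Shift into the bounding box, scale, and round half up to an integer."""
    return int(math.floor((value - origin) * scale + 0.5))


def voxelize(points, scale=1.0, remove_duplicates=True):
    """Map floating-point points to integer voxel coordinates."""
    # Bounding-box origin: the minimum coordinate along each axis.
    origin = tuple(min(p[axis] for p in points) for axis in range(3))
    voxels = [
        tuple(quantize(p[axis], origin[axis], scale) for axis in range(3))
        for p in points
    ]
    if remove_duplicates:
        # dict.fromkeys keeps the first occurrence and preserves order.
        voxels = list(dict.fromkeys(voxels))
    return voxels


points = [(0.0, 0.0, 0.0), (0.2, 0.1, 0.0), (1.4, 0.9, 2.0), (1.6, 1.0, 2.1)]
voxels = voxelize(points)
```

  • Here the first two points round to the same voxel, so one of them is dropped when duplicate removal is enabled.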
  • the bounding box is divided into 8 sub-cubes, and the non-empty (containing points in the point cloud) sub-cubes are divided into 8 equal parts until the leaf node obtained by the division is a 1x1x1 unit cube.
  • the division is stopped, and the points in the leaf node are arithmetically encoded to generate a binary geometric bit stream, that is, a geometric code stream.
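  • The recursive octree division can be sketched as follows. Each non-empty node is split into eight sub-cubes and described by an 8-bit occupancy code, and the division stops at 1x1x1 leaves. The breadth-first order, the bit layout of the child index and the function name are assumptions for illustration; the arithmetic coding of the codes is omitted.

```python
def octree_occupancy_codes(voxels, depth):
    """Return breadth-first 8-bit occupancy codes for a cube of side 2**depth."""
    codes = []
    level = [((0, 0, 0), set(voxels))]  # (node origin, voxels inside the node)
    size = 2 ** depth
    while size > 1:
        half = size // 2
        next_level = []
        for (ox, oy, oz), points in level:
            children = {}
            for x, y, z in points:
                # Child index: one bit per axis (x is the most significant bit).
                index = (
                    (((x - ox) // half) << 2)
                    | (((y - oy) // half) << 1)
                    | ((z - oz) // half)
                )
                children.setdefault(index, set()).add((x, y, z))
            code = 0
            for index in sorted(children):
                code |= 1 << index  # set the bit of each non-empty sub-cube
                origin = (
                    ox + half * ((index >> 2) & 1),
                    oy + half * ((index >> 1) & 1),
                    oz + half * (index & 1),
                )
                # Only non-empty sub-cubes are divided further.
                next_level.append((origin, children[index]))
            codes.append(code)
        level = next_level
        size = half
    return codes


codes = octree_occupancy_codes([(0, 0, 0), (3, 3, 3)], depth=2)
```

  • For the two corner voxels of a 4x4x4 cube, the root code has its lowest and highest bits set, and each of the two non-empty children contributes one further code.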
  • In the geometric coding process based on trisoup (triangle soup), octree division must also be performed first.
  • Unlike octree-based geometric coding, trisoup does not need to divide the point cloud step by step down to 1x1x1 unit cubes. Instead, the division stops when the side length of the sub-block (block) is W.
  • For the surface formed by the points in each block, the at most twelve intersections (vertices) between that surface and the twelve edges of the block are obtained, and the vertices are arithmetically encoded (with surface fitting based on the intersections) to generate a binary geometric bit stream, that is, a geometric code stream. The vertices are also used in the geometric reconstruction process, and the reconstructed geometric information is used when encoding the attributes of the point cloud.
  • color conversion is performed to convert the color information (i.e., attribute information) from the RGB color space to the YUV color space. Then, the point cloud is recolored using the reconstructed geometric information so that the unencoded attribute information corresponds to the reconstructed geometric information.
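  • As an illustration of the color conversion step, the sketch below converts a normalized RGB triple to a luma/chroma triple. The text does not state which conversion matrix is used; the ITU-R BT.709 coefficients here are an assumption, and the function name is hypothetical.

```python
def rgb_to_yuv_bt709(r, g, b):
    """Convert normalized RGB (0..1) to Y in 0..1 and U, V in -0.5..0.5.

    Assumes ITU-R BT.709 luma coefficients; other matrices are possible.
    """
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    u = (b - y) / 1.8556  # blue-difference chroma
    v = (r - y) / 1.5748  # red-difference chroma
    return y, u, v


white = rgb_to_yuv_bt709(1.0, 1.0, 1.0)  # full luma, no chroma
red = rgb_to_yuv_bt709(1.0, 0.0, 0.0)    # maximum red-difference chroma
```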
  • LOD level of detail
  • RAHT region adaptive hierarchical transformation
  • the geometric encoding data after octree division and surface fitting and the attribute encoding data processed by the quantized coefficients are sliced and synthesized, and the vertex coordinates of each block are encoded in turn (i.e., arithmetic encoding) to generate a binary attribute bit stream, i.e., the attribute code stream.
  • the flowchart of G-PCC decoding shown in Figure 2 is applied to the decoder.
  • the decoder obtains a binary code stream, and independently decodes the geometric bit stream (i.e., geometric code stream) and attribute bit stream in the binary code stream.
  • For the geometric bit stream, the geometric information of the point cloud is obtained through arithmetic decoding, octree synthesis, surface fitting, geometry reconstruction and inverse coordinate transformation;
  • for the attribute bit stream, the attribute information of the point cloud is obtained through arithmetic decoding, inverse quantization, LOD-based inverse lifting or RAHT-based inverse transformation, and inverse color conversion. The three-dimensional image model of the point cloud data is then restored based on the geometric information and attribute information.
  • the encoding method of the embodiment of the present application can be applied to the geometric information encoding process of the G-PCC as shown in Figure 1.
  • the geometric encoding process in Figure 1 is performed based on the voxelized second-scale point cloud to obtain a geometric bit stream;
  • the voxel-down sampling is performed on the voxelized second-scale point cloud to determine the first-scale point cloud, and the first-scale voxels in the first-scale point cloud are upsampled to the second scale to determine the second-scale voxels corresponding to the first-scale voxels;
  • local density prediction is performed based on the first-scale point cloud to determine the local density corresponding to the first-scale voxels, and the occupation probability prediction is performed on the second-scale voxels to determine the occupation probability corresponding to the second-scale voxels;
  • the local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to the first-scale voxel.
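  • The relationship between voxel downsampling and the local density can be sketched with integer coordinates. Halving each coordinate maps a second-scale voxel to its first-scale parent, and counting how many occupied second-scale voxels share a parent gives the quantity that the local density is defined to represent. This computes the count directly from a known second-scale point cloud, whereas the method described here predicts it with a network; the function names are hypothetical.

```python
from collections import Counter


def parent_of(voxel):
    """First-scale voxel that contains the given second-scale voxel."""
    x, y, z = voxel
    return (x // 2, y // 2, z // 2)


def downsample(second_scale_voxels):
    """Occupied first-scale voxels: parents of the occupied second-scale voxels."""
    return sorted({parent_of(v) for v in second_scale_voxels})


def true_local_density(second_scale_voxels):
    """Number of occupied second-scale voxels under each first-scale voxel."""
    return Counter(parent_of(v) for v in set(second_scale_voxels))


second_scale = [(0, 0, 0), (1, 0, 0), (1, 1, 1), (2, 3, 0)]
first_scale = downsample(second_scale)
density = true_local_density(second_scale)
```

  • In this example three occupied second-scale voxels fall under the first-scale voxel (0, 0, 0) and one falls under (1, 1, 0).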
  • the decoding method of the embodiment of the present application can be applied to the geometric information decoding process of G-PCC as shown in FIG2.
  • the coding information corresponding to the second-scale point cloud is determined, and the first-scale point cloud is determined; the first-scale point cloud is the previously decoded point cloud data corresponding to the second-scale point cloud.
  • the coding information corresponding to the second-scale point cloud is subjected to arithmetic decoding, octree synthesis, and surface fitting.
  • local density prediction is performed based on the first-scale point cloud to determine the local density corresponding to the first-scale voxel in the first-scale point cloud
  • occupation probability prediction is performed on the second-scale voxel to determine the occupation probability corresponding to the second-scale voxel
  • the second-scale voxel is the upsampled voxel corresponding to the first-scale voxel
  • the local density represents the number of occupied second-scale voxels in the second-scale voxel corresponding to the first-scale voxel
  • the coding information corresponding to the second-scale point cloud is decoded and reconstructed to determine the reconstructed geometric information corresponding to the second-scale point cloud.
  • the reconstructed geometric information corresponding to the second-scale point cloud is used to perform LOD-based inverse lifting or RAHT-based inverse transformation-inverse color conversion to obtain the attribute information of the second-scale point cloud, and the three-dimensional image model of the second-scale point cloud is restored based on the reconstructed geometric information and attribute information.
  • the encoding method and decoding method of the embodiments of the present application can also be used in other point cloud encoding and decoding processes besides G-PCC.
  • Figure 3 is an optional flowchart of a decoding method provided in an embodiment of the present application, which will be explained in conjunction with the steps shown in Figure 3.
  • the decoder parses the received code stream to obtain the encoding information corresponding to the second scale point cloud and determines the first scale point cloud, wherein the first scale point cloud is the previously decoded point cloud data corresponding to the second scale point cloud.
  • the code stream generally includes coding information corresponding to at least one scale of point cloud sent by the encoder.
  • the decoder decodes in order from low scale to high scale. That is, before the decoder decodes the coding information corresponding to the second scale point cloud, it has completed the decoding of the coding information of the previous scale point cloud of the second scale point cloud, that is, the first scale point cloud, and determined the decoded first scale point cloud data, that is, the first scale point cloud.
  • the decoder can use the decoded low-scale first-scale point cloud data to decode the coding information corresponding to the high-scale second-scale point cloud and reconstruct the geometric information.
  • S102 performing local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxels in the first-scale point cloud, and performing occupancy probability prediction on the second-scale voxels to determine the occupancy probability corresponding to the second-scale voxels.
  • the geometric information of the first-scale point cloud and the second-scale point cloud has undergone a voxelization process of the encoder and can be represented in the form of a voxel grid.
  • a point in the point cloud may correspond to an occupied voxel (i.e., a non-empty voxel), and an unoccupied voxel (i.e., an empty voxel) indicates that there is no point in the point cloud at the voxel position.
  • the occupied voxels may be marked as 1, and the unoccupied voxels may be marked as 0.
  • the voxelized point cloud may represent the geometric data of the point cloud by the occupation symbols of the voxels at each position in the voxel grid.
  • the scale of the point cloud corresponds to the scale of the voxels in the point cloud. That is, the voxels contained in the first-scale point cloud are first-scale voxels, and the voxels contained in the second-scale point cloud are second-scale voxels. Among them, the second-scale voxels are upsampled voxels corresponding to the first-scale voxels.
  • the decoder can perform voxel upsampling on the first-scale voxels in the first-scale point cloud. For example, the first-scale voxels representing the occupied points in the first-scale point cloud are upsampled to obtain multiple second-scale voxels corresponding to each first-scale voxel.
  • the decoder can implement voxel upsampling by pooling, such as using a maximum pooling layer with a step size of 2×2×2 to divide one first-scale voxel in the first-scale point cloud into eight second-scale voxels.
  • Each upsampling evenly divides the size of the first-scale voxel in three dimensions, that is, the size of the second-scale voxel in three dimensions is half of the first-scale voxel.
  • each first-scale voxel in the first-scale point cloud completes the upsampling from the low-scale point cloud with known geometric information to the high-scale point cloud, and obtains the second-scale point cloud whose geometric information is to be reconstructed.
  • Figure 4 shows a first-scale point cloud containing 2×2×1 first-scale voxels. After one voxel upsampling, a second-scale point cloud containing 4×4×2 second-scale voxels is obtained.
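  • The coordinate arithmetic of one voxel upsampling step can be sketched as follows: each occupied first-scale voxel is split in half along all three dimensions, producing eight candidate second-scale voxels. The function names are hypothetical, and the sketch only enumerates candidate positions without deciding their occupancy.

```python
from itertools import product


def upsample_voxel(voxel):
    """Return the eight second-scale voxels covered by one first-scale voxel."""
    x, y, z = voxel
    return [
        (2 * x + dx, 2 * y + dy, 2 * z + dz)
        for dx, dy, dz in product((0, 1), repeat=3)
    ]


def upsample(first_scale_voxels):
    """Map every occupied first-scale voxel to its eight candidate children."""
    return {parent: upsample_voxel(parent) for parent in first_scale_voxels}


children = upsample([(0, 0, 0), (1, 1, 0)])
```

  • Applied to every cell of a 2×2×1 grid, the same mapping covers exactly the 4×4×2 grid of second-scale positions.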
  • the first-scale point cloud is the decoded point cloud data.
  • the occupied first-scale voxels are represented by solid cubes, which represent the locations of the points in the point cloud.
  • After the first-scale voxels are upsampled to second-scale voxels, whether each second-scale voxel is occupied still needs to be determined through the geometric reconstruction process, so the second-scale voxels are represented by empty cubes in Figure 4.
  • the point cloud in Figure 4 is only exemplary, and the actual point cloud may include more voxels.
  • the local density represents the number of occupied second-scale voxels in the second-scale voxels corresponding to the first-scale voxels. Therefore, the decoder performs local density prediction based on the first-scale point cloud, determines the local density corresponding to the first-scale voxels in the first-scale point cloud, and can determine the number of occupied second-scale voxels in the multiple second-scale voxels corresponding to each first-scale voxel. In this way, the occupied second-scale voxels in the multiple second-scale voxels can be more accurately determined by combining the multiple occupation probabilities corresponding to the multiple second-scale voxels.
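  • One plausible way to combine the local density with the occupancy probabilities is to mark as occupied the k most probable child voxels of each first-scale voxel, where k is the predicted local density. The top-k rule, the clamping of k to the range 1 to 8 and the function name are assumptions for illustration; this passage does not specify the exact screening rule.

```python
def select_occupied(child_probabilities, density):
    """Mark the `density` most probable child voxels as occupied.

    child_probabilities -- occupancy probabilities of the children of one parent
    density             -- predicted number of occupied children (local density)
    """
    count = len(child_probabilities)
    # Assumed clamp: at least one child and at most all children are occupied.
    k = max(1, min(int(density), count))
    ranked = sorted(range(count), key=lambda i: child_probabilities[i], reverse=True)
    occupied = set(ranked[:k])
    return [1 if i in occupied else 0 for i in range(count)]


# Hypothetical probabilities for eight children, several of them close to 0.5.
probs = [0.55, 0.52, 0.51, 0.49, 0.48, 0.10, 0.05, 0.02]
flags = select_occupied(probs, density=2)
```

  • Because the number of occupied children is fixed by the local density, the result no longer depends on where a probability threshold is placed.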
  • the decoder predicts the occupancy probability of the second-scale voxels obtained by upsampling the first-scale voxels, and determines multiple occupancy probabilities corresponding to multiple second-scale voxels corresponding to the first-scale voxels.
  • S102 may be implemented by executing the process of S201 - S203 as follows:
  • S201 extracting features from geometric information of a first-scale point cloud to determine features of the first-scale point cloud.
  • the geometric information of the first scale point cloud may include the occupancy of each voxel in the first scale point cloud and the position information of each voxel.
  • the occupancy may be a preset flag such as 0 or 1
  • the position information may be the three-dimensional coordinates of the voxel.
  • the geometric information of the first scale point cloud may be obtained by decoding and reconstructing the encoded information of the first scale point cloud, that is, the geometric information of the first scale point cloud may be equivalent to the reconstructed geometric information of the first scale point cloud.
  • the decoder extracts features from the geometric information of the first-scale point cloud, maps the geometric information of the first-scale point cloud to a preset low-scale feature space, and determines the features of the first-scale point cloud.
  • the first-scale point cloud features may include implicit features extracted from the geometric information of the first-scale point cloud.
  • the decoder upsamples the first-scale point cloud features to the second scale to determine the initial second-scale point cloud features.
  • the decoder extracts features from the initial second-scale point cloud features to determine the second-scale point cloud features.
  • the decoder predicts the occupancy probability of the second-scale voxels based on the second-scale point cloud features, that is, for each second-scale voxel it predicts the probability that a point of the point cloud falls at the corresponding voxel position, and thereby determines the occupancy probability corresponding to the second-scale voxel.
  • the second-scale point cloud features are obtained by performing two feature extractions on the geometric information of the first-scale point cloud. Therefore, the second-scale point cloud features have better feature expression, which can improve the accuracy of the decoder's prediction of occupancy probability using the second-scale point cloud features.
  • the decoder performs local density prediction based on the first-scale point cloud features, and predicts the number of occupied second-scale voxels in multiple second-scale voxels corresponding to each first-scale voxel in the first-scale point cloud as the local density corresponding to each first-scale voxel.
  • the local density is a numerical value, which can be an integer value greater than or equal to 1 and less than or equal to the total number of second-scale voxels corresponding to a first-scale voxel.
  • the execution order of the feature upsampling, feature extraction of the initial second-scale point cloud features, and occupancy probability prediction processes in S202 and the local density prediction process in S203 is not limited to that shown in FIG. 5 . In practical applications, they can also be executed in any order or simultaneously, depending on the actual situation, and the embodiment of the present application does not limit this.
  • the decoder extracts features of the geometric information of the first-scale point cloud through the third feature extraction network to determine the first-scale point cloud features; the first-scale point cloud features are respectively input into the first branch including the upsampling network, the fourth feature extraction network and the occupancy probability prediction network, and the second branch including the local density prediction network; the occupancy probability corresponding to each second-scale voxel is output through the first branch, and the local density corresponding to each first-scale voxel is output through the second branch.
  • the upsampling network can be implemented by a transposed convolutional network; the third feature extraction network is used to extract features of the geometric information of the first-scale point cloud, and the fourth feature extraction network is used to extract features of the initial second-scale point cloud features; the occupancy probability prediction network and the local density prediction network can be implemented using pre-trained neural networks.
  • S102 in FIG. 3 may also be implemented by executing the process of S301-S304 as follows:
  • S302 up-sample the first point cloud features at the first scale to the second scale, determine the second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxels.
  • the decoder can also use different feature extraction networks to extract different first-scale point cloud features, that is, first point cloud features and second point cloud features, which are used for occupancy probability prediction and local density prediction, respectively.
  • the first feature extraction network can be obtained by jointly learning or training with a neural network for occupancy probability prediction
  • the second feature extraction network can be obtained by jointly learning or training with a local density prediction network.
  • the embodiment of the present application does not limit the execution order of the S301 - S302 process and the S303 - S304 process.
  • the first branch in FIG7 includes a first feature extraction network, an upsampling network, and an occupancy probability prediction network;
  • the second branch includes a second feature extraction network and a local density prediction network.
  • the decoder inputs the geometric information of the first-scale point cloud into the first branch and the second branch respectively, and performs feature extraction and network prediction on the geometric information of the first-scale point cloud through the first branch and the second branch respectively; the occupancy probability corresponding to each second-scale voxel is output through the first branch, and the local density corresponding to each first-scale voxel is output through the second branch.
  • the second-scale point cloud features output by the upsampling network may be further extracted, and the second-scale point cloud features after further feature extraction may be input into the occupancy probability prediction network for occupancy probability prediction to enhance feature expression, thereby improving the accuracy of occupancy probability prediction.
  • the specific selection is made according to the actual situation, and the embodiments of the present application are not limited thereto.
  • S102 in FIG. 3 may also be implemented by executing the process of S401-S403 as follows:
  • S401 extracting features from geometric information of a first-scale point cloud to determine features of the first-scale point cloud.
  • S402 up-sample the first-scale point cloud features to a second scale, determine the second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxels.
  • the occupancy probability prediction and the local density prediction can be performed based on the first scale point cloud features determined by feature extraction of the geometric information of the first scale point cloud.
  • the feature information can be reused, the module complexity can be reduced, and the processing efficiency of the decoder can be improved.
  • the embodiment of the present application does not limit the execution order of S402 and S403.
  • the above S401-S403 process can be shown in FIG8.
  • the decoder extracts features from the geometric information of the first-scale point cloud through the fifth feature extraction network to determine the first-scale point cloud features; the first-scale point cloud features are respectively input into the first branch including the upsampling network and the occupancy probability prediction network, and the second branch including the local density prediction network; the occupancy probability corresponding to each second-scale voxel is output through the first branch, and the local density corresponding to each first-scale voxel is output through the second branch.
  • the fifth feature extraction network in FIG8 and the third feature extraction network in FIG6 can be the same or different feature extraction networks.
  • a local density prediction network can be used to perform local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
  • the local density prediction network may include: a first sparse convolution layer, a first activation function layer, a second sparse convolution layer, and a second activation function layer.
  • the local density prediction network may be shown in FIG9, including: a sparse convolution layer as the first layer (i.e., the first sparse convolution layer), a ReLU activation function layer as the second layer (i.e., the first activation function layer), a sparse convolution layer as the third layer (i.e., the second sparse convolution layer), and a Sigmoid function layer as the fourth layer (i.e., the second activation function layer).
  • the local density prediction network is used to output local density according to implicit features of the point cloud.
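  • The four-layer structure above can be sketched in plain Python under strong simplifying assumptions: each sparse convolution is replaced by a per-voxel linear map (equivalent to a 1×1×1 kernel), the weights are hypothetical, and the mapping from the Sigmoid output to an integer in the range 1 to 8 is an assumption that the text does not specify.

```python
import math

def linear(x, weights, bias):
    # One output per weight row: dot(row, x) + bias.
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def predict_local_density(feature, w1, b1, w2, b2, max_children=8):
    """Toy stand-in: linear -> ReLU -> linear -> Sigmoid for one first-scale voxel."""
    hidden = relu(linear(feature, w1, b1))
    score = sigmoid(linear(hidden, w2, b2)[0])  # value in (0, 1)
    # Assumed mapping from the Sigmoid output to an integer in [1, max_children].
    density = round(score * max_children)
    return min(max_children, max(1, density))

identity = [[1.0, 0.0], [0.0, 1.0]]
print(predict_local_density([1.0, 2.0], identity, [0.0, 0.0], [[0.0, 0.0]], [0.0]))
```

  • With an all-zero second layer the Sigmoid output is 0.5, so the assumed mapping yields a local density of 4; the clamp keeps the result within the range stated for the local density.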
  • the feature extraction process in S201, S202, S301, S303, S401, and S402 can be implemented using a feature extraction network as shown in FIG10, including: a sparse convolution layer as the first layer, an activation function layer as the second layer (e.g., a ReLU activation function layer), a residual layer as the third layer, and a sparse convolution layer as the fourth layer.
  • the network structure of the residual layer of the third layer can be as shown in FIG11.
  • the occupancy probability prediction process in S202, S302, and S402 can be implemented using an occupancy probability prediction network as shown in FIG12, including: a sparse convolution layer as the first layer, a ReLU activation function layer as the second layer, a sparse convolution layer as the third layer, a ReLU activation function layer as the fourth layer, a sparse convolution layer as the fifth layer, and a Sigmoid function layer as the sixth layer.
  • the occupancy probability prediction network is used to perform occupancy prediction on each second-scale voxel obtained by upsampling each first-scale voxel according to the implicit features of the point cloud of the second scale, and obtain the occupancy probability corresponding to each second-scale voxel.
  • the occupancy probability prediction network in FIG12 is only an example. In actual applications, it can also be a convolutional neural network (CNN) of other hierarchical structures. The specific selection is made according to the actual situation, and the embodiments of the present application are not limited thereto.
  • for an unoccupied first-scale voxel in the first-scale point cloud, the multiple second-scale voxels obtained by upsampling are also unoccupied, and no probability prediction is required.
  • the multiple second-scale voxels obtained by upsampling the occupied first-scale voxels in the first-scale point cloud need to be predicted.
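  • A minimal sketch of this voxel upsampling, assuming integer voxel coordinates and a factor of 2 along each axis, so that every occupied first-scale voxel yields 8 candidate second-scale voxels. The function name and the coordinate convention are illustrative, not taken from the application.

```python
from itertools import product

def upsample_voxels(occupied_parents):
    """Map each occupied first-scale voxel (x, y, z) to its 8 second-scale children."""
    children = {}
    for (x, y, z) in occupied_parents:
        children[(x, y, z)] = [
            (2 * x + dx, 2 * y + dy, 2 * z + dz)
            for dx, dy, dz in product((0, 1), repeat=3)
        ]
    return children

candidates = upsample_voxels([(1, 0, 2)])
print(len(candidates[(1, 0, 2)]))  # 8 candidate voxels need an occupancy probability
```

  • Unoccupied first-scale voxels never enter the dictionary, which mirrors the statement that their children need no probability prediction.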
  • the occupancy probability corresponding to the second-scale voxels facing the side of the paper in the second-scale point cloud in Figure 4 can be shown in Figure 13.
  • the occupancy probability in Figure 13 is only for convenience of explanation and cannot be understood as the result of actual calculation.
  • the decoder uses the predicted local density corresponding to each first-scale voxel in the first-scale point cloud and the occupancy probability corresponding to each second-scale voxel obtained by upsampling each first-scale voxel to determine whether each second-scale voxel is occupied, and then, according to whether each second-scale voxel is occupied, decodes and reconstructs the encoded information corresponding to the second-scale point cloud, decodes the position information corresponding to each occupied second-scale voxel in the second-scale point cloud, and determines the reconstructed geometric information corresponding to the second-scale point cloud according to the position information corresponding to each occupied second-scale voxel in the second-scale point cloud.
  • S103 may be implemented by executing the process of S1031 - S1032 as follows:
  • the multiple occupancy probabilities of the multiple second-scale voxels corresponding to each first-scale voxel can be screened according to the local density corresponding to that first-scale voxel: if the local density is k, the k second-scale voxels with the highest occupancy probabilities are determined as the occupied second-scale voxels.
  • the local density corresponding to the first-scale voxel can be any value from 1 to 8.
  • the decoder determines the 4 second-scale voxels with high occupancy probability among the 8 second-scale voxels as occupied second-scale voxels, and the other second-scale voxels are determined as unoccupied second-scale voxels.
  • a first-scale voxel in the first-scale point cloud is upsampled to obtain 8 second-scale voxels, and the local density corresponding to the first-scale voxel is 4; the occupancy probabilities corresponding to the 8 second-scale voxels are, from high to low, [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]. Then, according to the local density, the second-scale voxels with occupancy probabilities of [0.9, 0.8, 0.7, 0.6] can be determined as occupied second-scale voxels; and the other second-scale voxels corresponding to [0.5, 0.4, 0.3, 0.2] are determined as unoccupied second-scale voxels.
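  • The screening in this example can be sketched as a top-k selection, where k is the local density. The function name is illustrative, and breaking ties in favor of the lower index is an assumption; the text does not say how equal probabilities are handled.

```python
def select_occupied(probabilities, local_density):
    """Return 1 for the local_density voxels with the highest probability, 0 otherwise."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i], reverse=True)
    keep = set(order[:local_density])
    return [1 if i in keep else 0 for i in range(len(probabilities))]

probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
print(select_occupied(probs, 4))  # [1, 1, 1, 1, 0, 0, 0, 0]
```

  • Unlike a fixed probability threshold, this selection always marks exactly as many voxels as the local density indicates.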
  • the occupied second-scale voxels represent points in the point cloud that fall into the second-scale voxels obtained by upsampling the first-scale voxels.
  • the decoder can decode and reconstruct the encoded information corresponding to the second-scale point cloud based on the occupied second-scale voxels corresponding to each first-scale voxel, thereby determining the reconstructed geometric information corresponding to the second-scale point cloud.
  • the decoder may distinguish between occupied second-scale voxels and unoccupied second-scale voxels with different occupancy symbols according to the determination result of S1031.
  • the encoded information corresponding to the second-scale point cloud may be decoded and reconstructed based on each occupied second-scale voxel represented by the occupancy symbol, and the reconstructed geometric information corresponding to the second-scale point cloud may be determined.
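  • As a rough sketch of how the occupancy decision turns into geometry, the following assumes integer voxel coordinates, a factor of 2 per axis, and a fixed ordering of the 8 children; all three are illustrative assumptions. It returns the coordinates of the occupied second-scale voxels, which stand in for the reconstructed geometric information.

```python
from itertools import product

# Assumed ordering of the 8 children of a first-scale voxel.
CHILD_OFFSETS = list(product((0, 1), repeat=3))

def reconstruct_second_scale(parents, child_probs, densities):
    """parents: first-scale voxel coordinates; child_probs: 8 probabilities per parent;
    densities: local density per parent. Returns occupied second-scale coordinates."""
    points = []
    for (x, y, z), probs, k in zip(parents, child_probs, densities):
        ranked = sorted(range(8), key=lambda i: probs[i], reverse=True)[:k]
        for i in sorted(ranked):
            dx, dy, dz = CHILD_OFFSETS[i]
            points.append((2 * x + dx, 2 * y + dy, 2 * z + dz))
    return points

print(reconstruct_second_scale(
    [(0, 0, 0)], [[0.1, 0.9, 0.2, 0.8, 0.3, 0.1, 0.1, 0.1]], [2]))
```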
  • when the decoder performs geometric reconstruction, it mainly decodes and reconstructs the geometric coding information in the encoding information corresponding to the second-scale point cloud. After the reconstructed geometric information corresponding to the second-scale point cloud is determined, the attribute coding information corresponding to the second-scale point cloud can be decoded through the attribute decoding process, based on that reconstructed geometric information, to determine the attribute information corresponding to the second-scale point cloud. The reconstructed geometric information and the attribute information are then combined to restore the three-dimensional image model of the second-scale point cloud.
  • the decoder can determine the number of occupied second-scale voxels among the second-scale voxels obtained by upsampling each first-scale voxel by predicting the local density. In this way, the occupancy probabilities corresponding to the second-scale voxels can be screened in combination with the local density to determine the occupancy of the second-scale voxels, the second-scale point cloud can be reconstructed according to that occupancy, and the reconstructed geometric information of the second-scale point cloud can be determined.
  • the occupancy of the determined second-scale voxels can be made more accurate, and the accuracy of the reconstructed geometric information of the second-scale point cloud can be improved, that is, the reconstructed geometric quality of the decoder is improved, and the decoding performance is improved.
  • the decoder may continue to parse the code stream to determine the encoding information corresponding to the i-th scale point cloud, where i is an integer greater than or equal to 3; through the above decoding method in the embodiment of the present application, local density prediction is performed on the (i-1)-th scale voxels based on the reconstructed geometric information of the (i-1)-th scale point cloud to determine the local density corresponding to the (i-1)-th scale voxels, and occupancy probability prediction is performed on the i-th scale voxels corresponding to the (i-1)-th scale voxels to determine the occupancy probability corresponding to the i-th scale voxels; the i-th scale is obtained by upsampling the (i-1)-th scale; based on the occupancy probability corresponding to the i-th scale voxels and the local density corresponding to the (i-1)-th scale voxels, the encoding information corresponding to the i-th scale point cloud is decoded and
  • the decoder may also decode and reconstruct the encoding information of a higher scale point cloud in the code stream, such as a third scale point cloud, based on the reconstructed geometric information of the second scale point cloud, using the decoding method in the embodiment of the present application to obtain the reconstructed geometric information corresponding to the third scale point cloud.
  • when decoding adjacent scales, the decoder can use the known geometric data of the previously decoded low-scale point cloud to decode and reconstruct the encoded information of the high-scale point cloud and determine the reconstructed geometric information of the high-scale point cloud; decoding proceeds scale by scale until the reconstructed geometric information of the target-scale point cloud is restored, where the target scale can be determined according to the preset decoding accuracy of the decoder.
  • the decoder may also perform the above-mentioned decoding method in the embodiments of the present application to predict the local density of the n-th scale voxel and the occupancy probability of the (n+1)-th scale voxel based on the reconstructed geometric data corresponding to the n-th scale point cloud, and determine the local density corresponding to the n-th scale voxel and the occupancy probability corresponding to the (n+1)-th scale voxel; n is a positive integer greater than or equal to 2; the (n+1)-th scale is obtained by upsampling the n-th scale; based on the local density corresponding to the n-th scale voxel and the occupancy probability corresponding to the (n+1)-th scale voxel, the decoder determines the reconstructed geometric data corresponding to the (n+1)-th scale point cloud.
  • the decoder can perform voxel upsampling based on the first-scale voxels in the first-scale point cloud to obtain the second-scale voxels on the basis of decoding and reconstructing the first-scale point cloud; and extract features from the geometric information of the first-scale point cloud, perform local density prediction on the first-scale voxels, and perform occupation probability prediction on the second-scale voxels, and reconstruct the reconstructed geometric data corresponding to the second-scale point cloud according to the determined local density of the first-scale voxels and the occupation probability of the second-scale voxels, combined with the second position information of the second-scale voxels determined by upsampling the first position information of the first-scale
  • the decoding method in the embodiment of the present application can be applied to a scalable encoding and decoding method, that is, for multiple encoding information of multiple scale point clouds sent by the encoder side, the decoder can decode and reconstruct the point cloud of any scale in the order of decoding from low scale to high scale according to the actual decoding accuracy requirements.
  • the encoder writes and sends the encoding information of the first scale point cloud, the encoding information of the second scale point cloud to the encoding information of the fifth scale point cloud in the code stream, and the decoder can decode from the first scale point cloud to the third scale point cloud according to the preset accuracy requirements according to the decoding method provided in the embodiment of the present application, reconstruct the geometric data of the third scale point cloud and restore the three-dimensional image model of the third scale point cloud, and then end the decoding, and no longer continue to decode the encoding information corresponding to the fifth scale point cloud.
  • the specific selection is based on the actual situation, and the embodiment of the present application is not limited.
  • the decoding method provided in the embodiment of the present application can be repeatedly applied between multiple adjacent scales, and the decoding between each group of adjacent scales is independent of each other, so scale-scalable decoding can be flexibly implemented.
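  • The scale-scalable behaviour described above can be sketched as a loop that applies one adjacent-scale decoding step at a time and stops at the target scale. Here `decode_step` is a hypothetical placeholder for the per-scale decoding method (prediction plus reconstruction), and `layers` is an assumed list holding the encoding information for scales 2, 3, and so on.

```python
def decode_scalable(first_scale_geometry, layers, target_scale, decode_step):
    """Decode scale by scale, starting from the first scale, and stop at target_scale."""
    geometry = first_scale_geometry
    scale = 1
    for layer in layers:
        if scale >= target_scale:
            break  # remaining higher-scale layers are left undecoded
        geometry = decode_step(geometry, layer)
        scale += 1
    return scale, geometry

# Toy step that only records which layers were consumed.
toy_step = lambda geometry, layer: geometry + [layer]
print(decode_scalable(["s1"], ["s2", "s3", "s4", "s5"], 3, toy_step))
# (3, ['s1', 's2', 's3'])
```

  • Because each step depends only on the previously reconstructed scale and the current layer, the loop can stop after any scale without touching the remaining layers.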
  • each decoding process of the above decoder uses the decoded low-scale point cloud as known information to decode the encoded information of the high-scale point cloud.
  • the known information may be a preset number of unencoded point cloud information sent by the encoder side.
  • the encoder may send a preset number of point cloud information, such as the coordinates of 100 points in the point cloud, as the first known information directly to the decoding end in an unencoded manner, so that the decoder does not need to decode the first known information, but directly uses the position information of the preset number of points sent by the encoder to reconstruct the point cloud of the corresponding scale to continue the subsequent decoding process.
  • Figure 16 is an optional flow chart of the encoding method provided in an embodiment of the present application, which will be explained in conjunction with the steps shown in Figure 16.
  • the encoder voxelizes the original point cloud data of the second scale point cloud, and the voxelized point cloud can represent the geometric data of the point cloud by the occupation symbol of the voxel at each position in the voxel grid.
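  • A minimal voxelization sketch, assuming a uniform grid with a hypothetical `voxel_size` parameter: each point is quantized to the integer index of the voxel that contains it, and duplicate indices collapse into a single occupied voxel.

```python
import math

def voxelize(points, voxel_size):
    """Return the sorted list of occupied voxel indices for a list of (x, y, z) points."""
    return sorted({tuple(math.floor(c / voxel_size) for c in p) for p in points})

cloud = [(0.1, 0.2, 0.3), (0.4, 0.4, 0.4), (1.2, 0.0, 0.9)]
print(voxelize(cloud, 0.5))  # [(0, 0, 0), (2, 0, 1)]
```

  • A voxel listed in the result carries the occupied symbol; every other voxel of the grid is unoccupied.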
  • the encoder performs voxel downsampling on the voxelized second scale point cloud to determine the first scale point cloud.
  • the encoder can implement voxel downsampling by pooling. As shown in FIG17, a maximum pooling layer with a stride of 2×2×2 is used to merge 8 second-scale voxels into 1 first-scale voxel. After voxel downsampling, 3 of the 4 first-scale voxels corresponding to the first-scale point cloud are occupied, and 1 first-scale voxel is not occupied. The size of the first-scale voxel in three dimensions is twice that of the second-scale voxel. The encoder marks the occupancy of the voxel with an occupancy symbol.
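  • On a binary occupancy grid, 2×2×2 maximum pooling with stride 2 marks a first-scale voxel as occupied exactly when at least one of its 8 second-scale voxels is occupied. With a sparse list of occupied coordinates this reduces to integer division by 2 followed by deduplication, as sketched below; non-negative integer coordinates are assumed.

```python
def downsample_voxels(occupied_children):
    """Sparse equivalent of 2x2x2 max pooling with stride 2 on an occupancy grid."""
    return sorted({(x // 2, y // 2, z // 2) for (x, y, z) in occupied_children})

second_scale = [(0, 0, 0), (1, 1, 1), (2, 0, 0), (3, 1, 0)]
print(downsample_voxels(second_scale))  # [(0, 0, 0), (1, 0, 0)]
```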
  • the process of obtaining, by voxel downsampling, the occupancy symbols corresponding to the first-scale voxels on the side of the first-scale point cloud facing the paper surface from the occupancy symbols corresponding to the second-scale voxels on the side of the second-scale point cloud facing the paper surface can be shown in FIG18. In this way, by voxel downsampling, the geometric data of the relatively low-scale first-scale point cloud is obtained.
  • the encoder then upsamples the first-scale point cloud to the second scale, determines a plurality of second-scale voxels corresponding to each first-scale voxel, and implements a lossless encoding process by predicting the occupancy of the second-scale voxels.
  • the local density represents the number of occupied second-scale voxels in the second-scale voxels corresponding to the first-scale voxels.
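  • For illustration, the quantity that the local density describes can be counted directly when the second-scale occupancy is known, as it is on the encoder side. The sketch assumes non-negative integer coordinates and a factor of 2 per axis; the function name is illustrative.

```python
from collections import Counter

def true_local_density(occupied_children):
    """Count the occupied second-scale voxels under each first-scale voxel."""
    return Counter((x // 2, y // 2, z // 2) for (x, y, z) in occupied_children)

second_scale = [(0, 0, 0), (1, 1, 1), (2, 0, 0), (3, 1, 0), (3, 1, 1)]
print(dict(true_local_density(second_scale)))  # {(0, 0, 0): 2, (1, 0, 0): 3}
```

  • Every first-scale voxel that appears in the result has a count between 1 and 8, which matches the value range given for the local density.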
  • the encoder performs local density prediction based on the first-scale point cloud, determines the local density corresponding to the first-scale voxels, and predicts the occupancy probability of the second-scale voxels.
  • the process of determining the occupancy probability corresponding to the second-scale voxels is the same as the same processing method in the decoder, and will not be repeated here.
  • the process of S502 may include: performing feature extraction on the geometric information of the first-scale point cloud to determine the first-scale point cloud features; upsampling the first-scale point cloud features to the second scale to determine the initial second-scale point cloud features, performing feature extraction on the initial second-scale point cloud features to determine the second-scale point cloud features, and performing occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel; performing local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
  • the process of S502 may also include: performing feature extraction on the geometric information of the first-scale point cloud through a first feature extraction network to determine the first point cloud features of the first scale; upsampling the first point cloud features of the first scale to a second scale to determine the second-scale point cloud features, and performing occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel; performing feature extraction on the geometric data of the first-scale point cloud through a second feature extraction network to determine the second point cloud features of the first scale; performing local density prediction based on the second point cloud features of the first scale to determine the local density corresponding to the first-scale voxel.
  • the process of S502 may also include: extracting features from the geometric information of the first-scale point cloud to determine the first-scale point cloud features; upsampling the first-scale point cloud features to the second scale to determine the second-scale point cloud features, and performing occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel; performing local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
  • the encoder can use the local density prediction network to perform local density prediction according to the first-scale point cloud features to determine the local density corresponding to the first-scale voxel; wherein the local density prediction network includes: a first sparse convolution layer, a first activation function layer, a second sparse convolution layer and a second activation function layer.
  • S503 Determine the reconstructed geometric information corresponding to the second-scale point cloud according to the local density corresponding to the first-scale voxel and the occupancy probability corresponding to the second-scale voxel.
  • the encoder determines the reconstructed geometric information corresponding to the second-scale point cloud based on the local density corresponding to the first-scale voxel and the occupancy probability corresponding to the second-scale voxel, which is consistent with the description of the same processing process of the decoder and is not repeated here.
  • S503 may be implemented by executing the process of S5031-S5032 as follows:
  • S5032 Determine reconstructed geometric information corresponding to the second-scale point cloud based on the occupied second-scale voxels corresponding to each first-scale voxel.
  • encoding is performed based on the reconstructed geometric information corresponding to the second-scale point cloud, encoding information corresponding to the second-scale point cloud is determined, and the encoding information is written into a bitstream.
  • the encoder can perform recoloring processing based on the reconstructed geometric information corresponding to the second-scale point cloud to obtain colored point cloud data, perform color information encoding based on the colored point cloud data, and determine the attribute information encoding corresponding to the second-scale point cloud; perform geometric information encoding on the point cloud data of the second-scale point cloud to determine the geometric information encoding corresponding to the second-scale point cloud; and combine the geometric information encoding and the attribute information encoding to determine the encoding information corresponding to the second-scale point cloud.
  • entropy coding may use a context-based adaptive binary arithmetic coding (CABAC) algorithm, but is not limited thereto.
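  • The sketch below is not CABAC. It only computes the ideal (Shannon) code length that an arithmetic coder approaches when binary occupancy symbols are coded with given probabilities, to illustrate why more accurate probabilities reduce the bit cost. Whether and how the application feeds its predicted probabilities into the entropy coder is an assumption here.

```python
import math

def ideal_code_length_bits(occupancy, probabilities):
    """Sum of -log2(probability assigned to the symbol that actually occurred)."""
    bits = 0.0
    for symbol, p in zip(occupancy, probabilities):
        q = p if symbol == 1 else 1.0 - p
        bits += -math.log2(q)
    return bits

occupancy = [1, 0, 1, 0]
print(ideal_code_length_bits(occupancy, [0.5, 0.5, 0.5, 0.5]))  # 4.0
print(ideal_code_length_bits(occupancy, [0.9, 0.1, 0.9, 0.1]))  # fewer than 4 bits
```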
  • the encoder writes the code corresponding to the second-scale point cloud into the bitstream and sends it to the decoder, which parses the encoding information corresponding to the second-scale point cloud.
  • the decoding method provided in the embodiment of the present application is used to send the geometric data of the previously decoded low-scale point cloud (such as the first-scale point cloud) to the entropy decoder, so that the geometric data of the lossless second-scale point cloud can be reconstructed, that is, the reconstructed geometric information corresponding to the second-scale point cloud is determined, and then the geometric decoding and attribute decoding of the encoding information of the second-scale point cloud are realized based on the reconstructed geometric information corresponding to the second-scale point cloud, and the three-dimensional image model of the second-scale point cloud is restored.
  • the occupancy probability corresponding to the second-scale voxels is screened, which can improve the accuracy of determining the occupancy status of the second-scale voxels, and then improve the accuracy of encoding the reconstructed geometric information of the second-scale point cloud determined based on the occupancy status of the second-scale voxels, thereby improving the encoding performance.
  • the encoder can also perform at least one voxel downsampling based on the first-scale point cloud through the same encoding process, determine a point cloud with a lower scale than the first-scale point cloud, and complete the encoding of multiple-scale point clouds scale by scale.
  • the encoder writes the encoding information of the multiple-scale point clouds into the bitstream and sends it to the decoder.
  • the encoding method provided in the embodiment of the present application can be repeatedly applied between multiple adjacent scales, and the encoding between each group of adjacent scales is independent of each other, so scale-scalable encoding can be flexibly implemented.
  • the encoder performs G-PCC encoding on the original point cloud data of the first scale point cloud to obtain the encoding information corresponding to the first scale point cloud, and sends it to the decoder through the code stream.
  • the decoder performs G-PCC decoding on the encoding information corresponding to the first scale point cloud data, and determines the reconstructed geometric information of the first scale point cloud through the geometric decoding process in G-PCC decoding.
  • taking the geometric decoding process in G-PCC decoding as an example, the decoder extracts features from the reconstructed geometric information of the first-scale point cloud to obtain the first-scale point cloud features, and upsamples the first-scale point cloud features to the second scale through a 2×2×2 transposed convolution to determine the initial second-scale point cloud features; the decoder then extracts features from the initial second-scale point cloud features to determine the second-scale point cloud features.
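  • A 2×2×2 transposed convolution with stride 2 can be sketched on sparse features as follows: each first-scale feature vector is multiplied by one weight matrix per child offset, and since the kernel size equals the stride, the outputs do not overlap. The data layout (dictionaries keyed by coordinates) and the weights are illustrative assumptions; a real implementation would use a sparse convolution library.

```python
from itertools import product

def transposed_conv_2x2x2(features, kernel):
    """features: {(x, y, z): [c_in values]};
    kernel: {(dx, dy, dz): matrix of c_out rows by c_in columns}.
    Returns {(2x+dx, 2y+dy, 2z+dz): [c_out values]}."""
    out = {}
    for (x, y, z), f in features.items():
        for off in product((0, 1), repeat=3):
            weights = kernel[off]
            out[(2 * x + off[0], 2 * y + off[1], 2 * z + off[2])] = [
                sum(w * v for w, v in zip(row, f)) for row in weights
            ]
    return out

# Hypothetical kernel: one output channel whose weight depends on the child offset.
kernel = {off: [[float(sum(off) + 1), 0.0]] for off in product((0, 1), repeat=3)}
upsampled = transposed_conv_2x2x2({(0, 0, 0): [1.0, 2.0]}, kernel)
print(len(upsampled), upsampled[(1, 1, 1)])  # 8 [4.0]
```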
  • the decoder predicts the occupancy probability of multiple second scale voxels obtained by upsampling each first scale voxel in the first scale point cloud based on the second scale point cloud features, and obtains multiple occupancy probabilities corresponding to the multiple second scale voxels; and predicts the local density of each first scale voxel based on the first scale point cloud features to obtain the local density corresponding to each first scale voxel.
  • the decoder screens the multiple occupancy probabilities corresponding to the multiple second-scale voxels upsampled from each first-scale voxel according to the local density corresponding to that first-scale voxel; illustratively, the multiple occupancy probabilities are sorted from high to low, the k highest occupancy probabilities are selected, where k is the local density, and the k second-scale voxels corresponding to those occupancy probabilities are determined as the occupied second-scale voxels.
  • the reconstructed geometric information of the second-scale point cloud is determined according to the occupied second-scale voxels corresponding to each first-scale voxel in the first-scale point cloud. Based on the reconstructed geometric information of the second-scale point cloud, the reconstructed geometric information of the third-scale point cloud is determined by the same process.
  • the embodiment of the present application has good geometric reconstruction quality and better encoding and decoding performance than the traditional G-PCC method.
  • the applicant conducted a comparative encoding and decoding test on multiple point cloud data sets using the encoding and decoding method of the embodiment of the present application and the traditional G-PCC method. The results are shown in Table 1, as follows:
  • Table 1. Rate-distortion performance gain of the embodiment of the present application compared to the traditional G-PCC:
    Point cloud dataset | Rate-distortion performance gain
    facade_00009_vox12 | -25.91%
    house_without_roof_00057_vox12 | -43.11%
    boxer_viewdep_vox12 | -42.76%
    soldier_viewdep_vox12 | -45.60%
    Average gain | -39.35%
  • facade_00009_vox12, house_without_roof_00057_vox12, boxer_viewdep_vox12 and soldier_viewdep_vox12 are different point cloud data sets, and the reported value is the BD-rate gain over the traditional G-PCC.
  • the rate-distortion (Bjontegaard-Delta rate, BD-rate) performance gain value is negatively correlated with the codec performance; that is, the more negative the value, the better the codec performance.
  • the embodiment of the present application achieves a BD-rate gain on every data set in Table 1, with a maximum gain of 45.60% and an average gain of 39.35% in magnitude. These data illustrate the improvement in codec performance.
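The average and maximum figures quoted above can be checked directly from the four per-dataset values in Table 1. The short sketch below does only that arithmetic; the variable names are illustrative, and decimal arithmetic is used so that the tie at -39.345 rounds the same way as in the table.

```python
from decimal import Decimal, ROUND_HALF_UP

# BD-rate values in percent, copied from Table 1.
bd_rate = {
    "facade_00009_vox12": Decimal("-25.91"),
    "house_without_roof_00057_vox12": Decimal("-43.11"),
    "boxer_viewdep_vox12": Decimal("-42.76"),
    "soldier_viewdep_vox12": Decimal("-45.60"),
}

# Arithmetic mean, rounded to two decimal places.
average = (sum(bd_rate.values()) / len(bd_rate)).quantize(
    Decimal("0.01"), rounding=ROUND_HALF_UP
)

# The most negative value is the largest gain.
best_dataset = min(bd_rate, key=bd_rate.get)

print(f"average BD-rate: {average}%")
print(f"largest gain: {best_dataset} ({bd_rate[best_dataset]}%)")
```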
  • the embodiment of the present application provides a decoder 1, as shown in FIG. 20, including:
  • the parsing part 11 is configured to parse the bit stream and determine the encoding information corresponding to the second scale point cloud;
  • the determining part 12 is configured to determine a first scale point cloud; the first scale point cloud is the previously decoded point cloud data corresponding to the second scale point cloud;
  • the first prediction part 13 is configured to perform local density prediction based on the first scale point cloud, determine the local density corresponding to the first scale voxel in the first scale point cloud, and perform occupancy probability prediction on the second scale voxel to determine the occupancy probability corresponding to the second scale voxel;
  • the second scale voxel is an upsampled voxel corresponding to the first scale voxel;
  • the local density represents the number of occupied second scale voxels in the second scale voxel corresponding to the first scale voxel;
  • the decoding and reconstruction part 14 is configured to decode and reconstruct the encoded information corresponding to the second scale point cloud based on the occupancy probability corresponding to the second scale voxel and the local density corresponding to the first scale voxel, and determine the reconstructed geometric information corresponding to the second scale point cloud.
  • the decoding and reconstruction part 14 is further configured to, for each first-scale voxel in the first-scale point cloud, determine, among the multiple second-scale voxels corresponding to that first-scale voxel, the second-scale voxels with the highest occupancy probabilities, equal in number to the local density, as occupied second-scale voxels; and, based on the occupied second-scale voxels corresponding to each first-scale voxel, decode and reconstruct the encoded information corresponding to the second-scale point cloud to determine the reconstructed geometric information corresponding to the second-scale point cloud.
  • the first prediction part 13 is further configured to perform feature extraction on the geometric information of the first-scale point cloud to determine the first-scale point cloud features; upsample the first-scale point cloud features to the second scale to determine the initial second-scale point cloud features, perform feature extraction on the initial second-scale point cloud features to determine the second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel; perform local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
  • the first prediction part 13 is further configured to perform feature extraction on the geometric information of the first-scale point cloud through a first feature extraction network to determine the first point cloud features of the first scale; upsample the first point cloud features of the first scale to a second scale to determine the second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel; perform feature extraction on the geometric data of the first-scale point cloud through a second feature extraction network to determine the second point cloud features of the first scale; perform local density prediction based on the second point cloud features of the first scale to determine the local density corresponding to the first-scale voxel.
  • the first prediction part 13 is further configured to perform feature extraction on the geometric information of the first-scale point cloud to determine the first-scale point cloud features; upsample the first-scale point cloud features to the second scale to determine the second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel; perform local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
  • the first prediction part 13 is further configured to use a local density prediction network to perform local density prediction according to the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
  • the local density prediction network includes:
  • the first sparse convolution layer, the first activation function layer, the second sparse convolution layer and the second activation function layer.
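The four-layer structure of the local density prediction network can be illustrated with a toy, dependency-free sketch. Everything beyond the layer order is an assumption made for illustration: the 3×3×3 kernel, the single scalar feature per voxel, the use of ReLU as the activation function, the evaluation of outputs only at occupied voxels, and the final rounding and clamping to the range 1 to 8. A practical implementation would use a sparse-tensor library and trained weights.

```python
from itertools import product
from typing import Dict, Tuple

Voxel = Tuple[int, int, int]
SparseFeat = Dict[Voxel, float]  # one scalar feature per occupied voxel

# Assumed 3x3x3 neighbourhood for each sparse convolution.
OFFSETS = list(product((-1, 0, 1), repeat=3))


def sparse_conv(feat: SparseFeat, weights: Dict[Voxel, float], bias: float) -> SparseFeat:
    """Convolution evaluated only at occupied voxels; empty neighbours add 0."""
    out: SparseFeat = {}
    for (x, y, z) in feat:
        acc = bias
        for (dx, dy, dz) in OFFSETS:
            neighbour = (x + dx, y + dy, z + dz)
            if neighbour in feat:
                acc += weights[(dx, dy, dz)] * feat[neighbour]
        out[(x, y, z)] = acc
    return out


def relu(feat: SparseFeat) -> SparseFeat:
    """Assumed activation function; the source does not name one."""
    return {v: max(0.0, f) for v, f in feat.items()}


def predict_local_density(
    feat: SparseFeat,
    w1: Dict[Voxel, float], b1: float,
    w2: Dict[Voxel, float], b2: float,
) -> Dict[Voxel, int]:
    # Layer order from the text: conv -> activation -> conv -> activation.
    h = relu(sparse_conv(feat, w1, b1))
    h = relu(sparse_conv(h, w2, b2))
    # Assumption: an occupied first-scale voxel has between 1 and 8 occupied
    # children, so the output is rounded and clamped to that range.
    return {v: min(8, max(1, round(f))) for v, f in h.items()}
```

With hand-picked uniform weights, `predict_local_density({(0, 0, 0): 1.0, (1, 0, 0): 1.0}, w, 0.0, w, 0.0)` where `w = {o: 1.0 for o in OFFSETS}` produces a density of 4 for both voxels; trained weights would of course give different values.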
  • the parsing part 11 is further configured to parse the code stream to determine the encoding information corresponding to the i-th scale point cloud; i is an integer greater than or equal to 3; the first prediction part 13 is further configured to perform local density prediction of the i-1th scale voxel based on the reconstructed geometric information of the i-1th scale point cloud, determine the local density corresponding to the i-1 scale voxel, and perform occupancy probability prediction on the i-th scale voxel corresponding to the i-1 scale voxel to determine the occupancy probability corresponding to the i-th scale voxel; the i-th scale is obtained by sampling the i-1th scale; the decoding and reconstruction part 14 is further configured to decode and reconstruct the encoding information corresponding to the i-th scale point cloud based on the occupancy probability corresponding to the i-th scale voxel and the local density corresponding to the i-1th scale voxel, and determine the
  • the first prediction part 13 is further configured to perform local density prediction of the n-th scale voxel and occupancy probability prediction of the n+1-th scale voxel based on the reconstructed geometric data corresponding to the n-th scale point cloud, and determine the local density corresponding to the n-th scale voxel and the occupancy probability corresponding to the n+1-th scale voxel; n is a positive integer greater than or equal to 2; the n+1-th scale is obtained by upsampling the n-th scale; the decoding and reconstruction part 14 is further configured to determine the reconstructed geometric data corresponding to the n+1-th scale point cloud based on the local density corresponding to the n-th scale voxel and the occupancy probability corresponding to the n+1-th scale voxel.
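The scale-by-scale reconstruction described above, in which the n+1-th scale is rebuilt from the reconstructed n-th scale and the process repeats, can be sketched as a simple loop. The `Predictor` interface, the child ordering and the `toy_predictor` below are hypothetical stand-ins for the prediction networks and are not taken from the source.

```python
from typing import Callable, Dict, List, Set, Tuple

Voxel = Tuple[int, int, int]

# Hypothetical predictor interface: for the reconstructed point cloud at scale
# n, return the 8 child occupancy probabilities and the local density of each
# voxel at scale n.
Predictor = Callable[
    [Set[Voxel]], Tuple[Dict[Voxel, List[float]], Dict[Voxel, int]]
]

# Assumed ordering of the 8 children of a voxel.
CHILDREN: List[Voxel] = [
    (dx, dy, dz) for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)
]


def upscale_once(cloud: Set[Voxel], predict: Predictor) -> Set[Voxel]:
    """Reconstruct scale n+1 from the reconstructed geometry at scale n."""
    probs, density = predict(cloud)
    result: Set[Voxel] = set()
    for (x, y, z) in cloud:
        p = probs[(x, y, z)]
        k = max(0, min(8, density[(x, y, z)]))
        # Keep the k children with the highest occupancy probability.
        for i in sorted(range(8), key=lambda j: -p[j])[:k]:
            dx, dy, dz = CHILDREN[i]
            result.add((2 * x + dx, 2 * y + dy, 2 * z + dz))
    return result


def reconstruct(cloud: Set[Voxel], predict: Predictor, steps: int) -> Set[Voxel]:
    """Apply the same process repeatedly, one scale at a time."""
    for _ in range(steps):
        cloud = upscale_once(cloud, predict)
    return cloud


def toy_predictor(cloud: Set[Voxel]):
    """Fixed outputs for demonstration only: children 0 and 7 are most likely."""
    probs = {v: [0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.8] for v in cloud}
    density = {v: 2 for v in cloud}
    return probs, density
```

For example, `reconstruct({(0, 0, 0)}, toy_predictor, 2)` walks two scales up and returns four voxels on the main diagonal.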
  • the embodiment of the present application provides an encoder 2, as shown in FIG. 21, including:
  • a downsampling part 21 is configured to perform voxel downsampling on the second scale point cloud to determine a first scale point cloud;
  • the second prediction part 22 is configured to perform local density prediction based on the first scale point cloud to determine the local density corresponding to the first scale voxel, and perform occupancy probability prediction on the second scale voxel to determine the occupancy probability corresponding to the second scale voxel; the local density represents the number of occupied second scale voxels in the second scale voxels corresponding to the first scale voxels;
  • a reconstruction part 23 configured to determine the reconstructed geometric information corresponding to the second-scale point cloud according to the local density corresponding to the first-scale voxel and the occupancy probability corresponding to the second-scale voxel;
  • the encoding part 24 is configured to perform encoding based on the reconstructed geometric information corresponding to the second-scale point cloud, determine the encoding information corresponding to the second-scale point cloud, and write the encoding information into a bitstream.
  • the reconstruction part 23 is further configured to, for each first-scale voxel in the first-scale point cloud, determine, among the multiple second-scale voxels corresponding to that first-scale voxel, the second-scale voxels with the highest occupancy probabilities, equal in number to the local density, as occupied second-scale voxels; and determine the reconstructed geometric information corresponding to the second-scale point cloud based on the occupied second-scale voxels corresponding to each first-scale voxel.
  • the second prediction part 22 is further configured to perform feature extraction on the geometric information of the first-scale point cloud to determine the first-scale point cloud features; upsample the first-scale point cloud features to the second scale to determine the initial second-scale point cloud features, perform feature extraction on the initial second-scale point cloud features to determine the second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel; perform local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
  • the second prediction part 22 is further configured to perform feature extraction on the geometric information of the first-scale point cloud through a first feature extraction network to determine the first point cloud features of the first scale; upsample the first point cloud features of the first scale to a second scale to determine the second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel; perform feature extraction on the geometric data of the first-scale point cloud through a second feature extraction network to determine the second point cloud features of the first scale; perform local density prediction based on the second point cloud features of the first scale to determine the local density corresponding to the first-scale voxel.
  • the second prediction part 22 is further configured to perform feature extraction on the geometric information of the first-scale point cloud to determine the first-scale point cloud features; upsample the first-scale point cloud features to the second scale to determine the second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel; perform local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
  • the second prediction part 22 is further configured to use a local density prediction network to perform local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel; wherein the local density prediction network includes: a first sparse convolution layer, a first activation function layer, a second sparse convolution layer and a second activation function layer.
  • the encoding part 24 is further configured to perform recoloring processing based on the reconstructed geometric information corresponding to the second-scale point cloud to obtain colored point cloud data, perform color information encoding based on the colored point cloud data, and determine the attribute information encoding corresponding to the second-scale point cloud; perform geometric information encoding on the point cloud data of the second-scale point cloud to determine the geometric information encoding corresponding to the second-scale point cloud; and combine the geometric information encoding and the attribute information encoding to determine the encoding information corresponding to the second-scale point cloud.
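The voxel downsampling performed by the downsampling part 21, and the quantity that the local density represents, can be illustrated with a short sketch. It assumes a downsampling factor of 2 along each axis and non-negative integer voxel coordinates; the function name `voxel_downsample` is hypothetical. Note that in the embodiment the local density is predicted by a network; the count computed here is only the quantity that such a prediction approximates.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

Voxel = Tuple[int, int, int]


def voxel_downsample(
    second_scale: Iterable[Voxel],
) -> Tuple[Set[Voxel], Dict[Voxel, int]]:
    """Downsample by 2 per axis and count occupied children per parent voxel.

    Returns the first-scale voxel set and, for each first-scale voxel, the
    number of occupied second-scale voxels it covers (between 1 and 8).
    """
    unique = set(second_scale)  # ignore duplicated points
    counts = Counter((x // 2, y // 2, z // 2) for (x, y, z) in unique)
    return set(counts), dict(counts)


# Usage: four occupied second-scale voxels collapse to three first-scale voxels.
first_scale, local_density = voxel_downsample(
    [(0, 0, 0), (1, 1, 1), (2, 0, 0), (5, 4, 4)]
)
```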
  • the embodiment of the present application further provides a decoder.
  • FIG. 22 is an optional structural diagram of the decoder 3 provided in the embodiment of the present application.
  • the decoder 3 includes: a first memory 32 and a first processor 33.
  • the first memory 32 and the first processor 33 are connected through a first communication bus 34;
  • the first memory 32 is used to store executable instructions;
  • the first processor 33 is used to execute the executable instructions stored in the first memory 32, and implement the decoding method provided in the embodiment of the present application.
  • the embodiment of the present application further provides an encoder.
  • FIG. 23 is an optional structural diagram of the encoder 4 provided in the embodiment of the present application.
  • the encoder 4 includes: a second memory 42 and a second processor 43.
  • the second memory 42 and the second processor 43 are connected via a second communication bus 44;
  • the second memory 42 is used to store executable instructions;
  • the second processor 43 is used to execute the executable instructions stored in the second memory 42, and implement the encoding method provided in the embodiment of the present application.
  • An embodiment of the present application provides a computer-readable storage medium in which executable instructions are stored.
  • When the executable instructions are executed by a first processor, the first processor is caused to execute any one of the decoding methods provided in the embodiments of the present application; or, when the executable instructions are executed by a second processor, the second processor is caused to execute any one of the encoding methods provided in the embodiments of the present application.
  • the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface storage, optical disk, or CD-ROM; or it may be various devices including one or any combination of the above memories.
  • executable instructions may be in the form of a program, software, software module, script or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine or other unit suitable for use in a computing environment.
  • executable instructions may, but do not necessarily, correspond to a file in a file system, may be stored as part of a file that stores other programs or data, such as, for example, in one or more scripts in a Hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files storing one or more modules, subroutines, or code portions).
  • executable instructions may be deployed to be executed on one computing device, or on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
  • the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the present application may adopt the form of hardware embodiments, software embodiments, or embodiments in combination with software and hardware. Moreover, the present application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage, etc.) that contain computer-usable program code.
  • each flow process and/or box in the flow chart and/or block diagram and the combination of the flow process and/or box in the flow chart and/or block diagram can be realized by computer program instructions.
  • These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processing machine or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for realizing the function specified in one flow chart or multiple flows and/or one box or multiple boxes of the block chart.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured product including an instruction device that implements the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device so that a series of operational steps are executed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
  • the embodiment of the present application provides a coding and decoding method, a decoder, an encoder and a computer-readable storage medium.
  • by predicting the local density, the decoder can determine the number of occupied second-scale voxels among the second-scale voxels obtained by upsampling each first-scale voxel. In this way, the occupancy probability corresponding to the second-scale voxel can be screened in combination with the local density to determine the occupancy of the second-scale voxel, the second-scale point cloud can be reconstructed according to the occupancy of the second-scale voxel, and the reconstructed geometric information of the second-scale point cloud can be determined.
  • in this way, the determined occupancy of the second-scale voxel is more accurate, the accuracy of the reconstructed geometric information of the second-scale point cloud is improved, and thus the geometric reconstruction quality and the decoding performance of the decoder are improved.
  • in addition, in the encoder, the occupancy probability corresponding to the second-scale voxel is screened based on the local density corresponding to the first-scale voxel, which improves the accuracy of determining the occupancy of the second-scale voxel, and in turn improves the accuracy of encoding based on the reconstructed geometric information of the second-scale point cloud determined from that occupancy, thereby improving the encoding performance.

Abstract

Provided in the embodiments of the present application are an encoding method, a decoding method, a decoder, an encoder and a computer-readable storage medium, which can improve the encoding and decoding performance. The decoding method comprises: parsing a code stream to determine coded information corresponding to a second-scale point cloud, and determining a first-scale point cloud; performing local density prediction on the basis of the first-scale point cloud to determine a local density corresponding to first-scale voxels in the first-scale point cloud, and performing occupancy probability prediction on second-scale voxels to determine an occupancy probability corresponding to the second-scale voxels, the second-scale voxels being up-sampling voxels corresponding to the first-scale voxels, and the local density representing the number of occupied second-scale voxels amongst the second-scale voxels corresponding to the first-scale voxels; and, on the basis of the occupancy probability corresponding to the second-scale voxels and the local density corresponding to the first-scale voxels, decoding and reconstructing the coded information corresponding to the second-scale point cloud, so as to determine reconstructed geometric information corresponding to the second-scale point cloud.

Description

Coding and decoding method, decoder, encoder and computer-readable storage medium
Technical Field
The present application relates to point cloud compression coding and decoding technology, and in particular to a coding and decoding method, a decoder, an encoder and a computer-readable storage medium.
Background
A point cloud is a collection of points that stores the geometric position and related attribute information of each point, so as to accurately describe objects in three-dimensional space. Point cloud data is voluminous: a single frame can contain millions of points, which poses great difficulties and challenges for the effective storage and transmission of point clouds. Compression technology is therefore used to reduce redundant information in point cloud storage, so as to facilitate subsequent processing.
At present, representative point cloud compression algorithms include Video-based Point Cloud Compression (V-PCC) and Geometry-based Point Cloud Compression (G-PCC). Geometric compression in G-PCC is mainly implemented through octree models and/or triangular surface models. V-PCC is mainly implemented through three-dimensional to two-dimensional projection and video compression. In both compression algorithms, the structural information of the real environment expressed by the point cloud is restored by reconstructing the geometric information of the point cloud. Generally, the process of reconstructing the geometric information of the point cloud includes: using a sparse convolutional neural network that takes the voxelized geometric information of the low-scale point cloud as input to predict the occupancy probability of each high-scale voxel in the high-scale point cloud; then determining the occupancy symbol of each high-scale voxel according to its occupancy probability, and reconstructing the geometric information of the high-scale point cloud from the high-scale voxels that the occupancy symbols mark as occupied.
However, geometric reconstruction based on occupancy probability prediction is prone to inaccurate occupancy symbol determination when the occupancy probabilities of multiple high-scale voxels are close to one another or when the occupancy probability threshold is set unreasonably, which reduces the quality of geometric reconstruction and in turn the encoding and decoding performance.
Summary of the Invention
The embodiments of the present application provide a coding and decoding method, a decoder, an encoder and a computer-readable storage medium, which can improve the quality of geometric reconstruction in point cloud coding and decoding, thereby improving coding and decoding performance.
The technical solution of the present application is implemented as follows:
An embodiment of the present application provides a decoding method, including:
parsing a bitstream to determine the encoding information corresponding to a second-scale point cloud, and determining a first-scale point cloud; the first-scale point cloud is the previously decoded point cloud data corresponding to the second-scale point cloud;
performing local density prediction based on the first-scale point cloud to determine the local density corresponding to a first-scale voxel in the first-scale point cloud, and performing occupancy probability prediction on a second-scale voxel to determine the occupancy probability corresponding to the second-scale voxel; the second-scale voxel is an upsampled voxel corresponding to the first-scale voxel; the local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to the first-scale voxel;
decoding and reconstructing the encoding information corresponding to the second-scale point cloud based on the occupancy probability corresponding to the second-scale voxel and the local density corresponding to the first-scale voxel, to determine the reconstructed geometric information corresponding to the second-scale point cloud.
An embodiment of the present application provides an encoding method, including:
performing voxel downsampling on a second-scale point cloud to determine a first-scale point cloud, and upsampling the first-scale voxels in the first-scale point cloud to the second scale to determine the second-scale voxels corresponding to the first-scale voxels;
performing local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxel, and performing occupancy probability prediction on the second-scale voxel to determine the occupancy probability corresponding to the second-scale voxel; the local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to the first-scale voxel;
determining the reconstructed geometric information corresponding to the second-scale point cloud according to the local density corresponding to the first-scale voxel and the occupancy probability corresponding to the second-scale voxel;
performing encoding based on the reconstructed geometric information corresponding to the second-scale point cloud, determining the encoding information corresponding to the second-scale point cloud, and writing the encoding information into a bitstream.
An embodiment of the present application provides a decoder, including:
a parsing part, configured to parse a bitstream and determine the encoding information corresponding to a second-scale point cloud;
a determining part, configured to determine a first-scale point cloud; the first-scale point cloud is the previously decoded point cloud data corresponding to the second-scale point cloud;
a local density prediction part, configured to perform local density prediction based on the first-scale point cloud to determine the local density corresponding to a first-scale voxel in the first-scale point cloud;
an occupancy probability prediction part, configured to perform occupancy probability prediction on a second-scale voxel based on the first-scale point cloud, and determine the occupancy probability corresponding to the second-scale voxel; the second-scale voxel is an upsampled voxel corresponding to the first-scale voxel;
a decoding and reconstruction part, configured to decode and reconstruct the encoding information corresponding to the second-scale point cloud based on the occupancy probability corresponding to the second-scale voxel and the local density corresponding to the first-scale voxel, and determine the reconstructed geometric information corresponding to the second-scale point cloud.
An embodiment of the present application provides an encoder, including:
a downsampling part, configured to perform voxel downsampling on a second-scale point cloud to determine a first-scale point cloud;
a local density prediction part, configured to perform local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxel;
an occupancy probability prediction part, configured to upsample the first-scale voxels in the first-scale point cloud to the second scale and determine the second-scale voxels corresponding to the first-scale voxels; and to perform occupancy probability prediction on the second-scale voxels to determine the occupancy probability corresponding to the second-scale voxels;
a reconstruction part, configured to determine the reconstructed geometric information corresponding to the second-scale point cloud according to the local density corresponding to the first-scale voxel and the occupancy probability corresponding to the second-scale voxel;
an encoding part, configured to perform encoding based on the reconstructed geometric information corresponding to the second-scale point cloud, determine the encoding information corresponding to the second-scale point cloud, and write the encoding information into a bitstream.
An embodiment of the present application provides a bitstream, wherein:
the bitstream is generated by bit encoding according to encoding information; the encoding information at least includes the encoding information corresponding to the second-scale point cloud.
An embodiment of the present application provides a decoder, including:
a first memory, configured to store executable instructions;
a first processor, configured to implement any of the decoding methods described above when executing the executable instructions stored in the first memory.
An embodiment of the present application provides an encoder, including:
a second memory, configured to store executable instructions;
a second processor, configured to implement any of the encoding methods described above when executing the executable instructions stored in the second memory.
本申请实施例提供一种计算机可读存储介质,存储有可执行指令,用于引起第一处理器执行时,实现上述的解码方法,或者,用于引起第二处理器执行时,实现上述的编码方法。An embodiment of the present application provides a computer-readable storage medium storing executable instructions for causing a first processor to execute to implement the above-mentioned decoding method, or for causing a second processor to execute to implement the above-mentioned encoding method.
本申请实施例提供一种计算机程序产品,包括计算机程序或指令,所述计算机程序或指令被第一处理器执行时,实现本申请实施例提供的解码方法;或者,所述计算机程序或指令被第二处理器执行时,实现本申请实施例提供的编码方法。An embodiment of the present application provides a computer program product, including a computer program or instructions. When the computer program or instructions are executed by a first processor, the decoding method provided by the embodiment of the present application is implemented; or, when the computer program or instructions are executed by a second processor, the encoding method provided by the embodiment of the present application is implemented.
The embodiments of the present application provide an encoding and decoding method, a decoder, an encoder and a computer-readable storage medium. By predicting the local density, the decoder can determine how many of the second-scale voxels obtained by upsampling each first-scale voxel are occupied. The occupancy probabilities corresponding to the second-scale voxels can then be screened in combination with the local density to determine the occupancy of the second-scale voxels; the second-scale point cloud is reconstructed according to this occupancy, and the reconstructed geometric information of the second-scale point cloud is determined. In this way, the determined occupancy of the second-scale voxels is more accurate, which improves the accuracy of the reconstructed geometric information of the second-scale point cloud, that is, it improves the geometric reconstruction quality of the decoder and the decoding performance. Likewise, in the encoder, screening the occupancy probabilities corresponding to the second-scale voxels based on the local density corresponding to the first-scale voxels improves the accuracy with which the occupancy of the second-scale voxels is determined. This in turn improves the accuracy of the encoding that is performed on the reconstructed geometric information of the second-scale point cloud derived from that occupancy, thereby improving the encoding performance.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flow block diagram of G-PCC encoding;
FIG. 2 is a flow block diagram of G-PCC decoding;
FIG. 3 is an optional schematic flowchart of a decoding method provided in an embodiment of the present application;
FIG. 4 is an optional schematic diagram of a voxel upsampling process provided in an embodiment of the present application;
FIG. 5 is an optional schematic flowchart of a decoding method provided in an embodiment of the present application;
FIG. 6 is an optional schematic flowchart of an occupancy probability prediction and local density prediction process provided in an embodiment of the present application;
FIG. 7 is an optional schematic flowchart of an occupancy probability prediction and local density prediction process provided in an embodiment of the present application;
FIG. 8 is an optional schematic flowchart of an occupancy probability prediction and local density prediction process provided in an embodiment of the present application;
FIG. 9 is an optional schematic structural diagram of a local density prediction network provided in an embodiment of the present application;
FIG. 10 is an optional schematic structural diagram of a feature extraction network provided in an embodiment of the present application;
FIG. 11 is an optional schematic structural diagram of a residual layer in a feature extraction network provided in an embodiment of the present application;
FIG. 12 is an optional schematic structural diagram of an occupancy probability prediction network provided in an embodiment of the present application;
FIG. 13 is a schematic diagram of occupancy probabilities corresponding to second-scale voxels provided in an embodiment of the present application;
FIG. 14 is an optional schematic flowchart of a decoding method provided in an embodiment of the present application;
FIG. 15 is an optional schematic diagram of a process for reconstructing point cloud geometric information according to occupancy probability and local density provided in an embodiment of the present application;
FIG. 16 is an optional schematic flowchart of an encoding method provided in an embodiment of the present application;
FIG. 17 is an optional schematic diagram of a voxel downsampling process provided in an embodiment of the present application;
FIG. 18 is a schematic diagram of occupancy symbols obtained by voxel downsampling provided in an embodiment of the present application;
FIG. 19 is an optional schematic diagram of a process of applying the decoding method provided in an embodiment of the present application to an actual scenario;
FIG. 20 is an optional schematic structural diagram of a decoder provided in an embodiment of the present application;
FIG. 21 is an optional schematic structural diagram of an encoder provided in an embodiment of the present application;
FIG. 22 is an optional schematic structural diagram of a decoder provided in an embodiment of the present application;
FIG. 23 is an optional schematic structural diagram of an encoder provided in an embodiment of the present application.
DETAILED DESCRIPTION
In order to make the purpose, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be regarded as limiting the present application. All other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the present application.
In the following description, reference is made to "some embodiments", which describe a subset of all possible embodiments. It will be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other where no conflict arises.
In the following description, the terms "first/second/third" are used only to distinguish similar objects and do not represent a specific ordering of the objects. It can be understood that, where permitted, "first/second/third" may be interchanged in a specific order or sequence, so that the embodiments of the present application described here can be implemented in an order other than that illustrated or described here.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the art to which the present application belongs. The terms used herein are only for the purpose of describing the embodiments of the present application and are not intended to limit the present application.
Before the embodiments of the present application are described in further detail, the nouns and terms involved in the embodiments are explained. The following interpretations apply to these nouns and terms.
1) Voxel: short for volume element, a voxel is the smallest unit of digital data in a partition of three-dimensional space. Voxels can be used to divide 3D space into a grid and to assign features to each grid cell. For example, a voxel may be a cube of fixed size in three-dimensional space. Voxels are widely used in fields such as three-dimensional imaging, scientific data and medical imaging.
Point cloud compression algorithms include Video-based Point Cloud Compression (V-PCC) and Geometry-based Point Cloud Compression (G-PCC). Geometry compression in G-PCC is mainly implemented through an octree model and/or a triangle surface model. V-PCC is mainly implemented through 3D-to-2D projection and video compression.
With the development of artificial intelligence technology, neural networks have been applied to geometry-based point cloud compression. Neural-network-based point cloud geometry compression can be roughly divided into lossy geometry compression and lossless geometry compression. Lossless compression algorithms mainly revolve around the design of a prediction model for voxel occupancy probability. Voxel data is usually represented with an octree model, a volumetric model, a sparse tensor representation, or the like. For lossless geometry compression, the encoder typically takes the surrounding context, such as parent nodes and neighbor nodes, as input, processes it with neural network layers (such as convolutional or fully connected layers), and outputs the occupancy probability of each voxel in the geometric data of the point cloud. An entropy encoder then uses the occupancy probability of each voxel to convert the corresponding voxel occupancy symbol into a bitstream. Correspondingly, the decoder predicts the occupancy probability of each voxel by the same process, uses an entropy decoder to decode the voxel occupancy symbols from the bitstream according to the predicted occupancy probabilities, and reconstructs the geometric data of the point cloud.
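The link between predicted occupancy probability and bitstream size can be illustrated with a small sketch. It is not the entropy coder of this application: it only computes the ideal code length, the sum of -log2(p) over the symbols, which an arithmetic coder approaches. The function name `ideal_code_length_bits` and the sample values are assumptions made for illustration.

```python
import math


def ideal_code_length_bits(occupancy, probabilities):
    """Ideal number of bits needed to code binary occupancy symbols.

    occupancy:     list of 0/1 occupancy symbols, one per voxel.
    probabilities: predicted probability that each voxel is occupied.
    Each symbol costs -log2(p), where p is the probability the model
    assigned to the symbol that actually occurred.
    """
    total = 0.0
    for symbol, p_occupied in zip(occupancy, probabilities):
        p = p_occupied if symbol == 1 else 1.0 - p_occupied
        total += -math.log2(p)
    return total


symbols = [1, 0, 1, 1]
# An uninformative model spends exactly one bit per voxel.
uninformed = ideal_code_length_bits(symbols, [0.5, 0.5, 0.5, 0.5])
# A model whose predictions match the symbols spends fewer bits.
informed = ideal_code_length_bits(symbols, [0.9, 0.1, 0.9, 0.9])
print(uninformed, informed)
```

The better the prediction model matches the true occupancy, the shorter the resulting bitstream, which is why the prediction model is the focus of the lossless methods described above.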
It can be seen that when the occupancy symbol of a voxel is determined from the occupancy probability alone, the symbol is easily determined inaccurately if the occupancy probabilities of several adjacent voxels are close to each other or if the occupancy probability threshold is set unreasonably. In particular, when the density of the point cloud is unevenly distributed, the number of voxels that the occupancy symbols mark as occupied is easily inconsistent with the number of voxels that are actually occupied, so that the reconstructed geometric information contains extra points or is missing points. This reduces the quality of the geometric reconstruction and, in turn, the encoding and decoding performance.
The embodiments of the present application provide an encoding and decoding method, a decoder, an encoder and a computer-readable storage medium, which can improve encoding and decoding efficiency and performance. To facilitate understanding of the technical solutions provided by the embodiments of the present application, a flow block diagram of G-PCC encoding and a flow block diagram of G-PCC decoding are provided first. It should be noted that these two flow block diagrams are described only to illustrate the technical solutions of the embodiments more clearly, and do not constitute a limitation on the technical solutions provided by the embodiments of the present application. Those skilled in the art will appreciate that, as point cloud compression technology evolves and new service scenarios emerge, the technical solutions provided by the embodiments of the present application are equally applicable to point cloud encoding and decoding architectures similar to G-PCC. The point cloud compressed in the embodiments of the present application may be a point cloud in a video, but is not limited to this.
In the G-PCC point cloud encoder framework, the point cloud of the input three-dimensional image model is divided into slices, and each slice is then encoded independently.
The flow block diagram of G-PCC encoding shown in FIG. 1 applies to the encoder. The point cloud data to be encoded is first divided into multiple slices. Within each slice, the geometric information and the attribute information of the point cloud are encoded separately. In the geometry encoding process, a coordinate transformation is applied to the geometric information so that the whole point cloud is contained in a bounding box, and the result is then quantized. Quantization mainly serves to scale the coordinates. Because quantization rounds coordinates, some points end up with identical geometric information, and a parameter determines whether such duplicate points are removed. The process of quantizing and removing duplicate points is also called voxelization. The bounding box is then divided with an octree. In the octree-based geometry encoding process, the bounding box is divided into eight equal sub-cubes, and each non-empty sub-cube (one that contains points of the point cloud) is again divided into eight, until the leaf nodes obtained by the division are 1x1x1 unit cubes. The points in the leaf nodes are arithmetically encoded to generate a binary geometry bitstream.
In the geometry encoding process based on triangle soup (trisoup), octree division is also performed first. Unlike octree-based geometry encoding, however, trisoup does not divide the point cloud level by level down to unit cubes with a side length of 1x1x1; it stops dividing when the side length of a sub-block (block) is W. Based on the surface formed by the distribution of the point cloud within each block, up to twelve intersection points (vertices) between that surface and the twelve edges of the block are obtained. The vertices are arithmetically encoded (with surface fitting based on the intersection points) to generate a binary geometry bitstream. The vertices are also used in the geometry reconstruction process, and the reconstructed geometric information is used when encoding the attributes of the point cloud.
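The octree division described above can be sketched as follows. This is a minimal illustration, not the G-PCC reference implementation: the bit order of the eight children within each occupancy code, the breadth-first traversal, and the function name `octree_occupancy_codes` are assumptions, and the arithmetic coding of the resulting codes is omitted.

```python
def octree_occupancy_codes(points, depth):
    """Return breadth-first 8-bit occupancy codes for voxelized points.

    points: iterable of integer (x, y, z) coordinates inside a cube
            whose side length is 2**depth.
    Each non-empty cube is split into eight sub-cubes until the
    sub-cubes are 1x1x1; one code is emitted per split cube, with
    bit i set when child i contains at least one point.
    """
    codes = []
    nodes = [((0, 0, 0), set(points))]  # (cube origin, points inside it)
    size = 1 << depth
    while size > 1:
        half = size // 2
        next_nodes = []
        for (ox, oy, oz), pts in nodes:
            children = {}
            for (x, y, z) in pts:
                # Child index (assumed order): x is the high bit, z the low bit.
                idx = (int(x - ox >= half) << 2) | (int(y - oy >= half) << 1) | int(z - oz >= half)
                children.setdefault(idx, set()).add((x, y, z))
            code = 0
            for idx in sorted(children):
                code |= 1 << idx
                origin = (ox + half * ((idx >> 2) & 1),
                          oy + half * ((idx >> 1) & 1),
                          oz + half * (idx & 1))
                next_nodes.append((origin, children[idx]))
            codes.append(code)
        nodes = next_nodes
        size = half
    return codes


# Two points in opposite corners of a 4x4x4 bounding box.
print(octree_occupancy_codes([(0, 0, 0), (3, 3, 3)], depth=2))
```

Only non-empty sub-cubes are subdivided further, so empty regions of the bounding box cost a single zero bit in their parent's code.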
In the attribute encoding process, color conversion is performed first, converting the color information (that is, the attribute information) from the RGB color space to the YUV color space. The point cloud is then recolored using the reconstructed geometric information, so that the not-yet-encoded attribute information corresponds to the reconstructed geometric information. There are two main transform methods for encoding the color information: one is the distance-based lifting transform, which relies on Level of Detail (LOD) division; the other is the Region Adaptive Hierarchical Transform (RAHT), which is applied directly. Both methods convert the color information from the spatial domain to the frequency domain, yielding high-frequency and low-frequency coefficients, and the coefficients are then quantized (giving the quantized coefficients). Finally, the geometry encoding data obtained from octree division and surface fitting and the attribute encoding data obtained from the quantized coefficients are combined by slice, and the vertex coordinates of each block are encoded in turn (arithmetic encoding) to generate a binary attribute bitstream.
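The color conversion step can be sketched as below. The text does not state which conversion matrix is used, so this sketch assumes the full-range BT.601 coefficients commonly used for 8-bit RGB to YUV (YCbCr) conversion; the function name `rgb_to_yuv` is hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Convert one 8-bit RGB color to YUV.

    Assumption: full-range BT.601 coefficients, with U and V offset
    by 128 so that all three components stay in the 0..255 range.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return y, u, v


# A neutral gray has no chroma: U and V sit at the 128 midpoint.
print(rgb_to_yuv(128, 128, 128))
```

Separating luminance from chrominance in this way concentrates most of the signal energy in one component, which generally helps the later transform and quantization stages.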
The flow block diagram of G-PCC decoding shown in FIG. 2 applies to the decoder. The decoder obtains a binary bitstream and decodes the geometry bitstream and the attribute bitstream in it independently. When decoding the geometry bitstream, the geometric information of the point cloud is obtained through arithmetic decoding, octree synthesis, surface fitting, geometry reconstruction and inverse coordinate transformation. When decoding the attribute bitstream, the attribute information of the point cloud is obtained through arithmetic decoding, inverse quantization, LOD-based inverse lifting or RAHT-based inverse transformation, and inverse color conversion. The three-dimensional image model of the point cloud data to be encoded is then restored based on the geometric information and the attribute information.
The encoding method of the embodiments of the present application can be applied to the G-PCC geometry encoding process shown in FIG. 1. After voxelization is completed, the geometry encoding process of FIG. 1 is performed on the voxelized second-scale point cloud to obtain the geometry bitstream. In the geometry reconstruction stage of geometry encoding, voxel downsampling is performed on the voxelized second-scale point cloud to determine the first-scale point cloud, and the first-scale voxels in the first-scale point cloud are upsampled to the second scale to determine the second-scale voxels corresponding to the first-scale voxels. Local density prediction is performed based on the first-scale point cloud to determine the local density corresponding to the first-scale voxels, and occupancy probability prediction is performed on the second-scale voxels to determine the occupancy probability corresponding to the second-scale voxels. The local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to a first-scale voxel. The reconstructed geometric information corresponding to the second-scale point cloud is determined according to the local density corresponding to the first-scale voxels and the occupancy probability corresponding to the second-scale voxels. In the attribute encoding process, recoloring and color information encoding are performed based on the reconstructed geometric information corresponding to the second-scale point cloud to obtain the attribute bitstream.
The decoding method of the embodiments of the present application can be applied to the G-PCC geometry decoding process shown in FIG. 2. The bitstream is parsed to determine the encoding information corresponding to the second-scale point cloud, and the first-scale point cloud is determined; the first-scale point cloud is the previously decoded point cloud data corresponding to the second-scale point cloud. In the geometry decoding process, arithmetic decoding, octree synthesis and surface fitting are performed on the encoding information corresponding to the second-scale point cloud.
In the geometry reconstruction stage of geometry decoding, local density prediction is performed based on the first-scale point cloud to determine the local density corresponding to the first-scale voxels in the first-scale point cloud, and occupancy probability prediction is performed on the second-scale voxels to determine the occupancy probability corresponding to the second-scale voxels. The second-scale voxels are the upsampled voxels corresponding to the first-scale voxels, and the local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to a first-scale voxel. Based on the occupancy probability corresponding to the second-scale voxels and the local density corresponding to the first-scale voxels, the encoding information corresponding to the second-scale point cloud is decoded and reconstructed to determine the reconstructed geometric information corresponding to the second-scale point cloud. In the attribute decoding process, the reconstructed geometric information corresponding to the second-scale point cloud is used to perform LOD-based inverse lifting or RAHT-based inverse transformation, followed by inverse color conversion, to obtain the attribute information of the second-scale point cloud. The three-dimensional image model of the second-scale point cloud is restored based on the reconstructed geometric information and the attribute information.
It should be noted that the encoding method and the decoding method of the embodiments of the present application can also be used in point cloud encoding and decoding processes other than G-PCC.
The decoding method applied to a decoder, provided in an embodiment of the present application, is described below.
Referring to FIG. 3, FIG. 3 is an optional schematic flowchart of the decoding method provided in an embodiment of the present application, which is described with reference to the steps shown in FIG. 3.
S101. Parse the bitstream, determine the encoding information corresponding to the second-scale point cloud, and determine the first-scale point cloud.
In the embodiments of the present application, the decoder parses the received bitstream to obtain the encoding information corresponding to the second-scale point cloud, and determines the first-scale point cloud. The first-scale point cloud is the previously decoded point cloud data corresponding to the second-scale point cloud.
In the embodiments of the present application, the bitstream usually contains encoding information, sent by the encoder, for point clouds of at least one scale. The decoder decodes in order from low scale to high scale. That is, before decoding the encoding information corresponding to the second-scale point cloud, the decoder has already decoded the encoding information of the preceding scale, namely the first-scale point cloud, and has thereby determined the decoded first-scale point cloud data. In this way, the decoder can use the decoded low-scale first-scale point cloud data to decode the encoding information corresponding to the high-scale second-scale point cloud and to reconstruct its geometric information.
S102. Perform local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxels in the first-scale point cloud, and perform occupancy probability prediction on the second-scale voxels to determine the occupancy probability corresponding to the second-scale voxels.
In the embodiments of the present application, the geometric information of the first-scale point cloud and of the second-scale point cloud has gone through the voxelization process of the encoder and can be represented in the form of a voxel grid.
In the embodiments of the present application, in the voxelization process a point in the point cloud may correspond to an occupied voxel (a non-empty voxel), while an unoccupied voxel (an empty voxel) indicates that no point of the point cloud falls at that voxel position. In some embodiments, occupied voxels may be marked as 1 and unoccupied voxels as 0. The voxelized point cloud can thus represent its geometric data through the occupancy symbols of the voxels at each position in the voxel grid.
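The occupancy marking described above can be sketched in a few lines. The sketch assumes floor quantization with a uniform voxel size and always merges duplicate points; the names `voxelize` and `occupancy_symbol` are hypothetical and not taken from this application.

```python
def voxelize(points, voxel_size):
    """Quantize points to integer voxel coordinates.

    Assumption: floor quantization with a uniform voxel size. Points
    that fall into the same voxel collapse into one occupied voxel,
    which corresponds to removing duplicate points.
    """
    occupied = set()
    for x, y, z in points:
        occupied.add((int(x // voxel_size),
                      int(y // voxel_size),
                      int(z // voxel_size)))
    return occupied


def occupancy_symbol(occupied, voxel):
    """Return 1 for an occupied (non-empty) voxel and 0 for an empty one."""
    return 1 if voxel in occupied else 0


points = [(0.2, 0.3, 0.1), (0.4, 0.1, 0.2), (1.7, 0.2, 0.9)]
grid = voxelize(points, voxel_size=1.0)
print(sorted(grid))
print(occupancy_symbol(grid, (0, 0, 0)), occupancy_symbol(grid, (0, 1, 0)))
```

Storing only the set of occupied coordinates, rather than a dense 0/1 array, is a common choice for sparse point clouds, since most voxels in the grid are empty.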
In the embodiments of the present application, the scale of a point cloud corresponds to the scale of the voxels in that point cloud. That is, the voxels contained in the first-scale point cloud are first-scale voxels, and the voxels contained in the second-scale point cloud are second-scale voxels. The second-scale voxels are the upsampled voxels corresponding to the first-scale voxels. The decoder can perform voxel upsampling on the first-scale voxels in the first-scale point cloud. For example, each occupied first-scale voxel in the first-scale point cloud is upsampled to obtain the multiple second-scale voxels corresponding to it.
In some embodiments, the decoder may implement voxel upsampling by pooling, for example by using a max-pooling layer with a stride of 2×2×2 to divide one first-scale voxel in the first-scale point cloud into eight second-scale voxels. Each upsampling step halves the size of the first-scale voxel along each of the three dimensions, that is, the size of a second-scale voxel in each dimension is half that of the first-scale voxel. Upsampling every first-scale voxel in the first-scale point cloud completes the upsampling from the low-scale point cloud, whose geometric information is known, to the high-scale point cloud, and yields the second-scale point cloud whose geometric information is to be reconstructed.
Referring to FIG. 4, FIG. 4 shows a first-scale point cloud containing 2×2×1 first-scale voxels which, after one voxel upsampling step, yields a second-scale point cloud containing 4×4×2 second-scale voxels. The first-scale point cloud is decoded point cloud data. In FIG. 4, solid cubes represent occupied first-scale voxels, that is, positions where points of the point cloud are located. After the first-scale voxels are upsampled to second-scale voxels, whether each second-scale voxel is occupied still needs to be determined through the geometric reconstruction process; such voxels are shown as empty cubes in FIG. 4. The point cloud in FIG. 4 is only an example, and an actual point cloud may include more voxels.
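The coordinate relationship between a first-scale voxel and its eight second-scale voxels can be sketched as follows. This is a plain index computation assuming integer voxel coordinates and a factor of 2 per axis; it does not reproduce the pooling layer mentioned above, and the function names are hypothetical.

```python
def upsample_voxel(voxel):
    """Return the eight second-scale voxels covering one first-scale voxel.

    Assumption: integer coordinates, with each axis doubled, so the
    children of (x, y, z) are (2x + dx, 2y + dy, 2z + dz) for
    dx, dy, dz in {0, 1}.
    """
    x, y, z = voxel
    return [(2 * x + dx, 2 * y + dy, 2 * z + dz)
            for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)]


def upsample_occupied(occupied):
    """Map each occupied first-scale voxel to its candidate children."""
    return {voxel: upsample_voxel(voxel) for voxel in sorted(occupied)}


# The 2x2x1 arrangement of FIG. 4, with all four voxels occupied.
first_scale = {(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)}
candidates = upsample_occupied(first_scale)
print(sum(len(children) for children in candidates.values()))
```

With all four first-scale voxels occupied, the candidates fill the 4×4×2 grid of FIG. 4; their occupancy is still unknown at this point and is decided in the reconstruction step.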
In the embodiments of the present application, the local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to a first-scale voxel. Therefore, by performing local density prediction based on the first-scale point cloud and determining the local density corresponding to each first-scale voxel in it, the decoder determines how many of the multiple second-scale voxels corresponding to each first-scale voxel are occupied. Combined with the occupancy probabilities of those second-scale voxels, this allows the occupied second-scale voxels to be identified more accurately.
In the embodiments of the present application, the decoder performs occupancy probability prediction on the second-scale voxels obtained by upsampling the first-scale voxels, and determines the occupancy probabilities of the multiple second-scale voxels corresponding to each first-scale voxel.
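One way to combine the two predictions can be sketched as below. It assumes that the screening keeps, for each first-scale voxel, the k second-scale voxels with the highest occupancy probability, where k is the predicted local density. The top-k rule, the function name `select_occupied` and the sample probabilities are assumptions made for illustration, not details confirmed by the passage above.

```python
def select_occupied(children, probabilities, local_density):
    """Keep the local_density children with the highest probability.

    children:      second-scale voxels of one first-scale voxel.
    probabilities: predicted occupancy probability of each child.
    local_density: predicted number of occupied children (k).
    Assumption: a top-k rule; ties are broken by child order.
    """
    k = max(0, min(local_density, len(children)))
    ranked = sorted(range(len(children)),
                    key=lambda i: probabilities[i], reverse=True)
    keep = set(ranked[:k])
    return [children[i] for i in range(len(children)) if i in keep]


children = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1),
            (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
probs = [0.45, 0.42, 0.10, 0.30, 0.41, 0.05, 0.20, 0.02]

# A fixed 0.5 threshold would keep no voxel here, losing three points.
by_threshold = [c for c, p in zip(children, probs) if p > 0.5]
by_density = select_occupied(children, probs, local_density=3)
print(by_threshold, by_density)
```

In this example every probability lies below 0.5, so a fixed threshold reconstructs nothing, whereas the density-guided selection recovers the predicted number of points.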
In some embodiments, based on FIG. 3 and as shown in FIG. 5, S102 may be implemented by performing S201 to S203, as follows:
S201. Perform feature extraction on the geometric information of the first-scale point cloud to determine first-scale point cloud features.
S202. Upsample the first-scale point cloud features to the second scale to determine initial second-scale point cloud features, perform feature extraction on the initial second-scale point cloud features to determine second-scale point cloud features, and perform occupancy probability prediction according to the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxels.
In the embodiments of the present application, the geometric information of the first-scale point cloud may include the occupancy status of each voxel in the first-scale point cloud and the position information of each voxel. In some embodiments, the occupancy status may be a preset flag such as 0 or 1, and the position information may be the three-dimensional coordinates of the voxel. In some embodiments, the geometric information of the first-scale point cloud may be obtained by decoding and reconstructing the encoded information of the first-scale point cloud; that is, it may be the reconstructed geometric information of the first-scale point cloud.
In the embodiments of the present application, the decoder performs feature extraction on the geometric information of the first-scale point cloud, mapping it to a preset low-scale feature space, to determine the first-scale point cloud features. In some embodiments, the first-scale point cloud features may include implicit (latent) features extracted from the geometric information of the first-scale point cloud.
In the embodiments of the present application, the decoder upsamples the first-scale point cloud features to the second scale to determine the initial second-scale point cloud features, and then performs feature extraction on the initial second-scale point cloud features to determine the second-scale point cloud features. Based on the second-scale point cloud features, the decoder performs occupancy probability prediction on the second-scale voxels, predicting for each second-scale voxel the probability that a point of the point cloud falls within that voxel position, that is, the occupancy probability corresponding to the second-scale voxel. It can be understood that the second-scale point cloud features are obtained through two rounds of feature extraction starting from the geometric information of the first-scale point cloud. They therefore provide a better feature representation, which can improve the accuracy of the occupancy probability prediction that the decoder performs with them.
S203: Perform local density prediction according to the first-scale point cloud features to determine the local density corresponding to the first-scale voxels.
In the embodiments of the present application, the decoder performs local density prediction according to the first-scale point cloud features, predicting, for each first-scale voxel in the first-scale point cloud, the number of occupied second-scale voxels among the multiple second-scale voxels corresponding to it, and takes that number as the local density of the first-scale voxel. It can be understood that the local density is a count: an integer greater than or equal to 1 and less than or equal to the total number of second-scale voxels corresponding to one first-scale voxel.
It should be noted that the order in which the feature upsampling, the feature extraction on the initial second-scale point cloud features and the occupancy probability prediction in S202 are executed relative to the local density prediction in S203 is not limited to that shown in FIG. 5. In practical applications they may be executed in any order or in parallel, as selected according to the actual situation; the embodiments of the present application do not limit this.
Exemplarily, the process of S201-S203 may be as shown in FIG. 6. The decoder performs feature extraction on the geometric information of the first-scale point cloud through a third feature extraction network to determine the first-scale point cloud features. The first-scale point cloud features are then input both into a first branch, which contains an upsampling network, a fourth feature extraction network and an occupancy probability prediction network, and into a second branch, which contains a local density prediction network. The first branch outputs the occupancy probability corresponding to each second-scale voxel, and the second branch outputs the local density corresponding to each first-scale voxel. In some embodiments, the upsampling network may be implemented by a transposed convolutional network; the third feature extraction network performs feature extraction on the geometric information of the first-scale point cloud, and the fourth feature extraction network performs feature extraction on the initial second-scale point cloud features; the occupancy probability prediction network and the local density prediction network may be implemented with pre-trained neural networks.
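The data flow of the two branches in FIG. 6 can be sketched as follows. This is a minimal illustration of the wiring only, not the networks themselves: the function name `predict_shared_features` and all five callables are hypothetical stand-ins, and the toy lambdas below replace the sparse convolutional networks with trivial arithmetic. The factor of 8 children per voxel is an assumption taken from the later example in the text.

```python
import math


def predict_shared_features(geometry, extract_s1, upsample, extract_s2,
                            occupancy_head, density_head):
    """Run S201-S203 with one shared first-scale feature extractor."""
    # S201: shared features from the first-scale geometry.
    features_s1 = extract_s1(geometry)
    # S202 (first branch): upsample, refine, then predict occupancy.
    initial_s2 = upsample(features_s1)
    features_s2 = extract_s2(initial_s2)
    occupancy = occupancy_head(features_s2)
    # S203 (second branch): local density from the same first-scale features.
    density = density_head(features_s1)
    return occupancy, density


# Toy stand-ins (assumptions, for illustration only).
geometry = [(0, 0, 0), (1, 0, 2)]
occupancy, density = predict_shared_features(
    geometry,
    extract_s1=lambda g: [float(x + y + z) for x, y, z in g],
    upsample=lambda f: [v for v in f for _ in range(8)],  # 8 children each
    extract_s2=lambda f: [v - 1.0 for v in f],
    occupancy_head=lambda f: [1.0 / (1.0 + math.exp(-v)) for v in f],
    density_head=lambda f: [4 for _ in f],
)
print(len(occupancy), len(density))
```

Because the two branches only share `features_s1`, they can run in either order or in parallel, which matches the note that the execution order is not limited.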
In some embodiments, S102 in FIG. 3 may also be implemented by executing S301-S304, as follows:
S301: Perform feature extraction on the geometric information of the first-scale point cloud through a first feature extraction network to determine first point cloud features at the first scale.
S302: Upsample the first point cloud features at the first scale to the second scale to determine second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probabilities corresponding to the second-scale voxels.
S303: Perform feature extraction on the geometric information of the first-scale point cloud through a second feature extraction network to determine second point cloud features at the first scale.
S304: Perform local density prediction according to the second point cloud features at the first scale to determine the local density corresponding to the first-scale voxels.
In the embodiments of the present application, the decoder may also use different feature extraction networks to extract different first-scale point cloud features for the occupancy probability prediction and the local density prediction, namely the first point cloud features and the second point cloud features, which are used for occupancy probability prediction and local density prediction respectively.
In some embodiments, the first feature extraction network may be obtained by joint learning or training with the neural network for occupancy probability prediction, and the second feature extraction network may be obtained by joint learning or training with the local density prediction network. In this way, the point cloud geometric features used to predict the occupancy probability and those used to predict the local density can each be extracted in a more targeted manner, improving the accuracy of both predictions.
Likewise, the embodiments of the present application do not limit the execution order of the S301-S302 process and the S303-S304 process.
Exemplarily, the process of S301-S304 may be as shown in FIG. 7. The first branch in FIG. 7 includes a first feature extraction network, an upsampling network and an occupancy probability prediction network; the second branch includes a second feature extraction network and a local density prediction network. The decoder inputs the geometric information of the first-scale point cloud into both branches, and each branch performs its own feature extraction and network prediction on that geometric information. The first branch outputs the occupancy probability corresponding to each second-scale voxel, and the second branch outputs the local density corresponding to each first-scale voxel.
It should be noted that, in some embodiments based on FIG. 7, further feature extraction may be performed on the second-scale point cloud features output by the upsampling network, after the upsampling network and before the occupancy probability prediction network. The further-extracted second-scale point cloud features are then input into the occupancy probability prediction network, which enhances the feature representation and in turn improves the accuracy of the occupancy probability prediction. This is selected according to the actual situation and is not limited by the embodiments of the present application.
In some embodiments, S102 in FIG. 3 may also be implemented by executing S401-S403, as follows:
S401: Perform feature extraction on the geometric information of the first-scale point cloud to determine first-scale point cloud features.
S402: Upsample the first-scale point cloud features to the second scale to determine second-scale point cloud features, and perform occupancy probability prediction according to the second-scale point cloud features to determine the occupancy probabilities corresponding to the second-scale voxels.
S403: Perform local density prediction according to the first-scale point cloud features to determine the local density corresponding to the first-scale voxels.
In the embodiments of the present application, both the occupancy probability prediction and the local density prediction may also be based on the same first-scale point cloud features, determined by feature extraction on the geometric information of the first-scale point cloud. In this way, the feature information is reused, the module complexity is reduced, and the processing efficiency of the decoder is improved.
Likewise, the embodiments of the present application do not limit the execution order of S402 and S403.
Exemplarily, the process of S401-S403 may be as shown in FIG. 8. The decoder performs feature extraction on the geometric information of the first-scale point cloud through a fifth feature extraction network to determine the first-scale point cloud features. The first-scale point cloud features are then input both into a first branch, which contains an upsampling network and an occupancy probability prediction network, and into a second branch, which contains a local density prediction network. The first branch outputs the occupancy probability corresponding to each second-scale voxel, and the second branch outputs the local density corresponding to each first-scale voxel. Here, the fifth feature extraction network in FIG. 8 and the third feature extraction network in FIG. 6 may be the same feature extraction network or different ones.
In some embodiments, the local density prediction in S203, S304 and S403 may be performed with a local density prediction network, which predicts the local density corresponding to the first-scale voxels from the first-scale point cloud features. The local density prediction network may include a first sparse convolution layer, a first activation function layer, a second sparse convolution layer and a second activation function layer. Exemplarily, as shown in FIG. 9, the local density prediction network may include: a sparse convolution layer as the first layer (the first sparse convolution layer), a ReLU activation function layer as the second layer (the first activation function layer), a sparse convolution layer as the third layer (the second sparse convolution layer), and a Sigmoid function layer as the fourth layer (the second activation function layer). The local density prediction network outputs the local density from the implicit features of the point cloud.
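The text states that the network ends in a Sigmoid layer and that the local density is an integer between 1 and the number of child voxels, but it does not state how the Sigmoid output is converted into that integer. The sketch below shows one possible mapping, purely as an assumption: scale the output by the number of children, round, and clamp. The function name `sigmoid_to_density` and the default of 8 children are hypothetical.

```python
def sigmoid_to_density(score, num_children=8):
    """Map a Sigmoid output in [0, 1] to an integer local density.

    Assumed mapping (not taken from the source): scale, round, clamp to
    the range [1, num_children].
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    return max(1, min(num_children, round(score * num_children)))


print(sigmoid_to_density(0.5))   # middle of the range
print(sigmoid_to_density(0.01))  # clamped up to the minimum of 1
print(sigmoid_to_density(0.99))  # reaches the maximum of 8
```

The lower clamp reflects that only occupied first-scale voxels are upsampled, so at least one child must be occupied.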
In some embodiments, the feature extraction in S201, S202, S301, S303, S401 and S402 may be implemented with a feature extraction network as shown in FIG. 10, which includes: a sparse convolution layer as the first layer, an activation function layer (for example, a ReLU activation function layer) as the second layer, a residual layer as the third layer, and a sparse convolution layer as the fourth layer. In some embodiments, the network structure of the residual layer may be as shown in FIG. 11.
In some embodiments, the occupancy probability prediction in S202, S302 and S402 may be implemented with an occupancy probability prediction network as shown in FIG. 12, which includes: a sparse convolution layer as the first layer, a ReLU activation function layer as the second layer, a sparse convolution layer as the third layer, a ReLU activation function layer as the fourth layer, a sparse convolution layer as the fifth layer, and a Sigmoid function layer as the sixth layer. Based on the implicit features of the point cloud at the second scale, the occupancy probability prediction network performs occupancy prediction on each second-scale voxel obtained by upsampling each first-scale voxel, yielding the occupancy probability corresponding to each second-scale voxel. The occupancy probability prediction network in FIG. 12 is only an example; in practical applications it may also be a convolutional neural network (CNN) with a different layer structure, selected according to the actual situation, which is not limited by the embodiments of the present application.
In the embodiments of the present application, when the occupancy probability prediction network processes a low-scale point cloud (that is, the first-scale point cloud), the multiple higher-scale second-scale voxels obtained by upsampling an unoccupied first-scale voxel are also unoccupied, so no probability prediction is needed for them. Prediction is needed only for the second-scale voxels obtained by upsampling the occupied first-scale voxels of the first-scale point cloud. For example, for the second-scale point cloud in FIG. 4, the occupancy probabilities corresponding to the second-scale voxels on the side facing the viewer may be as shown in FIG. 13; the values in FIG. 13 are for illustration only and should not be taken as the result of an actual computation. It can be seen that the closer the predicted occupancy probability is to 1 for second-scale voxels that are actually occupied, and the closer it is to 0 for second-scale voxels that are actually unoccupied, the more accurate the prediction and the higher the reconstruction quality of the point cloud geometric information.
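The restriction to occupied parents can be illustrated with a short sketch of voxel upsampling. It assumes an upsampling factor of 2 per axis, so that each occupied first-scale voxel yields 8 candidate second-scale voxels (consistent with the 8-voxel example used later in the text); the function name `upsample_occupied` and the integer-coordinate layout are assumptions for illustration.

```python
from itertools import product


def upsample_occupied(occupied_coords, factor=2):
    """Map each occupied first-scale voxel to its candidate children.

    Only occupied parents are passed in, so unoccupied parents never
    generate candidates and need no probability prediction.
    """
    children = {}
    for (x, y, z) in occupied_coords:
        children[(x, y, z)] = [
            (factor * x + dx, factor * y + dy, factor * z + dz)
            for dx, dy, dz in product(range(factor), repeat=3)
        ]
    return children


occupied = [(0, 0, 0), (1, 0, 2)]
candidates = upsample_occupied(occupied)
print(candidates[(1, 0, 2)])
```

With `factor=2`, the parent `(1, 0, 2)` covers child coordinates from `(2, 0, 4)` to `(3, 1, 5)`, and the occupancy probability prediction network only has to score these candidates.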
S103: Based on the occupancy probabilities corresponding to the second-scale voxels and the local density corresponding to the first-scale voxels, decode and reconstruct the encoded information corresponding to the second-scale point cloud to determine the reconstructed geometric information corresponding to the second-scale point cloud.
In the embodiments of the present application, the decoder uses the predicted local density of each first-scale voxel in the first-scale point cloud, together with the occupancy probability of each second-scale voxel obtained by upsampling that first-scale voxel, to determine whether each second-scale voxel is occupied. According to this occupancy status, the decoder decodes and reconstructs the encoded information corresponding to the second-scale point cloud, obtaining the position information of each occupied second-scale voxel in the second-scale point cloud, and determines the reconstructed geometric information corresponding to the second-scale point cloud from that position information.
In some embodiments, based on FIG. 3 or FIG. 5 and as shown in FIG. 14, S103 may be implemented by executing S1031-S1032, as follows:
S1031: For each first-scale voxel in the first-scale point cloud, among the multiple second-scale voxels corresponding to that first-scale voxel, determine the second-scale voxels with the highest occupancy probabilities, in a number equal to the local density, as the occupied second-scale voxels.
In the embodiments of the present application, since the local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to a first-scale voxel, the multiple occupancy probabilities of the second-scale voxels corresponding to each first-scale voxel can be filtered according to that voxel's local density: the second-scale voxels with the highest occupancy probabilities, as many as the local density indicates, are determined to be the occupied second-scale voxels.
Exemplarily, suppose one first-scale voxel is upsampled into 8 second-scale voxels; the local density corresponding to the first-scale voxel may then be any integer from 1 to 8. Taking a local density of 4 as an example, the decoder determines the 4 second-scale voxels with the highest occupancy probabilities among the 8 as occupied second-scale voxels, and the remaining second-scale voxels as unoccupied. As shown in FIG. 15, a first-scale voxel in the first-scale point cloud is upsampled into 8 second-scale voxels and its local density is 4; the occupancy probabilities of the 8 second-scale voxels, from high to low, are [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]. According to the local density, the second-scale voxels with occupancy probabilities [0.9, 0.8, 0.7, 0.6] are determined to be occupied, and the second-scale voxels with occupancy probabilities [0.5, 0.4, 0.3, 0.2] are determined to be unoccupied.
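The selection in S1031 amounts to keeping the top-k child voxels, with k equal to the local density. A minimal sketch follows, using the probabilities from the example above but in an arbitrary child order; the function name `select_occupied` and the tie-break rule (earlier child wins) are assumptions, since the text does not specify how ties are resolved.

```python
def select_occupied(probabilities, local_density):
    """Return a 0/1 occupancy flag for each child of one first-scale voxel.

    probabilities: occupancy probabilities of the second-scale voxels that
                   share one first-scale parent (for example, 8 values).
    local_density: predicted number of occupied children for that parent.
    """
    if not 1 <= local_density <= len(probabilities):
        raise ValueError("local density must lie in [1, number of children]")
    # Rank child indices by probability, highest first. sorted() is stable,
    # so tied children keep their original order (assumed tie-break).
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    occupied = set(ranked[:local_density])
    return [1 if i in occupied else 0 for i in range(len(probabilities))]


probs = [0.3, 0.9, 0.5, 0.8, 0.2, 0.7, 0.4, 0.6]
flags = select_occupied(probs, local_density=4)
print(flags)  # [0, 1, 0, 1, 0, 1, 0, 1]
```

The children with probabilities 0.9, 0.8, 0.7 and 0.6 are flagged as occupied and the other four as unoccupied, matching the FIG. 15 example. Unlike a fixed threshold such as 0.5, this rule adapts the number of selected voxels to each parent.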
S1032: Based on the occupied second-scale voxels corresponding to each first-scale voxel, decode and reconstruct the encoded information corresponding to the second-scale point cloud to determine the reconstructed geometric information corresponding to the second-scale point cloud.
In the embodiments of the present application, an occupied second-scale voxel indicates that a point of the point cloud falls within that second-scale voxel obtained by upsampling a first-scale voxel. The decoder can decode and reconstruct the encoded information corresponding to the second-scale point cloud based on the occupied second-scale voxels corresponding to each first-scale voxel, thereby determining the reconstructed geometric information corresponding to the second-scale point cloud.
In some embodiments, according to the result of S1031, the decoder may mark occupied and unoccupied second-scale voxels with different occupancy symbols. The encoded information corresponding to the second-scale point cloud can then be decoded and reconstructed based on the second-scale voxels that the occupancy symbols mark as occupied, to determine the reconstructed geometric information corresponding to the second-scale point cloud.
Here, when performing geometric reconstruction, the decoder mainly decodes and reconstructs the geometry coding information within the encoded information corresponding to the second-scale point cloud. After the reconstructed geometric information corresponding to the second-scale point cloud has been determined, an attribute decoding process may decode the attribute coding information corresponding to the second-scale point cloud based on that reconstructed geometric information to determine the corresponding attribute information. The reconstructed geometric information and the attribute information are then combined to restore the three-dimensional model of the second-scale point cloud.
It can be understood that, in the embodiments of the present application, by predicting the local density the decoder can determine the number of occupied second-scale voxels among the second-scale voxels obtained by upsampling each first-scale voxel. The occupancy probabilities corresponding to the second-scale voxels can then be filtered using the local density to determine the occupancy status of the second-scale voxels, and the second-scale point cloud is reconstructed according to that occupancy status to determine its reconstructed geometric information. This makes the determined occupancy status of the second-scale voxels more accurate and improves the accuracy of the reconstructed geometric information of the second-scale point cloud; that is, it improves the geometric reconstruction quality of the decoder and hence the decoding performance.
In some embodiments, after S103, the decoder may continue to parse the bitstream to determine the encoded information corresponding to the i-th scale point cloud, where i is an integer greater than or equal to 3. Using the decoding method described above, the decoder performs local density prediction for the (i-1)-th scale voxels based on the reconstructed geometric information of the (i-1)-th scale point cloud to determine the local density corresponding to the (i-1)-th scale voxels, and performs occupancy probability prediction on the i-th scale voxels corresponding to the (i-1)-th scale voxels to determine the occupancy probabilities corresponding to the i-th scale voxels, where the i-th scale is obtained by upsampling the (i-1)-th scale. Based on the occupancy probabilities corresponding to the i-th scale voxels and the local density corresponding to the (i-1)-th scale voxels, the decoder decodes and reconstructs the encoded information corresponding to the i-th scale point cloud to determine the reconstructed geometric information corresponding to the i-th scale point cloud.
That is, after decoding and reconstructing the second-scale point cloud, the decoder may use the decoding method of the embodiments of the present application, based on the reconstructed geometric information of the second-scale point cloud, to decode and reconstruct the encoded information of a higher-scale point cloud in the bitstream, such as the third-scale point cloud, and obtain the reconstructed geometric information corresponding to the third-scale point cloud. In this way, when decoding adjacent scales, the decoder uses the known geometric data of the previously decoded lower-scale point cloud to decode and reconstruct the encoded information of the higher-scale point cloud and determine its reconstructed geometric information. Decoding proceeds scale by scale until the reconstructed geometric information of the target-scale point cloud is recovered; the target scale may be determined according to the preset decoding precision of the decoder.
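The scale-by-scale procedure can be sketched as a loop in which each iteration treats the previously reconstructed geometry as known information. The sketch below covers only the prediction and selection steps and leaves out parsing and decoding of the bitstream; the function names, the factor-2 upsampling and the toy predictors `toy_probs` and `toy_density` are all assumptions standing in for the trained networks.

```python
from itertools import product


def children_of(voxel):
    """Candidate next-scale voxels of one voxel (factor 2 per axis, assumed)."""
    x, y, z = voxel
    return [(2 * x + dx, 2 * y + dy, 2 * z + dz)
            for dx, dy, dz in product((0, 1), repeat=3)]


def decode_one_scale(parents, predict_probs, predict_density):
    """Go from scale i-1 to scale i: keep the top-k children of each parent."""
    occupied = []
    for parent in parents:
        kids = children_of(parent)
        probs = predict_probs(parent, kids)   # occupancy probability per child
        k = predict_density(parent)           # local density of the parent
        ranked = sorted(range(len(kids)), key=lambda i: probs[i], reverse=True)
        occupied.extend(kids[i] for i in sorted(ranked[:k]))
    return occupied


def decode_to_scale(base_voxels, target_scale, predict_probs, predict_density):
    """Start from scale-1 geometry and stop once target_scale is reached."""
    geometry = list(base_voxels)
    for _ in range(2, target_scale + 1):
        geometry = decode_one_scale(geometry, predict_probs, predict_density)
    return geometry


# Toy predictors (assumptions): fixed descending probabilities, density of 2.
toy_probs = lambda parent, kids: [1.0 / (1 + i) for i in range(len(kids))]
toy_density = lambda parent: 2

result = decode_to_scale([(0, 0, 0)], target_scale=3,
                         predict_probs=toy_probs, predict_density=toy_density)
print(result)
```

Choosing a smaller `target_scale` simply ends the loop earlier, which is how a decoder with a lower precision requirement would stop at an intermediate scale.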
In some embodiments, after S103, the decoder may also use the decoding method described above to perform local density prediction for the n-th scale voxels and occupancy probability prediction for the (n+1)-th scale voxels based on the reconstructed geometric data corresponding to the n-th scale point cloud, and determine the local density corresponding to the n-th scale voxels and the occupancy probabilities corresponding to the (n+1)-th scale voxels, where n is a positive integer greater than or equal to 2 and the (n+1)-th scale is obtained by upsampling the n-th scale. Based on the local density corresponding to the n-th scale voxels and the occupancy probabilities corresponding to the (n+1)-th scale voxels, the decoder determines the reconstructed geometric data corresponding to the (n+1)-th scale point cloud.
Exemplarily, consider a bitstream that contains encoded information for point clouds at multiple non-adjacent scales, such as encoded information corresponding to the first-scale, third-scale and fifth-scale point clouds. After decoding and reconstructing the first-scale point cloud, the decoder may perform voxel upsampling on the first-scale voxels in the first-scale point cloud to obtain the second-scale voxels. By performing feature extraction on the geometric information of the first-scale point cloud, it performs local density prediction for the first-scale voxels and occupancy probability prediction for the second-scale voxels. According to the determined local density of the first-scale voxels and the occupancy probabilities of the second-scale voxels, combined with the second position information of the second-scale voxels obtained by upsampling the first position information of the first-scale voxels, it reconstructs the reconstructed geometric data corresponding to the second-scale point cloud. The reconstructed geometric data corresponding to the second-scale point cloud is then used to decode and reconstruct the encoded information corresponding to the third-scale point cloud, and so on.
需要说明的是,本申请实施例中的解码方法可应用于可伸缩的编解码方法中,也即对于编码器侧发送的多个尺度点云的多个编码信息,解码器可以根据实际解码精度的需要,以从低尺度向高尺度解码的顺序,解码重建至任意尺度的点云。示例性地,编码器在码流中写入并发送了第一尺度点云的编码信息、第二尺度点云的编码信息至第五尺度点云的编码信息,而解码器可以根据预设精度要求,根据本申请实施例提供的解码方法,从第一尺度点云解码至第三尺度点云,重建第三尺度点云的几何数据并还原第三尺度点云的三维图像模型后结束解码,不再继续对第五尺度点云对应的编码信息解码。具体的根据实际情况进行选择,本申请实施例不作限定。It should be noted that the decoding method in the embodiment of the present application can be applied to a scalable encoding and decoding method, that is, for multiple encoding information of multiple scale point clouds sent by the encoder side, the decoder can decode and reconstruct the point cloud of any scale in the order of decoding from low scale to high scale according to the actual decoding accuracy requirements. Exemplarily, the encoder writes and sends the encoding information of the first scale point cloud, the encoding information of the second scale point cloud to the encoding information of the fifth scale point cloud in the code stream, and the decoder can decode from the first scale point cloud to the third scale point cloud according to the preset accuracy requirements according to the decoding method provided in the embodiment of the present application, reconstruct the geometric data of the third scale point cloud and restore the three-dimensional image model of the third scale point cloud, and then end the decoding, and no longer continue to decode the encoding information corresponding to the fifth scale point cloud. The specific selection is based on the actual situation, and the embodiment of the present application is not limited.
It can be understood that the decoding method provided in the embodiment of the present application can be applied repeatedly across multiple pairs of adjacent scales, and the decoding for each pair of adjacent scales is independent of the others, so scale-scalable decoding can be implemented flexibly.
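The scalable decoding order described above can be sketched as a simple loop. This is a minimal illustration, not the method of the application: `scalable_decode`, `decode_next_scale`, and `toy_step` are hypothetical names, and the toy step merely doubles coordinates in place of the real prediction and entropy decoding.

```python
from typing import Callable, Dict, List, Tuple

Voxel = Tuple[int, int, int]


def scalable_decode(
    base_cloud: List[Voxel],
    encoded: Dict[int, bytes],
    target_scale: int,
    decode_next_scale: Callable[[List[Voxel], bytes], List[Voxel]],
) -> List[Voxel]:
    """Decode from the first scale up to target_scale, lowest scale first.

    base_cloud is the already reconstructed first-scale point cloud.
    encoded maps a scale index (2, 3, ...) to its encoding information.
    """
    cloud = base_cloud
    for scale in range(2, target_scale + 1):
        # Each step uses only the previously reconstructed lower scale
        # and the encoding information of the current scale.
        cloud = decode_next_scale(cloud, encoded[scale])
    # Encoding information above target_scale is never read.
    return cloud


def toy_step(cloud: List[Voxel], payload: bytes) -> List[Voxel]:
    """Hypothetical stand-in for one decoding step: doubles every coordinate."""
    return [(2 * x, 2 * y, 2 * z) for (x, y, z) in cloud]


stream = {2: b"s2", 3: b"s3", 4: b"s4", 5: b"s5"}
result = scalable_decode([(1, 0, 1)], stream, target_scale=3, decode_next_scale=toy_step)
```

With `target_scale=3` the loop runs twice and the payloads for scales 4 and 5 are left untouched, which mirrors stopping at the third-scale point cloud.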
It should be noted that each decoding pass of the decoder uses the already decoded low-scale point cloud as known information to decode the encoding information of the high-scale point cloud. For the first decoding pass, the known information may be a preset number of unencoded points sent by the encoder. The encoder may take a preset number of points, such as the coordinates of 100 points in the point cloud, as the first known information and send them directly to the decoder without encoding. The decoder then does not need to decode the first known information; it uses the position information of the preset number of points sent by the encoder to reconstruct the point cloud of the corresponding scale and continues with the subsequent decoding.
The encoding method applied to the encoder, as provided in the embodiment of the present application, is described below.
Referring to Figure 16, which is an optional flowchart of the encoding method provided in the embodiment of the present application, the description follows the steps shown in Figure 16.
S501: Perform voxel downsampling on the second-scale point cloud to determine the first-scale point cloud, and upsample the first-scale voxels in the first-scale point cloud to the second scale to determine the second-scale voxels corresponding to the first-scale voxels.
In the embodiment of the present application, the encoder voxelizes the original point cloud data of the second-scale point cloud. After voxelization, the geometric data of the point cloud is represented by the occupancy symbol of the voxel at each position in the voxel grid. The encoder performs voxel downsampling on the voxelized second-scale point cloud to determine the first-scale point cloud.
In some embodiments, the encoder implements voxel downsampling by pooling. As shown in Figure 17, a max pooling layer with a stride of 2×2×2 merges 8 second-scale voxels into 1 first-scale voxel. After voxel downsampling, 3 of the 4 first-scale voxels of the first-scale point cloud are occupied and 1 is unoccupied, and each first-scale voxel is twice the size of a second-scale voxel in each of the three dimensions. The encoder marks the occupancy of each voxel with an occupancy symbol. For example, Figure 18 shows how the occupancy symbols of the second-scale voxels on the side of the second-scale point cloud facing the page are mapped by voxel downsampling to the occupancy symbols of the first-scale voxels on the same side of the first-scale point cloud. In this way, voxel downsampling yields the geometric data of the lower-scale first-scale point cloud.
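As a rough illustration of the downsampling step, the sketch below applies the sparse equivalent of 2×2×2 max pooling to a set of occupied voxel coordinates. It assumes non-negative integer coordinates and a set-based representation; `voxel_downsample` is an illustrative name, not one taken from the application.

```python
from typing import Set, Tuple

Voxel = Tuple[int, int, int]


def voxel_downsample(occupied: Set[Voxel]) -> Set[Voxel]:
    """Sparse equivalent of 2x2x2 max pooling with stride 2.

    A coarse voxel is occupied if at least one of the 8 fine voxels it
    covers is occupied, which is what max pooling over binary occupancy
    symbols produces. Non-negative integer coordinates are assumed.
    """
    return {(x // 2, y // 2, z // 2) for (x, y, z) in occupied}


# Four occupied second-scale voxels that fall into two first-scale voxels.
fine = {(0, 0, 0), (1, 1, 1), (2, 3, 0), (3, 3, 1)}
coarse = voxel_downsample(fine)
```

Because several occupied fine voxels can collapse into one coarse voxel, the mapping is not invertible on its own, which is why the finer scale has to be predicted and encoded.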
In the embodiment of the present application, the encoder then upsamples the first-scale point cloud to the second scale and determines the multiple second-scale voxels corresponding to each first-scale voxel, so that lossless encoding can be achieved by predicting the occupancy of the second-scale voxels.
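The correspondence between one first-scale voxel and its second-scale voxels can be sketched as follows, assuming a factor-of-2 upsampling in each dimension so that every first-scale voxel has 8 candidate second-scale voxels. `upsample_voxel` is an illustrative helper.

```python
from itertools import product
from typing import List, Tuple

Voxel = Tuple[int, int, int]


def upsample_voxel(voxel: Voxel) -> List[Voxel]:
    """Return the 8 second-scale candidate voxels covered by one first-scale voxel."""
    x, y, z = voxel
    return [
        (2 * x + dx, 2 * y + dy, 2 * z + dz)
        for dx, dy, dz in product((0, 1), repeat=3)
    ]


children = upsample_voxel((1, 1, 0))
```

Only the occupancy of these candidates needs to be predicted; their positions follow directly from the position of the first-scale voxel.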
S502: Perform local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxels, and perform occupancy probability prediction on the second-scale voxels to determine the occupancy probability corresponding to the second-scale voxels.
In the embodiment of the present application, the local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to a first-scale voxel. The encoder predicts the local density of the first-scale voxels from the first-scale point cloud and predicts the occupancy probability of the second-scale voxels in the same way as the decoder does, so the details are not repeated here.
In some embodiments, S502 may include: performing feature extraction on the geometric information of the first-scale point cloud to determine first-scale point cloud features; upsampling the first-scale point cloud features to the second scale to determine initial second-scale point cloud features, performing feature extraction on the initial second-scale point cloud features to determine second-scale point cloud features, and performing occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxels; and performing local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxels.
Alternatively, in some embodiments, S502 may include: performing feature extraction on the geometric information of the first-scale point cloud through a first feature extraction network to determine first point cloud features of the first scale; upsampling the first point cloud features of the first scale to the second scale to determine second-scale point cloud features, and performing occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxels; performing feature extraction on the geometric data of the first-scale point cloud through a second feature extraction network to determine second point cloud features of the first scale; and performing local density prediction based on the second point cloud features of the first scale to determine the local density corresponding to the first-scale voxels.
Alternatively, in some embodiments, S502 may also include: performing feature extraction on the geometric information of the first-scale point cloud to determine first-scale point cloud features; upsampling the first-scale point cloud features to the second scale to determine second-scale point cloud features, and performing occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxels; and performing local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxels.
In the local density prediction described above, the encoder may use a local density prediction network to perform local density prediction based on the first-scale point cloud features and determine the local density corresponding to the first-scale voxels. The local density prediction network includes a first sparse convolution layer, a first activation function layer, a second sparse convolution layer, and a second activation function layer.
The above processing is consistent with the description of the same processing in the decoder and is not repeated here.
S503: Determine the reconstructed geometric information corresponding to the second-scale point cloud according to the local density corresponding to the first-scale voxels and the occupancy probability corresponding to the second-scale voxels.
In the embodiment of the present application, the way the encoder determines the reconstructed geometric information corresponding to the second-scale point cloud from the local density of the first-scale voxels and the occupancy probability of the second-scale voxels is consistent with the description of the same processing in the decoder and is not repeated here.
In some embodiments, S503 may be implemented by performing S5031 and S5032, as follows:
S5031: For each first-scale voxel in the first-scale point cloud, rank the multiple second-scale voxels corresponding to that first-scale voxel by occupancy probability, and determine the highest-ranked ones, their number equal to the local density of that first-scale voxel, as the occupied second-scale voxels.
S5032: Determine the reconstructed geometric information corresponding to the second-scale point cloud based on the occupied second-scale voxels corresponding to each first-scale voxel.
Here, S5031 and S5032 are consistent with the description of S1031 and S1032 above and are not repeated here.
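Steps S5031 and S5032 amount to a per-parent top-k selection, where k is the predicted local density. The sketch below assumes the occupancy probabilities and local densities are already available as plain dictionaries; `select_occupied` and the example values are illustrative only.

```python
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]


def select_occupied(
    child_probs: Dict[Voxel, Dict[Voxel, float]],
    local_density: Dict[Voxel, int],
) -> List[Voxel]:
    """Keep, for each first-scale voxel, its k most probable second-scale voxels.

    child_probs maps a first-scale voxel to the occupancy probabilities of
    its second-scale voxels; local_density maps it to k.
    """
    occupied: List[Voxel] = []
    for parent, probs in child_probs.items():
        k = local_density[parent]
        # Sort the candidate voxels from highest to lowest occupancy probability.
        ranked = sorted(probs, key=probs.get, reverse=True)
        occupied.extend(ranked[:k])
    # The union over all parents is the reconstructed second-scale geometry.
    return sorted(occupied)


probs = {
    (0, 0, 0): {
        (0, 0, 0): 0.90, (0, 0, 1): 0.20, (0, 1, 0): 0.60, (0, 1, 1): 0.10,
        (1, 0, 0): 0.40, (1, 0, 1): 0.05, (1, 1, 0): 0.30, (1, 1, 1): 0.70,
    }
}
density = {(0, 0, 0): 2}
geometry = select_occupied(probs, density)
```

In this example three candidates have a probability above 0.5, but only the two most probable are kept because the local density is 2; the local density fixes how many second-scale voxels are marked occupied.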
S504: Perform encoding based on the reconstructed geometric information corresponding to the second-scale point cloud, determine the encoding information corresponding to the second-scale point cloud, and write the encoding information into the bitstream.
In the embodiment of the present application, the encoder may recolor the point cloud based on the reconstructed geometric information corresponding to the second-scale point cloud to obtain colored point cloud data, and encode the color information of the colored point cloud data to determine the attribute information encoding of the second-scale point cloud. The encoder also encodes the geometric information of the point cloud data of the second-scale point cloud to determine the geometric information encoding of the second-scale point cloud. The geometric information encoding and the attribute information encoding together form the encoding information corresponding to the second-scale point cloud.
In some embodiments, entropy coding may use, but is not limited to, Context-based Adaptive Binary Arithmetic Coding (CABAC). By the principle of entropy coding, the more accurately voxel occupancy is predicted, the lower the information entropy and the greater the savings in actual bit rate and bandwidth. The encoder writes the encoding information of the second-scale point cloud into the bitstream and sends it to the decoder, which parses out that encoding information. Using the decoding method provided in the embodiment of the present application, the decoder feeds the prediction obtained from the geometric data of the previously decoded low-scale point cloud (such as the first-scale point cloud) into the entropy decoder and losslessly reconstructs the geometric data of the second-scale point cloud, that is, it determines the reconstructed geometric information corresponding to the second-scale point cloud. Based on that reconstructed geometric information, the decoder then performs geometric decoding and attribute decoding of the encoding information of the second-scale point cloud and restores the three-dimensional model of the second-scale point cloud.
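The link between prediction accuracy and bit rate can be illustrated with the ideal code length of an arithmetic coder, which spends about -log2(p) bits on a symbol predicted with probability p. This is a textbook estimate rather than the CABAC procedure itself; `ideal_code_length` and the sample probabilities are illustrative.

```python
import math
from typing import Sequence


def ideal_code_length(probs: Sequence[float], symbols: Sequence[int]) -> float:
    """Ideal number of bits to code binary occupancy symbols.

    probs[i] is the predicted probability that symbol i is 1 (occupied).
    Each symbol costs -log2 of the probability assigned to its actual value.
    """
    bits = 0.0
    for p, s in zip(probs, symbols):
        p_actual = p if s == 1 else 1.0 - p
        bits += -math.log2(p_actual)
    return bits


symbols = [1, 0, 1, 1]
sharp = ideal_code_length([0.9, 0.1, 0.9, 0.9], symbols)  # confident, correct prediction
vague = ideal_code_length([0.5, 0.5, 0.5, 0.5], symbols)  # uninformative prediction
```

The uninformative prediction costs one bit per symbol, while the confident and correct prediction costs well under one bit for all four symbols together.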
It can be understood that, in the embodiment of the present application, screening the occupancy probabilities of the second-scale voxels by the local density of the corresponding first-scale voxel improves the accuracy with which the occupancy of the second-scale voxels is determined. This in turn improves the accuracy of encoding based on the reconstructed geometric information of the second-scale point cloud derived from that occupancy, and thereby improves encoding performance.
In some embodiments, the encoder may also apply the same encoding process to the first-scale point cloud, performing voxel downsampling at least once to determine point clouds at scales lower than the first scale, and encode the point clouds scale by scale. The encoder writes the encoding information of the multiple point cloud scales into the bitstream and sends it to the decoder.
It can be understood that the encoding method provided in the embodiment of the present application can be applied repeatedly across multiple pairs of adjacent scales, and the encoding for each pair of adjacent scales is independent of the others, so scale-scalable encoding can be implemented flexibly.
The application of the decoding method provided in the embodiment of the present application in a practical scenario is described below with reference to Figure 19.
As shown in Figure 19, the encoder performs G-PCC encoding on the original point cloud data of the first-scale point cloud to obtain the encoding information corresponding to the first-scale point cloud, and sends it to the decoder in the bitstream. The decoder performs G-PCC decoding on that encoding information and determines the reconstructed geometric information of the first-scale point cloud through the geometric decoding process of G-PCC decoding.

Taking the geometric decoding process as an example, the decoder performs feature extraction on the reconstructed geometric information of the first-scale point cloud to obtain first-scale point cloud features, and upsamples those features to the second scale with a 2×2×2 transposed convolution to determine initial second-scale point cloud features. The decoder then performs feature extraction on the initial second-scale point cloud features to determine second-scale point cloud features. Based on the second-scale point cloud features, the decoder predicts the occupancy probability of the multiple second-scale voxels obtained by upsampling each first-scale voxel in the first-scale point cloud, obtaining one occupancy probability per second-scale voxel. Based on the first-scale point cloud features, it predicts the local density of each first-scale voxel.

During reconstruction, the decoder uses the local density of each first-scale voxel to screen the occupancy probabilities of the second-scale voxels upsampled from that first-scale voxel. For example, it sorts the occupancy probabilities from high to low, selects the highest ones, their number equal to the local density, and determines the second-scale voxels to which they belong as the occupied second-scale voxels. The reconstructed geometric information of the second-scale point cloud is determined from the occupied second-scale voxels of every first-scale voxel in the first-scale point cloud. Based on the reconstructed geometric information of the second-scale point cloud, the same process is then used to obtain the reconstructed geometric information of the third-scale point cloud.
It can be understood that, compared with the conventional G-PCC method, the embodiment of the present application achieves good geometric reconstruction quality and better encoding and decoding performance. In some embodiments, the applicant ran comparative encoding and decoding tests on several point cloud datasets, using the method of the embodiment of the present application and the conventional G-PCC method. The results are shown in Table 1:
Table 1

Point cloud dataset               Rate-distortion (BD-rate) gain over conventional G-PCC
facade_00009_vox12                -25.91%
house_without_roof_00057_vox12    -43.11%
boxer_viewdep_vox12               -42.76%
soldier_viewdep_vox12             -45.60%
Average gain                      -39.35%
In Table 1, facade_00009_vox12, house_without_roof_00057_vox12, boxer_viewdep_vox12, and soldier_viewdep_vox12 are different point cloud datasets, and the reported figure is the Bjontegaard-Delta rate (BD-rate) gain over conventional G-PCC. The BD-rate value is negatively correlated with encoding and decoding performance: the smaller the value, that is, the more negative it is, the better the performance. The embodiment of the present application achieves a lower BD-rate than the conventional G-PCC method on every dataset, with a maximum bit rate reduction of 45.60% and an average reduction of 39.35%, which demonstrates the improvement in encoding and decoding performance.
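As a quick consistency check on Table 1, the average can be recomputed from the four per-dataset values. The snippet uses only the figures reported in the table; the variable names are illustrative.

```python
# BD-rate gain over conventional G-PCC, in percent, as listed in Table 1.
bd_rate = {
    "facade_00009_vox12": -25.91,
    "house_without_roof_00057_vox12": -43.11,
    "boxer_viewdep_vox12": -42.76,
    "soldier_viewdep_vox12": -45.60,
}

average = sum(bd_rate.values()) / len(bd_rate)  # arithmetic mean of the four values
best = min(bd_rate, key=bd_rate.get)            # most negative value is the largest saving
```

The mean of the four values is -39.345, which agrees with the reported average of -39.35% after rounding to two decimals, and the largest saving is on soldier_viewdep_vox12.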
The embodiment of the present application provides a decoder 1, as shown in Figure 20, including:
a parsing part 11, configured to parse the bitstream and determine the encoding information corresponding to the second-scale point cloud;
a determining part 12, configured to determine the first-scale point cloud, the first-scale point cloud being the previously decoded point cloud data corresponding to the second-scale point cloud;
a first prediction part 13, configured to perform local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxels in the first-scale point cloud, and to perform occupancy probability prediction on the second-scale voxels to determine the occupancy probability corresponding to the second-scale voxels; the second-scale voxels are the upsampled voxels corresponding to the first-scale voxels; the local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to a first-scale voxel;
a decoding and reconstruction part 14, configured to decode and reconstruct the encoding information corresponding to the second-scale point cloud based on the occupancy probability corresponding to the second-scale voxels and the local density corresponding to the first-scale voxels, and determine the reconstructed geometric information corresponding to the second-scale point cloud.
In some embodiments, the decoding and reconstruction part 14 is further configured to: for each first-scale voxel in the first-scale point cloud, rank the multiple second-scale voxels corresponding to that first-scale voxel by occupancy probability and determine the highest-ranked ones, their number equal to the local density of that first-scale voxel, as the occupied second-scale voxels; and, based on the occupied second-scale voxels corresponding to each first-scale voxel, decode and reconstruct the encoding information corresponding to the second-scale point cloud to determine the reconstructed geometric information corresponding to the second-scale point cloud.
In some embodiments, the first prediction part 13 is further configured to: perform feature extraction on the geometric information of the first-scale point cloud to determine first-scale point cloud features; upsample the first-scale point cloud features to the second scale to determine initial second-scale point cloud features, perform feature extraction on the initial second-scale point cloud features to determine second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxels; and perform local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxels.
In some embodiments, the first prediction part 13 is further configured to: perform feature extraction on the geometric information of the first-scale point cloud through a first feature extraction network to determine first point cloud features of the first scale; upsample the first point cloud features of the first scale to the second scale to determine second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxels; perform feature extraction on the geometric data of the first-scale point cloud through a second feature extraction network to determine second point cloud features of the first scale; and perform local density prediction based on the second point cloud features of the first scale to determine the local density corresponding to the first-scale voxels.
In some embodiments, the first prediction part 13 is further configured to: perform feature extraction on the geometric information of the first-scale point cloud to determine first-scale point cloud features; upsample the first-scale point cloud features to the second scale to determine second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxels; and perform local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxels.
In some embodiments, the first prediction part 13 is further configured to use a local density prediction network to perform local density prediction based on the first-scale point cloud features and determine the local density corresponding to the first-scale voxels.
In some embodiments, the local density prediction network includes a first sparse convolution layer, a first activation function layer, a second sparse convolution layer, and a second activation function layer.
In some embodiments, the parsing part 11 is further configured to parse the bitstream and determine the encoding information corresponding to the i-th scale point cloud, where i is an integer greater than or equal to 3. The first prediction part 13 is further configured to perform local density prediction of the (i-1)-th scale voxels based on the reconstructed geometric information of the (i-1)-th scale point cloud to determine the local density corresponding to the (i-1)-th scale voxels, and to perform occupancy probability prediction on the i-th scale voxels corresponding to the (i-1)-th scale voxels to determine the occupancy probability corresponding to the i-th scale voxels; the i-th scale is obtained by upsampling the (i-1)-th scale. The decoding and reconstruction part 14 is further configured to decode and reconstruct the encoding information corresponding to the i-th scale point cloud based on the occupancy probability corresponding to the i-th scale voxels and the local density corresponding to the (i-1)-th scale voxels, and determine the reconstructed geometric information corresponding to the i-th scale point cloud.
In some embodiments, the first prediction part 13 is further configured to perform local density prediction of the n-th scale voxels and occupancy probability prediction of the (n+1)-th scale voxels based on the reconstructed geometric data corresponding to the n-th scale point cloud, and determine the local density corresponding to the n-th scale voxels and the occupancy probability corresponding to the (n+1)-th scale voxels, where n is a positive integer greater than or equal to 2 and the (n+1)-th scale is obtained by upsampling the n-th scale. The decoding and reconstruction part 14 is further configured to determine the reconstructed geometric data corresponding to the (n+1)-th scale point cloud based on the local density corresponding to the n-th scale voxels and the occupancy probability corresponding to the (n+1)-th scale voxels.
The embodiment of the present application provides an encoder 2, as shown in Figure 21, including:
a downsampling part 21, configured to perform voxel downsampling on the second-scale point cloud to determine the first-scale point cloud;
a second prediction part 22, configured to perform local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxels, and to perform occupancy probability prediction on the second-scale voxels to determine the occupancy probability corresponding to the second-scale voxels; the local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to a first-scale voxel;
a reconstruction part 23, configured to determine the reconstructed geometric information corresponding to the second-scale point cloud according to the local density corresponding to the first-scale voxels and the occupancy probability corresponding to the second-scale voxels;
an encoding part 24, configured to perform encoding based on the reconstructed geometric information corresponding to the second-scale point cloud, determine the encoding information corresponding to the second-scale point cloud, and write the encoding information into the bitstream.
In some embodiments, the reconstruction part 23 is further configured to: for each first-scale voxel in the first-scale point cloud, rank the multiple second-scale voxels corresponding to that first-scale voxel by occupancy probability and determine the highest-ranked ones, their number equal to the local density of that first-scale voxel, as the occupied second-scale voxels; and determine the reconstructed geometric information corresponding to the second-scale point cloud based on the occupied second-scale voxels corresponding to each first-scale voxel.
In some embodiments, the second prediction part 22 is further configured to: perform feature extraction on the geometric information of the first-scale point cloud to determine first-scale point cloud features; upsample the first-scale point cloud features to the second scale to determine initial second-scale point cloud features, perform feature extraction on the initial second-scale point cloud features to determine second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxels; and perform local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxels.
在一些实施例中,所述第二预测部分22,还配置为通过第一特征提取网络,对所述第一尺度点云的几何信息进行特征提取,确定第一尺度的第一点云特征;将所述第一尺度的第一点云特征上采样至第二尺度,确定第二尺度点云特征,并基于所述第二尺度点云特征进行占据概率预测,确定所述第二尺度体素对应的占据概率;通过第二特征提取网络,对所述第一尺度点云的几何数据进行特征提取,确定第一尺度的第二点云特征;根据所述第一尺度的第二点云特征进行局部密度预测,确定所述第一尺度体素对应的局部密度。In some embodiments, the second prediction part 22 is further configured to perform feature extraction on the geometric information of the first-scale point cloud through a first feature extraction network to determine the first point cloud features of the first scale; upsample the first point cloud features of the first scale to a second scale to determine the second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel; perform feature extraction on the geometric data of the first-scale point cloud through a second feature extraction network to determine the second point cloud features of the first scale; perform local density prediction based on the second point cloud features of the first scale to determine the local density corresponding to the first-scale voxel.
在一些实施例中,所述第二预测部分22,还配置为对所述第一尺度点云的几何信息进行特征提取,确定第一尺度点云特征;将所述第一尺度点云特征上采样至第二尺度,确定第二尺度点云特征,并根据所述第二尺度点云特征进行占据概率预测,确定所述第二尺度体素对应的占据概率;根据所述第一尺度点云特征进行局部密度预测,确定所述第一尺度体素对应的局部密度。In some embodiments, the second prediction part 22 is further configured to perform feature extraction on the geometric information of the first-scale point cloud to determine the first-scale point cloud features; upsample the first-scale point cloud features to the second scale to determine the second-scale point cloud features, and perform occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel; perform local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
In some embodiments, the second prediction part 22 is further configured to use a local density prediction network to perform local density prediction based on the first-scale point cloud features and determine the local density corresponding to the first-scale voxel; wherein the local density prediction network includes: a first sparse convolution layer, a first activation function layer, a second sparse convolution layer and a second activation function layer.
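The four-layer structure named above (sparse convolution, activation, sparse convolution, activation) can be illustrated with a small dictionary-based sketch. Everything beyond the layer order is an assumption made for the example: the 3x3x3 kernel, ReLU as the activation function, outputs computed only at occupied input voxels, the single-channel output, and the rounding and clamping of that output to an integer between 1 and 8. The names `sparse_conv`, `relu` and `predict_local_density` are hypothetical, and the kernel weights are passed in rather than trained.

```python
from itertools import product
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]
Features = Dict[Voxel, List[float]]
# kernel[offset][o][i] is the weight from input channel i to output channel o.
Kernel = Dict[Voxel, List[List[float]]]

OFFSETS = list(product((-1, 0, 1), repeat=3))  # assumed 3x3x3 neighbourhood


def sparse_conv(feats: Features, kernel: Kernel, out_channels: int) -> Features:
    """Convolve only at occupied voxels, reading only occupied neighbours."""
    out: Features = {}
    for (x, y, z) in feats:
        acc = [0.0] * out_channels
        for off in OFFSETS:
            neighbour = feats.get((x + off[0], y + off[1], z + off[2]))
            if neighbour is None or off not in kernel:
                continue  # empty neighbour or zero weight: contributes nothing
            weights = kernel[off]
            for o in range(out_channels):
                acc[o] += sum(w * f for w, f in zip(weights[o], neighbour))
        out[(x, y, z)] = acc
    return out


def relu(feats: Features) -> Features:
    """Element-wise ReLU (assumed activation function)."""
    return {v: [max(0.0, f) for f in fs] for v, fs in feats.items()}


def predict_local_density(feats: Features, kernel1: Kernel, kernel2: Kernel,
                          hidden_channels: int) -> Dict[Voxel, int]:
    """Sparse conv -> activation -> sparse conv -> activation, then round to a count."""
    hidden = relu(sparse_conv(feats, kernel1, hidden_channels))
    output = relu(sparse_conv(hidden, kernel2, 1))
    # Assumption: map the scalar output to an integer count of occupied children in 1..8.
    return {v: max(1, min(8, round(fs[0]))) for v, fs in output.items()}
```

In practice a sparse tensor library would replace the dictionary loops; the sketch only shows how per-voxel features flow through the four layers to a per-voxel count.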
In some embodiments, the encoding part 24 is further configured to perform recoloring based on the reconstructed geometric information corresponding to the second-scale point cloud to obtain colored point cloud data, and perform color information encoding based on the colored point cloud data to determine the attribute information encoding corresponding to the second-scale point cloud; perform geometric information encoding on the point cloud data of the second-scale point cloud to determine the geometric information encoding corresponding to the second-scale point cloud; and determine the geometric information encoding and the attribute information encoding as the encoding information corresponding to the second-scale point cloud.
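One common way to recolor a reconstructed geometry is nearest-neighbour color transfer from the original point cloud. The sketch below illustrates that idea only; this passage does not state which recoloring method the present application uses, so the nearest-neighbour rule, the brute-force search, and the name `recolor` are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Color = Tuple[int, int, int]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2


def recolor(recon_points: Sequence[Point],
            orig_points: Sequence[Point],
            orig_colors: Sequence[Color]) -> List[Color]:
    """Give each reconstructed point the color of its nearest original point.

    Brute-force search, O(N*M); a spatial index would be used for real data.
    """
    if not orig_points or len(orig_points) != len(orig_colors):
        raise ValueError("orig_points and orig_colors must be non-empty and of equal length")
    colors: List[Color] = []
    for p in recon_points:
        nearest = min(range(len(orig_points)),
                      key=lambda i: squared_distance(p, orig_points[i]))
        colors.append(orig_colors[nearest])
    return colors
```

The resulting colors are attached to the reconstructed geometry, so that attribute encoding operates on the same points the decoder will reconstruct.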
It should be noted that the description of the above device embodiments is similar to that of the above method embodiments, and the device embodiments have beneficial effects similar to those of the method embodiments. For technical details not disclosed in the device embodiments of the present application, refer to the description of the method embodiments of the present application.
In some embodiments, the present application further provides a decoder. FIG. 22 is an optional schematic structural diagram of the decoder 3 provided in an embodiment of the present application. As shown in FIG. 22, the decoder 3 includes a first memory 32 and a first processor 33. The first memory 32 and the first processor 33 are connected through a first communication bus 34; the first memory 32 is configured to store executable instructions; the first processor 33 is configured to implement the decoding method provided in the embodiments of the present application when executing the executable instructions stored in the first memory 32.
In some embodiments, the present application further provides an encoder. FIG. 23 is an optional schematic structural diagram of the encoder 4 provided in an embodiment of the present application. As shown in FIG. 23, the encoder 4 includes a second memory 42 and a second processor 43. The second memory 42 and the second processor 43 are connected through a second communication bus 44; the second memory 42 is configured to store executable instructions; the second processor 43 is configured to implement the encoding method provided in the embodiments of the present application when executing the executable instructions stored in the second memory 42.
An embodiment of the present application provides a computer-readable storage medium storing executable instructions. When the executable instructions are executed by a first processor, the first processor is caused to execute any one of the decoding methods provided in the embodiments of the present application; or, when the executable instructions are executed by a second processor, the second processor is caused to execute any one of the encoding methods provided in the embodiments of the present application.
In some embodiments, the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface storage, an optical disk, or a CD-ROM; or it may be any of various devices including one or any combination of the above memories.
In some embodiments, the executable instructions may take the form of a program, software, a software module, a script or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine or other unit suitable for use in a computing environment.
As an example, the executable instructions may, but need not, correspond to a file in a file system. They may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a Hyper Text Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (for example, files storing one or more modules, subroutines, or code portions).
As an example, the executable instructions may be deployed to be executed on one computing device, on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
Those skilled in the art will appreciate that the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the present application may take the form of hardware embodiments, software embodiments, or embodiments combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the methods, devices (systems) and computer program products according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are only preferred embodiments of the present application and are not intended to limit the protection scope of the present application. Any modifications, equivalent replacements and improvements made within the spirit and scope of the present application fall within the protection scope of the present application.
Industrial Applicability
The embodiments of the present application provide an encoding and decoding method, a decoder, an encoder and a computer-readable storage medium. By predicting the local density, the decoder can determine, for each first-scale voxel, how many of the second-scale voxels obtained by upsampling that first-scale voxel are occupied. The occupancy probabilities corresponding to the second-scale voxels can then be screened in combination with the local density to determine the occupancy of the second-scale voxels, the second-scale point cloud can be reconstructed according to that occupancy, and the reconstructed geometric information of the second-scale point cloud can be determined. In this way, the determined occupancy of the second-scale voxels is more accurate, which improves the accuracy of the reconstructed geometric information of the second-scale point cloud and the reconstructed geometric quality of the decoder, that is, the decoding performance. Likewise, in the encoder, screening the occupancy probabilities corresponding to the second-scale voxels based on the local density corresponding to the first-scale voxels improves the accuracy with which the occupancy of the second-scale voxels is determined, which in turn improves the accuracy of encoding based on the reconstructed geometric information of the second-scale point cloud derived from that occupancy, thereby improving the encoding performance.
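The quantity that the local density represents can be made concrete with a short sketch of voxel downsampling. It assumes integer voxel coordinates and a 2x downsampling factor per axis, neither of which is stated in this passage; the name `downsample_with_density` is hypothetical. For each first-scale voxel it counts how many distinct occupied second-scale voxels fall inside it, which is the true value that the predicted local density approximates.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Voxel = Tuple[int, int, int]


def downsample_with_density(second_scale: Iterable[Voxel]) -> Dict[Voxel, int]:
    """Voxel-downsample by 2 per axis (assumed factor) and count occupied children.

    Keys are the occupied first-scale voxels; values are the number of distinct
    occupied second-scale voxels inside each of them (between 1 and 8).
    """
    unique_children = set(second_scale)  # duplicate points occupy the same voxel
    counts = Counter((x // 2, y // 2, z // 2) for (x, y, z) in unique_children)
    return dict(counts)
```

For example, the second-scale voxels (0, 0, 0) and (1, 1, 1) both fall inside the first-scale voxel (0, 0, 0), giving it a local density of 2.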

Claims (22)

  1. A decoding method, comprising:
    parsing a bitstream to determine encoding information corresponding to a second-scale point cloud, and determining a first-scale point cloud, wherein the first-scale point cloud is the previously decoded point cloud data corresponding to the second-scale point cloud;
    performing local density prediction based on the first-scale point cloud to determine a local density corresponding to a first-scale voxel in the first-scale point cloud, and performing occupancy probability prediction on a second-scale voxel to determine an occupancy probability corresponding to the second-scale voxel, wherein the second-scale voxel is an upsampled voxel corresponding to the first-scale voxel, and the local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to the first-scale voxel;
    decoding and reconstructing the encoding information corresponding to the second-scale point cloud based on the occupancy probability corresponding to the second-scale voxel and the local density corresponding to the first-scale voxel, to determine reconstructed geometric information corresponding to the second-scale point cloud.
  2. The method according to claim 1, wherein the decoding and reconstructing the encoding information corresponding to the second-scale point cloud based on the occupancy probability corresponding to the second-scale voxel and the local density corresponding to the first-scale voxel, to determine the reconstructed geometric information corresponding to the second-scale point cloud, comprises:
    for each first-scale voxel in the first-scale point cloud, taking the multiple second-scale voxels corresponding to that first-scale voxel in descending order of occupancy probability and determining as many of them as the local density indicates to be occupied second-scale voxels;
    decoding and reconstructing the encoding information corresponding to the second-scale point cloud based on the occupied second-scale voxels corresponding to each first-scale voxel, to determine the reconstructed geometric information corresponding to the second-scale point cloud.
  3. The method according to claim 1, wherein the performing local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxel in the first-scale point cloud, and performing occupancy probability prediction on the second-scale voxel to determine the occupancy probability corresponding to the second-scale voxel, comprises:
    performing feature extraction on the geometric information of the first-scale point cloud to determine first-scale point cloud features;
    upsampling the first-scale point cloud features to the second scale to determine initial second-scale point cloud features, performing feature extraction on the initial second-scale point cloud features to determine second-scale point cloud features, and performing occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel;
    performing local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
  4. The method according to claim 1, wherein the performing local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxel in the first-scale point cloud, and performing occupancy probability prediction on the second-scale voxel to determine the occupancy probability corresponding to the second-scale voxel, comprises:
    performing feature extraction on the geometric information of the first-scale point cloud through a first feature extraction network to determine first point cloud features of the first scale;
    upsampling the first point cloud features of the first scale to the second scale to determine second-scale point cloud features, and performing occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel;
    performing feature extraction on the geometric data of the first-scale point cloud through a second feature extraction network to determine second point cloud features of the first scale;
    performing local density prediction based on the second point cloud features of the first scale to determine the local density corresponding to the first-scale voxel.
  5. The method according to claim 1, wherein the performing local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxel in the first-scale point cloud, and performing occupancy probability prediction on the second-scale voxel to determine the occupancy probability corresponding to the second-scale voxel, comprises:
    performing feature extraction on the geometric information of the first-scale point cloud to determine first-scale point cloud features;
    upsampling the first-scale point cloud features to the second scale to determine second-scale point cloud features, and performing occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel;
    performing local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
  6. The method according to any one of claims 3 to 5, wherein the method further comprises:
    using a local density prediction network to perform local density prediction based on the first-scale point cloud features, to determine the local density corresponding to the first-scale voxel.
  7. The method according to claim 6, wherein the local density prediction network comprises:
    a first sparse convolution layer, a first activation function layer, a second sparse convolution layer and a second activation function layer.
  8. The method according to any one of claims 1 to 5, or claim 7, wherein the method further comprises:
    parsing the bitstream to determine encoding information corresponding to an i-th scale point cloud, wherein i is an integer greater than or equal to 3;
    performing local density prediction for an (i-1)-th scale voxel based on the reconstructed geometric information of the (i-1)-th scale point cloud to determine a local density corresponding to the (i-1)-th scale voxel, and performing occupancy probability prediction on an i-th scale voxel corresponding to the (i-1)-th scale voxel to determine an occupancy probability corresponding to the i-th scale voxel, wherein the i-th scale is obtained by upsampling the (i-1)-th scale;
    decoding and reconstructing the encoding information corresponding to the i-th scale point cloud based on the occupancy probability corresponding to the i-th scale voxel and the local density corresponding to the (i-1)-th scale voxel, to determine reconstructed geometric information corresponding to the i-th scale point cloud.
  9. The method according to any one of claims 1 to 5, or claim 7, wherein the method further comprises:
    performing local density prediction for an n-th scale voxel and occupancy probability prediction for an (n+1)-th scale voxel based on the reconstructed geometric data corresponding to the n-th scale point cloud, to determine a local density corresponding to the n-th scale voxel and an occupancy probability corresponding to the (n+1)-th scale voxel, wherein n is a positive integer greater than or equal to 2, and the (n+1)-th scale is obtained by upsampling the n-th scale;
    determining reconstructed geometric data corresponding to the (n+1)-th scale point cloud based on the local density corresponding to the n-th scale voxel and the occupancy probability corresponding to the (n+1)-th scale voxel.
  10. An encoding method, comprising:
    performing voxel downsampling on a second-scale point cloud to determine a first-scale point cloud, and upsampling a first-scale voxel in the first-scale point cloud to the second scale to determine second-scale voxels corresponding to the first-scale voxel;
    performing local density prediction based on the first-scale point cloud to determine a local density corresponding to the first-scale voxel, and performing occupancy probability prediction on the second-scale voxel to determine an occupancy probability corresponding to the second-scale voxel, wherein the local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to the first-scale voxel;
    determining reconstructed geometric information corresponding to the second-scale point cloud according to the local density corresponding to the first-scale voxel and the occupancy probability corresponding to the second-scale voxel;
    performing encoding based on the reconstructed geometric information corresponding to the second-scale point cloud to determine encoding information corresponding to the second-scale point cloud, and writing the encoding information corresponding to the second-scale point cloud into a bitstream.
  11. The method according to claim 10, wherein the determining the reconstructed geometric information corresponding to the second-scale point cloud according to the local density corresponding to the first-scale voxel and the occupancy probability corresponding to the second-scale voxel comprises:
    for each first-scale voxel in the first-scale point cloud, taking the multiple second-scale voxels corresponding to that first-scale voxel in descending order of occupancy probability and determining as many of them as the local density indicates to be occupied second-scale voxels;
    determining the reconstructed geometric information corresponding to the second-scale point cloud based on the occupied second-scale voxels corresponding to each first-scale voxel.
  12. The method according to claim 10, wherein the performing local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxel, and performing occupancy probability prediction on the second-scale voxel to determine the occupancy probability corresponding to the second-scale voxel, comprises:
    performing feature extraction on the geometric information of the first-scale point cloud to determine first-scale point cloud features;
    upsampling the first-scale point cloud features to the second scale to determine initial second-scale point cloud features, performing feature extraction on the initial second-scale point cloud features to determine second-scale point cloud features, and performing occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel;
    performing local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
  13. The method according to claim 10, wherein the performing local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxel, and performing occupancy probability prediction on the second-scale voxel to determine the occupancy probability corresponding to the second-scale voxel, comprises:
    performing feature extraction on the geometric information of the first-scale point cloud through a first feature extraction network to determine first point cloud features of the first scale;
    upsampling the first point cloud features of the first scale to the second scale to determine second-scale point cloud features, and performing occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel;
    performing feature extraction on the geometric data of the first-scale point cloud through a second feature extraction network to determine second point cloud features of the first scale;
    performing local density prediction based on the second point cloud features of the first scale to determine the local density corresponding to the first-scale voxel.
  14. The method according to claim 10, wherein the performing local density prediction based on the first-scale point cloud to determine the local density corresponding to the first-scale voxel, and performing occupancy probability prediction on the second-scale voxel to determine the occupancy probability corresponding to the second-scale voxel, comprises:
    performing feature extraction on the geometric information of the first-scale point cloud to determine first-scale point cloud features;
    upsampling the first-scale point cloud features to the second scale to determine second-scale point cloud features, and performing occupancy probability prediction based on the second-scale point cloud features to determine the occupancy probability corresponding to the second-scale voxel;
    performing local density prediction based on the first-scale point cloud features to determine the local density corresponding to the first-scale voxel.
  15. The method according to any one of claims 12 to 14, wherein the method further comprises:
    using a local density prediction network to perform local density prediction based on the first-scale point cloud features, to determine the local density corresponding to the first-scale voxel;
    wherein the local density prediction network comprises:
    a first sparse convolution layer, a first activation function layer, a second sparse convolution layer and a second activation function layer.
  16. The method according to any one of claims 10 to 14, wherein the performing encoding based on the reconstructed geometric information corresponding to the second-scale point cloud to determine the encoding information corresponding to the second-scale point cloud, and writing the encoding information into the bitstream, comprises:
    performing recoloring based on the reconstructed geometric information corresponding to the second-scale point cloud to obtain colored point cloud data, and performing color information encoding based on the colored point cloud data to determine the attribute information encoding corresponding to the second-scale point cloud;
    performing geometric information encoding on the point cloud data of the second-scale point cloud to determine the geometric information encoding corresponding to the second-scale point cloud;
    determining the geometric information encoding and the attribute information encoding as the encoding information corresponding to the second-scale point cloud.
  17. A decoder, comprising:
    a parsing part, configured to parse a bitstream and determine encoding information corresponding to a second-scale point cloud;
    a determining part, configured to determine a first-scale point cloud, the first-scale point cloud being the previously decoded point cloud data corresponding to the second-scale point cloud;
    a prediction part, configured to perform local density prediction based on the first-scale point cloud, determine the local density corresponding to a first-scale voxel in the first-scale point cloud, and perform occupancy probability prediction on second-scale voxels to determine the occupancy probability corresponding to the second-scale voxels; wherein the second-scale voxels are the upsampled voxels corresponding to the first-scale voxel, and the local density represents the number of occupied second-scale voxels among the second-scale voxels corresponding to the first-scale voxel;
    a decoding and reconstruction part, configured to decode and reconstruct the encoding information corresponding to the second-scale point cloud based on the occupancy probability corresponding to the second-scale voxels and the local density corresponding to the first-scale voxel, and determine the reconstructed geometric information corresponding to the second-scale point cloud.
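
Claim 17 combines two predictions: a per-parent local density and a per-child occupancy probability. One plausible reading, shown here as a hedged sketch, is that each first-scale voxel keeps its k most probable children, where k is its local density. The top-k rule, the 2x upsampling factor, and the names `upsample` and `reconstruct` are assumptions, not text from the claim.

```python
from itertools import product


def upsample(parent):
    """Return the 8 second-scale children of a first-scale voxel (assumed 2x)."""
    x, y, z = parent
    return [(2 * x + dx, 2 * y + dy, 2 * z + dz)
            for dx, dy, dz in product((0, 1), repeat=3)]


def reconstruct(density, prob):
    """Keep, per parent voxel, the k children with the highest probability.

    density: {parent_coord: k}, k = predicted count of occupied children
    prob:    {child_coord: occupancy probability}; missing entries count as 0
    """
    occupied = set()
    for parent, k in density.items():
        ranked = sorted(upsample(parent),
                        key=lambda c: prob.get(c, 0.0),
                        reverse=True)
        occupied.update(ranked[:k])
    return occupied
```

Under this reading the density acts as an adaptive threshold: a dense region keeps many children and a sparse region keeps few, instead of applying one global probability cut-off.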
  18. An encoder, comprising:
    a downsampling part, configured to perform voxel downsampling on a second-scale point cloud to determine a first-scale point cloud;
    a local density prediction part, configured to perform local density prediction based on the first-scale point cloud to determine the local density corresponding to a first-scale voxel;
    an occupancy probability prediction part, configured to upsample the first-scale voxels in the first-scale point cloud to the second scale, determine the second-scale voxels corresponding to the first-scale voxels, and perform occupancy probability prediction on the second-scale voxels to determine the occupancy probability corresponding to the second-scale voxels;
    a reconstruction part, configured to determine the reconstructed geometric information corresponding to the second-scale point cloud according to the local density corresponding to the first-scale voxel and the occupancy probability corresponding to the second-scale voxels;
    an encoding part, configured to perform encoding based on the reconstructed geometric information corresponding to the second-scale point cloud, determine the encoding information corresponding to the second-scale point cloud, and write the encoding information into a bitstream.
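
The downsampling part of claim 18 and the definition of local density (the number of occupied second-scale voxels under one first-scale voxel) can be sketched together. The factor of 2 and the function name `voxel_downsample` are assumptions; the count computed here is the quantity a local density predictor would be expected to approximate, not the predictor itself.

```python
from collections import Counter


def voxel_downsample(points, factor=2):
    """Map second-scale voxels to first-scale voxels and count children.

    points: iterable of non-negative integer (x, y, z) second-scale voxels
    Returns {first_scale_coord: number of distinct occupied children}.
    """
    unique = set(points)  # duplicated points occupy the same voxel once
    counts = Counter((x // factor, y // factor, z // factor)
                     for x, y, z in unique)
    return dict(counts)
```

The keys of the returned dictionary form the first-scale point cloud, and the values are the true local densities for that scale.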
  19. A bitstream, wherein:
    the bitstream is generated by bit-encoding encoding information, and the encoding information at least comprises: encoding information corresponding to a second-scale point cloud.
  20. A decoder, comprising:
    a first memory, configured to store executable instructions;
    a first processor, configured to implement the method according to any one of claims 1 to 9 when executing the executable instructions stored in the first memory.
  21. An encoder, comprising:
    a second memory, configured to store executable instructions;
    a second processor, configured to implement the method according to any one of claims 10 to 16 when executing the executable instructions stored in the second memory.
  22. A computer-readable storage medium storing executable instructions which, when executed by a first processor, implement the method according to any one of claims 1 to 9, or, when executed by a second processor, implement the method according to any one of claims 10 to 16.

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/125742 WO2024082105A1 (en) 2022-10-17 2022-10-17 Encoding method, decoding method, decoder, encoder and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/125742 WO2024082105A1 (en) 2022-10-17 2022-10-17 Encoding method, decoding method, decoder, encoder and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2024082105A1 true WO2024082105A1 (en) 2024-04-25

Family

ID=90736577

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/125742 WO2024082105A1 (en) 2022-10-17 2022-10-17 Encoding method, decoding method, decoder, encoder and computer-readable storage medium

Country Status (1)

Country Link
WO (1) WO2024082105A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113613010A (en) * 2021-07-07 2021-11-05 南京大学 Point cloud geometric lossless compression method based on sparse convolutional neural network
CN113766228A (en) * 2020-06-05 2021-12-07 Oppo广东移动通信有限公司 Point cloud compression method, encoder, decoder, and storage medium
US20220108492A1 (en) * 2020-10-06 2022-04-07 Qualcomm Incorporated Gpcc planar mode and buffer simplification
CN114926636A (en) * 2022-05-12 2022-08-19 合众新能源汽车有限公司 Point cloud semantic segmentation method, device, equipment and storage medium
US20220321912A1 (en) * 2019-08-09 2022-10-06 Lg Electronics Inc. Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HE YUN; REN XINLIN; TANG DANHANG; ZHANG YINDA; XUE XIANGYANG; FU YANWEI: "Density-preserving Deep Point Cloud Compression", 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 18 June 2022 (2022-06-18), pages 2323 - 2332, XP034194523, DOI: 10.1109/CVPR52688.2022.00237 *
J. WANG, Z. MA (NANJING UNIVERSITY), H. WEI (OPPO), Y. YU (OPPO), V. ZAKHARCHENKO (OPPO), D. WANG(OPPO): "[G-PCC EE13.54] A Geometry Compression Framework for AI-based PCC via Sparse Convolution", 135. MPEG MEETING; 20210712 - 20210716; ONLINE; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), 6 July 2021 (2021-07-06), XP030297111 *