WO2024119518A1 - Encoding method, decoding method, decoder, encoder, code stream and storage medium - Google Patents


Info

Publication number
WO2024119518A1
WO2024119518A1 (PCT/CN2022/138185; CN2022138185W)
Authority
WO
WIPO (PCT)
Prior art keywords
encoding
information
context
context model
coding
Prior art date
Application number
PCT/CN2022/138185
Other languages
English (en)
Chinese (zh)
Inventor
孙泽星
Original Assignee
Oppo广东移动通信有限公司
Priority date
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Priority to PCT/CN2022/138185 priority Critical patent/WO2024119518A1/fr
Publication of WO2024119518A1 publication Critical patent/WO2024119518A1/fr

Definitions

  • the present application relates to point cloud compression coding and decoding technology, and in particular to a coding and decoding method, a decoder, an encoder, a bit stream and a storage medium.
  • Point cloud is a collection of points, which can store the geometric position and related attribute information of each point, so as to accurately and three-dimensionally describe objects in space.
  • the amount of point cloud data is huge: a single frame of point cloud can contain millions of points, which makes effective storage and transmission of point clouds difficult. Therefore, compression technology is used to reduce redundant information in point cloud storage, so as to facilitate subsequent processing.
  • point cloud compression can be divided into two categories: geometry compression and attribute compression, which correspond to compressed coordinate information and attribute information respectively, and the two are compressed independently. That is, the coordinate information is first compressed using a geometric compression algorithm, and then the attribute information is compressed using a separate attribute compression algorithm with the coordinates as known information.
  • geometric compression is usually implemented by multi-tree coding and prediction tree coding, while attribute compression algorithms can be divided into methods for compressing color and reflectivity information.
  • the encoding contexts can be divided into a geometric information encoding context and an attribute information encoding context; the geometric information encoding context is further divided into a multi-tree encoding context and a prediction tree encoding context, while the attribute information encoding context is divided into a color encoding context and a reflectance attribute encoding context.
  • the embodiments of the present application provide a coding and decoding method, a decoder, an encoder, a bit stream and a storage medium, which can reduce the memory overhead during coding and decoding and improve the utilization rate of memory resources while ensuring the coding and decoding efficiency.
  • an embodiment of the present application provides a decoding method, which is applied to a decoder, and the method includes: parsing the bitstream to determine the bitstream type; determining a context model based on the bitstream type; initializing context information of the context model; and decoding the current node based on the context information.
  • an embodiment of the present application provides an encoding method, which is applied to an encoder, and the method includes: determining an encoding method of the point cloud, where the encoding method is used to indicate geometric encoding or attribute encoding; determining a context model based on the encoding method; initializing context information of the context model when encoding the current node; and encoding the current node based on the context information.
  • an embodiment of the present application provides a code stream, including:
  • the code stream is generated by bit encoding according to the information to be encoded; the information to be encoded includes at least one of the following: the code stream type, first syntax element information, second syntax element information, and encoding information of each node in the point cloud.
  • an embodiment of the present application provides a decoder, including:
  • the decoding part is configured to parse the bitstream and determine the bitstream type
  • a first determining part is configured to determine a context model based on the bitstream type
  • a first initialization part configured to initialize context information of the context model
  • the decoding part is further configured to decode the current node based on the context information.
  • an embodiment of the present application provides an encoder, including:
  • the second determining part is configured to determine an encoding method of the point cloud; the encoding method is used to indicate geometric encoding or attribute encoding; and based on the encoding method, determine a context model;
  • a second initialization part is configured to initialize the context information of the context model when encoding the current node according to the encoding method
  • the encoding part is configured to encode the current node based on the context information.
  • an embodiment of the present application further provides a decoder, including:
  • a first memory configured to store executable instructions
  • the first processor is configured to implement the method described in the decoder when executing the executable instructions stored in the first memory.
  • an embodiment of the present application further provides an encoder, including:
  • a second memory configured to store executable instructions
  • the second processor is configured to implement the method described by the encoder when executing the executable instructions stored in the second memory.
  • an embodiment of the present application further provides a computer-readable storage medium storing executable instructions for causing a first processor to execute and implement the method described in the decoder, or for causing a second processor to execute and implement the method described in the encoder.
  • the embodiment of the present application provides a coding and decoding method, a decoder, an encoder, a bitstream and a storage medium.
  • the bitstream is parsed to determine the bitstream type; based on the bitstream type, a context model is determined; the context information of the context model is initialized; and based on the context information, the current node is decoded.
  • the decoder can use the parsed bitstream type to decode using context information, and only needs to initialize the context information of the context model that is consistent with the parsed bitstream type, so that the context information required for decoding is already loaded at the time of initialization, thereby ensuring the efficiency of decoding.
  • not all context models are loaded, which saves the memory resources allocated to context models, thereby reducing memory overhead during decoding and improving the utilization of memory resources.
  • the encoder can determine whether to perform geometric encoding or attribute encoding by determining the encoding method of the point cloud during encoding, and can decide, according to the encoding method, whether to initialize the context information of the context model corresponding to geometric information or that of the context model corresponding to attribute information. In this way, only the context information of the context model consistent with the determined encoding method needs to be initialized, so that the context information required for encoding is already loaded at the time of initialization, ensuring the efficiency of encoding. At the same time, context models for the other encoding methods are not loaded, which saves the memory resources allocated to context models, thereby reducing memory overhead during encoding and improving the utilization of memory resources.
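As a rough illustration of the selective initialization idea described above, the following sketch allocates only the context table matching the parsed stream type. The type names, table sizes, and the equiprobable initial state are hypothetical assumptions, not taken from any actual codec.

```python
from enum import Enum

class StreamType(Enum):
    GEOMETRY_OCTREE = 0
    GEOMETRY_PREDTREE = 1
    ATTRIBUTE_COLOR = 2
    ATTRIBUTE_REFLECTANCE = 3

# Assumed per-type context-table sizes (number of binary contexts).
CONTEXT_SIZES = {
    StreamType.GEOMETRY_OCTREE: 256,
    StreamType.GEOMETRY_PREDTREE: 64,
    StreamType.ATTRIBUTE_COLOR: 128,
    StreamType.ATTRIBUTE_REFLECTANCE: 32,
}

def init_contexts(stream_type):
    """Allocate and initialize only the context table matching the
    parsed bitstream type, instead of loading all four tables."""
    size = CONTEXT_SIZES[stream_type]
    # Each binary context starts at the equiprobable state p(1) = 0.5.
    return [0.5] * size
```

Memory is only spent on the one table that the parsed stream type (or the determined encoding method, on the encoder side) actually requires.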
  • FIG1A is a schematic diagram of a three-dimensional point cloud image
  • FIG1B is a partially enlarged schematic diagram of a three-dimensional point cloud image
  • FIG2A is a schematic diagram of a point cloud image at different viewing angles
  • FIG2B is a schematic diagram of a data storage format corresponding to FIG2A ;
  • FIG3 is a schematic diagram of a network architecture for point cloud encoding and decoding
  • FIG4A is a schematic block diagram of a G-PCC encoder
  • FIG4B is a schematic block diagram of a G-PCC decoder
  • FIG5A is a schematic diagram of an intersection of a seed block
  • FIG5B is a schematic diagram of fitting a triangular face set
  • FIG5C is a schematic diagram of upsampling of a triangular face set
  • FIG6 is a schematic diagram of an exemplary prediction tree structure
  • FIG7A is a block diagram of an AVS encoder
  • FIG7B is a block diagram of an AVS decoder
  • FIG8 is a schematic diagram of a flow chart of a decoding method provided in an embodiment of the present application.
  • FIG9 is a schematic diagram of a flow chart of an encoding method provided in an embodiment of the present application.
  • FIG10A to FIG10H are schematic diagrams of the structures of reference nodes selected by child nodes according to an embodiment of the present application.
  • FIG11A to FIG11D are schematic diagrams of the structures of four groups of reference neighbor nodes of the current node provided in an embodiment of the present application.
  • FIG14 is a schematic diagram of the structure of a decoder provided in an embodiment of the present application.
  • FIG15 is a schematic diagram of a specific hardware structure of a decoder provided in an embodiment of the present application.
  • FIG16 is a schematic diagram of the structure of an encoder provided in an embodiment of the present application.
  • FIG17 is a schematic diagram of a specific hardware structure of an encoder provided in an embodiment of the present application.
  • "first", "second" and "third" involved in the embodiments of the present application are only used to distinguish similar objects and do not represent a specific ordering of the objects. It can be understood that "first", "second" and "third" can be interchanged in a specific order or sequence where permitted, so that the embodiments of the present application described here can be implemented in an order other than that illustrated or described here.
  • Point Cloud is a three-dimensional representation of the surface of an object.
  • Point cloud (data) on the surface of an object can be collected through acquisition equipment such as photoelectric radar, lidar, laser scanner, and multi-view camera.
  • a point cloud is a set of discrete points that are irregularly distributed in space and express the spatial structure and surface properties of a three-dimensional object or scene.
  • FIG1A shows a three-dimensional point cloud image
  • FIG1B shows a partial magnified view of the three-dimensional point cloud image. It can be seen that the point cloud surface is composed of densely distributed points.
  • Two-dimensional images have information expressed at every pixel and a regular distribution, so their position information does not need to be recorded additionally; the distribution of the points of a point cloud in three-dimensional space, however, is random and irregular, so the position of each point in space must be recorded in order to fully express the point cloud.
  • in image acquisition, each position has corresponding attribute information, usually an RGB color value, which reflects the color of the object; for point clouds, in addition to color information, the attribute information corresponding to each point also commonly includes a reflectance value, which reflects the surface material of the object. Therefore, the points in a point cloud can include both the location information of the point and the attribute information of the point.
  • the location information of the point can be the three-dimensional coordinate information (x, y, z) of the point.
  • the location information of the point can also be called the geometric information of the point.
  • the attribute information of the point can include color information (three-dimensional color information) and/or reflectance (one-dimensional reflectance information r), etc.
  • the color information can be information on any color space.
  • the color information can be RGB information, where R represents red (Red), G represents green (Green), and B represents blue (Blue).
  • the color information may be luminance and chrominance (YCbCr, YUV) information, where Y represents brightness (Luma), Cb (U) represents blue color difference, and Cr (V) represents red color difference.
  • the points in the point cloud may include the three-dimensional coordinate information of the points and the reflectivity value of the points.
  • the points in the point cloud may include the three-dimensional coordinate information of the points and the three-dimensional color information of the points.
  • a point cloud obtained by combining the principles of laser measurement and photogrammetry may include the three-dimensional coordinate information of the points, the reflectivity value of the points and the three-dimensional color information of the points.
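The codecs described later convert color attributes from RGB to the YUV (YCbCr) space before coding. A common conversion is the BT.601 full-range matrix, used here as an illustrative assumption; the exact matrix a given codec uses may differ.

```python
def rgb_to_ycbcr(r, g, b):
    """BT.601 full-range RGB -> YCbCr (illustrative choice of matrix).

    Y carries brightness (luma); Cb and Cr carry the blue and red
    color differences, offset by 128 for 8-bit representation.
    """
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128
    cr =  0.5 * r - 0.4187 * g - 0.0813 * b + 128
    return y, cb, cr
```

For a neutral gray (r = g = b), Cb and Cr both land at the offset 128, which is why the chroma channels compress well for low-saturation content.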
  • Figures 2A and 2B show a point cloud image and its corresponding data storage format.
  • Figure 2A provides six viewing angles of the point cloud image
  • the data storage format in Figure 2B consists of a file header information part and a data part.
  • the header information includes the data format, data representation type, the total number of point cloud points, and the content represented by the point cloud.
  • the point cloud is in the ".ply" format, represented by ASCII code, with a total number of 207242 points, and each point has three-dimensional coordinate information (x, y, z) and three-dimensional color information (r, g, b).
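The header layout just described can be parsed with a small sketch. The sample header below is illustrative, modeled on the stated example (ASCII ".ply", 207242 points, x/y/z coordinates plus r/g/b colors):

```python
# Hypothetical PLY header matching the example described above.
sample_header = """ply
format ascii 1.0
element vertex 207242
property float x
property float y
property float z
property uchar red
property uchar green
property uchar blue
end_header
"""

def parse_ply_header(text):
    """Extract the vertex count and per-point property names."""
    count, props = 0, []
    for line in text.splitlines():
        parts = line.split()
        if parts[:2] == ["element", "vertex"]:
            count = int(parts[2])
        elif parts and parts[0] == "property":
            props.append(parts[2])  # line shape: property <type> <name>
    return count, props

count, props = parse_ply_header(sample_header)
```

After parsing, `count` is 207242 and `props` lists the six per-point fields, matching the "three-dimensional coordinates plus three-dimensional color" layout above.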
  • Point clouds can be divided into the following categories according to the way they are obtained:
  • Static point cloud: the object is stationary, and the device that obtains the point cloud is also stationary;
  • Dynamic point cloud: the object is moving, but the device that obtains the point cloud is stationary;
  • Dynamically acquired point cloud: the device used to acquire the point cloud is in motion.
  • point clouds can be divided into two categories according to their usage:
  • Category 1: Machine-perception point clouds, which can be used in autonomous navigation systems, real-time inspection systems, geographic information systems, visual sorting robots, disaster relief robots, etc.
  • Category 2: Human-eye-perception point clouds, which can be used in point cloud application scenarios such as digital cultural heritage, free viewpoint broadcasting, 3D immersive communication, and 3D immersive interaction.
  • Point clouds can flexibly and conveniently express the spatial structure and surface properties of three-dimensional objects or scenes. Point clouds are obtained by directly sampling real objects, so they can provide a strong sense of reality while ensuring accuracy. Therefore, they are widely used, including virtual reality games, computer-aided design, geographic information systems, automatic navigation systems, digital cultural heritage, free viewpoint broadcasting, three-dimensional immersive remote presentation, and three-dimensional reconstruction of biological tissues and organs.
  • Point clouds can be collected mainly through the following methods: computer generation, 3D laser scanning, 3D photogrammetry, etc.
  • Computers can generate point clouds of virtual three-dimensional objects and scenes; 3D laser scanning can obtain point clouds of static real-world three-dimensional objects or scenes, and can obtain millions of point clouds per second; 3D photogrammetry can obtain point clouds of dynamic real-world three-dimensional objects or scenes, and can obtain tens of millions of point clouds per second.
  • the number of points in each point cloud frame is 700,000, and each point has coordinate information xyz (float) and color information RGB (uchar).
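A quick back-of-the-envelope calculation for the frame just described (700,000 points, xyz as 32-bit floats plus RGB as unsigned bytes) shows why compression is needed:

```python
# Raw, uncompressed size of one point cloud frame as described above.
points = 700_000
bytes_per_point = 3 * 4 + 3 * 1  # 12 B float xyz + 3 B uchar RGB = 15 B
frame_bytes = points * bytes_per_point
print(frame_bytes)  # 10500000 bytes per frame, about 10 MB
```

At 30 frames per second this is on the order of 300 MB of raw data per second, far beyond what typical network links can carry without compression.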
  • since a point cloud is a collection of a massive number of points, storing the point cloud not only consumes a lot of memory but is also inconvenient for transmission; nor is there enough bandwidth at the network layer to support direct transmission of the point cloud without compression. Therefore, the point cloud needs to be compressed.
  • the point cloud coding framework that can compress point clouds can be the geometry-based point cloud compression (G-PCC) codec framework or the video-based point cloud compression (V-PCC) codec framework provided by the Moving Picture Experts Group (MPEG), or the AVS-PCC codec framework provided by AVS.
  • the G-PCC codec framework can be used to compress the first type of static point clouds and the third type of dynamically acquired point clouds, which can be based on the point cloud compression test platform (Test Model Compression 13, TMC13), and the V-PCC codec framework can be used to compress the second type of dynamic point clouds, which can be based on the point cloud compression test platform (Test Model Compression 2, TMC2). Therefore, the G-PCC codec framework is also called the point cloud codec TMC13, and the V-PCC codec framework is also called the point cloud codec TMC2.
  • FIG3 is a schematic diagram of a network architecture of a point cloud encoding and decoding provided by the embodiment of the present application.
  • the network architecture includes one or more electronic devices 13 to 1N and a communication network 01, wherein the electronic devices 13 to 1N can perform video interaction through the communication network 01.
  • the electronic device can be various types of devices with point cloud encoding and decoding functions.
  • the electronic device can include a mobile phone, a tablet computer, a personal computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensor device, a server, etc., which is not limited by the embodiment of the present application.
  • the decoder or encoder in the embodiment of the present application can be the above-mentioned electronic device.
  • the electronic device in the embodiment of the present application has a point cloud encoding and decoding function, generally including a point cloud encoder (ie, encoder) and a point cloud decoder (ie, decoder).
  • the point cloud data is first divided into multiple slices by slice division.
  • the geometric information of the point cloud and the attribute information corresponding to each point cloud are encoded and decoded separately.
  • FIG4A shows a schematic diagram of the composition framework of a G-PCC encoder.
  • the geometric information is transformed so that all point clouds are contained in a bounding box (Bounding Box), and then quantized.
  • This quantization step mainly plays a scaling role. Due to quantization rounding, the geometric information of some points becomes the same, so whether to remove duplicate points is determined based on parameters.
  • the process of quantization and removal of duplicate points is also called voxelization.
  • the Bounding Box is divided into octrees or a prediction tree is constructed.
  • arithmetic coding is performed on the points in the divided leaf nodes to generate a binary geometric bit stream; or, arithmetic coding is performed on the intersection points (Vertex) generated by the division (surface fitting is performed based on the intersection points) to generate a binary geometric bit stream.
  • color conversion is required first to convert the color information (i.e., attribute information) from the RGB color space to the YUV color space. Then, the point cloud is recolored using the reconstructed geometric information so that the uncoded attribute information corresponds to the reconstructed geometric information. Attribute encoding is mainly performed on color information.
  • FIG4B shows a schematic diagram of the composition framework of a G-PCC decoder.
  • the geometric bit stream and the attribute bit stream in the binary bit stream are first decoded independently.
  • the geometric information of the point cloud is obtained through arithmetic decoding-reconstruction of the octree/reconstruction of the prediction tree-reconstruction of the geometry-coordinate inverse conversion;
  • the attribute information of the point cloud is obtained through arithmetic decoding-inverse quantization-LOD partitioning/RAHT-color inverse conversion, and the point cloud data to be encoded (i.e., the output point cloud) is restored based on the geometric information and attribute information.
  • the current geometric coding of G-PCC can be divided into octree-based geometric coding and prediction-tree-based geometric coding (each marked by a dashed box in the figure).
  • octree-based geometry encoding includes: first, coordinate transformation of geometric information is performed so that all point clouds are contained in a Bounding Box. Then quantization is performed. This step of quantization mainly plays a role in scaling. Due to the quantization rounding, the geometric information of some points is the same. Whether to remove duplicate points is determined according to the parameters. The process of quantization and removal of duplicate points is also called voxelization. Next, the Bounding Box is continuously divided into trees (such as octrees, quadtrees, binary trees, etc.) in the order of breadth-first traversal, and the placeholder code of each node is encoded.
  • the bounding box is divided into sub-cubes in turn, and the non-empty sub-cubes (those containing points of the point cloud) continue to be divided; the division stops when the leaf nodes obtained are 1×1×1 unit cubes.
  • the number of points contained in the leaf node is encoded, and finally the geometric octree encoding is completed to generate a binary code stream.
  • the decoder obtains the placeholder code of each node by continuously parsing in the order of breadth-first traversal, and continuously divides the nodes in turn until a 1×1×1 unit cube is obtained. The number of points contained in each leaf node is parsed, and finally the geometrically reconstructed point cloud information is restored.
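The breadth-first subdivision with one occupancy ("placeholder") code per node can be sketched as follows. The 8-bit code layout and the data structures are illustrative assumptions, not any codec's actual bitstream format; points are integer voxel coordinates and `depth` is the number of subdivision levels.

```python
from collections import deque

def octree_occupancy_codes(points, depth):
    """Breadth-first octree subdivision: emit one 8-bit occupancy code
    per internal node, one bit per child sub-cube (MSB = child 0)."""
    codes = []
    queue = deque([(points, depth)])
    while queue:
        pts, d = queue.popleft()
        if d == 0:
            continue  # reached a 1x1x1 leaf; nothing to subdivide
        # Partition points into the 8 children by the current coordinate bit.
        children = [[] for _ in range(8)]
        bit = d - 1
        for (x, y, z) in pts:
            idx = (((x >> bit) & 1) << 2) | (((y >> bit) & 1) << 1) | ((z >> bit) & 1)
            children[idx].append((x, y, z))
        occ = 0
        for i, child in enumerate(children):
            if child:  # only non-empty children are flagged and subdivided
                occ |= 1 << (7 - i)
                queue.append((child, d - 1))
        codes.append(occ)
    return codes
```

A decoder mirrors this loop: it parses one occupancy code per node in the same breadth-first order and only descends into the children whose bits are set.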
  • geometric information coding based on triangle soup (trisoup)
  • geometric division must also be performed first, but unlike geometric information coding based on binary tree/quadtree/octree, this method does not need to divide the point cloud step by step into unit cubes with a side length of 1×1×1; instead, it stops dividing when the side length of the sub-block is W.
  • the intersections of the point cloud surface with the twelve edges of each block are obtained.
  • the vertex coordinates of each block are encoded in turn to generate a binary code stream.
  • prediction-tree-based predictive geometry coding includes: first, sorting the input point cloud.
  • the sorting methods currently used include unordered, Morton order, azimuth order, and radial distance order.
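Of the sorting orders listed above, Morton (z-order) order can be illustrated with a minimal bit-interleaving sketch: the bits of x, y and z are interleaved so that points close in space tend to be close in the sorted order. The 21-bit width is an illustrative assumption.

```python
def morton3d(x, y, z, bits=21):
    """Interleave the bits of (x, y, z) into a single Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)  # x bit -> highest of each triple
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code

# Sorting points by Morton code gives the traversal order.
pts = [(3, 1, 0), (0, 0, 1), (1, 1, 1)]
pts.sort(key=lambda p: morton3d(*p))
```

Hilbert order serves the same purpose with better locality, at the cost of a more involved index computation.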
  • the prediction tree structure is established using one of two different methods: a high-latency slow mode (using a KD-Tree) and a low-latency fast mode (using lidar calibration information).
  • each point is divided into different lasers (Laser), and the prediction tree structure is established according to different Lasers.
  • each node in the prediction tree is traversed, and the geometric position information of the node is predicted by selecting different prediction modes to obtain the geometric prediction residual, and the geometric prediction residual is quantized using the quantization parameter.
  • the prediction residual of the prediction tree node position information, the prediction tree structure, and the quantization parameter are encoded to generate a binary code stream.
  • the geometric information of the point cloud is first used at the encoding end to perform Morton code sorting, and then the geometric information of the point cloud is predicted and encoded using KD-Tree, similar to a single chain structure that predicts and encodes the geometric information of the child node by using the parent node.
  • the prediction tree adopts a single chain structure, and each tree node has only one child node except for the only leaf node. Except for the root node, which is predicted by the default value, other nodes are provided with geometric prediction values by their parent nodes.
  • the decoding end reconstructs the prediction tree structure by continuously parsing the bit stream, and then obtains the geometric position prediction residual information and quantization parameters of each prediction node through parsing, and dequantizes the prediction residual to recover the reconstructed geometric position information of each node, and finally completes the geometric reconstruction of the decoding end.
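The single-chain parent prediction with residual quantization described above can be sketched as an encode/decode round trip. The quantization step `qs` and the default root predictor `(0, 0, 0)` are illustrative assumptions.

```python
def encode_chain(positions, qs):
    """Each node is predicted from its (reconstructed) parent; only the
    quantized geometric residual is kept."""
    residuals, pred = [], (0, 0, 0)  # root predicted by a default value
    for p in positions:
        residuals.append(tuple(round((a - b) / qs) for a, b in zip(p, pred)))
        # Predict the next node from the reconstructed parent position,
        # matching what the decoder will see.
        pred = tuple(b + r * qs for b, r in zip(pred, residuals[-1]))
    return residuals

def decode_chain(residuals, qs):
    """Accumulate dequantized residuals to recover node positions."""
    positions, pred = [], (0, 0, 0)
    for r in residuals:
        pred = tuple(b + ri * qs for b, ri in zip(pred, r))
        positions.append(pred)
    return positions
```

With `qs = 1` the round trip is lossless; larger steps trade geometric accuracy for smaller residuals.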
  • attribute encoding is mainly performed on color information.
  • the color information is converted from the RGB color space to the YUV color space.
  • the point cloud is recolored using the reconstructed geometric information so that the unencoded attribute information corresponds to the reconstructed geometric information.
  • For color information encoding, there are two main transformation methods. One is the distance-based lifting transform that relies on LOD division, and the other is the direct RAHT transform. Both methods convert the color information from the spatial domain to the frequency domain, obtaining high-frequency and low-frequency coefficients through the transform. Finally, the coefficients are quantized and encoded to generate a binary bit stream (which may be referred to as a "code stream").
  • Figure 7A shows a schematic diagram of the composition framework of an AVS encoder
  • Figure 7B shows a schematic diagram of the composition framework of an AVS decoder.
  • for the geometric encoding of the multi-branch tree, an octree is taken as an example for explanation.
  • the geometric information is first transformed into coordinates so that all point clouds are contained in a Bounding Box.
  • the parameter configuration will determine whether to divide the entire point cloud sequence into multiple slices, and each divided slice will be treated as a single independent point cloud for serial processing.
  • the preprocessing process includes quantization and removal of duplicate points. Quantization mainly plays a role in scaling. Due to the quantization rounding, the geometric information of some points is the same. Whether to remove duplicate points is determined according to the parameters.
  • the Bounding Box is divided in the order of breadth-first traversal (octree/quadtree/binary tree), and the placeholder code of each node is encoded.
  • the bounding box is divided into sub-cubes in turn, and the non-empty sub-cubes (those containing points of the point cloud) continue to be divided; the division stops when the leaf nodes obtained are 1×1×1 unit cubes.
  • the number of points contained in the leaf node is encoded, and finally the geometric octree encoding is completed to generate a binary geometric bit stream (i.e., geometric code stream).
  • the decoder obtains the placeholder code of each node by continuous parsing in the order of breadth-first traversal, and divides the nodes in turn until a 1×1×1 unit cube is obtained. The number of points contained in each leaf node is parsed to finally recover the geometric information.
  • attribute encoding is mainly performed on color and reflectance information. First, determine whether to perform color space conversion. If color space conversion is performed, the color information is converted from RGB color space to YUV color space. Then, the reconstructed point cloud is recolored using the original point cloud so that the unencoded attribute information corresponds to the reconstructed geometric information.
  • Color information encoding is divided into two modules: attribute prediction and attribute transformation.
  • the attribute prediction process is as follows: first, the point cloud is reordered, and then differential prediction is performed. There are two reordering methods: Morton reordering and Hilbert reordering.
  • the attribute prediction of the sorted point cloud is performed using a differential method, and finally the prediction residual is quantized and entropy encoded to generate a binary attribute bit stream.
  • the attribute transformation process is as follows: first, wavelet transform is performed on the point cloud attributes and the transform coefficients are quantized; secondly, the attribute reconstruction value is obtained through inverse quantization and inverse wavelet transform; then the difference between the original attribute and the attribute reconstruction value is calculated to obtain the attribute residual and quantize it; finally, the quantized transform coefficients and attribute residual are entropy encoded to generate a binary attribute bit stream (i.e., attribute code stream).
  • the decoding end performs entropy decoding-inverse quantization-attribute prediction compensation/attribute inverse transform-inverse spatial transform on the attribute bit stream, and finally recovers the attribute information.
  • Condition 1: the geometric position is limitedly lossy and the attributes are lossy;
  • Condition 3: the geometric position is lossless, and the attributes are limitedly lossy;
  • Condition 4: the geometric position and attributes are lossless.
  • Cat1A and Cat2-frame point clouds contain only reflectance attribute information;
  • Cat1B and Cat3 point clouds contain only color attribute information;
  • Cat1C point clouds contain both color and reflectance attribute information.
  • the points in the point cloud are processed in a certain order (the original acquisition order of the point cloud, the Morton order, the Hilbert order, etc.), and the prediction algorithm is first used to obtain the attribute prediction value, and the attribute residual is obtained according to the attribute value and the attribute prediction value. Then, the attribute residual is quantized to generate a quantized residual, and finally the quantized residual is encoded;
  • the points in the point cloud are processed in a certain order (the original acquisition order of the point cloud, Morton order, Hilbert order, etc.).
  • the prediction algorithm is first used to obtain the attribute prediction value, and then the decoding is performed to obtain the quantized residual.
  • the quantized residual is then dequantized, and finally the attribute reconstruction value is obtained based on the attribute prediction value and the dequantized residual.
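The predict / quantize / dequantize / reconstruct loop described above can be sketched for a scalar attribute such as reflectance, using simple previous-point differential prediction. The quantization step `qs` is an illustrative parameter; real codecs use more elaborate neighbor-based predictors.

```python
def encode_attrs(values, qs):
    """Differential prediction: quantize the residual against the
    previously reconstructed value (what the decoder will have)."""
    quantized, pred = [], 0
    for v in values:
        quantized.append(round((v - pred) / qs))
        pred = pred + quantized[-1] * qs  # reconstructed value
    return quantized

def decode_attrs(quantized, qs):
    """Dequantize each residual and accumulate to reconstruct attributes."""
    recon, pred = [], 0
    for q in quantized:
        pred = pred + q * qs
        recon.append(pred)
    return recon
```

Predicting from the *reconstructed* value rather than the original keeps the encoder and decoder in lockstep, so quantization error does not accumulate along the chain.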
  • Prediction transform branch - resources are limited
  • attribute compression adopts a method based on intra-frame prediction and discrete cosine transform (DCT).
  • the points in the point cloud are processed in a certain order (the original acquisition order of the point cloud, the Morton order, the Hilbert order, etc.), and the entire point cloud is first divided into several small groups with a maximum length of Y (such as 2), and then these small groups are combined into several large groups (the number of points in each large group does not exceed X, such as 4096), and then the prediction algorithm is used to obtain the attribute prediction value, and the attribute residual is obtained according to the attribute value and the attribute prediction value.
  • the attribute residual is transformed by DCT in small groups to generate transformation coefficients, and then the transformation coefficients are quantized to generate quantized transformation coefficients, and finally the quantized transformation coefficients are encoded in large groups;
  • the points in the point cloud are processed in a certain order (the original acquisition order of the point cloud, Morton order, Hilbert order, etc.).
  • the entire point cloud is divided into several small groups with a maximum length of Y (such as 2), and then these small groups are combined into several large groups (the number of points in each large group does not exceed X, such as 4096).
  • the quantized transform coefficients are decoded in large groups, and then the prediction algorithm is used to obtain the attribute prediction value.
  • the quantized transform coefficients are dequantized and inversely transformed in small groups.
  • the attribute reconstruction value is obtained based on the attribute prediction value and the dequantized and inversely transformed coefficients.
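The two-level grouping described above (small groups of at most Y points, packed into large groups of at most X points) can be sketched as follows; the function name and the list-of-lists return layout are illustrative assumptions:

```python
# Sketch of the two-level grouping described above: small groups of at most
# y points, packed into large groups of at most x points. The function name
# and the list-of-lists return layout are illustrative assumptions.

def split_into_groups(num_points, y=2, x=4096):
    small = [list(range(i, min(i + y, num_points)))
             for i in range(0, num_points, y)]
    large, current, count = [], [], 0
    for group in small:
        if count + len(group) > x:      # current large group would exceed x points
            large.append(current)
            current, count = [], 0
        current.append(group)
        count += len(group)
    if current:
        large.append(current)
    return large                        # list of large groups of small groups
```

The DCT is then applied per small group, while the quantized coefficients are entropy coded per large group.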
  • Prediction transform branch - resources are not limited. Attribute compression adopts a method based on intra-frame prediction and DCT transform. When encoding the quantized transform coefficients, there is no limit on the maximum number of points X, that is, all coefficients are encoded together.
  • the points in the point cloud are processed in a certain order (the original acquisition order of the point cloud, the Morton order, the Hilbert order, etc.).
  • the entire point cloud is divided into several small groups with a maximum length of Y (such as 2).
  • the prediction algorithm is used to obtain the attribute prediction value.
  • the attribute residual is obtained according to the attribute value and the attribute prediction value.
  • the attribute residual is subjected to DCT transformation in groups to generate transformation coefficients.
  • the transformation coefficients are then quantized to generate quantized transformation coefficients.
  • the quantized transformation coefficients of the entire point cloud are encoded.
  • the points in the point cloud are processed in a certain order (the original acquisition order of the point cloud, Morton order, Hilbert order, etc.).
  • the entire point cloud is divided into several small groups with a maximum length of Y (such as 2), and the quantized transformation coefficients of the entire point cloud are obtained by decoding.
  • the prediction algorithm is used to obtain the attribute prediction value, and then the quantized transformation coefficients are dequantized and inversely transformed in groups.
  • the attribute reconstruction value is obtained based on the attribute prediction value and the dequantized and inversely transformed coefficients.
  • Multi-layer transformation branch attribute compression adopts a method based on multi-layer wavelet transform.
  • the entire point cloud is subjected to multi-layer wavelet transform to generate transform coefficients, which are then quantized to generate quantized transform coefficients, and finally the quantized transform coefficients of the entire point cloud are encoded;
  • decoding obtains the quantized transform coefficients of the entire point cloud, and then dequantizes and inversely transforms the quantized transform coefficients to obtain attribute reconstruction values.
  • geometric information encoding context i.e., the context information of the context model corresponding to the geometric information
  • attribute information encoding context i.e., the context information of the context model corresponding to the attribute information
  • the geometric information encoding context can be divided into multi-tree encoding context (such as octree encoding context) and prediction tree encoding context.
  • the attribute information encoding context is divided into color encoding context and reflectivity attribute encoding context.
  • The specific context information is shown in Table 1 below:
  • if multi-tree context model 1 is used to encode the point cloud, the number of contexts actually applied during encoding and decoding is 208, yet 599 contexts need to be initialized in the codec. If multi-tree context model 2 is used for encoding, the number of contexts actually applied is 256, yet 599 contexts still need to be initialized in the codec. When prediction tree coding is used, the number of contexts actually applied is only 37, yet 599 contexts still need to be initialized in the codec.
  • as a result, context memory is drastically wasted in actual encoding and decoding, which is very unfriendly to hardware implementations.
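The waste can be made concrete with a small accounting sketch; the totals (599 contexts initialized; 208 / 256 / 37 actually used) come from the text above, while the dictionary layout is an illustrative assumption:

```python
# Accounting sketch of the context-memory waste described above. The totals
# (599 contexts initialized; 208 / 256 / 37 actually used) come from the text;
# the dictionary layout is an illustrative assumption.

TOTAL_INITIALIZED = 599

CONTEXTS_USED = {
    "multi_tree_model_1": 208,
    "multi_tree_model_2": 256,
    "prediction_tree": 37,
}

def wasted_contexts(coding_mode):
    """Contexts allocated at initialization but never used for this mode."""
    return TOTAL_INITIALIZED - CONTEXTS_USED[coding_mode]
```

For prediction tree coding, for example, 562 of the 599 initialized contexts go unused.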
  • the contexts required for encoding the point cloud are classified, and the initialized context models are selectively loaded according to the encoding method. In this way, the context models actually used by the codec correspond to the context models allocated at initialization, and no context models are loaded but left unused, which would waste memory.
  • FIG. 8 shows a schematic flowchart of a decoding method provided by an embodiment of the present application. As shown in FIG. 8, the method may include:
  • the decoding method is applied to the process of initializing the context model before performing arithmetic decoding, which may occur in the process of decoding geometric information or in the process of decoding attribute information; the embodiments of the present application are not limited in this regard.
  • a point cloud of the three-dimensional image model to be encoded in space is obtained, and the point cloud may contain geometric information and attribute information of the three-dimensional image model.
  • the geometric information of the point cloud and the attribute information corresponding to each point are encoded separately.
  • the geometric information of the point may also be referred to as the position information of the point, and the position information of the point may be the three-dimensional coordinate information of the point.
  • the attribute information of the point may include color information and/or reflectivity, etc.
  • geometric encoding and attribute encoding are performed on the point cloud data of each slice, that is, the point cloud to be processed.
  • the encoding and decoding method provided in the embodiment of the present application is adopted.
  • the code stream type may include any of: geometric code stream, attribute code stream and sequence-level code stream.
  • the decoder can obtain the code stream type from a byte segment of the code stream by parsing the code stream.
  • different code stream types are represented differently, for example, GPS corresponds to the encoding of geometric information, and APS corresponds to the encoding of attribute information.
  • the decoder can determine whether it is a geometric code stream or an attribute code stream by the parsed code stream type.
  • the code stream type indicates a geometric code stream or an attribute code stream
  • the decoding method provided in the embodiment of the present application is used.
  • header information of the byte stream of the code stream can indicate the code stream type.
  • S102 Determine a context model based on the bitstream type.
  • the decoder can parse to obtain which code stream type the code stream belongs to, so as to determine which context models need to be initialized.
  • the decoder may determine a first context model corresponding to the geometric code stream based on the geometric code stream indicated by the code stream type; or determine a second context model corresponding to the attribute code stream based on the attribute code stream indicated by the code stream type.
  • the decoder when parsing a bitstream, can determine the context model corresponding to the bitstream type according to the bitstream type, and clarify whether it is the decoding of geometric information or the decoding of attribute information. In this way, when initializing the context information of the context model, only the context information of the context model corresponding to the bitstream type can be initialized.
  • a parsed bitstream can only be of one type, and different types of bitstreams are parsed in segments. Therefore, for each bitstream segment it parses, the decoder only needs to initialize, before decoding, the context information of the context model corresponding to that one bitstream type.
  • the context model may include: a multi-tree coding model, a prediction tree coding model, a color coding model and a reflectivity coding model.
  • the geometry code stream corresponds to the multi-tree coding model and the prediction tree coding model.
  • the attribute code stream corresponds to the color coding model and the reflectivity coding model.
  • bitstream type is a geometric bitstream
  • the context information of the multi-tree coding model and the prediction tree coding model is initialized
  • bitstream type is an attribute bitstream
  • the context information of the color coding model and the reflectivity coding model is initialized.
  • the context model initialization process can be understood as the process of allocating memory for the set context model and loading the context model.
  • Each context model can correspond to multiple context information.
  • S104 Decode the current node based on the context information.
  • when performing geometric encoding, the encoder determines the context information corresponding to the current node and encodes the placeholder information of the current node; when performing attribute encoding of the current node, the context information corresponding to the current node is used to encode either the transformed coefficients of the attribute information or the predicted residual information of the attribute. Therefore, in the corresponding decoding process, the decoder first needs to initialize the context information of the set context model, so that when decoding the current node, the specific context information used in encoding can be determined according to the index of the parsed context model, thereby realizing the various ways of decoding the current node.
  • through the parsed bitstream type, when decoding with context information the decoder only needs to initialize the context information of the context model consistent with that bitstream type, so that the context information required for decoding is already loaded at initialization time, thereby ensuring decoding efficiency.
  • not all context models are loaded, which saves the memory resources allocated to context models, thereby reducing memory overhead during decoding and improving the utilization of memory resources.
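A minimal sketch of this selective initialization, assuming a dictionary-of-lists representation of context state; the model names and context counts (464, 37, 49, 49) follow the text:

```python
# Minimal sketch of selective context-model initialization by bitstream type.
# Model names and context counts (464, 37, 49, 49) follow the text; the
# dictionary-of-lists representation of context state is an assumption.

MODELS_BY_STREAM = {
    "geometry":  ["multi_tree", "prediction_tree"],
    "attribute": ["color", "reflectivity"],
}

CONTEXTS_PER_MODEL = {
    "multi_tree": 464,
    "prediction_tree": 37,
    "color": 49,
    "reflectivity": 49,
}

def init_contexts(stream_type):
    """Allocate context state only for models matching the parsed stream type."""
    return {m: [0] * CONTEXTS_PER_MODEL[m] for m in MODELS_BY_STREAM[stream_type]}
```

Decoding a geometry bitstream thus never allocates the color or reflectivity contexts, and vice versa.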
  • the context model includes: a first context model corresponding to a geometric code stream, and a second context model corresponding to an attribute code stream.
  • the process of determining the initialized context model by the decoder based on different code stream types includes:
  • S207 Decode the current node based on the context information of the second context model.
  • the first context model includes: a multitree coding model and a prediction tree coding model corresponding to the geometry code stream.
  • the second context model includes: a color coding model and a reflectivity coding model corresponding to the attribute code stream.
  • the parsing of the geometric code stream and the parsing of the attribute code stream are not performed synchronously.
  • the decoder decodes the current node based on the context information of the second context model
  • the geometric reconstruction information of the current node that has been decoded from the geometric information is also required to realize the decoding of the attribute information of the current node.
  • through the parsed bitstream type, when decoding with context information the decoder only needs to initialize the context information of the first context model or of the second context model, whichever is consistent with that bitstream type, so that the context information required for decoding the geometric information or the attribute information is already loaded at initialization time, thereby ensuring decoding efficiency.
  • not all context models are loaded, which saves the memory resources allocated to context models, thereby reducing memory overhead during decoding and improving the utilization of memory resources.
  • the process of determining the context model by the decoder based on the bitstream type can be further subdivided into the level of the context model corresponding to the prediction method when encoding the geometric information; or, it can be subdivided into the level of the context model corresponding to different attribute information when encoding the attribute information.
  • the coding type is used to indicate the prediction method (multi-tree coding or prediction tree coding) used in encoding the geometric information, or to indicate whether one type or at least two types of attribute information are used in encoding the attribute information.
  • the decoder may determine the first syntax element information based on the bitstream type; and determine the context model based on the encoding type indicated by the first syntax element information.
  • the first syntax element information is a syntax element indicating a coding type parsed from a bitstream.
  • the first syntax element information is geomTreeType in GPS.
  • geomTreeType is used to indicate whether the encoding type is multi-tree encoding or prediction tree encoding.
  • the first syntax element information is an identification bit in the APS, which is used to indicate whether one type of attribute information or several types of attribute information are encoded, that is, which of color encoding, reflectivity encoding and normal vector encoding the encoding types include.
  • when decoding geometric information, the code stream type includes: a geometric code stream; the decoding method further includes:
  • bitstream type indicates a geometry bitstream
  • parse the geometry bitstream to determine first syntax element information related to the geometry bitstream;
  • the first syntax element information indicates a coding type of geometry information;
  • the coding type of geometry information includes: multitree coding or prediction tree coding;
  • S303 and S304 are two optional solutions after S302, and S303 or S304 is executed according to actual circumstances.
  • the context information (464 contexts) of the multi-tree coding model is initialized, and the decoder allocates memory resources for the context information of the multi-tree model.
  • the coding information of the current node is decoded based on the context information (464 contexts) of the multi-tree coding model to obtain the placeholder information of the current node.
  • the coding information of the current node is decoded based on the context information (37 contexts) of the prediction tree coding model to obtain the prediction residual of the current node.
  • by parsing the first syntax element information, the decoder can determine the context model consistent with the coding type, allocate memory resources only to whichever of the multi-tree coding model and the prediction tree coding model is consistent with the coding type, and discard the other. This further refines the classification of the initialized coding models, reduces the occupation of memory resources, and improves resource utilization while ensuring decoding efficiency.
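The memory saved by initializing only the geometry model selected by the first syntax element can be illustrated as follows; the context counts (464 and 37) come from the text, while the bytes-per-context figure is a purely illustrative assumption:

```python
# Illustration of the memory saved by initializing only the geometry model
# selected by the first syntax element. The context counts (464 and 37) come
# from the text; bytes_per_context is a purely illustrative assumption.

def geometry_context_memory(coding_type, bytes_per_context=16):
    n_contexts = 464 if coding_type == "multi_tree" else 37
    return n_contexts * bytes_per_context
```

Under this assumption, a prediction tree decode allocates 592 bytes of context state instead of the 7424 bytes a multi-tree decode would need, and neither path pays for the other.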
  • the context of point cloud information encoding is classified into four categories: multi-branch tree decoding model, prediction tree decoding model, color decoding model and reflectivity decoding model.
  • the model for multi-tree encoding may include multiple multi-tree context models.
  • the multi-tree coding model may include multiple model types such as multi-tree context model 1, multi-tree context model 2 and multi-tree context model 3, which is not limited in the embodiments of the present application.
  • the geometry code stream is parsed to determine the second syntax element information corresponding to the multi-tree coding.
  • a color coding model corresponding to the color coding is determined;
  • a reflectivity coding model corresponding to the reflectivity coding and an attribute decoding method are determined.
  • a normal vector coding model corresponding to the normal vector coding is determined.
  • the decoder allocates memory resources for the context information of the color coding model, the reflectivity coding model and the normal vector coding model.
  • the color coding model (49 contexts) is initialized.
  • the decoder allocates memory resources for the context information of the color coding model.
  • the reflectivity coding model (49 contexts) is initialized.
  • the decoder allocates memory resources for the context information of the reflectivity coding model.
  • a normal vector coding model is initialized, and the decoder allocates memory resources for context information of the normal vector coding model.
  • the decoder can also parse the coding type in the byte segment of each bitstream segment.
  • the process of the decoder determining the context model can also be implemented as follows:
  • if the encoding type of the attribute information in the bitstream indicates normal vector encoding, a second context model corresponding to the normal vector encoding is determined; or,
  • if the encoding type of the attribute information in the bitstream indicates color encoding, a second context model corresponding to the color encoding is determined; or,
  • a second context model corresponding to the reflectivity coding is determined.
  • S407 Decode the current node based on the context information of the second context model.
  • the current node is decoded based on the reflectivity coding model (49 contexts) and the attribute decoding method.
  • the current node is decoded based on the normal vector coding model.
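The per-attribute-type initialization in this flow can be sketched as follows; the 49-context counts for color and reflectivity come from the text, while the normal-vector count is a placeholder assumption:

```python
# Sketch of per-attribute-type model initialization: only the models for the
# attribute types signalled in the APS are loaded. The 49-context counts for
# color and reflectivity come from the text; the normal-vector count is a
# placeholder assumption.

ATTRIBUTE_CONTEXT_COUNTS = {
    "color": 49,
    "reflectivity": 49,
    "normal_vector": 16,   # placeholder: the text does not give this count
}

def init_attribute_contexts(encoding_types):
    return {t: [0] * ATTRIBUTE_CONTEXT_COUNTS[t] for t in encoding_types}
```

A bitstream carrying only reflectivity thus never allocates the color or normal-vector contexts.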
  • the multi-tree decoding model is consistent with the multi-tree encoding model;
  • the prediction tree decoding model is consistent with the prediction tree encoding model;
  • the color decoding model is consistent with the color encoding model;
  • the reflectivity decoding model is consistent with the reflectivity encoding model;
  • the normal vector decoding model is consistent with the normal vector encoding model.
  • the context model actually used by the decoder corresponds to the allocated context model, and there will be no waste of context model memory.
  • the context information of the second context model corresponding to the color coding is initialized in the first decoding core; through the first decoding core, the color coding information of the current node is decoded based on the context information of the second context model (color coding model) corresponding to the color coding.
  • the context information of the second context model corresponding to the reflectivity coding is initialized in the second decoding core; through the second decoding core, the reflectivity information of the current node is decoded based on the context information of the second context model (reflectivity coding model) corresponding to the reflectivity coding.
  • the context information of the second context model corresponding to the normal vector encoding is initialized in the third decoding core; through the third decoding core, the normal vector of the current node is decoded based on the context information of the second context model (normal vector encoding model) corresponding to the normal vector encoding.
  • each attribute information corresponds to a decoding core and can be decoded independently, which can speed up attribute decoding.
  • the context model of color coding, the context model of reflectivity coding, and the context model of normal vector coding are separate from one another and do not generate mutually dependent relationships; that is, the context models for color, reflectivity and normal vector are independent of each other.
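Because the per-attribute context models share no state, each attribute type can be decoded on its own core. A minimal threading sketch, with a placeholder decode step and illustrative context counts:

```python
# Threading sketch of independent per-attribute decoding cores. Since the
# color / reflectivity / normal-vector context models share no state, each
# core can run in parallel. The decode step is a placeholder; the 49-context
# count is illustrative.

import threading

def decode_attribute(name, contexts, results):
    results[name] = len(contexts)      # placeholder for real arithmetic decoding

def parallel_attribute_decode(types):
    contexts = {t: [0] * 49 for t in types}   # one independent model per type
    results = {}
    threads = [threading.Thread(target=decode_attribute,
                                args=(t, contexts[t], results))
               for t in types]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results
```

Each thread touches only its own model's contexts, so no synchronization between attribute types is needed.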
  • FIG. 9 shows a schematic flowchart of an encoding method provided by an embodiment of the present application. As shown in FIG. 9, the method may include:
  • S501 Determine a coding method for the point cloud; the coding method is used to indicate geometric coding or attribute coding.
  • S502 Determine a context model based on the encoding method.
  • the encoding configuration file may indicate information such as the encoding method and encoding type.
  • the encoder determines the encoding method of the point cloud through a configuration file; the encoding method is used to indicate geometric encoding or attribute encoding.
  • the encoder may determine a first context model corresponding to the geometric coding based on the geometric coding indicated by the encoding method; and/or determine a second context model corresponding to the attribute coding based on the attribute coding indicated by the encoding method.
  • the encoder can determine the context model corresponding to the encoding method according to the encoding method, and clearly specify whether it is the encoding of geometric information or the encoding of attribute information. In this way, when initializing the context information of the context model, only the context information of the context model corresponding to the encoding method can be initialized.
  • the encoding method can only be of one type at a time, and different types of encoding are performed independently. Therefore, for each encoded bitstream, the encoder only needs to initialize, before arithmetic coding, the context information of the context model corresponding to that one encoding method.
  • the context model may include: a multi-tree coding model, a prediction tree coding model, a color coding model and a reflectivity coding model.
  • the geometric coding corresponds to the multi-tree coding model and the prediction tree coding model.
  • the attribute coding corresponds to the color coding model and the reflectivity coding model.
  • the context model initialization process can be understood as the process of allocating memory for the set context model and loading the context model.
  • Each context model can correspond to multiple context information.
  • S504 Encode the current node based on the context information.
  • when performing geometric coding, the encoder determines the context information corresponding to the current node and encodes the current node; when performing attribute coding of the current node, the context information corresponding to the current node is used to encode the transformed coefficients of the attribute information.
  • the encoder can determine whether to perform geometric encoding or attribute encoding by determining the encoding method of the point cloud during encoding, and can determine whether to initialize the context information of the context model corresponding to the geometric information or the context information of the context model corresponding to the attribute information according to the encoding method.
  • context models are not loaded for all encoding methods, which saves the memory resources allocated to context models, thereby reducing memory overhead during encoding and improving the utilization of memory resources.
  • the context model includes: a first context model corresponding to geometric coding, and a second context model corresponding to attribute coding.
  • the process of determining the initialized context model based on different encoding methods includes:
  • S604 Encode the current node based on the context information of the first context model.
  • S607 Encode the current node based on the context information of the second context model.
  • the first context model includes: a multitree coding model and a prediction tree coding model corresponding to geometry coding.
  • the second context model includes: a color coding model and a reflectivity coding model corresponding to attribute coding.
  • geometric coding and attribute coding are not performed synchronously.
  • the encoder encodes the current node based on the context information of the second context model
  • the geometric reconstruction information of the current node that has been encoded with the geometric information is also required to realize the encoding of the attribute information of the current node.
  • through the encoding method, when encoding with context information the encoder only needs to initialize the context information of the first context model or of the second context model, whichever is consistent with the encoding method, so that the context information required for encoding the geometric information or the attribute information is already loaded at initialization time, thereby ensuring encoding efficiency.
  • not all context models are loaded, which saves the memory resources allocated to context models, thereby reducing memory overhead during encoding and improving the utilization of memory resources.
  • the process of determining the context model based on the encoding method by the encoder can be further subdivided into the level of the context model corresponding to the prediction method during encoding for geometric information; or, it can be subdivided into the level of the context model corresponding to different attribute information during encoding for attribute information.
  • the coding type is used to indicate the prediction method (multi-tree coding or prediction tree coding) used in the coding process of geometric information, or to indicate whether one type of attribute information or at least two types of attribute information are used in the coding process of attribute information.
  • the encoder determines a coding type corresponding to the coding method; and determines a context model based on the coding type.
  • the geometric information is encoded, and the encoding method further includes:
  • the coding type includes: multi-tree coding or prediction tree coding
  • the encoding type corresponding to the encoding method can be determined, and then based on whether the encoding type indicates multi-tree coding or prediction tree coding, the first context model corresponding to the encoding type used in encoding can be determined.
  • the first context model here is a prediction tree coding model or a multi-tree coding model.
  • the context model includes: a first context model.
  • the encoding type indicates multi-tree encoding
  • all multi-tree context models corresponding to the multi-tree encoding are determined, or the multi-tree context model actually used in encoding is determined from among the multiple multi-tree context models.
  • a prediction tree encoding model corresponding to the prediction tree encoding is determined.
  • the context information (464 contexts) of the multi-tree encoding model is initialized.
  • the encoder allocates memory resources for the context information of the multi-tree model.
  • the context information (37 contexts) of the prediction tree encoding model is initialized.
  • the encoder allocates memory resources for the context information of the prediction tree model.
  • S706 Encode the current node based on the context information of the first context model.
  • the current node is encoded based on the context information (464 contexts) of the multi-tree encoding model.
  • the current node is encoded based on the context information (37 contexts) of the prediction tree coding model.
  • S708 Write the coding type into the bitstream as the first syntax element information.
  • the first syntax element information is geomTreeType in GPS.
  • geomTreeType is used to indicate whether the encoding type is multi-tree encoding or prediction tree encoding.
  • the first syntax element information is an identification bit in the APS, which is used to indicate whether one type of attribute information or multiple types of attribute information are encoded, that is, which of color encoding, reflectivity encoding and normal vector encoding the encoding types include, how many types of attribute information need to be encoded, and what the attribute encoding method is.
  • the attribute encoding method may include: prediction or transformation.
  • information indicating which of color encoding, reflectivity encoding and normal vector encoding the encoding types include, how many types of attribute information need to be encoded, and what the attribute encoding method is can be written into the bitstream as the first syntax element information.
  • through the encoding type, the encoder can determine the context model consistent with the encoding type, allocate memory resources only to whichever of the multi-tree encoding model and the prediction tree encoding model is consistent with the encoding type, and discard the other. This further refines the classification of the initialized coding models, reduces the occupation of memory resources, and improves resource utilization while ensuring encoding efficiency.
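An encoder-side sketch of this selection: only the model matching the coding type is initialized, and the coding type is written into the bitstream as the first syntax element. The 0/1 value mapping of geomTreeType and the dict-based header are illustrative assumptions:

```python
# Encoder-side sketch: initialize only the geometry model matching the coding
# type, and write the coding type into the bitstream header as the first
# syntax element (geomTreeType). The 0/1 value mapping and the dict-based
# header are illustrative assumptions.

def encode_geometry(coding_type):
    header = {"geomTreeType": 0 if coding_type == "multi_tree" else 1}
    contexts = [0] * (464 if coding_type == "multi_tree" else 37)
    # ... arithmetic coding of placeholder information or prediction residuals
    #     using `contexts` would follow here
    return header, contexts
```

The decoder reads geomTreeType back and initializes the same single model, so encoder and decoder context state stay matched.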
  • the context of point cloud information encoding is classified into four categories: multi-branch tree decoding model, prediction tree decoding model, color decoding model and reflectivity decoding model.
  • when the encoder only encodes geometric information, it only needs to choose, according to the point cloud geometric information encoding method, whether to initialize the multi-tree encoding model or the prediction tree encoding model; when the encoder encodes geometric information, it needs to initialize the corresponding geometric encoding model according to the geometric encoding residual. In this way, the context model actually used by the encoder corresponds to the allocated context model, and no context model memory is wasted.
  • the model for multi-tree encoding may include multiple types of multi-tree context models.
  • the multi-tree coding model may include types such as multi-tree context model 1, multi-tree context model 2, and multi-tree context model 3, which is not limited in the embodiments of the present application.
  • a specific context model type corresponding to the multi-tree coding is determined from multiple multi-tree context models, that is, a first context model type corresponding to the multi-tree coding is determined.
  • the multi-tree coding model corresponds to multiple multi-tree coding models (i.e., first context models), and the multiple multi-tree coding models may belong to different types of multi-tree coding models (i.e., types of first context models).
  • the multitree coding model includes: a plurality of multitree coding model types, and each multitree coding model type corresponds to at least one multitree model.
  • the encoder can determine that all multi-tree coding models corresponding to the multi-tree coding are the first context models; it can also specifically determine the type of the first context model corresponding to the multi-tree coding, and determine the multi-tree coding model corresponding to the type of the first context model in the multi-tree coding model as the first context model.
  • the encoder determines the type of a multi-tree context model indicating that the geometric information is encoded using multi-tree encoding, and writes the type of the multi-tree context model into the bitstream as the second syntax element information.
  • the type of the specific model among the multiple multi-tree context models can be determined by geom_context_mode in gbh.
  • if the multi-tree context model type is multi-tree context model 1, the first context model is the multi-tree context corresponding to multi-tree context model 1;
  • if the multi-tree context model type is multi-tree context model 2, the first context model is the multi-tree context corresponding to multi-tree context model 2;
  • if the multi-tree context model type is multi-tree context model 3, the first context model is the multi-tree context corresponding to multi-tree context model 3.
  • the encoder initializes the multi-tree context corresponding to the specific multi-tree context model type indicated in the multi-tree coding model, thereby reducing the memory allocation of other multi-tree context model types.
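A hedged sketch of this dispatch follows; the document only names the geom_context_mode syntax element, so the numeric mode values and the mode-to-type mapping below are assumptions for illustration:

```python
# Sketch: select one multi-tree context model type from a geom_context_mode-style
# syntax element parsed from gbh. The numeric mode values are assumptions.

MULTITREE_MODEL_TYPES = {
    0: "multitree_context_model_1",
    1: "multitree_context_model_2",
    2: "multitree_context_model_3",
}

def select_multitree_model(geom_context_mode):
    """Map the parsed mode value to the single model type to initialize."""
    if geom_context_mode not in MULTITREE_MODEL_TYPES:
        raise ValueError("unsupported geom_context_mode: %d" % geom_context_mode)
    return MULTITREE_MODEL_TYPES[geom_context_mode]

selected = select_multitree_model(1)
```

Only the contexts of the returned model type would then be initialized, leaving the other types unallocated.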
  • the multitree coding model types may include multitree context model 1 and multitree context model 2.
  • the point cloud multitree coding model can be further classified into: multitree coding model 1 and multitree coding model 2.
  • the context of point cloud information coding is classified into five categories: multitree decoding model 1, multitree decoding model 2, prediction tree decoding model, color decoding model and reflectivity decoding model.
  • Octree coding: if octree coding is used, there are two context coding models. Context model 1 is used for cat1-A and cat2 point cloud sequences; context model 2 is used for cat1-B and cat3 sequences.
  • Multi-tree context model 1: this model includes the child-layer neighbor prediction of the current point and the neighbor prediction at the current point's layer.
  • the neighbor information that can be obtained when encoding the child node of the current point includes the neighbor child nodes in the three directions of left, front and bottom.
  • the context model of the child node layer is designed as follows: for the child node layer to be encoded, find the occupancy of the three coplanar, three collinear and one co-point nodes in the left, front and bottom directions of the same layer as the child node to be encoded, as well as the node in the negative direction of the dimension with the shortest node side length, two node side lengths away from the current child node to be encoded.
  • the reference nodes selected by each child node are shown in Figures 10A-10H.
  • the dotted box node is the current node
  • the gray node is the current child node to be encoded
  • the solid box node is the reference node selected by each child node.
  • the occupancy of the 3 coplanar nodes, the 3 collinear nodes, and the node in the negative direction of the dimension with the shortest node side length, two node side lengths away from the current child node to be encoded, is considered in detail.
  • There are 2 possibilities for the common neighbor: occupied or unoccupied. A separate context is assigned to the case where the common neighbor node is occupied. If the common neighbor is also unoccupied, the occupancy of the neighbors at the current node layer, described next, is considered. That is, the neighbors at the child-node layer to be encoded correspond to a total of 127 + 2 − 1 = 128 contexts.
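The 128-context bookkeeping above can be sketched as packing the seven neighbor occupancy flags into one index (a toy illustration; the bit ordering is an assumption):

```python
def child_layer_context_index(neighbor_occupancy):
    """Pack 7 neighbor flags (3 coplanar, 3 collinear, 1 distant node) into
    an index in [0, 127]; index 0 means all seven neighbors are empty, in
    which case the common-neighbor / current-layer fallback applies."""
    assert len(neighbor_occupancy) == 7
    index = 0
    for occupied in neighbor_occupancy:
        index = (index << 1) | (1 if occupied else 0)
    return index

empty = child_layer_context_index([0] * 7)  # all neighbors empty: fall back
full = child_layer_context_index([1] * 7)   # all seven neighbors occupied
```

The 127 non-zero patterns plus the separate common-neighbor-occupied context account for the 128-context figure in the text.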
  • if the left, front and bottom coplanar neighbors are occupied, or the right, top and back collinear neighbors are occupied, the value is 1;
  • if the left, front and bottom coplanar neighbors and the right, top and back collinear neighbors are all unoccupied, while the left, front and bottom collinear neighbors are occupied, the value is 2;
  • if none of the four neighbor groups at the current node layer are occupied, the value is 3.
  • the value thus has 3 possibilities.
  • Multi-tree context model 2: this method uses a two-layer context reference configuration, as shown in formula (1).
  • the first layer is the occupancy of the encoded adjacent blocks of the parent node of the current sub-block to be encoded (i.e., ctxIdxParent), and the second layer is the occupancy of the adjacent encoded blocks at the same depth as the current sub-block to be encoded (i.e., ctxIdxChild).
  • the second-layer ctxIdxChild is as shown in formula (2), which indicates the occupancy of the three encoded sub-blocks at a distance of 1 from the current sub-block to be encoded.
  • each sub-graph shows the relative position relationship of the 6 adjacent parent blocks found by the i-th sub-block, including 3 coplanar parent blocks (P i,0 ,P i,1 ,P i,2 ) and 3 colinear parent blocks (P i,3 ,P i,4 ,P i,5 ).
  • the position relationship between each sub-block and the adjacent parent block is obtained by the method of Table 1.
  • Table 4 corresponds to the Morton sequence in Figure 13. This method takes into account the different sub-block positions and the rotational symmetry about the geometric center. As can be seen from Figure 13, with the current block at the center, this method has a larger receptive field and can use up to 18 adjacent parent blocks that have already been encoded.
  • the method used in formula (3) combines the occupancy pattern of the 3 coplanar parent blocks with the number of occupied blocks among the 3 collinear parent blocks.
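A toy sketch of the combination described for formula (3), assuming the coplanar occupancy is kept as an 8-way bit pattern and the collinear occupancy is reduced to a count (the exact packing in the codec may differ):

```python
def parent_layer_context_index(coplanar, collinear):
    """Combine the occupancy pattern of the 3 coplanar parent blocks
    (2**3 = 8 patterns) with the count of occupied collinear parent
    blocks (0..3), giving 8 * 4 = 32 possible parent-layer contexts."""
    assert len(coplanar) == 3 and len(collinear) == 3
    pattern = (coplanar[0] << 2) | (coplanar[1] << 1) | coplanar[2]
    count = sum(1 for c in collinear if c)
    return pattern * 4 + count

ctx = parent_layer_context_index([1, 0, 1], [1, 1, 0])
```

Keeping the coplanar blocks as a full pattern while summing the collinear blocks reflects the asymmetry the text describes: coplanar neighbors carry more positional information than collinear ones.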
  • the encoder can determine, among multiple multi-tree context models, the context model consistent with the multi-tree context model type used for encoding, allocate memory resources only to that model type, and discard the other multi-tree context models and other coding models. This further refines the classification of the initialized multi-tree coding models, reduces the occupancy of memory resources, and improves resource utilization while maintaining coding efficiency.
  • when encoding attribute information, the encoding method further includes:
  • S802 Determine a coding type corresponding to the coding method; the coding type includes: color coding, reflectivity coding or normal vector coding.
  • when the first syntax element indicates that the encoding method is attribute encoding, the encoder can determine the encoding type corresponding to the attribute encoding, and then, based on whether the encoding type indicates color encoding, reflectivity encoding or normal vector encoding, determine the second context model used in encoding that corresponds to the encoding type.
  • the second context model here is a color encoding model, a reflectivity encoding model, or a normal vector encoding model.
  • the context model includes: a second context model.
  • a normal vector encoding model corresponding to the normal vector encoding is determined.
  • a color encoding model corresponding to the color encoding is determined.
  • the reflectivity encoding model corresponding to the reflectivity encoding is determined.
  • S803-S805 are three optional solutions after S802, and one of S803-S805 is executed according to the actual situation.
  • the normal vector encoding model is initialized and the encoder allocates memory resources for context information of the normal vector encoding model.
  • the color encoding model (49 contexts) is initialized.
  • the encoder allocates memory resources for the context information of the color encoding model.
  • the reflectivity encoding model (49 contexts) is initialized.
  • the encoder allocates memory resources for the context information of the reflectivity encoding model.
  • the current node is encoded based on the normal vector encoding model.
  • the current node is encoded based on the color encoding model (49 contexts).
  • the current node is encoded based on the reflectivity coding model (49 contexts).
  • the context of point cloud information encoding is classified into five categories: the multi-tree coding model, the prediction tree coding model, the color coding model, the reflectivity coding model and the normal vector coding model.
  • multi-tree coding model;
  • prediction tree coding model;
  • color coding model;
  • reflectivity coding model;
  • normal vector coding model.
  • when the encoder encodes both geometric and attribute information, it needs to initialize the corresponding geometric coding model according to the geometric coding residual and the corresponding attribute coding model. For example, if only color is encoded, only the color coding model among the attribute coding models needs to be initialized; if only the reflectivity attribute is encoded, only the reflectivity coding context model needs to be initialized. In this way, the context model actually used by the encoder corresponds to the allocated context model, and no context model memory is wasted.
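A minimal sketch of this attribute-selective initialization, using the 49-context figure the text gives for the color and reflectivity models (the normal vector count is an assumption, since the text does not state it):

```python
ATTRIBUTE_CONTEXT_COUNTS = {
    "color": 49,          # per the text
    "reflectivity": 49,   # per the text
    "normal_vector": 49,  # assumed; the text does not give this count
}

def init_attribute_contexts(encoded_attributes):
    """Allocate context state only for the attributes actually encoded."""
    return {name: [0.5] * ATTRIBUTE_CONTEXT_COUNTS[name]
            for name in encoded_attributes}

color_only = init_attribute_contexts(["color"])
```

Encoding only color leaves the reflectivity and normal vector models entirely unallocated.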
  • the context information of the second context model corresponding to the color coding is initialized in the first encoding core; through the first encoding core, the color information of the current node is encoded based on the context information of the second context model corresponding to the color coding.
  • the context information of the second context model corresponding to the reflectivity coding is initialized in the second encoding core; and the reflectivity information of the current node is encoded based on the context information of the second context model corresponding to the reflectivity coding through the second encoding core.
  • the context information of the second context model corresponding to the normal vector encoding is initialized in the third encoding core; through the third encoding core, the normal vector of the current node is encoded based on the context information of the second context model corresponding to the normal vector encoding.
  • in the encoder, there can be multiple independent encoding cores, and each different type of attribute information corresponds to an independent encoding core.
  • each type of attribute information can be encoded independently, which can speed up attribute encoding.
  • the context model of color encoding, the context model of reflectivity encoding and the context model of normal vector encoding are separate from one another and do not form mutually dependent relationships.
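The per-core independence described above can be sketched with one state object per attribute core; updating one core's contexts leaves the others untouched (the adaptive update below is a toy illustration, not the codec's actual probability model):

```python
class AttributeCore:
    """One independent entropy-coding core per attribute type. Each core
    owns its own context table, so cores never share or depend on state."""

    def __init__(self, name, n_contexts):
        self.name = name
        self.contexts = [0.5] * n_contexts  # toy per-context probabilities

    def code_bit(self, ctx_idx, bit):
        # Toy adaptive update: nudge the context probability toward the bit.
        p = self.contexts[ctx_idx]
        self.contexts[ctx_idx] = p + 0.1 * ((1.0 if bit else 0.0) - p)

color_core = AttributeCore("color", 49)
reflectivity_core = AttributeCore("reflectivity", 49)
color_core.code_bit(0, 1)  # only the color core's state changes
```

Because no state is shared, the cores could also run in parallel, which is the speed-up the text attributes to independent per-attribute encoding.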
  • different information of the current node may be encoded based on the context information, as follows:
  • the encoder determines the index of the context model corresponding to the current node, and determines the target context information from the context information based on the index of the context model; then:
  • if the encoding type indicates multi-tree encoding, the occupancy information of the current node is encoded based on the target context information to obtain the encoding information of the current node;
  • if the encoding type indicates prediction tree encoding, the prediction residual of the current node is encoded based on the target context information to obtain the encoding information of the current node;
  • if the encoding method indicates predictive coding in attribute coding, the prediction residual information of the attribute of the current node is encoded based on the target context information to obtain the encoding information of the current node;
  • if the encoding method indicates transform coding in attribute coding, the transform coefficient of the current node is encoded based on the target context information to obtain the encoding information of the current node.
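The four branches above can be sketched as a single dispatch that picks which quantity of the current node is entropy-coded (the field names are illustrative assumptions):

```python
def payload_to_code(coding_type, node):
    """Select the quantity that is entropy-coded for the current node."""
    if coding_type == "multitree":
        return node["occupancy"]           # occupancy information
    if coding_type == "prediction_tree":
        return node["geometry_residual"]   # geometry prediction residual
    if coding_type == "attribute_predict":
        return node["attribute_residual"]  # attribute prediction residual
    if coding_type == "attribute_transform":
        return node["transform_coeff"]     # transform coefficient
    raise ValueError("unknown coding type: %s" % coding_type)

node = {"occupancy": 0b10110001, "geometry_residual": 3,
        "attribute_residual": -2, "transform_coeff": 17}
```

Whatever the branch, the selected payload is then coded against the target context information and written into the bitstream.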
  • the encoder writes the encoding information of the current node into the bitstream.
  • Some embodiments of the present application provide a code stream, wherein the code stream is generated by bit encoding according to information to be encoded; the information to be encoded includes at least one of the following:
  • the code stream type;
  • the first syntax element information;
  • the second syntax element information;
  • the encoding information of each node in the point cloud.
  • Figure 14 shows a schematic diagram of the composition structure of a decoder 1 provided in an embodiment of the present application.
  • the decoder 1 may include:
  • the decoding part 10 is configured to parse the code stream and determine the type of the code stream
  • a first determining part 11 is configured to determine a context model based on the bitstream type
  • a first initialization part 12 configured to initialize context information of the context model
  • the decoding part 10 is further configured to decode the current node based on the context information.
  • the decoding part 10 is further configured to determine the first syntax element information based on the bitstream type
  • the first determination part 11 is further configured to determine the context model based on the encoding type indicated by the first syntax element information.
  • the decoding part 10 is further configured to determine second syntax element information based on the encoding type indicated by the first syntax element information; the second syntax element information indicates the type of model when the geometric information is encoded using a multi-tree;
  • the first determination part 11 is further configured to determine the context model according to the model corresponding to the type indicated by the second syntax element information.
  • the code stream type includes: a geometric code stream or an attribute code stream;
  • the context model includes: a first context model and a second context model;
  • the first determination part 11 is further configured to determine the first context model corresponding to the geometric code stream based on the geometric code stream indicated by the code stream type; and determine the second context model corresponding to the attribute code stream based on the attribute code stream indicated by the code stream type.
  • the code stream type includes: a geometric code stream or an attribute code stream;
  • the decoding part 10 is further configured to parse the geometry code stream if the code stream type indicates a geometry code stream, and determine first syntax element information related to the geometry code stream; the first syntax element information indicates a coding type of geometry information;
  • if the code stream type indicates an attribute code stream, the attribute code stream is parsed to determine first syntax element information related to the attribute code stream; the first syntax element information indicates a coding type of attribute information.
  • the coding type of the geometric information includes: multi-tree coding or prediction tree coding;
  • the context model includes: a first context model;
  • the first determining part 11 is further configured to determine a first context model corresponding to the multi-tree coding if the first syntax element information indicates the multi-tree coding; or
  • if the first syntax element information indicates prediction tree coding, a first context model corresponding to the prediction tree coding is determined.
  • the encoding type of the attribute information includes: color encoding, reflectivity encoding or normal vector encoding;
  • the context model includes: a second context model;
  • the first determining part 11 is further configured to, if the first syntax element information indicates normal vector coding, determine a second context model corresponding to the normal vector coding and a decoding method of the attribute; or,
  • if the first syntax element information indicates color coding, determine a second context model corresponding to the color coding and a decoding method of the attribute; or,
  • if the first syntax element information indicates reflectivity coding, determine a second context model corresponding to the reflectivity coding and a decoding method of the attribute.
  • the encoding type of the attribute information includes: color encoding, reflectivity encoding or normal vector encoding;
  • the context model includes: a second context model;
  • the first determining part 11 is further configured to determine a second context model corresponding to the normal vector coding if the coding type of the attribute information in the code stream indicates normal vector coding; or
  • if the encoding type of the attribute information in the bitstream indicates color encoding, a second context model corresponding to the color encoding is determined; or,
  • if the encoding type of the attribute information in the bitstream indicates reflectivity coding, a second context model corresponding to the reflectivity coding is determined.
  • the decoding part 10 is further configured to parse the geometry code stream to determine the second syntax element information corresponding to the multi-tree coding if the first syntax element information indicates multi-tree coding.
  • the first initialization part 12 is further configured to, if the encoding type of the attribute information indicates color coding, initialize the context information of the second context model corresponding to the color coding in the first decoding core; if the encoding type of the attribute information indicates reflectivity coding, initialize the context information of the second context model corresponding to the reflectivity coding in the second decoding core; if the encoding type of the attribute information indicates normal vector coding, initialize the context information of the second context model corresponding to the normal vector coding in the third decoding core.
  • the decoding part 10 is also configured to decode the color coding information of the current node through the first decoding core based on the context information of the second context model corresponding to the color coding; to decode the reflectivity coding information of the current node through the second decoding core based on the context information of the second context model corresponding to the reflectivity coding; and to decode the normal vector coding information of the current node through the third decoding core based on the context information of the second context model corresponding to the normal vector coding.
  • a "unit” may be a part of a circuit, a part of a processor, a part of a program or software, etc., and of course, it may be a module, or it may be non-modular.
  • the components in the present embodiment may be integrated into a processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware or in the form of a software functional module.
  • the decoder may include: a first communication interface 1901, a first memory 1902 and a first processor 1903; each component is coupled together through a first bus system 1904. It can be understood that the first bus system 1904 is used to achieve connection and communication between these components.
  • the first bus system 1904 also includes a power bus, a control bus and a status signal bus.
  • various buses are labeled as the first bus system 1904 in Figure 15. Among them,
  • the first communication interface 1901 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
  • a first memory 1902 used to store a computer program that can be run on the first processor 1903;
  • the first processor 1903 is configured to execute the decoding method described by the decoder when running the computer program.
  • the bitstream is parsed to determine the bitstream type; based on the bitstream type, the context model is determined; the context information of the context model is initialized; and based on the context information, the current node is decoded.
  • the decoder can use the parsed bitstream type to decode using context information, and only needs to initialize the context information of the context model consistent with the parsed bitstream type, so that the context information required for decoding is already loaded at the time of initialization, ensuring the efficiency of decoding.
  • all context models are not loaded, saving memory resources allocated to the context model, achieving the purpose of reducing memory overhead during decoding and improving the utilization of memory resources.
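The decoder-side flow can be sketched in the same spirit (the stream type names and context counts are illustrative assumptions; 49 follows the color/reflectivity figures given for the encoder):

```python
def init_decoder_contexts(stream_type, attribute_type=None):
    """Initialize only the context model implied by the parsed stream type."""
    if stream_type == "geometry":
        # One geometry model; the count 128 is an assumed placeholder.
        return {"geometry": [0.5] * 128}
    if stream_type == "attribute":
        if attribute_type is None:
            raise ValueError("attribute stream requires an attribute type")
        # 49 contexts, per the color/reflectivity figures in the text.
        return {attribute_type: [0.5] * 49}
    raise ValueError("unknown stream type: %s" % stream_type)

decoder_contexts = init_decoder_contexts("attribute", "reflectivity")
```

Only the model matching the parsed stream type is ever allocated, which is the memory saving the text describes.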
  • Figure 16 shows a schematic diagram of the composition structure of an encoder 2 provided in an embodiment of the present application.
  • the encoder 2 may include:
  • the second determining part 20 is configured to determine an encoding method of the point cloud; the encoding method is used to indicate geometric encoding or attribute encoding; and determine a context model based on the encoding method;
  • a second initialization part 21 is configured to initialize the context information of the context model when encoding the current node according to the encoding method
  • the encoding part 22 is configured to encode the current node based on the context information.
  • the second determination part 20 is further configured to determine a coding type corresponding to the coding method; and determine the context model based on the coding type.
  • the encoding type when the encoding mode indicates geometric encoding, includes: multi-tree encoding or prediction tree encoding; the context model includes: a first context model;
  • the second determining part 20 is further configured to determine a first context model corresponding to the multi-tree coding or a first context model corresponding to the type of the first context model corresponding to the multi-tree coding if the coding type indicates multi-tree coding; or
  • or, if the coding type indicates prediction tree coding, a first context model corresponding to the prediction tree coding is determined.
  • the encoding type when the encoding mode indicates attribute encoding, includes: color encoding, reflectivity encoding or normal vector encoding; the context model includes: a second context model;
  • the second determining part 20 is further configured to determine a second context model corresponding to the color coding if the coding type indicates color coding; or
  • if the coding type indicates reflectivity coding, a second context model corresponding to the reflectivity coding is determined; or, if the coding type indicates normal vector coding, a second context model corresponding to the normal vector coding is determined.
  • the context model includes: a first context model and a second context model
  • the second determining part 20 is further configured to determine the first context model corresponding to the geometric coding if the coding mode indicates geometric coding;
  • if the coding mode indicates attribute coding, the second context model corresponding to the attribute coding is determined.
  • the second initialization part 21 is further configured to, if the encoding type indicates color encoding, initialize the context information of the second context model corresponding to the color encoding in the first encoding core; if the encoding type indicates reflectivity encoding, initialize the context information of the second context model corresponding to the reflectivity encoding in the second encoding core; and if the encoding type indicates normal vector encoding, initialize the context information of the second context model corresponding to the normal vector encoding in the third encoding core.
  • the encoding part 22 is further configured to encode the color information of the current node based on the context information of the second context model corresponding to the color encoding through the first encoding core;
  • through the second encoding core, the reflectivity information of the current node is encoded based on the context information of the second context model corresponding to the reflectivity encoding; and through the third encoding core, the normal vector of the current node is encoded based on the context information of the second context model corresponding to the normal vector encoding.
  • the encoder 2 further includes: a writing part 23;
  • the writing part 23 is configured to write the encoding method into the code stream as the code stream type.
  • the encoder 2 further includes: a writing part 23;
  • the writing part 23 is configured to write the coding type into the bitstream as the first syntax element information.
  • the encoder 2 further includes: a writing part 23;
  • the writing part 23 is further configured to write the type of the first context model corresponding to the multi-tree coding into the bitstream as the second syntax element information if the coding type indicates multi-tree coding.
  • the encoding part 22 is further configured to determine an index of a context model corresponding to the current node; based on the index of the context model, determine target context information from the context information;
  • if the encoding type indicates multi-tree encoding, the occupancy information of the current node is encoded based on the target context information to obtain the encoding information of the current node;
  • if the encoding type indicates prediction tree encoding, the prediction residual of the current node is encoded based on the target context information to obtain the encoding information of the current node;
  • if the encoding mode indicates predictive coding in attribute coding, the prediction residual information of the attribute of the current node is encoded based on the target context information to obtain the encoding information of the current node;
  • if the encoding mode indicates transform coding in attribute coding, the transform coefficient of the current node is encoded based on the target context information to obtain the encoding information of the current node.
  • the encoder 2 further includes: a writing part 23;
  • the writing part 23 is also configured to write the coding information of the current node into the code stream.
  • the encoder may include: a second communication interface 2001, a second memory 2002 and a second processor 2003; each component is coupled together through a second bus system 2004.
  • the second bus system 2004 is used to achieve connection and communication between these components.
  • the second bus system 2004 also includes a power bus, a control bus and a status signal bus.
  • various buses are marked as the second bus system 2004 in Figure 17. Among them,
  • the second communication interface 2001 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
  • the second memory 2002 is used to store a computer program that can be run on the second processor 2003;
  • the second processor 2003 is configured to execute the encoding method described by the encoder when running the computer program.
  • the encoder can determine whether to perform geometric encoding or attribute encoding by determining the encoding method of the point cloud during encoding, and can determine whether to initialize the context information of the context model corresponding to the geometric information or the context information of the context model corresponding to the attribute information according to the encoding method.
  • the context models of all encoding methods are not loaded, saving the memory resources allocated to the context model, achieving the purpose of reducing the memory overhead during encoding and improving the utilization of memory resources.
  • if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium, including several instructions for a computer device (which can be a personal computer, server, or network device, etc.) or a processor to perform all or part of the steps of the method described in this embodiment.
  • the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
  • an embodiment of the present application provides a computer-readable storage medium, which is applied to an encoder or a decoder.
  • the computer-readable storage medium stores a computer program.
  • when the computer program is executed by a first processor, it implements the decoding method in the aforementioned embodiments; or, when the computer program is executed by a second processor, it implements the encoding method in the aforementioned embodiments.
  • the embodiment of the present application provides a coding and decoding method, a decoder, an encoder, a bitstream and a storage medium.
  • the bitstream is parsed to determine the bitstream type; based on the bitstream type, a context model is determined; the context information of the context model is initialized; and based on the context information, the current node is decoded.
  • the decoder can use the parsed bitstream type to decode using context information, and only needs to initialize the context information of the context model consistent with the parsed bitstream type, so that the context information required for decoding is loaded at the time of initialization, thereby ensuring the efficiency of decoding.
  • all context models are not loaded, thereby saving memory resources allocated to the context model, thereby reducing memory overhead during decoding and improving the utilization of memory resources.
  • the encoder can determine whether to perform geometric encoding or attribute encoding by determining the encoding method of the point cloud during encoding, and can determine whether to initialize the context information of the context model corresponding to the geometric information or the context information of the context model corresponding to the attribute information according to the encoding method. In this way, it is only necessary to initialize the context information of the context model consistent with the parsed encoding method, so that the context information required for encoding is loaded at the time of initialization, ensuring the efficiency of encoding. At the same time, the context models of all encoding methods are not loaded, saving the memory resources allocated to the context model, achieving the purpose of reducing the memory overhead during encoding and improving the utilization of memory resources.

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiments of the present application relate to an encoding method, a decoding method, a decoder, an encoder, a bitstream and a storage medium. The decoding method comprises: parsing a bitstream to determine the bitstream type; determining a context model on the basis of the bitstream type; initializing context information of the context model; and decoding the current node on the basis of the context information. In this way, memory overhead during encoding and decoding can be reduced while ensuring encoding and decoding efficiency, and the utilization of memory resources can be improved.
PCT/CN2022/138185 2022-12-09 2022-12-09 Encoding method, decoding method, decoder, encoder, bitstream and storage medium WO2024119518A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/138185 WO2024119518A1 (fr) 2022-12-09 2022-12-09 Encoding method, decoding method, decoder, encoder, bitstream and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/138185 WO2024119518A1 (fr) 2022-12-09 2022-12-09 Encoding method, decoding method, decoder, encoder, bitstream and storage medium

Publications (1)

Publication Number Publication Date
WO2024119518A1 true WO2024119518A1 (fr) 2024-06-13

Family

ID=91378410

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/138185 WO2024119518A1 (fr) 2022-12-09 2022-12-09 Encoding method, decoding method, decoder, encoder, bitstream and storage medium

Country Status (1)

Country Link
WO (1) WO2024119518A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021261458A1 (fr) * 2020-06-22 2021-12-30 Panasonic Intellectual Property Corporation of America Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
CN114972551A (zh) * 2022-02-11 2022-08-30 Peking University Shenzhen Graduate School Point cloud compression and decompression method
US20220337872A1 (en) * 2021-04-15 2022-10-20 Lg Electronics Inc. Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
CN115443486A (zh) * 2020-04-14 2022-12-06 Panasonic Intellectual Property Corporation of America Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device

Similar Documents

Publication Publication Date Title
CN113615181B (zh) Method and apparatus for point cloud encoding and decoding
JP2022528528A (ja) Method, apparatus, and computer program for point cloud coding
KR102609776B1 (ko) Point cloud data processing method and apparatus
WO2022121648A1 (fr) Point cloud data encoding method, point cloud data decoding method, device, medium, and program product
KR20220127837A (ko) Method and apparatus for Haar-based point cloud coding
JP2022531110A (ja) Method and apparatus for point cloud encoding
Xu et al. Introduction to point cloud compression
US20220180567A1 Method and apparatus for point cloud coding
CN114598883A (zh) Point cloud attribute prediction method, encoder, decoder, and storage medium
WO2024119518A1 (fr) Encoding method, decoding method, decoder, encoder, bitstream, and storage medium
WO2024145910A1 (fr) Encoding method, decoding method, bitstream, encoder, decoder, and storage medium
WO2024145904A1 (fr) Encoding method, decoding method, bitstream, encoder, decoder, and storage medium
WO2024103304A1 (fr) Point cloud encoding method, point cloud decoding method, encoder, decoder, bitstream, and storage medium
WO2024065406A1 (fr) Encoding and decoding methods, bitstream, encoder, decoder, and storage medium
WO2024065269A1 (fr) Point cloud encoding and decoding method and apparatus, device, and storage medium
WO2024119420A1 (fr) Encoding method, decoding method, bitstream, encoder, decoder, and storage medium
WO2024082152A1 (fr) Encoding and decoding methods and apparatuses, encoder and decoder, bitstream, device, and storage medium
WO2023024842A1 (fr) Point cloud encoding/decoding method, apparatus and device, and storage medium
WO2023173238A1 (fr) Encoding method, decoding method, bitstream, encoder, decoder, and storage medium
WO2024145912A1 (fr) Point cloud encoding method and apparatus, point cloud decoding method and apparatus, device, and storage medium
RU2778377C1 Method and device for encoding a point cloud
WO2024145913A1 (fr) Point cloud encoding and decoding method and apparatus, device, and storage medium
WO2024082127A1 (fr) Encoding method, decoding method, bitstream, encoder, decoder, and storage medium
WO2024145933A1 (fr) Point cloud encoding method and apparatus, point cloud decoding method and apparatus, devices, and storage medium
WO2024145935A1 (fr) Point cloud encoding method and apparatus, point cloud decoding method and apparatus, device, and storage medium