WO2023155045A1 - Prediction method and device, encoder, decoder, and encoding and decoding system - Google Patents


Info

Publication number
WO2023155045A1
WO2023155045A1 PCT/CN2022/076368 CN2022076368W WO2023155045A1 WO 2023155045 A1 WO2023155045 A1 WO 2023155045A1 CN 2022076368 W CN2022076368 W CN 2022076368W WO 2023155045 A1 WO2023155045 A1 WO 2023155045A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
current sub
sub
parent
blocks
Prior art date
Application number
PCT/CN2022/076368
Inventor
徐异凌
侯礼志
高粼遥
魏红莲
Original Assignee
上海交通大学
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海交通大学 and Oppo广东移动通信有限公司
Priority to PCT/CN2022/076368
Publication of WO2023155045A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/537 Motion estimation other than block-based
    • H04N19/54 Motion estimation other than block-based using feature points or meshes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • the embodiments of the present application relate to the technical field of point cloud encoding and decoding, and more specifically, relate to a prediction method and device, an encoder, a decoder, and an encoding and decoding system.
  • a prediction method and device, an encoder, a decoder, and an encoding and decoding system, which can help improve the accuracy and stability of the predicted average attribute value of the current sub-block, and can further improve encoding and decoding efficiency.
  • a prediction method including:
  • the hierarchical structure includes a parent block and at least one child block of the parent block;
  • the reference block includes at least one parent block associated with the same level as the parent block of the current sub-block, and/or at least one sub-block associated with the same level as the current sub-block;
  • an encoding method including:
  • a decoding method including:
  • a method for point cloud processing including:
  • attribute information of the newly added point is acquired.
  • a prediction device including:
  • An acquisition unit configured to acquire a hierarchical structure of the point cloud, wherein the hierarchical structure includes a parent block and at least one sub-block of the parent block;
  • a processing unit configured to determine a reference block of the current sub-block in the hierarchical structure, wherein the reference block includes at least one parent block associated with the same level as the parent block of the current sub-block, and/or at least one sub-block associated with the same level as the current sub-block;
  • a neural network model configured to take the information of the reference block and/or the information of the current sub-block as input and to output the predicted attribute value of the current sub-block, wherein the training data of the neural network model includes the information of the reference blocks of sub-blocks and the real attribute values of those sub-blocks.
  • an encoder including:
  • An acquisition unit configured to acquire the predicted attribute value of the current sub-block according to the method described in the first aspect
  • a processing unit configured to determine a predictive transform coefficient of the current sub-block according to the predictive attribute value and the number of points in the current sub-block;
  • the processing unit is further configured to determine the real transformation coefficient of the current sub-block according to the real attribute value of the current sub-block and the number of points in the current sub-block;
  • the processing unit is further configured to determine a difference between the predicted transform coefficients and the true transform coefficients
  • a coding unit configured to write the difference value into a code stream.
  • a decoder including:
  • An acquisition unit configured to acquire the difference between the predicted transform coefficient and the real transform coefficient of the current sub-block according to the code stream
  • the obtaining unit is further configured to obtain the predicted attribute value of the current sub-block according to the method described in the first aspect
  • a processing unit configured to determine the real transform coefficient of the current sub-block according to the predicted attribute value and the difference value
  • the processing unit is further configured to determine the real attribute value of the current sub-block according to the real transform coefficient and the number of points in the current sub-block.
  • the eighth aspect provides a codec system, which is characterized by including the encoder of the fifth aspect and the decoder of the sixth aspect.
  • a point cloud processing device including:
  • the upsampling unit is used to upsample the point cloud to obtain the position information of the newly added point;
  • the obtaining unit is configured to obtain the attribute information of the newly added point according to the method described in the first aspect.
  • an electronic device including a processor and a memory
  • the memory is used to store a computer program
  • the processor is used to call and run the computer program stored in the memory to execute the method in any one of the first aspect to the fourth aspect above.
  • a chip including: a processor, configured to call and run a computer program from a memory, so that a device installed with the chip executes the method of any one of the above first to fourth aspects.
  • a computer-readable storage medium for storing a computer program, and the computer program causes a computer to execute the method of any one of the above-mentioned first aspect to the fourth aspect.
  • a computer program product including computer program instructions, the computer program instructions cause a computer to execute the method of any one of the above first to fourth aspects.
  • a computer program which, when running on a computer, causes the computer to execute the method of any one of the first to fourth aspects above.
  • the reference block for predicting the current sub-block can be flexibly selected, and based on the strong expressive ability of the neural network model, it can help to improve the accuracy and stability of the average attribute value of the current sub-block.
  • Encoding and decoding is performed according to the predicted attribute value of the current sub-block obtained according to the above intra-frame prediction method, and the efficiency of encoding and decoding can be further improved while improving the accuracy and stability of the average attribute value of the current sub-block.
  • Fig. 1 is a schematic diagram of the octree structure involved in the embodiment of the present application
  • FIG. 2A is a schematic diagram of the octree division involved in the embodiment of the present application.
  • FIG. 2B is another schematic diagram of the octree division involved in the embodiment of the present application.
  • FIG. 3 is a schematic diagram of an encoder involved in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a decoder involved in an embodiment of the present application.
  • Fig. 5 is a schematic flow chart of a prediction method provided by an embodiment of the present application.
  • FIG. 6 shows a specific example of a prediction method according to an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of an encoding method provided by an embodiment of the present application.
  • Fig. 8 is a schematic flowchart of a decoding method provided by an embodiment of the present application.
  • Fig. 9 is a schematic block diagram of a prediction device according to an embodiment of the present application.
  • FIG. 10 is a schematic block diagram of an encoder provided in an embodiment of the present application.
  • Fig. 11 is a schematic block diagram of a decoder provided by an embodiment of the present application.
  • Fig. 12 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • This application is applicable to the technical field of point cloud data compression. First, related terms involved in the embodiments of the present application are described.
  • Point cloud is a three-dimensional (3D) representation of the surface of an object, which can refer to a collection of massive points in three-dimensional space. Each point has associated properties such as color, material properties, etc. Exemplarily, a point cloud can be used to reconstruct an object or a scene into a combination of points.
  • the points in the point cloud can include point geometric information and point attribute information.
  • the geometric information of the point may be three-dimensional coordinate information of the point, for example, may be represented by (x, y, z) in a Cartesian coordinate system or any coordinate system.
  • the geometric information of a point may also be referred to as the positional information of a point.
  • these points may have associated attribute information such as color, for example a three-component value in red-green-blue (RGB) or luminance-chrominance (YUV) form.
  • Other attribute information may include transparency, reflectance, normal vectors, etc., without limitation.
  • Point clouds can be static or dynamic.
  • detailed scans or maps of objects or terrain can be static point cloud data
  • environmental scans for machine vision purposes can be dynamic point cloud data.
  • the dynamic point cloud data changes with time
  • the dynamic point cloud can be a sequence of point clouds sorted by time.
  • Point cloud data can be applied to various fields, for example, virtual/augmented reality, machine vision, geographic information system, medical field, etc.
  • the point cloud of the surface of the object can be collected.
  • the number of points in the point cloud is large, for example, up to billions, so the amount of original data of the point cloud is particularly huge. Therefore, effective compression technology, that is, encoding and decoding processing, is required to reduce the amount of point cloud data.
  • the tree structure of the point cloud can represent the division result of the geometric information of the point cloud in the process of encoding or decoding the point cloud.
  • the volume space of the point cloud is recursively divided into sub-volumes, and the corresponding volume space corresponds to the root node of the tree structure, and each sub-volume corresponds to the node of the tree structure.
  • it may be determined whether to further divide the subvolume based on whether the subvolume contains points.
  • Each node may have a placeholder bit indicating whether the subvolume corresponding to the node contains a point.
  • arithmetic coding can be performed on these placeholder bits to obtain a binary code stream.
  • the tree structure may be an octree.
  • the volume space or sub-volumes are cubes, and each split further produces eight sub-volumes/sub-cubes.
  • FIG. 1 shows a schematic diagram of an octree structure.
  • block 10 may be a root node, and may correspond to a volume space of a complete point cloud, such as a cube.
  • the volume space corresponding to the block 10 can be divided into 8 sub-volumes, and each sub-volume corresponds to a block in the dashed box 20 .
  • Block 10 is the parent block (also called the parent node) of the blocks in the dashed box 20, and the blocks in the dashed box 20 are the sub-blocks (also called child nodes) of block 10; the sub-blocks of one parent block are called sibling blocks.
  • the sub-blocks of the block 10 may include a block containing points, and its placeholder bit is 1, indicating that the sub-volume corresponding to the block contains points.
  • the sub-blocks of the block 10 may also include a block that does not contain points, and its placeholder bit is 0, indicating that the sub-volume corresponding to this block does not contain points, that is, the sub-volume is empty.
  • the parent block can be represented by the occupancy bits of its sub-blocks. For example, block 10 can be expressed in binary form of "00001001", indicating that the occupancy bits of sub-blocks 21 and 22 are 1.
  • the corresponding subvolume may be further divided into 8 subvolumes.
  • blocks 21 and 22 are parent blocks corresponding to nodes of 8 sub-volumes that are further divided by their respective sub-volumes, and the 8 sub-volumes that are further divided are sub-blocks, such as the blocks in the dotted line box 30 .
  • block 21 can be expressed in the binary form "01001000", indicating that the occupancy bits of sub-blocks 31 and 32 are 1;
  • block 22 can be expressed in the binary form "00100000", indicating that the occupancy bit of sub-block 33 is 1.
  • arithmetic coding can be performed on these placeholder bits to obtain a binary code stream.
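  • The occupancy strings in the example above can be read mechanically. The following is a minimal sketch, not the application's own procedure: the function name `occupied_children` is hypothetical, and it assumes that position 0 is the leftmost character of the string, since the text does not state which bit position maps to which sub-block.

```python
def occupied_children(occupancy: str) -> list[int]:
    """Return the positions of the '1' bits in an 8-character occupancy string.

    Assumption: position 0 is the leftmost character. The source text does not
    define the mapping between bit positions and sub-blocks.
    """
    if len(occupancy) != 8 or set(occupancy) - {"0", "1"}:
        raise ValueError("expected exactly 8 characters, each '0' or '1'")
    return [i for i, bit in enumerate(occupancy) if bit == "1"]


if __name__ == "__main__":
    # Block 10 in FIG. 1: two occupied sub-blocks (blocks 21 and 22).
    print(occupied_children("00001001"))  # [4, 7]
    # Block 21 in FIG. 1: two occupied sub-blocks (blocks 31 and 32).
    print(occupied_children("01001000"))  # [1, 4]
```

  • The length check reflects that an octree node has exactly eight sub-blocks, so each parent block is described by eight occupancy bits.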
  • block 10 may also be a block corresponding to a subvolume, that is, the octree structure in FIG. 1 may be a part of the octree structure corresponding to the complete point cloud, which is not limited in this application.
  • blocks with the same depth in the octree structure can form a layer.
  • the octree structure may include at least two layers, each layer may include at least one block, and each block may correspond to a sub-volume.
  • the octree structure is a hierarchical structure.
  • the tree structure of the point cloud may also be at least one of hierarchical structures such as a quadtree structure, a binary tree structure, and an uneven space division structure, which is not limited.
  • the depth value of each block in the dashed box 20 is 1, and belongs to one layer.
  • the layer corresponding to the dotted box 20 may be the 0th layer of the octree structure.
  • each block in the dashed box 30 has a depth value of 2 and belongs to one layer.
  • the layer corresponding to the dotted box 30 may be the first layer of the octree structure.
  • the octree structure may have blocks with a greater depth, corresponding to more layers.
  • FIG. 2A shows a schematic diagram of the spatial positions of 8 sub-blocks (ie, sub-blocks 0-7) generated by octree division relative to their parent blocks (ie, the current block).
  • the neighbor reference information of the same layer can be obtained, for example the occupancy information of the neighbor sub-blocks in the three directions left, front and bottom (i.e. the negative directions of the x, y and z axes of the coordinate system).
  • at least one neighbor among 3 co-planar, 3 co-linear and 1 common vertex neighbors of the same layer may be used as a reference block.
  • FIG. 2B shows examples of coplanar and collinear neighbors of a block in the same layer. From left to right: the right, upper and rear coplanar neighbors; the left, front and lower coplanar neighbors; the right, upper and rear collinear neighbors; and the left, front and lower collinear neighbors.
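  • The count of 3 coplanar, 3 collinear and 1 common-vertex neighbors in the negative x, y and z directions can be checked by enumeration. This is an illustrative sketch only; the function name and the offset representation (integer steps of one block per axis) are assumptions, not part of the application.

```python
from itertools import product


def classify_negative_neighbors() -> dict[str, list[tuple[int, int, int]]]:
    """Group the neighbor offsets in the negative x/y/z directions.

    An offset with one non-zero component shares a face (coplanar), with two
    non-zero components shares an edge (collinear), and with three non-zero
    components shares only a vertex.
    """
    names = {1: "coplanar", 2: "collinear", 3: "common_vertex"}
    groups = {name: [] for name in names.values()}
    for offset in product((-1, 0), repeat=3):
        nonzero = sum(1 for c in offset if c != 0)
        if nonzero == 0:
            continue  # (0, 0, 0) is the block itself, not a neighbor
        groups[names[nonzero]].append(offset)
    return groups


if __name__ == "__main__":
    for name, offsets in classify_negative_neighbors().items():
        print(name, len(offsets), offsets)
```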
  • Fig. 3 is a schematic block diagram of an encoder 100 provided by an embodiment of the present application.
  • the encoder 100 may be a G-PCC encoder.
  • the input to the encoder 100 includes geometric information and attribute information of the point cloud.
  • the input point cloud may be divided into slices, and each obtained slice may be independently coded.
  • the geometric information and attribute information of the point cloud are encoded separately.
  • the encoder 100 can perform coordinate transformation on geometric information, so that all points of the point cloud are contained in a bounding box.
  • the bounding box may be referred to as the volume space corresponding to the point cloud.
  • a voxelization process can then be performed, e.g. including quantization and removal of duplicate points.
  • the bounding box can be divided into octrees. According to the depth of the octree division level, the encoding of geometric information can be divided into encoding based on octree and encoding based on triangle soup (trisoup).
  • the bounding box can be divided into 8 sub-cubes, and the occupancy bits of the sub-cubes can be recorded.
  • the occupancy bit of the sub-cube is 1, indicating that the sub-cube is not empty, in other words, the sub-cube is occupied by points in the point cloud, that is, the sub-cube contains points in the point cloud.
  • the occupancy bit of the sub-cube is 0, indicating that the sub-cube is empty, in other words, the sub-cube is not occupied by points in the point cloud, that is, the sub-cube does not contain points in the point cloud.
  • the division may be stopped when the resulting leaf nodes are 1x1x1 unit cubes.
  • a sub-cube may be called a sub-volume, which is obtained by dividing a bounding box or a volume space.
  • the bounding box may be called a root node
  • each sub-cube may be called a child node of the root node, that is, a sub-block.
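  • The recursive division described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the G-PCC procedure: the function name `encode_octree`, the depth-first output order, and the mapping of (x, y, z) halves to sub-block indices are all hypothetical choices, and the bounding box is assumed to be a cube with a power-of-two side length whose origin is (0, 0, 0).

```python
def encode_octree(points, size):
    """Return one 8-character occupancy string per divided block, depth-first.

    points: iterable of integer (x, y, z) tuples inside [0, size)^3
    size:   side length of the bounding cube, assumed to be a power of two
    """
    codes = []

    def recurse(pts, ox, oy, oz, side):
        if side == 1:
            return  # 1x1x1 unit cube: stop dividing
        half = side // 2
        buckets = [[] for _ in range(8)]
        for x, y, z in pts:
            # Assumed index layout: bit 2 = upper x half, bit 1 = y, bit 0 = z.
            idx = ((x - ox >= half) << 2) | ((y - oy >= half) << 1) | int(z - oz >= half)
            buckets[idx].append((x, y, z))
        codes.append("".join("1" if b else "0" for b in buckets))
        for idx, bucket in enumerate(buckets):
            if bucket:  # only occupied sub-cubes are divided further
                recurse(bucket,
                        ox + half * ((idx >> 2) & 1),
                        oy + half * ((idx >> 1) & 1),
                        oz + half * (idx & 1),
                        half)

    recurse(list(set(points)), 0, 0, 0, size)  # set() removes duplicate points
    return codes


if __name__ == "__main__":
    print(encode_octree([(0, 0, 0), (3, 3, 3)], 4))
    # ['10000001', '10000000', '00000001']
```

  • In an actual encoder the resulting occupancy bits would then be context-modelled and arithmetic-coded; that stage is omitted here.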
  • the spatial correlation between the block and surrounding blocks can be used to perform intra-frame prediction on the occupancy bits.
  • context modeling can be performed to obtain the context information of the block, and arithmetic coding (such as adaptive binary arithmetic coding) is performed based on the context information to generate a binary code stream, that is, a geometric code stream.
  • the octree division is also performed. Unlike the octree-based encoding process, the trisoup-based encoding process does not need to divide the point cloud down to unit cubes of size 1x1x1; instead, division stops when the side length of a block is W. Based on the surface formed by the distribution of points in each block, the at most twelve intersection points (vertices) between that surface and the twelve edges of the block are obtained. The vertex coordinates of each block are encoded sequentially to generate a binary code stream, that is, a geometric code stream.
  • After the G-PCC encoder completes the encoding of the geometric information, it reconstructs the geometric information and uses the reconstructed geometric information to encode the attribute information of the point cloud.
  • the attribute information encoding of the point cloud mainly encodes the color information of the points in the point cloud.
  • the encoder can perform color conversion on the color information of the points. For example, when the color information of points in the input point cloud is expressed in RGB color space, the encoder can convert the color information from RGB color space to YUV color space. Then, the point cloud is recolored with the reconstructed geometry information so that the unencoded attribute information corresponds to the reconstructed geometry information. Then, the color information is transformed.
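  • As an illustration of the color conversion step, the sketch below converts one 8-bit RGB triple to a luminance-chrominance triple. The text does not say which conversion matrix is used; the full-range BT.601 (JPEG-style) coefficients here are an assumption chosen only for the example, and `rgb_to_yuv` is a hypothetical name.

```python
def rgb_to_yuv(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Convert 8-bit RGB to full-range YUV (YCbCr).

    Assumption: BT.601 full-range coefficients with chroma offset 128.
    The source text does not specify the matrix.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return y, u, v


if __name__ == "__main__":
    # A neutral gray keeps its luminance and has centered chroma.
    print(rgb_to_yuv(128, 128, 128))
```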
  • RAHT: Region Adaptive Hierarchical Transform
  • Morton codes can be used to sort the point cloud, and the nearest neighbors of the point to be encoded (also called the point to be predicted) can be searched using the geometric spatial relationship. The reconstructed attribute values of the found neighbors can be used to interpolate a predicted attribute value for the point to be encoded; the difference between the real attribute value and the predicted attribute value gives the prediction residual, which is finally quantized and arithmetically encoded to obtain a binary code stream.
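  • The Morton ordering mentioned above interleaves the bits of the three coordinates so that points close in space tend to be close in the sorted order. The sketch below shows only that sorting step; the bit order (x in the lowest bit of each triple) and the name `morton3d` are assumptions for illustration.

```python
def morton3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y and z into one Morton code.

    Assumed layout: bit i of x goes to position 3*i, y to 3*i + 1, z to 3*i + 2.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


if __name__ == "__main__":
    points = [(1, 1, 1), (0, 0, 1), (2, 0, 0), (1, 0, 0)]
    ordered = sorted(points, key=lambda p: morton3d(*p))
    print(ordered)  # [(1, 0, 0), (0, 0, 1), (1, 1, 1), (2, 0, 0)]
```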
  • FIG. 4 is a schematic block diagram of a decoder 200 provided by an embodiment of the present application.
  • the input of the decoder 200 includes the geometry code stream and the attribute code stream of the point cloud, and the geometry code stream and the attribute code stream of the point cloud are decoded separately.
  • the decoder 200 performs arithmetic decoding, context modeling, octree division, inverse quantization, and inverse coordinate transformation on the input geometric code stream to obtain geometric information, and performs arithmetic decoding, inverse quantization, inverse transformation, attribute reconstruction and inverse color conversion on the input attribute code stream to obtain attribute information.
  • the decoding process and the encoding process are reciprocal.
  • in the RAHT transformation, the first transformation coefficient of the parent block is called the direct-current (DC) coefficient
  • the other coefficients are directly inherited from the transformation coefficients of the sub-blocks.
  • the DC coefficient of the parent block is obtained by transforming the DC coefficients of the sub-blocks, and the transformation matrix depends on the number of points in each sub-block, namely:
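  • For illustration, the following sketch merges the DC coefficients of two sub-blocks using the two-input RAHT butterfly as it is commonly published, in which the matrix entries are square roots of the point counts normalized by the total count. This is an assumption about the form of the matrix, not a reproduction of the application's equation, and `raht_merge` is a hypothetical name.

```python
import math


def raht_merge(c1: float, w1: int, c2: float, w2: int) -> tuple[float, float, int]:
    """Merge two sub-block DC coefficients into a parent DC and an AC coefficient.

    c1, c2: DC coefficients of the two sub-blocks
    w1, w2: number of points in each sub-block
    Assumed matrix: 1/sqrt(w1+w2) * [[sqrt(w1), sqrt(w2)], [-sqrt(w2), sqrt(w1)]]
    """
    s = math.sqrt(w1 + w2)
    a, b = math.sqrt(w1) / s, math.sqrt(w2) / s
    dc = a * c1 + b * c2
    ac = -b * c1 + a * c2
    return dc, ac, w1 + w2


if __name__ == "__main__":
    # Two single-point sub-blocks with equal attribute values give a zero AC term.
    print(raht_merge(1.0, 1, 1.0, 1))
```

  • Because the assumed matrix is orthonormal, the transform preserves energy: dc**2 + ac**2 equals c1**2 + c2**2.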
  • the transformed coefficients are finally quantized and entropy encoded, and the attribute value of each point can be reconstructed at the decoding end according to the transformed coefficients obtained through decoding.
  • Another RAHT intra-frame prediction method adjusts RAHT to transform downward step by step from the largest parent block.
  • the DC coefficient of the largest parent block is defined in terms of w, the number of points in the entire parent block, and a_i, the attribute value of each point; its AC coefficient is 0.
  • the DC coefficient and AC coefficient of each sub-block are calculated through the transformation matrix; in the above example, namely:
  • the obtained DC coefficients continue to be used for the transformation of the next level, and the AC coefficients are used for direct coding and writing into the code stream.
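  • The top-down direction can be sketched as the inverse of the two-input butterfly. Two assumptions are made here and are not confirmed by the text: the root DC coefficient is taken as the sum of the attribute values divided by the square root of the point count (the usual RAHT normalization), and the transformation matrix is the orthonormal matrix built from the square roots of the point counts, so its inverse is its transpose. The names `root_dc` and `raht_split` are hypothetical.

```python
import math


def root_dc(attributes: list[float]) -> float:
    """Assumed root DC coefficient: sum(a_i) / sqrt(w), with w = number of points."""
    return sum(attributes) / math.sqrt(len(attributes))


def raht_split(dc: float, ac: float, w1: int, w2: int) -> tuple[float, float]:
    """Recover the DC coefficients of two sub-blocks from the parent DC and AC.

    Uses the transpose of the assumed orthonormal butterfly matrix.
    """
    s = math.sqrt(w1 + w2)
    a, b = math.sqrt(w1) / s, math.sqrt(w2) / s
    c1 = a * dc - b * ac
    c2 = b * dc + a * ac
    return c1, c2


if __name__ == "__main__":
    # Parent with three points: sub-block 1 holds [10, 20], sub-block 2 holds [30].
    dc = root_dc([10.0, 20.0, 30.0])
    c1, c2 = root_dc([10.0, 20.0]), root_dc([30.0])
    a, b = math.sqrt(2 / 3), math.sqrt(1 / 3)
    ac = -b * c1 + a * c2  # the AC coefficient the encoder would write
    print(raht_split(dc, ac, 2, 1), (c1, c2))
```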
  • the transformation coefficient of the sub-block can be calculated, and a prediction method is introduced to predict the attribute value of the sub-block: for each sub-block of a parent block, obtain the average attribute values of the parent block containing the sub-block, the 6 parent blocks at the same level that are coplanar with that parent block, and the 12 parent blocks at the same level that are collinear with that parent block (19 parent blocks in total), and calculate the distance between the sub-block and each of the 19 parent blocks; the predicted average attribute value of the sub-block is then:
  • a_up is the average attribute value of the sub-block to be predicted
  • k is the number of non-empty parent blocks among the 19 parent blocks
  • d_k is the distance between the current sub-block and the non-empty parent block
  • a_k is the average attribute value of the non-empty parent block.
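  • A distance-weighted prediction over the non-empty parent blocks can be sketched as below. The weights 1/d_k (inverse distance, normalized by their sum) are an assumption made for this example; the text only states that the parent-block values are linearly weighted using the distances. The function name is hypothetical.

```python
def predict_average_attribute(neighbors: list[tuple[float, float]]) -> float:
    """Predict a_up from (d_k, a_k) pairs of the non-empty parent blocks.

    Assumed weighting: a_up = sum(a_k / d_k) / sum(1 / d_k).
    Empty parent blocks are expected to be left out of `neighbors`.
    """
    if not neighbors:
        raise ValueError("at least one non-empty parent block is required")
    if any(d <= 0 for d, _ in neighbors):
        raise ValueError("distances must be positive")
    numerator = sum(a / d for d, a in neighbors)
    denominator = sum(1.0 / d for d, _ in neighbors)
    return numerator / denominator


if __name__ == "__main__":
    # The closer block (distance 1) counts twice as much as the one at distance 2.
    print(predict_average_attribute([(1.0, 10.0), (2.0, 40.0)]))  # 20.0
```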
  • the average predicted attribute value of each sub-block is obtained, and a RAHT transformation is performed using the predicted attribute value to obtain a series of predicted transformation coefficients, namely:
  • the residuals of the AC coefficients are directly quantized and entropy encoded, and the DC coefficients continue to be used for the transformation of the next level. In this way, what is finally quantized and entropy encoded is the difference between the real transform coefficient and the predicted transform coefficient.
  • In the above process of predicting the attribute value of a sub-block from parent blocks, only the attribute values of 19 parent blocks (the parent block itself, the 6 parent blocks at the same level that are coplanar with it, and the 12 parent blocks at the same level that are collinear with it) are used, and they are simply linearly weighted to predict the average attribute value of the sub-block. This results in low accuracy and poor stability of the predicted average attribute value of the sub-block, which affects the encoding and decoding efficiency.
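  • The count of 19 parent blocks follows from simple enumeration: in the 3x3x3 neighborhood of a parent block, the block itself plus its 6 face neighbors plus its 12 edge neighbors leaves out only the 8 corner neighbors. The sketch below checks this; the offset representation and the function name are illustrative assumptions.

```python
from itertools import product


def reference_parent_offsets() -> list[tuple[int, int, int]]:
    """Offsets of the parent block itself, its coplanar and its collinear neighbors.

    An offset with 0 non-zero components is the block itself, 1 is coplanar
    (shares a face), 2 is collinear (shares an edge). Offsets with 3 non-zero
    components (corner neighbors) are excluded.
    """
    return [o for o in product((-1, 0, 1), repeat=3)
            if sum(1 for c in o if c != 0) <= 2]


if __name__ == "__main__":
    offsets = reference_parent_offsets()
    print(len(offsets))  # 19
```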
  • the embodiment of the present application provides a prediction method, which first determines the reference block of the current sub-block in the hierarchical structure of the point cloud, and then inputs the information of the reference block and/or the information of the current sub-block into the neural network model to obtain the predicted attribute value of the current sub-block, wherein the reference block includes at least one parent block associated with the same level as the parent block of the current sub-block, and/or at least one sub-block associated with the same level as the current sub-block, and the training data of the neural network model includes the information of the reference blocks of sub-blocks and the real attribute values of those sub-blocks. Because the reference block used to predict the current sub-block can be selected flexibly, and because of the strong expressive ability of the neural network model, the embodiment of the present application can help improve the accuracy and stability of the predicted average attribute value of the current sub-block.
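  • To make the data flow concrete, the sketch below runs a forward pass of a very small fully connected network that maps reference-block information to one predicted attribute value. Everything about the network here is hypothetical: the text does not specify the architecture, the input layout, or the weights, so the single hidden layer, the ReLU activation, and the name `mlp_predict` are assumptions for illustration only.

```python
def mlp_predict(features, w1, b1, w2, b2):
    """Forward pass of a one-hidden-layer network (hypothetical architecture).

    features: list of floats, e.g. average attribute values of reference blocks
    w1, b1:   hidden-layer weights (one row per hidden unit) and biases
    w2, b2:   output-layer weights (one per hidden unit) and bias
    """
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features)) + b)  # ReLU
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


if __name__ == "__main__":
    # Hand-picked weights that simply average two reference values.
    print(mlp_predict([1.0, 2.0],
                      w1=[[1.0, 0.0], [0.0, 1.0]], b1=[0.0, 0.0],
                      w2=[0.5, 0.5], b2=0.0))  # 1.5
```

  • In training, such a model would be fitted so that its output matches the real attribute values of sub-blocks given the information of their reference blocks, as the paragraph above describes.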
  • the embodiment of the present application also provides an encoding method, which can perform encoding according to the predicted attribute value of the current sub-block obtained according to the above prediction method, to obtain a code stream of the point cloud.
  • the predictive transformation coefficient of the current sub-block may be determined according to the predicted attribute value of the current sub-block and the number of points in the current sub-block, and according to the real attribute value of the current sub-block and the number of points in the current sub-block, The real transform coefficient of the current sub-block is determined, and then the difference between the predicted transform coefficient and the real transform coefficient is determined, and the difference is written into the code stream.
  • the embodiment of the present application also provides a decoding method, which can decode according to the predicted attribute value of the current sub-block obtained according to the above prediction method, to obtain the attribute information of the point cloud.
  • the difference between the predicted transformation coefficient and the real transformation coefficient of the current sub-block may be obtained according to the code stream, and the real transformation coefficient of the current sub-block may be determined according to the predicted attribute value of the current sub-block and the difference, and According to the real transformation coefficient and the number of points in the current sub-block, the real attribute value of the current sub-block is determined.
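  • The encoder and decoder steps above form a round trip, which the following sketch illustrates. Several simplifications are assumed and are not taken from the text: the transform coefficient of a sub-block is modelled as its average attribute value multiplied by the square root of its point count (a DC-like coefficient), the difference is taken as real minus predicted, and quantization and entropy coding are omitted. All function names are hypothetical.

```python
import math


def transform_coefficient(avg_attr: float, n_points: int) -> float:
    """Assumed DC-like coefficient: average attribute value times sqrt(point count)."""
    return avg_attr * math.sqrt(n_points)


def encode_difference(real_attr: float, pred_attr: float, n_points: int) -> float:
    """Encoder side: difference between real and predicted transform coefficients."""
    return (transform_coefficient(real_attr, n_points)
            - transform_coefficient(pred_attr, n_points))


def decode_attribute(difference: float, pred_attr: float, n_points: int) -> float:
    """Decoder side: rebuild the real coefficient, then the real attribute value."""
    real_coeff = transform_coefficient(pred_attr, n_points) + difference
    return real_coeff / math.sqrt(n_points)


if __name__ == "__main__":
    diff = encode_difference(real_attr=100.0, pred_attr=96.0, n_points=4)
    print(diff)                             # 8.0
    print(decode_attribute(diff, 96.0, 4))  # 100.0
```

  • The closer the predicted attribute value is to the real one, the smaller the difference written into the code stream, which is why a more accurate prediction can improve coding efficiency.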
  • encoding and decoding are performed according to the predicted attribute value of the current sub-block obtained according to the above prediction method, and the efficiency of encoding and decoding can be further improved when the accuracy and stability of the average attribute value of the current sub-block are improved.
  • the embodiment of the present application also provides a point cloud processing method, which can upsample the point cloud to obtain the position information of the newly added points, and obtain the attribute information of the newly added points according to the above prediction method, so as to help increase the density of the point cloud.
  • FIG. 5 shows a schematic flowchart of a prediction method 300 provided by an embodiment of the present application.
  • the prediction method 300 can be applied to the encoder 100 shown in FIG. 3 or the decoder 200 shown in FIG. 4 to implement compression encoding and decoding of point clouds.
  • the method 300 includes steps 310 to 330 .
  • the hierarchical structure may include at least one of an octree structure, a quadtree structure, a binary tree structure, and an uneven space division structure.
  • the octree structure may refer to the descriptions in FIG. 1 and FIG. 2 above, and details are not repeated here.
  • 320. Determine a reference block of the current sub-block in the hierarchical structure, wherein the reference block includes at least one parent block associated with the same level as the parent block of the current sub-block, and/or at least one sub-block associated with the same level as the current sub-block.
  • the reference block is used to predict the predicted attribute value of the current sub-block.
  • the current sub-block is the block whose predicted attribute value is currently to be predicted, for example, may be a certain sub-cube in the octree structure, which is not limited.
  • the current sub-block may also be called a sub-block to be predicted, which is not limited.
  • At least one parent block associated with the same level as the parent block of the current sub-block may include at least one of the following:
  • the parent block of the current sub-block; parent blocks at the same level as the parent block of the current sub-block; and parent blocks at the same level as the parent block of the current sub-block that are two parent blocks away from it along the positive or negative x axis, …
  • At least one parent block associated with the same level as the parent block of the current sub-block may be referred to as a reference parent block range for prediction.
  • At least one sub-block associated with the same level of the current sub-block may include at least one of the following:
  • sub-blocks at the same level that are coplanar with the current sub-block; sub-blocks at the same level that are collinear with the current sub-block; sub-blocks at the same level that are co-point (share a vertex) with the current sub-block; sub-blocks at the same level that are two sub-blocks away from the current sub-block along the positive or negative x axis; sub-blocks at the same level that are two sub-blocks away from the current sub-block along the positive or negative y axis; and sub-blocks at the same level that are two sub-blocks away from the current sub-block along the positive or negative z axis.
  • At least one sub-block associated with the same level of the current sub-block may be referred to as a range of reference sub-blocks used for prediction.
  • the sum of the range of the reference parent block used for prediction and the range of the reference sub-block used for prediction is the range of the reference block used for prediction.
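  • As an illustration of the reference range, the following minimal Python sketch enumerates neighbour offsets around a block on a regular grid. It assumes that "coplanar", "collinear" and "co-point" mean sharing a face, an edge and a vertex respectively, which gives 6, 12 and 8 neighbours; the function names are hypothetical.

```python
from itertools import product


def classify_offsets():
    """Group the 27 offsets in {-1, 0, 1}^3 by how the neighbour touches the centre block.

    Assumption: coplanar = shares a face (1 non-zero component),
    collinear = shares an edge (2), co-point = shares a vertex (3).
    """
    names = {0: "self", 1: "coplanar", 2: "collinear", 3: "co_point"}
    groups = {name: [] for name in names.values()}
    for offset in product((-1, 0, 1), repeat=3):
        nonzero = sum(1 for c in offset if c != 0)
        groups[names[nonzero]].append(offset)
    return groups


def distance_two_offsets():
    """Offsets of blocks two block-widths away along the positive or negative x, y or z axis."""
    return [
        tuple(sign * 2 if i == axis else 0 for i in range(3))
        for axis in range(3)
        for sign in (1, -1)
    ]


groups = classify_offsets()
counts = {name: len(offsets) for name, offsets in groups.items()}
```

  • Applying the same offsets to the parent block gives the reference parent block range, and applying them to the current sub-block gives the reference sub-block range.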
  • For the current sub-block, the information of every reference parent block used for prediction can be obtained at the decoding end, whereas the information of only part of the reference sub-blocks used for prediction can be obtained. Which sub-blocks are available can be determined, for example, according to the traversal order adopted in the hierarchical structure during decoding, which is not limited in this application. As an optional implementation manner, during decoding, the reference sub-blocks may be selected from these available sub-blocks, which is not limited in this application.
  • When a reference block does not contain any point, the attribute value of the reference block is 0. That is to say, a parent block within the range of reference parent blocks used for prediction is regarded as having an attribute value of 0 if it does not contain any point; and/or a sub-block within the range of reference sub-blocks used for prediction is regarded as having an attribute value of 0 if it does not contain any point.
  • A reference block that does not contain any point may be referred to as an empty (unoccupied) reference block.
  • the predicted attribute value may also be referred to as an average attribute value, which is not limited in this application.
  • the neural network model may include a multi-layer perceptron and/or a Transformer.
  • With the trained neural network model, the predicted attribute value of the current sub-block can be made as close as possible to its real attribute value, which helps to improve the accuracy and stability of the average attribute value of the current sub-block.
  • The training target of the neural network model can be adjusted in consideration of the stability of the data; for example, the stability of the prediction can be taken into account.
  • the output of the above neural network model may be the predicted attribute value of the current sub-block.
  • an initial predicted value can also be determined, and the output of the neural network model can be the difference between the initial predicted value and the real attribute value.
  • the initial prediction value may be calculated according to the parent block of the current sub-block and at least one parent block associated with the same level of the parent block, which is not limited in the present application.
  • Therefore, in the embodiment of the present application, the reference block of the current sub-block is determined in the hierarchical structure of the point cloud, and the information of the reference block and/or the information of the current sub-block is then input into the neural network model to obtain the predicted attribute value of the current sub-block, wherein the reference block includes at least one parent block associated with the parent block of the current sub-block at the same level, and/or at least one sub-block associated with the current sub-block at the same level, and the training data of the neural network model includes the information of the reference block of a sub-block and the real attribute value of that sub-block.
  • Because the embodiment of the present application can flexibly select the reference block used to predict the current sub-block, and relies on the strong expressive ability of the neural network model, it can help improve the accuracy and stability of the average attribute value of the current sub-block.
  • Two parameters m and k may be defined, where m ≤ k and both m and k are positive integers. Whether to predict the attribute value of the current sub-block, or the manner in which it is predicted, can be determined according to the parameters m and k. Exemplarily, the following three situations may exist.
  • When the number of non-empty reference blocks among the reference blocks is less than m, the predicted attribute value of the current sub-block is not predicted. Here, a non-empty reference block is a reference block that contains at least one point.
  • When the number of non-empty reference blocks among the reference blocks is greater than or equal to k, k non-empty reference blocks are selected from these non-empty reference blocks, and the information of the k non-empty reference blocks and/or the information of the current sub-block is input into the above neural network model to obtain the predicted attribute value of the current sub-block.
  • The k non-empty reference blocks may be determined according to distance information between the non-empty reference blocks among the reference blocks and the current sub-block.
  • The distance information may include Euclidean distance and/or Manhattan distance, which is not limited.
  • For example, according to the Euclidean distance and/or the Manhattan distance between the non-empty reference blocks among the reference blocks and the current sub-block, the k non-empty parent blocks and/or non-empty sub-blocks closest to the current sub-block can be selected as the reference blocks for prediction.
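  • The distance-based selection can be sketched as follows. This is a minimal illustration, assuming each block is described by a hypothetical dictionary with a centre coordinate, a point count and an attribute value; the data layout and function names are not part of the original description.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two 3-D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Manhattan distance between two 3-D points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def select_k_nearest(blocks, current_center, k, metric=euclidean):
    """Return the k non-empty blocks closest to the current sub-block.

    `blocks` is a list of dicts with the hypothetical keys
    'center', 'num_points' and 'attr'.
    """
    non_empty = [b for b in blocks if b["num_points"] > 0]
    if len(non_empty) < k:
        raise ValueError("fewer than k non-empty reference blocks")
    return sorted(non_empty, key=lambda b: metric(b["center"], current_center))[:k]


blocks = [
    {"center": (0, 0, 0), "num_points": 3, "attr": 10.0},
    {"center": (4, 0, 0), "num_points": 0, "attr": 0.0},   # empty, never selected
    {"center": (0, 4, 0), "num_points": 1, "attr": 20.0},
    {"center": (8, 8, 8), "num_points": 2, "attr": 30.0},
]
selected = select_k_nearest(blocks, current_center=(1, 1, 1), k=2)
```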
  • When the number of non-empty reference blocks among the reference blocks is greater than or equal to m and less than k, the empty reference blocks among the reference blocks can be interpolated to obtain k non-empty reference blocks, and the information of the k non-empty reference blocks and/or the information of the current sub-block is then input into the above neural network model to obtain the predicted attribute value of the current sub-block.
  • the reference blocks within the range of the reference blocks used for prediction may be interpolated successively according to a certain priority until k non-empty reference blocks are reached.
  • For each empty reference block, interpolation may be performed with reference to one or more non-empty reference blocks at the same level that are coplanar, collinear or co-point with that block.
  • As an example, a weighted average of the attribute values of n1 non-empty reference blocks may be determined according to the attribute values of the n1 non-empty reference blocks among the reference blocks and the distances between the n1 non-empty reference blocks and a second reference block, which is an empty reference block, where:
  • n1 is a positive integer with m ≤ n1 < k;
  • $a_i$ is the attribute value of the i-th non-empty reference block among the n1 non-empty reference blocks;
  • $d_i$ is the distance between the i-th non-empty reference block among the n1 non-empty reference blocks and the second reference block.
  • The second reference block is then interpolated according to the weighted average to obtain the k non-empty reference blocks.
  • In this successive interpolation, for each reference block to be interpolated, the weighted average of the attribute values of the n1 non-empty reference blocks that have been obtained, weighted according to the distances between these n1 non-empty reference blocks and the reference block currently being interpolated, is used as the final interpolation result.
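  • A minimal sketch of this successive interpolation is given below. The exact weighting is not specified here, so the sketch assumes inverse-distance weights (weight 1/d_i); the function names and the tuple layout are hypothetical.

```python
import math


def idw_attribute(target_center, neighbours):
    """Weighted average of neighbour attributes with assumed weights 1 / d_i.

    `neighbours` is a list of (center, attr) pairs for the n1 non-empty blocks.
    """
    num = 0.0
    den = 0.0
    for center, attr in neighbours:
        d = math.dist(target_center, center)
        if d == 0.0:
            return attr  # coincident block: reuse its attribute directly
        w = 1.0 / d
        num += w * attr
        den += w
    return num / den


def fill_to_k(non_empty, empty_centers, k):
    """Interpolate empty blocks one by one until k reference blocks are available.

    `empty_centers` is assumed to be already sorted by interpolation priority.
    """
    filled = list(non_empty)
    for center in empty_centers:
        if len(filled) >= k:
            break
        filled.append((center, idw_attribute(center, non_empty)))
    return filled


neighbours = [((0, 0, 0), 10.0), ((3, 0, 0), 40.0)]
value = idw_attribute((1, 0, 0), neighbours)
filled = fill_to_k(neighbours, [(1, 0, 0), (2, 0, 0)], k=3)
```

  • Each interpolated value depends only on the n1 originally non-empty blocks, so the order of interpolation affects which blocks are filled but not the values they receive.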
  • As another example, a first average value of the attribute values of n2 non-empty reference blocks among the reference blocks and a second average value of the distances between the n2 non-empty reference blocks and the current sub-block may be determined, and a third reference block is interpolated to obtain the above k non-empty reference blocks, wherein the attribute value of the third reference block is the first average value, and the distance from the third reference block to the current sub-block is the second average value.
  • For example, the average attribute value of the n2 non-empty reference blocks and the average distance between the n2 non-empty reference blocks and the current sub-block can be calculated; the distances between the remaining (k − n2) reference blocks (i.e., an example of the third reference block) and the current sub-block are all set to the average distance, and the attribute values of the (k − n2) reference blocks are all set to the average attribute value.
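  • The mean-based padding can be sketched in a few lines of Python. The function name and the list-based representation are hypothetical; the sketch only illustrates filling the remaining (k − n2) entries with the average attribute value and the average distance.

```python
def pad_with_means(attrs, dists, k):
    """Pad n2 (attribute, distance) pairs up to k entries using their averages."""
    n2 = len(attrs)
    if n2 == 0 or n2 != len(dists):
        raise ValueError("need the same non-zero number of attributes and distances")
    mean_attr = sum(attrs) / n2   # first average value
    mean_dist = sum(dists) / n2   # second average value
    pad = max(0, k - n2)
    return list(attrs) + [mean_attr] * pad, list(dists) + [mean_dist] * pad


padded_attrs, padded_dists = pad_with_means([10.0, 20.0, 30.0], [1.0, 2.0, 3.0], k=5)
```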
  • At least one of the attribute value of the reference block, the distance information between the reference block and the current sub-block, and the auxiliary information of the current sub-block may be input into the neural network model to obtain the predicted attribute value of the current sub-block.
  • The auxiliary information includes at least one of: the three-dimensional spatial coordinates of the current sub-block, the three-dimensional spatial coordinates of the parent block of the current sub-block, the relative position of the current sub-block in its parent block, the transformation level of the current sub-block, the size information of the current sub-block, the size of the parent block of the current sub-block, and the spatial distribution of the sub-blocks in the parent block of the current sub-block.
  • the auxiliary information may be empty, which is not limited in this application.
  • It may also be determined whether at least one of the number of non-empty blocks among the reference blocks, the ratio of non-empty blocks to the total number of reference blocks, and the attribute difference of the non-empty blocks among the reference blocks satisfies a preset condition.
  • That is to say, the prediction method of the above method 300 is used to predict the predicted attribute value of the current sub-block only when at least one of the number of non-empty blocks among the reference blocks, the ratio of non-empty blocks to the total number of reference blocks, and the attribute difference of the non-empty blocks among the reference blocks satisfies the preset condition.
  • When the preset condition is not satisfied, the predicted attribute value of the current sub-block may be determined according to other prediction methods, for example, methods in the prior art.
  • a dynamic switching identifier can be defined, and when the prediction attribute value of the current sub-block needs to be determined according to the prediction method provided by the embodiment of the present application, the dynamic switching identifier can be defined as 1; otherwise, it can be defined as 0.
  • the dynamic switching identifier can be written into the code stream.
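  • The switching decision can be sketched as below. The preset condition is not specified here, so the thresholds (`min_count`, `min_ratio`, `max_spread`), the use of max minus min as the attribute difference, and the function name are all hypothetical assumptions for illustration.

```python
def dynamic_switch_flag(non_empty_attrs, total_blocks,
                        min_count=4, min_ratio=0.25, max_spread=64.0):
    """Return 1 to use the neural-network prediction, 0 to fall back to another method.

    The flag is 1 when at least one of the three quantities meets its
    (hypothetical) threshold; the flag itself would be written to the code stream.
    """
    if not non_empty_attrs or total_blocks <= 0:
        return 0
    count = len(non_empty_attrs)
    ratio = count / total_blocks
    spread = max(non_empty_attrs) - min(non_empty_attrs)  # assumed attribute difference
    satisfied = (count >= min_count, ratio >= min_ratio, spread <= max_spread)
    return 1 if any(satisfied) else 0


flag_dense = dynamic_switch_flag([10, 12, 11, 13, 12], total_blocks=27)
flag_sparse = dynamic_switch_flag([0, 200], total_blocks=27)
```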
  • Fig. 6 shows a specific example of a prediction method according to an embodiment of the present application.
  • In this example, the reference block of the current sub-block, that is, the range of reference blocks used for prediction, includes: the parent block of the current sub-block (1 block), the parent blocks at the same level that are coplanar with the parent block of the current sub-block (6 blocks), the parent blocks at the same level that are collinear with the parent block of the current sub-block (12 blocks), and the parent blocks at the same level that are co-point with the parent block of the current sub-block (8 blocks), a total of 27 parent blocks.
  • Twelve non-empty parent blocks can be selected from the 27 parent blocks as reference blocks. The selection criterion is the Euclidean distance between the reference block and the current sub-block; for example, the 12 parent blocks with the smallest Euclidean distance to the current sub-block can be selected as reference blocks.
  • In this example, the interpolation operation may not be performed.
  • The Euclidean distance between each selected non-empty parent block and the current sub-block may be calculated, giving 12 elements in total.
  • The auxiliary information of the current sub-block may include the relative position of the current sub-block in its parent block and the side length of the current sub-block, for a total of 26 input elements. As shown in Figure 6, the 26 elements can be concatenated into a vector, which is input into a multi-layer perceptron with three hidden layers. Correspondingly, the output of the network is the predicted attribute value of the current sub-block.
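  • A forward pass of such a network can be sketched in plain Python. The split of the 26 elements into 12 attribute values, 12 distances and 2 auxiliary values is an assumption, as are the hidden layer width (64), the ReLU activation and the random, untrained weights; all names are hypothetical.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def build_mlp(sizes, seed=0):
    """Create randomly initialised (untrained) layers for the given layer sizes."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """ReLU after each hidden layer, linear output layer with a single value."""
    for weights, bias in layers[:-1]:
        x = relu(dense(x, weights, bias))
    weights, bias = layers[-1]
    return dense(x, weights, bias)[0]


def build_input(attrs, dists, relative_position, side_length):
    """Concatenate the assumed 12 + 12 + 2 = 26 input elements."""
    if len(attrs) != 12 or len(dists) != 12:
        raise ValueError("expected 12 attribute values and 12 distances")
    return list(attrs) + list(dists) + [float(relative_position), float(side_length)]


mlp = build_mlp([26, 64, 64, 64, 1])          # three hidden layers
x = build_input([100.0] * 12, [1.5] * 12, relative_position=3, side_length=2)
predicted = round(forward(mlp, x))            # rounded predicted attribute value
```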
  • the embodiment of the present application can help to improve the accuracy and stability of the average attribute value of the current sub-block by flexibly selecting the reference block for predicting the current sub-block and based on the strong expressive ability of the neural network model.
  • FIG. 7 shows a schematic flowchart of an encoding method 500 provided by an embodiment of the present application.
  • the encoding method 500 can be applied to the encoder 100 shown in FIG. 3 , for example, the geometric information and attribute information of the point cloud can be input into the encoder 100 to implement compression encoding of the point cloud.
  • method 500 includes steps 510 to 550 .
  • the predicted attribute value of the current sub-block can be obtained according to the method 300 shown in FIG. 5.
  • the above-mentioned method 300 can be combined with the region-adaptive hierarchical transform (RAHT) for point cloud encoding.
  • The DC coefficient of the largest parent block is defined in terms of $w$ and $a_i$, where $w$ is the number of points in the entire parent block and $a_i$ is the attribute value of each point; its AC coefficient is 0.
  • The DC coefficient and the AC coefficient of a sub-block are calculated through the transformation matrix.
  • The obtained DC coefficients continue to be used for the transformation of the next level, and the AC coefficients are coded directly.
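  • For orientation, the commonly used RAHT formulation can be written as follows. This is an assumption about the intended formulas rather than a restatement of them; $c_1, c_2$ denote the DC coefficients of two adjacent blocks and $w_1, w_2$ their point counts, which are symbols introduced here for illustration.

```latex
% DC coefficient of a block containing w points with attribute values a_i
DC = \frac{1}{\sqrt{w}} \sum_{i=1}^{w} a_i

% Two-block RAHT butterfly: merge coefficients c_1, c_2 with weights w_1, w_2
\begin{bmatrix} DC \\ AC \end{bmatrix}
= \frac{1}{\sqrt{w_1 + w_2}}
\begin{bmatrix} \sqrt{w_1} & \sqrt{w_2} \\ -\sqrt{w_2} & \sqrt{w_1} \end{bmatrix}
\begin{bmatrix} c_1 \\ c_2 \end{bmatrix}
```

  • Under this formulation the matrix is orthonormal, so the inverse transform is its transpose, and the merged DC coefficient again equals the sum of the attribute values divided by $\sqrt{w_1 + w_2}$.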
  • In this way, the transformation coefficient of the child block can be calculated.
  • The output value of the network can be rounded to obtain the predicted attribute value $a_{up}$ of the sub-block.
  • After the predicted attribute value of each sub-block is obtained, a RAHT transformation is performed on the predicted attribute values to obtain a series of predicted transformation coefficients.
  • The difference between the real transform coefficient and the predicted transform coefficient is then calculated.
  • The residual of the AC coefficients is written into the code stream, for example after quantization and entropy coding, and the DC coefficients continue to be used for the transformation of the next level. In this way, what is finally quantized and entropy encoded is the difference between the real transform coefficient and the predicted transform coefficient.
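  • The residual coding of the AC coefficient, and its inverse at the decoder, can be sketched for a single pair of sub-blocks as follows. The sketch assumes the standard two-block RAHT butterfly, a coefficient of sqrt(w) times the average attribute value for a block with w points, and a residual defined as real minus predicted; quantization and entropy coding are omitted, and all function names are hypothetical.

```python
import math


def raht_forward(c1, w1, c2, w2):
    """Merge two coefficients into (DC, AC) with the weighted butterfly."""
    s = math.sqrt(w1 + w2)
    a, b = math.sqrt(w1) / s, math.sqrt(w2) / s
    return a * c1 + b * c2, -b * c1 + a * c2


def raht_inverse(dc, ac, w1, w2):
    """Inverse butterfly (transpose of the orthonormal forward matrix)."""
    s = math.sqrt(w1 + w2)
    a, b = math.sqrt(w1) / s, math.sqrt(w2) / s
    return a * dc - b * ac, b * dc + a * ac


def coefficients(means, weights):
    """Transform the average attribute values of two sub-blocks."""
    (m1, m2), (w1, w2) = means, weights
    return raht_forward(math.sqrt(w1) * m1, w1, math.sqrt(w2) * m2, w2)


def encode_pair(real_means, pred_means, weights):
    """Return the DC passed to the next level and the AC residual to be coded."""
    real_dc, real_ac = coefficients(real_means, weights)
    _, pred_ac = coefficients(pred_means, weights)
    return real_dc, real_ac - pred_ac


def decode_pair(dc, ac_residual, pred_means, weights):
    """Recover the real average attribute values of the two sub-blocks."""
    w1, w2 = weights
    _, pred_ac = coefficients(pred_means, weights)
    c1, c2 = raht_inverse(dc, ac_residual + pred_ac, w1, w2)
    return c1 / math.sqrt(w1), c2 / math.sqrt(w2)


weights = (3, 5)
dc, residual = encode_pair((100.0, 120.0), (98.0, 123.0), weights)
recovered = decode_pair(dc, residual, (98.0, 123.0), weights)
```

  • Because the decoder computes the same predicted coefficients as the encoder, adding the decoded residual restores the real AC coefficient before the inverse transform.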
  • the embodiment of the present application performs encoding according to the predicted attribute value of the current sub-block obtained according to the above prediction method, and can further improve the encoding efficiency while improving the accuracy and stability of the average attribute value of the current sub-block.
  • FIG. 8 shows a schematic flowchart of a decoding method 600 provided by an embodiment of the present application.
  • the decoding method 600 can be applied to the decoder 200 shown in FIG. 4 , for example, the geometric code stream and attribute code stream of the point cloud can be input into the decoder 200 to realize the decoding of the point cloud.
  • method 600 includes steps 610 to 640 .
  • the predicted attribute value of the current sub-block can be obtained according to the method 300 shown in FIG. 5.
  • Determining the real transform coefficient of the current sub-block according to the predicted attribute value and the difference is the inverse of calculating the difference between the predicted transform coefficient and the real transform coefficient of the current sub-block in method 500. For details, refer to the description of FIG. 7, which is not repeated here.
  • Determining the real attribute value of the current sub-block is the inverse of calculating the real transform coefficient in method 500. For details, refer to the description of FIG. 7, which is not repeated here.
  • When point clouds are encoded and decoded according to the prediction method provided by the embodiment of the present application, such as method 300, for example in method 500 or 600, the prediction method 300 can be dynamically switched with other methods for predicting the attribute value of the current sub-block.
  • For example, another method for predicting the attribute value of the current sub-block may select 19 reference blocks in total, namely the parent block containing the current sub-block (1 block), the 6 parent blocks at the same level that are coplanar with the parent block containing the current sub-block, and the 12 parent blocks at the same level that are collinear with the parent block containing the current sub-block, and calculate the weighted average of the attribute values of the 19 reference blocks.
  • the conditions for dynamic switching may include, for example: at least one of the number of non-empty blocks in the reference block, the ratio of the non-empty block to the total number of reference blocks, and the attribute difference of the non-empty blocks in the reference block satisfies the preset condition.
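  • A dynamic switching identifier can be defined. When the predicted attribute value of the current sub-block is determined according to the prediction method provided by the embodiment of the present application, the identifier can be set to 1; otherwise, it is set to 0.
  • The dynamic switching identifier needs to be written into the code stream.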
  • As an example, each block in the first n layers of the hierarchical structure can be encoded with its real attribute value, which is quantized and entropy encoded and then written into the code stream. Starting from the (n+1)-th layer of the hierarchical structure, the prediction method provided by the embodiment of the present application, such as method 300, is used to predict the attribute value of each sub-block; the difference between the real attribute value of the sub-block and the predicted attribute value is calculated, and the difference is quantized and entropy encoded to form the attribute code stream of the point cloud.
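  • The layer-wise scheme can be sketched as follows. Quantization and entropy coding are omitted, the residual is assumed to be real minus predicted, and `predict` is a hypothetical stand-in for the prediction method; in practice it would depend on blocks that have already been decoded.

```python
def encode_levels(levels, predict, n):
    """Emit real values for the first n levels and prediction residuals below them.

    `levels` lists the real attribute values per level, root level first;
    `predict(depth, index)` returns the predicted attribute value of a block.
    """
    symbols = []
    for depth, values in enumerate(levels):
        for index, real in enumerate(values):
            symbols.append(real if depth < n else real - predict(depth, index))
    return symbols


def decode_levels(symbols, shape, predict, n):
    """Rebuild the per-level attribute values from the symbol sequence."""
    stream = iter(symbols)
    levels = []
    for depth, count in enumerate(shape):
        values = []
        for index in range(count):
            symbol = next(stream)
            values.append(symbol if depth < n else symbol + predict(depth, index))
        levels.append(values)
    return levels


def constant_predictor(depth, index):
    """Hypothetical placeholder predictor that always returns 50."""
    return 50


levels = [[50], [48, 52], [47, 49, 51, 53]]
symbols = encode_levels(levels, constant_predictor, n=1)
decoded = decode_levels(symbols, [len(v) for v in levels], constant_predictor, n=1)
```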
  • The above point cloud attribute encoding and decoding process can be directly embedded in the geometry encoding and decoding process, so that geometry and attributes can be encoded and decoded synchronously, level by level, at the encoding end and the decoding end.
  • the embodiment of the present application also provides a point cloud processing method.
  • The point cloud can be up-sampled to obtain the position information of the newly added points, and the attribute information of the newly added points is then acquired according to the prediction method provided in the embodiment of the present application, for example, the prediction method of method 300.
  • the resolution of point cloud can be described as the spatial density of points, and the more points in the same space size, the better the subjective quality of point cloud.
  • the up-sampling can obtain the location information of the newly added point, and further, the predicted attribute value of the newly added point can be obtained through the above prediction method provided by the embodiment of the present application.
  • A newly added point may fall in a smallest sub-block with a size of 1 × 1 × 1, and its attribute value after up-sampling may be obtained through the prediction method provided by the embodiment of the present application, such as the above-mentioned method 300. Therefore, the embodiment of the present application can help to increase the density of the point cloud.
  • FIG. 9 is a schematic block diagram of a prediction device 900 according to an embodiment of the present application.
  • the device 900 may include an acquisition unit 910 , a processing unit 920 and a neural network model 930 .
  • An acquisition unit 910 configured to acquire a hierarchical structure of a point cloud, wherein the hierarchical structure includes a parent block and at least one sub-block of the parent block;
  • the processing unit 920 is configured to determine a reference block of the current sub-block in the hierarchical structure, wherein the reference block includes at least one parent block associated with the same level as the parent block of the current sub-block, and/or, At least one sub-block associated with the same level of the current sub-block;
  • a neural network model 930, configured to receive as input the information of the reference block and/or the information of the current sub-block and to obtain the predicted attribute value of the current sub-block, wherein the training data of the neural network model includes the information of the reference block of a sub-block and the real attribute value of the sub-block.
  • At least one parent block associated with the same level of the parent block of the current sub-block includes at least one of the following:
  • the parent block of the current sub-block; parent blocks at the same level that are coplanar with the parent block of the current sub-block; parent blocks at the same level that are collinear with the parent block of the current sub-block; parent blocks at the same level that are co-point (share a vertex) with the parent block of the current sub-block; and parent blocks at the same level that are two parent blocks away from the parent block of the current sub-block along the positive or negative x axis, y axis or z axis.
  • At least one sub-block associated with the same level of the current sub-block includes at least one of the following:
  • sub-blocks at the same level that are coplanar with the current sub-block; sub-blocks at the same level that are collinear with the current sub-block; sub-blocks at the same level that are co-point (share a vertex) with the current sub-block; sub-blocks at the same level that are two sub-blocks away from the current sub-block along the positive or negative x axis; sub-blocks at the same level that are two sub-blocks away from the current sub-block along the positive or negative y axis; and sub-blocks at the same level that are two sub-blocks away from the current sub-block along the positive or negative z axis.
  • processing unit 920 is further configured to:
  • the neural network model 930 is specifically configured to input the information of the k non-empty reference blocks and/or the information of the current sub-block to obtain the predicted attribute value of the current sub-block.
  • processing unit 920 is specifically configured to:
  • the k non-empty reference blocks are determined according to the distance information between the non-empty reference block of the reference block and the current sub-block.
  • the distance information includes Euclidean distance and/or Manhattan distance.
  • processing unit 920 is further configured to:
  • the neural network model 930 is specifically configured to input the information of the k non-empty reference blocks and/or the information of the current sub-block to obtain the predicted attribute value of the current sub-block.
  • processing unit 920 is specifically configured to:
  • processing unit 920 is specifically configured to:
  • processing unit 920 is specifically configured to:
  • the third reference block is interpolated to obtain the k non-empty reference blocks, wherein the attribute value of the third reference block is the first average value, and the distance between the third reference block and the current sub-block is the second average value.
  • the neural network model 930 is specifically configured to receive as input at least one of the attribute value of the reference block, the distance information between the reference block and the current sub-block, and the auxiliary information of the current sub-block, to obtain the predicted attribute value of the current sub-block.
  • the auxiliary information includes at least one of: the three-dimensional spatial coordinates of the current sub-block, the three-dimensional spatial coordinates of the parent block of the current sub-block, the relative position of the current sub-block in its parent block, the transformation level of the current sub-block, the size information of the current sub-block, the size of the parent block of the current sub-block, and the spatial distribution of the sub-blocks in the parent block of the current sub-block.
  • processing unit 920 is further configured to:
  • when the reference block does not contain any point, determine that the attribute value of the reference block is 0.
  • the hierarchical structure includes at least one of an octree structure, a quadtree structure, a binary tree structure, and an uneven space division structure.
  • the neural network model includes a multi-layer perceptron and/or Transformer.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
  • The prediction device 900 shown in FIG. 9 may correspond to the subject that performs the method 300 of the embodiment of the present application, and the aforementioned and other operations and/or functions of each module in the device 900 respectively implement the corresponding processes of the method, which, for the sake of brevity, are not repeated here.
  • FIG. 10 is a schematic block diagram of an encoder 1000 according to an embodiment of the present application, for example, it may be the encoder in FIG. 3 .
  • the encoder 1000 may include an acquisition unit 1010 , a processing unit 1020 and an encoding unit 1030 .
  • the acquiring unit 1010 is configured to acquire the predicted attribute value of the current sub-block.
  • the predicted attribute value of the current sub-block may be obtained according to the method 300 shown in FIG. 5 above, without limitation.
  • The processing unit 1020 is configured to determine a predicted transform coefficient of the current sub-block according to the predicted attribute value and the number of points in the current sub-block.
  • The processing unit 1020 is further configured to determine the real transform coefficient of the current sub-block according to the real attribute value of the current sub-block and the number of points in the current sub-block.
  • The processing unit 1020 is further configured to determine the difference between the predicted transform coefficient and the real transform coefficient.
  • The encoding unit 1030 is configured to write the difference into a code stream.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
  • The encoder 1000 shown in FIG. 10 may correspond to the subject that performs the method 500 of the embodiment of the present application, and the aforementioned and other operations and/or functions of the modules in the encoder 1000 respectively implement the corresponding processes of the method, which, for the sake of brevity, are not repeated here.
  • FIG. 11 is a schematic block diagram of a decoder 1100 according to an embodiment of the present application, which may be, for example, the decoder in FIG. 4 .
  • the decoder 1100 may include an acquisition unit 1110 and a processing unit 1120 .
  • the obtaining unit 1110 is configured to obtain the difference between the predicted transformation coefficient and the real transformation coefficient of the current sub-block according to the code stream.
  • the acquiring unit 1110 is further configured to acquire the predicted attribute value of the current sub-block, for example, the predicted attribute value of the current sub-block may be acquired according to the method 300 shown in FIG. 5 above, without limitation.
  • a processing unit 1120 configured to determine the real transform coefficient of the current sub-block according to the predicted attribute value and the difference value.
  • the processing unit 1120 is further configured to determine the real attribute value of the current sub-block according to the real transform coefficient and the number of points in the current sub-block.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
  • The decoder 1100 shown in FIG. 11 may correspond to the subject that performs the method 600 of the embodiment of the present application, and the foregoing and other operations and/or functions of each module in the decoder 1100 respectively implement the corresponding processes of the method, which, for the sake of brevity, are not repeated here.
  • the embodiment of the present application also provides a point cloud processing device, including an upsampling unit and an acquisition unit.
  • The upsampling unit can be used to upsample the point cloud to obtain the position information of the newly added point; the acquisition unit is used to obtain the attribute information of the newly added point, for example, the predicted attribute value may be obtained according to the method 300 shown in FIG. 5 above, without limitation.
  • Each step of the method embodiments of the present application can be completed by an integrated logic circuit of hardware in the processor and/or by instructions in the form of software. The steps of the methods disclosed in the embodiments of the present application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, and registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
  • Figure 12 is a schematic block diagram of an electronic device 1200 provided in an embodiment of the present application.
  • the electronic device 1200 may include:
  • a memory 1210 and a processor 1220, where the memory 1210 is used to store a computer program and transmit the program code to the processor 1220.
  • the processor 1220 can call and run a computer program from the memory 1210, so as to implement the method in the embodiment of the present application.
  • the processor 1220 may be configured to execute the steps of the above method 300, 500 or 600 according to the instructions in the computer program.
  • the processor 1220 may include but not limited to:
  • a Digital Signal Processor (DSP)
  • an Application Specific Integrated Circuit (ASIC)
  • a Field Programmable Gate Array (FPGA)
  • the memory 1210 includes but is not limited to:
  • non-volatile memory, which can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM) or flash memory.
  • volatile memory, which can be random access memory (RAM) used as an external cache. Available forms of RAM include, for example:
  • static random access memory (SRAM)
  • dynamic random access memory (DRAM)
  • synchronous dynamic random access memory (SDRAM)
  • double data rate synchronous dynamic random access memory (DDR SDRAM)
  • enhanced synchronous dynamic random access memory (ESDRAM)
  • synchronous link dynamic random access memory (SLDRAM)
  • Direct Rambus random access memory (Direct Rambus RAM)
  • the computer program can be divided into one or more modules, and the one or more modules are stored in the memory 1210 and executed by the processor 1220 to carry out the point cloud processing method provided in the present application.
  • the one or more modules may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program in the electronic device 1200 .
  • the electronic device 1200 may further include:
  • a transceiver 1230, which can be connected to the processor 1220 or the memory 1210.
  • the processor 1220 can control the transceiver 1230 to communicate with other devices, specifically, can send information or data to other devices, or receive information or data sent by other devices.
  • Transceiver 1230 may include a transmitter and a receiver.
  • the transceiver 1230 may further include an antenna, and the number of antennas may be one or more.
  • the components of the electronic device 1200 are connected through a bus system, which includes not only a data bus but also a power bus, a control bus and a status signal bus.
  • an encoder including a processor and a memory; the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory, so that the encoder performs the encoding method of the foregoing method embodiments.
  • a decoder including a processor and a memory; the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory, so that the decoder performs the decoding method of the foregoing method embodiments.
  • a codec system including the above encoder and decoder.
  • a computer storage medium on which a computer program is stored, and when the computer program is executed by a computer, the computer can execute the methods of the above method embodiments.
  • the embodiments of the present application further provide a computer program product including instructions, and when the instructions are executed by a computer, the computer executes the methods of the foregoing method embodiments.
  • a computer program product or computer program comprising computer instructions stored in a computer readable storage medium.
  • the processor of the computer device reads the computer instruction from the computer-readable storage medium, and the processor executes the computer instruction, so that the computer device executes the method of the above method embodiment.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (such as coaxial cable, optical fiber or digital subscriber line (DSL)) or wireless means (such as infrared, radio or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a digital video disc (digital video disc, DVD)), or a semiconductor medium (such as a solid state disk (solid state disk, SSD)), etc.
  • modules and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the modules is only a logical function division. In actual implementation, there may be other division methods.
  • multiple modules or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical or other forms.
  • a module described as a separate component may or may not be physically separated, and a component displayed as a module may or may not be a physical module, that is, it may be located in one place, or may also be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, each functional module in each embodiment of the present application may be integrated into one processing module, each module may exist separately physically, or two or more modules may be integrated into one module.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

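A prediction method and apparatus, an encoder, a decoder, and a codec system are provided. The prediction method includes: obtaining a hierarchical structure of a point cloud, where the hierarchical structure includes a parent block and at least one sub-block of the parent block; determining a reference block of a current sub-block in the hierarchical structure, where the reference block includes at least one parent block associated, at the same level, with the parent block of the current sub-block, and/or at least one sub-block associated, at the same level, with the current sub-block; and inputting information of the reference block and/or information of the current sub-block into a neural network model to obtain a predicted attribute value of the current sub-block, where the training data of the neural network model includes information of the reference blocks of sub-blocks and the true attribute values of the sub-blocks. The embodiments of the present application allow the reference blocks used to predict the current sub-block to be selected flexibly and, by drawing on the expressive power of the neural network model, can help improve the accuracy and stability of the average attribute value of the current sub-block, thereby further improving encoding and decoding efficiency.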

Description

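Prediction method and apparatus, encoder, decoder, and codec system
Technical Field
The embodiments of the present application relate to the technical field of point cloud encoding and decoding, and more specifically to a prediction method and apparatus, an encoder, a decoder, and a codec system.
Background
With the continuous development of point cloud technology, the compression coding of point cloud data has become an important research topic. At present, both the Audio Video coding Standard Workgroup of China (AVS) and the Moving Picture Experts Group (MPEG) of the International Organization for Standardization are developing point cloud coding standards, such as Geometry-based Point Cloud Compression (G-PCC). How to further improve the performance of point cloud encoding and decoding is a problem that urgently needs to be solved.
Summary
A prediction method and apparatus, an encoder, a decoder, and a codec system are provided, which can help improve the accuracy and stability of the average attribute value of the current sub-block and can thereby improve encoding and decoding efficiency.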
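In a first aspect, a prediction method is provided, including:
obtaining a hierarchical structure of a point cloud, where the hierarchical structure includes a parent block and at least one sub-block of the parent block;
determining a reference block of a current sub-block in the hierarchical structure, where the reference block includes at least one parent block associated, at the same level, with the parent block of the current sub-block, and/or at least one sub-block associated, at the same level, with the current sub-block;
inputting information of the reference block and/or information of the current sub-block into a neural network model to obtain a predicted attribute value of the current sub-block, where the training data of the neural network model includes information of the reference blocks of sub-blocks and the true attribute values of the sub-blocks.
In a second aspect, an encoding method is provided, including:
obtaining a predicted attribute value of a current sub-block according to the method of the first aspect;
determining a predicted transform coefficient of the current sub-block according to the predicted attribute value and the number of points in the current sub-block;
determining a true transform coefficient of the current sub-block according to the true attribute value of the current sub-block and the number of points in the current sub-block;
determining the difference between the predicted transform coefficient and the true transform coefficient;
writing the difference into the bitstream.
In a third aspect, a decoding method is provided, including:
obtaining, from the bitstream, the difference between the predicted transform coefficient and the true transform coefficient of a current sub-block;
obtaining the predicted attribute value of the current sub-block according to the method of the first aspect;
determining the true transform coefficient of the current sub-block according to the predicted attribute value and the difference;
determining the true attribute value of the current sub-block according to the true transform coefficient and the number of points in the current sub-block.
In a fourth aspect, a point cloud processing method is provided, including:
upsampling a point cloud to obtain position information of newly added points;
obtaining attribute information of the newly added points according to the method of the first aspect.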
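In a fifth aspect, a prediction apparatus is provided, including:
an acquisition unit, configured to obtain a hierarchical structure of a point cloud, where the hierarchical structure includes a parent block and at least one sub-block of the parent block;
a processing unit, configured to determine a reference block of a current sub-block in the hierarchical structure, where the reference block includes at least one parent block associated, at the same level, with the parent block of the current sub-block, and/or at least one sub-block associated, at the same level, with the current sub-block;
a neural network model, configured to receive information of the reference block and/or information of the current sub-block as input and to obtain a predicted attribute value of the current sub-block, where the training data of the neural network model includes information of the reference blocks of sub-blocks and the true attribute values of the sub-blocks.
In a sixth aspect, an encoder is provided, including:
an acquisition unit, configured to obtain a predicted attribute value of a current sub-block according to the method of the first aspect;
a processing unit, configured to determine a predicted transform coefficient of the current sub-block according to the predicted attribute value and the number of points in the current sub-block;
the processing unit is further configured to determine a true transform coefficient of the current sub-block according to the true attribute value of the current sub-block and the number of points in the current sub-block;
the processing unit is further configured to determine the difference between the predicted transform coefficient and the true transform coefficient;
an encoding unit, configured to write the difference into the bitstream.
In a seventh aspect, a decoder is provided, including:
an acquisition unit, configured to obtain, from the bitstream, the difference between the predicted transform coefficient and the true transform coefficient of a current sub-block;
the acquisition unit is further configured to obtain the predicted attribute value of the current sub-block according to the method of the first aspect;
a processing unit, configured to determine the true transform coefficient of the current sub-block according to the predicted attribute value and the difference;
the processing unit is further configured to determine the true attribute value of the current sub-block according to the true transform coefficient and the number of points in the current sub-block.
In an eighth aspect, a codec system is provided, including the encoder of the sixth aspect and the decoder of the seventh aspect.
In a ninth aspect, a point cloud processing apparatus is provided, including:
an upsampling unit, configured to upsample a point cloud to obtain position information of newly added points;
an acquisition unit, configured to obtain attribute information of the newly added points according to the method of the first aspect.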
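In a tenth aspect, an electronic device is provided, including a processor and a memory. The memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method of any one of the first to fourth aspects.
In an eleventh aspect, a chip is provided, including a processor configured to call and run a computer program from a memory, so that a device in which the chip is installed executes the method of any one of the first to fourth aspects.
In a twelfth aspect, a computer-readable storage medium is provided for storing a computer program, where the computer program causes a computer to execute the method of any one of the first to fourth aspects.
In a thirteenth aspect, a computer program product is provided, including computer program instructions that cause a computer to execute the method of any one of the first to fourth aspects.
In a fourteenth aspect, a computer program is provided which, when run on a computer, causes the computer to execute the method of any one of the first to fourth aspects.
With the above technical solutions, the reference blocks used to predict the current sub-block can be selected flexibly, and the expressive power of the neural network model can help improve the accuracy and stability of the average attribute value of the current sub-block. When encoding and decoding are performed using the predicted attribute value of the current sub-block obtained by the above intra prediction method, the improved accuracy and stability of the average attribute value can further improve encoding and decoding efficiency.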
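Brief Description of the Drawings
Figure 1 is a schematic diagram of an octree structure involved in the embodiments of the present application;
Figure 2A is a schematic diagram of octree partitioning involved in the embodiments of the present application;
Figure 2B is another schematic diagram of octree partitioning involved in the embodiments of the present application;
Figure 3 is a schematic diagram of an encoder involved in the embodiments of the present application;
Figure 4 is a schematic diagram of a decoder involved in the embodiments of the present application;
Figure 5 is a schematic flowchart of a prediction method provided in an embodiment of the present application;
Figure 6 shows a specific example of a prediction method according to an embodiment of the present application;
Figure 7 is a schematic flowchart of an encoding method provided in an embodiment of the present application;
Figure 8 is a schematic flowchart of a decoding method provided in an embodiment of the present application;
Figure 9 is a schematic block diagram of a prediction apparatus according to an embodiment of the present application;
Figure 10 is a schematic block diagram of an encoder provided in an embodiment of the present application;
Figure 11 is a schematic block diagram of a decoder provided in an embodiment of the present application;
Figure 12 is a schematic block diagram of an electronic device provided in an embodiment of the present application.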
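Detailed Description
The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings. Clearly, the described embodiments are some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
The present application is applicable to the technical field of point cloud data compression. First, the terms involved in the embodiments of the present application are explained.
1) A point cloud is a three-dimensional (3D) representation of the surface of an object and may refer to a collection of a massive number of points in three-dimensional space. Each point has associated attributes, such as color and material properties. For example, a point cloud can be used to reconstruct an object or scene as a combination of points. A point in a point cloud may include geometry information and attribute information. As an example, the geometry information of a point may be its three-dimensional coordinates, which may be expressed as (x, y, z) in a Cartesian coordinate system or in any other coordinate system. The geometry information of a point may also be called the position information of the point. As an example, the points may have associated attribute information such as color, for example three-component values in Red-Green-Blue (RGB) or Luminance-Chrominance (YUV) form; other attribute information may include transparency, reflectance, normal vectors, and so on, without limitation.
A point cloud may be static or dynamic. For example, a detailed scan or mapping of an object or terrain may be static point cloud data, while an environment scan used for machine vision may be dynamic point cloud data. Because dynamic point cloud data changes over time, a dynamic point cloud may be a time-ordered sequence of point clouds.
Point cloud data can be applied in many fields, such as virtual/augmented reality, machine vision, geographic information systems, and medicine. The point cloud of an object's surface can be captured with acquisition devices such as photoelectric radar, lidar, laser scanners, and multi-view cameras. The number of points in a point cloud is large, possibly reaching billions, so the raw data volume of a point cloud is enormous. Effective compression, that is, encoding and decoding, is therefore needed to reduce the amount of point cloud data.
2) The tree structure of a point cloud can represent the partitioning result of the geometry information of the point cloud during encoding or decoding. In tree-based point cloud partitioning, the volume of the point cloud is recursively divided into sub-volumes; the volume corresponds to the root node of the tree structure, and each sub-volume corresponds to a node of the tree structure. For example, whether a sub-volume is divided further may be decided by whether it contains points. Each node may have an occupancy bit indicating whether the sub-volume corresponding to the node contains points. Optionally, these occupancy bits may be arithmetically encoded to obtain a binary bitstream.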
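As an example, the tree structure may be an octree. In the octree structure of a point cloud, the volume and the sub-volumes are all cubes, and each split produces eight further sub-volumes (sub-cubes). Figure 1 shows a schematic diagram of an octree structure. As shown in Figure 1, block 10 may be the root node and may correspond to the volume of the complete point cloud, for example a cube. The volume corresponding to block 10 may be split into 8 sub-volumes, each corresponding to one block in dashed box 20. Block 10 is the parent block (also called the parent node) of the blocks in dashed box 20; correspondingly, the blocks in dashed box 20 are the sub-blocks (also called child nodes) of block 10, and the sub-blocks are sibling blocks of one another. As shown in Figure 1, the sub-blocks of block 10 (the blocks in dashed box 20) may include blocks that contain points, whose occupancy bit is 1, indicating that the corresponding sub-volume contains points. The sub-blocks of block 10 may also include blocks that contain no points, whose occupancy bit is 0, indicating that the corresponding sub-volume contains no points, that is, the sub-volume is empty. A parent block can be represented by the occupancy bits of its sub-blocks; for example, block 10 can be represented in binary as "00001001", indicating that the occupancy bits of sub-blocks 21 and 22 are 1.
For example, the sub-volumes corresponding to the blocks in dashed box 20 whose occupancy bit is 1, such as blocks 21 and 22, may each be further split into 8 sub-volumes. Correspondingly, blocks 21 and 22 are the parent blocks of the nodes corresponding to these 8 further sub-volumes, and the 8 further sub-volumes are sub-blocks, such as the blocks in dashed box 30. Similarly, block 21 can be represented in binary as "01001000", indicating that the occupancy bits of sub-blocks 31 and 32 are 1; block 22 can be represented in binary as "001000000", indicating that the occupancy bit of sub-block 33 is 1. Optionally, these occupancy bits may be arithmetically encoded to obtain a binary bitstream.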
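The mapping between occupied sub-blocks and a parent block's 8-bit occupancy pattern can be sketched as follows. This is a minimal illustration, not the codec's actual implementation: the function names are hypothetical, and it assumes that the leftmost bit corresponds to child index 0 and that sub-blocks 21 and 22 sit at child indices 4 and 7, which the text does not state.

```python
def occupancy_code(occupied_children):
    """Return the 8-bit occupancy string of a parent block.

    occupied_children: iterable of child indices (0..7) that contain points.
    Assumption: the leftmost bit corresponds to child index 0.
    """
    bits = ["0"] * 8
    for idx in occupied_children:
        if not 0 <= idx <= 7:
            raise ValueError("child index must be in 0..7")
        bits[idx] = "1"
    return "".join(bits)


def occupied_from_code(code):
    """Inverse mapping: list the child indices whose occupancy bit is 1."""
    return [i for i, bit in enumerate(code) if bit == "1"]


# Hypothetical placement of sub-blocks 21 and 22 at child indices 4 and 7.
print(occupancy_code([4, 7]))
print(occupied_from_code("01001000"))
```

Only the occupied children listed by `occupied_from_code` would be split further, which is what keeps the tree sparse.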
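In some optional embodiments, block 10 may also be a block corresponding to a sub-volume; that is, the octree structure in Figure 1 may be part of the octree structure of the complete point cloud, which is not limited in the present application.
In some optional embodiments, blocks with the same depth in the octree structure may form a layer. The octree structure may include at least two layers, each layer may include at least one block, and each block may correspond to one sub-volume. The octree structure is a hierarchical structure. In other embodiments, the tree structure of the point cloud may also be at least one of hierarchical structures such as a quadtree structure, a binary tree structure, and a non-uniform spatial partitioning structure, without limitation.
As an example, referring to Figure 1, when block 10 is the root node, the blocks in dashed box 20 have a depth value of 1 and belong to one layer. For example, the layer corresponding to dashed box 20 may be layer 0 of the octree structure. Similarly, the blocks in dashed box 30 have a depth value of 2 and belong to one layer. For example, the layer corresponding to dashed box 30 may be layer 1 of the octree structure. When the sub-volumes corresponding to the blocks in dashed box 30 are split further, the octree structure may contain blocks of greater depth, corresponding to more layers. By analogy, the layer number increases as the depth value of the blocks increases.
Figure 2A shows the spatial positions of the 8 sub-blocks (sub-blocks 0 to 7) produced by octree partitioning relative to their parent block (the current block). When the 8-bit occupancy code is encoded at the current node, neighbor reference information at the same level can be obtained, for example the occupancy information of the neighboring sub-blocks in the left, front, and down directions (for example, the negative directions of the x, y, and z axes of the coordinate system). For example, for sub-blocks at different positions in the current block, at least one of the 3 face-sharing, 3 edge-sharing, and 1 vertex-sharing neighbors at the same level may be used as a reference block. Figure 2B shows examples of face-sharing and edge-sharing neighbors of blocks at the same level; from left to right these are the right, up, and back face-sharing neighbors, the left, front, and down face-sharing neighbors, the right, up, and back edge-sharing neighbors, and the left, front, and down edge-sharing neighbors.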
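The encoding and decoding framework for point cloud compression to which the embodiments of the present application are applicable is described below with reference to Figures 3 and 4.
Figure 3 is a schematic block diagram of an encoder 100 provided in an embodiment of the present application. For example, the encoder 100 may be a G-PCC encoder. The input of the encoder 100 includes the geometry information and the attribute information of the point cloud. For example, the input point cloud may be divided into slices, and each resulting slice is encoded independently. Within a slice, the geometry information and the attribute information of the point cloud are encoded separately. As shown in Figure 3, the encoder 100 may apply a coordinate transformation to the geometry information so that the whole point cloud is contained in a bounding box. The bounding box may be called the volume corresponding to the point cloud. A voxelization process may then be carried out, which includes, for example, quantization and the removal of duplicate points. Quantization scales the result of the coordinate transformation. Because quantization involves rounding, some points end up with identical geometry information, and a parameter can then decide whether duplicate points are removed. Next, the bounding box may be partitioned by an octree. Depending on the depth of the octree partitioning, geometry encoding can be divided into octree-based encoding and encoding based on a triangle soup (trisoup).
In octree-based encoding, the bounding box may be divided into 8 equal sub-cubes, and the occupancy bit of each sub-cube is recorded. An occupancy bit of 1 indicates that the sub-cube is non-empty, in other words that it is occupied by points of the point cloud, that is, it contains points of the point cloud. An occupancy bit of 0 indicates that the sub-cube is empty, in other words that it is not occupied by points of the point cloud, that is, it contains no points of the point cloud. Non-empty sub-cubes are then divided into 8 equal parts again. For example, the partitioning may stop when the resulting leaf nodes are 1x1x1 unit cubes.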
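The recursive eight-way split down to unit cubes can be sketched as follows. This is a simplified illustration under stated assumptions, not the G-PCC implementation: the function name is hypothetical, points are assumed to be non-negative integers already voxelized into a cube of side 2**depth, and the child index is assumed to be formed from the x, y, z bits with x as the most significant bit.

```python
def build_octree_levels(points, depth):
    """Return, per level, a dict mapping each occupied block to its occupancy string.

    points: iterable of integer (x, y, z) with 0 <= coordinate < 2**depth.
    depth:  number of splits needed to reach 1x1x1 unit cubes.
    A block is identified by its integer coordinates at its own level.
    """
    levels = []
    for level in range(depth):
        shift = depth - level  # log2 of the block side length at this level
        occupancy = {}
        for x, y, z in points:
            block = (x >> shift, y >> shift, z >> shift)
            # Child index from the next-lower bit of each coordinate
            # (assumed order: x is the most significant bit).
            cx = (x >> (shift - 1)) & 1
            cy = (y >> (shift - 1)) & 1
            cz = (z >> (shift - 1)) & 1
            child = (cx << 2) | (cy << 1) | cz
            # Leftmost bit of the 8-bit pattern is child 0.
            occupancy[block] = occupancy.get(block, 0) | (1 << (7 - child))
        levels.append({b: format(o, "08b") for b, o in occupancy.items()})
    return levels


points = [(0, 0, 0), (3, 3, 3), (3, 2, 3)]
for level, blocks in enumerate(build_octree_levels(points, depth=2)):
    print(level, blocks)
```

Because only blocks that contain at least one point ever appear in the dictionaries, empty sub-cubes are never split further, matching the rule described above.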
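For example, a sub-cube may be called a sub-volume, that is, the result of partitioning the bounding box or volume. In the octree, the bounding box may be called the root node, and each sub-cube may be called a child node of the root node, that is, a sub-block.
During octree partitioning, the spatial correlation between a block and its surrounding blocks can be used to perform intra prediction on the occupancy bits. Context modeling may then be carried out to obtain the context information of the block, and arithmetic coding (for example, adaptive binary arithmetic coding) is performed based on the context information to generate a binary bitstream, namely the geometry bitstream.
Trisoup-based encoding also requires octree partitioning. Unlike octree-based encoding, trisoup-based encoding does not partition the point cloud level by level down to 1x1x1 unit cubes; instead, partitioning stops when the side length of a block is W. Based on the surface formed by the distribution of the point cloud in each block, the at most twelve intersection points (vertices) between this surface and the twelve edges of the block are obtained. The vertex coordinates of each block are encoded in turn to generate a binary bitstream, namely the geometry bitstream.
After encoding the geometry information, the G-PCC encoder reconstructs the geometry information and uses the reconstructed geometry information to encode the attribute information of the point cloud. For example, attribute encoding mainly encodes the color information of the points in the point cloud. First, the encoder may apply a color conversion to the color information of the points. For example, when the color information of the points in the input point cloud is expressed in the RGB color space, the encoder may convert it from the RGB color space to the YUV color space. The point cloud is then recolored using the reconstructed geometry information so that the not-yet-encoded attribute information corresponds to the reconstructed geometry information. Next, the color information is transformed. For example, there are two transform methods: one is a distance-based lifting transform that relies on Level of Detail (LOD) partitioning, and the other is the direct Region Adaptive Hierarchical Transform (RAHT). Both methods transform the color information from the spatial domain to the frequency domain, yielding high-frequency and low-frequency coefficients. Finally, the coefficients are quantized and arithmetically encoded to generate a binary bitstream, namely the attribute bitstream.
Optionally, during attribute encoding, the point cloud may be sorted by Morton code, and geometric spatial relationships are used to search for the nearest neighbors of the point to be encoded (also called the point to be predicted). The reconstructed attribute values of the neighbors found are used to predict the point to be encoded by interpolation, giving a predicted attribute value. The difference between the true attribute value and the predicted attribute value then gives the prediction residual, which is finally quantized and arithmetically encoded to obtain a binary bitstream.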
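Morton ordering interleaves the bits of the three coordinates so that points that are close in space tend to be close in the sorted list. The following is a minimal sketch with hypothetical function names; it assumes non-negative integer coordinates and that, within each bit triple, x is the most significant bit, which is one of several conventions in use.

```python
def morton3d(x, y, z, bits):
    """Interleave the lowest `bits` bits of x, y, z into one Morton code.

    Assumption: within each 3-bit group the order is x, y, z (x most significant).
    """
    code = 0
    for i in reversed(range(bits)):
        code = (code << 3) | (((x >> i) & 1) << 2) | (((y >> i) & 1) << 1) | ((z >> i) & 1)
    return code


def sort_by_morton(points, bits):
    """Return the points sorted by ascending Morton code."""
    return sorted(points, key=lambda p: morton3d(p[0], p[1], p[2], bits))


print(sort_by_morton([(1, 1, 1), (0, 0, 1), (1, 0, 0)], bits=1))
```

With this bit order, the three lowest bits of the code equal a unit cube's child index within its parent, so siblings end up adjacent in the sorted order.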
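Figure 4 is a schematic block diagram of a decoder 200 provided in an embodiment of the present application. The input of the decoder 200 includes the geometry bitstream and the attribute bitstream of the point cloud, which are decoded separately. As shown in Figure 4, the decoder 200 applies arithmetic decoding, context modeling, octree partitioning, inverse quantization, and inverse coordinate transformation to the input geometry bitstream to obtain the geometry information, and applies arithmetic decoding, inverse quantization, inverse transform, attribute reconstruction, and inverse color conversion to the input attribute bitstream to obtain the attribute information. Specifically, the decoding process is the inverse of the encoding process.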
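In one RAHT intra prediction method, the series of 1×1×1 unit cubes obtained by octree geometry encoding is arranged in ascending order of Morton code. Define A_n as the attribute values of n sub-blocks and T_n as the transform coefficients of n sub-blocks. Initially, A_n = T_n.
The transform proceeds upward level by level from the smallest unit cubes. Suppose the attribute values of the points contained in two adjacent sub-blocks are, respectively,
Figure PCTCN2022076368-appb-000001
Figure PCTCN2022076368-appb-000002
and their transform coefficients are, respectively,
Figure PCTCN2022076368-appb-000003
Figure PCTCN2022076368-appb-000004
Define the attribute values of the points contained in the next-higher-level parent block that contains these two sub-blocks as
Figure PCTCN2022076368-appb-000005
and its transform coefficients as
Figure PCTCN2022076368-appb-000006
Then:
Figure PCTCN2022076368-appb-000007
Figure PCTCN2022076368-appb-000008
Figure PCTCN2022076368-appb-000009
where
Figure PCTCN2022076368-appb-000010
In other words, in the RAHT transform, all transform coefficients of the parent block other than the first transform coefficient of its sub-blocks (called the DC coefficient) are inherited directly from the transform coefficients of the sub-blocks; these other coefficients are called AC coefficients. The DC coefficient of the parent block is obtained by transforming the DC coefficients of the sub-blocks, and the transform matrix depends on the number of points in each sub-block, that is:
Figure PCTCN2022076368-appb-000011
After the level-by-level transform, it is the transformed coefficients that are finally quantized and entropy encoded, and at the decoder the attribute value of each point can be reconstructed from the decoded transform coefficients.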
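Since the formula images are not reproduced here, the following sketch uses the two-point RAHT butterfly as it is commonly described in the literature: an orthonormal 2×2 rotation whose entries depend on the point counts w1 and w2. The function names are hypothetical, and this should be read as an assumption about the missing equations rather than a transcription of them.

```python
import math


def raht_forward(dc1, w1, dc2, w2):
    """Merge the DC coefficients of two sibling blocks into (parent DC, AC).

    w1, w2 are the numbers of points in the two blocks; the parent holds w1 + w2 points.
    """
    s = math.sqrt(w1 + w2)
    a, b = math.sqrt(w1) / s, math.sqrt(w2) / s
    dc = a * dc1 + b * dc2
    ac = -b * dc1 + a * dc2
    return dc, ac


def raht_inverse(dc, ac, w1, w2):
    """Recover the two child DC coefficients (transpose of the forward rotation)."""
    s = math.sqrt(w1 + w2)
    a, b = math.sqrt(w1) / s, math.sqrt(w2) / s
    return a * dc - b * ac, b * dc + a * ac


dc, ac = raht_forward(4.0, 1, 4.0, 1)
print(dc, ac)  # equal attributes give a zero AC coefficient
```

Because the matrix is orthonormal, the inverse is simply its transpose, which is why the decoder can rebuild the child coefficients exactly from the parent DC and the transmitted AC.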
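In another RAHT intra prediction method, RAHT is adapted to transform downward level by level from the largest parent block. Initially, the DC coefficient of the largest parent block is defined as
Figure PCTCN2022076368-appb-000012
where w is the number of points in the entire parent block, a_i is the attribute value of each point, and the AC coefficients are 0. The DC and AC coefficients of the sub-blocks are computed through the transform matrix; in the example above, that is:
Figure PCTCN2022076368-appb-000013
The resulting DC coefficients continue to be used in the transform at the next level, and the AC coefficients are encoded directly and written into the bitstream. During the level-by-level transform from parent blocks to sub-blocks, the transform coefficients of a sub-block can be computed if the attribute value and the number of points of each sub-block are known. A prediction method is therefore introduced to predict the attribute values of the sub-blocks. For each sub-block of a parent block, the average attribute values of 19 parent blocks are obtained: the parent block containing the sub-block, the 6 same-level parent blocks that share a face with that parent block, and the 12 same-level parent blocks that share an edge with it. The distances between the sub-block and these 19 parent blocks are computed, and the predicted average attribute value of the sub-block is:
Figure PCTCN2022076368-appb-000014
where a_up is the average attribute value of the sub-block to be predicted, k is the number of non-empty parent blocks among the 19 parent blocks, d_k is the distance between the current sub-block and a non-empty parent block, and a_k is the average attribute value of a non-empty parent block.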
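The linear weighting over the non-empty parent blocks can be illustrated with an inverse-distance weighted average. The exact formula is in an image that is not reproduced here, so the inverse-distance form below is an assumption chosen to match the symbol definitions (a_k, d_k), and the function name is hypothetical.

```python
def predict_mean_attribute(parent_attrs, distances):
    """Inverse-distance weighted average of the non-empty parent blocks.

    parent_attrs: mean attribute of each candidate parent block, None if the block is empty.
    distances:    distance from the current sub-block to each candidate (must be > 0).
    Returns None when every candidate is empty.
    """
    num = 0.0
    den = 0.0
    for attr, dist in zip(parent_attrs, distances):
        if attr is None:
            continue  # empty parent blocks do not contribute
        if dist <= 0:
            raise ValueError("distances must be positive")
        num += attr / dist
        den += 1.0 / dist
    if den == 0.0:
        return None
    return num / den


print(predict_mean_attribute([10.0, None, 20.0], [1.0, 2.0, 3.0]))
```

A fixed rule of this kind cannot adapt to local structure in the attributes, which is the limitation the learned predictor described later is meant to address.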
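Prediction yields the average predicted attribute value of each sub-block. One RAHT transform is applied to the predicted attribute values to obtain a series of predicted transform coefficients, that is:
Figure PCTCN2022076368-appb-000015
The true transform coefficients obtained by transforming the true attribute values are computed, that is:
Figure PCTCN2022076368-appb-000016
The difference between the true transform coefficients and the predicted transform coefficients is computed, that is:
Figure PCTCN2022076368-appb-000017
The residuals of the AC coefficients are quantized and entropy encoded directly, and the DC coefficients continue to be used in the transform at the next level. In this way, what is finally quantized and entropy encoded is the difference between the true transform coefficients and the predicted transform coefficients.
In the above process of predicting the attribute value of a sub-block from parent blocks, only a simple linear weighting of the attribute values of 19 parent blocks (the parent block itself, the 6 same-level parent blocks sharing a face with it, and the 12 same-level parent blocks sharing an edge with it) is used to predict the average attribute value of the sub-block. As a result, the average attribute value of the sub-block is not very accurate and is not very stable, which affects encoding and decoding efficiency.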
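In view of this, the embodiments of the present application provide a prediction method. The reference block of the current sub-block in the hierarchical structure of the point cloud is determined first, and the information of the reference block and/or the information of the current sub-block is then input into a neural network model to obtain the predicted attribute value of the current sub-block. The reference block includes at least one parent block associated, at the same level, with the parent block of the current sub-block, and/or at least one sub-block associated, at the same level, with the current sub-block. The training data of the neural network model includes information of the reference blocks of sub-blocks and the true attribute values of the sub-blocks. Because the reference blocks used to predict the current sub-block can be selected flexibly, and because of the expressive power of the neural network model, the embodiments of the present application can help improve the accuracy and stability of the average attribute value of the current sub-block.
The embodiments of the present application also provide an encoding method that encodes using the predicted attribute value of the current sub-block obtained by the above prediction method, yielding the bitstream of the point cloud. Specifically, the predicted transform coefficient of the current sub-block can be determined from the predicted attribute value of the current sub-block and the number of points in the current sub-block, and the true transform coefficient of the current sub-block can be determined from the true attribute value of the current sub-block and the number of points in the current sub-block; the difference between the predicted transform coefficient and the true transform coefficient is then determined and written into the bitstream.
The embodiments of the present application also provide a decoding method that decodes using the predicted attribute value of the current sub-block obtained by the above prediction method, yielding the attribute information of the point cloud. Specifically, the difference between the predicted transform coefficient and the true transform coefficient of the current sub-block can be obtained from the bitstream, the true transform coefficient of the current sub-block can be determined from the predicted attribute value of the current sub-block and the difference, and the true attribute value of the current sub-block can be determined from the true transform coefficient and the number of points in the current sub-block.
In the embodiments of the present application, encoding and decoding use the predicted attribute value of the current sub-block obtained by the above prediction method; with the accuracy and stability of the average attribute value of the current sub-block improved, encoding and decoding efficiency can be further improved.
The embodiments of the present application also provide a point cloud processing method in which the point cloud is upsampled to obtain the position information of newly added points, and the attribute information of the newly added points is obtained by the above prediction method, which can help increase the density of the point cloud.
The technical solutions provided in the embodiments of the present application are described in detail below with reference to the accompanying drawings.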
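Figure 5 shows a schematic flowchart of a prediction method 300 provided in an embodiment of the present application. The prediction method 300 can be applied in the encoder 100 shown in Figure 3 or in the decoder 200 shown in Figure 4 to perform compression encoding and decoding of a point cloud. As shown in Figure 5, method 300 includes steps 310 to 330.
310: obtain a hierarchical structure of a point cloud, where the hierarchical structure includes a parent block and at least one sub-block of the parent block.
For example, the hierarchical structure may include at least one of an octree structure, a quadtree structure, a binary tree structure, and a non-uniform spatial partitioning structure. For the octree structure, see the description of Figures 1 and 2 above, which is not repeated here.
320: determine a reference block of a current sub-block in the hierarchical structure, where the reference block includes at least one parent block associated, at the same level, with the parent block of the current sub-block, and/or at least one sub-block associated, at the same level, with the current sub-block. The reference block is used to predict the predicted attribute value of the current sub-block.
For example, the current sub-block is the block whose predicted attribute value is currently to be predicted; it may be a sub-cube in the octree structure, without limitation. The current sub-block may also be called the sub-block to be predicted, without limitation.
For example, the at least one parent block associated, at the same level, with the parent block of the current sub-block may include at least one of the following:
the parent block of the current sub-block; a parent block at the same level that shares a face with the parent block of the current sub-block; a parent block at the same level that shares an edge with the parent block of the current sub-block; a parent block at the same level that shares a vertex with the parent block of the current sub-block; a parent block at the same level that is two parent blocks away from the parent block of the current sub-block in the positive or negative direction of the x axis; a parent block at the same level that is two parent blocks away in the positive or negative direction of the y axis; and a parent block at the same level that is two parent blocks away in the positive or negative direction of the z axis.
Here, the at least one parent block associated, at the same level, with the parent block of the current sub-block may be called the range of reference parent blocks used for prediction.
For example, the at least one sub-block associated, at the same level, with the current sub-block may include at least one of the following:
a sub-block at the same level that shares a face with the current sub-block; a sub-block at the same level that shares an edge with the current sub-block; a sub-block at the same level that shares a vertex with the current sub-block; a sub-block at the same level that is two sub-blocks away from the current sub-block in the positive or negative direction of the x axis; a sub-block at the same level that is two sub-blocks away in the positive or negative direction of the y axis; and a sub-block at the same level that is two sub-blocks away in the positive or negative direction of the z axis.
Here, the at least one sub-block associated, at the same level, with the current sub-block may be called the range of reference sub-blocks used for prediction. The range of reference parent blocks used for prediction together with the range of reference sub-blocks used for prediction forms the range of reference blocks used for prediction.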
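The neighbor categories listed above can be enumerated as integer offsets on the block grid of one level. The sketch below is illustrative only; the function name and the group labels are hypothetical. An offset with one non-zero component shares a face, two non-zero components share an edge, and three share a vertex.

```python
from itertools import product


def reference_offsets(include_axis_distance_two=False):
    """Group the 27 offsets in {-1, 0, 1}^3 by adjacency type.

    Optionally add the 6 offsets that lie two blocks away along one axis.
    """
    names = {0: "self", 1: "face", 2: "edge", 3: "vertex"}
    groups = {name: [] for name in names.values()}
    for offset in product((-1, 0, 1), repeat=3):
        nonzero = sum(1 for c in offset if c != 0)
        groups[names[nonzero]].append(offset)
    if include_axis_distance_two:
        groups["axis2"] = [
            tuple(sign * 2 if i == axis else 0 for i in range(3))
            for axis in range(3)
            for sign in (-1, 1)
        ]
    return groups


groups = reference_offsets(include_axis_distance_two=True)
print({name: len(offsets) for name, offsets in groups.items()})
```

Adding an offset to the grid coordinates of the parent block (or of the current sub-block) gives the coordinates of one candidate reference block at the same level.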
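It should be noted that, when encoding and decoding are carried out hierarchically, the information of all the reference parent blocks used to predict the attribute value of the current sub-block is available at the decoder, whereas the information of only some of the reference sub-blocks is available. Which sub-blocks are available can be determined, for example, by the traversal order used in the hierarchical structure during decoding, which is not limited in the present application. As an optional implementation, during decoding the reference sub-blocks may be selected from among the available sub-blocks, which is not limited in the present application.
In some optional embodiments, when a reference block contains no points, the attribute value of the reference block is 0. That is, a parent block within the range of reference parent blocks used for prediction is treated as having an attribute value of 0 if it contains no points, and/or a sub-block within the range of reference sub-blocks used for prediction is treated as having an attribute value of 0 if it contains no points. A reference block that contains no points may be called a reference block with empty occupancy.
330: input the information of the reference block and/or the information of the current sub-block into a neural network model to obtain the predicted attribute value of the current sub-block, where the training data of the neural network model includes information of the reference blocks of sub-blocks and the true attribute values of the sub-blocks. The predicted attribute value may also be called the average attribute value, which is not limited in the present application.
For example, the neural network model may include a multilayer perceptron and/or a Transformer.
For example, by training the neural network model on the training data, that is, the information of the reference blocks of sub-blocks and the true attribute values of the sub-blocks, the predicted attribute value that the trained model produces for the current sub-block can be made as close as possible to its true attribute value, which can help improve the accuracy and stability of the average attribute value of the current sub-block.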
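In some embodiments, when the prediction method provided in the embodiments of the present application is applied to point cloud encoding and decoding, the training objective of the neural network model may be adjusted with data stability in mind, for example so that the stability of the prediction is also taken into account.
In some embodiments, the output of the neural network model may be the predicted attribute value of the current sub-block.
In other embodiments, an initial predicted value may also be determined, and the output of the neural network model may be the difference between this initial predicted value and the true attribute value. For example, the initial predicted value may be computed from the parent block of the current sub-block and at least one parent block associated with that parent block at the same level, which is not limited in the present application.
Therefore, in the embodiments of the present application, the reference block of the current sub-block in the hierarchical structure of the point cloud is determined, and the information of the reference block and/or the information of the current sub-block is then input into the neural network model to obtain the predicted attribute value of the current sub-block. The reference block includes at least one parent block associated, at the same level, with the parent block of the current sub-block, and/or at least one sub-block associated, at the same level, with the current sub-block, and the training data of the neural network model includes information of the reference blocks of sub-blocks and the true attribute values of the sub-blocks. Because the reference blocks used to predict the current sub-block can be selected flexibly, and because of the expressive power of the neural network model, the embodiments of the present application can help improve the accuracy and stability of the average attribute value of the current sub-block.
In some optional embodiments, two parameters m and k may be defined, where m ≤ k and both m and k are positive integers. The parameters m and k can be used to decide whether to predict the attribute value of the current sub-block, or to decide how the predicted attribute value of the current sub-block is obtained. For example, the following 3 cases may arise.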
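Case 1
When the number of non-empty reference blocks among the reference blocks of the current sub-block (that is, within the range of reference blocks used for prediction) is less than m, the attribute value of the current sub-block is not predicted. A non-empty reference block contains at least one point.
Case 2
When the number of non-empty reference blocks among the reference blocks of the current sub-block (that is, within the range of reference blocks used for prediction) is greater than or equal to k, k of these non-empty reference blocks are selected, and the information of the k non-empty reference blocks and/or the information of the current sub-block is input into the neural network model to obtain the predicted attribute value of the current sub-block.
As a possible implementation, the k non-empty reference blocks may be determined according to distance information between the non-empty reference blocks and the current sub-block. For example, the distance information may include the Euclidean distance and/or the Manhattan distance, without limitation. For instance, based on the Euclidean distance and/or Manhattan distance between the non-empty reference blocks and the current sub-block, the k non-empty parent blocks and/or non-empty sub-blocks closest to the current sub-block may be selected as the reference blocks used for prediction.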
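Selecting the k nearest non-empty reference blocks can be sketched as follows. The data layout is a hypothetical one chosen for illustration: each candidate is a pair of a block-center position and a mean attribute, with None marking an empty block.

```python
import math


def select_k_nearest(candidates, current_pos, k, metric="euclidean"):
    """Return the k non-empty candidates closest to the current sub-block.

    candidates:  list of (position, attribute); attribute is None for empty blocks.
    current_pos: (x, y, z) position of the current sub-block.
    metric:      "euclidean" or "manhattan".
    Each result is (position, attribute, distance), nearest first.
    """
    def distance(pos):
        deltas = [abs(p - c) for p, c in zip(pos, current_pos)]
        if metric == "manhattan":
            return float(sum(deltas))
        return math.sqrt(sum(d * d for d in deltas))

    non_empty = [(pos, attr, distance(pos)) for pos, attr in candidates if attr is not None]
    non_empty.sort(key=lambda item: item[2])  # stable sort keeps input order on ties
    return non_empty[:k]


candidates = [((1, 0, 0), 10.0), ((0, 2, 0), None), ((1, 1, 0), 20.0), ((3, 0, 0), 30.0)]
print(select_k_nearest(candidates, (0, 0, 0), k=2))
```

Ties are broken by input order here; an encoder and decoder would need to agree on a deterministic tie-breaking rule so that both select the same reference blocks.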
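Case 3
When the number of non-empty reference blocks among the reference blocks of the current sub-block (that is, within the range of reference blocks used for prediction) is greater than or equal to m and less than k, the reference blocks with empty occupancy may be interpolated to obtain k non-empty reference blocks, and the information of the k non-empty reference blocks and/or the information of the current sub-block is then input into the neural network model to obtain the predicted attribute value of the current sub-block.
As a possible implementation, at least one kind of non-empty block that shares a face, an edge, or a vertex at the same level with a first reference block with empty occupancy among the reference blocks may be obtained, and the first reference block is then interpolated from these non-empty blocks to obtain the k non-empty reference blocks.
For example, the reference blocks within the range of reference blocks used for prediction may be interpolated one by one in some priority order until k non-empty reference blocks are reached. A block to be interpolated, such as the first reference block above, may be interpolated with reference to one or more non-empty reference blocks at the same level that share a face, an edge, or a vertex with it.
As another possible implementation, the weighted average of the attribute values of n1 non-empty reference blocks among the reference blocks may be determined from the attribute values of the n1 non-empty reference blocks and their distances to a second reference block with empty occupancy, that is:
Figure PCTCN2022076368-appb-000018
where n1 is a positive integer, a_i is the attribute value of the i-th of the n1 non-empty reference blocks, d_i is the distance from the i-th of the n1 non-empty reference blocks to the current second reference block, and m ≤ n1 ≤ k. The second reference block is then interpolated according to this weighted average to obtain the k non-empty reference blocks.
For example, based on the n1 non-empty reference blocks already obtained, the reference blocks within the range of reference blocks used for prediction may be interpolated one by one in some priority order until k non-empty reference blocks are reached. For a block to be interpolated, such as the second reference block above, the weighted average of the attribute values may be computed from the attribute values of the n1 non-empty reference blocks already obtained and their distances to the reference block currently being interpolated, and used as the final interpolation result.
As another possible implementation, a first average, namely the average of the attribute values of n2 non-empty reference blocks among the reference blocks, and a second average, namely the average of the distances from the n2 non-empty reference blocks to a third reference block with empty occupancy, may be determined. The third reference block is then interpolated according to the first average and the second average to obtain the k non-empty reference blocks, where the attribute value of the third reference block is the first average and the distance from the third reference block to the current sub-block is the second average.
For example, from the n2 non-empty reference blocks already obtained, the average attribute value of the n2 non-empty reference blocks and their average distance to the current sub-block may be computed; the remaining (k-n2) reference blocks (an example of the third reference block) are then all assigned this average distance as their distance to the current sub-block and this average attribute value as their attribute value.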
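The last variant, in which the missing reference blocks are filled with the average attribute value and the average distance, can be sketched as below. The function name and the return convention (None for case 1, a pair of lists otherwise) are hypothetical, and the input lists are assumed to be sorted nearest first so that truncation keeps the closest blocks.

```python
def pad_to_k(attrs, dists, k, m):
    """Bring the list of non-empty reference blocks to exactly k entries.

    attrs, dists: attribute values and distances (to the current sub-block)
                  of the n non-empty reference blocks, nearest first.
    Returns None if n < m (case 1: no prediction); the first k entries if n >= k
    (case 2); otherwise pads with the mean attribute and mean distance (case 3).
    """
    if m < 1 or k < m:
        raise ValueError("require 1 <= m <= k")
    n = len(attrs)
    if n != len(dists):
        raise ValueError("attrs and dists must have the same length")
    if n < m:
        return None
    if n >= k:
        return list(attrs[:k]), list(dists[:k])
    mean_attr = sum(attrs) / n
    mean_dist = sum(dists) / n
    pad = k - n
    return list(attrs) + [mean_attr] * pad, list(dists) + [mean_dist] * pad


print(pad_to_k([10.0, 20.0], [1.0, 3.0], k=4, m=2))
```

Padding to a fixed k keeps the input to the network at a constant size regardless of how sparse the neighborhood is.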
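In some optional embodiments, at least one of the attribute values of the reference blocks, the distance information between the reference blocks and the current sub-block, and auxiliary information of the current sub-block may be input into the neural network model to obtain the predicted attribute value of the current sub-block.
For example, the auxiliary information includes at least one of: the three-dimensional spatial coordinates of the current sub-block, the three-dimensional spatial coordinates of the parent block of the current sub-block, the relative position of the current sub-block within its parent block, the transform level of the current sub-block, the size information of the current sub-block, the size of the parent block of the current sub-block, and the spatial distribution of the sub-blocks within the parent block of the current sub-block.
In some embodiments, the auxiliary information may be empty, which is not limited in the present application.
In some optional embodiments, it may also be determined that at least one of the following satisfies a preset condition: the number of non-empty blocks among the reference blocks, the ratio of the number of non-empty blocks to the total number of reference blocks, and the attribute variability of the non-empty blocks among the reference blocks. That is, the prediction of method 300 is used to predict the attribute value of the current sub-block only when at least one of these quantities satisfies the preset condition. Optionally, when at least one of these quantities does not satisfy the preset condition, the predicted attribute value of the current sub-block may be determined in another way, for example by a prior-art method.
Optionally, a dynamic switching flag may be defined. When the predicted attribute value of the current sub-block is to be determined by the prediction method provided in the embodiments of the present application, the dynamic switching flag may be set to 1; otherwise it is set to 0. As an example, when method 300 is used for attribute encoding, the dynamic switching flag may be written into the bitstream.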
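How the switching flag might be derived from the three quantities can be sketched as follows. The text does not give the preset conditions, so the thresholds, the use of the value range as the measure of attribute variability, and the function name are all hypothetical placeholders.

```python
def switching_flag(ref_attrs, min_count=4, min_ratio=0.25, max_spread=64.0):
    """Return 1 to use the neural predictor, 0 to fall back to another method.

    ref_attrs: attribute of every reference block in range, None for empty blocks.
    The three thresholds are illustrative placeholders, not values from the source.
    The flag is 1 when at least one condition holds, following the wording above.
    """
    if not ref_attrs:
        return 0
    non_empty = [a for a in ref_attrs if a is not None]
    count_ok = len(non_empty) >= min_count
    ratio_ok = len(non_empty) / len(ref_attrs) >= min_ratio
    # Attribute variability is measured here as max - min over non-empty blocks.
    spread_ok = bool(non_empty) and (max(non_empty) - min(non_empty)) <= max_spread
    return 1 if (count_ok or ratio_ok or spread_ok) else 0


print(switching_flag([100.0] * 12 + [None] * 15))
```

Whatever conditions are chosen, the decoder either has to evaluate the same conditions on the same data or read the flag from the bitstream, so that both sides use the same predictor for each sub-block.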
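Figure 6 shows a specific example of a prediction method according to an embodiment of the present application. As shown in Figure 6, the reference blocks of the current sub-block, that is, the range of reference blocks used for prediction, include the parent block of the current sub-block (1 block), the parent blocks at the same level that share a face with it (6 blocks), the parent blocks at the same level that share an edge with it (12 blocks), and the parent blocks at the same level that share a vertex with it (8 blocks), giving 27 parent blocks in total. Optionally, 12 non-empty parent blocks may be selected from these 27 parent blocks as reference blocks (that is, k = 12). The selection is based on the Euclidean distance between a reference block and the current sub-block; for example, the 12 parent blocks with the smallest Euclidean distance to the current sub-block may be selected. Optionally, when the number of non-empty parent blocks among the 27 parent blocks is greater than or equal to 12, no interpolation is needed. For example, the Euclidean distances between the selected non-empty parent blocks and the current sub-block may be computed, giving 12 elements. Optionally, the auxiliary information of the current sub-block may include the relative position of the current sub-block within its parent block and the side length of the current sub-block. Together with the attribute values of the 12 selected parent blocks and the 12 distances, this gives 26 elements in total. As shown in Figure 6, these 26 elements may be concatenated into a vector and input into a multilayer perceptron with three hidden layers. Correspondingly, the output of the network is the predicted attribute value of the current sub-block.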
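The 26-element input and the three-hidden-layer perceptron can be sketched in plain Python. The source gives only the input size and the number of hidden layers, so the hidden width of 64, the ReLU activation, the encoding of the relative position as a single child index, and the random untrained weights are all assumptions made for illustration.

```python
import random


def build_input(attrs, dists, rel_pos, side):
    """Concatenate 12 attributes, 12 distances, and 2 auxiliary values into 26 inputs."""
    if len(attrs) != 12 or len(dists) != 12:
        raise ValueError("expected 12 attribute values and 12 distances")
    return [float(v) for v in attrs] + [float(v) for v in dists] + [float(rel_pos), float(side)]


def make_mlp(layer_sizes, seed=0):
    """Create untrained weights for a fully connected network (illustrative only)."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params


def mlp_forward(params, x):
    """Forward pass: ReLU on hidden layers, linear output; returns a single float."""
    for layer, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if layer < len(params) - 1:
            x = [max(0.0, v) for v in x]
    return x[0]


x = build_input([128.0] * 12, [1.0] * 12, rel_pos=3, side=2.0)
model = make_mlp([26, 64, 64, 64, 1])  # three hidden layers of hypothetical width 64
print(len(x), mlp_forward(model, x))
```

In practice the weights would come from training on pairs of reference-block information and true sub-block attribute values, and a framework such as PyTorch or JAX would replace the hand-written loops.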
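Therefore, by flexibly selecting the reference blocks used to predict the current sub-block and by drawing on the expressive power of the neural network model, the embodiments of the present application can help improve the accuracy and stability of the average attribute value of the current sub-block.
Figure 7 shows a schematic flowchart of an encoding method 500 provided in an embodiment of the present application. The encoding method 500 can be applied in the encoder 100 shown in Figure 3; for example, the geometry information and attribute information of a point cloud can be input into the encoder 100 to perform compression encoding of the point cloud. As shown in Figure 7, method 500 includes steps 510 to 550.
510: obtain the predicted attribute value of the current sub-block.
For example, the predicted attribute value of the current sub-block may be obtained by method 300 shown in Figure 5. For details, see the description of Figures 5 and 6, which is not repeated here.
520: determine the predicted transform coefficient of the current sub-block according to the predicted attribute value of the current sub-block and the number of points in the current sub-block.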
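As a specific example, method 300 may be combined with RAHT for point cloud encoding. In the layer-by-layer RAHT transform from parent blocks to sub-blocks, the DC coefficient of the largest parent block is initially defined as
Figure PCTCN2022076368-appb-000019
where w is the number of points in the entire parent block, a_i is the attribute value of each point, and the AC coefficients are 0. The DC and AC coefficients of the sub-blocks are computed through the transform matrix, that is:
Figure PCTCN2022076368-appb-000020
The resulting DC coefficients continue to be used in the transform at the next level, and the AC coefficients are encoded directly. During the level-by-level transform from parent blocks to sub-blocks, the transform coefficients of a sub-block can be computed if the attribute value and the number of points of each sub-block are known. With the prediction method provided in the embodiments of the present application, the output value of the network can be rounded to obtain the predicted attribute value a_up of the sub-block. Prediction yields the predicted attribute value of each sub-block, and one RAHT transform is applied to these predicted attribute values to obtain a series of predicted transform coefficients, that is:
Figure PCTCN2022076368-appb-000021
530: determine the true transform coefficient of the current sub-block according to the true attribute value of the current sub-block and the number of points in the current sub-block.
For example, the true transform coefficients obtained by transforming the true attribute values are computed, that is:
Figure PCTCN2022076368-appb-000022
540: determine the difference between the predicted transform coefficient and the true transform coefficient.
For example, the difference between the true transform coefficients and the predicted transform coefficients is computed, that is:
Figure PCTCN2022076368-appb-000023
550: write the difference into the bitstream.
For example, the residuals of the AC coefficients are written into the bitstream, for instance after direct quantization and entropy encoding, and the DC coefficients continue to be used in the transform at the next level. In this way, what is finally quantized and entropy encoded is the difference between the true transform coefficients and the predicted transform coefficients.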
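Steps 520 to 550 can be illustrated for the simplest case of two sibling sub-blocks. This is a sketch under stated assumptions, not the patent's formulas, which are in images not reproduced here: a block's DC coefficient is taken to be sqrt(w) times its mean attribute, the two-point orthonormal RAHT butterfly is used, the residual is true minus predicted, and quantization is plain rounding with a hypothetical step size. The function names are also hypothetical.

```python
import math


def pair_coefficients(a1, w1, a2, w2):
    """Return (parent DC, AC) for two sibling sub-blocks.

    a1, a2: mean attribute values of the sub-blocks; w1, w2: their point counts.
    Assumption: each sub-block's DC coefficient is sqrt(w) times its mean attribute.
    """
    dc1, dc2 = math.sqrt(w1) * a1, math.sqrt(w2) * a2
    s = math.sqrt(w1 + w2)
    a, b = math.sqrt(w1) / s, math.sqrt(w2) / s
    return a * dc1 + b * dc2, -b * dc1 + a * dc2


def encode_ac_residual(true_attrs, pred_attrs, counts, qstep=1.0):
    """Quantized difference between the true and the predicted AC coefficient."""
    w1, w2 = counts
    _, ac_true = pair_coefficients(true_attrs[0], w1, true_attrs[1], w2)
    _, ac_pred = pair_coefficients(pred_attrs[0], w1, pred_attrs[1], w2)
    return round((ac_true - ac_pred) / qstep)


print(encode_ac_residual((10.0, 14.0), (11.0, 13.0), (1, 1)))
```

The better the predicted attribute values, the closer the residual is to zero and the fewer bits the entropy coder needs, which is where the prediction accuracy turns into coding gain.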
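Therefore, the embodiments of the present application encode using the predicted attribute value of the current sub-block obtained by the above prediction method; with the accuracy and stability of the average attribute value of the current sub-block improved, encoding efficiency can be further improved.
Figure 8 shows a schematic flowchart of a decoding method 600 provided in an embodiment of the present application. The decoding method 600 can be applied in the decoder 200 shown in Figure 4; for example, the geometry bitstream and attribute bitstream of a point cloud can be input into the decoder 200 to decode the point cloud. As shown in Figure 8, method 600 includes steps 610 to 640.
610: obtain the predicted attribute value of the current sub-block.
For example, the predicted attribute value of the current sub-block may be obtained by method 300 shown in Figure 5. For details, see the description of Figures 5 and 6, which is not repeated here.
620: obtain, from the bitstream, the difference between the predicted transform coefficient and the true transform coefficient of the current sub-block.
630: determine the true transform coefficient of the current sub-block according to the predicted attribute value and the difference.
For example, determining the true transform coefficient of the current sub-block from the predicted attribute value and the difference is the inverse of computing the difference between the predicted transform coefficient and the true transform coefficient of the current sub-block in method 500. For details, see the description of Figure 7, which is not repeated here.
640: determine the true attribute value of the current sub-block according to the true transform coefficient and the number of points in the current sub-block.
For example, determining the true attribute value of the current sub-block from the true transform coefficient and the number of points in the current sub-block is the inverse of computing the true transform coefficient in method 500. For details, see the description of Figure 7, which is not repeated here.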
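The decoder side of the same two-sub-block case can be sketched as follows. The same assumptions apply as for the usual RAHT description: a block's DC coefficient is sqrt(w) times its mean attribute, the butterfly is the orthonormal two-point rotation, and the residual was formed as true minus predicted. The function name and the dequantization by a step size are hypothetical.

```python
import math


def decode_pair(parent_dc, residual, pred_attrs, counts, qstep=1.0):
    """Reconstruct the mean attributes of two sibling sub-blocks.

    parent_dc:  DC coefficient of the parent block, already decoded.
    residual:   decoded AC residual (true AC minus predicted AC, in units of qstep).
    pred_attrs: predicted mean attributes of the two sub-blocks.
    counts:     numbers of points (w1, w2) in the two sub-blocks.
    """
    w1, w2 = counts
    s = math.sqrt(w1 + w2)
    a, b = math.sqrt(w1) / s, math.sqrt(w2) / s
    # Predicted AC coefficient from the predicted attributes (step 630, first half).
    ac_pred = -b * math.sqrt(w1) * pred_attrs[0] + a * math.sqrt(w2) * pred_attrs[1]
    # True AC coefficient = prediction + dequantized residual (step 630).
    ac = ac_pred + residual * qstep
    # Inverse butterfly, then divide by sqrt(w) to get mean attributes (step 640).
    dc1 = a * parent_dc - b * ac
    dc2 = b * parent_dc + a * ac
    return dc1 / math.sqrt(w1), dc2 / math.sqrt(w2)


print(decode_pair(26.0, 0.0, (10.0, 14.0), (1, 3)))
```

The reconstructed child DC coefficients then serve as the parent DC values for the next level down, so the same routine is applied level by level.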
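In some optional embodiments, when point cloud encoding and decoding use the prediction method provided in the embodiments of the present application, such as method 300 (for example in method 500 or 600), prediction method 300 may be switched dynamically with other methods of predicting the attribute value of the current sub-block. For example, such another method may select 19 reference blocks, namely 1 parent block containing the current sub-block, 6 parent blocks at the same level that share a face with that parent block, and 12 parent blocks at the same level that share an edge with that parent block, and compute the weighted average of the attribute values of these 19 reference blocks. The condition for dynamic switching may include, for example, that at least one of the following satisfies a preset condition: the number of non-empty blocks among the reference blocks, the ratio of the number of non-empty blocks to the total number of reference blocks, and the attribute variability of the non-empty blocks among the reference blocks.
Optionally, a dynamic switching flag may be defined. During encoding and decoding, the switching flag may be set to 1 when the prediction method needs to be switched, and to 0 otherwise. In this case, the dynamic switching flag needs to be written into the bitstream.
In some optional embodiments, during point cloud encoding and decoding, the true attribute value of every block in the first n layers of the hierarchical structure may be encoded, quantized, and entropy encoded into the bitstream. Starting from layer n+1 of the hierarchical structure, the prediction method provided in the embodiments of the present application, such as method 300, is used to predict the attribute value of each sub-block, the difference between the true attribute value of the sub-block and the predicted attribute value is computed, and this difference is quantized and entropy encoded to form the attribute bitstream of the point cloud.
In some optional embodiments, when the hierarchical structure is an octree structure, the attribute encoding and decoding process described above can be embedded directly in the geometry encoding and decoding process, so that geometry and attributes can be encoded and decoded synchronously, level by level, at the decoder and the encoder.
The embodiments of the present application also provide a point cloud processing method in which the point cloud is upsampled to obtain the position information of newly added points, and the attribute information of the newly added points is then obtained by the prediction method provided in the embodiments of the present application, such as method 300.
Specifically, the resolution of a point cloud can be described as the spatial density of its points: the more points within the same volume, the better the subjective quality of the point cloud. Upsampling can make the point cloud denser, for example through some operation at the decoder. Upsampling yields the position information of the newly added points, and the prediction method provided in the embodiments of the present application can then yield their predicted attribute values. For example, a newly added point may fall in one of the smallest sub-blocks of size 1×1×1, and its attribute value after upsampling can be obtained by the prediction method provided in the embodiments of the present application, such as method 300. The embodiments of the present application can therefore help increase the density of the point cloud.
The method embodiments of the present application have been described in detail above with reference to Figures 5 to 8; the apparatus embodiments are described in detail below with reference to Figures 9 to 12.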
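Figure 9 is a schematic block diagram of a prediction apparatus 900 according to an embodiment of the present application. As shown in Figure 9, the apparatus 900 may include an acquisition unit 910, a processing unit 920, and a neural network model 930.
The acquisition unit 910 is configured to obtain a hierarchical structure of a point cloud, where the hierarchical structure includes a parent block and at least one sub-block of the parent block.
The processing unit 920 is configured to determine a reference block of a current sub-block in the hierarchical structure, where the reference block includes at least one parent block associated, at the same level, with the parent block of the current sub-block, and/or at least one sub-block associated, at the same level, with the current sub-block.
The neural network model 930 is configured to receive information of the reference block and/or information of the current sub-block as input and to obtain a predicted attribute value of the current sub-block, where the training data of the neural network model includes information of the reference blocks of sub-blocks and the true attribute values of the sub-blocks.
Optionally, the at least one parent block associated, at the same level, with the parent block of the current sub-block includes at least one of the following:
the parent block of the current sub-block; a parent block at the same level that shares a face with the parent block of the current sub-block; a parent block at the same level that shares an edge with the parent block of the current sub-block; a parent block at the same level that shares a vertex with the parent block of the current sub-block; a parent block at the same level that is two parent blocks away from the parent block of the current sub-block in the positive or negative direction of the x axis; a parent block at the same level that is two parent blocks away in the positive or negative direction of the y axis; and a parent block at the same level that is two parent blocks away in the positive or negative direction of the z axis.
Optionally, the at least one sub-block associated, at the same level, with the current sub-block includes at least one of the following:
a sub-block at the same level that shares a face with the current sub-block; a sub-block at the same level that shares an edge with the current sub-block; a sub-block at the same level that shares a vertex with the current sub-block; a sub-block at the same level that is two sub-blocks away from the current sub-block in the positive or negative direction of the x axis; a sub-block at the same level that is two sub-blocks away in the positive or negative direction of the y axis; and a sub-block at the same level that is two sub-blocks away in the positive or negative direction of the z axis.
Optionally, the processing unit 920 is further configured to:
determine that the number of non-empty reference blocks among the reference blocks is greater than or equal to k, where a non-empty reference block contains at least one point and k is a positive integer;
determine k non-empty reference blocks among the non-empty reference blocks of the reference blocks;
the neural network model 930 is specifically configured to receive the information of the k non-empty reference blocks and/or the information of the current sub-block as input and to obtain the predicted attribute value of the current sub-block.
Optionally, the processing unit 920 is specifically configured to:
determine the k non-empty reference blocks according to distance information between the non-empty reference blocks of the reference blocks and the current sub-block.
Optionally, the distance information includes the Euclidean distance and/or the Manhattan distance.
Optionally, the processing unit 920 is further configured to:
determine that the number of non-empty reference blocks among the reference blocks is greater than or equal to m and less than k, where a non-empty reference block contains at least one point, m and k are positive integers, and m ≤ k;
interpolate the reference blocks with empty occupancy among the reference blocks to obtain k non-empty reference blocks;
the neural network model 930 is specifically configured to receive the information of the k non-empty reference blocks and/or the information of the current sub-block as input and to obtain the predicted attribute value of the current sub-block.
Optionally, the processing unit 920 is specifically configured to:
obtain at least one kind of non-empty block that shares a face, an edge, or a vertex at the same level with a first reference block with empty occupancy among the reference blocks;
interpolate the first reference block according to the non-empty block to obtain the k non-empty reference blocks.
Optionally, the processing unit 920 is specifically configured to:
determine the weighted average of the attribute values of n1 non-empty reference blocks among the reference blocks according to the attribute values of the n1 non-empty reference blocks and the distances from the n1 non-empty reference blocks to a second reference block with empty occupancy, where n1 is a positive integer and m ≤ n1 ≤ k;
interpolate the second reference block according to the weighted average to obtain the k non-empty reference blocks.
Optionally, the processing unit 920 is specifically configured to:
determine a first average of the attribute values of n2 non-empty reference blocks among the reference blocks;
determine a second average of the distances from the n2 non-empty reference blocks to a third reference block with empty occupancy;
interpolate the third reference block according to the first average and the second average to obtain the k non-empty reference blocks, where the attribute value of the third reference block is the first average and the distance from the third reference block to the current sub-block is the second average.
Optionally, the neural network model 930 is specifically configured to receive as input at least one of the attribute values of the reference blocks, the distance information between the reference blocks and the current sub-block, and the auxiliary information of the current sub-block, and to obtain the predicted attribute value of the current sub-block.
Optionally, the auxiliary information includes at least one of: the three-dimensional spatial coordinates of the current sub-block, the three-dimensional spatial coordinates of the parent block of the current sub-block, the relative position of the current sub-block within its parent block, the transform level of the current sub-block, the size information of the current sub-block, the size of the parent block of the current sub-block, and the spatial distribution of the sub-blocks within the parent block of the current sub-block.
Optionally, the processing unit 920 is further configured to:
determine that at least one of the number of non-empty blocks among the reference blocks, the ratio of the number of non-empty blocks to the total number of reference blocks, and the attribute variability of the non-empty blocks among the reference blocks satisfies a preset condition.
Optionally, when a reference block contains no points, the attribute value of the reference block is 0.
Optionally, the hierarchical structure includes at least one of an octree structure, a quadtree structure, a binary tree structure, and a non-uniform spatial partitioning structure.
Optionally, the neural network model includes a multilayer perceptron and/or a Transformer.
It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments; to avoid repetition, they are not repeated here. Specifically, the prediction apparatus 900 shown in Figure 9 may correspond to the entity that executes method 300 of the embodiments of the present application, and the foregoing and other operations and/or functions of the modules in the apparatus 900 respectively implement the corresponding processes of the method in Figure 5; for brevity, they are not repeated here.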
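Figure 10 is a schematic block diagram of an encoder 1000 according to an embodiment of the present application, which may for example be the encoder in Figure 3. As shown in Figure 10, the encoder 1000 may include an acquisition unit 1010, a processing unit 1020, and an encoding unit 1030.
The acquisition unit 1010 is configured to obtain the predicted attribute value of the current sub-block, for example by method 300 shown in Figure 5 above, without limitation.
The processing unit is configured to determine the predicted transform coefficient of the current sub-block according to the predicted attribute value and the number of points in the current sub-block.
The processing unit is further configured to determine the true transform coefficient of the current sub-block according to the true attribute value of the current sub-block and the number of points in the current sub-block.
The processing unit is further configured to determine the difference between the predicted transform coefficient and the true transform coefficient.
The encoding unit is configured to write the difference into the bitstream.
It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments; to avoid repetition, they are not repeated here. Specifically, the encoder 1000 shown in Figure 10 may correspond to the entity that executes method 500 of the embodiments of the present application, and the foregoing and other operations and/or functions of the modules in the encoder 1000 respectively implement the corresponding processes of the method in Figure 7; for brevity, they are not repeated here.
Figure 11 is a schematic block diagram of a decoder 1100 according to an embodiment of the present application, which may for example be the decoder in Figure 4. As shown in Figure 11, the decoder 1100 may include an acquisition unit 1110 and a processing unit 1120.
The acquisition unit 1110 is configured to obtain, from the bitstream, the difference between the predicted transform coefficient and the true transform coefficient of the current sub-block.
The acquisition unit 1110 is further configured to obtain the predicted attribute value of the current sub-block, for example by method 300 shown in Figure 5 above, without limitation.
The processing unit 1120 is configured to determine the true transform coefficient of the current sub-block according to the predicted attribute value and the difference.
The processing unit 1120 is further configured to determine the true attribute value of the current sub-block according to the true transform coefficient and the number of points in the current sub-block.
It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments; to avoid repetition, they are not repeated here. Specifically, the decoder 1100 shown in Figure 11 may correspond to the entity that executes method 600 of the embodiments of the present application, and the foregoing and other operations and/or functions of the modules in the decoder 1100 respectively implement the corresponding processes of the method in Figure 8; for brevity, they are not repeated here.
The embodiments of the present application also provide a point cloud processing apparatus, including an upsampling unit and an acquisition unit. The upsampling unit may be configured to upsample a point cloud to obtain the position information of newly added points; the acquisition unit is configured to obtain the attribute information of the newly added points, for example by method 300 shown in Figure 5 above, without limitation.
The apparatus and system of the embodiments of the present application have been described above from the perspective of functional modules with reference to the accompanying drawings. It should be understood that the functional modules may be implemented in hardware, by instructions in software, or by a combination of hardware and software modules. Specifically, each step of the method embodiments of the present application may be completed by integrated logic circuits of hardware in a processor and/or by instructions in software, and the steps of the methods disclosed in the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. Optionally, the software module may be located in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method embodiments in combination with its hardware.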
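Figure 12 is a schematic block diagram of an electronic device 1200 provided in an embodiment of the present application.
As shown in Figure 12, the electronic device 1200 may include:
a memory 1210 and a processor 1220, where the memory 1210 is used to store a computer program and to transmit the program code to the processor 1220. In other words, the processor 1220 can call and run the computer program from the memory 1210 to implement the methods in the embodiments of the present application.
For example, the processor 1220 may be configured to execute the steps of the above method 300, 500, or 600 according to the instructions in the computer program.
In some embodiments of the present application, the processor 1220 may include, but is not limited to:
a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
In some embodiments of the present application, the memory 1210 includes, but is not limited to:
volatile memory and/or non-volatile memory. The non-volatile memory may be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory. The volatile memory may be random access memory (Random Access Memory, RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (synch link DRAM, SLDRAM), and direct Rambus random access memory (Direct Rambus RAM, DR RAM).
In some embodiments of the present application, the computer program may be divided into one or more modules, which are stored in the memory 1210 and executed by the processor 1220 to carry out the point cloud processing method provided in the present application. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments describe the execution of the computer program in the electronic device 1200.
Optionally, as shown in Figure 12, the electronic device 1200 may further include:
a transceiver 1230, which may be connected to the processor 1220 or the memory 1210.
The processor 1220 can control the transceiver 1230 to communicate with other devices; specifically, it can send information or data to other devices, or receive information or data sent by other devices. The transceiver 1230 may include a transmitter and a receiver. The transceiver 1230 may further include one or more antennas.
It should be understood that the components of the electronic device 1200 are connected through a bus system, which includes not only a data bus but also a power bus, a control bus, and a status signal bus.
According to one aspect of the present application, an encoder is provided, including a processor and a memory, where the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory, so that the encoder executes the encoding method of the above method embodiments.
According to one aspect of the present application, a decoder is provided, including a processor and a memory, where the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory, so that the decoder executes the decoding method of the above method embodiments.
According to one aspect of the present application, a codec system is provided, including the above encoder and decoder.
According to one aspect of the present application, a computer storage medium is provided, on which a computer program is stored; when the computer program is executed by a computer, the computer can execute the methods of the above method embodiments. In other words, the embodiments of the present application also provide a computer program product containing instructions which, when executed by a computer, cause the computer to execute the methods of the above method embodiments.
According to another aspect of the present application, a computer program product or computer program is provided, which includes computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device executes the methods of the above method embodiments.
In other words, when implemented in software, the implementation may take the form, in whole or in part, of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)), and so on.
A person of ordinary skill in the art will appreciate that the modules and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. For example, the division into modules is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or modules, and may be electrical, mechanical, or in another form.
Modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. For example, the functional modules in the embodiments of the present application may be integrated into one processing module, each module may exist physically on its own, or two or more modules may be integrated into one module.
The above is merely a description of specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be determined by the scope of protection of the claims.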

Claims (29)

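  1. A prediction method, characterized by including:
    obtaining a hierarchical structure of a point cloud, where the hierarchical structure includes a parent block and at least one sub-block of the parent block;
    determining a reference block of a current sub-block in the hierarchical structure, where the reference block includes at least one parent block associated, at the same level, with the parent block of the current sub-block, and/or at least one sub-block associated, at the same level, with the current sub-block;
    inputting information of the reference block and/or information of the current sub-block into a neural network model to obtain a predicted attribute value of the current sub-block, where the training data of the neural network model includes information of the reference blocks of sub-blocks and the true attribute values of the sub-blocks.
  2. The method according to claim 1, characterized in that the at least one parent block associated, at the same level, with the parent block of the current sub-block includes at least one of the following:
    the parent block of the current sub-block; a parent block at the same level that shares a face with the parent block of the current sub-block; a parent block at the same level that shares an edge with the parent block of the current sub-block; a parent block at the same level that shares a vertex with the parent block of the current sub-block; a parent block at the same level that is two parent blocks away from the parent block of the current sub-block in the positive or negative direction of the x axis; a parent block at the same level that is two parent blocks away in the positive or negative direction of the y axis; and a parent block at the same level that is two parent blocks away in the positive or negative direction of the z axis.
  3. The method according to claim 1, characterized in that the at least one sub-block associated, at the same level, with the current sub-block includes at least one of the following:
    a sub-block at the same level that shares a face with the current sub-block; a sub-block at the same level that shares an edge with the current sub-block; a sub-block at the same level that shares a vertex with the current sub-block; a sub-block at the same level that is two sub-blocks away from the current sub-block in the positive or negative direction of the x axis; a sub-block at the same level that is two sub-blocks away in the positive or negative direction of the y axis; and a sub-block at the same level that is two sub-blocks away in the positive or negative direction of the z axis.
  4. The method according to any one of claims 1-3, characterized in that inputting the information of the reference block and/or the information of the current sub-block into the neural network model to obtain the predicted attribute value of the current sub-block includes:
    determining that the number of non-empty reference blocks among the reference blocks is greater than or equal to k, where a non-empty reference block contains at least one point and k is a positive integer;
    determining k non-empty reference blocks among the non-empty reference blocks of the reference blocks;
    inputting the information of the k non-empty reference blocks and/or the information of the current sub-block into the neural network model to obtain the predicted attribute value of the current sub-block.
  5. The method according to claim 4, characterized in that determining k non-empty reference blocks among the non-empty reference blocks of the reference blocks includes:
    determining the k non-empty reference blocks according to distance information between the non-empty reference blocks of the reference blocks and the current sub-block.
  6. The method according to claim 5, characterized in that the distance information includes the Euclidean distance and/or the Manhattan distance.
  7. The method according to claim 1, characterized in that inputting the information of the reference block and/or the information of the current sub-block into the neural network model to obtain the predicted attribute value of the current sub-block includes:
    determining that the number of non-empty reference blocks among the reference blocks is greater than or equal to m and less than k, where a non-empty reference block contains at least one point, m and k are positive integers, and m ≤ k;
    interpolating the reference blocks with empty occupancy among the reference blocks to obtain k non-empty reference blocks;
    inputting the information of the k non-empty reference blocks and/or the information of the current sub-block into the neural network model to obtain the predicted attribute value of the current sub-block.
  8. The method according to claim 7, characterized in that interpolating the reference blocks with empty occupancy among the reference blocks to obtain k non-empty reference blocks includes:
    obtaining at least one kind of non-empty block that shares a face, an edge, or a vertex at the same level with a first reference block with empty occupancy among the reference blocks;
    interpolating the first reference block according to the non-empty block to obtain the k non-empty reference blocks.
  9. The method according to claim 7, characterized in that interpolating the reference blocks with empty occupancy among the reference blocks to obtain k non-empty reference blocks includes:
    determining the weighted average of the attribute values of n1 non-empty reference blocks among the reference blocks according to the attribute values of the n1 non-empty reference blocks and the distances from the n1 non-empty reference blocks to a second reference block with empty occupancy, where n1 is a positive integer and m ≤ n1 ≤ k;
    interpolating the second reference block according to the weighted average to obtain the k non-empty reference blocks.
  10. The method according to claim 7, characterized in that interpolating the reference blocks with empty occupancy among the reference blocks to obtain k non-empty reference blocks includes:
    determining a first average of the attribute values of n2 non-empty reference blocks among the reference blocks;
    determining a second average of the distances from the n2 non-empty reference blocks to a third reference block with empty occupancy;
    interpolating the third reference block according to the first average and the second average to obtain the k non-empty reference blocks, where the attribute value of the third reference block is the first average and the distance from the third reference block to the current sub-block is the second average.
  11. The method according to any one of claims 1-10, characterized in that inputting the information of the reference block and/or the information of the current sub-block into the neural network model to obtain the predicted attribute value of the current sub-block includes:
    inputting at least one of the attribute values of the reference blocks, the distance information between the reference blocks and the current sub-block, and auxiliary information of the current sub-block into the neural network model to obtain the predicted attribute value of the current sub-block.
  12. The method according to claim 11, characterized in that the auxiliary information includes at least one of: the three-dimensional spatial coordinates of the current sub-block, the three-dimensional spatial coordinates of the parent block of the current sub-block, the relative position of the current sub-block within its parent block, the transform level of the current sub-block, the size information of the current sub-block, the size of the parent block of the current sub-block, and the spatial distribution of the sub-blocks within the parent block of the current sub-block.
  13. The method according to any one of claims 1-12, characterized by further including:
    determining that at least one of the number of non-empty blocks among the reference blocks, the ratio of the number of non-empty blocks to the total number of reference blocks, and the attribute variability of the non-empty blocks among the reference blocks satisfies a preset condition.
  14. The method according to any one of claims 1-13, characterized in that, when a reference block contains no points, the attribute value of the reference block is 0.
  15. The method according to any one of claims 1-14, characterized in that the hierarchical structure includes at least one of an octree structure, a quadtree structure, a binary tree structure, and a non-uniform spatial partitioning structure.
  16. The method according to any one of claims 1-15, characterized in that the neural network model includes a multilayer perceptron and/or a Transformer.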
  17. 一种编码方法,其特征在于,包括:
    按照权利要求1-16任一项所述的方法,获取当前子块的预测属性值;
    根据所述预测属性值和所述当前子块中的点数,确定所述当前子块的预测变换系数;
    根据所述当前子块的真实属性值和所述当前子块中的点数,确定所述当前子块的真实变换系数;
    确定所述预测变换系数和所述真实变换系数的差值;
    将所述差值写入码流。
  18. 根据权利要求17所述的方法,其特征在于,还包括:
    将切换标识符写入码流,其中,所述切换标识符用于指示按照权利要求1-16任一项所述的方法,获取当前子块的预测属性值。
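  19. A decoding method, comprising:
    obtaining, from a bitstream, a difference between a predicted transform coefficient and a real transform coefficient of a current sub-block;
    obtaining a predicted attribute value of the current sub-block according to the method of any one of claims 1 to 15;
    determining the real transform coefficient of the current sub-block according to the predicted attribute value and the difference;
    determining a real attribute value of the current sub-block according to the real transform coefficient and the number of points in the current sub-block.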
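The decoder steps of claim 19 invert the prediction residual. As a sketch only, the code below assumes that a transform coefficient equals the attribute value scaled by the square root of the point count and that the transmitted difference is real minus predicted; neither convention is stated in the claim, and quantization is ignored.

```python
import math

def decode_attr(pred_attr, residual, num_points):
    """Reconstruct the real attribute value of a sub-block."""
    scale = math.sqrt(num_points)             # assumed coefficient scaling
    pred_coeff = scale * pred_attr            # predicted transform coefficient
    true_coeff = pred_coeff + residual        # add back the decoded difference
    return true_coeff / scale                 # undo the scaling

# The decoder repeats the encoder's prediction (98), reads the residual (4)
# from the bitstream, and knows the sub-block holds 4 points.
reconstructed = decode_attr(98.0, 4.0, 4)
```

Because the decoder derives the same prediction from already decoded reference blocks, only the residual has to be transmitted.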
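  20. The method according to claim 19, further comprising:
    obtaining a switch identifier from the bitstream, wherein the switch identifier indicates that the predicted attribute value of the current sub-block is obtained according to the method of any one of claims 1 to 16.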
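  21. A point cloud processing method, comprising:
    upsampling a point cloud to obtain position information of newly added points;
    obtaining attribute information of the newly added points according to the method of any one of claims 1 to 16.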
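  22. A prediction apparatus, comprising:
    an obtaining unit configured to obtain a hierarchical structure of a point cloud, wherein the hierarchical structure comprises a parent block and at least one sub-block of the parent block;
    a processing unit configured to determine reference blocks of a current sub-block in the hierarchical structure, wherein the reference blocks comprise at least one parent block that is at the same level as, and associated with, the parent block of the current sub-block, and/or at least one sub-block that is at the same level as, and associated with, the current sub-block;
    a neural network model configured to receive information of the reference blocks and/or information of the current sub-block as input and to output a predicted attribute value of the current sub-block, wherein training data of the neural network model comprises information of reference blocks of sub-blocks and real attribute values of the sub-blocks.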
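  23. An encoder, comprising:
    an obtaining unit configured to obtain a predicted attribute value of a current sub-block according to the method of any one of claims 1 to 16;
    a processing unit configured to determine a predicted transform coefficient of the current sub-block according to the predicted attribute value and the number of points in the current sub-block;
    the processing unit being further configured to determine a real transform coefficient of the current sub-block according to the real attribute value of the current sub-block and the number of points in the current sub-block;
    the processing unit being further configured to determine a difference between the predicted transform coefficient and the real transform coefficient;
    an encoding unit configured to write the difference into a bitstream.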
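  24. A decoder, comprising:
    an obtaining unit configured to obtain, from a bitstream, a difference between a predicted transform coefficient and a real transform coefficient of a current sub-block;
    the obtaining unit being further configured to obtain a predicted attribute value of the current sub-block according to the method of any one of claims 1 to 15;
    a processing unit configured to determine the real transform coefficient of the current sub-block according to the predicted attribute value and the difference;
    the processing unit being further configured to determine a real attribute value of the current sub-block according to the real transform coefficient and the number of points in the current sub-block.
  25. A codec system, comprising the encoder according to claim 23 and the decoder according to claim 24.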
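  26. A point cloud processing apparatus, comprising:
    an upsampling unit configured to upsample a point cloud to obtain position information of newly added points;
    an obtaining unit configured to obtain attribute information of the newly added points according to the method of any one of claims 1 to 16.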
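  27. An electronic device, comprising a processor and a memory;
    wherein the memory is configured to store a computer program, and the processor is configured to call and run the computer program stored in the memory, so that the electronic device performs the method according to any one of claims 1 to 21.
  28. A computer-readable storage medium, configured to store a computer program, wherein the computer program causes a computer to perform the method according to any one of claims 1 to 21.
  29. A computer program product, comprising computer program code, wherein, when the computer program code is run by an electronic device, the electronic device is caused to perform the method according to any one of claims 1 to 21.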
PCT/CN2022/076368 2022-02-15 2022-02-15 Prediction method and apparatus, encoder, decoder, and codec system WO2023155045A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/076368 WO2023155045A1 (zh) 2022-02-15 2022-02-15 Prediction method and apparatus, encoder, decoder, and codec system

Publications (1)

Publication Number Publication Date
WO2023155045A1 (zh) 2023-08-24

Family

ID=87577296

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/076368 WO2023155045A1 (zh) Prediction method and apparatus, encoder, decoder, and codec system 2022-02-15 2022-02-15

Country Status (1)

Country Link
WO (1) WO2023155045A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170347100A1 (en) * 2016-05-28 2017-11-30 Microsoft Technology Licensing, Llc Region-adaptive hierarchical transform and entropy coding for point cloud compression, and corresponding decompression
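CN111405281A (zh) * 2020-03-30 2020-07-10 北京大学深圳研究生院 Encoding method and decoding method for point cloud attribute information, storage medium, and terminal device
CN112385236A (zh) * 2020-06-24 2021-02-19 北京小米移动软件有限公司 Method for encoding and decoding a point cloud
CN113273211A (zh) * 2018-12-14 2021-08-17 Pcms控股公司 System and method for procedurally colorizing spatial data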

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 22926383; Country of ref document: EP; Kind code of ref document: A1