WO2022188583A1 - Decoding and encoding methods, decoder and encoder based on point cloud attribute prediction - Google Patents
- Publication number: WO2022188583A1 (PCT/CN2022/075560, CN2022075560W)
- Authority: WO, WIPO (PCT)
- Prior art keywords: points, point, neighbor, target point, candidate
Classifications
- H04N19/124: Quantisation
- G06T9/001: Model-based coding, e.g. wire frame
- G06T9/40: Tree coding, e.g. quadtree, octree
- G06T9/004: Predictors, e.g. intraframe, interframe coding
- H04N19/13: Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
- H04N19/176: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
- H04N19/186: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
- H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/96: Tree coding, e.g. quad-tree coding
- H04N21/2343: Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/4402: Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
Definitions
- The embodiments of the present application relate to the computer vision (image) field of artificial intelligence, in particular to point cloud encoding and decoding, and more particularly to decoding and encoding methods, a decoder and an encoder based on point cloud attribute prediction.
- Point clouds have begun to spread to various fields, such as virtual/augmented reality, robotics, geographic information systems, medical fields, etc.
- A large number of points on the surface of objects can be accurately captured, often amounting to hundreds of thousands of points in one scene.
- Such a large number of points also brings challenges to computer storage and transmission. Therefore, point cloud compression has become a hot issue.
- Point cloud compression mainly requires compressing position information and attribute information. Specifically, octree encoding is first performed on the position information of the point cloud; at the same time, according to the octree-encoded position information of the current point, the points used to predict the attribute information of the current point are selected from the already encoded points. The attribute information is then predicted based on the selected points, and the color information is encoded by taking the difference from the original value of the attribute information, thereby encoding the point cloud.
- a decoding and encoding method, decoder and encoder based on point cloud attribute prediction are provided.
- the present application provides a decoding method based on point cloud attribute prediction, including:
- the attribute value of the k neighbor points is the reconstructed value of the attribute information of the k neighbor points;
- the decoded point cloud is obtained.
- the present application provides an encoding method based on point cloud attribute prediction, including:
- the attribute value of the k neighbor points is the reconstructed value of the attribute information of the k neighbor points or the original value of the attribute information of the k neighbor points;
- the residual value of the attribute information of the target point is obtained;
- the residual value of the attribute information of the target point is encoded to obtain the code stream of the point cloud.
- the present application provides a decoder based on point cloud attribute prediction, which is used to perform the decoding method based on point cloud attribute prediction in the second aspect or each of its implementations.
- the decoder includes a functional module for executing the decoding method based on point cloud attribute prediction in the second aspect or each implementation manner thereof.
- the decoder includes:
- the parsing unit is used to obtain the code stream of the point cloud, analyze the code stream of the point cloud, and obtain the reconstruction information of the position information of the target point in the point cloud;
- the prediction unit is used to: select N decoded points from the M decoded points in the point cloud as the N candidate points of the target point, M ≥ N ≥ 1; select k neighbor points from the N candidate points based on the reconstruction information of the position information of the target point, N ≥ k ≥ 1; and use the attribute values of the k neighbor points to determine the predicted value of the attribute information of the target point, where the attribute values of the k neighbor points are the reconstructed values of the attribute information of the k neighbor points;
- the parsing unit is also used for parsing the code stream to obtain the residual value of the attribute information of the target point;
- a residual unit configured to obtain the final reconstructed value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the residual value of the attribute information of the target point;
- the decoding unit is used to obtain the decoded point cloud according to the final reconstruction value of the attribute information of the target point.
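The decoder-side prediction described above can be sketched as follows. This is a minimal illustration, not the method claimed in the application: the function names, the choice of the N most recently decoded points as candidates, the Euclidean distance, and the inverse-distance weighting are all assumptions introduced here, since the text only states that k neighbors are selected from N candidates and that their reconstructed attribute values determine the predicted value.

```python
import math


def predict_attribute(target_pos, decoded, n_candidates=8, k=3):
    """Predict the attribute of a target point from already decoded points.

    decoded: list of (position, reconstructed_attribute) in decoding order.
    """
    # Assumption: the N candidates are the N most recently decoded points.
    candidates = decoded[-n_candidates:]
    # Select the k candidates closest to the reconstructed target position.
    nearest = sorted(candidates, key=lambda c: math.dist(target_pos, c[0]))[:k]
    # Assumption: inverse-distance weights; a tiny floor avoids division by zero.
    weights = [1.0 / max(math.dist(target_pos, pos), 1e-9) for pos, _ in nearest]
    total = sum(weights)
    return sum(w * attr for w, (_, attr) in zip(weights, nearest)) / total


def reconstruct_attribute(predicted, residual):
    """Final reconstructed value = predicted value + decoded residual value."""
    return predicted + residual


decoded_points = [((0, 0, 0), 10.0), ((2, 0, 0), 20.0)]
pred = predict_attribute((1, 0, 0), decoded_points, n_candidates=2, k=2)
recon = reconstruct_attribute(pred, 2.0)
```

Restricting the search to N candidates rather than all M decoded points is what bounds the per-point cost of the neighbor search.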
- the present application provides an encoder based on point cloud attribute prediction, which is used to perform the encoding method based on point cloud attribute prediction in the first aspect or each of its implementations.
- the encoder includes a functional module for executing the encoding method based on point cloud attribute prediction in the first aspect or each implementation manner thereof.
- the encoder includes:
- an acquisition unit used for acquiring reconstruction information of the target point position information in the point cloud
- a prediction unit used for: selecting N coded points from M coded points in the point cloud as N candidate points of the target point, M ≥ N ≥ 1; selecting k neighbor points from the N candidate points based on the reconstruction information of the position information of the target point, N ≥ k ≥ 1; and using the attribute values of the k neighbor points to determine the predicted value of the attribute information of the target point, where the attribute values of the k neighbor points are the reconstructed values or the original values of the attribute information of the k neighbor points;
- a residual unit configured to obtain the residual value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the original value of the attribute information of the target point;
- the encoding unit is used for encoding the residual value of the attribute information of the target point to obtain the code stream of the point cloud.
- an encoding and decoding device comprising:
- a processor adapted to implement computer instructions
- a computer-readable storage medium where computer instructions are stored in the computer-readable storage medium, and the computer instructions are adapted to be loaded by a processor and execute the encoding and decoding methods in any one of the above-mentioned first aspect to the second aspect or each of its implementations.
- There may be one or more processors and one or more memories.
- the computer-readable storage medium may be integrated with the processor, or the computer-readable storage medium may be provided separately from the processor.
- An embodiment of the present application provides a computer-readable storage medium storing computer instructions; when the computer instructions are read and executed by a processor of a computer device, the computer device is caused to execute the encoding and decoding methods in any one of the above-mentioned first aspect to the second aspect or each of its implementations.
- the solution provided by the present application can reduce the prediction complexity on the basis of ensuring the prediction effect.
- FIG. 1 is a schematic block diagram of a coding framework provided by an embodiment of the present application.
- FIG. 2 is a schematic block diagram of a decoding framework provided by an embodiment of the present application.
- FIG. 3 is a schematic diagram of a point cloud in the original Morton order provided by an embodiment of the present application.
- FIG. 4 is a schematic diagram of a point cloud in an offset Morton order provided by an embodiment of the present application.
- FIG. 5 is a schematic structural diagram of a spatial relationship of neighbor points provided by an embodiment of the present application.
- FIG. 6 is an example of a Morton code relationship between neighbor points that are coplanar with the current point to be encoded within a neighborhood range provided by an embodiment of the present application.
- FIG. 7 is an example of a Morton code relationship between neighbor points that are collinear with the current point to be encoded within a neighborhood range provided by an embodiment of the present application.
- FIG. 8 is a schematic flowchart of an encoding method provided by an embodiment of the present application.
- FIG. 9 is a schematic flowchart of a decoding method provided by an embodiment of the present application.
- FIG. 10 is a schematic block diagram of an encoder provided by an embodiment of the present application.
- FIG. 11 is a schematic block diagram of a decoder provided by an embodiment of the present application.
- FIG. 12 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
- FIG. 1 is a schematic block diagram of an encoding framework 100 provided by an embodiment of the present application.
- the encoding framework 100 can obtain the location information and attribute information of the point cloud from the acquisition device.
- the encoding of point cloud includes position encoding and attribute encoding.
- the process of position encoding includes: performing preprocessing on the original point cloud, such as coordinate transformation, quantization and removing duplicate points; and encoding to form a geometric code stream after constructing an octree.
- The attribute coding process includes: given the reconstruction information of the position information of the input point cloud and the original values of the attribute information, selecting one of the three prediction modes for point cloud prediction, quantizing the predicted result, and performing arithmetic coding to form the attribute code stream.
- position encoding can be achieved by the following units:
- Coordinate transformation (Transform coordinates) unit 101, quantize and remove duplicate points (Quantize and remove points) unit 102, octree analysis (Analyze octree) unit 103, geometric reconstruction (Reconstruct geometry) unit 104 and first arithmetic coding (Arithmetic encode) unit 105.
- the coordinate transformation unit 101 may be used for pre-processing the points in the point cloud, that is, for coordinate transformation and voxelization.
- The coordinate transformation may refer to transforming the world coordinates of the points in the point cloud into relative coordinates. For example, through translation (subtracting the minimum value along each of the x, y and z axes from the geometric coordinates of the points) and scaling operations, the data of points in the point cloud in 3D space is converted into integer form, and the minimum geometric position is moved to the coordinate origin.
- Subtracting the per-axis minimum is equivalent to a de-DC (DC removal) operation that transforms the coordinates of the points in the point cloud from world coordinates to relative coordinates.
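As a rough illustration of this pre-processing step, the sketch below shifts a set of points so that the per-axis minimum lands on the origin and rounds the result to integers. The function name and the optional `scale` parameter are hypothetical and are not taken from the application.

```python
def to_relative_coordinates(points, scale=1.0):
    """Shift points so the per-axis minimum sits at the origin, then round to integers."""
    # Per-axis minimum over all points (the "DC" component that is removed).
    mins = [min(p[axis] for p in points) for axis in range(3)]
    return [
        tuple(int(round((p[axis] - mins[axis]) * scale)) for axis in range(3))
        for p in points
    ]


world = [(10.5, 5.0, -3.0), (12.5, 6.0, -1.0)]
relative = to_relative_coordinates(world)
```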
- the quantization and removal of duplicate points unit 102 can reduce the number of coordinates through geometry quantization.
- The fineness of geometric quantization is usually determined by the quantization parameter (QP).
- When the QP value is smaller, coefficients spanning a smaller value range are quantized to the same output, which usually brings less distortion and corresponds to a higher code rate.
- quantization is performed directly on the coordinate information of points.
- Different points may be assigned the same coordinates after quantization, in which case duplicate points can be removed by deduplication; for example, multiple points with the same quantized position and different attribute information can be merged into one point through attribute transformation.
- the quantization and removal of duplicate points unit 102 may be used as an optional unit module.
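The quantize-and-deduplicate step can be sketched as below. This is only an illustration under stated assumptions: the function name is hypothetical, positions are quantized by floor division with a uniform step, and merged points take the mean of their attribute values, none of which is specified in the text.

```python
from collections import defaultdict


def quantize_and_merge(points, attributes, step):
    """Quantize positions by `step` and merge points that land on the same cell."""
    buckets = defaultdict(list)
    for pos, attr in zip(points, attributes):
        # Floor-divide each coordinate to obtain the quantized cell.
        cell = tuple(int(c // step) for c in pos)
        buckets[cell].append(attr)
    # Assumption: duplicates are merged by averaging their attribute values.
    merged = {cell: sum(vals) / len(vals) for cell, vals in buckets.items()}
    positions = sorted(merged)
    return positions, [merged[cell] for cell in positions]


positions, attrs = quantize_and_merge(
    [(0, 0, 0), (1, 1, 1), (4, 4, 4)], [10, 20, 30], step=2
)
```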
- Geometry encoding contains two modes, octree-based geometry encoding (Octree) and triangular representation-based geometry encoding (Trisoup), which can be used under different conditions.
- the octree analysis unit 103 may encode the position information of the quantized points using an octree encoding method.
- Octree is a tree data structure. In 3D space division, the preset bounding boxes are evenly divided, and each node has eight child nodes. By using '1' and '0' to indicate whether each child node of the octree is occupied or not, the occupancy code information (occupancy code) is obtained as the code stream of the point cloud geometric information.
- the point cloud is divided in the form of an octree, so that the position of the point can be in a one-to-one correspondence with the position of the octree.
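To make the occupancy code concrete, the following sketch computes the 8-bit occupancy of a single cubic node from the points it contains. The function name and the bit ordering of the child index (x as the most significant bit) are assumptions made for illustration; the text does not define them.

```python
def occupancy_code(points, origin, size):
    """Return the 8-bit occupancy code of a cubic node.

    origin: minimum corner of the node; size: edge length (a power of two).
    Bit i of the result is 1 when child i contains at least one point.
    """
    half = size // 2
    code = 0
    for x, y, z in points:
        # Assumed child index layout: x -> bit 2, y -> bit 1, z -> bit 0.
        idx = (
            (int(x - origin[0] >= half) << 2)
            | (int(y - origin[1] >= half) << 1)
            | int(z - origin[2] >= half)
        )
        code |= 1 << idx
    return code


code = occupancy_code([(0, 0, 0), (3, 3, 3)], origin=(0, 0, 0), size=4)
bits = format(code, "08b")
```

Applying the same function recursively to each occupied child yields the sequence of occupancy codes that the entropy coder then compresses.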
- the flag (flag) is recorded as 1, for geometry encoding.
- the geometric coding based on triangle representation divides the point cloud into blocks of a certain size, locates the intersection of the point cloud surface at the edge of the block and constructs a triangle, and compresses the geometric information by encoding the position of the intersection.
- the first arithmetic coding unit 105 can be used for geometric entropy coding (Geometry entropy coding), that is, performing statistical compression coding on the occupancy code information of the octree, and finally outputs a binarized (0 or 1) compressed code stream.
- geometric entropy coding is a lossless coding method, which can effectively reduce the code rate required to express the same signal.
- A commonly used statistical compression coding method is context-based adaptive binary arithmetic coding (Context Adaptive Binary Arithmetic Coding, CABAC).
- the entropy coding method is used to perform arithmetic coding on the position information output by the octree analysis unit 103, that is, the position information output by the octree analysis unit 103 is used to generate a geometric code stream by means of an arithmetic coding method; the geometric code stream may also be referred to as Geometry bitstream.
- The encoding end needs to decode and reconstruct the geometric information, that is, to restore the coordinate information of each point in the 3D point cloud. For each point, the reconstructed values of the attribute information of one or more adjacent points in the original point cloud are found and used as the predicted value of the attribute information of the point. The residual value of the attribute information of the point is obtained from the predicted value and the original value of the attribute information of the point, and the encoder obtains the attribute bit stream by encoding the residual values of the attribute information of all points of the point cloud. The encoder can also obtain the reconstructed value of the attribute information based on the predicted value and the original value.
- Attribute encoding can be achieved through the following units:
- Color space transform (Transform colors) unit 110, attribute transform (Transfer attributes) unit 111, Region Adaptive Hierarchical Transform (RAHT) unit 112, predicting transform unit 113, lifting transform unit 114, a quantization (Quantize) unit 115 and a second arithmetic coding unit 116.
- the color space transformation unit 110 may be used to transform the RGB color space of the points in the point cloud into YCbCr format or other formats.
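A color space transform of this kind can be sketched as below. The text does not say which conversion matrix is used; the full-range BT.601 coefficients here are an assumption chosen only for illustration, and the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to YCbCr (assumed: full-range BT.601 coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
```

Separating luma from chroma in this way lets the attribute coder treat the components differently, since neutral colors map to chroma values near 128.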
- the attribute transformation unit 111 may be used to transform attribute information of points in the point cloud to minimize attribute distortion.
- the attribute transformation unit 111 may be used to obtain the original value of the attribute information of the point.
- The attribute information may be color information of points.
- any prediction unit can be selected to predict the point in the point cloud.
- The units for predictive coding of points in the point cloud may include at least one of: a Region Adaptive Hierarchical Transform (RAHT) unit 112, a predicting transform unit 113, and a lifting transform unit 114. That is, attribute transform coding contains three modes that can be used under different conditions.
- The predicting transform unit 113 and the lifting transform unit 114 can be used to predict the attribute information of points in the point cloud to obtain the predicted value of the attribute information of the points. Further, the residual value of the attribute information of the point can be obtained based on the predicted value of the attribute information of the point.
- the residual value of the attribute information of the point may be the original value of the attribute information of the point minus the predicted value of the attribute information of the point.
- the attribute information is RAHT transformed, and the signal is transformed into the transform domain, which is called transform coefficients.
- When the lifting transform unit 114 performs predictive coding, a weight update strategy for neighboring points is introduced on top of LOD adjacent-layer prediction to finally obtain the predicted value of the attribute information of each point, and then the corresponding residual value.
- The predicting transform unit 113 can also be used to generate the LOD, sequentially predict the attribute information of the points in the LOD, and calculate the prediction residual for subsequent quantization and coding.
- Subsets of points are selected according to distance, and the point cloud is divided into multiple levels of detail (Level of Detail, LOD), so as to represent the point cloud from coarse to fine.
- Bottom-up prediction can be implemented between adjacent layers, that is, the attribute information of the points introduced in the fine layer is predicted from the adjacent points in the rough layer, and the corresponding residual value is obtained.
- The points in the lowest layer of the LOD are encoded as reference information.
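The distance-based division into levels of detail can be sketched as follows. This is a simplified illustration, not the procedure of the application: the function name, the greedy selection order, and the use of one Euclidean distance threshold per level are assumptions.

```python
import math


def build_lods(points, thresholds):
    """Split point indices into levels, coarse to fine.

    thresholds: decreasing minimum distances, one per level before the last.
    A point joins a level only if it is at least the threshold away from
    every point already selected in this or a coarser level.
    """
    selected = []                       # indices placed in coarser levels
    remaining = list(range(len(points)))
    levels = []
    for d in thresholds:
        level, rest = [], []
        for i in remaining:
            if all(math.dist(points[i], points[j]) >= d for j in selected + level):
                level.append(i)
            else:
                rest.append(i)
        levels.append(level)
        selected += level
        remaining = rest
    levels.append(remaining)            # finest level: all points left over
    return levels


lods = build_lods([(0, 0, 0), (1, 0, 0), (4, 0, 0), (5, 0, 0)], thresholds=[4])
```

In this toy case the coarse level keeps the two well-separated points, and the fine level holds the points that would then be predicted from their coarse-level neighbors.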
- The quantization unit 115 can be used for attribute quantization; the fineness of quantization is usually determined by a quantization parameter.
- In the predicting transform and the lifting transform, entropy coding is performed after quantizing the residual values; in RAHT, entropy coding is performed after quantizing the transform coefficients.
- When the quantization unit 115 is connected to the predicting transform unit 113, the quantization unit can be used to quantize the residual value of the attribute information of the point output by the predicting transform unit 113.
- the residual value of the attribute information of the point output by the predictive transform unit 113 is quantized using a quantization step size, so as to improve the system performance.
- the residual value can be determined based on the following equation:
- attrResidualQuant = (attrValue - attrPred) / Qstep
- attrResidualQuant represents the residual value of the current point
- attrPred represents the predicted value of the current point
- attrValue represents the original value of the current point
- Qstep represents the quantization step size.
- Qstep is calculated by the quantization parameter (Quantization Parameter, Qp).
- the current point will be used as the nearest neighbor of the subsequent point, and the attribute information of the subsequent point will be predicted using the reconstructed value of the current point.
- the reconstructed value of the attribute information of the current point can be obtained by the following formula:
- reconstructedColor = attrResidualQuant × Qstep + attrPred
- reconstructedColor represents the reconstructed value of the current point
- attrResidualQuant represents the residual value of the current point
- Qstep represents the quantization step size
- attrPred represents the predicted value of the current point.
- Qstep is calculated by the quantization parameter (Quantization Parameter, Qp).
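The quantization and reconstruction relationships above can be checked with a small numeric sketch. The function names are hypothetical, and rounding the quotient to the nearest integer is an assumption, since the text gives only the division; the mapping from Qp to Qstep is not reproduced here because the text does not state it.

```python
def quantize_residual(attr_value, attr_pred, qstep):
    """attrResidualQuant = (attrValue - attrPred) / Qstep, rounded (assumed)."""
    return round((attr_value - attr_pred) / qstep)


def reconstruct(attr_residual_quant, attr_pred, qstep):
    """reconstructedColor = attrResidualQuant * Qstep + attrPred."""
    return attr_residual_quant * qstep + attr_pred


residual_q = quantize_residual(attr_value=131, attr_pred=120, qstep=4)
reconstructed = reconstruct(residual_q, attr_pred=120, qstep=4)
error = abs(reconstructed - 131)
```

Because the decoder runs the same reconstruction, the encoder predicts subsequent points from the reconstructed value rather than the original one, which keeps both sides synchronized.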
- The second arithmetic coding unit 116 can be used for attribute entropy coding. The quantized residual values or transform coefficients of the attribute information can be compressed using run length coding and arithmetic coding.
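Run length coding is effective here because quantized residuals contain many zeros. The sketch below shows one simple zero-run scheme; the pair layout and function names are assumptions for illustration and do not describe the actual bitstream syntax.

```python
def run_length_encode(residuals):
    """Encode residuals as (zero_run, value) pairs.

    Trailing zeros are emitted as (run, None).
    """
    pairs, run = [], 0
    for r in residuals:
        if r == 0:
            run += 1
        else:
            pairs.append((run, r))
            run = 0
    if run:
        pairs.append((run, None))
    return pairs


def run_length_decode(pairs):
    """Invert run_length_encode."""
    out = []
    for run, value in pairs:
        out.extend([0] * run)
        if value is not None:
            out.append(value)
    return out


quantized = [0, 0, 3, 0, -1, 0, 0]
encoded = run_length_encode(quantized)
decoded = run_length_decode(encoded)
```

In a full codec the resulting run lengths and values would then be passed to the arithmetic coder.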
- Corresponding coding modes, quantization parameters and other information are also coded by an entropy coder.
- the attribute code stream can be obtained by entropy coding the residual value of the attribute information of the point.
- the attribute code stream may be bit stream information.
- The predicted value of the attribute information of a point in the point cloud may also be referred to as the color predicted value (predictedColor) in the LOD mode.
- the original value of the attribute information of the point minus the predicted value of the attribute information of the point can obtain the residual value of the point.
- the residual value of the attribute information of the point may also be referred to as a color residual value (residualColor) in the LOD mode.
- the predicted value of the attribute information of the point and the residual value of the attribute information of the point are added to generate a reconstructed value of the attribute information of the point.
- the reconstructed value of the attribute information of the point may also be referred to as a color reconstructed value (reconstructedColor) in the LOD mode.
- Based on the encoding process of the encoding framework 100, after the decoder obtains the compressed code stream, it first performs entropy decoding to obtain the various mode information and the quantized geometric information and attribute information. The geometric information is inverse quantized to obtain the reconstructed 3D point position information. The attribute information is inverse quantized to obtain the residual values, the reference signal is determined according to the adopted transform mode to obtain the predicted value of the attribute information, and the reconstructed value of each point is generated in the order corresponding to the geometric information and output, that is, the reconstructed point cloud data is output.
- FIG. 2 is a schematic block diagram of a decoding framework 200 provided by an embodiment of the present application.
- the decoding framework 200 can obtain the code stream of the point cloud from the encoding device, and obtain the position information and attribute information of the points in the point cloud by parsing the code stream of the point cloud.
- the decoding of point cloud includes position decoding and attribute decoding.
- the position decoding process includes: performing arithmetic decoding on the geometric code stream; constructing the octree and then merging, and reconstructing the position information of the points to obtain the reconstruction information of the position information of the points; and performing coordinate transformation on the reconstruction information of the position information of the points to obtain the position information of the points.
- the position information of the point may also be referred to as the geometric information of the point.
- the attribute decoding process includes: parsing the attribute code stream to obtain the residual values of the attribute information of the points in the point cloud; inverse quantizing these residual values to obtain the inverse-quantized residual values of the attribute information of the points; based on the reconstruction information of the position information of the points obtained in the position decoding process, selecting one of the three prediction modes to perform point cloud prediction and obtain the reconstructed values of the attribute information of the points; and performing inverse color space transformation on the reconstructed values of the attribute information of the points to obtain the decoded point cloud.
- position decoding can be implemented by the following units: a first arithmetic decoding unit 201, an octree synthesis (synthesize octree) unit 202, a geometric reconstruction (reconstruct geometry) unit 203, and an inverse coordinate transform (inverse transform coordinates) unit 204.
- Attribute decoding can be implemented by the following units: a second arithmetic decoding unit 210, an inverse quantize unit 211, a RAHT unit 212, a predicting transform unit 213, a lifting transform unit 214, and an inverse color space transform (inverse transform colors) unit 215.
- each unit in the decoding framework 200 may refer to the functions of the corresponding units in the encoding framework 100 .
- the decoding framework 200 may divide the point cloud into a plurality of LODs according to the Euclidean distance between the points in the point cloud; then sequentially decode the attribute information of the points in each LOD, for example by parsing the number of zeros (zero_cnt) and decoding the residuals based on that number; then the decoding framework 200 can inverse quantize the decoded residual value and add the predicted value of the current point to the inverse-quantized residual value to obtain the reconstructed value of the point, until all points of the point cloud are decoded.
- the current point will be used as a nearest neighbor of points in subsequent LODs, and the reconstructed value of the current point will be used to predict the attribute information of subsequent points.
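- The decoding loop described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the codec's actual bitstream syntax: the token layout (`zero_cnt, value, zero_cnt, value, ...`), the scalar quantization step `qstep`, and the `predict` callback are all hypothetical and only illustrate how a zero run count, inverse quantization and prediction combine into reconstructed values.

```python
from typing import Callable, Iterable, List


def decode_residuals(tokens: Iterable[int], num_points: int) -> List[int]:
    """Expand a run-length token stream into one quantized residual per point.

    Assumed (hypothetical) layout: zero_cnt, value, zero_cnt, value, ...
    where zero_cnt is the number of zero residuals preceding the next value.
    """
    it = iter(tokens)
    residuals: List[int] = []
    zero_cnt = next(it)
    while len(residuals) < num_points:
        if zero_cnt > 0:
            residuals.append(0)         # still inside a run of zero residuals
            zero_cnt -= 1
        else:
            residuals.append(next(it))  # explicitly coded residual
            zero_cnt = next(it, 0)      # run length that follows it
    return residuals


def decode_attributes(
    tokens: Iterable[int],
    num_points: int,
    qstep: int,
    predict: Callable[[List[int]], int],
) -> List[int]:
    """Reconstruct attributes as prediction + inverse-quantized residual."""
    reconstructed: List[int] = []
    for residual in decode_residuals(tokens, num_points):
        # The prediction only sees values that are already reconstructed.
        reconstructed.append(predict(reconstructed) + residual * qstep)
    return reconstructed
```

Here `predict` stands in for whichever neighbor-based predictor is in use; because it only receives already reconstructed values, the encoder and decoder can stay synchronized.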
- inverse transform transform
- inverse quantization scale/scaling
- orthogonal transform if one of the matrices is used for transform, the other matrix is used for inverse transform.
- the matrices used in the decoder may be referred to as "transform" matrices.
- the predictive encoding method of attribute information may include a predictive encoding method for reflectance attribute information, a predictive encoding method for color attribute information, and a method of adaptively selecting an attribute prediction value.
- the method based on the offset Morton code finds the k encoded points of the current point as neighbor points.
- Morton code is a method of expressing the coordinates of points in multi-dimensional space with a one-dimensional value.
- the spatial relationship between points in space can be approximated by the adjacency relationship between the values of their Morton codes.
- a Morton order can be formed by a plurality of Morton codes based on Morton ordering. Sorting refers to changing the position of a set of data according to a specific rule (sorting algorithm), so that the data is arranged in order, which can be arranged from large to small, or from small to large.
- Morton sorting refers to the process of sorting based on the adjacent relationship between the values of the Morton code.
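- As an illustration of Morton codes and Morton sorting, the sketch below interleaves the bits of the three coordinates into a single integer and sorts points by that value. The bit order (x above y above z within each 3-bit group) and the 10-bit coordinate width are assumptions chosen for the example; a real codec may use a different convention.

```python
from typing import List, Tuple

Point = Tuple[int, int, int]


def morton3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y and z into one integer."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)  # x bit is the highest of each triple
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code


def morton_sort(points: List[Point]) -> List[Point]:
    """Sort points from small to large Morton code."""
    return sorted(points, key=lambda p: morton3d(*p))
```

For example, `morton3d(1, 1, 1)` is 7 and `morton3d(2, 0, 0)` is 32, so points inside the same 2×2×2 block receive consecutive codes while a step across a block boundary can produce a large jump in the code value.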
- FIG. 3 is a schematic diagram of a point cloud in the original Morton order provided by an embodiment of the present application.
- FIG. 4 is a schematic diagram of a point cloud in an offset Morton order provided by an embodiment of the present application.
- point B can be found by searching at most two points forward from point D. In FIG. 3, however, the Morton code of point D is 16 and the Morton code of its neighbor point B is 2, so up to 14 points must be searched forward from point D to find point B.
- the decoder decodes according to Morton order and finds the closest prediction points for the current point. Specifically, in Morton order 1, the N points preceding the current point are selected as candidate points, where N is greater than or equal to 1; in Morton order 2, the M points preceding the current point are selected as candidate points, where M is greater than or equal to 1. For each of these N and M candidate points, the distance d to the current point is calculated, using either the Euclidean distance or the Manhattan distance.
- the coordinates of the current point are (x, y, z)
- the coordinates of the candidate points are (x1, y1, z1)
- the k decoded points with the smallest distance are selected as the prediction points of the current point.
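- The candidate search over the two Morton orders can be sketched as follows. This is an illustrative sketch only: the function name, the representation of each order as a list of already decoded points preceding the current point, and the removal of duplicate coordinates are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[int, int, int]


def manhattan(p: Point, q: Point) -> int:
    """Manhattan distance |dx| + |dy| + |dz|."""
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean_sq(p: Point, q: Point) -> int:
    """Squared Euclidean distance (gives the same ranking as the Euclidean distance)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_prediction_points(
    current: Point,
    order1_prev: Sequence[Point],  # decoded points before `current` in Morton order 1
    order2_prev: Sequence[Point],  # decoded points before `current` in Morton order 2
    n: int,
    m: int,
    k: int,
    dist: Callable[[Point, Point], int] = manhattan,
) -> List[Point]:
    """Take the last n / m points of each order and keep the k closest ones."""
    candidates = list(order1_prev[-n:]) + list(order2_prev[-m:])
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep first occurrence
    unique.sort(key=lambda p: dist(current, p))
    return unique[:k]
```

Passing `dist=euclidean_sq` switches the metric without changing the rest of the search.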
- the selection method of neighbor points in Hilbert order is to find, among the maxNumOfNeighbours points preceding the current point to be coded in Hilbert order, the k points closest to the current point as neighbor points, where maxNumOfNeighbours is the maximum number of neighbor points, that is, the number of candidate points; maxNumOfNeighbours is 128 by default and k is 3 by default.
- discrete Hilbert curves are used for calculation when performing Hilbert sorting.
- the weighted average of the attribute reconstruction values of the k neighbor points is calculated to obtain the predicted value of the attribute information of the current point.
- let the serial number of the current point be i
- j = 0, 1, 2, ..., k
- the weight wij of each neighbor point is shown in the following formula 1:
- For the reflectivity attribute, if the weights in Equation 1 are calculated with different weights for the components in the x, y and z directions, the weight wij of each neighbor point is as shown in Equation 3 below:
- a, b and c are the weights applied to the components in the x, y and z directions, respectively.
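- The formulas themselves are not reproduced in this text, so the sketch below only illustrates the idea of a distance-based weighted average with per-axis weights a, b and c. The inverse weighted Manhattan distance used as the weight is an assumption made for the example, as are the function names and the handling of a neighbor at distance zero.

```python
from typing import Sequence, Tuple

Point = Tuple[int, int, int]


def axis_weighted_distance(p: Point, q: Point, a: float, b: float, c: float) -> float:
    """a*|dx| + b*|dy| + c*|dz| (assumed distance form, not taken from the source)."""
    return a * abs(p[0] - q[0]) + b * abs(p[1] - q[1]) + c * abs(p[2] - q[2])


def predict_attribute(
    current: Point,
    neighbors: Sequence[Point],
    attrs: Sequence[float],
    a: float = 1.0,
    b: float = 1.0,
    c: float = 1.0,
) -> float:
    """Weighted average of neighbor attributes with weight = 1 / distance."""
    if not neighbors:
        raise ValueError("at least one neighbor point is required")
    weights = []
    for nb, attr in zip(neighbors, attrs):
        d = axis_weighted_distance(current, nb, a, b, c)
        if d == 0:
            return float(attr)  # a neighbor at the same position decides the prediction
        weights.append(1.0 / d)
    return sum(w * v for w, v in zip(weights, attrs)) / sum(weights)
```

With a = b = c = 1 all directions count equally; raising one of them makes neighbors that are displaced along that axis contribute less.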
- the Morton code is first used to find the spatial neighbors of the current point, and then the attributes of the current point are predicted according to the searched spatial neighbors.
- FIG. 5 is a schematic structural diagram of a spatial relationship of neighbor points provided by an embodiment of the present application.
- FIG. 6 is an example of a Morton code relationship between neighbor points that are coplanar with the current point to be encoded within a neighborhood range provided by an embodiment of the present application.
- FIG. 7 is an example of a Morton code relationship between neighbor points that are collinear with the current point to be encoded within a neighborhood range provided by an embodiment of the present application.
- the current point to be encoded is a thick line marking block A
- the neighbor search range is a 3×3×3 neighborhood of the current point to be encoded.
- the Morton code of the current point is used to obtain the block with the smallest Morton code value in the neighborhood, and the block is used as the reference block to find the coded neighbor points that are coplanar and collinear with the current point 7 to be coded.
- the current point 7 to be encoded is coplanar with the neighbor point 3 , the neighbor point 5 , and the neighbor point 6 .
- the current point 7 to be encoded is collinear with the neighbor point 1 , the neighbor point 2 and the neighbor point 4 .
- The reference block is used to search for k neighbor points that are coplanar and collinear with the current point to be coded (for example, when selecting a neighborhood, k ≤ 6), and these k neighbor points are used to predict the current point to be coded.
- If a coplanar neighbor point is found in the set of already coded points, the weight of the coplanar neighbor point is set to 2; the search then continues among the already coded points for neighbors that are collinear with the current point to be encoded. If a collinear neighbor point is found in the set of already coded points, the weight of the collinear neighbor point is set to 1. Finally, the weighted average of the found neighbor points is used to predict the attribute of the current point to be encoded. If no already coded neighbor point that is coplanar or collinear with the current point to be encoded is found, the attribute can be predicted from the point corresponding to the previous Morton code of the current point to be encoded.
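- The weighting rule above (weight 2 for coplanar neighbors, weight 1 for collinear neighbors, fallback to the previous point) can be sketched as follows. The classification used here is an assumption for illustration: on an integer grid, a neighbor inside the 3×3×3 neighborhood is treated as coplanar when it differs from the current point in exactly one coordinate and as collinear when it differs in exactly two. The reference-block lookup is replaced by a plain scan over a dictionary of coded points.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[int, int, int]


def classify(current: Point, other: Point) -> Optional[str]:
    """Return 'coplanar', 'collinear' or None for a point of the 3x3x3 neighborhood."""
    diffs = [abs(a - b) for a, b in zip(current, other)]
    if max(diffs) > 1:
        return None  # outside the 3x3x3 neighborhood
    changed = sum(1 for d in diffs if d)
    return {1: "coplanar", 2: "collinear"}.get(changed)


def predict_from_neighbors(
    current: Point,
    coded: Dict[Point, float],  # positions and attribute values of already coded points
    previous_attr: float,       # attribute of the point with the previous Morton code
) -> float:
    """Weighted average with weight 2 (coplanar) or 1 (collinear)."""
    num = 0.0
    den = 0.0
    for pos, attr in coded.items():
        kind = classify(current, pos)
        if kind is None:
            continue
        w = 2.0 if kind == "coplanar" else 1.0
        num += w * attr
        den += w
    # No coplanar or collinear coded neighbor: fall back to the previous point.
    return num / den if den else previous_attr
```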
- the prediction method for reflectance attribute information and the prediction method for color attribute information both determine the predicted value of attribute information according to geometric position information.
- both can therefore be called prediction methods based on geometric location.
- Geometric location-based prediction methods are usually suitable for denser and more predictable point clouds, such as human point clouds, or for situations with small prediction residuals. If the residual generated by the geometric location-based prediction method is large, the attribute value-based prediction method can usually reduce the prediction residual and improve the coding efficiency.
- the prediction value method based on attribute value can be realized by the following steps:
- the coplanar and collinear points among the encoded points are found as the neighbor points of the current point, different weights are set for the coplanar points and the collinear points for the weighted calculation, and finally the predicted value of the attribute information of the corresponding point is obtained. For sparse point cloud data, coplanar or collinear points cannot be found in more than 90% of cases, and AVS encoding then uses the previous point for prediction.
- the point cloud in one scene can correspond to hundreds of thousands of points. Such a large number of points also brings challenges to the storage and transmission of the computer.
- using the coplanar and collinear points among the encoded points directly as the neighbor points of the current point requires too much computation, resulting in excessively high prediction complexity.
- This application proposes a color prediction-oriented neighbor point optimization method, which can reduce the prediction complexity on the basis of ensuring the prediction effect by better utilizing the spatial correlation of the adjacent points of the point cloud.
- N encoded points are selected from M encoded points as N candidate points, a distance-first or geometry-first approach is then used to select k neighbor points from the N candidate points, and finally attribute prediction is performed based on the selected k neighbor points. It should be noted that, in the embodiments of the present application, both the selection of N candidate points from M coded points and the selection of k neighbor points from N candidate points can be implemented by searching, mapping, or other means.
- the neighbor point selection optimization method for point cloud attribute prediction proposed in this application can be applied to any 3D point cloud encoding and decoding products.
- FIG. 8 is a schematic flowchart of an encoding method 300 based on point cloud attribute prediction provided by an embodiment of the present application.
- the method 300 may be performed by an encoder or an encoding end.
- For example, the encoding framework 100 shown in FIG. 1.
- the encoding method 300 may include:
- S320, select N coded points from the M coded points in the point cloud as the N candidate points of the target point, M ≥ N ≥ 1;
- the attribute value of the k neighbor points is the reconstructed value of the attribute information of the k neighbor points or the reconstruction value of the attribute information of the k neighbor points.
- when the encoding end encodes the attribute information of the target point, it first selects N encoded points from the M encoded points as the N candidate points of the target point, then selects k neighbor points from the N candidate points, then uses the attribute values of the k neighbor points to determine the predicted value of the attribute information of the target point, and finally obtains the residual value of the attribute information of the target point from the predicted value and the original value of the attribute information of the target point, and encodes the residual value to obtain the code stream of the point cloud.
- the solution provided by the present application can reduce the prediction complexity on the basis of ensuring the prediction effect.
- this S320 may include:
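- based on a first order of the M coded points, the N coded points are selected from the M coded points; wherein the first order is the order obtained by performing Morton sorting or Hilbert sorting on the M coded points and the target point from small to large or from large to small, or the first order is the coding order of the M coded points and the target point; the N coded points are determined as the N candidate points.
- the encoding end sorts the points in the point cloud, for example by using Morton codes or Hilbert codes to represent the coordinates of the points in the point cloud and sorting them from small to large or from large to small; or it does not sort and keeps the coding order of the points; the coding order of the points can also be called the input order of the points.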
- the current number of the point to be encoded is i
- the corresponding Hilbert code is m_i
- the encoded points preceding it in the order are i-1, i-2, ..., 1, 0, and the corresponding Hilbert codes are m_(i-1), m_(i-2), ..., m_1, m_0.
- the S320 may specifically include:
- along the first order, the N points that are before the target point and adjacent to the target point are determined as the N coded points; or, along the first order, N consecutive points before the target point are determined as the N coded points, wherein the N consecutive points are either adjacent to the target point or separated from the target point by at least one coded point.
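- A minimal sketch of this candidate selection along the first order is shown below. The function name and the `gap` parameter (the number of coded points skipped between the selected block and the target point) are hypothetical; `gap=0` corresponds to the N points immediately before the target point.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_candidates(order: Sequence[T], target_index: int, n: int, gap: int = 0) -> List[T]:
    """Pick up to n consecutive coded points that precede the target in `order`.

    order        : points arranged in the first order (Morton, Hilbert or coding order)
    target_index : position of the target point in `order`; earlier points are coded
    gap          : number of coded points left out between the block and the target
    """
    end = target_index - gap
    if end <= 0:
        return []  # nothing coded far enough before the target
    start = max(0, end - n)
    return list(order[start:end])
```

For the order `a b c d e f g` with target `f`, `n=3` selects `c d e`, and `n=3, gap=1` selects `b c d`. Bounding the search to N points in this way is what keeps the selection cost independent of the size of the point cloud.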
- the achievable solution includes but is not limited to at least one of the following methods:
- the N coded points are directly selected from the M coded points, which can effectively control the selection complexity of the candidate points and improve the prediction efficiency.
- the encoding end may also randomly select the N encoded points from the M encoded points, which is not specifically limited in this embodiment of the present application.
- the first order may be the order formed by directly sorting the point cloud by the encoding end, or the order formed by only sorting the M encoded points and the target point.
- for example, when the point cloud is a dense point cloud, the first order can be the order formed by sorting only the M encoded points and the target point, and when the point cloud is a sparse point cloud, the first order can be the order formed by sorting all the points in the point cloud, so as to reduce the workload and improve the prediction efficiency.
- when the encoding end sorts the points in the point cloud or only the M encoded points and the target point, all directions (x, y, z) can be processed, or only one or more of the directions, which is not specifically limited in this embodiment of the present application.
- the encoding end can perform Morton sorting on the points in the point cloud (or only on the M encoded points and the target point) according to the position information of the points, or perform Hilbert sorting on the points in the point cloud (or only on the M encoded points and the target point) according to the position information of the points.
- the position information of a point may be its three-dimensional position information in the point cloud, or position information in one or more dimensions.
- the encoding end may determine, according to actual requirements, how many dimensions of position information are used to sort the points in the point cloud (or the M encoded points and the target point).
- a neighbor point for predicting the attribute information of the target point may be selected from the N candidate points.
- the encoder can calculate the distance between each of the N candidate points and the target point and determine the k neighbor points based on these distances; the k neighbor points can also be located by means of a geometric structure, for example an octree structure formed from the N candidate points and the target point; other mapping methods can also be used for locating; alternatively, the N candidate points and the target point can be sorted and the k neighbor points selected from the sorted candidate points, for example based on the order formed by sorting the N candidate points and the target point. This is not specifically limited in the present application. It should also be noted that the present application does not limit the metric or specific implementation of the distance involved in the calculation process. For example, Euclidean distance or Manhattan distance may be employed.
- the order formed by sorting the N candidate points and the target point may be the order formed by directly sorting the point cloud, or the order formed by sorting only the N candidate points and the target point. For example, when the point cloud is a dense point cloud, it can be the order formed by sorting only the N candidate points and the target point, and when the point cloud is a sparse point cloud, it can be the order formed by sorting all the points in the point cloud, so as to reduce the workload and improve the prediction efficiency.
- when the encoding end sorts the points in the point cloud (or only the N candidate points and the target point), all directions (x, y, z) can be processed, or only one or more of the directions, which is not specifically limited in this embodiment of the present application.
- the encoding end can perform Morton sorting on the points in the point cloud (or only on the N candidate points and the target point) according to the position information of the points, or perform Hilbert sorting on the points in the point cloud (or only on the N candidate points and the target point) according to the position information of the points.
- the position information of a point may be its three-dimensional position information in the point cloud, or position information in one or more dimensions.
- the encoding end may determine, according to actual requirements, how many dimensions of position information are used to sort the points in the point cloud or the N candidate points.
- the S330 may include:
- the geometric structural relationship is represented by an octree structure; based on the octree structure, the k nearest neighbor points of the target point are determined; and the k nearest neighbor points are determined as the k neighbor points.
- the encoder selects the k nearest neighbor points as neighbor points; that is, selects the k points closest to the target point among the N candidate points.
- the encoder can use the K-Nearest Neighbor (KNN) classification algorithm to calculate and obtain its K nearest neighbors.
- K nearest neighbors means the K closest neighbors, that is, each point can be represented by its K closest neighbors.
- the second order is determined by using the reconstruction information of the position information of the target point and the reconstruction information of the position information of the N candidate points; based on the second order, the k neighbor points are selected from the N candidate points; wherein the second order is the order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point from small to large or from large to small; or the second order is the order obtained by sorting the distances between each of the N candidate points and the target point from large to small or from small to large, where the distance from each of the N candidate points to the target point is the Euclidean distance or the Manhattan distance.
- the encoding end can select, based on the geometric structure relationship, the candidate points that are collinear and/or coplanar with the target point from the N candidate points. If the number of candidate points that are collinear and/or coplanar with the target point is less than k, or there is no candidate point that is collinear or coplanar with the target point, the encoding end determines the k neighbor points based on the distances between the N candidate points and the target point or based on the second order. If the number of candidate points that are collinear and/or coplanar with the target point is greater than or equal to k, all the candidate points that are collinear and/or coplanar with the target point are determined as the k neighbor points, or k points are selected from all the candidate points that are collinear and/or coplanar with the target point as the k neighbor points.
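- The geometry-first selection with its distance-based fallback can be sketched as follows. This is one possible reading, not the definitive procedure: the coplanar/collinear test (an offset of at most 1 on every axis, with one or two axes changed), the use of the Manhattan distance, and the choice of keeping the k closest geometric candidates are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int, int]


def manhattan(p: Point, q: Point) -> int:
    return sum(abs(a - b) for a, b in zip(p, q))


def is_coplanar_or_collinear(target: Point, cand: Point) -> bool:
    """Assumed test: adjacent grid cell that shares a face (1 axis) or an edge (2 axes)."""
    diffs = [abs(a - b) for a, b in zip(target, cand)]
    return max(diffs) <= 1 and sum(1 for d in diffs if d) in (1, 2)


def select_neighbors_geometry_first(
    target: Point, candidates: Sequence[Point], k: int
) -> List[Point]:
    geometric = [c for c in candidates if is_coplanar_or_collinear(target, c)]
    if len(geometric) >= k:
        # Enough coplanar/collinear candidates: keep k of them (here the closest).
        pool = geometric
    else:
        # Fewer than k (possibly none): fall back to distance over all N candidates.
        pool = list(candidates)
    return sorted(pool, key=lambda c: manhattan(target, c))[:k]
```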
- the method for the encoder to select the k neighbor points from the N candidate points specifically includes, but is not limited to, at least one of the following methods:
- d_0 can be a fixed value or multiple fixed values.
- d_0 can be 1 or 2, i.e. select points in the dense point cloud that are coplanar or collinear with the target point.
- when the encoding end selects the k neighbor points from the N candidate points based on the second order, it can select the k neighbor points from the N candidate points based on the sequence number of the target point in the second order.
- the achievable solution includes but is not limited to at least one of the following ways:
- the encoder may also randomly select the k neighbor points from the N candidate points, which is not specifically limited in this embodiment of the present application.
- the S330 may include:
- based on the reconstruction information of the position information of the target point and the reconstruction information of the position information of the N candidate points, determine the distance between each candidate point in the N candidate points and the target point; based on the distance between each candidate point in the N candidate points and the target point, select the k neighbor points from the N candidate points, wherein the distance between a candidate point in the N candidate points and the target point is either the Euclidean distance or the Manhattan distance.
- the correlation or similarity between the attribute information of the candidate point and the attribute information of the target point can be reflected.
- the k neighbor points that meet the preset conditions or attribute prediction conditions are preferentially selected from the N candidate points.
- the S330 may specifically include:
- determine the first target candidate points among the N candidate points as the k neighbor points, where a first target candidate point is a point among the N candidate points whose distance from the target point is less than the first threshold; or, determine the second target candidate points among the N candidate points as the k neighbor points, where a second target candidate point is a point among the N candidate points whose distance from the target point is less than the second threshold.
- the method for the encoder to select the k neighbor points from the N candidate points specifically includes, but is not limited to, at least one of the following methods:
- d_0 can be a fixed value or multiple fixed values.
- d_0 can be 1 or 2, i.e. select points in the dense point cloud that are coplanar or collinear with the target point.
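- One way to read the fixed values of d_0 is as allowed distances between a candidate point and the target point. The sketch below assumes integer grid coordinates and the squared Euclidean distance, under which a value of 1 corresponds to a face-sharing (coplanar) neighbor and a value of 2 to an edge-sharing (collinear) neighbor. This interpretation, the function names and the default `d0_values` are assumptions, since the listed methods are not reproduced in this text.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[int, int, int]


def euclidean_sq(p: Point, q: Point) -> int:
    """Squared Euclidean distance between two integer grid points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def select_by_fixed_distance(
    target: Point,
    candidates: Sequence[Point],
    d0_values: Iterable[int] = (1, 2),
) -> List[Point]:
    """Keep the candidates whose distance to the target is one of the d_0 values."""
    allowed = set(d0_values)
    return [c for c in candidates if euclidean_sq(target, c) in allowed]
```

Passing `d0_values=(1,)` keeps only face-sharing neighbors, which illustrates the "one fixed value or multiple fixed values" options.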
- the S330 may include:
- the second order is the order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point from small to large or from large to small; or, the second order is the order obtained by sorting the distances between each candidate point in the N candidate points and the target point from large to small or from small to large, where the distance from each candidate point in the N candidate points to the target point is the Euclidean distance or the Manhattan distance.
- the encoder may select the k neighbor points from the N candidate points by sequence number based on the second order. For example, based on the second order of the N candidate points, some candidate points are selected from the N candidate points as the k neighbor points. For example, along the second order, the k points that are before the target point and adjacent to the target point are determined as the k neighbor points; or, along the second order, k consecutive points before the target point are determined as the k neighbor points, wherein the k consecutive points are either adjacent to the target point or separated from the target point by at least one candidate point.
- the achievable solution includes, but is not limited to, at least one of the following methods:
- the encoder may also randomly select the k neighbor points from the N candidate points, which is not specifically limited in this embodiment of the present application.
- the S340 may include:
- the weighted average of the attribute values of the k neighbor points is determined as the predicted value of the attribute information of the target point, wherein the initial weight of a neighbor point in the k neighbor points decreases as the distance between that neighbor point and the target point increases, and the code stream includes the initial weight of each neighbor point in the k neighbor points; or, the attribute value of the neighbor point closest to the target point among the k neighbor points is determined as the predicted value of the attribute information of the target point.
- the encoder can use the obtained attribute values of neighbor points to calculate the predicted value of the attribute information of the target point, and the specific calculation process includes but is not limited to at least one of the following methods:
- the encoder can set the same or different weight values for different neighbor points, for example a larger weight value for points at a closer distance; the predicted value is the weighted average of the attribute values of the neighbor points.
- the decoding end can obtain the corresponding weight value by parsing the code stream.
- the S350 may include:
- the first neighbor point refers to a neighbor point among the k neighbor points whose distance from the reference point is greater than the first threshold; the second neighbor point refers to a neighbor point among the k neighbor points whose distance from the reference point is greater than or equal to the second threshold; the k neighbor points include the reference point;
- the attribute values of the remaining neighbor points are used to determine the predicted value of the attribute information of the target point.
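- The elimination of neighbor points that lie too far from the reference point can be sketched as follows. The sketch assumes a geometric Manhattan distance and a single `threshold` parameter with an `inclusive` flag to cover the "greater than" and "greater than or equal to" variants; the function name and these parameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int, int]


def manhattan(p: Point, q: Point) -> int:
    return sum(abs(a - b) for a, b in zip(p, q))


def drop_far_neighbors(
    neighbors: Sequence[Point],
    reference: Point,          # one of the k neighbor points
    threshold: int,
    inclusive: bool = False,   # True: also drop points exactly at the threshold
) -> List[Point]:
    """Return the neighbors that remain after removing the distant ones."""
    kept = []
    for nb in neighbors:
        d = manhattan(nb, reference)
        too_far = d >= threshold if inclusive else d > threshold
        if not too_far:
            kept.append(nb)
    return kept
```

The remaining points are then the only ones whose attribute values enter the prediction.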
- the encoder can process the obtained neighbor points, and can eliminate points with large differences to avoid introducing errors, including but not limited to at least one of the following methods:
- the magnitude of the sequence numbers of the above-mentioned processes does not imply an execution order; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
- FIG. 9 is a schematic flowchart of a decoding method 400 based on point cloud attribute prediction provided by an embodiment of the present application.
- the decoding method 400 may include:
- S410, obtain the code stream of the point cloud, parse the code stream of the point cloud, and obtain reconstruction information of the position information of the target point in the point cloud;
- this S420 may include:
- based on a first order of the M decoded points, the N decoded points are selected from the M decoded points; wherein the first order is the order obtained by performing Morton sorting or Hilbert sorting on the M decoded points and the target point from small to large or from large to small, or the first order is the decoding order of the M decoded points and the target point; the N decoded points are determined as the N candidate points.
- the solution provided by the present application can reduce the prediction complexity on the basis of ensuring the prediction effect.
- the S420 may specifically include:
- along the first order, the N points preceding and adjacent to the target point are determined as the N decoded points; or, along the first order, N consecutive points preceding the target point are determined as the N decoded points, wherein the N consecutive points are either adjacent to the target point or separated from the target point by at least one decoded point.
- the S430 may include:
- the geometric structure relationship is represented by an octree structure; the S430 may specifically include:
- the k nearest neighbors of the target point are determined; the k nearest neighbors are determined as the k neighbors.
- the S430 may include:
- the p candidate points are determined as the k neighbor points.
- k candidate points are selected from the p candidate points as the k neighbor points.
- the S430 may include:
- if the number p of candidate points is less than the number k of neighbor points, or the number p of candidate points is equal to 0, determine the distance between each candidate point in the N candidate points and the target point based on the reconstruction information of the position information of the target point and the reconstruction information of the position information of the N candidate points;
- based on these distances, the k neighbor points are selected from the N candidate points, wherein the distance from a candidate point in the N candidate points to the target point is the Euclidean distance or the Manhattan distance.
- the S430 may include:
- if the number p of candidate points is less than the number k of neighbor points, or the number p of candidate points is equal to 0, then the reconstruction information of the position information of the target point and the reconstruction information of the position information of the N candidate points are used to determine the second order; based on the second order, the k neighbor points are selected from the N candidate points;
- the second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or, the second order is the order obtained by sorting the distances between the N candidate points and the target point from large to small or from small to large, wherein the distance from a candidate point to the target point is the Euclidean distance or the Manhattan distance.
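As an illustration of Morton sorting and of taking the points that precede the target in the sorted order, the sketch below interleaves coordinate bits and then picks a window of preceding indices. The bit layout (x as the highest bit of each triple), the 10-bit coordinate width, and the names `morton_code` and `preceding_candidates` are assumptions for the example, not details taken from the application.

```python
from typing import List, Tuple

Point = Tuple[int, int, int]

def morton_code(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z (x highest within each triple)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code

def preceding_candidates(target_index: int, n: int) -> List[int]:
    """Indices of up to n points immediately before the target in the sorted order."""
    return list(range(max(0, target_index - n), target_index))

points = [(1, 1, 1), (0, 0, 1), (2, 0, 0), (0, 1, 0)]
ordered = sorted(points, key=lambda p: morton_code(*p))  # ascending Morton order
candidate_idx = preceding_candidates(target_index=3, n=2)
```

Points that are close in Morton order tend to be close in space, which is why a short window of preceding points is a reasonable pool of candidates.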
- the S430 may include:
- based on the reconstruction information of the position information of the target point and the reconstruction information of the position information of the N candidate points, the distance between each of the N candidate points and the target point is determined; based on these distances, the k neighbor points are selected from the N candidate points, wherein the distance between a candidate point and the target point is the Euclidean distance or the Manhattan distance.
- the S430 may specifically include:
- the first target candidate point among the N candidate points is determined as the k neighbor points, where the first target candidate point refers to a point among the N candidate points whose distance from the target point is less than the first threshold; or, the second target candidate point among the N candidate points is determined as the k neighbor points, where the second target candidate point refers to a point among the N candidate points whose distance from the target point is less than the second threshold.
- the S430 may include:
- the second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or, the second order is the order obtained by sorting the distances between the N candidate points and the target point from large to small or from small to large, wherein the distance from a candidate point to the target point is the Euclidean distance or the Manhattan distance.
- the S440 may include:
- the weighted average of the attribute values of the k neighbor points is determined as the predicted value of the attribute information of the target point, wherein the initial weight of each of the k neighbor points decreases as the distance between that neighbor point and the target point increases,
- and the code stream includes the initial weight of each of the k neighbor points; or, the attribute value of the neighbor point closest to the target point among the k neighbor points is determined as the predicted value of the attribute information of the target point.
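The two prediction options above can be sketched as follows. The text only requires that a neighbor's weight shrink as its distance to the target grows; the reciprocal-distance weight used here, the small `eps` guard against division by zero, and the function names are assumptions for illustration.

```python
from typing import Sequence

def predict_weighted(values: Sequence[float], distances: Sequence[float],
                     eps: float = 1e-9) -> float:
    """Weighted average of neighbor attribute values; closer neighbors weigh more."""
    weights = [1.0 / (d + eps) for d in distances]  # assumed weight: reciprocal distance
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def predict_nearest(values: Sequence[float], distances: Sequence[float]) -> float:
    """Alternative: take the attribute value of the closest neighbor."""
    nearest = min(range(len(values)), key=lambda i: distances[i])
    return values[nearest]

values = [100.0, 200.0]      # reconstructed attribute values of two neighbors
distances = [1.0, 3.0]       # their distances to the target point
weighted = predict_weighted(values, distances)
nearest = predict_nearest(values, distances)
```

With weights 1 and 1/3, the weighted prediction lands at about 125, pulled toward the closer neighbor, while the nearest-neighbor option returns 100.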
- the S440 may include:
- the first neighbor point refers to a neighbor point, among the k neighbor points, whose distance from the reference point is greater than the first threshold; the second neighbor point refers to a neighbor point, among the k neighbor points, whose distance from the reference point is greater than or equal to the second threshold; and the k neighbor points include the reference point;
- the attribute values of the remaining neighbor points are used to determine the predicted value of the attribute information of the target point.
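One possible reading of this step is that neighbor points lying too far from a reference point are discarded before prediction, and only the remaining neighbors contribute. The sketch below follows that reading; the choice of the first neighbor as the reference point, the Manhattan metric, the strict "greater than" comparison, and the names used are assumptions, since the surrounding text does not fix them.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int, int]

def manhattan(a: Point, b: Point) -> int:
    """Manhattan distance between two positions."""
    return sum(abs(x - y) for x, y in zip(a, b))

def drop_far_neighbors(neighbors: Sequence[Point], reference: Point,
                       threshold: int) -> List[Point]:
    """Keep neighbors whose distance to the reference point does not exceed threshold."""
    return [p for p in neighbors if manhattan(p, reference) <= threshold]

neighbors = [(0, 0, 0), (1, 0, 0), (9, 9, 9)]
# Assumption: the reference point is one of the k neighbors, here the first.
kept = drop_far_neighbors(neighbors, reference=neighbors[0], threshold=3)
```

The reference point always survives the filter because its distance to itself is zero, so at least one neighbor remains available for prediction.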
- for details of the decoding method 400, refer to the relevant description of the encoding method 300; they are not repeated here.
- FIG. 10 is a schematic block diagram of a point cloud encoder 500 provided by an embodiment of the present application.
- the encoder 500 may include:
- an obtaining unit 510 configured to obtain reconstruction information of the position information of the target point in the point cloud;
- a prediction unit 520 configured to: select N coded points from the M coded points in the point cloud as N candidate points of the target point, M ≥ N ≥ 1; based on the reconstruction information of the position information of the target point, select k neighbor points from the N candidate points, N ≥ k ≥ 1; and use the attribute values of the k neighbor points to determine the predicted value of the attribute information of the target point, where the attribute values of the k neighbor points are the reconstructed values or the original values of the attribute information of the k neighbor points;
- a residual unit 530 configured to obtain the residual value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the original value of the attribute information of the target point;
- an encoding unit 540 configured to encode the residual value of the attribute information of the target point to obtain the code stream of the point cloud.
- the number M of coded points exceeds the number N of candidate points; wherein, the prediction unit 520 is specifically used for:
- based on the first order of the M coded points, the N coded points are selected from the M coded points; wherein the first order is the order obtained by performing Morton sorting or Hilbert sorting, from small to large or from large to small, on the M coded points and the target point, or the first order is the encoding order of the M coded points and the target point;
- the N coded points are taken as the N candidate points.
- the prediction unit 520 is specifically used for:
- the N points that precede the target point and are adjacent to it are determined as the N coded points; or
- N consecutive points preceding the target point are determined as the N coded points, wherein the N consecutive points are either adjacent to the target point or separated from it by at least one coded point.
- the prediction unit 520 is specifically used for:
- the k neighbor points are selected from the N candidate points.
- the geometric structure relationship is represented by an octree structure; the prediction unit 520 is specifically used for:
- the k nearest neighbors are determined as the k neighbors.
- the prediction unit 520 is specifically used for:
- the p candidate points are determined as the k neighbor points.
- k candidate points are selected from the p candidate points as the k neighbor points.
- the prediction unit 520 is specifically used for:
- the distance between each of the N candidate points and the target point is determined;
- based on these distances, the k neighbor points are selected from the N candidate points, wherein the distance from a candidate point to the target point is the Euclidean distance or the Manhattan distance.
- the prediction unit 520 is specifically used for:
- if the number p of candidate points is less than the number k of neighbor points, or the number p of candidate points is equal to 0, then the reconstruction information of the position information of the target point and the reconstruction information of the position information of the N candidate points are used to determine the second order; based on the second order, the k neighbor points are selected from the N candidate points;
- the second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or, the second order is the order obtained by sorting the distances between the N candidate points and the target point from large to small or from small to large, wherein the distance from a candidate point to the target point is the Euclidean distance or the Manhattan distance.
- the prediction unit 520 is specifically used for:
- the k neighbor points are selected from the N candidate points, wherein the distance from a candidate point to the target point is the Euclidean distance or the Manhattan distance.
- the prediction unit 520 is specifically used for:
- a first target candidate point among the N candidate points is determined as the k neighbor points, where the first target candidate point refers to a point among the N candidate points whose distance from the target point is less than the first threshold; or
- a second target candidate point among the N candidate points is determined as the k neighbor points, where the second target candidate point refers to a point among the N candidate points whose distance from the target point is less than the second threshold.
- the prediction unit 520 is specifically used for:
- the k neighbor points are selected from the N candidate points
- the second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or, the second order is the order obtained by sorting the distances between the N candidate points and the target point from large to small or from small to large, wherein the distance from a candidate point to the target point is the Euclidean distance or the Manhattan distance.
- the prediction unit 520 is specifically used for:
- the weighted average of the attribute values of the k neighbor points is determined as the predicted value of the attribute information of the target point, wherein the initial weight of each of the k neighbor points decreases as the distance between that neighbor point and the target point increases; or
- the attribute value of the neighbor point with the closest distance to the target point among the k neighbor points is determined as the predicted value of the attribute information of the target point.
- the prediction unit 520 is specifically used for:
- the first neighbor point refers to a neighbor point, among the k neighbor points, whose distance from the reference point is greater than the first threshold; the second neighbor point refers to a neighbor point, among the k neighbor points, whose distance from the reference point is greater than or equal to the second threshold; and the k neighbor points include the reference point.
- the encoder 500 can also be combined with the encoding framework 100 shown in FIG. 1 , that is, the units in the encoder 500 can be replaced or combined with the relevant units in the encoding framework 100 .
- the prediction unit 520 and the residual unit 530 may be used to implement the relevant functions of the prediction transform unit 113 in the coding framework 100, and may even be used to implement the position encoding function and the function before prediction for attribute information.
- the encoding unit 540 can be used to replace the second arithmetic encoding unit 116 in the encoding framework 100 .
- FIG. 11 is a schematic block diagram of a decoder 600 provided by an embodiment of the present application.
- the decoder 600 may include:
- the parsing unit 610 is configured to obtain the code stream of the point cloud, parse the code stream of the point cloud, and obtain reconstruction information of the position information of the target point in the point cloud;
- a prediction unit 620 configured to: select N decoded points from the M decoded points in the point cloud as N candidate points of the target point, M ≥ N ≥ 1; based on the reconstruction information of the position information of the target point, select k neighbor points from the N candidate points, N ≥ k ≥ 1; and use the attribute values of the k neighbor points to determine the predicted value of the attribute information of the target point, where the attribute values of the k neighbor points are the reconstructed values of the attribute information of the k neighbor points;
- the parsing unit 610 is further configured to parse the code stream to obtain the residual value of the attribute information of the target point;
- a residual unit 630 configured to obtain the final reconstructed value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the residual value of the attribute information of the target point;
- the decoding unit 640 is configured to obtain a decoded point cloud according to the final reconstructed value of the attribute information of the target point.
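The relationship between the encoder's residual unit and the decoder's residual unit can be illustrated with a small round trip. This is a simplified sketch that assumes lossless coding (no quantization of the residual) and integer attribute values; the function names are hypothetical.

```python
def encode_residual(original: int, predicted: int) -> int:
    """Encoder side: residual value = original value - predicted value."""
    return original - predicted

def reconstruct(predicted: int, residual: int) -> int:
    """Decoder side: final reconstructed value = predicted value + residual value."""
    return predicted + residual

predicted = 125   # both sides derive the same prediction from the same neighbors
residual = encode_residual(original=130, predicted=predicted)
decoded = reconstruct(predicted, residual)
```

The round trip only works if the decoder forms exactly the same prediction as the encoder, which is why both sides select neighbors from already coded or decoded points using reconstructed position information.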
- the number M of decoded points exceeds the number N of candidate points; wherein, the prediction unit 620 is specifically used for:
- based on the first order of the M decoded points, the N decoded points are selected from the M decoded points; wherein the first order is the order obtained by performing Morton sorting or Hilbert sorting, from small to large or from large to small, on the M decoded points and the target point, or the first order is the decoding order of the M decoded points and the target point;
- the N decoded points are determined as the N candidate points.
- the prediction unit 620 is specifically used for:
- the N points that precede the target point and are adjacent to it are determined as the N decoded points; or
- N consecutive points preceding the target point are determined as the N decoded points, wherein the N consecutive points are either adjacent to the target point or separated from it by at least one decoded point.
- the prediction unit 620 is specifically used for:
- the k neighbor points are selected from the N candidate points.
- the geometric structure relationship is represented by an octree structure; the prediction unit 620 is specifically used for:
- the k nearest neighbors are determined as the k neighbors.
- the prediction unit 620 is specifically used for:
- the p candidate points are determined as the k neighbor points.
- k candidate points are selected from the p candidate points as the k neighbor points.
- the prediction unit 620 is specifically used for:
- the distance between each of the N candidate points and the target point is determined;
- based on these distances, the k neighbor points are selected from the N candidate points, wherein the distance from a candidate point to the target point is the Euclidean distance or the Manhattan distance.
- the prediction unit 620 is specifically used for:
- if the number p of candidate points is less than the number k of neighbor points, or the number p of candidate points is equal to 0, then the reconstruction information of the position information of the target point and the reconstruction information of the position information of the N candidate points are used to determine the second order; based on the second order, the k neighbor points are selected from the N candidate points;
- the second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or, the second order is the order obtained by sorting the distances between the N candidate points and the target point from large to small or from small to large, wherein the distance from a candidate point to the target point is the Euclidean distance or the Manhattan distance.
- the prediction unit 620 is specifically used for:
- based on the distance between each of the N candidate points and the target point, the k neighbor points are selected from the N candidate points, wherein the distance from a candidate point to the target point is the Euclidean distance or the Manhattan distance.
- the prediction unit 620 is specifically used for:
- a first target candidate point among the N candidate points is determined as the k neighbor points, where the first target candidate point refers to a point among the N candidate points whose distance from the target point is less than the first threshold; or
- a second target candidate point among the N candidate points is determined as the k neighbor points, where the second target candidate point refers to a point among the N candidate points whose distance from the target point is less than the second threshold.
- the prediction unit 620 is specifically used for:
- the k neighbor points are selected from the N candidate points
- the second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or, the second order is the order obtained by sorting the distances between the N candidate points and the target point from large to small or from small to large, wherein the distance from a candidate point to the target point is the Euclidean distance or the Manhattan distance.
- the prediction unit 620 is specifically used for:
- the weighted average of the attribute values of the k neighbor points is determined as the predicted value of the attribute information of the target point, wherein the initial weight of each of the k neighbor points decreases as the distance between that neighbor point and the target point increases, and the code stream includes the initial weight of each of the k neighbor points; or
- the attribute value of the neighbor point with the closest distance to the target point among the k neighbor points is determined as the predicted value of the attribute information of the target point.
- the prediction unit 620 is specifically used for:
- the first neighbor point refers to a neighbor point, among the k neighbor points, whose distance from the reference point is greater than the first threshold; the second neighbor point refers to a neighbor point, among the k neighbor points, whose distance from the reference point is greater than or equal to the second threshold; and the k neighbor points include the reference point.
- the decoder 600 can also be combined with the decoding framework 200 shown in FIG. 2 , that is, the units in the decoder 600 can be replaced or combined with the relevant units in the decoding framework 200 .
- the parsing unit 610 can be used to implement the related functions of the predictive transformation unit 213 in the decoding framework 200 , and can even be used to implement the related functions of the inverse quantization unit 211 and the second arithmetic decoding unit 210 .
- the prediction unit 620 and the residual unit 630 may be used to implement the related functions of the prediction transform unit 213 .
- the decoding unit 640 may be used to implement the function of the color space inverse transformation unit 215 in the decoding framework 200 .
- the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, details are not repeated here.
- the encoder 500 may correspond to the entity that executes the method 300 of the embodiments of the present application, and each unit in the encoder 500 implements the corresponding process in the method 300;
- similarly, the decoder 600 may correspond to the entity that executes the method 400 of the embodiments of the present application, and each unit in the decoder 600 implements the corresponding process in the method 400; for brevity, details are not repeated here.
- the units in the data processing apparatus for point cloud media involved in the embodiments of the present application may be separately or wholly combined into one or several other units, or one or more of the units may be further split into multiple functionally smaller units; either way, the same operations can be realized without affecting the technical effects of the embodiments of the present application.
- the above-mentioned units are divided based on logical functions.
- the function of one unit may also be implemented by multiple units, or the functions of multiple units may be implemented by one unit.
- the data processing apparatus for point cloud media may also include other units. In practical applications, these functions may also be implemented with the assistance of other units, and may be implemented by cooperation of multiple units.
- for example, the data processing apparatus for point cloud media involved in the embodiments of the present application may be constructed, and the encoding and decoding methods based on point cloud attribute prediction of the embodiments of the present application may be implemented, by running computer-readable instructions capable of executing the steps of the corresponding method on a general-purpose computing device, such as a general-purpose computer, that includes processing and storage elements such as a central processing unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM).
- the computer-readable instructions may be recorded on, for example, a computer-readable storage medium, loaded into any electronic device with data processing capability through the computer-readable storage medium, and run therein to implement the corresponding methods of the embodiments of the present application.
- the units mentioned above can be implemented in the form of hardware, can also be implemented by instructions in the form of software, and can also be implemented in the form of a combination of software and hardware.
- the steps of the method embodiments in the embodiments of the present application may be completed by hardware integrated logic circuits in the processor and/or instructions in the form of software, and the steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software in the decoding processor.
- the software may be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, and other storage media mature in the art.
- the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
- FIG. 12 is a schematic structural diagram of an encoding and decoding device 700 provided by an embodiment of the present application.
- the codec device 700 includes at least a processor 710 and a computer-readable storage medium 720 .
- the processor 710 and the computer-readable storage medium 720 may be connected through a bus or other means.
- the computer readable storage medium 720 is used for storing computer readable instructions 721
- the computer readable instructions 721 include computer instructions
- the processor 710 is used for executing the computer instructions stored in the computer readable storage medium 720 .
- the processor 710 is the computing core and the control core of the encoding/decoding device 700, which is suitable for implementing one or more computer instructions, and is specifically suitable for loading and executing one or more computer instructions to implement corresponding method processes or corresponding functions.
- the processor 710 may also be referred to as a central processing unit (Central Processing Unit, CPU).
- the processor 710 may include, but is not limited to: a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a Field Programmable Gate Array (Field Programmable Gate Array, FPGA) Or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like.
- the computer-readable storage medium 720 may be a high-speed RAM memory, or a non-volatile memory (Non-Volatile Memory), such as at least one disk memory.
- the computer-readable storage medium 720 includes, but is not limited to, volatile memory and/or non-volatile memory.
- the non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), or an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM).
- the volatile memory may be a random access memory (Random Access Memory, RAM), which acts as an external cache. Many forms of RAM are available, for example: static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and Direct Rambus RAM (DR RAM).
- the encoding and decoding device 700 may be the encoding framework 100 shown in FIG. 1 or the encoder 500 shown in FIG. 10; the computer-readable storage medium 720 stores first computer instructions; the processor 710 loads and executes the first computer instructions stored in the computer-readable storage medium 720 to implement the corresponding steps in the method embodiment shown in FIG. 8. The details are not repeated here.
- the encoding and decoding device 700 may be the decoding framework 200 shown in FIG. 2 or the decoder 600 shown in FIG. 11; the computer-readable storage medium 720 stores second computer instructions; the processor 710 loads and executes the second computer instructions stored in the computer-readable storage medium 720 to implement the corresponding steps in the method embodiment shown in FIG. 9. The details are not repeated here.
- an embodiment of the present application further provides a computer-readable storage medium (Memory), where the computer-readable storage medium is a memory device in the encoding and decoding device 700 for storing programs and data.
- computer readable storage medium 720 may include both a built-in storage medium in the encoding and decoding device 700 , and of course, may also include an extended storage medium supported by the encoding and decoding device 700 .
- the computer-readable storage medium provides storage space in which the operating system of the codec apparatus 700 is stored.
- one or more computer instructions suitable for being loaded and executed by the processor 710 are also stored in the storage space, and these computer instructions may be one or more computer-readable instructions 721 (including program codes). These computer instructions are used for the computer to execute the encoding and decoding methods based on point cloud attribute prediction provided in the above-mentioned various optional manners.
- an embodiment of the present application also provides a computer-readable instruction product, or computer-readable instructions, comprising computer instructions stored in a computer-readable storage medium, for example the computer-readable instructions 721.
- in this case, the encoding and decoding device 700 may be a computer: the processor 710 reads the computer instructions from the computer-readable storage medium 720 and executes them, so that the computer executes the encoding and decoding methods based on point cloud attribute prediction provided in the above-mentioned various optional manners.
- the computer-readable instruction product includes one or more computer instructions.
- the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
- the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Compression Of Band Width Or Redundancy In Fax (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Signal Processing For Digital Recording And Reproducing (AREA)
Abstract
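The present application provides a decoding method, an encoding method, a decoder, and an encoder based on point cloud attribute prediction, relating to the technical field of point cloud encoding and decoding in computer vision (image). The method includes: parsing the code stream of a point cloud to obtain reconstruction information of the position information of a target point in the point cloud; selecting N decoded points from M decoded points in the point cloud as N candidate points of the target point; based on the reconstruction information of the position information of the target point, selecting k neighbor points from the N candidate points; and using the attribute values of the k neighbor points to determine the predicted value of the attribute information of the target point, and then obtaining the decoded point cloud based on that predicted value. By designing the neighbor point selection method so that, as far as possible, neighbor points whose attributes are similar to those of the target point are used to predict the attribute information of the target point, the present application can reduce the prediction complexity while ensuring the prediction effect.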
Description
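This application claims priority to Chinese patent application No. 202110278568X, filed with the China Patent Office on March 12, 2021 and entitled "Decoding and encoding methods, decoder and encoder based on point cloud attribute prediction", the entire contents of which are incorporated herein by reference.
The embodiments of the present application relate to the technical field of computer vision (image) in artificial intelligence, in particular to the technical field of point cloud encoding and decoding, and more specifically to a decoding method, an encoding method, a decoder, and an encoder based on point cloud attribute prediction.
Point clouds have become widespread in many fields, for example virtual/augmented reality, robotics, geographic information systems, and medicine. As the accuracy and speed of scanning devices continue to improve, large numbers of points on the surface of an object can be captured accurately, and a single scene often corresponds to hundreds of thousands of points. Such a large number of points poses challenges for computer storage and transmission. Compression of points has therefore become a hot topic.
Compressing a point cloud mainly requires compressing its position information and attribute information. Specifically, the position information of the point cloud is first octree-encoded; meanwhile, according to the octree-encoded position information of the current point, the points used to predict the attribute information of the current point are selected from the already encoded points, the attribute information is predicted based on the selected points, and the color information is then encoded as the difference from the original value of the attribute information, thereby encoding the point cloud.
In the process of predicting attribute information, how to reduce the prediction complexity while ensuring the prediction effect is a technical problem that urgently needs to be solved in this field.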
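Summary of the invention
According to various embodiments provided in the present application, a decoding method, an encoding method, a decoder, and an encoder based on point cloud attribute prediction are provided.
In one aspect, the present application provides a decoding method based on point cloud attribute prediction, including:
obtaining the code stream of a point cloud and parsing it to obtain reconstruction information of the position information of a target point in the point cloud;
selecting N decoded points from M decoded points in the point cloud as N candidate points of the target point, M ≥ N ≥ 1;
based on the reconstruction information of the position information of the target point, selecting k neighbor points from the N candidate points, N ≥ k ≥ 1;
using the attribute values of the k neighbor points to determine the predicted value of the attribute information of the target point, where the attribute values of the k neighbor points are the reconstructed values of the attribute information of the k neighbor points;
parsing the code stream to obtain the residual value of the attribute information of the target point;
obtaining the final reconstructed value of the attribute information of the target point according to the predicted value and the residual value of the attribute information of the target point; and
obtaining the decoded point cloud according to the final reconstructed value of the attribute information of the target point.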
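In another aspect, the present application provides an encoding method based on point cloud attribute prediction, including:
obtaining reconstruction information of the position information of a target point in a point cloud;
selecting N coded points from M coded points in the point cloud as N candidate points of the target point, M ≥ N ≥ 1;
based on the reconstruction information of the position information of the target point, selecting k neighbor points from the N candidate points, N ≥ k ≥ 1;
using the attribute values of the k neighbor points to determine the predicted value of the attribute information of the target point, where the attribute values of the k neighbor points are the reconstructed values or the original values of the attribute information of the k neighbor points;
obtaining the residual value of the attribute information of the target point according to the predicted value and the original value of the attribute information of the target point; and
encoding the residual value of the attribute information of the target point to obtain the code stream of the point cloud.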
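In another aspect, the present application provides a decoder based on point cloud attribute prediction, configured to execute the decoding method based on point cloud attribute prediction in the above second aspect or its implementations. Specifically, the decoder includes functional modules for executing the decoding method based on point cloud attribute prediction in the above second aspect or its implementations.
In one implementation, the decoder includes:
a parsing unit, configured to obtain the code stream of a point cloud and parse it to obtain reconstruction information of the position information of a target point in the point cloud;
a prediction unit, configured to: select N decoded points from M decoded points in the point cloud as N candidate points of the target point, M ≥ N ≥ 1; based on the reconstruction information of the position information of the target point, select k neighbor points from the N candidate points, N ≥ k ≥ 1; and use the attribute values of the k neighbor points to determine the predicted value of the attribute information of the target point, where the attribute values of the k neighbor points are the reconstructed values of the attribute information of the k neighbor points;
the parsing unit is further configured to parse the code stream to obtain the residual value of the attribute information of the target point;
a residual unit, configured to obtain the final reconstructed value of the attribute information of the target point according to the predicted value and the residual value of the attribute information of the target point; and
a decoding unit, configured to obtain the decoded point cloud according to the final reconstructed value of the attribute information of the target point.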
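In another aspect, the present application provides an encoder based on point cloud attribute prediction, configured to execute the encoding method based on point cloud attribute prediction in the above first aspect or its implementations. Specifically, the encoder includes functional modules for executing the encoding method based on point cloud attribute prediction in the above first aspect or its implementations.
In one implementation, the encoder includes:
an obtaining unit, configured to obtain reconstruction information of the position information of a target point in a point cloud;
a prediction unit, configured to: select N coded points from M coded points in the point cloud as N candidate points of the target point, M ≥ N ≥ 1; based on the reconstruction information of the position information of the target point, select k neighbor points from the N candidate points, N ≥ k ≥ 1; and use the attribute values of the k neighbor points to determine the predicted value of the attribute information of the target point, where the attribute values of the k neighbor points are the reconstructed values or the original values of the attribute information of the k neighbor points;
a residual unit, configured to obtain the residual value of the attribute information of the target point according to the predicted value and the original value of the attribute information of the target point; and
an encoding unit, configured to encode the residual value of the attribute information of the target point to obtain the code stream of the point cloud.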
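In another aspect, the present application provides an encoding and decoding device, including:
a processor, adapted to implement computer instructions; and
a computer-readable storage medium storing computer instructions, the computer instructions being adapted to be loaded by the processor to execute the encoding and decoding method in any one of the above first and second aspects or their implementations.
In one implementation, there are one or more processors and one or more memories.
In one implementation, the computer-readable storage medium may be integrated with the processor, or the computer-readable storage medium may be provided separately from the processor.
In another aspect, an embodiment of the present application provides a computer-readable storage medium storing computer instructions which, when read and executed by a processor of a computer device, cause the computer device to execute the encoding and decoding method in any one of the above first and second aspects or their implementations.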
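In the embodiments of the present application, after N decoded points are selected from the M decoded points in the point cloud as the N candidate points of the target point, k neighbor points are selected from the N candidate points based on the reconstruction information of the position information of the target point. For dense point clouds, this avoids an excessively large number of candidate points for selecting the k neighbor points and so reduces the prediction complexity. In addition, using the attribute values of the k neighbor points to determine the predicted value of the attribute information of the target point ensures the prediction accuracy for the attribute information of the target point. The solution provided by the present application can therefore reduce the prediction complexity while ensuring the prediction effect.
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.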
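FIG. 1 is a schematic block diagram of an encoding framework provided by an embodiment of the present application.
FIG. 2 is a schematic block diagram of a decoding framework provided by an embodiment of the present application.
FIG. 3 is a schematic diagram of a point cloud in the original Morton order provided by an embodiment of the present application.
FIG. 4 is a schematic diagram of a point cloud in the offset Morton order provided by an embodiment of the present application.
FIG. 5 is a schematic structural diagram of the spatial relationship of neighbor points provided by an embodiment of the present application.
FIG. 6 is an example, provided by an embodiment of the present application, of the Morton code relationship between neighbor points that are coplanar with the current point to be encoded within the neighborhood range.
FIG. 7 is an example, provided by an embodiment of the present application, of the Morton code relationship between neighbor points that are collinear with the current point to be encoded within the neighborhood range.
FIG. 8 is a schematic flowchart of an encoding method provided by an embodiment of the present application.
FIG. 9 is a schematic flowchart of a decoding method provided by an embodiment of the present application.
FIG. 10 is a schematic block diagram of an encoder provided by an embodiment of the present application.
FIG. 11 is a schematic block diagram of a decoder provided by an embodiment of the present application.
FIG. 12 is a schematic block diagram of an electronic device provided by an embodiment of the present application.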
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
图1是本申请实施例提供的编码框架100的示意性框图。
如图1所示,编码框架100可以从采集设备获取点云的位置信息和属性信息。点云的编码包括位置编码和属性编码。在一个实施例中,位置编码的过程包括:对原始点云进行坐标变换、量化去除重复点等预处理;构建八叉树后进行编码形成几何码流。属性编码过程包括:通过给定输入点云的位置信息的重建信息和属性信息的原始值,选择三种预测模式的一种进行点云预测,对预测后的结果进行量化,并进行算术编码形成属性码流。
如图1所示,位置编码可通过以下单元实现:
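A coordinate transform (Transform coordinates) unit 101, a quantize-and-remove-duplicate-points (Quantize and remove points) unit 102, an octree analysis (Analyze octree) unit 103, a geometry reconstruction (Reconstruct geometry) unit 104, and a first arithmetic encoding (Arithmetic encode) unit 105.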
坐标变换单元101可用于预处理(Pre-processing)点云中的点,即可用于坐标变换和体素化(Voxelize),坐标变换可以指将点云中点的世界坐标变换为相对坐标。例如,通过缩放操作(点的几何坐标分别减去xyz坐标轴的最小值)和平移操作,将3D空间中的点云中的点的数据转换成整数形式,并将其最小几何位置移至坐标原点处。针对点的缩放操作相当于去直流操作,以实现将点云中的点的坐标从世界坐标变换为相对坐标。
量化和移除重复点单元102可通过几何量化(Geometry quantization)减少坐标的数目。几何量化的精细程度通常由量化参数(QP)来决定,QP取值较大,表示更大取值范围的系数将被量化为同一个输出,因此通常会带来更大的失真,及较低的码率;相反,QP取值较小,表示较小取值范围的系数将被量化为同一个输出,因此通常会带来较小的失真,同时对应较高的码率。在点云编码中,量化是直接对点的坐标信息进行的。量化后原先不同的点可能被赋予相同的坐标,基于此,可通过去重操作将重复的点删除;例如,具有相同量化位置和不同属性信息的多个点可通过属性变换合并到一个点中。在本申请的一些实施例中,量化和移除重复点单元102可作为可选的单元模块。
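As a rough illustration of the quantization and duplicate-removal step described above, the following minimal Python sketch quantizes coordinates with a uniform step and merges points that land on the same quantized position. The function name `quantize_and_merge`, the round-half-up rule, and the use of the mean as the merged attribute are assumptions made for illustration; the application does not specify them.

```python
import math
from collections import defaultdict


def quantize_and_merge(points, attrs, step):
    """Quantize coordinates with a uniform step and merge duplicate positions.

    Points that fall on the same quantized position are merged into a single
    point whose attribute is the mean of the merged attributes (an assumed
    attribute-transfer rule, chosen only for this sketch).
    """
    buckets = defaultdict(list)
    for (x, y, z), attr in zip(points, attrs):
        # Round half up so the result does not depend on banker's rounding.
        key = tuple(math.floor(c / step + 0.5) for c in (x, y, z))
        buckets[key].append(attr)
    merged_points = sorted(buckets)
    merged_attrs = [sum(buckets[p]) / len(buckets[p]) for p in merged_points]
    return merged_points, merged_attrs
```

A larger `step` plays the role of a larger quantization parameter: more points collapse onto the same position, which lowers the bit rate at the cost of more distortion.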
几何编码(Geometry encoding)中包含两种模式,即基于八叉树的几何编码(Octree)和基于三角表示的几何编码(Trisoup),其可在不同条件下使用。八叉树分析单元103可利用八叉树(octree)编码方式编码量化的点的位置信息。八叉树是一种树形数据结构,在3D空间划分中,对预先设定的包围盒进行均匀划分,每个节点都具有八个子节点。通过对八叉树各个子节点的占用与否采用‘1’和‘0’指示,获得占用码信息(occupancy code)作为点云几何信息的码流。例如,将点云按照八叉树的形式进行划分,由此,点的位置可以和八叉树的位置一一对应,通过统计八叉树中有点的位置,并将其标识(flag)记为1,以进行几何编码。基于三角表示的几何编码将点云划分为一定大小的块(block),定位点云表面在块的边缘的交点并构建三角形,通过编码交点位置实现几何信息的压缩。
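The occupancy code of a single octree node can be sketched as follows. This is a minimal example, assuming integer coordinates, a cubic node whose edge length is an even number, and a child index laid out as `x<<2 | y<<1 | z`; the bit layout and the function name `occupancy_code` are illustrative assumptions, not taken from the application.

```python
def occupancy_code(points, origin, size):
    """Return the 8-bit occupancy code of one octree node.

    origin: minimum corner (x, y, z) of the node.
    size:   edge length of the node (assumed even).
    Bit c of the result is 1 when child c contains at least one point.
    """
    half = size // 2
    code = 0
    for x, y, z in points:
        cx = int(x - origin[0] >= half)
        cy = int(y - origin[1] >= half)
        cz = int(z - origin[2] >= half)
        child = (cx << 2) | (cy << 1) | cz  # assumed child ordering
        code |= 1 << child
    return code
```

Applying the same function recursively to each occupied child yields the sequence of occupancy codes that the entropy coder then compresses.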
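The first arithmetic encoding unit 105 may be used for geometry entropy encoding (Geometry entropy encoding), that is, statistical compression coding of the occupancy code information of the octree, finally outputting a binarized (0 or 1) compressed bitstream. Statistical compression coding is a lossless coding method that can effectively reduce the bit rate required to express the same signal. A commonly used statistical compression coding method is context-based adaptive binary arithmetic coding (Context-based Adaptive Binary Arithmetic Coding, CABAC). In short, the position information output by the octree analysis unit 103 is arithmetically encoded using entropy coding, that is, the position information output by the octree analysis unit 103 is turned into a geometry bitstream by arithmetic coding; the geometry bitstream may also be called a geometry bit stream (geometry bitstream).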
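In lossy coding, after the geometry information is encoded, the encoder side needs to decode and reconstruct the geometry information, that is, recover the coordinate information of each point of the 3D point cloud. For each point, the reconstructed values of the attribute information of one or more neighboring points in the original point cloud are looked up and used as the predicted value of the attribute information of that point. The residual value of the attribute information of the point is obtained from its predicted value and its original value, and the encoder encodes the residual values of the attribute information of all points of the point cloud to obtain the attribute bitstream. The encoder can also obtain the reconstructed value of the attribute information from the predicted value and the residual value.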
属性编码可通过以下单元实现:
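A color space transform (Transform colors) unit 110, an attribute transfer (Transfer attributes) unit 111, a region adaptive hierarchical transform (Region Adaptive Hierarchical Transform, RAHT) unit 112, a predicting transform (predicting transform) unit 113, a lifting transform (lifting transform) unit 114, a quantization (Quantize) unit 115, and a second arithmetic encoding unit 116.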
颜色空间变换单元110可用于将点云中点的RGB色彩空间变换为YCbCr格式或其他格式。
属性变换单元111可用于变换点云中点的属性信息,以最小化属性失真。例如,属性变换单元111可用于得到点的属性信息的原始值。例如,该属性信息可以是点的颜色信息。经过属性变换单元111变换得到点的属性信息的原始值后,可选择任一种预测单元,对点云中的点进行预测。
用于对点云中的点进行预测编码的单元可包括:分层区域自适应变换编码(Region Adaptive Hierarchical Transform,RAHT)单元112、预测变化(predicting transform)单元113以及提升变化(lifting transform)单元114中的至少一项,即属性变换编码中包含三种模式,可在不同条件下使用。换言之,RAHT 112、预测变化(predicting transform)单元113以及提升变化(lifting transform)单元114中的任一项可用于对点云中点的属性信息进行预测,以得到点的属性信息的预测值,进而可基于点的属性信息的预测值得到点的属性信息的残差值。例如,点的属性信息的残差值可以是点的属性信息的原始值减去点的属性信息的预测值。利用RAHT单元112进行预测编码时,属性信息经过RAHT变换,将信号转换到变换域中,称之为变换系数。利用提升变化单元114进行预测编码时,在LoD相邻层预测的基础上,引入邻域点的权重更新策略,最终获得各点的属性信息的预测值,进而获得对应的残差值。预测变换单元113还可用于生成LOD,并对LOD中点的属性信息依次进行预测,计算得到预测残差以便后续进行量化编码。利用预测变换单元113进行预测编码时,根据距离选择子点集,将点云划分成多个不同的层级(Level of Detail,LOD),实现由粗糙到精细化的点云表示。相邻层之间可以实现自下而上的预测,即由粗糙层中的邻近点预测精细层中引入的点的属性信息,获得对应的残差值。其中,LOD中的最底层的点作为参考信息进行编码。
量化单元115可用于属性信息的量化(Attribute quantization),量化的精细程度通常由量化参数来决定。在预测变换编码及提升变换编码中,是对残差值进行量化后进行熵编码;在RAHT中,是对变换系数进行量化后进行熵编码。例如,若该量化单元115和该预测变换单元113相连,则该量化单元可用于量化该预测变换单元113输出的点的属性信息的残差值。例如,对预测变换单元113输出的点的属性信息的残差值使用量化步长进行量化,以实现提升系统性能。
针对属性信息进行预测编码时,通过对几何信息或属性信息的邻近关系,选择一个或多个点作为预测值,并求加权平均获得属性信息的最终预测值,对原始值与预测值之间的差值进行编码。以利用预测变换单元113进行预测编码为例,对LOD中的每一个点,在其前面的LOD中找到3个距离最近的邻居点,然后利用3个邻居点的重建值对当前点进行预测,得到预测值,其中,可以使用欧式距离或曼哈顿距离进行距离计算;基于此,可基于当前点的预测值和当前点的原始值得到当前点的残差值。
In some embodiments, the residual value may be determined based on the following formula:
attrResidualQuant=(attrValue-attrPred)/Qstep;
其中,attrResidualQuant表示当前点的残差值,attrPred表示当前点的预测值,attrValue表示当前点的原始值,Qstep表示量化步长。其中,Qstep由量化参数(Quantization Parameter,Qp)计算得到。
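In some embodiments, the current point will serve as the nearest neighbor of subsequent points, and the reconstructed value of the current point is used to predict the attribute information of subsequent points. The reconstructed value of the attribute information of the current point can be obtained by the following formula: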
reconstructedColor=attrResidualQuant×Qstep+attrPred;
其中,reconstructedColor表示当前点的重建值,attrResidualQuant表示当前点的残差值,Qstep表示量化步长,attrPred表示当前点的预测值。其中,Qstep由量化参数(Quantization Parameter,Qp)计算得到。
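The two formulas above can be exercised with a small numeric sketch. Rounding the quantized residual to the nearest integer (half up) is an assumption of this example; the formulas themselves do not state a rounding rule, and the function names are hypothetical.

```python
import math


def quantize_residual(attr_value, attr_pred, qstep):
    """attrResidualQuant = (attrValue - attrPred) / Qstep, rounded half up (assumed)."""
    return math.floor((attr_value - attr_pred) / qstep + 0.5)


def reconstruct(attr_residual_quant, attr_pred, qstep):
    """reconstructedColor = attrResidualQuant * Qstep + attrPred."""
    return attr_residual_quant * qstep + attr_pred


# Original value 125, predicted value 120, quantization step 4.
q = quantize_residual(125, 120, 4)
rec = reconstruct(q, 120, 4)
```

With these inputs the reconstruction error is bounded by half the quantization step, which is why the encoder predicts later points from reconstructed values rather than original ones: the decoder only ever sees the reconstruction.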
第二算术编码单元116可用于属性信息的熵编码(Attribute entropy coding),量化后的属性信息的残差值或变换系数可使用行程编码(run length coding)及算数编码(arithmetic coding)实现最终的压缩。相应的编码模式,量化参数等信息也同样采用熵编码器进行编码。对点的属性信息的残差值进行熵编码,可以得到属性码流。该属性码流可以是比特流信息。
需要说明的是,在本申请实施例中,点云中的点的属性信息的预测值(predictedvalue)也可称为LOD模式下的颜色预测值(predictedColor)。点的属性信息的原始值减去点的属性信息的预测值可得到点的残差值(residualvalue)。点的属性信息的残差值也可称为LOD模式下的颜色残差值(residualColor)。点的属性信息的预测值和点的属性信息的残差值相加可生成点的属性信息的重建值(reconstructedvalue)。点的属性信息的重建值也可称为LOD模式下的颜色重建值(reconstructedColor)。当然,上述术语仅为示例性说明,不应理解为对本申请的限制。
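Corresponding to the encoding process of the encoding framework 100, after obtaining the compressed bitstream, the decoder first performs entropy decoding to obtain the various mode information and the quantized geometry information and attribute information. First, the geometry information is inverse quantized to obtain the reconstructed 3D point position information. On the other hand, the attribute information is inverse quantized to obtain the residual values, the reference signal is determined according to the transform mode used so as to obtain the predicted value of the attribute information, and the reconstructed value of each point is produced and output in order, in one-to-one correspondence with the geometry information; that is, the reconstructed point cloud data is output.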
图2是本申请实施例提供的解码框架200的示意性框图。
如图2所示,解码框架200可以从编码设备获取点云的码流,通过解析点云的码流得到点云中的点的位置信息和属性信息。点云的解码包括位置解码和属性解码。
在一个实施例中,位置解码的过程包括:对几何码流进行算术解码;构建八叉树后进行合并,对点的位置信息进行重建,以得到点的位置信息的重建信息;对点的位置信息的重建信息进行坐标变换,得到点的位置信息。点的位置信息也可称为点的几何信息。
属性解码过程包括:通过解析属性码流,获取点云中点的属性信息的残差值;通过对点的属性信息的残差值进行反量化,得到反量化后的点的属性信息的残差值;基于位置解码过程中获取的点的位置信息的重建信息,选择三种预测模式的一种进行点云预测,得到点的属性信息的重建值;对点的属性信息的重建值进行颜色空间反变换,以得到解码点云。
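As shown in FIG. 2, position decoding may be implemented by the following units: a first arithmetic decoding unit 201, an octree synthesis (synthesize octree) unit 202, a geometry reconstruction (Reconstruct geometry) unit 203, and an inverse coordinate transform (inverse transform coordinates) unit 204. Attribute decoding may be implemented by the following units: a second arithmetic decoding unit 210, an inverse quantization (inverse quantize) unit 211, a RAHT unit 212, a predicting transform (predicting transform) unit 213, a lifting transform (lifting transform) unit 214, and an inverse color space transform (inverse transform colors) unit 215.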
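It should be noted that decompression is the inverse process of compression; similarly, for the functions of the units in the decoding framework 200, refer to the functions of the corresponding units in the encoding framework 100. For example, the decoding framework 200 may divide the point cloud into multiple LODs according to the Euclidean distances between points in the point cloud, and then decode the attribute information of the points in the LODs in sequence, for example, by calculating the number of zeros (zero_cnt) in the zero-run-length coding technique and decoding the residuals based on that number. Next, the decoding framework 200 may inverse quantize the decoded residual value and add the inverse-quantized residual value to the predicted value of the current point to obtain the reconstructed value of that point, until all points have been decoded. The current point will serve as the nearest neighbor of points in subsequent LODs, and the reconstructed value of the current point is used to predict the attribute information of subsequent points. In addition, regarding inverse transform (transform) and inverse quantization (scale/scaling): for an orthogonal transform, if one of the matrices is used for the transform, the other matrix is used for the inverse transform. For the decoding method, the matrix used in the decoder may be called the "transform" matrix.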
为便于描述,下面对属性信息的预测编码方法进行说明。属性信息的预测编码方法可包括针对反射率属性信息的预测编码方法、针对颜色属性信息的预测编码方法以及自适应选择属性预测值的方法。
1)、针对反射率属性信息的预测编码方法。
i)、莫顿序邻居点选取。
基于偏移莫顿码的方法找到当前点的k个已编码的点作为邻居点。莫顿码是一种将多维空间中的点坐标用一个一维的数值来表示,通过利用莫顿编码的方式来可以将空间中点所对应的空间关系用莫顿码的值之间的相邻关系来近似表示。换言之,点云经过莫顿码编码后可形成由多个莫顿码基于莫顿排序的方式形成的莫顿顺序。排序(sorting)是指将一组数据按特定规则(排序算法)调换位置,使得数据按序排列,可以是从大到小排列,也可以是从小到大排列。莫顿排序是指基于莫顿码的值之间的相邻关系进行排序的过程。
图3是本申请实施例提供的原始莫顿顺序下的点云的示意图。图4是本申请实施例提供的偏移莫顿顺序下的点云的示意图。
如图3所示,获取点云中所有点的坐标,并按照莫顿排序的方式得到莫顿顺序1;把所有点的坐标(x,y,z)加上一个固定值(j1,j2,j3),用新的坐标(x+j1,y+j2,z+j3)生成点云对应的莫顿码,按照莫顿排序的方式得到莫顿顺序2。如图4所示,在图3中的A,B,C,D移到图4中的不同位置,对应的莫顿码也发生了变化,但相对位置保持不变。另外,在图4中,点D的莫顿码是23,点D的邻居点B的莫顿码是21,所以,从点D向前最多搜索两个点就可以找到点B。但在图3中,从点D的莫顿码是16,点D的邻居点B的莫顿码是2,所以,从点D向前最多需要搜索14个点才能找到点B。
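The decoder decodes according to the Morton order and searches for the nearest prediction points of the current point. Specifically, the N points preceding the current point in Morton order 1 are selected as candidate points, where N is greater than or equal to 1, and the M points preceding the current point in Morton order 2 are selected as candidate points, where M is greater than or equal to 1. Among these N candidate points and M candidate points, the distance d from each candidate point to the current point is calculated, using either the Euclidean distance or the Manhattan distance. For example, if the coordinates of the current point are (x,y,z) and the coordinates of a candidate point are (x1,y1,z1), the distance is calculated as d=|x-x1|+|y-y1|+|z-z1|. The k decoded points with the smallest distances among these N+M candidate points are selected as the prediction points of the current point. As an example of this application, in the PCEM software, j1=j2=j3=42, k=3, and N=M=4.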
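The two-order search described above can be sketched in Python as follows. This is a simplified illustration, not the PCEM implementation: the bit interleaving order in `morton3d`, the helper names, and the tie-breaking rule are assumptions, and all points passed in `coded` are taken to be already coded and distinct from the target.

```python
def morton3d(x, y, z, bits=16):
    """Interleave the bits of non-negative integers x, y, z into one Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def preceding(points, target, key, count):
    """Up to `count` points that come right before `target` in the given order."""
    order = sorted(points + [target], key=key)
    pos = order.index(target)
    return order[max(0, pos - count):pos]


def select_predictors(coded, target, k=3, n=4, m=4, offset=(42, 42, 42)):
    """Pick the k nearest points among the N + M candidates of both Morton orders."""
    def shifted(p):
        return morton3d(p[0] + offset[0], p[1] + offset[1], p[2] + offset[2])

    candidates = set(preceding(coded, target, lambda p: morton3d(*p), n))
    candidates |= set(preceding(coded, target, shifted, m))
    # Sort by Manhattan distance; ties are broken by coordinates (an assumption).
    return sorted(candidates, key=lambda p: (manhattan(p, target), p))[:k]
```

The defaults mirror the values quoted from the PCEM software (offset 42, k=3, N=M=4). Using two orders helps because points that are spatially close but far apart in one Morton order are often close in the shifted order.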
ii)、希尔伯特(Hilbert)序邻居点选取。
其中,Hilbert序邻居点的选取方式通过在Hilbert序下当前编码点的前的邻居点的最大数目(maxNumOfNeighbours)个点中查找距离当前点最近的k个点作为邻居点,其中maxNumOfNeighbours表示备选点的个数,maxNumOfNeighbours默认取128,k默认取3,距离计算方法为曼哈顿距离,即d=|x1-x2|+|y1-y2|+|z1-z2|。其中,在编码中,进行希尔伯特排序时采用离散的希尔伯特曲线进行计算。
iii)、预测值计算。
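When calculating the predicted value of the attribute information of the current point, the reciprocal of the Manhattan distance between each of the k selected neighbor points and the current point is used as the weight, and the weighted average of the reconstructed attribute values of the k neighbor points is taken as the predicted value of the attribute information of the current point. Let the index of the current point be i, the geometric coordinates of the current point be (xi,yi,zi), and the geometric coordinates of each neighbor point be (xij,yij,zij), where j=0,1,2,……,k. The weight wij of each neighbor point is then given by Formula 1:

$$w_{ij}=\frac{1}{|x_i-x_{ij}|+|y_i-y_{ij}|+|z_i-z_{ij}|} \qquad (1)$$

Writing $\hat{A}_{ij}$ for the reconstructed attribute value of neighbor point j and $\hat{A}_i$ for the predicted value of the current point, the weighted average is given by Formula 2:

$$\hat{A}_i=\frac{\sum_{j} w_{ij}\,\hat{A}_{ij}}{\sum_{j} w_{ij}} \qquad (2)$$

For the reflectance attribute, if the weight calculation in Formula 1 uses different weights for the components in the x, y, and z directions, the weight wij of each neighbor point is given by Formula 3:

$$w_{ij}=\frac{1}{a|x_i-x_{ij}|+b|y_i-y_{ij}|+c|z_i-z_{ij}|} \qquad (3)$$

where a, b, and c are the different weights applied to the components in the x, y, and z directions, respectively.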
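A minimal Python sketch of the inverse-Manhattan-distance weighting described above is given below. The function name `predict_weighted` and the `axis_weights` parameter (standing for a, b, c) are illustrative assumptions, and the sketch assumes that no neighbor point coincides with the current point, since the weight would otherwise be undefined.

```python
def predict_weighted(target, neighbors, attrs, axis_weights=(1, 1, 1)):
    """Weighted average of neighbor attributes with inverse-distance weights.

    With axis_weights=(1, 1, 1) the weight is the reciprocal of the plain
    Manhattan distance; other values weight the x, y, z components differently.
    """
    a, b, c = axis_weights
    weights = []
    for x, y, z in neighbors:
        d = a * abs(target[0] - x) + b * abs(target[1] - y) + c * abs(target[2] - z)
        weights.append(1.0 / d)  # assumes d > 0
    return sum(w * r for w, r in zip(weights, attrs)) / sum(weights)
```

For example, with neighbors at Manhattan distances 1 and 2 carrying values 10 and 40, the weights are 1 and 0.5, so the predicted value is (10 + 20) / 1.5 = 20.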
2)、针对颜色属性信息的预测编码方法。
在颜色属性的预测方法中,首先利用莫顿码来查找当前点的空间邻居点,然后根据查找的空间邻居点对当前点进行属性预测。
图5是本申请实施例提供的邻居点的空间关系的示意性结构图。图6是本申请实施例提供的邻域范围内与待编码当前点共面的邻居点之间的莫顿码关系的示例。图7是本申请实施例提供的邻域范围内与待编码当前点共线的邻居点之间的莫顿码关系的示例。
如图5所示,待编码的当前点为粗线条标记块A,邻居查找范围为待编码的当前点的3X3X3邻域。首先利用当前点的莫顿码得到该邻域中莫顿码值最小的块,将该块作为基准块,利用基准块来查找与待编码当前点7共面、共线的已编码邻居点。如图6所示,邻域范围内,待编码当前点7与邻居点3、邻居点5和邻居点6共面。如图7所示,邻域范围内,待编码当前点7与邻居点1、邻居点2和邻居点4共线。
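The reference block is used to search for k already encoded neighbor points that are coplanar or collinear with the current point to be encoded (for example, k≤6 for this neighborhood), and these k neighbor points are used to predict the attribute information of the current point to be encoded. For example, with k=3, neighbor points coplanar with the current point are searched within a range [j-maxNumOfNeighbours, j-1] of encoded points, where j is the index of the current point and maxNumOfNeighbours is the number of candidate points. If a coplanar encoded neighbor point is found, it is assigned a weight of 2. The search then continues among the encoded points for neighbor points collinear with the current point; if a collinear neighbor point is found among the encoded points, it is assigned a weight of 1. Finally, the attribute of the current point is predicted as the weighted average of the neighbor points found. If no encoded neighbor point coplanar or collinear with the current point is found, the point corresponding to the previous Morton code may be used for attribute prediction.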
例如,利用邻居点进行加权平均对待编码当前点进行属性预测时,若有一个共面点和一个共线点,共面点与当前点距离为d1(比如d1=1),属性值为r1;共线点与当前点距离为d2(比如d2≥2),属性值为r2,则待编码当前点的属性预测值为r=(2*r1+r2)/(2+1)。
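The coplanar and collinear weighting can be sketched as follows. The sketch assumes unit voxels with integer coordinates, treats a neighbor that differs by 1 along exactly one axis as coplanar and by 1 along exactly two axes as collinear, and uses hypothetical names (`classify`, `predict_color`); it is an illustration of the weighting rule, not the reference-block search itself.

```python
WEIGHTS = {"coplanar": 2, "collinear": 1}


def classify(target, neighbor):
    """Classify a neighbor inside the 3x3x3 neighborhood of the target voxel."""
    diffs = sorted(abs(t - n) for t, n in zip(target, neighbor))
    if diffs == [0, 0, 1]:
        return "coplanar"   # shares a face, Manhattan distance 1
    if diffs == [0, 1, 1]:
        return "collinear"  # shares an edge, Manhattan distance 2
    return None


def predict_color(target, coded, attrs, fallback):
    """Weighted average over coplanar (weight 2) and collinear (weight 1) neighbors.

    `fallback` stands for the attribute of the point with the previous Morton
    code, used when no coplanar or collinear neighbor exists.
    """
    num = den = 0
    for point, attr in zip(coded, attrs):
        w = WEIGHTS.get(classify(target, point), 0)
        num += w * attr
        den += w
    return num / den if den else fallback
```

With one coplanar neighbor of value r1 and one collinear neighbor of value r2, the function returns (2*r1 + r2) / 3, matching the example in the text.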
3)、自适应选择属性预测值的方法。
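Both the prediction method for reflectance attribute information and the prediction method for color attribute information determine the predicted value of the attribute information from geometric position information, so both may be called geometry-position-based prediction methods. Geometry-position-based prediction methods are generally suited to relatively dense, easily predicted point clouds, such as point clouds of human figures, or to cases where the prediction residual is small. When a geometry-position-based prediction method produces large residuals, an attribute-value-based prediction method can usually reduce the prediction residual and improve coding efficiency.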
其中,基于属性值的预测值方法可以通过以下步骤实现:
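i) Store 32 distinct attribute predicted values from the most recently encoded points in a candidate predicted value table;
ii) Select from the candidate predicted value table the entry closest to the attribute information of the current point, and use the attribute value of the selected entry as the attribute predicted value of the current point;
iii) Binarize the index of the selected entry in the candidate predicted value table into 5 bits, and encode it using context-based entropy coding.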
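The three steps above can be sketched as follows. The table update policy (most recent value first, duplicates moved to the front) and the function names are assumptions for illustration; the text only states that 32 distinct values are kept and that the chosen index is binarized into 5 bits. The entropy coding of those bits is not shown.

```python
def update_table(table, value, size=32):
    """Keep the `size` most recent distinct attribute values, newest first (assumed policy)."""
    if value in table:
        table.remove(value)
    table.insert(0, value)
    del table[size:]


def choose_from_table(table, attr_value):
    """Return (index, predicted value, 5-bit binarized index) of the closest entry."""
    idx = min(range(len(table)), key=lambda i: abs(table[i] - attr_value))
    return idx, table[idx], format(idx, "05b")
```

Five bits suffice because a table of 32 entries has indices 0 to 31.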
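In the prediction method for color attribute information, coplanar and collinear points among the encoded points are found and used as neighbor points of the current point, and different weights are set for coplanar points and collinear points in a weighted calculation to obtain the predicted value of the attribute information of the point. For relatively sparse point cloud data, no coplanar or collinear point can be found in more than 90% of cases, in which case AVS coding uses the previous point for prediction. A point cloud of a single scene may contain hundreds of thousands of points. Such a large number of points poses challenges for computer storage and transmission, and searching directly among all encoded points for coplanar and collinear points to serve as neighbor points of the current point requires too much computation, making the prediction complexity too high.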
本申请提出了一种面向颜色预测的邻居点优化方法,通过更好地利用点云邻近点的空间关联性,能够在保证预测效果的基础上降低预测复杂度。具体地,通过分析点云数据的空间邻域关系在M个已编码点中选择N个已编码点作为N个备选点,然后设计距离优先或几何结构优先的方式在N个备选点中选择k个邻居点,最后基于选择的k个邻居点进行属性预测。需要说明的是,本申请实施例中,针对在M个已编码点中选择N个备选点的方案,以及针对在N个备选点中选择k个邻居点的方案,可以通过搜索或映射等多种方式实现。在不同的情况下,可能采取单独的选择方式,或采取多种选择方式相结合,以实现在保证预测效果的基础上降低预测复杂度。此外,本申请提出的面向点云属性预测的邻居点选择优化方式可应用于任意一种3D点云编解码产品中。
图8是本申请实施例提供的基于点云属性预测的编码方法300的示意性流程图。该方法300可由编码器或编码端来执行。例如,图1所示的编码框架100。
如图8所示,该编码方法300可包括:
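S310, obtain reconstruction information of position information of a target point in a point cloud;
S320, select N encoded points from M encoded points in the point cloud as N candidate points of the target point, M≥N≥1;
S330, select k neighbor points from the N candidate points based on the reconstruction information of the position information of the target point, N≥k≥1;
S340, determine a predicted value of attribute information of the target point using attribute values of the k neighbor points, where the attribute values of the k neighbor points are reconstructed values of the attribute information of the k neighbor points or original values of the attribute information of the k neighbor points;
S350, obtain a residual value of the attribute information of the target point according to the predicted value of the attribute information of the target point and an original value of the attribute information of the target point;
S360, encode the residual value of the attribute information of the target point to obtain a bitstream of the point cloud.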
简言之,编码端针对目标点的属性信息进行编码时,先在M个已编码点中选择N个已编码点作为该目标点的N个备选点,再从该N个备选点中选择k个邻居点,接着利用该k个邻居点的属性值,确定该目标点的属性信息的预测值,最后,根据该目标点的属性信息的预测值和该目标点的属性信息的原始值,得到该目标点的属性信息的残差值;并对该目标点的属性信息的残差值进行编码,得到该点云的码流。
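The flow of S310 to S360 can be sketched end to end as follows. This is a simplified, lossless illustration under stated assumptions: points are processed in their input order, the N candidates are the N points immediately preceding the target, the k neighbors are the nearest candidates by Manhattan distance, the prediction is an inverse-distance weighted average rounded half up, and the "bitstream" is just a list of residuals with no entropy coding. All function names are hypothetical.

```python
import math


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def predict(idx, positions, recon, n, k):
    """Predict the attribute of point idx from already reconstructed points."""
    candidates = range(max(0, idx - n), idx)                      # S320
    neighbors = sorted(
        candidates, key=lambda j: (manhattan(positions[j], positions[idx]), j)
    )[:k]                                                         # S330
    if not neighbors:
        return 0  # first point: nothing to predict from
    # Clamp the distance to 1 so coincident points do not divide by zero.
    weights = [1.0 / max(manhattan(positions[j], positions[idx]), 1) for j in neighbors]
    avg = sum(w * recon[j] for w, j in zip(weights, neighbors)) / sum(weights)
    return math.floor(avg + 0.5)                                  # S340


def encode_attributes(positions, attrs, n=4, k=3):
    """Return the residuals that would be entropy coded into the bitstream."""
    recon, residuals = [], []
    for i, original in enumerate(attrs):
        pred = predict(i, positions, recon, n, k)
        residuals.append(original - pred)                         # S350
        recon.append(pred + residuals[-1])                        # what the decoder will see
    return residuals                                              # S360 (entropy coding omitted)
```

Limiting the search to N candidates keeps the per-point cost bounded regardless of how many points have already been encoded, which is the complexity argument made in the text.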
本申请实施例中,从该点云中的M个已编码点中选择N个已编码点作为该目标点的N个备选点后,基于该目标点的位置信息的重建信息,从该N个备选点中选择k个邻居点,针对密集点云,能够避免用于选择该k个邻居点的备选点的数量过大,降低预测复杂度。另一方面,在利用该k个邻居点的属性值,确定该目标点的属性信息的预测值,能够保证针对目标点的属性信息的预测准确度。因此,本申请提供的方案,能够在保证预测效果的基础上降低预测复杂度。
在一些实施例中,已编码点数量M超过备选点数量N;其中,该S320可包括:
基于该M个已编码点的第一顺序,从该M个已编码点中选择该N个已编码点;其中,该第一顺序为按照从小到大或从大到小的顺序对该M个已编码点和该目标点进行莫顿排序或希尔伯特排序所得到的顺序,或该第一顺序为该M个已编码点和该目标点的编码顺序;将该N个已编码点作为该N个备选点。
换言之,编码端对点云中的点进行排序,例如,采用莫顿(morton)码或希尔伯特(hilbert)码表示点云中点的坐标,并从小到大或从大到小进行排序;或不进行排序,保持各点的编码顺序;点的编码顺序也可以称为点的输入顺序。以希尔伯特排序为例,当前待编码点序号为i,对应希尔伯特码为m_i,其前序的已编码点包括i-1,i-2,…,1,0,对应的希尔伯特码为m_(i-1),m_(i-2),…,m_1,m_0。在序号为i-1,i-2,…,1,0的已编码点中选择N个已编码点,即为当前点选择N个备选点作为当前点的邻居点的备选点,选择N个已编码点的方法包括但是不限于:
1、采用所有已编码点作为备选点。
2、采用已编码点中的部分点作为备选点。
在一些实施例中,该S320可具体包括:
沿该第一顺序,将在该目标点之前且与该目标点相邻的N个点,确定为该N个已编码点;或者,沿该第一顺序,将在该目标点之前且连续的N个点,确定为该N个已编码点,其中,该连续的N个点与该目标点相邻或间隔至少一个已编码点。
换言之,编码端采用已编码点中的部分点作为备选点时,可实现的方案包括但是不限于以下方式中的至少一项:
1、按照该第一顺序逐渐向前搜索,选择N个点加入备选点。
2、按该第一顺序选择前序的N个点,即选择序号为i-1,i-2,…,i-N的点作为备选点。
3、按该第一顺序在已编码点中任选连续的N个点。例如,跳过前序N1个点,即选择序号为i-N1-1,i-N1-2,…,i-N1-N的N个点加入备选点。
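The index arithmetic of the options above can be written compactly. In this sketch, `i` is the index of the current point in the first order, `n` is the number of candidates, and `skip` corresponds to N1; the function name `select_candidates` is a hypothetical helper, and indices below zero are simply dropped when fewer than `n` points are available.

```python
def select_candidates(i, n, skip=0):
    """Indices i-skip-1, i-skip-2, ..., i-skip-n of the candidates in the first order."""
    start = i - skip - 1
    return [j for j in range(start, start - n, -1) if j >= 0]
```

With `skip=0` this selects the N immediately preceding points (option 2); a positive `skip` selects N consecutive points further back (option 3).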
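In this embodiment, the N encoded points are selected directly from the M encoded points based on the first order of the M encoded points, which effectively controls the complexity of candidate point selection and improves prediction efficiency.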
当然,在其他可替代实施例中,编码端也可以从该M个已编码点中随机选择该N个已编码点,本申请实施例对此不作具体限定。
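It should be noted that the first order may be an order formed by the encoder side sorting the whole point cloud directly, or an order formed by sorting only the M encoded points and the target point. For example, when the point cloud is a dense point cloud, the first order may be an order formed by sorting only the M encoded points and the target point; when the point cloud is a sparse point cloud, the first order may be an order formed by sorting all points in the point cloud, so as to reduce the workload and improve prediction efficiency. In addition, when the encoder side sorts the points in the point cloud, or sorts only the M encoded points and the target point, it may process all directions (x, y, z) or one or more of them, which is not specifically limited in the embodiments of this application. For example, when the encoder side sorts the points in the point cloud (or only the M encoded points and the target point) according to the position information of the points, it may perform Morton sorting on the points in the point cloud (or only on the M encoded points and the target point) according to the position information of the points, or it may perform Hilbert sorting on the points in the point cloud (or only on the M encoded points and the target point) according to the position information of the points. Optionally, the position information of a point may be the three-dimensional position information of the point, or position information in one or more dimensions. Optionally, the encoder side may determine, according to actual requirements, how many dimensions of position information to use when sorting the points in the point cloud (or the M encoded points and the target point).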
编码端选择出该N个备选点后,可以在该N个备选点中选择用于预测目标点的属性信息的邻居点。
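It should be noted that the encoder side may calculate the distance between each of the N candidate points and the target point and determine the k neighbor points based on these distances; it may also locate k neighbor points that meet the conditions through the geometric structure relationship of the points, for example, by using an octree structure formed from the N candidate points and the target point; it may also locate them through other mapping methods; or it may sort the N candidate points and the target point and select the k neighbor points from the sorted candidate points, for example, based on the order formed by sorting the N candidate points and the target point. This is not specifically limited in the embodiments of this application. It should also be noted that this application does not limit the metric or the specific implementation used for the distances involved in the calculation. For example, the Euclidean (Euclidean) distance or the Manhattan (Manhattan) distance may be used. It should also be noted that the order formed by sorting the N candidate points and the target point may be an order formed by sorting the whole point cloud directly, or an order formed by sorting only the N candidate points and the target point. For example, when the point cloud is a dense point cloud, it may be an order formed by sorting only the N candidate points and the target point; when the point cloud is a sparse point cloud, it may be an order formed by sorting all points in the point cloud, so as to reduce the workload and improve prediction efficiency. In addition, when the encoder side sorts the points in the point cloud (or only the N candidate points and the target point), it may process all directions (x, y, z) or one or more of them, which is not specifically limited in the embodiments of this application. For example, when sorting according to the position information of the points, the encoder side may perform Morton sorting on the points in the point cloud (or only on the N candidate points and the target point), or it may perform Hilbert sorting on the points in the point cloud (or only on the N candidate points and the target point). Optionally, the position information of a point may be the three-dimensional position information of the point, or position information in one or more dimensions. Optionally, the encoder side may determine, according to actual requirements, how many dimensions of position information to use when sorting the points in the point cloud or the N candidate points.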
在一些实施例中,该S330可包括:
基于该目标点位置信息的重建信息和该N个备选点位置信息的重建信息,确定该N个备 选点和该目标点的几何结构关系;基于该几何结构关系,从该N个备选点中选择该k个邻居点。
在一种实现方式中,该几何结构关系通过八叉树结构表示;基于该八叉树结构,确定该目标点的k个最近邻点;将该k个最近邻点,确定为该k个邻居点。
简言之,编码端基于该八叉树结构,选择k个最近邻点作为邻居点;即,在该N个备选点中选择与目标点距离最近的k个点。例如,编码端可利用K最近邻(k-Nearest Neighbor,KNN)分类算法计算获取其的K个最近邻点。其中,K最近邻是指K个最近的邻居,相当于,每个点都可以用它最接近的K个邻近点来代表。
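In another implementation, p candidate points that are collinear and/or coplanar with the target point are selected from the N candidate points based on the geometric structure relationship; if the number p of candidate points is greater than or equal to the number k of neighbor points, the p candidate points are determined as the k neighbor points; or, if the number p of candidate points is greater than or equal to the number k of neighbor points, k candidate points are selected from the p candidate points as the k neighbor points. In another implementation, p candidate points that are collinear and/or coplanar with the target point are selected from the N candidate points based on the geometric structure relationship; if the number p of candidate points is less than the number k of neighbor points, or p is equal to 0, the distance from each of the N candidate points to the target point is determined based on the reconstruction information of the position information of the target point and the reconstruction information of the position information of the N candidate points; the k neighbor points are then selected from the N candidate points based on these distances, where the distance from each of the N candidate points to the target point is the Euclidean distance or the Manhattan distance. In another implementation, p candidate points that are collinear and/or coplanar with the target point are selected from the N candidate points based on the geometric structure relationship; if the number p of candidate points is less than the number k of neighbor points, or p is equal to 0, a second order is determined using the reconstruction information of the position information of the target point and the reconstruction information of the position information of the N candidate points; the k neighbor points are selected from the N candidate points based on the second order. The second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or the second order is an order obtained by sorting the N candidate points by their distance from the target point in descending or ascending order, where the distance from each of the N candidate points to the target point is the Euclidean distance or the Manhattan distance.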
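In short, the encoder side may select, based on the geometric structure relationship, candidate points that are collinear and/or coplanar with the target point from the N candidate points. If the number of candidate points collinear and/or coplanar with the target point is less than k, or no candidate point collinear or coplanar with the target point exists, the encoder side determines the k neighbor points based on the distances from the candidate points among the N candidate points to the target point, or based on the second order. If the number of candidate points collinear and/or coplanar with the target point is greater than or equal to k, all candidate points collinear and/or coplanar with the target point are determined as the k neighbor points, or k points are selected from them as the k neighbor points.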
编码端基于该N个备选点中的各个备选点到该目标点之间的距离,从该N个备选点中选择该k个邻居点时,将该N个备选点中的与该目标点的距离小于第一阈值的点,确定为该k个邻居点;或,将该N个备选点中的与该目标点的距离为第二阈值的点,确定为k个邻居点。换言之,编码端从该N个备选点中选择该k个邻居点的方法具体包括但是不限于以下方法中的至少一项:
1、选择所有距离小于d的点作为邻居点。例如,d=2。
2、选择所有距离为d_0的点作为邻居点。可选的,d_0可选一个固定值或多个固定值。例如,d_0可以为1或2,即选择密集点云中与目标点共面或共线的点。
编码端基于该第二顺序,从该N个备选点中选择该k个邻居点时,可以基于该目标点在该第二顺序中的序号,从该N个备选点中选择该k个邻居点。换言之,编码端采用该N个备 选点中的部分点作为该k个邻居点时,可实现的方案包括但是不限于以下方式中的至少一项:
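1. Search forward step by step in the second order and add k points to the neighbor points.
2. Select the k preceding points in the second order, that is, select the points with indices i-1, i-2, …, i-k as neighbor points.
3. Select any k consecutive points among the N candidate points in the second order. For example, skip the preceding N1 points, that is, add the k points with indices i-N1-1, i-N1-2, …, i-N1-k to the neighbor points.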
当然,在其他可替代实施例中,编码端也可以从该N个备选点中随机选择该k个邻居点,本申请实施例对此不作具体限定。
在一些实施例中,该S330可包括:
基于该目标点位置信息的重建信息和该N个备选点的位置信息的重建信息,确定该N个备选点中的各个备选点到该目标点之间的距离;基于该N个备选点中的各个备选点到该目标点之间的距离,从该N个备选点中选择该k个邻居点,其中,该N个备选点中的备选点到该目标点的距离为欧式距离或曼哈顿距离。
由于备选点到该目标点之间的距离大小,能够体现备选点的属性信息和目标点的属性信息之间的相关度或相似度,编码端可基于该N个备选点中的备选点到该目标点之间的距离大小,优先从该N个备选点中选择符合预设条件或属性预测条件的该k个邻居点。
在一种实现方式中,该S330可具体包括:
将该N个备选点中的第一目标备选点,确定为该k个邻居点,第一目标备选点是指N个备选点中与目标点的距离小于第一阈值的点;或,将该N个备选点中的第二目标备选点,确定为k个邻居点,第二目标备选点是指N个备选点中与目标点的距离小于第二阈值的点。
换言之,编码端从该N个备选点中选择该k个邻居点的方法具体包括但是不限于以下方法中的至少一项:
1、选择所有距离小于d的点作为邻居点。例如,d=2。
2、选择所有距离为d_0的点作为邻居点。可选的,d_0可选一个固定值或多个固定值。例如,d_0可以为1或2,即选择密集点云中与目标点共面或共线的点。
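The two threshold rules listed above can be sketched as simple filters. The Manhattan distance is used here as one of the permitted metrics, and the function names are hypothetical. Note that with these rules the number of selected neighbor points depends on the data rather than being fixed in advance.

```python
def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def select_within(target, candidates, d=2):
    """Rule 1: keep every candidate whose distance to the target is less than d."""
    return [p for p in candidates if manhattan(p, target) < d]


def select_at(target, candidates, allowed=(1, 2)):
    """Rule 2: keep every candidate whose distance equals one of the fixed values d_0."""
    return [p for p in candidates if manhattan(p, target) in allowed]
```

With `allowed=(1, 2)` on a dense voxelized cloud, rule 2 keeps exactly the face-sharing (distance 1) and edge-sharing (distance 2) neighbors, which is the coplanar and collinear case mentioned in the text.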
在一些实施例中,该S330可包括:
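A second order is determined using the reconstruction information of the position information of the target point and the reconstruction information of the position information of the N candidate points; the k neighbor points are selected from the N candidate points based on the second order. The second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or the second order is an order obtained by sorting the N candidate points by their distance from the target point in descending or ascending order, where the distance from each of the N candidate points to the target point is the Euclidean distance or the Manhattan distance.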
换言之,编码端可基于第二顺序,按序号从该N个备选点中选择该k个邻居点。例如,基于该N个备选点的第二顺序,从该N个备选点中选择部分备选点作为该k个邻居点。例如,沿该第二顺序,将该目标点前面的且与该目标点相邻的k个点,确定为该k个邻居点;或者,沿该第二顺序,将在该目标点之前且连续的k个点,确定为该k个邻居点,其中,该连续的k个点与该目标点相邻或间隔至少一个备选点。
换言之,编码端采用该N个备选点中的部分点作为该k个邻居点时,可实现的方案包括但是不限于以下方式中的至少一项:
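1. Search forward step by step in the second order and add k points to the neighbor points.
2. Select the k preceding points in the second order, that is, select the points with indices i-1, i-2, …, i-k as neighbor points.
3. Select any k consecutive points among the N candidate points in the second order. For example, skip the preceding N1 points, that is, add the k points with indices i-N1-1, i-N1-2, …, i-N1-k to the neighbor points.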
当然,在其他可替代实施例中,编码端也可以从该N个备选点中随机选择该k个邻居点,本申请实施例对此不作具体限定。
在一些实施例中,该S340可包括:
以该k个邻居点中的每一个邻居点与该目标点的距离的倒数为该每一个邻居点的权重,基于该k个邻居点中每一个邻居点的属性值和权重进行加权平均计算,得到k个邻居点属性值的加权平均值,将该k个邻居点的属性值的加权平均值,确定为该目标点的属性信息的预测值;或者,针对该k个邻居点中的不同邻居点设置相同或不同的初始权重,基于该k个邻居点中每一个邻居点的属性值和初始权重进行加权平均计算,得到k个邻居点属性值的加权平均值,将该k个邻居点的属性值的加权平均值,确定为该目标点的属性信息的预测值,其中该k个邻居点中的一个邻居点的初始权重随该一个邻居点与该目标点的距离的增大而减小,该码流包括该k个邻居点中的每一个邻居点的初始权重;或者,将该k个邻居点中与该目标点的距离最近的邻居点的属性值,确定为该目标点的属性信息的预测值。
换言之,编码端可利用所获得的邻居点的属性值计算目标点的属性信息的预测值,具体计算过程包括但是不限于以下方法中的至少一项:
1、根据邻居点与目标点的距离计算加权平均值。例如,采用欧式距离度量,以距离的倒数作为各个邻居点的权重,预测值为各个邻居点属性值的加权平均。
2、根据设置的初始权重计算加权平均值,可由编码端为不同邻居点设置相同或不同的权重值,例如,给距离较近的点设置较大的权重值;预测值为各个邻居点属性值的加权平均。相应的,解码端可通过解析码流获得对应的权重值。
3、采用最邻近点的属性值作为目标点的属性值预测值。
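The three ways of computing the predicted value listed above can be compared in one small sketch. The Euclidean metric follows the example in item 1; the `mode` strings, the function name `predict`, and the way signalled weights are passed in are assumptions for illustration, and neighbor points are assumed not to coincide with the target point.

```python
import math


def predict(target, neighbors, attrs, mode="inverse_distance", weights=None):
    """Predict the target attribute from k neighbor attributes.

    mode="inverse_distance": weights are reciprocals of Euclidean distances.
    mode="signalled":        weights are supplied by the caller (for example,
                             initial weights carried in the bitstream).
    mode="nearest":          the attribute of the closest neighbor is used.
    """
    dists = [math.dist(target, p) for p in neighbors]
    if mode == "nearest":
        return attrs[dists.index(min(dists))]
    if mode == "inverse_distance":
        weights = [1.0 / d for d in dists]  # assumes every distance is > 0
    elif mode != "signalled" or weights is None:
        raise ValueError("unknown mode or missing weights")
    return sum(w * a for w, a in zip(weights, attrs)) / sum(weights)
```

The signalled mode trades a few extra bits in the bitstream for the freedom to choose weights that the decoder could not derive on its own.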
In some embodiments, S340 may include:
丢弃该k个邻居点中的第一邻居点和第二邻居点,得到k个邻居点中剩余的邻居点;其中,第一邻居点是指k个邻居点中与参考点的距离大于第一阈值的邻居点,所述第二邻居点是指k个邻居点中与参考点的距离大于或者等于第二阈值的邻居点,该k个邻居点包括该参考点;利用该k个邻居点中剩余的邻居点的属性值,确定该目标点的属性信息的预测值。
换言之,编码端对所获得的邻居点可进行处理,可剔除差异较大的点,避免引入误差,包括但是不限于以下方法中的至少一项:
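1. Remove neighbor points whose distance differs greatly. For example, k nearest neighbor points have been selected as neighbor points and a threshold d_0 is set; let the nearest neighbor point be j; any other neighbor point whose distance from j is greater than d_0 is removed.
2. Remove neighbor points whose attribute differs greatly. For example, k nearest neighbor points have been selected as neighbor points and a threshold r_0 is set; let the nearest neighbor point be j; any other neighbor point whose attribute value differs from that of j by more than r_0 is removed.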
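The two pruning rules can be sketched as follows, taking the neighbor closest to the target as the reference point j. The Manhattan metric, scalar attribute values, and the function name `prune_neighbors` are assumptions of this sketch; passing `None` for a threshold disables the corresponding rule.

```python
def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def prune_neighbors(target, neighbors, attrs, d0=None, r0=None):
    """Return the indices of the neighbors kept after pruning.

    The reference neighbor j (the one closest to the target) is always kept.
    Another neighbor is dropped when its distance to j exceeds d0, or when its
    attribute differs from that of j by more than r0.
    """
    dists = [manhattan(target, p) for p in neighbors]
    ref = dists.index(min(dists))
    kept = []
    for idx, (point, attr) in enumerate(zip(neighbors, attrs)):
        if idx != ref:
            if d0 is not None and manhattan(point, neighbors[ref]) > d0:
                continue
            if r0 is not None and abs(attr - attrs[ref]) > r0:
                continue
        kept.append(idx)
    return kept
```

The prediction is then computed from the kept neighbors only, which avoids letting a single distant or dissimilar neighbor pull the predicted value away from the local surface.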
以上结合附图详细描述了本申请的优选实施方式,但是,本申请并不限于上述实施方式中的具体细节,在本申请的技术构思范围内,可以对本申请的技术方案进行多种简单变型,这些简单变型均属于本申请的保护范围。例如,在上述具体实施方式中所描述的各个具体技术特征,在不矛盾的情况下,可以通过任何合适的方式进行组合,为了避免不必要的重复,本申请对各种可能的组合方式不再另行说明。又例如,本申请的各种不同的实施方式之间也可以进行任意组合,只要其不违背本申请的思想,其同样应当视为本申请所公开的内容。还应理解,在本申请的各种方法实施例中,上述各过程的序号的大小并不意味着执行顺序的先 后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
图9是本申请实施例提供的基于点云属性预测的解码方法400的示意性流程图。
如图9所示,该解码方法400可包括:
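S410, obtain a bitstream of a point cloud and parse the bitstream of the point cloud to obtain reconstruction information of position information of a target point in the point cloud;
S420, select N decoded points from M decoded points in the point cloud as N candidate points of the target point, M≥N≥1;
S430, select k neighbor points from the N candidate points based on the reconstruction information of the position information of the target point, N≥k≥1;
S440, determine a predicted value of attribute information of the target point using attribute values of the k neighbor points, where the attribute values of the k neighbor points are reconstructed values of the attribute information of the k neighbor points;
S450, parse the bitstream to obtain a residual value of the attribute information of the target point;
S460, obtain a final reconstructed value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the residual value of the attribute information of the target point;
S470, obtain a decoded point cloud according to the final reconstructed value of the attribute information of the target point.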
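The decoding flow of S410 to S470 can be sketched as the mirror image of the encoder. The sketch below makes the same simplifying assumptions an encoder would have to share for the two sides to stay in sync: points are processed in their decoded order, the N candidates are the N points immediately preceding the target, the k neighbors are the nearest candidates by Manhattan distance, and the prediction is an inverse-distance weighted average rounded half up. Entropy decoding is omitted, so the input is a plain list of residuals; all names are hypothetical.

```python
import math


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def predict(idx, positions, recon, n, k):
    """Predict the attribute of point idx from already decoded points."""
    candidates = range(max(0, idx - n), idx)                      # S420
    neighbors = sorted(
        candidates, key=lambda j: (manhattan(positions[j], positions[idx]), j)
    )[:k]                                                         # S430
    if not neighbors:
        return 0  # first point: nothing to predict from
    # Clamp the distance to 1 so coincident points do not divide by zero.
    weights = [1.0 / max(manhattan(positions[j], positions[idx]), 1) for j in neighbors]
    avg = sum(w * recon[j] for w, j in zip(weights, neighbors)) / sum(weights)
    return math.floor(avg + 0.5)                                  # S440


def decode_attributes(positions, residuals, n=4, k=3):
    """Rebuild attribute values from decoded positions and parsed residuals."""
    recon = []
    for i, residual in enumerate(residuals):                      # S450
        recon.append(predict(i, positions, recon, n, k) + residual)  # S460
    return recon                                                  # S470
```

Because the decoder only has reconstructed attribute values, the prediction here never touches original values; this is why S440 restricts the neighbor attribute values to reconstructed values.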
在一些实施例中,已解码点数量M超过备选点数量N;其中,该S420可包括:
基于该M个已解码点的第一顺序,从该M个已解码点中选择该N个已解码点;其中,该第一顺序为按照从小到大或从大到小的顺序对该M个已解码点和该目标点进行莫顿排序或希尔伯特排序所得到的顺序,或该第一顺序为该M个已解码点和该目标点的解码顺序;将该N个已解码点,确定为该N个备选点。
本申请实施例中,从该点云中的M个已解码点中选择N个已解码点作为该目标点的N个备选点后,基于该目标点的位置信息的重建信息,从该N个备选点中选择k个邻居点,针对密集点云,能够避免用于选择该k个邻居点的备选点的数量过大,降低预测复杂度。另一方面,在利用该k个邻居点的属性值,确定该目标点的属性信息的预测值,能够保证针对目标点的属性信息的预测准确度。因此,本申请提供的方案,能够在保证预测效果的基础上降低预测复杂度。
在一些实施例中,该S420可具体包括:
沿该第一顺序,将在该目标点之前且与该目标点相邻的N个点,确定为该N个已解码点;或者,沿该第一顺序,将在该目标点之前且连续的N个点,确定为该N个已解码点,其中,该连续的N个点与该目标点相邻或间隔至少一个已解码点。
在一些实施例中,该S430可包括:
基于该目标点位置信息的重建信息和该N个备选点位置信息的重建信息,确定该N个备选点和该目标点的几何结构关系;基于该几何结构关系,从该N个备选点中选择该k个邻居点。
在一些实施例中,该几何结构关系通过八叉树结构表示;该S430可具体包括:
基于该八叉树结构,确定该目标点的k个最近邻点;将该k个最近邻点,确定为该k个邻居点。
在一些实施例中,该S430可包括:
基于该几何结构关系,从该N个备选点中选择与该目标点共线和/或共面的p个备选点;
若备选点数量p大于或等于邻居点数量k,则将该p个备选点确定为该k个邻居点;或 者
若备选点数量p大于或等于邻居点数量k,则在该p个备选点中选择k个备选点作为该k个邻居点。
在一些实施例中,该S430可包括:
基于该几何结构关系,从该N个备选点中选择与该目标点共线和/或共面的p个备选点;
若备选点数量p小于邻居点数量k或备选点数量p等于0,则基于该目标点的位置信息的重建信息和该N个备选点的位置信息的重建信息,确定该N个备选点中的备选点到该目标点之间的距离;
基于该N个备选点中的备选点到该目标点之间的距离,从该N个备选点中选择该k个邻居点,其中,该N个备选点中的备选点到该目标点的距离为欧式距离或曼哈顿距离。
在一些实施例中,该S430可包括:
基于该几何结构关系,从该N个备选点中选择与该目标点共线和/或共面的p个备选点;
若备选点数量p小于邻居点数量k或备选点数量p等于0,则利用该目标点的位置信息的重建信息和该N个备选点的位置信息的重建信息确定第二顺序;基于该第二顺序,从该N个备选点中选择该k个邻居点;
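The second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or the second order is an order obtained by sorting the candidate points among the N candidate points by their distance from the target point in descending or ascending order, where the distance from a candidate point among the N candidate points to the target point is the Euclidean distance or the Manhattan distance.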
在一些实施例中,该S430可包括:
基于该目标点位置信息的重建信息和该N个备选点的位置信息的重建信息,确定该N个备选点中的各个备选点到该目标点之间的距离;基于该N个备选点中的各个备选点到该目标点之间的距离,从该N个备选点中选择该k个邻居点,其中,该N个备选点中的备选点到该目标点的距离为欧式距离或曼哈顿距离。
在一些实施例中,该S430可具体包括:
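Determine first target candidate points among the N candidate points as the k neighbor points, where a first target candidate point is a point among the N candidate points whose distance from the target point is less than a first threshold; or determine second target candidate points among the N candidate points as the k neighbor points, where a second target candidate point is a point among the N candidate points whose distance from the target point is less than a second threshold.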
在一些实施例中,该S430可包括:
利用该目标点位置信息的重建信息和该N个备选点位置信息的重建信息确定第二顺序;基于该第二顺序,从该N个备选点中选择该k个邻居点;
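The second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or the second order is an order obtained by sorting the candidate points among the N candidate points by their distance from the target point in descending or ascending order, where the distance from a candidate point among the N candidate points to the target point is the Euclidean distance or the Manhattan distance.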
在一些实施例中,该S440可包括:
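Take the reciprocal of the distance between each of the k neighbor points and the target point as the weight of that neighbor point, perform a weighted average calculation based on the attribute value and weight of each of the k neighbor points to obtain a weighted average of the attribute values of the k neighbor points, and determine the weighted average of the attribute values of the k neighbor points as the predicted value of the attribute information of the target point; or, set the same or different initial weights for different neighbor points among the k neighbor points, perform a weighted average calculation based on the attribute value and initial weight of each of the k neighbor points to obtain a weighted average of the attribute values of the k neighbor points, and determine the weighted average of the attribute values of the k neighbor points as the predicted value of the attribute information of the target point, where the initial weight of each of the k neighbor points decreases as the distance between that neighbor point and the target point increases, and the bitstream includes the initial weight of each of the k neighbor points; or, determine the attribute value of the neighbor point closest to the target point among the k neighbor points as the predicted value of the attribute information of the target point.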
在一些实施例中,该S440可包括:
丢弃该k个邻居点中的第一邻居点和第二邻居点,得到k个邻居点中剩余的邻居点;其中,第一邻居点是指k个邻居点中与参考点的距离大于第一阈值的邻居点,所述第二邻居点是指k个邻居点中与参考点的距离大于或者等于第二阈值的邻居点,该k个邻居点包括该参考点;利用该k个邻居点中剩余的邻居点的属性值,确定该目标点的属性信息的预测值。
应理解,解码方法400可参考编码方法300的相关描述,为避免重复,此处不再赘述。
下面将结合附图对本申请实施例提供的编码器或解码器进行说明。
图10是本申请实施例提供的点云的编码器500的示意性框图。
如图10所示,该编码器500可包括:
获取单元510,用于获取点云中目标点位置信息的重建信息;
预测单元520,用于:从该点云中的M个已编码点中选择N个已编码点作为该目标点的N个备选点,M≥N≥1;基于该目标点位置信息的重建信息,从该N个备选点中选择k个邻居点,N≥k≥1;利用该k个邻居点的属性值,确定该目标点属性信息的预测值;该k个邻居点的属性值为该k个邻居点属性信息的重建值或该k个邻居点属性信息的原始值;
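a residual unit 530, configured to obtain a residual value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the original value of the attribute information of the target point;
an encoding unit 540, configured to encode the residual value of the attribute information of the target point to obtain the bitstream of the point cloud.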
在一些实施例中,已编码点数量M超过备选点数量N;其中,该预测单元520具体用于:
基于该M个已编码点的第一顺序,从该M个已编码点中选择该N个已编码点;其中,该第一顺序为按照从小到大或从大到小的顺序对该M个已编码点和该目标点进行莫顿排序或希尔伯特排序所得到的顺序,或该第一顺序为该M个已编码点和该目标点的编码顺序;
将该N个已编码点作为该N个备选点。
在一些实施例中,该预测单元520具体用于:
沿该第一顺序,将在该目标点之前且与该目标点相邻的N个点,确定为该N个已编码点;或者
沿该第一顺序,将在该目标点之前且连续的N个点,确定为该N个已编码点,其中,该连续的N个点与该目标点相邻或间隔至少一个已编码点。
在一些实施例中,该预测单元520具体用于:
基于该目标点位置信息的重建信息和该N个备选点位置信息的重建信息,确定该N个备选点和该目标点的几何结构关系;
基于该几何结构关系,从该N个备选点中选择该k个邻居点。
在一些实施例中,该几何结构关系通过八叉树结构表示;该预测单元520具体用于:
基于该八叉树结构,确定该目标点的k个最近邻点;
将该k个最近邻点,确定为该k个邻居点。
在一些实施例中,该预测单元520具体用于:
基于该几何结构关系,从该N个备选点中选择与该目标点共线和/或共面的p个备选点;
若备选点数量p大于或等于邻居点数量k,则将该p个备选点确定为该k个邻居点;或者
若备选点数量p大于或等于邻居点数量k,则在该p个备选点中选择k个备选点作为该k个邻居点。
在一些实施例中,该预测单元520具体用于:
基于该几何结构关系,从该N个备选点中选择与该目标点共线和/或共面的p个备选点;
若备选点数量p小于邻居点数量k或备选点数量p等于0,则基于该目标点位置信息的重建信息和该N个备选点位置信息的重建信息,确定该N个备选点中的备选点到该目标点之间的距离;
基于该N个备选点中的备选点到该目标点之间的距离,从该N个备选点中选择该k个邻居点,其中,该N个备选点中的备选点到该目标点的距离为欧式距离或曼哈顿距离。
在一些实施例中,该预测单元520具体用于:
基于该几何结构关系,从该N个备选点中选择与该目标点共线和/或共面的p个备选点;
若备选点数量p小于邻居点数量k或备选点数量p等于0,则利用该目标点位置信息的重建信息和该N个备选点位置信息的重建信息确定第二顺序;基于该第二顺序,从该N个备选点中选择该k个邻居点;
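The second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or the second order is an order obtained by sorting the candidate points among the N candidate points by their distance from the target point in descending or ascending order, where the distance from a candidate point among the N candidate points to the target point is the Euclidean distance or the Manhattan distance.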
在一些实施例中,该预测单元520具体用于:
基于该目标点位置信息的重建信息和该N个备选点位置信息的重建信息,确定该N个备选点中的备选点到该目标点之间的距离;
基于该N个备选点中的备选点到该目标点之间的距离,从该N个备选点中选择该k个邻居点,其中,该N个备选点中的备选点到该目标点的距离为欧式距离或曼哈顿距离。
在一些实施例中,该预测单元520具体用于:
将该N个备选点中的第一目标备选点,确定为该k个邻居点;第一目标备选点是指N个备选点中与目标点的距离小于第一阈值的点,或
将该N个备选点中的第二目标备选点,确定为k个邻居点,第二目标备选点是指N个备选点中与目标点的距离小于第二阈值的点。
在一些实施例中,该预测单元520具体用于:
利用该目标点位置信息的重建信息和该N个备选点位置信息的重建信息确定第二顺序;
基于该第二顺序,从该N个备选点中选择该k个邻居点;
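The second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or the second order is an order obtained by sorting the candidate points among the N candidate points by their distance from the target point in descending or ascending order, where the distance from a candidate point among the N candidate points to the target point is the Euclidean distance or the Manhattan distance.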
在一些实施例中,该预测单元520具体用于:
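take the reciprocal of the distance between each of the k neighbor points and the target point as the weight of that neighbor point, perform a weighted average calculation based on the attribute value and weight of each of the k neighbor points to obtain a weighted average of the attribute values of the k neighbor points, and determine the weighted average of the attribute values of the k neighbor points as the predicted value of the attribute information of the target point; or
set the same or different initial weights for different neighbor points among the k neighbor points, perform a weighted average calculation based on the attribute value and initial weight of each of the k neighbor points to obtain a weighted average of the attribute values of the k neighbor points, and determine the weighted average of the attribute values of the k neighbor points as the predicted value of the attribute information of the target point, where the initial weight of each of the k neighbor points decreases as the distance between that neighbor point and the target point increases, and the bitstream includes the initial weight of each of the k neighbor points; or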
将该k个邻居点中与该目标点的距离最近的邻居点的属性值,确定为该目标点的属性信息的预测值。
在一些实施例中,该预测单元520具体用于:
丢弃该k个邻居点中的第一邻居点和第二邻居点,得到k个邻居点中剩余的邻居点;其中,第一邻居点是指k个邻居点中与参考点的距离大于第一阈值的邻居点,第二邻居点是指所述k个邻居点中与参考点的距离大于或者等于第二阈值的邻居点,该k个邻居点包括该参考点。
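It should be noted that the encoder 500 may also be combined with the encoding framework 100 shown in FIG. 1, that is, units in the encoder 500 may replace or be combined with the relevant units in the encoding framework 100. For example, the prediction unit 520 and the residual unit 530 may be used to implement the relevant functions of the predicting transform unit 113 in the encoding framework 100, and may even be used to implement the position encoding function and the functions performed before prediction of the attribute information. As another example, the encoding unit 540 may replace the second arithmetic encoding unit 116 in the encoding framework 100.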
图11是本申请实施例提供的解码器600的示意性框图。
如图11所示,该解码器600可包括:
解析单元610,用于获取点云的码流,对点云的码流进行解析,得到该点云中目标点位置信息的重建信息;
预测单元620,用于:从该点云中的M个已解码点中选择N个已解码点作为该目标点的N个备选点,M≥N≥1;基于该目标点位置信息的重建信息,从该N个备选点中选择k个邻居点,N≥k≥1;利用该k个邻居点的属性值,确定该目标点属性信息的预测值;该k个邻居点的属性值为该k个邻居点属性信息的重建值;
该解析单元610还用于对该码流进行解析,得到该目标点属性信息的残差值;
残差单元630,用于根据该目标点属性信息的预测值和该目标点属性信息的残差值,得到该目标点属性信息的最终重建值;
解码单元640,用于根据该目标点属性信息的最终重建值,得到解码点云。
在一些实施例中,已解码点数量M超过备选点数量N;其中,该预测单元620具体用于:
基于该M个已解码点的第一顺序,从该M个已解码点中选择该N个已解码点;其中,该第一顺序为按照从小到大或从大到小的顺序对该M个已解码点和该目标点进行莫顿排序或希尔伯特排序所得到的顺序,或该第一顺序为该M个已解码点和该目标点的解码顺序;
将该N个已解码点,确定为该N个备选点。
在一些实施例中,该预测单元620具体用于:
沿该第一顺序,将在该目标点之前且与该目标点相邻的N个点,确定为该N个已解码点;或者
沿该第一顺序,将在该目标点之前且连续的N个点,确定为该N个已解码点,其中,该连续的N个点与该目标点相邻或间隔至少一个已解码点。
在一些实施例中,该预测单元620具体用于:
基于该目标点位置信息的重建信息和该N个备选点位置信息的重建信息,确定该N个备选点和该目标点的几何结构关系;
基于该几何结构关系,从该N个备选点中选择该k个邻居点。
在一些实施例中,该几何结构关系通过八叉树结构表示;该预测单元620具体用于:
基于该八叉树结构,确定该目标点的k个最近邻点;
将该k个最近邻点,确定为该k个邻居点。
在一些实施例中,该预测单元620具体用于:
基于该几何结构关系,从该N个备选点中选择与该目标点共线和/或共面的p个备选点;
若备选点数量p大于或等于邻居点数量k,则将该p个备选点确定为该k个邻居点;或者
若备选点数量p大于或等于邻居点数量k,则在该p个备选点中选择k个备选点作为该k个邻居点。
在一些实施例中,该预测单元620具体用于:
基于该几何结构关系,从该N个备选点中选择与该目标点共线和/或共面的p个备选点;
若备选点数量p小于邻居点数量k或备选点数量p等于0,则基于该目标点位置信息的重建信息和该N个备选点位置信息的重建信息,确定该N个备选点中的各个备选点到该目标点之间的距离;
基于该N个备选点中的各个备选点到该目标点之间的距离,从该N个备选点中选择该k个邻居点,其中,该N个备选点中的备选点到该目标点的距离为欧式距离或曼哈顿距离。
在一些实施例中,该预测单元620具体用于:
基于该几何结构关系,从该N个备选点中选择与该目标点共线和/或共面的p个备选点;
若备选点数量p小于邻居点数量k或备选点数量p等于0,则利用该目标点位置信息的重建信息和该N个备选点位置信息的重建信息确定第二顺序;基于该第二顺序,从该N个备选点中选择该k个邻居点;
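The second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or the second order is an order obtained by sorting the candidate points among the N candidate points by their distance from the target point in descending or ascending order, where the distance from a candidate point among the N candidate points to the target point is the Euclidean distance or the Manhattan distance.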
在一些实施例中,该预测单元620具体用于:
基于该目标点位置信息的重建信息和该N个备选点位置信息的重建信息,确定该N个备选点中的各个备选点到该目标点之间的距离;
基于该N个备选点中的各个备选点到该目标点之间的距离,从该N个备选点中选择该k个邻居点,其中,该N个备选点中的备选点到该目标点的距离为欧式距离或曼哈顿距离。
在一些实施例中,该预测单元620具体用于:
将该N个备选点中的第一目标备选点,确定为该k个邻居点,第一目标备选点是指N个备选点中与目标点的距离小于第一阈值的点;或
将该N个备选点中的第二目标备选点,确定为k个邻居点,第二目标备选点是指N个备 选点中与目标点的距离小于第二阈值的点。
在一些实施例中,该预测单元620具体用于:
利用该目标点位置信息的重建信息和该N个备选点位置信息的重建信息确定第二顺序;
基于该第二顺序,从该N个备选点中选择该k个邻居点;
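The second order is an order obtained by performing Morton sorting or Hilbert sorting on the N candidate points and the target point in ascending or descending order; or the second order is an order obtained by sorting the candidate points among the N candidate points by their distance from the target point in descending or ascending order, where the distance from a candidate point among the N candidate points to the target point is the Euclidean distance or the Manhattan distance.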
在一些实施例中,该预测单元620具体用于:
以该k个邻居点中的每一个邻居点与该目标点的距离的倒数为该每一个邻居点的权重,基于该k个邻居点中每一个邻居点的属性值和权重进行加权平均计算,得到所述k个邻居点属性值的加权平均值,将该k个邻居点的属性值的加权平均值,确定为该目标点的属性信息的预测值;或者
针对该k个邻居点中的不同邻居点设置相同或不同的初始权重,基于该k个邻居点中每一个邻居点的属性值和初始权重进行加权平均计算,得到所述k个邻居点属性值的加权平均值,将该k个邻居点的属性值的加权平均值,确定为该目标点的属性信息的预测值,其中该k个邻居点中的每一个邻居点的初始权重随该每一个邻居点与该目标点的距离的增大而减小,该码流包括该k个邻居点中的每一个邻居点的初始权重;或者
将该k个邻居点中与该目标点的距离最近的邻居点的属性值,确定为该目标点的属性信息的预测值。
在一些实施例中,该预测单元620具体用于:
丢弃该k个邻居点中的第一邻居点和第二邻居点,得到k个邻居点中剩余的邻居点;其中,第一邻居点是指k个邻居点中与参考点的距离大于第一阈值的邻居点,第二邻居点是指k个邻居点中与参考点的距离大于或者等于第二阈值的邻居点,该k个邻居点包括该参考点。
需要说明的是,该解码器600也可以结合至图2所示的解码框架200,即可将该解码器600中的单元替换或结合至解码框架200中的相关单元。例如,该解析单元610可用于实现解码框架200中的预测变换单元213的相关功能,甚至可用于实现反量化单元211以及第二算数解码单元210的相关功能。再如,预测单元620和残差单元630可用于实现预测变换单元213的相关功能。再如,解码单元640可用于实现该解码框架200中的颜色空间反变换单元215的功能。
应理解,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,编码器500可以对应于执行本申请实施例的方法300中的相应主体,并且编码器500中的各个单元分别为了实现方法300中的相应流程,类似的,解码器600可以对应于执行本申请实施例的方法400中的相应主体,并且解码器600中的各个单元分别为了实现方法400中的相应流程,为了简洁,在此不再赘述。
还应当理解,本申请实施例涉及的点云媒体的数据处理装置中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成,或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成,这可以实现同样的操作,而不影响本申请的实施例的技术效果的实现。上述单元是基于逻辑功能划分的,在实际应用中,一个单元的功能也可以由多个单元来实现,或者多个单元的功能由一个单元实现。在本申请的其它实施例中,该点云媒体的数 据处理装置也可以包括其它单元,在实际应用中,这些功能也可以由其它单元协助实现,并且可以由多个单元协作实现。根据本申请的另一个实施例,可以通过在包括例如中央处理单元(CPU)、随机存取存储介质(RAM)、只读存储介质(ROM)等处理元件和存储元件的通用计算机的通用计算设备上运行能够执行相应方法所涉及的各步骤的计算机可读指令(包括程序代码),来构造本申请实施例涉及的点云媒体的数据处理装置,以及来实现本申请实施例的基于点云属性预测的编解码方法。计算机可读指令可以记载于例如计算机可读存储介质上,并通过计算机可读存储介质装载于任意具有数据处理能力的电子设备,并在其中运行,来实现本申请实施例的相应方法。
换言之,上文涉及的单元可以通过硬件形式实现,也可以通过软件形式的指令实现,还可以通过软硬件结合的形式实现。具体地,本申请实施例中的方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路和/或软件形式的指令完成,结合本申请实施例公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件组合执行完成。可选地,软件可以位于随机存储器,闪存、只读存储器、可编程只读存储器、电可擦写可编程存储器、寄存器等本领域的成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法实施例中的步骤。
图12是本申请实施例提供的编解码设备700的示意结构图。
如图12所示,该编解码设备700至少包括处理器710以及计算机可读存储介质720。其中,处理器710以及计算机可读存储介质720可通过总线或者其它方式连接。计算机可读存储介质720用于存储计算机可读指令721,计算机可读指令721包括计算机指令,处理器710用于执行计算机可读存储介质720存储的计算机指令。处理器710是编解码设备700的计算核心以及控制核心,其适于实现一条或多条计算机指令,具体适于加载并执行一条或多条计算机指令从而实现相应方法流程或相应功能。
作为示例,处理器710也可称为中央处理器(CentralProcessingUnit,CPU)。处理器710可以包括但不限于:通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等等。
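As an example, the computer-readable storage medium 720 may be a high-speed RAM memory or a non-volatile memory (Non-Volatile Memory), such as at least one disk memory; optionally, it may also be at least one computer-readable storage medium located remotely from the aforementioned processor 710. Specifically, the computer-readable storage medium 720 includes but is not limited to volatile memory and/or non-volatile memory. The non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (synch link DRAM, SLDRAM), and direct Rambus random access memory (Direct Rambus RAM, DR RAM).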
在一种实现方式中,该编解码设备700可以是图1所示的编码框架100或图10所示的编码器500;该计算机可读存储介质720中存储有第一计算机指令;由处理器710加载并执行计算机可读存储介质720中存放的第一计算机指令,以实现图8所示方法实施例中的相应步骤;具体实现中,计算机可读存储介质720中的第一计算机指令由处理器710加载并执行相应步骤,为避免重复,此处不再赘述。
在一种实现方式中,该编解码设备700可以是图2所示的解码框架200或图11所示的解码器600;该计算机可读存储介质720中存储有第二计算机指令;由处理器710加载并执行计算机可读存储介质720中存放的第二计算机指令,以实现图9所示方法实施例中的相应步骤;具体实现中,计算机可读存储介质720中的第二计算机指令由处理器710加载并执行相应步骤,为避免重复,此处不再赘述。
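According to another aspect of this application, an embodiment of this application further provides a computer-readable storage medium (Memory), which is a memory device in the codec device 700 used to store programs and data, for example, the computer-readable storage medium 720. It can be understood that the computer-readable storage medium 720 here may include a built-in storage medium of the codec device 700, and may also include an extended storage medium supported by the codec device 700. The computer-readable storage medium provides storage space, and this storage space stores the operating system of the codec device 700. The storage space also stores one or more computer instructions suitable for being loaded and executed by the processor 710; these computer instructions may be one or more computer-readable instructions 721 (including program code). These computer instructions are used by a computer to perform the encoding and decoding methods based on point cloud attribute prediction provided in the various optional implementations above.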
根据本申请的另一方面,提供了一种计算机可读指令产品或计算机可读指令,该计算机可读指令产品或计算机可读指令包括计算机指令,该计算机指令存储在计算机可读存储介质中。例如,计算机可读指令721。此时,编解码设备700可以是计算机,处理器710从计算机可读存储介质720读取该计算机指令,处理器710执行该计算机指令,使得该计算机执行上述各种可选方式中提供的基于点云属性预测的编解码方法。
换言之,当使用软件实现时,可以全部或部分地以计算机可读指令产品的形式实现。该计算机可读指令产品包括一个或多个计算机指令。在计算机上加载和执行该计算机指令时,全部或部分地运行本申请实施例的流程或实现本申请实施例的功能。该计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。该计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质进行传输,例如,该计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元以及流程步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
最后需要说明的是,以上内容,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以权利要求的保护范围为准。
Claims (20)
- 一种基于点云属性预测的解码方法,由编解码设备执行,其特征在于,包括:获取点云的码流,对所述点云的码流进行解析,得到所述点云中目标点位置信息的重建信息;从所述点云中的M个已解码点中选择N个已解码点作为所述目标点的N个备选点,M≥N≥1;基于所述目标点位置信息的重建信息,从所述N个备选点中选择k个邻居点,N≥k≥1;利用所述k个邻居点的属性值,确定所述目标点属性信息的预测值;所述k个邻居点的属性值为所述k个邻居点属性信息的重建值;对所述码流进行解析,得到所述目标点属性信息的残差值;根据所述目标点属性信息的预测值和所述目标点属性信息的残差值,得到所述目标点属性信息的最终重建值;根据所述目标点属性信息的最终重建值,得到解码点云。
- 根据权利要求1所述的方法,其特征在于,所述已解码点数量M超过所述备选点数量N;其中,所述从所述点云中的M个已解码点中选择N个已解码点作为所述目标点的N个备选点,包括:基于所述M个已解码点的第一顺序,从所述M个已解码点中选择所述N个已解码点;其中,所述第一顺序为对所述M个已解码点和所述目标点进行莫顿排序或希尔伯特排序所得到的顺序,或所述第一顺序为所述M个已解码点和所述目标点的解码顺序;将所述N个已解码点,确定为所述N个备选点。
- 根据权利要求2所述的方法,其特征在于,所述基于所述M个已解码点的第一顺序,从所述M个已解码点中选择所述N个已解码点,包括:沿所述第一顺序,将在所述目标点之前且与所述目标点相邻的N个点,确定为所述N个已解码点;或者沿所述第一顺序,将在所述目标点之前且连续的N个点,确定为所述N个已解码点,其中,所述连续的N个点与所述目标点相邻或间隔至少一个已解码点。
- 根据权利要求1所述的方法,其特征在于,所述基于所述目标点位置信息的重建信息,从所述N个备选点中选择k个邻居点,包括:基于所述目标点位置信息的重建信息和所述N个备选点位置信息的重建信息,确定所述N个备选点和所述目标点间的几何结构关系;基于所述几何结构关系,从所述N个备选点中选择所述k个邻居点。
- 根据权利要求4所述的方法,其特征在于,所述几何结构关系通过八叉树结构表示;所述基于所述几何结构关系,从所述N个备选点中选择所述k个邻居点,包括:基于所述八叉树结构,确定所述目标点的k个最近邻点;将所述k个最近邻点,确定为所述k个邻居点。
- 根据权利要求4所述的方法,其特征在于,所述基于所述几何结构关系,从所述N个备选点中选择所述k个邻居点,包括:基于所述几何结构关系,从所述N个备选点中选择与所述目标点共线和/或共面的p个备选点;若所述备选点数量p大于或等于所述邻居点数量k,则将所述p个备选点确定为所述k个邻居点;或者若所述备选点数量p大于或等于所述邻居点数量k,则在所述p个备选点中选择k个备选点作为所述k个邻居点。
- 根据权利要求4所述的方法,其特征在于,所述基于所述几何结构关系,从所述N个备选点中选择所述k个邻居点,包括:基于所述几何结构关系,从所述N个备选点中选择与所述目标点共线和/或共面的p个备选点;若所述备选点数量p小于所述邻居点数量k或所述备选点数量p等于0,则基于所述目标点位置信息的重建信息和所述N个备选点位置信息的重建信息,确定所述N个备选点中的各个备选点到所述目标点之间的距离;基于所述N个备选点中的各个备选点到所述目标点之间的距离,从所述N个备选点中选择所述k个邻居点。
- 根据权利要求4所述的方法,其特征在于,所述基于所述几何结构关系,从所述N个备选点中选择所述k个邻居点,包括:基于所述几何结构关系,从所述N个备选点中选择与所述目标点共线和/或共面的p个备选点;若所述备选点数量p小于所述邻居点数量k或所述备选点数量p等于0,则利用所述目标点位置信息的重建信息和所述N个备选点位置信息的重建信息确定第二顺序;基于所述第二顺序,从所述N个备选点中选择所述k个邻居点;其中,所述第二顺序为对所述N个备选点和所述目标点进行莫顿排序或希尔伯特排序所得到的顺序;或者,所述第二顺序为按照所述N个备选点中的备选点与所述目标点的距离由大到小或有小到大进行排序后得到的顺序,所述N个备选点中的备选点到所述目标点的距离为欧式距离或曼哈顿距离。
- 根据权利要求1所述的方法,其特征在于,所述基于所述目标点位置信息的重建信息,从所述N个备选点中选择k个邻居点,包括:基于所述目标点位置信息的重建信息和所述N个备选点的位置信息的重建信息,确定所述N个备选点中的各个备选点到所述目标点之间的距离;基于所述N个备选点中的各个备选点到所述目标点之间的距离,从所述N个备选点中选择所述k个邻居点。
- 根据权利要求7或9所述的方法,其特征在于,所述基于所述N个备选点中的各个备选点到所述目标点之间的距离,从所述N个备选点中选择所述k个邻居点,包括:将所述N个备选点中的第一目标备选点,确定为所述k个邻居点,所述第一目标备选点是指所述N个备选点中与所述目标点的距离小于第一阈值的点;或将所述N个备选点中的第二目标备选点,确定为k个邻居点,所述第二目标备选点是指所述N个备选点中与所述目标点的距离小于第二阈值的点。
- 根据权利要求1所述的方法,其特征在于,所述基于所述目标点位置信息的重建信息,从所述N个备选点中选择k个邻居点,包括:利用所述目标点位置信息的重建信息和所述N个备选点的位置信息的重建信息确定第二顺序;基于所述第二顺序,从所述N个备选点中选择所述k个邻居点;其中,所述第二顺序为对所述N个备选点和所述目标点进行莫顿排序或希尔伯特排序所得到的顺序;或者,所述第二顺序为按照所述N个备选点中的各个备选点与所述目标点的距离进行排序后得到的顺序。
- 根据权利要求1~9中任一项所述的方法,其特征在于,所述利用所述k个邻居点的属性值,确定所述目标点属性信息的预测值,包括:以所述k个邻居点中的每一个邻居点与所述目标点的距离的倒数作为所述每一个邻居点的权重,基于所述k个邻居点中每一个邻居点的属性值和权重进行加权平均计算,得到所述k个邻居点属性值的加权平均值,将所述k个邻居点属性值的加权平均值,确定为所述目标点属性信息的预测值;或者针对所述k个邻居点中的不同邻居点设置相同或不同的初始权重,基于所述k个邻居点中每一个邻居点的属性值和初始权重进行加权平均计算,得到所述k个邻居点属性值的加权平均值,将所述k个邻居点属性值的加权平均值,确定为所述目标点属性信息的预测值,其中所述k个邻居点中的每一个邻居点的初始权重随所述每一个邻居点与所述目标点的距离的增大而减小,所述码流包括所述k个邻居点中的每一个邻居点的初始权重;或者将所述k个邻居点中与所述目标点的距离最近的邻居点的属性值,确定为所述目标点属性信息的预测值。
- 根据权利要求1~9中任一项所述的方法,其特征在于,所述利用所述k个邻居点的属性值,确定所述目标点的属性信息的预测值,包括:丢弃所述k个邻居点中的第一邻居点和第二邻居点,得到所述k个邻居点中剩余的邻居点;其中,所述第一邻居点是指所述k个邻居点中与参考点的距离大于第一阈值的邻居点,所述第二邻居点是指所述k个邻居点中与参考点的距离大于或者等于第二阈值的邻居点,所述k个邻居点包括所述参考点;利用所述k个邻居点中剩余的邻居点的属性值,确定所述目标点的属性信息的预测值。
- 一种基于点云属性预测的编码方法,由编解码设备执行,其特征在于,包括:获取点云中目标点位置信息的重建信息;从所述点云中的M个已编码点中选择N个已编码点作为所述目标点的N个备选点,M≥N≥1;基于所述目标点位置信息的重建信息,从所述N个备选点中选择k个邻居点,N≥k≥1;利用所述k个邻居点的属性值,确定所述目标点属性信息的预测值;所述k个邻居点的属性值为所述k个邻居点属性信息的重建值或所述k个邻居点属性信息的原始值;根据所述目标点属性信息的预测值和所述目标点属性信息的原始值,得到所述目标点属性信息的残差值;对所述目标点属性信息的残差值进行编码,得到所述点云的码流。
- 根据权利要求14所述的方法,其特征在于,已编码点数量M超过备选点数量N;其中,所述从所述点云中的M个已编码点中选择N个已编码点作为所述目标点的N个备选点,包括:基于所述M个已编码点的第一顺序,从所述M个已编码点中选择该N个已编码点;其中,所述第一顺序为对所述M个已编码点和所述目标点进行排序所得到的顺序,或所述第一顺序为所述M个已编码点和所述目标点的编码顺序;将所述N个已编码点作为所述N个备选点。
- 根据权利要求1所述的方法,其特征在于,所述基于所述M个已编码点的第一顺序,从所述M个已编码点中选择该N个已编码点,包括:沿所述第一顺序,将在所述目标点之前且与所述目标点相邻的N个点,确定为所述N个已编码点;或者,沿所述第一顺序,将在所述目标点之前且连续的N个点,确定为所述N个已编码点,其中,所述连续的N个点与所述目标点相邻或间隔至少一个已编码点。
- 一种基于点云属性预测的解码器,其特征在于,包括:解析单元,用于获取点云的码流,对所述点云的码流进行解析,得到所述点云中目标点位置信息的重建信息;预测单元,用于从所述点云中的M个已解码点中选择N个已解码点作为所述目标点的N个备选点,M≥N≥1;基于所述目标点位置信息的重建信息,从所述N个备选点中选择k个邻居点,N≥k≥1;利用所述k个邻居点的属性值,确定所述目标点属性信息的预测值;所述k个邻居点的属性值为所述k个邻居点属性信息的重建值;所述解析单元还用于对所述码流进行解析,得到所述目标点属性信息的残差值;残差单元,用于根据所述目标点属性信息的预测值和所述目标点属性信息的残差值,得到所述目标点属性信息的最终重建值;解码单元,用于根据所述目标点的属性信息的最终重建值,得到解码点云。
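- An encoder based on point cloud attribute prediction, comprising: an obtaining unit, configured to obtain reconstruction information of the position information of a target point in a point cloud; a prediction unit, configured to: select N encoded points from M encoded points in the point cloud as N candidate points of the target point, where M≥N≥1; select k neighbor points from the N candidate points based on the reconstruction information of the position information of the target point, where N≥k≥1; and determine a predicted value of the attribute information of the target point using the attribute values of the k neighbor points, the attribute values of the k neighbor points being reconstructed values of the attribute information of the k neighbor points or original values of the attribute information of the k neighbor points; a residual unit, configured to obtain a residual value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the original value of the attribute information of the target point; and an encoding unit, configured to encode the residual value of the attribute information of the target point to obtain a bitstream of the point cloud.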
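- An electronic device, comprising: a processor adapted to execute computer-readable instructions; and a computer-readable storage medium storing computer-readable instructions which, when executed by the processor, implement the decoding method based on point cloud attribute prediction according to any one of claims 1 to 13, or the encoding method based on point cloud attribute prediction according to any one of claims 14 to 16.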
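- A computer-readable storage medium for storing computer-readable instructions, the computer-readable instructions causing a computer to perform the decoding method based on point cloud attribute prediction according to any one of claims 1 to 13, or the encoding method based on point cloud attribute prediction according to any one of claims 14 to 16.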
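The prediction flow recited in the claims above (choose N candidate points from the already coded points, keep the k nearest to the target, predict the attribute by an inverse-distance weighted average, then code a residual) can be illustrated with a minimal Python sketch. This is not the applicant's implementation: the function names, the dictionary layout with `pos` and `attr` keys, the use of Euclidean distance, the handling of a zero distance, and the absence of quantization and entropy coding are all assumptions made for illustration.

```python
import math

def select_candidates(coded_points, n):
    """Take the n points that immediately precede the target in coding order.

    Assumes coded_points is listed in coding order and n >= 1.
    """
    return coded_points[-n:]

def select_neighbors(candidates, target_pos, k):
    """Sort candidates by distance to the target's reconstructed position; keep k."""
    ranked = sorted(candidates, key=lambda p: math.dist(p["pos"], target_pos))
    return ranked[:k]

def predict_attribute(neighbors, target_pos):
    """Inverse-distance weighted average of the neighbors' attribute values."""
    weights = []
    for p in neighbors:
        d = math.dist(p["pos"], target_pos)
        if d == 0:
            # Assumption: a coincident neighbor supplies the prediction directly,
            # which avoids dividing by zero.
            return p["attr"]
        weights.append(1.0 / d)
    total = sum(weights)
    return sum(w * p["attr"] for w, p in zip(weights, neighbors)) / total

def encode_residual(original, predicted):
    """Encoder side: residual = original value - predicted value."""
    return original - predicted

def reconstruct(predicted, residual):
    """Decoder side: reconstructed value = predicted value + residual."""
    return predicted + residual

if __name__ == "__main__":
    # Hypothetical already-coded points, listed in coding order.
    coded = [
        {"pos": (0, 0, 0), "attr": 10.0},
        {"pos": (2, 0, 0), "attr": 20.0},
        {"pos": (10, 0, 0), "attr": 100.0},
    ]
    target_pos = (1, 0, 0)
    neighbors = select_neighbors(select_candidates(coded, 3), target_pos, 2)
    predicted = predict_attribute(neighbors, target_pos)
    residual = encode_residual(17.0, predicted)
    print(predicted, residual, reconstruct(predicted, residual))
```

Because the decoder repeats the same candidate and neighbor selection on reconstructed positions, it arrives at the same predicted value as the encoder, so only the residual needs to be carried in the bitstream in this simplified setting.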
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22766110.5A EP4307663A4 (en) | 2021-03-12 | 2022-02-08 | DECODING METHODS/ENCODING METHODS BASED ON POINT CLOUD ATTRIBUTE PREDICTION, DECODER AND ENCODER |
US18/051,713 US20230086264A1 (en) | 2021-03-12 | 2022-11-01 | Decoding method, encoding method, decoder, and encoder based on point cloud attribute prediction |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110278568.X | 2021-03-12 | ||
CN202110278568.XA CN115086660B (zh) | 2021-03-12 | 2021-03-12 | Decoding method, encoding method, decoder, and encoder based on point cloud attribute prediction |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/051,713 Continuation US20230086264A1 (en) | 2021-03-12 | 2022-11-01 | Decoding method, encoding method, decoder, and encoder based on point cloud attribute prediction |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022188583A1 (zh) | 2022-09-15 |
Family
ID=83226313
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/075560 WO2022188583A1 (zh) | 2021-03-12 | 2022-02-08 | Decoding method, encoding method, decoder, and encoder based on point cloud attribute prediction |
Country Status (5)
Country | Link |
---|---|
US (1) | US20230086264A1 (zh) |
EP (1) | EP4307663A4 (zh) |
CN (2) | CN117097898A (zh) |
TW (1) | TWI815339B (zh) |
WO (1) | WO2022188583A1 (zh) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
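CN115688004B (zh) * | 2022-11-08 | 2023-09-29 | 中国民用航空总局第二研究所 | Target attribute determination method, medium, and device based on Hilbert coding |
WO2024168613A1 (zh) * | 2023-02-15 | 2024-08-22 | Oppo广东移动通信有限公司 | Decoding method, encoding method, decoder, and encoder |
WO2024207481A1 (zh) * | 2023-04-07 | 2024-10-10 | Oppo广东移动通信有限公司 | Encoding and decoding method, encoder, decoder, bitstream, and storage medium |
CN116989694A (zh) * | 2023-08-04 | 2023-11-03 | 深圳市汇和通传感技术有限公司 | Dimension positioning detection system based on 3D contour scanning |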
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190081638A1 (en) * | 2017-09-14 | 2019-03-14 | Apple Inc. | Hierarchical point cloud compression |
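CN110708560A (zh) * | 2018-07-10 | 2020-01-17 | 腾讯美国有限责任公司 | Point cloud data processing method and apparatus |
CN110996098A (zh) * | 2018-10-02 | 2020-04-10 | 腾讯美国有限责任公司 | Method and apparatus for processing point cloud data |
CN111145090A (zh) * | 2019-11-29 | 2020-05-12 | 鹏城实验室 | Point cloud attribute encoding method, decoding method, encoding device, and decoding device |
CN111405281A (zh) * | 2020-03-30 | 2020-07-10 | 北京大学深圳研究生院 | Encoding method and decoding method for point cloud attribute information, storage medium, and terminal device |
WO2020190090A1 (ko) * | 2019-03-20 | 2020-09-24 | 엘지전자 주식회사 | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |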
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10861196B2 (en) * | 2017-09-14 | 2020-12-08 | Apple Inc. | Point cloud compression |
US10911787B2 (en) * | 2018-07-10 | 2021-02-02 | Apple Inc. | Hierarchical point cloud compression |
US10979730B2 (en) * | 2019-03-20 | 2021-04-13 | Tencent America LLC | Techniques and apparatus for interframe point cloud attribute coding |
WO2021196029A1 (zh) * | 2020-03-31 | 2021-10-07 | 深圳市大疆创新科技有限公司 | Method and device for point cloud encoding and decoding |
2021
- 2021-03-12 CN CN202311139933.4A patent/CN117097898A/zh active Pending
- 2021-03-12 CN CN202110278568.XA patent/CN115086660B/zh active Active
2022
- 2022-02-08 EP EP22766110.5A patent/EP4307663A4/en active Pending
- 2022-02-08 WO PCT/CN2022/075560 patent/WO2022188583A1/zh active Application Filing
- 2022-03-08 TW TW111108454A patent/TWI815339B/zh active
- 2022-11-01 US US18/051,713 patent/US20230086264A1/en active Pending
Non-Patent Citations (1)
Title |
---|
See also references of EP4307663A4 |
Also Published As
Publication number | Publication date |
---|---|
CN115086660B (zh) | 2023-07-25 |
CN117097898A (zh) | 2023-11-21 |
CN115086660A (zh) | 2022-09-20 |
EP4307663A1 (en) | 2024-01-17 |
TW202236216A (zh) | 2022-09-16 |
TWI815339B (zh) | 2023-09-11 |
EP4307663A4 (en) | 2024-04-17 |
US20230086264A1 (en) | 2023-03-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022188583A1 (zh) | Decoding method, encoding method, decoder, and encoder based on point cloud attribute prediction | |
US11252441B2 (en) | Hierarchical point cloud compression | |
US11276203B2 (en) | Point cloud compression using fixed-point numbers | |
WO2021067867A1 (en) | Trimming search space for nearest neighbor determinations in point cloud compression | |
WO2019055772A1 (en) | COMPRESSION OF CLOUD OF POINTS | |
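WO2024037091A1 (zh) | Point cloud processing method and apparatus, computer device, and storage medium | |
CN115379191B (zh) | Point cloud decoding method, point cloud encoding method, and related device | |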
Vázquez et al. | Using normalized compression distance for image similarity measurement: an experimental study | |
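WO2023241107A1 (zh) | Point cloud processing method and apparatus, computer device, and storage medium | |
WO2022258009A1 (zh) | Entropy encoding and decoding methods and apparatus | |
WO2022067775A1 (zh) | Point cloud encoding and decoding methods, encoder, decoder, and codec system | |
WO2024037244A9 (zh) | Decoding method, encoding method, apparatus, storage medium, and device for point cloud data | |
TW202406344A (zh) | Point cloud geometry data enhancement and encoding/decoding methods, apparatus, bitstream, codec, system, and storage medium | |
WO2023205969A1 (zh) | Methods and apparatus for compressing and decompressing point cloud geometry information and for point cloud video encoding and decoding | |
WO2022067776A1 (zh) | Point cloud decoding and encoding methods, decoder, encoder, and codec system | |
WO2024060161A1 (zh) | Encoding and decoding method, encoder, decoder, and storage medium | |
WO2022257528A1 (zh) | Point cloud attribute prediction method and apparatus, and related device | |
WO2023098820A1 (zh) | Point cloud encoding and decoding methods, apparatus, and communication device | |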
WO2024012381A1 (en) | Method, apparatus, and medium for point cloud coding | |
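WO2024148473A1 (zh) | Encoding method and apparatus, encoder, bitstream, device, and storage medium | |
WO2023024842A1 (zh) | Point cloud encoding and decoding method, apparatus, device, and storage medium | |
WO2023103565A1 (zh) | Encoding and decoding method, apparatus, device, and storage medium for point cloud attribute information | |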
WO2024074121A1 (en) | Method, apparatus, and medium for point cloud coding | |
WO2023025135A1 (zh) | Point cloud attribute encoding method and apparatus, and decoding method and apparatus | |
KR20240097892A (ko) | Method, apparatus, and medium for point cloud coding | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22766110; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2022766110; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2022766110; Country of ref document: EP; Effective date: 20231012 |