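WO2022166963A1 - Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection - Google Patents

Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection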

Info

Publication number
WO2022166963A1
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
information
point cloud
projection
map
Prior art date
Application number
PCT/CN2022/075397
Other languages
English (en)
French (fr)
Inventor
杨付正
张伟
Original Assignee
荣耀终端有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202110172795.4A external-priority patent/CN114915795B/zh
Application filed by 荣耀终端有限公司 filed Critical 荣耀终端有限公司
Priority to JP2023517332A priority Critical patent/JP2023541207A/ja
Priority to US18/040,705 priority patent/US20230290007A1/en
Priority to KR1020237011307A priority patent/KR20230060534A/ko
Priority to EP22749242.8A priority patent/EP4184917A4/en
Publication of WO2022166963A1 publication Critical patent/WO2022166963A1/zh

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/43 Hardware specially adapted for motion estimation or compensation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/001 Model-based coding, e.g. wire frame
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N9/00 Details of colour television systems
    • H04N9/12 Picture reproducers
    • H04N9/31 Projection devices for colour picture display, e.g. using electronic spatial light modulators [ESLM]
    • H04N9/3179 Video signal processing therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds

Definitions

  • the invention belongs to the technical field of encoding and decoding, and in particular relates to a point cloud encoding and decoding method and device based on two-dimensional regularized plane projection.
  • With the improvement of hardware processing capability and the rapid development of computer vision, the 3D point cloud has become a new generation of immersive multimedia after audio, image and video, and is widely used in virtual reality, augmented reality, autonomous driving, environment modeling, and so on.
  • 3D point cloud usually has a large amount of data, which is not conducive to the transmission and storage of point cloud data. Therefore, it is of great significance to study efficient point cloud encoding and decoding technology.
  • G-PCC: Geometry-based Point Cloud Compression
  • the geometric information and attribute information of the point cloud are encoded separately.
  • the geometric encoding and decoding of G-PCC can be divided into geometric encoding and decoding based on octree and geometric encoding and decoding based on prediction tree.
  • Octree-based geometric encoding and decoding: At the encoding end, the geometric information of the point cloud is first preprocessed, which includes coordinate transformation and voxelization of the point cloud. Then, in breadth-first traversal order, the bounding box containing the point cloud is recursively divided by tree partitioning (octree/quadtree/binary tree). Finally, the occupancy code of each node and the number of points contained in each leaf node are encoded to generate a binary code stream. At the decoding end, the occupancy code of each node is first parsed in breadth-first traversal order. Tree partitioning is then performed in turn and stops when a 1x1x1 unit cube is obtained. Finally, the number of points contained in each leaf node is parsed, and the reconstructed point cloud geometric information is obtained.
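  • The breadth-first octree division described above can be sketched as follows. This is a minimal illustration, not the G-PCC reference implementation: it assumes voxelized integer coordinates inside a cube whose side is a power of two, uses octree splits only (no quadtree/binary tree), emits one 8-bit occupancy (placeholder) code per node, and omits the per-leaf point counts and the entropy coding. The function name `encode_octree` and the child bit order are hypothetical.

```python
from collections import deque

def encode_octree(points, size):
    """Return breadth-first 8-bit occupancy codes for integer points in [0, size)^3.

    Assumption: `size` is a power of two; division stops at 1x1x1 unit cubes.
    """
    codes = []
    queue = deque([((0, 0, 0), size, list(points))])
    while queue:
        (ox, oy, oz), s, pts = queue.popleft()
        if s == 1:
            continue  # unit cube reached, nothing left to divide
        half = s // 2
        children = {}
        for x, y, z in pts:
            # Child index: one bit per axis (x is the most significant bit here).
            idx = ((x - ox >= half) << 2) | ((y - oy >= half) << 1) | int(z - oz >= half)
            children.setdefault(idx, []).append((x, y, z))
        # Bit k of the code is set when child k contains at least one point.
        codes.append(sum(1 << idx for idx in children))
        for idx in sorted(children):
            origin = (ox + half * (idx >> 2 & 1),
                      oy + half * (idx >> 1 & 1),
                      oz + half * (idx & 1))
            queue.append((origin, half, children[idx]))
    return codes
```

  • For two points at opposite corners of a 4x4x4 cube, the root code has bits 0 and 7 set, followed by one code for each of the two occupied children.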
  • Prediction tree-based geometric encoding and decoding: At the encoding end, the original point cloud is first sorted. Then, a prediction tree structure is established by assigning each point to its corresponding laser scanner and building the prediction tree according to the different laser scanners. Next, each node in the prediction tree is traversed, the geometric information of the node is predicted by selecting among different prediction modes to obtain the prediction residual, and the prediction residual is quantized using the quantization parameter. Finally, the prediction tree structure, the quantization parameters and the prediction residuals of the node geometric information are encoded to generate a binary code stream.
  • At the decoding end, the code stream is first parsed and the prediction tree structure is reconstructed; the prediction residual and quantization parameters of the geometric information of each node are then obtained by parsing, the prediction residual is inversely quantized, and finally the geometric information of each node is recovered, so that the geometric information of the point cloud is reconstructed.
  • the present invention provides a point cloud encoding and decoding method and device based on two-dimensional regularized plane projection.
  • the technical problem to be solved by the present invention is realized by the following technical solutions:
  • a point cloud encoding method based on two-dimensional regularized plane projection comprising:
  • the several two-dimensional image information is encoded to obtain code stream information.
  • a two-dimensional regularized plane projection is performed on the original point cloud data to obtain a two-dimensional projected plane structure, including:
  • a mapping relationship between the original point cloud data and the two-dimensional projection plane structure is determined, so as to project the original point cloud data onto the two-dimensional projection plane structure.
  • the plurality of two-dimensional map information includes a geometric information map
  • the geometric information map includes an occupancy information map, a depth information map, a projection residual information map, and a coordinate transformation error information map.
  • code stream information including:
  • a geometry information code stream is obtained according to the occupancy information code stream, the depth information code stream, the projection residual information code stream, and the coordinate conversion error information code stream.
  • the method further includes:
  • the attribute information of the original point cloud data is encoded based on the reconstructed point cloud geometric information to obtain an attribute information code stream.
  • the pieces of two-dimensional graph information further include attribute information graphs.
  • encoding the several two-dimensional image information to obtain code stream information further comprising:
  • the attribute information map is encoded to obtain an attribute information code stream.
  • Another embodiment of the present invention also provides a point cloud encoding device based on two-dimensional regularized plane projection, including:
  • a first data acquisition module used for acquiring original point cloud data
  • a projection module for performing a two-dimensional regularized plane projection on the original point cloud data to obtain a two-dimensional projected plane structure
  • a data processing module configured to obtain several two-dimensional map information according to the two-dimensional projection plane structure
  • the encoding module is used for encoding the several two-dimensional image information to obtain the code stream information.
  • Another embodiment of the present invention also provides a point cloud decoding method based on two-dimensional regularized plane projection, including:
  • the point cloud is reconstructed using the two-dimensional projected plane structure.
  • Yet another embodiment of the present invention also provides a point cloud decoding device based on two-dimensional regularized plane projection, including:
  • the second data acquisition module is used to acquire and decode the code stream information to obtain analytical data
  • a first reconstruction module configured to reconstruct several two-dimensional map information according to the analytical data
  • a second reconstruction module configured to obtain a two-dimensional projection plane structure according to the several two-dimensional map information
  • the point cloud reconstruction module is used for reconstructing the point cloud by using the two-dimensional projection plane structure.
  • By projecting the point cloud in three-dimensional space onto the corresponding two-dimensional regularized projection plane structure, the invention regularizes the point cloud in the vertical and horizontal directions and obtains a strongly correlated representation of the point cloud on the two-dimensional projection plane structure, thus avoiding the sparsity of the 3D representation structure and better reflecting the spatial correlation of the point cloud.
  • the spatial correlation of the point cloud can be greatly utilized, the spatial redundancy can be reduced, and the encoding efficiency of the point cloud can be further improved.
  • FIG. 1 is a schematic diagram of a point cloud encoding method based on two-dimensional regularized plane projection provided by an embodiment of the present invention
  • FIG. 2 is a schematic diagram of the correspondence between the cylindrical coordinates of a point and a pixel in a two-dimensional projection plane provided by an embodiment of the present invention
  • FIG. 3 is a schematic diagram of a two-dimensional projection plane structure of a point cloud provided by an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a projection residual provided by an embodiment of the present invention.
  • FIG. 5 is a frame diagram of a point cloud geometric information encoding provided by an embodiment of the present invention.
  • FIG. 6 is a frame diagram of attribute information encoding based on reconstructed geometric information provided by an embodiment of the present invention.
  • FIG. 7 is a frame diagram of simultaneous encoding of point cloud geometric information and attribute information provided by an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a point cloud encoding device based on two-dimensional regularized plane projection provided by an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of a point cloud decoding method based on two-dimensional regularized plane projection provided by an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a point cloud decoding device based on two-dimensional regularized plane projection according to an embodiment of the present invention.
  • FIG. 1 is a schematic diagram of a point cloud encoding method based on two-dimensional regularized plane projection provided by an embodiment of the present invention, including the following steps:
  • the original point cloud data usually consists of a set of three-dimensional space points, and each space point records its own geometric position information, as well as additional attribute information such as color, reflectance, and normal.
  • the geometric position information of the point cloud is generally represented based on the Cartesian coordinate system, that is, represented by the x, y, and z coordinates of the point.
  • Raw point cloud data can be obtained through 3D scanning equipment such as lidar, etc., or through public datasets provided by various platforms.
  • the acquired geometric position information of the original point cloud data is represented based on a Cartesian coordinate system. It should be noted that the representation method of the geometric position information of the original point cloud data is not limited to Cartesian coordinates.
  • the original point cloud data may also be preprocessed, such as voxelization, to facilitate subsequent encoding.
  • the regularization parameters are usually finely calibrated by the manufacturer and provided to the consumer as one of the necessary data items, such as the acquisition range of the lidar, the sampling angular resolution of the horizontal azimuth or the number of sampling points, the distance correction factor of each laser scanner, the offset information V_o and H_o of the laser scanner along the vertical and horizontal directions, and the offset information θ_0 and α of the laser scanner along the elevation and horizontal azimuth angles.
  • the regularization parameters are not limited to those given above; they can be taken from the given calibration parameters of the lidar or, when the calibration parameters of the lidar are not given, obtained by means such as optimization-based estimation and data simulation.
  • the two-dimensional regularized projection plane structure of the point cloud is a data structure containing M rows and N columns of pixels, and the points in the three-dimensional point cloud correspond to the pixels in the data structure after being projected.
  • the pixel (i, j) in the data structure can be associated with the cylindrical coordinate components (θ, φ); for example, the following formula can be used to find the pixel (i, j) corresponding to the cylindrical coordinate (r, θ, φ).
  • FIG. 2 is a schematic diagram of the correspondence between the cylindrical coordinates of a point and a pixel in a two-dimensional projection plane provided by an embodiment of the present invention.
  • the resolution of the two-dimensional regularized projection plane can be obtained from the regularization parameters. For example, if the resolution of the two-dimensional regularized projection plane is assumed to be M × N, the number of laser scanners in the regularization parameters can be used to initialize M, and the sampling angular resolution of the horizontal azimuth (or the number of sampling points of the laser scanner) can be used to initialize N, for example with the following formula. The initialization of the two-dimensional projection plane structure is then complete, and a plane structure containing M × N pixels is obtained.
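  • A minimal sketch of this initialization, under stated assumptions: the formula itself is not reproduced in this text, so the column count is assumed here to be a full 360° sweep divided by the horizontal angular resolution. The function and parameter names (`init_projection_plane`, `laser_num`, `azimuth_resolution_deg`) are hypothetical.

```python
def init_projection_plane(laser_num, azimuth_resolution_deg):
    """Create an empty M x N projection plane.

    Assumptions: M is the number of laser scanners, and N is a full 360-degree
    sweep divided by the horizontal sampling angular resolution (in degrees).
    """
    rows = laser_num
    cols = round(360.0 / azimuth_resolution_deg)
    # None marks a pixel that no point has been projected onto yet.
    return [[None] * cols for _ in range(rows)]
```

  • If the number of sampling points per laser scanner is given instead of the angular resolution, it could be used directly as the column count.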
  • mapping relationship between the original point cloud data and the two-dimensional projected plane structure is determined, so as to project the original point cloud data onto the two-dimensional projected plane structure.
  • This part judges the position of the original point cloud in the two-dimensional projection plane structure point by point, and maps the point cloud originally distributed randomly in the Cartesian coordinate system to the uniformly distributed two-dimensional regular projection plane structure. Specifically, for each point in the original point cloud, the corresponding pixel is determined in the two-dimensional projection plane structure, for example, the pixel with the smallest spatial distance from the projection position of the point in the two-dimensional plane can be selected as the corresponding pixel of the point.
  • the entire two-dimensional projection plane structure can be directly used as the search area of the current point. Further, in order to reduce the amount of calculation, the search area of the corresponding pixel in the two-dimensional projection plane structure can also be determined from the pitch angle θ and azimuth angle φ of the cylindrical coordinate components of the current point, so as to reduce the search area.
  • if the error Err is less than the current minimum error minErr, it is used to update the minimum error minErr, and the i and j of the current pixel are used to update the i and j of the pixel corresponding to the current point; if the error Err is greater than the minimum error minErr, the above update is not performed.
  • the corresponding pixel (i, j) of the current point in the two-dimensional projection plane structure can be determined.
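  • The minimum-error search described above can be sketched as follows. This is an illustration under simplifying assumptions, not the patent's exact procedure: the scanner offsets (V_o, H_o, α) are ignored, each row i has a known elevation angle, column j is assumed to have azimuth −180° + j·Δφ, and a candidate pixel is mapped back to Cartesian coordinates as (r·cos φ, r·sin φ, r·tan θ). The whole plane is searched; all names are hypothetical.

```python
import math

def find_pixel(point, laser_elevations_deg, delta_phi_deg):
    """Return ((i, j), min_err) for the pixel whose back-projected position
    is closest to `point` (simplified lidar model, offsets ignored)."""
    x, y, z = point
    r = math.hypot(x, y)  # cylindrical radius of the point
    num_cols = round(360.0 / delta_phi_deg)
    best, min_err = None, float("inf")
    for i, theta_deg in enumerate(laser_elevations_deg):
        theta = math.radians(theta_deg)
        for j in range(num_cols):
            phi = math.radians(-180.0 + j * delta_phi_deg)
            # Position of pixel (i, j) in Cartesian space at the point's radius.
            candidate = (r * math.cos(phi), r * math.sin(phi), r * math.tan(theta))
            err = math.dist((x, y, z), candidate)
            if err < min_err:  # update minErr and the matching (i, j)
                min_err, best = err, (i, j)
    return best, min_err
```

  • Restricting the two loops to rows and columns near the point's own pitch and azimuth angles would give the reduced search area mentioned in the text.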
  • FIG. 3 is a schematic diagram of a two-dimensional projection plane structure of a point cloud provided by an embodiment of the present invention, wherein each point in the original point cloud data is mapped to a corresponding pixel in the structure.
  • the geometric information map may be one or more of an occupancy information map, a depth information map, a projection residual information map, and a coordinate conversion error information map, or may be other geometric information maps.
  • This embodiment specifically takes the above four geometric information graphs as examples for detailed description.
  • the occupancy information map is used to identify whether each pixel in the two-dimensional regularized projection plane structure is occupied, that is, whether each pixel corresponds to a point in the point cloud. If a pixel is occupied, it is called non-empty; otherwise, it is called empty. For example, 0 and 1 can be used, where 1 indicates that the current pixel is occupied and 0 indicates that it is not, so the occupancy information map of the point cloud can be obtained from the two-dimensional projection plane structure of the point cloud.
  • the depth information map is used to represent the distance between the corresponding point of each occupied pixel and the coordinate origin in the 2D regularized projection plane structure.
  • the cylindrical coordinate r component of the point corresponding to the pixel can be used as the depth of the pixel. Assuming that the Cartesian coordinates of the point corresponding to the pixel are (x, y, z), the cylindrical coordinate r component of the point, that is, the depth of the pixel, follows from the definition of cylindrical coordinates as $r = \sqrt{x^2 + y^2}$. Based on this, each occupied pixel in the two-dimensional regularized projection plane structure has a depth value, which yields the corresponding depth information map.
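  • A minimal sketch of deriving the occupancy and depth information maps from a projection plane. It assumes the plane is stored as a list of rows in which each pixel holds either `None` (empty) or the Cartesian tuple (x, y, z) of its corresponding point; this storage layout and the function name are assumptions made for illustration.

```python
import math

def occupancy_and_depth_maps(plane):
    """Build the occupancy map (1 = occupied, 0 = empty) and the depth map
    (cylindrical r of the corresponding point, None for empty pixels)."""
    occupancy = [[0 if p is None else 1 for p in row] for row in plane]
    depth = [[None if p is None else math.hypot(p[0], p[1]) for p in row]
             for row in plane]
    return occupancy, depth
```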
  • the projection residual information map is used to represent the residual between the corresponding position of each occupied pixel in the two-dimensional regularized projection plane structure and the actual projection position, as shown in FIG. 4, which is a schematic diagram of a projection residual provided by an embodiment of the present invention.
  • the projection residuals of pixels can be calculated in the following manner. Assuming that the current pixel is (i, j) and the Cartesian coordinates of the corresponding point are (x, y, z), the actual projection position of the point can be expressed as (φ', i'), which can be calculated by the following formula:
  • each occupied pixel in the two-dimensional regularized projection plane will have a projection residual, thereby obtaining the projection residual information map corresponding to the point cloud.
  • the coordinate transformation error information map is used to represent the residual between the spatial position of each occupied pixel in the two-dimensional regularized projection plane structure and the spatial position of the corresponding original point of the pixel.
  • the coordinate conversion error of the pixel can be calculated in the following way. Assuming that the current pixel is (i, j) and the Cartesian coordinates of its corresponding point are (x, y, z), the regularization parameters and the following formula can be used to inversely convert the pixel back to the Cartesian coordinate system to obtain the corresponding Cartesian coordinates (xl, yl, zl):
  • the coordinate conversion error (Δx, Δy, Δz) of the current pixel can be calculated by the following formula:
  • each occupied pixel in the two-dimensional regularized projection plane structure will have a coordinate transformation error, thereby obtaining the coordinate transformation error information map corresponding to the point cloud.
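  • A hedged sketch of the coordinate conversion error. The inverse-conversion formula is not reproduced in this text, so the code assumes a simplified model without scanner offsets, in which a pixel with elevation θ and azimuth φ maps back to (r·cos φ, r·sin φ, r·tan θ), and it assumes the error is the original coordinate minus the back-converted one. Both the model and the sign convention are assumptions; the function name is hypothetical.

```python
import math

def coordinate_conversion_error(point, theta_deg, phi_deg):
    """Return (dx, dy, dz) between the original point and the position obtained
    by converting its pixel back to Cartesian coordinates (simplified model)."""
    x, y, z = point
    r = math.hypot(x, y)  # depth stored for the pixel
    theta, phi = math.radians(theta_deg), math.radians(phi_deg)
    xl = r * math.cos(phi)
    yl = r * math.sin(phi)
    zl = r * math.tan(theta)
    # Assumed sign convention: original minus back-converted coordinates.
    return (x - xl, y - yl, z - zl)
```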
  • By projecting the point cloud in three-dimensional space onto the corresponding two-dimensional regularized projection plane structure, the invention regularizes the point cloud in the vertical and horizontal directions and obtains a strongly correlated representation of the point cloud on the two-dimensional projection plane structure, which better reflects the spatial correlation of the point cloud and thereby improves the coding efficiency of the point cloud.
  • S4: Encode the several pieces of two-dimensional map information to obtain code stream information.
  • FIG. 5 is a frame diagram of a point cloud geometric information encoding provided by an embodiment of the present invention.
  • the occupancy information map, the depth information map, the projection residual information map, and the coordinate conversion error information map obtained in step S3 are encoded to obtain the occupancy information code stream, the depth information code stream, the projection residual information code stream, and the coordinate conversion error information code stream, respectively.
  • a certain scanning sequence is used to traverse the pixels in the occupancy information map, the depth information map, the projection residual information map, and the coordinate conversion error information map in sequence, for example, zigzag scanning is used.
  • for the current pixel in the occupancy information map, the reconstructed occupancy information of already encoded and decoded pixels can be used for prediction, and various existing neighbor prediction techniques can be used. After the corresponding prediction residual is obtained, an existing entropy coding technique is used to encode it, yielding the occupancy information code stream.
  • for the current pixel in the depth information map, the reconstructed occupancy information map and the reconstructed depth information of already encoded and decoded pixels can be used for prediction. Specifically, the occupancy information of neighbor pixels can be combined with existing neighbor prediction techniques, that is, only neighbor pixels with non-empty occupancy information are used to predict the depth information of the current pixel. The predicted value can be calculated by a weighted average or similar means. After the corresponding prediction residual is obtained, an existing entropy coding technique can be used to encode it, yielding the depth information code stream.
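  • The occupancy-guided depth prediction can be sketched as below. The neighbor set (left, upper-left, upper, upper-right, which are already coded in a row-by-row scan) and the equal weights are assumptions chosen for illustration, as is the fallback value of 0 when no occupied neighbor exists; the text only requires that empty neighbors be excluded and that something like a weighted average be used.

```python
def predict_depth(depth, occupancy, i, j):
    """Predict the depth of pixel (i, j) from already-coded neighbors,
    using only neighbors whose occupancy is non-empty (equal weights)."""
    rows, cols = len(depth), len(depth[0])
    neighbors = [(i, j - 1), (i - 1, j - 1), (i - 1, j), (i - 1, j + 1)]
    values = [depth[a][b] for a, b in neighbors
              if 0 <= a < rows and 0 <= b < cols and occupancy[a][b] == 1]
    # Assumed fallback: predict 0 when no occupied neighbor is available.
    return sum(values) / len(values) if values else 0.0
```

  • The prediction residual that would be passed to the entropy coder is then `depth[i][j] - predict_depth(depth, occupancy, i, j)`.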
  • for the current pixel in the projection residual information map, the reconstructed occupancy information map, the reconstructed depth information map, and the reconstructed projection residual information of already encoded and decoded pixels can be used for prediction, which can build on existing neighbor prediction techniques. After the corresponding prediction residual is obtained, an existing entropy coding technique can be used to encode it, yielding the projection residual information code stream.
  • for the current pixel in the coordinate conversion error information map, the reconstructed occupancy information map, depth information map, projection residual information map, and the reconstructed coordinate conversion error information of already encoded and decoded pixels can be used for prediction. Specifically, the occupancy information, depth information and projection residual information of neighbor pixels are combined, that is, only neighbor pixels whose occupancy information is non-empty and whose depth information and projection residual information are similar to those of the current pixel are used to predict the coordinate conversion error information of the current pixel. The predicted value can be calculated by a weighted average or similar means. After the corresponding prediction residual is obtained, an existing entropy coding technique can be used to encode it, yielding the coordinate conversion error information code stream.
  • prediction residuals can also be quantized and then encoded.
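  • As an illustration of quantizing a prediction residual before encoding, the sketch below uses a uniform scalar quantizer with a step size. The choice of quantizer and the names `quantize`, `dequantize` and `step` are assumptions; the text does not specify a particular scheme.

```python
def quantize(residual, step):
    """Map a prediction residual to an integer level (uniform quantizer)."""
    return int(round(residual / step))

def dequantize(level, step):
    """Recover an approximate residual from its integer level."""
    return level * step
```

  • With this quantizer the reconstruction error is at most half a step, which is the trade-off between code stream size and accuracy in lossy coding.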
  • because this embodiment encodes the several pieces of two-dimensional map information obtained by the two-dimensional regularized plane projection, such as the occupancy information map, the depth information map, the projection residual information map, and the coordinate conversion error information map, the strong correlation within the two-dimensional maps can be effectively used to predict and entropy encode the current pixel. The spatial correlation of the point cloud can therefore be exploited to a large extent and the spatial redundancy reduced, further improving the encoding efficiency of the point cloud.
  • the geometric information code stream is obtained according to the occupancy information code stream, the depth information code stream, the projection residual information code stream and the coordinate conversion error information code stream.
  • the geometric information code stream of the original point cloud data can be obtained.
  • the above-mentioned geometric information maps can also be compressed by image/video and other compression methods, including but not limited to JPEG, JPEG2000, HEIF, H.264/AVC, H.265/HEVC, etc.
  • the projection residual and the coordinate conversion error of the point cloud are not fixed and can be adjusted. For example, for lossless coding, if the two-dimensional projection accuracy is high, the projection residual is small, so the projection residual can be set to 0. Although this operation slightly increases the coordinate conversion error, the coding efficiency of the projection residual can be greatly improved, provided that the coding efficiency of the coordinate conversion error is allowed to decrease or remain unchanged. Alternatively, if the two-dimensional projection accuracy is low, the projection residual is large, and the projection residual can be adjusted appropriately so that the coordinate conversion error changes accordingly; the adjusted projection residual and coordinate conversion error are then encoded, which may also yield better encoding efficiency.
  • for lossy coding, if the two-dimensional projection accuracy is high, the projection residual is small, so the projection residual may not be encoded; or, if the coordinate conversion error is small, the coordinate conversion error may not be encoded, thereby improving the coding efficiency. Alternatively, the projection residual and the coordinate conversion error can be adjusted appropriately and then encoded, which may also yield better coding efficiency.
  • FIG. 6 is a framework diagram of encoding attribute information based on reconstructed geometric information according to an embodiment of the present invention.
  • geometric reconstruction is performed according to the geometric information code stream obtained in the first embodiment, and the reconstructed point cloud geometric information is obtained.
  • the attribute information of the original point cloud data is encoded based on the reconstructed point cloud geometric information, and the attribute information code stream is obtained.
  • attribute information encoding is generally performed for color and reflectivity information of spatial points.
  • the attribute information of the original point cloud data can be encoded based on the geometric reconstruction information of the point cloud using the prior art. For example, the color information in the attributes is first converted from the RGB color space to the YUV color space. Then, the point cloud is recolored using the reconstructed geometric information so that the unencoded attribute information corresponds to the reconstructed geometric information. After the point cloud is sorted by Morton code or Hilbert code, the reconstructed attribute values of already encoded points are interpolated to obtain the predicted attribute value of the point to be predicted, and the difference between the real attribute value and the predicted attribute value is taken to obtain the prediction residual. Finally, the prediction residual is quantized and encoded to generate a binary code stream.
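  • The Morton-code ordering mentioned above can be sketched as follows. A Morton code interleaves the bits of the integer x, y and z coordinates so that sorting by the code visits spatially close points near each other. The bit order used here (x in the most significant position of each triple) and the 10-bit default are assumptions; codecs may use a different convention.

```python
def morton_code(x, y, z, bits=10):
    """Interleave the low `bits` bits of non-negative integers x, y, z."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (3 * b + 2)
        code |= ((y >> b) & 1) << (3 * b + 1)
        code |= ((z >> b) & 1) << (3 * b)
    return code

def sort_by_morton(points):
    """Return voxelized points ordered by their Morton codes."""
    return sorted(points, key=lambda p: morton_code(*p))
```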
  • the attribute information map can also be obtained simultaneously through the two-dimensional projected plane structure. Then encode the geometric information graph and the attribute information graph at the same time to obtain the geometric information code stream and the attribute information code stream.
  • FIG. 7 is a framework diagram of simultaneous encoding of point cloud geometric information and attribute information provided by an embodiment of the present invention.
  • the encoding process of the attribute information graph is as follows:
  • for the current pixel in the attribute information map, the reconstructed occupancy information map and the reconstructed attribute information of already encoded and decoded pixels can be used for prediction. Specifically, the occupancy information of neighbor pixels can be combined with existing neighbor prediction techniques, that is, only neighbor pixels with non-empty occupancy information are used to predict the attribute information of the current pixel. The predicted value can be calculated by a weighted average or similar means. After the corresponding prediction residual is obtained, an existing entropy coding technique can be used to encode it, yielding the attribute information code stream.
  • the geometric information map and the attribute information map are simultaneously obtained through the two-dimensional projection plane structure, and then the geometric information and the attribute information are simultaneously encoded, thereby improving the encoding efficiency.
  • FIG. 8 is a schematic structural diagram of a point cloud encoding device based on two-dimensional regularized plane projection provided by an embodiment of the present invention, which includes:
  • the first data acquisition module 11 is used to acquire original point cloud data
  • the projection module 12 is used to perform a two-dimensional regularized plane projection on the original point cloud data to obtain a two-dimensional projected plane structure
  • the data processing module 13 is used to obtain several two-dimensional map information according to the two-dimensional projection plane structure
  • the encoding module 14 is used for encoding several two-dimensional image information to obtain code stream information.
  • the encoding apparatus provided in this embodiment can implement the encoding methods described in the first to third embodiments above, and the detailed process is not repeated here.
  • FIG. 9 is a schematic diagram of a point cloud decoding method based on two-dimensional regularized plane projection provided by an embodiment of the present invention.
  • the method includes:
  • Step 1: Obtain the code stream information and decode it to obtain parsed data.
  • The decoding end obtains the compressed code stream information and decodes it with the corresponding existing entropy decoding technique to obtain the parsed data.
  • Step 2: Reconstruct several pieces of two-dimensional map information from the parsed data.
  • In this embodiment, the several pieces of two-dimensional map information include a geometric information map, and the geometric information map includes an occupancy information map, a depth information map, a projection residual information map, and a coordinate conversion error information map.
  • Accordingly, the parsed data mainly includes the prediction residuals of the occupancy information, the prediction residuals of the depth information, the prediction residuals of the projection residual information, and the prediction residuals of the coordinate conversion error information.
  • Because the encoding end traverses the pixels in the occupancy information map, the depth information map, the projection residual information map and the coordinate conversion error information map in turn in a certain scanning order and encodes the corresponding information, the pixel prediction residual information obtained by the decoding end follows the same order. The decoding end can also obtain the resolutions of these two-dimensional maps from the regularization parameters; for details, see the part of S2 in the first embodiment on initializing the two-dimensional projection plane structure. The decoding end therefore knows the position in the two-dimensional map of the pixel currently to be reconstructed.
  • For the current pixel to be reconstructed in the occupancy information map, the reconstructed occupancy information of the already encoded and decoded pixels can be used for prediction. The prediction method is the same as at the encoding end, and the occupancy information of the current pixel is then reconstructed from the obtained prediction value and the parsed prediction residual.
  • For the current pixel to be reconstructed in the depth information map, the reconstructed occupancy information map and the reconstructed depth information of the already encoded and decoded pixels can be used for prediction. The prediction method is the same as at the encoding end, that is, only neighbor pixels with non-empty occupancy information are used to predict the depth information of the current pixel, and the depth information of the current pixel is then reconstructed from the obtained prediction value and the parsed prediction residual.
  • For the current pixel to be reconstructed in the projection residual information map, the reconstructed occupancy information map, the reconstructed depth information map and the reconstructed projection residual information of the already encoded and decoded pixels can be used for prediction. The prediction method is the same as at the encoding end, that is, only neighbor pixels whose occupancy information is non-empty and whose depth information is close to that of the current pixel are used to predict the projection residual information of the current pixel, and the projection residual information of the current pixel is then reconstructed from the obtained prediction value and the parsed prediction residual.
  • For the current pixel to be reconstructed in the coordinate conversion error information map, the reconstructed occupancy information map, depth information map, projection residual information map and the reconstructed coordinate conversion error information of the already encoded and decoded pixels can be used for prediction. The prediction method is the same as at the encoding end, that is, only neighbor pixels whose occupancy information is non-empty and whose depth information and projection residual information are close to those of the current pixel are used to predict the coordinate conversion error information of the current pixel, and the coordinate conversion error information of the current pixel is then reconstructed from the obtained prediction value and the parsed prediction residual.
  • After every pixel in the occupancy information map, the depth information map, the projection residual information map and the coordinate conversion error information map has been reconstructed, the reconstructed occupancy information map, depth information map, projection residual information map and coordinate conversion error information map are obtained.
  • In addition, the parsed data may also include prediction residuals of the attribute information, from which the attribute information map can likewise be reconstructed.
  • Step 3: Obtain a two-dimensional projection plane structure from the several pieces of two-dimensional map information.
  • Because the resolution of the two-dimensional projection plane structure is the same as that of the occupancy information map, the depth information map, the projection residual information map and the coordinate conversion error information map, and all of these maps have been reconstructed, the occupancy information, depth information, projection residual information and coordinate conversion error information of each pixel in the two-dimensional projection plane structure are known, which gives the reconstructed two-dimensional projection plane structure.
  • Correspondingly, the reconstructed two-dimensional projection plane structure may also include the attribute information of the point cloud.
  • Step 4: Reconstruct the point cloud using the two-dimensional projection plane structure.
  • Specifically, the pixels in the reconstructed two-dimensional projection plane structure are traversed in a certain scanning order, so that the occupancy information, depth information, projection residual information and coordinate conversion error information of each pixel are known. If the occupancy information of the current pixel (i, j) is non-empty, the spatial point (x, y, z) corresponding to the pixel can be reconstructed as follows from three quantities: the depth information of the current pixel, that is, the cylindrical coordinate r component of the point corresponding to the pixel; the projection residual information, that is, the residual (Δφ, Δi) between the corresponding position of the pixel and the actual projection position; and the coordinate conversion error information, that is, the residual (Δx, Δy, Δz) between the spatial position obtained by back-projecting the pixel and the spatial position of the original point corresponding to the pixel.
  • The corresponding position of the current pixel (i, j) can be expressed as (φ_j, i), and the actual projection position (φ', i') of the spatial point corresponding to the current pixel is then: φ' = φ_j + Δφ, i' = i + Δi.
  • The current pixel is back-projected to the Cartesian coordinate system using the regularization parameters, giving the corresponding Cartesian coordinates (xl, yl, zl): θ_i = θ_0; xl = r·sin(φ_j - α) - H_o·cos(φ_j - α); yl = r·cos(φ_j - α) + H_o·sin(φ_j - α); zl = r·tan θ_i + V_o.
  • The spatial point (x, y, z) corresponding to the current pixel is then reconstructed from the back-projected spatial position (xl, yl, zl) and the coordinate conversion error (Δx, Δy, Δz): x = xl + Δx, y = yl + Δy, z = zl + Δz.
  • In this way, the spatial point corresponding to each non-empty pixel in the two-dimensional projection structure can be reconstructed, which gives the reconstructed point cloud.
  • It should be noted that when the point cloud is reconstructed at the decoding end, the reconstruction method can be chosen to match the way the geometric information and attribute information of the point cloud were encoded at the encoding end, so as to obtain the corresponding reconstructed point cloud.
  • FIG. 10 is a schematic structural diagram of a point cloud decoding device based on two-dimensional regularized plane projection provided by an embodiment of the present invention, which includes:
  • the second data acquisition module 21, used to acquire the code stream information and decode it to obtain parsed data;
  • the first reconstruction module 22, used to reconstruct several pieces of two-dimensional map information from the parsed data;
  • the second reconstruction module 23, used to obtain a two-dimensional projection plane structure from the several pieces of two-dimensional map information;
  • the point cloud reconstruction module 24, used to reconstruct the point cloud using the two-dimensional projection plane structure.
  • The decoding apparatus provided in this embodiment can implement the decoding method described in the fifth embodiment; the detailed process is not repeated here.

Abstract

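The present invention discloses a point cloud encoding and decoding method and device based on two-dimensional regularized plane projection. The encoding method includes: obtaining original point cloud data; performing two-dimensional regularized plane projection on the original point cloud data to obtain a two-dimensional projection plane structure; obtaining several pieces of two-dimensional map information from the two-dimensional projection plane structure; and encoding the several pieces of two-dimensional map information to obtain code stream information. By projecting the point cloud in three-dimensional space onto the corresponding two-dimensional regularized projection plane structure, the present invention applies a regularizing correction to the point cloud in the vertical and horizontal directions and obtains a strongly correlated representation of the point cloud on the two-dimensional projection plane structure, which better reflects the spatial correlation of the point cloud. As a result, when the several pieces of two-dimensional map information are subsequently encoded, the spatial correlation of the point cloud can be exploited to a large extent and spatial redundancy reduced, further improving the encoding efficiency of the point cloud.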

Description

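Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection
This application claims priority to Chinese patent application No. 202110172795.4, filed with the China Patent Office on February 8, 2021 and entitled "Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention belongs to the technical field of encoding and decoding, and specifically relates to a point cloud encoding and decoding method and device based on two-dimensional regularized plane projection.
Background
With the improvement of hardware processing capability and the rapid development of computer vision, three-dimensional point clouds have become a new generation of immersive multimedia after audio, images, and video, and are widely used in virtual reality, augmented reality, autonomous driving, environment modeling, and so on. However, three-dimensional point clouds usually involve a large amount of data, which makes their transmission and storage difficult, so research on efficient point cloud encoding and decoding techniques is of great significance.
In the existing Geometry-based Point Cloud Compression (G-PCC) framework, the geometric information and the attribute information of a point cloud are encoded separately. At present, G-PCC geometry coding can be divided into octree-based geometry coding and prediction-tree-based geometry coding.
Octree-based geometry coding: at the encoding end, the geometric information of the point cloud is first preprocessed, which includes coordinate conversion and voxelization of the point cloud. Then, the bounding box containing the point cloud is repeatedly partitioned into a tree (octree/quadtree/binary tree) in breadth-first order. Finally, the occupancy code of each node is encoded, together with the number of points contained in each leaf node, to generate a binary code stream. At the decoding end, the occupancy code of each node is first parsed in breadth-first order. The tree partitioning is then carried out step by step until 1x1x1 unit cubes are obtained. Finally, the number of points contained in each leaf node is parsed, which yields the reconstructed point cloud geometric information.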
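As a rough illustration of the occupancy code mentioned above, the following Python sketch computes an 8-bit occupancy code for a single octree node. The function name, the child-index bit order (x as the most significant bit) and the integer cube layout are assumptions made for this example and are not taken from G-PCC.

```python
def occupancy_code(points, origin, size):
    """Return an 8-bit code with bit k set if child octant k holds a point.

    Assumed layout: the node is a cube of edge `size` at `origin`, and the
    child index is (x_high << 2) | (y_high << 1) | z_high.
    """
    half = size // 2
    code = 0
    for x, y, z in points:
        child = (
            (int(x - origin[0] >= half) << 2)
            | (int(y - origin[1] >= half) << 1)
            | int(z - origin[2] >= half)
        )
        code |= 1 << child
    return code


# Two points in opposite corners of a 4x4x4 node occupy children 0 and 7.
code = occupancy_code([(0, 0, 0), (3, 3, 3)], origin=(0, 0, 0), size=4)
```

With sparse point clouds most of the eight bits are zero at deep tree levels, which is the empty-node problem the background goes on to describe.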
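Prediction-tree-based geometry coding: at the encoding end, the original point cloud is first sorted. A prediction tree structure is then built by assigning each point to the laser scanner it belongs to and constructing the prediction tree according to the different laser scanners. Next, each node in the prediction tree is traversed; the geometric information of the node is predicted by selecting among different prediction modes to obtain a prediction residual, and the prediction residual is quantized using a quantization parameter. Finally, the prediction tree structure, the quantization parameters, the prediction residuals of the node geometric information, and so on are encoded to generate a binary code stream. At the decoding end, the code stream is first parsed and the prediction tree structure is reconstructed; the parsed geometric prediction residual of each node is then dequantized using the quantization parameter, and the reconstructed geometric information of each node is finally recovered, which completes the reconstruction of the point cloud geometric information.
However, because point clouds are spatially very sparse, an octree structure leads to a high proportion of empty nodes after partitioning and cannot fully reflect the spatial correlation of the point cloud, which is unfavorable for prediction and entropy coding of the point cloud. Prediction-tree-based point cloud coding techniques use some parameters of the lidar device to build the tree structure and then perform predictive coding on that basis; however, this tree structure does not fully reflect the spatial correlation of the point cloud either, which is likewise unfavorable for prediction and entropy coding. Both of the above point cloud coding techniques therefore suffer from insufficient coding efficiency.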
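Summary of the Invention
To solve the above problems in the prior art, the present invention provides a point cloud encoding and decoding method and device based on two-dimensional regularized plane projection. The technical problem to be solved by the present invention is addressed by the following technical solutions:
A point cloud encoding method based on two-dimensional regularized plane projection, including:
obtaining original point cloud data;
performing two-dimensional regularized plane projection on the original point cloud data to obtain a two-dimensional projection plane structure;
obtaining several pieces of two-dimensional map information from the two-dimensional projection plane structure;
encoding the several pieces of two-dimensional map information to obtain code stream information.
In an embodiment of the present invention, performing two-dimensional regularized plane projection on the original point cloud data to obtain a two-dimensional projection plane structure includes:
initializing the two-dimensional projection plane structure;
determining the mapping relationship between the original point cloud data and the two-dimensional projection plane structure, so as to project the original point cloud data onto the two-dimensional projection plane structure.
In an embodiment of the present invention, the several pieces of two-dimensional map information include a geometric information map, and the geometric information map includes an occupancy information map, a depth information map, a projection residual information map, and a coordinate conversion error information map.
In an embodiment of the present invention, encoding the several pieces of two-dimensional map information to obtain code stream information includes:
encoding the occupancy information map, the depth information map, the projection residual information map and the coordinate conversion error information map to obtain an occupancy information code stream, a depth information code stream, a projection residual information code stream and a coordinate conversion error information code stream, respectively;
obtaining a geometric information code stream from the occupancy information code stream, the depth information code stream, the projection residual information code stream and the coordinate conversion error information code stream.
In an embodiment of the present invention, after the geometric information code stream is obtained, the method further includes:
performing geometric reconstruction according to the geometric information code stream to obtain reconstructed point cloud geometric information;
encoding the attribute information of the original point cloud data based on the reconstructed point cloud geometric information to obtain an attribute information code stream.
In an embodiment of the present invention, the several pieces of two-dimensional map information further include an attribute information map.
In an embodiment of the present invention, encoding the several pieces of two-dimensional map information to obtain code stream information further includes:
encoding the attribute information map to obtain an attribute information code stream.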
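Another embodiment of the present invention further provides a point cloud encoding device based on two-dimensional regularized plane projection, including:
a first data acquisition module, configured to obtain original point cloud data;
a projection module, configured to perform two-dimensional regularized plane projection on the original point cloud data to obtain a two-dimensional projection plane structure;
a data processing module, configured to obtain several pieces of two-dimensional map information from the two-dimensional projection plane structure;
an encoding module, configured to encode the several pieces of two-dimensional map information to obtain code stream information.
Yet another embodiment of the present invention further provides a point cloud decoding method based on two-dimensional regularized plane projection, including:
obtaining code stream information and decoding it to obtain parsed data;
reconstructing several pieces of two-dimensional map information from the parsed data;
obtaining a two-dimensional projection plane structure from the several pieces of two-dimensional map information;
reconstructing the point cloud using the two-dimensional projection plane structure.
Still another embodiment of the present invention further provides a point cloud decoding device based on two-dimensional regularized plane projection, including:
a second data acquisition module, configured to obtain code stream information and decode it to obtain parsed data;
a first reconstruction module, configured to reconstruct several pieces of two-dimensional map information from the parsed data;
a second reconstruction module, configured to obtain a two-dimensional projection plane structure from the several pieces of two-dimensional map information;
a point cloud reconstruction module, configured to reconstruct the point cloud using the two-dimensional projection plane structure.
Beneficial effects of the present invention:
By projecting the point cloud in three-dimensional space onto the corresponding two-dimensional regularized projection plane structure, the present invention applies a regularizing correction to the point cloud in the vertical and horizontal directions and obtains a strongly correlated representation of the point cloud on the two-dimensional projection plane structure. This avoids the sparsity present in three-dimensional representation structures and better reflects the spatial correlation of the point cloud. As a result, when the several pieces of two-dimensional map information obtained from the two-dimensional regularized projection plane structure are subsequently encoded, the spatial correlation of the point cloud can be exploited to a large extent and spatial redundancy reduced, further improving the encoding efficiency of the point cloud.
The present invention is described in further detail below with reference to the accompanying drawings and embodiments.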
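Brief Description of the Drawings
FIG. 1 is a schematic diagram of a point cloud encoding method based on two-dimensional regularized plane projection provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of the correspondence between the cylindrical coordinates of a point and a pixel in the two-dimensional projection plane provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the two-dimensional projection plane structure of a point cloud provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of the projection residual provided by an embodiment of the present invention;
FIG. 5 is a framework diagram of point cloud geometric information encoding provided by an embodiment of the present invention;
FIG. 6 is a framework diagram of attribute information encoding based on reconstructed geometric information provided by an embodiment of the present invention;
FIG. 7 is a framework diagram of simultaneous encoding of point cloud geometric information and attribute information provided by an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a point cloud encoding device based on two-dimensional regularized plane projection provided by an embodiment of the present invention;
FIG. 9 is a schematic diagram of a point cloud decoding method based on two-dimensional regularized plane projection provided by an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a point cloud decoding device based on two-dimensional regularized plane projection provided by an embodiment of the present invention.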
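Detailed Description
The present invention is described in further detail below with reference to specific embodiments, but the implementation of the present invention is not limited thereto.
Embodiment 1
Referring to FIG. 1, FIG. 1 is a schematic diagram of a point cloud encoding method based on two-dimensional regularized plane projection provided by an embodiment of the present invention, which includes the following steps:
S1: Obtain original point cloud data.
Specifically, original point cloud data usually consists of a set of three-dimensional spatial points, each of which records its own geometric position information as well as additional attribute information such as color, reflectance, and normal. The geometric position information of a point cloud is generally expressed in a Cartesian coordinate system, that is, by the x, y, z coordinates of each point. The original point cloud data can be acquired by a 3D scanning device such as a lidar, or obtained from public datasets provided by various platforms. In this embodiment, the geometric position information of the acquired original point cloud data is assumed to be expressed in a Cartesian coordinate system. It should be noted that the representation of the geometric position information of the original point cloud data is not limited to Cartesian coordinates.
S2: Perform two-dimensional regularized plane projection on the original point cloud data to obtain a two-dimensional projection plane structure.
Specifically, in this embodiment, before the two-dimensional regularized plane projection is performed, the original point cloud data may also be preprocessed, for example by voxelization, to facilitate subsequent encoding.
First, the two-dimensional projection plane structure is initialized.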
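Initializing the two-dimensional regularized projection plane structure of the point cloud requires regularization parameters. These parameters are usually measured precisely by the manufacturer and provided to the consumer as part of the required data, for example the acquisition range of the lidar, the sampling angular resolution of the horizontal azimuth angle [symbol image PCTCN2022075397-appb-000001] or the number of sampling points, as well as the distance correction factor of each laser scanner, the offsets V_o and H_o of the laser scanner along the vertical and horizontal directions, and the offsets θ_0 and α of the laser scanner along the pitch angle and the horizontal azimuth angle.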
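It should be noted that the regularization parameters are not limited to those given above. They may be taken from the given calibration parameters of the lidar or, where the calibration parameters of the lidar are not given, obtained by optimization-based estimation, data fitting, or similar means.
The two-dimensional regularized projection plane structure of the point cloud is a data structure containing M rows and N columns of pixels; after projection, the points of the three-dimensional point cloud correspond to pixels in this data structure. A pixel (i, j) in the data structure can be associated with the cylindrical coordinate components (θ, φ); for example, the following formulas can be used to find the pixel (i, j) corresponding to the cylindrical coordinates (r, θ, φ).
[Equation images PCTCN2022075397-appb-000002 and PCTCN2022075397-appb-000003, giving i and j in terms of θ and φ, are not reproduced here.]
Specifically, refer to FIG. 2, which is a schematic diagram of the correspondence between the cylindrical coordinates of a point and a pixel in the two-dimensional projection plane provided by an embodiment of the present invention.
It should be noted that the pixel correspondence here is not limited to cylindrical coordinates.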
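Further, the resolution of the two-dimensional regularized projection plane can be obtained from the regularization parameters. Assuming the resolution is M×N, the number of laser scanners in the regularization parameters can be used to initialize M, and the sampling angular resolution of the horizontal azimuth angle [symbol image PCTCN2022075397-appb-000004] (or the number of sampling points of the laser scanner) can be used to initialize N, for example with the formulas below. This completes the initialization of the two-dimensional projection plane structure and yields a plane structure containing M×N pixels.
M = laserNum;
[Equation image PCTCN2022075397-appb-000005, giving N in terms of the sampling angular resolution, is not reproduced here.]
or N = pointNumPerLaser.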
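A minimal Python sketch of this initialization follows. Because the formula for N is an image that is not available in this text, the sketch assumes N = 360° divided by the azimuth sampling resolution; that rule, the function name and the use of `None` for empty pixels are assumptions for illustration only.

```python
def init_projection_plane(laser_num, azimuth_resolution_deg=None, points_per_laser=None):
    """Return (M, N, grid), where grid is an M x N list of empty pixels."""
    m = laser_num  # M = laserNum, as stated in the text
    if points_per_laser is not None:
        n = points_per_laser  # N = pointNumPerLaser
    elif azimuth_resolution_deg is not None:
        # Assumed rule: one column per azimuth step over a full 360-degree sweep.
        n = round(360.0 / azimuth_resolution_deg)
    else:
        raise ValueError("need azimuth_resolution_deg or points_per_laser")
    grid = [[None] * n for _ in range(m)]
    return m, n, grid


# Example: 64 laser scanners with a 0.2-degree azimuth step.
m, n, grid = init_projection_plane(64, azimuth_resolution_deg=0.2)
```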
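Second, the mapping relationship between the original point cloud data and the two-dimensional projection plane structure is determined, so as to project the original point cloud data onto the two-dimensional projection plane structure.
This part determines, point by point, the position of the original point cloud in the two-dimensional projection plane structure, mapping the point cloud, originally distributed irregularly in the Cartesian coordinate system, onto the uniformly distributed two-dimensional regularized projection plane structure. Specifically, for each point in the original point cloud, the corresponding pixel is determined in the two-dimensional projection plane structure; for example, the pixel with the smallest spatial distance to the projection position of the point in the two-dimensional plane can be selected as the pixel corresponding to that point.
If the cylindrical coordinate system is used for the two-dimensional projection, the procedure for determining the pixel corresponding to a point of the original point cloud is as follows:
a. Determine the cylindrical coordinate component r of the current point in the original point cloud data, using the following formula:
r = √(x² + y²)
b. Determine the search region of the current point in the two-dimensional projection plane structure. The entire two-dimensional projection plane structure can simply be used as the search region. To reduce the amount of computation, the search region of the corresponding pixel can instead be narrowed using the pitch angle θ and azimuth angle φ cylindrical coordinate components of the current point.
c. Once the search region is determined, for each pixel (i, j) in it, use the regularization parameters, namely the calibration parameters θ_0, V_o, H_o and α of the i-th laser scanner of the lidar, to compute the position (xl, yl, zl) of the current pixel in the Cartesian coordinate system, as follows:
θ_i = θ_0
[Equation image PCTCN2022075397-appb-000007, defining the azimuth angle φ_j of column j, is not reproduced here.]
xl = r·sin(φ_j - α) - H_o·cos(φ_j - α)
yl = r·cos(φ_j - α) + H_o·sin(φ_j - α)
zl = r·tan θ_i + V_o
d. After the position (xl, yl, zl) of the current pixel in the Cartesian coordinate system is obtained, compute the spatial distance between it and the current point (x, y, z) and take it as the error Err, that is:
Err = dist{(x, y, z), (xl, yl, zl)}
If Err is smaller than the current minimum error minErr, minErr is updated to Err and the i and j of the pixel corresponding to the current point are updated to those of the current pixel; if Err is greater than minErr, no update is performed.
e. Once all pixels in the search region have been traversed, the pixel (i, j) corresponding to the current point in the two-dimensional projection plane structure is determined.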
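Steps a to e can be sketched in Python as below. This is an illustration under stated assumptions, not the reference procedure: the search region is the whole plane, r is taken as √(x² + y²), each laser is a hypothetical tuple `(theta0, v_o, h_o, alpha)` in radians and metres, and the column azimuth is assumed to be φ_j = -180° + j × resolution because the original formula for φ_j is not available in this text.

```python
import math


def back_project(r, i, j, lasers, azimuth_resolution_deg):
    """Cartesian position (xl, yl, zl) of pixel (i, j) at cylindrical radius r."""
    theta0, v_o, h_o, alpha = lasers[i]
    # Assumed column-to-azimuth rule (the source formula is an image).
    phi_j = math.radians(-180.0 + j * azimuth_resolution_deg)
    xl = r * math.sin(phi_j - alpha) - h_o * math.cos(phi_j - alpha)
    yl = r * math.cos(phi_j - alpha) + h_o * math.sin(phi_j - alpha)
    zl = r * math.tan(theta0) + v_o
    return xl, yl, zl


def find_pixel(point, lasers, n_cols, azimuth_resolution_deg):
    """Exhaustively search all pixels; return ((i, j), min_err)."""
    x, y, _ = point
    r = math.hypot(x, y)                      # step a
    best, min_err = None, math.inf
    for i in range(len(lasers)):              # step b: whole plane as search region
        for j in range(n_cols):
            candidate = back_project(r, i, j, lasers, azimuth_resolution_deg)  # step c
            err = math.dist(point, candidate)  # step d
            if err < min_err:
                min_err, best = err, (i, j)
    return best, min_err                      # step e
```

The exhaustive loop costs M×N distance evaluations per point, which is why the text suggests narrowing the search region with θ and φ first.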
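Once all points in the original point cloud have gone through the above operations, the two-dimensional regularized plane projection of the point cloud is complete. Specifically, refer to FIG. 3, which is a schematic diagram of the two-dimensional projection plane structure of a point cloud provided by an embodiment of the present invention, in which every point of the original point cloud data is mapped to its corresponding pixel in the structure.
It should be noted that during the two-dimensional regularized plane projection, several points of the point cloud may correspond to the same pixel in the two-dimensional projection plane structure. To avoid this, these spatial points can be projected into different pixels; for example, when projecting a point whose corresponding pixel already has a corresponding point, the point is projected into an empty pixel adjacent to that pixel. In addition, if several points of the point cloud have been projected into the same pixel, then when encoding based on the two-dimensional projection plane structure, the number of points in each pixel should additionally be encoded, and the information of each corresponding point in the pixel should be encoded according to that number.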
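S3: Obtain several pieces of two-dimensional map information from the two-dimensional projection plane structure.
In this embodiment, the several pieces of two-dimensional map information may include a geometric information map. The geometric information map may be one or more of an occupancy information map, a depth information map, a projection residual information map, and a coordinate conversion error information map, or may be another geometric information map.
This embodiment is described in detail using the above four geometric information maps as an example.
a. Occupancy information map
The occupancy information map identifies whether each pixel in the two-dimensional regularized projection plane structure is occupied, that is, whether each pixel corresponds to a point in the point cloud. An occupied pixel is called non-empty; otherwise the pixel is called empty. For example, 0 and 1 can be used, where 1 means the current pixel is occupied and 0 means it is not; the occupancy information map can thus be obtained from the two-dimensional projection plane structure of the point cloud.
b. Depth information map
The depth information map represents the distance between the point corresponding to each occupied pixel in the two-dimensional regularized projection plane structure and the coordinate origin. For example, the cylindrical coordinate r component of the point corresponding to a pixel can be used as the depth of that pixel. Assuming the Cartesian coordinates of the point corresponding to the pixel are (x, y, z), the formula
r = √(x² + y²)
gives the cylindrical coordinate r component of the point, that is, the depth of the pixel. On this basis, each occupied pixel in the two-dimensional regularized projection plane structure has a depth value, which yields the corresponding depth information map.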
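The two maps described so far can be derived from a projected structure roughly as follows. The dictionary `points_by_pixel`, which maps a pixel (i, j) to its single projected point, is a hypothetical representation chosen for this sketch, and the depth is assumed to be r = √(x² + y²).

```python
import math


def build_occupancy_and_depth(points_by_pixel, m, n):
    """Return (occupancy, depth) maps of size m x n.

    occupancy[i][j] is 1 for a non-empty pixel and 0 otherwise;
    depth[i][j] is the cylindrical r of the pixel's point, or None if empty.
    """
    occupancy = [[0] * n for _ in range(m)]
    depth = [[None] * n for _ in range(m)]
    for (i, j), (x, y, _z) in points_by_pixel.items():
        occupancy[i][j] = 1
        depth[i][j] = math.hypot(x, y)
    return occupancy, depth


occupancy, depth = build_occupancy_and_depth({(0, 1): (3.0, 4.0, 1.0)}, m=2, n=3)
```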
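c. Projection residual information map
The projection residual information map represents the residual between the corresponding position and the actual projection position of each occupied pixel in the two-dimensional regularized projection plane structure, as shown in FIG. 4, which is a schematic diagram of the projection residual provided by an embodiment of the present invention.
Specifically, the projection residual of a pixel can be calculated as follows. Assuming the current pixel is (i, j) and the Cartesian coordinates of its corresponding point are (x, y, z), the actual projection position of the point can be expressed as (φ', i'), which is calculated by the following formulas:
[Equation images PCTCN2022075397-appb-000009 and PCTCN2022075397-appb-000010, giving φ' and i', are not reproduced here.]
The corresponding position of the current pixel can be expressed as (φ_j, i), which is calculated by the following formula:
[Equation image PCTCN2022075397-appb-000011, giving φ_j, is not reproduced here.]
The projection residual (Δφ, Δi) of the current pixel is then calculated by:
Δφ = φ' - φ_j
Δi = i' - i
Based on the above calculation, each occupied pixel in the two-dimensional regularized projection plane has a projection residual, which yields the projection residual information map of the point cloud.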
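d. Coordinate conversion error information map
The coordinate conversion error information map represents the residual between the spatial position obtained by back-projecting each occupied pixel in the two-dimensional regularized projection plane structure and the spatial position of the original point corresponding to that pixel.
For example, the coordinate conversion error of a pixel can be calculated as follows. Assuming the current pixel is (i, j) and the Cartesian coordinates of its corresponding point are (x, y, z), the pixel can be converted back to the Cartesian coordinate system using the regularization parameters and the following formulas, giving the corresponding Cartesian coordinates (xl, yl, zl):
θ_i = θ_0
[Equation images PCTCN2022075397-appb-000012 and PCTCN2022075397-appb-000013 are not reproduced here.]
xl = r·sin(φ_j - α) - H_o·cos(φ_j - α)
yl = r·cos(φ_j - α) + H_o·sin(φ_j - α)
zl = r·tan θ_i + V_o
The coordinate conversion error (Δx, Δy, Δz) of the current pixel is then calculated by:
Δx = x - xl
Δy = y - yl
Δz = z - zl
Based on the above calculation, each occupied pixel in the two-dimensional regularized projection plane structure has a coordinate conversion error, which yields the coordinate conversion error information map of the point cloud.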
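As a small worked example of the last three formulas, the sketch below takes an original point and the back-projected position of its pixel and returns (Δx, Δy, Δz). The function name and the sample numbers are made up for illustration.

```python
def coordinate_conversion_error(point, back_projected):
    """Return (dx, dy, dz): original point minus back-projected pixel position."""
    return tuple(p - q for p, q in zip(point, back_projected))


# Original point (x, y, z) and the back-projected position (xl, yl, zl) of its pixel.
error = coordinate_conversion_error((1.0, 5.0, 2.0), (0.5, 5.5, 2.0))
```

Adding the error back to (xl, yl, zl) recovers the original point exactly, which is what makes the representation lossless when the error is coded without quantization.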
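By projecting the point cloud in three-dimensional space onto the corresponding two-dimensional regularized projection plane structure, the present invention applies a regularizing correction to the point cloud in the vertical and horizontal directions and obtains a strongly correlated representation of the point cloud on the two-dimensional projection plane structure, which better reflects the spatial correlation of the point cloud and thereby improves its encoding efficiency.
S4: Encode the several pieces of two-dimensional map information to obtain code stream information.
Referring to FIG. 5, FIG. 5 is a framework diagram of point cloud geometric information encoding provided by an embodiment of the present invention.
In this embodiment, the occupancy information map, the depth information map, the projection residual information map and the coordinate conversion error information map obtained in step S3 are encoded to obtain an occupancy information code stream, a depth information code stream, a projection residual information code stream and a coordinate conversion error information code stream, respectively.
Specifically, this embodiment traverses the pixels of the occupancy information map, the depth information map, the projection residual information map and the coordinate conversion error information map, each in turn, in a certain scanning order, for example Z-scan.
For the current pixel in the occupancy information map, the reconstructed occupancy information of already encoded and decoded pixels can be used for prediction, for example with any of the existing neighbor prediction techniques; after the corresponding prediction residual is obtained, it can be encoded with existing entropy coding techniques to obtain the occupancy information code stream.
For the current pixel in the depth information map, the reconstructed occupancy information map and the reconstructed depth information of already encoded and decoded pixels can be used for prediction. Specifically, the occupancy information of neighbor pixels can be combined with existing neighbor prediction techniques, that is, only neighbor pixels with non-empty occupancy information are used to predict the depth information of the current pixel. The predicted value can be calculated by a weighted average or a similar method, and after the corresponding prediction residual is obtained, it can be encoded with existing entropy coding techniques to obtain the depth information code stream.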
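One way to picture the occupancy-aware depth prediction is the sketch below. The text does not fix the neighborhood, the weights or the scan order, so the raster scan, the four causal neighbors (left, up, up-left, up-right) and the 2:1 weights used here are all assumptions for illustration.

```python
def predict_depth(depth, occupancy, i, j):
    """Weighted average of already-coded, non-empty neighbours; 0.0 if none."""
    # (row, column, weight); direct neighbours are weighted more than diagonals.
    candidates = [
        (i, j - 1, 2.0),      # left
        (i - 1, j, 2.0),      # up
        (i - 1, j - 1, 1.0),  # up-left
        (i - 1, j + 1, 1.0),  # up-right
    ]
    num = den = 0.0
    for ni, nj, w in candidates:
        inside = 0 <= ni < len(depth) and 0 <= nj < len(depth[0])
        if inside and occupancy[ni][nj]:  # skip empty pixels
            num += w * depth[ni][nj]
            den += w
    return num / den if den else 0.0


occupancy = [[1, 0, 1], [1, 1, 0]]
depth = [[10.0, None, 14.0], [12.0, 13.0, None]]
prediction = predict_depth(depth, occupancy, 1, 1)
residual = depth[1][1] - prediction  # this residual is what gets entropy coded
```

The same pattern extends to the projection residual and coordinate conversion error maps by adding a similarity test on depth (and projection residual) to the neighbor filter.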
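For the current pixel in the projection residual information map, the reconstructed occupancy information map, the reconstructed depth information map and the reconstructed projection residual information of already encoded and decoded pixels can be used for prediction. Specifically, the occupancy information and depth information of neighbor pixels can be combined with existing neighbor prediction techniques, that is, only neighbor pixels whose occupancy information is non-empty and whose depth information is close to that of the current pixel are used to predict the projection residual information of the current pixel. The predicted value can be calculated by a weighted average or a similar method, and after the corresponding prediction residual is obtained, it can be encoded with existing entropy coding techniques to obtain the projection residual information code stream.
For the current pixel in the coordinate conversion error information map, the reconstructed occupancy information map, depth information map, projection residual information map and the reconstructed coordinate conversion error information of already encoded and decoded pixels can be used for prediction. Specifically, the occupancy information, depth information and projection residual information of neighbor pixels can be combined with existing neighbor prediction techniques, that is, only neighbor pixels whose occupancy information is non-empty and whose depth information and projection residual information are close to those of the current pixel are used to predict the coordinate conversion error information of the current pixel. The predicted value can be calculated by a weighted average or a similar method, and after the corresponding prediction residual is obtained, it can be encoded with existing entropy coding techniques to obtain the coordinate conversion error information code stream.
In addition, the above prediction residuals may also be quantized before being encoded.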
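The text does not say which quantizer is used. As one common possibility, a uniform scalar quantizer with a hypothetical step size could look like this:

```python
def quantize(residual, step):
    """Map a residual to an integer level (uniform scalar quantizer, assumed)."""
    return int(round(residual / step))


def dequantize(level, step):
    """Reconstruct an approximate residual from its integer level."""
    return level * step


level = quantize(0.37, step=0.1)
approx = dequantize(level, step=0.1)
```

With this quantizer the reconstruction error is at most half a step, so the step size trades bit rate against geometric accuracy.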
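When this embodiment encodes the several pieces of two-dimensional map information obtained by the two-dimensional regularized plane projection, such as the occupancy information map, the depth information map, the projection residual information map and the coordinate conversion error information map, the strong correlation in the two-dimensional maps can be used effectively to predict and entropy-code the current pixel. The spatial correlation of the point cloud can therefore be exploited to a large extent and spatial redundancy reduced, further improving the encoding efficiency of the point cloud.
The geometric information code stream is obtained from the occupancy information code stream, the depth information code stream, the projection residual information code stream and the coordinate conversion error information code stream.
In this embodiment, once all the geometric information maps have been encoded, the geometric information code stream of the original point cloud data is obtained.
In another embodiment of the present invention, the above geometric information maps can also be compressed with image or video compression methods, including but not limited to JPEG, JPEG2000, HEIF, H.264/AVC, and H.265/HEVC.
In addition, it should be noted that changes and adjustments to the projection residual of the point cloud affect the size of its coordinate conversion error. To obtain better encoding efficiency, the projection residual and the coordinate conversion error of the point cloud can therefore be adjusted rather than being fixed. For example, for lossless coding, if the two-dimensional projection is accurate, the projection residual is small and can be set to 0. Although this slightly increases the coordinate conversion error, it can considerably improve the coding efficiency of the projection residual, provided that a certain decrease (or no change) in the coding efficiency of the coordinate conversion error is acceptable. Alternatively, if the two-dimensional projection is less accurate, the projection residual is large; in that case the projection residual can be adjusted appropriately so that the coordinate conversion error changes accordingly, and encoding the adjusted projection residual and coordinate conversion error may also give better coding efficiency. For lossy coding, if the two-dimensional projection is accurate, the projection residual is small and need not be encoded; or, if the coordinate conversion error is small, the coordinate conversion error need not be encoded, which improves coding efficiency; or the projection residual and the coordinate conversion error can be adjusted appropriately and then encoded, which may also give better coding efficiency.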
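Embodiment 2
On the basis of the geometric information encoding completed in Embodiment 1 above, the attribute information can also be encoded based on the reconstructed geometric information. Referring to FIG. 6, FIG. 6 is a framework diagram of attribute information encoding based on reconstructed geometric information provided by an embodiment of the present invention.
First, geometric reconstruction is performed according to the geometric information code stream obtained in Embodiment 1 to obtain the reconstructed point cloud geometric information.
Then, the attribute information of the original point cloud data is encoded based on the reconstructed point cloud geometric information to obtain an attribute information code stream.
Specifically, attribute information encoding is generally performed on the color and reflectance information of the spatial points. The attribute information of the original point cloud data can be encoded with existing techniques, based on the reconstructed geometric information of the point cloud. For example, the color information in the attributes is first converted from the RGB color space to the YUV color space. The point cloud is then recolored using the reconstructed geometric information, so that the not-yet-encoded attribute information corresponds to the reconstructed geometric information. After the points are sorted by Morton code or Hilbert code, the reconstructed attribute values of already encoded points are used to interpolate a predicted attribute value for the point to be predicted, the difference between the true attribute value and the predicted attribute value gives the prediction residual, and the prediction residual is finally quantized and encoded to generate a binary code stream.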
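The Morton-code sorting step can be illustrated with the short Python sketch below. It interleaves the bits of non-negative integer coordinates; the bit order (x in the lowest position of each triple) and the 10-bit default are conventions chosen for this example rather than anything stated in the text.

```python
def morton3d(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into one Morton code."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (3 * b)
        code |= ((y >> b) & 1) << (3 * b + 1)
        code |= ((z >> b) & 1) << (3 * b + 2)
    return code


points = [(2, 0, 0), (0, 0, 1), (1, 0, 0)]
ordered = sorted(points, key=lambda p: morton3d(*p))
```

Sorting by this key places spatially close voxels near each other in the coding order, so the attribute of a point can be predicted from recently coded points.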
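Embodiment 3
On the basis of Embodiment 1 above, the attribute information map can also be obtained at the same time from the two-dimensional projection plane structure. The geometric information map and the attribute information map are then encoded together to obtain the geometric information code stream and the attribute information code stream. Referring to FIG. 7, FIG. 7 is a framework diagram of simultaneous encoding of point cloud geometric information and attribute information provided by an embodiment of the present invention.
Specifically, for the encoding process of the geometric information map, see Embodiment 1 above. The encoding process of the attribute information map is as follows:
First, the pixels in the attribute information map are traversed in a certain scanning order, for example Z-scan. Then, for the current pixel in the attribute information map, the reconstructed occupancy information map and the reconstructed attribute information of already encoded and decoded pixels can be used for prediction. Specifically, the occupancy information of neighbor pixels can be combined with existing neighbor prediction techniques, that is, only neighbor pixels with non-empty occupancy information are used to predict the attribute information of the current pixel. The predicted value can be calculated by a weighted average or a similar method, and after the corresponding prediction residual is obtained, it can be encoded with existing entropy coding techniques to obtain the attribute information code stream.
In this embodiment, the geometric information map and the attribute information map are obtained at the same time from the two-dimensional projection plane structure, and the geometric information and the attribute information are then encoded together, which improves the encoding efficiency.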
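Embodiment 4
On the basis of Embodiments 1 to 3 above, this embodiment provides a point cloud encoding device based on two-dimensional regularized plane projection. Referring to FIG. 8, FIG. 8 is a schematic structural diagram of a point cloud encoding device based on two-dimensional regularized plane projection provided by an embodiment of the present invention, which includes:
a first data acquisition module 11, configured to obtain original point cloud data;
a projection module 12, configured to perform two-dimensional regularized plane projection on the original point cloud data to obtain a two-dimensional projection plane structure;
a data processing module 13, configured to obtain several pieces of two-dimensional map information from the two-dimensional projection plane structure;
an encoding module 14, configured to encode the several pieces of two-dimensional map information to obtain code stream information.
The encoding device provided in this embodiment can implement the encoding methods described in Embodiments 1 to 3 above; the detailed process is not repeated here.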
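Embodiment 5
Referring to FIG. 9, FIG. 9 is a schematic diagram of a point cloud decoding method based on two-dimensional regularized plane projection provided by an embodiment of the present invention. The method includes:
Step 1: Obtain code stream information and decode it to obtain parsed data.
The decoding end obtains the compressed code stream information and decodes it with the corresponding existing entropy decoding technique to obtain the parsed data.
Step 2: Reconstruct several pieces of two-dimensional map information from the parsed data.
In this embodiment, the several pieces of two-dimensional map information include a geometric information map, and the geometric information map includes an occupancy information map, a depth information map, a projection residual information map, and a coordinate conversion error information map.
Accordingly, the parsed data mainly includes the prediction residuals of the occupancy information, the prediction residuals of the depth information, the prediction residuals of the projection residual information, and the prediction residuals of the coordinate conversion error information.
Because the encoding end traverses the pixels in the occupancy information map, the depth information map, the projection residual information map and the coordinate conversion error information map in turn in a certain scanning order and encodes the corresponding information, the pixel prediction residual information obtained by the decoding end follows the same order. The decoding end can also obtain the resolutions of these two-dimensional maps from the regularization parameters; for details, see the part of S2 in Embodiment 1 on initializing the two-dimensional projection plane structure. The decoding end therefore knows the position in the two-dimensional map of the pixel currently to be reconstructed.
For the current pixel to be reconstructed in the occupancy information map, the reconstructed occupancy information of already encoded and decoded pixels can be used for prediction. The prediction method is the same as at the encoding end, and the occupancy information of the current pixel is then reconstructed from the obtained prediction value and the parsed prediction residual.
For the current pixel to be reconstructed in the depth information map, the reconstructed occupancy information map and the reconstructed depth information of already encoded and decoded pixels can be used for prediction. The prediction method is the same as at the encoding end, that is, only neighbor pixels with non-empty occupancy information are used to predict the depth information of the current pixel, and the depth information of the current pixel is then reconstructed from the obtained prediction value and the parsed prediction residual.
For the current pixel to be reconstructed in the projection residual information map, the reconstructed occupancy information map, the reconstructed depth information map and the reconstructed projection residual information of already encoded and decoded pixels can be used for prediction. The prediction method is the same as at the encoding end, that is, only neighbor pixels whose occupancy information is non-empty and whose depth information is close to that of the current pixel are used to predict the projection residual information of the current pixel, and the projection residual information of the current pixel is then reconstructed from the obtained prediction value and the parsed prediction residual.
For the current pixel to be reconstructed in the coordinate conversion error information map, the reconstructed occupancy information map, depth information map, projection residual information map and the reconstructed coordinate conversion error information of already encoded and decoded pixels can be used for prediction. The prediction method is the same as at the encoding end, that is, only neighbor pixels whose occupancy information is non-empty and whose depth information and projection residual information are close to those of the current pixel are used to predict the coordinate conversion error information of the current pixel, and the coordinate conversion error information of the current pixel is then reconstructed from the obtained prediction value and the parsed prediction residual.
After every pixel in the occupancy information map, the depth information map, the projection residual information map and the coordinate conversion error information map has been reconstructed, the reconstructed occupancy information map, depth information map, projection residual information map and coordinate conversion error information map are obtained.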
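The requirement that the decoder predicts exactly as the encoder does can be shown with a small closed-loop example. The predictor below (integer mean of the occupied left and upper neighbors, raster order, integer depths, no quantization) is a deliberately simple assumption made for this sketch; the point is only that value = prediction + parsed residual recovers the map when both ends use the same rule.

```python
def predict(values, occupancy, i, j):
    """Integer mean of the occupied left and upper neighbours, or 0 if none."""
    neighbours = []
    if j > 0 and occupancy[i][j - 1]:
        neighbours.append(values[i][j - 1])
    if i > 0 and occupancy[i - 1][j]:
        neighbours.append(values[i - 1][j])
    return sum(neighbours) // len(neighbours) if neighbours else 0


def encode_residuals(depth, occupancy):
    """Encoder side: residual = value - prediction, in raster order."""
    return [
        [depth[i][j] - predict(depth, occupancy, i, j) if occ else None
         for j, occ in enumerate(row)]
        for i, row in enumerate(occupancy)
    ]


def decode_depth(residuals, occupancy):
    """Decoder side: value = prediction + parsed residual, same order."""
    recon = [[None] * len(row) for row in occupancy]
    for i, row in enumerate(occupancy):
        for j, occ in enumerate(row):
            if occ:
                recon[i][j] = predict(recon, occupancy, i, j) + residuals[i][j]
    return recon


occupancy = [[1, 0, 1], [1, 1, 0]]
depth = [[10, None, 14], [12, 13, None]]
residuals = encode_residuals(depth, occupancy)
restored = decode_depth(residuals, occupancy)
```

Because this example is lossless, the encoder can predict from the original values; with quantization it would have to predict from reconstructed values so that both ends stay in step.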
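In addition, the parsed data may also include prediction residuals of the attribute information, from which the attribute information map can likewise be reconstructed.
Step 3: Obtain a two-dimensional projection plane structure from the several pieces of two-dimensional map information.
Because the resolution of the two-dimensional projection plane structure is the same as that of the occupancy information map, the depth information map, the projection residual information map and the coordinate conversion error information map, and all of these maps have been reconstructed, the occupancy information, depth information, projection residual information and coordinate conversion error information of each pixel in the two-dimensional projection plane structure are known, which gives the reconstructed two-dimensional projection plane structure.
Correspondingly, the reconstructed two-dimensional projection plane structure may also include the attribute information of the point cloud.
Step 4: Reconstruct the point cloud using the two-dimensional projection plane structure.
Specifically, the pixels in the reconstructed two-dimensional projection plane structure are traversed in a certain scanning order, so that the occupancy information, depth information, projection residual information and coordinate conversion error information of each pixel are known. If the occupancy information of the current pixel (i, j) is non-empty, the spatial point (x, y, z) corresponding to the pixel can be reconstructed as follows from three quantities: the depth information of the current pixel, that is, the cylindrical coordinate r component of the point corresponding to the pixel; the projection residual information, that is, the residual (Δφ, Δi) between the corresponding position of the pixel and the actual projection position; and the coordinate conversion error information, that is, the residual (Δx, Δy, Δz) between the spatial position obtained by back-projecting the pixel and the spatial position of the original point corresponding to the pixel.
The corresponding position of the current pixel (i, j) can be expressed as (φ_j, i), and the actual projection position (φ', i') of the spatial point corresponding to the current pixel is then:
[Equation image PCTCN2022075397-appb-000014, giving φ_j, is not reproduced here.]
φ' = φ_j + Δφ
i' = i + Δi
The current pixel is back-projected to the Cartesian coordinate system using the regularization parameters and the following formulas, giving the corresponding Cartesian coordinates (xl, yl, zl):
θ_i = θ_0
xl = r·sin(φ_j - α) - H_o·cos(φ_j - α)
yl = r·cos(φ_j - α) + H_o·sin(φ_j - α)
zl = r·tan θ_i + V_o
The following formulas are then used to reconstruct the spatial point (x, y, z) corresponding to the current pixel from the back-projected spatial position (xl, yl, zl) and the coordinate conversion error (Δx, Δy, Δz):
x = xl + Δx
y = yl + Δy
z = zl + Δz
With the above calculation, the spatial point corresponding to each non-empty pixel in the two-dimensional projection structure can be reconstructed, which gives the reconstructed point cloud.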
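A decoder-side sketch of Step 4 is given below. It follows the back-projection formulas as written, which use φ_j and θ_i, and then adds the coordinate conversion error. As before, the laser tuple layout `(theta0, v_o, h_o, alpha)`, the map representation and the rule φ_j = -180° + j × resolution are assumptions for illustration, since the source formula for φ_j is not available in this text.

```python
import math


def reconstruct_point_cloud(occupancy, depth, coord_error, lasers, azimuth_resolution_deg):
    """Rebuild one (x, y, z) point per non-empty pixel of the reconstructed maps."""
    points = []
    for i, row in enumerate(occupancy):
        theta0, v_o, h_o, alpha = lasers[i]  # calibration of the i-th laser
        for j, occupied in enumerate(row):
            if not occupied:
                continue  # empty pixels carry no point
            r = depth[i][j]
            # Assumed column-to-azimuth rule (the source formula is an image).
            phi_j = math.radians(-180.0 + j * azimuth_resolution_deg)
            xl = r * math.sin(phi_j - alpha) - h_o * math.cos(phi_j - alpha)
            yl = r * math.cos(phi_j - alpha) + h_o * math.sin(phi_j - alpha)
            zl = r * math.tan(theta0) + v_o
            dx, dy, dz = coord_error[i][j]
            points.append((xl + dx, yl + dy, zl + dz))
    return points


lasers = [(0.0, 1.0, 0.0, 0.0)]  # one laser: zero pitch, 1.0 vertical offset
occupancy = [[0, 0, 1, 0]]
depth = [[None, None, 5.0, None]]
coord_error = [[None, None, (0.1, -0.2, 0.3), None]]
cloud = reconstruct_point_cloud(occupancy, depth, coord_error, lasers, 90.0)
```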
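It should be noted that when the point cloud is reconstructed at the decoding end, the reconstruction method can be chosen to match the way the geometric information and attribute information of the point cloud were encoded at the encoding end, so as to obtain the corresponding reconstructed point cloud.
Embodiment 6
On the basis of Embodiment 5 above, this embodiment provides a point cloud decoding device based on two-dimensional regularized plane projection. Referring to FIG. 10, FIG. 10 is a schematic structural diagram of a point cloud decoding device based on two-dimensional regularized plane projection provided by an embodiment of the present invention, which includes:
a second data acquisition module 21, configured to obtain code stream information and decode it to obtain parsed data;
a first reconstruction module 22, configured to reconstruct several pieces of two-dimensional map information from the parsed data;
a second reconstruction module 23, configured to obtain a two-dimensional projection plane structure from the several pieces of two-dimensional map information;
a point cloud reconstruction module 24, configured to reconstruct the point cloud using the two-dimensional projection plane structure.
The decoding device provided in this embodiment can implement the decoding method described in Embodiment 5 above; the detailed process is not repeated here.
The above is a further detailed description of the present invention in connection with specific preferred embodiments, and the specific implementation of the present invention should not be regarded as limited to these descriptions. For a person of ordinary skill in the technical field of the present invention, several simple deductions or substitutions can be made without departing from the concept of the present invention, and all of them should be regarded as falling within the protection scope of the present invention.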

Claims (10)

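  1. A point cloud encoding method based on two-dimensional regularized plane projection, characterized by including:
    obtaining original point cloud data;
    performing two-dimensional regularized plane projection on the original point cloud data to obtain a two-dimensional projection plane structure;
    obtaining several pieces of two-dimensional map information from the two-dimensional projection plane structure;
    encoding the several pieces of two-dimensional map information to obtain code stream information.
  2. The point cloud encoding method based on two-dimensional regularized plane projection according to claim 1, characterized in that performing two-dimensional regularized plane projection on the original point cloud data to obtain a two-dimensional projection plane structure includes:
    initializing the two-dimensional projection plane structure;
    determining the mapping relationship between the original point cloud data and the two-dimensional projection plane structure, so as to project the original point cloud data onto the two-dimensional projection plane structure.
  3. The point cloud encoding method based on two-dimensional regularized plane projection according to claim 1, characterized in that the several pieces of two-dimensional map information include a geometric information map, and the geometric information map includes an occupancy information map, a depth information map, a projection residual information map, and a coordinate conversion error information map.
  4. The point cloud encoding method based on two-dimensional regularized plane projection according to claim 3, characterized in that encoding the several pieces of two-dimensional map information to obtain code stream information includes:
    encoding the occupancy information map, the depth information map, the projection residual information map and the coordinate conversion error information map to obtain an occupancy information code stream, a depth information code stream, a projection residual information code stream and a coordinate conversion error information code stream, respectively;
    obtaining a geometric information code stream from the occupancy information code stream, the depth information code stream, the projection residual information code stream and the coordinate conversion error information code stream.
  5. The point cloud encoding method based on two-dimensional regularized plane projection according to claim 4, characterized in that, after the geometric information code stream is obtained, the method further includes:
    performing geometric reconstruction according to the geometric information code stream to obtain reconstructed point cloud geometric information;
    encoding the attribute information of the original point cloud data based on the reconstructed point cloud geometric information to obtain an attribute information code stream.
  6. The point cloud encoding method based on two-dimensional regularized plane projection according to claim 3, characterized in that the several pieces of two-dimensional map information further include an attribute information map.
  7. The point cloud encoding method based on two-dimensional regularized plane projection according to claim 6, characterized in that encoding the several pieces of two-dimensional map information to obtain code stream information further includes:
    encoding the attribute information map to obtain an attribute information code stream.
  8. A point cloud encoding device based on two-dimensional regularized plane projection, characterized by including:
    a first data acquisition module (11), configured to obtain original point cloud data;
    a projection module (12), configured to perform two-dimensional regularized plane projection on the original point cloud data to obtain a two-dimensional projection plane structure;
    a data processing module (13), configured to obtain several pieces of two-dimensional map information from the two-dimensional projection plane structure;
    an encoding module (14), configured to encode the several pieces of two-dimensional map information to obtain code stream information.
  9. A point cloud decoding method based on two-dimensional regularized plane projection, characterized by including:
    obtaining code stream information and decoding it to obtain parsed data;
    reconstructing several pieces of two-dimensional map information from the parsed data;
    obtaining a two-dimensional projection plane structure from the several pieces of two-dimensional map information;
    reconstructing the point cloud using the two-dimensional projection plane structure.
  10. A point cloud decoding device based on two-dimensional regularized plane projection, characterized by including:
    a second data acquisition module (21), configured to obtain code stream information and decode it to obtain parsed data;
    a first reconstruction module (22), configured to reconstruct several pieces of two-dimensional map information from the parsed data;
    a second reconstruction module (23), configured to obtain a two-dimensional projection plane structure from the several pieces of two-dimensional map information;
    a point cloud reconstruction module (24), configured to reconstruct the point cloud using the two-dimensional projection plane structure.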
PCT/CN2022/075397 2021-02-08 2022-02-07 Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection WO2022166963A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2023517332A JP2023541207A (ja) 2021-02-08 2022-02-07 Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection
US18/040,705 US20230290007A1 (en) 2021-02-08 2022-02-07 Point Cloud Encoding and Decoding Method and Device Based on Two-Dimensional Regularization Plane Projection
KR1020237011307A KR20230060534A (ko) 2021-02-08 2022-02-07 Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection
EP22749242.8A EP4184917A4 (en) 2021-02-08 2022-02-07 METHOD AND DEVICE FOR CODING AND DECODING A POINT CLOUD BASED ON A TWO-DIMENSIONAL REGULATORY PLANAR PROJECTION

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110172795.4 2021-02-08
CN202110172795.4A CN114915795B (zh) 2021-02-08 Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection

Publications (1)

Publication Number Publication Date
WO2022166963A1 true WO2022166963A1 (zh) 2022-08-11

Family

ID=82742013

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/075397 WO2022166963A1 (zh) 2021-02-08 2022-02-07 Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection

Country Status (5)

Country Link
US (1) US20230290007A1 (zh)
EP (1) EP4184917A4 (zh)
JP (1) JP2023541207A (zh)
KR (1) KR20230060534A (zh)
WO (1) WO2022166963A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220292730A1 (en) * 2021-03-10 2022-09-15 Tencent America LLC Method and apparatus for haar-based point cloud coding

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019162567A1 (en) * 2018-02-23 2019-08-29 Nokia Technologies Oy Encoding and decoding of volumetric video
US20190311500A1 (en) * 2018-04-10 2019-10-10 Apple Inc. Point cloud compression
WO2020013631A1 (ko) * 2018-07-12 2020-01-16 삼성전자 주식회사 3차원 영상을 부호화 하는 방법 및 장치, 및 3차원 영상을 복호화 하는 방법 및 장치
WO2020187140A1 (en) * 2019-03-15 2020-09-24 Mediatek Inc. Method and apparatus of patch segmentation for video-based point cloud coding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4184917A4

Also Published As

Publication number Publication date
EP4184917A4 (en) 2024-03-06
EP4184917A1 (en) 2023-05-24
CN114915795A (zh) 2022-08-16
JP2023541207A (ja) 2023-09-28
US20230290007A1 (en) 2023-09-14
KR20230060534A (ko) 2023-05-04

Similar Documents

Publication Publication Date Title
WO2020188403A1 (en) Point cloud geometry padding
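WO2022166957A1 (zh) Point cloud data preprocessing method and point cloud geometry encoding and decoding method and device
US20230328285A1 (en) Point cloud data transmission method, point cloud data transmission device, point cloud data reception method, and point cloud data reception device
WO2022166963A1 (zh) Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection
US20230419552A1 (en) Point cloud data transmission method, point cloud data transmission device, point cloud data reception method, and point cloud data reception device
WO2022166967A1 (zh) Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection
WO2020128652A1 (en) Point cloud auxiliary information coding
CN114915795B (zh) Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection
WO2022166961A1 (zh) Two-dimensional regularized plane projection and encoding and decoding method for large-scale point clouds
WO2022166958A1 (zh) Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection
WO2022166968A1 (zh) Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection
WO2022166966A1 (zh) Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection
CN114915793B (zh) Point cloud encoding and decoding method and device based on two-dimensional regularized plane projection
WO2022166969A1 (zh) Point cloud sequence encoding and decoding method and device based on two-dimensional regularized plane projection
WO2023123284A1 (zh) Decoding method, encoding method, decoder, encoder, and storage medium
WO2023097694A1 (zh) Decoding method, encoding method, decoder, and encoder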
US20230334711A1 (en) Point cloud data transmission device, transmission method, processing device, and processing method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22749242; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2022749242; Country of ref document: EP; Effective date: 20230216)
ENP Entry into the national phase (Ref document number: 2023517332; Country of ref document: JP; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 20237011307; Country of ref document: KR; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)