CN110278444B - Sparse representation three-dimensional point cloud compression method adopting geometric guidance - Google Patents
- Publication number
- CN110278444B CN110278444B CN201910645303.1A CN201910645303A CN110278444B CN 110278444 B CN110278444 B CN 110278444B CN 201910645303 A CN201910645303 A CN 201910645303A CN 110278444 B CN110278444 B CN 110278444B
- Authority
- CN
- China
- Prior art keywords
- point cloud
- dimensional point
- block
- mean value
- adopting
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/63—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets
- H04N19/64—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets characterised by ordering of coefficients or of bits for transmission
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/91—Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/96—Tree coding, e.g. quad-tree coding
Abstract
The invention discloses a sparse representation three-dimensional point cloud compression method with geometric guidance, belonging to the field of video coding, which comprises the following steps: partition the input three-dimensional point cloud into blocks using an octree; obtain an original redundant dictionary by a graph transform; downsample the original redundant dictionary using the geometric information of the points within each block; remove the mean from each unit block, then sparsely represent the de-meaned color information on the downsampled dictionary; predictively code the mean of each coding unit block with an octree-based block mean prediction algorithm; code the quantized sparse coefficients with a Run-Level method; finally, entropy code all coded parameters with an arithmetic coder. By exploiting sparse representation, the invention can efficiently compress massive three-dimensional point cloud data and greatly improves the transmission and storage efficiency of three-dimensional point clouds.
Description
Technical Field
The invention relates to the field of video coding, in particular to a sparse representation three-dimensional point cloud compression method adopting geometric guidance.
Background
With the rapid development of three-dimensional acquisition and multimedia communication technologies, three-dimensional point clouds, as a new transmission medium, are widely applied in fields such as remote communication, augmented reality, and immersive communication. A three-dimensional point cloud is a large set of points with specific three-dimensional positions, where each point carries one or more attributes (e.g., color, normal). As a novel spatial data type, the three-dimensional point cloud can effectively represent three-dimensional objects or scenes with complex topological structures. However, its enormous data volume presents significant challenges for storage and transmission. Constructing an efficient and stable three-dimensional point cloud compression method is therefore very important.
Unlike traditional natural images and video, a three-dimensional point cloud is irregular: not every voxel is occupied, which poses a great challenge to its compression. Research has shown that sparse representation can compress high-dimensional data efficiently, but how to build a redundant dictionary and sparsely represent the color information of a point cloud on it remains an open problem.
Disclosure of Invention
The invention mainly aims to overcome the defects in the prior art and provides a sparse representation three-dimensional point cloud compression method adopting geometric guidance. The method fully considers the correlation of adjacent points in the three-dimensional point cloud data, converts color information into sparse coefficients by a sparse representation method, and realizes effective compression of the huge three-dimensional point cloud data by encoding nonzero coefficients with less quantity.
The invention adopts the following technical scheme:
a sparse representation three-dimensional point cloud compression method adopting geometric guidance is characterized by comprising the following steps:
1) Dividing the input three-dimensional point cloud to obtain a plurality of unit blocks;
2) Calculating an original redundant dictionary of the unit block;
3) Downsampling the original redundant dictionary based on the geometric information of the points within the current unit block to obtain a downsampled redundant dictionary;
4) Calculating the color mean within each unit block and removing this mean from the color information of each unit block;
5) Sparse representation is carried out on the color information subjected to mean value removal on a down-sampled redundant dictionary, and quantized sparse coefficients are obtained;
6) Carrying out predictive coding on the color mean value of each unit block by using an octree-based block mean value prediction algorithm;
7) Coding the quantized sparse coefficient by adopting a Run-Level method;
8) Entropy coding the coded parameters obtained in the steps 6) and 7) by adopting an arithmetic coder.
Preferably, in step 1), the input three-dimensional point cloud is segmented into unit blocks of uniform size by using an octree.
Preferably, in step 2), the original redundant dictionary of the unit block is obtained by a graph transform.
As can be seen from the above description of the present invention, compared with the prior art, the present invention has the following advantages:
1. the method of the invention adopts the octree to segment the three-dimensional point cloud, thereby not only effectively keeping the correlation between points, but also ensuring that the size of each segmented unit block is the same.
2. The method of the invention utilizes the strong correlation between points, and obtains the original redundant dictionary by assuming that the voxels in each block are occupied and then adopting a graph transformation method; obtaining a down-sampling redundant dictionary by using the geometric information of the points in the block; the color information of the points in the block is sparsely represented on the downsampled redundant dictionary, so that the energy of color signals can be concentrated on a small number of non-zero coefficients, and the data volume of the three-dimensional point cloud is greatly reduced.
3. The invention designs an octree-based block mean prediction mode, which can eliminate color redundancy among each unit block; and for the transformed sparse coefficient, efficiently coding the quantized sparse coefficient by using a Run-level method.
Drawings
FIG. 1 is a main flow diagram of the process of the present invention;
FIG. 2 is an octree partitioning method of the present invention;
FIG. 3 is a schematic diagram of a block mean prediction method based on octree according to the present invention;
FIG. 4 is a flow chart of Run-Level encoding method of the present invention.
The invention is described in further detail below with reference to the figures and specific examples.
Detailed Description
The invention is further described below by means of specific embodiments.
Referring to fig. 1, the sparse representation three-dimensional point cloud compression method with geometric guidance first performs octree segmentation on the input three-dimensional point cloud; then, assuming all voxels in each block are occupied, it obtains an original redundant dictionary by a graph transform and derives a downsampled redundant dictionary from the geometric information in each block. The color information in each block is then sparsely represented on the downsampled redundant dictionary to obtain sparse coefficients. The mean of each block is predictively coded and quantized with the octree-based block mean prediction method. The non-zero entries of the quantized sparse coefficients are coded with the Run-Level coding method. Finally, all coding parameters are entropy coded with an arithmetic coder. The specific implementation steps are as follows:
1) The octree is used to segment the original input point cloud into unit blocks of uniform size, i.e., coding blocks, as shown in fig. 2. Given a segmentation depth n, the original input three-dimensional point cloud is first divided into eight equal parts; after division, a block containing points is labeled 1 and a block without points is labeled 0. Each block labeled 1 is then subdivided into eight octants, which are again labeled 0 or 1 according to whether they contain points, and the blocks labeled 1 are subdivided further. This process repeats until the given depth is reached; the number of resulting blocks is Z.
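As an illustration, the occupancy labelling above can be sketched in a few lines of Python (the function name `octree_blocks` and the dictionary-of-leaves return format are illustrative choices, not part of the patent):

```python
import numpy as np

def octree_blocks(points, depth):
    """Partition a point cloud into 2^depth-per-axis uniform cubes and
    keep only the occupied leaves (the blocks that would be labeled 1)."""
    pts = np.asarray(points, dtype=np.float64)
    lo = pts.min(axis=0)
    side = (pts.max(axis=0) - lo).max() / (2 ** depth)  # leaf cube edge length
    idx = np.minimum(((pts - lo) // side).astype(int), 2 ** depth - 1)
    blocks = {}                                          # occupied leaves only
    for i, key in enumerate(map(tuple, idx)):
        blocks.setdefault(key, []).append(i)
    return blocks                                        # {(bx, by, bz): point indices}

pts = np.random.rand(100, 3)
blocks = octree_blocks(pts, depth=2)                     # Z = len(blocks) occupied blocks
```

In practice the recursion is carried out level by level so the 0/1 occupancy bytes can be written to the bitstream; the flat indexing above yields the same leaf partition.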
2) Taking the unit block as the processing unit and assuming that all voxels in each block are occupied, the weight W_{i,j} between each pair of voxels is computed, and the original redundant dictionary D is then obtained by a graph transform, specifically using the following formulas:

W_{i,j} = exp(−dist_{i,j}^2 / σ^2) if dist_{i,j} ≤ τ, and W_{i,j} = 0 otherwise

A = diag(d_1, d_2, ..., d_n), with d_i = Σ_j W_{i,j}

L = A − W = DΛD^{-1}

where dist_{i,j} is the Euclidean distance between the i-th and j-th voxels, τ is a distance threshold, σ is a preset kernel parameter, A is the degree matrix, L is the graph Laplacian, and Λ is the diagonal matrix of its eigenvalues; the columns of D, the eigenvectors of L, form the original redundant dictionary.
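A minimal sketch of this dictionary construction, assuming the common thresholded Gaussian kernel for the weights (the exact kernel form is an assumption consistent with the symbols dist, τ, and σ in the text):

```python
import numpy as np

def graph_dictionary(coords, tau, sigma):
    """Original redundant dictionary for one block: thresholded Gaussian
    weights from pairwise distances, degree matrix A, Laplacian L = A - W,
    and the eigenvectors of L as the dictionary atoms."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    W = np.where(d <= tau, np.exp(-d ** 2 / sigma ** 2), 0.0)
    np.fill_diagonal(W, 0.0)                  # no self-loops
    A = np.diag(W.sum(axis=1))                # degree matrix A = diag(d_1..d_n)
    L = A - W                                 # graph Laplacian
    lam, D = np.linalg.eigh(L)                # L symmetric, so D is orthonormal
    return D, lam

# all 8 voxel centers of a 2x2x2 block, assumed fully occupied
grid = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)], float)
D, lam = graph_dictionary(grid, tau=1.5, sigma=1.0)
```

Because L is symmetric, `eigh` returns an orthonormal eigenvector matrix, so DΛD^{-1} reduces to DΛD^T; the smallest eigenvalue is 0 and its eigenvector plays the role of the DC atom.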
3) The downsampled redundant dictionary D̃_i is obtained from the geometric information of the points in the block. Specifically, for the i-th block, the downsampled redundant dictionary D̃_i is obtained by:

D̃_i = S_i D

where S_i is a down-sampling matrix obtained from the identity matrix by removing the rows corresponding to the unoccupied voxels of the i-th block.
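The row-selection step can be sketched as follows (`downsample_dictionary` is an illustrative name; S_i need not be materialized in practice, since S_i D is just a row slice):

```python
import numpy as np

def downsample_dictionary(D, occupied):
    """D_i = S_i @ D, where S_i is the identity matrix with the rows of
    the unoccupied voxels removed; equivalent to a simple row slice."""
    S = np.eye(D.shape[0])[occupied]          # down-sampling matrix S_i
    return S @ D

D = np.linalg.qr(np.random.rand(8, 8))[0]     # stand-in orthonormal dictionary
occupied = [0, 2, 3, 7]                       # occupied-voxel indices in this block
Ds = downsample_dictionary(D, occupied)       # 4 x 8 downsampled dictionary
```

The result has one row per occupied voxel but keeps all atoms, so the dictionary is redundant (more atoms than signal dimensions) exactly when the block is not fully occupied.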
4) The color mean x̄_i is obtained by averaging, and the color signal used for sparse representation is obtained by removing this mean from the original color information. Specifically, for the i-th block with n_i points and color vector x_i:

x̄_i = (1/n_i) · 1^T x_i,  x'_i = x_i − x̄_i · 1

where 1 is the all-ones vector of length n_i.
5) The sparse coefficients c_i are obtained by sparsely representing the de-meaned color information on the downsampled redundant dictionary. Specifically, for the i-th block, the following optimization problem is solved with the OMP algorithm:

min ||c_i||_0  subject to  ||x'_i − D̃_i c_i||_2^2 ≤ ε,  i = 1, 2, ..., N

where N is the number of occupied blocks, ε is the reconstruction error bound used to control the sparsity of c_i, and ||·||_0 is the 0-norm of a vector, which counts its non-zero elements.
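A greedy OMP loop with this error-bounded stopping rule can be sketched as follows (a simplified illustration, not the patent's exact solver; steps 4 and 5 are combined by de-meaning inside the function):

```python
import numpy as np

def omp_encode(Ds, x, eps):
    """De-mean x, then greedily add the atom most correlated with the
    residual until ||x' - Ds c||_2^2 <= eps (error-constrained OMP)."""
    mean = x.mean()
    y = x - mean                               # de-meaned color signal x'
    r = y.copy()
    support, coef = [], np.array([])
    while r @ r > eps and len(support) < Ds.shape[1]:
        k = int(np.argmax(np.abs(Ds.T @ r)))   # most correlated atom
        if k in support:                       # numerical safeguard
            break
        support.append(k)
        coef, *_ = np.linalg.lstsq(Ds[:, support], y, rcond=None)
        r = y - Ds[:, support] @ coef          # update residual
    c = np.zeros(Ds.shape[1])
    c[support] = coef
    return mean, c

rng = np.random.default_rng(0)
Ds = np.linalg.qr(rng.standard_normal((6, 6)))[0]   # orthonormal stand-in dictionary
x = rng.standard_normal(6) * 10                     # toy color signal
mean, c = omp_encode(Ds, x, eps=1e-6)
```

Larger ε yields fewer non-zero coefficients (coarser reconstruction); the encoder then quantizes c before Run-Level coding.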
6) The mean of each unit block is coded with an octree-based prediction algorithm. Fig. 3 shows the octree prediction method, which proceeds as follows:
first, assume that after octree segmentation, the original three-dimensional point cloud has been divided into m × m × m blocks, each having its own three-dimensional space coordinates (p) in three-dimensional spacex,py,pz) Wherein, 1 is less than or equal to px≤m,1≤py≤m,1≤pzM is less than or equal to m. Fig. 3 (a) shows 8 neighboring blocks of the current coding block, where the coordinates of the gray block number 1 is (p)x,py,pz) The coordinates of the other seven blocks with different colors numbered 2-8 are (p) for the current coding blockx-Δpx,py-Δpy,pz-Δpz) Wherein Δ px,Δpy,ΔpzE {0,1}, which is the already encoded reference block. For the current cell block (indicated by grey blocks in fig. 3 (b), with a block label of 1), 9 prediction modes will be used, where the 9 prediction modes include oneDC mode and 8 angle mode. Of the 8 angular modes, mode1 to mode7 are modes of seven reference neighboring blocks; mode8 is the mode of the reference macroblock, which is a block with 7 reference blocks as a whole. By comparing with the neighboring blocks and macroblocks, the best prediction mode can be obtained by the following formula:
modedecision=min|b-bi|
by comparing with 8 reference blocks, 8 prediction residuals can be obtained, and the minimum absolute value of the 8 residuals corresponds to the best prediction mode.
Where b is the mean of the current block of coding units, b is the mean of the current block of coding unitsiIs the mean of the reference block, i ∈ {1,2, ·,8}; | is an absolute value sign. Since the three-dimensional coordinates comprise one block, namely the block at the outermost layer of the whole three-dimensional point cloud, and the blocks have no reference block, the color mean value of each block is directly quantized, namely, the DC mode is adopted.
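A sketch of the mode decision, under the assumption (not stated explicitly in the text) that the mode-8 macroblock mean is the average of the seven reference means; mode 0 here denotes the DC fallback for blocks without references:

```python
def best_mode(b, ref_means):
    """Pick the prediction with the minimum absolute residual |b - b_i|.
    ref_means: means of the (up to 7) coded neighboring blocks; mode 8 is
    modeled here as the average of those means (macroblock assumption).
    Returns (mode, residual); mode 0 quantizes b itself (DC mode)."""
    if not ref_means:
        return 0, b                            # boundary block: DC mode
    candidates = list(ref_means) + [sum(ref_means) / len(ref_means)]
    residuals = [b - r for r in candidates]
    i = min(range(len(residuals)), key=lambda j: abs(residuals[j]))
    return i + 1, residuals[i]                 # modes are numbered from 1

mode, res = best_mode(100.0, [90.0, 104.0, 120.0])  # mode 2, residual -4.0
```

Only the chosen mode index and the quantized residual need to be transmitted, which removes the color redundancy between neighboring blocks.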
7) The quantized sparse coefficients are coded with the Run-Level method.

Specifically, as shown in fig. 4, the quantized sparse coefficients are represented as two one-dimensional arrays, denoted Run and Level. Since the number of non-zero coefficients differs from block to block, NumNonzeroCoef is introduced to record the number of non-zero coefficients in each block. For the i-th block (1 ≤ i ≤ Z), the pair (run_i, level_i) is coded only if NumNonzeroCoef[i] is not 0. If the quantized coefficients c_i contain no non-zero coefficients, NumNonzeroCoef[i] = 0 indicates that no coefficients need to be coded. If c_i contains N_i non-zero coefficients, N_i is first stored in NumNonzeroCoef[i], and run_i and level_i are then stored in Run{i} and Level{i}, respectively. This process repeats until all unit blocks are coded.
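The per-block Run-Level scan can be sketched as:

```python
def run_level_encode(coeffs):
    """Scan one block's quantized coefficients: emit (run of zeros
    preceding each non-zero value, the value itself); the number of
    emitted pairs is NumNonzeroCoef for this block."""
    runs, levels, run = [], [], 0
    for c in coeffs:
        if c == 0:
            run += 1                  # count zeros before the next non-zero
        else:
            runs.append(run)
            levels.append(c)
            run = 0
    return runs, levels, len(levels)  # (Run{i}, Level{i}, NumNonzeroCoef[i])

runs, levels, n = run_level_encode([0, 0, 5, 0, -3, 0, 0, 0])
# runs == [2, 1], levels == [5, -3], n == 2; trailing zeros need no coding
```

Because NumNonzeroCoef[i] is transmitted first, the decoder knows exactly how many (run, level) pairs to read back for each block, so trailing zeros cost nothing.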
8) Finally, an arithmetic coder is used to entropy code all the coded parameters from the previous steps, including the quantized prediction residuals, the best prediction mode, Run, Level, and NumNonzeroCoef.
The above is only one embodiment of the present invention, but the design concept of the present invention is not limited thereto; any insubstantial modification made using this design concept shall likewise fall within the protection scope of the present invention.
Claims (3)
1. A sparse representation three-dimensional point cloud compression method adopting geometric guidance is characterized by comprising the following steps:
1) Dividing the input three-dimensional point cloud to obtain a plurality of unit blocks;
2) Calculating an original redundant dictionary of the unit block;
3) Based on the geometric information of the inner points of the current cell block, downsampling the original redundant dictionary to obtain a downsampled redundant dictionary;
4) Calculating the color mean within each unit block and removing this mean from the color information of each unit block;
5) Sparse representation is carried out on the color information subjected to mean value removal on a down-sampled redundant dictionary, and quantized sparse coefficients are obtained;
6) Carrying out predictive coding on the color mean value of each unit block by using an octree-based block mean value prediction algorithm;
7) Coding the quantized sparse coefficient by adopting a Run-Level method;
8) Entropy coding the coded parameters obtained in the steps 6) and 7) by adopting an arithmetic coder.
2. The method of claim 1, wherein in step 1), the input three-dimensional point cloud is segmented into unit blocks of uniform size by using an octree.
3. The method of compressing a three-dimensional point cloud with geometric guidance according to claim 1, wherein in step 2), the original redundant dictionary of the unit blocks is obtained by a graph transform.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910645303.1A CN110278444B (en) | 2019-07-17 | 2019-07-17 | Sparse representation three-dimensional point cloud compression method adopting geometric guidance |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110278444A CN110278444A (en) | 2019-09-24 |
CN110278444B true CN110278444B (en) | 2022-11-01 |
Family
ID=67964759
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910645303.1A Active CN110278444B (en) | 2019-07-17 | 2019-07-17 | Sparse representation three-dimensional point cloud compression method adopting geometric guidance |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110278444B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114009014A (en) * | 2019-09-30 | 2022-02-01 | Oppo广东移动通信有限公司 | Color component prediction method, encoder, decoder, and computer storage medium |
CN111866518B (en) * | 2020-07-29 | 2022-05-27 | 西安邮电大学 | Self-adaptive three-dimensional point cloud compression method based on feature extraction |
CN112184840B (en) * | 2020-09-22 | 2022-09-02 | 上海交通大学 | 3D point cloud compression system based on multi-scale structured dictionary learning |
CN112256652B (en) * | 2020-10-19 | 2022-09-16 | 济南大学 | Three-dimensional point cloud attribute compression method, system and terminal |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103489203A (en) * | 2013-01-31 | 2014-01-01 | 清华大学 | Image coding method and system based on dictionary learning |
CN104331913A (en) * | 2014-11-19 | 2015-02-04 | 西安电子科技大学 | Polarized SAR polarization method based on sparse K-SVD (Singular Value Decomposition) |
CN109166160A (en) * | 2018-09-17 | 2019-01-08 | 华侨大学 | A kind of three-dimensional point cloud compression method predicted using figure |
CN109345619A (en) * | 2018-08-10 | 2019-02-15 | 华北电力大学(保定) | Massive point cloud space management based on class octree encoding |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170214943A1 (en) * | 2016-01-22 | 2017-07-27 | Mitsubishi Electric Research Laboratories, Inc. | Point Cloud Compression using Prediction and Shape-Adaptive Transforms |
US10909725B2 (en) * | 2017-09-18 | 2021-02-02 | Apple Inc. | Point cloud compression |
- 2019-07-17: Application filed in China — CN201910645303.1A, granted as CN110278444B (status: Active)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103489203A (en) * | 2013-01-31 | 2014-01-01 | 清华大学 | Image coding method and system based on dictionary learning |
CN104331913A (en) * | 2014-11-19 | 2015-02-04 | 西安电子科技大学 | Polarized SAR polarization method based on sparse K-SVD (Singular Value Decomposition) |
CN109345619A (en) * | 2018-08-10 | 2019-02-15 | 华北电力大学(保定) | Massive point cloud space management based on class octree encoding |
CN109166160A (en) * | 2018-09-17 | 2019-01-08 | 华侨大学 | A kind of three-dimensional point cloud compression method predicted using figure |
Non-Patent Citations (2)
Title |
---|
Point cloud data coding and reconstruction based on compressed sensing (基于压缩感知的点云数据编码与重建); Wu Xin (吴鑫); China Master's Theses Full-text Database; 2015-03-31; full text *
3D point cloud compression using an improved graph transform (采用改进图形变换的3D点云压缩); Gu Shuai et al. (谷帅等); Signal Processing (《信号处理》); 2019-01-31; full text *
Also Published As
Publication number | Publication date |
---|---|
CN110278444A (en) | 2019-09-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110278444B (en) | Sparse representation three-dimensional point cloud compression method adopting geometric guidance | |
Sun et al. | A novel point cloud compression algorithm based on clustering | |
CN110418135B (en) | Point cloud intra-frame prediction method and device based on neighbor weight optimization | |
Davoine et al. | Fractal image compression based on Delaunay triangulation and vector quantization | |
US11836954B2 (en) | 3D point cloud compression system based on multi-scale structured dictionary learning | |
Golla et al. | Real-time point cloud compression | |
Huang et al. | Octree-Based Progressive Geometry Coding of Point Clouds. | |
CN109166160B (en) | Three-dimensional point cloud compression method adopting graph prediction | |
CN103295197B (en) | Based on the image super-resolution rebuilding method of dictionary learning and bilateral canonical | |
CN111953998A (en) | Point cloud attribute coding and decoding method, device and system based on DCT (discrete cosine transformation) | |
KR20120036834A (en) | Method for encoding/decoding a 3d mesh model that comprises one or more components | |
CN101383972A (en) | Remote sensed image compression method based on space prediction and transformation | |
Sun et al. | A novel coding architecture for lidar point cloud sequence | |
CN113518226A (en) | G-PCC point cloud coding improvement method based on ground segmentation | |
WO2022166957A1 (en) | Point cloud data preprocessing method, point cloud geometry coding method and device, and point cloud geometry decoding method and device | |
CN114332259A (en) | Point cloud coding and decoding method based on vehicle-mounted laser radar | |
CN110349228B (en) | Triangular mesh compression method for data-driven least square prediction | |
CN116665053B (en) | High-resolution remote sensing image building identification method and system considering shadow information | |
CN114554175B (en) | Classification rearrangement-based lossless compression method for two-dimensional point cloud distance images | |
CN114915793B (en) | Point cloud coding and decoding method and device based on two-dimensional regularized plane projection | |
CN115278255B (en) | Data storage system for safety management of strength instrument | |
CN114025146B (en) | Dynamic point cloud geometric compression method based on scene flow network and time entropy model | |
CN114915794B (en) | Point cloud coding and decoding method and device based on two-dimensional regularized plane projection | |
CN115118998A (en) | Inter-frame coding method, decoding method and device for enhancing projection correlation | |
CN117974818A (en) | Point cloud geometric compression method based on spatial channel mixed context model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||