CN112184840B

CN112184840B - 3D point cloud compression system based on multi-scale structured dictionary learning

Info

Publication number: CN112184840B
Application number: CN202011002405.0A
Authority: CN
Inventors: 戴文睿; 申扬眉; 李成林; 邹君妮; 熊红凯
Original assignee: Shanghai Jiaotong University
Current assignee: Shanghai Jiaotong University
Priority date: 2020-09-22
Filing date: 2020-09-22
Publication date: 2022-09-02
Anticipated expiration: 2040-09-22
Also published as: US20230215055A1; CN112184840A; US11836954B2; WO2022063055A1

Abstract

The invention provides a 3D point cloud compression system based on multi-scale structured dictionary learning, wherein: the point cloud data partitioning module outputs the voxel set obtained by partitioning the point cloud and voxel block sets at different scales; the geometric information encoding module outputs an encoded geometric information bit stream; the geometric information decoding module outputs decoded geometric information; the attribute signal encoding module outputs a sparsely coded coefficient matrix and a learned multi-scale structured dictionary; the attribute signal compression module outputs a compressed attribute signal bit stream; the attribute signal decoding module outputs a decoded attribute signal; and the 3D point cloud reconstruction module completes the reconstruction. The system is suitable for lossless geometry compression and lossy attribute compression of point cloud signals. It exploits the natural hierarchical partition structure of the point cloud signal to progressively improve the reconstruction quality of high-frequency detail as the signal scale goes from coarse to fine, and can obtain a clear performance gain.

Description

3D point cloud compression system based on multi-scale structured dictionary learning

Technical Field

The invention relates to a scheme in the technical field of 3D point cloud data compression, in particular to a 3D point cloud compression system based on multi-scale structured dictionary learning.

Background

In recent years, with the rapid development of 3D data acquisition devices and display systems, and with the support of powerful GPU computing, real-world scenes and objects can be digitized in real time into three-dimensional point data with high detail accuracy. 3D data and technology are widely used in emerging areas including virtual/augmented reality, mixed reality, telepresence, panoramic communication, autonomous driving, and robotic navigation. 3D modeling techniques are diverse. For example, RGB-D frames and multi-view video with depth information can model 3D scenes and objects but do not support real-time rendering; polygonal meshes can reconstruct the 3D surface of an object from the connectivity between vertices, but cannot model three-dimensional data that does not satisfy a manifold topology. Among these techniques, the 3D point cloud stands out: it represents a 3D object by a set of points, each consisting of its geometric coordinates and attribute signals such as color, texture, and reflectivity. It can flexibly represent the structure of the original data without being constrained by a manifold topology, and it supports real-time processing effectively. However, a single point cloud often contains hundreds of thousands or even millions of 3D points, so how to compress, store, and transmit such massive data efficiently is an important open problem, especially for emerging industries such as autonomous driving that have strict real-time requirements.

Through a search of the prior-art literature, the inventors found that C. Zhang, D. Florencio and C. Loop, in "Point Cloud Attribute Compression with Graph Transform", published at the IEEE International Conference on Image Processing (ICIP 2014) in 2014, innovatively used the graph Fourier transform to remove the correlation of point cloud attributes: a nearest-neighbor graph is constructed within each point cloud voxel block, and the graph Laplacian matrix is computed to obtain the transform basis used to encode the point cloud signal. However, the graph Fourier transform requires an eigenvalue decomposition of a large Laplacian matrix, has high computational complexity, and is not suitable for real-time applications. To reduce complexity, R. L. de Queiroz and P. A. Chou proposed an improved Haar wavelet transform in "Compression of 3D Point Clouds Using a Region-Adaptive Hierarchical Transform", published in IEEE Transactions on Image Processing (TIP 2016) in 2016. It applies an adaptively weighted transform to point cloud signals with irregular spatial distribution, which avoids the eigendecomposition of a large matrix and markedly improves coding efficiency. However, both the graph Fourier transform and the adaptive wavelet transform are fixed analytic transforms: their transform basis matrices are computed directly from the geometric information and do not adapt to the attribute signals, so it is difficult for them to characterize the complex structure of 3D point cloud data from natural scenes.

Meanwhile, the MPEG 3DG PCC standardization group issued a call for 3D point cloud compression technologies in 2017. It selected a hybrid point cloud coding technology as the test reference, which compresses the geometric information with an octree structure and compresses the 3D-to-2D projected attribute signal with a JPEG image encoder. After successive rounds of screening and strict comparison, MPEG established two standard test models. One is Video-based Point Cloud Compression (V-PCC), which projects the 3D point cloud onto 2D planes to obtain a geometry video sequence and an attribute video sequence and compresses both with the existing HEVC video coding standard. The other is Geometry-based Point Cloud Compression (G-PCC), which compresses the point cloud data directly in three-dimensional space using an octree structure and critically sampled spatial transform coding. However, the 3D-to-2D projection of V-PCC introduces unavoidable projection distortion, and the attribute transform of G-PCC depends only on the geometric information and ignores the statistical characteristics of the attribute signals, which reduces the compression quality of the attribute signals.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a 3D point cloud compression system based on multi-scale structured dictionary learning, which utilizes the natural hierarchical division structure of a point cloud signal to gradually improve the reconstruction quality of high-frequency detail information along the direction from coarse to fine of the signal scale, and the compression effect can obtain obvious performance gain compared with the international standard MPEG 3DG G-PCC and V-PCC.

The invention is realized by the following technical scheme:

The invention provides a 3D point cloud compression system based on multi-scale structured dictionary learning, comprising: a point cloud data partitioning module, a geometric information encoding module, a geometric information decoding module, an attribute signal encoding module, an attribute signal compression module, an attribute signal decoding module and a 3D point cloud reconstruction module, wherein:

the point cloud data dividing module divides original point cloud data to obtain voxel sets which are uniformly distributed in space, further divides the voxel sets into point cloud voxel blocks with different scales, transmits the obtained voxel sets to the geometric information coding module, and transmits the point cloud voxel blocks to the attribute signal coding module, wherein the point cloud voxel blocks form a training set and a test set;

the geometric information coding module carries out lossless coding on the geometric position information of the point cloud voxels through an octree structure and transmits a bit stream obtained by coding to the geometric information decoding module;

the geometric information decoding module decodes the bit stream obtained by encoding to obtain decoded geometric information, and transmits the decoded geometric information to the attribute signal encoding module and the 3D point cloud reconstruction module;

the attribute signal coding module learns a multi-scale structured dictionary from point cloud voxel blocks in a training set, performs hierarchical sparse coding on the point cloud voxel blocks in a test set based on the multi-scale structured dictionary, transmits the multi-scale structured dictionary to an attribute signal compression module and an attribute signal decoding module, and transmits sparse coding coefficients of the hierarchical sparse coding to the attribute signal compression module;

the attribute signal compression module rearranges, quantizes, predicts and entropy codes the sparse coding coefficient, and transmits the coded bit stream to an attribute signal decoding module;

the attribute signal decoding module obtains a decoded sparse coefficient matrix through entropy decoding, inverse prediction and inverse quantization, reconstructs a point cloud attribute signal according to the multi-scale structured dictionary, and transmits the reconstructed attribute signal to the 3D point cloud reconstruction module;

and the 3D point cloud reconstruction module integrates the geometric information obtained by the geometric information decoding module and the attribute signal obtained by the attribute signal decoding module to obtain a reconstructed 3D point cloud.

Preferably, the point cloud data partitioning module includes: a voxel division sub-module and a block division sub-module, wherein: the voxel division submodule recursively divides a boundary cube where original point cloud data which are arranged in a disordered way are located into voxel units which are aligned in a space coordinate system and distributed uniformly, and transmits a voxel set obtained by division to the geometric information coding module and the block division submodule; the block division submodule uniformly divides the voxel set into voxel blocks with different scales and transmits the voxel block set to the attribute signal coding module.

Further, the voxel division submodule recursively divides a boundary cube of the point cloud into voxel units according to an octree structure, each node of the octree containing point cloud point data is represented as a voxel unit, floating point precision coordinates of points contained in each voxel unit are quantized into integer coordinates of voxel centers, and an average value of attribute information of the contained points is used as an attribute signal of the voxel unit.

Further, the block division submodule uniformly divides the set of N × N × N voxel units into voxel blocks of size m × m × m (m < N) and, according to a preset number of scales K, further decreases the scale layer by layer to obtain voxel blocks at scales k = 1, …, K [per-scale block-size expression not reproduced in the source text]; all the resulting voxel blocks constitute the multi-scale voxel block set.

Preferably, the geometric information encoding module allocates an eight-bit byte unit to each branch node of the octree structure obtained by voxel division, each bit indicating whether point data of the point cloud exists in the spatial cell corresponding to one child node of that branch node. It traverses the octree structure in depth-breadth priority order, takes the sequence of all byte units as the geometric information codeword, further compresses the codeword by entropy coding to obtain the geometric information bit stream, and transmits the bit stream to the geometric information decoding module.

Preferably, the geometric information decoding module performs entropy decoding on the geometric coding bit stream to obtain a coded octree structure and a byte unit of each branch node, and further obtains geometric coordinate information of each voxel unit, and transmits the geometric coordinate information to the attribute signal coding module and the 3D point cloud reconstruction module.

Preferably, the attribute signal encoding module includes: a multi-scale dictionary learning submodule and a hierarchical sparse coding submodule, wherein: for the voxel blocks of different scales in the training set, the multi-scale dictionary learning submodule introduces a weight matrix to characterize the dimension irregularity of the attribute signals and learns a multi-scale dictionary adaptively with an alternating optimization algorithm; and the hierarchical sparse coding submodule, using the learned dictionary, transform-codes the attribute signals of the multi-scale voxel blocks of the test set by exploiting the hierarchical sparsity of the point cloud signal, and transmits the sparse coefficients to the attribute signal compression module.

Furthermore, the multi-scale dictionary learning submodule introduces the non-uniform weight matrix to characterize the dimension irregularity of the voxel-block attribute signals and formulates the optimization objective as a non-uniformly weighted minimization problem with mixed $\ell_1/\ell_2$ norm regularization. Dictionary atoms and sparse coefficients are updated alternately and efficiently by a Gauss-Seidel iteration, which improves the convergence rate of learning while keeping the approximation error under control. The learned dictionary atoms naturally arrange themselves into a tree structure, and the signal frequency represented by the atoms increases as the atom scale decreases layer by layer from coarse to fine.

Furthermore, the hierarchical sparse coding submodule uses the natural hierarchical structure prior of the point cloud signal to design a regularization term with a hierarchical sparsity constraint and, based on the multi-scale dictionary, sparsely codes the attribute signals of the voxel blocks in the test set with the alternating direction method of multipliers (ADMM), gradually improving the representation accuracy of high-frequency detail in the signal while maintaining coding efficiency.

Preferably, the attribute signal compression module includes: a sparse coefficient rearrangement sub-module, a uniform quantization sub-module, a differential coding prediction sub-module and an adaptive arithmetic entropy coding sub-module, wherein: the sparse coefficient rearrangement submodule improves the compression rate of subsequent encoding by rearranging the coefficient matrix of the sparse encoding and transmits the reordered sparse coefficient to the uniform quantization submodule; the uniform quantization sub-module quantizes the sparse coefficient value, the differential coding prediction sub-module differentially codes the index of the sparse coefficient, then the adaptive arithmetic entropy coding sub-module entropy codes the quantized sparse coefficient and the predicted index, and the module transmits the obtained compressed bit stream to the attribute signal decoding module.

Furthermore, the sparse coefficient rearrangement submodule rearranges the row vectors of the sparse coefficient matrix according to the decreasing order of the number of the non-zero elements, and simultaneously rearranges the column atom vectors of the multi-scale dictionary correspondingly, so that on the premise of ensuring that a reconstruction signal cannot be changed, the entropy value of the subsequent index difference coding is reduced, and the compression ratio is effectively improved.

Furthermore, the uniform quantization submodule uses a dead-zone uniform quantizer whose zero bin is twice as wide as the other uniform quantization intervals, which discards very small sparse coefficients that carry little information and thereby improves compression efficiency.

Furthermore, the differential coding prediction submodule performs differential coding on the index of the sparse coefficient matrix according to columns, marks the end of each column to be zero, and further reduces information redundancy.

Furthermore, the self-adaptive arithmetic entropy coding sub-module performs effective entropy coding on the quantized non-zero coefficient value and the indexed differential prediction residual error to obtain a complete bit stream compressed by the point cloud attribute signal.

Preferably, the attribute signal decoding module includes: the adaptive arithmetic entropy decoding sub-module, the differential decoding prediction sub-module, the inverse quantization sub-module and the attribute signal reconstruction sub-module, wherein: the self-adaptive arithmetic entropy decoding submodule carries out entropy decoding on the compressed bit stream to obtain a differential prediction residual error of a decompressed sparse coefficient and an index and transmits the differential prediction residual error to the differential decoding prediction submodule; the differential decoding prediction submodule decodes the index residual error to obtain an index of a sparse coefficient and transmits the index of the sparse coefficient to the inverse quantization submodule; the inverse quantization submodule carries out inverse quantization operation on the sparse coefficient value, restores the sparse coefficient value, integrates the sparse coefficient value with the index of the sparse coefficient to obtain a complete sparse coefficient matrix and transmits the complete sparse coefficient matrix to the attribute signal reconstruction submodule; and the attribute signal reconstruction submodule performs matrix multiplication on the obtained reconstruction sparse coefficient matrix and the multi-scale dictionary to obtain a reconstructed point cloud attribute signal, and transmits the reconstructed point cloud attribute signal to the 3D point cloud reconstruction module.

Preferably, the 3D point cloud reconstruction module synthesizes the decoded geometric information and attribute signals to reconstruct complete point data, and the module obtains the final reconstructed 3D point cloud data.

Compared with the prior art, the invention has at least one of the following beneficial effects:

the system effectively improves the compression efficiency of the 3D point cloud attribute signals, and the multi-scale structured dictionary model can adaptively depict the spatial irregularity of the point cloud structure and gradually improve the approximation precision of the high-frequency information of the signals.

The alternating optimization algorithm designed by the system can reduce the complexity of the traditional calculation and improve the convergence speed of dictionary learning while ensuring the approximate optimal solution.

The compression framework of the system can carry out customized quantization, prediction and entropy coding on the sparse coding coefficient of the hierarchical structure, thereby further improving the compression performance.

Compared with the analytic transformation based on graph Fourier or regional adaptive wavelet, the system of the invention has adaptivity to the training data, thereby effectively improving the reconstruction quality; compared with the MPEG PCC international standard, 3D-2D plane projection is not needed, projection distortion is not introduced, an attribute transformation base is not constructed independently depending on geometric information, and the learned multi-scale dictionary atoms can effectively utilize the statistical characteristics and the prior structured information of attribute signals, so that the method obtains remarkable performance gain.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:

FIG. 1 is a block diagram of a preferred embodiment of the present invention;

FIG. 2 is a schematic diagram of a point cloud data partitioning module according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an attribute signal encoding module according to an embodiment of the invention.

Detailed Description

The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that persons skilled in the art can make variations and modifications without departing from the spirit of the invention, and all of these fall within the scope of the present invention.

Fig. 1 is a block diagram illustrating a 3D point cloud compression system based on multi-scale structured dictionary learning according to an embodiment of the present invention. The system in this embodiment comprises: the device comprises a point cloud data partitioning module, a geometric information encoding module, a geometric information decoding module, an attribute signal encoding module, an attribute signal compression module, an attribute signal decoding module and a 3D point cloud reconstruction module, wherein: the point cloud data partitioning module is connected with the geometric information coding module to transmit voxel sets after point cloud partitioning, the point cloud data partitioning module is connected with the attribute signal coding module to transmit partitioned voxel block sets with different scales, the geometric information coding module is connected with the geometric information decoding module to transmit coded geometric information bit streams, the geometric information decoding module is connected with the attribute signal coding module to transmit decoded geometric information, the geometric information decoding module is connected with the 3D point cloud reconstruction module to transmit decoded geometric information, the attribute signal coding module is connected with the attribute signal compression module to transmit a coefficient matrix of sparse coding and a learned multi-scale structured dictionary, the attribute signal coding module is connected with the attribute signal decoding module to transmit a learned multi-scale structured dictionary, and the attribute signal compression module is connected with the attribute signal decoding module to transmit compressed attribute signal bit streams, the attribute signal decoding module is connected with the 3D point cloud reconstruction module to transmit the decoded attribute signals, and the 3D point cloud reconstruction module outputs finally reconstructed point cloud data.

The embodiment of the invention effectively improves the compression efficiency of the 3D point cloud attribute signal, and the multi-scale structured dictionary can adaptively depict the spatial irregularity of the point cloud structure and gradually improve the approximation precision of the high-frequency information of the signal.

As shown in fig. 2, a schematic diagram of a point cloud data partitioning module in a preferred embodiment is shown, in which the point cloud data partitioning module includes: a voxel division sub-module and a block division sub-module, wherein: the voxel division submodule is connected with the geometric information coding module and transmits a voxel set after point cloud division, the voxel division submodule is connected with the block division submodule and transmits a voxel set after point cloud division, and the block division submodule is connected with the attribute signal coding module and transmits divided voxel block sets with different scales.

In a specific embodiment, the voxel division submodule recursively divides the bounding cube of the point cloud into voxel units according to an octree structure. Each octree node that contains point data is represented as a voxel unit, the floating-point coordinates of the points contained in each voxel unit are quantized to the integer coordinates of the voxel center, and the average of the attribute information of the contained points is used as the attribute signal of the voxel unit. Typically, the number of octree levels, that is, the number of subdivisions, is set to 9 or 10, and the corresponding voxel resolutions N × N × N of the point cloud geometry are 512 × 512 × 512 and 1024 × 1024 × 1024, respectively. Of course, in other embodiments, other numbers of subdivisions and voxel resolutions may be selected, and the invention is not limited to the above parameters.
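The voxelization step described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the function name `voxelize`, the use of integer voxel indices in place of voxel-center coordinates, and the bounding-cube normalization are all hypothetical choices made for the example.

```python
from collections import defaultdict


def voxelize(points, colors, depth):
    """Quantize points onto a (2**depth)^3 grid and average attributes per voxel.

    Assumption: the bounding cube is anchored at the per-axis minimum and its
    side equals the largest axis extent of the cloud.
    """
    n = 2 ** depth
    mins = [min(p[a] for p in points) for a in range(3)]
    maxs = [max(p[a] for p in points) for a in range(3)]
    # Guard against a degenerate cloud whose extent is zero.
    side = max(maxs[a] - mins[a] for a in range(3)) or 1.0
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    counts = defaultdict(int)
    for p, c in zip(points, colors):
        # Integer voxel index, clamped so points on the far face stay inside.
        key = tuple(min(int((p[a] - mins[a]) / side * n), n - 1) for a in range(3))
        for ch in range(3):
            sums[key][ch] += c[ch]
        counts[key] += 1
    # The attribute of each occupied voxel is the mean of its points' attributes.
    return {k: tuple(s / counts[k] for s in sums[k]) for k in sums}


points = [(0.0, 0.0, 0.0), (0.1, 0.1, 0.1), (1.0, 1.0, 1.0)]
colors = [(10, 20, 30), (30, 40, 50), (200, 100, 0)]
voxels = voxelize(points, colors, depth=2)
```

With `depth=2` the first two points fall into the same voxel and their colors are averaged, while the third point occupies the opposite corner of the grid.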

The block division submodule uniformly divides the set of N × N × N voxel units into voxel blocks of size m × m × m (m < N) and, according to the preset number of scales K, further decreases the scale layer by layer to obtain voxel blocks at scales k = 1, …, K [per-scale block-size expression not reproduced in the source text]; all the resulting voxel blocks constitute the multi-scale voxel block set.
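As a rough illustration of multi-scale block partitioning, the sketch below groups occupied voxels into cubic blocks at K scales. The per-scale block size is not legible in this text, so the code ASSUMES that the block side halves at each scale (side = m / 2^(k-1)), which matches an octree-style subdivision but is not confirmed by the source. The function name `partition_blocks` is hypothetical.

```python
from collections import defaultdict


def partition_blocks(voxels, m, num_scales):
    """Group occupied voxel coordinates into cubic blocks at scales k = 1..K.

    ASSUMPTION (not stated in the source): the block side at scale k is
    m // 2**(k - 1), i.e. it halves from one scale to the next.
    """
    scales = {}
    for k in range(1, num_scales + 1):
        side = m // 2 ** (k - 1)
        if side < 1:
            raise ValueError("m is too small for the requested number of scales")
        blocks = defaultdict(list)
        for (x, y, z) in voxels:
            # A block is identified by the integer index of the cell it covers.
            blocks[(x // side, y // side, z // side)].append((x, y, z))
        scales[k] = dict(blocks)
    return scales


occupied = [(0, 0, 0), (1, 1, 1), (3, 3, 3), (4, 0, 0)]
multi_scale = partition_blocks(occupied, m=4, num_scales=3)
```

Because only occupied voxels are stored, blocks at the same scale generally contain different numbers of voxels, which is the dimension irregularity that the weight matrix in the dictionary learning step is meant to capture.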

As shown in fig. 2, as a preferred embodiment, in the geometric information encoding module and the geometric information decoding module: the geometric information coding module and the geometric information decoding module are connected to transmit coded geometric information bit streams, the geometric information decoding module and the attribute signal coding module are connected to transmit decoded geometric information, and the geometric information decoding module and the 3D point cloud reconstruction module are connected to transmit decoded geometric information.

Specifically, the geometric information encoding module allocates an eight-bit byte unit to each branch node of the octree structure obtained by voxel division. Each bit indicates whether point data of the point cloud exists in the spatial cell corresponding to one child node of that branch node, where 1 represents existence and 0 represents nonexistence. The octree structure is traversed in depth-breadth priority order, the sequence of all byte units forms the geometric information codeword, the codeword is further compressed by entropy coding to obtain the geometric information bit stream, and the bit stream is transmitted to the geometric information decoding module.

The geometric information decoding module carries out entropy decoding on the geometric coding bit stream to obtain a coded octree structure and byte units of each branch node, and further obtains the geometric coordinate information of each voxel unit, and transmits the information to the attribute signal coding module and the 3D point cloud reconstruction module.
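The occupancy-byte representation and its lossless inversion can be sketched as follows. This is a simplified illustration: it ASSUMES a level-by-level (breadth-first) traversal and a particular bit-to-child mapping, neither of which is fixed by the text, and it omits the entropy coding stage entirely. The names `encode_octree`, `decode_octree` and `child_origin` are hypothetical.

```python
def child_origin(origin, idx, half):
    """Origin of child idx; bits 2, 1, 0 of idx select the x, y, z halves."""
    ox, oy, oz = origin
    return (ox + half * ((idx >> 2) & 1),
            oy + half * ((idx >> 1) & 1),
            oz + half * (idx & 1))


def encode_octree(voxels, depth):
    """Emit one occupancy byte per branch node, level by level (assumed order)."""
    stream = []
    level = [((0, 0, 0), sorted(set(voxels)))]  # (node origin, voxels inside)
    for d in range(depth):
        half = 2 ** (depth - d - 1)  # side length of each child cell
        next_level = []
        for origin, pts in level:
            children = [[] for _ in range(8)]
            for (x, y, z) in pts:
                idx = ((((x - origin[0]) // half) << 2)
                       | (((y - origin[1]) // half) << 1)
                       | ((z - origin[2]) // half))
                children[idx].append((x, y, z))
            byte = 0
            for idx, pts_in_child in enumerate(children):
                if pts_in_child:
                    byte |= 1 << idx  # bit set to 1: this child is occupied
                    next_level.append((child_origin(origin, idx, half), pts_in_child))
            stream.append(byte)
        level = next_level
    return bytes(stream)


def decode_octree(stream, depth):
    """Rebuild the occupied voxel coordinates from the occupancy bytes."""
    level = [(0, 0, 0)]
    pos = 0
    for d in range(depth):
        half = 2 ** (depth - d - 1)
        next_level = []
        for origin in level:
            byte = stream[pos]
            pos += 1
            for idx in range(8):
                if (byte >> idx) & 1:
                    next_level.append(child_origin(origin, idx, half))
        level = next_level
    return sorted(level)


voxels = [(0, 0, 0), (3, 3, 3), (1, 0, 2)]
stream = encode_octree(voxels, depth=2)
decoded = decode_octree(stream, depth=2)
```

The round trip returns exactly the input voxel coordinates, which is what makes the geometry path lossless; the byte sequence would then be passed to an entropy coder.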

As shown in fig. 3, in a preferred embodiment, the attribute signal encoding module includes: the system comprises a multi-scale dictionary learning submodule and a layered structure sparse coding submodule, wherein the multi-scale dictionary learning submodule is connected with the layered structure sparse coding submodule to transmit a learned multi-scale structured dictionary, and the layered structure sparse coding submodule is connected with an attribute signal compression module to transmit a coefficient matrix of sparse coding.

Specifically, in a preferred embodiment, the multi-scale dictionary learning submodule takes the training set $X = \{x_1, \dots, x_n\}$ formed by voxel blocks of different scales and introduces a non-uniform weight matrix $M$ to characterize the dimension irregularity of the voxel-block attribute signals. The number of dictionary scales is set equal to the number of signal scales $K$, the initial multi-scale dictionary is $D = \{D^1, \dots, D^K\}$, and the initial dictionary atoms are arranged into a tree structure in descending order of scale. Dictionary atoms of different scales are zero-padded to obtain the corresponding dictionary matrix $D = [d_1, \dots, d_p]$, and the optimization objective is formulated as a non-uniformly weighted minimization problem with mixed $\ell_1/\ell_2$ norm regularization:

[objective function not reproduced in the source text]

where $D$ is the dictionary, $A$ is the sparse coefficient matrix, $\alpha_i$ is the column of $A$ holding the sparse coefficients of signal $x_i$ on dictionary $D$, $\lambda$ is the regularization parameter, and $\mathcal{C}$ is the convex set of matrices whose column vectors have $\ell_2$ norm not greater than 1.

$\Omega$ is the regularization term of the hierarchical sparsity constraint, where $\mathcal{G}$ is the set of groups $g$ of dictionary atoms, $w_g$ is the weighting parameter of group $g$, and $\alpha_g$ is the sub-vector of the sparse coefficient vector $\alpha$ indexed by group $g$. Because the objective is non-convex, the two variables are optimized alternately to obtain an approximately optimal solution: the dictionary $D$ and the sparse coefficients $A$ are updated alternately by a Gauss-Seidel iteration, which improves the convergence rate of learning while keeping the approximation error under control. Owing to the regularization term, the learned dictionary atoms naturally arrange themselves into a tree structure, and the signal frequency represented by the atoms increases as the atom scale decreases layer by layer from coarse to fine.
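The objective function itself is not legible in this text. One form that is consistent with the symbols defined above is given below as a hedged reconstruction, not as the patent's exact formula. The per-signal weight $M_i$ (the part of $M$ that applies to $x_i$), the $1/n$ and $1/2$ scalings, and the symbol names $\mathcal{C}$, $\Omega$ and $\mathcal{G}$ are assumptions.

```latex
\min_{D \in \mathcal{C},\, A} \;
  \frac{1}{n} \sum_{i=1}^{n}
  \left[ \frac{1}{2} \left\| M_i \left( x_i - D \alpha_i \right) \right\|_2^2
         + \lambda\, \Omega(\alpha_i) \right],
\qquad
\Omega(\alpha) = \sum_{g \in \mathcal{G}} w_g \left\| \alpha_g \right\|_2,
\qquad
\mathcal{C} = \left\{ D : \| d_j \|_2 \le 1,\; j = 1, \dots, p \right\}
```

Under this reading, $\Omega$ is an $\ell_1$ norm taken over the group-wise $\ell_2$ norms, which is the mixed $\ell_1/\ell_2$ regularizer named in the text. When each group consists of an atom together with its descendants in the tree, an atom can only be active if its ancestors are active, which would produce the coarse-to-fine tree arrangement described above.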

Specifically, in a preferred embodiment, the hierarchical sparse coding submodule solves for the hierarchical sparse coefficients of the attribute signals. Since the sparse coefficients of different signals do not depend on one another, they can be computed in parallel. For a test signal $x$, the target problem of its hierarchical sparse decomposition on the multi-scale dictionary $D$ is as follows:

[objective function not reproduced in the source text]

A local auxiliary variable $z_g$ is introduced for each group $g$, together with the corresponding equality constraint $z_g - \alpha_g = 0$, and the augmented Lagrangian of the target problem is established:

[augmented Lagrangian not reproduced in the source text]

where $y$ is the dual variable, $\rho > 0$ is the penalty parameter, and $P$ is a binary projection matrix. The alternating direction method of multipliers (ADMM) alternately optimizes the two primal variables $\alpha$ and $z$, then performs gradient ascent on the dual variable $y$, and the iteration repeats until the algorithm converges. The optimal $z$ is given by a group soft-thresholding operator, and the optimization of $\alpha$ is a convex quadratic program that can be solved directly from the KKT optimality conditions. Because the exact solution for $\alpha$ involves a matrix inversion, its computational complexity is too high for large-scale point cloud signals; to reduce the computational load, an approximate solution may instead be obtained with the steepest gradient descent method or the preconditioned conjugate gradient method.
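The two expressions referred to in this passage are not legible in this text. A hedged reconstruction that agrees with the stated variables ($z_g$, $y$, $\rho$, $P$) is sketched below. It ASSUMES that $P_g$ is the block of the binary projection matrix selecting the entries of $\alpha$ indexed by group $g$, so that $\alpha_g = P_g \alpha$, and that the data term is the weighted squared error used in dictionary learning; the symbols $\Omega$-free form, $\mathcal{G}$ and $S_\tau$ are naming choices for this sketch.

```latex
% Hierarchical sparse coding of one test signal x
\min_{\alpha} \; \frac{1}{2} \left\| M (x - D\alpha) \right\|_2^2
  + \lambda \sum_{g \in \mathcal{G}} w_g \left\| \alpha_g \right\|_2

% Augmented Lagrangian with z_g - P_g \alpha = 0
L_\rho(\alpha, z, y) = \frac{1}{2} \left\| M (x - D\alpha) \right\|_2^2
  + \sum_{g \in \mathcal{G}} \left[ \lambda w_g \| z_g \|_2
  + y_g^{\top} (z_g - P_g \alpha)
  + \frac{\rho}{2} \left\| z_g - P_g \alpha \right\|_2^2 \right]

% Resulting z and y updates (S_\tau is group soft-thresholding)
z_g \leftarrow S_{\lambda w_g / \rho}\!\left( P_g \alpha - y_g / \rho \right),
\qquad
y_g \leftarrow y_g + \rho \left( z_g - P_g \alpha \right),
\qquad
S_\tau(v) = \max\!\left( 0,\, 1 - \frac{\tau}{\|v\|_2} \right) v
```

In this form the $\alpha$ update minimizes a sum of quadratics, which is why it reduces to a linear system and can be approximated by a few gradient or conjugate gradient steps.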

The alternating optimization algorithm in the preferred embodiment of the invention can reduce the complexity of traditional calculation and improve the convergence rate of dictionary learning while ensuring the approximate optimal solution.
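The group soft-thresholding step used for the auxiliary variable can be sketched in a few lines of Python. The operator itself is the standard proximal operator of the $\ell_2$ norm. The surrounding `z_update` helper, its argument layout, and the example tree (a chain of three atoms whose groups are each atom plus its descendants) are assumptions made for illustration.

```python
import math


def group_soft_threshold(v, tau):
    """Shrink the vector v toward zero by tau in l2 norm (zero if ||v|| <= tau)."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm <= tau:
        return [0.0] * len(v)
    scale = 1.0 - tau / norm
    return [scale * c for c in v]


def z_update(alpha, y, groups, weights, lam, rho):
    """Assumed ADMM z-step: z_g = S_{lam * w_g / rho}(alpha_g - y_g / rho)."""
    z = []
    for g, w_g, y_g in zip(groups, weights, y):
        v = [alpha[j] - y_gj / rho for j, y_gj in zip(g, y_g)]
        z.append(group_soft_threshold(v, lam * w_g / rho))
    return z


# Hypothetical chain-shaped tree 0 -> 1 -> 2: each group is a node plus its descendants.
groups = [[0, 1, 2], [1, 2], [2]]
alpha = [0.0, 3.0, 4.0]
y = [[0.0, 0.0, 0.0], [0.0, 0.0], [0.0]]
z = z_update(alpha, y, groups, weights=[1.0, 1.0, 1.0], lam=1.0, rho=1.0)
```

Each group is either shrunk as a whole or set to zero as a whole, which is how the group regularizer switches entire subtrees of atoms off together.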

As shown in fig. 1, in a preferred embodiment, the attribute signal compression module includes: a sparse coefficient rearrangement sub-module, a uniform quantization sub-module, a differential coding prediction sub-module and an adaptive arithmetic entropy coding sub-module, wherein: the sparse coefficient rearrangement submodule is connected with the uniform quantization submodule to transmit a nonzero coefficient value and an index of the rearranged sparse coefficient matrix, the uniform quantization submodule is connected with the differential coding prediction submodule to transmit a quantized sparse coefficient value, the differential coding prediction submodule is connected with the self-adaptive arithmetic entropy coding submodule to transmit a residual error of coefficient index differential coding and a quantized sparse coefficient value, and the self-adaptive arithmetic entropy coding submodule is connected with the attribute signal decoding module to transmit an entropy coding bit stream of the attribute signal.

Specifically, the sparse coefficient rearrangement submodule rearranges the row vectors of the sparse coefficient matrix according to the decreasing order of the number of the non-zero elements, and simultaneously rearranges the column atom vectors of the multi-scale dictionary correspondingly, so that on the premise that the reconstruction signal is not changed, the entropy value of the subsequent index difference coding is reduced, and the compression ratio is effectively improved.
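The reason the rearrangement leaves the reconstruction unchanged is that the same permutation is applied to the rows of the coefficient matrix and to the columns of the dictionary. A minimal sketch, assuming toy random matrices (the sizes and the names `order`, `A_sorted`, `D_sorted` are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
D = rng.standard_normal((8, 6))          # toy dictionary: 6 atoms of dimension 8
A = rng.standard_normal((6, 5))          # toy coefficient matrix: 5 signals
A[rng.random(A.shape) < 0.6] = 0.0       # make the coefficients sparse

# Order atoms by how many non-zero coefficients use them (descending).
nnz_per_row = np.count_nonzero(A, axis=1)
order = np.argsort(-nnz_per_row, kind="stable")

A_sorted = A[order, :]    # permute rows of the coefficient matrix
D_sorted = D[:, order]    # permute dictionary columns the same way

X = D @ A
X_sorted = D_sorted @ A_sorted           # identical reconstruction
```

After sorting, frequently used atoms occupy the smallest row indices, so the index differences that are coded later tend to be small.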

Specifically, the uniform quantization submodule quantizes the sparse coefficient matrix A into an integer-valued matrix A_q using a dead-zone uniform quantizer whose zero interval is twice the size of the other uniform quantization intervals. This eliminates very small sparse coefficients that carry little information and significantly improves compression efficiency.
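A dead-zone uniform quantizer with a zero interval twice as wide as the other intervals can be sketched as below. This is an illustration under assumptions: the function name `deadzone_quantize` and the step size are hypothetical, and the embodiment does not state how the step is chosen.

```python
import numpy as np

def deadzone_quantize(a, step):
    """Map real coefficients to integers; every |a| < step falls in the zero bin."""
    a = np.asarray(a, dtype=float)
    # floor(|a| / step) gives bins of width `step`, except the zero bin,
    # which spans (-step, step) and is therefore twice as wide.
    return (np.sign(a) * np.floor(np.abs(a) / step)).astype(np.int64)

coeffs = np.array([-2.6, -0.9, -0.3, 0.0, 0.4, 0.99, 1.0, 3.7])
q = deadzone_quantize(coeffs, step=1.0)
```

With a step of 1.0, all coefficients with magnitude below 1.0 are mapped to zero and therefore never need to be transmitted.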

Specifically, the differential coding prediction sub-module performs differential coding on the index of the sparse coefficient matrix by columns, marks the end of each column as zero, and further reduces information redundancy.
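The column-wise differential coding of indices with a zero end-of-column marker can be sketched as follows. This is one possible reading, not necessarily the exact symbol layout of the embodiment: it assumes 1-based row positions so that every index difference is at least 1 and the symbol 0 is unambiguous. The function names are hypothetical.

```python
import numpy as np

def encode_indices(A_q):
    """Column-wise delta coding of non-zero row positions; 0 terminates a column."""
    symbols = []
    for col in A_q.T:
        prev = 0
        for row in np.flatnonzero(col) + 1:   # 1-based row positions
            symbols.append(int(row - prev))   # always >= 1
            prev = row
        symbols.append(0)                     # end-of-column marker
    return symbols

def decode_indices(symbols):
    """Inverse of encode_indices: list of 0-based row indices per column."""
    columns, current, prev = [], [], 0
    for s in symbols:
        if s == 0:
            columns.append(current)
            current, prev = [], 0
        else:
            prev += s
            current.append(prev - 1)
    return columns

A_q = np.array([[3, 0, 0],
                [0, 0, 1],
                [2, 0, 0],
                [0, 0, 4]])
symbols = encode_indices(A_q)
```

For this toy matrix the symbol stream is `[1, 2, 0, 0, 2, 2, 0]`, and `decode_indices(symbols)` recovers the row indices of each column.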

Specifically, the adaptive arithmetic entropy coding sub-module performs effective entropy coding on the quantized nonzero coefficient value and the indexed differential prediction residual to obtain a complete bit stream compressed by the point cloud attribute signal.
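To illustrate why an adaptive model helps on skewed symbol streams, the sketch below computes the ideal code length that an adaptive arithmetic coder would approach. It is not an arithmetic coder, and the count-based model (all counts initialized to 1) is an assumption; the embodiment does not specify its probability model.

```python
import math
from collections import Counter

def adaptive_code_length(symbols, alphabet):
    """Ideal bit cost under an adaptive frequency model (all counts start at 1)."""
    counts = Counter({s: 1 for s in alphabet})
    total = len(counts)
    bits = 0.0
    for s in symbols:
        bits += -math.log2(counts[s] / total)  # cost before the model sees s
        counts[s] += 1                         # then update the model
        total += 1
    return bits

# Hypothetical skewed binary stream: 90 zeros followed by 10 ones.
skewed = [0] * 90 + [1] * 10
bits = adaptive_code_length(skewed, alphabet=[0, 1])
```

A fixed one-bit-per-symbol code would spend 100 bits on this stream, whereas the adaptive model needs noticeably fewer because it learns that zeros dominate.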

As shown in fig. 1, in a preferred embodiment, the attribute signal decoding module includes an adaptive arithmetic entropy decoding submodule, a differential decoding prediction submodule, an inverse quantization submodule and an attribute signal reconstruction submodule, wherein: the adaptive arithmetic entropy decoding submodule is connected with the differential decoding prediction submodule to transmit the decoded sparse coefficients and the differential prediction residual of the indices; the differential decoding prediction submodule is connected with the inverse quantization submodule to transmit the decoded indices of the sparse coefficients; the inverse quantization submodule is connected with the attribute signal reconstruction submodule to transmit the inverse-quantized sparse coefficient values and the decoded coefficient indices; and the attribute signal reconstruction submodule is connected with the 3D point cloud reconstruction module to transmit the reconstructed point cloud attribute signal.

Specifically, the adaptive arithmetic entropy decoding submodule performs entropy decoding on the compressed bit stream to obtain the decompressed sparse coefficients and the differential prediction residual of the indices.

Specifically, the differential decoding prediction submodule decodes the index residual to obtain an index of a sparse coefficient.

Specifically, the inverse quantization submodule performs an inverse quantization operation on the quantized sparse coefficient values A_q to recover the sparse coefficient values, and combines them with the indices of the sparse coefficients to obtain the complete sparse coefficient matrix A.

Specifically, the attribute signal reconstruction submodule multiplies the multi-scale dictionary by the reconstructed sparse coefficient matrix, X = DA, to obtain the reconstructed point cloud attribute signal.
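The last two decoding steps (inverse quantization and X = DA) can be sketched together. This is an illustration under assumptions: the function name `reconstruct`, the triplet layout and the reconstruction `offset` inside each quantization bin are hypothetical, since the embodiment does not state where in the bin a level is reconstructed.

```python
import numpy as np

def reconstruct(D, values_q, rows, cols, n_signals, step, offset=0.5):
    """Rebuild A from (value, row, col) triplets and return X = D A."""
    values_q = np.asarray(values_q)
    # Place each decoded non-zero level inside its bin (offset is an assumption).
    values = np.sign(values_q) * (np.abs(values_q) + offset) * step
    A = np.zeros((D.shape[1], n_signals))
    A[rows, cols] = values
    return D @ A

# Toy example with an identity dictionary, so X equals the recovered A.
D = np.eye(3)
X = reconstruct(D, values_q=[2, -1], rows=[0, 2], cols=[0, 1],
                n_signals=2, step=0.5)
```

Only non-zero levels are placed, so coefficients removed by the dead zone stay exactly zero in the reconstructed matrix.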

The compression framework in the preferred embodiment of the invention can carry out customized quantization, prediction and entropy coding on the sparse coding coefficient of the hierarchical structure, thereby further improving the compression performance.

As shown in fig. 1, the 3D point cloud reconstruction module synthesizes the decoded geometric information and attribute signals to reconstruct complete point data, and the module obtains the final reconstructed 3D point cloud data.

The present invention can be implemented by using the prior art for the parts which are not specifically described in the above embodiments.

On the basis of the 3D point cloud compression system of the above embodiment, the following description is made in conjunction with specific application embodiments:

The key parameters in this embodiment are set as follows. Following the MPEG PCC common test conditions for point cloud compression, the test point cloud data used in the experiment comprise 5 point clouds with a geometric resolution of 512 × 512 × 512 and 8 point clouds with a geometric resolution of 1024 × 1024 × 1024, covering various data types such as half-body and full-body human targets, building surfaces, cultural relics, and natural scenes. The training data are drawn from a large number of data sets and are divided by content into two classes, "static targets & scenes" and "human bodies"; a multi-scale dictionary is trained separately for each class. The training data and the test data do not overlap. Because the human eye is more sensitive to changes in luminance, the original point cloud is converted from the RGB color space to the YUV color space. The resolution of each voxel block is set to m × m × m = 8 × 8 × 8, and the number of signal scales is set to K = 2, i.e., the multi-scale training contains signals at two scales, 8 × 8 × 8 and 4 × 4 × 4. Correspondingly, the multi-scale dictionary has 2 scales with 512 and 64 dictionary atoms respectively, and the zero-padded dictionary matrix contains p = 512 + 64 × 2³ = 1024 dictionary atoms. The group weight parameter w_g takes values in {2^-2, 2^-1, 2^0, 2^1, 2^2}^k, where k = 1, 2 denotes the scale of the atoms. The regularization parameter λ and the penalty parameter ρ take values in {10^-2, 10^-1, 10^0, 10^1, 10^2}. The best parameter combination is selected by grid search. The dictionary is initialized with DCT bases. The dictionary learning algorithm performs 20 iteration cycles on the training data set, each cycle comprising 1000 ADMM iterations and 10 Gauss-Seidel iterations.
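The atom count p = 512 + 64 × 2³ = 1024 can be checked with a small sketch. It rests on an assumption that is not spelled out in this passage: that the 64 fine-scale atoms are zero-padded to the full block size once for each of the 2³ = 8 sub-block positions. The random matrices are placeholders only (the embodiment initializes with DCT bases), and all variable names are hypothetical.

```python
import numpy as np

m = 8
fine_side = m // 2                    # 4
coarse_atoms = m ** 3                 # 512 atoms covering the full 8x8x8 block
fine_atoms = fine_side ** 3           # 64 atoms for a 4x4x4 sub-block

rng = np.random.default_rng(0)
D_coarse = rng.standard_normal((m ** 3, coarse_atoms))      # placeholder values
D_fine = rng.standard_normal((fine_side ** 3, fine_atoms))  # placeholder values

voxel_ids = np.arange(m ** 3).reshape(m, m, m)
padded = []
for i in range(0, m, fine_side):
    for j in range(0, m, fine_side):
        for k in range(0, m, fine_side):
            rows = voxel_ids[i:i + fine_side,
                             j:j + fine_side,
                             k:k + fine_side].ravel()
            block = np.zeros((m ** 3, fine_atoms))
            block[rows, :] = D_fine   # zero outside this sub-block
            padded.append(block)

D = np.hstack([D_coarse] + padded)    # coarse atoms followed by padded fine atoms
p = D.shape[1]
```

Under this assumption the stacked dictionary has 512 rows (one per voxel of the 8 × 8 × 8 block) and p = 1024 columns.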

The system of this embodiment is used to compress the test point cloud data, and the average BD-PSNR and BD-Rate are computed as measures of compression performance: a larger BD-PSNR indicates better reconstruction quality, and a smaller (more negative) BD-Rate indicates greater bit-rate savings.
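For reference, BD-Rate is commonly computed with the Bjøntegaard procedure: fit a cubic polynomial to log-rate as a function of PSNR for each codec, integrate both fits over the shared PSNR range, and convert the average difference to a percentage. The sketch below follows that common procedure; the text does not state which exact variant was used in the experiment, and the rate and PSNR values are made up for illustration.

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate difference (%) of test vs anchor at equal PSNR."""
    log_ra, log_rt = np.log10(rate_anchor), np.log10(rate_test)
    # Cubic fit of log-rate as a function of PSNR for each codec.
    poly_a = np.polyfit(psnr_anchor, log_ra, 3)
    poly_t = np.polyfit(psnr_test, log_rt, 3)
    # Integrate over the PSNR interval covered by both curves.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_a = np.polyval(np.polyint(poly_a), hi) - np.polyval(np.polyint(poly_a), lo)
    int_t = np.polyval(np.polyint(poly_t), hi) - np.polyval(np.polyint(poly_t), lo)
    avg_diff = (int_t - int_a) / (hi - lo)
    return (10.0 ** avg_diff - 1.0) * 100.0

# Hypothetical curves: the test codec needs half the bits at every PSNR.
psnr = [30.0, 33.0, 36.0, 39.0]
anchor_rate = [100.0, 200.0, 400.0, 800.0]
test_rate = [50.0, 100.0, 200.0, 400.0]
result = bd_rate(anchor_rate, psnr, test_rate, psnr)
```

For these made-up curves the result is approximately -50, i.e. a 50% bit-rate saving. BD-PSNR is obtained analogously by fitting PSNR as a function of log-rate and averaging the PSNR difference.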

Compared with the method proposed by C.Zhang (ICIP2014), the average BD-PSNR gain obtained by the system of the embodiment on all the test data is 2.34dB, and the average BD-Rate is-49.14%, which represents that the bit Rate is saved by 49.14% by the method; compared with the method proposed by R.L.de Queiroz (TIP2016), the average BD-PSNR gain obtained by the system of the embodiment on all the test data is 2.14dB, and the average BD-Rate is-52.55%, which represents that the method saves the bit Rate of 52.55%; compared with the standard test model of the MPEG 3DG international point cloud compression standard, the average BD-PSNR gain obtained by the system of the embodiment on all test data is 2.64dB, and the average BD-Rate is-15.37%, which represents that the bit Rate is saved by 15.37% by the method; compared with the MPEG 3DG international point cloud compression standard G-PCC, the average BD-PSNR gain obtained by the system of the embodiment on all the test data is 0.22dB, and the average BD-Rate is-0.22 percent, which represents that the bit Rate is saved by 0.22 percent by the method; compared with the MPEG 3DG international point cloud compression standard V-PCC, the average BD-PSNR gain obtained by the system of the embodiment on all the test data is 0.55dB, and the average BD-Rate is-4.16%, which represents that the bit Rate is saved by 4.16% by the method.

Experiments show that the compression efficiency of the system in the embodiment of the invention is obviously superior to that of the method proposed by C.Zhang and R.L.de Queiroz, and compared with the international point cloud compression standard, the method can obviously improve the reconstruction performance and save the coding bit rate.

The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.

Claims

1. A 3D point cloud compression system based on multi-scale structured dictionary learning, the system comprising: the device comprises a point cloud data partitioning module, a geometric information encoding module, a geometric information decoding module, an attribute signal encoding module, an attribute signal compression module, an attribute signal decoding module and a 3D point cloud reconstruction module, wherein:

the attribute signal compression module is used for rearranging, quantizing, predicting and entropy coding the sparse coding coefficient and transmitting a coded bit stream to an attribute signal decoding module;

2. The system of claim 1, wherein the point cloud data partitioning module comprises: a voxel division sub-module and a block division sub-module, wherein:

the voxel division submodule recursively divides a boundary cube where original point cloud data which are arranged in a disordered way are located into voxel units which are aligned in a space coordinate system and distributed uniformly, and transmits a voxel set obtained by division to the geometric information coding module and the block division submodule;

and the block division submodule uniformly divides the voxel set into voxel blocks with different scales and transmits the voxel block set to the attribute signal coding module.

3. The multi-scale structured dictionary learning-based 3D point cloud compression system according to claim 2, wherein the voxel division submodule recursively divides a boundary cube of the point cloud into voxel units according to an octree structure, each node of the octree containing point cloud point data is represented as a voxel unit, floating point precision coordinates of the points contained in each voxel unit are quantized into integer coordinates of voxel centers, and an average value of attribute information of the contained points is used as an attribute signal of the voxel unit.

4. The multi-scale structured dictionary learning-based 3D point cloud compression system as claimed in claim 2, wherein the block division submodule, according to a preset scale number K, uniformly divides the N × N × N voxel unit set into m × m × m (m < N) voxel blocks and further decreases the scale layer by layer to obtain voxel blocks at each of the K scales, all the obtained voxel blocks constituting a set of multi-scale voxel blocks.

5. The system of claim 3, wherein the geometric information encoding module assigns an eight-bit byte unit to each branch node in the voxel-partitioned octree structure, and is configured to indicate whether point data of the point cloud exists in a space unit corresponding to a child node of the branch node, and traverse the octree structure in a depth-breadth priority order to obtain encoded code words of the geometric information by using all byte units, further compress the information by using entropy encoding, obtain a geometric information encoded bit stream, and transmit the geometric information encoded bit stream to the geometric information decoding module.

6. The multi-scale structured dictionary learning-based 3D point cloud compression system as claimed in claim 5, wherein the geometric information decoding module performs entropy decoding on the geometric coding bit stream to obtain a coded octree structure and byte units of each branch node, and further obtains the geometric coordinate information of each voxel unit, and transmits the geometric coordinate information to the attribute signal coding module and the 3D point cloud reconstruction module.

7. The multi-scale structured dictionary learning-based 3D point cloud compression system as claimed in claim 1, wherein the attribute signal encoding module comprises: a multi-scale dictionary learning submodule and a hierarchical sparse coding submodule, wherein:

the multi-scale dictionary learning submodule introduces a non-uniform weight matrix to characterize the dimensional irregularity of the attribute signals of the voxel blocks of different scales in the training set, and adaptively learns a multi-scale structured dictionary by means of an alternating optimization algorithm;

and the hierarchical structure sparse coding submodule carries out transform coding on the attribute signals of the multi-scale voxel blocks of the test set by utilizing the hierarchical sparsity of the point cloud signals according to the learned multi-scale structured dictionary and transmits sparse coding coefficients to the attribute signal compression module.

8. The multi-scale structured dictionary learning-based 3D point cloud compression system as claimed in claim 7, wherein the multi-scale dictionary learning submodule introduces a non-uniform weight matrix to characterize the dimensional irregularity of the attribute signals of the voxel blocks, formulates the optimization objective as a non-uniformly weighted mixed l₁/l₂ norm regularized minimization problem, and alternately updates the dictionary atoms and the sparse coefficients by a Gauss-Seidel iteration method, which improves the convergence rate of learning while keeping the approximation error bounded; the learned dictionary atoms are naturally arranged into a tree structure, and the signal frequency represented by the atoms gradually increases as the atom scale decreases layer by layer from coarse to fine.

9. The 3D point cloud compression system based on multi-scale structured dictionary learning as claimed in claim 7, wherein the hierarchical sparse coding submodule uses the natural hierarchical structure prior of the point cloud signal to design a regularization term imposing a hierarchical sparsity constraint, and uses the alternating direction method of multipliers to efficiently sparse code the attribute signals of the voxel blocks in the test set on the basis of the multi-scale structured dictionary.

10. The system of claim 1, wherein the attribute signal compression module comprises: a sparse coefficient rearrangement sub-module, a uniform quantization sub-module, a differential coding prediction sub-module and an adaptive arithmetic entropy coding sub-module, wherein:

the sparse coefficient rearrangement submodule improves the compression ratio of subsequent coding by rearranging the coefficient matrix of the sparse coding and transmits the reordered sparse coefficient to the uniform quantization submodule;

the uniform quantization submodule quantizes the sparse coefficient value, and the differential coding prediction submodule differentially codes the index of the sparse coefficient;

and the adaptive arithmetic entropy coding submodule carries out entropy coding on the quantized sparse coefficients and the differentially predicted indices and transmits the obtained compressed bit stream to the attribute signal decoding module.

11. The multi-scale structured dictionary learning-based 3D point cloud compression system according to claim 10, wherein the sparse coefficient rearrangement submodule rearranges the row vectors of the sparse coefficient matrix in descending order of the number of non-zero elements and correspondingly rearranges the column atom vectors of the multi-scale dictionary, thereby reducing the entropy of the subsequent index differential coding while leaving the reconstructed signal unchanged.

12. The multi-scale structured dictionary learning-based 3D point cloud compression system as claimed in claim 10, wherein the uniform quantization sub-module utilizes a dead zone uniform quantizer to set the zeroing interval to be twice the size of the rest uniform quantization intervals and eliminate the tiny sparse coefficients carrying less information.

13. The multi-scale structured dictionary learning-based 3D point cloud compression system as claimed in claim 10, wherein the differential coding prediction sub-module differentially codes the index of the sparse coefficient matrix by columns, and marks the end of each column as zero, further reducing information redundancy.

14. The multi-scale structured dictionary learning-based 3D point cloud compression system as claimed in claim 10, wherein the adaptive arithmetic entropy coding sub-module performs effective entropy coding on the quantized non-zero coefficient value and the indexed differential prediction residual to obtain a complete bit stream of point cloud attribute signal compression.

15. The multi-scale structured dictionary learning-based 3D point cloud compression system as claimed in any one of claims 1 to 14, wherein the attribute signal decoding module comprises: the adaptive arithmetic entropy decoding sub-module, the differential decoding prediction sub-module, the inverse quantization sub-module and the attribute signal reconstruction sub-module, wherein:

the adaptive arithmetic entropy decoding submodule carries out entropy decoding on the compressed bit stream to obtain the decompressed sparse coefficients and the differential prediction residual of the indices, and transmits them to the differential decoding prediction submodule;

the differential decoding prediction submodule decodes the index residual error to obtain an index of a sparse coefficient and transmits the index of the sparse coefficient to the inverse quantization submodule;

the inverse quantization submodule carries out inverse quantization operation on the sparse coefficient value, restores the sparse coefficient value, integrates the sparse coefficient value with the index of the sparse coefficient to obtain a complete sparse coefficient matrix and transmits the sparse coefficient matrix to the attribute signal reconstruction submodule;

and the attribute signal reconstruction submodule performs matrix multiplication on the obtained reconstruction sparse coefficient matrix and the multi-scale dictionary to obtain a reconstructed point cloud attribute signal and transmits the reconstructed point cloud attribute signal to the 3D point cloud reconstruction module.

16. The multi-scale structured dictionary learning-based 3D point cloud compression system as claimed in any one of claims 1 to 14, wherein the 3D point cloud reconstruction module synthesizes the decoded geometric information and attribute signals to reconstruct complete point data, and the module obtains the final reconstructed 3D point cloud data.
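As an illustration of the occupancy bytes described in claim 5, the following sketch emits one eight-bit code per branch node of an octree built over integer voxel coordinates. It is a sketch under assumptions, not the claimed encoder: it uses a breadth-first traversal (the traversal order in the claim is not unambiguous in this text), the mapping of children to bit positions is hypothetical, and the subsequent entropy coding is omitted.

```python
import numpy as np
from collections import deque

def octree_occupancy_bytes(voxels, depth):
    """Breadth-first list of 8-bit child-occupancy codes for occupied nodes."""
    voxels = np.unique(np.asarray(voxels, dtype=np.int64), axis=0)
    codes = []
    queue = deque([(voxels, depth)])
    while queue:
        pts, level = queue.popleft()
        if level == 0:
            continue                      # leaf voxel: nothing more to signal
        bit = level - 1
        # Child index 0..7 from the current bit of x, y and z.
        child = ((((pts[:, 0] >> bit) & 1) << 2)
                 | (((pts[:, 1] >> bit) & 1) << 1)
                 | ((pts[:, 2] >> bit) & 1))
        code = 0
        for c in range(8):
            mask = child == c
            if mask.any():
                code |= 1 << c            # mark child c as occupied
                queue.append((pts[mask], level - 1))
        codes.append(code)
    return codes

# Two occupied voxels in a 4x4x4 grid (depth 2), at opposite corners.
codes = octree_occupancy_bytes([[0, 0, 0], [3, 3, 3]], depth=2)
```

For this toy input the root byte is 129 (children 0 and 7 occupied), followed by the bytes 1 and 128 of the two occupied children.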