CN114025146B - Dynamic point cloud geometric compression method based on scene flow network and time entropy model - Google Patents
Info
- Publication number
- CN114025146B (application CN202111285773.5A)
- Authority
- CN
- China
- Prior art keywords
- point cloud
- scene flow
- difference
- information
- compression
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/161—Encoding, multiplexing or demultiplexing different image signal components
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/91—Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
Abstract
The invention discloses a dynamic point cloud geometric compression method based on a scene flow network and a time entropy model. Aimed mainly at the geometric compression of dynamic point clouds, the invention uses a scene flow network to estimate the motion vectors of the previous-frame point cloud, thereby exploiting temporal redundancy; it treats the motion vectors as point cloud attributes and encodes them with the attribute compression scheme in MPEG to exploit spatial redundancy; it then introduces a time entropy model network to encode, in a latent space, the residual between the predicted frame and the current frame, thereby realizing geometric compression of dynamic point clouds. The method addresses the optimized compression of massive time-sequential dynamic point cloud data and provides technical support for wider application and popularization of three-dimensional dynamic point clouds.
Description
Technical Field
The invention relates to a dynamic point cloud geometric compression method, belongs to the technical field of artificial intelligence and GIS information, and particularly relates to a dynamic point cloud geometric compression method based on a scene flow network and a time entropy model.
Background
A point cloud is a set of sample points of a three-dimensional (or higher-dimensional) geometric model surface; each point contains geometric information (x, y, z) and corresponding attribute information such as color (r, g, b), reflectance, and transparency. A dynamic point cloud is a temporally continuous sequence of point clouds. Unlike mesh data, a point cloud contains no topological information in space and no point-to-point correspondence across time, and it contains considerable noise, so effectively removing spatial and temporal redundancy is difficult.
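For concreteness, a point-cloud frame as described above can be held as an N×3 geometry array plus optional per-point attribute arrays. The following container is a hypothetical sketch for illustration only, not part of the invention:

```python
import numpy as np

# Hypothetical minimal container for one point-cloud frame: (N, 3) geometry
# plus optional per-point attributes, mirroring the description above.
class PointCloudFrame:
    def __init__(self, xyz, colors=None):
        xyz = np.asarray(xyz, dtype=np.float32)
        assert xyz.ndim == 2 and xyz.shape[1] == 3, "geometry must be (N, 3)"
        self.xyz = xyz            # (N, 3): x, y, z per point
        self.colors = colors      # (N, 3): r, g, b per point, or None

    def num_points(self):
        return self.xyz.shape[0]

frame = PointCloudFrame(np.zeros((1000, 3)))
print(frame.num_points())  # 1000
```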
On the other hand, with the development of sensing devices, point clouds are ever easier to acquire and show huge application potential in fields such as immersive 3D telepresence, VR, free-viewpoint motion playback, and autonomous driving. Meanwhile, the data volume of high-resolution dynamic point clouds keeps growing, placing high demands on the storage and transmission capacity of hardware. Research on the compressed storage of dynamic point cloud data therefore has very important practical significance.
According to the available literature, many researchers at home and abroad have in recent years worked on dynamic point cloud compression and proposed a series of schemes, including XOR coding (encoding the difference between the octree structures of adjacent frames), graph-based dynamic point cloud compression, and methods based on ICP (iterative closest point) with intra-frame coding. All achieve compression to differing degrees, but their compression rates are low.
Motion estimation and residual compression are the key factors in dynamic point cloud geometric compression, but previously adopted motion estimation methods, such as graph-based and ICP-based estimation, have low accuracy, and previous residual compression methods, such as the XOR method and block-based intra-frame coding, have a large coding overhead.
Therefore, designing and realizing a compression method that can effectively remove the geometric redundancy of dynamic point clouds has strong practical significance and application value.
Disclosure of Invention
The invention mainly solves the problems existing in the prior art and provides a dynamic point cloud geometric compression method based on a scene flow network and a time entropy model.
The invention is solved by the following technical scheme:
step one: a motion estimation step based on the scene flow network, used to estimate the motion vectors of the previous-frame point cloud relative to the current-frame point cloud;
step two: a motion vector coding and motion compensation step, used to encode the motion vectors estimated in the previous step and to motion-compensate the previous-frame point cloud with the decoded motion vectors, obtaining the predicted point cloud;
step three: a residual compression step, used to encode the difference information between the predicted point cloud and the original point cloud.
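The three steps above can be sketched as an encoder loop. `DummyCodec` and all of its methods are illustrative placeholders (the real components are the scene flow network, the MPEG attribute codec, and the time entropy model network, none of which are reproduced here), and the toy flow assumes the two frames share point count and ordering, which real clouds do not:

```python
import numpy as np

class DummyCodec:
    """Placeholder components; in the invention these are learned networks
    and the MPEG attribute codec. Nothing here is the real implementation."""
    def estimate_scene_flow(self, prev, cur):
        return (cur - prev).astype(np.float32)      # toy per-point motion
    def encode_motion_vectors(self, mv):
        return mv.astype(np.float32).tobytes()      # stand-in bitstream
    def decode_motion_vectors(self, bits):
        return np.frombuffer(bits, dtype=np.float32).reshape(-1, 3)
    def encode_residual(self, predicted, cur):
        return (cur - predicted).astype(np.float32).tobytes()

def encode_frame(prev_decoded, current, codec):
    # Step 1: estimate motion of the previous frame toward the current frame
    motion = codec.estimate_scene_flow(prev_decoded, current)
    # Step 2: code the motion vectors, then compensate with the *decoded*
    # vectors so encoder and decoder form the same prediction
    mv_bits = codec.encode_motion_vectors(motion)
    predicted = prev_decoded + codec.decode_motion_vectors(mv_bits)
    # Step 3: code the residual between prediction and current frame
    res_bits = codec.encode_residual(predicted, current)
    return mv_bits, res_bits
```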
The invention has the following beneficial effects: by introducing the scene flow network, the motion vectors of the previous-frame point cloud can be estimated quickly and accurately, effectively removing temporal redundancy. Treating the motion vectors as a point cloud attribute and encoding them with the attribute compression scheme in MPEG encodes the motion vectors efficiently and effectively exploits spatial redundancy. Introducing the time entropy model network greatly reduces the residual coding cost. Finally, the whole framework uses sparse convolution networks, which greatly reduces memory use and increases running speed.
Drawings
FIG. 1 is an overall framework of dynamic point cloud geometric compression based on a scene flow network and a temporal entropy model provided by an embodiment of the invention.
Detailed Description
The technical scheme of the invention is further specifically described below through examples and with reference to the accompanying drawings.
Examples:
This embodiment provides a dynamic point cloud geometric compression method based on a scene flow network and a time entropy model; as shown in FIG. 1, the method specifically comprises the following steps:
step one: scene flow estimation
First, the decoded previous-frame point cloud and the current-frame point cloud are scaled and quantized, a fixed number of points is randomly sampled from each, and the sampled points are input into the scene flow network for processing. The sampled points pass through several layers of sparse convolution with stride 2 to extract multi-scale features of the point clouds; the scene flow information is then estimated in a bottom-up manner, with a scene flow estimation module estimating the scene flow information of each layer.
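The preprocessing described above (scaling, quantization, then random sampling of a fixed point budget) might be sketched as follows; the function name and parameter values are illustrative, not prescribed by the invention:

```python
import numpy as np

def preprocess(xyz, scale=0.5, n_samples=1000, seed=0):
    """Scale the cloud, quantize to integer voxel coordinates (removing
    duplicates), then randomly sample up to n_samples points for the
    scene flow network. All values here are illustrative defaults."""
    voxels = np.unique(np.floor(np.asarray(xyz) * scale).astype(np.int32), axis=0)
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(voxels), size=min(n_samples, len(voxels)), replace=False)
    return voxels[idx]
```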
The scene flow estimation module mainly uses a cost volume submodule and a scene flow predictor submodule to estimate the scene flow information. The cost volume submodule aggregates point-to-point similarities in a patch-to-patch fashion. The scene flow predictor submodule predicts the scene flow information of the current layer mainly from the features of the previous-frame point cloud, the features of the current-frame point cloud, the scene flow information upsampled from the previous layer, and the cost volume information. After the motion vectors of the sampled points are obtained, the motion vectors of all points of the previous frame are obtained by interpolation.
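One common way to propagate the sampled motion vectors to every point of the previous frame is inverse-distance-weighted k-nearest-neighbor interpolation; the patent does not fix the interpolation kernel, so the following is only an assumed sketch:

```python
import numpy as np

def interpolate_flow(sampled_xyz, sampled_flow, query_xyz, k=3, eps=1e-8):
    """Inverse-distance-weighted k-NN interpolation of motion vectors from
    the sampled points to arbitrary query points. Brute force for clarity;
    the kernel choice is an assumption, not taken from the patent."""
    out = np.empty((len(query_xyz), 3), dtype=np.float64)
    for i, p in enumerate(query_xyz):
        d2 = np.sum((sampled_xyz - p) ** 2, axis=1)   # squared distances
        nn = np.argpartition(d2, k)[:k]               # indices of k nearest
        w = 1.0 / (d2[nn] + eps)                      # inverse-distance weights
        out[i] = (w[:, None] * sampled_flow[nn]).sum(axis=0) / w.sum()
    return out
```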
Step two: motion vector compression and motion compensation
The motion vectors are compressed and decompressed using the attribute compression scheme in MPEG to obtain the decompressed motion vectors, which are then used to motion-compensate the decoded previous-frame point cloud, yielding the predicted point cloud.
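Schematically, and assuming matched point order, this step reduces to adding the decompressed vectors to the previous frame. The float16 round-trip below merely stands in for the external MPEG attribute codec and is not its actual behavior:

```python
import numpy as np

def fake_attribute_roundtrip(mv):
    """Stand-in for compressing/decompressing motion vectors as a 3-channel
    point attribute; the float16 round-trip mimics (but is not) codec loss."""
    return np.asarray(mv, dtype=np.float16).astype(np.float32)

def motion_compensate(prev_xyz, mv):
    """Add each decompressed motion vector to the corresponding point of
    the decoded previous frame to obtain the predicted point cloud."""
    mv_dec = fake_attribute_roundtrip(mv)
    return np.asarray(prev_xyz, dtype=np.float32) + mv_dec
```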
Step three: residual compression
The difference between the predicted point cloud and the original point cloud is encoded with the time entropy model network as follows. First, an encoder maps the predicted point cloud and the current-frame point cloud to latent variables Y1 and Y in a latent space. The latent variable Y is represented by its position information C_Y and the corresponding feature information F_Y. Taking the difference between Y1 and Y in the latent space yields the latent-space difference Y_res. The position information of Y is losslessly compressed using octree compression; encoding the difference Y_res is then handled as follows: the position information of Y_res can be obtained by differencing the position information in Y1 and Y, and the feature information of Y_res is first quantized and then losslessly compressed using arithmetic coding. The probability distribution of the feature information of the difference is assumed to follow a Gaussian mixture distribution, and the distribution of each component is approximated by a Gaussian (mean μ, variance σ), so only one network needs to be designed to obtain the Gaussian parameters.
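Under such a Gaussian entropy model, the expected code length of a quantized latent symbol is the negative log of the probability mass its unit-width quantization bin receives under N(μ, σ²); a stdlib-only sketch (the function name is illustrative):

```python
import math

def gaussian_bits(y_q, mu, sigma):
    """Estimated bits for quantized symbol y_q under N(mu, sigma^2):
    -log2 of the mass on the unit-width bin [y_q - 0.5, y_q + 0.5)."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    p = max(cdf(y_q + 0.5) - cdf(y_q - 0.5), 1e-12)   # clamp for stability
    return -math.log2(p)
```

Symbols near the predicted mean cost few bits while surprising ones cost many, which is why accurate (μ, σ) predictions reduce the residual coding cost.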
The information obtained by concatenating Y1 and Y passes through 2 layers of sparse convolution with stride 2 to obtain the latent variable Z. The feature information F_Z of Z is first quantized and then losslessly compressed using arithmetic coding, and the probability distribution information of F_Z is estimated with a fully factorized entropy model. The compressed feature information F_Z is arithmetically decoded to obtain the decoded latent variable Ẑ, which passes through 2 layers of sparse convolution with stride 2 to obtain Z1. The latent variable Y1 of the predicted point cloud passes through 3 layers of sparse convolution to obtain Y2, and the latent variable obtained by concatenating Y2 and Z1 passes through 3 layers of sparse convolution to estimate the probability distribution of the feature information of the difference.
The compressed position information of Y is octree-decoded to obtain the position information Ĉ_Y of the current point cloud's latent variable. Differencing Ĉ_Y with the position information of Y1 yields the position information of the difference Ŷ_res; arithmetically decoding the compressed feature information of the difference yields its feature information; together, the decoded position and feature information form the decoded difference Ŷ_res. Adding the decoded difference Ŷ_res to the latent variable Y1 of the predicted point cloud gives the latent variable Ŷ of the current point cloud, and Ŷ passes through a decoder to obtain the decoded point cloud.
Examples:
The data set used for testing in this embodiment is the dynamic point cloud sequence holder in MPEG. Following the overall flow of FIG. 1, the scene flow network and the time entropy model network are trained first; part of the point cloud data in AMASS is selected for training. The input current-frame point cloud and the decoded previous-frame point cloud are first scaled down by a factor of 2 and quantized, 100000 points are randomly sampled, and the samples are input into the scene flow network to estimate the motion vectors of the previous-frame point cloud. From the resulting motion vectors, the motion vectors of all points in the previous frame are obtained by interpolation and then scaled up by a factor of 2.
The motion vectors are then treated as attribute information of the point cloud and encoded using the lifting transform in MPEG to obtain a bit stream. The bit stream is then decoded with the lifting transform to obtain the decoded motion vectors, and the decoded previous-frame point cloud is motion-compensated with these vectors to obtain the predicted point cloud.
Finally, the current-frame point cloud and the predicted-frame point cloud are input into the time entropy model network to obtain the final decoded point cloud, which is placed into the decoded frame buffer.
The present method and other methods were each tested on the bpp, D1, and D2 metrics, and the results are summarized in the following table:
Here, bpp denotes the average number of bits needed to code each vertex (smaller is better); D1 denotes the point-to-point distortion metric (larger is better); and D2 denotes the point-to-plane distortion metric (larger is better). The results show that this embodiment achieves the best results under both distortion metrics D1 and D2 at the lowest bpp, so compared with previous methods the invention effectively improves the compression rate.
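These metrics can be computed as in the following sketch; the brute-force nearest-neighbor search and the PSNR peak normalization are illustrative (MPEG's D1/D2 tooling uses its own peak conventions), and D2 would additionally project errors onto surface normals:

```python
import numpy as np

def bpp(total_bits, num_points):
    """Bits per point: total coded bits divided by the vertex count."""
    return total_bits / num_points

def d1_psnr(ref, rec, peak=1.0):
    """Point-to-point (D1) geometry PSNR sketch: symmetric nearest-neighbor
    MSE between the two clouds, brute force for clarity (higher is better)."""
    def one_way_mse(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)  # (Na, Nb)
        return d2.min(axis=1).mean()
    mse = max(one_way_mse(ref, rec), one_way_mse(rec, ref))
    return 10.0 * np.log10(peak ** 2 / max(mse, 1e-12))
```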
Claims (3)
1. A dynamic point cloud geometric compression method based on a scene flow network and a time entropy model, characterized by comprising the following steps:
step one: a motion estimation step based on the scene flow network, used to estimate the motion vectors of the previous-frame point cloud relative to the current-frame point cloud;
step two: a motion vector coding and motion compensation step, used to encode the motion vectors estimated in the previous step and to motion-compensate the previous-frame point cloud with the decoded motion vectors, obtaining the predicted point cloud;
step three: a residual compression step, used to encode the difference information between the predicted point cloud and the original point cloud;
the scene flow estimation module in the scene flow network mainly uses a cost volume submodule and a scene flow predictor submodule to estimate the scene flow information, wherein the cost volume submodule aggregates point-to-point similarities in a patch-to-patch fashion, and the scene flow predictor submodule predicts the scene flow information of the current layer mainly from the features of the previous-frame point cloud, the features of the current-frame point cloud, the scene flow information upsampled from the previous layer, and the cost volume information;
the method comprises the steps of utilizing a time entropy model network to encode a difference value between a predicted point cloud and an original point cloud, and specifically comprises the following steps:
mapping the predicted point cloud and the current frame point cloud to hidden variables Y1 and Y in a hidden space by using an encoder;
taking the difference between Y1 and Y in the hidden space to obtain a difference Y in the hidden space res ;
Performing lossless compression on the Y position information by using an octree compression method;
for the difference Y res After the characteristic information of the (B) is quantized, lossless compression is carried out by utilizing arithmetic coding;
the information after Y1 and Y are spliced is subjected to 2-layer sparse convolution with the step length of 2 to obtain hidden variable Z, and characteristic information F of Z is obtained Z Quantization is performed first, then lossless compression is performed by using arithmetic coding, and F is estimated by using a full decomposition entropy model Z Probability distribution information of (2);
compressed characteristic information F Z After arithmetic decoding, the decoded hidden variable is obtained Obtaining Z1 through 2 layers of sparse convolution with the step length of 2;
hidden variable Y1 of the prediction point cloud is subjected to 3-layer sparse convolution to obtain Y2, and then the hidden variable spliced by Y2 and Z1 is subjected to 3-layer sparse convolution to estimate probability distribution of characteristic information of a difference value;
performing octree decoding on the compressed Y position information to obtain hidden variables of the current point cloudWill->Difference between the position information of Y1 and Y1, resulting in a difference +.>Is a part of the position information of the mobile terminal; the difference Y to be compressed res Performing arithmetic decoding on the characteristic information of (2) to obtain difference +.>Is a difference of the position information and the characteristic information of the decoding>
Decoded difference valueAdding hidden variable Y1 of the predicted point cloud to obtain hidden variable ++of the current point cloud>Hidden variable->The decoded point cloud is obtained through a decoder.
2. The dynamic point cloud geometric compression method based on a scene flow network and a time entropy model according to claim 1, characterized in that: the motion vectors are compressed and decompressed using the attribute compression scheme in MPEG to obtain the decompressed motion vectors, and the decompressed motion vectors are used to motion-compensate the decoded previous-frame point cloud to obtain the predicted point cloud.
3. The dynamic point cloud geometric compression method based on a scene flow network and a time entropy model according to claim 1, characterized in that: the position information of the difference Y_res is obtained by differencing the position information in Y1 and Y.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111285773.5A CN114025146B (en) | 2021-11-02 | 2021-11-02 | Dynamic point cloud geometric compression method based on scene flow network and time entropy model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111285773.5A CN114025146B (en) | 2021-11-02 | 2021-11-02 | Dynamic point cloud geometric compression method based on scene flow network and time entropy model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114025146A CN114025146A (en) | 2022-02-08 |
CN114025146B true CN114025146B (en) | 2023-11-17 |
Family
ID=80059612
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111285773.5A Active CN114025146B (en) | 2021-11-02 | 2021-11-02 | Dynamic point cloud geometric compression method based on scene flow network and time entropy model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114025146B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108322742A (en) * | 2018-02-11 | 2018-07-24 | 北京大学深圳研究生院 | A point cloud attribute compression method based on intra prediction
CN109196559A (en) * | 2016-05-28 | 2019-01-11 | 微软技术许可有限责任公司 | Motion-compensated compression of dynamic voxelized point clouds
CN110264502A (en) * | 2019-05-17 | 2019-09-20 | 华为技术有限公司 | Point cloud registration method and device |
CN111476822A (en) * | 2020-04-08 | 2020-07-31 | 浙江大学 | Laser radar target detection and motion tracking method based on scene flow |
CN111866521A (en) * | 2020-07-09 | 2020-10-30 | 浙江工商大学 | Video image compression artifact removing method combining motion compensation and generation type countermeasure network |
CN112862858A (en) * | 2021-01-14 | 2021-05-28 | 浙江大学 | Multi-target tracking method based on scene motion information |
CN113012063A (en) * | 2021-03-05 | 2021-06-22 | 北京未感科技有限公司 | Dynamic point cloud repairing method and device and computer equipment |
CN113281718A (en) * | 2021-06-30 | 2021-08-20 | 江苏大学 | 3D multi-target tracking system and method based on laser radar scene flow estimation |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10499054B2 (en) * | 2017-10-12 | 2019-12-03 | Mitsubishi Electric Research Laboratories, Inc. | System and method for inter-frame predictive compression for point clouds |
CN108632621B (en) * | 2018-05-09 | 2019-07-02 | 北京大学深圳研究生院 | A point cloud attribute compression method based on hierarchical division
WO2020197966A1 (en) * | 2019-03-22 | 2020-10-01 | Tencent America LLC | Method and apparatus for interframe point cloud attribute coding |
- 2021-11-02: CN application CN202111285773.5A, granted as patent CN114025146B (Active)
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109196559A (en) * | 2016-05-28 | 2019-01-11 | 微软技术许可有限责任公司 | Motion-compensated compression of dynamic voxelized point clouds
CN108322742A (en) * | 2018-02-11 | 2018-07-24 | 北京大学深圳研究生院 | A point cloud attribute compression method based on intra prediction
CN110264502A (en) * | 2019-05-17 | 2019-09-20 | 华为技术有限公司 | Point cloud registration method and device |
CN111476822A (en) * | 2020-04-08 | 2020-07-31 | 浙江大学 | Laser radar target detection and motion tracking method based on scene flow |
CN111866521A (en) * | 2020-07-09 | 2020-10-30 | 浙江工商大学 | Video image compression artifact removing method combining motion compensation and generation type countermeasure network |
CN112862858A (en) * | 2021-01-14 | 2021-05-28 | 浙江大学 | Multi-target tracking method based on scene motion information |
CN113012063A (en) * | 2021-03-05 | 2021-06-22 | 北京未感科技有限公司 | Dynamic point cloud repairing method and device and computer equipment |
CN113281718A (en) * | 2021-06-30 | 2021-08-20 | 江苏大学 | 3D multi-target tracking system and method based on laser radar scene flow estimation |
Non-Patent Citations (2)
Title |
---|
Viewpoint-based dynamic rendering of adaptive multi-level-of-detail models for three-dimensional point clouds; Kong Jianhong; Yang Chao; Yu Xiaohui; Qi Guangyuan; Wang Zekun; Science Technology and Engineering (12); full text *
A transmission mechanism based on predictive reconstruction models for lossy mobile networks; Yang Bailin; Zhang Zhiyong; Wang Xun; Pan Zhigeng; Journal of Computer-Aided Design & Computer Graphics (01); full text *
Also Published As
Publication number | Publication date |
---|---|
CN114025146A (en) | 2022-02-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022063055A1 (en) | 3d point cloud compression system based on multi-scale structured dictionary learning | |
WO2019210531A1 (en) | Point cloud attribute compression method based on deleting 0 elements in quantisation matrix | |
WO2019153326A1 (en) | Intra-frame prediction-based point cloud attribute compression method | |
CN109889839B (en) | Region-of-interest image coding and decoding system and method based on deep learning | |
WO2019213986A1 (en) | Multi-angle adaptive intra-frame prediction-based point cloud attribute compression method | |
CN111386551A (en) | Method and device for predictive coding and decoding of point clouds | |
CN110691243A (en) | Point cloud geometric compression method based on deep convolutional network | |
CN108174218B (en) | Video coding and decoding system based on learning | |
JP2015504545A (en) | Predictive position coding | |
KR20140089426A (en) | Predictive position decoding | |
CN110602494A (en) | Image coding and decoding system and method based on deep learning | |
CN113613010A (en) | Point cloud geometric lossless compression method based on sparse convolutional neural network | |
CN112866694A (en) | Intelligent image compression optimization method combining asymmetric volume block and condition context | |
CN108028945A (en) | The apparatus and method of conversion are performed by using singleton coefficient update | |
CN109166160B (en) | Three-dimensional point cloud compression method adopting graph prediction | |
CN114025146B (en) | Dynamic point cloud geometric compression method based on scene flow network and time entropy model | |
CN117354523A (en) | Image coding, decoding and compressing method for frequency domain feature perception learning | |
Bletterer et al. | Point cloud compression using depth maps | |
CN104282030A (en) | Image compression device and method | |
CN115393452A (en) | Point cloud geometric compression method based on asymmetric self-encoder structure | |
Wei et al. | Enhanced intra prediction scheme in point cloud attribute compression | |
CN115239563A (en) | Point cloud attribute lossy compression device and method based on neural network | |
Hajizadeh et al. | Predictive compression of animated 3D models by optimized weighted blending of key‐frames | |
CN110349228B (en) | Triangular mesh compression method for data-driven least square prediction | |
Boulfani-Cuisinaud et al. | Motion-based geometry compensation for DWT compression of 3D mesh sequences |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |