CN114092579A - Point cloud compression method based on implicit neural network - Google Patents
- Publication number
- CN114092579A (application CN202111357338.9A)
- Authority
- CN
- China
- Prior art keywords
- network
- implicit
- hidden
- decoder
- variables
- Prior art date
- 2021-11-16
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/002—Image coding using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a point cloud compression method based on an implicit neural network. First, a class of shapes in a dataset is given and divided into a training set and a test set, and the mesh models in the original dataset are preprocessed to obtain SDF values. Second, an overall network framework based on an auto-decoder fused with an entropy model is designed; its inputs are the observation information and the latent variables quantized by the entropy model, and its output is the estimated SDF value of each query point. Model training and inference then yield a latent variable representing a single shape, which is finally compressed into a binary string to further increase the compression rate and transmitted to the decoding end. The invention needs no complex 3D convolutions to process 3D shapes: the implicit network is expressed by a simple MLP, so the structure is simpler.
Description
Technical Field
The invention belongs to the field of three-dimensional reconstruction and point cloud compression, and particularly relates to a point cloud compression method based on an implicit neural network.
Background
In recent years, the acquisition and application of point cloud data have become increasingly diverse; point clouds now appear in fields such as autonomous driving, VR and AR. Compared with meshes or voxels, point clouds have no complex topological structure and can be acquired directly by lidar sensors. However, a point cloud usually contains massive amounts of data, and storing or transmitting it directly consumes large amounts of memory and network bandwidth, so an efficient point cloud compression scheme is necessary.
Traditional mainstream point cloud compression methods are based on two compression frameworks proposed by MPEG: V-PCC and G-PCC. V-PCC uses video compression technology to compress dynamic point cloud sequences in real time, while G-PCC compresses geometric attributes of point clouds, such as color and normals, and is usually used for static point clouds. In contrast, most existing deep learning methods are based on a VAE framework: the original point cloud is downsampled into a latent space through multiple convolutional layers; combined with different entropy models, the probability distribution of the latent space is learned during training; the encoder output is quantized and entropy-coded into a bitstream; and the decoding end upsamples from the bitstream through the decoder to finally reconstruct a new point cloud. Deep-learning-based point cloud compression further improves compression rate and accuracy over the traditional methods. However, existing deep learning methods have large numbers of network parameters and can only compress and reconstruct at a fixed resolution, resulting in poor scalability.
Since 2019, implicit function neural networks have attracted wide attention in the field of 3D shape representation. The main idea of an implicit function is that, given observation information of an original shape and the position of a query point, the network determines whether the query point lies inside or outside the shape; a network composed of multilayer perceptrons learns to fit the implicit surface, and a new shape can finally be reconstructed with the marching cubes algorithm.
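As an illustration of that last step, the following sketch (a minimal PyTorch / scikit-image example; `sdf_net` is a hypothetical trained MLP, not the patent's network) evaluates an implicit SDF network on a dense grid and extracts the zero level set with marching cubes:

```python
import numpy as np
import torch
from skimage.measure import marching_cubes

def reconstruct_mesh(sdf_net: torch.nn.Module, resolution: int = 64):
    # Regular grid over the unit cube [-1, 1]^3.
    axis = np.linspace(-1.0, 1.0, resolution, dtype=np.float32)
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    points = torch.from_numpy(grid.reshape(-1, 3))

    # Query the implicit network at every grid point.
    with torch.no_grad():
        sdf = sdf_net(points).reshape(resolution, resolution, resolution).numpy()

    # The shape's surface is the zero level set of the SDF.
    verts, faces, _, _ = marching_cubes(sdf, level=0.0)
    return verts, faces  # verts are in grid-index coordinates
```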
An implicit network is defined on the continuous domain of the whole input, is more efficient than discrete representations, and can handle inputs of various topological structures, such as voxels, meshes and point clouds. Implicit networks are not only spatially continuous but can, in theory, output at unlimited resolution. Compared with existing deep learning methods, an implicit network has a simpler overall structure, needs no multi-layer upsampling operations, and has good scalability and generalization ability. Applying implicit networks to point cloud compression is therefore of great significance for the further development of the field.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a point cloud compression method based on an implicit neural network.
The invention comprises the following steps:
Step 1: given a class of shapes in the ShapeNet dataset, divide it into two parts, a training set and a test set.
Step 2: preprocess the mesh models in the original dataset to obtain SDF values.
Step 3: design an overall network framework based on an auto-decoder fused with an entropy model; the inputs are the observation information and the latent variables quantized by the entropy model, and the output is the estimated SDF value of each query point. An L1 loss is constructed from the SDF estimates, and a regularization term on the latent variables is designed to increase generalization ability; the final loss of the whole network is:

$$\mathcal{L}(\theta,\{z_i\}) = \sum_{i=1}^{N}\sum_{j=1}^{K} L\big(f_\theta(x_j,\hat z_i),\,s_j\big) + \frac{1}{\sigma^{2}}\sum_{i=1}^{N}\lVert \hat z_i\rVert_2^{2} + \lambda\sum_{i=1}^{N} L_{\mathrm{ent}}(\hat z_i), \qquad L_{\mathrm{ent}}(\hat z) = -\log_2 p_{\hat z}(\hat z)$$

where $f_\theta$ is the implicit network whose parameters are shared across the training set, $s_j$ is the true SDF value of the input point, $L$ is the L1 loss function, $\hat z_i$ is the latent variable after entropy-model processing, $x_j$ is a sampling point, $L_{\mathrm{ent}}$ is the compression loss of the entropy model, $p_{\hat z}$ is the probability distribution function of the latent variables learned by the network, and $\sigma$ and $\lambda$ are hyper-parameters.
Step 4: take the perturbed points in the preprocessed training set as the input x of the implicit network; simultaneously, randomly initialize latent variables of fixed dimension, pass them through the entropy model, concatenate them with the input x, use the concatenation as the input of the implicit network, and train so that the common features of this class of inputs are learned.
Step 5: in the inference stage, fix the implicit-network weights of the decoder part and optimize randomly initialized latent variables through a small number of iterations to obtain the latent variable that finally represents a single shape; the decoding end can then obtain the reconstructed shape from this latent variable and the network weights.
Step 6: quantize and arithmetically encode the latent variables in the testing stage, compressing them into binary strings to further increase the compression rate, and transmit the compressed strings to the decoding end.
The beneficial effects of the invention are as follows: a considerable compression ratio can be achieved on a dataset of a specific class; latent-variable information representing the original input is obtained through the implicit function network and then quantized and compressed. The overall network needs no complex 3D convolutions to process 3D shapes, and the implicit network is expressed by a simple MLP, so the structure is simpler.
Drawings
FIG. 1 is a block diagram of an auto-decoder network;
FIG. 2 is a diagram of the overall network framework of the present invention;
FIG. 3 is a graph of the reconstruction losses and compression bit rates obtained by testing networks with different latent variable dimensions.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
Step 1: the shapes of a certain category in ShapeNet are divided into a training set and a test set; shapes of the same category are used to ensure that the training and test sets share similar shape characteristics.
Step 2: preprocess the mesh models in the original dataset: normalize each mesh to the unit sphere, sample a fixed number of points on its surface, add perturbations to obtain perturbed points at arbitrary positions in space, compute the signed distance, i.e. the SDF value, between each spatial perturbed point and the surface sampling points, and store the SDF values together with the corresponding perturbed-point coordinates in a new file.
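One possible realization of this preprocessing, sketched with the open-source trimesh library (the point count, noise scale, and output file name are illustrative assumptions, not values taken from the patent):

```python
import numpy as np
import trimesh

def sample_sdf(mesh_path: str, n_points: int = 25000, noise_std: float = 0.01):
    mesh = trimesh.load(mesh_path, force="mesh")

    # Normalize the mesh into the unit sphere.
    mesh.apply_translation(-mesh.bounding_box.centroid)
    mesh.apply_scale(1.0 / np.linalg.norm(mesh.vertices, axis=1).max())

    # Sample a fixed number of surface points and perturb them to obtain
    # query points at arbitrary positions near the surface.
    surface_pts, _ = trimesh.sample.sample_surface(mesh, n_points)
    perturbed = surface_pts + np.random.normal(scale=noise_std, size=surface_pts.shape)

    # Signed distance (the SDF value) from each perturbed point to the surface
    # (trimesh's convention: positive inside, negative outside).
    sdf = trimesh.proximity.signed_distance(mesh, perturbed)

    # Store coordinates and SDF values together in a new file.
    np.savez("sdf_samples.npz",
             points=perturbed.astype(np.float32), sdf=sdf.astype(np.float32))
```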
Step 3: design the overall network framework; the inputs are the observation information and the latent variable quantized by the entropy model, and the output is the estimated SDF value of each query point. The auto-decoder principle is shown in FIG. 1: the encoder part is removed from the general encoder-decoder framework, the latent variable of the intermediate layer is used directly as the input of the decoder part, and the decoder parameters and the latent variable values are updated simultaneously during backpropagation. This avoids the redundancy of an encoder in the inference stage, so the output accuracy is higher.
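A minimal PyTorch sketch of the auto-decoder idea (layer sizes and names are illustrative assumptions): each training shape owns a learnable latent code in place of an encoder, and the decoder weights and all latent codes receive gradients together.

```python
import torch
import torch.nn as nn

class AutoDecoder(nn.Module):
    def __init__(self, n_shapes: int, latent_dim: int = 64, hidden: int = 256):
        super().__init__()
        # One learnable latent code per training shape -- no encoder at all.
        self.latents = nn.Embedding(n_shapes, latent_dim)
        nn.init.normal_(self.latents.weight, std=0.01)
        # Simple MLP decoder: (latent code, xyz) -> SDF estimate.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, shape_idx: torch.Tensor, xyz: torch.Tensor) -> torch.Tensor:
        z = self.latents(shape_idx)                       # (B, latent_dim)
        return self.decoder(torch.cat([z, xyz], dim=-1)).squeeze(-1)
```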
On the basis of the auto-decoder, an entropy model part is fused in to further process the latent variables. As shown in FIG. 2, P is the initial observation-point information and Z is a randomly initialized latent variable of a chosen dimension. During training, the latent variables are first fed through the quantizer Q into a fully factorized entropy model, which predicts the probability distribution of the latent variables; Y denotes the latent variables, Ŷ the quantized values, AE the arithmetic encoder, AD the arithmetic decoder, Ẑ the latent variable after entropy-model processing, and D the decoder.
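For illustration, the quantizer Q and the fully factorized entropy model could be realized with CompressAI's EntropyBottleneck; reshaping the latent vector to (N, C, 1, 1) is an assumed workaround for vector-shaped latents, not something specified by the patent:

```python
import torch
from compressai.entropy_models import EntropyBottleneck

latent_dim = 64
entropy_bottleneck = EntropyBottleneck(latent_dim)

z = torch.randn(12, latent_dim)                     # batch of latent vectors Y
z_hat, likelihoods = entropy_bottleneck(z.view(-1, latent_dim, 1, 1))
z_hat = z_hat.view(-1, latent_dim)                  # quantized latents (noisy proxy while training)

# Rate term: expected code length of the latents in bits.
rate_bits = -torch.log2(likelihoods).sum()
```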
The invention trains the network parameters θ with a multi-layer fully connected network; all individual shapes S_i in the training set learn their common features on the same implicit network. In the decoder part of traditional network frameworks, the intermediate latent variables are upsampled over multiple layers by 3D convolutions and the like, whereas here the implicit network learns to fit an implicit surface representing the shape directly from the latent variable, so fewer network parameters are needed and the scalability is better.
For a dataset containing N shapes, K sampling points and their SDF values are prepared for each particular shape S_i, together with a corresponding latent variable z_i. During training, the network's output SDF value $f_\theta(x_j,\hat z_i)$ is compared with the true SDF value $s_j$ of the input point under an L1 loss, while the joint log-posterior over all training shapes is maximized, i.e. an L2 regularization term on the latent vector is added, plus the loss of the entropy model:

$$\mathcal{L}(\theta,\{z_i\}) = \sum_{i=1}^{N}\sum_{j=1}^{K} L\big(f_\theta(x_j,\hat z_i),\,s_j\big) + \frac{1}{\sigma^{2}}\sum_{i=1}^{N}\lVert \hat z_i\rVert_2^{2} + \lambda\sum_{i=1}^{N} L_{\mathrm{ent}}(\hat z_i), \qquad L_{\mathrm{ent}}(\hat z) = -\log_2 p_{\hat z}(\hat z)$$

where $f_\theta$ is the implicit network whose parameters are shared across the training set, $s_j$ is the true SDF value of the input point, $L$ is the L1 loss function, $\hat z_i$ is the latent variable after entropy-model processing, $x_j$ is a sampling point, $L_{\mathrm{ent}}$ is the compression loss of the entropy model, $p_{\hat z}$ is the probability distribution function of the latent variables learned by the network, and $\sigma$ and $\lambda$ are hyper-parameters.
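Expressed in PyTorch, the loss above might look like the following sketch (all variable names are assumptions; sigma and lam correspond to the hyper-parameters σ and λ):

```python
import torch
import torch.nn.functional as F

def total_loss(sdf_pred, sdf_true, z_hat, likelihoods, sigma=1.0, lam=1.0):
    # L1 reconstruction term between predicted and true SDF values.
    recon = F.l1_loss(sdf_pred, sdf_true)
    # L2 regularization of the latent codes (the joint log-posterior term).
    reg = z_hat.pow(2).sum() / (sigma ** 2)
    # Compression loss: estimated bit rate under the learned factorized prior.
    rate = -torch.log2(likelihoods).sum()
    return recon + reg + lam * rate
```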
Step 4: take the perturbed points of the preprocessed training set as the input x of the implicit network, simultaneously randomly initialize a latent variable of fixed dimension, process it with the entropy model, concatenate it with x, and train the network on the concatenated input.
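A training-step sketch tying the pieces together; it reuses the hypothetical AutoDecoder, entropy_bottleneck, and total_loss from the sketches above, so it is illustrative rather than a standalone program:

```python
import torch

model = AutoDecoder(n_shapes=1000)                 # assumed dataset size
optimizer = torch.optim.Adam(
    list(model.parameters()) + list(entropy_bottleneck.parameters()), lr=5e-4)

for shape_idx, xyz, sdf_true in loader:            # perturbed points + SDF labels
    z = model.latents(shape_idx)                   # randomly initialized latents
    z_hat, likelihoods = entropy_bottleneck(z.view(-1, z.shape[-1], 1, 1))
    z_hat = z_hat.view_as(z)
    # Concatenate the entropy-processed latent with the query points.
    sdf_pred = model.decoder(torch.cat([z_hat, xyz], dim=-1)).squeeze(-1)

    loss = total_loss(sdf_pred, sdf_true, z_hat, likelihoods)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```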
Step 5: in the inference stage, since the decoder's network parameters have already been obtained during training, the decoder's network weights are fixed, and the value of the latent variable z of each new shape is fine-tuned with a similar loss function:

$$\hat z = \arg\min_{z}\; \sum_{j=1}^{K} L\big(f_\theta(x_j,\hat z),\,s_j\big) + \frac{1}{\sigma^{2}}\lVert \hat z\rVert_2^{2} + \lambda\,L_{\mathrm{ent}}(\hat z)$$
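A sketch of this latent-only optimization, again reusing the hypothetical names from the sketches above; the iteration count and learning rate are assumptions:

```python
import torch

for p in model.decoder.parameters():
    p.requires_grad_(False)                        # decoder weights stay fixed

latent_dim = model.latents.embedding_dim
z = (0.01 * torch.randn(1, latent_dim)).requires_grad_()   # fresh latent for the new shape
opt = torch.optim.Adam([z], lr=5e-3)

for _ in range(200):                               # a small number of iterations
    z_hat, likelihoods = entropy_bottleneck(z.view(1, latent_dim, 1, 1))
    z_hat = z_hat.view(1, latent_dim)
    inputs = torch.cat([z_hat.expand(xyz.shape[0], -1), xyz], dim=-1)
    loss = total_loss(model.decoder(inputs).squeeze(-1), sdf_true, z_hat, likelihoods)
    opt.zero_grad()
    loss.backward()
    opt.step()
```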
step 6: after the hidden variables finally representing a single shape are obtained, the entropy model learns the probability distribution of the hidden space during training, the hidden variables are quantized and arithmetically encoded in combination with the probability distribution, the hidden variables are compressed into binary character strings and transmitted to a decoding end, and the decoding end decompresses the hidden variables through the hidden variables and network parameters.
Example:
the experimental results are shown in fig. 3, and specifically include:
step 1: the stool dataset in the Shapelet dataset was used for training and testing, the training and testing sets were preprocessed, an ADAM optimizer with a learning rate of 0.0005 was used in the experiment, the batch _ size was set to 12, and 60 epochs were trained.
Step 2: the final shape-reconstruction quality is controlled through the dimension of the initial latent variable; dimensions of 8, 16, 32, 64, 128 and 256 were selected, and the initial latent variables of each dimension were trained separately together with the decoder and the entropy model.
Step 3: in the testing stage, the test-set data were run through the trained networks of each latent dimension to obtain different reconstruction losses and compression bit rates, as shown in FIG. 3: the bpp (bits per point) of the invention takes the values 0.0898, 0.1172, 0.1563, 0.2383, 0.3672 and 0.6055, with corresponding reconstruction losses of 0.0835, 0.0382, 0.0405, 0.0316, 0.0250 and 0.0220.
Claims (4)
1. A point cloud compression method based on an implicit neural network is characterized by comprising the following steps:
Step 1: a certain class of shapes in the ShapeNet dataset is given and divided into a training set and a test set;
Step 2: preprocessing the mesh models in the original dataset to obtain SDF values;
Step 3: designing an overall network framework based on an auto-decoder fused with an entropy model, whose inputs are the observation information and the latent variables quantized by the entropy model and whose outputs are the estimated SDF values of the query points;
constructing an L1 loss from the SDF estimates and designing a regularization term on the latent variables to increase generalization ability, the final loss of the whole network being:

$$\mathcal{L}(\theta,\{z_i\}) = \sum_{i=1}^{N}\sum_{j=1}^{K} L\big(f_\theta(x_j,\hat z_i),\,s_j\big) + \frac{1}{\sigma^{2}}\sum_{i=1}^{N}\lVert \hat z_i\rVert_2^{2} + \lambda\sum_{i=1}^{N} L_{\mathrm{ent}}(\hat z_i), \qquad L_{\mathrm{ent}}(\hat z) = -\log_2 p_{\hat z}(\hat z)$$

where $f_\theta$ is the implicit network whose parameters are shared across the training set, $s_j$ is the true SDF value of the input point, $L$ is the L1 loss function, $\hat z_i$ is the latent variable after entropy-model processing, $x_j$ is a sampling point, $p_{\hat z}$ is the probability distribution function of the latent variables learned by the network, $L_{\mathrm{ent}}$ is the compression loss of the entropy model, and $\sigma$ and $\lambda$ are both hyper-parameters;
Step 4: taking the perturbed points in the preprocessed training set as the input x of the implicit network, simultaneously randomly initializing latent variables of fixed dimension, passing them through the entropy model, concatenating them with the input x, using the concatenation as the input of the implicit network, and training so that the common features of this class of inputs are learned;
Step 5: in the inference stage, fixing the implicit-network weights of the decoder part and optimizing randomly initialized latent variables through a small number of iterations to obtain the latent variable that finally represents a single shape, the decoding end obtaining the reconstructed shape from this latent variable and the network weights;
Step 6: quantizing and arithmetically encoding the latent variables in the testing stage, compressing them into binary strings to further increase the compression rate, and transmitting the compressed strings to the decoding end.
2. The point cloud compression method based on the implicit neural network as claimed in claim 1, wherein: the auto-decoder removes the encoder part of the general encoder-decoder framework, directly uses the latent variable of the intermediate layer as the input of the decoder part, and updates the decoder parameters and the latent variable values simultaneously during backpropagation.
3. The point cloud compression method based on the implicit neural network as claimed in claim 1, wherein: during training of the whole network framework, the latent variables are fed through a quantizer Q into a fully factorized entropy model, and the probability distribution of the latent variables is predicted by that fully factorized entropy model.
4. The point cloud compression method based on the implicit neural network as claimed in claim 1, wherein: a multi-layer fully connected network is used to train the network parameters θ, and all individual shapes S_i in the training set learn their common features on the same implicit network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111357338.9A CN114092579B (en) | 2021-11-16 | 2021-11-16 | Point cloud compression method based on implicit neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111357338.9A CN114092579B (en) | 2021-11-16 | 2021-11-16 | Point cloud compression method based on implicit neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114092579A true CN114092579A (en) | 2022-02-25 |
CN114092579B CN114092579B (en) | 2024-05-14 |
Family
ID=80301091
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111357338.9A Active CN114092579B (en) | 2021-11-16 | 2021-11-16 | Point cloud compression method based on implicit neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114092579B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116597082A (en) * | 2023-05-17 | 2023-08-15 | 杭州电子科技大学 | Hub workpiece digitizing method based on implicit three-dimensional reconstruction |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB201703122D0 (en) * | 2017-02-27 | 2017-04-12 | Nokia Technologies Oy | Encoding and decoding three dimensional video data |
WO2019019680A1 (en) * | 2017-07-28 | 2019-01-31 | 北京大学深圳研究生院 | Point cloud attribute compression method based on kd tree and optimized graph transformation |
CN110349230A (en) * | 2019-07-15 | 2019-10-18 | 北京大学深圳研究生院 | A method of the point cloud Geometric compression based on depth self-encoding encoder |
CN110691243A (en) * | 2019-10-10 | 2020-01-14 | 叠境数字科技(上海)有限公司 | Point cloud geometric compression method based on deep convolutional network |
CN111612859A (en) * | 2020-05-22 | 2020-09-01 | 潍坊学院 | Three-dimensional point cloud model compression method based on data dimension reduction and implementation system thereof |
WO2021013334A1 (en) * | 2019-07-22 | 2021-01-28 | Toyota Motor Europe | Depth maps prediction system and training method for such a system |
US10970518B1 (en) * | 2017-11-14 | 2021-04-06 | Apple Inc. | Voxel-based feature learning network |
CN113284203A (en) * | 2021-05-04 | 2021-08-20 | 北京航空航天大学 | Point cloud compression and decompression method based on octree coding and voxel context |
- 2021-11-16: CN application CN202111357338.9A, patent CN114092579B (en), status Active
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB201703122D0 (en) * | 2017-02-27 | 2017-04-12 | Nokia Technologies Oy | Encoding and decoding three dimensional video data |
WO2019019680A1 (en) * | 2017-07-28 | 2019-01-31 | 北京大学深圳研究生院 | Point cloud attribute compression method based on kd tree and optimized graph transformation |
US10970518B1 (en) * | 2017-11-14 | 2021-04-06 | Apple Inc. | Voxel-based feature learning network |
CN110349230A (en) * | 2019-07-15 | 2019-10-18 | 北京大学深圳研究生院 | A method of the point cloud Geometric compression based on depth self-encoding encoder |
US20210019918A1 (en) * | 2019-07-15 | 2021-01-21 | Peking Universtiy Shenzhen Graduate School | Point cloud geometric compression method based on depth auto-encoder |
WO2021013334A1 (en) * | 2019-07-22 | 2021-01-28 | Toyota Motor Europe | Depth maps prediction system and training method for such a system |
CN110691243A (en) * | 2019-10-10 | 2020-01-14 | 叠境数字科技(上海)有限公司 | Point cloud geometric compression method based on deep convolutional network |
CN111612859A (en) * | 2020-05-22 | 2020-09-01 | 潍坊学院 | Three-dimensional point cloud model compression method based on data dimension reduction and implementation system thereof |
CN113284203A (en) * | 2021-05-04 | 2021-08-20 | 北京航空航天大学 | Point cloud compression and decompression method based on octree coding and voxel context |
Non-Patent Citations (2)
Title |
---|
Yang Bailin et al.: "Hybrid adaptive compression algorithm for normal maps", Proceedings of the Fourth Joint Academic Conference on Harmonious Human-Machine Environment, 31 December 2008 (2008-12-31) *
Miao Yongwei; Liu Jiazong; Chen Jiahui; Shu Zhenyu: "Structure-preserving completion of point cloud shapes based on generative adversarial networks", Scientia Sinica Informationis, no. 05, 31 May 2020 (2020-05-31) *
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116597082A (en) * | 2023-05-17 | 2023-08-15 | 杭州电子科技大学 | Hub workpiece digitizing method based on implicit three-dimensional reconstruction |
Also Published As
Publication number | Publication date |
---|---|
CN114092579B (en) | 2024-05-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11606560B2 (en) | Image encoding and decoding, video encoding and decoding: methods, systems and training methods | |
CN109035142B (en) | Satellite image super-resolution method combining countermeasure network with aerial image prior | |
CN111275640B (en) | Image enhancement method for fusing two-dimensional discrete wavelet transform and generation of countermeasure network | |
CN109949222B (en) | Image super-resolution reconstruction method based on semantic graph | |
Laha et al. | Design of vector quantizer for image compression using self-organizing feature map and surface fitting | |
CN111147862B (en) | End-to-end image compression method based on target coding | |
Saravanan et al. | Intelligent Satin Bowerbird Optimizer Based Compression Technique for Remote Sensing Images. | |
Deng et al. | Compressing explicit voxel grid representations: fast nerfs become also small | |
CN113595993B (en) | Vehicle-mounted sensing equipment joint learning method for model structure optimization under edge calculation | |
KR20220058628A (en) | Neural Network Model Compression | |
CN114092579B (en) | Point cloud compression method based on implicit neural network | |
CN114463548B (en) | Image classification method based on visual features and capsule network | |
Liu | Literature review on image restoration | |
CN117499711A (en) | Training method, device, equipment and storage medium of video generation model | |
Li et al. | Towards communication-efficient digital twin via AI-powered transmission and reconstruction | |
Ramasinghe et al. | A learnable radial basis positional embedding for coordinate-mlps | |
Zhu et al. | Rhino: Regularizing the hash-based implicit neural representation | |
CN114332481A (en) | Blind-end element extraction and spectrum unmixing method based on nonnegative sparse self-encoder | |
CN114913054B (en) | Attention perception-based shader simplified variant evaluation method and device | |
Pan et al. | Adaptive deep learning based time-varying volume compression | |
US20230154051A1 (en) | Systems and Methods for Compression of Three-Dimensional Volumetric Representations | |
CN116913436B (en) | Super-atom reverse design method based on LDM-PNN and particle swarm optimization | |
CN115761020B (en) | Image data compression method based on neural network automatic construction | |
US20230316588A1 (en) | Online training-based encoder tuning with multi model selection in neural image compression | |
US20230260197A1 (en) | Learned Volumetric Attribute Compression Using Coordinate-Based Networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||