CN114092579A - Point cloud compression method based on implicit neural network - Google Patents

Point cloud compression method based on implicit neural network Download PDF

Info

Publication number
CN114092579A
Authority
CN
China
Prior art keywords
network
implicit
hidden
decoder
variables
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111357338.9A
Other languages
Chinese (zh)
Other versions
CN114092579B (en)
Inventor
邹文钦
杨柏林
江照意
叶振虎
丁璐赟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Gongshang University
Original Assignee
Zhejiang Gongshang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Gongshang University filed Critical Zhejiang Gongshang University
Priority to CN202111357338.9A priority Critical patent/CN114092579B/en
Publication of CN114092579A publication Critical patent/CN114092579A/en
Application granted granted Critical
Publication of CN114092579B publication Critical patent/CN114092579B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/002Image coding using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a point cloud compression method based on an implicit neural network. First, a class of shapes in a data set is given and divided into a training set and a test set, and the mesh models in the original data set are preprocessed to obtain SDF values. Second, an overall network framework based on an auto-decoder and fused with an entropy model is designed; it takes observation information and hidden variables quantized by the entropy model as input and outputs SDF estimates for the query points. Model training and inference then yield a hidden variable representing a single shape, which is finally compressed into a binary string to further increase the compression rate and transmitted to the decoding end. The invention does not need to process the 3D shape with complex 3D convolutions; the implicit network is expressed by a simple MLP, so the structure is simpler.

Description

Point cloud compression method based on implicit neural network
Technical Field
The invention belongs to the field of three-dimensional reconstruction and point cloud compression, and particularly relates to a point cloud compression method based on an implicit neural network.
Background
In recent years, the acquisition and application of point cloud data have become increasingly diverse, and point clouds now appear in fields such as autonomous driving, VR, and AR. Compared with meshes or voxels, a point cloud has no complex topological structure and can be acquired directly by a radar sensor. However, point clouds usually contain massive amounts of data, and storing and transmitting them directly consumes large amounts of memory and network bandwidth, so an efficient point cloud compression scheme is necessary.
The traditional mainstream point cloud compression methods are based on two frameworks proposed by MPEG: VPCC and GPCC. VPCC uses video compression technology to compress dynamic point cloud sequences in real time, while GPCC compresses the geometry and attributes of a point cloud, such as color and normals, and is usually used for static point clouds. In contrast, existing deep learning methods are mostly based on a VAE framework: the original point cloud is first down-sampled into a latent space through multiple convolutional layers; different entropy models are combined to learn the probability distribution of the latent space during training; the information behind the encoder is quantized and entropy-coded into a bit stream; and the decoding end up-samples from the bit stream through the decoder to finally reconstruct a new point cloud. Point cloud compression with deep learning further improves the compression rate and accuracy over traditional methods. However, existing deep learning methods have large numbers of network parameters and can only compress and reconstruct at a fixed resolution, resulting in poor scalability.
Since 2019, implicit neural networks have drawn wide attention in the field of 3D shape representation. The main idea of an implicit function is that, given observation information of the original shape and the position of a query point, the network determines whether the query point lies inside or outside the shape. A network composed of multilayer perceptrons fits the implicit surface through learning, and a new shape can finally be reconstructed with the marching cubes algorithm.
An implicit network is defined on the continuous domain of the whole input, which is more efficient than a discrete representation, and it can handle inputs with various topological structures, such as voxels, meshes, and point clouds. Implicit networks are not only spatially continuous but can, in theory, output at unlimited resolution. Compared with existing deep learning methods, an implicit network has a simpler overall structure, does not require multi-layer up-sampling operations, and has good scalability and generalization ability. Therefore, applying the implicit network to the field of point cloud compression is of great significance for the further development of point cloud compression.
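To make the idea concrete, the following is a minimal sketch (not the claimed network of this invention) of a DeepSDF-style implicit decoder in PyTorch: a plain MLP that maps a latent code concatenated with a 3D query point to a scalar SDF value. The layer widths, latent dimension, and Tanh output range are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ImplicitSDFDecoder(nn.Module):
    """Minimal MLP f_theta(z, x) -> SDF; latent_dim and widths are illustrative."""
    def __init__(self, latent_dim: int = 128, hidden: int = 256, depth: int = 4):
        super().__init__()
        layers = []
        in_dim = latent_dim + 3          # latent code concatenated with (x, y, z)
        for _ in range(depth):
            layers += [nn.Linear(in_dim, hidden), nn.ReLU(inplace=True)]
            in_dim = hidden
        layers += [nn.Linear(in_dim, 1), nn.Tanh()]   # keep SDF estimates in [-1, 1]
        self.net = nn.Sequential(*layers)

    def forward(self, z: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # z: (B, latent_dim), x: (B, 3) query points -> (B, 1) SDF estimates
        return self.net(torch.cat([z, x], dim=-1))

if __name__ == "__main__":
    decoder = ImplicitSDFDecoder()
    z = torch.randn(8, 128)
    x = torch.rand(8, 3) * 2 - 1
    print(decoder(z, x).shape)   # torch.Size([8, 1])
```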
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a point cloud compression method based on an implicit neural network.
The invention comprises the following steps:
Step 1: a class of shapes in the ShapeNet dataset is given and divided into two parts, a training set and a test set.
Step 2: the mesh models in the original data set are preprocessed to obtain SDF values.
Step 3: an overall network framework based on an auto-decoder and fused with an entropy model is designed; it takes observation information and hidden variables quantized by the entropy model as input and outputs SDF estimates for the query points. An L1 loss is constructed from the SDF estimates, and a regularization term on the hidden variable is designed to increase generalization ability. The final loss of the whole network is:
$$\mathcal{L}(\theta, \{z_i\}) = \sum_{i=1}^{N}\left(\sum_{j=1}^{K} L\big(f_\theta(\hat{z}_i, x_j),\, s_j\big) + \frac{1}{\sigma^2}\,\lVert z_i \rVert_2^2 + \lambda\,\mathcal{L}_c(\hat{z}_i)\right)$$

$$\mathcal{L}_c(\hat{z}_i) = -\log_2 p_{\hat{z}}(\hat{z}_i)$$

where $f_\theta$ is the implicit network whose parameters are shared over the training set, $s_j$ is the true SDF value of the input point, $L$ is the L1 loss function, $\hat{z}_i$ is the latent variable after processing by the entropy model, $x_j$ is a sampling point, $\mathcal{L}_c$ is the compression loss of the entropy model, $p_{\hat{z}}$ is the probability distribution function of the hidden variables learned by the network, and $\sigma$ and $\lambda$ are hyper-parameters.
Step 4: the perturbed points in the preprocessed training set are taken as the input x of the implicit network; at the same time, hidden variables of fixed dimensionality are randomly initialized, passed through the entropy model, and concatenated with the input x; the concatenated vector is fed to the implicit network for training, so that the common characteristics of this class of inputs are learned.
Step 5: in the inference stage, the implicit-network weights of the decoder part are fixed, and the randomly initialized hidden variables are optimized through a small number of iterations to obtain the hidden variable that finally represents a single shape; the decoding end can obtain the reconstructed shape from this hidden variable and the network weights.
Step 6: the hidden variables obtained in the test stage are quantized and arithmetically encoded, compressed into a binary string to further increase the compression rate, and the compressed string is transmitted to the decoding end.
The invention has the following beneficial effects: it achieves a considerable compression ratio on a data set of a specific class, obtains hidden-variable information representing the original input through the implicit function network, and quantizes and compresses the hidden variables. The overall network does not need to process the 3D shape with complex 3D convolutions; the implicit network is expressed by a simple MLP, so the structure is simpler.
Drawings
FIG. 1 is a block diagram of an auto-decoder network;
FIG. 2 is a diagram of the overall network framework of the present invention;
FIG. 3 is a graph of the reconstruction losses and compression bit rates obtained by testing networks with different hidden-variable dimensions.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
Step 1: the shapes of a certain category in ShapeNet are divided into a training set and a test set; using shapes of the same category ensures that the training set and the test set have similar shape characteristics.
Step 2: the mesh models in the original data set are preprocessed. Each mesh is normalized to the unit sphere, a fixed number of points is sampled on its surface, and perturbations are added to obtain perturbed points at arbitrary positions in space; the signed distance between each perturbed point and the sampled surface points, i.e. its SDF (signed distance function) value, is then calculated, and the SDF value and the coordinates of the corresponding perturbed point are stored in a new file.
Step 3: the overall network framework is designed; its input is the observation information and the hidden variable quantized by the entropy model, and its output is the SDF estimate of the query point. The auto-decoder principle is shown in FIG. 1: the encoder part is removed from the general encoder-decoder framework, and the hidden variable of the intermediate layer is used directly as the input of the decoder part; during back-propagation, the decoder parameters and the value of the hidden variable are updated simultaneously. The redundancy of an encoder is thus avoided in the inference stage, so the output accuracy is higher.
On top of the auto-decoder, an entropy model is fused in to further process the hidden variables, as shown in FIG. 2, where P is the initial observation-point information and Z is a randomly initialized hidden variable of a given dimensionality. During training, the hidden variable Y is first passed through the quantizer Q and fed into the fully factorized entropy model, which predicts the probability distribution of the hidden variable; $\hat{Y}$ is the quantized value, AE is the arithmetic encoder, AD is the arithmetic decoder, $\hat{Z}$ is the hidden variable after entropy-model processing, and D is the decoder.
The invention uses a multi-layer fully-connected network to train the network parameters θ; all individual shapes S_i in the training set learn their common characteristics on the same implicit network. Whereas the decoder part of traditional network frameworks performs multi-layer up-sampling of the intermediate hidden variables with 3D convolutions and similar operations, the implicit network uses the hidden variables to learn and fit an implicit surface representing the shape, so fewer network parameters are needed and the scalability is better.
To learn $f_\theta$ and the latent codes $\{z_i\}$, for a data set containing N shapes, K sampling points and their SDF values are prepared for each particular shape $S_i$, together with a corresponding hidden variable $z_i$. During training, the network outputs the SDF estimate $f_\theta(\hat{z}_i, x_j)$, and the L1 loss between this estimate and the true SDF value $s_j$ of the input point is minimized, while the joint log-posterior over all training shapes is maximized, i.e. an L2 regularization term on the hidden vector is added, plus the loss of the entropy model:

$$\mathcal{L}(\theta, \{z_i\}) = \sum_{i=1}^{N}\left(\sum_{j=1}^{K} L\big(f_\theta(\hat{z}_i, x_j),\, s_j\big) + \frac{1}{\sigma^2}\,\lVert z_i \rVert_2^2 + \lambda\,\mathcal{L}_c(\hat{z}_i)\right), \qquad \mathcal{L}_c(\hat{z}_i) = -\log_2 p_{\hat{z}}(\hat{z}_i)$$

where $f_\theta$ is the implicit network whose parameters are shared over the training set, $s_j$ is the true SDF value of the input point, $L$ is the L1 loss function, $\hat{z}_i$ is the latent variable after processing by the entropy model, $x_j$ is a sampling point, $\mathcal{L}_c$ is the compression loss of the entropy model, $p_{\hat{z}}$ is the probability distribution function of the hidden variables learned by the network, and $\sigma$ and $\lambda$ are hyper-parameters.
Step 4: the perturbed points in the preprocessed training set are taken as the input x of the implicit network; at the same time, a hidden variable of fixed dimensionality is randomly initialized, processed by the entropy model, and concatenated with x; the concatenated vector is used as the input of the implicit network for training.
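A minimal auto-decoder training loop consistent with this step might look as follows; the dataset format (shape index, query points, SDF values per item) and the use of nn.Embedding to hold one latent code per training shape are assumptions for illustration, while the optimizer settings follow the example given later (Adam, learning rate 0.0005, batch size 12, 60 epochs).

```python
import torch

def train(decoder, entropy_model, dataset, latent_dim=128, epochs=60, lr=5e-4):
    """Auto-decoder training: per-shape latent codes and the shared decoder
    (plus the entropy model) are optimized jointly; there is no encoder network."""
    latents = torch.nn.Embedding(len(dataset), latent_dim)   # one code per training shape
    torch.nn.init.normal_(latents.weight, std=0.01)

    params = (list(decoder.parameters()) + list(entropy_model.parameters())
              + list(latents.parameters()))
    opt = torch.optim.Adam(params, lr=lr)

    loader = torch.utils.data.DataLoader(dataset, batch_size=12, shuffle=True)
    for _ in range(epochs):
        for shape_idx, points, sdf_gt in loader:              # assumed dataset format
            z = latents(shape_idx)
            loss = total_loss(decoder, entropy_model, z, points, sdf_gt)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return latents
```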
Step 5: in the inference stage, since the network parameters of the decoder have already been obtained in training, the network weights of the decoder are fixed and the value of the hidden variable z of each new shape is fine-tuned with a similar loss function:

$$\hat{z} = \arg\min_{z} \sum_{j=1}^{K} L\big(f_\theta(\hat{z}, x_j),\, s_j\big) + \frac{1}{\sigma^2}\,\lVert z \rVert_2^2 + \lambda\,\mathcal{L}_c(\hat{z})$$
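A sketch of this latent-only optimization is shown below, reusing the illustrative total_loss above; the step count and learning rate are assumptions. The entropy model is kept in training mode so that its noise-based quantization proxy stays differentiable.

```python
import torch

def infer_latent(decoder, entropy_model, points, sdf_gt,
                 latent_dim=128, steps=800, lr=5e-3):
    """Inference for a new shape: decoder weights stay frozen, only z is optimized."""
    decoder.eval()
    for p in decoder.parameters():
        p.requires_grad_(False)
    entropy_model.train()   # keep the noise proxy so the rate term stays differentiable

    z = torch.zeros(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        loss = total_loss(decoder, entropy_model, z,
                          points.unsqueeze(0), sdf_gt.unsqueeze(0))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()
```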
step 6: after the hidden variables finally representing a single shape are obtained, the entropy model learns the probability distribution of the hidden space during training, the hidden variables are quantized and arithmetically encoded in combination with the probability distribution, the hidden variables are compressed into binary character strings and transmitted to a decoding end, and the decoding end decompresses the hidden variables through the hidden variables and network parameters.
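The sketch below illustrates the encode/decode round trip. For simplicity it serializes the rounded latent as raw int16 bytes where a real implementation would arithmetic-code the symbols with the pmf learned by the entropy model; the decoder side evaluates the SDF on a dense grid and extracts the surface with marching cubes (scikit-image). The grid resolution and chunk size are illustrative.

```python
import numpy as np
import torch
from skimage import measure

def compress_latent(z_hat: torch.Tensor) -> bytes:
    """Toy serialization of the quantized latent. A real codec would arithmetic-code
    the integer symbols with the entropy model's learned pmf instead of storing
    raw int16 values."""
    return z_hat.round().to(torch.int16).cpu().numpy().tobytes()

def decompress_and_reconstruct(decoder, payload: bytes, latent_dim: int, res: int = 128):
    """Decoder side: recover the latent, evaluate the SDF on a dense grid, and run
    marching cubes to obtain the reconstructed mesh."""
    z = torch.from_numpy(
        np.frombuffer(payload, dtype=np.int16).astype(np.float32)
    ).reshape(1, latent_dim)

    # Evaluate the SDF on a res^3 grid inside the unit cube [-1, 1]^3.
    axis = torch.linspace(-1.0, 1.0, res)
    grid = torch.stack(torch.meshgrid(axis, axis, axis, indexing='ij'),
                       dim=-1).reshape(-1, 3)
    with torch.no_grad():
        sdf = torch.cat([
            decoder(z.expand(chunk.shape[0], -1), chunk)
            for chunk in grid.split(65536)
        ]).reshape(res, res, res).numpy()

    verts, faces, _, _ = measure.marching_cubes(sdf, level=0.0)
    return verts, faces
```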
Example:
the experimental results are shown in fig. 3, and specifically include:
step 1: the stool dataset in the Shapelet dataset was used for training and testing, the training and testing sets were preprocessed, an ADAM optimizer with a learning rate of 0.0005 was used in the experiment, the batch _ size was set to 12, and 60 epochs were trained.
Step 2: the final shape reconstruction effect is controlled by controlling the dimensionality of the initial hidden variables, the dimensionalities of the selected hidden variables are 8, 16, 32, 64, 128 and 256, and the initial hidden variables with different dimensionalities are respectively and independently trained by combining a decoder and an entropy model.
And step 3: in the testing stage, the data of the test set are respectively tested through the trained networks with different hidden variable dimensions to obtain different reconstruction losses and compression bit rates, as shown in fig. 3, the bpp (bit per point) of the invention is changed into 0.0898,0.1172,0.1563, 02383,0.3672 and 0.6055, and the reconstruction losses are changed into 0.0835,0.0382,0.0405,0.0316, 0.0250 and 0.0220.
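For reference, the rate-distortion pairs reported above can be plotted directly (values copied from the text; the resulting figure is only an approximation of FIG. 3):

```python
import matplotlib.pyplot as plt

# Rate-distortion pairs reported above for latent dimensions 8...256.
dims = [8, 16, 32, 64, 128, 256]
bpp  = [0.0898, 0.1172, 0.1563, 0.2383, 0.3672, 0.6055]
loss = [0.0835, 0.0382, 0.0405, 0.0316, 0.0250, 0.0220]

plt.plot(bpp, loss, marker='o')
for d, x, y in zip(dims, bpp, loss):
    plt.annotate(f"dim={d}", (x, y))
plt.xlabel('bits per point (bpp)')
plt.ylabel('reconstruction loss')
plt.title('Rate-distortion of the tested hidden-variable dimensions')
plt.show()
```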

Claims (4)

1. A point cloud compression method based on an implicit neural network is characterized by comprising the following steps:
step 1: a certain class shape in a ShapeNet data set is given and divided into a training set and a testing set;
step 2: preprocessing a mesh model in an original data set to obtain an SDF value;
step 3: designing an overall network framework based on an auto-decoder and fused with an entropy model, inputting observation information and hidden variables quantized by the entropy model, and outputting SDF estimates of query points;
constructing an L1 loss from the SDF estimates, and designing a regularization term on the hidden variable to increase generalization ability, the final loss of the whole network being:
$$\mathcal{L}(\theta, \{z_i\}) = \sum_{i=1}^{N}\left(\sum_{j=1}^{K} L\big(f_\theta(\hat{z}_i, x_j),\, s_j\big) + \frac{1}{\sigma^2}\,\lVert z_i \rVert_2^2 + \lambda\,\mathcal{L}_c(\hat{z}_i)\right), \qquad \mathcal{L}_c(\hat{z}_i) = -\log_2 p_{\hat{z}}(\hat{z}_i)$$

wherein $f_\theta$ is the implicit network whose parameters are shared over the training set, $s_j$ is the true SDF value of the input point, $L$ is the L1 loss function, $\hat{z}_i$ is the latent variable after processing by the entropy model, $x_j$ is a sampling point, $p_{\hat{z}}$ is the probability distribution function of the hidden variables learned by the network, $\mathcal{L}_c$ is the compression loss of the entropy model, and $\sigma$ and $\lambda$ are both hyper-parameters;
step 4: taking the perturbed points in the preprocessed training set as the input x of the implicit network, randomly initializing hidden variables of fixed dimensionality, passing them through the entropy model and concatenating them with the input x, taking the concatenated vector as the input of the implicit network for training, and learning the common characteristics of this class of inputs;
step 5: in the inference stage, fixing the implicit-network weights of the decoder part, optimizing the randomly initialized hidden variables through a small number of iterations to obtain the hidden variable that finally represents a single shape, the decoding end obtaining the reconstructed shape from the hidden variable and the network weights;
step 6: quantizing and arithmetically encoding the hidden variables obtained in the test stage, compressing them into a binary string to further increase the compression rate, and transmitting the compressed string to the decoding end.
2. The point cloud compression method based on the implicit neural network as claimed in claim 1, wherein the auto-decoder removes the encoder part from the general encoder-decoder framework, uses the hidden variable of the intermediate layer directly as the input of the decoder part, and updates the decoder parameters and the value of the hidden variable simultaneously during back-propagation.
3. The point cloud compression method based on the implicit neural network as claimed in claim 1, wherein, during training of the overall network framework, the hidden variables are passed through the quantizer Q into the fully factorized entropy model, and the probability distribution of the hidden variables is predicted by the fully factorized entropy model.
4. The point cloud compression method based on the implicit neural network as claimed in claim 1, wherein a multi-layer fully-connected network is used to train the network parameters θ, and all individual shapes S_i in the training set learn their common characteristics on the same implicit network.
CN202111357338.9A 2021-11-16 2021-11-16 Point cloud compression method based on implicit neural network Active CN114092579B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111357338.9A CN114092579B (en) 2021-11-16 2021-11-16 Point cloud compression method based on implicit neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111357338.9A CN114092579B (en) 2021-11-16 2021-11-16 Point cloud compression method based on implicit neural network

Publications (2)

Publication Number Publication Date
CN114092579A true CN114092579A (en) 2022-02-25
CN114092579B CN114092579B (en) 2024-05-14

Family

ID=80301091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111357338.9A Active CN114092579B (en) 2021-11-16 2021-11-16 Point cloud compression method based on implicit neural network

Country Status (1)

Country Link
CN (1) CN114092579B (en)


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201703122D0 (en) * 2017-02-27 2017-04-12 Nokia Technologies Oy Encoding and decoding three dimensional video data
WO2019019680A1 (en) * 2017-07-28 2019-01-31 北京大学深圳研究生院 Point cloud attribute compression method based on kd tree and optimized graph transformation
US10970518B1 (en) * 2017-11-14 2021-04-06 Apple Inc. Voxel-based feature learning network
CN110349230A (en) * 2019-07-15 2019-10-18 北京大学深圳研究生院 A method of the point cloud Geometric compression based on depth self-encoding encoder
US20210019918A1 (en) * 2019-07-15 2021-01-21 Peking Universtiy Shenzhen Graduate School Point cloud geometric compression method based on depth auto-encoder
WO2021013334A1 (en) * 2019-07-22 2021-01-28 Toyota Motor Europe Depth maps prediction system and training method for such a system
CN110691243A (en) * 2019-10-10 2020-01-14 叠境数字科技(上海)有限公司 Point cloud geometric compression method based on deep convolutional network
CN111612859A (en) * 2020-05-22 2020-09-01 潍坊学院 Three-dimensional point cloud model compression method based on data dimension reduction and implementation system thereof
CN113284203A (en) * 2021-05-04 2021-08-20 北京航空航天大学 Point cloud compression and decompression method based on octree coding and voxel context

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yang Bailin (杨柏林) et al.: "Hybrid Adaptive Compression Algorithm for Normal Maps", Proceedings of the 4th Joint Conference on Harmonious Human-Machine Environment, 31 December 2008 (2008-12-31) *
Miao Yongwei (缪永伟); Liu Jiazong (刘家宗); Chen Jiahui (陈佳慧); Shu Zhenyu (舒振宇): "Structure-Preserving Completion of Point Cloud Shapes Based on Generative Adversarial Networks", Scientia Sinica Informationis, no. 05, 31 May 2020 (2020-05-31) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116597082A (en) * 2023-05-17 2023-08-15 杭州电子科技大学 Hub workpiece digitizing method based on implicit three-dimensional reconstruction

Also Published As

Publication number Publication date
CN114092579B (en) 2024-05-14


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant