CN117315189A - Point cloud reconstruction method, system, terminal equipment and computer storage medium - Google Patents


Info

Publication number
CN117315189A
Authority
CN
China
Prior art keywords
point cloud
data
reconstruction
coding
decoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310882685.6A
Other languages
Chinese (zh)
Inventor
高伟
谢良
李革
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Shenzhen Graduate School
Original Assignee
Peking University Shenzhen Graduate School
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Shenzhen Graduate School filed Critical Peking University Shenzhen Graduate School
Priority to CN202310882685.6A priority Critical patent/CN117315189A/en
Publication of CN117315189A publication Critical patent/CN117315189A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention provides a point cloud reconstruction method, system, terminal device, and computer storage medium, applied in the technical field of point cloud geometry. The point cloud reconstruction method is applied to a point cloud reconstruction system that includes an adaptive coding module, and comprises the following steps: acquiring point cloud data to be reconstructed and a data application scene; encoding the point cloud data through the adaptive coding module to obtain point cloud encoded data; and decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data. The technical scheme of the invention solves the technical problem that point cloud data compressed by a traditional point cloud compression framework does not match its application scene.

Description

Point cloud reconstruction method, system, terminal equipment and computer storage medium
Technical Field
The present invention relates to the field of point cloud technologies, and in particular to a point cloud reconstruction method, a system, a terminal device, and a computer storage medium.
Background
Point cloud data, defined as a set of points in three-dimensional space, has become one of the most important data formats for three-dimensional representation. Traditional point cloud reconstruction methods fall mainly into two categories. One is V-PCC (Video-based Point Cloud Compression), the dynamic point cloud coding standard proposed by MPEG (Moving Picture Experts Group), which compresses the geometry and texture information of a point cloud sequence using conventional video coding methods. The other is G-PCC (Geometry-based Point Cloud Compression), the static point cloud coding standard, whose codec framework mainly targets the compression of object point clouds and the like.
However, traditional point cloud compression frameworks compress point cloud data purely from the perspective of coding performance, while the compressed data must then serve many different application scenes, each with its own accuracy requirements for the point cloud data. As a result, point cloud data compressed by a traditional framework does not match the needs of each application scene.
Disclosure of Invention
The invention provides a point cloud reconstruction method, a point cloud reconstruction system, a terminal device, and a computer storage medium, aiming to solve the technical problem that point cloud data compressed by a traditional point cloud compression framework does not match its application scene.
In order to solve the above problem, the present invention provides a point cloud reconstruction method applied to a point cloud reconstruction system that includes an adaptive coding module. The point cloud reconstruction method includes:
acquiring point cloud data to be reconstructed and a data application scene;
encoding the point cloud data through the adaptive coding module to obtain point cloud encoded data;
and decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data.
Optionally, the adaptive coding module includes a bit allocation unit and a feature learning unit, and the step of encoding the point cloud data through the adaptive coding module to obtain point cloud encoded data includes:
separating the ROI (region of interest) in the point cloud data through the bit allocation unit to obtain point cloud prior data;
and performing local attention extraction on the point cloud prior data through the feature learning unit to obtain point cloud context data, and encoding the point cloud prior data according to the point cloud context data to obtain the point cloud encoded data.
Optionally, the adaptive coding module includes a variable-rate unit, and the step of encoding the point cloud data through the adaptive coding module to obtain point cloud encoded data further includes:
establishing a plurality of dynamic encoders with shared parameters through the variable-rate unit, and taking, among the plurality of dynamic encoders, the dynamic encoder matching the number of feature channels of the point cloud data as a target encoder;
and determining a target parameter among the parameters according to a preset rule, and controlling the target encoder to encode the point cloud data according to the target parameter to obtain the point cloud encoded data.
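As an illustration only (not the patent's actual implementation), the variable-rate unit's idea — several dynamic encoders slicing one shared parameter bank, selected by the input's feature-channel count and driven by a rate-controlling target parameter — can be sketched in numpy. The class name, supported channel configurations, and `rate_scale` parameter are all assumptions:

```python
import numpy as np

class VariableRateUnit:
    """Hypothetical sketch: dynamic encoders that share one parameter bank."""

    def __init__(self, max_channels=64, latent_dim=16, seed=0):
        rng = np.random.default_rng(seed)
        # Single shared parameter bank; every dynamic encoder slices it,
        # so switching rate points adds no new parameters.
        self.shared_w = rng.standard_normal((max_channels, latent_dim)) * 0.1
        self.supported = (16, 32, 64)  # assumed channel configurations

    def select_encoder(self, num_channels):
        # "Target encoder" = the configuration matching the input's channels.
        matches = [c for c in self.supported if c >= num_channels]
        if not matches:
            raise ValueError("no dynamic encoder matches this channel count")
        return matches[0]

    def encode(self, features, rate_scale=1.0):
        # rate_scale stands in for the "target parameter" chosen by a
        # preset rule (e.g. a quality/bitrate trade-off setting).
        c = self.select_encoder(features.shape[1])
        w = self.shared_w[:c]                       # shared-parameter slice
        padded = np.zeros((features.shape[0], c))
        padded[:, :features.shape[1]] = features
        return np.round(padded @ w * rate_scale)    # coarse quantization
```

Here `encode` picks the smallest supported configuration that covers the input's channel count, so every rate point reuses the same `shared_w`.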
Optionally, the step of decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data includes:
when the data application scene is human-eye perception, extracting prior information from the point cloud saliency data corresponding to the point cloud data to obtain point cloud gain data;
and decoding the point cloud encoded data according to the point cloud gain data to obtain the point cloud reconstruction data.
Optionally, the step of decoding the point cloud encoded data according to the data application scenario to obtain point cloud reconstruction data further includes:
when the data application scene is machine perception, performing object detection on the point cloud encoded data to obtain object detection data;
and decoding the point cloud encoded data according to the object detection data to obtain the point cloud reconstruction data.
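A minimal sketch of the scene-dependent decoding dispatch described in the two optional steps above, assuming human-eye perception weights the reconstruction by saliency-derived gain data and machine perception weights it by detection results. Every helper here (the thresholded "detector", the gain weighting) is a toy stand-in, not the patent's actual networks:

```python
import numpy as np

def extract_prior(saliency):
    # "Point cloud gain data": saliency normalized into per-point weights.
    return saliency / max(float(saliency.max()), 1e-9)

def decode_with_gain(encoded, gain):
    # Salient points are reconstructed at full strength.
    return encoded * gain[:, None]

def detect_objects(encoded):
    # Toy detector: flag points whose feature magnitude exceeds a threshold.
    return np.linalg.norm(encoded, axis=1) > 1.0

def decode_with_detections(encoded, detections):
    # Detected (important) points keep full quality; the rest are damped.
    weights = np.where(detections, 1.0, 0.5)
    return encoded * weights[:, None]

def decode_for_scene(encoded, scene, saliency=None):
    if scene == "human_eye":
        gain = extract_prior(saliency)          # prior-information extraction
        return decode_with_gain(encoded, gain)
    if scene == "machine":
        detections = detect_objects(encoded)    # object detection data
        return decode_with_detections(encoded, detections)
    raise ValueError(f"unknown application scene: {scene}")
```

The point of the dispatch is that the same encoded bitstream yields different reconstructions depending on which consumer (eyes or machines) it serves.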
Optionally, before the step of encoding the point cloud data by the adaptive encoding module to obtain point cloud encoded data, the method further includes:
obtaining a point cloud data category according to the density of the point cloud data, determining a downsampling step according to the point cloud data category, and downsampling the point cloud data according to the downsampling step;
the step of encoding the point cloud data through the adaptive coding module to obtain point cloud encoded data then includes:
encoding the downsampled point cloud data through the adaptive coding module to obtain the point cloud encoded data.
Optionally, after the step of decoding the point cloud encoded data according to the data application scene to obtain the point cloud reconstruction data, the method further includes:
determining an upsampling step according to the point cloud data category, upsampling the point cloud reconstruction data according to the upsampling step, and taking the upsampled point cloud reconstruction data as the new point cloud reconstruction data.
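The density-driven down/upsampling steps above might look like the following sketch. The three density categories, their thresholds, and the step table are invented for illustration, and the uniform stride stands in for whatever sampler the system actually uses:

```python
import numpy as np

def density_category(points):
    # Assumed three-way classification; the thresholds are illustrative.
    extent = points.max(axis=0) - points.min(axis=0)
    volume = max(float(np.prod(extent)), 1e-9)
    density = len(points) / volume
    if density > 100:
        return "dense"
    if density > 1:
        return "medium"
    return "sparse"

STEP = {"dense": 4, "medium": 2, "sparse": 1}   # hypothetical step table

def downsample(points, step):
    # Uniform stride stands in for a real voxel or farthest-point sampler.
    return points[::step]

def upsample(points, step):
    # Crude inverse: repeat each point; a real system would interpolate.
    return np.repeat(points, step, axis=0)
```

Reusing the same category (and hence step) on both sides is what lets the upsampled reconstruction recover the original point count.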
In addition, in order to solve the above problems, the present invention further provides a point cloud reconstruction system, where the point cloud reconstruction system includes an adaptive coding module, and the point cloud reconstruction system further includes:
the first acquisition module is used for acquiring point cloud data to be reconstructed and a data application scene;
the point cloud coding module is used for encoding the point cloud data through the adaptive coding module to obtain point cloud encoded data;
and the point cloud decoding module is used for decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data.
In addition, in order to solve the above problems, the present invention also proposes a terminal device, including: the system comprises a memory, a processor and a point cloud reconstruction program stored on the memory and capable of running on the processor, wherein the point cloud reconstruction program realizes the steps of the point cloud reconstruction method when being executed by the processor.
In addition, in order to solve the above-mentioned problems, the present invention also proposes a computer storage medium having stored thereon a point cloud reconstruction program which, when executed by a processor, implements the steps of the point cloud reconstruction method as described above.
The invention provides a point cloud reconstruction method, a system, a terminal device, and a computer storage medium. The point cloud reconstruction method is applied to a point cloud reconstruction system that includes an adaptive coding module, and comprises the following steps: acquiring point cloud data to be reconstructed and a data application scene; encoding the point cloud data through the adaptive coding module to obtain point cloud encoded data; and decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data.
When the point cloud reconstruction system executes a point cloud reconstruction task, point cloud data to be reconstructed and a data application scene of the reconstructed point cloud data are acquired, the point cloud data are encoded through the self-adaptive encoding module to obtain point cloud encoded data, and the point cloud encoded data are decoded according to the data application scene to obtain point cloud reconstruction data.
Compared with traditional point cloud reconstruction methods driven purely by coding performance, encoding the point cloud data through the adaptive coding module allows the coding to adapt to different data structures, and decoding the point cloud data in a mode determined by its application scene ensures that the reconstructed point cloud data meets the requirements of that scene.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
In order to illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below; those skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 is a schematic device architecture diagram of a hardware operating environment of a terminal device according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a first embodiment of the point cloud reconstruction method according to the present invention;
FIG. 3 is a block diagram of a bit distribution network according to an embodiment of the point cloud reconstruction method of the present invention;
FIG. 4 is a schematic diagram of scalable encoding of a point cloud according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a human eye perceived network structure according to an embodiment of the point cloud reconstruction method of the present invention;
FIG. 6 is a schematic diagram of a flow chart of an embodiment of a point cloud human eye perception implementation of the point cloud reconstruction method according to the present invention;
FIG. 7 is a schematic diagram of a machine-aware network according to an embodiment of a point cloud reconstruction method of the present invention;
FIG. 8 is a flow chart of a machine-aware implementation of an embodiment of a point cloud reconstruction method according to the present invention;
FIG. 9 is a frame diagram of a point cloud reconstruction according to an embodiment of the present invention;
FIG. 10 is a flowchart illustrating an embodiment of a point cloud reconstruction method according to the present invention;
FIG. 11 is a block diagram illustrating an embodiment of a point cloud reconstruction system according to the present invention.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that all directional indicators (such as up, down, left, right, front, and rear) in the embodiments of the present invention are merely used to explain the relative positional relationship, movement, etc. between components in a particular posture (as shown in the drawings); if that posture changes, the directional indicator changes accordingly.
In the present invention, unless otherwise expressly specified and limited, the terms "connected", "fixed", and the like are to be construed broadly. For example, "fixed" may mean fixedly connected, detachably connected, or integrally formed; mechanically connected or electrically connected; directly connected, or indirectly connected through an intermediary; or an internal communication or interaction between two elements. The specific meaning of these terms in the present invention can be understood by those of ordinary skill in the art according to the specific circumstances.
Furthermore, descriptions referring to "first", "second", and the like are for descriptive purposes only and are not to be construed as indicating or implying relative importance or an order among the indicated technical features; a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when technical solutions are contradictory or a combination cannot be realized, that combination should be considered absent and outside the scope of protection claimed in the present invention.
As shown in fig. 1, fig. 1 is a schematic device structure diagram of a hardware operating environment of a terminal device according to an embodiment of the present invention.
It should be noted that the terminal device according to the embodiment of the present invention is a data storage control terminal, a PC, a portable computer, or any other device capable of executing the point cloud reconstruction method of the present application.
As shown in fig. 1, in a hardware operating environment of a terminal device, the terminal device may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, a communication bus 1002. Wherein the communication bus 1002 is used to enable connected communication between these components. The user interface 1003 may include a Display, an input unit such as a Keyboard (Keyboard), and the optional user interface 1003 may further include a standard wired interface, a wireless interface. The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface). The memory 1005 may be a high-speed RAM memory or a stable memory (non-volatile memory), such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001 described above.
It will be appreciated by those skilled in the art that the terminal device structure shown in fig. 1 is not limiting of the terminal device and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
As shown in fig. 1, an operating system, a network communication module, a user interface module, and a point cloud reconstruction program may be included in a memory 1005, which is a type of computer storage medium.
In the device shown in fig. 1, the network interface 1004 is mainly used for connecting to a background server, and performing data communication with the background server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and the processor 1001 may be configured to call a point cloud reconstruction program stored in the memory 1005, and perform the following operations:
acquiring point cloud data to be reconstructed and a data application scene;
encoding the point cloud data through the adaptive coding module to obtain point cloud encoded data;
and decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data.
Optionally, the adaptive coding module includes: the bit allocation unit and the feature learning unit, the processor 1001 may be configured to call a point cloud reconstruction program stored in the memory 1005, and perform the following operations:
separating the ROI in the point cloud data through the bit allocation unit to obtain point cloud prior data;
and performing local attention extraction on the point cloud prior data through the feature learning unit to obtain point cloud context data, and encoding the point cloud prior data according to the point cloud context data to obtain point cloud encoded data.
Optionally, the adaptive coding module includes a variable-rate unit, and the processor 1001 may be configured to call the point cloud reconstruction program stored in the memory 1005 and perform the following operations:
establishing a plurality of dynamic encoders with shared parameters through the variable-rate unit, and taking, among the plurality of dynamic encoders, the dynamic encoder matching the number of feature channels of the point cloud data as a target encoder;
and determining a target parameter among the parameters according to a preset rule, and controlling the target encoder to encode the point cloud data according to the target parameter to obtain point cloud encoded data.
Optionally, the processor 1001 may be configured to call a point cloud reconstruction program stored in the memory 1005, and perform the following operations:
when the data application scene is human-eye perception, extracting prior information from the point cloud saliency data corresponding to the point cloud data to obtain point cloud gain data;
and decoding the point cloud encoded data according to the point cloud gain data to obtain point cloud reconstruction data.
Optionally, the processor 1001 may be configured to call a point cloud reconstruction program stored in the memory 1005, and perform the following operations:
when the data application scene is machine perception, performing object detection on the point cloud encoded data to obtain object detection data;
and decoding the point cloud encoded data according to the object detection data to obtain point cloud reconstruction data.
Optionally, the processor 1001 may be configured to call a point cloud reconstruction program stored in the memory 1005, and perform the following operations:
obtaining a point cloud data category according to the density of the point cloud data, determining a downsampling step according to the point cloud data category, and downsampling the point cloud data according to the downsampling step;
the step of encoding the point cloud data through the adaptive coding module to obtain point cloud encoded data then includes:
encoding the downsampled point cloud data through the adaptive coding module to obtain point cloud encoded data.
Optionally, the processor 1001 may be configured to call a point cloud reconstruction program stored in the memory 1005, and perform the following operations:
determining an upsampling step according to the point cloud data category, upsampling the point cloud reconstruction data according to the upsampling step, and taking the upsampled point cloud reconstruction data as the new point cloud reconstruction data.
Based on the above hardware structure, the overall concept of each embodiment of the point cloud reconstruction method of the present invention is presented.
In the embodiment of the invention, point cloud data, defined as a set of points in three-dimensional space, has become one of the most important data formats for three-dimensional representation. Traditional point cloud reconstruction methods fall mainly into two categories: one is V-PCC, the dynamic point cloud coding standard proposed by MPEG, which compresses the geometry and texture information of a point cloud sequence using conventional video coding methods; the other is G-PCC, the static point cloud coding standard, whose codec framework mainly targets the compression of object point clouds and the like.
However, traditional point cloud compression frameworks compress point cloud data purely from the perspective of coding performance, while the compressed data must then serve many different application scenes, each with its own accuracy requirements for the point cloud data. As a result, point cloud data compressed by a traditional framework does not match the needs of each application scene.
In order to solve the above problem, the present invention provides a point cloud reconstruction method applied to a point cloud reconstruction system that includes an adaptive coding module. The point cloud reconstruction method includes: acquiring point cloud data to be reconstructed and a data application scene; encoding the point cloud data through the adaptive coding module to obtain point cloud encoded data; and decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data.
When the point cloud reconstruction system executes a point cloud reconstruction task, it acquires the point cloud data to be reconstructed and the data application scene of the reconstructed point cloud data, encodes the point cloud data through the adaptive coding module to obtain point cloud encoded data, and decodes the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data.
Compared with traditional point cloud reconstruction methods driven purely by coding performance, encoding the point cloud data through the adaptive coding module allows the coding to adapt to different data structures, and decoding the point cloud data in a mode determined by its application scene ensures that the reconstructed point cloud data meets the requirements of that scene.
Based on the above overall conception of the point cloud reconstruction method of the present invention, various embodiments of the point cloud reconstruction method of the present invention are presented.
Referring to fig. 2, fig. 2 is a flow chart of a first embodiment of the point cloud reconstruction method according to the present invention. It should be noted that, although a logical order is shown in the flowchart, in some cases, the respective steps of the point cloud reconstruction method of the present invention may of course be performed in an order different from that herein.
In this embodiment, the point cloud reconstruction method of the present invention is applied to a point cloud reconstruction system, where the point cloud reconstruction system includes an adaptive coding module, and the point cloud reconstruction method includes:
step S10: and acquiring point cloud data to be reconstructed and a data application scene.
In this embodiment, the point cloud reconstruction system acquires the point cloud data to be reconstructed and the data application scene of the reconstructed point cloud data. The point cloud data to be reconstructed may be MPEG person point clouds, ScanNet scene point clouds, or KITTI radar point clouds, and the data application scene may be human-eye perception or machine perception.
Step S20: and encoding the point cloud data through the self-adaptive encoding module to obtain point cloud encoded data.
In this embodiment, the point cloud reconstruction system further encodes point cloud data to be reconstructed through the adaptive encoding module, and uses the encoded point cloud data as point cloud encoded data.
Step S30: and decoding the point cloud coding data according to the data application scene to obtain point cloud reconstruction data.
In this embodiment, the point cloud reconstruction system decodes the point cloud encoded data encoded by the adaptive encoding module according to the data application scene of the point cloud data, so that the decoded point cloud reconstruction data meets the requirements of the data application scene.
Taking MPEG person point clouds as an example: the point cloud reconstruction system first determines the data application scene of the acquired person point cloud, i.e., whether it will serve human-eye perception or machine recognition, then encodes the person point cloud through the adaptive encoding module, and finally decodes the encoded person point cloud according to that scene to obtain point cloud reconstruction data. For example, when the reconstructed person point cloud serves human-eye perception, the system reconstructs point cloud data that human eyes can recognize; when it serves machine perception, the system reconstructs point cloud data that machines can recognize.
In this embodiment, encoding person point clouds, scene point clouds, and radar point clouds through the adaptive encoding module allows the coding of point cloud data to adapt to different data structures, and decoding the point cloud data according to the data application scene allows the reconstructed point cloud data to meet the requirements of that scene.
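The three steps S10-S30 can be summarized as a toy pipeline. The uniform 1/64 voxel quantizer below merely stands in for the adaptive coding module, and the scene check stands in for the scene-dependent decoder described in the later embodiments; all numbers are illustrative:

```python
import numpy as np

def acquire():
    # Step S10 stand-in: a synthetic point cloud plus its application scene.
    rng = np.random.default_rng(1)
    return rng.random((100, 3)), "human_eye"

def encode(points):
    # Step S20 stand-in: quantize coordinates onto a 1/64 voxel grid
    # (the adaptive coding module is far richer in the actual system).
    return np.round(points * 64).astype(np.int32)

def decode(coded, scene):
    # Step S30 stand-in: dequantize; a real decoder branches on the scene
    # (human-eye perception vs. machine perception).
    assert scene in ("human_eye", "machine")
    return coded / 64.0

points, scene = acquire()
recon = decode(encode(points), scene)
max_err = float(np.abs(recon - points).max())   # bounded by half a voxel
```

Even this toy version shows the contract the patent relies on: the decoder receives both the coded data and the application scene, never the coded data alone.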
Further, based on the first embodiment of the point cloud reconstruction method of the present invention, a second embodiment of the point cloud reconstruction method of the present invention is provided.
In this embodiment, the adaptive coding module includes a bit allocation unit and a feature learning unit, and step S20, encoding the point cloud data through the adaptive encoding module to obtain point cloud encoded data, includes the following steps:
step S201: and separating the ROI in the point cloud data through the bit distribution unit to obtain the point cloud prior data.
The separation process is to separate the ROI area and the background area in the point cloud data.
In this embodiment, when the point cloud reconstruction system encodes the point cloud data through the adaptive encoding module, the bit allocation unit performs adaptive rate control over the content of different regions of the point cloud data: the background of a scene point cloud can be assigned a lower code rate, while the regions of important objects should be assigned a higher one, realizing region-wise bit allocation. The system therefore separates the ROI (the important region) and the background region in the point cloud data through the bit allocation unit, applies emphasis coding to the ROI and lossy compression to the background region, and takes the processed point cloud data as the point cloud prior data.
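A minimal sketch of this ROI-aware bit allocation, using finer quantization (more bits) for ROI points and coarser quantization for the background; the bit depths are illustrative assumptions, not values from the patent:

```python
import numpy as np

def roi_bit_allocation(points, roi_mask, roi_bits=10, bg_bits=4):
    # More quantization levels (finer step, higher rate) inside the ROI,
    # fewer for the background; the bit depths are illustrative.
    roi_q = np.round(points * 2 ** roi_bits) / 2 ** roi_bits
    bg_q = np.round(points * 2 ** bg_bits) / 2 ** bg_bits
    return np.where(roi_mask[:, None], roi_q, bg_q)   # "point cloud prior data"
```

With these settings the per-coordinate error is bounded by 2^-11 inside the ROI versus 2^-5 in the background, mirroring emphasis coding of the ROI versus lossy compression of the background.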
Step S202: performing local attention extraction on the point cloud prior data through the feature learning unit to obtain point cloud context data, and encoding the point cloud prior data according to the point cloud context data to obtain point cloud encoded data.
It should be noted that local attention extraction refers to performing super-prior encoding, arithmetic decoding, and super-prior decoding on the point cloud prior data; the point cloud context data are the point cloud data features obtained by this process.
In this embodiment, the point cloud reconstruction system further performs arithmetic coding and arithmetic decoding on the point cloud priori data through the feature learning unit, so as to extract point cloud data features in the point cloud priori data, and further performs key coding on important areas in the point cloud priori data according to the obtained point cloud data features to obtain point cloud coded data.
Referring to fig. 3, fig. 3 is a block diagram of a bit allocation network according to an embodiment of the point cloud reconstruction method of the present invention. The point cloud data in fig. 3 is LiDAR-scanned point cloud data in autonomous driving. Because the reconstructed point cloud data is applied to machine recognition and mainly serves subsequent algorithms such as target detection and recognition, the point cloud reconstruction takes the relevant basic vision tasks as its optimization target when reconstructing the scanned point cloud data.
Before encoding the point cloud data, the point cloud reconstruction system first distinguishes which regions need to be encoded as key regions. For the segmentation results of point cloud datasets in LiDAR point cloud scenes, the point cloud reconstruction system divides the data into three categories: the first category is pedestrians, vehicles and the like; the second category is roads, sidewalks and the like; and the third category is green belts, buildings and the like. Machine vision perception is mainly oriented to pedestrians and vehicles, and an autonomous driving scene needs to perceive the road surface range, so the first and second categories are taken as the important objects of attention to be encoded, separating the important regions (ROI, e.g. pedestrians, vehicles, roads and sidewalks) in the point cloud data. After the ROI is separated, the point cloud data is converted into a sparse tensor representation, on which coordinate-feature encoding can be performed with 3D sparse convolution, yielding the point cloud prior data.
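The conversion of the separated ROI points into a sparse tensor form can be sketched as coordinate quantization with deduplication; the voxel size and function name are assumptions for illustration:

```python
def to_sparse_coords(points, voxel_size=1.0):
    """Quantize 3D points to integer voxel coordinates and deduplicate them,
    giving the sparse coordinate list that 3D sparse convolution consumes."""
    seen, coords = set(), []
    for x, y, z in points:
        c = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        if c not in seen:           # keep one entry per occupied voxel
            seen.add(c)
            coords.append(c)
    return coords

pts = [(0.2, 0.7, 0.1), (0.9, 0.3, 0.4), (1.5, 0.2, 0.8)]
print(to_sparse_coords(pts))  # the first two points fall into the same voxel
```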
When the point cloud reconstruction system encodes point cloud data with an arithmetic encoder, the arithmetic encoder needs the occurrence probability or cumulative probability distribution table (CDF) of the decoded points in both the encoding and decoding stages, so the point cloud prior data needs to be transmitted to the decoding end for correct entropy decoding. The point cloud reconstruction system therefore controls the hyper-prior network, through the feature learning unit, to compress the point cloud prior data into a hyper-latent z, performs quantization and entropy coding on z and transmits it to the hyper-prior decoding end, which decodes and learns the modeling parameters (i.e. the point cloud context data) of the latent representation y. After the modeled distribution of the latent representation y is obtained through the hyper-prior network, the quantized latent ŷ is modeled (i.e. the context model) and entropy-encoded to obtain the compressed bitstream file (i.e. the point cloud encoded data); entropy decoding then recovers ŷ, and the entropy-decoded ŷ is input to the main decoding end to obtain the final reconstruction x̂. For optimizing the coding and decoding parameters of the whole network, the overall rate-distortion function is still adopted:
L_E = λ*D + R
where D represents the distortion between the reconstruction and the original, R represents the compression code rate of the whole frame, i.e. the sum of the code stream R_ŷ of the main encoder and the code stream R_ẑ of the hyper-prior encoder, and λ represents the parameter controlling the size of the point cloud code stream.
It should be understood that, since the reconstructed point cloud data is applied to machine recognition, after decoding is completed, point cloud target detection and segmentation are performed on the decoded and restored data. Taking the segmentation part as an example here, the segmentation precision S(iou) needs to be added to the loss function as a constraint. It can be understood that the higher the segmentation precision, the more obvious the coding benefit, and the background region is weakened so that it need not occupy as many bits. The overall loss function formula is therefore:
L = L_E + μ*S(iou)
where L_E represents the compression loss function, S(iou) represents the loss function of the point cloud segmentation, and μ represents the parameter controlling the segmentation loss.
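The two loss formulas above, L_E = λ*D + R and L = L_E + μ*S(iou), can be sketched directly; the function and variable names are illustrative:

```python
def compression_loss(distortion, rate, lam):
    """L_E = lambda * D + R"""
    return lam * distortion + rate

def total_loss(distortion, rate, seg_loss, lam, mu):
    """L = L_E + mu * S(iou)"""
    return compression_loss(distortion, rate, lam) + mu * seg_loss
```

A larger λ puts more weight on distortion (spending more bits), while μ trades off the segmentation constraint against pure compression quality.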
Optionally, in a possible embodiment, the adaptive coding module includes: code rate variable unit, the above-mentioned step S20: the step of encoding the point cloud data by the adaptive encoding module to obtain point cloud encoded data further comprises:
step S203: and establishing a plurality of dynamic encoders with shared parameters through the code rate variable unit, and taking the dynamic encoder matched with the characteristic channel number of the point cloud data in the plurality of dynamic encoders as a target encoder.
In this embodiment, the point cloud reconstruction system establishes a plurality of dynamic encoders through the code rate variable unit, and parameters used by each dynamic encoder belong to a preset full-quantity parameter set, and then the point cloud reconstruction system determines a matched dynamic encoder among the plurality of dynamic encoders according to the number of characteristic channels of the point cloud data, and takes the dynamic encoder as a target encoder.
Step S204: and determining a target parameter in the parameters according to a preset rule, and controlling the target encoder to encode the point cloud data according to the target parameter to obtain point cloud encoded data.
It should be noted that the dynamic encoders are all sub-neural network models.
In this embodiment, the point cloud reconstruction system further determines the parameter slice index of each layer operator in the corresponding sub-neural network model according to a preset rule, so as to partially access the full-scale parameter set, take the accessed parameter as a target parameter, and further control the target encoder to encode the point cloud data according to the target parameter to obtain the point cloud encoded data.
In the traditional point cloud reconstruction method, the target code rate is controlled by the quality weight λ in the rate-distortion loss function: in end-to-end optimization, a higher λ steers the learned model weights toward lower distortion at the cost of a higher code rate. In practical applications, because input point cloud data needs to be encoded at different compression rates, and constrained by the above, multiple groups of λ must be set manually to correspond to multiple code rate points, a complete set of model parameters must be allocated for each λ, and a full model must be trained independently until convergence. This scheme not only incurs high model training overhead, but also brings inconvenience to the storage, distribution and loading of codec parameters.
As an example, the point cloud reconstruction system first gives a unified extended ("dynamized") definition to the fully-connected layer, 2D convolution layer, Minkowski convolution layer, normalization layer and other operators used in the neural network model, so that they support dynamic slicing of their built-in parameters. It thereby establishes multi-level parameter-sharing sub-neural-network models (i.e. establishes a plurality of parameter-sharing dynamic encoders). At run time, it specifies the feature specification to be processed in the encoder (measured by the number of feature channels k), determines the corresponding dynamic encoder level (i.e. determines the target encoder), and determines the parameter slice index of each layer operator in that dynamic encoder according to a predefined rule, so as to partially access the full parameter set w_full, ensure dynamic activation and execution of the model, and obtain the point cloud encoded data. The parameter sharing of the dynamic encoders satisfies:
w_k ⊂ w_{k+Δ} ⊆ w_full,  k > 0, Δ > 0
where w_k represents the parameters of the dynamic encoder with k feature channels and w_full represents the full parameter set. Because the parameters of the dynamic encoders are all subsets of the same full parameter set (the dynamic encoder of the highest quality level corresponds to the full parameter set itself), the scheme avoids the cost of distributing and invoking multiple groups of full model parameters of the same specification. When encoding and decoding at a lower target code rate, the space-time complexity of the dynamic encoder does not differ significantly from an independent encoder of the same internal feature specification, and is significantly lower than that of a higher-rate dynamic encoder. This avoids the drawback of possible alternatives (e.g. parameter modulation methods, which may also achieve variable-rate coding and decoding) whose complexity remains at the full-model level even during low-rate coding.
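The prefix-subset parameter sharing can be sketched as follows; the list-of-channel-rows representation of w_full is a simplification for illustration:

```python
def slice_params(w_full, k):
    """Select the dynamic encoder w_k as the first k channel slices of the
    full parameter set, so w_k is always a subset of any larger w_{k+delta}."""
    assert 0 < k <= len(w_full)
    return w_full[:k]

w_full = [[0.1], [0.2], [0.3], [0.4]]   # toy full parameter set, 4 channels
w2 = slice_params(w_full, 2)            # low-rate dynamic encoder
w3 = slice_params(w_full, 3)            # higher-rate encoder reuses w2's slices
print(w3[:2] == w2)                     # the nesting property holds
```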
On this basis, during joint optimization of the dynamic encoders, the point cloud reconstruction system can also allocate the highest preset quality weight λ_max to each dynamic encoder, continuously test the RD (rate-distortion) response of each dynamic encoder on a dataset during iteration, and adaptively reduce the quality weight λ_k in the loss function corresponding to each dynamic encoder w_k, so that the final RD responses are geometrically consistent with the estimated optimal curve. At that point, no dynamic encoder suffers a loss in compression efficiency due to an excessively high quality target.
Referring to fig. 4, fig. 4 is a schematic diagram of scalable point cloud encoding according to an embodiment of the present invention. The point cloud reconstruction system may further constrain the parameter slice splicing of the dynamic encoders: each dynamic encoder is only allowed to accumulate gradients on its parameter-increment portion relative to the previous dynamic encoder, so as to avoid the common parameters of multiple dynamic encoders being updated under the guidance of an optimization target corresponding to a higher quality level. In the training stage, for each batch of point cloud data, each dynamic encoder is activated separately in low-to-high level order to complete forward propagation, and gradients are accumulated according to the above rule. After all dynamic encoders have executed, the full parameter set is updated according to the accumulated gradients. In this way, a layering effect arises naturally in the bitstream output by the highest-level dynamic encoder: the bitstream corresponding to a certain number of channels is completely consistent with the output of the dynamic encoder of the corresponding specification. This design ensures that a legal bitstream (base layer, plus optional enhancement layers) has optimal rate-distortion performance when processed by a decoder network matched to its width, so that the base-layer bitstream can be decoded independently, or higher-quality point cloud reconstruction can be performed after one or more enhancement-layer bitstreams are added.
In this embodiment, encoding the point cloud data through the bit allocation unit improves the encoding performance of the point cloud data, and controlling each dynamic encoder through the code rate variable unit so that they encode according to the same set of model parameters allows one set of model parameters to adapt to the encoding requirements of different target code rates, reducing the model training cost.
Further, based on the first embodiment and the second embodiment of the point cloud reconstruction method of the present invention, a third embodiment of the point cloud reconstruction method of the present invention is provided.
In this embodiment, the step S30 is as follows: decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data, including:
step S301: and when the data application scene is human eye perception, extracting prior information of the point cloud significance data corresponding to the point cloud data to obtain point cloud gain data.
The prior information extraction processing refers to processing of extracting data features from the network structure of the convolutional layer-residual block-convolutional layer.
In this embodiment, after obtaining point cloud encoded data, if an application scene of point cloud data to be reconstructed is human eye perception, the point cloud reconstruction system extracts data features in point cloud significance data corresponding to the point cloud data through a network structure of a convolution layer, a residual block and a convolution layer, and uses the obtained data features as point cloud gain data.
Step S302: and decoding the point cloud coding data according to the point cloud gain data to obtain point cloud reconstruction data.
In this embodiment, after the point cloud reconstruction system obtains the point cloud gain data, the point cloud reconstruction system decodes the point cloud encoded data according to the point cloud gain data, so as to achieve the effect of decoding the point cloud encoded data of the information gain part according to the information gain in the point cloud gain data, thereby improving the subjective visual quality of the reconstructed point cloud data.
Referring to fig. 5, fig. 5 is a schematic diagram of a human-eye perception network structure according to an embodiment of the point cloud reconstruction method of the present invention. In practical applications, the network structure is deployed at the decoding network end. The point cloud reconstruction system receives a reconstructed point cloud (the point cloud encoded data) and a point cloud saliency map (the point cloud saliency data). The point cloud saliency map computation network contains rich prior information about human vision and can provide additional gain to the reconstruction quality of the point cloud. The reconstruction point cloud network consists of two convolution layers and three residual blocks for extracting data features. The point cloud reconstruction system passes the point cloud saliency map through each convolution layer and residual block to output feature maps, and adds them to the feature maps of the reconstructed point cloud at the same layer, so that the point cloud regions attended by human eyes obtain additional information gain and thus better subjective visual quality. Denote the reconstructed point cloud as x̂ and the point cloud saliency map as x_roi. The subjective quality enhancement procedure of the whole network can be expressed as:

x̂_s = H(x̂, x_roi)

where x̂_s is the point cloud with enhanced subjective quality and H denotes the enhancement network that adds the saliency branch's feature maps to the reconstructed point cloud's feature maps layer by layer.
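A minimal sketch of the layer-wise feature addition described above, with scalar "feature maps" standing in for the conv/residual outputs; the names and values are illustrative:

```python
def enhance_features(recon_feats, saliency_feats):
    """Add the saliency branch's feature map to the reconstructed point
    cloud's feature map at each layer, boosting the regions the human eye
    attends to. Real feature maps are tensors; scalars keep the sketch small."""
    return [r + s for r, s in zip(recon_feats, saliency_feats)]

recon = [0.5, 0.8, 0.2]     # per-layer features of the reconstructed cloud
saliency = [0.3, 0.0, 0.1]  # saliency gain is non-zero only where attended
print(enhance_features(recon, saliency))
```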
Referring to fig. 6, fig. 6 is a schematic flow chart of an embodiment of a point cloud human eye perception implementation of the point cloud reconstruction method according to the present invention, wherein after dense/sparse point cloud data is obtained, the point cloud reconstruction system performs quantization, octree construction and voxel processing on the obtained data, and compresses the point cloud data into a code stream through a deep learning-based encoder (entropy encoder), so as to decode the code stream according to feature data of a point cloud significance network.
Optionally, in one possible embodiment, step S30 above: decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data, including:
step S303: when the data application scene is machine perception, object detection is carried out on the point cloud coding data to obtain object detection data;
in this embodiment, when the data application scene obtained by the point cloud reconstruction system is machine perception, the point cloud reconstruction system detects target information such as objects in the point cloud encoded data and uses it as the object detection data.
Step S304: and decoding the point cloud coding data according to the object detection data to obtain point cloud reconstruction data.
In this embodiment, after the object detection data is obtained, the point cloud reconstruction system decodes the point cloud encoded data according to the object detection data, performing emphasis decoding on the point cloud encoded data corresponding to the object detection data, so as to obtain the point cloud reconstruction data.
Referring to fig. 7, fig. 7 is a schematic diagram of a machine-perception network structure according to an embodiment of the point cloud reconstruction method of the present invention. The coding framework is a variational coding framework: the point cloud reconstruction system controls the main encoder to perform point cloud geometric coding by sparse convolution, the quantized features are arithmetically coded using a probability model, conditional entropy estimation is performed using autoregressive and hyper-prior models, and the bitstream is then restored by an arithmetic decoder. The point cloud reconstruction system uses a multi-head detector to detect objects in the scene point cloud data to obtain the object detection data, and uses the main decoder to reconstruct the point cloud data according to the object detection data to obtain the point cloud reconstruction data. Meanwhile, the point cloud reconstruction system can also jointly optimize the losses of the detection and reconstruction tasks to realize end-to-end training.
For example, where the original input data is a set of voxels, the point cloud dataset is:

C = {(x_i, y_i, z_i)}_i
After feature extraction by sparse convolution, the output is:

f_u^out = Σ_{i ∈ N^3(u, C_in)} W_i · f_{u+i}^in,  for u ∈ C_out

where C_in and C_out represent the input/output point cloud coordinates, f_u^in and f_u^out the input/output features at coordinate u, and W_i the kernel weights. The three-dimensional convolution kernel support is expressed as:

N^3(u, C_in) = {i | u + i ∈ C_in, i ∈ N^3}
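The sparse convolution formula can be sketched directly: only occupied input coordinates within the kernel support contribute to each output site. The dictionary-based lookup and scalar features are simplifications for illustration:

```python
def sparse_conv(coords_in, feats_in, coords_out, kernel):
    """f_u^out = sum over i in N^3(u, C_in) of W_i * f_{u+i}^in, where the
    sum is restricted to offsets i whose target u+i is an occupied input."""
    lookup = dict(zip(coords_in, feats_in))
    out = []
    for ux, uy, uz in coords_out:
        acc = 0.0
        for (ix, iy, iz), w in kernel.items():   # i ranges over the support
            v = (ux + ix, uy + iy, uz + iz)
            if v in lookup:                      # u + i is in C_in
                acc += w * lookup[v]
        out.append(acc)
    return out

coords = [(0, 0, 0), (1, 0, 0)]
feats = [1.0, 2.0]
kernel = {(0, 0, 0): 1.0, (1, 0, 0): 0.5}        # toy 2-tap kernel
print(sparse_conv(coords, feats, [(0, 0, 0)], kernel))  # 1.0*1.0 + 0.5*2.0
```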
The loss function is composed as follows: the point cloud compression part adopts code rate and distortion losses, optimized with a rate-distortion weighting. The detection end is composed of two losses: the classification loss uses focal loss against the original labels, and the regression loss uses the distance between the predicted box and the ground-truth box as a constraint.
L_E = λ1*D + R,
D = ζ_mse + λ2*L_det,
where R represents the code rate, E represents the computed entropy, and x̂ denotes the coordinates of the reconstructed point cloud. L_E represents the compression loss function, λ1 is the bitstream size control parameter, and D represents the distortion loss. L_det = L_cls + L_reg represents the perception loss function, where L_cls represents the classification loss and L_reg represents the regression loss of point cloud target detection. p̂, b̂ and p, b represent the predicted labels and the original classification/detection labels, respectively. ζ_mse is the distance loss, and λ2 is a parameter adjusting the weight of the detection model.
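The joint loss above (L_E = λ1*D + R, with D = ζ_mse + λ2*L_det and L_det composed of classification and regression terms) can be sketched as follows; the function names are illustrative:

```python
def detection_loss(l_cls, l_reg):
    """L_det = L_cls + L_reg"""
    return l_cls + l_reg

def joint_loss(zeta_mse, l_cls, l_reg, rate, lam1, lam2):
    """L_E = lambda1 * D + R, with D = zeta_mse + lambda2 * L_det"""
    d = zeta_mse + lam2 * detection_loss(l_cls, l_reg)
    return lam1 * d + rate
```

Setting λ2 = 0 recovers a pure compression loss, while increasing λ2 makes the codec favor reconstructions that keep the detector accurate.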
Referring to fig. 8, fig. 8 is a flow chart illustrating a machine-aware implementation of an embodiment of a point cloud reconstruction method according to the present invention, wherein after dense/sparse point cloud data is obtained, a point cloud reconstruction system performs quantization, octree construction and voxel processing on the obtained data, compresses the point cloud data into a code stream by a deep learning-based encoder (entropy encoder), and further decodes the code stream according to object detection data obtained by point cloud detection classification, and obtains reconstructed point cloud data.
In this embodiment, the human-eye prior information in the point cloud saliency map is extracted, and by decoding the point cloud encoded data according to this prior information, the obtained reconstructed point cloud data has higher human visual quality. In addition, by extracting information such as object positions in the point cloud data, the invention also makes the reconstructed point cloud data obtained by decoding according to the object position information convenient for machine recognition.
Further, based on the above embodiments of the point cloud reconstruction method of the present invention, a fourth embodiment of the point cloud reconstruction method of the present invention is presented.
In the present embodiment, in the above step S20: before the step of encoding the point cloud data by the adaptive encoding module to obtain point cloud encoded data, the method further includes:
step S40: obtaining a point cloud data category according to the density of the point cloud data, determining a downsampling step length according to the point cloud data category, and downsampling the point cloud data according to the downsampling step length.
In this embodiment, after obtaining the point cloud data, the point cloud reconstruction system determines the point cloud data category according to the density of the point cloud data, where the category may be, for example, low-density or high-density; it then determines a downsampling step according to the point cloud data category and downsamples the point cloud data according to that step.
Illustratively, after obtaining the point cloud data, the point cloud reconstruction system classifies it according to point cloud density into two main categories: low-density point cloud datasets and high-density point cloud datasets. A low-density point cloud dataset is typically a KITTI outdoor scene dataset, and a high-density point cloud dataset is typically an MPEG character dataset. The downsampling steps of different point cloud data are then determined according to the density difference; for example, the downsampling step of a high-density point cloud dataset may be 2, while that of a low-density point cloud dataset is 1. It should be understood that, in practical applications, the downsampling step may also be determined based on the density range.
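The density-driven choice of downsampling step can be sketched as follows; the density threshold and function names are illustrative assumptions (the patent only states step 2 for high-density and step 1 for low-density data):

```python
def choose_downsample_step(num_points, volume, dense_threshold=1000.0):
    """Classify the cloud by point density and map the class to a step size."""
    density = num_points / volume
    return 2 if density >= dense_threshold else 1

def downsample(points, step):
    """Stride-based downsampling; step 1 leaves the cloud unchanged."""
    return points[::step]

step = choose_downsample_step(num_points=8000, volume=2.0)  # density 4000
print(step, downsample([(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3)], step))
```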
Based on this, step S20 described above: the self-adaptive coding module codes the point cloud data to obtain point cloud coding data, and the method further comprises the following steps:
step S210: and encoding the down-sampled point cloud data through the self-adaptive encoding module to obtain point cloud encoded data.
As an example, the point cloud reconstruction system performs downsampling on the point cloud data according to the density of the point cloud data, and then encodes the downsampled point cloud data through the adaptive encoding module to obtain point cloud encoded data.
Optionally, in a possible embodiment, in step S30 above: after decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data, the method further comprises:
step S50: and determining an up-sampling step length according to the point cloud data category, up-sampling the point cloud reconstruction data according to the up-sampling step length, and taking the up-sampled point cloud reconstruction data as new point cloud reconstruction data.
In this embodiment, after determining the point cloud data type, the point cloud reconstruction system may further determine an up-sampling step according to the point cloud data type, and up-sample the point cloud reconstruction data according to the up-sampling step, so as to use the up-sampled point cloud reconstruction data as new point cloud reconstruction data.
Illustratively, after determining the point cloud data category, the point cloud reconstruction system adapts to datasets of different densities through a high-resolution → low-resolution → high-resolution restoration and lossy → lossless → lossy codec procedure. The processing from high resolution (i.e. high density) to low resolution (i.e. low density) is downsampling, and the processing from low resolution to high resolution is upsampling. The up/down-sampling step determines the processing mode according to the density of the dataset: in general, for a denser high-density point cloud dataset the step is larger, and for a sparser low-density dataset the step is smaller. After determining the upsampling step, the point cloud reconstruction system upsamples the point cloud reconstruction data to obtain the new point cloud reconstruction data.
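The coordinate side of the upsampling step can be sketched as the inverse of the stride quantization; this placeholder ignores the learned refinement and only undoes the stride:

```python
def upsample_coords(coords, step):
    """Map low-resolution voxel coordinates back onto the high-resolution
    grid, the inverse of a stride-`step` quantization (no new points are
    synthesized in this simplified placeholder)."""
    return [(x * step, y * step, z * step) for x, y, z in coords]

low_res = [(0, 0, 0), (1, 2, 3)]
print(upsample_coords(low_res, 2))  # coordinates back on the step-2 grid
```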
Referring to fig. 9, fig. 9 is a point cloud reconstruction frame diagram of an embodiment of a point cloud reconstruction method according to the present invention, wherein after down-sampling high-density point cloud data, the point cloud data is encoded through adaptive up-down sampling, variable code rate encoding, bit allocation and scalable encoding, and then high-density up-sampling is performed after decoding, so that the obtained point cloud reconstruction data accords with application scenarios of machine perception and/or human eye perception.
Referring to fig. 10, fig. 10 is a flowchart of an embodiment of a point cloud reconstruction method according to the present invention. After dense/sparse point cloud data is obtained, the point cloud reconstruction system performs quantization, octree construction and voxelization on the obtained data, compresses the point cloud data into a bitstream through a deep-learning-based encoder (entropy encoder), decodes the bitstream through a lossless decoder (the lossless codec is also designed based on deep learning), and performs upsampling after decoding to obtain the point cloud reconstruction data. Further, variable code rate, bit allocation and scalable coding techniques are used to design a deep-learning-based codec, and the adaptively up/down-sampled point cloud data is processed by this codec, so that the point cloud reconstruction data fits application scenarios of machine perception and/or human-eye perception.
In this embodiment, by adaptively controlling the step size and processing mode of up/down-sampling according to density, the invention achieves high reliability and high stability of coding quality for point clouds of different densities.
The invention further provides a point cloud reconstruction system, which comprises the self-adaptive coding module.
Referring to fig. 11, the point cloud reconstruction system includes:
the first acquisition module 10 is used for acquiring point cloud data to be compressed and a data application scene;
the point cloud encoding module 20 is configured to encode the point cloud data by using the adaptive encoding module to obtain point cloud encoded data;
the point cloud decoding module 30 is configured to decode the point cloud encoded data according to the data application scenario to obtain point cloud reconstruction data.
Optionally, the adaptive coding module includes: a bit allocation unit and a feature learning unit, the point cloud encoding module 20 includes:
the prior data acquisition unit is used for separating and processing the ROI area in the point cloud data through the bit distribution unit to obtain point cloud prior data;
and the prior data encoding unit is used for carrying out local attention extraction processing on the point cloud prior data through the characteristic learning unit to obtain point cloud context data, and encoding the point cloud prior data according to the point cloud context data to obtain point cloud encoded data.
Optionally, the adaptive coding module includes: a code rate variable unit, the point cloud coding module 20 includes:
the encoder establishing unit is used for establishing a plurality of dynamic encoders with shared parameters through the code rate variable unit, and taking the dynamic encoder matched with the characteristic channel number of the point cloud data in the plurality of dynamic encoders as a target encoder;
and the shared parameter coding unit is used for determining a target parameter in the parameters according to a preset rule and controlling the target encoder to code the point cloud data according to the target parameter to obtain point cloud coded data.
Optionally, the point cloud decoding module 30 includes:
the human eye priori data acquisition unit is used for extracting priori information from the point cloud significance data corresponding to the point cloud data to obtain point cloud gain data when the data application scene is human eye perception;
and the decoding gain unit is used for decoding the point cloud coding data according to the point cloud gain data to obtain point cloud reconstruction data.
Optionally, the point cloud decoding module 30 includes:
the object detection unit is used for carrying out object detection on the point cloud coded data to obtain object detection data when the data application scene is machine perception;
And the machine decoding unit is used for decoding the point cloud coding data according to the object detection data to obtain point cloud reconstruction data.
Optionally, the point cloud reconstruction system further comprises:
the downsampling module is used for obtaining a point cloud data category according to the density of the point cloud data, determining a downsampling step length according to the point cloud data category, and downsampling the point cloud data according to the downsampling step length;
based on this, the point cloud encoding module 20 is further configured to encode the down-sampled point cloud data by using the adaptive encoding module to obtain point cloud encoded data.
Optionally, the point cloud reconstruction system further comprises:
and the up-sampling module is used for determining an up-sampling step length according to the point cloud data category, up-sampling the point cloud reconstruction data according to the up-sampling step length, and taking the up-sampled point cloud reconstruction data as new point cloud reconstruction data.
The function implementation of each module in the point cloud reconstruction system corresponds to each step in the embodiment of the point cloud reconstruction method, and the function and implementation process of each module are not described in detail herein.
In addition, the present invention further provides a terminal device, the terminal device comprising: a memory, a processor, and a point cloud reconstruction program stored in the memory and executable on the processor, wherein the point cloud reconstruction program, when executed by the processor, implements the steps of the point cloud reconstruction method according to the present invention.
The specific embodiments of the terminal device of the present invention are substantially the same as the embodiments of the point cloud reconstruction method described above, and are not described again here.
In addition, the present invention further provides a computer storage medium storing a point cloud reconstruction program which, when executed by a processor, implements the steps of the point cloud reconstruction method described above.
The specific embodiments of the computer storage medium of the present invention are substantially the same as the embodiments of the point cloud reconstruction method described above, and are not described again here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing embodiment numbers of the present application are for description only and do not represent the relative merits of the embodiments.
From the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is preferred. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and comprising several instructions for causing a terminal device (which may be an in-vehicle computer, a smart phone, a computer, a server, or the like) to perform the methods described in the embodiments of the present application.
The foregoing description covers only the preferred embodiments of the present application and is not intended to limit the scope of the claims; all equivalent structures or equivalent process variations made using the description and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise fall within the scope of protection of the present application.

Claims (10)

1. A point cloud reconstruction method, characterized in that the point cloud reconstruction method is applied to a point cloud reconstruction system, the point cloud reconstruction system comprises an adaptive encoding module, and the point cloud reconstruction method comprises the following steps:
acquiring point cloud data to be reconstructed and a data application scene;
encoding the point cloud data through the adaptive encoding module to obtain point cloud encoded data;
and decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data.
2. The point cloud reconstruction method of claim 1, wherein the adaptive encoding module comprises a bit allocation unit and a feature learning unit, and the step of encoding the point cloud data through the adaptive encoding module to obtain point cloud encoded data comprises:
performing separation processing on a region of interest (ROI) in the point cloud data through the bit allocation unit to obtain point cloud prior data;
and performing local attention extraction processing on the point cloud prior data through the feature learning unit to obtain point cloud context data, and encoding the point cloud prior data according to the point cloud context data to obtain the point cloud encoded data.
3. The point cloud reconstruction method of claim 1, wherein the adaptive encoding module comprises a variable-rate unit, and the step of encoding the point cloud data through the adaptive encoding module to obtain point cloud encoded data further comprises:
establishing, through the variable-rate unit, a plurality of dynamic encoders with shared parameters, and taking, among the plurality of dynamic encoders, the dynamic encoder matching the number of feature channels of the point cloud data as a target encoder;
and determining a target parameter among the parameters according to a preset rule, and controlling the target encoder to encode the point cloud data according to the target parameter to obtain the point cloud encoded data.
4. The point cloud reconstruction method of claim 1, wherein the step of decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data comprises:
when the data application scene is human visual perception, performing prior information extraction processing on point cloud saliency data corresponding to the point cloud data to obtain point cloud gain data;
and decoding the point cloud encoded data according to the point cloud gain data to obtain the point cloud reconstruction data.
5. The point cloud reconstruction method of claim 1, wherein the step of decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data further comprises:
when the data application scene is machine perception, performing object detection on the point cloud encoded data to obtain object detection data;
and decoding the point cloud encoded data according to the object detection data to obtain the point cloud reconstruction data.
6. The point cloud reconstruction method of claim 1, wherein before the step of encoding the point cloud data through the adaptive encoding module to obtain point cloud encoded data, the method further comprises:
obtaining a point cloud data category according to the density of the point cloud data, determining a downsampling step size according to the point cloud data category, and downsampling the point cloud data according to the downsampling step size;
and the step of encoding the point cloud data through the adaptive encoding module to obtain point cloud encoded data comprises:
encoding the downsampled point cloud data through the adaptive encoding module to obtain the point cloud encoded data.
7. The point cloud reconstruction method of claim 6, wherein after the step of decoding the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data, the method further comprises:
determining an upsampling step size according to the point cloud data category, upsampling the point cloud reconstruction data according to the upsampling step size, and taking the upsampled point cloud reconstruction data as new point cloud reconstruction data.
8. A point cloud reconstruction system, characterized in that the point cloud reconstruction system comprises an adaptive encoding module, and further comprises:
a first acquisition module, configured to acquire point cloud data to be compressed and a data application scene;
a point cloud encoding module, configured to encode the point cloud data through the adaptive encoding module to obtain point cloud encoded data;
and a point cloud decoding module, configured to decode the point cloud encoded data according to the data application scene to obtain point cloud reconstruction data.
9. A terminal device, characterized in that the terminal device comprises: a memory, a processor, and a point cloud reconstruction program stored on the memory and executable on the processor, wherein the point cloud reconstruction program, when executed by the processor, implements the steps of the point cloud reconstruction method according to any one of claims 1 to 7.
10. A computer storage medium, characterized in that a point cloud reconstruction program is stored on the computer storage medium, and the point cloud reconstruction program, when executed by a processor, implements the steps of the point cloud reconstruction method according to any one of claims 1 to 7.
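The control flow recited in claims 1, 4, and 5 can be sketched as a simple encode-then-dispatch routine. This is only a rough illustration under stated assumptions, not the patented implementation: every name here (`encode`, `decode_for_human_vision`, `decode_for_machine_perception`, `reconstruct`) is hypothetical, and the saliency-guided and detection-guided decoding of claims 4 and 5 is reduced to a trivial invertible transform so the sketch round-trips.

```python
from dataclasses import dataclass

@dataclass
class Reconstruction:
    points: list
    scene: str

def encode(points):
    # Stand-in for the adaptive encoding module: a trivial reversible
    # transform so the sketch round-trips exactly.
    return [(x * 2, y * 2, z * 2) for (x, y, z) in points]

def decode_for_human_vision(coded):
    # Claim 4 branch: prior information extracted from saliency data would
    # yield gain data that steers decoding; here, only the plain inverse.
    return [(x / 2, y / 2, z / 2) for (x, y, z) in coded]

def decode_for_machine_perception(coded):
    # Claim 5 branch: object detection on the encoded data would guide
    # decoding; again reduced to the plain inverse transform.
    return [(x / 2, y / 2, z / 2) for (x, y, z) in coded]

def reconstruct(points, scene):
    """Claim 1: encode once, then decode according to the data application scene."""
    coded = encode(points)
    if scene == "human":
        decoded = decode_for_human_vision(coded)
    elif scene == "machine":
        decoded = decode_for_machine_perception(coded)
    else:
        raise ValueError(f"unknown data application scene: {scene}")
    return Reconstruction(points=decoded, scene=scene)
```

The point of the dispatch is that a single encoded bitstream serves both scenes; only the decoding path changes with the data application scene.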
CN202310882685.6A 2023-07-18 2023-07-18 Point cloud reconstruction method, system, terminal equipment and computer storage medium Pending CN117315189A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310882685.6A CN117315189A (en) 2023-07-18 2023-07-18 Point cloud reconstruction method, system, terminal equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310882685.6A CN117315189A (en) 2023-07-18 2023-07-18 Point cloud reconstruction method, system, terminal equipment and computer storage medium

Publications (1)

Publication Number Publication Date
CN117315189A true CN117315189A (en) 2023-12-29

Family

ID=89241505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310882685.6A Pending CN117315189A (en) 2023-07-18 2023-07-18 Point cloud reconstruction method, system, terminal equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN117315189A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117915114A (en) * 2024-03-15 2024-04-19 深圳大学 Point cloud attribute compression method, device, terminal and medium

Similar Documents

Publication Publication Date Title
US10964068B2 (en) Methods and devices for predictive point cloud attribute coding
US10904564B2 (en) Method and apparatus for video coding
US11010931B2 (en) Method and apparatus for video coding
CN113766228B (en) Point cloud compression method, encoder, decoder, and storage medium
Guarda et al. Point cloud coding: Adopting a deep learning-based approach
US8204325B2 (en) Systems and methods for texture synthesis for video coding with side information
KR20220127323A (en) Tree soup node size per slice
WO2021050007A1 (en) Network-based visual analysis
US11893762B2 (en) Method and data processing system for lossy image or video encoding, transmission and decoding
CN111641832A (en) Encoding method, decoding method, device, electronic device and storage medium
US20230164353A1 (en) Point cloud data processing device and processing method
US11483363B2 (en) Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
Guarda et al. Deep learning-based point cloud coding: A behavior and performance study
CN111898638B (en) Image processing method, electronic device and medium fusing different visual tasks
WO2023130333A1 (en) Encoding and decoding method, encoder, decoder, and storage medium
CN113379858A (en) Image compression method and device based on deep learning
CN115086658B (en) Point cloud data processing method and device, storage medium and encoding and decoding equipment
CN117242493A (en) Point cloud decoding, upsampling and model training method and device
WO2022131948A1 (en) Devices and methods for sequential coding for point cloud compression
US20230237704A1 (en) Point cloud decoding and encoding method, and decoder, encoder and encoding and decoding system
CN117315189A (en) Point cloud reconstruction method, system, terminal equipment and computer storage medium
CN115086660B (en) Decoding and encoding method based on point cloud attribute prediction, decoder and encoder
Mao et al. Learning to predict on octree for scalable point cloud geometry coding
WO2023050381A1 (en) Image and video coding using multi-sensor collaboration
CN117014633B (en) Cross-modal data compression method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination