CN113902061A - Point cloud completion method and device - Google Patents

Point cloud completion method and device

Info

Publication number
CN113902061A
CN113902061A
Authority
CN
China
Prior art keywords
point cloud
network
scale
data
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111349742.1A
Other languages
Chinese (zh)
Inventor
徐名业 (Mingye Xu)
王亚立 (Yali Wang)
乔宇 (Yu Qiao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN202111349742.1A priority Critical patent/CN113902061A/en
Priority to PCT/CN2021/138550 priority patent/WO2023082415A1/en
Publication of CN113902061A publication Critical patent/CN113902061A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a point cloud completion method and device. The method comprises: acquiring original point cloud data of a target to be completed; inputting the original point cloud data into a first generation network to obtain first complete point cloud data, wherein the first generation network is trained with an enhanced training data set; and inputting the first complete point cloud data into a second generation network to obtain second complete point cloud data as the completion result, wherein the second generation network generates the second complete point cloud data under semantic representation guidance. The invention improves the generality and robustness of point cloud completion and makes the local structure of the reconstructed point cloud clearer and more accurate.

Description

Point cloud completion method and device
Technical Field
The invention relates to the technical field of three-dimensional data processing, in particular to a point cloud completion method and device.
Background
A point cloud is a set of points sampled from a product's surface by a measuring instrument in reverse engineering. Point clouds obtained with a three-dimensional coordinate measuring machine contain few, widely spaced points and are called sparse point clouds; those obtained with a three-dimensional laser scanner or photogrammetric scanner contain many, densely spaced points and are called dense point clouds. With the spread of depth cameras and lidar 3D scanning devices, point clouds have become ever easier to acquire, recently attracting wide research interest in the vision and robotics communities. However, scanned three-dimensional point clouds are often incomplete due to occlusion, noise, and other factors, which hinders practical applications. Point cloud completion therefore becomes particularly critical: its aim is to predict the complete shape from a partial observation, and the local structure of the predicted shape should be clear, accurate, and noise-free.
In the prior art, the multi-view partial (MVP) point cloud dataset was introduced for point cloud completion; it contains 100,000 high-quality scanned partial and complete point clouds. Compared with other datasets, the MVP dataset is more challenging in terms of unified views and rich category diversity. This is because: (1) for each complete object, the corresponding incomplete objects are randomly rendered from 26 viewpoints, so the structures of the incomplete point clouds differ greatly, which limits the generality of existing methods; and (2) the 16 object categories further increase the diversity of the data. It can be observed that although each incomplete object is rendered from a different view, all the incomplete point clouds share the object's underlying shape attributes (global shape code and semantic category information). Given this fact, exploiting the underlying shape attributes is particularly critical in point cloud completion; they can also guide the local representations of an object toward its overall discriminative features. Current mainstream methods typically concatenate a global representation as an additional feature to each point in the generation stage. However, this operation does not directly and effectively influence the point-wise representations.
In summary, actually scanned point clouds are usually incomplete due to viewpoint, occlusion, and noise. Existing point cloud completion methods tend to generate a global shape skeleton but lack fine local details. In addition, different incomplete forms of the same object share its basic shape attributes (overall shape information and semantic category information). Given this fact, exploiting these basic shape attributes is particularly critical; they can also guide the local representations toward the globally discriminative representation of the same object.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a point cloud completion method and a point cloud completion device.
According to a first aspect of the invention, a point cloud completion method is provided. The method comprises the following steps:
acquiring original point cloud data of a target to be completed;
inputting the original point cloud data into a first generation network to obtain first complete point cloud data, wherein the first generation network is trained with an enhanced training data set;
inputting the first complete point cloud data into a second generation network to obtain second complete point cloud data as the completion result, wherein the second generation network generates the second complete point cloud data under semantic representation guidance.
According to a second aspect of the present invention, a point cloud completion device is provided. The device includes:
a data acquisition module, configured to acquire original point cloud data of a target to be completed;
a first point cloud generation module, configured to input the original point cloud data into a first generation network to obtain first complete point cloud data, wherein the first generation network is trained with an enhanced training data set;
a second point cloud generation module, configured to input the first complete point cloud data into a second generation network to obtain second complete point cloud data as the completion result, wherein the second generation network generates the second complete point cloud data under semantic representation guidance.
Compared with the prior art, the invention has the advantage that, whereas existing methods essentially perform only the generation stage of point cloud completion, without secondary refinement or semantic guidance, the invention refines the generated point cloud under semantic guidance.
Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flow diagram of a point cloud completion method according to one embodiment of the invention;
FIG. 2 is a schematic diagram of the overall process of a point cloud completion method according to one embodiment of the invention;
FIG. 3 is a schematic diagram of the "incomplete-incomplete" data enhancement according to one embodiment of the present invention;
FIG. 4 is a schematic diagram of a conditional refinement network according to one embodiment of the present invention;
FIG. 5 is a schematic diagram of a feature modulation network according to one embodiment of the present invention;
FIG. 6 is a schematic diagram of a multi-scale snowflake point deconvolution module according to one embodiment of the present invention;
FIG. 7 is a comparison graph of point cloud completion effects according to one embodiment of the invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
The invention provides a general two-stage point cloud completion framework: the first stage performs robust point cloud generation, and the second stage performs semantic-guided point cloud refinement. To improve the robustness of the generated point cloud, a compact "incomplete-incomplete" data enhancement module is used in the first stage, which further crops the original incomplete point cloud into a new incomplete input and treats the original incomplete point cloud as its completion target. In this way, the diversity of incomplete point clouds is increased, improving the generalization ability of the point cloud generation network. In the second stage, a new conditional guidance network is adopted; it effectively uses the semantic representation as dynamic guidance and exploits discriminative category information to improve the accuracy of the point cloud. Moreover, the conditional guidance network contains a lightweight conditional modulation module that fuses the underlying shape attributes (semantic information and shape information) into the point-wise local representations, instead of directly concatenating the point cloud's global feature, so the local distribution of the point cloud can be improved through semantic guidance.
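For orientation, the two-stage inference flow can be summarized in the following minimal PyTorch-style sketch. This is an illustrative outline only, not the actual implementation: the module names and the assumption that the generation network also exposes a global shape feature and class logits for conditioning are hypothetical.

```python
import torch
import torch.nn as nn

class TwoStageCompletion(nn.Module):
    """Illustrative outline of the two-stage framework: stage 1 robustly
    generates a complete cloud, stage 2 refines it under semantic
    (category + global shape) guidance."""

    def __init__(self, generation_net: nn.Module, refine_net: nn.Module):
        super().__init__()
        self.generation_net = generation_net  # stage 1: robust generation (VRCNet-style)
        self.refine_net = refine_net          # stage 2: conditional refinement

    def forward(self, partial: torch.Tensor) -> torch.Tensor:
        # partial: (B, N_in, 3) incomplete input point cloud
        # assumed interface: the generator also returns a global shape
        # feature and class logits usable as the refinement condition
        coarse, global_feat, class_logits = self.generation_net(partial)
        cond = torch.cat([global_feat, class_logits.softmax(dim=-1)], dim=-1)
        return self.refine_net(coarse, cond)  # refined complete point cloud
```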
Specifically, with reference to fig. 1 and 2, the provided point cloud completion method includes the following steps.
Step S110: construct an overall network architecture consisting of a generation network and a conditional refinement network.
Referring to fig. 2, the overall network architecture includes a generation network and a refinement network. In this embodiment, the generation network is a VRCNet (variational relational point completion network) equipped with "incomplete-incomplete" data enhancement, and the conditional refinement network is an improved SnowflakeNet (snowflake point deconvolution network); the improvements are described below.
In the description herein, "incomplete-incomplete" data refers to the enhanced or expanded data obtained by applying further defect processing to the original incomplete point cloud; through this data expansion, the diversity of incomplete point clouds can be increased and the generality of the generation network improved. The conditional refinement network performs finer refinement with the help of semantic category information and shape information, further improving the quality of the generated point cloud.
In summary, the generation stage of the overall network architecture generates complete point clouds for diverse incomplete point cloud structures, improving the robustness of point cloud completion. The refinement stage then refines the complete point cloud using the category label and the discriminative basic attributes of the global representation.
Step S120: train the generation network with "incomplete-incomplete" data enhancement to robustly generate complete point clouds.
Taking a generation network built on VRCNet as an example, this network consists of two successive encoder-decoder sub-networks, used respectively for "probabilistic modeling" (PMNet) and "relational enhancement" (RENet). PMNet embeds the global shape representation and latent distribution from the partial input and generates a coarse skeleton. RENet then enhances structural relations by learning multi-scale local point features and reconstructs a fine complete point cloud on the coarse skeleton. It should be understood that more RENets may be stacked to further enhance the structural relations; the number and specific structure of the sub-networks are not limited by the present invention.
To improve the robustness of point cloud generation, the newly added data in the training data set of the generation network are obtained with the "incomplete-incomplete" data expansion method. As shown in fig. 3, the original incomplete point cloud may be further degraded (e.g., randomly cropped) to obtain a doubly incomplete point cloud, which is then fed to the model to reconstruct the original incomplete point cloud. This enhancement increases the diversity of PMNet's global features and latent distributions and makes RENet more robust to variations in incomplete structures.
In one embodiment, the enhanced training data set includes correspondences between original incomplete point clouds and fine standard complete point clouds (the base data in Table 1 below), and correspondences between enhanced incomplete point clouds and fine standard incomplete point clouds (the enhancement data in Table 1 below). In this embodiment, an enhanced incomplete point cloud may be obtained by applying secondary defect processing (e.g., random cropping or random excavation) to the original incomplete point cloud.
TABLE 1 Enhanced training data set

Data type          Network input                      Ground truth
Base data          original incomplete point cloud    fine standard complete point cloud
Enhancement data   enhanced incomplete point cloud    fine standard incomplete point cloud
It should be understood that this way of constructing the training data set not only improves the efficiency of subsequent model training and application, but also enhances the robustness of the model. On the one hand, fine standard incomplete point clouds are introduced; compared with fine standard complete point clouds, they have richer shape structures, which benefits robustness. On the other hand, the training data pairs preferably do not contain correspondences between doubly incomplete point clouds and fine standard complete point clouds, avoiding an excessive gap between the input and the output that would complicate model training or reduce efficiency in application.
It should be noted that this data enhancement applies secondary defect processing to the existing partial point cloud data; it obtains diverse incomplete forms without collecting many additional original incomplete point clouds. Random cropping yields various defect types and defect ratios, corresponding to the various conditions under which point cloud data are collected in practice (occlusion, noise, differing viewpoints, and the like), so the generation network becomes more robust to different incomplete forms and the generalization of the network increases. Traditional data enhancement such as flipping and translation is effective only for two-dimensional images and is of little use for three-dimensional point cloud representations.
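For illustration, the secondary defect processing could be realized as a random local crop whose defect ratio is sampled up to a maximum threshold (cf. Table 6 below). The following is a minimal NumPy sketch under assumed conventions (points as an (N, 3) array; the seed-point nearest-neighbour crop is one plausible choice, not necessarily the one used in the experiments):

```python
import numpy as np

def incomplete_incomplete_augment(partial, max_ratio=0.5, rng=None):
    """Secondary defect processing: crop a random local region out of an
    already-incomplete cloud. The cropped cloud becomes the network input
    and the original partial cloud becomes the reconstruction target.

    partial:   (N, 3) original incomplete point cloud
    max_ratio: maximum defect ratio (fraction of points removed)
    """
    rng = rng or np.random.default_rng()
    n = partial.shape[0]
    n_drop = int(n * rng.uniform(0.0, max_ratio))  # sampled defect ratio
    if n_drop == 0:
        return partial.copy(), partial
    # remove the neighbourhood of a random seed point, emulating a
    # view-dependent missing region (occlusion-like defect)
    seed = partial[rng.integers(n)]
    dist = np.linalg.norm(partial - seed, axis=1)
    keep = np.argsort(dist)[n_drop:]               # drop the n_drop closest points
    return partial[keep], partial                  # (input, ground truth)
```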
Furthermore, self-supervised reconstruction pre-training of the generation network provides a good initialization for the downstream fine-tuned completion task. Compared with training from scratch, self-supervision allows a broader and easier optimization.
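Such pre-training could take the following shape, again as a hedged sketch: the pair loader is assumed to yield (input, target) batches produced on the fly by the "incomplete-incomplete" augmentation above, `loss_fn` stands for a chamfer-style reconstruction loss, and the optimizer settings are illustrative.

```python
import torch

def pretrain_self_supervised(gen_net, pair_loader, loss_fn, epochs=50, lr=1e-4):
    """Self-supervised reconstruction pre-training: no complete ground
    truth is needed, since each partial cloud supervises the
    reconstruction of its further-cropped version."""
    opt = torch.optim.Adam(gen_net.parameters(), lr=lr)
    gen_net.train()
    for _ in range(epochs):
        for inp, target in pair_loader:   # (B, N', 3) inputs, (B, N, 3) targets
            pred = gen_net(inp)           # reconstructed partial cloud
            loss = loss_fn(pred, target)  # e.g. chamfer distance
            opt.zero_grad()
            loss.backward()
            opt.step()
```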
Step S130: taking the predicted complete point cloud output by the generation network as input, perform discriminative point cloud refinement using the semantic-guidance-based conditional refinement network.
To distinguish it from the generation network, the point cloud generation model used in the second stage is denoted the conditional refinement network; its purpose is to refine a complete point cloud with richer geometric details and semantic information. Fig. 4 shows the structure of the conditional refinement network, in which a feature modulation module (or feature modulation network) efficiently adjusts the point representations through semantic guidance, and multi-scale SPD (snowflake point deconvolution) modules refine the point cloud through multi-scale context aggregation to reveal more geometric structure. The feature modulation module and the multi-scale SPD module are described in detail below.
The shape attributes of an object (global shape information and semantic category information) can encourage the incomplete representation to move closer to the overall discriminative representation of the same object and can serve as guidance for point cloud refinement. Existing methods merge global information only by concatenating it with the local representation, but concatenation is not efficient enough and significantly increases the weights of the MLP (multi-layer perceptron) (see model F in Table 3). Existing methods also ignore the important category information that carries discriminative semantics. To this end, in one embodiment, a lightweight conditional modulation module (i.e., the feature modulation module) for point cloud refinement is proposed. Besides adjusting the global point cloud representation, the module also helps enhance the local representations so as to refine the point cloud.
As shown in fig. 5, to let the network handle operations that require semantic category information (taking the 16 categories of the MVP dataset as an example) and global shape information, this embodiment uses the feature modulation module to adjust the intermediate displacement features of the conditional refinement network. Specifically, the condition vector ω influences the cluster centers of the local feature representation, and the condition vector β fine-tunes the variance in the feature space. Global adjustment of point features is thus achieved with very few parameters. The local features of an object are expected to be closer to that object than to the features of other objects; therefore, the local representation of each object is influenced by its own semantic information and global shape information, and the proposed feature modulation module is not easily confused by similar local structures under different semantic information.
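In spirit this is a feature-wise affine conditioning; a minimal PyTorch sketch follows (the layer shapes and the construction of the condition vector from category information plus the global shape code are assumptions for illustration, not the actual implementation):

```python
import torch
import torch.nn as nn

class FeatureModulation(nn.Module):
    """Lightweight conditional modulation of per-point features:
    omega scales the features (shifting cluster centres), beta shifts
    them (fine-tuning the variance structure in feature space)."""

    def __init__(self, cond_dim: int, feat_dim: int):
        super().__init__()
        self.to_omega = nn.Linear(cond_dim, feat_dim)
        self.to_beta = nn.Linear(cond_dim, feat_dim)

    def forward(self, feats: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, N) point-wise features; cond: (B, cond_dim),
        # e.g. one-hot category info concatenated with the global shape code
        omega = self.to_omega(cond).unsqueeze(-1)  # (B, C, 1)
        beta = self.to_beta(cond).unsqueeze(-1)    # (B, C, 1)
        return feats * omega + beta                # broadcast over all N points
```

Unlike concatenating the global feature to every point, which enlarges the first MLP layer's weight matrix in proportion to the global feature size, this adds only two small linear heads, which is consistent with the module being described as lightweight.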
To reveal fine local geometric details on the complete shape, existing methods typically adopt a folding-based strategy, learning different displacements for repeated points. However, the folding-based strategy ignores the local shape features contained in the original points, because the sampled two-dimensional grids are identical. Unlike the folding-based strategy, SnowflakeNet uses SPD to model the generation of child points from parent points as a snowflake-like growth process, in which the shape features embedded in the parent point features are extracted and inherited by the child points through a point-wise splitting operation. It also introduces a skip transformer to learn the splitting pattern in the SPD module, which captures the shape context and the spatial relationship between child points and parent points.
The conditional refinement network aims to perfect the local geometric details of the complete point cloud, and the invention improves SnowflakeNet for this purpose. A structure similar to SPD is used, except that the input here is the predicted complete point cloud of N = 2048 points from the generation network, and instead of using the point-wise splitting operation to increase the number of points, the invention only computes the coordinate change of each point in the multi-scale SPD module, as shown in fig. 6. To further refine the local geometric details, three multi-scale SPDs are used in the conditional refinement network, as shown in fig. 4. To help the consecutive multi-scale SPDs refine the points coherently, the spatial context is preferably learned and refined across layers using multiple skip transformers. In addition, to improve robustness to hierarchical local structural changes, skip transformers with different local regions are employed within each multi-scale SPD module.
Referring to FIG. 6, in the i-th multi-scale SPD module, the per-point features $Q_{i-1}$ are first extracted from the refined point cloud $P_{i-1} \in \mathbb{R}^{N \times 3}$ of the previous layer. The displacement feature $K_{i-1}$ from the preceding feature modulation module and $Q_{i-1}$ are then sent to two skip transformers with different local regions for local feature learning. The resulting multi-scale local features are fed to an MLP to obtain the displacement feature $K_i$ of the current layer. In one embodiment, $K_i$ is used to generate the offset of the point coordinates:

$$\Delta P_i = \tanh(\mathrm{MLP}(K_i))$$

where tanh is the hyperbolic tangent activation and MLP is a multilayer perceptron. Finally, the refined point cloud is updated as:

$$P_i = P_{i-1} + \Delta P_i$$
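A condensed sketch of one such refinement step is given below; the skip-transformer sub-modules are abstracted as callables returning per-point features, and the hidden sizes are assumptions:

```python
import torch
import torch.nn as nn

class MultiScaleSPDStep(nn.Module):
    """One multi-scale SPD refinement step: P_i = P_{i-1} + tanh(MLP(K_i)).
    Two skip transformers with different local region sizes provide
    multi-scale local features; no point-splitting is performed, so the
    number of points stays fixed and only coordinates are refined."""

    def __init__(self, feat_dim: int, skip_t_small: nn.Module, skip_t_large: nn.Module):
        super().__init__()
        self.skip_t_small = skip_t_small  # skip transformer, small local region
        self.skip_t_large = skip_t_large  # skip transformer, large local region
        self.to_k = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim))
        self.to_offset = nn.Linear(feat_dim, 3)

    def forward(self, points, q_prev, k_prev):
        # points: (B, N, 3) previous-layer cloud P_{i-1}
        # q_prev: (B, N, C) per-point features; k_prev: (B, N, C) modulated
        # displacement features from the previous layer
        h1 = self.skip_t_small(points, q_prev, k_prev)
        h2 = self.skip_t_large(points, q_prev, k_prev)
        k_i = self.to_k(torch.cat([h1, h2], dim=-1))  # current displacement feature
        delta = torch.tanh(self.to_offset(k_i))       # bounded coordinate offset
        return points + delta, k_i                    # P_i and K_i
```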
it is to be understood that those skilled in the art can appropriately change or modify the above-described embodiments without departing from the spirit and scope of the present invention. For example, the feature modulation network may employ a more excellent feature extractor. As another example, the generation network or the condition refining network may be replaced with other types of point cloud completion networks. Or other types of non-linear activation functions, etc., may be employed in addition to Relu, tanh activation functions.
Correspondingly, the invention also provides a point cloud completion device for implementing one or more aspects of the above method. For example, the device includes: a data acquisition module for acquiring original point cloud data of a target to be completed; a first point cloud generation module for inputting the original point cloud data into a first generation network to obtain first complete point cloud data, wherein the first generation network is trained with an enhanced training data set; and a second point cloud generation module for inputting the first complete point cloud data into a second generation network to obtain second complete point cloud data as the completion result, wherein the second generation network generates the second complete point cloud data under semantic representation guidance. The modules of the device can be realized by a processor, dedicated hardware, an FPGA, or the like.
It should be noted that the training of the deep learning models involved in the present invention, such as the generation network, can be performed offline on a server or in the cloud, and real-time point cloud completion can be achieved by embedding the trained models in an electronic device. The electronic device can be a terminal device or a server; the terminal device includes any terminal such as a mobile phone, a tablet computer, a personal digital assistant (PDA), a point-of-sale (POS) terminal, a vehicle-mounted computer, or a smart wearable device (smart watch, virtual reality glasses, virtual reality helmet, and the like). The server includes, but is not limited to, an application server or a Web server, and may be a stand-alone server, a cluster server, or a cloud server. In actual deployment, the terminal device can acquire the original point cloud data of the target directly from a point cloud acquisition device. For example, after the point cloud acquisition device scans a specific object or scene to obtain the target's original point cloud data, it may send the data to the terminal device through the network. As another example, the point cloud acquisition device may send the data in response to a request from the terminal device. Alternatively, the terminal device may obtain the target's original point cloud data from a database dedicated to storing original point cloud data. The way the terminal device acquires the target's original point cloud data is not limited.
To further verify the effectiveness of the invention, ablation studies and qualitative visualization experiments were performed; see Tables 2 to 4 and FIG. 7 below. The experiments show that the invention achieves an average chamfer distance of 5.01 on the official public test set. In addition, on the 16384-point MVP raw data set, the method of the invention achieves an average chamfer distance of 2.51, which demonstrates its effectiveness and robustness.
TABLE 2 Point cloud completion comparison of the present invention and existing models
Table 2 is a comparison of the point cloud completion results on the MVP dataset (16384 points), where the average chamfer distance is the result multiplied by 10000.
TABLE 3 Point cloud completion comparison of the present invention and existing models
Table 3 is a comparison of the point cloud completion results on MVP data sets (2048 points), where the average chamfer distance is the result multiplied by 10000.
In Tables 2 and 3, the method of the invention is compared with other methods on the MVP raw data set. From the evaluated average chamfer distance and F1 score, it can be seen that the method of the invention outperforms the existing methods on both metrics.
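For reference, the average chamfer distance used in these tables (reported multiplied by 10000) can be computed as in the sketch below; the symmetric squared-distance form is a common convention and an assumption here, not a statement of the exact evaluation code:

```python
import torch

def chamfer_distance(pred: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    """Symmetric chamfer distance between completed and ground-truth clouds.

    pred: (B, N, 3); gt: (B, M, 3). Multiply the result by 1e4 to match
    the scale reported in Tables 2 and 3."""
    d = torch.cdist(pred, gt, p=2) ** 2    # (B, N, M) pairwise squared distances
    cd = (d.min(dim=2).values.mean(dim=1)  # pred -> gt direction
          + d.min(dim=1).values.mean(dim=1))  # gt -> pred direction
    return cd.mean()
```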
TABLE 4 Ablation study of the invention on the MVP completion task
The experiments in Table 4 verify that the proposed data expansion indeed improves the performance of the completion network. In addition, the feature modulation module, with only 43k parameters, reduces the average chamfer distance from 5.41 to 5.32. Besides the effectiveness of the feature modulation module in model G, the parameters of the refinement network in model G (2.27M) are fewer than those of model F (2.33M). This demonstrates that the invention achieves a better balance of accuracy and efficiency.
In addition, to verify the effect of the enhanced training data set, the basic data (i.e., the prior art) is compared with three data enhancement modes; see Tables 5 and 6 below.
TABLE 5 Comparison of data enhancement forms
TABLE 6 "disability-disability" data enhancement Defect ratio maximum threshold comparison
Ratio of incomplete (maximum threshold setting) 0.1 0.3 0.5 0.7 0.9
Distance of chamfer 6.03 5.92 5.82 5.88 5.91
As can be seen from Table 5, the embodiment of the invention achieves the best chamfer distance (5.82) by adding the doubly incomplete point cloud and fine standard incomplete point cloud data pairs, obtaining a better point cloud completion effect than the other two data enhancement modes. Table 6 examines the data enhancement of the invention under different maximum defect-ratio thresholds and demonstrates that model training can be further optimized by setting the defect ratio.
In summary, the present invention has at least the following advantages over the prior art:
1) To reveal fine local geometric details on the complete shape using discriminative semantic information, the invention provides a two-stage point cloud completion framework: the first stage (the generation network) uses "incomplete-incomplete" enhancement to improve the robustness of complete point cloud generation, and the second stage (the refinement network) effectively uses the semantic representation as dynamic guidance and uses category information to promote point cloud refinement. In addition, the refinement network of the invention also makes the distribution of the completed missing parts more uniform (as shown in fig. 7). The experiments show that the invention achieves state-of-the-art performance on the MVP dataset, and the completed point clouds have better quality.
2) A simple "incomplete-incomplete" data enhancement module for point cloud completion is provided, which increases the diversity of incomplete structures and improves the generality and robustness of the generation network. The ablation studies in Table 4 demonstrate that this data enhancement improves the performance of the generation network.
3) A new conditional modulation module is provided, which effectively uses the semantic representation as dynamic guidance and uses discriminative category information to promote point cloud refinement, encouraging the local representation of an object to be closer to that object than to the features of other objects.
4) The inference time of the invention is practical. For a test point cloud, the mean inference time of the generation stage is 0.192 s, while that of the refinement stage is only 0.008 s.
It should be noted that the point cloud completion method provided by the invention can be applied to various point cloud processing scenarios, for example, virtual objects or virtual scenes in games, traffic environment modeling, medical image modeling, product design, and the like. The invention does not impose any limitation on the applicable application scenarios.
The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, Python, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims (17)

1. A point cloud completion method, comprising the following steps:
acquiring original point cloud data of a target to be completed;
inputting the original point cloud data into a first generation network to obtain first complete point cloud data, wherein the first generation network is trained with an enhanced training data set;
inputting the first complete point cloud data into a second generation network to obtain second complete point cloud data as the completion result, wherein the second generation network generates the second complete point cloud data under semantic representation guidance.
2. The method of claim 1, wherein the enhanced training data set characterizes correspondences between an original incomplete point cloud and a fine standard complete point cloud, and between an enhanced incomplete point cloud and a fine standard incomplete point cloud; the enhanced incomplete point cloud is obtained by performing secondary defect processing on the original incomplete point cloud.
3. The method of claim 2, wherein the enhanced incomplete point cloud is obtained by randomly excavating the original incomplete point cloud, and the fine standard incomplete point cloud is obtained from the original incomplete point cloud.
4. The method of claim 1, wherein the first generation network is constructed based on a point cloud completion network and comprises two consecutive sub-networks, wherein the first sub-network is used to generate a coarse skeleton and the second sub-network reconstructs the first complete point cloud data on the coarse skeleton.
5. The method of claim 1, wherein the second generation network is a conditional refinement network comprising a plurality of multi-scale snowflake point deconvolution modules and a feature modulation network; the multi-scale snowflake point deconvolution modules are used to learn intermediate features of the input point cloud and refine the point cloud structure through multi-scale context aggregation.
6. The method according to claim 5, wherein, for each multi-scale snowflake point deconvolution module, the displacement features output by the module are adjusted under semantic guidance using the feature modulation network.
7. The method of claim 5, wherein, for consecutive multi-scale snowflake point deconvolution modules, the spatial context is learned and refined across different layers using multi-scale skip transformers.
8. The method according to claim 7, wherein, for the i-th multi-scale snowflake point deconvolution module, the following steps are performed:
for the output point cloud of the previous layer $P_{i-1} \in \mathbb{R}^{N \times 3}$, extracting the per-point features $Q_{i-1}$;
sending the displacement feature $K_{i-1}$ and $Q_{i-1}$ to two skip transformers with different local regions to obtain multi-scale local features;
feeding the multi-scale local features to a multilayer perceptron to obtain the displacement feature $K_i$ of the current layer;
using $K_i$ to generate the offset of the point coordinates $\Delta P_i = \tanh(\mathrm{MLP}(K_i))$, and updating the point cloud of the current layer to $P_i = P_{i-1} + \Delta P_i$.
9. The method according to claim 5, wherein the feature modulation network obtains a condition vector ω and a condition vector β based on the category information of the target and the global shape feature learned by the first generation network, wherein the condition vector ω is related to the category information of the target and is used to adjust the cluster centers of the local feature representation, and the condition vector β is related to the category information of the target and the global shape feature learned by the first generation network and is used to adjust the variance in the feature space.
10. A point cloud completion apparatus, comprising:
a data acquisition module, configured to acquire original point cloud data of a target to be completed;
a first point cloud generation module, configured to input the original point cloud data into a first generation network to obtain first complete point cloud data, wherein the first generation network is trained with an enhanced training data set;
a second point cloud generation module, configured to input the first complete point cloud data into a second generation network to obtain second complete point cloud data as the completion result, wherein the second generation network generates the second complete point cloud data under semantic representation guidance.
11. The apparatus of claim 10, wherein the second generation network is a conditional refinement network comprising a plurality of multi-scale snowflake point deconvolution modules and a feature modulation network; the multi-scale snowflake point deconvolution modules are used to learn intermediate features of the input point cloud and refine the point cloud structure through multi-scale context aggregation.
12. The apparatus of claim 11, wherein, for each multi-scale snowflake point deconvolution module, the displacement features output by the module are adjusted under semantic guidance using the feature modulation network.
13. The apparatus of claim 10, wherein, for consecutive multi-scale snowflake point deconvolution modules, the spatial context is learned and refined across different layers using multi-scale skip transformers.
14. The apparatus of claim 13, wherein, for the i-th multi-scale snowflake point deconvolution module, the following steps are performed:
for the output point cloud of the previous layer $P_{i-1} \in \mathbb{R}^{N \times 3}$, extracting the per-point features $Q_{i-1}$;
sending the displacement feature $K_{i-1}$ and $Q_{i-1}$ to two skip transformers with different local regions to obtain multi-scale local features;
feeding the multi-scale local features to a multilayer perceptron to obtain the displacement feature $K_i$ of the current layer;
using $K_i$ to generate the offset of the point coordinates $\Delta P_i = \tanh(\mathrm{MLP}(K_i))$, and updating the point cloud of the current layer to $P_i = P_{i-1} + \Delta P_i$.
15. The apparatus according to claim 11, wherein the feature modulation network obtains a condition vector ω and a condition vector β based on the category information of the target and the global shape feature learned by the first generation network, wherein the condition vector ω is related to the category information of the target and is used to adjust the cluster centers of the local feature representation, and the condition vector β is related to the category information of the target and the global shape feature learned by the first generation network and is used to adjust the variance in the feature space.
16. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 9.
17. An electronic device comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the program, implements the steps of the method according to any one of claims 1 to 9.
CN202111349742.1A 2021-11-15 2021-11-15 Point cloud completion method and device Pending CN113902061A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111349742.1A CN113902061A (en) 2021-11-15 2021-11-15 Point cloud completion method and device
PCT/CN2021/138550 WO2023082415A1 (en) 2021-11-15 2021-12-15 Point cloud completion method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111349742.1A CN113902061A (en) 2021-11-15 2021-11-15 Point cloud completion method and device

Publications (1)

Publication Number Publication Date
CN113902061A true CN113902061A (en) 2022-01-07

Family

ID=79194373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111349742.1A Pending CN113902061A (en) 2021-11-15 2021-11-15 Point cloud completion method and device

Country Status (2)

Country Link
CN (1) CN113902061A (en)
WO (1) WO2023082415A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115496881A (en) * 2022-10-19 2022-12-20 南京航空航天大学深圳研究院 Monocular image assisted large airplane point cloud completion method
CN116402953A (en) * 2023-04-26 2023-07-07 华中科技大学 Wave surface reconstruction method and device based on binocular data on floating platform

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116630721B (en) * 2023-06-14 2024-02-13 电子科技大学中山学院 Image classification method, device, equipment and medium based on local feature completion
CN116721399B (en) * 2023-07-26 2023-11-14 之江实验室 Point cloud target detection method and device for quantitative perception training
CN117408999B (en) * 2023-12-13 2024-02-20 安格利(成都)仪器设备有限公司 Method for automatically detecting corrosion pits of containers and pipelines by utilizing point cloud complement

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112184589B (en) * 2020-09-30 2021-10-08 清华大学 Point cloud intensity completion method and system based on semantic segmentation
CN112529015A (en) * 2020-12-17 2021-03-19 深圳先进技术研究院 Three-dimensional point cloud processing method, device and equipment based on geometric unwrapping
CN113052955B (en) * 2021-03-19 2023-06-30 西安电子科技大学 Point cloud completion method, system and application
CN113240594B (en) * 2021-04-27 2022-06-24 四川大学 High-quality point cloud completion method based on multiple hierarchies

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115496881A (en) * 2022-10-19 2022-12-20 南京航空航天大学深圳研究院 Monocular image assisted large airplane point cloud completion method
CN115496881B (en) * 2022-10-19 2023-09-22 南京航空航天大学深圳研究院 Monocular image-assisted point cloud complement method for large aircraft
CN116402953A (en) * 2023-04-26 2023-07-07 华中科技大学 Wave surface reconstruction method and device based on binocular data on floating platform
CN116402953B (en) * 2023-04-26 2024-04-19 华中科技大学 Wave surface reconstruction method and device based on binocular data on floating platform

Also Published As

Publication number Publication date
WO2023082415A1 (en) 2023-05-19

Similar Documents

Publication Publication Date Title
CN113902061A (en) Point cloud completion method and device
CN110379020B (en) Laser point cloud coloring method and device based on generation countermeasure network
US20130127824A1 (en) Object Selection in Stereo Image Pairs
CN111739005B (en) Image detection method, device, electronic equipment and storage medium
WO2017142766A1 (en) Determining depth from structured light using trained classifiers
WO2020211573A1 (en) Method and device for processing image
CN112487979B (en) Target detection method, model training method, device, electronic equipment and medium
US10217224B2 (en) Method and system for sharing-oriented personalized route planning via a customizable multimedia approach
CN115797571A (en) New visual angle synthesis method of 3D stylized scene
Yao et al. As‐global‐as‐possible stereo matching with adaptive smoothness prior
CN114792355B (en) Virtual image generation method and device, electronic equipment and storage medium
CN114627239B (en) Bounding box generation method, device, equipment and storage medium
CN112528811A (en) Behavior recognition method and device
CN116977531A (en) Three-dimensional texture image generation method, three-dimensional texture image generation device, computer equipment and storage medium
Vázquez‐Delgado et al. Real‐time multi‐window stereo matching algorithm with fuzzy logic
Lu et al. Pyramid frequency network with spatial attention residual refinement module for monocular depth estimation
CN114998433A (en) Pose calculation method and device, storage medium and electronic equipment
CN108986210B (en) Method and device for reconstructing three-dimensional scene
CN117745944A (en) Pre-training model determining method, device, equipment and storage medium
KR20210130632A (en) Method and apparatus for tracking target
KR102572415B1 (en) Method and apparatus for creating a natural three-dimensional digital twin through verification of a reference image
US20230196673A1 (en) Method and system for providing automated 3d modeling for xr online platform
Lee et al. Enhancement of three-dimensional image visualization under photon-starved conditions
US20230145498A1 (en) Image reprojection and multi-image inpainting based on geometric depth parameters
CN115546515A (en) Depth information acquisition method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination