CN112529015A

CN112529015A - Three-dimensional point cloud processing method, device and equipment based on geometric unwrapping

Info

Publication number: CN112529015A
Application number: CN202011494730.3A
Authority: CN
Inventors: 乔宇; 许牧天; 张钧皓; 周志鹏; 徐名业
Original assignee: Shenzhen Institute of Advanced Technology of CAS
Current assignee: Shenzhen Institute of Advanced Technology of CAS
Priority date: 2020-12-17
Filing date: 2020-12-17
Publication date: 2021-03-19

Abstract

The invention discloses a three-dimensional point cloud processing method, a three-dimensional point cloud processing device and three-dimensional point cloud processing equipment based on geometric unwrapping. The method comprises the following steps: acquiring point cloud data of an object and processing the point cloud data; geometrically unwrapping the point cloud data to divide the point cloud data into points with large geometrical change and points with small geometrical change according to the geometrical change degree; and inputting the point cloud data into a trained convolutional neural network for feature extraction and classification segmentation, wherein the neural network model is used for extracting local features and global features of the point cloud data, and learning mutual complementary information between the points with large geometric change and the points with small geometric change by using a geometric attention module. The invention can improve the accuracy of the point cloud data classification task and the cross merging rate of the segmentation task.

Description

Three-dimensional point cloud processing method, device and equipment based on geometric unwrapping

Technical Field

The invention relates to the technical field of three-dimensional data processing, in particular to a method, a device and equipment for processing three-dimensional point cloud based on geometric unwrapping.

Background

The Convolutional Neural Network (CNN) is a feedforward neural network which comprises convolutional calculation and has a deep structure, and consists of one or more convolutional layers (corresponding to filters for traditional image processing), full-link layers, pooling layers and the like, and has the capacity of representing learning. The artificial neurons of the method can respond to surrounding units in a part of coverage range, the network at the lower layer can extract low-level image features such as edges, lines and angle and other layers, and the network at the higher layer can extract more complex features from the low-level features in an iteration mode. The parameters of the convolutional neural network can be solved by a back propagation optimization algorithm. The convolutional neural network has excellent performance on large-scale image processing related tasks due to the fact that the convolutional neural network uses few parameters.

Point clouds (Point Cloud) are Point data sets of the surface of the product appearance obtained by measuring instruments in reverse engineering, which may have color information in addition to geometric positions. The color information is typically obtained by capturing a color image with a camera and then assigning color information (RGB) of pixels at corresponding locations to corresponding points in the point cloud. The intensity information is obtained by the echo intensity collected by the receiving device of the laser scanner, and the intensity information is related to the surface material, roughness and incident angle direction of the target, and the emission energy and laser wavelength of the instrument.

However, when processing point cloud data using a convolutional neural network, it is necessary to convert the point cloud data into other data formats, such as projecting a three-dimensional point cloud onto a two-dimensional image as a convolutional input neural network, or converting the point cloud into a voxel representation, and then extracting features by a three-dimensional convolutional neural network. Due to the need of data conversion of the point cloud, a large amount of memory consumption is caused, more computer resources are occupied, and spatial geometry information is easily lost.

Through analysis, the main problems of the existing scheme for processing three-dimensional point cloud data are as follows:

1) the three-dimensional point cloud data is different from the two-dimensional picture, the three-dimensional point cloud data is of a random structure, the non-standardized three-dimensional point cloud is projected into a two-dimensional image by the multi-view projection technology, and then the two-dimensional image is processed. But the projection process itself will cause some data loss due to occlusion and the data conversion process requires a lot of computation.

2) The voxel transformation method converts non-standardized point cloud data into spatial voxel data, and the process can reduce the problem of data loss. However, the transformed voxel data is highly redundant and has a large amount of data.

3) The one-dimensional convolutional neural network can directly operate and process non-standardized point cloud data, and the basic idea is to learn the spatial coding of each point and then aggregate all single point features into an integral representation. But this design does not fully capture the relationship between points.

4) An enhanced version of the point cloud convolution may divide the point cloud into overlapping local regions according to a distance measurement of the base space and use a two-dimensional convolution to extract the local feature neighborhood structure that captures the fine geometry. However, it still only considers local areas of each point and cannot correlate similar local features on the point cloud.

5) Dynamic graph convolution can capture the local geometry of the point cloud while maintaining the permutation invariance and use the features generated for each layer. However, in this way, all points or local point clouds are simultaneously intertwined and processed, so that there is a large amount of redundant information, making it impossible for the neural network to capture the most important geometric information.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a method, a device and equipment for processing three-dimensional point cloud based on geometric unwrapping.

According to a first aspect of the invention, a method for processing a three-dimensional point cloud based on geometric unwrapping is provided. The method comprises the following steps:

acquiring point cloud data of an object and processing the point cloud data;

geometrically unwrapping the point cloud data to divide the point cloud data into points with large geometrical change and points with small geometrical change according to the geometrical change degree;

and inputting the point cloud data into a trained convolutional neural network for feature extraction and classification segmentation, wherein the neural network model is used for extracting local features and global features of the point cloud data, and learning mutual complementary information between the points with large geometric change and the points with small geometric change by using a geometric attention module.

According to a second aspect of the invention, a three-dimensional point cloud processing device based on geometric unwrapping is provided. The device includes:

a point cloud data acquisition unit for collecting point cloud data of an object;

the point cloud data processing unit is used for processing the acquired point cloud data;

the system comprises an input unit, a pre-trained convolutional neural network and a data processing unit, wherein the input unit is used for inputting point cloud data into the pre-trained convolutional neural network, and the convolutional neural network comprises a point cloud local characteristic information unwrapping module;

the characteristic extraction unit is used for extracting local characteristics and global characteristics of the point cloud data according to the calculation result of the convolutional neural network;

and the task output unit is used for processing the extracted characteristics of the point cloud data through the multilayer perceptron and/or the normalized index function to obtain the category prediction probability corresponding to the point cloud data.

Compared with the prior art, the method has the advantages that the point cloud data is directly processed by setting the point cloud convolution neural network shared by the local geometric information, the point cloud data does not need to be converted into other complex data formats, the memory occupation and the computer resource consumption are reduced, and rich characteristic data can be extracted more quickly. In addition, the geometric unwrapping and geometric change attention is utilized to be more beneficial to exploring the geometric characteristics of the integral structure of the point cloud edge outline, so that the precision of classification and segmentation tasks is improved.

Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.

FIG. 1 is a schematic diagram of a local feature extraction and geometric unwrapping module in accordance with one embodiment of the present invention;

FIG. 2 is a schematic illustration of points with large geometric variations and points with small geometric variations in accordance with one embodiment of the present invention;

FIG. 3 is a schematic view of a geometric change attention module in accordance with one embodiment of the present invention;

FIG. 4 is a graphical illustration of geometric change attention and self-attention contrast in accordance with one embodiment of the present invention;

FIG. 5 is a schematic diagram of a network architecture according to one embodiment of the present invention;

FIG. 6 is a flow diagram of a method of processing a three-dimensional point cloud based on geometric unwrapping in accordance with one embodiment of the present invention;

FIG. 7 is a schematic structural diagram of a three-dimensional point cloud processing apparatus based on geometric unwrapping according to one embodiment of the present invention;

FIG. 8 is a schematic diagram of a three-dimensional scene segmentation application according to one embodiment of the present invention;

FIG. 9 is a schematic diagram of a three-dimensional scene reconstruction application according to an embodiment of the present invention;

FIG. 10 is a schematic diagram of a process for segmenting a scene from a three-dimensional point cloud based on geometric unwrapping according to one embodiment of the present invention;

fig. 11 is a schematic diagram of a reconstruction process of a three-dimensional point cloud scene based on geometric unwrapping according to an embodiment of the present invention.

Detailed Description

Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.

The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.

Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.

In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.

The invention provides a three-dimensional point cloud processing method based on geometric unwrapping, which is different from a point cloud network in the prior art in which local geometric information is concerned and the same operation is carried out on each point. Generally, the invention uses the method of image signal processing to perform modeling and depicting on the whole point cloud, and then learns the whole point cloud information through a deep network, so that the extracted point cloud features have global property, complementarity and representativeness, thereby accurately realizing the classification and segmentation of the point cloud.

For the sake of clarity, the following first briefly introduces the local feature extractor, the geometric unwrapping module, the geometric change attention module, and the neural network oriented to the point cloud classification segmentation designed by the present invention.

(1) Local feature extractor

For example, K neighbor points of each point are first found by KNN algorithm and local feature extraction is performed by one-dimensional convolution, as shown in fig. 1(a), the point P1 completes the local feature extraction operation for the point in the circle. It should be understood that other clustering algorithms may be used to perform local feature extraction.

(2) Geometric unwrapping (geometric selection) module

The module firstly provides a method for analyzing point clouds by frequency in combination with image signal processing, and can select points with larger or smaller geometric change. The module can be conveniently embedded into some point cloud deep learning basic frameworks. As shown in the middle diagram of fig. 1(b), the original gray point cloud has much redundant information, which is not beneficial for the learning of the network, and the geometric unwrapping module unwrapps the original point cloud into a point with large geometric change (the right diagram in fig. 1 (b)) and a point with small geometric change (the left diagram in fig. 1 (b)), where the point with small geometric change corresponds to the flat area of the object and the point with large geometric change corresponds to the contour edge information of the object.

In addition, the invention is illustrated by the visualization of some other point cloud models, and as shown in fig. 2, gray points are also selected as the original point cloud, the upper points (i.e., the blocks 1 and 2) are points with small geometric changes, and the lower points are points with large geometric changes, so that the points with small geometric changes do not entangle the flat area of the object, and the points with large geometric changes do entangle the information such as the outline and the edge of the object.

(3) A geometric change attention module.

The module can learn information which is mutually complemented among different points of geometric change by using an attention mechanism, and extract the characteristics of the point cloud in a global integral mode.

The specific steps of the geometric change attention module are shown in fig. 3, each point is respectively subjected to attention collection with all points with large geometric change and points with small geometric change in a characteristic change point multiplication mode, and then the characteristics of each point cloud in the original point cloud, the characteristics of the points with large geometric change and the characteristics of the points with small geometric change are subjected to characteristic fusion through an attention coefficient, so that the fusion of the whole characteristics is completed.

From the visualization, the geometric change attention module can better capture overall information which is more geometrically related, so that the point cloud object can be better classified. Fig. 4 is a graphical illustration of geometric change attention and self-attention contrast.

(4) Convolutional neural network for point cloud classification segmentation

In general, the convolutional neural network structure includes a feature extraction unit and a classification segmentation unit, as shown in fig. 5.

In the feature extraction unit, features of respective layers are learned using a hierarchical structure. For example, the neural network takes N points as input, and the extracted local features acquire rich local geometric features of each point. And then, carrying out geometric unwrapping, and then carrying out feature fusion by using a geometric change attention module to enrich the overall geometric structural features.

The classification and segmentation unit is used for realizing the classification and segmentation tasks of the point cloud, for the classification, the down-sampling features of each level are connected in series, and the overall features are integrated into a one-dimensional global descriptor for generating a classification score; for segmentation, the neural network connects each level, and then the class prediction probability corresponding to each point in the point cloud is calculated.

The point cloud convolution operation of geometric information unwrapping provided by the invention avoids complicated sampling and data structure design. The point cloud is unwrapped into two components with large geometric change (contour region) or small geometric change (flat region), and the relationship between the original point cloud and the two components is fully mined, so that the network can learn the complementary features of the point cloud geometric structure and capture the most critical geometric information. In the aspect of network design, a cascade structure is adopted to learn the geometric characteristics of the point cloud data progressively.

The invention is not only suitable for automatic driving equipment, such as obstacle detection, automatic path planning of the automatic driving equipment and the like, but also suitable for household service robots in the following fields, such as object detection systems, grasping systems and the like.

For a further understanding of the invention, reference will now be made to specific embodiments thereof, which are illustrated in the accompanying drawings.

Referring to fig. 6, the method for processing a three-dimensional point cloud based on geometric unwrapping according to the embodiment includes the following steps:

step S601, point cloud data of an object is acquired.

The point cloud data may be collected by a laser device, stereo camera, or transit time camera. The point cloud data of the three-dimensional object can be acquired by a data acquisition method based on automatic point cloud splicing. During the acquisition process, multiple stations may be used to scan and stitch the data of each station to obtain point cloud data. In addition, accurate registration of point clouds at different angles can be achieved by iteratively optimizing coordinate transformation parameters.

Preferably, the acquired point cloud data is further processed, for example, by adding data by rotating and translating the point cloud, or enhancing the point cloud data by oscillating point coordinates in the point cloud around it. In addition, points in the point cloud may also be randomly deleted. For example, a random probability is randomly generated or obtained according to a preset highest random probability, and then the points in the point cloud are deleted according to the random probability. Experimental results show that the data enhancement method can improve the generalization capability of convolutional neural network learning, and further improve the test accuracy of a test set (point cloud data not used in training).

Step S602, point cloud data is input into a pre-trained convolutional neural network, and the network comprises a geometric unwrapping module and a geometric change attention module.

When the parameters of the convolutional neural network are input, the following steps can be further carried out: and manually classifying and screening the collected three-dimensional point cloud data according to the categories to finish preliminary data preparation work. A convolution kernel of the convolutional neural network may be trained using a first portion of the point cloud data (i.e., a training set) of the classified classes to obtain a trained convolutional neural network; the second portion of the point cloud data in the classification category may be used as validation data to evaluate the convolutional neural network. For example, according to the data sorting process, 90% of data of each category of the three-dimensional point cloud is selected as training data for network training, and the rest 10% of data is reserved as experimental verification data for later evaluation on model identification accuracy and generalization capability.

Step S603, extracting local feature information and overall information of the point cloud, and obtaining a result of the classification or segmentation task.

Specifically, a set of point cloud data including N point cloud data having C-dimensional feature points may be represented by a matrix X ═ X₁x₂…x_N]^T＝[s₁s₂…s_C]∈R^N×CIs shown in which x_i∈R^CRepresents the ith point, s_c∈R^CFeatures representing the c-th channel dimension. These features may be three-dimensional coordinates, normal vectors or semantic features. Drawing (A)

Can be constructed by a adjacency matrix a that can encode the similarity in feature space between points,

representing the vertex domain. Each point x_i∈R^CRepresents a vertex, s, on the graph_c∈R^CRepresenting a graph signal. Two arbitrary points (e.g. x)_iAnd x_j) The weight of the edges in between can be expressed as:

where f (-) is a non-negative decreasing function (e.g. Gaussian) that must guarantee the adjacency matrix A ∈ R^N×NIs a diagonal dominance matrix. τ is a threshold value that is adjusted according to the actual situation. Furthermore, to better address the problem of the difference in feature scale between point cloud neighbors, the weights of all edges in the adjacency matrix are preferably normalized to:

wherein

Still a diagonal dominance matrix. A new map can be obtained from a set of point cloud data

And represents a low to high frequency eigenvector

The corresponding graph can be obtained according to the formulas (1) and (2)

General expression of the filter

The filter coefficient selection takes the laplacian operator where L is 2, h₀＝1，h₁Is-1. So the filter can be expressed as

Where I is the identity matrix. Filter with a filter element having a plurality of filter elements

The frequency response of (c) is:

feature vector

Is in descending order, which also represents that the frequency of the graph is in ascending order from low to high along with the descending order of the eigenvalues. Taking into account the frequency characteristic value

Frequency response of the signal in the figure

The low frequency components are significantly suppressed after passing through the filter. The response of the filter in the frequency domain is known as the high pass filter in the frequency domain.

In the vertex domain, all points X are obtained by passing through a filter

Refined to each point, the response of each point through the filter is:

the formula reflects the difference between the characteristics of one point and the characteristics of the linear convex combination near points, so that the formula can reflect the geometrical change degree between the point cloud and the near points.

Calculating l for each point of equation (4)²And (4) norm. By the above high-pass filter,/²A point with a larger norm represents a point whose features change more strongly from its neighboring point, which is consistent with high frequency information in the image. For all the original points X, according to each point l²Norm magnitude descending order rearrangement

Selecting l²M points with maximum norm

For points with large geometric variation, select l²M points with minimum norm

Points with less geometric variation. The method for selecting the points with large or small geometric change is applied to different semantic layers of the neural network and is called as a geometric selection module, the geometric selection module is embedded in different layers, and corresponding points are dynamically selected according to semantic features of the layers. As shown in figure 2 of the drawings, in which,blocks 1 and 2 correspond to blocks 1 and block2 of fig. 5, and it can be seen that the geometry selection module can dynamically capture the object global geometry.

Through the process, the original point cloud is obtained

Point X with large geometric variation_lPoint X with small geometrical change_s. Referring to fig. 3, the three point clouds are first processed by a multi-layer perceptron MLP, and then an attention weight matrix is calculated:

the multi-layer perceptrons in equations (5) and (6) do not share parameters. In this way, two learnable attention weight matrices W can be derived^l∈R^N×MAnd W^s∈R^N×MWherein M is X_lAnd X_sThe number of midpoints. Matrix W^lAnd W^sEach row in (a) corresponds to an attention weight between each original point and all points with large or small geometric changes. Because of the attention weight matrix W^lAnd W^sAre obtained by dot product of feature vector between the points, so they can clearly depict semantic connection and dependence between the points in the original point cloud and the points with large or small geometric change.

For the geometric feature supplementary representation, X_lAnd X_sEncoding with a multi-layer perceptron (MLP) and applying an attention weight matrix to the original point cloud and points with large or small geometric variations as follows:

for each point:

finally, the characteristics are spliced to obtain

In the geometric attention changing module, points with large geometric change and points with small geometric change can be connected with the original point cloud through the learned attention weight coefficients and subjected to feature fusion, the operation can efficiently capture global geometric structural features, and geometric feature supplementary enhancement is performed on the original point cloud.

In step S603, features of the point cloud data are extracted according to the calculation result of the convolutional neural network. The invention can further process the extracted point cloud data after the characteristics of the point cloud data are extracted: after several deconvolution module processes on the geometric feature information, the geometric features of the point cloud can be extracted using a max K pooling operation for subsequent classification, segmentation, or registration.

Specifically, it is assumed that the features obtained by the multilayer convolution module are NxM dimensional vectors, N is a point number, M is a dimension of each point feature, and the maximum K pooling operation is to take the maximum K values from the ith dimensional features of the N points, so as to finally obtain the global feature vector of the KxM dimensional point cloud. The output characteristics of each layer of convolution modules may be combined to perform a maximum pooling operation, and finally through the fully connected layers. The output dimension of the last of the fully connected layers is equal to the number of classes in the classification task. The output of the fully-connected layer may be converted to a probability between 0 and 1 using a normalized index function, which represents the probability that the input point cloud belongs to a certain category. In addition, cross-entropy functions can be used as loss functions, and back-propagation algorithms can be used to train and optimize the model.

For the segmentation task, on the basis of obtaining global features, the global features and the object class information of the point cloud are used as local features of the point cloud, higher-dimensional local cloud features are formed after the point cloud, and segmentation prediction is carried out through the prediction probability of object segmentation parts obtained by a multilayer perceptron and a normalized index function after the local features of the point cloud are extracted.

It should be noted that, in practical applications, the input features may be changed according to different tasks, for example, the input features are replaced or combined by the distance between a point and a neighboring point, color information of the point, a combination of feature vectors, and local shape context information of the point. In addition, a geometric unwrapping module and a geometric change attention module in the convolutional neural network are portable point cloud feature learning modules and can be used as a feature extractor to be applied to other tasks related to point clouds, such as three-dimensional point cloud completion, three-dimensional point cloud detection and the like.

The invention designs a convolutional neural network structure for three-dimensional point cloud classification and segmentation, adjusts network parameters of the neural network, including but not limited to (learning rate, batch size), and adopts different learning strategies to promote the convolutional neural network to converge to the optimal network model optimization direction; and finally, testing the verification data by using the trained network model to realize the classification and segmentation of the point cloud. In addition, the geometric information unwrapping convolution designed by the invention is a module in the neural network, and can directly extract the characteristics with large and small geometric changes from the signals distributed on the point cloud, so that the network can be combined with other modules in the neural network. The number of input and output channels and the combination of output channels can be altered to achieve the best results in different tasks. Different neural network structures can be designed by using the geometric characteristic information sharing module.

Correspondingly, the invention also provides a three-dimensional point cloud processing device based on geometric unwrapping, which is used for realizing one or more aspects of the method. For example, referring to fig. 7, the apparatus 7 includes: a point cloud data acquisition unit 701 configured to collect point cloud data of an object; the point cloud data processing unit 702 is configured to process the acquired point cloud data, and perform operations such as denoising and hole filling on the acquired point cloud data, so as to facilitate subsequent processing and improve performance; the input unit 703 is configured to input the point cloud data into a pre-trained convolutional neural network, where a main module of the convolutional neural network is a point cloud local feature information unwrapping module. A feature extraction unit 704, configured to extract features of the point cloud data according to a calculation result of the convolutional neural network, where the feature extraction unit 704 is configured to extract local features extracted in multiple layers in parallel, and extract global features from the combined local features by a maximum K pool method; the task output unit 705 is configured to process the extracted features of the point cloud data through a multi-layer perceptron and/or a normalized exponential function to obtain a category prediction probability corresponding to the point cloud data.

In order to further verify the effect of the invention, experimental verification is carried out. Experimental results show that the point cloud-oriented feature extraction method can be used for testing classification and segmentation tasks of large-scale point cloud data (ModelNet40 and ShapeNet Part). Compared with the current internationally advanced method, the accuracy of the classification task of the invention is 93.8%, the cross merging rate of the segmentation task is 86.5%, and the method has advanced advantages in performance.

The invention can be applied to various scenes. For example, a scene segmentation task and a three-dimensional scene reconstruction task applied to the fields of unmanned driving and robot vision, as shown in fig. 8 and 9. Fig. 8 illustrates a scene segmentation task of the present invention applied to unmanned vehicles and robot vision. By analyzing and processing the three-dimensional point cloud obtained by scanning, the category and position of the object can be obtained, which is the basis for other tasks in the field. Fig. 9 shows the application of the present invention to a three-dimensional reconstruction task. By processing the three-dimensional point cloud obtained by scanning, global and local geometric features can be obtained, and then the category information of the overall category and the components of the object is obtained. According to the information, the reconstructed scene is finely adjusted, and the hole filling and denoising work of the point cloud is facilitated.

Specifically, referring to fig. 10, the process of the present invention for a scene segmentation task of an unmanned or intelligent robot includes: firstly, acquiring point cloud data of a scene by using a depth camera, and marking object types in the point cloud data of the scene; extracting local features of the point cloud through a convolutional neural network based on geometric sharing, and using the local features for pixel-level classification, which is training for scene segmentation; in actual use, the depth camera is used for collecting point cloud data of an actual scene, then, the trained neural network is used for extracting local features of the point cloud, and then, the scene is segmented; the segmentation results (i.e., the different objects in the scene) are returned to the unmanned vehicle (or smart robot) for data storage and further analysis.

For another example, referring to fig. 11, a specific process of the present invention applied to a three-dimensional scene reconstruction task of an unmanned aerial vehicle includes: firstly, scanning a terrain by using a depth camera installed on an unmanned aerial vehicle, and then acquiring three-dimensional point cloud data through sampling; extracting local and global characteristics of the point cloud through geometric shared convolutional neural network training, and coding the characteristics; normal information of the point cloud is estimated by the network, and the normal information is used for assisting in adding texture information; and acquiring global and component classification information of the object by using the global and local features extracted from the point cloud, and then acquiring the local similarity of the object. And according to the characteristics, fine adjustment is carried out on the fine-adjusted scene so as to guide hole filling and denoising of the point cloud.

In summary, the present invention solves at least the following technical problems:

1) point cloud data processing problem

The three-dimensional point cloud data is non-standardized data and many techniques can directly process normalized data, such as images, speech, and other data. The main ideas of three-dimensional point cloud data processing are two types: the first category is the conversion of non-normalized three-dimensional point cloud data into regularization data, most representative of which are projection image data and voxel data. In the process of projecting to a two-dimensional image, the data of the space geometric information is lost, and the problem of data redundancy exists in the three-dimensional voxel data. And in the second category, a convolution neural network is directly constructed to process three-dimensional point cloud data. In this way, the point cloud data does not have the problems of data loss and data redundancy, but the difficulty lies in the construction of a convolution structure and the representation and learning of features. The invention adopts the idea to process point cloud data.

2) Processing for displacement and translation invariance and rotation robustness of point cloud midpoint

Since point clouds are non-regular data, the order of points in the point cloud can be random and uncertain (the same point cloud data has an ordered representation of multiple points), and the method of processing the point cloud must have invariance to the permutations of points in the point cloud. In a real scene, the point cloud will undergo certain rotation and translation transformation, and the point cloud data after the transformation will also change, so the method for processing the point cloud also needs to have robustness to the rotation and translation transformation of the point cloud, which is also a problem to be solved by the invention.

3) Geometrical information representation of point clouds

In order to analyze the point cloud, it is necessary to represent geometric feature information of the point cloud, and the point cloud is effectively analyzed according to the feature information, wherein many existing technologies process the whole point cloud or local point cloud at the same time when representing the information of the point cloud, and the information is intertwined, so that the network cannot capture the most important geometric information. How to effectively capture and fuse the geometric feature information of the point cloud is also a main problem to be solved.

4) Point cloud unwrapping characterization

For point cloud data, an outline area and an internal flat area of a point cloud play different but complementary roles in representing the point cloud, and information of the two components can play a very important role in point cloud analysis, but most methods omit the process of unwrapping the point cloud into the outline and the flat area. The invention can well link the two parts of characteristics with the original point cloud characteristics, thereby improving the performance of the point cloud analysis task.

The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.

The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.

The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.

The computer program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C + + or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.

Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims

1. A three-dimensional point cloud processing method based on geometric unwrapping comprises the following steps:

acquiring point cloud data of an object and processing the point cloud data;

2. The method of claim 1, wherein the acquiring and processing point cloud data of an object comprises one or more of:

denoising and hole filling are carried out on the collected object point cloud data;

adding data by rotating and translating the point cloud, or enhancing the point cloud data by oscillating point coordinates in the point cloud around it;

randomly generating or obtaining a random probability according to the preset highest random probability, and then deleting the points in the point cloud according to the random probability.

3. The method of claim 1, wherein the extracting local features of point cloud data comprises:

collecting K adjacent points of each point in the whole point cloud data by using a K adjacent algorithm;

and copying K parts of the features of each point, splicing the K parts of the features of the adjacent points together, and extracting local features through one-dimensional convolution kernel convolution operation.

4. The method of claim 1, wherein the geometrically unwrapping the point cloud data comprises:

construction of a graph

An adjacency matrix A for similarity between the coding points and the points on the feature space;

using filters

Filtering the map signal, the response of each point through the filter is:

calculate l for each point response²The norm is used for representing the characteristic change degree of the point and the adjacent point;

for all points X in the point cloud data, according to each point l²Norm magnitude descending order rearrangement

Selecting l²M points with maximum norm

For points with large geometric variation, select l²M points with minimum norm

Points with less geometric variation.

5. The method of claim 4, wherein said learning, with a geometric attention module, mutually complementary information between said points of large geometric variation and points of small geometric variation comprises:

for the points with large geometric change and the points with small geometric change, the corresponding attention weight matrix is obtained by processing and calculating by using a multilayer perceptron MLP:

mixing X_lAnd X_sEncoding with multi-layer perceptron MLP and applying an attention weight matrix to the original point cloud

And the points above which the geometric variation is large or small, are expressed as:

splicing the features to obtain output features

Wherein the content of the first and second substances,

is an original point cloud, X_lIs a point of great geometric variation, X_sIs a point with small geometric change, two learnable attention weight matrixes W^l∈R^N×MAnd W^s∈R^N×MM is X_lAnd X_SNumber of midpoints, matrix W^lAnd W^sEach row in (a) corresponds to an attention weight between each original point and all points with large or small geometric changes.

6. The method of claim 1, wherein the geometric unwrapping of the point cloud data is applied to different semantic layers of the convolutional neural network.

7. The method of claim 1, wherein learning mutually complementary information between the points with large geometric variation and the points with small geometric variation using a geometric attention module comprises:

and (3) respectively carrying out attention acquisition on each point in the point cloud data with all points with large geometric change and points with small geometric change in a characteristic change point multiplication mode, and carrying out characteristic fusion on the characteristics of each point cloud in the point cloud data, the characteristics of the points with large geometric change and the characteristics of the points with small geometric change through an attention coefficient.

8. A geometric unwrapping-based three-dimensional point cloud processing apparatus, comprising:

9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.

10. A computer device comprising a memory and a processor, on which memory a computer program is stored which is executable on the processor, characterized in that the steps of the method of any of claims 1 to 7 are implemented when the processor executes the program.