CN112149725A - Spectral domain graph convolution 3D point cloud classification method based on Fourier transform - Google Patents
- Publication number: CN112149725A
- Application number: CN202010991678.6A
- Authority: CN (China)
- Prior art keywords: graph, convolution, point cloud, Fourier transform, local
- Prior art date
- Legal status: Granted (the legal status is an assumption and not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/24143 — Pattern recognition; classification techniques; distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
- G06N3/045 — Neural networks; architecture; combinations of networks
- G06N3/08 — Neural networks; learning methods
- G06T5/10 — Image enhancement or restoration using non-spatial domain filtering
- G06T5/70 — Image enhancement or restoration; denoising; smoothing
- G06T7/11 — Image analysis; segmentation; region-based segmentation
- G06T2207/10028 — Image acquisition modality; range image; depth image; 3D point clouds
- G06T2207/20081 — Special algorithmic details; training; learning
- G06T2207/20084 — Special algorithmic details; artificial neural networks [ANN]
- Y02A90/10 — Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses a spectral-domain graph convolution 3D point cloud classification method based on the Fourier transform, comprising the following steps. The input raw point cloud is geometrically sampled with a G-PointNet network model: by setting an angle threshold V, points whose neighborhood normal-angle value exceeds V are assigned to a geometric-feature region G and the remaining points to a region T, and each region is sampled to obtain its point cloud. A dilation rate E is introduced into a Dynamic KNN local-graph construction method, and a local geometric graph is selectively built from every E-th neighboring point. Spectral-domain graph convolution based on the Fourier transform then yields several pooled local graph features; global features are obtained through G-PointNet and classified to produce the classification result. The invention effectively solves the problem of non-uniform point cloud density, preserves spatial geometric information, efficiently distinguishes the edge points of the point cloud while separating out noise points, and improves classification accuracy.
Description
Technical Field
The invention relates to a spectral domain graph convolution 3D point cloud classification method based on Fourier transform, and belongs to the technical field of remote sensing image processing.
Background
With the flourishing of image processing techniques, classification methods based on two-dimensional images have diversified and achieved considerable success. However, deep learning methods that process three-dimensional data still fall far short of two-dimensional image classification in effectiveness. Three-dimensional data is typically represented as depth images, voxels, meshes, or point clouds. Compared with three-dimensional data acquired by an RGB-D camera or other mainstream sensors, the three-dimensional point cloud acquired by lidar provides more reliable depth and contour information about a three-dimensional object, and has gradually been applied to three-dimensional object classification in recent years.
In previous work, computer vision researchers mostly transferred the success of CNNs in image processing to point cloud classification: they extracted two-dimensional features from a three-dimensional object, or obtained multiple two-dimensional view images from different "views" of the object, projected the three-dimensional object into multiple views, extracted the corresponding view features, and then fused these features to perform accurate object identification. However, such methods infer the shape of the 3D object from 2D images, abandon the inherent spatial structure of the 3D point cloud, lose a large amount of spatial structure information, and consume excessive memory. In view of the shortcomings of 2D multi-view approaches, researchers have attempted to voxelize 3D point clouds. However, voxelization does not build complete edge information, which makes it difficult for such methods to capture fine-grained detail.
Several problems are therefore common in the current point cloud classification task. First, image convolution can define a local area in an image by choosing the convolution kernel size; unlike the regularly arranged grid structure of an image, a point cloud is a set of scattered points distributed continuously in three-dimensional space, and changing the order in which its points are listed does not change their spatial distribution, so a point cloud cannot be convolved directly with a traditional deep neural network. Second, the non-uniform density of the point cloud distribution in space also poses a great challenge for classification.
Disclosure of Invention
In order to overcome the defects of the prior art and effectively address the sensitivity of traditional 3D point cloud classification methods to spatial relationships and non-uniform point cloud distribution, the invention provides a spectral-domain graph convolution 3D point cloud classification method based on the Fourier transform. Without changing the spatial information of the point cloud, a new representation, the graph, is adopted; the graph structure effectively solves the problem of adjacency between points that affects most point cloud deep learning models, preserves spatial geometric information, and is well suited to irregularly arranged non-Euclidean data. Deep learning has seen little research in the spectral domain; the model combines a spectral-domain graph convolution method with a 3D point cloud framework for the first time. Spectral-domain convolution has a solid mathematical foundation, and graph convolution emphasizes the adjacency relations between key points. The G-PointNet of the invention greatly improves feature point acquisition and local region division: it provides geometric sampling preprocessing and designs a dynamic K-nearest-neighbor graph construction method, Dynamic KNN, effectively solving the problem of non-uniform point cloud density.
The invention specifically adopts the following technical scheme to solve the technical problems:
a spectral domain graph convolution 3D point cloud classification method based on Fourier transform comprises the following steps:
performing geometric sampling on the input raw point cloud with a G-PointNet network model: by setting an angle threshold V, points whose neighborhood normal-angle value exceeds V are assigned to a geometric-feature region G and the remaining points to a region T, and the point clouds of the two regions are uniformly sampled respectively to obtain the geometrically sampled point cloud of each region;
constructing an undirected graph from the geometrically sampled point cloud of each region, introducing a dilation rate E into a Dynamic KNN local-graph construction method, and selectively building a local geometric graph from every E-th neighboring point, obtaining several local geometric graphs;
performing spectral-domain graph convolution on each local geometric graph with a Fourier-transform-based spectral-domain graph convolution method to obtain several pooled local graph features, obtaining global features through the G-PointNet network model, and classifying to obtain the classification result.
Further, as a preferred technical solution of the present invention, the method performs spectrum domain graph convolution on each local geometric graph by using a spectrum domain graph convolution method based on fourier transform, specifically:
inputting a local geometric graph G = (V, E), where V and E denote the corresponding node set and edge set respectively; v ∈ V denotes a node in the graph and (μ, v) ∈ E denotes an edge in the graph;
defining the Laplacian matrix of the local geometric graph as L = D − A, where A denotes the adjacency matrix of the graph, with elements A_{i,j} = A_{j,i}, a matrix representing the connectivity of the nodes in the graph; D denotes the degree matrix of the graph, with diagonal elements D_{i,i} = Σ_j A_{i,j}; the degree of a node is the number of edges connected to it;
normalization yields the Laplacian matrix L = I_n − D^{−1/2} A D^{−1/2}, where I_n is the identity matrix, characterized by a set of Laplacian eigenvectors U = (u_1, u_2, …, u_n);
taking the eigenvectors of the decomposed Laplacian matrix as a basis, the local geometric graph is taken as the input x and Fourier-transformed as x̂ = U^T x, where T denotes matrix transposition; the convolution kernel h_θ(Λ) is obtained in Fourier-transform diagonal-matrix form to yield the Fourier-transform convolution in the spectral domain, which is then inverse-transformed to finally obtain the spectral-domain graph convolution output.
Further, as a preferred technical solution of the present invention, the spectral-domain graph convolution output formula obtained by the method is:
y = σ(U h_θ(Λ) U^T x)
where y is the output of the spectral-domain graph convolution, x is the input local geometric graph, and σ(·) is the ReLU activation function.
By adopting the technical scheme, the invention can produce the following technical effects:
the invention discloses a spectral domain graph convolution 3D point cloud classification method based on Fourier transform, which combines a deep network model of PointNet and spectral domain graph convolution operation based on Fourier transform to obtain a G-PointNet network model of the method. G-PointNet preserves the spatial transformation network T-Net in PointNet, and the semantic labels of point clouds must be invariant if they undergo some specific set transformations, such as rigid transformations. The representation of the learned set of points is therefore also invariant to these transformations. The solution is to fit all input sets into a canonical space before feature extraction. The three-dimensional space is aligned through sampling and interpolation, and a network layer is specially designed to be implemented on a GPU.
In the preprocessing stage, geometric sampling is applied to the raw point cloud. Inspired by dilated (atrous) convolution, the invention provides a graph construction method, Dynamic KNN (D-KNN): several local geometric graph structures are constructed as graph convolution input, mapped to the spectral domain by the Fourier transform for the convolution operation, returned to the spatial domain by the inverse Fourier transform, and finally the global features of the pooled local graph features are obtained through PointNet and classified. The process can be divided into three parts: geometric sampling, Dynamic KNN local-graph construction, and Fourier-transform-based spectral-domain graph convolution, with the following respective advantages:
1. Geometric sampling: the advantages of geometric sampling are very significant. In regions where the geometric features of the point cloud are more pronounced, more sampled points are allocated, the edge features are very distinct, computation is efficient, and the sampling result is more robust to noise.
2. Dynamic KNN local-graph construction: Dynamic KNN is a composition method that dynamically selects the K-nearest neighborhood, inspired by dilated convolution. On two-dimensional image tasks, dilated convolution effectively enlarges the receptive field by introducing a hyperparameter called the "dilation rate", without changing the output size of the image. Dynamic KNN introduces the dilation rate E into the KNN algorithm, and the dilation rate can be chosen according to the density of the point cloud.
3. Fourier-transform-based spectral-domain graph convolution: the spectral-domain graph convolution network based on the Fourier transform serves as the point cloud feature extraction network, and its advantage on the point cloud classification task is also quite pronounced. Unlike spatial-domain convolution, which lacks a mathematical explanation in end-to-end deep learning tasks, the Fourier transform provides a solid theoretical basis for establishing feasibility. Moreover, points with large edge variation and noise points in a point cloud are generally regarded as high-frequency signals; the Fourier transform can distinguish high-frequency from low-frequency signals, so for the classification task it effectively distinguishes the edge points of the point cloud while separating out noise points, which is important for classification.
Drawings
FIG. 1 is a schematic diagram of the method of the present invention.
FIG. 2 is a schematic diagram of geometric sampling processing in the method of the present invention.
FIG. 3 is a schematic diagram of the Dynamic KNN local-graph construction in the present invention.
Detailed Description
The following describes embodiments of the present invention with reference to the drawings.
As shown in fig. 1, the invention designs a spectral-domain graph convolution 3D point cloud classification method based on the Fourier transform. The process can be divided into three parts: geometric sampling, Dynamic KNN local-graph construction, and Fourier-transform-based spectral-domain graph convolution, and specifically comprises the following steps:
First, the G-PointNet network model of the invention is obtained, inspired by the PointNet deep network model and the Fourier-transform-based spectral-domain graph convolution operation. The G-PointNet network model preserves the spatial transformation network T-Net of PointNet. It takes a point cloud directly as input: a collection of points in three-dimensional space, each represented by its three-dimensional coordinates (x, y, z), sometimes with additional features such as color or laser reflection intensity. Unless otherwise specified, G-PointNet uses only the three-dimensional coordinates (x, y, z) as point features.
A point cloud has three characteristics that prevent it from being convolved directly with the depth models used in image processing:
1. The point set is unordered. An image changes when any two of its pixels swap positions; unlike a two-dimensional image, a point cloud has no particular ordering, i.e. changing the order in which the points are listed does not change the shape of the point cloud.
2. Points are interrelated. Each point has three-dimensional coordinates (x, y, z), so each point carries spatial information about the shape; points are not independent, and a neighboring set of local points can represent meaningful spatial information. The model therefore needs to capture local structure from nearby points, as well as the combined interactions between local structures.
3. Invariance under transformations. A point cloud is three-dimensional data; applying any rotation or translation to it should not affect the final classification result.
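The unordered-set requirement above can be met with a symmetric aggregation over the point axis, the PointNet-style trick also used here when pooling features. A minimal sketch, where the fixed toy weight matrix is an illustrative assumption and not from the patent:

```python
import numpy as np

def global_feature(points):
    """Order-invariant global feature via a symmetric max-pool over points:
    per-point features are aggregated with max(), so shuffling the input
    points leaves the result unchanged. The weight matrix is a toy stand-in
    for a learned per-point feature map."""
    weights = np.array([[1.0, 0.5], [0.2, 1.0], [0.3, 0.7]])  # (3, 2), illustrative
    per_point = np.tanh(points @ weights)                     # per-point features
    return per_point.max(axis=0)                              # symmetric aggregation
```

Because `max` commutes with any permutation of the rows, `global_feature(points)` and `global_feature(points[perm])` are identical for every permutation `perm`.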
In the preprocessing stage, the G-PointNet network model of the method applies geometric sampling to the raw point cloud and preserves the spatial transformation network T-Net of PointNet, which adjusts the point cloud to a pose suitable for classification.
Fig. 2 is a schematic diagram of the geometric sampling of the method of the present invention. The method uses geometric sampling to process the point cloud data. A traditional point cloud classification model such as PointNet adopts the Farthest Point Sampling algorithm: it randomly selects an initial point, adds the point farthest from the already selected set, and iterates until the required number of points is reached. Each time Farthest Point Sampling selects a point, the distance between sets must be recomputed, so the algorithm has high time complexity, and the edge features of the resulting sample points are not distinct. By contrast, the advantages of geometric sampling are quite significant. The geometric sampling procedure of the invention is as follows:
Given the number of input points C, the target sample count S, and the uniform sampling rate U, an angle threshold V is set; points whose neighborhood normal-angle value exceeds V are assigned to the geometric-feature region G and the remaining points to region T. The point cloud is thus divided into two parts, the geometric-feature region G and the other region T, and the point clouds of the two regions G and T are uniformly sampled respectively to obtain the geometrically sampled point cloud of each region.
Geometric sampling acquires more points where the curvature of the point cloud is larger; however, computing the curvature is time-consuming and greatly increases the workload, so a simple approximation of the curvature effect is used: the normal-angle value from a feature point to its neighborhood points in the local point cloud graph structure is computed, and the larger the normal angle, the larger the curvature. In fig. 2, c1, c2 and c3 are three point cloud points, and alpha denotes the normal-angle value; the curvature of c1 is approximated by the normal angles to c2 and c3, with which it is positively correlated.
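The region split and per-region sampling described above can be sketched as follows. This is an illustrative reconstruction, not the patent's implementation: the precomputed `normals`, the brute-force nearest-neighbor search, the fixed random seed, and the share of samples given to region G (`feature_ratio`) are all assumptions.

```python
import numpy as np

def geometric_sampling(points, normals, angle_threshold_deg, n_samples, feature_ratio=0.7):
    """Split points into a geometric-feature region G (large neighbor normal
    angle, a proxy for curvature) and a residual region T, then uniformly
    sample each region. `normals` is an (n, 3) array of unit normals."""
    # Brute-force nearest neighbor (excluding self) -- fine for small clouds.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = d.argmin(axis=1)
    # Angle between each point's normal and its nearest neighbor's normal.
    cosang = np.clip(np.abs((normals * normals[nn]).sum(axis=1)), 0.0, 1.0)
    angles = np.degrees(np.arccos(cosang))
    feat = np.flatnonzero(angles > angle_threshold_deg)   # region G
    rest = np.flatnonzero(angles <= angle_threshold_deg)  # region T
    rng = np.random.default_rng(0)                        # fixed seed for reproducibility
    n_feat = min(len(feat), int(n_samples * feature_ratio))
    n_rest = min(len(rest), n_samples - n_feat)

    def pick(pool, n):
        return rng.choice(pool, n, replace=False) if n > 0 else np.empty(0, dtype=int)

    idx = np.concatenate([pick(feat, n_feat), pick(rest, n_rest)])
    return points[idx]
```

The biased `feature_ratio` mirrors the stated effect that more samples land where the geometric features are pronounced.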
Fig. 3 is a schematic diagram of the local-graph construction method Dynamic KNN. The receptive field of a point cloud refers to a set of point cloud nodes comprising a central node and its neighbors; however, the non-uniformity of the point cloud distribution means that some nodes may have only one neighbor while others have as many as thousands.
In the local-graph construction method, Dynamic KNN introduces the dilation rate E into the KNN algorithm, and the dilation rate can be chosen according to the density of the point cloud. Dynamic KNN sets two threshold values M and N, with M < N, where M and N are target point counts; the number of points after geometric sampling is X and the dilation rate is E.
Given an undirected graph G = (v, ε) representing the graph structure of the point cloud, where v = {1, …, n} and an edge belongs to v × v, the idea of Dynamic KNN is to selectively build a local geometric graph from every E-th neighboring point according to the sparsity of the point cloud, obtaining several local geometric graphs after repeated selection. Where the point cloud is sparse, the traditional KNN nearest-neighbor scheme is used, i.e. E = 1. Where the point cloud is dense, every E-th nearby point is connected, and the resulting local geometric graph structure is used as graph convolution input and sent to the spectral domain for graph convolution. Dynamic KNN effectively alleviates the excessive node overlap in dense point clouds while reducing the computational complexity.
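Under the assumption that "every E-th neighboring point" means keeping every E-th entry of the distance-sorted neighbor list, a dilated KNN edge list can be sketched as below (brute-force distances; `k` and `dilation` stand in for K and E):

```python
import numpy as np

def dilated_knn_edges(points, k, dilation):
    """Dilated K-nearest-neighbor graph: for each point, sort all other points
    by distance and keep every `dilation`-th one until k edges are chosen.
    dilation=1 recovers plain KNN; larger values spread the receptive field
    over dense regions. Requires enough points that k strided neighbors exist."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                # self sorts last, never picked
    order = d.argsort(axis=1)                  # neighbors, nearest first
    picked = order[:, ::dilation][:, :k]       # every E-th neighbor, k of them
    src = np.repeat(np.arange(len(points)), picked.shape[1])
    dst = picked.ravel()
    return np.stack([src, dst])                # (2, n*k) edge list
```

With `dilation=1` the edge list is exactly the plain KNN graph, matching the sparse-point-cloud case in the text.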
Third, spectral-domain graph convolution is performed on each local geometric graph with the Fourier-transform-based spectral-domain graph convolution method to obtain several pooled local graph features; global features are then obtained through the PointNet network model and classified to obtain the classification result. Each of the local geometric graphs constructed by the local-graph construction method Dynamic KNN is taken as graph convolution input x, mapped to the spectral domain by the Fourier transform for the convolution operation, returned to the spatial domain by the inverse Fourier transform, and finally the global features of the pooled local graph features are obtained through the PointNet network model for classification. The specific process is as follows:
First, a local geometric graph G = (V, E) is input, representing an undirected graph; V and E denote the corresponding node set and edge set, μ, v ∈ V denote nodes in the graph, and (μ, v) ∈ E denotes an edge in the graph.
Next, the Laplacian matrix of the local geometric graph G is defined as L = D − A, where A denotes the adjacency matrix of the graph, with elements A_{i,j} = A_{j,i} (i and j denote the row and column index of an element in the matrix), a matrix representing the connectivity of the nodes in the graph; for an undirected graph with N nodes, the adjacency matrix is an N × N real symmetric matrix. D denotes the degree matrix of the graph, with diagonal elements D_{i,i} = Σ_j A_{i,j}, the degree of a node being the number of edges connected to it. L, the Laplacian matrix of the graph, may be binary or weighted.
Then, normalization yields the Laplacian matrix L = I_n − D^{−1/2} A D^{−1/2}, where I_n is the identity matrix. Since the Laplacian matrix is symmetric, its eigendecomposition yields a set of Laplacian eigenvectors U = (u_1, u_2, …, u_n). Among these, the eigenvectors associated with larger eigenvalues represent rapidly varying signals and are regarded as high-frequency; those associated with smaller eigenvalues carry slowly varying signals and are regarded as low-frequency. In the point cloud classification task, the edge information of an object can thus be found as the signal distinguishing high from low frequencies.
Then, taking the eigenvectors of the decomposed Laplacian matrix as a basis, the Fourier transform is performed with the local geometric graph as input x: x̂ = U^T x, where T denotes matrix transposition; the inverse Fourier transform is x = U x̂. Transferring the classical Fourier transform and convolution to graph convolution amounts to replacing the eigenfunctions of the Laplace operator with the eigenvectors of the Laplacian matrix of the local geometric graph G. That is, the convolution kernel h_θ(Λ) is obtained in Fourier-transform diagonal-matrix form to yield the Fourier-transform convolution in the spectral domain, which is then inverse-transformed to finally obtain the spectral-domain graph convolution output. The derivation is as follows:
The convolution kernel h_θ(Λ) in Fourier-transform diagonal-matrix form is h_θ(Λ) = diag(h_θ(λ_1), …, h_θ(λ_n)), where θ is the kernel parameter, λ_l are the eigenvalues of the Laplacian matrix, and u_l are the corresponding eigenvectors.
Finally, the spectral-domain graph convolution formula is:
y = σ(U h_θ(Λ) U^T x)
where y is the output of the spectral-domain graph convolution, x is the input local geometric graph, and σ(·) is the ReLU activation function.
Finally, several pooled local graph features are obtained from the output y of the spectral-domain graph convolution, and global features are obtained through the PointNet network model and classified to obtain the final classification result.
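The pipeline y = σ(U h_θ(Λ) U^T x) on the normalized Laplacian can be sketched numerically. A non-parametric filter with one free coefficient per eigenvalue is a simplifying assumption for illustration:

```python
import numpy as np

def spectral_graph_conv(x, adj, theta):
    """One spectral-domain graph convolution y = relu(U h_theta(Lambda) U^T x)
    on the symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    x: (n, f) node features, adj: (n, n) symmetric adjacency,
    theta: (n,) one filter coefficient per eigenvalue."""
    adj = adj.astype(float)
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    lam, u = np.linalg.eigh(lap)       # L = U diag(lam) U^T, U orthonormal
    x_hat = u.T @ x                    # graph Fourier transform: x_hat = U^T x
    y_hat = theta[:, None] * x_hat     # diagonal spectral filter h_theta(Lambda)
    y = u @ y_hat                      # inverse transform back to the vertex domain
    return np.maximum(y, 0.0)          # ReLU activation
```

A sanity check on the design: since U is orthonormal, the identity filter theta = (1, …, 1) gives U U^T x = x, so the layer reduces to a plain ReLU of the input.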
The method thus greatly improves feature point acquisition and local region division: it provides geometric sampling preprocessing and designs the dynamic local-graph construction method Dynamic KNN, effectively solving the problem of non-uniform point cloud density. Without changing the spatial information of the point cloud, it provides a new representation, the graph; the graph structure effectively solves the adjacency problem that affects most point cloud deep learning models, preserves spatial geometric information, and, by introducing the hyperparameter called the dilation rate, effectively enlarges the receptive field without changing the output size. The Fourier-transform-based spectral-domain graph convolution network serves as the point cloud feature extraction network; for the classification task, the Fourier transform efficiently distinguishes the edge points of the point cloud while separating out noise points, improving classification accuracy, so the advantage of spectral-domain graph convolution on the point cloud classification task is very pronounced.
The embodiments of the present invention have been described in detail with reference to the drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the gist of the present invention.
Claims (3)
1. A Fourier transform-based spectral-domain graph convolution 3D point cloud classification method, characterized by comprising the following steps:
performing geometric sampling on the input original point cloud using the G-PointNet network model: by setting an angle threshold V, points whose neighborhood included-angle value is greater than the angle threshold V are assigned to a geometric-feature region G and the remaining points to a region T, and the point clouds in the two regions are each sampled uniformly to obtain the geometrically sampled point cloud of each region;
constructing an undirected graph from the geometrically sampled point cloud of each region: based on the Dynamic KNN local graph construction method, a dilation rate E is introduced, and each local geometric graph is built by selecting every E-th neighboring point, yielding a plurality of local geometric graphs;
performing spectral-domain graph convolution on each local geometric graph using the Fourier transform-based spectral-domain graph convolution method to obtain a plurality of pooled local graph features, obtaining global features through the G-PointNet network model, and classifying to obtain the classification result.
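The dilated local graph construction of claim 1 can be sketched as follows: instead of taking the k nearest neighbors directly, every E-th point of the distance-sorted neighbor list is selected, enlarging the receptive field without increasing k. This is an illustrative sketch; the brute-force distance computation and all parameter names are assumptions, not the patented implementation.

```python
import numpy as np

def dilated_knn_graph(points, k=16, dilation=2):
    """Build the edge list of an undirected local graph by dilated KNN.

    points   : (n, d) point coordinates
    k        : number of neighbors kept per point
    dilation : dilation rate E; every E-th sorted neighbor is selected
               (assumes dilation * k <= n - 1 so enough neighbors exist)
    """
    # Pairwise distances, neighbors sorted nearest-first (self excluded)
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    order = np.argsort(d, axis=1)[:, 1:]
    # Take every E-th neighbor, then keep the first k of them
    neigh = order[:, ::dilation][:, :k]
    # Flatten into a (2, n*k) edge list
    src = np.repeat(np.arange(len(points)), neigh.shape[1])
    dst = neigh.ravel()
    return np.stack([src, dst])
```

For points on a line, point 0's nearest neighbors are 1, 2, 3, ...; with E = 2 the selected neighbors become 1, 3, 5, ..., which shows the enlarged receptive field.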
2. The Fourier transform-based spectral-domain graph convolution 3D point cloud classification method according to claim 1, wherein the spectral-domain graph convolution is performed on each local geometric graph using the Fourier transform-based spectral-domain graph convolution method, specifically:
inputting a local geometric graph G = (V, E), where V and E denote the node set and the edge set respectively; v ∈ V denotes a node in the graph, and (u, v) ∈ E denotes an edge in the graph;
defining the Laplacian matrix L = D − A of the local geometric graph, where A denotes the adjacency matrix of the graph, whose elements A_{i,j} = A_{j,i} represent the connections between nodes in the graph; D denotes the degree matrix of the graph, whose elements D_{i,i} = Σ_j A_{i,j}, the degree of a node being the number of edges connected to that node;
normalization yields the normalized Laplacian matrix L = I_n − D^{−1/2} A D^{−1/2}, where I_n is the identity matrix, characterized by a set of Laplacian eigenvectors U = (u_1, u_2, ..., u_n);
taking the eigenvectors of the decomposed Laplacian matrix as a set of bases, taking the local geometric graph as input x and performing the graph Fourier transform x̂ = U^T x, where T denotes matrix transposition; obtaining the convolution kernel h_θ(Λ) in Fourier-transform diagonal-matrix form to obtain the Fourier-transform convolution in the spectral domain, and then performing the inverse transform to finally obtain the spectral-domain graph convolution output.
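The pipeline of claim 2 (unnormalized Laplacian L = D − A, symmetric normalization, eigendecomposition into the Fourier basis U, and the graph Fourier transform x̂ = U^T x) can be sketched as follows; function and variable names are illustrative, and the graph is assumed connected with no isolated nodes so the degree matrix is invertible.

```python
import numpy as np

def graph_fourier(A, x):
    """Build the normalized Laplacian of a graph, eigendecompose it, and
    compute the graph Fourier transform of the node signal x.

    A : (n, n) symmetric adjacency matrix, no isolated nodes
    x : (n, f) node feature matrix
    """
    n = A.shape[0]
    deg = A.sum(axis=1)                                # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    # Normalized Laplacian L = I_n - D^{-1/2} A D^{-1/2}
    L_norm = np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt
    # Eigendecomposition: eigenvalues Lambda, Fourier basis U
    lam, U = np.linalg.eigh(L_norm)
    # Graph Fourier transform x_hat = U^T x; inverse transform is U @ x_hat
    x_hat = U.T @ x
    return L_norm, U, lam, x_hat
```

Because U is orthonormal, applying U to x̂ recovers x exactly, which is the inverse transform used at the end of claim 2.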
3. The Fourier transform-based spectral-domain graph convolution 3D point cloud classification method according to claim 2, wherein the spectral-domain graph convolution output formula obtained by the method is:
y = σ(U h_θ(Λ) U^T x)
where y is the output of the spectral-domain graph convolution, x is the input local geometric graph, and σ(·) is the ReLU activation function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010991678.6A CN112149725B (en) | 2020-09-18 | 2020-09-18 | Fourier transform-based spectrum domain map convolution 3D point cloud classification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112149725A true CN112149725A (en) | 2020-12-29 |
CN112149725B CN112149725B (en) | 2023-08-22 |
Family
ID=73892671
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010991678.6A Active CN112149725B (en) | 2020-09-18 | 2020-09-18 | Fourier transform-based spectrum domain map convolution 3D point cloud classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112149725B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112967296A (en) * | 2021-03-10 | 2021-06-15 | 重庆理工大学 | Point cloud dynamic region graph convolution method, classification method and segmentation method |
CN113157864A (en) * | 2021-04-25 | 2021-07-23 | 平安科技(深圳)有限公司 | Key information extraction method and device, electronic equipment and medium |
CN113763260A (en) * | 2021-02-20 | 2021-12-07 | 京东鲲鹏(江苏)科技有限公司 | Noise filtering method, device, equipment and storage medium based on water droplet noise |
CN113936176A (en) * | 2021-10-15 | 2022-01-14 | 哈尔滨理工大学 | Three-dimensional point cloud classification method based on graph volume and shape descriptor |
US11295170B1 (en) | 2021-08-17 | 2022-04-05 | FPT USA Corp. | Group-equivariant convolutional neural networks for 3D point clouds |
CN114359902A (en) * | 2021-12-03 | 2022-04-15 | 武汉大学 | Three-dimensional point cloud semantic segmentation method based on multi-scale feature fusion |
CN114565774A (en) * | 2022-02-21 | 2022-05-31 | 辽宁师范大学 | 3D (three-dimensional) image volume integral classification method based on local geometry and global structure joint learning |
CN115099287A (en) * | 2022-08-24 | 2022-09-23 | 山东大学 | Space variable gene identification and analysis system based on graph Fourier transform |
CN116977572A (en) * | 2023-09-15 | 2023-10-31 | 南京信息工程大学 | Building elevation structure extraction method for multi-scale dynamic graph convolution |
WO2023222923A1 (en) * | 2022-05-20 | 2023-11-23 | Cobra Simulation Ltd | Method of content generation from sparse point datasets |
CN113936176B (en) * | 2021-10-15 | 2024-06-07 | 哈尔滨理工大学 | Three-dimensional point cloud classification method based on graph convolution and shape descriptors |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102292187A (en) * | 2008-11-21 | 2011-12-21 | 普雷茨特两合公司 | Method and device for monitoring a laser machining operation to be performed on a workpiece and laser machining head having such a device |
US20150371431A1 (en) * | 2013-01-29 | 2015-12-24 | Andrew Robert Korb | Methods for analyzing and compressing multiple images |
CN106897707A (en) * | 2017-03-02 | 2017-06-27 | 苏州中科天启遥感科技有限公司 | Characteristic image time series synthetic method and device based in multi-source points |
CN110348299A (en) * | 2019-06-04 | 2019-10-18 | 上海交通大学 | The recognition methods of three-dimension object |
CN111027559A (en) * | 2019-10-31 | 2020-04-17 | 湖南大学 | Point cloud semantic segmentation method based on expansion point convolution space pyramid pooling |
CN111160171A (en) * | 2019-12-19 | 2020-05-15 | 哈尔滨工程大学 | Radiation source signal identification method combining two-domain multi-features |
Non-Patent Citations (6)
Title |
---|
AOKI Y等: "PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet", 《PROCEEDINGS OF THE IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 * |
ZHENZHONG KUANG等: "Discovering the Latent Similarities of the KNN Graph by Metric Transformation", 《ACM》 * |
WANG Xujiao: "Research on Deep Learning-Based 3D Point Cloud Classification Methods for Scenes", 《China Masters' Theses Full-text Database (Information Science and Technology)》 * |
CHEN Wanjun; ZHANG Erhu: "A Survey of Human Action Recognition Based on Depth Information", 《Journal of Xi'an University of Technology》 * |
CHEN Suting et al.: "Graph-PointNet: a 3D Point Cloud Segmentation Algorithm Based on Graph Convolutional Neural Networks", 《Modern Electronics Technique》 * |
HUANG Xingyi et al.: "Research on Nondestructive Detection of Moldy and Sprouted Peanuts by Near-Infrared Spectroscopy", 《Journal of Agricultural Science and Technology (China)》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112149725B (en) | Fourier transform-based spectrum domain map convolution 3D point cloud classification method | |
Chen et al. | LSANet: Feature learning on point sets by local spatial aware layer | |
CN111028327A (en) | Three-dimensional point cloud processing method, device and equipment | |
CN110675379A (en) | U-shaped brain tumor segmentation network fusing cavity convolution | |
CN112634149A (en) | Point cloud denoising method based on graph convolution network | |
CN115222625A (en) | Laser radar point cloud denoising method based on multi-scale noise | |
CN111815640B (en) | Memristor-based RBF neural network medical image segmentation algorithm | |
CN115690188A (en) | Human body three-dimensional measurement method based on point cloud model optimization | |
CN116721121A (en) | Plant phenotype color image feature extraction method | |
CN116310095A (en) | Multi-view three-dimensional reconstruction method based on deep learning | |
CN111340133A (en) | Image classification processing method based on deep convolutional neural network | |
CN108921853B (en) | Image segmentation method based on super-pixel and immune sparse spectral clustering | |
Zhou et al. | Image copy-move forgery passive detection based on improved PCNN and self-selected sub-images | |
CN112085066B (en) | Voxelized three-dimensional point cloud scene classification method based on graph convolution neural network | |
Wang et al. | Multispectral point cloud superpoint segmentation | |
CN113657522A (en) | Multi-view three-dimensional model clustering method | |
Feng et al. | Infrared and visible image fusion based on the total variational model and adaptive wolf pack algorithm | |
Cheng et al. | Visual information quantification for object recognition and retrieval | |
Tan et al. | Accurate detection of built-up areas from high-resolution remote sensing imagery using a fully convolutional network | |
CN104751449A (en) | Particle swarm optimization based SAR image segmentation method | |
CN112884884A (en) | Candidate region generation method and system | |
CN112633376A (en) | Point cloud data ground feature classification method and system based on deep learning and storage medium | |
Li et al. | Graph Convolution Network with Double Filter for Point Cloud Segmentation | |
Zhang et al. | A method for identifying and repairing holes on the surface of unorganized point cloud | |
Ren et al. | Steel Surface Defect Detection Using Improved Deep Learning Algorithm: ECA-SimSPPF-SIoU-Yolov5 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||