CN116091751B - Point cloud classification method and device, computer equipment and storage medium - Google Patents
- Publication number
- CN116091751B CN202211103085.7A
- Authority
- CN
- China
- Prior art keywords
- point cloud
- view
- matrix information
- projection
- dimensional point
- Prior art date
- Legal status
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Processing (AREA)
- Processing Or Creating Images (AREA)
Abstract
The invention relates to the field of artificial intelligence, and in particular to a point cloud classification method, a point cloud classification device, a computer device and a storage medium. A pre-acquired three-dimensional point cloud is voxelized to obtain a voxelized three-dimensional point cloud; the voxelized three-dimensional point cloud is projected onto three preset coordinate planes to obtain at least three projection views of the projected three-dimensional point cloud; a t-order layer-aggregation transformation is performed on each projection view to obtain fine and coarse view matrix information; and the view expression vectors formed by splicing this matrix information are respectively input into a preset point cloud classification model, which outputs the point cloud classification result.
Description
Technical Field
The present invention relates to the field of artificial intelligence, and in particular, to a point cloud classification method, a point cloud classification device, a computer device, and a storage medium.
Background
With the continuous emergence of three-dimensional imaging and scanning devices such as 3D cameras, the Kinect, radar and depth scanners, point cloud data has become increasingly easy and accurate to acquire. Point cloud data is now widely used in fields such as autonomous driving, intelligent robotics, virtual reality, medical diagnosis and medical imaging. Among the many ways point cloud data is processed, point cloud classification is the basis for downstream tasks in these fields, such as target recognition and tracking, scene understanding and three-dimensional reconstruction.
At present, deep learning methods are widely applied to classification tasks on three-dimensional point cloud data, and deep neural network models for different kinds of three-dimensional data continue to evolve and improve. However, because three-dimensional data is large and the corresponding neural network models are complex, such tasks are computationally expensive and classification efficiency is low. How to improve classification efficiency in three-dimensional point cloud classification has therefore become an urgent problem to be solved.
Disclosure of Invention
In view of the foregoing, it is necessary to provide a method, an apparatus, a computer device and a storage medium for classifying point clouds, so as to solve the problem of low efficiency of classifying point clouds.
A first aspect of an embodiment of the present application provides a point cloud classification method, where the method includes:
voxelizing a pre-acquired three-dimensional point cloud to obtain a voxelized three-dimensional point cloud;
projecting the voxelized three-dimensional point cloud onto three preset coordinate planes to obtain at least three projection views of the projected three-dimensional point cloud, wherein at least one projection view is obtained by projection onto each coordinate plane;
performing a t-order layer-aggregation transformation on each projection view to obtain fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and coarse approximation matrix information of the t-th layer in each view, wherein t is an integer greater than 1;
and respectively inputting the view expression vectors formed by splicing the fine horizontal matrix information, the fine vertical matrix information, the fine diagonal matrix information and the coarse approximation matrix information of the t-th layer in each projection view into a preset point cloud classification model, and outputting the point cloud classification result.
A second aspect of an embodiment of the present application provides a point cloud classification apparatus, the apparatus including:
the voxelization module is used for voxelizing a pre-acquired three-dimensional point cloud to obtain a voxelized three-dimensional point cloud;
the projection module is used for projecting the voxelized three-dimensional point cloud onto three preset coordinate planes to obtain at least three projection views of the projected three-dimensional point cloud, wherein at least one projection view is obtained by projection onto each coordinate plane;
the t-order layer-aggregation transformation module is used for performing a t-order layer-aggregation transformation on each projection view to obtain fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and coarse approximation matrix information of the t-th layer in each view, wherein t is an integer greater than 1;
and the point cloud classification result determining module is used for respectively inputting the view expression vectors formed by splicing the fine horizontal matrix information, the fine vertical matrix information, the fine diagonal matrix information and the coarse approximation matrix information of the t-th layer in each projection view into a preset point cloud classification model, and outputting the point cloud classification result.
In a third aspect, an embodiment of the present invention provides a computer device, where the computer device includes a processor, a memory, and a computer program stored in the memory and executable on the processor, and the processor implements the point cloud classification method according to the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the point cloud classification method according to the first aspect.
Compared with the prior art, the invention has the beneficial effects that:
a pre-acquired three-dimensional point cloud is voxelized to obtain a voxelized three-dimensional point cloud; the voxelized three-dimensional point cloud is projected onto three preset coordinate planes to obtain at least three projection views of the projected three-dimensional point cloud, wherein at least one projection view is obtained by projection onto each coordinate plane; a t-order layer-aggregation transformation is performed on each projection view to obtain the fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and coarse approximation matrix information of the t-th layer in each view, wherein t is an integer greater than 1; and the view expression vectors formed by splicing this matrix information of the t-th layer in each projection view are respectively input into a preset point cloud classification model, and the point cloud classification result is output. Projecting the voxelized point cloud onto the three preset coordinate planes reduces the amount of data while retaining the three-dimensional geometric features of the point cloud; the t-order layer-aggregation transformation of each projection view extracts the distinguishing features of that view; and classifying with a point cloud classification model whose inputs contain these distinguishing features ensures accuracy while improving the efficiency of point cloud classification.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an application environment of a point cloud classification method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a point cloud classification method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a preset point cloud classification model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a fusion module according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a prediction module according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a perception network according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a perceptron according to an embodiment of the present invention;
FIG. 8 is a flow chart of a point cloud classification method according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a point cloud classification device according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in the present description and the appended claims, the term "if" may be interpreted as "when", "once", "in response to determining" or "in response to detecting", depending on the context. Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
Furthermore, the terms "first," "second," "third," and the like in the description of the present specification and in the appended claims, are used for distinguishing between descriptions and not necessarily for indicating or implying a relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the invention. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
The embodiment of the invention can acquire and process the related data based on the artificial intelligence technology. Among these, artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and extend human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
It should be understood that the sequence numbers of the steps in the following embodiments do not mean the order of execution, and the execution order of the processes should be determined by the functions and the internal logic, and should not be construed as limiting the implementation process of the embodiments of the present invention.
In order to illustrate the technical scheme of the invention, the following description is made by specific examples.
The point cloud classification method provided by the embodiment of the invention can be applied to an application environment as shown in fig. 1, where a client communicates with a server. The clients include, but are not limited to, palmtop computers, desktop computers, notebook computers, ultra-mobile personal computers (UMPC), netbooks, personal digital assistants (PDA), and the like. The server may be implemented by a stand-alone server or by a server cluster formed by a plurality of servers.
Referring to fig. 2, a flow chart of a point cloud classification method according to an embodiment of the present invention is shown, where the point cloud classification method may be applied to a server in fig. 1, and the server is connected to a corresponding client to provide model training service for the client. As shown in fig. 2, the point cloud classification method may include the following steps.
S201: and voxelizing the three-dimensional point cloud obtained in advance to obtain the voxelized three-dimensional point cloud.
In step S201, the pre-acquired three-dimensional point cloud is voxelized to obtain a voxelized three-dimensional point cloud. Voxelization divides the three-dimensional space into minimum units (voxels) and converts the three-dimensional coordinates of the points into voxel coordinates, yielding the voxelized three-dimensional point cloud.
In this embodiment, the pre-acquired three-dimensional point cloud may be point cloud data acquired by a lidar, and the point cloud is voxelized using voxels with equal side lengths: the lidar space is divided along each coordinate axis direction with a fixed step length. Different axial resolutions may be set, that is, the voxel side length may differ per axis; in this embodiment, the resolutions in all directions are set to the same value according to the scene.
For each point, the voxel it belongs to can be computed from its coordinates and the voxel side length. Voxels here are simply the three-dimensional counterpart of pixels in a planar image. After voxelization, the number of points contained in each voxel is not uniform, because even when the voxel grid is uniform the number of points per voxel is affected by factors such as the point density of the data.
After the point cloud is divided into voxels, the point cloud features within each voxel grid are calculated. The scenes contained in vehicle-mounted lidar data are very rich: the data may contain a large amount of terrain such as hillsides, flat ground and sharp ridges. Because of these different ground forms, the Z coordinates of similar ground features differ across the point cloud, so the average normalized height N can be used to label the voxel grids and obtain the point cloud features. In planar areas, a plane is fitted to the points within a certain range, for example with the classical least-squares method; the normal direction of the plane is computed, and from this normal the distance from each lidar point in the current range to the fitted plane can be obtained quickly, as sketched below.
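A minimal sketch of this plane-fitting step is given below, assuming NumPy. The SVD-based fit (one common way to realize a least-squares plane fit), the function names and the random example data are illustrative choices and are not taken from the patent.

```python
import numpy as np

def fit_plane_least_squares(points):
    """Fit a plane to an (N, 3) point array; return unit normal n and offset d with n.p + d = 0."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    # The plane normal is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]
    d = -normal.dot(centroid)
    return normal, d

def point_to_plane_distance(points, normal, d):
    """Signed distance of each point to the plane n.p + d = 0."""
    return points @ normal + d

# Example: distances of the points in the current range to the fitted plane.
pts = np.random.rand(100, 3)
n, d = fit_plane_least_squares(pts)
dist = point_to_plane_distance(pts, n, d)
```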
Optionally, voxelizing a three-dimensional point cloud acquired in advance to obtain a voxelized three-dimensional point cloud, including:
acquiring corresponding three-dimensional point cloud information in each voxel grid according to the three-dimensional point cloud acquired in advance and the division specification of the preset voxel grid;
and calculating the centroid coordinates in each voxel grid according to the corresponding three-dimensional point cloud information in each voxel grid, and taking the point cloud corresponding to the obtained centroid coordinates as the three-dimensional point cloud after voxelization.
In this embodiment, a voxel model is built for the three-dimensional point cloud. To voxelize the model surface, the spatial model to be represented is enclosed in an axis-aligned spatial bounding box, which is then divided into individual small voxels at the chosen resolution in each direction of three-dimensional space, giving a list of three-dimensional voxels. The surface voxels are found from the positions where the polygons or triangles of the model surface intersect the bounding-box cells; marking the voxels at all intersection positions yields the surface voxels and completes the surface voxelization.
The surface-voxelized model can be regarded as a shell model, and the three-dimensional space enclosed by the shell, that is, the interior of the model, must then be voxelized. The method generally used here is to build an octree over the model's interior space and accelerate the process with a ray-casting or scanning method, so that interior voxelization can be carried out quickly. After voxelization, when the voxelized point cloud information is computed, the centroid coordinates of the points within each voxel grid are taken as the voxelized point cloud.
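The following sketch illustrates the centroid-per-voxel computation described above, assuming NumPy; the cubic voxel size, the function name and the random example input are illustrative and are not specified by the patent.

```python
import numpy as np

def voxelize_centroids(points, voxel_size):
    """Voxelize an (N, 3) point cloud with cubic voxels of edge length voxel_size.

    Each point is mapped to a voxel index, and the centroid of the points
    falling in each occupied voxel is returned as the voxelized point cloud.
    """
    origin = points.min(axis=0)
    idx = np.floor((points - origin) / voxel_size).astype(np.int64)  # voxel coordinates
    _, inverse = np.unique(idx, axis=0, return_inverse=True)         # group points by voxel
    inverse = inverse.ravel()
    num_voxels = inverse.max() + 1
    sums = np.zeros((num_voxels, 3))
    counts = np.zeros(num_voxels)
    np.add.at(sums, inverse, points)
    np.add.at(counts, inverse, 1)
    return sums / counts[:, None]                                    # centroid per occupied voxel

voxelized = voxelize_centroids(np.random.rand(10000, 3), voxel_size=0.05)
```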
S202: and projecting the voxelized three-dimensional point cloud on three preset coordinate planes to obtain at least three projection views corresponding to the projected three-dimensional point cloud.
In step S202, at least one projection view is obtained by projection onto each coordinate plane when the voxelized three-dimensional point cloud is projected onto the three preset coordinate planes. The projection views of the front view and the rear view are equal, the projection views of the left view and the right view are equal, and the projection views of the bottom view and the top view are equal.
In this embodiment, three projection views are obtained. During projection, the voxelized three-dimensional point cloud is projected onto a two-dimensional plane of a set size, and the coordinate value of the voxelized point falling into a given two-dimensional grid cell is taken as the projection value of that cell, giving the corresponding projection view.
Optionally, projecting the voxelized three-dimensional point cloud on each coordinate plane to obtain at least three projection views corresponding to the projected three-dimensional point cloud, including:
projecting the voxelized three-dimensional point cloud on an xoy plane, acquiring a first coordinate value in the direction perpendicular to the xoy plane, and taking the first coordinate value as a projection value of a projection view of the three-dimensional point cloud on the first coordinate plane to acquire a first projection view;
Projecting the voxelized three-dimensional point cloud on a xoz plane, acquiring a second coordinate value in the direction perpendicular to the xoz plane, and taking the second coordinate value as a projection value of a projection view of the three-dimensional point cloud on the second coordinate plane to acquire a second projection view;
and projecting the voxelized three-dimensional point cloud onto the yoz plane, acquiring a third coordinate value in the direction perpendicular to the yoz plane, and taking the third coordinate value as the projection value of the projection view of the three-dimensional point cloud on the third coordinate plane, to obtain a third projection view.
In this embodiment, the projection view obtained on the xoy coordinate plane takes the z value of the voxelized three-dimensional point cloud coordinates as its projection value; the projection view obtained on the xoz coordinate plane takes the y value; and the projection view obtained on the yoz coordinate plane takes the x value, as sketched below.
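A sketch of this three-plane projection is shown below, assuming NumPy. The grid size, the normalization and the choice of keeping the largest perpendicular coordinate when several points fall into the same cell are assumptions made for illustration; the patent does not specify how such collisions are resolved.

```python
import numpy as np

def project_views(voxel_points, grid_size=64):
    """Project a voxelized (N, 3) point cloud onto the xoy, xoz and yoz planes.

    For each plane, the coordinate perpendicular to that plane (z, y or x) is
    written into the 2D grid cell the point falls into, giving three projection
    views of shape (grid_size, grid_size).
    """
    p = voxel_points - voxel_points.min(axis=0)
    p = p / (p.max() + 1e-9)                                  # normalize to [0, 1]
    cells = np.minimum((p * grid_size).astype(int), grid_size - 1)

    views = {}
    for name, (u, v, w) in {"xoy": (0, 1, 2), "xoz": (0, 2, 1), "yoz": (1, 2, 0)}.items():
        img = np.zeros((grid_size, grid_size), dtype=np.float32)
        # keep the largest perpendicular coordinate when several points share a cell
        np.maximum.at(img, (cells[:, u], cells[:, v]), p[:, w])
        views[name] = img
    return views
```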
S203: and performing a t-order layer-aggregation transformation on each projection view to obtain the fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and coarse approximation matrix information of the t-th layer in each view.
In step S203, a t-order layer-aggregation transformation is performed on each projection view to obtain the fine information and coarse information of the last transformation layer.
In this embodiment, a 3-order layer-aggregation transformation is performed on each projection view. The t-order layer-aggregation transformation decomposes the projection view matrix information: each transformation level first performs a one-dimensional decomposition of the image in the row direction with a low-pass decomposition filter and a high-pass decomposition filter, and then decomposes the result of the row-direction decomposition in the column direction. The 1st-order layer-aggregation transformation is defined as follows:
define an input matrix A of size M×N with elements a_{i,j}. After the layer-aggregation transformation, fine horizontal matrix information FL of size M/2×N/2, fine vertical matrix information FV of size M/2×N/2, fine diagonal matrix information FD of size M/2×N/2 and coarse approximation matrix information CS of size M/2×N/2 are obtained, whose elements are fl_{i,j}, fv_{i,j}, fd_{i,j} and cs_{i,j}, respectively.
Here fl_{i,j} is an element of the fine horizontal matrix information, fv_{i,j} is an element of the fine vertical matrix information, fd_{i,j} is an element of the fine diagonal matrix information, and cs_{i,j} is an element of the coarse approximation matrix information. The 2nd-order transformation operates on the coarse approximation matrix information obtained from the 1st-order transformation, taking its elements as the input matrix; in general, the t-th-order transformation operates on the coarse approximation matrix information produced by the (t-1)-th-order transformation.
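The element-wise filter formulas are not reproduced in the text above, so the sketch below uses a standard Haar pair (average as low-pass, difference as high-pass) purely to illustrate the row-then-column decomposition into four half-resolution subbands and the recursion on the coarse approximation; it is not asserted to be the patent's exact filter, and the naming of the detail subbands follows one common convention.

```python
import numpy as np

def layer_aggregation_level(a):
    """One decomposition level of an M x N matrix into four M/2 x N/2 subbands:
    fine horizontal (FL), fine vertical (FV), fine diagonal (FD) and coarse
    approximation (CS). Haar-style filters are used for illustration only."""
    a = a[: a.shape[0] // 2 * 2, : a.shape[1] // 2 * 2]   # crop to even size
    # Row-direction decomposition with low-pass (average) and high-pass (difference) filters.
    lo_r = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hi_r = (a[:, 0::2] - a[:, 1::2]) / 2.0
    # Column-direction decomposition of the row results.
    cs = (lo_r[0::2] + lo_r[1::2]) / 2.0   # coarse approximation
    fv = (lo_r[0::2] - lo_r[1::2]) / 2.0   # fine vertical detail
    fl = (hi_r[0::2] + hi_r[1::2]) / 2.0   # fine horizontal detail
    fd = (hi_r[0::2] - hi_r[1::2]) / 2.0   # fine diagonal detail
    return fl, fv, fd, cs

def t_order_transform(view, t=3):
    """Apply the transformation t times, each time on the coarse approximation."""
    cs = view
    for _ in range(t):
        fl, fv, fd, cs = layer_aggregation_level(cs)
    return fl, fv, fd, cs   # t-th layer subbands

fl, fv, fd, cs = t_order_transform(np.random.rand(64, 64), t=3)
```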
S204: and respectively inputting the view expression vectors formed by splicing the fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and coarse approximation matrix information of the t-th layer in each projection view into a preset point cloud classification model, and outputting the point cloud classification result.
In step S204, the view expression vector obtained by splicing the fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and coarse approximation matrix information of the t-th layer in each projection view is taken as an input vector, the vectors are respectively input into the preset point cloud classification model, and the point cloud classification result is output; the number of input ends of the preset point cloud classification model is equal to the number of projection views.
In this embodiment, the preset point cloud classification model has at least three input ends. Each input end receives the view expression vector formed by splicing the matrix information of one projection view, and after feature extraction inside the model, the category information of the point cloud is output.
Referring to fig. 3, a schematic diagram of a preset point cloud classification model according to an embodiment of the present invention is shown.
Optionally, the preset point cloud classification model includes: a first perception network, a second perception network, a third perception network, a fusion module and a prediction module;
the first perception network is used for receiving the view expression vector obtained by splicing the view matrix information of the projection on the xoy plane, and outputting a first-view self-attention feature vector;
the second perception network is used for receiving the view expression vector obtained by splicing the view matrix information of the projection on the xoz plane, and outputting a second-view self-attention feature vector;
the third perception network is used for receiving the view expression vector obtained by splicing the view matrix information of the projection on the yoz plane, and outputting a third-view self-attention feature vector;
the fusion module is used for fusing the first view self-attention feature vector, the second view self-attention feature vector and the third view self-attention feature vector to obtain a fused view self-attention feature vector;
the prediction module is used for performing prediction processing on the self-attention feature vector of the fusion view and outputting a prediction result.
In this embodiment, the fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and coarse approximation matrix information of the t-th layer are spliced to obtain spliced matrix information, which contains the view features of each projection view. The spliced matrix information of a view is input to the input end of one perception network of the preset point cloud classification model, and that perception network outputs the corresponding self-attention feature vector, which is two-dimensional in shape. The self-attention feature vectors output by the perception networks are fused by the fusion module into a fused-view self-attention feature vector with a three-dimensional shape, and this fused-view self-attention feature vector is input to the prediction module, which outputs the prediction result. A structural sketch of this pipeline is given below.
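The sketch below wires the three view branches, the fusion module and the prediction module together, assuming PyTorch; the class and argument names are illustrative, and the sub-modules it references are sketched in the following subsections.

```python
import torch.nn as nn

class PointCloudClassifier(nn.Module):
    """Three view branches -> fusion module -> prediction module (structural sketch)."""

    def __init__(self, branches, fusion_module, prediction_module):
        super().__init__()
        self.branches = nn.ModuleList(branches)   # one perception network per view: xoy, xoz, yoz
        self.fusion = fusion_module
        self.prediction = prediction_module

    def forward(self, views):
        # Each element of `views` is the spliced view expression vector of one projection view.
        feats = [branch(v) for branch, v in zip(self.branches, views)]  # per-view self-attention features
        fused = self.fusion(*feats)               # fused-view self-attention feature vector
        return self.prediction(fused)             # classification vector of shape [B, Classnum]
```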
The fusion module comprises a connection layer, a convolution layer, a normalization layer, an activation layer, a residual layer, a summation layer and a maximum-value pooling layer.
Referring to fig. 4, a schematic diagram of a fusion module according to an embodiment of the invention is provided.
In this embodiment, the connection layer concatenates the self-attention feature vectors output by the perception networks. The concatenated features of shape [3, W/4, H/4] pass through a convolution layer and a ReLU activation layer to obtain data of shape [128, W/4, H/4]; this passes through a convolution layer, a normalization layer and a ReLU activation layer to obtain data of shape [256, W/8, H/8]; the [256, W/8, H/8] data is input to the maximum-value pooling layer, which keeps the shape [256, W/8, H/8], and the pooled output and the input are combined with a residual Add operation, again giving data of shape [256, W/8, H/8]. This then passes through a convolution layer, a normalization layer and a ReLU activation to obtain data of shape [512, W/16, H/16]; the [512, W/16, H/16] data is input to the maximum-value pooling layer, which keeps the shape [512, W/16, H/16], and is combined with the pooled output through a residual Add; finally, a further convolution layer, normalization layer and ReLU activation layer produce data of shape [1024, W/32, H/32].
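A sketch of a fusion module following this shape progression is given below, assuming PyTorch and that each branch outputs a single-channel [B, 1, W/4, H/4] map so that concatenation gives [B, 3, W/4, H/4]. Kernel sizes, strides, padding and the use of BatchNorm for the normalization layer are assumptions chosen only so that the tensor shapes match the description.

```python
import torch
import torch.nn as nn

class FusionModule(nn.Module):
    """Concatenation -> conv/ReLU -> three conv blocks with shape-preserving max-pool residuals."""

    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, 128, 3, stride=1, padding=1), nn.ReLU())
        self.block1 = nn.Sequential(nn.Conv2d(128, 256, 3, stride=2, padding=1),
                                    nn.BatchNorm2d(256), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(256, 512, 3, stride=2, padding=1),
                                    nn.BatchNorm2d(512), nn.ReLU())
        self.block3 = nn.Sequential(nn.Conv2d(512, 1024, 3, stride=2, padding=1),
                                    nn.BatchNorm2d(1024), nn.ReLU())
        self.pool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)  # shape-preserving pooling

    def forward(self, f1, f2, f3):
        x = torch.cat([f1, f2, f3], dim=1)   # [B, 3, W/4, H/4]
        x = self.stem(x)                     # [B, 128, W/4, H/4]
        x = self.block1(x)                   # [B, 256, W/8, H/8]
        x = x + self.pool(x)                 # residual Add with the max-pooled output
        x = self.block2(x)                   # [B, 512, W/16, H/16]
        x = x + self.pool(x)
        return self.block3(x)                # [B, 1024, W/32, H/32]
```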
The prediction module comprises a mean value pooling layer, a size recombination layer and a full connection layer.
Referring to fig. 5, a schematic diagram of a prediction module according to an embodiment of the invention is provided.
The data of shape [1024, W/32, H/32] output by the fusion module is input to the mean-value pooling layer to obtain data of shape [1024, 1, 1]; this passes through the size-recombination layer to obtain a one-dimensional vector of shape [1, 1024], and the fully connected layer then produces a classification vector of shape [1, Classnum], where Classnum is the total number of categories.
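A minimal PyTorch sketch of this prediction head follows; the adaptive average pooling used to realize the mean-value pooling layer is an illustrative choice.

```python
import torch.nn as nn

class PredictionModule(nn.Module):
    """Mean-value pooling -> size recombination -> fully connected classification layer."""

    def __init__(self, num_classes):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # [B, 1024, W/32, H/32] -> [B, 1024, 1, 1]
        self.fc = nn.Linear(1024, num_classes)   # -> [B, Classnum]

    def forward(self, x):
        x = self.pool(x)
        x = x.view(x.size(0), -1)                # size recombination to [B, 1024]
        return self.fc(x)
```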
Optionally, the first perception network, the second perception network and the third perception network have the same structure, each comprising a convolution layer, a first shape-reorganization layer, a perceptron and a second shape-reorganization layer;
the convolution layer is used for receiving the view expression vector formed by splicing the view matrix information, and outputting a convolved view expression vector;
the first shape-reorganization layer is used for receiving the convolved view expression vector, performing dimension-reduction processing on it, and outputting a dimension-reduced view expression vector of a first preset dimension;
the perceptron is used for receiving the dimension-reduced view expression vector of the first preset dimension and outputting a perceived view expression vector;
the second shape-reorganization layer is used for receiving the perceived view expression vector, expanding its dimensions, and outputting a self-attention feature vector of a preset dimension.
Referring to fig. 6, a schematic structural diagram of a perception network according to an embodiment of the present invention is provided.
In this embodiment, the view expression vector of each projection view input at the perception network's input end is convolved, changing the number of channels and giving the convolved view expression vector. The convolved view expression vector is reduced from three dimensions to two dimensions by the first shape-reorganization layer, giving the dimension-reduced view expression vector of the first preset dimension so that it can be fed to the perceptron. The perceptron applies self-attention feature processing to this vector and outputs the perceived view expression vector, which is two-dimensional in shape. The perceived view expression vector is then input to the second shape-reorganization layer, which expands its dimensions into a three-dimensional shape and outputs the self-attention feature vector of the preset dimension.
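The following sketch shows one way to realize this four-stage branch, assuming PyTorch. The single output channel of the convolution and the use of plain reshapes for the two shape-reorganization layers are assumptions; the patent fixes only the structure convolution -> reshape -> perceptron -> reshape.

```python
import torch.nn as nn

class PerceptionNetwork(nn.Module):
    """Convolution -> first shape reorganization (3D -> 2D) -> perceptron ->
    second shape reorganization (2D -> 3D)."""

    def __init__(self, in_channels, perceptron):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 1, kernel_size=3, padding=1)  # change the channel number
        self.perceptron = perceptron   # self-attention + activation + fully connected (see below);
                                       # its dimension must equal H * W of the convolved view

    def forward(self, x):              # x: spliced view expression vector, shape [B, C, H, W]
        x = self.conv(x)               # convolved view expression vector, shape [B, 1, H, W]
        b, _, h, w = x.shape
        x = x.reshape(b, h * w)        # first shape reorganization: reduce to 2D
        x = self.perceptron(x)         # perceived view expression vector, same size as its input
        return x.reshape(b, 1, h, w)   # second shape reorganization: expand back to 3D
```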
Optionally, the perceptron includes a self-attention module, a hidden-layer activation module and a fully connected layer, wherein
the self-attention module is used for receiving the dimension-reduced view expression vector of the first preset dimension;
the hidden-layer activation module is used for applying an activation function to the output of the self-attention module to perform a nonlinear mapping;
the fully connected layer is used for linearly transforming the result of the hidden-layer activation module and outputting a perceived view expression vector of the same size as the dimension-reduced view expression vector of the first preset dimension.
Referring to fig. 7, a schematic structural diagram of a perceptron according to an embodiment of the invention is shown.
In this embodiment, the perceptron includes a self-attention module, a hidden-layer activation module and a fully connected layer. The self-attention layer models the intrinsic relations of the features, and the hidden-layer activation performs a nonlinear mapping. The nonlinearly mapped result is input to a fully connected feed-forward layer for a linear transformation, and the output is a perceived view expression vector of the same size as the dimension-reduced view expression vector of the first preset dimension.
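A sketch of such a perceptron is given below, assuming PyTorch; the single-head attention and the GELU activation are illustrative choices, since the text specifies only the self-attention / activation / fully-connected structure and that the output size matches the input.

```python
import torch.nn as nn

class Perceptron(nn.Module):
    """Self-attention -> hidden-layer activation -> fully connected layer.

    dim is the first preset dimension of the reduced view expression vector.
    """

    def __init__(self, dim, hidden_dim=None):
        super().__init__()
        hidden_dim = hidden_dim or dim
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=1, batch_first=True)
        self.hidden = nn.Linear(dim, hidden_dim)
        self.act = nn.GELU()                      # nonlinear mapping of the attention output
        self.fc = nn.Linear(hidden_dim, dim)      # output has the same size as the input

    def forward(self, x):                         # x: [B, dim]
        q = x.unsqueeze(1)                        # treat each sample as a length-1 sequence
        attended, _ = self.attn(q, q, q)          # model the intrinsic relations of the features
        attended = attended.squeeze(1)
        return self.fc(self.act(self.hidden(attended)))
```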
In this embodiment, a pre-acquired three-dimensional point cloud is voxelized to obtain a voxelized three-dimensional point cloud; the voxelized three-dimensional point cloud is projected onto three preset coordinate planes to obtain at least three projection views of the projected three-dimensional point cloud, wherein at least one projection view is obtained by projection onto each coordinate plane; a t-order layer-aggregation transformation is performed on each projection view to obtain the fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and coarse approximation matrix information of the t-th layer in each view, wherein t is an integer greater than 1; and the view expression vectors formed by splicing this matrix information of the t-th layer in each projection view are respectively input into a preset point cloud classification model, and the point cloud classification result is output. Projecting the voxelized point cloud onto the three preset coordinate planes reduces the amount of data while retaining the three-dimensional geometric features of the point cloud; the t-order layer-aggregation transformation of each projection view extracts the distinguishing features of that view; and classifying with a point cloud classification model whose inputs contain these distinguishing features ensures accuracy while improving the efficiency of point cloud classification.
Referring to fig. 8, a flow chart of a point cloud classification method according to an embodiment of the present invention, as shown in fig. 8, the point cloud classification method may include the following steps:
S801: voxelizing a pre-acquired three-dimensional point cloud to obtain a voxelized three-dimensional point cloud;
S802: projecting the voxelized three-dimensional point cloud onto three preset coordinate planes to obtain at least three projection views of the projected three-dimensional point cloud, wherein at least one projection view is obtained by projection onto each coordinate plane;
S803: performing a t-order layer-aggregation transformation on each projection view to obtain fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and coarse approximation matrix information of the t-th layer in each view, wherein t is an integer greater than 1;
the contents of the steps S801 to S803 are the same as those of the steps S201 to S203, and reference may be made to the descriptions of the steps S201 to S203, which are not repeated herein.
S804: acquiring a point cloud label data vector of a sample three-dimensional point cloud;
S805: and training a pre-constructed point cloud classification model with the point cloud label data vector as prior data during training, to obtain a trained point cloud classification model, which is used as the preset point cloud classification model.
In this embodiment, when training the pre-constructed point cloud classification model, labeled sample data corresponding to the output of the network is first obtained, namely the point cloud label data vector of the sample three-dimensional point cloud, which may have the shape [1, Classnum]. For example, if the total number of categories Classnum is 100 and a certain object belongs to category 3, its class index is 2; the label data is then a one-dimensional vector of shape [1, Classnum] whose value at position [1, 2] is 1 and whose values at the remaining positions are 0. The label data is input as prior data to train the pre-constructed point cloud classification model: the loss value of each training iteration is computed from the difference between the model output and the label data, and the network is updated by back-propagating the gradient of this loss, yielding a trained point cloud classification model that is used as the preset point cloud classification model.
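A sketch of one such training iteration is shown below, assuming PyTorch and the PointCloudClassifier sketch above; the use of cross-entropy as the loss and the function names are illustrative, since the text describes only a loss computed from the difference between the output and the label followed by gradient back-propagation.

```python
import torch
import torch.nn as nn

def train_step(model, optimizer, views, class_index, num_classes):
    """One training step using the one-hot point cloud label vector as prior data."""
    label = torch.zeros(1, num_classes)        # label vector of shape [1, Classnum]
    label[0, class_index] = 1.0                # 1 at the class index (e.g. category 3 -> index 2), 0 elsewhere

    logits = model(views)                      # output of the point cloud classification model, [1, Classnum]
    loss = nn.functional.cross_entropy(logits, label)  # loss from the difference between output and label

    optimizer.zero_grad()
    loss.backward()                            # gradient back-propagation
    optimizer.step()                           # update the network using the loss
    return loss.item()
```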
S806: and respectively inputting the view expression vectors formed by splicing the fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and coarse approximation matrix information of the t-th layer in each projection view into the preset point cloud classification model, and outputting the point cloud classification result.
The content of step S806 is the same as that of step S204; reference is made to the description of step S204, which is not repeated here.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a point cloud classification device according to an embodiment of the invention. The device in this embodiment includes units for performing the steps in the embodiments corresponding to fig. 2 to fig. 8; please refer to fig. 2 to fig. 8 and the related descriptions in the corresponding embodiments. For convenience of explanation, only the portions related to the present embodiment are shown. Referring to fig. 9, the classification device 90 includes: a voxelization module 91, a projection module 92, a t-order layer-aggregation transformation module 93 and a point cloud classification result determining module 94.
The voxelization module 91 is configured to voxelize a three-dimensional point cloud acquired in advance to obtain a voxelized three-dimensional point cloud;
the projection module 92 is configured to project the voxelized three-dimensional point cloud on three preset coordinate planes to obtain at least three projection views corresponding to the projected three-dimensional point cloud, where at least one projection view is projected on each coordinate plane;
a t-order layer-aggregation transformation module 93, configured to perform a t-order layer-aggregation transformation on each projection view to obtain fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and coarse approximation matrix information of the t-th layer in each view, where t is an integer greater than 1;
and a point cloud classification result determining module 94, configured to respectively input the view expression vectors formed by splicing the fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and coarse approximation matrix information of the t-th layer in each projection view into a preset point cloud classification model, and to output the point cloud classification result.
Optionally, the voxelization module 91 includes:
the dividing unit is used for acquiring corresponding three-dimensional point cloud information in each voxel grid according to the three-dimensional point cloud acquired in advance and the dividing specification of the preset voxel grid;
the voxel three-dimensional point cloud acquisition unit is used for calculating the centroid coordinates in each voxel grid according to the corresponding three-dimensional point cloud information in each voxel grid, and taking the point cloud corresponding to the obtained centroid coordinates as the voxel three-dimensional point cloud.
Optionally, the projection module 92 includes:
the first projection view acquisition unit is used for projecting the voxelized three-dimensional point cloud on the xoy plane, acquiring a first coordinate value in the direction perpendicular to the xoy plane, and taking the first coordinate value as a projection value of a projection view of the three-dimensional point cloud on the first coordinate plane to acquire a first projection view;
the second projection view acquisition unit is used for projecting the voxelized three-dimensional point cloud on a xoz plane, acquiring a second coordinate value in the direction perpendicular to the xoz plane, and taking the second coordinate value as a projection value of a projection view of the three-dimensional point cloud on the second coordinate plane to acquire a second projection view;
And the third projection view acquisition unit is used for projecting the voxelized three-dimensional point cloud on a yoz plane, acquiring a third coordinate value in the direction perpendicular to the yoz plane, and taking the third coordinate value as a projection value of the three-dimensional point cloud in the projection view on the third coordinate plane to obtain a third projection view.
Optionally, the preset point cloud classification model includes: a first perception network, a second perception network, a third perception network, a fusion module and a prediction module;
the first perception network is used for receiving the view expression vector obtained by splicing the view matrix information of the projection on the xoy plane, and outputting a first-view self-attention feature vector;
the second perception network is used for receiving the view expression vector obtained by splicing the view matrix information of the projection on the xoz plane, and outputting a second-view self-attention feature vector;
the third perception network is used for receiving the view expression vector obtained by splicing the view matrix information of the projection on the yoz plane, and outputting a third-view self-attention feature vector;
the fusion module is used for fusing the first view self-attention feature vector, the second view self-attention feature vector and the third view self-attention feature vector to obtain a fused view self-attention feature vector;
The prediction module is used for performing prediction processing on the self-attention feature vector of the fusion view and outputting a prediction result.
Optionally, the first perception network, the second perception network and the third perception network have the same structure, each comprising a convolution layer, a first shape-reorganization layer, a perceptron and a second shape-reorganization layer;
the convolution layer is used for receiving the view expression vector formed by splicing the view matrix information, and outputting a convolved view expression vector;
the first shape-reorganization layer is used for receiving the convolved view expression vector, performing dimension-reduction processing on it, and outputting a dimension-reduced view expression vector of a first preset dimension;
the perceptron is used for receiving the dimension-reduced view expression vector of the first preset dimension and outputting a perceived view expression vector;
the second shape-reorganization layer is used for receiving the perceived view expression vector, expanding its dimensions, and outputting a self-attention feature vector of a preset dimension.
Optionally, the perceptron comprises a self-attention module, a hidden-layer activation module and a fully connected layer, wherein
the self-attention module is used for inputting view expression vectors of a first preset dimension after dimension reduction;
The hidden layer activation module is used for carrying out nonlinear mapping on the output in the self-attention module by using an activation function;
the full connection layer is used for carrying out linear transformation on the result of the hidden layer activation module and outputting a perceived view expression vector with the same size as the view expression vector of the first preset dimension after the dimension reduction.
Optionally, the classification device 90 further includes:
the label data acquisition module is used for acquiring a point cloud label data vector in the sample three-dimensional point cloud;
the training module is used for training the pre-constructed point cloud classification model with the point cloud label data vector as prior data during training, obtaining a trained point cloud classification model and using the trained point cloud classification model as the preset point cloud classification model.
It should be noted that, because the content of information interaction and execution process between the above units is based on the same concept as the method embodiment of the present invention, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein.
Fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present invention. As shown in fig. 10, the computer device of this embodiment includes: at least one processor (only one shown in fig. 10), a memory, and a computer program stored in the memory and executable on the at least one processor, the processor executing the computer program to perform the steps of any of the various point cloud classification method embodiments described above.
The computer device may include, but is not limited to, a processor, a memory. It will be appreciated by those skilled in the art that fig. 10 is merely an example of a computer device and is not intended to be limiting, and that a computer device may include more or fewer components than shown, or may combine certain components, or different components, such as may also include a network interface, a display screen, an input device, and the like.
The processor may be a CPU, or may be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory includes a readable storage medium, an internal memory, etc., where the internal memory may be the memory of the computer device, the internal memory providing an environment for the execution of an operating system and computer-readable instructions in the readable storage medium. The readable storage medium may be a hard disk of a computer device, and in other embodiments may be an external storage device of the computer device, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. that are provided on the computer device. Further, the memory may also include both internal storage units and external storage devices of the computer device. The memory is used to store an operating system, application programs, boot loader (BootLoader), data, and other programs such as program codes of computer programs, and the like. The memory may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present invention. The specific working process of the units and modules in the above device may refer to the corresponding process in the foregoing method embodiment, which is not described herein again. The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the present invention may implement all or part of the flow of the method of the above-described embodiment, and may be implemented by a computer program to instruct related hardware, and the computer program may be stored in a computer readable storage medium, where the computer program, when executed by a processor, may implement the steps of the method embodiment described above. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, executable files or in some intermediate form, etc. The computer readable medium may include at least: any entity or device capable of carrying computer program code, a recording medium, a computer Memory, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium. Such as a U-disk, removable hard disk, magnetic or optical disk, etc. In some jurisdictions, computer readable media may not be electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
The present invention may also be implemented as a computer program product which, when run on a computer device, causes the computer device to execute all or part of the steps of the method embodiments described above.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts that are not described or illustrated in detail in a particular embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or as a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus/computer device and method may be implemented in other manners. For example, the apparatus/computer device embodiments described above are merely illustrative: the division into modules or units is only a logical functional division, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections via interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included within the protection scope of the present invention.
Claims (9)
1. A point cloud classification method, characterized by comprising the following steps:
voxelizing a pre-acquired three-dimensional point cloud to obtain a voxelized three-dimensional point cloud;
projecting the voxelized three-dimensional point cloud onto three preset coordinate planes to obtain at least three projection views corresponding to the projected three-dimensional point cloud, wherein at least one projection view is obtained on each coordinate plane, and the preset coordinate planes comprise an xoy plane, an xoz plane and a yoz plane;
performing a t-order layer aggregation transformation on each projection view to obtain fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and rough simulation matrix information of a t-th layer in each view, wherein t is an integer greater than 1;
splicing the fine horizontal matrix information, the fine vertical matrix information, the fine diagonal matrix information and the rough simulation matrix information of the t-th layer in each projection view into view expression vectors, and respectively inputting the view expression vectors into a preset point cloud classification model to output a point cloud classification result;
wherein the preset point cloud classification model comprises: a first perception network, a second perception network, a third perception network, a fusion module and a prediction module;
the first perception network is used for receiving the view expression vector spliced from the view matrix information obtained by projection onto the xoy plane, and outputting a first view self-attention feature vector;
the second perception network is used for receiving the view expression vector spliced from the view matrix information obtained by projection onto the xoz plane, and outputting a second view self-attention feature vector;
the third perception network is used for receiving the view expression vector spliced from the view matrix information obtained by projection onto the yoz plane, and outputting a third view self-attention feature vector;
the fusion module is used for fusing the first view self-attention feature vector, the second view self-attention feature vector and the third view self-attention feature vector to obtain a fused view self-attention feature vector;
the prediction module is used for performing prediction processing on the fused view self-attention feature vector and outputting a prediction result.
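The t-order layer aggregation transformation in claim 1 yields, for each view, horizontal, vertical and diagonal detail matrices plus a rough simulation matrix of the t-th layer, which mirrors the sub-band layout of a t-level two-dimensional wavelet decomposition. The sketch below illustrates that reading only; the use of PyWavelets, the `haar` wavelet and the flatten-and-concatenate splicing are illustrative assumptions, not details taken from the claim.

```python
import numpy as np
import pywt  # PyWavelets

def view_expression_vector(view: np.ndarray, t: int = 2) -> np.ndarray:
    """Build one view expression vector from a single projection view.

    A t-level 2-D wavelet decomposition stands in for the claimed t-order
    layer aggregation transformation (assumption): the level-t horizontal
    (cH), vertical (cV) and diagonal (cD) detail matrices play the role of
    the fine matrix information, and the level-t approximation (cA) plays
    the role of the rough simulation matrix information.
    """
    coeffs = pywt.wavedec2(view, wavelet="haar", level=t)
    cA_t = coeffs[0]              # rough simulation matrix of the t-th layer
    cH_t, cV_t, cD_t = coeffs[1]  # fine horizontal / vertical / diagonal matrices
    # "Splicing": flatten the four matrices and concatenate them.
    return np.concatenate([m.ravel() for m in (cH_t, cV_t, cD_t, cA_t)])

# Usage: one expression vector per 64x64 projection view, with t = 2.
vec = view_expression_vector(np.random.rand(64, 64), t=2)
```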
2. The point cloud classification method according to claim 1, wherein voxelizing the pre-acquired three-dimensional point cloud to obtain the voxelized three-dimensional point cloud comprises:
obtaining the three-dimensional point cloud information corresponding to each voxel grid according to the pre-acquired three-dimensional point cloud and a preset dividing specification of the voxel grids;
calculating the centroid coordinates in each voxel grid according to the three-dimensional point cloud information corresponding to that voxel grid, and taking the points corresponding to the obtained centroid coordinates as the voxelized three-dimensional point cloud.
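Claim 2 reduces each occupied voxel grid to the centroid of the points that fall inside it. A minimal NumPy sketch of that reduction follows; the cubic grid described by a single edge length `voxel_size` is an assumed dividing specification.

```python
import numpy as np

def voxelize_by_centroid(points: np.ndarray, voxel_size: float = 0.05) -> np.ndarray:
    """Replace the points inside each occupied voxel grid by their centroid.

    points is an (N, 3) array of x, y, z coordinates; voxel_size is the edge
    length of a cubic voxel grid (an assumed dividing specification).
    """
    # Integer voxel index of every point.
    idx = np.floor(points / voxel_size).astype(np.int64)
    # Group the points that share a voxel index.
    _, inverse, counts = np.unique(idx, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.ravel()
    # Sum the coordinates per voxel, then divide by the point count: the centroid.
    centroids = np.zeros((counts.size, 3))
    np.add.at(centroids, inverse, points)
    centroids /= counts[:, None]
    return centroids  # one representative point per occupied voxel
```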
3. The point cloud classification method according to claim 1, wherein projecting the voxelized three-dimensional point cloud onto each coordinate plane to obtain at least three projection views corresponding to the projected three-dimensional point cloud comprises:
projecting the voxelized three-dimensional point cloud onto the xoy plane, acquiring a first coordinate value in the direction perpendicular to the xoy plane, and taking the first coordinate value as the projection value of the projection view of the three-dimensional point cloud on this first coordinate plane to obtain a first projection view;
projecting the voxelized three-dimensional point cloud onto the xoz plane, acquiring a second coordinate value in the direction perpendicular to the xoz plane, and taking the second coordinate value as the projection value of the projection view of the three-dimensional point cloud on this second coordinate plane to obtain a second projection view;
projecting the voxelized three-dimensional point cloud onto the yoz plane, acquiring a third coordinate value in the direction perpendicular to the yoz plane, and taking the third coordinate value as the projection value of the projection view of the three-dimensional point cloud on this third coordinate plane to obtain a third projection view.
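Claim 3 keeps, for each plane, the coordinate perpendicular to that plane as the projection value. The sketch below follows that description; the grid resolution and the rule that the largest perpendicular value wins when several points share a pixel are assumptions, since the claim does not fix them.

```python
import numpy as np

def project_to_planes(voxel_points: np.ndarray, resolution: int = 64) -> dict:
    """Project the voxelized point cloud onto the xoy, xoz and yoz planes.

    The coordinate perpendicular to each plane is kept as the projection
    value, giving one depth-like projection view per plane. The resolution
    and the max-value aggregation for overlapping points are assumptions.
    """
    # Normalise coordinates into [0, 1], then into pixel indices.
    mins, maxs = voxel_points.min(axis=0), voxel_points.max(axis=0)
    scaled = (voxel_points - mins) / np.maximum(maxs - mins, 1e-9)
    pix = np.minimum((scaled * resolution).astype(int), resolution - 1)

    planes = {"xoy": (0, 1, 2), "xoz": (0, 2, 1), "yoz": (1, 2, 0)}
    views = {}
    for name, (u, v, w) in planes.items():
        img = np.zeros((resolution, resolution))
        # Keep the perpendicular coordinate w as the pixel value; where
        # several points land on the same pixel, the largest value wins.
        np.maximum.at(img, (pix[:, u], pix[:, v]), scaled[:, w])
        views[name] = img
    return views  # first, second and third projection views
```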
4. The point cloud classification method according to claim 1, wherein the first perception network, the second perception network and the third perception network are identical in structure, each comprising a convolution layer, a first shape reorganization layer, a perceptron and a second shape reorganization layer;
the convolution layer is used for receiving the view expression vector spliced from the view matrix information and outputting a convolved view expression vector;
the first shape reorganization layer is used for receiving the convolved view expression vector, reducing its dimension, and outputting a dimension-reduced view expression vector of a first preset dimension;
the perceptron is used for receiving the dimension-reduced view expression vector of the first preset dimension and outputting a perceived view expression vector;
the second shape reorganization layer is used for receiving the perceived view expression vector, expanding its dimension, and outputting a self-attention feature vector of a preset dimension.
5. The point cloud classification method according to claim 4, wherein the perceptron comprises a self-attention module, a hidden-layer activation module and a fully connected layer, wherein
the self-attention module is used for receiving the dimension-reduced view expression vector of the first preset dimension;
the hidden-layer activation module is used for performing a nonlinear mapping on the output of the self-attention module by using an activation function;
the fully connected layer is used for performing a linear transformation on the result of the hidden-layer activation module and outputting a perceived view expression vector of the same size as the dimension-reduced view expression vector of the first preset dimension.
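Claims 4 and 5 describe each perception network as a convolution layer, a first shape reorganization, a perceptron built from self-attention, a hidden-layer activation and a fully connected layer, and a second shape reorganization. The PyTorch sketch below is one possible arrangement of those pieces; every concrete size, the GELU activation and the use of a one-dimensional convolution over the spliced view expression vector are assumptions.

```python
import torch
from torch import nn

class Perceptron(nn.Module):
    """Perceptron of claim 5: self-attention module, hidden-layer activation,
    and a fully connected layer whose output matches the input size."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.act = nn.GELU()            # the activation function is an assumption
        self.fc = nn.Linear(dim, dim)   # linear transformation, same output size

    def forward(self, x):               # x: (B, L, dim)
        y, _ = self.attn(x, x, x)       # self-attention over the view tokens
        return self.fc(self.act(y))

class PerceptionNetwork(nn.Module):
    """One perception network of claim 4: convolution, first shape
    reorganization, perceptron, second shape reorganization."""
    def __init__(self, vec_len: int = 1024, dim: int = 128, out_dim: int = 256):
        super().__init__()
        self.conv = nn.Conv1d(1, dim, kernel_size=7, stride=4, padding=3)
        self.tokens = vec_len // 4
        self.perceptron = Perceptron(dim)
        self.expand = nn.Linear(self.tokens * dim, out_dim)  # second shape reorganization

    def forward(self, v):                 # v: (B, vec_len) spliced view expression vector
        x = self.conv(v.unsqueeze(1))     # convolved view expression: (B, dim, vec_len // 4)
        x = x.transpose(1, 2)             # first shape reorganization: (B, tokens, dim)
        x = self.perceptron(x)            # perceived view expression, same shape
        return self.expand(x.flatten(1))  # expanded: (B, out_dim) view self-attention feature

# Usage: one such network per projection plane (xoy, xoz, yoz).
net = PerceptionNetwork()
feature = net(torch.randn(2, 1024))       # -> shape (2, 256)
```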
6. The point cloud classification method according to claim 1, wherein before the step of respectively inputting the view expression vectors, spliced from the fine horizontal matrix information, the fine vertical matrix information, the fine diagonal matrix information and the rough simulation matrix information of the t-th layer in each projection view, into the preset point cloud classification model, the method further comprises:
acquiring a point cloud label data vector of a sample three-dimensional point cloud;
training a pre-constructed point cloud classification model, with the point cloud label data vector serving as prior data in the training process, to obtain a trained point cloud classification model, and taking the trained point cloud classification model as the preset point cloud classification model.
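Claim 6 trains the pre-constructed model with the point cloud label data vector as prior data. How the label vector enters the training is not specified, so the sketch below simply uses it as the cross-entropy target in a plain supervised loop; the optimizer, learning rate and data loader interface are assumptions.

```python
import torch
from torch import nn

def train_point_cloud_classifier(model: nn.Module, loader, epochs: int = 10, lr: float = 1e-3):
    """Plain supervised training sketch. `loader` is assumed to yield
    (view_expression_vectors, point_cloud_label_vector) pairs; the label
    vector is used here only as the cross-entropy target."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for view_vectors, labels in loader:
            optimizer.zero_grad()
            logits = model(view_vectors)       # pre-constructed classification model
            loss = criterion(logits, labels)   # label data vector supervises training
            loss.backward()
            optimizer.step()
    return model  # the trained model is then used as the preset model
```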
7. A point cloud classification device, characterized in that the device comprises:
a voxelization module, used for voxelizing a pre-acquired three-dimensional point cloud to obtain a voxelized three-dimensional point cloud;
a projection module, used for projecting the voxelized three-dimensional point cloud onto three preset coordinate planes to obtain at least three projection views corresponding to the projected three-dimensional point cloud, wherein at least one projection view is obtained on each coordinate plane, and the preset coordinate planes comprise an xoy plane, an xoz plane and a yoz plane;
a t-order layer aggregation transformation module, used for performing a t-order layer aggregation transformation on each projection view to obtain fine horizontal matrix information, fine vertical matrix information, fine diagonal matrix information and rough simulation matrix information of a t-th layer in each view, wherein t is an integer greater than 1;
a point cloud classification result determining module, used for respectively inputting the view expression vectors, spliced from the fine horizontal matrix information, the fine vertical matrix information, the fine diagonal matrix information and the rough simulation matrix information of the t-th layer in each projection view, into a preset point cloud classification model and outputting a point cloud classification result;
wherein the preset point cloud classification model comprises: a first perception network, a second perception network, a third perception network, a fusion module and a prediction module;
the first perception network is used for receiving the view expression vector spliced from the view matrix information obtained by projection onto the xoy plane, and outputting a first view self-attention feature vector;
the second perception network is used for receiving the view expression vector spliced from the view matrix information obtained by projection onto the xoz plane, and outputting a second view self-attention feature vector;
the third perception network is used for receiving the view expression vector spliced from the view matrix information obtained by projection onto the yoz plane, and outputting a third view self-attention feature vector;
the fusion module is used for fusing the first view self-attention feature vector, the second view self-attention feature vector and the third view self-attention feature vector to obtain a fused view self-attention feature vector;
the prediction module is used for performing prediction processing on the fused view self-attention feature vector and outputting a prediction result.
8. A computer device, characterized in that it comprises a processor, a memory and a computer program stored in the memory and executable on the processor, wherein the processor implements the point cloud classification method according to any one of claims 1 to 6 when executing the computer program.
9. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the point cloud classification method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211103085.7A CN116091751B (en) | 2022-09-09 | 2022-09-09 | Point cloud classification method and device, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date
---|---
CN116091751A (en) | 2023-05-09
CN116091751B (en) | 2023-09-05
Family
ID=86198053
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211103085.7A Active CN116091751B (en) | 2022-09-09 | 2022-09-09 | Point cloud classification method and device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116091751B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116612129B (en) * | 2023-06-02 | 2024-08-02 | 清华大学 | Low-power consumption automatic driving point cloud segmentation method and device suitable for severe environment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020179065A1 (en) * | 2019-03-07 | 2020-09-10 | 日本電気株式会社 | Image processing device, image processing method, and recording medium |
CN113159232A (en) * | 2021-05-21 | 2021-07-23 | 西南大学 | Three-dimensional target classification and segmentation method |
CN114120067A (en) * | 2021-12-03 | 2022-03-01 | 杭州安恒信息技术股份有限公司 | Object identification method, device, equipment and medium |
CN114299339A (en) * | 2021-12-29 | 2022-04-08 | 杭州电子科技大学 | Three-dimensional point cloud model classification method and system based on regional correlation modeling |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |