Detailed Description of the Embodiments
Through study, the inventors found that for modeling the global context information of a scene point cloud, existing solutions usually exploit the expressive power of graphical models. A relatively common approach, for example, combines a classifier with a Conditional Random Field (CRF) to estimate the semantic label of each data point. However, the classification and recognition stage of the classifier and the CRF optimization stage typically operate as independent modules with no interaction between them, which limits the exchange of information between the modules.
Among classifiers, the three-dimensional voxel convolutional neural network is a preferable choice. Three-dimensional voxel convolutional neural networks are extended from two-dimensional convolutional neural networks and have achieved good performance in object classification and recognition tasks; compared with deep neural networks that operate directly on point clouds, they also have the advantages of a clear network structure and being easy to accelerate. However, a voxel neural network requires regularized data as input, and its labeling result is only a coarse, voxel-level label.
To address the above problems in the prior art, the embodiments of the present invention provide a three-dimensional point cloud labeling method and device based on fused voxels. A multi-scale space is constructed from the regularized voxel model with a voxel convolutional neural network to extract multi-scale voxel features; the voxel features are then expanded into point features by feature interpolation, thereby achieving finer point-by-point classification and recognition and further improving labeling performance. To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. The components of the embodiments of the present invention, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the present invention provided in the accompanying drawings is not intended to limit the claimed scope of the present invention, but merely represents selected embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should also be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined and explained in subsequent drawings.
Fig. 1 is a schematic diagram of an application scenario of the fused-voxel-based three-dimensional point cloud labeling device 100 provided in an embodiment of the present invention. The electronic terminal 10 includes the fused-voxel-based three-dimensional point cloud labeling device 100, a memory 200, a storage controller 300, and a processor 400. The electronic terminal 10 may be, but is not limited to, an electronic device with processing capability such as a computer or a mobile Internet device (MID), and may also be a server or the like.
Optionally, the memory 200, the storage controller 300, and the processor 400 are electrically connected to one another, directly or indirectly, to enable the transmission or interaction of data; for example, these elements may be electrically connected through one or more communication buses or signal lines. The fused-voxel-based three-dimensional point cloud labeling device 100 includes at least one software function module that may be stored in the memory 200 in the form of software or firmware, or solidified in the operating system of the electronic terminal 10. The processor 400 accesses the memory 200 under the control of the storage controller 300 to execute the executable modules stored in the memory 200, such as the software function modules and computer programs included in the fused-voxel-based three-dimensional point cloud labeling device 100.
It can be understood that the structure shown in Fig. 1 is merely illustrative; the electronic terminal 10 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. Each component shown in Fig. 1 may be implemented in hardware, software, or a combination thereof.
Further, referring to Fig. 2, an embodiment of the present invention also provides a fused-voxel-based three-dimensional point cloud labeling method, which is introduced below with reference to Fig. 2.
Step S11: perform voxelization on the three-dimensional point cloud data set, and extract voxel features within each voxel based on the processing result to form a first voxel feature matrix;
Step S12: use the first voxel feature matrix as the input of a three-dimensional convolutional neural network to compute multi-scale features of the voxels, and concatenate and fuse the multi-scale features to obtain a second voxel feature matrix;
Step S13: expand the voxel features in the second voxel feature matrix to each point in the three-dimensional point cloud data set based on a feature interpolation algorithm to obtain a point cloud feature matrix;
Step S14: input the point cloud feature matrix into a multilayer perceptron to label the attributes of the three-dimensional point cloud.
In this embodiment, the three-dimensional point cloud is first voxelized, and features are extracted from the points within each voxel. The voxel model, with voxel features as its elements, is then fed into a three-dimensional convolutional neural network for multi-scale feature extraction and fusion. A feature interpolation algorithm then expands the voxel features into point cloud features, thereby realizing the labeling of the three-dimensional point cloud and effectively improving point cloud labeling precision.
In detail, referring to Fig. 3, the voxelization of the point cloud in step S11 may be realized through the following steps S111 to S113:
Step S111: divide the point cloud coordinate space into multiple voxels according to a preset voxel size;
Step S112: sort each point in the three-dimensional point cloud data set into its corresponding voxel according to the grid parameters of the voxels;
Step S113: sample the points in each voxel after sorting so that the number of points in each voxel reaches a first preset value.
Steps S111 to S113 above describe how the point cloud voxelization model of this embodiment voxelizes the point cloud. Specifically, as shown in Fig. 4, point cloud voxelization divides the point cloud coordinate space into multiple voxels according to a given voxel size. Suppose the extents of the input point cloud along the three coordinate axes (the X, Y, and Z directions) are W, H, and E respectively, and the size of each voxel is λ_W, λ_H, λ_E; the size of the voxelized model is then W' = W/λ_W, H' = H/λ_H, E' = E/λ_E. In this embodiment, to facilitate subsequent convolution operations, W', H', and E' may be taken as integers that are powers of 2.
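The grid-size computation above can be sketched as follows. The rounding of W', H', E' up to the next power of 2 is one plausible reading of the embodiment's requirement; the scene extents and voxel size below are hypothetical values for illustration.

```python
import math

def grid_dims(extent, voxel_size):
    """Number of voxels along one axis, rounded up to the next power of two
    so that repeated stride-2 convolutions divide the grid evenly."""
    n = max(1, math.ceil(extent / voxel_size))
    return 1 << (n - 1).bit_length()  # next power of two >= n

# hypothetical scene of 100 m x 80 m x 10 m with 0.5 m voxels
W_, H_, E_ = (grid_dims(e, 0.5) for e in (100.0, 80.0, 10.0))
print(W_, H_, E_)  # -> 256 256 32
```

With these dimensions, the voxel grid can be halved along each axis several times without producing fractional sizes.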
After the point cloud coordinate space has been divided into a voxel grid in step S111, each point in the point cloud can be sorted according to the grid parameters of each voxel, so that every point is assigned to a voxel. However, because three-dimensional point cloud acquisition is affected by measurement error, distance, occlusion, and other factors, the collected point cloud is often non-uniform: the points are dense in some regions and sparse in others. In addition, point cloud acquisition amounts to sampling the target's surface, so the target's interior is empty and contains no point cloud data. As a result, after the point cloud space is voxelized, the points are unevenly distributed among the voxels, as shown in Fig. 4, where the voxel in the lower-left corner contains no points and the voxel in the upper-right corner contains only a few. Therefore, to facilitate subsequent unified voxel feature extraction, after the point cloud is partitioned, the same number of points must be sampled from each voxel, e.g. a first preset value T (where T is determined by the point cloud resolution and the storage capacity).
It should be noted that, during sampling, if the number of points contained in a voxel is greater than the first preset value, the first preset value of points are randomly sampled from the current voxel so that the number of points in the voxel equals the first preset value; if the number of points contained in a voxel is less than the first preset value, one or more points are randomly selected from the current voxel and duplicated so that the number of points in the voxel reaches the first preset value. For example, suppose the first preset value is T: for a voxel containing more than T points, T points are randomly sampled; for a voxel containing fewer than T points, the corresponding number of points are randomly duplicated to obtain a set of T points. After point cloud partitioning and sampling, no more than W' × H' × E' voxel sets each containing T points are obtained, and feature learning can then be performed on the point cloud data within each voxel to obtain an effective feature expression for every voxel that contains points.
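The per-voxel sampling rule just described can be sketched as follows (a minimal illustration; the function name is our own):

```python
import numpy as np

def sample_voxel_points(points, T):
    """Return exactly T points from a non-empty voxel: random subsampling
    when the voxel holds more than T points, random duplication when fewer."""
    n = len(points)
    if n >= T:
        idx = np.random.choice(n, T, replace=False)
    else:
        # keep every original point, duplicate randomly chosen ones to reach T
        idx = np.concatenate([np.arange(n), np.random.choice(n, T - n, replace=True)])
    return points[idx]

pts = np.random.rand(5, 3)          # a voxel with 5 points
assert sample_voxel_points(pts, 8).shape == (8, 3)   # fewer than T: duplicate
assert sample_voxel_points(pts, 3).shape == (3, 3)   # more than T: subsample
```

Either branch yields a fixed T × 3 array, which is what makes unified per-voxel feature extraction possible.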
Further, in step S11, extracting point cloud features within each voxel based on the processing result to form the first voxel feature matrix includes: computing the center coordinates of the point cloud in each voxel, and center-normalizing the point cloud data in the voxel based on the center coordinates to obtain an initial data matrix; inputting the initial data matrix into LGAB modules to obtain point-by-point local feature descriptions; and applying a point-by-point max-pooling operation to the set of local features in the voxel to obtain the global feature of the voxel, which serves as the first voxel feature matrix.
Specifically, in the embodiment of the present invention, the local and global feature fusion modules (LGAB) shown in Fig. 5 are stacked to build the feature learning network that extracts voxel features. As shown in Fig. 6, suppose V_x is a non-empty voxel containing T points, i.e. V_x = {p_i = (x_i, y_i, z_i)}, i = 1, 2, 3, ..., T. Before the point cloud data (the initial data matrix) are input into the LGAB module, they are center-normalized: the center coordinates (c_x, c_y, c_z) of the point cloud in the voxel are computed first, and then used to center-normalize the point cloud data, yielding the initial data matrix that serves as the final input. The input of the voxel feature extraction module is thus a T × 6 initial data matrix.
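The exact layout of the T × 6 matrix is not spelled out here; a common convention, assumed in the sketch below, is to concatenate each point's raw coordinates with its centered coordinates:

```python
import numpy as np

def voxel_input(points):
    """Build the T x 6 input matrix for one voxel. Each row is assumed to be
    (x, y, z, x - cx, y - cy, z - cz), with (cx, cy, cz) the point centroid."""
    c = points.mean(axis=0)                 # voxel center coordinates
    return np.hstack([points, points - c])  # (T, 3) + (T, 3) -> (T, 6)

pts = np.random.rand(16, 3)
m = voxel_input(pts)
assert m.shape == (16, 6)
# the centered channels are zero-mean by construction
assert np.allclose(m[:, 3:].mean(axis=0), 0.0)
```

The centered channels make the per-voxel features invariant to where the voxel sits in the scene, while the raw channels retain absolute position.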
Further, point-by-point local feature descriptions can be obtained with the stacked LGAB modules, and applying max-pooling (MP) to the point-by-point feature set in a voxel yields the voxel's global feature. Fig. 6 gives a feature extraction example for a non-empty voxel; to reduce the number of parameters, the remaining non-empty voxels share the same network parameters during feature extraction. In practice, since the LGAB module fuses both the shared information of a point's local neighborhood and the distinctive information of individual points, cascading multiple LGAB modules in this embodiment effectively extracts the point cloud information inside each voxel.
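The internal structure of the LGAB module is given only at the block-diagram level; the following numpy sketch assumes a VFE-style design (a shared per-point layer followed by concatenation with the voxel-wise max-pooled feature), which matches the "local + global fusion" description but is not guaranteed to match Fig. 5 exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def lgab(feats, W):
    """One local/global fusion block (assumed structure): a shared per-point
    linear layer + ReLU, then concatenation of each point-wise feature with
    the voxel-wise max-pooled (global) feature."""
    h = relu(feats @ W)                      # (T, d); weights shared across points
    g = h.max(axis=0, keepdims=True)         # (1, d) voxel-level context
    return np.hstack([h, np.repeat(g, len(h), axis=0)])  # (T, 2d)

x = rng.normal(size=(32, 6))                 # T x 6 voxel input matrix
h1 = lgab(x, rng.normal(size=(6, 16)))       # (32, 32)
h2 = lgab(h1, rng.normal(size=(32, 32)))     # (32, 64): stacked LGAB modules
voxel_feature = relu(h2 @ rng.normal(size=(64, 64))).max(axis=0)  # final MP
assert voxel_feature.shape == (64,)          # one D-dimensional voxel feature
```

Because the same weight matrices are applied to every point and every voxel, the parameter count is independent of both T and the number of voxels.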
Further, as shown in Fig. 7, in step S12 the process of using the first voxel feature matrix as the input of the three-dimensional convolutional neural network to compute the multi-scale features of the voxels may be realized through the following steps.
Step S120: convert the voxel feature matrix into a 4-dimensional tensor, and input the 4-dimensional tensor into three-dimensional convolutional neural networks with convolution kernels of different sizes to compute voxel features at different scales;
Step S121: input the voxel features at each scale into three-dimensional deconvolutional neural networks with convolution kernels of different sizes to obtain voxel features at multiple scales, where the convolution kernel size of each three-dimensional convolutional neural network matches the convolution kernel size of the corresponding three-dimensional deconvolutional neural network.
In detail, since spatial geometric information is important information about a target, processing three-dimensional data directly can extract effective feature descriptions of the target. The present invention therefore draws on the great success of two-dimensional convolutional neural networks in image processing and extends them to three-dimensional convolutional neural networks. That is, when processing three-dimensional data with a three-dimensional convolutional neural network, the three-dimensional data must first be regularized (voxelized), and the two-dimensional convolution, pooling, and related operations must be extended to three-dimensional voxel data. The three-dimensional convolution formula is as follows:
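The image of formula (1) did not survive extraction. A standard form consistent with the surrounding description (the symbols f, h, g for input, kernel, and output are our own labels) would be:

```latex
g(x,y,z) \;=\; (f * h)(x,y,z) \;=\; \sum_{i}\sum_{j}\sum_{k} f(x-i,\;y-j,\;z-k)\,h(i,j,k)
\tag{1}
```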
In formula (1), the three terms are the input three-dimensional voxel data, the three-dimensional convolution kernel template, and the output response, respectively. Similar to the two-dimensional case, the three-dimensional convolution operation simply extends the two-dimensional kernel to a three-dimensional kernel with length, width, and height dimensions; correspondingly, as shown in Fig. 8, its local receptive field changes from a local neighborhood on a two-dimensional plane to a local neighborhood in three-dimensional space. In practice, the three-dimensional convolution operation reduces the spatial size of the three-dimensional data, i.e. the spatial size of the three-dimensional feature map is smaller than that of the input voxel data. However, labeling three-dimensional data requires the feature information of every data point, so the feature of every voxel must be obtained during voxel feature extraction, which requires mapping the post-convolution feature map back onto the original input voxels. To solve this problem, this embodiment uses the deconvolution operation to process the obtained three-dimensional voxel feature maps into feature maps of the same size as the input voxel data. The core of the deconvolution operation is still convolution; the input feature data are merely zero-padded at the edges before the convolution operation to guarantee the required output feature map size. For example, Fig. 9 gives a two-dimensional example of the convolution and deconvolution operations, where the blue-marked data are the inputs of the two operations, the grey-marked data are the convolution kernels (identical in size for both), and the green-marked data are their output responses. It is easy to see that the input of the deconvolution layer is the output of the convolution layer, and the output of the deconvolution layer has the same size as the input of the convolution layer.
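The size relationship between a convolution and its matching deconvolution can be checked with the standard output-size formulas (a small arithmetic sketch, not part of the claimed method):

```python
def conv3d_out(size, ker, st, pad):
    """Output extent of one axis after a 3D convolution."""
    return (size + 2 * pad - ker) // st + 1

def deconv3d_out(size, ker, st, pad):
    """Output extent after the transposed (de)convolution: the inverse relation."""
    return (size - 1) * st - 2 * pad + ker

# a stride-1, unpadded 3x3x3 convolution shrinks each axis by 2 ...
s = conv3d_out(32, 3, 1, 0)
assert s == 30
# ... and the matching deconvolution (same kernel and stride) restores it
assert deconv3d_out(s, 3, 1, 0) == 32
```

This is exactly the property used above: pairing each convolution with a deconvolution of the same kernel size returns the feature map to the input voxel resolution.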
Further, after all non-empty voxels are processed by the foregoing operations, a series of D-dimensional voxel features is obtained. Since each voxel feature corresponds uniquely to a voxel coordinate in three-dimensional space, the acquired features can be represented as a 4-dimensional tensor of size W' × H' × E' × D (for an empty voxel, a D-dimensional zero vector is used as its feature). After the feature representation is converted into this 4-dimensional tensor, a three-dimensional convolutional neural network (3D CNN) can be used for further feature optimization. Considering that feature extraction at a fixed scale (a single convolution kernel size) is insufficient to completely express the local context information of a voxel, the present invention uses multi-scale feature extraction and fusion to extract richer local neighborhood information.
As shown in Fig. 10, the specific three-dimensional convolutional neural network structure, i.e. the multi-scale feature extraction based on the 3D CNN, mainly comprises three-dimensional convolution (Conv3D), three-dimensional deconvolution (DeConv3D), and feature concatenation (Concat) operations. For the input W' × H' × E' × D feature, convolution and deconvolution are performed with three different convolution kernels, denoted Conv3D(f_in; f_out; ker; st; pad) and DeConv3D(f_in; f_out; ker; st; pad), where f_in and f_out denote the dimensions of the input and output feature matrices, and ker, st, and pad denote the convolution kernel size, the kernel stride, and the data padding size, each a three-dimensional vector. To obtain features at different scales, the three convolution kernels may be, but are not limited to, (1; 2; 2), (2; 1; 2), and (2; 2; 1); identical convolution kernels are used in each paired convolution and deconvolution operation, and the corresponding edge zero-padding is performed in the deconvolution operation. In addition, the convolution and deconvolution layers contain more than the convolution and deconvolution operations: each convolution operation is also followed by a Batch Normalization (BN) layer and a ReLU activation.
The multi-scale feature extraction based on the 3D CNN performs voxel feature extraction and fusion with different convolution kernels along the three mutually orthogonal directions of three-dimensional space (i.e. the X, Y, and Z directions), enabling the learned features to contain more local structure information and achieving a more complete expression of the point cloud.
Further, in step S13, since realizing three-dimensional point cloud labeling requires a feature description for every point, while the foregoing three-dimensional convolutional neural network can only obtain a feature description for every voxel, the present invention interpolates the voxel features to obtain the feature description of every point in the input point cloud. As shown in Fig. 11, for a given target point p, the nearest preset number (e.g. 8) of neighboring voxels is found in the voxel space formed by the second voxel feature matrix, each neighboring voxel having a corresponding feature description, j = 1, 2, ..., 8; the feature description of the target point p is then:
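The image of formula (2) did not survive extraction. A plausible form consistent with the surrounding description, assuming normalized inverse-distance weights (the text only states that the weights are derived from Euclidean distances to the voxel centers), would be:

```latex
F(p) \;=\; \sum_{j=1}^{8} \omega(p, c_j)\, F_{v_j},
\qquad
\omega(p, c_j) \;=\; \frac{1/\lVert p - c_j \rVert}{\sum_{k=1}^{8} 1/\lVert p - c_k \rVert}
\tag{2}
```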
In formula (2), the weight parameter is obtained from the Euclidean distance between the target point p and the center point c_j of the j-th neighboring voxel, and is applied to the voxel feature of the j-th neighboring voxel. Repeating the above process for every point in the three-dimensional point cloud expands the voxel features in the second voxel feature matrix to every point in the three-dimensional point cloud data set, yielding the point cloud feature matrix.
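The voxel-to-point interpolation step can be sketched as follows. The normalized inverse-distance weighting is an assumption; the embodiment only states that the weights come from Euclidean distances to the neighboring voxel centers:

```python
import numpy as np

def interpolate_point_feature(p, centers, feats, eps=1e-8):
    """Expand voxel features to one point: an inverse-distance-weighted sum of
    the features of its (here, 8) nearest voxel centers."""
    d = np.linalg.norm(centers - p, axis=1)   # distances to the 8 voxel centers
    w = 1.0 / (d + eps)
    w /= w.sum()                              # weights sum to 1
    return w @ feats                          # (D,) interpolated point feature

# 8 voxel centers at the corners of a unit cube, each with a distinct feature
centers = np.array([[i, j, k] for i in (0, 1) for j in (0, 1) for k in (0, 1)], float)
feats = np.eye(8)
f = interpolate_point_feature(np.array([0.05, 0.05, 0.05]), centers, feats)
assert f.argmax() == 0               # dominated by the nearest voxel (0,0,0)
assert abs(f.sum() - 1.0) < 1e-9     # weights are normalized
```

Applying this to every point in the scene turns the W' × H' × E' × D voxel feature tensor into an N × D point cloud feature matrix.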
Further, in step S14, inputting the point cloud feature matrix of each point obtained in step S13 into a multilayer perceptron (MLP) realizes the point-by-point classification and recognition, i.e. the three-dimensional point cloud labeling; for the specific network structure, refer to Fig. 4.
According to actual needs, to further optimize the above point cloud labeling results obtained with the convolutional neural network and improve labeling precision, the present invention also includes step S15.
Step S15: input the attribute-labeled three-dimensional point cloud into a CRF-RNN network to optimize the point cloud attribute labels.
Specifically, this embodiment embeds a fully-connected CRF (FC-CRF), realized with basic CNN operations, into the aforementioned point cloud labeling network based on the three-dimensional convolutional neural network, realizing an end-to-end fine labeling network for three-dimensional point clouds that fuses coarse labeling with back-end optimization, and further improving the accuracy of the point cloud labels, especially the smoothness of object boundaries and contours.
In this embodiment, the CRF for three-dimensional point cloud labeling is modeled first, the CRF is then approximately realized with CNN operations, and finally the CRF-optimized three-dimensional point cloud labels are fused.
Traditional semantic labeling is modeled as point-by-point classification and recognition, whether performed with local features or with deep neural networks. However, point-by-point classification usually produces some obviously unacceptable labeling errors; for example, some points inside a target may be recognized as other categories. This is because point-by-point classification does not consider the adjacency between points and uses only the local, small-scale neighborhood information of the point to be labeled. If the structural information of objects can be modeled in advance (e.g. all targets are continuous, and adjacent points with similar characteristics should be labeled as the same class of target) and the labeling results are optimized and constrained based on that model, some obvious mistakes can be effectively rejected, yielding a high-precision labeling result. The Conditional Random Field (CRF) is an effective method for modeling the continuity of targets and their context information, and is widely used in two-dimensional image labeling. A conditional random field is a model that computes the conditional probability distribution of one set of output random variables given another set of random variables; its main characteristic is the assumption that the output variables constitute a Markov Random Field (MRF).
In detail, the CRF is a discriminative probabilistic undirected graph model that can model global context information and crossing features in the data, and is a probability graph model well suited to segmenting and labeling sequential data. Suppose random variable sets X = {X_1, X_2, ..., X_N} and P = {P_1, P_2, ..., P_N} are given, where X_i ∈ L = {l_1, l_2, ..., l_M}. For three-dimensional point cloud labeling, P is the input point cloud containing N points, P_j is the measurement vector of the j-th point, X is the semantic labeling result of the input point cloud, and X_i is the semantic label of the i-th point, taking one of the M semantic labels. The corresponding CRF model can then be represented by a Gibbs probability distribution, namely:
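The image of formula (3) did not survive extraction. The standard Gibbs-distribution form consistent with the clique and potential-function definitions that follow would be:

```latex
P(X = x \mid P) \;=\; \frac{1}{Z(P)} \exp\!\Big(-\!\!\sum_{o \in O_G}\!\Lambda(x_o \mid P)\Big)
\tag{3}
```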
In formula (3), G is the probabilistic undirected graph constructed on the random variable set X, o is a clique in G in which every pair of nodes is adjacent, O_G is the set of all cliques in G, Z(P) is the normalization function, and Λ(x_o | P) is the energy function, also called the potential function, on the clique.
For any labeling result x ∈ L^N, the overall potential function is:
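The image of formula (4) did not survive extraction. Consistent with the definitions above, the overall potential function would be the sum of the clique potentials:

```latex
\Lambda(x \mid P) \;=\; \sum_{o \in O_G} \Lambda(x_o \mid P)
\tag{4}
```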
Solving based on the maximum a posteriori probability yields the optimal labeling result:
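The image of formula (5) did not survive extraction. Consistent with formulas (3) and (4), the maximum a posteriori solution would be:

```latex
x^{*} \;=\; \arg\max_{x \in L^N} P(X = x \mid P) \;=\; \arg\min_{x \in L^N} \Lambda(x \mid P)
\tag{5}
```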
As can be seen from the above solution process, maximizing the posterior probability of the labeling result is equivalent to minimizing the overall potential function. It is easy to see that the conditional random field first models local context information through the clique potential functions, then propagates context information through the graph structure, thereby modeling context information over a wide range.
In a fully-connected CRF model, each node is connected to every other node in the graph G, as shown in Fig. 12; the corresponding cliques o are cliques containing a single node or a pair of nodes, so the overall potential function corresponding to x can be expressed as:
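The image of formula (6) did not survive extraction. Given the single-node and paired-node cliques just described, the standard decomposition would be:

```latex
\Lambda(x) \;=\; \sum_{i} \Lambda_i(x_i) \;+\; \sum_{i < j} \Lambda_{ij}(x_i, x_j)
\tag{6}
```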
In formula (6), for ease of description, the conditioning part P of the conditional posterior probability is omitted, i.e. Λ(x) = Λ(x | P); the first sum is over unary potential functions and the second over pairwise binary potential functions, with i, j = 1, 2, ..., N. The unary potential function expresses the cost of labeling the i-th node in graph G as x_i; this function is usually defined by the probability output of some discriminative classifier, whose estimates usually contain considerable noise, so the segmentation result is often discontinuous at object edges. The pairwise binary potential function gives the cost of simultaneously labeling the i-th and j-th observation points as x_i and x_j; it preserves labeling consistency between adjacent observation points and reduces inconsistency, which can improve the smoothness of the labeling result.
Using the idea of Gaussian weighting, the pairwise binary potential function is defined as:
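The image of formula (7) did not survive extraction. The standard Gaussian-weighted form consistent with the symbols described next (ψ, w^{(m)}, the M_G Gaussian kernels over feature vectors f_i, f_j, and the matrices Λ_m) would be:

```latex
\Lambda_{ij}(x_i, x_j) \;=\; \psi(x_i, x_j) \sum_{m=1}^{M_G} w^{(m)} k_G^{(m)}(f_i, f_j),
\qquad
k_G^{(m)}(f_i, f_j) \;=\; \exp\!\Big(-\tfrac{1}{2}\,(f_i - f_j)^{\!\top} \Lambda_m\, (f_i - f_j)\Big)
\tag{7}
```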
In formula (7), ψ(x_i, x_j) is the consistency (compatibility) function between different labels, w^{(m)} is a weight, and each of the M_G Gaussian kernel functions is a Gaussian-kernel smoothing filter over f_i and f_j, the feature vectors of observation points i and j; each Gaussian kernel function can be defined through a symmetric, positive-definite matrix Λ_m. This completes the fully-connected CRF modeling for three-dimensional point cloud labeling; the next step is to solve for the optimal labeling result.
The optimization of three-dimensional point cloud labels based on the fully-connected CRF is the process of maximizing the posterior probability Φ(X) given the input point cloud data. Solving the exact posterior probability is difficult and computationally enormous; the mean-field approximation converts the posterior probability Φ(X) into a product of mutually independent marginal probabilities, i.e. Φ(X) ≈ Θ(X) = ∏_i Θ_i(X_i). Combining formulas (3) to (7), the marginal probability Θ_i of any labeling result x_i is:
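The image of formula (8) did not survive extraction. The standard mean-field update consistent with formulas (6) and (7) would be:

```latex
\Theta_i(x_i = l) \;=\; \frac{1}{Z_i}\exp\!\Big(-\Lambda_i(l)
\;-\; \sum_{l' \in L}\psi(l, l')\sum_{m=1}^{M_G} w^{(m)} \sum_{j \neq i} k_G^{(m)}(f_i, f_j)\,\Theta_j(l')\Big)
\tag{8}
```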
An iterative inference algorithm for the CRF can be constructed from formula (8), as shown in Algorithm 1. The convergence of this iterative algorithm is mainly measured by the difference between the estimated Q and P; evaluating the algorithm's convergence shows that the estimation error is very small after 10 iterations, demonstrating that the algorithm converges well.
Algorithm 1: mean-field-approximation-based CRF iterative inference algorithm
1. Initialization: initialize the marginal distribution of every node (from the unary potentials);
2. while not converged do
3. Message passing: compute all Gaussian filtering results;
4. Weighted filtering: weight and sum the filter outputs;
5. Consistency detection: apply the label compatibility function;
6. Adding unary potentials: combine with the unary potential function;
7. Normalization: normalize the marginal distribution of every node;
8. end while
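The steps of Algorithm 1 can be sketched on a small point set. This is a deliberately simplified stand-in: a single Gaussian kernel, a fixed Potts compatibility, and brute-force O(N²) filtering in place of the permutohedral-lattice filtering described below.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mean_field(unary, feats, n_iter=10, bandwidth=1.0, w=1.0):
    """Simplified dense-CRF mean-field inference (Algorithm 1) on N points
    with L labels: one Gaussian kernel, Potts compatibility."""
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    K = np.exp(-0.5 * d2 / bandwidth**2)      # pairwise Gaussian kernel
    np.fill_diagonal(K, 0.0)                  # message passing excludes j == i
    Q = softmax(unary)                        # step 1: softmax of the unaries
    for _ in range(n_iter):
        msg = K @ Q                           # step 3: Gaussian filtering
        # steps 4-5: Potts compatibility penalizes mass on *other* labels
        pairwise = w * (msg.sum(1, keepdims=True) - msg)
        Q = softmax(unary - pairwise)         # steps 6-7: add unaries, normalize
    return Q

# two nearby points: the confident one pulls the ambiguous one to its label
unary = np.array([[4.0, 0.0], [0.1, 0.0]])
feats = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]])
Q = mean_field(unary, feats)
assert Q.argmax(axis=1).tolist() == [0, 0]
```

The effect is exactly the smoothing argued for above: the weakly-classified point inherits the label of its strongly-classified neighbor.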
The following describes how the relevant operations in a CNN are used to realize the above CRF iterative inference algorithm. The biggest problem in reconstructing the algorithm with CNN operations is realizing the backpropagation of errors, i.e. enabling parameter learning and training with the BP algorithm.
(1) Initialization
The initialization operation in Algorithm 1 is:
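The image of this initialization formula did not survive extraction. Consistent with the definition of Z_i in the next paragraph, it would be:

```latex
\Theta_i(x_i = l) \;=\; \frac{1}{Z_i}\exp\!\big(U_i(l)\big),
\qquad
Z_i \;=\; \sum_{l \in L} \exp\!\big(U_i(l)\big)
```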
Here the sum runs over all possible label values; writing Z_i = ∑_l exp(U_i(l)), it can be seen that this operation is equivalent to applying a Softmax activation to the U_i(l) of all possible labeling results at each scene point. The Softmax function is a common activation function in CNN networks; it contains no parameters and its error derivative can be backpropagated, so it can also be trained with the Back-Propagation (BP) algorithm.
(2) Message passing
As shown in Algorithm 1, the message passing in the CRF uses M_G Gaussian filters to smooth Θ_j. The kernel functions of the Gaussian filters are constructed from point cloud features, such as each point's coordinate information or its color and intensity information, and express the relationship between scene points. In the fully-connected CRF model, each filter must cover all points in the point cloud, so the data volume and computation are enormous and a direct implementation is infeasible. Here, fast Gaussian convolution is realized with the permutohedral lattice method, whose computation is O(N), with N the number of points being filtered; compared with traditional Gaussian convolution it is faster and filters better. The fast Gaussian convolution based on the permutohedral lattice comprises four stages: lattice construction, the splat mapping, the slice mapping, and the blur stage.
In backpropagation, the input of the current convolution layer (the error derivative) is the output of the previous layer's filters passed through the M_G Gaussian filters in the reverse direction. In the permutohedral-lattice Gaussian convolution, this backpropagation can be realized by keeping the same lattice construction, splat mapping, and slice mapping as in the forward pass while reversing the order of the filters in the blur stage. The computation of this implementation remains O(N), which markedly reduces the amount of computation and improves efficiency.
(3) Weighted filtering
The next computation is the weighted summation of the aforementioned M_G output results for each semantic label l. In point cloud labeling, the semantic labels are independent of each other, so this weighted filtering operation can be realized by a 1 × 1 convolution over the M_G filter outputs, where the input of the convolution is a feature matrix with M_G channels and the output is a feature matrix with one channel per label. In backpropagation, since the input and output of this single operation are known and the convolution kernels are mutually independent, the error derivatives with respect to the kernel parameters and with respect to the input data can both be computed, so the kernel parameters can be trained with the BP algorithm.
(4) Consistency detection
In the consistency detection, the Potts model computes the compatibility of the different labels output in the previous step. The compatibility computation mainly compares whether the labels of two similar observation points are identical: when the semantic labels of the two points are identical, the consistency detection is 0; when the semantic labels of the two points differ, a penalty term σ is introduced. The computation is as follows:
Compared with using a fixed penalty term σ, the present invention considers penalty values learned from data, because the degree of association between different labels differs: assigning different label pairs to adjacent points affects the overall labeling result to different extents. Consistency detection can therefore also be regarded as a convolutional layer whose input and output channel numbers are both M (the number of labels) and whose kernel size is 1 × 1; the learned neuron connection weight parameters are the values of the transfer function ψ(l, l'). Since it is realized with a basic convolution operation, this step also supports backpropagation.
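The difference between the fixed Potts penalty and the learned compatibility transform can be sketched as follows; the numerical values are illustrative assumptions, and the 1 × 1 convolution is written as a matrix product over the label channels:

```python
import numpy as np

L_labels = 4
# Fixed Potts compatibility: 0 on the diagonal, penalty sigma elsewhere
sigma = 1.0
psi_potts = sigma * (1.0 - np.eye(L_labels))

# Learned compatibility (hypothetical values): initialized from the Potts
# model and then trained by BP, so strongly associated label pairs can end
# up with smaller penalties than unrelated ones
psi_learned = psi_potts.copy()
psi_learned[0, 1] = psi_learned[1, 0] = 0.2   # e.g. two related labels

# Q[n, l]: per-point label marginal after weighted filtering
rng = np.random.default_rng(0)
Q = rng.random((10, L_labels))

# Compatibility transform = 1x1 convolution with L input / L output channels
out = Q @ psi_learned.T
```

Because the transform is an ordinary linear layer over the label channels, its parameters are learned jointly with the rest of the network.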
(5) Adding the unary potential function
The unary potential function U_i(l) is combined element-wise with the output result of consistency detection to obtain the complete potential function result. This step of adding the unary potential function contains no parameters, so backpropagation can be realized simply by copying the error at the output to the input.
(6) Normalization
Similar to the initialization process, the normalization step can be realized by an activation operation based on the Softmax function; its backpropagation is consistent with Softmax backpropagation in a CNN. At this point, every step of a single iteration in Algorithm 1 has been realized with basic operations of a CNN network, and stacking the above steps realizes the multi-iteration solving algorithm.
Based on the foregoing description, in this embodiment the CRF model is first approximately modeled with the mean-field approximation method, and each step of the mean-field approximation method is then equivalently realized with basic CNN operations, i.e., a single iteration of the mean-field approximation algorithm is realized. The iterated mean-field approximation method only requires stacking the related steps, i.e., iterative mean-field approximation computation can be realized with a recurrent CNN structure (RNN), whose structure is shown in Figure 13. Given an input point cloud P, a point-wise unary potential function U = U_i(l), the marginal probability obtained by the previous iteration H1, and the marginal probability obtained by the current iteration H2, a single mean-field approximation estimate is denoted f_Ω(U, P, H1), where Ω is its parameter set (containing all parameters in weighted filtering and consistency detection), Ω = {w^(m), ψ(l, l')}. H1 is initialized at the start of iteration as the softmax output with U as input, and in later iterations equals the previous H2 output, that is:

H1(t') = softmax(U) for t' = 1, and H1(t') = H2(t' − 1) for 1 < t' ≤ T'

where T' is the number of iterations.
After obtaining H1, H2 is estimated based on the mean-field approximation algorithm, that is:

H2(t') = f_Ω(U, P, H1(t')), 0 < t' ≤ T'

For the output Y, only the estimate of the last iteration is output, Y = H2(T').
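The recurrent structure above can be unrolled as a short loop; the step function f_Ω is passed in as a callable, and the toy step used in the usage example is a hypothetical stand-in, not the patent's actual mean-field update:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def crf_rnn(U, f_omega, T_prime):
    """Unrolled mean-field inference as a recurrent structure.
    H1 is initialized as softmax(U) at the first iteration and is the
    previous iteration's H2 afterwards; the output Y is H2 at t' = T'."""
    H2 = None
    for t in range(T_prime):
        H1 = softmax(U) if t == 0 else H2
        H2 = f_omega(U, H1)
    return H2

# Toy usage with a hypothetical single-iteration estimate:
rng = np.random.default_rng(0)
U = rng.standard_normal((5, 4))
f = lambda U_, H1: softmax(-U_ + 0.5 * H1)
Y = crf_rnn(U, f, T_prime=5)
```

Since the loop only composes differentiable operations, the unrolled network can be trained with the standard BP algorithm, as stated below.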
Based on the above analysis, it can be seen that the error derivatives with respect to the parameters in the whole network structure (denoted CRF-RNN) are computable, so it can be solved with the standard BP algorithm and can also be embedded into other neural networks for learning and training.
Further, based on the above point cloud labeling network Point-VoxelNet, a three-dimensional point cloud labeling network fusing the three-dimensional voxel convolutional neural network with CRF back-end optimization (Point-VoxelNet + CRF-RNN, PVCRF) is constructed; its specific structure is shown in Figure 14. For the input point cloud, voxelization is first performed according to the scene size of the input point cloud, and a fixed number of points is randomly selected in each voxel for subsequent feature extraction. Feature extraction is then performed within each voxel based on the LGAB module to obtain simple voxel features (the features of empty voxels are padded with zeros). After all non-empty voxel features are obtained, multi-scale voxel feature extraction and fusion are carried out based on a three-dimensional convolutional neural network (Conv3D, DeConv3D). The multi-scale voxel features are then expanded to all points by the method of interpolation to obtain point-wise point features, which are fed into a multilayer perceptron to obtain the initial point cloud labeling result; finally, back-end optimization is performed based on the CRF-RNN network structure. This network structure realizes well the information exchange between the classification-recognition stage and the CRF optimization stage in point cloud labeling, and has an obvious effect on improving point cloud labeling precision.
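The step of expanding voxel features to point features can be sketched as follows. The patent expands multi-scale voxel features to all points by feature interpolation; this sketch uses the simplest variant (nearest-voxel lookup) as an assumption, and the actual interpolation scheme may differ:

```python
import numpy as np

def nearest_voxel_feature(points, voxel_feats, voxel_size, origin):
    """Expand voxel features to per-point features by nearest-voxel lookup.
    points: (N, 3) point coordinates; voxel_feats: (E', H', W', C) voxel
    feature volume; voxel_size: (3,) edge lengths; origin: (3,) grid origin.
    Returns an (N, C) point feature matrix."""
    idx = np.floor((points - origin) / voxel_size).astype(int)
    # Clamp indices so points on the grid boundary stay inside the volume
    idx = np.clip(idx, 0, np.array(voxel_feats.shape[:3]) - 1)
    return voxel_feats[idx[:, 0], idx[:, 1], idx[:, 2]]

# Minimal usage: a 2x2x2 grid of 3-channel voxel features
feats = np.arange(2 * 2 * 2 * 3, dtype=float).reshape(2, 2, 2, 3)
pts = np.array([[0.1, 0.1, 0.1], [1.9, 1.9, 1.9]])
pt_feats = nearest_voxel_feature(pts, feats,
                                 voxel_size=np.ones(3), origin=np.zeros(3))
```

A smoother choice would blend the eight surrounding voxels (trilinear weights), but the lookup above already shows how a voxel-level feature volume becomes a point-wise feature matrix.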
Based on the above description of the fusion-voxel-based three-dimensional point cloud labeling method, the inventors also verified the performance of this method, using evaluation indices such as point-set Intersection over Union (IoU) and Overall Accuracy (OA) to evaluate point cloud labeling performance.
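The two evaluation indices can be computed directly from point-wise labels; this is a minimal numpy sketch with illustrative function names:

```python
import numpy as np

def iou_per_class(pred, gt, num_classes):
    """Per-class Intersection over Union from point-wise labels:
    |pred == c AND gt == c| / |pred == c OR gt == c| for each class c."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        ious.append(inter / union if union > 0 else np.nan)
    return np.array(ious)

def overall_accuracy(pred, gt):
    """Overall Accuracy (OA): fraction of correctly labeled points."""
    return np.mean(pred == gt)
```

The average IoU reported in the tables is then the mean of the per-class IoU values over the classes present in the data.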
(1) Network implementation and parameter settings
In the point cloud rasterization and sampling stage, different data sets require different processing.
S3DIS: for the S3DIS data set, the maximum scene extents along the Z, Y, and X directions are E = 8 m, H = 16 m, and W = 50 m, respectively. To cover the whole scene, the overall grid size is 8 × 16 × 50 and the size of each voxel is λ_E = 0.5 m, λ_H = 0.25 m, λ_W = 0.2 m; the constructed voxel model has dimensions E' = 16, H' = 64, W' = 256, and the surplus voxels are emptied. T = 32 points are chosen in each voxel.
vKITTI: for the vKITTI data set, the maximum scene extents along the Z, Y, and X directions are E = 33 m, H = 193 m, and W = 148 m, respectively. Here the size of each voxel is set to λ_E = 2 m, λ_H = 1.6 m, λ_W = 1.2 m, and the constructed voxel model has dimensions E' = 16, H' = 128, W' = 128. Likewise, T = 32 points are chosen in each voxel.
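The reported voxel model dimensions for both data sets are reproduced by one simple rule: divide the scene extent by the voxel size and round to the nearest power of two (an assumption inferred from the numbers above, presumably so the grid suits the 3-D convolutional network; surplus voxels are emptied):

```python
import numpy as np

def voxel_grid_dims(extent, voxel_size):
    """Voxels per axis: extent / voxel_size rounded to the nearest power of
    two. This is an inferred rule that matches all six reported dimensions
    (S3DIS: 16, 64, 256; vKITTI: 16, 128, 128), not a formula stated in
    the text."""
    return int(2 ** round(np.log2(extent / voxel_size)))
```

For example, 50 m / 0.2 m = 250 voxels rounds up to W' = 256, while 33 m / 2 m = 16.5 voxels rounds down to E' = 16.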
For the CRF-RNN network, to prevent over-fitting and gradient vanishing, the number of iterations is set to T' = 5 in the training stage and T' = 10 in the test stage. The Gaussian filter size is consistent with the point cloud data size.
The present invention uses a two-step training strategy: the first step trains Point-VoxelNet alone, and the second step fine-tunes the joint Point-VoxelNet and CRF-RNN network PVCRF. The Point-VoxelNet network is optimized with the Adam optimization algorithm with a momentum value of 0.9, an initial learning rate of 0.001, and a training batch size of 16. The PVCRF network is optimized with the Adam optimization algorithm with a momentum value of 0.6, an initial learning rate of 0.0001, and a training batch size of 16. During training, an early-stopping strategy is likewise used to obtain the optimal network parameters: the maximum number of training epochs is 100, and training stops if the network parameters have not been updated after 10 consecutive epochs. In the test stage, the algorithm is likewise verified with 6-fold cross validation; the grouping of training data and test data is shown in Table 1.
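The early-stopping rule described above (at most 100 epochs, stop after 10 consecutive epochs without improvement) can be sketched as a plain training loop; `train_one_epoch` and `validate` are placeholders for the actual Point-VoxelNet / PVCRF routines:

```python
def early_stopping_training(train_one_epoch, validate,
                            max_epochs=100, patience=10):
    """Train for at most max_epochs epochs, stopping once the validation
    score has failed to improve for `patience` consecutive epochs.
    Returns the best validation score seen."""
    best, stale = float('-inf'), 0
    for epoch in range(max_epochs):
        train_one_epoch()
        score = validate()
        if score > best:
            best, stale = score, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best
```

With patience 10, training that plateaus after epoch 2 runs exactly 12 epochs before stopping.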
The proposed three-dimensional voxel neural network structure and the CRF-RNN network structure are implemented in Python using the TensorFlow deep learning framework. The experimental hardware environment is: Intel Core i7-6700K CPU, 48 GB memory, GTX 1080Ti graphics card (supporting CUDA 8.0 and cuDNN 5.1).
(2) Analysis of quantitative results
Based on the above two data sets, the application effects of the two deep neural network models (Point-VoxelNet and PVCRF) in three-dimensional point cloud labeling are compared, together with a comparative analysis against the current better labeling algorithms PointNet, MS+CU (2), SEGCloud, and 3DContextNet. Only the rectangular coordinate information of the point cloud, i.e., the XYZ coordinates, is used for processing.
Table 1
Table 2
Tables 1 and 2 report the statistical labeling results of the different network models on the two data sets. The PVCRF model proposed in this embodiment achieves the better average IoU on the S3DIS and vKITTI data sets, 51.8% and 39.1% respectively, and also yields good results on overall labeling accuracy OA, at 81.2% and 82.6% respectively. This shows that the PVCRF model can better obtain a complete feature representation of the point cloud, demonstrating that multi-scale feature extraction based on the three-dimensional voxel space is functionally comparable to multi-scale feature extraction based on Euclidean space and can provide detailed information of the point cloud scene for three-dimensional point cloud labeling.
Comparing PVCRF and SEGCloud: SEGCloud directly adopts a binary voxel model for voxel feature learning, whereas PVCRF uses a rasterized point cloud voxel model and learns voxel features from the point cloud within each voxel. Both add back-end optimization based on the fully connected CRF model. In addition, PVCRF contains the multi-scale feature extraction module based on the three-dimensional voxel space, so PVCRF achieves higher average IoU, fully demonstrating that processing the point cloud data within voxels together with the multi-scale feature extraction module can extract feature descriptions with stronger characterization ability.
Comparing the Point-VoxelNet and PVCRF models, PVCRF achieves better performance in both average IoU and overall accuracy OA, because the fully connected CRF model can model contextual information over a larger range and thus has stronger characterization ability for the adjacency relations of the point cloud.
On the different data sets, with PointNet as the benchmark: since the point clouds in the S3DIS data set are relatively concentrated and the point cloud density is generally higher, the network model PVCRF proposed in this embodiment achieves a larger performance improvement, while on the vKITTI data set, since its point cloud distribution is generally sparser, the performance improvement is smaller.
(3) Analysis of qualitative results
The labeling results obtained by the different algorithms on the S3DIS and vKITTI data sets and the ground-truth labeling results are shown in Figure 15 and Figure 16. From left to right: the input colored point cloud, the labeling results based on PointNet, Point-VoxelNet, and PVCRF, and the ground-truth labeling result. It can be seen from the labeling results in Figures 15 and 16 that the results obtained by PVCRF are better than those of PointNet and Point-VoxelNet, with labeling results closer to the ground truth, verifying the validity of the multi-scale feature learning and the CRF back-end optimization in PVCRF. Comparing PointNet and Point-VoxelNet, the labeling results obtained by Point-VoxelNet are better than those of PointNet, mainly because the introduction of multi-scale feature learning in Point-VoxelNet improves the feature learning ability of the network model.
In indoor and outdoor scenes, targets with relatively high overlap or tight connection are still difficult for the Point-VoxelNet and PVCRF models to separate, such as the board and window in Figure 15 and the road and terrain in Figure 16. Since multi-scale feature extraction is considered in the Point-VoxelNet and PVCRF models, targets of relatively large size in the scene can be labeled well, such as the table in Figure 15 and the building in Figure 16.
Comparing Point-VoxelNet and PVCRF, since the fully connected CRF can extract large-scale contextual information in the scene point cloud, the details in the labeling results are more prominent, as shown by the chair and sofa in the first row of Figure 15 and the segmentation results of the road and terrain in Figure 16.
(4) Analysis of classification confusion matrices
Figures 17 and 18 show the classification confusion matrices obtained by the two network models Point-VoxelNet and PVCRF on the S3DIS and vKITTI data sets, respectively. The values in the matrix grid are category labeling accuracies, and the grid colors also represent the magnitude of the accuracy. Comparing the results of the two, it can be found that for the indoor data set S3DIS, the Point-VoxelNet and PVCRF models have comparable segmentation precision on all classes of targets, and the introduction of the CRF model mainly reduces the confusion between table and floor, among others. For the outdoor data set vKITTI, the introduction of the CRF model mainly reduces the degree of confusion between building and van, among others.
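The per-class accuracies on the matrix diagonal come from row-normalizing a raw confusion matrix; a minimal numpy sketch (illustrative names only):

```python
import numpy as np

def confusion_matrix_acc(pred, gt, num_classes):
    """Row-normalized confusion matrix: entry [i, j] is the fraction of
    points of true class i that were labeled as class j, so the diagonal
    holds the per-class labeling accuracy shown in the figures."""
    cm = np.zeros((num_classes, num_classes))
    for p, g in zip(pred, gt):
        cm[g, p] += 1
    row = cm.sum(axis=1, keepdims=True)
    return cm / np.maximum(row, 1)   # avoid division by zero for empty rows
```

Off-diagonal mass then directly measures the confusion between class pairs, such as table versus floor on S3DIS.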
From Figure 17(a) it can be seen that on the S3DIS data set the Point-VoxelNet network model achieves recognition accuracies above 52% for ceiling, floor, door, column, beam, window, bookcase, board, and chair. Lower labeling accuracies, between 35% and 46%, are obtained for sofa, wall, and clutter; the worst precision among the 13 target classes is for table, at 22%. The distribution of per-class average accuracies obtained by PVCRF on the S3DIS data set is similar to that of the Point-VoxelNet network model, but the precision generally increases, with the accuracy of table rising to 30%, as shown in Figure 18(a).
A similar comparison result also appears on the vKITTI data set, as shown in Figures 17(b) and 18(b). This comparison verifies the accurate grasp and modeling of target details in the scene by the CRF model. Comparing the results on the two data sets in Figures 17 and 18, the labeling precision on the vKITTI data set is generally lower, mainly because the point clouds in the vKITTI data set are generally sparser and more uneven than those in the S3DIS data set, so the neighborhood structure of points is less obvious, which leads to insufficient information extraction and is unfavorable for realizing high-precision point cloud labeling.
(5) Statistical analysis of computation
The computational efficiency of the different algorithms is analyzed experimentally on the S3DIS data set. The S3DIS data set contains 272 independent point clouds in total; each point cloud is labeled with the different algorithms and the mean test time is counted (only the neural network computation time is counted, not the data preparation stage). The statistical results are shown in Table 3. From Table 3 it can be seen that the computation time of PointNet is about 1.8 s. The network model PVCRF provided in this embodiment, which fuses the three-dimensional voxel convolutional neural network with CRF back-end optimization, takes the longest computation time, 4.52 s, because the introduction of the CRF significantly increases the amount of computation.
Table 3
The FC-CRF realized based on convolutional neural networks is used for further optimization to finally obtain a fine labeling result.
Further, referring to Figure 19, the three-dimensional point cloud labeling device 100 provided in an embodiment of the present invention includes a voxel processing and feature extraction module 110, a multi-scale voxel feature calculation module 120, a feature expansion module 130, and a point cloud labeling module 140.
The voxel processing and feature extraction module 110 is used to perform voxelization processing on the three-dimensional point cloud data set and, based on the processing result, perform voxel feature extraction within each voxel to form a first voxel feature matrix. In this embodiment, the above step S11 may be executed by the voxel processing and feature extraction module 110; that is, the specific description of the voxel processing and feature extraction module 110 may refer to step S11, which is not repeated here. Optionally, as shown in Figure 19, in this embodiment the voxel processing and feature extraction module 110 includes a voxel division unit 111, a point cloud classification unit 112, and a point cloud sampling unit 113.
The voxel division unit 111 is used to divide the point cloud coordinate space into multiple voxels according to a preset voxel size. In this embodiment, the above step S111 may be executed by the voxel division unit 111; that is, the specific description of the voxel division unit 111 may refer to step S111, which is not repeated here.
The point cloud classification unit 112 is used to classify each point in the three-dimensional point cloud data set into the corresponding voxel according to the grid parameters of the voxels. In this embodiment, the above step S112 may be executed by the point cloud classification unit 112; that is, the specific description of the point cloud classification unit 112 may refer to step S112, which is not repeated here.
The point cloud sampling unit 113 is used to sample the points in each voxel after classification so that the number of points in each voxel reaches a first preset value. In this embodiment, the above step S113 may be executed by the point cloud sampling unit 113; that is, the specific description of the point cloud sampling unit 113 may refer to step S113, which is not repeated here.
The multi-scale voxel feature calculation module 120 is used to take the first voxel feature matrix as the input of the three-dimensional convolutional neural network to calculate the multi-scale features of the voxels, and to perform feature concatenation and fusion on the multi-scale features to obtain a second voxel feature matrix. In this embodiment, the above step S12 may be executed by the multi-scale voxel feature calculation module 120; that is, the specific description of the multi-scale voxel feature calculation module 120 may refer to step S12, which is not repeated here.
The feature expansion module 130 is used to expand the voxel features in the second voxel feature matrix to each point in the three-dimensional point cloud data set based on the feature interpolation algorithm to obtain a point cloud feature matrix. In this embodiment, the above step S13 may be executed by the feature expansion module 130; that is, the specific description of the feature expansion module 130 may refer to step S13, which is not repeated here.
The point cloud labeling module 140 is used to input the point cloud feature matrix into the multilayer perceptron to realize the attribute labeling of the three-dimensional point cloud. In this embodiment, the above step S14 may be executed by the point cloud labeling module 140; that is, the specific description of the point cloud labeling module 140 may refer to step S14, which is not repeated here.
In summary, the embodiments of the present invention provide a three-dimensional point cloud labeling method and device based on fused voxels, which construct a multi-scale space on a regularized voxel model based on a voxel convolutional neural network to extract multi-scale voxel features, then expand the voxel features into point features by means of feature interpolation, thereby realizing finer-grained point-wise category recognition and point cloud labeling.
In the description of the present invention, the terms "setting", "connected", and "connection" shall be understood in a broad sense; for example, a connection may be a fixed connection, a detachable connection, or an integral connection; it may be a mechanical connection or an electrical connection; it may be a direct connection, an indirect connection through an intermediary, or an internal connection between two elements. For those of ordinary skill in the art, the specific meanings of the above terms in the present invention can be understood according to the specific circumstances. In the several embodiments provided by the present invention, it should be understood that the disclosed device and method can also be realized in other ways. The device and method embodiments described above are only schematic; for example, the flowcharts and block diagrams in the drawings show the possible architectures, functions, and operations of devices, methods, and computer program products according to embodiments of the present invention. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for realizing the specified logical functions.
It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two consecutive boxes may actually be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, can be realized by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.