CN110310298A - Road target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field - Google Patents
Road target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field - Download PDF
- Publication number: CN110310298A (application CN201910540355.2A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/143: Image analysis; Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
- G06T2207/10028: Image acquisition modality: Range image; Depth image; 3D point clouds
- G06T2207/20076: Special algorithmic details: Probabilistic image processing
- G06T2207/20081: Special algorithmic details: Training; Learning
- G06T2207/20084: Special algorithmic details: Artificial neural networks [ANN]
- Y02T10/40: Climate change mitigation technologies related to transportation: Engine management systems
Abstract
The present invention relates to real-time three-dimensional point cloud segmentation of road targets in the surrounding environment for autonomous driving and unmanned aerial vehicles in complex low-altitude road scenes, and in particular to a road target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field. The method is characterized by comprising at least the following steps: step 1, acquire point cloud data and obtain an H*W*C three-dimensional matrix; step 2, perform network architecture optimization on the above data; step 3, output the minimized probability map, which gives the three-dimensional point cloud segmentation. The model proposed by the present invention can segment road targets out of the point cloud in real time, and the engineered model meets the requirements of real-time operation and stability, so the method has practical value.
Description
Technical field
The present invention relates to real-time three-dimensional point cloud segmentation of the surrounding environment for autonomous driving and unmanned aerial vehicles in complex low-altitude road scenes, and in particular to a road target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field.
Background technique
In recent years, unmanned aerial vehicles (UAVs) have gradually moved into low-altitude activities such as express delivery, regional low-altitude safety inspection, road defect detection, and vehicle detection statistics over temporary road segments. These tasks require human-machine interaction and perception of the targets and surroundings in the low-altitude scene, so that the UAV can fly safely while completing its mission. Meanwhile, autonomous driving has attracted worldwide research attention in recent years, with in-depth studies around perception, decision-making and control. Both low-altitude UAV flight and autonomous driving systems depend on accurate, real-time and robust environment perception. In low-altitude road scenes, UAVs and autonomous vehicles need to precisely classify and localize road targets such as vehicles, pedestrians, cyclists and other obstacles. Among sensing modalities, cognition methods based on radar point cloud data reflect the real scene most directly and have wide application prospects. It is therefore both necessary and conducive to the development of UAVs and autonomous driving to study a three-dimensional perception technique for low-altitude road scenes.
Traditional point cloud segmentation methods usually comprise the following four common operations: remove the ground, cluster points into instances, extract features from each cluster, and classify the clusters according to those features. Such conventional methods depend heavily on hand-designed features, ignore the correlations within the data, and cannot obtain satisfactory segmentation results. Researchers later tried to interpret the segmentation problem from a probabilistic angle and proposed conditional random field (CRF) formulations for semantic segmentation, adding smoothness constraints between pixels during segmentation, which helps improve edge segmentation quality. The appearance of the FCN network demonstrated that convolutional neural networks can be used for image segmentation tasks: the network completes feature extraction from raw data by self-learning and, in an end-to-end manner, produces segmentation results at the original resolution of the input data; FCN experiments showed that deep networks perform far better than conventional methods on image segmentation. Influenced by the deep learning wave, S. Zheng et al. converted the mathematical formulation of the conditional random field into an RNN-style neural network structure. The DeepLab network then combined these two advanced segmentation methods: an FCN first produces a preliminary segmentation, and the result is then processed by a conditional random field before output. However, training covers only the FCN part, so the FCN and the CRF are effectively separated without joint training; although DeepLab segments better than FCN, the improvement is not obvious. Subsequently, point cloud segmentation networks emerged that preprocess the three-dimensional point cloud data and then apply neural network techniques. Zhou Y et al. proposed the VoxelNet model, which completes 3D detection and segmentation of objects on point cloud data and obtained good experimental results on the KITTI dataset, but the model is bulky and its real-time performance is poor. Qi C R et al. proposed the PointNet model, which can simultaneously complete point cloud classification, semantic segmentation and part segmentation; it processes each point of the unordered cloud independently, thereby achieving point cloud processing that is invariant to input order. However, when computing the score of each point, the model directly combines point features with global features, ignoring local features, which weakens the final segmentation edges. Moreover, a characteristic of point cloud data is uneven point density ("dense when near, sparse when far"); under such density differences, processing the data with a uniform template is computationally inefficient, and the model cannot run in real time. From this analysis of the state of the art, point cloud segmentation technology is in a stage of rapid development, and questions such as how to train networks more efficiently, how to compress the network model while maintaining segmentation accuracy, and how to improve model real-time performance and computational efficiency are all directions to be studied.
Summary of the invention
The object of the present invention is to provide a road target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field whose model has good real-time performance, high computational efficiency, and a good balance between model compression and segmentation accuracy.
The object of the present invention is achieved as follows: a road target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field, characterized by comprising at least the following steps:
Step 1, acquire point cloud data and obtain an H*W*C three-dimensional matrix;
Step 2, perform network architecture optimization on the above data;
Step 3, output the minimized probability map, which gives the three-dimensional point cloud segmentation.
The step 1 includes step in detail below:
Step 1.1) acquire point cloud data; the point cloud consists of unstructured three-dimensional spatial points, and each spatial point carries data of 5 dimensions: a set of Cartesian three-dimensional space coordinates (x, y, z), the intensity value i of the current point, and its range (distance);
Step 1.2) convert the point cloud data into a data matrix that meets the network's data requirements by means of spherical mapping, the spherical mapping equation being shown in formula (1),
where θ and φ respectively denote the zenith angle and the azimuth, Δθ and Δφ respectively denote the discretization precision on θ and φ, and (θ̃, φ̃) are the discretized values; each discretized pair (θ̃, φ̃) maps to one position on a 2D grid map;
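Formula (1) itself is not reproduced in the text (it was an image in the original). A hedged reconstruction, consistent with the symbol definitions above and with the usual spherical projection of LiDAR point clouds, is:

```latex
\theta = \arcsin\!\frac{z}{\sqrt{x^{2}+y^{2}+z^{2}}},\qquad
\varphi = \arcsin\!\frac{y}{\sqrt{x^{2}+y^{2}}},\qquad
\tilde{\theta} = \Big\lfloor \frac{\theta}{\Delta\theta} \Big\rfloor,\qquad
\tilde{\varphi} = \Big\lfloor \frac{\varphi}{\Delta\varphi} \Big\rfloor \tag{1}
```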
Step 1.3) apply the above spherical mapping to every point of a point cloud map to obtain an H*W*C three-dimensional matrix; H and W respectively correspond to the discretized mapping values θ̃ and φ̃, and C equals 5, the channels respectively corresponding to the x, y, z coordinate values, the intensity value and the range value of each LiDAR point.
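As an illustration of steps 1.2) and 1.3), the following sketch in plain Python projects an unordered cloud onto the H*W*C grid. The 64*512 grid size matches the data scale given later in the specification, but the field-of-view limits and the row/column offsets are illustrative assumptions, not values taken from the patent:

```python
import math

def spherical_project(points, H=64, W=512, fov_up=2.0, fov_down=-24.9, h_fov=90.0):
    """Project an unordered LiDAR point cloud onto an H x W x 5 grid.

    Each input point is (x, y, z, intensity); the 5 channels of a grid
    cell are x, y, z, intensity and range. FOV values (degrees) are
    illustrative defaults for a 64-beam LiDAR.
    """
    grid = [[[0.0] * 5 for _ in range(W)] for _ in range(H)]
    v_fov = math.radians(fov_up - fov_down)        # vertical FOV = H * delta_theta
    for x, y, z, intensity in points:
        r = math.sqrt(x * x + y * y + z * z)
        rxy = math.hypot(x, y)
        if r == 0.0 or rxy == 0.0:
            continue
        theta = math.asin(z / r)                   # elevation angle
        phi = math.asin(y / rxy)                   # azimuth
        # discretise: row index from elevation, column index from azimuth
        row = int((math.radians(fov_up) - theta) / v_fov * H)
        col = int((math.radians(h_fov) / 2 - phi) / math.radians(h_fov) * W)
        if 0 <= row < H and 0 <= col < W:
            grid[row][col] = [x, y, z, intensity, r]
    return grid
```

The result is the ordered, dense, image-like tensor that the network consumes; points outside the assumed field of view are simply dropped.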
The step 2 includes:
Step 2.1) the input H*W*C tensor first passes through the Fire squeeze (compression) layer, which consists of 1*1 convolution kernels whose number equals 1/4 of the input tensor's channel count; while completing feature extraction, the squeeze layer thereby also compresses the data depth;
Step 2.2) the data then passes through the FireDeconv module;
Step 2.3) build the recurrent conditional random field network;
Step 2.4) define the macroscopic optimization objective of the network model. The cross-entropy is computed as shown below,
where m denotes the resolution size, i.e. the total number of points, L_i denotes the cross-entropy loss of each point, and P_i(c_i) denotes the probability that the i-th point is classified as its true label c_i.
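The cross-entropy formula is not reproduced in the text; assuming the standard per-point loss L_i = -log P_i(c_i) averaged over all m points (which is consistent with the symbol definitions above), a minimal sketch is:

```python
import math

def cross_entropy_loss(prob_maps, labels):
    """Mean cross-entropy over all m points.

    prob_maps: per-point lists of class probabilities P_i.
    labels:    true class index c_i for each point.
    Computes L = (1/m) * sum_i( -log P_i(c_i) ).
    """
    m = len(labels)
    return sum(-math.log(p[c]) for p, c in zip(prob_maps, labels)) / m

# example: two points, 4 classes (vehicle, cyclist, pedestrian, background)
loss = cross_entropy_loss([[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25]], [0, 3])
```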
The step 2.1) includes:
a) the squeeze-layer data is expanded in tensor depth by a 1*1 Conv convolutional layer and a 3*3 Conv convolutional layer in parallel, each branch producing 1/2 of the output tensor depth;
b) the data from the two convolution operations is stacked along the depth dimension, outputting an H*W*C tensor.
The step 2.2) includes:
a) the input H*W*C tensor enters the FireDeconv module and passes through Conv1*1, completing feature extraction and depth compression;
b) the data is then upsampled, with the depth remaining unchanged; the expansion layers then each expand the data to 1/2 of the output depth;
c) the data from the two convolution operations is stacked along the depth dimension, outputting an H*2W*C tensor.
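The tensor bookkeeping of steps 2.1) and 2.2) can be sketched as shape arithmetic (depths only; the convolution weights themselves are omitted). The 1/4 squeeze ratio and the two half-depth expand branches follow the description above; everything else is illustrative:

```python
def fire_shapes(H, W, c_in, c_out):
    """Shapes through a Fire module; spatial size is unchanged."""
    squeeze = (H, W, c_in // 4)                  # 1*1 squeeze: 1/4 of input channels
    expand_half = c_out // 2                     # each expand branch: 1/2 of output depth
    out = (H, W, expand_half + expand_half)      # depth-wise concat of 1*1 and 3*3 branches
    return squeeze, out

def firedeconv_shapes(H, W, c_in, c_out):
    """FireDeconv: squeeze, upsample width x2, expand, concat -> H x 2W x c_out."""
    squeeze = (H, W, c_in // 4)                  # Conv1*1 squeeze
    upsampled = (H, 2 * W, c_in // 4)            # deconv doubles the width, depth unchanged
    out = (H, 2 * W, c_out)                      # concat of the two expand branches
    return squeeze, upsampled, out
```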
The step 2.3) includes:
Step 2.3.1) define the energy function of the recurrent conditional random field,
where c is the prediction label vector of the point cloud, c_i denotes the predicted label of the i-th point, u_i(c_i) = -logP(c_i) represents the cross-entropy probability value of the i-th point being predicted as class c_i, and b_{i,j}(c_i, c_j) is a penalty imposed on the behaviour of assigning different labels to a pair of similar points; the specific penalty b_{i,j}(c_i, c_j) is defined as follows:
when c_i ≠ c_j, h(c_i, c_j) = 1, otherwise h(c_i, c_j) = 0; f_i, f_j denote the features of the i-th and j-th points, k_m denotes the m-th Gaussian filter taking the features of the i-th and j-th points as input, and w_m denotes the corresponding coefficient. The following kernels are used,
where the vector Θ(θ, φ) denotes the angular information of a point, the vector X(x, y, z) denotes the Cartesian coordinates of a point, and σ_α, σ_β, σ_γ are a set of hyperparameters whose values are generally chosen by experience;
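The energy function and kernel formulas were images in the original and are not reproduced. A hedged reconstruction consistent with the definitions above (and with the standard dense-CRF formulation the text paraphrases) is:

```latex
E(\mathbf{c}) = \sum_{i} u_i(c_i) + \sum_{i,j} b_{i,j}(c_i,c_j),
\qquad u_i(c_i) = -\log P(c_i),
\qquad
b_{i,j}(c_i,c_j) = h(c_i,c_j)\sum_{m} w_m\, k_m(f_i,f_j),

k_1(f_i,f_j) = \exp\!\Big(-\frac{\lVert\Theta_i-\Theta_j\rVert^{2}}{2\sigma_\alpha^{2}}
-\frac{\lVert X_i - X_j\rVert^{2}}{2\sigma_\beta^{2}}\Big),
\qquad
k_2(f_i,f_j) = \exp\!\Big(-\frac{\lVert\Theta_i-\Theta_j\rVert^{2}}{2\sigma_\gamma^{2}}\Big)
```

Here k_1 uses both the angular and Cartesian information of the two points and k_2 only their angular information, matching the description in the detailed embodiment.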
Step 2.3.2) build the recurrent neural network structure (RNN) for the mean-field iteration of the energy function:
a) the output of the base neural network is input into the recurrent neural network RNN as a probability map;
b) the LiDAR initial feature data is processed by formula (4), obtaining two Gaussian kernels;
c) the scale of the Gaussian kernels is set to a 3*5 local region; the Gaussian kernels are computed on the probability map in the manner of a locally connected layer, completing the filtering of the data and the message passing;
d) the aggregated probabilities from step c) are re-weighted with a 1*1 convolution kernel and passed through the compatibility transform, so as to change the distribution of each point; the 1*1 convolution kernel parameters are learned during network training;
e) the initial probability map is superimposed on the re-weighted and compatibility-transformed probability map, and the minimized probability map is output through the Softmax normalization operation.
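The steps a) to e) above can be sketched as a simplified mean-field loop. The patent applies two Gaussian kernels over angle/XYZ features within a 3*5 region of a 2D probability map; the reduction below to one scalar kernel over a 1-D point list, and the Potts-style compatibility transform, are illustrative assumptions kept only to show the control flow:

```python
import math

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def mean_field(unary_probs, feats, sigma=1.0, w=1.0, n_iter=3):
    """Simplified mean-field iteration over a 1-D list of points.

    unary_probs: initial probability map from the base network (one row per point).
    feats:       one scalar feature per point driving a single Gaussian kernel.
    """
    q = [list(p) for p in unary_probs]
    n, k = len(q), len(unary_probs[0])
    for _ in range(n_iter):  # the patent fixes 3 iterations
        # steps b)+c): Gaussian-kernel message passing over the other points
        msg = [[sum(math.exp(-(feats[i] - feats[j]) ** 2 / (2 * sigma ** 2)) * q[j][c]
                    for j in range(n) if j != i) for c in range(k)]
               for i in range(n)]
        # step d): weighting + Potts compatibility transform (penalise the mass
        # that neighbours assign to *other* labels); w plays the 1*1-conv role
        pairwise = [[w * (sum(msg[i]) - msg[i][c]) for c in range(k)]
                    for i in range(n)]
        # step e): superimpose the initial (unary) probabilities and renormalise
        q = [softmax([math.log(unary_probs[i][c] + 1e-9) - pairwise[i][c]
                      for c in range(k)]) for i in range(n)]
    return q
```

Running it on a confident point next to an ambiguous one shows the intended smoothing: the ambiguous point is pulled toward the label of its similar, confident neighbour.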
The present invention proposes RobNet, a real-time three-dimensional point cloud segmentation model based on SqueezeNet and a recurrent CRF. The model is macroscopically divided into a base network and a recurrent network, each elaborated in detail at the micro level, and an engineering deployment is finally realized under the ROS framework. Using LiDAR point clouds as raw data effectively overcomes the influence of external factors such as light and weather on the algorithm, making the model more robust; at the same time, LiDAR data contains range information, and the range information of the environment can be used directly by subsequent UAV and autonomous-driving planning and control algorithms. In the network design, the compression strategy of SqueezeNet is referenced, which effectively reduces the number of network parameters and compresses the model size, and the recurrent conditional random field refines the output results, so that the proposed model can segment the road targets out of the point cloud well in real time; the engineered model meets the requirements of real-time operation and stability, and the model proposed by the present invention has practical value.
Table 1 lists all the symbols used in this specification and their meanings.
The invention is further described below with reference to the embodiments and the accompanying drawings:
Brief description of the drawings
Fig. 1 is a schematic diagram of the spherical mapping;
Fig. 2 shows the spherical mapping result of the point cloud data;
Fig. 3 shows the macroscopic design of the network model;
Fig. 4 shows the microscopic design of the base network;
Fig. 5 shows the microstructure of the Fire and FireDeconv modules, where Fig. 5(a) is the Fire structure and Fig. 5(b) is the FireDeconv structure;
Fig. 6 shows the RNN realization of the conditional random field CRF;
Fig. 7 illustrates the model optimization objective;
Fig. 8 is a schematic diagram of the model's point cloud segmentation function;
Fig. 9(a) shows the model training loss;
Fig. 9(b) shows the model training learning-rate schedule;
Fig. 10(1) training set evaluation: vehicle IoU score variation;
Fig. 10(2) training set evaluation: cyclist IoU score variation;
Fig. 10(3) training set evaluation: pedestrian IoU score variation;
Fig. 11(1) validation set evaluation: vehicle IoU score variation;
Fig. 11(2) validation set evaluation: vehicle precision score variation;
Fig. 11(3) validation set evaluation: vehicle recall score variation;
Fig. 11(4) validation set evaluation: cyclist IoU score variation;
Fig. 11(5) validation set evaluation: cyclist precision score variation;
Fig. 11(6) validation set evaluation: cyclist recall score variation;
Fig. 11(7) validation set evaluation: pedestrian IoU score variation;
Fig. 11(8) validation set evaluation: pedestrian precision score variation;
Fig. 11(9) validation set evaluation: pedestrian recall score variation;
Fig. 12(a) segmentation results, group 1 of 3;
Fig. 12(b) segmentation results, group 2 of 3;
Fig. 12(c) segmentation results, group 3 of 3;
Fig. 13 shows the logic flow of the ROS engineering code;
Fig. 14 shows the RVIZ real-time visualization result.
Specific embodiment
With reference to the accompanying drawings, the present invention is analysed concretely: a road target real-time three-dimensional point cloud segmentation method based on SqueezeNet and a recurrent CRF, characterized by comprising at least the following steps:
Step 1, acquire point cloud data and obtain an H*W*C three-dimensional matrix;
Step 2, perform network architecture optimization on the above data;
Step 3, output the minimized probability map, which gives the three-dimensional point cloud segmentation.
As shown in Figure 1, the step 1 includes the following detailed steps:
LiDAR point cloud data consists of a series of unstructured three-dimensional spatial points, each of which carries data of 5 dimensions: a set of Cartesian three-dimensional space coordinates (x, y, z), the intensity value i of the current point, and its range (distance). Point cloud data differs considerably from image data in its characteristics, so a data conversion via spherical mapping is required. This method expresses LiDAR point cloud data more compactly, improves the computational efficiency of the network, and at the same time satisfies the network's data format requirements. The spherical mapping equation is shown in formula (1),
where θ and φ respectively denote the zenith angle and the azimuth, Δθ and Δφ respectively denote the discretization precision on θ and φ, and (θ̃, φ̃) are the discretized values; each discretized pair maps to one position on a 2D grid map. Applying the above spherical mapping to each point of a point cloud map yields an H*W*C three-dimensional matrix: H and W respectively correspond to the discretized mapping values, and C equals 5, the channels respectively corresponding to the x, y, z coordinate values, the intensity value and the range value of each LiDAR point.
Through the above spherical conversion, the unordered, sparse three-dimensional point cloud data has been processed into an ordered, dense, image-like standard data format; the processing result is shown in Figure 2.
The step 2 includes the following detailed steps:
2.1 Network model macroscopic design
The macroscopic design of the network model is shown in Figure 3. When designing the basic convolutional neural network part of the model, the SqueezeNet network is referenced and optimized with residual connections, which greatly compresses the network model size and reduces the model's computation. As shown in Figure 3, the layers from Conv1a to Conv14 form the base network part of the model. The model input is the standard point cloud data after preprocessing, with a three-dimensional scale of 64*512*5; the input data passes through general convolution, max pooling and Fire operations that extract features from the data. When performing the down-sampling operations, it is considered that the transverse width of the data is much larger than the longitudinal height, which is only 64; therefore only the transverse width is compressed by sampling, while the longitudinal height remains unchanged. The four layers from FireDeconv10 to FireDeconv13 gradually complete the up-sampling of the image, and the resolution of the data output by the FireDeconv layers reaches agreement with the input resolution. During up-sampling, skip connections from data of equal resolution are added and superimposed on the data. Conv14 performs a convolution operation on the data, changing the data depth to 4, and carries out the Softmax operation to output the probability feature map. The probability feature map output by the base network is further refined at the pixel level by the recurrent conditional random field, and the final output is a label feature map with the same resolution as the input data, i.e. the segmentation result.
2.2 Network model microscopic design
2.2.1 Base network microscopic design
The base network is selected on the basis of the SqueezeNet network and fine-tuned. The complete design of the base network is shown in Figure 4, where the data depth of each layer is identified. The entire base network consists of Conv layers, Fire layers, FireDeconv layers, pooling layers and a Softmax layer. Residual connections are added between Fire2 and Fire3, between Fire4 and Fire5, between Fire6 and Fire7, and between Fire8 and Fire9.
2.2.2 Fire module and FireDeconv module design
The structures of Fire and FireDeconv are shown in Figure 5. The role of the Fire module is to perform convolution operations on the input tensor, producing an output tensor of the same scale as the input. The input tensor H*W*C first passes through the squeeze layer of Fire, which consists of 1*1 convolution kernels whose number equals 1/4 of the input tensor's channel count; while completing feature extraction, this layer compresses the data depth. The data then passes through the expansion layer, where Conv1*1 and Conv3*3 convolutions each expand the squeeze-layer data in depth to 1/2 of the output tensor depth; finally the data from the two convolution operations of the expansion layer is stacked along the depth dimension, outputting an H*W*C tensor. FireDeconv is similar to the Fire module and likewise significantly lowers the number of parameters and the computation of the network layer, except that it is used to up-sample the input data. The input tensor H*W*C enters the FireDeconv module and first passes through Conv1*1, completing feature extraction and depth compression; then Deconv upsample*2 up-samples the data, with depth unchanged. The expansion layer then expands the data in depth to 1/2 of the output depth per branch, and finally the data from the two convolution operations is simply stacked along the depth dimension, outputting an H*2W*C tensor.
The parameter compression rates of the Fire and FireDeconv modules are shown in Table 1.
Table 1: compression effectiveness statistics of the Fire and FireDeconv modules

| | 3*3 Conv | Fire | 1*4 Deconv | FireDeconv |
| --- | --- | --- | --- | --- |
| Number of parameters | 9C² | 3C²/2 | 4C² | 7C²/2 |
| Computation | 9HWC² | 3HWC²/2 | 4HWC² | 7HWC²/2 |
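Part of Table 1 can be reproduced directly from the module descriptions. The sketch below (assuming input depth = output depth = C and ignoring biases, which appears to be how the table counts) checks the 3*3 Conv, 1*4 Deconv and Fire figures; the FireDeconv total depends on where the deconvolution sits in the pipeline and is not re-derived here:

```python
def conv_params(k_h, k_w, c_in, c_out):
    """Weight count of a k_h*k_w convolution layer (biases ignored)."""
    return k_h * k_w * c_in * c_out

def fire_params(C):
    # squeeze 1*1: C -> C/4; expand branches 1*1 and 3*3: C/4 -> C/2 each
    squeeze = conv_params(1, 1, C, C // 4)
    expand_1x1 = conv_params(1, 1, C // 4, C // 2)
    expand_3x3 = conv_params(3, 3, C // 4, C // 2)
    return squeeze + expand_1x1 + expand_3x3

C = 64
assert conv_params(3, 3, C, C) == 9 * C * C      # plain 3*3 conv: 9C^2
assert conv_params(1, 4, C, C) == 4 * C * C      # plain 1*4 deconv: 4C^2
assert fire_params(C) == 3 * C * C // 2          # Fire: 3C^2/2, a 6x reduction
```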
2.2.3 Recurrent conditional random field network construction
Define the prediction label vector of the point cloud as c, with c_i denoting the predicted label of the i-th point, and define the energy function of the conditional random field as follows,
where u_i(c_i) = -logP(c_i) represents the cross-entropy probability value of the i-th point being predicted as class c_i, and b_{i,j}(c_i, c_j) is a penalty imposed on the behaviour of assigning different labels to a pair of similar points; the specific penalty b_{i,j}(c_i, c_j) is defined as follows:
when c_i ≠ c_j, h(c_i, c_j) = 1, otherwise h(c_i, c_j) = 0. f_i, f_j denote the features of the i-th and j-th points, k_m denotes the m-th Gaussian filter taking the features of the i-th and j-th points as input, and w_m denotes the corresponding coefficient. In the algorithm design here, the following two filters are used,
where the vector Θ(θ, φ) denotes the angular information of a point and the vector X(x, y, z) denotes the Cartesian coordinates of a point; the first filter in the above formula uses both the angular information and the Cartesian coordinate information of the two points, while the second filter involves only their angular information. σ_α, σ_β, σ_γ are a set of hyperparameters whose values are generally chosen by experience.
The RNN network structure of the mean-field iteration of the energy function is shown in Figure 6: the output of the CNN is input into the RNN as a probability map. Formula (4) is computed on the LiDAR initial feature data to obtain two Gaussian kernels. As two points move apart in 3D Cartesian space and in the 2D angular space, the value of the Gaussian kernel drops rapidly; therefore, when two points are far apart, the above Gaussian evaluation becomes almost meaningless, so the scale of the Gaussian kernel is here set to a 3*5 local region. As shown in the first layer of Figure 6, the Gaussian kernels are computed on the probability map in the manner of a locally connected layer, completing the filtering of the data and the message passing; this step aggregates the probabilities of neighbouring points. The aggregated probabilities are then re-weighted and passed through the compatibility transform, which is used to change the distribution of each point; this step can be realized with a 1*1 convolution kernel, whose parameters can be learned during network training. Finally, in the third layer, the initial probability map is superimposed on the re-weighted and compatibility-transformed probability map, and the final refined probability map is then output by the Softmax normalization operation. The RNN part in the figure can be iterated; the number of iterations is here set to 3.
2.2.4 network model macroscopic view optimization aim
The overall segmentation objective of the network model measures the consistency between the probability map output by the RNN and the true labels. As shown in Fig. 8, the 3D block is the probability map inside the RNN; the slice resolution equals the resolution of the original CNN input, and the depth is 4, each value along the depth representing the probability of one class (vehicle, cyclist, pedestrian, background). The cross entropy is computed as

L = (1/m) · Σᵢ Lᵢ,  Lᵢ = −log(Pᵢ^(cᵢ))   (5)

where m is the resolution size, i.e. the total number of points, Lᵢ is the cross-entropy loss of point i, and Pᵢ^(cᵢ) is the probability that point i is classified with its true label cᵢ. Minimizing formula (5) yields the final segmentation result of the network.
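As a concrete illustration of formula (5), here is a minimal NumPy sketch of the pointwise cross entropy; the function name and array layout are illustrative, since the patent does not specify an implementation:

```python
import numpy as np

def point_cross_entropy(prob_map, labels):
    """Mean cross entropy over all m points of the refined probability map.

    prob_map : (m, n_classes) softmax output of the RNN, one row per point
    labels   : (m,) true class index c_i of each point
    """
    m = labels.shape[0]
    p_true = prob_map[np.arange(m), labels]   # P_i^{c_i}: probability of the true label
    return -np.log(p_true + 1e-12).sum() / m  # L = (1/m) * sum_i (-log P_i^{c_i})
```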
Step 3 comprises the following steps:
3.1 Evaluation criteria and experimental platform
3.1.1 Experimental platform and dataset
The hardware platform used for model training and ROS deployment is: 1) processor: Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz; 2) memory: 8GB; 3) GPU: NVIDIA 1060, 6GB video memory. The software versions are: 1) operating system: Ubuntu 14.04; 2) deep-learning framework: TensorFlow 1.4.0; 3) ROS: Indigo; 4) Python: 2.7. The dataset acquisition method is described in Section 2: the 10848 frames of raw data converted from the KITTI dataset are divided into a training set of 8057 frames and a validation set of 2791 frames. The split is made as uniform as possible over the time sequence of the data, to prevent frames from clustering within a particular time period.
3.1.2 Evaluation criteria
The evaluation uses the standard indices of pointwise classification segmentation tasks: by comparing the pointwise predictions with the true labels, the prediction precision Pr_c, recall recall_c and intersection-over-union score IoU_c of each class c are obtained, defined as

Pr_c = |P_c ∩ G_c| / |P_c|,  recall_c = |P_c ∩ G_c| / |G_c|,  IoU_c = |P_c ∩ G_c| / |P_c ∪ G_c|

where P_c and G_c denote, respectively, the set of points predicted as class c and the set of ground-truth points of class c. The proposed network model thus realizes the segmentation function shown in Fig. 8: the input is unordered point cloud data and the output is point cloud data with class labels.
3.1.3 Model size calculation
Table 2 gives complete statistics of the data dimensions, number of learnable parameters and filter hyperparameters of each layer of the proposed network model; cells left blank in the table correspond to layers without the corresponding attribute and are not counted.
Table 2 Parameter statistics of the proposed model
As shown in Table 2, the total parameter count of the proposed network model is 904880. With single-precision floating-point parameters of 32 bit (4 byte), the size of the entire model is about 3.5MB. As shown in Table 3, common image classification networks are far larger: AlexNet is 240MB, VGG16 is 500MB and GoogLeNet is 50MB; among common point cloud detection and segmentation networks, 3DmFV-Net is 45MB and PCNN is 8MB. A 3.5MB model is much easier to deploy on intelligent platforms with limited computing resources.
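The 3.5MB figure follows directly from the parameter count; a one-line check (helper name illustrative):

```python
def model_size_mb(n_params, bytes_per_param=4):
    """Model size in MB for 32-bit (4-byte) floating-point parameters."""
    return n_params * bytes_per_param / (1024 ** 2)
```

With the 904880 parameters of Table 2 this gives roughly 3.45MB, matching the "about 3.5MB" stated above.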
Table 3 Comparison of model parameter sizes
3.2 Model training and evaluation
The dataset used by the present invention is converted from the public KITTI dataset and contains 10848 frames in total, of which 8057 frames form the training set and 2791 frames the validation set. Before training, the pre-trained parameters of the base network are loaded; the total number of training steps is set to 25000, the batch size to 16 and the initial learning rate to 0.01, the learning rate is halved every 10000 steps, and the momentum is set to 0.9. The training process is visualized in real time, recording the model loss and the precision, recall and IoU scores on the training and validation sets. The training loss and learning-rate curves are shown in Fig. 9: Fig. 9(a) shows the training loss and Fig. 9(b) the learning-rate schedule; the model has essentially converged by about 21k training steps. The IoU indices of the three classes on the training set over the whole training process are shown in Fig. 10: Fig. 10(1) the vehicle IoU of the training data, Fig. 10(2) the cyclist IoU and Fig. 10(3) the pedestrian IoU. The dark yellow curves in Fig. 9 and Fig. 10 are smoothed versions of the light yellow curves, to highlight the overall trend.
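The stepwise schedule described above (initial rate 0.01, halved every 10000 steps) can be sketched as a small function; the name is illustrative and this stands in for whatever TensorFlow schedule the original training script used:

```python
def learning_rate(step, base_lr=0.01, decay_steps=10000, decay=0.5):
    """Piecewise-constant schedule: halve the learning rate every 10000 steps."""
    return base_lr * decay ** (step // decay_steps)
```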
More detailed evaluations are recorded on the validation set. Fig. 11 shows the segmentation results for the three classes on the validation set: Fig. 11(1) vehicle IoU; Fig. 11(2) vehicle precision; Fig. 11(3) vehicle recall; Fig. 11(4) cyclist IoU; Fig. 11(5) cyclist precision; Fig. 11(6) cyclist recall; Fig. 11(7) pedestrian IoU; Fig. 11(8) pedestrian precision; Fig. 11(9) pedestrian recall. For each class the evaluation comprises the three scores precision, recall and IoU. In Fig. 11 the light blue curves are the predicted score of each target at each iteration, and the dark blue curves are smoothed versions of the predicted scores, making the overall trend over the iterations easier to observe.
To show the segmentation quality of the model intuitively, the differences between its predictions and the ground-truth labels are visualized. Three groups of experimental comparisons are listed, each comprising the LiDAR depth map of the current frame (first), the ground-truth label map (second) and the predicted segmentation map (third), as shown in Fig. 12.
In Fig. 12, the green mask represents vehicles and the light blue mask represents pedestrians and cyclists. Observing Fig. 12(a), the model accurately segments the road targets in the scene, including points around individuals, such as the cyclist in the enlarged crop; this also explains the high recall of the model. The model even segments the vehicle in the lower-left corner that is missing from the ground-truth labels, showing that it has good reasoning and generalization ability. Fig. 12(b) shows that the model segments vehicle targets well: the prediction is essentially consistent with the ground truth. Observing Fig. 12(c), a small part of the point cloud is mistakenly segmented as pedestrian; in fact, for these points the pedestrian score is only marginally higher than the background score, and they are classified as pedestrian because they carry some pedestrian-like characteristics of their own. Such a more conservative segmentation strategy serves UAV and autonomous-driving applications and basically conforms to their safety principles.
Table 4 Pointwise classification segmentation performance of the model
The pointwise classification performance of the model is given in Table 4. From Table 4 and Fig. 11 it can be seen that the model segments vehicles well: the precision reaches 61.8%, the recall is close to 100%, and the IoU reaches 59.9, outperforming VoxelNet, PointNet and MS+CU on vehicle segmentation. Recall is a critical index in UAV flight and autonomous-driving scenes, because under real road conditions it is preferable to wrongly segment points near a target as that target than to segment points that do belong to the target as background or another class. This more conservative segmentation strategy serves the subsequent motion planning and action decisions of UAVs and autonomous vehicles, and reduces the possibility of collisions. The high recall on all three target classes meets the safety requirements of autonomously moving agents.
3.3 Real-time performance and stability of the engineered model under the ROS framework
The whole model is engineered under the ROS framework; the engineering code flow under ROS is shown in Fig. 13. First, the 2D-converted LiDAR point cloud data are loaded in sequence, the network model proposed herein (implemented on the TensorFlow framework) is called in the program, and the trained network parameters are loaded into the TensorFlow model. Each frame is then processed in a loop: the point cloud of the current frame is read, the model is called to complete the segmentation, and the segmentation result is packed into the ROS data format and published as a topic. RViz, a visualization tool under ROS, loads the environment configuration file, subscribes in real time to the information on the /module_seg/points topic, and visualizes the point cloud segmentation result in real time.
The real-time visualization result in RViz is shown in Fig. 14; meanwhile, the program records the processing time of each frame of point cloud data. The real-time performance of the system is given in Table 5, which reports the mean and standard deviation of the per-frame processing time and the average frame rate of the system. When the engineered model runs, the processing time per frame of point cloud data is 82.5ms, giving an average frame rate of 12.1Hz; since the scan frequency of a LiDAR is generally around 10Hz, the engineered code meets the performance requirement for real-time operation. The standard deviation of the running time is 4.5ms, meaning the per-frame processing time varies very little and the program is stable: no single frame will stall and disturb the other running components of the agent, which guarantees stable task execution by UAVs and autonomous vehicles in low-altitude and road scenes.
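The real-time figures reported in Table 5 relate per-frame time and frame rate in a simple way; a small sketch of the statistics the program records (function name illustrative, example timings other than 82.5ms are hypothetical):

```python
import numpy as np

def timing_stats(frame_times_s):
    """Mean processing time, its standard deviation, and average frame rate."""
    t = np.asarray(frame_times_s)
    return t.mean(), t.std(), 1.0 / t.mean()  # frame rate = 1 / mean time
```

A mean per-frame time of 0.0825s corresponds to 1/0.0825 ≈ 12.1Hz, as stated above.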
Table 5 Real-time performance and stability of the engineered model
The present invention proposes RobNet, a neural network model suitable for real-time 3D point cloud segmentation by UAVs and autonomous vehicles in low-altitude street scenes. It first describes the basic theory of convolutional neural networks and model compression together with the acquisition of semantically labelled point cloud data, and converts the 3D point cloud to 2D by the spherical mapping method. It then designs the overall network model from the macroscopic view and describes in detail, at the microscopic level, the base network of the first half and the recurrent conditional random field network of the second half, covering the residual connections of the base network, the internal design of the Fire and FireDeconv modules, and the derivation and construction of the RNN. Finally, the proposed model is engineered under the ROS framework. The experimental results show that the proposed model can segment the road targets in the point cloud well in real time, while the model size remains small. The engineered model meets the requirements of real-time operation and stability, so the model proposed by the present invention has practical value.
Claims (6)
1. A road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field, characterized by comprising at least the following steps:
Step 1: acquire point cloud data and obtain an H*W*C three-dimensional matrix;
Step 2: perform network structure optimization on the above data;
Step 3: output the minimized probability map, which gives the three-dimensional point cloud segmentation.
2. The road-target real-time three-dimensional point cloud segmentation method based on SqueezeNet and a recurrent conditional random field according to claim 1, characterized in that step 1 comprises the following steps:
Step 1.1): acquire point cloud data; the point cloud data consist of unstructured three-dimensional spatial points, each spatial point containing 5 dimensions of data: one group of Cartesian three-dimensional coordinates (x, y, z), the intensity value i of the current point, and its range;
Step 1.2): convert the point cloud data with a spherical mapping into a data matrix meeting the network's data requirements; the spherical mapping equation is given by formula (1):

θ = arcsin(z / √(x² + y² + z²)), θ̃ = ⌊θ / Δθ⌋
φ = arcsin(y / √(x² + y²)), φ̃ = ⌊φ / Δφ⌋   (1)

where θ and φ denote the apex angle and the azimuth respectively, Δθ and Δφ denote the discretization precision on θ and φ, θ̃ and φ̃ are the discrete values, and each group (θ̃, φ̃) maps to one position on the 2D grid map;
Step 1.3): apply the above spherical mapping to every point of a point cloud map, obtaining an H*W*C three-dimensional matrix, where H and W correspond to the ranges of the discrete values θ̃ and φ̃ of the mapping result, C equals 5, and the C channels correspond, respectively, to the x, y and z coordinate values, the intensity value and the range value of each LiDAR point.
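A sketch of the spherical projection of steps 1.2) and 1.3) in NumPy, assuming the arcsin-based angle definitions common to this family of LiDAR projections (the function name is illustrative; in practice a constant offset also shifts negative angles into the grid):

```python
import numpy as np

def spherical_project(points, d_theta, d_phi):
    """Map raw LiDAR points onto a discrete 2D angular grid (formula (1)).

    points : (N, 5) array of x, y, z, intensity, range per point
    Returns the integer grid coordinates (theta_idx, phi_idx) of each point.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x ** 2 + y ** 2 + z ** 2)
    theta = np.arcsin(z / r)                    # apex (vertical) angle
    phi = np.arcsin(y / np.sqrt(x ** 2 + y ** 2))  # azimuth
    theta_idx = np.floor(theta / d_theta).astype(int)
    phi_idx = np.floor(phi / d_phi).astype(int)
    return theta_idx, phi_idx
```

Scattering the 5 channels (x, y, z, intensity, range) of each point into the cell at its grid coordinates then yields the H*W*5 input tensor of step 1.3).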
3. The road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field according to claim 1, characterized in that step 2 comprises:
Step 2.1): the input H*W*C tensor first passes through the compression layer of a Fire module; the compression layer of Fire consists of 1*1 convolution kernels, the number of convolution kernels being equal to 1/4 of the channel count of the input tensor, so that the compression layer compresses the data further while completing feature extraction;
Step 2.2): the data then pass through a FireDeconv module;
Step 2.3): construct the recurrent conditional random field network;
Step 2.4): define the macroscopic optimization objective of the network model; the cross entropy is computed as

L = (1/m) · Σᵢ Lᵢ,  Lᵢ = −log(Pᵢ^(cᵢ))

where m is the resolution size, i.e. the total number of points, Lᵢ is the cross-entropy loss of point i, and Pᵢ^(cᵢ) is the probability that point i is classified with its true label cᵢ.
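The recurrent-CRF refinement built in step 2.3) can be sketched as a single, simplified mean-field iteration in NumPy. This is a sketch under stated assumptions: a fixed Potts compatibility matrix and a uniform 3*5 kernel stand in for the learned 1*1 convolutions and Gaussian kernels, and all names are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mean_field_step(unary, kernel):
    """One mean-field iteration over an H*W class-probability map.

    unary  : (H, W, C) initial class probabilities from the CNN
    kernel : (3, 5) local weights (the 3*5 neighbourhood of the text)
    """
    H, W, C = unary.shape
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(unary, ((ph, ph), (pw, pw), (0, 0)))
    # 1) message passing: aggregate neighbouring probabilities with the kernel
    msg = np.zeros_like(unary)
    for i in range(kh):
        for j in range(kw):
            msg += kernel[i, j] * padded[i:i + H, j:j + W, :]
    # 2) compatibility transform (Potts model: penalise differing labels)
    compat = 1.0 - np.eye(C)          # 1 off-diagonal, 0 on the diagonal
    pairwise = msg @ compat
    # 3) add the unary term back and renormalise with a softmax
    return softmax(np.log(unary + 1e-9) - pairwise)
```

Iterating this step (3 times in the text) and learning the kernel and compatibility weights recovers the CRF-as-RNN scheme the claim describes.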
4. The road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field according to claim 3, characterized in that step 2.1) comprises:
a) expanding the tensor depth of the compression-layer data through a 1*1 Conv convolutional layer and a 3*3 Conv convolutional layer respectively, each branch producing 1/2 of the output tensor depth;
b) stacking the data of the two convolution branches along the depth dimension and outputting an H*W*C tensor.
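A NumPy sketch of the Fire module of steps 2.1) and a)–b): squeeze to C/4 with a 1*1 convolution, then expand with parallel 1*1 and 3*3 convolutions of C/2 channels each and concatenate. The naive convolution and the random weights (standing in for learned parameters) are illustrative:

```python
import numpy as np

def conv2d(x, w):
    """Naive 'same' 2D convolution: x (H, W, Cin), w (kh, kw, Cin, Cout)."""
    kh, kw, cin, cout = w.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw), (0, 0)))
    H, W = x.shape[:2]
    out = np.zeros((H, W, cout))
    for i in range(kh):
        for j in range(kw):
            out += xp[i:i + H, j:j + W, :] @ w[i, j]
    return out

def fire(x, rng):
    """Fire module: squeeze to C/4 with 1*1, expand with 1*1 and 3*3 to C/2 each."""
    C = x.shape[-1]
    s = np.maximum(conv2d(x, rng.standard_normal((1, 1, C, C // 4))), 0)  # squeeze + ReLU
    e1 = conv2d(s, rng.standard_normal((1, 1, C // 4, C // 2)))           # expand 1*1
    e3 = conv2d(s, rng.standard_normal((3, 3, C // 4, C // 2)))           # expand 3*3
    return np.concatenate([e1, e3], axis=-1)  # stack in depth: back to C channels
```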
5. The road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field according to claim 3, characterized in that step 2.2) comprises:
a) the input H*W*C tensor enters the FireDeconv module and passes through a Conv1*1 layer, completing feature extraction and depth compression;
b) the data are then upsampled with the depth unchanged, and the expansion layers each extend the depth to 1/2 of the output depth;
c) the data of the two convolution branches are stacked along the depth dimension, outputting an H*2W*C tensor.
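A shape-level NumPy sketch of the FireDeconv module of steps a)–c). Assumptions are labelled in the comments: a 1*1 convolution is just a per-pixel matmul, and nearest-neighbour upsampling stands in for the learned deconvolution:

```python
import numpy as np

def fire_deconv(x, rng):
    """FireDeconv sketch: squeeze, upsample width 2x, expand, concatenate.

    Random weights stand in for learned parameters; nearest-neighbour
    repetition stands in for the learned transposed convolution.
    """
    H, W, C = x.shape
    s = np.maximum(x @ rng.standard_normal((C, C // 4)), 0)  # squeeze (Conv1*1) + ReLU
    up = np.repeat(s, 2, axis=1)                             # upsample width 2x, depth unchanged
    e1 = up @ rng.standard_normal((C // 4, C // 2))          # expansion branch 1 -> C/2
    e2 = up @ rng.standard_normal((C // 4, C // 2))          # expansion branch 2 -> C/2
    return np.concatenate([e1, e2], axis=-1)                 # (H, 2W, C)
```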
6. The road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field according to claim 3, characterized in that step 2.3) comprises:
Step 2.3.1): define the energy function of the recurrent conditional random field:

E(c) = Σᵢ uᵢ(cᵢ) + Σᵢ,ⱼ bᵢ,ⱼ(cᵢ, cⱼ)

where c is the prediction label vector of the point cloud, cᵢ denotes the predicted label of the i-th point, uᵢ(cᵢ) = −log P(cᵢ) represents the cross-entropy probability value of the i-th point being predicted as class cᵢ, and bᵢ,ⱼ(cᵢ, cⱼ) is a penalty assigned to the behaviour of giving different labels to a group of similar points; bᵢ,ⱼ(cᵢ, cⱼ) is defined as:

bᵢ,ⱼ(cᵢ, cⱼ) = h(cᵢ, cⱼ) · Σₘ wₘ kₘ(fᵢ, fⱼ)

where h(cᵢ, cⱼ) = 1 when cᵢ ≠ cⱼ and h(cᵢ, cⱼ) = 0 otherwise, fᵢ and fⱼ denote the features of points i and j, kₘ denotes the m-th Gaussian filter taking the features of points i and j as input, and wₘ denotes the corresponding coefficient; the following kernels are used:

k¹(fᵢ, fⱼ) = exp(−‖Θᵢ − Θⱼ‖² / (2σ_α²) − ‖Xᵢ − Xⱼ‖² / (2σ_β²)),
k²(fᵢ, fⱼ) = exp(−‖Θᵢ − Θⱼ‖² / (2σ_γ²))   (4)

where the vector Θ denotes the angular position of a point, the vector X(x, y, z) denotes the Cartesian coordinates of a point, and σ_α, σ_β, σ_γ are a group of hyperparameters whose values are generally chosen by experience;
Step 2.3.2): construct the recurrent neural network structure RNN of the mean-field iteration of the energy function:
a) feed the output of the base neural network into the recurrent neural network RNN as a probability map;
b) process the initial LiDAR feature data with formula (4) to obtain the two Gaussian kernels;
c) set the scale of the Gaussian kernels to a 3*5 local region; apply the Gaussian kernels to the probability map in the manner of a locally connected layer, completing data filtering and information propagation;
d) reweight the aggregated probabilities resulting from step c) and apply a compatibility transform with a 1*1 convolution kernel, to adjust the distribution of each point; the 1*1 convolution parameters are learned during network training;
e) add the reweighted and compatibility-transformed probability map back onto the initial probability map, and output the minimized probability map through a Softmax normalization operation.
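The two Gaussian kernels of formula (4) can be evaluated for a pair of points as follows; the sigma values are placeholders (the text says they are chosen by experience), and the function name is illustrative:

```python
import numpy as np

def gaussian_kernels(theta_i, theta_j, x_i, x_j, sa=0.9, sb=0.9, sg=0.01):
    """Evaluate the two Gaussian kernels of formula (4) for one point pair.

    theta_* : 2D angular position of a point on the projected grid
    x_*     : Cartesian coordinates (x, y, z) of the point
    sa, sb, sg : the sigma_alpha, sigma_beta, sigma_gamma hyperparameters
    """
    da = np.sum((np.asarray(theta_i) - np.asarray(theta_j)) ** 2)  # angular distance^2
    dx = np.sum((np.asarray(x_i) - np.asarray(x_j)) ** 2)          # Cartesian distance^2
    k1 = np.exp(-da / (2 * sa ** 2) - dx / (2 * sb ** 2))  # bilateral kernel
    k2 = np.exp(-da / (2 * sg ** 2))                       # spatial smoothing kernel
    return k1, k2
```

Both kernels equal 1 for identical points and fall towards 0 as the points separate, which is why restricting them to a 3*5 local region loses little, as the description notes.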
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910540355.2A CN110310298A (en) | 2019-06-21 | 2019-06-21 | A kind of road target real-time three-dimensional point cloud segmentation method based on cycling condition random field |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110310298A true CN110310298A (en) | 2019-10-08 |
Family
ID=68077630
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910540355.2A Pending CN110310298A (en) | 2019-06-21 | 2019-06-21 | A kind of road target real-time three-dimensional point cloud segmentation method based on cycling condition random field |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110310298A (en) |
Worldwide applications: 2019-06-21 — CN CN201910540355.2A — patent CN110310298A/en, active, Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180225515A1 (en) * | 2015-08-04 | 2018-08-09 | Baidu Online Network Technology (Beijing) Co. Ltd. | Method and apparatus for urban road recognition based on laser point cloud, storage medium, and device |
CN108876796A (en) * | 2018-06-08 | 2018-11-23 | 长安大学 | A kind of lane segmentation system and method based on full convolutional neural networks and condition random field |
Non-Patent Citations (1)
Title |
---|
BICHEN WU: "SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud", 《2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA)》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112565794A (en) * | 2020-12-03 | 2021-03-26 | 西安电子科技大学 | Point cloud isolated point encoding and decoding method and device |
CN112565794B (en) * | 2020-12-03 | 2022-10-04 | 西安电子科技大学 | Point cloud isolated point encoding and decoding method and device |
CN113762195A (en) * | 2021-09-16 | 2021-12-07 | 复旦大学 | Point cloud semantic segmentation and understanding method based on road side RSU |
CN115453570A (en) * | 2022-09-13 | 2022-12-09 | 北京踏歌智行科技有限公司 | Multi-feature fusion mining area dust filtering method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cui et al. | Fish detection using deep learning | |
US11651302B2 (en) | Method and device for generating synthetic training data for an artificial-intelligence machine for assisting with landing an aircraft | |
CN109993082A (en) | The classification of convolutional neural networks road scene and lane segmentation method | |
CN111507378A (en) | Method and apparatus for training image processing model | |
US20160224903A1 (en) | Hyper-parameter selection for deep convolutional networks | |
CN110532859A (en) | Remote Sensing Target detection method based on depth evolution beta pruning convolution net | |
CN109461157A (en) | Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field | |
CN106682569A (en) | Fast traffic signboard recognition method based on convolution neural network | |
CN106920243A (en) | The ceramic material part method for sequence image segmentation of improved full convolutional neural networks | |
CN111339935B (en) | Optical remote sensing picture classification method based on interpretable CNN image classification model | |
CN110097145A (en) | One kind being based on CNN and the pyramidal traffic contraband recognition methods of feature | |
KR102011788B1 (en) | Visual Question Answering Apparatus Using Hierarchical Visual Feature and Method Thereof | |
CN110310298A (en) | A kind of road target real-time three-dimensional point cloud segmentation method based on cycling condition random field | |
CN112070729A (en) | Anchor-free remote sensing image target detection method and system based on scene enhancement | |
CN110009648A (en) | Trackside image Method of Vehicle Segmentation based on depth Fusion Features convolutional neural networks | |
Doi et al. | The effect of focal loss in semantic segmentation of high resolution aerial image | |
CN112329815B (en) | Model training method, device and medium for detecting travel track abnormality | |
CN113743417B (en) | Semantic segmentation method and semantic segmentation device | |
CN110281949B (en) | Unified hierarchical decision-making method for automatic driving | |
US11695898B2 (en) | Video processing using a spectral decomposition layer | |
CN107423747A (en) | A kind of conspicuousness object detection method based on depth convolutional network | |
CN107506792A (en) | A kind of semi-supervised notable method for checking object | |
CN107016371A (en) | UAV Landing Geomorphological Classification method based on improved depth confidence network | |
CN107766828A (en) | UAV Landing Geomorphological Classification method based on wavelet convolution neutral net | |
Qurishee | Low-cost deep learning UAV and Raspberry Pi solution to real time pavement condition assessment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20191008 |