CN110310298A - Real-time three-dimensional point cloud segmentation method for road targets based on a recurrent conditional random field - Google Patents

Real-time three-dimensional point cloud segmentation method for road targets based on a recurrent conditional random field Download PDF

Info

Publication number
CN110310298A
Authority
CN
China
Prior art keywords
data
point cloud
point
dimensional
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910540355.2A
Other languages
Chinese (zh)
Inventor
孙伟
张桢浩
陆伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Electronic Science and Technology
Original Assignee
Xian University of Electronic Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Electronic Science and Technology filed Critical Xian University of Electronic Science and Technology
Priority to CN201910540355.2A priority Critical patent/CN110310298A/en
Publication of CN110310298A publication Critical patent/CN110310298A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/143 Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20076 Probabilistic image processing
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to real-time three-dimensional point cloud segmentation of road targets in the surroundings of autonomous-driving vehicles and low-altitude UAVs operating in complex road scenes, and in particular to a real-time three-dimensional point cloud segmentation method for road targets based on a recurrent conditional random field (CRF). The method comprises at least the following steps: step 1, acquire point cloud data and convert it into an H*W*C three-dimensional matrix; step 2, optimize the network structure on the basis of these data; step 3, output the minimized probability map to give the three-dimensional point cloud segmentation. The proposed model segments road targets from the point cloud accurately and in real time; its engineered implementation meets real-time and stability requirements and therefore has practical value.

Description

Real-time three-dimensional point cloud segmentation method for road targets based on a recurrent conditional random field
Technical field
The present invention relates to real-time three-dimensional point cloud segmentation of the surrounding environment for autonomous driving and low-altitude UAVs in complex road scenes, and in particular to a real-time three-dimensional point cloud segmentation method for road targets based on a recurrent conditional random field.
Background art
In recent years, UAVs have gradually moved into low-altitude operations such as parcel delivery, regional low-altitude safety inspection, road defect detection, and temporary vehicle counting on road segments. These tasks require human-machine interaction and the perception of targets and the surrounding environment in low-altitude scenes, so that the mission can be completed while flying safely. At the same time, autonomous driving has attracted worldwide research attention in recent years, with in-depth work on perception, decision-making and control. Both low-altitude UAV flight and autonomous driving systems rely on accurate, real-time and robust environment perception. In low-altitude road scenes, UAVs and autonomous vehicles need to classify and localize road targets precisely, such as vehicles, pedestrians, cyclists and other obstacles. Among the available sensing modalities, perception based on LiDAR point cloud data reflects the real scene most directly and has a wide range of application scenarios. It is therefore necessary to study three-dimensional perception techniques for low-altitude road scenes, which will promote the development of UAVs and autonomous driving.
Traditional point cloud segmentation methods usually consist of four common operations: removing the ground, clustering points into instances, extracting features from each cluster, and classifying the clusters according to these features. Such methods depend heavily on hand-crafted features, neglect the correlations within the data, and cannot produce satisfactory segmentation results. Researchers later approached segmentation from a probabilistic point of view and proposed conditional random field (CRF) formulations for semantic segmentation, which add smoothness constraints between pixels during segmentation and improve the quality of segmented edges. The FCN network demonstrated that convolutional neural networks can be used for image segmentation: the network learns the feature extraction from raw data by itself and produces, in an end-to-end fashion, a segmentation result at the original input resolution; FCN experiments showed that deep networks far outperform traditional methods on image segmentation. Influenced by the rise of deep networks, S. Zheng et al. reformulated the mathematical expression of the conditional random field as an RNN-style neural network structure. The DeepLab network then combined these two advanced segmentation approaches: an FCN produces a preliminary segmentation, which is refined by a conditional random field before output. However, only the FCN part of that network is trained; the FCN and the conditional random field remain separate and are not trained jointly, so although DeepLab segments better than FCN, the improvement is limited. Later, point cloud segmentation networks emerged that first preprocess the three-dimensional point cloud and then apply neural network techniques. Zhou Y. et al. proposed the VoxelNet model, which performs 3D detection and segmentation of objects directly on point cloud data and obtains good results on the KITTI data set, but the model is large and its real-time performance is poor. Qi C. R. et al. proposed the PointNet model, which can simultaneously perform point cloud classification, semantic segmentation and part segmentation; it processes each point of the unordered point cloud independently and is therefore invariant to the input order. However, when computing the score of each point, PointNet combines point features directly with global features and ignores local features, which degrades segmentation at object edges. In addition, point cloud density varies strongly with distance (near points are dense, far points are sparse), and processing such data with a uniform template is computationally inefficient, so the model cannot run in real time. From this analysis of the state of the art, point cloud segmentation is in a stage of rapid development, and how to train networks more efficiently, how to compress models while preserving segmentation accuracy, and how to improve real-time performance and computational efficiency are all open research directions.
Summary of the invention
The object of the present invention is to provide a real-time three-dimensional point cloud segmentation method for road targets based on a recurrent conditional random field, whose model achieves good real-time performance and high computational efficiency together with a good balance between model compression and segmentation accuracy.
This object is achieved by a road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field, characterized in that it comprises at least the following steps:
Step 1: acquire point cloud data and convert it into an H*W*C three-dimensional matrix;
Step 2: optimize the network structure on the basis of these data;
Step 3: output the minimized probability map to give the three-dimensional point cloud segmentation.
Step 1 comprises the following sub-steps:
Step 1.1) Acquire point cloud data. The point cloud consists of unstructured three-dimensional points, and each point carries 5-dimensional data: a Cartesian coordinate (x, y, z), the intensity value i of the point, and its range r.
Step 1.2) Convert the point cloud into a data matrix that satisfies the input format of the network by means of the spherical mapping given in formula (1):
where θ and φ denote the zenith angle and the azimuth angle, Δθ and Δφ denote the discretization resolution in θ and φ, and θ̃, φ̃ are the discretized values; each pair (θ̃, φ̃) maps the point to one position of a 2D grid map;
Step 1.3) Apply the above spherical mapping to every point of a point cloud frame to obtain an H*W*C three-dimensional matrix, where H and W correspond to the discretized values θ̃ and φ̃ and C equals 5; the C channels hold the x, y and z coordinates of the LiDAR point, its intensity value and its range value.
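Formula (1) is not reproduced in the text above; a plausible reconstruction, consistent with the variables defined here and with the SqueezeSeg-style spherical projection cited in this application (the exact form in the original filing may differ), is:

$$\theta = \arcsin\frac{z}{\sqrt{x^2+y^2+z^2}},\qquad \tilde{\theta} = \left\lfloor \theta / \Delta\theta \right\rfloor$$
$$\varphi = \arcsin\frac{y}{\sqrt{x^2+y^2}},\qquad \tilde{\varphi} = \left\lfloor \varphi / \Delta\varphi \right\rfloor \tag{1}$$

with the range of a point taken as $r=\sqrt{x^2+y^2+z^2}$.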
Step 2 comprises:
Step 2.1) The input H*W*C tensor first passes through the squeeze (compression) layer of a Fire module; the squeeze layer consists of 1*1 convolution kernels whose number equals 1/4 of the number of input channels, so that the Fire squeeze layer compresses the data depth while extracting features;
Step 2.2) The data then pass through FireDeconv modules;
Step 2.3) The recurrent conditional random field network is constructed;
Step 2.4) The macro-level optimization objective of the network model is defined.
The cross-entropy loss is given by the formula below:
where m is the resolution size, i.e. the total number of points, L_i is the cross-entropy loss of the i-th point, and p̂_i denotes the probability that the i-th point is classified as its true label c_i.
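The cross-entropy formula referred to above is not reproduced in the text; a reconstruction consistent with the definitions of m, L_i and p̂_i (the exact normalization in the original filing may differ) is:

$$L = \frac{1}{m}\sum_{i=1}^{m} L_i,\qquad L_i = -\log \hat{p}_i(c_i) \tag{5}$$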
Step 2.1) comprises:
a) The squeezed data are expanded in depth by a 1*1 Conv layer and a 3*3 Conv layer in parallel, each producing 1/2 of the output tensor depth;
b) The outputs of the two convolutions are concatenated along the depth dimension, yielding an H*W*C tensor.
Step 2.2) comprises:
a) The input H*W*C tensor enters the FireDeconv module and passes through a Conv1*1 layer, which performs feature extraction and depth compression;
b) The data are then up-sampled with the depth unchanged, and the expand layers restore the depth to 1/2 of the output each;
c) The outputs of the two expand convolutions are concatenated along the depth dimension, yielding an H*2W*C tensor.
Step 2.3) comprises:
Step 2.3.1) Define the energy function of the recurrent conditional random field:
where c is the predicted label vector of the point cloud, c_i is the predicted label of the i-th point, u_i(c_i) = -logP(c_i) is the unary cross-entropy term of predicting the i-th point as class c_i, and b_{i,j}(c_i, c_j) is a penalty on assigning different labels to a pair of similar points; b_{i,j}(c_i, c_j) is defined as:
where h(c_i, c_j) = 1 when c_i ≠ c_j and h(c_i, c_j) = 0 otherwise, f_i and f_j are the features of the i-th and j-th points, k_m is the m-th Gaussian filter taking these features as input, and w_m is the corresponding coefficient; the following kernels are used:
where the vector P̃(θ̃, φ̃) is the angular position of a point, the vector X(x, y, z) is its Cartesian coordinate, and σ_α, σ_β and σ_γ are hyper-parameters that are usually chosen empirically;
Step 2.3.2) Construct the recurrent neural network (RNN) structure implementing the mean-field iteration of the energy function:
a) Feed the output of the base neural network into the recurrent neural network RNN as the initial probability map;
b) Process the raw LiDAR features with formula (4) to obtain two Gaussian kernels;
c) Restrict the Gaussian kernels to a local 3*5 region; apply the kernels to the probability map in the manner of a locally connected layer, which filters the data and passes messages;
d) Re-weight the aggregated probabilities from step c) with 1*1 convolution kernels and apply the compatibility transform, thereby changing the distribution of every point; the 1*1 convolution parameters are learned during training;
e) Add the re-weighted, compatibility-transformed probability map to the initial probability map, and output, through the Softmax normalization operation, the probability map that minimizes the cross-entropy.
The present invention proposes RobNet, a real-time three-dimensional point cloud segmentation model based on SqueezeNet and a recurrent CRF. The model is divided at the macro level into a base network and a recurrent network, both of which are elaborated at the micro level, and it is finally deployed as an engineered implementation under the ROS framework. Using LiDAR point clouds as raw data effectively removes the influence of light, weather and other external factors on the algorithm, which makes the model more robust; moreover, LiDAR data contain range information of the environment that can be used directly by the subsequent planning and control algorithms of UAVs and autonomous vehicles. The network design follows the compression strategy of SqueezeNet, which greatly reduces the number of parameters and the size of the model, and the recurrent conditional random field refines the output, so that the proposed model segments road targets from the point cloud accurately and in real time. The engineered model meets real-time and stability requirements, and the model proposed by the present invention therefore has practical value.
Table 1. Symbols used in the specification and their meanings
The invention is further described below with reference to the embodiments and the accompanying drawings:
Detailed description of the invention
Fig. 1 is a schematic diagram of the spherical mapping;
Fig. 2 shows the result of the spherical mapping of the point cloud data;
Fig. 3 shows the macro-level design of the network model;
Fig. 4 shows the micro-level design of the base network;
Fig. 5 shows the internal structure of the Fire and FireDeconv modules: Fig. 5(a) is the Fire structure and Fig. 5(b) is the FireDeconv structure;
Fig. 6 shows the RNN implementation of the conditional random field (CRF);
Fig. 7 illustrates the optimization objective of the model;
Fig. 8 illustrates the point cloud segmentation function of the model;
Fig. 9(a) shows the training loss of the model;
Fig. 9(b) shows the learning-rate schedule during training;
Fig. 10(1) training-set evaluation: vehicle IoU score;
Fig. 10(2) training-set evaluation: cyclist IoU score;
Fig. 10(3) training-set evaluation: pedestrian IoU score;
Fig. 11(1) validation-set evaluation: vehicle IoU score;
Fig. 11(2) validation-set evaluation: vehicle precision score;
Fig. 11(3) validation-set evaluation: vehicle recall score;
Fig. 11(4) validation-set evaluation: cyclist IoU score;
Fig. 11(5) validation-set evaluation: cyclist precision score;
Fig. 11(6) validation-set evaluation: cyclist recall score;
Fig. 11(7) validation-set evaluation: pedestrian IoU score;
Fig. 11(8) validation-set evaluation: pedestrian precision score;
Fig. 11(9) validation-set evaluation: pedestrian recall score;
Fig. 12(a) first of three groups of segmentation results;
Fig. 12(b) second of three groups of segmentation results;
Fig. 12(c) third of three groups of segmentation results;
Fig. 13 shows the logic flow of the ROS engineering code;
Fig. 14 shows the real-time visualization result in RVIZ.
Specific embodiment
The invention, a road-target real-time three-dimensional point cloud segmentation method based on SqueezeNet and a recurrent CRF, is analyzed in detail below with reference to the accompanying drawings. It comprises at least the following steps:
Step 1: acquire point cloud data and convert it into an H*W*C three-dimensional matrix;
Step 2: optimize the network structure on the basis of these data;
Step 3: output the minimized probability map to give the three-dimensional point cloud segmentation.
As shown in Fig. 1, step 1 comprises the following details:
LiDAR point cloud data consist of a series of unstructured three-dimensional points, each of which carries 5-dimensional data: a Cartesian coordinate (x, y, z), the intensity value i of the point, and its range r. Because the characteristics of point cloud data differ considerably from those of image data, a spherical mapping is used to convert the data; this representation expresses the LiDAR point cloud more compactly, improves the computational efficiency of the network, and satisfies the input format required by the network. The spherical mapping is given by formula (1):
where θ and φ denote the zenith angle and the azimuth angle, Δθ and Δφ denote the discretization resolution in θ and φ, and θ̃, φ̃ are the discretized values; each pair (θ̃, φ̃) maps the point to one position of a 2D grid map. Applying this spherical mapping to every point of a point cloud frame yields an H*W*C three-dimensional matrix, where H and W correspond to the discretized values θ̃ and φ̃ and C equals 5; the C channels hold the x, y and z coordinates of the LiDAR point, its intensity value and its range value.
Through this spherical conversion, the unordered, sparse three-dimensional point cloud is turned into an ordered, dense, image-like standard data format; the result is shown in Fig. 2.
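As an illustration of this preprocessing step, the following is a minimal NumPy sketch of a spherical projection onto a 64*512*5 tensor. It assumes the SqueezeSeg-style projection reconstructed above for formula (1); the field of view, grid size and channel order are assumptions for illustration, not values taken from the original filing.

```python
import numpy as np

def spherical_project(points, intensity, H=64, W=512,
                      fov_up=np.radians(2.0), fov_down=np.radians(-24.8)):
    """Project an unordered LiDAR point cloud (N, 3) onto an H*W*5 tensor.

    Channels: x, y, z, intensity, range. The vertical FOV values are only
    illustrative (roughly a 64-beam scanner); adjust them for the real sensor.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2 + z**2) + 1e-8               # range of each point
    theta = np.arcsin(z / r)                              # zenith angle
    phi = np.arcsin(y / np.sqrt(x**2 + y**2 + 1e-8))      # azimuth angle

    # Discretize the angles onto the H*W grid (formula (1), reconstructed).
    row = ((fov_up - theta) / (fov_up - fov_down) * H).astype(np.int32)
    col = ((phi + np.pi / 4) / (np.pi / 2) * W).astype(np.int32)  # ~90 deg front view
    row = np.clip(row, 0, H - 1)
    col = np.clip(col, 0, W - 1)

    grid = np.zeros((H, W, 5), dtype=np.float32)
    grid[row, col, 0] = x          # points falling in the same cell overwrite each other
    grid[row, col, 1] = y
    grid[row, col, 2] = z
    grid[row, col, 3] = intensity
    grid[row, col, 4] = r
    return grid
```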
Step 2 comprises the following details:
2.1 Macro-level design of the network model
The macro-level design of the network model is shown in Fig. 3. The basic convolutional part of the model is designed with reference to the SqueezeNet network and is further optimized with residual connections, which greatly compresses the model size and reduces the amount of computation. As shown in Fig. 3, the layers from Conv1a to Conv14 form the base network. The model input is the preprocessed standard point cloud data with a three-dimensional size of 64*512*5; the input passes through ordinary convolutions, max pooling and Fire modules, which extract features from the data. During down-sampling, only the horizontal width is compressed, because the width of the data is much larger than the height and the height is only 64; the height therefore remains unchanged. The four layers from FireDeconv10 to FireDeconv13 gradually up-sample the feature maps until the resolution of the FireDeconv output matches the input resolution. During up-sampling, skip connections from layers of equal resolution are added and the data are summed. Conv14 convolves the data, reduces the depth to 4, and applies a Softmax operation to output a probability feature map. The probability map produced by the base network is then refined at the pixel level by the recurrent conditional random field, and the final output is a label feature map with the same resolution as the input, i.e. the segmentation result.
2.2 Micro-level design of the network model
2.2.1 Micro-level design of the base network
The base network is built on the SqueezeNet network and fine-tuned. The complete design of the base network is shown in Fig. 4, where the data depth of every layer is indicated. The whole base network consists of Conv layers, Fire layers, FireDeconv layers, pooling layers and a Softmax layer. Residual connections are added between Fire2 and Fire3, between Fire4 and Fire5, between Fire6 and Fire7, and between Fire8 and Fire9.
2.2.2 Design of the Fire and FireDeconv modules
The structures of Fire and FireDeconv are shown in Fig. 5. The Fire module convolves the input tensor and outputs a tensor of the same size as the input. The input tensor H*W*C first passes through the squeeze layer of the Fire module, which consists of 1*1 convolution kernels whose number equals 1/4 of the number of input channels, so that the layer compresses the data depth while extracting features. The data then pass through the expand layer, where Conv1*1 and Conv3*3 convolutions each expand the squeezed data to 1/2 of the output depth; finally the outputs of the two expand convolutions are concatenated along the depth dimension, yielding an H*W*C tensor. FireDeconv is similar to the Fire module and likewise reduces the number of parameters and the computation of the layer substantially, but it is used to up-sample the input. The input tensor H*W*C enters the FireDeconv module and first passes through Conv1*1, which performs feature extraction and depth compression; a deconvolution (Deconv upsample*2) then up-samples the data by a factor of 2 with the depth unchanged. The expand layer then restores the depth to 1/2 of the output for each branch, and finally the outputs of the two branches are concatenated along the depth dimension, yielding an H*2W*C tensor.
The parameter compression ratios of the Fire and FireDeconv modules are shown in Table 1.
Table 1. Compression effect of the Fire and FireDeconv modules
                       3*3 Conv    Fire       1*4 Deconv   FireDeconv
Number of parameters   9C^2        3C^2/2     4C^2         7C^2/2
Computation            9HWC^2      3HWC^2/2   4HWC^2       7HWC^2/2
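To make the structure of the two modules concrete, here is a minimal TensorFlow 1.x sketch of a Fire and a FireDeconv module as described above (squeeze to 1/4 of the input channels with 1*1 kernels, then two expand branches of half the output depth each, concatenated in depth; FireDeconv adds a width-only deconvolution). The layer names, the 1*4 deconvolution kernel and the ReLU activations are assumptions for illustration, not the original implementation.

```python
import tensorflow as tf  # TensorFlow 1.x API, as used in the described deployment

def fire(x, c_out, name):
    """Fire module: 1*1 squeeze to c_in/4, then 1*1 and 3*3 expand branches."""
    c_in = x.get_shape().as_list()[-1]           # static channel count assumed known
    with tf.variable_scope(name):
        s = tf.layers.conv2d(x, c_in // 4, 1, padding='same',
                             activation=tf.nn.relu, name='squeeze1x1')
        e1 = tf.layers.conv2d(s, c_out // 2, 1, padding='same',
                              activation=tf.nn.relu, name='expand1x1')
        e3 = tf.layers.conv2d(s, c_out // 2, 3, padding='same',
                              activation=tf.nn.relu, name='expand3x3')
        return tf.concat([e1, e3], axis=-1)       # H x W x c_out

def fire_deconv(x, c_out, name):
    """FireDeconv: squeeze, deconvolve the width by 2, then expand and concatenate."""
    c_in = x.get_shape().as_list()[-1]
    with tf.variable_scope(name):
        s = tf.layers.conv2d(x, c_in // 4, 1, padding='same',
                             activation=tf.nn.relu, name='squeeze1x1')
        # Up-sample only the width (stride (1, 2)); the depth stays unchanged.
        d = tf.layers.conv2d_transpose(s, c_in // 4, (1, 4), strides=(1, 2),
                                       padding='same', activation=tf.nn.relu,
                                       name='deconv1x4')
        e1 = tf.layers.conv2d(d, c_out // 2, 1, padding='same',
                              activation=tf.nn.relu, name='expand1x1')
        e3 = tf.layers.conv2d(d, c_out // 2, 3, padding='same',
                              activation=tf.nn.relu, name='expand3x3')
        return tf.concat([e1, e3], axis=-1)       # H x 2W x c_out
```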
2.2.3 Construction of the recurrent conditional random field network
Let c denote the predicted label vector of the point cloud and c_i the predicted label of the i-th point. The energy function of the conditional random field is defined as follows:
where u_i(c_i) = -logP(c_i) is the unary cross-entropy term of predicting the i-th point as class c_i, and b_{i,j}(c_i, c_j) is a penalty on assigning different labels to a pair of similar points; its specific form is:
where h(c_i, c_j) = 1 when c_i ≠ c_j and h(c_i, c_j) = 0 otherwise, f_i and f_j are the features of the i-th and j-th points, k_m is the m-th Gaussian filter taking these features as input, and w_m is the corresponding coefficient. In this design, the following two filters are used:
where the vector P̃(θ̃, φ̃) is the angular position of a point and the vector X(x, y, z) is its Cartesian coordinate; the first filter uses both the angular and the Cartesian information of the two points, whereas the second filter involves only their angular information. σ_α, σ_β and σ_γ are hyper-parameters that are usually chosen empirically.
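Formulas (2) to (4) are not reproduced in the text; a plausible reconstruction, consistent with the variables defined above and with the dense-CRF formulation of the cited SqueezeSeg work (the exact constants in the original filing may differ), is:

$$E(\mathbf{c}) = \sum_i u_i(c_i) + \sum_{i,j} b_{i,j}(c_i, c_j) \tag{2}$$
$$b_{i,j}(c_i,c_j) = h(c_i,c_j)\sum_m w_m\, k_m(f_i,f_j) \tag{3}$$
$$k(f_i,f_j) = w_1 \exp\!\Big(-\tfrac{\|\tilde{P}_i-\tilde{P}_j\|^2}{2\sigma_\alpha^2} - \tfrac{\|X_i-X_j\|^2}{2\sigma_\beta^2}\Big) + w_2 \exp\!\Big(-\tfrac{\|\tilde{P}_i-\tilde{P}_j\|^2}{2\sigma_\gamma^2}\Big) \tag{4}$$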
The RNN structure implementing the mean-field iteration of the energy function is shown in Fig. 6; the output of the CNN is fed into the RNN as the initial probability map. Formula (4) is evaluated on the raw LiDAR features to obtain two Gaussian kernels. Because the value of a Gaussian kernel decreases rapidly as two points move apart in 3D Cartesian space and in the 2D angular space, the kernels carry very little information for points that are far apart; the scale of the Gaussian kernels is therefore restricted to a local 3*5 region. As shown in the first layer of Fig. 6, the Gaussian kernels are applied to the probability map in the manner of a locally connected layer, which filters the data and passes messages, aggregating the probabilities of neighbouring points. The aggregated probabilities are then re-weighted and passed through a compatibility transform in order to change the distribution of every point; this step is implemented with 1*1 convolution kernels whose parameters are learned during training. Finally, in the third layer, the re-weighted, compatibility-transformed probability map is added to the initial probability map, and the refined probability map is output through a Softmax normalization operation. The RNN part in the figure can be iterated; the number of iterations is set to 3.
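The following NumPy sketch illustrates one mean-field iteration of the kind described above, for a single H*W probability map with K classes and a local 3*5 kernel. It is a simplified illustration only: the kernel construction, the fixed Potts-style compatibility matrix and the explicit loops are assumptions; in the real model these steps are locally connected and 1*1 convolution layers trained jointly with the base network.

```python
import numpy as np

def mean_field_step(unary, q, kernel, compat, kh=3, kw=5):
    """One mean-field iteration on an (H, W, K) probability map `q`.

    unary  : (H, W, K) negative unary energies (base-network logits).
    kernel : (H, W, kh, kw) local Gaussian weights, as in formula (4).
    compat : (K, K) compatibility matrix (e.g. Potts: ones minus identity).
    """
    H, W, K = q.shape
    pad_h, pad_w = kh // 2, kw // 2
    q_pad = np.pad(q, ((pad_h, pad_h), (pad_w, pad_w), (0, 0)), mode='constant')

    # 1) Message passing: aggregate neighbouring probabilities with the local kernel.
    msg = np.zeros_like(q)
    for i in range(H):
        for j in range(W):
            patch = q_pad[i:i + kh, j:j + kw, :]               # (kh, kw, K)
            msg[i, j] = np.tensordot(kernel[i, j], patch, axes=([0, 1], [0, 1]))

    # 2) Compatibility transform (fixed here; a learned 1*1 conv in the network).
    pairwise = msg @ compat                                     # (H, W, K)

    # 3) Add the unary term and renormalize with a softmax.
    logits = unary - pairwise
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)
```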
2.2.4 Macro-level optimization objective of the network model
The overall segmentation objective of the network is to measure the consistency between the probability map output by the RNN and the true labels. As shown in Fig. 8, the three-dimensional block is the probability map inside the RNN; the slice resolution equals the original CNN input resolution and the depth is 4, where each value along the depth represents the probability of one class (vehicle, cyclist, pedestrian, background). The cross-entropy is given by the formula below:
where m is the resolution size, i.e. the total number of points, L_i is the cross-entropy loss of the i-th point, and p̂_i denotes the probability that the i-th point is classified as its true label c_i. Minimizing formula (5) yields the final segmentation result of the network.
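As an illustration, a point-wise cross-entropy of this kind can be written in TensorFlow 1.x as follows; the tensor names and the use of integer labels are assumptions for illustration.

```python
import tensorflow as tf

def segmentation_loss(prob_map, labels, eps=1e-8):
    """Mean point-wise cross-entropy L = (1/m) * sum_i -log p_i(c_i), cf. formula (5).

    prob_map: (batch, H, W, 4) class probabilities output by the CRF-RNN.
    labels:   (batch, H, W)    integer ground-truth class of each point.
    """
    one_hot = tf.one_hot(labels, depth=4)                    # (batch, H, W, 4)
    log_p = tf.log(prob_map + eps)                           # avoid log(0)
    point_loss = -tf.reduce_sum(one_hot * log_p, axis=-1)    # L_i for every point
    return tf.reduce_mean(point_loss)                        # average over the m points
```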
Step 3 comprises the following:
3.1 Evaluation criteria and experimental platform
3.1.1 Experimental platform and data set
The hardware platform used for training the proposed model and for the ROS deployment is: 1) processor: Intel(R) Core(TM) i5-4590 CPU @ 3.30 GHz; 2) memory: 8 GB; 3) GPU: NVIDIA 1060 with 6 GB of video memory. The software versions are: 1) operating system: Ubuntu 14.04; 2) deep learning framework: TensorFlow 1.4.0; 3) ROS: Indigo; 4) Python: 2.7. The acquisition of the data set is described in Section 2 above. The 10848 frames of raw data converted from the KITTI data set are divided into a training set of 8057 frames and a validation set of 2791 frames; the split is made as uniformly as possible over the time sequence, to prevent the data of any particular time period from clustering together.
3.1.2 Evaluation criteria
The evaluation uses the metrics of the classification/segmentation task: by comparing the point-wise predicted classes with the true labels, the prediction precision Pr_c, the recall recall_c and the intersection-over-union score IoU_c of each class c are obtained. They are defined as follows:
where P_c and G_c denote the set of points predicted as class c and the set of ground-truth points of class c, respectively. The proposed network model finally performs the segmentation function shown in Fig. 8: the input is an unordered point cloud and the output is the point cloud with class labels.
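The defining formulas are not reproduced in the text; with P_c and G_c as defined above, the standard definitions of these three metrics are:

$$\mathrm{Pr}_c = \frac{|P_c \cap G_c|}{|P_c|},\qquad \mathrm{recall}_c = \frac{|P_c \cap G_c|}{|G_c|},\qquad \mathrm{IoU}_c = \frac{|P_c \cap G_c|}{|P_c \cup G_c|}$$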
3.1.3 Model size
Table 2 gives the complete statistics of the data dimensions, the number of learnable parameters and the filter hyper-parameters of every layer of the proposed network model; cells without a value correspond to layers that do not have the corresponding attribute and are therefore left empty.
Table 2. Parameter statistics of the proposed model
As shown in Table 2, the total number of parameters of the proposed network model is 904880. With single-precision floating point parameters (32 bit, 4 byte), the size of the whole model is about 3.5 MB. For comparison (Table 3), common image classification networks such as AlexNet and VGG16 have model sizes of 240 MB and 500 MB respectively and GoogLeNet is 50 MB, while common point cloud detection and segmentation networks such as 3DmFV-Net and PCNN are 45 MB and 8 MB. A 3.5 MB model is much easier to deploy on intelligent platforms with limited computing resources.
Table 3. Comparison of model parameter sizes
3.2 Model training and evaluation
The data set used in the present invention is converted from the public KITTI data set and contains 10848 frames in total, of which 8057 frames form the training set and 2791 frames form the validation set. Before training, the pre-trained base network parameters are loaded. The total number of training steps is set to 25000, the batch size to 16 and the initial learning rate to 0.01; the learning rate is halved every 10000 steps and the momentum is set to 0.9. During training, the training progress is visualized in real time, and the model loss as well as the precision, recall and IoU scores on the training and validation sets are recorded. The training loss and the learning rate are shown in Fig. 9: Fig. 9(a) shows the training loss and Fig. 9(b) the learning-rate schedule; the model essentially converges after about 21k training steps. The IoU of the three classes on the training set over the whole training process is shown in Fig. 10: Fig. 10(1) vehicle IoU, Fig. 10(2) cyclist IoU and Fig. 10(3) pedestrian IoU. The dark yellow curves in Fig. 9 and Fig. 10 are smoothed versions of the light yellow curves and highlight the overall trend.
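The training schedule described above (25000 steps, batch size 16, initial learning rate 0.01 halved every 10000 steps, momentum 0.9) can be expressed in TensorFlow 1.x roughly as follows; the dummy loss stands in for the cross-entropy of formula (5) and is an assumed placeholder.

```python
import tensorflow as tf

# Placeholder loss standing in for the point-wise cross-entropy of formula (5).
dummy_weight = tf.Variable(1.0, name='dummy_weight')
total_loss = tf.square(dummy_weight)          # replace with the segmentation loss

global_step = tf.Variable(0, trainable=False, name='global_step')
# Initial learning rate 0.01, halved every 10000 steps (staircase schedule).
learning_rate = tf.train.exponential_decay(0.01, global_step,
                                            decay_steps=10000, decay_rate=0.5,
                                            staircase=True)
# Momentum optimizer with momentum 0.9, as in the training setup described above.
optimizer = tf.train.MomentumOptimizer(learning_rate, momentum=0.9)
train_op = optimizer.minimize(total_loss, global_step=global_step)
```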
More detailed evaluations are recorded on the validation set. Fig. 11 shows the segmentation evaluation of the three classes on the validation set: Fig. 11(1) vehicle IoU, Fig. 11(2) vehicle precision, Fig. 11(3) vehicle recall, Fig. 11(4) cyclist IoU, Fig. 11(5) cyclist precision, Fig. 11(6) cyclist recall, Fig. 11(7) pedestrian IoU, Fig. 11(8) pedestrian precision and Fig. 11(9) pedestrian recall. For each class the evaluation includes precision, recall and IoU. The light blue curves in Fig. 11 are the prediction scores of each target at different iteration numbers, and the dark blue curves are smoothed versions of these scores that make the global trend easier to observe.
To show the segmentation effect intuitively, the difference between the predictions and the ground-truth labels is visualized in the following three groups of comparisons; each group contains the current LiDAR depth map (first image), the ground-truth label map (second image) and the predicted segmentation map (third image), as shown in Fig. 12.
In Fig. 12 the green masks represent vehicles and the light blue masks represent pedestrians and cyclists. In Fig. 12(a) the model accurately segments the road targets in the scene; in the point cloud it expands slightly around individual targets such as the cyclist, which also explains the high recall of the model, and it even segments a vehicle in the lower left corner that is missing from the ground-truth labels, which shows that the model has good reasoning and generalization ability. Fig. 12(b) shows that the model segments vehicle targets well; the prediction is essentially consistent with the ground truth. In Fig. 12(c) a small fraction of the point cloud is wrongly segmented as pedestrian; in fact the pedestrian score of these points is only marginally higher than their background score, and they are classified as pedestrian because they do exhibit some pedestrian-like characteristics. Serving UAVs and autonomous driving with such a slightly conservative segmentation strategy basically conforms to safety principles.
Table 4. Point-wise classification and segmentation performance of the model
The point-wise segmentation performance of the model is shown in Table 4. From Table 4 and Fig. 11 it can be seen that the model segments vehicles well: the precision reaches 61.8%, the recall is close to 100%, and the IoU reaches 59.9%, which outperforms VoxelNet, PointNet and MS+CU on vehicle segmentation. Recall is a very important metric in UAV flight and autonomous driving scenarios, because in real road scenes it is preferable to wrongly segment points near a target as that target rather than to segment points belonging to the target as background or another class; such a slightly conservative segmentation strategy serves the subsequent motion planning and decision making of UAVs and autonomous vehicles and reduces the possibility of collisions. The high recall of the three target classes meets the safety requirements of autonomously moving agents.
3.3 Real-time performance and stability of the engineered model under the ROS framework
The whole model is implemented as an engineering project under the ROS framework; the code flow is shown in Fig. 13. First, the 2D-converted LiDAR point cloud data are loaded in sequence, the proposed network model (implemented with TensorFlow) is instantiated in the program, and the trained parameters are loaded into the TensorFlow model. Each frame is then processed in a loop: the point cloud of the current frame is read, the model is called to complete the segmentation, and the segmentation result is packed into the ROS topic data format and published. RVIZ, a visualization tool under ROS, loads a configuration file, subscribes in real time to the messages on the /module_seg/points topic, and visualizes the point cloud segmentation result.
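A minimal rospy sketch of the node structure described above is given below. Apart from /module_seg/points, which is named in the text, the input topic name, the message type and the segmentation call are assumptions for illustration.

```python
import rospy
from sensor_msgs.msg import PointCloud2

class SegmentationNode(object):
    def __init__(self, model):
        self.model = model  # trained TensorFlow segmentation model, loaded beforehand
        self.pub = rospy.Publisher('/module_seg/points', PointCloud2, queue_size=1)
        rospy.Subscriber('/velodyne_points', PointCloud2, self.callback, queue_size=1)

    def callback(self, msg):
        # Convert the incoming cloud, run the network, publish the labelled cloud.
        if self.model is None:
            return
        labelled = self.model.segment(msg)   # assumed helper returning a PointCloud2
        self.pub.publish(labelled)

if __name__ == '__main__':
    rospy.init_node('point_cloud_segmentation')
    node = SegmentationNode(model=None)      # replace None with the loaded model
    rospy.spin()
```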
The real-time visualization result in RVIZ is shown in Fig. 14. The program also records the processing time of every point cloud frame; the real-time performance of the system is shown in Table 5, which reports the average time and standard deviation per frame and the average frame rate of the system. When the engineered model runs, the processing time per point cloud frame is 82.5 ms and the average frame rate is 12.1 Hz; since the scan frequency of a LiDAR is generally about 10 Hz, the engineering code reaches the performance required for real-time operation. The standard deviation of the running time is 4.5 ms, which means that the per-frame processing time varies very little: the program is stable, no single frame stalls and affects the other running components of the agent, and this guarantees that UAVs and autonomous vehicles can execute their tasks stably in low-altitude road scenes.
Table 5. Real-time performance and stability of the engineered model
The present invention proposes RobNet, a neural network model suitable for real-time three-dimensional point cloud segmentation by UAVs and autonomous vehicles in low-altitude street scenes. The basic theory of convolutional neural networks and model compression and the acquisition of point cloud data with semantic labels are described first, and the 3D point cloud is converted to a 2D representation by the spherical mapping method. The whole network model is then designed at the macro level, and the base network in its first half and the recurrent conditional random field network in its second half are described in detail at the micro level, including the residual connections of the base network, the internal design of the Fire and FireDeconv modules, and the derivation and construction of the RNN. Finally, the proposed model is implemented as an engineering project under the ROS framework. The experimental results show that the proposed model segments road targets from the point cloud accurately and in real time while keeping the model small. The engineered model meets real-time and stability requirements, so the model proposed by the present invention has practical value.

Claims (6)

1. A road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field, characterized in that it comprises at least the following steps:
Step 1: acquire point cloud data and convert it into an H*W*C three-dimensional matrix;
Step 2: optimize the network structure on the basis of these data;
Step 3: output the minimized probability map to give the three-dimensional point cloud segmentation.
2. The road-target real-time three-dimensional point cloud segmentation method based on SqueezeNet and a recurrent conditional random field according to claim 1, characterized in that step 1 comprises the following sub-steps:
Step 1.1) Acquire point cloud data; the point cloud consists of unstructured three-dimensional points, each of which carries 5-dimensional data: a Cartesian coordinate (x, y, z), the intensity value i of the point, and its range r;
Step 1.2) Convert the point cloud into a data matrix that satisfies the input format of the network by means of the spherical mapping given in formula (1):
where θ and φ denote the zenith angle and the azimuth angle, Δθ and Δφ denote the discretization resolution in θ and φ, and θ̃, φ̃ are the discretized values; each pair (θ̃, φ̃) maps the point to one position of a 2D grid map;
Step 1.3) Apply the above spherical mapping to every point of a point cloud frame to obtain an H*W*C three-dimensional matrix, where H and W correspond to the discretized values θ̃ and φ̃ and C equals 5; the C channels hold the x, y and z coordinates of the LiDAR point, its intensity value and its range value.
3. The road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field according to claim 1, characterized in that step 2 comprises:
Step 2.1) The input H*W*C tensor first passes through the squeeze layer of a Fire module; the squeeze layer consists of 1*1 convolution kernels whose number equals 1/4 of the number of input channels, so that the squeeze layer of the Fire module compresses the data depth while extracting features;
Step 2.2) The data then pass through a FireDeconv module;
Step 2.3) The recurrent conditional random field network is constructed;
Step 2.4) The macro-level optimization objective of the network model is defined;
the cross-entropy loss is given by the formula below:
where m is the resolution size, i.e. the total number of points, L_i is the cross-entropy loss of the i-th point, and p̂_i denotes the probability that the i-th point is classified as its true label c_i.
4. The road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field according to claim 3, characterized in that step 2.1) comprises:
a) expanding the depth of the squeezed data by a 1*1 Conv layer and a 3*3 Conv layer in parallel, each producing 1/2 of the output tensor depth;
b) concatenating the outputs of the two convolutions along the depth dimension, yielding an H*W*C tensor.
5. The road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field according to claim 3, characterized in that step 2.2) comprises:
a) feeding the input H*W*C tensor into the FireDeconv module and passing it through a Conv1*1 layer, which performs feature extraction and depth compression;
b) up-sampling the data with the depth unchanged, the expand layers restoring the depth to 1/2 of the output each;
c) concatenating the outputs of the two expand convolutions along the depth dimension, yielding an H*2W*C tensor.
6. The road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field according to claim 3, characterized in that step 2.3) comprises:
Step 2.3.1) defining the energy function of the recurrent conditional random field:
where c is the predicted label vector of the point cloud, c_i is the predicted label of the i-th point, u_i(c_i) = -logP(c_i) is the unary cross-entropy term of predicting the i-th point as class c_i, and b_{i,j}(c_i, c_j) is a penalty on assigning different labels to a pair of similar points; b_{i,j}(c_i, c_j) is defined as:
where h(c_i, c_j) = 1 when c_i ≠ c_j and h(c_i, c_j) = 0 otherwise, f_i and f_j are the features of the i-th and j-th points, k_m is the m-th Gaussian filter taking these features as input, and w_m is the corresponding coefficient; the following kernels are used:
where the vector P̃(θ̃, φ̃) is the angular position of a point, the vector X(x, y, z) is its Cartesian coordinate, and σ_α, σ_β and σ_γ are hyper-parameters that are usually chosen empirically;
Step 2.3.2) constructing the recurrent neural network (RNN) structure implementing the mean-field iteration of the energy function:
a) feeding the output of the base neural network into the recurrent neural network RNN as the initial probability map;
b) processing the raw LiDAR features with formula (4) to obtain two Gaussian kernels;
c) restricting the Gaussian kernels to a local 3*5 region; the kernels are applied to the probability map in the manner of a locally connected layer, which filters the data and passes messages;
d) re-weighting the aggregated probabilities from step c) with 1*1 convolution kernels and applying the compatibility transform, thereby changing the distribution of every point, the 1*1 convolution parameters being learned during training;
e) adding the re-weighted, compatibility-transformed probability map to the initial probability map, and outputting, through the Softmax normalization operation, the probability map that minimizes the cross-entropy.
CN201910540355.2A 2019-06-21 2019-06-21 Real-time three-dimensional point cloud segmentation method for road targets based on a recurrent conditional random field Pending CN110310298A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910540355.2A CN110310298A (en) 2019-06-21 2019-06-21 Real-time three-dimensional point cloud segmentation method for road targets based on a recurrent conditional random field

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910540355.2A CN110310298A (en) 2019-06-21 2019-06-21 Real-time three-dimensional point cloud segmentation method for road targets based on a recurrent conditional random field

Publications (1)

Publication Number Publication Date
CN110310298A true CN110310298A (en) 2019-10-08

Family

ID=68077630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910540355.2A Pending CN110310298A (en) Real-time three-dimensional point cloud segmentation method for road targets based on a recurrent conditional random field

Country Status (1)

Country Link
CN (1) CN110310298A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180225515A1 (en) * 2015-08-04 2018-08-09 Baidu Online Network Technology (Beijing) Co. Ltd. Method and apparatus for urban road recognition based on laser point cloud, storage medium, and device
CN108876796A (en) * 2018-06-08 2018-11-23 长安大学 A kind of lane segmentation system and method based on full convolutional neural networks and condition random field

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BICHEN WU: "SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud", 《2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA)》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112565794A (en) * 2020-12-03 2021-03-26 西安电子科技大学 Point cloud isolated point encoding and decoding method and device
CN112565794B (en) * 2020-12-03 2022-10-04 西安电子科技大学 Point cloud isolated point encoding and decoding method and device
CN113762195A (en) * 2021-09-16 2021-12-07 复旦大学 Point cloud semantic segmentation and understanding method based on road side RSU
CN115453570A (en) * 2022-09-13 2022-12-09 北京踏歌智行科技有限公司 Multi-feature fusion mining area dust filtering method

Similar Documents

Publication Publication Date Title
Cui et al. Fish detection using deep learning
US11651302B2 (en) Method and device for generating synthetic training data for an artificial-intelligence machine for assisting with landing an aircraft
CN109993082A (en) The classification of convolutional neural networks road scene and lane segmentation method
CN111507378A (en) Method and apparatus for training image processing model
US20160224903A1 (en) Hyper-parameter selection for deep convolutional networks
CN110532859A (en) Remote Sensing Target detection method based on depth evolution beta pruning convolution net
CN109461157A (en) Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field
CN106682569A (en) Fast traffic signboard recognition method based on convolution neural network
CN106920243A (en) The ceramic material part method for sequence image segmentation of improved full convolutional neural networks
CN111339935B (en) Optical remote sensing picture classification method based on interpretable CNN image classification model
CN110097145A (en) One kind being based on CNN and the pyramidal traffic contraband recognition methods of feature
KR102011788B1 (en) Visual Question Answering Apparatus Using Hierarchical Visual Feature and Method Thereof
CN110310298A (en) A kind of road target real-time three-dimensional point cloud segmentation method based on cycling condition random field
CN112070729A (en) Anchor-free remote sensing image target detection method and system based on scene enhancement
CN110009648A (en) Trackside image Method of Vehicle Segmentation based on depth Fusion Features convolutional neural networks
Doi et al. The effect of focal loss in semantic segmentation of high resolution aerial image
CN112329815B (en) Model training method, device and medium for detecting travel track abnormality
CN113743417B (en) Semantic segmentation method and semantic segmentation device
CN110281949B (en) Unified hierarchical decision-making method for automatic driving
US11695898B2 (en) Video processing using a spectral decomposition layer
CN107423747A (en) A kind of conspicuousness object detection method based on depth convolutional network
CN107506792A (en) A kind of semi-supervised notable method for checking object
CN107016371A (en) UAV Landing Geomorphological Classification method based on improved depth confidence network
CN107766828A (en) UAV Landing Geomorphological Classification method based on wavelet convolution neutral net
Qurishee Low-cost deep learning UAV and Raspberry Pi solution to real time pavement condition assessment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20191008)