CN110310298A - Road target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field - Google Patents
Road target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field - Download PDF
- Publication number: CN110310298A (application CN201910540355.2A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/143: Image analysis; Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
- G06T2207/10028: Image acquisition modality: Range image; Depth image; 3D point clouds
- G06T2207/20076: Special algorithmic details: Probabilistic image processing
- G06T2207/20081: Special algorithmic details: Training; Learning
- G06T2207/20084: Special algorithmic details: Artificial neural networks [ANN]
- Y02T10/40: Climate change mitigation technologies related to transportation: Engine management systems
Abstract
The present invention relates to real-time three-dimensional point cloud segmentation of road targets in the surrounding environment for autonomous driving and unmanned aerial vehicles in complex low-altitude road scenes, and in particular to a road target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field. The method is characterized by comprising at least the following steps: step 1, acquire point cloud data and obtain an H*W*C three-dimensional matrix; step 2, perform network architecture optimization on the above data; step 3, output the minimized probability map, which gives the three-dimensional point cloud segmentation. The model proposed by the present invention can segment road targets out of the point cloud in real time, and the engineered model meets the requirements of real-time operation and stability, so the method has practical value.
Description
Technical field
The present invention relates to real-time three-dimensional point cloud segmentation of the surrounding environment for autonomous driving and unmanned aerial vehicles in complex low-altitude road scenes, and in particular to a road target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field.
Background technique
In recent years, unmanned aerial vehicles (UAVs) have gradually moved into low-altitude activities such as express delivery, regional low-altitude safety inspection, road defect detection, and vehicle detection statistics over temporary road segments. These tasks require human-machine interaction and perception of the targets and surroundings in the low-altitude scene, so that the UAV can fly safely while completing its mission. Meanwhile, autonomous driving has attracted worldwide research attention in recent years, with in-depth studies around perception, decision-making and control. Both low-altitude UAV flight and autonomous driving systems depend on accurate, real-time and robust environment perception. In low-altitude road scenes, UAVs and autonomous vehicles need to precisely classify and localize road targets such as vehicles, pedestrians, cyclists and other obstacles. Among sensing modalities, cognition methods based on radar point cloud data reflect the real scene most directly and have wide application prospects. It is therefore both necessary and conducive to the development of UAVs and autonomous driving to study a three-dimensional perception technique for low-altitude road scenes.
Traditional point cloud segmentation methods usually comprise the following four common operations: remove the ground, cluster points into instances, extract features from each cluster, and classify the clusters according to those features. Such conventional methods depend heavily on hand-designed features, ignore the correlations within the data, and cannot obtain satisfactory segmentation results. Researchers later tried to interpret the segmentation problem from a probabilistic angle and proposed conditional random field (CRF) formulations for semantic segmentation, adding smoothness constraints between pixels during segmentation, which helps improve edge segmentation quality. The appearance of the FCN network demonstrated that convolutional neural networks can be used for image segmentation tasks: the network completes feature extraction from raw data by self-learning and, in an end-to-end manner, produces segmentation results at the original resolution of the input data; FCN experiments showed that deep networks perform far better than conventional methods on image segmentation. Influenced by the deep learning wave, S. Zheng et al. converted the mathematical formulation of the conditional random field into an RNN-style neural network structure. The DeepLab network then combined these two advanced segmentation methods: an FCN first produces a preliminary segmentation, and the result is then processed by a conditional random field before output. However, training covers only the FCN part, so the FCN and the CRF are effectively separated without joint training; although DeepLab segments better than FCN, the improvement is not obvious. Subsequently, point cloud segmentation networks emerged that preprocess the three-dimensional point cloud data and then apply neural network techniques. Zhou Y et al. proposed the VoxelNet model, which completes 3D detection and segmentation of objects on point cloud data and obtained good experimental results on the KITTI dataset, but the model is bulky and its real-time performance is poor. Qi C R et al. proposed the PointNet model, which can simultaneously complete point cloud classification, semantic segmentation and part segmentation; it processes each point of the unordered cloud independently, thereby achieving point cloud processing that is invariant to input order. However, when computing the score of each point, the model directly combines point features with global features, ignoring local features, which weakens the final segmentation edges. Moreover, a characteristic of point cloud data is uneven point density ("dense when near, sparse when far"); under such density differences, processing the data with a uniform template is computationally inefficient, and the model cannot run in real time. From this analysis of the state of the art, point cloud segmentation technology is in a stage of rapid development, and questions such as how to train networks more efficiently, how to compress the network model while maintaining segmentation accuracy, and how to improve model real-time performance and computational efficiency are all directions to be studied.
Summary of the invention
The object of the present invention is to provide a road target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field whose model has good real-time performance, high computational efficiency, and a good balance between model compression and segmentation accuracy.
The object of the present invention is achieved as follows: a road target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field, characterized by comprising at least the following steps:
Step 1, acquire point cloud data and obtain an H*W*C three-dimensional matrix;
Step 2, perform network architecture optimization on the above data;
Step 3, output the minimized probability map, which gives the three-dimensional point cloud segmentation.
The step 1 includes step in detail below:
Step 1.1) acquire point cloud data; the point cloud consists of unstructured three-dimensional spatial points, and each spatial point carries data of 5 dimensions: a set of Cartesian three-dimensional space coordinates (x, y, z), the intensity value i of the current point, and its range (distance);
Step 1.2) convert the point cloud data into a data matrix that meets the network's data requirements by means of spherical mapping, the spherical mapping equation being shown in formula (1),
where θ and φ respectively denote the zenith angle and the azimuth, Δθ and Δφ respectively denote the discretization precision on θ and φ, and (θ̃, φ̃) are the discretized values; each discretized pair (θ̃, φ̃) maps to one position on a 2D grid map;
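Formula (1) itself is not reproduced in the text (it was an image in the original). A hedged reconstruction, consistent with the symbol definitions above and with the usual spherical projection of LiDAR point clouds, is:

```latex
\theta = \arcsin\!\frac{z}{\sqrt{x^{2}+y^{2}+z^{2}}},\qquad
\varphi = \arcsin\!\frac{y}{\sqrt{x^{2}+y^{2}}},\qquad
\tilde{\theta} = \Big\lfloor \frac{\theta}{\Delta\theta} \Big\rfloor,\qquad
\tilde{\varphi} = \Big\lfloor \frac{\varphi}{\Delta\varphi} \Big\rfloor \tag{1}
```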
Step 1.3) apply the above spherical mapping to every point of a point cloud map to obtain an H*W*C three-dimensional matrix; H and W respectively correspond to the discretized mapping values θ̃ and φ̃, and C equals 5, the channels respectively corresponding to the x, y, z coordinate values, the intensity value and the range value of each LiDAR point.
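As an illustration of steps 1.2) and 1.3), the following sketch in plain Python projects an unordered cloud onto the H*W*C grid. The 64*512 grid size matches the data scale given later in the specification, but the field-of-view limits and the row/column offsets are illustrative assumptions, not values taken from the patent:

```python
import math

def spherical_project(points, H=64, W=512, fov_up=2.0, fov_down=-24.9, h_fov=90.0):
    """Project an unordered LiDAR point cloud onto an H x W x 5 grid.

    Each input point is (x, y, z, intensity); the 5 channels of a grid
    cell are x, y, z, intensity and range. FOV values (degrees) are
    illustrative defaults for a 64-beam LiDAR.
    """
    grid = [[[0.0] * 5 for _ in range(W)] for _ in range(H)]
    v_fov = math.radians(fov_up - fov_down)        # vertical FOV = H * delta_theta
    for x, y, z, intensity in points:
        r = math.sqrt(x * x + y * y + z * z)
        rxy = math.hypot(x, y)
        if r == 0.0 or rxy == 0.0:
            continue
        theta = math.asin(z / r)                   # elevation angle
        phi = math.asin(y / rxy)                   # azimuth
        # discretise: row index from elevation, column index from azimuth
        row = int((math.radians(fov_up) - theta) / v_fov * H)
        col = int((math.radians(h_fov) / 2 - phi) / math.radians(h_fov) * W)
        if 0 <= row < H and 0 <= col < W:
            grid[row][col] = [x, y, z, intensity, r]
    return grid
```

The result is the ordered, dense, image-like tensor that the network consumes; points outside the assumed field of view are simply dropped.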
The step 2 includes:
Step 2.1) the input H*W*C tensor first passes through the Fire squeeze (compression) layer, which consists of 1*1 convolution kernels whose number equals 1/4 of the input tensor's channel count; while completing feature extraction, the squeeze layer thereby also compresses the data depth;
Step 2.2) the data then passes through the FireDeconv module;
Step 2.3) build the recurrent conditional random field network;
Step 2.4) define the macroscopic optimization objective of the network model. The cross-entropy is computed as shown below,
where m denotes the resolution size, i.e. the total number of points, L_i denotes the cross-entropy loss of each point, and P_i(c_i) denotes the probability that the i-th point is classified as its true label c_i.
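The cross-entropy formula is not reproduced in the text; assuming the standard per-point loss L_i = -log P_i(c_i) averaged over all m points (which is consistent with the symbol definitions above), a minimal sketch is:

```python
import math

def cross_entropy_loss(prob_maps, labels):
    """Mean cross-entropy over all m points.

    prob_maps: per-point lists of class probabilities P_i.
    labels:    true class index c_i for each point.
    Computes L = (1/m) * sum_i( -log P_i(c_i) ).
    """
    m = len(labels)
    return sum(-math.log(p[c]) for p, c in zip(prob_maps, labels)) / m

# example: two points, 4 classes (vehicle, cyclist, pedestrian, background)
loss = cross_entropy_loss([[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25]], [0, 3])
```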
The step 2.1) includes:
a) the squeeze-layer data is expanded in tensor depth by a 1*1 Conv convolutional layer and a 3*3 Conv convolutional layer in parallel, each branch producing 1/2 of the output tensor depth;
b) the data from the two convolution operations is stacked along the depth dimension, outputting an H*W*C tensor.
The step 2.2) includes:
a) the input H*W*C tensor enters the FireDeconv module and passes through Conv1*1, completing feature extraction and depth compression;
b) the data is then upsampled, with the depth remaining unchanged; the expansion layers then each expand the data to 1/2 of the output depth;
c) the data from the two convolution operations is stacked along the depth dimension, outputting an H*2W*C tensor.
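The tensor bookkeeping of steps 2.1) and 2.2) can be sketched as shape arithmetic (depths only; the convolution weights themselves are omitted). The 1/4 squeeze ratio and the two half-depth expand branches follow the description above; everything else is illustrative:

```python
def fire_shapes(H, W, c_in, c_out):
    """Shapes through a Fire module; spatial size is unchanged."""
    squeeze = (H, W, c_in // 4)                  # 1*1 squeeze: 1/4 of input channels
    expand_half = c_out // 2                     # each expand branch: 1/2 of output depth
    out = (H, W, expand_half + expand_half)      # depth-wise concat of 1*1 and 3*3 branches
    return squeeze, out

def firedeconv_shapes(H, W, c_in, c_out):
    """FireDeconv: squeeze, upsample width x2, expand, concat -> H x 2W x c_out."""
    squeeze = (H, W, c_in // 4)                  # Conv1*1 squeeze
    upsampled = (H, 2 * W, c_in // 4)            # deconv doubles the width, depth unchanged
    out = (H, 2 * W, c_out)                      # concat of the two expand branches
    return squeeze, upsampled, out
```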
The step 2.3) includes:
Step 2.3.1) define the energy function of the recurrent conditional random field,
where c is the prediction label vector of the point cloud, c_i denotes the predicted label of the i-th point, u_i(c_i) = -logP(c_i) represents the cross-entropy probability value of the i-th point being predicted as class c_i, and b_{i,j}(c_i, c_j) is a penalty imposed on the behaviour of assigning different labels to a pair of similar points; the specific penalty b_{i,j}(c_i, c_j) is defined as follows:
when c_i ≠ c_j, h(c_i, c_j) = 1, otherwise h(c_i, c_j) = 0; f_i, f_j denote the features of the i-th and j-th points, k_m denotes the m-th Gaussian filter taking the features of the i-th and j-th points as input, and w_m denotes the corresponding coefficient. The following kernels are used,
where the vector Θ(θ, φ) denotes the angular information of a point, the vector X(x, y, z) denotes the Cartesian coordinates of a point, and σ_α, σ_β, σ_γ are a set of hyperparameters whose values are generally chosen by experience;
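The energy function and kernel formulas were images in the original and are not reproduced. A hedged reconstruction consistent with the definitions above (and with the standard dense-CRF formulation the text paraphrases) is:

```latex
E(\mathbf{c}) = \sum_{i} u_i(c_i) + \sum_{i,j} b_{i,j}(c_i,c_j),
\qquad u_i(c_i) = -\log P(c_i),
\qquad
b_{i,j}(c_i,c_j) = h(c_i,c_j)\sum_{m} w_m\, k_m(f_i,f_j),

k_1(f_i,f_j) = \exp\!\Big(-\frac{\lVert\Theta_i-\Theta_j\rVert^{2}}{2\sigma_\alpha^{2}}
-\frac{\lVert X_i - X_j\rVert^{2}}{2\sigma_\beta^{2}}\Big),
\qquad
k_2(f_i,f_j) = \exp\!\Big(-\frac{\lVert\Theta_i-\Theta_j\rVert^{2}}{2\sigma_\gamma^{2}}\Big)
```

Here k_1 uses both the angular and Cartesian information of the two points and k_2 only their angular information, matching the description in the detailed embodiment.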
Step 2.3.2) build the recurrent neural network structure (RNN) for the mean-field iteration of the energy function:
a) the output of the base neural network is input into the recurrent neural network RNN as a probability map;
b) the LiDAR initial feature data is processed by formula (4), obtaining two Gaussian kernels;
c) the scale of the Gaussian kernels is set to a 3*5 local region; the Gaussian kernels are computed on the probability map in the manner of a locally connected layer, completing the filtering of the data and the message passing;
d) the aggregated probabilities from step c) are re-weighted with a 1*1 convolution kernel and passed through the compatibility transform, so as to change the distribution of each point; the 1*1 convolution kernel parameters are learned during network training;
e) the initial probability map is superimposed on the re-weighted and compatibility-transformed probability map, and the minimized probability map is output through the Softmax normalization operation.
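The steps a) to e) above can be sketched as a simplified mean-field loop. The patent applies two Gaussian kernels over angle/XYZ features within a 3*5 region of a 2D probability map; the reduction below to one scalar kernel over a 1-D point list, and the Potts-style compatibility transform, are illustrative assumptions kept only to show the control flow:

```python
import math

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def mean_field(unary_probs, feats, sigma=1.0, w=1.0, n_iter=3):
    """Simplified mean-field iteration over a 1-D list of points.

    unary_probs: initial probability map from the base network (one row per point).
    feats:       one scalar feature per point driving a single Gaussian kernel.
    """
    q = [list(p) for p in unary_probs]
    n, k = len(q), len(unary_probs[0])
    for _ in range(n_iter):  # the patent fixes 3 iterations
        # steps b)+c): Gaussian-kernel message passing over the other points
        msg = [[sum(math.exp(-(feats[i] - feats[j]) ** 2 / (2 * sigma ** 2)) * q[j][c]
                    for j in range(n) if j != i) for c in range(k)]
               for i in range(n)]
        # step d): weighting + Potts compatibility transform (penalise the mass
        # that neighbours assign to *other* labels); w plays the 1*1-conv role
        pairwise = [[w * (sum(msg[i]) - msg[i][c]) for c in range(k)]
                    for i in range(n)]
        # step e): superimpose the initial (unary) probabilities and renormalise
        q = [softmax([math.log(unary_probs[i][c] + 1e-9) - pairwise[i][c]
                      for c in range(k)]) for i in range(n)]
    return q
```

Running it on a confident point next to an ambiguous one shows the intended smoothing: the ambiguous point is pulled toward the label of its similar, confident neighbour.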
The present invention proposes RobNet, a real-time three-dimensional point cloud segmentation model based on SqueezeNet and a recurrent CRF. The model is macroscopically divided into a base network and a recurrent network, each elaborated in detail at the micro level, and an engineering deployment is finally realized under the ROS framework. Using LiDAR point clouds as raw data effectively overcomes the influence of external factors such as light and weather on the algorithm, making the model more robust; at the same time, LiDAR data contains range information, and the range information of the environment can be used directly by subsequent UAV and autonomous-driving planning and control algorithms. In the network design, the compression strategy of SqueezeNet is referenced, which effectively reduces the number of network parameters and compresses the model size, and the recurrent conditional random field refines the output results, so that the proposed model can segment the road targets out of the point cloud well in real time; the engineered model meets the requirements of real-time operation and stability, and the model proposed by the present invention has practical value.
Table 1 lists all the symbols used in this specification and their meanings.
The invention is further described below with reference to the embodiments and the accompanying drawings:
Brief description of the drawings
Fig. 1 is a schematic diagram of the spherical mapping;
Fig. 2 shows the spherical mapping result of the point cloud data;
Fig. 3 shows the macroscopic design of the network model;
Fig. 4 shows the microscopic design of the base network;
Fig. 5 shows the microstructure of the Fire and FireDeconv modules, where Fig. 5(a) is the Fire structure and Fig. 5(b) is the FireDeconv structure;
Fig. 6 shows the RNN realization of the conditional random field CRF;
Fig. 7 illustrates the model optimization objective;
Fig. 8 is a schematic diagram of the model's point cloud segmentation function;
Fig. 9(a) shows the model training loss;
Fig. 9(b) shows the model training learning-rate schedule;
Fig. 10(1) training set evaluation: vehicle IoU score variation;
Fig. 10(2) training set evaluation: cyclist IoU score variation;
Fig. 10(3) training set evaluation: pedestrian IoU score variation;
Fig. 11(1) validation set evaluation: vehicle IoU score variation;
Fig. 11(2) validation set evaluation: vehicle precision score variation;
Fig. 11(3) validation set evaluation: vehicle recall score variation;
Fig. 11(4) validation set evaluation: cyclist IoU score variation;
Fig. 11(5) validation set evaluation: cyclist precision score variation;
Fig. 11(6) validation set evaluation: cyclist recall score variation;
Fig. 11(7) validation set evaluation: pedestrian IoU score variation;
Fig. 11(8) validation set evaluation: pedestrian precision score variation;
Fig. 11(9) validation set evaluation: pedestrian recall score variation;
Fig. 12(a) segmentation results, group 1 of 3;
Fig. 12(b) segmentation results, group 2 of 3;
Fig. 12(c) segmentation results, group 3 of 3;
Fig. 13 shows the logic flow of the ROS engineering code;
Fig. 14 shows the RVIZ real-time visualization result.
Specific embodiment
With reference to the accompanying drawings, the present invention is analysed concretely: a road target real-time three-dimensional point cloud segmentation method based on SqueezeNet and a recurrent CRF, characterized by comprising at least the following steps:
Step 1, acquire point cloud data and obtain an H*W*C three-dimensional matrix;
Step 2, perform network architecture optimization on the above data;
Step 3, output the minimized probability map, which gives the three-dimensional point cloud segmentation.
As shown in Figure 1, the step 1 includes the following detailed steps:
LiDAR point cloud data consists of a series of unstructured three-dimensional spatial points, each of which carries data of 5 dimensions: a set of Cartesian three-dimensional space coordinates (x, y, z), the intensity value i of the current point, and its range (distance). Point cloud data differs considerably from image data in its characteristics, so a data conversion via spherical mapping is required. This method expresses LiDAR point cloud data more compactly, improves the computational efficiency of the network, and at the same time satisfies the network's data format requirements. The spherical mapping equation is shown in formula (1),
where θ and φ respectively denote the zenith angle and the azimuth, Δθ and Δφ respectively denote the discretization precision on θ and φ, and (θ̃, φ̃) are the discretized values; each discretized pair maps to one position on a 2D grid map. Applying the above spherical mapping to each point of a point cloud map yields an H*W*C three-dimensional matrix: H and W respectively correspond to the discretized mapping values, and C equals 5, the channels respectively corresponding to the x, y, z coordinate values, the intensity value and the range value of each LiDAR point.
Through the above spherical conversion, the unordered, sparse three-dimensional point cloud data has been processed into an ordered, dense, image-like standard data format; the processing result is shown in Figure 2.
The step 2 includes the following detailed steps:
2.1 Network model macroscopic design
The macroscopic design of the network model is shown in Figure 3. When designing the basic convolutional neural network part of the model, the SqueezeNet network is referenced and optimized with residual connections, which greatly compresses the network model size and reduces the model's computation. As shown in Figure 3, the layers from Conv1a to Conv14 form the base network part of the model. The model input is the standard point cloud data after preprocessing, with a three-dimensional scale of 64*512*5; the input data passes through general convolution, max pooling and Fire operations that extract features from the data. When performing the down-sampling operations, it is considered that the transverse width of the data is much larger than the longitudinal height, which is only 64; therefore only the transverse width is compressed by sampling, while the longitudinal height remains unchanged. The four layers from FireDeconv10 to FireDeconv13 gradually complete the up-sampling of the image, and the resolution of the data output by the FireDeconv layers reaches agreement with the input resolution. During up-sampling, skip connections from data of equal resolution are added and superimposed on the data. Conv14 performs a convolution operation on the data, changing the data depth to 4, and carries out the Softmax operation to output the probability feature map. The probability feature map output by the base network is further refined at the pixel level by the recurrent conditional random field, and the final output is a label feature map with the same resolution as the input data, i.e. the segmentation result.
2.2 Network model microscopic design
2.2.1 Base network microscopic design
The base network is selected on the basis of the SqueezeNet network and fine-tuned. The complete design of the base network is shown in Figure 4, where the data depth of each layer is identified. The entire base network consists of Conv layers, Fire layers, FireDeconv layers, pooling layers and a Softmax layer. Residual connections are added between Fire2 and Fire3, between Fire4 and Fire5, between Fire6 and Fire7, and between Fire8 and Fire9.
2.2.2 Fire module and FireDeconv module design
The structures of Fire and FireDeconv are shown in Figure 5. The role of the Fire module is to perform convolution operations on the input tensor, producing an output tensor of the same scale as the input. The input tensor H*W*C first passes through the squeeze layer of Fire, which consists of 1*1 convolution kernels whose number equals 1/4 of the input tensor's channel count; while completing feature extraction, this layer compresses the data depth. The data then passes through the expansion layer, where Conv1*1 and Conv3*3 convolutions each expand the squeeze-layer data in depth to 1/2 of the output tensor depth; finally the data from the two convolution operations of the expansion layer is stacked along the depth dimension, outputting an H*W*C tensor. FireDeconv is similar to the Fire module and likewise significantly lowers the number of parameters and the computation of the network layer, except that it is used to up-sample the input data. The input tensor H*W*C enters the FireDeconv module and first passes through Conv1*1, completing feature extraction and depth compression; then Deconv upsample*2 up-samples the data, with depth unchanged. The expansion layer then expands the data in depth to 1/2 of the output depth per branch, and finally the data from the two convolution operations is simply stacked along the depth dimension, outputting an H*2W*C tensor.
The parameter compression rates of the Fire and FireDeconv modules are shown in Table 1.
Table 1: compression effectiveness statistics of the Fire and FireDeconv modules

| | 3*3 Conv | Fire | 1*4 Deconv | FireDeconv |
| --- | --- | --- | --- | --- |
| Number of parameters | 9C² | 3C²/2 | 4C² | 7C²/2 |
| Computation | 9HWC² | 3HWC²/2 | 4HWC² | 7HWC²/2 |
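Part of Table 1 can be reproduced directly from the module descriptions. The sketch below (assuming input depth = output depth = C and ignoring biases, which appears to be how the table counts) checks the 3*3 Conv, 1*4 Deconv and Fire figures; the FireDeconv total depends on where the deconvolution sits in the pipeline and is not re-derived here:

```python
def conv_params(k_h, k_w, c_in, c_out):
    """Weight count of a k_h*k_w convolution layer (biases ignored)."""
    return k_h * k_w * c_in * c_out

def fire_params(C):
    # squeeze 1*1: C -> C/4; expand branches 1*1 and 3*3: C/4 -> C/2 each
    squeeze = conv_params(1, 1, C, C // 4)
    expand_1x1 = conv_params(1, 1, C // 4, C // 2)
    expand_3x3 = conv_params(3, 3, C // 4, C // 2)
    return squeeze + expand_1x1 + expand_3x3

C = 64
assert conv_params(3, 3, C, C) == 9 * C * C      # plain 3*3 conv: 9C^2
assert conv_params(1, 4, C, C) == 4 * C * C      # plain 1*4 deconv: 4C^2
assert fire_params(C) == 3 * C * C // 2          # Fire: 3C^2/2, a 6x reduction
```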
2.2.3 Recurrent conditional random field network construction
Define the prediction label vector of the point cloud as c, with c_i denoting the predicted label of the i-th point, and define the energy function of the conditional random field as follows,
where u_i(c_i) = -logP(c_i) represents the cross-entropy probability value of the i-th point being predicted as class c_i, and b_{i,j}(c_i, c_j) is a penalty imposed on the behaviour of assigning different labels to a pair of similar points; the specific penalty b_{i,j}(c_i, c_j) is defined as follows:
when c_i ≠ c_j, h(c_i, c_j) = 1, otherwise h(c_i, c_j) = 0. f_i, f_j denote the features of the i-th and j-th points, k_m denotes the m-th Gaussian filter taking the features of the i-th and j-th points as input, and w_m denotes the corresponding coefficient. In the algorithm design here, the following two filters are used,
where the vector Θ(θ, φ) denotes the angular information of a point and the vector X(x, y, z) denotes the Cartesian coordinates of a point; the first filter in the above formula uses both the angular information and the Cartesian coordinate information of the two points, while the second filter involves only their angular information. σ_α, σ_β, σ_γ are a set of hyperparameters whose values are generally chosen by experience.
The RNN network structure of the mean-field iteration of the energy function is shown in Figure 6: the output of the CNN is input into the RNN as a probability map. Formula (4) is computed on the LiDAR initial feature data to obtain two Gaussian kernels. As two points move apart in 3D Cartesian space and in the 2D angular space, the value of the Gaussian kernel drops rapidly; therefore, when two points are far apart, the above Gaussian evaluation becomes almost meaningless, so the scale of the Gaussian kernel is here set to a 3*5 local region. As shown in the first layer of Figure 6, the Gaussian kernels are computed on the probability map in the manner of a locally connected layer, completing the filtering of the data and the message passing; this step aggregates the probabilities of neighbouring points. The aggregated probabilities are then re-weighted and passed through the compatibility transform, which is used to change the distribution of each point; this step can be realized with a 1*1 convolution kernel, whose parameters can be learned during network training. Finally, in the third layer, the initial probability map is superimposed on the re-weighted and compatibility-transformed probability map, and the final refined probability map is then output by the Softmax normalization operation. The RNN part in the figure can be iterated; the number of iterations is here set to 3.
2.2.4 network model macroscopic view optimization aim
The overall segmentation objective of the network model measures the consistency between the probability map output by the RNN and the true labels. As shown in Fig. 8, the 3D block is the probability map inside the RNN; the slice resolution equals the resolution of the original CNN input, and the depth is 4, each value along the depth representing the probability of one class (vehicle, cyclist, pedestrian, background). The cross entropy is computed as

L = (1/m) · Σᵢ Lᵢ,  Lᵢ = −log(Pᵢ^(cᵢ))   (5)

where m is the resolution size, i.e. the total number of points, Lᵢ is the cross-entropy loss of point i, and Pᵢ^(cᵢ) is the probability that point i is classified with its true label cᵢ. Minimizing formula (5) yields the final segmentation result of the network.
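As a concrete illustration of formula (5), here is a minimal NumPy sketch of the pointwise cross entropy; the function name and array layout are illustrative, since the patent does not specify an implementation:

```python
import numpy as np

def point_cross_entropy(prob_map, labels):
    """Mean cross entropy over all m points of the refined probability map.

    prob_map : (m, n_classes) softmax output of the RNN, one row per point
    labels   : (m,) true class index c_i of each point
    """
    m = labels.shape[0]
    p_true = prob_map[np.arange(m), labels]   # P_i^{c_i}: probability of the true label
    return -np.log(p_true + 1e-12).sum() / m  # L = (1/m) * sum_i (-log P_i^{c_i})
```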
Step 3 comprises the following steps:
3.1 Evaluation criteria and experimental platform
3.1.1 Experimental platform and dataset
The hardware platform used for model training and ROS deployment is: 1) processor: Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz; 2) memory: 8GB; 3) GPU: NVIDIA 1060, 6GB video memory. The software versions are: 1) operating system: Ubuntu 14.04; 2) deep-learning framework: TensorFlow 1.4.0; 3) ROS: Indigo; 4) Python: 2.7. The dataset acquisition method is described in Section 2: the 10848 frames of raw data converted from the KITTI dataset are divided into a training set of 8057 frames and a validation set of 2791 frames. The split is made as uniform as possible over the time sequence of the data, to prevent frames from clustering within a particular time period.
3.1.2 Evaluation criteria
The evaluation uses the standard indices of pointwise classification segmentation tasks: by comparing the pointwise predictions with the true labels, the prediction precision Pr_c, recall recall_c and intersection-over-union score IoU_c of each class c are obtained, defined as

Pr_c = |P_c ∩ G_c| / |P_c|,  recall_c = |P_c ∩ G_c| / |G_c|,  IoU_c = |P_c ∩ G_c| / |P_c ∪ G_c|

where P_c and G_c denote, respectively, the set of points predicted as class c and the set of ground-truth points of class c. The proposed network model thus realizes the segmentation function shown in Fig. 8: the input is unordered point cloud data and the output is point cloud data with class labels.
3.1.3 Model size calculation
Table 2 gives complete statistics of the data dimensions, number of learnable parameters and filter hyperparameters of each layer of the proposed network model; cells left blank in the table correspond to layers without the corresponding attribute and are not counted.
Table 2 Parameter statistics of the proposed model
As shown in Table 2, the total parameter count of the proposed network model is 904880. With single-precision floating-point parameters of 32 bit (4 byte), the size of the entire model is about 3.5MB. As shown in Table 3, common image classification networks are far larger: AlexNet is 240MB, VGG16 is 500MB and GoogLeNet is 50MB; among common point cloud detection and segmentation networks, 3DmFV-Net is 45MB and PCNN is 8MB. A 3.5MB model is much easier to deploy on intelligent platforms with limited computing resources.
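The 3.5MB figure follows directly from the parameter count; a one-line check (helper name illustrative):

```python
def model_size_mb(n_params, bytes_per_param=4):
    """Model size in MB for 32-bit (4-byte) floating-point parameters."""
    return n_params * bytes_per_param / (1024 ** 2)
```

With the 904880 parameters of Table 2 this gives roughly 3.45MB, matching the "about 3.5MB" stated above.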
Table 3 Comparison of model parameter sizes
3.2 Model training and evaluation
The dataset used by the present invention is converted from the public KITTI dataset and contains 10848 frames in total, of which 8057 frames form the training set and 2791 frames the validation set. Before training, the pre-trained parameters of the base network are loaded; the total number of training steps is set to 25000, the batch size to 16 and the initial learning rate to 0.01, the learning rate is halved every 10000 steps, and the momentum is set to 0.9. The training process is visualized in real time, recording the model loss and the precision, recall and IoU scores on the training and validation sets. The training loss and learning-rate curves are shown in Fig. 9: Fig. 9(a) shows the training loss and Fig. 9(b) the learning-rate schedule; the model has essentially converged by about 21k training steps. The IoU indices of the three classes on the training set over the whole training process are shown in Fig. 10: Fig. 10(1) the vehicle IoU of the training data, Fig. 10(2) the cyclist IoU and Fig. 10(3) the pedestrian IoU. The dark yellow curves in Fig. 9 and Fig. 10 are smoothed versions of the light yellow curves, to highlight the overall trend.
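The stepwise schedule described above (initial rate 0.01, halved every 10000 steps) can be sketched as a small function; the name is illustrative and this stands in for whatever TensorFlow schedule the original training script used:

```python
def learning_rate(step, base_lr=0.01, decay_steps=10000, decay=0.5):
    """Piecewise-constant schedule: halve the learning rate every 10000 steps."""
    return base_lr * decay ** (step // decay_steps)
```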
More detailed evaluations are recorded on the validation set. Fig. 11 shows the segmentation results for the three classes on the validation set: Fig. 11(1) vehicle IoU; Fig. 11(2) vehicle precision; Fig. 11(3) vehicle recall; Fig. 11(4) cyclist IoU; Fig. 11(5) cyclist precision; Fig. 11(6) cyclist recall; Fig. 11(7) pedestrian IoU; Fig. 11(8) pedestrian precision; Fig. 11(9) pedestrian recall. For each class the evaluation comprises the three scores precision, recall and IoU. In Fig. 11 the light blue curves are the predicted score of each target at each iteration, and the dark blue curves are smoothed versions of the predicted scores, making the overall trend over the iterations easier to observe.
To show the segmentation quality of the model intuitively, the differences between its predictions and the ground-truth labels are visualized. Three groups of experimental comparisons are listed, each comprising the LiDAR depth map of the current frame (first), the ground-truth label map (second) and the predicted segmentation map (third), as shown in Fig. 12.
In Fig. 12, the green mask represents vehicles and the light blue mask represents pedestrians and cyclists. Observing Fig. 12(a), the model accurately segments the road targets in the scene, including points around individuals, such as the cyclist in the enlarged crop; this also explains the high recall of the model. The model even segments the vehicle in the lower-left corner that is missing from the ground-truth labels, showing that it has good reasoning and generalization ability. Fig. 12(b) shows that the model segments vehicle targets well: the prediction is essentially consistent with the ground truth. Observing Fig. 12(c), a small part of the point cloud is mistakenly segmented as pedestrian; in fact, for these points the pedestrian score is only marginally higher than the background score, and they are classified as pedestrian because they carry some pedestrian-like characteristics of their own. Such a more conservative segmentation strategy serves UAV and autonomous-driving applications and basically conforms to their safety principles.
Table 4 Pointwise classification segmentation performance of the model
The pointwise classification performance of the model is given in Table 4. From Table 4 and Fig. 11 it can be seen that the model segments vehicles well: the precision reaches 61.8%, the recall is close to 100%, and the IoU reaches 59.9, outperforming VoxelNet, PointNet and MS+CU on vehicle segmentation. Recall is a critical index in UAV flight and autonomous-driving scenes, because under real road conditions it is preferable to wrongly segment points near a target as that target than to segment points that do belong to the target as background or another class. This more conservative segmentation strategy serves the subsequent motion planning and action decisions of UAVs and autonomous vehicles, and reduces the possibility of collisions. The high recall on all three target classes meets the safety requirements of autonomously moving agents.
3.3 Real-time performance and stability of the engineered model under the ROS framework
The whole model is engineered under the ROS framework; the engineering code flow under ROS is shown in Fig. 13. First, the 2D-converted LiDAR point cloud data are loaded in sequence, the network model proposed herein (implemented on the TensorFlow framework) is called in the program, and the trained network parameters are loaded into the TensorFlow model. Each frame is then processed in a loop: the point cloud of the current frame is read, the model is called to complete the segmentation, and the segmentation result is packed into the ROS data format and published as a topic. RViz, a visualization tool under ROS, loads the environment configuration file, subscribes in real time to the information on the /module_seg/points topic, and visualizes the point cloud segmentation result in real time.
The real-time visualization result in RViz is shown in Fig. 14; meanwhile, the program records the processing time of each frame of point cloud data. The real-time performance of the system is given in Table 5, which reports the mean and standard deviation of the per-frame processing time and the average frame rate of the system. When the engineered model runs, the processing time per frame of point cloud data is 82.5ms, giving an average frame rate of 12.1Hz; since the scan frequency of a LiDAR is generally around 10Hz, the engineered code meets the performance requirement for real-time operation. The standard deviation of the running time is 4.5ms, meaning the per-frame processing time varies very little and the program is stable: no single frame will stall and disturb the other running components of the agent, which guarantees stable task execution by UAVs and autonomous vehicles in low-altitude and road scenes.
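The real-time figures reported in Table 5 relate per-frame time and frame rate in a simple way; a small sketch of the statistics the program records (function name illustrative, example timings other than 82.5ms are hypothetical):

```python
import numpy as np

def timing_stats(frame_times_s):
    """Mean processing time, its standard deviation, and average frame rate."""
    t = np.asarray(frame_times_s)
    return t.mean(), t.std(), 1.0 / t.mean()  # frame rate = 1 / mean time
```

A mean per-frame time of 0.0825s corresponds to 1/0.0825 ≈ 12.1Hz, as stated above.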
Table 5 Real-time performance and stability of the engineered model
The present invention proposes RobNet, a neural network model suitable for real-time 3D point cloud segmentation by UAVs and autonomous vehicles in low-altitude street scenes. It first describes the basic theory of convolutional neural networks and model compression together with the acquisition of semantically labelled point cloud data, and converts the 3D point cloud to 2D by the spherical mapping method. It then designs the overall network model from the macroscopic view and describes in detail, at the microscopic level, the base network of the first half and the recurrent conditional random field network of the second half, covering the residual connections of the base network, the internal design of the Fire and FireDeconv modules, and the derivation and construction of the RNN. Finally, the proposed model is engineered under the ROS framework. The experimental results show that the proposed model can segment the road targets in the point cloud well in real time, while the model size remains small. The engineered model meets the requirements of real-time operation and stability, so the model proposed by the present invention has practical value.
Claims (6)
1. A road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field, characterized by comprising at least the following steps:
Step 1: acquire point cloud data and obtain an H*W*C three-dimensional matrix;
Step 2: perform network structure optimization on the above data;
Step 3: output the minimized probability map, which gives the three-dimensional point cloud segmentation.
2. The road-target real-time three-dimensional point cloud segmentation method based on SqueezeNet and a recurrent conditional random field according to claim 1, characterized in that step 1 comprises the following steps:
Step 1.1): acquire point cloud data; the point cloud data consist of unstructured three-dimensional spatial points, each spatial point containing 5 dimensions of data: one group of Cartesian three-dimensional coordinates (x, y, z), the intensity value i of the current point, and its range;
Step 1.2): convert the point cloud data with a spherical mapping into a data matrix meeting the network's data requirements; the spherical mapping equation is given by formula (1):

θ = arcsin(z / √(x² + y² + z²)), θ̃ = ⌊θ / Δθ⌋
φ = arcsin(y / √(x² + y²)), φ̃ = ⌊φ / Δφ⌋   (1)

where θ and φ denote the apex angle and the azimuth respectively, Δθ and Δφ denote the discretization precision on θ and φ, θ̃ and φ̃ are the discrete values, and each group (θ̃, φ̃) maps to one position on the 2D grid map;
Step 1.3): apply the above spherical mapping to every point of a point cloud map, obtaining an H*W*C three-dimensional matrix, where H and W correspond to the ranges of the discrete values θ̃ and φ̃ of the mapping result, C equals 5, and the C channels correspond, respectively, to the x, y and z coordinate values, the intensity value and the range value of each LiDAR point.
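A sketch of the spherical projection of steps 1.2) and 1.3) in NumPy, assuming the arcsin-based angle definitions common to this family of LiDAR projections (the function name is illustrative; in practice a constant offset also shifts negative angles into the grid):

```python
import numpy as np

def spherical_project(points, d_theta, d_phi):
    """Map raw LiDAR points onto a discrete 2D angular grid (formula (1)).

    points : (N, 5) array of x, y, z, intensity, range per point
    Returns the integer grid coordinates (theta_idx, phi_idx) of each point.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x ** 2 + y ** 2 + z ** 2)
    theta = np.arcsin(z / r)                    # apex (vertical) angle
    phi = np.arcsin(y / np.sqrt(x ** 2 + y ** 2))  # azimuth
    theta_idx = np.floor(theta / d_theta).astype(int)
    phi_idx = np.floor(phi / d_phi).astype(int)
    return theta_idx, phi_idx
```

Scattering the 5 channels (x, y, z, intensity, range) of each point into the cell at its grid coordinates then yields the H*W*5 input tensor of step 1.3).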
3. The road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field according to claim 1, characterized in that step 2 comprises:
Step 2.1): the input H*W*C tensor first passes through the compression layer of a Fire module; the compression layer of Fire consists of 1*1 convolution kernels, the number of convolution kernels being equal to 1/4 of the channel count of the input tensor, so that the compression layer compresses the data further while completing feature extraction;
Step 2.2): the data then pass through a FireDeconv module;
Step 2.3): construct the recurrent conditional random field network;
Step 2.4): define the macroscopic optimization objective of the network model; the cross entropy is computed as

L = (1/m) · Σᵢ Lᵢ,  Lᵢ = −log(Pᵢ^(cᵢ))

where m is the resolution size, i.e. the total number of points, Lᵢ is the cross-entropy loss of point i, and Pᵢ^(cᵢ) is the probability that point i is classified with its true label cᵢ.
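The recurrent-CRF refinement built in step 2.3) can be sketched as a single, simplified mean-field iteration in NumPy. This is a sketch under stated assumptions: a fixed Potts compatibility matrix and a uniform 3*5 kernel stand in for the learned 1*1 convolutions and Gaussian kernels, and all names are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mean_field_step(unary, kernel):
    """One mean-field iteration over an H*W class-probability map.

    unary  : (H, W, C) initial class probabilities from the CNN
    kernel : (3, 5) local weights (the 3*5 neighbourhood of the text)
    """
    H, W, C = unary.shape
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(unary, ((ph, ph), (pw, pw), (0, 0)))
    # 1) message passing: aggregate neighbouring probabilities with the kernel
    msg = np.zeros_like(unary)
    for i in range(kh):
        for j in range(kw):
            msg += kernel[i, j] * padded[i:i + H, j:j + W, :]
    # 2) compatibility transform (Potts model: penalise differing labels)
    compat = 1.0 - np.eye(C)          # 1 off-diagonal, 0 on the diagonal
    pairwise = msg @ compat
    # 3) add the unary term back and renormalise with a softmax
    return softmax(np.log(unary + 1e-9) - pairwise)
```

Iterating this step (3 times in the text) and learning the kernel and compatibility weights recovers the CRF-as-RNN scheme the claim describes.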
4. The road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field according to claim 3, characterized in that step 2.1) comprises:
a) expanding the tensor depth of the compression-layer data through a 1*1 Conv convolutional layer and a 3*3 Conv convolutional layer respectively, each branch producing 1/2 of the output tensor depth;
b) stacking the data of the two convolution branches along the depth dimension and outputting an H*W*C tensor.
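A NumPy sketch of the Fire module of steps 2.1) and a)–b): squeeze to C/4 with a 1*1 convolution, then expand with parallel 1*1 and 3*3 convolutions of C/2 channels each and concatenate. The naive convolution and the random weights (standing in for learned parameters) are illustrative:

```python
import numpy as np

def conv2d(x, w):
    """Naive 'same' 2D convolution: x (H, W, Cin), w (kh, kw, Cin, Cout)."""
    kh, kw, cin, cout = w.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw), (0, 0)))
    H, W = x.shape[:2]
    out = np.zeros((H, W, cout))
    for i in range(kh):
        for j in range(kw):
            out += xp[i:i + H, j:j + W, :] @ w[i, j]
    return out

def fire(x, rng):
    """Fire module: squeeze to C/4 with 1*1, expand with 1*1 and 3*3 to C/2 each."""
    C = x.shape[-1]
    s = np.maximum(conv2d(x, rng.standard_normal((1, 1, C, C // 4))), 0)  # squeeze + ReLU
    e1 = conv2d(s, rng.standard_normal((1, 1, C // 4, C // 2)))           # expand 1*1
    e3 = conv2d(s, rng.standard_normal((3, 3, C // 4, C // 2)))           # expand 3*3
    return np.concatenate([e1, e3], axis=-1)  # stack in depth: back to C channels
```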
5. The road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field according to claim 3, characterized in that step 2.2) comprises:
a) the input H*W*C tensor enters the FireDeconv module and passes through a Conv1*1 layer, completing feature extraction and depth compression;
b) the data are then upsampled with the depth unchanged, and the expansion layers each extend the depth to 1/2 of the output depth;
c) the data of the two convolution branches are stacked along the depth dimension, outputting an H*2W*C tensor.
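A shape-level NumPy sketch of the FireDeconv module of steps a)–c). Assumptions are labelled in the comments: a 1*1 convolution is just a per-pixel matmul, and nearest-neighbour upsampling stands in for the learned deconvolution:

```python
import numpy as np

def fire_deconv(x, rng):
    """FireDeconv sketch: squeeze, upsample width 2x, expand, concatenate.

    Random weights stand in for learned parameters; nearest-neighbour
    repetition stands in for the learned transposed convolution.
    """
    H, W, C = x.shape
    s = np.maximum(x @ rng.standard_normal((C, C // 4)), 0)  # squeeze (Conv1*1) + ReLU
    up = np.repeat(s, 2, axis=1)                             # upsample width 2x, depth unchanged
    e1 = up @ rng.standard_normal((C // 4, C // 2))          # expansion branch 1 -> C/2
    e2 = up @ rng.standard_normal((C // 4, C // 2))          # expansion branch 2 -> C/2
    return np.concatenate([e1, e2], axis=-1)                 # (H, 2W, C)
```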
6. The road-target real-time three-dimensional point cloud segmentation method based on a recurrent conditional random field according to claim 3, characterized in that step 2.3) comprises:
Step 2.3.1): define the energy function of the recurrent conditional random field:

E(c) = Σᵢ uᵢ(cᵢ) + Σᵢ,ⱼ bᵢ,ⱼ(cᵢ, cⱼ)

where c is the prediction label vector of the point cloud, cᵢ denotes the predicted label of the i-th point, uᵢ(cᵢ) = −log P(cᵢ) represents the cross-entropy probability value of the i-th point being predicted as class cᵢ, and bᵢ,ⱼ(cᵢ, cⱼ) is a penalty assigned to the behaviour of giving different labels to a group of similar points; bᵢ,ⱼ(cᵢ, cⱼ) is defined as:

bᵢ,ⱼ(cᵢ, cⱼ) = h(cᵢ, cⱼ) · Σₘ wₘ kₘ(fᵢ, fⱼ)

where h(cᵢ, cⱼ) = 1 when cᵢ ≠ cⱼ and h(cᵢ, cⱼ) = 0 otherwise, fᵢ and fⱼ denote the features of points i and j, kₘ denotes the m-th Gaussian filter taking the features of points i and j as input, and wₘ denotes the corresponding coefficient; the following kernels are used:

k¹(fᵢ, fⱼ) = exp(−‖Θᵢ − Θⱼ‖² / (2σ_α²) − ‖Xᵢ − Xⱼ‖² / (2σ_β²)),
k²(fᵢ, fⱼ) = exp(−‖Θᵢ − Θⱼ‖² / (2σ_γ²))   (4)

where the vector Θ denotes the angular position of a point, the vector X(x, y, z) denotes the Cartesian coordinates of a point, and σ_α, σ_β, σ_γ are a group of hyperparameters whose values are generally chosen by experience;
Step 2.3.2): construct the recurrent neural network structure RNN of the mean-field iteration of the energy function:
a) feed the output of the base neural network into the recurrent neural network RNN as a probability map;
b) process the initial LiDAR feature data with formula (4) to obtain the two Gaussian kernels;
c) set the scale of the Gaussian kernels to a 3*5 local region; apply the Gaussian kernels to the probability map in the manner of a locally connected layer, completing data filtering and information propagation;
d) reweight the aggregated probabilities resulting from step c) and apply a compatibility transform with a 1*1 convolution kernel, to adjust the distribution of each point; the 1*1 convolution parameters are learned during network training;
e) add the reweighted and compatibility-transformed probability map back onto the initial probability map, and output the minimized probability map through a Softmax normalization operation.
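The two Gaussian kernels of formula (4) can be evaluated for a pair of points as follows; the sigma values are placeholders (the text says they are chosen by experience), and the function name is illustrative:

```python
import numpy as np

def gaussian_kernels(theta_i, theta_j, x_i, x_j, sa=0.9, sb=0.9, sg=0.01):
    """Evaluate the two Gaussian kernels of formula (4) for one point pair.

    theta_* : 2D angular position of a point on the projected grid
    x_*     : Cartesian coordinates (x, y, z) of the point
    sa, sb, sg : the sigma_alpha, sigma_beta, sigma_gamma hyperparameters
    """
    da = np.sum((np.asarray(theta_i) - np.asarray(theta_j)) ** 2)  # angular distance^2
    dx = np.sum((np.asarray(x_i) - np.asarray(x_j)) ** 2)          # Cartesian distance^2
    k1 = np.exp(-da / (2 * sa ** 2) - dx / (2 * sb ** 2))  # bilateral kernel
    k2 = np.exp(-da / (2 * sg ** 2))                       # spatial smoothing kernel
    return k1, k2
```

Both kernels equal 1 for identical points and fall towards 0 as the points separate, which is why restricting them to a 3*5 local region loses little, as the description notes.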
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910540355.2A CN110310298A (en) | 2019-06-21 | 2019-06-21 | A kind of road target real-time three-dimensional point cloud segmentation method based on cycling condition random field |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110310298A true CN110310298A (en) | 2019-10-08 |
Family
ID=68077630
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910540355.2A Pending CN110310298A (en) | 2019-06-21 | 2019-06-21 | A kind of road target real-time three-dimensional point cloud segmentation method based on cycling condition random field |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110310298A (en) |
Worldwide applications: 2019-06-21 — CN CN201910540355.2A — patent CN110310298A/en, active, Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180225515A1 (en) * | 2015-08-04 | 2018-08-09 | Baidu Online Network Technology (Beijing) Co. Ltd. | Method and apparatus for urban road recognition based on laser point cloud, storage medium, and device |
CN108876796A (en) * | 2018-06-08 | 2018-11-23 | 长安大学 | A kind of lane segmentation system and method based on full convolutional neural networks and condition random field |
Non-Patent Citations (1)
Title |
---|
BICHEN WU: "SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud", 《2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA)》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112565794A (en) * | 2020-12-03 | 2021-03-26 | 西安电子科技大学 | Point cloud isolated point encoding and decoding method and device |
CN112565794B (en) * | 2020-12-03 | 2022-10-04 | 西安电子科技大学 | Point cloud isolated point encoding and decoding method and device |
CN113762195A (en) * | 2021-09-16 | 2021-12-07 | 复旦大学 | Point cloud semantic segmentation and understanding method based on road side RSU |
CN115453570A (en) * | 2022-09-13 | 2022-12-09 | 北京踏歌智行科技有限公司 | Multi-feature fusion mining area dust filtering method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cui et al. | Fish detection using deep learning | |
US11651302B2 (en) | Method and device for generating synthetic training data for an artificial-intelligence machine for assisting with landing an aircraft | |
CN109993082A (en) | The classification of convolutional neural networks road scene and lane segmentation method | |
CN111507378A (en) | Method and apparatus for training image processing model | |
US20160224903A1 (en) | Hyper-parameter selection for deep convolutional networks | |
CN110532859A (en) | Remote Sensing Target detection method based on depth evolution beta pruning convolution net | |
CN109461157A (en) | Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field | |
CN106682569A (en) | Fast traffic signboard recognition method based on convolution neural network | |
CN106920243A (en) | The ceramic material part method for sequence image segmentation of improved full convolutional neural networks | |
CN111339935B (en) | Optical remote sensing picture classification method based on interpretable CNN image classification model | |
CN110097145A (en) | One kind being based on CNN and the pyramidal traffic contraband recognition methods of feature | |
KR102011788B1 (en) | Visual Question Answering Apparatus Using Hierarchical Visual Feature and Method Thereof | |
CN110310298A (en) | A kind of road target real-time three-dimensional point cloud segmentation method based on cycling condition random field | |
CN112070729A (en) | Anchor-free remote sensing image target detection method and system based on scene enhancement | |
CN110009648A (en) | Trackside image Method of Vehicle Segmentation based on depth Fusion Features convolutional neural networks | |
Doi et al. | The effect of focal loss in semantic segmentation of high resolution aerial image | |
CN112329815B (en) | Model training method, device and medium for detecting travel track abnormality | |
CN113743417B (en) | Semantic segmentation method and semantic segmentation device | |
CN110281949B (en) | Unified hierarchical decision-making method for automatic driving | |
US11695898B2 (en) | Video processing using a spectral decomposition layer | |
CN107423747A (en) | A kind of conspicuousness object detection method based on depth convolutional network | |
CN107506792A (en) | A kind of semi-supervised notable method for checking object | |
CN107016371A (en) | UAV Landing Geomorphological Classification method based on improved depth confidence network | |
CN107766828A (en) | UAV Landing Geomorphological Classification method based on wavelet convolution neutral net | |
Qurishee | Low-cost deep learning UAV and Raspberry Pi solution to real time pavement condition assessment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20191008 |