CN110378909B - Single wood segmentation method for laser point cloud based on Faster R-CNN

Single wood segmentation method for laser point cloud based on Faster R-CNN

Info

Publication number
CN110378909B
CN110378909B (application CN201910551190.9A)
Authority
CN
China
Prior art keywords
point cloud
tree
trunk
forest
cnn
Prior art date
Legal status
Active
Application number
CN201910551190.9A
Other languages
Chinese (zh)
Other versions
CN110378909A (en)
Inventor
云挺 (Yun Ting)
陈鑫鑫 (Chen Xinxin)
王佳敏 (Wang Jiamin)
曹林 (Cao Lin)
Current Assignee
Nanjing Forestry University
Original Assignee
Nanjing Forestry University
Priority date
Filing date
Publication date
Application filed by Nanjing Forestry University filed Critical Nanjing Forestry University
Priority to CN201910551190.9A priority Critical patent/CN110378909B/en
Publication of CN110378909A publication Critical patent/CN110378909A/en
Application granted granted Critical
Publication of CN110378909B publication Critical patent/CN110378909B/en

Classifications

    • G06N 3/045 — Combinations of networks (G PHYSICS; G06 COMPUTING, CALCULATING OR COUNTING; G06N Computing arrangements based on specific computational models; G06N 3/00 Computing arrangements based on biological models; G06N 3/02 Neural networks; G06N 3/04 Architecture, e.g. interconnection topology)
    • G06T 7/11 — Region-based segmentation (G06T Image data processing or generation, in general; G06T 7/00 Image analysis; G06T 7/10 Segmentation; edge detection)
    • G06T 2207/10028 — Range image; depth image; 3D point clouds (G06T 2207/10 Image acquisition modality)
    • G06T 2207/20081 — Training; learning (G06T 2207/20 Special algorithmic details)
    • G06T 2207/30188 — Vegetation; agriculture (G06T 2207/30 Subject of image; G06T 2207/30181 Earth observation)
    • Y02A 90/10 — Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation (Y02A Technologies for adaptation to climate change)


Abstract

The invention discloses a single-tree segmentation method for laser point clouds based on Faster R-CNN, which comprises: acquiring forest stand point cloud data; computing the point cloud features of the scanned forest stand to separate the branches and leaves in the stand point cloud data; performing adaptive voxelization on the woody (branch and trunk) point cloud data of the forest stand and projecting it from multiple angles to generate corresponding depth images; detecting trunks in the generated depth images with a deep learning method; and back-projecting the detected trunk positions in the depth images to obtain the corresponding three-dimensional trunk point clouds. The obtained trunk point clouds serve as seed points and, in combination with a region growing algorithm, yield individual tree separation. By adopting deep learning and training on large data samples, the invention achieves high single-tree segmentation accuracy and makes it possible to accurately solve the problem of segmenting individual rubber trees in ground-based LiDAR data with deep learning.

Description

Single wood segmentation method for laser point cloud based on Faster R-CNN
Technical Field
The invention relates to the technical field of individual tree segmentation, and in particular to a laser point cloud-oriented single-tree segmentation method based on Faster R-CNN.
Background
Natural rubber, widely planted in tropical regions, is an important industrial raw material and strategic resource, and its role in national economic construction is increasingly prominent. However, the rubber tree originates in the wild from the Amazon basin of South America and has no wild resources in China, so Chinese rubber forests are mostly pure artificial plantations. Hainan, the largest rubber production base in China, has nearly 8 million mu of rubber forest, forming the largest artificial ecosystem there. Owing to its geographical location, it is frequently disturbed by typhoons: statistics show that Hainan Island has been struck by more than 100 typhoons over the past 60 years. A typhoon strikes within a short time and causes serious damage to rubber trees, such as trunk and branch breakage and uprooting. Against the background of global warming, although the number of typhoons making landfall in Hainan each year tends to decrease, their average intensity generally tends to increase, which is bound to seriously affect rubber planting and production. To determine the wind-resistance indices of rubber trees and breed strongly resistant varieties, an accurate single rubber tree segmentation algorithm is an indispensable precondition for obtaining the structural parameters and dynamic change information of rubber trees.
Laser radar, i.e. Light Detection and Ranging (LiDAR), with its ability to accurately record three-dimensional laser points, provides a promising approach for obtaining three-dimensional (3D) phenotypic characteristics of plants; it mainly works in the ultraviolet, visible and infrared bands. According to the working platform, laser scanning systems can be divided into five types: satellite laser scanning (SLS), airborne laser scanning (ALS), mobile laser scanning (MLS), vehicle-borne laser scanning (VLS) and terrestrial laser scanning (TLS). SLS and ALS adopt a top-down scanning mode, while MLS, VLS and TLS adopt a bottom-up scanning mode. Top-down scanning can clearly capture the vegetation canopy and has great potential and advantages for recording the vertical structural characteristics of forests and extracting canopy structure parameters. Bottom-up scanning can clearly record the lower parts of the canopy (such as trunks and leaves) and is better suited to ground forest inventory work, but if the leaf area density of the vegetation below the canopy is high, data from the upper part of the forest may be lost.
To date, scholars have proposed many single-tree segmentation methods for airborne LiDAR data, realizing individual tree segmentation from the geometric features of the forest canopy; these can be roughly divided into two types: segmentation algorithms based on a Canopy Height Model (CHM) and algorithms working directly on the scattered point cloud. Accurate individual crown segmentation with ground-based mobile LiDAR remains challenging, especially in ecological forests where crown heights are irregular and crowns intersect severely. Although some creative studies on detecting trunk positions from ground-based mobile LiDAR data for tree segmentation have been reported, this type of approach still faces two problems: (1) the woody components of the surveyed trees may be deformed by long-term exposure to typhoon disasters; (2) the robustness and versatility of single-tree segmentation models for ground-based mobile LiDAR data require further investigation.
Disclosure of Invention
The technical problem to be solved by the invention is to address the shortcomings of the prior art by providing a laser point cloud-oriented single-tree segmentation method based on Faster R-CNN. The method adopts deep learning, trains on large data samples, achieves high single-tree segmentation accuracy, and makes it possible to accurately solve the problem of segmenting individual rubber trees in ground-based LiDAR data with deep learning.
In order to achieve the technical purpose, the invention adopts the following technical scheme:
A laser point cloud-oriented single-tree segmentation method based on Faster R-CNN comprises the following steps:
step 1: acquiring forest stand point cloud data with ground-based mobile LiDAR;
step 2: computing the point cloud features of the scanned forest stand to separate the branches and leaves in the stand point cloud data;
step 3: performing adaptive voxelization on the woody point cloud data of the forest stand and projecting it from multiple angles to generate corresponding depth images;
step 4: detecting trunks in the generated depth images with a deep learning method;
step 5: back-projecting the detected trunk positions in the depth images to obtain the spatial three-dimensional point clouds of the corresponding trunks;
step 6: using the obtained trunk point clouds as seed points, in combination with a region growing algorithm, to achieve individual tree separation.
As a further improved technical scheme of the present invention, the step 1 includes:
and (3) measuring by a Velodyne HDL-32E scanner according to a set measuring route by moving back and forth in a forest stand, and splicing the measured point cloud data by a SLAM algorithm.
As a further improved technical scheme of the present invention, the step 2 includes:
For each laser point $p_i$ in the forest stand point cloud data, the following feature vector is acquired:

$$F_i = \{e_{ix}, e_{iy}, e_{iz}, c_{0,i}, c_{1,i}, c_{2,i}, l_{0,i}, l_{1,i}, l_{2,i}\}$$

where $\{e_{ix}, e_{iy}, e_{iz}\}$ is the normal vector of point $p_i$, $\{c_{0,i}, c_{1,i}, c_{2,i}\}$ are the structure tensor features of $p_i$, and $\{l_{0,i}, l_{1,i}, l_{2,i}\}$ are the shape eigenvalues of $p_i$;

according to the feature vector $F_i$ obtained for each laser point $p_i$, the branches and leaves of the forest stand are separated with a Gaussian classifier.
As a further improved technical scheme of the present invention, the step 3 includes:
in the adaptive voxelization process, defining the length of a voxel block according to the planting row spacing of the forest, the width according to the plant spacing within a row, and the height according to the under-crown trunk height of the forest tree clone;
after the voxelization operation, assigning the forest stand point cloud data to the corresponding voxels and using a value V_i to mark each valid voxel;
for each voxel V_i, generating two corresponding depth images by projection from the positive Y direction and the positive X direction.
As a further improved technical solution of the present invention, the step 4 includes:
(4.1) modeling:
selecting a training plot, processing the forest stand in the training plot with the methods of steps 1 to 3 in sequence to obtain a number of depth images, marking the trunk positions of the trees in each obtained depth image, and taking the marked depth images as the training set;
selecting a test plot, processing the forest stand in the test plot with the methods of steps 1 to 3 in sequence to obtain a number of depth images, and taking the obtained depth images as the test set;
training the ability of the Faster R-CNN model to identify the trunk position of the tree in the depth image by using the training set to obtain a trained Faster R-CNN network model;
testing the test set by using a trained Faster R-CNN network model;
(4.2) detecting tree trunks of depth images generated by forest segments needing single-tree segmentation:
The depth images of the forest stand requiring single-tree segmentation, obtained with the methods of steps 1 to 3 in sequence, are input into the trained Faster R-CNN network model, which outputs a predicted bounding box and a prediction confidence for each tree trunk position in each depth image; the predicted bounding boxes with a prediction confidence above 80% are retained.
As a further improved technical solution of the present invention, the step 5 includes:
The x, y and z values of the spatial trunk position are calculated by combining the position and size information of the voxel block with the position of the predicted bounding box in the depth image generated from that voxel block; using these x, y and z values, the trunk in the two-dimensional predicted bounding box is back-projected into the corresponding 3D space, yielding the three-dimensional point cloud of the corresponding trunk part.
As a further improved technical solution of the present invention, the step 6 includes:
based on the trunk point cloud obtained in step 5, an octree-based region growing algorithm is used to segment the point cloud of the tree skeleton, where the tree skeleton is the remaining structure of the tree with the leaves removed;
for the unclassified leaf point clouds, the segmented tree skeleton point clouds are used as centres, and a clustering algorithm assigns the leaf points to the point cloud of the corresponding tree skeleton.
The beneficial effects of the invention are as follows: the method projects the target point cloud data into two-dimensional depth images through adaptive voxelization, and then detects the trunk parts of the forest in the depth images with a deep learning method; the detected two-dimensional trunks are back-projected to obtain the three-dimensional point cloud data of the corresponding trunk parts; based on the acquired trunk point cloud data, final single-tree segmentation is achieved in combination with a region growing algorithm. The method therefore differs from single-tree segmentation algorithms based on crown detection: it first detects the trunk of each individual tree, which fundamentally reduces the influence of crown detection errors on the segmentation result. Existing trunk-detection-based single-tree segmentation algorithms for ground-based mobile LiDAR data can produce errors when applied to stands whose tree form is severely deformed after wind damage. In general, the invention shows great promise in applying this deep-learning-based method to single rubber tree segmentation research on ground-based mobile LiDAR scanning data.
Drawings
Fig. 1 is a flow chart of the single-tree segmentation method of the present embodiment.
Fig. 2 is a view showing an outline of the study area of this embodiment.
Fig. 3 is a result diagram of data preprocessing, adaptive voxel formation and multi-angle projection generation of depth images of three rubber forest segments according to the present embodiment.
Fig. 4 is a flow chart of the detection of the trunk of the rubber tree based on the fast R-CNN model in the present embodiment.
Fig. 5 is a diagram showing an example of the training sample of the present embodiment.
FIG. 6 is a training loss diagram of the fast R-CNN model of the present embodiment.
Fig. 7 is a diagram showing an example of the test results of the test sample according to the present embodiment.
FIG. 8 is a diagram showing the result of the trunk detection of the rubber tree according to the present embodiment.
Fig. 9 is a graph of the region growing result based on the point cloud of the trunk portion in the present embodiment.
Fig. 10 is a graph of the final single-tree segmentation results after clustering the leaf point clouds onto the skeleton part point clouds in this embodiment.
Detailed Description
The following further describes embodiments of the present invention with reference to fig. 1 to 10:
a single wood segmentation method facing to laser point cloud based on Faster R-CNN, the flow is shown in figure 1, includes:
step 1: and acquiring forest stand point cloud data based on the ground mobile LiDAR.
Specifically: FIG. 2 is a schematic diagram of the study area. An experimenter carrying a Velodyne HDL-32E scanner walked through three rubber forest stands at a speed of 0.5 m/s along established measurement routes to obtain point cloud data of the target rubber forest stands. The data of the whole system are stitched with a simultaneous localization and mapping (SLAM) algorithm.
In this example, a forest stand subset was selected from each of the three rubber forest stands (rubber forest stand 1, rubber forest stand 2 and rubber forest stand 3) and used as a training plot in the subsequent experiments. Each training plot covers an area of approximately 0.6 × 0.6 km representing the corresponding rubber forest stand, and tree height measurements were made in the three training plots with a Vertex IV hypsometer on 11 February 2016 (see Table 1 for specific values). Furthermore, three further subsets (disjoint from the subsets used as training plots) were selected from the three rubber forest stands, each covering an area of about 0.3 × 0.6 km; these were used as the subsequent test plots.
Step 2: and calculating the point cloud characteristics of the scanned forest stand, so that branch and leaf separation of the forest stand point cloud data is realized.
Specifically: the feature vector of each laser point $p_i$ is computed as

$$F_i = \{e_{ix}, e_{iy}, e_{iz}, c_{0,i}, c_{1,i}, c_{2,i}, l_{0,i}, l_{1,i}, l_{2,i}\}$$

where $\{e_{ix}, e_{iy}, e_{iz}\}$ is the normal vector of point $p_i$, $\{c_{0,i}, c_{1,i}, c_{2,i}\}$ are the structure tensor features of $p_i$, and $\{l_{0,i}, l_{1,i}, l_{2,i}\}$ are the shape eigenvalues of $p_i$. According to the feature vector $F_i$ obtained for each scanned laser point $p_i$, the branches and leaves of the rubber forest stand are separated with a Gaussian classifier.
Step 3: performing adaptive voxelization on the woody point cloud data of the forest stand, and projecting it from multiple angles to generate the corresponding depth images. In the adaptive voxelization process, the length of a voxel block is defined according to the planting row spacing of the forest, the width according to the plant spacing within a row, and the height according to the under-crown trunk height of the forest tree clone; after the voxelization operation, the forest stand point cloud data are assigned to the corresponding voxels, and a value V_i is used to mark each valid voxel.
In artificial rubber plantations, the row spacing is generally about 6-8 m and the plant spacing about 2.8-3 m, so in the adaptive voxelization process this embodiment defines the voxel block length as 8 m and the width as 3 m. Furthermore, considering the growth and structural characteristics of the different rubber tree clones, different heights are set for the voxels of the three training plots according to the under-crown trunk height of the three clones (see Table 1): the voxel heights are 5.62 m (training plot 1), 8.40 m (training plot 2) and 9.35 m (training plot 3). After the voxelization operation, the point cloud data of the three training plots are assigned to the respective voxels, and a value V_i marks each valid voxel. For each voxel V_i, two corresponding depth images are generated by projection from the positive Y direction and the positive X direction. The numbers of depth images generated from the scan points of the three training plots are 233, 268 and 301, respectively, all of which are used to construct the training set; the total number of depth images in the training set is 802. In Fig. 3, (a), (b) and (c) are the result diagrams of data preprocessing, adaptive voxelization and multi-angle projection for rubber forest stands 1, 2 and 3, respectively.
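As an illustration only, the sketch below voxelizes a stand point cloud with the block dimensions above and renders one depth image per block for the +Y and +X viewing directions; the image resolution and depth encoding are assumptions, not taken from the patent.

```python
import numpy as np

def voxelize_and_project(points, voxel_size=(8.0, 3.0, 5.62), img_w=256, img_h=256):
    """Assign points to length x width x height voxel blocks and render a
    depth image per block for the +Y and +X projections."""
    size = np.asarray(voxel_size)
    origin = points.min(axis=0)
    keys = np.floor((points - origin) / size).astype(int)
    images = {}
    for key in {tuple(k) for k in keys}:
        pts = points[(keys == np.asarray(key)).all(axis=1)]
        q = (pts - (origin + np.asarray(key) * size)) / size   # in [0, 1) per axis
        for view, u_axis, d_axis in (("Y+", 0, 1), ("X+", 1, 0)):
            u = np.minimum((q[:, u_axis] * (img_w - 1)).astype(int), img_w - 1)
            v = np.minimum(((1.0 - q[:, 2]) * (img_h - 1)).astype(int), img_h - 1)
            img = np.zeros((img_h, img_w))
            np.maximum.at(img, (v, u), 1.0 - q[:, d_axis])     # nearer -> brighter
            images[(key, view)] = img
    return images

# e.g. a random 40 m x 30 m x 6 m patch of points
depth_images = voxelize_and_project(np.random.rand(20000, 3) * [40.0, 30.0, 6.0])
```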
Table 1: parameters related to three rubber forest stands
[Table 1 appears only as an image in the source; values cited in the text include the under-crown trunk heights of 5.62 m, 8.40 m and 9.35 m, per-plot depth-image counts of 233, 268 and 301, and the clones PR107, CATAS 7-20-59 and CATAS 8-7-9.]
Step 4: detecting the trunks in the generated depth images with a deep learning method.
(4.1) Modelling: a training plot is selected, the forest stand in the training plot is processed with the methods of steps 1 to 3 in sequence to obtain a number of depth images, the trunk positions of the trees in each obtained depth image are marked, and the marked depth images form the training set. A test plot is selected, the forest stand in the test plot is processed with the methods of steps 1 to 3 in sequence to obtain a number of depth images, and the obtained depth images form the test set. The training set is used to train the ability of the Faster R-CNN model to identify tree trunk positions in the depth images, yielding a trained Faster R-CNN network model. The test set is then tested with the trained Faster R-CNN network model.
The Faster Region-based Convolutional Neural Network (Faster R-CNN) model consists of two convolutional neural networks: a Region Proposal Network (RPN) and a Fast Region-based Convolutional Neural Network (Fast R-CNN) detection network that uses the proposed regions. The RPN samples regions of the image as proposal regions and is trained to determine which regions may contain targets. The Fast R-CNN detection network further processes the region information collected by the RPN, determines the target class in each region, and adjusts the size and position of the region to locate the precise position of the target in the image. FIG. 4 shows the model architecture for automatic rubber tree trunk detection and identification based on Faster R-CNN.
The training process of the Faster R-CNN network model mainly comprises four steps. First, the parameters of the entire Faster R-CNN network model are initialized with a pretrained model. Second, the RPN is trained with the training set. Third, the proposal regions generated by the RPN are used to train the Fast R-CNN detection network. Finally, the RPN and Fast R-CNN form a joint network, and the weights of the joint network are adjusted by repeating the above procedure.
Based on the pretrained convolutional neural network, the training set generated from the training plots is used to train and optimize the parameters of the Faster R-CNN network model. As described in step 3, the training samples are generated by multi-angle projection of the adaptive voxel blocks. According to the number and morphological characteristics of the rubber tree trunks in the depth images, this embodiment analyses the depth images in six cases: (a) the depth image contains only one complete rubber tree trunk; (b) the depth image contains two complete rubber tree trunks; (c) several branched stems appear in the voxel; (d) the trunk information is obscured by leaves or branches; (e) trunk information belonging to several trees overlaps in one voxel; (f) the trunk of the same rubber tree appears in two adjacent images generated by the projection of two adjacent voxels. To prepare for the subsequent training process, the training samples must first be marked (i.e. the trunk positions in the obtained depth images are labelled); for the six cases above, this embodiment adopts different marking modes: (a) the entire trunk is marked; (b) the trunks of both trees in the image are marked; (c) the trunks of the trees in the image are marked, including branches; (d) only the trunk part is marked, not the leaf part; (e) all overlapping trunks are marked; (f) only the trunk part within the voxel is marked, while branches or the upper part of the trunk appearing in the neighbouring voxel are not marked. The labelling results of depth images for the six cases in part of the training set are shown in Fig. 5, where a rectangular box tightly enclosing the rubber tree trunk serves as the ground truth in subsequent training.
Specific training process of model:
A. Selection of a feature extraction network model:
The choice of convolutional neural network for feature extraction determines the accuracy of the final recognition result of the training process, so selecting an appropriate convolutional neural network framework is important.
In computer vision, network depth is an important factor in achieving good results: as the depth increases, the representational capacity increases accordingly. However, too great a depth can cause the vanishing gradient phenomenon. Therefore, the VGG16 network is selected as the feature extraction network model in this embodiment. The VGG16 network consists of 13 convolutional layers, each followed by a ReLU layer, a common activation function in artificial neural networks. Some convolutional layers are also followed by a max-pooling layer to preserve features and reduce unnecessary parameters, increasing computation speed.
B. Pretraining the CNN model:
As shown in Table 1, the total number of samples in the training set is 802, but high-precision training normally rests on a large number of samples. Transfer learning can use general data to obtain a pretrained model and use that model to construct the RPN and the Fast R-CNN detection network, alleviating the problem of a training set that is too small. In this embodiment, a training set from ImageNet (approximately 100,000 images, 1000 classes) is used to pretrain the VGG16 network model.
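As a hedged illustration (not the authors' code), an ImageNet-pretrained VGG16 convolutional backbone with the classification head removed can be obtained in TensorFlow as follows; the input size is an arbitrary example.

```python
import tensorflow as tf

# ImageNet-pretrained VGG16 backbone without the fully connected head,
# standing in for the feature extraction network described above.
backbone = tf.keras.applications.VGG16(include_top=False, weights="imagenet")
feature_map = backbone(tf.random.uniform((1, 560, 800, 3)))
print(feature_map.shape)   # five 2x poolings: (1, 17, 25, 512)
```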
C. Training of RPN network:
As an important component of Faster R-CNN, the RPN takes depth images as input and outputs a set of rectangular proposal regions to the Fast R-CNN detection network. This dedicated network structure improves the speed of region extraction. At the same time, it scales and translates the size and position of each proposal region according to the ground truth, making the localization more accurate.
The feature map of the input image is extracted with the convolutional layers of the VGG16 feature extraction network. For each position of the feature map, a convolution is performed through a 3 × 3 sliding window to obtain a multi-dimensional feature vector for that position, reflecting the depth features within the sliding window. This feature vector is fed to two sibling fully connected layers: a regression layer (reg layer) and a classification layer (cls layer). Anchors are centred on the 3 × 3 sliding window; each sliding window has nine anchors, whose sizes correspond to the combinations of three scales $[128^2, 256^2, 512^2]$ and three aspect ratios $[1{:}1, 1{:}2, 2{:}1]$. An anchor whose overlap with any ground truth box is higher than 0.7 is assigned a positive label (i.e. a positive anchor). In some rare cases no positive anchor is found this way; then the anchor with the highest overlap with a ground truth box is designated the positive anchor. A non-positive anchor whose overlap with all ground truth boxes is below 0.3 is designated a negative anchor. For each anchor, the probability that it belongs to the foreground (i.e. the information in the anchor is judged to be a target) or the background (i.e. non-target) is computed, together with the positional offset of the anchor relative to the ground truth. When the information in an anchor is identified as a target (i.e. the anchor belongs to the foreground), the anchor is retained as a proposal region and used for subsequent training. Before the Fast R-CNN detection network is trained further, the proposal regions generated by the RPN are first mapped onto the feature map, generating a series of ROIs (regions of interest) of random size on the feature map.
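For illustration, here is a minimal sketch of the nine-anchor grid described above (three areas × three aspect ratios on a stride-16 feature map); the stride and the box layout are standard Faster R-CNN conventions, assumed rather than quoted from the patent.

```python
import numpy as np

def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128.0, 256.0, 512.0), ratios=(1.0, 0.5, 2.0)):
    """Nine anchors per feature-map cell: areas 128^2, 256^2, 512^2 crossed
    with aspect ratios 1:1, 1:2 and 2:1, centred on each cell."""
    base = []
    for s in scales:
        for r in ratios:
            w, h = s * np.sqrt(r), s / np.sqrt(r)   # area ~ s^2, aspect w/h = r
            base.append([-w / 2, -h / 2, w / 2, h / 2])
    base = np.asarray(base)                          # (9, 4) as x1, y1, x2, y2
    cx, cy = np.meshgrid(np.arange(feat_w) * stride, np.arange(feat_h) * stride)
    centres = np.stack([cx, cy, cx, cy], axis=-1).reshape(-1, 1, 4)
    return (centres + base).reshape(-1, 4)

anchors = make_anchors(35, 50)   # ~1750 positions x 9 anchors, matching N_cls below
print(anchors.shape)             # (15750, 4)
```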
D. Training of Fast R-CNN detection network:
The ROIs of random size are normalized to a fixed size by the pooling operation of the ROI pooling layer (see Fig. 4). Each normalized ROI is fed into two fully connected layers: one (the cls layer) performs the classification of the target in the ROI, while the other (the reg layer) adjusts the position of the ROI according to the ground truth through two translation and two scaling parameters, bringing the ROI closer to the true position of the target.
E. Loss function in training process
During training, the parameters of the neural network are adjusted through the loss function, which is defined as follows:

$$L(\{e_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_i L_{cls}(e_i, e_i^*) + \lambda\,\frac{1}{N_{reg}}\sum_i e_i^*\,L_{reg}(t_i, t_i^*) \quad (1)$$

where $i$ is the index of an anchor in the training process. The classification loss $L_{cls}(e_i, e_i^*)$ is a two-class log loss (whether the anchor belongs to the foreground or the background):

$$L_{cls}(e_i, e_i^*) = -\log\left[e_i^* e_i + (1 - e_i^*)(1 - e_i)\right] \quad (2)$$

where $e_i$ is the probability that the anchor is predicted as foreground and $e_i^*$ is the ground truth label: $e_i^* = 1$ if the anchor is a positive anchor and $e_i^* = 0$ if it is a negative anchor.

$L_{reg}(t_i, t_i^*)$ is the regression loss:

$$L_{reg}(t_i, t_i^*) = R(t_i - t_i^*) \quad (3)$$

For each anchor, the regression loss is multiplied by $e_i^*$, which means the loss is only counted when the anchor is judged to be foreground and ignored otherwise. Here $t_i = \{t_x, t_y, t_w, t_h\}$ is the vector of offsets predicted for the anchor, and $t_i^* = \{t_x^*, t_y^*, t_w^*, t_h^*\}$, a vector of the same dimension, is the actual offset of the anchor relative to the ground truth (see Fig. 4). As shown in equation (4), $R$ is a smooth loss function whose smoothed region is controlled by a parameter $\sigma$ (typically $\sigma = 1$; in the Faster R-CNN function $\sigma = 3$):

$$R(x) = \begin{cases} 0.5\,(\sigma x)^2, & |x| < 1/\sigma^2 \\ |x| - 0.5/\sigma^2, & \text{otherwise} \end{cases} \quad (4)$$

As shown in equation (1), the outputs of the cls and reg layers (see Fig. 4) consist of $\{e_i\}$ and $\{t_i\}$ respectively. $N_{cls}$ is the size of the feature map (about 1750, i.e. 50 × 35), and $N_{reg}$ is the batch size ($N_{reg} = 256$ in the RPN and $N_{reg} = 128$ in the Fast R-CNN detection network). Since $N_{cls}$ and $N_{reg}$ differ greatly, the parameter $\lambda$ is used to balance them so that the total loss $L(\{e_i\},\{t_i\})$ accounts for both losses uniformly.
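A minimal numerical sketch of equations (1)-(4), using numpy arrays for the anchor scores and offsets; the value λ = 10 comes from the original Faster R-CNN paper and is an assumption here, as are the example inputs.

```python
import numpy as np

def smooth_l1(x, sigma=3.0):                          # R(x), Eq. (4)
    absx = np.abs(x)
    return np.where(absx < 1.0 / sigma**2,
                    0.5 * (sigma * x) ** 2,
                    absx - 0.5 / sigma**2)

def total_loss(e, e_star, t, t_star, n_cls=1750, n_reg=256, lam=10.0):
    """Eq. (1): log loss over anchor scores plus smooth-L1 regression loss,
    the latter counted only for positive anchors (multiplied by e*)."""
    eps = 1e-12                                       # numerical safety only
    l_cls = -np.log(e_star * e + (1 - e_star) * (1 - e) + eps)   # Eq. (2)
    l_reg = smooth_l1(t - t_star).sum(axis=-1)                   # Eq. (3)
    return l_cls.sum() / n_cls + lam * (e_star * l_reg).sum() / n_reg

# two anchors: one positive (e* = 1), one negative (e* = 0)
e      = np.array([0.9, 0.2])
e_star = np.array([1.0, 0.0])
t      = np.array([[0.1, 0.0, 0.2, 0.1], [0.0, 0.0, 0.0, 0.0]])
t_star = np.zeros((2, 4))
print(total_loss(e, e_star, t, t_star))
```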
Using the manually marked rubber tree trunks as references, the difference between the predicted region information and the ground truth can be computed, and according to this difference the weights and offsets of the neural network are adjusted with the back propagation algorithm. Therefore, with sufficient training, Faster R-CNN can accurately detect the position and class of rubber tree trunks.
The convolutional neural network of this experiment is built on the TensorFlow deep learning framework. Experiments were performed on a PC with an Intel i7-8550U CPU, 16 GB RAM and an NVIDIA GTX 1070 GPU. During the experiment, the momentum method was used to train Faster R-CNN, with a weight decay of 0.0001 and a momentum of 0.9. The learning rate during training was 0.001 and the number of iterations was 70000. The training loss is shown in Fig. 6; the total training time was about 2 hours.
Although the overall loss curve is not entirely smooth, the overall downward trend is evident (Fig. 6). The decrease in loss occurs mainly in the first 100 iterations. The final loss is about 0.002, which means the error between the predicted results and the corresponding ground truth is small.
After the Faster Region-based Convolutional Neural Network (Faster R-CNN) model is trained, it needs to be tested. The testing process comprises four main steps: first, for the forest stand point cloud data of each test plot, the points are assigned to different voxels by the voxelization method; second, the depth images for testing are generated by multi-angle projection of the voxels; third, the test plot is tested with the trained Faster R-CNN network model to predict the positions of the rubber tree trunks in the images; finally, only the predictions with a confidence above 80% are retained. FIG. 7 shows the test recognition results for rubber tree trunks in the six cases.
(4.2) Detecting tree trunks in the depth images generated from the forest stands requiring single-tree segmentation with the trained Faster R-CNN network model:
The depth images of the forest stand requiring single-tree segmentation, obtained with the methods of steps 1 to 3 in sequence, are input into the trained Faster R-CNN network model, which outputs a predicted bounding box and a prediction confidence for each tree trunk position in each depth image; the predicted bounding boxes with a prediction confidence above 80% are retained.
Step 5: and obtaining the space three-dimensional point cloud of the corresponding trunk by back projection by utilizing the detected position information of the trunk in the depth image.
The specific steps are as follows: the x, y and z values of the rubber tree trunk's spatial position are calculated by combining the position and size information of the voxel block with the position of the predicted bounding box in the corresponding depth image. Using these x, y and z values, the trunk in the two-dimensional predicted bounding box is back-projected into the corresponding 3D space, yielding the three-dimensional point cloud of the corresponding rubber tree trunk.
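A minimal sketch of this back-projection, assuming the same block-local projection conventions as the voxelization sketch in step 3 (image size and axis mapping are assumptions): the 3D points of a voxel block whose projections fall inside a detected 2D box are returned as the trunk point cloud.

```python
import numpy as np

def bbox_to_points(points, voxel_origin, voxel_size, bbox, view="Y+",
                   img_w=256, img_h=256):
    """Return the 3D points of one voxel block whose 2D projections fall
    inside a predicted bounding box (x1, y1, x2, y2) in that block's image."""
    lo = np.asarray(voxel_origin)
    size = np.asarray(voxel_size)
    q = (points - lo) / size                       # block-local coordinates
    in_voxel = ((q >= 0.0) & (q < 1.0)).all(axis=1)
    u_axis = 0 if view == "Y+" else 1              # +Y view: image u follows x
    u = q[:, u_axis] * (img_w - 1)
    v = (1.0 - q[:, 2]) * (img_h - 1)
    x1, y1, x2, y2 = bbox
    inside = (u >= x1) & (u <= x2) & (v >= y1) & (v <= y2)
    return points[in_voxel & inside]               # trunk part point cloud
```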
Step 6: based on the trunk point cloud obtained in step 5, an octree-based region growing algorithm is used to segment the rubber tree skeleton point cloud; for the unclassified leaf point clouds, the segmented rubber tree skeleton point clouds are used as centres, and a clustering algorithm assigns the leaf point clouds to the corresponding skeleton point clouds.
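The following is a simplified sketch of this step under stated assumptions: a greedy radius-based region growing stands in for the octree-based variant, and leaf points are assigned to the nearest grown skeleton point (the search radius is an assumption).

```python
import numpy as np
from scipy.spatial import cKDTree

def region_grow(points, seed_mask, radius=0.2):
    """Grow a region from trunk seed points by repeatedly absorbing any point
    within `radius` of the current frontier (a stand-in for octree growing)."""
    tree = cKDTree(points)
    grown = seed_mask.copy()
    frontier = np.where(seed_mask)[0]
    while frontier.size:
        nbr_lists = tree.query_ball_point(points[frontier], r=radius)
        new = np.unique(np.concatenate([np.asarray(n, int) for n in nbr_lists]))
        new = new[~grown[new]]
        grown[new] = True
        frontier = new
    return grown                                    # skeleton-part mask

def cluster_leaves(skeleton_pts, skeleton_tree_ids, leaf_pts):
    """Assign each leaf point to the tree whose skeleton point is nearest."""
    _, idx = cKDTree(skeleton_pts).query(leaf_pts, k=1)
    return skeleton_tree_ids[idx]                   # per-leaf-point tree label
```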
In Fig. 8, (a), (b) and (c) show the single rubber tree trunk location results for the three test plots (i.e. Fig. 8(a) corresponds to test plot 1 selected from rubber forest stand 1, Fig. 8(b) to test plot 2 selected from rubber forest stand 2, and Fig. 8(c) to test plot 3 selected from rubber forest stand 3). Each rubber tree trunk is represented by light vertical lines, which serve as seed points for further segmenting the unclassified point cloud. The single rubber tree skeleton point segmentation results of the three test plots (test plot 1, test plot 2 and test plot 3) are shown in Fig. 9 (a), (b) and (c), respectively. In Fig. 10, (a), (b) and (c) show the leaf clustering results based on the skeleton point segmentation for the three plots (plot 1, plot 2 and plot 3), respectively.
For the three rubber forest test plots, this experiment evaluates the segmentation results at the individual rubber tree level. If a rubber tree is marked and classified as class A, it is a true positive (TP); if a rubber tree is marked as class A but not segmented (assigned to another class), it is a false negative (FN); if a rubber tree does not exist but is segmented, it is a false positive (FP). Higher TP, lower FN and lower FP are expected for higher accuracy. Further, the trunk detection rate r (recall), the correctness P (precision) of trunk detection, and the overall accuracy F (F score) of trunk detection are calculated with the following equations:
$$r = \frac{TP}{TP + FN}$$

$$P = \frac{TP}{TP + FP}$$

$$F = \frac{2 \times r \times P}{r + P}$$
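As a small worked example of these formulas (the counts below are illustrative, not taken from the paper):

```python
def detection_scores(tp, fn, fp):
    """Recall r, precision P and F score as defined above."""
    r = tp / (tp + fn)
    p = tp / (tp + fp)
    return r, p, 2 * r * p / (r + p)

# e.g. 100 correctly detected trunks, 0 missed, 2 spurious detections
print(detection_scores(tp=100, fn=0, fp=2))   # -> (1.0, ~0.98, ~0.99)
```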
As shown in Table 2, the segmentation results for all three plots are accurate, with r, P and F values greater than 0.98. For plot 1 (PR107), the values of r, P and F are 1, 0.98 and 0.99, respectively. For plot 2 (CATAS 7-20-59), the values are 0.99, 0.99 and 0.99. For plot 3 (CATAS 8-7-9), the values are 0.98, 0.99 and 0.98. The segmentation accuracies are almost identical, although the planted clones differ.
Table 2: evaluation of accuracy of rubber tree segmentation results of test sample sites in three rubber forest lands
Test plot 1 (clone PR107): r = 1.00, P = 0.98, F = 0.99
Test plot 2 (clone CATAS 7-20-59): r = 0.99, P = 0.99, F = 0.99
Test plot 3 (clone CATAS 8-7-9): r = 0.98, P = 0.99, F = 0.98
(Reconstructed from the values cited in the text; the full table appears only as an image in the source.)
To further locate the error conditions in the detection results, this embodiment divides the test set into six cases according to the number and state of the rubber tree trunks contained in the images. The numbers of images in these six cases are 228, 22, 48, 18, 14 and 29, respectively. This example analyses the accuracy of rubber tree trunk detection in these six cases, calculates the average precision P, average recall r and average F score for each case, and compares the results using the average F score. As shown in Table 3, the F values for cases a, c and d are as high as 100%, while for the other cases b, e and f the accuracy F reaches 90%, although there are some false positive and false negative errors.
Many factors can lead to errors in the detection results for the three cases (b, e and f) above. First, as can be seen from Table 3, relatively few training pictures correspond to these three cases, which may lead to underfitting. For case e, the projection angle can also cause multiple trunks to overlap heavily in the projected picture, making them difficult to distinguish. In case f, owing to typhoon damage, a trunk may lean severely, so that during voxelization the upper and lower parts of the trunk fall into different voxel blocks; during labelling, this embodiment marks only the lower half of the trunk in the corresponding picture, leaving the upper half unmarked. However, because of the similarity of shape characteristics, the model may erroneously detect the upper trunk part during detection.
Table 3: evaluation of accuracy of rubber tree segmentation under six different conditions
[Table 3 appears only as an image in the source; the text cites per-case image counts of 228, 22, 48, 18, 14 and 29, F scores of 100% for cases a, c and d, and about 90% for the remaining cases.]
Single-tree segmentation is an important prerequisite for retrieving forest attributes from various types of forest remote sensing data. Previous single-tree segmentation studies rely on individual tree detection from single-wavelength airborne laser scanning and focus on using the geometric spatial information of the point cloud. However, these methods have difficulty extracting clustered crowns with similar height and density distributions, because clustered trees do not satisfy the assumed geometric constraints. For example, a cluster of crowns with similar height and density profiles may be falsely detected as a single treetop; conversely, a local maximum that is not a treetop may be erroneously detected as one.
This embodiment proposes a deep learning based approach to improve single-tree segmentation. Since typhoons occur frequently in the study area, the rubber trees lean severely and the morphological structure of the crowns is indistinct. Single rubber tree segmentation based on canopy features is therefore difficult, and the accuracy of the segmentation results is easily disturbed. The Faster R-CNN method learns the characteristics of the target from a large number of data samples, so the single-tree segmentation method based on the Faster R-CNN model is more robust. As shown in Table 3, the overall accuracy of our method reaches 90% despite some false positive and false negative errors in the test results. Although the method of this embodiment cannot reach the 100% accuracy of purely manual detection, it can effectively reduce the labour cost of manual detection. In general, this work demonstrates the possibility of using deep learning to solve the single rubber tree segmentation problem for ground-based LiDAR data.
The scope of the present invention includes, but is not limited to, the above embodiments, and any alterations, modifications, and improvements made by those skilled in the art are intended to fall within the scope of the invention.

Claims (4)

1. A laser point cloud-oriented single-tree segmentation method based on Faster R-CNN, characterized by comprising the following steps:
step 1: acquiring forest stand point cloud data with ground-based mobile LiDAR;
step 2: computing the point cloud features of the scanned forest stand to separate the branches and leaves in the stand point cloud data;
step 3: performing adaptive voxelization on the woody point cloud data of the forest stand and projecting it from multiple angles to generate corresponding depth images;
step 4: detecting trunks in the generated depth images with a deep learning method;
step 5: back-projecting the detected trunk positions in the depth images to obtain the spatial three-dimensional point clouds of the corresponding trunks;
step 6: using the obtained trunk point clouds as seed points, in combination with a region growing algorithm, to achieve individual tree separation;
the step 3 comprises the following steps:
in the adaptive voxelization process, defining the length of a voxel block according to the planting row spacing of the forest, the width according to the plant spacing within a row, and the height according to the under-crown trunk height of the forest tree clone;
after the voxelization operation, assigning the forest stand point cloud data to the corresponding voxels and using a value V_i to mark each valid voxel;
for each voxel V_i, generating two corresponding depth images by projection from the positive Y direction and the positive X direction;
the step 4 comprises the following steps:
(4.1) modeling:
selecting a training plot, processing the forest stand in the training plot with the methods of steps 1 to 3 in sequence to obtain a number of depth images, marking the trunk positions of the trees in each obtained depth image, and taking the marked depth images as the training set;
selecting a test plot, processing the forest stand in the test plot with the methods of steps 1 to 3 in sequence to obtain a number of depth images, and taking the obtained depth images as the test set;
training the ability of the Faster R-CNN model to identify the trunk position of the tree in the depth image by using the training set to obtain a trained Faster R-CNN network model;
testing the test set by using a trained Faster R-CNN network model;
(4.2) detecting tree trunks of depth images generated by forest segments needing single-tree segmentation:
inputting the depth images, obtained with the methods of steps 1 to 3 in sequence, into the trained Faster R-CNN network model, outputting a predicted bounding box and a prediction confidence for each tree trunk position in each depth image, and retaining the predicted bounding boxes with a prediction confidence above 80%;
the step 5 comprises the following steps:
calculating the x, y and z values of the spatial trunk position by combining the position and size information of the voxel block with the position of the predicted bounding box in the depth image generated from that voxel block, and using these x, y and z values to back-project the trunk in the two-dimensional predicted bounding box into the corresponding 3D space, thereby obtaining the three-dimensional point cloud of the corresponding trunk part.
2. The laser point cloud-oriented single-tree segmentation method based on Faster R-CNN according to claim 1, wherein step 1 comprises:
measuring with a Velodyne HDL-32E scanner moved back and forth through the forest stand along a preset measuring route, and registering the measured point cloud data with a SLAM algorithm.
3. The laser point cloud-oriented single-tree segmentation method based on Faster R-CNN according to claim 2, wherein step 2 comprises:
acquiring for each laser point $p_i$ in the forest stand point cloud data the feature vector

$$F_i = \{e_{ix}, e_{iy}, e_{iz}, c_{0,i}, c_{1,i}, c_{2,i}, l_{0,i}, l_{1,i}, l_{2,i}\}$$

wherein $\{e_{ix}, e_{iy}, e_{iz}\}$ is the normal vector of point $p_i$, $\{c_{0,i}, c_{1,i}, c_{2,i}\}$ are the structure tensor features of $p_i$, and $\{l_{0,i}, l_{1,i}, l_{2,i}\}$ are the shape eigenvalues of $p_i$; and

separating the branches and leaves of the forest stand with a Gaussian classifier according to the feature vector $F_i$ obtained for each laser point $p_i$.
4. The laser point cloud-oriented single-tree segmentation method based on Faster R-CNN according to claim 1, wherein step 6 comprises:
based on the trunk point cloud of the tree obtained in step 5, using an octree-based region growing algorithm to segment the point cloud of the tree skeleton, wherein the tree skeleton is the remaining structure of the tree with the leaves removed;
for the unclassified leaf point clouds, using the segmented tree skeleton point clouds as centres and clustering the leaf points into the point cloud of the corresponding tree skeleton through a clustering algorithm.
CN201910551190.9A 2019-06-24 2019-06-24 Single wood segmentation method for laser point cloud based on Faster R-CNN Active CN110378909B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910551190.9A CN110378909B (en) 2019-06-24 2019-06-24 Single wood segmentation method for laser point cloud based on Faster R-CNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910551190.9A CN110378909B (en) 2019-06-24 2019-06-24 Single wood segmentation method for laser point cloud based on Faster R-CNN

Publications (2)

Publication Number Publication Date
CN110378909A CN110378909A (en) 2019-10-25
CN110378909B (en) 2023-06-06

Family

ID=68250686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910551190.9A Active CN110378909B (en) 2019-06-24 2019-06-24 Single wood segmentation method for laser point cloud based on Faster R-CNN

Country Status (1)

Country Link
CN (1) CN110378909B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110929722A (en) * 2019-11-04 2020-03-27 浙江农林大学 Tree detection method based on whole tree image
CN110743818A (en) * 2019-11-29 2020-02-04 苏州嘉诺环境工程有限公司 Garbage sorting system and garbage sorting method based on vision and deep learning
CN111275724B (en) * 2020-02-26 2022-06-07 武汉大学 Airborne point cloud roof plane segmentation method based on octree and boundary optimization
CN111696122A (en) * 2020-06-12 2020-09-22 北京数字绿土科技有限公司 Crop phenotype parameter extraction method and device
CN111738151B (en) * 2020-06-22 2023-10-10 佛山科学技术学院 Grape fruit stem accurate identification method based on deep learning post-optimization
CN111814666B (en) * 2020-07-07 2021-09-24 华中农业大学 Single tree parameter extraction method, system, medium and equipment under complex forest stand
CN111898688B (en) * 2020-08-04 2023-12-05 沈阳建筑大学 Airborne LiDAR data tree classification method based on three-dimensional deep learning
CN112561985B (en) * 2020-10-27 2021-07-20 广西大学 Hedgerow nursery stock trimming and centering method based on binocular vision
CN112101488B (en) * 2020-11-18 2021-06-25 北京沃东天骏信息技术有限公司 Training method and device for machine learning model and storage medium
CN112991300B (en) * 2021-03-04 2023-09-26 中国林业科学研究院资源信息研究所 Single wood skeleton extraction and visualization method based on neighborhood characteristics
CN113205543A (en) * 2021-05-27 2021-08-03 南京林业大学 Laser radar point cloud trunk extraction method based on machine learning
CN113642475B (en) * 2021-08-17 2023-04-25 中国气象局上海台风研究所(上海市气象科学研究所) Atlantic hurricane strength estimation method based on convolutional neural network model
CN113935428A (en) * 2021-10-25 2022-01-14 山东大学 Three-dimensional point cloud clustering identification method and system based on image identification
CN114494586B (en) * 2022-01-10 2024-03-19 南京林业大学 Lattice projection deep learning network broadleaf branch and leaf separation and skeleton reconstruction method
CN114862872B (en) * 2022-05-10 2024-05-07 桂林理工大学 Mangrove single wood segmentation method based on Faster R-CNN
CN116188489A (en) * 2023-02-01 2023-05-30 中国科学院植物研究所 Wheat head point cloud segmentation method and system based on deep learning and geometric correction
CN116486261B (en) * 2023-04-14 2024-09-17 中山大学 Method and system for separating tree point cloud wood components from blade components
CN116893428B (en) * 2023-09-11 2023-12-08 山东省地质测绘院 Forest resource investigation and monitoring method and system based on laser point cloud
CN117710601B (en) * 2023-12-27 2024-05-24 南京林业大学 Single wood skeleton extraction method and system based on laser point cloud and image information

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107705309A (en) * 2017-10-15 2018-02-16 南京林业大学 Forest parameter evaluation method in laser point cloud

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107705309A (en) * 2017-10-15 2018-02-16 南京林业大学 Forest parameter evaluation method in laser point cloud

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Apple tree trunk and branch segmentation for automatic trellis training using convolutional neural network based semantic segmentation; Yaqoob Majeed et al.; IFAC-PapersOnLine; 15 July 2018; Vol. 51, No. 17; Section 2 *

Also Published As

Publication number Publication date
CN110378909A (en) 2019-10-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant