CN113591729A - Urban forest single-tree crown detection method combining RGB-DSM image and deep learning

Info

Publication number: CN113591729A (other version: CN113591729B, granted)
Application number: CN202110885400.5A
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: detection, image, deep learning, RGB, tree crown
Inventors: 夏凯, 王昊, 冯海林, 杨垠晖, 徐流畅
Assignee: Zhejiang A&F University (ZAFU)
Filing date: 2021-08-03
Publication dates: CN113591729A on 2021-11-02; CN113591729B on 2023-07-21
Legal status: Active (granted)

Classifications

    • G06F 18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N 3/045: Neural networks; combinations of networks
    • G06N 3/08: Neural networks; learning methods
    • Y02T 10/40: Engine management systems (climate-change mitigation tag)


Abstract

The invention discloses an urban forest single-tree crown detection method combining RGB-DSM images and deep learning, belonging to the field of target detection. The method detects urban forest single tree crowns by combining the two kinds of images with deep learning, and shows a clear precision advantage over a deep learning network trained on the orthophoto map or the digital surface model alone.

Description

Urban forest single-tree crown detection method combining RGB-DSM image and deep learning
Technical Field
The invention belongs to the field of target detection, and particularly relates to an urban forest single-tree crown detection method combining an RGB-DSM image and deep learning.
Background
Urban forests play important roles in regulating urban climate, absorbing toxic and harmful gases, improving the living environment, and maintaining biodiversity. As urbanization intensifies, the construction of urban forests receives increasing attention. Single-tree crown detection (ITCD), an important method for sustainable urban forest management, is used not only to obtain basic information about trees but also has important applications in diseased-tree detection, urban green-volume monitoring, and the like.
The ITCD task can be divided into two steps: position detection and edge delineation. Common algorithms for position detection include the local maximum method, template matching, and scale analysis; common algorithms for edge delineation include region growing, valley tracking, and watershed segmentation. These are unsupervised algorithms better suited to homogeneous landscapes. Urban forests, however, are located in cities and surrounded by many types of ground features, and in such a complex environment it is difficult for the above algorithms to achieve good detection results.
With the growth of computing power and data volume, the strong feature-learning capability of deep learning gives it clear advantages in adapting to complex backgrounds, generalizing models, and lowering the entry barrier, and these advantages have led to its wide application in fields such as face recognition and autonomous driving. In forestry, combining deep learning with high-definition UAV imagery has brought the ITCD task into many application scenarios.
Research has shown that elevation information helps to improve the performance of computer vision tasks, and some researchers have used elevation data in ITCD studies. However, those studies used elevation data only in non-deep-learning ITCD methods. Color data is a three-channel image and elevation data is a single-channel image (four channels in total), while commonly used deep learning networks accept three-channel input; before the two types of data can be combined in a deep-learning-based ITCD task, this mismatch in channel count between the data and the network must be resolved.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides a method for detecting urban forest single-tree crowns by combining RGB-DSM images and deep learning. The method combines the unmanned aerial vehicle RGB image and the DSM to detect the urban forest single tree crown so as to improve the accuracy of the detection result.
The invention adopts the following specific technical scheme:
a city forest single tree crown detection method combining RGB-DSM images and deep learning comprises the following steps:
s1: acquiring an unmanned aerial vehicle image of a target detection area, and generating a dense point cloud by using the acquired unmanned aerial vehicle image so as to obtain an RGB three-channel orthophoto map representing the reflectivity of a ground object and a single-channel digital surface model representing the height of the ground object;
s2: acquiring a first target detection model and a second target detection model which are respectively trained based on a deep learning network; the first target detection model takes an orthophoto map as input and outputs a detection frame of the single tree crown; the second target detection model takes a three-channel high-level diagram formed by repeated superposition of single-channel digital surface models as input and outputs a detection frame of the single-tree crown;
s3: taking the RGB three-channel orthophoto map obtained in the S1 as the input of the first target detection model to obtain a group of first detection box sets representing all single tree crowns in the input image; taking the single-channel digital surface model obtained in the step S1 as the input of the second target detection model to obtain another group of second detection box sets representing all single-tree crowns in the input image;
s4: and traversing the two groups of detection boxes for confidence level resetting and redundancy elimination:
if two detection frames belong to two groups of detection frames respectively and the intersection ratio is greater than or equal to the intersection ratio threshold value, the two detection frames are considered to represent the same crown, the confidence degrees of the two detection frames are weighted according to a preset weight and then assigned to one detection frame with a higher original confidence degree, and then the other detection frame is removed;
if the union ratio between one detection frame in one group of detection frame sets and all detection frames in the other group of detection frame sets is smaller than the union ratio threshold, the single tree crown does not exist in the other group of detection frame sets, the detection frame is reserved, but the confidence coefficient is reduced according to a preset proportion;
s5: and after the confidence coefficient resetting and redundancy elimination of the S4 are completed, obtaining a group of new detection boxes as a final single-tree crown detection result.
Preferably, when the unmanned aerial vehicle image is acquired, the forward (heading) overlap and side overlap are kept at no less than 90%.
Preferably, when the unmanned aerial vehicle is used for acquiring the unmanned aerial vehicle image, the flight area is larger than the target detection area, and then the unmanned aerial vehicle image of the target detection area is obtained by removing the edge part of the image.
Preferably, the deep learning network comprises a feature extraction network and a region proposal network; the network accepts three-channel images as input, and a set of feature maps is obtained after feature extraction of the input image; the region proposal network then generates a large number of regions of interest on the feature maps, and the detection boxes are obtained after the regions of interest are regressed, classified and screened.
Preferably, the deep learning network is Faster R-CNN.
Preferably, the IoU threshold is 0.5.
Preferably, the preset weights are equal, that is, the confidences of the two detection boxes are each weighted by 0.5.
Preferably, the first target detection model and the second target detection model are both trained in advance by using labeled sample data which is subjected to data enhancement.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides an urban forest single-tree crown detection method combining RGB-DSM images and deep learning. Compared with a deep learning network trained by only using RGB images or DSM images, the method has obvious precision advantage.
Drawings
FIG. 1 is a flow chart of steps of a method for detecting urban forest single-tree crowns according to the present invention;
fig. 2 is a schematic structural diagram of a dual-branch detection network.
Detailed Description
The invention will be further elucidated and described with reference to the drawings and the detailed description. The technical features of the embodiments of the present invention can be combined correspondingly without mutual conflict.
As shown in FIG. 1, the present invention provides an urban forest single-tree crown detection method combining RGB-DSM images and deep learning, comprising the following steps:
s1: acquiring an unmanned aerial vehicle image of a target detection area, and generating dense point cloud by using the acquired unmanned aerial vehicle image to further obtain an Orthophoto Map (DOM) representing the reflectivity of a ground object and a Digital Surface Model (DSM) representing the height of the ground object. Wherein the orthophoto map is an RGB three-channel color image, and the digital surface model is single-channel elevation data.
When the unmanned aerial vehicle is used for obtaining the unmanned aerial vehicle image, the flight area is larger than the target detection area, and then the unmanned aerial vehicle image of the target detection area is obtained by removing the edge part of the image.
For the DOM and DSM, during dense point cloud generation it is difficult to build an accurate point cloud at the edges of the flight area because of viewing-angle limitations, so the edge regions of the images are discarded and only the non-edge regions are used for training the deep learning network. In view of this, when the UAV aerial photography task is executed, the flight area should be set slightly larger than the target detection area, and the UAV image of the target detection area is then obtained by removing the edge portions of the images. In addition, to produce the DOM, a certain forward and side overlap must be maintained during image acquisition; both were set to 90%.
S2: acquiring a first target detection model and a second target detection model, both trained on the same deep learning network framework but with different inputs and outputs. The first target detection model takes the orthophoto map as input and outputs detection boxes for the single tree crowns in it; the second target detection model takes as input a three-channel elevation map formed by stacking the single-channel digital surface model three times, and outputs detection boxes for the single tree crowns in the digital surface model.
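The patent does not provide code for this channel stacking; the following is a minimal NumPy sketch (the function name and array layout are illustrative assumptions, not from the patent):

```python
import numpy as np

def dsm_to_elevation_map(dsm: np.ndarray) -> np.ndarray:
    """Stack a single-channel DSM of shape (H, W) into a pseudo
    three-channel elevation map of shape (H, W, 3) by repeating the
    elevation band, so it fits a detector that expects RGB-shaped input."""
    if dsm.ndim != 2:
        raise ValueError("expected a single-channel (H, W) DSM")
    return np.repeat(dsm[:, :, np.newaxis], 3, axis=2)
```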
The deep learning network can adopt any general framework for target detection and may comprise a feature extraction network and a region proposal network. The network accepts three-channel images as input, and feature extraction of the input image yields a set of feature maps. The region proposal network then generates a large number of regions of interest (ROIs) on the feature maps, and after regression, classification and screening of the ROIs, detection boxes are obtained; finally the detection boxes are evaluated to produce the evaluation indices. Each detection box has a corresponding category and confidence. The confidence is the model's self-assessment of the box's accuracy: it represents the probability that the object in the box belongs to the assigned category, with a value in [0, 1].
In the subsequent embodiments of the invention, Faster R-CNN is used as the deep learning network, and the first and second target detection models are obtained through training. The concrete network structure of Faster R-CNN belongs to the prior art and is not described in detail. The network can be trained with labeled sample data, which can be produced by the method in S1 and labeled manually with annotation software; data enhancement can be applied if the sample size is insufficient.
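For illustration only, the sketch below builds two such single-class crown detectors with torchvision's Faster R-CNN implementation; this is a stand-in assumption, since the embodiment below actually uses a TensorFlow 1.9 Faster R-CNN with a ResNet101 backbone rather than PyTorch:

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_crown_detector(num_classes: int = 2):
    """One Faster R-CNN branch; num_classes = background + tree crown."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

# Two independent branches: one trained on RGB orthophotos, one on
# three-channel elevation maps stacked from the DSM.
rgb_model = build_crown_detector()
dsm_model = build_crown_detector()
```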
S3: taking the RGB three-channel orthophoto map obtained in S1 as the input of the first target detection model to obtain a first set of detection boxes representing all single tree crowns in the input image, and taking the single-channel digital surface model obtained in S1 as the input of the second target detection model to obtain a second set of detection boxes representing all single tree crowns in the input image.
S4: because the two sets of detection boxes are obtained from different input data and detection models, they must be traversed for confidence resetting and redundancy elimination, as follows:
if two detection boxes belong to the two different sets and their IoU is greater than or equal to the IoU threshold, the two boxes are considered to represent the same crown; their confidences are weighted by a preset weight and assigned to the box with the higher original confidence, and the other box is removed;
if the IoU between a detection box A in one set and every detection box in the other set is below the IoU threshold, the single tree crown corresponding to A is considered absent from the other set; box A is retained and its confidence is reduced by a preset proportion.
In the confidence resetting and redundancy elimination process, the IoU threshold is the criterion for judging whether two detection boxes represent the same tree crown, and its specific value is tuned according to the actual situation; in the subsequent embodiments it is set to 0.5. Likewise, the preset weights must be tuned in practice; in the subsequent embodiments they are set equal, that is, the confidences of the two detection boxes are each weighted by 0.5.
S5: after all detection boxes in the two sets have been traversed for confidence resetting and redundancy elimination, a new set of detection boxes is obtained by merging, which serves as the final single-tree crown detection result for the target detection area.
The methods of S1-S5 are applied to an embodiment to show the specific implementation and technical effects of the present invention.
Examples
In this embodiment, the implementation process of the urban forest single-tree crown detection method combining the RGB-DSM image and deep learning is as follows:
1. Unmanned aerial vehicle image data acquisition
The area around the campus of Zhejiang A&F University in Hangzhou, Zhejiang Province, was taken as the target detection area, with the representative camphor trees in the area as the detection objects.
In this embodiment, a DJI (SZ DJI Technology Co., Ltd.) Phantom 4 RTK unmanned aerial vehicle with its built-in camera was used as the image acquisition system; detailed parameters are listed in Table 1. Images were acquired by automatic cruise with timed shooting under clear or cloudy weather and low wind speed from March 3 to March 15, 2021. In total 31 flight missions were executed, with flight parameters not entirely identical: considering the building heights in the study area, missions inside the target detection area flew at 50 m and missions outside it at 70 m. To produce the DOM, both forward and side overlap were set to 90%. These missions yielded 31 batches totaling 11377 UAV aerial images.
Table 1. Unmanned aerial vehicle image acquisition system parameters (the table is provided as an image in the original publication).
2. Dual branch detection network construction
To make full use of the information contained in the RGB image and the DSM for the ITCD task, this embodiment proposes a dual-branch detection network model. The model is built on Faster R-CNN, with the structure shown in fig. 2. The RGB three-channel orthophoto map representing ground-object reflectance and the single-channel digital surface model representing ground-object height are fed into their respective branch deep learning networks. Each branch extracts features to obtain feature maps, a region proposal network generates regions of interest on the feature maps, and regression and classification of these regions yield a set of candidate detection boxes (since this embodiment detects only one class, crowns, no category information is labeled, and the numbers on the detection boxes denote confidence). The deep learning networks of the two branches are recorded as the first and second target detection models: the first takes the orthophoto map as input and outputs detection boxes for single tree crowns; the second takes as input a three-channel elevation map formed by stacking the single-channel digital surface model three times and outputs detection boxes for single tree crowns. Correspondingly, the detection box sets produced by the two models are recorded as the first and second detection box sets. Confidence weighting and redundancy elimination are then applied to the two groups of candidate boxes to obtain one final set of detection boxes, which is evaluated to produce the evaluation indices (Precision, Recall and Average Precision).
The two groups of candidate detection boxes are processed as follows: if two boxes belong to the two different groups and their IoU is greater than or equal to the IoU threshold, they are considered to represent the same crown; their confidences are weighted by a preset weight and assigned to the box with the higher original confidence, and the other box is removed. If the IoU between a box in one group and every box in the other group is below the IoU threshold, the corresponding single tree crown is considered absent from the other group; the box is retained and its confidence is reduced by the preset proportion.
The sole criterion for judging whether two detection boxes A and B represent the same crown is the Intersection over Union (IoU) between them: if IoU is greater than or equal to 0.5, the two boxes are considered to represent the same camphor tree crown, and vice versa. The IoU is calculated as in formula 1.
IoU(A, B) = Area(A ∩ B) / Area(A ∪ B)  (1)
The procedure is implemented as follows. Pool the two groups of candidate detection boxes; let the box with the highest confidence be α and let the remaining boxes form the set Boxes, as in formula 2.
Boxes = {box1, box2, ...}  (2)
Compute the IoU between detection box α and every box in the set Boxes, and take the maximum, as in formula 3.
Max_IoU(α, Boxes) = max{IoU(α, box1), IoU(α, box2), ...}  (3)
Suppose the IoU between α and a box β is this maximum. Formula 4 determines the weight of β: if the IoU is greater than or equal to 0.5, α and β are considered to represent the same camphor tree crown and β is given weight 0.5; if it is less than 0.5, α and β are not considered to represent the same crown and β is given weight 0.
Weight(Max_IoU(α, Boxes)) = 0.5, if Max_IoU(α, Boxes) ≥ 0.5; 0, if Max_IoU(α, Boxes) < 0.5  (4)
Finally, as in formula 5, the confidences of α and β are weighted to obtain a new confidence, which is assigned to detection box α; it represents the probability that the object in α is a camphor tree crown in the detection result of this scheme (hereinafter the dual-branch network scheme), and detection box β is removed. Here conf_α and conf_β denote the confidences of α and β, respectively. The process is repeated until all detection boxes have been handled.
Conf(α, Boxes) = 0.5 · conf_α + Weight(Max_IoU(α, Boxes)) · conf_β  (5)
When the detection result is evaluated, the evaluation indices depend on the confidence of each detection box. The confidence weighting step above judges each box using elevation and color information jointly, raising the confidence of positive samples and lowering that of negative samples, thereby improving the detection result.
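Formulas 1-5 map directly to code. The following is a minimal Python sketch of the confidence resetting and redundancy elimination step (the box format and function names are illustrative assumptions; the patent itself gives only the formulas):

```python
def iou(a, b):
    """Formula 1: intersection over union of boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def fuse(rgb_boxes, dsm_boxes, iou_thresh=0.5, w=0.5):
    """Confidence resetting and redundancy elimination (formulas 2-5).
    Boxes are (x1, y1, x2, y2, conf) tuples; the two branch outputs are pooled."""
    pool = sorted(rgb_boxes + dsm_boxes, key=lambda b: b[4], reverse=True)
    fused = []
    while pool:
        alpha = pool.pop(0)                       # highest-confidence box
        conf = w * alpha[4]                       # 0.5 * conf_alpha (formula 5)
        if pool:
            ious = [iou(alpha, b) for b in pool]  # formula 3
            j = max(range(len(ious)), key=ious.__getitem__)
            if ious[j] >= iou_thresh:             # same crown (formula 4)
                beta = pool.pop(j)
                conf += w * beta[4]               # weighted merge; beta removed
        fused.append(alpha[:4] + (conf,))
    return fused
```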
3. Data processing and dataset creation
Data preprocessing: the 31 batches of images from the UAV missions were loaded separately into Agisoft Metashape Professional to generate dense point clouds, from which 31 DOMs (RGB images, hereinafter "RGB image") and 31 DSMs were produced and exported. Because the full images are too large, each DSM was resampled to the same resolution as the corresponding RGB image, and all local areas containing camphor tree crowns were cropped out, yielding 90 pairs of mutually corresponding RGB and DSM local images.
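The patent does not say which tool performed the resampling; as one possibility, the sketch below uses the rasterio library to read a DSM resampled to the pixel dimensions of the matching RGB orthophoto (the paths, the helper name, and the bilinear choice are assumptions):

```python
import rasterio
from rasterio.enums import Resampling

def read_dsm_matching(rgb_path: str, dsm_path: str):
    """Read the DSM resampled to the RGB orthophoto's pixel grid so the
    two local images correspond pixel-for-pixel (assumes equal extents)."""
    with rasterio.open(rgb_path) as rgb:
        out_shape = (rgb.height, rgb.width)
    with rasterio.open(dsm_path) as dsm:
        return dsm.read(1, out_shape=out_shape, resampling=Resampling.bilinear)
```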
Dataset preparation: the RGB and DSM local images were each divided into training and test images, with an identical split for the two modalities: 46 test images containing 602 camphor tree crowns and 44 training images containing 617 camphor tree crowns. To increase the data volume and obtain better detection results, the training images were augmented, expanding them to 308 images containing 4319 camphor tree crowns by clockwise rotations of 90°, 180° and 270°, left-right and top-bottom mirroring, and a 0.8× brightness transform, as sketched below.
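The six transforms listed above turn each training image into seven (44 × 7 = 308). A minimal Pillow sketch of the image side of this augmentation (bounding-box labels must be transformed correspondingly, which is omitted here):

```python
from PIL import Image, ImageEnhance

def augment(img: Image.Image):
    """Yield the original plus the six augmented variants used above."""
    yield img
    yield img.transpose(Image.Transpose.ROTATE_270)       # 90 deg clockwise
    yield img.transpose(Image.Transpose.ROTATE_180)       # 180 deg
    yield img.transpose(Image.Transpose.ROTATE_90)        # 270 deg clockwise
    yield img.transpose(Image.Transpose.FLIP_LEFT_RIGHT)  # left-right mirror
    yield img.transpose(Image.Transpose.FLIP_TOP_BOTTOM)  # top-bottom mirror
    yield ImageEnhance.Brightness(img).enhance(0.8)       # 0.8x brightness
```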
The species, position and size of each single tree crown were determined through field investigation, with areas where crowns overlap surveyed as key regions to ensure that every tree was recorded correctly. On this basis, labeling was completed with an open-source annotation tool, yielding a labeled RGB dataset and a labeled DSM dataset.
4. Network training
The computer used in this embodiment runs Windows 10 (64-bit, Education edition) with an Intel Xeon E3-1225 v5 @ 3.30 GHz CPU and an Nvidia GeForce GTX 1080 Ti GPU with 11 GB of video memory. The programming language is Python 3.5.2 with the TensorFlow framework (version 1.9.0). ResNet101 is used as the backbone network, and to accelerate model convergence a corresponding pre-trained model was obtained from the official TensorFlow open-source project for transfer learning; detailed network parameters are listed in Table 2. The networks in the two branches of the dual-branch detection network were trained with the RGB dataset and the DSM dataset respectively until convergence.
Table 2. Detailed network parameters (the table is provided as an image in the original publication).
5. Evaluation index
After training of the dual-branch detection network is complete, the test images are fed in and the results verified. The RGB local images are input to the first target detection model, giving a first set of detection boxes representing all single tree crowns in the input images; the DSM local images are input to the second target detection model, giving a second set. Following the procedure above, the two sets are traversed for confidence resetting and redundancy elimination to obtain the new set of detection boxes used as the final single-tree crown detection result. Recall, Precision and Average Precision (AP) are used as evaluation indices, and model quality is judged by their values on the test data. Recall indicates how many of the camphor tree crowns actually present were detected by the network; precision indicates how many of the crowns detected by the network are correct. Because precision and recall vary with the confidence threshold, a P-R curve can be drawn, and the average precision equals the area under this curve. The corresponding formulas are as follows:
Precision = TP / (TP + FP)  (6)

Recall = TP / (TP + FN)  (7)

AP = ∫₀¹ P(R) dR  (8)
In these formulas, TP (true positive) denotes a detection box judged to be a camphor tree crown that actually is one; FP (false positive) denotes a detection box judged to be a crown that actually is not; FN (false negative) denotes a camphor tree crown that exists but was not detected. The determination proceeds as follows: take all rectangular label boxes in the label file corresponding to the image and compute the IoU of each detection box with the label boxes one by one. Take the largest IoU and its label box; if this IoU is greater than or equal to 0.5, the detection box and the label box are judged to represent the same camphor tree crown and the detection is correct; if it is less than 0.5, no camphor tree crown is judged to exist at the position of the detection box and the detection is wrong.
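A minimal sketch of this TP/FP/FN determination (it reuses the iou() helper above and additionally assumes each label box may be matched by at most one detection, a common convention the patent does not spell out):

```python
def count_tp_fp_fn(detections, labels, iou_thresh=0.5):
    """detections: (x1, y1, x2, y2, conf) tuples; labels: (x1, y1, x2, y2)."""
    matched = set()
    tp = fp = 0
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, lab in enumerate(labels):
            if j in matched:
                continue
            v = iou(det, lab)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_iou >= iou_thresh:   # detection and label show the same crown
            tp += 1
            matched.add(best_j)
        else:                        # no crown at the detected position
            fp += 1
    fn = len(labels) - len(matched)  # crowns present but not detected
    return tp, fp, fn
```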
This embodiment compares the detection results of the dual-branch network, the first target detection model using RGB local images alone, and the second target detection model using DSM local images alone, as shown in Table 3. Precision, Recall, TP, FP and FN are computed over detection boxes with confidence greater than or equal to 0.5.
Table 3. Comparison of the experimental results of the schemes (the table is provided as an image in the original publication).
The dual-branch network scheme achieves the best result: its average precision reaches 86.79%, an improvement of 7.28% over the RGB-only scheme, and its precision and recall are respectively the best and second best among the schemes. The dual-branch scheme of the invention effectively combines elevation and color information for single-tree crown detection and forms a new detection result through confidence resetting and redundancy elimination, thereby improving ITCD performance.
The above embodiments are merely preferred embodiments of the present invention and should not be construed as limiting it. Various changes and modifications may be made by those of ordinary skill in the art without departing from the spirit and scope of the invention; technical solutions obtained by equivalent substitution or equivalent transformation all fall within the protection scope of the invention.

Claims (8)

1. An urban forest single-tree crown detection method combining RGB-DSM images and deep learning, characterized by comprising the following steps:
s1: acquiring unmanned aerial vehicle images of a target detection area and generating a dense point cloud from them, so as to obtain an RGB three-channel orthophoto map representing ground-object reflectance and a single-channel digital surface model representing ground-object height;
s2: acquiring a first target detection model and a second target detection model, each trained on a deep learning network; the first target detection model takes the orthophoto map as input and outputs detection boxes for single tree crowns; the second target detection model takes as input a three-channel elevation map formed by stacking the single-channel digital surface model three times, and outputs detection boxes for single tree crowns;
s3: taking the RGB three-channel orthophoto map obtained in s1 as the input of the first target detection model to obtain a first set of detection boxes representing all single tree crowns in the input image; taking the single-channel digital surface model obtained in s1 as the input of the second target detection model to obtain a second set of detection boxes representing all single tree crowns in the input image;
s4: traversing the two sets of detection boxes for confidence resetting and redundancy elimination:
if two detection boxes belong to the two different sets and their intersection-over-union (IoU) is greater than or equal to the IoU threshold, the two boxes are considered to represent the same crown; their confidences are weighted by a preset weight and assigned to the box with the higher original confidence, and the other box is removed;
if the IoU between a detection box in one set and every detection box in the other set is below the IoU threshold, the corresponding single tree crown is considered absent from the other set; the box is retained but its confidence is reduced by a preset proportion;
s5: after the confidence resetting and redundancy elimination of s4 are completed, obtaining a new set of detection boxes as the final single-tree crown detection result.
2. The urban forest single-tree crown detection method combining RGB-DSM images and deep learning according to claim 1, wherein when the unmanned aerial vehicle image is acquired, the forward (heading) overlap and side overlap are kept at no less than 90%.
3. The urban forest single-tree crown detection method combining RGB-DSM images and deep learning according to claim 1, wherein when the unmanned aerial vehicle image is acquired, the flight area is larger than the target detection area, and the unmanned aerial vehicle image of the target detection area is then obtained by removing the edge portion of the image.
4. The urban forest single-tree crown detection method combining RGB-DSM images and deep learning according to claim 1, wherein the deep learning network comprises a feature extraction network and a region proposal network; the network accepts three-channel images as input, and a set of feature maps is obtained after feature extraction of the input image; the region proposal network then generates a large number of regions of interest on the feature maps, and the detection boxes are obtained after the regions of interest are regressed, classified and screened.
5. The urban forest single-tree crown detection method combining RGB-DSM images and deep learning according to claim 1, wherein the deep learning network is Faster R-CNN.
6. The urban forest single-tree crown detection method combining RGB-DSM images and deep learning according to claim 1, wherein the IoU threshold is 0.5.
7. The urban forest single-tree crown detection method combining RGB-DSM images and deep learning according to claim 1, wherein the preset weights are equal, that is, the confidences of the two detection boxes are each weighted by 0.5.
8. The urban forest single-tree crown detection method combining RGB-DSM images and deep learning according to claim 1, wherein the first target detection model and the second target detection model are both trained in advance with labeled sample data that has undergone data enhancement.
CN202110885400.5A 2021-08-03 2021-08-03 Urban forest single-tree crown detection method combining RGB-DSM image and deep learning Active CN113591729B (en)

Priority Applications (1)

CN202110885400.5A (priority date 2021-08-03, filing date 2021-08-03, granted as CN113591729B): Urban forest single-tree crown detection method combining RGB-DSM image and deep learning

Publications (2)

CN113591729A, published 2021-11-02
CN113591729B, published 2023-07-21

Family

ID=78254267

Family Applications (1)

CN202110885400.5A (priority date 2021-08-03, filing date 2021-08-03): Active, granted as CN113591729B

Country Status (1)

CN: CN113591729B granted


Patent Citations (3)

* Cited by examiner, † Cited by third party

CN105894501A * (priority 2016-03-29, published 2016-08-24): Single-tree detection and crown describing method for high-resolution remote sensing image
AU2020103026A4 * (priority 2020-10-27, published 2020-12-24): A Single Tree Crown Segmentation Algorithm Based on Super-pixels and Topological Features in Aerial Images
CN112907520A * (priority 2021-02-05, published 2021-06-04): Single tree crown detection method based on end-to-end deep learning method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party

AMPATZIDIS Y: "UAV-based high throughput phenotyping in citrus utilizing multispectral imaging and artificial intelligence", Remote Sensing, vol. 11, no. 4, XP055906094, DOI: 10.3390/rs11040410 *
罗巍, 王东亮, 夏列钢, 陈曙东: "Forestry resource survey method based on deep learning" (基于深度学习的林业资源调查方法), 林业科技通讯 (Forestry Science and Technology Newsletter), no. 08 *
薛月菊, 黄宁, 涂淑琴, 毛亮, 杨阿庆, 朱勋沐, 杨晓帆, 陈鹏飞: "Improved YOLOv2 recognition method for immature mangoes" (未成熟芒果的改进YOLOv2识别方法), 农业工程学报 (Transactions of the Chinese Society of Agricultural Engineering), no. 07 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115311587A (en) * 2022-09-07 2022-11-08 无锡海纳智能科技有限公司 Intelligent generation method for photovoltaic power station inspection waypoints
CN115311587B (en) * 2022-09-07 2024-05-31 无锡海纳智能科技有限公司 Intelligent generation method for photovoltaic power station inspection waypoints
WO2024100777A1 (en) * 2022-11-08 2024-05-16 日本電信電話株式会社 Inference device, training device, inference method, training method, and computer program
CN117372885A (en) * 2023-09-27 2024-01-09 中国人民解放军战略支援部队信息工程大学 Multi-mode remote sensing data change detection method and system based on twin U-Net neural network

Also Published As

CN113591729B, published 2023-07-21

Similar Documents

Publication Publication Date Title
CN110796168B (en) Vehicle detection method based on improved YOLOv3
CN113591729B (en) Urban forest single-tree crown detection method combining RGB-DSM image and deep learning
US11263434B2 (en) Fast side-face interference resistant face detection method
CN111553200A (en) Image detection and identification method and device
CN113065558A (en) Lightweight small target detection method combined with attention mechanism
CN108921057B (en) Convolutional neural network-based prawn form measuring method, medium, terminal equipment and device
CN112084869B (en) Compact quadrilateral representation-based building target detection method
CN111899172A (en) Vehicle target detection method oriented to remote sensing application scene
CN107341523A (en) Express delivery list information identifying method and system based on deep learning
CN112464911A (en) Improved YOLOv 3-tiny-based traffic sign detection and identification method
CN111126399A (en) Image detection method, device and equipment and readable storage medium
CN110298227B (en) Vehicle detection method in unmanned aerial vehicle aerial image based on deep learning
CN113160062B (en) Infrared image target detection method, device, equipment and storage medium
Cepni et al. Vehicle detection using different deep learning algorithms from image sequence
CN110287798B (en) Vector network pedestrian detection method based on feature modularization and context fusion
CN111860359B (en) Point cloud classification method based on improved random forest algorithm
CN115861619A (en) Airborne LiDAR (light detection and ranging) urban point cloud semantic segmentation method and system of recursive residual double-attention kernel point convolution network
CN117409339A (en) Unmanned aerial vehicle crop state visual identification method for air-ground coordination
CN112946679A (en) Unmanned aerial vehicle surveying and mapping jelly effect detection method and system based on artificial intelligence
CN115032648A (en) Three-dimensional target identification and positioning method based on laser radar dense point cloud
CN115272876A (en) Remote sensing image ship target detection method based on deep learning
CN117392382A (en) Single tree fruit tree segmentation method and system based on multi-scale dense instance detection
Larsen Individual tree top position estimation by template voting
CN114821370A (en) Single-tree crown detection and segmentation method based on unmanned aerial vehicle image and U-Net
CN111524098B (en) Neural network output layer cutting and template frame size determining method based on self-organizing clustering

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant