CN113705478A - Improved YOLOv5-based mangrove forest single tree target detection method
- Publication number: CN113705478A
- Application number: CN202111009370.8
- Authority: CN (China)
- Prior art keywords: YOLOv5, target, mangrove forest, model, pooling
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/214 — Pattern recognition; analysing; design or setup of recognition systems; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/24 — Pattern recognition; analysing; classification techniques
- G06N3/045 — Computing arrangements based on biological models; neural networks; architecture; combinations of networks
- G06N3/084 — Computing arrangements based on biological models; neural networks; learning methods; backpropagation, e.g. using gradient descent
Abstract
A mangrove forest single tree target detection method based on improved YOLOv5 combines target detection and image processing techniques from deep learning with a young-mangrove single-tree recognition technique built on an improved YOLOv5 algorithm, and belongs to the field of mangrove single-tree target detection in forestry research. Target trees in selected unmanned aerial vehicle images are labeled one by one with the open-source software LabelImg to construct a mangrove single-tree data set; YOLOv5 is selected as the basic target detection model and is optimized and improved according to the dense distribution and small size of the target trees: the CSPDarknet53 backbone network is improved with an Efficient Channel Attention mechanism, which avoids dimensionality reduction while enabling cross-channel interaction and enhances the feature expression capability; and a SoftPool pooling operation is introduced into the SPP module to retain more detailed feature information and improve the automatic target detection precision.
Description
Technical Field
The invention relates to a mangrove forest single tree target detection method based on improved YOLOv5, combining target detection and image processing techniques from deep learning with a young-mangrove single-tree recognition technique based on the improved YOLOv5 algorithm, and belongs to the field of mangrove single-tree target detection in forestry research.
Background
Mangrove forest is a special forest community of tropical and subtropical coastal zones and plays an important role in improving ecological conditions, maintaining biological diversity, and safeguarding ecological security in coastal areas. Mangrove resource survey and dynamic monitoring are the basis and precondition of scientific protection and management of mangroves, but the mangrove area in China is shrinking day by day due to human reclamation and other causes. To better protect the mangrove ecosystem and enlarge the mangrove area, the artificial afforestation work of many mangrove nature reserves has been carried out as planned in recent years, and supplementary planting and artificial afforestation within the original mangroves have become the main measures for restoring mangrove resources. The geographical distribution of mangroves is special: most mangroves grow on intertidal shoals and are easily submerged by seawater at high tide; a seedling that does not survive is washed away by the water flow because its root system has no holding force, so the remaining single mangrove saplings are the surviving trees. In this special environment, the low-altitude unmanned remote sensing system is very suitable for image acquisition over mangrove regions thanks to its flexible data acquisition, low cost and ability to rapidly acquire small-range ultra-high-resolution images. On this basis, applying unmanned aerial vehicles to the acquisition of mangrove seedling imagery is one effective way to improve the precision of mangrove seedling monitoring.
Therefore, how to detect individual mangrove trees quickly and accurately from unmanned aerial vehicle images has become an urgent problem in the survival-rate inspection of newly planted mangroves. With the development of target detection and object recognition technology, deep learning has begun to be widely applied to target detection, with face recognition and voice recognition among the most common applications. Aiming at problems such as insufficient feature-extraction capability and the detection-speed reduction caused by prediction-box processing, an Anchor-Free lightweight detection algorithm with multi-scale feature fusion has been proposed. A defect detection method based on a scale-invariant feature pyramid has been proposed for the low defect-detection precision of existing target detection algorithms in complex inspection scenes. However, the above methods only improve the algorithms for their corresponding specific scenes. In recent years, deep learning has also gradually been applied to the forestry industry, gradually realizing more accurate, rapid and intelligent forestry monitoring. Zhouyan, in the article on a small-target disaster-stricken tree detection method based on deep learning, proposed such a method for unmanned aerial vehicle forest images in which trees are small in scale, densely growing and irregularly distributed. However, the SSD algorithm (Single Shot MultiBox Detector, a target detection algorithm proposed by Wei Liu at ECCV 2016) discards the low-level features containing rich information and has low robustness for small-target detection.
The current mainstream deep learning target detection algorithms are divided into two-stage and single-stage algorithms. Different from the two-stage detection algorithms represented by the R-CNN series, YOLO directly completes feature extraction, candidate-box regression and classification in the same unbranched convolutional network, so the network structure is simple and the requirements of real-time detection tasks can be met. The YOLO target detection model has gone through several update iterations, successively addressing problems in multiple dimensions such as multi-target detection, small-target detection, missed detection and its remedy, and multi-scale prediction.
Based on the advantages of target detection such as automatic feature learning, high speed and high efficiency, the method applies YOLOv5, the latest of this series of algorithms, to individual mangrove tree detection to realize intelligent monitoring of mangrove seedling conditions.
Disclosure of Invention
The invention aims to provide a mangrove forest single tree target detection method based on improved YOLOv5, built on a deep learning method, which solves the problems that mangrove single-tree targets in unmanned aerial vehicle images are small and densely distributed and that detection has a low degree of automation and low efficiency; it provides technical support for the automatic detection of surviving mangrove seedlings, further improves the precision and efficiency of survival-rate inspection of newly planted mangroves, and realizes intelligent monitoring of mangrove seedling conditions.
Aiming at the problems that mangrove single-tree targets in unmanned aerial vehicle images are small and densely distributed and that detection has a low degree of automation and low efficiency, a method for detecting mangrove single-tree targets based on the improved YOLOv5 is provided on the basis of a deep learning method, so that single mangrove trees in unmanned aerial vehicle images can be automatically identified and located quickly and accurately.
A mangrove forest single tree target detection method based on improved YOLOv5 comprises the following steps:
Target trees in the selected unmanned aerial vehicle images are labeled one by one with the open-source software LabelImg to construct a mangrove single-tree data set.
YOLOv5 is selected as the basic target detection model and is optimized and improved according to the dense distribution and small size of the target trees: the CSPDarknet53 backbone network is improved with the Efficient Channel Attention mechanism, which avoids dimensionality reduction while enabling cross-channel interaction and enhances the feature expression capability; the SoftPool pooling operation is introduced into the SPP module to retain more detailed feature information and improve the automatic target detection precision.
The technical scheme adopted by the invention for solving the technical problems is as follows: a mangrove forest single tree target detection method based on improved YOLOv5 comprises the following steps:
And step 3, further improving on the basis of step 2: the SoftPool-improved pooling operation is introduced into the SPP module of the YOLOv5 network to retain more detailed feature information.
And step 4, inputting the obtained unmanned aerial vehicle mangrove single-tree data set images into the YOLOv5 feature extraction network for feature extraction, obtaining feature maps of different scales.
And step 5, classifying and regressing the feature maps obtained in step 4, performing a feature reconstruction operation on the regression result to obtain a more refined feature map, and on this basis performing classification and regression again and calculating the loss.
And step 6, after model training is completed, testing on the test set of the divided data set to realize target detection of young mangrove single trees and evaluate the detection effect of the model.
The efficient channel attention of the present invention reduces the complexity of the model by avoiding dimensionality reduction while enabling cross-channel information interaction, and at the same time enhances the expressive power of the features. SoftPool preserves the expressiveness of the features and its operation is differentiable, so the feature information of the whole receptive field is retained and the accuracy of the algorithm is improved. The method can quickly, accurately and automatically detect newly planted mangrove single trees and has obvious advantages over existing unmanned aerial vehicle mangrove detection methods: with the improvements the target detection precision rises, the training loss is lower, rapid, accurate and automatic detection of mangrove single-tree targets is achieved, and the recognition and positioning capability for mangrove single trees is improved.
Drawings
A more complete appreciation of the invention and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein the accompanying drawings are included to provide a further understanding of the invention and form a part of this specification, and wherein the illustrated embodiments of the invention and the description thereof are intended to illustrate and not limit the invention, as illustrated in the accompanying drawings, in which:
fig. 1 is a schematic diagram of a YOLOv5 network structure.
Fig. 2 is a block diagram of an attention module for introducing an effective channel.
Fig. 3 is a diagram of the structure of the incorporation of SoftPool pooling in SPP modules.
FIG. 4 is a schematic illustration of the position of SoftPool in YOLOv5-ECA.
Fig. 5a is a collected image of a portion of a mangrove forest single-tree unmanned aerial vehicle.
Fig. 5b is a collected image of a portion of a mangrove forest single-tree unmanned aerial vehicle.
FIG. 6a is a target label graph.
Fig. 6b is a graph of normalized target position.
Fig. 6c is a graph of normalized target size.
FIG. 7 is a graph of the YOLOv5 and YOLOv5-ECA loss function values as a function of the number of training rounds.
FIG. 8a is a comparison graph of convergence results of various parameters of YOLOv5 and YOLOv5-ECA.
FIG. 8b is a comparison graph of convergence results of various parameters of YOLOv5 and YOLOv5-ECA.
FIG. 8c is a comparison graph of convergence results of various parameters of YOLOv5 and YOLOv5-ECA.
FIG. 8d is a comparison graph of convergence results of various parameters of YOLOv5 and YOLOv5-ECA.
FIG. 8e is a comparison graph of convergence results of various parameters of YOLOv5 and YOLOv5-ECA.
FIG. 8f is a comparison graph of convergence results of various parameters of YOLOv5 and YOLOv5-ECA.
FIG. 9a is a graph showing the detection result of the YOLOv5 detection target.
FIG. 9b is a graph showing the results of detection of a target by YOLOv 5.
FIG. 9c is a graph showing the results of detection of the YOLOv5-ECA detection target.
FIG. 9d is a graph showing the results of detection of the YOLOv5-ECA detection target.
FIG. 10 is a flow chart of the steps of the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
It will be apparent that those skilled in the art can make many modifications and variations based on the spirit of the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element, component or section is referred to as being "connected" to another element, component or section, it can be directly connected to the other element or section or intervening elements or sections may also be present. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art.
The following examples are further illustrative in order to facilitate the understanding of the embodiments, and the present invention is not limited to the examples.
Example 1: as shown in fig. 1, fig. 2, fig. 3, fig. 4, fig. 5, fig. 6, fig. 7, fig. 8, fig. 9 and fig. 10, a mangrove forest single tree target detection method based on improved YOLOv5 includes the following steps:
And step 3, further improving on the basis of step 2: the SoftPool-improved pooling operation is introduced into the SPP module of the YOLOv5 network to retain more detailed feature information.
And step 4, inputting the obtained unmanned aerial vehicle mangrove single-tree data set images into the YOLOv5 feature extraction network for feature extraction, obtaining feature maps of different scales.
And step 5, classifying and regressing the feature maps obtained in step 4, performing a feature reconstruction operation on the regression result to obtain a more refined feature map, and on this basis performing classification and regression again and calculating the loss.
And step 6, after model training is completed, testing on the test set of the divided data set to realize target detection of young mangrove single trees and evaluate the detection effect of the model.
In step 1, to ensure the completeness of each detection target in the image, the conventional LabelImg open-source software is used to label the individual mangrove seedlings in the unmanned aerial vehicle images one by one; the labeled content is the coordinates of the rectangular bounding box of each newly planted mangrove, stored as an XML text file for training and testing the YOLOv5 model.
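As an illustrative sketch of reading those annotations back for training (the tag layout follows the standard Pascal VOC format that LabelImg writes; the class name `mangrove` below is a placeholder, not necessarily the label used by the authors):

```python
import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_text):
    """Parse a LabelImg (Pascal VOC style) annotation string and return
    a list of (class_name, xmin, ymin, xmax, ymax) bounding boxes."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bb = obj.find("bndbox")
        boxes.append((name,
                      int(bb.findtext("xmin")), int(bb.findtext("ymin")),
                      int(bb.findtext("xmax")), int(bb.findtext("ymax"))))
    return boxes
```

YOLO-style training pipelines typically convert these corner coordinates into normalized center/width/height form, as reflected in figures 6b and 6c.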
The structure of YOLOv5 is shown in fig. 1. The YOLOv5 network mainly comprises 4 parts: box 1 is the Input end (Input), box 2 is the reference network (Backbone), box 3 is the feature fusion part (Neck), and box 4 is the Prediction part (Prediction). The Input comprises a preprocessing stage for the training data set images and adopts the Mosaic data enhancement of YOLOv4 to improve the training speed and network precision of the model.
And box 2 is the reference network (Backbone): the backbone of the detection network extracts the high-, middle- and low-level features of the image.
And box 3 is the feature fusion part (Neck), which is mainly used to generate the feature pyramid. The feature pyramid enhances the detection of objects at different scales, so that the same object can be identified at different sizes and scales.
And box 4 is the Prediction part (Prediction), where convolution is applied again to obtain the prediction result.
The specific contents of frames A, B, C, D, E and F, apart from the 4 parts above, are as follows:
The input end of YOLOv5 comprises a preprocessing stage for the training data set images and adopts the Mosaic data enhancement of YOLOv4 to improve the training speed and network precision of the model.
In the reference network, YOLOv5 differs in that a slice structure (Focus) is newly added; as shown in box E of fig. 1, the data are cut by slicing to extract general features.
Frame A is the CBL structure, composed of a convolution (Conv), batch normalization (BN) and a Leaky ReLU activation function.
Frame B is CSP1_X, consisting of three convolutional layers and X Res unit modules concatenated.
The CSP2_X structure is used in the feature fusion part (Neck); the Res unit module is no longer used and is replaced by CBL, as indicated in frame D.
The SPP module, shown in frame C, is mainly a pooling scheme; the subsequent improvement mainly changes the original max pooling into SoftPool.
The prediction part is convolved again to obtain the prediction result.
Frame F is the residual component, in which the Res unit module is composed of two CBL modules, the number of modules being determined according to specific requirements.
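To make the pooling scheme of frame C concrete, here is a minimal single-channel NumPy sketch of the SPP idea: stride-1 max pooling at several kernel sizes (5, 9 and 13 in YOLOv5) whose outputs are concatenated with the input. This is a toy illustration only; the real module operates on batched multi-channel tensors and wraps the pooling between CBL convolutions.

```python
import numpy as np

def maxpool_same(x, k):
    """Stride-1 max pooling with 'same' padding on a 2-D map, so the
    output keeps the spatial size of the input."""
    pad = k // 2
    xp = np.pad(x, pad, mode="constant", constant_values=-np.inf)
    h, w = x.shape
    return np.array([[xp[i:i + k, j:j + k].max() for j in range(w)]
                     for i in range(h)])

def spp(x, kernels=(5, 9, 13)):
    """Stack the input map with its pooled versions (channel-first)."""
    return np.stack([x] + [maxpool_same(x, k) for k in kernels])
```

Because the spatial size is preserved, the multi-scale pooled maps can simply be concatenated along the channel axis, which is what lets SPP fuse context at several receptive-field sizes.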
The most prominent characteristic of the Efficient Channel Attention in step 2 is that dimensionality reduction is avoided while cross-channel interaction is realized, which reduces the complexity of the model and enhances the feature expression capability. The structure of the ECA attention module is shown in FIG. 2. Its principle is that the ECA module generates channel attention through a fast one-dimensional convolution of size k, where the kernel size is adaptively determined by a function of the channel dimension. With the dimensions kept unchanged, after the input feature map x passes through global average pooling over all channels, the ECA module learns the features with a weight-sharing one-dimensional convolution, in which each channel interacts with its k neighbors to capture cross-channel interaction. k denotes the kernel size of the fast one-dimensional convolution; the adaptive value of k is obtained from the proportional relationship between the coverage of cross-channel information interaction and the channel dimension C, calculated as in formula (1):
in the formula: γ 2, b 21,|*oodRepresenting the nearest neighbor odd, C is the channel dimension.
And in step 3, for newly planted mangrove seedlings, whose targets are small and occupy few pixels, max pooling and average pooling easily lose important detail information during feature mapping. Therefore, SoftPool is introduced into the SPP module to improve the pooling operation and retain more detailed feature information; that is, pooling is realized in a Softmax manner over the pooling region of the activation feature map. The SoftPool structure is shown in FIG. 3, and the SoftPool downsampling process, from left to right, is: activate the feature map, calculate the final pooling result of SoftPool in the region, and output the result. The position of the fusion in YOLOv5, inside the SPP module, is shown in fig. 4.
Unlike other pooling methods, SoftPool uses softmax (the normalized exponential function) for weighted pooling, which preserves the expressiveness of the features and is differentiable, so the gradient can be updated in every back-propagation pass. SoftPool comprehensively utilizes every activation of the pooling kernel while adding only a small memory overhead. It increases the discrimination of similar feature information, retains the feature information of the whole receptive field, and improves the accuracy of the algorithm.
The core idea of SoftPool is the use of softmax: the weight of each feature value in the region R is calculated from the nonlinear feature values:
w_i = e^(a_i) / Σ_{j∈R} e^(a_j) (2)
In the formula: w_i is the weight of the ith activation, a_i is the ith activation value, R is the pooling region, and e is the mathematical constant that is the base of the natural logarithm function.
The weight w_i ensures that important features are propagated, and the feature values in region R retain at least a preset minimum gradient during back-propagation. After the weights w_i are obtained, the output is obtained by weighting the feature values in region R:
ã = Σ_{i∈R} w_i · a_i (3)
In the formula: R is the pooling region and ã is the output value of SoftPool, which realizes the weighted summation over all the activations of the pooling kernel.
In step 4, before feature extraction by the YOLOv5 feature extraction network, the Mosaic enhancement operation is performed on the data, the images are then uniformly scaled to a standard size for the Focus slicing operation, and afterwards they are input into the YOLOv5 feature extraction network for feature extraction.
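The Focus slicing mentioned here can be sketched as follows (a NumPy toy version for a single H×W×C image; the real layer operates on batched tensors and follows the slice with a convolution, and the exact slice ordering is an implementation detail assumed here):

```python
import numpy as np

def focus_slice(x):
    """Split an (H, W, C) image into four interleaved sub-images and
    concatenate them on the channel axis -> (H/2, W/2, 4C)."""
    return np.concatenate([x[0::2, 0::2], x[1::2, 0::2],
                           x[0::2, 1::2], x[1::2, 1::2]], axis=-1)
```

A 512×512×3 input thus becomes a 256×256×12 map: the spatial size halves, the channel count quadruples, and no pixel information is discarded.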
In step 6, the detection effect and performance of the model are generally evaluated with the Average Precision (AP) and the mean Average Precision (mAP), where AP is the area under the Recall–Precision curve; since the method involves a single target class, AP is equivalent to mAP.
Intersection over Union (IoU): the position-prediction capability of the model is measured by calculating the area intersection-over-union of the rectangular region predicted by the model and the rectangular region calibrated for the target in the verification set.
Precision represents the proportion of correctly detected targets among all detections made by the model, and reflects the accuracy of the model in target detection:
Precision = TP / (TP + FP)
Recall represents the proportion of detected targets among all real targets, and reflects the recall capability of the model:
Recall = TP / (TP + FN)
AP = ∫₀¹ P(R) dR
In the formulas: TP (true positive) is the number of correctly detected positive samples, i.e. the prediction box has the same class as the labeling box and IoU > 0.5; FP (false positive) is the number of wrongly detected positive samples; FN (false negative) is the number of missed targets; R is the recall; and AP is the area under the precision–recall curve.
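These quantities can be sketched directly (boxes given as (xmin, ymin, xmax, ymax); this is a hypothetical illustration, not the evaluation code used in the experiments):

```python
def iou(a, b):
    """Intersection over Union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP); Recall = TP/(TP+FN)."""
    return tp / (tp + fp), tp / (tp + fn)
```

A prediction would count as a TP when `iou` with a same-class label box exceeds 0.5, matching the definition above.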
Because precision and recall are both affected by the confidence threshold, evaluating model performance with precision and recall alone would be somewhat unscientific and limited; therefore the average precision AP, one of the most important indexes for evaluating mainstream target detection algorithms at present, is introduced into the experiment as the evaluation index of the recognition performance of the model.
Example 2: as shown in fig. 1, fig. 2, fig. 3, fig. 4, fig. 5, fig. 6, fig. 7, fig. 8, fig. 9 and fig. 10, the method for detecting mangrove forest single-tree targets based on the improved YOLOv5 algorithm, including the improved YOLOv5 network model, is further described below, taking newly planted mangrove seedlings in the Zhanjiang mangrove nature reserve in Guangdong as an example.
A mangrove forest single tree target detection method based on improved YOLOv5 comprises the following steps:
In view of the special geographical position and the short, small stature of the newly planted mangroves in the Zhanjiang mangrove nature reserve, 2348 unmanned aerial vehicle images were obtained by acquiring continuous multi-breakpoint flight data over the flight mission area at a height of 120 meters, with a theoretical spatial resolution of 0.05 meter; as shown in fig. 5a and 5b, individual mangrove seedlings can be clearly distinguished in the partial unmanned aerial vehicle images, from which the data set required by the invention was constructed. The unmanned aerial vehicle, a DJI Phantom 4 RTK produced by Shenzhen DJI Innovation Technology Co., Ltd. and carrying a CCD sensor, is light in design, convenient to operate and good in performance. When constructing the data set, clearer original images with a resolution of 5472 × 3648 pixels were selected and segmented into 597 pictures of 512 × 512 pixels by a segmentation program written in python; the commonly used LabelImg open-source software was then used to label the individual mangrove seedlings in the unmanned aerial vehicle images one by one. The general outline of the target trees is shown in figures 6a, 6b and 6c; from the relative position distribution and relative target sizes of the labeling boxes in figures 6b and 6c, it can be seen that the target trees are uniformly distributed in the images, with target widths mostly 2%–5% and heights mostly 1%–4% of the picture size. About 30,000 target trees were labeled; the labeled content is the coordinates of the rectangular bounding box of each newly planted mangrove, stored as an XML text file for training and testing the YOLOv5 model.
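The python-based segmentation program is not given in the text; as a minimal sketch of the tiling step (non-overlapping 512×512 crops, with the handling of the partial right/bottom margin an assumption on my part):

```python
def tile_origins(width, height, tile=512):
    """Top-left corners of all full, non-overlapping tile-sized crops
    of an image, discarding any partial margin at the right/bottom."""
    return [(x, y)
            for y in range(0, height - tile + 1, tile)
            for x in range(0, width - tile + 1, tile)]
```

Under this scheme a 5472 × 3648 frame yields 10 × 7 = 70 full tiles, so the 597-image data set would correspond to tiles selected from a small subset of the clearer originals.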
The constructed data set is then divided into a training set, a verification set and a test set at a ratio of 8:1:1, giving 477 training images, 60 verification images and 60 test images.
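The 8:1:1 split above can be sketched as follows; the patent does not state how images were assigned, so the shuffling, rounding scheme and function name here are assumptions chosen to reproduce the 477/60/60 counts:

```python
import random

def split_dataset(names, ratios=(8, 1, 1), seed=0):
    """Shuffle file names and split them into train/val/test by the ratios.

    Val and test sizes are rounded to the nearest integer and the
    remainder goes to the training set (an assumed convention).
    """
    rng = random.Random(seed)
    names = list(names)
    rng.shuffle(names)
    total = sum(ratios)
    n_val = round(len(names) * ratios[1] / total)
    n_test = round(len(names) * ratios[2] / total)
    n_train = len(names) - n_val - n_test
    return (names[:n_train],
            names[n_train:n_train + n_val],
            names[n_train + n_val:])
```

With the 597 tiles of the data set this yields exactly 477, 60 and 60 images.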
the experimental platform is a self-configured server: 64-bit Windows 10 operating system, Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40 GHz, NVIDIA Tesla K40c graphics card with 12 GB of video memory, and 16 GB of RAM. The network model of the invention is built on the PyTorch deep learning framework, with a development environment of PyTorch 1.4, CUDA 10.1 and Python 3.7. Training uses the Adam optimizer with an initial learning rate of 0.01; all experiments adopt single-scale training with an input image size of 512 x 512 pixels. According to the characteristics of the model, YOLOv5 and YOLOv5-ECA are both trained for 200 epochs, with YOLOv5x as the pre-training model.
an Efficient Channel Attention mechanism (ECA) is introduced into the CSPDarknet53 feature extraction network of YOLOv5; it avoids dimensionality reduction while capturing local cross-channel interaction, keeps the model complexity low and enhances the feature expression capability. The improved model is renamed YOLOv5-ECA. SoftPool is further introduced into the SPP module of the YOLOv5 network to improve the pooling operation and retain more detailed feature information. The acquired unmanned aerial vehicle mangrove single-tree data set images are input into the YOLOv5 feature extraction network for feature extraction to obtain feature maps of different scales; the obtained feature maps are classified and regressed, a feature reconstruction operation is performed on the regression result to obtain a more refined feature map, classification and regression are performed again on this basis, the loss is calculated, and target detection of the newly planted mangrove single trees in Zhanjiang, Guangdong is completed.
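As a concrete illustration of the ECA idea described above — global average pooling per channel, a shared-weight 1-D convolution across channels with an adaptively sized kernel, then a sigmoid gate — the following is a minimal NumPy sketch. It is not the patent's actual PyTorch implementation; the uniform convolution weights are a placeholder standing in for the learned shared 1-D kernel:

```python
import math
import numpy as np

def eca_kernel_size(channels: int, gamma: int = 2, b: int = 1) -> int:
    """Adaptive ECA kernel size k = |log2(C)/gamma + b/gamma|_odd."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1

def eca_attention(x: np.ndarray, weights=None) -> np.ndarray:
    """Apply ECA to a (C, H, W) feature map without dimensionality reduction.

    Each channel interacts only with its k neighbours via a shared
    1-D convolution over the channel axis; `weights` defaults to a
    placeholder uniform kernel (an assumption for illustration).
    """
    c = x.shape[0]
    k = eca_kernel_size(c)
    if weights is None:
        weights = np.full(k, 1.0 / k)
    y = x.mean(axis=(1, 2))                      # global average pooling -> (C,)
    pad = k // 2
    yp = np.pad(y, pad, mode='edge')             # same-padded channel axis
    conv = np.array([np.dot(yp[i:i + k], weights) for i in range(c)])
    att = 1.0 / (1.0 + np.exp(-conv))            # sigmoid channel attention
    return x * att[:, None, None]                # rescale channels, shape kept
```

For C = 256 channels this gives k = 5, so each channel attends to five neighbours instead of all 256, which is where the parameter saving over SE-style attention comes from.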
on the basis of the completed model training, the test set is tested, and the detection results are evaluated and analyzed in terms of the loss values during training, the convergence of the result parameters and the model detection performance.
The loss function serves as one of the criteria for evaluating how well a model trains. It can be seen that the improved model YOLOv5-ECA is clearly superior to the original YOLOv5: at the same number of training rounds, YOLOv5-ECA has lower loss function values, loses less detail and has a stronger feature learning capability. The curves of the loss function values against the number of training rounds are shown in fig. 7.
As shown in figs. 8a, 8b, 8c, 8d, 8e and 8f, during YOLOv5 training the val Objectness Loss fluctuates strongly around epoch 50, and the result parameters Recall, mAP@0.5 and mAP@0.5:0.95 all drop suddenly at epoch 180 before rising smoothly again to convergence; Precision fluctuates to varying degrees between epochs 0 and 60 and then essentially stabilizes. By contrast, the loss and result parameters of YOLOv5-ECA are relatively stable, with small fluctuations and smooth convergence, and its result parameters are higher than those of YOLOv5.
The single-tree target detection results for the mangrove forest are shown in figs. 9a, 9b, 9c and 9d: the fig. 9a column shows the detection results of the original YOLOv5 model and the fig. 9b column those of the improved YOLOv5-ECA model, whose overall precision is 3.2% higher than the original model. As can be seen from the figures, because the newly planted mangroves are small, densely distributed and not clear enough on the images, both models miss some detections; on comprehensive comparison, YOLOv5-ECA misses fewer targets and detects marginal and blurred targets with higher probability.
As described above, although the embodiments of the present invention have been described in detail, it will be apparent to those skilled in the art that many modifications are possible without substantially departing from the spirit and scope of the present invention. Therefore, such modifications are also included in the scope of protection of the present invention.
Claims (7)
1. A mangrove forest single tree target detection method based on improved YOLOv5 is characterized by comprising the following steps:
sequentially marking target trees on the selected unmanned aerial vehicle images by utilizing open source software LabelImg to construct a mangrove forest single tree data set;
the method selects YOLOv5 as the basic target detection model and optimizes and improves it according to the dense distribution and small size of the target trees: an Efficient Channel Attention mechanism is used to improve the CSPDarknet53 backbone network, avoiding dimensionality reduction while capturing cross-channel interaction and enhancing the feature expression capability, and a SoftPool pooling operation is introduced into the SPP module to retain more detailed feature information and improve the automatic target detection precision.
2. The method for detecting the single target of the mangrove forest based on the improved YOLOv5 as claimed in claim 1, which is characterized by comprising the following steps:
step 1, in view of the special geographic position, dense distribution and small size of the young mangrove trees, acquiring data by unmanned aerial vehicle, taking advantage of its temporal flexibility and high spatial resolution, to construct the required data set;
step 2, introducing an effective Channel Attention mechanism (effective Channel Attention) module into a CSPDarknet53 feature extraction network of YOLOv5, improving the YOLOv5 network, and renaming the improved model to be YOLOv 5-ECA;
step 3, further improving on the basis of the step 2, introducing SoftPool improved pooling operation into an SPP module of a YOLOv5 network, and reserving more detailed characteristic information;
step 4, inputting the acquired single-tree data set image of the unmanned aerial vehicle mangrove forest into a Yolov5 feature extraction network for feature extraction to acquire feature graphs of different scales;
step 5, classifying and regressing the characteristic diagram obtained in the step 4, performing characteristic reconstruction operation on a regression result to obtain a more precise characteristic diagram, and performing classification and regression operation again on the basis to calculate loss;
and 6, after the training of the model is completed, testing the test set by means of the divided data set so as to realize the target detection of the young mangrove forest single tree and evaluate the detection effect of the model.
3. The method as claimed in claim 2, wherein in step 1, in order to ensure the integrity of each detected target in the image, the individual mangrove seedlings in the unmanned aerial vehicle images are labeled one by one with the LabelImg open source software; the labeled content is the coordinates of the rectangular bounding box of each newly planted mangrove, stored as XML text files for training and testing of the YOLOv5 model; the structure of YOLOv5 comprises an input end Input, a backbone network Backbone, a feature fusion part Neck and a detection head part Head; the input end of YOLOv5 includes a preprocessing stage for the training data set images, in which the Mosaic data augmentation of YOLOv4 is used to improve the training speed and network precision of the model, and an adaptive anchor frame calculation program is newly added; in the backbone network part, YOLOv5 adds a Focus structure for feature extraction and constructs two kinds of CSP structures accordingly, the Focus structure performing a slicing operation on the picture; in the feature fusion part between the backbone network and the detection head, YOLOv5 adds a combined FPN + PAN structure; meanwhile, a corresponding segmentation program is written to cut the pictures into the required size, the labeling target boxes are attached to the target trees on the cut pictures, and targets with missed labels are checked so that the labels are complete.
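The Focus slicing operation mentioned above can be illustrated with a short sketch: every second pixel is taken to form four sub-images that are stacked on the channel axis, halving the spatial size while quadrupling the channels without losing information. This is a generic illustration of the standard YOLOv5 Focus layer, not code from the patent:

```python
import numpy as np

def focus_slice(x: np.ndarray) -> np.ndarray:
    """YOLOv5-style Focus operation on a (C, H, W) array.

    The four interleaved sub-images x[::2, ::2], x[1::2, ::2],
    x[::2, 1::2] and x[1::2, 1::2] are concatenated on the channel
    axis, so (C, H, W) becomes (4C, H/2, W/2).
    """
    return np.concatenate(
        [x[:, ::2, ::2], x[:, 1::2, ::2], x[:, ::2, 1::2], x[:, 1::2, 1::2]],
        axis=0,
    )
```

For a 3-channel 512 x 512 input this produces a 12-channel 256 x 256 tensor containing exactly the original pixel values, rearranged.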
4. The method of claim 2 for detecting the mangrove forest single tree target based on improved YOLOv5, characterized in that in step 2, the Efficient Channel Attention (ECA) module generates channel attention through a fast one-dimensional convolution of size k, where the kernel size k is determined adaptively as a function of the channel dimension; the feature map χ is input with its dimension kept unchanged, and after global average pooling over all channels, the ECA module learns the features by a one-dimensional convolution with shared weights; when learning features, each channel interacts with its k neighbors to capture cross-channel interaction, k representing the kernel size of the fast one-dimensional convolution; the adaptive value of k is obtained from the proportional relation between the coverage of cross-channel information interaction and the channel dimension C, as shown in formula (1):

k = ψ(C) = | log₂(C)/γ + b/γ |_odd    (1)

in the formula: γ = 2, b = 1, |·|_odd denotes the nearest odd number, and C is the channel dimension.
5. The method as claimed in claim 2, wherein in step 3, when feature mapping is performed on newly planted mangrove seedlings with small targets and too few pixels, maximum pooling and average pooling are prone to losing the weights of the information to be detected, so a SoftPool improved pooling operation is introduced into the SPP module to retain more detailed feature information; that is, pooling is implemented in each pooling region of the activation feature map by a softmax weighting, and during SoftPool downsampling the steps are, from left to right: activating the feature map, calculating the SoftPool result within the region, and outputting the result,

SoftPool uses softmax to calculate the weight of each feature value in the region R from the nonlinear feature values, as shown in formula (2):

w_i = e^{a_i} / Σ_{j∈R} e^{a_j}    (2)

in the formula: w_i is the weight of the ith activation, a_i is the ith activation value, R is the pooling region, and e is the base of the natural logarithm,

the weight w_i ensures that important features are transferred and that the feature values in the region R have at least a preset minimum gradient during back propagation; after the weight w_i is obtained, the output is obtained by weighting the feature values in the region R, as shown in formula (3):

ã = Σ_{i∈R} w_i · a_i    (3)
6. The method for detecting the single tree target of the mangrove forest based on the improved YOLOv5 as claimed in claim 2, wherein in step 4, before the YOLOv5 feature extraction network is used for feature extraction, the data is subjected to mosaic enhancement operation, then is uniformly scaled to a standard size for Focus slicing operation, and then is input to the YOLOv5 feature extraction network for feature extraction.
7. The method as claimed in claim 2, wherein in step 6, the average precision AP (Average Precision) and the mean average precision mAP are adopted to evaluate the detection effect and performance of the model, the AP being the area under the Recall-Precision curve:

AP = ∫₀¹ P(R) dR

area intersection ratio IoU: the position prediction capability of the model is measured by calculating the area intersection ratio between the rectangular region predicted by the model and the rectangular region calibrated for the target in the verification set,

IoU = area(B_p ∩ B_gt) / area(B_p ∪ B_gt)

precision (Precision), which represents the proportion of correctly detected targets among all detected targets and reflects the accuracy of the model in target detection:

Precision = TP / (TP + FP)

recall (Recall), which represents the proportion of detected targets among all real targets and reflects the recall capability of the model:

Recall = TP / (TP + FN)

in the formulas: TP (true positive) is the number of correctly detected positive samples, namely the class of the prediction box is the same as that of the labeling box and IoU > 0.5; FP (false positive) is the number of incorrectly detected positive samples; FN (false negative) is the number of real targets that are missed; R is the recall; P(R) is the precision at recall R; B_p is the predicted box and B_gt the ground-truth box; AP is the area under the recall-precision curve.
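The evaluation quantities in claim 7 reduce to a few lines of arithmetic; the following sketch shows IoU on corner-format boxes plus precision and recall from TP/FP/FN counts (box format and function names are illustrative assumptions):

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP); Recall = TP/(TP+FN)."""
    return tp / (tp + fp), tp / (tp + fn)
```

A prediction is counted as TP when its class matches the labeled box and its IoU exceeds 0.5, as stated in the claim.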
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111009370.8A CN113705478B (en) | 2021-08-31 | 2021-08-31 | Mangrove single wood target detection method based on improved YOLOv5 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113705478A true CN113705478A (en) | 2021-11-26 |
CN113705478B CN113705478B (en) | 2024-02-27 |
Family
ID=78657578
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111009370.8A Active CN113705478B (en) | 2021-08-31 | 2021-08-31 | Mangrove single wood target detection method based on improved YOLOv5 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113705478B (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110807488A (en) * | 2019-11-01 | 2020-02-18 | 北京芯盾时代科技有限公司 | Anomaly detection method and device based on user peer-to-peer group |
WO2021139069A1 (en) * | 2020-01-09 | 2021-07-15 | 南京信息工程大学 | General target detection method for adaptive attention guidance mechanism |
CN111898688A (en) * | 2020-08-04 | 2020-11-06 | 沈阳建筑大学 | Airborne LiDAR data tree species classification method based on three-dimensional deep learning |
CN112307903A (en) * | 2020-09-29 | 2021-02-02 | 江西裕丰智能农业科技有限公司 | Rapid single-tree extraction, positioning and counting method in fruit forest statistics |
CN112861837A (en) * | 2020-12-30 | 2021-05-28 | 北京大学深圳研究生院 | Unmanned aerial vehicle-based mangrove forest ecological information intelligent extraction method |
CN112733749A (en) * | 2021-01-14 | 2021-04-30 | 青岛科技大学 | Real-time pedestrian detection method integrating attention mechanism |
Non-Patent Citations (1)
Title |
---|
YUE Huihui; BAI Ruilin: "Research on wood knot defect detection method based on improved YOLOv3", Process Automation Instrumentation, no. 03 *
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11380089B1 (en) * | 2020-12-29 | 2022-07-05 | Tsinghua University | All-weather target detection method based on vision and millimeter wave fusion |
US20220207868A1 (en) * | 2020-12-29 | 2022-06-30 | Tsinghua University | All-weather target detection method based on vision and millimeter wave fusion |
CN114255350A (en) * | 2021-12-23 | 2022-03-29 | 四川大学 | Method and system for measuring thickness of soft and hard tissues of palate part |
CN114255350B (en) * | 2021-12-23 | 2023-08-04 | 四川大学 | Method and system for measuring thickness of soft and hard tissues of palate |
CN114372968A (en) * | 2021-12-31 | 2022-04-19 | 江南大学 | Defect detection method combining attention mechanism and adaptive memory fusion network |
CN114372968B (en) * | 2021-12-31 | 2022-12-27 | 江南大学 | Defect detection method combining attention mechanism and adaptive memory fusion network |
WO2023138300A1 (en) * | 2022-01-20 | 2023-07-27 | 城云科技(中国)有限公司 | Target detection method, and moving-target tracking method using same |
CN114092820A (en) * | 2022-01-20 | 2022-02-25 | 城云科技(中国)有限公司 | Target detection method and moving target tracking method applying same |
CN114627447A (en) * | 2022-03-10 | 2022-06-14 | 山东大学 | Road vehicle tracking method and system based on attention mechanism and multi-target tracking |
CN114462555A (en) * | 2022-04-13 | 2022-05-10 | 国网江西省电力有限公司电力科学研究院 | Multi-scale feature fusion power distribution network equipment identification method based on raspberry pi |
US11631238B1 (en) | 2022-04-13 | 2023-04-18 | Iangxi Electric Power Research Institute Of State Grid | Method for recognizing distribution network equipment based on raspberry pi multi-scale feature fusion |
CN115082695A (en) * | 2022-05-31 | 2022-09-20 | 中国科学院沈阳自动化研究所 | Transformer substation insulator string modeling and detecting method based on improved Yolov5 |
CN115226650A (en) * | 2022-06-02 | 2022-10-25 | 南京农业大学 | Sow oestrus state automatic detection system based on interactive features |
CN115226650B (en) * | 2022-06-02 | 2023-08-08 | 南京农业大学 | Sow oestrus state automatic detection system based on interaction characteristics |
CN115270943A (en) * | 2022-07-18 | 2022-11-01 | 青软创新科技集团股份有限公司 | Knowledge tag extraction model based on attention mechanism |
CN115270943B (en) * | 2022-07-18 | 2023-06-30 | 青软创新科技集团股份有限公司 | Knowledge tag extraction model based on attention mechanism |
CN115272828A (en) * | 2022-08-11 | 2022-11-01 | 河南省农业科学院农业经济与信息研究所 | Intensive target detection model training method based on attention mechanism |
CN115937991A (en) * | 2023-03-03 | 2023-04-07 | 深圳华付技术股份有限公司 | Human body tumbling identification method and device, computer equipment and storage medium |
CN117011555A (en) * | 2023-10-07 | 2023-11-07 | 广东海洋大学 | Mangrove forest ecological detection method based on remote sensing image recognition |
CN117011555B (en) * | 2023-10-07 | 2023-12-01 | 广东海洋大学 | Mangrove forest ecological detection method based on remote sensing image recognition |
CN117274824A (en) * | 2023-11-21 | 2023-12-22 | 岭南设计集团有限公司 | Mangrove growth state detection method and system based on artificial intelligence |
CN117274824B (en) * | 2023-11-21 | 2024-02-27 | 岭南设计集团有限公司 | Mangrove growth state detection method and system based on artificial intelligence |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113705478A (en) | Improved YOLOv 5-based mangrove forest single tree target detection method | |
CN111738124B (en) | Remote sensing image cloud detection method based on Gabor transformation and attention | |
CN110378909B (en) | Single wood segmentation method for laser point cloud based on Faster R-CNN | |
Klodt et al. | Field phenotyping of grapevine growth using dense stereo reconstruction | |
CN109325395A (en) | The recognition methods of image, convolutional neural networks model training method and device | |
CN110765865A (en) | Underwater target detection method based on improved YOLO algorithm | |
Li et al. | A comparison of deep learning methods for airborne lidar point clouds classification | |
CN114387520A (en) | Precision detection method and system for intensive plums picked by robot | |
CN114049325A (en) | Construction method and application of lightweight face mask wearing detection model | |
CN115359366A (en) | Remote sensing image target detection method based on parameter optimization | |
CN113344045A (en) | Method for improving SAR ship classification precision by combining HOG characteristics | |
CN115410081A (en) | Multi-scale aggregated cloud and cloud shadow identification method, system, equipment and storage medium | |
CN112084860A (en) | Target object detection method and device and thermal power plant detection method and device | |
CN115099297A (en) | Soybean plant phenotype data statistical method based on improved YOLO v5 model | |
Garcia-D'Urso et al. | Efficient instance segmentation using deep learning for species identification in fish markets | |
CN115761463A (en) | Shallow sea water depth inversion method, system, equipment and medium | |
CN115527234A (en) | Infrared image cage dead chicken identification method based on improved YOLOv5 model | |
Pamungkas et al. | Segmentation of Enhalus acoroides seagrass from underwater images using the Mask R-CNN method | |
CN116311086B (en) | Plant monitoring method, training method, device and equipment for plant monitoring model | |
CN117456287B (en) | Method for observing population number of wild animals by using remote sensing image | |
CN115205853B (en) | Image-based citrus fruit detection and identification method and system | |
CN116503737B (en) | Ship detection method and device based on space optical image | |
CN115861824B (en) | Remote sensing image recognition method based on improved transducer | |
Linlong et al. | Optimized Detection Method for Siberian Crane (Grus Leucogeranus) Based on Yolov5 | |
CN116883992A (en) | Strawberry fruit detection method based on improved YOLOv5x |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||