CN113705478A - Improved YOLOv5-based mangrove forest single tree target detection method - Google Patents

Improved YOLOv5-based mangrove forest single tree target detection method

Info

Publication number
CN113705478A
Authority
CN
China
Prior art keywords
yolov5
target
mangrove forest
model
pooling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111009370.8A
Other languages
Chinese (zh)
Other versions
CN113705478B (en)
Inventor
马永康
凌成星
刘华
赵峰
张雨桐
曾浩威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Research Institute Of Forest Resource Information Techniques Chinese Academy Of Forestry
Original Assignee
Research Institute Of Forest Resource Information Techniques Chinese Academy Of Forestry
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Research Institute Of Forest Resource Information Techniques Chinese Academy Of Forestry filed Critical Research Institute Of Forest Resource Information Techniques Chinese Academy Of Forestry
Priority to CN202111009370.8A priority Critical patent/CN113705478B/en
Publication of CN113705478A publication Critical patent/CN113705478A/en
Application granted granted Critical
Publication of CN113705478B publication Critical patent/CN113705478B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Abstract

A mangrove forest single tree target detection method based on improved YOLOv5, combining target detection and image processing techniques in deep learning with a young mangrove single-tree recognition technique using an improved YOLOv5 algorithm, belongs to the field of mangrove forest single tree target detection in forestry scientific research. Target trees in the selected unmanned aerial vehicle images are labeled in sequence with the open-source software LabelImg to construct a mangrove forest single tree data set; YOLOv5 is selected as the basic target detection model and is optimized and improved for the dense distribution and small size of the target trees; the CSPDarknet53 backbone network is improved with an efficient channel attention (ECA) mechanism, which avoids dimensionality reduction while enabling cross-channel interaction and enhances the feature expression capability; and a SoftPool pooling operation is introduced into the SPP module to retain more detailed feature information and improve the automatic target detection precision.

Description

Improved YOLOv5-based mangrove forest single tree target detection method
Technical Field
The invention relates to a mangrove forest single tree target detection method based on improved YOLOv5, which combines target detection and image processing techniques in deep learning with a young mangrove single-tree recognition technique based on an improved YOLOv5 algorithm, and belongs to the field of mangrove forest single tree target detection in forestry scientific research.
Background
Mangrove forests are special forest communities of tropical and subtropical coastal zones and play an important role in improving ecological conditions, maintaining biodiversity and safeguarding ecological security in coastal areas. Mangrove resource surveys and dynamic monitoring are the basis and precondition of scientific protection and management of mangroves, yet the mangrove area in China has been shrinking day by day due to human land reclamation and other causes. To better protect the mangrove ecosystem and enlarge the mangrove area, artificial afforestation has been carried out in many mangrove nature reserves in recent years according to plan, and supplementary planting and artificial afforestation within the original mangrove forests have become the main measures for restoring mangrove resources. The geographical locations of mangrove distribution are special: most mangroves grow on intertidal shoals and are easily submerged by seawater at high tide, and seedlings that do not survive are washed away by the water flow because their root systems have no grip, so the remaining individual mangrove saplings are the surviving trees. In such a special environment, low-altitude unmanned aerial remote sensing is very suitable for image acquisition over mangrove regions because of its flexible data acquisition, low cost and ability to rapidly acquire ultrahigh-resolution images over small areas. Applying unmanned aerial vehicles to the acquisition of mangrove seedling data is therefore one effective way to improve the accuracy of mangrove seedling monitoring.
Therefore, how to detect individual mangrove trees quickly and accurately from unmanned aerial vehicle images has become an urgent problem in checking the survival rate of newly planted mangroves. With the development of target detection and object recognition technologies, deep learning has begun to be widely applied to target detection, with face recognition and speech recognition among the most common applications. To address problems such as insufficient feature extraction capability and the slowdown caused by prediction-box processing, an Anchor-Free lightweight detection algorithm with multi-scale feature fusion has been proposed. A defect detection method based on a scale-invariant feature pyramid has been proposed for the low inspection defect detection accuracy of existing target detection algorithms in complex inspection scenes. However, these methods only improve the algorithms for their corresponding specific scenes. In recent years, deep learning has also gradually been applied to the forestry industry, enabling more accurate, rapid and intelligent forestry monitoring. In an article on a deep-learning-based method for detecting small damaged trees, Zhou Yan proposed a deep-learning-based small-target detection method for damaged trees, aimed at problems in unmanned aerial vehicle forest images such as small tree scale, dense growth and irregular distribution. However, the SSD algorithm (Single Shot MultiBox Detector, a target detection algorithm proposed by Wei Liu at ECCV 2016) discards the low-level features that contain rich information and has low robustness for small-target detection.
The current mainstream deep learning target detection algorithms are divided into two-stage and single-stage algorithms. Unlike two-stage detection algorithms represented by the R-CNN series, YOLO completes feature extraction, candidate-box regression and classification directly in the same unbranched convolutional network, so the network structure is simple and can meet the requirements of real-time detection tasks. The YOLO target detection model has gone through several update iterations, successively addressing problems in multiple dimensions such as multi-target detection, small-target detection, repairing missed detections and multi-scale prediction.
Given the advantages of target detection such as automatic feature learning, high speed and high efficiency, the method applies YOLOv5, the latest algorithm of this series, to individual mangrove tree detection to realize intelligent monitoring of mangrove seedling conditions.
Disclosure of Invention
The invention aims to provide a mangrove forest single tree target detection method based on improved YOLOv5, built on a deep learning method, which addresses problems such as the small and densely distributed individual mangrove tree targets in current unmanned aerial vehicle images and the low degree of automation and low efficiency of their detection, provides technical support for the automatic detection of surviving mangrove seedlings, further improves the precision and efficiency of checking the survival rate of newly planted mangroves, and realizes intelligent monitoring of mangrove seedling conditions.
Aiming at problems such as the small size and dense distribution of individual mangrove tree targets in unmanned aerial vehicle images and the low degree of automation and low efficiency of their detection, a method for detecting individual mangrove tree targets based on improved YOLOv5 is provided on the basis of deep learning, so that individual mangrove trees in unmanned aerial vehicle images can be identified and located quickly, accurately and automatically.
A mangrove forest single tree target detection method based on improved YOLOv5 comprises the following steps:
and sequentially marking target trees on the selected unmanned aerial vehicle images by utilizing open source software LabelImg to construct a mangrove forest single tree data set.
The method selects YOLOv5 as the basic target detection model and optimizes and improves it for the dense distribution and small size of the target trees: the CSPDarknet53 backbone network is improved with an efficient channel attention (ECA) mechanism, which avoids dimensionality reduction while enabling cross-channel interaction and enhances the feature expression capability, and a SoftPool pooling operation is introduced into the SPP module to retain more detailed feature information and improve the automatic target detection precision.
The technical scheme adopted by the invention to solve the technical problems is as follows: a mangrove forest single tree target detection method based on improved YOLOv5 comprises the following steps:
Step 1, because young mangrove forest single trees are in special geographic locations, densely distributed and small in size, data are acquired by means of the unique temporal flexibility and high spatial resolution of the unmanned aerial vehicle to construct the required data set.
Step 2, introducing an efficient channel attention (ECA) module into the CSPDarknet53 feature extraction network of YOLOv5 to improve the YOLOv5 network, and renaming the improved model YOLOv5-ECA.
Step 3, further improving on the basis of step 2 by introducing the SoftPool improved pooling operation into the SPP module of the YOLOv5 network to retain more detailed feature information.
Step 4, inputting the acquired unmanned aerial vehicle mangrove single-tree data set images into the YOLOv5 feature extraction network for feature extraction to obtain feature maps of different scales.
Step 5, classifying and regressing the feature maps obtained in step 4, performing a feature reconstruction operation on the regression result to obtain more refined feature maps, and performing classification and regression again on this basis to calculate the loss.
Step 6, after model training is completed, testing the test set of the divided data set to realize target detection of young mangrove single trees and evaluate the detection effect of the model.
The efficient channel attention of the present invention reduces the complexity of the model by avoiding dimensionality reduction while allowing cross-channel information interaction, and at the same time enhances the expressive power of the features. SoftPool can preserve the expressiveness of the features and its operation is differentiable, so the feature information of the whole receptive field is retained and the accuracy of the algorithm is improved. The method can detect newly planted mangrove single trees quickly, accurately and automatically and has obvious advantages over existing traditional unmanned aerial vehicle mangrove detection methods; with the improvement the target detection precision rises and the training loss is lower, so rapid, accurate and automatic detection of mangrove single-tree targets can be realized and the recognition and positioning capability for mangrove single trees is improved.
Drawings
A more complete appreciation of the invention and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein the accompanying drawings are included to provide a further understanding of the invention and form a part of this specification, and wherein the illustrated embodiments of the invention and the description thereof are intended to illustrate and not limit the invention, as illustrated in the accompanying drawings, in which:
fig. 1 is a schematic diagram of a YOLOv5 network structure.
Fig. 2 is a structure diagram of the efficient channel attention (ECA) module that is introduced.
Fig. 3 is a structure diagram of the SoftPool pooling operation introduced into the SPP module.
FIG. 4 is a schematic illustration of the position of SoftPool in YOLOv5-ECA.
Fig. 5a is a collected image of a portion of a mangrove forest single-tree unmanned aerial vehicle.
Fig. 5b is a collected image of a portion of a mangrove forest single-tree unmanned aerial vehicle.
FIG. 6a is a target label graph.
Fig. 6b is a graph of normalized target position.
Fig. 6c is a graph of normalized target size.
FIG. 7 is a graph of the YOLOv5 and YOLOv5-ECA loss function values as a function of the number of training rounds.
FIG. 8a is a comparison graph of convergence results of various parameters of YOLOv5 and YOLOv5-ECA.
FIG. 8b is a comparison graph of convergence results of various parameters of YOLOv5 and YOLOv5-ECA.
FIG. 8c is a comparison graph of convergence results of various parameters of YOLOv5 and YOLOv5-ECA.
FIG. 8d is a comparison graph of convergence results of various parameters of YOLOv5 and YOLOv5-ECA.
FIG. 8e is a comparison graph of convergence results of various parameters of YOLOv5 and YOLOv5-ECA.
FIG. 8f is a comparison graph of convergence results of various parameters of YOLOv5 and YOLOv5-ECA.
FIG. 9a is a graph showing the detection result of the YOLOv5 detection target.
FIG. 9b is a graph showing the results of detection of a target by YOLOv5.
FIG. 9c is a graph showing the results of detection of the YOLOv5-ECA detection target.
FIG. 9d is a graph showing the results of detection of the YOLOv5-ECA detection target.
FIG. 10 is a flow chart of the steps of the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
It will be apparent that those skilled in the art can make many modifications and variations based on the spirit of the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element, component or section is referred to as being "connected" to another element, component or section, it can be directly connected to the other element, component or section, or intervening elements or sections may also be present. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art.
The following examples are further illustrative in order to facilitate the understanding of the embodiments, and the present invention is not limited to the examples.
Example 1: as shown in fig. 1, fig. 2, fig. 3, fig. 4, fig. 5, fig. 6, fig. 7, fig. 8, fig. 9 and fig. 10, a mangrove forest single tree target detection method based on improved YOLOv5 includes the following steps:
Step 1, because young mangrove forest single trees are in special geographic locations, densely distributed and small in size, data are acquired by means of the unique temporal flexibility and high spatial resolution of the unmanned aerial vehicle to construct the required data set.
Step 2, introducing an efficient channel attention (ECA) module into the CSPDarknet53 feature extraction network of YOLOv5 to improve the YOLOv5 network, and renaming the improved model YOLOv5-ECA.
Step 3, further improving on the basis of step 2 by introducing the SoftPool improved pooling operation into the SPP module of the YOLOv5 network to retain more detailed feature information.
Step 4, inputting the acquired unmanned aerial vehicle mangrove single-tree data set images into the YOLOv5 feature extraction network for feature extraction to obtain feature maps of different scales.
Step 5, classifying and regressing the feature maps obtained in step 4, performing a feature reconstruction operation on the regression result to obtain more refined feature maps, and performing classification and regression again on this basis to calculate the loss.
Step 6, after model training is completed, testing the test set of the divided data set to realize target detection of young mangrove single trees and evaluate the detection effect of the model.
In step 1, to ensure that each detection target in the image is complete, the conventional LabelImg open-source software is used to label the individual mangrove seedlings in the unmanned aerial vehicle images in sequence; the labeled content is the coordinates of a rectangular bounding box around each newly planted mangrove tree, stored as an XML text file for training and testing the YOLOv5 model.
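To make this preprocessing step concrete, the sketch below converts one LabelImg annotation in Pascal VOC XML format into the normalized text format commonly used to train YOLOv5. This conversion step, the file names and the single class id are illustrative assumptions, not part of the patented method.

```python
import xml.etree.ElementTree as ET

def voc_xml_to_yolo(xml_path, class_id=0):
    """Convert one LabelImg (Pascal VOC) XML annotation to YOLO label lines.

    Each output line is "class x_center y_center width height" with all
    coordinates normalized by the image size (assumed single-class data set).
    """
    root = ET.parse(xml_path).getroot()
    img_w = float(root.find("size/width").text)
    img_h = float(root.find("size/height").text)

    lines = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin = float(box.find("xmin").text)
        ymin = float(box.find("ymin").text)
        xmax = float(box.find("xmax").text)
        ymax = float(box.find("ymax").text)
        x_c = (xmin + xmax) / 2.0 / img_w
        y_c = (ymin + ymax) / 2.0 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
    return lines

# Hypothetical usage: write a YOLO label file next to the XML annotation.
# with open("tile_0001.txt", "w") as f:
#     f.write("\n".join(voc_xml_to_yolo("tile_0001.xml")))
```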
The structure of YOLOv5 is shown in fig. 1. The YOLOv5 network mainly comprises 4 parts: box 1 is the Input end (Input), box 2 is the reference network (Backbone), box 3 is the feature fusion part (Neck), and box 4 is the Prediction part (Prediction). Box 1, the Input end, comprises the preprocessing stage of the training data set images; the Mosaic data enhancement of YOLOv4 is adopted at the input to improve the training speed and network precision of the model.
Box 2 is the reference network (Backbone); the Backbone of the detection network is used to extract the high-, middle- and low-level features of the image.
Box 3 is the feature fusion part (Neck); the Neck is mainly used to generate a feature pyramid. The feature pyramid enhances the model's detection of objects at different scales, so that the same object can be recognized at different sizes and scales.
Box 4 is the Prediction part (Prediction), where convolution is applied again to obtain the prediction result.
In addition to the above 4 parts, the specific contents of blocks A, B, C, D, E and F are as follows:
the input end of the YOLOv5 comprises a preprocessing stage of training data set images, and the Mosaic data enhancement of YOLOv4 is adopted at the input end to improve the training speed and the network precision of the model.
In the reference network, YOLOv5 differs in that a slice structure (Focus) is newly added; as shown in block E of fig. 1, the data are cut by slicing to extract general features.
Block A is the CBL structure, composed of a convolution (Conv), batch normalization (BN) and a Leaky ReLU activation function.
Block B is CSP1_X, consisting of three convolutional layers and X concatenated Res unit modules.
The CSP2_X structure is used in the feature fusion part (Neck); the Res unit module is no longer used and is replaced by CBL, as indicated in block D.
Block C is the SPP module, which mainly performs pooling; the subsequent improvement mainly replaces the original max pooling with SoftPool.
The prediction part is convolved again to obtain a prediction result.
Block F is a residual component in which the Res unit module is composed of two CBL modules; the number of modules is determined according to specific requirements.
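To make the block naming above concrete, here is a minimal PyTorch sketch of the CBL structure and of the Res unit of block F; it illustrates the described composition, and the class names and default kernel sizes are assumptions, not code from the patent.

```python
import torch.nn as nn

class CBL(nn.Module):
    """CBL block: Conv + BatchNorm + Leaky ReLU, the basic unit referred to above."""

    def __init__(self, in_ch, out_ch, k=3, s=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, s, padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class ResUnit(nn.Module):
    """Residual component of block F: two CBL modules plus a skip connection."""

    def __init__(self, channels):
        super().__init__()
        self.block = nn.Sequential(CBL(channels, channels, k=1),
                                   CBL(channels, channels, k=3))

    def forward(self, x):
        return x + self.block(x)
```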
The most prominent characteristic of the efficient channel attention (ECA) in step 2 is that dimensionality reduction is avoided while cross-channel interaction is realized, which reduces the complexity of the model and enhances the feature expression capability. The structure of the ECA attention module is shown in FIG. 2. Its principle is that the ECA module generates channel attention through a fast one-dimensional convolution of size k, where the kernel size is determined adaptively by a function of the channel dimension. The feature image x is input with its dimension kept unchanged; after global average pooling of all channels, the ECA module learns features with a weight-sharing one-dimensional convolution, and when learning features each channel interacts with its k neighbors to capture cross-channel interaction. k denotes the kernel size of the fast one-dimensional convolution; the adaptive value of k is obtained from the proportional relation between the coverage of cross-channel information interaction and the channel dimension C, calculated as in formula (1):
$k = \psi(C) = \left| \frac{\log_2(C)}{\gamma} + \frac{b}{\gamma} \right|_{\mathrm{odd}} \quad (1)$

In the formula: $\gamma = 2$, $b = 1$, $|\cdot|_{\mathrm{odd}}$ denotes the nearest odd number, and C is the channel dimension.
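A minimal PyTorch sketch of an ECA block consistent with the description above is given below: global average pooling followed by a weight-sharing one-dimensional convolution whose kernel size follows formula (1). The class name and default parameters are illustrative, not code from the patent; where such a block is inserted in the CSPDarknet53 backbone is a design choice of the implementation.

```python
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient channel attention: no dimensionality reduction, local cross-channel interaction."""

    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        # Adaptive kernel size k = |log2(C)/gamma + b/gamma|_odd, as in formula (1).
        t = int(abs(math.log2(channels) / gamma + b / gamma))
        k = t if t % 2 else t + 1
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # x: (B, C, H, W) -> per-channel descriptor of shape (B, C, 1, 1).
        y = self.avg_pool(x)
        # Weight-sharing 1-D convolution across the channel axis captures the
        # interaction of each channel with its k neighbours.
        y = self.conv(y.squeeze(-1).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-1)
        # Re-weight the input feature map channel by channel.
        return x * self.sigmoid(y)

# Example: attention over a 256-channel feature map.
# out = ECA(256)(torch.randn(1, 256, 64, 64))
```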
In step 3, for newly planted mangrove seedlings whose targets are small and whose pixel counts are very low, max pooling and average pooling easily lose information useful for detection when performing feature mapping. Therefore, SoftPool is introduced into the SPP module to improve the pooling operation and retain more detailed feature information, i.e., pooling is realized by applying softmax over the pooling region of the activation feature map. The SoftPool structure is shown in FIG. 3; from left to right, the SoftPool downsampling process consists of: activating the feature map, calculating the final SoftPool pooling result within the region, and outputting the result. The position where it is fused into YOLOv5, in the SPP module, is shown in fig. 4.
Unlike other pooling methods, SoftPool uses softmax (the normalized exponential function) for weighted pooling, which preserves the expressiveness of the features and is differentiable. The gradient can be updated at each back-propagation, and SoftPool comprehensively utilizes every activation factor of the pooling kernel while adding only a small memory footprint. It increases the discrimination of similar feature information while retaining the feature information of the whole receptive field, improving the accuracy of the algorithm.
The core idea of SoftPool is the use of softmax; the feature-value weights of a region R are calculated from the nonlinear feature values:
$w_i = \frac{e^{a_i}}{\sum_{j \in R} e^{a_j}}$

In the formula: $w_i$ is the weight of the i-th activation, $a_i$ is the i-th activation value, R is the pooling region, and e is the mathematical constant that is the base of the natural logarithm.
The weight $w_i$ ensures the transmission of important features, and the feature values in the region R have at least a preset minimum gradient during back-propagation. After obtaining the weight $w_i$, the output is obtained by weighting the feature values in the region R:
$\tilde{a} = \sum_{i \in R} w_i \cdot a_i$

In the formula: R is the pooling region and $\tilde{a}$ is the output value of SoftPool, namely the weighted sum of all the activation factors of the pooling kernel.
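The following sketch implements the SoftPool weighting and weighted-sum formulas above as a straightforward, non-optimized PyTorch function. The function name and default kernel size are assumptions for illustration; inside an SPP module the pooling would normally be applied with stride 1 and padding so that the spatial size is preserved, which is omitted here for brevity.

```python
import torch
import torch.nn.functional as F

def soft_pool2d(x, kernel_size=2, stride=None):
    """SoftPool: softmax-weighted pooling over each pooling region R.

    The weights w_i = exp(a_i) / sum_j exp(a_j) are computed within each
    region, and the output is the weighted sum of the activations.
    """
    stride = stride or kernel_size
    area = kernel_size * kernel_size
    e_x = torch.exp(x)
    # avg_pool2d multiplied by the window area gives the per-region sums.
    weighted_sum = F.avg_pool2d(x * e_x, kernel_size, stride) * area  # sum_i a_i * e^{a_i}
    normalizer = F.avg_pool2d(e_x, kernel_size, stride) * area        # sum_i e^{a_i}
    return weighted_sum / normalizer

# Example: halve the spatial resolution of a 512-channel feature map.
# y = soft_pool2d(torch.randn(1, 512, 32, 32))  # -> shape (1, 512, 16, 16)
```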
In step 4, before feature extraction with the YOLOv5 feature extraction network, mosaic enhancement is applied to the data, the data are then uniformly scaled to a standard size for the Focus slicing operation, and finally they are input into the YOLOv5 feature extraction network for feature extraction.
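For reference, the Focus slicing operation mentioned here rearranges every 2x2 spatial neighborhood into the channel dimension before the first convolution; a minimal sketch mirroring the public YOLOv5 implementation follows, where the subsequent convolution is omitted and the function name is illustrative.

```python
import torch

def focus_slice(x):
    """Focus slicing: (B, C, H, W) -> (B, 4C, H/2, W/2) by interleaved sub-sampling."""
    return torch.cat(
        [x[..., ::2, ::2],     # top-left pixels of every 2x2 block
         x[..., 1::2, ::2],    # bottom-left pixels
         x[..., ::2, 1::2],    # top-right pixels
         x[..., 1::2, 1::2]],  # bottom-right pixels
        dim=1,
    )

# A 512 x 512 RGB input becomes a 12-channel 256 x 256 tensor before the first convolution.
# focus_slice(torch.randn(1, 3, 512, 512)).shape  # torch.Size([1, 12, 256, 256])
```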
In step 6, the detection effect and performance of the model are evaluated with the average precision AP (Average Precision) and the mean average precision mAP, where AP is the area under the precision (Precision)-recall (Recall) curve; since the method detects a single target class, AP is equivalent to mAP.
Area intersection over union (IoU): the position prediction capability of the model is measured by calculating the intersection over union of the rectangular region of the model-predicted target and the rectangular region labeled for the target in the validation set.
Precision represents the proportion of correctly detected targets among all targets detected by the model and reflects the accuracy of the model's target detection.
$P = \frac{TP}{TP + FP}$
Recall represents the proportion of targets correctly detected by the model among the total number of targets and reflects the recall capability of model recognition.
$R = \frac{TP}{TP + FN}$

$AP = \int_0^1 P(R)\,\mathrm{d}R$

In the formula: TP (true positive) is the number of correctly detected positive samples, namely the class of the prediction box is the same as that of the labeled box and IoU is greater than 0.5; FP (false positive) is the number of incorrectly detected positive samples; FN (false negative) is the number of missed targets; R is the recall; AP is the area under the precision-recall curve.
Because precision and recall are both affected by the confidence threshold, evaluating model performance with precision and recall alone would be somewhat unscientific and limited, so the average precision AP is introduced into the experiment as an evaluation index of the recognition performance of the model; it is currently one of the most important indices for evaluating the performance of mainstream target detection algorithms.
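As a simple illustration of these evaluation metrics, the sketch below computes precision, recall and a single-class AP from a list of scored detections and the ground-truth count. The IoU matching at a 0.5 threshold is assumed to have been done beforehand, the integration of the precision-recall curve is a simplified, non-interpolated version, and all names are hypothetical.

```python
import numpy as np

def precision_recall_ap(scores, is_tp, n_ground_truth):
    """Single-class precision, recall and AP (area under the precision-recall curve).

    scores: confidence of each detection; is_tp: 1 if the detection matched a
    ground-truth box with IoU > 0.5, else 0; n_ground_truth: total labeled trees.
    """
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_tp, dtype=float)[order]
    fp = 1.0 - tp

    cum_tp, cum_fp = np.cumsum(tp), np.cumsum(fp)
    precision = cum_tp / np.maximum(cum_tp + cum_fp, 1e-12)
    recall = cum_tp / max(n_ground_truth, 1)

    # Approximate AP = integral of precision over recall (no interpolation).
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += p * (r - prev_r)
        prev_r = r
    return float(precision[-1]), float(recall[-1]), ap

# Example: three detections, two of them correct, four labeled trees in total.
# precision_recall_ap([0.9, 0.8, 0.3], [1, 1, 0], 4)
```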
Example 2: as shown in fig. 1, fig. 2, fig. 3, fig. 4, fig. 5, fig. 6, fig. 7, fig. 8, fig. 9 and fig. 10, the mangrove forest single tree target detection method based on the improved YOLOv5 algorithm, including the improved YOLOv5 network model, is further described below by taking newly planted mangrove seedlings in the Zhanjiang Mangrove Nature Reserve in Guangdong as an example.
A mangrove forest single tree target detection method based on improved YOLOv5 comprises the following steps:
step 1, data acquisition and data set construction:
In view of the special geographical location and the short, small stature of the newly planted mangroves in the Zhanjiang Mangrove Nature Reserve, 2348 unmanned aerial vehicle images were obtained by continuous multi-breakpoint flight data acquisition over the flight mission area at a height of 120 meters, with a theoretical spatial resolution of 0.05 meters; individual mangrove seedlings can be clearly distinguished in the images, as shown in figs. 5a and 5b, and these images were used to construct the data set required by the invention. The unmanned aerial vehicle, a DJI Phantom 4 RTK produced by Shenzhen DJI Innovation Technology Co., Ltd. and carrying a CCD sensor, is light in design, convenient to operate and performs well. When constructing the data set, clearer original images with a resolution of 5472 x 3648 pixels were selected and segmented into 597 pictures of 512 x 512 pixels by a splitting program written in Python. The commonly used LabelImg open-source software was then used to label the individual mangrove seedlings in the unmanned aerial vehicle images in sequence; the general outline of the target trees is shown in figs. 6a, 6b and 6c, and the relative position distribution and relative target size of the labeled boxes can be seen in figs. 6b and 6c: the target trees are evenly distributed in the images, with target widths mostly accounting for 2%-5% of the picture width and target heights mostly 1%-4%. About 3 million target trees were labeled; the labeled content is the coordinates of the rectangular bounding boxes of the newly planted mangroves, stored as XML text files for training and testing the YOLOv5 model. The constructed data set was then divided into a training set, a validation set and a test set at a ratio of 8:1:1, comprising 477 training images, 60 validation images and 60 test images.
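A sketch of the kind of Python splitting program described here is shown below: it cuts a large unmanned aerial vehicle image into 512 x 512 tiles and partitions a tile list at an 8:1:1 ratio. The file paths, the use of OpenCV for image I/O and the random shuffling are assumptions, not details taken from the patent.

```python
import random
import cv2  # OpenCV, assumed available for image I/O

def split_into_tiles(image_path, tile=512):
    """Cut a large UAV image into non-overlapping tile x tile crops."""
    img = cv2.imread(image_path)
    h, w = img.shape[:2]
    tiles = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            tiles.append(img[y:y + tile, x:x + tile])
    return tiles

def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle and divide items into train / validation / test subsets (8:1:1)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]

# Hypothetical usage:
# tiles = split_into_tiles("uav_scene_001.jpg")
# train, val, test = split_dataset(tiles)
```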
Step 2, model training environment:
the experimental platform is an autonomous configuration server, a 64-bit Windows10 operating system, an Intel (R) Xeon (R) CPU E5-2630 v3@2.40GHz, an NVIDIATesla K40c video card, a video memory 12G and a memory 16 GB. The invention constructs a network model based on a PyTorch deep learning framework, and the development environment is PyTorch1.4, cuda10.1 and python 3.7. The training process uses an Adam optimizer for training, the initial learning rate is set to be 0.01, single-scale training is adopted in all experiments, and the image input size is 512 x 512 pixels. According to the characteristics of the model, epochs of Yolov5 and Yolov5-ECA training are both 200, and a pre-training model is YOLOv5 x.
Step 3, model improvement and training:
An efficient channel attention (ECA) module is introduced into the CSPDarknet53 feature extraction network of YOLOv5; it avoids dimensionality reduction while capturing cross-channel interaction, reduces the complexity of the model and enhances the feature expression capability, improving the YOLOv5 network, and the improved model is renamed YOLOv5-ECA. SoftPool is introduced into the SPP module of the YOLOv5 network to improve the pooling operation and retain more detailed feature information. The acquired unmanned aerial vehicle mangrove single-tree data set images are input into the YOLOv5 feature extraction network for feature extraction to obtain feature maps of different scales; the obtained feature maps are classified and regressed, a feature reconstruction operation is performed on the regression result to obtain more refined feature maps, classification and regression are performed again on this basis, the loss is calculated, and target detection of the newly planted mangrove single trees of Zhanjiang, Guangdong is completed.
Step 4, analyzing the detection result:
on the basis of completing the model training of the invention, the test set is tested, and the detection result is evaluated and analyzed through the loss value, the parameter convergence result, the model detection performance and other aspects in the training process.
The loss function is one of the criteria for evaluating the quality of model training. It can be seen that the improved and optimized model YOLOv5-ECA is clearly superior to the original model YOLOv5: YOLOv5-ECA has lower loss function values for the same number of training rounds, and at the same time the improved model loses less detail and has stronger feature learning capability. The curve of the loss function values against the number of training rounds is shown in figure 7.
As shown in figs. 8a, 8b, 8c, 8d, 8e and 8f, during YOLOv5 training the validation objectness loss fluctuates greatly around epoch 50, and the result parameters Recall, mAP@0.5 and mAP@0.5:0.95 all drop suddenly at epoch 180 and then rise smoothly again to convergence. Precision fluctuates to varying degrees between epochs 0-60 and then remains essentially stable. The loss and result parameters of YOLOv5-ECA are relatively stable, with small fluctuations and no sudden changes, converging smoothly, and its result parameters are higher than those of YOLOv5.
The single-tree target detection results for the mangrove forest are shown in figs. 9a, 9b, 9c and 9d: figs. 9a and 9b show the detection results of the original model YOLOv5 and figs. 9c and 9d show those of the improved model YOLOv5-ECA, whose overall precision is improved by 3.2% compared with the original model. As can be seen from the figures, because the newly planted mangroves are small, densely distributed and not clear enough in the images, both models show some missed detections; a comprehensive comparison shows that YOLOv5-ECA misses fewer detections and has a higher probability of detecting objects at the image edges and blurred objects.
As described above, although the embodiments of the present invention have been described in detail, it will be apparent to those skilled in the art that many modifications are possible without substantially departing from the spirit and scope of the present invention. Therefore, such modifications are also all included in the scope of protection of the present invention.

Claims (7)

1. A mangrove forest single tree target detection method based on improved YOLOv5 is characterized by comprising the following steps:
sequentially marking target trees on the selected unmanned aerial vehicle images by utilizing open source software LabelImg to construct a mangrove forest single tree data set;
the method selects YOLOv5 as the basic target detection model and optimizes and improves it for the dense distribution and small size of the target trees: the CSPDarknet53 backbone network is improved with an efficient channel attention (ECA) mechanism, which avoids dimensionality reduction while enabling cross-channel interaction and enhances the feature expression capability, and a SoftPool pooling operation is introduced into the SPP module to retain more detailed feature information and improve the automatic target detection precision.
2. The method for detecting mangrove forest single tree targets based on improved YOLOv5 as claimed in claim 1, characterized by comprising the following steps:
step 1, because young mangrove forest single trees are in special geographic locations, densely distributed and small in size, acquiring data by means of the unique temporal flexibility and high spatial resolution of an unmanned aerial vehicle to construct the required data set;
step 2, introducing an efficient channel attention (ECA) module into the CSPDarknet53 feature extraction network of YOLOv5 to improve the YOLOv5 network, and renaming the improved model YOLOv5-ECA;
step 3, further improving on the basis of step 2 by introducing the SoftPool improved pooling operation into the SPP module of the YOLOv5 network to retain more detailed feature information;
step 4, inputting the acquired unmanned aerial vehicle mangrove single-tree data set images into the YOLOv5 feature extraction network for feature extraction to obtain feature maps of different scales;
step 5, classifying and regressing the feature maps obtained in step 4, performing a feature reconstruction operation on the regression result to obtain more refined feature maps, and performing classification and regression again on this basis to calculate the loss;
and step 6, after model training is completed, testing the test set of the divided data set to realize target detection of young mangrove single trees and evaluate the detection effect of the model.
3. The method as claimed in claim 2, wherein in step 1, in order to ensure the integrity of each detected target in the image, the individual mangrove seedlings in the unmanned aerial vehicle images are labeled in sequence using the LabelImg open-source software, the labeled content is the coordinates of the rectangular bounding boxes of the newly planted mangroves, stored as XML text files for training and testing of the YOLOv5 model; the structure of YOLOv5 comprises an Input end, a reference network Backbone, a feature fusion part Neck and a detection Head part; the Input end of YOLOv5 comprises a preprocessing stage of the training data set images, where the Mosaic data enhancement of YOLOv4 is adopted to improve the training speed and network precision of the model and an adaptive anchor-box calculation program is newly added; in the Backbone network part, YOLOv5 adds a Focus structure for extracting general features and constructs two kinds of CSP structures accordingly, the Focus structure performing a slicing operation on the picture; in the feature fusion part between the backbone network and head detection, YOLOv5 adds a combined FPN + PAN structure; meanwhile, a corresponding splitting program is written to cut the pictures to the required size, labeled target boxes are attached to the target trees in the cut pictures, and the target objects are checked for missed labels so that the labeling is complete.
4. The method as claimed in claim 2, wherein in step 2 the efficient channel attention (ECA) module generates channel attention through a fast one-dimensional convolution of size k, where the kernel size is determined adaptively by a function of the channel dimension; the feature image χ is input with its dimension kept unchanged, and after global average pooling of all channels the ECA module learns features with a weight-sharing one-dimensional convolution, each channel interacting with its k neighbors to capture cross-channel interaction when learning features; k denotes the kernel size of the fast one-dimensional convolution, and the adaptive value of k is obtained from the proportional relation between the coverage of cross-channel information interaction and the channel dimension C, calculated as in formula (1):
$k = \psi(C) = \left| \frac{\log_2(C)}{\gamma} + \frac{b}{\gamma} \right|_{\mathrm{odd}} \quad (1)$

in the formula: $\gamma = 2$, $b = 1$, $|\cdot|_{\mathrm{odd}}$ denotes the nearest odd number, and C is the channel dimension.
5. The method as claimed in claim 2, wherein in step 3, for newly planted mangrove seedlings whose targets are small and whose pixel counts are very low, max pooling and average pooling easily lose information useful for detection when performing feature mapping, so the SoftPool improved pooling operation is introduced into the SPP module to retain more detailed feature information, i.e., pooling is realized by applying softmax over the pooling region of the activation feature map; from left to right, the SoftPool downsampling process consists of: activating the feature map, calculating the final SoftPool pooling result within the region, and outputting the result,
SoftPool uses softmax to calculate the feature-value weights of the region R from the nonlinear feature values:
$w_i = \frac{e^{a_i}}{\sum_{j \in R} e^{a_j}}$

in the formula: $w_i$ is the weight of the i-th activation, $a_i$ is the i-th activation value, R is the pooling region, and e is the mathematical constant that is the base of the natural logarithm,
the weight $w_i$ ensures the transmission of important features, and the feature values in the region R have at least a preset minimum gradient during back-propagation; after obtaining the weight $w_i$, the output is obtained by weighting the feature values in the region R:
$\tilde{a} = \sum_{i \in R} w_i \cdot a_i$

in the formula: R is the pooling region and $\tilde{a}$ is the output value of SoftPool, namely the weighted sum of all the activation factors of the pooling kernel.
6. The method for detecting mangrove forest single tree targets based on improved YOLOv5 as claimed in claim 2, wherein in step 4, before feature extraction with the YOLOv5 feature extraction network, mosaic enhancement is applied to the data, the data are then uniformly scaled to a standard size for the Focus slicing operation, and then input into the YOLOv5 feature extraction network for feature extraction.
7. The method as claimed in claim 2, wherein in step 6 the average precision AP (Average Precision) and the mean average precision mAP are used to evaluate the detection effect and performance of the model, AP being the area under the precision (Precision)-recall (Recall) curve,
area intersection over union IoU: the position prediction capability of the model is measured by calculating the intersection over union of the rectangular region of the model-predicted target and the rectangular region labeled for the target in the validation set,
precision (Precision), which represents the proportion of correctly detected targets among all targets detected by the model and reflects the accuracy of the model's target detection,
$P = \frac{TP}{TP + FP}$
recall (Recall), which represents the proportion of targets correctly detected by the model among the total number of targets and reflects the recall capability of model recognition,
$R = \frac{TP}{TP + FN}$

$AP = \int_0^1 P(R)\,\mathrm{d}R$

in the formula: TP (true positive) is the number of correctly detected positive samples, namely the class of the prediction box is the same as that of the labeled box and IoU is greater than 0.5; FP (false positive) is the number of incorrectly detected positive samples; FN (false negative) is the number of missed targets; R is the recall; AP is the area under the precision-recall curve.
CN202111009370.8A 2021-08-31 2021-08-31 Mangrove single wood target detection method based on improved YOLOv5 Active CN113705478B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111009370.8A CN113705478B (en) 2021-08-31 2021-08-31 Mangrove single wood target detection method based on improved YOLOv5

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111009370.8A CN113705478B (en) 2021-08-31 2021-08-31 Mangrove single wood target detection method based on improved YOLOv5

Publications (2)

Publication Number Publication Date
CN113705478A true CN113705478A (en) 2021-11-26
CN113705478B CN113705478B (en) 2024-02-27

Family

ID=78657578

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111009370.8A Active CN113705478B (en) 2021-08-31 2021-08-31 Mangrove single wood target detection method based on improved YOLOv5

Country Status (1)

Country Link
CN (1) CN113705478B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114092820A (en) * 2022-01-20 2022-02-25 城云科技(中国)有限公司 Target detection method and moving target tracking method applying same
CN114255350A (en) * 2021-12-23 2022-03-29 四川大学 Method and system for measuring thickness of soft and hard tissues of palate part
CN114372968A (en) * 2021-12-31 2022-04-19 江南大学 Defect detection method combining attention mechanism and adaptive memory fusion network
CN114462555A (en) * 2022-04-13 2022-05-10 国网江西省电力有限公司电力科学研究院 Multi-scale feature fusion power distribution network equipment identification method based on raspberry pi
CN114627447A (en) * 2022-03-10 2022-06-14 山东大学 Road vehicle tracking method and system based on attention mechanism and multi-target tracking
US20220207868A1 (en) * 2020-12-29 2022-06-30 Tsinghua University All-weather target detection method based on vision and millimeter wave fusion
CN115082695A (en) * 2022-05-31 2022-09-20 中国科学院沈阳自动化研究所 Transformer substation insulator string modeling and detecting method based on improved Yolov5
CN115226650A (en) * 2022-06-02 2022-10-25 南京农业大学 Sow oestrus state automatic detection system based on interactive features
CN115272828A (en) * 2022-08-11 2022-11-01 河南省农业科学院农业经济与信息研究所 Intensive target detection model training method based on attention mechanism
CN115270943A (en) * 2022-07-18 2022-11-01 青软创新科技集团股份有限公司 Knowledge tag extraction model based on attention mechanism
CN115937991A (en) * 2023-03-03 2023-04-07 深圳华付技术股份有限公司 Human body tumbling identification method and device, computer equipment and storage medium
CN117011555A (en) * 2023-10-07 2023-11-07 广东海洋大学 Mangrove forest ecological detection method based on remote sensing image recognition
CN117274824A (en) * 2023-11-21 2023-12-22 岭南设计集团有限公司 Mangrove growth state detection method and system based on artificial intelligence

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807488A (en) * 2019-11-01 2020-02-18 北京芯盾时代科技有限公司 Anomaly detection method and device based on user peer-to-peer group
CN111898688A (en) * 2020-08-04 2020-11-06 沈阳建筑大学 Airborne LiDAR data tree species classification method based on three-dimensional deep learning
CN112307903A (en) * 2020-09-29 2021-02-02 江西裕丰智能农业科技有限公司 Rapid single-tree extraction, positioning and counting method in fruit forest statistics
CN112733749A (en) * 2021-01-14 2021-04-30 青岛科技大学 Real-time pedestrian detection method integrating attention mechanism
CN112861837A (en) * 2020-12-30 2021-05-28 北京大学深圳研究生院 Unmanned aerial vehicle-based mangrove forest ecological information intelligent extraction method
WO2021139069A1 (en) * 2020-01-09 2021-07-15 南京信息工程大学 General target detection method for adaptive attention guidance mechanism

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807488A (en) * 2019-11-01 2020-02-18 北京芯盾时代科技有限公司 Anomaly detection method and device based on user peer-to-peer group
WO2021139069A1 (en) * 2020-01-09 2021-07-15 南京信息工程大学 General target detection method for adaptive attention guidance mechanism
CN111898688A (en) * 2020-08-04 2020-11-06 沈阳建筑大学 Airborne LiDAR data tree species classification method based on three-dimensional deep learning
CN112307903A (en) * 2020-09-29 2021-02-02 江西裕丰智能农业科技有限公司 Rapid single-tree extraction, positioning and counting method in fruit forest statistics
CN112861837A (en) * 2020-12-30 2021-05-28 北京大学深圳研究生院 Unmanned aerial vehicle-based mangrove forest ecological information intelligent extraction method
CN112733749A (en) * 2021-01-14 2021-04-30 青岛科技大学 Real-time pedestrian detection method integrating attention mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
岳慧慧 (Yue Huihui); 白瑞林 (Bai Ruilin): "Research on wood knot defect detection method based on improved YOLOv3" (基于改进YOLOv3的木结缺陷检测方法研究), 自动化仪表, no. 03 *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11380089B1 (en) * 2020-12-29 2022-07-05 Tsinghua University All-weather target detection method based on vision and millimeter wave fusion
US20220207868A1 (en) * 2020-12-29 2022-06-30 Tsinghua University All-weather target detection method based on vision and millimeter wave fusion
CN114255350A (en) * 2021-12-23 2022-03-29 四川大学 Method and system for measuring thickness of soft and hard tissues of palate part
CN114255350B (en) * 2021-12-23 2023-08-04 四川大学 Method and system for measuring thickness of soft and hard tissues of palate
CN114372968A (en) * 2021-12-31 2022-04-19 江南大学 Defect detection method combining attention mechanism and adaptive memory fusion network
CN114372968B (en) * 2021-12-31 2022-12-27 江南大学 Defect detection method combining attention mechanism and adaptive memory fusion network
WO2023138300A1 (en) * 2022-01-20 2023-07-27 城云科技(中国)有限公司 Target detection method, and moving-target tracking method using same
CN114092820A (en) * 2022-01-20 2022-02-25 城云科技(中国)有限公司 Target detection method and moving target tracking method applying same
CN114627447A (en) * 2022-03-10 2022-06-14 山东大学 Road vehicle tracking method and system based on attention mechanism and multi-target tracking
CN114462555A (en) * 2022-04-13 2022-05-10 国网江西省电力有限公司电力科学研究院 Multi-scale feature fusion power distribution network equipment identification method based on raspberry pi
US11631238B1 (en) 2022-04-13 2023-04-18 Iangxi Electric Power Research Institute Of State Grid Method for recognizing distribution network equipment based on raspberry pi multi-scale feature fusion
CN115082695A (en) * 2022-05-31 2022-09-20 中国科学院沈阳自动化研究所 Transformer substation insulator string modeling and detecting method based on improved Yolov5
CN115226650A (en) * 2022-06-02 2022-10-25 南京农业大学 Sow oestrus state automatic detection system based on interactive features
CN115226650B (en) * 2022-06-02 2023-08-08 南京农业大学 Sow oestrus state automatic detection system based on interaction characteristics
CN115270943A (en) * 2022-07-18 2022-11-01 青软创新科技集团股份有限公司 Knowledge tag extraction model based on attention mechanism
CN115270943B (en) * 2022-07-18 2023-06-30 青软创新科技集团股份有限公司 Knowledge tag extraction model based on attention mechanism
CN115272828A (en) * 2022-08-11 2022-11-01 河南省农业科学院农业经济与信息研究所 Intensive target detection model training method based on attention mechanism
CN115937991A (en) * 2023-03-03 2023-04-07 深圳华付技术股份有限公司 Human body tumbling identification method and device, computer equipment and storage medium
CN117011555A (en) * 2023-10-07 2023-11-07 广东海洋大学 Mangrove forest ecological detection method based on remote sensing image recognition
CN117011555B (en) * 2023-10-07 2023-12-01 广东海洋大学 Mangrove forest ecological detection method based on remote sensing image recognition
CN117274824A (en) * 2023-11-21 2023-12-22 岭南设计集团有限公司 Mangrove growth state detection method and system based on artificial intelligence
CN117274824B (en) * 2023-11-21 2024-02-27 岭南设计集团有限公司 Mangrove growth state detection method and system based on artificial intelligence

Also Published As

Publication number Publication date
CN113705478B (en) 2024-02-27

Similar Documents

Publication Publication Date Title
CN113705478A (en) Improved YOLOv 5-based mangrove forest single tree target detection method
CN111738124B (en) Remote sensing image cloud detection method based on Gabor transformation and attention
CN110378909B (en) Single wood segmentation method for laser point cloud based on Faster R-CNN
Klodt et al. Field phenotyping of grapevine growth using dense stereo reconstruction
CN109325395A (en) The recognition methods of image, convolutional neural networks model training method and device
CN110765865A (en) Underwater target detection method based on improved YOLO algorithm
Li et al. A comparison of deep learning methods for airborne lidar point clouds classification
CN114387520A (en) Precision detection method and system for intensive plums picked by robot
CN114049325A (en) Construction method and application of lightweight face mask wearing detection model
CN115359366A (en) Remote sensing image target detection method based on parameter optimization
CN113344045A (en) Method for improving SAR ship classification precision by combining HOG characteristics
CN115410081A (en) Multi-scale aggregated cloud and cloud shadow identification method, system, equipment and storage medium
CN112084860A (en) Target object detection method and device and thermal power plant detection method and device
CN115099297A (en) Soybean plant phenotype data statistical method based on improved YOLO v5 model
Garcia-D'Urso et al. Efficient instance segmentation using deep learning for species identification in fish markets
CN115761463A (en) Shallow sea water depth inversion method, system, equipment and medium
CN115527234A (en) Infrared image cage dead chicken identification method based on improved YOLOv5 model
Pamungkas et al. Segmentation of Enhalus acoroides seagrass from underwater images using the Mask R-CNN method
CN116311086B (en) Plant monitoring method, training method, device and equipment for plant monitoring model
CN117456287B (en) Method for observing population number of wild animals by using remote sensing image
CN115205853B (en) Image-based citrus fruit detection and identification method and system
CN116503737B (en) Ship detection method and device based on space optical image
CN115861824B (en) Remote sensing image recognition method based on improved transducer
Linlong et al. Optimized Detection Method for Siberian Crane (Grus Leucogeranus) Based on Yolov5
CN116883992A (en) Strawberry fruit detection method based on improved YOLOv5x

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant