CN116721060A - Method for detecting red palm weevil beetles based on attention mechanism - Google Patents
- Publication number
- CN116721060A CN116721060A CN202310555293.9A CN202310555293A CN116721060A CN 116721060 A CN116721060 A CN 116721060A CN 202310555293 A CN202310555293 A CN 202310555293A CN 116721060 A CN116721060 A CN 116721060A
- Authority
- CN
- China
- Prior art keywords
- network
- target detection
- red palm
- training
- detection network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention provides a method for detecting the red palm weevil based on an attention mechanism, comprising the following steps: constructing a target detection data set (a red palm weevil data set) and applying data enhancement to expand it; constructing a target detection network and training it on the data set, so that the network can extract and learn the characteristics of the red palm weevil and output detections of individual red palm weevils; using the trained target detection network to detect and identify red palm weevils in different scenes; improving the target detection network by adding an attention mechanism to its backbone network to obtain an improved detection network for the red palm weevil; and retraining on the red palm weevil data set and comparing metrics such as accuracy and precision of red palm weevil identification under the original and improved target detection networks. The method significantly improves the accuracy of red palm weevil target detection.
Description
Technical Field
The invention relates to the technical field of target detection in machine learning, and in particular to a method for detecting red palm weevils based on an attention mechanism.
Background
The red palm weevil is a destructive pest of coconut trees and also damages other palm plants such as oil palm, betel nut palm, and date palm. It is distributed in Indonesia, Malaysia, the Philippines, Thailand, Myanmar, and China, where it has been found causing damage in Hainan, Guangdong, Taiwan, Yunnan, and other provinces. Adults generally cause little direct harm; it is the larvae, feeding on the stem, crown, and leaves, that do the damage. Adults lay eggs in scars, cracks, and fissures near the crown, and the hatched larvae bore into the tender tissue of the upper stem; once a tree is infested, the damage is severe. Early infestation is hard to perceive, while later infestation is readily visible. At first the newly emerging leaves are incomplete, and by placing an ear or a stethoscope against the stem of a damaged tree the rustling feeding sound of the larvae inside can be heard; in the later stage the central leaves wither, and fibrous debris or a brown viscous liquid is discharged from the bored holes. Severely damaged plants wilt and die with only a few old leaves remaining and are difficult to rescue; some trunks are even eaten hollow, leaving only an empty shell. An infestation can kill whole stands of coconut or other palm plants and tends to keep spreading: once it reaches a plantation, if no control measures are taken it will continue to expand until the entire plantation is destroyed.
Control of this pest should therefore rely on integrated measures centered on prevention. The first line of defense is manual inspection: observing whether the plant crown shows abnormal changes and listening to the stem with an ordinary stethoscope; once fibrous debris is found at the heart leaves of the crown or feeding sounds are heard in the trunk, control is applied immediately. After manual observation and auscultation, the measures adopted include strictly preventing plant wounds, removing or reducing insect sources in the plantation, chemical control, and trapping. However, such manual prevention and control is costly and inefficient at the early stage, so early detection of red palm weevil infestation has become a major research hotspot and difficulty in protecting palm plants efficiently.
In some cities in southern China, coconut and other palm plants are favored greening tree species, and projects such as palm planting and million-mu coconut forests are being vigorously developed in regions such as Yunnan. To protect palm plants from insect pests effectively, the shortcomings of traditional red palm weevil prevention and control techniques must be overcome, and an accurate and efficient red palm weevil detection technique should be developed by combining emerging artificial intelligence technology, so as to achieve overall pest prevention and control.
Disclosure of Invention
The embodiment of the invention provides a method for detecting red palm weevils based on an attention mechanism, which can solve the problems of high cost and low efficiency of manual prevention and control of the red palm weevil in the prior art. The technical scheme is as follows:
the embodiment of the invention provides a method for detecting red palm weevil beetles based on an attention mechanism, which comprises the following steps:
collecting images of the red palm weevil, performing processing such as enhancement and annotation on the images, and producing a data set;
constructing a target detection network, and training the constructed target detection network by utilizing the data set; the constructed target detection network adopts a YOLOv5 deep learning framework;
performing an algorithmic improvement on the original YOLOv5 network by adding an attention mechanism to its backbone network and retraining on the data set; the attention mechanism is a Squeeze-and-Excitation Network (SENet), one of the most common attention mechanisms, belonging to channel attention; it strengthens the features of the input image so that the model pays more attention to the informative, typical features and ignores secondary features carrying little information; it mainly comprises squeeze and excitation operations, in which the features of the input image are first compressed, feature learning is then performed on the compressed feature map to obtain learned weights, and finally the learned weights are multiplied with the original feature map to obtain the final features;
further, the dataset comprises: a training set and a testing set;
the acquiring the image of the red palm weevil and the making of the data set comprises the following steps:
at different times of day, an imaging device is used to collect images of red palm weevils in different states in the field environment; the red palm weevils in the images are labeled with an annotation tool and recorded in annotation files; a portion of the images and their corresponding annotation files are randomly selected as the training set, and the remaining images and annotation files serve as the test set.
Further, the constructing the target detection network, and the training the constructed target detection network by using the data set includes:
selecting the YOLOv5 target detection algorithm and constructing a target detection network comprising an input end, a backbone network, a neck network, and a detection head (output end), where the structure of the feature network is based on the selected YOLOv5 network;
and inputting the data set, training the detection network using deep learning techniques, and, combining the feature network with the loss function of the detection network, performing fine-tuning training on the whole target detection network with the data set.
Further, adding the attention mechanism and using the improved detection network comprises:
adding a Squeeze-and-Excitation Network (SENet), i.e., the SE attention mechanism, to the YOLOv5 target detection model; after the backbone network is improved by adding the attention mechanism, the improved target detection model is obtained;
using generalized intersection over union loss (GIoU Loss) as the loss function for bounding-box regression, with a non-maximum suppression (NMS) operation required to complete the output detection;
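The NMS screening step mentioned above can be sketched in a few lines. This is an illustrative greedy version for corner-format boxes, not the batched implementation a real YOLOv5 deployment would use; the function name, box format, and threshold are choices made here for illustration:

```python
def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box, drop
    boxes overlapping it above iou_thresh, and repeat on the remainder.
    Boxes are (x1, y1, x2, y2); returns the indices of the kept boxes."""
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        return inter / (area(a) + area(b) - inter)

    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep
```

For example, two heavily overlapping detections of one weevil collapse to the higher-scoring box, while a detection elsewhere in the image survives.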
and after the loss function value of the feature network is calculated, back-propagation is performed; a test on the test set is run after each training round, and if the loss function value on the test set increases, training is finished early; otherwise training continues until the current iteration count reaches the preset number of training rounds.
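The early-finish rule just described — test after every round, stop as soon as the test-set loss rises — can be sketched as follows. `train_one_epoch` and `eval_test_loss` are hypothetical stand-ins for the real YOLOv5 training and evaluation routines, not functions named in the patent:

```python
def train_with_early_stopping(train_one_epoch, eval_test_loss, max_epochs):
    """Run up to max_epochs; after each epoch evaluate the test-set loss and
    finish early the first time it increases. Returns the epoch index at
    which training stopped (== max_epochs if it ran to completion)."""
    best_loss = float("inf")
    for epoch in range(max_epochs):
        train_one_epoch(epoch)        # forward pass, loss, back-propagation
        loss = eval_test_loss(epoch)  # evaluate on the test set each round
        if loss > best_loss:          # test loss increased -> stop early
            return epoch
        best_loss = loss
    return max_epochs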
Further, the improved model is retrained and tested to obtain indexes such as detection accuracy and recall rate.
further, the front and rear model detection performance was evaluated:
evaluation indexes of the detection Precision include a cross ratio (IntersectionofUnion, ioU), an accuracy (P), a Recall (R), and an average accuracy mean value (meanAveragePrecision, mAP), which are expressed as:
where TP represents the number of positive samples detected as positive samples; FP represents the number of negative samples detected as positive samples; FN represents the number of positive samples detected as negative samples; n is the number of detection sample categories. Ba represents the region of the predicted frame, bb represents the region of the real frame; the overlap ratio IoU represents the overlapping degree of the real frame and the predicted frame, and the higher the value is, the higher the overlapping degree is, and the more accurate the predicted result is; when the IoU threshold value is 0.5, the mAP at the moment is marked as mAP@.5; and (5) starting the IoU threshold value from 0.5 to 0.95, adding 0.05 each time, taking 10 threshold values, respectively calculating corresponding average precision mean values, and then calculating the average value, wherein the average value is recorded as mAP@5:95.
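A minimal sketch of the IoU, precision, and recall definitions above; the `(x1, y1, x2, y2)` corner box format is an assumption made for illustration:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0

def precision_recall(tp, fp, fn):
    """P = TP/(TP+FP), R = TP/(TP+FN)."""
    return tp / (tp + fp), tp / (tp + fn)
```

For instance, two unit-overlap 2×2 boxes give IoU = 1/7, and 8 true positives with 2 false positives and 2 false negatives give P = R = 0.8.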
Further, the identification accuracy of the original target detection network and the improved network is compared, and a result is obtained.
Further, the method further comprises:
the red palm weevil data set is trained in the improved network model, the indexes are comprehensively evaluated, and metrics such as accuracy and precision of red palm weevil identification under the original and improved target detection networks are compared.
The technical scheme provided by the embodiment of the invention has the beneficial effects that at least:
the invention carries out feature extraction and training on the image of the red palm beetle through the target detection network, improves the target detection network, continues training after adding the attention mechanism, tests the model after training, and compares the results. Compared with the prior art, the method has the advantages that the detection result of the red palm weevil is more accurate, the detection is more convenient and faster, and the method has better robustness for the detection in a field shielding scene, so that the problems of high cost and low efficiency of manually preventing and controlling the red palm weevil in the prior art are solved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for detecting red palm weevils based on an attention mechanism according to an embodiment of the present invention;
Fig. 2 is a detailed flow chart of the method for detecting red palm weevils based on an attention mechanism according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the YOLOv5 target detection network according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the SE attention mechanism according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of a spectral trap model to be built according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Examples:
As shown in Figs. 1-5, an embodiment of the present invention provides a method for detecting red palm weevils in images based on an attention mechanism, the method comprising:
s101, acquiring an image of red palm weevil and making a data set;
s102, constructing a target detection network, and training the constructed target detection network by utilizing the data set; the target detection network can extract and learn the characteristics of the red palm beetles and finish the output detection of the red palm beetle individuals;
s103, detecting and identifying the red palm weevil beetles in different scenes by using a trained target detection network; improving the target detection network, and adding an attention mechanism into a backbone network of the target detection network to obtain an improved target detection network for the red palm beetles; retraining and detecting the data set of the red palm weevil, and comparing the accuracy and precision of the identification of the red palm weevil under the original target detection network and the improved target detection network.
According to the method for detecting red palm weevils based on an attention mechanism disclosed by the embodiment of the invention, features of red palm weevil images are extracted and trained through the target detection network; the network is improved by adding an attention mechanism, training is continued, and the trained models are tested and their results compared. Compared with the prior art, the detection result for the red palm weevil is more accurate, detection is more convenient and faster, and the method is more robust for detection in occluded field scenes, thereby solving the problems of high cost and low efficiency of manual prevention and control of the red palm weevil in the prior art.
In this embodiment, the target detection network is a YOLOv5 target detection network in a one-stage target detection network.
In a specific embodiment of the foregoing method for detecting beetles in red palm based on an attention mechanism, further, the data set includes: a training set and a testing set;
the acquiring the image of the red palm weevil and the making of the data set comprises the following steps:
at different times of day, an imaging device is used to collect images of red palm weevils in different states in the field environment; the red palm weevils in the images are labeled with an annotation tool and recorded in annotation files; a portion of the images and their corresponding annotation files are randomly selected as the training set, and the remaining images and annotation files serve as the test set.
In this embodiment, for example, a camera may be used to capture images of the red palm weevil in a field environment; the annotation tool labelimg is then used to label the red palm weevils in the images, and the label data is saved as XML files (i.e., annotations); 90% of the images and their corresponding annotation files are randomly selected as the training set, and the remaining images and annotation files serve as the test set.
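The 90%/10% random split described here can be sketched as below; `split_dataset` and its parameters are hypothetical illustration names, not part of the patent's tooling:

```python
import random

def split_dataset(samples, train_frac=0.9, seed=0):
    """Randomly select train_frac of the (image, annotation-file) pairs as
    the training set; the remainder form the test set, mirroring the
    90%/10% split described in the text."""
    rng = random.Random(seed)          # fixed seed for a reproducible split
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```

Given 100 image/XML pairs this yields a 90-pair training set and a 10-pair test set, with every pair in exactly one of the two.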
In the specific embodiment of the foregoing method for detecting red palm weevil based on the attention mechanism, further, as shown in fig. 2, 3 and 4, the constructing the target detection network, training the constructed target detection network using the data set includes:
selecting the YOLOv5 target detection algorithm and constructing a target detection network comprising an input end, a Backbone network, a Neck network, and a Head (output end), where the structure of the feature network is based on the selected YOLOv5 network;
and inputting the data set, training the detection network by using a deep learning technology, combining the feature network and the loss function of the detection network, and performing fine tuning training on the whole target detection network by using the data set.
In this embodiment, the feature network is constructed first; its input is the original image and its output is the feature vectors of the image. The detection network is then constructed on the basis of the feature network; its inputs are the two feature vectors from before and after the pooling layer of the feature network, and its output is the target detection result, namely the individual red palm weevil targets.
In this example, the specific detection flow is shown in Fig. 2. First, the data set is produced and the trained target detection network is loaded into the target detection system. The collected red palm weevil data are then fed into the YOLOv5 feature extraction network, where feature extraction, up-sampling, and feature fusion are performed on the input data; prediction information is generated through regression analysis and related operations and passed to the output end for output.
In this embodiment, the selected target detection network is the one-stage YOLOv5 target detection network.
In this embodiment, to better understand YOLOv5, target detection algorithms are described first. Common target detection algorithms can be divided into two categories. The first comprises the two-stage algorithms represented by R-CNN, such as R-CNN, Fast R-CNN, and Faster R-CNN, which first generate candidate boxes and then run a convolutional network to identify the detected objects; these algorithms have high detection precision but are slower. The other comprises the one-stage algorithms represented by SSD and the YOLO series, which directly generate the class probabilities and position information of objects, so the final detection result is obtained in a single pass; compared with two-stage algorithms they are faster but lose some precision. Among one-stage detectors, the SSD network runs slower than YOLO, its detection precision is slightly lower than that of Faster R-CNN, and its stability is poor. In the YOLO series, YOLOv5 is one of the most advanced algorithms at present; it offers high average detection speed, high flexibility, and fast deployment, is widely applied, and is suitable for fast real-time detection of the red palm weevil in a complex natural environment.
In this embodiment, YOLOv5 is a one-stage target detection algorithm characterized by a flexible, scalable model size, low deployment cost, and fast training and inference. Building on YOLOv4, it adds new improvements that greatly raise both speed and precision. The YOLOv5 network model mainly comprises an input end, a Backbone network, a Neck network, and a Head output end; a schematic of the network is shown in Fig. 3. Input end: this part covers the image preprocessing stage, in which the input image is scaled to the network's input size and operations such as normalization are performed; the methods used include Mosaic data enhancement, adaptive anchor-box calculation, and adaptive image scaling. Mosaic data enhancement stitches four pictures together to enrich the picture backgrounds; adaptive image scaling adaptively adds black borders to the original picture and uniformly scales it to a standard size before feeding it to the detection network; adaptive anchor-box calculation adaptively computes the best anchor-box values on the training set at each training run. YOLOv5 adopts CSPDarknet53 as its backbone network, whose main function is to extract the feature information in the picture for use by the subsequent network stages.
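The adaptive image scaling (letterbox) step just described — uniform scaling plus symmetric black borders to reach a standard size — can be sketched as pure geometry. The 640×640 target size is an assumption for illustration; the patent does not state the input resolution:

```python
def letterbox_geometry(w, h, target=640):
    """Compute the resized dimensions and symmetric border padding used by
    the adaptive image-scaling step (geometry only; no pixel operations).
    Returns (new_w, new_h, pad_left, pad_top)."""
    scale = min(target / w, target / h)   # uniform scale preserving aspect
    new_w, new_h = round(w * scale), round(h * scale)
    pad_w, pad_h = target - new_w, target - new_h
    return new_w, new_h, pad_w // 2, pad_h // 2
```

A 1280×720 field photo, for example, is scaled to 640×360 and padded with 140-pixel black bars above and below.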
The Neck network consists of an SPPF module, a feature pyramid network, and a path aggregation network; located between the backbone network and the output end, it further improves the diversity and robustness of the features. This structure mainly addresses the multi-scale detection problem in target detection. The feature pyramid network performs feature transfer and fusion in a top-down manner with lateral connections to high-level feature information, improving the feature information capacity of the final network layers; however, it only enhances the transfer of semantic information and is weaker at transferring shallow localization information. The path aggregation network therefore adds a bottom-up pyramid after the 3×3 convolutions of the feature pyramid network, strengthening the transfer of localization information. YOLOv5 takes the 76×76 features generated by the feature pyramid as one output, together with the 38×38 and 19×19 features generated by the path aggregation network, and combines them with the anchor boxes to detect targets at different scales. At the output end, generalized intersection over union (GIoU) is adopted as the loss function for box regression; in the post-processing stage of target detection, a non-maximum suppression operation is usually required to filter the many candidate boxes, and the three generated feature maps of different sizes are used to calculate the loss function.
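The GIoU box-regression loss mentioned above can be sketched for a single box pair. This illustrative version assumes corner-format, non-degenerate boxes; GIoU subtracts from IoU the fraction of the smallest enclosing box not covered by the union:

```python
def giou_loss(box_p, box_g):
    """GIoU loss = 1 - GIoU, with GIoU = IoU - |C \\ (A ∪ B)| / |C|, where C
    is the smallest box enclosing both the predicted box A = box_p and the
    ground-truth box B = box_g; boxes are (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_p[0], box_g[0]), max(box_p[1], box_g[1])
    ix2, iy2 = min(box_p[2], box_g[2]), min(box_p[3], box_g[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_p) + area(box_g) - inter
    iou = inter / union
    # smallest enclosing box C
    cx1, cy1 = min(box_p[0], box_g[0]), min(box_p[1], box_g[1])
    cx2, cy2 = max(box_p[2], box_g[2]), max(box_p[3], box_g[3])
    enclose = (cx2 - cx1) * (cy2 - cy1)
    return 1.0 - (iou - (enclose - union) / enclose)
```

Identical boxes give a loss of 0; unlike plain IoU loss, the enclosing-box term still provides a gradient signal when boxes barely overlap or are disjoint.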
In this embodiment, the added attention mechanism is shown in Fig. 4. It is a channel-type attention mechanism, i.e., attention applied along the channel dimension, and its main operations are squeeze and excitation.
On the basis of the original learning mechanism, a new network path is opened up: the degree of attention for each channel of the feature map is obtained through computation, and an attention weight is assigned to each feature channel accordingly, so that the convolutional network focuses more on the feature-map channels useful for the current task and suppresses those of little use. The mechanism mainly comprises two operations, squeeze and excitation: the SE module first performs a squeeze operation on the feature map obtained by convolution to get channel-level global features, and then performs an excitation operation on the global features to obtain the weights of the different channels and the relationships between them.
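The squeeze-excite-scale sequence just described can be sketched with NumPy for a single (C, H, W) feature map. The weight matrices `w1` and `w2` stand in for the two small fully connected layers of the SE module; their shapes and the reduction ratio are illustrative assumptions, and a real implementation would learn them during training:

```python
import numpy as np

def se_block(feature_map, w1, w2):
    """Squeeze-and-Excitation over a (C, H, W) feature map.
    Squeeze: global average pooling per channel -> (C,).
    Excite: bottleneck FC (w1, ReLU) then expansion FC (w2, sigmoid),
    yielding one weight in (0, 1) per channel.
    Scale: reweight each channel of the original feature map."""
    squeezed = feature_map.mean(axis=(1, 2))            # squeeze: (C,)
    hidden = np.maximum(0.0, w1 @ squeezed)             # (C // r,) with ReLU
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # (C,) via sigmoid
    return feature_map * weights[:, None, None]         # channel-wise scale
```

With zero-initialized weights every channel weight is sigmoid(0) = 0.5, so the output is simply the input halved, which makes the scaling step easy to verify.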
The invention and its embodiments have been described above without limitation, and the actual construction is not limited to the embodiments shown in the drawings. In summary, a person of ordinary skill in the art, informed by this disclosure, should be able to devise structural arrangements and embodiments similar to this technical solution without creative effort and without departing from the gist of the present invention.
Claims (7)
1. A method for detecting red palm weevil beetles based on an attention mechanism, characterized by comprising the following steps:
collecting images of red palm weevil and preparing a data set;
constructing a target detection network, and training the constructed target detection network using the data set; the target detection network can extract and learn the features of the red palm weevil and complete the output detection of red palm weevil individuals;
the trained target detection network can be used to detect and identify red palm weevil beetles in different scenes; improving the target detection network by adding an attention mechanism into its backbone network to obtain an improved target detection network for the red palm weevil; retraining on the red palm weevil data set, and comparing the accuracy and precision of red palm weevil identification under the original target detection network and the improved target detection network.
2. The method for detecting red palm weevil beetles based on an attention mechanism according to claim 1, wherein the data set comprises: a training set and a test set;
and wherein collecting images of the red palm weevil and preparing the data set comprises:
at different times of day, using a camera device to collect images of red palm weevil in different states in the field environment; the red palm weevil in each image is marked with a labeling tool and recorded in a label file; a portion of the images and their corresponding label files are randomly selected as the training set, and the remaining images and label files serve as the test set.
3. The method for detecting red palm weevil beetles based on an attention mechanism according to claim 2, wherein constructing the target detection network and training it using the data set comprises:
selecting the YOLOv5 target detection algorithm and constructing a target detection network comprising: an input end, a backbone network, a neck network and a detection head, the structure of the feature network being based on the selected YOLOv5 network;
and inputting the data set, training the detection network using deep learning technology, combining the feature network and the loss function of the detection network, and fine-tuning the whole target detection network with the data set.
4. The method for detecting red palm weevil beetles based on an attention mechanism according to claim 3, wherein adding the attention mechanism and using the improved detection network comprises:
adding a squeeze-and-excitation network, namely the SE attention mechanism, to the YOLOv5 target detection model; after the backbone network is modified by adding the attention mechanism, the improved target detection model is obtained;
using the generalized intersection over union (GIoU) loss as the loss function for bounding-box regression, while non-maximum suppression is required to complete the output detection;
and, after calculating the loss-function value of the feature network, performing back-propagation; a test is run on the test set after each training round; if the loss-function value on the test set increases, training ends early, otherwise training continues until the current iteration number is greater than or equal to the preset number of training epochs.
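The early-stopping rule in the claim above can be sketched as follows. The helpers `train_one_epoch` and `eval_loss` are hypothetical stand-ins for the patent's training and test-set evaluation procedures:

```python
# Minimal sketch of the claim's early-stopping rule: stop as soon as the
# test-set loss rises after a training round, otherwise run to max_epochs.
# train_one_epoch and eval_loss are hypothetical callables, not from the patent.
def train_with_early_stop(train_one_epoch, eval_loss, max_epochs):
    best = float("inf")
    for epoch in range(max_epochs):
        train_one_epoch()        # forward pass, loss computation, back-propagation
        loss = eval_loss()       # evaluate on the test set after each round
        if loss > best:          # test loss increased: finish training early
            return epoch + 1     # number of epochs actually run
        best = loss
    return max_epochs            # reached the preset number of training epochs
```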
5. The method for detecting red palm weevil beetles based on an attention mechanism according to claim 4, wherein the evaluation indexes of detection precision are the intersection over union IoU, the precision P, the recall R and the mean average precision mAP, expressed as:
IoU = area(Ba ∩ Bb) / area(Ba ∪ Bb), P = TP / (TP + FP), R = TP / (TP + FN), mAP = (1/N) · Σ APi,
where TP represents the number of positive samples detected as positive samples; FP represents the number of negative samples detected as positive samples; FN represents the number of positive samples detected as negative samples; N is the number of detection-sample categories; Ba represents the region of the predicted box and Bb the region of the ground-truth box; IoU represents the degree of overlap between the ground-truth box and the predicted box, a higher value indicating greater overlap and a more accurate prediction; when the IoU threshold is 0.5, the corresponding mAP is recorded as mAP@0.5; with the IoU threshold starting from 0.5 and increasing by 0.05 up to 0.95, ten thresholds are taken, the corresponding mean average precisions are calculated, and their average is recorded as mAP@0.5:0.95.
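The metric definitions in claim 5 can be expressed directly in code; this is a generic sketch of the standard formulas, not the patent's evaluation script:

```python
# Minimal sketch of the detection metrics defined in claim 5, computed from
# true-positive (TP), false-positive (FP) and false-negative (FN) counts.
def precision(tp, fp):
    """P: fraction of detections that are actually positive samples."""
    return tp / (tp + fp)

def recall(tp, fn):
    """R: fraction of positive samples that are successfully detected."""
    return tp / (tp + fn)

def mean_ap(ap_per_class):
    """mAP: mean of the per-class average precisions over the N categories."""
    return sum(ap_per_class) / len(ap_per_class)
```

For mAP@0.5:0.95, the same `mean_ap` value would be computed at each of the ten IoU thresholds 0.5, 0.55, …, 0.95 and then averaged.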
6. The method for detecting red palm weevil beetles based on an attention mechanism according to claim 4, wherein training the red palm weevil data set on the improved feature detection network comprises:
fixing the weight parameters of the network training model; training before and after the improvement is based on the PyTorch deep-learning framework; the operating system used for model training is Windows 10, the hardware platform is an NVIDIA GeForce GTX 1070 graphics card with 8 GB of memory, and the CUDA 10.2 parallel computing framework is used;
fixing the number of training epochs at 1000 for the experiments; the model training parameters are set as follows: learning rate = 0.01, momentum = 0.937, decay = 0.0005, batch = 8.
7. The method for detecting red palm weevil beetles based on an attention mechanism according to claim 5, wherein the method further comprises:
training and testing the red palm weevil data set on the improved network model, comprehensively evaluating the indexes, and comparing the accuracy, precision and other metrics of red palm weevil identification under the original target detection network and the improved target detection network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310555293.9A CN116721060A (en) | 2023-08-08 | 2023-08-08 | Method for detecting red palm weevil beetles based on attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116721060A true CN116721060A (en) | 2023-09-08 |
Family
ID=87866940
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116721060A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117197148A (en) * | 2023-11-08 | 2023-12-08 | 天津海关动植物与食品检测中心 | Thermal imaging-based wood boring pest detection method, system and medium |
CN117197148B (en) * | 2023-11-08 | 2024-02-02 | 天津海关动植物与食品检测中心 | Thermal imaging-based wood boring pest detection method, system and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||