CN116012655A - Fruit tree Papilionidae insect pest detection method - Google Patents

Fruit tree Papilionidae insect pest detection method

Info

Publication number
CN116012655A
CN116012655A (application CN202310083017.7A)
Authority
CN
China
Prior art keywords
detection
image
lightweight
model
feature fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310083017.7A
Other languages
Chinese (zh)
Inventor
许丽佳
石小仕
唐座亮
伍志军
王玉超
赵永鹏
杨宇平
邹志勇
黄鹏
康志亮
刘碧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Agricultural University
Original Assignee
Sichuan Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Agricultural University
Priority to CN202310083017.7A
Publication of CN116012655A
Legal status: Pending

Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 - Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Catching Or Destruction (AREA)

Abstract

The invention relates to the field of pest detection and discloses a method for detecting Papilionidae insect pests of fruit trees, comprising the following steps: collecting an image of a fruit tree to be detected; constructing an adaptive feature fusion lightweight detection model based on YOLOX-x; and inputting the fruit tree image to be detected into the trained adaptive feature fusion lightweight detection model for detection to obtain a pest detection result. The detection method can accurately and rapidly detect insect pest stress on fruit trees in an unstructured orchard environment; it is well suited to being ported to a mobile-chassis embedded system, can provide an upgrade path for the recognition systems of orchard plant-protection equipment, and can promote high-quality, high-efficiency agricultural development; in addition, the detection method improves the model's detection accuracy for pests occluded by leaves and branches.

Description

Fruit tree Papilionidae insect pest detection method
Technical Field
The invention relates to the field of pest detection, and in particular to a method and a device for detecting Papilionidae insect pests of fruit trees.
Background
Papilionidae insect pests occur seasonally and at high incidence; their larvae feed on the tender leaves and young shoots of fruit trees, harming crown formation and branch fruiting. In pest control, the traditional fruit tree industry still relies on visual inspection and manual intervention, which is time- and labor-intensive and costly in manpower and material resources.
With the development of deep learning, deep-learning-based object recognition methods have been widely applied to pest recognition and detection; they can effectively improve the efficiency of identifying fruit tree pest and disease types and reduce the cost of identification.
For example, the YOLO series of object detection algorithms is a typical one-stage approach that uses anchor boxes to combine the classification and localization-regression problems, giving high efficiency, flexibility and good generalization. YOLO uses a single CNN model for end-to-end object detection: its core idea is to take the whole image as the network input and directly regress the position and category of each bounding box at the output layer. The YOLO algorithm is fast, supports end-to-end training and prediction, and is simple and convenient to use.
In practical applications of existing fruit tree pest detection methods, fruit trees grow in unstructured orchards with luxuriant branches and leaves, so a large number of pests are occluded by leaves, branches and fruit, and the background and texture details of the pests differ under different illumination. As a result, the pest recognition models carried by existing orchard plant-protection equipment suffer from low detection accuracy, large memory footprint and slow inference.
Disclosure of Invention
The invention aims to provide a method and a device for detecting Papilionidae insect pests of fruit trees, solving the problems of low detection accuracy, large memory footprint and slow inference in the pest recognition models carried by existing orchard plant-protection equipment.
To achieve this aim, the invention adopts the following technical scheme: a method for detecting Papilionidae insect pests of fruit trees, the method comprising:
collecting an image of a fruit tree to be detected;
constructing an adaptive feature fusion lightweight detection model based on YOLOX-x; and
inputting the fruit tree image to be detected into the trained adaptive feature fusion lightweight detection model for detection, thereby obtaining a pest detection result.
Preferably, the adaptive feature fusion lightweight detection model comprises a backbone network, a neck and a detection head connected in sequence.
Preferably, the backbone network comprises a Focus module and three lightweight feature extraction modules connected in sequence, the output of the Focus module being connected to the input of the first lightweight feature extraction module;
each lightweight feature extraction module comprises two Ghost modules and an attention mechanism unit, the attention mechanism unit being connected in series between the two Ghost modules.
Preferably, the method further comprises: optimizing the pest detection result based on a DIoU loss function.
Preferably, the method further comprises: performing fusion processing on the feature maps output by the neck based on an adaptive feature fusion algorithm.
Preferably, the method further comprises: constructing a preset activation function for the detection head from the Softplus and Tanh functions, the preset activation function being expressed as:
softplus(x) = log(1 + e^x);
TS = tanh(x) · softplus(x);
where TS is the preset activation function of the detection head, softplus(x) is the Softplus function, and tanh(x) is the Tanh function.
Preferably, the method further comprises training the adaptive feature fusion lightweight detection model, which includes:
obtaining pest sample images;
preprocessing the pest sample images; and
inputting the preprocessed pest sample images into the adaptive feature fusion lightweight detection model for training, thereby obtaining the trained adaptive feature fusion lightweight detection model.
Preferably, the preprocessing comprises at least one of: image translation, image blurring, image affine transformation, image rotation, image flipping, image stitching, random change of image contrast, and random change of image brightness.
Preferably, in the training stage of the adaptive feature fusion lightweight detection model, the adaptive feature fusion lightweight detection model is compressed using a pruning strategy.
The invention also provides a device for detecting Papilionidae insect pests of fruit trees, used for implementing the above method for detecting Papilionidae insect pests of fruit trees, the device comprising:
an acquisition module for acquiring the image of the fruit tree to be detected;
a construction module for constructing the adaptive feature fusion lightweight detection model based on YOLOX-x; and
a detection module for inputting the fruit tree image to be detected into the trained adaptive feature fusion lightweight detection model for detection, thereby obtaining the pest detection result.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the above method for detecting Papilionidae insect pests of fruit trees when executing the computer program.
The invention also provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the above method for detecting Papilionidae insect pests of fruit trees when executed by a processor.
The beneficial effects of the invention are as follows:
1. the detection method can accurately and rapidly detect insect pest stress on fruit trees in an unstructured orchard environment;
2. the detection method is well suited to being ported to a mobile-chassis embedded system, can provide an upgrade path for the recognition systems of orchard plant-protection equipment, and can promote high-quality, high-efficiency agricultural development;
3. the detection method improves the model's detection accuracy for pests occluded by leaves and branches.
Drawings
FIG. 1 is a flowchart of a method for detecting Papilionidae insect pests of fruit trees provided by an embodiment of the invention;
FIG. 2 is a schematic diagram of the overall structure of the adaptive feature fusion lightweight detection model according to an alternative embodiment of the invention;
FIG. 3 is a schematic diagram of ASFF according to an alternative embodiment of the invention;
FIG. 4 is a schematic diagram of the DIoU calculation according to an alternative embodiment of the invention;
FIG. 5 is a schematic view of the pruning process according to an alternative embodiment of the invention;
FIG. 6 is a block diagram of an alternative embodiment of the device for detecting Papilionidae insect pests of fruit trees provided by the invention.
Detailed Description
In order to make the technical solution of the present invention better understood by those skilled in the art, the present invention will be further described in detail with reference to the accompanying drawings and specific embodiments.
Example 1
Fig. 1 is a flowchart of a method for detecting Papilionidae insect pests of fruit trees according to an embodiment of the present invention. As shown in fig. 1, the method includes:
Step S101: collecting an image of a fruit tree to be detected. In this embodiment, a wheeled or tracked mobile platform may be used, with an image acquisition device such as a video camera or still camera mounted on it; the camera captures the image of the fruit tree to be detected.
Step S102: constructing an adaptive feature fusion lightweight detection model based on YOLOX-x.
In this embodiment, as shown in fig. 2, the adaptive feature fusion lightweight detection model comprises a Backbone network, a Neck and a detection Head connected in sequence.
As a further optimization of this embodiment, the backbone network comprises a Focus module and three lightweight feature extraction modules GE connected in sequence, the output of the Focus module being connected to the input of the first lightweight feature extraction module GE;
each lightweight feature extraction module GE comprises two Ghost modules and an attention mechanism unit ECA, the ECA unit being connected in series between the two Ghost modules.
In this embodiment, compared with a basic convolution, the Ghost module can extract more features with fewer parameters. For an input feature layer, the Ghost module first generates part of the real feature layer through a conventional convolution operation, then applies a cheap linear transformation to each channel of the real feature layer to obtain the ghost feature layer, and finally concatenates (Concat) the real and ghost feature layers to obtain the complete output feature layer.
Let the input feature map have size h × w × c, where h is its height, w its width and c its number of channels; let the output feature map have size h′ × w′ × n, where h′ is its height, w′ its width and n its number of channels; let the convolution kernel size be k × k; and let the input feature layer be divided into s parts. The computation cost of a standard convolution is:

h′ × w′ × n × k × k × c (1);

and the computation cost of the Ghost module, with a d × d kernel for the cheap linear transformation, is:

h′ × w′ × (n/s) × k × k × c + (s − 1) × h′ × w′ × (n/s) × d × d (2);

From equation (2), the Ghost module effectively splits the single multiplication operation of a standard convolution into two cheaper stages; compared with standard convolution, the model compression ratio is approximately s, greatly reducing the computation of the model.
In this embodiment, the attention mechanism unit ECA implements a local cross-channel interaction strategy without dimensionality reduction and adaptively selects the size of its one-dimensional convolution kernel.
ECA first applies global average pooling (GAP) to the input feature map of size W × H × C to obtain a 1 × 1 × C feature without dimensionality reduction. It then applies a one-dimensional convolution with kernel size k to the 1 × 1 × C feature to extract the relationship among k neighbouring channels, realizing information interaction between channels. The parameter k is determined adaptively from the number of input channels C, as:

k = |log2(C)/γ + b/γ|_odd;

where |x|_odd denotes the odd number nearest to x, C is the total number of input channels, and γ and b are hyperparameters (γ = 2 and b = 1 are the usual ECA settings).
ECA then uses a Sigmoid function to generate the weight of each feature channel and combines the original W × H × C input features with the channel weights, so that important features in the original input are assigned large weights and receive focused attention, while invalid features are assigned small weights and are autonomously suppressed, finally yielding features with channel attention.
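A minimal PyTorch sketch of such an ECA unit follows; the kernel-size rule uses γ = 2 and b = 1, which are the defaults of the original ECA paper and an assumption here:

```python
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention sketch: GAP -> 1D convolution across
    channels -> sigmoid weights -> reweighted input. No dimensionality
    reduction is applied to the channel descriptor."""
    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        k = int(abs(math.log2(channels) / gamma + b / gamma))
        k = k if k % 2 == 1 else k + 1            # nearest odd kernel size
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                          # x: (N, C, H, W)
        y = x.mean(dim=(2, 3))                     # GAP -> (N, C)
        y = self.conv(y.unsqueeze(1)).squeeze(1)   # local cross-channel interaction
        w = torch.sigmoid(y)[:, :, None, None]     # per-channel weights in [0, 1]
        return x * w                               # features with channel attention
```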
In this way, the lightweight feature extraction module GE maintains the high detection performance of the model while reducing its computation and parameter count; it also reduces the influence of leaves and branches occluding the pests, improving pest detection accuracy.
In this embodiment, the Focus module samples the feature map at alternate points through a slicing operation and stacks the slices along the channel dimension. This is equivalent to cutting a high-resolution picture into several low-resolution pictures, i.e. downsampling the feature map, while ensuring that no picture information is lost.
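The slicing operation can be sketched in PyTorch as follows (module and parameter names are illustrative):

```python
import torch
import torch.nn as nn

class Focus(nn.Module):
    """Focus sketch: gather every second pixel at four phase offsets and stack
    the four slices on the channel axis (H and W halve, channels x4), then
    fuse with a convolution; spatial information is preserved, not discarded."""
    def __init__(self, in_ch, out_ch, kernel=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch * 4, out_ch, kernel, padding=kernel // 2)

    def forward(self, x):                          # x: (N, C, H, W)
        sliced = torch.cat([x[..., ::2, ::2],      # even rows, even cols
                            x[..., 1::2, ::2],     # odd rows, even cols
                            x[..., ::2, 1::2],     # even rows, odd cols
                            x[..., 1::2, 1::2]],   # odd rows, odd cols
                           dim=1)
        return self.conv(sliced)                   # (N, out_ch, H/2, W/2)
```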
CBS is the basic convolution block in the YOLOX network and comprises a Conv layer (convolution), a BN layer (Batch Normalization) and a SiLU layer (SiLU activation function). The convolution operation is mainly responsible for feature extraction and is one of the most important operations in the model. BN keeps the output of each layer as consistent as possible with the input data distribution of the next layer, making training more stable; that is, it suppresses internal covariate shift, which reduces the sensitivity of the network model to parameter initial values and effectively improves the convergence speed of the model. The SiLU activation function provides the network with nonlinearity and enables the layered, progressively abstract feature representations of deep models.
The CSP layer structure divides the original input into two branches. A convolution operation halves the number of channels in each branch; one branch then performs the Bottleneck × N operation, after which the two branches are concatenated (Concat) so that the input and output of the BottleneckCSP have the same size, allowing the model to learn more features. In this embodiment the CSP layer structures are the CSP1_X structure and the CSP2_X structure; the invention preferably adopts the CSP2_X structure.
The original fusion scheme of the YOLOX object detection network can only resize the feature maps to the same size and then add them, so features at different scales cannot be fully exploited. In order to fully utilize the high-level semantic information of the Papilionidae larva image together with the low-level contour, edge, colour and shape information, as a further optimization of this embodiment the method further comprises: performing fusion processing on the feature maps output by the neck based on an adaptive feature fusion algorithm.
Specifically, as shown in figs. 2 and 3, the YOLOX neck outputs a level1 feature map, a level2 feature map and a level3 feature map (P1, P2 and P3, output by the three CSP2_1 structures in fig. 2). Taking ASFF-3 in fig. 3 as an example, the fused output is the sum of the level1, level2 and level3 semantic features multiplied by the weights α, β and γ from the different layers:

y_ij^l = α_ij^l · x_ij^(1→l) + β_ij^l · x_ij^(2→l) + γ_ij^l · x_ij^(3→l);

where α_ij^l, β_ij^l and γ_ij^l are the weights from the different layers, x_ij^(n→l) is the output of the feature map of level n resized to level l, and i, j denote the row and column of the image.
Because the fusion is an addition, level1 and level2 must first be compressed to the same number of channels using a 1 × 1 convolution kernel, and then adjusted to the same dimensions as level3 by an upsampling operation. The weights α, β and γ are obtained by applying 1 × 1 convolutions to resize_level1, resize_level2 and level3, concatenating the results by tensor splicing, and passing them through a normalized exponential function (softmax) so that they lie in [0, 1] and sum to 1:

α_ij^l = e^(λα,ij^l) / (e^(λα,ij^l) + e^(λβ,ij^l) + e^(λγ,ij^l));
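A compact PyTorch sketch of this adaptive weighting step is shown below; it assumes the three level features have already been resized to a common shape and channel count, and the module and parameter names are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASFF(nn.Module):
    """Adaptive spatial feature fusion sketch: per-pixel weight logits are
    predicted from the three (already resized) level features with 1x1
    convolutions, softmax-normalized so alpha + beta + gamma = 1, and used
    to blend the levels."""
    def __init__(self, channels):
        super().__init__()
        self.w1 = nn.Conv2d(channels, 1, kernel_size=1)
        self.w2 = nn.Conv2d(channels, 1, kernel_size=1)
        self.w3 = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, l1, l2, l3):                  # each: (N, C, H, W)
        logits = torch.cat([self.w1(l1), self.w2(l2), self.w3(l3)], dim=1)
        w = F.softmax(logits, dim=1)                # weights in [0, 1], sum to 1
        alpha, beta, gamma = w[:, 0:1], w[:, 1:2], w[:, 2:3]
        return alpha * l1 + beta * l2 + gamma * l3  # fused output feature map
```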
as a further optimization of this embodiment, to enhance the performance of the detection model of the pteridae larvae, the method further comprises: the method comprises the steps of constructing a preset activation function of a detection head according to a Softplus function and a Tanh function, wherein the preset activation function of the detection head has smoother characteristics, namely better generalization capability and effective optimization capability, and can further improve the quality of a detection model, and the expression of the preset activation function is as follows:
softplus(x)=log(1+e x );
TS=tanh(x)·softplus(x);
wherein TS is a preset activation function of the detection head, softplus (x) is a Softplus function, and Tanh (x) is a Tanh function.
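As a sketch, this activation maps directly onto a few lines of PyTorch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TS(nn.Module):
    """Preset detection-head activation from the text:
    TS(x) = tanh(x) * softplus(x), with softplus(x) = log(1 + e^x)."""
    def forward(self, x):
        return torch.tanh(x) * F.softplus(x)
```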
Step S103: inputting the fruit tree image to be detected into the trained adaptive feature fusion lightweight detection model for detection, thereby obtaining a pest detection result.
As a further optimization of this embodiment, in the prediction stage, blindly deleting every prediction box above the set threshold would, in the case of densely overlapping insects, suppress prediction boxes belonging to other Papilionidae larvae and cause occluded, overlapping insects to be missed. The method therefore further comprises: optimizing the pest detection result based on the DIoU loss function, i.e. replacing the original IoU loss function with the DIoU loss function; a DIoU schematic diagram is shown in fig. 4.
The expression for the DIoU loss function is:

L_DIoU = 1 − IoU + ρ²(b_F, b_E) / D²;

where b_F is the centre of box F, b_E is the centre of box E, ρ²(b_F, b_E) is the square of the distance between the centres of boxes E and F, and D is the diagonal length of the smallest enclosing region covering boxes E and F;
the expression for IoU is:

IoU = |E ∩ F| / |E ∪ F|;
DIoU further considers the distance between the centre points of the bounding boxes; the invention therefore uses DIoU instead of IoU to screen bounding boxes and improve accuracy. When filtering the remaining redundant bounding boxes, soft DIoU_nms does not directly delete every box above the threshold, but instead reduces its confidence. Soft DIoU_nms is defined as:

S_i = S_i, if DIoU(M, B_i) < N_t;
S_i = S_i · e^(−DIoU(M, B_i)² / σ), if DIoU(M, B_i) ≥ N_t;

where S_i is the confidence score of the current prediction box; M is the most reliable of all prediction boxes; B_i are the remaining compared prediction boxes of the current target; N_t is the set threshold, generally 0.5; and σ is the penalty-term coefficient. As the equation shows, soft DIoU_nms takes the highest-scoring prediction box as the reference box and performs the DIoU calculation against the prediction boxes remaining for the current target: boxes whose DIoU is below the set threshold are retained, while boxes whose DIoU exceeds the threshold are not set to 0 directly but have their confidence scores progressively reduced. Some of these higher-scoring boxes may still serve as correct detection boxes in subsequent calculations. Using soft DIoU_nms therefore effectively improves detection of occluded, overlapping insect bodies.
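The rule above can be sketched as follows; the diou helper and the default σ are illustrative assumptions consistent with the piecewise definition:

```python
import torch

def diou(box, boxes):
    """DIoU between one (x1, y1, x2, y2) box and a batch of boxes:
    IoU minus (squared centre distance / squared enclosing-box diagonal)."""
    x1 = torch.maximum(box[0], boxes[:, 0]); y1 = torch.maximum(box[1], boxes[:, 1])
    x2 = torch.minimum(box[2], boxes[:, 2]); y2 = torch.minimum(box[3], boxes[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area_a + area_b - inter + 1e-9)
    cxa, cya = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    cxb, cyb = (boxes[:, 0] + boxes[:, 2]) / 2, (boxes[:, 1] + boxes[:, 3]) / 2
    rho2 = (cxa - cxb) ** 2 + (cya - cyb) ** 2          # squared centre distance
    ex1 = torch.minimum(box[0], boxes[:, 0]); ey1 = torch.minimum(box[1], boxes[:, 1])
    ex2 = torch.maximum(box[2], boxes[:, 2]); ey2 = torch.maximum(box[3], boxes[:, 3])
    d2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + 1e-9     # squared enclosing diagonal
    return iou - rho2 / d2

def soft_diou_nms(boxes, scores, nt=0.5, sigma=0.5, score_floor=0.001):
    """Keep the most confident box, then decay (rather than delete) the
    confidence of boxes whose DIoU with it exceeds the threshold nt."""
    keep_boxes, keep_scores = [], []
    order = scores.argsort(descending=True)
    boxes, scores = boxes[order], scores[order]
    while boxes.shape[0] > 0:
        keep_boxes.append(boxes[0]); keep_scores.append(scores[0])
        if boxes.shape[0] == 1:
            break
        d = diou(boxes[0], boxes[1:])
        decay = torch.where(d >= nt, torch.exp(-d ** 2 / sigma), torch.ones_like(d))
        boxes, scores = boxes[1:], scores[1:] * decay
        live = scores > score_floor                     # drop near-zero boxes
        order = scores[live].argsort(descending=True)
        boxes, scores = boxes[live][order], scores[live][order]
    if not keep_boxes:
        return boxes, scores
    return torch.stack(keep_boxes), torch.stack(keep_scores)
```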
The detection method can accurately and rapidly detect insect pest stress on fruit trees in an unstructured orchard environment. It is also well suited to being ported to a mobile-chassis embedded system, can provide an upgrade path for the recognition systems of orchard plant-protection equipment, and can promote high-quality, high-efficiency agricultural development. Finally, the detection method improves the model's detection accuracy for pests occluded by leaves and branches.
As a further optimization of this embodiment, the method further comprises training the adaptive feature fusion lightweight detection model, which includes:
step a01: obtaining an insect pest sample image; in the embodiment, the pteridae pests mostly use crops such as phellodendron genus and citrus genus of Rutaceae family as main hosts, so that pest sample image acquisition in the citrus orchard is more representative; the invention is collected in an orchard in a rain city of Atlantic city in Sichuan province of China, the collected data consists of the Paenidae pest images with different time periods, different illumination and different angles, and the image types accord with the actual conditions of pest growth in natural environment, including forward light, backlight, night, shielding and the like. The invention selects 35000 images with clear target outlines and textures.
In unstructured orchards the branches and leaves of fruit trees grow luxuriantly, a large number of pests are occluded by leaves, branches and fruit, and the background and texture details of the pests differ under different illumination. In addition, Papilionidae larvae develop through 5 instars, grouped into instars 1-3 (young) and instars 4-5 (old); the phenotypes of young and old larvae differ greatly. Larvae of instars 1-3 have brown body surfaces and resemble bird droppings, while larvae of instars 4-5 have green, smooth body surfaces and an osmeterium.
Step A02: preprocessing the pest sample images. The preprocessing in this embodiment comprises at least one of: image translation, image blurring, image affine transformation, image rotation, image flipping, image stitching, random change of image contrast, and random change of image brightness. The invention uses image translation, image blurring, image affine transformation, image rotation, image flipping, image stitching and similar processing to expand the original pest sample images 5-fold, to 175,000 images, so as to enrich the dataset, improve the generalization ability of the algorithm model and avoid overfitting. Contrast and brightness are then changed randomly: using the formula below, a random value is added to or subtracted from the pixels of a pest sample image to change its brightness, and the pixels are multiplied by a random value to change its contrast, so that the fused pest sample image recovers part of the colour information and the feature extraction of the network model improves.

x̂_i = ω · x_i + ψ;

where x_i is a pixel of the pest sample image, ω is the contrast change factor, ψ represents the change in brightness, and x̂_i is the corresponding pixel of the fused pest sample image.
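A minimal sketch of this augmentation, with illustrative ranges for ω and ψ:

```python
import numpy as np

def random_contrast_brightness(img, contrast=(0.8, 1.2), brightness=(-30, 30)):
    """Randomly rescale pixels (contrast factor omega) and shift them
    (brightness change psi): x_hat = omega * x + psi. Ranges are assumptions."""
    omega = np.random.uniform(*contrast)
    psi = np.random.uniform(*brightness)
    out = img.astype(np.float32) * omega + psi
    return np.clip(out, 0, 255).astype(np.uint8)   # keep valid 8-bit range
```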
The pest targets in the images were manually annotated with the LabelImg tool; young larvae were labelled 'Young' and old larvae 'Old', and the generated annotation files were stored in the PASCAL VOC dataset format. The dataset was randomly divided into a training set, a test set and a validation set at a ratio of 7:2:1; a sketch of such a split follows. In the test set, samples in which the average occlusion level of the target is below 30%, i.e. light occlusion, are denoted A; the remaining, heavily occluded samples are denoted B.
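The 7:2:1 split can be sketched as follows (function and variable names are illustrative):

```python
import random

def split_dataset(image_paths, ratios=(0.7, 0.2, 0.1), seed=42):
    """Randomly split the annotated images into training, test and
    validation sets at a 7:2:1 ratio."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)
    n_train = int(ratios[0] * len(paths))
    n_test = int(ratios[1] * len(paths))
    return (paths[:n_train],                       # training set
            paths[n_train:n_train + n_test],       # test set
            paths[n_train + n_test:])              # validation set
```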
Step A03: inputting the preprocessed pest sample images into the adaptive feature fusion lightweight detection model for training, thereby obtaining the trained adaptive feature fusion lightweight detection model.
As a further optimization of this embodiment, in the training stage of the adaptive feature fusion lightweight detection model, the model is compressed using a pruning strategy.
Specifically, the batch normalization (BN) layer suppresses internal covariate shift, reducing the sensitivity of the network model to parameter initial values and effectively improving the convergence speed of the model. The BN layer can be expressed as:

z_out = γ · (z_in − μ_U) / sqrt(σ_U² + ε) + β;

where z_in and z_out are the input and output data of the BN layer; U is the current mini-batch; μ_U and σ_U are the mean and standard deviation of the data input in batch U; γ and β are the scale and shift parameters learned during training; and ε is a small constant that prevents the denominator from being 0. From this equation, the activation value z_out of each channel is directly influenced by the size of the parameter γ, so γ is used as a quantitative index of channel importance, also called the scaling factor. In general, the activation values output by the BN layer are approximately normally distributed and mostly do not approach 0, so an L1 regularization constraint is introduced to shrink the value of the channel-importance index γ and facilitate sparsity training. The loss function for sparsity training is:
L = L_baseline + λ · Σ_(γ∈Γ) g(γ);

where L_baseline is the loss function of the base model; λ · Σ_(γ∈Γ) g(γ) is the L1 regularization constraint term, with g(γ) = |γ| and Γ the set of BN scaling factors; and λ is the penalty factor that balances the two loss terms. Sparsity training adjusts the penalty factor according to the weight distribution and average precision of the BN layers and selects a suitable learning rate, yielding a model of high sparsity with only a slight loss of precision. After sparsity training the model is pruned; the pruning process is shown in fig. 5.
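A sketch of this sparsity objective in PyTorch (the penalty factor value is an illustrative assumption):

```python
import torch.nn as nn

def sparsity_loss(model, base_loss, lam=1e-4):
    """Base model loss plus an L1 penalty on all BN scaling factors gamma
    (stored as the 'weight' of BatchNorm2d layers in PyTorch)."""
    l1 = sum(m.weight.abs().sum()
             for m in model.modules()
             if isinstance(m, nn.BatchNorm2d))
    return base_loss + lam * l1
```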
After sparsity training, the scaling factors in the BN layers as a whole approach 0, and channels whose γ approaches 0 are less important. On this basis, the scaling factors of all channels are sorted and a suitable pruning ratio is set. The pruning ratio directly affects the volume and precision of the model: the larger the ratio, the more channels are cut and the smaller the model volume, but model precision also drops. The model is therefore fine-tuned after pruning to restore precision.
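The channel-selection step can be sketched as follows; the global-threshold strategy and default ratio are illustrative assumptions:

```python
import torch
import torch.nn as nn

def select_prune_channels(model, prune_ratio=0.5):
    """Sort all BN scaling factors, derive a global threshold from the pruning
    ratio, and return per-layer boolean masks of channels to keep."""
    gammas = torch.cat([m.weight.detach().abs().flatten()
                        for m in model.modules()
                        if isinstance(m, nn.BatchNorm2d)])
    threshold = gammas.sort().values[int(prune_ratio * (gammas.numel() - 1))]
    return {name: m.weight.detach().abs() > threshold   # True = keep channel
            for name, m in model.named_modules()
            if isinstance(m, nn.BatchNorm2d)}
```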
Fig. 6 is a block diagram of a device for detecting Papilionidae insect pests of fruit trees according to an alternative embodiment of the present invention. As shown in fig. 6, the device is used to implement the above method for detecting Papilionidae insect pests of fruit trees and comprises:
an acquisition module for acquiring the image of the fruit tree to be detected;
a construction module for constructing the adaptive feature fusion lightweight detection model based on YOLOX-x; and
a detection module for inputting the fruit tree image to be detected into the trained adaptive feature fusion lightweight detection model for detection, thereby obtaining the pest detection result.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the above method for detecting Papilionidae insect pests of fruit trees when executing the computer program.
The invention also provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the above method for detecting Papilionidae insect pests of fruit trees when executed by a processor.
The invention can accurately and rapidly detect insect pest stress on fruit trees in an unstructured orchard environment; it is well suited to being ported to a mobile-chassis embedded system, can provide an upgrade path for the recognition systems of orchard plant-protection equipment, and can promote high-quality, high-efficiency agricultural development; and it improves the model's detection accuracy for pests occluded by leaves and branches.
Example 2
In order to objectively measure the detection performance of the adaptive feature fusion lightweight detection model (ASFL-YOLOX) on Papilionidae insect pests, recall, precision, F1-score, AP value, detection speed, total network parameters and model size are introduced as evaluation metrics.
Ablation experiments were performed on the test set; the results are shown in Table 1. Compared with the traditional YOLOX_x, the invention takes GhostNet as the basis of the backbone network, greatly reducing the number of model parameters while improving the mAP value to some extent. The model obtained by the invention has a parameter size of only 10.93 MB, a reduction of 88.97% compared with the traditional YOLOX_x, and the number of floating-point operations is reduced by 111.03 G. Experiments prove that the pruning strategy can greatly reduce the parameters while improving network performance. For the problem of detecting pests against complex backgrounds in unstructured orchards, adding the channel attention mechanism ECA reduces the influence of background noise on detection performance and strengthens the interaction of target feature information between channels. Adding the efficient channel attention mechanism ECA on top of the improved network improves the F1, AP, Recall and Precision metrics and keeps the model weights stable.
TABLE 1
[Table 1 is provided as an image in the original publication.]
In unstructured orchards, branches and leaves occluding the insects, or insects overlapping one another, are unavoidable. Taking the degree of occlusion as the control variable and test sets A, B and A+B as the experimental data, the detection results of the proposed ASFL-YOLOX network model are compared with those of two popular lightweight models, YOLOX-s and YOLOv5-s; the results are shown in Table 2. On both the light-occlusion and the heavy-occlusion test sets, every metric of the ASFL-YOLOX network model is clearly better than those of the YOLOv5 and YOLOX models. It is worth noting that even in an environment of severe occlusion and dense targets, the detection AP of the model reaches 93.14% and its F1 value reaches 0.93.
TABLE 2
[Table 2 is provided as an image in the original publication.]
Under light occlusion, all three models successfully detect the pests appearing in the picture, but ASFL-YOLOX does so with higher confidence. In the heavy-occlusion results the advantage of ASFL-YOLOX is obvious: YOLOX and YOLOv5 miss detections to varying degrees, and the confidence scores they obtain are generally low. In summary, ASFL-YOLOX achieves a higher recognition rate and confidence score for severely occluded pests, so ASFL-YOLOX can reduce the missed-detection rate for occluded pests.
500 front-lit, side-lit and backlit images were selected from the light-occlusion dataset A as experimental subjects to test the robustness of the model under different illumination angles; the detection results are shown in Table 3. Analysis of Table 3 shows that in a front-lit environment the proposed ASFL-YOLOX has the highest AP value; in a backlit environment the algorithm's AP value is slightly lower than under front and side lighting, and the detection results of YOLOv5 and YOLOX show the same trend. In the front-lit environment the targets present clearer textures, so every model achieves its best recognition accuracy. Compared with the other detection models, ASFL-YOLOX has the lowest variance of its performance metrics across front-lit, side-lit and backlit conditions while retaining the highest metric values, which shows that the proposed ASFL-YOLOX model is more robust to changes in illumination angle. Under backlit conditions, insufficient light causes colour distortion and loss of texture features, producing high-contrast images; here the YOLOv5 model misses detections, while the proposed method presents the highest confidence.
TABLE 3
[Table 3 is provided as an image in the original publication.]
To further demonstrate the superiority of the proposed method, classical object detection algorithms such as Faster-RCNN and SSD are compared with the proposed ASFL-YOLOX. The vertical comparison results are shown in Table 4 and the horizontal comparison results in Table 5. The FPS and detection speed of each model were obtained by detecting images at 640 × 640 resolution on the GPU.
TABLE 4
[Table 4 is provided as an image in the original publication.]
TABLE 5
[Table 5 is provided as an image in the original publication.]
The comparison shows that in terms of detection precision YOLOX-Nano performs worst, with the smallest AP, Precision and F1 values. Although the AP, Precision and F1 values of ASFL-YOLOX are not the highest, the model is lightweight and its inference is fast: the mAP of ASFL-YOLOX is 11.84% higher than that of YOLOv3, and its AP is 6.92% higher than that of YOLOv4. In terms of detection speed, the FPS of Faster-RCNN is only 19 frames/s, clearly lower than the SSD and YOLO series networks, illustrating the limitation of the two-stage Faster-RCNN detection model for real-time detection. The average detection precision of YOLOv7-x reaches 96.98%, but its detection speed of 37 FPS is inferior to that of ASFL-YOLOX. The FPS of ASFL-YOLOX is 66 frames/s, 29 frames/s higher than that of YOLOv7-x, and its mAP is about 10% higher than that of YOLOv7-tiny. Although the average precision of ASFL-YOLOX is slightly lower than that of YOLOv7-x, the higher complexity of the YOLOv7-x model results in slower detection; both the overall performance and the model size of ASFL-YOLOX are therefore superior to the YOLOv7 series. In conclusion, ASFL-YOLOX is better suited to detecting Papilionidae pests of fruit trees.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, Random Access Memory (RAM) and/or nonvolatile memory, such as Read-Only Memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of storage media for a computer include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (10)

1. A method for detecting Papilionidae insect pests of fruit trees, characterized in that the method comprises:
collecting an image of a fruit tree to be detected;
constructing an adaptive feature fusion lightweight detection model based on YOLOX-x; and
inputting the fruit tree image to be detected into the trained adaptive feature fusion lightweight detection model for detection, thereby obtaining a pest detection result.
2. The method for detecting Papilionidae insect pests of fruit trees according to claim 1, wherein the adaptive feature fusion lightweight detection model comprises a backbone network, a neck and a detection head connected in sequence.
3. The method for detecting Papilionidae insect pests of fruit trees according to claim 2, wherein the backbone network comprises a Focus module and three lightweight feature extraction modules connected in sequence, the output of the Focus module being connected to the input of the first lightweight feature extraction module;
each lightweight feature extraction module comprises two Ghost modules and an attention mechanism unit, the attention mechanism unit being connected in series between the two Ghost modules.
4. The method for detecting Papilionidae insect pests of fruit trees according to claim 3, further comprising: optimizing the pest detection result based on a DIoU loss function.
5. The method for detecting Papilionidae insect pests of fruit trees according to claim 3, further comprising: performing fusion processing on the feature maps output by the neck based on an adaptive feature fusion algorithm.
6. The method for detecting Papilionidae insect pests of fruit trees according to claim 3, further comprising: constructing a preset activation function for the detection head from the Softplus and Tanh functions, the preset activation function being expressed as:
softplus(x) = log(1 + e^x);
TS = tanh(x) · softplus(x);
where TS is the preset activation function of the detection head, softplus(x) is the Softplus function, and tanh(x) is the Tanh function.
7. The method for detecting Papilionidae insect pests of fruit trees according to any one of claims 1 to 6, further comprising training the adaptive feature fusion lightweight detection model, which includes:
obtaining pest sample images;
preprocessing the pest sample images; and
inputting the preprocessed pest sample images into the adaptive feature fusion lightweight detection model for training, thereby obtaining the trained adaptive feature fusion lightweight detection model.
8. The method for detecting Papilionidae insect pests of fruit trees according to claim 7, wherein the preprocessing comprises at least one of: image translation, image blurring, image affine transformation, image rotation, image flipping, image stitching, random change of image contrast, and random change of image brightness.
9. The method for detecting Papilionidae insect pests of fruit trees according to claim 7, wherein, in the training stage of the adaptive feature fusion lightweight detection model, the adaptive feature fusion lightweight detection model is compressed using a pruning strategy.
10. A device for detecting Papilionidae insect pests of fruit trees, for implementing the method for detecting Papilionidae insect pests of fruit trees according to any one of claims 1 to 9, the device comprising:
an acquisition module for acquiring the image of the fruit tree to be detected;
a construction module for constructing the adaptive feature fusion lightweight detection model based on YOLOX-x; and
a detection module for inputting the fruit tree image to be detected into the trained adaptive feature fusion lightweight detection model for detection, thereby obtaining the pest detection result.
CN202310083017.7A 2023-02-08 2023-02-08 Fruit tree Papilionidae insect pest detection method Pending CN116012655A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310083017.7A CN116012655A (en) 2023-02-08 2023-02-08 Fruit tree Papilionidae insect pest detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310083017.7A CN116012655A (en) 2023-02-08 2023-02-08 Fruit tree Papilionidae insect pest detection method

Publications (1)

Publication Number Publication Date
CN116012655A true CN116012655A (en) 2023-04-25

Family

ID=86037305

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310083017.7A Pending CN116012655A (en) 2023-02-08 2023-02-08 Fruit tree Papilionidae insect pest detection method

Country Status (1)

Country Link
CN (1) CN116012655A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117151353A (en) * 2023-11-01 2023-12-01 广东省农业科学院植物保护研究所 Intelligent litchi pest identification and ecological regulation method, system and medium
CN117151353B (en) * 2023-11-01 2024-04-26 广东省农业科学院植物保护研究所 Intelligent litchi pest identification and ecological regulation method, system and medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination