NL2025689B1 - Crop pest detection method based on F-SSD-IV3

Crop pest detection method based on F-SSD-IV3

Info

Publication number
NL2025689B1
NL2025689B1 (application NL2025689A)
Authority
NL
Netherlands
Prior art keywords
ssd
layer
candidate
feature
crop pest
Prior art date
Application number
NL2025689A
Other languages
Dutch (nl)
Other versions
NL2025689A (en)
Inventor
He Yong
Zeng Hong
Wu Jianjian
Xu Jian
Original Assignee
Univ Zhejiang
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Univ Zhejiang filed Critical Univ Zhejiang
Publication of NL2025689A publication Critical patent/NL2025689A/en
Application granted granted Critical
Publication of NL2025689B1 publication Critical patent/NL2025689B1/en

Classifications

    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/188 Vegetation
    • G06F16/51 Indexing; Data structures therefor; Storage structures (still image data)
    • G06F16/55 Clustering; Classification (still image data)
    • G06F16/5866 Retrieval characterised by using metadata generated manually, e.g. tags, keywords, comments, location and time information
    • G06F18/2413 Classification techniques relating to the classification model based on distances to training or reference patterns
    • G06F18/253 Fusion techniques of extracted features
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V10/82 Arrangements for image or video recognition or understanding using neural networks


Abstract

The present invention discloses a crop pest detection method based on Feature Fusion Single Shot Multibox Detector Inception V3 (F-SSD-IV3), including the following steps: (1) capturing pest images to construct a crop pest database; (2) constructing an F-SSD-IV3 target detection algorithm, using Inception V3 to replace VGG-16 as the feature extractor, designing a feature fusion method that fuses context information into output feature maps of different scales, and finally fine-tuning candidate bounds by using Softer NMS; and (3) optimizing the network during training, and improving detection performance and the model generalization capability by amplifying data and adding a Dropout layer.

Description

P3457ONLOO/TRE Title: CROP PEST DETECTION METHOD BASED ON F-SSD-IV3
TECHNICAL FIELD The present invention belongs to the field of deep learning and computer vision, and in particular, to a crop pest detection method based on Feature Fusion Single Shot Multibox Detector Inception V3 (F-SSD-IV3).
BACKGROUND With the continuous growth of the global population, the demand for grain is also increasing dramatically. Due to the natural environment and the characteristics of the crops themselves, crops are inevitably attacked by pests at different growth stages. If the pests cannot be detected and eliminated in time, an outbreak of pests may occur. A large-scale outbreak of pests will affect the healthy growth of crops, thereby greatly reducing the yield and quality of the crops.
Conventional pest identification is based on morphological features such as shape, color, and texture, and relies on manual identification; as a result, it suffers from subjectivity, poor timeliness, and high labor intensity. Early pest identification was based on template matching technology and simple models, extracting features from pest images by using artificially designed descriptors. Common features include the histogram of oriented gradients (HOG), the local binary pattern (LBP), scale-invariant feature transform (SIFT), Haar-like features, and the deformable parts model (DPM). However, an artificially designed feature depends on prior knowledge; it is therefore difficult to accurately express the color and morphology of a target pest, and such features lack robustness. In addition, the application scenarios of the above-mentioned methods are limited, being suitable only for an ideal laboratory environment.
In recent years, relying on the powerful feature expression capability of the convolutional neural network (CNN), target detection methods based on deep learning have made great breakthroughs in detection performance. In general, they can be divided into two types: target detection based on candidate regions and target detection based on regression. In a candidate-region method, the algorithm generates candidate regions in an image, extracts features from each candidate region to generate a region of interest (RoI), and finally conducts classification and regression. Common algorithms include R-CNN [53], Fast R-CNN [54], Faster R-CNN [55], and R-FCN [56]. Such methods have relatively high accuracy but a low detection speed. Currently, the main trend of object detection is faster and more efficient detection. Regression-based target detection methods such as YOLO [59] and SSD [60] have the obvious advantage of a high detection speed: for an input image, bounding boxes and their categories are predicted at multiple positions of the image at the same time, without a candidate-region stage. A limitation of YOLO lies in its strong spatial constraint on the prediction of bounding boxes, which makes it difficult to detect small target objects at multiple scales. In terms of detection speed, SSD can basically achieve real-time performance, but its detection performance on small target objects is relatively poor. In an actual field environment, the background is complex, pest types and postures are diverse, and the target size in an obtained pest image is relatively small. Consequently, existing detection methods cannot well satisfy the needs of the crop pest detection field.
SUMMARY To resolve the problem that existing detection methods cannot well balance detection speed against detection accuracy, and based on the characteristics of available pest images (a small number of samples, small target objects, diverse posture changes, and susceptibility to occlusion), the present invention proposes a new F-SSD-IV3 target detection method for crop pest detection that improves the SSD target detection algorithm.
To achieve the foregoing objective, the present invention provides the following technical solution, including the following steps, as shown in FIG. 1: (1) Capture pest images through internet downloading, smartphone shooting, digital camera shooting, etc. to construct a crop pest database.
(1-1) Convert all RGB pest images to the JPEG format, and name the images with pest names and consecutive numbers.
(1-2) Label the category of each pest and a rectangular bounding box in the image by using the image annotation tool LabelImg, where the rectangular bounding box is defined by four pieces of coordinate information: xmin, ymin, xmax, and ymax.
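As an illustration, the following is a minimal sketch of reading one such annotation back into Python. LabelImg writes Pascal VOC-style XML by default; the file name and the example output in the comment are hypothetical, not taken from the patent.

```python
import xml.etree.ElementTree as ET

def read_voc_annotation(xml_path):
    """Parse a LabelImg (Pascal VOC) XML file into (category, box) pairs."""
    root = ET.parse(xml_path).getroot()
    objects = []
    for obj in root.iter("object"):
        name = obj.find("name").text          # pest category label
        box = obj.find("bndbox")
        coords = tuple(int(float(box.find(k).text))
                       for k in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((name, coords))
    return objects

# e.g. read_voc_annotation("aphid_0001.xml") -> [("aphid", (34, 50, 120, 160))]
```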
(2) Construct an F-SSD-IV3 target detection algorithm, use Inception V3 to replace VGG-16 as the feature extractor, design a feature fusion method that fuses context information into the output feature maps of different scales, and finally fine-tune candidate bounds by using Softer NMS, where the method is shown in FIG. 2, and the detailed process includes the following: (2-1) Select Inception V3 as the basic network of the F-SSD-IV3, where the structure of the Inception V3 network is shown in FIG. 3 and includes a convolutional layer, a convolutional layer, a convolutional layer, a pooling layer, a convolutional layer, a convolutional layer, a pooling layer, Mixed1_a, Mixed1_b, Mixed1_c, Mixed2_a, Mixed2_b, Mixed2_c, Mixed2_d, Mixed2_e, Mixed3_a, Mixed3_b, Mixed3_c, a pooling layer, a dropout layer, and a fully connected layer; the size of an input image is 300x300x3; dimensions of the convolution kernels include 1x1, 1x3, 3x1, 3x3, 5x5, 1x7, and 7x1; the pooling layers include maximum pooling and average pooling, with a dimension of 3x3; and the sizes of the obtained feature maps are 149x149x32, 147x147x32, 147x147x64, 73x73x64, 73x73x80, 71x71x192, 35x35x192, 35x35x256, 35x35x288, 35x35x288, 17x17x768, 17x17x768, 17x17x768, 17x17x768, 17x17x768, 8x8x1280, 8x8x2048, 8x8x2048, and 1x1x2048.
(2-2) Then add an additional network of six convolutional layers after the Inception V3, where the sizes of the convolution kernels are respectively 1x1x256, 3x3x512 (stride 2), 1x1x128, 3x3x256 (stride 2), 1x1x256, and 3x3x128 (stride 1); and obtain three feature maps with gradually decreasing sizes, respectively 4x4x512, 2x2x256, and 1x1x128.
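A rough Keras sketch of steps (2-1) and (2-2) follows. It is a sketch under assumptions, not the patent's implementation: tf.keras.applications.InceptionV3 exposes its inception blocks under names such as mixed2/mixed7/mixed10 rather than the Mixed1_c/Mixed2_e/Mixed3_c names used above, so that mapping, and the padding choices, are illustrative.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_feature_extractor(input_shape=(300, 300, 3)):
    """Backbone plus the additional convolutional network of step (2-2)."""
    # InceptionV3 without its classification head, pre-trained on ImageNet.
    backbone = tf.keras.applications.InceptionV3(
        include_top=False, weights="imagenet", input_shape=input_shape)
    # Assumed mapping of the patent's Mixed1_c / Mixed2_e / Mixed3_c taps
    # onto the Keras layer names mixed2 / mixed7 / mixed10.
    taps = [backbone.get_layer(name).output
            for name in ("mixed2", "mixed7", "mixed10")]

    # Six extra convolutional layers: (filters, kernel, stride) per the text.
    x = taps[-1]
    extra_cfg = [(256, 1, 1), (512, 3, 2), (128, 1, 1),
                 (256, 3, 2), (256, 1, 1), (128, 3, 1)]
    for filters, ksize, stride in extra_cfg:
        x = layers.Conv2D(filters, ksize, strides=stride, padding="same",
                          activation="relu")(x)
        if ksize == 3:  # each 3x3 layer yields one of the extra feature maps
            taps.append(x)
    # Note: padding is simplified; reproducing the exact 4x4/2x2/1x1 map
    # sizes from the text would need per-layer padding choices.
    return tf.keras.Model(backbone.input, taps)
```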
(2-3) Conduct feature fusion on the feature maps output in step (2-2), the Mixed1_c feature map, the Mixed2_e feature map, and the Mixed3_c feature map, to resolve the problem that small target objects are difficult to detect in the later stage of the original SSD target detection method due to a serious lack of global context information. The feature fusion method is shown in FIG. 4 and specifically includes first conducting deconvolution on the feature map at the next layer, then fusing the feature map at the next layer with the feature map at the current layer in a cascading manner, and outputting a new feature map. The output candidate bounds in the network structure can be represented by the following formula:

$$\text{Output candidate bounds} = \{P_{n-1}(f'_{n-1}), \ldots, P_k(f'_k)\}, \quad f'_{n-1} = f_{n-1} + f_n, \quad f'_k = f_k + f_{k+1} + \cdots + f_n, \quad n > k \geq 0$$

where $f_n$ represents the feature map output at the cascaded $n$-th layer, and $P$ represents the candidate bounds generated for each feature map.
"+" in FIG. 4 represents a cascading module formed by a deconvolution layer, 3x3 convolution layers, and a 1x1 convolution layer, and can transfer an advanced feature to a lower layer. To combine feature maps of different sizes, the cascading module uses the deconvolution layer to generate and input feature maps with a same height and width; then uses two 3x3 convolution layers to better learn features; and uses a standardized layer before connection to conduct normalization processing on the input feature maps. Normalization can resolve a problem of gradient explosion, and can greatly increase a training speed during network training. Concat can combine two feature maps. Other dimensions of the two feature maps are same except a stitching dimension. The 1x1 convolutional layer is introduced for dimensionality reduction and feature recombination.
(2-4) Conduct convolution on the k candidate bounds at each position in an m×n feature map, where the size of the convolution kernel is (c+4)k, to predict c category scores and four position offsets, finally generating m×n×k(c+4) predicted outputs. For the candidate bounds of the feature maps, the original SSD uses a minimum scale $S_{min} = 0.2$ and a maximum scale $S_{max} = 0.9$; in the present invention, $S_{min} = 0.1$ and $S_{max} = 0.95$, so the size range of the candidate bounds is larger. To ensure a smooth scale transition between layers, a new scale $S'_k = (S_k + S_{k+1})/2$ is added for the feature map at each layer, so as to improve the detection accuracy. In addition, the default aspect ratios of a candidate bound are set to $a_r \in \{1, 2, 3, 1/2, 1/3\}$. When $a_r = 1$, an extra candidate bound is added, with size $S'_k = \sqrt{S_k S_{k+1}}$.

(2-5) During detection with the original SSD algorithm, the NMS preserves candidate bounds with relatively high confidence coefficients, and a large number of overlapping candidate bounds are generated (24,564 candidate bounds are generated by SSD512). The candidate bounds are therefore refined as follows: (1) a candidate bound M is selected by using the Softer NMS; (2) for each selected candidate bound M, whether the IoU of another candidate bound with M is greater than a threshold p is determined; (3) weighted averaging is conducted on all candidate bounds whose IoUs are greater than the threshold p, and the position coordinates of the candidate bounds are updated (a simplified sketch of this refinement is given after step (3-1) below).

(2-6) The loss function of the SSD is formed by two parts, a position loss $L_{loc}$ and a classification loss $L_{conf}$, and can be represented as follows:

$$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$$

where N represents the number of candidate bounds matching a real boundary, c is the confidence coefficient of each type of candidate bound, l is the value of the translation and scale change of a candidate bound, g is the position information of the real boundary, and α = 1 by default.

(3) Optimize the network during training, and improve detection performance and the model generalization capability by amplifying data and adding a Dropout layer.

(3-1) The data set of pests is relatively small, new data are relatively difficult to obtain, and relatively high costs are required to obtain a sufficiently labeled data set. Therefore, a data amplification method is adopted in the present invention to expand the data set. Data amplification can be represented as the following formula:

$$\phi : S \to T$$

where S represents the raw training data, T represents the data obtained after data amplification, and φ is the adopted data amplification method.
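Returning to step (2-5), the following numpy sketch shows the box-refinement loop: boxes whose IoU with the selected box M exceeds the threshold p are merged by confidence-weighted averaging. This is a simplified rendering; full Softer-NMS weights boxes by their predicted localization variance, which the text above does not detail.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, as [xmin, ymin, xmax, ymax]."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def refine_boxes(boxes, scores, p=0.6):
    """Greedy selection plus confidence-weighted coordinate averaging."""
    boxes, scores = boxes.copy(), scores.copy()
    keep = []
    order = np.argsort(scores)[::-1]          # highest confidence first
    while order.size:
        m = order[0]
        overlaps = iou(boxes[m], boxes[order])
        group = order[overlaps > p]           # includes M itself (IoU = 1)
        w = scores[group][:, None]            # confidence weights
        boxes[m] = (w * boxes[group]).sum(0) / w.sum()  # update M's coords
        keep.append(m)
        order = order[overlaps <= p]          # drop M and the merged boxes
    return boxes[keep], scores[keep]
```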
In the present invention, common data amplification manners are adopted: the luminance, contrast, and saturation of an image are randomly adjusted, and flipping, rotation, cropping, and translation are conducted on the image. Finally, the training set is expanded fivefold.
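A sketch of such a pipeline with tf.image; the adjustment ranges are assumptions, and rotation/translation (not provided by tf.image directly) are left to, e.g., Keras preprocessing layers:

```python
import tensorflow as tf

def augment(image):
    """Randomly perturb one image; the ranges below are assumptions."""
    image = tf.image.random_brightness(image, max_delta=0.2)   # luminance
    image = tf.image.random_contrast(image, 0.8, 1.2)          # contrast
    image = tf.image.random_saturation(image, 0.8, 1.2)        # saturation
    image = tf.image.random_flip_left_right(image)             # flipping
    image = tf.image.random_crop(image, size=(280, 280, 3))    # cropping
    # For detection data, the labeled boxes must be transformed consistently
    # with the geometric operations; that bookkeeping is omitted here.
    return tf.image.resize(image, (300, 300))

# Fivefold expansion: emit five independently augmented variants per image.
# ds = ds.flat_map(lambda im: tf.data.Dataset.from_tensors(im).repeat(5)).map(augment)
```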
(3-2) The Dropout policy can prevent model overfitting. During network training, some neurons at a hidden layer are randomly suppressed with a probability p in each iteration, and finally a comprehensive averaging policy is used to combine the different thinned networks into a final output model. In the present invention, the tested probabilities of randomly suppressing neurons at the hidden layer are p = 0.5, 0.6, 0.7, 0.8, and 0.9.
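In Keras terms, the suppression probability p maps directly onto the Dropout layer's rate argument; the dense layer below is purely illustrative:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Dropout's `rate` is exactly the suppression probability p: each hidden
# unit is zeroed with probability p on every training iteration.
def head_with_dropout(p=0.8):  # p = 0.8 gave the best mAP in the experiments
    return tf.keras.Sequential([
        layers.Dense(1024, activation="relu"),  # hypothetical hidden layer
        layers.Dropout(rate=p),
    ])
```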
BRIEF DESCRIPTION OF DRAWINGS FIG. 1 is a step diagram of a detection method according to the present invention; FIG. 2 is a flowchart of an F-SSD-IV3 algorithm; FIG. 3 is a network structure diagram of Inception V3; and FIG. 4 is a schematic diagram of a feature fusion method.
DETAILED DESCRIPTION The present invention is described in detail below with reference to embodiments and the accompanying drawings, but the present invention is not limited thereto.
(1) Experimental data: In the present invention, a field crop typical-pest data set collected by the Institute of Agricultural Information Technology, Zhejiang University is adopted; the pest images in the data set vary in image size, light conditions, blocking degree, shooting angle, and target pest size. Images in the database are randomly and evenly distributed into a training set, a validation set, and a test set at a ratio of 7:2:1. The model is trained on the data in the training set, the validation set is used for evaluation to select model parameters, and finally model performance and efficiency are measured on the test set.
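A sketch of the random 7:2:1 split; the function name and seed are illustrative:

```python
import random

def split_dataset(image_paths, seed=42):
    """Randomly split file paths into train/validation/test at 7:2:1."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)
    n_train = int(0.7 * len(paths))
    n_val = int(0.2 * len(paths))
    return (paths[:n_train],                      # training set
            paths[n_train:n_train + n_val],       # validation set
            paths[n_train + n_val:])              # test set
```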
(2) Experimental environment: The specifications of the experimental workstation are as follows: the memory is 32 GB, the operating system is Linux Ubuntu 18.04, and the CPU is an Intel Core i7-7800X. TensorFlow supports multi-GPU training, and two NVIDIA GeForce GTX 1080 Ti graphics cards are used for training in the present invention. Python is used as the programming language because it supports the TensorFlow deep learning framework.
(3) Training process: First, data amplification is conducted to expand the training set, and the size of an input image is fixed at 300x300x3. Then the network is initialized, the errors of the position loss function and the classification loss function are calculated through forward propagation, and the parameters are updated through backpropagation until 200,000 iterations are completed; finally, the parameters are saved. In the experiment, an Inception V3 model pre-trained on ImageNet is used as the feature extraction network of the SSD through fine-tuning, and the parameters of the Inception V3 initialize the parameters of the basic network to increase the training speed. The training hyperparameters are as follows: weights are initialized with random numbers from a standard normal distribution with a standard deviation of 0.1 and a mean of 0. A stochastic gradient descent (SGD) method with Momentum is used; the momentum is 0.9, and the attenuation coefficient is also set to 0.9. Compared with plain SGD, the Momentum optimizer alleviates two problems: noise introduction and relatively large convergence oscillation. The initial learning rate is set to 0.004, the exponential attenuation parameter is set to 0.95, and the batch size is set to 24. A total of 200,000 iterations are conducted, and one complete training run takes approximately 20 hours. During training, when the IoU of a candidate bound and a labeled rectangular box exceeds 0.6, the candidate bound is a positive sample; otherwise, it is a negative sample.
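The hyperparameters above translate roughly into the following TensorFlow 2 sketch; decay_steps is an assumption, since the text gives only the decay factor 0.95:

```python
import tensorflow as tf

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.004,   # initial learning rate from the text
    decay_steps=10_000,            # assumed; the text gives only the factor
    decay_rate=0.95)               # exponential attenuation parameter
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.9)

BATCH_SIZE = 24
TOTAL_ITERATIONS = 200_000
IOU_POSITIVE_THRESHOLD = 0.6  # candidate bound is positive above this IoU
```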
(4) The parameters of the model are continually adjusted according to the results on the validation set, and the test set is applied to the trained optimal model to determine the performance of the model. When p of the Dropout layer is 0.8, the mAP value is the highest. The F-SSD-IV3 algorithm proposed in the present invention is compared with the original SSD300, Faster R-CNN, and R-FCN target detection algorithms on the same test set, and the standard target detection performance evaluation indicator mAP proposed in the Pascal VOC Challenge is used as the performance indicator.
[Table 1 Performance comparison of various algorithms: detection accuracy and per-image detection time for SSD300, Faster R-CNN, R-FCN, and F-SSD-IV3; the numeric cells did not survive extraction except as quoted below]

It can be learned from the foregoing table that SSD300 has the best detection speed, namely 0.048 seconds per image, but the lowest detection accuracy; the detection accuracy of both Faster R-CNN and R-FCN is lower than 0.68, and each detects a single image in approximately 0.15 seconds. Compared with R-FCN and Faster R-CNN, F-SSD-IV3 has relatively large advantages in both detection accuracy and detection speed. Therefore, the F-SSD-IV3 proposed in the present invention better balances detection accuracy and detection speed, and has relatively high practical value for real-time and accurate detection of pests in a field environment.
The foregoing descriptions are merely preferred examples of the present invention, but are not intended to limit the present invention. Any modifications, equivalent replacements, or improvements made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (8)

CLAIMS

1. A crop pest detection method based on Feature Fusion Single Shot Multibox Detector Inception V3 (F-SSD-IV3), comprising the following steps: (1) capturing images of pests to construct a crop pest database; (2) constructing an F-SSD-IV3 target detection algorithm: outputting feature maps of different scales from the images in the crop pest database by using Inception V3 as a feature extractor, conducting feature fusion on the feature maps, and fine-tuning candidate bounds by using Softer NMS; and (3) optimizing the target detection network by amplifying data and adding a Dropout layer, to obtain an optimal detection model that is used to detect crop pests in an image.

2. The crop pest detection method based on F-SSD-IV3 according to claim 1, wherein the crop pest database stores pest images of different image sizes, light conditions, blocking degrees, shooting angles, and target pest sizes.

3. The crop pest detection method based on F-SSD-IV3 according to claim 1, wherein step (2) specifically comprises the following steps:
(2-1) selecting Inception V3 as a basic network of the F-SSD-IV3, wherein a structure of the Inception V3 network comprises a convolutional layer, a convolutional layer, a convolutional layer, a pooling layer, a convolutional layer, a convolutional layer, a pooling layer, Mixed1_a, Mixed1_b, Mixed1_c, Mixed2_a, Mixed2_b, Mixed2_c, Mixed2_d, Mixed2_e, Mixed3_a, Mixed3_b, Mixed3_c, a pooling layer, a dropout layer, and a fully connected layer; dimensions of the convolution kernels include 1x1, 1x3, 3x1, 3x3, 5x5, 1x7, and 7x1; the pooling layers include maximum pooling and average pooling and have a dimension of 3x3; and sizes of the obtained feature maps are 149x149x32, 147x147x32, 147x147x64, 73x73x64, 73x73x80, 71x71x192, 35x35x192, 35x35x256, 35x35x288, 35x35x288, 17x17x768, 17x17x768, 17x17x768, 17x17x768, 17x17x768, 8x8x1280, 8x8x2048, 8x8x2048, and 1x1x2048;
(2-2) then adding an additional network of six convolutional layers after the Inception V3, wherein sizes of the convolution kernels are respectively 1x1x256, 3x3x512, 1x1x128, 3x3x256, 1x1x256, and 3x3x128, and obtaining three feature maps with gradually decreasing sizes, respectively 4x4x512, 2x2x256, and 1x1x128;
(2-3) conducting feature fusion on the feature maps output in step (2-2), a Mixed1_c feature map, a Mixed2_e feature map, and a Mixed3_c feature map, and outputting a new feature map;
(2-4) conducting convolution on k candidate bounds at each position in an m×n feature map, wherein the size of the convolution kernel is (c+4)k, predicting c category scores and four position changes, and finally generating m×n×k(c+4) predicted outputs;
(2-5) using the NMS to preserve a candidate bound with a relatively high confidence coefficient, and generating a large number of candidate bounds between which an overlap exists; selecting a candidate bound by using the Softer NMS for each candidate bound; determining, for each selected candidate bound M, whether the IoU of another candidate bound and the candidate bound M is greater than a threshold p; and conducting weighted averaging on all candidate bounds whose IoUs are greater than the threshold p, and updating the position coordinates of the candidate bounds; and
(2-6) wherein a loss function of the SSD is formed by two parts, a position loss $L_{loc}$ and a classification loss $L_{conf}$, and can be represented as follows:

$$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$$

wherein N represents the number of candidate bounds matching a real boundary, c is the confidence coefficient of each type of candidate bound, l is the value of a translation and scale change of a candidate bound, g is the position information of the real boundary, and α = 1 by default.

4. The crop pest detection method based on F-SSD-IV3 according to claim 3, wherein in step (2-3) the feature fusion method comprises first conducting deconvolution on a feature map at a next layer, then conducting feature fusion on the feature map at the next layer and a feature map at a current layer in a cascading manner, and outputting a new feature map.

5. The crop pest detection method based on F-SSD-IV3 according to claim 3, wherein the output candidate bounds in the network structure can be represented by the following formula:

$$\text{Output candidate bounds} = \{P_{n-1}(f'_{n-1}), \ldots, P_k(f'_k)\}, \quad f'_{n-1} = f_{n-1} + f_n, \quad f'_k = f_k + f_{k+1} + \cdots + f_n, \quad n > k \geq 0$$

wherein $f_n$ represents the feature map output at the cascaded $n$-th layer, and $P$ represents the candidate bounds generated for each feature map.

6. The crop pest detection method based on F-SSD-IV3 according to claim 5, wherein the default aspect ratios of a candidate bound are set to $a_r \in \{1, 2, 3, 1/2, 1/3\}$, and when $a_r = 1$, an extra candidate bound is added, whose size is $S'_k = \sqrt{S_k S_{k+1}}$.

7. The crop pest detection method based on F-SSD-IV3 according to claim 1, wherein data amplification in step (3) is represented as the following formula:

$$\phi : S \to T$$

wherein S represents the raw training data, T represents the data obtained after data amplification, and φ is the adopted data amplification method; and the luminance, contrast, and saturation of an image are randomly adjusted, and flipping, rotation, cropping, and translation are conducted on the image.

8. The crop pest detection method based on F-SSD-IV3 according to claim 1, wherein the Dropout policy is as follows: during network training, some neurons at a hidden layer are randomly suppressed with a probability p in each iteration, and finally a comprehensive averaging policy is used to combine different neural networks into a final output model.
NL2025689A 2019-05-31 2020-05-27 Crop pest detection method based on f-ssd-iv3 NL2025689B1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910470899.6A CN110222215B (en) 2019-05-31 2019-05-31 Crop pest detection method based on F-SSD-IV3

Publications (2)

Publication Number Publication Date
NL2025689A NL2025689A (en) 2020-12-03
NL2025689B1 true NL2025689B1 (en) 2021-06-07

Family

ID=67819271

Family Applications (1)

Application Number Title Priority Date Filing Date
NL2025689A NL2025689B1 (en) 2019-05-31 2020-05-27 Crop pest detection method based on f-ssd-iv3

Country Status (2)

Country Link
CN (1) CN110222215B (en)
NL (1) NL2025689B1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110782435A (en) * 2019-10-17 2020-02-11 浙江中烟工业有限责任公司 Tobacco worm detection method based on deep learning model
CN112464971A (en) * 2020-04-09 2021-03-09 丰疆智能软件科技(南京)有限公司 Method for constructing pest detection model
CN111476317B (en) * 2020-04-29 2023-03-24 中国科学院合肥物质科学研究院 Plant protection image non-dense pest detection method based on reinforcement learning technology
CN111476238B (en) * 2020-04-29 2023-04-07 中国科学院合肥物质科学研究院 Pest image detection method based on regional scale perception technology
CN111882002B (en) * 2020-08-06 2022-05-24 桂林电子科技大学 MSF-AM-based low-illumination target detection method
CN113065473A (en) * 2021-04-07 2021-07-02 浙江天铂云科光电股份有限公司 Mask face detection and body temperature measurement method suitable for embedded system
CN115641575A (en) * 2022-10-24 2023-01-24 南京睿升达科技有限公司 Leafhopper agricultural pest detection method based on sparse candidate frame
CN116070789B (en) * 2023-03-17 2023-06-02 北京茗禾科技有限公司 Artificial intelligence-based single-yield prediction method for mature-period rice and wheat

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7496228B2 (en) * 2003-06-13 2009-02-24 Landwehr Val R Method and system for detecting and classifying objects in images, such as insects and other arthropods
US7286056B2 (en) * 2005-03-22 2007-10-23 Lawrence Kates System and method for pest detection
CN107665355B (en) * 2017-09-27 2020-09-29 重庆邮电大学 Agricultural pest detection method based on regional convolutional neural network
CN108399380A (en) * 2018-02-12 2018-08-14 北京工业大学 A kind of video actions detection method based on Three dimensional convolution and Faster RCNN
CN109002755B (en) * 2018-06-04 2020-09-01 西北大学 Age estimation model construction method and estimation method based on face image
CN109101994B (en) * 2018-07-05 2021-08-20 北京致远慧图科技有限公司 Fundus image screening method and device, electronic equipment and storage medium
CN109191455A (en) * 2018-09-18 2019-01-11 西京学院 A kind of field crop pest and disease disasters detection method based on SSD convolutional network
CN109740463A (en) * 2018-12-21 2019-05-10 沈阳建筑大学 A kind of object detection method under vehicle environment

Also Published As

Publication number Publication date
CN110222215B (en) 2021-05-04
NL2025689A (en) 2020-12-03
CN110222215A (en) 2019-09-10

Similar Documents

Publication Publication Date Title
NL2025689B1 (en) Crop pest detection method based on f-ssd-iv3
Saedi et al. A deep neural network approach towards real-time on-branch fruit recognition for precision horticulture
WO2020177432A1 (en) Multi-tag object detection method and system based on target detection network, and apparatuses
CN111753828B (en) Natural scene horizontal character detection method based on deep convolutional neural network
Chen et al. Weed detection in sesame fields using a YOLO model with an enhanced attention mechanism and feature fusion
Mathur et al. Crosspooled FishNet: transfer learning based fish species classification model
CN111652317B (en) Super-parameter image segmentation method based on Bayes deep learning
CN109033978B (en) Error correction strategy-based CNN-SVM hybrid model gesture recognition method
Su et al. LodgeNet: Improved rice lodging recognition using semantic segmentation of UAV high-resolution remote sensing images
CN115620160A (en) Remote sensing image classification method based on multi-classifier active transfer learning resistance
Hao et al. Growing period classification of Gynura bicolor DC using GL-CNN
CN112364747B (en) Target detection method under limited sample
CN112598031A (en) Vegetable disease detection method and system
Wenxia et al. Identification of maize leaf diseases using improved convolutional neural network.
Ouf Leguminous seeds detection based on convolutional neural networks: Comparison of faster R-CNN and YOLOv4 on a small custom dataset
Singh et al. Performance Analysis of CNN Models with Data Augmentation in Rice Diseases
Song et al. Multi-source remote sensing image classification based on two-channel densely connected convolutional networks.
Sharma et al. Deep Learning Meets Agriculture: A Faster RCNN Based Approach to pepper leaf blight disease Detection and Multi-Classification
Tu et al. Toward automatic plant phenotyping: starting from leaf counting
Sadati et al. An improved image classification based in feature extraction from convolutional neural network: application to flower classification
CN115393631A (en) Hyperspectral image classification method based on Bayesian layer graph convolution neural network
Yang et al. Intelligent collection of rice disease images based on convolutional neural network and feature matching
CN109308936B (en) Grain crop production area identification method, grain crop production area identification device and terminal identification equipment
Chu et al. Automatic image annotation combining svms and knn algorithm
Zhang et al. Unsound wheat kernel recognition based on deep convolutional neural network transfer learning and feature fusion