CN116758415A - Lightweight pest identification method based on two-dimensional discrete wavelet transformation


Info

Publication number
CN116758415A
CN116758415A
Authority
CN
China
Prior art keywords
discrete wavelet
dimensional discrete
network
wavelet transform
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310616466.3A
Other languages
Chinese (zh)
Inventor
李晖
谭廷俊
胡欣仪
唐栩燃
罗伟
赵雪如
李超然
赵泽华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202310616466.3A
Publication of CN116758415A
Legal status: Pending

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/188 Vegetation
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/48 Extraction of image or video features by mapping characteristic values of the pattern into a parameter space, e.g. Hough transformation
    • G06V10/52 Scale-space analysis, e.g. wavelet analysis
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Abstract

The invention relates to a lightweight pest identification method based on the two-dimensional discrete wavelet transform (2D-DWT), which comprises the following steps: a two-dimensional discrete wavelet transform module (2D-DWTM) performs spatial multi-scale feature fusion and downsampling on an input image, extracting features while leaving the image scale unchanged; a residual attention module (RAM) raises the network's attention to important features while alleviating the vanishing-gradient problem, thereby improving the network's representation capability; global average pooling (avgpool) compresses the feature channels, reducing the size and complexity of the feature map and improving the generalization ability of the model; finally, a fully connected layer (Linear) serves as the classification module to complete classification. By adopting multi-scale feature fusion and an attention-weighted fusion mechanism, the invention effectively reduces the interference of complex backgrounds with pest classification and improves classification accuracy. At the same time, by optimizing the memory footprint of the model, efficient classification is achieved, industrial deployment is facilitated, and an important contribution is made to the field of agricultural protection.

Description

Lightweight pest identification method based on two-dimensional discrete wavelet transformation
Technical Field
The invention belongs to the field of deep learning and relates to a lightweight pest identification method based on the two-dimensional discrete wavelet transform.
Background Art
With the continuing intensification of the greenhouse effect, agricultural and forestry pests and diseases are increasing, and reductions in grain yield are becoming more and more common. Solving this problem requires careful prevention and control measures. At present, pests are generally identified manually, but this approach is costly, inefficient, and labor-intensive. Therefore, an efficient, low-cost automatic pest identification algorithm needs to be researched to reduce production costs and improve agricultural efficiency.
Traditional automatic pest identification algorithms generally adopt machine learning techniques and proceed in three steps: image preprocessing, feature extraction, and classification. In the preprocessing stage, the algorithm enhances salient regions in the image and removes background noise. In the feature extraction stage, it extracts features such as the color, texture, and shape of the pest image. Finally, it classifies the pest image using a support vector machine (Support Vector Machine, SVM), AdaBoost, an artificial neural network (Artificial Neural Network, ANN), or a similar classifier. However, such algorithms suffer from low accuracy, poor robustness, and excessive dependence on hand-crafted features in the feature extraction process.
Deep learning technology is now widely applied to automatic pest identification, where methods based on convolutional neural networks have become the mainstream. To achieve accurate identification of pests against complex backgrounds, researchers have proceeded along the following three lines. The first is pest identification based on visual saliency features: salient regions of the input image are extracted and highlighted, and a convolutional neural network then performs feature extraction and classification. However, because pest images often contain complex background interference such as color and texture, conventional saliency algorithms have difficulty extracting high-level semantic information. The second is pest identification combined with an attention mechanism, which adds channel or spatial attention to the convolutional neural network to strengthen its feature extraction capability. However, the attention mechanism increases the number of model parameters, the spatial relationships between features are handled insufficiently, and a convolutional neural network with a small receptive field cannot extract high-level semantic information, so a good balance between accuracy and speed cannot be achieved. The third is pest identification with model ensembles: different models are trained, and their structures and weights are then integrated into a new pest identification model with higher accuracy. However, because this approach takes accuracy as its only criterion, the number of model parameters is large and the training cost is high; training and deployment often require high-performance computing equipment, which is a significant limitation in weak-signal environments such as farmland and mountainous areas.
Disclosure of Invention
In view of the above, an object of the present invention is to exploit the characteristics of the two-dimensional discrete wavelet transform: it enables lossless feature extraction from an image, reduces the spatial size of the feature map so as to lower the amount of computation, and expands the receptive field faster than convolution does. In addition, for pest identification in complex environments, a residual attention module is added, which raises the network's attention to key features while alleviating the vanishing-gradient problem.
In order to achieve the above purpose, the present invention provides the following technical solutions:
A lightweight pest identification method based on the two-dimensional discrete wavelet transform comprises the following steps:
S1: performing spatial multi-scale feature fusion and downsampling on an input image using a two-dimensional discrete wavelet transform module, and performing feature extraction while the image scale remains unchanged;
S2: using a residual attention module to raise the network's attention to important features while alleviating the vanishing-gradient problem of the network, thereby improving the representation capability of the network;
S3: compressing the feature channels using global average pooling, reducing the size and complexity of the feature map, and improving the generalization ability of the model;
S4: finally, using the fully connected layer as a classification module to complete classification.
Further, in step S1, the specific operation of the two-dimensional discrete wavelet transform module is as follows:
S1.1: extracting learnable, spatially invariant image features through a convolution layer;
S1.2: performing spatial token mixing and downsampling using the two-dimensional discrete wavelet transform, so as to perform scale-invariant feature extraction;
S1.3: performing channel mixing using a learnable multi-layer perceptron;
S1.4: restoring the spatial resolution of the feature map using a transposed convolution layer;
S1.5: finally, adjusting the output to the same format as the input using a batch normalization operation, and concatenating it with the input for output, thereby alleviating the vanishing-gradient problem of the network.
Further, in step S2, the residual attention module is used to alleviate the vanishing-gradient problem of the network and raise the network's attention to important features, thereby improving the representation capability of the network, with the expression:
x_out = RAM(x_in)
where x_in denotes the input, x_out denotes the output, and RAM(·) denotes the residual attention module.
Further, step S3 specifically comprises: compressing the feature channels using global average pooling, reducing the size and complexity of the feature map, and improving the generalization ability of the model, with the expression:
x_out = avgpool(x_in)
where avgpool(·) denotes global average pooling.
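As a minimal NumPy sketch of this step, global average pooling collapses each H×W feature map to a single scalar per channel (the toy feature tensor below is illustrative):

```python
import numpy as np

def global_avg_pool(x):
    """Collapse each H×W feature map to a single scalar per channel,
    turning an (H, W, C) tensor into a length-C vector."""
    return x.mean(axis=(0, 1))

feat = np.ones((8, 8, 3))   # toy feature map, H = W = 8, C = 3
feat[:, :, 1] *= 2.0        # make channel 1 distinguishable
v = global_avg_pool(feat)
print(v)  # [1. 2. 1.]
```

Because every spatial position is averaged away, the output size depends only on the channel count, which is what makes the downstream classifier small.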
Further, step S4 specifically comprises: using the fully connected layer as a classification module to complete classification, with the expression:
Pre = Linear(x_in)
where Linear(·) denotes the classification module; the processed one-dimensional vector is passed to the fully connected layer to obtain the final prediction result.
The invention has the beneficial effects that:
(1) The invention applies the two-dimensional discrete wavelet transform to agricultural pest identification. The transform performs lossless feature extraction on the image; it can effectively learn strong image priors such as translation invariance, scale invariance and edge sparsity; and it reduces the spatial size of the feature map, which lowers the memory and time required for the forward and backward passes, reduces the memory footprint of the model, and facilitates industrial deployment.
(2) For pest images with complex backgrounds, the added residual attention module performs weighted fusion of the important features in the image, strengthening the attention paid to them and the generalization ability of the model. This effectively reduces the influence of complex backgrounds on pest classification accuracy, improves classification accuracy, achieves accurate classification of pest species, and contributes to the field of agricultural protection.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention.
Drawings
To make the purposes, technical solutions and advantages of the present invention more apparent, the invention is described in detail below with reference to the accompanying drawings, wherein:
FIG. 1 is a flow chart of a lightweight pest identification method based on two-dimensional discrete wavelet transform according to the present invention;
FIG. 2 is a diagram of a lightweight pest identification network framework based on two-dimensional discrete wavelet transform according to the present invention;
FIG. 3 is a schematic diagram of a two-dimensional discrete wavelet transform module according to the present invention;
FIG. 4 is a schematic diagram of a residual attention module of the present invention;
FIG. 5 is a comparison of the parameters and recognition accuracy of the present invention with other lightweight models;
Detailed Description
The advantages and effects of the present invention will be readily apparent to those skilled in the art from the following description of specific embodiments. The invention may also be implemented or applied through other, different embodiments, and various modifications or changes may be made to the details of this description for different aspects and applications without departing from the spirit of the invention. It is noted that the figures provided below are only illustrative of the basic idea of the invention, and features may be combined with each other so long as they do not conflict.
The figures are provided only to illustrate examples of the invention, not to limit it. Components in the figures may be omitted, scaled down or enlarged to better illustrate the embodiments, and do not represent actual product dimensions. Some well-known structures, and their descriptions, may be omitted, as they will be understood by those skilled in the art.
The same reference numbers correspond to the same or similar elements. In the description of the present invention, terms such as "upper", "lower", "left", "right", "front" and "rear", where used, are merely for convenience and simplicity of description and do not necessarily mean that the apparatus or element referred to has a particular orientation or is constructed and operated in a particular orientation. Accordingly, terms of positional relationship in the drawings are merely illustrative and are not to be construed as limiting the invention; their specific meaning can be understood by those skilled in the art according to the actual circumstances.
FIG. 1 is a flowchart of the lightweight pest identification method based on two-dimensional discrete wavelet transform according to the present invention. As further described below with reference to FIG. 1, the method mainly comprises the following steps:
Step 1: perform spatial multi-scale feature fusion and downsampling on the input image using the two-dimensional discrete wavelet transform module, and perform feature extraction while the image scale remains unchanged;
Step 2: use the residual attention module to raise the network's attention to important features while alleviating the vanishing-gradient problem of the network, thereby improving the representation capability of the network;
Step 3: compress the feature channels using global average pooling, reducing the size and complexity of the feature map and improving the generalization ability of the model;
Step 4: finally, use the fully connected layer as the classification module to complete classification.
FIG. 2 shows the overall structure of the constructed network. The input image is x_in, with size H×W×C, where C is the number of input channels, H is the height of the image, and W is the width of the image. In the figure, n denotes the number of input images and f denotes the number of channels fed to the two-dimensional discrete wavelet transform module. First, the input image passes through a convolution layer to produce a trainable feature map F_r1 of size H×W×f, which is fed into the stacked two-dimensional discrete wavelet transform modules for further feature extraction and parameter reduction, capturing image features while reducing the amount of computation. To further enhance the network's attention to key features, the feature map F_r2 produced by the two-dimensional discrete wavelet transform module, still of size H×W×f, is fed into the residual attention module. Through the stacking of trunk-branch residual blocks and the weighted fusion of the mask branch, this module further raises the attention paid to key features and outputs a feature map F_r3 of size H×W×f. A global average pooling layer then compresses F_r3 to reduce the size and complexity of the feature map. Finally, the compressed feature map F_r4 is classified by the fully connected layer to obtain the prediction probabilities for the pest image.
FIG. 3 is a block diagram of the two-dimensional discrete wavelet transform module of the lightweight pest identification method based on two-dimensional discrete wavelet transform according to the present invention; its structural principle is further described below with reference to FIG. 3. As can be seen from the figure, the module contains 5 layers in total. First comes the convolution layer: using a trainable convolution before the wavelet transform is a key design aspect of the architecture, since it allows the network to extract feature maps adapted to the selected wavelet basis functions. The input tensor X_in has size H×W×C; after passing through the convolution layer, the number of channels is reduced by a factor of four, giving X_0 of size H×W×C/4, namely:
X_0 = c(X_in, ε)
where X_in is the input tensor, X_0 is the output tensor, c(·) denotes the convolution operation, and ε denotes the corresponding set of trainable parameters. The second layer applies the two-dimensional discrete wavelet transform to the tensor X_0, computing one approximation sub-band and three detail sub-bands so as to reduce the spatial resolution; the output tensor X_1 has size H/2×W/2×C/4, i.e. after the two-dimensional discrete wavelet transform both H and W are halved while the number of channels is unchanged. Then, through an aggregation operation, the four sub-bands obtained by the transform are fused with a multi-layer perceptron so that the number of channels of the feature tensor X_2 is restored, giving size H/2×W/2×C. Finally, the spatial resolution of the feature tensor is restored and output through a transposed convolution layer that restores the feature mapping, followed by batch normalization, so that the output feature tensor X_3 has size H×W×C.
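The sub-band computation of the second layer can be illustrated with a one-level Haar transform. This is a minimal NumPy sketch: the patent does not specify the wavelet basis, so Haar is assumed here, with a simple averaging normalization rather than the conventional 1/√2 scaling:

```python
import numpy as np

def haar_dwt2d(x):
    """One-level 2-D Haar DWT: split a single-channel map into the
    approximation sub-band (LL) and three detail sub-bands (LH, HL, HH),
    each with half the spatial resolution of the input."""
    # Low-pass (average) and high-pass (difference) filtering over row pairs.
    a = (x[0::2, :] + x[1::2, :]) / 2.0
    d = (x[0::2, :] - x[1::2, :]) / 2.0
    # Repeat over column pairs to obtain the four sub-bands.
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

img = np.arange(16.0).reshape(4, 4)     # toy 4×4 single-channel "image"
ll, lh, hl, hh = haar_dwt2d(img)
print(ll.shape)  # (2, 2): H and W are both halved, as stated in the text
```

Because the averages and differences together retain all the input values, the transform is invertible, which is the lossless property the description relies on.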
Fig. 4 is a schematic diagram of the residual attention module of the present invention. The input is first processed by a single-layer residual module, giving a feature map of size H×W×C, where H is the height of the image, W is the width of the image, and C is the number of channels. It is then split into a trunk branch and a mask branch for processing. The trunk branch performs three residual operations, and the feature size remains unchanged at H×W×C. In the mask branch, the feature map is first reduced to half its size by max-pooling downsampling, giving size H/2×W/2×C; two residual operations are then applied, and bilinear-interpolation upsampling restores the original size H×W×C. Finally, after a series of normalization and convolution operations, a tensor with values between 0 and 1 and size H×W×C is obtained. This tensor is multiplied with the trunk-branch tensor and added to it; through this weighted fusion, the effect of enhancing important features is achieved.
In general, the module consists of two key components: a trunk branch and a mask branch. The trunk branch is responsible for extracting image features, while the mask branch learns an attention mask that soft-weights the output features. Specifically, the input of the module is split into two different paths after residual-layer processing. One path goes directly to the next layer; the other is weighted by the attention mechanism, multiplied with the first path, and added to it before entering the next layer, namely:
H_{i,c}(x) = (1 + M_{i,c}(x)) × F_{i,c}(x)
where x is the input, i ranges over all spatial locations, and c is the channel index (c ∈ {1, ..., C}). M(x) is the output of the mask branch, and F(x) is the original feature of the trunk branch.
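The soft weighting above can be sketched directly as a minimal NumPy illustration of H(x) = (1 + M(x)) × F(x); the feature and mask values below are hypothetical:

```python
import numpy as np

def residual_attention(features, mask):
    """Soft attention weighting H(x) = (1 + M(x)) * F(x): a mask value
    of 0 leaves the trunk feature untouched, so the mask branch can
    only emphasize features, never erase the identity signal."""
    return (1.0 + mask) * features

f = np.array([[1.0, 2.0], [3.0, 4.0]])    # trunk-branch features F(x)
m = np.array([[0.0, 0.5], [1.0, 0.25]])   # hypothetical mask M(x) in [0, 1]
h = residual_attention(f, m)
print(h[1, 0])  # 6.0
```

The `1 +` term is the residual part of the design: even a degenerate all-zero mask reduces the module to an identity mapping of the trunk features, which is why stacking it does not aggravate gradient vanishing.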
As can be seen from the simulation results in FIG. 5, the model designed by the invention achieves high recognition accuracy while remaining lightweight.
The above-described embodiments are merely illustrative of the technical solutions of the present invention, and should not be construed as limiting the present invention. While the invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes and substitutions may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (5)

1. A lightweight pest identification method based on the two-dimensional discrete wavelet transform, characterized in that the method comprises the following steps:
S1: performing spatial multi-scale feature fusion and downsampling on an input image using a two-dimensional discrete wavelet transform module, and performing feature extraction while the image scale remains unchanged;
S2: using a residual attention module to raise the network's attention to important features while alleviating the vanishing-gradient problem of the network, thereby improving the representation capability of the network;
S3: compressing the feature channels using global average pooling, reducing the size and complexity of the feature map, and improving the generalization ability of the model;
S4: finally, using the fully connected layer as a classification module to complete classification.
2. The lightweight pest identification method based on the two-dimensional discrete wavelet transform according to claim 1, characterized in that in step S1 the specific operation of the two-dimensional discrete wavelet transform module is as follows:
S1.1: extracting learnable, spatially invariant image features through a convolution layer;
S1.2: performing spatial token mixing and downsampling using the two-dimensional discrete wavelet transform, so as to perform scale-invariant feature extraction;
S1.3: performing channel mixing using a learnable multi-layer perceptron;
S1.4: restoring the spatial resolution of the feature map using a transposed convolution layer;
S1.5: finally, adjusting the output to the same format as the input using a batch normalization operation, and concatenating it with the input for output, thereby alleviating the vanishing-gradient problem of the network.
3. The lightweight pest identification method based on the two-dimensional discrete wavelet transform according to claim 1, characterized in that in step S2 the residual attention module is used to alleviate the vanishing-gradient problem of the network and raise the network's attention to important features, thereby improving the representation capability of the network, with the expression:
x_out = RAM(x_in)
where x_in denotes the input, x_out denotes the output, and RAM(·) denotes the residual attention module.
4. The lightweight pest identification method based on the two-dimensional discrete wavelet transform according to claim 1, characterized in that step S3 specifically comprises: compressing the feature channels using global average pooling, reducing the size and complexity of the feature map, and improving the generalization ability of the model, with the expression:
x_out = avgpool(x_in)
where avgpool(·) denotes global average pooling.
5. The lightweight pest identification method based on the two-dimensional discrete wavelet transform according to claim 1, characterized in that step S4 specifically comprises: using the fully connected layer as a classification module to complete classification, with the expression:
Pre = Linear(x_in)
where Linear(·) denotes the classification module; the processed one-dimensional vector is passed to the fully connected layer to obtain the final prediction result.
CN202310616466.3A 2023-05-29 2023-05-29 Lightweight pest identification method based on two-dimensional discrete wavelet transformation Pending CN116758415A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310616466.3A CN116758415A (en) 2023-05-29 2023-05-29 Lightweight pest identification method based on two-dimensional discrete wavelet transformation


Publications (1)

Publication Number Publication Date
CN116758415A (en) 2023-09-15

Family

ID=87954336

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310616466.3A Pending CN116758415A (en) 2023-05-29 2023-05-29 Lightweight pest identification method based on two-dimensional discrete wavelet transformation

Country Status (1)

Country Link
CN (1) CN116758415A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117290762A (en) * 2023-10-11 2023-12-26 北京邮电大学 Insect pest falling-in identification method, type identification method, device, insect trap and system
CN117290762B (en) * 2023-10-11 2024-04-02 北京邮电大学 Insect pest falling-in identification method, type identification method, device, insect trap and system


Legal Events

Date Code Title Description
PB01 Publication