CN116883364A - Apple leaf disease identification method based on CNN and Transformer - Google Patents

Apple leaf disease identification method based on CNN and Transformer

Info

Publication number
CN116883364A
CN116883364A (Application CN202310869642.4A)
Authority
CN
China
Prior art keywords
apple leaf
leaf disease
model
cnn
disease image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310869642.4A
Other languages
Chinese (zh)
Inventor
庞登浩
孟浩
王弘
黄林生
梁栋
刘家保
周向明
丁宇豪
吴修杨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University filed Critical Anhui University
Priority to CN202310869642.4A priority Critical patent/CN116883364A/en
Publication of CN116883364A publication Critical patent/CN116883364A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0499Feedforward networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/42Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30181Earth observation
    • G06T2207/30188Vegetation; Agriculture

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an apple leaf disease identification method based on CNN and Transformer, which comprises the following steps: collecting and processing apple leaf disease images; preprocessing the images in the initial apple leaf disease image samples to obtain initial feature maps, which form the apple leaf disease image training set; constructing an apple leaf disease image recognition model based on a CNN model and a Transformer model; inputting the apple leaf disease image training set into the apple leaf disease recognition model for training; acquiring and preprocessing an apple leaf disease image to be detected; and inputting the preprocessed apple leaf disease image to be detected into the trained apple leaf disease recognition model to obtain the apple leaf disease identification result. By fusing the Transformer model into the CNN model, the invention achieves accurate identification of apple leaf diseases in images and comprehensive modeling of both the global and the local information of apple leaf diseases.

Description

Apple leaf disease identification method based on CNN and Transformer
Technical Field
The invention relates to the technical field of agricultural disease and pest image processing, in particular to an apple leaf disease identification method based on CNN and Transformer.
Background
Crop disease refers to the attack of crops by various diseases and pests during agricultural production. Diseases may be caused by fungi, bacteria, viruses and other microorganisms, while pest damage is caused by insects, mites, worms and other pests. Each disease may exhibit different symptoms and characteristics at different developmental stages and under different environmental conditions, which makes accurate identification difficult.
The data on crop diseases and pests collected on farms are enormous and growing, involving large numbers of images and related information. Efficiently processing and managing such large-scale data, including their collection, storage, transmission and analysis, has become a challenge. In recent years, with the development of computer vision and machine learning, crop disease and pest monitoring and identification methods based on image recognition and data analysis have attracted attention. By using image processing and deep learning algorithms, images of crop diseases and pests can be automatically analyzed and identified, providing rapid and accurate detection results, helping farmers and crop protection workers take corresponding prevention and treatment measures in time, and reducing the losses caused by diseases and pests.
At present, existing apple leaf disease identification methods are mainly based on CNN models. Such models excel at extracting local features but have certain limitations in modeling global context information. To better utilize global context information, the Transformer model was introduced into the field of computer vision; its multi-head self-attention mechanism can model global context information more fully. However, the Transformer model is relatively weak at extracting local image features. Therefore, a method combining the CNN and Transformer models becomes the key to solving this problem, and at present no invention or research has combined CNN and Transformer models to solve the problem of identifying apple leaf diseases.
Disclosure of Invention
The invention aims to solve the problem of the low accuracy of traditional apple leaf disease detection methods, and provides an apple leaf disease identification method based on CNN and Transformer. Through dense connection and fusion of the CNN model and the Transformer model, feature information is fully propagated and reused, and local features and global context information are comprehensively utilized, thereby improving the accuracy of crop disease and pest identification.
In order to achieve the above purpose, the present invention adopts the following technical scheme: a method for identifying apple leaf diseases based on CNN and Transformer, which comprises the following steps in sequence:
(1) Collecting and processing apple leaf disease images to obtain an initial apple leaf disease image sample;
(2) Preprocessing an image in an initial apple leaf disease image sample to obtain an initial feature map, wherein the initial feature map forms an apple leaf disease image training set;
(3) Constructing an apple leaf disease image recognition model based on a CNN model and a Transformer model, wherein the apple leaf disease image recognition model consists of a CNN branch model and a Transformer branch model;
(4) Inputting the apple leaf disease image training set into an apple leaf disease recognition model for training to obtain a trained apple leaf disease recognition model;
(5) Acquiring and preprocessing an apple leaf disease image to be detected;
(6) Inputting the preprocessed apple leaf disease image to be detected into the trained apple leaf disease identification model to obtain the apple leaf disease identification result.
The step (1) specifically refers to: acquiring apple leaf disease images under real backgrounds, and generating diversified image data by image enhancement methods of random flipping, random color enhancement and noise addition to obtain initial apple leaf disease image samples.
In step (2), the preprocessing includes convolution and pooling operations.
In step (3), the construction of the CNN branch model includes the following steps:
(3a) Setting a CNN branch model as a four-layer structure:
the first layer of the CNN branch model is set to be composed of three cascaded residual modules, wherein the last residual module is responsible for compressing the picture size and expanding the dimension;
setting a second layer of the CNN branch model to be composed of four cascaded residual modules, and expanding the dimension at the last residual module;
setting a third layer of the CNN branch model to be composed of three cascaded residual modules, and expanding the dimension at the last residual module;
setting the fourth layer of the CNN branch model to extract the final feature map through a residual module;
the residual modules used in the four layers of the CNN branch model are identical;
(3b) Setting the residual module:
the dimension of the input is reduced using a down-projection convolution with a convolution kernel size of 1 x 1;
feature extraction is performed using a spatial convolution with a convolution kernel size of 3 x 3, and then the dimension is restored using an up-projection convolution with a convolution kernel size of 1 x 1;
an identity mapping is applied between the input and the output using a skip connection.
In step (3), the construction of the Transformer branch model specifically refers to:
designing a multi-head self-attention module to obtain the context information of each position;
the initial feature map is mapped into Q, K and V vectors through linear projection, each head executes a self-attention function on its projected vectors to obtain an output, and finally the head outputs are concatenated and projected again to obtain the final output value:
MultiHead(Q, K, V) = Concat(head_1, ..., head_h) W^O, where head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V), i = 1, ..., h
wherein W_i^Q, W_i^K, W_i^V and W^O are linear projection parameter matrices, Concat denotes the concatenation of vectors, h denotes the number of heads, head_1 denotes the self-attention output of the 1st head, and head_h denotes the self-attention output of the h-th head;
the input of the self-attention module consists of Q and K vectors of dimension d_k and a V vector of dimension d_v; the dot product of the Q and K vectors is first computed and divided by √(d_k), a softmax function is then applied to obtain the weights of the V vector, and finally the weights are multiplied by the V vector to obtain the output of the self-attention module; the specific calculation formula is:
Attention(Q, K, V) = softmax(Q K^T / √(d_k)) V
wherein d_k denotes the dimension of the K vector;
the multi-head self-attention module obtains the final multi-head attention representation by linearly transforming and concatenating the outputs of the multiple attention heads.
The step (4) specifically comprises the following steps:
(4a) Forward propagation: inputting an apple leaf disease image training set, and carrying out forward propagation through an apple leaf disease image recognition model;
(4b) Calculating the loss from the loss function:
wherein y_i denotes the i-th element of the probability distribution vector of the real label, p_i denotes the i-th element of the predicted probability distribution vector of the apple leaf disease image recognition model, and N denotes the number of elements;
(4c) Back propagation and parameter update: according to the loss result, back propagation is carried out, gradients are calculated, and the parameters of the apple leaf disease image recognition model are optimized:
gradient calculation: differentiating the loss function with respect to the parameters to obtain the gradient of each parameter; the parameters refer to the weights of the apple leaf disease image recognition model;
parameter updating: updating the weights and biases of the apple leaf disease image recognition model using a gradient descent optimization algorithm;
(4d) Repeating the training steps: repeating steps (4a) to (4c), continuously inputting the apple leaf disease image training set, and performing forward propagation, loss calculation, back propagation and parameter updating until the loss converges, so as to obtain the weights with the best prediction effect.
According to the above technical scheme, the beneficial effects of the invention are as follows: firstly, the method realizes accurate identification of apple leaf diseases in images by fusing the Transformer model into the CNN model; secondly, the invention uses a Transformer model with a multi-head self-attention mechanism to enhance the model's ability to capture global context information, and extracts local features with the CNN model, thereby realizing comprehensive modeling of both the global and the local information of apple leaf diseases; thirdly, the Transformer model attends to spatial information at global positions through the multi-head self-attention module, which improves the modeling capacity for global visual information, while the local features extracted by the CNN model are continuously fed back into the Transformer model to enrich its local detail information; fourthly, to further enhance feature propagation and feature reuse, the invention also introduces a dense connection mechanism, which reduces information loss during network transmission and provides higher robustness to complex backgrounds, occlusion and other conditions.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIGS. 2, 3 and 4 are schematic structural diagrams of the Transformer module, the multi-head self-attention module and the self-attention module of the Transformer branch model, respectively;
FIG. 5 is a schematic structural diagram of the apple leaf disease image recognition model of the invention;
FIGS. 6, 7 and 8 are the original image, the shallow-feature heat map, and the deep-feature heat map, respectively.
Detailed Description
As shown in fig. 1, a method for identifying apple leaf diseases based on CNN and Transformer comprises the following steps in sequence:
(1) Collecting and processing apple leaf disease images to obtain an initial apple leaf disease image sample;
(2) Preprocessing an image in an initial apple leaf disease image sample to obtain an initial feature map, wherein the initial feature map forms an apple leaf disease image training set;
(3) Constructing an apple leaf disease image recognition model based on a CNN model and a Transformer model, wherein the apple leaf disease image recognition model consists of a CNN branch model and a Transformer branch model; in the invention, the CNN branch model is a CNN model and the Transformer branch model is a Transformer model; here, the CNN model is a convolutional neural network model;
(4) Inputting the apple leaf disease image training set into an apple leaf disease recognition model for training to obtain a trained apple leaf disease recognition model;
(5) Acquiring and preprocessing an apple leaf disease image to be detected;
(6) Inputting the preprocessed apple leaf disease image to be detected into the trained apple leaf disease identification model to obtain the apple leaf disease identification result.
The step (1) specifically refers to: acquiring apple leaf disease images under real backgrounds, and generating diversified image data by image enhancement methods of random flipping, random color enhancement and noise addition to obtain initial apple leaf disease image samples.
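For step (1), a minimal augmentation sketch in PyTorch/torchvision is given below, assuming standard implementations of random flipping, random color enhancement and Gaussian noise addition; the probabilities, jitter strengths and the add_gaussian_noise helper are illustrative assumptions rather than parameters specified by the invention.

```python
import torch
from torchvision import transforms

def add_gaussian_noise(img, std=0.02):
    """Add zero-mean Gaussian noise to a [0, 1] image tensor (std is an assumed value)."""
    return torch.clamp(img + torch.randn_like(img) * std, 0.0, 1.0)

# Random flipping, random color enhancement and noise addition, as described in step (1).
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor(),                  # PIL image -> float tensor in [0, 1]
    transforms.Lambda(add_gaussian_noise),  # noise addition
])
```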
In step (2), the preprocessing includes convolution and pooling operations.
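A minimal sketch of such a preprocessing stem is shown below; the kernel sizes, strides and channel count are assumptions for illustration, since the text only states that convolution and pooling are applied to obtain the initial feature map.

```python
import torch
import torch.nn as nn

# Convolution + pooling stem producing the initial feature map (step (2)); sizes are assumed.
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)

image = torch.randn(1, 3, 224, 224)   # one RGB apple leaf image (assumed input size)
initial_feature_map = stem(image)     # -> shape (1, 64, 56, 56)
```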
As shown in fig. 2, 3, 4 and 5, in step (3), the construction of the CNN branch model includes the following steps:
(3a) Setting a CNN branch model as a four-layer structure:
the first layer of the CNN branch model is set to be composed of three cascaded residual modules, wherein the last residual module is responsible for compressing the picture size and expanding the dimension;
setting a second layer of the CNN branch model to be composed of four cascaded residual modules, and expanding the dimension at the last residual module;
setting a third layer of the CNN branch model to be composed of three cascaded residual modules, and expanding the dimension at the last residual module;
setting the fourth layer of the CNN branch model to extract the final feature map through a residual module;
the residual modules used in the four layers of the CNN branch model are identical;
(3b) Setting the residual module:
the dimension of the input is reduced using a down-projection convolution with a convolution kernel size of 1 x 1;
feature extraction is performed using a spatial convolution with a convolution kernel size of 3 x 3, and then the dimension is restored using an up-projection convolution with a convolution kernel size of 1 x 1;
an identity mapping is applied between the input and the output using a skip connection.
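The residual module of (3b) corresponds to a bottleneck block: a 1 x 1 down-projection, a 3 x 3 spatial convolution, a 1 x 1 up-projection, and a skip connection. The following PyTorch sketch illustrates this structure and the stacking of residual modules per layer; the channel widths, normalization and activation choices are assumptions not fixed by the text.

```python
import torch.nn as nn

class ResidualModule(nn.Module):
    """Bottleneck residual module: 1x1 down-projection, 3x3 spatial conv, 1x1 up-projection, skip connection."""
    def __init__(self, in_ch, mid_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=1, bias=False),                             # down-projection
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, kernel_size=3, stride=stride, padding=1, bias=False),  # spatial convolution
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, kernel_size=1, bias=False),                            # up-projection
            nn.BatchNorm2d(out_ch),
        )
        # Skip connection: identity when shapes match, otherwise a 1x1 projection.
        self.skip = (nn.Identity() if stride == 1 and in_ch == out_ch
                     else nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride, bias=False))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + self.skip(x))

def make_layer(n_blocks, in_ch, out_ch, stride=2):
    """Cascade of residual modules; the last module compresses the picture size and expands the dimension."""
    blocks = [ResidualModule(in_ch, in_ch // 4, in_ch) for _ in range(n_blocks - 1)]
    blocks.append(ResidualModule(in_ch, in_ch // 4, out_ch, stride=stride))
    return nn.Sequential(*blocks)

# Four-layer CNN branch with 3, 4, 3 and 1 residual modules per layer (channel widths are assumed).
cnn_branch = nn.Sequential(
    make_layer(3, 64, 128), make_layer(4, 128, 256),
    make_layer(3, 256, 512), ResidualModule(512, 128, 512),
)
```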
In step (3), the construction of the Transformer branch model specifically refers to:
designing a multi-head self-attention module to obtain the context information of each position;
the initial feature map is mapped into Q, K and V vectors through linear projection, each head executes a self-attention function on its projected vectors to obtain an output, and finally the head outputs are concatenated and projected again to obtain the final output value:
MultiHead(Q, K, V) = Concat(head_1, ..., head_h) W^O, where head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V), i = 1, ..., h
wherein W_i^Q, W_i^K, W_i^V and W^O are linear projection parameter matrices, Concat denotes the concatenation of vectors, h denotes the number of heads, head_1 denotes the self-attention output of the 1st head, and head_h denotes the self-attention output of the h-th head;
the input of the self-attention module consists of Q and K vectors of dimension d_k and a V vector of dimension d_v; the dot product of the Q and K vectors is first computed and divided by √(d_k), a softmax function is then applied to obtain the weights of the V vector, and finally the weights are multiplied by the V vector to obtain the output of the self-attention module; the specific calculation formula is:
Attention(Q, K, V) = softmax(Q K^T / √(d_k)) V
wherein d_k denotes the dimension of the K vector;
the multi-head self-attention module obtains the final multi-head attention representation by linearly transforming and concatenating the outputs of the multiple attention heads.
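The multi-head self-attention described above can be sketched as follows; the embedding dimension and the number of heads are assumed values, and the class is a generic illustration rather than the exact module of the invention.

```python
import math
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    """Per head: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V; heads are concatenated and projected by W^O."""
    def __init__(self, dim=384, num_heads=6):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads, self.d_k = num_heads, dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)   # linear projections producing Q, K and V for all heads
        self.proj = nn.Linear(dim, dim)      # output projection W^O

    def forward(self, x):                    # x: (batch, tokens, dim)
        B, N, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # split into heads: (batch, heads, tokens, d_k)
        q, k, v = (t.reshape(B, N, self.num_heads, self.d_k).transpose(1, 2) for t in (q, k, v))
        scores = (q @ k.transpose(-2, -1)) / math.sqrt(self.d_k)  # Q K^T / sqrt(d_k)
        weights = scores.softmax(dim=-1)                          # softmax weights applied to V
        out = (weights @ v).transpose(1, 2).reshape(B, N, C)      # concatenate the heads
        return self.proj(out)                                     # apply W^O

tokens = torch.randn(2, 196, 384)              # tokens from the initial feature map (assumed shape)
print(MultiHeadSelfAttention()(tokens).shape)  # torch.Size([2, 196, 384])
```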
The step (4) specifically comprises the following steps:
(4a) Forward propagation: inputting an apple leaf disease image training set, and carrying out forward propagation through an apple leaf disease image recognition model;
(4b) Calculating the loss from the loss function (a cross-entropy form consistent with the definitions below is sketched after step (4d)):
wherein y_i denotes the i-th element of the probability distribution vector of the real label, p_i denotes the i-th element of the predicted probability distribution vector of the apple leaf disease image recognition model, and N denotes the number of elements;
(4c) Back propagation and parameter update: according to the loss result, back propagation is carried out, gradients are calculated, and the parameters of the apple leaf disease image recognition model are optimized:
gradient calculation: differentiating the loss function with respect to the parameters to obtain the gradient of each parameter; the parameters refer to the weights of the apple leaf disease image recognition model;
parameter updating: updating the weights and biases of the apple leaf disease image recognition model using a gradient descent optimization algorithm;
(4d) Repeating the training steps: repeating steps (4a) to (4c), continuously inputting the apple leaf disease image training set, and performing forward propagation, loss calculation, back propagation and parameter updating until the loss converges, so as to obtain the weights with the best prediction effect.
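The loss-function figure is not reproduced in the text above; a cross-entropy form consistent with the stated definitions of y_i, p_i and N would be (an assumption inferred from those definitions, not a formula quoted from the patent):

```latex
L = -\sum_{i=1}^{N} y_i \log(p_i)
```

A minimal training-loop sketch for steps (4a)-(4d) is given below; the model, data loader, optimizer choice and device are placeholder assumptions.

```python
import torch
import torch.nn as nn

def train(model, train_loader, epochs, lr=1e-3, device="cpu"):
    """(4a) forward propagation, (4b) loss calculation, (4c) back propagation and parameter update, (4d) repeat."""
    model.to(device)
    criterion = nn.CrossEntropyLoss()                        # cross-entropy over the predicted distribution
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)   # any gradient-descent optimizer
    for epoch in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            logits = model(images)             # (4a) forward propagation
            loss = criterion(logits, labels)   # (4b) loss calculation
            optimizer.zero_grad()
            loss.backward()                    # (4c) back propagation: compute gradients
            optimizer.step()                   # (4c) parameter update
        # (4d) repeat until the loss converges; keep the weights with the best prediction effect
```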
The effect of the invention can be illustrated by the following simulation experiment:
1. Experimental conditions
The data used for the experiment is the public dataset Plant Pathology 2021-FGVC8, a published dataset covering three common apple leaf diseases: apple scab, apple gray spot, and apple rust. In order to improve the generalization ability of the model, the dataset is expanded to 21142 pictures by rotation, flipping, cropping, color transformation and other operations; next, the dataset is divided into a training set, a validation set and a test set at a ratio of 7:2:1. Finally, the expanded dataset is used to train and test the model.
2. Experimental steps:
(1) Inputting the expanded dataset into the apple leaf disease image recognition model;
(2) Setting the optimizer to AdamW, setting the initial learning rate to 0.001, training for 300 epochs, and applying cosine learning-rate decay (a configuration sketch is given after this list);
(3) Saving the weights of the epoch with the best training result;
(4) Loading the weights saved in step (3), and inputting the apple leaf disease images to be predicted into the apple leaf disease image recognition model for testing;
(5) Outputting the model prediction result.
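A configuration sketch for steps (2) and (3) above (AdamW, initial learning rate 0.001, 300 epochs, cosine learning-rate decay, saving the best weights); the weight-decay value, the checkpoint path and the evaluate helper are illustrative assumptions.

```python
import torch
import torch.nn as nn

def fit(model, train_loader, val_loader, evaluate, device="cuda"):
    """Train with AdamW + cosine decay for 300 epochs and save the weights of the best epoch."""
    model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.05)  # weight_decay is assumed
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=300)   # cosine decay over 300 epochs
    criterion = nn.CrossEntropyLoss()
    best_acc = 0.0
    for epoch in range(300):
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            loss = criterion(model(images), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()
        acc = evaluate(model, val_loader)      # user-supplied validation helper (assumed)
        if acc > best_acc:                     # save the weights of the best epoch
            best_acc = acc
            torch.save(model.state_dict(), "best_weights.pt")
```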
3. Experimental results
As shown in fig. 6, 7 and 8, compared with the currently mainstream CNN and Transformer models, the model proposed by the present invention achieves higher accuracy than other advanced recognition models while maintaining a lower parameter count.
Table 1 Comparison of the accuracy results of the present invention with other tested models (unit: %)

Model            Number of parameters    Accuracy
ResNet50         25.5M                   88.37
ResNext50        25.0M                   94.15
EfficientNetB5   28.4M                   98.95
Deit-small       21.6M                   95.92
Twins-SVT-S      24.1M                   99.16
The invention    20.4M                   99.69
In conclusion, the invention realizes accurate identification of apple leaf diseases in images by fusing the Transformer model into the CNN model; the invention uses a Transformer model with a multi-head self-attention mechanism to enhance the model's ability to capture global context information, and extracts local features with the CNN model, thereby realizing comprehensive modeling of both the global and the local information of apple leaf diseases; the Transformer model attends to spatial information at global positions through the multi-head self-attention module, which improves the modeling capacity for global visual information, while the local features extracted by the CNN model are continuously fed back into the Transformer model to enrich its local detail information; finally, to further enhance feature propagation and feature reuse, the invention also introduces a dense connection mechanism, which reduces information loss during network transmission and provides higher robustness to complex backgrounds, occlusion and other conditions.
While the foregoing describes the basic principles and embodiments of the present invention, it should be noted that the embodiments of the present invention are not limited to the above examples, and that any modifications, equivalents, etc. may be made without departing from the scope of the principles of the present invention, and such changes and modifications are intended to be included within the scope of the present invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (6)

1. A method for identifying apple leaf diseases based on CNN and Transformer, characterized in that the method comprises the following steps in sequence:
(1) Collecting and processing apple leaf disease images to obtain an initial apple leaf disease image sample;
(2) Preprocessing an image in an initial apple leaf disease image sample to obtain an initial feature map, wherein the initial feature map forms an apple leaf disease image training set;
(3) Constructing an apple leaf disease image recognition model based on a CNN model and a Transformer model, wherein the apple leaf disease image recognition model consists of a CNN branch model and a Transformer branch model;
(4) Inputting the apple leaf disease image training set into an apple leaf disease recognition model for training to obtain a trained apple leaf disease recognition model;
(5) Acquiring and preprocessing an apple leaf disease image to be detected;
(6) Inputting the preprocessed apple leaf disease image to be detected into the trained apple leaf disease identification model to obtain the apple leaf disease identification result.
2. The CNN and Transformer based apple leaf disease identification method of claim 1, wherein: the step (1) specifically refers to: acquiring apple leaf disease images under real backgrounds, and generating diversified image data by image enhancement methods of random flipping, random color enhancement and noise addition to obtain initial apple leaf disease image samples.
3. The CNN and Transformer based apple leaf disease identification method of claim 1, wherein: in step (2), the preprocessing includes convolution and pooling operations.
4. The CNN and Transformer based apple leaf disease identification method of claim 1, wherein: in step (3), the construction of the CNN branch model includes the following steps:
(3a) Setting a CNN branch model as a four-layer structure:
the first layer of the CNN branch model is set to be composed of three cascaded residual modules, wherein the last residual module is responsible for compressing the picture size and expanding the dimension;
setting a second layer of the CNN branch model to be composed of four cascaded residual modules, and expanding the dimension at the last residual module;
setting a third layer of the CNN branch model to be composed of three cascaded residual modules, and expanding the dimension at the last residual module;
setting the fourth layer of the CNN branch model to extract the final feature map through a residual module;
the residual modules used in the four layers of the CNN branch model are identical;
(3b) Setting the residual module:
the dimension of the input is reduced using a down-projection convolution with a convolution kernel size of 1 x 1;
feature extraction is performed using a spatial convolution with a convolution kernel size of 3 x 3, and then the dimension is restored using an up-projection convolution with a convolution kernel size of 1 x 1;
an identity mapping is applied between the input and the output using a skip connection.
5. The CNN and Transformer based apple leaf disease identification method of claim 1, wherein: in step (3), the construction of the Transformer branch model specifically refers to:
designing a multi-head self-attention module to obtain the context information of each position;
the initial feature map is mapped into Q, K and V vectors through linear projection, each head executes a self-attention function on its projected vectors to obtain an output, and finally the head outputs are concatenated and projected again to obtain the final output value:
MultiHead(Q, K, V) = Concat(head_1, ..., head_h) W^O, where head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V), i = 1, ..., h
wherein W_i^Q, W_i^K, W_i^V and W^O are linear projection parameter matrices, Concat denotes the concatenation of vectors, h denotes the number of heads, head_1 denotes the self-attention output of the 1st head, and head_h denotes the self-attention output of the h-th head;
the input of the self-attention module consists of Q and K vectors of dimension d_k and a V vector of dimension d_v; the dot product of the Q and K vectors is first computed and divided by √(d_k), a softmax function is then applied to obtain the weights of the V vector, and finally the weights are multiplied by the V vector to obtain the output of the self-attention module; the specific calculation formula is:
Attention(Q, K, V) = softmax(Q K^T / √(d_k)) V
wherein d_k denotes the dimension of the K vector;
the multi-head self-attention module obtains the final multi-head attention representation by linearly transforming and concatenating the outputs of the multiple attention heads.
6. The CNN and Transformer based apple leaf disease identification method of claim 1, wherein: the step (4) specifically comprises the following steps:
(4a) Forward propagation: inputting an apple leaf disease image training set, and carrying out forward propagation through an apple leaf disease image recognition model;
(4b) Calculating the loss from the loss function:
wherein y_i denotes the i-th element of the probability distribution vector of the real label, p_i denotes the i-th element of the predicted probability distribution vector of the apple leaf disease image recognition model, and N denotes the number of elements;
(4c) Back propagation and parameter update: according to the loss result, back propagation is carried out, gradients are calculated, and the parameters of the apple leaf disease image recognition model are optimized:
gradient calculation: differentiating the loss function with respect to the parameters to obtain the gradient of each parameter; the parameters refer to the weights of the apple leaf disease image recognition model;
parameter updating: updating the weights and biases of the apple leaf disease image recognition model using a gradient descent optimization algorithm;
(4d) Repeating the training steps: repeating steps (4a) to (4c), continuously inputting the apple leaf disease image training set, and performing forward propagation, loss calculation, back propagation and parameter updating until the loss converges, so as to obtain the weights with the best prediction effect.
CN202310869642.4A 2023-07-17 2023-07-17 Apple leaf disease identification method based on CNN and Transformer Pending CN116883364A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310869642.4A CN116883364A (en) 2023-07-17 2023-07-17 Apple leaf disease identification method based on CNN and Transformer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310869642.4A CN116883364A (en) 2023-07-17 2023-07-17 Apple leaf disease identification method based on CNN and Transformer

Publications (1)

Publication Number Publication Date
CN116883364A true CN116883364A (en) 2023-10-13

Family

ID=88265780

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310869642.4A Pending CN116883364A (en) 2023-07-17 2023-07-17 Apple leaf disease identification method based on CNN and Transformer

Country Status (1)

Country Link
CN (1) CN116883364A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117576467A (en) * 2023-11-22 2024-02-20 安徽大学 Crop disease image identification method integrating frequency domain and spatial domain information
CN118314144A (en) * 2024-06-11 2024-07-09 江西农业大学 Plant leaf disease identification method and system based on depth intensive residual error module

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115249329A (en) * 2022-07-16 2022-10-28 江苏师范大学 Apple leaf disease detection method based on deep learning
CN115273072A (en) * 2022-06-13 2022-11-01 南京林业大学 Apple leaf disease detection method based on improved Yolov5s model
CN115620146A (en) * 2022-11-07 2023-01-17 无锡学院 Crop leaf disease detection method based on Transformer

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115273072A (en) * 2022-06-13 2022-11-01 南京林业大学 Apple leaf disease detection method based on improved Yolov5s model
CN115249329A (en) * 2022-07-16 2022-10-28 江苏师范大学 Apple leaf disease detection method based on deep learning
CN115620146A (en) * 2022-11-07 2023-01-17 无锡学院 Crop leaf disease detection method based on Transformer

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
M N AHIL ET AL: "Apple and Grape Leaf Disease Classification using MLP and CNN", 2021 INTERNATIONAL CONFERENCE ON ADVANCEMENTS IN ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTING AND AUTOMATION (ICAECA), 18 January 2022 (2022-01-18) *
XIAOPENG LI ET AL: "Transformer Help CNN See Better: A Lightweight Hybrid Apple Disease Identification Model Based on Transformers", AGRICULTURE, 19 June 2022 (2022-06-19) *
徐艳蕾等: "基于Transformer的强泛化苹果叶片病害识别模型", 农业工程学报, vol. 38, no. 16, 31 August 2022 (2022-08-31), pages 198 - 206 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117576467A (en) * 2023-11-22 2024-02-20 安徽大学 Crop disease image identification method integrating frequency domain and spatial domain information
CN117576467B (en) * 2023-11-22 2024-04-26 安徽大学 Crop disease image identification method integrating frequency domain and spatial domain information
CN118314144A (en) * 2024-06-11 2024-07-09 江西农业大学 Plant leaf disease identification method and system based on depth intensive residual error module
CN118314144B (en) * 2024-06-11 2024-08-06 江西农业大学 Plant leaf disease identification method and system based on depth intensive residual error module

Similar Documents

Publication Publication Date Title
CN110532900B (en) Facial expression recognition method based on U-Net and LS-CNN
CN105678284B (en) A kind of fixed bit human body behavior analysis method
CN111723738B (en) Coal rock chitin group microscopic image classification method and system based on transfer learning
CN114092832B (en) High-resolution remote sensing image classification method based on parallel hybrid convolutional network
CN111696101A (en) Light-weight solanaceae disease identification method based on SE-Inception
CN116883364A (en) Apple leaf disease identification method based on CNN and Transformer
CN112070768B (en) Anchor-Free based real-time instance segmentation method
Hassan et al. Plant seedlings classification using transfer learning
CN112749675A (en) Potato disease identification method based on convolutional neural network
CN115966010A (en) Expression recognition method based on attention and multi-scale feature fusion
CN111368637A (en) Multi-mask convolution neural network-based object recognition method for transfer robot
CN114898359B (en) Litchi plant diseases and insect pests detection method based on improvement EFFICIENTDET
Mahbub et al. Detect bangladeshi mango leaf diseases using lightweight convolutional neural network
CN112329771A (en) Building material sample identification method based on deep learning
CN117876832A (en) Pest detection method and model integrating local and global attention
CN114170657A (en) Facial emotion recognition method integrating attention mechanism and high-order feature representation
CN113780335B (en) Small sample commodity image classification method, device, equipment and storage medium
CN115439842A (en) Mulberry sclerotinia severity detection method based on deep learning
CN114463741A (en) Litchi disease and insect pest identification method based on deep learning
CN114627496A (en) Robust pedestrian re-identification method based on depolarization batch normalization of Gaussian process
Deng et al. Image Classification Method of Longhorn Beetles of Yunnan Based on Bagging and CNN
CN113887653A (en) Positioning method and system for tightly-coupled weak supervised learning based on ternary network
Hussein et al. Semantic segmentation of aerial images using u-net architecture
Sun et al. Tobacco-disease image recognition via multiple-attention classification network
CN114842300B (en) Crop pest detection method suitable for rainy day environment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination