CN116883364A - Apple leaf disease identification method based on CNN and Transformer - Google Patents

Apple leaf disease identification method based on CNN and Transformer

Info

Publication number
CN116883364A
CN116883364A (Application CN202310869642.4A)
Authority
CN
China
Prior art keywords
apple leaf
leaf disease
model
cnn
disease image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310869642.4A
Other languages
Chinese (zh)
Inventor
庞登浩
孟浩
王弘
黄林生
梁栋
刘家保
周向明
丁宇豪
吴修杨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University filed Critical Anhui University
Priority to CN202310869642.4A priority Critical patent/CN116883364A/en
Publication of CN116883364A publication Critical patent/CN116883364A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0499Feedforward networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/42Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30181Earth observation
    • G06T2207/30188Vegetation; Agriculture

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an apple leaf disease identification method based on CNN and Transformer, which comprises the following steps: collecting and processing apple leaf disease images; preprocessing the images in the initial apple leaf disease image samples to obtain initial feature maps, which form the apple leaf disease image training set; constructing an apple leaf disease image recognition model based on a CNN model and a Transformer model; inputting the apple leaf disease image training set into the apple leaf disease recognition model for training; acquiring and preprocessing an apple leaf disease image to be detected; and inputting the preprocessed apple leaf disease image to be detected into the trained apple leaf disease recognition model to obtain the apple leaf disease identification result. By fusing the Transformer model into the CNN model, the invention achieves accurate identification of apple leaf diseases in images and comprehensive modeling of both the global and the local information of apple leaf diseases.

Description

Apple leaf disease identification method based on CNN and Transformer
Technical Field
The invention relates to the technical field of agricultural disease and pest image processing, in particular to an apple leaf disease identification method based on CNN and Transformer.
Background
Crop disease refers to the attack of crops by various diseases and pests during agricultural production. Diseases may be caused by fungi, bacteria, viruses and other microorganisms, while pest damage is caused by insects, mites, worms and other pests. Each disease may exhibit different symptoms and characteristics at different developmental stages and under different environmental conditions, which makes accurate identification difficult.
The data on crop diseases and pests collected on farms are enormous and growing, involving large numbers of images and related information. Efficiently processing and managing such large-scale data, including their collection, storage, transmission and analysis, has become a challenge. In recent years, with the development of computer vision and machine learning, crop disease and pest monitoring and identification methods based on image recognition and data analysis have attracted attention. By using image processing and deep learning algorithms, images of crop diseases and pests can be automatically analyzed and identified, providing rapid and accurate detection results, helping farmers and crop protection workers take corresponding prevention and treatment measures in time, and reducing the losses caused by diseases and pests.
At present, existing apple leaf disease identification methods are mainly based on CNN models. Such models excel at extracting local features but have certain limitations in modeling global context information. To better utilize global context information, the Transformer model was introduced into the field of computer vision; its multi-head self-attention mechanism can model global context information more fully. However, the Transformer model is relatively weak at extracting local image features. Therefore, a method combining the CNN and Transformer models becomes the key to solving this problem, and at present no invention or research has combined CNN and Transformer models to solve the problem of identifying apple leaf diseases.
Disclosure of Invention
The invention aims to solve the problem of the low accuracy of traditional apple leaf disease detection methods, and provides an apple leaf disease identification method based on CNN and Transformer. Through dense connection and fusion of the CNN model and the Transformer model, feature information is fully propagated and reused, and local features and global context information are comprehensively utilized, thereby improving the accuracy of crop disease and pest identification.
In order to achieve the above purpose, the present invention adopts the following technical scheme: a method for identifying apple leaf diseases based on CNN and Transformer, which comprises the following steps in sequence:
(1) Collecting and processing apple leaf disease images to obtain an initial apple leaf disease image sample;
(2) Preprocessing an image in an initial apple leaf disease image sample to obtain an initial feature map, wherein the initial feature map forms an apple leaf disease image training set;
(3) Constructing an apple leaf disease image recognition model based on a CNN model and a Transformer model, wherein the apple leaf disease image recognition model consists of a CNN branch model and a Transformer branch model;
(4) Inputting the apple leaf disease image training set into an apple leaf disease recognition model for training to obtain a trained apple leaf disease recognition model;
(5) Acquiring and preprocessing an apple leaf disease image to be detected;
(6) Inputting the preprocessed apple leaf disease image to be detected into the trained apple leaf disease identification model to obtain the apple leaf disease identification result.
The step (1) specifically refers to: acquiring apple leaf disease images under real backgrounds, and generating diversified image data by image enhancement methods of random flipping, random color enhancement and noise addition to obtain initial apple leaf disease image samples.
In step (2), the preprocessing includes convolution and pooling operations.
In step (3), the construction of the CNN branch model includes the following steps:
(3a) Setting a CNN branch model as a four-layer structure:
the first layer of the CNN branch model is set to be composed of three cascaded residual modules, wherein the last residual module is responsible for compressing the picture size and expanding the dimension;
setting a second layer of the CNN branch model to be composed of four cascaded residual modules, and expanding the dimension at the last residual module;
setting a third layer of the CNN branch model to be composed of three cascaded residual modules, and expanding the dimension at the last residual module;
setting the fourth layer of the CNN branch model to extract the final feature map through a residual module;
the residual modules used in the four layers of the CNN branch model are identical;
(3b) Setting the residual module:
the dimension of the input is reduced using a down-projection convolution with a convolution kernel size of 1 x 1;
feature extraction is performed using a spatial convolution with a convolution kernel size of 3 x 3, and then the dimension is restored using an up-projection convolution with a convolution kernel size of 1 x 1;
an identity mapping is applied between the input and the output using a skip connection.
In step (3), the construction of the Transformer branch model specifically refers to:
designing a multi-head self-attention module to obtain the context information of each position;
the initial feature map is mapped into Q, K and V vectors through linear projection, each head executes a self-attention function on its projected vectors to obtain an output, and finally the head outputs are concatenated and projected again to obtain the final output value:
MultiHead(Q, K, V) = Concat(head_1, ..., head_h) W^O, where head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V), i = 1, ..., h
wherein W_i^Q, W_i^K, W_i^V and W^O are linear projection parameter matrices, Concat denotes the concatenation of vectors, h denotes the number of heads, head_1 denotes the self-attention output of the 1st head, and head_h denotes the self-attention output of the h-th head;
the input of the self-attention module consists of Q and K vectors of dimension d_k and a V vector of dimension d_v; the dot product of the Q and K vectors is first computed and divided by √(d_k), a softmax function is then applied to obtain the weights of the V vector, and finally the weights are multiplied by the V vector to obtain the output of the self-attention module; the specific calculation formula is:
Attention(Q, K, V) = softmax(Q K^T / √(d_k)) V
wherein d_k denotes the dimension of the K vector;
the multi-head self-attention module obtains the final multi-head attention representation by linearly transforming and concatenating the outputs of the multiple attention heads.
The step (4) specifically comprises the following steps:
(4a) Forward propagation: inputting an apple leaf disease image training set, and carrying out forward propagation through an apple leaf disease image recognition model;
(4b) Calculating the loss from the loss function:
wherein y_i denotes the i-th element of the probability distribution vector of the real label, p_i denotes the i-th element of the predicted probability distribution vector of the apple leaf disease image recognition model, and N denotes the number of elements;
(4c) Back propagation and parameter update: according to the loss result, back propagation is carried out, gradients are calculated, and the parameters of the apple leaf disease image recognition model are optimized:
gradient calculation: differentiating the loss function with respect to the parameters to obtain the gradient of each parameter; the parameters refer to the weights of the apple leaf disease image recognition model;
parameter updating: updating the weights and biases of the apple leaf disease image recognition model using a gradient descent optimization algorithm;
(4d) Repeating the training steps: repeating steps (4a) to (4c), continuously inputting the apple leaf disease image training set, and performing forward propagation, loss calculation, back propagation and parameter updating until the loss converges, so as to obtain the weights with the best prediction effect.
According to the above technical scheme, the beneficial effects of the invention are as follows: firstly, the method realizes accurate identification of apple leaf diseases in images by fusing the Transformer model into the CNN model; secondly, the invention uses a Transformer model with a multi-head self-attention mechanism to enhance the model's ability to capture global context information, and extracts local features with the CNN model, thereby realizing comprehensive modeling of both the global and the local information of apple leaf diseases; thirdly, the Transformer model attends to spatial information at global positions through the multi-head self-attention module, which improves the modeling capacity for global visual information, while the local features extracted by the CNN model are continuously fed back into the Transformer model to enrich its local detail information; fourthly, to further enhance feature propagation and feature reuse, the invention also introduces a dense connection mechanism, which reduces information loss during network transmission and provides higher robustness to complex backgrounds, occlusion and other conditions.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIGS. 2, 3 and 4 are schematic structural diagrams of the Transformer module, the multi-head self-attention module and the self-attention module of the Transformer branch model, respectively;
FIG. 5 is a schematic structural diagram of the apple leaf disease image recognition model of the invention;
FIGS. 6, 7 and 8 are the original image, the shallow-feature heat map, and the deep-feature heat map, respectively.
Detailed Description
As shown in fig. 1, a method for identifying apple leaf diseases based on CNN and Transformer comprises the following steps in sequence:
(1) Collecting and processing apple leaf disease images to obtain an initial apple leaf disease image sample;
(2) Preprocessing an image in an initial apple leaf disease image sample to obtain an initial feature map, wherein the initial feature map forms an apple leaf disease image training set;
(3) Constructing an apple leaf disease image recognition model based on a CNN model and a Transformer model, wherein the apple leaf disease image recognition model consists of a CNN branch model and a Transformer branch model; in the invention, the CNN branch model is a CNN model and the Transformer branch model is a Transformer model; here, the CNN model is a convolutional neural network model;
(4) Inputting the apple leaf disease image training set into an apple leaf disease recognition model for training to obtain a trained apple leaf disease recognition model;
(5) Acquiring and preprocessing an apple leaf disease image to be detected;
(6) Inputting the preprocessed apple leaf disease image to be detected into the trained apple leaf disease identification model to obtain the apple leaf disease identification result.
The step (1) specifically refers to: acquiring apple leaf disease images under real backgrounds, and generating diversified image data by image enhancement methods of random flipping, random color enhancement and noise addition to obtain initial apple leaf disease image samples.
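For step (1), a minimal augmentation sketch in PyTorch/torchvision is given below, assuming standard implementations of random flipping, random color enhancement and Gaussian noise addition; the probabilities, jitter strengths and the add_gaussian_noise helper are illustrative assumptions rather than parameters specified by the invention.

```python
import torch
from torchvision import transforms

def add_gaussian_noise(img, std=0.02):
    """Add zero-mean Gaussian noise to a [0, 1] image tensor (std is an assumed value)."""
    return torch.clamp(img + torch.randn_like(img) * std, 0.0, 1.0)

# Random flipping, random color enhancement and noise addition, as described in step (1).
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor(),                  # PIL image -> float tensor in [0, 1]
    transforms.Lambda(add_gaussian_noise),  # noise addition
])
```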
In step (2), the preprocessing includes convolution and pooling operations.
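A minimal sketch of such a preprocessing stem is shown below; the kernel sizes, strides and channel count are assumptions for illustration, since the text only states that convolution and pooling are applied to obtain the initial feature map.

```python
import torch
import torch.nn as nn

# Convolution + pooling stem producing the initial feature map (step (2)); sizes are assumed.
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)

image = torch.randn(1, 3, 224, 224)   # one RGB apple leaf image (assumed input size)
initial_feature_map = stem(image)     # -> shape (1, 64, 56, 56)
```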
As shown in fig. 2, 3, 4 and 5, in step (3), the construction of the CNN branch model includes the following steps:
(3a) Setting a CNN branch model as a four-layer structure:
the first layer of the CNN branch model is set to be composed of three cascaded residual modules, wherein the last residual module is responsible for compressing the picture size and expanding the dimension;
setting a second layer of the CNN branch model to be composed of four cascaded residual modules, and expanding the dimension at the last residual module;
setting a third layer of the CNN branch model to be composed of three cascaded residual modules, and expanding the dimension at the last residual module;
setting the fourth layer of the CNN branch model to extract the final feature map through a residual module;
the residual modules used in the four layers of the CNN branch model are identical;
(3b) Setting the residual module:
the dimension of the input is reduced using a down-projection convolution with a convolution kernel size of 1 x 1;
feature extraction is performed using a spatial convolution with a convolution kernel size of 3 x 3, and then the dimension is restored using an up-projection convolution with a convolution kernel size of 1 x 1;
an identity mapping is applied between the input and the output using a skip connection.
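The residual module of (3b) corresponds to a bottleneck block: a 1 x 1 down-projection, a 3 x 3 spatial convolution, a 1 x 1 up-projection, and a skip connection. The following PyTorch sketch illustrates this structure and the stacking of residual modules per layer; the channel widths, normalization and activation choices are assumptions not fixed by the text.

```python
import torch.nn as nn

class ResidualModule(nn.Module):
    """Bottleneck residual module: 1x1 down-projection, 3x3 spatial conv, 1x1 up-projection, skip connection."""
    def __init__(self, in_ch, mid_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=1, bias=False),                             # down-projection
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, kernel_size=3, stride=stride, padding=1, bias=False),  # spatial convolution
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, kernel_size=1, bias=False),                            # up-projection
            nn.BatchNorm2d(out_ch),
        )
        # Skip connection: identity when shapes match, otherwise a 1x1 projection.
        self.skip = (nn.Identity() if stride == 1 and in_ch == out_ch
                     else nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride, bias=False))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + self.skip(x))

def make_layer(n_blocks, in_ch, out_ch, stride=2):
    """Cascade of residual modules; the last module compresses the picture size and expands the dimension."""
    blocks = [ResidualModule(in_ch, in_ch // 4, in_ch) for _ in range(n_blocks - 1)]
    blocks.append(ResidualModule(in_ch, in_ch // 4, out_ch, stride=stride))
    return nn.Sequential(*blocks)

# Four-layer CNN branch with 3, 4, 3 and 1 residual modules per layer (channel widths are assumed).
cnn_branch = nn.Sequential(
    make_layer(3, 64, 128), make_layer(4, 128, 256),
    make_layer(3, 256, 512), ResidualModule(512, 128, 512),
)
```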
In step (3), the construction of the Transformer branch model specifically refers to:
designing a multi-head self-attention module to obtain the context information of each position;
the initial feature map is mapped into Q, K and V vectors through linear projection, each head executes a self-attention function on its projected vectors to obtain an output, and finally the head outputs are concatenated and projected again to obtain the final output value:
MultiHead(Q, K, V) = Concat(head_1, ..., head_h) W^O, where head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V), i = 1, ..., h
wherein W_i^Q, W_i^K, W_i^V and W^O are linear projection parameter matrices, Concat denotes the concatenation of vectors, h denotes the number of heads, head_1 denotes the self-attention output of the 1st head, and head_h denotes the self-attention output of the h-th head;
the input of the self-attention module consists of Q and K vectors of dimension d_k and a V vector of dimension d_v; the dot product of the Q and K vectors is first computed and divided by √(d_k), a softmax function is then applied to obtain the weights of the V vector, and finally the weights are multiplied by the V vector to obtain the output of the self-attention module; the specific calculation formula is:
Attention(Q, K, V) = softmax(Q K^T / √(d_k)) V
wherein d_k denotes the dimension of the K vector;
the multi-head self-attention module obtains the final multi-head attention representation by linearly transforming and concatenating the outputs of the multiple attention heads.
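The multi-head self-attention described above can be sketched as follows; the embedding dimension and the number of heads are assumed values, and the class is a generic illustration rather than the exact module of the invention.

```python
import math
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    """Per head: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V; heads are concatenated and projected by W^O."""
    def __init__(self, dim=384, num_heads=6):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads, self.d_k = num_heads, dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)   # linear projections producing Q, K and V for all heads
        self.proj = nn.Linear(dim, dim)      # output projection W^O

    def forward(self, x):                    # x: (batch, tokens, dim)
        B, N, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # split into heads: (batch, heads, tokens, d_k)
        q, k, v = (t.reshape(B, N, self.num_heads, self.d_k).transpose(1, 2) for t in (q, k, v))
        scores = (q @ k.transpose(-2, -1)) / math.sqrt(self.d_k)  # Q K^T / sqrt(d_k)
        weights = scores.softmax(dim=-1)                          # softmax weights applied to V
        out = (weights @ v).transpose(1, 2).reshape(B, N, C)      # concatenate the heads
        return self.proj(out)                                     # apply W^O

tokens = torch.randn(2, 196, 384)              # tokens from the initial feature map (assumed shape)
print(MultiHeadSelfAttention()(tokens).shape)  # torch.Size([2, 196, 384])
```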
The step (4) specifically comprises the following steps:
(4a) Forward propagation: inputting an apple leaf disease image training set, and carrying out forward propagation through an apple leaf disease image recognition model;
(4b) Calculating the loss from the loss function (a cross-entropy form consistent with the definitions below is sketched after step (4d)):
wherein y_i denotes the i-th element of the probability distribution vector of the real label, p_i denotes the i-th element of the predicted probability distribution vector of the apple leaf disease image recognition model, and N denotes the number of elements;
(4c) Back propagation and parameter update: according to the loss result, back propagation is carried out, gradients are calculated, and the parameters of the apple leaf disease image recognition model are optimized:
gradient calculation: differentiating the loss function with respect to the parameters to obtain the gradient of each parameter; the parameters refer to the weights of the apple leaf disease image recognition model;
parameter updating: updating the weights and biases of the apple leaf disease image recognition model using a gradient descent optimization algorithm;
(4d) Repeating the training steps: repeating steps (4a) to (4c), continuously inputting the apple leaf disease image training set, and performing forward propagation, loss calculation, back propagation and parameter updating until the loss converges, so as to obtain the weights with the best prediction effect.
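The loss-function figure is not reproduced in the text above; a cross-entropy form consistent with the stated definitions of y_i, p_i and N would be (an assumption inferred from those definitions, not a formula quoted from the patent):

```latex
L = -\sum_{i=1}^{N} y_i \log(p_i)
```

A minimal training-loop sketch for steps (4a)-(4d) is given below; the model, data loader, optimizer choice and device are placeholder assumptions.

```python
import torch
import torch.nn as nn

def train(model, train_loader, epochs, lr=1e-3, device="cpu"):
    """(4a) forward propagation, (4b) loss calculation, (4c) back propagation and parameter update, (4d) repeat."""
    model.to(device)
    criterion = nn.CrossEntropyLoss()                        # cross-entropy over the predicted distribution
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)   # any gradient-descent optimizer
    for epoch in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            logits = model(images)             # (4a) forward propagation
            loss = criterion(logits, labels)   # (4b) loss calculation
            optimizer.zero_grad()
            loss.backward()                    # (4c) back propagation: compute gradients
            optimizer.step()                   # (4c) parameter update
        # (4d) repeat until the loss converges; keep the weights with the best prediction effect
```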
The effect of the invention can be illustrated by the following simulation experiment:
1. Experimental conditions
The data used for the experiment is the public dataset Plant Pathology 2021-FGVC8, a published dataset covering three common apple leaf diseases: apple scab, apple gray spot, and apple rust. In order to improve the generalization ability of the model, the dataset is expanded to 21142 pictures by rotation, flipping, cropping, color transformation and other operations; next, the dataset is divided into a training set, a validation set and a test set at a ratio of 7:2:1. Finally, the expanded dataset is used to train and test the model.
2. Experimental steps:
(1) Inputting the expanded dataset into the apple leaf disease image recognition model;
(2) Setting the optimizer to AdamW, setting the initial learning rate to 0.001, training for 300 epochs, and applying cosine learning-rate decay (a configuration sketch is given after this list);
(3) Saving the weights of the epoch with the best training result;
(4) Loading the weights saved in step (3), and inputting the apple leaf disease images to be predicted into the apple leaf disease image recognition model for testing;
(5) Outputting the model prediction result.
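A configuration sketch for steps (2) and (3) above (AdamW, initial learning rate 0.001, 300 epochs, cosine learning-rate decay, saving the best weights); the weight-decay value, the checkpoint path and the evaluate helper are illustrative assumptions.

```python
import torch
import torch.nn as nn

def fit(model, train_loader, val_loader, evaluate, device="cuda"):
    """Train with AdamW + cosine decay for 300 epochs and save the weights of the best epoch."""
    model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.05)  # weight_decay is assumed
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=300)   # cosine decay over 300 epochs
    criterion = nn.CrossEntropyLoss()
    best_acc = 0.0
    for epoch in range(300):
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            loss = criterion(model(images), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()
        acc = evaluate(model, val_loader)      # user-supplied validation helper (assumed)
        if acc > best_acc:                     # save the weights of the best epoch
            best_acc = acc
            torch.save(model.state_dict(), "best_weights.pt")
```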
3. Experimental results
As shown in fig. 6, 7 and 8, compared with the currently mainstream CNN and Transformer models, the model proposed by the present invention achieves higher accuracy than other advanced recognition models while maintaining a lower parameter count.
Table 1 Comparison of the accuracy results of the present invention with other tested models (unit: %)

Model            Number of parameters    Accuracy
ResNet50         25.5M                   88.37
ResNext50        25.0M                   94.15
EfficientNetB5   28.4M                   98.95
Deit-small       21.6M                   95.92
Twins-SVT-S      24.1M                   99.16
The invention    20.4M                   99.69
In conclusion, the invention realizes accurate identification of apple leaf diseases in images by fusing the Transformer model into the CNN model; the invention uses a Transformer model with a multi-head self-attention mechanism to enhance the model's ability to capture global context information, and extracts local features with the CNN model, thereby realizing comprehensive modeling of both the global and the local information of apple leaf diseases; the Transformer model attends to spatial information at global positions through the multi-head self-attention module, which improves the modeling capacity for global visual information, while the local features extracted by the CNN model are continuously fed back into the Transformer model to enrich its local detail information; finally, to further enhance feature propagation and feature reuse, the invention also introduces a dense connection mechanism, which reduces information loss during network transmission and provides higher robustness to complex backgrounds, occlusion and other conditions.
While the foregoing describes the basic principles and embodiments of the present invention, it should be noted that the embodiments of the present invention are not limited to the above examples, and that any modifications, equivalents, etc. may be made without departing from the scope of the principles of the present invention, and such changes and modifications are intended to be included within the scope of the present invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (6)

1. A method for identifying apple leaf diseases based on CNN and Transformer, characterized in that the method comprises the following steps in sequence:
(1) Collecting and processing apple leaf disease images to obtain an initial apple leaf disease image sample;
(2) Preprocessing an image in an initial apple leaf disease image sample to obtain an initial feature map, wherein the initial feature map forms an apple leaf disease image training set;
(3) Constructing an apple leaf disease image recognition model based on a CNN model and a Transformer model, wherein the apple leaf disease image recognition model consists of a CNN branch model and a Transformer branch model;
(4) Inputting the apple leaf disease image training set into an apple leaf disease recognition model for training to obtain a trained apple leaf disease recognition model;
(5) Acquiring and preprocessing an apple leaf disease image to be detected;
(6) Inputting the preprocessed apple leaf disease image to be detected into the trained apple leaf disease identification model to obtain the apple leaf disease identification result.
2. The CNN and Transformer based apple leaf disease identification method of claim 1, wherein: the step (1) specifically refers to: acquiring apple leaf disease images under real backgrounds, and generating diversified image data by image enhancement methods of random flipping, random color enhancement and noise addition to obtain initial apple leaf disease image samples.
3. The CNN and Transformer based apple leaf disease identification method of claim 1, wherein: in step (2), the preprocessing includes convolution and pooling operations.
4. The CNN and Transformer based apple leaf disease identification method of claim 1, wherein: in step (3), the construction of the CNN branch model includes the following steps:
(3a) Setting a CNN branch model as a four-layer structure:
the first layer of the CNN branch model is set to be composed of three cascaded residual modules, wherein the last residual module is responsible for compressing the picture size and expanding the dimension;
setting a second layer of the CNN branch model to be composed of four cascaded residual modules, and expanding the dimension at the last residual module;
setting a third layer of the CNN branch model to be composed of three cascaded residual modules, and expanding the dimension at the last residual module;
setting the fourth layer of the CNN branch model to extract the final feature map through a residual module;
the residual modules used in the four layers of the CNN branch model are identical;
(3b) Setting the residual module:
the dimension of the input is reduced using a down-projection convolution with a convolution kernel size of 1 x 1;
feature extraction is performed using a spatial convolution with a convolution kernel size of 3 x 3, and then the dimension is restored using an up-projection convolution with a convolution kernel size of 1 x 1;
an identity mapping is applied between the input and the output using a skip connection.
5. The CNN and Transformer based apple leaf disease identification method of claim 1, wherein: in step (3), the construction of the Transformer branch model specifically refers to:
designing a multi-head self-attention module to obtain the context information of each position;
the initial feature map is mapped into Q, K and V vectors through linear projection, each head executes a self-attention function on its projected vectors to obtain an output, and finally the head outputs are concatenated and projected again to obtain the final output value:
MultiHead(Q, K, V) = Concat(head_1, ..., head_h) W^O, where head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V), i = 1, ..., h
wherein W_i^Q, W_i^K, W_i^V and W^O are linear projection parameter matrices, Concat denotes the concatenation of vectors, h denotes the number of heads, head_1 denotes the self-attention output of the 1st head, and head_h denotes the self-attention output of the h-th head;
the input of the self-attention module consists of Q and K vectors of dimension d_k and a V vector of dimension d_v; the dot product of the Q and K vectors is first computed and divided by √(d_k), a softmax function is then applied to obtain the weights of the V vector, and finally the weights are multiplied by the V vector to obtain the output of the self-attention module; the specific calculation formula is:
Attention(Q, K, V) = softmax(Q K^T / √(d_k)) V
wherein d_k denotes the dimension of the K vector;
the multi-head self-attention module obtains the final multi-head attention representation by linearly transforming and concatenating the outputs of the multiple attention heads.
6. The CNN and Transformer based apple leaf disease identification method of claim 1, wherein: the step (4) specifically comprises the following steps:
(4a) Forward propagation: inputting an apple leaf disease image training set, and carrying out forward propagation through an apple leaf disease image recognition model;
(4b) Calculating the loss from the loss function:
wherein y_i denotes the i-th element of the probability distribution vector of the real label, p_i denotes the i-th element of the predicted probability distribution vector of the apple leaf disease image recognition model, and N denotes the number of elements;
(4c) Back propagation and parameter update: according to the loss result, back propagation is carried out, gradients are calculated, and the parameters of the apple leaf disease image recognition model are optimized:
gradient calculation: differentiating the loss function with respect to the parameters to obtain the gradient of each parameter; the parameters refer to the weights of the apple leaf disease image recognition model;
parameter updating: updating the weights and biases of the apple leaf disease image recognition model using a gradient descent optimization algorithm;
(4d) Repeating the training steps: repeating steps (4a) to (4c), continuously inputting the apple leaf disease image training set, and performing forward propagation, loss calculation, back propagation and parameter updating until the loss converges, so as to obtain the weights with the best prediction effect.
CN202310869642.4A 2023-07-17 2023-07-17 Apple leaf disease identification method based on CNN and Transformer Pending CN116883364A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310869642.4A CN116883364A (en) 2023-07-17 2023-07-17 Apple leaf disease identification method based on CNN and Transformer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310869642.4A CN116883364A (en) 2023-07-17 2023-07-17 Apple leaf disease identification method based on CNN and Transformer

Publications (1)

Publication Number Publication Date
CN116883364A true CN116883364A (en) 2023-10-13

Family

ID=88265780

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310869642.4A Pending CN116883364A (en) 2023-07-17 2023-07-17 Apple leaf disease identification method based on CNN and Transformer

Country Status (1)

Country Link
CN (1) CN116883364A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117576467A (en) * 2023-11-22 2024-02-20 安徽大学 Crop disease image identification method integrating frequency domain and spatial domain information
CN118314144A (en) * 2024-06-11 2024-07-09 江西农业大学 Plant leaf disease identification method and system based on depth intensive residual error module

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115249329A (en) * 2022-07-16 2022-10-28 江苏师范大学 Apple leaf disease detection method based on deep learning
CN115273072A (en) * 2022-06-13 2022-11-01 南京林业大学 Apple leaf disease detection method based on improved Yolov5s model
CN115620146A (en) * 2022-11-07 2023-01-17 无锡学院 Crop leaf disease detection method based on Transformer

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115273072A (en) * 2022-06-13 2022-11-01 南京林业大学 Apple leaf disease detection method based on improved Yolov5s model
CN115249329A (en) * 2022-07-16 2022-10-28 江苏师范大学 Apple leaf disease detection method based on deep learning
CN115620146A (en) * 2022-11-07 2023-01-17 无锡学院 Crop leaf disease detection method based on Transformer

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
M N AHIL ET AL: "Apple and Grape Leaf Disease Classification using MLP and CNN", 2021 INTERNATIONAL CONFERENCE ON ADVANCEMENTS IN ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTING AND AUTOMATION (ICAECA), 18 January 2022 (2022-01-18) *
XIAOPENG LI ET AL: "Transformer Help CNN See Better: A Lightweight Hybrid Apple Disease Identification Model Based on Transformers", AGRICULTURE, 19 June 2022 (2022-06-19) *
徐艳蕾等: "基于Transformer的强泛化苹果叶片病害识别模型", 农业工程学报, vol. 38, no. 16, 31 August 2022 (2022-08-31), pages 198 - 206 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117576467A (en) * 2023-11-22 2024-02-20 安徽大学 Crop disease image identification method integrating frequency domain and spatial domain information
CN117576467B (en) * 2023-11-22 2024-04-26 安徽大学 Crop disease image identification method integrating frequency domain and spatial domain information
CN118314144A (en) * 2024-06-11 2024-07-09 江西农业大学 Plant leaf disease identification method and system based on depth intensive residual error module
CN118314144B (en) * 2024-06-11 2024-08-06 江西农业大学 Plant leaf disease identification method and system based on depth intensive residual error module

Similar Documents

Publication Publication Date Title
CN110532900B (en) Facial expression recognition method based on U-Net and LS-CNN
CN105678284B (en) A kind of fixed bit human body behavior analysis method
CN111723738B (en) Coal rock chitin group microscopic image classification method and system based on transfer learning
CN114092832B (en) High-resolution remote sensing image classification method based on parallel hybrid convolutional network
CN111696101A (en) Light-weight solanaceae disease identification method based on SE-Inception
CN116883364A (en) Apple leaf disease identification method based on CNN and Transformer
CN112070768B (en) Anchor-Free based real-time instance segmentation method
Hassan et al. Plant seedlings classification using transfer learning
CN112749675A (en) Potato disease identification method based on convolutional neural network
CN115966010A (en) Expression recognition method based on attention and multi-scale feature fusion
CN111368637A (en) Multi-mask convolution neural network-based object recognition method for transfer robot
CN114898359B (en) Litchi plant diseases and insect pests detection method based on improvement EFFICIENTDET
Mahbub et al. Detect bangladeshi mango leaf diseases using lightweight convolutional neural network
CN112329771A (en) Building material sample identification method based on deep learning
CN117876832A (en) Pest detection method and model integrating local and global attention
CN114170657A (en) Facial emotion recognition method integrating attention mechanism and high-order feature representation
CN113780335B (en) Small sample commodity image classification method, device, equipment and storage medium
CN115439842A (en) Mulberry sclerotinia severity detection method based on deep learning
CN114463741A (en) Litchi disease and insect pest identification method based on deep learning
CN114627496A (en) Robust pedestrian re-identification method based on depolarization batch normalization of Gaussian process
Deng et al. Image Classification Method of Longhorn Beetles of Yunnan Based on Bagging and CNN
CN113887653A (en) Positioning method and system for tightly-coupled weak supervised learning based on ternary network
Hussein et al. Semantic segmentation of aerial images using u-net architecture
Sun et al. Tobacco-disease image recognition via multiple-attention classification network
CN114842300B (en) Crop pest detection method suitable for rainy day environment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination