CN113239947B - Pest image classification method based on fine-grained classification technology - Google Patents

Pest image classification method based on fine-grained classification technology

Info

Publication number
CN113239947B
CN113239947B (application CN202110264082.0A)
Authority
CN
China
Prior art keywords
pest
classification
model
fine
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110264082.0A
Other languages
Chinese (zh)
Other versions
CN113239947A (en)
Inventor
钱蓉
董伟
程泽凯
朱静波
夏皖
孔娟娟
刘桂民
张萌
李闰枚
王忠培
管博伦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Economy And Information Research Of Anhui Academy Of Agricultural Sciences
Original Assignee
Agricultural Economy And Information Research Of Anhui Academy Of Agricultural Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Economy And Information Research Of Anhui Academy Of Agricultural Sciences filed Critical Agricultural Economy And Information Research Of Anhui Academy Of Agricultural Sciences
Priority to CN202110264082.0A priority Critical patent/CN113239947B/en
Publication of CN113239947A publication Critical patent/CN113239947A/en
Application granted granted Critical
Publication of CN113239947B publication Critical patent/CN113239947B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention relates to a pest image classification method based on fine-grained classification technology, which overcomes the poor fine-grained pest identification of the prior art. The method comprises the following steps: acquiring training images; constructing a pest identification network; training the pest identification network; acquiring an image of the pest to be identified; and obtaining a pest identification result. By combining feature filtering and fusion with a purpose-designed loss function, the invention achieves the highest performance, is applicable to both similar (fine-grained) pests and coarse-grained pest classification, and obtains ideal results. It also keeps attention on the target even when the background is very complex or the pest's color and shape are close to the background, so that targets can be identified accurately, further widening the number of pest categories that can be classified automatically.

Description

Pest image classification method based on fine-grained classification technology
Technical Field
The invention relates to a pest image identification method, in particular to a pest image classification method based on a fine-grained classification technology.
Background
Conventional image classification refers to coarse-grained classification, such as distinguishing cats from dogs. Fine-grained classification refers to distinguishing sub-categories within the same coarse class, such as different breeds of dogs or different species of birds. The differences between sub-categories are subtle, while individuals within the same sub-category differ markedly in posture, motion, appearance and so on, so a fine-grained image classification task requires the classification model to extract fine feature information about the target object.
In recent years, fine-grained image classification (FGVC) has attracted more and more attention and has wide application in practical scenes that require fine classification. Extracting local and global features from target parts and fusing them is the classic approach to fine-grained classification. (Berg and Belhumeur 2013) obtain features of different positions for classification by means of manually annotated part locations, while some methods (Zhang and Donahue 2014; Krause and Jin 2015; Huang and Xu 2016; Zhang and Xu 2016; Lam and Mahassei 2017; Wei and Xie 2018; Liu and Xie 2020) perform more accurate semantic segmentation and feature fusion by searching for the best positions, obtaining better fine-grained part-feature representations. Some scholars (Simon and Rodner 2015; Zhang and Wei 2016; He and Peng 2017; Ge and Lin 2019; Wang and Wang 2020; Huang and Li 2020; and others) extract part features with weakly supervised or unsupervised methods, reducing the cost of manual labeling. Still other researchers (Zhang and Xiong 2016; Wang and Morariu 2018) locate potential target parts with the help of deep convolution filters, without additional part annotation; methods incorporating the attention mechanism (Xiao and Xu 2015; Fu and Zheng 2017; Zheng and Fu 2017; Sun and Yuan 2018; Zheng and Fu 2019) have also been proposed, in which more relevant part features are extracted through attention. (Ji and Wen 2020) incorporates an attention mechanism into a binary neural-tree network, learning the target representation from coarse to fine and focusing on capturing discriminative features. In addition, Zhuang and Wang learn the differing parts between similar objects by means of contrastive learning, and gate the resulting difference features into the common features for classification. The key to this line of work is extracting better discriminative part features and fusing them with the global features to achieve a better classification result.
Unlike part-focused methods, methods based on end-to-end feature coding focus on extracting higher-order representations of features and their interactive relationships. (Lin 2015) applied a bilinear CNN model to fine-grained classification with good results, which sparked strong interest in end-to-end methods. Afterwards, Gao 2016 proposed replacing the full bilinear representation with a low-dimensional compact bilinear-pooled representation to address the excessive dimensionality of bilinear features. Building on bilinear pooling, Cui and Zhou 2017 proposed a general pooling framework that captures higher-order interactions of features in kernel form. To further reduce the amount of bilinear computation, Kong and Fowles 2017 proposed a classifier co-decomposition method that compresses the model by decomposing the set of bilinear classifiers into a common factor and compact per-class terms. Zheng and Fu 2019 proposed a Deep Bilinear Transform (DBT) block that divides the input channels evenly into several semantic groups and computes the pairwise interactions within each group to represent the bilinear transformation, greatly reducing the computational cost. Furthermore, Yu and Zhao 2018 proposed a cross-layer bilinear pooling framework that integrates multiple cross-layer bilinear features to capture inter-layer part-feature relationships and enhance their representation capability. Replacing classical first-order pooling in convolutional neural networks with global covariance pooling has yielded impressive improvements, but typically requires longer training. Li and Xie 2018 provided an iterative matrix square-root normalization method that accelerates end-to-end training of networks based on global covariance pooling. Engin and Wang 2018 proposed end-to-end training that jointly learns local descriptors and pools the representation by replacing the covariance matrix with a kernel matrix. In addition, Cai and Zuo 2017 et al. represent the activations of hierarchical convolutions as local representations at different scales, capture high-order statistics of convolution activations with a polynomial-kernel-based predictor, and model part interactions to obtain higher-order intra-layer and inter-layer feature relationships. Gao and Han 2020 designed a Channel Interaction Network (CIN) to model intra-image and inter-image channel interactions, exploring channel correlations within an image so that the model can learn complementary features from correlated channels and thereby obtain stronger fine-grained features. The end-to-end approach is simpler and more effective; in this work, extra shallow features are added to the deep features learned by the model through deconvolution blocks, fusing smaller-scale detail features into the final representation and providing the model with more fine-grained information for classification.
In fine-grained recognition, inter-class separation is more difficult than in a conventional image classification task, and some methods improve fine-grained classification by improving the loss function. (Wang et al. 2016) used a triplet loss to achieve better inter-class separation, but triplet loss increases the computational cost of training. (Dubey and Gupta 2018) applied the maximum-entropy principle and proposed a maximum-entropy loss usable for fine-grained classification. (Dubey and Gupta 2018) then reduced overfitting during training by deliberately introducing perturbations into the model's output activations. Sun and Cholakkal 2020 et al. designed a "gradient-boosting" loss function that accelerates model convergence by focusing only on the confusable classes of each sample. However, the above methods neglect that during training the confidence-score distributions of simple samples and of samples confused with similar classes differ greatly: some difficult samples interfered with by similar classes are predicted correctly, yet their confidence-score distribution shows that the model's confidence in them is insufficient, which leads to a low recognition rate and poor classification.
Disclosure of Invention
The invention aims to overcome the poor fine-grained pest identification of the prior art, and provides a pest image classification method based on a fine-grained classification technology to solve this problem.
In order to achieve the purpose, the technical scheme of the invention is as follows:
a pest image classification method based on a fine-grained classification technology comprises the following steps:
11) acquisition of training images: acquiring a pest image data set to be trained and preprocessing it;
12) constructing a pest classification model: constructing a pest classification model based on the ResNet18 network and the cross-entropy loss function, denoted the DB_RN18 model;
13) training the pest classification model: training the DB_RN18 model with the pest image data set to be trained, designing a loss function that is more sensitive to the classification results of the DB_RN18 model, and completing the training in an end-to-end manner;
14) acquiring an image of the pest to be identified: acquiring a pest image to be identified and preprocessing it;
15) obtaining a pest classification result: inputting the preprocessed pest image into the trained DB_RN18 model to obtain the classification result; a minimal inference sketch is given after this list.
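For illustration, a minimal sketch of steps 14) and 15) in PyTorch is given below: preprocessing one pest image and querying an already trained DB_RN18 model. The 224×224 input size, the normalization statistics and the classify_pest helper are assumptions introduced here for illustration; the patent text does not specify them.

```python
import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),                      # assumed input resolution
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],    # ImageNet statistics (assumption)
                         std=[0.229, 0.224, 0.225]),
])

def classify_pest(image_path, model, class_names):
    """Run one preprocessed pest image through a trained DB_RN18 model."""
    model.eval()
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)  # 1 x C x H x W
    with torch.no_grad():
        scores = model(x)                               # prediction scores s over L classes
    return class_names[scores.argmax(dim=1).item()]
```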
The construction of the pest classification model comprises the following steps:
21) setting up a ResNet18 network and adding two DeconvBlock modules between the last ResBlock layer and the classification layer of the ResNet18 network to construct the DB_RN18 classification model (a structural sketch is given after this list);
22) adding Channel attention and Spatial attention mechanisms in both DeconvBlock modules;
23) constructing a loss function suitable for fine-grained image classification based on the cross-entropy loss function: a loss function sensitive to the confidence scores of the fine-grained recognition model is designed; within the same batch size, three cases are distinguished, namely samples predicted correctly whose confidence is stable, samples predicted correctly whose confidence changes dynamically, and samples predicted incorrectly, and a different loss term rewards or penalizes the model in each case, with the degree of punishment gradually weakening as training deepens.
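A minimal structural sketch of the DB_RN18 model described in steps 21) and 22) is given below, assuming a PyTorch implementation: a ResNet18 backbone whose last ResBlock output passes through two DeconvBlock modules before the classification layer. Which backbone stage supplies the shallow map to each DeconvBlock, and the channel widths shown, are assumptions; the DeconvBlock itself is sketched later in the detailed description.

```python
import torch.nn as nn
from torchvision.models import resnet18

class DBRN18(nn.Module):
    def __init__(self, num_classes, deconv_block_cls):
        super().__init__()
        b = resnet18(weights=None)
        self.stem = nn.Sequential(b.conv1, b.bn1, b.relu, b.maxpool)
        self.layer1, self.layer2 = b.layer1, b.layer2   # shallower ResBlocks
        self.layer3, self.layer4 = b.layer3, b.layer4   # deeper ResBlocks
        # two DeconvBlocks between the last ResBlock and the classifier;
        # the (deep, shallow) channel pairing is an assumption
        self.db1 = deconv_block_cls(deep_ch=512, shallow_ch=256)
        self.db2 = deconv_block_cls(deep_ch=512, shallow_ch=128)
        self.gap = nn.AdaptiveAvgPool2d(1)              # GAP
        self.dropout = nn.Dropout(0.5)                  # dropout between GAP and Dense
        self.fc = nn.Linear(512, num_classes)           # Dense classification layer

    def forward(self, x):
        x = self.stem(x)
        o1 = self.layer1(x)
        o2 = self.layer2(o1)
        o3 = self.layer3(o2)
        op = self.layer4(o3)          # Op: output of the last ResBlock
        f = self.db1(op, o3)          # fuse Op with the ResBlock output of matching scale
        f = self.db2(f, o2)           # second DeconvBlock, one level shallower
        f = self.gap(f).flatten(1)
        return self.fc(self.dropout(f))
```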
The training of the pest classification model comprises the following steps:
31) inputting the pest image data set to be trained into the ResNet18 network for training;
32) applying Channel Attention and Spatial Attention to the feature map of the last convolutional layer of the ResNet18 network to obtain an Attention map containing its deep feature information and to enlarge the receptive field of the model;
33) applying deconvolution to the feature map of the last convolutional layer of ResNet18 to enlarge the size of the convolutional feature map;
34) fusing the Attention map with the deconvolution feature map and extracting the fused feature-map information with two convolution layers; that is, the output Op of the last ResBlock and the output Oss of the earlier ResBlock of matching output scale are used as the inputs of the deconvolution block, expressed as:
Op = {Op_n : n ∈ [1, N]}, Op ∈ R^(N×H×W)
Oss = {Oss_m : m ∈ [1, M]}, Oss ∈ R^(M×H′×W′)
wherein N and M respectively represent the number of channels output by the last ResBlock and by the ResBlock of matching output scale, H and W represent the height and width of the output feature map, and H′ = 2×H, W′ = 2×W;
35) training with the loss function designed for fine-grained image classification;
351) samples the model predicts incorrectly are penalized: a penalty factor α is set for each misclassified sample; as the network trains deeper and errors become less frequent, the value of the penalty factor, and hence the degree of punishment, gradually decreases. Its expression uses the mapping F_α(α_0) = (α_0 - 1)^2, where N_bs is the size of the current batch, s_n is the vector of prediction scores of the nth sample over all labels, s_n^al is the confidence score predicted for the true label al of the nth sample in the batch, and N_l is the number of labels (the original equation images giving α, s^al and s are not reproduced in this text). α_0 expresses the accuracy within the current batch and is computed from N_al, the number of samples whose highest-scoring predicted label is the true label, i.e. the number of correctly predicted samples in the batch;
352) samples that the model predicts correctly but with a low confidence score are rewarded: a reward factor β is set, which separates the confidence-score interval of correct but low-confidence samples from that of misclassified samples, so that the model can handle the precise identification of fine-grained pests. Here F_β() is a mapping function and β_0 is the ratio of the second-highest score to the highest score among all label scores of a correctly predicted sample, i.e. the largest score among all labels other than the true label divided by the true-label score (the original equation images giving β and F_β are not reproduced in this text). When β_0 is greater than 0.5, the penalty imposed on the model grows as β_0 increases;
353) correctly predicted samples with an unstable confidence score are handled with a reward factor γ, which normalizes the confidence score of the correct sample so that it tends to become stable. Here F_γ() is a mapping function and γ_0 expresses the degree to which the confidence of the true label of a correctly predicted sample reaches the maximum confidence score in the current batch (the original equation images giving γ, F_γ and γ_0 are not reproduced in this text). In these expressions s_n^al is the confidence score of the true label al when the nth sample in the batch is predicted correctly, and it is set to 0 when the prediction is wrong;
354) the reward-and-punishment multiple ω is expressed as ω = α + β + γ. A value a is given, with a ∈ [0, 1); when the accuracy within the same batch reaches this set value, the model is additionally penalized, the cross-entropy loss of each sample being increased by the additional multiple ω (the original loss-formula image is not reproduced in this text);
wherein the hyperparameter a is set to its optimal value through a sequential model-based optimization algorithm; a simple illustration of selecting a follows.
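The text only states that this threshold is tuned with a sequential model-based optimization algorithm. As a plain illustration (not the optimization procedure itself), the value could be chosen by scoring candidate values on a validation set; evaluate_model is a hypothetical helper returning validation accuracy for a model trained with the given a.

```python
def select_threshold(evaluate_model, candidates=(0.5, 0.6, 0.7, 0.8, 0.9)):
    """Return the candidate threshold a with the highest validation accuracy."""
    scored = [(a, evaluate_model(a)) for a in candidates]   # train/evaluate per candidate
    return max(scored, key=lambda item: item[1])[0]
```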
Advantageous effects
Compared with the prior art, the pest image classification method based on fine-grained classification technology achieves the highest performance by combining feature filtering and fusion with the designed loss function, is applicable to both similar (fine-grained) pests and coarse-grained pest classification, and obtains ideal results. It also keeps attention on the target even when the background is very complex or the pest's color and shape are close to the background, so that targets can be identified accurately, further widening the number of pest categories that can be classified automatically.
The method uses deconvolution blocks to introduce shallow information into the deep layers of the model and filters background features through attention to strengthen the attended target. S3-Loss makes the model more sensitive to the low-confidence phenomenon so that similar or difficult samples are separated; a corresponding loss-function form is selected according to the confidence-score distribution of each sample, making the model more sensitive to similar samples and achieving the goal of separating them.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
So that the above-recited features of the present invention can be clearly and readily understood, a more particular description of the invention, briefly summarized above, is given below with reference to embodiments, some of which are illustrated in the appended drawings, wherein:
as shown in FIG. 1, the pest image classification method based on the fine-grained classification technology comprises the following steps:
the first step of training image acquisition: and acquiring a pest image data set to be trained and preprocessing the pest image data set.
Secondly, constructing a pest classification model: a pest identification network is constructed based on the ResNet18 network. Compared with the Baseline network, the main improvements here include: (a) between the last resplock of the original ResNet18 network and the classification layer, two deconvblocks are added. The two deconvolution blocks promote the scale of the feature map, introduce shallow features, add more related detailed information to the feature map, weaken background features by using the Attention in the DeconvBlock, and help the model to more accurately extract the features of the target; (b) and S3-Loss predicts the fraction according to the sample given by the model, and the design operator imposes different punishments on the similar sample and the difficult sample to finely adjust the CE-Loss of the sample.
The DeconvBlock enlarges the size of the feature map output by the last layer of ResNet18 and fuses features from different layers, preparing to add finer features to the deep network. In a DeconvBlock, the feature map Op output by the last ResBlock of the Backbone passes successively through a Deconv layer and a Conv layer, enlarging the scale of the feature map; in the upper branch, a 1×1 convolution lifts the channel number of the shallow feature map and Attention is applied to keep the useful features and reduce background interference; the results of the two branches are then added to obtain the fused feature map Odc = {Odc_n : n ∈ [1, N]}, Odc ∈ R^(N×H′×W′).
The specific calculation is as follows:
Odc = Op′ + Ossa′
Op′ = Cn(Dcn(Op))
Ossa′ = A(Oss′)
Oss′ = Cn_1×1(Oss)
where Op′ denotes the up-scaled feature map Op, Cn and Dcn represent convolution and deconvolution operations respectively, A denotes the Attention operation applied to a feature map, Cn_1×1 is the 1×1 convolution that boosts the number of channels from M to N, and Ossa′ is the channel-lifted feature map Oss′ after the Attention operation; Ossa′ contains the strengthened shallow features of the target. The specific calculation inside the Attention is as follows:
W_ca = GAP(Oss′) + GMP(Oss′)
W_sa = Cn_7×7([Mean(Oss′), Max(Oss′)])
(the original equation image showing how W_ca and W_sa are applied to Oss′ is not reproduced in this text)
We use the channel-based weights W_ca and the space-based weights W_sa to obtain as many effective details of the target as possible from the shallow feature map, where GAP denotes global average pooling, GMP denotes global maximum pooling, and Mean and Max denote the operations of taking the mean feature map and the maximum feature map of Oss′ along the channel dimension; these two feature maps are used as the input of a convolution layer with a single 7×7 kernel, finally producing the space-based weight W_sa.
Residual connection: we further extract the fused features using two convolution layers and apply a residual connection to obtain the output of the whole deconvolution block, Odb = {Odb_n : n ∈ [1, N]}, Odb ∈ R^(N×H′×W′):
Odb = Odc + Odc′
Odc′ = Cn(Cn(Odc)) (2)
where Odc′ represents the feature map after Odc passes through the two convolutional layers.
The output Odb of the last DeconvBlock enters the classifier, finally producing the model's prediction score s = {s_l : l ∈ [1, L]}, s ∈ R^L, where L represents the number of labels:
s = Dense(GAP(Odb)) (3)
where GAP denotes global average pooling, Dense denotes a fully-connected layer, and a dropout of 0.5 is applied between the pooling layer and the fully-connected layer.
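A hedged PyTorch sketch of one DeconvBlock as described above follows: the lower branch enlarges Op with Deconv followed by Conv, the upper branch lifts Oss to N channels with a 1×1 convolution and applies the channel and spatial Attention, the two branches are added to give Odc, and two further convolutions with a residual connection give Odb. Kernel sizes, the sigmoid gating, and normalization details are assumptions not given in the text.

```python
import torch
import torch.nn as nn

class DeconvBlock(nn.Module):
    def __init__(self, deep_ch, shallow_ch):
        super().__init__()
        # lower branch: Op' = Cn(Dcn(Op)), doubling H and W
        self.deconv = nn.ConvTranspose2d(deep_ch, deep_ch, kernel_size=2, stride=2)
        self.conv = nn.Conv2d(deep_ch, deep_ch, kernel_size=3, padding=1)
        # upper branch: Oss' = 1x1 convolution lifting M -> N channels
        self.lift = nn.Conv2d(shallow_ch, deep_ch, kernel_size=1)
        # spatial attention: channel-wise Mean/Max maps through a single 7x7 conv
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)
        # residual refinement: Odc' = Cn(Cn(Odc))
        self.refine = nn.Sequential(
            nn.Conv2d(deep_ch, deep_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(deep_ch, deep_ch, 3, padding=1))

    def attention(self, oss_lifted):
        # channel weight W_ca = GAP(Oss') + GMP(Oss'), one value per channel
        w_ca = (oss_lifted.mean(dim=(2, 3), keepdim=True)
                + oss_lifted.amax(dim=(2, 3), keepdim=True)).sigmoid()
        x = oss_lifted * w_ca
        # spatial weight W_sa from the Mean and Max maps along the channel dimension
        w_sa = self.spatial(torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)).sigmoid()
        return x * w_sa                        # Ossa': strengthened shallow features

    def forward(self, op, oss):
        op_up = self.conv(self.deconv(op))     # Op'
        ossa = self.attention(self.lift(oss))  # Ossa'
        odc = op_up + ossa                     # fused map Odc
        return odc + self.refine(odc)          # Odb = Odc + Odc'
```

This DeconvBlock signature matches the deconv_block_cls parameter assumed in the earlier DB_RN18 skeleton.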
The construction of the pest identification network comprises the following specific steps:
(1) set up a ResNet18 network and add two DeconvBlock modules between the last ResBlock layer and the classification layer of the ResNet18 network to construct the DB_RN18 classification model;
(2) add Channel attention and Spatial attention mechanisms in both DeconvBlock modules;
(3) construct a loss function suitable for fine-grained image classification based on the cross-entropy loss function: a loss function sensitive to the confidence scores of the fine-grained recognition model is designed; within the same batch size, three cases are distinguished, namely samples predicted correctly whose confidence is stable, samples predicted correctly whose confidence changes dynamically, and samples predicted incorrectly, and a different loss term rewards or penalizes the model in each case, with the degree of punishment gradually weakening as training deepens.
Thirdly, the pest classification model is trained: the pest identification network is trained with the pest image data set to be trained.
In the design of the loss function, we design an S3-Loss that is sensitive to the confidence scores of the model. For each sample, three additional weights are calculated, covering the cases of correct and incorrect prediction, and are summed to obtain ω, the additional multiple by which the original CE-Loss of the sample is increased, thereby imposing extra punishment on the model. Let the prediction score of the model be denoted S; each time before S3-Loss is computed, we translate S from the interval [min(S), max(S)] to [υ, max(S) - min(S) + υ], i.e. S is shifted by -min(S) + υ, where υ = 1e-12 (a small sketch of this shift follows). To punish incorrect predictions to different degrees during training, we penalize the model's prediction errors severely in the early stage of training, but the extra punishment becomes lighter and lighter as training deepens. The specific steps are as follows:
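A small sketch of this score shift, under the assumption that the minimum is taken over the whole batch of scores, is:

```python
import torch

def shift_scores(scores, upsilon=1e-12):
    # translate S from [min(S), max(S)] to [upsilon, max(S) - min(S) + upsilon]
    return scores - scores.min() + upsilon
```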
(1) Input the pest image data set to be trained into the ResNet18 network for training.
(2) Apply Channel Attention and Spatial Attention to the feature map of the last convolutional layer of the ResNet18 network to obtain an Attention map containing its deep feature information, and enlarge the receptive field of the model.
(3) Apply deconvolution to the feature map of the last convolutional layer of ResNet18 to enlarge the size of the convolutional feature map.
(4) Fuse the Attention map with the deconvolution feature map and extract the fused feature-map information with two convolution layers; class activation maps verify that the feature information obtained by this model is richer than that obtained by the traditional method. That is, the output Op of the last ResBlock and the output Oss of the earlier ResBlock of matching output scale are used as the inputs of the deconvolution block, expressed as:
Op = {Op_n : n ∈ [1, N]}, Op ∈ R^(N×H×W)
Oss = {Oss_m : m ∈ [1, M]}, Oss ∈ R^(M×H′×W′)
where N and M denote the number of channels output by the last ResBlock and by the ResBlock of matching output scale respectively, H and W denote the height and width of the output feature map, and H′ = 2×H, W′ = 2×W.
(5) Design a loss function sensitive to the confidence scores of the fine-grained recognition model: within the same batch, three cases are distinguished, namely samples predicted correctly whose confidence is stable, samples predicted correctly whose confidence changes dynamically, and samples predicted incorrectly, and a different loss term rewards or penalizes the model in each case, with the degree of punishment gradually weakening as training deepens;
A1) samples the model predicts incorrectly are penalized: a penalty factor α is set for each misclassified sample; as the network trains deeper and errors become less frequent, the value of the penalty factor, and hence the degree of punishment, gradually decreases. Its expression uses the mapping F_α(α_0) = (α_0 - 1)^2, where N_bs denotes the size of the current batch, s_n^al denotes the confidence score predicted for the true label al of the nth sample in the batch, s_n denotes the prediction scores of the nth sample over all labels, and N_l denotes the number of labels (the original equation images giving α, s^al and s are not reproduced in this text). α_0 expresses the accuracy within the current batch and is computed from N_al, the number of samples whose highest-scoring predicted label is the true label, i.e. the number of correctly predicted samples in the batch;
A2) samples that the model predicts correctly but with a low confidence score are rewarded: a reward factor β is set, which separates the confidence-score interval of correct but low-confidence samples from that of misclassified samples, so that the model can handle the precise identification of fine-grained pests. Here β_0 represents the ratio of the second-highest score to the highest score among all label scores of a correctly predicted sample, i.e. the largest score among all labels other than the true label divided by the true-label score (the original equation images giving β and β_0 are not reproduced in this text). When β_0 is above 0.5, the penalty imposed on the model grows as β_0 increases;
A3) correctly predicted samples with an unstable confidence score are handled with a reward factor γ, which normalizes the confidence score of the correct sample so that it tends to become stable. Here γ_0 expresses the degree to which the confidence of the true label of a correctly predicted sample reaches the maximum confidence score in the current batch (the original equation images giving γ and γ_0 are not reproduced in this text). In these expressions s_n^al is the confidence score of the true label al when the nth sample in the batch is predicted correctly, and it is set to 0 when the prediction is wrong;
A4) the reward-and-punishment multiple ω is expressed as ω = α + β + γ. A value a is given, with a ∈ [0, 1); when the accuracy within the same batch reaches this set value, the model is additionally penalized, the cross-entropy loss of each sample being increased by the additional multiple ω (the original loss-formula image is not reproduced in this text).
The hyperparameter a is determined by a sequential model-based optimization algorithm; experimental analysis shows that model accuracy is best when a = 0.8. A hedged sketch of the overall S3-Loss computation follows.
Fourthly, the image of the pest to be identified is obtained: acquire the pest image to be identified and preprocess it.
Fifthly, the pest identification result is obtained: input the preprocessed pest image into the trained DB_RN18 model to obtain the classification result.
In order to verify the effectiveness of the algorithm, three network models were reproduced under the same framework and experimental environment: the Baseline network RN18, the classic BLCNN algorithm, and RN18_DB. The BLCNN algorithm is a typical end-to-end training-based method, and RN18_DB was the first in the field of fine-grained image classification to use peak suppression together with a gradient-boosting loss function, achieving a remarkable classification effect. These three models are compared with the method provided by the invention on the fine-grained pest data set ArgFIP20 and the agricultural pest data set AgrIP138; the detailed experimental results are shown in Table 1.
TABLE 1 comparison of the results of the three models and the method of the present invention
(the table of results is provided as an image in the original publication and is not reproduced here)
Table 1 reports the results of the four algorithms trained from scratch on ArgFIP20 and ArgIP138. In general, the ResNet18-based algorithms are superior to VGG16; the proposed method is 2.74% higher than the Baseline algorithm and improves on the latest algorithm in the fine-grained field by 1.11%.
Analysis of the experimental results shows that the proposed method achieves ideal effects on both the ArgFIP20 data set and ArgIP138. On ArgFIP20, the peak-suppression approach forces the model to automatically find more discriminative regions; such an adaptive method without part annotation performs better on pest data sets with varied categories and inconsistent target position and structure, and the automatic identification accuracy reaches 92.25%. By introducing the deconvolution layers, the invention lets the model learn more detailed information, filters background features with the attention mechanism, strengthens the target-related features, and can accurately capture the detailed features of the target body. On ArgIP138, the classification accuracy of the invention reaches 98.23%, and the recognition effect is better than that of the other three models; RN18_DB, however, acquires finer features in a random manner, and in this case its classification is not effectively improved compared with Baseline, its recognition accuracy being lower than Baseline's.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are merely illustrative of the principles of the invention, but that various changes and modifications may be made without departing from the spirit and scope of the invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (1)

1. A pest image classification method based on a fine-grained classification technology is characterized by comprising the following steps:
11) acquisition of training images: acquiring a pest image data set to be trained and preprocessing the pest image data set;
12) constructing a pest classification model: constructing a pest classification model based on a ResNet18 network and a cross entropy loss function, and marking as a DB_RN18 model;
the construction of the pest classification model comprises the following steps:
121) setting a ResNet18 network, and adding two DeconvBlock modules between the last ResBlock layer and a classification layer of the ResNet18 network to construct a DB_RN18 classification model;
122) adding Channel attention and Spatial attention mechanisms in both DeconvBlock modules;
123) constructing a loss function suitable for fine-grained image classification based on the cross entropy loss function: a loss function sensitive to the confidence scores of the fine-grained image recognition model is set; within the same batch size, three cases are distinguished, namely samples predicted correctly whose confidence is stable, samples predicted correctly whose confidence changes dynamically, and samples predicted incorrectly, and a different loss function rewards or penalizes the model in each case, with the degree of punishment gradually weakening as training deepens;
13) training a pest classification model: training the DB_RN18 model by using the pest image data set to be trained, designing a loss function which is more sensitive to the classification result of the DB_RN18 model, and completing the training in an end-to-end mode;
the training of the pest classification model comprises the following steps:
131) inputting a pest image data set to be trained into a ResNet18 network for training;
132) carrying out Channel Attention and Spatial Attention processing on the output characteristic diagram of the ResNet18 network upper-branch ResBlock to obtain an Attention map containing deep characteristic information of the convolutional layer characteristic diagram and enlarge the receptive field of the model;
133) performing deconvolution processing on the feature map of the last convolutional layer of the ResNet18, and expanding the size of the convolutional feature map;
134) fusing the Attention map and the deconvolution feature map information, and extracting the fused feature map information by using double convolution layers; namely, the output Op of the last ResBlock and the output Oss of the upper-branch ResBlock are used as the input of the deconvolution block, and the expression is as follows:
Op = {Op_n : n ∈ [1, N]}, Op ∈ R^(N×H×W)
Oss = {Oss_n′ : n′ ∈ [1, M]}, Oss ∈ R^(M×H′×W′)
wherein N and M respectively represent the number of channels output by the last ResBlock and the upper-branch ResBlock, H and W represent the height and width of the output characteristic diagram, and H′ = 2×H, W′ = 2×W;
135) training a loss function of fine-grained image classification;
14) obtaining an image of the pest to be identified: acquiring a pest image to be identified and preprocessing the pest image;
15) obtaining a pest classification result: inputting the preprocessed pest image into the trained DB_RN18 model to obtain a classification result.
CN202110264082.0A 2021-03-10 2021-03-10 Pest image classification method based on fine-grained classification technology Active CN113239947B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110264082.0A CN113239947B (en) 2021-03-10 2021-03-10 Pest image classification method based on fine-grained classification technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110264082.0A CN113239947B (en) 2021-03-10 2021-03-10 Pest image classification method based on fine-grained classification technology

Publications (2)

Publication Number Publication Date
CN113239947A CN113239947A (en) 2021-08-10
CN113239947B true CN113239947B (en) 2022-09-23

Family

ID=77130207

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110264082.0A Active CN113239947B (en) 2021-03-10 2021-03-10 Pest image classification method based on fine-grained classification technology

Country Status (1)

Country Link
CN (1) CN113239947B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107016405A (en) * 2017-02-24 2017-08-04 中国科学院合肥物质科学研究院 A kind of insect image classification method based on classification prediction convolutional neural networks
CN109344883A (en) * 2018-09-13 2019-02-15 西京学院 Fruit tree diseases and pests recognition methods under a kind of complex background based on empty convolution
CN111898709A (en) * 2020-09-30 2020-11-06 中国人民解放军国防科技大学 Image classification method and device
CN111985370A (en) * 2020-08-10 2020-11-24 华南农业大学 Crop pest and disease fine-grained identification method based on improved mixed attention module
CN112241762A (en) * 2020-10-19 2021-01-19 吉林大学 Fine-grained identification method for pest and disease damage image classification

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171257B (en) * 2017-12-01 2019-11-26 百度在线网络技术(北京)有限公司 Fine granularity image recognition model training and recognition methods, device and storage medium
US11361470B2 (en) * 2019-05-09 2022-06-14 Sri International Semantically-aware image-based visual localization
CN111259982B (en) * 2020-02-13 2023-05-12 苏州大学 Attention mechanism-based premature infant retina image classification method and device
CN111582225B (en) * 2020-05-19 2023-06-20 长沙理工大学 Remote sensing image scene classification method and device
CN112163465B (en) * 2020-09-11 2022-04-22 华南理工大学 Fine-grained image classification method, fine-grained image classification system, computer equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107016405A (en) * 2017-02-24 2017-08-04 中国科学院合肥物质科学研究院 A kind of insect image classification method based on classification prediction convolutional neural networks
CN109344883A (en) * 2018-09-13 2019-02-15 西京学院 Fruit tree diseases and pests recognition methods under a kind of complex background based on empty convolution
CN111985370A (en) * 2020-08-10 2020-11-24 华南农业大学 Crop pest and disease fine-grained identification method based on improved mixed attention module
CN111898709A (en) * 2020-09-30 2020-11-06 中国人民解放军国防科技大学 Image classification method and device
CN112241762A (en) * 2020-10-19 2021-01-19 吉林大学 Fine-grained identification method for pest and disease damage image classification

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Automatic segmentation algorithm for vegetable lepidopteran pest images based on saliency detection; Qian Rong et al.; Journal of Fujian Agriculture and Forestry University (Natural Science Edition); 2019-12-31; Vol. 48, No. 3; 398-404 *

Also Published As

Publication number Publication date
CN113239947A (en) 2021-08-10

Similar Documents

Publication Publication Date Title
CN112308158B (en) Multi-source field self-adaptive model and method based on partial feature alignment
CN108875674B (en) Driver behavior identification method based on multi-column fusion convolutional neural network
Li et al. Selective kernel networks
CN108734208B (en) Multi-source heterogeneous data fusion system based on multi-mode deep migration learning mechanism
CN111325165B (en) Urban remote sensing image scene classification method considering spatial relationship information
CN102314614B (en) Image semantics classification method based on class-shared multiple kernel learning (MKL)
CN113239784B (en) Pedestrian re-identification system and method based on space sequence feature learning
CN106919920A (en) Scene recognition method based on convolution feature and spatial vision bag of words
CN108446589B (en) Face recognition method based on low-rank decomposition and auxiliary dictionary in complex environment
CN111738303B (en) Long-tail distribution image recognition method based on hierarchical learning
CN108345850A (en) The scene text detection method of the territorial classification of stroke feature transformation and deep learning based on super-pixel
CN112232151B (en) Iterative polymerization neural network high-resolution remote sensing scene classification method embedded with attention mechanism
CN104680173A (en) Scene classification method for remote sensing images
CN109886161A (en) A kind of road traffic index identification method based on possibility cluster and convolutional neural networks
CN104809469A (en) Indoor scene image classification method facing service robot
CN110503613A (en) Based on the empty convolutional neural networks of cascade towards removing rain based on single image method
CN108537121A (en) The adaptive remote sensing scene classification method of environment parament and image information fusion
CN111161244B (en) Industrial product surface defect detection method based on FCN + FC-WXGboost
CN112862015A (en) Paper classification method and system based on hypergraph neural network
CN106709419A (en) Video human behavior recognition method based on significant trajectory spatial information
CN110414587A (en) Depth convolutional neural networks training method and system based on progressive learning
CN109165698A (en) A kind of image classification recognition methods and its storage medium towards wisdom traffic
CN108388904B (en) Dimensionality reduction method based on convolutional neural network and covariance tensor matrix
CN113052254A (en) Multi-attention ghost residual fusion classification model and classification method thereof
CN113807176A (en) Small sample video behavior identification method based on multi-knowledge fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant