CN111832642A

CN111832642A - An image recognition method based on VGG16 in insect taxonomy

Info

Publication number: CN111832642A
Application number: CN202010643798.7A
Authority: CN
Inventors: 吴开华; 张赫; 张竞成; 陈冬梅; 李凯强; 李欣恺
Original assignee: Hangzhou Dianzi University
Current assignee: Hangzhou Dianzi University
Priority date: 2020-07-07
Filing date: 2020-07-07
Publication date: 2020-10-27

Abstract

The invention relates to an image recognition method based on VGG16 in insect taxonomy, comprising: S1, establishing an image data set; S2, processing the insect images in the image data set to obtain a training data set; S3, training a VGG16 model on the training data set to obtain a trained VGG16 model; S4, extracting some images from the image data set as reference images and an image to be recognized, and performing corner detection to correct the reference images; S5, processing the image to be recognized and the corrected reference images and inputting them into the trained VGG16 model to extract image features; S6, visualizing the extracted image features to obtain feature maps; S7, calculating the SSIM image similarity between the image to be recognized and all reference images under each insect order, averaging the values per order, and assigning the image to be recognized to the order with the largest average. The invention improves the accuracy and efficiency of insect classification.

Description

An image recognition method based on VGG16 in insect taxonomy

Technical Field

The invention belongs to the technical field of insect image recognition and classification, and in particular relates to an image recognition method based on VGG16 in insect taxonomy.

Background

In traditional insect classification, whether conventional taxonomy or numerical taxonomy, insects are classified mainly by their body characteristics, including color, markings, body appendages (such as tubercles, punctures and cilia) and size (such as body length and body width). However, these characteristics do not fully reflect the morphological differences between insect groups, and classification based on them alone cannot achieve good accuracy.

With the rapid development and wide application of computer science, it has become possible to use computer vision to extract and analyze features from images of organisms, including insects, and to classify species using the extracted features. Insect species differ in body size and shape. Beyond body length and body width, the mathematical morphological features of insects include body area, perimeter, eccentricity, roundness and so on. These quantitative features of body shape may reflect the differences between insect groups more precisely and comprehensively. Before deep learning convolutional neural networks emerged, such features were extracted mainly by hand in pattern recognition, which introduced a degree of subjectivity. In addition, most existing deep learning image recognition methods pay no attention to the intermediate results produced while the neural network runs. The fully connected layers destroy the spatial structure of the image, which costs some recognition accuracy. Moreover, there is currently little research on order-level image recognition in insect taxonomy based on convolutional neural networks.

Summary of the Invention

In view of the above shortcomings of the prior art, one object of the present invention is to solve at least one of the above problems. In other words, one object of the present invention is to provide an image recognition method based on VGG16 in insect taxonomy that meets one or more of the aforementioned needs.

To achieve this object, the present invention adopts the following technical solution:

An image recognition method based on VGG16 in insect taxonomy, comprising the following steps:

S1. Collect insect images of the various insect orders, classify the images by insect order, and establish an image data set;

S2. Perform target and background segmentation and size normalization on the insect images of the image data set to obtain a training data set;

S3. Using transfer learning, train the VGG16 model on the training data set to obtain a trained VGG16 model;

S4. Extract some images from the image data set as reference images and an image to be recognized, perform corner detection on the image to be recognized and the reference images, obtain an offset value by calculating the mean of the corner points, and correct the reference images according to the offset value;

S5. Apply target and background segmentation and size normalization to the image to be recognized and the corrected reference images, then input them into the trained VGG16 model to extract image features;

S6. Use TensorBoard to visualize the extracted image features, obtaining feature maps, and save them;

S7. Calculate the SSIM image similarity between the feature maps of the image to be recognized and those of all reference images under each insect order, take the mean per order, and assign the image to be recognized to the order with the largest mean, which gives the insect order to which it belongs.

Preferably, the insect orders include Lepidoptera, Orthoptera, Hemiptera, Coleoptera and Homoptera.

Preferably, the image data set satisfies the following condition:

Each insect order includes no fewer than 10 pest species, with at least 30 original images per species.

Preferably, the target and background segmentation uses a fully convolutional network (FCN) and turns the background black.

Preferably, the size normalization normalizes the image size to 224*224.

Preferably, in step S2, data augmentation is also performed after size normalization.

Preferably, the data augmentation includes one or more of flipping, rotation, brightness adjustment and color adjustment.

Preferably, the VGG16 model includes 13 convolutional layers, 3 fully connected layers and 5 pooling layers.

Preferably, in step S7, the feature maps produced by the first to fifth convolutional layers of the VGG16 model are used as the input for calculating image similarity.

Preferably, in step S7, the image similarity SSIM is measured in terms of luminance, contrast and structure:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$$

where $x$ and $y$ denote the image to be recognized and the reference image respectively, $\mu_x$ is the mean of $x$, $\mu_y$ is the mean of $y$, $\sigma_x^2$ is the variance of $x$, $\sigma_y^2$ is the variance of $y$, and $\sigma_{xy}$ is the covariance of $x$ and $y$; $c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$ are constants that keep the division stable, $L$ is the dynamic range of the image pixel values (255 for values in 0-255), $k_1 = 0.01$ and $k_2 = 0.03$.

Compared with the prior art, the present invention has the following beneficial effects:

The VGG16-based image recognition method for insect classification makes insects easier to classify and improves the accuracy and efficiency of insect classification.

The invention extracts features from insect images with a deep learning convolutional neural network, which avoids the subjectivity of traditional manual feature extraction and yields more comprehensive features.

Current research on insect taxonomy relies mainly on traditional mathematical morphological features, and deep learning convolutional neural networks have hardly been applied to insect taxonomy, so the present invention is a new attempt at applying deep learning to insect taxonomy.

Description of Drawings

FIG. 1 is a flowchart of the image recognition method based on VGG16 in insect taxonomy according to an embodiment of the present invention;

FIG. 2 is a network structure diagram of VGG16 according to an embodiment of the present invention;

FIG. 3 shows the feature maps of an insect image after convolution by each layer of VGG16 according to an embodiment of the present invention.

Detailed Description

To describe the embodiments of the present invention more clearly, specific embodiments are described below with reference to the accompanying drawings. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings and other embodiments from them without creative effort.

As shown in FIG. 1, the image recognition method based on VGG16 in insect taxonomy according to an embodiment of the present invention includes the following steps:

S1. Collect insect images of the various insect orders, classify the images by insect order, and establish an image data set;

Specifically, insect images of the various orders can be collected from public data sets, data sets obtained by web crawlers, and data sets collected in the field. The collected images are classified by insect order, the orders being the five major groups Lepidoptera, Orthoptera, Hemiptera, Coleoptera and Homoptera; each order includes no fewer than 10 pest species, with at least 30 original images per species.
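The data set condition above (at least 10 pest species per order and at least 30 original images per species) can be checked mechanically. The following is a minimal sketch, assuming the data set is held as a nested dictionary from order name to species name to a list of image file names; that layout and the names `check_dataset` and `make_order` are illustrative assumptions, not part of the described method.

```python
# Hypothetical layout: order name -> species name -> list of image file names.
MIN_SPECIES_PER_ORDER = 10   # from the text: no fewer than 10 pest species per order
MIN_IMAGES_PER_SPECIES = 30  # from the text: at least 30 original images per species


def check_dataset(dataset):
    """Return a list of problems; an empty list means the data set qualifies."""
    problems = []
    for order, species_map in dataset.items():
        if len(species_map) < MIN_SPECIES_PER_ORDER:
            problems.append(f"{order}: only {len(species_map)} species")
        for species, images in species_map.items():
            if len(images) < MIN_IMAGES_PER_SPECIES:
                problems.append(f"{order}/{species}: only {len(images)} images")
    return problems


def make_order(n_species, n_images):
    """Build a synthetic order with placeholder species and file names."""
    return {
        f"species_{i}": [f"img_{j}.jpg" for j in range(n_images)]
        for i in range(n_species)
    }


good = {"Lepidoptera": make_order(10, 30)}
bad = {"Orthoptera": make_order(9, 30)}
print(check_dataset(good))
print(check_dataset(bad))
```

Returning a list of problems rather than a single boolean makes it easy to see which order or species still needs more images.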

S2. Perform target and background segmentation and size normalization on the insect images of the image data set to obtain a training data set;

Specifically, a fully convolutional network (FCN) separates target from background in the insect images of the image data set and turns the background black, so that insect features can be extracted more effectively.
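The FCN itself is beyond a short example, but the step that turns the background black once a segmentation mask exists can be sketched. The code below assumes the FCN has already produced a binary mask (1 for insect pixels, 0 for background); the function name `blacken_background` and the nested-list image representation are assumptions made for illustration.

```python
def blacken_background(image, mask):
    """Set every background pixel to black.

    image: H x W nested list of (r, g, b) tuples.
    mask:  H x W nested list of 0/1 values, 1 marking insect pixels
           (assumed to come from a segmentation network such as an FCN).
    """
    if len(image) != len(mask) or any(
        len(row) != len(mrow) for row, mrow in zip(image, mask)
    ):
        raise ValueError("image and mask must have the same height and width")
    return [
        [pixel if keep else (0, 0, 0) for pixel, keep in zip(row, mrow)]
        for row, mrow in zip(image, mask)
    ]


image = [[(10, 20, 30), (40, 50, 60)],
         [(70, 80, 90), (15, 25, 35)]]
mask = [[1, 0],
        [0, 1]]
segmented = blacken_background(image, mask)
print(segmented)
```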

Forward propagation through a VGG16 fully connected layer multiplies the current layer's weights by the previous layer's output, as follows:

$$a^{k} = \sigma(z^{k}) = \sigma(W^{k} a^{k-1} + b^{k})$$

where $k$ denotes the $k$-th layer, $W^{k}$ is the weight matrix of the $k$-th layer, $b^{k}$ is the bias of the $k$-th layer, $a^{k-1}$ is the output of layer $k-1$ (the previous layer) and therefore the input of layer $k$, $a^{k}$ is the output of layer $k$ (the current layer), and $\sigma$ is the activation function.

The shape of the VGG16 weight matrices is fixed. If the input size is not fixed, forward propagation cannot be computed and model training is interrupted. For training to proceed normally, the image size must be normalized to 224*224.
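The dependence of the fully connected layer on a fixed input size can be seen in a small sketch of the forward pass. The code below evaluates the activation of W·a + b for one layer in plain Python and rejects an input whose length does not match the weight matrix. The names `dense_forward` and `relu`, the choice of ReLU as the activation, and the toy weights are assumptions made for illustration.

```python
def relu(z):
    """ReLU activation (assumed here; the text only calls it sigma)."""
    return max(0.0, z)


def dense_forward(W, b, a_prev, activation=relu):
    """Compute a_k = activation(W_k a_{k-1} + b_k) for one fully connected layer.

    W: list of rows, each of length n_in; b: list of n_out biases;
    a_prev: list of n_in inputs from the previous layer.
    """
    n_in = len(W[0])
    if len(a_prev) != n_in:
        # The weight matrix has a fixed shape, so a differently sized
        # input cannot be propagated.
        raise ValueError(f"expected {n_in} inputs, got {len(a_prev)}")
    return [
        activation(sum(w * a for w, a in zip(row, a_prev)) + bias)
        for row, bias in zip(W, b)
    ]


W = [[1.0, -1.0, 0.0],
     [0.5, 0.5, 0.5]]
b = [0.0, 1.0]
out = dense_forward(W, b, [2.0, 3.0, 4.0])
print(out)

try:
    dense_forward(W, b, [2.0, 3.0])  # wrong input length
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)
```

The same constraint applies at full scale: the first fully connected layer expects the flattened output of pool5, whose size depends on the input image size.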

In addition, for the deep learning convolutional neural network VGG16 to achieve good feature extraction and classification, it needs a large training data set. When the image data set contains few images, data augmentation can therefore be used to expand it; common methods include flipping, rotation, brightness adjustment and color adjustment.

Therefore, so that the images can be input into the convolutional neural network, this embodiment first resizes all images, using a Python program to normalize every image to 224*224.
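The text does not give the Python program used for resizing. As a minimal sketch, the function below normalizes an image of any size to 224*224 with nearest-neighbour sampling. The interpolation method and the name `resize_nearest` are assumptions; in practice an image library would normally be used.

```python
def resize_nearest(image, out_h=224, out_w=224):
    """Resize an H x W nested-list image to out_h x out_w.

    Uses nearest-neighbour sampling (an assumption; the source does not
    name the interpolation method).
    """
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# A tiny 2 x 3 "image" stands in for a real photograph.
small = [[1, 2, 3],
         [4, 5, 6]]
resized = resize_nearest(small)
print(len(resized), len(resized[0]))
```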

For the trained model to generalize well, a large training data set is needed. When the original image data set is too small, it can be expanded by data augmentation (sample expansion); common methods include flipping, brightness adjustment and color adjustment.
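A few of the augmentation operations named in the text can be sketched on a grayscale image stored as a nested list. The function names and the brightness factor are illustrative assumptions, and color adjustment is omitted because it needs a multi-channel image.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def adjust_brightness(image, factor):
    """Scale pixel values by factor and clamp them to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def augment(image):
    """Return the original image plus three augmented variants."""
    return [
        image,
        flip_horizontal(image),
        rotate_90_clockwise(image),
        adjust_brightness(image, 1.2),  # hypothetical brightness factor
    ]


sample = [[1, 2],
          [3, 4]]
variants = augment(sample)
print(len(variants))
```

Each source image yields four training samples here; adding more operations or random parameters enlarges the set further.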

S3. Using transfer learning, train the VGG16 model on the training data set to obtain a trained VGG16 model;

Specifically, as shown in FIG. 2, the deep learning convolutional neural network VGG16 contains:

13 convolutional layers, denoted convx-x, in order: conv1-1, conv1-2, conv2-1, conv2-2, conv3-1, conv3-2, conv3-3, conv4-1, conv4-2, conv4-3, conv5-1, conv5-2 and conv5-3;

5 pooling layers, denoted poolx, in order: pool1 between conv1-2 and conv2-1, pool2 between conv2-2 and conv3-1, pool3 between conv3-3 and conv4-1, pool4 between conv4-3 and conv5-1, and pool5 after conv5-3;

3 fully connected layers, denoted fcxxx, in order after pool5: fc4096, fc4096 and fc1000. The convolutional and fully connected layers carry weight coefficients and are therefore called weight layers; there are 13+3=16 of them, which is where the name VGG16 comes from.
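The layer inventory above can be written down as data and counted, which confirms the 13+3=16 arithmetic. This is a sketch only: `CONV_BLOCKS`, `FC_LAYERS` and `build_vgg16_layer_list` are illustrative names, while the layer names follow the text.

```python
# Convolutional blocks as (block index, number of conv layers), following
# the layer names listed in the text (conv1-1 ... conv5-3).
CONV_BLOCKS = [(1, 2), (2, 2), (3, 3), (4, 3), (5, 3)]
FC_LAYERS = ["fc4096", "fc4096", "fc1000"]


def build_vgg16_layer_list():
    """Return the ordered list of (layer name, kind) pairs."""
    layers = []
    for block, n_convs in CONV_BLOCKS:
        for i in range(1, n_convs + 1):
            layers.append((f"conv{block}-{i}", "conv"))
        layers.append((f"pool{block}", "pool"))  # one pooling layer closes each block
    layers.extend((name, "fc") for name in FC_LAYERS)
    return layers


layers = build_vgg16_layer_list()
counts = {
    kind: sum(1 for _, k in layers if k == kind)
    for kind in ("conv", "pool", "fc")
}
weight_layers = counts["conv"] + counts["fc"]  # pooling layers carry no weights
print(counts, weight_layers)
```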

Because the ImageNet image data set includes insects, a VGG16 model trained on ImageNet should extract insect features reasonably well. Transfer learning is therefore adopted: the VGG16 model pre-trained on the ImageNet data set is further trained on the insect images. The training data set, after image segmentation, size normalization and data augmentation, is input into the VGG16 model, and fine-tuning the network yields the insect image model, i.e. the trained VGG16 model.

S4. Extract some images from the image data set as reference images and an image to be recognized, perform corner detection on the image to be recognized and the reference images, obtain an offset value by calculating the mean of the corner points, and correct the reference images according to the offset value, specifically correcting the relative position of the insect in the image;

Specifically, one picture is randomly selected from the image data set as the image to be recognized. The reference images should cover as many insect species under the different orders as possible, and the images of each species should cover as many insect poses as possible. Corner detection is performed on the image to be recognized and on each reference image, the mean of the corner points of each of the two images is computed, the offset value is obtained by comparing the two means, and the reference image is corrected according to the offset value so that the relative position of the insect is roughly the same in both images.
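One possible reading of the correction step is sketched below. It assumes a corner detector (not shown) has already returned the corner coordinates of each image as (x, y) pairs, takes the offset as the difference between the two corner means, and shifts the reference image by that offset with black fill. The text does not spell out these details, so the interpretation and the names `mean_point`, `corner_offset` and `translate` are assumptions.

```python
def mean_point(points):
    """Mean (x, y) of a list of corner coordinates."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def corner_offset(query_corners, reference_corners):
    """Integer (dx, dy) that moves the reference corner mean onto the query mean."""
    qx, qy = mean_point(query_corners)
    rx, ry = mean_point(reference_corners)
    return (round(qx - rx), round(qy - ry))


def translate(image, dx, dy, fill=0):
    """Shift an H x W nested-list image by (dx, dy); uncovered pixels get fill."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h:
                out[ny][nx] = image[y][x]
    return out


# Reference: a 2 x 2 "insect" in the top-left corner of a 4 x 4 image.
reference = [[1, 1, 0, 0],
             [1, 1, 0, 0],
             [0, 0, 0, 0],
             [0, 0, 0, 0]]
reference_corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
# Corners detected in the image to be recognized (hypothetical values).
query_corners = [(2, 1), (3, 1), (2, 2), (3, 2)]

dx, dy = corner_offset(query_corners, reference_corners)
corrected = translate(reference, dx, dy)
print((dx, dy))
```

A pure translation is the simplest correction consistent with a single offset value; rotation or scale differences between the two images would need more than the corner mean.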

S5. Preprocess the image to be recognized and the corrected reference images (target and background segmentation, then size normalization), then input them into the trained VGG16 model to extract image features;

Specifically, the image to be recognized and the corrected reference images are input into the trained VGG16 model to extract image features.

S6. Use TensorBoard to visualize the extracted image features, obtaining feature maps, and save them;

Specifically, the visualization tool TensorBoard is used to view the feature maps produced when the image passes through the convolutional layers. As shown in FIG. 3, panels 1-13 correspond in order to the feature maps produced by convolutional layers 1-13. The feature maps are saved as the basis for analyzing and selecting convolutional feature maps in the next step.

The shallow layers of VGG16 extract texture and detail features; they contain more features and are also able to extract key features. The deep layers extract contours, shapes and the strongest features, but as the number of convolutional layers grows, the extracted features become high-dimensional and abstract, and over-extraction is likely to produce some useless features. This embodiment therefore uses the feature maps produced by the first to fifth convolutional layers to form a convolutional feature set, which serves as the basis for the image similarity analysis in the next step.

The first two convolutional layers each convolve their input with 64 filters of size 3×3 and stride 1, using "same" padding, which gives a feature map of size 224*224*64; after ReLU, a max pooling layer with a 2×2 filter and stride 2 reduces the size to 112*112*64. The third and fourth convolutional layers each use 128 filters of size 3×3 and stride 1, again with "same" padding, which gives a feature map of size 112*112*128; after ReLU, a 2×2 max pooling layer with stride 2 reduces the size to 56*56*128. The fifth convolutional layer uses 256 filters of size 3×3 and stride 1 with "same" padding, which gives a feature map of size 56*56*256; the max pooling layer that closes this block (pool3 in FIG. 2, after conv3-3), again 2×2 with stride 2, reduces the size to 28*28*256. The parameters of the other convolutional layers follow the prior art and are not repeated here.
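The feature map sizes quoted above follow from two rules: a "same"-padded stride-1 convolution keeps height and width and sets the channel count to the number of filters, and a 2×2 max pool with stride 2 halves height and width. The sketch below traces the first three blocks. The helper names are illustrative, and the three-layer third block is taken from the layer list in FIG. 2.

```python
def conv_same(shape, filters):
    """Output shape of a stride-1 convolution with 'same' padding."""
    h, w, _ = shape
    return (h, w, filters)


def max_pool_2x2(shape):
    """Output shape of a 2 x 2 max pool with stride 2."""
    h, w, c = shape
    return (h // 2, w // 2, c)


shape = (224, 224, 3)  # normalized RGB input
trace = []
# (block name, number of filters, number of conv layers in the block)
for name, filters, n_convs in [("block1", 64, 2), ("block2", 128, 2), ("block3", 256, 3)]:
    for _ in range(n_convs):
        shape = conv_same(shape, filters)
    trace.append((name, "conv", shape))
    shape = max_pool_2x2(shape)
    trace.append((name, "pool", shape))

for entry in trace:
    print(entry)
```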

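S7. Calculate the SSIM image similarity between the feature maps of the image to be recognized and those of all reference images under each insect order, take the mean per order, and assign the image to be recognized to the order with the largest mean;

Specifically, the convolutional feature set built in the previous step is used for image structural similarity analysis. SSIM is a full-reference image quality metric that measures image similarity in terms of luminance, contrast and structure; the structural similarity ranges over (0, 1) and is calculated as follows:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$$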

The SSIM value between the image to be recognized and every reference image under each order is calculated and then averaged per order. The order with the largest mean SSIM is the order to which the image to be recognized belongs.
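The decision rule can be sketched end to end with a simplified SSIM. The function below computes SSIM from the global mean, variance and covariance of two equally sized feature maps flattened to lists, using the constants given in the text (L = 255, k1 = 0.01, k2 = 0.03). Common SSIM implementations average the index over local sliding windows instead, so this single-window version is an assumption made to keep the example short. The names `ssim_global` and `classify` and the toy data are likewise illustrative.

```python
def ssim_global(x, y, L=255, k1=0.01, k2=0.03):
    """SSIM of two equal-length flat lists, using global statistics."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("inputs must be non-empty and of equal length")
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * L) ** 2
    c2 = (k2 * L) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def classify(query, references_by_order):
    """Return (best order, mean SSIM per order) for one query feature map."""
    means = {
        order: sum(ssim_global(query, ref) for ref in refs) / len(refs)
        for order, refs in references_by_order.items()
    }
    return max(means, key=means.get), means


# Toy flattened "feature maps" (hypothetical values).
query = [0, 50, 100, 150]
references = {
    "Lepidoptera": [[0, 52, 98, 150], [5, 50, 100, 145]],
    "Coleoptera": [[150, 100, 50, 0], [200, 200, 200, 200]],
}
label, means = classify(query, references)
print(label)
```

Averaging over all reference images of an order, rather than taking the single best match, makes the decision less sensitive to one unusually similar reference image.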

The above describes only the preferred embodiments and principles of the present invention in detail. Those of ordinary skill in the art may vary the specific implementation according to the ideas provided by the invention, and such variations shall also fall within the protection scope of the present invention.

Claims

1. An image recognition method based on VGG16 in insect taxonomy, characterized by comprising the following steps:

S1, collecting insect images of the various insect orders, classifying the images by insect order, and establishing an image data set;

S2, performing target and background segmentation and size normalization on the insect images of the image data set to obtain a training data set;

S3, training a VGG16 model on the training data set by a transfer learning method to obtain a trained VGG16 model;

S4, extracting some images from the image data set as reference images and an image to be recognized, performing corner detection on the image to be recognized and the reference images, calculating a corner mean to obtain an offset value, and correcting the reference images according to the offset value;

S5, performing target and background segmentation and size normalization on the image to be recognized and the corrected reference images, inputting them into the trained VGG16 model, and extracting image features;

S6, visualizing the extracted image features with TensorBoard to obtain feature maps and saving the feature maps;

S7, calculating the SSIM image similarity between the feature maps of the image to be recognized and those of all reference images under each insect order, calculating the mean per order, and assigning the image to be recognized to the order with the largest mean to obtain the insect order to which the image to be recognized belongs.

2. The image recognition method based on VGG16 in insect taxonomy according to claim 1, wherein the insect orders include Lepidoptera, Orthoptera, Hemiptera, Coleoptera and Homoptera.

3. The image recognition method based on VGG16 in insect taxonomy according to claim 2, wherein the image data set satisfies the following condition:

each insect order comprises no fewer than 10 pest species, with at least 30 original images per species.

4. The image recognition method based on VGG16 in insect taxonomy according to claim 1, wherein the target and background segmentation uses a fully convolutional network (FCN) to turn the background black.

5. The image recognition method based on VGG16 in insect taxonomy according to claim 1, wherein the size normalization normalizes the image size to 224*224.

6. The image recognition method based on VGG16 in insect taxonomy according to claim 1, wherein in step S2, data augmentation is further performed after the size normalization.

7. The image recognition method based on VGG16 in insect taxonomy according to claim 6, wherein the data augmentation comprises one or more of flipping, rotation, brightness adjustment and color adjustment.

8. The image recognition method based on VGG16 in insect taxonomy according to claim 1, wherein the VGG16 model comprises 13 convolutional layers, 3 fully connected layers and 5 pooling layers.

9. The image recognition method based on VGG16 in insect taxonomy according to claim 8, wherein in step S7, the feature maps produced by the first to fifth convolutional layers of the VGG16 model are used as the input for calculating image similarity.

10. The image recognition method based on VGG16 in insect taxonomy according to claim 1, wherein in step S7, the image similarity SSIM is measured in terms of luminance, contrast and structure:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$$

wherein $x$ and $y$ respectively denote the image to be recognized and the reference image, $\mu_x$ is the mean of $x$, $\mu_y$ is the mean of $y$, $\sigma_x^2$ is the variance of $x$, $\sigma_y^2$ is the variance of $y$, and $\sigma_{xy}$ is the covariance of $x$ and $y$; $c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$ are constants for maintaining stability, $L$ is the dynamic range of the image pixel values (255 for values in 0-255), $k_1 = 0.01$ and $k_2 = 0.03$.