CN116310541A - Insect classification method and system based on convolutional network multidimensional learning - Google Patents

Insect classification method and system based on convolutional network multidimensional learning

Info

Publication number
CN116310541A
CN116310541A (application number CN202310239083.9A)
Authority
CN
China
Prior art keywords
insect
training
dimensional
loss function
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310239083.9A
Other languages
Chinese (zh)
Inventor
袁娜朵
朱旭华
陈加肯
梁周瑞
姚波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Top Cloud Agri Technology Co ltd
Original Assignee
Zhejiang Top Cloud Agri Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Top Cloud Agri Technology Co ltd filed Critical Zhejiang Top Cloud Agri Technology Co ltd
Priority to CN202310239083.9A priority Critical patent/CN116310541A/en
Publication of CN116310541A publication Critical patent/CN116310541A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements using pattern recognition or machine learning
    • G06V10/764: Arrangements using classification, e.g. of video objects
    • G06V10/765: Arrangements using rules for classification or partitioning the feature space
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/82: Arrangements using neural networks
    • G06V20/00: Scenes; scene-specific elements
    • G06V20/70: Labelling scene content, e.g. deriving syntactic or semantic representations

Abstract

The invention discloses an insect classification method based on convolutional network multidimensional learning, comprising the following steps: obtaining insect pictures containing various insect species, preprocessing the insect pictures, and retaining the outer contours of the insects to form category insect pictures; dividing the category insect pictures into a training data set and a verification data set, and performing data enhancement processing on the training data set to obtain an enhanced training sample set; constructing a multi-dimensional insect recognition network pre-training model; training and optimizing the pre-training model on the enhanced training sample set to obtain a multi-dimensional insect recognition network model; and inputting the pictures to be identified, all of which have the same size, into the multi-dimensional insect recognition network model to obtain recognition results. By introducing several dimensions of insect information as auxiliary features, the invention lets the model consider information beyond image features during recognition, thereby improving recognition accuracy.

Description

Insect classification method and system based on convolutional network multidimensional learning
Technical Field
The invention belongs to the technical field of big data processing, and particularly relates to an insect classification method and system based on convolution network multidimensional learning.
Background
Insect species are extremely numerous; from the species described to date, there are about 300 rice pests, more than 300 cotton pests, more than 160 apple pests, and about 200 mulberry pests. Because pests are so varied and cause such great damage to crops, crop pest management and control has a critical influence on crop yield and quality. Helping farmers identify insect species better therefore serves two purposes: on the one hand, it improves their knowledge of unknown insects; on the other hand, it enables effective prevention and control of known insects. Better identification also distinguishes pests from beneficial insects, so that insect control can serve agricultural production more effectively and economic benefits increase.
Conventional insect recognition is typically performed by experts. It not only consumes a great deal of an entomologist's time and effort, but the expert may encounter species unknown to them, and an expert must be present at every scene where insects need to be recognized, all of which makes effective identification difficult. An insect recognition method based on deep learning can therefore both improve the efficiency of those who need to recognize insects and locate the insects accurately.
As artificial intelligence technology penetrates the agricultural field, insects can be recognized by learning the texture information and deep semantic information in insect pictures through deep learning, effectively improving both recognition accuracy and efficiency. However, recognition methods based on deep learning models still face bottlenecks that leave results inaccurate: it is difficult to distinguish similar pictures effectively or to handle the long-tail distribution of training sample data, and conventional deep learning training uniformly scales pictures to the same size, discarding the insect-scale feature that could assist recognition.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides an insect classification method and system based on convolution network multidimensional learning.
In order to solve the technical problems, the invention is solved by the following technical scheme:
an insect classification method based on convolution network multidimensional learning comprises the following steps:
obtaining insect pictures containing various insect species, preprocessing the insect pictures, and retaining the outer contours of the insects to form category insect pictures;
dividing the category insect pictures into a training data set and a verification data set, and performing data enhancement processing on the training data set to obtain an enhanced training sample set;
constructing a multi-dimensional insect recognition network pre-training model, wherein the pre-training model comprises an image feature extraction module, a set image feature extraction module, an image-size fully connected layer, an ArcFace loss function, a Focal loss function, and a total loss function;
training and optimizing the multi-dimensional insect recognition network pre-training model on the enhanced training sample set to obtain a multi-dimensional insect recognition network model;
inputting the pictures to be identified, all of which have the same size, into the multi-dimensional insect recognition network model to obtain recognition results.
As an embodiment, preprocessing the insect pictures and retaining the insect outer contours to form category insect pictures includes the following steps:
performing insect-contour calibration on each insect picture to obtain a contour-calibrated picture, wherein the calibration conditions are: the insect body area accounts for at least 70% of the contour area, and the identifying characteristics of the corresponding insect category are retained;
cropping the insect contour from the contour-calibrated picture to obtain an insect contour picture, which serves as the category insect picture.
As an embodiment, the data enhancement processing includes one or more of: padding to a square, mixup, brightness enhancement, saturation enhancement, contrast enhancement, random erasure, random-scale cropping, uniform scaling to the same size, and picture flipping.
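Of the enhancement operations listed, mixup is the least self-explanatory: it blends two training pictures and their labels. A minimal NumPy sketch, with illustrative parameter values that are not taken from the patent:

```python
import numpy as np

def mixup(img_a, img_b, label_a, label_b, alpha=0.2, rng=None):
    """Blend two pictures and their one-hot labels (mixup augmentation).

    alpha controls the Beta distribution of the mixing weight; the patent
    lists mixup among its enhancement steps but gives no parameter values,
    so the default here is illustrative only.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)            # mixing weight in (0, 1)
    img = lam * img_a + (1.0 - lam) * img_b
    label = lam * label_a + (1.0 - lam) * label_b
    return img, label

# Usage: two dummy 4x4 grayscale "pictures" with 2-class one-hot labels.
a, b = np.zeros((4, 4)), np.ones((4, 4))
la, lb = np.array([1.0, 0.0]), np.array([0.0, 1.0])
img, lab = mixup(a, b, la, lb)
```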
As an implementation, constructing the multi-dimensional insect recognition network pre-training model includes the following steps:
constructing an image feature extraction module comprising a preliminary-extraction Stem unit, an inverted residual unit, and a head unit, wherein the Stem unit comprises a 3x3 convolution layer and a first BN subunit; the inverted residual unit comprises 16 inverted residual structures implemented with an attention mechanism; and the head unit comprises a 1x1 convolution layer, a second BN subunit, a pooling layer, and a fully connected layer;
normalizing the length and width of each category insect picture and outputting them as a 2-dimensional scale feature;
acquiring the time information feature of each category insect picture, one-hot coding it into a feature vector, and passing the feature vector through network embedding to obtain a network embedding result used as an input feature, wherein the embedding network comprises a first convolution layer of dimension 1 x m x 1 x 128 and a second convolution layer of dimension 1 x 128 x n, m being the dimension of the time information feature and n the output feature dimension;
constructing the set image feature extraction module and the image-size fully connected layer, whose output dimension k is the number of insect categories.
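Because the one-hot time feature is a single vector, the two 1x1 "convolution layers" of the embedding network (m to 128, then 128 to n) reduce to two matrix products. A sketch with random placeholder weights, using m = 3 time blocks and n = 10 as in the detailed description; the weight values are assumptions:

```python
import numpy as np

def embed_time_feature(one_hot, w1, w2):
    """Network embedding of the one-hot time feature.

    The layers of shape (m -> 128) and (128 -> n) act on one vector,
    so each is a matrix product; weights are random placeholders.
    """
    hidden = one_hot @ w1          # (m,)   -> (128,)
    return hidden @ w2             # (128,) -> (n,)

m, n = 3, 10                       # 3 time blocks, 10-d embedding
rng = np.random.default_rng(42)
w1 = rng.standard_normal((m, 128))
w2 = rng.standard_normal((128, n))
time_vec = np.eye(m)[1]            # one-hot code for, say, the midday block
emb = embed_time_feature(time_vec, w1, w2)
```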
As an embodiment, the method further comprises the following steps:
constructing an ArcFace loss function, a Focal loss function, and a total loss function;
converting the insect-image dimensions into numerical features based on the ArcFace loss function; dividing the insect capture time into time periods, one-hot coding the division result, inputting it into the multi-dimensional insect recognition network pre-training model, and extracting the time feature;
optimizing the enhanced training sample set based on the Focal loss function to mitigate the long-tail distribution caused by class imbalance;
and judging whether the training result has converged based on the total loss function.
As an embodiment, training the multi-dimensional insect recognition network pre-training model comprises the following steps:
randomly selecting a number of image samples from the enhanced training sample set, each image sample carrying its corresponding image scale information;
inputting the image samples and their image scale information into the multi-dimensional insect recognition network pre-training model, and updating the network parameters with the SGD optimization algorithm;
converting the insect-image dimensions into numerical features based on the ArcFace loss function; dividing the insect capture time into time periods, one-hot coding the division result, inputting it into the pre-training model, and extracting the time feature;
optimizing the enhanced training sample set based on the Focal loss function to mitigate the long-tail distribution caused by class imbalance;
judging whether the training result has converged based on the total loss function, and verifying the accuracy of the training result on the verification data set, thereby obtaining the multi-dimensional insect recognition network model.
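The SGD update named in the training step can be sketched as a plain parameter update. The learning rate and momentum values below are illustrative, since the patent specifies no hyperparameters:

```python
import numpy as np

def sgd_step(params, grads, lr=0.01, momentum=0.9, velocity=None):
    """One SGD-with-momentum update over a list of parameter arrays.

    lr and momentum are illustrative defaults, not values from the patent.
    """
    if velocity is None:
        velocity = [np.zeros_like(p) for p in params]
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        velocity[i] = momentum * velocity[i] - lr * g   # accumulate velocity
        new_params.append(p + velocity[i])              # move along velocity
    return new_params, velocity

# Usage: a single 2-element "weight" and its gradient.
w = [np.array([1.0, 2.0])]
g = [np.array([0.5, -0.5])]
updated, vel = sgd_step(w, g)
```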
As an implementation, identifying the picture to be identified with the multi-dimensional insect recognition network model includes the following steps:
scaling each picture to be identified and cropping it to the model input size to form the input picture of the multi-dimensional insect recognition network model;
acquiring the length and width of the picture and normalizing them to obtain the numerical features of the scale layer;
acquiring the time of the insect activity in the picture, converting it into the corresponding time period, and performing one-hot coding and network embedding to extract the time feature;
mapping the output of the multi-dimensional insect recognition network model to insect-category indices and selecting the category with the highest confidence as the final recognition result.
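The final mapping from model output to a category index, choosing the highest-confidence class, can be sketched as a softmax followed by an argmax; the category names below are placeholders, not from the patent:

```python
import numpy as np

def predict_category(logits, categories):
    """Map the network's raw output vector to an insect-category name.

    Softmax turns raw outputs into confidences; the class with the
    highest confidence is the final recognition result.
    """
    z = logits - logits.max()              # shift for numerical stability
    conf = np.exp(z) / np.exp(z).sum()
    idx = int(np.argmax(conf))
    return categories[idx], float(conf[idx])

# Usage with placeholder category names and dummy logits.
cats = ["armyworm", "rice planthopper", "cotton bollworm"]
name, conf = predict_category(np.array([0.2, 2.5, -1.0]), cats)
```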
As an embodiment, the total loss function is:
$$L = L_c + L_{focal}$$
wherein $L$ represents the total loss function of the multi-dimensional insect recognition network model, and $L_c$ and $L_{focal}$ represent the ArcFace loss function and the Focal loss function, respectively;
the ArcFace Loss function is expressed as follows:
Figure BDA0004123447570000041
wherein ,
Figure BDA0004123447570000043
representing a predicted output result, m representing an interval of angles, and N representing a total sample size;
the Focal loss tag loss function is expressed as follows:
Figure BDA0004123447570000042
wherein ,y Representing the output probability of the relevant category, gamma represents the adjustable attention parameter.
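The Focal term can be implemented directly from the formula above. The value gamma = 2 below is the common default from the Focal-loss literature, while the patent leaves gamma adjustable:

```python
import numpy as np

def focal_loss(p_true, gamma=2.0):
    """Focal loss on the probability assigned to the true class:
    L_focal = -(1 - y')**gamma * log(y').

    gamma=2 is a common default, not a value from the patent.
    """
    p = np.clip(p_true, 1e-12, 1.0)   # avoid log(0)
    return -((1.0 - p) ** gamma) * np.log(p)

# A confident correct prediction is down-weighted far more than a poor one,
# which is what relieves the long-tail imbalance during training.
easy = focal_loss(np.array(0.9))
hard = focal_loss(np.array(0.1))
```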
An insect classification system based on convolutional network multidimensional learning comprises a data set acquisition module, a division processing module, a model construction module, a model training module, and an identification module;
the data set acquisition module is used for obtaining insect pictures containing various insect species, preprocessing them, and retaining the outer contours of the insects to form category insect pictures;
the division processing module is used for dividing the category insect pictures into a training data set and a verification data set, and performing data enhancement processing on the training data set to obtain an enhanced training sample set;
the model construction module is used for constructing a multi-dimensional insect recognition network pre-training model comprising an image feature extraction module, a set image feature extraction module, an image-size fully connected layer, an ArcFace loss function, a Focal loss function, and a total loss function;
the model training module is used for training and optimizing the multi-dimensional insect recognition network pre-training model on the enhanced training sample set to obtain a multi-dimensional insect recognition network model;
the identification module is used for inputting the pictures to be identified, all of which have the same size, into the multi-dimensional insect recognition network model to obtain recognition results.
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the following method:
obtaining insect pictures containing various insect species, preprocessing them, and retaining the outer contours of the insects to form category insect pictures;
dividing the category insect pictures into a training data set and a verification data set, and performing data enhancement processing on the training data set to obtain an enhanced training sample set;
constructing a multi-dimensional insect recognition network pre-training model comprising an image feature extraction module, a set image feature extraction module, an image-size fully connected layer, an ArcFace loss function, a Focal loss function, and a total loss function;
training and optimizing the multi-dimensional insect recognition network pre-training model on the enhanced training sample set to obtain a multi-dimensional insect recognition network model;
inputting the pictures to be identified, all of which have the same size, into the multi-dimensional insect recognition network model to obtain recognition results.
An insect classification device based on convolutional network multidimensional learning, comprising a memory, a processor, and a computer program stored in the memory and running on the processor, wherein the processor, when executing the computer program, implements the following method:
obtaining insect pictures containing various insect species, preprocessing them, and retaining the outer contours of the insects to form category insect pictures;
dividing the category insect pictures into a training data set and a verification data set, and performing data enhancement processing on the training data set to obtain an enhanced training sample set;
constructing a multi-dimensional insect recognition network pre-training model comprising an image feature extraction module, a set image feature extraction module, an image-size fully connected layer, an ArcFace loss function, a Focal loss function, and a total loss function;
training and optimizing the multi-dimensional insect recognition network pre-training model on the enhanced training sample set to obtain a multi-dimensional insect recognition network model;
inputting the pictures to be identified, all of which have the same size, into the multi-dimensional insect recognition network model to obtain recognition results.
By adopting the above technical scheme, the invention achieves remarkable technical effects:
By introducing several dimensions of insect information as auxiliary features, the model can consider information beyond image features during recognition, improving recognition accuracy. For recognition equipment that is fixed after installation, the situation in which the same object yields pictures of different pixel sizes, as happens when mobile devices shoot, does not occur; scale features can therefore be introduced into the insect recognition model, and recognition can be assisted by insect size. Introducing the time information feature gives better discrimination between insects of different living habits. In addition, a framework for multi-dimensional feature input is provided: the model architecture of the invention can also incorporate crops, longitude and latitude region information, and other information relevant to insect life. Using Focal loss as the label-recognition loss function relieves, to a certain extent, the data imbalance caused by the long-tail distribution of the training data.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are only some embodiments of the invention, and that other drawings can be obtained according to these drawings without inventive faculty for a person skilled in the art.
FIG. 1 is a schematic overall flow diagram of the method of the present invention;
FIG. 2 is a block diagram of a convolutional network model;
FIG. 3 is a block diagram of an attention mechanism module of a convolutional network;
FIG. 4 is a graph showing the differentiating effect of Arcface Loss on multiple categories over conventional Softmax Loss;
FIG. 5 is a representation of several data enhancement effects of the training set of the present invention;
FIG. 6 is a comparison of precision results using multi-dimensional insect features versus pure image features;
FIG. 7 is a training model structure;
FIG. 8 is a schematic diagram of an insect recognition flow in conjunction with target detection;
FIG. 9 shows recognition results of insects in combination with target detection;
FIG. 10 is a schematic diagram of the overall structure of the system of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following examples, which are illustrative of the present invention and are not intended to limit the present invention thereto.
Example 1:
an insect classification method based on convolution network multidimensional learning, as shown in fig. 1, comprises the following steps:
s100, obtaining insect pictures containing various insect species, preprocessing the insect pictures, and reserving the outer outlines of insects to form the insect pictures of the types;
s200, dividing the class insect picture into a training data set and a verification data set, and carrying out data enhancement processing on the training data set to obtain an enhanced training sample set;
s300, constructing a multi-dimensional insect recognition network pre-training model, wherein the multi-dimensional insect recognition network pre-training model comprises an image feature extraction module, a set image feature extraction module, an image size full-connection layer, an Arcface loss function, a Focal loss function and a total loss function;
s400, training and optimizing a multi-dimensional insect recognition network pre-training model based on the enhanced training sample set to obtain a multi-dimensional insect recognition network model;
s500, inputting the picture to be identified into a multi-dimensional insect identification network model to obtain an identification result, wherein the sizes of the picture sets to be identified are the same.
In detail, the insect classification method based on convolutional network multidimensional learning includes the following steps:
the first step: obtaining insect pictures containing various insect types to form an original data set, wherein in the original data set, all the insect pictures need to meet the condition that insect main body parts account for more than 70% of the total picture area so as to reduce the interference effect of the background in picture identification; if the condition that the insect main body part occupies 70% of the total area of the picture is not satisfied, cutting or complementing the insect picture again until the condition is satisfied; the sources of the insect pictures at least comprise insect pictures obtained through web crawlers, insect pictures taken by agricultural equipment or farmers or plant protection group personnel, insect pictures of public data sets and the like; carrying out insect outline calibration treatment on the collected insect pictures, and then intercepting the insect outline as a category insect picture, wherein the calibration conditions of the calibration treatment are as follows: the insect body area accounts for at least 70% of the insect outline area, and retains the identifying characteristics of the corresponding insect class.
The second step: sample enhancement processing is performed on the original data set, and the data set is divided into a training data set, a verification data set, and a test data set, for example in an 8:1:1 ratio by sample count.
The third step: data enhancement processing is performed on the training sample set to obtain an enhanced training sample set. The data enhancement processing may include: padding to a square, mixup, brightness enhancement, saturation enhancement, contrast enhancement, random erasure, random-scale cropping and uniform scaling to the same size, picture flipping, and random-angle tilting of the picture.
The training sample set is padded with black to a square, then randomly cropped and scaled to a fixed size, for example uniformly scaled to 224x224 pixels. Padding to a square effectively preserves the integrity of the picture's main subject: if the picture were simply scaled to a fixed size as in conventional image processing, the contour or shape of the insect would be deformed, for example a rectangular insect squeezed into a square, losing the insect's original morphological characteristics and increasing the probability of misidentification.
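The black padding to a square described above can be sketched in a few lines of NumPy; this is a minimal illustration, not the patent's implementation:

```python
import numpy as np

def pad_to_square(img, fill=0):
    """Pad an H x W (x C) picture with black borders until it is square,
    preserving the insect's aspect ratio before any uniform scaling.
    """
    h, w = img.shape[:2]
    side = max(h, w)
    pad_h, pad_w = side - h, side - w
    pads = [(pad_h // 2, pad_h - pad_h // 2),
            (pad_w // 2, pad_w - pad_w // 2)]
    pads += [(0, 0)] * (img.ndim - 2)      # leave channel axis untouched
    return np.pad(img, pads, constant_values=fill)

# Usage: a 100x224 RGB "picture" of ones becomes 224x224 with black borders.
sq = pad_to_square(np.ones((100, 224, 3)))
```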
In addition, the area ratio and aspect ratio of the random cropping can be set to (0.7, 1) and (0.7, 1.3) respectively, so that more, and more comprehensive, local information about the insect is obtained, and better recognition is achieved among same-class insects when facing incomplete pictures and similar features.
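Sampling a crop size from the stated area ratio (0.7, 1) and aspect ratio (0.7, 1.3) might look as follows; the exact sampling scheme is an assumption, since the patent only gives the two ranges:

```python
import numpy as np

def random_crop_params(h, w, area_range=(0.7, 1.0),
                       ratio_range=(0.7, 1.3), rng=None):
    """Sample a (crop_h, crop_w) from the given area-ratio and
    aspect-ratio ranges, clipped to the picture bounds.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    area = rng.uniform(*area_range) * h * w   # target crop area
    ratio = rng.uniform(*ratio_range)         # target width / height
    crop_w = int(round(float(np.sqrt(area * ratio))))
    crop_h = int(round(float(np.sqrt(area / ratio))))
    return min(crop_h, h), min(crop_w, w)

ch, cw = random_crop_params(224, 224)
```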
While an insect picture is obtained, its size information (width and height) is also obtained and converted into a numerical feature by min-max normalization:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
wherein the maximum and minimum scale values $x_{\max}$ and $x_{\min}$ are obtained by traversing the widths and heights of all pictures in the training data set.
In a common recognition task, different shooting distances produce pictures of different pixel sizes, and even the same target may appear at different scales in different pictures; in that case the picture size cannot serve as a recognition feature. In some agricultural recognition settings, however, the photographing equipment is fixed, and for convenience the base plate on which insects are photographed is fixed as well, so pictures captured by the same equipment of the same insect always have the same size. In this case, the size of the insect picture can be used as an important recognition feature to improve the model's accuracy. Moreover, conventional deep learning training uniformly scales pictures of different sizes to a single model input size. In natural scenes there are insects of different sizes with very similar appearance, often uniformly black; once scaled to the same size, an ordinary deep learning model can hardly learn the differences between them. Taking the original picture size as one feature dimension therefore improves recognition accuracy in this situation.
In conventional identification, the circadian rhythm of insect activity is generally not considered as an identification feature. Yet insect activities such as hatching, pupation, eclosion, mating, and oviposition follow rhythms matching the circadian cycle in nature; insect activity often varies rhythmically with the alternation of day and night, a phenomenon known as circadian rhythm. Diurnal insects such as butterflies, dragonflies, ground beetles, and tiger beetles are active during the day; most nocturnal insects, such as cutworms and other moths, underground pests, boring insects, and blood-sucking mosquitoes, are active at night; day-and-night insects such as ants act both in the daytime and at night. Using time-period information as an identification feature can therefore effectively improve the recognition rate of insects active in specific time periods.
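A sketch of how a capture hour could be mapped to the three time blocks (morning, midday, evening) used later in the description; the hour boundaries are illustrative, since the patent does not fix them:

```python
def time_block_one_hot(hour):
    """Map a capture hour (0-23) to a one-hot code over three time
    blocks: morning, midday, evening/night.  Boundaries are assumptions.
    """
    if 5 <= hour < 11:
        block = 0      # morning
    elif 11 <= hour < 17:
        block = 1      # midday
    else:
        block = 2      # evening / night
    code = [0, 0, 0]
    code[block] = 1
    return code

# Usage: three capture times falling into the three blocks.
morning, midday, night = (time_block_one_hot(h) for h in (8, 14, 22))
```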
Fourth, a multi-dimensional insect recognition network pre-training model is constructed. First, an image feature extraction module is built. When selecting the structure, the characteristics of several deep learning model families were considered: convolutional structures generalize better than the current Transformer series, while attention mechanisms extract the texture and semantic information of detail features more effectively, so this embodiment adopts serially connected attention modules as the main structure for image feature extraction. The image feature extraction module comprises a primary-extraction Stem unit, an inverted residual unit and a head unit. The Stem unit comprises a 3x3 convolution layer and a first BN subunit; the inverted residual unit comprises 16 inverted residual structures realized based on an attention mechanism; the head unit comprises a 1x1 convolution layer, a second BN subunit, a pooling layer and a fully connected layer. The length and width of each category insect picture are normalized and output as a 2-dimensional feature. The time information of each category insect picture is acquired, one-hot coded and converted into a feature vector; the feature vector is passed through network embedding to obtain the network embedding result, which is taken as an input feature. The network used for the embedding comprises a first convolution layer of dimension 1 x m x 1 x 128 and a second convolution layer of dimension 1 x 128 x n, where m denotes the dimension of the time information feature and n is the output feature dimension. Finally, an aggregate image feature extraction module and an image-size fully connected layer are constructed, whose input dimension is k and whose output dimension is the number of insect categories.
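On a one-hot input, the two convolution layers of the network-embedding step above act as two fully connected layers (m → 128 → n). A minimal NumPy sketch with m = 3 time blocks and n = 10 as in this embodiment; the random weights, the ReLU activation and all names are illustrative assumptions, not the patented parameters:

```python
import numpy as np

def one_hot(index, m):
    """Encode a time-period index (0=morning, 1=midday, 2=evening) as an m-dim one-hot vector."""
    v = np.zeros(m)
    v[index] = 1.0
    return v

def time_embedding(x, w1, w2):
    """Two-layer network embedding m -> 128 -> n, mirroring the 1 x m x 1 x 128
    and 1 x 128 x n convolution layers applied to the one-hot time feature."""
    h = np.maximum(w1 @ x, 0.0)   # first layer + ReLU (activation is an assumption)
    return w2 @ h                 # n-dimensional dense time feature

rng = np.random.default_rng(0)
m, n = 3, 10                      # three time blocks in, 10-dim feature out
w1 = rng.normal(size=(128, m)) * 0.1
w2 = rng.normal(size=(n, 128)) * 0.1
t_feat = time_embedding(one_hot(1, m), w1, w2)   # embed "midday"
print(t_feat.shape)               # (10,)
```

In training, w1 and w2 would be learned jointly with the rest of the network rather than fixed random matrices.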
2) As shown in fig. 2, the first Conv3x3 comprises a 3x3 convolution layer and a first BN subunit with a padding parameter of 1, and performs the channel-expansion operation; the numbers of input and output channels are kept consistent with the last Conv3x3. The second Conv3x3 likewise consists of a 3x3 convolution layer and a BN subunit with a padding parameter of 1, and performs the channel-reduction operation. In the post-processing stage after the 16 inverted residual (Linear Bottleneck) structures, the image features pass through a convolution layer of 1x1 kernel size, a second BN subunit and a pooling layer, and are then fed to a fully connected layer for label classification; the input dimension of this fully connected layer is 1280 and the output dimension is 100, representing the image feature. The length and width of each picture are acquired and normalized respectively, giving an output dimension of 2. The shooting time of each insect picture is acquired, divided into the three time blocks morning, midday and evening, one-hot coded, and input to the network embedding. The model structure is illustrated in fig. 2 and the attention mechanism module in fig. 3.
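One inverted residual (Linear Bottleneck) structure of the kind described above might be sketched as follows, assuming PyTorch; the squeeze-and-excitation form of the attention subunit, the expansion ratio and the channel counts are illustrative assumptions rather than the exact patented configuration:

```python
import torch
import torch.nn as nn

class InvertedResidualSE(nn.Module):
    """Expand (1x1) -> depthwise 3x3 (padding=1) -> channel attention -> project (1x1),
    with a residual add; the projection is a linear bottleneck (no activation)."""
    def __init__(self, channels, expand=4):
        super().__init__()
        mid = channels * expand
        self.expand = nn.Sequential(nn.Conv2d(channels, mid, 1, bias=False),
                                    nn.BatchNorm2d(mid), nn.ReLU6(inplace=True))
        self.dw = nn.Sequential(nn.Conv2d(mid, mid, 3, padding=1, groups=mid, bias=False),
                                nn.BatchNorm2d(mid), nn.ReLU6(inplace=True))
        # squeeze-and-excitation style channel attention (an assumed attention form)
        self.se = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                nn.Conv2d(mid, mid // 4, 1), nn.ReLU(inplace=True),
                                nn.Conv2d(mid // 4, mid, 1), nn.Sigmoid())
        self.project = nn.Sequential(nn.Conv2d(mid, channels, 1, bias=False),
                                     nn.BatchNorm2d(channels))

    def forward(self, x):
        h = self.dw(self.expand(x))
        h = h * self.se(h)               # reweight channels by attention
        return x + self.project(h)       # channels match, so the residual add is valid

block = InvertedResidualSE(32)
out = block(torch.randn(1, 32, 56, 56))
print(out.shape)                         # torch.Size([1, 32, 56, 56])
```

Sixteen such blocks stacked in series, followed by the 1x1 convolution, BN, pooling and fully connected head, would give the image feature extraction path described in the text.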
The 100-dimensional image feature, the 2-dimensional width-height feature and the 10-dimensional time feature obtained after network embedding are combined through the concat layer of deep learning; since all features have already been processed into numerical features, they can be combined directly as the input of the fully connected layer. The combined 112 dimensions serve as the input dimension of the final fully connected layer, whose output dimension is the number of insect categories in training.
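The feature fusion described above (100-dimensional image feature + 2-dimensional size feature + 10-dimensional time feature → 112-dimensional input of the final fully connected layer) can be sketched in NumPy; the random values stand in for real extracted features, and 138 classes follows the data set size given later:

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes = 138                        # number of insect categories in the embodiment

img_feat  = rng.normal(size=100)         # 100-dim image feature from the attention backbone
size_feat = np.array([0.43, 0.78])       # normalized width and height (2-dim)
time_feat = rng.normal(size=10)          # 10-dim time feature after network embedding

fused = np.concatenate([img_feat, size_feat, time_feat])   # concat layer -> 112-dim
assert fused.shape == (112,)

w = rng.normal(size=(num_classes, 112)) * 0.01             # final fully connected layer
b = np.zeros(num_classes)
logits = w @ fused + b                   # one score per insect category
print(logits.shape)                      # (138,)
```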
In addition, an ArcFace loss function, a Focal loss function and a total loss function need to be constructed respectively:
based on the ArcFace loss function, the insect image is converted into a numerical feature of specified dimension; the shooting time of each insect picture is divided into time periods, and the one-hot code of the division result is input into the multi-dimensional insect recognition network pre-training model to extract the time features; based on the Focal loss function, the enhanced training sample set is optimized, mitigating the long-tail distribution that arises when the data are unbalanced; whether the training result has converged is judged based on the total loss function;
the total loss function is:

L = L_c + L_{focal}

where L represents the total loss function of the multi-dimensional insect recognition network model, and L_c and L_{focal} represent the ArcFace loss function and the Focal loss function, respectively;
The ArcFace loss function is expressed as follows:

L_c = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)} + \sum_{j\neq y_i} e^{s\cos\theta_j}}

where the fraction inside the logarithm represents the predicted output result, s is the feature scale, m represents the angular margin, and N represents the total sample size;
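The ArcFace loss can be sketched in NumPy as follows; the scale s, margin m and the toy batch are illustrative values, not the settings of the embodiment:

```python
import numpy as np

def arcface_loss(features, weights, labels, s=30.0, m=0.5):
    """ArcFace: L2-normalize features and class weights, add angular margin m to the
    target-class angle, scale by s, then apply cross-entropy over the cosine logits."""
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = np.clip(x @ w, -1.0, 1.0)                  # cos(theta_j) for every class
    theta = np.arccos(cos)
    rows = np.arange(len(labels))
    logits = s * cos
    logits[rows, labels] = s * np.cos(theta[rows, labels] + m)  # margin on the target angle
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return -np.mean(np.log(p[rows, labels] + 1e-12))

rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 16))                     # batch of 4 embeddings
wts = rng.normal(size=(16, 5))                       # 5 classes
labels = np.array([0, 1, 2, 3])
loss_with_margin = arcface_loss(feats, weights=wts, labels=labels)
loss_no_margin = arcface_loss(feats, weights=wts, labels=labels, m=0.0)
assert loss_with_margin > loss_no_margin             # the margin makes training stricter
```

The margin m lowers the target-class logit, which forces the network to pull same-class embeddings closer together than plain Softmax would.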
the Focal loss label loss function is expressed as follows:

L_{focal} = -(1-y')^{\gamma} \log(y')

where y' represents the output probability of the relevant category and γ represents the adjustable attention parameter.
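The Focal loss reduces to ordinary cross-entropy when γ = 0 and down-weights well-classified samples otherwise, which is what lets the scarce, hard categories dominate training. A small NumPy sketch:

```python
import numpy as np

def focal_loss(p, gamma=2.0):
    """Focal loss for the probability p assigned to the true class: -(1-p)^gamma * log(p)."""
    p = np.clip(p, 1e-12, 1.0)
    return -((1.0 - p) ** gamma) * np.log(p)

# gamma = 0 recovers ordinary cross-entropy
assert np.isclose(focal_loss(0.7, gamma=0.0), -np.log(0.7))

# a well-classified sample (p=0.9) is down-weighted far more than a hard one (p=0.1),
# so the rare insect categories that the model gets wrong contribute most of the loss
easy, hard = focal_loss(0.9), focal_loss(0.1)
ce_easy, ce_hard = focal_loss(0.9, gamma=0.0), focal_loss(0.1, gamma=0.0)
assert easy / ce_easy < hard / ce_hard
```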
Fifthly, 64 image samples and their corresponding image scales are randomly selected from the training set and input into the multi-dimensional metric learning network, and the network parameters of the metric learning network are optimized and updated with the stochastic gradient descent (SGD) algorithm. ArcFace is used as the loss function of the image embedding features: it enlarges the inter-class distance and reduces the intra-class distance, and at the same time allows the image features to be converted into a representation of specified dimension so that they can be concatenated with the features of the other dimensions to obtain the multi-dimensional insect feature representation. Focal Loss is used as the loss function of label classification, to optimize the long-tail distribution that exists when the training data are unbalanced;
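One SGD update of the kind applied here, with the momentum (0.9) and weight-decay (0.0005) values given later in this embodiment, can be sketched in NumPy; the quadratic objective is a stand-in for the real network loss:

```python
import numpy as np

def sgd_step(w, grad, velocity, lr=0.01, momentum=0.9, weight_decay=0.0005):
    """One SGD update with momentum and L2 weight decay."""
    grad = grad + weight_decay * w             # weight decay acts as an L2 penalty gradient
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# stand-in objective: f(w) = 0.5 * ||w||^2, whose gradient is simply w
w = np.array([2.0, -3.0])
v = np.zeros_like(w)
losses = []
for _ in range(50):
    losses.append(0.5 * np.dot(w, w))
    w, v = sgd_step(w, grad=w.copy(), velocity=v)
assert losses[-1] < losses[0]                  # the iterates descend toward the minimum
```

In the real training loop the gradient would come from back-propagating the total loss over each 64-sample batch rather than from this toy objective.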
consider the situation where the convergence rate is slow during training and the final accuracy is general in the conventional Softmax loss. Therefore, the Loss function Arcface Loss introduced into measurement learning is considered to strengthen the distinction of different types of insects, the identified prediction result is only dependent on the angle between the characteristics and the weights and mapped to the hypersphere characteristic space, and the additive angle margin (additive angular margin) parameter m is introduced to simultaneously strengthen the intra-class compactness and the inter-class variability, so that the identification precision of the model is improved. Fig. 4 shows the differentiating effect of Arcface Loss on different categories during picture recognition than the conventional Softmax Loss.
In the multi-class insect data set collected during training, pictures of some insects are unavoidably difficult to collect, so the number of training images in those categories is small, while pictures of other insects are easy to collect and their categories contain many images. In this embodiment the more abundant categories contain tens of thousands of images while the scarce categories contain only a few tens, so the data set shows an obvious long-tail distribution and data imbalance. Focal loss is therefore introduced to assign a higher loss to the scarce insect categories and increase their recognition rate.
Here, the ArcFace Loss function, the insect size dimension, the insect activity time dimension and the Focal Loss function respectively address four problems: increasing the discrimination of similar samples, distinguishing insects of different sizes but similar appearance, distinguishing insects with different living habits, and the unbalanced long-tail distribution of the training samples.
And step six, judging whether the total metric learning loss has converged, and saving the optimal model according to the accuracy on the verification set.
Compared with a traditional convolutional network, the main improvement lies in the way the image features are trained. The original approach appends a Softmax loss after the final fully connected layer to train the label classes, but it trains slowly and its accuracy is mediocre. A metric learning method is therefore introduced to extract the embedded features of the images, enlarging the inter-class distances and reducing the intra-class distances, and expressing each insect as a feature of specified dimension. In addition, the scale characteristics of insects are added to training as continuous numerical features, so a model trained this way discriminates better between insects of different scales but similar appearance, and the time-dimension features obtained after network-embedding conversion better distinguish insects active in different time periods. The embodiment further provides an architecture that cascades insect information of multiple dimensions: other insect information can be feature-extracted in a similar way through network embedding and used in the final classification. Finally, Focal loss is added to the final label training, so that unbalanced, long-tail-distributed data achieve better performance.
In the present invention, the data set currently contains photographs of 138 kinds of insects under different illumination conditions, sizes and shooting angles. The training set is first augmented by data enhancement such as flipping, rotation, Mixup image fusion, brightness enhancement and contrast enhancement, which increases the number of samples and at the same time the robustness of the model; fig. 5 shows an example of the sample enhancement that may be used in the present invention. All data are collected to form an original data set, which is uniformly and randomly divided into a training data set, a verification data set and a test data set at a ratio of 8:1:1.
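The uniform random 8:1:1 split can be sketched in plain Python; the file names are hypothetical:

```python
import random

def split_dataset(samples, seed=0):
    """Uniformly and randomly divide a data set into train/val/test at an 8:1:1 ratio."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)       # fixed seed keeps the split reproducible
    n = len(samples)
    n_train, n_val = int(n * 0.8), int(n * 0.1)
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])

data = [f"img_{i:04d}.jpg" for i in range(1000)]   # hypothetical picture names
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))             # 800 100 100
assert not (set(train) & set(val)) and not (set(val) & set(test))
```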
The invention is divided into a training part and a testing part. The development operating system of the training part may be Linux Ubuntu 20.04; the CPU model is Intel Core i7-8700 with 64 GB of memory; two NVIDIA GeForce RTX 3090 cards serve as GPU support for model training, with graphics driver version 430.40, CUDA version 10.0 and cuDNN version cudnn-v7.3.1; the main development languages are C and Python, and the OpenCV version used when configuring the convolutional network is 2.4.9. The method can also be implemented on other development operating systems.
Based on the above improvements, the specific training process of the attention-based network model is as follows. The convolutional network is built according to the model structure of fig. 2, and the hyper-parameters during training are: the momentum parameter, which affects the speed of gradient descent toward the optimum, is set to 0.9; the weight-decay regularization coefficient is set to 0.0005, which effectively prevents overfitting; the learning rate is warmed up and then gradually reduced. The ArcFace loss function and the Focal loss function are constructed respectively, and their sum is used as the total loss function for training. The model with the best result on the validation set is kept as the final model. The accuracy on the local test set of multiple insects is shown in fig. 6, and the structural representation of the network in fig. 7.
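The warm-up schedule mentioned above (learning rate rises, then gradually decays) might look like the following sketch; the step counts and decay factor are illustrative assumptions, not the embodiment's settings:

```python
def warmup_lr(step, base_lr=0.01, warmup_steps=500, decay=0.95, decay_every=1000):
    """Linear warm-up to base_lr, then stepwise exponential decay (schedule form is illustrative)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps   # ramp up linearly
    return base_lr * decay ** ((step - warmup_steps) // decay_every)

assert warmup_lr(0) < warmup_lr(250) < warmup_lr(499)   # rising during warm-up
assert abs(warmup_lr(499) - 0.01) < 1e-9                # reaches the base rate
assert warmup_lr(5000) < warmup_lr(1500)                # decaying afterwards
```

Warming up avoids large, destabilizing updates while the batch-norm statistics and ArcFace weights are still poorly initialized.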
In other embodiments, the method further comprises the steps of: performing insect target detection on the insect pictures to be classified, calling the trained insect recognition model to obtain a recognition result, and finally feeding the result back to recognition software or an app on a mobile terminal or computer. The displayed result generally comprises the insect target frame and the recognized insect type, as shown in fig. 9; the class e13 in fig. 9 is the training label corresponding to the mole cricket in the training set.
In one embodiment, the insects in the original data set can first be coarsely classified with deep learning according to their morphological characteristics, for example into lepidoptera, beetles and the like; because the coarse classes are few, extremely high precision can usually be achieved. The coarse result is then further finely classified with deep learning, for example into chilo suppressalis, meadow moth, corn borer and the like under lepidoptera, thereby achieving a hierarchical classification of insects.
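The coarse-to-fine hierarchy described above can be sketched as two chained argmax decisions; the class names follow the examples in the text, and the probability values are stubs standing in for real model outputs:

```python
# Hypothetical hierarchy: a coarse model assigns an order-level class, then an
# order-specific fine model picks the species within that order only.
COARSE_TO_FINE = {
    "lepidoptera": ["chilo suppressalis", "meadow moth", "corn borer"],
    "coleoptera": ["ladybird", "weevil"],
}

def argmax(probs):
    """Return the key with the highest probability."""
    return max(probs, key=probs.get)

def classify(coarse_probs, fine_probs_by_group):
    """Pick the coarse group first, then the fine class within that group."""
    group = argmax(coarse_probs)
    species = argmax(fine_probs_by_group[group])
    return group, species

coarse = {"lepidoptera": 0.92, "coleoptera": 0.08}
fine = {"lepidoptera": {"chilo suppressalis": 0.2, "meadow moth": 0.7, "corn borer": 0.1},
        "coleoptera": {"ladybird": 0.6, "weevil": 0.4}}
print(classify(coarse, fine))   # ('lepidoptera', 'meadow moth')
```

Restricting the fine model to one order at a time is what makes each sub-problem easier than flat 138-way classification.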
Example 2:
an insect classification system based on convolutional network multidimensional learning, as shown in fig. 10, comprises a data set acquisition module 100, a division processing module 200, a model construction module 300, a model training module 400 and an identification module 500;
the data set acquisition module 100 is used for acquiring insect pictures containing various insect types, preprocessing the insect pictures and reserving the outer outlines of insects to form the insect type pictures;
the division processing module 200 is configured to divide the class insect picture into a training data set and a verification data set, and perform data enhancement processing on the training data set to obtain an enhanced training sample set;
the model construction module 300 is configured to construct a multi-dimensional insect recognition network pre-training model, where the multi-dimensional insect recognition network pre-training model includes an image feature extraction module, an aggregate image feature extraction module, an image size full-connection layer, an Arcface loss function, a Focal loss function, and a total loss function;
model training module 400, configured to: training and optimizing the multi-dimensional insect recognition network pre-training model based on the enhanced training sample set to obtain a multi-dimensional insect recognition network model;
the recognition module 500 is configured to input a picture to be recognized into the multi-dimensional insect recognition network model to obtain a recognition result, where the sizes of the picture sets to be recognized are the same.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described by differences from other embodiments, and identical and similar parts between the embodiments are all enough to be referred to each other.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In addition, the specific embodiments described in the present specification may differ in terms of parts, shapes of components, names, and the like. All equivalent or simple changes of the structure, characteristics and principle according to the inventive concept are included in the protection scope of the present invention. Those skilled in the art may make various modifications or additions to the described embodiments or substitutions in a similar manner without departing from the scope of the invention as defined in the accompanying claims.

Claims (11)

1. The insect classification method based on convolution network multidimensional learning is characterized by comprising the following steps of:
obtaining insect pictures containing various insect species, preprocessing the insect pictures, and reserving the outer outlines of insects to form insect-like pictures;
dividing the class insect pictures into a training data set and a verification data set, and carrying out data enhancement processing on the training data set to obtain an enhanced training sample set;
constructing a multi-dimensional insect recognition network pre-training model, wherein the multi-dimensional insect recognition network pre-training model comprises an image feature extraction module, a set image feature extraction module, an image size full-connection layer, an Arcface loss function, a Focal loss function and a total loss function;
training and optimizing the multi-dimensional insect recognition network pre-training model based on the enhanced training sample set to obtain a multi-dimensional insect recognition network model;
inputting the picture to be identified into a multi-dimensional insect identification network model to obtain an identification result, wherein the sizes of the picture sets to be identified are the same.
2. The method for classifying insects based on convolutional network multidimensional learning as recited in claim 1, wherein the preprocessing of the insect pictures and retaining the insect outline forms a classified insect picture, comprising the steps of:
and carrying out insect outline calibration treatment on the insect picture to obtain an outline calibration picture, wherein the calibration conditions of the calibration treatment are as follows: the main area of the insect accounts for at least 70% of the outline area of the insect, and the identification characteristics of the corresponding insect category are reserved;
and carrying out insect outline interception on the outline calibration picture to obtain an insect outline picture, and taking the insect outline picture as a category insect picture.
3. The method of insect classification based on convolutional network multidimensional learning of claim 1, wherein the data enhancement process comprises one or more of: patch-size adjustment, Mixup, brightness enhancement, saturation enhancement, contrast enhancement, random erasure, random scale cropping, unified scaling to the same size, and picture flipping.
4. The method for classifying insects based on convolutional network multi-dimensional learning as claimed in claim 1, wherein said constructing a multi-dimensional insect recognition network pre-training model comprises the steps of:
an image feature extraction module is constructed, wherein the image feature extraction module comprises a primary extraction Stem unit, an inverted residual unit and a head unit; the primary extraction Stem unit comprises a 3x3 convolution layer and a first BN subunit; the inverted residual unit comprises 16 inverted residual structures realized based on an attention mechanism; the head unit comprises a 1x1 convolution layer, a second BN subunit, a pooling layer and a fully connected layer;
normalizing the length and the width of each type of insect picture, and outputting a 2-dimensional image to obtain a two-dimensional image;
acquiring time information features of each category insect picture, performing one-hot coding and converting them into feature vectors, obtaining a network embedding result from the feature vectors through network embedding processing, and taking the network embedding result as an input feature, wherein the network used for the embedding processing comprises a first convolution layer of dimension 1 x m x 1 x 128 and a second convolution layer of dimension 1 x 128 x n, where m represents the dimension of the time information feature and the feature output dimension is n;
and constructing an aggregate image feature extraction module and an image-size fully connected layer, whose input dimension is k and whose output dimension is the number of insect categories.
5. The method for classifying insects based on convolutional network multidimensional learning as recited in claim 4, further comprising the steps of:
respectively constructing an Arcface loss function, a Focal loss function and a total loss function;
converting the dimension of the insect image into a numerical feature based on an Arcface loss function; dividing the time period characteristics of the insect shooting into time periods, inputting a one-hot code of the division result into a multi-dimensional insect recognition network pre-training model, and extracting the time characteristics;
based on the Focal loss function, optimizing the enhanced training sample set and mitigating the long-tail distribution phenomenon existing under unbalanced conditions;
and judging whether the training result is converged or not based on the total loss function.
6. The method for classifying insects based on convolutional network multi-dimensional learning as claimed in claim 1, wherein said training multi-dimensional insect recognition network pre-training model comprises the steps of:
randomly selecting a plurality of image samples from the enhanced training sample set, wherein the image samples further comprise corresponding image scale information;
inputting the image sample and the corresponding image scale information into a multi-dimensional insect recognition network pre-training model, and optimizing and updating network parameters of the multi-dimensional insect recognition network pre-training model based on an SGD optimization algorithm;
converting the dimension of the insect image into a numerical feature based on an Arcface loss function; dividing the time period characteristics of the insect shooting into time periods, inputting a one-hot code of the division result into a multi-dimensional insect recognition network pre-training model, and extracting the time characteristics;
based on the Focal loss function, optimizing the enhanced training sample set and mitigating the long-tail distribution phenomenon existing under unbalanced conditions;
judging whether the training result is converged or not based on the total loss function, and verifying the accuracy of the training result based on the verification data set, so as to obtain the multi-dimensional insect recognition network model.
7. The insect classification method based on convolutional network multidimensional learning as recited in claim 1, wherein the identifying the picture to be identified using the multidimensional insect identification network model comprises the steps of:
scaling each picture to be recognized and cropping a region of the model input size as the input picture of the multi-dimensional insect recognition network model;
acquiring the length and width information of the picture, and carrying out normalization processing to obtain the numerical characteristics of the scale layer;
acquiring time information of insect activities in the pictures, converting the time information into corresponding time periods, and respectively carrying out one-hot coding and network embedding processing to extract time characteristics;
and (3) corresponding the output result of the multidimensional insect recognition network model to the index of the insect category, and selecting the category with the highest confidence as the final recognition result.
8. The method of insect classification based on convolutional network multidimensional learning of claim 5, wherein the total loss function is:
L = L_{c} + L_{focal}

wherein L represents the total loss function of the multi-dimensional insect recognition network model, and L_c and L_{focal} respectively represent the ArcFace loss function and the Focal loss function;
the ArcFace loss function is expressed as follows:

L_c = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)} + \sum_{j\neq y_i} e^{s\cos\theta_j}}

wherein the fraction inside the logarithm represents the predicted output result, s is the feature scale, m represents the angular margin, and N represents the total sample size;
the Focal loss label loss function is expressed as follows:

L_{focal} = -(1-y')^{\gamma} \log(y')
where y' represents the output probability of the relevant category and γ represents the adjustable attention parameter.
9. The insect classification system based on the convolution network multi-dimensional learning is characterized by comprising a data set acquisition module, a division processing module, a model construction module, a model training module and an identification module;
the data set acquisition module is used for acquiring insect pictures containing various insect types, preprocessing the insect pictures and reserving the outer outlines of insects to form the insect type pictures;
the division processing module is used for dividing the class insect pictures into a training data set and a verification data set, and carrying out data enhancement processing on the training data set to obtain an enhanced training sample set;
the model construction module is used for constructing a multi-dimensional insect recognition network pre-training model, wherein the multi-dimensional insect recognition network pre-training model comprises an image feature extraction module, a set image feature extraction module, an image size full-connection layer, an Arcface loss function, a Focal loss function and a total loss function;
the model training module is configured to: training and optimizing the multi-dimensional insect recognition network pre-training model based on the enhanced training sample set to obtain a multi-dimensional insect recognition network model;
the identification module is used for inputting the picture to be identified into the multi-dimensional insect identification network model to obtain an identification result, wherein the sizes of the picture sets to be identified are the same.
10. A computer readable storage medium storing a computer program, which when executed by a processor implements the method of any one of claims 1 to 8.
11. An insect classification device based on convolutional network multidimensional learning, comprising a memory, a processor and a computer program stored in the memory and running on the processor, characterized in that the processor implements the method according to any one of claims 1 to 8 when executing the computer program.
CN202310239083.9A 2023-03-09 2023-03-09 Insect classification method and system based on convolutional network multidimensional learning Pending CN116310541A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310239083.9A CN116310541A (en) 2023-03-09 2023-03-09 Insect classification method and system based on convolutional network multidimensional learning


Publications (1)

Publication Number Publication Date
CN116310541A true CN116310541A (en) 2023-06-23

Family

ID=86828313


Country Status (1)

Country Link
CN (1) CN116310541A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116863340A (en) * 2023-08-16 2023-10-10 安徽荃银超大种业有限公司 Rice leaf disease identification method based on deep learning



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination