CN117115685A - Method and system for identifying cash crop information based on deep learning - Google Patents

Method and system for identifying cash crop information based on deep learning

Info

Publication number
CN117115685A
CN117115685A
Authority
CN
China
Prior art keywords
model
deep learning
semantic segmentation
image
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310914018.1A
Other languages
Chinese (zh)
Inventor
杨建宇
吴春晓
张婷婷
代安进
周晗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Agricultural University
Original Assignee
China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Agricultural University filed Critical China Agricultural University
Priority to CN202310914018.1A priority Critical patent/CN117115685A/en
Publication of CN117115685A publication Critical patent/CN117115685A/en
Pending legal-status Critical Current

Classifications

    • G06V 20/17: Terrestrial scenes taken from planes or by drones
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/048: Activation functions
    • G06N 3/08: Learning methods
    • G06V 10/16: Image acquisition using multiple overlapping images; image stitching
    • G06V 10/243: Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
    • G06V 10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region
    • G06V 10/764: Recognition or understanding using classification, e.g. of video objects
    • G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/82: Recognition or understanding using neural networks
    • G06V 20/70: Labelling scene content, e.g. deriving syntactic or semantic representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a cash crop information identification method and system based on deep learning, comprising the following steps: cutting the unmanned aerial vehicle (UAV) multispectral remote sensing image of the research area according to a specified shape to obtain regular-shape images; performing sample labelling on the regular-shape images and dividing them into a training set, a verification set and a test set; training a U-Net model according to the training set and the verification set; taking the trained U-Net model as a basic model, adding an Inception structure and SE-Block, and deepening the network; selecting hyperparameters to perform parameter optimization on the network-deepened model, generating an improved deep learning semantic segmentation model, ISDU-Net, and training the ISDU-Net model based on the training set and the verification set; and performing information identification on cash crops by using the trained ISDU-Net model. The invention uses the improved deep learning semantic segmentation model to extract cash crops from UAV remote sensing images and complete rapid mapping; it performs notably well in predicting small-sample classes and provides a new approach for the precise identification of cash crops.

Description

Method and system for identifying cash crop information based on deep learning
Technical Field
The invention relates to the field of image processing, in particular to a cash crop information identification method and system based on deep learning.
Background
Cash crops are an important component of agriculture, with high economic value and ecological benefit; as a material basis for national economic and social development, they also have a positive influence on the ecological environment. China has a vast territory with diverse landforms and soil types; some areas have steep slopes, relatively barren soil and a harsh ecological environment, making them unsuitable for planting grain crops but well suited as a cultivation base for economic fruit forests. Cash crops play a major role in safeguarding national ecological security, enriching the supply of forest products, raising rural incomes and improving quality of life, so their accurate classification is very important. However, conventional ground surveys of cash crops are time-consuming and labour-intensive, and the advent of remote sensing imagery has made rapid identification of cash crops possible.
With the gradual development of technology, UAV remote sensing has matured and been rapidly applied in many fields, becoming one of the new ways of acquiring remote sensing images. UAV remote sensing has great potential: its flexibility, efficiency and portability have driven research in agricultural applications such as crop identification and resource survey, and it is gradually becoming a main force in the field of identification and classification. The light airframe and intelligent operation greatly reduce the manpower and material resources required, make image acquisition more convenient, and give it clear advantages over traditional satellite remote sensing imagery. In recent years, deep learning has developed and matured, becoming a popular research direction for image processing tasks with very wide application across fields. Deep learning models do not depend on manual feature selection; by automatically extracting features and computing class probabilities, they greatly improve classification efficiency. In UAV remote sensing image classification research, accuracy has improved further since deep learning algorithms were adopted. At present, some researchers aim to process remote sensing image classification more efficiently by improving deep learning models, providing new ideas for processing UAV and other remote sensing image data, with classification effects clearly better than classical models.
In existing related research, although UAV remote sensing has been applied in many classification and recognition methods, research on classifying and recognizing cash crops from UAV remote sensing images remains scarce, and classification accuracy is limited by classical deep learning models. The invention therefore takes cash crops as the research object and aims to develop an improved deep learning model, apply it to studying the feature differences among different cash crops, improve the classification accuracy and extraction efficiency for economic fruit forests in UAV remote sensing images, improve the recognition of small samples by the semantic segmentation model, and study a cash crop information identification method based on UAV remote sensing image data, thereby providing a reference for identifying the spatial distribution of cash crops scientifically, conveniently and efficiently.
Disclosure of Invention
The invention aims to provide a method and a system for identifying cash crop information based on deep learning, so as to solve the problem in the prior art that, when classifying and identifying cash crops in UAV remote sensing images, classification accuracy is limited by classical deep learning models and the classification effect is therefore not ideal.
The invention provides a cash crop information identification method based on deep learning, which comprises the following steps:
Cutting the multispectral remote sensing image of the unmanned aerial vehicle in the research area according to a specified shape to obtain a regular-shape image;
sample labeling is carried out on the regular-shape image to form sample data, and the sample data is divided into a training set, a verification set and a test set;
training a deep learning semantic segmentation model according to the training set and the verification set, wherein the deep learning semantic segmentation model is the classical U-Net semantic segmentation model;
taking the trained U-Net model as a basic model, adding an Inception structure and a channel attention mechanism module SE-Block, and deepening the network to obtain a network-deepened deep learning semantic segmentation model;
selecting hyperparameters to perform parameter optimization on the network-deepened deep learning semantic segmentation model, generating an improved deep learning semantic segmentation model, ISDU-Net, and training the ISDU-Net model based on the training set and the verification set to obtain a trained ISDU-Net model;
and performing information identification on the cash crop multispectral remote sensing image of the area to be detected by using the trained ISDU-Net model.
According to the cash crop information identification method based on deep learning provided by the invention, the unmanned aerial vehicle multispectral remote sensing image of the research area is cut according to the specified shape to obtain the regular-shape image, and the method comprises the following steps:
Acquiring a multispectral remote sensing image of the unmanned aerial vehicle in a research area;
carrying out automatic radiation correction, deformation correction and image stitching on the multispectral remote sensing image;
compositing the stitched red, green, blue, red-edge and near-infrared single-band images into a five-band image with the ArcGIS band composition tool;
and cutting the research area according to the sample by using a raster data cutting tool to obtain the regular-shape image.
According to the cash crop information identification method based on deep learning provided by the invention, the regular-shape images are subjected to sample labeling to form sample data, and the sample data are divided into a training set, a verification set and a test set, and the method comprises the following steps:
cutting and converting the obtained regular-shape image;
sample labeling is carried out on the image data after cutting and bit conversion processing;
and randomly dividing the image data subjected to sample labeling into a training set, a verification set and a test set according to a preset proportion.
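The random three-way split above can be sketched as follows; the 6:2:2 ratio and the fixed seed are illustrative assumptions, since the patent only specifies "a preset proportion".

```python
import random

def split_dataset(samples, ratios=(0.6, 0.2, 0.2), seed=42):
    """Randomly split labelled tiles into train/verification/test sets
    by a preset proportion. `ratios` and `seed` are assumptions."""
    assert abs(sum(ratios) - 1.0) < 1e-9, "ratios must sum to 1"
    rng = random.Random(seed)          # fixed seed -> reproducible split
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```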
According to the cash crop information identification method based on deep learning provided by the invention, the sample labeling is carried out, and the method specifically comprises the following steps:
determining a classification system according to the economic crop category of the research area, and distributing RGB labels for each category;
And drawing the boundary of each crop and filling it with colour in combination with the parcel (field plot) information, and making a label map to obtain labelled samples.
According to the cash crop information identification method based on deep learning provided by the invention, after image data subjected to sample labeling is randomly divided into a training set, a verification set and a test set according to a preset proportion, the method further comprises the following steps:
and carrying out data enhancement on the training set, the verification set and the test set through geometric transformation to obtain the training set, the verification set and the test set after data enhancement.
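A minimal sketch of the geometric-transformation data enhancement described above, assuming the transforms are flips and 90-degree rotations (the patent does not enumerate them); image tile and label mask are transformed in lockstep so the labels stay aligned.

```python
import numpy as np

def geometric_augment(image, label):
    """Return the original (H, W, C) tile and (H, W) mask plus five
    geometrically transformed copies: horizontal flip, vertical flip,
    and 90/180/270-degree rotations."""
    pairs = [(image, label)]
    pairs.append((np.flip(image, axis=1).copy(), np.flip(label, axis=1).copy()))  # horizontal flip
    pairs.append((np.flip(image, axis=0).copy(), np.flip(label, axis=0).copy()))  # vertical flip
    for k in (1, 2, 3):  # 90, 180, 270 degrees
        pairs.append((np.rot90(image, k, axes=(0, 1)).copy(),
                      np.rot90(label, k, axes=(0, 1)).copy()))
    return pairs
```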
According to the cash crop information identification method based on deep learning provided by the invention, the trained U-Net model is taken as a basic model, an Inception structure and a channel attention mechanism module SE-Block are added, and the network is deepened to obtain a network-deepened deep learning semantic segmentation model, which specifically comprises:
taking a U-Net model as a basic model, wherein the U-Net model comprises a model coding part and a model decoding part;
in the five-layer downsampling structure of the U-Net encoding part, each of the first three layers is given an Inception structure, and each of the last two layers is given three convolution layers, three BN layers, three ReLU activation layers and one pooling layer, yielding the network-deepened encoding part; the Inception module comprises four parallel parts, namely 2D convolutions with 1×1, 3×3 and 5×5 kernels and a max pooling layer with a 3×3 kernel, with BN and activation applied alongside each convolution;
In the five-layer upsampling structure of the U-Net decoding part, a channel attention mechanism module SE-Block is placed before each layer's upsampling to obtain the network-deepened decoding part; the SE-Block has two parts, Squeeze (compression) and Excitation. The improved decoding part uses transposed convolution, gradually reduces the number of filters layer by layer, fuses high- and low-level features with the feature layers obtained from the encoding part, and then extracts features with 2D convolutions;
and generating a deep learning semantic segmentation model with a deepened network based on the improved coding part and the improved decoding part.
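The Inception-style encoder block described above can be sketched in PyTorch as follows. The branch widths and the 1×1 convolution inside the pooling branch are illustrative assumptions, not details given in the patent; the sketch only shows the four-branch structure (1×1, 3×3 and 5×5 convolutions plus a 3×3 max pool, each with BN and ReLU, concatenated on the channel axis).

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Four parallel branches: 1x1, 3x3 and 5x5 convolutions plus a 3x3
    max-pool branch, each followed by BN and ReLU, concatenated on the
    channel axis. Branch widths are illustrative assumptions."""
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        def conv_bn_relu(k):
            # "same" padding keeps the spatial size so the branches concatenate
            return nn.Sequential(
                nn.Conv2d(in_ch, branch_ch, kernel_size=k, padding=k // 2),
                nn.BatchNorm2d(branch_ch),
                nn.ReLU(inplace=True),
            )
        self.b1 = conv_bn_relu(1)
        self.b3 = conv_bn_relu(3)
        self.b5 = conv_bn_relu(5)
        self.pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, branch_ch, kernel_size=1),  # assumed projection
            nn.BatchNorm2d(branch_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.pool(x)], dim=1)
```

With five input bands, `InceptionBlock(5, 16)` maps a `(N, 5, H, W)` tensor to `(N, 64, H, W)`, 16 channels per branch.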
According to the cash crop information identification method based on deep learning provided by the invention, the processing process of the channel attention mechanism module for the input image comprises the following steps:
based on the input unmanned aerial vehicle multispectral remote sensing image, global average pooling is performed through the compression part, global characteristic information of each channel of the multispectral remote sensing image is obtained, and the global characteristic information is transmitted into the excitation part;
based on the global feature information passed in by the compression part, the excitation part applies a nonlinear transformation through two fully connected layers to obtain a weight for each channel of the multispectral remote sensing image, and these weights are applied to the global feature information to re-weight the original features of each channel;
The compression part performs global average pooling, which is defined as:

z_c = F_{sq}(u_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j)

wherein H is the feature layer height, W is the feature layer width, and u_c is the output layer result obtained by the c-th convolution, expressed as follows:

u_c = v_c * X = \sum_{s=1}^{C'} v_c^{s} * x^{s}

wherein v_c denotes the c-th convolution kernel, s indexes the input channels (C' in total), and * denotes the convolution calculation.
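A minimal PyTorch sketch of the SE-Block described above, implementing the Squeeze (global average pooling) and Excitation (two fully connected layers) operations; the reduction ratio of 16 is a common default from the SE literature, not a value stated in the patent.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention: global average pooling
    followed by two fully connected layers whose sigmoid output re-weights
    each channel. reduction=16 is an assumed default."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, u):
        b, c, h, w = u.shape
        z = u.mean(dim=(2, 3))           # Squeeze: per-channel global average
        s = self.fc(z).view(b, c, 1, 1)  # Excitation: per-channel weights in (0, 1)
        return u * s                     # re-weight the original features
```

The output has the same shape as the input, so the block can be dropped in before each upsampling layer of the decoder as the text describes.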
According to the cash crop information identification method based on deep learning provided by the invention, selecting hyperparameters to perform parameter optimization on the network-deepened deep learning semantic segmentation model comprises: selecting two hyperparameters, the learning rate and the batch size, to perform parameter optimization on the improved deep learning semantic segmentation model.
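A hedged sketch of tuning the two hyperparameters named above. The exhaustive grid strategy and the candidate values in the test below are assumptions; the patent only says the learning rate and batch size are selected for optimization.

```python
from itertools import product

def grid_search(train_fn, learning_rates, batch_sizes):
    """Try every (learning rate, batch size) pair and keep the best one.

    `train_fn(lr, bs)` is assumed to train the model with that pair and
    return a validation metric (higher is better). Returns the tuple
    (best_score, best_lr, best_bs).
    """
    best = None
    for lr, bs in product(learning_rates, batch_sizes):
        score = train_fn(lr, bs)
        if best is None or score > best[0]:
            best = (score, lr, bs)
    return best
```

In practice `train_fn` would run a full training pass on the training set and score the model on the verification set.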
According to the cash crop information identification method based on deep learning provided by the invention, after training the ISDU-Net model based on the training set and the verification set to obtain a trained ISDU-Net model, the cash crop information identification method further comprises the following steps:
predicting the test set by using the trained ISDU-Net model to obtain a prediction result of the test set;
based on the prediction result, the trained ISDU-Net model is evaluated for accuracy using the confusion matrix, pixel accuracy, F1 score, intersection-over-union (IoU) and Kappa coefficient.
The invention also provides a cash crop information identification system based on deep learning, which comprises:
The image clipping module is used for clipping the unmanned aerial vehicle multispectral remote sensing image of the research area according to the specified shape to obtain a regular-shape image;
the sample labeling module is used for labeling samples of the regular-shape images to form sample data, and dividing the sample data into a training set, a verification set and a test set;
the model training module is used for training a deep learning semantic segmentation model according to the training set and the verification set, wherein the deep learning semantic segmentation model is the classical U-Net semantic segmentation model;
the structure optimization module is used for taking the trained U-Net model as a basic model, adding an Inception structure and a channel attention mechanism module SE-Block, and deepening the network to obtain a network-deepened deep learning semantic segmentation model;
the parameter optimization module is used for selecting hyperparameters to perform parameter optimization on the network-deepened deep learning semantic segmentation model, generating an improved deep learning semantic segmentation model, ISDU-Net, and training the ISDU-Net model based on the training set and the verification set to obtain a trained ISDU-Net model;
and the information identification module is used for performing information identification on the cash crop multispectral remote sensing image of the area to be detected by using the trained ISDU-Net model.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the cash crop information identification method based on deep learning when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the cash crop information recognition method based on deep learning as described above.
The invention selects the cash crop as a research object and develops an improved deep learning semantic segmentation model on the basis of a classical deep learning semantic segmentation model. The method is mainly applied to researching the characteristic difference among different cash crops, improves the classification precision and extraction efficiency of the unmanned aerial vehicle remote sensing image in the aspect of the cash crops, and improves the recognition effect of the semantic segmentation model on small samples. The cash crop information identification method based on the unmanned aerial vehicle remote sensing image data provides references for scientifically, conveniently and efficiently identifying the space distribution information of the cash crops, and simultaneously can realize automatic identification and drawing, thereby extracting the cash crops more accurately.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for identifying cash crop information based on deep learning according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of structural optimization based on a U-Net model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a system for identifying information of cash crops based on deep learning according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention provides a cash crop information identification method based on deep learning. The acquired UAV remote sensing images are first preprocessed, a sample set is prepared and divided into a training set, a verification set and a test set, and a cash crop data set of the research area is built by augmenting the samples. Classical semantic segmentation models are selected to perform an initial classification of the economic fruit forests in the research area and evaluated for accuracy; the model with the highest classification accuracy and best classification effect, U-Net, is then selected, and its base network is deepened and widened to obtain the improved deep learning semantic segmentation model ISDU-Net. The ISDU-Net model performs information identification on the cash crop multispectral remote sensing images, and the identification accuracy is evaluated. According to the evaluation results, classifying and identifying cash crops with the improved deep learning semantic segmentation model, i.e. the ISDU-Net model, clearly improves the identification effect.
The method, system, electronic device and storage medium for identifying the cash crop information based on the deep learning provided by the invention are described below with reference to fig. 1 to 4.
Fig. 1 is a flow chart of the cash crop information identification method based on deep learning provided by the invention, as shown in fig. 1, in a specific embodiment, the cash crop information identification method based on deep learning provided by the invention comprises the following steps:
Step S110, cutting the unmanned aerial vehicle multispectral remote sensing image of the research area according to a specified shape to obtain a regular-shape image;
step S120, carrying out sample labeling on the regular-shape image to form sample data, and dividing the sample data into a training set, a verification set and a test set;
step S130, training a deep learning semantic segmentation model according to the training set and the verification set, wherein the deep learning semantic segmentation model is the classical U-Net semantic segmentation model;
step S140, taking the trained U-Net model as a basic model, adding an Inception structure and a channel attention mechanism module SE-Block, and deepening the network to obtain a network-deepened deep learning semantic segmentation model;
step S150, selecting hyperparameters to perform parameter optimization on the network-deepened deep learning semantic segmentation model, generating an improved deep learning semantic segmentation model, ISDU-Net, and training the ISDU-Net model based on the training set and the verification set to obtain a trained ISDU-Net model;
and step S160, performing information identification on the cash crop multispectral remote sensing image of the area to be detected by using the trained ISDU-Net model.
The steps in fig. 1 are described in detail below:
step S110, cutting the unmanned aerial vehicle multispectral remote sensing image of the research area according to a specified shape to obtain a regular-shape image;
in the embodiment of the present invention, the preprocessing of the multi-spectrum remote sensing image of the cash crop to generate a regular-shape image of the research area cut according to the regular shape includes:
acquiring a multispectral remote sensing image of the unmanned aerial vehicle in a research area;
carrying out automatic radiation correction, deformation correction and image stitching on the multispectral remote sensing image;
the red, green, blue, red edge and near infrared single-band images after image splicing are synthesized into five-band images through an ArcGIS band synthesis tool;
and cutting the research area according to the sample by using a raster data cutting tool to obtain the regular-shape image.
Specifically, the research area selected in the embodiment of the invention is an agricultural economic forest with a total area of about 700 mu (roughly 47 hectares). The fruit trees are of many varieties and planted in strips; the main cash crops are grape, peach, apple, cherry and the like, which makes the area scientifically valuable to study.
In this embodiment, firstly, an unmanned aerial vehicle multispectral remote sensing image of a typical research area is acquired, and is preprocessed, and the preprocessing process includes:
The method comprises the steps of obtaining a multispectral remote sensing image of the unmanned aerial vehicle, obtaining a multispectral remote sensing image with high resolution through aerial photography of the unmanned aerial vehicle arranged on the spot, selecting and recording sample points by using a handheld GPS device, and preprocessing obtained image data. In the embodiment, a four-rotor unmanned aerial vehicle is used for acquiring images of a research area, so that multispectral remote sensing images of the unmanned aerial vehicle of the research area are obtained;
performing automatic radiometric correction, deformation correction and image stitching on the obtained multispectral remote sensing images with photogrammetry software; the stitching process is as follows: the images captured by the multispectral UAV carry various data such as geographic coordinates, lens error data, DEM data and radiometric data; the software automatically performs radiometric correction, deformation correction and stitching to generate high-quality visible-light and single-band orthoimages of the research area, stored in TIFF format;
and compositing the stitched red, green, blue, red-edge and near-infrared single-band images into a five-band image with the ArcGIS band composition tool. Because the stitched orthoimages are single-band, the five single-band images are composited into one five-band image with the band composition tool of ArcGIS software; the composite contains the red, green, blue, red-edge and near-infrared bands and is stored in TIFF format to facilitate subsequent model training.
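The patent performs band composition with the ArcGIS tool; as a hedged NumPy analogue (TIFF reading and writing, e.g. with rasterio, is omitted), stacking the five single-band arrays into one multiband array looks like this:

```python
import numpy as np

def stack_bands(bands):
    """Stack single-band arrays (red, green, blue, red-edge, near-infrared,
    in that assumed order) into one (bands, H, W) array, the in-memory
    analogue of the ArcGIS band-composition step."""
    shapes = {b.shape for b in bands}
    assert len(shapes) == 1, "all bands must share the same dimensions"
    return np.stack(bands, axis=0)
```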
Secondly, further processing the preprocessed image data, wherein the processing process comprises the following steps:
cropping, bit-depth conversion and other processing are performed on the preprocessed images;
in this embodiment, because the shape of the research area is irregular, for subsequent data set production the research area is clipped along the sample boundary using the raster data clipping tool of ArcGIS software, obtaining 16 regular rectangular images of 2136 × 2629 pixels, which are stored in TIFF format;
converting the 16 scenes of 32-bit images into 8-bit images in batches; because the bit depth of the obtained multi-band UAV orthophotos is 32 bits, the images need to be normalized before the deep learning model is trained. To simplify normalization, the 16 scenes of 32-bit images are converted to 8-bit in batches with a Python script, reducing subsequent work.
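The batch 32-bit-to-8-bit conversion described above can be sketched in Python. A minimal numpy sketch of a per-band percentile stretch is given below; the stretch percentiles and function name are illustrative assumptions, not taken from the patent, and in practice the bands would be read and written with a raster library such as GDAL:

```python
import numpy as np

def to_uint8(band, lo_pct=2.0, hi_pct=98.0):
    """Stretch one 32-bit band to 8 bits between two percentiles (hypothetical choice)."""
    lo, hi = np.percentile(band, [lo_pct, hi_pct])
    scaled = np.clip((band - lo) / max(hi - lo, 1e-12), 0.0, 1.0)
    return np.round(scaled * 255.0).astype(np.uint8)
```

Applied band by band before stacking, this also performs the normalization the embodiment mentions, since the 8-bit values map linearly onto [0, 255].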
Step S120, carrying out sample labeling on the regular-shape image to form sample data, and dividing the sample data into a training set, a verification set and a test set;
in an embodiment of the present invention, the performing sample labeling on the regular-shape image to form sample data, and dividing the sample data into a training set, a verification set and a test set includes:
firstly, establishing a cash crop data set and augmenting it, which specifically comprises the following steps:
Cutting and converting the obtained regular-shape image;
sample labeling is carried out on the image data after cutting and bit conversion processing;
and randomly dividing the image data subjected to sample labeling into a training set, a verification set and a test set according to a preset proportion.
In this embodiment, the sample labeling specifically includes:
determining a classification system according to the cash crop categories of the research area, and assigning an RGB label to each category;
and drawing and color-filling the boundary of each crop in combination with the field sample point information, producing a label map to obtain an annotated data sample, which is stored as a TIFF file.
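A common companion step when the label map is stored as RGB colors is converting it to a single-channel class-index map for training. A minimal numpy sketch under that assumption; the palette colors and class numbering are illustrative, not the ones used in the patent:

```python
import numpy as np

# Hypothetical RGB palette: color -> class index; 0 is reserved for background
PALETTE = {(255, 0, 0): 1, (0, 255, 0): 2, (0, 0, 255): 3}

def rgb_to_index(label_rgb):
    """Map an (H, W, 3) RGB label image to an (H, W) class-index map."""
    h, w, _ = label_rgb.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for rgb, cls in PALETTE.items():
        mask = np.all(label_rgb == np.array(rgb, dtype=label_rgb.dtype), axis=-1)
        out[mask] = cls
    return out
```

Pixels whose color is not in the palette stay at index 0 (background), which keeps unlabeled areas out of the crop classes.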
In this embodiment, the training set, the verification set and the test set are randomly divided at a ratio of 8:2.
After dividing the data samples into the training set, the verification set and the test set, the method further comprises: and carrying out data enhancement on the training set, the verification set and the test set through geometric transformation to obtain the enhanced training set, the enhanced verification set and the enhanced test set.
In particular, deep learning models often depend on large amounts of sample data, so data enhancement is required to obtain enough training samples and further improve classification accuracy; common data enhancement modes include geometric transformation, color transformation, and noise addition or removal. Geometric transformations generally include translation, rotation, flipping, cropping, scaling, etc.; color transformation adjusts the color components in different color spaces; noise addition or removal blurs the image, adds noise, covers specific regions, and so on. Existing research shows that color transformation and noise perturbation do not perform well in semantic segmentation tasks in the remote sensing field and may even erase local information the model needs, reducing its learning capacity. This embodiment therefore selects geometric transformation for data enhancement.
In a specific embodiment, the data enhancement is performed on the training set, the verification set and the test set through geometric transformation, and specifically includes:
The 16 scenes of images obtained by preprocessing are divided at a ratio of 8:2: 2 scenes are randomly extracted as the test set for subsequent prediction and accuracy evaluation, and the remaining 14 scenes are used as the training and verification sets for model training. Horizontal flipping, vertical flipping and diagonal mirroring are applied to the images for data enhancement; the enhanced data set reaches 17,985 images, which are randomly divided into training and verification sets at a ratio of 8:2.
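The three geometric transformations used here (horizontal flip, vertical flip, diagonal mirror) map directly onto numpy array operations; a minimal sketch, with the function name as an illustrative assumption:

```python
import numpy as np

def augment(img):
    """Return the original (H, W, C) image plus its three geometric variants."""
    return [img,
            img[:, ::-1, :],         # horizontal flip
            img[::-1, :, :],         # vertical flip
            img.transpose(1, 0, 2)]  # diagonal mirror (transpose of the spatial axes)
```

Applying the same transformations to the label maps keeps images and annotations aligned, which is why geometric enhancement is safe for segmentation where color perturbation is not.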
Step S130, training a deep learning semantic segmentation model according to the training set and the verification set, wherein the deep learning semantic segmentation model is a classical semantic segmentation model U-Net model;
specifically, cash crops are first initially classified by classical deep learning models. As there are many existing deep learning semantic segmentation models for the initial classification of cash crops, this embodiment selects the more classical ones, and chooses the model with the best classification effect as the base model for improvement according to a concrete evaluation and comparative analysis.
In this embodiment, classical deep learning semantic segmentation models are selected as the initial classification models for cash crop extraction; the currently common classical semantic segmentation models FCN, SegNet and U-Net are selected as the classification base models.
Further, the specific sample set data are predicted with the FCN, SegNet and U-Net models respectively, and the model with the highest accuracy is selected by evaluating the accuracy of the model predictions. In this embodiment, the pixel accuracy, F1 score, intersection-over-union and Kappa coefficient are used as the evaluation indices of model prediction accuracy, namely:
PA = (TP + TN) / (TP + TN + FP + FN), where PA is the pixel accuracy;
F1 = 2 × CPA × Recall / (CPA + Recall), where CPA is the category pixel accuracy, Recall is the recall rate, and F1 is the F1 score;
MIoU = (1/k) Σᵢ TPᵢ / (TPᵢ + FPᵢ + FNᵢ), where MIoU is the mean intersection-over-union over the k classes;
Kappa = (pₒ − pₑ) / (1 − pₑ), where pₒ is the observed agreement (equal to PA) and pₑ is the chance agreement computed from the confusion matrix margins.
The pixel accuracy, mean intersection-over-union, frequency-weighted intersection-over-union and Kappa coefficient of the U-Net model calculated according to the above formulas are 87.73%, 70.68%, 78.69% and 0.84 respectively, the best result among the three models. Through accuracy evaluation and comparative analysis, the U-Net network model has the best classification effect and high overall accuracy, so the U-Net network structure is selected as the base network model.
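All of the evaluation indices above can be derived from a single confusion matrix; a minimal numpy sketch under that assumption, where cm[i, j] counts pixels of true class i predicted as class j:

```python
import numpy as np

def metrics(cm):
    """Pixel accuracy, per-class F1, mean IoU and Kappa from a confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    pa = tp.sum() / total                                   # pixel accuracy
    precision = tp / np.maximum(cm.sum(axis=0), 1)          # category pixel accuracy
    recall = tp / np.maximum(cm.sum(axis=1), 1)             # recall rate
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    miou = np.mean(tp / np.maximum(union, 1))               # mean intersection-over-union
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2 # chance agreement
    kappa = (pa - pe) / (1 - pe)
    return pa, f1, miou, kappa
```

For example, the symmetric two-class matrix [[4, 1], [1, 4]] gives a pixel accuracy of 0.8, per-class F1 of 0.8, mean IoU of 2/3 and Kappa of 0.6.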
Step S140, taking the trained U-Net model as a base model, adding an Inception structure and a channel attention mechanism module SE-Block, and deepening the network to obtain a network-deepened deep learning semantic segmentation model;
Specifically, the method adds an Inception structure and the channel attention mechanism module SE-Block to the trained U-Net model and deepens the network to generate a network-deepened deep learning semantic segmentation model; in essence, this is a structural optimization of the U-Net model.
Fig. 2 is a schematic diagram of structural optimization based on a U-Net model, and as shown in fig. 2, the specific process of structural optimization based on the U-Net model includes:
taking the U-Net model as the base model, widening and deepening it, wherein the widening and deepening comprise adding an Inception structure, adding the channel attention mechanism module SE-Block, and deepening the network;
the U-Net model comprises an encoding part and a decoding part: the encoding part increases the number of filters layer by layer through 2D convolutions to form a five-layer downsampling structure, in which each of the first three layers is set to an Inception structure and each of the last two layers is set to a structure comprising three convolution layers, three BN layers, three ReLU activation layers and one pooling layer, i.e. a triple-convolution structure; the decoding part performs transposed convolutions, gradually reduces the number of filters layer by layer, fuses with the feature layers obtained by the encoding part in a high-low-level feature fusion form, and extracts features with 2D convolutions; a five-layer upsampling structure is adopted, and the channel attention mechanism module SE-Block is added before each layer's upsampling.
Specifically, the principle of the Inception module is to replace the fully connected layer with sparse connections, borrowing the sparse connectivity of biological nervous systems, and to increase the depth and width of the network without increasing the computation through a parallel convolution structure and smaller convolution kernels. The Inception module is divided into four parallel branches, namely 1×1, 3×3 and 5×5 convolution operations and a 3×3 max pooling operation. Because the 5×5 convolution kernel requires a large amount of computation, 1×1 convolutions are added before the 3×3 and 5×5 convolutions and after the max pooling of the Inception structure, reducing the dimension of the feature layer to cut the computation while increasing the width and depth of the network. Finally, the four results are fused by concatenation, which ensures feature diversity and improves the learning capacity of the network.
In the Inception module used in this embodiment, each convolution layer adopts the ReLU activation function, and a BN layer is added to accelerate network training and convergence and to prevent vanishing gradients and overfitting. Because the Inception module can obtain more diverse sample feature information, this experiment replaces the first three downsampling layers of the original U-Net model with Inception modules, widening the downsampling part, improving the model's ability to learn small-sample classes, alleviating the sample imbalance of some categories, broadening the model's receptive field to obtain more effective information, and improving the prediction accuracy of orchard crop classification.
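A shape-level sketch of such an Inception-style block with naive numpy convolutions may clarify the four parallel branches and the 1×1 dimension reductions. BN and ReLU are omitted for brevity, and all function names and weight shapes are illustrative assumptions rather than the patent's implementation:

```python
import numpy as np

def conv_same(x, w):
    """Naive 'same'-padded 2D convolution; x: (C_in, H, W), w: (C_out, C_in, k, k)."""
    co, ci, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.zeros((co, h, wd))
    for i in range(h):
        for j in range(wd):
            out[:, i, j] = np.tensordot(w, xp[:, i:i + k, j:j + k],
                                        axes=([1, 2, 3], [0, 1, 2]))
    return out

def maxpool3_same(x):
    """3x3 max pooling with stride 1 and 'same' padding, per channel."""
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)), constant_values=-np.inf)
    c, h, wd = x.shape
    out = np.empty_like(x, dtype=float)
    for i in range(h):
        for j in range(wd):
            out[:, i, j] = xp[:, i:i + 3, j:j + 3].max(axis=(1, 2))
    return out

def inception_block(x, w1, w3r, w3, w5r, w5, wp):
    """Four parallel branches fused by channel concatenation."""
    b1 = conv_same(x, w1)                  # 1x1 branch
    b2 = conv_same(conv_same(x, w3r), w3)  # 1x1 reduce -> 3x3
    b3 = conv_same(conv_same(x, w5r), w5)  # 1x1 reduce -> 5x5
    b4 = conv_same(maxpool3_same(x), wp)   # 3x3 maxpool -> 1x1
    return np.concatenate([b1, b2, b3, b4], axis=0)
```

The output channel count is simply the sum of the four branches' filter counts, which is how the module widens the network without deepening it.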
It will be appreciated that network deepening is generally beneficial to the learning ability of the model, but not unconditionally: a deeper network requires more data, otherwise the trained model will overfit, so only appropriate deepening increases the expressive power of the model. Therefore, on the basis of the original U-Net model, the invention appropriately deepens the downsampling part to improve the feature extraction capability of the model. In the invention, besides replacing the ordinary convolutions of the first three downsampling layers with Inception structures, which makes the network deeper and wider, the last two downsampling layers of the model are also adjusted, changing the double 2D convolution into a triple one, so as to improve the learning capacity of the model, obtain more feature information, and achieve a better classification effect.
On the other hand, the channel attention mechanism module SE-Block, i.e. the Squeeze-and-Excitation module, originates from the convolutional neural network SENet. In the ordinary convolution-pooling process, the importance of each channel is generally regarded as equal, i.e. all channels of each feature map are fused uniformly; in practice, however, the importance of different channels to the classification result often differs. SE-Block is a module that focuses on the relationships and importance of different channels, so that through SE-Block the model can learn the importance of the features of different channels and thereby make better judgments.
The channel attention mechanism module SE-Block comprises two parts, compression (Squeeze) and excitation (Excitation). Global average pooling is performed by the compression part so that the network can learn channel-level global features; these are then passed to the excitation part, which applies a nonlinear transformation to the result of the compression part with two fully connected layers, obtains a weight value for each channel, and assigns it to the feature layer.
Since the convolution process is only a feature extraction operation in a local space, it is generally difficult to obtain enough information to extract the relationships between channels. To expand the receptive field, the compression part encodes the whole spatial feature on each channel as a global feature, and global average pooling is chosen to achieve this. The compression part is defined as:
z_c = F_sq(u_c) = (1 / (H × W)) Σᵢ₌₁ᴴ Σⱼ₌₁ᵂ u_c(i, j)

wherein H is the feature layer height, W is the feature layer width, and u_c is the c-th channel of the output layer obtained by the convolution, expressed as follows:

u_c = v_c * X = Σₛ v_cˢ * xˢ

wherein v_c represents the c-th convolution kernel, s indexes the input channels, and * represents the convolution calculation.
The excitation portion specifically includes:
receiving the channel-level global feature information obtained by the compression part through global average pooling;
carrying out a nonlinear transformation on the global feature information with two fully connected layers to obtain the weight coefficient of each channel;
and weighting the original features of each channel with the obtained weight coefficients.
In detail, the global features obtained by the compression part require a further operation to capture the relationships between channels. This operation must flexibly learn the nonlinear relations between the channels, and the learned relations must not be mutually exclusive, so that multiple channel features can coexist; the excitation part is designed for this. The excitation part first applies a fully connected layer for dimension reduction, which improves the generalization ability of the network and reduces model complexity, then activates through an activation function, and finally restores the original dimension with another fully connected layer.
The relationships among the channels are obtained with a gating mechanism, whose principle is:
s = F_ex(z, W) = σ(g(z, W)) = σ(W₂ · ReLU(W₁ z))

where σ is the Sigmoid function, ReLU is the activation function, and W₁ and W₂ are the weights of the two fully connected layers.
The learned weight coefficient of each channel is then assigned to the original features of that channel, according to the formula:

x̃_c = F_scale(u_c, s_c) = s_c · u_c
In this way, the model can better discriminate the features of each channel; this is the channel attention mechanism.
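The Squeeze, Excitation and scaling formulas above combine into only a few lines; a minimal numpy sketch, with the reduction ratio and weight shapes as illustrative assumptions:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def se_block(u, w1, w2):
    """SE-Block; u: (C, H, W) feature layer, w1: (C//r, C), w2: (C, C//r) FC weights."""
    z = u.mean(axis=(1, 2))                    # Squeeze: global average pooling, z_c
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))  # Excitation: sigma(W2 ReLU(W1 z))
    return u * s[:, None, None]                # Scale: x~_c = s_c * u_c
```

The bottleneck dimension C//r is the reduction the excitation part applies before restoring the original channel count; each output channel is the input channel rescaled by its learned weight in (0, 1).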
In the embodiment of the invention, because the method is based on a high-resolution UAV multispectral remote sensing image dataset, the original images involve many bands, and the feature layers obtained after the early Inception structures and the deeper multilayer convolutions are complex. Therefore, when the decoding part performs feature fusion, SE-Block is used to weigh channel importance before fusion, so that a better classification effect can be obtained, more comprehensive channel information is considered, and model learning efficiency is improved.
Step S150, selecting hyperparameters to optimize the parameters of the network-deepened deep learning semantic segmentation model, generating the improved deep learning semantic segmentation model ISDU-Net, and training the ISDU-Net model based on the training set and the verification set to obtain a trained ISDU-Net model;
specifically, the selected hyperparameters are used to optimize the parameters of the network-deepened deep learning semantic segmentation model; in essence, this is parameter tuning of the structurally optimized U-Net model.
In detail, the parameter tuning of the structurally optimized U-Net model specifically includes:
in this embodiment, on the basis of the structurally optimized U-Net model, the two hyperparameters of learning rate and batch size are selected for parameter optimization, yielding the improved U-Net model. It will be appreciated that, in addition to the model structure, the accuracy of a deep learning model also depends largely on the adjustment of the model parameters. The parameters of deep learning models generally fall into two categories: model parameters and hyperparameters. Model parameters are adjusted in a data-driven way: through continuous iterative learning, the model adjusts its internal weights according to the obtained data information. Hyperparameters do not depend on the data and are adjusted by direct human intervention: initial values are set, the model is trained, and the optimal values are found by testing. The invention tunes parameters on the structurally optimized model; because hyperparameter adjustment does not depend on the data, this embodiment improves the structurally optimized model by adjusting its hyperparameters, mainly the learning rate and batch size.
In the embodiment of the invention, with all other parameters kept consistent, the initial learning rate is set to 0.01, 0.001, 0.0001 and 0.00001 in turn for network training and testing; an initial learning rate of 0.0001 is finally determined to be the optimal setting.
A comparative study of the batch size was also performed in the embodiments of the invention. With memory permitting and the other parameters unchanged, batch sizes of 4, 6, 8 and 10 were compared, showing that the model achieves the best effect with a batch size of 8.
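The learning-rate and batch-size comparison amounts to a small grid search over the two hyperparameters; a minimal sketch in which train_eval stands for one full train-and-validate run (a hypothetical callable, not part of the patent):

```python
import itertools

def grid_search(train_eval, learning_rates, batch_sizes):
    """Return the (lr, batch_size) pair with the highest validation score."""
    return max(itertools.product(learning_rates, batch_sizes),
               key=lambda combo: train_eval(*combo))
```

In the embodiment's terms, train_eval would train the structurally optimized U-Net with the given pair and return a verification-set metric such as MIoU; the grid would be the learning rates {0.01, 0.001, 0.0001, 0.00001} and batch sizes {4, 6, 8, 10} tested above.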
Further, after parameter adjustment is performed on the U-Net model with optimized structure, an improved deep learning semantic segmentation model ISDU-Net model is generated, and training is performed on the ISDU-Net model based on the training set and the verification set, so that a trained ISDU-Net model is obtained.
And after the improved deep learning semantic segmentation model is generated by structural optimization based on the U-Net model and parameter adjustment of the U-Net model after the structural optimization, carrying out information identification on the multi-spectral remote sensing image of the economic crop in the region to be detected by utilizing the improved deep learning semantic segmentation model.
After the information identification is carried out on the multi-spectrum remote sensing image of the cash crop, the identification precision of the improved deep learning semantic segmentation model is required to be evaluated.
In this embodiment, the accuracy of the improved deep learning semantic segmentation model may be evaluated with the confusion matrix, pixel accuracy, F1 score, intersection-over-union and Kappa coefficient mentioned above.
Step S160, carrying out information identification on the multispectral remote sensing image of the cash crop in the area to be detected with the trained ISDU-Net model.
According to the cash crop information identification method based on deep learning provided by the invention, remote sensing extraction of cash crops is achieved with an improved U-Net deep learning model based on UAV multispectral remote sensing sources, with high accuracy and reliability. Compared with classical models, the classification model provided by the invention markedly improves the recognition of small samples, effectively addressing the scarcity of samples of certain classes during sample selection. The method has potential for spatio-temporal analysis of cash crops and UAV remote sensing data processing, and can provide practical guidance for cash crop management and resource allocation.
The invention also provides a cash crop information recognition system based on deep learning, which comprises an image cutting module, a sample labeling module, a model training module, a structure optimizing module, a parameter optimizing module and an information recognition module.
Fig. 3 is a schematic structural diagram of a cash crop information recognition system based on deep learning according to an embodiment of the present invention, and as shown in fig. 3, the cash crop information recognition system based on deep learning provided by the present invention includes:
the image clipping module 310 is configured to clip the unmanned aerial vehicle multispectral remote sensing image of the research area according to a specified shape to obtain a regular-shape image;
the sample labeling module 320 is configured to label samples of the regular-shape image to form sample data, and divide the sample data into a training set, a verification set and a test set;
the model training module 330 is configured to train a deep learning semantic segmentation model according to the training set and the verification set, where the deep learning semantic segmentation model is a classical semantic segmentation model U-Net model;
the structure optimization module 340 is configured to use the trained U-Net model as a base model, add an Inception structure and a channel attention mechanism module SE-Block, and deepen the network to obtain a network-deepened deep learning semantic segmentation model;
the parameter optimization module 350 is configured to select hyperparameters to optimize the parameters of the network-deepened deep learning semantic segmentation model, generate the improved deep learning semantic segmentation model ISDU-Net, and train the ISDU-Net model based on the training set and the verification set to obtain a trained ISDU-Net model;
The information identifying module 360 is configured to identify information of the multi-spectral remote sensing image of the economic crop in the area to be detected by using the trained ISDU-Net model.
According to the cash crop information identification system based on deep learning provided by the invention, the image clipping module clips the unmanned aerial vehicle multispectral remote sensing image of the research area according to a specified shape to obtain a regular-shape image; the sample labeling module labels samples of the regular-shape image to form sample data and divides the sample data into a training set, a verification set and a test set; the model training module trains a deep learning semantic segmentation model according to the training set and the verification set, wherein the deep learning semantic segmentation model is the classical semantic segmentation model U-Net; the structure optimization module takes the trained U-Net model as a base model, adds an Inception structure and a channel attention mechanism module SE-Block, and deepens the network to obtain a network-deepened deep learning semantic segmentation model; the parameter optimization module selects hyperparameters to optimize the parameters of the network-deepened deep learning semantic segmentation model, generates the improved deep learning semantic segmentation model ISDU-Net, and trains the ISDU-Net model based on the training set and the verification set to obtain a trained ISDU-Net model; the information identification module identifies information in the multispectral remote sensing image of the cash crop in the area to be detected with the trained ISDU-Net model, solving the problem in the prior art that, when classifying cash crop UAV remote sensing images, the classification accuracy is limited by classical deep learning models and the classification effect is therefore unsatisfactory.
Aiming at the shortcomings of prior research, the invention provides a cash crop information identification method based on deep learning. It selects cash crops as the research objects, develops an improved deep learning model on the basis of a classical deep learning semantic segmentation model, and applies it to studying the feature differences among different cash crops. The method improves the accuracy and extraction efficiency of UAV remote sensing imagery for cash crop classification, improves the recognition of small samples by the semantic segmentation model, and provides a reference for identifying the spatial distribution of cash crops scientifically, conveniently and efficiently; at the same time it can realize automatic identification and mapping, extracting cash crops more accurately.
Fig. 4 illustrates a physical schematic diagram of an electronic device. As shown in fig. 4, the electronic device may include: a processor 410, a communication interface (Communications Interface) 420, a memory 430 and a communication bus 440, wherein the processor 410, the communication interface 420 and the memory 430 communicate with each other via the communication bus 440. The processor 410 may invoke logic instructions in the memory 430 to perform a deep-learning-based cash crop information identification method comprising: clipping the unmanned aerial vehicle multispectral remote sensing image of the research area according to a specified shape to obtain a regular-shape image; labeling samples of the regular-shape image to form sample data, and dividing the sample data into a training set, a verification set and a test set; training a deep learning semantic segmentation model according to the training set and the verification set, wherein the deep learning semantic segmentation model is the classical semantic segmentation model U-Net; taking the trained U-Net model as a base model, adding an Inception structure and a channel attention mechanism module SE-Block, and deepening the network to obtain a network-deepened deep learning semantic segmentation model; selecting hyperparameters to optimize the parameters of the network-deepened deep learning semantic segmentation model, generating the improved deep learning semantic segmentation model ISDU-Net, and training the ISDU-Net model based on the training set and the verification set to obtain a trained ISDU-Net model; and identifying information in the multispectral remote sensing image of the cash crop in the area to be detected with the trained ISDU-Net model.
Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially, or in the part contributing to the prior art, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described deep-learning-based cash crop information identification method, the method comprising: clipping the unmanned aerial vehicle multispectral remote sensing image of the research area according to a specified shape to obtain a regular-shape image; labeling samples of the regular-shape image to form sample data, and dividing the sample data into a training set, a verification set and a test set; training a deep learning semantic segmentation model according to the training set and the verification set, wherein the deep learning semantic segmentation model is the classical semantic segmentation model U-Net; taking the trained U-Net model as a base model, adding an Inception structure and a channel attention mechanism module SE-Block, and deepening the network to obtain a network-deepened deep learning semantic segmentation model; selecting hyperparameters to optimize the parameters of the network-deepened deep learning semantic segmentation model, generating the improved deep learning semantic segmentation model ISDU-Net, and training the ISDU-Net model based on the training set and the verification set to obtain a trained ISDU-Net model; and identifying information in the multispectral remote sensing image of the cash crop in the area to be detected with the trained ISDU-Net model.
The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (12)

1. The cash crop information identification method based on deep learning is characterized by comprising the following steps of:
cutting the multispectral remote sensing image of the unmanned aerial vehicle in the research area according to a specified shape to obtain a regular-shape image;
sample labeling is carried out on the regular-shape image to form sample data, and the sample data is divided into a training set, a verification set and a test set;
training a deep learning semantic segmentation model according to the training set and the verification set, wherein the deep learning semantic segmentation model is the classic U-Net semantic segmentation model;
taking the trained U-Net model as a basic model, adding an Inception structure and a channel attention mechanism module SE-Block, and deepening the network to obtain a network-deepened deep learning semantic segmentation model;
selecting hyperparameters to perform parameter optimization on the network-deepened deep learning semantic segmentation model, generating an improved deep learning semantic segmentation model, the ISDU-Net model, and training the ISDU-Net model based on the training set and the verification set to obtain a trained ISDU-Net model;
and carrying out information identification on the multispectral remote sensing image of the cash crop in the area to be detected by using the trained ISDU-Net model.
2. The method for identifying cash crop information based on deep learning according to claim 1, wherein cutting the unmanned aerial vehicle multispectral remote sensing image of the research area according to the specified shape to obtain the regular-shape image comprises the following steps:
acquiring a multispectral remote sensing image of the unmanned aerial vehicle in a research area;
performing automatic radiometric correction, deformation correction and image stitching on the multispectral remote sensing image;
synthesizing the stitched red, green, blue, red-edge and near-infrared single-band images into a five-band image using the ArcGIS band composite tool;
and clipping the research area according to the samples by using a raster data clipping tool to obtain the regular-shape image.
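The clipping of claim 2 amounts to splitting the stitched five-band raster into a regular grid of tiles. The patent performs this step with ArcGIS tools; the NumPy sketch below, the 256-pixel tile size, and the function name `tile_image` are illustrative assumptions, not the patented implementation:

```python
import numpy as np

def tile_image(image, tile_size=256):
    """Split a (bands, H, W) multispectral array into regular square tiles.

    Edge regions that do not fill a complete tile are discarded, mirroring
    a regular-grid raster clip.
    """
    bands, h, w = image.shape
    tiles = []
    for top in range(0, h - tile_size + 1, tile_size):
        for left in range(0, w - tile_size + 1, tile_size):
            tiles.append(image[:, top:top + tile_size, left:left + tile_size])
    return np.stack(tiles)

# Example: a synthetic five-band (R, G, B, red-edge, NIR) image
five_band = np.random.rand(5, 512, 768).astype(np.float32)
patches = tile_image(five_band, tile_size=256)
print(patches.shape)  # (6, 5, 256, 256): a 2x3 grid of five-band tiles
```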
3. The method for identifying cash crop information based on deep learning of claim 1, wherein the performing sample labeling on the regular-shape image to form sample data, dividing the sample data into a training set, a verification set and a test set comprises:
performing cropping and bit-depth conversion on the obtained regular-shape image;
carrying out sample labeling on the image data after the cropping and bit-depth conversion;
and randomly dividing the image data subjected to sample labeling into a training set, a verification set and a test set according to a preset proportion.
4. The method for identifying cash crop information based on deep learning according to claim 3, wherein the performing sample labeling specifically comprises:
determining a classification system according to the cash crop categories of the research area, and assigning an RGB label to each category;
and delineating the boundary of each crop and filling it with the corresponding color in combination with the parcel information, producing a label map to obtain the labeled samples.
5. The method for identifying cash crop information based on deep learning according to claim 3, further comprising, after randomly dividing the sample-labeled image data into a training set, a verification set and a test set according to a predetermined ratio:
performing data enhancement on the training set, the verification set and the test set through geometric transformations to obtain the data-enhanced training set, verification set and test set.
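The geometric data enhancement of claim 5 can be sketched with flips and 90-degree rotations applied identically to each image tile and its label map. The patent does not specify which geometric transformations are used, so the six variants below are an assumption of this NumPy sketch:

```python
import numpy as np

def augment_pair(image, label):
    """Geometric augmentation for a (bands, H, W) tile and its (H, W) label map.

    Returns the original plus horizontal flip, vertical flip, and the three
    90-degree rotations, each applied identically to image and label.
    """
    pairs = [(image, label)]
    pairs.append((image[:, :, ::-1], label[:, ::-1]))   # horizontal flip
    pairs.append((image[:, ::-1, :], label[::-1, :]))   # vertical flip
    for k in (1, 2, 3):                                 # 90 / 180 / 270 degrees
        pairs.append((np.rot90(image, k, axes=(1, 2)).copy(),
                      np.rot90(label, k, axes=(0, 1)).copy()))
    return pairs

img = np.random.rand(5, 64, 64)
lab = np.random.randint(0, 4, size=(64, 64))
augmented = augment_pair(img, lab)
print(len(augmented))  # 6 image/label pairs
```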
6. The method for identifying cash crop information based on deep learning according to claim 1, wherein taking the trained U-Net model as a basic model, adding an Inception structure and a channel attention mechanism module SE-Block, and deepening the network to obtain a network-deepened deep learning semantic segmentation model specifically comprises:
taking the U-Net model as a basic model, wherein the U-Net model comprises a model encoding part and a model decoding part;
in the five-layer downsampling structure of the U-Net model encoding part, each of the first three layers is provided with an Inception structure, and each of the last two layers is provided with three convolution layers, three BN layers, three ReLU activation layers and one pooling layer, obtaining the network-deepened model encoding part; the Inception module comprises four parallel branches, namely 2D convolutions with 1×1, 3×3 and 5×5 convolution kernels and a max pooling layer with a 3×3 kernel, with the activation-layer and BN-layer operations performed alongside the convolutions;
in the five-layer upsampling structure of the U-Net model decoding part, a channel attention mechanism module SE-Block is arranged before the upsampling of each layer to obtain the network-deepened model decoding part, the channel attention mechanism module SE-Block having two parts, namely Squeeze (compression) and Excitation; the improved model decoding part adopts transposed convolution, gradually reduces the number of filters layer by layer, fuses high-level and low-level features with the feature layers obtained by the encoding part, and then extracts features by 2D convolution;
and generating the network-deepened deep learning semantic segmentation model based on the improved encoding part and the improved decoding part.
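The Inception structure described in claim 6 — parallel 1×1, 3×3 and 5×5 convolutions plus a 3×3 max pooling branch, each with BN and ReLU, concatenated on the channel axis — can be sketched as follows. The patent names no framework; PyTorch and the branch width of 16 channels are assumptions of this sketch:

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Parallel 1x1, 3x3 and 5x5 convolutions plus a 3x3 max pooling branch,
    each followed by BN + ReLU, concatenated along the channel axis."""
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        def conv(k):
            return nn.Sequential(
                nn.Conv2d(in_ch, branch_ch, k, padding=k // 2),
                nn.BatchNorm2d(branch_ch), nn.ReLU(inplace=True))
        self.b1, self.b3, self.b5 = conv(1), conv(3), conv(5)
        self.pool = nn.Sequential(
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(in_ch, branch_ch, 1),
            nn.BatchNorm2d(branch_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        # Spatial size is preserved in every branch, so outputs can be concatenated.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.pool(x)], dim=1)

x = torch.randn(2, 5, 64, 64)          # a batch of five-band tiles
block = InceptionBlock(in_ch=5, branch_ch=16)
print(block(x).shape)  # torch.Size([2, 64, 64, 64]) -- 4 branches x 16 channels
```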
7. The method for identifying cash crop information based on deep learning according to claim 6, wherein the processing of the input image by the channel attention mechanism module comprises:
based on the input unmanned aerial vehicle multispectral remote sensing image, performing global average pooling in the compression part to obtain the global feature information of each channel of the multispectral remote sensing image, and passing the global feature information to the excitation part;
based on the global feature information passed in by the compression part, the excitation part applies a nonlinear transformation through two fully connected layers to obtain a weight value for each channel of the multispectral remote sensing image, and assigns the weight values to the global feature information of the multispectral remote sensing image to obtain the weighted features of each channel of the multispectral remote sensing image;
the compression part uses global average pooling, defined as:
z_c = F_sq(u_c) = (1 / (H × W)) · Σ_{i=1..H} Σ_{j=1..W} u_c(i, j)
wherein H is the feature layer height, W is the feature layer width, and u_c is the output of the c-th convolution, expressed as:
u_c = v_c * X = Σ_{s=1..S} v_c^s * x^s
wherein v_c denotes the c-th convolution kernel, s indexes the S input channels, and * denotes the convolution operation.
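The Squeeze-and-Excitation computation of claim 7 — per-channel global average pooling, a two-layer fully connected excitation, then channel-wise reweighting — can be sketched in PyTorch as follows (the framework and the reduction ratio of 4 are assumptions of this sketch; the patent only specifies the two-part Squeeze/Excitation structure):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global average pooling (Squeeze), two fully
    connected layers with a sigmoid gate (Excitation), then channel reweighting."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid())

    def forward(self, u):
        b, c, h, w = u.shape
        z = u.mean(dim=(2, 3))           # squeeze: z_c = (1/HW) sum_ij u_c(i, j)
        s = self.fc(z)                   # excitation: per-channel weights in (0, 1)
        return u * s.view(b, c, 1, 1)    # reweight each channel of u

u = torch.randn(2, 32, 16, 16)
out = SEBlock(32, reduction=4)(u)
print(out.shape)  # torch.Size([2, 32, 16, 16]) -- same shape, channels reweighted
```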
8. The method for identifying cash crop information based on deep learning according to claim 1, wherein selecting hyperparameters to perform parameter optimization on the network-deepened deep learning semantic segmentation model comprises: selecting two hyperparameters, the learning rate and the batch size, to perform parameter optimization on the improved deep learning semantic segmentation model.
9. The method for identifying cash crop information based on deep learning according to claim 8, further comprising, after training the ISDU-Net model based on the training set and the verification set to obtain the trained ISDU-Net model:
predicting the test set by using the trained ISDU-Net model to obtain a prediction result for the test set;
and based on the prediction result, evaluating the accuracy of the trained ISDU-Net model by means of the confusion matrix, pixel accuracy, F1 score, intersection-over-union (IoU) and Kappa coefficient.
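Every metric listed in claim 9 can be derived from a single confusion matrix. A minimal NumPy sketch (the macro-averaging of per-class F1 and IoU shown here is an assumption; the patent does not state how the per-class values are aggregated):

```python
import numpy as np

def evaluate(y_true, y_pred, n_classes):
    """Confusion-matrix-based metrics: pixel accuracy, mean F1, mean IoU, Kappa."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (y_true.ravel(), y_pred.ravel()), 1)   # rows: truth, cols: prediction
    tp = np.diag(cm).astype(float)
    row, col = cm.sum(axis=1), cm.sum(axis=0)
    total = cm.sum()
    pixel_acc = tp.sum() / total
    f1 = 2 * tp / np.maximum(row + col, 1)               # per-class F1
    iou = tp / np.maximum(row + col - tp, 1)             # per-class IoU
    pe = (row * col).sum() / total ** 2                  # chance agreement for Kappa
    kappa = (pixel_acc - pe) / (1 - pe)
    return {"cm": cm, "pixel_acc": pixel_acc,
            "mean_f1": f1.mean(), "mean_iou": iou.mean(), "kappa": kappa}

truth = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 0, 1, 2, 2, 2])
m = evaluate(truth, pred, n_classes=3)
print(round(m["pixel_acc"], 3))  # 0.833
```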
10. A cash crop information recognition system based on deep learning, comprising:
the image clipping module is used for clipping the unmanned aerial vehicle multispectral remote sensing image of the research area according to the specified shape to obtain a regular-shape image;
the sample labeling module is used for labeling samples of the regular-shape images to form sample data, and dividing the sample data into a training set, a verification set and a test set;
the model training module is used for training a deep learning semantic segmentation model according to the training set and the verification set, wherein the deep learning semantic segmentation model is the classic U-Net semantic segmentation model;
the structure optimization module is used for taking the trained U-Net model as a basic model, adding an Inception structure and a channel attention mechanism module SE-Block, and deepening the network to obtain a network-deepened deep learning semantic segmentation model;
the parameter optimization module is used for selecting hyperparameters to perform parameter optimization on the network-deepened deep learning semantic segmentation model, generating an improved deep learning semantic segmentation model, the ISDU-Net model, and training the ISDU-Net model based on the training set and the verification set to obtain a trained ISDU-Net model;
and the information identification module is used for carrying out information identification on the multispectral remote sensing image of the cash crop in the area to be detected by using the trained ISDU-Net model.
11. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the deep learning based cash crop information identification method of any one of claims 1 to 9 when the program is executed.
12. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the deep learning based cash crop information identification method of any one of claims 1 to 9.
CN202310914018.1A 2023-07-24 2023-07-24 Method and system for identifying cash crop information based on deep learning Pending CN117115685A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310914018.1A CN117115685A (en) 2023-07-24 2023-07-24 Method and system for identifying cash crop information based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310914018.1A CN117115685A (en) 2023-07-24 2023-07-24 Method and system for identifying cash crop information based on deep learning

Publications (1)

Publication Number Publication Date
CN117115685A true CN117115685A (en) 2023-11-24

Family

ID=88806430

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310914018.1A Pending CN117115685A (en) 2023-07-24 2023-07-24 Method and system for identifying cash crop information based on deep learning

Country Status (1)

Country Link
CN (1) CN117115685A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117649607A (en) * 2024-01-23 2024-03-05 南京信息工程大学 Seagrass bed remote sensing identification method and device based on SegNet deep learning model
CN117649607B (en) * 2024-01-23 2024-04-19 南京信息工程大学 Seagrass bed remote sensing identification method and device based on SegNet deep learning model

Similar Documents

Publication Publication Date Title
WO2019091459A1 (en) Image processing method, processing apparatus and processing device
CN107392130B (en) Multispectral image classification method based on threshold value self-adaption and convolutional neural network
CN111738064B (en) Haze concentration identification method for haze image
CN109492593B (en) Hyperspectral image classification method based on principal component analysis network and space coordinates
CN112308152B (en) Hyperspectral image ground object classification method based on spectrum segmentation and homogeneous region detection
CN116091497B (en) Remote sensing change detection method, device, electronic equipment and storage medium
CN105469098A Precise LiDAR data ground object classification method based on adaptive characteristic weight synthesis
US20220215656A1 (en) Method, apparatus, device for image processing, and storage medium
CN110781831A (en) Hyperspectral optimal waveband selection method and device based on self-adaption
CN117115685A (en) Method and system for identifying cash crop information based on deep learning
CN114463637A (en) Winter wheat remote sensing identification analysis method and system based on deep learning
CN113435254A (en) Sentinel second image-based farmland deep learning extraction method
CN115331104A (en) Crop planting information extraction method based on convolutional neural network
CN116091833A Attention and Transformer hyperspectral image classification method and system
CN110516648B (en) Ramie plant number identification method based on unmanned aerial vehicle remote sensing and pattern identification
CN108256557A (en) The hyperspectral image classification method integrated with reference to deep learning and neighborhood
CN111046838A (en) Method and device for identifying wetland remote sensing information
CN113989649A (en) Remote sensing land parcel identification method based on deep learning
CN112560034B (en) Malicious code sample synthesis method and device based on feedback type deep countermeasure network
CN113298086A (en) Red tide multispectral detection method based on U-Net network
CN112784806A (en) Lithium-containing pegmatite vein extraction method based on full convolution neural network
CN116630188A (en) Underwater image enhancement method, system, electronic equipment and storage medium
CN116612004A (en) Double-path fusion-based hyperspectral image reconstruction method
CN116543325A (en) Unmanned aerial vehicle image-based crop artificial intelligent automatic identification method and system
CN116137043A Infrared image colorization method based on convolution and Transformer

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination