CN115760866A - Crop drought detection method based on remote sensing image - Google Patents

Crop drought detection method based on remote sensing image

Info

Publication number
CN115760866A
CN115760866A (application CN202211511972.8A)
Authority
CN
China
Prior art keywords
drought
module
scale
level
feature map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211511972.8A
Other languages
Chinese (zh)
Inventor
张江南
李吉龙
綦家辉
张乐平
王志义
吕文羽
于瑷源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Agricultural University
Original Assignee
Qingdao Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Agricultural University filed Critical Qingdao Agricultural University
Priority to CN202211511972.8A priority Critical patent/CN115760866A/en
Publication of CN115760866A publication Critical patent/CN115760866A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a crop drought detection method based on remote sensing images. A drought detection neural network model is constructed to detect crop drought areas over a wide range, so as to prevent crop drought effectively and provide reliable data support for precision irrigation. The feature map extraction module uses ResNet50 as the backbone network to extract feature maps. A multi-scale attention pooling module is designed in the Encoder module: it acquires multi-scale context information through multi-scale dilated (atrous) convolution and fuses a CBAM attention mechanism to focus the network's attention on the target drought area, improving feature extraction. Finally, the Decoder part fuses the low-level feature map with the multi-scale high-level drought feature map from the Encoder module and recovers the detected drought region by deconvolution, thereby improving the recognition of crop drought boundaries.

Description

Crop drought detection method based on remote sensing image
Technical Field
The invention belongs to the field of crop drought detection, and particularly relates to a drought detection method based on semantic segmentation of crop remote sensing images.
Background
Since the beginning of the 21st century, the worldwide food crisis has gradually intensified, and natural disasters such as drought and flood seriously threaten China's food security. According to statistics, drought causes global economic losses of roughly $60-80 billion per year, far exceeding other natural disasters. As a key factor affecting crop growth, drought severely disrupts the normal growth and development of crops, so effective detection of crop drought is crucial to improving disaster prevention and mitigation.
At present, drought detection is mainly realized by installing various sensor devices on an Internet of Things (IoT) system. This approach is difficult to apply to large-scale farmland, greatly increases equipment cost, and the IoT devices cannot run stably for long periods in an outdoor environment, so the method has clear limitations. The invention patent with publication number CN108765906A discloses an agricultural drought monitoring, early-warning, and forecasting method that uses the existing signal towers and telegraph poles in the field to set up multiple detection branch points and divide the farmland into areas. Each area contains several detection branch points and a master station that collects their detection data, and each branch point includes an air temperature and humidity detection unit, a soil pH detection unit, a wind speed detection unit, and a picture-shooting unit. By detecting the conditions of air, soil, and crops at multiple points and cross-checking precipitation, groundwater extraction, and the climatic and environmental conditions in each area, the cause of drought can be identified and an alarm raised in advance to prevent drought.
Because of cost, traditional crop drought detection is difficult to apply to wide-range farmland, and the equipment cannot operate stably outdoors for long periods. In view of this, remote sensing images are increasingly used as the data source, which expands the drought detection range and enables accurate segmentation of drought boundaries. However, drought-stricken crops are often irregularly distributed in the field, and the segmented images are often mixed with background regions, so crop drought boundaries cannot be delineated accurately. Traditional methods have a single, fixed network structure, segment large-scale and complex drought farmland images poorly, and can hardly achieve accurate and effective crop drought detection. Therefore, an improved network architecture is needed to solve these problems and achieve reliable, wide-range crop drought detection.
Disclosure of Invention
Aiming at the defects of the traditional approach of installing various sensor devices on an IoT system for crop drought detection, the invention provides a crop drought detection method based on remote sensing images, which effectively improves the recognition of crop drought boundaries.
The invention is realized by adopting the following technical scheme: a crop drought detection method based on remote sensing images comprises the following steps:
step A, constructing a drought detection neural network model, training the drought detection neural network model, wherein the drought detection neural network model comprises a characteristic diagram extraction module, an encoder module and a decoder module, and determining a final drought detection neural network model;
step B, detecting the crop drought area based on the trained drought detection neural network model, and specifically comprising the following steps:
b1, extracting a low-level feature map of a remote sensing image of a crop drought region based on a feature map extraction module;
b2, acquiring multi-scale context information of the remote sensing image through multi-scale dilated (atrous) convolution in the Encoder module, extracting a multi-scale high-level feature map, and then fusing a CBAM attention mechanism to focus the network's attention on the target drought area, obtaining the multi-scale high-level drought feature map and improving the extraction of drought features;
and step B3, the Decoder module fuses the low-level feature map with the multi-scale high-level drought feature map extracted by the Encoder module and restores the detected drought region by deconvolution, thereby realizing detection of the drought region.
Further, the feature map extraction module adopts a ResNet50 network comprising an input module, a residual module, and an output module. The input module consists of a convolutional layer and a max-pooling layer, with a ReLU activation function and batch normalization layers to improve the network's fitting ability; the residual module comprises two basic blocks connected in series, Conv-Block and Identity-Block, where Conv-Block changes the network's dimensionality and Identity-Block deepens the network; the output module processes the input features through an average pooling layer and finally maps the input remote sensing image to a low-level feature map through a fully connected layer.
Further, the Encoder module comprises a multi-scale attention pooling module that processes the low-level feature map with dilated convolutions at different sampling rates, enlarging the receptive field and producing a multi-scale high-level feature map. While processing the low-level feature map at the different sampling rates, a CBAM convolutional attention mechanism is introduced to further enhance the drought features along both the channel and spatial dimensions; finally, the high-level feature maps from the multiple scales are fused to obtain the final fused multi-scale high-level drought feature map.
Further, the specific process by which the Encoder module extracts the multi-scale high-level drought feature map is as follows:
(1) The multi-scale attention pooling module enlarges the convolution kernel by inserting holes (zeros) into it, controlled by the dilation rate (dilations) of the dilated convolution layer. With a kernel of size k and dilations − 1 holes inserted between adjacent kernel elements, the effective kernel size n after hole insertion is:

n = k + (k − 1) × (dilations − 1)
(2) Assuming the input feature map has size i and the stride is S, the output size of a standard convolution is:

o = ⌊(i − k) / S⌋ + 1

and the output size of the dilated convolution, i.e. the size of the crop drought feature map after multi-scale extraction, is:

o = ⌊(i − k − (k − 1) × (dilations − 1)) / S⌋ + 1
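As a quick check, the two size formulas can be evaluated numerically. The sketch below is illustrative only: the 3 × 3 kernel and the dilation rates 6/12/24 follow the multi-scale attention pooling module described here, while the 128-pixel input size and stride 1 are arbitrary assumptions.

```python
# Numeric check of the dilated-convolution size formulas above.
# effective kernel: n = k + (k - 1) * (dilations - 1)
# output size (no padding): o = floor((i - n) / S) + 1

def effective_kernel_size(k: int, dilations: int) -> int:
    """Kernel size after inserting dilations - 1 holes between kernel elements."""
    return k + (k - 1) * (dilations - 1)

def conv_output_size(i: int, k: int, stride: int, dilations: int = 1) -> int:
    """Output side length of a (dilated) convolution without padding."""
    n = effective_kernel_size(k, dilations)
    return (i - n) // stride + 1

# A 3x3 kernel at the dilation rates used by the multi-scale module,
# applied to a hypothetical 128x128 input with stride 1:
for d in (1, 6, 12, 24):
    n = effective_kernel_size(3, d)
    o = conv_output_size(128, 3, 1, d)
    print(f"dilations={d:2d}  effective kernel n={n:2d}  output size o={o}")
```

For example, a 3 × 3 kernel with dilation 6 behaves like a 13 × 13 kernel while still using only nine weights, which is how the module enlarges the receptive field cheaply.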
a CBAM module is introduced after the convolution at each scale during the processing of the low-level features to enhance the extraction of drought features. CBAM combines a channel attention module (CAM) and a spatial attention module (SAM), enhancing the drought features along the channel and spatial dimensions, respectively;
(3) The channel attention module CAM takes the input F, F ∈ R^(C×H×W), applies average pooling and max pooling separately, passes both pooled vectors through a shared fully connected (MLP) layer, and adds the corresponding elements to generate a one-dimensional channel attention map M_C ∈ R^(C×1×1). Multiplying the input element-wise by this channel attention gives the channel-refined drought feature map F_C. The channel-dimension drought attention M_C(F) is expressed as:

M_C(F) = σ(W_1(W_0(AvgPool(F))) + W_1(W_0(MaxPool(F))))
where σ is the Sigmoid function, and W_0 and W_1 are the hidden-layer and output-layer weights of the shared MLP, respectively;
(4) The spatial attention module SAM takes the channel attention drought feature map F_C, F_C ∈ R^(C×H×W), applies average pooling and max pooling along the channel dimension, concatenates the two pooled maps, and passes the result through a 7 × 7 convolution and a Sigmoid function to obtain the spatial-dimension drought attention map M_S ∈ R^(1×H×W), expressed as:

M_S(F_C) = σ(f^(7×7)([AvgPool(F_C); MaxPool(F_C)]))
where σ is the Sigmoid function and f^(7×7) denotes a convolution with a kernel size of 7 × 7;
and the multi-scale attention pooling module applies a max-pooling operation after the processing at each scale so that the output feature maps have the same size, fuses all the multi-scale high-level drought feature maps along the channel dimension, and, after a 1 × 1 convolution, outputs the fused multi-scale high-level drought feature map as its final result.
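The two attention formulas above can be sketched in NumPy as follows. This is a minimal illustration, not the patent's implementation: the tensor sizes, the random weights, and the ReLU hidden layer in the shared MLP (standard in CBAM) are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W0, W1):
    """CAM: M_C(F) = sigmoid(W1(W0(AvgPool(F))) + W1(W0(MaxPool(F))))."""
    avg = F.mean(axis=(1, 2))                      # (C,) average-pooled vector
    mx = F.max(axis=(1, 2))                        # (C,) max-pooled vector
    mlp = lambda v: W1 @ np.maximum(W0 @ v, 0.0)   # shared MLP with ReLU hidden layer
    M_C = sigmoid(mlp(avg) + mlp(mx))              # (C,) channel attention
    return F * M_C[:, None, None]                  # channel-refined map F_C

def spatial_attention(F_C, kernel):
    """SAM: M_S = sigmoid(conv7x7([AvgPool(F_C); MaxPool(F_C)]))."""
    stacked = np.stack([F_C.mean(axis=0), F_C.max(axis=0)])  # (2, H, W)
    H, W = stacked.shape[1:]
    pad = np.pad(stacked, ((0, 0), (3, 3), (3, 3)))          # 'same' padding for 7x7
    conv = np.zeros((H, W))
    for i in range(H):                                       # naive 7x7 convolution
        for j in range(W):
            conv[i, j] = np.sum(pad[:, i:i + 7, j:j + 7] * kernel)
    return F_C * sigmoid(conv)[None]                         # spatially refined map

# Toy example: C=4 channels on an 8x8 map, reduction ratio 2 in the MLP.
rng = np.random.default_rng(0)
F = rng.standard_normal((4, 8, 8))
W0 = rng.standard_normal((2, 4))       # hidden-layer weights
W1 = rng.standard_normal((4, 2))       # output-layer weights
k7 = rng.standard_normal((2, 7, 7))    # 7x7 kernel over the two pooled maps
out = spatial_attention(channel_attention(F, W0, W1), k7)
print(out.shape)  # the feature map keeps its (C, H, W) shape
```

Because both attentions are element-wise multiplications by maps in (0, 1), the feature map's shape is preserved while weakly responding channels and locations are suppressed.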
Further, the Decoder module applies bilinear-interpolation 4× upsampling to the multi-scale high-level drought feature map extracted by the Encoder module, gradually restoring it toward the original image size; it applies a convolution to the low-level feature map to compress its channels and preserve the low-level feature information, and adds a CBAM module to enhance the crop-drought-region features of the low-level feature map; finally, it fuses the multi-scale high-level drought feature map with the low-level feature map, applies another 4× bilinear-interpolation upsampling, and decodes the final prediction map, completing the detection of the crop drought area.
Further, in step A, when the network model is trained to determine the final drought detection neural network model, a trained network prediction model is generated after each training batch finishes, the IoU between the detected drought region image and the drought label image is calculated, the results are compared to find the optimum, and the model with the highest accuracy is saved as the final model.
Compared with the prior art, the invention has the advantages and positive effects that:
the scheme provides a Transformer neural network method for crop drought detection based on remote sensing images, a ResNet50 is used as a main network of a feature extraction module, a multi-scale attention pooling module is designed, a multi-scale void convolution method is adopted to reduce local information loss caused in a downsampling process, and an attention mechanism is introduced to enhance extraction of crop drought features; in the characteristic diagram recovery process, the high-level characteristic diagram and the low-level characteristic diagram are effectively fused, so that the deconvolution effect is better, the identification effect of the drought boundary of the crops is further improved, the drought of the crops is effectively prevented, and reliable data support is provided for realizing accurate irrigation. According to the scheme, the drought detection of crops in a large range is realized through a remote sensing technology, the pixel accuracy reaches 91.1%, and the method has wide practical application and popularization values.
Drawings
FIG. 1 is a schematic diagram of the overall structure of the Transformer network for crop drought detection according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a feature extraction process according to an embodiment of the present invention;
fig. 3 is a schematic diagram of an Encoder module architecture of a Transformer network structure according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a Decoder module architecture of the Transformer network structure according to an embodiment of the present invention;
fig. 5 is a schematic overall architecture diagram of a Transformer network structure according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the drought detection results of crops according to the embodiment of the present invention.
Detailed Description
In order to make the above objects, features and advantages of the present invention more clearly understood, the present invention will be further described with reference to the accompanying drawings and examples. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced otherwise than as described herein, and thus, the present invention is not limited to the specific embodiments disclosed below.
The embodiment provides a crop drought detection method based on remote sensing images: satellite remote sensing is used to photograph the farmland in the monitored area over a long period, the remote sensing images are analyzed, and a Transformer-based neural network performs wide-range drought detection on them. The method comprises the following steps:
step A, constructing a drought detection neural network model and training it, wherein the drought detection neural network model comprises a feature map extraction module, an Encoder module, and a Decoder module;
step B, detecting the crop drought area based on the trained drought detection neural network model, and specifically comprising the following steps:
b1, extracting a low-level feature map of the remote sensing image of the crop drought region based on a feature map extraction module;
b2, acquiring multi-scale context information of the remote sensing image through multi-scale dilated convolution in the Encoder module, extracting a multi-scale high-level feature map, and then fusing a CBAM (Convolutional Block Attention Module) attention mechanism to focus the network's attention on the target drought area, obtaining the multi-scale high-level drought feature map and improving the extraction of drought features;
and step B3, the Decoder module fuses the low-level feature map with the multi-scale high-level drought feature map extracted by the Encoder module and restores the detected drought region by deconvolution, thereby realizing detection of the drought region.
For a clearer understanding of the present invention, the following is a detailed description of the model construction and specific applications of the present invention:
In step A, a Transformer network structure is adopted as the backbone of the drought detection neural network model; the remote sensing images of crop drought regions are segmented and the best training model is selected using an intersection-over-union (IoU) calculation. As shown in Figure 1, the step specifically comprises:
step A1, constructing a data set: acquiring a crop drought remote sensing image data set and randomly dividing a subset of it into a training set, a test set, and a validation set at a 6:2:2 ratio;
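A 6:2:2 random split of this kind can be sketched as below; the file names and the fixed seed are illustrative assumptions, not part of the patent.

```python
import random

def split_622(items, seed=42):
    """Shuffle and split a list of image paths into train/test/validation at 6:2:2."""
    items = list(items)
    random.Random(seed).shuffle(items)   # reproducible shuffle
    n = len(items)
    n_train = n * 6 // 10                # 60% for training
    n_test = n * 2 // 10                 # 20% for testing; the rest is validation
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    val = items[n_train + n_test:]
    return train, test, val

# Hypothetical file names; with 3,000 images this gives 1,800 / 600 / 600.
images = [f"drought_{i:04d}.png" for i in range(3000)]
train, test, val = split_622(images)
print(len(train), len(test), len(val))  # 1800 600 600
```

Fixing the seed keeps the split reproducible across training runs, so model comparisons use identical data partitions.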
step A2, constructing a drought detection neural network model comprising a feature extraction module, an Encoder module, and a Decoder module;
as shown in fig. 2-5, the structure of the feature extraction module is as shown in fig. 2, and a Resnet50 network is adopted, which includes six parts, stage0 is an input module, stages 1 to 4 are residual modules, and stage5 is an output module. The input module consists of a convolutional layer and a maximum pooling layer, and the Relu activation function and the batch normalization layer are used for improving the network fitting capability; the residual module is provided with two basic modules Conv-Block (changing the dimension of the network) and Identity-Block (deepening the depth of the network), and the two basic modules are connected in series to increase the depth of the network; the output module processes the features through an average pooling layer, and finally processes the input remote sensing image into a low-level feature map through a full connection layer;
the Encoder module is structurally shown in FIG. 3, and is designed with a multi-scale attention pooling module, acquires multi-scale context information in a multi-scale cavity convolution mode, integrates a CBAM (cone beam absorption spectrum) attention mechanism to focus network attention on a drought area of a target, outputs a multi-scale high-level feature map and improves the feature extraction effect;
the Deconder module structure is as shown in fig. 4, and the low-level feature map and the multi-scale high-level drought feature map are fused, and the image is restored by adopting a deconvolution mode.
Step A3, training a network model:
step A31, inputting all images in a training set into a Transformer network architecture for model training, and initializing network parameters;
step A32: inputting the images in the training set into Resnet50 to extract a low-level feature map;
step A33, extracting multi-scale features by the Encoder module:
The feature map extracted in step A32 is input to the multi-scale attention pooling module, which processes the low-level feature map with a 3 × 3 convolution kernel and with dilated convolutions at different dilation rates such as 6, 12, and 24. Extracting the drought feature map with dilated convolutions at different sampling rates enlarges the receptive field and yields a high-level drought feature map across multiple scales. While processing the low-level feature map at the different sampling rates, a CBAM convolutional attention mechanism is introduced to further enhance the drought features along both the channel and spatial dimensions. Finally, the processed high-level feature maps of the multiple scales are fused to obtain the final fused multi-scale high-level drought feature map, specifically as follows:
(1) The multi-scale attention pooling module enlarges the convolution kernel by inserting holes (zeros) into it, controlled by the dilation rate (dilations) of the dilated convolution layer. With a kernel of size k and (dilations − 1) holes inserted between adjacent kernel elements, the effective kernel size n after hole insertion is:

n = k + (k − 1) × (dilations − 1)
(2) Assuming the input feature map has size i and the stride is S, the output size of a standard convolution is:

o = ⌊(i − k) / S⌋ + 1

and the output size of the dilated convolution, i.e. the size of the crop drought feature map after multi-scale extraction, is:

o = ⌊(i − k − (k − 1) × (dilations − 1)) / S⌋ + 1
and a CBAM module is introduced after the convolution at each scale during the processing of the low-level features to enhance the extraction of drought features. CBAM combines a channel attention module (CAM) and a spatial attention module (SAM), enhancing the drought features along the channel and spatial dimensions, respectively.
(3) The channel attention module CAM takes the input F (F ∈ R^(C×H×W)), applies average pooling and max pooling separately, passes both pooled vectors through a shared fully connected (MLP) layer, and adds the corresponding elements to generate a one-dimensional channel attention map M_C ∈ R^(C×1×1). Multiplying the input element-wise by this channel attention gives the channel-refined drought feature map F_C. The channel-dimension drought attention M_C(F) is expressed as:

M_C(F) = σ(W_1(W_0(AvgPool(F))) + W_1(W_0(MaxPool(F))))
where σ is the Sigmoid function, and W_0 and W_1 are the hidden-layer and output-layer weights of the shared MLP, respectively;
(4) The spatial attention module SAM takes the channel attention drought feature map F_C (F_C ∈ R^(C×H×W)), applies average pooling and max pooling along the channel dimension, concatenates the two pooled maps, and passes the result through a 7 × 7 convolution and a Sigmoid function to obtain the spatial-dimension drought attention map M_S ∈ R^(1×H×W), expressed as:

M_S(F_C) = σ(f^(7×7)([AvgPool(F_C); MaxPool(F_C)]))
where σ is the Sigmoid function and f^(7×7) denotes a convolution with a kernel size of 7 × 7;
and the multi-scale attention pooling module applies a max-pooling operation after the processing at each scale so that the output feature maps have the same size, fuses all the high-level drought feature maps along the channel dimension, and, after a 1 × 1 convolution, outputs the fused high-level drought feature map as its final result.
Step A34, fusion of the feature maps:

The multi-scale high-level drought feature map extracted by the Encoder module undergoes a bilinear-interpolation 4× upsampling operation, gradually restoring it toward the original image size. The decoder first applies a 1 × 1 convolution to compress the low-level feature channels to 48 in order to preserve the low-level feature information, concatenates the two tensors along the channel dimension with the torch.cat function to restore the target boundary, and finally decodes the final drought-area image through a 4× bilinear-interpolation upsampling with torch.nn.functional.interpolate.
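The tensor shapes through this decoder path can be traced with simple arithmetic. The sketch below is a bookkeeping aid only: the 512 × 512 input, the encoder output stride of 16, and the 256-channel high-level map are assumptions for illustration, with only the 48-channel compression and the two 4× upsamples taken from the description above.

```python
# Shape bookkeeping for the decoder path: 4x bilinear upsample of the
# high-level map, 1x1 compression of the low-level map to 48 channels,
# channel concatenation (torch.cat), then a final 4x upsample.

def decoder_shapes(in_size=512, high_ch=256, out_stride=16, low_stride=4):
    enc = in_size // out_stride          # high-level map spatial size
    up1 = enc * 4                        # after the first 4x upsample
    low = in_size // low_stride          # low-level map spatial size
    assert up1 == low, "upsampled high-level map must match the low-level map"
    fused_ch = high_ch + 48              # concat with the 48-channel low-level map
    out = up1 * 4                        # after the final 4x upsample
    return {"encoder": (high_ch, enc, enc),
            "fused": (fused_ch, up1, up1),
            "output": (out, out)}

shapes = decoder_shapes()
print(shapes["output"])  # recovers the assumed 512x512 input size
```

The check inside the function makes explicit why the first upsampling factor is 4: it must bring the high-level map to the same resolution as the low-level map before concatenation.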
Step A4, determining a final drought detection neural network model:
after training of each training batch finishes, a trained network prediction model is generated; the IoU (intersection-over-union) between the detected drought region image and the drought label image is calculated, the results are compared to find the optimum, and the model with the highest accuracy is saved as the final model.
The MIoU (mean intersection-over-union) between the detected drought region image and the label image is computed as:

MIoU = (1 / (k + 1)) × Σᵢ [ Xᵢᵢ / (Σⱼ Xᵢⱼ + Σⱼ Xⱼᵢ − Xᵢᵢ) ]

where Xᵢᵢ is the total number of pixels of class i predicted as class i, Xᵢⱼ is the total number of pixels of class i predicted as class j, Xⱼᵢ is the total number of pixels of class j predicted as class i, and k + 1 is the number of classes.
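Under those definitions, MIoU can be computed from the confusion matrix as sketched below; the NumPy implementation and the toy two-class example are illustrative, not the patent's code.

```python
import numpy as np

def mean_iou(pred, label, num_classes):
    """MIoU = mean over classes of X_ii / (sum_j X_ij + sum_j X_ji - X_ii),
    where X[i, j] counts pixels of true class i predicted as class j."""
    X = np.bincount(num_classes * label.ravel() + pred.ravel(),
                    minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(X).astype(float)                # X_ii
    union = X.sum(axis=1) + X.sum(axis=0) - inter   # row sum + column sum - diagonal
    return float((inter / np.maximum(union, 1)).mean())

# Toy 2x2 binary drought mask: one of the four pixels is misclassified.
label = np.array([[0, 1], [0, 1]])
pred = np.array([[0, 1], [1, 1]])
print(mean_iou(pred, label, 2))  # (1/2 + 2/3) / 2 ≈ 0.583
```

A perfect prediction gives an MIoU of 1.0; averaging the per-class IoUs keeps the small drought class from being swamped by the background class, which is why MIoU rather than plain pixel accuracy is used for model selection.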
In specific application, detecting the crop drought area based on the trained drought detection neural network model comprises the following steps:
step B1, when feature extraction is carried out: and the input remote sensing image is subjected to down-sampling of the Resnet50 network to generate a low-level feature map.
Step B2: the low-level feature map is input to the Encoder module, where the multi-scale attention pooling module processes it with a 1 × 1 convolution and with dilated convolutions at different dilation rates such as 6, 12, and 24; while processing the low-level feature map at the different sampling rates, CBAM convolutional attention is added for drought feature enhancement, making the boundary contour of the crop drought area more distinct, and after a pooling operation the multi-scale features are fused into the multi-scale high-level drought feature map.
Step B3: the Decoder module applies bilinear-interpolation 4× upsampling to the multi-scale high-level drought feature map extracted by the Encoder module, gradually restoring it toward the original image size; it applies a 1 × 1 convolution to the original low-level feature map (the coarse crop drought feature map) to compress the low-level feature channels to 48 and preserve the low-level feature information, then adds a CBAM module to enhance the low-level features of the crop drought region. Finally, the high-level and low-level feature maps are fused, another 4× bilinear-interpolation upsampling is applied, and the final prediction map is decoded, completing the detection of the crop drought area.
This embodiment evaluates the method on the CVPR 2020 Agriculture-Vision challenge data set, with the following results:

The data set contains 21,061 aerial farmland images taken across the United States throughout 2019; each image consists of four 512 × 512 channels, and each has a boundary map marking the farmland region and a mask marking the valid pixels. In this embodiment a subset of 3,000 images is selected and divided: 60% (1,800 images) are used to train the network, and 600 images are used to validate the trained network model.
A SENet attention mechanism is fused into the residual network ResNet50 of the original architecture to form a Transformer (Se-ResNet) network, which is trained with 600 images;

The Encoder part of the original network structure is fused with a Convolutional Block Attention Module (CBAM), and the resulting Transformer (with CBAM) is trained with 600 images;

The Encoder part of the original network structure is fused with multi-scale dilated convolution, and the resulting Transformer is trained with 600 images;

The multi-scale attention pooling module of the Encoder part is fused with a Convolutional Block Attention Module (CBAM), which is the method of the present invention, and it is trained with 600 images;

The network architecture is replaced with UNet, trained with 600 image samples;

The network architecture is replaced with FCN, trained with 600 image samples;

The network architecture is replaced with SegNet, trained with 600 image samples;

The network architecture is replaced with PSPNet, trained with 600 image samples;

The network architecture is replaced with HRNet, trained with 600 image samples;
With all other parameter settings kept consistent, every network is trained for 100 epochs and the pixel accuracy is obtained; the training results are compared in Table 1:

[Table 1: pixel-accuracy comparison of the trained networks; the table image is not reproduced in this text extraction.]
as can be seen from Table 1, on the original image test set, the accuracy of the method is higher than that of the other four methods, and the identification accuracy is up to 91.05%. The identification accuracy of the method is improved by 2 percent compared with that of the original Deeplabv3+ network, and the fusion volume block attention mechanism (CBAM) is proved to enable the feature extraction network to effectively extract features and improve the identification performance of the model. Compared with a transform (CBAM), the method has the advantages that the Convolution Block Attention Mechanism (CBAM) is fused in the multi-scale attention pooling module, the identification accuracy is improved by 1 percent, and the feature extraction performance can be enhanced after the convolution block attention mechanism is fused in the void space pyramid. The recognition accuracy of the method is higher than that of other four models, and the method is proved to be capable of enhancing the feature extraction performance and enabling the feature extraction network to extract the features more efficiently.
As shown in Figure 6, segmenting the remote sensing satellite image of a drought-stricken farmland yields an image of the crop drought area: the drought region is fully preserved when fed into the recognition network, and the performance loss caused by image noise is well reduced, so the method shows good recognition performance.
The above description is only a preferred embodiment of the present invention and is not intended to limit the invention to other forms. Any person skilled in the art may use the technical content disclosed above to make equivalent embodiments with equivalent changes; any simple modification or equivalent change made to the above embodiments without departing from the technical spirit of the present invention still falls within the protection scope of the present invention.

Claims (6)

1. A crop drought detection method based on remote sensing images, characterized in that the method comprises the following steps:
step A, constructing a drought detection neural network model and training it to determine the final drought detection neural network model, wherein the model comprises a feature map extraction module, an Encoder module and a Decoder module;
step B, detecting the crop drought region based on the trained drought detection neural network model, which specifically comprises the following steps:
step B1, extracting a low-level feature map of the remote sensing image of the crop drought region based on the feature map extraction module;
step B2, acquiring multi-scale context information of the remote sensing image by multi-scale dilated (atrous) convolution based on the Encoder module, extracting a multi-scale high-level feature map, and then fusing a CBAM (Convolutional Block Attention Module) attention mechanism to focus the attention of the network on the target drought region, obtaining the multi-scale high-level drought feature map;
step B3, effectively fusing, through the Decoder module, the low-level feature map and the multi-scale high-level drought feature map extracted by the Encoder module, and restoring the detected drought region by deconvolution to realize detection of the drought region.
2. The crop drought detection method based on remote sensing images as claimed in claim 1, wherein the feature map extraction module adopts a ResNet50 network and comprises an input module, a residual module and an output module; the input module consists of a convolutional layer and a max pooling layer, and uses a ReLU activation function and a batch normalization layer to improve the fitting capability of the network; the residual module comprises two basic blocks connected in series, Conv-Block and Identity-Block, where Conv-Block changes the dimensionality of the network and Identity-Block deepens the network; the output module processes the input features through an average pooling layer, and finally processes the input remote sensing image into a low-level feature map through a fully connected layer.
3. The crop drought detection method based on remote sensing images as claimed in claim 1, wherein the Encoder module comprises a multi-scale attention pooling module, which processes the low-level feature map with dilated convolutions at different sampling rates to enlarge the receptive field and acquire multi-scale high-level feature maps; a CBAM convolutional attention mechanism is introduced to further enhance the drought features in both the channel and spatial dimensions, and finally the high-level feature maps processed at the multiple scales are fused to obtain the final fused multi-scale high-level drought feature map.
4. The crop drought detection method based on remote sensing images as claimed in claim 3, characterized in that the specific process by which the Encoder module extracts the multi-scale high-level drought feature map is as follows:
(1) The multi-scale attention pooling module inserts holes into the convolution kernel by adjusting the dilation coefficient of the dilated convolution layer, so as to enlarge the effective kernel; with the base kernel size set to k and the number of filled holes set to dilations-1, the size n of the kernel after filling dilations-1 holes is:
n = k + (k - 1) × (dilations - 1)
(2) Assuming that the input feature map size is i and the stride is S, the size o of the output convolution feature map (without padding) is:
o = ⌊(i - n) / S⌋ + 1
and further, substituting n, the size of the dilated convolution feature map, i.e. the crop drought feature map after multi-scale extraction, is:
o = ⌊(i - k - (k - 1) × (dilations - 1)) / S⌋ + 1
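The two size relations above can be checked with a short sketch (the function names and the optional `padding` argument are illustrative additions; the claim itself assumes no padding):

```python
def effective_kernel_size(k, dilations):
    """Size n of a kernel of base size k after inserting dilations-1 holes between taps."""
    return k + (k - 1) * (dilations - 1)

def output_size(i, k, dilations, stride, padding=0):
    """Output size of a dilated convolution over an input of size i."""
    n = effective_kernel_size(k, dilations)
    return (i + 2 * padding - n) // stride + 1

# e.g. a 3x3 kernel with dilation 2 behaves like a 5x5 kernel
print(effective_kernel_size(3, 2))      # 5
print(output_size(64, 3, 2, stride=1))  # 60
```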
a CBAM module is introduced for the convolution kernel of each scale during low-level feature processing to enhance the extraction of drought features; CBAM is formed by combining a channel attention module (CAM) and a spatial attention module (SAM), which enhance the drought features in the channel and spatial dimensions respectively;
(3) The channel attention module CAM takes the input F ∈ R^(C×H×W), applies average pooling and max pooling separately, passes both results through a shared fully connected layer and adds the corresponding elements to generate a one-dimensional channel attention map M_C ∈ R^(C×1×1); multiplying the channel attention with the input element-wise gives the channel-refined drought feature map F_C. The channel-dimension drought feature attention M_C(F) is expressed as:
M_C(F) = σ(W₁(W₀(AvgPool(F))) + W₁(W₀(MaxPool(F))))
where σ is the Sigmoid function, and W₀ and W₁ are the hidden-layer weight and output-layer weight of the shared fully connected layer, respectively;
(4) The spatial attention module SAM takes the channel attention drought feature map F_C ∈ R^(C×H×W), applies average pooling and max pooling along the channel dimension, concatenates the results, and obtains the spatial-dimension drought attention map M_S ∈ R^(1×H×W) through a convolution followed by a Sigmoid function, expressed as:
M_S(F_C) = σ(f^(7×7)([AvgPool(F_C); MaxPool(F_C)]))
where σ is the Sigmoid function and f^(7×7) denotes a convolution with a 7×7 kernel;
after completing the processing at each scale, the multi-scale attention pooling module performs a max pooling operation so that the output feature maps have the same size; all multi-scale high-level drought feature maps are then fused along the channel dimension, and the fused multi-scale high-level drought feature map after a 1×1 convolution operation is taken as the final output of the multi-scale attention pooling module.
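A minimal NumPy sketch of the two CBAM stages in steps (3) and (4) — all names, shapes and random weights are illustrative, and the learned 7×7 filter f^(7×7) over the two pooled maps is stood in for by two arbitrary fixed kernels whose responses are summed:

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W0, W1):
    """CAM: shared MLP over average- and max-pooled descriptors; F has shape (C, H, W)."""
    avg, mx = F.mean(axis=(1, 2)), F.max(axis=(1, 2))  # two (C,) descriptors
    Mc = sigmoid(W1 @ np.maximum(W0 @ avg, 0) + W1 @ np.maximum(W0 @ mx, 0))
    return F * Mc[:, None, None]                       # channel-refined map F_C

def conv2d_same(x, kernel):
    """Naive 'same'-padded 2-D convolution (stand-in for a learned 7x7 filter)."""
    kh, kw = kernel.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    return np.array([[np.sum(xp[i:i + kh, j:j + kw] * kernel)
                      for j in range(x.shape[1])] for i in range(x.shape[0])])

def spatial_attention(Fc, k7):
    """SAM: pool along channels, 7x7 convolve, sigmoid gate; k7 has shape (2, 7, 7)."""
    Ms = sigmoid(conv2d_same(Fc.mean(axis=0), k7[0]) + conv2d_same(Fc.max(axis=0), k7[1]))
    return Fc * Ms[None, :, :]

rng = np.random.default_rng(0)
F = rng.standard_normal((4, 8, 8))
W0, W1 = 0.1 * rng.standard_normal((2, 4)), 0.1 * rng.standard_normal((4, 2))
out = spatial_attention(channel_attention(F, W0, W1), 0.01 * rng.standard_normal((2, 7, 7)))
print(out.shape)  # (4, 8, 8) — attention preserves the feature map shape
```

Because both attention maps pass through a Sigmoid, they act as multiplicative gates in (0, 1) and can only re-weight, never enlarge, the feature responses.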
5. The crop drought detection method based on remote sensing images as claimed in claim 1, characterized in that the Decoder module performs a four-fold bilinear interpolation upsampling operation on the multi-scale high-level drought feature map extracted by the Encoder module to gradually restore the original image size; a convolution operation is applied to the low-level feature map to compress its channels so as to preserve low-level feature information, and a CBAM module is added to enhance the crop drought region features of the low-level feature map; finally, the multi-scale high-level drought feature map and the low-level feature map are fused, another four-fold bilinear interpolation upsampling is performed, and the final prediction map is decoded, completing the detection of the crop drought region.
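The four-fold bilinear upsampling used by the Decoder can be sketched in NumPy for a single-channel map (a minimal half-pixel-centre sketch; the function name is illustrative, and a real implementation operates per channel):

```python
import numpy as np

def bilinear_upsample(x, scale):
    """Bilinearly upsample a 2-D array by an integer factor (half-pixel centres)."""
    h, w = x.shape
    ys = np.clip((np.arange(h * scale) + 0.5) / scale - 0.5, 0, h - 1)
    xs = np.clip((np.arange(w * scale) + 0.5) / scale - 0.5, 0, w - 1)
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    wy, wx = (ys - y0)[:, None], (xs - x0)[None, :]
    # blend the four nearest neighbours of each output position
    top = x[np.ix_(y0, x0)] * (1 - wx) + x[np.ix_(y0, x1)] * wx
    bot = x[np.ix_(y1, x0)] * (1 - wx) + x[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy

print(bilinear_upsample(np.ones((4, 4)), 4).shape)  # (16, 16)
```

Since bilinear interpolation is a convex combination of neighbouring pixels, every output value stays within the range of the input values.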
6. The crop drought detection method based on remote sensing images as claimed in claim 1, characterized in that in step A, when the network model is trained to determine the final drought detection neural network model, a trained network prediction model is generated after each training batch; IoU is computed between the detected drought region image and the drought label image, the optimal solution is obtained by comparison, and the model with the highest accuracy is saved as the final model.
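The IoU comparison in step A can be sketched as follows (a hypothetical helper, assuming binary drought masks):

```python
import numpy as np

def iou(pred, label):
    """Intersection-over-Union between two binary masks."""
    pred, label = pred.astype(bool), label.astype(bool)
    union = np.logical_or(pred, label).sum()
    return np.logical_and(pred, label).sum() / union if union else 1.0

pred  = np.array([[1, 1], [0, 0]])
label = np.array([[1, 0], [0, 0]])
print(iou(pred, label))  # 0.5 — intersection 1, union 2
```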
CN202211511972.8A 2022-11-29 2022-11-29 Crop drought detection method based on remote sensing image Pending CN115760866A (en)


Publications (1)

Publication Number Publication Date
CN115760866A true CN115760866A (en) 2023-03-07


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117576405A (en) * 2024-01-17 2024-02-20 深圳汇医必达医疗科技有限公司 Tongue picture semantic segmentation method, device, equipment and medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination