CN115019178A - Hyperspectral image classification method based on large kernel convolution attention - Google Patents

Hyperspectral image classification method based on large kernel convolution attention

Info

Publication number
CN115019178A
Authority
CN
China
Prior art keywords
convolution
image
spectrum
spatial
hyperspectral
Prior art date
Legal status
Pending
Application number
CN202210883826.1A
Other languages
Chinese (zh)
Inventor
孙根云 (Sun Genyun)
王凯 (Wang Kai)
陈勇 (Chen Yong)
董震 (Dong Zhen)
Current Assignee
Qingdao Xingke Ruisheng Information Technology Co ltd
Original Assignee
Qingdao Xingke Ruisheng Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Qingdao Xingke Ruisheng Information Technology Co., Ltd.
Priority: CN202210883826.1A
Publication: CN115019178A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06V 20/194 Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/58 Extraction of image or video features relating to hyperspectral data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 40/00 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A 40/10 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of hyperspectral remote sensing imagery, and in particular to a hyperspectral image classification method based on large-kernel convolution attention. The method comprises the steps of model parameter setting, hyperspectral image preprocessing and data-set construction, model pre-training, joint mining and processing of spatial-spectral features, and classification based on the joint spatial-spectral features.

Description

Hyperspectral image classification method based on large kernel convolution attention
Technical Field
The invention relates to the technical field of hyperspectral remote sensing imagery, and in particular to a hyperspectral image classification method based on large-kernel convolution attention.
Background
With the development of sensor technology, hyperspectral remote sensing images play an important role in fields such as environmental monitoring, land and resources survey and evaluation, and urban planning. In these applications, the classification accuracy of the remote sensing image directly determines how effectively the hyperspectral data can be used. Although the rich spectral features of hyperspectral images describe ground-cover information more accurately, land cover is complex and the phenomenon of spectral heterogeneity is severe, so high-accuracy image classification is difficult to achieve using spectral features alone.
The spatial features of hyperspectral images can strongly influence classification, so mining high-quality spatial and spectral features from hyperspectral images to achieve high-accuracy classification has become a key problem in hyperspectral image applications.
MAP, 2D-SSA and superpixel segmentation are common hyperspectral spatial feature extraction methods and have achieved good results in hyperspectral image classification. Because hyperspectral spatial and spectral features complement each other, algorithms for joint spatial-spectral processing have developed continuously. Algorithms that combine spatial and spectral information mainly take the neighbouring-pixel information extracted by spatial operators such as Gabor filters and Markov Random Fields (MRFs) as additional feature channels and then classify with a spectral feature processing method. This mode of handling spectral and spatial information separately exploits hyperspectral spatial information but does not conform to the integrated spatial-spectral nature of hyperspectral data. Later, to solve this problem, researchers proposed 3D hyperspectral processing methods such as 3D scattering wavelets and 3D Gabor filters, effectively improving the mining of spatial-spectral information. However, these conventional methods extract only shallow features, and it is difficult for them to mine and exploit the deep features inherent in hyperspectral data.
Deep Convolutional Neural Networks (CNNs) can exploit their local connectivity to elegantly integrate spectral features with spatial context from HSI data while mining deep hyperspectral features. Many researchers use CNNs to mine the spatial-spectral information of HSIs; the networks fall into two structures, double-branch and single-branch. Double-branch networks improve the efficiency of spatial and spectral feature mining through separate sub-networks, but this does not conform to the unified spatial-spectral nature of hyperspectral data, so they inherently ignore the spatial-spectral correlation information of the HSI. Single-branch networks take a hyperspectral image patch as input and can effectively mine spatial-spectral correlation information. However, whether single- or double-branch, these networks treat all input features equally, making it difficult to extract the key spatial and spectral features effectively.
Inspired by the human visual system, researchers in computer vision developed the attention mechanism. By weighting features, it prompts the network to focus on key information and suppress noise. Because of this property, the attention mechanism has also gained wide adoption in the hyperspectral field. However, existing attention networks for hyperspectral data are merely different combinations of spectral attention and spatial attention, and still do not conform to the unified spatial-spectral nature of hyperspectral images.
Therefore, a hyperspectral image classification method based on large-kernel convolution attention is needed: one that effectively exploits hyperspectral spatial-spectral correlation features and fills this research gap. By constructing a spatial-spectral weight map through large-kernel convolution attention and weighting the three-dimensional hyperspectral image patch, the integrated spatial-spectral structure of the image is effectively protected, and high-quality spatial-spectral correlation features are finally obtained to yield a high-accuracy hyperspectral classification result.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a hyperspectral image classification method based on large-kernel convolution attention. It effectively exploits hyperspectral spatial-spectral correlation features and fills this research gap: a spatial-spectral weight map is constructed through large-kernel convolution attention and used to weight the three-dimensional hyperspectral image patch, which effectively protects the integrated spatial-spectral structure of the image and finally yields high-quality spatial-spectral correlation features and a high-accuracy hyperspectral classification result.
In order to achieve the above object, the present invention provides a hyperspectral image classification method based on large kernel convolution attention, which comprises the following steps:
S1, setting the model parameters:
S1-1: select the original hyperspectral image (HSI) to be classified, determine its total number of bands C and total number of classes S, and determine the image patch size img_size and the number of training samples;
S1-2: set the size of the image input to the spatial-spectral attention feature extractor according to the data determined in S1-1; determine the number D of VAN processing modules, the number of times n_i (1 ≤ i ≤ D) that each VAN module applies Large Kernel Attention (LKA) and the Convolutional Feed-Forward block (CFF), and the number of image patch channels c_i processed by each module;
S1-3: determine the network learning rate, the number of optimization iterations, and the model optimizer (Adam);
S1-4: set the number of elements of the one-dimensional vector output by the feature classifier according to the total number of classes determined in S1-1;
S2, hyperspectral image preprocessing and data-set construction:
S2-1: normalize the hyperspectral image, scaling the image radiance values to the range 0-1;
S2-2: randomly select training sample points according to the training sample proportion set in S1-1, then cut out an image patch of size img_size centred on each sample point; the remaining samples serve as test samples for evaluating model accuracy;
S2-3: to improve model robustness, expand the samples by mirroring, rotation, and adding salt-and-pepper noise;
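The preprocessing and augmentation steps S2-1 to S2-3 can be sketched in NumPy as follows. This is a minimal illustration, not the patent's implementation: the function names are this sketch's own, and border handling for patches near the image edge is omitted.

```python
import numpy as np

def normalize(cube):
    # S2-1: min-max normalize a hyperspectral cube (H, W, C) to [0, 1]
    lo, hi = cube.min(), cube.max()
    return (cube - lo) / (hi - lo)

def extract_patch(cube, row, col, img_size):
    # S2-2: cut an img_size x img_size patch centred on a sample pixel.
    # Assumes the centre is far enough from the border (padding omitted).
    r = img_size // 2
    return cube[row - r:row + r + 1, col - r:col + r + 1, :]

def salt_and_pepper(patch, amount=0.05, seed=0):
    # S2-3: augmentation by forcing a random fraction of pixels
    # to 0 (pepper) or 1 (salt) across all bands
    rng = np.random.default_rng(seed)
    noisy = patch.copy()
    h, w, _ = noisy.shape
    n = int(amount * h * w)
    rows = rng.integers(0, h, n)
    cols = rng.integers(0, w, n)
    noisy[rows, cols, :] = rng.integers(0, 2, size=(n, 1)).astype(noisy.dtype)
    return noisy
```

Mirroring and rotation, the other two augmentations named in S2-3, correspond to `np.flip` and `np.rot90` applied to the spatial axes of the patch.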
S3, model pre-training:
S3-1: construct the network model based on the parameters set in S1-2;
S3-2: train the model with the parameters determined in S1-3 and, after training, save the model parameters corresponding to the best result as the model pre-training weights;
S4, jointly mining and processing the spatial-spectral features:
S4-1: construct the network model based on the parameters set in S1-2 and load the model pre-training weights;
S4-2: process the image patch with a 2D convolution that keeps the spatial size of the hyperspectral patch unchanged and changes the number of channels to c_i;
S4-3: jointly mine the spatial and spectral information of the image patch using LKA;
S4-4: fuse the spatial- and spectral-dimension information of the image patch using CFF;
S4-5: return to S4-3 until the number of iterations equals n_i;
S4-6: normalize the image data with layer normalization to suppress overfitting caused by the small data scale after processing;
S4-7: return to S4-2 until the number of iterations equals the number D of VAN processing modules, then output the joint spatial-spectral features;
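The layer normalization of S4-6 can be sketched in NumPy as below. This is an illustrative sketch under the usual definition of layer normalization; the learnable scale and shift parameters a trained network would carry are omitted.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # S4-6: normalize each spatial position of a (H, W, C) feature map
    # to zero mean and unit variance across its channels
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)
```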
S5, classification based on the joint spatial-spectral features:
S5-1: compress the spatial-spectral correlation features with global average pooling and output a one-dimensional vector T;
S5-2: transform the size of T with an MLP to obtain a one-dimensional vector class; the index of the largest element of class is the land-cover type of the centre pixel of the image patch.
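The classifier head of S5 can be sketched as follows. For brevity this assumed sketch reduces the MLP of S5-2 to a single linear layer; the pooling, projection, and argmax steps match the text above.

```python
import numpy as np

def classify(features, weight, bias):
    # S5-1: global average pooling compresses (H, W, C) features
    # into a one-dimensional C-vector T
    t = features.mean(axis=(0, 1))
    # S5-2: project T to S class scores (weight has shape (S, C));
    # a full MLP would stack several such layers with nonlinearities
    scores = weight @ t + bias
    # the index of the largest score is the predicted land-cover
    # class of the patch's centre pixel
    return int(np.argmax(scores)), scores
```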
S4-3 specifically comprises:
S4-3-1: denote the feature map obtained in step S4-2 as A ∈ R^(w×w×c_i), where w×w is the spatial size of the feature map and c_i its number of channels;
S4-3-2: preprocess the feature map with a 1×1 convolution and the GELU activation function to obtain feature map B; the process can be expressed as B = GELU(Conv_1×1(A));
S4-3-3: feed B into the LKA to complete the spatial-spectral attention operation; the large-kernel convolution is realized step by step with a depthwise convolution, a depthwise dilated convolution, and a pointwise (1×1) convolution, yielding the three-dimensional spatial-spectral weight map W_spe-spa = Conv_1×1(DW_d(DW(B))); each weight in W_spe-spa is independent of, yet related to, all pixels within the large-kernel receptive field;
S4-3-4: multiply B and W_spe-spa element-wise to obtain the weighted feature map C; the process can be expressed as C = B ⊗ W_spe-spa, where ⊗ denotes element-wise multiplication, which reinforces the spectral blocks that provide more information while suppressing unnecessary ones;
S4-3-5: to mine the spectral and spatial information further, apply a 1×1 convolution to complete the weighted fusion of the spatial and spectral information;
S4-3-6: apply a residual connection to obtain the LKA output D: D = A + Conv_1×1(C).
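Steps S4-3-1 to S4-3-6 can be sketched in NumPy as below. The kernel sizes are assumptions (a 5×5 depthwise kernel followed by a 7×7 depthwise kernel with dilation 3, the common VAN-style decomposition of a large kernel); the patent specifies the decomposition D = A + Conv_1×1(B ⊗ Conv_1×1(DW_d(DW(B)))) but not the sizes.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def conv1x1(x, w):
    # pointwise convolution: (H, W, C_in) x (C_in, C_out) -> (H, W, C_out)
    return np.tensordot(x, w, axes=([2], [0]))

def depthwise_conv(x, k, dilation=1):
    # per-channel 2D convolution with 'same' zero padding;
    # x: (H, W, C), k: (kh, kw, C) -- one kernel per channel
    kh, kw, _ = k.shape
    ph, pw = dilation * (kh // 2), dilation * (kw // 2)
    xp = np.pad(x, ((ph, ph), (pw, pw), (0, 0)))
    h, w, _ = x.shape
    out = np.zeros_like(x)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * xp[i * dilation:i * dilation + h,
                                j * dilation:j * dilation + w, :]
    return out

def lka_block(a, w_pre, k_dw, k_dwd, w_att, w_out, dilation=3):
    # S4-3-2: B = GELU(Conv_1x1(A))
    b = gelu(conv1x1(a, w_pre))
    # S4-3-3: W_spe-spa = Conv_1x1(DW_d(DW(B)))
    w_spe_spa = conv1x1(depthwise_conv(depthwise_conv(b, k_dw),
                                       k_dwd, dilation), w_att)
    # S4-3-4: element-wise spatial-spectral weighting, C = B (x) W_spe-spa
    c = b * w_spe_spa
    # S4-3-5 / S4-3-6: 1x1 fusion plus residual connection, D = A + Conv_1x1(C)
    return a + conv1x1(c, w_out)
```

In a trained network the weight tensors would be learned; here they are just function arguments so the data flow of the formulas can be checked.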
S4-4 specifically comprises:
Unlike the attention-weighted spatial-spectral feature enhancement of LKA, CFF adaptively mines salient spatial and spectral features with a combination of convolutions:
S4-4-1: expand the channels of feature map D with a 1×1 convolution;
S4-4-2: enhance the channel features with a depthwise convolution and process the introduced nonlinearity with the GELU activation function;
S4-4-3: finally, restore the channel count and fuse the spatial and spectral features with a 1×1 convolution; the process can be expressed as E = Conv_1×1(GELU(DW(Conv_1×1(D)))), where E ∈ R^(w×w×c_i), Conv_1×1(·) is a 2D convolution with kernel size 1, and DW(·) is a depthwise convolution with kernel size 3×3, padding 1, and c_i groups.
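The CFF formula E = Conv_1×1(GELU(DW(Conv_1×1(D)))) can be sketched in NumPy as below (the helper functions are repeated so the sketch is self-contained; the expansion ratio is an assumption, as the patent does not state it):

```python
import numpy as np

def gelu(x):
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def conv1x1(x, w):
    # (H, W, C_in) x (C_in, C_out) -> (H, W, C_out)
    return np.tensordot(x, w, axes=([2], [0]))

def depthwise3x3(x, k):
    # per-channel 3x3 convolution with padding 1 (groups = channel count),
    # matching DW(.) as defined in S4-4-3; k: (3, 3, C)
    h, w, _ = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * xp[i:i + h, j:j + w, :]
    return out

def cff_block(d, w_expand, k_dw, w_reduce):
    x = conv1x1(d, w_expand)          # S4-4-1: channel expansion
    x = gelu(depthwise3x3(x, k_dw))   # S4-4-2: depthwise 3x3 + GELU
    return conv1x1(x, w_reduce)       # S4-4-3: restore channels, fuse features
```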
Compared with the prior art, the invention uses a large-kernel-convolution attention network that conforms to the unified spatial-spectral nature of hyperspectral images: the complete hyperspectral image patch is used as input, spatial and spectral features are jointly mined and extracted, and a simple yet effective feature classifier fully exploits the extracted features. This solves problems such as model overfitting and low classification accuracy caused by insufficient mining of hyperspectral remote sensing image features.
Drawings
Fig. 1 is a diagram of the overall structure of the invention.
Fig. 2 is a schematic diagram of a flow of jointly mining HSI spatial and spectral information by LKA according to the present invention.
FIG. 3 is a schematic diagram of CFF fusion and mining of HSI spatial and spectral information in accordance with the present invention.
FIG. 4 is a diagram illustrating the number of training sets and validation sets for each category according to an embodiment of the present invention.
FIG. 5 is a diagram comparing the classification performance of an embodiment of the present invention with mainstream algorithms.
Detailed Description
The invention will now be further described with reference to the accompanying drawings.
As shown in figs. 1 to 5, the invention provides a hyperspectral image classification method based on large-kernel convolution attention that follows steps S1 to S5 exactly as set out in the disclosure above; the concrete embodiment below illustrates the parameter values used.
Embodiment:
the method is suitable for processing the hyperspectral remote sensing images of Indian Pine areas acquired by an AVIRIS (aircraft Visible Imaging spectrometer) sensor. Raw HSI data the database contains 145 x 145 pixels with a resolution of 20m per pixel and a wavelength range of 400-2500nm, containing 220 spectral bands. After removing 20 noise and water absorption bands from the entire data set, the spectral bands were reduced to 200, which became a 145 × 145 × 200 data cube. The surface feature types include 16 identifiable surface features (mainly different crops). The data set contains 10366 tags in total, with the remaining unlabeled portions being considered as background.
As shown in fig. 1, 2 and 3, the present invention provides a hyperspectral image classification method based on large kernel convolution attention, which includes the following steps:
step 1, setting model parameters:
1) Select the original hyperspectral image (HSI) to be classified; determine the total number of bands C = 200 and the total number of classes S = 16; set the image patch size img_size to 13 × 13; the numbers of training and test samples are shown in fig. 4.
2) Set the size of the image input to the spatial-spectral attention feature extractor according to the data determined in 1); set the number of VAN processing modules D = 4; set the number of LKA/CFF repetitions in each VAN module to
n_1 = 3, n_2 = 4, n_3 = 6, n_4 = 3,
and the number of image patch channels processed by each module to
c_1 = 64, c_2 = 128, c_3 = 256, c_4 = 512.
3) Set the network learning rate to 0.005, the number of optimization iterations to 50, and the model optimizer to Adam;
4) Set the number S of elements of the one-dimensional vector output by the feature classifier to 16, according to the number of classes determined in 1).
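The parameter values of this step can be collected into a single configuration, sketched below; the key names are this sketch's own, not terms from the patent.

```python
# Embodiment hyperparameters from Step 1 (key names are illustrative)
config = {
    "bands_C": 200,
    "classes_S": 16,
    "img_size": 13,
    "van_modules_D": 4,
    "lka_cff_repeats": [3, 4, 6, 3],   # n_1 .. n_4
    "channels": [64, 128, 256, 512],   # c_1 .. c_4
    "learning_rate": 0.005,
    "iterations": 50,
    "optimizer": "Adam",
}

# the per-module lists must match the module count D
assert len(config["lka_cff_repeats"]) == config["van_modules_D"]
assert len(config["channels"]) == config["van_modules_D"]
```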
Step 2, hyperspectral image preprocessing and data-set construction:
1) Normalize the hyperspectral image, scaling the image radiance values to 0-1;
2) Randomly select training sample points according to the training sample proportion set in step 1, then cut out an image patch of size img_size centred on each sample point; the remaining samples serve as test samples for evaluating model accuracy;
3) To improve model robustness, expand the samples by mirroring, rotation, adding salt-and-pepper noise, and similar methods.
Step 3, model pre-training:
1) Construct the network model based on the parameters set in step 1, item 2);
2) Train the model with the parameters determined in step 1, item 3), and after training save the model parameters corresponding to the best result as the model pre-training weights.
Step 4, jointly mining and processing spatial spectral characteristics:
1) Construct the network model based on the parameters set in step 1, item 2), and load the model pre-training weights;
2) Process the image patch with a 2D convolution that keeps the spatial size of the hyperspectral patch unchanged and changes the number of channels to c_i, obtaining a feature map A ∈ R^(w×w×c_i), where w × w = 13 × 13 is the spatial size of the feature map and c_i its number of channels.
3) Jointly mine the spatial and spectral information of the image patch with LKA, as follows:
First, preprocess the feature map with a 1×1 convolution and the GELU activation function to obtain feature map B:
B = GELU(Conv_1×1(A))
B is then fed into the LKA to complete the spatial-spectral attention. Because the large convolution kernel has many parameters, the large-kernel convolution is realized step by step with a depthwise convolution, a depthwise dilated convolution, and a pointwise (1×1) convolution, yielding the three-dimensional spatial-spectral weight map:
W_spe-spa = Conv_1×1(DW_d(DW(B)))
Each weight in W_spe-spa is independent of, but related to, all pixels in the large-kernel receptive field. B and W_spe-spa are then multiplied element-wise to obtain the weighted feature map C:
C = B ⊗ W_spe-spa
where ⊗ denotes element-wise multiplication; C focuses on spectral blocks that provide more information while suppressing unnecessary ones. To mine the spectral and spatial information further, a 1×1 convolution completes the weighted fusion of the spatial and spectral information. Finally, a residual connection gives the LKA output D:
D = A + Conv_1×1(C)
4) Fuse the spatial- and spectral-dimension information of the image patch with CFF, as follows:
First expand the channels of feature map D with a 1×1 convolution, then enhance the channel features with a depthwise convolution and process the introduced nonlinearity with the GELU activation function, and finally restore the channel count and fuse the spatial and spectral features with a 1×1 convolution:
E = Conv_1×1(GELU(DW(Conv_1×1(D))))
where E ∈ R^(w×w×c_i), Conv_1×1(·) is a 2D convolution with kernel size 1, and DW(·) is a depthwise convolution with kernel size 3×3, padding 1, and c_i groups.
5) Return to step 3) until the number of iterations equals n_i;
6) Normalize the image data with layer normalization to suppress overfitting caused by the small data scale after processing;
7) Return to step 2) until the number of iterations equals the number D of VAN processing modules, and output the joint spatial-spectral features. At this point the features are considered to characterize the deep spatial-spectral properties and to support excellent classification results.
And 5, classification based on the spatial spectrum joint characteristics:
1) performing information compression on the space spectrum correlation characteristics by utilizing global average pooling, and outputting a one-dimensional vector T;
2) transforming the size of T with an MLP to obtain a one-dimensional vector class; the index of the largest element of class is the ground-object category of the central pixel of the image patch.
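The classification head above can be sketched as follows; the single ReLU hidden layer is an assumption, since the text specifies only an MLP:

```python
import numpy as np

def classify(feat, w1, b1, w2, b2):
    # global average pooling compresses the (h, w, c) joint features to a vector T
    t = feat.mean(axis=(0, 1))
    # a small MLP head maps T to class scores (hidden-layer ReLU is an assumption)
    hidden = np.maximum(t @ w1 + b1, 0.0)
    logits = hidden @ w2 + b2
    # the index of the largest element is the predicted ground-object class
    return int(np.argmax(logits))
```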
To verify its performance, the proposed method is quantitatively compared with mainstream hyperspectral classification algorithms. Fig. 5 reports the per-class accuracy, overall accuracy (OA), average accuracy (AA), and kappa coefficient of the proposed method and four hyperspectral algorithms: 3DCNN, SSRN (Spectral-Spatial Residual Network), DBMA (Double-Branch Multi-Attention Mechanism Network), and DBDA (Double-Branch Dual-Attention Mechanism Network). The proposed classification algorithm performs best.
The above is only a preferred embodiment of the present invention, intended to help in understanding the method and its core idea; the protection scope of the invention is not limited to the above embodiment, and all technical solutions embodying the idea of the invention fall within its protection scope. It should be noted that modifications and embellishments made by those skilled in the art without departing from the principle of the invention are also considered to be within its protection scope.
Overall, the invention addresses the model overfitting and low classification accuracy caused in the prior art by insufficient mining of hyperspectral remote sensing image features and by the mismatch between existing attention networks and the joint spatial-spectral structure of hyperspectral images. It constructs a spatial-spectral weight map through large-kernel convolution attention and weights the three-dimensional hyperspectral image block accordingly, thereby preserving the joint spatial-spectral structure of the image, effectively exploiting the hyperspectral spatial-spectral correlation, and finally obtaining high-quality spatial-spectral joint features and a high-accuracy hyperspectral classification result.

Claims (3)

1. A hyperspectral image classification method based on large kernel convolution attention is characterized by comprising the following steps:
s1, setting model parameters:
S1-1: selecting the original hyperspectral images (HSIs) to be classified, determining the total number of bands C and the total number of classes S of the HSIs, and determining the image patch size img_size and the number of training samples;
S1-2: setting the size of the image input to the spectral-spatial attention feature extractor according to the data determined in S1-1, determining the number D of VAN processing modules, the number n_i of times each VAN processing module processes the image with Large Kernel Attention (LKA) and the convolutional feed-forward (CFF) module, where 1 ≤ i ≤ D, and the number c_i of image-block channels processed by each processing module;
S1-3: determining a network learning rate, an optimization iteration number and a model optimizer Adam;
s1-4: setting the number of one-dimensional vector elements output by the feature classifier according to the determined category total number in the S1-1;
s2, hyperspectral image preprocessing and data set production:
s2-1: normalizing the hyperspectral image, and adjusting the image radiation value to 0-1;
S2-2: randomly selecting training sample points based on the training-sample proportion set in S1-1, then segmenting the image into patches centered on the sample points with img_size as the window size, and taking the remaining samples as test samples for evaluating model accuracy;
S2-3: in order to improve model stability, expanding the samples by mirroring, rotation, and the addition of salt-and-pepper noise;
s3, model pre-training:
s3-1: constructing a network model based on the parameters set by the S1-2;
s3-2: performing model training based on the parameters determined in the step S1-3, and storing model parameters corresponding to the optimal results as model pre-training weights after the training is completed;
s4, jointly mining and processing spatial spectral features:
s4-1: constructing a network model based on the parameters set in the S1-2, and loading model pre-training weights;
S4-2: processing the image patch with a 2D convolution that keeps the spatial size of the hyperspectral patch unchanged and changes the number of channels to c_i;
S4-3: jointly mining the image patch space and spectrum information by using LKA;
S4-4: fusing the spatial- and spectral-dimension information of the image patch using the CFF;
S4-5: returning to S4-3 until the number of iterations equals n_i;
S4-6: normalizing the data with layer normalization to suppress the overfitting caused by the small scale of the processed image data;
S4-7: returning to S4-2 until the number of iterations equals the number D of VAN processing modules, and outputting the spatial-spectral joint features;
s5, classification based on space spectrum joint features:
S5-1: compressing the spatial-spectral joint features with global average pooling and outputting a one-dimensional vector T;
S5-2: transforming the size of T with an MLP to obtain a one-dimensional vector class; the index of the largest element of class is the ground-object category of the central pixel of the image patch.
2. The hyperspectral image classification method based on large-kernel convolution attention according to claim 1, wherein the S4-3 is specifically:
S4-3-1: denoting the feature map obtained in S4-2 as A ∈ ℝ^(w×w×c_i), where w × w represents the spatial size of the feature map and c_i the number of channels;
S4-3-2: preprocessing the feature map with a 1 × 1 convolution and the GELU activation function to obtain a feature map B, which can be expressed as: B = GELU(Conv1×1(A));
S4-3-3: feeding B into the LKA to perform the spatial-spectral attention operation, realizing the large-kernel convolution stepwise with a depthwise convolution, a depthwise dilated convolution, and a pointwise (1 × 1) convolution to obtain a three-dimensional spatial-spectral weight map W_spe-spa, which can be represented as: W_spe-spa = Conv1×1(DW_d(DW(B))); each weight in W_spe-spa exists independently and is related to all pixels within the receptive field of the large convolution kernel;
S4-3-4: multiplying B and W_spe-spa element-wise to obtain the weighted feature map C, which can be expressed as: C = B ⊗ W_spe-spa, where ⊗ denotes element-wise multiplication; this reinforces spectral blocks that provide more information while suppressing unnecessary ones;
S4-3-5: in order to further mine the spectral and spatial information, performing a 1 × 1 convolution to complete the weighted fusion of the spatial and spectral information;
S4-3-6: performing a residual connection to obtain the LKA output D: D = A + Conv1×1(C).
3. The hyperspectral image classification method based on large-kernel convolution attention according to claim 1, wherein the S4-4 is specifically:
unlike the attention-weighted spatial-spectral feature enhancement of the LKA, the CFF adaptively mines salient spatial and spectral features through a combination of convolutions;
s4-4-1: performing channel expansion on the feature map D by using 1 × 1 convolution;
S4-4-2: enhancing the per-channel features with a depthwise convolution and processing the introduced nonlinear features with a GELU activation function;
S4-4-3: finally, restoring the channel count and fusing the spatial and spectral features with a 1 × 1 convolution, which can be expressed as:
E = Conv1×1(GELU(DW(Conv1×1(D))))
where E ∈ ℝ^(w×w×c_i), Conv1×1(·) is a 2D convolution with kernel size 1, and DW(·) is a depthwise convolution with kernel size 3 × 3, padding 1, and c_i groups.
CN202210883826.1A 2022-07-26 2022-07-26 Hyperspectral image classification method based on large kernel convolution attention Pending CN115019178A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210883826.1A CN115019178A (en) 2022-07-26 2022-07-26 Hyperspectral image classification method based on large kernel convolution attention


Publications (1)

Publication Number Publication Date
CN115019178A true CN115019178A (en) 2022-09-06

Family

ID=83081368

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210883826.1A Pending CN115019178A (en) 2022-07-26 2022-07-26 Hyperspectral image classification method based on large kernel convolution attention

Country Status (1)

Country Link
CN (1) CN115019178A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116721243A (en) * 2023-08-11 2023-09-08 自然资源部第一海洋研究所 Deep learning atmosphere correction method and system based on spatial spectrum feature constraint
CN116721243B (en) * 2023-08-11 2023-11-28 自然资源部第一海洋研究所 Deep learning atmosphere correction method and system based on spatial spectrum feature constraint
CN117934473A (en) * 2024-03-22 2024-04-26 成都信息工程大学 Highway tunnel apparent crack detection method based on deep learning
CN117934473B (en) * 2024-03-22 2024-05-28 成都信息工程大学 Highway tunnel apparent crack detection method based on deep learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination