CN116310510A - Hyperspectral image classification method based on small sample deep learning - Google Patents

Info

Publication number
CN116310510A
Authority
CN
China
Prior art keywords
layer
hyperspectral image
feature extraction
small sample
domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310095139.8A
Other languages
Chinese (zh)
Inventor
Gao Dahua (高大化)
Sun Ke (孙科)
Zhang Zhongqiang (张中强)
Liu Danhua (刘丹华)
Niu Yi (牛毅)
Shi Guangming (石光明)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202310095139.8A
Publication of CN116310510A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/58 Extraction of image or video features relating to hyperspectral data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06V 20/194 Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 40/00 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A 40/10 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Remote Sensing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a hyperspectral image classification method based on small-sample deep learning, which comprises the following steps: acquiring a source-domain hyperspectral image set and a target-domain hyperspectral image set; extracting a plurality of first image blocks and a plurality of second image blocks centred on each pixel of the padded source-domain and target-domain hyperspectral images; within each category, randomly selecting some first image blocks to form a source-domain support set and others to form a source-domain query set, and randomly selecting some second image blocks to form a target-domain support set and others to form a target-domain query set; training the small-sample spectral-spatial feature extraction convolutional neural network with the support and query sets to obtain a trained network; and inputting the hyperspectral image to be classified into the trained small-sample spectral-spatial feature extraction convolutional neural network to obtain the classification result. The method improves classification accuracy and extracts the spatial information contained in hyperspectral images at a deeper level.

Description

A hyperspectral image classification method based on small-sample deep learning

Technical Field

The present invention belongs to the technical field of remote sensing information processing, and relates to a hyperspectral image classification method based on small-sample deep learning.

Background Art

With its rich band information, hyperspectral imagery records the continuous spectral characteristics of ground objects, making it possible to recognize more kinds of ground targets and classify them with higher accuracy. Unlike ordinary natural images, hyperspectral image data has a three-dimensional structure in which spectral information is abundant while spatial information is relatively scarce; the key to hyperspectral image classification is to exploit both the spatial features and the inter-spectral features of the image when classifying samples. However, training samples for hyperspectral images are few, so classification methods that require many parameters are prone to overfitting. How to train an efficient classification model from only a small number of samples is therefore crucial for hyperspectral image classification.

Kun Tan et al., in their paper "A novel semi-supervised hyperspectral image classification approach based on spatial neighborhood information and classifier combination" (ISPRS Journal of Photogrammetry and Remote Sensing, 2015), proposed a new semi-supervised HSI classification method that combines spatial neighborhood information with classifier combination to enhance classification capability. Yue Wu et al., in their paper "Semi-supervised hyperspectral image classification via spatial-regulated self-training" (Remote Sensing, 2020), proposed a semi-supervised method that uses self-training to gradually assign highly confident pseudo-labels to unlabeled samples through clustering, and exploits spatial constraints to regulate the self-training process. However, these methods assume that labeled and unlabeled samples come from the same dataset, which means that classification performance is still limited by the number of labeled samples in the data to be classified (i.e., the target domain).

Bing Liu et al., in their paper "Deep few-shot learning for hyperspectral image classification" (IEEE Transactions on Geoscience and Remote Sensing, 2018), proposed a few-shot deep learning method to address the small-sample problem in HSI classification; the method learns a metric space from the training set to better support classification. Kuiliang Gao et al., in their paper "Deep relation network for hyperspectral image few-shot classification" (Remote Sensing, 2020), designed a new deep classification model based on a relation network and trained it with the idea of meta-learning.

Beyond the hyperspectral image classification methods listed above, current methods based on deep convolutional neural networks are broadly similar to them. What these methods have in common is that, during inter-spectral and spatial feature extraction, information is lost because the extracted features are insufficiently utilized, or information becomes redundant because too much irrelevant information is retained; they cannot fully exploit the key information in the hyperspectral image bands or obtain more distinguishable spectral-spatial semantic features. Moreover, they require a large number of hyperspectral samples to train the neural network, so they classify hyperspectral images poorly when trained with few samples, and they do not look more deeply into the differences in information between different spectra. In addition, considering that hyperspectral datasets are scarce, although few-shot learning has been applied to the hyperspectral classification problem, no suitable method has been proposed to address the domain adaptation problem caused by using different hyperspectral datasets. Finally, the loss functions adopted by the above methods during classification are too simple to meet the requirements of high-accuracy classification.

Summary of the Invention

The purpose of the present invention is to address the deficiencies of the above prior art and propose a hyperspectral image classification method based on small-sample deep learning. By learning prior knowledge from a source-domain hyperspectral dataset, a source-domain dataset with many labeled samples is used to help classify a target-domain hyperspectral dataset with few labeled samples, thereby improving the accuracy of ground-object classification in hyperspectral images under few-sample training. The technical problem to be solved by the present invention is realized through the following technical solutions:

An embodiment of the present invention provides a hyperspectral image classification method based on small-sample deep learning, which comprises:

Step 1. Obtain a source-domain hyperspectral image set and a target-domain hyperspectral image set, the source-domain hyperspectral image set comprising multiple source-domain hyperspectral images and the target-domain hyperspectral image set comprising multiple target-domain hyperspectral images;

Step 2. Pad the edge portions of the source-domain hyperspectral images and the target-domain hyperspectral images respectively, and extract several first image blocks and several second image blocks centred on each pixel of the padded source-domain and target-domain hyperspectral images;

Step 3. Within each category, randomly select some of the first image blocks to form a source-domain support set and some of the first image blocks to form a source-domain query set, and randomly select some of the second image blocks to form a target-domain support set and some of the second image blocks to form a target-domain query set;

Step 4. Based on stochastic gradient descent, alternately train the small-sample spectral-spatial feature extraction convolutional neural network with the source-domain support set, the source-domain query set, the target-domain support set and the target-domain query set to obtain a trained network; the network extracts features from both the spectral and the spatial perspective, and its total loss function is composed of a cross-entropy loss function, a correlation alignment loss function and a maximum mean discrepancy loss function;

Step 5. Input the hyperspectral image to be classified into the trained small-sample spectral-spatial feature extraction convolutional neural network to obtain the classification result.

In an embodiment of the invention, step 2 comprises:

Step 2.1. Pad the borders of the source-domain hyperspectral image and the target-domain hyperspectral image with pixels of value 0 to obtain a padded source-domain hyperspectral image and a padded target-domain hyperspectral image;

Step 2.2. Centred on each pixel of the padded source-domain hyperspectral image and the padded target-domain hyperspectral image, select the first image blocks and the second image blocks with a spatial size of (2t+1)×(2t+1) and d channels, where t is an integer greater than 0.

In an embodiment of the invention, the structure of the small-sample spectral-spatial feature extraction convolutional neural network comprises a spectral branch network, a spatial branch network, a domain attention module, a first concatenation layer, a fully connected layer and a softmax classifier. The spectral branch network and the spatial branch network are connected in parallel and then in series with the domain attention module; the domain attention module, the first concatenation layer, the fully connected layer and the softmax classifier are connected in series. The spectral branch network comprises two 3D deformable convolution blocks and two first max-pooling layers, with the first 3D deformable convolution block, the first first max-pooling layer, the second 3D deformable convolution block and the second first max-pooling layer connected in sequence. The spatial branch network comprises a multi-scale spatial feature extraction module, a second concatenation layer and a second max-pooling layer connected in sequence.

In an embodiment of the invention, the 3D deformable convolution block comprises three 3D deformable convolution layers and three first activation function layers, with the first 3D deformable convolution layer, the first first activation function layer, the second 3D deformable convolution layer, the second first activation function layer, the third 3D deformable convolution layer and the third first activation function layer connected in sequence; the output of the first first activation function layer is added to the output of the third 3D deformable convolution layer to form a residual structure.

In an embodiment of the invention, the multi-scale spatial feature extraction module comprises two scale operation layers, three convolution layers, three normalization layers and three second activation function layers;

the first convolution layer, the first normalization layer and the first second activation function layer are connected in sequence;

the first scale operation layer, the second convolution layer, the second normalization layer and the second second activation function layer are connected in sequence;

the second scale operation layer, the third convolution layer, the third normalization layer and the third second activation function layer are connected in sequence;

the first, second and third second activation function layers are connected in parallel and then in series with the second concatenation layer and the second max-pooling layer.

In an embodiment of the invention, the total loss function of the small-sample spectral-spatial feature extraction convolutional neural network is:

$$L_{total} = L_{fsl} + L_{coral} + L_{MMD}$$

where $L_{total}$ denotes the total loss function of the small-sample spectral-spatial feature extraction convolutional neural network, $L_{fsl}$ the cross-entropy loss, $L_{coral}$ the correlation alignment loss, and $L_{MMD}$ the maximum mean discrepancy loss;

The cross-entropy loss $L_{fsl}$ is expressed as:

$$L_{fsl} = -\sum_{(x_j,\, y_j) \in Q} \log p\left(y = y_j \mid x_j\right)$$

$$p\left(y = l \mid x_j\right) = \frac{\exp\left(-d\left(F_\omega(x_j),\, f_l\right)\right)}{\sum_{l'=1}^{C} \exp\left(-d\left(F_\omega(x_j),\, f_{l'}\right)\right)}$$

where $d(\cdot)$ denotes the Euclidean distance, $F_\omega(\cdot)$ the feature extraction function with parameters $\omega$, $f_l$ the feature of the $l$-th category in the source-domain or target-domain support set, $C$ the number of categories, $x_j$ a sample in the source-domain or target-domain query set, $y_j$ the label of sample $x_j$, and $Q$ the source-domain or target-domain query set;

The correlation alignment loss $L_{coral}$ is expressed as:

$$L_{coral} = \frac{1}{4d^2} \left\lVert C_S - C_T \right\rVert_F^2$$

where $\lVert \cdot \rVert_F^2$ denotes the squared Frobenius norm of a matrix, $C_S$ the covariance matrix of the source-domain features, $C_T$ the covariance matrix of the target-domain features, and $d$ the dimension of the features;

The maximum mean discrepancy loss $L_{MMD}$ is expressed as:

$$L_{MMD} = \left\lVert \frac{1}{n_s} \sum_{x_i^s \in X_s} \phi(x_i^s) - \frac{1}{n_t} \sum_{x_j^t \in X_t} \phi(x_j^t) \right\rVert_{\mathcal{H}}^2$$

where $\lVert \cdot \rVert_{\mathcal{H}}^2$ denotes the spatial distance, $\phi(\cdot)$ the mapping function, $x_i^s$ a source-domain feature, $x_j^t$ a target-domain feature, $X_s$ the source-domain dataset, $X_t$ the target-domain dataset, $n_s$ the number of samples in the source-domain dataset, and $n_t$ the number of samples in the target-domain dataset.

In an embodiment of the invention, step 4 comprises:

Set the initial learning rate of training to α and the number of iterations to T. In odd-numbered iterations, feed the source-domain support set and the source-domain query set into the small-sample spectral-spatial feature extraction convolutional neural network for training, and use the total loss function to compute the loss between the features of the source-domain support set and those of the source-domain query set, so as to update the parameters of the network. In even-numbered iterations, feed the target-domain support set and the target-domain query set into the network for training, and use the total loss function to compute the loss between the features of the target-domain support set and those of the target-domain query set, so as to update the parameters of the network. When the loss of the network no longer decreases and the current number of training rounds is less than the number of iterations T, or when the number of training rounds reaches T, stop training and obtain the trained small-sample spectral-spatial feature extraction convolutional neural network.

In an embodiment of the invention, the updated weight vector $W_{new}$ of the small-sample spectral-spatial feature extraction convolutional neural network is:

$$W_{new} = W - R \cdot \frac{\partial L_{total}}{\partial W}$$

where $L_{total}$ denotes the total loss function of the small-sample spectral-spatial feature extraction convolutional neural network, $W$ the weight vector of the network before updating, and $R$ the learning rate.

Compared with the prior art, the present invention has the following beneficial effects:

Aimed at the rich spectral and spatial feature information of hyperspectral images, the present invention classifies hyperspectral images with a small-sample spectral-spatial feature extraction convolutional neural network that extracts features from both the spectral and the spatial perspective, which improves classification accuracy and extracts the spatial information contained in hyperspectral images at a deeper level.

In the spectral branch, the present invention adopts 3D deformable convolution and a residual structure, so that the convolution kernels can better extract deep information from irregularly shaped hyperspectral images while retaining the information of the shallow network, improving classification accuracy. The spatial branch adopts a multi-scale operation: the input hyperspectral sample is first duplicated twice and the edge pixels are then discarded in turn, yielding three input samples of different spatial resolutions, which allows the spatial information contained in the hyperspectral image to be extracted at a deeper level.

Other aspects and features of the present invention will become apparent from the following detailed description with reference to the accompanying drawings. It should be understood, however, that the drawings are designed for purposes of illustration only and not as a definition of the scope of the invention, for which reference should be made to the appended claims. It should also be understood that, unless otherwise indicated, the drawings are not necessarily drawn to scale and merely attempt to conceptually illustrate the structures and procedures described herein.

Brief Description of the Drawings

Fig. 1 is a schematic flow chart of a hyperspectral image classification method based on small-sample deep learning provided by an embodiment of the present invention;

Fig. 2 is a schematic diagram of the model structure of a small-sample spectral-spatial feature extraction convolutional neural network provided by an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a 3D deformable convolution module in the spectral branch provided by an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a multi-scale spatial feature extraction module provided by an embodiment of the present invention;

Fig. 5 shows simulated classification results of the present invention and two existing networks on the University of Pavia dataset;

Fig. 6 shows simulated classification results of the present invention and two existing networks on the Indian Pines dataset.

Detailed Description

The present invention is described in further detail below in conjunction with specific embodiments, but the embodiments of the present invention are not limited thereto.

Embodiment 1

Current hyperspectral image classification methods based on deep convolutional neural networks all have shortcomings. What they have in common is that, during inter-spectral and spatial feature extraction, information is lost because the extracted features are insufficiently utilized, or information becomes redundant because too much irrelevant information is retained; they cannot fully exploit the key information in the hyperspectral image bands or obtain more distinguishable spectral-spatial semantic features, and they require a large number of hyperspectral samples to train the neural network, so they classify hyperspectral images poorly when trained with few samples and do not look more deeply into the differences in information between different spectra.

On this basis, the present invention proposes a hyperspectral image classification method based on small-sample deep learning. Please refer to Fig. 1, which is a schematic flow chart of the method provided by an embodiment of the present invention; the method may specifically comprise steps 1 through 5, wherein:

Step 1. Obtain a source-domain hyperspectral image set and a target-domain hyperspectral image set; the source-domain set comprises multiple source-domain hyperspectral images and the target-domain set comprises multiple target-domain hyperspectral images.

Specifically, a hyperspectral image is a three-dimensional data cube $S \in \mathbb{R}^{h \times w \times c}$; each band of the hyperspectral image corresponds to a two-dimensional matrix $S_i \in \mathbb{R}^{h \times w}$ in the cube, where $\in$ denotes set membership, $\mathbb{R}$ the real number field, $h$ the height of the hyperspectral image, $w$ its width, $c$ the number of spectral bands, and $i$ the index of a spectral band, $i = 1, 2, \ldots, c$.
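For illustration, this cube structure can be expressed in a few lines of Python (a minimal sketch with hypothetical dimensions; the variable names are not from the patent):

```python
import numpy as np

# Hypothetical cube with h = 610, w = 340, c = 103 bands
h, w, c = 610, 340, 103
S = np.zeros((h, w, c), dtype=np.float32)  # S ∈ R^{h×w×c}

S_1 = S[:, :, 0]           # band i = 1 is a 2-D matrix S_i ∈ R^{h×w}
spectrum = S[100, 200, :]  # the c-dimensional spectrum of one pixel
```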

In this embodiment, the source-domain hyperspectral image set adopts the Chikusei dataset, and the target-domain hyperspectral image set adopts datasets such as University of Pavia (UP).

Step 2. Pad the edge portions of the source-domain and target-domain hyperspectral images respectively, and extract several first image blocks and several second image blocks centred on each pixel of the padded source-domain and target-domain hyperspectral images.

Specifically, this embodiment pads the source-domain and target-domain hyperspectral images so that first and second image blocks can also be obtained at the edges, and the blocks at the edges contain the required information.

Step 2.1. Pad the borders of the source-domain and target-domain hyperspectral images with pixels of value 0 to obtain the padded source-domain hyperspectral image and the padded target-domain hyperspectral image.

Step 2.2. Centred on each pixel of the padded source-domain and target-domain hyperspectral images, select first and second image blocks with a spatial size of (2t+1)×(2t+1) and d channels, where t is an integer greater than 0; for example, with image blocks of size 9×9, t = 4.

Specifically, a first image block is a pixel block extracted around a pixel of the padded source-domain hyperspectral image, and a second image block is a pixel block extracted around a pixel of the padded target-domain hyperspectral image. The number of channels d equals the number of spectral bands of the hyperspectral image.
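A minimal sketch of this pad-and-crop step is given below, assuming the image is stored as a NumPy array of shape (h, w, d); the function name and loop layout are illustrative only:

```python
import numpy as np

def extract_blocks(image, t):
    """Zero-pad the border by t pixels, then cut one
    (2t+1) x (2t+1) x d block centred on every original pixel."""
    h, w, d = image.shape
    padded = np.pad(image, ((t, t), (t, t), (0, 0)), mode="constant")
    blocks = np.empty((h * w, 2 * t + 1, 2 * t + 1, d), dtype=image.dtype)
    k = 0
    for r in range(h):
        for c in range(w):
            blocks[k] = padded[r:r + 2 * t + 1, c:c + 2 * t + 1, :]
            k += 1
    return blocks

# 9x9 blocks as in the embodiment correspond to t = 4
```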

Step 3. Within each category, randomly select some first image blocks to form the source-domain support set and some first image blocks to form the source-domain query set, and randomly select some second image blocks to form the target-domain support set and some second image blocks to form the target-domain query set.

Specifically, the first and second image blocks are assigned, according to the category of their centre pixel, to the set belonging to that category (categories such as water, glass, and so on). From the first image blocks of each category, some are picked to form the source-domain support set and others to form the source-domain query set; from the second image blocks of each category, some are picked to form the target-domain support set and others to form the target-domain query set.

For example, the source-domain hyperspectral image set adopts the Chikusei dataset: 200 first image blocks per category form the source-domain dataset, from which one image block per category is randomly picked to form the source-domain support set and 19 first image blocks per category are randomly picked to form the source-domain query set. The target-domain hyperspectral image set adopts datasets such as UP: five second image blocks per category form the target-domain dataset; after data augmentation (i.e., duplicating the target-domain dataset), one image block per category is randomly picked from the target-domain dataset to form the target-domain support set and 19 image blocks per category are randomly picked to form the target-domain query set.
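This episode construction can be sketched as follows (a hedged sketch: blocks_by_class, the default counts and the helper name are illustrative, not from the patent):

```python
import random

def build_episode(blocks_by_class, n_support=1, n_query=19):
    """Split each class's image blocks at random into a support
    set and a query set, as in the episodes described above."""
    support, query = [], []
    for label, blocks in blocks_by_class.items():
        chosen = random.sample(blocks, n_support + n_query)
        support += [(b, label) for b in chosen[:n_support]]
        query += [(b, label) for b in chosen[n_support:]]
    return support, query
```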

Step 4. Based on stochastic gradient descent, alternately train the small-sample spectral-spatial feature extraction convolutional neural network with the source-domain support and query sets and the target-domain support and query sets to obtain the trained network. The network extracts features from both the spectral and the spatial perspective, and its total loss function is composed of a cross-entropy loss function, a correlation alignment loss function and a maximum mean discrepancy loss function.

Specifically, referring to Fig. 2, the structure of the small-sample spectral-spatial feature extraction convolutional neural network comprises a spectral branch network, a spatial branch network, a domain attention module, a first concatenation layer, a fully connected layer and a softmax classifier. The spectral branch and spatial branch networks are connected in parallel and then in series with the domain attention module; the domain attention module, the first concatenation layer, the fully connected layer and the softmax classifier are connected in sequence. The spectral branch network comprises two 3D deformable convolution blocks and two first max-pooling layers connected in the order: first 3D deformable convolution block, first first max-pooling layer, second 3D deformable convolution block, second first max-pooling layer. The spatial branch network comprises a multi-scale spatial feature extraction module, a second concatenation layer and a second max-pooling layer connected in sequence.

The kernel size of the first first max-pooling layer is set to 2*2*4 with the number of kernels set to 8, and the kernel size of the second first max-pooling layer is set to 2*2*4 with the number of kernels set to 16.

Referring to Fig. 3, the 3D deformable convolution block comprises three 3D deformable convolution layers and three first activation function layers connected in the order: first 3D deformable convolution layer, first first activation function layer, second 3D deformable convolution layer, second first activation function layer, third 3D deformable convolution layer, third first activation function layer. The output of the first first activation function layer is added to the output of the third 3D deformable convolution layer to form a residual structure; a 3D deformable convolution layer is a structure formed by adding a deformable convolution kernel to a 3D convolution.

The kernel size of each 3D deformable convolution layer is set to 3*3*3, and the activation function of each first activation function layer is set to the ReLU activation function, expressed as follows:

ReLU(x) = max(0, x)

where x denotes the input of the activation function.
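The residual structure of the block can be sketched in PyTorch as follows. PyTorch ships no built-in 3-D deformable convolution (torchvision's deform_conv2d is 2-D only), so standard nn.Conv3d layers stand in for the deformable layers here; the sketch only illustrates the conv-ReLU ordering and the residual addition described above:

```python
import torch.nn as nn

class DeformConvBlock3D(nn.Module):
    """Sketch of the 3-D (deformable) convolution block: three 3x3x3
    convolutions interleaved with ReLU; the output of the first
    activation is added to the output of the third convolution."""
    def __init__(self, channels):
        super().__init__()
        # nn.Conv3d stands in for the 3-D deformable convolution layers
        self.conv1 = nn.Conv3d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv3d(channels, channels, 3, padding=1)
        self.conv3 = nn.Conv3d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        shallow = self.relu(self.conv1(x))  # output of the 1st activation
        y = self.relu(self.conv2(shallow))
        y = self.conv3(y)
        return self.relu(y + shallow)       # residual addition
```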

Referring to Fig. 4, the multi-scale spatial feature extraction module comprises two scale operation layers, three convolution layers, three normalization layers and three second activation function layers, wherein:

the first convolution layer, the first normalization layer and the first second activation function layer are connected in sequence;

the first scale operation layer, the second convolution layer, the second normalization layer and the second second activation function layer are connected in sequence;

the second scale operation layer, the third convolution layer, the third normalization layer and the third second activation function layer are connected in sequence;

the first, second and third second activation function layers are connected in parallel and then in series with the second concatenation layer and the second max-pooling layer.

In this multi-scale spatial feature extraction module, the first scale operation layer removes one pixel from each border of the selected image block and the second scale operation layer removes two pixels from each border. The kernel size of the first convolution layer is set to 5*5*4, that of the second convolution layer to 3*3*4 and that of the third convolution layer to 1*1*4, each with 16 kernels; the activation function of each second activation function layer is set to the ReLU activation function;

the outputs of the three second activation function layers are each 16 features of size 5*5*25; after the concatenation operation of the second concatenation layer, 16 features of size 5*5*75 are obtained, which then undergo a pooling operation in the second max-pooling layer, whose kernel is set to 2*2*8 with 16 kernels.
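The three-branch geometry works out as described: a 9x9 input convolved with a 5x5 kernel, a 7x7 crop with a 3x3 kernel and a 5x5 crop with a 1x1 kernel all yield 5x5 outputs. The sketch below uses 2-D convolutions for brevity (the patent's kernels also span the spectral axis), so treat it as a shape illustration only:

```python
import torch
import torch.nn as nn

class MultiScaleSpatial(nn.Module):
    """Shape sketch of the multi-scale module: the input is used at
    full size and with 1- and 2-pixel border crops; 5x5, 3x3 and 1x1
    convolutions bring every branch to the same 5x5 spatial size."""
    def __init__(self, in_ch, out_ch=16):
        super().__init__()
        def branch(k):
            return nn.Sequential(nn.Conv2d(in_ch, out_ch, k),
                                 nn.BatchNorm2d(out_ch), nn.ReLU())
        self.b1, self.b2, self.b3 = branch(5), branch(3), branch(1)

    def forward(self, x):                      # x: (B, C, 9, 9)
        f1 = self.b1(x)                        # 9x9 -> 5x5
        f2 = self.b2(x[:, :, 1:-1, 1:-1])      # crop to 7x7 -> 5x5
        f3 = self.b3(x[:, :, 2:-2, 2:-2])      # crop to 5x5 -> 5x5
        return torch.cat([f1, f2, f3], dim=1)  # concatenation layer
```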

The domain attention module uses 2D convolution layers, comprising one spectral attention module and two spatial attention modules: the spectral attention module is one 2D convolution layer and the two spatial attention modules are two 2D convolution layers, with the three 2D convolution layers connected in sequence. The kernel size of the spectral attention module is set to 9*9 with the number of kernels set to 1; the domain attention module covers both spatial attention and inter-spectral attention.

In this embodiment, the spectral branch and spatial branch networks are connected in parallel and then in series with the domain attention module, the first concatenation layer, the fully connected layer and the softmax classifier to form the small-sample spectral-spatial feature extraction convolutional neural network. The domain attention module is a convolution layer that performs convolution operations to obtain weighting coefficients. The network selects the cross-entropy loss function, the correlation alignment loss function and the maximum mean discrepancy loss function as its loss functions.
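As a rough sketch of this convolutional attention idea (the sigmoid gating and the channel counts are assumptions; the patent only specifies 2-D convolutions and the 9x9 spectral kernel):

```python
import torch
import torch.nn as nn

class DomainAttention(nn.Module):
    """Sketch: 2-D convolutions produce weighting coefficients that
    re-weight the spectral-spatial features (spectral attention via a
    single 9x9 kernel, followed by two spatial attention convolutions)."""
    def __init__(self, channels):
        super().__init__()
        self.spectral = nn.Conv2d(channels, 1, kernel_size=9, padding=4)
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 3, padding=1))

    def forward(self, x):
        w = torch.sigmoid(self.spectral(x))  # per-pixel weighting map
        return self.spatial(x * w)
```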

In this embodiment, the total loss function of the small-sample spectral-spatial feature extraction convolutional neural network is:

$$L_{total} = L_{fsl} + L_{coral} + L_{MMD}$$

where $L_{total}$ denotes the total loss function of the small-sample spectral-spatial feature extraction convolutional neural network, $L_{fsl}$ the cross-entropy loss function, $L_{coral}$ the correlation alignment loss function, and $L_{MMD}$ the maximum mean discrepancy loss function;

The cross-entropy loss function $L_{fsl}$ is expressed as:

$$L_{fsl} = -\sum_{(x_j,\, y_j) \in Q} \log p\left(y = y_j \mid x_j\right)$$

$$p\left(y = l \mid x_j\right) = \frac{\exp\left(-d\left(F_\omega(x_j),\, f_l\right)\right)}{\sum_{l'=1}^{C} \exp\left(-d\left(F_\omega(x_j),\, f_{l'}\right)\right)}$$

where $L_{fsl}$ denotes the loss between the predicted label vector and the true label vector, $d(\cdot)$ the Euclidean distance, $F_\omega(\cdot)$ the feature extraction function with parameters $\omega$, $f_l$ the feature of the $l$-th category in the source-domain or target-domain support set, $C$ the number of categories, $x_j$ a sample in the source-domain or target-domain query set, $y_j$ the label of sample $x_j$, and $Q$ the source-domain or target-domain query set.
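A compact PyTorch rendering of this loss, assuming the class features $f_l$ are row-stacked in a (C, D) tensor (the function and variable names are illustrative):

```python
import torch
import torch.nn.functional as F

def fsl_loss(query_feats, query_labels, prototypes):
    """Cross-entropy over negative Euclidean distances
    d(F_w(x_j), f_l) between query features and class features."""
    dists = torch.cdist(query_feats, prototypes)  # (Nq, C) distances
    log_p = F.log_softmax(-dists, dim=1)
    return F.nll_loss(log_p, query_labels)
```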

The correlation alignment loss $L_{coral}$ is expressed as:

$$L_{coral} = \frac{1}{4d^2} \left\lVert C_S - C_T \right\rVert_F^2$$

where $\lVert \cdot \rVert_F^2$ denotes the squared Frobenius norm of a matrix, $C_S$ the covariance matrix of the source-domain features and $C_T$ the covariance matrix of the target-domain features. The source-domain features are the features of the source-domain support and query sets extracted by the spectral branch and spatial branch networks and then concatenated after the domain attention module and the first concatenation layer; the target-domain features are the features of the target-domain support and query sets extracted by the spectral branch and spatial branch networks and then concatenated after the domain attention module and the first concatenation layer; $d$ denotes the dimension of the features;

The maximum mean discrepancy loss $L_{MMD}$ is expressed as:

$$L_{MMD} = \left\lVert \frac{1}{n_s} \sum_{x_i^s \in X_s} \phi(x_i^s) - \frac{1}{n_t} \sum_{x_j^t \in X_t} \phi(x_j^t) \right\rVert_{\mathcal{H}}^2$$

where $\lVert \cdot \rVert_{\mathcal{H}}^2$ denotes the spatial distance, measured after $\phi(\cdot)$ maps the data into a reproducing kernel Hilbert space (RKHS); $\phi(\cdot)$ denotes the mapping function, $x_i^s$ a source-domain feature, $x_j^t$ a target-domain feature, $X_s$ the source-domain dataset, $X_t$ the target-domain dataset, $n_s$ the number of samples in the source-domain dataset, and $n_t$ the number of samples in the target-domain dataset.

Based on the small-sample spectral-spatial feature extraction convolutional neural network and its total loss function described above, the training method for the network is:

Set the initial learning rate of training to α and the number of iterations to T. In odd-numbered iterations, feed the source-domain support set and the source-domain query set into the small-sample spectral-spatial feature extraction convolutional neural network for training, and use the total loss function to compute the loss between the features of the source-domain support set and those of the source-domain query set, so as to update the parameters of the network. In even-numbered iterations, feed the target-domain support set and the target-domain query set into the network for training, and use the total loss function to compute the loss between the features of the target-domain support set and those of the target-domain query set, so as to update the parameters of the network. When the loss of the network no longer decreases and the current number of training rounds is less than the number of iterations T, or when the number of training rounds reaches T, stop training and obtain the trained small-sample spectral-spatial feature extraction convolutional neural network.

Specifically, in odd-numbered iterations the source-domain support set and source-domain query set from the source-domain dataset are fed into the small-sample spectral-spatial feature extraction convolutional neural network for training, and the loss between the features of the source-domain support set and those of the source-domain query set is computed; in even-numbered iterations the target-domain support set and target-domain query set from the target-domain dataset are fed into the network for training, and the loss between the support-set features and query-set features is likewise computed. Alternating in this way, the source-domain and target-domain datasets are input in turn to train the network, iterating continuously to update the parameters in the network.
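The alternating schedule can be sketched as follows; the episode samplers and the total_loss helper (summing $L_{fsl}$, $L_{coral}$ and $L_{MMD}$ as defined above) are illustrative placeholders:

```python
import torch

def train(net, sample_src_episode, sample_tgt_episode, total_loss, T, alpha):
    """Odd iterations train on a source-domain episode, even iterations
    on a target-domain episode; SGD with learning rate R = alpha."""
    opt = torch.optim.SGD(net.parameters(), lr=alpha)
    for step in range(1, T + 1):
        sample = sample_src_episode if step % 2 == 1 else sample_tgt_episode
        support, query = sample()
        loss = total_loss(net, support, query)  # L_fsl + L_coral + L_MMD
        opt.zero_grad()
        loss.backward()
        opt.step()
```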

The learning rate R for each input hyperspectral image block is set as: R = α.

Performing T weight updates on the small-sample spectral-spatial feature extraction convolutional neural network yields the updated weight vector $W_{new}$:

$$W_{new} = W - R \cdot \frac{\partial L_{total}}{\partial W}$$

where $L_{total}$ denotes the total loss function, $W$ the weight vector of the small-sample spectral-spatial feature extraction convolutional neural network before updating, and $R$ the learning rate.

The next training sample set is then input into the small-sample spectral-spatial feature extraction convolutional neural network and the value of the total loss function is updated so that $L_{total}$ keeps decreasing. If $L_{total}$ no longer decreases while the current number of training rounds is less than the set number of iterations T, training of the network is stopped and the trained small-sample spectral-spatial feature extraction convolutional neural network is obtained; otherwise, when the number of training rounds reaches T, training is stopped and the trained network is obtained.

Step 5. Input the hyperspectral image to be classified into the trained small-sample spectral-spatial feature extraction convolutional neural network to obtain the classification result.

In this embodiment, to test the trained small-sample spectral-spatial feature extraction convolutional neural network, test samples can be input into the trained network to obtain their categories, completing the classification of the hyperspectral image.

First, the spectral branch network constructed by the present invention can extract rich inter-spectral features through its 3D deformable convolution blocks, and attending to and filtering these features through those blocks yields more distinguishable inter-spectral features. This overcomes the prior-art problem that, with fixed convolution kernels, inter-spectral feature extraction either cannot capture more useful information or retains too much irrelevant information and causes redundancy, and it improves the classification accuracy of ground objects in hyperspectral images.

Second, through its multi-scale spatial feature extraction module, the spatial branch network constructed by the present invention enables the small-sample spectral-spatial feature extraction convolutional neural network to attend to spatial features at different scales, overcoming the prior-art shortcoming of extracting the spatial features of hyperspectral image blocks at a single scale. Through the multi-path spatial attention mechanism, these multi-scale spatial features can be attended to and filtered to extract more distinguishable spatial features, overcoming the information loss caused by insufficient utilization of extracted features and the redundancy caused by retaining too much irrelevant information, and improving the classification ability of the convolutional neural network under few-sample training.

Third, the small-sample spectral-spatial feature extraction convolutional neural network of the present invention adopts a domain attention module, mainly to address the problem that hyperspectral images have many spectral bands and therefore excessive redundant information between bands. Spectral attention extracts useful inter-spectral features and spatial attention extracts useful spatial features, so that the neural network pays more attention to the useful parts of the feature information. To mitigate the domain shift caused by training with different hyperspectral datasets, the present invention adopts the cross-entropy loss function, the correlation alignment loss function and the maximum mean discrepancy loss function as the network's loss functions, making the network pay more attention to ground-object categories whose samples are scattered or scarce.

The effects of the present invention are further described below in conjunction with simulation experiments.

仿真实验条件:Simulation experiment conditions:

本发明的仿真实验的硬件平台为:Inter core i7-6700,频率为3.4GHz,NvidiaGeForce RTX3090。本发明的仿真实验的软件使用pytorch。The hardware platform of the emulation experiment of the present invention is: Inter core i7-6700, frequency is 3.4GHz, NvidiaGeForce RTX3090. The software of the simulation experiment of the present invention uses pytorch.

In the simulation experiments, the present invention and two existing methods, RN-FSC and DCFSL, are used to classify the ground-object targets in the University of Pavia and Indian Pines hyperspectral datasets, respectively.

The RN-FSC method refers to the hyperspectral classification method proposed by Kuiliang Gao et al. in "Deep relation network for hyperspectral image few-shot classification" (Remote Sensing, 2020); its feature-learning and relation-learning modules make full use of the spatial-spectral information in hyperspectral images, so that a small number of labeled samples suffices to classify new hyperspectral images accurately.

The DCFSL method refers to the small-sample meta-learning hyperspectral classification method proposed by Rui Li et al. in "Deep cross-domain few-shot learning for hyperspectral image classification" (Remote Sensing, 2021), which uses source-class data to help classify the target classes.

The target-domain datasets used in the present invention are the University of Pavia and Indian Pines hyperspectral datasets, acquired by the ROSIS sensor over the University of Pavia in northern Italy and by the AVIRIS sensor over the Indian Pines test site in Indiana, USA, respectively. Indian Pines is the earliest test dataset for hyperspectral image classification: the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) imaged the Indian Pines site in Indiana, USA in 1992, and a 145×145-pixel region was cropped and annotated for hyperspectral image classification testing. The University of Pavia image is 610×340 pixels with 103 bands and contains 9 classes of ground objects; the class names and sample counts of each class are listed in Table 1.

Table 1. University of Pavia sample classes and counts

[Table 1 appears only as embedded images in the original publication; its contents are not reproduced here.]

The Indian Pines hyperspectral image is 145×145 pixels with 200 bands and contains 16 classes of ground objects; the class names and sample counts of each class are listed in Table 2.

Table 2. Indian Pines sample classes and counts

Class  Land-cover category           Count
1      Alfalfa                       46
2      Corn-notill                   1428
3      Corn-mintill                  830
4      Corn                          237
5      Grass-pasture                 483
6      Grass-trees                   730
7      Grass-pasture-mowed           28
8      Hay-windrowed                 478
9      Oats                          20
10     Soybean-notill                972
11     Soybean-mintill               2455
12     Soybean-clean                 593
13     Wheat                         205
14     Woods                         1265
15     Buildings-Grass-Trees-Drives  386
16     Stone-Steel-Towers            93

The source-domain dataset used in the present invention is the Chikusei dataset, a hyperspectral image of Chikusei, Ibaraki, Japan, acquired with the Hyperspec-VNIR-C imaging spectrometer. The ground sampling distance is 2.5 m and the image is 2517×2335 pixels, with 128 bands covering the spectral range from 363 nm to 1018 nm, and it contains 19 classes in total. The class names and sample counts of each class are listed in Table 3.

[Table 3 appears only as embedded images in the original publication; its contents are not reproduced here.]

To verify the efficiency and good classification performance of the present invention, three evaluation metrics are used: the overall accuracy OA, the average accuracy AA, and the Kappa coefficient.

The overall accuracy OA is the number of correctly classified pixels in the test set divided by the total number of pixels. Its value lies between 0 and 100%, and a larger value indicates a better classification result.

The average accuracy AA is obtained by dividing, for each class, the number of correctly classified pixels in the test set by the total number of pixels of that class, and then averaging the resulting per-class accuracies over all classes. Its value lies between 0 and 100%, and a larger value indicates a better classification result.

The Kappa coefficient is an evaluation metric defined on the confusion matrix. It jointly considers the diagonal and off-diagonal elements of the confusion matrix and therefore reflects the classification performance of an algorithm more objectively. Its value lies between -1 and 1, and a larger value indicates a better classification result.
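For illustration only, a minimal NumPy sketch of how these three metrics can be computed from a confusion matrix is given below; the function name is illustrative, not from the patent:

```python
import numpy as np

def oa_aa_kappa(confusion):
    """Compute OA, AA and the Kappa coefficient from a confusion
    matrix whose rows are true classes and columns are predictions."""
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    oa = np.trace(confusion) / total                        # overall accuracy
    per_class = np.diag(confusion) / confusion.sum(axis=1)  # per-class accuracy
    aa = per_class.mean()                                   # average accuracy
    pe = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / total ** 2
    kappa = (oa - pe) / (1 - pe)                            # chance-corrected
    return oa, aa, kappa
```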

2. Simulation experiment contents and result analysis:

Simulation 1: the present invention and the two prior-art methods are each tested for classification on the University of Pavia hyperspectral dataset; the resulting maps are shown in Figure 5, where:

Figure 5(a) is the classification result of the existing RN-FSC method on the University of Pavia hyperspectral dataset;

Figure 5(b) is the classification result of the existing DCFSL method on the University of Pavia hyperspectral dataset;

Figure 5(c) is the classification result of the method of the present invention on the University of Pavia hyperspectral dataset.

As can be seen from Figure 5(c), the classification map of the present invention on the University of Pavia dataset is noticeably smoother, with clearer edges, than those in Figures 5(a) and 5(b).

Simulation 2: the present invention and the two prior-art methods are each tested on the Indian Pines hyperspectral dataset; the resulting maps are shown in Figure 6, where:

Figure 6(a) is the classification result of the existing RN-FSC method on the Indian Pines hyperspectral dataset;

Figure 6(b) is the classification result of the existing DCFSL method on the Indian Pines hyperspectral dataset;

Figure 6(c) is the classification result of the method of the present invention on the Indian Pines hyperspectral dataset.

The classification accuracies of the present invention and the prior-art methods on the University of Pavia and Indian Pines hyperspectral datasets in the above two simulations are compared in Table 4.

Table 4. Comparison of the classification accuracy of the three networks on the two datasets

[Table 4 appears only as embedded images in the original publication; its contents are not reproduced here.]

As can be seen from Table 4, the method of the present invention achieves higher classification accuracy than the prior-art RN-FSC and DCFSL methods on both the University of Pavia and Indian Pines datasets, showing that the present invention predicts the classes of hyperspectral image samples more accurately.

The above simulation experiments show that the method of the present invention can extract inter-spectral features more fully with the constructed inter-spectral 3D deformable convolution blocks, and can extract spatial features more fully with the constructed multi-scale spatial feature extraction blocks. The spatial features and inter-spectral features are concatenated, more discriminative spectral-spatial features are then obtained through the fully connected layer, and the hyperspectral image classification result is finally obtained through the softmax classifier. The present invention trains the neural network with a total loss function composed of the cross-entropy loss function, the correlation alignment loss function and the maximum mean discrepancy loss function, so that the small-sample spectral-spatial feature extraction convolutional neural network pays more attention to ground-object classes whose samples are scattered or scarce. This solves the problem of the prior art that, during spatial feature extraction, information is lost because the extracted features are under-utilized, or becomes redundant because too much irrelevant information is retained, leading to low classification accuracy with few training samples; the method is therefore a highly practical hyperspectral image classification method for settings with few training samples.
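For illustration only, a minimal sketch of this fusion head (concatenation, fully connected layer, softmax) might look as follows; the feature dimensions and names are assumptions, not taken from the patent:

```python
import torch
import torch.nn as nn

class SpectralSpatialHead(nn.Module):
    """Sketch of the fusion head: splice the spectral-branch and
    spatial-branch feature vectors, pass them through a fully
    connected layer, and classify with softmax."""
    def __init__(self, spec_dim, spat_dim, n_classes):
        super().__init__()
        self.fc = nn.Linear(spec_dim + spat_dim, n_classes)

    def forward(self, f_spec, f_spat):               # each: (B, dim)
        fused = torch.cat([f_spec, f_spat], dim=1)   # splice the two branches
        return torch.softmax(self.fc(fused), dim=1)  # class probabilities
```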

In the description of the invention, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more such features. In the description of the present invention, "plurality" means two or more, unless otherwise explicitly and specifically defined.

In the description of this specification, references to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" mean that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples, and those skilled in the art may join and combine the different embodiments or examples described in this specification. The above content is a further detailed description of the present invention in connection with specific preferred embodiments, and the specific implementation of the present invention shall not be deemed to be limited to these descriptions. For those of ordinary skill in the art to which the present invention belongs, several simple deductions or substitutions may be made without departing from the concept of the present invention, and all of these shall be deemed to fall within the protection scope of the present invention.

Claims (8)

1. A hyperspectral image classification method based on small-sample deep learning, characterized by comprising the following steps:
step 1, acquiring a source-domain hyperspectral image set and a target-domain hyperspectral image set, wherein the source-domain hyperspectral image set comprises a plurality of source-domain hyperspectral images and the target-domain hyperspectral image set comprises a plurality of target-domain hyperspectral images;
step 2, respectively filling the edge parts of the source-domain hyperspectral images and the target-domain hyperspectral images, and extracting a plurality of first image blocks and a plurality of second image blocks by taking each pixel point of the filled source-domain and target-domain hyperspectral images as a center;
step 3, within each category, randomly selecting part of the first image blocks to form a source-domain support set and part of the first image blocks to form a source-domain query set, and randomly selecting part of the second image blocks to form a target-domain support set and part of the second image blocks to form a target-domain query set;
step 4, based on a stochastic gradient descent method, alternately training a small-sample spectral-spatial feature extraction convolutional neural network with the source-domain support set, the source-domain query set, the target-domain support set and the target-domain query set to obtain a trained small-sample spectral-spatial feature extraction convolutional neural network, wherein the small-sample spectral-spatial feature extraction convolutional neural network extracts features from both the spectral and the spatial aspect, and its total loss function consists of a cross-entropy loss function, a correlation alignment loss function and a maximum mean discrepancy loss function;
and step 5, inputting the hyperspectral image to be classified into the trained small-sample spectral-spatial feature extraction convolutional neural network to obtain a classification result.
2. The hyperspectral image classification method based on small-sample deep learning as claimed in claim 1, wherein step 2 comprises:
step 2.1, filling pixels with pixel value 0 around the source-domain hyperspectral image and the target-domain hyperspectral image respectively, to obtain a filled source-domain hyperspectral image and a filled target-domain hyperspectral image;
step 2.2, taking each pixel point of the filled source-domain hyperspectral image and the filled target-domain hyperspectral image as a center, selecting the first image blocks and the second image blocks with spatial size (2t+1)×(2t+1) and channel number d, wherein t is an integer greater than 0.
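For illustration only (not part of the claims), a minimal NumPy sketch of this zero-padding and patch-extraction step might look as follows; the function name is an assumption:

```python
import numpy as np

def extract_patches(image, t):
    """Zero-pad an (H, W, d) hyperspectral image by t on each spatial
    side and return one (2t+1, 2t+1, d) patch centred on every pixel."""
    padded = np.pad(image, ((t, t), (t, t), (0, 0)), mode="constant")
    h, w, _ = image.shape
    return np.stack([padded[i:i + 2 * t + 1, j:j + 2 * t + 1, :]
                     for i in range(h) for j in range(w)])
```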
3. The hyperspectral image classification method based on small-sample deep learning as claimed in claim 1, wherein the small-sample spectral-spatial feature extraction convolutional neural network comprises a spectral branch network, a spatial branch network, a domain attention module, a first splicing layer, a fully connected layer and a softmax classifier; the spectral branch network and the spatial branch network are connected in parallel and are then connected in series, in sequence, with the domain attention module, the first splicing layer, the fully connected layer and the softmax classifier; the spectral branch network comprises 2 3D deformable convolution blocks and 2 first maximum pooling layers, the 1st 3D deformable convolution block, the 1st first maximum pooling layer, the 2nd 3D deformable convolution block and the 2nd first maximum pooling layer being connected in series in sequence; and the spatial branch network comprises a multi-scale spatial feature extraction module, a second splicing layer and a second maximum pooling layer connected in series in sequence.
4. The hyperspectral image classification method based on small-sample deep learning as claimed in claim 3, wherein the 3D deformable convolution block comprises 3 3D deformable convolution layers and 3 first activation function layers; the 1st 3D deformable convolution layer, the 1st first activation function layer, the 2nd 3D deformable convolution layer, the 2nd first activation function layer, the 3rd 3D deformable convolution layer and the 3rd first activation function layer are sequentially connected in series, and the output of the 1st first activation function layer and the output of the 3rd 3D deformable convolution layer are added to form a residual structure.
5. The hyperspectral image classification method based on small-sample deep learning as claimed in claim 3, wherein the multi-scale spatial feature extraction module comprises 2 scale operation layers, 3 convolution layers, 3 normalization layers and 3 second activation function layers;
the 1st convolution layer, the 1st normalization layer and the 1st second activation function layer are sequentially connected in series;
the 1st scale operation layer, the 2nd convolution layer, the 2nd normalization layer and the 2nd second activation function layer are sequentially connected in series;
the 2nd scale operation layer, the 3rd convolution layer, the 3rd normalization layer and the 3rd second activation function layer are sequentially connected in series;
the 1st second activation function layer, the 2nd second activation function layer and the 3rd second activation function layer are connected in parallel and then sequentially connected in series with the second splicing layer and the second maximum pooling layer.
6. The hyperspectral image classification method based on small-sample deep learning as claimed in claim 1, wherein the total loss function of the small-sample spectral-spatial feature extraction convolutional neural network is:
$$L_{total} = L_{fsl} + L_{coral} + L_{MMD}$$

wherein $L_{total}$ represents the total loss function of the small-sample spectral-spatial feature extraction convolutional neural network, $L_{fsl}$ represents the cross-entropy loss, $L_{coral}$ represents the correlation alignment loss, and $L_{MMD}$ represents the maximum mean discrepancy loss;

the cross-entropy loss $L_{fsl}$ is expressed as:

$$L_{fsl} = -\frac{1}{|Q|}\sum_{(x_j,\,y_j)\in Q}\log p\left(y = y_j \mid x_j\right)$$

$$p\left(y = l \mid x_j\right) = \frac{\exp\left(-d\left(F_{\omega}(x_j),\, f_l\right)\right)}{\sum_{l'=1}^{C}\exp\left(-d\left(F_{\omega}(x_j),\, f_{l'}\right)\right)}$$

wherein $d(\cdot)$ represents the Euclidean distance, $F_{\omega}(\cdot)$ represents the feature extraction function with parameter $\omega$, $f_l$ represents the feature of the $l$-th class in the source-domain support set or the target-domain support set, $C$ is the number of classes, $x_j$ represents one sample in the source-domain query set or the target-domain query set, $y_j$ represents the label of sample $x_j$, and $Q$ represents the source-domain query set or the target-domain query set;

the correlation alignment loss $L_{coral}$ is expressed as:

$$L_{coral} = \frac{1}{4d^2}\left\|C_S - C_T\right\|_F^2$$

wherein $\|\cdot\|_F$ denotes the Frobenius norm, $C_S$ represents the covariance matrix of the source-domain features, $C_T$ represents the covariance matrix of the target-domain features, and $d$ represents the dimension of the features;

the maximum mean discrepancy loss $L_{MMD}$ is expressed as:

$$L_{MMD} = \left\|\frac{1}{n_s}\sum_{i=1}^{n_s}\phi\left(x_i^s\right) - \frac{1}{n_t}\sum_{j=1}^{n_t}\phi\left(x_j^t\right)\right\|_{\mathcal{H}}^2$$

wherein $\|\cdot\|_{\mathcal{H}}$ denotes the distance in the reproducing kernel Hilbert space $\mathcal{H}$, $\phi(\cdot)$ represents the mapping function, $x_i^s \in X_s$ represents a source-domain feature, $x_j^t \in X_t$ represents a target-domain feature, $X_s$ represents the source-domain dataset, $X_t$ represents the target-domain dataset, $n_s$ represents the number of samples in the source-domain dataset, and $n_t$ represents the number of samples in the target-domain dataset.
7. The hyperspectral image classification method based on small-sample deep learning as claimed in claim 6, wherein step 4 comprises:
setting the initial learning rate of training as α and the number of iterations as T; at odd-numbered iterations, feeding the source-domain support set and the source-domain query set into the small-sample spectral-spatial feature extraction convolutional neural network for training, and computing the loss value between the features of the source-domain support set and the features of the source-domain query set with the total loss function so as to update the parameters of the small-sample spectral-spatial feature extraction convolutional neural network; at even-numbered iterations, feeding the target-domain support set and the target-domain query set into the small-sample spectral-spatial feature extraction convolutional neural network for training, and computing the loss value between the features of the target-domain support set and the features of the target-domain query set with the total loss function so as to update the parameters of the small-sample spectral-spatial feature extraction convolutional neural network; and stopping the training when the loss value of the small-sample spectral-spatial feature extraction convolutional neural network no longer decreases while the current number of training rounds is smaller than the number of iterations T, or when the number of training rounds reaches the number of iterations T, to obtain the trained small-sample spectral-spatial feature extraction convolutional neural network.
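For illustration only (not part of the claims), a minimal PyTorch sketch of this alternating episodic training scheme might look as follows; `episode_loss`, the episode iterators and the default hyperparameter values are assumptions, and the early-stopping test on a non-decreasing loss is omitted for brevity:

```python
import torch

def alternating_train(model, episode_loss, src_episodes, tgt_episodes,
                      alpha=1e-3, T=10000):
    """Odd iterations train on a source-domain episode, even iterations
    on a target-domain episode; `episode_loss` is assumed to compute the
    total loss from one episode's support and query sets."""
    optimizer = torch.optim.SGD(model.parameters(), lr=alpha)
    for it in range(1, T + 1):
        episode = next(src_episodes) if it % 2 == 1 else next(tgt_episodes)
        loss = episode_loss(model, episode)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```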
8. The hyperspectral image classification method based on small-sample deep learning as claimed in claim 7, wherein the updated weight vector $W_{new}$ of the small-sample spectral-spatial feature extraction convolutional neural network is:

$$W_{new} = W - R\,\frac{\partial L_{total}}{\partial W}$$

wherein $L_{total}$ represents the total loss function of the small-sample spectral-spatial feature extraction convolutional neural network, $W$ represents the weight vector of the small-sample spectral-spatial feature extraction convolutional neural network before updating, and $R$ represents the learning rate.
CN202310095139.8A 2023-02-08 2023-02-08 Hyperspectral image classification method based on small sample deep learning Pending CN116310510A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310095139.8A CN116310510A (en) 2023-02-08 2023-02-08 Hyperspectral image classification method based on small sample deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310095139.8A CN116310510A (en) 2023-02-08 2023-02-08 Hyperspectral image classification method based on small sample deep learning

Publications (1)

Publication Number Publication Date
CN116310510A true CN116310510A (en) 2023-06-23

Family

ID=86826583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310095139.8A Pending CN116310510A (en) 2023-02-08 2023-02-08 Hyperspectral image classification method based on small sample deep learning

Country Status (1)

Country Link
CN (1) CN116310510A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117132804A * 2023-07-04 2023-11-28 China University of Mining and Technology Hyperspectral image classification method based on causal cross-domain small sample learning
CN117132804B * 2023-07-04 2024-04-05 China University of Mining and Technology A hyperspectral image classification method based on causal cross-domain small sample learning
CN116977747A * 2023-08-28 2023-10-31 China University of Geosciences (Beijing) Small sample hyperspectral classification method based on multipath multi-scale feature twin network
CN116977747B * 2023-08-28 2024-01-23 China University of Geosciences (Beijing) Small sample hyperspectral classification method based on multipath multi-scale feature twin network
CN117474928A * 2023-12-28 2024-01-30 Northeastern University Ceramic package substrate surface defect detection method based on meta-learning model
CN117474928B * 2023-12-28 2024-03-19 Northeastern University Ceramic package substrate surface defect detection method based on meta-learning model
CN117809293A * 2024-03-01 2024-04-02 University of Electronic Science and Technology of China Small sample image target counting method based on deep neural network
CN117809293B * 2024-03-01 2024-05-03 University of Electronic Science and Technology of China Small sample image target counting method based on deep neural network
CN118411577A * 2024-07-04 2024-07-30 National University of Defense Technology Cross-domain small sample multispectral image classification method, device and computer equipment
CN118411577B * 2024-07-04 2024-09-24 National University of Defense Technology Cross-domain small sample multispectral image classification method, device and computer equipment

Similar Documents

Publication Publication Date Title
CN116310510A (en) Hyperspectral image classification method based on small sample deep learning
CN112052755B (en) Semantic Convolutional Hyperspectral Image Classification Method Based on Multi-way Attention Mechanism
CN110321963B (en) Hyperspectral image classification method based on fusion of multi-scale and multi-dimensional spatial spectral features
CN107316013B (en) Hyperspectral image classification method based on NSCT (non-subsampled Contourlet transform) and DCNN (data-to-neural network)
CN106203523B (en) The hyperspectral image classification method of the semi-supervised algorithm fusion of decision tree is promoted based on gradient
CN109766858A (en) Three-dimensional convolution neural network hyperspectral image classification method combined with bilateral filtering
CN108280396B (en) Hyperspectral image classification method based on deep multi-feature active transfer network
CN108830296A (en) A kind of improved high score Remote Image Classification based on deep learning
CN112101271A (en) Hyperspectral remote sensing image classification method and device
CN111222545B (en) Image classification method based on linear programming incremental learning
CN113095409A (en) Hyperspectral image classification method based on attention mechanism and weight sharing
CN108154094B (en) Hyperspectral image unsupervised waveband selection method based on subinterval division
CN111639587B (en) Hyperspectral image classification method based on multi-scale spectrum space convolution neural network
CN113936214B (en) Karst wetland vegetation community classification method based on fusion of aerospace remote sensing images
CN110309780A (en) Rapid Supervision and Recognition of House Information in High Resolution Images Based on BFD-IGA-SVM Model
CN109977802A (en) Crops Classification recognition methods under strong background noise
CN110245678A (en) A kind of isomery twinned region selection network and the image matching method based on the network
CN109583469B (en) A K-means hyperspectral image band clustering method based on mutual information
CN105184314B (en) Wrapper formula EO-1 hyperion band selection methods based on pixel cluster
CN104820840B (en) The arest neighbors hyperspectral image classification method recombinated based on dictionary and wave band
CN103984758B (en) According to the remote sensing images Time Series Clustering method that cloud pixel quantity is divided and ruled
CN110363236B (en) Hyperspectral Image Extreme Learning Machine Clustering Method Based on Space Spectrum Joint Hypergraph Embedding
CN109635726B (en) Landslide identification method based on combination of symmetric deep network and multi-scale pooling
CN105718942A (en) Hyperspectral image imbalance classification method based on mean value drifting and oversampling
CN111626267A (en) Hyperspectral remote sensing image classification method using void convolution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination