CN114494762A - A hyperspectral image classification method based on deep transfer network

Info

Publication number: CN114494762A
Application number: CN202111503748.XA
Authority: CN
Inventors: 刘晓敏; 桑顺; 孙兴建; 史珉; 王浩宇
Original assignee: Nantong University
Current assignee: Nantong University
Priority date: 2021-12-10
Filing date: 2021-12-10
Publication date: 2022-05-13
Anticipated expiration: 2041-12-10
Also published as: CN114494762B

Abstract

The invention discloses a hyperspectral image classification method based on a deep transfer network. First-order statistics of the relevant subdomains are aligned using the local maximum mean discrepancy (LMMD), which adapts the local distributions of the two domains. The method extracts deep, discriminative features of the target domain and classifies the unlabeled target-domain samples using only labeled source-domain samples.

Description

A hyperspectral image classification method based on deep transfer network

Technical field

The invention relates to a method for hyperspectral image classification using a deep transfer network, and belongs to the field of pattern recognition.

Background

Hyperspectral images (HSI) carry rich spectral and spatial information and have broad application prospects in agriculture, climate monitoring, national defense and security, and other fields. HSI classification is a task common to these applications: it assigns each pixel in the image to a category according to the spectral and spatial information measured for the ground objects. Researchers have proposed many methods to improve HSI classification accuracy, including random forests, support vector machines, and decision trees. Although these traditional methods use simple models, most of them cannot guarantee classification accuracy.

In recent years, deep learning has achieved excellent performance on many tasks, such as image processing, object detection, and natural language processing. Deep learning learns features automatically, which makes it applicable to a wide range of task scenarios, and its strong nonlinear representation ability allows it to extract deep, discriminative features from data. These advantages have led to its successful application to hyperspectral image classification. However, the expressive power of deep learning usually depends on a large number of labeled training samples. With the development of a new generation of satellite hyperspectral sensors, large numbers of unlabeled HSIs can be acquired quickly, but labeling these images still requires a great deal of expert time. The shortage of labeled samples therefore seriously limits the application of deep learning to hyperspectral classification. To address this problem, many researchers have combined methods such as active learning and data augmentation with deep learning to classify hyperspectral images using a small number of labeled samples.

Although these methods alleviate, to some extent, the shortage of training samples caused by the difficulty of labeling HSIs, they struggle to achieve good results when the training and test data come from similar but different domains, that is, when the training and test distributions are similar but not identical. Transfer learning addresses this problem well: by exploring domain-invariant structure, it transfers knowledge from a labeled domain (the source domain) to a similar domain with a different distribution (the target domain) and thereby performs cross-domain classification. Deep transfer learning models, obtained by combining deep networks with transfer learning, learn deeper and more transferable features and have achieved breakthrough results in hyperspectral classification. This is because a deep transfer learning network can extract discriminative features and invariant factors from the data and group HSI features effectively according to the correlation of the invariant factors.

Summary of the invention

Purpose of the invention: to overcome the deficiencies of the prior art, the present invention proposes a hyperspectral image classification method based on a deep transfer network, which classifies unlabeled target-domain samples using only labeled source-domain samples.

Technical solution: a hyperspectral image classification method based on a deep transfer network, comprising the following steps:

Step 1: reduce the dimensionality of the original hyperspectral image by band selection, removing band redundancy to obtain the dimension-reduced hyperspectral data X₀, which comprises the dimension-reduced source-domain data and target-domain data;

Step 2: train an auxiliary classifier on the labeled source-domain samples, and use the auxiliary classifier to obtain pseudo-labels for the target domain;

Step 3: construct a deep transfer neural network in which a domain adaptation layer adapts the source and target domains with the CORAL loss to reduce the difference in second-order statistics, while LMMD reduces the difference in first-order statistics between the relevant subspaces of the source and target domains;

Step 4: classify the target-domain data using the trained auxiliary classifier.
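Step 2 can be illustrated with a minimal sketch. The text does not say which model serves as the auxiliary classifier, so a nearest-centroid classifier is used here purely as a stand-in; the function names `fit_centroids` and `pseudo_label` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_centroids(X_s, y_s, num_classes):
    """Stand-in auxiliary classifier (assumption): one mean spectrum per class."""
    return np.stack([X_s[y_s == c].mean(axis=0) for c in range(num_classes)])

def pseudo_label(X_t, centroids):
    """Give each target sample the class of its nearest source-domain centroid."""
    dists = np.linalg.norm(X_t[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

# Synthetic two-class example: the target domain is a slightly shifted copy
# of the source distribution.
rng = np.random.default_rng(0)
X_s = np.vstack([rng.normal(0.0, 0.1, (20, 5)), rng.normal(1.0, 0.1, (20, 5))])
y_s = np.repeat([0, 1], 20)
X_t = np.vstack([rng.normal(0.1, 0.1, (10, 5)), rng.normal(1.1, 0.1, (10, 5))])

y_t_pseudo = pseudo_label(X_t, fit_centroids(X_s, y_s, num_classes=2))
```

The pseudo-labels are only as reliable as the auxiliary classifier under domain shift, which is why they are used to weight the subspace alignment rather than being treated as ground truth.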

Further, in step 1, reducing the dimensionality of the original hyperspectral image by band selection, which removes band redundancy to obtain the dimension-reduced hyperspectral data X₀, specifically comprises the following:

Let N_b denote the number of original HSI bands. Bands are selected at two interval values, yielding a and b bands respectively, and the number of bands after dimension reduction is d; the expressions for the intervals use the floor (round-down) operation.

The dimension-reduced hyperspectral data X₀ ∈ R^(n×d) is defined as the input of the model, where n is the number of samples. X₀ comprises the dimension-reduced source-domain data and target-domain data, where n_s is the number of source-domain samples, n_t is the number of target-domain samples, and Y^s denotes the source-domain labels.
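The interval-based band selection can be sketched as below. The two interval values and the way the two band sets are combined are not recoverable from this text, so `interval_a`, `interval_b`, and the union of the two index sets are assumptions made for illustration; the floor division mirrors the round-down operation mentioned above.

```python
import numpy as np

def select_bands(n_bands, interval_a, interval_b):
    """Pick band indices at two regular intervals and merge them (assumed scheme)."""
    a = n_bands // interval_a              # floor division, as in the text
    b = n_bands // interval_b
    idx_a = np.arange(a) * interval_a      # a bands, one every interval_a
    idx_b = np.arange(b) * interval_b      # b bands, one every interval_b
    return np.union1d(idx_a, idx_b)        # sorted, duplicates removed

# Hypothetical cube: 4 x 3 pixels with N_b = 103 bands.
cube = np.random.default_rng(0).random((4, 3, 103))
pixels = cube.reshape(-1, cube.shape[-1])  # n samples x N_b bands

bands = select_bands(pixels.shape[1], interval_a=10, interval_b=25)
X0 = pixels[:, bands]                      # n x d input matrix
```

With these example intervals the two index sets share bands 0 and 50, so d is 12 rather than a + b = 14; a scheme that keeps the sets disjoint would give d = a + b.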

Further, in step 3, a deep transfer neural network is constructed in which a domain adaptation layer adapts the source and target domains with the CORAL method to reduce the difference in second-order statistics, while LMMD reduces the difference in first-order statistics between the relevant subspaces of the source and target domains. This specifically comprises the following:

The deep transfer network (DTN) comprises a fully connected layer, a nonlinear layer, a domain adaptation layer, and a Softmax layer. The dimension-reduced source-domain data and target-domain data are fed into the DTN, and features are extracted by a feature extractor composed of the fully connected layer and the nonlinear layer. The output of the fully connected layer is:

F₁ = I × W₁ + b₁

where I is the input of the fully connected layer, W₁ is its weight matrix, and b₁ is the bias. The output of the fully connected layer is fed to the nonlinear layer, whose input is denoted I^N.

The domain adaptation layer adapts the distribution difference between the two domains, and its output is connected to the Softmax layer. The loss function of the DTN is defined from the three terms described next.

The three terms are the covariance domain adaptation term, the subspace adaptation term, and the source-domain classification loss; α₁ and α₂ are the covariance domain adaptation parameter and the subspace adaptation parameter, respectively.

The covariance domain adaptation term is expressed in terms of d₁, the input dimension of the domain adaptation layer, and C_s and C_t, the covariance matrices of the source-domain and target-domain data.
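A minimal sketch of a covariance alignment term follows. It uses the form commonly given for the CORAL loss in the literature, the squared Frobenius norm of C_s − C_t scaled by 1/(4·d₁²); whether the patent uses exactly this normalization cannot be confirmed from this text, so treat the scaling as an assumption. The name `coral_loss` is hypothetical.

```python
import numpy as np

def coral_loss(H_s, H_t):
    """Squared Frobenius distance between feature covariances (assumed scaling)."""
    d1 = H_s.shape[1]                      # feature dimension d1
    C_s = np.cov(H_s, rowvar=False)        # source covariance, d1 x d1
    C_t = np.cov(H_t, rowvar=False)        # target covariance, d1 x d1
    return np.sum((C_s - C_t) ** 2) / (4.0 * d1 ** 2)

rng = np.random.default_rng(0)
H_s = rng.normal(size=(50, 4))
H_t = 2.0 * rng.normal(size=(60, 4))       # same mean, larger spread

loss_same = coral_loss(H_s, H_s)           # identical features
loss_shift = coral_loss(H_s, H_s + 5.0)    # mean shift only
loss_scale = coral_loss(H_s, H_t)          # different covariance
```

Because covariances are invariant to a constant shift, this term does not react to a difference in means, which is one reason a separate first-order (mean-based) term is paired with it.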

The subspace adaptation term is expressed using the target-domain pseudo-labels obtained from the auxiliary classifier and the class index c ∈ {1, 2, ..., C}. Each source-domain sample and each target-domain sample is assigned a weight indicating its membership of class c, and the term is built from the weighted sums over class c; the weights are computed from the labels (for the source domain) and the pseudo-labels (for the target domain).
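The class-weighted alignment can be sketched as follows. The weight formula (each sample's class indicator divided by the class total) is the form usually given for LMMD in the literature and is assumed here; a linear kernel is also assumed, so that the term reduces to the squared distance between class-weighted feature means, which matches the first-order statistics described in the text. The names `class_weights` and `lmmd_linear` are hypothetical.

```python
import numpy as np

def class_weights(labels, num_classes):
    """w[i, c] = y_ic / sum_j y_jc; columns of absent classes stay zero."""
    onehot = np.eye(num_classes)[labels]
    counts = onehot.sum(axis=0, keepdims=True)
    return np.divide(onehot, counts, out=np.zeros_like(onehot), where=counts > 0)

def lmmd_linear(H_s, y_s, H_t, y_t_pseudo, num_classes):
    """Mean over classes of the squared distance between weighted class means."""
    W_s = class_weights(y_s, num_classes)          # source: true labels
    W_t = class_weights(y_t_pseudo, num_classes)   # target: pseudo-labels
    mean_s = W_s.T @ H_s                           # C x d weighted sums
    mean_t = W_t.T @ H_t
    return np.mean(np.sum((mean_s - mean_t) ** 2, axis=1))

rng = np.random.default_rng(0)
H_s = rng.normal(size=(40, 3))
y = np.repeat([0, 1], 20)

lmmd_same = lmmd_linear(H_s, y, H_s, y, num_classes=2)
lmmd_shift = lmmd_linear(H_s, y, H_s + 0.5, y, num_classes=2)
```

In the shifted case every class mean moves by 0.5 in each of the 3 dimensions, so the term equals 3 × 0.25 = 0.75. A kernelized version would replace the explicit means with kernel evaluations.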

The source-domain classification loss is expressed in terms of C, the number of classes, Y, the class (label) matrix, and S, the prediction output of the DTN model.
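A minimal sketch of the classification term and of how the three terms might be combined is given below. The cross-entropy form, its averaging over samples, and the additive weighting by α₁ and α₂ are assumptions inferred from the roles the text gives to Y, S, α₁, and α₂; the function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with max subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(Y, S, eps=1e-12):
    """Mean cross-entropy between one-hot labels Y and predictions S."""
    return -np.mean(np.sum(Y * np.log(S + eps), axis=1))

def total_loss(l_cls, l_coral, l_lmmd, alpha1, alpha2):
    """Assumed combination: classification + weighted adaptation terms."""
    return l_cls + alpha1 * l_coral + alpha2 * l_lmmd

# Four source samples, C = 3 classes, uninformative (all-zero) logits.
Y = np.eye(3)[[0, 1, 2, 0]]
S = softmax(np.zeros((4, 3)))
l_cls = cross_entropy(Y, S)

combined = total_loss(1.0, 2.0, 3.0, alpha1=0.5, alpha2=0.1)
```

With all-zero logits every class receives probability 1/3, so the classification loss equals ln 3; only source-domain samples enter this term because the target domain has no true labels.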

Beneficial effects: the cross-domain hyperspectral image classification method based on a deep transfer network of the present invention aligns the second-order statistics of the two domains as a whole using CORAL, adapting their global distributions. It also aligns the first-order statistics of the relevant subdomains using the local maximum mean discrepancy, adapting the local distributions of the two domains. The proposed method extracts deep, discriminative features of the target domain and classifies the unlabeled target-domain samples using only labeled source-domain samples.

Description of drawings

Figure 1 is a schematic diagram of the method of the present invention.

Detailed description

The present invention is further explained below with reference to the accompanying drawing.

A hyperspectral image classification method based on a deep transfer network classifies unlabeled target-domain samples using only labeled source-domain samples.

Step 1: reduce the dimensionality of the original hyperspectral image by band selection, removing band redundancy to obtain the dimension-reduced hyperspectral data X₀, which comprises the dimension-reduced source-domain data and target-domain data.

The original HSI has many bands, and the bands are strongly correlated, so there is a large amount of redundant information between them. Feeding the original HSI directly into the deep transfer network (DTN) increases the number of network parameters and degrades model performance. Band selection is therefore applied to reduce the dimensionality of the original HSI data. Let N_b denote the number of original HSI bands. Bands are selected at two interval values, yielding a and b bands respectively, and the number of bands after dimension reduction is d; the expressions for the intervals use the floor (round-down) operation.

Step 3: construct a deep transfer neural network in which a domain adaptation layer reduces the difference in second-order statistics between the two domains using the CORAL (CORrelation ALignment) algorithm, while LMMD (Local Maximum Mean Discrepancy) reduces the difference in first-order statistics between the relevant subspaces of the source and target domains.

Deep neural networks (DNNs) are widely used in HSI classification because of their strong deep feature extraction ability. However, when the training and test sets follow different data distributions, a DNN has difficulty learning transferable knowledge, which weakens the classification ability of the model. To solve this problem, the deep transfer network (DTN) is proposed: a domain adaptation layer is added to the DNN, and the deep, discriminative source-domain and target-domain features extracted by the DNN are aligned simultaneously in their global second-order statistics and in the first-order statistics of each class's relevant subspace.

The DTN is a feedforward neural network comprising a fully connected layer, a nonlinear layer, a domain adaptation layer, and a Softmax layer. In Figure 1, FC1 denotes the domain adaptation layer and FC2 denotes the Softmax layer.
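The layer order described above can be sketched as a plain forward pass. The layer widths, the ReLU activation, and the random initialization are assumptions, since the text does not specify them; `init_layer` and `dtn_forward` are hypothetical names. The sketch returns the FC1 features, on which adaptation terms could be evaluated, together with the Softmax output.

```python
import numpy as np

def init_layer(rng, n_in, n_out):
    """Small random weights and zero bias (illustrative initialization)."""
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

def dtn_forward(X, params):
    """Fully connected -> nonlinear -> FC1 (adaptation) -> FC2 (Softmax)."""
    (W1, b1), (W_fc1, b_fc1), (W_fc2, b_fc2) = params
    F1 = X @ W1 + b1                       # fully connected layer
    N1 = np.maximum(F1, 0.0)               # nonlinear layer (ReLU assumed)
    H = N1 @ W_fc1 + b_fc1                 # FC1: domain adaptation layer
    logits = H @ W_fc2 + b_fc2             # FC2: Softmax layer
    z = logits - logits.max(axis=1, keepdims=True)
    S = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return H, S

rng = np.random.default_rng(0)
# Hypothetical sizes: d = 12 input bands, 32 hidden units, 16 FC1 units, 5 classes.
params = [init_layer(rng, 12, 32), init_layer(rng, 32, 16), init_layer(rng, 16, 5)]

X = rng.random((8, 12))                    # batch of 8 dimension-reduced pixels
H, S = dtn_forward(X, params)
```

Source and target batches would both pass through the same weights, so that the adaptation terms compare features produced by a shared extractor.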

The dimension-reduced source-domain data and target-domain data are fed into the DTN, and features are extracted by a feature extractor composed of the fully connected layer and the nonlinear layer. The output of the fully connected layer is:

F₁ = I × W₁ + b₁

where I is the input of the fully connected layer, W₁ is its weight matrix, and b₁ is the bias. The output of the fully connected layer is fed to the nonlinear layer, whose input is denoted I^N. A domain adaptation layer is added to adapt the distribution difference between the two domains, and its output is connected to the Softmax layer. The loss function of the DTN is defined from three terms.

The three terms are the covariance domain adaptation term, the subspace adaptation term, and the source-domain classification loss; α₁ and α₂ are the covariance domain adaptation parameter and the subspace adaptation parameter, respectively.

The covariance domain adaptation term is expressed in terms of d₁, the input dimension of the domain adaptation layer, and C_s and C_t, the covariance matrices of the source-domain and target-domain data.

In the subspace adaptation term, each source-domain sample and each target-domain sample is assigned a weight indicating its membership of class c, and the term is built from the weighted sums over class c.

Step 4: classify the target-domain data using the trained auxiliary classifier.

The above describes only the preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and refinements without departing from the principles of the present invention, and such improvements and refinements shall also fall within the protection scope of the present invention.

Claims

1. A hyperspectral image classification method based on a deep transfer network, characterized in that it comprises the following steps:

Step 1: reduce the dimensionality of the original hyperspectral image by band selection, removing band redundancy to obtain the dimension-reduced hyperspectral data X₀, which comprises the dimension-reduced source-domain data and target-domain data;

Step 2: train an auxiliary classifier on the labeled source-domain samples, and use the auxiliary classifier to obtain pseudo-labels for the target domain;

Step 3: construct a deep transfer neural network in which a domain adaptation layer adapts the source and target domains with the CORAL loss to reduce the difference in second-order statistics, while LMMD reduces the difference in first-order statistics between the relevant subspaces of the source and target domains;

Step 4: classify the target-domain data using the trained auxiliary classifier.

2. The hyperspectral image classification method based on a deep transfer network according to claim 1, characterized in that, in step 1, reducing the dimensionality of the original hyperspectral image by band selection, which removes band redundancy to obtain the dimension-reduced hyperspectral data X₀, specifically comprises the following:

Let N_b denote the number of original HSI bands. Bands are selected at two interval values, yielding a and b bands respectively, and the number of bands after dimension reduction is d; the expressions for the intervals use the floor (round-down) operation.

The dimension-reduced hyperspectral data X₀ ∈ R^(n×d) is defined as the input of the model, where n is the number of samples. X₀ comprises the dimension-reduced source-domain data and target-domain data.

3. The hyperspectral image classification method based on a deep transfer network according to claim 1, characterized in that, in step 3, a deep transfer neural network is constructed in which a domain adaptation layer adapts the source and target domains with the CORAL method to reduce the difference in second-order statistics, while LMMD reduces the difference in first-order statistics between the relevant subspaces of the source and target domains; this specifically comprises the following:

The deep transfer network (DTN) comprises a fully connected layer, a nonlinear layer, a domain adaptation layer, and a Softmax layer. The dimension-reduced source-domain data and target-domain data are fed into the DTN, and features are extracted by a feature extractor composed of the fully connected layer and the nonlinear layer. The output of the fully connected layer is:

F₁ = I × W₁ + b₁

where I is the input of the fully connected layer, W₁ is its weight matrix, and b₁ is the bias; the output of the fully connected layer is fed to the nonlinear layer, whose input is denoted I^N;

The domain adaptation layer adapts the distribution difference between the two domains, and its output is connected to the Softmax layer. The loss function of the DTN is defined from the covariance domain adaptation term, the subspace adaptation term, and the source-domain classification loss;

The covariance domain adaptation term is expressed in terms of d₁, the input dimension of the domain adaptation layer, and C_s and C_t, the covariance matrices of the source-domain and target-domain data;

The subspace adaptation term can be expressed as: