CN110516596A

CN110516596A - Spatial Spectral Attention Hyperspectral Image Classification Method Based on Octave Convolution

Info

Publication number: CN110516596A
Application number: CN201910797299.0A
Authority: CN
Inventors: 唐旭; 孟凡波; 马晶晶; 焦李成
Original assignee: Xian University of Electronic Science and Technology
Current assignee: Xian University of Electronic Science and Technology
Priority date: 2019-08-27
Filing date: 2019-08-27
Publication date: 2019-11-29
Anticipated expiration: 2039-08-27
Also published as: CN110516596B

Abstract

The invention discloses a spatial-spectral attention hyperspectral image classification method based on Octave convolution. It addresses the problems of prior-art methods in which samples of the same class lie far apart, samples of different classes lie close together, and classification accuracy is low. The scheme comprises: inputting the image to be classified and preprocessing the data; dividing the data into a training set and a test set; constructing an Octave convolutional neural network; determining the loss function of the network; training and updating the network; and testing on the test set, which completes the hyperspectral image classification. The invention uses the Octave convolution operation to strengthen feature representation and introduces a spatial attention mechanism and a spectral attention mechanism, so that the network more accurately locates regions that favor classification and that carry more comprehensive and detailed information. The invention offers high classification accuracy and strong robustness and can be applied to the analysis and management of hyperspectral image data.

Description

Spatial Spectral Attention Hyperspectral Image Classification Method Based on Octave Convolution

Technical Field

The invention belongs to the technical field of image processing and relates in particular to content classification of hyperspectral images. Specifically, it is a spatial-spectral attention hyperspectral image classification method based on Octave convolution, which can be applied to the analysis and management of hyperspectral image data.

Background Art

As the pixel resolution of hyperspectral images continues to improve, more useful data and information can be obtained from them. Different applications place different requirements on hyperspectral image processing, so to analyze and manage hyperspectral image data effectively, a semantic label must be assigned to every pixel of a hyperspectral image. Hyperspectral image classification is an important way to solve this problem: it identifies pixels with similar characteristics in a hyperspectral image and assigns them to the correct classes. Compared with natural images, hyperspectral images have characteristics of their own. Because of their limited spatial resolution, and because the same material can exhibit different spectra while different materials can exhibit the same spectrum, misclassification is common; this follows from the complexity of hyperspectral images themselves. Classifying hyperspectral images more accurately has therefore become a major challenge.

Classification based on a convolutional neural network means feeding the training data into the network in batches and, through repeated training on large amounts of data, continually reducing the target loss function until the network can perform classification. Many mature and well-known convolutional neural networks have been proposed, such as the deep residual convolutional neural network for image classification proposed by Kaiming He et al. in 2015, which is widely used. The deep residual network effectively addresses the problems that features extracted for image classification carry insufficient information and that gradients vanish during training. However, it still suffers from large intra-class distances, small inter-class distances, and low classification accuracy caused by the complexity of the image data, as well as low robustness and a tendency to overfit caused by the small number of training samples.

Although existing convolutional neural networks can perform pixel-level hyperspectral classification, they still have three deficiencies when learning image semantic information. First, the complexity of hyperspectral images makes the localization of classification information inaccurate, so that during classification the distance within a class is large and the distance between classes is small. Second, during hyperspectral feature extraction, the extracted features are either underused, which loses information, or retain too much irrelevant information, which causes redundancy; both affect the classification result, and the convolutional neural network also often falls into a local optimum during training. Third, relatively little hyperspectral data is available, whereas training a convolutional neural network usually requires a large amount of training data, so a small amount of hyperspectral data cannot meet the data requirements of the network. These three deficiencies lead to poor robustness and misclassification when classifying real hyperspectral images.

Summary of the Invention

The purpose of the invention is to address the above problems of the prior art by proposing a deep-learning hyperspectral image classification method with a spatial-spectral attention mechanism based on Octave convolution, which achieves higher classification accuracy.

The invention is a deep-learning hyperspectral image classification method with a spatial-spectral attention mechanism based on Octave convolution, characterized in that it comprises the following steps:

(1) Image input and data preprocessing: input the hyperspectral image to be classified and slide a window pixel by pixel, centered on each pixel in turn. All image blocks obtained by sliding are used to build the hyperspectral image library {I₁,I₂,…,I_n,…,I_N}; the classes corresponding to the image blocks in the library are {Y₁,Y₂,…,Y_n,…,Y_N}. The library is then normalized. Here I_n denotes the n-th image block in the library, Y_n denotes the class of the n-th image block, n is the sample index, n∈[0,N], and N is the total number of image blocks in the hyperspectral image library;

(2) Division into training set and test set: from each class of the normalized hyperspectral image library, randomly select a specified number of samples to build the training sample set {T₁,T₂,…,T_j,…,T_M}, and use the remaining samples as the test sample set {t₁,t₂,…,t_d,…,t_m}, where T_j denotes the j-th training sample, j∈[0,M], t_d denotes the d-th test sample, d∈[0,m], M is the total number of training samples, m is the total number of test samples, m<N, and M<N;

(3) Construction of the Octave convolutional neural network: build an Octave convolutional neural network whose input end is an Octave convolution module and whose output end is the output of a fully connected layer. Between the input end and the output end there are two branches: one passes through a spatial attention module and then a pixel-level attention module, and the other passes through a spectral attention module and then a pixel-level attention module;

(4) Determination of the loss function loss_op of the Octave convolutional neural network: the loss function consists of four parts: the cross entropy loss₁ between the true labels and the classification output obtained by feeding the fused features into the fully connected layer; the cross entropy loss₂ between the true labels and the classification output obtained by feeding the features extracted by the spatial attention module and the pixel-level attention module into the fully connected layer; the cross entropy loss₃ between the true labels and the classification output obtained by feeding the features extracted by the spectral attention module and the pixel-level attention module into the fully connected layer; and the L2 norm of the convolutional neural network weights W, scaled by a hyperparameter. The loss function of the network is the sum of these four parts;

(5) Training and updating: set the number of training iterations to P and train the Octave convolutional neural network iteratively by gradient-descent optimization until the loss function loss_op no longer decreases or the number of training epochs reaches P, which yields the trained Octave convolutional neural network;

(6) Data testing: feed the normalized test sample set into the trained Octave convolutional neural network to obtain the classification results, which completes the image classification.

The invention adopts the Octave convolution operation together with attention mechanisms and builds a deep convolutional network model that contains both. The model needs only a small amount of training data to reach good performance. The Octave convolution operation effectively combines the high-frequency and low-frequency parts of the hyperspectral image data, so that the extracted features carry more comprehensive and detailed information and are used more fully. The attention mechanisms help the network find the feature regions that favor classification more quickly and effectively, making the captured information more comprehensive and accurate. Together these address the low classification accuracy and weak robustness of current hyperspectral image classification.

Compared with the prior art, the invention has the following advantages:

Enhanced feature representation: the invention introduces the Octave convolution operation into hyperspectral image classification for the first time and provides a dedicated Octave convolution module in the Octave convolutional neural network. Because hyperspectral image features are obtained through the Octave convolution operation, the high-frequency and low-frequency information of the hyperspectral image is combined effectively, the features carry more comprehensive and detailed information, and the feature representation of the image is enhanced.

Improved classification accuracy: the invention introduces attention mechanisms into hyperspectral image classification and provides a spatial attention module, a spectral attention module, and a pixel-level attention module in the Octave convolutional neural network. The attention mechanisms help the network quickly and accurately locate the most distinctive regions of the extracted hyperspectral features, so that the features that favor classification concentrate on regions with clear semantic information. This reduces the probability that the loss function falls into a local optimum and improves the accuracy of hyperspectral image classification.

Enhanced robustness: the invention designs a more effective loss function. The new loss function uses three cross-entropy terms to drive the network to learn features that are more effective for hyperspectral image classification. This strengthens the feature representation of the image, makes the classification objective more explicit, allows the network to adapt to complex hyperspectral image data, and greatly enhances its robustness.

Few training samples: the invention needs only a small number of samples to train a well-performing network model and places modest demands on the amount of hyperspectral image data.

Brief Description of the Drawings

Fig. 1 is the implementation flowchart of the invention;

Fig. 2 is the structural diagram of the Octave convolutional neural network constructed in the invention;

Fig. 3 shows the hyperspectral image used in the experiments of the invention, where Fig. 3(a) is the original hyperspectral image and Fig. 3(b) shows the class labels of the corresponding pixels of the original hyperspectral image;

Fig. 4 is the structural diagram of the Octave convolution module;

Fig. 5 is the structural diagram of the spatial attention module;

Fig. 6 is the structural diagram of the spectral attention module;

Fig. 7 is the structural diagram of the pixel-level attention module.

Detailed Description

The technical solutions and effects of the invention are described in detail below with reference to the accompanying drawings.

Embodiment 1

In recent decades, high spectral resolution has provided useful information for distinguishing different materials and objects, and hyperspectral image classification has been widely used in earth observation, with particular significance for urban development, precision agriculture, land change detection, and resource management. Existing hyperspectral image classification methods (see Fig. 1) classify hyperspectral images by building a hyperspectral image library, dividing it into a training sample set and a test sample set, building and training a convolutional neural network model, and testing the trained model. However, the complexity of hyperspectral images makes the localization of classification information inaccurate, and the trained network easily falls into a local optimum. In addition, because the networks are deep and their operations complex, the extracted features carry insufficient information. As a result, the distance within a class is large and the distance between classes is small during classification, misclassification occurs easily, and robustness is poor.

In view of this situation, the invention, after research and investigation, proposes a deep-learning hyperspectral image classification method with a spatial-spectral attention mechanism based on Octave convolution (see Fig. 2), comprising the following steps:

(1) Image input and data preprocessing: input the hyperspectral image to be classified. A hyperspectral image has many spectral bands, and the bands differ considerably from one another. It is also affected by environmental factors such as illumination and temperature, so hyperspectral images of the same class can differ greatly while those of different classes can differ little. A window is slid over the hyperspectral image pixel by pixel, centered on each pixel in turn. In this example a 9*9 sliding window is used; the window size can be adjusted as needed. The window moves one pixel at a time, and each position yields a 9*9 hyperspectral image block. All image blocks obtained by sliding are used to build the hyperspectral image library {I₁,I₂,…,I_n,…,I_N}, and the classes corresponding to the image blocks in the library are {Y₁,Y₂,…,Y_n,…,Y_N}. For the resulting library, the maximum and minimum over all pixel values are found, and the library is normalized using each pixel value together with these two extreme values. Here I_n denotes the n-th image in the library, Y_n denotes the class of the n-th image, n is the sample index, n∈[0,N], and N is the total number of images in the hyperspectral image library.
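Step (1) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `extract_patches` and `min_max_normalize` are hypothetical, and reflect padding at the image border is an assumption, since the text does not say how border pixels are handled.

```python
import numpy as np

def extract_patches(cube, labels, window=9):
    """Return one window x window block centred on every pixel, plus its label.

    cube:   (H, W, B) hyperspectral image with B spectral bands
    labels: (H, W) integer class map
    Border handling (reflect padding) is an assumption of this sketch.
    """
    half = window // 2
    padded = np.pad(cube, ((half, half), (half, half), (0, 0)), mode="reflect")
    h, w, bands = cube.shape
    patches = np.empty((h * w, window, window, bands), dtype=cube.dtype)
    for r in range(h):
        for c in range(w):
            # The window slides one pixel at a time; pixel (r, c) is its centre.
            patches[r * w + c] = padded[r:r + window, c:c + window, :]
    return patches, labels.reshape(-1)

def min_max_normalize(library):
    """Scale every value to [0, 1] using the global minimum and maximum."""
    lo, hi = library.min(), library.max()
    if hi == lo:
        return np.zeros_like(library, dtype=np.float64)
    return (library - lo) / (hi - lo)
```

A typical call would be `patches, y = extract_patches(cube, labels)` followed by `library = min_max_normalize(patches)`, which matches the order in the text: build the library first, then normalize it with the global extremes.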

(2) Division into training set and test set: from each class of the normalized hyperspectral image library, randomly select a specified number of samples to build the training sample set {T₁,T₂,…,T_j,…,T_M}, referred to as the training set. The remaining samples form the test sample set {t₁,t₂,…,t_d,…,t_m}, which is already normalized and is referred to as the test set. T_j denotes the j-th training sample, j∈[0,M]; t_d denotes the d-th test sample, d∈[0,m]; M is the total number of training samples, m is the total number of test samples, m<N, and M<N.

When selecting samples for the training set, the invention samples randomly within each class separately. The resulting training set therefore contains samples of every class and covers, as far as possible, the variety of samples within each class.

Randomly selecting a specified number of training samples per class is common practice among those skilled in the art when building a hyperspectral training set. The main reason is that hyperspectral datasets contain few samples and the number of samples varies greatly between classes; dividing the training and test sets by proportion would give the small classes very low classification accuracy.
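The per-class random selection described above can be sketched as follows. The function name `split_per_class`, the `seed` parameter, and the rule of capping the draw at the class size are assumptions made for illustration; the text specifies only that a fixed number of samples is drawn at random from each class and that the remainder forms the test set.

```python
import numpy as np

def split_per_class(labels, per_class, seed=0):
    """Draw `per_class` random training indices from every class.

    labels: 1-D integer array with one class label per image block.
    Returns (train_idx, test_idx); the test set is everything not drawn.
    """
    rng = np.random.default_rng(seed)
    chosen = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        # Assumption: a class smaller than `per_class` contributes all its samples.
        take = min(per_class, len(idx))
        chosen.append(rng.choice(idx, size=take, replace=False))
    train_idx = np.sort(np.concatenate(chosen))
    test_mask = np.ones(len(labels), dtype=bool)
    test_mask[train_idx] = False
    return train_idx, np.flatnonzero(test_mask)
```

Because the count is fixed per class rather than proportional, a class with 50 samples and a class with 3 samples contribute the same number of training examples.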

(3) Construction of the Octave convolutional neural network: build an Octave convolutional neural network (see Fig. 2) whose input end is an Octave convolution module and whose output end is the output of a fully connected layer. Between the input end and the output end there are two branches: one passes through a spatial attention module and then a pixel-level attention module, and the other passes through a spectral attention module and then a pixel-level attention module.

The Octave convolution module of the invention takes full account of the high-frequency and low-frequency information of the hyperspectral image and combines the two effectively through convolution, which enhances the feature representation of the image and makes the information in the image features more comprehensive. The spatial attention module and the pixel-level attention module focus the attention of the whole network on regions with clear semantic information, so that during hyperspectral classification the network makes full use of the regions most favorable to classification, captures the most distinctive feature regions, and improves classification accuracy. The spectral attention module and the pixel-level attention module focus the attention of the whole network on the spectral bands most useful for hyperspectral image classification and emphasize the bands with better feature representation, which helps improve classification accuracy and the robustness of the network.

(4) Determination of the loss function loss_op of the Octave convolutional neural network: the loss function consists of four parts: the cross entropy loss₁ between the true labels and the classification output obtained by feeding the fused features into the fully connected layer; the cross entropy loss₂ between the true labels and the classification output obtained by feeding the features extracted by the spatial attention module and the pixel-level attention module into the fully connected layer; the cross entropy loss₃ between the true labels and the classification output obtained by feeding the features extracted by the spectral attention module and the pixel-level attention module into the fully connected layer; and the L2 norm of the convolutional neural network weights W, scaled by a hyperparameter. The loss function of the network is the sum of these four parts.

The loss function loss_op of the invention uses three cross-entropy terms to drive the network to learn features that are more effective for hyperspectral image classification. This strengthens the feature representation of hyperspectral images, makes the classification objective more explicit, allows the network to adapt to complex hyperspectral image data, and greatly enhances its robustness. It also effectively reduces the probability that the loss function falls into a local optimum during training.
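The four-part loss of step (4) can be sketched as below. The names `cross_entropy`, `total_loss`, and `weight_decay` are hypothetical, and using the squared L2 norm for the weight term is an assumption of this sketch: the text says only that the term is the L2 norm of W scaled by a hyperparameter.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean cross entropy between softmax(logits) and integer class targets."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

def total_loss(fused_logits, spatial_logits, spectral_logits,
               targets, weights, weight_decay=1e-4):
    """loss_op = loss1 + loss2 + loss3 + weight_decay * ||W||^2 (squared: assumption)."""
    loss1 = cross_entropy(fused_logits, targets)     # fused features
    loss2 = cross_entropy(spatial_logits, targets)   # spatial-attention branch
    loss3 = cross_entropy(spectral_logits, targets)  # spectral-attention branch
    l2 = sum(np.sum(w ** 2) for w in weights)        # over all weight arrays
    return loss1 + loss2 + loss3 + weight_decay * l2
```

Each branch receives its own supervision signal in addition to the fused output, which is the property the text credits for the more purposeful training.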

(5) Training and updating: set the number of training iterations to P and train the Octave convolutional neural network iteratively by gradient-descent optimization until the loss function loss_op no longer decreases or the number of training epochs reaches P, which yields the trained Octave convolutional neural network. The number of iterations P is set manually and can be adjusted according to the training behavior of the network so as to maximize hyperspectral image classification accuracy. During training, the learning rate decreases gradually: it is relatively large at the start and becomes smaller as training proceeds. This effectively reduces the probability that the loss function becomes trapped in a local optimum, which helps improve hyperspectral image classification accuracy and enhances robustness.
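The stopping rule and the decreasing learning rate of step (5) can be illustrated with a toy example. Everything specific here is an assumption: the exponential decay schedule, the `patience` and `min_delta` thresholds used to decide that the loss "no longer decreases", and the quadratic objective that stands in for the network. The text states only that the learning rate shrinks over time and that training stops when the loss stops decreasing or P epochs are reached.

```python
def decayed_lr(initial_lr, epoch, decay_rate=0.99):
    """Exponentially decaying learning rate (the schedule is an assumption)."""
    return initial_lr * decay_rate ** epoch

def train(step_fn, max_epochs, initial_lr=0.1, patience=5, min_delta=1e-6):
    """Run step_fn(lr) once per epoch until the loss stalls or max_epochs (P) is hit."""
    best, stale, history = float("inf"), 0, []
    for epoch in range(max_epochs):
        loss = step_fn(decayed_lr(initial_lr, epoch))
        history.append(loss)
        if best - loss > min_delta:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:  # loss no longer decreasing
                break
    return history

class Quadratic:
    """Toy stand-in for the network: minimise (w - 3)^2 by gradient descent."""
    def __init__(self):
        self.w = 0.0

    def step(self, lr):
        grad = 2.0 * (self.w - 3.0)
        self.w -= lr * grad
        return (self.w - 3.0) ** 2
```

With `model = Quadratic()` and `train(model.step, max_epochs=200)`, the loop ends early once the loss plateaus, which mirrors the two stopping conditions in the text.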

The invention thus provides an overall technical scheme for a deep-learning hyperspectral image classification method with a spatial-spectral attention mechanism based on Octave convolution.

The technical idea of the invention is as follows. An Octave convolutional neural network is built, and the Octave convolution operation is used to obtain, for the region around each pixel, convolutional features that carry more comprehensive information. Following the principles of spatial attention and pixel-level attention, the attention of the whole network is focused on regions with clear semantic information, which locates the regions that contain information useful for classification. Following the principles of spectral attention and pixel-level attention, the spectral bands with more prominent features and stronger spectral information are identified, which yields a stronger feature representation for classification. These attention mechanisms focus the attention of the whole network on the regions most effective for classification, and a fully connected network then performs the image classification.

The invention addresses the large intra-class distances, small inter-class distances, and low classification accuracy found in current hyperspectral image classification, as well as the low robustness and tendency to overfit caused by the small number of training samples.

Through the Octave convolution module, the invention obtains convolutional features that carry more comprehensive information and enhances the feature representation of the image. The spatial attention module and the pixel-level attention module enable the network to find regions with clearer semantic information and more prominent features, and the spectral attention module and the pixel-level attention module identify the spectral bands with stronger spectral information. Combining these attention mechanisms locates the feature regions that are more effective for hyperspectral image classification, which improves classification accuracy and enhances the robustness of the network.

Embodiment 2

The deep-learning hyperspectral image classification method with a spatial-spectral attention mechanism based on Octave convolution is the same as in Embodiment 1. For the construction of the Octave convolutional neural network described in step (3) (see Fig. 2), the network consists of an Octave convolution module, a spatial attention module, a spectral attention module, a pixel-level attention module, and a fully connected layer, configured as follows:

The Octave convolution module is the input module. It consists of four convolution parts connected in sequence; each convolution part comprises an Octave convolution (see Fig. 4), Batch Normalization, and a ReLU activation function. A max pooling layer sits between the second and third convolution parts.

The Octave convolution operation first divides the hyperspectral image into a high-frequency part and a low-frequency part. Conventional convolutions are applied to both parts, giving four results: high-to-high, high-to-low, low-to-high, and low-to-low. The high-to-high and low-to-high results are added together to form the high-frequency output, and the high-to-low and low-to-low results are added together to form the low-frequency output. In this way the high-frequency and low-frequency information of the hyperspectral image exchange information effectively, the resulting features carry more comprehensive and detailed information, and the feature representation of the hyperspectral image is enhanced.
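The four-path structure of the Octave convolution can be sketched in NumPy as below. This is a minimal illustration under stated assumptions rather than the module of Fig. 4: it assumes the low-frequency part is stored at half the spatial resolution of the high-frequency part, that average pooling and nearest-neighbour upsampling move features between the two resolutions, and that the spatial sizes are even. All function names are hypothetical, and Batch Normalization and ReLU are omitted.

```python
import numpy as np

def conv2d_same(x, w):
    """Zero-padded 'same' convolution. x: (C_in, H, W), w: (C_out, C_in, k, k)."""
    k = w.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    h, wd = x.shape[1:]
    out = np.zeros((w.shape[0], h, wd))
    for i in range(k):
        for j in range(k):
            # Accumulate the contribution of kernel tap (i, j) over all channels.
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], xp[:, i:i + h, j:j + wd])
    return out

def avg_pool2(x):
    """2x2 average pooling; assumes even height and width."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample2(x):
    """Nearest-neighbour upsampling by a factor of 2."""
    return np.repeat(np.repeat(x, 2, axis=1), 2, axis=2)

def octave_conv(x_h, x_l, w_hh, w_hl, w_lh, w_ll):
    """Return (y_h, y_l) from high- and low-frequency inputs."""
    y_hh = conv2d_same(x_h, w_hh)                # high -> high
    y_hl = conv2d_same(avg_pool2(x_h), w_hl)     # high -> low
    y_lh = upsample2(conv2d_same(x_l, w_lh))     # low  -> high
    y_ll = conv2d_same(x_l, w_ll)                # low  -> low
    return y_hh + y_lh, y_hl + y_ll
```

The two sums in the return statement correspond directly to the text: the high-frequency output adds the high-to-high and low-to-high results, and the low-frequency output adds the high-to-low and low-to-low results.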

The spatial attention module (see Fig. 5) consists of a convolutional layer, Batch Normalization, a ReLU activation function, a matrix transpose-and-multiply layer, a softmax layer, and a data transpose-and-add layer. The features entering the module are first strengthened by a convolution and then passed to the matrix transpose-and-multiply layer, which produces a matrix whose elements each represent the spatial relationship between two positions in the hyperspectral image. The softmax layer normalizes this matrix to values between 0 and 1, and the normalized matrix is multiplied with the convolved features to highlight regions carrying clear semantic information, giving a feature map with a distinct semantic representation. Finally, that feature map and the initial features that entered the module are passed to the data transpose-and-add layer and added together to prevent information loss. The resulting features carry clear semantic information, so the network can easily locate the most distinctive regions, which improves classification accuracy.

The spectral attention module (see Fig. 6) consists of a matrix transpose-and-multiply layer, a softmax layer, and a data transpose-and-add layer. The features entering the module are passed to the matrix transpose-and-multiply layer, which produces a matrix whose elements each represent the relationship between two spectral bands of a pixel in the hyperspectral image. The softmax layer normalizes this matrix to values between 0 and 1, and the normalized matrix is multiplied with the initial features that entered the module to highlight, for each pixel, the spectral band with the strongest spectral information. Finally, the highlighted features and the initial features are passed to the data transpose-and-add layer and added together to prevent information loss. The resulting features emphasize the bands with the strongest spectral information and focus the network on them, which improves classification accuracy and network robustness.

The pixel-level attention module (see Fig. 7) consists of a convolutional layer, Batch Normalization and a ReLU activation function. Hyperspectral image classification assigns a class to every pixel, and the features produced by this module refine and strengthen the representation of each individual pixel, which improves classification accuracy.

The fully connected layers form the output layer and consist of a first fully connected layer, a second fully connected layer and a softmax layer. The inputs to the first fully connected layer are the features obtained from the spatial attention module followed by the pixel-level attention module, the features obtained from the spectral attention module followed by the pixel-level attention module, and the fused features obtained by adding these two. The input to the second fully connected layer is the output of the first; the input to the softmax layer is the output of the second. The softmax output gives the probability that a training sample belongs to each hyperspectral class and is the final output of the whole network.

The invention uses Octave convolution to strengthen the feature representation and introduces spatial and spectral attention mechanisms, so that the network more accurately finds regions that are more useful for classification and contain more comprehensive, detailed information. This improves both classification accuracy and network robustness.

Embodiment 3

The hyperspectral image classification method based on Octave convolution and spatial-spectral attention deep learning is the same as in Embodiments 1-2. Determining the loss function loss_op of the Octave convolutional neural network in step (4) comprises the following steps:

(4a) Input the training image library {T₁, T₂, …, T_j, …, T_M} into the Octave convolution module of the Octave convolutional neural network and output the feature F of its last convolutional layer.

(4b) Input the feature F separately into the spatial attention module and the spectral attention module, obtaining output features A and B respectively; then input A and B into the pixel-level attention module, obtaining output features C and D respectively.

(4c) Input the features C and D into the fully connected layers and output the classification results obtained from each. At the same time, multiply C and D each by a coefficient and add them pixel by pixel to obtain the fused feature E; input E into the fully connected layers and output the classification result obtained from it. This gives the loss function loss_op of the Octave convolutional neural network:

where loss₁ is the cross-entropy between the actual result and the classification output obtained by passing the fused feature E through the fully connected layers, loss₂ is the corresponding cross-entropy for feature C, loss₃ is the corresponding cross-entropy for feature D, ‖W‖₂ is the L2 norm of the network weight vector W, and η is the hyperparameter that weights this norm.

The loss function loss_op uses three cross-entropy terms to push the network to learn spatial and spectral features that are more effective for hyperspectral image classification. This strengthens the feature representation, makes the classification objective more explicit, lets the network adapt to complex hyperspectral data, improves robustness, and reduces the probability that training becomes trapped in a local optimum.

Embodiment 4

The hyperspectral image classification method based on Octave convolution and spatial-spectral attention deep learning is the same as in Embodiments 1-3. The cross-entropy loss₁ between the output classification result and the actual result in (4c) is given by:

where y_j is the predicted class-label probability of T_j in the training image library and o_j is the actual class label of T_j. The input to loss₁ is the predicted class-label probability obtained by passing the fused feature E through the fully connected layers.

loss₂ and loss₃ follow the same principle and formula as loss₁; the only difference is that the input to loss₂ is the predicted class-label probability obtained by passing feature C through the fully connected layers, and the input to loss₃ is that obtained by passing feature D through the fully connected layers.
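The cross-entropy terms can be illustrated with a short sketch. It assumes the standard categorical cross-entropy, −Σ o_j·log(y_j), averaged over the training samples; the averaging, the `eps` clipping and the function name `cross_entropy` are assumptions of this sketch rather than details taken from the text.

```python
import numpy as np

def cross_entropy(y_pred, o_true, eps=1e-12):
    """Mean cross-entropy between predicted probabilities and one-hot labels.

    y_pred: (M, K) predicted class probabilities y_j (each row sums to 1).
    o_true: (M, K) one-hot actual class labels o_j.
    Averaging over the M samples is an assumption of this sketch.
    """
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(-np.mean(np.sum(o_true * np.log(y_pred), axis=1)))

# Two samples, three classes.
y = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
o = np.array([[1, 0, 0], [0, 1, 0]])
loss_1 = cross_entropy(y, o)
```

The same function would serve for loss₂ and loss₃ by passing the probabilities predicted from feature C or feature D instead.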

Embodiment 5

The hyperspectral image classification method based on Octave convolution and spatial-spectral attention deep learning is the same as in Embodiments 1-4. The normalization of the hyperspectral image library in step (1) is carried out with the following formula:

where V_max is the maximum pixel value in the hyperspectral image library, V_min is the minimum pixel value in the library, V_n is the value of any pixel in the library, {I′₁, I′₂, …, I′_n, …, I′_N} is the normalized hyperspectral image library, and I′_n is the nth normalized hyperspectral image sample, n∈[1,N].

By normalizing the hyperspectral image library, the invention limits pixel values to the range -0.5 to 0.5, which balances the brightness distribution of the images and avoids interference in subsequent processing. Restricting every pixel value to one common interval also prevents a large value span from erasing edge information. Because normalization reduces the magnitude of the pixel values, it lowers the computational load of the network and speeds up convergence. Experiments also show that normalizing pixel values to between -0.5 and 0.5 further improves classification accuracy and greatly accelerates training.
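As a minimal sketch of this normalization, the code below assumes a linear min-max rescaling shifted by 0.5, which is one formula consistent with the stated range of -0.5 to 0.5 and with the symbols V_max, V_min and V_n. The function name `normalize_library` is illustrative, and the sketch assumes V_max > V_min.

```python
import numpy as np

def normalize_library(images):
    """Rescale every pixel of the image library to [-0.5, 0.5].

    Assumed formula: (V_n - V_min) / (V_max - V_min) - 0.5, with V_min and
    V_max taken over all pixels of the whole library (not per image).
    """
    images = np.asarray(images, dtype=np.float64)
    v_min, v_max = images.min(), images.max()
    return (images - v_min) / (v_max - v_min) - 0.5

# Tiny stand-in library: 3 patches of 2x2 pixels with 4 bands each.
library = np.arange(48, dtype=np.float64).reshape(3, 2, 2, 4)
normalized = normalize_library(library)
```

Using library-wide extremes keeps the relative brightness between patches intact, which a per-patch rescaling would not.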

Embodiment 6

The hyperspectral image classification method based on Octave convolution and spatial-spectral attention deep learning is the same as in Embodiments 1-5. In step (5) the convolutional neural network is trained iteratively by gradient descent, as follows:

(5a) Set the initial learning rate to L and the decay rate to β. Feed the training image library {T₁, T₂, …, T_j, …, T_M} into the constructed convolutional neural network in G batches of Q images each, so that:

$$G = \frac{M}{Q}$$

where M is the total number of samples in the training image library.

(5b) Let the learning rate l for each batch of input images be:

$$l = L \cdot \beta^{G}$$

(5c) Update the parameters of the convolutional neural network G times using the following formula to obtain the updated weight vector W_new:

where W is the weight vector of the convolutional neural network parameters;

(5d) Feed the next batch of training images into the convolutional neural network and re-evaluate the loss function loss_op with the updated weight vector, so that the value of loss_op keeps decreasing;

(5e) Repeat (5d). If the loss function loss_op stops decreasing while the current number of training rounds is still below the set number of iterations P, stop training and take the network as trained; otherwise, stop training when the number of training rounds reaches P and take the network as trained.

The number of images per batch, Q, is set manually and can be tuned according to training results to maximize classification accuracy. The learning rate is the rate at which the network learns useful features, and it decreases gradually as training proceeds. The larger initial rate lets the network quickly learn the main features of the hyperspectral images; as training continues, the smaller rate slows learning so that the network picks up the finer features that benefit classification. This schedule also reduces the probability that the loss function becomes trapped in a local optimum and speeds up convergence.
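The batch count and decaying learning rate can be sketched as follows. Two points are assumptions of this sketch: the batch count is rounded up when Q does not divide M, and the exponent in l = L·β^G is read as the running index g of the current batch (the text writes G), so that the rate decreases over training as described. The names `num_batches` and `learning_rate` are illustrative.

```python
import math

def num_batches(M, Q):
    """Number of inputs G needed to feed M samples, Q at a time (rounded up)."""
    return math.ceil(M / Q)

def learning_rate(L, beta, g):
    """Exponentially decayed learning rate for the g-th batch (g = 0, 1, ...)."""
    return L * beta ** g

G = num_batches(M=1000, Q=64)
rates = [learning_rate(0.01, 0.9, g) for g in range(G)]
```

With a decay rate β below 1 the sequence `rates` is strictly decreasing, matching the large-then-small behaviour described above.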

A more detailed example is given below to further illustrate the invention:

Embodiment 7

The hyperspectral image classification method based on Octave convolution and spatial-spectral attention deep learning is the same as in Embodiments 1-6. Referring to Fig. 2, the invention is implemented in the following steps:

Step 1: build the hyperspectral image library and obtain training and test samples.

1a) Download the Indian Pines hyperspectral image dataset from its official website. The dataset was acquired in 1992 by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over the Indian Pines test site in Indiana, USA. See Fig. 3: Fig. 3(a) is the original Indian Pines hyperspectral image and Fig. 3(b) shows the class labels of its pixels. A 13×13 window is slid over the hyperspectral image, centered on each pixel in turn and moving one pixel at a time, so that each position yields one 13×13 hyperspectral image block. All blocks obtained this way form the hyperspectral image library {I₁, I₂, …, I_n, …, I_N}, with corresponding classes {Y₁, Y₂, …, Y_n, …, Y_N}, where I_n is the nth image in the library, Y_n is its class, and n is the sample index, n∈[1,N].
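The sliding-window extraction in 1a) might look like the sketch below. How border pixels are handled is not stated, so reflect padding is an assumption here, as are the function name `extract_patches` and the small random cube standing in for the real dataset.

```python
import numpy as np

def extract_patches(cube, labels, size=13):
    """Cut one size x size block centered on every pixel (stride 1).

    cube:   (H, W, B) hyperspectral image with B bands.
    labels: (H, W) class label of each pixel.
    Returns patches of shape (H*W, size, size, B) and labels of shape (H*W,).
    Borders are reflect-padded (an assumption of this sketch).
    """
    r = size // 2
    padded = np.pad(cube, ((r, r), (r, r), (0, 0)), mode="reflect")
    H, W, _ = cube.shape
    patches = np.stack([padded[i:i + size, j:j + size, :]
                        for i in range(H) for j in range(W)])
    return patches, labels.reshape(-1)

rng = np.random.default_rng(0)
cube = rng.random((20, 20, 5))                 # stand-in for the real image
labels = rng.integers(0, 16, size=(20, 20))    # stand-in for Fig. 3(b) labels
patches, y = extract_patches(cube, labels)
```

Each block inherits the label of its center pixel, so the block at flat index `i * W + j` is paired with `labels[i, j]`.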

1b) For the resulting hyperspectral image library, find the maximum and minimum over all pixels in the library, and use each pixel value together with these two extremes to normalize the library according to the following formula:

where V_max is the maximum pixel value in the hyperspectral image library, V_min is the minimum pixel value in the library, V_n is the value of any pixel in the library, {I′₁, I′₂, …, I′_n, …, I′_N} is the normalized hyperspectral image library, and I′_n is the nth normalized hyperspectral image sample, n∈[1,N].

1c) From the normalized hyperspectral images of each class, randomly select a specified number of samples to build the training sample set {T₁, T₂, …, T_j, …, T_M} (the training set), and use the remaining images as the test sample set {t₁, t₂, …, t_d, …, t_m} (the test set). Here T_j is the jth training sample, j∈[1,M]; t_d is the dth test sample, d∈[1,m]; M is the total number of training samples and m the total number of test samples, with m<N and M<N.
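A per-class random split of this kind can be sketched as below. The helper name `split_per_class`, the fixed seed, and the rule of taking every sample when a class has fewer than `n_train` members are assumptions of this sketch.

```python
import numpy as np

def split_per_class(labels, n_train, seed=0):
    """Pick n_train random samples per class for training; the rest are test."""
    rng = np.random.default_rng(seed)
    train_idx = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        chosen = rng.choice(idx, size=min(n_train, idx.size), replace=False)
        train_idx.extend(chosen.tolist())
    train_idx = np.array(sorted(train_idx))
    test_idx = np.setdiff1d(np.arange(labels.size), train_idx)
    return train_idx, test_idx

labels = np.repeat(np.arange(4), 25)   # 4 classes, 25 samples each (N = 100)
train_idx, test_idx = split_per_class(labels, n_train=5)
```

Here M = 20 and m = 80, so M + m = N and both are smaller than N, as required in 1c).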

Step 2: build the Octave convolutional neural network.

2a) Set up the Octave convolution module. It consists of four convolution parts connected in sequence, each comprising an Octave convolution, Batch Normalization and a ReLU activation function, with a max pooling layer between the second and third convolution parts.

Referring to Fig. 4, the Octave convolution works as follows:

The image is divided into a high-frequency part and a low-frequency part, where the width and height of the low-frequency part are half those of the high-frequency part. An ordinary convolution is applied to the high-frequency part to give two results: the high-to-high result has the same width and height as the high-frequency part, and the high-to-low result has the same width and height as the low-frequency part. The same is done for the low-frequency part: the low-to-high result has the same width and height as the high-frequency part, and the low-to-low result has the same width and height as the low-frequency part. Results with the same width and height are then added together to form the new high-frequency and low-frequency parts.
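The four-path computation can be sketched as below. Several simplifications are assumptions of this sketch and are not stated in the text: 1×1 kernels stand in for the convolutions, 2×2 average pooling moves features from high to low resolution, and nearest-neighbour upsampling moves them from low to high resolution. All function names are illustrative.

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution as a per-pixel channel mix. x: (H, W, Cin), w: (Cin, Cout)."""
    return x @ w

def avg_pool2(x):
    """2x2 average pooling; H and W must be even."""
    H, W, C = x.shape
    return x.reshape(H // 2, 2, W // 2, 2, C).mean(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbour upsampling by a factor of 2."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def octave_conv(x_h, x_l, w_hh, w_hl, w_lh, w_ll):
    y_hh = conv1x1(x_h, w_hh)               # high -> high (high resolution)
    y_hl = conv1x1(avg_pool2(x_h), w_hl)    # high -> low  (low resolution)
    y_lh = upsample2(conv1x1(x_l, w_lh))    # low  -> high (high resolution)
    y_ll = conv1x1(x_l, w_ll)               # low  -> low  (low resolution)
    # Add the results that share a width and height.
    return y_hh + y_lh, y_hl + y_ll

rng = np.random.default_rng(0)
x_h = rng.random((12, 12, 8))   # high-frequency part
x_l = rng.random((6, 6, 8))     # low-frequency part, half the width and height
weights = [rng.normal(size=(8, 4)) for _ in range(4)]
y_h, y_l = octave_conv(x_h, x_l, *weights)
```

The outputs keep the two-resolution layout, so the block can be stacked as in the four convolution parts of the module.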

The ReLU activation function is:

$$f(x) = \max(0, x)$$

where x is the input to the ReLU activation function.

2b) Set up the spatial attention module. It consists of a convolutional layer, Batch Normalization, a ReLU activation function, a matrix transpose-and-multiply layer, a softmax layer, and a data transpose-and-add layer; its structure is shown in Fig. 5.

In the matrix transpose-and-multiply layer of the spatial attention module, the input is the convolved feature of size W×H×C. It is first reshaped to N×C, where N = W×H, and then transposed to give a feature of size C×N. The reshaped feature and the transposed feature are multiplied as matrices to give an output matrix of size N×N.

In the softmax layer of the spatial attention module, softmax normalizes the matrix output by the transpose-and-multiply layer to values between 0 and 1. The normalized matrix is then multiplied with the convolved, reshaped feature to give an output feature of size N×C.

In the data transpose-and-add layer of the spatial attention module, the softmax layer output is first reshaped to W×H×C and then added to the initial feature that entered the module, giving the final output feature of the spatial attention module, of size W×H×C.

The spatial attention module is formed by connecting, in order, the convolutional layer, Batch Normalization, the ReLU activation function, the matrix transpose-and-multiply layer, the softmax layer, and the data transpose-and-add layer. It lets the network find regions with clear semantic information, which improves classification accuracy.
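The shape bookkeeping of the spatial attention module can be sketched as follows. The convolution, Batch Normalization and ReLU that precede the attention step are assumed to have been applied already, so the function takes both the initial feature `x_in` and the convolved feature `feat`; applying softmax along each row of the N×N matrix is also an assumption, and the function names are illustrative.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(x_in, feat):
    """x_in: initial feature (W, H, C); feat: convolved feature (W, H, C)."""
    W, H, C = feat.shape
    f = feat.reshape(W * H, C)            # N x C
    attn = softmax(f @ f.T, axis=1)       # N x N, relation between positions
    out = attn @ f                        # N x C
    return out.reshape(W, H, C) + x_in    # back to W x H x C, residual add

rng = np.random.default_rng(0)
x = rng.random((6, 6, 8))
out = spatial_attention(x, x)
```

The final addition is the residual connection that the text describes as preventing information loss.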

2c) Set up the spectral attention module. It consists of a matrix transpose-and-multiply layer, a softmax layer, and a data transpose-and-add layer; its structure is shown in Fig. 6.

In the matrix transpose-and-multiply layer of the spectral attention module, the input feature has size W×H×C. It is first reshaped to C×N, where N = W×H, and then transposed to give a feature of size N×C. The reshaped feature and the transposed feature are multiplied as matrices to give an output matrix of size C×C.

In the softmax layer of the spectral attention module, softmax normalizes the matrix output by the transpose-and-multiply layer to values between 0 and 1. The normalized matrix is then multiplied with the reshaped feature to give an output feature of size C×N.

In the data transpose-and-add layer of the spectral attention module, the softmax layer output is first reshaped to W×H×C and then added to the initial feature that entered the module, giving the final output feature of the spectral attention module, of size W×H×C.

The spectral attention module is formed by connecting, in order, the matrix transpose-and-multiply layer, the softmax layer, and the data transpose-and-add layer. It lets the network find the spectral bands with stronger spectral information, which improves classification accuracy and network robustness.
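The spectral attention module differs from the spatial one mainly in which axis the attention matrix relates. A minimal sketch, assuming row-wise softmax over the C×C matrix and using illustrative function names:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def spectral_attention(x):
    """x: input feature (W, H, C); returns a feature of the same size."""
    W, H, C = x.shape
    f = x.reshape(W * H, C).T               # C x N
    attn = softmax(f @ f.T, axis=1)         # C x C, relation between bands
    out = attn @ f                          # C x N
    return out.T.reshape(W, H, C) + x       # back to W x H x C, residual add

rng = np.random.default_rng(0)
x = rng.random((6, 6, 8))
out = spectral_attention(x)
```

Because the attention matrix is only C×C, this module is much cheaper than the N×N spatial attention when the patch has many pixels and few channels.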

2d) Set up the pixel-level attention module. It is formed by connecting, in order, a convolutional layer, Batch Normalization and a ReLU activation function; its structure is shown in Fig. 7.

In the convolutional layer of the pixel-level attention module, the kernel size is set to 1×1. The 1×1 convolution strengthens the features of each individual pixel of the hyperspectral image, which improves classification accuracy.

2e) Set up the fully connected layers, which consist of a first fully connected layer, a second fully connected layer and a softmax layer. The weight matrix of the first fully connected layer has size 9800×1024 and that of the second has size 1024×16, where 16 is the total number of classes in the input hyperspectral image.

The first fully connected layer takes an input feature of size 1×9800. Each value of the input feature is first multiplied by a coefficient and then fed to the layer, whose output feature has size 1×1024. The coefficients are initialized as a vector drawn from a Gaussian distribution and can be updated during training.

The second fully connected layer takes an input feature of size 1×1024 and produces an output feature of size 1×16.
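The classification head can be sketched with the sizes given above (9800 → 1024 → 16). The bias terms, the initialization scales, and the absence of a nonlinearity between the two fully connected layers are assumptions of this sketch, since the text names none; `make_head` and `classify` are illustrative names.

```python
import numpy as np

def make_head(d_in=9800, d_hidden=1024, n_classes=16, seed=0):
    """Create Gaussian-initialized parameters for the two fully connected layers."""
    rng = np.random.default_rng(seed)
    return {
        "coef": rng.normal(0.0, 1.0, size=d_in),   # per-value input coefficients
        "W1": rng.normal(0.0, 0.01, size=(d_in, d_hidden)),
        "b1": np.zeros(d_hidden),
        "W2": rng.normal(0.0, 0.01, size=(d_hidden, n_classes)),
        "b2": np.zeros(n_classes),
    }

def classify(x, p):
    """x: (batch, 9800) flattened features; returns (batch, 16) probabilities."""
    h = (x * p["coef"]) @ p["W1"] + p["b1"]        # first fully connected layer
    logits = h @ p["W2"] + p["b2"]                 # second fully connected layer
    logits = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)        # softmax layer

params = make_head()
probs = classify(np.random.default_rng(1).random((2, 9800)), params)
```

In training, `coef` would be updated together with the layer weights, as the text notes for the coefficients.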

2f) Connect the Octave convolution module, spatial attention module, spectral attention module and fully connected layers set up above in sequence to obtain the Octave convolutional neural network.

Step 3: determine the loss function of the convolutional neural network.

(3a) Input the training sample set {T₁, T₂, …, T_j, …, T_M} into the Octave convolution module of the Octave convolutional neural network and output the feature F of its last convolutional layer.

(3b) Input the feature F separately into the spatial attention module and the spectral attention module, obtaining output features A and B respectively; then input A and B into the pixel-level attention module, obtaining output features C and D respectively.

(3c) Input the features C and D into the fully connected layers and output the classification results obtained from each. At the same time, multiply C and D each by a coefficient and add them pixel by pixel to obtain the fused feature E; input E into the fully connected layers and output the classification result obtained from it. This gives the loss function loss_op of the Octave convolutional neural network:

where ‖W‖₂ is the L2 norm of the Octave convolutional neural network weight vector W and η is the hyperparameter that weights this norm; loss₁ denotes the cross-entropy between the actual result and the classification output obtained by passing the fused feature E through the fully connected layers; y_j is the predicted class-label probability of T_j in the training image library and o_j is the actual class label of T_j. The input to loss₁ is the predicted class-label probability obtained by passing the fused feature E through the fully connected layers.

loss₂ and loss₃ follow the same principle and formula as loss₁. loss₂ is the cross-entropy between the actual result and the classification output obtained by passing feature C through the fully connected layers, and loss₃ is the corresponding cross-entropy for feature D. The input to loss₂ is the predicted class-label probability obtained from feature C, and the input to loss₃ is that obtained from feature D.
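One way the three cross-entropy terms and the weight penalty could be combined is sketched below. The unweighted sum of the three terms and the use of η times the (unsquared) L2 norm are assumptions of this sketch, chosen to match the verbal description; a squared-norm penalty is a common alternative. The names `cross_entropy` and `loss_op` follow the text, but the signatures are illustrative.

```python
import numpy as np

def cross_entropy(probs, onehot, eps=1e-12):
    """Mean categorical cross-entropy over the batch."""
    return float(-np.mean(np.sum(onehot * np.log(np.clip(probs, eps, 1.0)), axis=1)))

def loss_op(p_E, p_C, p_D, onehot, weights, eta=1e-4):
    """Combined loss (assumed form): loss_1 + loss_2 + loss_3 + eta * ||W||_2."""
    loss_1 = cross_entropy(p_E, onehot)   # from the fused feature E
    loss_2 = cross_entropy(p_C, onehot)   # from feature C (spatial branch)
    loss_3 = cross_entropy(p_D, onehot)   # from feature D (spectral branch)
    l2_norm = np.sqrt(sum(np.sum(w ** 2) for w in weights))
    return loss_1 + loss_2 + loss_3 + eta * l2_norm

onehot = np.eye(4)[[0, 1, 2]]              # 3 samples, 4 classes
uniform = np.full((3, 4), 0.25)            # maximally uncertain predictions
total = loss_op(uniform, uniform, uniform, onehot,
                weights=[np.array([3.0, 4.0])], eta=0.1)
```

Supervising the C and D branches separately, in addition to the fused output, gives each attention branch its own training signal.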

Step 4: train the convolutional neural network iteratively.

对Octave卷积神经网络进行迭代训练的现有方法有梯度下降优化算法、Nesterov梯度加速法、Adagrad方法，本例中采用但不仅限于梯度下降算法，其实现步骤如下：The existing methods for iterative training of the Octave convolutional neural network include the gradient descent optimization algorithm, the Nesterov gradient acceleration method, and the Adagrad method. In this example, but not limited to the gradient descent algorithm, the implementation steps are as follows:

4a) Set the number of iterations to P, the initial learning rate to L, and the decay rate to β. Feed the training image library {T₁, T₂, …, T_j, …, T_M} in batches into the convolutional neural network constructed in step 2, with Q images per batch and G batches in total:

$$G = \frac{M}{Q}$$

where M is the total number of samples in the training image library.

4b) Set the learning rate l for each input batch to:

$$l = L \cdot \beta^{G}$$
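Steps 4a) and 4b) can be sketched in a few lines of Python. The learning-rate formula is taken literally from the text (the exponent is G); rounding G up when Q does not divide M is an assumption, and the value M = 160 in the usage note is purely hypothetical.

```python
import math


def num_batches(total_samples, batch_size):
    """Number of batches G for M samples and Q images per batch.

    Rounding up is an assumption for the case where Q does not divide M.
    """
    return math.ceil(total_samples / batch_size)


def learning_rate(initial_lr, decay, g):
    """Learning rate l = L * beta^G, read literally from step 4b)."""
    return initial_lr * decay ** g
```

For example, with the simulation settings L = 0.00001, β = 0.9, Q = 16 and a hypothetical M = 160, `num_batches(160, 16)` gives G = 10 and `learning_rate(0.00001, 0.9, 10)` gives the decayed rate.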

4c) Update the parameters of the Octave convolutional neural network G times using the following formula to obtain the updated weight vector W_new:

$$W_{new} = W - l \cdot \frac{\partial\,\mathrm{loss}_{op}}{\partial W}$$

where W is the weight vector of the Octave convolutional neural network parameters.

Substitute the updated weight vector W_new into the loss function loss_op of 3c) to obtain the loss function loss_op after the weight update.

4d) Input the next batch of training images into the Octave convolutional neural network and update the loss function loss_op obtained after the weight update, so that its value keeps decreasing.

4e) Repeat 4d). If the loss function loss_op stops decreasing while the current number of training rounds is still less than the set number of iterations P, stop training and take the network as the trained Octave convolutional neural network; otherwise, stop training once the number of training rounds reaches P, which likewise yields the trained Octave convolutional neural network.

This example uses gradient descent to optimize the network and search for the optimal solution, but optimization is not limited to gradient descent; other optimization algorithms, such as genetic algorithms, can also be used.
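The stopping rule of steps 4c) to 4e) can be illustrated with a toy gradient descent loop. This is only a sketch under stated assumptions: the quadratic loss in the usage note stands in for loss_op, the tolerance `tol` used to decide that the loss "no longer decreases" is hypothetical, and a fixed learning rate is used for simplicity.

```python
def train(weights, loss_fn, grad_fn, lr, max_rounds, tol=1e-9):
    """Plain gradient descent with the two stopping conditions of step 4e).

    Stops early when the loss decreases by no more than tol (assumed
    threshold), otherwise stops after max_rounds rounds (P in the text).
    Returns the final weights and the number of rounds actually run.
    """
    prev_loss = loss_fn(weights)
    for round_idx in range(1, max_rounds + 1):
        grads = grad_fn(weights)
        # Gradient descent update: W_new = W - l * dloss/dW
        weights = [w - lr * g for w, g in zip(weights, grads)]
        cur_loss = loss_fn(weights)
        if prev_loss - cur_loss <= tol:
            return weights, round_idx  # loss no longer decreasing
        prev_loss = cur_loss
    return weights, max_rounds  # reached the iteration limit P


def toy_loss(weights):
    """Stand-in loss (sum of squares); not the patent's loss_op."""
    return sum(w * w for w in weights)


def toy_grad(weights):
    """Gradient of the stand-in loss."""
    return [2.0 * w for w in weights]
```

Calling `train([4.0, -2.0], toy_loss, toy_grad, lr=0.25, max_rounds=175)` stops well before 175 rounds because the toy loss flattens out, while a small `max_rounds` exercises the second branch of the rule.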

Step 5: classify the test sample set.

Input the normalized test sample set into the trained Octave convolutional neural network and take the network output as the classification result for the test sample set of the input hyperspectral image, completing the accurate classification of the hyperspectral image.
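As a small illustration of the final step, the sketch below turns per-sample class probabilities (such as the output of a softmax layer) into class labels. Taking the index of the largest probability is an assumed decision rule; the text only states that the network output gives the classification result.

```python
def classify(prob_rows):
    """Return, for each sample, the index of its most probable class."""
    return [max(range(len(row)), key=row.__getitem__) for row in prob_rows]
```

For two hypothetical test samples with probabilities `[0.1, 0.7, 0.2]` and `[0.6, 0.3, 0.1]`, the function selects class index 1 for the first and class index 0 for the second.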

The invention mainly addresses the problems in the prior art of large distances within the same category, small distances between different categories, and low classification accuracy. The invention establishes a hyperspectral image library and the categories corresponding to it, and randomly selects a specified number of hyperspectral image samples from each class of normalized hyperspectral images to construct a training sample set and a test sample set. It constructs an Octave convolutional neural network comprising an Octave convolution module, a spatial attention module, a spectral attention module, a pixel-level attention module, and a fully connected layer. The training samples are input into the Octave convolutional neural network to obtain their classification results, and the loss function of the network is determined. The loss function is iteratively updated by gradient descent until the loss value stabilizes, giving the trained Octave convolutional neural network. The normalized test sample set to be classified is then input into the trained network to obtain the classification result. The invention has high classification accuracy and strong robustness and can be applied to the analysis and management of hyperspectral image data.

The effect of the invention is further illustrated by the following simulation:

Embodiment 8

The hyperspectral image classification method based on Octave convolution and spatial-spectral attention deep learning is the same as in Embodiments 1-7.

Simulation conditions

In this example, the simulations of the present invention and of existing remote sensing image scene classification methods were run on an HP-Z840 workstation with a Xeon(R) CPU E5-2630, a GeForce TITAN XP, and 64 GB of RAM, under Ubuntu, on the TensorFlow platform.

The simulation parameters are set as follows: the number of iteration rounds P is 175, the initial learning rate L is 0.00001, η = 0.0001, the number of images per input batch Q is 16, and the decay rate β is 0.9. In each training iteration, the class-label discriminator and the classification-difference optimizer are trained jointly.

Simulation content

Download the Indian Pines hyperspectral image dataset, which was acquired in 1992 by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over the Indian Pines site in Indiana, USA. See Figure 3: Figure 3(a) is the original Indian Pines hyperspectral image, and Figure 3(b) shows the class labels of the corresponding pixels. Taking each pixel as the center, a 13*13 sliding window is moved pixel by pixel; each position yields one image, and all images obtained this way form the hyperspectral image library {I₁, I₂, …, I_n, …, I_N}. The library is then normalized: first obtain the maximum pixel value V′_max and the minimum pixel value V′_min of the library, then divide the value of every pixel in the library by the difference between V′_max and V′_min to obtain the normalized hyperspectral image library.
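The patch extraction and normalization just described can be sketched in plain Python on a nested-list image cube (height x width x bands). Zero padding at the image border is an assumption, since the text does not say how edge pixels are handled; the normalization follows the wording above and divides by the difference between the maximum and minimum values. Subtracting the minimum first (min-max scaling) is a common variant that is not shown here.

```python
def extract_patches(cube, size=13):
    """Return one size x size x bands patch centered on every pixel.

    cube is a nested list indexed as cube[row][col][band]. Positions that
    fall outside the image are filled with zeros (an assumed border policy).
    """
    half = size // 2
    height, width, bands = len(cube), len(cube[0]), len(cube[0][0])
    zero = [0.0] * bands
    patches = []
    for r in range(height):
        for c in range(width):
            patch = [
                [
                    cube[rr][cc] if 0 <= rr < height and 0 <= cc < width else zero
                    for cc in range(c - half, c + half + 1)
                ]
                for rr in range(r - half, r + half + 1)
            ]
            patches.append(patch)
    return patches


def normalize(patches):
    """Divide every value in the library by (max - min) of the whole library."""
    values = [v for patch in patches for row in patch for pixel in row for v in pixel]
    span = max(values) - min(values)
    if span == 0:
        raise ValueError("all pixel values are identical; cannot normalize")
    return [
        [[[v / span for v in pixel] for pixel in row] for row in patch]
        for patch in patches
    ]
```

Each patch inherits the class label of its center pixel, so a library built this way has exactly one sample per pixel of the original image.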

A certain number of hyperspectral images are randomly selected from the normalized hyperspectral image library as the training sample set D_T, and the remaining hyperspectral images form the test sample set D_t.
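A minimal sketch of such a split is shown below. It draws a fixed number of training samples per class, as the method summary describes, and leaves the rest for testing. The function name, the `seed` parameter, and the use of one shared per-class count are assumptions; the actual per-class counts are those of Table 1.

```python
import random


def split_per_class(labels, n_train_per_class, seed=0):
    """Randomly pick n_train_per_class sample indices from every class.

    Returns (train_indices, test_indices); the test set is everything that
    was not selected for training.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        train.extend(indices[:n_train_per_class])
        test.extend(indices[n_train_per_class:])
    return train, test
```

Fixing the seed keeps the split reproducible across runs, which matters when several models are compared on the same test set.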

The images in the training sample set and the test sample set fall into 16 categories: Alfalfa, Corn-notill, Corn-mintill, Corn, Grass-pasture, Grass-trees, Grass-pasture-mowed, Hay-windrowed, Oats, Soybean-notill, Soybean-mintill, Soybean-clean, Wheat, Woods, Stone-Steel-Towers, and Buildings-Grass-Trees-Drives. The number of training samples, the number of test samples, and the total number of samples for each category are given in Table 1.

Table 1. Category statistics of the Indian Pines hyperspectral image dataset

Under the above simulation conditions, the present invention and three representative existing image classification models were each trained on the training sample set D_T and tested on the test sample set D_t, and their classification accuracies were compared. The results are shown in Table 2.

Table 2. Performance evaluation of the present invention and existing hyperspectral image classification models

Test model           Test sample accuracy
Present invention    0.9898
KFRC-CKIR            0.9860
2-DCNN               0.9888
3D-SRNet             0.9720

In Table 2, KFRC-CKIR is an existing hyperspectral classification method based on kernel regularization and kernel fusion, 2-DCNN is an existing hyperspectral image classification method based on a deep convolutional neural network, and 3D-SRNet is an existing hyperspectral image classification method based on three-dimensional separability and transfer learning.

As Table 2 shows, the convolutional neural network trained with the present invention achieves the highest accuracy on the test sample set D_t among all tested methods, improving on each of the representative existing hyperspectral image classification models.
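For reference, test sample accuracy can be computed as the fraction of test samples whose predicted label matches the actual label, and the margins in Table 2 follow by subtraction. The sketch below assumes that definition of accuracy; the dictionary simply restates the values reported in Table 2.

```python
def accuracy(predicted, actual):
    """Fraction of samples whose predicted label equals the actual label."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Accuracies as reported in Table 2.
reported = {
    "Present invention": 0.9898,
    "KFRC-CKIR": 0.9860,
    "2-DCNN": 0.9888,
    "3D-SRNet": 0.9720,
}

# Margin of the present invention over each compared model.
margins = {
    name: round(reported["Present invention"] - acc, 4)
    for name, acc in reported.items()
    if name != "Present invention"
}
```

The smallest margin is over 2-DCNN and the largest is over 3D-SRNet.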

In summary, the invention discloses a spatial-spectral attention hyperspectral image classification method based on Octave convolution, which addresses the problems in the prior art of large distances within the same category, small distances between different categories, and low classification accuracy. The scheme is: input the image to be classified and preprocess the data, divide the training set and the test set, build the Octave convolutional neural network, determine its loss function, train and update the network, and test on the test set data, completing the hyperspectral image classification. The invention uses the Octave convolution operation to strengthen the feature representation and introduces a spatial attention mechanism and a spectral attention mechanism, so that the network more accurately finds regions that are more useful for classification and contain more comprehensive and detailed information. The invention has high classification accuracy and strong robustness and can be applied to the analysis and management of hyperspectral image data.

Claims

1. A hyperspectral image classification method based on Octave convolution and spatial-spectral attention deep learning, characterized by comprising the following steps:

(1) Image input and data preprocessing: input the hyperspectral image to be classified and slide a window pixel by pixel with each pixel as the center; all image blocks obtained by sliding are used to establish the hyperspectral image library {I₁, I₂, …, I_n, …, I_N}, the categories corresponding to the image blocks in the library are {Y₁, Y₂, …, Y_n, …, Y_N}, and the established hyperspectral image library is normalized, where I_n denotes the n-th image block in the image library, Y_n denotes the category corresponding to the n-th image block, n denotes the index of the n-th sample in the image library, n ∈ [0, N], and N denotes the total number of image blocks in the hyperspectral image library;

(2) Division of the training set and the test set: randomly select a specified number of hyperspectral image samples from each class of normalized hyperspectral images to construct the training sample set {T₁, T₂, …, T_j, …, T_M}, and use the remaining hyperspectral images as the test sample set {t₁, t₂, …, t_d, …, t_m}, where T_j denotes the j-th training sample, j ∈ [0, M], t_d denotes the d-th test sample, d ∈ [0, m], M is the total number of training samples, m is the total number of test samples, m < N, and M < N;

(3) Building the Octave convolutional neural network: build an Octave convolutional neural network whose input end is an Octave convolution module and whose output end is the output of a fully connected layer; between the input end and the output end the network contains two branches, one passing in turn through a spatial attention module and a pixel-level attention module, the other passing in turn through a spectral attention module and a pixel-level attention module;

(4) Determining the loss function loss_op of the Octave convolutional neural network: the loss function consists of four parts, namely the cross entropy loss₁ between the actual result and the classification result obtained by inputting the feature extracted after feature fusion into the fully connected layer; the cross entropy loss₂ between the actual result and the classification result obtained by inputting the feature extracted by the spatial attention module and the pixel-level attention module into the fully connected layer; the cross entropy loss₃ between the actual result and the classification result obtained by inputting the feature extracted by the spectral attention module and the pixel-level attention module into the fully connected layer; and the L2 norm of the convolutional neural network weight W, weighted by a hyperparameter; the loss function of the network is the sum of these four parts in order;

(5) Training update: set the number of iterations of network training to P and iteratively train the Octave convolutional neural network by gradient descent optimization until the loss function loss_op no longer decreases or the number of training rounds reaches the number of iterations, obtaining the trained Octave convolutional neural network;

(6) Test set data testing: input the normalized test sample set into the trained Octave convolutional neural network to obtain the classification result, completing the image classification.

2. The hyperspectral image classification method based on Octave convolution and spatial-spectral attention deep learning according to claim 1, characterized in that, in the building of the Octave convolutional neural network in step (3), the Octave convolution module, spatial attention module, spectral attention module, pixel-level attention module, and fully connected layer are configured as follows:

The Octave convolution module, i.e. the input module, consists of four convolution parts connected in sequence, each comprising an Octave convolution, Batch Normalization, and a ReLU activation function, with a max pooling layer between the second and third convolution parts;

The spatial attention module consists of a convolutional layer, Batch Normalization, a ReLU activation function, a matrix transpose-and-multiply layer, a softmax layer, and a data transpose-and-add layer;

The spectral attention module consists of a matrix transpose-and-multiply layer, a softmax layer, and a data transpose-and-add layer;

The pixel-level attention module consists of a convolutional layer, Batch Normalization, and a ReLU activation function;

The fully connected layer, i.e. the output layer, consists of a first fully connected layer, a second fully connected layer, and a softmax layer.

3. The hyperspectral image classification method based on Octave convolution and spatial-spectral attention deep learning according to claim 1, characterized in that determining the loss function loss_op of the Octave convolutional neural network in step (4) specifically comprises the following steps:

(4a) Input the training image library {T₁, T₂, …, T_j, …, T_M} into the Octave convolution module of the Octave convolutional neural network and output the last-layer feature F of the convolutional layers;

(4b) Input the last-layer feature F separately into the spatial attention module and the spectral attention module of the Octave convolutional neural network, whose output features are A and B respectively, then input A and B into the pixel-level attention module, whose output features are C and D respectively;

(4c) Input the obtained features C and D into the fully connected layer of the Octave convolutional neural network and output the classification results obtained from C and D; at the same time, multiply C and D each by a coefficient and add them pixel by pixel to obtain the fused feature E, then input E into the fully connected layer of the Octave convolutional neural network and output the classification result obtained from the fused feature E, giving the loss function loss_op of the Octave convolutional neural network:

$$\mathrm{loss}_{op} = \mathrm{loss}_1 + \mathrm{loss}_2 + \mathrm{loss}_3 + \eta \lVert W \rVert_2$$

where loss₁ is the cross entropy between the actual result and the classification result output after the fused feature E passes through the fully connected layer, loss₂ is the cross entropy between the actual result and the classification result output after feature C passes through the fully connected layer, loss₃ is the cross entropy between the actual result and the classification result output after feature D passes through the fully connected layer, ‖W‖₂ is the L2 norm of the convolutional neural network weight vector, and η is the hyperparameter applied to ‖W‖₂.

4. The hyperspectral image classification method based on Octave convolution and spatial-spectral attention deep learning according to claim 3, characterized in that the cross entropy loss₁ between the actual result and the classification result obtained from the fused feature E in step (4c) is given by the following cross-entropy formula:

where y_j is the predicted class-label probability of T_j in the training image library and o_j is the actual class label of T_j in the training image library; the input of loss₁ is the predicted class-label probability obtained after the fused feature E passes through the fully connected layer;

loss₂ and loss₃ follow the same principle and formula as loss₁; the input of loss₂ is the predicted class-label probability obtained after feature C passes through the fully connected layer, and the input of loss₃ is the predicted class-label probability obtained after feature D passes through the fully connected layer.

5. The hyperspectral image classification method based on Octave convolution and spatial-spectral attention deep learning according to claim 1, characterized in that the normalization of the hyperspectral image library in step (1) is carried out by the following formula:

where V_max is the maximum value over all pixels in the hyperspectral image library, V_min is the minimum value over all pixels in the hyperspectral image library, V_n is the value of any pixel in the hyperspectral image library, {I′₁, I′₂, …, I′_n, …, I′_N} is the normalized hyperspectral image library, I′_n is the n-th sample of the normalized hyperspectral image library, and n ∈ [0, N].

6. The hyperspectral image classification method based on Octave convolution and spatial-spectral attention deep learning according to claim 1, characterized in that the iterative training of the convolutional neural network by gradient descent optimization in step (5) is implemented as follows:

(5a) Set the initial learning rate of training to L and the decay rate to β, and feed the training image library {T₁, T₂, …, T_j, …, T_M} into the constructed convolutional neural network in G batches, with Q images per batch, so that:

$$G = \frac{M}{Q}$$

where M is the total number of samples in the training image library;

(5b) Set the learning rate l for each input batch to:

$$l = L \cdot \beta^{G}$$

(5c) Update the parameters of the convolutional neural network G times by the following formula to obtain the updated weight vector W_new:

$$W_{new} = W - l \cdot \frac{\partial\,\mathrm{loss}_{op}}{\partial W}$$

where W is the weight vector of the convolutional neural network parameters;

(5d) Input the next batch of training images into the convolutional neural network and update the loss function loss_op obtained after the weight update, so that the value of loss_op keeps decreasing;

(5e) Repeat (5d); if the loss function loss_op no longer decreases while the current number of training rounds is less than the set number of iterations P, stop training the network and obtain the trained Octave convolutional neural network; otherwise, stop training the network when the number of training rounds reaches the set number of iterations P and obtain the trained Octave convolutional neural network.