CN113313185A - Hyperspectral image classification method based on self-adaptive spatial spectral feature extraction

Info

Publication number: CN113313185A
Application number: CN202110639860.XA
Authority: CN
Inventors: 袁媛; 王文超; 马单丹
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2021-06-09
Filing date: 2021-06-09
Publication date: 2021-08-27
Anticipated expiration: 2041-06-09
Also published as: CN113313185B

Abstract

The invention discloses a hyperspectral image classification method based on self-adaptive spatial spectral feature extraction. First, principal component analysis is applied to the original hyperspectral image, and the spectral vectors of a central pixel and its surrounding pixels are selected to form a three-dimensional matrix block, which serves as the sample and is called the sample central pixel spectral vector. The spectral features of the sample central pixel are then extracted, and a probability score for the sample central pixel spectral vector is computed from those spectral features. The spectral features of sample central pixels whose probability scores are greater than or equal to a threshold are input into a spatial feature extraction module, which outputs the spectral-spatial joint features of the sample. Finally, the spectral features of the sample central pixels and the spectral-spatial joint features of the samples are classified separately to obtain the final classification result. The invention reduces the computation and time required for classification, improves classification accuracy, and simplifies the engineering complexity of implementing the model.

Description

Hyperspectral image classification method based on self-adaptive spatial spectral feature extraction

Technical Field

The invention belongs to the technical field of image processing, and particularly relates to a hyperspectral classification method.

Background

Hyperspectral remote sensing images usually contain hundreds of spectral bands, have high spectral resolution and a large amount of information, and reflect rich spectral characteristics of ground objects. They are widely used in land monitoring, agricultural monitoring, mineral exploration and other fields. Hyperspectral image classification plays an important role in these applications. The task is to assign the correct material class to each pixel of an input hyperspectral remote sensing image. In a typical classification framework, feature mining is performed first to obtain the features that differ most between ground-object classes, and pixel-level classification is then performed on the extracted features. Feature extraction is therefore a key step in hyperspectral image classification and strongly affects classification accuracy.

Most traditional machine learning methods rely on handcrafted features with shallow representations. These features require expert knowledge and are designed for specific tasks, which makes feature extraction and classification cumbersome and limits applicability across different complex scenes. Because of these limitations, handcrafted features may fail to capture subtle differences between classes and large variations within a class, so discriminative features are hard to obtain and accurate classification of complex, high-dimensional hyperspectral data is difficult. Deep learning methods developed in recent years have received increasing attention. They extract complex, discriminative deep features through a hierarchical structure, have stronger fitting and representation capabilities, and perform well on hyperspectral image classification. Deep-learning-based hyperspectral image classification methods can be divided into the following three categories according to how features are extracted:

The first category is classification based on spectral features. Spectral information is the most important characteristic of hyperspectral data and plays a crucial role in classification. Early studies fed the spectral vector of each hyperspectral pixel directly into a network model. Liu et al., in "P. Liu, H. Zhang, and K. B. Eom, Active deep learning for classification of hyperspectral images, IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 10, no. 2, pp. 712-724, 2017", propose a classification framework based on deep learning and active learning that uses deep belief networks to extract deep spectral features. Haut et al., in "J. M. Haut, M. E. Paoletti, J. Plaza, J. Li, and A. Plaza, Active learning with convolutional neural networks for hyperspectral image classification using a new Bayesian approach, IEEE Trans. Geosci. Remote Sens., vol. 56, no. 11, pp. 6440-6461, 2018", address the high dimensionality of hyperspectral data by combining active learning with deep convolutional neural networks and training the model with a small number of labeled samples, which avoids classifier overfitting on limited samples. However, these models directly classify the original spectral data, which is high-dimensional and highly redundant, and they ignore the discriminative spatial features in the data; this degrades classification performance and imposes a heavy computational burden.

The second category is classification based on spatial features. Typically, PCA (principal component analysis) is used to reduce the dimensionality of the raw data, and a two-dimensional CNN (convolutional neural network) then mines the spatial information in the neighborhood of each input hyperspectral pixel to extract discriminative spatial features. Cheng et al., in "G. Cheng, Z. Li, J. Han, X. Yao, and L. Guo, Exploring hierarchical convolutional features for hyperspectral image classification, IEEE Trans. Geosci. Remote Sens., vol. 56, no. 11, pp. 6712-6722, 2018", extract several principal components of the hyperspectral data with PCA, extract deep spatial features with existing CNN models (e.g., AlexNet, VGG, GoogLeNet), and learn spectral-spatial features with better representation ability by constructing a classification framework based on metric learning. However, these methods must up-sample the input in the spatial dimensions, for example by bilinear interpolation, to reach the input size required by the network, and the resulting deep spatial features still have to be fused with spectral information before classification, which leads to a large amount of computation.

The third category is classification based on joint spectral-spatial feature extraction. Deep spectral-spatial joint features are usually obtained either by extracting them directly from the original hyperspectral data or from data reduced to several principal components, or by fusing separately extracted deep spectral and spatial features. Chen et al., in "Y. Chen, H. Jiang, C. Li, X. Jia, and P. Ghamisi, Deep feature extraction and classification of hyperspectral images based on convolutional neural networks, IEEE Trans. Geosci. Remote Sens., vol. 54, no. 10, pp. 6232-6251, 2016", use a 3D CNN to extract deep spectral-spatial joint features efficiently without relying on any data preprocessing or subsequent feature fusion. Yang et al., in "J. Yang, Y.-Q. Zhao, and J. C.-W. Chan, Learning and transferring deep joint spectral-spatial features for hyperspectral classification, IEEE Trans. Geosci. Remote Sens., vol. 55, no. 8, pp. 4729-4742, 2017", propose a deep CNN with a two-branch structure and concatenate the learned spectral and spatial features in a fully connected layer to extract spectral-spatial joint features. However, these methods simply introduce spatial features and classify the spectral-spatial features directly, without considering whether spatial features are needed for a given sample, which causes extra computation and can even reduce performance.

Disclosure of Invention

To overcome the shortcomings of the prior art, the invention provides a hyperspectral image classification method based on self-adaptive spatial spectral feature extraction. First, principal component analysis is applied to the original hyperspectral image, and the spectral vectors of a central pixel and its surrounding pixels are selected to form a three-dimensional matrix block, which serves as the sample and is called the sample central pixel spectral vector. The spectral features of the sample central pixel are then extracted, and a probability score for the sample central pixel spectral vector is computed from those spectral features. The spectral features of sample central pixels whose probability scores are greater than or equal to a threshold are input into a spatial feature extraction module, which outputs the spectral-spatial joint features of the sample. Finally, the spectral features of the sample central pixels and the spectral-spatial joint features of the samples are classified separately to obtain the final classification result. The invention reduces the computation and time required for classification, improves classification accuracy, and simplifies the engineering complexity of implementing the model.

The technical scheme adopted by the invention for solving the technical problem comprises the following steps:

Step 1: data preprocessing: applying principal component analysis to the original hyperspectral image to obtain a new hyperspectral image containing l principal components;

Step 2: selecting the spectral vectors of a central pixel of the new hyperspectral image and of its surrounding pixels to form an n × n × l three-dimensional matrix block, and taking this block as the sample representing the selected central pixel; the sample is called the sample central pixel spectral vector;

Step 3: extracting features from the sample central pixel spectral vector with a spectral feature extraction module to obtain the spectral features of the sample central pixel; the spectral feature extraction module consists of a one-dimensional convolution layer and a ReLU activation layer;

Step 4: using a gating module to compute, from the spectral features of the sample central pixel, a probability score for the sample central pixel spectral vector; the gating module consists of a fully connected layer, a ReLU activation layer and a Sigmoid activation layer;

Step 5: setting a threshold, inputting the spectral features of sample central pixels whose probability scores are greater than or equal to the threshold into a spatial feature extraction module, and obtaining the spectral-spatial joint features of the sample as output; the spatial feature extraction module consists of a two-dimensional convolution layer and a ReLU activation layer;

Step 6: inputting the spectral features of sample central pixels whose probability scores are smaller than the threshold into a first classifier module to obtain a classification result;

inputting the spectral-spatial joint features of samples whose probability scores are greater than or equal to the threshold into a second classifier module to obtain a classification result;

the first classifier module and the second classifier module each consist of a fully connected layer, a Dropout layer, a ReLU layer and a Softmax layer, but the two modules have different parameters;

Step 7: training the hyperspectral classification network formed by the spectral feature extraction module, the gating module, the spatial feature extraction module, the first classifier module and the second classifier module of steps 3 to 6, and updating the network parameters to obtain a trained hyperspectral classification network model;

Step 8: processing the hyperspectral image to be classified through step 1 and step 2 to generate central pixel spectral vectors, inputting them into the trained hyperspectral classification network model, and outputting the classification result.

Preferably, 15 ≤ l ≤ 30.
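
The preprocessing of steps 1 and 2 can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function names `pca_reduce` and `extract_patch`, the reading of the three-dimensional block as n × n × l, the zero padding at image borders, and the synthetic random cube are all assumptions made for illustration.

```python
import numpy as np


def pca_reduce(cube, l):
    """Project an (H, W, B) hyperspectral cube onto its first l principal components."""
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(np.float64)   # astype returns a copy
    flat -= flat.mean(axis=0)                       # center each band
    cov = np.cov(flat, rowvar=False)                # (B, B) band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    top = eigvecs[:, np.argsort(eigvals)[::-1][:l]]  # l leading eigenvectors
    return (flat @ top).reshape(h, w, l)


def extract_patch(image, row, col, n):
    """Return the n x n x l block centered on (row, col); n is assumed odd.

    Border pixels are zero-padded, which is an assumption of this sketch.
    """
    r = n // 2
    padded = np.pad(image, ((r, r), (r, r), (0, 0)), mode="constant")
    return padded[row:row + n, col:col + n, :]


rng = np.random.default_rng(0)
cube = rng.random((12, 10, 50))          # synthetic stand-in for a real scene
reduced = pca_reduce(cube, l=20)         # l chosen inside the preferred 15..30 range
patch = extract_patch(reduced, row=0, col=3, n=5)
print(reduced.shape, patch.shape)
```

The central pixel of the returned block is the pixel the sample represents, so its reduced spectrum can be read back as `patch[n // 2, n // 2]`.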

Preferably, the threshold is 0.5.
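
The threshold rule of steps 5 and 6 can be sketched in a few lines of plain Python. The function name `route_samples` and the example scores are hypothetical; the sketch follows the convention used in the detailed description, where scores strictly below 0.5 go to the first classifier and scores of 0.5 or more trigger spatial feature extraction.

```python
def route_samples(scores, threshold=0.5):
    """Split sample indices by gate score.

    Scores below the threshold are routed to the spectral-only (first) classifier;
    scores at or above it are routed to spatial feature extraction.
    """
    spectral_only, needs_spatial = [], []
    for index, score in enumerate(scores):
        if score >= threshold:
            needs_spatial.append(index)
        else:
            spectral_only.append(index)
    return spectral_only, needs_spatial


scores = [0.12, 0.50, 0.93, 0.49]        # made-up gate outputs
spectral_only, needs_spatial = route_samples(scores)
print(spectral_only)    # [0, 3]
print(needs_spatial)    # [1, 2]
```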

Preferably, the method for training the hyperspectral classification network comprises the following steps:

the network is trained by stochastic gradient descent with the Adam optimizer, the learning rate is set to 0.001, and the loss function over N training samples in C classes is:

[equation not reproduced in this text]

where L_cls represents the cross-entropy loss of the two classifier modules; λ is a weight factor that balances the losses of the two classification results, λ = 0, 2; L_gate is the loss of the gating module. The remaining terms, whose symbols are not reproduced in this text, are defined in order as follows:

the indicator function, which is 1 when the predicted class is the same as the actual class and 0 otherwise;

p_g, the probability score for spatial feature extraction output by the gating module;

y^(i,j), the label of the i-th sample on the j-th class, which is 0 or 1;

the prediction confidence of the first classifier module for the i-th sample on the j-th class;

the prediction confidence of the second classifier module for the i-th sample on the j-th class;

the prediction vector of the first classifier module for the i-th sample;

y^i, the one-hot label of the i-th sample;

the prediction vector of the second classifier module for the i-th sample.

The network parameters are updated through the loss functions L_cls and L_gate without setting additional hyper-parameters, to obtain the trained model.
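
Because the loss equation itself does not survive in this text, the following sketch only illustrates the cross-entropy term described above and should not be read as the patent's formula. It assumes that the classification loss is the mean cross-entropy of the first classifier plus λ times that of the second; the gating loss is deliberately left out, and the function names and the value `lam=1.0` are hypothetical.

```python
import math


def cross_entropy(one_hot, probs, eps=1e-12):
    """Cross-entropy -sum_j y_j * log(p_j) for a single sample."""
    return -sum(y * math.log(p + eps) for y, p in zip(one_hot, probs))


def classification_loss(labels, probs_first, probs_second, lam):
    """Assumed form: mean CE of the first classifier + lam * mean CE of the second."""
    n = len(labels)
    first = sum(cross_entropy(y, p) for y, p in zip(labels, probs_first)) / n
    second = sum(cross_entropy(y, p) for y, p in zip(labels, probs_second)) / n
    return first + lam * second


labels = [[1, 0], [0, 1]]                       # one-hot labels, C = 2
probs_first = [[0.5, 0.5], [0.5, 0.5]]          # uninformative first classifier
probs_second = [[1.0, 0.0], [0.0, 1.0]]         # perfect second classifier
loss = classification_loss(labels, probs_first, probs_second, lam=1.0)
print(round(loss, 4))
```

With these inputs the second term vanishes and the first equals log 2, so the example isolates how each classifier contributes to the total.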

The invention has the following beneficial effects:

1. The invention reduces the computation and time required for classification. It adaptively determines, from the discriminative power of the spectral features, whether spatial information needs to be introduced for a sample; if the spectral features of the sample are already sufficiently discriminative, no spatial features are extracted.

2. The invention improves classification accuracy. Samples whose spectral features are sufficiently discriminative are classified directly from those features without extracting spatial features, which avoids the potential interference that spatial information can introduce into the classification result.

3. The model is trained and tested end to end. The gating module learns the quality of the spectral features adaptively, without grid search or manual threshold setting, so the model adapts better to different hyperspectral images and is simpler to engineer.

Drawings

FIG. 1 is a flow chart of the method of the present invention.

FIG. 2 is a schematic diagram of a classification result according to an embodiment of the present invention.

FIG. 3 is a schematic diagram of an experimental result of effectiveness analysis according to an embodiment of the present invention.

Detailed Description

The invention is further illustrated with reference to the following figures and examples.

The invention discloses a novel hyperspectral image classification method based on adaptive extraction of spectral and spatial features. In hyperspectral image classification, spatial information compensates for spectral information that is not discriminative enough for some samples, and thereby improves classification accuracy. For most samples in a hyperspectral image, however, the spectral information alone is discriminative enough for correct classification, so introducing spatial information is unnecessary and adds computational burden. In some cases it even causes interference and reduces the accuracy of the final classification. It is therefore important to introduce spatial features adaptively according to the spectral characteristics of each sample and to build a more adaptable spatial-spectral joint feature classification model. To solve the above problems, the invention aims to achieve the following:

1. By designing an effective feature extraction mechanism, the method adaptively decides whether to introduce spatial information according to the discriminative power of the spectral features, thereby avoiding unnecessary computation.

2. Where the spectral features are already sufficiently discriminative and introducing spatial information might instead have a negative effect, the potential interference of neighborhood spatial information is avoided and classification accuracy is improved.

3. For different hyperspectral images, a gating module is combined with an early prediction strategy and an end-to-end training method is designed, so that the method adapts better to the data.

As shown in fig. 1, a hyperspectral image classification method based on adaptive spatial spectral feature extraction includes the following steps:

Step 1: data preprocessing: applying principal component analysis to the original hyperspectral image to obtain a new hyperspectral image containing 15 to 30 principal components;

Step 2: selecting the spectral vectors of a central pixel of the new hyperspectral image and of its surrounding pixels to form an n × n × l three-dimensional matrix block, and taking this block as the sample representing the selected central pixel; the sample is called the sample central pixel spectral vector; dividing the sample central pixel spectral vectors obtained from a large number of hyperspectral images into a training set and a test set;

Step 3: extracting features from the sample central pixel spectral vector with a spectral feature extraction module to obtain the spectral features of the sample central pixel; the spectral feature extraction module consists of a one-dimensional convolution layer and a ReLU activation layer;

Step 4: using a gating module to compute, from the spectral features of the sample central pixel, a probability score for the sample central pixel spectral vector, and judging from the probability score whether a high classification confidence can be obtained, so as to determine whether spatial feature extraction is required; the gating module consists of a fully connected layer, a ReLU activation layer and a Sigmoid activation layer;

Step 5: inputting the spectral features of sample central pixels whose probability scores are greater than or equal to 0.5 into a spatial feature extraction module, and obtaining the spectral-spatial joint features of the sample as output; the spatial feature extraction module consists of a two-dimensional convolution layer and a ReLU activation layer;

Step 6: inputting the spectral features of sample central pixels whose probability scores are smaller than 0.5 into a first classifier module to obtain a classification result;

inputting the spectral-spatial joint features of samples whose probability scores are greater than or equal to 0.5 into a second classifier module to obtain a classification result;

Step 7: using the training set to train the hyperspectral classification network formed by the spectral feature extraction module, the gating module, the spatial feature extraction module, the first classifier module and the second classifier module of steps 3 to 6; during training, all training samples undergo feature extraction by the spectral feature extraction module or the spatial feature extraction module to obtain classification results and update the network parameters, finally yielding a trained hyperspectral classification network model;

during training, stochastic gradient descent is used with the Adam optimizer and the learning rate set to 0.001; over N training samples in C classes, the loss function is:

[equation not reproduced in this text]

where L_cls represents the cross-entropy loss of the two classifier modules; λ is a weight factor that balances the losses of the two classification results, λ = 0, 2; L_gate is the loss of the gating module; the last remaining term represents the prediction vector of the second classifier module for the i-th sample; the network parameters are updated through the loss functions L_cls and L_gate without setting additional hyper-parameters, to obtain the trained model;

testing the trained hyperspectral classification network model with the test set; during testing, spectral feature extraction and quality evaluation by the gating module are performed on the central pixel spectral vector; when the probability score is less than 0.5, the first classifier module outputs the classification result directly from the spectral features; otherwise, spatial feature extraction is performed and classification uses the spectral-spatial joint features.
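
The inference path of steps 3 to 6 can be sketched in PyTorch, the library named in the embodiment. This is a hypothetical illustration, not the patented network: the class name, channel counts, kernel sizes and hidden widths are invented; a second linear layer is assumed in the gate so that the Sigmoid output can fall below 0.5; the average pooling and the concatenation used to form the joint features are also assumptions.

```python
import torch
import torch.nn as nn


def make_classifier(in_features, num_classes, hidden=64):
    # Fully connected, ReLU, Dropout and Softmax layers, as listed in the text;
    # the layer order and widths are assumptions.
    return nn.Sequential(
        nn.Linear(in_features, hidden), nn.ReLU(), nn.Dropout(0.5),
        nn.Linear(hidden, num_classes), nn.Softmax(dim=1),
    )


class GatedSpectralSpatialNet(nn.Module):
    """Hypothetical sketch of the gated two-branch classification network."""

    def __init__(self, l=20, num_classes=16, threshold=0.5):
        super().__init__()
        self.threshold = threshold
        self.num_classes = num_classes
        # Spectral branch: 1-D convolution + ReLU along the l principal components.
        self.spectral = nn.Sequential(nn.Conv1d(1, 8, kernel_size=3, padding=1), nn.ReLU())
        # Gate: linear, ReLU, linear, Sigmoid (second linear layer is assumed).
        self.gate = nn.Sequential(nn.Linear(8 * l, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
        # Spatial branch: 2-D convolution + ReLU; the pooling is an added assumption.
        self.spatial = nn.Sequential(nn.Conv2d(l, 16, kernel_size=3), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.cls_spectral = make_classifier(8 * l, num_classes)
        self.cls_joint = make_classifier(8 * l + 16, num_classes)

    def forward(self, patches):
        # patches: (batch, l, n, n); the central pixel sits at (n // 2, n // 2).
        n = patches.shape[-1]
        center = patches[:, :, n // 2, n // 2].unsqueeze(1)   # (batch, 1, l)
        spec = self.spectral(center).flatten(1)               # (batch, 8 * l)
        score = self.gate(spec).squeeze(1)                    # (batch,)
        needs_spatial = score >= self.threshold
        probs = patches.new_zeros(patches.shape[0], self.num_classes)
        if (~needs_spatial).any():
            # Early prediction from spectral features alone.
            probs[~needs_spatial] = self.cls_spectral(spec[~needs_spatial])
        if needs_spatial.any():
            spat = self.spatial(patches[needs_spatial]).flatten(1)
            joint = torch.cat([spec[needs_spatial], spat], dim=1)
            probs[needs_spatial] = self.cls_joint(joint)
        return probs, score


torch.manual_seed(0)
model = GatedSpectralSpatialNet().eval()
with torch.no_grad():
    probs, score = model(torch.randn(4, 20, 9, 9))   # four random 9 x 9 x 20 samples
print(probs.shape, score.shape)
```

Only the samples routed to the second branch pay for the two-dimensional convolution, which is where the saving in test-time computation would come from. For training, the text specifies Adam with a learning rate of 0.001, which in PyTorch corresponds to `torch.optim.Adam(model.parameters(), lr=0.001)`.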

The specific embodiment is as follows:

1. Conditions of the experiment

The embodiment is implemented in Python with the PyTorch library and runs on an Intel i7-10700F CPU and an Nvidia GTX 1660 SUPER GPU under the Windows operating system.

The data used in the experiments are public data sets: Indian Pines, Pavia University and Salinas Valley. Indian Pines and Salinas Valley were captured by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor, and Pavia University was captured by the Reflective Optics System Imaging Spectrometer (ROSIS) sensor. Indian Pines contains 145 × 145 pixels, 200 spectral bands and 16 ground-object classes; Pavia University contains 610 × 340 pixels, 103 spectral bands and 9 classes; Salinas Valley contains 512 × 217 pixels, 204 spectral bands and 16 classes.
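
To give a sense of scale, the scene dimensions quoted above can be turned into pixel counts and band-reduction ratios. The sketch below is illustrative only: the helper name `summarize` is hypothetical and l = 20 is simply one value inside the preferred range of 15 to 30.

```python
# Scene dimensions quoted in the text: (rows, columns, spectral bands, classes).
DATASETS = {
    "Indian Pines": (145, 145, 200, 16),
    "Pavia University": (610, 340, 103, 9),
    "Salinas Valley": (512, 217, 204, 16),
}


def summarize(l=20):
    """Return the pixel count and band-reduction ratio per scene for l PCA components."""
    summary = {}
    for name, (rows, cols, bands, _classes) in DATASETS.items():
        summary[name] = {"pixels": rows * cols, "band_ratio": bands / l}
    return summary


for name, info in summarize().items():
    print(name, info["pixels"], info["band_ratio"])
```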

The comparison algorithms are the 3D CNN proposed by Chen et al. in "Y. Chen, H. Jiang, C. Li, X. Jia, and P. Ghamisi, Deep feature extraction and classification of hyperspectral images based on convolutional neural networks, IEEE Trans. Geosci. Remote Sens., vol. 54, no. 10, pp. 6232-6251, 2016"; the SSRN proposed by Zhong et al. in "Z. Zhong, J. Li, Z. Luo, and M. Chapman, Spectral-spatial residual network for hyperspectral image classification: A 3-D deep learning framework, IEEE Trans. Geosci. Remote Sens., vol. 56, no. 2, pp. 847-858, 2017"; and the HybridSN proposed by Roy et al. in "S. K. Roy, G. Krishna, S. R. Dubey, and B. B. Chaudhuri, HybridSN: Exploring 3-D-2-D CNN feature hierarchy for hyperspectral image classification, IEEE Geosci. Remote Sens. Lett., vol. 17, no. 2, pp. 277-281, 2019".

2. Content of the experiment

Following the steps given in the detailed description, a hyperspectral image classification model is trained on the training set. The classification results are shown in FIG. 2, and the classification effect is good. In each pair of maps, the left map is the ground truth and the right map is the classification result.

The effectiveness of the algorithm is demonstrated by the following comparative experiments. First, the early prediction rate is observed, that is, the proportion of samples whose classification result is obtained directly from the spectral features. Then the spectral feature classifier is removed from the proposed model to obtain an ordinary spectral-spatial feature extraction and classification network, and a comparison test is carried out. The results are shown in FIG. 3. They show that the proposed model achieves a high early prediction rate on every data set and attains higher accuracy with less computation and less test time.
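
The early prediction rate defined above is simply the fraction of samples whose gate score falls below the threshold. A minimal sketch, with a hypothetical function name and made-up scores:

```python
def early_prediction_rate(scores, threshold=0.5):
    """Fraction of samples classified from spectral features alone (score < threshold)."""
    if not scores:
        raise ValueError("scores must not be empty")
    early = sum(1 for s in scores if s < threshold)
    return early / len(scores)


scores = [0.1, 0.2, 0.7, 0.4, 0.9]        # made-up gate outputs for five samples
print(early_prediction_rate(scores))      # 0.6
```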

The results of the proposed method and of the comparison methods on the three data sets are shown in Tables 1-3. Compared with the comparison methods, the proposed method reaches an advanced level on the four evaluation metrics.

TABLE 1 Indian Pines classification results

TABLE 2 Pavia University classification results

TABLE 3 Salinas Valley classification results

Claims

1. A hyperspectral image classification method based on adaptive spatial spectral feature extraction is characterized by comprising the following steps:

2. The hyperspectral image classification method based on adaptive spatial spectral feature extraction according to claim 1 is characterized in that l is more than or equal to 15 and less than or equal to 30.

3. The hyperspectral image classification method based on adaptive spatial spectral feature extraction according to claim 1, wherein the threshold is 0.5.

4. The hyperspectral image classification method based on adaptive spatial spectral feature extraction according to claim 1, wherein the method for training the hyperspectral classification network comprises the following steps:

representing the prediction vector of the second classifier module for the i-th sample; the network parameters are updated through the loss functions L_cls and L_gate without setting additional hyper-parameters, to obtain the trained model.