CN111695467A - Spatial spectrum full convolution hyperspectral image classification method based on superpixel sample expansion - Google Patents

Spatial spectrum full convolution hyperspectral image classification method based on superpixel sample expansion

Info

Publication number
CN111695467A
Authority
CN
China
Prior art keywords
convolution
spatial
spectral
hyperspectral image
hyperspectral
Prior art date
Legal status
Granted
Application number
CN202010485713.7A
Other languages
Chinese (zh)
Other versions
CN111695467B (en)
Inventor
王佳宁
李林昊
郭思颖
黄润虎
杨攀泉
焦李成
侯彪
张向荣
毛莎莎
Current Assignee
Xidian University
Original Assignee
Xidian University
Application filed by Xidian University
Priority to CN202010485713.7A
Application granted; publication of CN111695467B
Legal status: Active

Classifications

    • G06V 20/13: Satellite images (terrestrial scenes)
    • G06F 18/2415: Classification techniques based on parametric or probabilistic models
    • G06N 3/045: Neural network architectures; combinations of networks
    • G06N 3/08: Neural network learning methods
    • G06V 10/464: Salient features using a plurality of salient features, e.g. bag-of-words representations
    • G06V 20/194: Terrestrial scenes using hyperspectral data
    • Y02A 40/10: Adaptation technologies in agriculture

Abstract

The invention discloses a spatial-spectral full convolution hyperspectral image classification method based on superpixel sample expansion, comprising the steps of: inputting a hyperspectral image; acquiring a training set and a test set; performing principal component analysis dimensionality reduction on the hyperspectral image; carrying out entropy rate segmentation on the dimensionality-reduced result; generating pseudo-label samples; updating the training set; preprocessing the hyperspectral data; inputting it into a convolutional neural network; training the convolutional neural network and classifying the hyperspectral image; repeating these operations and voting; and outputting the hyperspectral classification result. The method expands pseudo-label samples using the entropy rate superpixel segmentation result, makes full use of the prior characteristics of the hyperspectral image, and increases the number of samples, thereby alleviating network overfitting and slow network convergence, and improving the accuracy of hyperspectral image classification when labeled samples are scarce.

Description

Spatial spectrum full convolution hyperspectral image classification method based on superpixel sample expansion
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a spatial spectrum full convolution hyperspectral image classification method based on superpixel sample expansion.
Background
With the progress of science and technology, hyperspectral remote sensing has developed greatly. Hyperspectral data can be represented as a hyperspectral data cube, a three-dimensional data structure: a three-dimensional image that adds one spectral dimension to an ordinary two-dimensional image. The spatial image describes the two-dimensional spatial characteristics of the earth's surface, while the spectral dimension reveals the spectral curve of each pixel, realizing the organic fusion of the spatial and spectral information of remote sensing data. Hyperspectral remote sensing images contain abundant spectral information, provide both spatial-domain and spectral-domain information, and have the characteristic of integrating image and spectrum, enabling accurate identification and detailed extraction of ground objects and providing favorable conditions for understanding the objective world. Owing to these unique characteristics, hyperspectral remote sensing is widely applied in different fields. In the civil field, hyperspectral remote sensing images have been used in urban environment monitoring, surface soil monitoring, geological exploration, disaster assessment, agricultural yield estimation, crop analysis and the like, and the technology is widely present in people's daily lives. Therefore, designing a practical and efficient hyperspectral image classification method has become an indispensable scientific and technological requirement of modern society.
At present, researchers have proposed many classical methods for hyperspectral image classification; representative ones are the support vector machine (SVM) and the convolutional neural network (CNN). The SVM obtains good results in small-sample classification by maximizing the margin between classes. F. Melgani and L. Bruzzone introduced the SVM to hyperspectral image classification in "Classification of hyperspectral remote sensing images with support vector machines", obtaining the best classification results at the time; however, the SVM's kernel function must be chosen entirely by experience, and an improper kernel leads to poor classification performance. Meanwhile, with the rise of deep learning, convolutional neural networks have also been applied to the field of hyperspectral image classification. However, training a convolutional neural network requires a large number of labeled samples, and labeling hyperspectral images is very expensive, so solving the small-sample problem is currently a popular research direction.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method for classifying a hyperspectral image based on a spatial-spectral full-convolution of a superpixel sample expansion, so as to improve the classification effect of the hyperspectral image by using a segmentation result as prior information.
The invention adopts the following technical scheme:
the method for classifying the spatial spectrum full convolution hyperspectral image based on the super-pixel sample expansion comprises the following steps:
s1, inputting the hyperspectral image PaviaU, and acquiring the training sample X_t and the test sample X_e from it;
S2, carrying out normalization processing on the hyperspectral data set, and adding n labels which are in the neighborhood of each training sample from the segmentation label matrix and are the same as the segmentation labels of the training samples into the training samples as pseudo label samples;
s3, respectively constructing a spectral feature extraction module and a spatial-spectral feature extraction module, constructing a weighted fusion module for the spectral and spatial-spectral feature maps, passing the combined spatial-spectral features as input through two convolution layers, and constructing the spatial-spectral joint full convolution neural network for hyperspectral classification;
s4, constructing a loss function of the step S3 full convolution neural network, and training the neural network;
and S5, obtaining a final classification result graph through multiple training votes, and realizing image classification.
Specifically, step S1 specifically includes:
s101, recording the three-dimensional hyperspectral image PaviaU as X ∈ R^(U×V×C), where U, V and C are respectively the spatial length, spatial width and number of spectral channels of the hyperspectral image; the hyperspectral image comprises N pixel points, each with C spectral bands, and N = U × V;
s102, randomly taking 30 samples from each of class labels 1 to 9 in X to form the initial training sample set X_t, with the rest used as the test sample set X_e.
Specifically, step S2 specifically includes:
s201, carrying out PCA (principal component analysis) dimensionality reduction on the three-dimensional hyperspectral image, wherein the number of channels of the image after dimensionality reduction is 1;
s202, carrying out entropy rate superpixel segmentation on the PCA-reduced image to obtain 50 segmented regions, giving a segmentation label matrix S ∈ R^(U×V);
S203, let the real label matrix be Y ∈ R^(U×V) and the segmentation label matrix be S ∈ R^(U×V). A training sample at (x0, y0) has real label Y(x0, y0) and segmentation label S(x0, y0) in the segmentation map. For each pixel (x, y) in the 7 × 7 window centered on (x0, y0) satisfying S(x, y) = S(x0, y0), generate a pseudo label Y(x, y) = Y(x0, y0). The qualifying pseudo-label samples are added to the training set, so the number of training samples becomes n + 1 times the original, while the test samples are kept unchanged.
Specifically, step S3 specifically includes:
s301, constructing a spectral feature extraction module, wherein the spectral feature extraction module comprises three convolution layers and a merging layer, and a relu activation function and batch normalization processing are added behind each convolution layer;
s302, constructing a spatial-spectral feature extraction module comprising a 1 × 1 convolution layer, a relu activation layer, a batch normalization layer, a multi-scale spatial feature fusion layer, a 3 × 3 dilated convolution layer, a relu activation layer, a batch normalization layer, a 2 × 2 average pooling layer and a merging layer;
s303, constructing a spectral feature map and spatial spectrum feature map weighting fusion module;
s304, passing the combined spatial-spectral features as input through the two convolution layers;
s305, carrying out PCA dimension reduction on the convolved feature map down to 5 dimensions for use in the subsequent CRF processing;
and S306, performing a Softmax operation on the convolved feature map to output a classification probability matrix, and outputting, for each pixel, the class with the largest probability value as the predicted class label to obtain the classification result.
Further, in step S301, the batch normalization momentum is 0.8; the convolution kernel sizes are all 1 with step length 1, the number of channels after each convolution is 64, and after three successive convolutions the three convolution results are added to obtain the spectral feature map.
Further, in step S302, the first convolutional layer uses 1 × 1 convolution with a step size of 1; the 3 × 3 dilated convolution has a dilation rate of 2 and a step size of 1; the number of channels of all convolution results is 64; all batch normalization momenta are 0.8; the merging layer is the sum of the feature maps of the three convolution layers, and the number of channels is kept at 64.
Further, in step S303, the spectral feature map and the spatial feature map are weighted and superimposed as follows:
C_unite = λ_spectral · C_spectral + λ_spatial · C_spatial

wherein C_unite is the weighted feature map, λ_spectral and λ_spatial are respectively the trainable weighting coefficients of the spectral and spatial features in the network, and C_spectral and C_spatial are respectively the spectral feature map and the spatial feature map.
Specifically, in step S4, the loss function is:
L = L1 + L2

L_j = −Σ_i y_i^(j) · log(ŷ_i^(j)),  j = 1, 2

where L is the final loss function, L1 and L2 are respectively the cross-entropy losses over the labeled samples and the pseudo-label samples in the training set, y_i^(j) and ŷ_i^(j) denote the label and the predicted label of the ith training sample, and j taking 1 or 2 indicates that the sample is an original sample or a pseudo-label sample.
Specifically, step S5 specifically includes:
s501, inputting into a conditional random field both the feature map reduced to 5 dimensions by PCA in step S3 and the classification result obtained in step S4 by feeding the normalized hyperspectral data into the network;
s502, obtaining a classification result of primary training through a conditional random field;
s503, repeating the pseudo sample expansion and network training operations for m times on the same training sample to obtain m classification results, and outputting the prediction class label with the largest occurrence frequency of each pixel as a final prediction class label.
Further, in step S501, the energy function of the conditional random field is as follows:
E(y) = Σ_i ψ_u(y_i) + Σ_(i<j) ψ_p(y_i, y_j)

wherein ψ_u(y_i) and ψ_p(y_i, y_j) are respectively the unary potential part and the pairwise potential part.
Compared with the prior art, the invention has at least the following beneficial effects:
According to the spatial-spectral full-convolution hyperspectral image classification method based on superpixel sample expansion, the segmentation result of the hyperspectral image guides the generation of pseudo-label samples, and the prior information of the hyperspectral image is effectively used to expand the training set, so that good classification accuracy is maintained even with few training samples. Features are extracted in a spatial-spectral joint manner, so the spectral and spatial-spectral features of the hyperspectral image are extracted more fully and the classification accuracy improves. The spatial feature extraction module uses dilated convolutions with different dilation rates to realize multi-scale feature fusion, extracting spatial features of the hyperspectral image at multiple scales. A voter added before the final classification result strengthens the robustness of the whole structure, making the classification result more stable and reliable. A full convolution neural network without fully connected layers is adopted, enabling end-to-end classification of hyperspectral images of any size. The network input is preprocessed hyperspectral data of the same size as the original image, avoiding the highly redundant training data caused by existing methods that use a patch around each pixel as input.
Furthermore, the entropy rate superpixel segmentation provides the segmentation labels, effectively utilizing the prior information of the hyperspectral image: samples similar to the training samples are added to the training set without needing classification labels, effectively easing the scarcity of labeled training samples for hyperspectral images.
Furthermore, the spatial-spectral joint full convolution neural network constructed for hyperspectral classification has no fully connected layer, so it can very conveniently accept a hyperspectral image of any size as input. The spatial-spectral joint mode combines the spectral and spatial information in the hyperspectral image into a new feature that works better than a single spatial or spectral feature.
Further, the spectral module uses successive 1 × 1 convolution layers to extract spectral features while influencing the spatial information as little as possible, and the residual module preserves gradient information so that the model converges better.
Furthermore, the spatial-spectral module uses dilated convolutions with different dilation rates, which expands the receptive field while extracting multi-scale spatial information.
Furthermore, the loss function of the full convolution neural network is adapted to the proposed pseudo-label sample expansion method: the cross entropy of the pseudo-label samples is also added into the loss function, so that the network converges better.
Furthermore, to counter the instability caused by the inherent limitations of the proposed pseudo-label sample expansion method, the final classification map is obtained by training and voting multiple times, which effectively increases the robustness of the model.
In conclusion, the spatial-spectral full-convolution hyperspectral image classification method based on superpixel sample expansion effectively utilizes the prior information of the hyperspectral image to realize pseudo-sample expansion, easing the scarcity of labeled samples in hyperspectral images, while the spatial-spectral full convolution network fully utilizes multi-scale spatial features and spectral features to achieve higher classification precision.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is a block diagram of a flow chart of an implementation of pseudo tag exemplar augmentation of the present invention;
FIG. 3 is a multi-scale spatial feature fusion module in accordance with the present invention.
Detailed Description
The invention provides a spatial-spectral full convolution hyperspectral image classification method based on superpixel sample expansion: inputting a hyperspectral image; acquiring a training set and a test set; performing principal component analysis dimensionality reduction on the hyperspectral image; performing entropy rate superpixel segmentation on the dimensionality-reduced result; generating pseudo-label samples; updating the training set; preprocessing the hyperspectral data; inputting it into a full convolution neural network; training the full convolution neural network and classifying the hyperspectral image; repeating these operations and voting; and outputting the hyperspectral classification result. The method expands pseudo-label samples using the entropy rate superpixel segmentation result, makes full use of the spatial prior information of the hyperspectral image, increases the number of samples, alleviates network overfitting, and effectively improves the accuracy, efficiency and performance of hyperspectral image classification under small-sample conditions.
Referring to fig. 1, the present invention provides a method for classifying a spatial spectrum full convolution hyperspectral image based on superpixel sample expansion, which comprises the following steps:
s1, inputting the hyperspectral image PaviaU (the three-dimensional hyperspectral data used in the experiments), and acquiring the training sample set X_t and the test sample set X_e from it;
S101, recording a three-dimensional hyperspectral image PaviaU as
Figure BDA0002519100660000071
The hyperspectral image comprises N pixel points, each pixel point is provided with C spectral bands, N is equal to U × V.PaviaU data set, N is equal to 207400 samples, U is equal to 610, V is equal to 340, C is equal to 103, the class labels are 1 to 9, the samples are 42776 in total, X is normalized, and the data value is kept in [0, 1]The method comprises the following specific steps:
Figure BDA0002519100660000072
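The normalization step can be sketched as follows; the use of a single global min/max rather than per-band values is an assumption.

```python
import numpy as np

def minmax_normalize(X):
    """Min-max normalization x' = (x - x_min) / (x_max - x_min), mapping
    the hyperspectral cube into [0, 1].  A single global min/max is
    assumed here; per-band normalization is an equally plausible reading."""
    X = X.astype(np.float64)
    x_min, x_max = X.min(), X.max()
    return (X - x_min) / (x_max - x_min)
```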
s102, randomly taking 30 samples from each of class labels 1 to 9 in X to form the initial training sample set X_t, with the rest used as the test sample set X_e.
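The 30-per-class random sampling of s102 can be sketched as below; the function and parameter names are illustrative, not from the patent.

```python
import numpy as np

def sample_training_pixels(labels, per_class=30, classes=range(1, 10), seed=0):
    """Randomly draw `per_class` labeled pixels from each class (1..9 for
    PaviaU) as the training set; all remaining labeled pixels become the
    test set.  `labels` is the U x V ground-truth map, 0 = unlabeled."""
    rng = np.random.default_rng(seed)
    train_idx = []
    for c in classes:
        idx = np.flatnonzero(labels.ravel() == c)
        train_idx.extend(rng.choice(idx, size=per_class, replace=False))
    train_idx = np.array(train_idx)
    all_labeled = np.flatnonzero(labels.ravel() > 0)
    test_idx = np.setdiff1d(all_labeled, train_idx)
    return train_idx, test_idx
```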
S2, after normalizing the hyperspectral data set, generating segmentation labels with entropy rate superpixels and using them to generate pseudo-label samples;
s201, carrying out PCA (principal component analysis) dimensionality reduction on the three-dimensional hyperspectral image, wherein the number of channels of the image after dimensionality reduction is 1;
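As an illustration of s201, a numpy-only PCA sketch that keeps only the first principal component (a library PCA such as scikit-learn's would serve equally well):

```python
import numpy as np

def pca_first_component(cube):
    """Project a U x V x C hyperspectral cube onto its first principal
    component, producing the single-channel image on which the entropy
    rate superpixel segmentation is then run."""
    U, V, C = cube.shape
    flat = cube.reshape(-1, C).astype(np.float64)
    flat -= flat.mean(axis=0)                 # center each spectral band
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return (flat @ vt[0]).reshape(U, V)       # scores on the first axis
```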
s202, carrying out entropy rate superpixel segmentation on the PCA-reduced image to obtain 50 segmented regions, giving a segmentation label matrix S ∈ R^(U×V);
S203, let the real label matrix be Y ∈ R^(U×V) and the segmentation label matrix be S ∈ R^(U×V). A training sample at (x0, y0) has real label Y(x0, y0) and segmentation label S(x0, y0) in the segmentation map. For each pixel (x, y) in the 7 × 7 window centered on (x0, y0) satisfying S(x, y) = S(x0, y0), a pseudo label Y(x, y) = Y(x0, y0) is generated for it. The pseudo-label samples meeting the above criteria are added to the training set; the number of training samples becomes n + 1 times the original, and the test samples are kept unchanged, as shown in fig. 2.
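The pseudo-label expansion described above can be sketched as follows; the helper is illustrative and assumes the 7 × 7 window is centered on the training pixel.

```python
import numpy as np

def expand_pseudo_labels(true_labels, seg_labels, train_mask, window=7):
    """For every training pixel (x0, y0), copy its true label to the
    non-training pixels in the window x window neighborhood that fall in
    the same entropy-rate superpixel (same segmentation label).  Returns a
    label map with the original training labels plus pseudo labels."""
    U, V = true_labels.shape
    r = window // 2
    out = np.where(train_mask, true_labels, 0)
    for x0, y0 in zip(*np.nonzero(train_mask)):
        for x in range(max(0, x0 - r), min(U, x0 + r + 1)):
            for y in range(max(0, y0 - r), min(V, y0 + r + 1)):
                if not train_mask[x, y] and seg_labels[x, y] == seg_labels[x0, y0]:
                    out[x, y] = true_labels[x0, y0]
    return out
```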
S3, constructing a full convolution neural network for hyperspectral classification and space spectrum combination;
s301, the spectral feature extraction module is composed of three convolution layers, and a relu activation function and batch normalization processing are added behind each convolution layer.
The batch normalization momentum is 0.8; the convolution kernel sizes are all 1 with step length 1, the number of channels after each convolution is 64, and after three successive convolutions the three convolution results are added to obtain the spectral feature map.
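A hedged PyTorch sketch of the spectral branch; the framework choice is an assumption (the patent names none), and since the stated batch-norm momentum of 0.8 matches the Keras convention, PyTorch's complementary `momentum=0.2` is used below.

```python
import torch
import torch.nn as nn

class SpectralModule(nn.Module):
    """Sketch of the spectral branch: three successive 1x1 convolutions
    (stride 1, 64 output channels each), each followed by relu and batch
    normalization, with the three convolution outputs summed by the
    merge layer."""

    def __init__(self, in_channels, width=64):
        super().__init__()

        def block(cin):
            return nn.Sequential(
                nn.Conv2d(cin, width, kernel_size=1, stride=1),
                nn.ReLU(),
                nn.BatchNorm2d(width, momentum=0.2),  # Keras-style 0.8
            )

        self.b1 = block(in_channels)
        self.b2 = block(width)
        self.b3 = block(width)

    def forward(self, x):
        f1 = self.b1(x)
        f2 = self.b2(f1)
        f3 = self.b3(f2)
        return f1 + f2 + f3  # merge layer: element-wise sum
```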
S302, the spatial-spectral feature extraction module consists of three convolution layers. The first convolution layer has kernel size 1 and step size 1. The second convolution layer realizes multi-scale feature fusion with the structure shown in fig. 3: three dilated convolutions with kernel size 3, dilation rates 2, 3 and 4, and step size 1 are applied and their results added. The third convolution layer has kernel size 3, dilation rate 2 and step size 1, and is followed by a 2 × 2 average pooling layer.
After each convolution, a relu activation function and batch normalization are applied.
All batch normalization momenta are 0.8; the number of channels of all convolution results is 64, and after three successive convolutions the convolution results are added to obtain the spatial-spectral feature map.
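The multi-scale fusion layer of S302 might look as follows in PyTorch (an illustrative sketch; the framework choice and the same-size padding are assumptions):

```python
import torch
import torch.nn as nn

class MultiScaleSpatialFusion(nn.Module):
    """Sketch of the multi-scale layer of the spatial-spectral branch:
    three parallel 3x3 dilated convolutions with dilation rates 2, 3 and 4
    (stride 1, padding chosen to keep the spatial size), whose outputs
    are summed as described for fig. 3."""

    def __init__(self, channels=64):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, stride=1, padding=d, dilation=d)
            for d in (2, 3, 4)
        )

    def forward(self, x):
        return sum(branch(x) for branch in self.branches)
```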
S303, weighting and superposing the spectral feature map and the spatial-spectral feature map as follows:

C_unite = λ_spectral · C_spectral + λ_spatial · C_spatial

wherein C_unite is the weighted feature map with 64 channels, λ_spectral and λ_spatial are respectively the trainable weighting coefficients of the spectral and spatial-spectral features in the network, and C_spectral and C_spatial are respectively the spectral feature map and the spatial-spectral feature map;
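A minimal sketch of the trainable weighted fusion; the initial value 0.5 for the two λ's is an assumption.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """Sketch of C_unite = lam_spectral * C_spectral + lam_spatial * C_spatial
    with the two lambdas as trainable scalar parameters."""

    def __init__(self):
        super().__init__()
        self.lam_spectral = nn.Parameter(torch.tensor(0.5))
        self.lam_spatial = nn.Parameter(torch.tensor(0.5))

    def forward(self, c_spectral, c_spatial):
        return self.lam_spectral * c_spectral + self.lam_spatial * c_spatial
```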
s304, taking the characteristics of the combination of the space spectrum as input, passing through two 1 × 1 convolution layers, performing relu activation after convolution, wherein the sizes of convolution kernels are all 1, the step lengths are all 1, the number of channels of a first convolution result is 64, and the number of channels of a second convolution result is 128;
s305, carrying out PCA dimension reduction on the feature map after convolution to 5 dimensions for use in subsequent CRF treatment;
s306, performing a Softmax operation on the convolved feature map to output a 610 × 340 × 9 classification probability matrix; for each pixel, the index of the largest value among the 9 dimensions is output as the predicted class label, giving a classification result of size 610 × 340.
S4, constructing a loss function of the full convolution neural network, and training the neural network;
s401, the loss function uses cross entropy between the predicted labels and the labels of the expanded training samples, as shown in the following formula:
L = L1 + L2

L_j = −Σ_i y_i^(j) · log(ŷ_i^(j)),  j = 1, 2

where L is the final loss function, L1 and L2 are respectively the cross-entropy losses over the labeled samples and the pseudo-label samples in the training set, y_i^(j) and ŷ_i^(j) denote the label and the predicted label of the ith training sample, and j taking 1 or 2 indicates that the sample is an original sample or a pseudo-label sample;
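The loss L = L1 + L2 can be sketched as below (an illustrative PyTorch version; the per-term mean reduction is an assumption):

```python
import torch
import torch.nn.functional as F

def combined_loss(logits, labels, is_pseudo):
    """Sketch of L = L1 + L2: cross entropy over the original labeled
    samples (j = 1) plus cross entropy over the pseudo-label samples
    (j = 2).  logits: (N, n_classes); labels: (N,); is_pseudo: boolean
    mask marking the pseudo-label samples."""
    l1 = F.cross_entropy(logits[~is_pseudo], labels[~is_pseudo])
    l2 = F.cross_entropy(logits[is_pseudo], labels[is_pseudo])
    return l1 + l2
```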
s402, inputting the normalized hyperspectral data into a network, and iterating 1000 times to generate a predicted label graph.
And S5, obtaining a final classification result graph through multiple training votes.
S501, adding the outputs of step S305 and step S402 into a conditional random field, whose energy function is as follows:

E(y) = Σ_i ψ_u(y_i) + Σ_(i<j) ψ_p(y_i, y_j)

wherein ψ_u(y_i) and ψ_p(y_i, y_j) are respectively the unary potential part and the pairwise potential part.
In the present invention, the unary part is calculated as ψ_u(y_i) = −log P(y_i), where P(y_i) is the label assignment probability of pixel i given by the proposed full convolution network.
The pairwise potential is defined as:

ψ_p(y_i, y_j) = μ(y_i, y_j) · Σ_m ω_m · k_m(f_i, f_j)

wherein μ(y_i, y_j) = 1 if y_i ≠ y_j and zero otherwise (the Potts model); k_m is a Gaussian kernel; f_i and f_j are the feature vectors of pixels i and j in an arbitrary feature space; and ω_m is the corresponding weight. In order to fully utilize the deep spectral-spatial features, the first five principal components of C_unite from step S305 are used as the feature of each pixel. The complete form of the Gaussian kernel is written as:

k(f_i, f_j) = ω_1 · exp(−|p_i − p_j|² / (2θ_α²) − |I_i − I_j|² / (2θ_β²)) + ω_2 · exp(−|p_i − p_j|² / (2θ_γ²))

where p_i denotes the position of pixel i, I_i its feature vector, and θ_α, θ_β, θ_γ are the kernel bandwidths.
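The two-part Gaussian kernel can be sketched numerically as follows; the bandwidths θ and weights ω below are placeholders, not values from the patent.

```python
import numpy as np

def pairwise_kernel(p_i, p_j, f_i, f_j, w1=1.0, w2=1.0,
                    theta_alpha=3.0, theta_beta=3.0, theta_gamma=3.0):
    """Sketch of the two-part Gaussian kernel of the fully connected CRF:
    an appearance kernel over positions p and features f (here the five
    principal components of C_unite) plus a smoothness kernel over
    positions only."""
    d_pos = np.sum((p_i - p_j) ** 2)
    d_feat = np.sum((f_i - f_j) ** 2)
    appearance = w1 * np.exp(-d_pos / (2 * theta_alpha**2)
                             - d_feat / (2 * theta_beta**2))
    smoothness = w2 * np.exp(-d_pos / (2 * theta_gamma**2))
    return appearance + smoothness
```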
s502, obtaining a classification result of primary training through a conditional random field;
and S503, repeating the above operations m times on the same training sample to obtain m classification results, and outputting, for each pixel, the predicted class label with the highest frequency of occurrence as its final predicted class label.
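The per-pixel majority vote of S503 can be sketched as:

```python
import numpy as np

def majority_vote(label_maps):
    """Fuse the m classification maps from repeated training runs by a
    per-pixel majority vote: each pixel's final label is the prediction
    that occurred most often across the m runs."""
    stack = np.stack(label_maps)                    # shape (m, U, V)

    def vote(col):
        vals, counts = np.unique(col, return_counts=True)
        return vals[np.argmax(counts)]

    return np.apply_along_axis(vote, 0, stack)
```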
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The three evaluation indexes of the simulation experiment are as follows:
the overall accuracy OA represents the proportion of correctly classified samples to all samples, with a larger value indicating a better classification. The average precision AA represents the average value of the classification precision of each class, and the larger the value is, the better the classification effect is. The chi-square coefficient Kappa represents different weights in the confusion matrix, and the larger the value is, the better the classification effect is.
The prior art contrast classification method used in the invention is as follows:
song et al, in "Hyperspectral Image Classification With Deep feature fusion Network, IEEE trans. Geosci. remote Sens., vol.56, No.6, pp.3173-3184, June 2018", propose a Hyperspectral Image Classification method, depth feature fusion DFFN method for short.
Table 1 is a quantitative analysis of the classification results of the present invention (on the PaviaU data set, with 30 labeled samples per class as the training set):

[Table 1: quantitative classification results; the table images are not reproduced here]
in conclusion, the spatial-spectral full-convolution hyperspectral image classification method based on superpixel sample expansion effectively utilizes the prior information of the hyperspectral image to realize pseudo-sample expansion, alleviating the scarcity of labeled samples in hyperspectral images, and at the same time achieves higher classification accuracy by exploiting multi-scale spatial features and spectral features in the spatial-spectral full convolution network.
The above content merely illustrates the technical idea of the present invention and does not thereby limit its protection scope; any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (10)

1. The method for classifying the spatial spectrum full convolution hyperspectral image based on the super-pixel sample expansion is characterized by comprising the following steps of:
S1, inputting the hyperspectral image PaviaU, and acquiring the training sample set X_t and the test sample set X_e from the hyperspectral image PaviaU;
S2, carrying out normalization processing on the hyperspectral data set, and adding n labels which are in the neighborhood of each training sample from the segmentation label matrix and are the same as the segmentation labels of the training samples into the training samples as pseudo label samples;
S3, respectively constructing a spectral feature extraction module and a spatial-spectral feature extraction module, constructing a weighted fusion module for the spectral feature map and the spatial-spectral feature map, and passing the combined spatial-spectral features as input through two convolution layers, thereby constructing a spatial-spectral joint full convolution neural network for hyperspectral classification;
s4, constructing a loss function of the step S3 full convolution neural network, and training the neural network;
and S5, obtaining a final classification result graph through multiple training votes, and realizing image classification.
2. The method for classifying the spatial-spectral full-convolution hyperspectral image based on the super-pixel sample expansion according to claim 1, wherein the step S1 is specifically as follows:
S101, recording the three-dimensional hyperspectral image PaviaU as X ∈ R^(U×V×C), wherein U, V and C are respectively the spatial length, the spatial width and the number of spectral channels of the hyperspectral image; the hyperspectral image comprises N pixel points, each pixel point has C spectral bands, and N = U × V;
S102, randomly taking 30 samples of each of the class labels 1 to 9 in X to form the initial training sample set X_t, with the remaining labeled samples used as the test sample set X_e.
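The per-class random sampling of step S102 can be sketched as follows (an illustration under stated assumptions: `split_per_class` is a hypothetical name, and label 0 is assumed to mark unlabeled background as in the PaviaU ground truth):

```python
import numpy as np

def split_per_class(labels, n_train=30, seed=0):
    """Randomly take n_train pixels of every class as the training set.

    labels: 1-D array of class ids; 0 is treated as unlabeled background.
    Returns index arrays (train_idx, test_idx) over the labeled pixels.
    """
    rng = np.random.default_rng(seed)
    train, test = [], []
    for c in np.unique(labels):
        if c == 0:                      # skip the background class
            continue
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)                # random order within the class
        train.append(idx[:n_train])
        test.append(idx[n_train:])
    return np.concatenate(train), np.concatenate(test)
```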
3. The method for classifying the spatial-spectral full-convolution hyperspectral image based on the super-pixel sample expansion according to claim 1, wherein the step S2 is specifically as follows:
s201, carrying out PCA (principal component analysis) dimensionality reduction on the three-dimensional hyperspectral image, wherein the number of channels of the image after dimensionality reduction is 1;
S202, carrying out entropy-rate superpixel segmentation on the image after PCA dimensionality reduction to obtain 50 superpixels, yielding a segmentation label matrix S ∈ R^(U×V);
S203, setting the real label matrix as Y ∈ R^(U×V) and the segmentation label matrix as S; for a training sample located at (x0, y0), its real label is Y(x0, y0) and its segmentation label in the segmentation map is S(x0, y0); within the 7 × 7 neighborhood centered on the training sample, each pixel (x, y) whose segmentation label satisfies S(x, y) = S(x0, y0) is assigned the pseudo label Y(x0, y0); the pseudo-label samples meeting this criterion are added to the training set, so that the number of training samples becomes n + 1 times the original number, while the test samples remain unchanged.
4. The method for classifying the spatial-spectral full-convolution hyperspectral image based on the super-pixel sample expansion according to claim 1, wherein the step S3 is specifically as follows:
S301, constructing a spectral feature extraction module, which comprises three convolution layers and a merging layer, with a ReLU activation function and batch normalization applied after each convolution layer;
S302, constructing a spatial-spectral feature extraction module, which comprises a 1 × 1 convolution layer, a ReLU activation layer, a batch normalization layer, a multi-scale spatial feature fusion layer, a 3 × 3 dilated convolution layer, a ReLU activation layer, a batch normalization layer, a 2 × 2 average pooling layer and a merging layer;
s303, constructing a spectral feature map and spatial spectrum feature map weighting fusion module;
S304, passing the combined spatial-spectral features as input through the two convolution layers;
S305, reducing the convolved feature map to 5 dimensions by PCA for subsequent conditional random field (CRF) processing;
and S306, performing Softmax operation on the convolved feature map to output a classification probability matrix, and outputting the dimension number with the largest value in the classification probability matrix as a prediction class label to obtain a classification result.
5. The method for classifying the spatial-spectral full-convolution hyperspectral image based on super-pixel sample expansion according to claim 4, wherein in the step S301, the batch normalization parameters are as follows: momentum = 0.8; the convolution kernel sizes are all 1 × 1 with a step size of 1, and the number of channels after every convolution is 64; after three consecutive convolutions, the convolution results are added to obtain the spectral feature map.
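A minimal sketch of the spectral feature extraction of claim 5 (batch normalization is omitted for brevity, plain NumPy stands in for a deep-learning framework, and all names are hypothetical):

```python
import numpy as np

def conv1x1_relu(x, w, b):
    """A 1 x 1 convolution is a per-pixel linear map over the channel axis."""
    return np.maximum(x @ w + b, 0.0)   # ReLU activation folded in

def spectral_module(x, params):
    """Three consecutive 1 x 1 convolutions with 64 output channels each;
    the merging layer sums the three convolution results (claim 5)."""
    outs, h = [], x
    for w, b in params:
        h = conv1x1_relu(h, w, b)
        outs.append(h)
    return sum(outs)                    # element-wise sum of the three maps
```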
6. The method for classifying the spatial-spectral full-convolution hyperspectral image based on super-pixel sample expansion according to claim 4, wherein in the step S302, the first convolution layer uses a 1 × 1 convolution with a step size of 1; the 3 × 3 dilated convolution has a dilation rate of 2 and a step size of 1; the number of channels of all convolution results is 64; all batch normalization parameters are momentum = 0.8; the merging layer is the element-wise sum of the feature maps of the three convolution layers, and the number of channels is kept at 64.
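The 3 × 3 dilated convolution of claim 6 can be illustrated on a single-channel map as follows (a sketch only; a real implementation would use a framework primitive such as a dilated `Conv2d`):

```python
import numpy as np

def dilated_conv3x3(x, w, rate=2):
    """3 x 3 dilated (atrous) convolution with 'same' zero padding.

    x: (H, W) single-channel map; w: (3, 3) kernel; rate: dilation rate.
    With rate=2 the receptive field is 5 x 5 while only 9 weights are used.
    """
    H, W = x.shape
    xp = np.pad(x, rate)                # zero padding keeps the 'same' size
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            out += w[i, j] * xp[i * rate:i * rate + H, j * rate:j * rate + W]
    return out
```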
7. The method for classifying the spatial-spectral full-convolution hyperspectral images based on the super-pixel sample expansion according to claim 4, wherein in the step S303, the spectral feature map and the spatial-spectral feature map are weighted and superimposed as follows:
C_unite = λ_spectral C_spectral + λ_spatial C_spatial
wherein C_unite is the weighted feature map, λ_spectral and λ_spatial are respectively the trainable weighting coefficients of the spectral and spatial features in the network, and C_spectral and C_spatial are respectively the spectral feature map and the spatial feature map.
8. The method for classifying the spatial-spectral full-convolution hyperspectral image based on super-pixel sample expansion according to claim 1, wherein in step S4, the loss function is:
L = L_1 + L_2
L_j = -Σ_i y_j^(i) log ŷ_j^(i), j = 1, 2
wherein L is the final loss function, L_1 and L_2 are respectively the losses over the labeled samples and the pseudo-label samples in the training set, y_j^(i) and ŷ_j^(i) respectively denote the label and the predicted label of the ith training sample, and j taking 1 or 2 indicates that the sample is an original sample or a pseudo-label sample.
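Assuming the two loss terms are standard categorical cross-entropies over the original and pseudo-label samples (an assumption for illustration; the claim only specifies that the final loss is the sum of the two terms), the loss can be sketched as:

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean categorical cross-entropy; probs: (N, K) softmax outputs."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def total_loss(probs_orig, y_orig, probs_pseudo, y_pseudo):
    """L = L1 + L2: the same loss on the original labeled samples (L1)
    and on the expanded pseudo-label samples (L2)."""
    return cross_entropy(probs_orig, y_orig) + cross_entropy(probs_pseudo, y_pseudo)
```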
9. The method for classifying the spatial-spectral full-convolution hyperspectral image based on the super-pixel sample expansion according to claim 1, wherein the step S5 is specifically as follows:
S501, inputting, into a conditional random field, the feature map obtained in step S3 after PCA dimensionality reduction to 5 dimensions, together with the classification result obtained in step S4 by feeding the normalized hyperspectral data into the network;
s502, obtaining a classification result of primary training through a conditional random field;
s503, repeating the pseudo sample expansion and network training operations for m times on the same training sample to obtain m classification results, and outputting the prediction class label with the largest occurrence frequency of each pixel as a final prediction class label.
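The multi-run voting of step S503 can be sketched as a pixel-wise majority vote (the function name is hypothetical):

```python
import numpy as np

def majority_vote(pred_maps):
    """Pixel-wise majority vote over m classification result maps.

    pred_maps: (m, H, W) integer class-label maps from m training runs.
    Returns the (H, W) map of the most frequently predicted label per pixel.
    """
    preds = np.asarray(pred_maps)
    n_classes = preds.max() + 1
    # count the votes each class receives at every pixel, then take arg-max
    votes = np.stack([(preds == c).sum(axis=0) for c in range(n_classes)])
    return votes.argmax(axis=0)
```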
10. The method for classifying the spatial-spectral full-convolution hyperspectral image based on super-pixel sample expansion according to claim 9, wherein in step S501, an energy function of the conditional random field is as follows:
E(y) = Σ_i ψ_u(y_i) + Σ_(i<j) ψ_p(y_i, y_j)
wherein ψ_u(y_i) and ψ_p(y_i, y_j) are respectively the unary potential part and the pairwise (binary) potential part.
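The energy of claim 10 can be evaluated as below (a sketch; a Potts-style pairwise term is assumed for illustration, whereas a dense CRF would use appearance- and position-dependent kernels):

```python
import numpy as np

def crf_energy(labels, unary, pairs, mu=1.0):
    """E(y) = sum_i psi_u(y_i) + sum_{i<j} psi_p(y_i, y_j).

    labels: (N,) label assignment; unary: (N, K) per-pixel class costs;
    pairs: neighbouring index pairs (i, j) with i < j.
    The pairwise term here is a Potts penalty: mu if the two labels differ.
    """
    e = unary[np.arange(len(labels)), labels].sum()           # unary part
    e += sum(mu for i, j in pairs if labels[i] != labels[j])  # pairwise part
    return e
```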
CN202010485713.7A 2020-06-01 2020-06-01 Spatial spectrum full convolution hyperspectral image classification method based on super-pixel sample expansion Active CN111695467B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010485713.7A CN111695467B (en) 2020-06-01 2020-06-01 Spatial spectrum full convolution hyperspectral image classification method based on super-pixel sample expansion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010485713.7A CN111695467B (en) 2020-06-01 2020-06-01 Spatial spectrum full convolution hyperspectral image classification method based on super-pixel sample expansion

Publications (2)

Publication Number Publication Date
CN111695467A true CN111695467A (en) 2020-09-22
CN111695467B CN111695467B (en) 2023-05-30

Family

ID=72479042

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010485713.7A Active CN111695467B (en) 2020-06-01 2020-06-01 Spatial spectrum full convolution hyperspectral image classification method based on super-pixel sample expansion

Country Status (1)

Country Link
CN (1) CN111695467B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017215284A1 (en) * 2016-06-14 2017-12-21 山东大学 Gastrointestinal tumor microscopic hyper-spectral image processing method based on convolutional neural network
CN109948693A (en) * 2019-03-18 2019-06-28 西安电子科技大学 Expand and generate confrontation network hyperspectral image classification method based on super-pixel sample
CN110321963A (en) * 2019-07-09 2019-10-11 西安电子科技大学 Based on the hyperspectral image classification method for merging multiple dimensioned multidimensional sky spectrum signature


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Liu Xiaobo et al., "Research Status and Prospects of Deep Transfer Learning in Hyperspectral Remote Sensing Image Classification," Journal of Qingdao University of Science and Technology (Natural Science Edition) *
Zhang Haokui et al., "Research Status and Prospects of Deep Learning in Hyperspectral Image Classification," Acta Automatica Sinica *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112232137A (en) * 2020-09-24 2021-01-15 北京航空航天大学 Hyperspectral image processing method and device
CN112699756A (en) * 2020-12-24 2021-04-23 中国农业科学院农业信息研究所 Hyperspectral image-based tea origin identification method and system
CN112699756B (en) * 2020-12-24 2023-08-25 中国农业科学院农业信息研究所 Hyperspectral image-based tea origin identification method and system
CN112733769A (en) * 2021-01-18 2021-04-30 西安电子科技大学 Hyperspectral image classification method based on multiband entropy rate superpixel segmentation
CN112733769B (en) * 2021-01-18 2023-04-07 西安电子科技大学 Hyperspectral image classification method based on multiband entropy rate superpixel segmentation
CN113052014A (en) * 2021-03-09 2021-06-29 西北工业大学深圳研究院 Hyperspectral image classification method based on double-layer space manifold representation
CN113052014B (en) * 2021-03-09 2022-12-23 西北工业大学深圳研究院 Hyperspectral image classification method based on double-layer space manifold representation
CN112949592A (en) * 2021-03-31 2021-06-11 云南大学 Hyperspectral image classification method and device and electronic equipment
CN113222867B (en) * 2021-04-16 2022-05-20 山东师范大学 Image data enhancement method and system based on multi-template image
CN113222867A (en) * 2021-04-16 2021-08-06 山东师范大学 Image data enhancement method and system based on multi-template image
CN113239755A (en) * 2021-04-28 2021-08-10 湖南大学 Medical hyperspectral image classification method based on space-spectrum fusion deep learning
CN113239755B (en) * 2021-04-28 2022-06-21 湖南大学 Medical hyperspectral image classification method based on space-spectrum fusion deep learning
CN113327231A (en) * 2021-05-28 2021-08-31 北京理工大学重庆创新中心 Hyperspectral abnormal target detection method and system based on space-spectrum combination
WO2023000160A1 (en) * 2021-07-20 2023-01-26 海南长光卫星信息技术有限公司 Hyperspectral remote sensing image semi-supervised classification method, apparatus, and device, and storage medium
CN113516194A (en) * 2021-07-20 2021-10-19 海南长光卫星信息技术有限公司 Hyperspectral remote sensing image semi-supervised classification method, device, equipment and storage medium
CN113516194B (en) * 2021-07-20 2023-08-08 海南长光卫星信息技术有限公司 Semi-supervised classification method, device, equipment and storage medium for hyperspectral remote sensing images
CN113642655A (en) * 2021-08-18 2021-11-12 杭州电子科技大学 Small sample image classification method based on support vector machine and convolutional neural network
CN113642655B (en) * 2021-08-18 2024-02-13 杭州电子科技大学 Small sample image classification method based on support vector machine and convolutional neural network
CN113723255A (en) * 2021-08-24 2021-11-30 中国地质大学(武汉) Hyperspectral image classification method and storage medium
CN113723255B (en) * 2021-08-24 2023-09-01 中国地质大学(武汉) Hyperspectral image classification method and storage medium
CN113822209A (en) * 2021-09-27 2021-12-21 海南长光卫星信息技术有限公司 Hyperspectral image recognition method and device, electronic equipment and readable storage medium
CN113822209B (en) * 2021-09-27 2023-11-14 海南长光卫星信息技术有限公司 Hyperspectral image recognition method and device, electronic equipment and readable storage medium
CN114049567A (en) * 2021-11-22 2022-02-15 齐鲁工业大学 Self-adaptive soft label generation method and application in hyperspectral image classification
CN114049567B (en) * 2021-11-22 2024-02-23 齐鲁工业大学 Adaptive soft label generation method and application in hyperspectral image classification
CN114332534A (en) * 2021-12-29 2022-04-12 山东省科学院海洋仪器仪表研究所 Hyperspectral image small sample classification method
CN114332534B (en) * 2021-12-29 2024-03-29 山东省科学院海洋仪器仪表研究所 Hyperspectral image small sample classification method

Also Published As

Publication number Publication date
CN111695467B (en) 2023-05-30

Similar Documents

Publication Publication Date Title
CN111695467A (en) Spatial spectrum full convolution hyperspectral image classification method based on superpixel sample expansion
CN113011499B (en) Hyperspectral remote sensing image classification method based on double-attention machine system
CN109376804B (en) Hyperspectral remote sensing image classification method based on attention mechanism and convolutional neural network
CN110555446B (en) Remote sensing image scene classification method based on multi-scale depth feature fusion and migration learning
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
Othman et al. Domain adaptation network for cross-scene classification
Ham et al. Investigation of the random forest framework for classification of hyperspectral data
CN112347888B (en) Remote sensing image scene classification method based on bi-directional feature iterative fusion
CN110084159A (en) Hyperspectral image classification method based on the multistage empty spectrum information CNN of joint
He et al. A dual global–local attention network for hyperspectral band selection
CN113361485B (en) Hyperspectral image classification method based on spectrum space attention fusion and deformable convolution residual error network
CN110929080B (en) Optical remote sensing image retrieval method based on attention and generation countermeasure network
CN112347970B (en) Remote sensing image ground object identification method based on graph convolution neural network
CN115457311B (en) Hyperspectral remote sensing image band selection method based on self-expression transfer learning
CN113705580A (en) Hyperspectral image classification method based on deep migration learning
CN115909052A (en) Hyperspectral remote sensing image classification method based on hybrid convolutional neural network
CN110689065A (en) Hyperspectral image classification method based on flat mixed convolution neural network
CN115222998B (en) Image classification method
CN110188827A (en) A kind of scene recognition method based on convolutional neural networks and recurrence autocoder model
CN113642445A (en) Hyperspectral image classification method based on full convolution neural network
CN114419381A (en) Semantic segmentation method and road ponding detection method and device applying same
CN111626267A (en) Hyperspectral remote sensing image classification method using void convolution
CN111428758A (en) Improved remote sensing image scene classification method based on unsupervised characterization learning
CN116977747B (en) Small sample hyperspectral classification method based on multipath multi-scale feature twin network
CN116188981A (en) Hyperspectral high-spatial-resolution remote sensing image classification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant