CN108256557B - Hyperspectral image classification method combining deep learning and neighborhood integration - Google Patents


Info

Publication number
CN108256557B
CN108256557B (application CN201711415902.1A)
Authority
CN
China
Legal status: Active
Application number
CN201711415902.1A
Other languages
Chinese (zh)
Other versions
CN108256557A (en)
Inventor
孟红云
张小华
樊宏渊
田小林
朱虎明
曹向海
侯彪
Current Assignee: Xidian University
Original Assignee: Xidian University
Application filed by Xidian University
Priority application: CN201711415902.1A
Published as application CN108256557A; granted and published as CN108256557B

Classifications

    • G06F18/24 Pattern recognition: classification techniques
    • G06F18/2135 Feature extraction based on approximation criteria, e.g. principal component analysis
    • G06F18/214 Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V20/13 Terrestrial scenes: satellite images
    • G06V20/194 Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB


Abstract

The invention provides a hyperspectral image classification method combining deep learning and neighborhood integration, which mainly addresses the problems in the prior art of requiring many training samples and achieving poor classification performance. The technical scheme is: select different neighborhood scales in the hyperspectral data to obtain data sets combining different spatial information; input the data sets of different spatial information into different autoencoder networks to obtain classification results under the different spatial information; concatenate these classification results and use them as training data to train a new autoencoder network, which serves as the final integrated network; concatenate the autoencoders' classification results on the test samples under the different spatial information to form the test samples of the integrated network; and input the new test samples into the integrated network to obtain the final classification result of the hyperspectral image. The invention requires few training samples, achieves high classification accuracy, and can be used for environmental monitoring, land use, target identification and the like.

Description

Hyperspectral image classification method combining deep learning and neighborhood integration
Technical Field
The invention belongs to the technical field of digital image processing, and particularly relates to a hyperspectral image classification method which can be used for environment monitoring, land utilization and target identification.
Background
By combining imaging and spectroscopy, hyperspectral remote sensing can simultaneously acquire spatially and spectrally continuous data. Hyperspectral images are an effective tool for earth-surface monitoring and are widely used in agriculture, mineralogy, ground detection, physics, astronomy and environmental science. A common task in these applications is to classify each pixel of the hyperspectral image.
The classification method of the hyperspectral image mainly comprises a classification method based on spectral information, a classification method based on spatial information and a classification method combining the spatial information and the spectral information, wherein:
the spectral-information-based classification methods use only the spectral information of the hyperspectral image; decision-tree and neural-network algorithms are commonly used. These methods consider only the spectral information of a pixel and ignore its neighborhood information, yet a hyperspectral pixel and its adjacent pixels usually belong to the same class, so the classification performance achievable from spectral information alone is very limited.
The spatial-information-based classification methods use only the spatial information of the hyperspectral image; typical examples are feature extraction based on wavelet analysis and feature extraction based on the gray-level co-occurrence matrix. These are hand-crafted feature extraction methods and therefore require good prior knowledge to obtain good classification results.
The spatial-spectral classification methods combine the spectral and spatial information of hyperspectral pixels. Typical examples are sparse-representation classification based on spatial-spectral combination and deep-learning-based hyperspectral image classification. Sparse-representation classification with spatial-spectral combination is currently a popular algorithm and achieves a good classification effect to some extent, but it extracts only shallow features of the hyperspectral image. Deep-learning-based hyperspectral image classification has been a research hotspot in recent years and, owing to its strong deep-feature extraction capability, is increasingly applied in practice; however, it is greatly constrained because network training requires a large number of labeled samples while labeled hyperspectral samples are scarce.
Disclosure of Invention
The invention aims to provide a hyperspectral image classification method combining deep learning and neighborhood integration, so as to solve the problems that the prior art cannot extract the deep spatial-spectral features of hyperspectral images well and requires a large number of training samples.
In order to achieve the purpose, the technical scheme of the invention comprises the following steps:
(1) inputting a hyperspectral image containing C classes, X = {x_1, x_2, …, x_i, …, x_N}, randomly selecting 10% of the pixels of each class as the training sample set S, and using the remaining samples as the test sample set T, wherein x_i denotes the i-th sample of the hyperspectral image and is a B0-dimensional spectral vector, i = 1, 2, …, N, N is the number of samples of the hyperspectral image, C ≥ 2, and B0 is the number of bands of the hyperspectral image; the spectral dimension of the pixels differs between images acquired by different hyperspectral imagers;
(2) inputting a training sample set S into an automatic encoder network connected with a softmax classifier to perform network training to obtain a trained classification network;
(3) inputting the training sample set S and the test sample set T into the trained network to obtain the probability classification results P_S^(1) ∈ R^(N1×C) of the training set and P_T^(1) ∈ R^(N2×C) of the test set, wherein N1 is the number of training samples, N2 is the number of test samples, and C is the number of classes;
(4) performing principal component analysis on the hyperspectral image X to obtain a dimensionality-reduced hyperspectral image X′ = {x′_1, x′_2, …, x′_i, …, x′_N}, wherein x′_i denotes the i-th sample after dimensionality reduction and the dimension is reduced from B0 to B;
(5) on the dimensionality-reduced hyperspectral image X′, selecting a window of spatial size 3×3 centered on each sample x′_i to obtain a corresponding new training sample set S′ and a new test sample set T′ augmented with neighborhood information;
(6) inputting the new training sample set S' into an automatic encoder network connected with a softmax classifier to perform network training to obtain a new trained network;
(7) inputting the new training sample set S′ and the new test sample set T′ into the newly trained network to obtain the probability classification results P_S^(3) ∈ R^(N1×C) of the new training set and P_T^(3) ∈ R^(N2×C) of the new test set, wherein N1 is the number of training samples, N2 is the number of test samples, and C is the number of classes;
(8) repeating (5)-(7) with window sizes 5×5, 7×7, 9×9, 11×11, 13×13 and 15×15 to obtain the training-set probability classification results P_S^(5), P_S^(7), P_S^(9), P_S^(11), P_S^(13), P_S^(15) ∈ R^(N1×C) and the test-set probability classification results P_T^(5), P_T^(7), P_T^(9), P_T^(11), P_T^(13), P_T^(15) ∈ R^(N2×C), wherein N1 is the number of training samples, N2 is the number of test samples, and C is the number of classes;
(9) concatenating P_S^(1), P_S^(3), P_S^(5), P_S^(7), P_S^(9), P_S^(11), P_S^(13), P_S^(15) into the probability classification result of one overall training sample set, P_S = [P_S^(1), P_S^(3), …, P_S^(15)] ∈ R^(N1×8C), and training a new autoencoder network with P_S as its training sample set to obtain a trained integrated network;
(10) concatenating P_T^(1), P_T^(3), P_T^(5), P_T^(7), P_T^(9), P_T^(11), P_T^(13), P_T^(15) into the probability classification result of one overall test sample set, P_T = [P_T^(1), P_T^(3), …, P_T^(15)] ∈ R^(N2×8C), and inputting P_T as the test sample set into the trained integrated network to obtain the final classification result.
Compared with the prior art, the invention has the following advantages:
1) the method combines deep learning with neighborhood integration, effectively extracts the deep spatial-spectral features of the hyperspectral image, and yields robust classification;
2) the method integrates the classification information obtained by several networks, so that the integrated network inherits the information of all of them and compensates for the insufficient feature extraction of any single network; it therefore achieves a good classification effect and overcomes the problems of common deep learning methods, namely the need for a large number of training samples and poor classification performance.
Drawings
FIG. 1 is a general flow chart of an implementation of the present invention;
FIG. 2 is a sub-flow diagram of the present invention for training a single classification network and obtaining classification results;
FIG. 3 is the ground-truth distribution map of the image used in the simulation of the present invention;
FIG. 4 is a graph of the classification results of FIG. 3 for a single network at different scales in the present invention;
FIG. 5 is a diagram of the classification results of FIG. 3 using the final integrated network of the present invention.
Detailed Description
Referring to fig. 1, the implementation steps of the invention are as follows:
Step 1. Select the original training sample set S and the original test sample set T from the input hyperspectral image X.
Input a hyperspectral image containing C classes, X = {x_1, x_2, …, x_i, …, x_N}, randomly select 10% of the pixels of each class as the original training sample set S, and use the remaining samples as the original test sample set T. The hyperspectral images used are the Indian Pines image and the University of Pavia image from public data sets. Here x_i denotes the i-th sample of the hyperspectral image and is a B0-dimensional spectral vector, i = 1, 2, …, N, N is the number of samples of the hyperspectral image, and B0 is the number of bands, which differs between images obtained by different hyperspectral imagers, as does the image size. C ≥ 2, and the number of classes C differs between hyperspectral images: for example, the Indian Pines image contains 16 classes and the University of Pavia image contains 9 classes.
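Step 1's per-class 10% split can be sketched as follows; the array `labels` and the helper name are illustrative, not from the patent:

```python
import numpy as np

def split_per_class(labels, ratio=0.1, seed=0):
    """Randomly pick `ratio` of the pixels of each class as training
    indices (step 1); the remaining indices form the test set."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        n_train = max(1, int(round(ratio * idx.size)))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)
```

Sampling per class rather than globally keeps rare classes represented in the small training set, which matters for the 16-class Indian Pines scene.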
Step 2. Train the autoencoder network connected to the softmax classifier using the original training sample set S.
(2a) Input the original training sample set S into the autoencoder network and train its first layer to obtain the trained first-layer parameters:
(2a1) use the original training sample set S as the input of the first layer of an m-layer autoencoder network, m ≥ 2; compute the first-layer hidden features with the initial first-layer parameters, and compute the reconstructed data from the hidden features;
(2a2) continuously adjust the first-layer parameters to minimize the error between the input data and the reconstructed data of the first layer, obtaining the trained first-layer parameters;
(2b) convert the training samples into the first-layer hidden features using the trained first-layer parameters;
(2c) use the first-layer hidden features as the input of the second layer of the network, train the second layer in the same way as the first to obtain the trained second-layer parameters, and use these parameters to convert the second layer's input into the second-layer hidden features; apply the same strategy to the subsequent layers until the hidden features of the last layer are obtained;
(2d) use the last-layer hidden features obtained in (2c) as the input of the softmax classifier and train it; then fine-tune the whole network to obtain the classification network trained on the original training sample set S.
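The greedy layer-wise pretraining of (2a)-(2c) can be sketched in NumPy. This is an illustrative minimal version with tied weights and plain batch gradient descent, not the patent's exact optimizer; the fine-tuning and softmax stages of (2d) are omitted:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_ae_layer(X, n_hidden, lr=0.1, epochs=200, seed=0):
    """Train one autoencoder layer by minimising the squared
    reconstruction error (steps 2a1-2a2), with tied weights."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W = rng.normal(0, 0.1, (n_in, n_hidden))
    b = np.zeros(n_hidden)              # encoder bias
    c = np.zeros(n_in)                  # decoder bias
    for _ in range(epochs):
        H = sigmoid(X @ W + b)          # hidden features
        R = sigmoid(H @ W.T + c)        # reconstruction
        dR = (R - X) * R * (1 - R)      # gradient at the output
        dH = (dR @ W) * H * (1 - H)     # backprop to the hidden layer
        W -= lr * (X.T @ dH + dR.T @ H) / X.shape[0]
        b -= lr * dH.mean(0)
        c -= lr * dR.mean(0)
    return W, b

def stack_features(X, layer_sizes):
    """Greedy layer-wise pretraining: each layer is trained on the
    hidden features of the previous one (steps 2b-2c)."""
    H = X
    for n_hidden in layer_sizes:
        W, b = train_ae_layer(H, n_hidden)
        H = sigmoid(H @ W + b)
    return H
```

The returned last-layer hidden features would then feed the softmax classifier of (2d).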
Step 3. Input the original training sample set S and the original test sample set T into the classification network obtained in (2d) to obtain the probability classification results P_S^(1) ∈ R^(N1×C) of the original training set and P_T^(1) ∈ R^(N2×C) of the original test set, where N1 is the number of training samples, N2 is the number of test samples, and C is the number of classes.
Step 4. Obtain classification information with a single network.
Referring to fig. 2, the specific implementation of this step is as follows:
(4a) Perform principal component analysis on the hyperspectral image X to obtain a dimensionality-reduced hyperspectral image X′ = {x′_1, x′_2, …, x′_i, …, x′_N}, where x′_i denotes the i-th sample after dimensionality reduction and the dimension is reduced from B0 to B;
(4b) on the dimensionality-reduced hyperspectral image X′, select a 3×3 window centered on each sample x′_i and concatenate the 9 samples inside the window into one vector, obtaining a hyperspectral image X″ containing spatial information;
(4c) concatenate each sample of the hyperspectral image X with the corresponding sample of the hyperspectral image X″, obtaining a hyperspectral image X‴ containing spectral and spatial information;
(4d) obtain the new training sample set S′ and the new test sample set T′ from the positions of the original training sample set S and the original test sample set T in the hyperspectral image X in step 1 and the corresponding positions in the hyperspectral image X‴;
(4e) inputting the new training sample set S 'into an automatic encoder network connected with a softmax classifier, and carrying out network training according to the following steps to obtain a network trained by the new training sample set S':
(4e1) taking a training sample as the input of a first layer of an m-layer automatic encoder network, wherein m is more than or equal to 2; training the first layer by using a training sample to obtain the parameters of the trained first layer;
(4e2) converting the training sample into the hidden layer characteristic of the first layer by using the trained parameters of the first layer;
(4e3) taking the hidden layer characteristics of the first layer as the input of the second layer of the network, training the second layer of the network by using the hidden layer characteristics of the first layer in a training mode of the first layer to obtain the parameters of the trained second layer, and converting the input of the second layer into the hidden layer characteristics of the second layer by using the parameters; in the same way, the same strategy is adopted for the following layers until the hidden layer characteristic of the last layer is obtained;
(4e4) use the last-layer hidden features obtained in (4e3) as the input of the softmax classifier and train it; then fine-tune the whole network to obtain the trained network.
(4f) Input the new training sample set S′ and the new test sample set T′ into the network obtained in (4e) to obtain the probability classification results P_S^(3) ∈ R^(N1×C) of the new training set and P_T^(3) ∈ R^(N2×C) of the new test set, where N1 is the number of training samples, N2 is the number of test samples, and C is the number of classes.
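Steps (4a)-(4b) can be sketched as follows. The PCA uses a plain SVD, and the reflect-padding at image borders is an assumption made here (the patent does not specify border handling):

```python
import numpy as np

def pca_reduce(cube, B):
    """PCA on the spectral axis (step 4a): `cube` is H x W x B0,
    the result is H x W x B."""
    H, W, B0 = cube.shape
    flat = cube.reshape(-1, B0)
    flat = flat - flat.mean(0)
    _, _, Vt = np.linalg.svd(flat, full_matrices=False)
    return (flat @ Vt[:B].T).reshape(H, W, B)

def window_features(cube, w):
    """Concatenate the w x w spatial neighbourhood of every pixel
    into one vector (step 4b).  Edges are reflect-padded."""
    r = w // 2
    padded = np.pad(cube, ((r, r), (r, r), (0, 0)), mode="reflect")
    H, W, B = cube.shape
    out = np.empty((H, W, w * w * B))
    for i in range(H):
        for j in range(W):
            out[i, j] = padded[i:i + w, j:j + w].ravel()
    return out
```

Concatenating `window_features(pca_reduce(cube, B), 3)` with the original spectra then gives the spectral-spatial image X‴ of (4c).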
Step 5. Select windows of different sizes and obtain the corresponding probability classification results.
(5a) Set the window size to 5×5 and repeat (4b)-(4f) to obtain the training-set probability classification result P_S^(5) and the test-set probability classification result P_T^(5) for a 5×5 window;
(5b) set the window size to 7×7 and repeat (4b)-(4f) to obtain P_S^(7) and P_T^(7);
(5c) set the window size to 9×9 and repeat (4b)-(4f) to obtain P_S^(9) and P_T^(9);
(5d) set the window size to 11×11 and repeat (4b)-(4f) to obtain P_S^(11) and P_T^(11);
(5e) set the window size to 13×13 and repeat (4b)-(4f) to obtain P_S^(13) and P_T^(13);
(5f) set the window size to 15×15 and repeat (4b)-(4f) to obtain P_S^(15) and P_T^(15);
where each P_S^(w) ∈ R^(N1×C) and each P_T^(w) ∈ R^(N2×C), N1 is the number of training samples, N2 is the number of test samples, and C is the number of classes.
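Since (5a)-(5f) repeat one procedure, an implementation would loop over the window sizes. A sketch, where `get_features` and `fit_predict_proba` are hypothetical callables standing in for the window extraction (4b)-(4d) and the trained autoencoder-plus-softmax network (4e)-(4f):

```python
def per_scale_probabilities(get_features, fit_predict_proba,
                            scales=(5, 7, 9, 11, 13, 15)):
    """Run steps (4b)-(4f) once per window scale and collect the
    training-set and test-set probability matrices (step 5)."""
    P_S, P_T = {}, {}
    for w in scales:
        S_w, T_w = get_features(w)            # features at scale w
        P_S[w], P_T[w] = fit_predict_proba(S_w, T_w)
    return P_S, P_T
```

Each scale trains its own network, so the per-scale runs are independent and could also be executed in parallel.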
Step 6. Train the integrated network from the results of step 5.
(6a) Concatenate the training-set probability classification results obtained in steps 3-5 into a new sample set P_S = [P_S^(1), P_S^(3), P_S^(5), P_S^(7), P_S^(9), P_S^(11), P_S^(13), P_S^(15)] ∈ R^(N1×8C);
(6b) use P_S as the training sample set of a new autoencoder network and train it in the same way as step 2, obtaining the trained integrated network.
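The cascade in (6a) is a column-wise concatenation of the eight per-scale probability matrices. A minimal sketch with random stand-in probabilities (in the method itself these come from the trained per-scale networks):

```python
import numpy as np

rng = np.random.default_rng(0)
N1, C = 100, 16                      # e.g. training samples, classes
# one N1 x C probability matrix per scale (1x1, 3x3, ..., 15x15)
per_scale = [rng.dirichlet(np.ones(C), size=N1) for _ in range(8)]
P_S = np.concatenate(per_scale, axis=1)   # N1 x 8C: training input
                                          # of the integrated network
```

The integrated network therefore sees, for every sample, eight class-probability vectors rather than raw spectra, which is what lets it fuse the scales.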
Step 7. Classify all test samples with the trained integrated network to obtain the final result.
(7a) Concatenate the test-set probability classification results obtained in steps 3-5 into P_T = [P_T^(1), P_T^(3), P_T^(5), P_T^(7), P_T^(9), P_T^(11), P_T^(13), P_T^(15)] ∈ R^(N2×8C);
(7b) input P_T into the integrated network obtained in step 6 to obtain the final classification result.
The effect of the present invention can be further illustrated by the following simulation results.
1. Simulation conditions
Hardware platform: Intel(R) Core(TM) i5-3210M, 8 GB RAM. Software platform: MATLAB R2014a.
The simulation uses the Indian Pines image, acquired in June 1992 by the AVIRIS sensor of the NASA Jet Propulsion Laboratory over Northwestern Indiana, shown together with its ground-truth map in FIG. 3. The image size is 145×145 with 220 bands in total; after removing the noisy and water-absorption bands, 200 bands remain. The scene contains 16 classes of ground objects, listed in Table 1.
TABLE 1. Information on the 16 ground-object classes of the Indian Pines image
[table not reproduced]
2. Simulation content and analysis
Simulation 1. Neighborhood windows of different sizes are selected, and the classification results of the test samples under each window are obtained with the method shown in FIG. 2; the results are shown in FIG. 4, where:
FIG. 4(a) shows the classification result for a 1×1 neighborhood window,
FIG. 4(b) shows the classification result for a 3×3 neighborhood window,
FIG. 4(c) shows the classification result for a 5×5 neighborhood window,
FIG. 4(d) shows the classification result for a 7×7 neighborhood window,
FIG. 4(e) shows the classification result for a 9×9 neighborhood window,
FIG. 4(f) shows the classification result for an 11×11 neighborhood window,
FIG. 4(g) shows the classification result for a 13×13 neighborhood window,
FIG. 4(h) shows the classification result for a 15×15 neighborhood window.
Simulation 2. The test samples are classified with the method of the present invention, i.e., the method shown in FIG. 1; the result is shown in FIG. 5.
As can be seen from FIGS. 3, 4 and 5, the classification result obtained with the integrated network of the present invention is better than the single-network results at the different scales shown in FIG. 4 and agrees more closely with the Indian Pines reference map shown in FIG. 3, demonstrating that the invention effectively improves the classification accuracy of hyperspectral images.
Classification accuracy is an index for evaluating classification performance: the ratio of the number of correctly classified samples to the total number of samples. The higher the classification accuracy, the better the classification method. The accuracies of the single-network classification results at different scales shown in FIG. 4 and of the final integrated-network result shown in FIG. 5 are counted separately; the results are listed in Table 2.
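The accuracy measure defined above is a one-liner; the helper name is illustrative:

```python
import numpy as np

def overall_accuracy(pred, truth):
    """Classification accuracy as defined above: the fraction of
    samples whose predicted class matches the true class."""
    pred, truth = np.asarray(pred), np.asarray(truth)
    return float((pred == truth).mean())
```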
TABLE 2. Classification accuracy of single networks at different scales and of the integrated network
[table not reproduced]
As can be seen from Table 2, the classification accuracy of the final integrated network is far higher than that of any single network at the different scales, showing that the invention has better classification performance.
In summary, compared with common hyperspectral image classification methods, the invention achieves a better classification effect; compared with other deep learning methods, it overcomes the need for a large number of training samples and the resulting poor classification, and is therefore suitable for practical hyperspectral image classification problems.

Claims (5)

1. A hyperspectral image classification method combining deep learning and neighborhood integration comprises the following steps:
(1) inputting a hyperspectral image containing C classes, X = {x_1, x_2, …, x_i, …, x_N}, randomly selecting 10% of the pixels of each class as a training sample set S, and using the remaining samples as a test sample set T, wherein x_i denotes the i-th sample of the hyperspectral image and is a B0-dimensional spectral vector, i = 1, 2, …, N, N is the number of samples of the hyperspectral image, C ≥ 2, and B0 is the number of bands of the hyperspectral image, the spectral dimension of the pixels differing between images obtained by different hyperspectral imagers;
(2) inputting a training sample set S into an automatic encoder network connected with a softmax classifier to perform network training to obtain a trained classification network;
(3) inputting the training sample set S and the test sample set T into the trained network to obtain the probability classification results P_S^(1) ∈ R^(N1×C) of the training set and P_T^(1) ∈ R^(N2×C) of the test set, wherein N1 is the number of training samples, N2 is the number of test samples, and C is the number of classes;
(4) performing principal component analysis on the hyperspectral image X to obtain a dimensionality-reduced hyperspectral image X′ = {x′_1, x′_2, …, x′_i, …, x′_N}, wherein x′_i denotes the i-th sample after dimensionality reduction and the dimension is reduced from B0 to B;
(5) on the dimensionality-reduced hyperspectral image X′, selecting a window of spatial size 3×3 centered on each sample x′_i to obtain a corresponding new training sample set S′ and a new test sample set T′ augmented with neighborhood information;
(6) inputting the new training sample set S' into an automatic encoder network connected with a softmax classifier to perform network training to obtain a new trained network;
(7) inputting the new training sample set S′ and the new test sample set T′ into the newly trained network to obtain the probability classification results P_S^(3) ∈ R^(N1×C) of the new training set and P_T^(3) ∈ R^(N2×C) of the new test set, wherein N1 is the number of training samples, N2 is the number of test samples, and C is the number of classes;
(8) repeating (5)-(7) with window sizes 5×5, 7×7, 9×9, 11×11, 13×13 and 15×15 to obtain the training-set probability classification results P_S^(5), P_S^(7), P_S^(9), P_S^(11), P_S^(13), P_S^(15) ∈ R^(N1×C) and the test-set probability classification results P_T^(5), P_T^(7), P_T^(9), P_T^(11), P_T^(13), P_T^(15) ∈ R^(N2×C), wherein N1 is the number of training samples, N2 is the number of test samples, and C is the number of classes;
(9) concatenating P_S^(1), P_S^(3), P_S^(5), P_S^(7), P_S^(9), P_S^(11), P_S^(13), P_S^(15) into the probability classification result of one overall training sample set, P_S = [P_S^(1), P_S^(3), …, P_S^(15)] ∈ R^(N1×8C), and training a new autoencoder network with P_S as its training sample set to obtain a trained integrated network;
(10) concatenating P_T^(1), P_T^(3), P_T^(5), P_T^(7), P_T^(9), P_T^(11), P_T^(13), P_T^(15) into the probability classification result of one overall test sample set, P_T = [P_T^(1), P_T^(3), …, P_T^(15)] ∈ R^(N2×8C), and inputting P_T as the test sample set into the trained integrated network to obtain a final classification result.
2. The method of claim 1, wherein in step (2) the training sample set S is input into the autoencoder network connected to the softmax classifier for network training, as follows:
(2a) taking the training samples as the input of the first layer of an m-layer autoencoder network, where m ≥ 2, and training the first layer with the training samples to obtain the trained first-layer parameters;
(2b) converting the training samples into the hidden-layer features of the first layer using the trained first-layer parameters;
(2c) taking the hidden-layer features of the first layer as the input of the second layer of the network, training the second layer on them in the same way as the first layer to obtain the trained second-layer parameters, and converting the second layer's input into its hidden-layer features with those parameters; applying the same strategy to each subsequent layer until the hidden-layer features of the last layer are obtained;
(2d) training the softmax classifier with the last layer's hidden-layer features from (2c) as its input to obtain a trained classifier, and then fine-tuning the whole network to obtain the trained network.
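Steps (2a)-(2c) describe greedy layer-wise pretraining: each autoencoder layer is trained to reconstruct its own input, and its hidden-layer features become the training data for the next layer. A toy NumPy sketch of that scheme (the sigmoid activations, learning rate, epoch count and plain gradient-descent loop are illustrative assumptions; the claim does not fix them):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_ae_layer(X, n_hidden, lr=0.5, epochs=200, seed=0):
    """Train one autoencoder layer: learn encoder (W1, b1) so the hidden
    code h = sigmoid(X W1 + b1) reconstructs X through decoder (W2, b2)."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W1 = rng.normal(scale=0.1, size=(n_in, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, n_in)); b2 = np.zeros(n_in)
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)            # hidden-layer features
        Xr = sigmoid(h @ W2 + b2)           # reconstructed data
        # gradients of squared reconstruction error through both layers
        d_out = (Xr - X) * Xr * (1 - Xr)
        d_hid = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out / len(X); b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * X.T @ d_hid / len(X); b1 -= lr * d_hid.mean(axis=0)
    return W1, b1

def greedy_pretrain(X, layer_sizes):
    """Greedy layer-wise pretraining: each layer is trained on the
    hidden-layer features produced by the previous one."""
    params, feats = [], X
    for n_hidden in layer_sizes:
        W, b = train_ae_layer(feats, n_hidden)
        params.append((W, b))
        feats = sigmoid(feats @ W + b)  # features feed the next layer
    return params, feats

rng = np.random.default_rng(1)
X = rng.random((20, 8))                  # 20 toy samples, 8 features
params, feats = greedy_pretrain(X, [6, 4])
print(feats.shape)  # (20, 4): last layer's hidden-layer features
```

Per step (2d), the last-layer features `feats` would then feed a softmax classifier, after which the stacked encoder weights and classifier are fine-tuned jointly.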
3. The method of claim 2, wherein the first layer is trained with the training samples in step (2a) as follows:
(2a1) taking the training samples as the input layer of the first-layer network and computing the hidden-layer features and reconstructed data of the first layer from the network's initial first-layer parameters;
(2a2) continuously adjusting the first-layer parameters to minimize the error between the input-layer data and the data reconstructed by the first-layer network, thereby obtaining the trained first-layer parameters.
4. The method according to claim 1, wherein in step (5) a 3 × 3 window centred on each sample is selected from the dimensionality-reduced hyperspectral image X′, and the corresponding new training sample set S′ and new test sample set T′ with neighborhood information added are obtained as follows:
(5a) on the dimensionality-reduced hyperspectral image X′, selecting a window of spatial size 3 × 3 centred on each sample and cascading the 9 samples in the window into one vector to obtain a hyperspectral image X″ containing spatial information;
(5b) concatenating each sample in the hyperspectral image X with the corresponding sample in the hyperspectral image X″ containing spatial information to obtain a hyperspectral image X‴ containing spatial-spectral information;
(5c) obtaining the new training sample set S′ and the new test sample set T′ from the positions of the training sample set S and the test sample set T in the hyperspectral image X of step (1) and the corresponding positions in the hyperspectral image X‴ containing spatial-spectral information.
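The window operation of step (5a) cascades, for each pixel, the spectra of its 3 × 3 neighborhood into one spatial-spectral vector. A sketch over a toy (H, W, B) cube, assuming reflection padding at the image border (border handling is not specified in the claim):

```python
import numpy as np

def add_neighborhood(img, w=3):
    """For each pixel of an (H, W, B) hyperspectral cube, cascade the
    spectra of the w x w window centred on it into one vector, giving
    an (H, W, w*w*B) cube of spatial-spectral samples."""
    r = w // 2
    # reflection padding at the border (an assumption of this sketch)
    padded = np.pad(img, ((r, r), (r, r), (0, 0)), mode="reflect")
    H, W, B = img.shape
    out = np.zeros((H, W, w * w * B))
    for i in range(H):
        for j in range(W):
            win = padded[i:i + w, j:j + w, :]  # w x w neighborhood
            out[i, j] = win.reshape(-1)        # cascade into one vector
    return out

cube = np.arange(2 * 2 * 4, dtype=float).reshape(2, 2, 4)  # 2x2 pixels, 4 bands
x3 = add_neighborhood(cube)
print(x3.shape)  # (2, 2, 36): 9 neighbors x 4 bands per pixel
```

Step (5b) then simply concatenates each pixel's original spectrum onto its cascaded neighborhood vector; the centre spectrum sits in the middle block of the cascaded vector, so the window is anchored correctly.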
5. The method of claim 1, wherein in step (6) the training sample set S′ is input into the autoencoder network connected to the softmax classifier for network training, as follows:
(6a) taking the training samples as the input of the first layer of an m-layer autoencoder network, where m ≥ 2, and training the first layer with the training samples to obtain the trained first-layer parameters;
(6b) converting the training samples into the hidden-layer features of the first layer using the trained first-layer parameters;
(6c) taking the hidden-layer features of the first layer as the input of the second layer of the network, training the second layer on them in the same way as the first layer to obtain the trained second-layer parameters, and converting the second layer's input into its hidden-layer features with those parameters; applying the same strategy to each subsequent layer until the hidden-layer features of the last layer are obtained;
(6d) training the softmax classifier with the last layer's hidden-layer features from (6c) as its input to obtain a trained classifier, and then fine-tuning the whole network to obtain the trained network.
CN201711415902.1A 2017-12-25 2017-12-25 Hyperspectral image classification method combining deep learning and neighborhood integration Active CN108256557B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711415902.1A CN108256557B (en) 2017-12-25 2017-12-25 Hyperspectral image classification method combining deep learning and neighborhood integration

Publications (2)

Publication Number Publication Date
CN108256557A CN108256557A (en) 2018-07-06
CN108256557B true CN108256557B (en) 2021-09-28

Family

ID=62724013

Country Status (1)

Country Link
CN (1) CN108256557B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344698B (en) * 2018-08-17 2021-09-03 西安电子科技大学 Hyperspectral band selection method based on separable convolution and hard threshold function
CN109522918B (en) * 2018-09-14 2020-05-05 广东工业大学 Hyperspectral image feature extraction method based on improved local singular spectrum analysis
CN111079850B (en) * 2019-12-20 2023-09-05 烟台大学 Depth-space spectrum combined hyperspectral image classification method of band significance
CN112348049A (en) * 2020-09-28 2021-02-09 北京师范大学 Image recognition model training method and device based on automatic coding

Citations (4)

Publication number Priority date Publication date Assignee Title
CN105320965A (en) * 2015-10-23 2016-02-10 西北工业大学 Hyperspectral image classification method based on spectral-spatial cooperation of deep convolutional neural network
CN106845418A (en) * 2017-01-24 2017-06-13 北京航空航天大学 A kind of hyperspectral image classification method based on deep learning
CN106897737A (en) * 2017-01-24 2017-06-27 北京理工大学 A kind of high-spectrum remote sensing terrain classification method based on the learning machine that transfinites
CN107194423A (en) * 2017-05-19 2017-09-22 杭州电子科技大学 The hyperspectral image classification method of the integrated learning machine that transfinites of feature based random sampling

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US9594983B2 (en) * 2013-08-02 2017-03-14 Digimarc Corporation Learning systems and methods

Non-Patent Citations (3)

Title
Semisupervised Hyperspectral Image Classification via Neighborhood Graph Learning; Daniel Jiwoong Im et al.; IEEE Geoscience and Remote Sensing Letters; 2015-06-11; vol. 12, no. 09; pp. 1913-1917 *
Hyperspectral Image Classification Based on Spatial-Spectral Information Mining and Sparse Representation Learning (基于空谱信息挖掘和稀疏表示学习的高光谱图像分类); Zhang Erlei; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2016-12-15; no. 12; p. I140-19 *
Research on High-Accuracy Extraction of Maize Areas from High-Resolution Remote Sensing Images Based on Ensemble Learning (基于集成学习的高分遥感图像玉米区高精度提取算法研究); Li Dawei; China Doctoral Dissertations Full-text Database; 2017-07-15; no. 07; p. D047-17 *

Similar Documents

Publication Publication Date Title
Zhang et al. Hyperspectral unmixing via deep convolutional neural networks
CN110084159B (en) Hyperspectral image classification method based on combined multistage spatial spectrum information CNN
CN108985238B (en) Impervious surface extraction method and system combining deep learning and semantic probability
CN109993220B (en) Multi-source remote sensing image classification method based on double-path attention fusion neural network
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN107292343B (en) Hyperspectral remote sensing image classification method based on six-layer convolutional neural network and spectrum-space information combination
CN108256557B (en) Hyperspectral image classification method combining deep learning and neighborhood integration
Charles et al. Learning sparse codes for hyperspectral imagery
CN108830330B (en) Multispectral image classification method based on self-adaptive feature fusion residual error network
CN109840560B (en) Image classification method based on clustering in capsule network
CN110110596B (en) Hyperspectral image feature extraction, classification model construction and classification method
CN106845418A (en) A kind of hyperspectral image classification method based on deep learning
CN109410184B (en) Live broadcast pornographic image detection method based on dense confrontation network semi-supervised learning
Fan et al. Superpixel guided deep-sparse-representation learning for hyperspectral image classification
CN107358203B (en) A kind of High Resolution SAR image classification method based on depth convolution ladder network
CN105139028A (en) SAR image classification method based on hierarchical sparse filtering convolutional neural network
CN107832797B (en) Multispectral image classification method based on depth fusion residual error network
CN107944483B (en) Multispectral image classification method based on dual-channel DCGAN and feature fusion
CN105117736B (en) Classification of Polarimetric SAR Image method based on sparse depth heap stack network
CN104239902A (en) Hyper-spectral image classification method based on non-local similarity and sparse coding
CN111783884B (en) Unsupervised hyperspectral image classification method based on deep learning
CN111222442A (en) Electromagnetic signal classification method and device
CN111639587A (en) Hyperspectral image classification method based on multi-scale spectrum space convolution neural network
CN113673556A (en) Hyperspectral image classification method based on multi-scale dense convolution network
CN111222545B (en) Image classification method based on linear programming incremental learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant