Disclosure of Invention
To solve the above problems, the invention provides a hyperspectral image unmixing method and system based on a 3D convolutional neural network. The unmixing method mainly comprises the following steps:
s101: constructing n training samples of size S × S × L and setting the size of the corresponding end-member abundance samples to S × S × M; taking each pixel of each training sample as a center, extracting S1 × S1 spatial features, and attaching the abundance value of each center pixel to form the training sample set; wherein S is the spatial size of the training samples, L is the number of spectral bands, M is the number of ground-object classes, and S1 is the window size; the training samples have the same number of spectral bands as the real image;
s102: training the 3D convolutional neural network with the training sample set obtained in step S101 to obtain a trained 3D convolutional neural network;
s103: unmixing the real image to be unmixed with the trained 3D convolutional neural network to obtain the abundance matrix of the real image; the abundance matrix is the unmixing result.
Further, in step S102, the 3D convolutional neural network has two 3D convolutional layers, C1 and C2, and one fully connected layer, F1; C1 has 2 convolution kernels and C2 has 4 convolution kernels.
Further, in step S102, the 3D convolutional neural network is trained with a softmax classifier; the error is back-propagated by gradient descent to minimize the network loss, and the convolution kernels of the network are updated by the additional-momentum method, as shown in formulas (1) and (2):
m_{i+1} = mc·m_i + (1 − mc)·∂E/∂w_i   (1)
w_{i+1} = w_i − ε·m_{i+1}   (2)
in the above formulas, i > 0 is the iteration number, m is the momentum, ε is the learning rate, w is the weight, mc is the momentum factor, and ∂E/∂w_i is the gradient of the loss with respect to the weights.
Further, in step S103, unmixing the real image with the trained 3D convolutional neural network specifically comprises:
s201: inputting the real image into the first 3D convolutional layer to obtain 2 pieces of three-dimensional data C1out;
s202: processing the three-dimensional data C1out with a ReLU function to obtain 2 pieces of processed three-dimensional data C1REout, and feeding the 2 pieces of C1REout into the second convolutional layer to obtain 4 pieces of three-dimensional data C2out;
s203: processing the three-dimensional data C2out with a ReLU function to obtain 4 pieces of processed three-dimensional data C2REout, and arranging the four pieces of C2REout in a row in order to obtain 1 feature vector;
s204: and feeding the feature vectors into a full-link layer F1 as input to obtain a corresponding abundance matrix.
Further, in step S103, an image is reconstructed from the obtained unmixing result; the reconstructed image is then compared with the real image and their quantitative error RMSE is calculated; the unmixing performance of the current 3D convolutional neural network is judged from this RMSE, a smaller RMSE indicating a better unmixing result; the RMSE is computed as shown in formula (3):

RMSE = sqrt( (1/n) · Σ_{i=1}^{n} (y_i − ŷ_i)² )   (3)

in the above formula, y_i is the ith pixel of the real image, ŷ_i is the ith pixel of the reconstructed image, i = 1, 2, …, n, and n is the total number of pixels in the reconstructed image; the reconstructed image and the real image have the same number of pixels.
Further, a hyperspectral image unmixing system based on the 3D convolutional neural network is characterized in that it comprises the following modules:
a training sample construction module, configured to construct n training samples of size S × S × L, set the size of the corresponding end-member abundance samples to S × S × M, take each pixel of each training sample as a center, extract S1 × S1 spatial features, and combine the abundance values of the center pixels to form the training sample set; wherein S is the spatial size of the training samples, L is the number of spectral bands, M is the number of ground-object classes, and S1 is the window size; the training samples have the same number of spectral bands as the real image;
a network training module, configured to train the 3D convolutional neural network with the training sample set obtained by the training sample construction module, to obtain a trained 3D convolutional neural network;
an unmixing module, configured to unmix the real image to be unmixed with the trained 3D convolutional neural network to obtain the abundance matrix of the real image; the abundance matrix is the unmixing result.
The beneficial effects of the technical scheme provided by the invention are as follows: the invention provides a novel hyperspectral image unmixing method; unmixing a hyperspectral image with the proposed 3D convolutional neural network yields satisfactory results with almost no parameter tuning, and compared with other methods the proposed scheme is simpler and more practical.
Detailed Description
For a clearer understanding of the technical features, objects and effects of the present invention, embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
The embodiment of the invention provides a hyperspectral image unmixing method and a hyperspectral image unmixing system based on a 3D convolutional neural network.
Referring to fig. 1, fig. 1 is a flowchart of a hyperspectral image unmixing method based on a 3D convolutional neural network in an embodiment of the present invention, which specifically includes the following steps:
s101: constructing n training samples of size S × S × L and setting the size of the corresponding end-member abundance samples to S × S × M; taking each pixel of each training sample as a center, extracting S1 × S1 spatial features, and attaching the abundance value of each center pixel to form the training sample set; wherein S is the spatial size of the training samples, L is the number of spectral bands, M is the number of ground-object classes, and S1 is the window size; the training samples have the same number of spectral bands as the real image;
s102: training the 3D convolutional neural network with the training sample set obtained in step S101 to obtain a trained 3D convolutional neural network;
s103: unmixing the real image to be unmixed with the trained 3D convolutional neural network to obtain the abundance matrix of the real image; the abundance matrix is the unmixing result.
In step S102, the 3D convolutional neural network has two 3D convolutional layers, C1 and C2, and one fully connected layer, F1; C1 has 2 convolution kernels and C2 has 4 convolution kernels.
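As an illustration only, the following is a minimal PyTorch sketch of the architecture just described (two 3D convolutional layers with 2 and 4 kernels, one fully connected layer). The class name, the kernel sizes k1 and k2, and the softmax output over the M end members are assumptions of the sketch; the disclosure does not fix the kernel sizes.

```python
import torch
import torch.nn as nn

class Unmix3DCNN(nn.Module):
    """Sketch of the described network: two 3D convolutional layers
    (2 and 4 kernels) followed by one fully connected layer F1.
    Kernel sizes k1 = (H1, W1, D1) and k2 = (H2, W2, D2) are placeholders."""

    def __init__(self, bands, s2, m_endmembers, k1=(3, 3, 5), k2=(3, 3, 5)):
        super().__init__()
        self.c1 = nn.Conv3d(1, 2, kernel_size=k1)  # C1: 2 convolution kernels
        self.c2 = nn.Conv3d(2, 4, kernel_size=k2)  # C2: 4 convolution kernels
        # Sizes after two valid 3D convolutions, matching the text below:
        h = s2 - k1[0] - k2[0] + 2
        w = s2 - k1[1] - k2[1] + 2
        d = bands - k1[2] - k2[2] + 2
        self.f1 = nn.Linear(4 * h * w * d, m_endmembers)  # F1

    def forward(self, x):
        # x: (batch, 1, S2, S2, L)
        x = torch.relu(self.c1(x))   # steps S201-S202: C1out -> C1REout
        x = torch.relu(self.c2(x))   # steps S202-S203: C2out -> C2REout
        x = x.flatten(1)             # step S203: one feature vector per sample
        return torch.softmax(self.f1(x), dim=1)  # abundances sum to one
```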
In step S102, the 3D convolutional neural network is trained with a softmax classifier; the error is back-propagated by gradient descent to minimize the network loss, and the convolution kernels of the network are updated by the additional-momentum method, as shown in formulas (1) and (2):
m_{i+1} = mc·m_i + (1 − mc)·∂E/∂w_i   (1)
w_{i+1} = w_i − ε·m_{i+1}   (2)
in the above formulas, i > 0 is the iteration number, m is the momentum, ε is the learning rate, w is the weight, mc is the momentum factor, and ∂E/∂w_i is the gradient of the loss with respect to the weights.
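A minimal sketch of the update in formulas (1) and (2); the function name and the explicit gradient argument are illustrative, and the form of formula (1) follows the standard additional-momentum rule as reconstructed above.

```python
def momentum_step(w, m, grad, lr=0.01, mc=0.9):
    """One additional-momentum update per formulas (1) and (2).
    w: weights w_i, m: momentum m_i, grad: gradient of the loss at w_i,
    lr: learning rate epsilon, mc: momentum factor."""
    m_next = mc * m + (1.0 - mc) * grad   # formula (1)
    w_next = w - lr * m_next              # formula (2)
    return w_next, m_next
```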
In step S103, unmixing the real image with the trained 3D convolutional neural network specifically comprises:
s201: inputting the real image into the first 3D convolutional layer to obtain 2 pieces of three-dimensional data C1out; specifically:
C1 comprises 2 3D convolution kernels of size H1 × W1 × D1; after each sample is 3D-convolved with the 2 kernels, 2 pieces of three-dimensional data C1out of size (S2−H1+1) × (S2−W1+1) × (L−D1+1) are obtained; S2 is the spatial size of the real image and L is its number of spectral bands;
s202: processing the three-dimensional data C1out with a ReLU function to obtain 2 pieces of processed three-dimensional data C1REout, and feeding the 2 pieces of C1REout into the second convolutional layer to obtain 4 pieces of three-dimensional data C2out; specifically:
the second 3D convolutional layer C2 involved 4 3D convolutional kernels of size H2 × W2 × D2. After each input three-dimensional data is respectively subjected to 3D convolution with 4 convolution kernels, the results of the 4 three-dimensional data obtained by the same convolution kernel are accumulated and summed to obtain 4 three-dimensional data C2out with the size of (S2-H1-H2+2) × (S2-W1-W2+2) × (L-D1-D2+ 2);
s203: processing the three-dimensional data C2out with a ReLU function to obtain 4 pieces of processed three-dimensional data C2REout, and arranging the four pieces of C2REout into a row in order (the order can be set as required) to obtain one feature vector of dimension 4 × (S2−H1−H2+2) × (S2−W1−W2+2) × (L−D1−D2+2);
s204: feeding the feature vector into the fully connected layer F1 to obtain the corresponding abundance matrix.
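As a quick sanity check of the shapes in steps S201 to S204, the sketch class above can be run on a dummy patch; all values here (patch size 5, kernel sizes 2 × 2 × 5) are placeholders.

```python
import torch

net = Unmix3DCNN(bands=196, s2=5, m_endmembers=11, k1=(2, 2, 5), k2=(2, 2, 5))
x = torch.randn(1, 1, 5, 5, 196)  # one 5 x 5 x 196 input patch
abund = net(x)
print(abund.shape)                # torch.Size([1, 11]) -- one abundance vector
print(abund.sum(dim=1))           # ~1.0: the softmax enforces "sum to one"
```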
In step S103, an image corresponding to the abundance matrix is reconstructed from the obtained unmixing result; the reconstructed image is then compared with the real image and their quantitative error RMSE is calculated; the unmixing performance of the current 3D convolutional neural network is judged from this RMSE, a smaller RMSE indicating a better unmixing result; the RMSE is computed as shown in formula (3):

RMSE = sqrt( (1/n) · Σ_{i=1}^{n} (y_i − ŷ_i)² )   (3)

in the above formula, y_i is the ith pixel of the real image, ŷ_i is the ith pixel of the reconstructed image, i = 1, 2, …, n, and n is the total number of pixels in the reconstructed image; the reconstructed image and the real image have the same number of pixels.
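A minimal sketch of formula (3), assuming the real and reconstructed images are given as arrays with one row per pixel; the exact normalization (per pixel versus per pixel and band) is an assumption of the sketch.

```python
import numpy as np

def reconstruction_rmse(y_true, y_rec):
    """Formula (3): RMSE between the real and the reconstructed image.
    y_true, y_rec: arrays of shape (n_pixels,) or (n_pixels, n_bands)."""
    y_true, y_rec = np.asarray(y_true), np.asarray(y_rec)
    assert y_true.shape == y_rec.shape  # equal numbers of pixels
    return np.sqrt(np.mean((y_true - y_rec) ** 2))
```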
Referring to fig. 2, fig. 2 is a schematic diagram illustrating a module composition of a hyperspectral image unmixing system based on a 3D convolutional neural network in an embodiment of the present invention, including a training sample construction module 11, a network training module 12, and an unmixing module 13, which are connected in sequence;
a training sample construction module 11, configured to construct n training samples of size S × S × L, set the size of the corresponding end-member abundance samples to S × S × M, take each pixel of each training sample as a center, extract S1 × S1 spatial features, and combine the abundance values of the center pixels to form the training sample set; wherein S is the spatial size of the training samples, L is the number of spectral bands, M is the number of ground-object classes, and S1 is the window size; the training samples have the same number of spectral bands as the real image;
a network training module 12, configured to train the 3D convolutional neural network with the training sample set obtained by the training sample construction module 11, to obtain a trained 3D convolutional neural network;
an unmixing module 13, configured to unmix the real image to be unmixed with the trained 3D convolutional neural network to obtain the abundance matrix of the real image; the abundance matrix is the unmixing result.
In the network training module 12, the 3D convolutional neural network has two 3D convolutional layers C1 and C2 and a fully connected layer F1; there are 2 convolution kernels in C1 and 4 convolution kernels in C2.
In this embodiment, the real data used are AVIRIS (Airborne Visible/Infrared Imaging Spectrometer) data acquired in 1995 over Cuprite, Nevada, USA; experiments on these data demonstrate the effectiveness of the proposed method. The area mainly contains minerals such as Alunite, Kaolinite, Chalcedony, Muscovite, Montmorillonite, Jarosite and Calcite, for which typical pure pixels exist, as well as small amounts of minerals such as Buddingtonite, Nontronite, Halloysite and Dickite. This dataset is widely used in research on hyperspectral end-member extraction and is highly representative.
The spectral library used in the embodiment of the present invention is the USGS (United States Geological Survey) mineral spectral library M, of size 224 × 498, where 224 is the number of spectral bands and 498 is the number of ground-object types, i.e. the number of end members.
Bands with a low signal-to-noise ratio or strong water-vapor absorption are removed from the USGS spectral library and from the Cuprite hyperspectral image, leaving 196 bands. For convenience of calculation, the image used in this experiment is a 143 × 123 portion of the full image.
In the experiment, the 11 end members contained in the end-member set of the image are first determined, and the corresponding 11 spectra are extracted from the USGS mineral spectral library M to form the experimental end-member set; the end-member sample set therefore has size 196 × 11. In practice, a mineral mixture generally does not contain all end members, so when synthesizing the hyperspectral image samples, mixtures of at most 5 end members are considered: for each sample a number p is drawn at random as the number of end members, where p is a positive integer between 1 and 5. The abundances generated for the p end members randomly chosen from the 11 obey a Dirichlet distribution and satisfy the "sum to one" and "non-negative" constraints. When generating the training samples, spatially adjacent pixels are given the same or similar abundances. Finally, the hyperspectral image is synthesized according to the linear model, with a signal-to-noise ratio of 60 dB.
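The sample synthesis just described can be sketched as follows, assuming an end-member matrix E of size 196 × 11 taken from the USGS library; the function names are illustrative, and this per-pixel sketch omits the spatial smoothing of adjacent abundances.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize_pixel(E, max_p=5):
    """E: end-member matrix of shape (196, 11).
    Draws p <= 5 random end members, Dirichlet abundances
    ("sum to one", "non-negative"), and mixes them linearly."""
    n_bands, n_em = E.shape
    p = rng.integers(1, max_p + 1)                 # number of active end members
    idx = rng.choice(n_em, size=p, replace=False)  # which end members
    a = np.zeros(n_em)
    a[idx] = rng.dirichlet(np.ones(p))             # abundances of the chosen set
    return E @ a, a                                # linear mixing model

def add_noise(y, snr_db=60.0):
    """Adds white Gaussian noise at the stated 60 dB signal-to-noise ratio."""
    noise_power = np.mean(y ** 2) / 10.0 ** (snr_db / 10.0)
    return y + rng.normal(scale=np.sqrt(noise_power), size=y.shape)
```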
The experiment processes data with known ground-object types, so the algorithm does not need to estimate the number of end members. Because an ordinary mixed hyperspectral image lacks reference abundances, abundance-map samples for each end member must first be constructed when training the network, and the hyperspectral image samples are then constructed from the known spectral library.
The hyperspectral image samples are of size 100 × 100 × 196 and the corresponding abundance samples of the end members are of size 100 × 100 × 11. Taking each pixel as a center, 5 × 5 × 196 features are extracted from the hyperspectral image sample and combined with the unmixing results (abundances) of the center pixels to form the training sample set. After the parameters of the 3D convolutional neural network have been tuned on these samples, the network is used to unmix the real data; the experimental flow is shown in FIG. 3.
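Patch extraction around each pixel can be sketched as below; the reflect-padding at the image border is an assumption, since the text does not specify how edge pixels are handled.

```python
import numpy as np

def extract_patches(img, abund, s1=5):
    """img: synthesized sample of shape (100, 100, 196);
    abund: abundance maps of shape (100, 100, 11).
    Returns one (s1, s1, 196) patch per pixel together with the
    abundance vector of the patch's center pixel."""
    r = s1 // 2
    padded = np.pad(img, ((r, r), (r, r), (0, 0)), mode="reflect")
    patches, labels = [], []
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            patches.append(padded[i:i + s1, j:j + s1, :])
            labels.append(abund[i, j])
    return np.stack(patches), np.stack(labels)
```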
The experimental results are shown in Table 1. The algorithms used for comparison are FCLS and L1-norm sparse decomposition; Table 1 gives the reconstruction errors of the three algorithms on the Cuprite hyperspectral data. FIGS. 4 to 9 show the abundance results for the typical minerals Alunite and Chalcedony, among the 11 end members, as solved by the 3 algorithms.
When solving with the fully constrained least squares (FCLS) method, 2000 iterations are performed.
When solving with the L1-norm-based sparse mixed-pixel decomposition model, λ is set to 10⁻⁴ and the number of iterations is 2000.
When training on the data, the 3D convolutional neural network uses mini-batches, with a batch size of 20, 2000 iterations and a learning rate of 0.01.
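With the sketch class above, the stated training settings (mini-batches of 20, 2000 iterations, learning rate 0.01) can be written as follows; sample_batch is a hypothetical loader over the patch set, PyTorch's built-in SGD momentum stands in for formulas (1) and (2), and the mean-squared-error loss against the abundance samples is an assumption.

```python
import torch

net = Unmix3DCNN(bands=196, s2=5, m_endmembers=11, k1=(2, 2, 5), k2=(2, 2, 5))
opt = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)  # mc = 0.9 assumed
loss_fn = torch.nn.MSELoss()  # fit predicted to sampled abundances (assumed)

for step in range(2000):          # 2000 iterations
    xb, yb = sample_batch(20)     # hypothetical: 20 patches and their abundances
    opt.zero_grad()
    loss = loss_fn(net(xb), yb)
    loss.backward()
    opt.step()
```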
Table 1 Reconstruction errors of different algorithms on the Cuprite hyperspectral data
| Algorithm | FCLS | L1 sparse decomposition | 3D-CNN |
| RMSE | 0.0453 | 0.0427 | 0.0394 |
The comparison of the spectral reconstruction errors (RMSE) of the three algorithms in Table 1 shows that the unmixing result based on the 3D convolutional neural network is better than that of the other two algorithms. Since no reference abundance maps are available for comparison, the quality can only be judged roughly, visually and numerically; overall, however, the results of the 3D convolutional neural network are satisfactory. Its most important advantage is that good results are obtained with very little parameter tuning.
The beneficial effects of the invention are as follows: the invention provides a novel hyperspectral image unmixing method; unmixing a hyperspectral image with the proposed 3D convolutional neural network yields satisfactory results with almost no parameter tuning, and compared with other methods the proposed scheme is simpler and more practical.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.