CN111612127A - Multi-direction information propagation convolution neural network construction method for hyperspectral image classification - Google Patents
- Publication number
- CN111612127A (application CN202010359251.4A)
- Authority
- CN
- China
- Prior art keywords
- dimensional
- convolution
- network
- input
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/10—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
Abstract
The invention discloses a method for constructing a multi-direction information propagation convolutional neural network for hyperspectral image classification, comprising the following steps: the input is a local three-dimensional hyperspectral data cube sample centered on a target pixel; the deep neural network consists of two-dimensional convolutional multilayer perceptrons between hidden layer units, two-dimensional convolutional perceptrons inside the hidden layer units, a pooling layer, and a fully connected layer; the two-dimensional convolutional perceptron inside a hidden layer unit splits the hidden layer's internal feature map into slices along the row or column direction and performs piece-by-piece convolution between the feature slices in the upward, downward, leftward, and rightward directions, thereby propagating pixel spatial information along different directions; the output layer is the class probability vector of the input spectral pixel. Unlike a classical convolutional network, this network forms a spatial information propagation mechanism between feature channels inside the hidden layer and can learn more discriminative spatial-spectral features; applied to hyperspectral supervised classification, it greatly improves supervised classification performance with few training samples.
Description
Technical Field
The invention relates to deep neural network technology, and in particular to a method for constructing a multi-direction information propagation convolutional neural network for hyperspectral image classification.
Background
In recent years, the convolutional neural network, as a popular deep learning framework, has gradually become a powerful tool in hyperspectral image analysis and has very broad application prospects in the field of hyperspectral classification. Compared with methods based on shallow representation learning, a convolutional neural network built from deep convolutional perceptrons can adaptively learn hierarchical representations from low-level to high-level features and thereby identify the most discriminative features for the supervised classification of hyperspectral images. For the hyperspectral image classification task, the challenges lie in the following aspects. First, the pixels lie on a high-dimensional complex manifold, and the nonlinear correlation between pixels is more complex than in natural images. Second, the spatial variability of spectral features increases intra-class variability. Finally, because the distribution of ground-object types is unbalanced, hyperspectral images are always in a class-imbalanced state.
To address these issues, convolutional neural network frameworks with various structures have been proposed in succession to obtain more compact and more discriminative spatial-spectral features. Among convolutional-neural-network-based methods the convolution forms are varied; typical methods include the two-dimensional convolutional neural network [Makantasis K., Karantzalos K., Doulamis A., Doulamis N. Deep supervised learning for hyperspectral data classification through convolutional neural networks. IEEE International Geoscience and Remote Sensing Symposium, 2015: 4959-4962], the three-dimensional convolutional neural network [Li Y., Zhang H., Shen Q. Spectral-Spatial Classification of Hyperspectral Imagery with 3D Convolutional Neural Network. Remote Sensing, 2017, 9(1): 67], and the spatial-spectral residual network [Zhong Z., Li J., Luo Z., Chapman M. Spectral-Spatial Residual Network for Hyperspectral Image Classification: A 3-D Deep Learning Framework. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(2): 847-858]. The two-dimensional convolutional neural network uses convolution layers and multilayer perceptrons to hierarchically construct high-level features containing rich pixel spatial-spectral information. The three-dimensional convolutional neural network directly takes the original three-dimensional hyperspectral data as network input without hand-crafted image features, and can effectively extract the spatial-spectral features of the hyperspectral image. The spatial-spectral residual network uses spectral residual blocks and spatial residual blocks to learn deep discriminative features.
Although these methods effectively improve hyperspectral classification results, many deep features of the hyperspectral image remain unexploited; in particular, the spatial features are still far from fully utilized.
Disclosure of Invention
The invention aims to provide a multi-direction information propagation convolutional neural network for hyperspectral image classification. Unlike a classical convolutional network, it forms a spatial information propagation mechanism between feature channels inside the hidden layer, can learn more discriminative spatial-spectral features, and, when applied to hyperspectral supervised classification, greatly improves supervised classification performance with few training samples.
The technical solution for realizing the purpose of the invention is as follows: a construction method of a multi-directional information propagation convolutional neural network for hyperspectral image classification comprises the following steps:
the input layer is three-dimensional spatial spectrum data with a target pixel as a center, namely the input data of the network is a three-dimensional neighborhood pixel block with hundreds of spectral bands;
constructing a multi-directional information propagation convolutional neural network;
accelerating network training by adopting batch normalization, the parametric rectified linear unit (PReLU) activation function, and a random dropout strategy;
the output layer is the class probability vector of the input spectral pixel; that is, the network output is the class probability vector of the center pixel of the input three-dimensional neighborhood pixel block, used to determine the class of that pixel, and the vector length is the total number of classes.
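At the shape level, the four steps above can be sketched as follows; all sizes here (a 9 × 9 neighborhood, 103 bands, 16 feature channels, 9 classes) are illustrative stand-ins, not values fixed by the invention:

```python
import numpy as np

l, b, m, C = 9, 103, 16, 9        # neighborhood size, bands, feature channels, classes

X = np.random.rand(l, l, b)       # step 1: 3-D neighborhood block centered on the target pixel
H = np.random.rand(l, l, m)       # step 2: hidden feature map after 2-D and piece-by-piece convolutions
                                  # step 3: BN / PReLU / dropout change values, not shapes
z = np.random.rand(C)             # step 4: logits from pooling + fully connected layer
Y = np.exp(z) / np.exp(z).sum()   # class probability vector; its length is the number of classes
```

The final vector Y is nonnegative and sums to one, so the predicted class is simply its argmax.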
Compared with existing classification methods, the deep neural network establishes feature-slice propagation inside the hidden layer and a multi-direction information propagation mechanism between hidden layers, with the following advantages: (1) the two-dimensional convolutional perceptron inside the hidden layer unit effectively exploits the spatial correlation of hyperspectral image pixels and enriches the spatial information of each pixel; combined with the two-dimensional convolutional multilayer perceptron between hidden layer units, it extracts rich and discriminative spatial-spectral features; (2) with an effective optimization method, the network converges quickly and has few parameters; (3) the network performs well with a small training set, has good stability, and achieves excellent results when applied to hyperspectral image classification; the invention can be widely applied to fields such as land-cover classification, environmental monitoring, and crop classification.
The present invention is described in further detail below with reference to the attached drawing figures.
Drawings
FIG. 1 is a block diagram of a multi-directional information propagation convolutional neural network structure for hyperspectral image classification according to the present invention.
Fig. 2 is a schematic diagram of the piece-by-piece convolution.
Fig. 3 is a network architecture diagram of the present invention.
FIG. 4(a) is a Pavia University ground truth map.
FIG. 4(b) is a graph of the classification effect of the Pavia University 0.5% training set.
FIG. 4(c) is a graph of the classification effect of the Pavia University 1% training set.
FIG. 4(d) is a graph of the classification effect of the Pavia University 5% training set.
FIG. 5(a) is the Indian Pines ground truth map.
FIG. 5(b) is a graph of the classification effect of Indian Pines 1% training set.
FIG. 5(c) is a diagram of the effect of the Indian Pines 5% training set classification.
FIG. 5(d) is a diagram of the classification effect of Indian Pines 10% training set.
Detailed Description
In order to solve the problem that existing convolutional neural network methods cannot fully utilize the spatial features of a hyperspectral image, the invention provides a method for constructing a multi-direction information propagation convolutional neural network for hyperspectral image classification: the input is a local three-dimensional hyperspectral data cube sample centered on a target pixel; the deep neural network consists of two-dimensional convolutional multilayer perceptrons between hidden layer units, two-dimensional convolutional perceptrons inside the hidden layer units, a pooling layer, and a fully connected layer; the multilayer convolutional perceptron between hidden layer units convolves the spectral dimension with 1 × 1 two-dimensional convolution kernels; the two-dimensional convolutional perceptron inside a hidden layer unit splits the hidden layer's internal feature map into slices along the row or column direction and performs piece-by-piece convolution between the feature slices in the upward, downward, leftward, and rightward directions, thereby propagating pixel spatial information along different directions; the output layer is the class probability vector of the input spectral pixel. Unlike a classical convolutional network, this network forms a spatial information propagation mechanism between feature channels inside the hidden layer, can learn more discriminative spatial-spectral features, and, applied to hyperspectral supervised classification, greatly improves supervised classification performance with few training samples.
Through a novel piece-by-piece convolution structure combined with two-dimensional convolution layers, the network fully utilizes the spatial correlation between hyperspectral image pixels, obtains richer and more distinctive spatial-spectral features, and achieves excellent results in hyperspectral image supervised classification.
The following describes the implementation process of the present invention in detail with reference to fig. 1 and 2, and the steps of the present invention are as follows:
In the first step, the input layer is three-dimensional spatial-spectral data centered on a target pixel; that is, the input data of the network is a three-dimensional neighborhood pixel block with hundreds of spectral bands, the number of bands lying in the range [100, 600]. Denote the input as X ∈ R^(l×l×b), a three-dimensional neighborhood pixel block, where l is the height and width of the block and b is its number of spectral channels.
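A minimal sketch of assembling such an input block from a hyperspectral cube stored as an (H, W, b) array; the reflect-padding at image borders is our assumption for handling edge pixels, not something this document specifies:

```python
import numpy as np

def extract_patch(cube, row, col, l):
    """Extract the l x l x b neighborhood block centered on pixel (row, col).

    cube: hyperspectral image of shape (H, W, b). Border pixels are handled
    by reflect-padding (an illustrative choice)."""
    pad = l // 2
    padded = np.pad(cube, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    return padded[row:row + l, col:col + l, :]

# toy cube: 10 x 10 pixels, 5 bands
cube = np.random.rand(10, 10, 5)
patch = extract_patch(cube, 0, 0, l=5)   # a corner pixel still yields a full block
```

The target pixel always ends up at the center of the patch, which is what the network's output refers to.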
In the second step, the multi-direction information propagation convolutional neural network is constructed. The deep neural network is composed of two-dimensional convolutional multilayer perceptrons between hidden layer units, two-dimensional convolutional perceptrons inside the hidden layer units, a pooling layer, and a fully connected layer. The two-dimensional convolutional multilayer perceptron between hidden layer units and the two-dimensional convolutional perceptron inside a hidden layer unit have the following structural forms:
1) the input of the hidden layer unit is X ∈ R^(l×l×m) and its output is O ∈ R^(l×l×m); the two-dimensional convolutional multilayer perceptron between hidden layer units adopts m 1 × 1 two-dimensional convolution kernels to transform the spectral dimension, giving the output T ∈ R^(l×l×m), whose i-th channel is T_i = σ(BN(h_i ⊛ X)); m is the number of input and output channels, an integer greater than 1;
2) the two-dimensional convolutional perceptron inside the hidden layer unit slices the feature map T obtained after the two-dimensional convolution along the row or column direction into l feature slices of size l × m, and sequentially applies two-dimensional convolution to each feature slice in the upward, downward, leftward, and rightward directions. The convolution kernel size is w × m, with 0 < w ≤ l and w an integer, and the number of convolution kernels is m; same-padding keeps the size of the convolution result equal to that of the original feature slice. The convolution result of size l × m is linearly added to the next feature slice to obtain the updated feature slice; two-dimensional convolution is then applied to the updated feature slice, and the result updates the next slice; this is repeated until the last slice is updated. The computation is as follows:

(f_1, f_2, ……, f_l) = split(T)
f′_k = f_k + σ(BN(W_{k-1} ⊛ f′_{k-1})), k = 2, ……, l, with f′_1 = f_1
O = CON(f_1, f′_2, ……, f′_l)

where ⊛ represents the convolution operation, h_i is the i-th convolution kernel in the two-dimensional convolution layer, BN(·) denotes batch normalization, σ is the nonlinear activation function, split(·) slices the feature map output by the previous layer along the row or column direction of the image, f_k is the k-th feature slice after slicing, f′_k is the updated k-th feature slice, W_{k-1} is the convolution kernel of the (k−1)-th slice in the piece-by-piece convolution, and CON(·) re-splices the feature slices into a feature map.
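A single downward pass of the piece-by-piece update can be sketched in NumPy as below. Batch normalization is omitted for brevity and ReLU stands in for the nonlinearity σ, so this is an illustrative sketch of the mechanism rather than the exact trained layer:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv_slice(f, W):
    """'Same'-padded convolution of one feature slice f of shape (l, m)
    with m kernels of shape (w, m); output has shape (l, m)."""
    m, w = W.shape[0], W.shape[1]
    pad = w // 2
    fp = np.pad(f, ((pad, pad), (0, 0)))
    out = np.empty((f.shape[0], m))
    for k in range(m):                      # one output channel per kernel
        for i in range(f.shape[0]):
            out[i, k] = np.sum(fp[i:i + w, :] * W[k])
    return out

def sweep_down(T, Ws):
    """One downward pass on a feature map T of shape (l, l, m): each row
    slice is convolved and added to the next slice, whose updated value
    is then used for the following step (f'_k = f_k + relu(conv(f'_{k-1})))."""
    l = T.shape[0]
    slices = [T[i].copy() for i in range(l)]          # split(T)
    for k in range(1, l):
        slices[k] = slices[k] + relu(conv_slice(slices[k - 1], Ws[k - 1]))
    return np.stack(slices, axis=0)                   # CON(...)

rng = np.random.default_rng(0)
l, m, w = 4, 3, 3
T = rng.standard_normal((l, l, m))
Ws = 0.1 * rng.standard_normal((l - 1, m, w, m))      # one kernel bank per slice step
O = sweep_down(T, Ws)
```

The upward, leftward, and rightward passes follow the same pattern with the slicing direction and iteration order changed.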
In the third step, batch normalization, the parametric rectified linear unit activation function, and random dropout are adopted to accelerate network training, where the parametric rectified linear unit activation function, abbreviated PRelu(x_i), is defined as:

PRelu(x_i) = x_i, if x_i > 0; a_i · x_i, if x_i ≤ 0

where x_i represents the input of the parametric rectified linear unit activation function on the i-th channel, and a_i is a learnable parameter that determines the slope of the negative part. a_i is updated by the momentum method:

Δa_i ← μ · Δa_i + lr · ∂ε/∂a_i

where μ is the momentum with value range [0, 1]; lr is the learning rate of the network with value range [0, 0.0005]; ∂ε/∂a_i is the gradient of the training objective with respect to a_i; and a_i takes 0.25 as its initial value in the iteration.
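A small NumPy sketch of the PReLU activation and one momentum step for the slope parameter; the gradient value and the descent sign convention are illustrative assumptions:

```python
import numpy as np

def prelu(x, a):
    """Parametric ReLU: x_i if x_i > 0, else a_i * x_i (a_i learnable)."""
    return np.where(x > 0, x, a * x)

def momentum_step(a, delta, grad, mu=0.9, lr=0.0005):
    """One momentum update of the slope: delta <- mu*delta + lr*grad,
    then a <- a - delta (gradient-descent sign convention assumed)."""
    delta = mu * delta + lr * grad
    return a - delta, delta

x = np.array([-2.0, -0.5, 0.0, 1.0])
y = prelu(x, a=0.25)                       # 0.25 is the stated initial slope
a, d = momentum_step(0.25, 0.0, grad=0.1)  # grad value is illustrative
```

Unlike ReLU, the negative half keeps a small learnable slope, so gradients still flow for negative inputs.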
In the fourth step, the output layer is the class probability vector of the input spectral pixel; that is, the network output is the class probability vector of the center pixel of the input three-dimensional neighborhood pixel block, which determines the class of that pixel, and the vector length is the total number of classes. Denote by X ∈ R^(l×l×b) the three-dimensional neighborhood pixel block input to the network; the target pixel can belong to one of C different classes, and the output layer of the network is Y ∈ R^C, indicating the probability that the pixel belongs to each class. Y can be expressed as:

Y = FC(P(O′)) = [y_1, y_2, ……, y_C]

where y_c represents the probability that the pixel belongs to class c, P(·) represents the pooling operation, and FC(·) represents the fully connected operation.
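The mapping Y = FC(P(O′)) can be sketched with global average pooling standing in for P(·) and a random-weight fully connected layer with softmax standing in for FC(·); the pooling choice and the weights are illustrative stand-ins:

```python
import numpy as np

def softmax(z):
    z = z - z.max()            # stabilized softmax
    e = np.exp(z)
    return e / e.sum()

def output_layer(O, W_fc, b_fc):
    """Global average pooling over the final feature map O of shape (l, l, m),
    then a fully connected layer producing the C class probabilities."""
    pooled = O.mean(axis=(0, 1))             # P(O'): shape (m,)
    return softmax(W_fc @ pooled + b_fc)     # FC(.): shape (C,)

rng = np.random.default_rng(1)
l, m, C = 4, 8, 9                            # 9 classes, as in Pavia University
O = rng.standard_normal((l, l, m))
Y = output_layer(O, rng.standard_normal((C, m)), np.zeros(C))
```

The predicted class of the center pixel is the index of the largest entry of Y.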
In the invention, a novel piece-by-piece convolution structure is embedded in the hidden feature layers of a convolutional neural network. This structure exploits convolution operations between the feature slices of a feature map, so that spatial feature information is propagated and the spatial information of each pixel becomes richer. Combining the traditional layer-by-layer convolution with the proposed piece-by-piece convolution markedly improves feature learning, and the network obtains richer and more discriminative spatial-spectral features, thereby improving hyperspectral image classification performance. The network is an end-to-end supervised classification neural network model: the input needs no preprocessing, training is efficient and time-saving, the output is simple and clear, and the model is stable and robust, so the invention can be widely applied in engineering.
The effect of the invention can be further illustrated by the following simulation experiment:
examples
The hyperspectral image is typical three-dimensional spatial-spectral data, and the verification experiments were carried out on two commonly used hyperspectral data sets: the Indian Pines data set and the Pavia University data set. The Pavia University data set was collected by the ROSIS sensor over Pavia, Italy; it contains 115 bands in total with an image size of 610 × 340, and after removing the noisy bands the remaining 103 bands were used for the study. Since the image contains a large number of background pixels, 42776 ground-object pixels in 9 classes were actually used in the classification experiments. The Indian Pines data set is a hyperspectral remote sensing image acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over the Indian Pines test site in Indiana, USA. The image contains 220 bands in total, with a spatial resolution of 20 m and an image size of 145 × 145; after removing 20 water-absorption and low signal-to-noise-ratio bands, the remaining 200 bands were used. The region contains 16 known ground-object classes and 10249 labeled samples. No preprocessing was applied in the experiments on either data set. On the Pavia University data set, 0.5%, 1%, and 5% of the samples were randomly selected as the training set, 1% as the validation set, and the remainder as the test set. On the Indian Pines data set, 1%, 5%, and 10% of the samples were randomly selected as the training set, 1% as the validation set, and the remainder as the test set. Each experiment on the two data sets was repeated 10 times and the results averaged as the final result, with OA (overall accuracy) and AA (average accuracy) used as the evaluation indexes of classification performance.
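The OA and AA indexes can be computed from true and predicted labels as follows (a straightforward sketch: OA is the fraction of correctly classified pixels, AA the mean of the per-class accuracies):

```python
import numpy as np

def oa_aa(y_true, y_pred, n_classes):
    """Overall accuracy and average accuracy for a classification map."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    oa = float(np.mean(y_true == y_pred))
    per_class = [np.mean(y_pred[y_true == c] == c)
                 for c in range(n_classes) if np.any(y_true == c)]
    return oa, float(np.mean(per_class))

# toy example: class 0 has one error, classes 1 and 2 are perfect
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
oa, aa = oa_aa(y_true, y_pred, 3)
```

AA weights every class equally, which is why it is reported alongside OA on class-imbalanced scenes like Indian Pines.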
All experiments were performed on the same equipment and in the same environment: Windows 10 operating system, CPU: i5-8400, GPU: NVIDIA GeForce GTX 1060, 8 GB memory, Python 3.5 + TensorFlow environment. The network structure used in the experiments is shown in fig. 3: three-dimensional neighborhood pixel blocks are extracted from the original hyperspectral image as network input, four rounds of two-dimensional convolution and piece-by-piece convolution in different directions are performed, and the soft classification probability of a single sample is finally obtained after pooling, dimensionality reduction, full connection, dropout, and other operations.
Table 1 shows the classification accuracy results obtained by the validation experiment of the two data sets according to the method of the present invention.
TABLE 1
From the classification results, the method of the invention performs well on both the Pavia University and Indian Pines data sets. When 5% of the Pavia University samples and 10% of the Indian Pines samples are selected as training sets, the classification accuracy reaches 99%, far higher than that of traditional hyperspectral image classification methods, demonstrating the feasibility of the method. Moreover, with few training samples, i.e. when 1% and 0.5% of the Pavia University samples and 5% and 1% of the Indian Pines samples are selected as training sets, the classification results also remain high, showing that the method still obtains excellent results with a small number of training samples and has the advantage of high stability. The experimental results of the method on the two data sets are shown in figs. 4(a)-4(d) and figs. 5(a)-5(d); the classification result maps show that the method achieves a good classification effect on both data sets.
Claims (6)
1. A construction method of a multi-directional information propagation convolutional neural network for hyperspectral image classification is characterized by comprising the following steps:
the input layer is three-dimensional spatial spectrum data with a target pixel as a center, namely the input data of the network is a three-dimensional neighborhood pixel block with hundreds of spectral bands;
constructing a multi-directional information propagation convolutional neural network;
accelerating network training by adopting batch normalization, the parametric rectified linear unit activation function, and a random dropout strategy;
the output layer is the class probability vector of the input spectral pixel; that is, the network output is the class probability vector of the center pixel of the input three-dimensional neighborhood pixel block, used to determine the class of that pixel, and the vector length is the total number of classes.
2. The method for constructing the multi-direction information propagation convolutional neural network for hyperspectral image classification according to claim 1, wherein the input layer is three-dimensional spatial-spectral data centered on a target pixel, i.e. the input data of the network is a three-dimensional neighborhood pixel block with hundreds of spectral bands, recorded as X ∈ R^(l×l×b), where l is the height and width of the three-dimensional neighborhood pixel block and b is its number of spectral channels.
3. The method for constructing the multidirectional information propagation convolutional neural network for hyperspectral image classification according to claim 1, wherein the multidirectional information propagation convolutional neural network is constructed, and the deep neural network is composed of two-dimensional convolutional multi-layer perceptrons among hidden layer units, two-dimensional convolutional perceptrons inside the hidden layer units, pooling layers and full connection layers.
4. The method for constructing the multidirectional information propagation convolutional neural network for hyperspectral image classification according to claim 3, wherein the two-dimensional convolutional multi-layer perceptron between the hidden layer units and the two-dimensional convolutional perceptron inside the hidden layer units are constructed in the following forms:
1) the input of the hidden layer unit is X ∈ R^(l×l×m) and the output is O ∈ R^(l×l×m); the two-dimensional convolutional multilayer perceptron between hidden layer units adopts m 1 × 1 two-dimensional convolution kernels to carry out convolution transformation on the spectral dimension, and the output is T ∈ R^(l×l×m), whose i-th channel is T_i = σ(BN(h_i ⊛ X)); m is the number of input and output channels, an integer greater than 1;
2) the two-dimensional convolutional perceptron inside the hidden layer unit slices the feature map T obtained after two-dimensional convolution along the row or column direction into l feature slices of size l × m, and sequentially performs two-dimensional convolution on each feature slice in the upward, downward, leftward, and rightward directions; the convolution kernel size is w × m, with 0 < w ≤ l and w an integer, and the number of convolution kernels is m; same-padding keeps the size of the convolution result consistent with that of the original feature slice; the convolution result of size l × m is linearly added to the next feature slice to obtain the updated feature slice, two-dimensional convolution is applied to the updated feature slice, and the obtained convolution result updates the next slice; this operation is repeated until the last slice is updated, according to:

(f_1, f_2, ……, f_l) = split(T)
f′_k = f_k + σ(BN(W_{k-1} ⊛ f′_{k-1})), k = 2, ……, l, with f′_1 = f_1
O = CON(f_1, f′_2, ……, f′_l)

where ⊛ represents the convolution operation, h_i represents the i-th convolution kernel in the two-dimensional convolution layer, BN(·) denotes batch normalization, σ the nonlinear activation function, split(·) the operation of slicing the feature map output by the previous layer along the row or column direction of the image, f_k the k-th feature slice after slicing, f′_k the updated k-th feature slice, W_{k-1} the convolution kernel of the (k−1)-th slice in the piece-by-piece convolution, and CON(·) the operation of re-splicing the feature slices into a feature map.
5. The method of claim 1, wherein batch normalization, the parametric rectified linear unit activation function PReLU, and random dropout are employed to accelerate network training, wherein PReLU(x_i) is defined by the calculation formula:

PReLU(x_i) = max(0, x_i) + a_i · min(0, x_i)
where x_i represents the input of the parametric rectified linear unit activation function on the ith channel, and a_i is a learnable parameter that determines the slope of the negative part; a_i is updated by the momentum method:

Δa_i = μ · Δa_i + lr · ∂L/∂a_i,  a_i = a_i − Δa_i

where L denotes the training loss.
where μ is the momentum, with value range [0, 1]; lr is the learning rate of the network, with value range [0, 0.0005]; at the start of iteration, a_i takes the initial value 0.25.
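Claim 5's activation and its momentum update can be sketched as follows. The class name `PReluSlope` is illustrative, and the descent sign convention in `step` is an assumption (the claim specifies only the momentum form, μ, lr, and the 0.25 initialization).

```python
import numpy as np

def prelu(x, a):
    """Parametric ReLU: identity for x > 0, learnable slope a for x <= 0."""
    return np.where(x > 0, x, a * x)

class PReluSlope:
    """One channel's slope a_i, updated by the momentum method.
    mu (momentum) in [0, 1], lr in [0, 0.0005], a_i initialized to 0.25."""
    def __init__(self, mu=0.9, lr=0.0005):
        self.mu, self.lr = mu, lr
        self.a, self.delta = 0.25, 0.0

    def step(self, grad):
        # delta <- mu*delta + lr*dL/da;  a <- a - delta  (descent sign assumed)
        self.delta = self.mu * self.delta + self.lr * grad
        self.a -= self.delta
```

After a backward pass, `step` would be called once per channel with the accumulated gradient of the loss with respect to that channel's slope.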
6. The method for constructing a multi-direction information propagation convolutional neural network for hyperspectral image classification according to claim 1, wherein the output layer outputs the class probability vector of the input spectral pixel, i.e., the output of the network is the class probability vector of the central pixel of the input three-dimensional neighborhood pixel block; this vector determines the class to which the pixel belongs, and its length is the total number of classes. Denote by X the three-dimensional neighborhood pixel block input to the network; the target pixel can be classified into C different classes, and the output layer of the network is Y = [y_1, y_2, ..., y_C], representing the probability that the pixel belongs to each class, where Y is expressed as:
Y = FC(P(O′)) = [y_1, y_2, ..., y_C]
where y_c represents the probability that the pixel belongs to class c (c = 1, ..., C), P(·) represents the pooling-layer processing, and FC(·) represents the fully connected operation.
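The output layer Y = FC(P(O′)) of claim 6 can be sketched as below. Two assumptions are made that the claim leaves open: P(·) is taken as global average pooling, and a softmax is appended to the fully connected layer so that Y sums to 1 as a probability vector.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax turning FC logits into probabilities."""
    e = np.exp(z - z.max())
    return e / e.sum()

def output_layer(O, W_fc, b_fc):
    """Y = FC(P(O')): pooling P(.) over the spatial axes of O' (h, w, m),
    then a fully connected layer FC(.) with a softmax (both assumptions)."""
    pooled = O.mean(axis=(0, 1))            # P(.): (h, w, m) -> (m,)
    return softmax(W_fc @ pooled + b_fc)    # (C,) class probabilities
```

The predicted class of the central pixel is then `np.argmax(Y)`.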
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010359251.4A CN111612127B (en) | 2020-04-29 | 2020-04-29 | Multi-direction information propagation convolution neural network construction method for hyperspectral image classification |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111612127A true CN111612127A (en) | 2020-09-01 |
CN111612127B CN111612127B (en) | 2022-09-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||