CN111563520A - Hyperspectral image classification method based on space-spectrum combined attention mechanism - Google Patents

Hyperspectral image classification method based on space-spectrum combined attention mechanism

Info

Publication number
CN111563520A
Authority
CN
China
Prior art keywords
attention
space
spectral
spatial
score
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010044989.1A
Other languages
Chinese (zh)
Other versions
CN111563520B (en)
Inventor
尹继豪 (Yin Jihao)
李磊 (Li Lei)
刘雨晨 (Liu Yuchen)
黄浦 (Huang Pu)
王麒雄 (Wang Qixiong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202010044989.1A priority Critical patent/CN111563520B/en
Publication of CN111563520A publication Critical patent/CN111563520A/en
Application granted granted Critical
Publication of CN111563520B publication Critical patent/CN111563520B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks


Abstract

To address the weak performance of conventional convolutional neural networks on fine-grained image classification tasks, of which hyperspectral image classification is representative, a hyperspectral image classification algorithm based on a space-spectrum combined attention mechanism is provided. Working together with a convolutional neural network, the mechanism effectively captures global image features and adaptively focuses on the local spatial features that differ most between similar images. It also evaluates the contribution of each spectral band to the task, so that the network attends more strongly to high-contribution bands and extracts locally discriminative spectral features. The method improves hyperspectral image classification accuracy and has wide application in the classification of fine-grained images, of which hyperspectral images are representative.

Description

Hyperspectral image classification method based on space-spectrum combined attention mechanism
Technical Field
The invention relates to a hyperspectral image classification method based on a space-spectrum combined attention mechanism. The method can be used in the field of remote sensing image processing.
Background
Hyperspectral remote sensing was one of the most important technical breakthroughs in airborne and satellite-borne Earth observation at the end of the twentieth century. Hyperspectral images overcome the limitations of traditional single-band and multispectral remote sensing in band range, band count, and fine observation and identification of ground targets, and thus have unique advantages in remote sensing Earth observation. Hyperspectral image classification is an important and practically meaningful task: given an image, identify and label the class of every pixel according to its spectral and spatial features.
Compared with ordinary image classification, hyperspectral images suffer from the curse of dimensionality and from the "same spectrum, different object" phenomenon in the spectral domain, which makes the classification task harder. Under these circumstances, traditional hyperspectral classification algorithms that rely solely on spectral information have limited performance, and classification based on joint spatial-spectral information has become a research hotspot in recent years.
Since 2012, deep learning, represented by the convolutional neural network (CNN), has achieved great success in computer vision tasks. CNNs are well suited to processing spatial-domain image information, have excelled at ordinary image classification, and were first applied to hyperspectral image classification in 2016. Many CNN algorithms for the task followed, but the limited size of a convolutional network's receptive field makes it hard for them to extract global image features. Worse, because of the particularity of hyperspectral data, the image must be preprocessed before classification: it is divided into cubes centered on each pixel (typically 27 × 27), with the center pixel's label serving as each cube's label. As a result, cubes of different classes can be very similar in their spatial features; this is commonly called overall spatial feature redundancy, and images that differ only in slight local features are fine-grained images. Conventional CNNs handle such spatially redundant fine-grained images poorly, which severely limits further gains on hyperspectral and other fine-grained classification tasks.
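The patch-cube preprocessing described above can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the function name, the reflect padding mode, and the toy sizes are assumptions (the patent only states the typical 27 × 27 cube size).

```python
import numpy as np

def extract_cubes(img, labels, size=27):
    """Cut one size x size x C cube centered on every pixel; each cube
    inherits the label of its center pixel. Reflect padding at the image
    border is an assumption made for this sketch.
    img: (H, W, C) hyperspectral image; labels: (H, W) class map.
    Returns cubes of shape (H*W, size, size, C) and labels of shape (H*W,).
    """
    r = size // 2
    padded = np.pad(img, ((r, r), (r, r), (0, 0)), mode="reflect")
    H, W, _ = img.shape
    cubes = np.stack([padded[i:i + size, j:j + size]
                      for i in range(H) for j in range(W)])
    return cubes, labels.reshape(-1)

# Toy example: a 10 x 12 image with 5 bands and 7 x 7 cubes.
rng = np.random.default_rng(0)
img = rng.standard_normal((10, 12, 5))
labels = rng.integers(0, 3, size=(10, 12))
cubes, y = extract_cubes(img, labels, size=7)
```

Each cube's center pixel is exactly the pixel it was cut around, so the center-label convention from the text holds by construction.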
In addition, unlike ordinary images, hyperspectral images carry very rich spectral information. Most traditional classification algorithms treat all spectral bands as contributing equally to the task, but in reality, due to physical factors such as illumination and atmosphere, some bands tend to be noisy and contribute essentially nothing to the current task, or even interfere with it.
Accordingly, a mechanism is designed that can effectively capture global image features and the local spatial features that differ most between similar fine-grained images, while also evaluating each band's contribution to the task so that the neural network attends more to high-contribution bands and extracts locally discriminative spectral features. Improving hyperspectral image classification accuracy in this way is well worth studying.
Disclosure of Invention
To address the weak classification performance of conventional convolutional neural networks on fine-grained images, of which hyperspectral images are representative, the invention provides a hyperspectral image classification method based on a space-spectrum combined attention mechanism. Working with a convolutional neural network, it effectively captures global image features and adaptively focuses on the local spatial features that differ most between similar images; it also evaluates each band's contribution to the task so that the network attends more to high-contribution bands and extracts locally discriminative spectral features, improving hyperspectral classification accuracy. The method has wide application in the classification of fine-grained images such as hyperspectral images.
The algorithm of the invention provides a space-spectrum combined attention mechanism module, which has the following three advantages:
(1) The algorithm is highly portable and can be embedded at will into various conventional convolutional neural networks.
(2) The algorithm is highly general: attention sub-modules can be selected flexibly according to task requirements. For example, for an ordinary fine-grained image classification task without spectral features, the spatial attention sub-module can be selected on its own.
(3) The algorithm is effective and markedly improves the performance of the convolutional neural network.
drawings
FIG. 1 is a block diagram of a spatial-spectral combined attention mechanism;
FIG. 2 is a block diagram of three structures of a convolutional neural network embedded with a spatial-spectral combined attention mechanism module;
FIG. 3 compares experimental results of different algorithms on a hyperspectral dataset. Note: in the experiments the space-spectrum combined attention mechanism module is called the Joint Spatial-Spectral Attention Module, abbreviated JSAM; the convolutional neural network using the series embedding mode is denoted CNN-JSAM-A, the one using the parallel embedding mode CNN-JSAM-B, and the one using the series-parallel embedding mode CNN-JSAM-C. The Indian Pines data serve as the hyperspectral dataset, with 10% of samples used as the training set; the network parameters and layer counts of the CNNs are kept identical, the only difference being whether the JSAM module is embedded.
Detailed Description
As shown in FIG. 1, the space-spectrum combined attention mechanism module consists of three sub-modules: a spatial attention score extraction sub-module, a spectral attention score extraction sub-module, and an attention score assignment sub-module. The spatial attention score extraction sub-module extracts similarity features between every pair of pixels in space to obtain a spatial attention score map; the spectral attention score extraction sub-module extracts correlation dependencies among different spectral bands to obtain a spectral attention score map; and the attention score assignment sub-module distributes the extracted spatial and spectral attention scores back into the original feature space, yielding an attention score cube containing attention features over different spatial locations and different bands.
(1) Spatial attention score extraction submodule
The hyperspectral cube input to the network is denoted X as follows:

X = [x_1, x_2, ..., x_N] ∈ R^{N×C}

where H is the height of the input hyperspectral cube;
W is the width of the input hyperspectral cube;
C is the spectral dimension of the input hyperspectral cube;
and N = H × W.
the method comprises the following steps: respectively mapping an input image X according to a formula (1) into an embedded spectral feature space to obtain two new feature maps theta (X) and phi (X);
Figure BDA0002369032810000032
wherein i and j are the numbers of pixels in the feature map;
Figure BDA0002369032810000033
and
Figure BDA0002369032810000034
linear mapping matrixes are adopted, and the linear mapping matrixes are parameters which can be learned in the neural network;
d is the spectral dimension mapped to the new feature maps θ (X) and φ (X) in the embedded spectral space;
Step 2: compute the similarity s_ij of every pair of pixels with an embedded-Gaussian function, obtaining the spatial attention score map S; the computation follows formulas (2)-(3) and FIG. 1:

s_ij = exp(θ(x_i)^T φ(x_j)) / Σ_{j=1..N} exp(θ(x_i)^T φ(x_j))        (2)

S = [s_ij] ∈ R^{N×N}        (3)

where s_ij is the similarity between the i-th and j-th pixels.
In the implementation, W_θ and W_φ are learnable network parameters realized as 1 × 1 convolution layers; formula (2) first transposes θ(x_i) to obtain θ(x_i)^T, then matrix-multiplies θ(x_i)^T with φ(x_j), and finally normalizes with a neural-network softmax layer.
(2) Spectral attention score extraction submodule
The hyperspectral cube input to the network is again denoted X:

X ∈ R^{H×W×C}

where H is the height of the input hyperspectral cube;
W is the width of the input hyperspectral cube;
and C is the spectral dimension of the input hyperspectral cube.
the method comprises the following steps: respectively mapping an input image X according to a formula (4) into an embedding space feature space to obtain two new feature maps upsilon (X) and omega (X);
Figure BDA0002369032810000044
wherein i and j are numbers of spectral bands corresponding to the characteristic diagram;
Wυand WωLinear mapping matrixes are adopted, and the linear mapping matrixes are parameters which can be learned in the neural network;
Step 2: compute the similarity q_ij of the feature maps of every pair of spectral bands with an embedded-Gaussian function, obtaining the spectral attention score map Q; the computation follows formulas (5)-(6) and FIG. 1:

q_ij = exp(υ(x_i)^T ω(x_j)) / Σ_{j=1..C} exp(υ(x_i)^T ω(x_j))        (5)

Q = [q_ij] ∈ R^{C×C}        (6)

where q_ij is the similarity between the feature maps of the i-th and j-th spectral bands.
In the implementation, W_υ and W_ω are learnable network parameters realized as 3 × 3 depth-wise convolution layers; formula (5) first transposes υ(x_i) to obtain υ(x_i)^T, then matrix-multiplies υ(x_i)^T with ω(x_j), and finally normalizes with a neural-network softmax layer.
(3) Attention score assignment submodule
The attention score assignment sub-module distributes the extracted spatial and spectral attention scores back into the original feature space, obtaining an attention score cube containing attention features over different spatial locations and different bands.
The input image X is represented as follows:

X ∈ R^{H×W×C}

Step 1: so that the attention mechanism module can adaptively focus on local spatial regions and local spectral bands of the feature map according to task requirements, X is first mapped within the feature space into a brand-new feature map ζ(X), as in formula (7); in the implementation, formula (7) is realized with a 3 × 3 convolution layer, where W_ζ is the 3 × 3 convolution kernel parameter:

ζ(X) = W_ζ * X        (7)

Step 2: the spatial attention score map S and the spectral attention score map Q are assigned to the original feature space by formula (8), yielding the attention mechanism score cube A:

A = S · ζ(X) · Q        (8)
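Formula (8) chains cleanly on a flattened cube: S (N × N) acts on pixels from the left, Q (C × C) on bands from the right. The sketch below assembles the whole module under the same stated simplifications as before (linear maps standing in for the 1 × 1, depth-wise, and 3 × 3 convolutions; all weights illustrative).

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def jsam(X, W_theta, W_phi, W_up, W_om, W_zeta):
    """Formula (8): A = S . zeta(X) . Q on a flattened (N, C) cube.
    zeta is sketched as a per-pixel linear map standing in for the
    3 x 3 convolution of formula (7); all weights are assumed learnable.
    """
    S = softmax((X @ W_theta) @ (X @ W_phi).T, axis=1)   # (N, N) spatial scores
    Q = softmax((X.T @ W_up) @ (X.T @ W_om).T, axis=1)   # (C, C) spectral scores
    Z = X @ W_zeta                                       # zeta(X), (N, C)
    return S @ Z @ Q                                     # score cube A, (N, C)

# Toy shapes: N = 16 pixels, C = 6 bands, embedding dimension d = 3.
rng = np.random.default_rng(3)
N, C, d = 16, 6, 3
X = rng.standard_normal((N, C))
A = jsam(X,
         rng.standard_normal((C, d)), rng.standard_normal((C, d)),
         rng.standard_normal((N, d)), rng.standard_normal((N, d)),
         rng.standard_normal((C, C)))
```

The output A has the same (N, C) shape as the input, which is what lets the module be dropped between CNN layers.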
In addition, the algorithm designs a set of ways of embedding the space-spectrum combined attention mechanism module into a convolutional neural network, mainly the following three embedding modes:
(1) Series embedding mode
(2) Parallel embedding mode
(3) Series-parallel embedding mode
Detailed diagrams of the three structures of a convolutional neural network with the embedded space-spectrum combined attention mechanism module are shown in FIG. 2.
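The three embedding modes can be sketched as wiring patterns. The exact structures are shown only in FIG. 2, so the wirings below are assumptions, and the building blocks are toy stand-ins (identity-like maps) used purely to make the sketch runnable.

```python
import numpy as np

# Toy stand-ins: the real blocks would be a convolution block and the
# attention module (JSAM); simple elementwise maps keep the sketch runnable.
def conv_block(x):
    return np.tanh(x)        # placeholder for a conv + activation block

def jsam_module(x):
    return x * 0.5           # placeholder for the attention module

# Three assumed wirings (FIG. 2 shows the actual structures):
def series(x):               # CNN block feeding into JSAM, in sequence
    return jsam_module(conv_block(x))

def parallel(x):             # CNN branch and JSAM branch, fused by addition
    return conv_block(x) + jsam_module(x)

def series_parallel(x):      # serial path plus a parallel skip through JSAM
    return conv_block(x) + jsam_module(conv_block(x))

x = np.ones((4, 4))
outs = [series(x), parallel(x), series_parallel(x)]
```

All three wirings preserve the feature-map shape, so any of them can replace a plain CNN stage without touching the rest of the network.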

Claims (1)

1. A hyperspectral image classification algorithm based on a space-spectrum combined attention mechanism, mainly comprising a space-spectrum combined attention mechanism module and modes of embedding it into a convolutional neural network:
1) The space-spectrum combined attention mechanism module consists of three sub-modules: a spatial attention score extraction sub-module, a spectral attention score extraction sub-module, and an attention score assignment sub-module; the spatial attention score extraction sub-module extracts similarity features between every pair of pixels in space to obtain a spatial attention score map; the spectral attention score extraction sub-module extracts correlation dependencies among different spectral bands to obtain a spectral attention score map; the attention score assignment sub-module distributes the extracted spatial and spectral attention score maps back into the original feature space pixel by pixel and band by band, obtaining an attention score cube containing attention features of different pixels and different bands; the specific steps are as follows:
(a) Spatial attention score extraction sub-module
Step 1: map the input image X into an embedded spectral feature space, obtaining two new feature maps θ(X) and φ(X);
Step 2: compute the similarity s_ij of every pair of pixels with an embedded-Gaussian function, obtaining the spatial attention score map S; finally normalize with a neural-network softmax layer;
(b) Spectral attention score extraction sub-module
Step 3: map the input image X into an embedded spatial feature space, obtaining two new feature maps υ(X) and ω(X);
Step 4: compute the similarity q_ij of the feature maps of every pair of spectral bands with an embedded-Gaussian function, obtaining the spectral attention score map Q, realized in the experiments with a 3 × 3 depth-wise convolution layer; finally normalize with a neural-network softmax layer;
(c) Attention score assignment sub-module
The attention score assignment sub-module distributes the extracted spatial and spectral attention scores back into the original feature space, obtaining an attention score cube containing attention features of different spatial locations and different bands;
Step 5: so that the attention mechanism module can adaptively focus on local spatial regions and local spectral bands of the feature map according to task requirements, first map X within the feature space into a brand-new feature map ζ(X), realized with a 3 × 3 convolution layer, where W_ζ is the 3 × 3 convolution kernel parameter;
Step 6: assign the spatial attention score map S and the spectral attention score map Q to the original feature space, obtaining the attention mechanism score cube A.
2) The space-spectrum combined attention mechanism module is embedded into the convolutional neural network in three ways:
(a) a series embedding mode;
(b) a parallel embedding mode;
(c) a series-parallel embedding mode.
CN202010044989.1A 2020-01-16 2020-01-16 Hyperspectral image classification method based on space-spectrum combined attention mechanism Active CN111563520B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010044989.1A CN111563520B (en) 2020-01-16 2020-01-16 Hyperspectral image classification method based on space-spectrum combined attention mechanism


Publications (2)

Publication Number Publication Date
CN111563520A (en) 2020-08-21
CN111563520B (en) 2023-01-13

Family

ID=72071383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010044989.1A Active CN111563520B (en) 2020-01-16 2020-01-16 Hyperspectral image classification method based on space-spectrum combined attention mechanism

Country Status (1)

Country Link
CN (1) CN111563520B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112052755A (en) * 2020-08-24 2020-12-08 西安电子科技大学 Semantic convolution hyperspectral image classification method based on multi-path attention mechanism
CN112232343A (en) * 2020-09-03 2021-01-15 国家粮食和物资储备局科学研究院 Neural network and method for recognizing grain mildewed grains
CN112287989A (en) * 2020-10-20 2021-01-29 武汉大学 Aerial image ground object classification method based on self-attention mechanism
CN113537239A (en) * 2021-07-08 2021-10-22 宁波大学 Hyperspectral image band selection method based on global relationship perception attention
CN114462596A (en) * 2022-02-10 2022-05-10 黑龙江省农业科学院 Disease and insect pest monitoring method and system for industrial hemp growth period

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120108997A1 (en) * 2008-12-19 2012-05-03 Cuntai Guan Device and method for generating a representation of a subject's attention level
CN109376804A (en) * 2018-12-19 2019-02-22 中国地质大学(武汉) Based on attention mechanism and convolutional neural networks Classification of hyperspectral remote sensing image method
CN109993220A (en) * 2019-03-23 2019-07-09 西安电子科技大学 Multi-source Remote Sensing Images Classification method based on two-way attention fused neural network
CN110458192A (en) * 2019-07-05 2019-11-15 中国地质大学(武汉) The Classification of hyperspectral remote sensing image method and system of view-based access control model conspicuousness
CN110516596A (en) * 2019-08-27 2019-11-29 西安电子科技大学 Empty spectrum attention hyperspectral image classification method based on Octave convolution


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XIAOGUANG MEI et al.: "Spectral-Spatial Attention Networks for Hyperspectral Image Classification", Remote Sensing *


Also Published As

Publication number Publication date
CN111563520B (en) 2023-01-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant