CN116486160B - Hyperspectral remote sensing image classification method, equipment and medium based on spectrum reconstruction - Google Patents
- Publication number: CN116486160B (application CN202310457860.7A)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G06V10/764 — Image or video recognition using pattern recognition or machine learning, using classification (e.g. of video objects)
- G06N3/0455 — Auto-encoder networks; encoder-decoder networks
- G06N3/048 — Activation functions
- G06N3/0895 — Weakly supervised learning, e.g. semi-supervised or self-supervised learning
- G06V10/82 — Image or video recognition using neural networks
- G06V20/194 — Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB
- Y02A40/10 — Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
Abstract
The invention relates to a hyperspectral remote sensing image classification method, device and storage medium based on spectrum reconstruction. The classification method comprises the following steps: S1, performing data dimension reduction on the hyperspectral remote sensing image; S2, constructing input samples; S3, establishing a self-supervised pre-training network based on a spectrum-reconstruction task and training a feature-extraction backbone network with unlabeled samples; S4, constructing a classification network for the hyperspectral remote sensing image based on the pre-trained feature-extraction backbone network and training it with labeled samples to complete pixel-by-pixel classification. The method significantly improves the classification accuracy of hyperspectral remote sensing images under small-sample conditions, trains quickly, and is of great significance for practical applications.
Description
Technical Field
The invention relates to the technical field of hyperspectral remote sensing image classification, in particular to a hyperspectral remote sensing image classification method, equipment and medium based on spectrum reconstruction.
Background
By acquiring hundreds of contiguous spectral bands, hyperspectral remote sensing images can capture subtle material and texture differences of observed objects, making it possible to distinguish similar objects that are indistinguishable in the visible range. Benefiting from the excellent feature-extraction capability of deep networks, hyperspectral remote sensing image classification has made great progress. On the other hand, since labels for hyperspectral remote sensing images are mainly obtained by manual annotation, label acquisition is expensive and time-consuming; how to achieve good classification performance with few labeled samples is therefore a significant problem.
Conventional small-sample learning strategies, such as metric learning and meta-learning, use absolute or relative distances between sample features to identify the class of each unlabeled pixel in a hyperspectral remote sensing image. These methods still do not solve the sample-scarcity problem well. To increase the available supervision information, self-supervised learning strategies have been proposed to pre-train models using the data itself as supervision, improving classification performance and convergence speed during training. For the hyperspectral remote sensing image classification task, available self-supervised learning methods fall mainly into contrastive and generative approaches. Contrastive self-supervision relies on data augmentation to construct positive samples and builds the pre-training task by increasing the distance between positive- and negative-sample features; such methods face a performance bottleneck, however, because effective data-augmentation strategies for hyperspectral images are lacking. Generative self-supervision has a longer history, but it has mainly been concentrated in the traditional RGB three-channel image processing field. Typically, an autoencoder based on a convolutional neural network receives an image with part of its pixels masked and reconstructs the masked part using the original image as supervision information, thereby constructing the pre-training task.
However, when these methods are applied to hyperspectral images, the lack of attention to spectral information leads to insufficient feature extraction and directly degrades the performance of the classification model.
Disclosure of Invention
In order to solve the above technical problems in the prior art, the invention aims to provide a hyperspectral remote sensing image classification method, device and storage medium based on spectrum reconstruction that improve the classification performance on hyperspectral remote sensing images under small-sample conditions.
In order to achieve the above object, the present invention provides a hyperspectral remote sensing image classification method based on spectrum reconstruction, comprising the following steps:
s1, performing data dimension reduction on a hyperspectral remote sensing image;
s2, constructing an input sample;
step S3, a self-supervision pre-training network based on a spectrum reconstruction task is established, and a backbone network is extracted through unlabeled sample training characteristics;
and S4, constructing a classification network of the hyperspectral remote sensing image based on the feature extraction backbone network of the pre-training stage, and training the classification network through labeling samples to finish pixel-by-pixel classification.
According to one aspect of the present invention, in the step S1, a principal component analysis method is used to perform data dimension reduction on the hyperspectral remote sensing image.
According to one aspect of the present invention, in the step S2, the hyperspectral remote sensing image is sampled based on adjacent pixels to complete the construction of the input samples, which specifically includes:
Step S21, sampling all pixels of the hyperspectral remote sensing image based on the adjacent pixels of a rectangular area, as pre-training network input samples;
Step S22, sampling all labeled pixels of the hyperspectral remote sensing image based on the adjacent pixels of a rectangular area, as classification network training samples.
According to an aspect of the present invention, the step S3 specifically includes:
Step S31, expanding the input sample along the spectral dimension and randomly masking part of the spectral channels;
Step S32, feeding the unmasked residual channel information of the sample into a feature-extraction backbone network based on a spectral Transformer to represent the sample in the feature space;
Step S33, inserting the masked spectral channels of the sample into the features output by the feature-extraction network in the original channel order to form new sample features;
Step S34, feeding the new sample features into a spectrum-reconstruction network to predict the masked channel information;
Step S35, using the unmasked original sample as supervision information and updating the network parameters with the MSE loss function.
According to an aspect of the present invention, the step S32 specifically includes:
Step S321, embedding the features of the input hyperspectral image sample into the feature space using a spectral embedding layer: for an input hyperspectral image sample X = (c_1, c_2, …, c_n), where c_i denotes the i-th channel of the image sample, the spectral embedding layer of the network first flattens each channel c_i of the sample into a one-dimensional vector, and then embeds the channel information c_i into the feature space as a one-dimensional vector a_i of dimension d via a linear transformation matrix W, as shown in the formula:
a_i = W · c_i, i ∈ {1, …, n};
Step S322, adding a learnable one-dimensional vector p_i to the obtained one-dimensional vector a_i using the position embedding layer; the initial value of p_i is randomly generated, and the operation of adding the position vector is shown in the formula:
a_i = a_i + p_i, i ∈ {1, …, n};
Step S323, calculating the global correlation of the different bands of the sample using the core Transformer Block to complete the expression of the sample's spectral features.
According to an aspect of the present invention, the step S323 specifically includes:
for each computation head of the Transformer Block, each channel feature a_i, i ∈ {1, …, n}, of the embedded feature space is multiplied by three different transformation matrices W_q, W_k, W_v to generate the corresponding feature vectors q_i, k_i, v_i, which are concatenated into the matrices Query (Q = [q_1, …, q_n]), Key (K = [k_1, …, k_n]) and Value (V = [v_1, …, v_n]). The attention score s between each vector q_i and each vector k_j is computed as an inner product and scaled for normalization, i.e.:
s_ij = (q_i · k_j) / √d,
where d is the dimension of the vectors q_i and k_j;
subsequently, the obtained attention scores are passed through a Softmax activation function and multiplied by the corresponding vectors v_j to obtain weighted values;
finally, all weighted values are summed to obtain the attention value z_i of the feature, i.e.:
z_i = Σ_j Softmax(s_ij) · v_j;
the results z_i of the different computation heads are concatenated and multiplied by an output weight matrix W_O to obtain the overall output; the mathematical expression of the core Transformer Block is:
MultiHead(Q, K, V) = Concat(head_1, …, head_h) W_O,
where Concat(·) denotes the concatenation of matrices and head_i, i ∈ [1, h], denotes the output of the i-th computation head; the formula for each computation head can be expressed as:
head_i = Softmax(Q_i K_i^T / √d) V_i,
where K^T denotes the transpose of the matrix K.
According to an aspect of the present invention, the step S4 specifically includes:
Step S41, constructing a classification network for the hyperspectral remote sensing image using the feature-extraction backbone network trained in step S3 and a classifier;
Step S42, training the classifier parameters under the supervision of the labeled samples to complete pixel-by-pixel classification, where the parameter update of the classifier is performed by computing the cross-entropy loss between the true labels of the labeled samples and the network predictions.
According to an aspect of the present invention, there is provided an electronic apparatus including: one or more processors, one or more memories, and one or more computer programs; wherein the processor is connected to the memory, and the one or more computer programs are stored in the memory, and when the electronic device is running, the processor executes the one or more computer programs stored in the memory, so that the electronic device performs a hyperspectral remote sensing image classification method based on spectrum reconstruction as in any one of the above technical solutions.
According to an aspect of the present invention, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement a hyperspectral remote sensing image classification method based on spectral reconstruction as set forth in any one of the above technical solutions.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides a hyperspectral remote sensing image classification method, equipment and a storage medium based on spectrum reconstruction, which are characterized in that firstly, data dimension reduction is carried out on hyperspectral remote sensing images, image redundancy is reduced, network fitting difficulty is reduced, an input sample is constructed, the input sample generally comprises a pre-training sample set and a training sample set, the pre-training sample set is a set without a labeling sample, the training sample set is a set with a labeling sample, then a self-supervision pre-training network based on a spectrum reconstruction task is constructed, a trunk network is firstly extracted through training features of the pre-training sample, a classification network is constructed according to the characteristics of training completion, the trunk network is extracted under the supervision of the training sample set, and the backbone network is extracted by utilizing the training features of the non-labeling sample, so that the demand of model fitting on the labeling sample is reduced, and the method has important significance for solving the problem of small samples of hyperspectral remote sensing image classification.
Furthermore, the classifier is trained only under the supervision of a small number of labeling samples, so that the classification accuracy of the spectrum remote sensing image can be remarkably improved under the condition of a small sample, the model training speed is high, and the method has important significance for practical application.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments will be briefly described below. It is apparent that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 schematically illustrates a flow chart of a method for classifying hyperspectral remote sensing images based on spectral reconstruction provided in one embodiment of the present invention;
FIG. 2 schematically illustrates the general technical path of a hyperspectral remote sensing image classification method based on spectral reconstruction in accordance with one embodiment of the present invention;
FIG. 3 schematically illustrates a feature extraction backbone network architecture of one embodiment of the invention;
FIG. 4 schematically shows a flow chart of a pre-training scheme in accordance with one embodiment of the present invention;
fig. 5 schematically shows a structure diagram of a classification network according to an embodiment of the present invention.
Detailed Description
The description of the embodiments of this specification should be taken in conjunction with the accompanying drawings, which are a complete description of the embodiments. In the drawings, the shape or thickness of the embodiments may be enlarged and indicated simply or conveniently. Furthermore, portions of the structures in the drawings will be described in terms of separate descriptions, and it should be noted that elements not shown or described in the drawings are in a form known to those of ordinary skill in the art.
Any references to directions and orientations in the description of the embodiments herein are for convenience only and should not be construed as limiting the scope of the invention in any way. The following description of the preferred embodiments will refer to combinations of features, which may be present alone or in combination, and the invention is not particularly limited to the preferred embodiments. The scope of the invention is defined by the claims.
As shown in fig. 1 to 5, the hyperspectral remote sensing image classification method based on spectrum reconstruction of the invention comprises the following steps:
s1, performing data dimension reduction on a hyperspectral remote sensing image;
s2, constructing an input sample;
step S3, a self-supervision pre-training network based on a spectrum reconstruction task is established, and a backbone network is extracted through unlabeled sample training characteristics;
and S4, constructing a classification network of the hyperspectral remote sensing image based on the feature extraction backbone network of the pre-training stage, and training the classification network through labeling samples to finish pixel-by-pixel classification.
As shown in fig. 1 and fig. 2, in this embodiment, data dimension reduction is first performed on the hyperspectral remote sensing image to reduce image redundancy and ease network fitting. Input samples are then constructed; they generally comprise a pre-training sample set, which is a set of unlabeled samples, and a training sample set, which is a set of labeled samples. Next, a self-supervised pre-training network based on a spectrum-reconstruction task is constructed, and the feature-extraction backbone network is first trained on the pre-training samples. A classification network is then built on the trained backbone and trained under the supervision of the training sample set. Because the backbone network is trained with unlabeled samples, the model's dependence on labeled samples is reduced, which is of great significance for the small-sample problem in hyperspectral remote sensing image classification.
Furthermore, since the classifier is trained under the supervision of only a small number of labeled samples, the classification accuracy of hyperspectral remote sensing images is significantly improved under small-sample conditions, the model trains quickly, and the method is of great significance for practical applications.
It is worth pointing out that the method comprises two model training phases, a pre-training phase and a training phase. Specifically, in the pre-training phase, part of the bands of the hyperspectral remote sensing image are masked and reconstructed by an autoencoder model built on a spectral Transformer. Since band reconstruction relies on the image itself as supervision information, the encoder part of the autoencoder can learn the representation of samples in the feature space from unlabeled samples. In the training phase, the encoder model and its pre-trained parameters serve as the feature-extraction module and are combined with a classifier to form the hyperspectral remote sensing image classification model. Finally, only the classifier needs to be trained under the supervision of a small number of labeled samples.
In one embodiment of the present invention, preferably, in the step S1, a principal component analysis method is used to perform data dimension reduction on the hyperspectral remote sensing image.
In the embodiment, through dimension reduction processing, image redundancy is reduced, and the difficulty of deep network fitting is reduced.
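For illustration, the PCA dimension reduction of step S1 might be sketched as follows. This is a minimal numpy sketch under the assumption of a data cube of shape (H, W, bands); the patent specifies only that a principal component analysis method is used, not a particular implementation:

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Reduce the spectral dimension of a hyperspectral cube (H, W, B)
    to n_components principal components. Hypothetical helper name."""
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(np.float64)
    flat -= flat.mean(axis=0)                      # center each band
    cov = np.cov(flat, rowvar=False)               # band-by-band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]              # largest variance first
    components = eigvecs[:, order[:n_components]]
    return (flat @ components).reshape(h, w, n_components)

# Example: reduce a toy 8x8 image with 30 bands to 10 components
cube = np.random.default_rng(0).normal(size=(8, 8, 30))
reduced = pca_reduce(cube, 10)
print(reduced.shape)  # (8, 8, 10)
```

Retaining only the leading components concentrates the variance of the hundreds of bands into a few channels, which is what reduces redundancy and eases the fitting of the deep network.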
In one embodiment of the present invention, preferably, in the step S2, sampling the hyperspectral remote sensing image based on adjacent pixels to complete the construction of the input samples specifically includes:
Step S21, sampling all pixels of the hyperspectral remote sensing image based on the adjacent pixels of a rectangular area, as pre-training network input samples;
Step S22, sampling all labeled pixels of the hyperspectral remote sensing image based on the adjacent pixels of a rectangular area, as classification network training samples.
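The rectangular-neighborhood sampling of steps S21 and S22 can be sketched as below, assuming a (H, W, bands) cube. The patch size and the reflect-padding at image borders are assumptions not specified in the text:

```python
import numpy as np

def sample_patches(cube, centers, patch_size=5):
    """Extract a square neighborhood around each center pixel.
    Border pixels are handled by reflect-padding (an assumption)."""
    r = patch_size // 2
    padded = np.pad(cube, ((r, r), (r, r), (0, 0)), mode="reflect")
    patches = [padded[i:i + patch_size, j:j + patch_size, :] for i, j in centers]
    return np.stack(patches)

cube = np.arange(6 * 6 * 3, dtype=float).reshape(6, 6, 3)
# Pre-training set: all pixels; classification set: only labeled pixels
all_centers = [(i, j) for i in range(6) for j in range(6)]
labeled_centers = [(0, 0), (3, 4)]
print(sample_patches(cube, all_centers).shape)      # (36, 5, 5, 3)
print(sample_patches(cube, labeled_centers).shape)  # (2, 5, 5, 3)
```

Each patch keeps the full spectral vector of the center pixel plus its spatial context, which is what both the pre-training network and the classification network consume.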
As shown in fig. 4, in one embodiment of the present invention, preferably, the step S3 specifically includes:
Step S31, expanding the input sample along the spectral dimension and randomly masking part of the spectral channels, i.e. setting their values to 0, for example randomly masking 50% of the spectral channels;
Step S32, feeding the unmasked residual channel information of the sample into a feature-extraction backbone network based on a spectral Transformer to represent the sample in the feature space;
Step S33, inserting the masked spectral channels of the sample into the features output by the feature-extraction network in the original channel order to form new sample features;
Step S34, feeding the new sample features into a spectrum-reconstruction network to predict the masked channel information; this network is also based on a spectral Transformer but has fewer computation heads and a smaller network depth;
Step S35, using the unmasked original sample as supervision information and updating the network parameters with the MSE loss function.
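To make steps S31 and S35 concrete, the channel masking and the loss computation can be sketched in numpy. This is an illustrative sketch, not the patent's implementation: the 50% mask ratio follows the example above, while computing the MSE only on the masked channels is an assumption (the text states only that the unmasked original serves as supervision):

```python
import numpy as np

rng = np.random.default_rng(42)

def mask_channels(sample, mask_ratio=0.5):
    """Step S31: randomly zero out a fraction of spectral channels.
    sample: (n_channels, n_features). Returns the masked sample and
    the boolean mask marking hidden channels."""
    n = sample.shape[0]
    masked_idx = rng.choice(n, size=int(n * mask_ratio), replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[masked_idx] = True
    masked = sample.copy()
    masked[mask] = 0.0
    return masked, mask

def mse_loss(pred, target, mask):
    """Step S35: MSE between the reconstruction and the original,
    here evaluated on the masked channels only (an assumption)."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))

sample = rng.normal(size=(30, 25))   # 30 channels, e.g. a flattened 5x5 patch
masked, mask = mask_channels(sample, 0.5)
print(mask.sum())                    # 15 channels hidden
print(mse_loss(np.zeros_like(sample), sample, mask))  # loss of a trivial predictor
```

In the actual pipeline the reconstruction network's output takes the place of the trivial zero predictor, and the gradient of this loss updates both the backbone and the reconstruction network.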
As shown in fig. 1 and 3, in one embodiment of the present invention, preferably, the step S32 specifically includes:
Step S321, embedding the features of the input hyperspectral image sample into the feature space using a spectral embedding layer:
for an input hyperspectral image sample X = (c_1, c_2, …, c_n), where c_i denotes the i-th channel of the sample, the spectral embedding layer of the network first flattens each channel c_i of the sample into a one-dimensional vector, and then embeds the channel information c_i into the feature space as a one-dimensional vector a_i of dimension d via a linear transformation matrix W, as shown in the formula:
a_i = W · c_i, i ∈ {1, …, n};
Step S322, adding a learnable one-dimensional vector p_i to the obtained one-dimensional vector a_i using the position embedding layer; the initial value of p_i is randomly generated. The operation of adding the position vector is:
a_i = a_i + p_i, i ∈ {1, …, n};
Step S323, for each computation head of the Transformer Block, each channel feature a_i, i ∈ {1, …, n}, of the embedded feature space is multiplied by three different transformation matrices W_q, W_k, W_v to generate the corresponding feature vectors q_i, k_i, v_i, which are concatenated into the matrices Query (Q = [q_1, …, q_n]), Key (K = [k_1, …, k_n]) and Value (V = [v_1, …, v_n]). Thereafter, the attention score s between each vector q_i and each vector k_j is computed as an inner product and scaled for normalization, i.e.:
s_ij = (q_i · k_j) / √d,
where d is the dimension of the vectors q_i and k_j;
subsequently, the obtained attention scores are passed through a Softmax activation function and multiplied by the corresponding vectors v_j to obtain weighted values;
finally, all weighted values are summed to obtain the attention value z_i of the feature, i.e.:
z_i = Σ_j Softmax(s_ij) · v_j;
the results z_i of the different computation heads are concatenated and multiplied by an output weight matrix W_O to obtain the overall output; the mathematical expression of the core Transformer Block is:
MultiHead(Q, K, V) = Concat(head_1, …, head_h) W_O,
where Concat(·) denotes the concatenation of matrices and head_i, i ∈ [1, h], denotes the output of the i-th computation head; the formula for each computation head can be expressed as:
head_i = Softmax(Q_i K_i^T / √d) V_i,
where K^T denotes the transpose of the matrix K.
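A minimal numpy sketch of the multi-head attention computation described above. The per-head slicing of the weight matrices and the matrix shapes are assumptions for illustration; the formulas themselves follow the text (scores q_i·k_j/√d, Softmax weighting of V, concatenation times W_O):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(A, Wq, Wk, Wv, Wo, n_heads):
    """Scaled dot-product attention over channel embeddings A (n, d):
    head_i = Softmax(Q_i K_i^T / sqrt(d_k)) V_i,
    MultiHead = Concat(head_1..head_h) W_O."""
    n, d = A.shape
    dk = d // n_heads
    Q, K, V = A @ Wq, A @ Wk, A @ Wv
    heads = []
    for h in range(n_heads):
        q = Q[:, h * dk:(h + 1) * dk]
        k = K[:, h * dk:(h + 1) * dk]
        v = V[:, h * dk:(h + 1) * dk]
        scores = q @ k.T / np.sqrt(dk)        # attention scores s_ij
        heads.append(softmax(scores) @ v)     # weighted values z_i
    return np.concatenate(heads, axis=1) @ Wo

rng = np.random.default_rng(0)
n, d, h = 6, 8, 2                             # 6 channel embeddings of dim 8
A = rng.normal(size=(n, d))
Wq, Wk, Wv, Wo = (rng.normal(size=(d, d)) for _ in range(4))
out = multi_head_attention(A, Wq, Wk, Wv, Wo, h)
print(out.shape)  # (6, 8)
```

Because every channel attends to every other channel, the output captures exactly the global band-to-band correlation that step S323 is designed to express.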
In this embodiment, the spatial-spectral information of the hyperspectral image is fully expressed through the spectrum-reconstruction pre-training task and the feature-extraction module built on the spectral Transformer, providing sufficient support for the correct classification of pixels.
As shown in fig. 5, in one embodiment of the present invention, preferably, the step S4 specifically includes:
Step S41, constructing a classification network for the hyperspectral remote sensing image using the feature-extraction backbone network trained in step S3 and a classifier;
Step S42, training the classifier parameters under the supervision of the labeled samples to complete pixel-by-pixel classification, where the parameter update of the classifier is performed by computing the cross-entropy loss between the true labels of the labeled samples and the network predictions.
In this embodiment, the input samples first pass through the feature-extraction network, which maps them into the feature space; this part requires no parameter updates in the classification network because its training was already completed in the pre-training task. The classifier module then flattens the sample features into one-dimensional feature vectors and maps each vector to a class through a fully connected layer. Finally, the classifier parameters are trained under the supervision of the labeled samples to complete pixel-by-pixel classification.
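The classification stage can be sketched as follows: frozen backbone features are flattened and passed through one fully connected layer, and training minimizes the cross-entropy between predictions and true labels. The layer sizes and helper names are hypothetical; only the structure (frozen backbone, fully connected classifier, cross-entropy loss) comes from the text:

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy between network predictions and true labels
    (step S42). logits: (n_samples, n_classes); labels: int class ids."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def classify(features, W, b):
    """Flatten frozen backbone features and apply one fully connected
    layer to produce class scores. W, b are the only trainable
    parameters in this sketch."""
    flat = features.reshape(features.shape[0], -1)
    return flat @ W + b

rng = np.random.default_rng(1)
feats = rng.normal(size=(4, 3, 5))     # 4 samples already in feature space
W = rng.normal(size=(15, 2)) * 0.1     # 15 = 3*5 flattened dims, 2 classes
b = np.zeros(2)
logits = classify(feats, W, b)
labels = np.array([0, 1, 0, 1])
loss = cross_entropy(logits, labels)
print(round(loss, 4))
```

Only W and b would be updated by the gradient of this loss; the backbone parameters stay fixed, which is why a small number of labeled samples suffices.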
According to an aspect of the present invention, there is provided an electronic apparatus including: one or more processors, one or more memories, and one or more computer programs; wherein the processor is connected to the memory, the one or more computer programs are stored in the memory, and when the electronic device is running, the processor executes the one or more computer programs stored in the memory, so that the electronic device performs a hyperspectral remote sensing image classification method based on spectrum reconstruction as in any one of the above technical solutions.
According to an aspect of the present invention, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement a hyperspectral remote sensing image classification method based on spectral reconstruction as in any one of the above technical solutions.
Computer-readable storage media may include any medium that can store or transfer information. Examples of a computer readable storage medium include an electronic circuit, a semiconductor memory device, a ROM, a flash memory, an Erasable ROM (EROM), a floppy disk, a CD-ROM, an optical disk, a hard disk, a fiber optic medium, a Radio Frequency (RF) link, and the like. The code segments may be downloaded via computer networks such as the internet, intranets, etc.
The invention discloses a hyperspectral remote sensing image classification method, equipment and storage medium based on spectrum reconstruction, wherein the hyperspectral remote sensing image classification method based on spectrum reconstruction comprises the following steps: s1, performing data dimension reduction on a hyperspectral remote sensing image; s2, constructing an input sample; step S3, a self-supervision pre-training network based on a spectrum reconstruction task is established, and a backbone network is extracted through unlabeled sample training characteristics; s4, constructing a classification network of the hyperspectral remote sensing image based on a feature extraction backbone network in a pre-training stage, and training the classification network through labeling samples to finish pixel-by-pixel classification; firstly, carrying out data dimension reduction on a hyperspectral remote sensing image, reducing image redundancy, reducing network fitting difficulty, constructing an input sample, wherein the input sample generally comprises a pre-training sample set and a training sample set, the pre-training sample set is a set of non-labeling samples, the training sample set is a set of labeling samples, then constructing a self-supervision pre-training network based on a spectrum reconstruction task, firstly extracting a trunk network through training features of the pre-training sample, constructing a classification network according to the extracted trunk network after the training is completed, training the classification network under the supervision of the training sample set, extracting the trunk network by utilizing the non-labeling sample training features, reducing the requirement of model fitting on the labeling samples, and having important significance for solving the problem of small samples of hyperspectral remote sensing image classification.
Furthermore, since the classifier is trained under the supervision of only a small number of labeled samples, the classification accuracy of hyperspectral remote sensing images can be significantly improved in the small-sample setting; the model also trains quickly, which is of great significance for practical applications.
Furthermore, it should be noted that the present invention can be provided as a method, an apparatus, or a computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the invention may take the form of a computer program product on one or more computer-usable storage media having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article or terminal device comprising the element.
Finally, it should be pointed out that the above descriptions are merely preferred embodiments of the invention. Although preferred embodiments of the invention have been described, those skilled in the art, once aware of the basic inventive concept, may make additional modifications and adaptations without departing from the principles of the invention, and such modifications and adaptations are intended to fall within the scope of the invention. It is therefore intended that the appended claims be interpreted as covering the preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Claims (6)
1. The hyperspectral remote sensing image classification method based on spectrum reconstruction is characterized by comprising the following steps of:
s1, performing data dimension reduction on a hyperspectral remote sensing image;
s2, constructing an input sample;
step S3, a self-supervision pre-training network based on a spectrum reconstruction task is established, and a backbone network is extracted through unlabeled sample training characteristics;
s4, constructing a classification network of the hyperspectral remote sensing image based on a feature extraction backbone network in a pre-training stage, and training the classification network through labeling samples to finish pixel-by-pixel classification;
in the step S3, the method specifically includes:
step S31, expanding an input sample in a spectrum dimension, and randomly masking part of spectrum channels;
step S32, the residual channel information of the sample which is not masked is sent to a characteristic extraction backbone network based on a spectrum converter, and the sample is represented in a characteristic space;
step S33, inserting the spectral channels of the sample which are masked into the characteristics output by the characteristic extraction network according to the arrangement sequence of the original channels to form new sample characteristics;
step S34, the new sample characteristics are sent to a spectrum reconstruction network, and the masked channel information is predicted;
step S35, using the original sample which is not masked as supervision information, and updating network parameters by using an MSE loss function;
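Outside the claim language, the masking-and-reconstruction loop of steps S31–S35 can be sketched as follows. This is a minimal illustration with stand-in identity networks in place of the feature-extraction backbone and reconstruction network; the mask ratio and array shapes are illustrative assumptions, not the patented implementation:

```python
import numpy as np

rng = np.random.default_rng(42)
n_channels, d = 8, 16
sample = rng.normal(size=(n_channels, d))   # embedded spectral channels

# S31: randomly mask a subset of the spectral channels
mask_ratio = 0.5
masked_idx = rng.choice(n_channels, size=int(n_channels * mask_ratio), replace=False)
visible_idx = np.setdiff1d(np.arange(n_channels), masked_idx)

# S32: encode only the unmasked channels (identity stand-in for the backbone)
encoded_visible = sample[visible_idx]

# S33: re-insert tokens at the masked positions, preserving the original
# channel order, to form the new sample features
mask_token = np.zeros(d)
features = np.empty_like(sample)
features[visible_idx] = encoded_visible
features[masked_idx] = mask_token

# S34: predict the masked channel information (identity stand-in for the
# spectrum reconstruction network)
reconstruction = features

# S35: the unmasked original sample supervises the network via an MSE loss
mse = np.mean((reconstruction[masked_idx] - sample[masked_idx]) ** 2)
print(features.shape, float(mse) >= 0.0)
```

In a real implementation the gradient of this MSE loss would update both the backbone and the reconstruction network.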
in the step S32, specifically, the method includes:
step S321, embedding the features of the input hyperspectral image sample into a feature space by using a spectral embedding layer: for an input hyperspectral image sample x = (c_1, c_2, …, c_n), where c_i represents the i-th channel of the image sample, the spectral embedding layer of the network first flattens each channel c_i of the sample into a one-dimensional vector, and then embeds the channel information c_i into the feature space as a one-dimensional vector a_i of dimension d through a linear transformation matrix W, as shown in the formula:
a_i = W·c_i, i ∈ {1, …, n};
step S322, using the position embedding layer, adding to each one-dimensional vector a_i a learnable one-dimensional vector p_i, where the initial value of p_i is randomly generated; the position-vector addition is shown in the formula:
a_i = a_i + p_i, i ∈ {1, …, n};
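A minimal NumPy sketch of the spectral embedding and position embedding of steps S321–S322; the channel count, patch size, embedding dimension, and random initialisation below are illustrative assumptions, not the patented implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

n, s, d = 8, 5, 16               # n spectral channels, s x s spatial patch, d embedding dim
x = rng.normal(size=(n, s, s))   # one input sample, channel-first

# Step S321: flatten each channel c_i to a one-dimensional vector, then embed
# it with a shared linear transformation W into a d-dimensional space: a_i = W . c_i
W = rng.normal(size=(d, s * s))
a = np.stack([W @ x[i].ravel() for i in range(n)])   # shape (n, d)

# Step S322: add a learnable position vector p_i (randomly initialised here)
p = rng.normal(size=(n, d))
a = a + p
print(a.shape)   # (8, 16)
```

In training, both W and the position vectors p_i would be updated by backpropagation.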
step S323, calculating the global correlation between different bands of the sample by using the core Transformer Block, completing the representation of the spectral features of the sample;
in the step S323, specifically, the method includes:
for each compute head of Transformer Block, each channel feature a of the embedded feature space is computed i I e {1, … n } are multiplied by three different transformation matrices W, respectively q ,W k ,W v Generating feature vectors corresponding to the three vectors, i.e. q i 、k i 、v i And splicedThe junctions are matrices, query (q= [ Q ] 1 ,…q n ]),Key(K=[k 1 ,…k n ]),Value(V=[v 1 ,…v n ]) Each q is calculated in the form of an inner product i Vector sum each k i The attention score s between vectors, while the score is scaled by normalization, i.e.:
wherein d is q i Or k j Dimension of the vector;
subsequently, the obtained attention scores are passed through a Softmax activation function and multiplied with the corresponding vectors v_j to obtain weighted values; finally, all weighted values are summed to obtain the attention value z_i of the feature, namely:
z_i = Σ_j softmax(s_ij) · v_j;
the results z_i of the different computation heads are concatenated into a whole and multiplied by an output weight matrix W_O to obtain the overall output; the mathematical expression of the core Transformer Block is:
MultiHead(Q, K, V) = Concat(head_1, …, head_h) W_O,
wherein Concat(·) represents the concatenation operation of matrices, and head_i, i ∈ [1, h], represents the output of the i-th computation head; the calculation formula of each computation head can be expressed as:
head_i = softmax(Q K^T / √d) V,
wherein K^T represents the transpose of matrix K.
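The multi-head attention calculation described in step S323 can be sketched in NumPy as follows; head count, dimensions, and random weights are illustrative assumptions standing in for learned parameters:

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(A, h, rng):
    """A: (n, d) embedded channel features; h: number of computation heads."""
    n, d = A.shape
    dk = d // h
    heads = []
    for _ in range(h):
        # Per-head projections W_q, W_k, W_v generate Q, K, V
        Wq, Wk, Wv = (rng.normal(size=(d, dk)) for _ in range(3))
        Q, K, V = A @ Wq, A @ Wk, A @ Wv        # each (n, dk)
        S = (Q @ K.T) / np.sqrt(dk)             # scaled inner-product scores
        heads.append(softmax(S) @ V)            # softmax-weighted sum of values
    # Concat(head_1, ..., head_h) W_O gives the overall output
    Wo = rng.normal(size=(h * dk, d))
    return np.concatenate(heads, axis=1) @ Wo

rng = np.random.default_rng(1)
A = rng.normal(size=(8, 16))    # 8 spectral channels embedded in 16 dims
out = multi_head_attention(A, h=4, rng=rng)
print(out.shape)                # (8, 16)
```

Each row of `softmax(S)` sums to one, so every output token is a convex combination of the value vectors before the output projection.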
2. The method for classifying hyperspectral remote sensing images based on spectral reconstruction according to claim 1, wherein in the step S1, the hyperspectral remote sensing images are subjected to data dimension reduction by using a principal component analysis method.
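The principal component analysis step of claim 2 can be sketched in plain NumPy; the cube size and number of retained components below are illustrative assumptions:

```python
import numpy as np

def pca_reduce(cube, k):
    """Reduce a (H, W, B) hyperspectral cube to its k leading principal components."""
    H, W, B = cube.shape
    X = cube.reshape(-1, B).astype(np.float64)
    X = X - X.mean(axis=0)                    # center each spectral band
    cov = X.T @ X / (X.shape[0] - 1)          # (B, B) band covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    return (X @ top).reshape(H, W, k)

rng = np.random.default_rng(0)
cube = rng.normal(size=(10, 10, 30))          # toy 30-band image
reduced = pca_reduce(cube, k=5)
print(reduced.shape)                          # (10, 10, 5)
```

The retained components are ordered by decreasing variance, so the first output band carries the most spectral energy.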
3. The method for classifying hyperspectral remote sensing images based on spectral reconstruction according to claim 1, wherein in the step S2, the hyperspectral remote sensing images are sampled based on adjacent pixels, and the construction of the input samples is completed, specifically comprising:
s21, sampling all pixels of the hyperspectral remote sensing image based on adjacent pixels of a rectangular area is used as a pre-training network input sample;
and S22, sampling all marked pixels of the hyperspectral remote sensing image based on adjacent pixels of the rectangular area to be used as a classification network training sample.
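The rectangular-neighbourhood sampling of steps S21–S22 might look like the following NumPy sketch; the patch size and the edge-padding strategy for border pixels are illustrative assumptions:

```python
import numpy as np

def extract_patch(image, row, col, size=5):
    """Return the size x size neighbourhood centred on pixel (row, col).

    image: (H, W, B) dimension-reduced hyperspectral cube.
    The image is edge-padded so that border pixels also yield full patches.
    """
    r = size // 2
    padded = np.pad(image, ((r, r), (r, r), (0, 0)), mode="edge")
    # After padding, original pixel (row, col) sits at (row + r, col + r)
    return padded[row:row + size, col:col + size, :]

rng = np.random.default_rng(0)
img = rng.normal(size=(6, 6, 4))
patch = extract_patch(img, 0, 0)   # corner pixel still yields a full patch
print(patch.shape)                 # (5, 5, 4)
```

For pre-training, every pixel would be sampled this way (S21); for classifier training, only the labeled pixels (S22).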
4. The method for classifying hyperspectral remote sensing images based on spectral reconstruction according to claim 1, wherein in step S4, specifically comprising:
step S41: constructing a classification network of the hyperspectral remote sensing image by utilizing the feature extraction backbone network and a classifier which are trained in the step S3;
step S42, training the classifier parameters under the supervision of the labeled samples to complete pixel-by-pixel classification, wherein the parameter update of the classifier is completed by calculating the cross-entropy loss between the ground-truth values of the labeled samples and the network predictions.
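The cross-entropy loss of step S42 between integer ground-truth labels and raw class logits can be written as follows; the logit values are illustrative:

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy between integer labels and raw class logits."""
    z = logits - logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for each sample
    return -log_probs[np.arange(len(labels)), labels].mean()

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 3.0, 0.2]])
labels = np.array([0, 1])
loss = cross_entropy(logits, labels)
print(float(loss) > 0.0)
```

The loss shrinks toward zero as the logit of the correct class dominates, which is what drives the classifier's parameter updates.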
5. An electronic device, comprising: one or more processors, one or more memories, and one or more computer programs; wherein the processor is connected to the memory, the one or more computer programs being stored in the memory, which when the electronic device is running, is executed by the processor to cause the electronic device to perform the method of classification of hyperspectral remote sensing images based on spectral reconstruction as claimed in any one of claims 1 to 4.
6. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the method of classifying hyperspectral remote sensing images based on spectral reconstruction as claimed in any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310457860.7A CN116486160B (en) | 2023-04-25 | 2023-04-25 | Hyperspectral remote sensing image classification method, equipment and medium based on spectrum reconstruction |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310457860.7A CN116486160B (en) | 2023-04-25 | 2023-04-25 | Hyperspectral remote sensing image classification method, equipment and medium based on spectrum reconstruction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116486160A CN116486160A (en) | 2023-07-25 |
CN116486160B true CN116486160B (en) | 2023-12-19 |
Family
ID=87226484
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310457860.7A Active CN116486160B (en) | 2023-04-25 | 2023-04-25 | Hyperspectral remote sensing image classification method, equipment and medium based on spectrum reconstruction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116486160B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103646409A (en) * | 2013-12-19 | 2014-03-19 | 哈尔滨工程大学 | Hyperspectral image compressed coding method through multivariate vector quantization |
WO2018045626A1 (en) * | 2016-09-07 | 2018-03-15 | 深圳大学 | Super-pixel level information fusion-based hyperspectral image classification method and system |
CN112232280A (en) * | 2020-11-04 | 2021-01-15 | 安徽大学 | Hyperspectral image classification method based on self-encoder and 3D depth residual error network |
CN114861865A (en) * | 2022-03-10 | 2022-08-05 | 长江三峡技术经济发展有限公司 | Self-supervision learning method, system, medium and electronic device of hyperspectral image classification model |
CN115565071A (en) * | 2022-10-26 | 2023-01-03 | 深圳大学 | Hyperspectral image transform network training and classifying method |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8538195B2 (en) * | 2007-09-17 | 2013-09-17 | Raytheon Company | Hyperspectral image dimension reduction system and method |
US20230004760A1 (en) * | 2021-06-28 | 2023-01-05 | Nvidia Corporation | Training object detection systems with generated images |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103646409A (en) * | 2013-12-19 | 2014-03-19 | 哈尔滨工程大学 | Hyperspectral image compressed coding method through multivariate vector quantization |
WO2018045626A1 (en) * | 2016-09-07 | 2018-03-15 | 深圳大学 | Super-pixel level information fusion-based hyperspectral image classification method and system |
CN112232280A (en) * | 2020-11-04 | 2021-01-15 | 安徽大学 | Hyperspectral image classification method based on self-encoder and 3D depth residual error network |
CN114861865A (en) * | 2022-03-10 | 2022-08-05 | 长江三峡技术经济发展有限公司 | Self-supervision learning method, system, medium and electronic device of hyperspectral image classification model |
CN115565071A (en) * | 2022-10-26 | 2023-01-03 | 深圳大学 | Hyperspectral image transform network training and classifying method |
Non-Patent Citations (2)
Title |
---|
He Guangming, Fu Han, et al. Real-time detection of flying aircraft using hyperspectral satellite images. IEEE. 2022, full text. *
Discriminative manifold learning method for dimensionality reduction of hyperspectral images; Du Bo; Zhang Lefei; Zhang Liangpei; Hu Wenbin; Acta Photonica Sinica (03); full text *
Also Published As
Publication number | Publication date |
---|---|
CN116486160A (en) | 2023-07-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108537742B (en) | Remote sensing image panchromatic sharpening method based on generation countermeasure network | |
CN110147836B (en) | Model training method, device, terminal and storage medium | |
Audebert et al. | Generative adversarial networks for realistic synthesis of hyperspectral samples | |
CN111369572A (en) | Weak supervision semantic segmentation method and device based on image restoration technology | |
CN110210335B (en) | Training method, system and device for pedestrian re-recognition learning model | |
CN113033276B (en) | Behavior recognition method based on conversion module | |
CN111259900A (en) | Semantic segmentation method for satellite remote sensing image | |
CN114022372B (en) | Mask image patching method for introducing semantic loss context encoder | |
US20220164413A1 (en) | Method and System for Predicting Operation Time of Sparse Matrix Vector Multiplication | |
CN114821155A (en) | Multi-label classification method and system based on deformable NTS-NET neural network | |
CN113269224A (en) | Scene image classification method, system and storage medium | |
CN111008570B (en) | Video understanding method based on compression-excitation pseudo-three-dimensional network | |
CN113298736A (en) | Face image restoration method based on face pattern | |
CN114140831B (en) | Human body posture estimation method and device, electronic equipment and storage medium | |
CN116563355A (en) | Target tracking method based on space-time interaction attention mechanism | |
CN115830535A (en) | Method, system, equipment and medium for detecting accumulated water in peripheral area of transformer substation | |
CN114972904A (en) | Zero sample knowledge distillation method and system based on triple loss resistance | |
CN114550014A (en) | Road segmentation method and computer device | |
CN116486160B (en) | Hyperspectral remote sensing image classification method, equipment and medium based on spectrum reconstruction | |
CN116188785A (en) | Polar mask old man contour segmentation method using weak labels | |
CN114663777B (en) | Hyperspectral image change detection method based on space-time joint graph attention mechanism | |
CN114049567B (en) | Adaptive soft label generation method and application in hyperspectral image classification | |
CN115661539A (en) | Less-sample image identification method embedded with uncertainty information | |
CN114581789A (en) | Hyperspectral image classification method and system | |
CN114331894A (en) | Face image restoration method based on potential feature reconstruction and mask perception |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |