CN111507409B - Hyperspectral image classification method and device based on deep multi-view learning - Google Patents

Hyperspectral image classification method and device based on deep multi-view learning

Info

Publication number
CN111507409B
CN111507409B (application CN202010307781.4A)
Authority
CN
China
Prior art keywords
sample
samples
unmarked
training
depth
Prior art date
Legal status
Active
Application number
CN202010307781.4A
Other languages
Chinese (zh)
Other versions
CN111507409A (en)
Inventor
刘冰
郭文月
余岸竹
王瑞瑞
余旭初
张鹏强
谭熊
魏祥坡
高奎亮
左溪冰
Current Assignee
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Priority date
Filing date
Publication date
Application filed by Information Engineering University of PLA Strategic Support Force filed Critical Information Engineering University of PLA Strategic Support Force
Priority to CN202010307781.4A priority Critical patent/CN111507409B/en
Publication of CN111507409A publication Critical patent/CN111507409A/en
Application granted granted Critical
Publication of CN111507409B publication Critical patent/CN111507409B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G06F18/24 Classification techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods


Abstract

The invention provides a hyperspectral image classification method and device based on deep multi-view learning, belonging to the technical field of remote sensing image processing and application. The classification method comprises the following steps: construct at least two different views for each unlabeled sample in the training sample set, and train a deep residual network model with the resulting multi-view data of all unlabeled samples in the training sample set; construct multi-view data for each sample in the sample set to be classified and input one view of each sample into the trained model to obtain that sample's feature vector, or input each sample in the sample set to be classified (either directly or after dimensionality reduction) into the trained model to obtain its feature vector, thereby obtaining the feature vectors of all samples in the sample set to be classified; and input the feature vectors of all samples in the sample set to be classified into a pre-trained classification model to complete the hyperspectral image classification. The method can improve the classification accuracy of hyperspectral images under small-sample conditions.

Description

Hyperspectral image classification method and device based on deep multi-view learning
Technical Field
The invention relates to a hyperspectral image classification method and device based on deep multi-view learning, and belongs to the technical field of remote sensing image processing and application.
Background
Hyperspectral image classification is one of the important steps in hyperspectral image applications. Its basic task is to assign a category label to each pixel in the image, so it has high practical application value.
Existing hyperspectral image classification methods fall mainly into the following categories:
(1) Classification based on traditional machine learning. Methods such as the support vector machine, the semi-supervised support vector machine and the random forest can achieve a certain classification effect, but they usually require complex feature design work, rely heavily on expert experience for parameter tuning, are limited in application and deliver low classification accuracy.
(2) Classification based on deep learning. Methods such as the convolutional neural network can automatically extract the spatial-spectral features of a hyperspectral image without manually designed features, but they achieve a good classification effect only with the support of a large number of labeled training samples. In practical applications, acquiring labeled hyperspectral samples is time-consuming and labor-intensive, so very few labeled samples are available for training, and without a sufficient number of them such methods struggle to reach high classification accuracy.
(3) Classification based on a deep residual network model. Starting from the fact that unlabeled samples in a hyperspectral image are numerous and easy to acquire, this line of work studies how to better mine the feature information of the hyperspectral image with unlabeled samples and improve small-sample classification performance. For example, the invention patent application with publication number CN109754017A discloses a hyperspectral image classification method based on a separable three-dimensional residual network and transfer learning, in which a three-dimensional residual network model is constructed to autonomously extract deep features of the hyperspectral image. Compared with classification methods based on ordinary deep learning, this network model is deeper, achieves higher accuracy, and obtains a better classification effect under small-sample conditions. However, the method performs feature learning based on reconstruction errors and therefore cannot extract deeper abstract features, so its classification accuracy under small-sample conditions still needs further improvement.
In summary, among existing hyperspectral image classification methods, those based on traditional machine learning require complex feature design guided by expert experience and deliver low classification accuracy; those based on deep learning depend on a large number of labeled training samples and classify poorly under small-sample conditions; and those based on a deep residual network model learn features from reconstruction errors, cannot extract deeper abstract features, and still need further improvement in small-sample classification accuracy.
Disclosure of Invention
The invention aims to provide a hyperspectral image classification method and device based on deep multi-view learning, so as to solve the problem that existing hyperspectral image classification methods have low classification accuracy under small-sample conditions.
To achieve the above object, the present invention provides a hyperspectral image classification method based on deep multi-view learning, which comprises the following steps:
(1) Input a hyperspectral image.
(2) Extract a set number of unlabeled samples from the hyperspectral image to form a training sample set, and form a sample set to be classified from the remaining samples. A sample is an m×m×b data cube selected with a pixel to be processed in the hyperspectral image as its center, where m is the size of the spatial neighborhood and b is the number of spectral bands.
(3) Divide all spectral bands of the same unlabeled sample in the training sample set into at least two groups, and process each group of spectral bands to obtain one view, thereby constructing at least two different views of the unlabeled sample and obtaining its multi-view data; in this way, obtain the multi-view data of all unlabeled samples in the training sample set.
(4) Train a deep residual network model with the multi-view data of all unlabeled samples in the training sample set. Training is performed multiple times; each time, N unlabeled samples (N ≥ 1) are drawn from the training sample set, and the training process is as follows: input the multi-view data of each unlabeled sample into the deep residual network model, where each view yields one vector, so that each unlabeled sample yields at least two vectors; compute the contrastive losses between all vectors of each unlabeled sample with a loss function, and from them the total contrastive loss of all unlabeled samples in this training pass; judge whether the total contrastive loss meets the set requirement, and if not, optimize the deep residual network model until it does, at which point training of the deep residual network model is complete.
(5) Construct multi-view data for each sample in the sample set to be classified, input one view of each sample into the trained deep residual network model to obtain the feature vector of the corresponding sample, and thereby obtain the feature vectors of all samples in the sample set to be classified;
or first reduce the dimensionality of each sample in the sample set to be classified, input each reduced sample into the trained deep residual network model to obtain the feature vector of the corresponding sample, and thereby obtain the feature vectors of all samples in the sample set to be classified;
or directly input each sample in the sample set to be classified into the trained deep residual network model to obtain the feature vector of the corresponding sample, and thereby obtain the feature vectors of all samples in the sample set to be classified.
(6) Input the feature vectors of all samples in the sample set to be classified into a pre-trained classification model to complete the hyperspectral image classification.
The invention also provides a hyperspectral image classification device based on deep multi-view learning, comprising a processor and a memory, where the processor executes a computer program stored in the memory to implement the above hyperspectral image classification method based on deep multi-view learning.
The hyperspectral image classification method and device based on deep multi-view learning have the following beneficial effects. An m×m×b data cube centered on the pixel to be processed in the hyperspectral image is selected as a sample; on this basis, at least two different views are constructed for each unlabeled sample in the training sample set, and a deep residual network model (hereinafter, the model) is trained with the resulting multi-view data of all unlabeled samples. On the one hand, the spatial-spectral joint information in the hyperspectral image is fully exploited; on the other hand, the advantage that unlabeled samples are abundant in hyperspectral images is brought into full play, and the deep feature information of the hyperspectral image is mined from a large number of unlabeled samples, so the trained model can extract the deep features of the hyperspectral image. Moreover, training completes only when the total contrastive loss meets the set requirement, which guarantees consistency when the trained model mines multi-view data deeply, makes the feature vectors output for different views of the same sample consistent, and gives the feature vectors extracted by the trained model stronger representativeness, discriminability and robustness.
Further, in the above hyperspectral image classification method and device based on deep multi-view learning, the loss function is a cosine similarity function constructed from the cosine similarity between vectors.
The benefit of doing so is that such a function pulls together the views of the same sample and pushes apart the views of different samples, so the consistency of the multi-view data in the hyperspectral image can be mined and the mined information is more representative. This effectively improves classification accuracy, and the improvement is especially marked when only a few labeled training samples are available.
Further, in the above method and device, the cosine similarity function is:
$$\zeta_{i,j} = -\log \frac{\exp\!\big(\operatorname{sim}(z_i, z_j)\big)}{\sum_{k=1}^{2N} 1_{[k \neq i]} \exp\!\big(\operatorname{sim}(z_i, z_k)\big)}$$
where $\zeta_{i,j}$ denotes the contrastive loss between vectors $z_i$ and $z_j$; $\operatorname{sim}(z_i, z_j) = z_i^{\mathrm T} z_j / (\|z_i\|\,\|z_j\|)$ is the cosine similarity between $z_i$ and $z_j$ (the closer it is to 1, the more similar the two vectors), with $z_i^{\mathrm T}$ the transpose of $z_i$ and $\|z_i\|$, $\|z_j\|$ the norms of $z_i$ and $z_j$; $1_{[k \neq i]}$ is the indicator function, equal to 1 when $k \neq i$; $i$, $j$ and $k$ are indices; and $N$ is the number of unlabeled samples.
To reduce the data processing load while ensuring that the obtained multi-view sample data contains enough spatial-spectral joint information, further, in the above method and device, processing a group of spectral bands to obtain a view comprises: perform principal component analysis on the group of spectral bands and take its first M principal components as one view, with M ≥ 1.
To improve the generalization ability of the deep residual network model, further, in the above method and device, sample data augmentation with random cropping and random Gaussian blur is applied when training the deep residual network model.
To guarantee that the deep residual network model has a deep network structure, so that the trained model can extract deep features of the hyperspectral image and improve classification accuracy under small-sample conditions, further, in the above method and device, the deep residual network model comprises 49 convolutional layers and 2 fully connected layers, where the 49 convolutional layers of the Resnet50 model serve as the 49 convolutional layers of the deep residual network model.
Further, in the above method and device, the classification model is a support vector machine classification model, a random forest classification model or a convolutional neural network classification model.
Drawings
FIG. 1 is a flowchart of the hyperspectral image classification method based on deep multi-view learning in an embodiment of the present invention;
FIG. 2 is a schematic diagram of the process of constructing multi-view data of an unlabeled sample in the method embodiment;
FIG. 3 is a schematic structural diagram of the deep residual network model in the method embodiment;
FIG. 4 is a schematic diagram of a standard residual block of the deep residual network model of FIG. 3;
FIG. 5 compares the classification results of various methods on the Salinas dataset in the method embodiment;
FIG. 6 is a structural diagram of the hyperspectral image classification device based on deep multi-view learning in the device embodiment.
Detailed Description
Method embodiment
The hyperspectral image classification method based on deep multi-view learning (hereinafter, the classification method) of this embodiment is shown in fig. 1 and comprises the following steps.
Step 1: input a hyperspectral image.
Step 2: extract a set number of unlabeled samples (chosen according to actual needs) from the hyperspectral image to form a training sample set, and form a sample set to be classified from the remaining samples. To make full use of the spatial information in the hyperspectral image and thus effectively improve classification accuracy, an m×m×b data cube selected with the pixel to be processed as its center is taken as a sample, where m is the size of the spatial neighborhood and b is the number of spectral bands.
Step 3: construct two different views for each unlabeled sample in the training sample set to obtain its multi-view data, and thereby the multi-view data of all unlabeled samples in the training sample set.
in this embodiment, a depth multi-view learning method is used to construct multi-view data of unmarked samples, in order to reduce complexity of a depth multi-view learning process, two different views are constructed for the same unmarked sample, and a specific process of view construction is as follows: and averagely dividing all spectral bands of the same unmarked sample into two groups, carrying out principal component analysis on each group of spectral bands, taking the first 3 principal components of the first group of spectral bands as a first visual angle of the unmarked sample, and taking the first 3 principal components of the second group of spectral bands as a second visual angle of the unmarked sample. For example, in fig. 2, for a certain unmarked pixel, a 28 × 28 × 200 data cube is selected as an unmarked sample of the pixel, the unmarked sample has 200 spectral bands, the 200 spectral bands of the unmarked sample are averagely divided into two groups, two spectral band groups with 100 spectral bands are obtained, principal component analysis is performed on each spectral band group, the first 3 principal components of the first group of spectral bands are used as a first viewing angle of the unmarked sample, and the first 3 principal components of the second group of spectral bands are used as a second viewing angle of the unmarked sample.
In other implementations, the first M principal components of a group of spectral bands may serve as a view, where M ≥ 1 is set according to actual needs; alternatively, each group of spectral bands may be used as a view directly, omitting the principal component analysis step.
In addition, since different bands of a hyperspectral image reflect different attributes of ground objects, and in order for the obtained multi-view data to contain as much of the image's spatial-spectral information as possible, 3 or more different views may also be constructed for the same unlabeled sample in other implementations; the construction process is similar to the two-view case and is not repeated here. In particular, each band of the unlabeled sample may also be taken as a view.
Step 4: train the deep residual network model with the multi-view data of all unlabeled samples in the training sample set.
The deep residual network model constructed in this embodiment consists of a network f(·) containing 49 convolutional layers and a network g(·) containing 2 fully connected layers. The network f(·) adopts the Resnet50 model with its classification layer removed as the basic structure, as shown in FIG. 3: Zero PAD denotes zero-padding around the image, CONV a two-dimensional convolutional layer, BatchNorm a batch normalization layer, ReLU the ReLU activation function, MaxPool a max pooling layer, AVGPool a global average pooling layer, and CONV Block a residual block as shown in FIG. 4; "Block × 3" denotes repeating the residual block 3 times. The network f(·) contains 16 standard residual blocks in total, each structured as in fig. 4, where CONV2D denotes a two-dimensional convolutional layer, BatchNorm a batch normalization layer, ReLU the ReLU activation function, and shortcut the skip connection of the residual network. Since each standard residual block contains 3 convolutional layers, the network f(·) contains 1 + 3 × (3 + 4 + 6 + 3) = 49 convolutional layers in total.
In this embodiment, with the parameters of the network f(·) set according to Table 1, f(·) outputs a vector h of dimension 2048 and is followed by the network g(·). By setting the numbers of input and output nodes of g(·), the vector h can be reduced in dimension so that the resulting feature vector z has a lower dimension, which can be set according to actual needs.
TABLE 1 Parameter settings of the network f(·) (reproduced as an image in the original publication)
The pseudo code of the deep residual network model training procedure in this embodiment is shown in Table 2.
TABLE 2 Pseudo code of the deep residual network model training procedure (reproduced as an image in the original publication)
The deep residual network model is trained with the multi-view data of all unlabeled samples in the training sample set, using a mini-batch strategy over multiple passes: each time, N unlabeled samples (N ≥ 1) are randomly selected from the training sample set. After deep multi-view learning, the N unlabeled samples yield 2N views; among these, two views from the same unlabeled sample are called a positive view pair, and two views from different unlabeled samples are called a negative view pair.
The two views of the same unlabeled sample produce two vectors z_i and z_j after passing through the deep residual network model, and the contrastive loss ζ_{i,j} between them can be computed with a loss function. During batch training, the total contrastive loss ζ over all positive view pairs is computed and checked against a set requirement (chosen according to actual needs); if the requirement is not met, the deep residual network model is optimized, and training finishes once the total contrastive loss ζ meets the requirement. In this way, the feature information shared by different views is learned from the similarity between views under an unsupervised setting.
In this embodiment, the cosine similarity function shown in equation (1) is used as the loss function:
$$\zeta_{i,j} = -\log \frac{\exp\!\big(\operatorname{sim}(z_i, z_j)\big)}{\sum_{k=1}^{2N} 1_{[k \neq i]} \exp\!\big(\operatorname{sim}(z_i, z_k)\big)} \qquad (1)$$
where $\zeta_{i,j}$ denotes the contrastive loss between vectors $z_i$ and $z_j$; $\operatorname{sim}(z_i, z_j) = z_i^{\mathrm T} z_j / (\|z_i\|\,\|z_j\|)$ is the cosine similarity between $z_i$ and $z_j$ (the closer it is to 1, the more similar the two vectors), with $z_i^{\mathrm T}$ the transpose of $z_i$ and $\|z_i\|$, $\|z_j\|$ the norms of $z_i$ and $z_j$; $1_{[k \neq i]}$ is the indicator function, equal to 1 when $k \neq i$; $i$, $j$ and $k$ are indices; and $N$ is the number of unlabeled samples.
Since the multi-view data of the same unlabeled sample are input into the deep residual network model and each view yields one vector, an unlabeled sample with 3 or more views yields 3 or more vectors. In that case the contrastive loss between all vectors of the unlabeled sample is computed as follows: combine the sample's vectors pairwise, compute the contrastive loss of each pair with equation (1), and sum the losses of all distinct pairs. For example, an unlabeled sample with 3 views yields 3 vectors a, b and c; the pairs give the loss values ζ_{a,b}, ζ_{a,c} and ζ_{b,c}, so the contrastive loss between all vectors of the sample is ζ_{a,b} + ζ_{a,c} + ζ_{b,c}.
In other embodiments, other forms of cosine similarity function may be employed to compute the contrastive loss between two vectors, as long as they are constructed from the cosine similarity between vectors; the cross-entropy loss function of the prior art may also be used.
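The per-pair loss of equation (1) can be sketched in NumPy as follows. This is an illustrative sketch; the function names are assumptions, and `z` stacks the 2N view vectors of a mini-batch one per row.

```python
import numpy as np

def cosine_sim(a, b):
    """sim(z_i, z_j): cosine similarity between two view vectors."""
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def pair_loss(z, i, j):
    """Contrastive loss zeta_{i,j} of equation (1): view j competes
    against every view k != i among the 2N rows of z."""
    num = np.exp(cosine_sim(z[i], z[j]))
    den = sum(np.exp(cosine_sim(z[i], z[k]))
              for k in range(len(z)) if k != i)   # indicator 1[k != i]
    return -np.log(num / den)
```

For a mini-batch, the total loss sums ζ_{i,j} and ζ_{j,i} over all positive view pairs. Because the numerator is one of the (all positive) denominator terms, `pair_loss` is always greater than zero.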
The Resnet50 model adopted in this embodiment has a deep network structure; in other embodiments, other deep residual network models with a deep structure, such as the Resnet100 model, may be used to construct the network f(·).
In other embodiments, to further strengthen the training effect and the robustness of the model, sample data augmentation with two methods, random cropping and random Gaussian blur, can be applied when training the deep residual network model.
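The two augmentations can be sketched in NumPy. This is a sketch; the crop size, blur strength and reflect padding are assumptions, not parameters stated in the patent.

```python
import numpy as np

def random_crop(view, out_size, rng):
    """Randomly crop an (m, m, c) view to (out_size, out_size, c)."""
    m = view.shape[0]
    r, c = rng.integers(0, m - out_size + 1, size=2)
    return view[r:r + out_size, c:c + out_size, :]

def gaussian_blur(view, sigma):
    """Separable Gaussian blur applied along both spatial axes."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    blur1d = lambda v: np.convolve(np.pad(v, radius, mode="reflect"),
                                   kernel, mode="valid")
    out = np.apply_along_axis(blur1d, 0, view)   # blur columns
    return np.apply_along_axis(blur1d, 1, out)   # then rows
```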
Step 5: construct multi-view data for each sample in the sample set to be classified, input one view of each sample into the trained deep residual network model to obtain the feature vector of the corresponding sample, and thereby obtain the feature vectors of all samples in the sample set to be classified.
The multi-view data of each sample to be classified is constructed in the same way as for the unlabeled samples in step 3, so the construction is not repeated here. Because the optimization goal of the deep residual network model in step 4 is to make the feature vectors output for different views of the same sample consistent, any one view of a sample to be classified can be input into the trained deep residual network model to extract its feature vector. In other embodiments, each sample in the sample set to be classified can first be reduced in dimensionality and the reduced sample input into the trained deep residual network model to obtain the corresponding feature vector; alternatively, each sample can be input into the trained model directly. Either way, the feature vectors of all samples in the sample set to be classified are obtained.
Step 6: input the feature vectors of all samples in the sample set to be classified into a pre-trained classification model to complete the hyperspectral image classification.
The trained classification model can be one trained by a traditional machine learning method, such as a support vector machine or random forest classification model, or one trained by a deep learning method, such as a convolutional neural network classification model.
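The final classification step can be sketched with scikit-learn. This is an illustrative sketch: the random feature vectors stand in for the encoder's outputs, and the SVM hyperparameters are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Stand-ins for feature vectors extracted by the trained model:
# two well-separated classes of 16-dimensional features.
train_feats = np.vstack([rng.normal(0.0, 1.0, (20, 16)),
                         rng.normal(3.0, 1.0, (20, 16))])
train_labels = np.repeat([0, 1], 20)

clf = SVC(kernel="rbf")              # support vector machine classifier
clf.fit(train_feats, train_labels)

test_feats = rng.normal(3.0, 1.0, (5, 16))   # features near class 1
pred = clf.predict(test_feats)
```

A random forest or convolutional neural network classifier can be substituted for `SVC` without changing the rest of the pipeline.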
The method of this embodiment has the following advantages: (1) multi-view data is constructed with principal component analysis on the basis of the three-dimensional data cube, so the spatial-spectral joint information in the hyperspectral image is fully exploited; (2) two different views are constructed for each unlabeled sample in the training sample set and the deep residual network model (hereinafter, the model) is trained with the resulting multi-view data, which on the one hand brings the abundance of unlabeled samples in the hyperspectral image into full play and mines the image's deep feature information from a large number of unlabeled samples, so that the trained model can extract the deep features of the hyperspectral image, and on the other hand completes training only when the total contrastive loss meets the set requirement, guaranteeing consistency when the trained model mines multi-view data, making the feature vectors output for different views of the same sample consistent, and giving the extracted feature vectors stronger representativeness, discriminability and robustness; consequently, classifying with the feature vectors extracted by the trained model effectively improves classification accuracy, especially when labeled training samples are few (i.e., under small-sample conditions); (3) the data augmentation used in training further improves the generalization ability of the network model.
The validity of the classification method of this embodiment is verified on the Salinas dataset as follows. The simulation conditions are: Intel Core i7-5700HQ 2.7 GHz CPU, GeForce GTX 970M GPU, 32 GB memory. On the Salinas dataset, 5 labeled samples of each land-cover class are randomly selected as training samples and the remaining samples are used as test samples. Experiments are carried out with extended morphological attribute profiles plus a support vector machine (EMP+SVM), a transductive support vector machine (TSVM), a three-dimensional convolutional autoencoder (3DCAE), a generative adversarial network (GAN), deep few-shot learning plus a support vector machine (DFSL+SVM), a 50-layer residual network model (ResNet50), and the classification method of this embodiment; the results are shown in Table 3 and FIG. 5. In the table, DMVL+SVM denotes the method of this embodiment implemented with an SVM classification model, and DMVL+RF the same method implemented with an RF classification model.
TABLE 3 Classification results of various methods on Salinas dataset
[Table 3 is reproduced as an image in the original publication and is not available here.]
In Table 3, OA denotes the overall classification accuracy, AA the average per-class classification accuracy, and k the kappa coefficient. Comparing the OA, AA and k values of the methods shows that, under the small-sample condition (5 labeled samples per class), the classification method of this embodiment improves the classification accuracy of the hyperspectral image substantially over the other methods.
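For reference, the three metrics reported in Table 3 can be computed from a confusion matrix as sketched below; the toy labels are illustrative only:

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes):
    """Overall accuracy (OA), average per-class accuracy (AA) and kappa
    coefficient from reference and predicted labels."""
    cm = np.zeros((n_classes, n_classes))
    np.add.at(cm, (y_true, y_pred), 1)          # confusion matrix (rows = truth)
    total = cm.sum()
    oa = np.trace(cm) / total                   # fraction correctly classified
    aa = np.mean(np.diag(cm) / cm.sum(axis=1))  # mean of per-class accuracies
    # Chance agreement for kappa: sum of (row marginal * column marginal).
    pe = (cm.sum(axis=0) @ cm.sum(axis=1)) / total**2
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
oa, aa, kappa = classification_metrics(y_true, y_pred, 3)
print(round(oa, 3), round(aa, 3), round(kappa, 3))  # 0.667 0.667 0.5
```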
Device embodiment
As shown in fig. 6, the hyperspectral image classification apparatus based on depth multi-view learning of this embodiment includes a processor and a memory; the memory stores a computer program operable on the processor, and the processor implements the method of the foregoing method embodiment when executing that program.
That is, the method in the above method embodiment should be understood as a flow of the hyperspectral image classification method based on depth multi-view learning that can be implemented by computer program instructions. These computer program instructions may be provided to a processor so that their execution by the processor implements the functions specified in the method flow described above.
The processor referred to in this embodiment is a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA).
The memory referred to in this embodiment includes any physical device for storing information; the information is typically digitized and then stored in an electrical, magnetic or optical medium. Examples include memories that store information electrically, such as RAM and ROM; memories that store information magnetically, such as hard disks, floppy disks, magnetic tape, magnetic-core memory, bubble memory and USB flash drives; and memories that store information optically, such as CDs and DVDs. Other kinds of memory also exist, such as quantum memory and graphene memory.
The apparatus comprising the memory, the processor and the computer program operates by the processor executing the corresponding program instructions, and the processor can run various operating systems, such as Windows, Linux, Android and iOS.
In other embodiments, the apparatus may further comprise a display for presenting the classification result to operators.

Claims (8)

1. A hyperspectral image classification method based on depth multi-view learning is characterized by comprising the following steps:
(1) Inputting a hyperspectral image;
(2) Extracting a set number of unlabeled samples from the hyperspectral image to form a training sample set, the remaining samples forming a sample set to be classified, wherein each sample is an m × m × b data cube selected with a pixel to be processed in the hyperspectral image as its center, m being the spatial neighborhood size and b the number of spectral bands;
(3) Dividing all spectral bands of each unlabeled sample in the training sample set into at least two groups and processing each group of spectral bands to obtain one view, thereby constructing at least two different views of the unlabeled sample and obtaining its multi-view data, and further obtaining the multi-view data of all unlabeled samples in the training sample set;
(4) Training a depth residual network model with the multi-view data of all unlabeled samples in the training sample set, wherein training is performed multiple times and N unlabeled samples are drawn from the training sample set each time, N ≥ 1, the training process being as follows: the multi-view data of each unlabeled sample in the current round is input into the depth residual network model, each view yielding one vector, so that each unlabeled sample yields at least two vectors; the contrastive loss between the vectors of each unlabeled sample is calculated with a loss function, from which the total contrastive loss over all unlabeled samples in the round is obtained; whether the total contrastive loss meets the set requirement is then judged, and if not, the depth residual network model is optimized until the total contrastive loss meets the set requirement, completing the training of the depth residual network model;
(5) Constructing multi-view data for each sample in the sample set to be classified, and inputting one view of each sample into the trained depth residual network model to obtain the feature vector of that sample, thereby obtaining the feature vectors of all samples in the sample set to be classified;
or first reducing the dimensionality of each sample in the sample set to be classified and inputting each dimension-reduced sample into the trained depth residual network model to obtain the feature vector of that sample, thereby obtaining the feature vectors of all samples in the sample set to be classified;
or directly inputting each sample in the sample set to be classified into the trained depth residual network model to obtain the feature vector of that sample, thereby obtaining the feature vectors of all samples in the sample set to be classified;
(6) Inputting the feature vectors of all samples in the sample set to be classified into a pre-trained classification model to complete the hyperspectral image classification.
2. The hyperspectral image classification method based on depth multi-view learning according to claim 1, wherein the loss function is a cosine similarity function constructed from the cosine similarity between vectors.
3. The hyperspectral image classification method based on depth multi-view learning according to claim 2, wherein the cosine similarity function is:

ζ_{i,j} = −log( exp(sim(z_i, z_j)) / Σ_{k=1}^{2N} 1_{[k≠i]} exp(sim(z_i, z_k)) )

where ζ_{i,j} denotes the contrastive loss between vectors z_i and z_j;

sim(z_i, z_j) = z_i^T z_j / (‖z_i‖ · ‖z_j‖)

is the cosine similarity between z_i and z_j (the closer it is to 1, the more similar the two vectors are); z_i^T denotes the transpose of z_i, and ‖z_i‖ and ‖z_j‖ denote the moduli of z_i and z_j; 1_{[k≠i]} is the indicator function, whose value is 1 when k ≠ i; i, j and k are index variables; N denotes the number of unlabeled samples.
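A minimal NumPy sketch of this cosine-similarity contrastive loss, assuming the usual two-views-per-sample batch layout (2N view vectors, the pair (i, j) indexing the two views of one sample and all other indices acting as negatives), is:

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity a^T b / (||a|| * ||b||)."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def contrastive_loss(z, i, j):
    """zeta_{i,j} for a batch z of 2N view vectors: -log of the softmax-like
    ratio of exp(sim(z_i, z_j)) over all exp(sim(z_i, z_k)) with k != i."""
    sims = np.array([cosine_sim(z[i], z[k]) for k in range(len(z))])
    numer = np.exp(sims[j])
    mask = np.arange(len(z)) != i            # indicator 1_[k != i]
    denom = np.exp(sims[mask]).sum()
    return -np.log(numer / denom)

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))                 # 2N = 8 view vectors, N = 4 samples
loss = contrastive_loss(z, i=0, j=1)
print(float(loss))                           # positive scalar
```

Since the denominator contains the numerator term plus other positive terms, the loss is always positive; it shrinks as the two views of the same sample become more similar than the negatives.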
4. The hyperspectral image classification method based on depth multi-view learning according to any one of claims 1 to 3, wherein processing a group of spectral bands to obtain one view comprises: performing principal component analysis on the group of spectral bands and taking its first M principal components as the view, M ≥ 1.
5. The hyperspectral image classification method based on depth multi-view learning according to any one of claims 1 to 3, wherein sample data augmentation is performed by random cropping and random Gaussian blurring when training the depth residual network model.
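A minimal NumPy-only sketch of these two augmentations (crop size, kernel radius and the sigma range are illustrative assumptions):

```python
import numpy as np

def random_crop(cube, size, rng):
    """Randomly crop a size x size spatial window from an m x m x b cube."""
    m = cube.shape[0]
    r, c = rng.integers(0, m - size + 1, size=2)
    return cube[r:r + size, c:c + size, :]

def gaussian_blur(img, sigma):
    """Separable Gaussian blur over the two spatial axes of an h x w x b cube
    (each spectral band is blurred independently)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    blur = lambda v: np.convolve(v, kernel, mode="same")
    out = np.apply_along_axis(blur, 0, img)   # blur along rows
    return np.apply_along_axis(blur, 1, out)  # then along columns

rng = np.random.default_rng(0)
cube = rng.normal(size=(11, 11, 10))          # one unlabeled sample cube
crop = random_crop(cube, 9, rng)              # random cropping
sigma = rng.uniform(0.1, 1.0)                 # random blur strength
blurred = gaussian_blur(crop, sigma)          # random Gaussian blurring
print(crop.shape, blurred.shape)
```

Each training round would apply such a randomized pipeline to the sample views before they enter the network.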
6. The hyperspectral image classification method based on depth multi-view learning according to any one of claims 1 to 3, wherein the depth residual network model comprises 49 convolutional layers and 2 fully-connected layers, the 49 convolutional layers of a ResNet50 model being used as the 49 convolutional layers of the depth residual network model.
7. The hyperspectral image classification method based on depth multi-view learning according to any one of claims 1 to 3, wherein the classification model is a support vector machine classification model, a random forest classification model or a convolutional neural network classification model.
8. A hyperspectral image classification apparatus based on depth multi-view learning, comprising a processor and a memory, wherein the processor executes a computer program stored in the memory to implement the hyperspectral image classification method based on depth multi-view learning according to any one of claims 1 to 7.
CN202010307781.4A 2020-04-17 2020-04-17 Hyperspectral image classification method and device based on depth multi-view learning Active CN111507409B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010307781.4A CN111507409B (en) 2020-04-17 2020-04-17 Hyperspectral image classification method and device based on depth multi-view learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010307781.4A CN111507409B (en) 2020-04-17 2020-04-17 Hyperspectral image classification method and device based on depth multi-view learning

Publications (2)

Publication Number Publication Date
CN111507409A CN111507409A (en) 2020-08-07
CN111507409B true CN111507409B (en) 2022-10-18

Family

ID=71876206

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010307781.4A Active CN111507409B (en) 2020-04-17 2020-04-17 Hyperspectral image classification method and device based on depth multi-view learning

Country Status (1)

Country Link
CN (1) CN111507409B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464891B (en) * 2020-12-14 2023-06-16 湖南大学 Hyperspectral image classification method
CN112749752B (en) * 2021-01-15 2023-02-03 中国人民解放军战略支援部队信息工程大学 Hyperspectral image classification method based on depth transform
CN112948897B (en) * 2021-03-15 2022-08-26 东北农业大学 Webpage tamper-proofing detection method based on combination of DRAE and SVM
CN113191442B (en) * 2021-05-14 2023-11-17 中国石油大学(华东) Method for classifying hyperspectral images through mutual conductance learning
CN114155397B (en) * 2021-11-29 2023-01-03 中国船舶重工集团公司第七0九研究所 Small sample image classification method and system

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
CN109871830A (en) * 2019-03-15 2019-06-11 中国人民解放军国防科技大学 Spatial-spectral fusion hyperspectral image classification method based on three-dimensional depth residual error network
CN110852227A (en) * 2019-11-04 2020-02-28 中国科学院遥感与数字地球研究所 Hyperspectral image deep learning classification method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111507409A (en) 2020-08-07

Similar Documents

Publication Publication Date Title
CN111507409B (en) Hyperspectral image classification method and device based on depth multi-view learning
Fasy et al. Introduction to the R package TDA
CN111160214B (en) 3D target detection method based on data fusion
CN110348399B (en) Hyperspectral intelligent classification method based on prototype learning mechanism and multidimensional residual error network
US10210430B2 (en) System and a method for learning features on geometric domains
CN109543548A (en) A kind of face identification method, device and storage medium
CN105320965A (en) Hyperspectral image classification method based on spectral-spatial cooperation of deep convolutional neural network
CN107133496B (en) Gene feature extraction method based on manifold learning and closed-loop deep convolution double-network model
CN110570440A (en) Image automatic segmentation method and device based on deep learning edge detection
CN114429422A (en) Image super-resolution reconstruction method and system based on residual channel attention network
Cai et al. Superpixel contracted neighborhood contrastive subspace clustering network for hyperspectral images
CN110009745B (en) Method for extracting plane from point cloud according to plane element and model drive
CN115830375A (en) Point cloud classification method and device
CN111476287A (en) Hyperspectral image small sample classification method and device
Yuan et al. ROBUST PCANet for hyperspectral image change detection
Siméoni et al. Unsupervised object discovery for instance recognition
CN116703992A (en) Accurate registration method, device and equipment for three-dimensional point cloud data and storage medium
CN109584194B (en) Hyperspectral image fusion method based on convolution variation probability model
CN116524352A (en) Remote sensing image water body extraction method and device
CN116386803A (en) Cytopathology report generation method based on graph
CN110910463A (en) Full-view-point cloud data fixed-length ordered encoding method and equipment and storage medium
CN107229935B (en) Binary description method of triangle features
Du et al. Expectation-maximization attention cross residual network for single image super-resolution
Yang et al. Automatic brain tumor segmentation using cascaded FCN with DenseCRF and K-means
Thorstensen et al. Pre-Image as Karcher Mean using Diffusion Maps: Application to shape and image denoising

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant