CN111507409B - Hyperspectral image classification method and device based on deep multi-view learning - Google Patents

Hyperspectral image classification method and device based on deep multi-view learning

Info

Publication number
CN111507409B
CN111507409B (application CN202010307781.4A)
Authority
CN
China
Prior art keywords
sample
samples
unmarked
training
depth
Prior art date
Legal status
Active
Application number
CN202010307781.4A
Other languages
Chinese (zh)
Other versions
CN111507409A (en)
Inventor
刘冰
郭文月
余岸竹
王瑞瑞
余旭初
张鹏强
谭熊
魏祥坡
高奎亮
左溪冰
Current Assignee
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Priority date
Filing date
Publication date
Application filed by Information Engineering University of PLA Strategic Support Force filed Critical Information Engineering University of PLA Strategic Support Force
Priority to CN202010307781.4A priority Critical patent/CN111507409B/en
Publication of CN111507409A publication Critical patent/CN111507409A/en
Application granted granted Critical
Publication of CN111507409B publication Critical patent/CN111507409B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G06F18/24 Classification techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods


Abstract

The invention provides a hyperspectral image classification method and device based on deep multi-view learning, belonging to the technical field of remote sensing image processing and application. The classification method comprises the following steps: construct at least two different views for each unlabeled sample in the training sample set, and train a deep residual network model with the resulting multi-view data of all unlabeled samples in the training sample set; construct multi-view data for each sample in the sample set to be classified and input one view of each sample into the trained model to obtain that sample's feature vector, or input each sample in the sample set to be classified (either directly or after dimensionality reduction) into the trained model to obtain its feature vector, thereby obtaining the feature vectors of all samples in the sample set to be classified; and input the feature vectors of all samples in the sample set to be classified into a pre-trained classification model to complete the hyperspectral image classification. The method can improve the classification accuracy of hyperspectral images under small-sample conditions.

Description

Hyperspectral image classification method and device based on deep multi-view learning
Technical Field
The invention relates to a hyperspectral image classification method and device based on deep multi-view learning, and belongs to the technical field of remote sensing image processing and application.
Background
Hyperspectral image classification is one of the important steps in hyperspectral image applications. Its basic task is to assign a category label to each pixel in the image, so it has high practical application value.
Existing hyperspectral image classification methods fall mainly into the following categories:
(1) Classification based on traditional machine learning. Methods such as the support vector machine, the semi-supervised support vector machine and the random forest can achieve a certain classification effect, but they usually require complex feature design work, rely heavily on expert experience for parameter tuning, are limited in application and deliver low classification accuracy.
(2) Classification based on deep learning. Methods such as the convolutional neural network can automatically extract the spatial-spectral features of a hyperspectral image without manually designed features, but they achieve a good classification effect only with the support of a large number of labeled training samples. In practical applications, acquiring labeled hyperspectral samples is time-consuming and labor-intensive, so very few labeled samples are available for training, and without a sufficient number of them such methods struggle to reach high classification accuracy.
(3) Classification based on a deep residual network model. Starting from the fact that unlabeled samples in a hyperspectral image are numerous and easy to acquire, this line of work studies how to better mine the feature information of the hyperspectral image with unlabeled samples and improve small-sample classification performance. For example, the invention patent application with publication number CN109754017A discloses a hyperspectral image classification method based on a separable three-dimensional residual network and transfer learning, in which a three-dimensional residual network model is constructed to autonomously extract deep features of the hyperspectral image. Compared with classification methods based on ordinary deep learning, this network model is deeper, achieves higher accuracy, and obtains a better classification effect under small-sample conditions. However, the method performs feature learning based on reconstruction errors and therefore cannot extract deeper abstract features, so its classification accuracy under small-sample conditions still needs further improvement.
In summary, among existing hyperspectral image classification methods, those based on traditional machine learning require complex feature design guided by expert experience and deliver low classification accuracy; those based on deep learning depend on a large number of labeled training samples and classify poorly under small-sample conditions; and those based on a deep residual network model learn features from reconstruction errors, cannot extract deeper abstract features, and still need further improvement in small-sample classification accuracy.
Disclosure of Invention
The invention aims to provide a hyperspectral image classification method and device based on deep multi-view learning, so as to solve the problem that existing hyperspectral image classification methods have low classification accuracy under small-sample conditions.
To achieve the above object, the present invention provides a hyperspectral image classification method based on deep multi-view learning, which comprises the following steps:
(1) Input a hyperspectral image.
(2) Extract a set number of unlabeled samples from the hyperspectral image to form a training sample set, and form a sample set to be classified from the remaining samples. A sample is an m×m×b data cube selected with a pixel to be processed in the hyperspectral image as its center, where m is the size of the spatial neighborhood and b is the number of spectral bands.
(3) Divide all spectral bands of the same unlabeled sample in the training sample set into at least two groups, and process each group of spectral bands to obtain one view, thereby constructing at least two different views of the unlabeled sample and obtaining its multi-view data; in this way, obtain the multi-view data of all unlabeled samples in the training sample set.
(4) Train a deep residual network model with the multi-view data of all unlabeled samples in the training sample set. Training is performed multiple times; each time, N unlabeled samples (N ≥ 1) are drawn from the training sample set, and the training process is as follows: input the multi-view data of each unlabeled sample into the deep residual network model, where each view yields one vector, so that each unlabeled sample yields at least two vectors; compute the contrastive losses between all vectors of each unlabeled sample with a loss function, and from them the total contrastive loss of all unlabeled samples in this training pass; judge whether the total contrastive loss meets the set requirement, and if not, optimize the deep residual network model until it does, at which point training of the deep residual network model is complete.
(5) Construct multi-view data for each sample in the sample set to be classified, input one view of each sample into the trained deep residual network model to obtain the feature vector of the corresponding sample, and thereby obtain the feature vectors of all samples in the sample set to be classified;
or first reduce the dimensionality of each sample in the sample set to be classified, input each reduced sample into the trained deep residual network model to obtain the feature vector of the corresponding sample, and thereby obtain the feature vectors of all samples in the sample set to be classified;
or directly input each sample in the sample set to be classified into the trained deep residual network model to obtain the feature vector of the corresponding sample, and thereby obtain the feature vectors of all samples in the sample set to be classified.
(6) Input the feature vectors of all samples in the sample set to be classified into a pre-trained classification model to complete the hyperspectral image classification.
The invention also provides a hyperspectral image classification device based on deep multi-view learning, comprising a processor and a memory, where the processor executes a computer program stored in the memory to implement the above hyperspectral image classification method based on deep multi-view learning.
The hyperspectral image classification method and device based on deep multi-view learning have the following beneficial effects. An m×m×b data cube centered on the pixel to be processed in the hyperspectral image is selected as a sample; on this basis, at least two different views are constructed for each unlabeled sample in the training sample set, and a deep residual network model (hereinafter, the model) is trained with the resulting multi-view data of all unlabeled samples. On the one hand, the spatial-spectral joint information in the hyperspectral image is fully exploited; on the other hand, the advantage that unlabeled samples are abundant in hyperspectral images is brought into full play, and the deep feature information of the hyperspectral image is mined from a large number of unlabeled samples, so the trained model can extract the deep features of the hyperspectral image. Moreover, training completes only when the total contrastive loss meets the set requirement, which guarantees consistency when the trained model mines multi-view data deeply, makes the feature vectors output for different views of the same sample consistent, and gives the feature vectors extracted by the trained model stronger representativeness, discriminability and robustness.
Further, in the above hyperspectral image classification method and device based on deep multi-view learning, the loss function is a cosine similarity function constructed from the cosine similarity between vectors.
The benefit of doing so is that such a function pulls together the views of the same sample and pushes apart the views of different samples, so the consistency of the multi-view data in the hyperspectral image can be mined and the mined information is more representative. This effectively improves classification accuracy, and the improvement is especially marked when only a few labeled training samples are available.
Further, in the above method and device, the cosine similarity function is:
$$\zeta_{i,j} = -\log \frac{\exp\!\big(\operatorname{sim}(z_i, z_j)\big)}{\sum_{k=1}^{2N} 1_{[k \neq i]} \exp\!\big(\operatorname{sim}(z_i, z_k)\big)}$$
where $\zeta_{i,j}$ denotes the contrastive loss between vectors $z_i$ and $z_j$; $\operatorname{sim}(z_i, z_j) = z_i^{\mathrm T} z_j / (\|z_i\|\,\|z_j\|)$ is the cosine similarity between $z_i$ and $z_j$ (the closer it is to 1, the more similar the two vectors), with $z_i^{\mathrm T}$ the transpose of $z_i$ and $\|z_i\|$, $\|z_j\|$ the norms of $z_i$ and $z_j$; $1_{[k \neq i]}$ is the indicator function, equal to 1 when $k \neq i$; $i$, $j$ and $k$ are indices; and $N$ is the number of unlabeled samples.
To reduce the data processing load while ensuring that the obtained multi-view sample data contains enough spatial-spectral joint information, further, in the above method and device, processing a group of spectral bands to obtain a view comprises: perform principal component analysis on the group of spectral bands and take its first M principal components as one view, with M ≥ 1.
To improve the generalization ability of the deep residual network model, further, in the above method and device, sample data augmentation with random cropping and random Gaussian blur is applied when training the deep residual network model.
To guarantee that the deep residual network model has a deep network structure, so that the trained model can extract deep features of the hyperspectral image and improve classification accuracy under small-sample conditions, further, in the above method and device, the deep residual network model comprises 49 convolutional layers and 2 fully connected layers, where the 49 convolutional layers of the Resnet50 model serve as the 49 convolutional layers of the deep residual network model.
Further, in the above method and device, the classification model is a support vector machine classification model, a random forest classification model or a convolutional neural network classification model.
Drawings
FIG. 1 is a flowchart of the hyperspectral image classification method based on deep multi-view learning in an embodiment of the present invention;
FIG. 2 is a schematic diagram of the process of constructing multi-view data of an unlabeled sample in the method embodiment;
FIG. 3 is a schematic structural diagram of the deep residual network model in the method embodiment;
FIG. 4 is a schematic diagram of a standard residual block of the deep residual network model of FIG. 3;
FIG. 5 compares the classification results of various methods on the Salinas dataset in the method embodiment;
FIG. 6 is a structural diagram of the hyperspectral image classification device based on deep multi-view learning in the device embodiment.
Detailed Description
Method embodiment
The hyperspectral image classification method based on deep multi-view learning (hereinafter, the classification method) of this embodiment is shown in fig. 1 and comprises the following steps.
Step 1: input a hyperspectral image.
Step 2: extract a set number of unlabeled samples (chosen according to actual needs) from the hyperspectral image to form a training sample set, and form a sample set to be classified from the remaining samples. To make full use of the spatial information in the hyperspectral image and thus effectively improve classification accuracy, an m×m×b data cube selected with the pixel to be processed as its center is taken as a sample, where m is the size of the spatial neighborhood and b is the number of spectral bands.
Step 3: construct two different views for each unlabeled sample in the training sample set to obtain its multi-view data, and thereby the multi-view data of all unlabeled samples in the training sample set.
in this embodiment, a depth multi-view learning method is used to construct multi-view data of unmarked samples, in order to reduce complexity of a depth multi-view learning process, two different views are constructed for the same unmarked sample, and a specific process of view construction is as follows: and averagely dividing all spectral bands of the same unmarked sample into two groups, carrying out principal component analysis on each group of spectral bands, taking the first 3 principal components of the first group of spectral bands as a first visual angle of the unmarked sample, and taking the first 3 principal components of the second group of spectral bands as a second visual angle of the unmarked sample. For example, in fig. 2, for a certain unmarked pixel, a 28 × 28 × 200 data cube is selected as an unmarked sample of the pixel, the unmarked sample has 200 spectral bands, the 200 spectral bands of the unmarked sample are averagely divided into two groups, two spectral band groups with 100 spectral bands are obtained, principal component analysis is performed on each spectral band group, the first 3 principal components of the first group of spectral bands are used as a first viewing angle of the unmarked sample, and the first 3 principal components of the second group of spectral bands are used as a second viewing angle of the unmarked sample.
In other implementations, the first M principal components of a group of spectral bands may serve as a view, where M ≥ 1 is set according to actual needs; alternatively, each group of spectral bands may be used as a view directly, omitting the principal component analysis step.
In addition, since different bands of a hyperspectral image reflect different attributes of ground objects, and in order for the obtained multi-view data to contain as much of the image's spatial-spectral information as possible, 3 or more different views may also be constructed for the same unlabeled sample in other implementations; the construction process is similar to the two-view case and is not repeated here. In particular, each band of the unlabeled sample may also be taken as a view.
Step 4: train the deep residual network model with the multi-view data of all unlabeled samples in the training sample set.
The deep residual network model constructed in this embodiment consists of a network f(·) containing 49 convolutional layers and a network g(·) containing 2 fully connected layers. The network f(·) adopts the Resnet50 model with its classification layer removed as the basic structure, as shown in FIG. 3: Zero PAD denotes zero-padding around the image, CONV a two-dimensional convolutional layer, BatchNorm a batch normalization layer, ReLU the ReLU activation function, MaxPool a max pooling layer, AVGPool a global average pooling layer, and CONV Block a residual block as shown in FIG. 4; "Block × 3" denotes repeating the residual block 3 times. The network f(·) contains 16 standard residual blocks in total, each structured as in fig. 4, where CONV2D denotes a two-dimensional convolutional layer, BatchNorm a batch normalization layer, ReLU the ReLU activation function, and shortcut the skip connection of the residual network. Since each standard residual block contains 3 convolutional layers, the network f(·) contains 1 + 3 × (3 + 4 + 6 + 3) = 49 convolutional layers in total.
In this embodiment, with the parameters of the network f(·) set according to Table 1, f(·) outputs a vector h of dimension 2048 and is followed by the network g(·). By setting the numbers of input and output nodes of g(·), the vector h can be reduced in dimension so that the resulting feature vector z has a lower dimension, which can be set according to actual needs.
TABLE 1 Parameter settings of the network f(·) (reproduced as an image in the original publication)
The pseudo code of the deep residual network model training procedure in this embodiment is shown in Table 2.
TABLE 2 Pseudo code of the deep residual network model training procedure (reproduced as an image in the original publication)
The deep residual network model is trained with the multi-view data of all unlabeled samples in the training sample set, using a mini-batch strategy over multiple passes: each time, N unlabeled samples (N ≥ 1) are randomly selected from the training sample set. After deep multi-view learning, the N unlabeled samples yield 2N views; among these, two views from the same unlabeled sample are called a positive view pair, and two views from different unlabeled samples are called a negative view pair.
The two views of the same unlabeled sample produce two vectors z_i and z_j after passing through the deep residual network model, and the contrastive loss ζ_{i,j} between them can be computed with a loss function. During batch training, the total contrastive loss ζ over all positive view pairs is computed and checked against a set requirement (chosen according to actual needs); if the requirement is not met, the deep residual network model is optimized, and training finishes once the total contrastive loss ζ meets the requirement. In this way, the feature information shared by different views is learned from the similarity between views under an unsupervised setting.
In this embodiment, the cosine similarity function shown in equation (1) is used as the loss function:
$$\zeta_{i,j} = -\log \frac{\exp\!\big(\operatorname{sim}(z_i, z_j)\big)}{\sum_{k=1}^{2N} 1_{[k \neq i]} \exp\!\big(\operatorname{sim}(z_i, z_k)\big)} \qquad (1)$$
where $\zeta_{i,j}$ denotes the contrastive loss between vectors $z_i$ and $z_j$; $\operatorname{sim}(z_i, z_j) = z_i^{\mathrm T} z_j / (\|z_i\|\,\|z_j\|)$ is the cosine similarity between $z_i$ and $z_j$ (the closer it is to 1, the more similar the two vectors), with $z_i^{\mathrm T}$ the transpose of $z_i$ and $\|z_i\|$, $\|z_j\|$ the norms of $z_i$ and $z_j$; $1_{[k \neq i]}$ is the indicator function, equal to 1 when $k \neq i$; $i$, $j$ and $k$ are indices; and $N$ is the number of unlabeled samples.
Since the multi-view data of the same unlabeled sample are input into the deep residual network model and each view yields one vector, an unlabeled sample with 3 or more views yields 3 or more vectors. In that case the contrastive loss between all vectors of the unlabeled sample is computed as follows: combine the sample's vectors pairwise, compute the contrastive loss of each pair with equation (1), and sum the losses of all distinct pairs. For example, an unlabeled sample with 3 views yields 3 vectors a, b and c; the pairs give the loss values ζ_{a,b}, ζ_{a,c} and ζ_{b,c}, so the contrastive loss between all vectors of the sample is ζ_{a,b} + ζ_{a,c} + ζ_{b,c}.
In other embodiments, other forms of cosine similarity function may be employed to compute the contrastive loss between two vectors, as long as they are constructed from the cosine similarity between vectors; the cross-entropy loss function of the prior art may also be used.
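The per-pair loss of equation (1) can be sketched in NumPy as follows. This is an illustrative sketch; the function names are assumptions, and `z` stacks the 2N view vectors of a mini-batch one per row.

```python
import numpy as np

def cosine_sim(a, b):
    """sim(z_i, z_j): cosine similarity between two view vectors."""
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def pair_loss(z, i, j):
    """Contrastive loss zeta_{i,j} of equation (1): view j competes
    against every view k != i among the 2N rows of z."""
    num = np.exp(cosine_sim(z[i], z[j]))
    den = sum(np.exp(cosine_sim(z[i], z[k]))
              for k in range(len(z)) if k != i)   # indicator 1[k != i]
    return -np.log(num / den)
```

For a mini-batch, the total loss sums ζ_{i,j} and ζ_{j,i} over all positive view pairs. Because the numerator is one of the (all positive) denominator terms, `pair_loss` is always greater than zero.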
The Resnet50 model adopted in this embodiment has a deep network structure; in other embodiments, other deep residual network models with a deep structure, such as the Resnet100 model, may be used to construct the network f(·).
In other embodiments, to further strengthen the training effect and the robustness of the model, sample data augmentation with two methods, random cropping and random Gaussian blur, can be applied when training the deep residual network model.
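The two augmentations can be sketched in NumPy. This is a sketch; the crop size, blur strength and reflect padding are assumptions, not parameters stated in the patent.

```python
import numpy as np

def random_crop(view, out_size, rng):
    """Randomly crop an (m, m, c) view to (out_size, out_size, c)."""
    m = view.shape[0]
    r, c = rng.integers(0, m - out_size + 1, size=2)
    return view[r:r + out_size, c:c + out_size, :]

def gaussian_blur(view, sigma):
    """Separable Gaussian blur applied along both spatial axes."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    blur1d = lambda v: np.convolve(np.pad(v, radius, mode="reflect"),
                                   kernel, mode="valid")
    out = np.apply_along_axis(blur1d, 0, view)   # blur columns
    return np.apply_along_axis(blur1d, 1, out)   # then rows
```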
Step 5: construct multi-view data for each sample in the sample set to be classified, input one view of each sample into the trained deep residual network model to obtain the feature vector of the corresponding sample, and thereby obtain the feature vectors of all samples in the sample set to be classified.
The multi-view data of each sample to be classified is constructed in the same way as for the unlabeled samples in step 3, so the construction is not repeated here. Because the optimization goal of the deep residual network model in step 4 is to make the feature vectors output for different views of the same sample consistent, any one view of a sample to be classified can be input into the trained deep residual network model to extract its feature vector. In other embodiments, each sample in the sample set to be classified can first be reduced in dimensionality and the reduced sample input into the trained deep residual network model to obtain the corresponding feature vector; alternatively, each sample can be input into the trained model directly. Either way, the feature vectors of all samples in the sample set to be classified are obtained.
Step 6: input the feature vectors of all samples in the sample set to be classified into a pre-trained classification model to complete the hyperspectral image classification.
The trained classification model can be one trained by a traditional machine learning method, such as a support vector machine or random forest classification model, or one trained by a deep learning method, such as a convolutional neural network classification model.
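The final classification step can be sketched with scikit-learn. This is an illustrative sketch: the random feature vectors stand in for the encoder's outputs, and the SVM hyperparameters are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Stand-ins for feature vectors extracted by the trained model:
# two well-separated classes of 16-dimensional features.
train_feats = np.vstack([rng.normal(0.0, 1.0, (20, 16)),
                         rng.normal(3.0, 1.0, (20, 16))])
train_labels = np.repeat([0, 1], 20)

clf = SVC(kernel="rbf")              # support vector machine classifier
clf.fit(train_feats, train_labels)

test_feats = rng.normal(3.0, 1.0, (5, 16))   # features near class 1
pred = clf.predict(test_feats)
```

A random forest or convolutional neural network classifier can be substituted for `SVC` without changing the rest of the pipeline.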
The method of this embodiment has the following advantages: (1) multi-view data is constructed with principal component analysis on the basis of the three-dimensional data cube, so the spatial-spectral joint information in the hyperspectral image is fully exploited; (2) two different views are constructed for each unlabeled sample in the training sample set and the deep residual network model (hereinafter, the model) is trained with the resulting multi-view data, which on the one hand brings the abundance of unlabeled samples in the hyperspectral image into full play and mines the image's deep feature information from a large number of unlabeled samples, so that the trained model can extract the deep features of the hyperspectral image, and on the other hand completes training only when the total contrastive loss meets the set requirement, guaranteeing consistency when the trained model mines multi-view data, making the feature vectors output for different views of the same sample consistent, and giving the extracted feature vectors stronger representativeness, discriminability and robustness; consequently, classifying with the feature vectors extracted by the trained model effectively improves classification accuracy, especially when labeled training samples are few (i.e., under small-sample conditions); (3) the data augmentation used in training further improves the generalization ability of the network model.
The validity of the classification method of this embodiment is verified on the Salinas dataset as follows. The simulation conditions are: Intel Core i7-5700HQ 2.7 GHz CPU, GeForce GTX 970M GPU, 32 GB memory. On the Salinas dataset, 5 labeled samples of each land-cover class are randomly selected as training samples and the remaining samples are used as test samples. Experiments are carried out with extended morphological attribute profiles plus a support vector machine (EMP+SVM), a transductive support vector machine (TSVM), a three-dimensional convolutional autoencoder (3DCAE), a generative adversarial network (GAN), deep few-shot learning plus a support vector machine (DFSL+SVM), a 50-layer residual network model (ResNet50), and the classification method of this embodiment; the results are shown in Table 3 and FIG. 5. In the table, DMVL+SVM denotes the method of this embodiment implemented with an SVM classification model, and DMVL+RF the same method implemented with an RF classification model.
TABLE 3 Classification results of various methods on Salinas dataset
[Table 3 is reproduced as an image in the original publication and is not available here.]
In Table 3, OA denotes the overall classification accuracy, AA the average per-class classification accuracy, and k the kappa coefficient. Comparing the OA, AA and k values of the methods shows that, under the small-sample condition (5 labeled samples per class), the classification method of this embodiment improves the classification accuracy of the hyperspectral image substantially over the other methods.
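For reference, the three metrics reported in Table 3 can be computed from a confusion matrix as sketched below; the toy labels are illustrative only:

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes):
    """Overall accuracy (OA), average per-class accuracy (AA) and kappa
    coefficient from reference and predicted labels."""
    cm = np.zeros((n_classes, n_classes))
    np.add.at(cm, (y_true, y_pred), 1)          # confusion matrix (rows = truth)
    total = cm.sum()
    oa = np.trace(cm) / total                   # fraction correctly classified
    aa = np.mean(np.diag(cm) / cm.sum(axis=1))  # mean of per-class accuracies
    # Chance agreement for kappa: sum of (row marginal * column marginal).
    pe = (cm.sum(axis=0) @ cm.sum(axis=1)) / total**2
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
oa, aa, kappa = classification_metrics(y_true, y_pred, 3)
print(round(oa, 3), round(aa, 3), round(kappa, 3))  # 0.667 0.667 0.5
```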
Device embodiment
As shown in fig. 6, the hyperspectral image classification apparatus based on depth multi-view learning of this embodiment includes a processor and a memory; the memory stores a computer program operable on the processor, and the processor implements the method of the foregoing method embodiment when executing that program.
That is, the method in the above method embodiment should be understood as a flow of the hyperspectral image classification method based on depth multi-view learning that can be implemented by computer program instructions. These computer program instructions may be provided to a processor so that their execution by the processor implements the functions specified in the method flow described above.
The processor referred to in this embodiment is a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA).
The memory referred to in this embodiment includes any physical device for storing information; the information is typically digitized and then stored in an electrical, magnetic or optical medium. Examples include memories that store information electrically, such as RAM and ROM; memories that store information magnetically, such as hard disks, floppy disks, magnetic tape, magnetic-core memory, bubble memory and USB flash drives; and memories that store information optically, such as CDs and DVDs. Other kinds of memory also exist, such as quantum memory and graphene memory.
The apparatus comprising the memory, the processor and the computer program operates by the processor executing the corresponding program instructions, and the processor can run various operating systems, such as Windows, Linux, Android and iOS.
In other embodiments, the apparatus may further comprise a display for presenting the classification result to operators.

Claims (8)

1. A hyperspectral image classification method based on depth multi-view learning is characterized by comprising the following steps:
(1) Inputting a hyperspectral image;
(2) Extracting a set number of unlabeled samples from the hyperspectral image to form a training sample set, the remaining samples forming a sample set to be classified, wherein each sample is an m × m × b data cube selected with a pixel to be processed in the hyperspectral image as its center, m being the spatial neighborhood size and b the number of spectral bands;
(3) Dividing all spectral bands of each unlabeled sample in the training sample set into at least two groups and processing each group of spectral bands to obtain one view, thereby constructing at least two different views of the unlabeled sample and obtaining its multi-view data, and further obtaining the multi-view data of all unlabeled samples in the training sample set;
(4) Training a depth residual network model with the multi-view data of all unlabeled samples in the training sample set, wherein training is performed multiple times and N unlabeled samples are drawn from the training sample set each time, N ≥ 1, the training process being as follows: the multi-view data of each unlabeled sample in the current round is input into the depth residual network model, each view yielding one vector, so that each unlabeled sample yields at least two vectors; the contrastive loss between the vectors of each unlabeled sample is calculated with a loss function, from which the total contrastive loss over all unlabeled samples in the round is obtained; whether the total contrastive loss meets the set requirement is then judged, and if not, the depth residual network model is optimized until the total contrastive loss meets the set requirement, completing the training of the depth residual network model;
(5) Constructing multi-view data for each sample in the sample set to be classified, and inputting one view of each sample into the trained depth residual network model to obtain the feature vector of that sample, thereby obtaining the feature vectors of all samples in the sample set to be classified;
or first reducing the dimensionality of each sample in the sample set to be classified and inputting each dimension-reduced sample into the trained depth residual network model to obtain the feature vector of that sample, thereby obtaining the feature vectors of all samples in the sample set to be classified;
or directly inputting each sample in the sample set to be classified into the trained depth residual network model to obtain the feature vector of that sample, thereby obtaining the feature vectors of all samples in the sample set to be classified;
(6) Inputting the feature vectors of all samples in the sample set to be classified into a pre-trained classification model to complete the hyperspectral image classification.
2. The hyperspectral image classification method based on depth multi-view learning according to claim 1, wherein the loss function is a cosine similarity function constructed from the cosine similarity between vectors.
3. The hyperspectral image classification method based on depth multi-view learning according to claim 2, wherein the cosine similarity function is:

ζ_{i,j} = −log( exp(sim(z_i, z_j)) / Σ_{k=1}^{2N} 1_{[k≠i]} exp(sim(z_i, z_k)) )

where ζ_{i,j} denotes the contrastive loss between vectors z_i and z_j;

sim(z_i, z_j) = z_i^T z_j / (‖z_i‖ · ‖z_j‖)

is the cosine similarity between z_i and z_j (the closer it is to 1, the more similar the two vectors are); z_i^T denotes the transpose of z_i, and ‖z_i‖ and ‖z_j‖ denote the moduli of z_i and z_j; 1_{[k≠i]} is the indicator function, whose value is 1 when k ≠ i; i, j and k are index variables; N denotes the number of unlabeled samples.
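A minimal NumPy sketch of this cosine-similarity contrastive loss, assuming the usual two-views-per-sample batch layout (2N view vectors, the pair (i, j) indexing the two views of one sample and all other indices acting as negatives), is:

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity a^T b / (||a|| * ||b||)."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def contrastive_loss(z, i, j):
    """zeta_{i,j} for a batch z of 2N view vectors: -log of the softmax-like
    ratio of exp(sim(z_i, z_j)) over all exp(sim(z_i, z_k)) with k != i."""
    sims = np.array([cosine_sim(z[i], z[k]) for k in range(len(z))])
    numer = np.exp(sims[j])
    mask = np.arange(len(z)) != i            # indicator 1_[k != i]
    denom = np.exp(sims[mask]).sum()
    return -np.log(numer / denom)

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))                 # 2N = 8 view vectors, N = 4 samples
loss = contrastive_loss(z, i=0, j=1)
print(float(loss))                           # positive scalar
```

Since the denominator contains the numerator term plus other positive terms, the loss is always positive; it shrinks as the two views of the same sample become more similar than the negatives.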
4. The hyperspectral image classification method based on depth multi-view learning according to any one of claims 1 to 3, wherein processing a group of spectral bands to obtain one view comprises: performing principal component analysis on the group of spectral bands and taking its first M principal components as the view, M ≥ 1.
5. The hyperspectral image classification method based on depth multi-view learning according to any one of claims 1 to 3, wherein sample data augmentation is performed by random cropping and random Gaussian blurring when training the depth residual network model.
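A minimal NumPy-only sketch of these two augmentations (crop size, kernel radius and the sigma range are illustrative assumptions):

```python
import numpy as np

def random_crop(cube, size, rng):
    """Randomly crop a size x size spatial window from an m x m x b cube."""
    m = cube.shape[0]
    r, c = rng.integers(0, m - size + 1, size=2)
    return cube[r:r + size, c:c + size, :]

def gaussian_blur(img, sigma):
    """Separable Gaussian blur over the two spatial axes of an h x w x b cube
    (each spectral band is blurred independently)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    blur = lambda v: np.convolve(v, kernel, mode="same")
    out = np.apply_along_axis(blur, 0, img)   # blur along rows
    return np.apply_along_axis(blur, 1, out)  # then along columns

rng = np.random.default_rng(0)
cube = rng.normal(size=(11, 11, 10))          # one unlabeled sample cube
crop = random_crop(cube, 9, rng)              # random cropping
sigma = rng.uniform(0.1, 1.0)                 # random blur strength
blurred = gaussian_blur(crop, sigma)          # random Gaussian blurring
print(crop.shape, blurred.shape)
```

Each training round would apply such a randomized pipeline to the sample views before they enter the network.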
6. The hyperspectral image classification method based on depth multi-view learning according to any one of claims 1 to 3, wherein the depth residual network model comprises 49 convolutional layers and 2 fully-connected layers, the 49 convolutional layers of a ResNet50 model being used as the 49 convolutional layers of the depth residual network model.
7. The hyperspectral image classification method based on depth multi-view learning according to any one of claims 1 to 3, wherein the classification model is a support vector machine classification model, a random forest classification model or a convolutional neural network classification model.
8. A hyperspectral image classification apparatus based on depth multi-view learning, comprising a processor and a memory, wherein the processor executes a computer program stored in the memory to implement the hyperspectral image classification method based on depth multi-view learning according to any one of claims 1 to 7.
CN202010307781.4A 2020-04-17 2020-04-17 Hyperspectral image classification method and device based on depth multi-view learning Active CN111507409B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010307781.4A CN111507409B (en) 2020-04-17 2020-04-17 Hyperspectral image classification method and device based on depth multi-view learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010307781.4A CN111507409B (en) 2020-04-17 2020-04-17 Hyperspectral image classification method and device based on depth multi-view learning

Publications (2)

Publication Number Publication Date
CN111507409A CN111507409A (en) 2020-08-07
CN111507409B true CN111507409B (en) 2022-10-18

Family

ID=71876206

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010307781.4A Active CN111507409B (en) 2020-04-17 2020-04-17 Hyperspectral image classification method and device based on depth multi-view learning

Country Status (1)

Country Link
CN (1) CN111507409B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464891B (en) * 2020-12-14 2023-06-16 湖南大学 Hyperspectral image classification method
CN112749752B (en) * 2021-01-15 2023-02-03 中国人民解放军战略支援部队信息工程大学 Hyperspectral image classification method based on depth transform
CN112948897B (en) * 2021-03-15 2022-08-26 东北农业大学 Webpage tamper-proofing detection method based on combination of DRAE and SVM
CN113191442B (en) * 2021-05-14 2023-11-17 中国石油大学(华东) Method for classifying hyperspectral images through mutual conductance learning
CN114155397B (en) * 2021-11-29 2023-01-03 中国船舶重工集团公司第七0九研究所 Small sample image classification method and system

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
CN109871830A (en) * 2019-03-15 2019-06-11 中国人民解放军国防科技大学 Spatial-spectral fusion hyperspectral image classification method based on three-dimensional depth residual error network
CN110852227A (en) * 2019-11-04 2020-02-28 中国科学院遥感与数字地球研究所 Hyperspectral image deep learning classification method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111507409A (en) 2020-08-07

Similar Documents

Publication Publication Date Title
CN111507409B (en) Hyperspectral image classification method and device based on depth multi-view learning
Fasy et al. Introduction to the R package TDA
CN111160214B (en) 3D target detection method based on data fusion
CN110348399B (en) Hyperspectral intelligent classification method based on prototype learning mechanism and multidimensional residual error network
US10210430B2 (en) System and a method for learning features on geometric domains
CN109543548A (en) A kind of face identification method, device and storage medium
CN105320965A (en) Hyperspectral image classification method based on spectral-spatial cooperation of deep convolutional neural network
CN107133496B (en) Gene feature extraction method based on manifold learning and closed-loop deep convolution double-network model
CN110570440A (en) Image automatic segmentation method and device based on deep learning edge detection
CN114429422A (en) Image super-resolution reconstruction method and system based on residual channel attention network
Cai et al. Superpixel contracted neighborhood contrastive subspace clustering network for hyperspectral images
CN110009745B (en) Method for extracting plane from point cloud according to plane element and model drive
CN115830375A (en) Point cloud classification method and device
CN111476287A (en) Hyperspectral image small sample classification method and device
Yuan et al. ROBUST PCANet for hyperspectral image change detection
Siméoni et al. Unsupervised object discovery for instance recognition
CN116703992A (en) Accurate registration method, device and equipment for three-dimensional point cloud data and storage medium
CN109584194B (en) Hyperspectral image fusion method based on convolution variation probability model
CN116524352A (en) Remote sensing image water body extraction method and device
CN116386803A (en) Cytopathology report generation method based on graph
CN110910463A (en) Full-view-point cloud data fixed-length ordered encoding method and equipment and storage medium
CN107229935B (en) Binary description method of triangle features
Du et al. Expectation-maximization attention cross residual network for single image super-resolution
Yang et al. Automatic brain tumor segmentation using cascaded FCN with DenseCRF and K-means
Thorstensen et al. Pre-Image as Karcher Mean using Diffusion Maps: Application to shape and image denoising

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant