CN112633386A - SACVAEGAN-based hyperspectral image classification method - Google Patents

SACVAEGAN-based hyperspectral image classification method

Publication number
CN112633386A
Authority
CN
China
Prior art keywords
data
hyperspectral
training
classifier
potential
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011569729.2A
Other languages
Chinese (zh)
Inventor
陈志涛
同磊
禹晶
肖创柏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Application filed by Beijing University of Technology
Priority to CN202011569729.2A
Publication of CN112633386A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a SACVAEGAN-based hyperspectral image classification method. A latent vector classifier module is added to CVAEGAN to classify the latent vectors corresponding to hyperspectral data, so that it is trained cooperatively with the decoder and the sample classifier. This addresses the difficulty of matching randomly generated latent vectors to classes in a GAN and further improves accuracy. A self-attention mechanism and spectral normalization are applied to the decoder, the encoder and the discriminator: self-attention lets the network extract the features of hyperspectral data more effectively, and spectral normalization improves the stability of the model. The sample classifier extracts features from both the spatial and the spectral perspective and adds a residual network structure, which improves classification performance.

Description

SACVAEGAN-based hyperspectral image classification method
Technical Field
The invention relates to the field of hyperspectral image classification, in particular to a method for classifying hyperspectral images.
Background
With the continuous development of remote sensing technology, hyperspectral imaging (HSI) has made significant breakthroughs in the field of earth observation. Unlike traditional three-channel color images, a hyperspectral image is collected in hundreds of spectral bands simultaneously and therefore carries very rich spectral information. Hyperspectral images are consequently widely used in satellite remote sensing, crop observation, mineral exploration and similar fields.
In hyperspectral data processing, classification has always been one of the most active topics. Classification methods for hyperspectral data generally fall into two groups: spectral classifiers and spectral-spatial classifiers. Conventional hyperspectral image classification algorithms include Support Vector Machines (SVM), K-Nearest Neighbors (KNN), maximum likelihood, neural networks, and logistic regression. However, because the same material may show spectral differences and different materials may have similar spectral characteristics, it is difficult to distinguish classes accurately from spectral information alone. To solve this problem, some scholars have proposed combining spectral information with spatial information to improve classification performance.
Because traditional classification methods require a large amount of expert experience and manual parameter setting, deep learning methods have been widely applied to hyperspectral image classification in recent years, and Convolutional Neural Networks (CNN) in particular have received great attention. For example, W. Hu proposed a method that uses a five-layer deep CNN to extract the spectral features of HSI and obtained better classification performance. Li proposed a pixel-pair method as a deep spectral classifier; it achieves good results when training data are scarce, but it mainly performs convolution in the spectral domain and ignores spatial details. Ying Li proposed extracting spectral-spatial features with 3D convolution, which takes the various characteristics of hyperspectral data fully into account and obtains good classification results.
Although deep learning based approaches have made great progress in HSI classification, they still face a problem: too little labeled data. For this reason, some scholars have proposed using GAN models to alleviate the shortage of hyperspectral data. A GAN typically comprises a generative model G and a discriminative model D, which are trained adversarially. G attempts to generate pseudo-samples that are as realistic as possible from a random vector Z, while D attempts to tell real samples from the pseudo-samples generated by G. The two compete until D can no longer identify the fake samples. Using the GAN-generated samples correctly as virtual samples can improve classification accuracy and alleviate the shortage of hyperspectral training data.
Disclosure of Invention
The invention aims to solve the above problems and provides a hyperspectral image classification method based on SACVAEGAN (Self-Attention Conditional Variational Autoencoder Generative Adversarial Network). The invention adopts CVAEGAN (a conditional variational autoencoder combined with a generative adversarial network) as its basic structure to address the shortage of hyperspectral training data: the decoder in the CVAEGAN generates virtual hyperspectral samples that augment the training data and thereby improve classification accuracy. In addition, a latent vector classifier module is added to the CVAEGAN to classify the latent vectors corresponding to hyperspectral data, so that it is trained cooperatively with the decoder and the sample classifier. This addresses the difficulty of matching randomly generated latent vectors to classes in a GAN and improves accuracy. A self-attention mechanism and spectral normalization are applied to the decoder, the encoder and the discriminator: self-attention lets the network extract the features of hyperspectral data more effectively, and spectral normalization improves the stability of the model. The sample classifier extracts features from both the spatial and the spectral perspective and adds a residual network structure, which improves classification performance.
In order to achieve the purpose, the technical scheme and the experimental steps adopted by the invention are as follows:
(1) First, the hyperspectral image data are preprocessed.
(1a) The edges of the original hyperspectral data are zero-padded so that a window of size patchsize × patchsize can be taken centered on every point. patchsize is 28 for the IndianPines and Salinas datasets and 24 for the PaviaU dataset.
(1b) K points are randomly selected as training labels: 500 points for the IndianPines and PaviaU datasets and 200 points for the Salinas dataset. The remaining points are used as test labels.
(1c) A sample set is obtained. Training data are cut as patchsize × patchsize windows centered on the K randomly selected training labels, and test data are cut as patchsize × patchsize windows centered on the remaining points.
(2) Building a network model
After data preprocessing, the network model is constructed. The model consists of five networks organized in four modules: the conditional variational autoencoder (an encoder and a decoder), the discriminator, the sample classifier, and the latent vector classifier.
(2a) The conditional variational autoencoder is divided into a decoder (i.e., the generator) and an encoder. The encoder generates the latent vectors corresponding to real hyperspectral data. The decoder generates the corresponding virtual hyperspectral data from the latent vectors. Both combine a self-attention mechanism with spectral normalization.
(2b) The discriminator judges whether the input hyperspectral data are real or virtual, and combines a self-attention mechanism with spectral normalization.
(2c) The sample classifier classifies the input hyperspectral data. It consists of two branches that obtain the spatial features and the spectral features of the hyperspectral image, and it extracts features with the help of a residual network.
(2d) The latent vector classifier classifies the randomly generated latent vectors and thereby assigns a class to the virtual hyperspectral data that the generator produces from them, for use in the subsequent operations.
(3) Training network
Training starts once the data and the model have been prepared. The training process is divided into four parts: training the conditional variational autoencoder, the discriminator, the sample classifier, and the latent vector classifier.
(3a) The discriminator is trained first. Its training uses three inputs: real hyperspectral data, virtual hyperspectral data generated by the decoder from the latent vectors that the encoder produces, and virtual hyperspectral data generated from randomly generated latent vectors. Each is input into the discriminator, and the loss function is calculated to optimize the discriminator's parameters.
(3b) The conditional variational autoencoder is trained in five steps. First, the real hyperspectral data are fed into the encoder to generate the corresponding latent vectors, and the corresponding loss function is calculated. Second, the virtual hyperspectral data generated from the encoder's latent vectors and the real hyperspectral data are input into the discriminator, and the corresponding loss function is calculated. Third, the same virtual and real hyperspectral data are fed into the sample classifier, and the corresponding loss function is calculated. Fourth, the randomly generated latent vectors and the corresponding virtual hyperspectral data are input into the latent vector classifier and the sample classifier respectively, and the corresponding loss functions are calculated. Fifth, the latent vectors generated by the encoder are input into the latent vector classifier, and the corresponding loss function is calculated.
(3c) The latent vector classifier is trained by inputting the latent vectors corresponding to real hyperspectral data into it for classification and calculating a loss function.
(3d) The sample classifier is trained in three steps. First, the real hyperspectral data are input into the sample classifier to calculate the classification loss. Second, the real and the virtual hyperspectral data are each input into the sample classifier to calculate the corresponding loss function. Third, the randomly generated latent vectors are classified by the latent vector classifier, the virtual hyperspectral data generated from them are classified by the sample classifier, and the corresponding loss function is calculated from the two classification results.
(4) Hyperspectral image classification
Testing is performed after model training is complete. The test results are compared with the ground truth to obtain the classification result, and the accuracy is calculated. The latent vector classifier added in the invention addresses the correspondence between randomly generated latent vectors and classes in the GAN model, and experimental results show that the accuracy of the model in classifying hyperspectral data improves markedly after it is added. Adding the self-attention mechanism and spectral normalization to the network model improves the stability and training speed of the model. Extracting features from both the spatial and the spectral perspective in the sample classifier, together with the added residual network structure, improves the accuracy of the sample classifier on hyperspectral data.
Drawings
FIG. 1 is a flow chart of the present invention.
Fig. 2 is an overall structural view of the present invention.
Fig. 3 is a block diagram of a sample classifier.
Fig. 4 is a structural diagram of the conditional variational autoencoder, in which the upper part is the encoder and the lower part is the decoder.
Fig. 5 is a structural diagram of the discriminator.
Fig. 6 is a block diagram of the latent vector classifier.
Fig. 7 shows the hyperspectral images used in the present invention: (a) IndianPines, (b) PaviaU, (c) Salinas.
Fig. 8 shows the classification results on the IndianPines hyperspectral image: (a) SVM, (b) 2D-CNN, (c) 3D-CNN, (d) DCGAN, (e) the present invention.
Fig. 9 shows the classification results on the PaviaU hyperspectral image: (a) SVM, (b) 2D-CNN, (c) 3D-CNN, (d) DCGAN, (e) the present invention.
Fig. 10 shows the classification results on the Salinas hyperspectral image: (a) SVM, (b) 2D-CNN, (c) 3D-CNN, (d) DCGAN, (e) the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to FIG. 1, the experimental procedure of the present invention is as follows:
Step 1, data preprocessing
(1) Edge padding and normalization of the data
The edges of the data are zero-padded so that a window of size patchsize × patchsize can be cut centered on every data point. The data are then normalized.
(2) Obtaining a sample set
K data points are randomly selected from the original data, and the patchsize × patchsize data blocks centered on them form the training set. The patchsize × patchsize data blocks centered on the remaining data points form the test set.
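The preprocessing of Step 1 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patent's own code: the function names, the (H, W, B) array layout, the per-band min-max scaling, the convention that label 0 means "unlabeled", and the asymmetric padding used for even window sizes are all hypothetical choices that the text does not specify.

```python
import numpy as np

def normalize_cube(cube):
    """Scale each spectral band to [0, 1] (assumed min-max normalization)."""
    lo = cube.min(axis=(0, 1), keepdims=True)
    hi = cube.max(axis=(0, 1), keepdims=True)
    return (cube - lo) / np.maximum(hi - lo, 1e-12)

def extract_patches(cube, centers, patch_size):
    """Zero-pad the edges and cut one patch_size x patch_size window per center.

    cube:    array of shape (H, W, B)
    centers: iterable of (row, col) coordinates in the unpadded image
    """
    before = patch_size // 2            # padding on the top/left
    after = patch_size - 1 - before     # padding on the bottom/right
    padded = np.pad(cube, ((before, after), (before, after), (0, 0)),
                    mode="constant", constant_values=0)
    # In padded coordinates the window starting at (r, c) has the original
    # pixel (r, c) at offset (before, before).
    patches = [padded[r:r + patch_size, c:c + patch_size, :] for r, c in centers]
    return np.stack(patches)

def split_centers(labels, k, seed=0):
    """Randomly pick k labeled pixels for training; the rest are test pixels."""
    rng = np.random.default_rng(seed)
    coords = np.argwhere(labels > 0)    # assumption: label 0 means unlabeled
    order = rng.permutation(len(coords))
    return coords[order[:k]], coords[order[k:]]
```

With patch_size set to 28 or 24 and k set to 500 or 200, this corresponds to the window sizes and training-set sizes quoted in the text.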
Step 2, constructing a model
The model mainly comprises four parts: the conditional variational autoencoder, the discriminator, the sample classifier, and the latent vector classifier.
The conditional variational autoencoder consists of an encoder and a decoder. The encoder generates the latent vector corresponding to an input sample, and the decoder generates sample data of the corresponding class from an input latent vector. To make better use of the extracted features when generating latent vectors, the encoder uses two branches. One branch processes the data block with 2-D convolutions to obtain spatial features; the other processes the central pixel of the data block with 1-D convolutions to obtain spectral features. The spatial and spectral features are then concatenated for the subsequent operations. Both branches consist of convolutional layers, fully connected layers, batch normalization, spectral normalization, Dropout, and a self-attention mechanism. All convolutional layers except the last are spectrally normalized to satisfy the 1-Lipschitz constraint. The activation function of the convolutional layers is 'ReLU'. The decoder has a similar structure but a single branch: it consists of deconvolution layers, fully connected layers, batch normalization, spectral normalization, Dropout, and a self-attention mechanism, and again all convolutional layers except the last are spectrally normalized to satisfy the 1-Lipschitz constraint.
Batch normalization normalizes the data of each training mini-batch, which permits a higher learning rate and accelerates convergence. Dropout helps prevent overfitting when training samples are few. Spectral normalization reduces the likelihood of mode collapse in the GAN and improves its stability. The self-attention mechanism allows feature information to be extracted more effectively. In the encoder, the output of the convolutional and self-attention layers is flattened, fed into a fully connected layer, and activated with a 'Sigmoid' function. In the decoder, the output of the deconvolution and self-attention layers is flattened, fed into a fully connected layer, and finally activated with a 'Tanh' function.
The main function of the discriminator is to judge whether the input sample data are real or virtual. It consists of convolutional layers, fully connected layers, batch normalization, spectral normalization, Dropout, and a self-attention mechanism. All convolutional layers except the last are spectrally normalized to satisfy the 1-Lipschitz constraint. The activation function of the convolutional layers is 'LeakyReLU'. As in the encoder and decoder, batch normalization permits a higher learning rate and accelerates convergence, Dropout helps prevent overfitting when training samples are few, spectral normalization reduces the likelihood of mode collapse and improves stability, and self-attention allows feature information to be extracted more effectively. The output of the convolutional and self-attention layers is flattened, fed into a fully connected layer, and finally activated with a 'Sigmoid' function.
The main function of the sample classifier is to classify the input sample data. Its structure is divided into two branches: one processes the data block with 2-D convolutions to obtain spatial features, and the other processes the central pixel of the data block with 1-D convolutions to obtain spectral features. The spatial and spectral features are then concatenated for the subsequent operations. Both branches consist of convolutional layers, fully connected layers, batch normalization, Dropout, and residual network structures. The activation function of the convolutional layers is 'LeakyReLU'. Batch normalization permits a higher learning rate and accelerates convergence, and Dropout helps prevent overfitting when training samples are few. The residual network structure allows a deeper network and improves its accuracy. The output of the convolutional and residual layers is flattened, fed into a fully connected layer, and finally activated with a 'LogSoftmax' function.
The main role of the latent vector classifier is to classify latent vectors. It consists of convolutional layers, fully connected layers and Dropout. The activation function of the convolutional layers is 'ReLU'. Batch normalization permits a higher learning rate and accelerates convergence, and Dropout helps prevent overfitting when training samples are few. The input latent vector is first mapped to a fixed size by a fully connected layer, then reshaped and fed into the convolutional layers, then flattened and fed into a fully connected layer, and finally activated with a 'LogSoftmax' function.
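Both classifiers end in a 'LogSoftmax' activation. As a small illustration of what that output layer computes, the sketch below implements a numerically stable row-wise log-softmax in NumPy. The function name and the (batch, classes) layout are assumptions made for this example; the text only names the activation.

```python
import numpy as np

def log_softmax(logits):
    """Row-wise log-softmax: log(exp(z_k) / sum_j exp(z_j)).

    logits: array of shape (batch, num_classes)
    """
    # Subtracting the row maximum leaves the result unchanged but avoids
    # overflow in exp() for large logits.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
```

The outputs are log-probabilities, so a negative log-likelihood loss can be applied to them directly.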
Batch Normalization (BN) normalizes the activations of the previous layer over each batch. In other words, it transforms the mean of the previous layer's activations to 0 and their standard deviation to 1. Assume that the batch size is n and that $x_i$, $i = 1, \dots, n$, are the activation values output by the previous layer. Batch normalization is calculated as:
$$y_i = \gamma \, \frac{x_i - E[x]}{\sqrt{Var[x]}} + \beta \tag{1}$$
where $y_i$ is the output for the i-th sample of the batch after batch normalization, $E[x]$ and $Var[x]$ are the expectation and variance of x over the batch, and γ and β are learned parameters.
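The batch normalization step described above can be written in a few lines of NumPy. This is a sketch under assumptions: the function name and the (n, d) layout are hypothetical, and the small constant `eps` is an addition commonly used for numerical stability that the text does not mention.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch axis, then scale and shift.

    x:     activations of shape (n, d), n = batch size
    gamma: learned scale of shape (d,)
    beta:  learned shift of shape (d,)
    eps:   assumed stabilizer, not part of the formula in the text
    """
    mean = x.mean(axis=0)                    # E[x] per feature
    var = x.var(axis=0)                      # Var[x] per feature
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * x_hat + beta
```

With gamma set to ones and beta to zeros, each output feature has mean 0 and a standard deviation very close to 1 over the batch.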
The main role of spectral normalization is to make the parameters of the convolution operations satisfy the 1-Lipschitz constraint, which makes the network more stable and reduces mode collapse. It is implemented by dividing the parameters of each layer by the spectral norm of that layer's parameter matrix.
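A minimal NumPy sketch of this idea is given below, assuming the spectral norm is estimated by power iteration on the weight tensor flattened to a 2-D matrix. The function name, the flattening to (out_channels, -1) and the number of iterations are assumptions for illustration. Practical implementations typically run a single power-iteration step per training update and reuse the vector `u` between updates, which this stand-alone example does not do.

```python
import numpy as np

def spectral_normalize(weight, n_iter=100, seed=0):
    """Divide a weight tensor by an estimate of its largest singular value."""
    w = weight.reshape(weight.shape[0], -1)   # flatten conv kernels to 2-D
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(w.shape[0])
    for _ in range(n_iter):                   # power iteration
        v = w.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = w @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ w @ v                         # estimated spectral norm
    return weight / sigma
```

After normalization the flattened weight matrix has a largest singular value of approximately 1.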
The main function of the self-attention mechanism is to improve the quality of the generated images and thereby the performance of the classifier. Most GAN-based image generation models are built from convolutional layers. A convolutional layer processes information in a local neighborhood, so modeling long-range dependencies in an image with convolutional layers alone is computationally inefficient. A self-attention mechanism is therefore introduced, which enables the generator and the discriminator to model relationships between widely separated spatial regions efficiently.
Suppose that the image features from the previous hidden layer, $x \in R^{C \times N}$, are transformed into two feature spaces f and g to calculate attention, where $f(x) = W_f x$ and $g(x) = W_g x$:
$$\beta_{j,i} = \frac{\exp(s_{ij})}{\sum_{i=1}^{N} \exp(s_{ij})}, \quad s_{ij} = f(x_i)^T g(x_j) \tag{2}$$
$\beta_{j,i}$ represents the degree to which region i participates when the model synthesizes the image content of region j, i.e., their correlation. The output of the attention layer is $o = (o_1, o_2, \dots, o_j, \dots, o_N) \in R^{C \times N}$, where C is the number of channels and N is the number of feature locations in the previous hidden layer.
$$o_j = \sum_{i=1}^{N} \beta_{j,i} \, h(x_i), \quad h(x_i) = W_h x_i \tag{3}$$
Finally, the output of the attention layer is multiplied by a scale parameter and added back to the input feature map. The final output is therefore:
$$y_i = \gamma o_i + x_i \tag{4}$$
γ is a learnable scalar initialized to 0, so that the network first relies on local neighborhood information and then gradually learns to assign weight to more distant features.
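The attention computation above can be sketched in NumPy as follows. The projections `w_f` and `w_g` follow the text. The value projection `w_h`, the reduced channel count of f and g, and the function name are assumptions borrowed from the standard self-attention GAN formulation, since the source does not spell them out.

```python
import numpy as np

def self_attention(x, w_f, w_g, w_h, gamma):
    """Self-attention over N feature locations.

    x:     features of shape (C, N)
    w_f:   projection of shape (C_reduced, C)
    w_g:   projection of shape (C_reduced, C)
    w_h:   assumed value projection of shape (C, C)
    gamma: learnable scalar, initialized to 0
    """
    f = w_f @ x                              # (C_reduced, N)
    g = w_g @ x                              # (C_reduced, N)
    h = w_h @ x                              # (C, N)
    s = f.T @ g                              # s[i, j] = f(x_i)^T g(x_j)
    s = s - s.max(axis=0, keepdims=True)     # stabilize the softmax
    beta = np.exp(s)
    beta /= beta.sum(axis=0, keepdims=True)  # softmax over i for each j
    o = h @ beta                             # o[:, j] = sum_i beta[i, j] h(x_i)
    return gamma * o + x, beta
```

With gamma equal to 0 the layer returns its input unchanged, which matches the described initialization: attention only starts to contribute once gamma moves away from zero during training.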
The residual network consists of a series of residual blocks, one of which can be represented as:
$$x_{l+1} = x_l + F(x_l, W_l) \tag{5}$$
where $x_l$ is the direct mapping, which passes the data of the previous layer through unchanged, and $F(x_l, W_l)$ is the residual part, which is generally composed of 2 to 3 convolutions. The residual network is easy to optimize, and its depth can be increased to improve accuracy.
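The skip connection of Equation (5) is illustrated below with a deliberately simplified residual branch. As an assumption made to keep the example short and self-contained, the two convolutions are replaced by two dense matrix multiplications with a ReLU in between. The function names are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """x_{l+1} = x_l + F(x_l, W_l), with F a two-layer stand-in for convolutions.

    x:  input of shape (n, d)
    w1: weights of shape (d, d_hidden)
    w2: weights of shape (d_hidden, d), so the output matches x in shape
    """
    residual = relu(x @ w1) @ w2    # F(x_l, W_l)
    return x + residual             # direct mapping plus residual part
```

If the residual branch outputs zero, the block reduces to the identity, which is one reason deeper stacks of such blocks remain easy to optimize.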
Step 3, training the network
Training the conditional variational autoencoder is divided into four parts: training the encoder, training the decoder with the discriminator, training the decoder with the sample classifier, and training the decoder with the latent vector classifier. The loss function for training the conditional variational autoencoder is:
$$L_G = L_E + L_{GD} + L_{GC} + L_{aux\_real} \tag{6}$$
where
[Equation (7), the definition of $L_E$, is an image in the source and is not reproduced here.]
In the above formula, v and ξ are the mean and variance of the latent vector output by the encoder network, x is the real data, and x' is the generated virtual data.
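The exact form of the encoder loss is not recoverable from this text. Purely as an illustration of how a loss built from a latent mean v, a latent variance ξ, real data x and generated data x' commonly looks in variational autoencoders, the sketch below combines the KL divergence between a diagonal Gaussian and the standard normal with a squared reconstruction error. Both terms, their weighting, and all function names are assumptions and should not be read as the patent's Equation (7).

```python
import numpy as np

def kl_to_standard_normal(v, xi):
    """Mean over the batch of KL( N(v, diag(xi)) || N(0, I) ).

    v:  latent means of shape (n, d)
    xi: latent variances of shape (n, d), strictly positive
    """
    kl = 0.5 * np.sum(v ** 2 + xi - np.log(xi) - 1.0, axis=1)
    return float(np.mean(kl))

def reconstruction_loss(x, x_rec):
    """Mean over the batch of 0.5 * ||x - x'||^2."""
    diff = (x - x_rec).reshape(len(x), -1)
    return float(np.mean(0.5 * np.sum(diff ** 2, axis=1)))

def encoder_loss(v, xi, x, x_rec):
    """Assumed encoder objective: KL term plus reconstruction term."""
    return kl_to_standard_normal(v, xi) + reconstruction_loss(x, x_rec)
```

The KL term is zero exactly when the encoder outputs v = 0 and ξ = 1, that is, when the latent distribution already matches the prior that random latent vectors are drawn from.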
[Equations (8) and (9), the definitions of $L_{GD}$ and $L_{GC}$, are images in the source and are not reproduced here.]
In the above formulas, $P_r$ and $P_z$ are the distributions of the real data and of the latent variables, respectively, and $f_D$ and $f_C$ are the intermediate-layer outputs of the discriminator and of the sample classifier, respectively.
[Equation (10), the definition of $L_{aux\_real}$, is an image in the source and is not reproduced here.]
Here x represents real data or virtual data generated from the latent vectors corresponding to real data, $O_C$ represents the final output of the sample classifier, and $O_{aux}$ represents the output of the latent vector classifier.
The training of the discriminator is divided into three parts. The real data, the virtual samples generated from the latent vectors corresponding to the real data, and the virtual samples generated from randomly generated latent vectors are each input into the discriminator to calculate the corresponding loss functions. The loss function for training the discriminator is:
$$L_D = L_{D\_real} + L_{D\_fake} + L_{D\_fake\_random} \tag{11}$$
where
[The definitions of $L_{D\_real}$, $L_{D\_fake}$ and $L_{D\_fake\_random}$ are images in the source and are not reproduced here.]
Here x denotes the real data and $P_r$ the distribution of the real data. z denotes the latent vector corresponding to the real data and $P_z$ the distribution of z. $z_{random}$ denotes a randomly generated latent vector and $P_{z\_random}$ the distribution of $z_{random}$.
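The three discriminator loss terms are not legible in this text. Since the discriminator is described as ending in a 'Sigmoid' activation, one plausible reading is the standard binary cross-entropy GAN objective, sketched below. This is an assumption, not the patent's stated formula, and the function and argument names are hypothetical.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, d_fake_random, eps=1e-12):
    """Assumed cross-entropy form of L_D = L_D_real + L_D_fake + L_D_fake_random.

    d_real:        D(x) for real data, values in (0, 1)
    d_fake:        D(G(z)) for z produced by the encoder from real data
    d_fake_random: D(G(z_random)) for randomly generated latent vectors
    eps:           small constant that keeps log() finite
    """
    l_real = -np.mean(np.log(d_real + eps))
    l_fake = -np.mean(np.log(1.0 - d_fake + eps))
    l_fake_random = -np.mean(np.log(1.0 - d_fake_random + eps))
    return float(l_real + l_fake + l_fake_random)
```

Under this assumed form, a discriminator that outputs 0.5 everywhere incurs a loss of 3 log 2, and a perfect one approaches 0.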
The training of the sample classifier is divided into three parts. The real data, the virtual samples generated from the latent vectors corresponding to the real data, and the virtual samples generated from randomly generated latent vectors are each input into the sample classifier to calculate the corresponding loss functions. The loss function for training the sample classifier is:
$$L_C = L_{C\_real} + L_{C\_fake} + L_{C\_fake\_random} \tag{14}$$
where
$$L_{C\_real} = -E[\log P(c \mid x_r)] \tag{15}$$
$$L_{C\_fake} = \| f(x_r) - f(x_g) \| \tag{16}$$
$$L_{C\_fake\_random} = -E[\log P(c \mid x_{g\_random})] \tag{17}$$
Here $x_r$ represents the real data, $x_g$ represents the virtual data generated by the decoder from the latent vectors corresponding to the real data, and $x_{g\_random}$ represents the virtual data generated by the decoder from randomly generated latent vectors. c represents the class, and f represents the intermediate-layer output of the sample classifier.
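Equations (14) to (17) can be sketched in NumPy as follows. The sketch assumes the classifier already outputs log-probabilities (consistent with the 'LogSoftmax' activation), that the norm in Equation (16) is the Frobenius norm over the whole batch of intermediate features, and that the class labels for samples generated from random latent vectors come from the latent vector classifier. The function and argument names are hypothetical.

```python
import numpy as np

def nll(log_probs, labels):
    """-E[log P(c | x)] for log-probabilities of shape (n, num_classes)."""
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))

def sample_classifier_loss(logp_real, y_real, feat_real, feat_fake,
                           logp_random, y_random):
    """L_C = L_C_real + L_C_fake + L_C_fake_random, Eq. (14)."""
    l_real = nll(logp_real, y_real)                          # Eq. (15)
    l_fake = float(np.linalg.norm(feat_real - feat_fake))    # Eq. (16), assumed norm
    l_random = nll(logp_random, y_random)                    # Eq. (17)
    return l_real + l_fake + l_random
```

The middle term does not use labels at all: it only pulls the intermediate features of reconstructed samples toward those of the real samples they were generated from.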
The training of the latent vector classifier has only one part: the latent vectors corresponding to the real data are input into the latent vector classifier for classification, and the corresponding loss function is calculated. The loss function for training the latent vector classifier is:
$$L_{aux} = -E[\log P(c \mid z)] \tag{18}$$
where z is the latent vector corresponding to the real data.
In addition, the classifiers, the discriminator and the generator are all optimized with the Adam algorithm, with a batch size of 32 and a weight decay of 0.0005. The learning rate is 0.0001 for the generator, 0.0002 for the discriminator and 0.0001 for both classifiers. At iterations 20, 40, 80 and 100 the learning rate is multiplied by 0.7. The number of iterations is 500. For the IndianPines dataset the data block size is 28 × 28 and the training set size is 500. For the PaviaU dataset the data block size is 24 × 24 and the training set size is 500. For the Salinas dataset the data block size is 28 × 28 and the training set size is 200. The latent variable z has a size of 100.
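The step-wise learning-rate decay described above can be expressed as a small pure-Python function. The function name and the choice to apply the decay from the milestone iteration onward (rather than just after it) are assumptions. The milestones and the factor 0.7 are taken from the text.

```python
def learning_rate(base_lr, iteration, milestones=(20, 40, 80, 100), factor=0.7):
    """Learning rate after multiplying by `factor` at every milestone reached."""
    passed = sum(1 for m in milestones if iteration >= m)
    return base_lr * factor ** passed

# Base learning rates quoted in the text.
BASE_LR = {
    "generator": 0.0001,
    "discriminator": 0.0002,
    "classifiers": 0.0001,
}
```

For example, `learning_rate(BASE_LR["generator"], 45)` applies two decays and `learning_rate(BASE_LR["generator"], 150)` applies all four. In PyTorch, the framework the document names, a schedule of this shape is available as `torch.optim.lr_scheduler.MultiStepLR` with `gamma=0.7`; whether the authors used it is not stated.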
Step 4, classifying the hyperspectral images
The output of the classifier on the test data is compared with the true values to obtain the classification result, and the accuracy is calculated.
Step 5, outputting the classification image result.
Experiments and analyses
1. Conditions of the experiment
The hardware test platform of the invention is an Intel(R) Core(TM) i5-9300H CPU with a base frequency of 2.40 GHz, 16 GB of memory and a GTX 1660 Ti graphics card; the software platform is the Windows 10 operating system and PyCharm 2019. The programming language is Python, and the network structure is implemented using the PyTorch deep learning framework.
2. Experimental data
The performance evaluation of the present invention mainly uses three datasets: the Indian Pines dataset (Indiana, USA), the Pavia University dataset (Italy) and the Salinas dataset (Salinas Valley, California, USA).
The Indian Pines dataset was collected in 1992 by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over the Indian Pines test site in northwestern Indiana. The image has 220 original bands, of which 200 remain after the unusable bands are removed. The image contains 16 classes of samples in total. Table 1 shows the number of samples of each class in the Indian Pines image, and the number of samples of each class in the training set and the test set used by the present invention on this dataset.
TABLE 1
Figure BDA0002862419270000091
Figure BDA0002862419270000101
The Pavia University dataset is part of the hyperspectral data acquired in 2003 over the University of Pavia, Italy, by the German airborne Reflective Optics System Imaging Spectrometer (ROSIS). The imaging spectrometer continuously images 115 bands within the wavelength range of 0.43-0.86 μm, and the spatial resolution of the image is 1.3 m. Of these, 12 bands are rejected because of noise, so the image composed of the remaining 103 spectral bands is generally used. The data has a size of 610 × 340 and contains 9 classes of samples. Table 2 shows the number of samples of each class in the Pavia University image, and the number of samples of each class in the training set and the test set used by the present invention on this dataset.
TABLE 2
Figure BDA0002862419270000102
The Salinas dataset is an image of the Salinas Valley, California, USA, acquired by the AVIRIS imaging spectrometer. Its spatial resolution is 3.7 m. The image has 224 original bands, of which 204 remain after several invalid bands are removed. The size of the image is 512 × 217, and it contains 16 classes of samples. Table 3 shows the number of samples of each class in the Salinas image, and the number of samples of each class in the training set and the test set used by the present invention on this dataset.
TABLE 3
Figure BDA0002862419270000111
3. Performance comparison
The four prior-art classification methods used for comparison with the invention are as follows:
(1) A support vector machine (SVM) with a radial basis function (RBF) kernel, used to classify the hyperspectral images.
(2) The hyperspectral image classification method proposed by Rotewara et al. in "Hyperspectral remote sensing image classification based on a deep convolutional neural network", referred to as the convolutional neural network classification method for short. The hyperspectral data are first reduced in dimensionality by PCA (Principal Component Analysis), and 2D convolution is then used to classify the hyperspectral image.
(3) The hyperspectral image classification method proposed by Amina Ben Hamida et al. in "3-D Deep Learning Approach for Remote Sensing Image Classification", referred to as the 3D convolutional network classification method for short. The hyperspectral images are classified using 3D convolution.
(4) The hyperspectral image classification method proposed by Lin Zhu et al. in "Generative Adversarial Networks for Hyperspectral Image Classification", referred to as the generative adversarial network classification method for short. The hyperspectral images are classified by DCGAN (Deep Convolutional Generative Adversarial Networks).
In the experiment, the following three indexes were used to evaluate the performance of the present invention:
The first evaluation index is the Overall Accuracy (OA), which represents the proportion of correctly classified samples among all samples; the larger the value, the better the classification effect.
The second evaluation index is the Average Accuracy (AA), which represents the average of the per-class classification accuracies; the larger the value, the better the classification effect.
The third evaluation index is the Kappa coefficient, which is computed from the confusion matrix and measures the agreement between the classification result and the true labels after correcting for chance agreement; the larger the value, the better the classification effect.
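The three indexes can be computed from a confusion matrix as in the following minimal sketch. It assumes that rows are true classes and columns are predicted classes; the function name is hypothetical and the code only illustrates the standard definitions, not the evaluation script used in the experiments.

```python
def classification_metrics(confusion):
    """Return (OA, AA, Kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    diag = [confusion[i][i] for i in range(n)]
    row_sums = [sum(row) for row in confusion]                       # true counts
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]  # predicted counts

    # Overall accuracy: correctly classified samples over all samples.
    oa = sum(diag) / total

    # Average accuracy: mean of per-class accuracies (empty classes are skipped).
    per_class = [d / r for d, r in zip(diag, row_sums) if r > 0]
    aa = sum(per_class) / len(per_class)

    # Kappa: observed agreement corrected by the agreement expected by chance.
    pe = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    kappa = (oa - pe) / (1 - pe) if pe < 1 else float("nan")
    return oa, aa, kappa
```

OA weights every sample equally, so large classes dominate it, whereas AA weights every class equally. That is why both are reported for datasets with unbalanced classes such as Indian Pines.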
Table 4 compares the classification accuracy of the present invention with that of the other methods on the Indian Pines hyperspectral image.
Table 5 compares the classification accuracy of the present invention with that of the other methods on the Pavia University hyperspectral image of Pavia, Italy.
Table 6 compares the classification accuracy of the present invention with that of the other methods on the Salinas hyperspectral image of the Salinas Valley, California, USA.
TABLE 4
Figure BDA0002862419270000121
TABLE 5
Figure BDA0002862419270000122
Figure BDA0002862419270000131
TABLE 6
Figure BDA0002862419270000132
Figure BDA0002862419270000141
As can be seen from Tables 4, 5 and 6, on the same hyperspectral dataset the hyperspectral classification method provided by the invention achieves a better classification effect than the other methods. The classification performance of the network is about 3%, 9% and 2% better than that of the best existing method on the Pavia University dataset, the Indian Pines dataset and the Salinas dataset, respectively.
In addition, Figs. 8, 9 and 10 show the classification maps, whose visual classification effect is consistent with the results listed in Tables 4, 5 and 6. Visually, the classification maps produced by the method of the invention are better than those of the other methods.
TABLE 7
Figure BDA0002862419270000142
Table 7 compares the training and test times of the present invention with those of SVM, 2dCNN, 3dCNN and DCGAN.
As can be seen from Table 7, the training time of the present invention is much longer than that of the other methods because of the complicated structure of the model. The test times of the present invention and DCGAN are comparable, and both are longer than those of 2dCNN and 3dCNN.
In summary, the invention provides a hyperspectral image classification method based on the SACVAEGAN network structure. A self-attention mechanism is added on the basis of CVAEGAN, so that the network can better learn the characteristics of the hyperspectral image. Other structures of the network are also modified: a residual network is added to the sample classifier, which deepens the network structure and raises the accuracy; two branches for extracting spectral features and spatial features are added to the sample classifier so as to better extract the features of the hyperspectral image; and a potential vector classifier module is added to the network, so that the potential vectors generated by SACVAEGAN are more accurate and a more accurate label can be given to the randomly generated potential vectors. The experimental results show that the method achieves higher classification accuracy than the prior art.

Claims (4)

1. A SACVAEGAN-based hyperspectral image classification method, characterized by comprising the following steps:
(1) preprocessing the hyperspectral image data;
(2) constructing a network model;
after the data preprocessing, the network model is constructed; the network model to be trained consists of five parts, namely a conditional variational autoencoder (divided into an encoder and a decoder), a discriminator, a sample classifier and a potential vector classifier;
(3) training the network;
after the data and the model have been prepared, training begins; the training process is divided into four parts, namely training the conditional variational autoencoder, the discriminator, the sample classifier and the potential vector classifier;
(4) classifying the hyperspectral images;
testing is carried out after the model training is finished; the test result is compared with the true values to obtain the classification result, and the accuracy is calculated.
2. The SACVAEGAN-based hyperspectral image classification method according to claim 1, characterized in that: in step (1), (1a) the edges of the original hyperspectral data are first zero-padded, so that a data block with a window size of patchsize × patchsize centered on every point can be obtained, wherein patchsize is 28 for the Indian Pines dataset and the Salinas dataset and 24 for the PaviaU dataset;
(1b) K points are randomly selected as training labels, wherein 500 points are used as training labels for the Indian Pines and PaviaU datasets and 200 points are used as training labels for the Salinas dataset, and the remaining points are used as test labels;
(1c) a sample set of the hyperspectral image is obtained: taking the K randomly selected training labels as centers, the training data are cut out with a window size of patchsize × patchsize, and the remaining points are likewise cut out with a window size of patchsize × patchsize as the test data.
3. The SACVAEGAN-based hyperspectral image classification method according to claim 1, characterized in that: in step (2), (2a) the conditional variational autoencoder is divided into an encoder and a decoder; the encoder is mainly used for generating the potential vectors corresponding to the real hyperspectral data, and combines a self-attention mechanism with spectral normalization; the decoder is mainly used for generating the corresponding virtual hyperspectral data from the potential vectors, and combines a self-attention mechanism with spectral normalization;
(2b) the discriminator is mainly used for discriminating whether the input hyperspectral data are real or virtual, and combines an attention mechanism with spectral normalization;
(2c) the sample classifier is mainly used for classifying the input hyperspectral data; it consists of two branches that acquire the spatial features and the spectral features of the hyperspectral image, and extracts features in combination with a residual network;
(2d) the potential vector classifier is mainly used for classifying the randomly generated potential vectors, thereby assigning a category to the virtual hyperspectral data generated by the generator from those potential vectors for use in the subsequent operations.
4. The SACVAEGAN-based hyperspectral image classification method according to claim 1, characterized in that: in step (3), (3a) the discriminator is trained first; training of the discriminator is divided into three steps, namely inputting into the discriminator for training the real hyperspectral data, the virtual hyperspectral data generated by the decoder from the potential vectors produced by the encoder of the conditional variational autoencoder, and the virtual hyperspectral data generated from the randomly generated potential vectors, and calculating the loss function so as to optimize the parameters of the discriminator;
(3b) training the conditional variational autoencoder; the training is divided into five steps: the real hyperspectral data are input into the encoder to generate the corresponding potential vectors, and the corresponding loss function is calculated; the virtual hyperspectral data generated from the potential vectors produced by the encoder, together with the real hyperspectral data, are input into the discriminator, and the corresponding loss function is calculated; the virtual hyperspectral data generated from the potential vectors produced by the encoder, together with the real hyperspectral data, are input into the sample classifier, and the corresponding loss function is calculated; the randomly generated potential vectors and the corresponding virtual hyperspectral data are input into the potential vector classifier and the sample classifier, respectively, and the corresponding loss functions are calculated; and the potential vectors generated by the encoder are input into the potential vector classifier, and the corresponding loss function is calculated;
(3c) training the potential vector classifier; the potential vectors corresponding to the real hyperspectral data are input into the potential vector classifier for classification, and the loss function is calculated;
(3d) training the sample classifier; training of the sample classifier is divided into three steps: the real hyperspectral data are input into the sample classifier, and the classification loss function is calculated; the real hyperspectral data and the virtual hyperspectral data are each input into the sample classifier, and the corresponding loss function is calculated; and the randomly generated potential vectors are input into the potential vector classifier for classification, the virtual hyperspectral data generated from the randomly generated potential vectors are input into the sample classifier for classification, and the corresponding loss function is calculated from the classification results.
CN202011569729.2A 2020-12-26 2020-12-26 SACVAEGAN-based hyperspectral image classification method Pending CN112633386A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011569729.2A CN112633386A (en) 2020-12-26 2020-12-26 SACVAEGAN-based hyperspectral image classification method

Publications (1)

Publication Number Publication Date
CN112633386A true CN112633386A (en) 2021-04-09

Family

ID=75325275

Country Status (1)

Country Link
CN (1) CN112633386A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110580501A (en) * 2019-08-20 2019-12-17 天津大学 Zero sample image classification method based on variational self-coding countermeasure network
CN111008652A (en) * 2019-11-15 2020-04-14 河海大学 Hyper-spectral remote sensing image classification method based on GAN
WO2020172838A1 (en) * 2019-02-26 2020-09-03 长沙理工大学 Image classification method for improvement of auxiliary classifier gan
US20200356810A1 (en) * 2019-05-06 2020-11-12 Agora Lab, Inc. Effective Structure Keeping for Generative Adversarial Networks for Single Image Super Resolution

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239938A (en) * 2021-05-11 2021-08-10 中国人民解放军火箭军工程大学 Hyperspectral classification method and system based on graph structure
CN113239938B (en) * 2021-05-11 2024-01-09 中国人民解放军火箭军工程大学 Hyperspectral classification method and hyperspectral classification system based on graph structure
CN114120041A (en) * 2021-11-29 2022-03-01 暨南大学 Small sample classification method based on double-pair anti-variation self-encoder
CN114107935A (en) * 2021-11-29 2022-03-01 重庆忽米网络科技有限公司 Automatic PVD (physical vapor deposition) coating thickness adjusting method based on AI (Artificial Intelligence) algorithm
CN114120041B (en) * 2021-11-29 2024-05-17 暨南大学 Small sample classification method based on double-countermeasure variable self-encoder
CN114492526A (en) * 2022-01-25 2022-05-13 太原科技大学 SPEC-Net network architecture and identification method for multi-satellite spectrum automatic identification
CN114492526B (en) * 2022-01-25 2022-11-22 太原科技大学 SPEC-Net network architecture and identification method for multi-satellite spectrum automatic identification
CN117034020A (en) * 2023-10-09 2023-11-10 贵州大学 Unmanned aerial vehicle sensor zero sample fault detection method based on CVAE-GAN model
CN117034020B (en) * 2023-10-09 2024-01-09 贵州大学 Unmanned aerial vehicle sensor zero sample fault detection method based on CVAE-GAN model
CN117527900A (en) * 2024-01-08 2024-02-06 季华实验室 Data processing method, device, equipment and storage medium
CN117527900B (en) * 2024-01-08 2024-05-07 季华实验室 Data processing method, device, equipment and storage medium
CN117692346A (en) * 2024-01-31 2024-03-12 浙商银行股份有限公司 Message blocking prediction method and device based on spectrum regularization variation self-encoder

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination