CN112633386A - SACVAEGAN-based hyperspectral image classification method - Google Patents
- Publication number
- CN112633386A (application number CN202011569729.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- hyperspectral
- training
- classifier
- potential
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a SACVAEGAN-based hyperspectral image classification method. A latent vector classifier module is added to CVAEGAN to classify the latent vectors corresponding to hyperspectral data, so that it is trained cooperatively with the decoder and the sample classifier. This addresses the difficulty of matching randomly generated latent vectors to classes in a GAN and further improves accuracy. A self-attention mechanism and spectral normalization are applied to the decoder, the encoder and the discriminator: self-attention lets the network extract the features of hyperspectral data more effectively, and spectral normalization improves the stability of the model. The sample classifier extracts features from both the spatial and the spectral perspective and incorporates a residual network structure, which improves classification performance.
Description
Technical Field
The invention relates to the field of hyperspectral image classification, in particular to a method for classifying hyperspectral images.
Background
With the continuous development of remote sensing technology, hyperspectral images (HSI) have brought significant breakthroughs in the field of earth observation. Unlike traditional three-channel color images, an HSI captures hundreds of spectral bands simultaneously and therefore carries very rich spectral information. Hyperspectral images are consequently widely used in satellite remote sensing, crop observation, mineral exploration and similar fields.
In hyperspectral data processing, classification has always been one of the most active research topics. Classifiers for hyperspectral data generally fall into two groups: spectral classifiers and spectral-spatial classifiers. Conventional hyperspectral image classification algorithms include support vector machines (SVM), k-nearest neighbors (KNN), maximum likelihood, neural networks and logistic regression. However, because the same material can show spectral differences and different materials can have similar spectral characteristics, it is difficult to distinguish classes accurately from spectral information alone. To address this, some scholars have proposed combining spectral information with spatial information to improve classification performance.
Because traditional classification methods depend heavily on expert experience and manual parameter setting, deep learning methods have been widely applied to hyperspectral image classification in recent years, and convolutional neural networks (CNN) in particular have received great attention. For example, W. Hu proposed a five-layer deep CNN model that extracts the spectral features of HSI and obtains better classification performance. Li proposed a pixel-pair method as a deep spectral classifier; it achieves good results when training data is scarce, but it mainly performs convolution in the spectral domain and ignores spatial details. Ying Li proposed extracting spectral-spatial features with 3D convolution, which takes the various characteristics of hyperspectral data fully into account and obtains good classification results.
Although deep-learning-based approaches have made great progress in HSI classification, they still face a problem: too little labeled data. For this reason, some scholars have proposed using a GAN to mitigate the shortage of hyperspectral data. A GAN typically comprises a generative model G and a discriminative model D, which are trained adversarially. G tries to generate pseudo-samples that are as realistic as possible from a random vector z, while D tries to distinguish real samples from the pseudo-samples generated by G. The two compete until D can no longer identify the pseudo-samples. Using the GAN-generated samples appropriately as virtual samples can improve classification accuracy and mitigate the shortage of hyperspectral training data.
Disclosure of Invention
The invention aims to solve the above problems and provides a hyperspectral image classification method based on SACVAEGAN (Self-Attention-based Conditional Variational Autoencoder Generative Adversarial Network). The invention adopts CVAEGAN (conditional variational autoencoder generative adversarial network) as the basic structure to address the shortage of hyperspectral training data: the decoder in CVAEGAN generates virtual hyperspectral samples that augment the training data and thereby improve classification accuracy. In addition, a latent vector classifier module is added to CVAEGAN to classify the latent vectors corresponding to hyperspectral data, so that it is trained cooperatively with the decoder and the sample classifier. This addresses the difficulty of matching randomly generated latent vectors to classes in a GAN and improves accuracy. A self-attention mechanism and spectral normalization are applied to the decoder, the encoder and the discriminator: self-attention lets the network extract the features of hyperspectral data more effectively, and spectral normalization improves the stability of the model. The sample classifier extracts features from both the spatial and the spectral perspective and incorporates a residual network structure, which improves classification performance.
To achieve this purpose, the technical scheme and experimental steps adopted by the invention are as follows:
(1) Preprocess the hyperspectral image data.
(1a) Zero-pad the edges of the original hyperspectral data so that a window of size patchsize × patchsize can be taken centered on every pixel. For the IndianPines and Salinas datasets patchsize is 28; for the PaviaU dataset patchsize is 24.
(1b) Randomly select K points as training labels: 500 points for the IndianPines and PaviaU datasets and 200 points for the Salinas dataset. The remaining points are used as test labels.
(1c) Obtain the sample set. Taking the K randomly selected training labels as centers, cut windows of size patchsize × patchsize as training data; cut windows of the same size centered on the remaining points as test data.
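Steps (1a) to (1c) can be sketched in NumPy as follows. This is a minimal illustration, not the patented implementation: the function name `extract_patches`, the `(H, W, B)` array layout, the convention that label 0 marks unlabeled pixels, and placing the center pixel at offset `patch_size // 2` for even window sizes are all assumptions.

```python
import numpy as np

def extract_patches(cube, labels, patch_size, num_train, seed=0):
    """Zero-pad an (H, W, B) cube and cut one window per labeled pixel.

    Returns ((train_x, train_y), (test_x, test_y)).
    """
    before = patch_size // 2            # rows/cols added before the image
    after = patch_size - before - 1     # rows/cols added after the image
    padded = np.pad(cube, ((before, after), (before, after), (0, 0)),
                    mode="constant", constant_values=0)

    # Assumption: label 0 marks unlabeled background pixels.
    rows, cols = np.nonzero(labels)
    order = np.random.default_rng(seed).permutation(len(rows))

    def cut(indices):
        # In padded coordinates the window for pixel (r, c) starts at (r, c).
        windows = [padded[r:r + patch_size, c:c + patch_size]
                   for r, c in zip(rows[indices], cols[indices])]
        return np.stack(windows), labels[rows[indices], cols[indices]]

    # The first num_train shuffled pixels form the training set, the rest the test set.
    return cut(order[:num_train]), cut(order[num_train:])
```

With the settings given in the text, a call would use `patch_size=28, num_train=500` for IndianPines, `patch_size=24, num_train=500` for PaviaU, and `patch_size=28, num_train=200` for Salinas.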
(2) Building a network model
After data preprocessing, the network model is constructed. The network model consists of five parts: the encoder and the decoder of the conditional variational autoencoder, the discriminator, the sample classifier, and the latent vector classifier.
(2a) The conditional variational autoencoder is divided into an encoder and a decoder (i.e., the generator). The encoder generates the latent vectors corresponding to real hyperspectral data; the decoder generates virtual hyperspectral data from latent vectors. Both combine a self-attention mechanism with spectral normalization.
(2b) The discriminator judges whether the input hyperspectral data is real or virtual; it also combines a self-attention mechanism with spectral normalization.
(2c) The sample classifier classifies the input hyperspectral data. It consists of two branches, which acquire the spatial features and the spectral features of the hyperspectral image respectively, and it extracts features with a residual network.
(2d) The latent vector classifier classifies the randomly generated latent vectors, thereby assigning a class to the virtual hyperspectral data that the generator produces from those latent vectors, for use in the subsequent steps.
(3) Training network
Training starts once the data and the model have been prepared. The training process is divided into four parts: training the conditional variational autoencoder, the discriminator, the sample classifier, and the latent vector classifier.
(3a) First train the discriminator. Three kinds of input are fed into the discriminator: real hyperspectral data, virtual hyperspectral data that the decoder generates from the latent vectors produced by the encoder of the conditional variational autoencoder, and virtual hyperspectral data generated from randomly generated latent vectors. The loss function is then calculated to optimize the discriminator parameters.
(3b) Train the conditional variational autoencoder in five steps. First, feed the real hyperspectral data into the encoder to generate the corresponding latent vectors and calculate the corresponding loss function. Second, feed the virtual hyperspectral data generated from the encoder's latent vectors, together with the real hyperspectral data, into the discriminator and calculate the corresponding loss function. Third, feed the same virtual and real hyperspectral data into the sample classifier and calculate the corresponding loss function. Fourth, feed the randomly generated latent vectors into the latent vector classifier and the corresponding virtual hyperspectral data into the sample classifier, and calculate the corresponding loss functions. Fifth, feed the latent vectors generated by the encoder into the latent vector classifier and calculate the corresponding loss function.
(3c) Train the latent vector classifier. The latent vectors corresponding to real hyperspectral data are fed into the latent vector classifier for classification, and the loss function is calculated.
(3d) Train the sample classifier in three steps. First, feed the real hyperspectral data into the sample classifier and calculate the classification loss function. Second, feed the real and the virtual hyperspectral data into the sample classifier and calculate the corresponding loss function. Third, feed the randomly generated latent vectors into the latent vector classifier for classification, feed the virtual hyperspectral data generated from them into the sample classifier for classification, and calculate the corresponding loss function from the two classification results.
(4) Hyperspectral image classification
Testing is carried out after model training is complete. The test predictions are compared with the ground truth to obtain the classification result and calculate the accuracy. The latent vector classifier added in the invention addresses the correspondence between randomly generated latent vectors and classes in the GAN model, and experimental results show that the accuracy of the model in classifying hyperspectral data improves markedly once it is added. The self-attention mechanism and spectral normalization added to the network model improve the stability and training speed of the model. Extracting features from both the spatial and the spectral perspective in the sample classifier and adding a residual network structure improve the accuracy with which the sample classifier classifies hyperspectral data.
Drawings
FIG. 1 is a flow chart of the present invention.
Fig. 2 is an overall structural view of the present invention.
Fig. 3 is a block diagram of a sample classifier.
Fig. 4 is a structural diagram of the conditional variational autoencoder, in which the upper part is the encoder and the lower part is the decoder.
Fig. 5 is a structural diagram of the discriminator.
Fig. 6 is a structural diagram of the latent vector classifier.
Fig. 7 shows the hyperspectral images used in the present invention: (a) IndianPines, (b) PaviaU, (c) Salinas.
Fig. 8 shows the classification results on the IndianPines hyperspectral image: (a) SVM, (b) 2D-CNN, (c) 3D-CNN, (d) DCGAN, (e) the present invention.
Fig. 9 shows the classification results on the PaviaU hyperspectral image: (a) SVM, (b) 2D-CNN, (c) 3D-CNN, (d) DCGAN, (e) the present invention.
Fig. 10 shows the classification results on the Salinas hyperspectral image: (a) SVM, (b) 2D-CNN, (c) 3D-CNN, (d) DCGAN, (e) the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to FIG. 1, the experimental procedure of the present invention is as follows:
Step 1, data preprocessing
(1) Edge filling and normalization of data
The edges of the data are zero-padded so that a window of size patchsize × patchsize can be cut centered on every data point. The data is then normalized.
(2) Obtaining a sample set
K data points are randomly selected from the original data, and the patchsize × patchsize data blocks centered on them serve as the training set. The patchsize × patchsize data blocks centered on the remaining data points form the test set.
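The text does not say which normalization is applied in sub-step (1). As one plausible reading, the sketch below rescales every spectral band to the range [0, 1] with min-max normalization. The function name `minmax_normalize`, the per-band choice and the `(H, W, B)` layout are assumptions.

```python
import numpy as np

def minmax_normalize(cube, eps=1e-12):
    """Scale each spectral band of an (H, W, B) cube to the range [0, 1]."""
    band_min = cube.min(axis=(0, 1), keepdims=True)
    band_max = cube.max(axis=(0, 1), keepdims=True)
    # eps guards against division by zero for constant bands.
    return (cube - band_min) / (band_max - band_min + eps)
```

Normalizing before zero padding keeps the padded border at the same value as the band minimum, which is one reason to apply the two operations in that order.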
Step 2, constructing a model
The model mainly comprises four parts: the conditional variational autoencoder, the discriminator, the sample classifier, and the latent vector classifier.
The conditional variational autoencoder consists of an encoder and a decoder. The encoder generates the latent vector corresponding to each input sample, and the decoder generates sample data of the corresponding class from an input latent vector. To extract features more effectively, the encoder uses two branches. One branch applies 2-D convolution to the data block to obtain its spatial features; the other applies 1-D convolution to the central pixel of the data block to obtain its spectral features. The spatial and spectral features are then concatenated for the subsequent operations. Both branches consist of convolutional layers, fully connected layers, batch normalization, spectral normalization, Dropout and a self-attention mechanism. All convolutional layers except the last are spectrally normalized to satisfy the 1-Lipschitz constraint. The activation function of the convolutional layers is ReLU. The decoder has a similar structure but a single branch. It consists of deconvolution layers, fully connected layers, batch normalization, spectral normalization, Dropout and a self-attention mechanism, and again all of its convolutional layers except the last are spectrally normalized to satisfy the 1-Lipschitz constraint. Batch normalization normalizes the data of each training mini-batch, which allows a higher learning rate and accelerates convergence. Dropout helps prevent overfitting when training samples are few. Spectral normalization reduces the likelihood of mode collapse in the GAN and improves its stability. The self-attention mechanism allows feature information to be extracted more effectively. After the convolutional and self-attention layers, the encoder flattens the features and feeds them into a fully connected layer with a Sigmoid activation. After the deconvolution and self-attention layers, the decoder flattens the features, feeds them into a fully connected layer, and finally applies a Tanh activation.
The main function of the discriminator is to judge whether the input sample data is real or virtual. Its structure consists of convolutional layers, fully connected layers, batch normalization, spectral normalization, Dropout and a self-attention mechanism. All convolutional layers except the last are spectrally normalized to satisfy the 1-Lipschitz constraint. The activation function of the convolutional layers is LeakyReLU. Batch normalization normalizes the data of each training mini-batch, which allows a higher learning rate and accelerates convergence. Dropout helps prevent overfitting when training samples are few. Spectral normalization reduces the likelihood of mode collapse in the GAN and improves its stability. The self-attention mechanism allows feature information to be extracted more effectively. After the convolutional and self-attention layers, the discriminator flattens the features, feeds them into a fully connected layer, and finally applies a Sigmoid activation.
The main function of the sample classifier is to classify the input sample data. Its structure has two branches. One branch applies 2-D convolution to the data block to obtain its spatial features; the other applies 1-D convolution to the central pixel of the data block to obtain its spectral features. The spatial and spectral features are then concatenated for the subsequent operations. Both branches consist of convolutional layers, fully connected layers, batch normalization, Dropout and residual network structures. The activation function of the convolutional layers is LeakyReLU. Batch normalization normalizes the data of each training mini-batch, which allows a higher learning rate and accelerates convergence. Dropout helps prevent overfitting when training samples are few. The residual network structure makes it possible to deepen the network and improves its accuracy. After the convolutional and residual layers, the sample classifier flattens the features, feeds them into a fully connected layer, and finally applies a LogSoftmax activation.
The main role of the latent vector classifier is to classify latent vectors. Its structure consists of convolutional layers, fully connected layers and Dropout. The activation function of the convolutional layers is ReLU. Batch normalization normalizes the data of each training mini-batch, which allows a higher learning rate and accelerates convergence. Dropout helps prevent overfitting when training samples are few. In the latent vector classifier, the latent vector is first mapped to a fixed size by a fully connected layer, then reshaped and fed into the convolutional layers, then flattened and fed into a fully connected layer, and finally passed through a LogSoftmax activation.
Batch normalization (BN) normalizes the activations of the previous layer for each batch; that is, it transforms the mean of the previous layer's activations to 0 and their standard deviation to 1. Assume that the batch size is n and that $x_i$ $(i = 1, \dots, n)$ are the activation values from the previous layer. Batch normalization is then calculated as:
$$y_i = \gamma \, \frac{x_i - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} + \beta \qquad (1)$$
where $y_i$ is the output for sample $x_i$ after batch normalization, $\mathrm{E}[x]$ and $\mathrm{Var}[x]$ are the expectation and variance of $x$ over the batch, $\epsilon$ is a small constant for numerical stability, and $\gamma$ and $\beta$ are learnable scale and shift parameters.
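A minimal NumPy sketch of batch normalization for a batch of feature vectors may make the computation concrete. The function name `batch_norm` and the small constant `eps` (added for numerical stability) are assumptions; during training, `gamma` and `beta` would be learned rather than passed in by hand.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize an (n, features) batch to zero mean and unit variance per
    feature, then apply the scale gamma and the shift beta."""
    mean = x.mean(axis=0)   # E[x] over the batch
    var = x.var(axis=0)     # Var[x] over the batch
    return gamma * (x - mean) / np.sqrt(var + eps) + beta
```

With `gamma=1` and `beta=0`, each output feature has a mean of approximately 0 and a standard deviation of approximately 1 over the batch.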
Spectral normalization makes the parameters of the convolution operations satisfy the 1-Lipschitz constraint, which makes the network more stable and reduces mode collapse. It is implemented by dividing the parameters of each layer by the spectral norm of that layer's parameter matrix, so that the 1-Lipschitz constraint is met.
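The division by the spectral norm can be sketched as follows. The spectral norm (the largest singular value) is estimated here by power iteration, which is a common choice but is not stated in the text; the function name `spectral_normalize`, the iteration count and the reshaping of a convolution kernel into a matrix with one row per output channel are assumptions.

```python
import numpy as np

def spectral_normalize(weight, n_iter=50, seed=0):
    """Divide a weight tensor by an estimate of its largest singular value."""
    w = weight.reshape(weight.shape[0], -1)   # (out_channels, everything else)
    u = np.random.default_rng(seed).standard_normal(w.shape[0])
    for _ in range(n_iter):
        # Alternate between the right and left singular vector estimates.
        v = w.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = w @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ w @ v                         # estimated spectral norm
    return weight / sigma
```

After normalization the largest singular value of the reshaped weight is approximately 1, so the corresponding linear map is approximately 1-Lipschitz.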
The self-attention mechanism improves the quality of the generated images and thereby the classification performance of the classifier. Most GAN-based image generation models are built from convolutional layers. A convolutional layer processes information in a local neighborhood, so modeling long-range dependencies in an image with convolutional layers alone is computationally inefficient. A self-attention mechanism is therefore introduced, which lets the generator and the discriminator model the relationships between widely separated spatial regions efficiently.
Suppose the image features from the previous hidden layer, $x \in \mathbb{R}^{C \times N}$, are transformed into two feature spaces $f$ and $g$ to calculate the attention, where $f(x) = W_f x$ and $g(x) = W_g x$.
$\beta_{j,i}$ denotes the degree to which the model attends to region $i$ when synthesizing the image content of region $j$, i.e., their correlation. The output of the attention layer is $o = (o_1, o_2, \dots, o_j, \dots, o_N) \in \mathbb{R}^{C \times N}$, where $C$ is the number of channels and $N$ is the number of feature locations in the previous hidden layer.
Finally, the output of the attention layer is multiplied by a scale parameter and added to the input feature map, so the final output is:
$$y_i = \gamma o_i + x_i \qquad (4)$$
$\gamma$ is a learnable scalar initialized to 0, so that the network first relies on local neighborhood information and then gradually learns to assign weight to more distant features.
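The attention layer can be sketched in NumPy as below. The formulas for the attention weights and for the layer output are not reproduced in the text, so this sketch assumes the usual self-attention form: scores $s_{ij} = f(x_i)^T g(x_j)$, a softmax over $i$ to obtain the weights, and an output that is the weighted sum of a third projection $h(x) = W_h x$. The projection `w_h` and all function names are assumptions.

```python
import numpy as np

def softmax(s, axis):
    s = s - s.max(axis=axis, keepdims=True)   # subtract the max for stability
    e = np.exp(s)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_f, w_g, w_h, gamma):
    """x: (C, N) features; w_f, w_g: (C_k, C); w_h: (C, C).

    Returns the layer output y = gamma * o + x and the attention map, where
    beta[i, j] is the weight of location i when synthesizing location j.
    """
    f, g, h = w_f @ x, w_g @ x, w_h @ x       # (C_k, N), (C_k, N), (C, N)
    s = f.T @ g                               # s[i, j] = f(x_i)^T g(x_j)
    beta = softmax(s, axis=0)                 # each column sums to 1
    o = h @ beta                              # o[:, j] = sum_i beta[i, j] * h(x_i)
    return gamma * o + x, beta
```

Because `gamma` starts at 0, the layer initially returns its input unchanged, which matches the initialization described above.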
The residual network consists of a series of residual blocks, each of which can be expressed as:
$$x_{l+1} = x_l + F(x_l, W_l) \qquad (5)$$
where $x_l$ is the direct (identity) mapping, which passes the data of the previous layer through unchanged, and $F(x_l, W_l)$ is the residual part, generally composed of two or three convolutions. Residual networks are easy to optimize, and their depth can be increased to improve accuracy.
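Equation (5) can be illustrated with a small NumPy sketch. For brevity the residual part F uses two dense layers with a ReLU in between as a stand-in for the two or three convolutions mentioned in the text; this substitution and the function name `residual_block` are assumptions.

```python
import numpy as np

def residual_block(x, w1, w2):
    """Compute x_{l+1} = x_l + F(x_l, W_l) with F = w2 @ relu(w1 @ x)."""
    hidden = np.maximum(w1 @ x, 0.0)   # first layer of F followed by ReLU
    return x + w2 @ hidden             # skip connection adds the input back
```

If the weights of F are zero the block reduces to the identity, which is why stacking such blocks does not make the deeper network harder to optimize than the shallower one.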
Step 3, training the network
The training of the conditional variational autoencoder is divided into four parts: training the encoder of the conditional variational autoencoder, training the decoder with the discriminator, training the decoder with the sample classifier, and training the decoder with the latent vector classifier. The loss function for training the conditional variational autoencoder is:
$$L_G = L_E + L_{GD} + L_{GC} + L_{aux\_real} \qquad (6)$$
The four terms are defined by equations (7) to (10), which are not reproduced in this text. In these equations, $v$ and $\xi$ are the mean and variance of the latent vector output by the encoder network, $x$ is the real data, and $x'$ is the generated virtual data. $P_r$ and $P_z$ are the distributions of the real data and of the latent variables, respectively, and $f_D$ and $f_C$ are the intermediate-layer outputs of the discriminator and of the classifier, respectively. In the last term, $x$ denotes real data or virtual data generated from the latent vectors corresponding to real data, $O_C$ is the final output of the sample classifier, and $O_{aux}$ is the output of the latent vector classifier.
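The exact form of the encoder loss is not available in this text. Purely as an illustration of how an encoder that outputs a mean v and a variance ξ is commonly trained in a variational autoencoder, the sketch below samples a latent vector with the reparameterization trick and computes the KL divergence to a standard normal prior. Treating this as part of $L_E$ is an assumption, as are the function names.

```python
import numpy as np

def sample_latent(mean, var, rng):
    """Reparameterization trick: z = mean + sqrt(var) * noise."""
    return mean + np.sqrt(var) * rng.standard_normal(mean.shape)

def kl_to_standard_normal(mean, var):
    """KL( N(mean, diag(var)) || N(0, I) ), one value per sample."""
    return 0.5 * np.sum(mean ** 2 + var - np.log(var) - 1.0, axis=-1)
```

The divergence is zero exactly when the encoder outputs a zero mean and unit variance, so the term pulls the encoded latent vectors toward the distribution from which random latent vectors are drawn.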
The training of the discriminator is divided into three parts. The real data, the virtual samples generated from the latent vectors corresponding to the real data, and the virtual samples generated from randomly generated latent vectors are each fed into the discriminator to calculate the corresponding loss function. The loss function for training the discriminator is:
$$L_D = L_{D\_real} + L_{D\_fake} + L_{D\_fake\_random} \qquad (11)$$
The three terms are defined by equations (12) and (13), which are not reproduced in this text. In these equations, $x$ denotes the real data and $P_r$ the real data distribution; $z$ denotes the latent vector corresponding to the real data and $P_z$ its distribution; $z_{random}$ denotes a randomly generated latent vector and $P_{z\_random}$ its distribution.
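Since the individual discriminator terms are not available in this text, the sketch below uses the standard binary cross-entropy GAN discriminator loss as a stand-in for the three terms of equation (11). It assumes that the discriminator's Sigmoid output is the probability that its input is real; the function and argument names are hypothetical.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, d_fake_random, eps=1e-12):
    """Sum of three cross-entropy terms on discriminator outputs in (0, 1).

    d_real:        outputs for real samples
    d_fake:        outputs for samples decoded from encoder latent vectors
    d_fake_random: outputs for samples decoded from random latent vectors
    """
    l_real = -np.mean(np.log(d_real + eps))
    l_fake = -np.mean(np.log(1.0 - d_fake + eps))
    l_fake_random = -np.mean(np.log(1.0 - d_fake_random + eps))
    return l_real + l_fake + l_fake_random
```

A discriminator that outputs 0.5 everywhere gives a loss of 3 log 2, and the loss falls toward 0 as real samples are scored near 1 and both kinds of virtual samples near 0.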
The training of the sample classifier is divided into three parts. The real data, the virtual samples generated from the latent vectors corresponding to the real data, and the virtual samples generated from randomly generated latent vectors are each input into the sample classifier, and the corresponding loss functions are calculated. The loss function for training the sample classifier is:
L_C = L_{C_real} + L_{C_fake} + L_{C_fake_random}    (14)
wherein
L_{C_real} = -E[log P(c | x_r)]    (15)
L_{C_fake} = ||f(x_r) - f(x_g)||    (16)
L_{C_fake_random} = -E[log P(c | x_{g_random})]    (17)
where x_r denotes the real data, x_g denotes the virtual data generated by the decoder from the latent vectors corresponding to the real data, and x_{g_random} denotes the virtual data generated by the decoder from randomly generated latent vectors. c denotes the class, and f denotes the intermediate-layer output of the sample classifier.
The training of the latent vector classifier has only one part: the latent vectors corresponding to the real data are input into the latent vector classifier for classification, and the corresponding loss function is calculated. The loss function for training the latent vector classifier is:
L_aux = -E[log P(c | z)]    (18)
where z is the latent vector corresponding to the real data.
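The classification terms (15), (17) and (18) share the form of a negative log-likelihood, while (16) is a distance between intermediate-layer features. The following is a minimal sketch of both forms in plain Python. The function names are hypothetical, the batch average is used as the estimate of the expectation, and the Euclidean norm in `feature_matching` is an assumption, since the text does not state which norm is meant or whether the features are averaged over a batch first.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Batch estimate of -E[log P(c | input)], the form of (15), (17) and (18)."""
    return -sum(math.log(p[c]) for p, c in zip(probs, labels)) / len(labels)


def feature_matching(feat_real: Sequence[float], feat_fake: Sequence[float]) -> float:
    """||f(x_r) - f(x_g)|| as in (16); the Euclidean norm is an assumption."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_real, feat_fake)))


# Hypothetical batch of two samples with three classes.
probs_real = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]  # classifier output probabilities
labels = [0, 1]                                   # true classes c
loss_c_real = cross_entropy(probs_real, labels)
loss_c_fake = feature_matching([1.0, 2.0, 2.0], [1.0, 0.0, 2.0])
print(loss_c_real, loss_c_fake)
```

The same `cross_entropy` function would serve for (18) when `probs` holds the outputs of the classifier that operates on z instead of on image data.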
In addition, the classifiers, the discriminator and the generator are all trained with the Adam optimization algorithm with a batch size of 32 and a weight decay of 0.0005. The learning rate of the generator is 0.0001, the learning rate of the discriminator is 0.0002, and the learning rate of both classifiers is 0.0001; at iterations 20, 40, 80 and 100, the learning rate is multiplied by 0.7. The number of iterations is 500. For the Indian Pines dataset, the data block size is 28 × 28 and the training set size is 500. For the PaviaU dataset, the data block size is 24 × 24 and the training set size is 500. For the Salinas dataset, the data block size is 28 × 28 and the training set size is 200. The latent variable z has a size of 100.
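The stepwise learning-rate decay described above can be sketched as a small function. This is an illustration only: the names `MILESTONES`, `DECAY` and `learning_rate` are hypothetical, and the sketch assumes that the reduced rate takes effect from the milestone iteration onward, which the text does not specify.

```python
from bisect import bisect_right

MILESTONES = (20, 40, 80, 100)  # iterations at which the rate is reduced
DECAY = 0.7                     # multiplicative factor applied at each milestone


def learning_rate(base_lr: float, iteration: int) -> float:
    """Learning rate in effect at `iteration` under the stepwise decay."""
    passed = bisect_right(MILESTONES, iteration)  # milestones already reached
    return base_lr * DECAY ** passed


# Discriminator rate (base 0.0002) at a few iterations.
for it in (0, 20, 100, 499):
    print(it, learning_rate(0.0002, it))
```

In PyTorch, a schedule of this shape is usually expressed with `torch.optim.lr_scheduler.MultiStepLR` using `milestones=[20, 40, 80, 100]` and `gamma=0.7`; the text does not say which scheduler was actually used.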
Step 4, classifying the hyperspectral images
The output of the classifier on the test data is compared with the true values to obtain the classification result, and the accuracy is calculated.
Step 5, outputting the classification image result.
Experiments and analyses
1. Conditions of the experiment
The hardware test platform of the invention is: an Intel(R) Core(TM) i5-9300H CPU with a main frequency of 2.40 GHz, 16 GB of memory and a GTX 1660 Ti graphics card. The software platform is the Windows 10 operating system and PyCharm 2019. The programming language is Python, and the network structure is implemented using the PyTorch deep learning framework.
2. Experimental data
The performance evaluation of the present invention mainly uses three datasets: the Indian Pines dataset (Indiana, USA), the Pavia University dataset (Italy), and the Salinas dataset (Salinas Valley, California, USA).
The Indian Pines dataset was collected in 1992 by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over the Indian Pines test site in northwestern Indiana. The image has 220 original bands; after removing the unusable bands, 200 bands remain. The image contains 16 classes of samples. Table 1 shows the number of samples of each class in the Indian Pines image, and the number of each class in the training set and the test set used by the present invention on this dataset.
TABLE 1
The Pavia University dataset is part of the hyperspectral data acquired over Pavia University, Italy, in 2003 by the German airborne Reflective Optics System Imaging Spectrometer (ROSIS). The spectral imager continuously images 115 bands within the wavelength range of 0.43-0.86 μm, and the spatial resolution of the image is 1.3 m. Of these, 12 bands are rejected due to noise, so the image composed of the remaining 103 spectral bands is generally used. The data has a size of 610 × 340 and contains 9 classes of samples. Table 2 shows the number of samples of each class in the Pavia University image, and the number of each class in the training set and the test set used by the present invention on this dataset.
TABLE 2
The Salinas dataset is an image of the Salinas Valley, California, USA, acquired by the AVIRIS imaging spectrometer. The spatial resolution is 3.7 m. The image has 224 original bands, and 204 bands remain after removing several invalid bands. The size of the image is 512 × 217, and it contains 16 classes of samples. Table 3 shows the number of samples of each class in the Salinas image, and the number of each class in the training set and the test set used by the present invention on this dataset.
TABLE 3
3. Performance comparison
The four prior-art classification methods used for comparison are as follows:
(1) The hyperspectral images are classified by using an SVM (support vector machine) based on a Radial Basis Function (RBF) kernel.
(2) The hyperspectral image classification method proposed by Rotewara et al. in Hyperspectral remote sensing image classification based on a deep convolutional neural network, referred to as the convolutional neural network classification method. The dimensionality of the hyperspectral data is first reduced by PCA (principal component analysis), and 2D convolution is then used to classify the hyperspectral image.
(3) The hyperspectral image classification method proposed by Amina Ben Hamida et al. in 3-D Deep Learning for Remote Sensing Image Classification, referred to as the 3D convolutional network classification method. The hyperspectral images are classified by using 3D convolution.
(4) The hyperspectral image classification method proposed by Lin Zhu et al. in Generative Adversarial Networks for Hyperspectral Image Classification, referred to as the generative adversarial network classification method. The hyperspectral images are classified by DCGAN (deep convolutional generative adversarial networks).
In the experiment, the following three indexes were used to evaluate the performance of the present invention:
The first evaluation index is the Overall Accuracy (OA), which represents the proportion of correctly classified samples among all samples; the larger the value, the better the classification.
The second evaluation index is the Average Accuracy (AA), which represents the average of the classification accuracies of the individual classes; the larger the value, the better the classification.
The third evaluation index is the Kappa coefficient, which is computed from the confusion matrix and measures the agreement between the classification result and the true labels beyond what chance would produce; the larger the value, the better the classification.
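The three indexes can all be computed from a confusion matrix. The sketch below uses the standard definitions of OA, AA and Cohen's Kappa; the function name `classification_metrics` and the example matrix are hypothetical, and the code assumes that every class has at least one true sample.

```python
from typing import List, Tuple


def classification_metrics(confusion: List[List[int]]) -> Tuple[float, float, float]:
    """Return (OA, AA, Kappa) for a square confusion matrix.

    confusion[i][j] counts the samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]

    # Overall accuracy: correctly classified samples over all samples.
    oa = sum(confusion[i][i] for i in range(n)) / total
    # Average accuracy: mean of the per-class accuracies.
    aa = sum(confusion[i][i] / row_totals[i] for i in range(n)) / n
    # Kappa: agreement corrected for the agreement expected by chance.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    return oa, aa, kappa


# Hypothetical two-class example.
oa, aa, kappa = classification_metrics([[8, 2], [1, 9]])
print(oa, aa, kappa)
```

For this example the chance agreement is 0.5, so an overall accuracy of 0.85 corresponds to a Kappa of about 0.7; Kappa is lower than OA whenever part of the agreement could have arisen by chance.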
Table 4 shows the classification accuracy of the present invention and of the comparison methods on the Indian Pines hyperspectral image.
Table 5 shows the classification accuracy of the present invention and of the comparison methods on the Pavia University hyperspectral image of Pavia, Italy.
Table 6 shows the classification accuracy of the present invention and of the comparison methods on the Salinas Valley hyperspectral image, California, USA.
TABLE 4
TABLE 5
TABLE 6
As can be seen from Tables 4, 5 and 6, the hyperspectral classification method provided by the invention achieves a better classification effect than the other methods on the same hyperspectral dataset. The classification performance of the network is about 3%, 9% and 2% better than that of the best existing method on the Pavia University dataset, the Indian Pines dataset and the Salinas dataset, respectively.
In addition, Figs. 8, 9 and 10 show the classification maps, whose visual classification effect is consistent with the results listed in Tables 4, 5 and 6. The visualized results show that the classification maps obtained by the method have a better effect.
TABLE 7
Table 7 is a time comparison of training and testing of the present invention with SVM, 2dCNN, 3dCNN and DCGAN.
As can be seen from Table 7, the training time of the present invention is much longer than that of the other methods because of the complicated structure of the model. The test times of the present invention and DCGAN are comparable, and are longer than those of 2dCNN and 3dCNN.
In summary, the invention provides a hyperspectral image classification method based on the SACVAEGAN network structure. An automatic supervision mechanism is added on the basis of CVAEGAN, so that the network can better learn the features of the hyperspectral image. Other structures of the network are also modified: a residual network is added to the sample classifier, which deepens the network structure and raises the accuracy; two branches for extracting spectral features and spatial features are added to the sample classifier so as to better extract the features of the hyperspectral image; and a latent vector classifier module is added to the network, so that the latent vectors generated by SACVAEGAN are more accurate and a more accurate label can be given to randomly generated latent vectors. The experimental results show that the method has higher classification accuracy than the prior art.
Claims (4)
1. A SACVAEGAN-based hyperspectral image classification method, characterized by comprising the following steps:
(1) preprocessing the hyperspectral image data;
(2) constructing a network model;
after the data preprocessing is carried out, a network model is constructed; the network model to be trained consists of five parts, namely the conditional variational autoencoder (comprising an encoder and a decoder), a discriminator, a sample classifier and a latent vector classifier;
(3) training a network;
after the data and the model are prepared, training is started; the training process is divided into four parts, namely training the conditional variational autoencoder, the discriminator, the sample classifier and the latent vector classifier;
(4) classifying the hyperspectral images;
after the model training is finished, testing is performed; the test result is compared with the true values to obtain the classification result, and the accuracy is calculated.
2. The SACVAEGAN-based hyperspectral image classification method according to claim 1, characterized in that: in the step (1), (1a) the edges of the original hyperspectral data are first zero-padded, so that a data block with a window size of patchsize × patchsize can be obtained centered on each point, wherein patchsize is 28 for the Indian Pines and Salinas datasets and 24 for the PaviaU dataset;
(1b) K points are randomly selected as training labels, wherein 500 points are used as training labels for the Indian Pines and PaviaU datasets and 200 points for the Salinas dataset, and the remaining points are used as test labels;
(1c) a sample set of the hyperspectral image is obtained: training data are cut out with a window of size patchsize × patchsize centered on each of the K randomly selected training labels, and test data are cut out in the same way, with a window of size patchsize × patchsize, centered on the remaining points.
3. The SACVAEGAN-based hyperspectral image classification method according to claim 1, characterized in that: in the step (2), (2a) the conditional variational autoencoder is divided into an encoder and a decoder; the encoder is mainly used for generating the latent vectors corresponding to the real hyperspectral data, and combines a self-attention mechanism with spectral normalization; the decoder is mainly used for generating the corresponding virtual hyperspectral data from the latent vectors, and also combines a self-attention mechanism with spectral normalization;
(2b) the discriminator is mainly used for discriminating whether the input hyperspectral data are real or fake, and combines an attention mechanism with spectral normalization;
(2c) the sample classifier is mainly used for classifying the input hyperspectral data, and consists of two branches that acquire the spatial features and the spectral features of the hyperspectral image, combined with a residual network for feature extraction;
(2d) the latent vector classifier is mainly used for classifying randomly generated latent vectors, thereby assigning a class to the virtual hyperspectral data generated by the generator from those latent vectors for use in the subsequent operations.
4. The SACVAEGAN-based hyperspectral image classification method according to claim 1, characterized in that: in the step (3), (3a) the discriminator is trained first; the training of the discriminator is divided into three steps, namely inputting into the discriminator, respectively, the real hyperspectral data, the virtual hyperspectral data generated by the decoder from the latent vectors produced by the encoder of the conditional variational autoencoder, and the virtual hyperspectral data generated from randomly generated latent vectors, and calculating the loss function to optimize the parameters of the discriminator;
(3b) training the conditional variational autoencoder; the training is divided into five steps: the real hyperspectral data are input into the encoder to generate the corresponding latent vectors, and the corresponding loss function is calculated; the virtual hyperspectral data generated from the latent vectors produced by the encoder and the real hyperspectral data are input into the discriminator, and the corresponding loss function is calculated; the virtual hyperspectral data generated from the latent vectors produced by the encoder and the real hyperspectral data are input into the sample classifier, and the corresponding loss function is calculated; the randomly generated latent vectors and the corresponding virtual hyperspectral data are input into the latent vector classifier and the sample classifier, respectively, and the corresponding loss functions are calculated; the latent vectors generated by the encoder are input into the latent vector classifier, and the corresponding loss function is calculated;
(3c) training the latent vector classifier; the latent vectors corresponding to the real hyperspectral data are input into the latent vector classifier for classification, and the loss function is calculated;
(3d) training the sample classifier; the training of the sample classifier is divided into three steps: the real hyperspectral data are input into the sample classifier, and the classification loss function is calculated; the real hyperspectral data and the virtual hyperspectral data are each input into the sample classifier, and the corresponding loss function is calculated; and the randomly generated latent vectors are input into the latent vector classifier for classification, the virtual hyperspectral data generated from the randomly generated latent vectors are input into the sample classifier for classification, and the corresponding loss function is calculated from the classification results.
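The preprocessing recited in steps (1a) to (1c) of claim 2 (zero padding, random selection of K training points, and window extraction) can be sketched in plain Python. This is an illustration under stated assumptions rather than the patented implementation: the names `zero_pad`, `extract_patch` and `split_labels` are hypothetical, the image is a nested list of per-pixel spectral vectors, and for an even patchsize the "center" pixel is assumed to sit at offset patchsize // 2 inside the window.

```python
import random
from typing import List, Sequence, Tuple

Pixel = Sequence[float]      # spectral vector of one pixel
Image = List[List[Pixel]]    # rows x columns grid of pixels
Coord = Tuple[int, int]


def zero_pad(image: Image, margin: int) -> Image:
    """Surround the image with `margin` rows and columns of all-zero pixels."""
    zero = (0.0,) * len(image[0][0])
    width = len(image[0]) + 2 * margin
    padded = [[zero] * width for _ in range(margin)]
    for row in image:
        padded.append([zero] * margin + list(row) + [zero] * margin)
    padded.extend([zero] * width for _ in range(margin))
    return padded


def extract_patch(padded: Image, row: int, col: int, patch_size: int) -> Image:
    """Cut the patch_size x patch_size window for original pixel (row, col).

    Assumes `padded` was built with margin = patch_size // 2.
    """
    return [r[col:col + patch_size] for r in padded[row:row + patch_size]]


def split_labels(coords: Sequence[Coord], k: int, seed: int = 0):
    """Randomly pick k labeled points for training; the rest are for testing."""
    shuffled = list(coords)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:k], shuffled[k:]


# Hypothetical 3 x 3 image with one band and a patch size of 2.
image = [[(float(3 * r + c),) for c in range(3)] for r in range(3)]
patch_size = 2
padded = zero_pad(image, patch_size // 2)
coords = [(r, c) for r in range(3) for c in range(3)]
train_coords, test_coords = split_labels(coords, k=4)
train_patches = [extract_patch(padded, r, c, patch_size) for r, c in train_coords]
```

With the patch sizes given in claim 2, the margin would be 14 for Indian Pines and Salinas and 12 for PaviaU, and K would be 500 or 200 depending on the dataset.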
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011569729.2A CN112633386A (en) | 2020-12-26 | 2020-12-26 | SACVAEGAN-based hyperspectral image classification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112633386A true CN112633386A (en) | 2021-04-09 |
Family
ID=75325275
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011569729.2A Pending CN112633386A (en) | 2020-12-26 | 2020-12-26 | SACVAEGAN-based hyperspectral image classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112633386A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||