CN115496934A - Hyperspectral image classification method based on twin neural network - Google Patents
Hyperspectral image classification method based on twin neural network
- Publication number: CN115496934A (application CN202211002779.1A)
- Authority: CN (China)
- Prior art keywords: sample pair, neural network, classification, remote sensing, sample
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V 10/765: Image or video recognition using classification, e.g. of video objects, using rules for classification or partitioning the feature space
- G06N 3/08: Computing arrangements based on biological models; neural networks; learning methods
- G06V 10/26: Image preprocessing; segmentation of patterns in the image field
- G06V 10/42: Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
- G06V 10/58: Extraction of image or video features relating to hyperspectral data
- G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V 10/82: Image or video recognition using pattern recognition or machine learning, using neural networks
- G06V 20/13: Scenes; terrestrial scenes; satellite images
Abstract
The invention discloses a hyperspectral image classification method based on a twin neural network. A twin neural network framework is built from a lightweight spatial-spectral joint network, and classification is performed on the feature vectors the network extracts from the input data, so the method has end-to-end classification capability. A differential input strategy and a random sampling strategy are introduced at the sample input stage, and a weighted contrastive loss function and an adaptive cross-entropy loss function are designed for the loss function. The method effectively improves the classification efficiency, classification accuracy and generalization of small-sample hyperspectral remote sensing image classification.
Description
Technical Field
The invention belongs to the technical field of remote sensing image classification, and particularly relates to a hyperspectral image classification method based on a twin neural network.
Background
Hyperspectral sensors capture continuous spectral curves over a large number of electromagnetic bands; they have nanoscale spectral resolution and can cover the ultraviolet, visible, near-infrared, mid-infrared and thermal-infrared ranges. A hyperspectral remote sensing image (HSI) is data with a "cube" structure. Unlike traditional remote sensing images, hyperspectral remote sensing offers (1) numerous, narrow bands, (2) fine, continuous spectra, (3) unified image and spectral data, and (4) large data volume. A hyperspectral remote sensing image, which combines spectroscopy with imaging technology, can therefore express information in three modes (spectral space, image space and feature space) and has a strong capability for accurately describing surface objects.
However, as hyperspectral remote sensing technology develops, the characteristics of hyperspectral data, such as high dimensionality, large data volume, information redundancy, nonlinear data structure and insufficient samples, become more pronounced, placing new demands on the robustness, effectiveness and computational efficiency of feature extraction and classification methods. It is therefore necessary to develop new and more effective feature extraction and classification methods to meet the urgent needs of industrial applications and the difficult problem of hyperspectral remote sensing image classification.
In recent years, deep learning-based methods have attracted the attention of many researchers; they have end-to-end classification capability and strong feature extraction capability, and are well suited to hyperspectral remote sensing images. However, deep learning depends heavily on the number and quality of samples, and how to design a deep learning model with strong feature extraction capability under small-sample conditions is an urgent problem in hyperspectral remote sensing image classification. The main approaches of current deep learning methods to the small-sample classification task are semi-supervised learning, transfer learning and contrastive learning.
Semi-supervised learning improves the overall classification capability of the model by exploiting unlabeled samples; however, how to select high-quality unlabeled samples and features and how to control the number of unlabeled samples remain difficult to solve. Transfer learning improves the overall classification capability of the model by taking data from other fields as source domains and learning features from them that benefit the model's expressiveness. However, current cross-domain methods are immature, and transfer learning still requires a sufficient number of labeled source-domain samples, which is itself a limitation; furthermore, the network must adapt the feature distributions across the two domains, which is complex and difficult. Contrastive learning converts the classification task into a metric task and has a certain sample-augmentation capability, so it shows good promise for small-sample classification of hyperspectral remote sensing images.
Although existing research has explored contrastive learning, problems such as excessive network parameters, weak connections between samples and low feature separability remain, restricting its use on hyperspectral remote sensing images.
The benefit of the twin neural network framework for sample augmentation and feature-separability improvement is evident. Zhao et al. first applied the twin neural network to the field of hyperspectral remote sensing image classification, Rao et al. proposed a new labeling strategy, and Jia et al. combined the twin neural network with a semi-supervised method to obtain a good classification effect, but current twin-neural-network methods have not been applied directly to small-sample classification of hyperspectral remote sensing images.
Disclosure of Invention
The technical problem to be solved is as follows: the invention provides a hyperspectral image classification method based on a twin neural network, which adopts a brand-new design strategy and performs outstandingly in terms of computational efficiency, classification accuracy and generalization.
The technical scheme is as follows:
A hyperspectral image classification method based on a twin neural network is used for obtaining the class marking results of all pixels in a hyperspectral remote sensing image for each class of a target dataset. The classification method comprises the following steps:
S1, according to the labels of the marked hyperspectral remote sensing image, selecting a fixed number of labeled samples from each class, and cropping the whole hyperspectral remote sensing image in the spatial dimension to obtain patches P_1 and P_2 of different spatial sizes centered on the labeled samples; all large patches are paired with all small patches to form the training sample pairs N_train, from which the training set is constructed. If the two patches of a sample pair are centered on pixels of the same class, the contrastive learning label y_ctr is set to 1, otherwise it is set to 0; if the central pixels of the two patches of a sample pair are the same, the pair is marked as a classification learning sample pair, and its classification learning label y_cls is the label of the central pixel.
S2, building the corresponding twin neural network model from a lightweight spatial-spectral joint network, and feeding training data into the twin neural network model in batches, each batch containing equal numbers of positive sample pairs N_+ and negative sample pairs N_-, where a positive sample pair N_+ is a training sample pair centered on pixels of the same class and a negative sample pair N_- is a training sample pair centered on pixels of different classes.
S3, for the positive sample pairs N_+ and the negative sample pairs N_-, extracting the spectral information of each single sample; combining the spectral information of the single samples, extracting the global spectral information and spatial information of the target hyperspectral remote sensing image sample pair to obtain the global spectral feature map F_1 and the spatial feature map F_2.
S4, obtaining the feature vectors L_1 and L_2 of the central pixels of the global spectral feature map F_1 and the spatial feature map F_2 respectively, and using L_1 and L_2 to obtain the weight coefficient W_pair of the target hyperspectral remote sensing image sample pair; the weight coefficient W_pair is used to increase the model's attention to difficult sample pairs.
S5, computing the feature vectors f_1 and f_2 corresponding to the global spectral feature map F_1 and the spatial feature map F_2; combining the feature vectors f_1 and f_2, the weight coefficient W_pair and the contrastive learning label y_ctr, computing the weighted contrastive loss L_ctr of the model; for the feature vectors f_1 and f_2, setting two adaptive learning parameters α and β respectively, and computing the adaptive cross-entropy loss value L_cls according to the classification learning label y_cls; combining the weighted contrastive loss L_ctr and the adaptive cross-entropy loss L_cls to obtain the loss function of the training process. The weighted contrastive loss L_ctr makes the model pay more attention to difficult samples during training, improving its fitting ability, and the adaptive cross-entropy loss L_cls lets the model learn the optimal fusion ratio of the two branches of the twin neural network during training, improving its expressive power.
S6, applying the Adam optimization algorithm with the weighted contrastive loss value L_ctr and the adaptive cross-entropy loss value L_cls to update the network parameters and fix the twin neural network model F.
S7, extracting features of all test sample pairs with the twin neural network model F, and predicting the class marking results of all pixels of the hyperspectral remote sensing image for each class.
Further, the target dataset includes the Indian Pines dataset, and the corresponding sample categories are land-cover types including buildings-grass-trees-drives, hay-windrowed, woods, stone-steel-towers, grass-pasture and various crops.
Further, in step S1, the process of constructing the training set includes the following steps:
S11, according to the labels of the marked hyperspectral remote sensing image, selecting a fixed number of labeled samples from each class as the training set, and cropping the selected samples in the spatial dimension of the whole hyperspectral remote sensing image to obtain patches P_1 and P_2 of different spatial sizes centered on the labeled samples; all large patches are paired with all small patches to form the training sample pairs N_train;
S12, cropping all pixels of the hyperspectral remote sensing image to obtain patches of the two sizes, and pairing the large patch and the small patch of each pixel to obtain the data N_test used for predicting the full image.
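By way of illustration, a minimal Python sketch of the differential patch cropping and pairing in S11-S12 might look as follows; the function names, the reflection padding at image borders, and the NumPy data layout are assumptions for illustration, not details fixed by the patent.

```python
import numpy as np

def extract_patch(image, row, col, size):
    # Crop a size x size spatial patch centered on (row, col) from an
    # H x W x B cube; borders are handled with reflection padding.
    half = size // 2
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="reflect")
    return padded[row:row + size, col:col + size, :]

def build_training_pairs(image, coords, labels, large=13, small=7):
    # Pair every large patch with every small patch (differential input).
    pairs, y_ctr, y_cls = [], [], []
    for i, (r1, c1) in enumerate(coords):
        p_large = extract_patch(image, r1, c1, large)
        for j, (r2, c2) in enumerate(coords):
            p_small = extract_patch(image, r2, c2, small)
            pairs.append((p_large, p_small))
            # Contrastive label: 1 when the two center pixels share a class.
            y_ctr.append(1 if labels[i] == labels[j] else 0)
            # Classification label: defined only when both centers coincide.
            y_cls.append(labels[i] if (r1, c1) == (r2, c2) else -1)
    return pairs, np.array(y_ctr), np.array(y_cls)
```

The 13 and 7 defaults anticipate the window combination chosen in the experiments below.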
Further, in step S3, the process of obtaining the global spectral feature map F_1 and the spatial feature map F_2 comprises the following steps:
S31, for the positive sample pairs N_+ and the negative sample pairs N_-, extracting the corresponding spectral information of each single sample with the deep learning model f:
v_ij^xyz = Σ_m Σ_p Σ_q Σ_{r=0}^{R_i − 1} w_ijm^pqr × v_(i−1)m^((x+p)(y+q)(z+r)) + b_ij
where v_ij^xyz denotes the value at (x, y, z) of the j-th three-dimensional feature map of the i-th layer; x, y and z denote the positions of the feature map in the width and height of the spatial dimension and in the spectral dimension respectively; m indexes the feature maps of layer i−1 connected to the j-th feature map of layer i; p and q denote the positions of the convolution kernel in the width and height of the spatial dimension; the parameter R_i denotes the size of the convolution kernel of the i-th layer in the spectral dimension; w_ijm^pqr denotes the kernel value at (p, q, r) connected to the m-th feature map; r denotes the position of the three-dimensional convolution kernel in the spectral dimension; and b_ij is the bias of the j-th feature map of the i-th layer;
S32, combining the spectral information of the single samples, extracting the global spectral information of the target hyperspectral remote sensing image sample pair, where C denotes the spectral dimension of the feature map extracted in step S31;
S33, extracting the spatial information of the target hyperspectral remote sensing image sample pair:
v_ij^xy = Σ_m Σ_{p=0}^{P−1} Σ_{q=0}^{Q−1} w_ijm^pq × v_(i−1)m^((x+p)(y+q)) + b_ij
where v_ij^xy denotes the value at (x, y) of the j-th two-dimensional feature map of the i-th layer; m indexes the feature maps of layer i−1 connected to the j-th feature map of layer i; p and q denote the positions of the convolution kernel in the width and height of the spatial dimension; P and Q denote the sizes of the convolution kernel in the spatial dimension; w_ijm^pq denotes the kernel value at (p, q) connected to the m-th feature map; and b_ij is the bias parameter of the j-th feature map of the i-th layer;
S34, according to the deep learning model f and the sample pairs N_+ and N_-, extracting the spectral features and spatial features of the target hyperspectral remote sensing image sample pair respectively, obtaining a global spectral feature map F_1 and a spatial feature map F_2 whose spatial size equals that of the input data and whose channel dimension equals the number of classes.
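The two feature-extraction branches of each twin can be pictured with a short PyTorch sketch. The patent fixes only that spectral features come from three-dimensional convolutions (S31), spatial features from two-dimensional convolutions (S33), and that both output maps have as many channels as classes (S34); the layer counts, channel widths and kernel sizes below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SpectralSpatialNet(nn.Module):
    # One shared-weight branch of the twin network (illustrative sizes).
    def __init__(self, bands, num_classes):
        super().__init__()
        # Spectral branch: 3-D convolutions along the band axis (S31),
        # then a 1x1 projection to num_classes channels -> F_1 (S32/S34).
        self.conv3d = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3), padding=(3, 1, 1)), nn.ReLU(),
            nn.Conv3d(8, 8, kernel_size=(7, 3, 3), padding=(3, 1, 1)), nn.ReLU(),
        )
        self.to_f1 = nn.Conv2d(8 * bands, num_classes, kernel_size=1)
        # Spatial branch: 2-D convolutions (S33) -> F_2.
        self.conv2d = nn.Sequential(
            nn.Conv2d(bands, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, num_classes, kernel_size=3, padding=1),
        )

    def forward(self, patch):                   # patch: (B, bands, H, W)
        b, _, h, w = patch.shape
        spec = self.conv3d(patch.unsqueeze(1))  # (B, 8, bands, H, W)
        f1 = self.to_f1(spec.reshape(b, -1, h, w))  # F_1: (B, K, H, W)
        f2 = self.conv2d(patch)                 # F_2: (B, K, H, W)
        return f1, f2
```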
Further, in step S4, the feature vectors L_1 and L_2 are used to obtain the weight coefficient W_pair of the target hyperspectral remote sensing image sample pair:
W_pair = (L_1 · L_2) / (||L_1|| × ||L_2||)
where ||L_1|| and ||L_2|| denote the moduli of the feature vectors L_1 and L_2.
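Under the dot-product-over-norms reading of the formula above, the batched weight computation can be sketched as follows (the helper name and the epsilon guard against zero norms are illustrative):

```python
import torch

def pair_weight(l1, l2, eps=1e-8):
    # Similarity-based weight W_pair from the center-pixel feature
    # vectors L_1 and L_2, batched as (B, D) tensors: similar negative
    # pairs and dissimilar positive pairs end up weighted more heavily
    # in the contrastive loss below.
    return (l1 * l2).sum(dim=1) / (l1.norm(dim=1) * l2.norm(dim=1) + eps)
```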
further, in step S5, the process of obtaining the loss function of the training process includes the following steps:
s51, according to the global spectral feature map F 1 And spatial feature map F 2 Calculating to obtain corresponding characteristic vector f by adopting the following formula 1 And a feature vector f 2 :
Wherein, W and H respectively represent the width and height of the characteristic diagram in the space dimension, W and H represent the width and height positions of the characteristic diagram in the space dimension, K represents the kth class, K represents the total class number, F whk A feature value representing a position of the feature map F in the k-dimension (w, h);
s52, according to the feature vector f 1 Characteristic vector f 2 Comparison learning label y ctr And a weight coefficient W pair Calculating a weighted contrast loss function value of the model:
L ctr =y ctr ×(1-W pair )×d pos +(1-y ctr )×W pair ×max(margin-d neg ,0)
wherein, y ctr A comparison learning label of the sample pair is represented, if the sample pair takes the same type of pixels as the center, the comparison learning label is 1, otherwise, the comparison learning label is 0; margin is a threshold that controls the positive and negative sample pair spacing; d pos And d neg Respectively representing positive sample pair spacing and negative sample pair spacing, which are calculated by using Euclidean distance between two feature vectors:
where K denotes the number of classes and d denotes two feature vectors f 1 And f 2 D represents d if the two eigenvectors match pos Otherwise, it represents d neg ,f 1k And f 2k Representing the eigenvalue of the eigenvector at position k;
s53, aiming at the feature vector f 1 And f 2 Setting two adaptive learning parameters alpha and beta respectively, and learning a label y according to classification cls Calculating an adaptive cross entropy loss value L cls :
Wherein,representing a vector of probabilities for each class predicted by the model for the input sample;
s54, calculating to obtain a loss function in the training process:
L=L ctr +L cls 。
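Putting S52 to S54 together, a minimal sketch of the combined loss might read as follows; the margin default of 1.2 comes from the experiments below, and the softmax fusion inside the cross-entropy term follows the reconstructed form of S53, so both should be treated as assumptions rather than as the patent's exact implementation.

```python
import torch
import torch.nn.functional as F

def s3net_loss(f1, f2, w_pair, y_ctr, y_cls, alpha, beta, margin=1.2):
    # f1, f2: (B, K) pooled branch vectors; w_pair, y_ctr: (B,) floats;
    # y_cls: (B,) long class indices; alpha, beta: learnable scalars.
    d = torch.norm(f1 - f2, dim=1)              # Euclidean pair distance (S52)
    l_ctr = (y_ctr * (1 - w_pair) * d
             + (1 - y_ctr) * w_pair * torch.clamp(margin - d, min=0)).mean()
    logits = alpha * f1 + beta * f2             # adaptive branch fusion (S53)
    l_cls = F.cross_entropy(logits, y_cls)
    return l_ctr + l_cls                        # L = L_ctr + L_cls (S54)
```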
Further, in step S7, the twin neural network model F is applied according to the following formula:
Y = argmax(F(N_test))
to predict the class marking results of all pixels of the hyperspectral remote sensing image for each class, where F(N_test) denotes the feature vectors of all test sample pairs in the hyperspectral remote sensing image.
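A sketch of this prediction step, assuming a wrapper `model_F(p_large, p_small)` that returns the fused K-dimensional score vector for each test pair (the exact fusion at inference is not spelled out at this point in the text, so the wrapper is an assumption):

```python
import torch

@torch.no_grad()
def predict_full_image(model_F, test_loader):
    # Y = argmax(F(N_test)): assign each pixel the class whose fused
    # score is largest, iterating over batches of large/small patch pairs.
    preds = []
    for p_large, p_small in test_loader:
        scores = model_F(p_large, p_small)   # (B, K) fused feature vectors
        preds.append(scores.argmax(dim=1))
    return torch.cat(preds)
```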
Further, the classification method further comprises the following step:
S8, quantitatively evaluating the classification results by computing and comparing the overall accuracy, class accuracy, average accuracy, Kappa coefficient and running time.
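The first four indices of S8 can all be derived from a confusion matrix; the sketch below (helper name and NumPy usage assumed) follows the standard definitions of overall accuracy, class accuracy, average accuracy and the Kappa coefficient:

```python
import numpy as np

def evaluate(y_true, y_pred, num_classes):
    # Build the confusion matrix, then derive OA, CA, AA and Kappa.
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    oa = np.trace(cm) / cm.sum()                        # overall accuracy
    ca = np.diag(cm) / np.maximum(cm.sum(axis=1), 1)    # per-class accuracy
    aa = ca.mean()                                      # average accuracy
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / cm.sum() ** 2
    kappa = (oa - pe) / (1 - pe)                        # chance-corrected OA
    return oa, ca, aa, kappa
```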
Beneficial effects:
the invention discloses a hyperspectral image classification method based on a twin neural network, which is characterized in that in a lightweight air-spectrum joint twin neural network method S3Net, a lightweight air-spectrum joint network is designed firstly, then a twin neural network framework is built through the network, finally, classification is realized on feature vectors extracted from input data based on the network, and the end-to-end classification capability is achieved; in addition, a differential input strategy and a random sampling strategy are introduced in the sample input stage, the differential input strategy provides more characteristics for the twin neural network, the random sampling strategy reduces the excessive use of negative samples, and the training burden of the twin neural network is reduced; a weighted comparison loss function and a self-adaptive cross entropy loss function are designed in the loss function part, the model increases the attention to difficult samples in the training stage by using the weighted comparison function, so that the fitting capability of the model is improved, the model learns the optimal fusion ratio of two branches of the twin neural network in the training stage by using the self-adaptive cross entropy loss function, and the expression capability of the model is improved. The algorithm designed by the invention has very outstanding performances in the aspects of calculation efficiency, classification accuracy and generalization.
Drawings
FIG. 1 is a schematic flow chart of a hyperspectral image classification method based on a twin neural network designed by the invention.
FIG. 2 is a schematic diagram of the learning-rate analysis results of S3Net in the embodiment of the present invention.
FIG. 3 is a schematic diagram of the neighborhood-window-size analysis results of S3Net in the embodiment of the present invention.
FIG. 4 is a schematic diagram of the PCA retained-principal-component analysis results of S3Net in the embodiment of the present invention.
FIG. 5 is a schematic diagram of the training-step-size analysis results of S3Net in the embodiment of the present invention.
FIG. 6 is a schematic diagram of the margin-threshold analysis results of S3Net in the embodiment of the present invention.
FIG. 7 is a schematic diagram of the accuracy of the S3Net differential input combinations on the three public datasets in the embodiment of the present invention: (a) on the Indian Pines data, (b) on the University of Pavia data, and (c) on the Yellow River Estuary (HHK) data.
FIG. 8 is a schematic diagram of the feature distributions of differential inputs for different categories in the Indian Pines dataset in the embodiment of the present invention: (a) feature distribution of no-till corn at the large size, (b) feature distribution of no-till corn at the small size, (c) feature distribution of buildings-grass-trees-drives at the large size, (d) feature distribution of buildings-grass-trees-drives at the small size.
FIG. 9 is a classification diagram of the AVIRIS Indian Pines dataset for the different methods in the embodiment of the present invention: (a) 2DCNN, (b) 3DCNN, (c) SSUN, (d) SSRN, (e) 3DAES, (f) DCFSL, (g) S2Net, (h) S3Net, and (j) the ground-truth distribution data.
FIG. 10 is a classification diagram of the ROSIS University of Pavia dataset for the different methods in the embodiment of the present invention: (a) 2DCNN, (b) 3DCNN, (c) SSUN, (d) SSRN, (e) 3DAES, (f) DCFSL, (g) S2Net, (h) S3Net, and (j) the ground-truth distribution data.
FIG. 11 is a classification diagram of the ZY1E_AHSI Yellow River Estuary dataset for the different methods in the embodiment of the present invention: (a) 2DCNN, (b) 3DCNN, (c) SSUN, (d) SSRN, (e) 3DAES, (f) DCFSL, (g) S2Net, (h) S3Net, and (j) the ground-truth distribution data.
FIG. 12 shows the generalization accuracy of the different methods on the three datasets in the embodiment of the present invention: (a) Indian Pines, (b) University of Pavia, and (c) Yellow River Estuary (HHK).
Detailed Description
The following examples will give the skilled person a more complete understanding of the present invention, but do not limit the invention in any way.
Compared with traditional deep learning methods, a lightweight network has fewer parameters and is less prone to overfitting under small-sample conditions, making it better suited to small-sample classification tasks; meanwhile, the twin neural network framework can expand the sample capacity without changing the network parameters, adds the idea of contrastive learning to the model, and improves the model's classification ability.
Therefore, the invention designs a hyperspectral image classification method based on a twin neural network: a lightweight spatial-spectral joint network is first designed, a twin neural network framework is then built from this network, and finally classification is performed on the feature vectors the network extracts from the input data, giving end-to-end classification capability. In addition, a differential input strategy and a random sampling strategy are introduced at the sample input stage: the differential input strategy provides more features to the twin neural network, while the random sampling strategy reduces the excessive use of negative samples and lightens the training burden of the twin neural network. A weighted contrastive loss function and an adaptive cross-entropy loss function are designed for the loss function: with the weighted contrastive loss the model pays more attention to difficult samples during training, improving its fitting ability, and with the adaptive cross-entropy loss the model learns the optimal fusion ratio of the two branches of the twin neural network during training, improving its expressive power. The algorithm designed by the invention performs outstandingly in terms of computational efficiency, classification accuracy and generalization.
In practice, as shown in FIG. 1, the steps described above are executed. In the specific embodiments, three groups of representative hyperspectral remote sensing data, AVIRIS Indian Pines, ROSIS University of Pavia and ZY1E_AHSI Yellow River Estuary, are used to verify the validity of the proposed lightweight spatial-spectral joint twin neural network model for small-sample classification of hyperspectral remote sensing images and its extensions.
1 Experimental setup
(1) Analysis of parameter sensitivity
For the parameters of the deep learning model, the learning rate, the neighborhood window, the number of PCA-retained bands, the training step size, the differential window combination and the margin threshold are tested respectively; in each of the three datasets, 5 labeled samples per class are selected as training data.
(2) Ablation experiment
The influence of the proposed differential input strategy, random sampling strategy, weighted contrastive loss function and adaptive cross-entropy loss function on the overall accuracy of the deep learning model is verified; experiments are carried out on the three datasets, with 5 labeled samples per class selected as training data.
(3) Generalization experiments
The overall classification accuracy of all methods was tested with 5 to 50 training samples per class on the three datasets. Specifically, for a given class, if the required number of training samples equals or exceeds the number of samples the class contains, only 50% of the class's samples are selected for training.
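A sketch of this selection rule with the 50% fallback (helper name and data layout assumed):

```python
import numpy as np

def select_training_samples(indices_by_class, n_train, seed=0):
    # Draw n_train samples per class; when a class has too few samples
    # (required count >= class size), fall back to half the class.
    rng = np.random.default_rng(seed)
    selected = {}
    for cls, idx in indices_by_class.items():
        idx = np.asarray(idx)
        n = n_train if len(idx) > n_train else max(len(idx) // 2, 1)
        selected[cls] = rng.choice(idx, size=n, replace=False)
    return selected
```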
(4) Comparison method
The comparison method for verifying the validity of the model comprises the following steps:
two classical deep learning approaches: two-Dimensional Convolutional Neural networks (2 DCNN) and Three-Dimensional Convolutional Neural networks (3 DCNN);
two advanced space-spectrum combination methods: spectral-Spatial Residual Network (SSRN) and Spectral-Spatial Unified Network (SSUN),
two deep neural network methods for small sample classification: a Deep Cross domain transfer Learning small sample Learning method (DCFSL) and a twin neural network semi-supervised method (3-Dimension Auto Encoder Stack,3 DAES).
The hyper-parameters of the comparison method used above are all set according to the original paper.
(5) Evaluation index
The classification results were quantitatively evaluated by computing and comparing the overall accuracy (OA), class accuracy (CA), average accuracy (AA), Kappa coefficient (κ) and running time (t). For all classification algorithms used, every evaluation index is the average over 10 independent runs with randomly initialized training samples.
2 Experimental results
(1) Learning rate
According to the principles of deep learning models, the S3Net algorithm is influenced by several hyper-parameters. FIG. 2 shows the influence of the learning rate on the overall accuracy of S3Net with 5 random training samples per class on each of the three datasets. The learning rate is one of the key parameters of the gradient descent algorithm and directly determines the magnitude of the parameter update in each training step. Seven conventional learning rates were tested: 1e-1, 5e-2, 1e-2, 5e-3, 1e-3, 5e-4 and 1e-4.
As can be seen from FIG. 2, at a learning rate of 1e-1 the test accuracy on all three datasets is the lowest, because an excessively high learning rate updates the network parameters too fast and causes overfitting; at a learning rate of 1e-4 the classification accuracy is also relatively low, because an excessively small learning rate makes the network parameters change slowly, so the network is not trained effectively under this setting. Between 1e-2 and 5e-4, the classification accuracy of the model differs little across the three datasets, and considering the time factor the learning rate is finally set to 1e-2.
(2) Neighborhood window size
As shown in FIG. 3, the effect of different window sizes on the classification accuracy was evaluated on the three datasets. The window sizes were set in the range [3×3, 5×5, 7×7, 9×9, 11×11, 13×13, 15×15, 17×17]. Below a neighborhood window size of 9×9, the classification accuracy of the network increases with window size, because at small window sizes the additional spatial information benefits spatial feature extraction and thus improves the model's classification accuracy; above 11×11, the accuracy on the Indian Pines and University of Pavia datasets decreases as the window grows, because larger windows bring more and more heterogeneous pixels into the pixel block, which affects the model's final classification accuracy. The accuracy on the HHK dataset differs little across block sizes and peaks at a block size of 9×9. Weighing training speed against classification results, the window size is finally set to 9×9.
(3) PCA-retained principal components
As shown in FIG. 4, the influence of the number of principal components retained after PCA processing on the model's classification results is shown for the three datasets. The numbers of principal components tested were in the ranges {10, 20, ..., 100, 103} (University of Pavia), {10, 20, ..., 190, 200} (Indian Pines) and {10, 20, ..., 110, 119} (HHK). As can be seen from the figure, for all three datasets, if the hyperspectral data are not preprocessed with PCA the overall classification accuracy is far lower than with PCA-preprocessed data, showing that the original data have better separability after PCA processing. Specifically, for HHK, OA is highest when 80 principal components are retained, so in subsequent experiments this dataset is preprocessed with PCA keeping its first 80 dimensions. For University of Pavia, OA is highest with 30 bands after PCA, so the first 30 principal components after PCA preprocessing are retained for this data in later experiments. For Indian Pines, the highest OA is obtained when the first 60 principal components are taken after PCA preprocessing, so it is set to 60 for this dataset.
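For reference, a minimal eigendecomposition PCA over the hyperspectral cube, parameterized by the per-dataset component counts reported above, could be sketched as follows (the helper is illustrative; any standard PCA implementation serves equally well):

```python
import numpy as np

def pca_reduce(cube, n_components):
    # Flatten the H x W x B cube to (pixels, bands), project onto the
    # leading principal components, and reshape back to H x W x n.
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(np.float64)
    flat -= flat.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(flat, rowvar=False))
    top = eigvecs[:, ::-1][:, :n_components]    # eigh returns ascending order
    return (flat @ top).reshape(h, w, n_components)

# Component counts used in the experiments: Indian Pines -> 60,
# University of Pavia -> 30, HHK -> 80.
```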
(4) Training step size
As shown in FIG. 5, the training step size ranges from 0 to 100 for the three datasets, and the loss values on the training set are observed to determine the fitting state of the model. It can be seen from the figure that for all three datasets the loss value is already close to 0 and stabilizes around a training step size of 90, indicating that the network has converged by then, so the training step size is set to 90 in the subsequent study.
(5) Margin threshold
As shown in FIG. 6, the margin is an important parameter controlling the distance between positive and negative samples. The influence of different margin values on OA for the three datasets was tested in the range 0.1-2.0. A margin of about 1.2 achieves relatively high classification accuracy on all three datasets, so in the subsequent experiments the margin is set to 1.2.
(6) Differential input
FIG. 7 shows the overall classification accuracy of S3Net on the three datasets under different window settings. The rows and columns in the figure represent the window sizes of the two branches' input samples. The figure shows that, whenever both windows are larger than 5×5, differential inputs achieve higher accuracy than equal-size inputs, demonstrating that differential input can effectively improve the model's classification ability. Considering the overall classification accuracy on the three datasets, the input combination of the two branches is finally chosen as 13×13 and 7×7.
FIG. 8 is a statistical analysis of the initial sample values under different windows for the no-till corn and buildings-grass-trees-drives categories of the Indian Pines dataset. In FIG. 8, (a) and (b) are the statistics for the no-till corn category under the large and small windows respectively, and (c) and (d) are the statistics for the buildings-grass-trees-drives category under the large and small windows respectively, with the initial sample values on the abscissa and their counts on the ordinate. It can be seen that, even for the same sample, the initial-value distributions under different window sizes differ markedly: the small window is smoother and overall closer to a Gaussian distribution, while more initial values in the large window gather near the peak. The results show that, compared with a sample at a single window size, the differential window combination provides more features to the model. Together with the overall classification results, this shows that differential input provides more effective features to the model, strengthening its discrimination ability and ultimately improving its classification ability.
(7) Random sampling strategy
Table 1 shows the impact of the random sampling strategy on the classification accuracy and running time of the model on the three datasets.
TABLE 1 Classification accuracy of the random selection strategy on the three datasets
As can be seen from Table 1, on the University of Pavia, Indian Pines and HHK datasets the training time of the model under the random selection strategy is about 1/10, 1/50 and 1/70 of the original training time respectively. The training time is greatly reduced because the random selection strategy speeds up training by reducing the number of training samples in each batch. In addition, in terms of accuracy the strategy matches, and even slightly exceeds, the accuracy obtained when all sample pairs are input, which demonstrates that the excessive negative samples contribute little to the network while consuming a great deal of training time.
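The balanced random pair sampling that this strategy implies (equal numbers of positive and negative pairs per batch, as stated in step S2) might be sketched as:

```python
import numpy as np

def sample_balanced_batch(pos_pairs, neg_pairs, batch_size, rng):
    # Instead of feeding every pair (negatives vastly outnumber
    # positives), draw half the batch from each pool at random.
    half = batch_size // 2
    pos_idx = rng.choice(len(pos_pairs), size=half,
                         replace=len(pos_pairs) < half)
    neg_idx = rng.choice(len(neg_pairs), size=half, replace=False)
    return [pos_pairs[i] for i in pos_idx] + [neg_pairs[i] for i in neg_idx]
```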
(8) Loss function
Tables 2 and 3 evaluate the weighted contrastive loss function and the adaptive cross-entropy loss function proposed by the present technique; Table 2 compares the overall accuracy of the weighted contrastive loss function with that of the original contrastive loss function.
TABLE 3 Overall accuracy comparison of the adaptive cross-entropy loss with the original cross-entropy loss
As can be seen from Tables 2 and 3, both improvement strategies proposed by the present technique help improve the overall accuracy of the model: the weighted contrastive loss function increases the model's attention to difficult samples by giving them greater weight, thereby improving its fitting ability, and the adaptive cross-entropy loss function improves the model's feature expression capability by letting the network learn the optimal fusion proportion.
(9) Comparative experiment
S3Net is compared with the other comparison methods mentioned earlier in terms of classification accuracy. Table 4 shows the per-class accuracy, overall accuracy, average accuracy and Kappa coefficient of the different classification methods on the Indian Pines dataset with 20 training samples per class. As can be seen from Table 4, the S3Net designed by the present technique achieves a better classification effect than the other comparison algorithms. The OA of S3Net reaches 91.54%: it is 14.67% and 14.41% higher than 2DCNN and 3DCNN, 19.16% and 11.02% higher than SSUN and SSRN, 16.02% and 8.44% higher than 3DAES and DCFSL, and 1.63% higher than S2Net, a large improvement in accuracy over existing methods. Meanwhile, AA and κ×100 also reach the highest values, 95.58% and 90.38 respectively. In CA, S3Net achieves the highest accuracy on 10 land-cover classes in total, particularly in the classification of the corn and buildings-grass-trees-drives categories. FIG. 9 shows the full-image classification results of the different methods; overall, the classification method provided herein has a better overall effect, and its full-image classification result is smoother with less noise.
TABLE 4 Classification accuracy of different methods on the Indian Pines dataset
Table 5 shows the per-class accuracy, overall accuracy, average accuracy and Kappa coefficient of the different classification methods on the University of Pavia dataset with 20 training samples per class. As can be seen from Table 5, the S3Net designed by the present technique achieves a better classification effect than the other comparison algorithms. The OA of S3Net is 96.27%: compared with the other methods, it is 7.63% and 12.43% higher than 2DCNN and 3DCNN, 11.07% and 6.52% higher than SSUN and SSRN, 7.55% and 5.14% higher than 3DAES and DCFSL, and 2.5% higher than S2Net; AA and κ×100 also reach the highest values, 96.91% and 95.11 respectively. S3Net achieves the highest classification accuracy on 5 of the 9 classes. FIG. 10 shows the classification maps obtained by the different methods; overall, the S3Net proposed herein obtains the best classification result map, retains more edge details and looks smoother as a whole.
TABLE 5 Classification accuracy of different methods on the University of Pavia dataset
Table 6 shows the per-class accuracy, overall accuracy, average accuracy and Kappa coefficient of the different classification methods on the Yellow River Estuary dataset with 20 training samples per class. As can be seen from the table, the S3Net designed by the present technique achieves a classification effect superior to the other comparison algorithms. The OA of S3Net is 95.25%, which is 1.48% and 2.82% higher than 2DCNN and 3DCNN, 10.05% and 1.77% higher than SSUN and SSRN, 5.03% and 2.73% higher than 3DAES and DCFSL, and 0.99% higher than S2Net. The AA and κ×100 of S3Net also reach the highest values, 97.49% and 93.62 respectively, and it achieves the highest classification accuracy on 13 of the 23 classes. The full-image classification maps of the different methods are shown in FIG. 11; overall, the S3Net proposed herein obtains the best classification result map, retains more edge details and looks smoother as a whole.
TABLE 6 Classification accuracy of different methods on the Yellow River Estuary (HHK) dataset
(10) Generalization experiments
To verify the generalization performance of the small-sample classification model designed by the present technique, all methods were trained on the three datasets with [5, 10, 15, 20, 30, 40, 50] labeled samples per class, and their classification accuracy was tested, as shown in FIG. 12. As can be seen from the figure, across the different numbers of training samples on the three datasets, the S3Net proposed herein achieves the highest overall classification accuracy, demonstrating that the technique has strong generalization. In addition, when the number of training samples is small the accuracy of S3Net is better than that of the other methods, indicating that the present technique classifies better under small-sample conditions.
From the above embodiments it can be seen that, in the method for small-sample classification of hyperspectral remote sensing images based on a lightweight spatial-spectral joint twin neural network of the invention, within the traditional twin neural network framework a lightweight spatial-spectral joint network is first designed using one-dimensional and two-dimensional convolutions, a twin neural network framework is then constructed from this lightweight network, and finally the network is trained with the weighted contrastive loss function and the adaptive cross-entropy loss function, predicting the labels of unknown data end to end to realize classification. In addition, a differential input strategy and a random sampling strategy are introduced: the differential input strategy gives the model inputs of different spatial sizes, introducing more usable features in the training stage and letting the model learn the differences between features, which improves its discrimination ability, while the random sampling strategy effectively alleviates the inherent sample-pair redundancy of the twin neural network. The weighted contrastive loss function introduces weights so that the model pays more attention to difficult samples during training, and the adaptive cross-entropy loss function introduces parameters in the training stage so that the model learns the optimal fusion ratio of the two branches; both effectively improve the model's expressive power. The design method of the invention performs outstandingly in terms of computational efficiency, classification accuracy and generalization.
The embodiments of the present invention have been described in detail with reference to the drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the gist of the present invention.
Claims (8)
1. A hyperspectral image classification method based on a twin neural network, used for obtaining the class marking results of all pixels in a hyperspectral remote sensing image for each class of a target dataset, characterized by comprising the following steps:
S1, according to the labels of the marked hyperspectral remote sensing image, selecting a fixed number of labeled samples from each class, and cropping the whole hyperspectral remote sensing image in the spatial dimension to obtain patches P_1 and P_2 of different spatial sizes centered on the labeled samples; all large patches are paired with all small patches to form training sample pairs N_train, from which the training set is constructed; if the two patches of a sample pair are centered on pixels of the same class, the contrastive learning label y_ctr is set to 1, otherwise it is set to 0; if the central pixels of the two patches of a sample pair are the same, the pair is marked as a classification learning sample pair, and its classification learning label y_cls is the label of the central pixel;
S2, building the corresponding twin neural network model from a lightweight spatial-spectral joint network, and feeding training data into the twin neural network model in batches, each batch containing equal numbers of positive sample pairs N_+ and negative sample pairs N_-, where a positive sample pair N_+ is a training sample pair centered on pixels of the same class and a negative sample pair N_- is a training sample pair centered on pixels of different classes;
S3, for the positive sample pairs N_+ and the negative sample pairs N_-, extracting the spectral information of each single sample; combining the spectral information of the single samples, extracting the global spectral information and spatial information of the target hyperspectral remote sensing image sample pair to obtain the global spectral feature map F_1 and the spatial feature map F_2;
S4, obtaining the feature vectors L_1 and L_2 of the central pixels of the global spectral feature map F_1 and the spatial feature map F_2 respectively, and using L_1 and L_2 to obtain the weight coefficient W_pair of the target hyperspectral remote sensing image sample pair, the weight coefficient W_pair being used to increase the model's attention to difficult sample pairs;
S5, computing the feature vectors f_1 and f_2 corresponding to the global spectral feature map F_1 and the spatial feature map F_2; combining the feature vectors f_1 and f_2, the weight coefficient W_pair and the contrastive learning label y_ctr, computing the weighted contrastive loss L_ctr of the model; for the feature vectors f_1 and f_2, setting two adaptive learning parameters α and β respectively, and computing the adaptive cross-entropy loss value L_cls according to the classification learning label y_cls; combining the weighted contrastive loss L_ctr and the adaptive cross-entropy loss L_cls to obtain the loss function of the training process, where the weighted contrastive loss L_ctr makes the model pay more attention to difficult samples during training to improve its fitting ability, and the adaptive cross-entropy loss L_cls lets the model learn the optimal fusion ratio of the two branches of the twin neural network during training to improve its expressive power;
S6, applying the Adam optimization algorithm with the weighted contrastive loss value L_ctr and the adaptive cross-entropy loss value L_cls to update the network parameters and fix the twin neural network model F;
S7, extracting features of all test sample pairs with the twin neural network model F, and predicting the class marking results Y of all pixels of the hyperspectral remote sensing image for each class.
2. The twin neural network-based hyperspectral image classification method according to claim 1, wherein the target dataset comprises the Indian Pines dataset and the corresponding sample categories are land-cover types including buildings-grass-trees-drives, hay-windrowed, woods, stone-steel-towers, grass-pasture and various crops.
3. The twin neural network-based hyperspectral image classification method according to claim 1, wherein the process of constructing the training set in step S1 comprises the following steps:
S11, according to the labels of the marked hyperspectral remote sensing image, selecting a fixed number of labeled samples from each class as the training set, and cropping the selected samples in the spatial dimension of the whole hyperspectral remote sensing image to obtain patches P_1 and P_2 of different spatial sizes centered on the labeled samples; all large patches are paired with all small patches to form the training sample pairs N_train;
S12, cropping all pixels of the hyperspectral remote sensing image to obtain patches of the two sizes, and pairing the large patch and the small patch of each pixel to obtain the data N_test used for predicting the full image.
4. The twin neural network-based hyperspectral image classification method according to claim 1, wherein in step S3, the process of obtaining the global spectral feature map F_1 and the spatial feature map F_2 comprises the following steps:
S31, for the positive sample pairs N_+ and the negative sample pairs N_-, extracting the corresponding spectral information with the deep learning model f through three-dimensional convolution:

$$v_{ij}^{xyz} = b_{ij} + \sum_{m}\sum_{p}\sum_{q}\sum_{r=0}^{R_i-1} w_{ijm}^{pqr}\, v_{(i-1)m}^{(x+p)(y+q)(z+r)}$$

wherein v_ij^xyz represents the value at (x, y, z) of the j-th three-dimensional feature map of the i-th layer; x, y and z denote the positions of the feature map in the spatial width, spatial height and spectral dimension respectively; m indexes the feature maps of the (i-1)-th convolutional layer connected to the j-th feature map of the i-th layer; p and q denote the positions of the convolution kernel in the spatial width and height; the parameter R_i denotes the size of the i-th layer's convolution kernel in the spectral dimension; w_ijm^pqr denotes the value at (p, q, r) of the kernel parameters connected to the m-th feature map, with r the position of the three-dimensional kernel in the spectral dimension; and b_ij is the bias of the j-th feature map of the i-th layer;
S32, combining the spectral information of the individual samples to extract the global spectral information of the target hyperspectral remote sensing image sample pair, wherein C represents the spectral dimension of the feature map extracted in step S31;
S33, extracting the spatial information of the target hyperspectral remote sensing image sample pair through two-dimensional convolution:

$$v_{ij}^{xy} = b_{ij} + \sum_{m}\sum_{p=0}^{P-1}\sum_{q=0}^{Q-1} w_{ijm}^{pq}\, v_{(i-1)m}^{(x+p)(y+q)}$$

wherein v_ij^xy represents the value at (x, y) of the j-th two-dimensional feature map of the i-th layer; m indexes the feature maps of the (i-1)-th convolutional layer connected to the j-th feature map of the i-th layer; p and q denote the positions, and P and Q the sizes, of the convolution kernel in the spatial width and height; w_ijm^pq denotes the value at (p, q) of the kernel parameters connected to the m-th feature map; and b_ij is the bias parameter of the j-th feature map of the i-th layer;
S34, according to the deep learning model f and the sample pairs N_+ and N_-, extracting the spectral features and the spatial features of the target hyperspectral remote sensing image sample pair respectively, obtaining a global spectral feature map F_1 and a spatial feature map F_2 whose spatial size equals that of the input data and whose channel dimension equals the number of classes.
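A minimal PyTorch sketch of the two branches described in S31-S34: a spectral branch built from the three-dimensional convolution of S31 and a spatial branch built from the two-dimensional convolution of S33, each ending with as many output channels as there are classes (S34). Layer counts, kernel sizes and channel widths are assumptions, not the patent's architecture:

```python
import torch
import torch.nn as nn

class TwinBranches(nn.Module):
    """Sketch of S31-S34: a 3-D convolutional spectral branch and a 2-D
    convolutional spatial branch, each producing a feature map with the
    same spatial size as its input and one channel per class."""
    def __init__(self, bands=100, num_classes=16):
        super().__init__()
        # Spectral branch (S31): 3-D convolutions over (band, x, y); the
        # kernel's spectral size R_i is 7 here (an assumed value).
        self.spectral = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3), padding=(3, 1, 1)),
            nn.ReLU(),
            nn.Conv3d(8, 1, kernel_size=(7, 3, 3), padding=(3, 1, 1)),
            nn.ReLU(),
        )
        self.to_classes = nn.Conv2d(bands, num_classes, kernel_size=1)
        # Spatial branch (S33): 2-D convolutions with P = Q = 3.
        self.spatial = nn.Sequential(
            nn.Conv2d(bands, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, num_classes, kernel_size=3, padding=1),
        )

    def forward(self, p1, p2):
        # p1: (B, bands, H1, W1) large patch; p2: (B, bands, H2, W2) small patch.
        z = self.spectral(p1.unsqueeze(1)).squeeze(1)  # (B, bands, H1, W1)
        f1_map = self.to_classes(z)                    # global spectral map F_1
        f2_map = self.spatial(p2)                      # spatial map F_2
        return f1_map, f2_map

# Toy usage: a batch of 4 pairs with 100 bands, 27x27 and 13x13 patches.
net = TwinBranches(bands=100, num_classes=16)
F1, F2 = net(torch.rand(4, 100, 27, 27), torch.rand(4, 100, 13, 13))
print(F1.shape, F2.shape)  # (4, 16, 27, 27) and (4, 16, 13, 13)
```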
5. The twin neural network-based hyperspectral image classification method according to claim 1, wherein in step S4, the weight coefficient W of the target hyperspectral remote sensing image sample pair is obtained from the feature vectors L_1 and L_2; wherein ||L_1|| and ||L_2|| represent the moduli of the feature vectors L_1 and L_2.
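The weight formula itself is an image in the source and is not reproduced above. Since the definition involves only L_1, L_2 and their moduli, a cosine-similarity weight is one plausible reading; the sketch below assumes that reading and should not be taken as the patent's exact formula:

```python
import torch

def pair_weight(l1: torch.Tensor, l2: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """ASSUMED form of the pair weight W in claim 5: the cosine similarity of
    the two feature vectors, rescaled from [-1, 1] to [0, 1]. Combined with
    the weighted contrastive loss of claim 6, this down-weights easy positive
    pairs (1 - W is small when the two views agree) and up-weights hard
    negative pairs (W is large when different-class views look alike)."""
    cos = (l1 * l2).sum(dim=-1) / (l1.norm(dim=-1) * l2.norm(dim=-1) + eps)
    return 0.5 * (1.0 + cos)

w = pair_weight(torch.rand(4, 16), torch.rand(4, 16))
print(w.shape, float(w.min()), float(w.max()))
```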
6. The twin neural network-based hyperspectral image classification method according to claim 1, wherein in step S5, the process of obtaining the loss function of the training process comprises the following steps:
S51, calculating the corresponding feature vectors f_1 and f_2 from the global spectral feature map F_1 and the spatial feature map F_2 using the following formula:

$$f_k = \frac{1}{W \times H}\sum_{w=1}^{W}\sum_{h=1}^{H} F_{whk}$$

wherein W and H respectively denote the width and height of the feature map in the spatial dimension, w and h denote the width and height positions in the feature map, k denotes the k-th class, K denotes the total number of classes, and F_whk denotes the value of the feature map F at position (w, h) in the k-th channel;
S52, according to the feature vector f_1, the feature vector f_2, the contrastive learning label y_ctr and the weight coefficient W_pair, calculating the weighted contrastive loss value of the model:

$$L_{ctr} = y_{ctr} \times (1 - W_{pair}) \times d_{pos} + (1 - y_{ctr}) \times W_{pair} \times \max(margin - d_{neg},\ 0)$$

wherein y_ctr denotes the contrastive learning label of the sample pair, which is 1 if the two patches of the pair are centered on pixels of the same class and 0 otherwise; margin is a threshold controlling the spacing between positive and negative sample pairs; and d_pos and d_neg denote the positive-pair and negative-pair distances respectively, both calculated as the Euclidean distance between the two feature vectors:

$$d = \sqrt{\sum_{k=1}^{K} (f_{1k} - f_{2k})^{2}}$$

wherein K denotes the number of classes; d denotes the distance between the two feature vectors f_1 and f_2, standing for d_pos if the two feature vectors form a matching pair and for d_neg otherwise; and f_1k and f_2k denote the values of the feature vectors at position k;
S53, setting two adaptive learning parameters α and β for the feature vectors f_1 and f_2 respectively, and calculating the adaptive cross-entropy loss value L_cls according to the classification label y_cls, wherein the model predicts for each input sample a vector of per-class probabilities;
S54, calculating the loss function of the training process:

$$L = L_{ctr} + L_{cls}$$
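A PyTorch sketch of S51-S54 taken together: global average pooling for f_1 and f_2 (S51), the weighted contrastive loss exactly as written in S52, and an adaptive cross-entropy in which the fusion α·f_1 + β·f_2 is an assumption, since the patent's own S53 formula is an image not reproduced in this text:

```python
import torch
import torch.nn.functional as F

def pooled_vector(fmap: torch.Tensor) -> torch.Tensor:
    """S51: global average pooling of a (B, K, H, W) feature map into (B, K)."""
    return fmap.mean(dim=(2, 3))

def weighted_contrastive_loss(f1, f2, w_pair, y_ctr, margin=1.0):
    """S52: L_ctr = y_ctr*(1-W)*d_pos + (1-y_ctr)*W*max(margin-d_neg, 0),
    with d the Euclidean distance between the two feature vectors."""
    d = torch.norm(f1 - f2, dim=-1)
    pos = y_ctr * (1.0 - w_pair) * d
    neg = (1.0 - y_ctr) * w_pair * torch.clamp(margin - d, min=0.0)
    return (pos + neg).mean()

class AdaptiveCrossEntropy(torch.nn.Module):
    """S53 sketch: learnable alpha and beta fuse the two branch vectors before
    a standard cross-entropy; the fusion form alpha*f1 + beta*f2 is ASSUMED."""
    def __init__(self):
        super().__init__()
        self.alpha = torch.nn.Parameter(torch.tensor(0.5))
        self.beta = torch.nn.Parameter(torch.tensor(0.5))

    def forward(self, f1, f2, y_cls):
        logits = self.alpha * f1 + self.beta * f2
        return F.cross_entropy(logits, y_cls)

# Toy usage with a batch of 4 pairs and 16 classes (S54: L = L_ctr + L_cls).
F1, F2 = torch.rand(4, 16, 27, 27), torch.rand(4, 16, 13, 13)
f1, f2 = pooled_vector(F1), pooled_vector(F2)
y_ctr, y_cls = torch.tensor([1.0, 0.0, 1.0, 0.0]), torch.randint(0, 16, (4,))
ace = AdaptiveCrossEntropy()
loss = weighted_contrastive_loss(f1, f2, torch.rand(4), y_ctr) + ace(f1, f2, y_cls)
print(float(loss))
```

Because α and β are nn.Parameter objects, the Adam update of step S6 adjusts them along with the network weights, which is how the model can learn the branch fusion ratio during training.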
7. The twin neural network-based hyperspectral image classification method according to claim 1, wherein in step S7, the twin neural network model F is applied to predict the class labels of all pixels of the hyperspectral remote sensing image for each class; wherein F(B_test) represents the feature vectors of all test sample pairs in the hyperspectral remote sensing image.
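A sketch of the S7 prediction step, reusing the TwinBranches and AdaptiveCrossEntropy classes from the sketches above; the argmax over fused class scores is an assumed reading of the claim's formula, which is not reproduced in this text:

```python
import torch

@torch.no_grad()
def predict_image(model, ace, test_pairs, height, width):
    """S7 sketch: run every pixel's (large, small) patch pair through the
    fixed twin network F and label the pixel with its argmax class; the
    alpha/beta fusion reuses the AdaptiveCrossEntropy parameters (assumed)."""
    model.eval()
    labels = torch.zeros(height * width, dtype=torch.long)
    for idx, (p1, p2) in enumerate(test_pairs):      # one pair per pixel (S12)
        f1_map, f2_map = model(p1.unsqueeze(0), p2.unsqueeze(0))
        f1, f2 = f1_map.mean(dim=(2, 3)), f2_map.mean(dim=(2, 3))
        scores = ace.alpha * f1 + ace.beta * f2      # fused class scores
        labels[idx] = int(scores.argmax())
    return labels.reshape(height, width)

# Toy usage on a 2x2 "image" (4 pixels), with net and ace as defined above.
pairs = [(torch.rand(100, 27, 27), torch.rand(100, 13, 13)) for _ in range(4)]
print(predict_image(net, ace, pairs, height=2, width=2))
```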
8. The twin neural network-based hyperspectral image classification method according to claim 1, further comprising the following step:
and S8, quantitatively evaluating the classification result by computing and comparing the overall accuracy, per-class accuracy, average accuracy, Kappa coefficient and running time.
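The S8 statistics are standard remote-sensing classification metrics; a self-contained NumPy sketch of overall accuracy, per-class accuracy, average accuracy and the Kappa coefficient, all derived from one confusion matrix (running time is simply a wall-clock measurement around training and prediction):

```python
import numpy as np

def classification_report(y_true, y_pred, num_classes):
    """S8: overall accuracy (OA), per-class accuracy, average accuracy (AA)
    and Cohen's Kappa, computed from the confusion matrix."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    n = cm.sum()
    oa = np.trace(cm) / n                                    # overall accuracy
    per_class = np.diag(cm) / np.maximum(cm.sum(axis=1), 1)  # class accuracy
    aa = per_class.mean()                                    # average accuracy
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / (n * n)   # chance agreement
    kappa = (oa - pe) / (1.0 - pe)
    return oa, per_class, aa, kappa

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
oa, per_class, aa, kappa = classification_report(y_true, y_pred, 3)
print(f"OA={oa:.3f} AA={aa:.3f} kappa={kappa:.3f}")
```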
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211002779.1A CN115496934A (en) | 2022-08-19 | 2022-08-19 | Hyperspectral image classification method based on twin neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115496934A (en) | 2022-12-20
Family
ID=84466678
Family Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---|
CN202211002779.1A Pending CN115496934A (en) | 2022-08-19 | 2022-08-19 | Hyperspectral image classification method based on twin neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115496934A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116595208A (en) * | 2023-07-17 | 2023-08-15 | 云南大学 | Classification method and device for hyperspectral images and electronic equipment |
CN116595208B (en) * | 2023-07-17 | 2023-10-13 | 云南大学 | Classification method and device for hyperspectral images and electronic equipment |
CN117763879A (en) * | 2024-02-22 | 2024-03-26 | 大连理工大学 | Structural mechanics response field digital twin method based on multilayer stack learner |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022160771A1 (en) | Method for classifying hyperspectral images on basis of adaptive multi-scale feature extraction model | |
CN111695467B (en) | Spatial spectrum full convolution hyperspectral image classification method based on super-pixel sample expansion | |
CN110210313B (en) | Hyperspectral remote sensing image classification method based on multi-scale PCA-3D-CNN (principal component analysis-three dimensional-CNN) space spectrum combination | |
CN112101271B (en) | Hyperspectral remote sensing image classification method and device | |
CN115496934A (en) | Hyperspectral image classification method based on twin neural network | |
CN105512684B (en) | Logo automatic identifying method based on principal component analysis convolutional neural networks | |
CN112347888B (en) | Remote sensing image scene classification method based on bi-directional feature iterative fusion | |
CN108460391B (en) | Hyperspectral image unsupervised feature extraction method based on generation countermeasure network | |
CN109858557B (en) | Novel semi-supervised classification method for hyperspectral image data | |
Sood et al. | An implementation and analysis of deep learning models for the detection of wheat rust disease | |
Chu et al. | Hyperspectral image classification based on discriminative locality preserving broad learning system | |
CN111639719A (en) | Footprint image retrieval method based on space-time motion and feature fusion | |
CN110472682A (en) | A kind of Hyperspectral Remote Sensing Imagery Classification method for taking space and local feature into account | |
Li et al. | Hyperspectral image recognition using SVM combined deep learning | |
CN113139513B (en) | Spatial spectrum active learning hyperspectral classification method based on superpixel profile and improved PSO-ELM | |
CN110147725A (en) | A kind of high spectrum image feature extracting method for protecting projection based on orthogonal index office | |
CN114155443A (en) | Hyperspectral image classification method based on multi-receptive-field attention network | |
CN114723994A (en) | Hyperspectral image classification method based on dual-classifier confrontation enhancement network | |
CN113052130B (en) | Hyperspectral image classification method based on depth residual error network and edge protection filtering | |
Dai et al. | Research on hyper-spectral remote sensing image classification by applying stacked de-noising auto-encoders neural network | |
CN117132884A (en) | Crop remote sensing intelligent extraction method based on land parcel scale | |
CN117011595A (en) | Hyperspectral image feature extraction method based on approximate NMR model | |
Huang et al. | Hyperspectral Image Classification via Cross-Domain Few-Shot Learning With Kernel Triplet Loss | |
CN103530658B (en) | A kind of plant leaf blade data recognition methods based on rarefaction representation | |
CN115601634A (en) | Image blade identification method and device based on hierarchical attention mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |