CN114037896A - PolSAR terrain fine classification method based on multi-index convolution self-encoder


Info

Publication number
CN114037896A
CN114037896A
Authority
CN
China
Prior art keywords
encoder
polsar
slice
index
polarization
Prior art date
Legal status
Granted
Application number
CN202111319995.4A
Other languages
Chinese (zh)
Other versions
CN114037896B (en)
Inventor
艾加秋
黄默
王非凡
范高伟
裴志林
Current Assignee
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202111319995.4A priority Critical patent/CN114037896B/en
Publication of CN114037896A publication Critical patent/CN114037896A/en
Application granted granted Critical
Publication of CN114037896B publication Critical patent/CN114037896B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 — Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N3/045 — Neural networks; combinations of networks
    • G06N3/084 — Learning methods; backpropagation, e.g. using gradient descent


Abstract

The invention discloses a PolSAR terrain fine classification method based on a multi-index convolution self-encoder, which comprises the following steps: 1. preprocessing and slicing the PolSAR data; 2. extracting feature maps of the PolSAR image through convolution and pooling operations, inputting the obtained feature maps into an MI-SE module to calculate multiple indexes, and obtaining a weight for each dimension of the input feature map; 3. weighting the feature map according to the weights assigned to it, to obtain the final features; 4. inputting the final features into a classifier to obtain a prediction result, and comparing the prediction result with the real result to complete the training of the model; 5. inputting the PolSAR image to be classified into the trained multi-index convolution self-encoder model to obtain the classification result. The method can improve the completeness and fineness of PolSAR surface feature characterization, achieves higher classification accuracy and efficiency, and has good engineering application value.

Description

PolSAR terrain fine classification method based on multi-index convolution self-encoder
Technical Field
The invention relates to the technical field of PolSAR terrain fine classification, in particular to a PolSAR terrain fine classification method based on a multi-index convolution self-encoder.
Background
Polarimetric synthetic aperture radar (PolSAR), as a high-resolution imaging sensor, can provide detailed ground object information. Compared with optical images, PolSAR can obtain more effective and comprehensive ground feature information under all-weather, day-and-night conditions. Therefore, using PolSAR data for ground feature classification can effectively improve classification accuracy.
Traditional PolSAR terrain classification methods can be divided into unsupervised classification, which requires no training sample labels, and supervised classification, which does. Although unsupervised classification does not consume large amounts of human resources, it runs slowly and its classification accuracy is low. Supervised classification based on machine learning mainly comprises two stages, feature extraction and classification; its performance is mainly limited by the representation capability of the features, and classification performance degrades sharply when the extracted features can hardly highlight the differences between ground object classes. Supervised classification algorithms based on deep learning have been proposed in recent years, but as the complexity of deep networks increases, the training process generally requires a large number of samples and considerable time. However, the scarcity of labeled samples is precisely the weak point of the PolSAR terrain classification problem. To address this problem, the convolutional auto-encoder (CAE) has attracted particular interest, because a CAE can train network parameters through its decoder under unsupervised conditions, thereby reducing the number of samples required for the final classification. However, owing to the shallow depth of the network, the extracted features cannot completely represent the information of the target and are easily affected by interference, resulting in poor classification results.
Many methods have been proposed to improve the CAE network, addressing the amount of sample data, the network depth, the loss function, and so on. For example, the stacked convolutional auto-encoder (SCAE) deepens the whole network by stacking multiple CAE networks and learns the network parameters with layer-wise training, thereby acquiring deeper ground feature information; other algorithms use the Wishart distance as the reconstruction loss of the CAE network. However, these improved algorithms merely stack adjacent features into different feature map channels, so their representations of the different ground object classes are neither complete nor fine-grained.
Disclosure of Invention
The invention aims to overcome the defects in the prior art, and provides a PolSAR terrain fine classification method based on a multi-index convolution self-encoder, so as to improve the completeness and fineness of PolSAR terrain feature characterization and obtain higher classification accuracy and efficiency.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention discloses a PolSAR terrain fine classification method based on a multi-index convolution self-encoder, which is characterized by comprising the following steps:
step 1: preprocessing and cutting operations of PolSAR data:
step 1.1: the scattering characteristic of any minimum resolution unit in the PolSAR data is expressed by a polarization scattering matrix S; any polarization scattering matrix S in the PolSAR data is converted into a vector K according to the reciprocity principle and the Pauli decomposition principle:

K = \frac{1}{\sqrt{2}}\left[S_{HH}+S_{VV},\; S_{HH}-S_{VV},\; 2S_{HV}\right]^{T}

where S_{HH} denotes the complex scattering coefficient when both the transmitting and receiving polarizations are horizontal, S_{VV} denotes the complex scattering coefficient when both the transmitting and receiving polarizations are vertical, and S_{HV} denotes the complex scattering coefficient when the transmitting polarization is vertical and the receiving polarization is horizontal;
step 1.2: the polarization coherency matrix T is obtained from the vector K by equation (1):

T = K K^{H} = \begin{bmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{bmatrix}   (1)

In equation (1), (·)^{H} denotes the conjugate transpose, and T_{ij} denotes the element in the i-th row and j-th column of the polarization coherency matrix T, with i, j = 1, 2, 3;
step 1.3: a six-dimensional feature vector F = [A, B, C, D, E, F] is extracted from the polarization coherency matrix T by equation (2):

\begin{cases} A = 10\log_{10}(\mathrm{Span}) \\ B = T_{22}/\mathrm{Span},\quad C = T_{33}/\mathrm{Span} \\ D = |T_{12}|/\sqrt{T_{11}T_{22}},\quad E = |T_{13}|/\sqrt{T_{11}T_{33}},\quad F = |T_{23}|/\sqrt{T_{22}T_{33}} \end{cases}   (2)

In equation (2), Span = T_{11} + T_{22} + T_{33} denotes the total scattering power of all polarization channels; A is the decibel form of the total scattering power Span; B and C are the normalized diagonal elements T_{22} (row 2, column 2) and T_{33} (row 3, column 3); and D, E and F are three relative correlation coefficients;
step 1.4: carrying out normalization operation on the six-dimensional feature vector F to obtain a normalized six-dimensional feature vector F', so that the normalized six-dimensional feature vectors of all minimum resolution units form preprocessed PolSAR data;
step 1.5: the preprocessed PolSAR data are cut into slices of size L × L × 6, giving a slice set {s_1, ..., s_n, ..., s_N} of the PolSAR data, where s_n denotes the n-th slice, N denotes the total number of slices, and n ∈ [1, N];
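The preprocessing of step 1 can be sketched in NumPy as follows; the helper names and the exact feature normalizations of equation (2) are assumptions consistent with the description, not taken verbatim from the patent:

```python
import numpy as np

def pauli_vector(S):
    """Pauli scattering vector K from a 2x2 scattering matrix S (hypothetical helper)."""
    s_hh, s_hv, s_vv = S[0, 0], S[0, 1], S[1, 1]
    return np.array([s_hh + s_vv, s_hh - s_vv, 2 * s_hv]) / np.sqrt(2)

def coherency_matrix(K):
    """Polarization coherency matrix T = K K^H of equation (1)."""
    K = K.reshape(3, 1)
    return K @ K.conj().T

def six_dim_features(T):
    """Six-dimensional feature vector F = [A, B, C, D, E, F] as in step 1.3.
    The Span-normalization of B, C and the correlation forms of D, E, F are
    assumptions consistent with the surrounding text."""
    span = (T[0, 0] + T[1, 1] + T[2, 2]).real
    A = 10 * np.log10(span)                                   # decibel form of Span
    B = T[1, 1].real / span                                   # normalized T22
    C = T[2, 2].real / span                                   # normalized T33
    D = abs(T[0, 1]) / np.sqrt(T[0, 0].real * T[1, 1].real)   # relative correlation
    E = abs(T[0, 2]) / np.sqrt(T[0, 0].real * T[2, 2].real)
    F = abs(T[1, 2]) / np.sqrt(T[1, 1].real * T[2, 2].real)
    return np.array([A, B, C, D, E, F])
```

Normalizing these per-pixel vectors and stacking them over the image yields the preprocessed data that step 1.5 cuts into L × L × 6 slices.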
Step 2: a classification network based on the multi-index convolution self-encoder is constructed, comprising an encoder, a decoder and a classifier, and the slice set {s_1, ..., s_n, ..., s_N} is input into the classification network;
step 2.1: the encoder consists of m convolutional layers, m pooling layers, m MI-SE modules, m weighting modules and a fully connected layer;
the nth slice snInputting the feature map into the encoder, and obtaining a feature map U with the dimension of H multiplied by W multiplied by D after the processing of the first convolution layer and the pooling layer, wherein H and W respectively represent the height and the width of the feature map U, and D represents the channel number of the feature map U;
the characteristic diagram U is input into a first MI-SE module, and a multi-index is calculated by using an equation (3), wherein the multi-index comprises the following steps: first moment S1Second central moment S2And coefficient of variation S3
Figure BDA0003345252680000031
In formula (3), U (a, b:) represents all the elements in the a-th row and b-th column of the characteristic diagram U;
the first moment S1Second central moment S2And coefficient of variation S3Splicing the data into a multi-index matrix V with the size of 3 xD, and processing the multi-index matrix V through a convolution layer and a full connection layer in the first MI-SE module to obtain a weight matrix Y with the size of 1 xD;
the first weighting module multiplies the weighting matrix Y with the characteristic diagram U to obtain a weighted characteristic diagram U' with the size of H multiplied by W multiplied by D;
the weighted feature graph U' sequentially passes through the operations of the rest m-1 convolution layers, the pooling layer, the MI-SE module and the weighting module of the encoder, and finally the output features pass through a full connection layer to obtain a feature vector alpha output by the encoder;
step 2.2: the decoder is formed by sequentially alternating k convolutional layers and k upsampling layers;
the feature vector alpha is input into a decoder and the nth slice s is obtainednReconstructed slice of
Figure BDA0003345252680000032
Re-use formula (4) to establish reverse transmissionLoss function L of broadcastMSE
Figure BDA0003345252680000033
In the formula (4), MSE represents the minimum mean square error; p is the number of elements per slice,
Figure BDA0003345252680000041
for the nth slice snThe c-th element in the slice reconstructed after the decoder,
Figure BDA0003345252680000042
for the nth slice snThe c-th element of (1);
The encoder and decoder are trained using back propagation; training stops when the loss function L_{MSE} reaches its minimum, thereby obtaining the optimal feature vector α* output by the trained encoder;
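The per-slice reconstruction loss of equation (4) can be written as a short sketch (a plain mean squared error over the P slice elements; the batching over the N slices during back propagation is omitted):

```python
import numpy as np

def reconstruction_loss(s, s_hat):
    """L_MSE of equation (4) for one slice: mean squared error between a slice s
    and its reconstruction s_hat, averaged over the P elements of the slice."""
    s = np.asarray(s, dtype=float).ravel()
    s_hat = np.asarray(s_hat, dtype=float).ravel()
    P = s.size
    return ((s_hat - s) ** 2).sum() / P
```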
Step 2.3: the classifier consists of t fully connected layers and a softmax layer;
the optimal feature vector α*Inputting the n-th slice s into a classifier, firstly reducing the dimension through t full-connected layers, then inputting the feature vector subjected to dimension reduction into a softmax layer, and calculating the n-th slice snThe central element of (a) corresponds to the posterior probability of each class, and the class corresponding to the maximum posterior probability is selected as the nth slice snThe prediction category of (1);
step 2.4: the prediction categories of the N slices are obtained in the manner of steps 2.1 to 2.3 and compared with the corresponding real categories, and the classifier is optimized through back propagation, thereby completing the training of the encoder, decoder and classifier and obtaining a trained multi-index convolution self-encoder model for classifying the PolSAR data to be classified.
Compared with the prior art, the invention has the beneficial effects that:
1. The PolSAR terrain fine classification method based on the multi-index convolution self-encoder effectively improves the completeness and fineness of feature representation in the PolSAR terrain classification process, weakens the influence of individual extreme image values on the overall features, and realizes fine PolSAR terrain classification.
2. The invention provides an MI-SE module, which solves the problem that in a traditional CAE network the feature map output by a convolutional layer is passed to the next layer with all channels treated equally; the information in the feature map is used to weight the output of the convolutional layer effectively, so that ground object features are better extracted and characterized. Features carrying different information are weighted differently, and more complete and fine feature information is obtained.
3. The invention proposes three indexes for measuring the importance of different features: the first moment, the second central moment and the coefficient of variation. The first moment reflects the average value of a feature; the second central moment describes the fluctuation within each feature, reflecting to some degree the amount of edge and texture information; and the coefficient of variation eliminates the influence of the average value on the dispersion, thereby reflecting the fluctuation better. The three indexes complement each other and realize the weighting of each feature from different angles.
4. By weighting the convolution results with the three indexes, the invention significantly reduces the differences within ground object classes and increases the differences between them, making it easier for the final softmax layer to separate each category of terrain. To illustrate how the three different indexes improve the separability between classes, the invention uses the t-distributed stochastic neighbor embedding algorithm to visualize the intermediate feature maps; the results after 100 network iterations are shown in FIG. 3.
Drawings
FIG. 1 is a MI-SE block structure of the present invention;
FIG. 2 is a flow chart of the PolSAR image ground object target classification method of the present invention;
FIG. 3 shows the intermediate feature maps of the present invention visualized with the t-distributed stochastic neighbor embedding algorithm;
FIG. 4 is a set of experimental data of the present invention, San Francisco;
FIG. 5 is a graph of the results of the classification of the method of the present invention and other methods on a San Francisco dataset.
Detailed Description
In this embodiment, a multi-index squeeze-and-excitation (MI-SE) module shown in FIG. 1 is added to a conventional CAE network, and a classification network based on a multi-index convolution self-encoder shown in FIG. 2 is proposed. The network effectively takes into account both the detail features and the deep features of PolSAR ground objects, and guides the model during training to pay different degrees of attention to different features, ensuring the completeness and fineness of PolSAR ground feature characterization, reducing the differences within each ground object class and increasing the differences between classes, thereby obtaining better classification results. Specifically, the PolSAR terrain fine classification method based on the multi-index convolution self-encoder comprises the following steps:
step 1: preprocessing and cutting operations of PolSAR data:
step 1.1: the scattering characteristic of any minimum resolution unit in the PolSAR data is expressed by a polarization scattering matrix S; any polarization scattering matrix S in the PolSAR data is converted into a vector K according to the reciprocity principle and the Pauli decomposition principle:

K = \frac{1}{\sqrt{2}}\left[S_{HH}+S_{VV},\; S_{HH}-S_{VV},\; 2S_{HV}\right]^{T}

where S_{HH} denotes the complex scattering coefficient when both the transmitting and receiving polarizations are horizontal, S_{VV} denotes the complex scattering coefficient when both the transmitting and receiving polarizations are vertical, and S_{HV} denotes the complex scattering coefficient when the transmitting polarization is vertical and the receiving polarization is horizontal;
step 1.2: the polarization coherency matrix T is obtained from the vector K by equation (1):

T = K K^{H} = \begin{bmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{bmatrix}   (1)

In equation (1), (·)^{H} denotes the conjugate transpose, and T_{ij} denotes the element in the i-th row and j-th column of the polarization coherency matrix T, with i, j = 1, 2, 3;
Step 1.3: a six-dimensional feature vector F = [A, B, C, D, E, F] is extracted from the polarization coherency matrix T by equation (2):

\begin{cases} A = 10\log_{10}(\mathrm{Span}) \\ B = T_{22}/\mathrm{Span},\quad C = T_{33}/\mathrm{Span} \\ D = |T_{12}|/\sqrt{T_{11}T_{22}},\quad E = |T_{13}|/\sqrt{T_{11}T_{33}},\quad F = |T_{23}|/\sqrt{T_{22}T_{33}} \end{cases}   (2)

In equation (2), Span = T_{11} + T_{22} + T_{33} denotes the total scattering power of all polarization channels; A is the decibel form of the total scattering power Span; B and C are the normalized diagonal elements T_{22} (row 2, column 2) and T_{33} (row 3, column 3); and D, E and F are three relative correlation coefficients;
step 1.4: carrying out normalization operation on the six-dimensional feature vector F to obtain a normalized six-dimensional feature vector F', so that the normalized six-dimensional feature vectors of all minimum resolution units form preprocessed PolSAR data;
step 1.5: the preprocessed PolSAR data are cut into slices of size L × L × 6, giving a slice set {s_1, ..., s_n, ..., s_N} of the PolSAR data, where s_n denotes the n-th slice, N denotes the total number of slices, and n ∈ [1, N]; in this embodiment, the data are cut into slices of size L × L × 6 = 15 × 15 × 6, the total number of slices being 0.5% of the amount of labeled data in the data set;
step 2: a classification network based on the multi-index convolution self-encoder is constructed, comprising an encoder, a decoder and a classifier, and the slice set {s_1, ..., s_n, ..., s_N} is input into the classification network;
step 2.1: the encoder consists of m convolutional layers, m pooling layers, m MI-SE modules, m weighting modules and a fully connected layer; in this embodiment, two convolutional layers, two pooling layers, two MI-SE modules, two weighting modules and one fully connected layer are constructed;
the nth slice snInputting into encoder, and processing with first convolutional layer and pooling layer to obtain final product with dimension of H × W × DA feature diagram U, wherein H and W represent the height and width of the feature diagram U respectively, and D represents the channel number of the feature diagram U;
the feature map U is input into a first MI-SE module, and a multi-index is calculated by using an equation (3), wherein the multi-index comprises the following steps: first moment S1Second central moment S2And coefficient of variation S3
Figure BDA0003345252680000071
In formula (3), U (a, b:) represents all the elements in the a-th row and b-th column of the characteristic diagram U;
the first order moment S1Second central moment S2And coefficient of variation S3Splicing the multi-index matrix V into a multi-index matrix V with the size of 3 multiplied by D, and processing the multi-index matrix V through a convolution layer and a full connection layer in a first MI-SE module to obtain a weight matrix Y with the size of 1 multiplied by D;
the first weighting module multiplies the weighting matrix Y with the characteristic diagram U to obtain a weighted characteristic diagram U' with the size of H multiplied by W multiplied by D;
the weighted feature graph U' sequentially passes through the operations of the rest m-1 convolutional layers, the pooling layer, the MI-SE module and the weighting module of the encoder, and finally the output features pass through a full connection layer to obtain a feature vector alpha output by the encoder;
step 2.2: the decoder is formed by sequentially alternating k convolutional layers and k upsampling layers;
the feature vector alpha is input into the decoder and the nth slice s is obtainednReconstructed slice of
Figure BDA0003345252680000072
Then, the formula (4) is used to establish a back propagation loss function LMSE
Figure BDA0003345252680000073
In the formula (4), MSE represents the minimum mean square error; p is the number of elements per slice,
Figure BDA0003345252680000074
for the nth slice snThe c-th element in the slice reconstructed after the decoder,
Figure BDA0003345252680000075
for the nth slice snThe c-th element of (1);
The encoder and decoder are trained using back propagation; training stops when the loss function L_{MSE} reaches its minimum, thereby obtaining the optimal feature vector α* output by the trained encoder; in this embodiment, three convolutional layers and three upsampling layers are provided in the decoder;
step 2.3: the classifier consists of t fully connected layers and a softmax layer;
The optimal feature vector α* is input into the classifier; its dimension is first reduced by the t fully connected layers, then the dimension-reduced feature vector is input into the softmax layer to calculate the posterior probability of each class for the central element of the n-th slice s_n, and the class with the maximum posterior probability is selected as the prediction category of the n-th slice s_n; in this embodiment, two fully connected layers and one softmax layer are constructed in the classifier;
step 2.4: the prediction categories of the N slices are obtained in the manner of steps 2.1 to 2.3 and compared with the corresponding real categories, and the classifier is optimized through back propagation, thereby completing the training of the encoder, decoder and classifier and yielding a trained multi-index convolution self-encoder model for classifying the PolSAR data to be classified.
So far, the fine classification method of the PolSAR terrain based on the multi-index convolution self-encoder is basically completed.
The effectiveness of the present invention is further illustrated by the San Francisco data set experiment below.
San Francisco data set PolSAR image target classification experiment:
1. experimental setup:
the experimental data set is an L-band fully polarimetric synthetic aperture radar image collected from san francisco by the Jet Propulsion Laboratory (JPL) airborne synthetic aperture radar platform of the united states space agency. The size of the data set is 1300 x 1200, and the data set comprises five land features of low-density cities, high-density cities, developed cities, water bodies and vegetation. To input the data set into the multi-index convolutional autocoder model, the size of the data slice was set to 15 × 15 in the experiment, and in addition, the number of training samples in the data set was randomly selected according to 0.5% of the total samples. The data set is shown in fig. 3 and the result of the final classification is shown in fig. 4. Fig. 3 (a) shows raw data. Fig. 3 (b) shows a case where data can be separated when the conventional CAE is used. Fig. 3 (c) shows the case where data can be separated after CAE is added to the first moment. Fig. 3 (d) shows the case where data can be separated after CAE adds the second central moment. Fig. 3 (e) shows the separable data after the CAE is added with the coefficient of variation. FIG. 3 (f) shows a data separable case when the method of the present invention is employed;
2. and (4) analyzing results:
In this embodiment's experiments, the performance of the proposed method is quantitatively analyzed using the overall accuracy (OA) and the Kappa coefficient. To illustrate the superiority of the proposed method, several common PolSAR image ground object classification methods are selected for comparison, most of which are improvements based on the convolutional neural network (CNN): the completed local binary pattern fusion CNN (CLBP-CNN), the combination of CNN and Markov random field (MRF-CNN), the polarimetric squeeze-and-excitation network (PSE-NET), the complex-valued CNN (CV-CNN), VGG-NET, the residual neural network (Res-Net), and CAE. A batch size of 64, 100 iterations, a learning rate of 0.001 and the Adam algorithm are used in the training of all methods. The comparative results are shown in Table 1, where:
OA = \frac{1}{N}\sum_{i=1}^{T} c_{ii}   (5)

Kappa = \frac{N\sum_{i=1}^{T} c_{ii} - \sum_{i=1}^{T}\left(\sum_{j=1}^{T} c_{ij}\sum_{j=1}^{T} c_{ji}\right)}{N^{2} - \sum_{i=1}^{T}\left(\sum_{j=1}^{T} c_{ij}\sum_{j=1}^{T} c_{ji}\right)}   (6)

In equations (5) and (6), T and N denote the number of classes and the total number of samples in the classification task respectively, c_{ii} corresponds to the diagonal elements of the confusion matrix, and c_{ij} corresponds to all elements of the confusion matrix.
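Equations (5) and (6) can be computed from a confusion matrix as in this short sketch:

```python
import numpy as np

def overall_accuracy(C):
    """Overall accuracy, equation (5): sum of the diagonal of the confusion
    matrix divided by the total number of samples."""
    C = np.asarray(C, dtype=float)
    return np.trace(C) / C.sum()

def kappa_coefficient(C):
    """Kappa coefficient, equation (6), from a T x T confusion matrix C."""
    C = np.asarray(C, dtype=float)
    n = C.sum()
    po = np.trace(C) / n                                   # observed agreement (= OA)
    pe = (C.sum(axis=0) * C.sum(axis=1)).sum() / n ** 2    # chance agreement from marginals
    return (po - pe) / (1 - pe)
```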
TABLE 1. Performance evaluation indexes of the PolSAR terrain fine classification method based on the multi-index convolution self-encoder
(Table 1 is reproduced as an image in the original publication; its values are not available in this text.)
The analysis is performed in conjunction with FIG. 5 and Table 1, where FIG. 5 (a) shows the original label map, FIG. 5 (b)–(h) show the classification results of the comparison algorithms, and FIG. 5 (i) shows the classification result of the present invention. Compared with other common PolSAR terrain classification algorithms, the PolSAR terrain fine classification algorithm based on the multi-index convolution self-encoder has better continuity in its classification results, i.e., fewer isolated points in the classification result image. Among the comparison methods, CLBP-CNN fuses the CLBP descriptor of edge and texture features, but because the CLBP data are high-dimensional, the time spent on feature extraction and training in practical applications is high. MRF-CNN is limited by its underlying network; when the classification task is complex it can hardly remove misclassified regions, and its classification performance may even degrade. PSE-NET increases the correlation among information through its SE model and thereby improves classification performance, but texture features are still lost, and its increased number of parameters requires more sample data in training, otherwise the classification results are poor. The internal network parameters of CV-CNN are also complex-valued, so the number of parameters to be trained is large. VGG-Net and Res-Net are very deep networks and also place very high demands on the amount of training sample data, otherwise the final classification effect is poor.
The method of the invention embeds the newly proposed MI-SE module into the CAE, effectively suppresses the interference information in the data set, and applies multi-index weighting to the features extracted by the convolutional and pooling layers, thereby greatly improving feature separability. The MI-SE module proposes three indexes: the first moment reflects the average value of a feature; the second central moment describes the fluctuation within each feature, reflecting to some degree the amount of edge and texture information; and the coefficient of variation eliminates the influence of the average value on the dispersion, thereby reflecting the fluctuation better. The three indexes complement each other and realize the weighting of each feature from different angles.
In conclusion, the PolSAR terrain fine classification method based on the multi-index convolution self-encoder has the characteristics of simple structure, accurate classification and high calculation efficiency, and has high application value in practical engineering.

Claims (1)

1. A PolSAR terrain fine classification method based on a multi-index convolution self-encoder is characterized by comprising the following steps:
step 1: preprocessing and cutting operations of PolSAR data:
step 1.1: the scattering characteristic of any minimum resolution unit in the PolSAR data is expressed by a polarization scattering matrix S; according to the reciprocity principle and the Pauli decomposition principle, any polarization scattering matrix S in the PolSAR data is converted into a vector k:

k = (1/√2) · [S_HH + S_VV,  S_HH − S_VV,  2·S_HV]^T
wherein S_HH represents the complex scattering coefficient when both the transmitting and receiving polarization modes are horizontal polarization, S_VV represents the complex scattering coefficient when both the transmitting and receiving polarization modes are vertical polarization, and S_HV represents the complex scattering coefficient when the transmitting polarization mode is vertical polarization and the receiving polarization mode is horizontal polarization;
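For illustration only (not part of the claim), the conversion of step 1.1 can be sketched in a few lines; the function name and sample values are hypothetical:

```python
import numpy as np

def pauli_vector(s_hh: complex, s_vv: complex, s_hv: complex) -> np.ndarray:
    """Pauli scattering vector k of a reciprocal (monostatic) PolSAR cell."""
    return (1.0 / np.sqrt(2.0)) * np.array(
        [s_hh + s_vv, s_hh - s_vv, 2.0 * s_hv], dtype=complex
    )

# A purely odd-bounce (surface) scatterer has S_HH = S_VV and S_HV = 0,
# so all power falls into the first Pauli component.
k = pauli_vector(1 + 0j, 1 + 0j, 0j)
```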
step 1.2: the polarization coherence matrix T is obtained from the vector k by using equation (1):

T = k·k^H,  with elements T_ij, i, j = 1, 2, 3    (1)

in equation (1), (·)^H denotes the conjugate transpose, and T_ij denotes the element in the i-th row and j-th column of the polarization coherence matrix T;
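A minimal sketch of equation (1), assuming a 3-element Pauli vector k as defined in step 1.1; the names are illustrative:

```python
import numpy as np

def coherency_matrix(k: np.ndarray) -> np.ndarray:
    """Polarization coherence matrix T = k * k^H of equation (1)."""
    k = k.reshape(3, 1)
    return k @ k.conj().T          # 3 x 3, Hermitian by construction

k = np.array([1 + 1j, 0.5j, 0.2 + 0j])
T = coherency_matrix(k)
# trace(T) equals |k|^2, i.e. the total scattered power Span.
```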
step 1.3: extracting a six-dimensional feature vector F = [A, B, C, D, E, F] from the polarization coherence matrix T by using equation (2):

A = 10·lg(Span)
B = T22 / Span
C = T33 / Span
D = |T12| / √(T11·T22)    (2)
E = |T13| / √(T11·T33)
F = |T23| / √(T22·T33)

in equation (2), Span represents the total scattered power of all polarization channels, Span = T11 + T22 + T33; A represents the decibel form of the total scattered power Span; B and C represent the row-2, column-2 element T22 and the row-3, column-3 element T33 normalized by Span, respectively; D, E and F represent three relative correlation coefficients;
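The six features of equation (2) can be sketched as follows; the exact definitions of the relative correlation coefficients D, E, F are an assumption based on common PolSAR practice and may differ from the patent's:

```python
import numpy as np

def six_dim_features(T: np.ndarray) -> np.ndarray:
    """Six real-valued features of equation (2) from a 3x3 coherence matrix T.
    D, E, F use the common relative correlation coefficients (an assumption)."""
    span = (T[0, 0] + T[1, 1] + T[2, 2]).real
    A = 10.0 * np.log10(span)                     # total power in decibels
    B = T[1, 1].real / span                       # normalized T22
    C = T[2, 2].real / span                       # normalized T33
    D = abs(T[0, 1]) / np.sqrt(T[0, 0].real * T[1, 1].real)
    E = abs(T[0, 2]) / np.sqrt(T[0, 0].real * T[2, 2].real)
    F = abs(T[1, 2]) / np.sqrt(T[1, 1].real * T[2, 2].real)
    return np.array([A, B, C, D, E, F])

k = np.array([1.0 + 0.5j, 0.8j, 0.3 + 0.1j])
T = np.outer(k, k.conj())                         # rank-1 coherence matrix
feats = six_dim_features(T)
```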
step 1.4: carrying out normalization operation on the six-dimensional feature vector F to obtain a normalized six-dimensional feature vector F', so that the normalized six-dimensional feature vectors of all minimum resolution units form preprocessed PolSAR data;
step 1.5: cutting the preprocessed PolSAR data into slices of size L × L × 6, thereby obtaining a slice set {s_1, ..., s_n, ..., s_N} of the PolSAR data, in which s_n represents the n-th slice, N represents the total number of slices, and n ∈ [1, N];
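A hypothetical sketch of the slicing in step 1.5, assuming each slice is an L × L × 6 window centered on a pixel; padding behaviour is not specified in the claim and is omitted here:

```python
import numpy as np

def cut_slices(features: np.ndarray, L: int) -> np.ndarray:
    """Cut an H0 x W0 x 6 feature cube into L x L x 6 slices, one centered
    on every pixel with a full L x L neighbourhood (no padding)."""
    H0, W0, _ = features.shape
    half = L // 2
    slices = [
        features[i - half : i + half + 1, j - half : j + half + 1, :]
        for i in range(half, H0 - half)
        for j in range(half, W0 - half)
    ]
    return np.stack(slices)

cube = np.random.rand(10, 12, 6)   # toy preprocessed PolSAR data
s = cut_slices(cube, L=5)          # (10-4) * (12-4) = 48 slices
```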
Step 2: constructing a classification network based on the multi-index convolution self-encoder, the network comprising: an encoder, a decoder, and a classifier; the slice set {s_1, ..., s_n, ..., s_N} is input into the classification network;
step 2.1: the encoder consists of m convolution layers, m pooling layers, m MI-SE modules, m weighting modules and a fully connected layer;
the n-th slice s_n is input into the encoder and, after being processed by the first convolution layer and pooling layer, a feature map U of dimension H × W × D is obtained, where H and W respectively denote the height and width of the feature map U, and D denotes the number of channels of the feature map U;
the feature map U is input into the first MI-SE module, and the multiple indices are calculated using equation (3): the first-order moment S_1, the second-order central moment S_2, and the coefficient of variation S_3:

S_1 = (1/(H·W)) · Σ_{a=1}^{H} Σ_{b=1}^{W} U(a, b, :)
S_2 = (1/(H·W)) · Σ_{a=1}^{H} Σ_{b=1}^{W} (U(a, b, :) − S_1)²    (3)
S_3 = √(S_2) / S_1

in equation (3), U(a, b, :) represents all the elements in the a-th row and b-th column of the feature map U, so that S_1, S_2 and S_3 are each 1 × D vectors computed channel by channel;
the first-order moment S_1, the second-order central moment S_2 and the coefficient of variation S_3 are concatenated into a multi-index matrix V of size 3 × D; the multi-index matrix V is then processed by a convolution layer and a fully connected layer in the first MI-SE module to obtain a weight matrix Y of size 1 × D;
the first weighting module multiplies the weight matrix Y channel-wise with the feature map U to obtain a weighted feature map U' of size H × W × D;
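The index computation of equation (3) and the channel-wise weighting can be sketched as follows; the convolution and fully connected layers that map V to Y are replaced by a stand-in, so the Y below is not learned:

```python
import numpy as np

def mi_se_indices(U: np.ndarray) -> np.ndarray:
    """Multi-index matrix V (3 x D) of equation (3) for an H x W x D map U."""
    s1 = U.mean(axis=(0, 1))                 # first-order moment, 1 x D
    s2 = ((U - s1) ** 2).mean(axis=(0, 1))   # second-order central moment
    s3 = np.sqrt(s2) / s1                    # coefficient of variation
    return np.stack([s1, s2, s3])

U = np.random.rand(8, 8, 16) + 0.1           # keep channel means away from 0
V = mi_se_indices(U)
# The conv + fully connected layers that map V to the 1 x D weight matrix Y
# are omitted; Y is stubbed with ones, so the weighting leaves U unchanged.
Y = np.ones(16)
U_weighted = U * Y                           # channel-wise weighting, H x W x D
```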
the weighted feature map U' then passes sequentially through the remaining m−1 convolution layers, pooling layers, MI-SE modules and weighting modules of the encoder, and the resulting features finally pass through a fully connected layer to obtain the feature vector α output by the encoder;
step 2.2: the decoder is formed by alternately connecting k convolution layers and k up-sampling layers in sequence;
the feature vector α is input into the decoder to obtain the reconstructed slice ŝ_n of the n-th slice s_n; then equation (4) is used to establish the back-propagation loss function L_MSE:

L_MSE = (1/P) · Σ_{c=1}^{P} (ŝ_n^c − s_n^c)²    (4)

in equation (4), MSE denotes the mean square error; P is the number of elements per slice; ŝ_n^c is the c-th element of the slice reconstructed by the decoder from the n-th slice s_n, and s_n^c is the c-th element of the n-th slice s_n;
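A minimal sketch of the loss in equation (4), assuming a slice is compared element-wise with its reconstruction; the toy values are hypothetical:

```python
import numpy as np

def mse_loss(s: np.ndarray, s_hat: np.ndarray) -> float:
    """Equation (4): mean square error over the P elements of a slice."""
    P = s.size
    return float(((s_hat - s) ** 2).sum() / P)

s = np.zeros((5, 5, 6))            # toy slice
s_hat = np.full((5, 5, 6), 0.1)    # toy reconstruction
loss = mse_loss(s, s_hat)          # every element contributes 0.1^2
```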
the encoder and decoder are trained using back propagation; training stops when the loss function L_MSE reaches its minimum, thereby obtaining the optimal feature vector α* output by the trained encoder;
Step 2.3: the classifier consists of t fully connected layers and a softmax layer;
the optimal feature vector α*Inputting the n-th slice s into a classifier, firstly reducing the dimension through t full-connected layers, then inputting the feature vector subjected to dimension reduction into a softmax layer, and calculating the n-th slice snThe central element of (a) corresponds to the posterior probability of each class, and the class corresponding to the maximum posterior probability is selected as the nth slice snThe prediction category of (1);
step 2.4: the predicted classes of all N slices are obtained following the procedure of steps 2.1 to 2.3 and compared with the corresponding true classes, and the classifier is optimized through back propagation, thereby completing the training of the encoder, decoder and classifier and obtaining a trained multi-index convolutional self-encoder model for classifying the PolSAR data to be classified.
CN202111319995.4A 2021-11-09 2021-11-09 PolSAR ground object fine classification method based on multi-index convolution self-encoder Active CN114037896B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111319995.4A CN114037896B (en) 2021-11-09 2021-11-09 PolSAR ground object fine classification method based on multi-index convolution self-encoder


Publications (2)

Publication Number Publication Date
CN114037896A true CN114037896A (en) 2022-02-11
CN114037896B CN114037896B (en) 2024-02-13

Family

ID=80143590


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020162834A1 (en) * 2019-02-08 2020-08-13 Singapore Health Services Pte Ltd Method and system for classification and visualisation of 3d images
CN112434628A (en) * 2020-11-30 2021-03-02 西安理工大学 Small sample polarization SAR image classification method based on active learning and collaborative representation
CN112966779A (en) * 2021-03-29 2021-06-15 安徽大学 PolSAR image semi-supervised classification method
CN113240040A (en) * 2021-05-27 2021-08-10 西安理工大学 Polarized SAR image classification method based on channel attention depth network
CN113392871A (en) * 2021-04-06 2021-09-14 北京化工大学 Polarized SAR terrain classification method based on scattering mechanism multichannel expansion convolutional neural network


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHAO Quanhua; XIE Kailang; WANG Guanghui; LI Yu: "Fully polarimetric SAR land cover classification combining fully convolutional networks and conditional random fields", Acta Geodaetica et Cartographica Sinica, no. 01, 15 January 2020 (2020-01-15) *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant