CN113255727A - Multi-sensor remote sensing image fusion classification method based on a hierarchical dense fusion network - Google Patents
- Publication number
- CN113255727A CN113255727A CN202110446906.6A CN202110446906A CN113255727A CN 113255727 A CN113255727 A CN 113255727A CN 202110446906 A CN202110446906 A CN 202110446906A CN 113255727 A CN113255727 A CN 113255727A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/58—Extraction of image or video features relating to hyperspectral data
Abstract
The invention discloses a multi-sensor remote sensing image fusion classification method based on a hierarchical dense fusion network, and belongs to the field of remote sensing image processing. First, a network framework with three branches, namely space, spectrum and elevation, is introduced to extract the spatial and spectral features of a hyperspectral image and the spatial elevation features of a LiDAR image. Second, a modal attention mechanism for multi-sensor remote sensing images is proposed, which exploits the correlation and heterogeneity among data of different modalities to obtain their respective features. Then, the features obtained by the modal attention mechanism and the self-attention mechanism are fused by the Flatten and Concatenate operations of a convolutional neural network and classified by a softmax activation function, thereby realizing ground object classification based on multi-sensor remote sensing images.
Description
Technical Field
The invention relates to the field of remote sensing image processing, and in particular to a multi-sensor remote sensing image fusion classification method based on a hierarchical dense fusion network, which offers good fusion quality, high classification accuracy and strong multi-modal data interaction capability.
Background
As a contactless remote sensing technology, Earth observation with remote sensing imagery has been widely applied to land-cover classification. Among the many types of sensors, hyperspectral imagery provides a detailed spectral description of ground objects in a single image, which makes it possible to distinguish objects that have the same elevation but different spectral characteristics, such as a road surface and a lawn at the same height. However, the spatial resolution of hyperspectral images is not high, and phenomena such as spectral aliasing and "different objects with the same spectrum" are common, which seriously affects classification accuracy in complex scenes. LiDAR data, on the other hand, provides the height of ground objects and can better distinguish objects with the same spectral characteristics but different elevations. Because traditional LiDAR operates in a single waveband, the three-dimensional spatial information it acquires usually supports classification and recognition only of broad land-cover categories and cannot achieve fine interpretation. Taking the public hyperspectral and LiDAR-DSM images of a campus area in Florida as an example, although the hyperspectral image can accurately distinguish grassland from pavement, it cannot separate pavement from building roofs, because the two are made of the same material; conversely, although LiDAR accurately distinguishes buildings and pavement of different heights, it cannot effectively separate pavement and lawns of the same height. Hyperspectral imagery and LiDAR data therefore carry rich complementary information, and if this information can be fully exploited for joint land-cover analysis, the advantages of the two types of sensors can be fused, the performance of intelligent processing algorithms can be effectively improved, and ground object information can be analysed more comprehensively. In this context, fusion classification methods for hyperspectral imagery and LiDAR data have received much attention.
Mercier et al. introduced the support vector machine (SVM) with a nonlinear kernel function into remote sensing image classification in 2003. The extreme learning machine (ELM) was likewise introduced into remote sensing image classification by Li et al. in 2015 and achieved performance comparable to the SVM. However, the classification accuracy of these methods is low. Rasti et al. proposed a hyperspectral image-LiDAR data fusion classification method based on sparse and low-rank decomposition in 2017, which captures the spatial redundancy of image features through sparsity to improve the spatial smoothness of the fused features and effectively suppresses artifacts arising in the fusion process. However, owing to the curse of dimensionality of hyperspectral images, the accuracy of this method is clearly insufficient. Xue et al. proposed a hyperspectral image-LiDAR data fusion model based on coupled higher-order tensor decomposition in 2019, which extracts more latent features through the coupled tensor decomposition and, to a certain extent, overcomes the low classification accuracy and fusion artifacts of the above techniques.
In recent years, the computing power and data acquisition capabilities of computing devices have grown rapidly. The marked increase in computing power mitigates the inefficiency of training, and the marked increase in training data reduces the risk of overfitting. Complex models represented by deep learning techniques such as convolutional neural networks (CNN) are therefore increasingly applied to the classification of hyperspectral remote sensing images and obtain results superior to conventional machine learning methods. In 2017, Li et al. trained a CNN framework on the pixels of the hyperspectral image and realized pixel-level classification. However, this method treats each individual pixel as a whole and ignores the spectral characteristics specific to hyperspectral images, so the model accuracy is insufficient. In 2018, Xu et al. proposed a dual-branch convolutional neural network framework for hyperspectral image classification, introducing a spectral-domain branch and a spatial-domain branch to jointly classify the spectral and spatial features of the hyperspectral image. Because the spectral characteristics of the hyperspectral image are taken into account, the classification accuracy of this method is improved, but its accuracy for ground objects with the same elevation remains poor. On this basis, Hao et al. introduced the elevation information provided by LiDAR data into the dual-branch convolutional neural network structure in 2018 and proposed a hyperspectral image-LiDAR data collaborative fusion classification framework based on a neural network and composite kernels. The framework uses three convolutional-neural-network branches to extract the spectral, spatial and elevation features of ground objects respectively, thereby improving classification accuracy. However, although the multi-branch input structure reduces the information loss of different modalities during fusion, no connection is established between the spatial information of the two modalities. Therefore, in 2020, Hong et al. combined the generative adversarial network (GAN) with multi-modal deep learning and proposed a cross-modal network model with a GAN as the main framework. This method fuses shallow features in a single-stage manner and, to a certain extent, considers the spatial correlation and interactivity among features of different modalities. However, because it only processes the shallow features of each modality, the deep spatial correlation between multi-source data cannot be fully exploited, and there is still room to improve classification accuracy.
In general, jointly analysing the characteristics of different modalities of remote sensing images yields higher-quality fusion results. Unfortunately, most of the prior art performs single-stage feature fusion at the front end of the network, which usually ignores the correlation and interactivity of spatial information between features of different modalities, and a multi-branch network structure alone cannot establish a sufficient relationship between the spatial information of the two modalities. At present there is no fusion classification method that exploits the common and modality-specific characteristics of multi-sensor remote sensing images to increase cross-modal interaction and thereby markedly improve fusion quality and classification accuracy; existing technical solutions still suffer from weak cross-modal interaction, poor fusion quality and limited classification accuracy.
Disclosure of Invention
The invention aims to solve the above technical problems in the prior art and provides a multi-sensor remote sensing image fusion classification method based on a hierarchical dense fusion network, which offers good fusion quality, high classification accuracy and strong multi-modal data interaction capability.
The technical solution of the invention is as follows: a multi-sensor remote sensing image fusion classification method based on a hierarchical dense fusion network, characterized by comprising the following steps:
Step 1.1. Establish and initialize a sub-network N_featureSpa comprising 4 groups of convolutional layers, namely Conv2_0, Conv2_1, Conv2_2 and Conv2_3;
the Conv2_0 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 100 convolution kernels with the size of 3 x 3, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
the Conv2_1 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 100 convolution kernels with the size of 3 x 3, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
the Conv2_2 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 100 convolution kernels with the size of 1 × 1, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
the Conv2_3 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 100 convolution kernels with the size of 1 × 1, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
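The spatial feature extraction branch described in step 1.1 can be sketched as follows. This is an illustrative PyTorch sketch, not the reference implementation: the kernel counts (100), kernel sizes (3 × 3 and 1 × 1), stride of 1 pixel, BatchNorm and ReLU follow the text, while the input channel count and the padding that preserves the 11 × 11 block size are assumptions.

```python
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch, kernel_size):
    # one Conv2_x group: convolution + BatchNorm + ReLU, stride of 1 pixel
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size, stride=1, padding=kernel_size // 2),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class FeatureSpa(nn.Module):
    """Hypothetical sketch of the spatial branch N_featureSpa (Conv2_0 .. Conv2_3)."""
    def __init__(self, in_channels):          # in_channels is an assumption, not given in the text
        super().__init__()
        self.conv2_0 = conv_bn_relu(in_channels, 100, 3)   # 100 kernels, 3 x 3
        self.conv2_1 = conv_bn_relu(100, 100, 3)           # 100 kernels, 3 x 3
        self.conv2_2 = conv_bn_relu(100, 100, 1)           # 100 kernels, 1 x 1
        self.conv2_3 = conv_bn_relu(100, 100, 1)           # 100 kernels, 1 x 1

    def forward(self, x):                     # x: (B, in_channels, 11, 11) pixel blocks
        return self.conv2_3(self.conv2_2(self.conv2_1(self.conv2_0(x))))
```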
Step 1.2. Establish and initialize a sub-network N_featureSpe comprising 2 groups of convolutional layers, namely Conv1_0 and Conv1_1;
the Conv1_0 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 64 one-dimensional convolution kernels with the size of 11, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
the Conv1_1 comprises a 1-layer convolution operation, a 1-layer BatchNorm normalization operation and a 1-layer activation operation, wherein the convolution layer comprises 128 one-dimensional convolution kernels with the size of 3, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
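A corresponding sketch of the spectral branch of step 1.2 is given below; the two 1-D convolution groups (64 kernels of size 11, then 128 kernels of size 3) with BatchNorm and ReLU follow the text, while treating the centre pixel's spectrum as a single-channel 1-D sequence is an assumption.

```python
import torch.nn as nn

class FeatureSpe(nn.Module):
    """Hypothetical sketch of the spectral branch N_featureSpe (Conv1_0, Conv1_1)."""
    def __init__(self):
        super().__init__()
        self.conv1_0 = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=11, stride=1),    # 64 one-dimensional kernels of size 11
            nn.BatchNorm1d(64), nn.ReLU(inplace=True))
        self.conv1_1 = nn.Sequential(
            nn.Conv1d(64, 128, kernel_size=3, stride=1),   # 128 one-dimensional kernels of size 3
            nn.BatchNorm1d(128), nn.ReLU(inplace=True))

    def forward(self, x):                                  # x: (B, 1, num_bands) spectrum of the centre pixel
        return self.conv1_1(self.conv1_0(x))
```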
Step 1.3. Establish and initialize a sub-network N_shallowfusion comprising 6 groups of parallel convolutional layers, namely Conv2_Q1, Conv2_K1, Conv2_V1, Conv2_Q2, Conv2_K2 and Conv2_V2, and 2 groups of custom modules, namely L_SAM and L_DPAM;
The Conv2_Q1 comprises 1 layer of convolution operations, including 25 convolution kernels of size 1 × 1, each convolution kernel performing convolution with a step size of 1 pixel;
the Conv2_K1 comprises 1 layer of convolution operations, including 25 convolution kernels of size 1 × 1, each convolution kernel performing convolution with a step size of 1 pixel;
the Conv2_V1 comprises 1 layer of convolution operations, including 200 convolution kernels of size 1 × 1, each convolution kernel performing convolution with a step size of 1 pixel;
the Conv2_Q2 comprises 1 layer of convolution operations, including 25 convolution kernels of size 1 × 1, each convolution kernel performing convolution with a step size of 1 pixel;
the Conv2_K2 comprises 1 layer of convolution operations, including 25 convolution kernels of size 1 × 1, each convolution kernel performing convolution with a step size of 1 pixel;
the Conv2_V2 comprises 1 layer of convolution operations, including 200 2D convolution kernels of size 1 × 1, each convolution kernel performing convolution with a step size of 1 pixel;
The L_SAM module maps the input three-dimensional tensor F to the space R^(C_spe × N_2) by a reshape operation to obtain the feature F_speR, where C_spe denotes the number of input channels, N_2 = 1 × 1, and F_speR,i denotes the i-th channel of F_speR; the spectral attention matrix F_speS is then calculated according to the definition of formula (1),
where F_speS,ji denotes the element in the j-th row and i-th column of F_speS, F_speR,j^T denotes the transpose of the j-th channel of F_speR, F_speR,i denotes the i-th channel of F_speR, and ⟨·,·⟩ denotes the inner product operation; F_speR and F_speS are then multiplied as matrices according to the definition of formula (2) to obtain the spectral attention feature F_speA;
Wherein γ represents a preset coefficient;
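Since formulas (1) and (2) are not reproduced above, the following sketch of the L_SAM module is only a plausible reading of the description, modelled on standard channel-wise (spectral) self-attention: the softmax normalisation of the inner-product matrix and the residual term added to the input are assumptions; the reshape to C_spe × N_2, the channel-wise inner products and the γ-scaled matrix product are as stated.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectralAttention(nn.Module):
    """Hypothetical sketch of the L_SAM spectral attention module."""
    def __init__(self, gamma=0.4):                 # gamma = 0.4 in the described embodiment
        super().__init__()
        self.gamma = gamma

    def forward(self, feat):                       # feat: (B, C_spe, H, W); in the text H * W = N_2 = 1
        b, c, h, w = feat.shape
        f_r = feat.view(b, c, h * w)               # F_speR in R^(C_spe x N_2)
        sim = torch.bmm(f_r, f_r.transpose(1, 2))  # inner products between spectral channels
        attn = F.softmax(sim, dim=-1)              # spectral attention matrix F_speS (softmax assumed for formula (1))
        out = torch.bmm(attn, f_r).view(b, c, h, w)  # matrix product of F_speS and F_speR
        return self.gamma * out + feat             # residual form assumed for formula (2)
```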
The L_DPAM module comprises the following 7 steps:
(a) the input three-dimensional tensor F_1 is fed into the convolutional layer Conv2_Q1 to calculate the feature F_Q1, into the convolutional layer Conv2_K1 to calculate the feature F_K1, and into the convolutional layer Conv2_V1 to calculate the feature F_V1, where F_Q1,i, F_K1,i and F_V1,i denote the i-th elements of F_Q1, F_K1 and F_V1 respectively, C_spa denotes the number of channels of the input tensor, H_spa and W_spa denote the height and width of the input tensor respectively, and K_1 = 25, K_2 = 25, K_3 = 200;
(b) the three-dimensional tensor F_2 is fed into the convolutional layer Conv2_Q2 to calculate the feature F_Q2, into the convolutional layer Conv2_K2 to calculate the feature F_K2, and into the convolutional layer Conv2_V2 to calculate the feature F_V2, where F_Q2,i, F_K2,i and F_V2,i denote the i-th elements of F_Q2, F_K2 and F_V2 respectively;
(c) F_Q1 and F_K1 are reshaped into two-dimensional matrices, and the spatial attention matrix F_spaX is calculated according to the definition of formula (3),
where N_1 denotes the total number of features with N_1 = H_spa × W_spa, F_spaX,ji denotes the element in the j-th row and i-th column of F_spaX, and F_K1,j^T denotes the transpose of the j-th element of F_K1;
(d) F_V1 is reshaped into a two-dimensional matrix, and the spatial attention feature F_spaA is calculated according to the definition of formula (4),
where η_spa is a preset scaling factor and F_spaX,i denotes the vector formed by the elements of the i-th row of F_spaX;
(e) F_Q2 and F_K2 are reshaped into two-dimensional matrices, and the modal attention matrix F_mX is calculated according to the definition of formula (5),
where F_mX,ji denotes the element in the j-th row and i-th column of F_mX, and F_K2,j^T denotes the transpose of the j-th element of F_K2;
(f) the modal attention feature F_mA is calculated according to the definition of formula (6),
where ε_2 denotes a preset scaling factor and F_mX,i denotes the vector formed by the elements of the i-th row of F_mX;
(g) the spatially weighted feature F_maF is calculated according to the definition of formula (7):
F_maF = α_1·F_1 + α_2·F_spaA + α_3·F_mA (7)
where α_1, α_2 and α_3 denote preset weight coefficients;
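Formulas (3) to (7) are likewise not reproduced, so the sketch below of the L_DPAM module is a hypothetical reading of steps (a) to (g): the softmax normalisation of the attention matrices, the way the value features V1/V2 enter formulas (4) and (6), and the use of in_ch output channels for the value convolutions (200 in the text) are assumptions; the 25/25 query and key kernels, the 1 × 1 kernel size and the weighted sum of formula (7) follow the description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualPathAttention(nn.Module):
    """Hypothetical sketch of the L_DPAM spatial + modal attention module (one fusion direction)."""
    def __init__(self, in_ch, eta=0.4, eps=0.4, alphas=(0.4, 0.3, 0.3)):
        super().__init__()
        self.q1, self.k1 = nn.Conv2d(in_ch, 25, 1), nn.Conv2d(in_ch, 25, 1)   # Conv2_Q1, Conv2_K1
        self.v1 = nn.Conv2d(in_ch, in_ch, 1)                                  # Conv2_V1 (channel count assumed)
        self.q2, self.k2 = nn.Conv2d(in_ch, 25, 1), nn.Conv2d(in_ch, 25, 1)   # Conv2_Q2, Conv2_K2
        self.v2 = nn.Conv2d(in_ch, in_ch, 1)                                  # Conv2_V2 (channel count assumed)
        self.eta, self.eps, self.alphas = eta, eps, alphas

    @staticmethod
    def _attend(q, k, v, scale):
        b, _, h, w = q.shape
        attn = F.softmax(torch.bmm(q.flatten(2).transpose(1, 2), k.flatten(2)), dim=-1)  # (B, N1, N1)
        out = torch.bmm(v.flatten(2), attn.transpose(1, 2))                              # attention-weighted values
        return scale * out.view(b, -1, h, w)

    def forward(self, f1, f2):
        # F1: shallow spatial feature of one sensor, F2: shallow spatial feature of the other sensor
        f_spaA = self._attend(self.q1(f1), self.k1(f1), self.v1(f1), self.eta)   # spatial attention, steps (c)-(d)
        f_mA   = self._attend(self.q2(f2), self.k2(f2), self.v2(f2), self.eps)   # modal attention, steps (e)-(f)
        a1, a2, a3 = self.alphas
        return a1 * f1 + a2 * f_spaA + a3 * f_mA    # formula (7): F_maF = a1*F1 + a2*F_spaA + a3*F_mA
```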
Step 1.4. Establish and initialize a sub-network N_deepfusion comprising 2 groups of max-pooling layers and 1 group of custom concatenation layers, namely MaxPool1, MaxPool2 and Concatenate;
the MaxPool1 comprises 1-layer pooling operation and 1-layer Flatten operation, wherein the pooling layer carries out maximum pooling operation by using a one-dimensional pooling kernel with the size of 1;
the MaxPool2 comprises a 1-layer pooling operation, a 2-layer fully connected operation, a 2-layer activation operation and a 1-layer Flatten operation, wherein the pooling layer performs max pooling with a kernel of size 2 × 2, the 2 fully connected layers have 1024 and 512 output units respectively, ReLU is selected as the activation function, and a Dropout operation with parameter 0.4 is applied, yielding 3 three-dimensional tensors;
the Concatenate layer fuses the three tensors according to the definition of formula (8) and applies 3 Dropout operations with parameter 0.5,
where ω and b denote the weights and biases of the fully connected layers, and '|' denotes the operation of concatenating spectral features with spatial features;
Step 1.5. Establish and initialize a sub-network N_cls comprising 1 group of fully connected layers, namely Dense1;
the Dense1 has num classification units and takes Softmax as an activation function, wherein num represents the total number of the ground feature categories to be classified;
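The deep fusion sub-network N_deepfusion and the classifier N_cls of steps 1.4 and 1.5 might be assembled as follows. The 2 × 2 max pooling, the 1024- and 512-unit fully connected layers with ReLU and Dropout(0.4), the Dropout(0.5) before fusion and the softmax output follow the text; treating MaxPool1 (a size-1 one-dimensional pooling) as an identity followed by Flatten, using two separate MaxPool2 instances for the hyperspectral and LiDAR features, and all flattened input sizes are assumptions.

```python
import torch
import torch.nn as nn

class DeepFusionClassifier(nn.Module):
    """Hypothetical sketch of N_deepfusion (MaxPool1, MaxPool2, Concatenate) and N_cls (Dense1)."""
    def __init__(self, spe_feat_dim, spa_feat_shape, num_classes):
        super().__init__()
        c, h, w = spa_feat_shape                    # shape of one spatial / elevation attention feature
        flat = c * (h // 2) * (w // 2)              # size after the 2 x 2 max pooling of MaxPool2
        self.pool_spe = nn.Flatten()                # MaxPool1: size-1 pooling kept only as Flatten
        def maxpool2():                             # MaxPool2: pooling, FC 1024 -> FC 512, ReLU, Dropout(0.4)
            return nn.Sequential(
                nn.MaxPool2d(2), nn.Flatten(),
                nn.Linear(flat, 1024), nn.ReLU(), nn.Dropout(0.4),
                nn.Linear(1024, 512), nn.ReLU(), nn.Dropout(0.4))
        self.pool_hsi, self.pool_lidar = maxpool2(), maxpool2()
        self.drop = nn.Dropout(0.5)
        self.dense1 = nn.Linear(spe_feat_dim + 512 + 512, num_classes)   # Dense1 with num output units

    def forward(self, f_speA, f_maHF, f_maLF):
        parts = [self.drop(self.pool_spe(f_speA)),   # deep spectral feature
                 self.drop(self.pool_hsi(f_maHF)),   # deep spatial feature of the hyperspectral image
                 self.drop(self.pool_lidar(f_maLF))] # deep elevation feature of the LiDAR image
        fused = torch.cat(parts, dim=1)              # Concatenate: the '|' operation of formula (8)
        return torch.softmax(self.dense1(fused), dim=1)
```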
Step 2.1. According to the manually annotated pixel coordinate set, extract the set of all labelled pixels X_H = {x_H,i | i = 1, …, M} from the hyperspectral training image H, and the set of all labelled pixels X_L = {x_L,i | i = 1, …, M} from the LiDAR training image L, where x_H,i denotes the i-th pixel of X_H, x_L,i denotes the i-th pixel of X_L, and M denotes the total number of labelled pixels;
Step 2.2. Standardize X_H and X_L according to the definitions of formula (9) and formula (10) to obtain the standardized set of labelled hyperspectral pixels and the standardized set of labelled LiDAR pixels;
Step 2.3. Taking each standardized hyperspectral pixel as the centre, divide H into a series of hyperspectral pixel blocks of size 11 × 11 to form the set X_H1, and taking each standardized LiDAR pixel as the centre, divide L into a series of LiDAR pixel blocks of size 11 × 11 to form the set X_L1;
Step 2.4. Flip each pixel block in X_H1 and X_L1 vertically to obtain the hyperspectral pixel block set X_H2 and the LiDAR pixel block set X_L2;
Step 2.5. Add Gaussian noise with variance 0.01 to each pixel block in X_H1 to obtain the hyperspectral pixel block set X_H3, and add Gaussian noise with variance 0.03 to each pixel block in X_L1 to obtain the LiDAR pixel block set X_L3;
Step 2.6. Rotate each pixel block in X_H1 clockwise by a random n × 90° about its centre point to obtain the hyperspectral pixel block set X_H4, and rotate each pixel block in X_L1 clockwise by a random n × 90° about its centre point to obtain the LiDAR pixel block set X_L4, where n is a value randomly selected from the set {1, 2, 3};
Step 2.7. Combine X_H1, X_H2, X_H3 and X_H4 into the hyperspectral training set and X_L1, X_L2, X_L3 and X_L4 into the LiDAR training set of the fusion classification neural network, and organize the training samples as triples consisting of a hyperspectral pixel block, the LiDAR pixel block with the same spatial coordinates, and the corresponding ground-truth class label Y_i for network input; set the iteration number iter ← 1 and perform steps 2.8 to 2.13;
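The data preparation of steps 2.2 to 2.6 can be illustrated with the NumPy sketch below. The 11 × 11 block size, the vertical flip, the Gaussian noise variances (0.01 for the hyperspectral blocks, 0.03 for the LiDAR blocks) and the random clockwise rotation by n × 90° with n ∈ {1, 2, 3} follow the text; the z-score form assumed for the standardisation of formulas (9)-(10) and the neglect of image-border handling are assumptions.

```python
import numpy as np

def standardize(img):
    # assumed per-band z-score standardisation for formulas (9) and (10)
    return (img - img.mean(axis=(0, 1))) / (img.std(axis=(0, 1)) + 1e-8)

def extract_block(img, row, col, size=11):
    # 11 x 11 pixel block centred on (row, col); border pixels would need padding (not handled here)
    half = size // 2
    return img[row - half:row + half + 1, col - half:col + half + 1]

def augment(hsi_block, lidar_block, rng):
    flipped = (np.flipud(hsi_block), np.flipud(lidar_block))                      # step 2.4: vertical flip
    noisy = (hsi_block + rng.normal(0.0, np.sqrt(0.01), hsi_block.shape),         # step 2.5: variance 0.01
             lidar_block + rng.normal(0.0, np.sqrt(0.03), lidar_block.shape))     #           variance 0.03
    n = int(rng.integers(1, 4))                                                   # step 2.6: n in {1, 2, 3}
    rotated = (np.rot90(hsi_block, k=-n), np.rot90(lidar_block, k=-n))            # clockwise n x 90 degrees
    return [flipped, noisy, rotated]
```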
Step 2.8. Use the sub-networks N_featureSpe and N_featureSpa to extract the features of the training set;
Step 2.8.1. Use the sub-network N_featureSpe to perform feature extraction on the hyperspectral training set, obtaining the shallow spectral feature F_spe of the hyperspectral image;
Step 2.8.2. Use the sub-network N_featureSpa to perform feature extraction on the hyperspectral training set, obtaining the shallow spatial feature F_spa of the hyperspectral image;
Step 2.8.3. Use the sub-network N_featureSpa to perform feature extraction on the LiDAR training set, obtaining the shallow elevation feature F_L of the LiDAR image;
Step 2.9. use sub-network NshallowfusionPerforming shallow layer fusion of a characteristic level to obtain shallow layer characteristics;
step 2.9.1 Using LSAMModule pair shallow spectral feature FspeCalculating to obtain the spectral attention characteristic F of the hyperspectral imagespeA;
Step 2.9.2 Using LDPAMModule pair shallow space feature FspaAnd shallow space feature FLCalculating to obtain the spatial modal attention feature F of the hyperspectral imagemaHF;
Step 2.9.3 Using LDPAMModule pair shallow space feature FLAnd shallow space feature FspaCalculating to obtain the spatial modal attention feature F of the LiDAR imagemaLF;
Step 2.10. use sub-network NdeepfusionCarrying out deep fusion of characteristic levels to obtain deep characteristics;
step 2.10.1 spectral attention feature F using max-pooling layer MaxPool1speACalculating to obtain deep spectral characteristics of the hyperspectral image
Step 2.10.2 spatial modal attention feature F of hyperspectral image by using maximum pooling layer MaxPool2maHFCalculating to obtain deep space characteristics of hyperspectral image
Step 2.10.3 utilizes the spatial modal attention feature F of the max pooling layer Maxpool2 for LiDAR imagerymaLFCalculation was carried out to obtain LiDeep elevation features for DAR images
Step 2.10.4 utilizes the custom linker Concatenate to characterize the deep spectra of the hyperspectral imageSpatial features of deep layersDeep elevation features of LiDAR imagesCalculating to obtain deep layer characteristics FM;
Step 2.11. Use the sub-network N_cls to classify the deep features, calculating the classification prediction result TR_pred;
Step 2.12, taking the weighted cross entropy as a loss function according to the definitions of the formula (11) and the formula (12);
wherein, ω isjThe weight of the jth class is represented,probability, n, of a picture element belonging to class j terrainjRepresenting the number of the jth class of ground-truth ground objects in the ground-truth training sample;
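Formulas (11) and (12) are not reproduced above, so the following sketch only illustrates one common form of such a weighted cross entropy: the class weight is assumed to be inversely proportional to the class-j pixel count n_j, and the loss averages the weighted negative log-probability of the true class.

```python
import torch

def class_weights(class_counts):
    # assumed form of w_j: inversely proportional to the number n_j of class-j training pixels
    counts = torch.tensor(class_counts, dtype=torch.float32)
    return counts.sum() / (len(counts) * counts)

def weighted_cross_entropy(probs, labels, weights):
    # probs: (B, num) softmax outputs of N_cls; labels: (B,) ground-truth class indices
    picked = probs[torch.arange(labels.numel()), labels]           # probability of the true class
    return -(weights[labels] * torch.log(picked + 1e-8)).mean()    # assumed form of L_{w-C}
```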
Step 2.13. If all pixel blocks in the training set have been processed, go to step 2.14; otherwise, take a group of unprocessed pixel blocks from the training set and return to step 2.8;
Step 2.14. Let iter ← iter + 1. If the iteration number iter > Total_iter, the trained convolutional neural network N_ahd is obtained and the procedure goes to step 3; otherwise, update the parameters of N_ahd with the back-propagation algorithm based on stochastic gradient descent and the prediction loss L_ω-C, and return to step 2.8 to reprocess all pixel blocks in the training set, where Total_iter denotes the preset number of iterations;
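The training procedure of steps 2.8 to 2.14 then reduces to an ordinary mini-batch loop; the sketch below assumes a `model` object that bundles the feature-extraction, shallow-fusion, deep-fusion and classification sub-networks, and the learning rate is an assumption (the text only specifies stochastic gradient descent and the iteration count Total_iter).

```python
import torch

def train(model, loader, criterion, total_iter=200, lr=1e-3):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)    # stochastic gradient descent
    for _ in range(total_iter):                                # iter <- iter + 1 until Total_iter
        for hsi_block, lidar_block, label in loader:           # training triples (x_H, x_L, Y)
            pred = model(hsi_block, lidar_block)               # steps 2.8-2.11: forward pass
            loss = criterion(pred, label)                      # weighted cross entropy L_{w-C}
            optimizer.zero_grad()
            loss.backward()                                    # reverse error propagation
            optimizer.step()                                   # update the parameters of N_ahd
    return model
```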
Step 3. Input the unlabelled hyperspectral image H′ and LiDAR image L′, perform data preprocessing on all pixels of H′ and L′, and complete pixel classification with the trained convolutional neural network N_ahd;
Step 3.1. Extract all pixels of H′ to form the set T_H = {t_H,i | i = 1, …, U} and all pixels of L′ to form the set T_L = {t_L,i | i = 1, …, U}, where t_H,i denotes the i-th pixel of T_H, t_L,i denotes the i-th pixel of T_L, and U denotes the total number of pixels;
Step 3.2. Standardize T_H and T_L according to the definitions of formula (17) and formula (18) to obtain the standardized hyperspectral pixel set and the standardized LiDAR pixel set;
Step 3.3. Taking each standardized pixel as the centre, divide H′ into a series of hyperspectral pixel blocks of size 11 × 11 to form the hyperspectral image test set, and divide L′ into a series of LiDAR pixel blocks of size 11 × 11 to form the LiDAR image test set;
Step 3.4. Use the sub-networks N_featureSpe and N_featureSpa to extract the features of the test set;
Step 3.4.1. Use the sub-network N_featureSpe to perform feature extraction on the hyperspectral test set, obtaining the spectral feature T_spe of the hyperspectral image H′;
Step 3.4.2. Use the sub-network N_featureSpa to perform feature extraction on the hyperspectral test set, obtaining the spatial feature T_spa of the hyperspectral image H′;
Step 3.4.3. Use the sub-network N_featureSpa to perform feature extraction on the LiDAR test set, obtaining the elevation feature T_L of the LiDAR image L′;
Step 3.5. Use the sub-network N_shallowfusion to perform feature-level shallow fusion to obtain shallow features;
Step 3.5.1. Use the L_SAM module to compute on the spectral feature T_spe, obtaining the spectral attention feature T_speA of the hyperspectral image H′;
Step 3.5.2. Use the L_DPAM module to compute on the spatial feature T_spa and the elevation feature T_L, obtaining the spatial-modal attention feature T_maHF of the hyperspectral image H′;
Step 3.5.3. Use the L_DPAM module to compute on the elevation feature T_L and the spatial feature T_spa, obtaining the spatial-modal attention feature T_maLF of the LiDAR image L′;
Step 3.6. Use the sub-network N_deepfusion to perform feature-level deep fusion to obtain deep features;
Step 3.6.1. Use the max-pooling layer MaxPool1 to compute on the spectral attention feature T_speA, obtaining the deep spectral feature of the hyperspectral image H′;
Step 3.6.2. Use the max-pooling layer MaxPool2 to compute on the spatial-modal attention feature T_maHF, obtaining the deep spatial feature of the hyperspectral image H′;
Step 3.6.3. Use the max-pooling layer MaxPool2 to compute on the spatial-modal attention feature T_maLF, obtaining the deep elevation feature of the LiDAR image L′;
Step 3.6.4. Use the custom concatenation layer Concatenate to compute on the three deep features, obtaining the deep feature T_M;
Step 3.7. Use the sub-network N_cls to classify the deep feature T_M, calculating the classification prediction result TE_pred.
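For the test phase of step 3, a straightforward sliding-window sketch is given below; the reflect padding that lets border pixels receive a full 11 × 11 block, the channel-last layout of the input images and the per-pixel loop are assumptions made for illustration.

```python
import numpy as np
import torch

def classify_scene(model, hsi, lidar, block=11):
    # hsi: (H, W, bands), lidar: (H, W, channels); both assumed already standardised
    half = block // 2
    hsi_p = np.pad(hsi, ((half, half), (half, half), (0, 0)), mode='reflect')
    lid_p = np.pad(lidar, ((half, half), (half, half), (0, 0)), mode='reflect')
    h, w = hsi.shape[:2]
    pred = np.zeros((h, w), dtype=np.int64)
    model.eval()
    with torch.no_grad():
        for r in range(h):
            for c in range(w):
                hb = torch.from_numpy(hsi_p[r:r + block, c:c + block]).permute(2, 0, 1)[None].float()
                lb = torch.from_numpy(lid_p[r:r + block, c:c + block]).permute(2, 0, 1)[None].float()
                pred[r, c] = model(hb, lb).argmax(dim=1).item()   # TE_pred for pixel (r, c)
    return pred
```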
Compared with the prior art, the invention has two advantages. First, a multi-modal fusion classification framework based on a hierarchical dense fusion network with multiple attention mechanisms is introduced; it organically combines the three-branch space-spectrum-elevation framework of hyperspectral and LiDAR imagery with the attention mechanisms, which improves the accuracy of ground object fusion classification. Second, by fusing the shallow spectral and spatial features of the hyperspectral image with the shallow spatial features of the LiDAR image, a modal attention mechanism for shallow feature fusion is designed to discover the correlation and diversity among multi-modal data of the same ground object, realizing interaction and complementary advantages between data of different modalities. The method therefore offers good fusion quality, high classification accuracy and strong multi-modal data interaction capability. Experimental results show that the overall accuracy of the method on the Houston and Trento data sets reaches 90.06% and 99.03% respectively, the average accuracy reaches 92.25% and 98.32%, and the Kappa coefficient reaches 89.24% and 98.70%, effectively improving ground object classification accuracy.
Drawings
FIG. 1 is a comparison graph of fusion classification results of the method of the present invention and a SVM method, an ELM method, a CNN-PPF method, a Two-Branch CNN method, and an EndNet method on a Houston data set.
FIG. 2 is a comparison graph of the fusion classification results of the method of the present invention with a SVM method, ELM method, CNN-PPF method, Two-Branch CNN method, EndNet method on the Trento dataset.
Detailed Description
The invention discloses a multi-sensor remote sensing image fusion classification method based on a hierarchical dense fusion network, which is carried out according to the following steps:
Step 1.1. Establish and initialize a sub-network N_featureSpa comprising 4 groups of convolutional layers, namely Conv2_0, Conv2_1, Conv2_2 and Conv2_3;
the Conv2_0 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 100 convolution kernels with the size of 3 x 3, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
the Conv2_1 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 100 convolution kernels with the size of 3 x 3, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
the Conv2_2 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 100 convolution kernels with the size of 1 × 1, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
the Conv2_3 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 100 convolution kernels with the size of 1 × 1, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
Step 1.2. Establish and initialize a sub-network N_featureSpe comprising 2 groups of convolutional layers, namely Conv1_0 and Conv1_1;
the Conv1_0 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 64 one-dimensional convolution kernels with the size of 11, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
the Conv1_1 comprises a 1-layer convolution operation, a 1-layer BatchNorm normalization operation and a 1-layer activation operation, wherein the convolution layer comprises 128 one-dimensional convolution kernels with the size of 3, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
Step 1.3. Establish and initialize a sub-network N_shallowfusion comprising 6 groups of parallel convolutional layers, namely Conv2_Q1, Conv2_K1, Conv2_V1, Conv2_Q2, Conv2_K2 and Conv2_V2, and 2 groups of custom modules, namely L_SAM and L_DPAM;
The Conv2_Q1 comprises 1 layer of convolution operations, including 25 convolution kernels of size 1 × 1, each convolution kernel performing convolution with a step size of 1 pixel;
the Conv2_K1 comprises 1 layer of convolution operations, including 25 convolution kernels of size 1 × 1, each convolution kernel performing convolution with a step size of 1 pixel;
the Conv2_V1 comprises 1 layer of convolution operations, including 200 convolution kernels of size 1 × 1, each convolution kernel performing convolution with a step size of 1 pixel;
the Conv2_Q2 comprises 1 layer of convolution operations, including 25 convolution kernels of size 1 × 1, each convolution kernel performing convolution with a step size of 1 pixel;
the Conv2_K2 comprises 1 layer of convolution operations, including 25 convolution kernels of size 1 × 1, each convolution kernel performing convolution with a step size of 1 pixel;
the Conv2_V2 comprises 1 layer of convolution operations, including 200 2D convolution kernels of size 1 × 1, each convolution kernel performing convolution with a step size of 1 pixel;
The L_SAM module maps the input three-dimensional tensor F to the space R^(C_spe × N_2) by a reshape operation to obtain the feature F_speR, where C_spe denotes the number of input channels, N_2 = 1 × 1, and F_speR,i denotes the i-th channel of F_speR; the spectral attention matrix F_speS is then calculated according to the definition of formula (1),
where F_speS,ji denotes the element in the j-th row and i-th column of F_speS, F_speR,j^T denotes the transpose of the j-th channel of F_speR, F_speR,i denotes the i-th channel of F_speR, and ⟨·,·⟩ denotes the inner product operation; F_speR and F_speS are then multiplied as matrices according to the definition of formula (2) to obtain the spectral attention feature F_speA;
Where γ represents a preset coefficient, and in this embodiment, γ is made to be 0.4;
The L_DPAM module comprises the following 7 steps:
(a) the input three-dimensional tensor F_1 is fed into the convolutional layer Conv2_Q1 to calculate the feature F_Q1, into the convolutional layer Conv2_K1 to calculate the feature F_K1, and into the convolutional layer Conv2_V1 to calculate the feature F_V1, where F_Q1,i, F_K1,i and F_V1,i denote the i-th elements of F_Q1, F_K1 and F_V1 respectively, C_spa denotes the number of channels of the input tensor, H_spa and W_spa denote the height and width of the input tensor respectively, and K_1 = 25, K_2 = 25, K_3 = 200;
(b) the three-dimensional tensor F_2 is fed into the convolutional layer Conv2_Q2 to calculate the feature F_Q2, into the convolutional layer Conv2_K2 to calculate the feature F_K2, and into the convolutional layer Conv2_V2 to calculate the feature F_V2, where F_Q2,i, F_K2,i and F_V2,i denote the i-th elements of F_Q2, F_K2 and F_V2 respectively;
(c) F_Q1 and F_K1 are reshaped into two-dimensional matrices, and the spatial attention matrix F_spaX is calculated according to the definition of formula (3),
where N_1 denotes the total number of features with N_1 = H_spa × W_spa, F_spaX,ji denotes the element in the j-th row and i-th column of F_spaX, and F_K1,j^T denotes the transpose of the j-th element of F_K1;
(d) F_V1 is reshaped into a two-dimensional matrix, and the spatial attention feature F_spaA is calculated according to the definition of formula (4),
where η_spa is a preset scaling factor and F_spaX,i denotes the vector formed by the elements of the i-th row of F_spaX; in this embodiment η_spa = 0.4;
(e) F_Q2 and F_K2 are reshaped into two-dimensional matrices, and the modal attention matrix F_mX is calculated according to the definition of formula (5),
where F_mX,ji denotes the element in the j-th row and i-th column of F_mX, and F_K2,j^T denotes the transpose of the j-th element of F_K2;
(f) the modal attention feature F_mA is calculated according to the definition of formula (6),
where ε_2 denotes a preset scaling factor and F_mX,i denotes the vector formed by the elements of the i-th row of F_mX; in this embodiment ε_2 = 0.4;
(g) the spatially weighted feature F_maF is calculated according to the definition of formula (7):
F_maF = α_1·F_1 + α_2·F_spaA + α_3·F_mA (7)
where α_1, α_2 and α_3 denote preset weight coefficients; in this embodiment α_1 = 0.4, α_2 = 0.3 and α_3 = 0.3;
Step 1.4. Establish and initialize a sub-network N_deepfusion comprising 2 groups of max-pooling layers and 1 group of custom concatenation layers, namely MaxPool1, MaxPool2 and Concatenate;
the MaxPool1 comprises 1-layer pooling operation and 1-layer Flatten operation, wherein the pooling layer carries out maximum pooling operation by using a one-dimensional pooling kernel with the size of 1;
the MaxPool2 comprises a 1-layer pooling operation, a 2-layer fully connected operation, a 2-layer activation operation and a 1-layer Flatten operation, wherein the pooling layer performs max pooling with a kernel of size 2 × 2, the 2 fully connected layers have 1024 and 512 output units respectively, ReLU is selected as the activation function, and a Dropout operation with parameter 0.4 is applied, yielding 3 three-dimensional tensors;
the Concatenate layer fuses the three tensors according to the definition of formula (8) and applies 3 Dropout operations with parameter 0.5,
where ω and b denote the weights and biases of the fully connected layers, and '|' denotes the operation of concatenating spectral features with spatial features;
Step 1.5. Establish and initialize a sub-network N_cls comprising 1 group of fully connected layers, namely Dense1;
the Dense1 has num classification units and takes Softmax as an activation function, wherein num represents the total number of the ground feature categories to be classified;
Step 2.1. According to the manually annotated pixel coordinate set, extract the set of all labelled pixels X_H = {x_H,i | i = 1, …, M} from the hyperspectral training image H, and the set of all labelled pixels X_L = {x_L,i | i = 1, …, M} from the LiDAR training image L, where x_H,i denotes the i-th pixel of X_H, x_L,i denotes the i-th pixel of X_L, and M denotes the total number of labelled pixels;
Step 2.2. Standardize X_H and X_L according to the definitions of formula (9) and formula (10) to obtain the standardized set of labelled hyperspectral pixels and the standardized set of labelled LiDAR pixels;
Step 2.3. Taking each standardized hyperspectral pixel as the centre, divide H into a series of hyperspectral pixel blocks of size 11 × 11 to form the set X_H1, and taking each standardized LiDAR pixel as the centre, divide L into a series of LiDAR pixel blocks of size 11 × 11 to form the set X_L1;
Step 2.4. Flip each pixel block in X_H1 and X_L1 vertically to obtain the hyperspectral pixel block set X_H2 and the LiDAR pixel block set X_L2;
Step 2.5. Add Gaussian noise with variance 0.01 to each pixel block in X_H1 to obtain the hyperspectral pixel block set X_H3, and add Gaussian noise with variance 0.03 to each pixel block in X_L1 to obtain the LiDAR pixel block set X_L3;
Step 2.6. Rotate each pixel block in X_H1 clockwise by a random n × 90° about its centre point to obtain the hyperspectral pixel block set X_H4, and rotate each pixel block in X_L1 clockwise by a random n × 90° about its centre point to obtain the LiDAR pixel block set X_L4, where n is a value randomly selected from the set {1, 2, 3};
Step 2.7. Combine X_H1, X_H2, X_H3 and X_H4 into the hyperspectral training set and X_L1, X_L2, X_L3 and X_L4 into the LiDAR training set of the fusion classification neural network, and organize the training samples as triples consisting of a hyperspectral pixel block, the LiDAR pixel block with the same spatial coordinates, and the corresponding ground-truth class label Y_i for network input; set the iteration number iter ← 1 and perform steps 2.8 to 2.13;
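A hypothetical PyTorch Dataset wrapping the training triples of step 2.7 is sketched below: each item yields the hyperspectral 11 × 11 block, the LiDAR block with the same spatial coordinates and the ground-truth label Y_i; the channel-last storage of the blocks and the permutation to channel-first tensors are assumptions.

```python
import torch
from torch.utils.data import Dataset

class HsiLidarTriples(Dataset):
    """Hypothetical container for the (hyperspectral block, LiDAR block, label) triples."""
    def __init__(self, hsi_blocks, lidar_blocks, labels):
        self.hsi_blocks, self.lidar_blocks, self.labels = hsi_blocks, lidar_blocks, labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        x_h = torch.as_tensor(self.hsi_blocks[i], dtype=torch.float32).permute(2, 0, 1)
        x_l = torch.as_tensor(self.lidar_blocks[i], dtype=torch.float32).permute(2, 0, 1)
        return x_h, x_l, int(self.labels[i])
```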
Step 2.8. Use the sub-networks N_featureSpe and N_featureSpa to extract the features of the training set;
Step 2.8.1. Use the sub-network N_featureSpe to perform feature extraction on the hyperspectral training set, obtaining the shallow spectral feature F_spe of the hyperspectral image;
Step 2.8.2. Use the sub-network N_featureSpa to perform feature extraction on the hyperspectral training set, obtaining the shallow spatial feature F_spa of the hyperspectral image;
Step 2.8.3. Use the sub-network N_featureSpa to perform feature extraction on the LiDAR training set, obtaining the shallow elevation feature F_L of the LiDAR image;
Step 2.9. Use the sub-network N_shallowfusion to perform feature-level shallow fusion to obtain shallow features;
Step 2.9.1. Use the L_SAM module to compute on the shallow spectral feature F_spe, obtaining the spectral attention feature F_speA of the hyperspectral image;
Step 2.9.2. Use the L_DPAM module to compute on the shallow spatial feature F_spa and the shallow elevation feature F_L, obtaining the spatial-modal attention feature F_maHF of the hyperspectral image;
Step 2.9.3. Use the L_DPAM module to compute on the shallow elevation feature F_L and the shallow spatial feature F_spa, obtaining the spatial-modal attention feature F_maLF of the LiDAR image;
Step 2.10. Use the sub-network N_deepfusion to perform feature-level deep fusion to obtain deep features;
Step 2.10.1. Use the max-pooling layer MaxPool1 to compute on the spectral attention feature F_speA, obtaining the deep spectral feature of the hyperspectral image;
Step 2.10.2. Use the max-pooling layer MaxPool2 to compute on the spatial-modal attention feature F_maHF of the hyperspectral image, obtaining the deep spatial feature of the hyperspectral image;
Step 2.10.3. Use the max-pooling layer MaxPool2 to compute on the spatial-modal attention feature F_maLF of the LiDAR image, obtaining the deep elevation feature of the LiDAR image;
Step 2.10.4. Use the custom concatenation layer Concatenate to compute on the deep spectral feature, the deep spatial feature and the deep elevation feature, obtaining the deep feature F_M;
Step 2.11. Use the sub-network N_cls to classify the deep features, calculating the classification prediction result TR_pred;
Step 2.12, taking the weighted cross entropy as a loss function according to the definitions of the formula (11) and the formula (12);
wherein, ω isjThe weight of the jth class is represented,indicating that the picture element belongs to the j-th classProbability of ground object, njRepresenting the number of the jth class of ground-truth ground objects in the ground-truth training sample;
Step 2.13. If all pixel blocks in the training set have been processed, go to step 2.14; otherwise, take a group of unprocessed pixel blocks from the training set and return to step 2.8;
Step 2.14. Let iter ← iter + 1. If the iteration number iter > Total_iter, the trained convolutional neural network N_ahd is obtained and the procedure goes to step 3; otherwise, update the parameters of N_ahd with the back-propagation algorithm based on stochastic gradient descent and the prediction loss L_ω-C, and return to step 2.8 to reprocess all pixel blocks in the training set, where Total_iter denotes the preset number of iterations and is set to 200 in this embodiment;
Step 3. Input the unlabelled hyperspectral image H′ and LiDAR image L′, perform data preprocessing on all pixels of H′ and L′, and complete pixel classification with the trained convolutional neural network N_ahd;
Step 3.1. Extract all pixels of H′ to form the set T_H = {t_H,i | i = 1, …, U} and all pixels of L′ to form the set T_L = {t_L,i | i = 1, …, U}, where t_H,i denotes the i-th pixel of T_H, t_L,i denotes the i-th pixel of T_L, and U denotes the total number of pixels;
Step 3.2. Standardize T_H and T_L according to the definitions of formula (17) and formula (18) to obtain the standardized hyperspectral pixel set and the standardized LiDAR pixel set;
Step 3.3. Taking each standardized pixel as the centre, divide H′ into a series of hyperspectral pixel blocks of size 11 × 11 to form the hyperspectral image test set, and divide L′ into a series of LiDAR pixel blocks of size 11 × 11 to form the LiDAR image test set;
Step 3.4. Use the sub-networks N_featureSpe and N_featureSpa to extract the features of the test set;
Step 3.4.1. Use the sub-network N_featureSpe to perform feature extraction on the hyperspectral test set, obtaining the spectral feature T_spe of the hyperspectral image H′;
Step 3.4.2. Use the sub-network N_featureSpa to perform feature extraction on the hyperspectral test set, obtaining the spatial feature T_spa of the hyperspectral image H′;
Step 3.4.3. Use the sub-network N_featureSpa to perform feature extraction on the LiDAR test set, obtaining the elevation feature T_L of the LiDAR image L′;
Step 3.5. Use the sub-network N_shallowfusion to perform feature-level shallow fusion to obtain shallow features;
Step 3.5.1. Use the L_SAM module to compute on the spectral feature T_spe, obtaining the spectral attention feature T_speA of the hyperspectral image H′;
Step 3.5.2. Use the L_DPAM module to compute on the spatial feature T_spa and the elevation feature T_L, obtaining the spatial-modal attention feature T_maHF of the hyperspectral image H′;
Step 3.5.3. Use the L_DPAM module to compute on the elevation feature T_L and the spatial feature T_spa, obtaining the spatial-modal attention feature T_maLF of the LiDAR image L′;
Step 3.6. Use the sub-network N_deepfusion to perform feature-level deep fusion to obtain deep features;
Step 3.6.1. Use the max-pooling layer MaxPool1 to compute on the spectral attention feature T_speA, obtaining the deep spectral feature of the hyperspectral image H′;
Step 3.6.2. Use the max-pooling layer MaxPool2 to compute on the spatial-modal attention feature T_maHF, obtaining the deep spatial feature of the hyperspectral image H′;
Step 3.6.3. Use the max-pooling layer MaxPool2 to compute on the spatial-modal attention feature T_maLF, obtaining the deep elevation feature of the LiDAR image L′;
Step 3.6.4. Use the custom concatenation layer Concatenate to compute on the three deep features, obtaining the deep feature T_M;
Step 3.7. Use the sub-network N_cls to classify the deep feature T_M, calculating the classification prediction result TE_pred.
To verify the effectiveness of the method, experiments were carried out on the public Houston and Trento data sets. The fusion classification results were evaluated with the overall accuracy (OA), average accuracy (AA) and Kappa coefficient as objective indices, and compared with the SVM, ELM, CNN-PPF, Two-Branch CNN and EndNet methods.
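The three evaluation indices can be computed from a confusion matrix as in the following sketch (standard definitions of OA, AA and the Kappa coefficient; not taken from the patent text itself).

```python
import numpy as np

def evaluate(y_true, y_pred, num_classes):
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    oa = np.trace(cm) / cm.sum()                                    # overall accuracy (OA)
    aa = np.mean(np.diag(cm) / cm.sum(axis=1).clip(min=1))          # average per-class accuracy (AA)
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / (cm.sum() ** 2)  # expected agreement by chance
    kappa = (oa - pe) / (1 - pe)                                    # Kappa coefficient
    return oa, aa, kappa
```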
The main challenge of the land-cover classification task is misclassification. For remote-sensing-based land-cover classification, the most common error is classifying bare soil as grassland. As can be seen from Table 1, the SVM, ELM, CNN-PPF, Two-Branch CNN and EndNet methods do not fully exploit the interactivity among the features acquired by different sensors, so their classification accuracy is limited; the proposed method shows some misjudgement among the three grassland states but almost no confusion between grassland as a whole and bare soil. In addition, for the tennis court and runway areas in Table 1 and the wood and vineyard areas in Table 2, the invention produces no classification errors or misjudgements, reaching an accuracy of 100%. As can be seen from Table 1, for the Houston data set the OA obtained by the method is improved by 9.57%, 8.14%, 6.73%, 2.08% and 1.54% over the SVM, ELM, CNN-PPF, Two-Branch CNN and EndNet methods respectively, an average improvement of 5.61%; the Kappa coefficient is improved by 10.26%, 8.79%, 7.36%, 2.26% and 1.65% respectively, an average improvement of 6.06%. As can be seen from Table 2, for the Trento data set the OA obtained by the method is improved by 6.26%, 13.22%, 4.27%, 1.11% and 4.86% over the SVM, ELM, CNN-PPF, Two-Branch CNN and EndNet methods respectively, an average improvement of 5.94%; the Kappa coefficient is improved by 2.85%, 17.34%, 5.66%, 1.89% and 6.48% respectively, an average improvement of 6.84%.
FIG. 1 shows the classification results of the different methods on the Houston data set, wherein (a) is the HSI pseudo-color image; (b) is the LiDAR-based digital surface model; (c) is the ground-truth classification map; (d) is the classification result of the SVM method, with an overall accuracy of 80.49%; (e) is the classification result of the ELM method, with an overall accuracy of 81.92%; (f) is the classification result of the CNN-PPF method, with an overall accuracy of 83.33%; (g) is the classification result of the Two-Branch CNN method, with an overall accuracy of 87.98%; (h) is the classification result of the EndNet method, with an overall accuracy of 88.52%; (i) is the classification result of the invention, with an overall accuracy of 90.06%.
FIG. 2 shows the classification results of the different methods on the Trento data set, wherein (a) is an HSI pseudo-color image; (b) is a digital surface model based on the LiDAR imagery; (c) is the ground-truth classification map; (d) is the classification result of the SVM method, with an overall accuracy of 92.77%; (e) is the classification result of the ELM method, with an overall accuracy of 85.81%; (f) is the classification result of the CNN-PPF method, with an overall accuracy of 94.76%; (g) is the classification result of the Two-Branch CNN method, with an overall accuracy of 97.92%; (h) is the classification result of the EndNet method, with an overall accuracy of 94.17%; (i) is the classification result of the present invention, with an overall accuracy of 99.03%.
As can be seen from FIGS. 1 and 2, the present invention identifies the various ground features more effectively in the areas that are difficult to classify, and in particular judges the commercial area in the upper right corner of FIG. 2 relatively accurately. Moreover, because the multi-branch input helps to reduce information loss, the present invention obtains smoother and more accurate classification results than the five comparison methods, namely the SVM, ELM, CNN-PPF, Two-Branch CNN and EndNet methods.
The comparison results in Table 1, Table 2, FIG. 1 and FIG. 2 show that fully exploiting the interaction among data from different sensors effectively improves the fusion quality and classification accuracy of multi-sensor remote sensing images.
Table 1 Comparison of classification accuracy on the Houston data set (%)
Table 2 Comparison of classification accuracy on the Trento data set (%)
Claims (1)
1. A multi-sensor remote sensing image fusion classification method capable of layering dense fusion network is characterized by comprising the following steps:
Step 1. Establish and initialize a convolutional neural network N_ahd for fusion and classification of multi-sensor remote sensing images, wherein N_ahd comprises 2 sub-networks N_featureSpe and N_featureSpa for feature extraction, 1 sub-network N_shallowfusion for shallow feature fusion, 1 sub-network N_deepfusion for deep feature fusion, and 1 sub-network N_cls for classification;
Step 1.1. Establish and initialize the sub-network N_featureSpa, comprising 4 groups of convolutional layers, Conv2_0, Conv2_1, Conv2_2 and Conv2_3;
the Conv2_0 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 100 convolution kernels with the size of 3 × 3, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
the Conv2_1 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 100 convolution kernels with the size of 3 × 3, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
the Conv2_2 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 100 convolution kernels with the size of 1 × 1, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
the Conv2_3 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 100 convolution kernels with the size of 1 × 1, each convolution kernel performs convolution operation by taking 1 pixel as a step size, and a nonlinear activation function ReLU is selected as an activation function for operation;
Step 1.2. Establish and initialize the sub-network N_featureSpe, comprising 2 groups of convolutional layers, Conv1_0 and Conv1_1;
the Conv1_0 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 64 one-dimensional convolution kernels with the size of 11, each convolution kernel performs convolution operation by taking 1 pixel as a step length, and a nonlinear activation function ReLU is selected as an activation function for operation;
the Conv1_1 comprises 1-layer convolution operation, 1-layer BatchNorm normalization operation and 1-layer activation operation, wherein the convolution layer comprises 128 one-dimensional convolution kernels with the size of 3, each convolution kernel performs convolution operation by taking 1 pixel as a step length, and a nonlinear activation function ReLU is selected as an activation function for operation;
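As a minimal sketch (not the patent's implementation), the two feature-extraction sub-networks defined in steps 1.1 and 1.2 could be assembled in PyTorch as below; the padding choices and the number of input channels are assumptions that the claim does not specify:

```python
import torch.nn as nn

def conv2d_block(in_ch, out_ch, k):
    """One Conv2_x group: 2-D convolution (stride 1) + BatchNorm + ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=k, stride=1, padding=k // 2),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

def conv1d_block(in_ch, out_ch, k):
    """One Conv1_x group: 1-D convolution (stride 1) + BatchNorm + ReLU."""
    return nn.Sequential(
        nn.Conv1d(in_ch, out_ch, kernel_size=k, stride=1, padding=k // 2),
        nn.BatchNorm1d(out_ch),
        nn.ReLU(inplace=True),
    )

# N_featureSpa: Conv2_0 and Conv2_1 with 100 kernels of size 3 x 3,
# Conv2_2 and Conv2_3 with 100 kernels of size 1 x 1.
feature_spa = nn.Sequential(
    conv2d_block(in_ch=1, out_ch=100, k=3),   # in_ch depends on the input modality (assumption)
    conv2d_block(100, 100, 3),
    conv2d_block(100, 100, 1),
    conv2d_block(100, 100, 1),
)

# N_featureSpe: Conv1_0 with 64 kernels of size 11, Conv1_1 with 128 kernels of size 3.
feature_spe = nn.Sequential(
    conv1d_block(in_ch=1, out_ch=64, k=11),
    conv1d_block(64, 128, 3),
)
```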
Step 1.3. Establish and initialize the sub-network N_shallowfusion, comprising 6 groups of parallel convolutional layers, Conv2_Q1, Conv2_K1, Conv2_V1, Conv2_Q2, Conv2_K2 and Conv2_V2, and 2 groups of custom modules, L_SAM and L_DPAM;
the Conv2_Q1 comprises 1 convolution layer with 25 convolution kernels of size 1 × 1, each convolution kernel performing the convolution operation with a stride of 1 pixel;
the Conv2_K1 comprises 1 convolution layer with 25 convolution kernels of size 1 × 1, each convolution kernel performing the convolution operation with a stride of 1 pixel;
the Conv2_V1 comprises 1 convolution layer with 200 convolution kernels of size 1 × 1, each convolution kernel performing the convolution operation with a stride of 1 pixel;
the Conv2_Q2 comprises 1 convolution layer with 25 convolution kernels of size 1 × 1, each convolution kernel performing the convolution operation with a stride of 1 pixel;
the Conv2_K2 comprises 1 convolution layer with 25 convolution kernels of size 1 × 1, each convolution kernel performing the convolution operation with a stride of 1 pixel;
the Conv2_V2 comprises 1 convolution layer with 200 two-dimensional convolution kernels of size 1 × 1, each convolution kernel performing the convolution operation with a stride of 1 pixel;
The L_SAM module maps the input three-dimensional tensor F_spe into ℝ^(C_spe×N_2) space using a reshape operation to obtain a feature F_speR, wherein C_spe denotes the number of input channels, N_2 = 1 × 1, and F_speR,i denotes the ith channel of F_speR; the spectral attention matrix F_speS is then calculated according to formula (1)
F_speS(j,i) = exp(F_speR,i · (F_speR,j)^T) / Σ_{k=1}^{C_spe} exp(F_speR,k · (F_speR,j)^T)   (1)
wherein F_speS(j,i) denotes the element in the jth row and ith column of F_speS, (F_speR,j)^T denotes the transpose of the jth channel of F_speR, and "·" denotes the inner product operation; F_speS and F_speR are then multiplied according to formula (2) to obtain the spectral attention feature F_speA
F_speA = γ(F_speS · F_speR) + F_spe   (2)
wherein γ denotes a preset coefficient;
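Under the reading of formulas (1) and (2) given above (channel-wise softmax attention followed by a residual scaled by γ), the L_SAM computation could be sketched in PyTorch as follows; this is an interpretation, not the patent's code:

```python
import torch
import torch.nn.functional as F

def spectral_attention(f_spe, gamma=1.0):
    """L_SAM sketch: f_spe has shape (B, C_spe, H, W); returns a tensor of the same shape."""
    b, c, h, w = f_spe.shape
    f_r = f_spe.reshape(b, c, h * w)               # F_speR in R^(C_spe x N_2)
    energy = torch.bmm(f_r, f_r.transpose(1, 2))   # inner products between channels
    f_s = F.softmax(energy, dim=-1)                # spectral attention matrix F_speS, formula (1)
    out = torch.bmm(f_s, f_r).reshape(b, c, h, w)  # F_speS applied to F_speR
    return gamma * out + f_spe                     # formula (2): scaled residual combination
```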
The L_DPAM module comprises the following 7 steps:
(a) the input three-dimensional tensor F_1 is fed into the convolutional layer Conv2_Q1 to calculate the feature F_Q1, into the convolutional layer Conv2_K1 to calculate the feature F_K1, and into the convolutional layer Conv2_V1 to calculate the feature F_V1, wherein F_Q1,i, F_K1,i and F_V1,i respectively denote the ith element of F_Q1, F_K1 and F_V1, C_spa denotes the number of channels of the input tensor, H_spa and W_spa respectively denote the length and width of the input tensor, and K_1 = 25, K_2 = 25, K_3 = 200;
(b) the three-dimensional tensor F_2 is fed into the convolutional layer Conv2_Q2 to calculate the feature F_Q2, into the convolutional layer Conv2_K2 to calculate the feature F_K2, and into the convolutional layer Conv2_V2 to calculate the feature F_V2, wherein F_Q2,i, F_K2,i and F_V2,i respectively denote the ith element of F_Q2, F_K2 and F_V2;
(c) F_Q1 and F_K1 are mapped into ℝ^(N_1×K_1) space by a reshape operation, and the spatial attention matrix F_spaX is calculated according to formula (3)
F_spaX(j,i) = exp(F_Q1,i · (F_K1,j)^T) / Σ_{k=1}^{N_1} exp(F_Q1,k · (F_K1,j)^T)   (3)
wherein N_1 denotes the total number of features and N_1 = H_spa × W_spa, F_spaX(j,i) denotes the element in the jth row and ith column of F_spaX, and (F_K1,j)^T denotes the transpose of the jth element of F_K1;
(d) F_V1 is mapped into ℝ^(K_3×N_1) space by a reshape operation, and the spatial attention feature F_spaA is calculated according to formula (4)
F_spaA = η_spa (F_V1 · F_spaX^T)   (4)
wherein η_spa is a preset scaling factor and F_spaX,i denotes the vector consisting of the elements of row i of F_spaX;
(e) F_Q2 and F_K2 are mapped into ℝ^(N_1×K_1) space by a reshape operation, and the modal attention matrix F_mX is calculated according to formula (5)
F_mX(j,i) = exp(F_Q2,i · (F_K2,j)^T) / Σ_{k=1}^{N_1} exp(F_Q2,k · (F_K2,j)^T)   (5)
wherein F_mX(j,i) denotes the element in the jth row and ith column of F_mX and (F_K2,j)^T denotes the transpose of the jth element of F_K2;
(f) the modal attention feature F_mA is calculated according to formula (6)
F_mA = ε_2 (F_V2 · F_mX^T)   (6)
wherein ε_2 denotes a preset scaling factor and F_mX,i denotes the vector consisting of the elements of row i of F_mX;
(g) the spatially weighted feature F_maF is calculated according to formula (7)
F_maF = α_1 F_1 + α_2 F_spaA + α_3 F_mA   (7)
wherein α_1, α_2 and α_3 denote preset weight coefficients;
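The following PyTorch sketch mirrors steps (a)–(g) of L_DPAM for one call (the tensor F_1 attended spatially by itself and modally by the other input F_2). The softmax form of formulas (3)–(6) follows the reconstruction above, and the module layout, default coefficients and the choice to keep the value projections at the input channel count (so that the weighted sum of formula (7) is shape-consistent; the claim itself states K_3 = 200) are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DPAM(nn.Module):
    """Sketch of the dual position/modal attention module for inputs of shape (B, C, H, W)."""
    def __init__(self, channels, k=25, eta=1.0, eps2=1.0, alphas=(1.0, 0.5, 0.5)):
        super().__init__()
        # Conv2_Q1/K1/V1 operate on F_1, Conv2_Q2/K2/V2 on F_2 (all 1 x 1 convolutions)
        self.q1, self.k1, self.v1 = (nn.Conv2d(channels, k, 1), nn.Conv2d(channels, k, 1),
                                     nn.Conv2d(channels, channels, 1))
        self.q2, self.k2, self.v2 = (nn.Conv2d(channels, k, 1), nn.Conv2d(channels, k, 1),
                                     nn.Conv2d(channels, channels, 1))
        self.eta, self.eps2, self.alphas = eta, eps2, alphas

    @staticmethod
    def attend(q, k, v, scale):
        """Position-wise softmax attention: formulas (3)/(5) followed by (4)/(6)."""
        b, cv, h, w = v.shape
        n = h * w
        att = F.softmax(torch.bmm(q.reshape(b, -1, n).transpose(1, 2),
                                  k.reshape(b, -1, n)), dim=-1)      # (B, N, N) attention matrix
        out = torch.bmm(v.reshape(b, cv, n), att.transpose(1, 2))    # weighted sum over positions
        return scale * out.reshape(b, cv, h, w)

    def forward(self, f1, f2):
        f_spa = self.attend(self.q1(f1), self.k1(f1), self.v1(f1), self.eta)    # spatial attention
        f_mod = self.attend(self.q2(f2), self.k2(f2), self.v2(f2), self.eps2)   # modal attention
        a1, a2, a3 = self.alphas
        return a1 * f1 + a2 * f_spa + a3 * f_mod                                # formula (7)
```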
Step 1.4. Establish and initialize the sub-network N_deepfusion, comprising 2 groups of max-pooling layers and 1 group of custom connection layers, namely MaxPool1, MaxPool2 and Concatenate;
the MaxPool1 comprises 1 pooling layer and 1 Flatten layer, wherein the pooling layer performs max pooling with a one-dimensional pooling kernel of size 1;
the MaxPool2 comprises 1 pooling layer, 2 fully connected layers, 2 activation layers and 1 Flatten layer, wherein the pooling layer performs max pooling with a 2 × 2 pooling kernel, the 2 fully connected layers respectively comprise 1024 and 512 output units with ReLU selected as the activation function, and a Dropout operation with parameter 0.4 is then applied; MaxPool1 and MaxPool2 yield 3 deep feature tensors, the deep spectral feature F_spe^D, the deep spatial feature F_spa^D and the deep elevation feature F_L^D;
the Concatenate layer fuses F_spe^D, F_spa^D and F_L^D according to formula (8) and applies a Dropout operation with parameter 0.5 three times
F_M = ω[F_spe^D | F_spa^D | F_L^D] + b   (8)
wherein ω and b denote the weights and offsets of the fully connected layers and "|" denotes the operation of connecting the spectral features with the spatial features;
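A compact sketch of the Concatenate fusion of formula (8) — concatenating the flattened deep spectral, spatial and elevation features, passing them through a fully connected layer and applying dropout — is shown below; the output dimension is an assumption:

```python
import torch
import torch.nn as nn

class ConcatFusion(nn.Module):
    """Formula (8) sketch: F_M = omega [f_spe | f_spa | f_l] + b, followed by dropout."""
    def __init__(self, dim_spe, dim_spa, dim_l, dim_out=512, p=0.5):
        super().__init__()
        self.fc = nn.Linear(dim_spe + dim_spa + dim_l, dim_out)
        self.drop = nn.Dropout(p)

    def forward(self, f_spe, f_spa, f_l):
        fused = torch.cat([f_spe, f_spa, f_l], dim=1)  # "|": connect spectral with spatial features
        return self.drop(self.fc(fused))
```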
Step 1.5. Establish and initialize the sub-network N_cls, comprising 1 group of fully connected layers, namely Dense1;
the Dense1 has num classification units and takes Softmax as an activation function, wherein num represents the total number of the ground feature categories to be classified;
Step 2. Input the training set H of hyperspectral images, the training set L of LiDAR images, and the manually annotated pixel coordinate set and label set, and train N_ahd;
Step 2.1. According to the manually annotated pixel coordinate set, extract the set of all labeled pixel points X_H = {x_H,i | i = 1, …, M} from the hyperspectral training set H and the set of all labeled pixel points X_L = {x_L,i | i = 1, …, M} from the LiDAR training set L, wherein x_H,i denotes the ith pixel point of X_H, x_L,i denotes the ith pixel point of X_L, and M denotes the total number of labeled pixel points;
Step 2.2. Normalize X_H and X_L according to formula (9) and formula (10) to obtain the normalized sets of labeled hyperspectral and LiDAR pixel points;
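Formulas (9) and (10) are not reproduced in this text; a common choice for such a per-band normalization is min–max scaling to [0, 1], sketched below as an assumption rather than the patent's exact formulas:

```python
import numpy as np

def normalize_per_band(cube):
    """cube: (H, W, bands) array; scales every band to [0, 1] independently."""
    cube = cube.astype(np.float32)
    flat = cube.reshape(-1, cube.shape[-1])
    mins, maxs = flat.min(axis=0), flat.max(axis=0)
    return (cube - mins) / (maxs - mins + 1e-12)   # small epsilon avoids division by zero
```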
Step 2.3. Taking each normalized hyperspectral pixel point as a center, divide H into a set X_H1 of 11 × 11 hyperspectral pixel blocks, and taking each normalized LiDAR pixel point as a center, divide L into a set X_L1 of 11 × 11 LiDAR pixel blocks;
Step 2.4. Flip each pixel block of X_H1 and X_L1 up and down to obtain the hyperspectral pixel block set X_H2 and the LiDAR pixel block set X_L2;
Step 2.5. Add Gaussian noise with variance 0.01 to each pixel block of X_H1 to obtain the hyperspectral pixel block set X_H3, and add Gaussian noise with variance 0.03 to each pixel block of X_L1 to obtain the LiDAR pixel block set X_L3;
Step 2.6. Rotate each pixel block of X_H1 clockwise by n × 90° around its center point to obtain the hyperspectral pixel block set X_H4, and rotate each pixel block of X_L1 clockwise by n × 90° around its center point to obtain the LiDAR pixel block set X_L4, wherein n is randomly selected from the set {1, 2, 3};
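The three augmentations of steps 2.4–2.6, applied to a single 11 × 11 pixel block, could be sketched as follows (NumPy; the noise is additive zero-mean Gaussian with the stated variances):

```python
import numpy as np

def flip_up_down(block):
    """Step 2.4: flip the patch vertically; block has shape (11, 11, bands)."""
    return block[::-1, :, :].copy()

def add_gaussian_noise(block, variance):
    """Step 2.5: additive Gaussian noise (variance 0.01 for HSI, 0.03 for LiDAR blocks)."""
    return block + np.random.normal(0.0, np.sqrt(variance), size=block.shape)

def rotate_n_times_90(block):
    """Step 2.6: clockwise rotation by n x 90 degrees, n drawn from {1, 2, 3}."""
    n = np.random.choice([1, 2, 3])
    return np.rot90(block, k=-n, axes=(0, 1)).copy()   # negative k rotates clockwise
```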
Step 2.7. Let the hyperspectral training set be the union of X_H1, X_H2, X_H3 and X_H4 and the LiDAR training set be the union of X_L1, X_L2, X_L3 and X_L4, take these as the training set of the fusion and classification neural network, and organize the samples of the training set as triples for network input, each triple consisting of a hyperspectral pixel block and a LiDAR pixel block with the same spatial coordinates together with the corresponding real class label Y_i; let the iteration number iter ← 1 and execute steps 2.8 to 2.13;
Step 2.8. Feature extraction is performed on the training set with the sub-networks N_featureSpe and N_featureSpa;
Step 2.8.1. The sub-network N_featureSpe performs feature extraction on the hyperspectral training set to obtain the shallow spectral feature F_spe of the hyperspectral image;
Step 2.8.2. The sub-network N_featureSpa performs feature extraction on the hyperspectral training set to obtain the shallow spatial feature F_spa of the hyperspectral image;
Step 2.8.3. The sub-network N_featureSpa performs feature extraction on the LiDAR training set to obtain the shallow elevation feature F_L of the LiDAR image;
Step 2.9. Shallow feature-level fusion is performed with the sub-network N_shallowfusion to obtain shallow features;
Step 2.9.1. The L_SAM module operates on the shallow spectral feature F_spe to calculate the spectral attention feature F_speA of the hyperspectral image;
Step 2.9.2. The L_DPAM module operates on the shallow spatial feature F_spa and the shallow elevation feature F_L to calculate the spatial-modal attention feature F_maHF of the hyperspectral image;
Step 2.9.3. The L_DPAM module operates on the shallow elevation feature F_L and the shallow spatial feature F_spa to calculate the spatial-modal attention feature F_maLF of the LiDAR image;
Step 2.10. Deep feature-level fusion is performed with the sub-network N_deepfusion to obtain deep features;
Step 2.10.1. The max-pooling layer MaxPool1 operates on the spectral attention feature F_speA to calculate the deep spectral feature of the hyperspectral image;
Step 2.10.2. The max-pooling layer MaxPool2 operates on the spatial-modal attention feature F_maHF to calculate the deep spatial feature of the hyperspectral image;
Step 2.10.3. The max-pooling layer MaxPool2 operates on the spatial-modal attention feature F_maLF to calculate the deep elevation feature of the LiDAR image;
Step 2.10.4. The custom connection layer Concatenate operates on the deep spectral feature, the deep spatial feature and the deep elevation feature to calculate the deep feature F_M;
Step 2.11. The sub-network N_cls classifies the deep feature F_M and calculates the classification prediction result TR_pred;
Step 2.12. The weighted cross entropy L_ω-C is used as the loss function according to formula (11) and formula (12), wherein ω_j denotes the weight of the jth class and n_j denotes the number of ground-truth samples of the jth class in the training set, the loss being computed from the probability with which a pixel is predicted to belong to each class of ground objects;
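Formula (12) for the class weights is not reproduced here; the sketch below assumes weights inversely proportional to the class frequencies n_j, which matches the stated purpose of the weighting, and uses PyTorch's built-in weighted cross entropy for formula (11):

```python
import numpy as np
import torch
import torch.nn as nn

def class_weights(labels, num_classes):
    """Assumed form of formula (12): omega_j proportional to 1 / n_j, normalised to sum to num_classes."""
    counts = np.bincount(np.asarray(labels), minlength=num_classes).astype(np.float64)
    w = 1.0 / np.maximum(counts, 1.0)
    w = w * num_classes / w.sum()
    return torch.tensor(w, dtype=torch.float32)

# logits: (batch, num_classes) network outputs, targets: (batch,) class indices
# criterion = nn.CrossEntropyLoss(weight=class_weights(train_labels, num))
# loss = criterion(logits, targets)   # weighted cross entropy of formula (11)
```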
Step 2.13. If all pixel blocks in the training set have been processed, go to step 2.14; otherwise, take an unprocessed group of pixel blocks from the training set and return to step 2.8;
Step 2.14. Let iter ← iter + 1. If iter > Total_iter, the trained convolutional neural network N_ahd is obtained and the procedure goes to step 3; otherwise, the parameters of N_ahd are updated with the back-propagation algorithm based on stochastic gradient descent using the prediction loss L_ω-C, all pixel blocks in the training set are reprocessed, and the procedure returns to step 2.8, wherein Total_iter denotes the preset number of iterations;
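Steps 2.8–2.14 amount to a standard mini-batch training loop driven by stochastic gradient descent. The sketch below assumes a PyTorch model net_ahd that maps an (HSI block, LiDAR block) pair to class logits and the weighted criterion from the previous sketch; the learning rate, momentum and data loader are illustrative assumptions:

```python
import torch

def train(net_ahd, loader, criterion, total_iter, lr=1e-3, device="cpu"):
    """loader yields (hsi_block, lidar_block, label) triples as organised in step 2.7."""
    optimizer = torch.optim.SGD(net_ahd.parameters(), lr=lr, momentum=0.9)
    net_ahd.to(device).train()
    for it in range(total_iter):                 # step 2.14: repeat for Total_iter iterations
        for hsi, lidar, label in loader:         # steps 2.8-2.13: one pass over the training set
            hsi, lidar, label = hsi.to(device), lidar.to(device), label.to(device)
            logits = net_ahd(hsi, lidar)         # feature extraction, fusion and classification
            loss = criterion(logits, label)      # weighted cross entropy loss
            optimizer.zero_grad()
            loss.backward()                      # back-propagation of the prediction loss
            optimizer.step()                     # stochastic gradient descent update
    return net_ahd
```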
Step 3. Input the unlabeled hyperspectral image H' and LiDAR image L', perform data preprocessing on all pixels of H' and L', and complete pixel classification with the trained convolutional neural network N_ahd;
Step 3.1. Extract all pixel points of H' to form the set T_H = {t_H,i | i = 1, …, U} and all pixel points of L' to form the set T_L = {t_L,i | i = 1, …, U}, wherein t_H,i denotes the ith pixel of T_H, t_L,i denotes the ith pixel of T_L, and U denotes the total number of pixels;
Step 3.2. Normalize T_H and T_L according to formula (17) and formula (18) to obtain the normalized sets of hyperspectral and LiDAR pixel points;
Step 3.3. Taking each normalized hyperspectral pixel point as a center, divide H' into a series of 11 × 11 hyperspectral pixel blocks to form the hyperspectral image test set, and taking each normalized LiDAR pixel point as a center, divide L' into a series of 11 × 11 LiDAR pixel blocks to form the LiDAR image test set;
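Extracting an 11 × 11 block around every pixel of the test images requires some treatment of the border pixels; the claim does not state one, so the sketch below assumes reflective padding:

```python
import numpy as np

def extract_blocks(image, size=11):
    """image: (H, W, bands) array; returns one (size, size, bands) block per pixel."""
    half = size // 2
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="reflect")
    h, w, bands = image.shape
    blocks = np.empty((h * w, size, size, bands), dtype=image.dtype)
    idx = 0
    for i in range(h):
        for j in range(w):
            blocks[idx] = padded[i:i + size, j:j + size, :]
            idx += 1
    return blocks
```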
Step 3.4. Feature extraction is performed on the test set with the sub-networks N_featureSpe and N_featureSpa;
Step 3.4.1. The sub-network N_featureSpe performs feature extraction on the hyperspectral test set to obtain the spectral feature T_spe of the hyperspectral image H';
Step 3.4.2. The sub-network N_featureSpa performs feature extraction on the hyperspectral test set to obtain the spatial feature T_spa of the hyperspectral image H';
Step 3.4.3. The sub-network N_featureSpa performs feature extraction on the LiDAR test set to obtain the elevation feature T_L of the LiDAR image L';
Step 3.5. Shallow feature-level fusion is performed with the sub-network N_shallowfusion to obtain shallow features;
Step 3.5.1. The L_SAM module operates on the spectral feature T_spe to calculate the spectral attention feature T_speA of the hyperspectral image H';
Step 3.5.2. The L_DPAM module operates on the spatial feature T_spa and the elevation feature T_L to calculate the spatial-modal attention feature T_maHF of the hyperspectral image H';
Step 3.5.3. The L_DPAM module operates on the elevation feature T_L and the spatial feature T_spa to calculate the spatial-modal attention feature T_maLF of the LiDAR image L';
Step 3.6. Deep feature-level fusion is performed with the sub-network N_deepfusion to obtain deep features;
Step 3.6.1. The max-pooling layer MaxPool1 operates on the spectral attention feature T_speA to calculate the deep spectral feature of the hyperspectral image H';
Step 3.6.2. The max-pooling layer MaxPool2 operates on the spatial-modal attention feature T_maHF to calculate the deep spatial feature of the hyperspectral image H';
Step 3.6.3. The max-pooling layer MaxPool2 operates on the spatial-modal attention feature T_maLF to calculate the deep elevation feature of the LiDAR image L';
Step 3.6.4. The custom connection layer Concatenate operates on the deep spectral feature, the deep spatial feature and the deep elevation feature to calculate the deep feature T_M;
Step 3.7. The sub-network N_cls classifies the deep feature T_M and calculates the classification prediction result TE_pred.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110446906.6A CN113255727A (en) | 2021-04-25 | 2021-04-25 | Multi-sensor remote sensing image fusion classification method capable of layering dense fusion network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113255727A true CN113255727A (en) | 2021-08-13 |
Family
ID=77221568
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110446906.6A Withdrawn CN113255727A (en) | 2021-04-25 | 2021-04-25 | Multi-sensor remote sensing image fusion classification method capable of layering dense fusion network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113255727A (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113887645A (en) * | 2021-10-13 | 2022-01-04 | 西北工业大学 | Remote sensing image fusion classification method based on joint attention twin network |
CN113887645B (en) * | 2021-10-13 | 2024-02-13 | 西北工业大学 | Remote sensing image fusion classification method based on joint attention twin network |
CN113920323A (en) * | 2021-11-18 | 2022-01-11 | 西安电子科技大学 | Different-chaos hyperspectral image classification method based on semantic graph attention network |
CN113920323B (en) * | 2021-11-18 | 2023-04-07 | 西安电子科技大学 | Different-chaos hyperspectral image classification method based on semantic graph attention network |
CN114565858A (en) * | 2022-02-25 | 2022-05-31 | 辽宁师范大学 | Multispectral image change detection method based on geospatial perception low-rank reconstruction network |
CN114565858B (en) * | 2022-02-25 | 2024-04-05 | 辽宁师范大学 | Multispectral image change detection method based on geospatial perception low-rank reconstruction network |
CN114663777A (en) * | 2022-03-07 | 2022-06-24 | 辽宁师范大学 | Hyperspectral image change detection method based on spatio-temporal joint graph attention mechanism |
CN114663777B (en) * | 2022-03-07 | 2024-04-05 | 辽宁师范大学 | Hyperspectral image change detection method based on space-time joint graph attention mechanism |
CN114663779A (en) * | 2022-03-25 | 2022-06-24 | 辽宁师范大学 | Multi-temporal hyperspectral image change detection method based on time-space-spectrum attention mechanism |
CN114581838A (en) * | 2022-04-26 | 2022-06-03 | 阿里巴巴达摩院(杭州)科技有限公司 | Image processing method and device and cloud equipment |
CN116051896A (en) * | 2023-01-28 | 2023-05-02 | 西南交通大学 | Hyperspectral image classification method of lightweight mixed tensor neural network |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WW01 | Invention patent application withdrawn after publication | Application publication date: 20210813 |