WO2024040828A1 - Method and device for fusion and classification of remote sensing hyperspectral image and laser radar image - Google Patents

Method and device for fusion and classification of remote sensing hyperspectral image and laser radar image

Info

Publication number
WO2024040828A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
hyperspectral
input
output
lidar
Prior art date
Application number
PCT/CN2022/142160
Other languages
French (fr)
Chinese (zh)
Inventor
于文博
黄鹤
沈纲祥
Original Assignee
苏州大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州大学
Publication of WO2024040828A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/58Extraction of image or video features relating to hyperspectral data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • The invention relates to the technical field of remote sensing image processing, and in particular to a method and device for fusion classification of remote sensing hyperspectral images and lidar images.
  • Hyperspectral images have rich spatial and spectral information: the spatial information is the spatial position of pixels at each wavelength, and the spectral information is the spectral curve composed of the spectral reflectance of a single pixel across wavelengths.
  • Lidar images record the elevation information of target objects.
  • The fusion classification methods for hyperspectral images and lidar images in the field of remote sensing can generally be divided into methods based on classic machine learning and methods based on deep learning.
  • Methods based on classic machine learning use the spatial and spectral information in hyperspectral images and the elevation information in lidar images to construct feature extraction modules and fusion modules, achieving a joint representation of the different remote sensing images.
  • The more commonly used machine learning theories include Principal Component Analysis (PCA), Minimum Noise Fraction (MNF), and Linear Discriminant Analysis (LDA).
  • Other machine learning methods such as manifold learning algorithms, structural sparsification algorithms, dictionary set decomposition algorithms, etc. also play an important role.
  • This type of method usually extracts discriminative information from hyperspectral images and lidar images, and preserves the separability of samples by fusing the different kinds of information.
  • Some deep network models have also been applied to the fusion classification of hyperspectral and lidar images, such as the auto-encoder (AE), the variational auto-encoder (VAE), and the long short-term memory network (LSTM).
  • For example, Deep Encoder-Decoder Networks for Classification of Hyperspectral and LiDAR Data, published in IEEE Geoscience and Remote Sensing Letters in 2020, proposed a fully connected network based on an encoder-decoder structure that extracts the features of the hyperspectral image and the lidar image separately and fuses them, realizing the reconstruction of feature information and its transmission to a deeper embedding space.
  • However, the existing fusion classification methods for hyperspectral and lidar images in the field of remote sensing have certain shortcomings: (1) existing methods do not take into account the correlation between the illumination information of hyperspectral images and the elevation information of lidar images, so it is difficult to achieve a deep fusion of the two, which weakens the performance of the classification model; (2) existing methods do not apply the illumination information of hyperspectral images to the construction of the fusion classification model, and do not consider decomposing the hyperspectral image into an intrinsic image and an illumination image so as to exploit the advantages of both.
  • The technical problem to be solved by the present invention is to overcome the problems of the existing technology by proposing a method and device for fusion classification of remote sensing hyperspectral images and lidar images that can fully fuse the important discriminative information in multi-source remote sensing images, achieve high-precision classification of target pixels, avoid the loss and attrition of important information during fusion, and reduce problems such as decreased classification accuracy caused by missing information.
  • To solve the above technical problems, the present invention provides a fusion classification method for remote sensing hyperspectral images and lidar images, including:
  • S1: Acquire a hyperspectral image and a lidar image; the categories of the ground objects in the two images are denoted label;
  • S2: Perform intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image. For each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, select the surrounding neighborhood of size s×s as that pixel's neighborhood block, where the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s×s×B and the neighborhood block of a lidar pixel in the lidar image L has size s×s;
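A minimal sketch of the neighborhood-block extraction in S2, assuming reflect padding at the image borders and row-major pixel ordering (neither is specified in the patent); the array and function names are illustrative only.

```python
import numpy as np

def extract_patches(image: np.ndarray, s: int) -> np.ndarray:
    """Return the s x s neighborhood block of every pixel.

    image: (X, Y) lidar map or (X, Y, B) hyperspectral cube;
    returns (X*Y, s, s) or (X*Y, s, s, B) patches.
    """
    r = s // 2
    squeeze = image.ndim == 2
    if squeeze:
        image = image[:, :, None]
    X, Y, B = image.shape
    padded = np.pad(image, ((r, r), (r, r), (0, 0)), mode="reflect")
    patches = np.empty((X * Y, s, s, B), dtype=image.dtype)
    for i in range(X):
        for j in range(Y):
            patches[i * Y + j] = padded[i:i + s, j:j + s, :]
    return patches[..., 0] if squeeze else patches
```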
  • S3: Train the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 using the neighborhood blocks, where the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s×s×B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixels of size s×s×B from the lidar image, and the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s×s×B from the hyperspectral illumination image; their outputs are O_1, O_2, O_3, O_4, O_5 and O_6 respectively, each of size s×s×d;
  • S4: Use the concatenation layer to concatenate the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise, obtaining O_12, O_34 and O_56;
  • S5: Input O_34 and O_56 to the 1st multimodal grouped convolution layer to obtain the output O_3456^1; input O_34 to the 2nd such layer to obtain O_34^1; input O_56 to the 3rd to obtain O_56^1; input O_34^1, O_3456^1 and O_56^1 to the 4th to obtain O_3456^2; input O_34^1 to the 5th to obtain O_34^2; input O_56^1 to the 6th to obtain O_56^2; input O_34^2, O_3456^2 and O_56^2 to the 7th to obtain O_3456^3; input O_34^2 to the 8th to obtain O_34^3; input O_56^2 to the 9th to obtain O_56^3; input O_34^3, O_3456^3 and O_56^3 to the 10th to obtain O_3456^4; input O_12 and O_3456^1 to the 11th to obtain O_12^1; input O_12^1 and O_3456^2 to the 12th to obtain O_12^2; input O_12^2 and O_3456^3 to the 13th to obtain O_12^3; then input O_3456^4 and O_12^3, each of size s×s×d, into a two-dimensional average pooling layer of size s×s to obtain O_3456^5 and O_12^4 of size 1×d;
  • S6: Input O_12^4 and O_3456^5 into the concatenation layer to obtain the output O_123456 of size 1×2d, and input O_123456 into the fully connected layer to obtain the final output category \hat{label}.
  • In step S1, after the hyperspectral image and the lidar image are selected, normalization preprocessing is performed on both images.
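The patent calls for normalization preprocessing without fixing the exact scheme; a per-band min-max scaling to [0, 1], one common choice for hyperspectral and lidar data, might look like this:

```python
import numpy as np

def minmax_normalize(image: np.ndarray) -> np.ndarray:
    """Scale each band of an (X, Y, B) cube, or a single (X, Y) band, to [0, 1]."""
    img = image.astype(np.float64)
    flat = img.reshape(-1, img.shape[-1]) if img.ndim == 3 else img.reshape(-1, 1)
    mins = flat.min(axis=0)
    spans = flat.max(axis=0) - mins
    return ((flat - mins) / (spans + 1e-12)).reshape(img.shape)
```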
  • The method of decomposing the hyperspectral image into an intrinsic image and an illumination image in step S2 includes computing, for each hyperspectral pixel H_i with 1 ≤ i ≤ X×Y, the matrix
    D_i = [H_1, ..., H_{i-1}, H_{i+1}, ..., H_{X×Y}, I_B] ∈ R^{B×(B+X×Y-1)},
    where I_B is the identity matrix of size B×B, and solving
    min ||α_i||_1  s.t.  H_i = D_i α_i,
    where α_i has shape (B+X×Y-1)×1;
  • The deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 in step S3 each include multiple two-dimensional convolution layers; each layer has d convolution kernels of size [3, 3] with sliding stride [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
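A minimal PyTorch sketch of the branch structure just described: stacks of 2-D convolutions with d kernels of size 3×3 and stride 1. The ReLU activations, the padding of 1 (to keep the spatial size at s×s), the use of three layers, and the tiling of the single-band s×s lidar patch to B channels (which makes the stated s×s×B lidar input and the L_2/L_3 weight sharing dimensionally consistent) are assumptions, not details confirmed by the patent.

```python
import torch
import torch.nn as nn

def make_branch(in_channels: int, d: int, num_layers: int = 3) -> nn.Sequential:
    """One deep network branch: num_layers conv layers, d kernels of 3x3, stride 1."""
    layers, c = [], in_channels
    for _ in range(num_layers):
        layers += [nn.Conv2d(c, d, kernel_size=3, stride=1, padding=1), nn.ReLU()]
        c = d
    return nn.Sequential(*layers)

B, d, s = 63, 120, 11                                # Trento-sized example values
L1 = make_branch(B, d)
L2 = make_branch(B, d); L3 = L2                      # L2 and L3 share all weights
L4 = make_branch(B, d); L5 = L4                      # L4 and L5 share all weights
L6 = make_branch(B, d)

intrinsic = torch.randn(8, B, s, s)                  # intrinsic-image patches
lidar = torch.randn(8, 1, s, s).repeat(1, B, 1, 1)   # lidar patch tiled to B channels (assumption)
illum = torch.randn(8, B, s, s)                      # illumination-image patches
O1, O2 = L1(intrinsic), L2(intrinsic)
O3, O4 = L3(lidar), L4(lidar)
O5, O6 = L5(illum), L6(illum)
print(O1.shape)  # torch.Size([8, 120, 11, 11]) -> s x s x d per sample
```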
  • The loss function used when training the deep network branches in step S3 compares label, the ground-truth category of the input sample, with \hat{label}, the category predicted by the network.
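The exact loss formula is not reproduced legibly in this text; a standard cross-entropy between label and \hat{label}, shown below purely as a stand-in assumption, is the usual choice for this kind of classification training.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()                 # assumed stand-in, not the patent's formula
logits = torch.randn(8, 6, requires_grad=True)    # predicted scores: 8 samples, 6 classes
labels = torch.randint(0, 6, (8,))                # ground-truth categories (label)
loss = criterion(logits, labels)
loss.backward()
```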
  • In step S4, the concatenation layer concatenates the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise: O_12 = Concatenation(O_1, O_2), O_34 = Concatenation(O_3, O_4), O_56 = Concatenation(O_5, O_6).
  • The present invention also provides a remote sensing hyperspectral image and lidar image fusion classification device, including:
  • A data acquisition module, used to acquire a hyperspectral image and a lidar image; the categories of the ground objects in the two images are denoted label;
  • An image decomposition module, used to perform intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image and, for each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, to select the surrounding neighborhood of size s×s as that pixel's neighborhood block, where the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s×s×B and the neighborhood block of a lidar pixel in the lidar image L has size s×s;
  • A deep network training module, used to train the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 using the neighborhood blocks, where the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s×s×B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixels of size s×s×B from the lidar image, and the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s×s×B from the hyperspectral illumination image; their outputs are O_1, O_2, O_3, O_4, O_5 and O_6 respectively, each of size s×s×d;
  • An image concatenation module, which uses the concatenation layer to concatenate the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise, obtaining O_12, O_34 and O_56;
  • A multimodal fusion module, used to input O_34 and O_56 to the 1st multimodal grouped convolution layer to obtain the output O_3456^1; input O_34 to the 2nd such layer to obtain O_34^1; input O_56 to the 3rd to obtain O_56^1; input O_34^1, O_3456^1 and O_56^1 to the 4th to obtain O_3456^2; input O_34^1 to the 5th to obtain O_34^2; input O_56^1 to the 6th to obtain O_56^2; input O_34^2, O_3456^2 and O_56^2 to the 7th to obtain O_3456^3; input O_34^2 to the 8th to obtain O_34^3; input O_56^2 to the 9th to obtain O_56^3; input O_34^3, O_3456^3 and O_56^3 to the 10th to obtain O_3456^4; input O_12 and O_3456^1 to the 11th to obtain O_12^1; input O_12^1 and O_3456^2 to the 12th to obtain O_12^2; and input O_12^2 and O_3456^3 to the 13th to obtain O_12^3;
  • O_3456^4 and O_12^3, each of size s×s×d, are input into the two-dimensional average pooling layer of size s×s to obtain O_3456^5 and O_12^4 of size 1×d;
  • An image classification module, which inputs O_12^4 and O_3456^5 into the concatenation layer to obtain the output O_123456 of size 1×2d, and inputs O_123456 into the fully connected layer to obtain the final output category \hat{label}.
  • The data acquisition module includes a data preprocessing submodule, which is used to perform normalization preprocessing on the hyperspectral image and the lidar image after they are selected.
  • The image decomposition module performs intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, including computing, for each hyperspectral pixel H_i with 1 ≤ i ≤ X×Y, the matrix
    D_i = [H_1, ..., H_{i-1}, H_{i+1}, ..., H_{X×Y}, I_B] ∈ R^{B×(B+X×Y-1)},
    where I_B is the identity matrix of size B×B, and solving
    min ||α_i||_1  s.t.  H_i = D_i α_i,
    where α_i has shape (B+X×Y-1)×1;
  • The deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 each include multiple two-dimensional convolution layers; each layer has d convolution kernels of size [3, 3] with sliding stride [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
  • The fusion classification method for remote sensing hyperspectral images and lidar images proposed by this invention can fully fuse the important discriminative information in multi-source remote sensing images, achieve high-precision classification of target pixels, avoid the loss and attrition of important information during fusion, and reduce problems such as decreased classification accuracy caused by missing information;
  • The present invention applies the intrinsic image decomposition theory of hyperspectral images to the fusion classification of hyperspectral and lidar images, fully combining intrinsic image decomposition with multimodal remote sensing image fusion classification, and avoids the common practice of discarding the decomposed illumination image during intrinsic image decomposition, reducing the loss of information;
  • The present invention proposes a method for fully fusing the illumination image obtained by decomposing the hyperspectral image with the lidar image, so that the correlation between the illumination information in the hyperspectral image and the elevation information in the lidar image is fully explored and utilized, giving full play to the advantages of illumination images in model construction and improving the final classification performance.
  • Figure 1 is a flow chart of a fusion classification method of remote sensing hyperspectral images and lidar images provided by the present invention.
  • Figure 2 is a schematic framework diagram of a remote sensing hyperspectral image and lidar image fusion classification device provided by the present invention.
  • the reference numbers are as follows: 10. Data acquisition module; 20. Image decomposition module; 30. Deep network training module; 40. Image splicing module; 50. Multi-modal fusion module; 60. Image classification module.
  • An embodiment of the present invention provides a method for fusion and classification of remote sensing hyperspectral images and lidar images, including:
  • S1: Acquire a hyperspectral image and a lidar image; the categories of the ground objects in the two images are denoted label;
  • S2: Perform intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image. For each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, select the surrounding neighborhood of size s×s as that pixel's neighborhood block, where the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s×s×B and the neighborhood block of a lidar pixel in the lidar image L has size s×s;
  • S3: Train the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 using the neighborhood blocks, where the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s×s×B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixels of size s×s×B from the lidar image, and the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s×s×B from the hyperspectral illumination image; their outputs are O_1, O_2, O_3, O_4, O_5 and O_6 respectively, each of size s×s×d;
  • S4: Use the concatenation layer to concatenate the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise, obtaining O_12, O_34 and O_56;
  • S5: Input O_34 and O_56 to the 1st multimodal grouped convolution layer to obtain the output O_3456^1; input O_34 to the 2nd such layer to obtain O_34^1; input O_56 to the 3rd to obtain O_56^1; input O_34^1, O_3456^1 and O_56^1 to the 4th to obtain O_3456^2; input O_34^1 to the 5th to obtain O_34^2; input O_56^1 to the 6th to obtain O_56^2; input O_34^2, O_3456^2 and O_56^2 to the 7th to obtain O_3456^3; input O_34^2 to the 8th to obtain O_34^3; input O_56^2 to the 9th to obtain O_56^3; input O_34^3, O_3456^3 and O_56^3 to the 10th to obtain O_3456^4; input O_12 and O_3456^1 to the 11th to obtain O_12^1; input O_12^1 and O_3456^2 to the 12th to obtain O_12^2; input O_12^2 and O_3456^3 to the 13th to obtain O_12^3; then input O_3456^4 and O_12^3, each of size s×s×d, into a two-dimensional average pooling layer of size s×s to obtain O_3456^5 and O_12^4 of size 1×d;
  • S6: Input O_12^4 and O_3456^5 into the concatenation layer to obtain the output O_123456 of size 1×2d, and input O_123456 into the fully connected layer to obtain the final output category \hat{label}.
  • The hyperspectral image H and the lidar image L are selected according to the actual problem, where the hyperspectral image has size X×Y×B, X and Y being the spatial dimensions of the hyperspectral image in each band and B the number of bands of the hyperspectral image.
  • The lidar image has size X×Y, where X and Y are the spatial dimensions of the lidar image; the spatial dimensions of the two images are the same.
  • In step S2, the method of decomposing the hyperspectral image into an intrinsic image and an illumination image includes computing, for each hyperspectral pixel H_i with 1 ≤ i ≤ X×Y, the matrix
    D_i = [H_1, ..., H_{i-1}, H_{i+1}, ..., H_{X×Y}, I_B] ∈ R^{B×(B+X×Y-1)},
    where I_B is the identity matrix of size B×B, and solving
    min ||α_i||_1  s.t.  H_i = D_i α_i,
    where α_i has shape (B+X×Y-1)×1;
  • In step S2, for each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel (each of the three images contains X×Y pixels), the surrounding neighborhood of size s×s is selected as that pixel's neighborhood block, where the neighborhood block of each hyperspectral pixel in the hyperspectral image H has size s×s×B and the neighborhood block of a lidar pixel in the lidar image L has size s×s.
  • In step S3, six deep network branches are first constructed, namely L_1, L_2, L_3, L_4, L_5 and L_6, where the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s×s×B from the hyperspectral intrinsic image RE, the inputs of L_3 and L_4 are lidar pixels of size s×s×B from the lidar image L, and the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s×s×B from the hyperspectral illumination image SH.
  • Each of the six deep network branches is composed of three two-dimensional convolution layers; each layer has d convolution kernels of size [3, 3] with sliding stride [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
  • The outputs of the six deep network branches are O_1, O_2, O_3, O_4, O_5 and O_6 respectively, each of size s×s×d.
  • The loss function used when training the deep network branches in step S3 compares label, the ground-truth category of the input sample, with \hat{label}, the category predicted by the network.
  • In step S4, the concatenation layer (Concatenation Layer) is used to concatenate the outputs of the six deep network branches pairwise, according to the following formulas: O_12 = Concatenation(O_1, O_2), O_34 = Concatenation(O_3, O_4), O_56 = Concatenation(O_5, O_6).
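Pairwise concatenation happens along the channel dimension; for PyTorch tensors of shape (batch, d, s, s) this is simply:

```python
import torch

d, s = 120, 11
O1, O2 = torch.randn(8, d, s, s), torch.randn(8, d, s, s)
O12 = torch.cat([O1, O2], dim=1)   # shape (8, 2d, s, s), i.e. s x s x 2d per sample
```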
  • In step S5, Concatenation(O_34, O_56) is input to the 1st multimodal grouped convolution layer to obtain the output O_3456^1; O_34 is input to the 2nd such layer to obtain O_34^1; O_56 is input to the 3rd to obtain O_56^1; Concatenation(O_34^1, O_3456^1, O_56^1) is input to the 4th to obtain O_3456^2; O_34^1 is input to the 5th to obtain O_34^2; O_56^1 is input to the 6th to obtain O_56^2; Concatenation(O_34^2, O_3456^2, O_56^2) is input to the 7th to obtain O_3456^3; O_34^2 is input to the 8th to obtain O_34^3; O_56^2 is input to the 9th to obtain O_56^3; Concatenation(O_34^3, O_3456^3, O_56^3) is input to the 10th to obtain O_3456^4; Concatenation(O_12, O_3456^1) is input to the 11th to obtain O_12^1; Concatenation(O_12^1, O_3456^2) is input to the 12th to obtain O_12^2; Concatenation(O_12^2, O_3456^3) is input to the 13th to obtain O_12^3 of size s×s×d. O_3456^4 and O_12^3 are then input into a two-dimensional average pooling layer (Average Pooling Layer) of size s×s, yielding O_3456^5 and O_12^4 of size 1×d.
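A hedged PyTorch sketch of the S5/S6 cascade above. The internal structure of a "multimodal grouped convolution layer" is not spelled out in this text, so each one is modeled here as a grouped 2-D convolution (groups=2, kernel 3×3, padding 1, output width d); those choices, and the batch of 1, are assumptions for shape-checking only.

```python
import torch
import torch.nn as nn

d, s, C = 120, 11, 6    # feature width, patch size, number of classes

def mgc(in_ch: int, out_ch: int = d) -> nn.Conv2d:
    """Stand-in for one multimodal grouped convolution layer."""
    return nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, groups=2)

cat = lambda *ts: torch.cat(ts, dim=1)
O12, O34, O56 = (torch.randn(1, 2 * d, s, s) for _ in range(3))

O3456_1 = mgc(4 * d)(cat(O34, O56))                       # layer 1
O34_1, O56_1 = mgc(2 * d)(O34), mgc(2 * d)(O56)           # layers 2-3
O3456_2 = mgc(3 * d)(cat(O34_1, O3456_1, O56_1))          # layer 4
O34_2, O56_2 = mgc(d)(O34_1), mgc(d)(O56_1)               # layers 5-6
O3456_3 = mgc(3 * d)(cat(O34_2, O3456_2, O56_2))          # layer 7
O34_3, O56_3 = mgc(d)(O34_2), mgc(d)(O56_2)               # layers 8-9
O3456_4 = mgc(3 * d)(cat(O34_3, O3456_3, O56_3))          # layer 10
O12_1 = mgc(3 * d)(cat(O12, O3456_1))                     # layer 11
O12_2 = mgc(2 * d)(cat(O12_1, O3456_2))                   # layer 12
O12_3 = mgc(2 * d)(cat(O12_2, O3456_3))                   # layer 13
pool = nn.AvgPool2d(s)                                    # s x s average pooling
O3456_5, O12_4 = pool(O3456_4).flatten(1), pool(O12_3).flatten(1)   # each (1, d)
O123456 = torch.cat([O12_4, O3456_5], dim=1)              # (1, 2d)  -- step S6
logits = nn.Linear(2 * d, C)(O123456)                     # final category scores
```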
  • The fusion classification method for remote sensing hyperspectral images and lidar images proposed by the present invention efficiently combines intrinsic image decomposition theory with multimodal remote sensing image fusion classification, gives full play to the advantages of illumination images in model construction, and reduces the loss and attrition of important discriminative information.
  • This invention introduces the intrinsic image decomposition of hyperspectral images into multimodal remote sensing image fusion classification for the first time, balancing the illumination information of the illumination image with the elevation information of the lidar image: it fuses the illumination and elevation information while guiding the intrinsic information mining process, thereby improving the separability of samples.
  • The present invention proposes a method for fully fusing the illumination image obtained by decomposing the hyperspectral image with the lidar image, so that the correlation between the illumination information in the hyperspectral image and the elevation information in the lidar image is fully explored and utilized, taking advantage of illumination images in model construction.
  • The present invention proposes a discriminative feature extraction method for hyperspectral and lidar images, which greatly improves the correlation between the two and reduces information imbalance during the information mining process.
  • The present invention applies the multimodal grouped convolution layer to the fusion classification of hyperspectral and lidar images, giving full play to its application value in this field, greatly improving the ability to combine different modalities, reducing unnecessary redundant information, and enhancing the expression of important information.
  • The hyperspectral image and lidar image used in the fusion classification method proposed by the present invention were taken in Trento, Italy; the hyperspectral image has size 166×600×63 and the lidar image has size 166×600.
  • The input hyperspectral image has size 166×600×63, and the input lidar image has size 166×600.
  • The neighborhood size is 11, and the number of convolution kernels in each two-dimensional convolution layer is 120.
  • The hyperspectral image is decomposed to obtain an intrinsic image of size 166×600×63 and an illumination image of size 166×600×63.
  • Neighborhood information is then selected: a neighborhood block of size 11×11×63 is obtained for each pixel, and the neighborhood blocks are input into the deep network for training.
  • The overall classification accuracy and the average classification accuracy were used to evaluate the classification results.
  • The overall classification accuracy is the number of correctly classified samples divided by the total number of samples.
  • The average classification accuracy first computes, for each category, the ratio of the number of correctly classified samples in that category to the number of samples in that category, and then averages these ratios over all categories.
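The two metrics can be computed directly; a short sketch (class labels assumed to be integers):

```python
import numpy as np

def overall_accuracy(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Correctly classified samples divided by all samples."""
    return float(np.mean(y_true == y_pred))

def average_accuracy(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Per-class accuracy, averaged over the classes present in y_true."""
    classes = np.unique(y_true)
    return float(np.mean([np.mean(y_pred[y_true == c] == c) for c in classes]))
```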
  • The method of the present invention can better fuse and classify hyperspectral and lidar images, with fewer misclassified samples.
  • When the hyperspectral image decomposition part of the method of the present invention is removed and the above experiment is repeated, the overall classification accuracy obtained is 85.49%.
  • This shows that the method of the present invention has strong information mining capability. In summary, the method of the present invention can effectively improve the classification ability and classification accuracy for multi-source remote sensing images.
  • The following introduces a remote sensing hyperspectral image and lidar image fusion classification device disclosed in a second embodiment of the present invention; the device described below and the fusion classification method described above may be referred to in correspondence with each other.
  • An embodiment of the present invention provides a remote sensing hyperspectral image and lidar image fusion classification device, which includes:
  • The data acquisition module 10 is used to acquire a hyperspectral image and a lidar image; the categories of the ground objects in the two images are denoted label;
  • The image decomposition module 20 is used to perform intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image and, for each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, to select the surrounding neighborhood of size s×s as that pixel's neighborhood block, where the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s×s×B and the neighborhood block of a lidar pixel in the lidar image L has size s×s;
  • The deep network training module 30 is used to train the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 using the neighborhood blocks, where the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s×s×B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixels of size s×s×B from the lidar image, and the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s×s×B from the hyperspectral illumination image; their outputs are O_1, O_2, O_3, O_4, O_5 and O_6 respectively, each of size s×s×d;
  • The image concatenation module 40 uses the concatenation layer to concatenate the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise, obtaining O_12, O_34 and O_56;
  • The multimodal fusion module 50 is used to input O_34 and O_56 to the 1st multimodal grouped convolution layer to obtain the output O_3456^1; input O_34 to the 2nd such layer to obtain O_34^1; input O_56 to the 3rd to obtain O_56^1; input O_34^1, O_3456^1 and O_56^1 to the 4th to obtain O_3456^2; input O_34^1 to the 5th to obtain O_34^2; input O_56^1 to the 6th to obtain O_56^2; input O_34^2, O_3456^2 and O_56^2 to the 7th to obtain O_3456^3; input O_34^2 to the 8th to obtain O_34^3; input O_56^2 to the 9th to obtain O_56^3; input O_34^3, O_3456^3 and O_56^3 to the 10th to obtain O_3456^4; input O_12 and O_3456^1 to the 11th to obtain O_12^1; input O_12^1 and O_3456^2 to the 12th to obtain O_12^2; and input O_12^2 and O_3456^3 to the 13th to obtain O_12^3. O_3456^4 and O_12^3, each of size s×s×d, are then input into the two-dimensional average pooling layer of size s×s to obtain O_3456^5 and O_12^4 of size 1×d;
  • The image classification module 60 inputs O_12^4 and O_3456^5 into the concatenation layer to obtain the output O_123456 of size 1×2d, and inputs O_123456 into the fully connected layer to obtain the final output category \hat{label}.
  • The data acquisition module 10 includes a data preprocessing submodule, which is used to perform normalization preprocessing on the hyperspectral image and the lidar image after they are selected.
  • The image decomposition module 20 performs intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, including computing, for each hyperspectral pixel H_i with 1 ≤ i ≤ X×Y, the matrix
    D_i = [H_1, ..., H_{i-1}, H_{i+1}, ..., H_{X×Y}, I_B] ∈ R^{B×(B+X×Y-1)},
    where I_B is the identity matrix of size B×B, and solving
    min ||α_i||_1  s.t.  H_i = D_i α_i,
    where α_i has shape (B+X×Y-1)×1;
  • The deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 each include multiple two-dimensional convolution layers; each layer has d convolution kernels of size [3, 3] with sliding stride [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
  • The remote sensing hyperspectral image and lidar image fusion classification device of this embodiment is used to implement the aforementioned remote sensing hyperspectral image and lidar image fusion classification method, so its specific implementation can be found in the description of the corresponding method embodiments above; since its functions correspond to those of the method, they are not repeated here.
  • embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operating steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to a method for fusion and classification of a remote sensing hyperspectral image and a laser radar image, comprising: acquiring a hyperspectral image and a laser radar image; performing intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, and selecting neighborhood blocks for each hyperspectral intrinsic pixel, each hyperspectral illumination pixel, and each laser radar pixel; training a plurality of deep network branches using the neighborhood blocks; concatenating the outputs of the deep network branches pairwise using a concatenation layer; and performing multimodal fusion on the concatenated outputs to obtain a final output category. The method can fully fuse the important discriminative information in multi-source remote sensing images, achieving high-precision classification of target pixels; the missing and loss of important information during fusion are avoided, reducing problems such as decreased classification precision caused by information loss.

Description

Remote sensing hyperspectral image and lidar image fusion classification method and device

Technical Field
The invention relates to the technical field of remote sensing image processing, and in particular to a method and device for fusion classification of remote sensing hyperspectral images and lidar images.
Background
In the field of remote sensing, hyperspectral images and lidar images are widely used in various related research. Hyperspectral images have rich spatial and spectral information: the spatial information is the spatial position of pixels at each wavelength, and the spectral information is the spectral curve composed of the spectral reflectance of a single pixel across wavelengths. Lidar images record the elevation information of target ground objects. By fully fusing hyperspectral images and lidar images, complementary information can be obtained, so that the complete information of ground objects can be learned and modeled. At the same time, by fusing and classifying the two types of remote sensing images, the characteristics embedded in each pixel can be fully mined, improving the recognition accuracy of subsequent classification research. Early fusion classification methods usually used two independent branches to extract features from the two images and fused the multi-source information through simple connections; such methods did not consider the correlation between the branches and struggled to balance the multi-source information. With the growth of computing power and the deepening of deep learning research, methods that fully fuse hyperspectral and lidar images by training neural networks have been proposed; these methods improve the information extraction process for the different images, strengthen their correlation, and improve algorithm performance.
At present, fusion classification methods for hyperspectral images and lidar images in the field of remote sensing can generally be divided into methods based on classic machine learning and methods based on deep learning. Methods based on classic machine learning use the spatial and spectral information in hyperspectral images and the elevation information in lidar images to construct feature extraction modules and fusion modules, thereby achieving a joint representation of the different remote sensing images. Commonly used machine learning theories include Principal Component Analysis (PCA), Minimum Noise Fraction (MNF), and Linear Discriminant Analysis (LDA). Other machine learning methods, such as manifold learning, structured sparsity, and dictionary decomposition algorithms, also play an important role. These methods usually extract the discriminative information in hyperspectral and lidar images and preserve the separability of samples by fusing the different kinds of information. With the continuous deepening of deep learning theory, deep network models such as the auto-encoder (AE), the variational auto-encoder (VAE), and the long short-term memory network (LSTM) have also been applied to the fusion classification of hyperspectral and lidar images. These methods use complex network structures to extract deep discriminative information and describe the discriminative features of samples from multiple aspects, so more and more deep-learning-based fusion classification methods have been proposed. For example, Danfeng Hong et al., in Deep Encoder-Decoder Networks for Classification of Hyperspectral and LiDAR Data (IEEE Geoscience and Remote Sensing Letters, 2020), proposed a fully connected network based on an encoder-decoder structure that extracts the features of the hyperspectral image and the lidar image separately and fuses them, realizing the reconstruction of feature information and its transmission to a deeper embedding space.
In addition, in More Diverse Means Better: Multimodal Deep Learning Meets Remote-Sensing Imagery Classification (IEEE Transactions on Geoscience and Remote Sensing, also 2020), they proposed a deep learning framework for multimodal data that performs secondary learning of the complementary information between multimodal images through cross-selection of parameters during network training. Deep learning has thus been widely applied to the fusion classification of hyperspectral and lidar images in the field of remote sensing and has achieved excellent results.
However, the existing fusion classification methods for hyperspectral and lidar images in the field of remote sensing have certain shortcomings: (1) existing methods do not take into account the correlation between the illumination information of hyperspectral images and the elevation information of lidar images, so it is difficult to achieve a deep fusion of the two, which weakens the performance of the classification model; (2) existing methods do not apply the illumination information of hyperspectral images to the construction of the fusion classification model and do not consider decomposing the hyperspectral image into an intrinsic image and an illumination image so as to exploit the advantages of both; some methods attempt to introduce intrinsic decomposition theory into the classification model, but they all directly discard the decomposed illumination image and use only the intrinsic image and the lidar image for fusion classification, failing to exploit the advantages of multimodal remote sensing images; (3) when extracting discriminative information from hyperspectral and lidar images, existing methods rarely consider the joint and collaborative capabilities between the two and use only completely separate branches for information mining and feature extraction, which is not conducive to fully grasping the complete information of a pixel and makes it difficult to exploit the advantages of multimodal remote sensing images in pixel classification and recognition; (4) existing methods often use convolutional neural networks to extract spatial image information, but conventional convolutional neural networks do not consider the constraints of multimodal learning and have little structural design for fusing information between images of different modalities, which is not conducive to improving the fusion classification accuracy of hyperspectral and lidar images.
Summary of the Invention
To this end, the technical problem to be solved by the present invention is to overcome the problems of the existing technology by proposing a method and device for fusion classification of remote sensing hyperspectral images and lidar images that can fully fuse the important discriminative information in multi-source remote sensing images, achieve high-precision classification of target pixels, avoid the loss and attrition of important information during fusion, and reduce problems such as decreased classification accuracy caused by missing information.
To solve the above technical problems, the present invention provides a fusion classification method for remote sensing hyperspectral images and lidar images, including:
S1: Acquire a hyperspectral image and a lidar image; the categories of the ground objects in the two images are denoted label;
S2: Perform intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image. For each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, select the surrounding neighborhood of size s×s as that pixel's neighborhood block, where the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s×s×B and the neighborhood block of a lidar pixel in the lidar image L has size s×s;
S3: Train the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 using the neighborhood blocks, where the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s×s×B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixels of size s×s×B from the lidar image, and the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s×s×B from the hyperspectral illumination image; their outputs are O_1, O_2, O_3, O_4, O_5 and O_6 respectively, each of size s×s×d;
S4: Use the concatenation layer to concatenate the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise, obtaining O_12, O_34 and O_56;
S5: Input O_34 and O_56 to the 1st multimodal grouped convolution layer to obtain the output O_3456^1; input O_34 to the 2nd such layer to obtain O_34^1; input O_56 to the 3rd to obtain O_56^1; input O_34^1, O_3456^1 and O_56^1 to the 4th to obtain O_3456^2; input O_34^1 to the 5th to obtain O_34^2; input O_56^1 to the 6th to obtain O_56^2; input O_34^2, O_3456^2 and O_56^2 to the 7th to obtain O_3456^3; input O_34^2 to the 8th to obtain O_34^3; input O_56^2 to the 9th to obtain O_56^3; input O_34^3, O_3456^3 and O_56^3 to the 10th to obtain O_3456^4; input O_12 and O_3456^1 to the 11th to obtain O_12^1; input O_12^1 and O_3456^2 to the 12th to obtain O_12^2; input O_12^2 and O_3456^3 to the 13th to obtain O_12^3; then input O_3456^4 and O_12^3, each of size s×s×d, into a two-dimensional average pooling layer of size s×s to obtain O_3456^5 and O_12^4 of size 1×d;
S6: Input O_12^4 and O_3456^5 into the concatenation layer to obtain the output O_123456 of size 1×2d, and input O_123456 into the fully connected layer to obtain the final output category \hat{label}.
In one embodiment of the present invention, in step S1, after the hyperspectral image and the lidar image are selected, normalization preprocessing is performed on both images.
In one embodiment of the present invention, the method of decomposing the hyperspectral image into an intrinsic image and an illumination image in step S2 includes:

S2.1: For each hyperspectral pixel H_i, where 1 ≤ i ≤ X×Y, compute the matrix

D_i = [H_1, ..., H_{i-1}, H_{i+1}, ..., H_{X×Y}, I_B] ∈ R^{B×(B+X×Y-1)}

where I_B is the identity matrix of size B×B;

S2.2: Based on D_i, compute the vector α_i corresponding to each hyperspectral pixel H_i:

min ||α_i||_1  s.t.  H_i = D_i α_i

where α_i has shape (B+X×Y-1)×1;

S2.3: Construct a weight matrix W ∈ R^{(X×Y)×(X×Y)}, assigning the element W_ij in row i and column j according to the sparse coefficients α_i; based on W, compute the matrix G = (I_{X×Y} - W^T)(I_{X×Y} - W) + δI_{X×Y}, where I_{X×Y} is the identity matrix of size (X×Y)×(X×Y), δ is a constant, and T denotes the transpose;

S2.4: Flatten the hyperspectral image H into a two-dimensional matrix and take the logarithm to obtain log(flatten(H)); compute the matrix K = (I_B - 1_B 1_B^T / B) log(flatten(H)) (I_{X×Y} - 1_{X×Y} 1_{X×Y}^T / (X×Y)), where I_B and I_{X×Y} are identity matrices of sizes B×B and (X×Y)×(X×Y), and 1_B and 1_{X×Y} are all-ones vectors of sizes B×1 and (X×Y)×1, respectively;

S2.5: Based on G and K, compute the matrix ρ = δKG^{-1}; from ρ, obtain the intrinsic image RE = e^ρ and the illumination image SH = e^{log(H)-ρ} decomposed from the hyperspectral image H, where e is the natural constant; both images have size X×Y×B.
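A hedged NumPy sketch of steps S2.3 to S2.5, taking the weight matrix W of S2.3 as given (its element-wise assignment from the sparse codes α_i appears only as an image in the filing). This is a direct transcription for shape-checking, not an optimized implementation: inverting the (X×Y)×(X×Y) matrix G is only feasible for small images, and the default value of δ below is illustrative.

```python
import numpy as np

def intrinsic_decompose(H: np.ndarray, W: np.ndarray, delta: float = 1e-3):
    """H: (X, Y, B) hyperspectral cube; W: (X*Y, X*Y) weight matrix from S2.3."""
    X, Y, B = H.shape
    N = X * Y
    I_N, I_B = np.eye(N), np.eye(B)
    G = (I_N - W.T) @ (I_N - W) + delta * I_N                                   # S2.3
    logH = np.log(H.reshape(N, B).T)                                            # (B, N): log(flatten(H))
    ones_B, ones_N = np.ones((B, 1)), np.ones((N, 1))
    K = (I_B - ones_B @ ones_B.T / B) @ logH @ (I_N - ones_N @ ones_N.T / N)    # S2.4
    rho = delta * K @ np.linalg.inv(G)                                          # S2.5: rho = delta*K*G^-1
    RE = np.exp(rho.T.reshape(X, Y, B))                                         # intrinsic image
    SH = np.exp(np.log(H) - rho.T.reshape(X, Y, B))                             # illumination image
    return RE, SH
```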
In one embodiment of the present invention, the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 in step S3 each include multiple two-dimensional convolution layers; each layer has d convolution kernels of size [3, 3] with sliding stride [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
In one embodiment of the present invention, the loss function used when training the deep network branches in step S3 compares label, the ground-truth category of the input sample, with \hat{label}, the category predicted by the network.
In one embodiment of the present invention, in step S4 the concatenation layer concatenates the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise according to O_12 = Concatenation(O_1, O_2), O_34 = Concatenation(O_3, O_4), O_56 = Concatenation(O_5, O_6).
In addition, the present invention also provides a remote sensing hyperspectral image and lidar image fusion classification device, including:
a data acquisition module, used to acquire a hyperspectral image and a lidar image, where the categories of the ground objects in the two images are denoted label;
an image decomposition module, used to perform intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image and, for each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, to select the surrounding neighborhood of size s×s as that pixel's neighborhood block, where the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s×s×B and the neighborhood block of a lidar pixel in the lidar image L has size s×s;
a deep network training module, used to train the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 using the neighborhood blocks, where the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s×s×B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixels of size s×s×B from the lidar image, and the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s×s×B from the hyperspectral illumination image; their outputs are O_1, O_2, O_3, O_4, O_5 and O_6 respectively, each of size s×s×d;
an image concatenation module, which uses the concatenation layer to concatenate the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 pairwise, obtaining O_12, O_34 and O_56;
a multimodal fusion module, used to input O_34 and O_56 to the 1st multimodal grouped convolution layer to obtain the output O_3456^1; input O_34 to the 2nd such layer to obtain O_34^1; input O_56 to the 3rd to obtain O_56^1; input O_34^1, O_3456^1 and O_56^1 to the 4th to obtain O_3456^2; input O_34^1 to the 5th to obtain O_34^2; input O_56^1 to the 6th to obtain O_56^2; input O_34^2, O_3456^2 and O_56^2 to the 7th to obtain O_3456^3; input O_34^2 to the 8th to obtain O_34^3; input O_56^2 to the 9th to obtain O_56^3; input O_34^3, O_3456^3 and O_56^3 to the 10th to obtain O_3456^4; input O_12 and O_3456^1 to the 11th to obtain O_12^1; input O_12^1 and O_3456^2 to the 12th to obtain O_12^2; input O_12^2 and O_3456^3 to the 13th to obtain O_12^3; and then input O_3456^4 and O_12^3, each of size s×s×d, into a two-dimensional average pooling layer of size s×s to obtain O_3456^5 and O_12^4 of size 1×d;
图像分类模块,将O 12 4和O 3456 5输入拼接层得到输出O 123456,其尺寸为1×2d,将O 123456输入全连接层,得到最终输出的类别为
Figure PCTCN2022142160-appb-000005
Image classification module, input O 12 4 and O 3456 5 into the splicing layer to get the output O 123456 , whose size is 1×2d, input O 123456 into the fully connected layer, and get the final output category:
Figure PCTCN2022142160-appb-000005
In one embodiment of the present invention, the data acquisition module includes a data preprocessing submodule, which normalizes the hyperspectral image and the lidar image after they are selected.
In one embodiment of the present invention, the image decomposition module performs intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, as follows:
Calculate the matrix D_i corresponding to each hyperspectral pixel H_i, where 1 ≤ i ≤ X×Y:
D_i = [H_1, …, H_{i−1}, H_{i+1}, …, H_{X×Y}, I_B] ∈ R^{B×(B+X×Y−1)}
where I_B is the identity matrix of size B×B;
Based on D_i, calculate the vector α_i corresponding to each hyperspectral pixel H_i:
min ||α_i||_1  s.t.  H_i = D_i α_i
where α_i has shape (B+X×Y−1)×1;
Construct a weight matrix W ∈ R^{(X×Y)×(X×Y)} and assign each element W_ij in row i, column j (assignment formula, image PCTCN2022142160-appb-000006); based on W, compute the matrix G = (I_{X×Y} − W^T)(I_{X×Y} − W) + δ I_{X×Y}, where I_{X×Y} is the identity matrix of size (X×Y)×(X×Y), δ is a constant, and ^T denotes the matrix transpose;
Transform the hyperspectral image H into a two-dimensional matrix and take the logarithm to obtain log(flatten(H)); compute the matrix K = (I_B − 1_B 1_B^T / B) · log(flatten(H)) · (I_{X×Y} − 1_{X×Y} 1_{X×Y}^T / (X×Y)), where I_B and I_{X×Y} are identity matrices of size B×B and (X×Y)×(X×Y), and 1_B and 1_{X×Y} are all-ones vectors of size B×1 and (X×Y)×1, respectively;
Based on G and K, compute the matrix ρ = δKG^{−1}; from ρ, obtain the intrinsic image RE = e^ρ and the illumination image SH = e^{log(H)−ρ} decomposed from the hyperspectral image H, where e is the natural constant; both images have size X×Y×B.
In one embodiment of the present invention, the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 each comprise multiple two-dimensional convolution layers; in every layer the number of convolution kernels is d, the kernel size is [3, 3] and the kernel stride is [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
Compared with the prior art, the above technical solution of the present invention has the following advantages:
1. The remote sensing hyperspectral image and lidar image fusion classification method proposed by the present invention fully fuses the important discriminative information in multi-source remote sensing images, achieves high-precision classification of target pixels, and avoids the loss and attrition of important information during fusion, reducing the drop in classification accuracy caused by missing information;
2. The present invention applies the intrinsic image decomposition theory of hyperspectral images to the fusion classification of hyperspectral and lidar images, adapting intrinsic image decomposition to multi-modal remote sensing image fusion classification and avoiding the conventional practice of discarding the illumination image produced by the decomposition, which reduces information loss;
3. The present invention proposes a method that fully fuses the illumination image decomposed from the hyperspectral image with the lidar image, so that the correlation between the illumination information in the hyperspectral image and the elevation information in the lidar image is fully exploited and utilized, giving full play to the advantages of the illumination image in model construction and improving the final classification performance.
Description of Drawings
To make the content of the present invention easier to understand, the present invention is described in further detail below based on specific embodiments and in conjunction with the accompanying drawings.
Figure 1 is a flow chart of the remote sensing hyperspectral image and lidar image fusion classification method provided by the present invention.
Figure 2 is a schematic framework diagram of the remote sensing hyperspectral image and lidar image fusion classification device provided by the present invention.
The reference numerals are as follows: 10, data acquisition module; 20, image decomposition module; 30, deep network training module; 40, image splicing module; 50, multi-modal fusion module; 60, image classification module.
Detailed Description
The present invention is further described below in conjunction with the accompanying drawings and specific embodiments, so that those skilled in the art can better understand and implement the present invention; the given embodiments do not limit the present invention.
Referring to Figure 1, an embodiment of the present invention provides a remote sensing hyperspectral image and lidar image fusion classification method, comprising:
S1: Acquire a hyperspectral image and a lidar image; the class of each ground object in the two images is label;
S2: Perform intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image; for each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, select the surrounding neighborhood of size s×s as that pixel's neighborhood block, where the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s×s×B and the neighborhood block of each lidar pixel in the lidar image L has size s×s;
S3: Use the neighborhood blocks to train the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6, where the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s×s×B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixels of size s×s×B from the lidar image, and the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s×s×B from the hyperspectral illumination image; the outputs are O_1, O_2, O_3, O_4, O_5 and O_6, each of size s×s×d;
S4: Use a concatenation layer to splice the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 in pairs, obtaining O_12, O_34 and O_56;
S5: Input O_34 and O_56 to the 1st multi-modal grouped convolution layer to obtain the output O_3456^1; input O_34 to the 2nd multi-modal grouped convolution layer to obtain O_34^1; input O_56 to the 3rd to obtain O_56^1; input O_34^1, O_3456^1 and O_56^1 to the 4th to obtain O_3456^2; input O_34^1 to the 5th to obtain O_34^2; input O_56^1 to the 6th to obtain O_56^2; input O_34^2, O_3456^2 and O_56^2 to the 7th to obtain O_3456^3; input O_34^2 to the 8th to obtain O_34^3; input O_56^2 to the 9th to obtain O_56^3; input O_34^3, O_3456^3 and O_56^3 to the 10th to obtain O_3456^4; input O_12 and O_3456^1 to the 11th to obtain O_12^1; input O_12^1 and O_3456^2 to the 12th to obtain O_12^2; input O_12^2 and O_3456^3 to the 13th to obtain O_12^3; input O_3456^4 and O_12^3, each of size s×s×d, to a two-dimensional average pooling layer of size s×s to obtain O_3456^5 and O_12^4 of size 1×d;
S6: Input O_12^4 and O_3456^5 into a concatenation layer to obtain the output O_123456, of size 1×2d; input O_123456 into a fully connected layer to obtain the final output class (predicted-class formula, image PCTCN2022142160-appb-000007).
Specifically, in step S1 the hyperspectral image H and the lidar image L are selected according to the actual problem, where the hyperspectral image has size X×Y×B, X and Y being the spatial dimensions of each band and B the number of bands, and the lidar image has size X×Y, X and Y being its spatial dimensions; the two images have the same spatial dimensions. The hyperspectral image and the lidar image are normalized, the neighborhood size s is set (s is an odd number greater than 0), the number of convolution kernels in every two-dimensional convolution layer is d, the kernel size is [3, 3], the kernel stride is [1, 1], the padding parameter of every two-dimensional convolution layer is 'Same', and the activation function is the Tanh function. The class of each ground object in the two images is label, the label vector has size 1×(X×Y), and the number of classes is c.
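For illustration, a minimal preprocessing sketch in Python follows; the min-max normalization scheme, the variable names and the random placeholder data are assumptions, since the patent only states that the two images are normalized.

    import numpy as np

    def normalize(img):
        # Min-max scaling to [0, 1]; the exact normalization scheme is not
        # specified in the patent, so this choice is an assumption.
        img = img.astype(np.float64)
        return (img - img.min()) / (img.max() - img.min() + 1e-12)

    H = normalize(np.random.rand(166, 600, 63))   # placeholder hyperspectral cube
    L = normalize(np.random.rand(166, 600))       # placeholder lidar image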
In step S2, the method of performing intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image comprises:
S2.1: Calculate the matrix D_i corresponding to each hyperspectral pixel H_i, where 1 ≤ i ≤ X×Y:
D_i = [H_1, …, H_{i−1}, H_{i+1}, …, H_{X×Y}, I_B] ∈ R^{B×(B+X×Y−1)}
where I_B is the identity matrix of size B×B;
S2.2: Based on D_i, calculate the vector α_i corresponding to each hyperspectral pixel H_i:
min ||α_i||_1  s.t.  H_i = D_i α_i
where α_i has shape (B+X×Y−1)×1;
S2.3: Construct a weight matrix W ∈ R^{(X×Y)×(X×Y)} and assign each element W_ij in row i, column j (assignment formula, image PCTCN2022142160-appb-000008); based on W, compute the matrix G = (I_{X×Y} − W^T)(I_{X×Y} − W) + δ I_{X×Y}, where I_{X×Y} is the identity matrix of size (X×Y)×(X×Y), δ is a constant, and ^T denotes the matrix transpose;
S2.4: Transform the hyperspectral image H into a two-dimensional matrix and take the logarithm to obtain log(flatten(H)); compute the matrix K = (I_B − 1_B 1_B^T / B) · log(flatten(H)) · (I_{X×Y} − 1_{X×Y} 1_{X×Y}^T / (X×Y)), where I_B and I_{X×Y} are identity matrices of size B×B and (X×Y)×(X×Y), and 1_B and 1_{X×Y} are all-ones vectors of size B×1 and (X×Y)×1, respectively;
S2.5: Based on G and K, compute the matrix ρ = δKG^{−1}; from ρ, obtain the intrinsic image RE = e^ρ and the illumination image SH = e^{log(H)−ρ} decomposed from the hyperspectral image H, where e is the natural constant; both images have size X×Y×B.
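For illustration, the algebra of steps S2.3 to S2.5 can be followed literally in NumPy once a weight matrix W is available; the sketch below is a toy under stated assumptions (the L1-constrained coding of S2.1 and S2.2 is not implemented, W is a random placeholder, H is assumed strictly positive, and at real image sizes the dense (X×Y)×(X×Y) matrices would need sparse replacements), not the patented implementation.

    import numpy as np

    def intrinsic_decomposition(H, W, delta=1e-3):
        # H: hyperspectral image of size X x Y x B (assumed strictly positive);
        # W: (X*Y) x (X*Y) weight matrix built from the sparse codes alpha_i
        # of steps S2.1-S2.2 (construction not shown here).
        X, Y, B = H.shape
        n = X * Y
        F = H.reshape(n, B).T                      # flatten(H): B x (X*Y)
        I_n = np.eye(n)
        G = (I_n - W.T) @ (I_n - W) + delta * I_n  # step S2.3
        C_B = np.eye(B) - np.ones((B, B)) / B      # I_B - 1_B 1_B^T / B
        C_n = I_n - np.ones((n, n)) / n            # I_n - 1_n 1_n^T / n
        K = C_B @ np.log(F) @ C_n                  # step S2.4
        rho = delta * K @ np.linalg.inv(G)         # step S2.5
        RE = np.exp(rho).T.reshape(X, Y, B)        # intrinsic image
        SH = np.exp(np.log(F) - rho).T.reshape(X, Y, B)  # illumination image
        return RE, SH

    # Toy usage: tiny image and a random row-normalized placeholder W.
    X, Y, B = 6, 5, 4
    H = np.random.rand(X, Y, B) + 0.1
    W = np.random.rand(X * Y, X * Y)
    W /= W.sum(axis=1, keepdims=True)
    RE, SH = intrinsic_decomposition(H, W)
    assert np.allclose(RE * SH, H)                 # RE and SH reconstruct H

Note that the elementwise product RE · SH recovers H exactly, which is the defining property of the decomposition into reflectance-like and illumination components.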
In step S2, for each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel (each of the three images contains X×Y pixels), the surrounding neighborhood of size s×s is selected as that pixel's neighborhood block, where the neighborhood block of each hyperspectral pixel in the hyperspectral image H has size s×s×B and the neighborhood block of each lidar pixel in the lidar image L has size s×s.
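A minimal sketch of this neighborhood-block selection follows; edge handling is not specified in the patent, so the reflect padding used here is an assumption.

    import numpy as np

    def neighborhood_blocks(img, s):
        # img: X x Y (lidar) or X x Y x B (hyperspectral-derived); s: odd size.
        r = s // 2
        if img.ndim == 2:
            img = img[..., None]
        X, Y, B = img.shape
        padded = np.pad(img, ((r, r), (r, r), (0, 0)), mode="reflect")
        blocks = np.empty((X * Y, s, s, B), dtype=img.dtype)
        k = 0
        for i in range(X):
            for j in range(Y):
                blocks[k] = padded[i:i + s, j:j + s, :]
                k += 1
        return blocks          # one s x s (x B) neighborhood block per pixel

    blocks = neighborhood_blocks(np.random.rand(20, 30, 5), s=11)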
In step S3, six deep network branches are first constructed, namely L_1, L_2, L_3, L_4, L_5 and L_6, where the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s×s×B from the intrinsic image RE, the inputs of L_3 and L_4 are lidar pixels of size s×s×B from the lidar image L, and the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s×s×B from the illumination image SH. Each of the six branches consists of three two-dimensional convolution layers; in every layer the number of convolution kernels is d and the kernel stride is [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights. The outputs of the six branches are O_1, O_2, O_3, O_4, O_5 and O_6, each of size s×s×d.
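In PyTorch terms, each branch can be sketched as three Conv2d layers with d kernels of size [3, 3], stride [1, 1], 'Same' padding and Tanh activations, with weight sharing realized by reusing the same module object. Since the lidar neighborhood block has a single band while the shared branches expect B channels, the sketch replicates the lidar band B times; that replication is an assumption made only to reconcile the stated s×s×B lidar branch input with the s×s lidar neighborhood block.

    import torch
    import torch.nn as nn

    def make_branch(in_ch, d=120):
        # Three 2-D convolution layers: d kernels of size [3, 3],
        # stride [1, 1], 'Same' padding, Tanh activation (per step S1).
        layers = []
        for c in (in_ch, d, d):
            layers += [nn.Conv2d(c, d, kernel_size=3, stride=1, padding="same"),
                       nn.Tanh()]
        return nn.Sequential(*layers)

    B, d, s = 63, 120, 11
    L1, L2, L6 = make_branch(B, d), make_branch(B, d), make_branch(B, d)
    L3 = L2                       # L2 and L3 share all weights (same module)
    L4 = make_branch(B, d)
    L5 = L4                       # L4 and L5 share all weights

    # The single-band lidar block is replicated to B channels (assumption)
    # so that the weight-sharing branches see a common channel count.
    lidar_block = torch.rand(4, 1, s, s).repeat(1, B, 1, 1)
    O3 = L3(lidar_block)          # (4, d, s, s), i.e. s x s x d per pixel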
The loss function used when training the deep network branches constructed in step S3 is given by the formula shown in image PCTCN2022142160-appb-000009, where label is the input image class and the output image class predicted by the network is denoted by the symbol shown in image PCTCN2022142160-appb-000010.
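The loss itself appears only as an embedded formula image in the source; a categorical cross-entropy between label and the predicted class is the usual choice for a Softmax classifier and is assumed in this minimal sketch.

    import torch
    import torch.nn.functional as F

    # logits: (N, c) pre-Softmax outputs of the final fully connected layer;
    # labels: (N,) integer ground-truth classes. Cross-entropy is assumed,
    # the patent shows the loss only as an embedded formula image.
    logits = torch.randn(8, 6, requires_grad=True)
    labels = torch.randint(0, 6, (8,))
    loss = F.cross_entropy(logits, labels)
    loss.backward()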
In step S4, a concatenation layer (Concatenation Layer) is used to splice the outputs of the six deep network branches in pairs, according to the following formulas:
O_12 = Concatenation(O_1, O_2)
O_34 = Concatenation(O_3, O_4)
O_56 = Concatenation(O_5, O_6).
In step S5, Concatenation(O_34, O_56) is input to the 1st multi-modal grouped convolution layer to obtain the output O_3456^1; O_34 is input to the 2nd multi-modal grouped convolution layer to obtain O_34^1; O_56 is input to the 3rd to obtain O_56^1; Concatenation(O_34^1, O_3456^1, O_56^1) is input to the 4th to obtain O_3456^2; O_34^1 is input to the 5th to obtain O_34^2; O_56^1 is input to the 6th to obtain O_56^2; Concatenation(O_34^2, O_3456^2, O_56^2) is input to the 7th to obtain O_3456^3; O_34^2 is input to the 8th to obtain O_34^3; O_56^2 is input to the 9th to obtain O_56^3; Concatenation(O_34^3, O_3456^3, O_56^3) is input to the 10th to obtain O_3456^4; O_3456^4, of size s×s×d, is input to a two-dimensional average pooling layer (Average Pooling Layer) of size s×s to obtain O_3456^5 of size 1×d. Concatenation(O_12, O_3456^1) is input to the 11th multi-modal grouped convolution layer to obtain O_12^1; Concatenation(O_12^1, O_3456^2) is input to the 12th to obtain O_12^2; Concatenation(O_12^2, O_3456^3) is input to the 13th to obtain O_12^3, of size s×s×d; O_12^3 is input to a two-dimensional average pooling layer of size s×s to obtain O_12^4 of size 1×d.
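The wiring of the thirteen layers can be sketched as follows; the internals of a multi-modal grouped convolution layer are not spelled out in this passage, so modeling each one as a 3×3 Conv2d with groups=2 and d output channels is purely an assumption.

    import torch
    import torch.nn as nn

    class FusionCascade(nn.Module):
        # Wiring of the 13 multi-modal grouped convolution layers of step S5.
        # Each layer is modeled as a 3x3 grouped Conv2d with d output channels;
        # groups=2 and the output width are assumptions, not the patented layer.
        def __init__(self, d=120, s=11):
            super().__init__()
            def g(in_ch):
                return nn.Conv2d(in_ch, d, kernel_size=3, padding="same", groups=2)
            ins = [4*d, 2*d, 2*d, 3*d, d, d, 3*d, d, d, 3*d, 3*d, 2*d, 2*d]
            self.convs = nn.ModuleList(g(c) for c in ins)
            self.pool = nn.AvgPool2d(s)              # s x s average pooling

        def forward(self, O12, O34, O56):
            cat = lambda *xs: torch.cat(xs, dim=1)   # channel concatenation
            g = self.convs
            O3456_1 = g[0](cat(O34, O56))
            O34_1, O56_1 = g[1](O34), g[2](O56)
            O3456_2 = g[3](cat(O34_1, O3456_1, O56_1))
            O34_2, O56_2 = g[4](O34_1), g[5](O56_1)
            O3456_3 = g[6](cat(O34_2, O3456_2, O56_2))
            O34_3, O56_3 = g[7](O34_2), g[8](O56_2)
            O3456_4 = g[9](cat(O34_3, O3456_3, O56_3))
            O12_1 = g[10](cat(O12, O3456_1))
            O12_2 = g[11](cat(O12_1, O3456_2))
            O12_3 = g[12](cat(O12_2, O3456_3))
            O3456_5 = self.pool(O3456_4).flatten(1)  # (N, d)
            O12_4 = self.pool(O12_3).flatten(1)      # (N, d)
            return O12_4, O3456_5

    d = 120
    net = FusionCascade(d=d, s=11)
    O12 = torch.rand(2, 2*d, 11, 11)   # pairwise-concatenated branch outputs
    O34 = torch.rand(2, 2*d, 11, 11)
    O56 = torch.rand(2, 2*d, 11, 11)
    O12_4, O3456_5 = net(O12, O34, O56)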
In step S6, O_12^4 and O_3456^5 are input to a concatenation layer to obtain O_123456 = Concatenation(O_12^4, O_3456^5), of size 1×2d. O_123456 is input to a fully connected layer with c nodes and a Softmax activation function, yielding the final output class (predicted-class formula, image PCTCN2022142160-appb-000011).
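A minimal sketch of this classification head, with the class count c and batch shape assumed:

    import torch
    import torch.nn as nn

    d, c = 120, 6                      # feature width and class count (assumed)
    fc = nn.Linear(2 * d, c)           # fully connected layer with c nodes
    softmax = nn.Softmax(dim=1)

    O12_4, O3456_5 = torch.rand(2, d), torch.rand(2, d)
    O123456 = torch.cat([O12_4, O3456_5], dim=1)   # size 1 x 2d per sample
    probs = softmax(fc(O123456))
    pred = probs.argmax(dim=1)         # final output class per sample
    # For cross-entropy training, one would feed fc(O123456), the pre-Softmax
    # logits, to the loss rather than probs.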
The remote sensing hyperspectral image and lidar image fusion classification method proposed by the present invention efficiently combines intrinsic image decomposition theory with multi-modal remote sensing image fusion classification, gives full play to the advantages of the illumination image in model construction, and reduces the loss and attrition of important discriminative information.
The present invention introduces, for the first time, the intrinsic image decomposition of hyperspectral images into multi-modal remote sensing image fusion classification, balances the illumination information of the illumination image against the elevation information of the lidar image, fuses the two, and uses them to guide the mining of intrinsic information, improving the separability of samples.
The present invention proposes a method that fully fuses the illumination image decomposed from the hyperspectral image with the lidar image, so that the correlation between the illumination information in the hyperspectral image and the elevation information in the lidar image is fully exploited and utilized, giving full play to the advantages of the illumination image in model construction.
The present invention proposes a discriminative feature extraction method for hyperspectral and lidar images, which greatly strengthens the correlation between the two during information mining and reduces the occurrence of information imbalance.
The present invention proposes applying multi-modal grouped convolution layers to the fusion classification of hyperspectral and lidar images, giving full play to their value in this field of research, greatly improving the joint modeling of different modalities, reducing unnecessary redundant information, and strengthening the expression of important information.
The hyperspectral image and lidar image used by the remote sensing hyperspectral image and lidar image fusion classification method proposed by the present invention were captured over Trento, Italy; the hyperspectral image has size 166×600×63 and the lidar image has size 166×600.
(1) Input of this embodiment:
The input hyperspectral image has size 166×600×63, and the input lidar image has size 166×600.
(2) Parameter setting
The neighborhood size is 11, and the number of convolution kernels in every two-dimensional convolution layer is 120.
The hyperspectral image is decomposed to obtain an intrinsic image of size 166×600×63 and an illumination image of size 166×600×63.
Neighborhood information is then selected: for each pixel a neighborhood block of size 11×11×63 is obtained, and the neighborhood blocks are input into the deep network for training.
(3) Training the deep network model
From the total of 99600 sample neighborhood blocks, 10% are randomly selected for training the deep network model; these are randomly shuffled and packed into mini-batches of 512 sample neighborhood blocks, and each training step uses one mini-batch. After training, all 99600 sample neighborhood blocks are input into the deep network model for testing, yielding classification results for all samples, which are evaluated using the overall classification accuracy and the average classification accuracy. The overall classification accuracy is the number of correctly classified samples divided by the total number of samples. The average classification accuracy is obtained by dividing, for each class, the number of correctly classified samples by the number of samples of that class, and then averaging these per-class ratios.
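The two metrics can be computed from a confusion matrix; a small NumPy sketch with assumed variable names:

    import numpy as np

    def overall_and_average_accuracy(y_true, y_pred, c):
        # Confusion matrix: rows = true classes, columns = predicted classes.
        cm = np.zeros((c, c), dtype=np.int64)
        for t, p in zip(y_true, y_pred):
            cm[t, p] += 1
        oa = np.trace(cm) / cm.sum()                   # overall accuracy
        per_class = np.diag(cm) / np.maximum(cm.sum(axis=1), 1)
        aa = per_class.mean()                          # average accuracy
        return oa, aa

    oa, aa = overall_and_average_accuracy([0, 1, 1, 2], [0, 1, 0, 2], 3)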
(4) Results of this embodiment
The classification results obtained with the remote sensing hyperspectral image and lidar image fusion classification method and device proposed by the present invention, and with the currently common multi-stream encoder, are shown in Table 1 below.
Table 1
Method                                Overall classification accuracy    Average classification accuracy
Method of the present invention       92.84%                             90.74%
Commonly used multi-stream encoder    86.23%                             83.79%
It can be seen that the method of the present invention fuses and classifies the hyperspectral and lidar images well, with fewer misclassified samples. In addition, when the hyperspectral image decomposition part of the method is removed and the above experiment is repeated, the overall classification accuracy obtained is 85.49%, which shows that the method of the present invention has strong information mining capability. In summary, the method of the present invention can effectively improve the separability and classification accuracy of multi-source remote sensing images.
The remote sensing hyperspectral image and lidar image fusion classification device disclosed in Embodiment 2 of the present invention is introduced below; the device described below and the fusion classification method described above may be referred to in correspondence with each other.
Referring to Figure 2, an embodiment of the present invention provides a remote sensing hyperspectral image and lidar image fusion classification device, comprising:
Data acquisition module 10, which acquires a hyperspectral image and a lidar image; the class of each ground object in the two images is label;
Image decomposition module 20, which performs intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, and, for each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, selects the surrounding neighborhood of size s×s as that pixel's neighborhood block, where the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s×s×B and the neighborhood block of each lidar pixel in the lidar image L has size s×s;
Deep network training module 30, which uses the neighborhood blocks to train the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6, where the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s×s×B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixels of size s×s×B from the lidar image, and the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s×s×B from the hyperspectral illumination image; the outputs are O_1, O_2, O_3, O_4, O_5 and O_6, each of size s×s×d;
Image splicing module 40, which uses a concatenation layer to splice the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 in pairs, obtaining O_12, O_34 and O_56;
Multi-modal fusion module 50, which inputs O_34 and O_56 to the 1st multi-modal grouped convolution layer to obtain the output O_3456^1, inputs O_34 to the 2nd multi-modal grouped convolution layer to obtain O_34^1, inputs O_56 to the 3rd to obtain O_56^1, inputs O_34^1, O_3456^1 and O_56^1 to the 4th to obtain O_3456^2, inputs O_34^1 to the 5th to obtain O_34^2, inputs O_56^1 to the 6th to obtain O_56^2, inputs O_34^2, O_3456^2 and O_56^2 to the 7th to obtain O_3456^3, inputs O_34^2 to the 8th to obtain O_34^3, inputs O_56^2 to the 9th to obtain O_56^3, inputs O_34^3, O_3456^3 and O_56^3 to the 10th to obtain O_3456^4, inputs O_12 and O_3456^1 to the 11th to obtain O_12^1, inputs O_12^1 and O_3456^2 to the 12th to obtain O_12^2, inputs O_12^2 and O_3456^3 to the 13th to obtain O_12^3, and inputs O_3456^4 and O_12^3, each of size s×s×d, to a two-dimensional average pooling layer of size s×s to obtain O_3456^5 and O_12^4 of size 1×d;
Image classification module 60, which inputs O_12^4 and O_3456^5 into a concatenation layer to obtain the output O_123456, of size 1×2d, and inputs O_123456 into a fully connected layer to obtain the final output class (predicted-class formula, image PCTCN2022142160-appb-000012).
In one embodiment of the present invention, the data acquisition module 10 includes a data preprocessing submodule, which normalizes the hyperspectral image and the lidar image after they are selected.
In one embodiment of the present invention, the image decomposition module 20 performs intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, as follows:
Calculate the matrix D_i corresponding to each hyperspectral pixel H_i, where 1 ≤ i ≤ X×Y:
D_i = [H_1, …, H_{i−1}, H_{i+1}, …, H_{X×Y}, I_B] ∈ R^{B×(B+X×Y−1)}
where I_B is the identity matrix of size B×B;
Based on D_i, calculate the vector α_i corresponding to each hyperspectral pixel H_i:
min ||α_i||_1  s.t.  H_i = D_i α_i
where α_i has shape (B+X×Y−1)×1;
Construct a weight matrix W ∈ R^{(X×Y)×(X×Y)} and assign each element W_ij in row i, column j (assignment formula, image PCTCN2022142160-appb-000013); based on W, compute the matrix G = (I_{X×Y} − W^T)(I_{X×Y} − W) + δ I_{X×Y}, where I_{X×Y} is the identity matrix of size (X×Y)×(X×Y), δ is a constant, and ^T denotes the matrix transpose;
Transform the hyperspectral image H into a two-dimensional matrix and take the logarithm to obtain log(flatten(H)); compute the matrix K = (I_B − 1_B 1_B^T / B) · log(flatten(H)) · (I_{X×Y} − 1_{X×Y} 1_{X×Y}^T / (X×Y)), where I_B and I_{X×Y} are identity matrices of size B×B and (X×Y)×(X×Y), and 1_B and 1_{X×Y} are all-ones vectors of size B×1 and (X×Y)×1, respectively;
Based on G and K, compute the matrix ρ = δKG^{−1}; from ρ, obtain the intrinsic image RE = e^ρ and the illumination image SH = e^{log(H)−ρ} decomposed from the hyperspectral image H, where e is the natural constant; both images have size X×Y×B.
In one embodiment of the present invention, the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 each comprise multiple two-dimensional convolution layers; in every layer the number of convolution kernels is d, the kernel size is [3, 3] and the kernel stride is [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
The remote sensing hyperspectral image and lidar image fusion classification device of this embodiment is used to implement the aforementioned remote sensing hyperspectral image and lidar image fusion classification method; for its specific implementation, reference may therefore be made to the description of the corresponding method embodiments above, which is not repeated here.
In addition, since the device of this embodiment is used to implement the aforementioned fusion classification method, its functions correspond to those of the method described above and are likewise not repeated here.
Those skilled in the art will understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operational steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
Obviously, the above embodiments are merely examples given for clarity of description and are not a limitation on the implementations. Those of ordinary skill in the art may make other changes or modifications on the basis of the above description; it is neither necessary nor possible to enumerate all implementations here, and the obvious changes or modifications derived therefrom remain within the protection scope of the present invention.

Claims (10)

1. A remote sensing hyperspectral image and lidar image fusion classification method, characterized by comprising:
    S1: acquiring a hyperspectral image and a lidar image, the class of each ground object in the two images being label;
    S2: performing intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, and, for each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, selecting the surrounding neighborhood of size s×s as that pixel's neighborhood block, where the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s×s×B and the neighborhood block of each lidar pixel in the lidar image L has size s×s;
    S3: using the neighborhood blocks to train deep network branches L_1, L_2, L_3, L_4, L_5 and L_6, where the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s×s×B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixels of size s×s×B from the lidar image, and the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s×s×B from the hyperspectral illumination image, the outputs being O_1, O_2, O_3, O_4, O_5 and O_6, each of size s×s×d;
    S4: using a concatenation layer to splice the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 in pairs, obtaining O_12, O_34 and O_56;
    S5: inputting O_34 and O_56 to the 1st multi-modal grouped convolution layer to obtain the output O_3456^1; inputting O_34 to the 2nd multi-modal grouped convolution layer to obtain O_34^1; inputting O_56 to the 3rd to obtain O_56^1; inputting O_34^1, O_3456^1 and O_56^1 to the 4th to obtain O_3456^2; inputting O_34^1 to the 5th to obtain O_34^2; inputting O_56^1 to the 6th to obtain O_56^2; inputting O_34^2, O_3456^2 and O_56^2 to the 7th to obtain O_3456^3; inputting O_34^2 to the 8th to obtain O_34^3; inputting O_56^2 to the 9th to obtain O_56^3; inputting O_34^3, O_3456^3 and O_56^3 to the 10th to obtain O_3456^4; inputting O_12 and O_3456^1 to the 11th to obtain O_12^1; inputting O_12^1 and O_3456^2 to the 12th to obtain O_12^2; inputting O_12^2 and O_3456^3 to the 13th to obtain O_12^3; and inputting O_3456^4 and O_12^3, each of size s×s×d, to a two-dimensional average pooling layer of size s×s to obtain O_3456^5 and O_12^4 of size 1×d;
    S6: inputting O_12^4 and O_3456^5 into a concatenation layer to obtain the output O_123456, of size 1×2d, and inputting O_123456 into a fully connected layer to obtain the final output class (predicted-class formula, image PCTCN2022142160-appb-100001).
2. The remote sensing hyperspectral image and lidar image fusion classification method according to claim 1, characterized in that: in step S1, after the hyperspectral image and the lidar image are selected, they are normalized.
3. The remote sensing hyperspectral image and lidar image fusion classification method according to claim 1 or 2, characterized in that: in step S2, the method of performing intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image comprises:
    S2.1: calculating the matrix D_i corresponding to each hyperspectral pixel H_i, where 1 ≤ i ≤ X×Y:
    D_i = [H_1, …, H_{i−1}, H_{i+1}, …, H_{X×Y}, I_B] ∈ R^{B×(B+X×Y−1)}
    where I_B is the identity matrix of size B×B;
    S2.2: based on D_i, calculating the vector α_i corresponding to each hyperspectral pixel H_i:
    min ||α_i||_1  s.t.  H_i = D_i α_i
    where α_i has shape (B+X×Y−1)×1;
    S2.3: constructing a weight matrix W ∈ R^{(X×Y)×(X×Y)} and assigning each element W_ij in row i, column j (assignment formula, image PCTCN2022142160-appb-100002); based on W, computing the matrix G = (I_{X×Y} − W^T)(I_{X×Y} − W) + δ I_{X×Y}, where I_{X×Y} is the identity matrix of size (X×Y)×(X×Y), δ is a constant, and ^T denotes the matrix transpose;
    S2.4: transforming the hyperspectral image H into a two-dimensional matrix and taking the logarithm to obtain log(flatten(H)); computing the matrix K = (I_B − 1_B 1_B^T / B) · log(flatten(H)) · (I_{X×Y} − 1_{X×Y} 1_{X×Y}^T / (X×Y)), where I_B and I_{X×Y} are identity matrices of size B×B and (X×Y)×(X×Y), and 1_B and 1_{X×Y} are all-ones vectors of size B×1 and (X×Y)×1, respectively;
    S2.5: based on G and K, computing the matrix ρ = δKG^{−1}, and, from ρ, obtaining the intrinsic image RE = e^ρ and the illumination image SH = e^{log(H)−ρ} decomposed from the hyperspectral image H, where e is the natural constant; both images have size X×Y×B.
4. The remote sensing hyperspectral image and lidar image fusion classification method according to claim 1, characterized in that: the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 in step S3 each comprise multiple two-dimensional convolution layers; in every layer the number of convolution kernels is d, the kernel size is [3, 3] and the kernel stride is [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
5. The remote sensing hyperspectral image and lidar image fusion classification method according to claim 1, characterized in that: the loss function used when training the deep network branches constructed in step S3 is given by the formula shown in image PCTCN2022142160-appb-100003.
6. The remote sensing hyperspectral image and lidar image fusion classification method according to claim 1 or 4, characterized in that: in step S4, a concatenation layer splices the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 in pairs, obtaining O_12, O_34 and O_56 according to the formulas O_12 = Concatenation(O_1, O_2), O_34 = Concatenation(O_3, O_4), O_56 = Concatenation(O_5, O_6).
7. A remote sensing hyperspectral image and lidar image fusion classification device, characterized by comprising:
    a data acquisition module, which acquires a hyperspectral image and a lidar image, the class of each ground object in the two images being label;
    an image decomposition module, which performs intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, and, for each hyperspectral intrinsic pixel, hyperspectral illumination pixel and lidar pixel, selects the surrounding neighborhood of size s×s as that pixel's neighborhood block, where the neighborhood block of each hyperspectral pixel in the hyperspectral image has size s×s×B and the neighborhood block of each lidar pixel in the lidar image L has size s×s;
    a deep network training module, which uses the neighborhood blocks to train deep network branches L_1, L_2, L_3, L_4, L_5 and L_6, where the inputs of L_1 and L_2 are hyperspectral intrinsic pixels of size s×s×B from the hyperspectral intrinsic image, the inputs of L_3 and L_4 are lidar pixels of size s×s×B from the lidar image, and the inputs of L_5 and L_6 are hyperspectral illumination pixels of size s×s×B from the hyperspectral illumination image, the outputs being O_1, O_2, O_3, O_4, O_5 and O_6, each of size s×s×d;
    an image splicing module, which uses a concatenation layer to splice the outputs of the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 in pairs, obtaining O_12, O_34 and O_56;
    a multi-modal fusion module, which inputs O_34 and O_56 to the 1st multi-modal grouped convolution layer to obtain the output O_3456^1, inputs O_34 to the 2nd multi-modal grouped convolution layer to obtain O_34^1, inputs O_56 to the 3rd to obtain O_56^1, inputs O_34^1, O_3456^1 and O_56^1 to the 4th to obtain O_3456^2, inputs O_34^1 to the 5th to obtain O_34^2, inputs O_56^1 to the 6th to obtain O_56^2, inputs O_34^2, O_3456^2 and O_56^2 to the 7th to obtain O_3456^3, inputs O_34^2 to the 8th to obtain O_34^3, inputs O_56^2 to the 9th to obtain O_56^3, inputs O_34^3, O_3456^3 and O_56^3 to the 10th to obtain O_3456^4, inputs O_12 and O_3456^1 to the 11th to obtain O_12^1, inputs O_12^1 and O_3456^2 to the 12th to obtain O_12^2, inputs O_12^2 and O_3456^3 to the 13th to obtain O_12^3, and inputs O_3456^4 and O_12^3, each of size s×s×d, to a two-dimensional average pooling layer of size s×s to obtain O_3456^5 and O_12^4 of size 1×d;
    an image classification module, which inputs O_12^4 and O_3456^5 into a concatenation layer to obtain the output O_123456, of size 1×2d, and inputs O_123456 into a fully connected layer to obtain the final output class (predicted-class formula, image PCTCN2022142160-appb-100004).
8. The remote sensing hyperspectral image and lidar image fusion classification device according to claim 7, characterized in that: the data acquisition module includes a data preprocessing submodule, which normalizes the hyperspectral image and the lidar image after they are selected.
9. The remote sensing hyperspectral image and lidar image fusion classification device according to claim 7 or 8, characterized in that: the image decomposition module performs intrinsic image decomposition on the hyperspectral image to obtain an intrinsic image and an illumination image, comprising:
    calculating the matrix D_i corresponding to each hyperspectral pixel H_i, where 1 ≤ i ≤ X×Y:
    D_i = [H_1, …, H_{i−1}, H_{i+1}, …, H_{X×Y}, I_B] ∈ R^{B×(B+X×Y−1)}
    where I_B is the identity matrix of size B×B;
    based on D_i, calculating the vector α_i corresponding to each hyperspectral pixel H_i:
    min ||α_i||_1  s.t.  H_i = D_i α_i
    where α_i has shape (B+X×Y−1)×1;
    constructing a weight matrix W ∈ R^{(X×Y)×(X×Y)} and assigning each element W_ij in row i, column j (assignment formula, image PCTCN2022142160-appb-100005); based on W, computing the matrix G = (I_{X×Y} − W^T)(I_{X×Y} − W) + δ I_{X×Y}, where I_{X×Y} is the identity matrix of size (X×Y)×(X×Y), δ is a constant, and ^T denotes the matrix transpose;
    transforming the hyperspectral image H into a two-dimensional matrix and taking the logarithm to obtain log(flatten(H)); computing the matrix K = (I_B − 1_B 1_B^T / B) · log(flatten(H)) · (I_{X×Y} − 1_{X×Y} 1_{X×Y}^T / (X×Y)), where I_B and I_{X×Y} are identity matrices of size B×B and (X×Y)×(X×Y), and 1_B and 1_{X×Y} are all-ones vectors of size B×1 and (X×Y)×1, respectively;
    based on G and K, computing the matrix ρ = δKG^{−1}, and, from ρ, obtaining the intrinsic image RE = e^ρ and the illumination image SH = e^{log(H)−ρ} decomposed from the hyperspectral image H, where e is the natural constant; both images have size X×Y×B.
10. The remote sensing hyperspectral image and lidar image fusion classification device according to claim 7, characterized in that: the deep network branches L_1, L_2, L_3, L_4, L_5 and L_6 each comprise multiple two-dimensional convolution layers; in every layer the number of convolution kernels is d, the kernel size is [3, 3] and the kernel stride is [1, 1]; branches L_2 and L_3 share all weights, and branches L_4 and L_5 share all weights.
PCT/CN2022/142160 2022-08-26 2022-12-27 Method and device for fusion and classification of remote sensing hyperspectral image and laser radar image WO2024040828A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211037953.6A CN115331110A (en) 2022-08-26 2022-08-26 Fusion classification method and device for remote sensing hyperspectral image and laser radar image
CN202211037953.6 2022-08-26

Publications (1)

Publication Number Publication Date
WO2024040828A1 true WO2024040828A1 (en) 2024-02-29

Family

ID=83928217

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/142160 WO2024040828A1 (en) 2022-08-26 2022-12-27 Method and device for fusion and classification of remote sensing hyperspectral image and laser radar image

Country Status (2)

Country Link
CN (1) CN115331110A (en)
WO (1) WO2024040828A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115331110A (en) * 2022-08-26 2022-11-11 苏州大学 Fusion classification method and device for remote sensing hyperspectral image and laser radar image
CN116167955A (en) * 2023-02-24 2023-05-26 苏州大学 Hyperspectral and laser radar image fusion method and system for remote sensing field

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090318815A1 (en) * 2008-05-23 2009-12-24 Michael Barnes Systems and methods for hyperspectral medical imaging
CN109993220A (en) * 2019-03-23 2019-07-09 西安电子科技大学 Multi-source Remote Sensing Images Classification method based on two-way attention fused neural network
CN112967350A (en) * 2021-03-08 2021-06-15 哈尔滨工业大学 Hyperspectral remote sensing image eigen decomposition method and system based on sparse image coding
CN114742985A (en) * 2022-03-17 2022-07-12 苏州大学 Hyperspectral feature extraction method and device and storage medium
CN115331110A (en) * 2022-08-26 2022-11-11 苏州大学 Fusion classification method and device for remote sensing hyperspectral image and laser radar image

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117830752A (en) * 2024-03-06 2024-04-05 昆明理工大学 Self-adaptive space-spectrum mask graph convolution method for multi-spectrum point cloud classification
CN117830752B (en) * 2024-03-06 2024-05-07 昆明理工大学 Self-adaptive space-spectrum mask graph convolution method for multi-spectrum point cloud classification
CN117934978A (en) * 2024-03-22 2024-04-26 安徽大学 Hyperspectral and laser radar multilayer fusion classification method based on countermeasure learning
CN117934978B (en) * 2024-03-22 2024-06-11 安徽大学 Hyperspectral and laser radar multilayer fusion classification method based on countermeasure learning

Also Published As

Publication number Publication date
CN115331110A (en) 2022-11-11

Similar Documents

Publication Publication Date Title
WO2024040828A1 (en) Method and device for fusion and classification of remote sensing hyperspectral image and laser radar image
CN113011499B (en) Hyperspectral remote sensing image classification method based on double-attention machine system
CN107766850B (en) Face recognition method based on combination of face attribute information
CN111275007B (en) Bearing fault diagnosis method and system based on multi-scale information fusion
CN112926641B (en) Three-stage feature fusion rotating machine fault diagnosis method based on multi-mode data
WO2021082480A1 (en) Image classification method and related device
CN112801040B (en) Lightweight unconstrained facial expression recognition method and system embedded with high-order information
CN109190511B (en) Hyperspectral classification method based on local and structural constraint low-rank representation
CN113066065B (en) No-reference image quality detection method, system, terminal and medium
CN110222718A (en) The method and device of image procossing
WO2023125456A1 (en) Multi-level variational autoencoder-based hyperspectral image feature extraction method
CN114419413A (en) Method for constructing sensing field self-adaptive transformer substation insulator defect detection neural network
CN113065426B (en) Gesture image feature fusion method based on channel perception
CN112732921B (en) False user comment detection method and system
US20230316699A1 (en) Image semantic segmentation algorithm and system based on multi-channel deep weighted aggregation
CN111860683A (en) Target detection method based on feature fusion
CN112667071A (en) Gesture recognition method, device, equipment and medium based on random variation information
CN115564996A (en) Hyperspectral remote sensing image classification method based on attention union network
CN110837808A (en) Hyperspectral image classification method based on improved capsule network model
CN111860601B (en) Method and device for predicting type of large fungi
CN113743079A (en) Text similarity calculation method and device based on co-occurrence entity interaction graph
CN111652349A (en) Neural network processing method and related equipment
CN114373080B (en) Hyperspectral classification method of lightweight hybrid convolution model based on global reasoning
CN115965819A (en) Lightweight pest identification method based on Transformer structure
CN116595133A (en) Visual question-answering method based on stacked attention and gating fusion

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22956360

Country of ref document: EP

Kind code of ref document: A1