CN114092327A - Hyperspectral image super-resolution method by utilizing heterogeneous knowledge distillation - Google Patents

Hyperspectral image super-resolution method by utilizing heterogeneous knowledge distillation

Info

Publication number
CN114092327A
CN114092327A (application CN202111288667.2A)
Authority
CN
China
Prior art keywords
features
distillation
resolution
dodb
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111288667.2A
Other languages
Chinese (zh)
Other versions
CN114092327B (en)
Inventor
江俊君
刘子仟
马清
刘贤明
Current Assignee
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN202111288667.2A priority Critical patent/CN114092327B/en
Publication of CN114092327A publication Critical patent/CN114092327A/en
Application granted granted Critical
Publication of CN114092327B publication Critical patent/CN114092327B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00: Geometric image transformations in the plane of the image
    • G06T 3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053: Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00: Geometric image transformations in the plane of the image
    • G06T 3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4046: Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10032: Satellite or aerial image; Remote sensing
    • G06T 2207/10036: Multispectral image; Hyperspectral image

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a hyperspectral image super-resolution method using heterogeneous knowledge distillation. Given a low-resolution hyperspectral image input I_LR ∈ R^{L×H×W}, the method performs shallow feature extraction, nonlinear mapping through distillation-oriented double-branch modules (DODB), and upsampling, finally outputting a high-resolution hyperspectral image I_SR ∈ R^{L×sH×sW}. Heterogeneous knowledge distillation is used to improve model performance: the distillation acts between the 2D features of the two models, transferring the heterogeneous knowledge distillation problem to a fusion problem inside the SHSR model; the transmitted information is treated as feedback that refines the features of each spectral band, and the features are divided into a distilled part and a retained part. The method achieves better performance both quantitatively and qualitatively, reconstructing hyperspectral images of relatively high quality.

Description

Hyperspectral image super-resolution method by utilizing heterogeneous knowledge distillation
Technical Field
The invention belongs to the technical field of image super-resolution, and particularly relates to a hyperspectral image super-resolution method using heterogeneous knowledge distillation.
Background
A hyperspectral imaging sensor receives light of different wavelengths reflected by an object to obtain a hyperspectral image with many spectral bands. Unlike a grayscale or RGB image, each pixel of a hyperspectral image therefore contains a continuous spectrum of tens to thousands of bands. The abundance of spectral information makes hyperspectral images extremely valuable in many computer-vision and remote-sensing tasks, such as image classification, anomaly detection and medical diagnosis. However, due to hardware limitations, the spatial resolution of hyperspectral images is relatively low, and improving the hardware system is difficult. Super-resolution (SR), a post-processing technique, is therefore widely used to reconstruct a high-spatial-resolution hyperspectral image from its low-resolution (LR) version.

One class of classical hyperspectral super-resolution methods is fusion-based (FHSR): these require a high-resolution multispectral image (HR-MSI), such as an RGB or panchromatic (PAN) image, and fuse the information of the two sources. Their main drawback is that well-registered high-resolution multispectral images are difficult, and in some cases impossible, to collect. The other class is single hyperspectral image super-resolution (SHSR), which uses only the information in the low-resolution hyperspectral image. Lacking complementary spatial information, such models depend heavily on hand-designed priors, such as low rank and sparsity. With the advent of deep learning, single hyperspectral image super-resolution models based on convolutional neural networks have made great progress, but the lack of spatial detail still limits their capability. Furthermore, they do not take full advantage of expensive well-aligned hyperspectral-multispectral pairs, such as those in common datasets.
Disclosure of Invention
In order to solve the above problems, the invention provides a hyperspectral image super-resolution method using heterogeneous knowledge distillation. A distillation-oriented double-branch network (DODN) is designed, and a new hybrid 2D/3D convolution module, the distillation-oriented double-branch module (DODB), is proposed; the information of the high-resolution multispectral image (HR-MSI) is transmitted to the single hyperspectral image super-resolution (SHSR) model through knowledge distillation to improve model performance.
The invention is realized by the following scheme:
a hyperspectral image super-resolution method by utilizing heterogeneous knowledge distillation comprises the following steps: the method specifically comprises the following steps:
the method specifically comprises the following steps:
the method comprises the following steps: given a low-resolution hyperspectral image input ILR,ILR∈RL×H×WWherein L, H and W represent the number of spectral bands, height and width of the input image, respectively;
step two: shallow feature extraction is carried out on given image input, image information is respectively sent into a 2D processing branch and a 3D processing branch, the 3D processing branch is processed through 3D convolution, spatial spectrum information of a low-resolution hyperspectral input image is extracted, and shallow 3D features are obtained
Figure BDA0003333812180000021
The 2D processing branch is processed by 2D convolution to obtain shallow 2D characteristics
Figure BDA0003333812180000022
Step 3: F_0^{3D} and F_0^{2D} are sent to the distillation-oriented double-branch modules DODB, and a cascade of DODBs, H_DODB, generates the nonlinear mapping; the 2D features F_K^{2D} are discarded at the K-th, i.e. last, DODB module, while the 3D features F_K^{3D} of the K-th DODB module are obtained and added to the shallow 3D features F_0^{3D};
step four: performing heterogeneous knowledge distillation and loss function calculation, performing distillation on half of the 2D signature, i.e. for each 2D signature of DODB, will
Figure BDA0003333812180000028
The first C/2 channels are used as output parts for distillation, and the rest parts are used as retention parts; finally, a high-resolution hyperspectral image I is output through an up-samplerSR∈RL ×sH×sWAnd s is a scale factor.
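The data flow of steps 1 to 4 can be sketched at the shape level as follows. In this toy NumPy sketch, random linear maps stand in for all convolutions, so only the tensor shapes and the two-branch/residual data flow are meaningful; all sizes and variable names are illustrative, not taken from the patent.

```python
import numpy as np

# Shape-level sketch of the DODN forward pass described in steps 1-4.
rng = np.random.default_rng(0)
L, H, W, s = 31, 8, 8, 4          # bands, height, width, scale factor (toy)
C3, C2, K = 4, 6, 3               # 3D channels, 2D channels, number of DODBs
I_LR = rng.random((L, H, W))      # low-resolution hyperspectral input

# Step 2: shallow features in the 3D and 2D branches.
F3 = I_LR[None] * rng.random((C3, 1, 1, 1))               # (C3, L, H, W)
I_up = np.repeat(np.repeat(I_LR, s, axis=1), s, axis=2)   # (L, sH, sW)
F2 = np.einsum('cl,lhw->chw', rng.random((C2, L)), I_up)  # (C2, sH, sW)
F3_shallow = F3.copy()

# Step 3: cascade of K DODBs (stand-in refinements), then the long
# residual connection; the 2D features are kept only for distillation.
for k in range(K):
    F3 = F3 + 0.1 * rng.random(F3.shape)
    F2 = F2 + 0.1 * rng.random(F2.shape)
F3 = F3 + F3_shallow

# Step 4 (inference part): upsample to the high-resolution output I_SR.
I_SR = np.repeat(np.repeat(F3.mean(axis=0), s, axis=1), s, axis=2)
assert I_SR.shape == (L, s * H, s * W)
```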
Further, in step 2: in the 3D processing branch, the low-resolution hyperspectral input I_LR is unsqueezed to size 1×L×H×W, and a 3×3×3 3D convolution is applied to obtain the shallow 3D features F_0^{3D}:

F_0^{3D} = Conv_{3×3×3}(Unsqueeze(I_LR))
In the 2D processing branch, the low-resolution hyperspectral image I_LR is upsampled to L×sH×sW to match the spatial resolution of the spectral super-resolution (SSR) model input, where s is the scale factor; a 3×3 2D convolution then yields the shallow 2D features F_0^{2D}:

F_0^{2D} = Conv_{3×3}(Upsample_s(I_LR))
further, in the third step,
for the Kth DODB module, there are:
Figure BDA00033338121800000215
using the transposed 3D convolution and the 1 x 1 3D convolution as upsamplers,
Figure BDA00033338121800000216
before passing through the upsampler and
Figure BDA0003333812180000031
adding to improve the robustness of the model;
Figure BDA0003333812180000032
the distillation-oriented double-branch module DODB consists of a 3D module, a 2D module and a feedback fusion module;
The 3D module and the 2D module follow the structure of a residual block and extract the low-resolution 3D features F_k^{3D} ∈ R^{B×C'×L×H×W} and the high-resolution 2D features F_k^{2D} ∈ R^{B×C×sH×sW}, respectively, where C' and C denote the numbers of channels of the 3D and 2D features and B denotes the batch size:

F_k^{3D} = H_3D(F_{k-1}^{3D}),  F_k^{2D} = H_2D(F_{k-1}^{2D})
further, in a feedback fusion module of the DODB, the 3D features are firstly up-sampled to the same size as the 2D features and fused, and then down-sampled to the original size of the 3D features;
after upsampling the 3D features, the 2D features and the 3D features are fused in a band-by-band manner: correcting 3D features according to spectral bands using 2D features as feedback information to obtain high resolution 3D features
Figure BDA0003333812180000037
Will be provided with
Figure BDA0003333812180000038
The separation into L spectral bands in the spectral dimension is:
Figure BDA0003333812180000039
Figure BDA00033338121800000310
where F_l has size b×C'×sH×sW. The 2D features are concatenated with each spectral band separately, and a 2D convolution generates the fused features; the same 2D convolution is shared by all bands. The fused features are unsqueezed to b×1×C'×sH×sW and then stacked to obtain new 3D features F_k^{3D,fused}, whose size equals that of F_k^{3D,up}:

F_l' = Conv_{2D}([F_l, F_k^{2D}]), l = 1, …, L;  F_k^{3D,fused} = Stack(F_1', …, F_L')
In the downsampling process, cascaded 3×3×3 convolutions downsample F_k^{3D,fused}. The final output of the distillation-oriented double-branch module DODB is then:

F_k^{3D} = H_down(F_k^{3D,fused}),  F_k^{2D} = H_2D(F_{k-1}^{2D})
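As a shape-level illustration of the band-by-band feedback fusion described above, the following NumPy sketch uses one shared linear map as a stand-in for the shared 2D convolution; all dimensions and variable names are hypothetical.

```python
import numpy as np

# Band-by-band feedback fusion: split the upsampled 3D features into L
# bands, concatenate each band with the 2D features, map back to C'
# channels with one map shared across bands, then stack the bands again.
rng = np.random.default_rng(0)
b, Cp, C, L, sH, sW = 2, 4, 6, 5, 8, 8
F3_up = rng.random((b, Cp, L, sH, sW))   # upsampled 3D features
F2 = rng.random((b, C, sH, sW))          # 2D features (feedback signal)

W_fuse = rng.random((Cp, Cp + C))        # shared by all bands, like the
                                         # shared 2D convolution in the text
fused_bands = []
for l in range(L):
    band = F3_up[:, :, l]                             # (b, C', sH, sW)
    cat = np.concatenate([band, F2], axis=1)          # (b, C'+C, sH, sW)
    out = np.einsum('oc,bchw->bohw', W_fuse, cat)     # 1x1-conv stand-in
    fused_bands.append(out[:, :, None])               # (b, C', 1, sH, sW)
F3_fused = np.concatenate(fused_bands, axis=2)
assert F3_fused.shape == F3_up.shape
```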
further, in the fourth step,
the total loss function included the reconstitution loss and distillation loss, the L1 loss was chosen as the reconstitution loss,
Figure BDA0003333812180000044
wherein N represents the number of samples,
Figure BDA0003333812180000045
and
Figure BDA0003333812180000046
the ith spectral band of the reconstructed image and the real high-resolution image respectively;
for distillation loss, the L1 norm was used to measure the distance between features from the SHSR model and the features forming the SSR model;
Figure BDA0003333812180000047
wherein S represents the characteristic quantity used in the distillation,
Figure BDA0003333812180000048
and
Figure BDA0003333812180000049
is a feature of the j-th layer of the SHSR model and SSR model, and GiIs a transformation, namely 1 × 1 convolution, which is used for ensuring that the number of the corresponding two characteristic channels is the same;
The total loss function is then:

L_total = L_rec + λ L_output   (14)

where λ is a hyper-parameter balancing the two parts, set to 0.05 in the experiments.
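A minimal NumPy sketch of this total loss: random tensors stand in for the reconstructed image and the intermediate features, and a random channel-mixing matrix stands in for the 1×1 convolution G_j; all sizes are illustrative.

```python
import numpy as np

# L_total = L_rec + lambda * L_output with L1 terms, as in equation (14).
rng = np.random.default_rng(0)
L, sH, sW, lam = 4, 8, 8, 0.05
I_SR, I_HR = rng.random((L, sH, sW)), rng.random((L, sH, sW))
# L1 reconstruction loss, averaged over spectral bands.
L_rec = np.mean([np.abs(I_SR[i] - I_HR[i]).mean() for i in range(L)])

C_shsr, C_ssr, S = 6, 10, 2
L_out = 0.0
for _ in range(S):
    F_shsr = rng.random((C_shsr, sH, sW))             # SHSR 2D feature
    F_ssr = rng.random((C_ssr, sH, sW))               # SSR feature
    G = rng.random((C_ssr, C_shsr))                   # G_j: match channels
    F_shsr_t = np.einsum('oc,chw->ohw', G, F_shsr)
    L_out += np.abs(F_shsr_t - F_ssr).mean()          # L1 distillation term
L_out /= S
L_total = L_rec + lam * L_out
assert L_total > L_rec > 0
```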
The invention has the following beneficial effects.
The invention provides a new double-branch single hyperspectral image super-resolution model and a new module that effectively combines 2D convolution and 3D convolution:
the 3D branch extracts the spatial-spectral information of the low-resolution hyperspectral input through 3D convolution, while the 2D branch is designed to resemble a spectral super-resolution model and receives the information transmitted from it;
in each block, the 3D features are split along the spectral dimension and corrected band by band by the 2D features in a feedback manner; applying distillation to half of the channels of the 2D features helps reduce negative transfer, a technique referred to as semi-distillation;
the invention is the first to exploit privileged information from the spectral super-resolution task and to design a model for heterogeneous knowledge distillation, and the introduced long residual connection makes the model more robust.
Drawings
FIG. 1 is a schematic diagram of a DODN in accordance with the present invention, the upper half being a DODN network and the lower half being an AWAN SSR model;
FIG. 2 is a DODB schematic of the present invention;
FIG. 3 is a graph of the reconstruction and absolute error for the 630 nm band of the integer_state image in the CAVE dataset;
FIG. 4 is a graph of the reconstruction and absolute error for the 1st band of the ARAD_0463 image in the NTIRE 2020 dataset.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments; all other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
A hyperspectral image super-resolution method using heterogeneous knowledge distillation specifically comprises the following steps:
the method comprises the following steps: given a low-resolution hyperspectral image input ILR,ILR∈RL×H×WWherein L, H and W represent the number of spectral bands, height and width of the input image, respectively;
step two: shallow feature extraction is carried out on given image input, image information is respectively sent into a 2D processing branch and a 3D processing branch, the 3D processing branch is processed through 3D convolution, spatial spectrum information of a low-resolution hyperspectral input image is extracted, and shallow 3D features are obtained
Figure BDA0003333812180000051
The 2D processing branch is processed by 2D convolution to obtain shallow 2D characteristics
Figure BDA0003333812180000052
Step 3: F_0^{3D} and F_0^{2D} are sent to the distillation-oriented double-branch modules DODB, and a cascade of DODBs, H_DODB, generates the nonlinear mapping; the 2D features F_K^{2D} are discarded at the K-th, i.e. last, DODB module, while the 3D features F_K^{3D} of the K-th DODB module are obtained and added to the shallow 3D features F_0^{3D};
step four: performing heterogeneous knowledge distillation and loss function calculation, performing distillation on half of the 2D signature, i.e. for each 2D signature of DODB, will
Figure BDA0003333812180000058
The first C/2 channels are used as output parts for distillation, and the rest parts are used as retention parts; finally, a high-resolution hyperspectral image I is output through an up-samplerSR∈RL ×sH×sWAnd s is a scale factor.
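The channel split used in this semi-distillation step can be sketched as follows; the sizes are illustrative.

```python
import numpy as np

# Semi-distillation: only the first C/2 channels of a DODB's 2D features
# are exposed to the distillation loss; the remaining channels are
# retained for the network itself, which reduces negative transfer.
rng = np.random.default_rng(0)
C, sH, sW = 8, 16, 16
F2 = rng.random((C, sH, sW))             # one DODB's 2D features
distilled, retained = F2[: C // 2], F2[C // 2 :]
assert distilled.shape == (C // 2, sH, sW)
assert retained.shape == (C // 2, sH, sW)
```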
In step 2: in the 3D processing branch, the low-resolution hyperspectral input I_LR is unsqueezed to size 1×L×H×W, and a 3×3×3 3D convolution is applied to obtain the shallow 3D features F_0^{3D}:

F_0^{3D} = Conv_{3×3×3}(Unsqueeze(I_LR))
In the 2D processing branch, the low-resolution hyperspectral image I_LR is upsampled to L×sH×sW to match the spatial resolution of the spectral super-resolution (SSR) model input, where s is the scale factor; a 3×3 2D convolution then yields the shallow 2D features F_0^{2D}:

F_0^{2D} = Conv_{3×3}(Upsample_s(I_LR))
in the third step, the first step is carried out,
for the Kth DODB module, there are:
Figure BDA0003333812180000067
using the transposed 3D convolution and the 1 x 1 3D convolution as upsamplers,
Figure BDA0003333812180000068
before passing through the upsampler and
Figure BDA0003333812180000069
adding to improve the robustness of the model;
Figure BDA00033338121800000610
The distillation-oriented double-branch module DODB consists of a 3D module, a 2D module and a feedback fusion module. Part of the DODB is made similar to the 2D SSR model by the added 2D branch. A feedback mechanism fuses the 2D and 3D features band by band: owing to the limited energy of light within a particular range, each band of the HSI is sensitive to only part of the photographed scene, while the RGB image contains spatial information of the entire scene.
Wherein the 3D module and the 2D module respectively extract low-resolution 3D features according to the structure of the residual block
Figure BDA00033338121800000611
And high resolution 2D features
Figure BDA00033338121800000612
Wherein C' and C represent the number of channels of the 3D feature and the 2D feature, respectively, and B represents the batch size; by pseudo-3DThe convolution replaces the common 3D convolution to reduce the computational complexity;
Figure BDA00033338121800000613
Figure BDA00033338121800000614
Figure BDA0003333812180000071
In the feedback fusion module of the DODB, the 3D features are first upsampled to the same size as the 2D features and fused, and then downsampled back to the original size of the 3D features. Fusing in the high-resolution space allows the 2D and 3D features to be combined band by band while preserving as much detail of the 2D features as possible: the 2D features receive information transferred from the high-resolution RGB image, so they contain all the spatial detail at a lower capacity, while each spectral band of the 3D features can be viewed as a limited view of the high-resolution RGB image with richer spectral information.
After the upsampling of the 3D features, the 2D and 3D features are fused band by band: the 2D features act as feedback information that corrects the 3D features per spectral band, yielding high-resolution 3D features F_k^{3D,up}. F_k^{3D,up} is split into L spectral bands along the spectral dimension:

{F_1, F_2, …, F_L} = Split(F_k^{3D,up})
where F_l has size b×C'×sH×sW. The 2D features are concatenated with each spectral band separately, and a 2D convolution generates the fused features; the 2D convolution is the same for all spectral bands, and the 2D features are regarded as feedback information that refines each spectral band of the 3D features. The fused features are unsqueezed to b×1×C'×sH×sW and then stacked to obtain new 3D features F_k^{3D,fused}, whose size equals that of F_k^{3D,up}:

F_l' = Conv_{2D}([F_l, F_k^{2D}]), l = 1, …, L;  F_k^{3D,fused} = Stack(F_1', …, F_L')
In the downsampling process, cascaded 3×3×3 convolutions downsample F_k^{3D,fused}, and the 3D convolution further extracts the spectral-spatial correlation. The final output of the distillation-oriented double-branch module DODB is then:

F_k^{3D} = H_down(F_k^{3D,fused}),  F_k^{2D} = H_2D(F_{k-1}^{2D})
in the fourth step, (the step is used for improving the performance of the model in the training process and is not applied in the reasoning process)
The total loss function includes the reconstruction loss and the distillation loss. Following the mainstream of the field, the L1 loss is chosen as the reconstruction loss:

L_rec = (1/N) Σ_{i=1}^{N} || I_SR^i − I_HR^i ||_1

where N denotes the number of samples, and I_SR^i and I_HR^i are the i-th spectral bands of the reconstructed image and of the real high-resolution image, respectively. For the distillation loss, the L1 norm measures the distance between the features from the SHSR model and the features from the SSR model:

L_output = (1/S) Σ_{j=1}^{S} || G_j(F_j^{SHSR}) − F_j^{SSR} ||_1

where S denotes the number of features used in the distillation, F_j^{SHSR} and F_j^{SSR} are the features of the j-th distilled layer of the SHSR and SSR models, and G_j is a transformation, namely a 1×1 convolution, that makes the channel numbers of the two corresponding features equal.
The total loss function is then:

L_total = L_rec + λ L_output   (14)

where λ is a hyper-parameter balancing the two parts, set to 0.05 in the experiments.
CAVE: the CAVE data set was collected by a cooled CCD camera at wavelengths ranging from 400nm to 710nm, separated into 31 bands. These 32 images are divided into five parts: genuine, fake, skin and hair, paintings, food and beverages, and objects, each image being 512 x 512 in size.
NTIRE 2020: hyperspectral image processing long lacked large-scale datasets. Recently, the NTIRE spectral super-resolution challenge provided a dataset containing 510 hyperspectral images, one of the largest datasets to date. Each picture is 512×482, also with 31 bands. Only the clean-track data are used, and since the ground truth of the test set is not accessible, comparisons are made on the validation set; there are thus 450 pictures for training and 10 images for testing.
For the CAVE dataset, 20 images are selected as the training set and the remaining 12 images are used for testing. Each picture is cropped into blocks of size 96×96 with a 48-pixel overlap between blocks, and the scale factor is set to 4; 10% of the blocks are randomly selected as the validation set. Bicubic downsampling then generates the low-resolution input. Cropping and downsampling are applied to the hyperspectral and RGB images simultaneously to obtain well-registered HSI-RGB image pairs. Data augmentation uses 90°, 180° and 270° rotations, vertical and horizontal flips, and combinations thereof on the training samples. For the NTIRE 2020 dataset, the only differences are that the image blocks are cropped at random positions and the number of blocks per image is fixed at 24.
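The cropping scheme above can be sketched as follows; block averaging stands in for bicubic downsampling purely to illustrate the shapes (a real pipeline would use an actual bicubic filter), and the 192×192 input size is a toy choice.

```python
import numpy as np

# Overlapping 96x96 crops (stride 48) from one hyperspectral image, each
# paired with a x4-downsampled low-resolution version (block averaging
# here as a crude stand-in for bicubic downsampling).
rng = np.random.default_rng(0)
L, H, W, patch, stride, scale = 31, 192, 192, 96, 48, 4
hsi = rng.random((L, H, W))

patches = [
    hsi[:, y : y + patch, x : x + patch]
    for y in range(0, H - patch + 1, stride)
    for x in range(0, W - patch + 1, stride)
]
lr_patches = [
    p.reshape(L, patch // scale, scale, patch // scale, scale).mean(axis=(2, 4))
    for p in patches
]
assert len(patches) == 9                 # 3 x 3 positions for a 192x192 input
assert lr_patches[0].shape == (L, patch // scale, patch // scale)
```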
The DODN contains 4 DODBs, with C' = C = 64. AWAN is chosen as the SSR model, with 8 modules and 200 channels. Since the numbers of modules of the AWAN model and the DODN differ, the outputs of modules 2, 4, 6 and 8 of AWAN are used for distillation together with the outputs of all four blocks of the DODN. The Adam optimization algorithm is used with β1 = 0.9 and β2 = 0.999; the learning rate is initialized to 10^-4 and gradually decreased to 10^-5. The batch size is 12, and the model of the invention is trained for 200 epochs. The SSR model and the DODN are optimized alternately: in each mini-batch, the model with the smaller fixed error acts as the teacher model while the other updates its parameters. In practical experiments the convergence speeds of the two models were found to differ, so the SHSR model tended to be under-trained; therefore an officially pre-trained AWAN is used and fine-tuned at a smaller learning rate (10^-5).
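The alternating optimization can be illustrated with a deliberately simplified scalar toy: whichever "model" currently has the smaller error is frozen for that mini-batch while the other one updates. This is only a caricature of the training dynamic, not the actual procedure; the scalars, learning rate and update rule are all invented for illustration.

```python
import numpy as np

# Toy alternating optimization between an SSR stand-in and an SHSR
# stand-in: per step, the lower-error "model" acts as the teacher and
# is frozen, while the other one moves toward the shared target.
rng = np.random.default_rng(1)
target = 1.0
ssr, shsr = 0.2, 2.0          # scalar stand-ins for the two models
lr = 0.1
for step in range(50):
    err_ssr, err_shsr = abs(ssr - target), abs(shsr - target)
    if err_ssr <= err_shsr:   # SSR is the teacher, SHSR updates
        shsr -= lr * np.sign(shsr - target) * err_shsr
    else:                     # roles swapped
        ssr -= lr * np.sign(ssr - target) * err_ssr
assert abs(shsr - target) < 0.2 and abs(ssr - target) < 0.2
```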
Six widely used measurement indices assess the quality of the reconstructed image: peak signal-to-noise ratio (PSNR), structural similarity (SSIM), cross-correlation coefficient (CC), spectral angle mapper (SAM), root mean square error (RMSE) and the dimensionless error coefficient ERGAS. PSNR and SSIM are computed as the average over all spectral bands. SAM is a common index for measuring the spectral difference between two hyperspectral images, CC and ERGAS are widely used in hyperspectral image fusion, and the remaining metrics quantitatively measure image-restoration quality. The ideal values of the above indices are +∞ (PSNR), 1 (SSIM), 1 (CC), 0 (SAM), 0 (RMSE) and 0 (ERGAS).
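Two of these indices, band-averaged PSNR and SAM, can be sketched in NumPy as follows. The definitions follow the common conventions; the patent may use slightly different normalizations.

```python
import numpy as np

# Band-averaged PSNR and the spectral angle mapper (SAM, in radians,
# averaged over pixels) for images of shape (bands, height, width).
def psnr(ref, rec, peak=1.0):
    mse = np.mean((ref - rec) ** 2, axis=(1, 2))      # per-band MSE
    return np.mean(10 * np.log10(peak ** 2 / mse))    # average over bands

def sam(ref, rec, eps=1e-12):
    dot = np.sum(ref * rec, axis=0)                   # per-pixel dot product
    denom = np.linalg.norm(ref, axis=0) * np.linalg.norm(rec, axis=0) + eps
    return np.mean(np.arccos(np.clip(dot / denom, -1.0, 1.0)))

rng = np.random.default_rng(0)
gt = rng.random((31, 16, 16))
assert sam(gt, gt) < 1e-5                # identical spectra: zero angle
assert psnr(gt, np.clip(gt + 0.01, 0, 1)) > 30.0
```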
compared with the prior six SHSR methods, the method comprises Bicubic, EDSR, MCNet, ERCSR, SFCSR and ASFS; and two common data sets CAVE and ntie 2020 are used to verify the validity of the proposed DODN.
CAVE dataset: table 1 shows the quantitative comparison of the advanced SHSR method on CAVE datasets for different scale factors. It is clear that the process of the invention is superior to other processes in all respects. In all of these algorithms, EDSR is a classical model of single natural image super-resolution with pure 2D convolution, MCNet and ERCSR are 2D/3D hybrid convolution neural networks, and SFCSR is a sequence model. ASFS uses adjacent bands to independently reconstruct the central band in turn. Experimental results show that 2D CNN (like EDSR) can restore spatial detail at a reasonably good level, but poor results of SAM show that it causes severe spectral distortion. The 2D/3D hybrid models score relatively low on SSIM and CC, indicating that they cannot generate sufficient spatial detail. Furthermore, the sequence model and the 2D/3D mixture model have similar values on the SAM. However, by introducing knowledge from the SSR model, the model of the invention achieves a significant improvement in PSNR (+0.3dB) compared to the sub-optimal algorithm (MCNet) and also considerably reduces SAM (-0.1 rad), i.e., knowledge distillation improves both spatial and spectral reconstruction. As shown in fig. 3, the method of the present invention has a low absolute error, especially in the area containing rich texture. In addition, the model of the invention well reconstructs the edges of the color blocks on the right side frame, which shows that the model of the invention has strong capability of extracting the spatial correlation.
Table 1: result comparison of SHSR method on CAVE dataset
NTIRE 2020 dataset: experiments on the NTIRE 2020 dataset reveal the performance of existing SHSR methods on large-scale data. The quantitative results are summarized in Table 2. Surprisingly, EDSR performs best among all existing models except the method of the invention. This may be because EDSR has a more general architecture that leverages the rich data, which also suggests that what limits the performance of EDSR on small HSI datasets may not be its network prior but the lack of data. In contrast, single-band-output models that take adjacent bands as input, such as SFCSR and ASFS, converge prematurely with poor results. Table 2 shows that the model of the invention outperforms the other methods on all indices, which demonstrates the effectiveness of the proposed algorithm. In particular, the method improves PSNR by +0.37 dB and SAM by −0.13 rad compared with the suboptimal method, indicating that both spatial detail and spectrum are enhanced. The reconstructed spectral bands and the corresponding absolute errors are visualized in FIG. 4; the errors of the invention are smaller overall, especially in regions containing rich texture. For example, the leaf veins in the red rectangle are recovered well, which the other methods cannot achieve.
Table 2: result comparison of SHSR method on NTIRE2020 dataset
The goal of heterogeneous distillation is to transmit spatial and spectral information from the SSR model so that the SHSR model can gain knowledge from another perspective and utilize information from both tasks simultaneously. Since the SSR model is a 2D convolutional network, all of its spatial and spectral information is embedded in 2D features, which raises the problem of distilling heterogeneous knowledge into the 3D features of the SHSR model. The solution of the invention is to add a 2D branch to the model to isolate the 3D SHSR features from the 2D SSR features, shifting the task of distilling between 3D and 2D features to distilling between 2D features. The 2D branch of the model is designed to be similar to the SSR model, which narrows the gap between the two models and thereby reduces the difficulty of heterogeneous knowledge distillation. The information of the two views is combined by feedback fusion, in which the 2D features refine the 3D features band by band; in this way, spatial detail from the HR RGB image is introduced and the information of one band is not contaminated by other bands. Table 3 compares the performance of the model with and without knowledge distillation on the CAVE dataset. In the model without knowledge distillation, only the L1 loss is used for training and all hyper-parameters are kept the same as in the original model. Knowledge distillation significantly improves PSNR while causing little harm to SAM, confirming that the spatial details of the HR-MSI are effectively transferred to the model of the invention.
One possible reason for the slight increase in SAM may be the limited capacity of the 2D SSR features to represent the full spectral information, leading to negative transfer during knowledge distillation.
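The core idea above is that the SSR teacher produces 2D features while the SHSR student works with 3D features, so features are only distilled between the student's auxiliary 2D branch and the teacher. The routing constraint can be illustrated with a minimal, framework-free sketch (all shapes and names are illustrative, not the patent's actual implementation; channel counts may differ because a 1×1 convolution aligns them later):

```python
# Toy compatibility check: distillation requires features of the same rank
# and the same spatial/spectral extent; channel counts (first entry) may
# differ, since a learned 1x1 transform can match them.

def can_distill(student_shape, teacher_shape):
    """Return True when the two feature shapes are directly comparable."""
    return (len(student_shape) == len(teacher_shape)
            and student_shape[1:] == teacher_shape[1:])

student_3d = (64, 31, 32, 32)   # C' x L x H x W : the SHSR 3D branch
student_2d = (32, 32, 32)       # C  x H x W     : the auxiliary 2D branch
teacher_2d = (48, 32, 32)       # SSR teacher feature

assert not can_distill(student_3d, teacher_2d)  # 3D-vs-2D: ranks differ
assert can_distill(student_2d, teacher_2d)      # 2D-vs-2D: distillable
```

This is why adding the 2D branch shifts the problem from 3D-to-2D distillation to ordinary 2D-to-2D feature matching.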
Table 3: ablation analysis of knowledge distillation
The hyperspectral image super-resolution method using heterogeneous knowledge distillation provided by the present invention has been described in detail above; the principle and embodiments of the invention have been explained, and the description of the embodiments is intended only to help understand the method and its core idea. Meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (6)

1. A hyperspectral image super-resolution method using heterogeneous knowledge distillation, characterized by comprising the following steps:
Step one: take as input a low-resolution hyperspectral image I_LR ∈ R^(L×H×W), where L, H and W denote the number of spectral bands, the height and the width of the input image, respectively;
Step two: perform shallow feature extraction on the given input, feeding the image into a 2D processing branch and a 3D processing branch separately: the 3D processing branch applies 3D convolution to extract the spatial-spectral information of the low-resolution hyperspectral input and obtain shallow 3D features F_0^(3D); the 2D processing branch applies 2D convolution to obtain shallow 2D features F_0^(2D);
Step three: feed F_0^(3D) and F_0^(2D) into distillation-oriented dual-branch (DODB) modules, generating a nonlinear mapping with cascaded DODB modules H_DODB; at the Kth, i.e. last, DODB module the 2D features F_K^(2D) are discarded, while the 3D features F_K^(3D) of the Kth DODB module are added to the shallow 3D features F_0^(3D);
Step four: perform heterogeneous knowledge distillation and compute the loss function; distillation is carried out on half of the channels of the 2D features, i.e. for each DODB 2D feature F_k^(2D), the first C/2 channels serve as the output part for distillation and the remaining channels as the retained part; finally, an upsampler outputs the high-resolution hyperspectral image I_SR ∈ R^(L×sH×sW), where s is the scale factor.
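The channel split in step four, where only the first C/2 channels of each 2D feature are exposed to distillation, can be sketched in a few lines of framework-free Python (channel maps are stand-ins for tensors; names are illustrative only, not the patent's implementation):

```python
# Minimal sketch of the distillation channel split: the first C/2 channels
# of a 2D feature are the "output" part compared against the SSR teacher,
# while the remaining channels are retained for the next module.

def split_for_distillation(feature_2d):
    """feature_2d: list of C channel maps (each an HxW nested list)."""
    c = len(feature_2d)
    distilled = feature_2d[: c // 2]   # compared with the teacher features
    retained = feature_2d[c // 2:]     # kept inside the student network
    return distilled, retained

# A toy 4-channel feature of 2x2 maps; channel k is filled with value k:
feat = [[[k] * 2] * 2 for k in range(4)]
out, keep = split_for_distillation(feat)
```

Keeping half of the channels out of the distillation constraint leaves the student room to encode information the teacher does not have.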
2. The method of claim 1, wherein in step two:
in the 3D processing branch, the low-resolution hyperspectral input I_LR is expanded (unsqueezed) to size 1×L×H×W and then passed through a 3×3×3 3D convolution to obtain the shallow 3D features F_0^(3D):
F_0^(3D) = H_3D(I_LR),
where H_3D denotes the 3D convolution applied to the unsqueezed input;
in the 2D processing branch, the low-resolution hyperspectral input I_LR is upsampled to L×sH×sW, where s is the scale factor, to match the spatial resolution expected by the spectral super-resolution (SSR) model, and then passed through a 3×3 2D convolution to obtain the shallow 2D features F_0^(2D):
F_0^(2D) = H_2D(I_LR↑s).
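The shape bookkeeping of the two branches in claim 2 can be sketched as plain arithmetic (a hedged illustration: only the tensor shapes are shown, the convolutions themselves are omitted, and all names are assumptions):

```python
# The 3D branch unsqueezes the L x H x W cube to 1 x L x H x W before the
# 3x3x3 convolution; the 2D branch upsamples it to L x sH x sW before the
# 3x3 convolution, matching the SSR model's input resolution.

def shape_3d_branch(L, H, W):
    return (1, L, H, W)          # unsqueezed input to the 3D convolution

def shape_2d_branch(L, H, W, s):
    return (L, s * H, s * W)     # upsampled input to the 2D convolution

print(shape_3d_branch(31, 16, 16))     # -> (1, 31, 16, 16)
print(shape_2d_branch(31, 16, 16, 4))  # -> (31, 64, 64)
```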
3. The method of claim 2, wherein in step three, for the kth DODB module:
(F_k^(3D), F_k^(2D)) = H_DODB^k(F_(k-1)^(3D), F_(k-1)^(2D));
a transposed 3D convolution followed by a 1×1×1 3D convolution is used as the upsampler H_up; before passing through the upsampler, F_K^(3D) is added to F_0^(3D) to improve the robustness of the model:
I_SR = H_up(F_K^(3D) + F_0^(3D)).
4. The method of claim 3, wherein the distillation-oriented dual-branch module DODB consists of a 3D module, a 2D module and a feedback fusion module;
the 3D module and the 2D module each follow a residual-block structure and extract low-resolution 3D features F_k^(3D) ∈ R^(B×C'×L×H×W) and high-resolution 2D features F_k^(2D) ∈ R^(B×C×sH×sW), respectively, where C' and C denote the numbers of channels of the 3D and 2D features and B denotes the batch size:
F_k^(3D) = H_res3D(F_(k-1)^(3D)),
F_k^(2D) = H_res2D(F_(k-1)^(2D)).
5. The method of claim 4, wherein in the feedback fusion module of the DODB, the 3D features are first upsampled to the same spatial size as the 2D features and fused with them, and then downsampled back to the original size of the 3D features;
after upsampling the 3D features, the 2D features and the 3D features are fused band by band: using the 2D features as feedback information, the 3D features are corrected per spectral band to obtain high-resolution 3D features F_k^(3D,HR);
F_k^(3D,HR) is split along the spectral dimension into L spectral bands:
[F_1, F_2, …, F_L] = Split(F_k^(3D,HR)),
where each F_l has size B×C'×sH×sW; the 2D features are concatenated with each spectral band separately, and a 2D convolution, which is the same for all bands, generates the fused features:
F_l^(fuse) = H_fuse([F_l, F_k^(2D)]), l = 1, …, L;
the fused features are unsqueezed to B×1×C'×sH×sW and stacked to obtain new 3D features F_k^(3D,fuse) of the same size as F_k^(3D,HR):
F_k^(3D,fuse) = Stack(F_1^(fuse), F_2^(fuse), …, F_L^(fuse));
in the downsampling step, F_k^(3D,fuse) is downsampled by cascaded 3×3×3 convolutions H_down; the final output of the distillation-oriented dual-branch module DODB is then the downsampled 3D feature together with the 2D-module feature:
F_k^(3D) = H_down(F_k^(3D,fuse)).
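The band-by-band feedback fusion of claim 5 can be illustrated with a framework-free sketch. The real method attaches the 2D feature to each band and applies a shared 2D convolution; here a simple elementwise average stands in for that convolution, so this shows only the data flow (per-band processing, no cross-band mixing), not the patent's actual operator:

```python
# Feedback fusion sketch: split the (upsampled) 3D feature into its L
# spectral bands, combine each band with the 2D guide feature using one
# shared operation, and stack the results back into a 3D feature.

def feedback_fusion(feat3d, feat2d):
    """feat3d: list of L band maps; feat2d: one map of the same HxW size."""
    fused_bands = []
    for band in feat3d:              # band-by-band: bands never mix
        fused = [
            [(b + g) / 2 for b, g in zip(brow, grow)]
            for brow, grow in zip(band, feat2d)
        ]
        fused_bands.append(fused)
    return fused_bands               # same L x H x W layout as feat3d

bands = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]  # L = 2
guide = [[5.0, 5.0], [5.0, 5.0]]
out = feedback_fusion(bands, guide)
```

Because the guide is applied to each band independently, spatial detail is injected without letting the content of one band contaminate another, which is the point made in the description.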
6. The method of claim 5, wherein in step four, the total loss function comprises a reconstruction loss and a distillation loss, and the L1 loss is chosen as the reconstruction loss:
L_rec = (1/N) Σ_(i=1)^(N) ||I_SR^(i) − I_HR^(i)||_1,
where N denotes the number of samples, and I_SR^(i) and I_HR^(i) are the ith spectral bands of the reconstructed image and the real high-resolution image, respectively;
for the distillation loss, the L1 norm is used to measure the distance between features from the SHSR model and features from the SSR model:
L_distill = (1/S) Σ_(j=1)^(S) ||G_j(F_j^(SHSR)) − F_j^(SSR)||_1,
where S denotes the number of features used in the distillation, F_j^(SHSR) and F_j^(SSR) are the jth-layer features of the SHSR model and the SSR model, and G_j is a transformation, namely a 1×1 convolution, used to ensure that the channel numbers of the two corresponding features are the same;
the total loss function is then:
L_total = L_rec + λ·L_distill (14)
where λ is a hyper-parameter for balancing the two parts, set to 0.05 in the experiments.
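The total loss of claim 6 reduces to simple arithmetic once the two L1 terms are computed. A minimal numeric sketch, with flat lists standing in for images and features (all names are illustrative, not the patent's implementation):

```python
# L1 reconstruction loss plus lambda-weighted L1 feature-distillation loss,
# with lambda = 0.05 as stated in the claim.

LAMBDA = 0.05

def l1(a, b):
    """Mean absolute difference between two equal-length flat vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def total_loss(sr, hr, student_feats, teacher_feats):
    l_rec = l1(sr, hr)
    # average L1 distance over the S feature pairs used in distillation
    l_distill = sum(l1(f, g) for f, g in
                    zip(student_feats, teacher_feats)) / len(student_feats)
    return l_rec + LAMBDA * l_distill

loss = total_loss([1.0, 2.0], [1.0, 4.0], [[0.0, 0.0]], [[1.0, 1.0]])
# l_rec = 1.0, l_distill = 1.0, so loss = 1.0 + 0.05 * 1.0 = 1.05
```

In the actual method the channel-matching transform G_j would be applied to the student features before the distance is taken; it is omitted here for brevity.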
CN202111288667.2A 2021-11-02 2021-11-02 Hyperspectral image super-resolution method utilizing heterogeneous knowledge distillation Active CN114092327B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111288667.2A CN114092327B (en) 2021-11-02 2021-11-02 Hyperspectral image super-resolution method utilizing heterogeneous knowledge distillation

Publications (2)

Publication Number Publication Date
CN114092327A true CN114092327A (en) 2022-02-25
CN114092327B CN114092327B (en) 2024-06-07

Family

ID=80298625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111288667.2A Active CN114092327B (en) 2021-11-02 2021-11-02 Hyperspectral image super-resolution method utilizing heterogeneous knowledge distillation

Country Status (1)

Country Link
CN (1) CN114092327B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113870109A (en) * 2021-09-07 2021-12-31 西安理工大学 Step-by-step spectrum super-resolution method based on spectrum back projection residual

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB1457758A (en) * 1972-11-29 1976-12-08 Schlumberger Ltd Methods for producing signals representative of parameters used in evaluating earth formations
CN110111276A (en) * 2019-04-29 2019-08-09 西安理工大学 Hyperspectral remote-sensing image super-resolution method based on deep exploitation of spatial-spectral information
AU2020100200A4 (en) * 2020-02-08 2020-06-11 Huang, Shuying DR Content-guide Residual Network for Image Super-Resolution
CN113222823A (en) * 2021-06-02 2021-08-06 国网湖南省电力有限公司 Hyperspectral image super-resolution method based on mixed attention network fusion
CN113240580A (en) * 2021-04-09 2021-08-10 暨南大学 Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation
WO2021185225A1 (en) * 2020-03-16 2021-09-23 徐州工程学院 Image super-resolution reconstruction method employing adaptive adjustment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DONG Xiaohui; GAO Ge; CHEN Liang; HAN Zhen; JIANG Junjun: "Noisy face hallucination via data-driven local feature transformation", Journal of Computer Applications, no. 12, 10 December 2014 (2014-12-10) *

Also Published As

Publication number Publication date
CN114092327B (en) 2024-06-07

Similar Documents

Publication Publication Date Title
CN110119780B (en) Hyperspectral image super-resolution reconstruction method based on generative adversarial network
CN109741256B (en) Image super-resolution reconstruction method based on sparse representation and deep learning
Zhao et al. Hierarchical regression network for spectral reconstruction from RGB images
Luo et al. Pansharpening via unsupervised convolutional neural networks
Mei et al. Learning hyperspectral images from RGB images via a coarse-to-fine CNN
CN111127374B (en) Pan-sharing method based on multi-scale dense network
CN111080567A (en) Remote sensing image fusion method and system based on multi-scale dynamic convolution neural network
CN111861961A (en) Multi-scale residual error fusion model for single image super-resolution and restoration method thereof
CN112037131A (en) Single-image super-resolution reconstruction method based on generative adversarial network
CN113139898B (en) Light field image super-resolution reconstruction method based on frequency domain analysis and deep learning
CN114266957B (en) Hyperspectral image super-resolution restoration method based on multi-degradation mode data augmentation
Zheng et al. Separable-spectral convolution and inception network for hyperspectral image super-resolution
CN114841856A (en) Image super-pixel reconstruction method of dense connection network based on depth residual channel space attention
CN108288256A (en) Multispectral mosaic image restoration method
CN114418850A (en) Super-resolution reconstruction method with reference image and fusion image convolution
CN116029930A (en) Multispectral image demosaicing method based on convolutional neural network
CN117635428A (en) Super-resolution reconstruction method for lung CT image
Hang et al. Prinet: A prior driven spectral super-resolution network
CN114092327B (en) Hyperspectral image super-resolution method utilizing heterogeneous knowledge distillation
CN108460723A (en) Bilateral full variation image super-resolution rebuilding method based on neighborhood similarity
Fan et al. Global sensing and measurements reuse for image compressed sensing
CN114359041A (en) Light field image space super-resolution reconstruction method
CN111899166A (en) Medical hyperspectral microscopic image super-resolution reconstruction method based on deep learning
CN114511470B (en) Attention mechanism-based double-branch panchromatic sharpening method
CN115984106A (en) Line-scan image super-resolution method based on bilateral generative adversarial network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant