CN114092327A - Hyperspectral image super-resolution method by utilizing heterogeneous knowledge distillation - Google Patents
- Publication number: CN114092327A (application CN202111288667.2A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06T3/4053 — Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06T3/4046 — Scaling of whole images or parts thereof using neural networks
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G06T2207/10032 — Satellite or aerial image; Remote sensing
- G06T2207/10036 — Multispectral image; Hyperspectral image
Abstract
The invention provides a hyperspectral image super-resolution method using heterogeneous knowledge distillation. Given a low-resolution hyperspectral image input I_LR ∈ R^(L×H×W), the method performs shallow feature extraction, nonlinear mapping through distillation-oriented double-branch (DODB) modules, and upsampling, finally outputting a high-resolution hyperspectral image I_SR ∈ R^(L×sH×sW). Heterogeneous knowledge distillation is used to improve model performance: the distillation acts between the 2D features of the two models, transferring the heterogeneous knowledge distillation problem to a fusion problem inside the SHSR model; the transmitted information serves as feedback that refines the features of each spectral band, and the features are divided into a distilled part and a retained part. The method obtains better performance both quantitatively and qualitatively and reconstructs hyperspectral images of relatively high quality.
Description
Technical Field
The invention belongs to the technical field of image super-resolution, and particularly relates to a hyperspectral image super-resolution method by utilizing heterogeneous knowledge distillation.
Background
A hyperspectral imaging sensor receives light of different wavelengths reflected by an object to obtain a hyperspectral image with many spectral bands. Unlike a grayscale or RGB image, each pixel of a hyperspectral image therefore contains continuous spectral bands numbering from tens to thousands. The abundance of spectral information makes hyperspectral images extremely useful in many computer-vision and remote-sensing tasks, such as image classification, anomaly detection and medical diagnostics. However, due to hardware limitations, the spatial resolution of hyperspectral images is relatively low, and it is difficult to improve the hardware systems. Therefore, super-resolution (SR), a post-processing technique, is widely used to reconstruct a high-spatial-resolution hyperspectral image from a low-resolution (LR) version. One class of classical hyperspectral super-resolution methods is fusion-based methods (FHSR), which require a high-resolution multispectral image (HR-MSI), such as an RGB or panchromatic (PAN) image, and fuse information from both sources. The main drawback of fusion-based methods is that it is difficult, and in some cases impossible, to collect well-registered high-resolution multispectral images. Another approach is single hyperspectral image super-resolution (SHSR), which uses only the information in the low-resolution hyperspectral image. However, since there is no complementary spatial information, such models depend heavily on hand-designed priors, such as low-rank and sparsity assumptions. With the advent of deep learning, single hyperspectral image super-resolution models based on convolutional neural networks have made great progress, but the lack of spatial detail still limits their capability. Furthermore, they do not take full advantage of expensive, well-aligned hyperspectral-multispectral pairs, such as those in public datasets.
Disclosure of Invention
In order to solve the above problems, the invention provides a hyperspectral image super-resolution method using heterogeneous knowledge distillation and designs a distillation-oriented double-branch network (DODN); a new hybrid 2D/3D convolution module, the distillation-oriented double-branch block (DODB), is proposed, and the information of the high-resolution multispectral image HR-MSI is transferred to the single hyperspectral image super-resolution model SHSR through knowledge distillation to improve model performance.
The invention is realized by the following scheme:
a hyperspectral image super-resolution method using heterogeneous knowledge distillation, which specifically comprises the following steps:
step one: given a low-resolution hyperspectral image input I_LR ∈ R^(L×H×W), where L, H and W denote the number of spectral bands, the height and the width of the input image, respectively;
step two: shallow feature extraction is performed on the given image input, and the image information is fed into a 2D processing branch and a 3D processing branch; the 3D processing branch applies 3D convolution to extract the spatial-spectral information of the low-resolution hyperspectral input image, yielding shallow 3D features F_0^(3D); the 2D processing branch applies 2D convolution to obtain shallow 2D features F_0^(2D);
Step three: will be provided withAndsent to a distillation oriented double branch module DODB, using a cascaded DODB: hDODBGenerating a non-linear mapping; and discarding 2D features at the kth, i.e. last DODB moduleSimultaneously obtaining 3D characteristics of Kth DODB moduleObtaining shallow 3D featuresAdding;
step four: heterogeneous knowledge distillation and loss function calculation are performed; distillation is applied to half of each 2D feature, i.e. for each 2D feature F_k^(2D) of a DODB, the first C/2 channels serve as the output part for distillation and the remaining channels as the retained part; finally, a high-resolution hyperspectral image I_SR ∈ R^(L×sH×sW) is output through an upsampler, where s is the scale factor.
Further, in step two,
in the 3D processing branch, the low-resolution hyperspectral image input I_LR is reshaped to size 1 × L × H × W and passed through a 3 × 3 × 3 3D convolution to obtain the shallow 3D features F_0^(3D):
F_0^(3D) = H_3D(I_LR),
where H_3D denotes the 3 × 3 × 3 3D convolution;
in the 2D processing branch, the low-resolution hyperspectral image input I_LR is upsampled to L × sH × sW to match the spatial resolution of the spectral super-resolution (SSR) model input, where s is the scale factor, and a 3 × 3 2D convolution then produces the shallow 2D features F_0^(2D):
F_0^(2D) = H_2D(I_LR↑),
where I_LR↑ denotes the upsampled input and H_2D the 3 × 3 2D convolution;
further, in step three,
for the k-th DODB module (k = 1, ..., K):
(F_k^(3D), F_k^(2D)) = H_DODB(F_(k-1)^(3D), F_(k-1)^(2D));
a transposed 3D convolution followed by a 1 × 1 × 1 3D convolution is used as the upsampler; before passing through the upsampler, F_K^(3D) is added to F_0^(3D) (a long residual connection) to improve the robustness of the model;
the distillation-oriented double-branch module DODB consists of a 3D module, a 2D module and a feedback fusion module;
wherein the 3D module and the 2D module, each built on the residual-block structure, respectively extract low-resolution 3D features F_k^(3D) ∈ R^(B×C'×L×H×W) and high-resolution 2D features F_k^(2D) ∈ R^(B×C×sH×sW), where C' and C denote the channel numbers of the 3D and 2D features respectively, and B denotes the batch size;
further, in a feedback fusion module of the DODB, the 3D features are firstly up-sampled to the same size as the 2D features and fused, and then down-sampled to the original size of the 3D features;
after upsampling the 3D features, the 2D features and the 3D features are fused band by band: using the 2D features as feedback information, the 3D features are corrected per spectral band to obtain high-resolution 3D features; the upsampled 3D features are first separated into L spectral bands F_l (l = 1, ..., L),
where the size of each F_l is B × C' × sH × sW; the 2D features are concatenated with each spectral band separately, and a 2D convolution generates the fused features:
F'_l = H_fuse([F_l, F_k^(2D)]);
the same 2D convolution is shared across all bands; the fused features are reshaped to B × 1 × C' × sH × sW and stacked along the spectral dimension to obtain new 3D features of the same size as the upsampled 3D features;
in the downsampling process, cascaded 3 × 3 × 3 convolutions are used to downsample the fused 3D features;
the final output of the distillation-oriented double-branch module DODB is then the downsampled fused 3D features together with the 2D features F_k^(2D).
further, in the fourth step,
the total loss function includes the reconstruction loss and the distillation loss; the L1 loss is chosen as the reconstruction loss:
L_rec = (1/N) Σ_(i=1)^(N) ||I_SR^(i) − I_HR^(i)||_1,
where N denotes the number of samples, and I_SR^(i) and I_HR^(i) are the i-th reconstructed image and the corresponding real high-resolution image, respectively;
for the distillation loss, the L1 norm is used to measure the distance between features from the SHSR model and features from the SSR model:
L_output = Σ_(j=1)^(S) ||G_j(F_j^(SHSR)) − F_j^(SSR)||_1,
where S denotes the number of features used in the distillation, F_j^(SHSR) and F_j^(SSR) are the features at the j-th distillation position of the SHSR model and the SSR model, and G_j is a transformation (a 1 × 1 convolution) that makes the channel numbers of the corresponding features equal;
the total loss function is then:
L_total = L_rec + λ·L_output (14)
where λ is a hyper-parameter for balancing the two parts, set to 0.05 in practical experiments.
The invention has the following beneficial effects:
The invention provides a new double-branch single-hyperspectral image super-resolution model and a new module for effectively combining 2D convolution and 3D convolution, wherein the model comprises the following components:
the 3D branch extracts the spatial-spectral information of the low-resolution hyperspectral input image through 3D convolution; the 2D branch is designed to be similar to a spectral super-resolution model and receives the information transferred from it;
in each block, the 3D features are split along the spectral dimension and corrected band by band by the 2D features in a feedback manner; applying distillation to only half of the channels of the 2D features helps reduce negative transfer, a technique referred to as semi-distillation;
the invention is the first to utilize privileged information from the spectral super-resolution task and designs a model for heterogeneous knowledge distillation; the introduction of a long residual connection makes the model more robust.
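The semi-distillation described above amounts to a simple channel split on each 2D feature map: only the first C/2 channels are exposed to the distillation loss. A minimal NumPy sketch (all shapes and variable names are illustrative assumptions, not the patented code):

```python
import numpy as np

B, C, sH, sW = 2, 16, 8, 8
f2 = np.random.default_rng(2).random((B, C, sH, sW))   # a 2D feature map of one DODB

# Semi-distillation: the first C/2 channels form the output part that the
# distillation loss acts on; the remaining channels are retained and kept
# free of the teacher's influence, which is intended to limit negative transfer.
distilled, retained = f2[:, :C // 2], f2[:, C // 2:]
assert distilled.shape == (B, C // 2, sH, sW)
assert retained.shape == (B, C // 2, sH, sW)
```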
Drawings
FIG. 1 is a schematic diagram of a DODN in accordance with the present invention, the upper half being a DODN network and the lower half being an AWAN SSR model;
FIG. 2 is a DODB schematic of the present invention;
FIG. 3 is a graph of the reconstruction and absolute error for the 630 nm band of the integer_state image in the CAVE dataset;
FIG. 4 is a graph of the reconstruction and absolute error for the 1st band of the ARAD_0463 image in the NTIRE 2020 dataset.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments; all other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
A hyperspectral image super-resolution method using heterogeneous knowledge distillation specifically comprises the following steps:
step two: shallow feature extraction is performed on the given image input, and the image information is fed into a 2D processing branch and a 3D processing branch; the 3D processing branch applies 3D convolution to extract the spatial-spectral information of the low-resolution hyperspectral input image, yielding shallow 3D features F_0^(3D); the 2D processing branch applies 2D convolution to obtain shallow 2D features F_0^(2D);
Step three: will be provided withAndsent to a distillation oriented double branch module DODB, using a cascaded DODB: hDODBGenerating a non-linear mapping; and discarding 2D features at the kth, i.e. last DODB moduleSimultaneously obtaining 3D characteristics of Kth DODB moduleObtaining shallow 3D featuresAdding;
step four: heterogeneous knowledge distillation and loss function calculation are performed; distillation is applied to half of each 2D feature, i.e. for each 2D feature F_k^(2D) of a DODB, the first C/2 channels serve as the output part for distillation and the remaining channels as the retained part; finally, a high-resolution hyperspectral image I_SR ∈ R^(L×sH×sW) is output through an upsampler, where s is the scale factor.
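The four steps above can be sketched at the level of tensor shapes. The following NumPy sketch replaces every learned convolution with a shape-preserving stand-in (channel broadcasts and nearest-neighbour upsampling); it only illustrates how an L×H×W input flows to an L×sH×sW output, not the patented network itself:

```python
import numpy as np

rng = np.random.default_rng(0)
L, H, W, s, K = 31, 24, 24, 4, 4       # bands, height, width, scale factor, number of DODBs
Cp, C = 8, 16                          # channel counts C' (3D branch) and C (2D branch)

i_lr = rng.random((L, H, W))           # step one: low-resolution HSI in R^(L x H x W)

# Step two: shallow features. The 3x3x3 and 3x3 convolutions are replaced by
# shape-preserving stand-ins (channel broadcast / nearest-neighbour upsampling).
f3_0 = np.broadcast_to(i_lr[None], (Cp, L, H, W)).copy()                  # shallow 3D features
i_up = np.kron(i_lr, np.ones((s, s)))                                     # L x sH x sW
f2_0 = np.broadcast_to(i_up.mean(axis=0)[None], (C, s * H, s * W)).copy() # shallow 2D features

# Step three: K cascaded DODB modules (identity stand-ins here); the 2D output
# of the last module is discarded, and the 3D output is added to the shallow features.
f3, f2 = f3_0, f2_0
for _ in range(K):
    f3, f2 = f3 + 0.0, f2 + 0.0        # placeholder for H_DODB
f3 = f3 + f3_0                         # long residual connection

# Step four: upsampler stand-in, producing I_SR in R^(L x sH x sW)
i_sr = np.kron(f3.mean(axis=0), np.ones((s, s)))
assert i_sr.shape == (L, s * H, s * W)
```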
In step two,
in the 3D processing branch, the low-resolution hyperspectral image input I_LR is reshaped to size 1 × L × H × W and passed through a 3 × 3 × 3 3D convolution to obtain the shallow 3D features F_0^(3D):
F_0^(3D) = H_3D(I_LR),
where H_3D denotes the 3 × 3 × 3 3D convolution;
in the 2D processing branch, the low-resolution hyperspectral image input I_LR is upsampled to L × sH × sW to match the spatial resolution of the spectral super-resolution (SSR) model input, where s is the scale factor, and a 3 × 3 2D convolution then produces the shallow 2D features F_0^(2D):
F_0^(2D) = H_2D(I_LR↑),
where I_LR↑ denotes the upsampled input and H_2D the 3 × 3 2D convolution;
in step three,
for the k-th DODB module (k = 1, ..., K):
(F_k^(3D), F_k^(2D)) = H_DODB(F_(k-1)^(3D), F_(k-1)^(2D));
a transposed 3D convolution followed by a 1 × 1 × 1 3D convolution is used as the upsampler; before passing through the upsampler, F_K^(3D) is added to F_0^(3D) (a long residual connection) to improve the robustness of the model;
the distillation-oriented double-branch module DODB consists of a 3D module, a 2D module and a feedback fusion module; the addition of the 2D branch makes part of the DODB similar to the 2D SSR model; a feedback mechanism fuses the 2D and 3D features band by band, because each band of the HSI is sensitive only to part of the photographed scene (the light energy is confined to a particular wavelength range), while the RGB image contains spatial information of the entire scene.
Wherein the 3D module and the 2D module, each built on the residual-block structure, respectively extract low-resolution 3D features F_k^(3D) ∈ R^(B×C'×L×H×W) and high-resolution 2D features F_k^(2D) ∈ R^(B×C×sH×sW), where C' and C denote the channel numbers of the 3D and 2D features respectively, and B denotes the batch size; pseudo-3D convolutions replace ordinary 3D convolutions to reduce the computational complexity;
in the feedback fusion module of the DODB, the 3D features are first upsampled to the same size as the 2D features and fused, then downsampled back to the original 3D feature size; fusing in the high-resolution space allows the 2D and 3D features to be fused band by band while preserving as much detail as possible in the 2D features. The 2D features receive information transferred from the high-resolution RGB image, so they contain all the spatial details at a lower capacity, while each spectral band of the 3D features can be viewed as a limited view of the high-resolution RGB image with richer spectral information.
After upsampling the 3D features, the 2D features and the 3D features are fused band by band: using the 2D features as feedback information, the 3D features are corrected per spectral band to obtain high-resolution 3D features; the upsampled 3D features are separated into L spectral bands F_l (l = 1, ..., L) along the spectral dimension,
where the size of each F_l is B × C' × sH × sW; the 2D features are concatenated with each spectral band separately, and a 2D convolution generates the fused features:
F'_l = H_fuse([F_l, F_k^(2D)]);
the same 2D convolution is shared across all spectral bands, so the 2D features act as feedback information that refines each spectral band of the 3D features; the fused features are reshaped to B × 1 × C' × sH × sW and stacked along the spectral dimension to obtain new 3D features of the same size as the upsampled 3D features;
in the downsampling process, cascaded 3 × 3 × 3 convolutions are used to downsample the fused 3D features; these 3D convolutions further extract spectral-spatial correlations;
the final output of the distillation-oriented double-branch module DODB is then the downsampled fused 3D features together with the 2D features F_k^(2D).
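The band-by-band feedback fusion described above can be sketched in NumPy as follows. The shared 2D convolution is replaced by a fixed 1×1 averaging stand-in; `fuse_band` and all shapes are illustrative assumptions, not the patented implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
B, Cp, C, L, sH, sW = 2, 4, 6, 5, 8, 8    # batch, C', C, bands, high-res H and W

f3_up = rng.random((B, Cp, L, sH, sW))    # 3D features after upsampling
f2 = rng.random((B, C, sH, sW))           # high-resolution 2D features (feedback)

def fuse_band(band, feedback):
    """Stand-in for the shared 2D convolution H_fuse: concatenate the 2D
    feedback features with one spectral band and project back to C' channels."""
    cat = np.concatenate([band, feedback], axis=1)   # B x (C'+C) x sH x sW
    w = np.ones((Cp, Cp + C)) / (Cp + C)             # fixed 1x1 'conv' weights
    return np.einsum('oc,bchw->bohw', w, cat)        # B x C' x sH x sW

# Separate the 3D features into L spectral bands, refine each with the same
# shared 2D convolution, then stack the results back into a 3D feature volume.
bands = [f3_up[:, :, l] for l in range(L)]               # each: B x C' x sH x sW
fused = [fuse_band(b, f2)[:, :, None] for b in bands]    # each: B x C' x 1 x sH x sW
f3_new = np.concatenate(fused, axis=2)                   # B x C' x L x sH x sW
assert f3_new.shape == f3_up.shape
```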
in the fourth step, (the step is used for improving the performance of the model in the training process and is not applied in the reasoning process)
the total loss function includes the reconstruction loss and the distillation loss; following the mainstream of the field, the L1 loss is chosen as the reconstruction loss:
L_rec = (1/N) Σ_(i=1)^(N) ||I_SR^(i) − I_HR^(i)||_1,
where N denotes the number of samples, and I_SR^(i) and I_HR^(i) are the i-th reconstructed image and the corresponding real high-resolution image, respectively;
for the distillation loss, the L1 norm is used to measure the distance between features from the SHSR model and features from the SSR model:
L_output = Σ_(j=1)^(S) ||G_j(F_j^(SHSR)) − F_j^(SSR)||_1,
where S denotes the number of features used in the distillation, F_j^(SHSR) and F_j^(SSR) are the features at the j-th distillation position of the SHSR model and the SSR model, and G_j is a transformation (a 1 × 1 convolution) that makes the channel numbers of the corresponding features equal;
the total loss function is then:
L_total = L_rec + λ·L_output (14)
where λ is a hyper-parameter for balancing the two parts, set to 0.05 in practical experiments.
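A minimal NumPy sketch of the total loss in equation (14), with random tensors in place of real network features and a fixed averaging matrix as a stand-in for the 1×1 channel-matching convolution G_j (all names and shapes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
N, L, sH, sW = 2, 5, 8, 8
lam = 0.05                                   # the balancing hyper-parameter λ

sr = rng.random((N, L, sH, sW))              # reconstructed images I_SR
hr = rng.random((N, L, sH, sW))              # real high-resolution images

# Reconstruction loss: mean L1 distance over the N samples.
l_rec = np.mean([np.abs(sr[i] - hr[i]).mean() for i in range(N)])

# Distillation loss: L1 distance between SHSR (student) features and SSR
# (teacher) features after the channel-matching transform G_j; the 1x1
# convolution is replaced here by a fixed averaging matrix.
S, c_student, c_teacher = 3, 8, 12
f_student = [rng.random((c_student, sH, sW)) for _ in range(S)]
f_teacher = [rng.random((c_teacher, sH, sW)) for _ in range(S)]
g = np.ones((c_teacher, c_student)) / c_student          # stand-in for G_j
l_output = sum(np.abs(np.einsum('oc,chw->ohw', g, fs) - ft).mean()
               for fs, ft in zip(f_student, f_teacher))

l_total = l_rec + lam * l_output             # equation (14)
assert l_total >= l_rec                      # the distillation term is non-negative
```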
CAVE: the CAVE dataset was collected by a cooled CCD camera at wavelengths ranging from 400 nm to 710 nm, divided into 31 bands. Its 32 images fall into five parts: genuine and fake, skin and hair, paintings, food and beverages, and objects; each image is 512 × 512 in size.
NTIRE 2020: hyperspectral image processing has long lacked large-scale datasets. Recently, the NTIRE spectral super-resolution challenge provided a dataset containing 510 hyperspectral images, one of the largest to date. Each image is 512 × 482 in size, also with 31 bands. Only the clean-track data are used, and since the ground truth of the test set is not accessible, comparisons are made on the validation set; thus 450 images are used for training and 10 images for testing.
For the CAVE dataset, 20 images are selected as the training set, and the remaining 12 images are used for testing. Each image is cropped into blocks of size 96 × 96 with a 48-pixel overlap between blocks, and the scale factor is set to 4; 10% of the blocks are randomly selected as the validation set. Bicubic downsampling is then used to generate the low-resolution inputs. The cropping and downsampling operations are applied to the hyperspectral and RGB images simultaneously to obtain well-registered HSI-RGB image pairs. Data augmentation is performed on the training samples using 90°, 180° and 270° rotations, vertical and horizontal flips, and combinations thereof. For the NTIRE2020 dataset, the only difference is that the image blocks are randomly cropped, with the number of blocks per image fixed at 24.
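The cropping and augmentation pipeline described above can be sketched as follows for one 512 × 512 CAVE image (`crop_patches` is an illustrative helper, not from the patent; the 48-pixel overlap corresponds to a stride of 48 for 96 × 96 blocks):

```python
import numpy as np

hsi = np.random.default_rng(4).random((31, 512, 512))   # one CAVE image (31 bands)

def crop_patches(img, size=96, stride=48):
    """Crop overlapping spatial patches (96x96 blocks, 48-pixel overlap)."""
    _, h, w = img.shape
    return [img[:, y:y + size, x:x + size]
            for y in range(0, h - size + 1, stride)
            for x in range(0, w - size + 1, stride)]

patches = crop_patches(hsi)
assert all(p.shape == (31, 96, 96) for p in patches)

# Eight-fold augmentation of one patch: 0/90/180/270-degree rotations in the
# spatial plane, plus a vertical flip of each (rotations + flips + combinations).
p = patches[0]
augmented = [np.rot90(p, k, axes=(1, 2)) for k in range(4)]
augmented += [np.flip(a, axis=1) for a in augmented]
assert len(augmented) == 8
```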
The DODN contains 4 DODBs, with C' = C = 64. AWAN is chosen as the SSR model, with 8 modules and 200 channels. Since the AWAN model and the DODN have different numbers of modules, the outputs of modules 2, 4, 6 and 8 of the AWAN are used for distillation together with the outputs of all four blocks of the DODN. The Adam optimizer is used with β1 = 0.9 and β2 = 0.999; the learning rate is initialized to 10^-4 and gradually decayed to 10^-5. The batch size is 12, and the model of the invention is trained for 200 epochs. The SSR model and the DODN are optimized alternately: in each mini-batch, the model with the smaller fixed error acts as the teacher, while the other updates its parameters. In practical experiments, a difference in the convergence speed of the two models was found to leave the SHSR model under-trained; therefore an officially pre-trained AWAN is used and fine-tuned at a smaller learning rate (10^-5).
Six widely used metrics assess the quality of the reconstructed images: peak signal-to-noise ratio (PSNR), structural similarity (SSIM), cross-correlation coefficient (CC), spectral angle mapper (SAM), root mean square error (RMSE) and the dimensionless global error (ERGAS). PSNR and SSIM are computed as the average over all spectral bands. SAM is a common index for measuring the spectral difference between two hyperspectral images, and CC and ERGAS are widely used in hyperspectral image fusion; the remaining metrics quantitatively measure image recovery quality. The ideal values of these indices are +∞ for PSNR, 1 for SSIM and CC, and 0 for SAM, RMSE and ERGAS;
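Two of these metrics, band-averaged PSNR and SAM, can be computed as in the following NumPy sketch (the function names and the convention of averaging the spectral angle over all pixels are illustrative assumptions):

```python
import numpy as np

def psnr(x, y, peak=1.0):
    """Peak signal-to-noise ratio in dB, averaged over all spectral bands."""
    mse = np.mean((x - y) ** 2, axis=(1, 2))        # per-band mean squared error
    return np.mean(10 * np.log10(peak ** 2 / mse))

def sam(x, y, eps=1e-8):
    """Spectral angle (radians) between per-pixel spectra, averaged over pixels."""
    num = np.sum(x * y, axis=0)
    den = np.linalg.norm(x, axis=0) * np.linalg.norm(y, axis=0) + eps
    return np.mean(np.arccos(np.clip(num / den, -1.0, 1.0)))

rng = np.random.default_rng(5)
gt = rng.random((31, 64, 64))                       # a 31-band reference image
noisy = np.clip(gt + 0.01 * rng.standard_normal(gt.shape), 0, 1)

assert sam(gt, gt) < 1e-3      # identical spectra give (near-)zero angle
assert psnr(gt, noisy) > 30    # a small perturbation gives a high PSNR
```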
compared with the prior six SHSR methods, the method comprises Bicubic, EDSR, MCNet, ERCSR, SFCSR and ASFS; and two common data sets CAVE and ntie 2020 are used to verify the validity of the proposed DODN.
CAVE dataset: table 1 shows the quantitative comparison of the advanced SHSR method on CAVE datasets for different scale factors. It is clear that the process of the invention is superior to other processes in all respects. In all of these algorithms, EDSR is a classical model of single natural image super-resolution with pure 2D convolution, MCNet and ERCSR are 2D/3D hybrid convolution neural networks, and SFCSR is a sequence model. ASFS uses adjacent bands to independently reconstruct the central band in turn. Experimental results show that 2D CNN (like EDSR) can restore spatial detail at a reasonably good level, but poor results of SAM show that it causes severe spectral distortion. The 2D/3D hybrid models score relatively low on SSIM and CC, indicating that they cannot generate sufficient spatial detail. Furthermore, the sequence model and the 2D/3D mixture model have similar values on the SAM. However, by introducing knowledge from the SSR model, the model of the invention achieves a significant improvement in PSNR (+0.3dB) compared to the sub-optimal algorithm (MCNet) and also considerably reduces SAM (-0.1 rad), i.e., knowledge distillation improves both spatial and spectral reconstruction. As shown in fig. 3, the method of the present invention has a low absolute error, especially in the area containing rich texture. In addition, the model of the invention well reconstructs the edges of the color blocks on the right side frame, which shows that the model of the invention has strong capability of extracting the spatial correlation.
Table 1: result comparison of SHSR method on CAVE dataset
NTIRE2020 dataset: experiments on the NTIRE2020 dataset reveal the performance of existing SHSR methods on large-scale data. The quantitative results are summarized in Table 2. Surprisingly, EDSR performs best among all existing models except the method of the invention. This may be because EDSR has a more general architecture that leverages the rich data, which also suggests that what limits EDSR on small HSI datasets may not be its network prior but the lack of data. In contrast, single-band-output models that take adjacent bands as input, such as SFCSR and ASFS, converge prematurely with poor results. Table 2 shows that the model of the invention outperforms the other methods on all indices, demonstrating the effectiveness of the proposed algorithm. In particular, the method of the invention improves PSNR and SAM by +0.37 dB and −0.13 rad respectively over the second-best method, indicating that both spatial detail and spectrum are enhanced. The reconstructed spectral bands and the corresponding absolute errors are visualized in FIG. 4; the errors of the invention are smaller overall, especially in regions containing rich texture. For example, the leaf veins in the red rectangle are recovered well, which the other methods fail to achieve.
Table 2: result comparison of SHSR method on NTIRE2020 dataset
The goal of heterogeneous distillation is to transmit spatial and spectral information from the SSR model so that the SHSR model can gain knowledge from the other perspective and utilize information from both tasks simultaneously. Since the SSR model is a 2D convolutional network, all its spatial and spectral information is embedded in 2D features, which raises the problem of distilling heterogeneous knowledge into the 3D features of the SHSR model. The solution of the invention is to add a 2D branch to the model, isolating the 3D SHSR features from the 2D SSR features and shifting the task of distilling between 3D and 2D features to distilling between 2D features. The two-dimensional branch of the model is designed to be similar to the SSR model, which narrows the gap between the two models and thereby reduces the difficulty of heterogeneous knowledge distillation. The information of the two views is combined by feedback fusion, in which the 2D features refine the 3D features band by band; in this way, spatial detail from the HR RGB image is introduced and the information of one band is not contaminated by other bands. Table 3 compares the performance of the model with and without knowledge distillation on the CAVE dataset. In the model without knowledge distillation, only the L1 loss is used for training and all hyper-parameters are kept the same as in the original model. It can be observed that knowledge distillation significantly improves PSNR while doing little harm to SAM, confirming that the spatial details of the HR-MSI are effectively transferred to the model of the invention.
One possible reason for the slight increase in SAM may be the limited ability of the 2D SSR features to represent full spectral information, resulting in negative transfer during the knowledge distillation process.
Table 3: ablation analysis of knowledge distillation
The hyperspectral image super-resolution method by utilizing heterogeneous knowledge distillation, which is provided by the invention, is described in detail, the principle and the implementation mode of the invention are explained, and the description of the embodiment is only used for helping to understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in specific embodiments and application ranges, and in summary, the content of the present specification should not be construed as a limitation to the present invention.
Claims (6)
1. A hyperspectral image super-resolution method utilizing heterogeneous knowledge distillation, characterized by comprising the following steps:
step one: a low-resolution hyperspectral image input ILR ∈ R^(L×H×W) is given, wherein L, H and W represent the number of spectral bands, the height and the width of the input image, respectively;
step two: shallow feature extraction is carried out on the given image input, and the image information is sent to a 2D processing branch and a 3D processing branch respectively; the 3D processing branch is processed by 3D convolution to extract the spatial-spectral information of the low-resolution hyperspectral input image and obtain shallow 3D features, and the 2D processing branch is processed by 2D convolution to obtain shallow 2D features;
step three: the shallow 3D features and shallow 2D features are sent to the distillation-oriented double-branch module DODB, and a cascade of DODB modules, HDODB, generates the non-linear mapping; the 2D features are discarded at the K-th, i.e. the last, DODB module, while the 3D features output by the K-th DODB module are obtained and added to the shallow 3D features;
step four: heterogeneous knowledge distillation and loss calculation are performed; distillation is carried out on half of the channels of the 2D features, i.e. for each 2D feature of a DODB, the first C/2 channels are used as the output part for distillation and the remaining channels are used as the retention part; finally, a high-resolution hyperspectral image ISR ∈ R^(L×sH×sW) is output through an up-sampler, where s is the scale factor.
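The half-channel split of step four can be sketched in code. This is a hypothetical illustration, not the patented implementation; the tensor shapes and names are assumptions.

```python
# Hypothetical sketch of step four's channel split: for each DODB 2D
# feature map, the first C/2 channels are the output part used for
# distillation and the rest are the retention part.
import torch

def split_for_distillation(feat_2d: torch.Tensor):
    """feat_2d: (B, C, sH, sW) 2D feature map from a DODB module."""
    c = feat_2d.shape[1]
    distill_part = feat_2d[:, : c // 2]   # compared against teacher features
    retain_part = feat_2d[:, c // 2 :]    # kept by the student branch
    return distill_part, retain_part

x = torch.randn(2, 64, 32, 32)            # assumed B=2, C=64 feature map
distill_part, retain_part = split_for_distillation(x)
```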
2. The method of claim 1, wherein, in step two:
in the 3D processing branch, the low-resolution hyperspectral image input ILR is expanded to the size 1 × L × H × W and then passed through a 3 × 3 × 3 3D convolution to obtain the shallow 3D features;
in the 2D processing branch, the low-resolution hyperspectral image input ILR is up-sampled to L × sH × sW to match the spatial resolution of the spectral super-resolution (SSR) model input, and then passed through a 3 × 3 2D convolution to obtain the shallow 2D features, where s is the scale factor;
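The two shallow branches of step two can be sketched as follows. The band count, spatial size, channel widths, and interpolation mode are assumptions for illustration only.

```python
# Hypothetical sketch of the shallow 3D and 2D processing branches.
import torch
import torch.nn as nn
import torch.nn.functional as F

L, H, W, s = 31, 16, 16, 2           # assumed band count, spatial size, scale
i_lr = torch.randn(1, L, H, W)       # low-resolution hyperspectral input

# 3D branch: view the input as 1 x L x H x W and apply a 3x3x3 convolution
conv3d = nn.Conv3d(1, 16, kernel_size=3, padding=1)
f3d_0 = conv3d(i_lr.unsqueeze(1))    # shallow 3D features (1, 16, L, H, W)

# 2D branch: up-sample to L x sH x sW to match the SSR input, then 3x3 conv
i_up = F.interpolate(i_lr, scale_factor=s, mode="bicubic",
                     align_corners=False)
conv2d = nn.Conv2d(L, 64, kernel_size=3, padding=1)
f2d_0 = conv2d(i_up)                 # shallow 2D features (1, 64, sH, sW)
```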
4. The method of claim 3, further comprising:
the distillation-oriented double-branch module DODB consists of a 3D module, a 2D module and a feedback fusion module;
wherein the 3D module and the 2D module, each built on the structure of a residual block, respectively extract low-resolution 3D features and high-resolution 2D features, where C′ and C represent the numbers of channels of the 3D features and the 2D features respectively, and B represents the batch size;
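The residual-block structure of the 3D and 2D modules can be sketched as below. The layer widths and the exact block composition are assumptions; the patent only specifies that both modules follow a residual-block structure.

```python
# Hypothetical residual-block sketches for the DODB 3D and 2D modules.
import torch
import torch.nn as nn

class ResBlock3D(nn.Module):
    """3D module: extracts low-resolution 3D features (B, C', L, H, W)."""
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(c, c, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(c, c, 3, padding=1))
    def forward(self, x):
        return x + self.body(x)          # residual connection

class ResBlock2D(nn.Module):
    """2D module: extracts high-resolution 2D features (B, C, sH, sW)."""
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c, c, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c, c, 3, padding=1))
    def forward(self, x):
        return x + self.body(x)          # residual connection

f3d = ResBlock3D(8)(torch.randn(1, 8, 5, 8, 8))    # (B, C', L, H, W)
f2d = ResBlock2D(16)(torch.randn(1, 16, 16, 16))   # (B, C, sH, sW)
```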
5. the method of claim 4, further comprising:
in the feedback fusion module of the DODB, the 3D features are first up-sampled to the same size as the 2D features and fused with them, and then down-sampled back to the original size of the 3D features;
after up-sampling the 3D features, the 2D features and the 3D features are fused band by band: the 2D features are used as feedback information to correct the 3D features per spectral band, yielding high-resolution 3D features;
wherein Fl, the l-th spectral band of the up-sampled 3D features, has size B × C′ × sH × sW; the 2D features are concatenated with each spectral band separately, and a 2D convolution generates the fused features;
the same 2D convolution is applied to all bands; the fused features are expanded to B × 1 × C′ × sH × sW and then stacked along the band dimension to obtain new 3D features of the same size as the up-sampled 3D features;
in the downsampling process, cascaded 3 × 3 × 3 convolution pairs are usedCarrying out down-sampling;
the final output of the distillation oriented double branch module DODB is then:
6. The method of claim 5, wherein, in step four:
the total loss function comprises the reconstruction loss and the distillation loss; the L1 loss is chosen as the reconstruction loss:
Lrec = (1/N) Σi ‖Îi − Ii‖1
wherein N represents the number of samples, and Îi and Ii are the i-th spectral band of the reconstructed image and of the real high-resolution image, respectively;
for the distillation loss, the L1 norm is used to measure the distance between the features from the SHSR model and the features from the SSR model:
Loutput = (1/S) Σj ‖Gj(Fj_SHSR) − Fj_SSR‖1
wherein S represents the number of features used in the distillation, Fj_SHSR and Fj_SSR are the features of the j-th layer of the SHSR model and the SSR model respectively, and Gj is a transformation, namely a 1 × 1 convolution, used to ensure that the two corresponding features have the same number of channels;
the total loss function is then:
Ltotal = Lrec + λLoutput (14)
where λ is a hyper-parameter for balancing the two parts, set to 0.05 in practical experiments.
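Equation (14) can be sketched numerically as below. The feature shapes and channel counts are assumptions; only λ = 0.05, the L1 losses, and the 1 × 1 channel-matching convolution Gj come from the claims.

```python
# Hypothetical sketch of the total loss: reconstruction + λ * distillation.
import torch
import torch.nn as nn
import torch.nn.functional as F

lam = 0.05                              # λ from the experiments
sr = torch.randn(1, 31, 64, 64)         # reconstructed HR hyperspectral image
hr = torch.randn(1, 31, 64, 64)         # ground-truth HR image
l_rec = F.l1_loss(sr, hr)               # L1 reconstruction loss

student = torch.randn(1, 32, 16, 16)    # j-th feature of the SHSR 2D branch
teacher = torch.randn(1, 64, 16, 16)    # j-th feature of the SSR teacher
g_j = nn.Conv2d(32, 64, kernel_size=1)  # G_j: 1x1 conv matching channels
l_out = F.l1_loss(g_j(student), teacher)

l_total = l_rec + lam * l_out           # equation (14)
```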
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111288667.2A CN114092327B (en) | 2021-11-02 | 2021-11-02 | Hyperspectral image super-resolution method utilizing heterogeneous knowledge distillation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114092327A true CN114092327A (en) | 2022-02-25 |
CN114092327B CN114092327B (en) | 2024-06-07 |
Family
ID=80298625
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113870109A (en) * | 2021-09-07 | 2021-12-31 | 西安理工大学 | Step-by-step spectrum super-resolution method based on spectrum back projection residual |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB1457758A (en) * | 1972-11-29 | 1976-12-08 | Schlumberger Ltd | Methods for producing signals representative of parameters used in evaluating earth formations |
CN110111276A (en) * | 2019-04-29 | 2019-08-09 | 西安理工大学 | Based on sky-spectrum information deep exploitation target in hyperspectral remotely sensed image super-resolution method |
AU2020100200A4 (en) * | 2020-02-08 | 2020-06-11 | Huang, Shuying DR | Content-guide Residual Network for Image Super-Resolution |
CN113222823A (en) * | 2021-06-02 | 2021-08-06 | 国网湖南省电力有限公司 | Hyperspectral image super-resolution method based on mixed attention network fusion |
CN113240580A (en) * | 2021-04-09 | 2021-08-10 | 暨南大学 | Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation |
WO2021185225A1 (en) * | 2020-03-16 | 2021-09-23 | 徐州工程学院 | Image super-resolution reconstruction method employing adaptive adjustment |
Non-Patent Citations (1)
Title |
---|
DONG Xiaohui; GAO Ge; CHEN Liang; HAN Zhen; JIANG Junjun: "Data-driven local feature transformation for noisy face hallucination", Journal of Computer Applications, no. 12, 10 December 2014 (2014-12-10) * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |