CN108830796B - Hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss


Info

Publication number
CN108830796B
CN108830796B (application number CN201810639042.8A)
Authority
CN
China
Prior art keywords
image
gradient
value
network
hyperspectral
Prior art date
Legal status
Expired - Fee Related
Application number
CN201810639042.8A
Other languages
Chinese (zh)
Other versions
CN108830796A (en)
Inventor
王敏全
丁溢洋
尚赵伟
秦安勇
赵林畅
Current Assignee
Chongqing University
Original Assignee
Chongqing University
Priority date
Filing date
Publication date
Application filed by Chongqing University
Priority to CN201810639042.8A
Publication of CN108830796A
Application granted
Publication of CN108830796B
Status: Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss, and belongs to the field of image super-resolution reconstruction. The method comprises the following steps: S1: obtaining a hyperspectral image; S2: dividing the hyperspectral images into a training set and a test set; S3: inputting the training set into a spectral-spatial combined neural network, and training it with the joint loss of the spatial domain and the gradient domain; S4: passing the test set through the neural network to obtain the final reconstruction result. Compared with the prior art, the method adopted by the invention has a lighter network structure, higher reconstruction quality, and stronger noise resistance.

Description

Hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss
Technical Field
The invention belongs to the field of image super-resolution reconstruction, and relates to a hyperspectral image super-resolution reconstruction method based on a spectrum-space combination network and gradient domain loss.
Background
Hyperspectral images have low spatial resolution and are prone to mixed end members, which causes spectral distortion and destroys the spatial and spectral consistency of the end members. According to the input information, hyperspectral image super-resolution reconstruction can be roughly divided into two types: reconstruction methods based on image fusion (with an auxiliary image) and reconstruction methods without an auxiliary image. Fusion-based methods use an RGB, Panchromatic (PAN), or Multispectral (MS) image as an auxiliary input and jointly constrain the reconstruction with spatial and spectral information, so as to unmix end members and reduce spectral distortion. In RGB-assisted reconstruction, the hyperspectral super-resolution problem is treated as an optimization problem with sparse, non-negative constraints on the LR image and the corresponding RGB image. Akhtar et al. use a sparse coding strategy to reconstruct the hyperspectral image, fully exploiting the spatial structure as well as the non-negativity and sparsity of the signal. Dong et al. propose Non-Negative Structured Sparse Representation (NNSR) to explore the spatial correlation of sparse image representations, reconstructing the HR image from the hyperspectral LR image and the corresponding RGB image. Based on the end-member consistency of a hyperspectral image and a PAN image of the same scene, Zhao et al. train an overcomplete dictionary pair from the hyperspectral image and the corresponding PAN image, and use non-local similarity to express the mapping between the HR and LR images. Because a PAN image is a gray-scale image formed by mixing visible-light bands, its spectral resolution is low, and spectral distortion easily arises when it is fused with a hyperspectral image. Compared with PAN images, MS images have lower spatial resolution but rich spectral information, with several to more than ten bands, from which the color information of ground objects can be obtained. Zhang et al. propose a wavelet-based Bayesian fusion framework that uses a Gaussian mixture model as a prior to constrain the wavelet coefficients, fusing the hyperspectral LR image with a high-resolution MS image to finally obtain the hyperspectral HR image. Wei et al. obtain an overcomplete dictionary by training on observed images, fuse hyperspectral and MS images with a sparsity-regularized optimization framework, and solve it by alternately optimizing the projected target image and the coefficient coding. In practical applications, however, it is difficult to obtain MS images with matching spectral coverage of the same scene, and differences in wavelength range can degrade the quality of the hyperspectral reconstruction. Reconstruction methods without an auxiliary image have therefore received much attention because of their simple acquisition requirements.
Reconstruction without an auxiliary image uses only the hyperspectral LR image and constrains the reconstruction with the combined spatial and spectral information of the hyperspectral image. Xing et al. learn a dictionary from hyperspectral images and use the Beta-Bernoulli process to improve the self-consistency of the dictionary, but the computational complexity is high. Akgun et al. model a hyperspectral image as a plurality of convex sets containing high-dimensional features, use POCS (Projection Onto Convex Sets) to obtain the solution space of the HR image, and add prior-information constraints, so the edge and detail information of the image is well maintained, but the solution depends too heavily on the initial value and is not unique. Zhang et al. propose a hyperspectral image super-resolution reconstruction algorithm based on MAP (Maximum A Posteriori), adding prior knowledge to guarantee the uniqueness of the solution, dividing the bands of the hyperspectral image into three groups, and using principal component analysis to reduce the computation and remove redundant information. In recent years, deep learning has been widely applied to hyperspectral image reconstruction. Yuan et al. draw on the idea of transfer learning and use the mapping between LR and HR natural images to help reconstruct the hyperspectral HR image, but the method does not consider the correlation between the spectral bands of the hyperspectral image. The 3D-FCNN extracts spectral-spatial features with three-dimensional convolution, and hyperspectral images collected by the same detector can be reconstructed without retraining. Li et al. learn the differences between the spectral bands of hyperspectral images with a deep convolutional network, initialize the HR image by combining these differences with the LR image, and iteratively update the HR image until convergence by using an IBP algorithm to minimize the simulation error between the LR image and the one obtained by back-projecting the HR image.
Generally speaking, the prior art has the following problems in hyperspectral image super-resolution:
1) time complexity and reconstruction quality cannot both be achieved at the same time;
2) existing methods cannot capture the spectral information of the hyperspectral image to realize spectral-spatial combination, so the reconstruction quality is poor.
Disclosure of Invention
In view of the above, the present invention provides a hyperspectral image super-resolution reconstruction method based on a spectral-spatial combined network and gradient domain loss, in which all the spectral information of the hyperspectral image is used as input to the neural network, the correlations between different spectral bands are fully exploited by pseudo-3D convolution and residual learning to improve the reconstruction, and the detail quality of the reconstructed image is improved by a loss function combining gradient-domain and spatial-domain losses. Compared with the prior art, the method adopted by the invention obtains a clearer reconstruction under the same space, time, and other conditions, and is superior to the prior art.
In order to achieve the purpose, the invention provides the following technical scheme:
a hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss comprises the following steps:
s1: obtaining a hyperspectral image;
s2: dividing the hyperspectral images into a training set and a test set;
s3: inputting the training set into a neural network with spectrum and space combination, and training by utilizing the joint loss of a space domain and a gradient domain;
s4: and (5) passing the test set through a neural network to obtain a final reconstruction result.
Further, in step S3, the spectral-spatial combined neural network SSRNet is a convolutional neural network based on deep learning; it converts the HR image reconstruction task into fitting the residual between the HR image and the LR image, and extracts spectral-spatial features with pseudo-three-dimensional convolution, thereby improving both spatial and spectral resolution. The SSRNet uses a loss function combining the pixel-domain Charbonnier loss with a gradient-domain loss to improve the quality of the reconstructed image, and adopts a multi-scale training mode to complete the image reconstruction task under various sampling factors.
Further, the SSRNet structure includes 14 convolutional layers, and every convolutional layer except the last is followed by an activation function layer;
1) deep residual learning: A. a skip connection is added between the network input and output, so that the network learns the residual between the high/low-resolution images; the network parameter weights are then relatively sparse, which accelerates convergence; B. residual blocks are used in the network, reducing the risk of gradient vanishing or explosion as the network deepens;
2) introducing a Pseudo-three-dimensional Convolution (P3D) residual block;
the number of residual blocks can be modified according to the available computing resources; increasing the number of residual blocks can improve reconstruction quality but increases computational resource consumption;
for a 3x3x3 three-dimensional convolution kernel, P3D replaces it with a 3x1x1 one-dimensional spectral convolution kernel and a 1x3x3 two-dimensional spatial convolution kernel (a minimal sketch is given after this list); compared with a two-dimensional convolutional neural network of the same depth, the pseudo-three-dimensional convolutional neural network effectively reduces the number of parameters and the model size; meanwhile, when designing the residual block, 1x1x1 bottleneck layers are added at the head and tail ends to reduce the sizes of the input and output feature maps of the two-dimensional spatial convolution kernel and the one-dimensional spectral convolution kernel, which lowers the computational cost and makes it convenient to increase the network depth;
3) the BN layer is removed from the pseudo-three-dimensional convolution residual module, improving network performance;
the main function of the BN layer is to prevent gradients from vanishing or exploding, but the invention removes it when designing the network structure, mainly for two reasons: first, a BN layer usually requires a large batch size, yet because hyperspectral image data are large and high-dimensional, a large batch size sharply increases computational resource consumption, so a small batch size is adopted when designing the network structure, which is unsuitable for a BN layer; second, while standardizing features, the BN layer also reduces the range flexibility of the network, i.e., the range of variation the network can represent or respond to; therefore, for image super-resolution reconstruction, removing the BN layer does not reduce the image reconstruction quality, but improves network performance and reduces GPU memory usage.
4) All activation functions in the network are ReLU, which accelerates convergence and prevents gradient explosion and vanishing.
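To make point 2) concrete, the following minimal sketch (in PyTorch, an assumption of this sketch rather than part of the invention; the channel count of 16 is taken from the embodiment described below) contrasts a full 3x3x3 three-dimensional convolution with its P3D factorization into a 3x1x1 spectral kernel followed by a 1x3x3 spatial kernel, and prints the resulting parameter counts:

import torch
import torch.nn as nn

channels = 16

# Full three-dimensional convolution with a 3x3x3 kernel.
conv3d_full = nn.Conv3d(channels, channels, kernel_size=(3, 3, 3), padding=(1, 1, 1))

# P3D factorization: a one-dimensional spectral convolution followed by a
# two-dimensional spatial convolution (both expressed as Conv3d modules).
conv_spectral = nn.Conv3d(channels, channels, kernel_size=(3, 1, 1), padding=(1, 0, 0))
conv_spatial = nn.Conv3d(channels, channels, kernel_size=(1, 3, 3), padding=(0, 1, 1))

def n_params(*mods):
    return sum(p.numel() for m in mods for p in m.parameters())

print("full 3x3x3:", n_params(conv3d_full))                         # 6928 parameters
print("P3D 3x1x1 + 1x3x3:", n_params(conv_spectral, conv_spatial))  # 3104 parameters

# The factorized pair covers the same 3x3x3 neighbourhood with fewer weights,
# which is the parameter and model-size reduction referred to above.
x = torch.randn(1, channels, 31, 32, 32)  # (batch, channels, bands, height, width)
assert conv_spatial(conv_spectral(x)).shape == conv3d_full(x).shape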
Further, in step S3, the loss function is the objective function optimized by the neural network and evaluates the difference between the predicted value and the true value; a well-chosen loss function improves the network convergence speed and the quality of the prediction, while a poor one reduces the overall performance of the network;
for super-resolution reconstruction methods based on deep learning, the most common loss function is the MSE, minimized between pixel pairs, and minimizing MSE directly raises the PSNR value. Although MSE is simple to optimize, the network then returns the average of several possible images, and outliers increase the MSE sharply, so the reconstructed image usually lacks high-frequency content, its texture is over-smoothed, and the visual effect is unnatural. The invention adopts a joint loss function combining the Charbonnier loss with a gradient-domain loss, which is more robust and preserves texture.
1) Charbonnier loss function
$$L_{\mathrm{Char}}(\hat{X},X)=\frac{1}{m}\sum_{i=1}^{m}\rho\left(\hat{X}^{(i)}-X^{(i)}\right),\qquad \rho(x)=\sqrt{x^{2}+\varepsilon^{2}}$$

where $m$ is the number of samples, $\hat{X}^{(i)}$ denotes the pixel value of the ith pixel of the prediction, $X^{(i)}$ denotes the pixel value of the ith pixel of the true value, and $\varepsilon$ acts as a threshold on the difference between each pixel pair of the LR and HR images so as to reduce the interference of outliers; in the invention, the parameter $\varepsilon$ is 0.001.
2) gradient domain loss function
Firstly, the LR and HR images need to be transferred to the gradient domain by an image gradient algorithm, and the image gradient is calculated by the following formula:
horizontal direction:
dx(i,j)=I(i+1,j)-I(i,j)
vertical direction:
dy(i,j)=I(i,j+1)-I(i,j)
where I is the image pixel value and (i, j) are the coordinates of the pixel; the Charbonnier loss is computed on the resulting gradient-domain feature maps, finally giving the joint loss function:
$$L=\frac{1}{m}\sum_{i=1}^{m}\left[\rho\left(\hat{X}^{(i)}-X^{(i)}\right)+\alpha_{1}\,\rho\left(d_{x}\hat{X}^{(i)}-d_{x}X^{(i)}\right)+\alpha_{2}\,\rho\left(d_{y}\hat{X}^{(i)}-d_{y}X^{(i)}\right)\right]$$

where $\alpha_{1}$ and $\alpha_{2}$ are the weighting coefficients of the gradient losses in the horizontal and vertical directions respectively, both set to 0.5 in the invention; $d_{x}\hat{X}^{(i)}$ denotes the gradient value of the ith pixel of the prediction in the x direction, $d_{y}\hat{X}^{(i)}$ the gradient value in the y direction, and $d_{x}X^{(i)}$ and $d_{y}X^{(i)}$ the corresponding gradient values of the true value.
The invention has the following beneficial effects: aiming at the difficulty the prior art has in jointly exploiting spectral and spatial-domain information, and at its poor texture quality in reconstruction, the invention provides a hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss. Compared with the prior art, the method adopted by the invention obtains a clearer reconstruction under the same space, time, and other conditions. The network structure is lighter, with higher reconstruction quality and stronger noise resistance.
Drawings
In order to make the object, technical scheme, and beneficial effects of the invention clearer, the following drawings are provided for explanation:
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a diagram of the SSRNet network architecture according to the present invention;
FIG. 3 is a super-resolution reconstruction comparison of the hyperspectral image Flowers (sampling factor s = 2);
FIG. 4 is a super-resolution reconstruction comparison of the hyperspectral image Flowers (sampling factor s = 3);
FIG. 5 is a super-resolution reconstruction comparison of the hyperspectral image Flowers (sampling factor s = 4);
FIG. 6 is a super-resolution reconstruction comparison of the hyperspectral image Img1 (sampling factor s = 2);
FIG. 7 is a super-resolution reconstruction comparison of the hyperspectral image Img1 (sampling factor s = 3);
FIG. 8 is a super-resolution reconstruction comparison of the hyperspectral image Img1 (sampling factor s = 4);
reference numerals: (a) original image, (b) Bicubic, (c) SRCNN, (d) VDSR, (e) DRCN, (f) EDSR, (g) 3D-FCNN, (h) the proposed SSRNet.
Detailed Description
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
FIG. 1 is a flow chart of a hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss according to the invention; referring to fig. 1, a hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss comprises the following steps:
s1: obtaining a hyperspectral image;
s2: dividing the hyperspectral images into a training set and a test set;
s3: inputting the training set into a neural network with spectrum and space combination, and training by utilizing the joint loss of a space domain and a gradient domain;
s4: and (5) passing the test set through a neural network to obtain a final reconstruction result.
FIG. 2 shows the structure of the neural network used in steps S3 and S4. The spectral-spatial combined neural network is constructed as follows:
the purpose of the image super-resolution reconstruction task is to learn a mapping relationship from a low-resolution image to a high-resolution image, which can be expressed as the following formula:
y=f(x)
where y denotes a high-resolution image, x denotes a low-resolution image, and f denotes a mapping relation to be learned. In deep learning, such a mapping relationship is often represented by a convolutional neural network having a plurality of convolutional layers.
SSRNet is the convolutional neural network designed by the present invention. It includes 14 convolutional layers, and every convolutional layer except CONV_5 and CONV_6 is followed by an activation function layer. The core ideas and construction details of the network design are as follows:
1) Exploiting the advantages of deep residual learning
a. A skip connection is added between the network input and output, so that the network learns the residual between the high/low-resolution images; the network parameter weights are then relatively sparse, which accelerates convergence; b. residual blocks are used in the network, reducing the risk of gradient vanishing or explosion as the network deepens.
2) CONV_1 is a one-dimensional spectral convolution with a kernel size of 1x1x7 and 64 kernels, followed by a ReLU activation function; it is mainly used to learn inter-spectral features.
3) A Pseudo three-dimensional Convolution (P3D) residual block was introduced.
CONV_2 is a bottleneck layer with a 1x1x1 kernel, 16 kernels, and a ReLU activation function. It reduces the sizes of the input and output feature maps of the two-dimensional spatial convolution kernel and the one-dimensional spectral convolution kernel, lowering the computational cost and making it convenient to increase the network depth. CONV_3 and CONV_4 form the core of the pseudo-three-dimensional convolution, replacing a 3x3x3 three-dimensional convolution kernel with a 3x1x1 one-dimensional spectral convolution kernel and a 1x3x3 two-dimensional spatial convolution kernel. CONV_3 and CONV_4 use the ReLU activation function, each with 16 kernels. The pseudo-three-dimensional convolution learns the spectral-spatial features of the hyperspectral image and, compared with a two-dimensional convolutional neural network of the same depth, effectively reduces the number of parameters and the model size. CONV_5 is a bottleneck layer with a 1x1x1 kernel and 32 kernels.
4) The BN layer is removed when designing the pseudo-three-dimensional convolution residual module. A BN layer is often placed after the convolutional layer in a residual block to normalize its output. Its main function is to prevent gradients from vanishing or exploding, but we remove it when designing the network structure, mainly for two reasons: a. a BN layer usually requires a large batch size, yet because hyperspectral image data are large and high-dimensional, a large batch size sharply increases computational resource consumption, so a small batch size is adopted when designing the network structure, which is unsuitable for a BN layer; b. while standardizing features, the BN layer also reduces the range flexibility of the network, i.e., the range of variation the network can represent or respond to. Therefore, for image super-resolution reconstruction, removing the BN layer does not reduce the image reconstruction quality, but improves network performance and reduces GPU memory usage.
5) The network stacks three pseudo-three-dimensional convolution residual blocks designed as above; the number of residual blocks can be modified according to the available computing resources. Increasing the number of residual blocks can improve reconstruction quality but increases computational resource consumption.
6) CONV_6 is the last layer of the network, a convolutional layer with a 1x1x1 kernel and a single kernel, which performs the final mapping to obtain the reconstructed image.
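Putting the six layer groups together, the following is a minimal PyTorch sketch of the SSRNet structure described above. The patent specifies kernel sizes and kernel counts but not every wiring detail; the (bands, height, width) axis ordering, the 1x1x1 projection on the inner skip connection where a residual block changes channel width, and the use of a bicubically pre-upsampled LR cube as input are assumptions of this sketch. Counting CONV_1, three blocks of four convolutions, and CONV_6 gives the 14 convolutional layers mentioned above.

import torch
import torch.nn as nn

class P3DResidualBlock(nn.Module):
    # CONV_2..CONV_5: bottleneck -> spectral conv -> spatial conv -> bottleneck.
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(in_ch, 16, kernel_size=1), nn.ReLU(inplace=True),              # CONV_2
            nn.Conv3d(16, 16, (3, 1, 1), padding=(1, 0, 0)), nn.ReLU(inplace=True),  # CONV_3 (spectral)
            nn.Conv3d(16, 16, (1, 3, 3), padding=(0, 1, 1)), nn.ReLU(inplace=True),  # CONV_4 (spatial)
            nn.Conv3d(16, out_ch, kernel_size=1),                                    # CONV_5 (no activation)
        )
        # Assumed: a 1x1x1 projection so the inner skip also works across a width change.
        self.skip = nn.Conv3d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        return self.body(x) + self.skip(x)

class SSRNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(                                   # CONV_1: 1D spectral conv
            nn.Conv3d(1, 64, (7, 1, 1), padding=(3, 0, 0)), nn.ReLU(inplace=True))
        self.blocks = nn.Sequential(                                 # three P3D residual blocks
            P3DResidualBlock(64, 32), P3DResidualBlock(32, 32), P3DResidualBlock(32, 32))
        self.tail = nn.Conv3d(32, 1, kernel_size=1)                  # CONV_6 (no activation)

    def forward(self, x):
        # Global skip connection: the network fits the residual between the HR cube
        # and the pre-upsampled LR cube, then adds the input back.
        return x + self.tail(self.blocks(self.head(x)))

# Input: a pre-upsampled LR cube of shape (batch, 1, bands, height, width).
y = SSRNet()(torch.randn(2, 1, 31, 32, 32))
print(y.shape)  # torch.Size([2, 1, 31, 32, 32])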
Wherein the combined loss of spatial and gradient domains involved in step S3 may be expressed as follows:
the loss function is an objective function of neural network optimization, and aims to evaluate the difference between a predicted value and a true value, and the excellent loss function can improve the network convergence speed and the quality of the predicted value, otherwise, the overall performance of the network is reduced.
For the super-resolution reconstruction method based on deep learning, the most common loss function is MSE based on the minimization between pixel pairs, and the minimization of MSE can intuitively promote the PSNR value. Although the MSE optimization is simple, the network returns the average value of a plurality of possible images, and the MSE value is increased sharply when abnormal points exist, so that the reconstructed image usually lacks high-frequency content, and the texture information is too smooth and the visual effect is not natural. The method adopts a joint loss function combining Charbonier with gradient domain, which has better robustness and has texture retention.
1) Charbonnier loss function
$$L_{\mathrm{Char}}(\hat{X},X)=\frac{1}{m}\sum_{i=1}^{m}\rho\left(\hat{X}^{(i)}-X^{(i)}\right),\qquad \rho(x)=\sqrt{x^{2}+\varepsilon^{2}}$$

where $m$ is the number of samples, $\hat{X}^{(i)}$ denotes the pixel value of the ith pixel of the prediction, $X^{(i)}$ denotes the pixel value of the ith pixel of the true value, and $\varepsilon$ acts as a threshold on the difference between each pixel pair of the LR and HR images so as to reduce the interference of outliers; in the invention, the parameter $\varepsilon$ is 0.001.
2) gradient domain loss function
Firstly, the LR and HR images need to be transferred to the gradient domain by an image gradient algorithm, and the image gradient is calculated by the following formula:
horizontal direction:
dx(i,j)=I(i+1,j)-I(i,j)
vertical direction:
dy(i,j)=I(i,j+1)-I(i,j)
where I is the image pixel value and (i, j) are the coordinates of the pixel; the Charbonnier loss is computed on the resulting gradient-domain feature maps, finally giving the joint loss function:
$$L=\frac{1}{m}\sum_{i=1}^{m}\left[\rho\left(\hat{X}^{(i)}-X^{(i)}\right)+\alpha_{1}\,\rho\left(d_{x}\hat{X}^{(i)}-d_{x}X^{(i)}\right)+\alpha_{2}\,\rho\left(d_{y}\hat{X}^{(i)}-d_{y}X^{(i)}\right)\right]$$

where $\alpha_{1}$ and $\alpha_{2}$ are the weighting coefficients of the gradient losses in the horizontal and vertical directions respectively, both set to 0.5 in the invention; $d_{x}\hat{X}^{(i)}$ denotes the gradient value of the ith pixel of the prediction in the x direction, $d_{y}\hat{X}^{(i)}$ the gradient value in the y direction, and $d_{x}X^{(i)}$ and $d_{y}X^{(i)}$ the corresponding gradient values of the true value.
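The joint loss above can be sketched directly in code (PyTorch again, an assumption of this sketch; so is the mapping of the tensor's last two axes to the horizontal and vertical image directions):

import torch

def charbonnier(pred, target, eps=1e-3):
    # rho(x) = sqrt(x^2 + eps^2), averaged over all pixels; eps = 0.001 as above.
    return torch.sqrt((pred - target) ** 2 + eps ** 2).mean()

def joint_loss(pred, target, alpha1=0.5, alpha2=0.5):
    # Forward differences along the two spatial axes, matching
    # dx(i,j) = I(i+1,j) - I(i,j) and dy(i,j) = I(i,j+1) - I(i,j).
    dx = lambda t: t[..., 1:, :] - t[..., :-1, :]
    dy = lambda t: t[..., :, 1:] - t[..., :, :-1]
    return (charbonnier(pred, target)
            + alpha1 * charbonnier(dx(pred), dx(target))
            + alpha2 * charbonnier(dy(pred), dy(target)))

# Usage on a reconstructed cube and its ground truth:
pred = torch.rand(2, 1, 31, 32, 32)
truth = torch.rand(2, 1, 31, 32, 32)
print(joint_loss(pred, truth))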
Example:
1. experimental data
The CAVE dataset contains hyperspectral data of 32 different objects; each object has 31 images from different wavelength bands, each image is 512x512 in size, and the bands range from 400nm to 700nm (10nm apart). The Harvard dataset contains 50 real-world outdoor/indoor images under daylight and 27 images under artificial or mixed illumination; each data block is 1392x1040x31 in size, with the 31 bands evenly distributed between 420nm and 720nm. For each Harvard sample, the experiment crops it to 1024x1024x31 to facilitate computation.
In order to compare the quality of the reconstructed image, the average value of the wave band range of 400nm-500nm (or 420nm-520nm) is used as the B channel of the color image, the average value of 500nm-600nm (or 520nm-620nm) is used as the G channel of the color image, and the average value of 600nm-700nm (or 620nm-720nm) is used as the R channel of the color image in the experiment.
The experimental test set comprises 17 data items from different objects: 7 from the CAVE dataset (Balloons, Chart and stuffed toy, Faces, Flowers, Jelly beans, Oil painting, Real and fake apples) and 10 from the Harvard dataset (Img1, Img2, Img3, Img4, Img5, Img6, Imga1, Imga2, Imga3, Imga4). The data of the remaining 92 objects are used as the training set, which is randomly cropped into 32x32x31 patches, yielding approximately 50000 patches. In the experiment, the original image serves as the real high-resolution image, and Bicubic downsampling produces the corresponding low-resolution image.
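This data preparation can be sketched as follows, assuming each cube is held as a (bands, height, width) tensor and letting PyTorch's bicubic interpolation stand in for the Bicubic downsampling used in the experiment:

import torch
import torch.nn.functional as F

def random_patches(cube, n, size=32):
    # cube: (bands, H, W) tensor; returns n random (bands, size, size) patches.
    bands, H, W = cube.shape
    patches = []
    for _ in range(n):
        top = torch.randint(0, H - size + 1, (1,)).item()
        left = torch.randint(0, W - size + 1, (1,)).item()
        patches.append(cube[:, top:top + size, left:left + size])
    return torch.stack(patches)

def bicubic_lr(hr_patches, s):
    # Band-wise bicubic downsampling by sampling factor s; the bands axis is
    # treated as the channel axis so each band is resized independently.
    return F.interpolate(hr_patches, scale_factor=1.0 / s,
                         mode="bicubic", align_corners=False)

hr = random_patches(torch.rand(31, 512, 512), n=4)  # stand-in for a CAVE cube
lr = bicubic_lr(hr, s=2)
print(hr.shape, lr.shape)  # torch.Size([4, 31, 32, 32]) torch.Size([4, 31, 16, 16])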
2. Introduction to comparative methods
In the experiment, SRCNN, VDSR, DRCN, EDSR, and 3D-FCNN are compared with the proposed SSRNet on various performance measures. SRCNN, VDSR, DRCN, and EDSR are single-frame super-resolution reconstruction methods proposed since 2016, and 3D-FCNN is a deep-learning-based hyperspectral image super-resolution reconstruction method proposed in 2017.
3. Introduction of evaluation index
Image Quality Assessment (IQA) can generally be classified into subjective assessment and objective assessment. Subjective evaluation intuitively reflects human visual perception, but it is subject to human factors: the eye has difficulty judging subtle differences between pictures, and evaluations differ greatly between observers, so the assessment of picture Fidelity is biased. Objective evaluation compares the amount of information or the degree of similarity between the original and reconstructed images through an algorithm, thereby evaluating the effectiveness of the reconstruction algorithm. This experiment combines subjective and objective evaluation to comprehensively measure the reconstruction quality of the hyperspectral images. Three objective evaluation indices are introduced: peak signal-to-noise ratio, mean structural similarity, and spectral angle similarity.
The Peak Signal-to-Noise Ratio (PSNR) describes the signal-to-noise ratio of the hyperspectral image; the larger its value, the closer the reconstructed image is to the original image. Its unit is dB, and its formula is as follows:
$$\mathrm{PSNR}=10\log_{10}\frac{\mathrm{MAX}^{2}}{\mathrm{MSE}},\qquad \mathrm{MSE}=\frac{1}{B\,M\,N}\sum_{k=1}^{B}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(\hat{X}_{k}(i,j)-X_{k}(i,j)\right)^{2}$$

where B is the number of spectral bands of the hyperspectral image, MxN is the number of pixels in each spectral band, MAX denotes the maximum pixel value in the hyperspectral image, and MSE is the mean square error between the hyperspectral reconstructed HR image $\hat{X}$ and the original HR image $X$.
The Mean Structural Similarity (MSSIM) measures the mean of the structural similarities over all spectral bands of the hyperspectral reconstructed HR image and the original HR image; the larger its value, the more similar the reconstructed and original HR images and the better the reconstruction quality. Whereas PSNR focuses on the mean square error (gray-level information) of the image, MSSIM focuses on its structural information. The mathematical expression for MSSIM is as follows:
$$\mathrm{MSSIM}(\hat{X},X)=\frac{1}{B}\sum_{i=1}^{B}\mathrm{SSIM}\left(\hat{X}_{i},X_{i}\right),\qquad \mathrm{SSIM}\left(\hat{X}_{i},X_{i}\right)=\frac{\left(2\mu_{\hat{X}_{i}}\mu_{X_{i}}+c_{1}\right)\left(2\sigma_{\hat{X}_{i}X_{i}}+c_{2}\right)}{\left(\mu_{\hat{X}_{i}}^{2}+\mu_{X_{i}}^{2}+c_{1}\right)\left(\sigma_{\hat{X}_{i}}^{2}+\sigma_{X_{i}}^{2}+c_{2}\right)}$$

where $\hat{X}_{i}$ and $X_{i}$ are the hyperspectral reconstructed HR image and the original HR image in the ith spectral band, $\mu_{\hat{X}_{i}}$ and $\mu_{X_{i}}$ are their means, $\sigma_{\hat{X}_{i}}^{2}$ and $\sigma_{X_{i}}^{2}$ their variances, and $\sigma_{\hat{X}_{i}X_{i}}$ the covariance between them; the constants $c_{1}$ and $c_{2}$ take the values 0.0001 and 0.0009 respectively.
The Spectral Angle Similarity (SAM) judges image similarity by measuring the angle between the spectra of the hyperspectral reconstructed HR image and the original HR image; the smaller its value, the closer the two are and the better the quality of the reconstructed image. The mathematical expression for SAM is as follows:
$$\mathrm{SAM}=\arccos\frac{\left\langle \hat{x}(i,j),x(i,j)\right\rangle}{\left\|\hat{x}(i,j)\right\|_{2}\left\|x(i,j)\right\|_{2}}$$

where $\hat{x}(i,j)$ and $x(i,j)$ are the spectral vectors of the pixel with coordinates (i, j) in the hyperspectral reconstructed HR image and the original HR image respectively, and $\langle\cdot,\cdot\rangle$ denotes the dot product between them.
Higher PSNR and MSSIM values indicate a better hyperspectral image super-resolution reconstruction, and a lower SAM indicates that the spectral information of the reconstructed image is better restored.
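The three indices can be computed directly from the formulas above. The following NumPy sketch implements them as written; it uses the global per-band statistics of the MSSIM formula rather than a sliding window (an assumption of this sketch) and returns SAM in radians:

import numpy as np

def psnr(pred, truth, max_val=None):
    # pred, truth: (bands, H, W) arrays; MSE is taken over all B*M*N pixels.
    max_val = truth.max() if max_val is None else max_val
    mse = np.mean((pred - truth) ** 2)
    return 10 * np.log10(max_val ** 2 / mse)

def mssim(pred, truth, c1=0.0001, c2=0.0009):
    # Mean of the per-band structural similarities.
    vals = []
    for xi_hat, xi in zip(pred, truth):
        mu1, mu2 = xi_hat.mean(), xi.mean()
        var1, var2 = xi_hat.var(), xi.var()
        cov = np.mean((xi_hat - mu1) * (xi - mu2))
        vals.append(((2 * mu1 * mu2 + c1) * (2 * cov + c2))
                    / ((mu1 ** 2 + mu2 ** 2 + c1) * (var1 + var2 + c2)))
    return float(np.mean(vals))

def sam(pred, truth, eps=1e-12):
    # Mean angle between the spectral vectors at each pixel.
    dot = np.sum(pred * truth, axis=0)
    norms = np.linalg.norm(pred, axis=0) * np.linalg.norm(truth, axis=0)
    return float(np.mean(np.arccos(np.clip(dot / (norms + eps), -1.0, 1.0))))

pred = np.random.rand(31, 64, 64)
truth = np.random.rand(31, 64, 64)
print(psnr(pred, truth), mssim(pred, truth), sam(pred, truth))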
4. Experiment of
In the experiment, reconstruction is first performed on the 7 hyperspectral images of the CAVE dataset; the comparison results for Flowers under 2x, 3x, and 4x sampling factors are shown in FIGS. 3, 4, and 5 respectively, where (a) original image, (b) Bicubic, (c) SRCNN, (d) VDSR, (e) DRCN, (f) EDSR, (g) 3D-FCNN, (h) the proposed SSRNet. In each figure, a local region of the reconstructed image is boxed, enlarged, and placed at the upper-left corner of the whole image to allow a more precise visual judgment. The comparison shows that the recovery quality of SRCNN is only better than Bicubic. VDSR uses the residual idea, yet serious blur still appears at the flower center, petals, and similar regions, and detail and texture information is poorly recovered. DRCN and EDSR reconstruct petal edges well, but detail blur is still obvious, particularly in the over-smoothed flower-center region. In comparison, 3D-FCNN and the proposed SSRNet are clearly better in visual effect and produce images with clear textures and sharp details.
Next, the experiment reconstructs the 10 hyperspectral images of the Harvard dataset; the reconstructions of Img1 under 2x, 3x, and 4x sampling factors are given in FIGS. 6, 7, and 8 respectively, where (a) original image, (b) Bicubic, (c) SRCNN, (d) VDSR, (e) DRCN, (f) EDSR, (g) 3D-FCNN, (h) the proposed SSRNet. Similarly, to give a better visual impression, a local region of the reconstructed image is outlined, enlarged, and placed at the lower-right corner of the whole image. It is easy to see that both 3D-FCNN and SSRNet recover image texture (e.g., brick) far better than the other methods; however, at a sampling factor of 4, 3D-FCNN still blurs the edges of the roof tiles.
Compared with single-frame super-resolution reconstruction methods such as SRCNN, VDSR, DRCN, and EDSR, 3D-FCNN and SSRNet use three-dimensional convolution, recovering not only the spatial information of the image but also its spectral information, which greatly improves the visual effect. Compared with the five baseline methods, the method provided by the invention obtains a better reconstruction: as the local enlargements show, the images it reconstructs have sharp edges and rich detail, effectively suppress edge artifacts, and sometimes even look sharper than the original image. This is because SSRNet not only effectively uses the spectral-spatial information of the hyperspectral image, but also converts the HR image reconstruction task into fitting the residual between the HR and LR images, so that the network can concentrate its learning on high-frequency information such as detail textures.
Table 1 gives the results of the objective indices PSNR, MSSIM and SAM for different methods under two data sets and different sampling factors.
Table 1. Reconstruction results of each method on the CAVE and Harvard datasets under different sampling factors
As can be seen from Table 1, the PSNR and MSSIM of all methods improve over Bicubic, but single-frame methods such as SRCNN and VDSR are sometimes inferior to Bicubic on the spectral index SAM. DRCN uses a recursive convolutional neural network to reduce the overfitting caused by an overly deep network, but its reconstruction is still unsatisfactory. Among the single-frame methods, EDSR upsamples the LR image at the last stage of the network rather than interpolating it in advance, introducing no extra noise, and its PSNR and MSSIM are superior to SRCNN, VDSR, DRCN, and similar algorithms. Compared with the single-frame methods, 3D-FCNN and SSRNet greatly reduce SAM and related indices, which shows that three-dimensional convolution can effectively exploit inter-spectral information for better reconstruction.
The SSRNet provided by the invention is clearly superior to the other algorithms, including 3D-FCNN. On the CAVE and Harvard datasets, the PSNR, MSSIM, and SAM of images reconstructed by SSRNet are all the best values compared with SRCNN, VDSR, DRCN, EDSR, and 3D-FCNN. SSRNet can thus well address the detail loss, edge blur, and unclear texture of hyperspectral LR images. The experimental data show that SSRNet can learn the low-level information of a hyperspectral image and effectively reconstruct details and textures using its local features and global structure, achieving a better reconstruction in both subjective visual evaluation and objective evaluation.
Finally, it is noted that the above preferred embodiments illustrate rather than limit the invention; although the invention has been described in detail with reference to them, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the invention as defined by the appended claims.

Claims (2)

1. A hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss is characterized by comprising the following steps:
s1: obtaining a hyperspectral image;
s2: dividing the hyperspectral images into a training set and a test set;
s3: inputting the training set into a neural network with spectrum and space combination, and training by utilizing the joint loss of a space domain and a gradient domain; the spectrum-space combined neural network SSRNet is a convolutional neural network based on deep learning, converts an HR image reconstruction task into fitting of residual errors between an HR image and an LR image, and extracts spectrum-space characteristics by using pseudo three-dimensional convolution, so that the improvement of spatial resolution and spectrum resolution is realized; the SSRNet uses a loss function combining a pixel domain Charbonier loss function and a gradient domain loss to improve the quality of a reconstructed image, and adopts a multi-scale training mode to complete an image reconstruction task under various sampling factors;
the SSRNet structure includes:
1) deep residual learning: a skip connection is added between the network input and output, so that the network learns the residual between the high/low-resolution images;
2) introducing a Pseudo-three-dimensional Convolution (P3D) residual block, wherein CONV_2 is a bottleneck layer with a 1x1x1 convolution kernel and 16 kernels; CONV_3 adopts a 3x1x1 one-dimensional spectral convolution kernel and CONV_4 a 1x3x3 two-dimensional spatial convolution kernel; CONV_5 is a bottleneck layer with a 1x1x1 convolution kernel;
the number of the residual blocks is modified according to different computing resources;
3) a BN layer is removed from the pseudo three-dimensional convolution residual module, and the network performance is improved;
4) the activation functions in the network are all ReLU, which accelerates convergence and prevents gradient explosion and vanishing;
S4: passing the test set through the neural network to obtain the final reconstruction result.
2. The hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss according to claim 1, wherein in step S3 the loss function is the objective function optimized by the neural network and is used to evaluate the difference between the predicted value and the true value; a well-chosen loss function improves the network convergence speed and the quality of the prediction, while a poor one reduces the overall performance of the network;
1) Charbonnier loss function
$$L_{\mathrm{Char}}(\hat{X},X)=\frac{1}{m}\sum_{i=1}^{m}\rho\left(\hat{X}^{(i)}-X^{(i)}\right),\qquad \rho(x)=\sqrt{x^{2}+\varepsilon^{2}}$$

where $m$ is the number of samples, $\hat{X}^{(i)}$ denotes the pixel value of the ith pixel of the prediction, $X^{(i)}$ denotes the pixel value of the ith pixel of the true value, and $\varepsilon$ acts as a threshold on the difference between each pixel pair of the LR and HR images so as to reduce the interference of outliers;
2) gradient domain loss function
Firstly, the LR and HR images need to be transferred to the gradient domain by an image gradient algorithm, and the image gradient is calculated by the following formula:
horizontal direction:
dx(i,j)=I(i+1,j)-I(i,j)
vertical direction:
dy(i,j)=I(i,j+1)-I(i,j)
where I is the image pixel value and (i, j) are the coordinates of the pixel; the Charbonnier loss is computed on the resulting gradient-domain feature maps, finally giving the joint loss function:
$$L=\frac{1}{m}\sum_{i=1}^{m}\left[\rho\left(\hat{X}^{(i)}-X^{(i)}\right)+\alpha_{1}\,\rho\left(d_{x}\hat{X}^{(i)}-d_{x}X^{(i)}\right)+\alpha_{2}\,\rho\left(d_{y}\hat{X}^{(i)}-d_{y}X^{(i)}\right)\right]$$

where $\hat{X}$ represents the predicted value and $X$ the true value; $\alpha_{1}$ and $\alpha_{2}$ are the weighting coefficients of the gradient losses in the horizontal and vertical directions respectively; $d_{x}\hat{X}^{(i)}$ denotes the gradient value of the ith pixel of the prediction in the x direction, $d_{y}\hat{X}^{(i)}$ the gradient value in the y direction, and $d_{x}X^{(i)}$ and $d_{y}X^{(i)}$ the corresponding gradient values of the true value.
CN201810639042.8A 2018-06-20 2018-06-20 Hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss Expired - Fee Related CN108830796B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810639042.8A CN108830796B (en) 2018-06-20 2018-06-20 Hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810639042.8A CN108830796B (en) 2018-06-20 2018-06-20 Hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss

Publications (2)

Publication Number Publication Date
CN108830796A CN108830796A (en) 2018-11-16
CN108830796B true CN108830796B (en) 2021-02-02

Family

ID=64143026

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810639042.8A Expired - Fee Related CN108830796B (en) 2018-06-20 2018-06-20 Hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss

Country Status (1)

Country Link
CN (1) CN108830796B (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109753996B (en) * 2018-12-17 2022-05-10 西北工业大学 Hyperspectral image classification method based on three-dimensional lightweight depth network
CN109584164B (en) * 2018-12-18 2023-05-26 华中科技大学 Medical image super-resolution three-dimensional reconstruction method based on two-dimensional image transfer learning
CN109886870B (en) * 2018-12-29 2023-03-03 西北大学 Remote sensing image fusion method based on dual-channel neural network
CN109741407A (en) * 2019-01-09 2019-05-10 北京理工大学 A kind of high quality reconstructing method of the spectrum imaging system based on convolutional neural networks
CN109949219B (en) * 2019-01-12 2021-03-26 深圳先进技术研究院 Reconstruction method, device and equipment of super-resolution image
CN109903255A (en) * 2019-03-04 2019-06-18 北京工业大学 A kind of high spectrum image Super-Resolution method based on 3D convolutional neural networks
CN109886898B (en) * 2019-03-05 2020-10-02 北京理工大学 Imaging method of spectral imaging system based on optimization heuristic neural network
CN109697697B (en) * 2019-03-05 2020-10-16 北京理工大学 Reconstruction method of spectral imaging system based on optimization heuristic neural network
CN110441315B (en) * 2019-08-02 2022-08-05 英特尔产品(成都)有限公司 Electronic component testing apparatus and method
CN111192193B (en) * 2019-11-26 2022-02-01 西安电子科技大学 Hyperspectral single-image super-resolution method based on 1-dimensional-2-dimensional convolution neural network
CN111127573B (en) * 2019-12-12 2022-06-03 首都师范大学 Wide-spectrum hyperspectral image reconstruction method based on deep learning
CN111062403B (en) * 2019-12-26 2022-11-22 哈尔滨工业大学 Hyperspectral remote sensing data depth spectral feature extraction method based on one-dimensional group convolution neural network
CN115221932A (en) * 2021-04-19 2022-10-21 上海与光彩芯科技有限公司 Spectrum recovery method and device based on neural network and electronic equipment
CN113222823B (en) * 2021-06-02 2022-04-15 国网湖南省电力有限公司 Hyperspectral image super-resolution method based on mixed attention network fusion
CN113628111B (en) * 2021-07-28 2024-04-12 西安理工大学 Hyperspectral image super-resolution method based on gradient information constraint
WO2023155032A1 (en) * 2022-02-15 2023-08-24 华为技术有限公司 Image processing method and image processing apparatus
WO2023167465A1 (en) * 2022-03-03 2023-09-07 Samsung Electronics Co., Ltd. Method and system for reducing complexity of a processing pipeline using feature-augmented training
CN115235628B (en) * 2022-05-17 2023-12-01 中国科学院上海技术物理研究所 Spectrum reconstruction method and device, spectrometer, storage medium and electronic equipment

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9064476B2 (en) * 2008-10-04 2015-06-23 Microsoft Technology Licensing, Llc Image super-resolution using gradient profile prior
CN103530860B (en) * 2013-09-26 2017-05-17 天津大学 Adaptive autoregressive model-based hyper-spectral imagery super-resolution method
CN104050653B (en) * 2014-07-07 2017-01-25 西安电子科技大学 Hyperspectral image super-resolution method based on non-negative structure sparse
CN106780338B (en) * 2016-12-27 2020-06-09 南京理工大学 Rapid super-resolution reconstruction method based on anisotropy
CN106683067B (en) * 2017-01-20 2020-06-23 福建帝视信息科技有限公司 Deep learning super-resolution reconstruction method based on residual sub-images
CN107204010B (en) * 2017-04-28 2019-11-19 中国科学院计算技术研究所 A kind of monocular image depth estimation method and system
CN107301372A (en) * 2017-05-11 2017-10-27 中国科学院西安光学精密机械研究所 High spectrum image super-resolution method based on transfer learning

Also Published As

Publication number Publication date
CN108830796A (en) 2018-11-16

Similar Documents

Publication Publication Date Title
CN108830796B (en) Hyperspectral image super-resolution reconstruction method based on spectral-spatial combination and gradient domain loss
CN110570353B (en) Super-resolution reconstruction method for generating single image of countermeasure network by dense connection
CN111709902B (en) Infrared and visible light image fusion method based on self-attention mechanism
CN110119780B (en) Hyper-spectral image super-resolution reconstruction method based on generation countermeasure network
CN110533620B (en) Hyperspectral and full-color image fusion method based on AAE extraction spatial features
CN107123089B (en) Remote sensing image super-resolution reconstruction method and system based on depth convolution network
WO2021056969A1 (en) Super-resolution image reconstruction method and device
Yang et al. Wavelet u-net and the chromatic adaptation transform for single image dehazing
CN112507997B (en) Face super-resolution system based on multi-scale convolution and receptive field feature fusion
Luo et al. Pansharpening via unsupervised convolutional neural networks
CN111080567B (en) Remote sensing image fusion method and system based on multi-scale dynamic convolutional neural network
CN111161360B (en) Image defogging method of end-to-end network based on Retinex theory
CN112837224A (en) Super-resolution image reconstruction method based on convolutional neural network
CN108288256A (en) A kind of multispectral mosaic image restored method
CN116485934A (en) Infrared image colorization method based on CNN and ViT
CN112163998A (en) Single-image super-resolution analysis method matched with natural degradation conditions
CN116645569A (en) Infrared image colorization method and system based on generation countermeasure network
CN115641391A (en) Infrared image colorizing method based on dense residual error and double-flow attention
CN115760814A (en) Remote sensing image fusion method and system based on double-coupling deep neural network
Wang et al. No-reference stereoscopic image quality assessment using quaternion wavelet transform and heterogeneous ensemble learning
Zhou et al. PAN-guided band-aware multi-spectral feature enhancement for pan-sharpening
CN109559278B (en) Super resolution image reconstruction method and system based on multiple features study
CN113344804B (en) Training method of low-light image enhancement model and low-light image enhancement method
CN114067018A (en) Infrared image colorization method for generating countermeasure network based on expansion residual error
CN117495718A (en) Multi-scale self-adaptive remote sensing image defogging method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210202