CN116091916A - Multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images - Google Patents

Multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images

Info

Publication number
CN116091916A
CN116091916A (application CN202211469458.2A)
Authority
CN
China
Prior art keywords
image
processing
characteristic
scale
spectrum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211469458.2A
Other languages
Chinese (zh)
Inventor
栾鸿康
孙玉宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology
Priority to CN202211469458.2A
Publication of CN116091916A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/10: Terrestrial scenes
    • G06V20/194: Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/10: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Medical Informatics (AREA)
  • Remote Sensing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images, belonging to the field of computer-vision hyperspectral image reconstruction. The algorithm comprises the following steps: processing the original RGB image through a multi-scale processing module and outputting a feature map Y'_i; adding and stacking the feature map Y'_i with the original RGB image in the spectral-channel dimension to obtain a feature image with multi-scale preprocessing information; passing the feature image with multi-scale preprocessing information sequentially through 3 spatial-spectral Transformer joint processing modules for joint processing in the spatial and spectral dimensions, obtaining a feature image with spatial-spectral feature information; processing the feature image with spatial-spectral feature information through a spectral-dimension Transformer to obtain a feature image whose spectral-dimension feature information has been fully extracted by the Transformer; and adjusting the channel dimension to the target 31-channel output through a 3×3 convolution layer to finally obtain the hyperspectral image.

Description

Multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images
Technical Field
The invention belongs to the field of computer-vision hyperspectral image reconstruction, and particularly relates to an algorithm and system for multi-scale reconstruction of corresponding hyperspectral images from RGB images.
Background
Visual information is very important to human beings; more than eighty percent of external information is obtained visually. When light strikes the surface of an object and is reflected, it carries spectral information peculiar to that object. Limited by the structure of the human eye, the naked eye can only read information in the visible-light range, whereas a hyperspectral imager can capture information in arbitrary wavebands and thus discover information that is difficult to find with the naked eye. A hyperspectral imager can completely record the spectral information of a picture; compared with an ordinary color camera, it can simultaneously acquire the spatial information and the spectral information of the measured object. Based on these advantages, hyperspectral images are particularly widely applied in fields such as medical image processing, remote sensing, and object tracking.
With the development of fields related to hyperspectral image processing, the demand for hyperspectral images has grown larger and larger, but traditional hyperspectral images are difficult to acquire and costly, which limits the development of these fields. Most conventional hyperspectral image acquisition methods use spectrometers based on spatial- or spectral-scanning technology, such as push-broom, whisk-broom and band-sequential scanners; however, such hyperspectral imagers have obvious drawbacks, being bulky and complex to operate, which makes hyperspectral images hard to obtain and expensive. In recent years, with the development of convolutional neural network theory, a large number of convolutional neural network (CNN) based methods have been applied to the reconstruction task and have achieved relatively good results. Although CNN-based deep-learning reconstruction algorithms solve the hardware problems of high cost and high difficulty, their reconstruction quality still falls short of a higher standard because of the deficiency of the CNN model in capturing long-range spatial correlations and self-similarity within the spectrum.
Disclosure of Invention
Aiming at the defects of the prior art, the invention aims to provide a multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images.
The aim of the invention can be achieved by the following technical scheme:
a multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image, comprising the steps of:
S1, processing the original RGB image X through a multi-scale processing module and outputting a feature map Y'_i;
S2, adding and stacking the feature map Y'_i with the original RGB image X in the spectral-channel dimension to obtain a feature image F with multi-scale preprocessing information;
S3, passing the feature image F with multi-scale preprocessing information sequentially through 3 spatial-spectral Transformer joint processing modules for joint processing in the spatial and spectral dimensions, obtaining a feature image with spatial-spectral feature information;
S4, processing the feature image obtained in S3 through a spectral-dimension Transformer to obtain a feature image whose spectral-dimension feature information has been fully extracted by the Transformer;
S5, adjusting the channel dimension of the feature image whose spectral-dimension feature information has been fully extracted by the Transformer to the target 31-channel output through a 3×3 convolution layer, finally obtaining the hyperspectral image.
Further, the processing in S1 to obtain the feature map Y'_i comprises the following steps:
S11, copying the original RGB image into 3 copies and then downsampling them at rates of 8×, 4× and 2× respectively;
S12, after downsampling, performing a convolution operation on each downsampled feature image through a convolution layer with a 3×3 kernel; each layer other than the first is fused with the result of the previous layer immediately after the convolution operation, as shown in the following equation:

Y_i = Conv(Down_r(X)) ⊕ Y_{i-1}

where Down_r(·) denotes a downsampling operation with rate r, Conv(·) denotes a two-dimensional convolution operation with kernel size 3×3, and the ⊕ symbol denotes the channel stacking operation; all convolution operations in the multi-scale processing module use the LeakyReLU activation function; Y_i ∈ R^(c_i×h_i×w_i) denotes the intermediate output of the i-th layer, where i ∈ {1, 2, 3};
S13, feeding the Y_i obtained in S12 into a spectral Transformer module to capture the contextual information among spectral channels and obtain Y'_i, as shown in the following equation:

Y'_i = Up_r(Conv(Tran_spe(Y_i)))

where Y_i denotes the intermediate output of the i-th layer and Y'_i denotes the final result of the i-th layer; Up_r(·) denotes an upsampling operation with rate r, Conv(·) denotes a two-dimensional convolution operation with kernel size 1×1, and Tran_spe(·) denotes the Transformer processing in the spectral dimension.
Further, in S2, the steps of obtaining the feature image F with multi-scale preprocessing information are:
S21, adjusting the original RGB image with a 3×3 convolution layer to the same spatial dimensions (rh'_3 × rw'_3) as the feature map Y'_i, with the output result X' ∈ R^(c_0×rh'_3×rw'_3);
S22, stacking the feature map Y'_i with the original RGB image in the spectral dimension, i.e. c_0 + c_1, where Y'_i ∈ R^(c_1×rh'_3×rw'_3), finally obtaining the feature image with multi-scale preprocessing information F ∈ R^((c_0+c_1)×rh'_3×rw'_3).
Further, in S3, the feature image passes sequentially through spatial Transformer processing and spectral Transformer processing in each spatial-spectral Transformer joint processing module.
Further, the processing steps for the feature image in each spatial-spectral Transformer joint processing module are as follows:
S31, spatial Transformer processing:
1) Let the input of this layer be the feature image F ∈ R^(c×h×w). The feature image F is evenly divided into small windows of size m; the partitioning yields F_i ∈ R^(c×m×m), which is flattened and transposed into F_i ∈ R^(m²×c). Self-attention processing is then performed on the resulting one-dimensional feature maps, as given by:

F = {F_1, F_2, …, F_N},  N = hw/m²
A_i = Attention(F_i W_Q, F_i W_K, F_i W_V),  i = 1, …, N
Spa(F) = {A_1, A_2, …, A_N}

where W_Q, W_K and W_V ∈ R^(c×c) denote the learnable projection matrices for Q, K and V respectively, and A_i ∈ R^(m²×c) is the final output result for each window; the Attention(·) operation adds relative position encoding while performing the self-attention computation, as given by:

Attention(Q, K, V) = SoftMax(QK^T/√c + B)V

where B is the relative position bias, a learnable parameter of shape R^((2m-1)×(2m-1));
2) The results of the self-attention computation are then integrated by a simple multi-layer perceptron; skip connections are used throughout to reduce training difficulty, and the computation process can be expressed as:

F' = Spa(LN(F)) + F
F_out = MLP(LN(F')) + F'

where F' and F_out are the processing results of Spa and the MLP respectively, F_out represents the final result of the spatial Transformer processing, and LN(·) denotes layer normalization;
S32, feeding the feature map result of the spatial Transformer processing into a spectral Transformer processing block;
1) Let the input feature map be H ∈ R^(c×h×w). The feature map H is first flattened and transposed into H ∈ R^(hw×c), and H ∈ R^(hw×c) is linearly projected via W_Q, W_K and W_V ∈ R^(c×c) to Q, K, V ∈ R^(hw×c); the self-attention computation process is:

Attention(Q, K, V) = SoftMax(σK^T Q)V

where σ ∈ R^1 is a learnable parameter;
2) The self-attention result Attention(Q, K, V) is then linearly projected and the relative position encoding is added, as given by:

Spe(H) = Attention(Q, K, V)W + φ(V)

where W ∈ R^(c×c) is a learnable parameter, and the symbol φ(·) denotes the relative position encoding, which comprises two 3×3 convolution layers and one GELU activation function;
3) A feed-forward network is then used to integrate the weight matrix obtained above, with skip connections throughout to reduce training difficulty; the computation of Tran_spe(·) can be expressed as:

H' = Spe(LN(H)) + H
H_out = FFN(LN(H')) + H'

where H' and H_out are the processing results of Spe and the FFN respectively, H_out represents the final result of the spectral Transformer processing, and LN(·) denotes layer normalization.
Further, the feed-forward network consists of a 1×1 convolution layer, a GELU activation function, a 3×3 convolution layer, a GELU activation function and a 1×1 convolution layer in that order.
Further, in S5, the steps of acquiring the hyperspectral image are:
S51, setting a convolution layer with 32 input channels and 31 output channels, with kernel size 3×3 and both stride and padding set to 1;
S52, feeding the feature image obtained in S4, whose spectral-dimension feature information has been fully extracted by the Transformer, into the convolution layer and through a LeakyReLU activation function to obtain a 31-channel hyperspectral picture with the same spatial resolution as the input RGB picture.
A system for multi-scale reconstruction of a corresponding hyperspectral image from an RGB image comprises: a multi-scale processing unit, a dimension stacking unit, a spatial-spectral Transformer joint processing unit, a spectral-dimension Transformer processing unit and a hyperspectral image output unit;
the multi-scale processing unit is used for processing the original RGB image X through a multi-scale processing module and outputting the feature map Y'_i;
the dimension stacking unit is used for adding and stacking the feature map Y'_i with the original RGB image X in the spectral-channel dimension to obtain the feature image F with multi-scale preprocessing information;
the spatial-spectral Transformer joint processing unit is used for passing the feature image F with multi-scale preprocessing information sequentially through the 3 spatial-spectral Transformer joint processing modules to obtain the feature image with spatial-spectral feature information;
the spectral-dimension Transformer processing unit is used for processing the feature image obtained by the joint processing through one spectral-dimension Transformer to obtain a feature image whose spectral-dimension feature information has been fully extracted by the Transformer;
the hyperspectral image output unit is used for adjusting the channel dimension to the target 31-channel output through a 3×3 convolution layer to finally obtain the hyperspectral image.
A computer storage medium storing a readable program, which, when run, executes the algorithm described above.
An apparatus, comprising: one or more processors, and a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the algorithm described above.
The invention has the following beneficial effects: through a Transformer-based multi-scale spatial-spectral joint processing network architecture, the corresponding hyperspectral picture can be reconstructed from an RGB image efficiently, accurately and at low cost. Compared with the traditional hyperspectral imager and CNN-based deep-learning algorithms, the method greatly increases the speed and reduces the cost of acquiring hyperspectral pictures, overcomes the limitation of the CNN model in capturing long-range spatial correlations and self-similarity within the spectrum, and, by adopting a multi-scale hierarchical feature-extraction scheme, avoids the information loss during encoding and decoding caused by the U-Net networks used in current work. The method greatly improves the accuracy of hyperspectral picture reconstruction and has high practical value.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below; it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a block diagram of the algorithm of the present invention for reconstructing a corresponding hyperspectral image from an RGB image;
FIG. 2 is a block diagram of a multi-scale processing module of the present invention;
FIG. 3 shows the real hyperspectral image, the reconstructed hyperspectral image and the error between the two in the experiment of the present invention;
FIG. 4 is a graph comparing the true values and the generated values over all wavebands at a certain coordinate point of a hyperspectral image in the experiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, the multi-scale algorithm and system for reconstructing a corresponding hyperspectral image from an RGB image comprise the following steps:
S1, processing the original RGB image X ∈ R^(3×128×128) through the multi-scale processing module (MSP) and outputting the feature map Y'_i.
The frame structure of the multi-scale processing module is shown in fig. 2, and the specific steps of S1 are as follows:
S11, downsampling the image: the original 3×128×128 RGB image is copied into 3 copies, which are then downsampled to 192×16×16, 48×32×32 and 12×64×64 at rates of 8×, 4× and 2× respectively.
S12, after downsampling, performing a convolution operation on each downsampled feature image through a convolution layer with a 3×3 kernel; the convolution changes the number of channels but not the spatial resolution, so that the result can be fused with that of the upper layer. Each layer other than the first is fused with the result of the previous layer immediately after the convolution operation, as shown in the following equation:

Y_i = Conv(Down_r(X)) ⊕ Y_{i-1}

where Down_r(·) denotes a downsampling operation with rate r, Conv(·) denotes a two-dimensional convolution operation with kernel size 3×3, and the ⊕ symbol denotes the channel stacking operation; all convolution operations in the multi-scale processing module use the LeakyReLU activation function; Y_i ∈ R^(c_i×h_i×w_i) denotes the intermediate output of the i-th layer, where i ∈ {1, 2, 3}.
S13, feeding the Y_i obtained in S12 into the spectral Transformer module (Spe) to capture the contextual information among spectral channels. Since the spatial size of the first layer after downsampling is too small, a 1×1 convolution layer is appended at the end of the first layer so that the network can adaptively adjust the weights; finally, the spatial scale is adjusted to the size of the next layer through an upsampling layer to obtain Y'_i, as given by:

Y'_i = Up_r(Conv(Tran_spe(Y_i)))

where Y_i denotes the intermediate output of the i-th layer and Y'_i denotes the final result of the i-th layer; Up_r(·) denotes an upsampling operation with rate r, Conv(·) denotes a two-dimensional convolution operation with kernel size 1×1, and Tran_spe(·) denotes the Transformer processing in the spectral dimension. One possible realization of this module is sketched below.
S2, the final output characteristic diagram Y 'in S1' i Adding and stacking the characteristic images with the original RGB image X in the dimension of a spectrum channel to obtain a characteristic image with multi-scale preprocessing information, so that a backbone network can better extract the characteristics of the original information; .
The specific superposition steps are as follows:
s21, using a 3×3 convolution layer to transform the original input RGB image (X ε R 3×h×w ) Adjusted to the same spatial dimension (rh 'as the result output in S1' 3 ×rw' 3 ) Output result is
Figure BDA0003957935780000094
Figure BDA0003957935780000095
S22, for two tensors in the spectral dimension (feature map Y' i And the original RGB image), i.e., c) 0 +c 1, wherein
Figure BDA0003957935780000096
Finally, the characteristic image with multi-scale preprocessing information is obtained>
Figure BDA0003957935780000097
Figure BDA0003957935780000098
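As a minimal illustration of S21-S22 (the channel split c_0, c_1 is left symbolic in the text; the values below are hypothetical, chosen so that c_0 + c_1 matches the 32-channel input of the output convolution in S51):

```python
import torch
import torch.nn as nn

c0, c1 = 16, 16                               # hypothetical split with c0 + c1 = 32
x = torch.randn(1, 3, 128, 128)               # original RGB image X
y3 = torch.randn(1, c1, 128, 128)             # final MSP output Y'_3 (c1 channels assumed)
x_proj = nn.Conv2d(3, c0, 3, padding=1)(x)    # S21: 3x3 conv to c0 x rh'_3 x rw'_3
f = torch.cat([x_proj, y3], dim=1)            # S22: stack along the spectral-channel dim
print(f.shape)                                # torch.Size([1, 32, 128, 128])
```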
S3, sequentially passing the characteristic image F with the multi-scale preprocessing information in S2 through 3 space-spectrum converter combined processing modules (SUB) to perform space-spectrum dimension combined processing to obtain a characteristic image with space-spectrum characteristic information;
the characteristic image sequentially passes through a space transducer process (Spa) and a spectrum transducer process (Spe) in each space-spectrum transducer joint processing module (SUB);
the specific steps of processing the characteristic image in a space-spectrum converter joint processing module (SUB) are as follows:
S31, spatial Transformer processing:
The input of this layer is the feature image output in S2; to simplify the notation, it is now assumed that the feature image F ∈ R^(c×h×w) is the input to the module. The feature image F is evenly divided into small windows of size m; the partitioning yields F_i ∈ R^(c×m×m), which is flattened and transposed into F_i ∈ R^(m²×c). Self-attention processing is then performed on the resulting one-dimensional feature maps, as shown in the following formulas:

F = {F_1, F_2, …, F_N},  N = hw/m²
A_i = Attention(F_i W_Q, F_i W_K, F_i W_V),  i = 1, …, N
Spa(F) = {A_1, A_2, …, A_N}

where W_Q, W_K and W_V ∈ R^(c×c) denote the learnable projection matrices for Q, K and V respectively, and A_i ∈ R^(m²×c) is the final output result for each window; the Attention(·) operation adds relative position encoding while performing the self-attention computation, with the details given by:

Attention(Q, K, V) = SoftMax(QK^T/√c + B)V

where B is the relative position bias, a learnable parameter of shape R^((2m-1)×(2m-1)).
The results of the self-attention computation are then integrated by a simple multi-layer perceptron (MLP), with skip connections throughout to reduce training difficulty, so the computation of Tran_spa(·) can be summarized by:

F' = Spa(LN(F)) + F
F_out = MLP(LN(F')) + F'

where F' and F_out are the processing results of Spa and the MLP respectively, F_out represents the final result of the spatial Transformer processing, and LN(·) denotes layer normalization. A code sketch of this windowed attention is given below.
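A minimal sketch of the windowed self-attention above, assuming single-head attention, a 1/√c scale inside the SoftMax, and Swin-style indexing of the (2m-1)×(2m-1) relative-bias table (none of these details are fixed by the text):

```python
import torch
import torch.nn as nn


class WindowAttention(nn.Module):
    """Sketch of Spa: self-attention inside an m x m window (step S31)."""

    def __init__(self, c: int, m: int):
        super().__init__()
        self.scale = c ** -0.5
        self.qkv = nn.Linear(c, 3 * c, bias=False)             # W_Q, W_K, W_V
        self.bias = nn.Parameter(torch.zeros((2 * m - 1) ** 2))
        # precompute the relative-position index for every token pair
        coords = torch.stack(torch.meshgrid(
            torch.arange(m), torch.arange(m), indexing="ij")).flatten(1)
        rel = coords[:, :, None] - coords[:, None, :] + m - 1  # (2, m*m, m*m)
        self.register_buffer("idx", rel[0] * (2 * m - 1) + rel[1])

    def forward(self, f):                 # f: (num_windows, m*m, c), flattened windows
        q, k, v = self.qkv(f).chunk(3, dim=-1)
        attn = q @ k.transpose(-2, -1) * self.scale + self.bias[self.idx]
        return attn.softmax(dim=-1) @ v   # A_i for every window


# windows = WindowAttention(c=32, m=8)(torch.randn(4, 64, 32))  # -> (4, 64, 32)
```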
S32, spectrum converter processing;
feature map result F after space transform processing out Then input into a spectral transducer processing block;
for convenience of character description, let us now assume that the input is H ε R c×h×w Similar to spatial processing, the feature image X is flattened and transposed to H ε R hw×c H.epsilon.R is then likewise taken hw×c Via W Q ,W K and WV ∈R c×c Linear projection to Q, K, V ε R hw×c The method comprises the steps of carrying out a first treatment on the surface of the In contrast to spatial processing, where all channels of a single pixel are used as a token in spectral processing, the self-attention (Spe) specification is given by:
Attention(Q,K,V)=SoftMax(σK T Q)V
wherein σ∈R1 Is a parameter that can be learned;
subsequently, the self-Attention result Attention (Q, K, V) is linearly projected and the relative position code is added, and the specific procedure is given by the following formula:
Spe(H)=Attention(Q,K,V)W+φ(V)
wherein ,W∈Rc×c Is a parameter which can be learned, phi (&) signThe number represents the relative position code, which contains two 3 x 3 convolutional layers and one GELU activation function;
then integrating the weight matrix Spe (H) obtained by the formula processing through a feedforward network (FFN), and reducing training difficulty through jump connection in the whole process, wherein the feedforward network sequentially comprises a 1×1 convolution layer, a GELU activation function, a 3×3 convolution layer, a GELU activation function and a 1×1 convolution layer; to sum up, tran spe The calculation of (-) can be represented by the following formula:
H′=Spe(LN(H))+H
H out =FFN(LN(H′))+H′
wherein H' and H out The treatment results of Spe and FFN, respectively, while H out Representing the final result of the spectral transducer processing, the LN (·) symbol represents the layer normalization.
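A minimal sketch of the spectral self-attention Spe(H) = Attention(Q, K, V)W + φ(V); treating each channel as a token makes the attention map c×c. The softmax axis is not specified in the text, so normalizing over the key axis is an assumption:

```python
import torch
import torch.nn as nn


class SpectralAttention(nn.Module):
    """Sketch of Spe (step S32): channel-wise self-attention with a
    learnable temperature sigma and a convolutional position encoding."""

    def __init__(self, c: int):
        super().__init__()
        self.qkv = nn.Linear(c, 3 * c, bias=False)  # W_Q, W_K, W_V
        self.w = nn.Linear(c, c, bias=False)        # output projection W
        self.sigma = nn.Parameter(torch.ones(1))    # learnable scale in SoftMax(sigma K^T Q)
        self.phi = nn.Sequential(                   # phi(V): two 3x3 convs + GELU
            nn.Conv2d(c, c, 3, padding=1), nn.GELU(),
            nn.Conv2d(c, c, 3, padding=1))

    def forward(self, x):                           # x: (B, c, h, w)
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)            # (B, hw, c): one token per channel
        q, k, v = self.qkv(t).chunk(3, dim=-1)      # each (B, hw, c)
        attn = torch.softmax(self.sigma * (k.transpose(1, 2) @ q), dim=1)  # (B, c, c)
        out = self.w(v @ attn)                      # Attention(Q, K, V) W, (B, hw, c)
        pos = self.phi(v.transpose(1, 2).reshape(b, c, h, w))
        return out.transpose(1, 2).reshape(b, c, h, w) + pos


# y = SpectralAttention(32)(torch.randn(1, 32, 64, 64))  # -> (1, 32, 64, 64)
```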
S4, the characteristic image obtained by the S3 processing is processed by a spectral dimension transducer (Spe+FFN) to obtain a characteristic image with spectral dimension characteristic information fully extracted by the transducer
The specific processing steps are as follows: the operation S32 is repeated (again a spectral converter process is performed here, since the final purpose is to reconstruct a 31-channel hyperspectral picture from a 3-channel RGB image, to achieve a mapping of the channel dimensions from low to high, most important or spectral dimensional feature information)
S5, the characteristic image obtained in the S4 and subjected to the transformation to fully extract the spectral dimension characteristic information is output by adjusting the channel dimension to a target 31 channel through a 3X 3 convolution layer; finally obtaining a hyperspectral image;
the method comprises the following specific steps:
s51, setting a convolution layer with 32 input and 31 output, wherein the convolution kernel size is 3 multiplied by 3, and the step size and the filling size are set to be 1 so as to ensure that the size of the feature map is not changed;
s52, inputting the final result of S4 into a convolution layer and obtaining a hyperspectral picture with 31 channels and the same spatial resolution as the input RGB picture through a LeakyRelu activation function.
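A one-to-one sketch of the S51-S52 output head:

```python
import torch.nn as nn

# S51: 32 -> 31 channels, 3x3 kernel, stride 1, padding 1 (feature size unchanged)
# S52: LeakyReLU activation on the convolution output
output_head = nn.Sequential(
    nn.Conv2d(32, 31, kernel_size=3, stride=1, padding=1),
    nn.LeakyReLU())
```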
In this embodiment, the dataset provided by the NTIRE 2022 spectral reconstruction challenge is selected for training and evaluation; the dataset comprises 1000 pairs of RGB pictures and their corresponding real hyperspectral pictures. Each hyperspectral picture has 31 channels, each channel storing the light-intensity information at 10 nm intervals from 400 nm to 700 nm, with a spatial resolution of 482×512; each RGB image has 3 channels, with the same spatial resolution as its corresponding hyperspectral picture. 90% of the data in the dataset is randomly selected as the training set, and the remaining 10% as the test set.
During training, the RGB and hyperspectral pictures in the original dataset are randomly cropped to a 128×128 spatial size, and the light-intensity values of the pixels are normalized to the range [0, 1]. Simple data augmentation, such as random rotation and flipping, is performed on the training data. The processed RGB pictures and the corresponding real hyperspectral pictures are input into the network model, and after processing by the series of network modules provided by the invention, the mapping from RGB to real hyperspectral pictures is trained and learned. The training objective is to update the parameters θ through the loss function l_MRAE so as to minimize the distance between Ŝ_i and S_i:

l_MRAE = (1/N) Σ_{i=1}^{N} |Ŝ_i - S_i| / S_i

where Ŝ_i is the intensity of the generated hyperspectral picture at a pixel point, S_i is the intensity of the real hyperspectral picture at that pixel point, and N denotes the number of pixel points in the image. A sketch of this loss follows.
The experimental process comprises the following steps:
During testing, the original RGB pictures in the test dataset are input into the trained network to obtain the corresponding hyperspectral pictures; the reconstructed hyperspectral picture and the real hyperspectral picture are differenced in absolute value at each pixel point in each spectral dimension to obtain a numerical error map, as shown in fig. 3. The same coordinate position is randomly selected in the reconstructed hyperspectral picture and the real hyperspectral picture, and the light-intensity curves of that pixel point over all wavebands are drawn, as shown in fig. 4, so as to judge the accuracy of the hyperspectral pictures generated by the method. In addition, the method of the invention is compared in the MRAE index with HSCNN+ and AWAN, two CNN-architecture methods in this field, and ablation experiments are performed on the MSP and SSU modules of the method to prove its effectiveness.
It can be seen from fig. 3 that the hyperspectral pictures generated from RGB pictures have a strong similarity to the real hyperspectral pictures on the left; it can further be seen from the error map of the two that the errors of all pixels across the whole picture lie within a particularly small range (darker represents a smaller difference and whiter a larger difference). As can be clearly seen from the light-intensity graph of fig. 4, the method of the present invention generates light-intensity curves very close to the real data in all wavebands. As is evident from table 1, the proposed method has a lower MRAE, which means the hyperspectral pictures generated by the method of the invention have a very high similarity to the real hyperspectral pictures; and from table 2 it can be found that both MSP and SSU positively improve the performance of the method, indicating the independent effectiveness of the two modules.
Table 1: comparison with the indexes of the conventional methods (the table is rendered as an image in the source; its values are not reproduced here)
Table 2: ablation experiments (the table is rendered as an image in the source; its values are not reproduced here)
From the quantitative index results and the visual effect diagrams, the algorithm provided by the invention can generate high-quality hyperspectral pictures; the network is lightweight and can be conveniently deployed on other hardware, providing a more convenient and lower-cost hyperspectral picture acquisition method for various hyperspectral application fields and promoting the further development of the field.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The solutions in the embodiments of the present application may be implemented in various computer languages, for example, the object-oriented programming language Java and the interpreted scripting language JavaScript.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (10)

1. A multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image, comprising the steps of:
S1, processing the original RGB image X through a multi-scale processing module and outputting a feature map Y'_i;
S2, adding and stacking the feature map Y'_i with the original RGB image X in the spectral-channel dimension to obtain a feature image F with multi-scale preprocessing information;
S3, passing the feature image F with multi-scale preprocessing information sequentially through 3 spatial-spectral Transformer joint processing modules for joint processing in the spatial and spectral dimensions, obtaining a feature image with spatial-spectral feature information;
S4, processing the feature image obtained in S3 through a spectral-dimension Transformer to obtain a feature image whose spectral-dimension feature information has been fully extracted by the Transformer;
S5, adjusting the channel dimension of the feature image whose spectral-dimension feature information has been fully extracted by the Transformer to the target 31-channel output through a 3×3 convolution layer, finally obtaining the hyperspectral image.
2. The multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image according to claim 1, wherein the processing in S1 to obtain the feature map Y'_i comprises the following steps:
S11, copying the original RGB image into 3 copies and then downsampling them at rates of 8×, 4× and 2× respectively;
S12, after downsampling, performing a convolution operation on each downsampled feature image through a convolution layer with a 3×3 kernel; each layer other than the first is fused with the result of the previous layer immediately after the convolution operation, as shown in the following equation:

Y_i = Conv(Down_r(X)) ⊕ Y_{i-1}

where Down_r(·) denotes a downsampling operation with rate r, Conv(·) denotes a two-dimensional convolution operation with kernel size 3×3, and the ⊕ symbol denotes the channel stacking operation; all convolution operations in the multi-scale processing module use the LeakyReLU activation function; Y_i ∈ R^(c_i×h_i×w_i) denotes the intermediate output of the i-th layer, where i ∈ {1, 2, 3};
S13, feeding the Y_i obtained in S12 into a spectral Transformer module to capture the contextual information among spectral channels and obtain Y'_i, as shown in the following equation:

Y'_i = Up_r(Conv(Tran_spe(Y_i)))

where Y_i denotes the intermediate output of the i-th layer and Y'_i denotes the final result of the i-th layer; Up_r(·) denotes an upsampling operation with rate r, Conv(·) denotes a two-dimensional convolution operation with kernel size 1×1, and Tran_spe(·) denotes the Transformer processing in the spectral dimension.
3. The multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image according to claim 1, wherein in S2, the steps of obtaining the feature image F with multi-scale preprocessing information are:
S21, adjusting the original RGB image with a 3×3 convolution layer to the same spatial dimensions (rh'_3 × rw'_3) as the feature map Y'_i, with the output result X' ∈ R^(c_0×rh'_3×rw'_3);
S22, stacking the feature map Y'_i with the original RGB image in the spectral dimension, i.e. c_0 + c_1, where Y'_i ∈ R^(c_1×rh'_3×rw'_3), finally obtaining the feature image with multi-scale preprocessing information F ∈ R^((c_0+c_1)×rh'_3×rw'_3).
4. The multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image according to claim 1, wherein in S3, the feature image passes sequentially through spatial Transformer processing and spectral Transformer processing in each spatial-spectral Transformer joint processing module.
5. The multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image according to claim 4, wherein the processing steps for the feature image in each spatial-spectral Transformer joint processing module are:
S31, spatial Transformer processing:
1) Let the input of this layer be the feature image F ∈ R^(c×h×w). The feature image F is evenly divided into small windows of size m; the partitioning yields F_i ∈ R^(c×m×m), which is flattened and transposed into F_i ∈ R^(m²×c). Self-attention processing is then performed on the resulting one-dimensional feature maps, as given by:

F = {F_1, F_2, …, F_N},  N = hw/m²
A_i = Attention(F_i W_Q, F_i W_K, F_i W_V),  i = 1, …, N
Spa(F) = {A_1, A_2, …, A_N}

where W_Q, W_K and W_V ∈ R^(c×c) denote the learnable projection matrices for Q, K and V respectively, and A_i ∈ R^(m²×c) is the final output result for each window; the Attention(·) operation adds relative position encoding while performing the self-attention computation, as given by:

Attention(Q, K, V) = SoftMax(QK^T/√c + B)V

where B is the relative position bias, a learnable parameter of shape R^((2m-1)×(2m-1));
2) The results of the self-attention computation are then integrated by a simple multi-layer perceptron, with skip connections throughout to reduce training difficulty, and the computation process can be expressed as:

F' = Spa(LN(F)) + F
F_out = MLP(LN(F')) + F'

where F' and F_out are the processing results of Spa and the MLP respectively, F_out represents the final result of the spatial Transformer processing, and LN(·) denotes layer normalization;
S32, feeding the feature map result of the spatial Transformer processing into a spectral Transformer processing block:
1) Let the input feature map be H ∈ R^(c×h×w). The feature map H is first flattened and transposed into H ∈ R^(hw×c), and H ∈ R^(hw×c) is linearly projected via W_Q, W_K and W_V ∈ R^(c×c) to Q, K, V ∈ R^(hw×c); the self-attention computation process is:

Attention(Q, K, V) = SoftMax(σK^T Q)V

where σ ∈ R^1 is a learnable parameter;
2) The self-attention result Attention(Q, K, V) is then linearly projected and the relative position encoding is added, as given by:

Spe(H) = Attention(Q, K, V)W + φ(V)

where W ∈ R^(c×c) is a learnable parameter, and the symbol φ(·) denotes the relative position encoding, which comprises two 3×3 convolution layers and one GELU activation function;
3) A feed-forward network is then used to integrate the weight matrix obtained above, with skip connections throughout to reduce training difficulty; the computation of Tran_spe(·) can be expressed as:

H' = Spe(LN(H)) + H
H_out = FFN(LN(H')) + H'

where H' and H_out are the processing results of Spe and the FFN respectively, H_out represents the final result of the spectral Transformer processing, and LN(·) denotes layer normalization.
6. The multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image according to claim 5, wherein the feed-forward network consists of a 1×1 convolution layer, a GELU activation function, a 3×3 convolution layer, a GELU activation function and a 1×1 convolution layer in that order.
7. The multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image according to claim 1, wherein in S5, the steps of acquiring the hyperspectral image are:
S51, setting a convolution layer with 32 input channels and 31 output channels, with kernel size 3×3 and both stride and padding set to 1;
S52, feeding the feature image obtained in S4, whose spectral-dimension feature information has been fully extracted by the Transformer, into the convolution layer and through a LeakyReLU activation function to obtain a 31-channel hyperspectral picture with the same spatial resolution as the input RGB picture.
8. A system for multi-scale reconstruction of a corresponding hyperspectral image from an RGB image, comprising: a multi-scale processing unit, a dimension stacking unit, a spatial-spectral Transformer joint processing unit, a spectral-dimension Transformer processing unit and a hyperspectral image output unit;
the multi-scale processing unit is used for processing the original RGB image X through a multi-scale processing module and outputting the feature map Y'_i;
the dimension stacking unit is used for adding and stacking the feature map Y'_i with the original RGB image X in the spectral-channel dimension to obtain the feature image F with multi-scale preprocessing information;
the spatial-spectral Transformer joint processing unit is used for passing the feature image F with multi-scale preprocessing information sequentially through the 3 spatial-spectral Transformer joint processing modules to obtain the feature image with spatial-spectral feature information;
the spectral-dimension Transformer processing unit is used for processing the feature image obtained by the joint processing through one spectral-dimension Transformer to obtain a feature image whose spectral-dimension feature information has been fully extracted by the Transformer;
the hyperspectral image output unit is used for adjusting the channel dimension to the target 31-channel output through a 3×3 convolution layer to finally obtain the hyperspectral image.
9. A computer storage medium storing a readable program, characterized in that when the program is run, the algorithm according to any one of claims 1-7 is executed.
10. An apparatus, comprising: one or more processors, and a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the algorithm of any of claims 1-7.
CN202211469458.2A 2022-11-22 2022-11-22 Multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images Pending CN116091916A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211469458.2A CN116091916A (en) Multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211469458.2A CN116091916A (en) Multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images

Publications (1)

Publication Number Publication Date
CN116091916A true 2023-05-09

Family

ID=86201421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211469458.2A CN116091916A (en) Multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images

Country Status (1)

Country Link
CN (1) CN116091916A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116433551A (en) * 2023-06-13 2023-07-14 湖南大学 High-resolution hyperspectral imaging method and device based on double-light-path RGB fusion
CN116433551B (en) * 2023-06-13 2023-08-22 湖南大学 High-resolution hyperspectral imaging method and device based on double-light-path RGB fusion
CN116990243A (en) * 2023-09-26 2023-11-03 湖南大学 GAP frame-based light-weight attention hyperspectral calculation reconstruction method
CN116990243B (en) * 2023-09-26 2024-01-19 湖南大学 GAP frame-based light-weight attention hyperspectral calculation reconstruction method
CN117314757A (en) * 2023-11-30 2023-12-29 湖南大学 Space spectrum frequency multi-domain fused hyperspectral computed imaging method, system and medium
CN117314757B (en) * 2023-11-30 2024-02-09 湖南大学 Space spectrum frequency multi-domain fused hyperspectral computed imaging method, system and medium

Similar Documents

Publication Publication Date Title
Arad et al. Ntire 2022 spectral recovery challenge and data set
Yang et al. Deep edge guided recurrent residual learning for image super-resolution
CN116091916A (en) Multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images
CN110969124B (en) Two-dimensional human body posture estimation method and system based on lightweight multi-branch network
CN112819910B (en) Hyperspectral image reconstruction method based on double-ghost attention machine mechanism network
CN112766160A (en) Face replacement method based on multi-stage attribute encoder and attention mechanism
Zhang et al. LR-Net: Low-rank spatial-spectral network for hyperspectral image denoising
CN114881871A (en) Attention-fused single image rain removing method
CN115393233A (en) Full-linear polarization image fusion method based on self-encoder
CN114549567A (en) Disguised target image segmentation method based on omnibearing sensing
CN112163998A (en) Single-image super-resolution analysis method matched with natural degradation conditions
CN116797488A (en) Low-illumination image enhancement method based on feature fusion and attention embedding
CN115578262A (en) Polarization image super-resolution reconstruction method based on AFAN model
Lei et al. Tghop: an explainable, efficient, and lightweight method for texture generation
CN115631107A (en) Edge-guided single image noise removal
CN116757986A (en) Infrared and visible light image fusion method and device
Bao et al. S 2 net: Shadow mask-based semantic-aware network for single-image shadow removal
CN113034408B (en) Infrared thermal imaging deep learning image denoising method and device
Liu et al. Residual-guided multiscale fusion network for bit-depth enhancement
Liu et al. Multi-Scale Underwater Image Enhancement in RGB and HSV Color Spaces
CN114202460A (en) Super-resolution high-definition reconstruction method, system and equipment facing different damage images
CN113379606A (en) Face super-resolution method based on pre-training generation model
CN117237207A (en) Ghost-free high dynamic range light field imaging method for dynamic scene
CN115937429A (en) Fine-grained 3D face reconstruction method based on single image
CN116309278A (en) Medical image segmentation model and method based on multi-scale context awareness

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination