CN116091916A - Multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images - Google Patents
Multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images
- Publication number
- CN116091916A (application number CN202211469458.2A)
- Authority
- CN
- China
- Prior art keywords
- image
- processing
- characteristic
- scale
- spectrum
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/194—Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/10—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
Abstract
The invention discloses a multi-scale algorithm and a system for reconstructing corresponding hyperspectral images from RGB images, belonging to the field of computer-vision hyperspectral image reconstruction. The algorithm comprises the following steps: processing the original RGB image with a multi-scale processing module and outputting feature maps Y'_i; adding and stacking the feature map Y'_i with the original RGB image in the spectral-channel dimension to obtain a feature image carrying the multi-scale preprocessing information; passing that feature image sequentially through 3 spatial-spectral Transformer joint processing modules for joint processing in the spatial and spectral dimensions, obtaining a feature image with spatial-spectral feature information; processing the feature image with one spectral-dimension Transformer to obtain a feature image whose spectral-dimension feature information has been fully extracted by the Transformer; and adjusting the channel dimension to the target 31-channel output through a 3×3 convolution layer, finally obtaining the hyperspectral image.
Description
Technical Field
The invention belongs to the field of computer-vision hyperspectral image reconstruction, and particularly relates to a multi-scale algorithm and system for reconstructing corresponding hyperspectral images from RGB images.
Background
Visual information is very important to human beings: more than eighty percent of external information is obtained visually. When light irradiates the surface of an object and is reflected, it carries spectral information peculiar to that object. Limited by the structure of the human eye, the naked eye can only read information in the visible-light range, but a hyperspectral imager can capture information in any waveband and thus reveal information that is difficult to find by the naked eye. A hyperspectral imager can completely record the spectral information of a picture; compared with an ordinary color camera, it simultaneously acquires both the spatial information and the spectral information of the object under inspection. Based on these advantages, hyperspectral images are widely applied in fields such as medical image processing, remote sensing, and object tracking.
With the development of fields related to hyperspectral image processing, the demand for hyperspectral images keeps growing, but traditional hyperspectral images are difficult to acquire and expensive, which limits the development of the field. Most conventional hyperspectral image acquisition methods use spectrometers based on spatial- or spectral-scanning technology, such as push-broom scanners, whisk-broom scanners and band-sequential scanners; however, such hyperspectral imagers have obvious drawbacks: they are bulky and complex to operate, which makes hyperspectral images hard to obtain and costly. In recent years, with the development of convolutional neural network theory, a large number of convolutional neural network (CNN) based methods have been applied to reconstruction work and have achieved relatively good results. Although CNN-based deep learning reconstruction algorithms solve the hardware problems of high cost and high difficulty, their reconstruction quality still falls short of a high standard because CNN models are deficient at capturing long-range spatial correlation and self-similarity within the spectrum.
Disclosure of Invention
Aiming at the defects of the prior art, the invention aims to provide a multi-scale algorithm and a system for reconstructing corresponding hyperspectral images from RGB images.
The aim of the invention can be achieved by the following technical scheme:
a multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image, comprising the steps of:
S1, processing an original RGB image X through a multi-scale processing module, and outputting feature maps Y'_i;
S2, adding and stacking the feature map Y'_i with the original RGB image X in the spectral-channel dimension to obtain a feature image F with multi-scale preprocessing information;
S3, passing the feature image F with multi-scale preprocessing information sequentially through 3 spatial-spectral Transformer joint processing modules for joint processing in the spatial and spectral dimensions, obtaining a feature image with spatial-spectral feature information;
S4, processing the feature image obtained in S3 through one spectral-dimension Transformer to obtain a feature image whose spectral-dimension feature information has been fully extracted by the Transformer;
S5, adjusting the channel dimension of the feature image obtained in S4 to the target 31-channel output through a 3×3 convolution layer, finally obtaining the hyperspectral image.
Further, the processing in S1 to obtain the feature maps Y'_i comprises the following steps:
S11, copying the original RGB image 3 times, and then downsampling the copies at downsampling rates of 8, 4 and 2, respectively;
S12, after downsampling, performing a convolution operation on each downsampled feature image through a convolution layer with a 3×3 kernel; every layer except the first is additively fused with the result of the previous layer immediately after the convolution operation, as shown in the following equation:

Y_i = Conv(Down_r(X)) ⊕ Y_{i-1}

wherein Down_r(·) represents a downsampling operation with downsampling rate r, Conv(·) represents a two-dimensional convolution with kernel size 3×3, and the ⊕ symbol represents the fusion with the result of the previous layer; all convolution operations in the multi-scale processing module use the LeakyReLU activation function; Y_i represents the intermediate output of the i-th layer, where i ∈ {1, 2, 3};
S13, inputting the Y_i obtained in S12 into a spectral Transformer module to capture the contextual information among the spectral channels, obtaining Y'_i, as shown in the following formula:

Y'_i = Up_r(Conv(Tran_spe(Y_i)))

wherein Y_i represents the intermediate output of the i-th layer, Y'_i represents the final result of the i-th layer, Up_r(·) represents an upsampling operation with upsampling rate r, Conv(·) represents a two-dimensional convolution with kernel size 1×1, and Tran_spe(·) represents the Transformer processing in the spectral dimension.
Further, in S2, the steps of obtaining the feature image F with multi-scale preprocessing information are:
S21, the original RGB image is adjusted with a 3×3 convolution layer to the same spatial dimensions (rh'_3 × rw'_3) as the feature map Y'_i, with the output denoted X';
S22, the feature map Y'_i and the adjusted RGB image X' are stacked in the spectral dimension, i.e. to c_0 + c_1 channels, finally obtaining the feature image with multi-scale preprocessing information F ∈ R^((c_0+c_1)×rh'_3×rw'_3).
Further, in S3, the feature image is sequentially subjected to a spatial Transformer process and a spectral Transformer process in each spatial-spectral Transformer joint processing module.
Further, the processing steps of the feature image in each spatial-spectral Transformer joint processing module are as follows:
s31, space Transformer processing:
1) Let the input of this layer be a feature image F ∈ R^(c×h×w); the feature image F is evenly divided into small windows of size m, and each window F_i ∈ R^(c×m×m) obtained by the partitioning operation is flattened and transposed to F_i ∈ R^(m²×c); self-attention processing is then performed on the resulting one-dimensional feature maps; the specific formulas are as follows:

F = {F_1, F_2, …, F_N}, N = hw/m²
A_i = Attention(F_i W^Q, F_i W^K, F_i W^V), i = 1, …, N

wherein W^Q, W^K and W^V ∈ R^(c×c) respectively represent the learnable projection matrices for the queries, keys and values, and A_i ∈ R^(m²×c) is the final output of each window; the Attention(·) operation adds a relative position code while performing the self-attention calculation, as shown in the following formula:

Attention(Q, K, V) = SoftMax(QK^T/√c + B)V

wherein B is the relative position bias, a learnable parameter of shape R^((2m−1)×(2m−1));
2) The results of the self-attention calculation are then integrated by a simple multi-layer perceptron; in the whole process, the training difficulty is reduced through skip connections, and the calculation process can be written as:

F' = Spa(LN(F)) + F
F_out = MLP(LN(F')) + F'

wherein F' and F_out are the processing results of Spa and the MLP respectively, F_out represents the final result of the spatial Transformer processing, and the LN(·) symbol represents layer normalization;
S32, inputting the feature-map result after the spatial Transformer processing into the spectral Transformer processing block;
1) Let the input feature map be H ∈ R^(c×h×w); first, the feature map H is flattened and transposed into H ∈ R^(hw×c); H ∈ R^(hw×c) is then linearly projected via W^Q, W^K and W^V ∈ R^(c×c) to Q, K, V ∈ R^(hw×c); the self-attention calculation process is:

Attention(Q, K, V) = SoftMax(σK^T Q)V

wherein σ ∈ R^1 is a learnable parameter;
2) Subsequently, the self-attention result Attention(Q, K, V) is linearly projected and the relative position code is added, as given by the following formula:

Spe(H) = Attention(Q, K, V)W + φ(V)

wherein W ∈ R^(c×c) is a learnable parameter, and the symbol φ(·) represents the relative position encoding, which consists of two 3×3 convolutional layers and a GELU activation function;
3) Then, a feed-forward network is used to integrate the weight matrix obtained by the above formula, and the training difficulty is reduced by skip connections in the whole process; the calculation of Tran_spe(·) can be represented by the following formulas:

H' = Spe(LN(H)) + H
H_out = FFN(LN(H')) + H'

wherein H' and H_out are the processing results of Spe and the FFN respectively, while H_out represents the final result of the spectral Transformer processing, and the LN(·) symbol represents layer normalization.
Further, the feed-forward network consists, in order, of a 1×1 convolution layer, a GELU activation function, a 3×3 convolution layer, a GELU activation function, and a 1×1 convolution layer.
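As a minimal sketch, the stated layer order can be written directly in PyTorch; the channel width c (and keeping it constant across the layers) is an assumption here, since the text fixes only the composition:

```python
import torch
import torch.nn as nn

def make_ffn(c: int) -> nn.Sequential:
    """Feed-forward network in the stated order: 1x1 conv, GELU, 3x3 conv,
    GELU, 1x1 conv. The channel width c is an assumption; the text fixes
    only the layer order."""
    return nn.Sequential(
        nn.Conv2d(c, c, kernel_size=1),
        nn.GELU(),
        nn.Conv2d(c, c, kernel_size=3, padding=1),  # padding keeps spatial size
        nn.GELU(),
        nn.Conv2d(c, c, kernel_size=1),
    )

ffn = make_ffn(32)
out = ffn(torch.randn(1, 32, 16, 16))
print(tuple(out.shape))  # (1, 32, 16, 16)
```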
Further, in S5, the steps of acquiring the hyperspectral image are:
S51, setting a convolution layer with 32 input channels and 31 output channels, with kernel size 3×3 and both stride and padding set to 1;
S52, inputting the feature image obtained in S4, whose spectral-dimension feature information has been fully extracted by the Transformer, into the convolution layer, and obtaining through the LeakyReLU activation function a hyperspectral picture with 31 channels and the same spatial resolution as the input RGB picture.
A system for multi-scale reconstruction of corresponding hyperspectral images from RGB images, comprising: a multi-scale processing unit, a dimension superposition unit, a spatial-spectral Transformer joint processing unit, a spectral-dimension Transformer processing unit, and a hyperspectral image output unit;
the multi-scale processing unit is used for processing the original RGB image X through the multi-scale processing module and outputting the feature maps Y'_i;
the dimension superposition unit is used for adding and stacking the feature map Y'_i with the original RGB image X in the spectral-channel dimension to obtain the feature image F with multi-scale preprocessing information;
the spatial-spectral Transformer joint processing unit is used for processing the feature image F with multi-scale preprocessing information sequentially through the 3 spatial-spectral Transformer joint processing modules to obtain a feature image with spatial-spectral feature information;
the spectral-dimension Transformer processing unit is used for processing that feature image through one spectral-dimension Transformer to obtain a feature image whose spectral-dimension feature information has been fully extracted by the Transformer;
and the hyperspectral image output unit is used for adjusting the channel dimension to the target 31-channel output through a 3×3 convolution layer, finally obtaining the hyperspectral image.
A computer storage medium storing a readable program which, when executed, performs the algorithm described above.
An apparatus, comprising: one or more processors, memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the algorithms described above.
The invention has the following beneficial effects: through a Transformer-based multi-scale spatial-spectral joint processing network architecture, the corresponding hyperspectral picture can be reconstructed from RGB efficiently, cheaply and accurately. Compared with traditional hyperspectral imagers and CNN-based deep learning algorithms, the method greatly increases the speed and reduces the cost of acquiring hyperspectral pictures, overcomes the limitation of CNN models in capturing long-range spatial correlation and spectral self-similarity, and, by adopting a multi-scale hierarchical feature-extraction scheme, avoids the information loss during encoding and decoding caused by the U-Net networks used in current work. The method can greatly improve the accuracy of hyperspectral picture reconstruction and has high practical value.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to those skilled in the art that other drawings can be obtained according to these drawings without inventive effort.
FIG. 1 is a block diagram of the algorithm of the present invention for reconstructing a corresponding hyperspectral image from an RGB image;
FIG. 2 is a block diagram of a multi-scale processing module of the present invention;
FIG. 3 shows a true hyperspectral image, a reconstructed hyperspectral image, and the error between the two in the experiment of the present invention;
fig. 4 is a graph showing the comparison of the true values and the generated values of all bands at a certain coordinate point in a hyperspectral image in the experiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, the multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image comprises the following steps:
S1, the original RGB image X ∈ R^(3×128×128) is processed by the multi-scale processing module (MSP), outputting the feature maps Y'_i;
The frame structure of the multi-scale processing module is shown in fig. 2, and the specific steps of S1 are as follows:
S11, downsampling reconstruction is performed on the image: the original 3×128×128 RGB image is copied 3 times and the copies are downsampled to 192×16×16, 48×32×32 and 12×64×64 at downsampling rates of 8, 4 and 2, respectively.
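The three stated shapes are consistent with a space-to-channel downsampling. As a hedged illustration (the patent does not name the exact operator), PixelUnshuffle reproduces them exactly:

```python
import torch
import torch.nn as nn

# Illustration only: the patent does not say which downsampling operator is
# used; PixelUnshuffle (space-to-channel rearrangement) is one operator that
# reproduces the stated shapes, trading each r x r spatial block for r^2 channels.
x = torch.randn(1, 3, 128, 128)  # original RGB image X
shapes = [tuple(nn.PixelUnshuffle(r)(x).shape[1:]) for r in (8, 4, 2)]
print(shapes)  # [(192, 16, 16), (48, 32, 32), (12, 64, 64)]
```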
S12, after downsampling, carrying out convolution operation on the downsampled feature image through a convolution layer with a 3 multiplied by 3 convolution kernel; changing the number of channels through convolution operation, but not changing the spatial resolution so as to fuse with the result of the upper layer; each of the other layers, except the first layer, is additively fused with the results of the previous layer immediately after the convolution operation, as shown in the following equation:
Y i =Conv(Down r (X))⊕Y i-1
wherein Downr (. Cndot.) represents a downsampling operation, r is a downsampling rate, conv (-) represents a two-dimensional convolution operation with a convolution kernel size of 3×3, and the # -symbol represents a channel number stacking operation; all convolution operations in the multi-scale processing module use the LeakyReLU activation function;representing the intermediate process output of the i-th layer, where i e (1, 2, 3).
S13, Y obtained in S12 i Input into a spectral transducer module (Spe), capturing contextual information between spectral channels; since the space size of the first layer after downsampling is too small, at the end of the first layer, a 1×1 convolution layer is passed to enable the network to adaptively adjust the weights; finally, the spatial scale is adjusted to be the same as the size of the next layer through an up-sampling layer to obtain Y' i The method comprises the steps of carrying out a first treatment on the surface of the The specific implementation process is given by the following formula:
wherein ,intermediate process output representing layer i, +.>Up, representing the final result of the ith layer r (. Cndot.) represents an upsampling operation, r is the upsampling rate, conv (-) represents a two-dimensional convolution operation with a convolution kernel size of 1X 1, tran spe (. Cndot.) represents the transform processing for the spectral dimension.
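The three-layer pipeline of S11–S13 can be sketched as follows. This is a hypothetical reading of the text, not the definitive implementation: pixel_unshuffle stands in for Down_r, the spectral Transformer Tran_spe is stubbed out as identity, and the ⊕ fusion is taken as element-wise addition of the previous layer's upsampled result — all assumptions, since the text does not pin down these operators.

```python
import torch
import torch.nn as nn
import torch.nn.functional as fn

class MSPSketch(nn.Module):
    """Hypothetical sketch of the multi-scale processing module (MSP).
    Assumptions: pixel_unshuffle realizes Down_r, the spectral Transformer
    Tran_spe is omitted (identity), and the fusion adds the previous
    layer's upsampled result to the current layer's feature map."""
    def __init__(self, rates=(8, 4, 2)):
        super().__init__()
        self.rates = rates
        chans = [3 * r * r for r in rates]  # 192, 48, 12 channels per layer
        self.convs = nn.ModuleList(
            nn.Sequential(nn.Conv2d(c, c, 3, padding=1), nn.LeakyReLU())
            for c in chans)
        self.projs = nn.ModuleList(  # 1x1 conv toward the next layer's width
            nn.Conv2d(chans[i], chans[min(i + 1, len(chans) - 1)], 1)
            for i in range(len(chans)))

    def forward(self, x):
        prev = None
        for i, r in enumerate(self.rates):
            y = self.convs[i](fn.pixel_unshuffle(x, r))  # Conv(Down_r(X))
            if prev is not None:
                y = y + prev                 # fusion with the previous layer
            y = self.projs[i](y)             # 1x1 conv (Tran_spe stubbed out)
            prev = fn.interpolate(y, scale_factor=2.0)  # Up_2 to next grid
        return prev

out = MSPSketch()(torch.randn(1, 3, 128, 128))
print(tuple(out.shape))  # (1, 12, 128, 128): back at the input spatial size
```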
S2, the final output characteristic diagram Y 'in S1' i Adding and stacking the characteristic images with the original RGB image X in the dimension of a spectrum channel to obtain a characteristic image with multi-scale preprocessing information, so that a backbone network can better extract the characteristics of the original information; .
The specific superposition steps are as follows:
S21, a 3×3 convolution layer is used to adjust the original input RGB image (X ∈ R^(3×h×w)) to the same spatial dimensions (rh'_3 × rw'_3) as the result output in S1, with the output denoted X';
S22, the two tensors (the feature map Y'_i and the adjusted RGB image X') are stacked in the spectral dimension, i.e. to c_0 + c_1 channels, finally obtaining the feature image with multi-scale preprocessing information F ∈ R^((c_0+c_1)×rh'_3×rw'_3).
S3, sequentially passing the characteristic image F with the multi-scale preprocessing information in S2 through 3 space-spectrum converter combined processing modules (SUB) to perform space-spectrum dimension combined processing to obtain a characteristic image with space-spectrum characteristic information;
the characteristic image sequentially passes through a space transducer process (Spa) and a spectrum transducer process (Spe) in each space-spectrum transducer joint processing module (SUB);
the specific steps of processing the feature image in one spatial-spectral Transformer joint processing module (SUB) are as follows:
s31, space Transformer processing;
the input of the layer isTo simplify the character in the introduction, it is now assumed that the feature image F.epsilon.R c×h×w Is an input to the module; the characteristic image F is divided into small windows with the window size of m in an average way, and the characteristic image F is obtained through the partitioning operation i ∈R c×m×m Performing flattening and transposition to obtain ∈>Then, performing self-attention processing on the obtained one-dimensional feature map; the specific implementation process is shown in the following formula:
F={F 1 ,F 2 ,…,F N },N=hw/m 2
A i =Attention(F i W Q ,F i W K ,F i W V ),i=1,…,N
wherein WQ ,W K and WV ∈R c×c Respectively representIs a learnable parameter,/-a projection matrix of (2)>Outputting a result finally for each window; wherein the Attention () operation adds relative position coding while realizing self-Attention computation, the specific implementation details are given by the following formula:
wherein B is a relative positional offset, which is of a shape R (2m-1)×(2m-1) Is a learning parameter of (a);
the results of the self-attention calculations are then integrated by a simple multi-layer perceptron (MLP)And the training difficulty is reduced through jump connection in the whole process, so that Tran is summarized spa The calculation of (-) can be summarized by the following formula:
F out =MLP(LN(F′))+F′
wherein F' and F out Processing results of Spa and MLP respectively, and F out Representing the final result of the spatial transducer processing, the LN (·) symbol represents the layer normalization.
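The window partition and self-attention steps above can be sketched minimally as follows; random projections stand in for the learned W^Q, W^K, W^V, and the relative-position bias B is stubbed to zero, so this is an illustration of the shapes and data flow only:

```python
import torch

def window_attention(feat, m):
    """Windowed spatial self-attention sketch. feat has shape (c, h, w);
    each m x m window becomes m^2 tokens of dimension c. The projections
    and the bias B are placeholders, not trained parameters."""
    c, h, w = feat.shape
    # partition into N = hw/m^2 windows, flatten + transpose to (N, m^2, c)
    wins = (feat.reshape(c, h // m, m, w // m, m)
                .permute(1, 3, 2, 4, 0)
                .reshape(-1, m * m, c))
    wq, wk, wv = (torch.randn(c, c) for _ in range(3))
    q, k, v = wins @ wq, wins @ wk, wins @ wv
    b = torch.zeros(m * m, m * m)  # relative position bias (stubbed to zero)
    return torch.softmax(q @ k.transpose(-2, -1) / c ** 0.5 + b, dim=-1) @ v

out = window_attention(torch.randn(16, 8, 8), m=4)
print(tuple(out.shape))  # (4, 16, 16): N = 64/16 windows, m^2 tokens, c dims
```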
S32, spectrum converter processing;
feature map result F after space transform processing out Then input into a spectral transducer processing block;
for convenience of character description, let us now assume that the input is H ε R c×h×w Similar to spatial processing, the feature image X is flattened and transposed to H ε R hw×c H.epsilon.R is then likewise taken hw×c Via W Q ,W K and WV ∈R c×c Linear projection to Q, K, V ε R hw×c The method comprises the steps of carrying out a first treatment on the surface of the In contrast to spatial processing, where all channels of a single pixel are used as a token in spectral processing, the self-attention (Spe) specification is given by:
Attention(Q, K, V) = SoftMax(σK^T Q)V

wherein σ ∈ R^1 is a learnable parameter;
subsequently, the self-attention result Attention(Q, K, V) is linearly projected and the relative position code is added, as given by the following formula:

Spe(H) = Attention(Q, K, V)W + φ(V)

wherein W ∈ R^(c×c) is a learnable parameter, and the symbol φ(·) represents the relative position encoding, which consists of two 3×3 convolutional layers and a GELU activation function;
the weight matrix Spe(H) obtained above is then integrated through a feed-forward network (FFN), with skip connections reducing the training difficulty throughout; the feed-forward network consists, in order, of a 1×1 convolution layer, a GELU activation function, a 3×3 convolution layer, a GELU activation function, and a 1×1 convolution layer; in summary, the calculation of Tran_spe(·) can be represented by the following formulas:

H' = Spe(LN(H)) + H
H_out = FFN(LN(H')) + H'

wherein H' and H_out are the processing results of Spe and the FFN respectively, while H_out represents the final result of the spectral Transformer processing, and the LN(·) symbol represents layer normalization.
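The spectral self-attention above differs from the spatial one in that the attention map is c×c regardless of spatial size. A hedged sketch (random placeholder projections; V is multiplied on the left so the shapes compose, which is an assumption about the intended operand order):

```python
import torch

def spectral_attention(feat, sigma=1.0):
    """Spectral self-attention sketch for feat of shape (c, h, w). The
    projections are random placeholders; sigma stands in for the learnable
    scalar. Note the c x c attention map is independent of h and w."""
    c, h, w = feat.shape
    hmat = feat.reshape(c, h * w).T            # flatten + transpose: (hw, c)
    wq, wk, wv = (torch.randn(c, c) for _ in range(3))
    q, k, v = hmat @ wq, hmat @ wk, hmat @ wv  # each (hw, c)
    attn = torch.softmax(sigma * k.T @ q, dim=-1)  # (c, c) channel attention
    return v @ attn                            # (hw, c)

out = spectral_attention(torch.randn(31, 8, 8))
print(tuple(out.shape))  # (64, 31): one row per pixel, one column per band
```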
S4, the characteristic image obtained by the S3 processing is processed by a spectral dimension transducer (Spe+FFN) to obtain a characteristic image with spectral dimension characteristic information fully extracted by the transducer
The specific processing steps are as follows: the operation S32 is repeated (again a spectral converter process is performed here, since the final purpose is to reconstruct a 31-channel hyperspectral picture from a 3-channel RGB image, to achieve a mapping of the channel dimensions from low to high, most important or spectral dimensional feature information)
S5, the characteristic image obtained in the S4 and subjected to the transformation to fully extract the spectral dimension characteristic information is output by adjusting the channel dimension to a target 31 channel through a 3X 3 convolution layer; finally obtaining a hyperspectral image;
the method comprises the following specific steps:
S51, a convolution layer with 32 input channels and 31 output channels is set, with kernel size 3×3 and both stride and padding set to 1, to ensure the feature-map size is unchanged;
S52, the final result of S4 is input into the convolution layer, and a hyperspectral picture with 31 channels and the same spatial resolution as the input RGB picture is obtained through the LeakyReLU activation function.
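The output head of S51–S52 can be sketched directly from the stated hyperparameters (the batch size and spatial size below are illustrative):

```python
import torch
import torch.nn as nn

# Output head as described in S51-S52: 32 -> 31 channels, 3x3 kernel,
# stride 1, padding 1, followed by LeakyReLU.
head = nn.Sequential(
    nn.Conv2d(32, 31, kernel_size=3, stride=1, padding=1),
    nn.LeakyReLU(),
)
hsi = head(torch.randn(1, 32, 128, 128))
print(tuple(hsi.shape))  # (1, 31, 128, 128): 31 bands at the input resolution
```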
In this embodiment, the dataset provided by the NTIRE 2022 spectral reconstruction challenge is selected for training and evaluation; the dataset comprises 1000 pairs of RGB pictures and their corresponding real hyperspectral pictures. Each hyperspectral picture has 31 channels, each channel storing the light-intensity information of one 10 nm band from 400 nm to 700 nm, with a spatial resolution of 482×512; each RGB image has 3 channels, with the same spatial resolution as its corresponding hyperspectral picture. 90% of the data in the dataset is randomly selected as the training set, and the remaining 10% as the test set.
In the training process, the RGB and hyperspectral pictures in the original dataset are randomly cropped to a 128×128 spatial size, and the light-intensity values of the pixels are normalized to the range [0, 1]. Simple data augmentation, such as random rotation and flipping, is applied to the training data. The processed RGB pictures and the corresponding real hyperspectral pictures are input into the network model and, after processing by the series of network modules provided by the invention, the mapping from RGB to real hyperspectral pictures is learned. The training aim is to update the parameters θ through the loss function l_MRAE so as to minimize the distance between the generated picture Ŝ and the real picture S:

l_MRAE = (1/N) Σ_{i=1}^{N} |S_i − Ŝ_i| / S_i

where Ŝ_i is the intensity of the generated hyperspectral picture at pixel point i, S_i is the intensity of the real hyperspectral picture at that pixel point, and N represents the number of pixel points in the image.
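The l_MRAE loss above can be sketched as follows (a hedged NumPy illustration; the small `eps` guard against division by zero is an assumption added here, not part of the patent text):

```python
import numpy as np

def mrae(pred, gt, eps=1e-8):
    # mean relative absolute error over all pixel points and bands
    return float(np.mean(np.abs(gt - pred) / (gt + eps)))

gt = np.full((31, 4, 4), 2.0)    # toy "real" hyperspectral cube
pred = gt * 1.1                  # a uniform 10% over-estimate
print(round(mrae(pred, gt), 6))  # 0.1
```

A uniform 10% over-estimate yields an MRAE of 0.1, which matches the intuition that MRAE measures the average relative deviation per pixel.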
The experimental process comprises the following steps:
in the testing process, the original RGB pictures in the test dataset are input into the trained network to obtain the corresponding hyperspectral pictures; the absolute difference between the reconstructed hyperspectral picture and the real hyperspectral picture is computed for the value of each pixel point in each spectral dimension, yielding a numerical error map, as shown in figure 3. The same coordinate position is randomly selected from the reconstructed and real hyperspectral pictures, and the light-intensity curves of that pixel point across all bands are plotted, as shown in figure 4, in order to judge the accuracy of the hyperspectral pictures generated by the method. In addition, the method of the invention is compared on the MRAE index with HSCNN+ and AWAN, CNN-based architectures in this field, and ablation experiments are carried out on the MSP and SSU modules of the method to prove their effectiveness.
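The per-pixel, per-band error map and the single-pixel spectral curves described above can be computed as in this sketch (simulated data; the array sizes follow the embodiment's 31×482×512 pictures, and the noise model standing in for reconstruction error is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
gt  = rng.random((31, 482, 512))              # real hyperspectral picture
rec = gt + rng.normal(0.0, 0.01, gt.shape)    # simulated reconstruction

err = np.abs(rec - gt)        # absolute difference per pixel point, per band
err_map = err.mean(axis=0)    # collapse the 31 bands for visualisation

# light-intensity curves of one randomly chosen pixel point across all bands
yy, xx = rng.integers(482), rng.integers(512)
curve_rec, curve_gt = rec[:, yy, xx], gt[:, yy, xx]
print(err_map.shape, curve_gt.shape)
```

Plotting `curve_rec` against `curve_gt` over the 31 band indices gives the kind of comparison shown in figure 4.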
It can be seen from fig. 3 that the hyperspectral pictures generated from RGB pictures are strongly similar to the real hyperspectral pictures on the left; the error map of the two further shows that the error of all pixels lies within a particularly small range across the whole picture (darker represents smaller differences, whiter represents larger differences). The light-intensity graph of fig. 4 clearly shows that the method of the invention generates light-intensity curves very close to the real data in all bands. Table 1 shows that the proposed method has a lower MRAE, meaning that the hyperspectral pictures it generates have a very high similarity to the real hyperspectral pictures, and table 2 shows that both MSP and SSU positively improve performance, indicating the independent effectiveness of the two modules.
Table 1 comparison with the index of the conventional method
Table 2 ablation experiments
From the point of view of both the quantitative index results and the visual effect diagrams, the algorithm provided by the invention generates high-quality hyperspectral pictures; the network is lightweight and can be conveniently deployed on other hardware, providing a more convenient, lower-cost hyperspectral picture acquisition mode for various hyperspectral application fields and promoting the further development of the field.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The solutions in the embodiments of the present application may be implemented in various computer languages, for example, the object-oriented programming language Java and the scripting language JavaScript, etc.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.
Claims (10)
1. A multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image, comprising the steps of:
s1, processing an original RGB image X through a multi-scale processing module, and outputting a feature map Y'_i;
S2, stacking the feature map Y'_i with the original RGB image X in the spectral-channel dimension to obtain a feature image F with multi-scale preprocessing information;
s3, sequentially passing the feature image F with the multi-scale preprocessing information through 3 space-spectrum Transformer joint processing modules for joint space-spectrum processing, obtaining a feature image with space-spectrum feature information;
S4, processing the feature image obtained in S3 through a spectral-dimension Transformer, obtaining a feature image whose spectral-dimension feature information has been fully extracted by the Transformer;
s5, passing that feature image through a 3×3 convolution layer that adjusts the channel dimension to the target 31 channels, finally obtaining the hyperspectral image.
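For orientation only, the channel bookkeeping of steps S1-S5 can be sketched with stub modules. The `project` helper and the assumption that the multi-scale output Y'_i has 29 channels (so that stacking with the 3-channel RGB image gives the 32 channels expected by the output layer of claim 7) are illustrative, not from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def project(x, c_out):
    # stand-in for any convolution/Transformer stage: channel mixing only
    w = rng.standard_normal((c_out, x.shape[0])) * 0.1
    return np.einsum('oc,chw->ohw', w, x)

X = rng.standard_normal((3, 64, 64))      # original RGB image
Y = project(X, 29)                        # S1: multi-scale module output (29ch assumed)
F = np.concatenate([Y, X], axis=0)        # S2: stack with X -> 32 channels
for _ in range(3):                        # S3: three joint processing modules
    F = project(F, 32)
F = project(F, 32)                        # S4: spectral-dimension Transformer
HSI = project(F, 31)                      # S5: adjust to the 31 target channels
print(HSI.shape)  # (31, 64, 64)
```

The spatial size is preserved throughout; only the channel dimension evolves from 3 to 32 and finally to the 31 target spectral bands.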
2. The multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image according to claim 1, wherein the steps of obtaining the feature map Y'_i in S1 are:
s11, copying the original RGB image into 3 copies, and downsampling them at rates of 8, 4 and 2 respectively;
s12, after downsampling, performing a convolution operation on each downsampled feature image through a convolution layer with a 3×3 kernel; every layer except the first additively fuses its result with that of the previous layer immediately after the convolution operation, as shown in the following equation:
where Down_r(·) denotes a downsampling operation with rate r, Conv(·) denotes a two-dimensional convolution operation with a 3×3 kernel, and the ⊕ symbol denotes a channel-number stacking operation; all convolution operations in the multi-scale processing module use the LeakyReLU activation function; Y_i^mid represents the intermediate process output of the i-th layer, where i ∈ {1, 2, 3};
s13, inputting the Y_i^mid obtained in S12 into a spectral Transformer module to capture the context information among spectral channels and obtain Y'_i, as shown in the following formula:
Y'_i = Conv(Up_r(Tran_spe(Y_i^mid)))
where Y_i^mid represents the intermediate process output of the i-th layer, Y'_i represents the final result of the i-th layer, Up_r(·) represents an upsampling operation with rate r, Conv(·) represents a two-dimensional convolution operation with a 1×1 kernel, and Tran_spe(·) represents the Transformer processing of the spectral dimension.
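A minimal sketch of the S11-S13 dataflow, under stated assumptions: average-pool downsampling, nearest-neighbour upsampling, a shared 1×1 projection in place of the 3×3 convolutions, and an identity stub for Tran_spe are all simplifications invented here:

```python
import numpy as np

def down(x, r):
    # average-pool downsampling by rate r; x: (C, H, W)
    c, h, w = x.shape
    return x.reshape(c, h // r, r, w // r, r).mean(axis=(2, 4))

def up(x, r):
    # nearest-neighbour upsampling by rate r
    return x.repeat(r, axis=1).repeat(r, axis=2)

def conv1x1(x, w):
    # 1x1 convolution as channel mixing; w: (C_out, C_in)
    return np.einsum('oc,chw->ohw', w, x)

def tran_spe(x):
    # identity stub in place of the spectral Transformer
    return x

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 64, 64))    # original RGB image
W = rng.standard_normal((3, 3)) * 0.1   # shared 1x1 projection for the sketch
rates = [8, 4, 2]                       # per-layer downsampling rates (S11)

prev, prev_r, outputs = None, None, []
for r in rates:
    y = conv1x1(down(X, r), W)          # S12: downsample then convolve
    if prev is not None:
        y = y + up(prev, prev_r // r)   # additive fusion with previous layer
    prev, prev_r = y, r
    outputs.append(conv1x1(up(tran_spe(y), r), W))  # S13: Tran_spe, up, conv
print([o.shape for o in outputs])  # three maps at full 64x64 resolution
```

Each layer works at a coarser scale (8×, 4×, 2×) and is fused into the next, so the final outputs carry multi-scale context back at the original resolution.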
3. The multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image according to claim 1, wherein in S2, the step of obtaining the feature image F with multi-scale preprocessing information is:
s21, adjusting the original RGB image with a 3×3 convolution layer to the same spatial dimensions (r·h'_3 × r·w'_3) as the feature map Y'_i; the output result is then stacked with Y'_i in the spectral-channel dimension to obtain the feature image F.
4. The multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image according to claim 1, wherein in S3, the feature image is subjected in sequence to spatial Transformer processing and spectral Transformer processing in each space-spectrum Transformer joint processing module.
5. The multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image according to claim 4, wherein the feature-image processing steps in each space-spectrum Transformer joint processing module are:
s31, spatial Transformer processing:
1) The input of the layer is set as a feature image F ∈ R^(c×h×w); the feature image F is evenly divided into small windows of size m, and after the partitioning operation each window F_i ∈ R^(c×m×m) is flattened and transposed to obtain F_i ∈ R^(m²×c); self-attention processing is then performed on the resulting one-dimensional feature maps; the specific formulas are as follows:
F = {F_1, F_2, …, F_N}, N = hw/m²
A_i = Attention(F_i W_Q, F_i W_K, F_i W_V), i = 1, …, N
where W_Q, W_K and W_V ∈ R^(c×c) denote the learnable projection matrices of Q, K and V respectively, and A_i is the final output result of each window; the Attention(·) operation adds a relative position encoding while realizing the self-attention calculation, as in the following formula:
Attention(Q, K, V) = SoftMax(QKᵀ/√c + B)V
where B is the relative position offset, a learnable parameter of shape R^((2m−1)×(2m−1));
2) The results of the self-attention calculation are then integrated by a simple multi-layer perceptron; throughout the process, skip connections reduce the training difficulty, and the calculation process can be written as:
F' = Spa(LN(F)) + F
F_out = MLP(LN(F')) + F'
where F' and F_out are the processing results of Spa and MLP respectively, F_out represents the final result of the spatial Transformer processing, and the LN(·) symbol denotes layer normalization;
s32, inputting the feature-map result of the spatial Transformer processing into a spectral Transformer processing block;
1) Let the input feature map be H ∈ R^(c×h×w); H is first flattened and transposed into H ∈ R^(hw×c), then linearly projected via W_Q, W_K and W_V ∈ R^(c×c) to Q, K, V ∈ R^(hw×c); the self-attention calculation process is as follows:
Attention(Q, K, V) = SoftMax(σKᵀQ)V
where σ ∈ R¹ is a learnable parameter;
2) Subsequently, the self-attention result Attention(Q, K, V) is linearly projected and the relative position encoding is added, as given by the following formula:
Spe(H) = Attention(Q, K, V)W + φ(V)
where W ∈ R^(c×c) is a learnable parameter, and the φ(·) symbol represents the relative position encoding, which comprises two 3×3 convolution layers and a GELU activation function;
3) Then, a feed-forward network is used to integrate the weight matrix obtained by the above formulas, and skip connections reduce the training difficulty throughout the process; the calculation of Tran_spe(·) can be represented by the following formulas:
H' = Spe(LN(H)) + H
H_out = FFN(LN(H')) + H'
where H' and H_out are the processing results of Spe and FFN respectively, while H_out represents the final result of the spectral Transformer processing, and the LN(·) symbol denotes layer normalization.
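The spectral self-attention Attention(Q, K, V) = SoftMax(σKᵀQ)V of claim 5 can be sketched as follows; note the attention matrix is c×c and acts over channels rather than pixels. The softmax axis and the right-multiplication by V are assumptions made here for dimensional consistency:

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def spectral_attention(H, Wq, Wk, Wv, sigma):
    # H: (hw, c) flattened feature map; attention is c x c, over channels
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    A = softmax(sigma * (K.T @ Q), axis=0)  # (c, c) channel attention
    return V @ A                             # back to (hw, c)

rng = np.random.default_rng(0)
hw, c = 16, 4
H = rng.standard_normal((hw, c))
Wq, Wk, Wv = (rng.standard_normal((c, c)) * 0.1 for _ in range(3))
out = spectral_attention(H, Wq, Wk, Wv, sigma=1.0)
print(out.shape)  # (16, 4)
```

Because the attention matrix is c×c rather than hw×hw, its cost scales with the number of spectral channels instead of the number of pixels, which is what makes this spectral formulation lightweight.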
6. The multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image according to claim 5, wherein the feed-forward network consists, in order, of a 1×1 convolution layer, a GELU activation function, a 3×3 convolution layer, a GELU activation function and a 1×1 convolution layer.
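A sketch of the claimed feed-forward structure (1×1 conv → GELU → 3×3 conv → GELU → 1×1 conv); the channel counts and the tanh approximation of GELU are illustrative assumptions:

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def conv1x1(x, w):
    # w: (C_out, C_in); pointwise channel mixing
    return np.einsum('oc,chw->ohw', w, x)

def conv3x3(x, w):
    # w: (C_out, C_in, 3, 3); stride 1, zero padding 1
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    h, wd = x.shape[1:]
    out = np.zeros((w.shape[0], h, wd))
    for di in range(3):
        for dj in range(3):
            out += np.einsum('oc,chw->ohw', w[:, :, di, dj], xp[:, di:di+h, dj:dj+wd])
    return out

def ffn(x, w1, w2, w3):
    # 1x1 conv -> GELU -> 3x3 conv -> GELU -> 1x1 conv, per claim 6
    return conv1x1(gelu(conv3x3(gelu(conv1x1(x, w1)), w2)), w3)

rng = np.random.default_rng(0)
c, ch = 8, 16                           # illustrative channel counts
x = rng.standard_normal((c, 6, 6))
w1 = rng.standard_normal((ch, c)) * 0.1
w2 = rng.standard_normal((ch, ch, 3, 3)) * 0.1
w3 = rng.standard_normal((c, ch)) * 0.1
y = ffn(x, w1, w2, w3)
print(y.shape)  # (8, 6, 6)
```

The two 1×1 layers expand and then restore the channel count, while the inner 3×3 layer mixes local spatial context; the spatial size is unchanged throughout.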
7. The multi-scale algorithm for reconstructing a corresponding hyperspectral image from an RGB image according to claim 1, wherein in S5 the steps of acquiring the hyperspectral image are:
s51, setting a convolution layer with 32 input channels and 31 output channels, wherein the convolution kernel size is 3×3 and the stride and padding are both set to 1;
s52, inputting the feature image obtained in S4, whose spectral-dimension feature information has been fully extracted by the Transformer, into the convolution layer, and obtaining through a LeakyReLU activation function a hyperspectral picture with 31 channels and the same spatial resolution as the input RGB picture.
8. A system for multi-scale reconstruction of a corresponding hyperspectral image from an RGB image, comprising: a multi-scale processing unit, a dimension superposition unit, a space-spectrum Transformer joint processing unit, a spectral-dimension Transformer processing unit and a hyperspectral image output unit;
the multi-scale processing unit is used for processing the original RGB image X through a multi-scale processing module and outputting a feature map Y'_i;
the dimension superposition unit is used for stacking the feature map Y'_i with the original RGB image X in the spectral-channel dimension to obtain a feature image F with multi-scale preprocessing information;
the space-spectrum Transformer joint processing unit is used for processing the feature image F with multi-scale preprocessing information sequentially through 3 space-spectrum Transformer joint processing modules to obtain a feature image with space-spectrum feature information;
the spectral-dimension Transformer processing unit is used for processing the feature image obtained by the space-spectrum Transformer joint processing unit through one spectral-dimension Transformer to obtain a feature image whose spectral-dimension feature information has been fully extracted by the Transformer;
and the hyperspectral image output unit is used for adjusting the channel dimension to the target 31-channel output through a 3×3 convolution layer, finally obtaining the hyperspectral image.
9. A computer storage medium storing a readable program, characterized in that the algorithm according to any one of claims 1-7 is executed when the program is run.
10. An apparatus, comprising: one or more processors, memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the algorithm of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211469458.2A CN116091916A (en) | 2022-11-22 | 2022-11-22 | Multi-scale hyperspectral image algorithm and system for reconstructing corresponding RGB images |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211469458.2A CN116091916A (en) | 2022-11-22 | 2022-11-22 | Multi-scale hyperspectral image algorithm and system for reconstructing corresponding RGB images |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116091916A true CN116091916A (en) | 2023-05-09 |
Family
ID=86201421
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211469458.2A Pending CN116091916A (en) | 2022-11-22 | 2022-11-22 | Multi-scale hyperspectral image algorithm and system for reconstructing corresponding RGB images |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116091916A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116433551A (en) * | 2023-06-13 | 2023-07-14 | 湖南大学 | High-resolution hyperspectral imaging method and device based on double-light-path RGB fusion |
CN116433551B (en) * | 2023-06-13 | 2023-08-22 | 湖南大学 | High-resolution hyperspectral imaging method and device based on double-light-path RGB fusion |
CN116990243A (en) * | 2023-09-26 | 2023-11-03 | 湖南大学 | GAP frame-based light-weight attention hyperspectral calculation reconstruction method |
CN116990243B (en) * | 2023-09-26 | 2024-01-19 | 湖南大学 | GAP frame-based light-weight attention hyperspectral calculation reconstruction method |
CN117314757A (en) * | 2023-11-30 | 2023-12-29 | 湖南大学 | Space spectrum frequency multi-domain fused hyperspectral computed imaging method, system and medium |
CN117314757B (en) * | 2023-11-30 | 2024-02-09 | 湖南大学 | Space spectrum frequency multi-domain fused hyperspectral computed imaging method, system and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Arad et al. | Ntire 2022 spectral recovery challenge and data set | |
Yang et al. | Deep edge guided recurrent residual learning for image super-resolution | |
CN116091916A (en) | Multi-scale hyperspectral image algorithm and system for reconstructing corresponding RGB images | |
CN110969124B (en) | Two-dimensional human body posture estimation method and system based on lightweight multi-branch network | |
CN112819910B (en) | Hyperspectral image reconstruction method based on double-ghost attention machine mechanism network | |
CN112766160A (en) | Face replacement method based on multi-stage attribute encoder and attention mechanism | |
Zhang et al. | LR-Net: Low-rank spatial-spectral network for hyperspectral image denoising | |
CN114881871A (en) | Attention-fused single image rain removing method | |
CN115393233A (en) | Full-linear polarization image fusion method based on self-encoder | |
CN114549567A (en) | Disguised target image segmentation method based on omnibearing sensing | |
CN112163998A (en) | Single-image super-resolution analysis method matched with natural degradation conditions | |
CN116797488A (en) | Low-illumination image enhancement method based on feature fusion and attention embedding | |
CN115578262A (en) | Polarization image super-resolution reconstruction method based on AFAN model | |
Lei et al. | Tghop: an explainable, efficient, and lightweight method for texture generation | |
CN115631107A (en) | Edge-guided single image noise removal | |
CN116757986A (en) | Infrared and visible light image fusion method and device | |
Bao et al. | S 2 net: Shadow mask-based semantic-aware network for single-image shadow removal | |
CN113034408B (en) | Infrared thermal imaging deep learning image denoising method and device | |
Liu et al. | Residual-guided multiscale fusion network for bit-depth enhancement | |
Liu et al. | Multi-Scale Underwater Image Enhancement in RGB and HSV Color Spaces | |
CN114202460A (en) | Super-resolution high-definition reconstruction method, system and equipment facing different damage images | |
CN113379606A (en) | Face super-resolution method based on pre-training generation model | |
CN117237207A (en) | Ghost-free high dynamic range light field imaging method for dynamic scene | |
CN115937429A (en) | Fine-grained 3D face reconstruction method based on single image | |
CN116309278A (en) | Medical image segmentation model and method based on multi-scale context awareness |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |