CN116993584A - Multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method - Google Patents
Multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method
- Publication number: CN116993584A (application CN202310745724.8A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06T3/4053 — Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06T3/4046 — Scaling of whole images or parts thereof using neural networks
- G06N3/0455 — Auto-encoder networks; encoder-decoder networks
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/047 — Probabilistic or stochastic networks
- G06N3/0475 — Generative networks
- G06N3/048 — Activation functions
- G06N3/084 — Backpropagation, e.g. using gradient descent
- G06N3/0895 — Weakly supervised learning, e.g. semi-supervised or self-supervised learning
- G06N3/096 — Transfer learning
- G06N3/0985 — Hyperparameter optimisation; meta-learning; learning-to-learn
- Y02A40/10 — Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
Abstract
The invention discloses a multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method, an image spectrum cross-domain migration super-resolution reconstruction method based on cross-domain migratable knowledge learning and a fast target-domain adaptation learning mode, which realizes spectral super-resolution reconstruction from an RGB image to a hyperspectral image. A model structure design based on a migratable dictionary is adopted to learn features that can be transferred across domains; a source-domain pre-training strategy with a shared learnable mask facilitates model learning of general knowledge for reconstruction; and a fine-tuning method based on model-agnostic meta-learning learns a general model with strong generalization capability, so that the model can adapt to the test data of a target domain within a few iteration steps. The invention can mine cross-domain shared knowledge to improve generalization capability and further improve the effect of cross-domain spectral super-resolution reconstruction.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a spectrum cross-domain migration super-resolution reconstruction method.
Background
Hyperspectral images are images in which tens or even hundreds of consecutive spectral bands are acquired for each pixel in the visible and infrared spectral ranges. Compared with a traditional RGB image, a hyperspectral image provides much richer spectral information and can identify the spectral characteristics of materials, enabling finer analysis of earth-surface substances.
The hyperspectral image can be applied to the fields of environmental remote sensing, agriculture, forestry, geological exploration, urban planning and the like. For example, in the agricultural field, the hyperspectral image can be used for rapidly identifying, classifying, monitoring and managing crops, so that the yield and quality of the crops are improved. In environmental monitoring, hyperspectral images can be used to identify and monitor harmful substances in a body of water, as well as to monitor vegetation coverage and land use changes. In the field of urban planning, the hyperspectral image can be used for measuring urban green land coverage and building height, optimizing urban planning and facility layout and the like.
In conclusion, the hyperspectral image has wide application prospect as an image with abundant spectral information.
However, hyperspectral cameras are not as widely used as ordinary cameras because they are expensive, slow to image, and bulky. To fully utilize the advantages of hyperspectral images while avoiding the problems of hyperspectral imaging devices, researchers have proposed spectral super-resolution methods, which aim to estimate and reconstruct a hyperspectral image from a traditional RGB image.
Existing spectral super-resolution methods can be broadly divided into two categories according to the reconstruction scheme. The first category comprises conventional methods, such as: (1) spectral super-resolution based on spectral decomposition, which decomposes and reconstructs the spectral signal with a spectral decomposition algorithm; for example, the technique based on the non-negative matrix factorization (NMF) algorithm in "Coupled Nonnegative Matrix Factorization Unmixing for Hyperspectral and Multispectral Data Fusion" decomposes and reconstructs the spectral signal to achieve super-resolution spectral imaging. (2) Spectral super-resolution based on sparse representation, which decomposes and reconstructs the spectral signal with a sparse-representation algorithm, such as one based on dictionary learning; for example, the technique in "Spectral Reflectance Recovery from a Single RGB Image" performs sparse representation and reconstruction of the spectral signal. (3) Spectral super-resolution based on spectral libraries and models, which trains and optimizes a model on the spectral signal using a spectral library; for example, techniques based on the partial least squares regression (PLSR) algorithm model and predict the spectral signal to achieve super-resolution spectral imaging. These conventional methods often suffer from slow running speed and poor reconstruction quality.
The second category is based on deep learning: a deep network, such as a convolutional neural network (CNN) ("Pixel-aware Deep Function-mixture Network for Spectral Super-Resolution") or a Transformer ("MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction"), is trained on the optical signal to achieve spectral super-resolution. Although deep-learning-based methods have advanced tremendously in recent years and achieve excellent performance on a single dataset, their performance degrades severely when tested on scenes outside the training set.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method, based on cross-domain migratable knowledge learning and a fast target-domain adaptation learning mode, which realizes spectral super-resolution reconstruction from an RGB image to a hyperspectral image. A model structure design based on a migratable dictionary is adopted to learn features that can be transferred across domains; a source-domain pre-training strategy with a shared learnable mask facilitates model learning of general knowledge for reconstruction; and a fine-tuning method based on model-agnostic meta-learning learns a general model with strong generalization capability, so that the model can adapt to the test data of a target domain within a few iteration steps. The invention can mine cross-domain shared knowledge to improve generalization capability and further improve the effect of cross-domain spectral super-resolution reconstruction.
The technical scheme adopted by the invention for solving the technical problems comprises the following steps:
step 1: for an RGB image img ∈ ℝ^(h×w×3), where h and w represent the height and width of the image respectively, let hsi ∈ ℝ^(h×w×31) represent the hyperspectral image corresponding to the input image img;
inputting the image into a coding layer, and mapping the channel number of the input image from 3 to 31 to realize preliminary spectrum reconstruction and alignment;
e=embedding(img)
wherein embedding(·) represents the embedding layer, instantiated by a convolution layer with kernel size 3 and stride 1, and e represents the hidden-layer feature after embedding;
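As a sketch of this step (shapes only; the convolution weights below are random placeholders, not the learned parameters), the 3×3 embedding convolution mapping 3 RGB channels to 31 spectral channels could look like:

```python
import numpy as np

def embedding(img, weight, bias):
    """3x3 convolution, stride 1, zero padding 1: maps (h, w, 3) RGB to (h, w, 31) features."""
    h, w, _ = img.shape
    c_out = weight.shape[0]                      # weight: (c_out, c_in, 3, 3)
    pad = np.pad(img, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, w, c_out))
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 3, j:j + 3, :].transpose(2, 0, 1)   # (c_in, 3, 3)
            out[i, j] = np.tensordot(weight, patch, axes=3) + bias
    return out

rng = np.random.default_rng(0)
img = rng.random((8, 8, 3))
W = rng.standard_normal((31, 3, 3, 3)) * 0.1    # random placeholder weights
b = np.zeros(31)
e = embedding(img, W, b)
print(e.shape)  # (8, 8, 31)
```

In a real implementation this is a single framework convolution layer; the loop form here only makes the channel mapping 3 → 31 explicit.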
step 2: randomly mask the hidden-layer feature e obtained in step 1 in cube form: randomly sample fixed-size cubes over the image and replace the features at those positions with a shared learnable mask;
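A minimal NumPy sketch of the cube-masking operation (the cube count and size here are illustrative choices, not values fixed by the text; in the real model the mask token is a learnable parameter):

```python
import numpy as np

def cube_mask(e, mask_token, n_cubes=1, size=2, rng=None):
    """Replace randomly sampled fixed-size spatial cubes of the feature map
    with a shared (learnable, in the real model) mask token."""
    rng = rng or np.random.default_rng()
    out = e.copy()
    h, w, _ = e.shape
    for _ in range(n_cubes):
        i = int(rng.integers(0, h - size + 1))
        j = int(rng.integers(0, w - size + 1))
        out[i:i + size, j:j + size, :] = mask_token   # (c,) token broadcast over the cube
    return out

masked = cube_mask(np.zeros((8, 8, 31)), np.full(31, 7.0), rng=np.random.default_rng(0))
print(int((masked == 7.0).sum()))  # 124 = 2 x 2 x 31 replaced features
```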
step 3: refine the hidden feature e obtained in step 1 using the inter-spectrum-attention-based module, expressed as follows:
s=SpectralTransformerBlock(e)
wherein SpectralTransformerBlock(·) represents an inter-spectrum Transformer module and s represents the hidden-layer feature it outputs; a plurality of inter-spectrum Transformer modules are stacked in the spectrum reconstruction model, the output of each module being the input of the next; SpectralTransformerBlock(·) consists of SpectralAttention, FFN, and LayerNorm:
SpectralTransformerBlock(x)=t+(FFN(LayerNorm(t)))
where t = x + SpectralAttention(LayerNorm(x)), LayerNorm represents the layer normalization operation, FFN(x) = conv(GELU(conv(x))), conv represents a convolution layer, GELU is a nonlinear activation function, and x represents an input tensor;
attention(Q, K, V) = softmax(σ_i · Q K^T) V
SpectralAttention(X) = attention(X W_Q, X W_K, X W_V)
where σ_i is a learnable scale factor, W_Q, W_K, W_V are learnable projection matrices, and X is the input tensor, obtained by rearranging the shape of the feature tensor of the input image;
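The block can be sketched in NumPy as follows, with matrix multiplications standing in for the FFN convolutions and ReLU standing in for GELU (all sizes and weights below are placeholders); rows of X play the role of spectral tokens:

```python
import numpy as np

def softmax(x):
    x = x - x.max(-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(-1, keepdims=True)

def layer_norm(x, eps=1e-6):
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + eps)

def spectral_attention(X, Wq, Wk, Wv, sigma=1.0):
    """attention(Q, K, V) = softmax(sigma * Q K^T) V over spectral tokens (rows of X)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    return softmax(sigma * Q @ K.T) @ V

def spectral_transformer_block(x, Wq, Wk, Wv, W1, W2, sigma=1.0):
    t = x + spectral_attention(layer_norm(x), Wq, Wk, Wv, sigma)
    ffn = np.maximum(layer_norm(t) @ W1, 0.0) @ W2   # matmul + ReLU stand in for conv + GELU
    return t + ffn

rng = np.random.default_rng(0)
d = 16                                  # per-band feature length (placeholder)
X = rng.random((31, d))                 # 31 spectral tokens, one per reconstructed band
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
W1, W2 = rng.standard_normal((d, 2 * d)) * 0.1, rng.standard_normal((2 * d, d)) * 0.1
s = spectral_transformer_block(X, Wq, Wk, Wv, W1, W2)
print(s.shape)  # (31, 16)
```

Note that the attention map here is over spectral bands (31 × 31), not spatial positions, which is what makes the module "inter-spectrum".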
step 4: generate a migratable dictionary with a generator network instantiated by a multi-layer fully connected neural network; then split the hidden-layer feature s into feature blocks of a given size along the spatial dimensions, and inject cross-domain shared knowledge into the feature map by letting the generated migratable dictionary interact with the feature blocks of s through a cross-attention mechanism:
z=Generator(randomVector+map(s))
c=CrossAttention(s,z)
wherein CrossAttention(S, Z) = attention(S W_Q, Z W_K, Z W_V) represents cross-attention; map(·) is a mapper that projects the information of s into a hidden space, instantiated by a multi-layer fully connected neural network following an information-bottleneck structure; Generator(·) represents a generator network that receives the vector randomVector, randomly sampled from a Gaussian distribution, together with the domain information of the image itself (obtained from s via map) and generates the migratable dictionary;
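Step 4 can be sketched as follows (the MLP depths, dimensions, and weights are illustrative placeholders; queries come from the feature blocks of s, keys/values from the generated dictionary z):

```python
import numpy as np

def mlp(x, weights):
    """Multi-layer fully connected network with ReLU between layers."""
    for W in weights[:-1]:
        x = np.maximum(x @ W, 0.0)
    return x @ weights[-1]

def softmax(x):
    x = x - x.max(-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(-1, keepdims=True)

def cross_attention(S, Z, Wq, Wk, Wv):
    """Queries from the feature blocks S, keys/values from the dictionary Z."""
    Q, K, V = S @ Wq, Z @ Wk, Z @ Wv
    return softmax(Q @ K.T) @ V

rng = np.random.default_rng(0)
d = 16                                        # feature dim (placeholder)
s_blocks = rng.random((8, d))                 # 8 spatial feature blocks of s
map_w = [rng.standard_normal((d, 4)) * 0.1,   # bottleneck mapper: d -> 4 -> d
         rng.standard_normal((4, d)) * 0.1]
gen_w = [rng.standard_normal((d, d)) * 0.1, rng.standard_normal((d, d)) * 0.1]
random_vector = rng.standard_normal((8, d))   # Gaussian noise input to the generator
z = mlp(random_vector + mlp(s_blocks, map_w), gen_w)   # migratable dictionary
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
c = cross_attention(s_blocks, z, Wq, Wk, Wv)
print(c.shape)  # (8, 16)
```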
step 5: the features are refined again using the inter-spectrum Transformer module, finally obtaining the reconstructed hyperspectral image, expressed as:
hsi′=SpectralTransformerBlock(c)
where hsi' represents the reconstructed hyperspectral image.
Preferably, the training process of the spectrum reconstruction model is as follows:
by adopting a meta-learning fine tuning method based on model agnostic, a model parameter with strong universality is learned, so that the spectrum reconstruction model can adapt to the characteristics of any domain through a plurality of steps of self-supervision fine tuning on the image of the domain, and the specific algorithm is as follows:
firstly, initializing model parameters;
Then construct the training data; the specific method is as follows: sample N tasks {Tᵢ} from the dataset, where each task Tᵢ consists of K data pairs.
For each task Tᵢ, compute the self-supervised loss L_self on the K example data and update the inner-layer model parameters along its gradient: θ′ᵢ = θ − α∇_θ L_self(f_θ).
For all tasks, use the updated model parameters θ′ᵢ to compute the supervised loss L_sup and update the outer-layer model parameters along its gradient: θ ← θ − β∇_θ Σᵢ L_sup(f_{θ′ᵢ}).
where L_self represents the self-supervised loss, of the form L_self = mse(d(f_θ(x)), x), in which d(·) represents the spectral response function from the hyperspectral image to the RGB image, f_θ(·) represents a neural network with parameters θ, and x is the input image; L_sup represents the supervised loss, of the form L_sup = mse(f_θ(x), hsi), where mse represents the mean-square loss.
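The inner/outer updates above can be illustrated on a deliberately toy scalar model (f_θ, d, and the losses below are stand-ins for the real network and spectral response function, and the outer step uses the first-order MAML approximation rather than the full second-order gradient):

```python
def f(theta, x):            # toy scalar model standing in for the network f_theta
    return theta * x

def d(y):                   # toy stand-in for the spectral response function (HSI -> RGB)
    return 0.5 * y

def self_grad(theta, x):    # grad of L_self = ||d(f_theta(x)) - x||^2 w.r.t. theta
    return 2.0 * (d(f(theta, x)) - x) * 0.5 * x

def sup_grad(theta, x, y):  # grad of L_sup = ||f_theta(x) - y||^2 w.r.t. theta
    return 2.0 * (f(theta, x) - y) * x

def maml_step(theta, tasks, alpha=0.01, beta=0.001):
    """One outer iteration: per-task inner self-supervised step, then a
    first-order outer update using the supervised gradient at theta'_i."""
    outer = 0.0
    for x, y in tasks:
        theta_i = theta - alpha * self_grad(theta, x)   # inner-layer update
        outer += sup_grad(theta_i, x, y)                # first-order MAML approximation
    return theta - beta * outer / len(tasks)

theta = maml_step(0.0, [(1.0, 2.0), (2.0, 4.0)])
```

The structure mirrors the text: each task takes a gradient step on the self-supervised loss, and the meta-parameters are then updated with the supervised loss evaluated at the adapted parameters.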
The beneficial effects of the invention are as follows:
because the general deep learning model is too dependent on the memory of the training data set, is excessively fitted to the training data set, and does not learn enough knowledge that can be shared across domains, the masking operation of the cube level is used to force the model to learn interactions between spectrums and spaces, and the knowledge of the interactions is shared among all spectrum reconstructions. And using a generator to generate a migratable patch to be added to the spectral reconstruction as cross-domain shared knowledge. The design ensures that the knowledge of cross-domain sharing can be mined from the angle of the model so as to improve the generalization capability and further improve the effect of cross-domain spectrum super-division reconstruction. The model-agnostic meta-learning method does not directly learn the reconstruction from a single RGB image to a hyperspectral image, but learns model parameters with strong generalization, which can be quickly adapted to a target domain in a plurality of steps of iteration for any image, further enhances the universality of the model, and simultaneously matches with the self-supervision quick fine tuning of the target domain, so that the model has the capacity of cross-domain migration and super-division reconstruction.
Drawings
FIG. 1 is a schematic diagram of a model structure of a migratable dictionary.
Detailed Description
The invention will be further described with reference to the drawings and examples.
As shown in fig. 1, in order to fully exploit the high performance of deep-learning-based spectral super-resolution while alleviating the severe degradation of test performance on scenes outside the training set, we propose a multi-domain image-scene-oriented image spectrum cross-domain migration super-resolution reconstruction method, based on cross-domain migratable knowledge learning and a fast target-domain adaptation learning mode, for spectral super-resolution reconstruction from RGB images to hyperspectral images. It comprises: a model structure design based on a migratable dictionary, used to learn features that can be transferred across domains; a source-domain pre-training strategy based on a shared learnable mask, to facilitate model learning of common knowledge for reconstruction; and a fine-tuning method based on model-agnostic meta-learning, used to learn a general model with strong generalization capability, so that the model can adapt to the test data of a target domain within a few iteration steps.
A multi-domain image-scene-oriented image spectrum cross-domain migration super-resolution reconstruction method based on cross-domain migratable knowledge learning and a fast target-domain adaptation learning mode comprises the following aspects and steps:
spectral reconstruction model structure:
step 1: for RGB imagesWherein h and w represent the height and width of the image, respectively, which are denoted +.>Representing a hyperspectral image corresponding to the input image img. The image is input to the coding layer, the channel number of the input image is mapped from 3 to 31, and preliminary spectrum reconstruction and alignment are realized.
e=embedding(img)
Where embedding(·) represents the embedding layer, instantiated by a convolution layer with kernel size 3 and stride 1, and e represents the hidden-layer feature after embedding.
Step 2: and (3) carrying out random masking on the hidden layer features obtained in the step (1) in a form of cube, randomly sampling the cube with a fixed size on the image, and replacing the features at the position with a shared learnable mask.
Step 3: the method comprises the following steps: 2, refining the obtained hidden characteristic e by using an inter-spectrum attention module. The expression is as follows:
s=SpectralTransformerBlock(e)
where SpectralTransformerBlock(·) represents the inter-spectrum Transformer module and s represents the hidden-layer feature it outputs; note that we stack multiple such modules in the model, the output of each being the input of the next. SpectralTransformerBlock(·) consists of SpectralAttention, FFN, and LayerNorm:
SpectralTransformerBlock(x)=t+(FFN(LayerNorm(t)))
where t = x + SpectralAttention(LayerNorm(x)), LayerNorm represents the layer normalization operation, FFN(x) = conv(GELU(conv(x))), conv represents a convolution layer, GELU is a nonlinear activation function, and x represents an input tensor.
attention(Q, K, V) = softmax(σ_i · Q K^T) V
SpectralAttention(X) = attention(X W_Q, X W_K, X W_V)
where σ_i is a learnable scale factor, W_Q, W_K, W_V are learnable projection matrices, and X is the input tensor, obtained by rearranging the feature tensor of the input image.
Step 4: generating a migratable dictionary by using a generator network instantiated by a multi-layer fully-connected neural network, then dividing hidden layer features s into feature blocks with a certain size according to space dimensions, and injecting knowledge shared across domains into a feature graph by using the migratable dictionary generated by a cross-attention mechanism interaction generator and the feature blocks of the hidden layer features s to form:
z=Generator(randomVector+map(s))
c=CrossAttention(s,z)
wherein CrossAttention(S, Z) = attention(S W_Q, Z W_K, Z W_V) represents cross-attention; map(·) is a mapper that projects the information of s into a hidden space, instantiated by a multi-layer fully connected neural network following an information-bottleneck structure; Generator(·) represents a generator network that receives the vector randomVector, randomly sampled from a Gaussian distribution, together with the domain information of the image itself (obtained from s via map) and generates the migratable dictionary;
step 5: the features are refined again using the inter-spectrum transducer module mentioned in step 3, and the reconstructed hyperspectral image is finally obtained, which can be expressed as:
hsi′=SpectralTransformerBlock(c)
where hsi' represents the reconstructed hyperspectral image.
Model-agnostic meta-learning fine tuning method:
in order to obtain good performance of the model on multi-domain images, we propose to learn a model parameter with strong universality by using a meta-learning fine tuning method which is unknown to the model, so that the model can be well adapted to the characteristics of the domain through fine tuning of several steps of self-supervision on any domain image, and a specific algorithm flow is shown as an algorithm 1:
where L_self represents the self-supervised loss, of the form L_self = mse(d(f_θ(x)), x), in which d(·) represents the spectral response function from the hyperspectral image to the RGB image, f_θ(·) represents a neural network with parameters θ, and x is the input image; L_sup represents the supervised loss, of the form L_sup = mse(f_θ(x), hsi), where mse represents the mean-square loss.
Specific examples:
the invention provides a multi-domain image scene-oriented image spectrum cross-domain migration super-resolution reconstruction method based on a cross-domain migratable knowledge learning and target domain fast adaptation learning mode, which comprises the following specific processes:
1. data preprocessing
For a given training set D_train, there are RGB and hyperspectral image pairs {img_i, hsi_i}; for a given test set D_test, hsi_i may be unavailable. When training the model, both the RGB data and the hyperspectral data are normalized to the range [0, 1].
Furthermore, for the input image img_i and its corresponding hsi_i, data augmentation by random cropping, random horizontal flipping, and random vertical flipping is adopted to enhance the generalization capability of the model.
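A sketch of the paired normalization and augmentation (assumes 8-bit RGB input and a hyperspectral cube already scaled to [0, 1]; the crop size is illustrative):

```python
import numpy as np

def preprocess(img, hsi, crop=64, rng=None):
    """Normalize 8-bit RGB to [0, 1] and apply the same random crop and flips to both images."""
    rng = rng or np.random.default_rng()
    img = img.astype(np.float64) / 255.0
    h, w = img.shape[:2]
    i = int(rng.integers(0, h - crop + 1))
    j = int(rng.integers(0, w - crop + 1))
    img, hsi = img[i:i + crop, j:j + crop], hsi[i:i + crop, j:j + crop]
    if rng.random() < 0.5:                       # random horizontal flip
        img, hsi = img[:, ::-1], hsi[:, ::-1]
    if rng.random() < 0.5:                       # random vertical flip
        img, hsi = img[::-1], hsi[::-1]
    return img.copy(), hsi.copy()

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (128, 128, 3), dtype=np.uint8)
hsi = rng.random((128, 128, 31))
img_c, hsi_c = preprocess(img, hsi, rng=np.random.default_rng(1))
print(img_c.shape, hsi_c.shape)  # (64, 64, 3) (64, 64, 31)
```

Applying the identical crop and flips to both images keeps the RGB/hyperspectral pair spatially aligned, which the supervised loss requires.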
2. Preliminary pre-training based on random mini-batches
Since model-agnostic meta-learning is unstable when learning from scratch, and the self-supervised step of the inner meta-learning loop reduces the utilization of supervised data, the model is first trained with random mini-batches in an early stage so that it quickly and stably converges to a good position. Specifically, n samples are drawn from the training set D_train to form a batch and fed into a randomly initialized model; Adam is used as the optimizer and MRAE as the loss function, with an initial learning rate of 4e-4 gradually reduced by cosine annealing, training for 150 rounds.
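The MRAE loss and cosine-annealed learning-rate schedule mentioned above can be sketched as follows (the annealing floor lr_min and the eps guard are assumptions, not values given in the text):

```python
import numpy as np

def mrae(pred, target, eps=1e-8):
    """Mean relative absolute error, the pre-training loss named in the text."""
    return float(np.mean(np.abs(pred - target) / (np.abs(target) + eps)))

def cosine_lr(step, total_steps, lr0=4e-4, lr_min=0.0):
    """Cosine-annealed learning rate decaying from lr0 (4e-4 per the text) to lr_min."""
    return lr_min + 0.5 * (lr0 - lr_min) * (1.0 + np.cos(np.pi * step / total_steps))

loss = mrae(np.array([1.1, 0.9]), np.array([1.0, 1.0]))
lr_start = cosine_lr(0, 150)
```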
3. Model agnostic meta-learning training
A task set {Tᵢ} is sampled from the dataset D. Each task Tᵢ comprises a support set and a query set, each instantiated for simplicity as K mutually non-overlapping RGB–hyperspectral sample pairs {(img_i, hsi_i) | i = 1, 2, ..., K}. The support set is used to compute the inner-layer self-supervised loss L_self, and the query set to compute the outer-layer supervised loss L_sup. In addition, when meta-learning training begins, the model is initialized with the pre-trained weights obtained from the random mini-batch pre-training, and the mask structure is removed, ensuring that the model computation graph during meta-learning optimization stays as consistent as possible with fine-tuning and inference at test time. Generally, for the inner-loop update producing θ′ᵢ, if the number of steps p is too small it is difficult to iterate to a solution that fits the task, while if p is too large the local parameters overfit the task, so p is typically set to 10. The inner- and outer-loop learning rates α and β are set to 1e-5 and 1e-6, respectively.
4. Target domain fine tuning and reconstruction reasoning
After meta-learning is completed, the trained model can perform spectral super-resolution reconstruction from RGB to hyperspectral images. The specific algorithm flow is shown in Algorithm 2:
the final obtained algorithm output hsi' is the hyperspectral image reconstructed from the input RGB image img.
Claims (2)
1. A multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method, characterized by comprising the following steps:
step 1: for an RGB image img ∈ ℝ^(h×w×3), where h and w represent the height and width of the image respectively, let hsi ∈ ℝ^(h×w×31) represent the hyperspectral image corresponding to the input image img;
inputting the image into a coding layer, and mapping the channel number of the input image from 3 to 31 to realize preliminary spectrum reconstruction and alignment;
e=embedding(img)
wherein embedding(·) represents the embedding layer, instantiated by a convolution layer with kernel size 3 and stride 1, and e represents the hidden-layer feature after embedding;
step 2: carrying out random masking on the hidden layer feature e obtained in the step 1 in a cube form, randomly sampling a cube with a fixed size on an image, and replacing the feature at the position with a shared learnable mask;
step 3: and (2) refining the hidden characteristic e obtained in the step (1) by using an inter-spectrum attention-based module, wherein the hidden characteristic e is expressed as follows:
s=SpectralTransformerBlock(e)
wherein SpectralTransformerBlock(·) denotes an inter-spectrum Transformer module and s denotes the hidden-layer feature it outputs; a plurality of inter-spectrum Transformer modules are stacked in the spectrum reconstruction model, with the output of each module serving as the input of the next; SpectralTransformerBlock(·) is composed of SpectralAttention, an FFN and LayerNorm:
SpectralTransformerBlock(x) = t + FFN(LayerNorm(t))
where t = x + SpectralAttention(LayerNorm(x)), LayerNorm denotes the layer normalization operation, and FFN(x) = conv(GELU(conv(x))), where conv denotes a convolutional layer, GELU is a nonlinear activation function, and x denotes an input tensor;
attention(Q, K, V) = softmax(σ_i QK^T)V
SpectralAttention(X) = attention(XW_Q, XW_K, XW_V)
wherein σ_i is a learnable scale factor, W_Q, W_K, W_V are learnable projection matrices, and the input tensor X is obtained by rearranging the shape of the feature tensor of the input image;
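The SpectralTransformerBlock equations above can be sketched as follows; the 1×1 convolutions of the FFN are approximated as per-token linear layers, and all weights are randomly initialized stand-ins rather than trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
C = 31                                             # number of spectral channels

def layer_norm(x, eps=1e-6):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    z = np.exp(x - x.max(-1, keepdims=True))
    return z / z.sum(-1, keepdims=True)

def gelu(x):                                       # tanh approximation of GELU
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

# learnable parameters, randomly initialized for the sketch
sigma = 1.0                                        # learnable scale factor sigma_i
Wq, Wk, Wv = (rng.standard_normal((C, C)) * 0.1 for _ in range(3))
W1 = rng.standard_normal((C, C)) * 0.1             # FFN "conv" weights (1x1 ~ linear)
W2 = rng.standard_normal((C, C)) * 0.1

def spectral_attention(X):                         # X: (tokens, C)
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    return softmax(sigma * Q @ K.T) @ V            # attention(Q, K, V)

def ffn(x):                                        # conv(GELU(conv(x)))
    return gelu(x @ W1) @ W2

def spectral_transformer_block(x):
    t = x + spectral_attention(layer_norm(x))      # t = x + SA(LN(x))
    return t + ffn(layer_norm(t))                  # t + FFN(LN(t))

X = rng.standard_normal((64, C))                   # an 8x8 feature map flattened to hw tokens
s = spectral_transformer_block(X)
```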
step 4: generating a migratable dictionary by using a generator network instantiated by a multi-layer fully-connected neural network, then dividing hidden layer features s into feature blocks with a certain size according to space dimensions, and injecting knowledge shared across domains into a feature graph by using the migratable dictionary generated by a cross-attention mechanism interaction generator and the feature blocks of the hidden layer features s to form:
z=Generator(randomVector+map(s))
c=CrossAttention(s,z)
wherein CrossAttention(s, z) = attention(sW_Q, zW_K, zW_V) denotes cross-attention; map(·) is a mapper that maps the information of s into a hidden space, instantiated as a multi-layer fully-connected neural network whose structure follows the information-bottleneck design; Generator(·) denotes the generator network, which receives the vector randomVector randomly sampled from a Gaussian distribution together with the domain information of the image itself, obtained from s via map(·), and generates the migratable dictionary;
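A minimal sketch of the generator and cross-attention interaction of step 4; the dictionary size of 16 entries, the bottleneck width of 8, and the mean-pooling used inside the mapper are illustrative assumptions not fixed by the claim:

```python
import numpy as np

rng = np.random.default_rng(0)
C, DICT = 31, 16                        # channels, dictionary entries (assumed)

def softmax(x):
    z = np.exp(x - x.max(-1, keepdims=True))
    return z / z.sum(-1, keepdims=True)

# mapper: compresses s into a domain code (information-bottleneck shape)
Wm1 = rng.standard_normal((C, 8)) * 0.1
Wm2 = rng.standard_normal((8, C)) * 0.1
# generator: MLP turning (random vector + domain code) into a dictionary
Wg = rng.standard_normal((C, DICT * C)) * 0.1
# cross-attention projections
Wq, Wk, Wv = (rng.standard_normal((C, C)) * 0.1 for _ in range(3))

def mapper(s):                          # map(s): pooled domain information
    return np.tanh(s.mean(0) @ Wm1) @ Wm2

def generator(rand_vec, s):             # z = Generator(randomVector + map(s))
    z = (rand_vec + mapper(s)) @ Wg
    return z.reshape(DICT, C)           # the migratable dictionary

def cross_attention(s, z):              # attention(s Wq, z Wk, z Wv)
    Q, K, V = s @ Wq, z @ Wk, z @ Wv
    return softmax(Q @ K.T) @ V

s = rng.standard_normal((64, C))        # hidden features as hw x C tokens
z = generator(rng.standard_normal(C), s)
c = cross_attention(s, z)
```

Note that the queries come from the image features while keys and values come from the generated dictionary, which is what lets the dictionary inject cross-domain knowledge into every spatial position.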
Step 5: refining the features again with the inter-spectrum Transformer module to finally obtain the reconstructed hyperspectral image, expressed as:
hsi′=SpectralTransformerBlock(c)
where hsi' represents the reconstructed hyperspectral image.
2. The multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method according to claim 1, wherein the training process of the spectrum reconstruction model is as follows:
by adopting a meta-learning fine tuning method based on model agnostic, a model parameter with strong universality is learned, so that the spectrum reconstruction model can adapt to the characteristics of any domain through a plurality of steps of self-supervision fine tuning on the image of the domain, and the specific algorithm is as follows:
firstly, initializing model parameters;
then constructing training data as follows: sampling N tasks {T_i} from the dataset, each task consisting of K data pairs;
for each task T_i, calculating the self-supervised loss L_self on its K support examples and updating the inner-layer model parameters along the gradient: θ'_i = θ − α∇_θ L_self(f_θ);
for all tasks, using the updated model parameters θ'_i to calculate the supervised loss L_sup and updating the outer-layer model parameters along the gradient: θ ← θ − β∇_θ Σ_i L_sup(f_{θ'_i});
wherein L_self denotes the self-supervised loss, of the form L_self(f_θ) = mse(d(f_θ(x)), x), where d(·) denotes the spectral response function mapping a hyperspectral image to an RGB image, f_θ(·) denotes the neural network with parameters θ, and x is the input image; L_sup denotes the supervised loss, of the form L_sup(f_{θ'_i}) = mse(f_{θ'_i}(x), hsi), where mse denotes the mean-square loss.
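The inner/outer loop above can be sketched with a first-order approximation of the meta-gradient (the claim does not specify the order of the gradient) on a toy per-pixel linear model with hand-derived gradients; the enlarged learning rates, task count, and task sizes are assumptions chosen so the toy converges quickly:

```python
import numpy as np

rng = np.random.default_rng(0)
D = rng.random((31, 3)) / 31.0          # spectral response d(.), a stand-in

def grad_self(theta, X):                # d/dtheta of mse(d(f_theta(x)), x)
    A = X @ theta @ D - X
    return 2.0 * X.T @ A @ D.T / A.size

def grad_sup(theta, X, H):              # d/dtheta of mse(f_theta(x), hsi)
    A = X @ theta - H
    return 2.0 * X.T @ A / A.size

def sup_loss(theta, X, H):
    return ((X @ theta - H) ** 2).mean()

alpha, beta, p = 1e-2, 1e-2, 10         # toy rates; the text uses 1e-5/1e-6, p = 10
theta = rng.standard_normal((3, 31)) * 0.1
theta_true = rng.random((3, 31))        # ground-truth map generating the labels

# N = 4 tasks, each a (support RGB, query RGB, query HSI) triple
tasks = []
for _ in range(4):
    Xs, Xq = rng.random((32, 3)), rng.random((32, 3))
    tasks.append((Xs, Xq, Xq @ theta_true))

loss0 = np.mean([sup_loss(theta, Xq, Hq) for _, Xq, Hq in tasks])
for _ in range(100):                    # outer-layer iterations
    outer_grad = np.zeros_like(theta)
    for Xs, Xq, Hq in tasks:
        th_i = theta.copy()
        for _ in range(p):              # p inner self-supervised steps on support
            th_i -= alpha * grad_self(th_i, Xs)
        outer_grad += grad_sup(th_i, Xq, Hq)   # first-order meta-gradient approx.
    theta -= beta * outer_grad          # outer supervised update on query
loss1 = np.mean([sup_loss(theta, Xq, Hq) for _, Xq, Hq in tasks])
```

The inner loop only ever sees RGB support data (self-supervised), while the outer update uses the hyperspectral query labels, mirroring the L_self / L_sup split of the claim.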
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310745724.8A CN116993584A (en) | 2023-06-21 | 2023-06-21 | Multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method |
PCT/CN2023/113283 WO2024082796A1 (en) | 2023-06-21 | 2023-08-16 | Spectral cross-domain transfer super-resolution reconstruction method for multi-domain image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116993584A (en) | 2023-11-03
Family
ID=88525616
Also Published As
Publication number | Publication date |
---|---|
WO2024082796A1 (en) | 2024-04-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||