CN116993584A - Multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method - Google Patents

Multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method

Info

Publication number
CN116993584A
CN116993584A (application CN202310745724.8A)
Authority
CN
China
Prior art keywords
domain
image
spectrum
model
cross
Prior art date
Legal status
Pending
Application number
CN202310745724.8A
Other languages
Chinese (zh)
Inventor
张艳宁
张磊
魏巍
任维鑫
王昊宇
Current Assignee
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202310745724.8A priority Critical patent/CN116993584A/en
Priority to PCT/CN2023/113283 priority patent/WO2024082796A1/en
Publication of CN116993584A publication Critical patent/CN116993584A/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0475Generative networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0895Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/096Transfer learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0985Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/10Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method, an image spectrum cross-domain migration super-resolution reconstruction method based on cross-domain migratable knowledge learning and fast target-domain adaptation, which realizes spectral super-resolution reconstruction from an RGB image to a hyperspectral image. A model structure based on a migratable dictionary is adopted to learn features that can be transferred across domains; a source-domain pre-training strategy with a shared learnable mask encourages the model to learn general knowledge for reconstruction; and a fine-tuning method based on model-agnostic meta-learning learns a general model with strong generalization ability, so that the model can adapt to the test data of a target domain under test within a few iteration steps. The invention mines cross-domain shared knowledge to improve generalization ability and thereby improve the effect of cross-domain spectral super-resolution reconstruction.

Description

Multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a spectrum cross-domain migration super-resolution reconstruction method.
Background
Hyperspectral images are images in which tens or even hundreds of consecutive spectral bands are acquired for each pixel over the visible and infrared spectral ranges. Compared with a traditional RGB image, a hyperspectral image provides much richer spectral information and can identify the spectral characteristics of materials, allowing surface substances to be analyzed in finer detail.
The hyperspectral image can be applied to the fields of environmental remote sensing, agriculture, forestry, geological exploration, urban planning and the like. For example, in the agricultural field, the hyperspectral image can be used for rapidly identifying, classifying, monitoring and managing crops, so that the yield and quality of the crops are improved. In environmental monitoring, hyperspectral images can be used to identify and monitor harmful substances in a body of water, as well as to monitor vegetation coverage and land use changes. In the field of urban planning, the hyperspectral image can be used for measuring urban green land coverage and building height, optimizing urban planning and facility layout and the like.
In conclusion, the hyperspectral image has wide application prospect as an image with abundant spectral information.
However, hyperspectral cameras are not as widely used as ordinary cameras because they are expensive, slow to image, and bulky. To exploit the advantages of hyperspectral images while avoiding the problems of hyperspectral imaging devices, researchers have proposed spectral super-resolution methods, which aim to estimate and reconstruct a hyperspectral image from a conventional RGB image.
Existing spectral super-resolution methods can be broadly divided into two categories according to the reconstruction scheme. The first comprises conventional methods, such as: (1) spectral super-resolution based on spectral decomposition, which decomposes and reconstructs the spectral signal with a spectral decomposition algorithm; for example, the technique based on the non-negative matrix factorization (NMF) algorithm "Coupled Nonnegative Matrix Factorization Unmixing for Hyperspectral and Multispectral Data Fusion" decomposes and reconstructs spectral signals to achieve super-resolution spectral imaging. (2) Spectral super-resolution based on sparse representation, which uses a sparse representation algorithm, such as one based on dictionary learning, to decompose and reconstruct the spectral signal; for example, the technique "Spectral Reflectance Recovery from a Single RGB Image" performs sparse representation and reconstruction of the spectral signal. (3) Spectral super-resolution based on spectral libraries and models, which trains and optimizes a model of the spectral signal using a spectral library; for example, techniques based on the partial least squares regression (PLSR) algorithm model and predict the spectral signal. These conventional methods often suffer from slow running speed and poor reconstruction quality.
The second category comprises methods based on deep learning, which train a deep network on the spectral signal, such as a convolutional neural network (CNN) "Pixel-aware Deep Function-mixture Network for Spectral Super-Resolution" or a Transformer "MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction", to achieve spectral super-resolution. Although deep-learning-based methods have developed tremendously in recent years and achieve excellent performance on a single dataset, their performance degrades severely when tested on scenes outside the training set.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method, an image spectrum cross-domain migration super-resolution reconstruction method for multi-domain image scenes based on cross-domain migratable knowledge learning and fast target-domain adaptation, which realizes spectral super-resolution reconstruction from an RGB image to a hyperspectral image. A model structure based on a migratable dictionary is adopted to learn features that can be transferred across domains; a source-domain pre-training strategy with a shared learnable mask encourages the model to learn general knowledge for reconstruction; and a fine-tuning method based on model-agnostic meta-learning learns a general model with strong generalization ability, so that the model can adapt to the test data of a target domain under test within a few iteration steps. The invention mines cross-domain shared knowledge to improve generalization ability and thereby improve the effect of cross-domain spectral super-resolution reconstruction.
The technical scheme adopted by the invention for solving the technical problems comprises the following steps:
Step 1: for an RGB image img ∈ R^(h×w×3), where h and w denote the height and width of the image respectively, let hsi ∈ R^(h×w×31) denote the hyperspectral image corresponding to the input image img;
input the image to the coding layer and map its channel number from 3 to 31 to realize preliminary spectral reconstruction and alignment;
e=embedding(img)
where embedding(·) denotes the embedding layer, instantiated by a convolution layer with kernel size 3 and stride 1, and e denotes the embedded hidden-layer feature;
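The embedding step above can be sketched in plain NumPy as a single 3×3, stride-1, zero-padded convolution mapping 3 channels to 31. This is an illustrative sketch only: the channel-last image layout and the (3, 3, c_in, c_out) weight layout are assumptions, not specified by the text.

```python
import numpy as np

def embedding(img, weight, bias):
    """3x3 convolution, stride 1, zero padding 1: maps (h, w, 3) -> (h, w, 31).

    A NumPy sketch of the embedding layer of step 1; the weight layout
    (3, 3, c_in, c_out) and channel-last images are illustrative choices.
    """
    h, w, _ = img.shape
    c_out = weight.shape[-1]
    padded = np.pad(img, ((1, 1), (1, 1), (0, 0)))   # zero padding of 1
    out = np.zeros((h, w, c_out))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + 3, j:j + 3, :]      # 3x3 receptive field
            # contract the 3x3xc_in patch against the kernel for all 31 outputs
            out[i, j] = np.tensordot(patch, weight, axes=3) + bias
    return out
```

A real implementation would use an optimized framework convolution; the explicit loop is only for clarity.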
Step 2: randomly mask the hidden-layer feature obtained in step 1 in cube form: randomly sample fixed-size cubes over the feature map and replace the features at those positions with a shared learnable mask;
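The cube masking of step 2 can be sketched as follows. The cube size, the number of sampled cubes, and masking the full channel depth at each sampled spatial cube are illustrative assumptions; the text only specifies fixed-size cubes replaced by a shared learnable mask.

```python
import numpy as np

def cube_mask(e, mask_token, cube=4, n_cubes=2, rng=None):
    """Replace randomly sampled fixed-size cubes of the feature map with a
    shared (learnable) mask token, as in step 2 of the method.

    `cube` and `n_cubes` are illustrative hyperparameters; here each sampled
    spatial cube is masked across the full channel depth (an assumption).
    """
    if rng is None:
        rng = np.random.default_rng()
    h, w, _ = e.shape
    out = e.copy()
    for _ in range(n_cubes):
        i = rng.integers(0, h - cube + 1)            # random top-left corner
        j = rng.integers(0, w - cube + 1)
        out[i:i + cube, j:j + cube, :] = mask_token  # broadcast token over cube
    return out
```

In training, `mask_token` would be a learnable parameter shared by all masked positions, forcing the model to infer the missing content from spectral and spatial context.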
Step 3: refine the hidden feature e obtained in step 1 with a module based on inter-spectrum attention, expressed as follows:
s=SpectralTransformerBlock(e)
where SpectralTransformerBlock(·) denotes an inter-spectrum Transformer module and s denotes the hidden-layer feature it produces; several inter-spectrum Transformer modules are stacked in the spectral reconstruction model, the output of each module being the input of the next; SpectralTransformerBlock(·) is composed of SpectralAttention, an FFN, and LayerNorm:
SpectralTransformerBlock(x) = t + FFN(LayerNorm(t))
where t = x + SpectralAttention(LayerNorm(x)), LayerNorm denotes the layer normalization operation, and FFN(x) = conv(GELU(conv(x))), where conv denotes a convolution layer, GELU is a nonlinear activation function, and x denotes an input tensor;
attention(Q, K, V) = softmax(σ_i·QK^T)V
SpectralAttention(X) = attention(XW_Q, XW_K, XW_V)
where σ_i is a learnable scaling factor, W_Q, W_K, W_V are learnable projection matrices, and X is the input tensor, obtained by reshaping and rearranging the feature tensor of the input image;
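The inter-spectrum Transformer block above can be sketched in NumPy. Rows of the input are treated as spectral channels (tokens), so QK^T is a c×c inter-spectrum attention map; the single attention head, the projection sizes, and a two-layer GELU MLP standing in for the conv-GELU-conv FFN are illustrative assumptions.

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def spectral_attention(x, w_q, w_k, w_v, sigma=1.0):
    """attention(Q, K, V) = softmax(sigma * Q K^T) V with Q = X W_Q, etc.
    Rows of x are spectral channels (tokens), so Q K^T is a c x c
    inter-spectrum attention map; sigma plays the learnable scale sigma_i."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    return softmax(sigma * (q @ k.T)) @ v

def spectral_transformer_block(x, p, sigma=1.0):
    """t = x + SpectralAttention(LayerNorm(x)); out = t + FFN(LayerNorm(t)).
    The FFN is a two-layer GELU MLP standing in for the conv-GELU-conv FFN."""
    t = x + spectral_attention(layer_norm(x), p["wq"], p["wk"], p["wv"], sigma)
    return t + gelu(layer_norm(t) @ p["w1"]) @ p["w2"]
```

Stacking several such blocks, with each output fed to the next, matches the description of the spectral reconstruction model.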
Step 4: generate a migratable dictionary with a generator network instantiated as a multi-layer fully-connected neural network; then split the hidden-layer feature s into feature blocks of a given size along the spatial dimensions, and inject cross-domain shared knowledge into the feature map by letting the generated migratable dictionary interact with the feature blocks of s through a cross-attention mechanism:
z=Generator(randomVector+map(s))
c=CrossAttention(s,z)
where CrossAttention(S, Z) = attention(SW_Q, ZW_K, ZW_V) denotes cross-attention; map(·) is a mapper that maps the information of s into the hidden space, instantiated by a multi-layer fully-connected neural network whose structure follows the information-bottleneck design; Generator(·) denotes the generator network, which receives a vector randomVector randomly sampled from a Gaussian distribution together with the domain information of the image itself, obtained from s through map(·), and generates the migratable dictionary;
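A minimal sketch of step 4: a fully-connected generator emits the migratable dictionary from a Gaussian random vector plus the mapped domain information, and cross-attention injects it into the features. The layer sizes, the ReLU hidden layer, and single-head attention are illustrative assumptions, not taken from the text.

```python
import numpy as np

def _softmax(a):
    a = a - a.max(axis=-1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=-1, keepdims=True)

def generator(random_vec, mapped_s, w1, w2, n_atoms, dim):
    """Minimal fully-connected generator: takes a Gaussian random vector plus
    the mapped domain information map(s) and emits a migratable dictionary of
    shape (n_atoms, dim). Layer sizes and ReLU are illustrative choices."""
    h = np.maximum(0.0, (random_vec + mapped_s) @ w1)   # ReLU hidden layer
    return (h @ w2).reshape(n_atoms, dim)

def cross_attention(s, z, w_q, w_k, w_v):
    """CrossAttention(S, Z) = attention(S W_Q, Z W_K, Z W_V): queries come
    from the feature blocks s, keys and values from the dictionary z, so
    dictionary knowledge is injected into the feature map."""
    q, k, v = s @ w_q, z @ w_k, z @ w_v
    return _softmax(q @ k.T) @ v
```

Here each row of `s` stands for one spatial feature block and each row of `z` for one dictionary atom; the cross-attention output c has one refined row per feature block.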
Step 5: refine the features again with the inter-spectrum Transformer module, finally obtaining the reconstructed hyperspectral image, expressed as:
hsi′=SpectralTransformerBlock(c)
where hsi' represents the reconstructed hyperspectral image.
Preferably, the training process of the spectrum reconstruction model is as follows:
By adopting a fine-tuning method based on model-agnostic meta-learning, model parameters with strong generality are learned, so that the spectral reconstruction model can adapt to the characteristics of any domain through a few steps of self-supervised fine-tuning on images from that domain; the specific algorithm is as follows:
firstly, initializing model parameters;
then construct the training data as follows: sample N tasks {Task_i} from the dataset, each task consisting of K data pairs.
For each task Task_i, compute the self-supervised loss L_self on its K examples and update the inner-layer model parameters along the gradient: θ'_i = θ − α∇_θ L_self(f_θ)
For all tasks, use the updated model parameters θ'_i to compute the supervised loss L_sup and update the outer-layer model parameters along the gradient: θ ← θ − β∇_θ Σ_i L_sup(f_θ'_i)
where L_self denotes the self-supervised loss, of the form L_self = mse(d(f_θ(x)), x), in which d(·) denotes the spectral response function from the hyperspectral image to the RGB image, f_θ(·) denotes a neural network with parameters θ, and x is the input image; L_sup denotes the supervised loss, of the form L_sup = mse(f_θ(x), hsi), where hsi is the ground-truth hyperspectral image and mse denotes the mean-square loss.
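The inner and outer updates above can be sketched on a toy linear stand-in f_θ(x) = xθ with a fixed linear spectral response d(h) = hD, so both gradients are closed-form. The linear model and the first-order approximation of the outer gradient (treating ∂θ'_i/∂θ ≈ I) are simplifications for illustration, not taken from the text.

```python
import numpy as np

def maml_step(theta, tasks, D, alpha=1e-5, beta=1e-6):
    """One meta-update of the training procedure above, on a toy linear model
    f_theta(x) = x @ theta (x: RGB pixels (n, 3), theta: (3, 31)); d(h) = h @ D
    stands in for the spectral response function (D: (31, 3)).

    Inner step: one gradient step on the self-supervised support-set loss
    mse(d(f_theta(x)), x).  Outer step: sum the supervised query-set loss
    over tasks, using the first-order approximation grad_{theta} ~ grad_{theta_i'}.
    """
    outer_grad = np.zeros_like(theta)
    for (x_sup, _), (x_qry, y_qry) in tasks:     # each task: (support, query)
        r = x_sup @ theta @ D - x_sup            # RGB reconstruction residual
        g_inner = 2.0 * x_sup.T @ (r @ D.T) / r.size
        theta_i = theta - alpha * g_inner        # inner (task-specific) update
        e = x_qry @ theta_i - y_qry              # supervised residual
        outer_grad += 2.0 * x_qry.T @ e / e.size
    return theta - beta * outer_grad             # outer (meta) update
```

In the full method the supervised query loss and self-supervised support loss play exactly these roles, with the network in place of the linear map.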
The beneficial effects of the invention are as follows:
because the general deep learning model is too dependent on the memory of the training data set, is excessively fitted to the training data set, and does not learn enough knowledge that can be shared across domains, the masking operation of the cube level is used to force the model to learn interactions between spectrums and spaces, and the knowledge of the interactions is shared among all spectrum reconstructions. And using a generator to generate a migratable patch to be added to the spectral reconstruction as cross-domain shared knowledge. The design ensures that the knowledge of cross-domain sharing can be mined from the angle of the model so as to improve the generalization capability and further improve the effect of cross-domain spectrum super-division reconstruction. The model-agnostic meta-learning method does not directly learn the reconstruction from a single RGB image to a hyperspectral image, but learns model parameters with strong generalization, which can be quickly adapted to a target domain in a plurality of steps of iteration for any image, further enhances the universality of the model, and simultaneously matches with the self-supervision quick fine tuning of the target domain, so that the model has the capacity of cross-domain migration and super-division reconstruction.
Drawings
FIG. 1 is a schematic diagram of a model structure of a migratable dictionary.
Detailed Description
The invention will be further described with reference to the drawings and examples.
As shown in fig. 1, in order to fully exploit the high performance of deep-learning-based spectral super-resolution while alleviating the severe degradation of test performance on scenes outside the training set, we propose a multi-domain image-scene-oriented image spectrum cross-domain migration super-resolution reconstruction method, based on cross-domain migratable knowledge learning and fast target-domain adaptation, for spectral super-resolution reconstruction from RGB images to hyperspectral images. It comprises: a model structure based on a migratable dictionary, used to learn features that can be transferred across domains; a source-domain pre-training strategy based on a shared learnable mask, which encourages the model to learn general knowledge for reconstruction; and a fine-tuning method based on model-agnostic meta-learning, used to learn a general model with strong generalization ability, so that the model can adapt to the test data of a target domain under test within a few iteration steps.
A multi-domain image scene-oriented image spectrum cross-domain migration super-resolution reconstruction method based on a cross-domain migratable knowledge learning and target domain rapid adaptation learning mode comprises the following aspects and steps:
spectral reconstruction model structure:
step 1: for RGB imagesWherein h and w represent the height and width of the image, respectively, which are denoted +.>Representing a hyperspectral image corresponding to the input image img. The image is input to the coding layer, the channel number of the input image is mapped from 3 to 31, and preliminary spectrum reconstruction and alignment are realized.
e=embedding(img)
where embedding(·) denotes the embedding layer, instantiated by a convolution layer with kernel size 3 and stride 1, and e denotes the embedded hidden-layer feature.
Step 2: and (3) carrying out random masking on the hidden layer features obtained in the step (1) in a form of cube, randomly sampling the cube with a fixed size on the image, and replacing the features at the position with a shared learnable mask.
Step 3: the method comprises the following steps: 2, refining the obtained hidden characteristic e by using an inter-spectrum attention module. The expression is as follows:
s=SpectralTransformerBlock(e)
where SpectralTransformerBlock(·) denotes the inter-spectrum Transformer module and s denotes the hidden-layer feature it produces; note that several such modules are stacked in the model, the output of each being the input of the next. SpectralTransformerBlock(·) is composed of SpectralAttention, an FFN, and LayerNorm:
SpectralTransformerBlock(x) = t + FFN(LayerNorm(t))
where t = x + SpectralAttention(LayerNorm(x)), LayerNorm denotes the layer normalization operation, and FFN(x) = conv(GELU(conv(x))), where conv denotes a convolution layer, GELU is a nonlinear activation function, and x denotes an input tensor.
attention(Q, K, V) = softmax(σ_i·QK^T)V
SpectralAttention(X) = attention(XW_Q, XW_K, XW_V)
where σ_i is a learnable scaling factor, W_Q, W_K, W_V are learnable projection matrices, and X is the input tensor, obtained by reshaping and rearranging the feature tensor of the input image.
Step 4: generating a migratable dictionary by using a generator network instantiated by a multi-layer fully-connected neural network, then dividing hidden layer features s into feature blocks with a certain size according to space dimensions, and injecting knowledge shared across domains into a feature graph by using the migratable dictionary generated by a cross-attention mechanism interaction generator and the feature blocks of the hidden layer features s to form:
z=Generator(randomVector+map(s))
c=CrossAttention(s,z)
where CrossAttention(S, Z) = attention(SW_Q, ZW_K, ZW_V) denotes cross-attention; map(·) is a mapper that maps the information of s into the hidden space, instantiated by a multi-layer fully-connected neural network whose structure follows the information-bottleneck design; Generator(·) denotes the generator network, which receives a vector randomVector randomly sampled from a Gaussian distribution together with the domain information of the image itself, obtained from s through map(·), and generates the migratable dictionary;
step 5: the features are refined again using the inter-spectrum transducer module mentioned in step 3, and the reconstructed hyperspectral image is finally obtained, which can be expressed as:
hsi′=SpectralTransformerBlock(c)
where hsi' represents the reconstructed hyperspectral image.
Model-agnostic meta-learning fine tuning method:
in order to obtain good performance of the model on multi-domain images, we propose to learn a model parameter with strong universality by using a meta-learning fine tuning method which is unknown to the model, so that the model can be well adapted to the characteristics of the domain through fine tuning of several steps of self-supervision on any domain image, and a specific algorithm flow is shown as an algorithm 1:
where L_self denotes the self-supervised loss, of the form L_self = mse(d(f_θ(x)), x), in which d(·) denotes the spectral response function from the hyperspectral image to the RGB image, f_θ(·) denotes a neural network with parameters θ, and x is the input image; L_sup denotes the supervised loss, of the form L_sup = mse(f_θ(x), hsi), where hsi is the ground-truth hyperspectral image and mse denotes the mean-square loss.
Specific examples:
the invention provides a multi-domain image scene-oriented image spectrum cross-domain migration super-resolution reconstruction method based on a cross-domain migratable knowledge learning and target domain fast adaptation learning mode, which comprises the following specific processes:
1. data preprocessing
For a given training set D_train, there are RGB and hyperspectral image pairs {img_i, hsi_i}; for a given test set D_test, the hsi_i may be absent. When training the model, both the RGB data and the hyperspectral data are normalized to the range [0, 1].
Furthermore, the input image img_i and its corresponding hsi_i are augmented with random cropping, random horizontal flipping, and random vertical flipping to enhance the generalization ability of the model.
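The preprocessing and augmentation described above can be sketched as follows. The crop size and the global min-max normalization are illustrative choices; the text only requires normalization to [0, 1] and the three paired augmentations.

```python
import numpy as np

def preprocess(img, hsi, patch=64, rng=None):
    """Normalize an RGB/hyperspectral pair to [0, 1] (global min-max, an
    illustrative choice), then apply the paired augmentations from the text:
    random crop, random horizontal flip, random vertical flip."""
    if rng is None:
        rng = np.random.default_rng()
    img = (img - img.min()) / (img.max() - img.min() + 1e-8)
    hsi = (hsi - hsi.min()) / (hsi.max() - hsi.min() + 1e-8)
    h, w = img.shape[:2]
    i = rng.integers(0, h - patch + 1)               # shared crop offset
    j = rng.integers(0, w - patch + 1)
    img, hsi = img[i:i + patch, j:j + patch], hsi[i:i + patch, j:j + patch]
    if rng.random() < 0.5:                           # horizontal flip, paired
        img, hsi = img[:, ::-1], hsi[:, ::-1]
    if rng.random() < 0.5:                           # vertical flip, paired
        img, hsi = img[::-1], hsi[::-1]
    return img, hsi
```

Each augmentation is applied identically to both images so the pixel-wise correspondence between RGB input and hyperspectral target is preserved.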
2. Preliminary pre-training based on random mini-batches
Because training a model-agnostic meta-learning method from scratch is unstable, and the self-supervised step of the meta-learning inner loop reduces the utilization of supervised data, the model is first trained with random mini-batches in an early stage so that it converges quickly and stably to a good position. Specifically, n samples are drawn from the training set D_train to form a batch and fed to the randomly initialized model, using Adam as the optimizer, MRAE as the loss function, and an initial learning rate of 4e-4 that is gradually decayed by cosine annealing; the model is trained for 150 epochs.
3. Model-agnostic meta-learning training
A task set {T_i} is sampled from the dataset D_train; each task T_i comprises a support set and a query set, each instantiated for simplicity as K mutually non-overlapping pairs of RGB and hyperspectral images {(img_i, hsi_i) | i = 1, 2, ..., K}. The support set is used to compute the inner-layer self-supervised loss L_self, and the query set the outer-layer supervised loss L_sup. In addition, during meta-learning training, the model is initialized with the pre-trained weights from the preliminary random mini-batch pre-training, and the mask structure is removed, ensuring that the computation graph during meta-learning optimization is as consistent as possible with that used for fine-tuning and inference at test time. In general, for the number of inner-loop steps p used to obtain θ'_i: if p is too small, it is difficult to iterate to a solution adapted to the task; if p is too large, the local parameters overfit the task; p is therefore typically set to 10. The inner- and outer-layer learning rates α and β are set to 1e-5 and 1e-6 respectively.
4. Target-domain fine-tuning and reconstruction inference
After learning is completed, the model obtained by training can perform spectral super-resolution reconstruction from an RGB image to a hyperspectral image. The specific algorithm flow is shown in Algorithm 2:
the final obtained algorithm output hsi' is the hyperspectral image reconstructed from the input RGB image img.

Claims (2)

1. A multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method, characterized by comprising the following steps:
Step 1: for an RGB image img ∈ R^(h×w×3), where h and w denote the height and width of the image respectively, let hsi ∈ R^(h×w×31) denote the hyperspectral image corresponding to the input image img;
input the image to the coding layer and map its channel number from 3 to 31 to realize preliminary spectral reconstruction and alignment;
e=embedding(img)
where embedding(·) denotes the embedding layer, instantiated by a convolution layer with kernel size 3 and stride 1, and e denotes the embedded hidden-layer feature;
Step 2: randomly mask the hidden-layer feature e obtained in step 1 in cube form: randomly sample fixed-size cubes over the feature map and replace the features at those positions with a shared learnable mask;
Step 3: refine the hidden feature e obtained in step 1 with a module based on inter-spectrum attention, expressed as follows:
s=SpectralTransformerBlock(e)
where SpectralTransformerBlock(·) denotes an inter-spectrum Transformer module and s denotes the hidden-layer feature it produces; several inter-spectrum Transformer modules are stacked in the spectral reconstruction model, the output of each module being the input of the next; SpectralTransformerBlock(·) is composed of SpectralAttention, an FFN, and LayerNorm:
SpectralTransformerBlock(x) = t + FFN(LayerNorm(t))
where t = x + SpectralAttention(LayerNorm(x)), LayerNorm denotes the layer normalization operation, and FFN(x) = conv(GELU(conv(x))), where conv denotes a convolution layer, GELU is a nonlinear activation function, and x denotes an input tensor;
attention(Q, K, V) = softmax(σ_i·QK^T)V
SpectralAttention(X) = attention(XW_Q, XW_K, XW_V)
where σ_i is a learnable scaling factor, W_Q, W_K, W_V are learnable projection matrices, and X is the input tensor, obtained by reshaping and rearranging the feature tensor of the input image;
Step 4: generate a migratable dictionary with a generator network instantiated as a multi-layer fully-connected neural network; then split the hidden-layer feature s into feature blocks of a given size along the spatial dimensions, and inject cross-domain shared knowledge into the feature map by letting the generated migratable dictionary interact with the feature blocks of s through a cross-attention mechanism:
z=Generator(randomVector+map(s))
c=CrossAttention(s,z)
where CrossAttention(S, Z) = attention(SW_Q, ZW_K, ZW_V) denotes cross-attention; map(·) is a mapper that maps the information of s into the hidden space, instantiated by a multi-layer fully-connected neural network whose structure follows the information-bottleneck design; Generator(·) denotes the generator network, which receives a vector randomVector randomly sampled from a Gaussian distribution together with the domain information of the image itself, obtained from s through map(·), and generates the migratable dictionary;
step 5: the features are refined once more by an inter-spectral Transformer module, finally yielding the reconstructed hyperspectral image, expressed as:
hsi′=SpectralTransformerBlock(c)
where hsi' represents the reconstructed hyperspectral image.
2. The multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method according to claim 1, wherein the training process of the spectrum reconstruction model is as follows:
a model-agnostic meta-learning (MAML) style fine-tuning method is adopted to learn model parameters with strong generality, so that the spectral reconstruction model can adapt to the characteristics of any domain after a few steps of self-supervised fine-tuning on images from that domain; the specific algorithm is as follows:
firstly, initializing model parameters;
then constructing training data, specifically: sampling N tasks T_i from the dataset, each task consisting of K data pairs;
for each task T_i, calculating the self-supervised loss L^self over its K example data and performing the inner-loop model parameter update along the gradient: θ'_i = θ − α·∇_θ L^self_{T_i}(f_θ);
for all tasks, calculating the supervised loss with the updated model parameters θ'_i and performing the outer-loop model parameter update along the gradient: θ ← θ − β·∇_θ Σ_i L^sup_{T_i}(f_{θ'_i});
wherein L^self denotes the self-supervised loss, of the form L^self(f_θ; x) = mse(d(f_θ(x)), x), where d(·) denotes the spectral response function from the hyperspectral image to the RGB image, f_θ(·) denotes the neural network with parameters θ, and x is the input image; L^sup denotes the supervised loss, of the form L^sup(f_θ; x, y) = mse(f_θ(x), y), where mse denotes the mean-square loss and y is the ground-truth hyperspectral image.
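A first-order sketch of the two-level update above, under strong simplifying assumptions that are mine, not the claim's: the model is linear, f_θ(x) = x·θ; the spectral response d(·) is a fixed band-averaging matrix D; both losses are mean-square errors; and the second-order MAML term is dropped (first-order approximation):

```python
import numpy as np

def maml_step(theta, tasks, D, alpha=0.01, beta=0.05):
    # One meta-update over N tasks; each task is a pair (x, y) of K RGB rows
    # x and K ground-truth hyperspectral rows y.
    meta_grad = np.zeros_like(theta)
    for x, y in tasks:
        # Inner loop: self-supervised loss mse(d(f_theta(x)), x);
        # theta_i' = theta - alpha * grad_theta L_self
        g_self = 2.0 * x.T @ ((x @ theta @ D - x) @ D.T) / x.size
        theta_i = theta - alpha * g_self
        # Outer loop: supervised loss mse(f_theta_i'(x), y); the first-order
        # approximation evaluates the gradient at theta_i'
        meta_grad += 2.0 * x.T @ (x @ theta_i - y) / y.size
    return theta - beta * meta_grad
```

Iterating maml_step drives θ toward an initialization from which a few self-supervised inner steps adapt the model to each task's domain, which is the intent of the fine-tuning scheme described above.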
CN202310745724.8A 2023-06-21 2023-06-21 Multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method Pending CN116993584A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202310745724.8A CN116993584A (en) 2023-06-21 2023-06-21 Multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method
PCT/CN2023/113283 WO2024082796A1 (en) 2023-06-21 2023-08-16 Spectral cross-domain transfer super-resolution reconstruction method for multi-domain image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310745724.8A CN116993584A (en) 2023-06-21 2023-06-21 Multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method

Publications (1)

Publication Number Publication Date
CN116993584A true CN116993584A (en) 2023-11-03

Family

ID=88525616

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310745724.8A Pending CN116993584A (en) 2023-06-21 2023-06-21 Multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method

Country Status (2)

Country Link
CN (1) CN116993584A (en)
WO (1) WO2024082796A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118334458B (en) * 2024-06-14 2024-09-17 中国海洋大学 Universal cross-domain image conversion method and system
CN118533834B (en) * 2024-07-15 2024-10-25 浙江工业大学 Black tea fermentation degree judging method based on 3D-SwinT-CNN

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110232653A (en) * 2018-12-12 2019-09-13 天津大学青岛海洋技术研究院 The quick light-duty intensive residual error network of super-resolution rebuilding
CN111369433B (en) * 2019-11-12 2024-02-13 天津大学 Three-dimensional image super-resolution reconstruction method based on separable convolution and attention
CN111932461B (en) * 2020-08-11 2023-07-25 西安邮电大学 Self-learning image super-resolution reconstruction method and system based on convolutional neural network
CN112801881B (en) * 2021-04-13 2021-06-22 湖南大学 High-resolution hyperspectral calculation imaging method, system and medium
CN114332649B (en) * 2022-03-07 2022-05-24 湖北大学 Cross-scene remote sensing image depth countermeasure migration method based on double-channel attention

Also Published As

Publication number Publication date
WO2024082796A1 (en) 2024-04-25

Similar Documents

Publication Publication Date Title
CN112836610B (en) Land use change and carbon reserve quantitative estimation method based on remote sensing data
CN116993584A (en) Multi-domain image-oriented spectrum cross-domain migration super-resolution reconstruction method
Lv et al. Deep learning model of image classification using machine learning
CN115690479A (en) Remote sensing image classification method and system based on convolution Transformer
Azadnia et al. Developing an automated monitoring system for fast and accurate prediction of soil texture using an image-based deep learning network and machine vision system
Zhou et al. RGB-to-HSV: A frequency-spectrum unfolding network for spectral super-resolution of RGB videos
Bi et al. A transformer-based approach for early prediction of soybean yield using time-series images
Li et al. Low-carbon jujube moisture content detection based on spectral selection and reconstruction
US20230186622A1 (en) Processing remote sensing data using neural networks based on biological connectivity
Yang et al. Extraction of land covers from remote sensing images based on a deep learning model of NDVI-RSU-Net
CN111242028A (en) Remote sensing image ground object segmentation method based on U-Net
CN115457311A (en) Hyperspectral remote sensing image band selection method based on self-expression transfer learning
CN114722928A (en) Blue-green algae image identification method based on deep learning
Rafi et al. Attention-based domain adaptation for hyperspectral image classification
CN114511733A (en) Fine-grained image identification method and device based on weak supervised learning and readable medium
Lopez et al. Convolutional neural networks for semantic segmentation of multispectral remote sensing images
CN114965300B (en) Lake turbidity drawing method for constructing BP-TURB based on optical water body type and BP neural network algorithm
CN117132884A (en) Crop remote sensing intelligent extraction method based on land parcel scale
CN111897988B (en) Hyperspectral remote sensing image classification method and system
Riese Development and Applications of Machine Learning Methods for Hyperspectral Data
Huang et al. Low-light images enhancement via a dense transformer network
CN113627480A (en) Polarized SAR image classification method based on reinforcement learning
Liu et al. Hyperspectral image quality based on convolutional network of multi-scale depth
Natsir et al. Classification of Soil Fertility Level Based on Texture with Convolutional Neural Network (CNN) Algorithm
CN114998640B (en) Self-adaptive neighbor graph optimized hyperspectral semi-supervised classification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination