CN116563187A - Multispectral image fusion based on graph neural network - Google Patents
Multispectral image fusion based on graph neural network
- Publication number
- CN116563187A (Application CN202310573695.1A)
- Authority
- CN
- China
- Prior art keywords
- image
- multispectral
- graph
- features
- convolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
- G06N3/0442—Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/048—Activation functions
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06V10/40—Extraction of image or video features
- G06V10/82—Image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06T2207/10032—Satellite or aerial image; Remote sensing
- G06T2207/10041—Panchromatic image
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/20221—Image fusion; Image merging
- Y02A40/10—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
Abstract
The invention belongs to the field of image fusion and discloses a multispectral image fusion method based on a graph neural network, which comprises the following steps. A multispectral image and a panchromatic image are first acquired. Pixel features are extracted from the multispectral image with a convolutional network; after dimension reduction and feature extraction, the graph structures of the multispectral image in three dimensions are extracted by graph embedding and fused to obtain a heterogeneous graph of the multispectral image, and space-time graph convolution is applied to the heterogeneous graph to extract the spatial features of the graph data. The acquired pixel features and spatial features are then aggregated through a gating mechanism, which outputs feature weights; the weights are used to obtain a multispectral feature map that finally fuses the spatial and pixel features. The feature map of the panchromatic image, obtained through the same convolutional network, is fused with this multispectral feature map through an attention mechanism to obtain the fused multispectral image, which has higher resolution.
Description
Technical Field
The invention relates to the field of image fusion, in particular to a multispectral image fusion method based on a graph neural network.
Background
Image fusion technology combines and processes image data of the same scene acquired by different sensors to generate a new, high-quality image. With the rapid development of satellite sensor technology, multispectral images are widely used in fields such as military systems and environmental analysis. However, owing to the limitations of satellite sensors, only panchromatic images with high spatial resolution but low spectral resolution, or multispectral images with rich spectral information but low spatial resolution, can be acquired. To obtain multispectral images with high spatial resolution, remote sensing image fusion, which fuses multispectral and panchromatic images, has become a research hotspot.
Existing remote sensing image fusion techniques fall mainly into four categories: component substitution methods, multi-resolution analysis methods, model-based methods, and deep-learning-based methods. Component substitution decomposes a multispectral image into several components and replaces some of them with the spatial component of a panchromatic image; because the component separation is imperfect, however, some spectral information of the multispectral image may be lost. Multi-resolution analysis injects the high-frequency information of the panchromatic image into the multispectral image in a transform domain; it preserves spectral information better but sometimes introduces spatial distortion. Model-based methods achieve fusion by building an optimization model with prior constraints, but their high computational cost and the difficulty of choosing optimal manual parameters limit their practical use.
Although convolutional neural networks are now used to learn the spectral and spatial details of remote sensing images and achieve good fusion performance, they do not consider the correlation between the multispectral and panchromatic feature maps and ignore possible complementary features between the two modalities; the feature interaction between the multispectral and panchromatic images is weak, the extracted features lack precision, and the quality of the resulting image is low. Designing a network that explores the cross-modal correlation between panchromatic and multispectral images, better transfers the spatial texture details of the panchromatic image into the multispectral image, and obtains multispectral images with rich texture information and minimal spectral distortion is therefore an important problem.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provide a multispectral image fusion method based on a graph neural network; the method achieves effective fusion of multispectral and panchromatic images, with strong interaction between the fused features and high image quality.
The technical scheme for solving the technical problems is as follows:
The multispectral image fusion method based on the graph neural network comprises the following steps:
(S1) acquiring a multispectral image and a corresponding panchromatic image over a period of time;
(S2) extracting pixel features from the multispectral image and the panchromatic image using a shared encoder network;
(S3) carrying out dimension reduction and feature extraction on the multispectral image, then extracting and fusing the graph structures of the multispectral image in three dimensions by means of graph embedding to obtain a heterogeneous graph of the multi-source features;
(S4) performing feature extraction on the obtained heterogeneous graph using space-time graph convolution to obtain the spatial features of the graph data;
(S5) aggregating the acquired pixel features and spatial features through a gating mechanism, outputting feature weights, and using the weights to obtain a multispectral feature map that finally fuses the spatial features and the pixel features;
(S6) fusing the obtained feature map of the panchromatic image with the multispectral feature map, in which the spatial and pixel features are fused, through an attention mechanism;
(S7) passing the fused feature map through a decoder to obtain the fused multispectral image.
Preferably, in step (S1), the multispectral camera is an imaging camera capable of collecting 3 or more spectral bands simultaneously.
Preferably, in step (S2), the encoder network structure has two branches. The upper network extracts the shallow features of the image and consists of 4 convolution layers with 3×3 kernels, each layer except the last followed by a ReLU activation function. The lower network extracts the deep features of the image: the input first passes through a 1×1 convolution layer and then 4 convolution layers with 3×3 kernels; this convolution module adopts a Nest (nested dense) connection mode, which retains more information and yields the deep features. Finally, the feature maps obtained by the upper and lower networks are concatenated.
Preferably, in step (S3), the three feature graphs of the graph structure are obtained as follows: a physical feature graph of the spectral data is extracted from the dimension-reduced spectral data combined with the infrared spectral features; superpixel neighbor-node information is determined by a linear iterative clustering method, the edge connections between nodes are constructed according to the spatial connectivity of the superpixels, and a spatial feature graph is extracted; and, combining the spectral-feature similarity of the target, sampling and recombining across different spectral-band dimensions yields the spectral feature distribution of the target, with the GNN effectively representing the spectral data residing on a smooth manifold.
Preferably, in step (S3), the heterogeneous graph is obtained by connecting the three obtained feature graphs, whose node types differ, with a graph autoencoder, and a self-attention-based graph pooling method yields the heterogeneous graph fusing the multi-source features; the autoencoder includes but is not limited to a graph convolutional autoencoder, a variational graph convolutional autoencoder, and an adversarially regularized graph autoencoder.
Preferably, in step (S4), the time dimension and the spatial dimension of the space-time graph convolution are extracted by different methods: the network extracting the time dimension includes but is not limited to an RNN, GRU, LSTM, TCN, or Transformer, and the network extracting the spatial dimension includes but is not limited to a GCN, GAT, or a GCN combined with a GAT.
Preferably, in step (S5), the fused feature map is obtained by first aggregating the two feature maps using two fully connected networks connected to each other; the aggregated features then pass through an activation function that limits the value to between 0 and 1, where the value expresses how much information may pass through the gate (0 means no information is allowed through, 1 means all information is allowed through). The gate value gives the weight of the output features, and multiplying this weight with the pixel features yields the feature map that finally fuses the space-time features and the pixel features.
Preferably, in step (S6), the attention mechanism is a combination of a spatial attention mechanism and a channel attention mechanism, and the pixel features of the panchromatic image are fused with the multispectral image features in which the spatial and pixel features are fused.
preferably, in the step (S7), the decoder performs upsampling by 4 DB modules, where each DB module is composed of a 3×3 convolution and a 1×1 convolution, each DB module adopts a dense connection mode, and finally outputs a fused multispectral image by using 2 3×3 convolution layers.
Compared with the prior art, the invention provides a multispectral image fusion method based on a graph neural network, which has the following beneficial effects:
1. Multispectral images contain rich spectral information, while panchromatic images provide higher spatial resolution. The graph-neural-network fusion method makes full use of both, effectively combining the information of the multispectral and panchromatic images; the advantages of the two are thus fully exploited, improving the quality of the image and its detail-restoration capability.
2. The graph neural network learns on graph-structured data and has good global feature-learning capability. Fusing the multispectral and panchromatic images through the graph neural network therefore preserves the spectral information and spatial consistency, which is important for tasks that require maintaining the consistency of object boundaries and colors in the image.
3. Multispectral and panchromatic images often contain a large amount of redundant information, particularly in the spectral and spatial dimensions. The graph-neural-network fusion method can effectively reduce this redundancy and extract the most representative and information-rich features, improving the efficiency of image processing and analysis and reducing the cost of data storage and transmission.
Drawings
Fig. 1 is a flow chart of the multispectral image fusion method based on the graph neural network.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Referring to fig. 1, the multispectral image fusion method based on the graph neural network comprises the following steps:
(S1) acquiring a multispectral image and a corresponding panchromatic image over a period of time;
(S2) extracting pixel features from the multispectral image and the panchromatic image using a shared encoder network;
(S3) carrying out dimension reduction and feature extraction on the multispectral image, then extracting and fusing the graph structures of the multispectral image in three dimensions by means of graph embedding to obtain a heterogeneous graph of the multi-source features;
(S4) performing feature extraction on the obtained heterogeneous graph using space-time graph convolution to obtain the spatial features of the graph data;
(S5) aggregating the acquired pixel features and spatial features through a gating mechanism, outputting feature weights, and using the weights to obtain a multispectral feature map that finally fuses the spatial features and the pixel features;
(S6) fusing the obtained feature map of the panchromatic image with the multispectral feature map, in which the spatial and pixel features are fused, through an attention mechanism;
(S7) passing the fused feature map through a decoder to obtain the fused multispectral image.
Referring to fig. 1, in step (S1), each acquired multispectral image is composed of three spectral bands; 1000 multispectral and panchromatic image pairs are acquired, with a pixel size of 256×256.
Referring to fig. 1, in step (S2), the encoder network consists of two branches. One is an upper network for extracting the shallow features of the image, composed of 4 convolution layers with 3×3 kernels and stride 1, each followed by a ReLU activation function. The other first passes through a convolution layer with a 1×1 kernel and then 4 convolution layers with 3×3 kernels and stride 1; this convolution module adopts a Nest connection mode, which retains more information and yields the deep features. Finally, the feature maps obtained by the upper and lower networks are concatenated.
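As a non-limiting illustration, the two-branch encoder of this embodiment can be sketched in PyTorch as follows; the channel widths and the exact form of the nested dense ("Nest") connections are not fixed by the text, so the values used here are assumptions.

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Two-branch encoder: shallow 3x3 stack plus a nested-dense deep branch, then concat."""
    def __init__(self, in_ch=4, width=32):
        super().__init__()
        # Upper branch: four 3x3 convs, stride 1; ReLU after every layer except the last.
        self.shallow = nn.Sequential(
            nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1),
        )
        # Lower branch: a 1x1 conv, then four 3x3 convs; each conv sees the
        # concatenation of all earlier outputs (assumed DenseNet-style nesting).
        self.entry = nn.Conv2d(in_ch, width, 1)
        self.deep = nn.ModuleList(
            nn.Conv2d(width * (i + 1), width, 3, padding=1) for i in range(4)
        )

    def forward(self, x):
        s = self.shallow(x)
        feats = [self.entry(x)]
        for conv in self.deep:
            feats.append(torch.relu(conv(torch.cat(feats, dim=1))))
        return torch.cat([s, feats[-1]], dim=1)  # concat of shallow and deep features

# enc = SharedEncoder(); enc(torch.randn(1, 4, 256, 256)).shape -> (1, 64, 256, 256)
```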
Referring to fig. 1, in step (S3), the dimension-reduction and feature-extraction method fuses the spectral and spatial information using an augmented vector:

$$x = (u, v, b_1, b_2, \ldots, b_B) = (x_1, x_2, \ldots, x_{B+2})^T \qquad (1)$$

where $(u, v)$ are the coordinates of a pixel on the image and $(b_1, b_2, \ldots, b_B)$ is its array of band values.

The augmented vectors are used as training data and normalized. For any $x_i$, same-class classification is performed in a supervised manner, and a k-nearest-neighbour computation constructs a local neighborhood of the pixel; the similarity of the spatial and spectral information within the local neighborhood is classified, and the feature dimension is reduced through manifold learning. Local-neighborhood or neighborhood embedding of a spatial-spectral polynomial completes the weight assignment for the spectral-feature similarity of different pixels within the local neighborhood, and finally binary matrix multiplication is combined to establish a low-dimensional nonlinear explicit mapping of the multispectral data.
In this embodiment, images of 4 bands are acquired, so B = 4.
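A small sketch of the augmented-vector construction of equation (1); the (H, W, B) array layout is an assumption.

```python
import numpy as np

def augmented_vectors(ms_image: np.ndarray) -> np.ndarray:
    """Build x = (u, v, b_1, ..., b_B)^T for every pixel of an (H, W, B) multispectral image."""
    H, W, B = ms_image.shape  # in this embodiment B = 4
    u, v = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # One row per pixel, each of length B + 2: spatial coordinates plus band values.
    return np.concatenate(
        [u.reshape(-1, 1), v.reshape(-1, 1), ms_image.reshape(-1, B)], axis=1
    ).astype(np.float32)

# augmented_vectors(np.random.rand(256, 256, 4)).shape -> (65536, 6)
```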
Referring to fig. 1, in step (S3), the heterogeneous graph is obtained by connecting the three feature graphs, which have different node and edge types, through a network structure based on a graph autoencoder. Specifically: each given graph is analyzed, the node feature vectors of the different graphs are compared by cosine similarity, and the nodes with high similarity across the three graphs are retained. The three processed graphs are then passed through a graph convolutional network to obtain a representation $z_i$ of each node, and the following formula is used:

$$\hat{A}_{ij} = \sigma\left(z_i^{T} z_j\right) \qquad (2)$$

where $\hat{A}_{ij}$ is the predicted link probability between nodes $(i, j)$ and $\sigma$ is the Sigmoid activation function. Node pairs with probability greater than 0.8 are linked and pairs with probability smaller than 0.2 are left unconnected, which yields a new graph linking the three graphs.
In addition, the extracted new graph undergoes one graph neural convolution: the GCN learns a feature representation of each node $v \in V$ by aggregating the features of its neighbor nodes. For each node $v$, a self-attention mechanism computes an attention score $z$; the most important nodes are then selected with top-k, and the pooling ratio $k$ determines the number of retained nodes, where $k$ is set to 0.5. The attention-based mask graph obtained in this way is multiplied with the corresponding nodes of the originally input graph structure of fused heterogeneous information, giving the final output graph, i.e., the heterogeneous graph fusing the multi-source features.
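The link prediction of equation (2) and the self-attention top-k pooling can be sketched as follows with a dense adjacency matrix; the single GCN layer, the score projection, and the handling of node pairs whose probability falls between 0.2 and 0.8 (kept as in the original graph here) are assumptions.

```python
import torch

def gcn_layer(x, adj, weight):
    """One GCN propagation: D^{-1/2} (A + I) D^{-1/2} X W."""
    a = adj + torch.eye(adj.size(0))
    d = a.sum(1).rsqrt().diag()
    return d @ a @ d @ x @ weight

def link_new_graph(x, adj, weight):
    """Equation (2): A_hat = sigmoid(Z Z^T); link pairs > 0.8, drop pairs < 0.2."""
    z = torch.relu(gcn_layer(x, adj, weight))
    a_hat = torch.sigmoid(z @ z.t())
    linked = (a_hat > 0.8).float()
    undecided = ((a_hat >= 0.2) & (a_hat <= 0.8)).float() * adj  # keep existing edges (assumption)
    return torch.clamp(linked + undecided, max=1.0), z

def topk_pool(z, adj, score_proj, ratio=0.5):
    """Self-attention pooling: score nodes, keep the top ratio*N, mask kept features."""
    scores = torch.tanh(z @ score_proj).squeeze(-1)  # attention score per node
    k = max(1, int(ratio * z.size(0)))
    idx = scores.topk(k).indices
    return z[idx] * scores[idx].unsqueeze(-1), adj[idx][:, idx]

# x: (N, F) node features, adj: (N, N); W: (F, H), p: (H, 1)
# adj2, z = link_new_graph(x, adj, W); z_pooled, adj_pooled = topk_pool(z, adj2, p)
```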
Referring to fig. 1, in step (S4), the network used by the space-time graph convolution for the time dimension is a TC module, and the network for the spatial dimension is a GCN combined with a GAT. The TC module, which extracts the time dimension, consists of two dilated inception layers: one branch is processed by a tanh activation function and acts as a filter on the input, while the other branch is processed by a Sigmoid activation function and controls how much information the filter may pass to the next module. After the time module, spatial features are extracted by a GCN layer, information is then transferred between nodes through a GAT graph-attention layer to capture the dependencies between nodes, and the space-time features are finally obtained.
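A minimal sketch of the gated temporal convolution and a single-head graph-attention layer; the dilation rate, channel counts, and the reduction of the dilated inception layers to single dilated convolutions are simplifying assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TCModule(nn.Module):
    """Gated temporal conv on (B, C, N, T) inputs: tanh(filter) * sigmoid(gate)."""
    def __init__(self, ch=32, dilation=2):
        super().__init__()
        self.filt = nn.Conv2d(ch, ch, (1, 3), dilation=(1, dilation), padding=(0, dilation))
        self.gate = nn.Conv2d(ch, ch, (1, 3), dilation=(1, dilation), padding=(0, dilation))

    def forward(self, x):
        # The tanh branch filters the input; the sigmoid branch controls how much passes on.
        return torch.tanh(self.filt(x)) * torch.sigmoid(self.gate(x))

class GATLayer(nn.Module):
    """Single-head graph attention; the adjacency is assumed to include self-loops."""
    def __init__(self, ch):
        super().__init__()
        self.proj = nn.Linear(ch, ch)
        self.att = nn.Linear(2 * ch, 1)

    def forward(self, h, adj):                      # h: (N, C), adj: (N, N)
        h = self.proj(h)
        n = h.size(0)
        pair = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                          h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.att(pair)).squeeze(-1)
        e = e.masked_fill(adj == 0, float("-inf"))  # attend only along graph edges
        return torch.softmax(e, dim=-1) @ h         # dependency-weighted neighbour aggregation
```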
Referring to FIG. 1, in step (S5), the gating mechanism first fuses the multispectral space-time features $f_R$ and the pixel features $f_P$:

$$f = g\left(f_R, f_P\right) \qquad (3)$$

Here $g(\cdot)$ is implemented with two fully connected networks connected to each other, with the hyperbolic tangent as the activation function. Next, the fused feature $f$ is used as the gating mechanism: the aggregated feature is passed through a Sigmoid activation function that limits it to $[0, 1]$, where the value expresses how much information may pass through the gate, 0 meaning that no information is allowed through and 1 meaning that all information is allowed through. In this network the gating mechanism controls the importance of each pixel: 0 means the current pixel is of no use at all to the image-recognition decision, and 1 means the current pixel is of the utmost importance. The final output can thus be expressed as

$$f_{fusion} = \sigma(f) \odot f_P \qquad (4)$$

The gate values are multiplied element-wise with the features to be weighted, giving the final feature vector $f_{fusion}$, i.e., the multispectral feature map that combines both the space-time information and the pixel information.
Referring to fig. 1, in step (S6), the attention mechanism combines a channel attention mechanism and a spatial attention mechanism: feature learning is carried out on the panchromatic-image features and the multispectral features $f_{fusion}$ in the channel dimension and the spatial dimension respectively, and the importance of each channel and of each spatial region is obtained, giving the channel-attention feature map $f_{ca}$ and the spatial-attention feature map $f_{sa}$. The features of the two mechanisms are merged as

$$F_{\phi} = (f_{sa} \times 0.4 + f_{ca} \times 0.6) \times 0.5 \qquad (5)$$

Because the multispectral image consists of several wave bands, more attention is paid to feature learning in the channel dimension, and the feature map $F_{\phi}$ is finally obtained.
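A sketch of the attention combination of equation (5); SE-style channel attention and a CBAM-style spatial attention are used as stand-ins, since the text does not fix the internal form of the two attention blocks.

```python
import torch
import torch.nn as nn

class ChannelSpatialFusion(nn.Module):
    """Equation (5): F_phi = (f_sa * 0.4 + f_ca * 0.6) * 0.5 on concatenated PAN/MS features."""
    def __init__(self, ch=64, r=4):
        super().__init__()
        self.ca = nn.Sequential(                    # channel attention: importance of each channel
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(ch, ch // r, 1), nn.ReLU(),
            nn.Conv2d(ch // r, ch, 1), nn.Sigmoid(),
        )
        self.sa = nn.Sequential(                    # spatial attention: importance of each region
            nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid(),
        )

    def forward(self, f_pan, f_fusion):
        x = torch.cat([f_pan, f_fusion], dim=1)
        f_ca = x * self.ca(x)
        pooled = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        f_sa = x * self.sa(pooled)
        return (f_sa * 0.4 + f_ca * 0.6) * 0.5      # channel attention weighted more heavily

# F_phi = ChannelSpatialFusion()(torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64))
```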
Referring to fig. 1, in step (S7), the decoder upsamples through 4 DB modules, where each DB module consists of a 3×3 convolution and a 1×1 convolution with stride 1, each followed by a ReLU activation; the DB modules adopt a dense connection mode, and 2 convolution layers with 3×3 kernels follow at the end. The obtained feature map $F_{\phi}$ is fed into this network, finally giving the fused high-resolution multispectral image.
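A sketch of the decoder; the channel widths, the output band count, and the placement of the 4× upsampling are assumptions.

```python
import torch
import torch.nn as nn

class DBModule(nn.Module):
    """One DB module: 3x3 conv then 1x1 conv, stride 1, each followed by ReLU."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 1), nn.ReLU(),
        )

    def forward(self, x):
        return self.body(x)

class Decoder(nn.Module):
    def __init__(self, in_ch=64, width=32, out_bands=4):
        super().__init__()
        # Dense connections: each DB module sees the concatenation of all earlier outputs.
        self.dbs = nn.ModuleList(DBModule(in_ch + i * width, width) for i in range(4))
        self.up = nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False)
        self.out = nn.Sequential(                   # two final 3x3 convolution layers
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, out_bands, 3, padding=1),
        )

    def forward(self, f_phi):
        feats = [f_phi]
        for db in self.dbs:
            feats.append(db(torch.cat(feats, dim=1)))
        return self.out(self.up(feats[-1]))         # fused high-resolution multispectral image

# Decoder()(torch.randn(1, 64, 64, 64)).shape -> (1, 4, 256, 256)
```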
The foregoing is illustrative of the present invention and is not to be construed as limiting it; various changes, modifications, substitutions, combinations, and simplifications made without departing from the spirit and principles of the invention are intended to fall within its scope.
Claims (9)
1. A multispectral image fusion method based on a graph neural network, characterized by comprising the following steps:
(S1) acquiring a multispectral image and a corresponding panchromatic image over a period of time;
(S2) extracting pixel features from the multispectral image and the panchromatic image using a shared encoder network;
(S3) carrying out dimension reduction and feature extraction on the multispectral image, then extracting and fusing the graph structures of the multispectral image in three dimensions by means of graph embedding to obtain a heterogeneous graph of the multi-source features;
(S4) performing feature extraction on the obtained heterogeneous graph using space-time graph convolution to obtain the spatial features of the graph data;
(S5) aggregating the acquired pixel features and spatial features through a gating mechanism, outputting feature weights, and using the weights to obtain a multispectral feature map that finally fuses the spatial features and the pixel features;
(S6) fusing the obtained feature map of the panchromatic image with the multispectral feature map, in which the spatial and pixel features are fused, through an attention mechanism;
(S7) passing the fused feature map through a decoder to obtain the fused multispectral image.
2. The method of claim 1, wherein in step (S1), the multispectral image is captured by a multispectral camera capable of capturing 3 or more spectral bands simultaneously.
3. The method of claim 1, wherein in step (S2), the encoder network structure comprises two branches: an upper network for extracting the shallow features of the image, consisting of 4 convolution layers with 3×3 kernels, each layer except the last followed by a ReLU activation function; and a lower network for extracting the deep features of the image, which first passes through a 1×1 convolution layer and then 4 convolution layers with 3×3 kernels, the convolution module adopting a Nest connection mode that retains more information and yields the deep features; finally, the feature maps obtained by the upper and lower networks are concatenated along the feature dimension.
4. The method of claim 1, wherein in step (S3), the three feature graphs of the graph structure are obtained as follows: a physical feature graph of the spectral data is extracted from the dimension-reduced spectral data combined with the infrared spectral features; superpixel neighbor-node information is determined by a linear iterative clustering method, the edge connections between nodes are constructed according to the spatial connectivity of the superpixels, and a spatial feature graph is extracted; and, combining the spectral-feature similarity of the target, sampling and recombining across different spectral-band dimensions yields the spectral feature distribution of the target, with the graph neural network effectively representing the spectral data residing on a smooth manifold.
5. The method of claim 1, wherein in step (S3), the heterogeneous graph is obtained by connecting the three obtained feature graphs, whose node types differ, with a graph autoencoder, and a self-attention-based graph pooling method yields the heterogeneous graph fusing the multi-source features, wherein the autoencoder includes but is not limited to a graph convolutional autoencoder, a variational graph convolutional autoencoder, and an adversarially regularized graph autoencoder.
6. The method of claim 1, wherein in step (S4), the time dimension and the spatial dimension of the space-time graph convolution are extracted by different methods, wherein the network extracting the time dimension includes but is not limited to an RNN, GRU, LSTM, TCN, or Transformer, and the network extracting the spatial dimension includes but is not limited to a GCN, GAT, or a GCN combined with a GAT.
7. The method of claim 1, wherein in step (S5), the fused feature map is obtained by first aggregating the two feature maps using two fully connected networks connected to each other; the aggregated features then pass through an activation function that limits the value to between 0 and 1, where the value expresses how much information may pass through the gate (0 means no information is allowed through, 1 means all information is allowed through); the gate value gives the weight of the output features, and multiplying this weight with the pixel features yields the feature map that finally fuses the space-time features and the pixel features.
8. The method of claim 1, wherein in step (S6), the attention mechanism is a combination of a spatial attention mechanism and a channel attention mechanism, and the pixel features of the panchromatic image are fused with the multispectral image features in which the spatial and pixel features are fused.
9. The method of claim 1, wherein in step (S7), the decoder upsamples through 4 DB modules, each DB module consisting of a 3×3 convolution and a 1×1 convolution and adopting a DenseNet-style dense connection mode, and finally outputs the fused multispectral image using 2 convolution layers with 3×3 kernels.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202310573695.1A | 2023-05-22 | 2023-05-22 | Multispectral image fusion based on graph neural network

Publications (1)

Publication Number | Publication Date
---|---
CN116563187A | 2023-08-08

Family

ID=87501614

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202310573695.1A | Multispectral image fusion based on graph neural network | 2023-05-22 | 2023-05-22

- 2023-05-22: Application filed (CN202310573695.1A); patent status: Pending

Cited By (2)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN117455970A * | 2023-12-22 | 2024-01-26 | Shandong University of Science and Technology | Airborne laser sounding and multispectral satellite image registration method based on feature fusion
CN117455970B | 2023-12-22 | 2024-05-10 | Shandong University of Science and Technology | Airborne laser sounding and multispectral satellite image registration method based on feature fusion
Similar Documents

Publication | Title
---|---
CN109272010B | Multi-scale remote sensing image fusion method based on convolutional neural network
CN110415199B | Multispectral remote sensing image fusion method and device based on residual learning
CN110544212B | Convolutional neural network hyperspectral image sharpening method based on hierarchical feature fusion
CN111275618A | Depth map super-resolution reconstruction network construction method based on double-branch perception
CN111178316B | High-resolution remote sensing image land coverage classification method
CN109859110B | Hyperspectral image panchromatic sharpening method based on spectrum dimension control convolutional neural network
CN116071243B | Infrared image super-resolution reconstruction method based on edge enhancement
CN109064405A | Multi-scale image super-resolution method based on dual path network
CN114119444B | Multi-source remote sensing image fusion method based on deep neural network
CN109509160A | Hierarchical remote sensing image fusion method utilizing layer-by-layer iteration super-resolution
Turnes et al. | Atrous cGAN for SAR to optical image translation
CN112967178B | Image conversion method, device, equipment and storage medium
CN113240683B | Attention mechanism-based lightweight semantic segmentation model construction method
CN113887645B | Remote sensing image fusion classification method based on joint attention twin network
CN116740419A | Target detection method based on graph regulation network
CN116563187A | Multispectral image fusion based on graph neural network
CN115660955A | Super-resolution reconstruction model, method, equipment and storage medium for efficient multi-attention feature fusion
CN116645579A | Feature fusion method based on heterogeneous graph attention mechanism
CN114782298A | Infrared and visible light image fusion method with regional attention
CN115601236A | Remote sensing image super-resolution reconstruction method based on characteristic information distillation network
CN117576483B | Multisource data fusion ground object classification method based on multiscale convolution self-encoder
CN116343058A | Global collaborative fusion-based multispectral and panchromatic satellite image earth surface classification method
CN118134779A | Infrared and visible light image fusion method based on multi-scale reconstruction Transformer and multi-dimensional attention
CN118334365A | Novel RGB-D image saliency target detection method
CN118196629A | Remote sensing image vegetation extraction method and device
Legal Events

Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination