CN115311184A - Remote sensing image fusion method and system based on semi-supervised deep neural network - Google Patents

Remote sensing image fusion method and system based on semi-supervised deep neural network

Info

Publication number
CN115311184A
Authority
CN
China
Prior art keywords
image
resolution
full
fusion
size
Prior art date
Legal status
Pending
Application number
CN202210950867.8A
Other languages
Chinese (zh)
Inventor
张凯
杨桂烁
孙建德
张风
万文博
Current Assignee
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date
Filing date
Publication date
Application filed by Shandong Normal University filed Critical Shandong Normal University
Priority to CN202210950867.8A priority Critical patent/CN115311184A/en
Publication of CN115311184A publication Critical patent/CN115311184A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06T5/73
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10032Satellite or aerial image; Remote sensing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Abstract

The invention provides a remote sensing image fusion method and system based on a semi-supervised deep neural network. The method comprises: acquiring a high-spatial-resolution panchromatic image and a low-spatial-resolution multispectral image to be fused, and preprocessing them; inputting the panchromatic image and the multispectral image into a two-branch network and respectively extracting the spatial information of the panchromatic image and the spectral information features of the multispectral image; stacking the extracted spatial-information and spectral-information feature maps and then performing resolution perception; and injecting the resolution perception result during fusion reconstruction of the stacked feature maps to obtain a fused image.

Description

Remote sensing image fusion method and system based on semi-supervised deep neural network
Technical Field
The disclosure relates to the technical field of remote sensing image fusion, in particular to a remote sensing image fusion method and system based on a semi-supervised deep neural network.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
In recent years, many high-resolution optical Earth-observation satellites, such as QuickBird, GeoEye, WorldView-2, and GaoFen-2, have been launched, providing researchers in the remote sensing community with a wealth of data for various research areas, such as agriculture, land surveying, and environmental monitoring. In remote sensing systems, a satellite acquires two kinds of images, namely a multispectral image (MS) and a panchromatic image (PAN), in completely different ways. Multispectral images have high spectral resolution but low spatial resolution, which is limited by on-board storage and transmission bandwidth. In contrast, panchromatic images have low spectral resolution but high spatial resolution. Panchromatic sharpening (i.e., the fusion of panchromatic and multispectral images) aims to generate high-spatial-resolution multispectral images by combining the spatial information of the panchromatic image with the spectral information of the multispectral image, which provides a good solution to alleviate the above-mentioned limitation.
Conventional fusion methods can generally be divided into two categories. 1) Component-substitution (CS) based methods: the basic assumption of CS methods is that the geometric detail information of the MS image resides in its structural component, which can be obtained by transforming the image into a new space; the structural component is then replaced, or partially replaced, by a histogram-matched version of the PAN image to inject the spatial information, and panchromatic sharpening is finally achieved by the inverse transformation. 2) Multi-resolution analysis (MRA) based methods: MRA-based methods assume that the spatial information missing in the MS image can be inferred from the high frequencies of the corresponding PAN image. To sharpen the multispectral image, a multi-resolution analysis algorithm, such as the discrete wavelet transform (DWT) or the curvelet transform, is applied to extract the high-frequency information, which is then injected into the corresponding MS image.
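For orientation only, the component-substitution idea can be sketched in a few lines of Python. The generalized-IHS-style example below (using NumPy, with simple mean/std matching standing in for histogram matching) illustrates the prior-art CS scheme described above; it is not the method of this disclosure, and the function and variable names are chosen here purely for illustration.

import numpy as np

def cs_pansharpen(ms_up: np.ndarray, pan: np.ndarray) -> np.ndarray:
    """Simple component-substitution (generalized IHS) pansharpening.
    ms_up: MS image upsampled to PAN size, shape (H, W, C); pan: PAN image, shape (H, W)."""
    intensity = ms_up.mean(axis=2)                      # structural component of the MS image
    # Match the PAN image to the intensity component (crude mean/std histogram matching).
    pan_matched = (pan - pan.mean()) / (pan.std() + 1e-8) * intensity.std() + intensity.mean()
    detail = pan_matched - intensity                    # spatial detail carried by the PAN image
    return ms_up + detail[..., None]                    # inject the detail into every band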
Recently, deep learning has enjoyed great success in various fields, such as computer vision, pattern recognition, and image processing. Many researchers have also introduced convolutional neural network (CNN) based deep learning methods into the panchromatic sharpening task, such as PNN, PanNet, and PSGAN, but there is still considerable room for improvement in panchromatic sharpening.
Disclosure of Invention
In order to solve the problems, the invention provides a remote sensing image fusion method and system based on a semi-supervised deep neural network.
According to some embodiments, the following technical scheme is adopted in the disclosure:
the remote sensing image fusion method based on the semi-supervised deep neural network comprises the following steps:
acquiring a high-spatial-resolution panchromatic image and a low-spatial-resolution multispectral image to be fused, and preprocessing the images;
inputting the panchromatic image and the multispectral image into a two-branch network, and respectively extracting the spatial information of the panchromatic image and the spectral information features of the multispectral image;
and stacking the extracted feature maps of the spatial information and the spectral information, then performing resolution perception, and injecting the resolution perception result during fusion reconstruction of the stacked feature maps to obtain a fused image.
According to other embodiments, the present disclosure adopts the following technical solutions:
the remote sensing image fusion system based on the semi-supervised deep neural network comprises:
the image acquisition module is used for acquiring the high-spatial-resolution panchromatic image and the low-spatial-resolution multispectral image to be fused and preprocessing the images;
the feature extraction module is used for inputting the panchromatic image and the multispectral image into a two-branch network and respectively extracting the spatial information of the panchromatic image and the spectral information features of the multispectral image;
the resolution perception module is used for stacking the extracted feature maps of the spatial information and the spectral information and then performing resolution perception;
and the image fusion reconstruction module is used for injecting the resolution perception result during fusion reconstruction of the stacked feature maps so as to obtain a fused image.
Further, the system also comprises a domain confrontation module;
the characteristic extraction module comprises a coder and a decoder and is used for extracting the spatial characteristic and the spectral characteristic respectively by using the spatial information of the panchromatic image and the spectral information of the multispectral image and acquiring complementary information by using two branch networks.
Compared with the prior art, the beneficial effects of the present disclosure are as follows:
First, the panchromatic image and the multispectral image are processed separately in the two-branch network, and the spatial information and the spectral information are extracted respectively, so that accurate spatial and spectral information can be obtained; this facilitates the subsequent image processing steps and improves the utilization of the spatial and spectral information.
Second, the learned features are more complementary, because there is interaction between images of different resolutions during learning and training. The cross-resolution information is injected into the decoder through the resolution perception module, so that the reconstructed high-resolution multispectral image contains more detailed information.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a flow chart of a method implementation in an embodiment of the present disclosure;
FIG. 2 is a detailed diagram of the components of the various network modules in an embodiment of the present disclosure;
FIG. 3 is a graph comparing the results of fusing a low spatial resolution multispectral image and a high spatial resolution panchromatic image using an embodiment of the present disclosure;
FIG. 3 (a) is a low spatial resolution multispectral image;
FIG. 3 (b) is a high spatial resolution panchromatic image;
FIG. 3 (c) is the Ground-Truth, i.e., the reference image of the fusion result;
FIG. 3 (d) is a high spatial resolution multi-spectral image obtained after fusing FIGS. 3 (a) and 3 (b) using the present disclosure;
the specific implementation mode is as follows:
the present disclosure is further described with reference to the following drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
Example 1
An embodiment of the disclosure provides a remote sensing image fusion method based on a semi-supervised deep neural network, which comprises the following steps:
s101, acquiring a high-spatial-resolution panchromatic image and a low-spatial-resolution multispectral image to be fused, and preprocessing the images;
specifically, a Bicubic interpolation upsampling is used for upsampling the multispectral image into a full-color image, and the low-resolution multispectral image and the full-color image are obtained by performing quadruple upsampling;
s102, inputting the full-color image and the multispectral image into a two-branch network, and respectively extracting spatial information of the full-color image and spectral information characteristics of the multispectral image;
in particular, multispectral images are upsampled to panchromatic image size using Bicubic interpolation upsampling, quadruple upsampled low resolution multispectral images and panchromatic images are input to a two-branch network,
and S103, stacking the extracted characteristic graphs of the spatial information and the spectral information, then performing resolution sensing, and then injecting a resolution sensing result when performing fusion reconstruction on the stacked characteristic graphs to obtain a fusion image.
Specifically, the stacking result is input to a resolution extractor, resolution sensing is performed, and a resolution sensing result is output.
And respectively inputting the stacking results of the full size and the small size into an encoder for fusion, and obtaining a full-size fusion characteristic diagram and a small-size fusion characteristic diagram through two rolling blocks.
When the high-resolution multispectral image is reconstructed, the fused feature map is input into a decoder, the feature map is obtained through a first convolution module, then a first convolution result and a second convolution result are stacked and input into a third convolution block, the output is input into a fourth convolution block, the number of channels is increased, the feature map output in the double-branch network and the result output by the fourth convolution block are stacked and input into a fifth convolution block, and then the number of the channels is reduced through a sixth convolution block to obtain a reconstructed target image.
By way of example, the specific implementation process of the technical scheme of the application is as follows:
as an embodiment, in order to obtain a multispectral image with high spatial resolution, the specific model of the present disclosure is constructed as follows:
(1) Inputting an image
Input a full-size panchromatic image (Full-scale PAN image) P_F ∈ R^{256×256×1} and a full-size multispectral image (Full-scale LRMS image) L_F ∈ R^{64×64×4}; input a small-size panchromatic image (Reduced-scale PAN image) P_R ∈ R^{64×64×1} and a small-size multispectral image (Reduced-scale LRMS image) L_R ∈ R^{16×16×4}.
The preprocessing before input is as follows: the full-size multispectral image L_F ∈ R^{64×64×4} and the small-size multispectral image L_R ∈ R^{16×16×4} are each upsampled by a factor of four to obtain the images L_F ∈ R^{256×256×4} and L_R ∈ R^{64×64×4}.
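As an illustration of this preprocessing step, a minimal PyTorch sketch is given below; the helper name upsample_lrms and the use of torch.nn.functional.interpolate are choices made here for illustration, since the original text only specifies four-fold bicubic upsampling.

import torch
import torch.nn.functional as F

def upsample_lrms(lrms: torch.Tensor, scale: int = 4) -> torch.Tensor:
    """Upsample a low-resolution multispectral batch (B, C, H, W) by `scale` with bicubic interpolation."""
    return F.interpolate(lrms, scale_factor=scale, mode="bicubic", align_corners=False)

# Example: the full-size LRMS L_F (64x64x4) becomes 256x256x4, matching the 256x256x1 PAN P_F.
l_f = torch.rand(1, 4, 64, 64)
l_f_up = upsample_lrms(l_f)   # shape (1, 4, 256, 256)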
(2) Constructing the core modules of the semi-supervised deep learning network, which comprise four parts, namely an encoder, a decoder, a resolution perception module and a domain confrontation module.
In order to fully utilize the spatial information of the panchromatic image PAN and the spectral information of the multispectral image LRMS, the features of the PAN and the LRMS are extracted separately by the encoder module, i.e., the two branch networks, to obtain their complementary information. Each branch is composed of a convolution layer with a 3 × 3 kernel and a stride of 1 (the spectral branch has 4 input channels and the spatial branch has 1), followed by a LeakyReLU activation function and a downsampling layer, which downsamples with a convolution of 2 × 2 kernel and a stride of 2. After the two branch networks, the feature maps are first stacked and then fused by the two subsequent convolution blocks: the first convolution block consists of two convolution layers with 3 × 3 kernels and a stride of 1, and the second convolution block is a convolution layer with a 2 × 2 kernel and a stride of 2, which downsamples the image again.
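Purely as an illustration, a PyTorch-style sketch of such a two-branch encoder is given below. The class and attribute names, the LeakyReLU slope, and the placement of the channel growth from 128 to 256 in the first fusion block are assumptions; the kernel sizes, strides, 64-channel branch outputs and 256-channel fused features follow the dimensions described in this embodiment.

import torch
import torch.nn as nn

class Branch(nn.Module):
    # One encoder branch: 3x3 stride-1 conv -> LeakyReLU -> 2x2 stride-2 downsampling conv.
    def __init__(self, in_ch: int, out_ch: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=2, stride=2),
        )

    def forward(self, x):
        return self.body(x)

class TwoBranchEncoder(nn.Module):
    # Spatial branch for the 1-channel PAN, spectral branch for the 4-channel upsampled LRMS,
    # followed by stacking and two fusion blocks.
    def __init__(self):
        super().__init__()
        self.spatial_branch = Branch(in_ch=1)
        self.spectral_branch = Branch(in_ch=4)
        self.fuse1 = nn.Sequential(                       # first fusion block: two 3x3 stride-1 convs
            nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
        )
        self.fuse2 = nn.Conv2d(256, 256, kernel_size=2, stride=2)  # second fusion block: downsample again

    def forward(self, pan, lrms_up):
        b_pan = self.spatial_branch(pan)       # e.g. (B, 1, 256, 256) -> (B, 64, 128, 128)
        b_ms = self.spectral_branch(lrms_up)   # e.g. (B, 4, 256, 256) -> (B, 64, 128, 128)
        c = torch.cat([b_pan, b_ms], dim=1)    # stacking result, 128 channels
        f = self.fuse2(self.fuse1(c))          # fused feature map, e.g. (B, 256, 64, 64)
        return b_pan, b_ms, c, f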
The features extracted by the encoder are reconstructed in the decoder to obtain the high-spatial-resolution multispectral image HRMS. The decoder comprises six modules: the first two modules are convolution operations with a 3 × 3 kernel and a stride of 1; the outputs of the first and second modules are then stacked and input into the third convolution block; the fourth convolution block is an upsampling operation, implemented with a transposed convolution (Transposed Convolution) with a 2 × 2 kernel and a stride of 2. To obtain richer spatial and spectral information, the upsampled result is stacked with the downsampled result of the two-branch network to obtain a 256 × 256 × 192 feature map, which is input into the fifth module to reduce the number of channels to 64, and finally into the sixth module to obtain the expected high-spatial-resolution multispectral image HRMS ∈ R^{256×256×4}.
The extracted spatial-information and spectral-information feature maps are stacked and then subjected to resolution perception using the resolution perception module. The resolution perception module takes the stacked two-branch result of the encoder as input; its main structure is two convolution stages with LeakyReLU activations: the first convolution module consists of two convolution layers with 3 × 3 kernels and a stride of 1, with LeakyReLU activation, and the second convolution module consists of one convolution layer with a 2 × 2 kernel and a stride of 2, so that the resolution is reduced to 1/2 of the original.
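A sketch of this resolution perception (resolution extractor) module is shown below, under the assumption that the extracted information is split into the two factors β and γ by two parallel 2 × 2 stride-2 output convolutions and that the channel width grows to 256 in the first stage; the original text only specifies the two-stage convolution/LeakyReLU structure and the halved output resolution.

import torch.nn as nn

class ResolutionExtractor(nn.Module):
    # Resolution perception module: stacked two-branch features in, (beta, gamma) out at half resolution.
    def __init__(self, in_ch: int = 128, out_ch: int = 256):
        super().__init__()
        self.block1 = nn.Sequential(                      # two 3x3 stride-1 convs with LeakyReLU
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
        )
        self.to_beta = nn.Conv2d(out_ch, out_ch, kernel_size=2, stride=2)   # halves the resolution
        self.to_gamma = nn.Conv2d(out_ch, out_ch, kernel_size=2, stride=2)

    def forward(self, c):
        h = self.block1(c)                         # e.g. (B, 256, 128, 128)
        return self.to_beta(h), self.to_gamma(h)   # e.g. two (B, 256, 64, 64) tensors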
Specifically, the resolution perception result is obtained as follows:
The resolution information β output by the resolution perception module is multiplied element-wise with the output of the encoder, the result is added to the resolution information γ, and the sum is injected into the downstream decoder, so that the HRMS reconstruction process contains more resolution information.
Since both full-size and small-size images are used, a domain confrontation module is designed to distinguish the full-size domain from the small-size domain. It comprises two linear layers, a batch normalization layer and a softmax activation function. The domain confrontation module takes the output of the encoder as input; the first linear layer outputs 100 features, on which a batch normalization (Batch Normalization) operation is performed; a second linear layer then reduces the output to 2, giving a tensor of shape (batch_size, 2), which is finally input into softmax. The softmax activation function constrains the result to (0, 1) so that it can be used together with the loss function NLLLoss during training.
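A minimal sketch of this domain confrontation module is given below. The flattening of the encoder output and the feature dimension in_features are assumptions; note that PyTorch's NLLLoss expects log-probabilities, so log_softmax is used here as the standard pairing for the softmax + NLLLoss combination described above.

import torch
import torch.nn as nn

class DomainClassifier(nn.Module):
    # Two linear layers with batch normalization; outputs per-sample log-probabilities over the
    # two domains (full-size vs. small-size) for use with NLLLoss.
    def __init__(self, in_features: int):
        super().__init__()
        self.fc1 = nn.Linear(in_features, 100)    # first linear layer: 100 output features
        self.bn = nn.BatchNorm1d(100)
        self.fc2 = nn.Linear(100, 2)              # (batch_size, 2) domain scores

    def forward(self, feat):
        x = feat.flatten(start_dim=1)             # assumed: flatten the encoder feature map
        x = self.fc2(self.bn(self.fc1(x)))
        return torch.log_softmax(x, dim=1)

# Usage with the 0/1 domain labels described below (label0 for the full-size domain,
# label1 for the small-size domain):
# criterion = nn.NLLLoss()
# loss5 = criterion(classifier(full_feat), torch.zeros(b, dtype=torch.long))
# loss6 = criterion(classifier(redu_feat), torch.ones(b, dtype=torch.long))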
As an embodiment, in order to obtain a multispectral image with high spatial resolution, the specific cross-resolution model is trained as follows:
inputting full-size full-color image P F ∈R 256×256×1 And a multispectral image L F ∈R 64×64×4 Firstly, performing 4-fold upsampling on the multispectral image by using Bicubic interpolation upsampling to obtain an image L F ∈R 256×256×4 Then the full-color image P is put F ∈R 256 ×256×1 And an upsampled multi-spectral image L F ∈R 256×256×4 Common input encoder, through a two-branch network to obtain B PAN ∈R 128 ×128×64 And B MS ∈R 128×128×64 Then stacking the two to obtain C Full1 ∈R 128×128×128 Finally, stacking the result C Full1 ∈R 128×128×128 Input to the next two fusion blocks to obtain F Full2 ∈R 64×64×256 . Resolution sensing module to stack results C Full1 ∈R 128×128×128 As input, for extracting resolution information (β and γ), and then fusing the resolution information with the fusion block F Full2 ∈R 64×64×256 Are jointly input into a decoder to finally obtain a reconstructed full-size fusion image F F ∈R 256×256×4
Specifically, the process in the decoder is:
F_input = β ⊙ F_Full2 + γ
where ⊙ denotes element-wise multiplication, β and γ are the extracted resolution information, F_Full2 is the output of the encoder, and F_input is the input to the decoder.
The cross-resolution full-size fused image (Cross-resolution Full-scale Fused image) is C_F ∈ R^{256×256×4}.
The acquisition process is as follows: the generation of the cross-resolution full-size fused image is similar to that of the full-size fused image, except that the resolution information of the small-size image extracted by the resolution perception module is upsampled and then injected into the reconstruction process of the full-size image, so that the cross-resolution full-size fused image C_F ∈ R^{256×256×4} contains low-resolution information.
The specific process is as follows:
F_input = β↑ ⊙ F_Full2 + γ↑
where ↑ denotes the upsampling operation: the low-resolution information β and γ of the small-size image, extracted by the resolution perception module, is upsampled, combined with the encoder output F_Full2 ∈ R^{64×64×256}, and then input into the decoder.
The reduced-scale fused image (Reduced-scale Fused image) is F_R ∈ R^{64×64×4}, and the acquisition process is as follows:
Input the small-size panchromatic image P_R ∈ R^{64×64×1} and multispectral image L_R ∈ R^{16×16×4}. First, the multispectral image is upsampled by a factor of four using bicubic interpolation to obtain the image L_R ∈ R^{64×64×4}; then the panchromatic image P_R ∈ R^{64×64×1} and the upsampled multispectral image L_R ∈ R^{64×64×4} are input together into the encoder, and the two-branch network yields B_PAN ∈ R^{32×32×64} and B_MS ∈ R^{32×32×64}, which are stacked to obtain C_Redu1 ∈ R^{32×32×128}. Finally, the stacking result C_Redu1 ∈ R^{32×32×128} is input into the next two fusion blocks to obtain F_Redu2 ∈ R^{16×16×256}. The resolution perception module takes the stacking result C_Redu1 ∈ R^{32×32×128} as input to extract the resolution information, which is input into the decoder together with the fusion-block output F_Redu2 ∈ R^{16×16×256} to obtain the small-size fused image F_R ∈ R^{64×64×4}. The reconstruction process for the small-size fused image is similar to the full-size case, except that the input and output image pairs differ in size.
F_input = β ⊙ F_Redu2 + γ
The resolution information β and γ used in the reconstruction of the small-size fused image is derived from the small-size image; it is combined with the encoder output in the same way and then input into the decoder to obtain the small-size fused image F_R ∈ R^{64×64×4}.
Next, the cross-resolution small-size fused image C_R ∈ R^{64×64×4} is acquired. The resolution information in this process comes from the full-size image; it is downsampled and then injected into the decoder, so that the small-size fused image contains high-resolution information.
F_input = β↓ ⊙ F_Redu2 + γ↓
where ↓ denotes the downsampling operation: the high-resolution information β and γ of the full-size image, extracted by the resolution perception module, is downsampled, combined with the encoder output F_Redu2 ∈ R^{16×16×256}, and then input into the decoder to obtain the cross-resolution small-size fused image C_R ∈ R^{64×64×4}.
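Continuing the sketch above, the cross-resolution injections only differ in that β and γ are resampled to the other scale before the combination; the choice of bicubic resampling below is an assumption, since the text only specifies up- and down-sampling.

import torch.nn.functional as F

def inject(beta, gamma, f_enc):
    # Resize the resolution factors to the spatial size of the encoder output, then combine.
    size = f_enc.shape[-2:]
    beta_r = F.interpolate(beta, size=size, mode="bicubic", align_corners=False)
    gamma_r = F.interpolate(gamma, size=size, mode="bicubic", align_corners=False)
    return beta_r * f_enc + gamma_r

# Cross-resolution full-size image: beta/gamma come from the small-size branch and are upsampled.
# f_input_cf = inject(beta_redu, gamma_redu, f_full2)
# Cross-resolution small-size image: beta/gamma come from the full-size branch and are downsampled.
# f_input_cr = inject(beta_full, gamma_full, f_redu2)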
The domain confrontation module takes the fusion block F_Redu2 ∈ R^{16×16×256} and the four-times-downsampled fusion block F_Full2 ∈ R^{64×64×256} as input, in order to separate the full-size domain from the small-size domain and make their features more distinct. In addition, to reduce the number of parameters and the computational burden, the encoder, the decoder and the resolution perception module in the model share parameters.
Further, loss functions are constructed; the losses used by the model comprise MSELoss and NLLLoss. The constructed loss functions are:
[Loss1–Loss4 were given as equation images in the original publication; they are MSE losses defined over the fused outputs F_R, C_R, F_F and C_F, the reference H, the full-size LRMS L_F and the downsampling operator D(·), as defined below.]
wherein H is the input full-size multispectral image, which serves as the reference image (Reduced-scale Reference) H ∈ R^{64×64×4} for the small-size fusion results; F_R is the Reduced-scale Fused image, i.e., the small-size fused image; C_R is the Cross-resolution Reduced-scale Fused image, i.e., the cross-resolution small-size fused image; F_F is the Full-scale Fused image, i.e., the full-size fused image; C_F is the Cross-resolution Full-scale Fused image, i.e., the cross-resolution full-size fused image; L_F is the Full-scale LRMS, i.e., the full-size low-resolution multispectral image; and D(·) is the bicubic-interpolation spatial downsampling operator.
NLLLoss is similar to the cross-entropy loss and is commonly used in classification problems, but it needs to be combined with operations such as softmax and log. The corresponding loss terms are:
Loss5=NLLLoss(full_domain,label0)
Loss6=NLLLoss(redu_domain,label1)
where full_domain and redu_domain are the outputs of the model (the results of the softmax), label0 is a tensor of all 0's, and label1 is a tensor of all 1's.
The overall loss function is then:
Loss=Loss1+Loss2+Loss3+Loss4+Loss5+Loss6
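A minimal sketch of the overall loss is given below. The pairing of Loss1–Loss4 (reduced-scale fused outputs against the reference H, spatially downsampled full-scale outputs against L_F) is inferred from the variable definitions above and is stated here as an assumption, since the original equations were only available as images.

import torch
import torch.nn as nn
import torch.nn.functional as F

mse = nn.MSELoss()
nll = nn.NLLLoss()

def down4(x):
    # Bicubic spatial downsampling operator D(.) by a factor of four.
    return F.interpolate(x, scale_factor=0.25, mode="bicubic", align_corners=False)

def total_loss(f_r, c_r, f_f, c_f, h, l_f, full_domain, redu_domain):
    # f_r, c_r: reduced-scale fused outputs; f_f, c_f: full-scale fused outputs;
    # h: reduced-scale reference; l_f: full-scale LRMS;
    # full_domain, redu_domain: log-probability outputs of the domain confrontation module.
    loss1 = mse(f_r, h)                      # assumed pairing
    loss2 = mse(c_r, h)                      # assumed pairing
    loss3 = mse(down4(f_f), l_f)             # assumed pairing, via D(.)
    loss4 = mse(down4(c_f), l_f)             # assumed pairing, via D(.)
    label0 = torch.zeros(full_domain.size(0), dtype=torch.long)   # full-size domain labels
    label1 = torch.ones(redu_domain.size(0), dtype=torch.long)    # small-size domain labels
    loss5 = nll(full_domain, label0)
    loss6 = nll(redu_domain, label1)
    return loss1 + loss2 + loss3 + loss4 + loss5 + loss6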
example 2
An embodiment of the present disclosure provides a remote sensing image fusion system based on a semi-supervised deep neural network, comprising: the image acquisition module is used for acquiring the high-spatial-resolution panchromatic image and the low-spatial-resolution multispectral image to be fused and preprocessing them;
the feature extraction module is used for inputting the panchromatic image and the multispectral image into a two-branch network and respectively extracting the spatial information of the panchromatic image and the spectral information features of the multispectral image;
the resolution perception module is used for stacking the extracted feature maps of the spatial information and the spectral information and then performing resolution perception;
and the image fusion reconstruction module is used for injecting the resolution perception result during fusion reconstruction of the stacked feature maps to obtain a fused image.
Further, the system also comprises a domain confrontation module;
the characteristic extraction module comprises a coder and a decoder and is used for extracting the spatial characteristic and the spectral characteristic respectively by using the spatial information of the panchromatic image and the spectral information of the multispectral image through two branch networks to obtain complementary information.
As an embodiment, the following method steps are specifically implemented by using the system module:
the data set of the present disclosure employs a low spatial resolution multispectral image and a high spatial resolution panchromatic image taken by GeoEye-1 satellites in 2 months 2009 in hobart australia.
Step 1: respectively inputting Full-size multispectral image Full-scale LRMS image L F ∈R 64×64×4 And Full-color image Full-scale PAN image P F ∈R 256×256×1 Reduced-scale LRMS image L of small-size multispectral image R ∈R 16×16×4 And full-color image Reduced-scale PAN image P R ∈R 64×64×1 The positions of the full-size image and the small-size image are corresponding, and only the difference of high resolution and low resolution exists, wherein the full-size multispectral image is used as a reference image of the small-size fusion result HRMS.
Here, the multispectral images are upsampled to the panchromatic image sizes using bicubic interpolation, i.e., to 256 × 256 × 4 at full size and 64 × 64 × 4 at small size, and together with the corresponding panchromatic images they form pairs of training data.
The four-times-upsampled low-resolution multispectral image and the panchromatic image are input into the two-branch network to obtain B_PAN ∈ R^{128×128×64} and B_MS ∈ R^{128×128×64}, which are stacked to obtain C_Full1 ∈ R^{128×128×128}; similarly, there is a small-size stacking result C_Redu1 ∈ R^{32×32×128}.
Step 2: the full-size and small-size stacking results are input into the encoder for fusion, respectively.
The full-size stacking result C_Full1 ∈ R^{128×128×128} and the small-size stacking result C_Redu1 ∈ R^{32×32×128} are input into the encoder, and the full-size fused feature map F_Full2 ∈ R^{64×64×256} and the small-size fused feature map F_Redu2 ∈ R^{16×16×256} are obtained through two convolution blocks.
Step 3: perform domain confrontation.
The full-size fused feature map F_Full2 ∈ R^{64×64×256} is downsampled by a factor of four with bicubic interpolation to obtain a result of size 16 × 16 × 256, and domain confrontation is then carried out against the small-size fused feature map F_Redu2 ∈ R^{16×16×256}. The domain confrontation module comprises two linear layers, a batch normalization layer and a softmax activation function; softmax maps the domain information to between 0 and 1, and NLLLoss is then combined with the 0 labels and 1 labels to separate the full-size domain from the small-size domain.
Step 4: extract the resolution factors.
The stacking result C_Full1 ∈ R^{128×128×128} is input into the resolution extractor, whose outputs are F_β ∈ R^{64×64×256} and F_γ ∈ R^{64×64×256}; likewise, there are small-size resolution factors R_β ∈ R^{16×16×256} and R_γ ∈ R^{16×16×256}. It should be noted that β and γ are not simple scalars, but four-dimensional tensors containing resolution information.
And 5: reconstructing high resolution multi-spectral images
Fusing the full size of feature maps F Full2 ∈R 64×64×256 An input decoder firstly passes through a first convolution module to obtain a 128 x 128 characteristic diagram, then a first convolution result and a second convolution result are stacked and input into a third convolution block, the resolution of the image is still 128 x 128 at the moment, then the image enters a fourth convolution block to increase the channel number to 128, and then a characteristic diagram B of a double-branch network containing rich space information and spectrum information is obtained PAN ∈R 128×128×64 And B MS ∈R 128×128×64 Stacking results of the convolution block 4, inputting a 5 th convolution block, wherein the 5 th convolution block is a transposed convolution and is responsible for increasing the image to 256 multiplied by 256, and finally reducing the channel number to 4 through the 6 th convolution block to obtain a target image HRMS (high resolution moving Picture) belonging to the R 256×256×4 The process is described as a full-size domain image reconstruction process, and the small-size domain reconstruction process is similar to the process except that the resolution is 1/4 of the full-size domain.
The effects of the present disclosure can be further illustrated by the following simulations.
1. Simulation environment:
PyCharm Community Edition 2022.1.2 x64, NVIDIA GeForce RTX 3090, Ubuntu 18.04.
2. Simulation content:
Fusing a low-spatial-resolution multispectral image and a high-spatial-resolution panchromatic image taken by the GeoEye-1 satellite over the Hobart region of Australia in February 2009 with the present disclosure yields the result shown in FIG. 3, where:
FIG. 3 (a) is a low-spatial-resolution multispectral image, of size 64 × 64 × 4;
FIG. 3 (b) is a high-spatial-resolution panchromatic image, of size 256 × 256 × 1;
FIG. 3 (c) is the Ground-Truth, i.e., the reference image of the fusion result, of size 256 × 256 × 4;
FIG. 3 (d) is the high-spatial-resolution multispectral image obtained by fusing FIGS. 3 (a) and 3 (b) with the present disclosure, of size 256 × 256 × 4.
As can be seen from FIG. 3, the spatial detail of FIG. 3 (d) is significantly improved compared with FIG. 3 (a): the edges of roads and buildings are clear; and the color information of FIG. 3 (d) is richer than that of FIG. 3 (b). The present disclosure therefore fuses FIG. 3 (a) and FIG. 3 (b) well.
Simulation 2: in order to demonstrate the effect of the invention, the method of the invention and the prior-art BDSD method, AWLP method, Indusion method, SVT method, VPLMC method, and the deep-neural-network methods PNN and PanNet are respectively used to fuse the images to be fused in FIG. 3 (a) and FIG. 3 (b), and objective index evaluation is carried out on the fused results. The evaluation indexes are as follows:
1) The correlation coefficient CC indicates the degree of retention of the spectral information; the closer the correlation coefficient is to 1, the more similar the fusion result is to the reference image.
2) The root mean square error RMSE is the square root of the mean of the squared deviations between the predicted values and the true values over the n observations; the smaller the value, the better the fusion result.
3) The global comprehensive error index ERGAS takes the scale relation between the fused image and the observed image into account on the basis of RMSE; the smaller the value (the closer to 0), the better the fusion result.
4) The spectral angle SAM represents the degree of spectral distortion; the closer it is to 0, the better the fusion result.
5) The global quality evaluation index Q represents the overall similarity of the images in space and spectrum; the result lies in the interval [0, 1], and the larger the value, the more similar the fused image is to the reference image.
6) The universal image quality index UIQI represents the closeness of the fused image to the reference image; the closer it is to 1, the better the fusion result. (Minimal computation sketches for several of these indices are given after this list.)
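As an illustration, minimal NumPy sketches of three of these indices (CC, RMSE and the spectral angle SAM) are given below; ERGAS, Q and UIQI involve additional parameters (resolution ratio, sliding windows) and are omitted here. These are generic formulations, not necessarily the exact implementations used for Table 1.

import numpy as np

def cc(fused: np.ndarray, ref: np.ndarray) -> float:
    # Correlation coefficient between fused and reference images (flattened over all bands).
    return float(np.corrcoef(fused.ravel(), ref.ravel())[0, 1])

def rmse(fused: np.ndarray, ref: np.ndarray) -> float:
    # Root mean square error.
    diff = fused.astype(np.float64) - ref.astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

def sam(fused: np.ndarray, ref: np.ndarray, eps: float = 1e-8) -> float:
    # Mean spectral angle (radians) between per-pixel spectra; inputs have shape (H, W, C).
    dot = np.sum(fused * ref, axis=2)
    norms = np.linalg.norm(fused, axis=2) * np.linalg.norm(ref, axis=2) + eps
    angles = np.arccos(np.clip(dot / norms, -1.0, 1.0))
    return float(np.mean(angles))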
Based on these evaluation indexes, the fusion results of the present disclosure and of the prior art were objectively evaluated; the results are shown in Table 1.
TABLE 1 Objective evaluation of fusion results of various methods
[Table 1 was provided as an image in the original publication; the numerical values are not reproduced here.]
As can be seen from Table 1, the correlation coefficient CC, the global quality evaluation index Q and the universal image quality index UIQI of the present disclosure are all greater than the corresponding values of the prior art, while the root mean square error RMSE, the global comprehensive error index ERGAS and the spectral angle SAM are all smaller than those of the prior art; most of the objective evaluation indexes of the present disclosure are therefore superior to those of the prior art.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Although the present disclosure has been described with reference to specific embodiments, it should be understood that the scope of the present disclosure is not limited thereto, and those skilled in the art will appreciate that various modifications and changes can be made without departing from the spirit and scope of the present disclosure.

Claims (11)

1. The remote sensing image fusion method based on the semi-supervised deep neural network is characterized by comprising the following steps:
acquiring a high-spatial-resolution panchromatic image and a low-spatial-resolution multispectral image to be fused, and preprocessing the images;
inputting the panchromatic image and the multispectral image into a two-branch network, and respectively extracting the spatial information of the panchromatic image and the spectral information features of the multispectral image;
and stacking the extracted feature maps of the spatial information and the spectral information, then performing resolution perception, and injecting the resolution perception result during fusion reconstruction of the stacked feature maps to obtain a fused image.
2. The remote sensing image fusion method based on the semi-supervised deep neural network as recited in claim 1, wherein the preprocessing process comprises: upsampling the multispectral image to the panchromatic image size using bicubic interpolation, and inputting the four-times-upsampled low-resolution multispectral image and the panchromatic image into the two-branch network.
3. The remote sensing image fusion method based on the semi-supervised deep neural network as recited in claim 1, wherein the two-branch network comprises two branches, namely a spectral branch and a spatial branch, and the two branches are composed of convolution layers with a 3 × 3 convolution kernel and a stride of 1.
4. The remote sensing image fusion method based on the semi-supervised deep neural network as recited in claim 2, wherein the number of input channels of the spectral branch is 4, and the number of input channels of the spatial branch is 1.
5. The remote sensing image fusion method based on the semi-supervised deep neural network as recited in claim 1, wherein full-size and small-size stacking results are respectively input into an encoder for fusion, and a full-size fused feature map and a small-size fused feature map are obtained through two convolution blocks.
6. The remote sensing image fusion method based on the semi-supervised deep neural network as recited in claim 1, wherein the stacked result is input to a resolution extractor for resolution perception, and a resolution perception result is output.
7. The remote sensing image fusion method based on the semi-supervised deep neural network as recited in claim 1, wherein when reconstructing the high-resolution multispectral image, the fusion feature map is input into a decoder, the feature map is obtained through a first convolution module, then a first convolution result and a second convolution result are stacked and input into a third convolution block, the output enters a fourth convolution block, the number of channels is increased, the feature map output in the double-branch network and the result output by the fourth convolution block are stacked and input into a fifth convolution block, and then the number of channels is reduced through the sixth convolution block to obtain the reconstructed target image.
8. The method for fusing remote sensing images based on semi-supervised deep neural network as recited in claim 1, wherein the full-size fused feature map is subjected to bicubic interpolation four times down-sampling and then is in domain confrontation with the small-size fused feature map.
9. The remote sensing image fusion method based on the semi-supervised deep neural network as recited in claim 8, wherein the domain confrontation is performed in a domain confrontation module, the domain confrontation module comprises two linear layers, a batch normalization layer and a softmax activation function, the activation function softmax maps the domain information to between 0 and 1, and NLLLoss is then combined with a 0 label and a 1 label to separate a full-size domain from a small-size domain.
10. Remote sensing image fusion system based on semi-supervised deep neural network, its characterized in that includes:
the image acquisition module is used for acquiring the high-spatial-resolution panchromatic image and the low-spatial-resolution multispectral image to be fused and preprocessing the images;
the feature extraction module is used for inputting the panchromatic image and the multispectral image into a two-branch network and respectively extracting the spatial information of the panchromatic image and the spectral information features of the multispectral image;
the resolution perception module is used for stacking the extracted feature maps of the spatial information and the spectral information and then performing resolution perception;
and the image fusion reconstruction module is used for injecting the resolution perception result during fusion reconstruction of the stacked feature maps to obtain a fused image.
11. The remote sensing image fusion system based on the semi-supervised deep neural network as recited in claim 10, characterized by further comprising a domain confrontation module;
the characteristic extraction module comprises a coder and a decoder and is used for extracting the spatial characteristic and the spectral characteristic respectively by using the spatial information of the panchromatic image and the spectral information of the multispectral image through two branch networks to obtain complementary information.
CN202210950867.8A 2022-08-09 2022-08-09 Remote sensing image fusion method and system based on semi-supervised deep neural network Pending CN115311184A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210950867.8A CN115311184A (en) 2022-08-09 2022-08-09 Remote sensing image fusion method and system based on semi-supervised deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210950867.8A CN115311184A (en) 2022-08-09 2022-08-09 Remote sensing image fusion method and system based on semi-supervised deep neural network

Publications (1)

Publication Number Publication Date
CN115311184A true CN115311184A (en) 2022-11-08

Family

ID=83861270

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210950867.8A Pending CN115311184A (en) 2022-08-09 2022-08-09 Remote sensing image fusion method and system based on semi-supervised deep neural network

Country Status (1)

Country Link
CN (1) CN115311184A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115471437A (en) * 2022-11-14 2022-12-13 中国测绘科学研究院 Image fusion method based on convolutional neural network and remote sensing image fusion method
CN115861763A (en) * 2023-03-01 2023-03-28 电子科技大学 Multispectral multi-view environment sensing method
CN115861763B (en) * 2023-03-01 2023-04-25 电子科技大学 Multispectral and multi-view environment sensing method

Similar Documents

Publication Publication Date Title
CN110119780B (en) Hyper-spectral image super-resolution reconstruction method based on generation countermeasure network
Zhang et al. Pan-sharpening using an efficient bidirectional pyramid network
CN110660038B (en) Multispectral image and full-color image fusion method based on generation countermeasure network
CN109886871B (en) Image super-resolution method based on channel attention mechanism and multi-layer feature fusion
Zhou et al. Pyramid fully convolutional network for hyperspectral and multispectral image fusion
CN109102469B (en) Remote sensing image panchromatic sharpening method based on convolutional neural network
CN110415199B (en) Multispectral remote sensing image fusion method and device based on residual learning
CN109064396A (en) A kind of single image super resolution ratio reconstruction method based on depth ingredient learning network
CN115311184A (en) Remote sensing image fusion method and system based on semi-supervised deep neural network
CN109727207B (en) Hyperspectral image sharpening method based on spectrum prediction residual convolution neural network
CN111080567A (en) Remote sensing image fusion method and system based on multi-scale dynamic convolution neural network
Ran et al. Remote sensing images super-resolution with deep convolution networks
CN111951164B (en) Image super-resolution reconstruction network structure and image reconstruction effect analysis method
Luo et al. Lattice network for lightweight image restoration
CN111161146A (en) Coarse-to-fine single-image super-resolution reconstruction method
Keshk et al. Satellite super-resolution images depending on deep learning methods: a comparative study
CN112669248A (en) Hyperspectral and panchromatic image fusion method based on CNN and Laplacian pyramid
Hu et al. Hyperspectral image super resolution based on multiscale feature fusion and aggregation network with 3-D convolution
CN104899835A (en) Super-resolution processing method for image based on blind fuzzy estimation and anchoring space mapping
CN116309070A (en) Super-resolution reconstruction method and device for hyperspectral remote sensing image and computer equipment
Wang et al. A group-based embedding learning and integration network for hyperspectral image super-resolution
Deng et al. Multiple frame splicing and degradation learning for hyperspectral imagery super-resolution
CN114638761A (en) Hyperspectral image panchromatic sharpening method, device and medium
Li et al. Hyperspectral and Panchromatic images Fusion Based on The Dual Conditional Diffusion Models
CN110807746B (en) Hyperspectral image sharpening method based on detail embedded injection convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination