CN113763299B - Panchromatic and multispectral image fusion method and device and application thereof - Google Patents

Panchromatic and multispectral image fusion method and device and application thereof Download PDF

Info

Publication number
CN113763299B
CN113763299B CN202110989821.2A CN202110989821A
Authority
CN
China
Prior art keywords
image
fusion
panchromatic
multispectral
fidelity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110989821.2A
Other languages
Chinese (zh)
Other versions
CN113763299A (en)
Inventor
周朝阳
徐其志
陈力
王幸
郭梦瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Engineering Protection National Defense Engineering Research Institute Academy of Military Sciences of PLA
Original Assignee
Institute of Engineering Protection National Defense Engineering Research Institute Academy of Military Sciences of PLA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Engineering Protection National Defense Engineering Research Institute Academy of Military Sciences of PLA filed Critical Institute of Engineering Protection National Defense Engineering Research Institute Academy of Military Sciences of PLA
Priority to CN202110989821.2A priority Critical patent/CN113763299B/en
Publication of CN113763299A publication Critical patent/CN113763299A/en
Application granted granted Critical
Publication of CN113763299B publication Critical patent/CN113763299B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/088 - Non-supervised learning, e.g. competitive learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/20 - Image enhancement or restoration using local operators
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10032 - Satellite or aerial image; Remote sensing
    • G06T2207/10036 - Multispectral image; Hyperspectral image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10032 - Satellite or aerial image; Remote sensing
    • G06T2207/10041 - Panchromatic image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30181 - Earth observation
    • G06T2207/30188 - Vegetation; Agriculture

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a panchromatic and multispectral image fusion method and device and an application thereof. The method first performs band superposition on the panchromatic image, interpolates the multispectral image into an interpolated image with the same resolution as the panchromatic image, subtracts the interpolated multispectral image from the band-superposed panchromatic image and applies Gaussian filtering to obtain a difference trend map; an unsupervised fusion model with spectral and spatial fidelity discriminators is constructed to obtain a fusion model with high-fidelity fusion capability; the fusion model then automatically processes the panchromatic and multispectral images to obtain a high-fidelity fused image. The fusion model obtained by the fusion method is suitable for processing remote sensing images at the original resolution, effectively solves the common problem of detail distortion in existing deep-learning-based fusion methods, and has the advantages of high running speed and good fusion effect.

Description

Panchromatic and multispectral image fusion method and device and application thereof
Technical Field
The invention relates to the technical field of satellite remote sensing image fusion, in particular to a method and a device for fusing panchromatic and multispectral images and application thereof.
Background
With the rapid development of satellite remote sensing technology, more and more multi-sensor satellites have been launched, such as IKONOS, QuickBird, ZiYuan and WorldView-2, and satellite remote sensing has become an important index for measuring the comprehensive strength of a country. These satellites generate a large amount of high-resolution, multi-channel remote sensing data that can provide information services and decision support for important fields such as modern agriculture and ecological construction.
In the practical application of satellite remote sensing images, the remote sensing images acquired by satellites are often expected to have both high spectral and high spatial resolution. Owing to the imaging principle and software and hardware limitations, a single sensor can hardly reflect the ground truth completely, so the spatial detail information of the panchromatic image and the spectral information of the multispectral image need to be combined through a fusion operation to obtain a high-quality, high-resolution multispectral image. A fused panchromatic and multispectral image must meet two requirements: the spectral information of the fused image must be consistent with the multispectral image, i.e. spectral fidelity, and the spatial information of the fused image must be consistent with the panchromatic image, i.e. spatial detail fidelity.
To date, scholars have proposed numerous panchromatic and multispectral image fusion algorithms and built a mature fusion algorithm system. Classical component-substitution methods first obtain a replacement component and a retained component through matrix transformation, match the panchromatic image with the replacement component through a certain fusion rule, and finally obtain the fused image through inverse matrix transformation. The advantages of this type of method are high detail fidelity and fast computation, but spectral distortion remains a problem. A large number of scholars are now focusing on fusion algorithms based on deep learning, such as PNN, PanNet and PSGAN. Benefiting from the excellent nonlinear fitting ability of deep neural networks, these algorithms achieve a stable fusion effect and perform outstandingly in terms of spectral fidelity, but their spatial detail information is not fully preserved, so the spatial fidelity of the fusion result is low. Researchers have long worked on improving network models and increasing the spatial detail information in the fusion results; however, two problems remain unsolved: 1) most existing deep-learning-based fusion methods down-sample the multispectral and panchromatic images and then use the original multispectral image as the true-value image, so the parameters obtained by training cannot generate remote sensing images at the original high resolution; 2) to improve spatial detail information, existing deep-learning-based methods directly add the up-sampled multispectral image to the detail information output by the network to obtain the final fusion result; noise is easily introduced by the up-sampling operation and is difficult to remove by optimising network parameters, so local spatial detail distortion can appear in the fused image.
Therefore, research on an unsupervised, high-fidelity panchromatic and multispectral image fusion method based on deep learning, which reduces the spectral distortion and detail distortion of current fusion methods, is of great significance.
Disclosure of Invention
In view of this, the invention provides a method and a device for fusing panchromatic and multispectral images and application thereof, wherein the fusion method preprocesses input panchromatic and multispectral images through operations such as matrix transformation, image smoothing and filtering and the like, and utilizes a spectral fidelity discriminator and a spatial fidelity discriminator to jointly constrain the fused images so as to realize high-fidelity panchromatic and multispectral image fusion.
In order to achieve the purpose, the invention adopts the following technical scheme:
a panchromatic and multispectral image fusion method comprises the following steps:
interpolating the multispectral image MS to be fused into an interpolated image MS_i with the same resolution as the panchromatic image;
superposing the panchromatic image P to be fused, by band superposition, into a band-superposed image P_i with the same number of bands as the multispectral image;
obtaining the difference image between the P_i and the MS_i and carrying out Gaussian filtering on the difference image to obtain a difference trend image I_S;
Constructing and generating a countermeasure network as a panchromatic and multispectral fusion model and training the panchromatic and multispectral fusion model, wherein the panchromatic and multispectral fusion model comprises a generator, a spectral fidelity discriminator and a spatial fidelity discriminator;
inputting the I_S into the trained panchromatic and multispectral fusion model, outputting a feature mapping map by the generator, and acquiring the difference image between the P_i and the feature mapping map as the fused image;
after down-sampling, the fused image and the MS are simultaneously input into the spectral fidelity discriminator, and whether the spectral information of the fused image is fidelity is judged;
the fusion image is weighted according to the ratio and then is simultaneously input into the space fidelity discriminator together with the P, and whether the space detail information of the fusion image is fidelity or not is judged;
if both are judged to be fidelity, outputting the final fused image result; if the spectral information and/or the spatial detail information are not fidelity, back-propagating the gradient and updating the parameters in the generator, the spectral fidelity discriminator and the spatial fidelity discriminator until the judgment results of both the spectral fidelity discriminator and the spatial fidelity discriminator are fidelity.
Preferably, the I_S is:
I_S = (P_i - MS_i) * G * G^T
where * represents a filtering operation, G represents a one-dimensional Gaussian filter, and G^T is the transpose of G; by applying horizontal and vertical Gaussian filters, the smoothed result of the image can be obtained quickly.
Preferably, the generator comprises an encoder, a decoder and a jump connection structure;
the encoder comprises at least two residual learning modules, adjacent residual learning modules being connected through a convolutional layer, and is used for extracting image features;
the decoder comprises at least one up-sampling module, each up-sampling module comprises two convolutions and an activation function, the first convolution in each up-sampling module is used for up-sampling the feature map, and gradually restoring the feature map to the same spatial resolution as the input image;
the skip connection structure is used for connecting the residual error learning module and the up-sampling module in the same layer, and is used for superposing feature maps with the same size in the encoder and the decoder.
Preferably, the generator further comprises a double attention model connected behind the residual error learning module of each layer for calibrating the feature map extracted by the residual error learning module;
the dual attention model includes a channel attention module and a spatial attention module;
the channel attention module sequentially comprises a mean pooling layer, two fully connected layers and an activation function; the mean pooling layer extracts the global statistical feature of each input feature map and outputs it to the fully connected layers, and the activation function is arranged between the two fully connected layers to generate the channel features;
the space attention module comprises a first branch and a second branch, the first branch sequentially comprises a convolution layer, a mean pooling layer and an activation function, the second branch sequentially comprises a convolution layer, a maximum pooling layer and an activation function, the first branch and the second branch respectively extract two-dimensional statistical characteristics of an input characteristic diagram, and then the first branch and the second branch are fused to obtain space attention information;
and fusing the channel characteristics acquired by the channel attention module and the space attention information acquired by the space attention module to obtain a corrected output characteristic diagram.
Preferably, the specific process of fusing the channel characteristics and the spatial attention information is as follows:
after the channel characteristics and the space attention information are subjected to scale adjustment and combination, a corrected characteristic diagram is obtained through activation of an activation function, the corrected characteristic diagram is multiplied by the input characteristic diagram, and the obtained result is added to the input characteristic diagram, so that a corrected output characteristic diagram is obtained;
setting the input feature map and the output feature map of the dual attention model as H_input and H_output respectively, and the feature maps extracted by the spatial attention and the channel attention as H_spatial and H_channel respectively, and letting ⊕ and ⊗ represent element-wise addition and element-wise multiplication respectively, the output of the module is:
H_output = H_input ⊕ (H_input ⊗ A), where A is the corrected attention map obtained by scale-adjusting, combining and activating H_spatial and H_channel as described above.
preferably, the residual error learning module comprises a main trunk and a secondary trunk, the main trunk comprises two convolution layers, an activation function is connected behind each convolution layer, the secondary trunk directly outputs the input characteristic diagram, and the characteristic diagram input into the residual error learning module is subjected to element addition after passing through the main trunk and the secondary trunk respectively.
Preferably, the spatial fidelity discriminator and the spectral fidelity discriminator sequentially comprise a convolutional layer, a pooling layer and an activation function, at the output of the network, the activation function maps the value between 0 and 1, and the value represents the probability value that the images received by the spatial fidelity discriminator and the spectral fidelity discriminator are real images.
A full color and multispectral image fusion device comprises an image acquisition device, a memory and a processor;
the image acquisition equipment is used for acquiring panchromatic and multispectral images to be fused;
the memory to store computer instructions;
the processor is connected to the image acquisition device and the memory, respectively, and is configured to perform image fusion by executing computer instructions to implement a panchromatic and multispectral image fusion method according to any one of claims 1 to 7.
A computer readable storage medium storing computer instructions which, when executed by a processor, implement a panchromatic and multispectral image fusion method as claimed in any one of claims 1-7.
Compared with the prior art, the invention discloses and provides a method and a device for fusing panchromatic and multispectral images and application thereof, and the method and the device have the following beneficial effects:
(1) The image fusion method is easy to realize, the visual effect of the fused image is good, and the method is suitable for fusing panchromatic and multispectral images acquired by different satellites;
(2) According to the method, the texture detail information in the full-color image and the spectral characteristics in the multispectral image are fully utilized, and the supervision of a true value image is not needed in the training process;
(3) The method and the device avoid directly adding the interpolated (up-sampled) multispectral image to the network model output, reducing the influence of noise introduced by the interpolation algorithm on the final fusion result.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a schematic flowchart of a method for constructing a panchromatic and multispectral fusion model according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a dual-arbiter fusion model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an overall model of a generator according to a first embodiment of the present invention;
FIG. 4 is a schematic diagram of a dual attention model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a residual learning model according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an arbiter model according to an embodiment of the present invention;
fig. 7 is a flowchart illustrating a fusion method of panchromatic and multispectral images according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of a fusion device for panchromatic and multispectral images according to a second embodiment of the present invention.
FIG. 9 is a comparison of the experimental results of the fusion method of the present invention and the prior typical fusion method on the Quickbird satellite image;
wherein (a) is the multispectral image, (b) is the panchromatic image, (c) is the fusion result of the PNN model, (d) is the fusion result of the PanNet model, (e) is the fusion result of the ENVI-GS transformation method, and (f) is the fusion result of the method of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
The first embodiment is as follows:
the embodiment discloses a panchromatic and multispectral image fusion method, as shown in fig. 1, fig. 1 is a flow chart of a panchromatic and multispectral fusion model construction method in the panchromatic and multispectral image fusion method, table 1 is the description of the symbols of the part, and the panchromatic and multispectral fusion model construction method specifically includes the following steps:
Interpolating the multispectral image MS into an interpolated image MS_i with the same resolution as the panchromatic image.
TABLE 1 Description of the symbols in this section
MS: original multispectral image; MS_i: interpolated multispectral image; P: original panchromatic image; P_i: band-superposed panchromatic image; I_d: difference image; I_S: difference trend image; G: one-dimensional Gaussian filter; F: fused image.
In this step, bilinear interpolation is selected as the interpolation algorithm. The resolution of the multispectral image is lower than that of the panchromatic image, so in image fusion an interpolation algorithm is used to interpolate the multispectral image to the same resolution as the panchromatic image. Assuming that the pixel I(i+u, j+v) is an interpolation point among the pixels {I(i, j), I(i+1, j), I(i, j+1), I(i+1, j+1)}, where 0 ≤ u < 1, 0 ≤ v < 1 and i, j are the row and column coordinates of the pixel, the interpolation is calculated as follows:
I(i+u, j+v) = (1-u)(1-v)I(i, j) + uvI(i+1, j+1) + v(1-u)I(i, j+1) + u(1-v)I(i+1, j)
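A minimal sketch of this bilinear interpolation step, assuming the multispectral image is held as an (H, W, B) array and the scale factor is an integer, could look as follows (an optimised resize routine would normally be used in practice):

```python
import numpy as np

def bilinear_upsample(ms, scale):
    """Upsample an (H, W, B) multispectral array by an integer factor using the
    bilinear formula above. Minimal illustration; a real pipeline would normally
    call an optimised resize routine instead."""
    h, w, b = ms.shape
    out = np.zeros((h * scale, w * scale, b), dtype=np.float64)
    for r in range(h * scale):
        for c in range(w * scale):
            y, x = r / scale, c / scale
            i, j = min(int(y), h - 2), min(int(x), w - 2)   # top-left neighbour
            u, v = y - i, x - j                             # fractional offsets
            out[r, c] = ((1 - u) * (1 - v) * ms[i, j]
                         + u * v * ms[i + 1, j + 1]
                         + v * (1 - u) * ms[i, j + 1]
                         + u * (1 - v) * ms[i + 1, j])
    return out
```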
Superposing the panchromatic image P into an image P_i with the same number of bands as the multispectral image.
The panchromatic image contains only one band, so band superposition must be performed on the panchromatic image before the difference information between the panchromatic and multispectral images is obtained. If n is the number of bands of the multispectral image, P_i can be expressed as:
P_i = {P_1, P_2, ..., P_n}
P_i and MS_i are differenced to obtain the difference image I_d, and a Gaussian filtering operation is performed on it to obtain the difference trend image I_S. Because the band-superposed image P_i contains a large amount of detail texture information, subtracting MS_i from P_i and applying Gaussian filtering yields the low-frequency information, i.e. the basic colour information in the remote sensing image. The difference trend image I_S is expressed as:
I_S = (P_i - MS_i) * G * G^T
where * represents a filtering operation, G represents a one-dimensional Gaussian filter, and G^T is the transpose of G; by applying horizontal and vertical Gaussian filters, the smoothed result of the image can be obtained quickly.
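The band superposition, differencing and separable Gaussian filtering can be sketched as follows; the kernel width and standard deviation are illustrative assumptions, and a one-dimensional convolution along each axis stands in for the horizontal and vertical filters G and G^T:

```python
import numpy as np
from scipy.ndimage import convolve1d

def gaussian_kernel_1d(sigma=1.0, radius=3):
    """Normalised one-dimensional Gaussian kernel (sigma and radius are assumptions)."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    g = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def difference_trend(pan, ms_i, sigma=1.0):
    """Compute I_S = (P_i - MS_i) * G * G^T.
    pan:  (H, W) panchromatic image; ms_i: (H, W, B) interpolated multispectral image.
    The single-band pan is replicated along the band axis to form P_i, the difference
    is taken, and a separable (horizontal then vertical) Gaussian filter keeps only
    the low-frequency colour trend."""
    p_i = np.repeat(pan[..., None], ms_i.shape[-1], axis=-1)   # band superposition
    diff = p_i - ms_i                                          # difference image I_d
    g = gaussian_kernel_1d(sigma)
    i_s = convolve1d(diff, g, axis=1, mode='reflect')          # horizontal filter (G)
    i_s = convolve1d(i_s, g, axis=0, mode='reflect')           # vertical filter (G^T)
    return i_s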
A generative adversarial network with dual discriminators is constructed as the panchromatic and multispectral fusion model.
Considering that the parameters obtained by the downsampling panchromatic and multispectral image training are difficult to generate the remote sensing image under the high resolution, the embodiment does not directly take the panchromatic image under the original resolution as the true value image of the network model, but indirectly designs the panchromatic image and the multispectral image as the true value image for constraining the texture detail information and the spectral information in the fusion image.
Fig. 2 is a panchromatic and multispectral fusion model with dual discriminators in the embodiment. The network is composed of a generator, a spectral fidelity discriminator and a spatial fidelity discriminator. The output of the generator not only has the same spatial resolution as the panchromatic image, but also has the same number of bands as the multispectral image.
As can be seen from fig. 2, the difference trend image is first input into the generator provided in this embodiment, and a fusion result of the panchromatic image and the multispectral image is generated. The fusion result is down sampled and input into the spectral fidelity discriminator together with the original resolution multispectral to judge whether the data is real data, and parameters in the generator and the spectral fidelity discriminator are updated by comparing whether the judgment result is correct or not and reversely transmitting the gradient, so that the generator is forced to generate an image with spectral information consistent with the low resolution full-color image, and the spectral fidelity discriminator is forced to more accurately judge the difference between the generated image and the real data.
In addition, the high-resolution full-color image is weighted in a ratio, the weighted image and the full-color image are input into the space-spectrum fidelity discriminator together, the generator is forced to generate an image with space detail information consistent with the full-color image, the capability of the space-fidelity discriminator for distinguishing true from false is improved, when the generator can find balance between the space information and the spectrum information, a fused image with both the space-spectrum information and the spectrum information is generated, the two discriminators are difficult to distinguish true from false, and a network achieves an optimal model.
Fig. 3 shows the generator model proposed by the present embodiment. The model is a U-shaped structure integrally, and comprises an encoder, a decoder and a jump connection structure connecting the two parts. The encoder is composed of four residual learning modules connected in series, and a convolution layer with the step length of 2 is connected behind each residual learning module to extract image features, and the calculation amount is reduced. The decoder is made up of four upsampling modules, each consisting of two convolutions plus an activation function, the first convolution in each upsampling module being used to upsample the feature map, the purpose of this operation being to gradually restore the feature map to the same spatial resolution as the input image.
In addition, the generator also comprises three jump connection structures, the feature maps with the same size in the encoder and the decoder are superposed, and the structure well reserves the information in a shallow network, namely the encoder, and reduces the parameter quantity needing to be trained.
Meanwhile, the present embodiment adds a dual attention model after the residual learning module to calibrate the extracted feature map, where the dual attention model includes a channel attention module and a spatial attention module.
As shown in fig. 4, the channel attention model is composed of a mean pooling layer, two full-connected layers and an activation function, and the model first uses the mean pooling layer to extract global statistical features of each input feature map and then uses two FC layers to generate channel features.
The space attention module is composed of two branches, the first branch comprises convolution, mean pooling and an activation function, the second branch comprises convolution, maximum pooling and an activation function, the two branches extract statistical characteristics of two dimensions of an input characteristic diagram, and then the two branches are fused to obtain space attention information. Further, the channel features obtained by the channel attention module are scaled and combined with the spatial information obtained by the spatial attention module. And finally, activating the combined characteristic diagram through an activation function to obtain a corrected characteristic diagram. And multiplying the corrected characteristic diagram by the input characteristic diagram, and adding the obtained result to the input characteristic diagram to obtain the corrected characteristic diagram.
Assume the input and output of the dual attention module are H_input and H_output, and the feature maps extracted by the spatial attention and the channel attention are H_spatial and H_channel respectively. Letting ⊕ and ⊗ represent element-wise addition and element-wise multiplication respectively, the output of the module is:
H_output = H_input ⊕ (H_input ⊗ A), where A is the corrected attention map obtained by scale-adjusting, combining and activating H_spatial and H_channel as described above.
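A possible PyTorch sketch of such a dual attention block is shown below; the kernel sizes, the reduction ratio and the exact ordering of convolution and pooling inside the spatial branches are assumptions:

```python
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    """Sketch of the dual attention block: channel attention (global mean pooling and
    two FC-style 1x1 convolutions) plus spatial attention (two branches built on
    channel-wise mean / max maps), fused, passed through a sigmoid, multiplied with
    the input and added back to it."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1))
        self.branch_mean = nn.Conv2d(1, 1, kernel_size=7, padding=3)
        self.branch_max = nn.Conv2d(1, 1, kernel_size=7, padding=3)

    def forward(self, x):
        h_channel = self.channel(x)                                        # (N, C, 1, 1)
        h_spatial = (self.branch_mean(x.mean(dim=1, keepdim=True)) +
                     self.branch_max(x.max(dim=1, keepdim=True).values))   # (N, 1, H, W)
        attention = torch.sigmoid(h_channel + h_spatial)                   # broadcast-combined map A
        return x + x * attention                                           # H_out = H_in + H_in * A
```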
Fig. 5 shows the residual learning module provided in this embodiment. As can be seen from fig. 5, the main trunk is the feature map obtained by two convolutions each followed by an activation function, and the secondary trunk is the input feature map. The residual learning module outputs the result of element-wise adding the main-trunk and secondary-trunk feature maps and applying an activation function. By learning the residual instead of the original image features, the module makes the network parameters easier to optimise, alleviates the vanishing-gradient problem, accelerates model training and improves the training effect.
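A corresponding residual block sketch, with assumed channel count and 3x3 kernels, could be:

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Sketch of the residual learning module: a main trunk of two 3x3 convolutions,
    each followed by an activation, added element-wise to the identity (secondary)
    trunk and activated again."""
    def __init__(self, channels):
        super().__init__()
        self.main = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.main(x) + x)   # F(x) + x, followed by activation
```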
Fig. 6 shows the discriminator model proposed in the present embodiment. The space fidelity discriminator and the spectrum fidelity discriminator respectively correspond to two important rules of remote sensing image fusion, namely (1) the fusion needs to meet the spectrum fidelity, namely the integral structure information in the fusion image is equal to the multispectral up-sampling image everywhere. (2) Fusion needs to satisfy detail fidelity, i.e., the spatial detail of the fused image is everywhere equal to the spatial detail information of the panchromatic image.
Both discriminators are classifier models whose output is a number between 0 and 1 representing the probability that the image received by the discriminator is a real image. Each discriminator is composed of convolution, pooling and activation functions, and the activation at the output of the network maps the value to between 0 and 1.
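A compact sketch of such a discriminator is given below; the depth, channel widths and pooling choice are assumptions, and only the convolution-pooling-activation pattern and the sigmoid output follow the description above:

```python
import torch.nn as nn

class FidelityDiscriminator(nn.Module):
    """Sketch of a spectral / spatial fidelity discriminator: stacked convolution,
    pooling and activation stages, ending in a sigmoid that maps the output to (0, 1),
    i.e. the probability that the received image is real."""
    def __init__(self, in_bands=4, base=32, depth=3):
        super().__init__()
        layers, cin = [], in_bands
        for d in range(depth):
            layers += [nn.Conv2d(cin, base * 2 ** d, 3, padding=1),
                       nn.AvgPool2d(2),
                       nn.LeakyReLU(0.2, inplace=True)]
            cin = base * 2 ** d
        self.features = nn.Sequential(*layers)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(cin, 1, 1),
                                  nn.Sigmoid())

    def forward(self, x):
        return self.head(self.features(x)).flatten(1)   # (N, 1) probability of being real
```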
The feature mapping image output by the generator of the generative adversarial network is then subtracted from the band-superposed image to obtain the fusion result.
The overall structure of the generator proposed in this embodiment is similar to that of the residual error learning network, and the final output H (x) = x + F (x) is obtained by adding the model input image x and the output F (x) of the model, and the advantage of residual error learning is that the model learning part is only a smaller residual error F (x) = H (x) -x, which is relatively easier to train. Adding the upsampled multi-spectra directly to the network output introduces noise introduced by the interpolation algorithm.
For this purpose, the embodiment subtracts the band-overlapping image from the output of the generator (feature mapping image) to obtain a fused image, and assuming that W represents a relevant parameter in the generation countermeasure network and Gen (g) represents the image output by the generator, the fused image F can be represented as:
F=P i -Gen((P i -MS i )*G*G T ,W)
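Expressed as a short sketch (tensors assumed to be (N, B, H, W); gaussian_smooth stands for the separable Gaussian filter shown earlier), the fusion step is simply a subtraction from the band-superposed panchromatic image:

```python
import torch

def fuse(generator, pan_stacked, ms_interp, gaussian_smooth):
    """Sketch of F = P_i - Gen((P_i - MS_i) * G * G^T, W).
    pan_stacked: band-superposed panchromatic tensor P_i; ms_interp: interpolated
    multispectral tensor MS_i; gaussian_smooth: assumed helper applying the
    separable Gaussian filter."""
    i_s = gaussian_smooth(pan_stacked - ms_interp)   # difference trend image I_S
    with torch.no_grad():
        detail_map = generator(i_s)                  # feature mapping output Gen(.)
    return pan_stacked - detail_map                  # fused image F
```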
I_S is used as training data and input into the neural network for training, obtaining a fusion model that fuses panchromatic and multispectral characteristics and generates a high-resolution multispectral image.
First, the difference trend map is input into the generator to obtain a preliminary fusion result. Then the down-sampled fused image and the original multispectral image are input into the spectral fidelity discriminator, the loss functions of the generator and the discriminator are calculated from the discrimination results, and the parameters related to spectral information in each module of the generator and the parameters in the discriminator are modified by gradient back-propagation. Finally, the ratio-weighted fused image is input together with the panchromatic image into the spatial fidelity discriminator, the loss functions of the generator and the discriminator are calculated, and the relevant parameters in the generator and the discriminator are corrected by gradient back-propagation.
The loss function is an important tool for optimising network parameters, and its quality directly determines the quality of the trained network model. The generator loss (L_G) proposed in this embodiment is composed of the spectral loss (L_spectral), the spatial loss (L_spatial) and the total variation loss (L_tv):
L_G = L_spectral + L_spatial + L_tv
The spectral loss is composed of a reconstruction loss (L_re1) and an adversarial loss (L_adv1). The reconstruction loss is an L1-norm loss describing the difference between the generated image and the ideal image. The adversarial loss is the loss specific to generative adversarial networks, used to evaluate whether the discriminator judges the output as real; considering that images generated under the conventional adversarial loss tend to be of low quality, the least-squares loss (LSGAN) is used as the adversarial loss function. The specific formulas are:
L_spectral = α_1 L_re1 + β_1 L_adv1
L_re1 = (1/B) Σ_{k=1..B} ||↓F_k - MS_k||_1
L_adv1 = (1/B) Σ_{k=1..B} (D_spectral(↓F_k) - a)^2
where B is the number of bands of the multispectral image, D_spectral denotes the spectral fidelity discriminator, D_spectral(↓F_k) denotes the discrimination result obtained by inputting the down-sampled fusion result into the spectral fidelity discriminator, and a denotes the label that the generator wants the discriminator to output when discriminating the fake data. α_1 and β_1 represent the weights of the reconstruction loss and the adversarial loss; the model converges fastest when α_1 and β_1 take 100 and 1 respectively. Similarly, the spatial loss is also composed of a reconstruction loss (L_re2) and an adversarial loss (L_adv2):
L_spatial = α_2 L_re2 + β_2 L_adv2
L_re2 = ||Σ_{k=1..B} α_k F_k - P||_1
L_adv2 = (D_spatial(Σ_{k=1..B} α_k F_k) - b)^2
where b is the label that the generator wants the discriminator to output when discriminating the fake data, D_spatial denotes the spatial fidelity discriminator, and likewise D_spatial(Σ_k α_k F_k) denotes the discrimination output obtained by inputting the band-merged (ratio-weighted) fused image into the spatial fidelity discriminator; α_2 and β_2 represent the weights of the reconstruction loss and the adversarial loss. α_k denotes the ratio coefficient of each multispectral band obtained by least squares, calculated as:
{α_k} = argmin_α ||P - Σ_{k=1..B} α_k MS_i,k||_2^2
The total variation loss is defined by the following formula; this constraint term effectively alleviates blurring in the fused image:
L_tv = Σ_{i,j} (|F_{i+1,j} - F_{i,j}| + |F_{i,j+1} - F_{i,j}|)
For the loss function of the discriminator, the space spectrum countermeasure loss and the spectrum countermeasure loss are two parts, the optimization target of the discriminator is just opposite to the optimization target of the generator, and if the generator aims to ensure that the discriminator cannot discriminate true and false data, the discriminator aims to discriminate the true and false data as correctly as possible. The present embodiment proposes the discrimination loss of the model as shown in the following formula:
L D =L D-spectral +L D-spatial
L_D-spectral = (D_spectral(MS) - c_1)^2 + (D_spectral(↓F) - c_2)^2
L_D-spatial = (D_spatial(P) - d_1)^2 + (D_spatial(Σ_{k=1..B} α_k F_k) - d_2)^2
where c_1 and c_2 are the labels of the real image sample MS and the fake sample ↓F respectively, and d_1 and d_2 are the labels of the real sample P and the fake (ratio-weighted) sample respectively. In this embodiment, the labels c_1 and d_1 of the real data take the value 1, and the labels c_2 and d_2 of the fake data take the value 0. Since the purpose of the generator is to make the discriminator judge the fake data as real, a and b in the above equations both take the value 1.
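The loss terms described above can be sketched as follows; only α_1 = 100 and β_1 = 1 are stated in the text, so the remaining weights and the exact per-band reductions are assumptions, and the least-squares (LSGAN) form is used for all adversarial terms:

```python
import torch.nn.functional as F

def tv_loss(x):
    """Total variation term L_tv: penalises abrupt changes to reduce blurring."""
    return ((x[..., 1:, :] - x[..., :-1, :]).abs().mean() +
            (x[..., :, 1:] - x[..., :, :-1]).abs().mean())

def generator_loss(fused, fused_ds, ms, fused_weighted, pan,
                   d_spec_fake, d_spat_fake,
                   a1=100.0, b1=1.0, a2=100.0, b2=1.0, a=1.0, b=1.0):
    """Sketch of L_G = L_spectral + L_spatial + L_tv with L1 reconstruction terms and
    least-squares adversarial terms.
      fused          : full-resolution fusion result
      fused_ds, ms   : down-sampled fusion result and original multispectral image
      fused_weighted : band-ratio-weighted fusion result, compared with the pan image
      d_*_fake       : discriminator outputs for the generated data"""
    l_spectral = a1 * F.l1_loss(fused_ds, ms) + b1 * ((d_spec_fake - a) ** 2).mean()
    l_spatial = a2 * F.l1_loss(fused_weighted, pan) + b2 * ((d_spat_fake - b) ** 2).mean()
    return l_spectral + l_spatial + tv_loss(fused)

def discriminator_loss(d_real, d_fake, real_label=1.0, fake_label=0.0):
    """Least-squares loss for either discriminator: real labelled 1, fake labelled 0."""
    return ((d_real - real_label) ** 2).mean() + ((d_fake - fake_label) ** 2).mean()
```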
In the process of network model training, the generator and the discriminator adopt an alternate learning strategy for optimization, namely in the process of optimizing the parameters of the discriminator, the related parameters of the generator need to be frozen and do not participate in parameter optimization, and vice versa. Through repeated alternate learning with a certain number of rounds, the losses of the generator and the discriminator are reduced to be within a certain threshold value, and the network completes training to obtain the network parameters with high-quality fusion capability.
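One alternating optimisation step could then be sketched as below, reusing the loss sketches above; fuse_fn, downsample and weight_bands are assumed helpers for the fusion, down-sampling and band-ratio weighting operations:

```python
def train_step(gen, d_spec, d_spat, opt_g, opt_d, batch, fuse_fn, downsample, weight_bands):
    """One alternating step: while the discriminators are updated the generator output
    is detached (frozen), and while the generator is updated the discriminator
    parameters are not stepped. batch = (P_i, MS_i, MS, P) tensors."""
    pan_stacked, ms_interp, ms, pan = batch

    # --- discriminator update (generator output detached) ---
    fused = fuse_fn(gen, pan_stacked, ms_interp)
    loss_d = (discriminator_loss(d_spec(ms), d_spec(downsample(fused).detach())) +
              discriminator_loss(d_spat(pan), d_spat(weight_bands(fused).detach())))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # --- generator update (discriminator parameters left untouched) ---
    fused = fuse_fn(gen, pan_stacked, ms_interp)
    loss_g = generator_loss(fused, downsample(fused), ms, weight_bands(fused), pan,
                            d_spec(downsample(fused)), d_spat(weight_bands(fused)))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_g.item(), loss_d.item()
```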
As shown in fig. 7, the fusion method includes the following steps: acquiring panchromatic and multispectral images to be fused, and acquiring a difference trend graph through preprocessing;
and processing the panchromatic image and the multispectral image by using the trained panchromatic and multispectral image fusion model, and automatically fusing the panchromatic image and the multispectral image.
The trained network model of the embodiment has the optimal combination of parameters. The difference trend map is input into the network model, which processes the image with the trained parameters, performs feature extraction and finally outputs the fusion result. This process consumes very little time, so rapid fusion of massive remote sensing images can be achieved.
Further, experiments prove that the image fusion method in the embodiment is also suitable for the fusion of the full-color image and the hyperspectral image and the fusion operation of the image shot by common imaging equipment such as a digital camera, and the obtained beneficial effects are similar.
The second embodiment:
The embodiment discloses a panchromatic and multispectral image fusion device, which can be realised by software and/or hardware. As shown in fig. 8, the apparatus 300 includes a remote sensing image acquisition device 301, a memory 302 and a processor 303, where the image acquisition device 301, the memory 302 and the processor 303 may be connected by a bus or in another manner.
The image acquisition device 301 is configured to acquire the remote sensing images to be fused and send them to the processor 303.
The Processor 303 may be a Central Processing Unit (CPU), or other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, or any combination thereof.
The memory 302 is a non-transitory computer readable storage medium and can be used to store non-transitory software programs, non-transitory computer executable programs and modules, such as the programs or instructions corresponding to the remote sensing image fusion model construction method or the remote sensing image fusion method of the embodiments of the present invention. The processor 303 executes the various functional applications and data processing of the processor by running the non-transitory software programs or instructions stored in the memory 302, that is, implements the remote sensing image fusion model construction method of the above method embodiment.
The memory 302 may include a program storage area and a data storage area, where the program storage area may store an operating system and the application program required for at least one function, and the data storage area may store data created by the processor 303, and the like. Further, the memory 302 may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some embodiments, the memory 302 optionally includes memory located remotely from the processor 303, and such remote memory may be connected to the processor 303 over a network. The network includes, but is not limited to, the internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The fusion method disclosed by the invention is further explained according to the practical application condition as follows:
The method of the invention is compared with one of the best fusion methods in mainstream remote sensing image processing software, namely the Gram-Schmidt transformation fusion method of ENVI software (ENVI-GS method for short), with a deep-learning-based fusion method, namely "Masi G, Cozzolino D, Verdoliva L, et al. CNN-based pansharpening of multi-resolution remote-sensing images [C]// Urban Remote Sensing Event. IEEE, 2017" (PNN for short), and with one of the latest fusion methods, namely "Yang J, Fu X, Hu Y, et al. PanNet: A Deep Network Architecture for Pan-Sharpening [C]// 2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017" (PanNet for short). The experimental data are panchromatic and multispectral images captured by the QuickBird satellite, 10 scenes in total. The panchromatic images average about 12000 × 12000 pixels per scene, and the multispectral images about 3000 × 3000 pixels.
FIG. 9 shows panchromatic and multispectral images of the QuickBird satellite and the fused images. Because the experimental images are large, fig. 9 shows only a local area of each image so that the ground features can be seen clearly. FIGS. 9 (a) and (b) show the multispectral and panchromatic images respectively, FIGS. 9 (c) and (d) show the fusion results of the PNN model and the PanNet model, and FIG. 9 (e) shows the result of the ENVI-GS fusion method. In terms of subjective visual effect, the deep-learning-based comparison methods (see FIGS. 9 (c) and (d)) show serious detail distortion, the traditional method (see FIG. 9 (e)) suffers from spectral distortion, and the method of the invention (see FIG. 9 (f)) achieves better spectral and spatial detail fidelity.
The present embodiment uses no-reference evaluation parameters to measure the quality of the fused image. The no-reference quality index (QNR) comprehensively considers the spectral fidelity (D_λ) and the detail fidelity (D_S); the calculation formulas are as follows:
D_λ = (1/(B(B-1)) Σ_{b=1..B} Σ_{r=1..B, r≠b} |Q(F_b, F_r) - Q(MS_b, MS_r)|^p)^{1/p}
D_S = ((1/B) Σ_{b=1..B} |Q(F_b, P) - Q(MS_b, P↓)|^q)^{1/q}
QNR = (1 - D_λ)^α (1 - D_S)^β
where Q(·,·) denotes the universal image quality index computed between two single-band images.
Generally, α, β, p and q are all taken as 1, and P↓ is the single-band image obtained by down-sampling the original panchromatic image to the same resolution as the original multispectral image. D_λ compares the differences between the bands of the fused image and of the original multispectral image: the smaller its value, the better the spectral characteristics are preserved. D_S compares the spatial structure of the fused image with that of the panchromatic image: the smaller its value, the closer each band of the fused image is to the panchromatic image. The ideal value of D_λ and D_S is 0, and the ideal value of QNR is 1.
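A sketch of the QNR computation under these standard definitions is given below; Q is computed globally here rather than over sliding windows, which is a simplification:

```python
import numpy as np

def q_index(a, b):
    """Universal image quality index Q between two single-band images (global form)."""
    a, b = a.astype(np.float64).ravel(), b.astype(np.float64).ravel()
    ma, mb = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - ma) * (b - mb)).mean()
    return 4 * cov * ma * mb / ((va + vb) * (ma ** 2 + mb ** 2) + 1e-12)

def qnr(fused, ms, pan, pan_ds, alpha=1, beta=1, p=1, q=1):
    """Compute QNR with its D_lambda and D_S terms.
    fused: (H, W, B) fusion result; ms: (h, w, B) original multispectral;
    pan: (H, W) panchromatic; pan_ds: (h, w) down-sampled panchromatic."""
    B = fused.shape[-1]
    d_lambda = 0.0
    for b1 in range(B):
        for b2 in range(B):
            if b1 != b2:
                d_lambda += abs(q_index(fused[..., b1], fused[..., b2]) -
                                q_index(ms[..., b1], ms[..., b2])) ** p
    d_lambda = (d_lambda / (B * (B - 1))) ** (1 / p)
    d_s = (sum(abs(q_index(fused[..., b], pan) - q_index(ms[..., b], pan_ds)) ** q
               for b in range(B)) / B) ** (1 / q)
    return (1 - d_lambda) ** alpha * (1 - d_s) ** beta, d_lambda, d_s
```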
The results of the objective evaluations of the method of the invention and the comparison methods are shown in Table 2.
TABLE 2 Objective evaluation of image fusion quality
Method                 D_λ      D_S      QNR
PNN                    0.0660   0.8910   0.8508
PanNet                 0.0691   0.0906   0.8466
ENVI-GS method         0.0868   0.2996   0.6396
Method of the invention  0.0651   0.0792   0.8609
Observing the objective indices in Table 2, it can be seen that both the spectral distortion and the detail distortion of the image fused by the method of the invention are smaller than those of the comparison methods. This demonstrates that the spectral and spatial detail fidelity of the method of the invention is superior to that of the comparison methods. The experimental results show that the fusion method of the invention achieves good spectral fidelity and detail fidelity and outperforms the comparison methods.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (9)

1. A panchromatic and multispectral image fusion method is characterized by comprising the following steps:
interpolating the multispectral image MS to be fused into an interpolated image MS_i with the same resolution as the panchromatic image;
superposing the panchromatic image P to be fused, by band superposition, into a band-superposed image P_i with the same number of bands as the multispectral image;
obtaining the difference image between the P_i and the MS_i and carrying out Gaussian filtering on the difference image to obtain a difference trend image I_S;
Constructing and generating a countermeasure network as a panchromatic and multispectral fusion model and training the panchromatic and multispectral fusion model, wherein the panchromatic and multispectral fusion model comprises a generator, a spectral fidelity discriminator and a spatial fidelity discriminator;
inputting the I_S into the trained panchromatic and multispectral fusion model, outputting a feature mapping map by the generator, and acquiring the difference image between the P_i and the feature mapping map as the fused image;
after down-sampling, the fused image and the MS are simultaneously input into the spectral fidelity discriminator, and whether the spectral information of the fused image is fidelity is judged;
the fusion image is weighted according to the ratio and then is simultaneously input into the space fidelity discriminator together with the P, and whether the space detail information of the fusion image is fidelity or not is judged;
if both are judged to be fidelity, outputting the final fused image result; if the spectral information and/or the spatial detail information are not fidelity, back-propagating the gradient and updating the parameters in the generator, the spectral fidelity discriminator and the spatial fidelity discriminator until the judgment results of both the spectral fidelity discriminator and the spatial fidelity discriminator are fidelity.
2. The panchromatic and multispectral image fusion method of claim 1, wherein the I_S is:
I_S = (P_i - MS_i) * G * G^T
where * represents a filtering operation, G represents a one-dimensional Gaussian filter, and G^T is the transpose of G; by applying horizontal and vertical Gaussian filters, the smoothed result of the image can be obtained quickly.
3. The panchromatic and multispectral image fusion method according to claim 1, wherein the generator includes an encoder, a decoder and a skip connection structure;
the encoder comprises at least two residual learning modules, adjacent residual learning modules being connected through a convolutional layer, and is used for extracting image features;
the decoder comprises at least one upsampling module, each upsampling module comprises two convolutions and an activation function, the first convolution in each upsampling module is used for upsampling the feature map, and gradually restoring the feature map to the same spatial resolution as the input image;
the skip connection structure is used for connecting the residual learning module and the upsampling module in the same layer, and is used for superposing feature maps with the same size in the encoder and the decoder.
4. The method of fusing panchromatic and multispectral images according to claim 3, wherein the generator further comprises a dual attention model connected behind each layer of the residual learning module for calibrating the feature maps extracted by the residual learning module;
the dual attention model includes a channel attention module and a spatial attention module;
the channel attention module sequentially comprises a mean pooling layer, two fully connected layers and an activation function; the mean pooling layer extracts the global statistical feature of each input feature map and outputs it to the fully connected layers, and the activation function is arranged between the two fully connected layers to generate the channel features;
the spatial attention module comprises a first branch and a second branch, the first branch sequentially comprises a convolution layer, a mean pooling layer and an activation function, the second branch sequentially comprises a convolution layer, a maximum pooling layer and an activation function, the first branch and the second branch respectively extract two-dimensional statistical characteristics of an input characteristic diagram, and then the first branch and the second branch are fused to obtain spatial attention information;
and fusing the channel characteristics acquired by the channel attention module and the space attention information acquired by the space attention module to obtain a corrected output characteristic diagram.
5. The fusion method of panchromatic and multispectral images as claimed in claim 4, wherein the fusion process of the channel characteristics and the spatial attention information comprises the following specific processes:
after the channel characteristics and the space attention information are subjected to scale adjustment and combination, a corrected characteristic diagram is obtained through activation of an activation function, the corrected characteristic diagram is multiplied by the input characteristic diagram, and the obtained result is added to the input characteristic diagram, so that an output characteristic diagram after correction is obtained;
setting the input feature map and the output feature map of the dual attention model as H_input and H_output respectively, and the feature maps extracted by the spatial attention and the channel attention as H_spatial and H_channel respectively, and letting ⊕ and ⊗ represent element-wise addition and element-wise multiplication respectively, the output of the module is:
H_output = H_input ⊕ (H_input ⊗ A), where A is the corrected attention map obtained by scale-adjusting, combining and activating H_spatial and H_channel as described above.
6. the fusion method of panchromatic and multispectral images according to claim 3, wherein the residual learning module comprises a main trunk and a secondary trunk, the main trunk comprises two convolution layers, an activation function is connected behind each convolution layer, the secondary trunk directly outputs the input feature map, and the feature map input into the residual learning module is subjected to element addition after passing through the main trunk and the secondary trunk respectively.
7. The method of fusing panchromatic and multispectral images as claimed in claim 3, wherein the spatial fidelity discriminator and the spectral fidelity discriminator comprise a convolutional layer, a pooling layer and an activation function in this order, and at the output of the network, the activation function maps a value between 0 and 1, the value representing the probability value that the images received by the spatial fidelity discriminator and the spectral fidelity discriminator are real images.
8. A panchromatic and multispectral image fusion device is characterized by comprising an image acquisition device, a memory and a processor;
the image acquisition equipment is used for acquiring panchromatic and multispectral images to be fused;
the memory to store computer instructions;
the processor is connected to the image acquisition device and the memory, respectively, and is configured to perform image fusion by executing computer instructions to implement a panchromatic and multispectral image fusion method according to any one of claims 1 to 7.
9. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement a panchromatic and multispectral image fusion method as recited in any one of claims 1-7.
CN202110989821.2A 2021-08-26 2021-08-26 Panchromatic and multispectral image fusion method and device and application thereof Active CN113763299B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110989821.2A CN113763299B (en) 2021-08-26 2021-08-26 Panchromatic and multispectral image fusion method and device and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110989821.2A CN113763299B (en) 2021-08-26 2021-08-26 Panchromatic and multispectral image fusion method and device and application thereof

Publications (2)

Publication Number Publication Date
CN113763299A CN113763299A (en) 2021-12-07
CN113763299B true CN113763299B (en) 2022-10-14

Family

ID=78791596

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110989821.2A Active CN113763299B (en) 2021-08-26 2021-08-26 Panchromatic and multispectral image fusion method and device and application thereof

Country Status (1)

Country Link
CN (1) CN113763299B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114240814B (en) * 2021-12-19 2024-02-23 西北工业大学 Multispectral and panchromatic remote sensing image fusion method based on pyramid modulation injection
CN114092834B (en) * 2022-01-24 2022-04-15 南京理工大学 Unsupervised hyperspectral image blind fusion method and system based on space-spectrum combined residual correction network
CN114677313B (en) * 2022-03-18 2024-09-10 重庆邮电大学 Remote sensing image spatial spectrum fusion method and system for generating multi-countermeasure network structure
CN115564692B (en) * 2022-09-07 2023-12-05 宁波大学 Full color-multispectral-hyperspectral integrated fusion method considering breadth difference
CN115471437B (en) * 2022-11-14 2023-03-10 中国测绘科学研究院 Image fusion method based on convolutional neural network and remote sensing image fusion method
CN117934649B (en) * 2024-01-23 2024-07-09 安徽省第一测绘院 Remote sensing data fusion method under multi-spectrum data missing
CN117726916B (en) * 2024-02-18 2024-04-19 电子科技大学 Implicit fusion method for enhancing image resolution fusion

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110533620B (en) * 2019-07-19 2021-09-10 西安电子科技大学 Hyperspectral and full-color image fusion method based on AAE extraction spatial features
CN112634137B (en) * 2020-12-28 2024-03-05 西安电子科技大学 Hyperspectral and panchromatic image fusion method for extracting multiscale spatial spectrum features based on AE
CN112819737B (en) * 2021-01-13 2023-04-07 西北大学 Remote sensing image fusion method of multi-scale attention depth convolution network based on 3D convolution

Also Published As

Publication number Publication date
CN113763299A (en) 2021-12-07

Similar Documents

Publication Publication Date Title
CN113763299B (en) Panchromatic and multispectral image fusion method and device and application thereof
CN110119780B (en) Hyper-spectral image super-resolution reconstruction method based on generation countermeasure network
CN112507997B (en) Face super-resolution system based on multi-scale convolution and receptive field feature fusion
CN110415199B (en) Multispectral remote sensing image fusion method and device based on residual learning
CN111080567B (en) Remote sensing image fusion method and system based on multi-scale dynamic convolutional neural network
CN109727207B (en) Hyperspectral image sharpening method based on spectrum prediction residual convolution neural network
Dong et al. Generative dual-adversarial network with spectral fidelity and spatial enhancement for hyperspectral pansharpening
CN107123089B (en) Remote sensing image super-resolution reconstruction method and system based on depth convolution network
Dong et al. Laplacian pyramid dense network for hyperspectral pansharpening
CN114119444B (en) Multi-source remote sensing image fusion method based on deep neural network
CN109509160A (en) Hierarchical remote sensing image fusion method utilizing layer-by-layer iteration super-resolution
Qu et al. A dual-branch detail extraction network for hyperspectral pansharpening
WO2024027095A1 (en) Hyperspectral imaging method and system based on double rgb image fusion, and medium
CN110544212B (en) Convolutional neural network hyperspectral image sharpening method based on hierarchical feature fusion
CN114862731B (en) Multi-hyperspectral image fusion method guided by low-rank priori and spatial spectrum information
CN111696043A (en) Hyperspectral image super-resolution reconstruction algorithm of three-dimensional FSRCNN
CN114511470B (en) Attention mechanism-based double-branch panchromatic sharpening method
CN113902646A (en) Remote sensing image pan-sharpening method based on depth layer feature weighted fusion network
CN117474781A (en) High spectrum and multispectral image fusion method based on attention mechanism
CN112446835A (en) Image recovery method, image recovery network training method, device and storage medium
CN115760814A (en) Remote sensing image fusion method and system based on double-coupling deep neural network
CN112767243A (en) Hyperspectral image super-resolution implementation method and system
CN115311184A (en) Remote sensing image fusion method and system based on semi-supervised deep neural network
CN111967516B (en) Pixel-by-pixel classification method, storage medium and classification equipment
Li et al. Local-Global Context-Aware Generative Dual-Region Adversarial Networks for Remote Sensing Scene Image Super-Resolution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant