Multispectral image fusion method based on interactive feature embedding
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a multispectral image fusion method based on interactive feature embedding.
Background
Multispectral image fusion integrates the image characteristics of the same scene captured by multispectral detectors, so as to describe scene information more comprehensively and accurately. Multispectral image fusion is a branch of the image fusion task and has wide application in many areas, such as scene monitoring [1], target recognition, geological exploration, and military applications.
Deep learning techniques play an important role in image fusion. Existing deep-learning-based image fusion methods mainly fall into two types: fusion methods based on adversarial networks and fusion methods based on non-adversarial networks. Fusion methods based on adversarial networks aim to fuse the main features of the source images by designing a loss function for the adversarial training process. However, this type of method has the following limitations: the network is difficult to optimize, and it is difficult to design a loss function that covers all the important information of the source images. In fusion methods based on non-adversarial networks, the feature extraction process is often realized in an unsupervised manner, and the quality of the extracted features is difficult to guarantee. Therefore, whether with adversarial learning based on loss-function design or with unsupervised learning, ignoring any important information in the source images (such as gradient, edge, texture, intensity, and contrast) will result in the loss of important features from the fusion result.
Therefore, the feature extraction capability of the network plays a key role in multi-source image fusion. In order to improve this capability, the invention provides an interactive feature-embedded multispectral image fusion network based on self-supervised learning, which breaks through the technical bottleneck of comprehensively extracting source image features in existing fusion networks and is of great significance for promoting deeper application of multispectral images in other fields.
Disclosure of Invention
The invention aims to improve the network feature extraction capability and provides a multispectral image fusion method based on interactive feature embedding.
The technical scheme of the invention is as follows:
a multispectral image fusion method based on interactive feature embedding comprises the following steps:
step one: making a multispectral image fusion dataset
1) acquiring a multispectral image dataset containing a source image I1 and a source image I2;
2) adjusting the multispectral source images I1, I2 from step 1) to the same height and width;
3) for the source images I1, I2 of the same size from step 2), sliding a window of fixed size and step length from left to right and from top to bottom to obtain image blocks;
4) flipping and mirroring the image pairs obtained in step 3) to enlarge the training dataset; a sketch of steps 3) and 4) follows.
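A minimal sketch of the patch extraction in step 3) and the flip/mirror augmentation in step 4). The window size and step length below are illustrative assumptions; the invention fixes only the scheme, not the values.

```python
import numpy as np

def extract_patches(img1, img2, win=120, stride=14):
    """Slide a fixed-size window left-to-right, top-to-bottom over the
    size-matched source images and collect aligned patch pairs.
    win and stride are hypothetical values for illustration."""
    pairs = []
    h, w = img1.shape[:2]
    for top in range(0, h - win + 1, stride):
        for left in range(0, w - win + 1, stride):
            p1 = img1[top:top + win, left:left + win]
            p2 = img2[top:top + win, left:left + win]
            pairs.append((p1, p2))
    return pairs

def augment(pairs):
    """Enlarge the training set by flipping and mirroring each patch pair."""
    out = []
    for p1, p2 in pairs:
        out.append((p1, p2))
        out.append((np.flipud(p1).copy(), np.flipud(p2).copy()))  # vertical flip
        out.append((np.fliplr(p1).copy(), np.fliplr(p2).copy()))  # horizontal mirror
    return out
```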
step two: designing a self-supervised interactive feature-embedded multispectral image fusion network to realize multispectral image fusion
1) Designing a self-supervised feature extraction module, wherein the module comprises two branches with the same structure; each branch consists of a plurality of convolution layers, and the convolution kernel parameter of each layer is 3 × 3 × f, wherein f is the number of convolution kernels; the hierarchical features extracted by the convolution layers are denoted F'm, F''m, where m denotes the m-th layer, ranging over {1, 2, ..., M}; the two branches take as input the source images I1, I2 with width W and height H, and output the corresponding reconstruction results Î1, Î2; the loss function L1 of the module is expressed as:
L1 = MSE(I1, Î1) + MSE(I2, Î2) (1)
where MSE represents the mean square error, In (n = 1, 2) denotes the source images I1, I2, and În denotes the corresponding reconstruction results Î1 and Î2;
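A minimal sketch of the reconstruction loss in equation (1), assuming the source images and branch outputs are PyTorch tensors of matching shape:

```python
import torch.nn.functional as F

def reconstruction_loss(i1, i2, i1_hat, i2_hat):
    # L1 = MSE(I1, Î1) + MSE(I2, Î2), equation (1)
    return F.mse_loss(i1_hat, i1) + F.mse_loss(i2_hat, i2)
```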
2) Designing an interactive feature embedding module, which is composed of a plurality of convolution layers, wherein the convolution kernel parameter of each layer is 3 × 3 × f, where f is the number of convolution kernels; the hierarchical features extracted by the convolution layers are denoted Fm; the hierarchical feature of the first layer is obtained by convolving the source images I1, I2, while the hierarchical features Fm of the second to the M-th layers are obtained from the hierarchical features F'm, F''m extracted by the self-supervised feature extraction module through concatenation and convolution operations (2), where C2 denotes 2 convolution operations, C4 denotes 4 convolution operations, and Cat represents the concat operation; from this construction it can be observed that the intermediate hierarchical features Fm are derived from the hierarchical features F'm, F''m extracted by the self-supervised feature extraction module, which ensures that Fm shares low-, mid-, and high-level features with F'm, F''m, further serving the fusion task;
on the other hand, the hierarchical features F'm, F''m extracted by the self-supervised feature extraction module are in turn derived from the hierarchical features Fm through a convolution operation, expressed as:
F'm, F''m = C(Fm), M ≥ m ≥ 1 (3)
given that the features F'm, F''m used for reconstructing the source images are derived from Fm, this also ensures that Fm contains the main features of the source images, further serving the fusion task;
3) Outputting the fusion result; the fusion result If is obtained by weighting the source images with the final output W of the interactive feature embedding module:
If = I1 * W + I2 * (1 - W) (4)
where W is a weight map obtained from FM by convolution operations:
W = C4(FM) (5)
where C4 represents four convolution operations;
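Equation (4) as a one-line sketch. The assumption that W already lies in [0, 1] (e.g. via a range-limiting activation at the end of C4) is this sketch's, since the text only states W = C4(FM):

```python
def fuse(i1, i2, w):
    """If = I1 * W + I2 * (1 - W), equation (4).
    Assumes the weight map w lies in [0, 1]; the activation producing that
    range is an assumption of this sketch, not stated in the text."""
    return i1 * w + i2 * (1.0 - w)
```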
step three: network training, wherein the network training process is the process of optimizing a loss function; the loss function of the proposed self-supervised interactive feature-embedded multispectral image fusion network consists of two parts: the self-supervised training loss L1 and the fusion loss Lf; network training is the process of minimizing the loss function L,
L = L1 + Lf (6)
where, specifically, Lf is an SSIM-based loss function;
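A sketch of the combined loss of equation (6), assuming the third-party pytorch_msssim package for the SSIM term; measuring If against both source images is likewise an assumption, since the text states only that Lf is SSIM-based:

```python
import torch.nn.functional as F
from pytorch_msssim import ssim  # third-party package; an assumption of this sketch

def total_loss(i1, i2, i1_hat, i2_hat, fused):
    """L = L1 + Lf, equation (6)."""
    l1 = F.mse_loss(i1_hat, i1) + F.mse_loss(i2_hat, i2)  # equation (1)
    # SSIM-based fusion loss Lf; comparing the fused image against both
    # sources is an assumption of this sketch.
    lf = (1 - ssim(fused, i1, data_range=1.0)) + (1 - ssim(fused, i2, data_range=1.0))
    return l1 + lf
```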
step four: testing stage; two multispectral images I1, I2 with width W and height H are input, and the corresponding reconstruction results Î1, Î2 and the final fusion result If are output.
Compared with the prior art, the invention has the following beneficial effects: the invention provides a self-supervised multispectral image fusion method that effectively improves the feature extraction capability of the network through a self-supervision mechanism; the invention further provides an interactive feature embedding structure that serves as a bridge connecting the image fusion and reconstruction tasks, progressively embedding the key information acquired by self-supervised learning into the fusion task and ultimately improving fusion performance.
Drawings
FIG. 1 is a schematic diagram of the basic structure of the process of the present invention.
FIG. 2 is a schematic diagram of the fusion result of the present embodiment.
Detailed Description
The specific embodiment of the multispectral image fusion method based on interactive feature embedding is explained in detail as follows:
step one: the production of the multispectral image fusion dataset specifically comprises the following steps:
1) acquiring a multispectral image dataset containing a source image I1 and a source image I2;
2) adjusting the multispectral source images I1, I2 from step 1) to the same height and width;
3) for the source images I1, I2 of the same size from step 2), sliding a window of fixed size and step length from left to right and from top to bottom to obtain image blocks;
4) flipping and mirroring the image pairs obtained in step 3) to enlarge the training dataset;
step two: as shown in FIG. 1, designing the self-supervised interactive feature-embedded multispectral image fusion network to implement multispectral image fusion includes:
1) Designing the self-supervised feature extraction module. As shown in FIG. 1, the module comprises two structurally identical branches. In this embodiment, each branch is composed of M (M = 3) convolution layers, each layer having convolution kernel parameters of 3 × 3 × f (f is the number of convolution kernels). The number of convolution kernels is 64 in the first layer, 128 in the second layer, and 256 in the third layer. The hierarchical features extracted by the convolution layers are denoted F'm, F''m (m denotes the m-th layer, ranging over {1, 2, 3}). The two branches take as input the source images I1, I2 with width W and height H, and output the corresponding reconstruction results Î1, Î2. The loss function L1 of the module is expressed as:
L1 = MSE(I1, Î1) + MSE(I2, Î2) (1)
where MSE represents the mean square error, In (n = 1, 2) denotes the source images I1, I2, and În denotes the corresponding reconstruction results Î1 and Î2.
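A minimal PyTorch sketch of this module under the stated layer widths. The ReLU activations and the final 1-channel reconstruction convolution are assumptions of the sketch, since the text fixes only the number of layers and kernels:

```python
import torch.nn as nn

class Branch(nn.Module):
    """One branch of the self-supervised feature extraction module:
    M = 3 conv layers with 64, 128 and 256 kernels of size 3x3, returning
    the hierarchical features F'1..F'3 and a reconstruction of the input."""
    def __init__(self, in_ch=1):
        super().__init__()
        self.convs = nn.ModuleList()
        prev = in_ch
        for f in (64, 128, 256):
            self.convs.append(nn.Sequential(
                nn.Conv2d(prev, f, 3, padding=1), nn.ReLU(inplace=True)))
            prev = f
        self.recon = nn.Conv2d(prev, in_ch, 3, padding=1)  # assumed output head

    def forward(self, x):
        feats = []                      # F'_m for m in {1, 2, 3}
        for conv in self.convs:
            x = conv(x)
            feats.append(x)
        return feats, self.recon(x)    # hierarchical features, reconstruction

class SelfSupervisedExtractor(nn.Module):
    """Two structurally identical branches, one per source image."""
    def __init__(self, in_ch=1):
        super().__init__()
        self.branch1 = Branch(in_ch)
        self.branch2 = Branch(in_ch)

    def forward(self, i1, i2):
        f1, i1_hat = self.branch1(i1)
        f2, i2_hat = self.branch2(i2)
        return f1, f2, i1_hat, i2_hat
```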
2) Interactive feature embedding module design. As shown in FIG. 1, in this embodiment the module is composed of M + 1 (M = 3) convolution layers, and the convolution kernel parameter of each layer is 3 × 3 × f (f is the number of convolution kernels). The number of convolution kernels is 64 in the first layer, 128 in the second layer, 256 in the third layer, and 1 in the fourth layer. The hierarchical features extracted by the convolution layers are denoted Fm. The hierarchical feature F1 of the first layer is obtained by convolving the source images I1, I2, while the hierarchical features Fm of the second to the M-th layers are obtained from the hierarchical features F'm, F''m extracted by the self-supervised feature extraction module through concatenation and convolution operations (2), where C2 denotes 2 convolution operations, C4 denotes 4 convolution operations, and Cat represents the concat operation. From this construction it can be observed that the intermediate hierarchical features Fm are derived from the hierarchical features F'm, F''m extracted by the self-supervised feature extraction module, which ensures that Fm shares low-, mid-, and high-level features with F'm, F''m to serve the fusion task.
On the other hand, the hierarchical features F'm, F''m extracted by the self-supervised feature extraction module are in turn derived from the hierarchical features Fm through a convolution operation, expressed as:
F'm, F''m = C(Fm), M ≥ m ≥ 1 (3)
Given that the features F'm, F''m used for reconstructing the source images are derived from Fm, this also ensures that Fm contains the main features of the source images, further serving the fusion task. Thus, the interactive feature embedding mechanism makes full use of the self-supervision mechanism, preventing important features from being lost in the fusion result.
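A hedged PyTorch sketch of the interactive feature embedding module. Because equation (2) is not reproduced above, the exact mixing rule (concatenating the previous Fm with F'm, F''m, then convolving) and the depth of the weight head are assumptions consistent with, but not fixed by, the text:

```python
import torch
import torch.nn as nn

class InteractiveEmbedding(nn.Module):
    """Sketch of the interactive feature embedding module: M + 1 = 4 conv
    layers with 64, 128, 256 and 1 kernels. F1 comes from the concatenated
    source images; each later Fm mixes the previous Fm with the extractor
    features F'm, F''m. The single-conv weight head follows the embodiment's
    fourth 1-kernel layer; equation (5) writes W = C4(FM), so the exact head
    depth is ambiguous and treated as an assumption here."""
    def __init__(self, in_ch=1, widths=(64, 128, 256)):
        super().__init__()
        self.first = nn.Sequential(
            nn.Conv2d(2 * in_ch, widths[0], 3, padding=1), nn.ReLU(inplace=True))
        self.mix = nn.ModuleList()
        for m in range(1, len(widths)):
            # input: previous Fm plus F'm, F''m at the matching width
            fused_in = widths[m - 1] + 2 * widths[m]
            self.mix.append(nn.Sequential(
                nn.Conv2d(fused_in, widths[m], 3, padding=1), nn.ReLU(inplace=True)))
        self.weight_head = nn.Conv2d(widths[-1], 1, 3, padding=1)  # 4th layer, 1 kernel

    def forward(self, i1, i2, f1_feats, f2_feats):
        fm = self.first(torch.cat([i1, i2], dim=1))                      # F1
        for m, mix in enumerate(self.mix, start=1):
            fm = mix(torch.cat([fm, f1_feats[m], f2_feats[m]], dim=1))   # F2, F3
        w = torch.sigmoid(self.weight_head(fm))   # weight map W, equation (5)
        return i1 * w + i2 * (1.0 - w)            # fusion result If, equation (4)
```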
3) Outputting the fusion result. As shown in FIG. 1, the fusion result If is obtained by weighting the source images with the final output W of the interactive feature embedding module:
If = I1 * W + I2 * (1 - W) (4)
where W is a weight map obtained from FM by convolution operations:
W = C4(FM) (5)
where C4 represents four convolution operations.
Step three: network training. The network training process is a process that optimizes a loss function. The interactive feature-embedded multispectral image fusion network loss function provided by the invention consists of two parts: the self-supervised training loss L1 (shown in equation (1)) and the fusion loss Lf. Network training is the process of minimizing the loss function L,
L = L1 + Lf (6)
where, specifically, Lf is an SSIM-based loss function.
The parameters in the network training process are set as follows:
base_lr: 1e-4 // learning rate
momentum: 0.9 // momentum
weight_decay: 5e-3 // weight decay
batch_size: 1 // batch size
solver_mode: GPU // training using the GPU
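These fields follow the Caffe solver convention; a rough PyTorch equivalent, assuming an SGD optimizer (the text does not name the optimizer):

```python
import torch
import torch.nn as nn

# placeholder standing in for the fusion network defined elsewhere
model = nn.Conv2d(1, 1, 3, padding=1)

# SGD is an assumption: the text lists Caffe-style solver fields
# (base_lr, momentum, weight_decay, solver_mode) without naming the optimizer
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=1e-4,            # base_lr: 1e-4
    momentum=0.9,       # momentum: 0.9
    weight_decay=5e-3,  # weight_decay: 5e-3
)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # solver_mode: GPU
batch_size = 1          # batch_size: 1
```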
Step four: testing stage. Two multispectral images I1, I2 with width W and height H are input, and the model of the invention outputs the corresponding reconstruction results Î1, Î2 and the final fusion result If. As shown in FIG. 2, compared with other fusion methods, the fusion result obtained by the method better retains the main features of the source images, including brightness and texture features.
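A minimal test-stage sketch, reusing the hypothetical module classes from the sketches above with placeholder single-channel inputs:

```python
import torch

# assumes SelfSupervisedExtractor and InteractiveEmbedding from the sketches above
extractor = SelfSupervisedExtractor(in_ch=1)
embedder = InteractiveEmbedding(in_ch=1)

extractor.eval()
embedder.eval()
with torch.no_grad():
    i1 = torch.rand(1, 1, 256, 256)   # placeholder W x H multispectral inputs
    i2 = torch.rand(1, 1, 256, 256)
    f1, f2, i1_hat, i2_hat = extractor(i1, i2)   # reconstructions Î1, Î2
    fused = embedder(i1, i2, f1, f2)             # final fusion result If
```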