CN116342455B - Efficient multi-source image fusion method, system and medium - Google Patents
- Publication number
- CN116342455B CN116342455B CN202310614277.2A CN202310614277A CN116342455B CN 116342455 B CN116342455 B CN 116342455B CN 202310614277 A CN202310614277 A CN 202310614277A CN 116342455 B CN116342455 B CN 116342455B
- Authority
- CN
- China
- Prior art keywords
- fusion
- image
- network
- weight
- source image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000007500 fusion method Methods 0.000 title claims abstract description 28
- 230000004927 fusion Effects 0.000 claims abstract description 169
- 238000000034 method Methods 0.000 claims abstract description 76
- 238000000605 extraction Methods 0.000 claims abstract description 34
- 238000004364 calculation method Methods 0.000 claims abstract description 19
- 238000012549 training Methods 0.000 claims abstract description 17
- 230000006870 function Effects 0.000 claims description 96
- 230000004913 activation Effects 0.000 claims description 14
- 238000004590 computer program Methods 0.000 claims description 11
- 238000003860 storage Methods 0.000 claims description 6
- 239000000284 extract Substances 0.000 claims description 5
- 238000005259 measurement Methods 0.000 claims description 3
- 238000010606 normalization Methods 0.000 claims description 3
- 230000009191 jumping Effects 0.000 claims 1
- 230000008901 benefit Effects 0.000 abstract description 6
- 230000000007 visual effect Effects 0.000 abstract description 5
- 230000014759 maintenance of location Effects 0.000 abstract description 3
- 238000010586 diagram Methods 0.000 description 12
- 230000000694 effects Effects 0.000 description 9
- 238000012545 processing Methods 0.000 description 8
- 238000013527 convolutional neural network Methods 0.000 description 4
- 238000013135 deep learning Methods 0.000 description 3
- 238000013461 design Methods 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 238000012423 maintenance Methods 0.000 description 3
- 230000008569 process Effects 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 2
- 238000013473 artificial intelligence Methods 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 238000012795 verification Methods 0.000 description 2
- BQCADISMDOOEFD-UHFFFAOYSA-N Silver Chemical compound [Ag] BQCADISMDOOEFD-UHFFFAOYSA-N 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 238000004220 aggregation Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 238000004422 calculation algorithm Methods 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 238000007499 fusion processing Methods 0.000 description 1
- 238000005286 illumination Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 238000004321 preservation Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 229910052709 silver Inorganic materials 0.000 description 1
- 239000004332 silver Substances 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10048—Infrared image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Multimedia (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an efficient multi-source image fusion method, system and medium. The method comprises the following training of an efficient multi-source image fusion network consisting of two feature extraction networks D and one feature reconstruction network F: the two source images of a sample pair are respectively input into the feature extraction networks D to extract their high-dimensional image features, which are spliced into a fusion feature and passed through the feature fusion network F to generate a fused image; the high-dimensional image features are also respectively input into the intermediate information layer L_i to generate weights that guide the calculation of the value of a loss function Loss designed according to the similarity between the weighted fused image and the weighted source images, thereby completing the training of the feature extraction networks D and the feature fusion network F. The invention has the advantages of high fusion speed, good visual effect, clear texture information, strong structure retention and broad applicability.
Description
Technical Field
The invention relates to the technical field of efficient multi-source image fusion, in particular to a method, a system and a medium for efficient multi-source image fusion.
Background
Efficient multi-source image fusion aims to quickly integrate the complementary, dominant information of different input source images into a single image, so as to improve the efficiency of subsequent tasks such as image interpretation, target recognition and scene classification. This type of fusion is widely used in video surveillance, camouflage analysis, photography and similar applications. It mainly covers multi-modal fusion, multi-focus fusion and multi-exposure fusion.
Typically this type of fusion is divided into two broad categories: traditional image fusion and non-traditional image fusion. Traditional pixel-level unified image fusion essentially consists of three manually designed steps, namely image feature extraction, fusion and reconstruction. The most common feature extraction methods include sparse representation (SR), multi-scale transformation (MST) and dictionary learning, while the fusion rules include maximum, minimum, addition, the L1 norm (L1-norm) and so on. Traditional pixel-level unified image fusion relies heavily on expert knowledge to extract features and formulate fusion strategies, which not only results in poor generalization of the model but also severely limits the fusion effect. In addition, these methods generally require parameter tuning for each fusion type, and the tuning process is complicated and time-consuming. In recent years, non-traditional image fusion, dominated by convolutional neural networks, has made great breakthroughs in fusion performance; it can be roughly classified by network type into methods controlled by loss functions, by prior information, by manual operators, and so on. U2Fusion (a unified unsupervised image fusion network, see: Xu, Han, et al. "U2Fusion: A unified unsupervised image fusion network." IEEE Transactions on Pattern Analysis and Machine Intelligence 44.1 (2020): 502-518), FusionDN (a unified densely connected network for image fusion, see: Xu, Han, et al. "FusionDN: A unified densely connected network for image fusion." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 34. No. 07. 2020) and PMGI (a fast unified image fusion network based on proportional maintenance of gradient and intensity, see: Zhang, Hao, et al. "Rethinking the image fusion: A fast unified image fusion network based on proportional maintenance of gradient and intensity." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 34. No. 07. 2020) achieve image fusion by calculating the similarity between the source images and the fusion result. Furthermore, the weights of the loss function are usually set according to the task or the source images: FusionDN uses a manual operator to determine the weights, U2Fusion achieves more accurate fusion by determining the weights with a pre-trained feature extraction network, and PMGI adapts to different fusion tasks by manually designing weights for the different intensity and gradient loss functions. A representative method based on prior-information control is IFCNN (a general image fusion framework based on a convolutional neural network, see: Zhang, Yu, et al. "IFCNN: A general image fusion framework based on convolutional neural network." Information Fusion (2020): 99-118), which learns image fusion from multi-focus source images and their ground truth, so that the fusion network can reveal the relationship between the source images and the ground truth. Methods based on manual operators were first explored in DeepFuse (a deep unsupervised network for exposure fusion with extreme exposure image pairs, see: Ram Prabhakar, K., V. Sai Srikar, and R. Venkatesh Babu. "DeepFuse: A deep unsupervised approach for exposure fusion with extreme exposure image pairs." Proceedings of the IEEE International Conference on Computer Vision. 2017), which uses a maximum fusion rule.
This approach addresses the lack of training data and ground truth in the fusion process. To improve on DeepFuse, DenseFuse (an infrared and visible image fusion network, see: Li, Hui, and Xiao-Jun Wu. "DenseFuse: A fusion approach to infrared and visible images." IEEE Transactions on Image Processing 28.5 (2018): 2614-2623) extends the fusion rule to the L1 norm and adds dense connections to the network.
However, deep-learning-based fusion networks still have shortcomings. Some adopt redundant fusion network structures, which makes their computational cost high and leaves them ill-suited to tasks that demand high efficiency. Meanwhile, existing methods do not fully consider or exploit the differences between source images, particularly in the design of the loss function.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the problems of the prior art, the invention provides an efficient multi-source image fusion method, system and medium whose fusion speed meets efficiency requirements, which can accurately estimate differences between the source images such as texture features and image characteristics, generate images with excellent visual quality, enhance the texture detail of the source images, and effectively handle information such as illumination, while producing no obvious artifacts and placing few restrictions on the input images, giving it strong generality.
In order to solve the technical problems, the invention adopts the following technical scheme:
The efficient multi-source image fusion method comprises the following training of an efficient multi-source image fusion network consisting of two feature extraction networks D and one feature reconstruction network F:

S101, establishing a sample set of sample pairs of the two types of source images to be fused;

S102, feeding the two source images of a sample pair into the feature extraction networks D respectively to extract their high-dimensional image features;

S103, concatenating the two high-dimensional image features into a fusion feature, and feeding the fusion feature into the feature fusion network F to generate a fused image; feeding the two high-dimensional image features respectively into the intermediate information layer L_i to generate the two weights;

S104, multiplying each weight by the fused image to obtain two weighted fused images; multiplying the first weight by the first source image and the second weight by the second source image to obtain two weighted source images;

S105, calculating the value of the loss function Loss designed according to the similarity between the weighted fused images and the weighted source images; if the value of the loss function Loss satisfies a preset convergence condition, the feature extraction networks D and the feature fusion network F are judged to be trained; otherwise, the network parameters of the feature extraction networks D and the feature fusion network F are adjusted and the procedure jumps back to step S102.
Optionally, the loss function Loss designed in step S103 is a weighted sum of a gradient loss function and an intensity loss function, wherein the gradient loss function is itself a weighted sum of a gradient loss function based on the structural similarity index SSIM and a gradient loss function based on the mean square error MSE, the structural similarity covering both the similarity between the weighted fused image and the weighted source image of the first source and the similarity between the weighted fused image and the weighted source image of the second source.
Optionally, the SSIM-based gradient loss function measures, for each of the two source images to be fused, the structural similarity SSIM between the weighted source image and the correspondingly weighted fused image, where W1 and W2 denote the weight matrices and the weighting is an element-wise multiplication.
Optionally, the MSE-based gradient loss function is defined analogously in terms of the mean square error MSE between each weighted source image and the correspondingly weighted fused image, again using weight matrices and element-wise multiplication.
Optionally, the intensity loss function is likewise defined in terms of the mean square error MSE between each weighted source image and the correspondingly weighted fused image, using its own pair of weight matrices and element-wise multiplication.
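Purely as a hedged illustration, and not as the exact formulas of the disclosure, the loss terms described above could take a form such as the following, using I_1 and I_2 for the source images, I_f for the fused image, W_1 to W_4 for weight matrices, the symbol ⊙ for element-wise multiplication, and coefficients λ_1 and λ_2 for the weighted sums; the assignment of weight matrices to individual terms is an assumption.

```latex
% Speculative sketch only; weight-matrix assignment and lambda placement are assumptions.
\begin{aligned}
L_{\mathrm{grad}}^{\mathrm{SSIM}} &= \bigl(1-\mathrm{SSIM}(W_1\odot I_f,\;W_1\odot I_1)\bigr)
  + \bigl(1-\mathrm{SSIM}(W_2\odot I_f,\;W_2\odot I_2)\bigr) \\
L_{\mathrm{grad}}^{\mathrm{MSE}} &= \mathrm{MSE}(W_1\odot I_f,\;W_1\odot I_1)
  + \mathrm{MSE}(W_2\odot I_f,\;W_2\odot I_2) \\
L_{\mathrm{int}} &= \mathrm{MSE}(W_3\odot I_f,\;W_3\odot I_1)
  + \mathrm{MSE}(W_4\odot I_f,\;W_4\odot I_2) \\
\mathrm{Loss} &= \bigl(L_{\mathrm{grad}}^{\mathrm{SSIM}} + \lambda_1\,L_{\mathrm{grad}}^{\mathrm{MSE}}\bigr)
  + \lambda_2\,L_{\mathrm{int}}
\end{aligned}
```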
Optionally, the weight matrices are computed from the tensors of the two high-dimensional image features: a gradient operator fgrad is applied to each feature tensor, the result is scaled by a constant coefficient and normalized with the sigmoid function, fmean computes the mean of the resulting tensor, and two intermediate variables derived from these quantities, whose calculation involves a square root calculation sqrt, determine the final weight matrices.
Optionally, the feature reconstruction performed by the feature reconstruction network F to obtain the fused image includes: the fusion feature is passed through a 3×3 convolution layer to obtain a tensor with 64 channels, an activated tensor is generated through the LeakyReLU activation function, the number of channels of the activated tensor is reduced to 1 through another 3×3 convolution layer, and finally the final fused image is obtained through the LeakyReLU activation function.
Optionally, the extraction of high-dimensional image features by the feature extraction network D includes: the source image is passed through a 3×3 convolution layer to obtain a tensor with 32 channels, an activated tensor is generated through the LeakyReLU activation function, the number of channels is raised to 64 through another 3×3 convolution layer, and finally the high-dimensional image feature corresponding to the source image is obtained through the LeakyReLU activation function.
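As an illustrative sketch only, the two networks described above might be written in PyTorch as follows; the class names, the padding of 1 for the 3×3 convolutions, the default LeakyReLU slope, and the assumption of single-channel inputs with a 128-channel concatenated fusion feature are not stated in the text and are assumptions.

```python
import torch.nn as nn

class FeatureExtractionD(nn.Module):
    """Feature extraction network D: 3x3 conv -> LeakyReLU -> 3x3 conv -> LeakyReLU,
    mapping a 1-channel source image to a 64-channel high-dimensional feature."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.LeakyReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)

class FeatureReconstructionF(nn.Module):
    """Feature reconstruction network F: the concatenated fusion feature is reduced
    to 64 channels and then to the 1-channel fused image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(128, 64, kernel_size=3, padding=1),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(64, 1, kernel_size=3, padding=1),
            nn.LeakyReLU(inplace=True),
        )

    def forward(self, fused_feature):
        return self.net(fused_feature)
```

With this layout, D maps a single-channel image to a 64-channel feature, and F maps the 128-channel concatenation of two such features back to a single-channel fused image.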
In addition, the invention also provides a multi-source image efficient fusion system which comprises a microprocessor and a memory which are connected with each other, wherein the microprocessor is programmed or configured to execute the multi-source image efficient fusion method.
Furthermore, the present invention provides a computer readable storage medium having stored therein a computer program for programming or configuring by a microprocessor to perform the multi-source image efficient fusion method.
Compared with the prior art, the invention has the following advantages:
1. The efficient multi-source image fusion network formed by the two feature extraction networks D and the feature reconstruction network F can effectively realize efficient multi-source image fusion. The high-dimensional image features are respectively fed into the intermediate information layer L_i to generate weights; multiplying the weights by the fused image yields two weighted fused images, and multiplying the weights by the source images of the sample pair yields two weighted source images. The value of the loss function Loss designed according to the similarity between the weighted fused images and the weighted source images is then calculated, realizing an image fusion loss function guided by intermediate information. The gradient and intensity information of the source images is well preserved, the generated fused image has good visual quality, and the texture and intensity characteristics of the fused images are better retained.
2. The efficient multi-source image fusion network formed by the two feature extraction networks D and the feature reconstruction network F is a lightweight network that avoids redundant structural design; its computation and training cost is extremely low, making it suitable for the production and research of light and small products.

3. The efficient multi-source image fusion network formed by the two feature extraction networks D and the feature reconstruction network F can adapt to different types of pixel-level image fusion tasks with only a single trained model.

4. The invention realizes an image fusion loss function guided by intermediate information, and the proposed way of guiding image generation with an intermediate information layer can be extended to other network applications to enhance a network's ability to extract the information of interest.
5. The invention has the capability of processing data of different sources, and can realize different types of image fusion, including infrared visible light image fusion, multi-focus image fusion and multi-exposure image fusion.
Drawings
FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a method according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a multi-source image efficient fusion network according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of training principle of a multi-source image efficient fusion network in an embodiment of the invention.
Fig. 5 is a comparison of infrared and visible light image fusion results in an embodiment of the present invention.

Fig. 6 is a comparison of multi-focus image fusion results in an embodiment of the present invention.

Fig. 7 is a comparison of multi-exposure image fusion results in an embodiment of the present invention.
Detailed Description
As shown in fig. 1 and fig. 2, the efficient multi-source image fusion method of this embodiment includes the following training of an efficient multi-source image fusion network composed of two feature extraction networks D and one feature reconstruction network F:

S101, establishing a sample set of sample pairs of the two types of source images to be fused;

S102, feeding the two source images of a sample pair into the feature extraction networks D respectively to extract their high-dimensional image features;

S103, concatenating the two high-dimensional image features into a fusion feature, and feeding the fusion feature into the feature fusion network F to generate a fused image; feeding the two high-dimensional image features respectively into the intermediate information layer L_i to generate the two weights;

S104, multiplying each weight by the fused image to obtain two weighted fused images; multiplying the first weight by the first source image and the second weight by the second source image to obtain two weighted source images;

S105, calculating the value of the loss function Loss designed according to the similarity between the weighted fused images and the weighted source images; if the value of the loss function Loss satisfies a preset convergence condition, the feature extraction networks D and the feature fusion network F are judged to be trained; otherwise, the network parameters of the feature extraction networks D and the feature fusion network F are adjusted and the procedure jumps back to step S102.
After this training, when the efficient multi-source image fusion network operates, each of the two feature extraction networks D receives one source image and extracts its high-dimensional image features, and the feature reconstruction network F then fuses the high-dimensional image features extracted by the two feature extraction networks D to obtain a pixel-level image fusion result.
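For reference, a minimal inference-time sketch under the assumption of trained networks and registered single-channel source tensors follows; the function name fuse_pair is hypothetical.

```python
import torch

@torch.no_grad()
def fuse_pair(net_d: torch.nn.Module, net_f: torch.nn.Module,
              img1: torch.Tensor, img2: torch.Tensor) -> torch.Tensor:
    """Fuse two registered single-channel source images of shape [1, 1, H, W]
    with a trained feature extraction network D and reconstruction network F."""
    phi1 = net_d(img1)                               # high-dimensional features of source 1
    phi2 = net_d(img2)                               # high-dimensional features of source 2 (same D weights)
    fused_feature = torch.cat([phi1, phi2], dim=1)   # channel-wise concatenation
    return net_f(fused_feature)                      # pixel-level fused image
```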
In this embodiment, the loss function Loss designed in step S103 is a weighted sum of a gradient loss function and an intensity loss function, where the gradient loss function is itself a weighted sum of an SSIM-based gradient loss and an MSE-based gradient loss, and the structural similarity covers both the similarity between the weighted fused image and the weighted source image of the first source and the similarity between the weighted fused image and the weighted source image of the second source. As a specific embodiment, the gradient loss and the intensity loss are combined with a weighting coefficient whose typical value range is 0.1 to 1 and which is set to 0.4 in this embodiment.
The gradient loss function is the weighted sum of the SSIM-based gradient loss and the MSE-based gradient loss, combined with a weighting coefficient whose typical value range is 10⁻¹ to 10³ and which is set to 20 in this embodiment.
In this embodiment, the SSIM-based gradient loss function measures, for each of the two source images to be fused, the structural similarity SSIM between the weighted source image and the correspondingly weighted fused image, where W1 and W2 are the weight matrices and the weighting is an element-wise multiplication.
In this embodiment, the MSE-based gradient loss function is defined analogously in terms of the mean square error MSE between each weighted source image and the correspondingly weighted fused image, again using weight matrices and element-wise multiplication.
In this embodiment, the intensity loss function is likewise defined in terms of the mean square error MSE between each weighted source image and the correspondingly weighted fused image, using its own pair of weight matrices and element-wise multiplication.
In this embodiment, the weight matrices are computed from the tensors of the two high-dimensional image features: a gradient operator fgrad is applied to each feature tensor, the result is scaled by a constant coefficient and normalized with the sigmoid function, fmean computes the mean of the resulting tensor, and two intermediate variables derived from these quantities, whose calculation involves a square root calculation sqrt, determine the final weight matrices.
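Since the exact expressions are not reproduced here, the following Python sketch only illustrates the general recipe described above (a gradient operator, a constant coefficient, sigmoid normalization and tensor means); the Sobel-style choice for fgrad, the complementary-weight construction and the constant c = 10.0 are assumptions.

```python
import torch
import torch.nn.functional as F

def fgrad(x: torch.Tensor) -> torch.Tensor:
    """Gradient-magnitude operator (Sobel-style stand-in for the patent's fgrad)."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]], device=x.device)
    ky = kx.t()
    kx = kx.reshape(1, 1, 3, 3).repeat(x.shape[1], 1, 1, 1)
    ky = ky.reshape(1, 1, 3, 3).repeat(x.shape[1], 1, 1, 1)
    gx = F.conv2d(x, kx, padding=1, groups=x.shape[1])
    gy = F.conv2d(x, ky, padding=1, groups=x.shape[1])
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

def guidance_weights(phi1: torch.Tensor, phi2: torch.Tensor, c: float = 10.0):
    """Speculative reconstruction of the weight-matrix computation: gradient
    responses of the two feature tensors are averaged over channels (fmean),
    scaled by a constant coefficient c and turned into complementary weights
    with a sigmoid normalization."""
    g1 = fgrad(phi1).mean(dim=1, keepdim=True)   # channel-wise mean of gradient response
    g2 = fgrad(phi2).mean(dim=1, keepdim=True)
    w1 = torch.sigmoid(c * (g1 - g2))            # sigmoid normalization
    w2 = 1.0 - w1                                # complementary weight map
    return w1, w2
```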
The key point of the method of this embodiment is the joint training scheme for the feature reconstruction network F and the feature extraction network D in the efficient multi-source image fusion network; F and D themselves may adopt any suitable existing deep neural network structure as required. For example, as an optional implementation shown in fig. 3, the feature reconstruction performed by the feature reconstruction network F to obtain the fused image includes: the fusion feature is passed through a 3×3 convolution layer to obtain a tensor with 64 channels, an activated tensor is generated through the LeakyReLU activation function, the number of channels of the activated tensor is reduced to 1 through another 3×3 convolution layer, and finally the final fused image is obtained through the LeakyReLU activation function.
For example, as an optional implementation shown in fig. 3, the extraction of high-dimensional image features by the feature extraction network D includes: the source image is passed through a 3×3 convolution layer to obtain a tensor with 32 channels, an activated tensor is generated through the LeakyReLU activation function, the number of channels is raised to 64 through another 3×3 convolution layer, and finally the high-dimensional image feature corresponding to the source image is obtained through the LeakyReLU activation function.
Corresponding to different source images, the method of the embodiment uses the feature extraction network D with the same set of network parameters to perform feature extraction to generate high-dimensional features, and then the high-dimensional features are combined and input into the feature reconstruction network F to generate a fusion image. The network structures of the feature reconstruction network F and the feature extraction network D are very simple.
Fig. 4 is a schematic diagram of the training principle of the efficient multi-source image fusion network in this embodiment. As shown in fig. 4, this embodiment feeds the two high-dimensional image features respectively into the intermediate information layer L_i to generate the two weights, thereby extracting gradient and intensity information from the high-dimensional features, and uses this information to build a weight-guided loss function that measures the similarity between the source images and the fused image; the intermediate information layer is formed by combining a 1×1 convolution layer with a Tanh activation function.
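A minimal sketch of such an intermediate information layer, assuming a 64-channel input feature and a single-channel output weight map (the output width is not stated in the text), might look as follows.

```python
import torch.nn as nn

class IntermediateInformationLayer(nn.Module):
    """Intermediate information layer L_i: a 1x1 convolution followed by Tanh,
    mapping a high-dimensional feature to a weight map."""
    def __init__(self, in_channels: int = 64, out_channels: int = 1):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.act = nn.Tanh()

    def forward(self, feature):
        return self.act(self.conv(feature))
```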
In this embodiment, the multi-focus fusion dataset (Lytro) is augmented and used as the training data for the network parameters. Because handling multi-channel versus single-channel images has little influence on the structure of the efficient multi-source image fusion network of this embodiment, the dataset is converted to single-channel grayscale images and cropped into 128×128 patches as input, with 10% of the total input data held out as the validation set. If a multi-channel input or multi-channel output network needs to be trained, the number of network input channels (n_channels) and the number of categories (n_categories) must be changed to appropriate values. The learning rate is 1e-4 and the parameters are updated with a callback function (ReduceLROnPlateau). The batch size during training is set to 32.
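The training configuration described above might be wired up roughly as follows; net_d, net_f, inter_layer and train_step refer to the earlier sketches, while the Adam optimizer, the scheduler arguments, and the evaluate, train_loader, val_loader and num_epochs names are placeholders and assumptions rather than details from the text.

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau

BATCH_SIZE = 32        # batch size stated in the text
LEARNING_RATE = 1e-4   # learning rate stated in the text
CROP_SIZE = 128        # 128x128 single-channel training crops
VAL_FRACTION = 0.1     # 10% of the input data held out for validation

params = (list(net_d.parameters()) + list(net_f.parameters())
          + list(inter_layer.parameters()))
optimizer = torch.optim.Adam(params, lr=LEARNING_RATE)        # optimizer choice assumed
scheduler = ReduceLROnPlateau(optimizer, mode="min",          # callback-style LR update
                              factor=0.5, patience=5)         # factor/patience assumed

for epoch in range(num_epochs):
    for img1, img2 in train_loader:      # loader yields 128x128 grayscale sample pairs
        train_step(net_d, net_f, inter_layer, optimizer, img1, img2)
    val_loss = evaluate(net_d, net_f, inter_layer, val_loader)  # hypothetical validation helper
    scheduler.step(val_loss)             # reduce LR when the validation loss plateaus
```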
To further verify the effectiveness of the method of this embodiment, three typical fusion methods (U2Fusion, IFCNN and DenseFuse) were used as baselines and compared with the method of this embodiment in efficient multi-source image fusion experiments. The experiments were run on an NVIDIA Quadro RTX 6000 GPU and a 2.10 GHz Intel Xeon Silver 4216 CPU, using the multi-focus fusion dataset Lytro as training data. To verify the effectiveness of the unified fusion of this embodiment, three different types of fusion tasks were selected for comparative verification: multi-modal fusion, multi-focus fusion and multi-exposure fusion.
For multi-modal fusion, this embodiment selects the most representative visible and infrared fusion dataset TNO, from which 21 pairs of typical images covering the whole range of scenes were selected; the experimental results obtained are shown in Table 1 and Fig. 5.
Table 1 objective performance metrics for the present example method and three exemplary multi-modal fusion methods.
Further, the present example verifies multi-focus fusion by using 13 pairs of images from MFFW dataset, and the experimental results obtained are shown in table 2 and fig. 6.
Table 2 objective performance metrics for the present example method and three exemplary multi-focus fusion methods.
For multi-exposure fusion, this embodiment selects 20 suitable image pairs from the EMPA-HDR dataset. To accommodate device limitations, four downsampling operations were performed on the EMPA-HDR data; the experimental results obtained are shown in Table 3 and Fig. 7.
Table 3 objective performance metrics for the present example method and three typical multi-exposure fusion methods.
In Tables 1 to 3, the EI index reflects the gradient information extraction capability of the corresponding method; the larger the value, the better. The entropy (EN) index measures the detail content of the image; larger is better. The mutual information (MI) index measures the similarity between images; larger is better. The TIME index reflects the time in seconds required by the different methods to fuse; smaller is better. The FPS index reflects the number of image frames that can be generated per second and thus the efficiency of the algorithm; larger is better. As can be seen from Table 1, all objective evaluation indexes of the method proposed in this embodiment are superior to those of the other methods. In Tables 2 and 3 the EI index is slightly lower than that of IFCNN, while the other indexes remain clearly ahead. This is because the intermediate-information-layer-guided loss function used by the network greatly strengthens the network's ability to handle intensity and gradient information, so that image feature extraction and image reconstruction can be achieved with only a very small number of convolutions, realizing efficient and accurate pixel-level image fusion.
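As a small illustration of how one of these objective metrics can be computed, the following sketch gives a common definition of the entropy (EN) index as the Shannon entropy of the grayscale histogram; the exact metric implementations used in the experiments are not given in the text, so this is an assumption.

```python
import numpy as np

def image_entropy(img: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy (EN) of a grayscale image with values in [0, 255]."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 255))
    p = hist.astype(np.float64) / hist.sum()   # normalize histogram to probabilities
    p = p[p > 0]                               # drop empty bins to avoid log(0)
    return float(-np.sum(p * np.log2(p)))
```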
Figs. 5, 6 and 7 compare four methods in total, namely the three typical fusion methods U2Fusion, IFCNN and DenseFuse and the method of this embodiment, on multi-modal fusion, multi-focus fusion and multi-exposure fusion respectively.
In fig. 5, A is the infrared source image, B is the visible light source image, C is the fused image obtained by U2Fusion, D is the fused image obtained by IFCNN, E is the fused image obtained by DenseFuse, and F is the fused image obtained by the method proposed in this embodiment. As can be seen from fig. 5, the three typical fusion methods U2Fusion, IFCNN and DenseFuse preserve the texture and intensity of the background and the target poorly, while the fused image produced by the method proposed in this embodiment has the best quality.
In fig. 6, A is the far-focus source image, B is the near-focus source image, C is the fused image obtained by U2Fusion, D is the fused image obtained by IFCNN, E is the fused image obtained by DenseFuse, and F is the fused image obtained by the method proposed in this embodiment. As can be seen from fig. 6, the fused image obtained by the method of this embodiment preserves edges best and can combine the complementary gradient information of the different source images, whereas the fusion results of the three typical methods U2Fusion, IFCNN and DenseFuse are only average.
In fig. 7, A is the high-exposure source image, B is the low-exposure source image, C is the fused image obtained by U2Fusion, D is the fused image obtained by IFCNN, E is the fused image obtained by DenseFuse, and F is the fused image obtained by the method proposed in this embodiment. As can be seen from fig. 7, the images produced by the three typical fusion methods U2Fusion, IFCNN and DenseFuse are only average in terms of image restoration and texture preservation and show obvious flaws, whereas the fused image obtained by the method of this embodiment performs well in both respects and has the best quality.
In summary, the method of this embodiment uses a convolutional neural network to extract features from the input images, applies intermediate-information-guided optimization to the extracted features to generate the corresponding weights, and feeds these weights into the network's loss function. The source images to be fused are input into the network to generate high-dimensional features for the different sources; these features are then combined and used for image reconstruction, while their intensity and gradient information is extracted and incorporated into the loss function to measure the similarity between the generated image and the source images. The reconstructed image is the fused image. The method does not require extensive training: training on a multi-focus dataset alone is sufficient, and the resulting model is applicable to different types of fusion tasks. Compared with other advanced fusion methods, the fused images generated by the method of this embodiment achieve higher objective performance indexes and better visual quality; more importantly, the method is extremely efficient and highly general and robust. The method of this embodiment therefore offers high fusion speed, good visual effect, clear texture information, strong structure retention and broad applicability.
In addition, the embodiment also provides a multi-source image efficient fusion system, which comprises a microprocessor and a memory which are connected with each other, wherein the microprocessor is programmed or configured to execute the multi-source image efficient fusion method.
Furthermore, the present embodiment also provides a computer-readable storage medium having stored therein a computer program for programming or configuring by a microprocessor to perform the multi-source image efficient fusion method.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to the present invention may occur to one skilled in the art without departing from the principles of the present invention and are intended to be within the scope of the present invention.
Claims (5)
1. An efficient multi-source image fusion method, characterized by comprising the following training of an efficient multi-source image fusion network consisting of two feature extraction networks D and one feature reconstruction network F:
S101, establishing a sample set of sample pairs of the two types of source images to be fused;

S102, feeding the two source images of a sample pair into the feature extraction networks D respectively to extract their high-dimensional image features;

S103, concatenating the two high-dimensional image features into a fusion feature, and feeding the fusion feature into the feature fusion network F to generate a fused image; feeding the two high-dimensional image features respectively into the intermediate information layer L_i to generate the two weights, the intermediate information layer L_i being formed by combining a 1×1 convolution layer and a Tanh activation function;

S104, multiplying each weight by the fused image to obtain two weighted fused images; multiplying the first weight by the first source image and the second weight by the second source image to obtain two weighted source images;

S105, calculating the value of the loss function Loss designed according to the similarity between the weighted fused images and the weighted source images; if the value of the loss function Loss satisfies a preset convergence condition, judging that the feature extraction networks D and the feature fusion network F are trained; otherwise, adjusting the network parameters of the feature extraction networks D and the feature fusion network F and jumping to step S102;
the loss function Loss designed in step S103 is a weighted sum of a gradient loss function and an intensity loss function, wherein the gradient loss function is itself a weighted sum of a gradient loss function based on the structural similarity index SSIM and a gradient loss function based on the mean square error MSE, the structural similarity comprising the similarity between the weighted fused image and the weighted source image of the first source and the similarity between the weighted fused image and the weighted source image of the second source; the SSIM-based gradient loss function is computed from the structural similarity SSIM between each weighted source image and the correspondingly weighted fused image, where W1 and W2 are weight matrices and the weighting is an element-wise multiplication; the MSE-based gradient loss function is computed analogously from the mean square error MSE between each weighted source image and the correspondingly weighted fused image using weight matrices and element-wise multiplication; the intensity loss function is likewise computed from the mean square error MSE between each weighted source image and the correspondingly weighted fused image using weight matrices and element-wise multiplication; and the weight matrices are computed from the tensors of the two high-dimensional image features using a gradient operator fgrad, a constant coefficient, a sigmoid normalization function, a function fmean that computes the mean of a tensor, and intermediate variables whose calculation involves a square root calculation sqrt.
2. The method of claim 1, wherein the feature reconstruction performed by the feature reconstruction network F to obtain the fused image comprises: passing the fusion feature through a 3×3 convolution layer to obtain a tensor with 64 channels, generating an activated tensor through the LeakyReLU activation function, reducing the number of channels of the activated tensor to 1 through another 3×3 convolution layer, and finally obtaining the final fused image through the LeakyReLU activation function.
3. The method of claim 1, wherein the extraction of high-dimensional image features by the feature extraction network D comprises: passing the source image through a 3×3 convolution layer to obtain a tensor with 32 channels, generating an activated tensor through the LeakyReLU activation function, raising the number of channels to 64 through another 3×3 convolution layer, and finally obtaining the high-dimensional image feature corresponding to the source image through the LeakyReLU activation function.
4. A multi-source image efficient fusion system comprising a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to perform the multi-source image efficient fusion method of any one of claims 1-3.
5. A computer readable storage medium having a computer program stored therein, wherein the computer program is for programming or configuring by a microprocessor to perform the multi-source image efficient fusion method of any one of claims 1-3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310614277.2A CN116342455B (en) | 2023-05-29 | 2023-05-29 | Efficient multi-source image fusion method, system and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310614277.2A CN116342455B (en) | 2023-05-29 | 2023-05-29 | Efficient multi-source image fusion method, system and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116342455A CN116342455A (en) | 2023-06-27 |
CN116342455B true CN116342455B (en) | 2023-08-08 |
Family
ID=86886215
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310614277.2A Active CN116342455B (en) | 2023-05-29 | 2023-05-29 | Efficient multi-source image fusion method, system and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116342455B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112668648A (en) * | 2020-12-29 | 2021-04-16 | 西安电子科技大学 | Infrared and visible light fusion identification method based on symmetric fusion network |
CN114331931A (en) * | 2021-11-26 | 2022-04-12 | 西安邮电大学 | High dynamic range multi-exposure image fusion model and method based on attention mechanism |
CN114529794A (en) * | 2022-04-20 | 2022-05-24 | 湖南大学 | Infrared and visible light image fusion method, system and medium |
CN116109538A (en) * | 2023-03-23 | 2023-05-12 | 广东工业大学 | Image fusion method based on simple gate unit feature extraction |
-
2023
- 2023-05-29 CN CN202310614277.2A patent/CN116342455B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112668648A (en) * | 2020-12-29 | 2021-04-16 | 西安电子科技大学 | Infrared and visible light fusion identification method based on symmetric fusion network |
CN114331931A (en) * | 2021-11-26 | 2022-04-12 | 西安邮电大学 | High dynamic range multi-exposure image fusion model and method based on attention mechanism |
CN114529794A (en) * | 2022-04-20 | 2022-05-24 | 湖南大学 | Infrared and visible light image fusion method, system and medium |
CN116109538A (en) * | 2023-03-23 | 2023-05-12 | 广东工业大学 | Image fusion method based on simple gate unit feature extraction |
Non-Patent Citations (1)
Title |
---|
Han Xu et al. U2Fusion: A Unified Unsupervised Image Fusion Network. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2020, 502-518. *
Also Published As
Publication number | Publication date |
---|---|
CN116342455A (en) | 2023-06-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107977932B (en) | Face image super-resolution reconstruction method based on discriminable attribute constraint generation countermeasure network | |
JP6980958B1 (en) | Rural area classification garbage identification method based on deep learning | |
CN110287846B (en) | Attention mechanism-based face key point detection method | |
CN111738363B (en) | Alzheimer disease classification method based on improved 3D CNN network | |
CN113628249B (en) | RGBT target tracking method based on cross-modal attention mechanism and twin structure | |
TW202141358A (en) | Method and apparatus for image restoration, storage medium and terminal | |
CN109005398B (en) | Stereo image parallax matching method based on convolutional neural network | |
CN112819910A (en) | Hyperspectral image reconstruction method based on double-ghost attention machine mechanism network | |
CN115018727B (en) | Multi-scale image restoration method, storage medium and terminal | |
CN114283158A (en) | Retinal blood vessel image segmentation method and device and computer equipment | |
CN114511798B (en) | Driver distraction detection method and device based on transformer | |
CN111860528B (en) | Image segmentation model based on improved U-Net network and training method | |
CN107871103B (en) | Face authentication method and device | |
CN115311186B (en) | Cross-scale attention confrontation fusion method and terminal for infrared and visible light images | |
CN113066065B (en) | No-reference image quality detection method, system, terminal and medium | |
CN113870286B (en) | Foreground segmentation method based on multi-level feature and mask fusion | |
CN116757986A (en) | Infrared and visible light image fusion method and device | |
CN118379288B (en) | Embryo prokaryotic target counting method based on fuzzy rejection and multi-focus image fusion | |
CN117456330A (en) | MSFAF-Net-based low-illumination target detection method | |
CN115861094A (en) | Lightweight GAN underwater image enhancement model fused with attention mechanism | |
CN109086806A (en) | A kind of IOT portable device visual identity accelerated method based on low resolution, compressed image | |
CN116152128A (en) | High dynamic range multi-exposure image fusion model and method based on attention mechanism | |
CN118212240A (en) | Automobile gear production defect detection method | |
CN114972753A (en) | Lightweight semantic segmentation method and system based on context information aggregation and assisted learning | |
CN117830900A (en) | Unsupervised video object segmentation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |