CN116342455A - Efficient multi-source image fusion method, system and medium - Google Patents

Efficient multi-source image fusion method, system and medium

Info

Publication number
CN116342455A
CN116342455A (application number CN202310614277.2A)
Authority
CN
China
Prior art keywords
fusion
image
network
source image
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310614277.2A
Other languages
Chinese (zh)
Other versions
CN116342455B (en)
Inventor
李树涛 (Li Shutao)
刘锦洋 (Liu Jinyang)
佃仁伟 (Dian Renwei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University
Priority to CN202310614277.2A
Publication of CN116342455A
Application granted
Publication of CN116342455B
Active legal-status Current
Anticipated expiration legal-status

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/0464 - Convolutional networks [CNN, ConvNet]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 - Proximity, similarity or dissimilarity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10048 - Infrared image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20212 - Image combination
    • G06T2207/20221 - Image fusion; Image merging
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an efficient multi-source image fusion method, system and medium. The method comprises the following training of an efficient multi-source image fusion network consisting of two feature extraction networks D and a feature reconstruction network F: the two source images of each sample pair are respectively input to the feature extraction network D, and the extracted high-dimensional image features are spliced into a fusion feature from which the feature reconstruction network F generates a fusion image; the high-dimensional image features are also respectively input to the intermediate information layer L_i to generate the weights W_1 and W_2, which guide the calculation of the value of the Loss function Loss designed according to the similarity between the weight fusion images and the weight source images, thereby completing the training of the feature extraction network D and the feature reconstruction network F. The invention has the advantages of high fusion speed, good visual effect, obvious texture information, high structure retention and strong universality.

Description

Efficient multi-source image fusion method, system and medium
Technical Field
The invention relates to the technical field of efficient multi-source image fusion, in particular to a method, a system and a medium for efficient multi-source image fusion.
Background
The efficient fusion of multi-source images aims to quickly integrate the dominant information of different input source images into one image so as to improve the efficiency of subsequent work such as image interpretation, target recognition and scene classification. This type of fusion has a great number of applications in video surveillance, camouflage analysis, photography, and the like. Specifically, such image fusion mainly includes multi-modal fusion, multi-focus fusion, multi-exposure fusion, and other types.
Typically this type of fusion is categorized into two broad classes: traditional image fusion and non-traditional image fusion. Traditional pixel-level unified image fusion almost always consists of three steps: manually designed image feature extraction, fusion and reconstruction. The most common feature extraction methods include Sparse Representation (SR), Multi-Scale Transformation (MST), dictionary learning, and the like. The fusion rules include the maximum value, the minimum value, addition, the L1 norm (L1-norm), and the like. The traditional pixel-level unified image fusion methods severely depend on expert knowledge to extract features and formulate fusion strategies, which not only results in poor generalization capability of the model but also severely limits the fusion effect. In addition, such methods generally require parameter adjustment according to the fusion type, and the parameter adjustment process is complicated and time-consuming. In recent years, non-traditional image fusion, with convolutional neural networks as the mainstream, has made great breakthroughs in fusion performance; according to network type it can be roughly classified into networks based on loss function control, prior information control, manual operators, and the like. Networks based on loss function control are represented by U2Fusion (a unified unsupervised image fusion network, see: Xu, Han, et al. "U2Fusion: A unified unsupervised image fusion network." IEEE Transactions on Pattern Analysis and Machine Intelligence 44.1 (2020): 502-518), FusionDN (a unified densely connected network for image fusion, see: Xu, Han, et al. "FusionDN: A unified densely connected network for image fusion." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 34. No. 07. 2020) and PMGI (a fast unified image fusion network based on the proportional maintenance of gradient and intensity, see: Zhang, Hao, et al. "Rethinking the image fusion: A fast unified image fusion network based on proportional maintenance of gradient and intensity." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 34. No. 07. 2020), which achieve image fusion by calculating the similarity between the source images and the fusion result. Furthermore, the weights of the loss function are also typically set according to the task or the source images. Specifically, FusionDN uses manual operators to determine the weights, U2Fusion achieves more accurate fusion by determining the weights with a pre-trained feature extraction network, and PMGI adapts to different fusion tasks by manually designing the weights of different intensity loss functions and gradient loss functions. A representative method of prior information control is IFCNN (a general image fusion framework based on convolutional neural networks, see: Zhang, Yu, et al. "IFCNN: A general image fusion framework based on convolutional neural network." Information Fusion (2020): 99-118). IFCNN utilizes multi-focus source images and their Ground Truth to achieve image fusion, so the fusion network can reveal the relationship between the source images and the ground truth. Methods based on manual operators were first realized in DeepFuse (a deep unsupervised network for exposure fusion with extreme exposure image pairs, see: Ram Prabhakar, K., V. Sai Srikar, and R. Venkatesh Babu. "DeepFuse: A deep unsupervised approach for exposure fusion with extreme exposure image pairs." Proceedings of the IEEE International Conference on Computer Vision. 2017), which uses a maximum fusion rule. That method aims to solve the lack of training data and ground truth in the fusion process. To enhance the fusion effect of DeepFuse, DenseFuse (an infrared and visible light fusion network, see: Li, Hui, and Xiao-Jun Wu. "DenseFuse: A fusion approach to infrared and visible images." IEEE Transactions on Image Processing 28.5 (2018): 2614-2623) extends the L1 norm into the fusion rules and adds dense connections in the network.
However, deep learning networks still have some problems. Some of them adopt redundant fusion network structure designs, so the computational cost is high and they are difficult to adapt to tasks requiring high efficiency. Meanwhile, the existing methods do not fully consider and exploit the differences between source images, particularly in the design of the loss function.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: aiming at the problems in the prior art, the invention provides an efficient multi-source image fusion method, system and medium. The fusion speed meets the requirement of high efficiency; the differences between the source images, such as texture features and image characteristics, can be accurately estimated; images with excellent visual effects can be generated; the texture detail information of the source images can be enhanced; information such as illumination can be effectively processed; and the method produces no obvious artifacts, places weak restrictions on the input images, and therefore has strong universality.
In order to solve the technical problems, the invention adopts the following technical scheme:
The efficient multi-source image fusion method comprises the following training of an efficient multi-source image fusion network consisting of two feature extraction networks D and one feature reconstruction network F:
S101, establishing a sample set of sample pairs of the two types of source images to be fused;
S102, the two source images of each sample pair are respectively input to the feature extraction network D to extract their high-dimensional image features;
S103, the high-dimensional image is characterized
Figure SMS_8
And->
Figure SMS_10
Splicing to obtain fusion characteristics->
Figure SMS_14
Fusion feature->
Figure SMS_9
The input feature fusion network F generates a fusion image +.>
Figure SMS_12
The method comprises the steps of carrying out a first treatment on the surface of the High-dimensional image features->
Figure SMS_13
And->
Figure SMS_15
Respectively input to the intermediate information layer L i Generating weight->
Figure SMS_7
And->
Figure SMS_11
S104, weighting
Figure SMS_18
And->
Figure SMS_21
Respectively->
Figure SMS_24
Multiplying to obtain two weight fusion images +.>
Figure SMS_16
And->
Figure SMS_19
The method comprises the steps of carrying out a first treatment on the surface of the Weight +.>
Figure SMS_22
And Source image->
Figure SMS_25
Multiplication, weight->
Figure SMS_17
And Source image->
Figure SMS_20
Multiplying to obtain two weight source images +.>
Figure SMS_23
And->
Figure SMS_26
S105, calculating a Loss function Loss value designed according to the similarity between the weight fusion image and the weight source image, and judging that the feature extraction network D and the feature fusion network F are trained if the Loss function Loss value meets a preset convergence condition; otherwise, the network parameters of the feature extraction network D and the feature fusion network F are adjusted, and the step S102 is skipped.
Optionally, the Loss function Loss designed in step S103 is a weighted sum of a gradient loss function and an intensity loss function, wherein the gradient loss function is itself a weighted sum of a gradient loss function based on the structural similarity index measure SSIM and a gradient loss function based on the mean square error MSE of the structural similarity, the structural similarity comprising the similarity between each weight fusion image and the corresponding weight source image.
Optionally, the calculation function expression of the gradient loss function based on the structural similarity index measure SSIM is:

[equation given as an image in the original]

in the above expression, SSIM is the structural similarity index measure, the two types of source images to be fused and the fusion image are its arguments, W_1 and W_2 are weight matrices, and the weight matrices are applied to the images by element-wise multiplication.
Optionally, the calculation function expression of the gradient loss function based on the mean square error MSE of the structural similarity is:

[equation given as an image in the original]

in the above expression, MSE is the mean square error, the two types of source images to be fused and the fusion image are its arguments, and two weight matrices are applied to the images by element-wise multiplication.
Optionally, the calculation function expression of the intensity loss function is:

[equation given as an image in the original]

in the above expression, MSE is the mean square error, the two types of source images to be fused and the fusion image are its arguments, and two weight matrices are applied to the images by element-wise multiplication.
Optionally, the weight matrices used in the above loss functions are computed by calculation functions of the form:

[equations given as images in the original]

in the above expressions, a constant coefficient is used, Fgrad is the gradient operator, the tensors of the two high-dimensional image features are the inputs, Sigmoid is a normalization function, and Fmean computes the mean of a tensor. Two intermediate variables appearing in the expressions are defined by further calculation functions:

[equations given as images in the original]

where sqrt is the square root operation.
Optionally, the feature reconstruction network F performing feature reconstruction to obtain the fusion image comprises: the fusion feature is passed through a 3×3 convolution layer to obtain a tensor with 64 channels, an activated tensor is generated through the LeakyReLU activation function, the number of channels of the activated tensor is reduced to 1 through a 3×3 convolution layer, and finally the final fusion image is obtained through the LeakyReLU activation function.
Optionally, the feature extraction network D extracting high-dimensional image features comprises: the source image is passed through a 3×3 convolution layer to obtain a tensor with 32 channels, an activated tensor is generated through the LeakyReLU activation function, the number of channels is raised to 64 through a 3×3 convolution layer, and finally the high-dimensional image feature corresponding to the source image is obtained through the LeakyReLU activation function.
In addition, the invention also provides a multi-source image efficient fusion system which comprises a microprocessor and a memory which are connected with each other, wherein the microprocessor is programmed or configured to execute the multi-source image efficient fusion method.
Furthermore, the present invention provides a computer readable storage medium having stored therein a computer program for programming or configuring by a microprocessor to perform the multi-source image efficient fusion method.
Compared with the prior art, the invention has the following advantages:
1. The efficient multi-source image fusion network formed by the two feature extraction networks D and the feature reconstruction network F can effectively realize efficient multi-source image fusion. The high-dimensional image features are respectively input to the intermediate information layer L_i to generate the weights W_1 and W_2; the weights are respectively multiplied with the fusion image to obtain two weight fusion images and with the source images of the sample pair to obtain two weight source images; the value of the Loss function Loss designed according to the similarity between the weight fusion images and the weight source images is then calculated. An image fusion loss function guided by intermediate information is thereby realized, so the gradient and intensity information of the source images is well preserved, the generated fusion image has a good visual effect, and the generated image better retains the texture characteristics, intensity characteristics and the like of the fused images.
2. The efficient multi-source image fusion network formed by the two feature extraction networks D and the feature reconstruction network F is a lightweight network that avoids redundant structural design, has an extremely low computation and training cost, and is therefore suitable for the production and research of light and small-sized products.
3. The multi-source image high-efficiency fusion network formed by the two feature extraction networks D and the feature reconstruction network F can adapt to different types of pixel-level image fusion tasks only by training one model.
4. The invention realizes the image fusion loss function guided by the intermediate information, and the proposed mode of guiding the image generation by the intermediate information layer can be expanded into other network applications so as to enhance the extraction capability of the network for the concerned information.
5. The invention has the capability of processing data of different sources, and can realize different types of image fusion, including infrared visible light image fusion, multi-focus image fusion and multi-exposure image fusion.
Drawings
FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a method according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a multi-source image efficient fusion network according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of training principle of a multi-source image efficient fusion network in an embodiment of the invention.
Fig. 5 is a comparison of infrared and visible light image fusion results in an embodiment of the present invention.
Fig. 6 is a comparison of multi-focus image fusion results in an embodiment of the present invention.
Fig. 7 is a comparison of multi-exposure image fusion results in an embodiment of the present invention.
Detailed Description
As shown in fig. 1 and 2, the multi-source image efficient fusion method of the present embodiment includes the following training for a multi-source image efficient fusion network composed of two feature extraction networks D and one feature reconstruction network F:
S101, establishing a sample set of sample pairs of the two types of source images to be fused;
S102, the two source images of each sample pair are respectively input to the feature extraction network D to extract their high-dimensional image features;
S103, the high-dimensional image is characterized
Figure SMS_97
And->
Figure SMS_99
Splicing to obtain fusion characteristics->
Figure SMS_102
Fusion feature->
Figure SMS_96
The input feature fusion network F generates a fusion image +.>
Figure SMS_101
The method comprises the steps of carrying out a first treatment on the surface of the High-dimensional image features->
Figure SMS_103
And->
Figure SMS_104
Respectively input to the intermediate information layer L i Generating weight->
Figure SMS_98
And->
Figure SMS_100
S104, weighting
Figure SMS_106
And->
Figure SMS_108
Respectively->
Figure SMS_110
Multiplying to obtain two weight fusion images +.>
Figure SMS_107
And->
Figure SMS_109
The method comprises the steps of carrying out a first treatment on the surface of the Weight +.>
Figure SMS_111
And Source image->
Figure SMS_113
Multiplication, weight->
Figure SMS_105
And Source image->
Figure SMS_112
Multiplying to obtain two weight source images +.>
Figure SMS_114
And->
Figure SMS_115
S105, calculating a Loss function Loss value designed according to the similarity between the weight fusion image and the weight source image, and judging that the feature extraction network D and the feature fusion network F are trained if the Loss function Loss value meets a preset convergence condition; otherwise, the network parameters of the feature extraction network D and the feature fusion network F are adjusted, and the step S102 is skipped.
Based on the above training process, when the efficient multi-source image fusion network works, the two feature extraction networks D each receive one source image and extract its high-dimensional image features, and the feature reconstruction network F then fuses the high-dimensional image features extracted by the two feature extraction networks D to obtain the pixel-level image fusion result.
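As a minimal usage sketch (reusing the hypothetical d_net and f_net names from the training sketch above), inference then reduces to two passes through D and one pass through F, with no weights or loss involved:

```python
import torch

@torch.no_grad()
def fuse(d_net, f_net, i1, i2):
    # Two shared-weight passes through D, one pass through F
    return f_net(torch.cat([d_net(i1), d_net(i2)], dim=1))
```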
In this embodiment, the Loss function Loss designed in step S103 is a weighted sum of a gradient loss function and an intensity loss function, where the gradient loss function is itself a weighted sum of a gradient loss function based on the structural similarity index measure SSIM and a gradient loss function based on the mean square error MSE of the structural similarity; the structural similarity comprises the similarity between each weight fusion image and the corresponding weight source image. As a specific embodiment, the weighted sum of the gradient loss function and the intensity loss function is expressed as:

[equation given as an image in the original]

in which the gradient loss function and the intensity loss function are combined through a weight whose general value interval is (0.1 ~ 1); in this embodiment the value is 0.4.
The gradient loss function is the weighted sum of the gradient loss function based on the structural similarity index measure SSIM and the gradient loss function based on the mean square error MSE of the structural similarity, expressed as:

[equation given as an image in the original]

in which the two terms are combined through a weight whose general value interval is (10^-1 ~ 10^3); in this embodiment the value is 20.
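A minimal PyTorch-style sketch of how such a two-level weighted sum could be assembled is given below. The patent's own equations are shown only as images, so the exact placement of the two scalar weights is an assumption here: the sketch assumes the intensity term is scaled by 0.4 and the MSE gradient term by 20, and ssim_grad_loss, mse_grad_loss and intensity_loss are hypothetical names for the per-term losses described in the following paragraphs.

```python
def fusion_loss(i_fused, i1, i2, w1, w2, lam_int=0.4, lam_mse=20.0):
    # Gradient loss: SSIM-based term plus a weighted MSE-based term (weight placement assumed)
    l_grad = (ssim_grad_loss(i_fused, i1, i2, w1, w2)
              + lam_mse * mse_grad_loss(i_fused, i1, i2, w1, w2))
    # Total loss: gradient loss plus a weighted intensity term (weight placement assumed)
    return l_grad + lam_int * intensity_loss(i_fused, i1, i2, w1, w2)
```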
In this embodiment, the calculation function expression of the gradient loss function based on the structural similarity index measure SSIM is:

[equation given as an image in the original]

in the above expression, SSIM is the structural similarity index measure, the two types of source images to be fused and the fusion image are its arguments, W_1 and W_2 are weight matrices, and the weight matrices are applied to the images by element-wise multiplication.
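Purely as an illustration of the described structure (weight matrices applied element-wise to the fusion image and to each source image before an SSIM comparison), a plausible form is sketched below. The exact equation in the patent is not reproduced, so the 1 - SSIM formulation and the data range are assumptions; ssim refers to an existing implementation such as the one provided by the pytorch-msssim package, and w1, w2 are the weight matrices passed in as tensors.

```python
from pytorch_msssim import ssim  # assumed third-party SSIM implementation

def ssim_grad_loss(i_fused, i1, i2, w1, w2):
    # Compare each weighted fusion image with the corresponding weighted source image
    loss_1 = 1.0 - ssim(w1 * i_fused, w1 * i1, data_range=1.0)  # images assumed scaled to [0, 1]
    loss_2 = 1.0 - ssim(w2 * i_fused, w2 * i2, data_range=1.0)
    return loss_1 + loss_2
```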
In this embodiment, the calculation function expression of the gradient loss function based on the mean square error MSE of the structural similarity is:

[equation given as an image in the original]

in the above expression, MSE is the mean square error, the two types of source images to be fused and the fusion image are its arguments, and two weight matrices are applied to the images by element-wise multiplication.
In this embodiment, the calculation function expression of the intensity loss function is:

[equation given as an image in the original]

in the above expression, MSE is the mean square error, the two types of source images to be fused and the fusion image are its arguments, and two weight matrices are applied to the images by element-wise multiplication.
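For both MSE-based terms the text only states that weighted fusion images are compared with weighted source images by mean square error; the equations themselves appear as images in the original. Under that reading, a hedged sketch follows: the gradient variant additionally compares image gradients (approximated here with simple finite differences, an assumed stand-in for the patent's Fgrad), the intensity variant compares the weighted images directly, and the same pair of weight maps is reused for both terms even though the description lists separate weight matrices.

```python
import torch
import torch.nn.functional as F

def _grad_mag(x):
    # Finite-difference gradient magnitude, padded back to the input size
    dx = x[..., :, 1:] - x[..., :, :-1]
    dy = x[..., 1:, :] - x[..., :-1, :]
    return F.pad(dx.abs(), (0, 1, 0, 0)) + F.pad(dy.abs(), (0, 0, 0, 1))

def mse_grad_loss(i_fused, i1, i2, w1, w2):
    return (F.mse_loss(_grad_mag(w1 * i_fused), _grad_mag(w1 * i1))
            + F.mse_loss(_grad_mag(w2 * i_fused), _grad_mag(w2 * i2)))

def intensity_loss(i_fused, i1, i2, w1, w2):
    return (F.mse_loss(w1 * i_fused, w1 * i1)
            + F.mse_loss(w2 * i_fused, w2 * i2))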
In this embodiment, the weight matrices used in the above loss functions are computed by calculation functions of the form:

[equations given as images in the original]

in the above expressions, a constant coefficient is used, Fgrad is the gradient operator, the tensors of the two high-dimensional image features are the inputs, Sigmoid is a normalization function, and Fmean computes the mean of a tensor. The intermediate variables appearing in the expressions are defined by further calculation functions:

[equations given as images in the original]

where sqrt is the square root operation.
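The exact weight formulas are available only as images in the original, so no faithful reconstruction is attempted here. The sketch below merely illustrates, under explicit assumptions, how the named operators (a gradient operator, a channel mean, Sigmoid normalization and a square-root term) might be combined to turn a pair of feature tensors into spatial weight maps; every detail of the combination is hypothetical, and the _grad_mag helper from the previous sketch is reused.

```python
import torch

def feature_weights(feat1, feat2, k=1.0, eps=1e-6):
    # Hypothetical combination: per-pixel gradient energy of each feature map,
    # averaged over channels, balanced against the other source, then Sigmoid-normalized.
    g1 = _grad_mag(feat1).mean(dim=1, keepdim=True)   # Fgrad followed by a mean (assumed order)
    g2 = _grad_mag(feat2).mean(dim=1, keepdim=True)
    s1 = g1 / torch.sqrt(g1 * g1 + g2 * g2 + eps)      # sqrt-based balancing term (assumed)
    s2 = g2 / torch.sqrt(g1 * g1 + g2 * g2 + eps)
    w1 = torch.sigmoid(k * s1)                          # Sigmoid normalization with constant coefficient k
    w2 = torch.sigmoid(k * s2)
    return w1, w2
```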
The key point of the method of this embodiment is the joint training mode of the feature reconstruction network F and the feature extraction network D in the efficient multi-source image fusion network; the feature reconstruction network F and the feature extraction network D may adopt any suitable existing deep neural network structure as required. For example, as an alternative embodiment, as shown in fig. 3, the feature reconstruction network F performing feature reconstruction to obtain the fusion image comprises: the fusion feature is passed through a 3×3 convolution layer to obtain a tensor with 64 channels, an activated tensor is generated through the LeakyReLU activation function, the number of channels of the activated tensor is reduced to 1 through a 3×3 convolution layer, and finally the final fusion image is obtained through the LeakyReLU activation function.
For example, as an alternative embodiment, as shown in fig. 3, the feature extraction network D extracting high-dimensional image features comprises: the source image is passed through a 3×3 convolution layer to obtain a tensor with 32 channels, an activated tensor is generated through the LeakyReLU activation function, the number of channels is raised to 64 through a 3×3 convolution layer, and finally the high-dimensional image feature corresponding to the source image is obtained through the LeakyReLU activation function.
Corresponding to different source images, the method of the embodiment uses the feature extraction network D with the same set of network parameters to perform feature extraction to generate high-dimensional features, and then the high-dimensional features are combined and input into the feature reconstruction network F to generate a fusion image. The network structures of the feature reconstruction network F and the feature extraction network D are very simple.
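The two layer stacks described above translate directly into small convolutional modules. The sketch below follows the stated channel counts (1 to 32 to 64 for D, and 128 to 64 to 1 for F after splicing two 64-channel features); the padding and the LeakyReLU slope are not specified in the text and are assumptions here.

```python
import torch.nn as nn

class FeatureExtractionD(nn.Module):
    """Feature extraction network D: 3x3 conv (1->32) + LeakyReLU, 3x3 conv (32->64) + LeakyReLU."""
    def __init__(self, in_channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),  # padding assumed to preserve size
            nn.LeakyReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.LeakyReLU(),
        )

    def forward(self, x):
        return self.body(x)

class FeatureReconstructionF(nn.Module):
    """Feature reconstruction network F: 3x3 conv (128->64) + LeakyReLU, 3x3 conv (64->1) + LeakyReLU."""
    def __init__(self, in_channels=128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),
            nn.LeakyReLU(),
            nn.Conv2d(64, 1, kernel_size=3, padding=1),
            nn.LeakyReLU(),
        )

    def forward(self, x):
        return self.body(x)
```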
Fig. 4 is a schematic diagram of the training principle of the efficient multi-source image fusion network in this embodiment. Referring to FIG. 4, this embodiment uses an intermediate information layer: the high-dimensional image features are respectively input to the intermediate information layer L_i to generate the weights W_1 and W_2. The intermediate information layer is formed by combining a 1×1 convolution layer and a Tanh activation function.
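Following this description, the intermediate information layer can be written as a single 1×1 convolution followed by Tanh; the output channel count is not stated, so one output channel is assumed here so that the result can act as a spatial weight map.

```python
import torch.nn as nn

class IntermediateInfoLayer(nn.Module):
    """Intermediate information layer L_i: 1x1 convolution + Tanh (output channels assumed to be 1)."""
    def __init__(self, in_channels=64, out_channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=1),
            nn.Tanh(),
        )

    def forward(self, feat):
        return self.body(feat)
```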
In this embodiment, the multi-focus fusion dataset (Lytro) is expanded and used as training data for the network parameters. Since the handling of multi-channel and single-channel images has little influence on the structure of the efficient multi-source image fusion network of this embodiment, the dataset is converted into single-channel gray-scale images, which are cropped into 128×128 pixel patches as input, and 10% of the total input data is taken as the validation set. If a multi-channel-input or multi-channel-output network needs to be trained, the number of input channels (n_channels) and the number of categories (n_categories) of the network parameters only need to be changed to appropriate values. The learning rate is 1e-4 and is updated by a callback function (ReduceLROnPlateau). The batch size during training is set to 32.
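A configuration sketch matching these settings is shown below, reusing the hypothetical helpers from the earlier sketches (train_step, fusion_loss, fuse). The optimizer is not named in the text, so Adam is an assumption, as are the ReduceLROnPlateau arguments, the epoch count and the dataset objects.

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau
from torch.utils.data import DataLoader

def train(d_net, f_net, l_i, train_set, val_set, num_epochs):
    # train_set / val_set are assumed to yield pairs of 128x128 single-channel crops (90% / 10% split)
    train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=32)

    params = list(d_net.parameters()) + list(f_net.parameters()) + list(l_i.parameters())
    optimizer = torch.optim.Adam(params, lr=1e-4)          # optimizer choice is an assumption
    scheduler = ReduceLROnPlateau(optimizer, mode="min")    # the ReduceLROnPlateau callback from the text

    for _ in range(num_epochs):
        for i1, i2 in train_loader:
            train_step(d_net, f_net, l_i, fusion_loss, optimizer, i1, i2)
        # Validation loss drives the learning-rate callback
        with torch.no_grad():
            val_loss = sum(fusion_loss(fuse(d_net, f_net, a, b), a, b,
                                       l_i(d_net(a)), l_i(d_net(b))).item()
                           for a, b in val_loader) / len(val_loader)
        scheduler.step(val_loss)
```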
To further verify the effectiveness of the method of this example, three typical fusion methods, U2Fusion, IFCNN and DenseFuse, were used for comparison with the method of this example in efficient multi-source image fusion experiments. Experiments were performed on an NVIDIA Quadro RTX 6000 GPU and a 2.10 GHz Intel Xeon Silver 4216 CPU using the multi-focus fusion dataset Lytro as training data. In order to verify the effectiveness of the unified fusion of this embodiment, three different types of fusion task were selected for comparative verification: multi-modal fusion, multi-focus fusion and multi-exposure fusion.
For multimodal fusion, the present example selects the most representative visible and infrared fusion dataset TNO. For the visible and infrared fusion dataset TNO, this example selected 21 pairs of typical images from the entire scene, and the experimental results obtained are shown in table 1 and fig. 5.
Table 1 objective performance metrics for the present example method and three exemplary multi-modal fusion methods.
[table given as an image in the original]
Further, the present example verifies multi-focus fusion by using 13 pairs of images from MFFW dataset, and the experimental results obtained are shown in table 2 and fig. 6.
Table 2 objective performance metrics for the present example method and three exemplary multi-focus fusion methods.
[table given as an image in the original]
For multi-exposure fusion, this embodiment selects 20 suitable image pairs from the EMPA-HDR dataset. In order to accommodate device limitations, the EMPA-HDR images were downsampled four times; the experimental results obtained are shown in Table 3 and FIG. 7.
Table 3 objective performance metrics for the present example method and three typical multi-exposure fusion methods.
[table given as an image in the original]
In Tables 1 to 3, the index (EI) reflects the gradient information extraction capability of the corresponding method; the larger the index, the better the effect. The entropy index (EN) measures the detail features of the image; the larger the index, the better the effect. The mutual information index (MI) measures the similarity between images; the larger the index, the better the effect. The index (TIME) reflects the time in seconds required by the different methods to fuse; the smaller the index, the better. The index (FPS) reflects the number of image frames that can be generated per second and thus the efficiency of the algorithm; the larger the index, the better the effect. As can be seen from Table 1, all objective evaluation indexes of the method proposed in this example are superior to those of the other methods. The index (EI) in Tables 2 and 3 is slightly lower than that of IFCNN, while the other indexes remain clearly ahead. This is because the intermediate-information-guided loss function used by the network greatly strengthens the network's capability to process intensity information and gradient information, so that the network can realize image feature extraction and image reconstruction with only a very small number of convolutions, achieving efficient and accurate pixel-level image fusion.
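For reference, the entropy metric (EN) mentioned here is commonly computed from the gray-level histogram of the fused image; a straightforward version is sketched below. The patent does not give its exact metric implementations, so this is only the standard definition.

```python
import numpy as np

def image_entropy(img_u8):
    """EN: Shannon entropy of an 8-bit gray-scale image's histogram, in bits."""
    hist, _ = np.histogram(img_u8, bins=256, range=(0, 256))
    p = hist.astype(np.float64) / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())
```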
Figs. 5, 6 and 7 compare the three typical fusion methods U2Fusion, IFCNN and DenseFuse with the method of this example (four methods in total) on multi-modal fusion, multi-focus fusion and multi-exposure fusion respectively.
In fig. 5, A is an infrared source image, B is a visible light source image, C is the fusion image obtained by the U2Fusion method, D is the fusion image obtained by the IFCNN method, E is the fusion image obtained by the DenseFuse method, and F is the fusion image obtained by the method proposed in this embodiment. As can be seen from fig. 5, the three typical fusion methods U2Fusion, IFCNN and DenseFuse preserve the texture and intensity of the background and the target poorly, while the fusion image obtained by the method proposed in this embodiment has the best quality.
In fig. 6, A is a far-focus source image, B is a near-focus source image, C is the fusion image obtained by the U2Fusion method, D is the fusion image obtained by the IFCNN method, E is the fusion image obtained by the DenseFuse method, and F is the fusion image obtained by the method proposed in this embodiment. As can be seen from fig. 6, the fusion image obtained by the method of this embodiment has the best edge preservation and can complement the gradient advantage information of the different source images, while the fusion effect of the three typical fusion methods U2Fusion, IFCNN and DenseFuse is only average.
In fig. 7, A is a high-exposure source image, B is a low-exposure source image, C is the fusion image obtained by the U2Fusion method, D is the fusion image obtained by the IFCNN method, E is the fusion image obtained by the DenseFuse method, and F is the fusion image obtained by the method proposed in this embodiment. As can be seen from fig. 7, the images obtained by the three typical fusion methods U2Fusion, IFCNN and DenseFuse are only average in image restoration and texture preservation and show obvious flaws, whereas the fusion image obtained by the method proposed in this embodiment performs well in both aspects and has the best quality.
In summary, the method of this embodiment uses a convolutional neural network to extract features from the input images, applies intermediate-information-guided optimization to the extracted features to generate corresponding weights, and assigns those weights to the loss function of the network. The source images to be fused are input to the network to generate high-dimensional features of the corresponding source images; these high-dimensional features are subsequently merged and used to reconstruct the image, while their intensity and gradient information is extracted and incorporated into the loss function to calculate the similarity between the generated image and the source images. The image thus reconstructed is the fusion image. The method does not need excessive training: it only needs to be trained on a multi-focus dataset and is then suitable for different types of fusion task. Compared with other advanced fusion methods, the fusion images generated by the fusion method used in this embodiment have higher objective performance indexes and better visual effects, and, more importantly, the method has extremely high efficiency and extremely strong universality and robustness. The method of this embodiment has the advantages of high fusion speed, good visual effect, obvious texture information, high structure retention, strong universality and the like.
In addition, the embodiment also provides a multi-source image efficient fusion system, which comprises a microprocessor and a memory which are connected with each other, wherein the microprocessor is programmed or configured to execute the multi-source image efficient fusion method.
Furthermore, the present embodiment also provides a computer-readable storage medium having stored therein a computer program for programming or configuring by a microprocessor to perform the multi-source image efficient fusion method.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to the present invention may occur to one skilled in the art without departing from the principles of the present invention and are intended to be within the scope of the present invention.

Claims (10)

1. An efficient multi-source image fusion method, characterized by comprising the following training of an efficient multi-source image fusion network consisting of two feature extraction networks D and one feature reconstruction network F:
S101, establishing a sample set of sample pairs of the two types of source images to be fused;
S102, the two source images of each sample pair are respectively input to the feature extraction network D to extract their high-dimensional image features;
S103, the high-dimensional image is characterized
Figure QLYQS_7
And->
Figure QLYQS_10
Splicing to obtain fusion characteristics->
Figure QLYQS_12
Fusion feature->
Figure QLYQS_9
The input feature fusion network F generates a fusion image +.>
Figure QLYQS_11
The method comprises the steps of carrying out a first treatment on the surface of the High-dimensional image features->
Figure QLYQS_13
And->
Figure QLYQS_15
Respectively input to the intermediate information layer L i Generating weight->
Figure QLYQS_8
And->
Figure QLYQS_14
S104, weighting
Figure QLYQS_17
And->
Figure QLYQS_20
Respectively->
Figure QLYQS_24
Multiplying to obtain two weight fusion images +.>
Figure QLYQS_18
And->
Figure QLYQS_19
The method comprises the steps of carrying out a first treatment on the surface of the Weight +.>
Figure QLYQS_22
And Source image->
Figure QLYQS_23
Multiplication, weight->
Figure QLYQS_16
And Source image->
Figure QLYQS_21
Multiplying to obtain two weight source images +.>
Figure QLYQS_25
And->
Figure QLYQS_26
S105, calculating a Loss function Loss value designed according to the similarity between the weight fusion image and the weight source image, and judging that the feature extraction network D and the feature fusion network F are trained if the Loss function Loss value meets a preset convergence condition; otherwise, the network parameters of the feature extraction network D and the feature fusion network F are adjusted, and the step S102 is skipped.
2. The efficient multi-source image fusion method according to claim 1, wherein the Loss function Loss designed in step S103 is a weighted sum of a gradient loss function and an intensity loss function, wherein the gradient loss function is itself a weighted sum of a gradient loss function based on the structural similarity index measure SSIM and a gradient loss function based on the mean square error MSE of the structural similarity, the structural similarity comprising the similarity between each weight fusion image and the corresponding weight source image.
3. The efficient multi-source image fusion method according to claim 2, wherein the calculation function expression of the gradient loss function based on the structural similarity index measure SSIM is:

[equation given as an image in the original]

in the above expression, SSIM is the structural similarity index measure, the two types of source images to be fused and the fusion image are its arguments, W_1 and W_2 are weight matrices, and the weight matrices are applied to the images by element-wise multiplication.
4. The efficient multi-source image fusion method according to claim 3, wherein the calculation function expression of the gradient loss function based on the mean square error MSE of the structural similarity is:

[equation given as an image in the original]

in the above expression, MSE is the mean square error, the two types of source images to be fused and the fusion image are its arguments, and two weight matrices are applied to the images by element-wise multiplication.
5. The efficient multi-source image fusion method according to claim 4, wherein the calculation function expression of the intensity loss function is:

[equation given as an image in the original]

in the above expression, MSE is the mean square error, the two types of source images to be fused and the fusion image are its arguments, and two weight matrices are applied to the images by element-wise multiplication.
6. The efficient multi-source image fusion method according to claim 5, wherein the weight matrices are computed by calculation functions of the form:

[equations given as images in the original]

in the above expressions, a constant coefficient is used, Fgrad is the gradient operator, the tensors of the two high-dimensional image features are the inputs, Sigmoid is a normalization function, and Fmean computes the mean of a tensor; the intermediate variables appearing in the expressions are defined by further calculation functions:

[equations given as images in the original]

where sqrt is the square root operation.
7. The efficient multi-source image fusion method according to claim 1, wherein the feature reconstruction network F performing feature reconstruction to obtain the fusion image comprises: the fusion feature is passed through a 3×3 convolution layer to obtain a tensor with 64 channels, an activated tensor is generated through the LeakyReLU activation function, the number of channels of the activated tensor is reduced to 1 through a 3×3 convolution layer, and finally the final fusion image is obtained through the LeakyReLU activation function.
8. The efficient multi-source image fusion method according to claim 1, wherein the feature extraction network D extracting high-dimensional image features comprises: the source image is passed through a 3×3 convolution layer to obtain a tensor with 32 channels, an activated tensor is generated through the LeakyReLU activation function, the number of channels is raised to 64 through a 3×3 convolution layer, and finally the high-dimensional image feature corresponding to the source image is obtained through the LeakyReLU activation function.
9. A multi-source image efficient fusion system comprising a microprocessor and a memory interconnected, wherein the microprocessor is programmed or configured to perform the multi-source image efficient fusion method of any one of claims 1-8.
10. A computer readable storage medium having a computer program stored therein, wherein the computer program is for programming or configuring by a microprocessor to perform the multi-source image efficient fusion method of any one of claims 1-8.
CN202310614277.2A 2023-05-29 2023-05-29 Efficient multi-source image fusion method, system and medium Active CN116342455B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310614277.2A CN116342455B (en) 2023-05-29 2023-05-29 Efficient multi-source image fusion method, system and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310614277.2A CN116342455B (en) 2023-05-29 2023-05-29 Efficient multi-source image fusion method, system and medium

Publications (2)

Publication Number Publication Date
CN116342455A true CN116342455A (en) 2023-06-27
CN116342455B CN116342455B (en) 2023-08-08

Family

ID=86886215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310614277.2A Active CN116342455B (en) 2023-05-29 2023-05-29 Efficient multi-source image fusion method, system and medium

Country Status (1)

Country Link
CN (1) CN116342455B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668648A (en) * 2020-12-29 2021-04-16 西安电子科技大学 Infrared and visible light fusion identification method based on symmetric fusion network
CN114331931A (en) * 2021-11-26 2022-04-12 西安邮电大学 High dynamic range multi-exposure image fusion model and method based on attention mechanism
CN114529794A (en) * 2022-04-20 2022-05-24 湖南大学 Infrared and visible light image fusion method, system and medium
CN116109538A (en) * 2023-03-23 2023-05-12 广东工业大学 Image fusion method based on simple gate unit feature extraction

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668648A (en) * 2020-12-29 2021-04-16 西安电子科技大学 Infrared and visible light fusion identification method based on symmetric fusion network
CN114331931A (en) * 2021-11-26 2022-04-12 西安邮电大学 High dynamic range multi-exposure image fusion model and method based on attention mechanism
CN114529794A (en) * 2022-04-20 2022-05-24 湖南大学 Infrared and visible light image fusion method, system and medium
CN116109538A (en) * 2023-03-23 2023-05-12 广东工业大学 Image fusion method based on simple gate unit feature extraction

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HAN XU et al.: "U2Fusion: A Unified Unsupervised Image Fusion Network", IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 502-518
RENWEI DIAN et al.: "Hyperspectral and Multispectral Image Fusion Via Self-Supervised Loss and Separable Loss", IEEE

Also Published As

Publication number Publication date
CN116342455B (en) 2023-08-08

Similar Documents

Publication Publication Date Title
JP6980958B1 (en) Rural area classification garbage identification method based on deep learning
CN107977932B (en) Face image super-resolution reconstruction method based on discriminable attribute constraint generation countermeasure network
CN111340814B (en) RGB-D image semantic segmentation method based on multi-mode self-adaptive convolution
CN113628249B (en) RGBT target tracking method based on cross-modal attention mechanism and twin structure
CN112819910A (en) Hyperspectral image reconstruction method based on double-ghost attention machine mechanism network
CN109005398B (en) Stereo image parallax matching method based on convolutional neural network
CN114283158A (en) Retinal blood vessel image segmentation method and device and computer equipment
CN115018727B (en) Multi-scale image restoration method, storage medium and terminal
CN111860528B (en) Image segmentation model based on improved U-Net network and training method
CN115311186B (en) Cross-scale attention confrontation fusion method and terminal for infrared and visible light images
CN113870286B (en) Foreground segmentation method based on multi-level feature and mask fusion
CN116757986A (en) Infrared and visible light image fusion method and device
CN113066065B (en) No-reference image quality detection method, system, terminal and medium
CN112419191B (en) Image motion blur removing method based on convolution neural network
CN118379288B (en) Embryo prokaryotic target counting method based on fuzzy rejection and multi-focus image fusion
KS et al. Deep multi-stage learning for hdr with large object motions
CN117456330A (en) MSFAF-Net-based low-illumination target detection method
CN116152128A (en) High dynamic range multi-exposure image fusion model and method based on attention mechanism
CN115861094A (en) Lightweight GAN underwater image enhancement model fused with attention mechanism
CN118212240A (en) Automobile gear production defect detection method
Zhou et al. Boundary-guided lightweight semantic segmentation with multi-scale semantic context
CN117830900A (en) Unsupervised video object segmentation method
CN116403064B (en) Picture processing method, system, equipment and medium
CN116342455B (en) Efficient multi-source image fusion method, system and medium
Zhao et al. A simple yet effective pipeline for radial distortion correction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant