CN114419328A - Image fusion method and system based on an adaptive-enhancement generative adversarial network - Google Patents

Image fusion method and system based on an adaptive-enhancement generative adversarial network

Info

Publication number
CN114419328A
CN114419328A (application CN202210071844.XA / CN202210071844A)
Authority
CN
China
Prior art keywords
image
fusion
source
network
adaptive
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210071844.XA
Other languages
Chinese (zh)
Other versions
CN114419328B (en)
Inventor
张聪炫
单长鲁
陈震
卢锋
葛利跃
陈昊
秦文健
李凌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang Hangkong University
Original Assignee
Nanchang Hangkong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang Hangkong University
Priority to CN202210071844.XA
Publication of CN114419328A
Application granted
Publication of CN114419328B
Legal status: Active
Anticipated expiration

Classifications

    • G06F18/214 Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/22 Pattern recognition; matching criteria, e.g. proximity measures
    • G06F18/253 Pattern recognition; fusion techniques of extracted features
    • G06N3/045 Neural networks; combinations of networks
    • G06N3/048 Neural networks; activation functions
    • G06N3/084 Neural network learning methods; backpropagation, e.g. using gradient descent
    • G06T3/4038 Image scaling; image mosaicing, e.g. composing plane images from plane sub-images
    • G06T5/90 Image enhancement or restoration; dynamic range modification of images or parts thereof
    • G06T2207/20004 Special algorithmic details; adaptive image processing
    • G06T2207/20081 Special algorithmic details; training, learning
    • G06T2207/20084 Special algorithmic details; artificial neural networks [ANN]
    • Y02T10/40 Climate change mitigation in transportation; engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an image fusion method and system based on an adaptive-enhancement generative adversarial network. The method comprises: constructing a dense detail feature extraction network and extracting features from each source image with dense convolutions and a detail information compensation mechanism; constructing a dual-channel adaptive fusion network to obtain a fused source feature map and a fused detail information feature map; concatenating the fused feature maps and applying a 1 x 1 convolutional network for cross-channel interaction and information fusion; decoding the fused feature map to obtain the fused image; and adding an adaptive structural similarity loss function when training the whole network model. The detail information compensation mechanism enhances the details of the fused image and reduces information loss, the dual-channel adaptive fusion network balances infrared and visible-light information in the fused image along the channel dimension, and the adaptive structural similarity loss function adaptively strengthens the similarity between the fused image and the source images along the spatial dimension, so the quality of the fused image is improved.

Description

Image fusion method and system based on an adaptive-enhancement generative adversarial network
Technical Field
The invention relates to the technical field of image fusion, and in particular to an image fusion method and system based on an adaptive-enhancement generative adversarial network.
Background
Image fusion is an important technique in image processing. Its main goal is to integrate the salient features extracted from each source image, because a single infrared or visible-light sensor cannot capture complete scene information. In general, the visible image carries abundant high-spatial-resolution detail, while the infrared image contains the thermal radiation of the target, so fusing the complementary information of the infrared and visible images into a new composite image is important for handling different tasks. State-of-the-art fusion algorithms are widely used in applications such as autonomous driving, visual tracking and video surveillance. Fusion algorithms can be broadly classified into two categories: conventional methods and deep-learning-based methods. In recent years, deep-learning-based methods have shown great potential in image fusion tasks and are considered capable of outperforming traditional algorithms.
At present, a major obstacle for deep-learning-based image fusion is the absence of ground truth: an end-to-end network can only be trained either in an unsupervised manner or in a supervised manner that treats the two source images as ground truth. However, with unsupervised training the end-to-end network loses much information when extracting features, while supervised training against the two ground-truth images produces an unbalanced distribution of information in the fused image.
Disclosure of Invention
The invention aims to provide an image fusion method and system based on an adaptive-enhancement generative adversarial network, so as to solve the problems of information loss and unbalanced distribution of effective information in existing deep-learning-based image fusion methods.
To achieve the above object, the invention provides the following solution:
an image fusion method for generating a countermeasure network based on adaptive enhancement comprises the following steps:
acquiring a source image; the source image comprises an infrared image and a visible light image;
combining the dense convolutional network with a detail information compensation mechanism to construct a dense detail feature extraction network;
performing feature extraction on the source image based on the dense detail feature extraction network to obtain a source feature map and a detail information feature map of the source image; the source feature map of the source image comprises a source feature map of an infrared image and a source feature map of a visible light image; the detail information characteristic diagram comprises a detail information characteristic diagram of an infrared image and a detail information characteristic diagram of a visible light image;
constructing a two-channel self-adaptive fusion network based on a two-channel maximum pooling self-adaptive fusion mechanism and a two-channel average pooling self-adaptive fusion mechanism;
performing two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion source feature map;
performing two-channel average pooling self-adaptive fusion on the detail information characteristic graph of the source image based on the two-channel self-adaptive fusion network to obtain a fusion detail information characteristic graph;
splicing the fusion source characteristic diagram and the fusion detail information characteristic diagram by adopting the dual-channel self-adaptive fusion network to obtain a spliced characteristic diagram;
inputting the spliced feature map into a 1 x 1 convolution network to realize cross-channel information interaction and information fusion of the feature map, and obtaining a fused feature map;
decoding the fused feature map by adopting a decoding network to obtain a fused image;
sequentially connecting the dense detail feature extraction network, the two-channel self-adaptive fusion network, the 1 x 1 convolution network and the decoding network to form a self-adaptive enhancement generation countermeasure network;
constructing a self-adaptive structural similarity loss function according to the brightness similarity, the contrast similarity and the structural similarity of the fusion image and the source image;
training the network parameters of the adaptive enhancement generation countermeasure network through back propagation based on the adaptive structure similarity loss function, and generating a trained adaptive enhancement generation countermeasure network;
and adopting the trained adaptive enhancement to generate a countermeasure network for image fusion of the infrared image and the visible light image.
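For illustration only, the generator pipeline formed by these steps can be summarized in the following minimal PyTorch-style sketch; the module names, channel widths and the assumption that the encoder weights are shared between the two source images are choices of this sketch, not limitations of the invention, and the discriminator used for adversarial training is omitted.

```python
import torch
import torch.nn as nn

class FusionGenerator(nn.Module):
    """Generator pipeline: encoder -> dual-channel adaptive fusion -> 1 x 1 convolution -> decoder."""

    def __init__(self, encoder, fusion, decoder, concat_channels=128, out_channels=64):
        super().__init__()
        self.encoder = encoder      # dense detail feature extraction network (shared for both inputs)
        self.fusion = fusion        # dual-channel adaptive fusion network
        self.decoder = decoder      # decoding network that produces the fused image
        # 1 x 1 convolution for cross-channel interaction on the concatenated fused maps
        self.conv1x1 = nn.Conv2d(concat_channels, out_channels, kernel_size=1)

    def forward(self, ir, vi):
        # source feature maps and detail information feature maps of each source image
        ir_src, ir_detail = self.encoder(ir)
        vi_src, vi_detail = self.encoder(vi)
        # max-pooling fusion of the source maps and average-pooling fusion of the detail maps
        fused_src, fused_detail = self.fusion(ir_src, vi_src, ir_detail, vi_detail)
        # concatenate along channels, mix with the 1 x 1 convolution, then decode
        mixed = self.conv1x1(torch.cat([fused_src, fused_detail], dim=1))
        return self.decoder(mixed)
```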
Optionally, performing feature extraction on the source images with the dense detail feature extraction network to obtain the source feature maps and detail information feature maps of the source images specifically comprises:
performing feature extraction on the source images with the dense detail feature extraction network formulas p_i = conv_i(x_1, cat(…, conv_2(cat(x_{i-2}, conv_1(cat(x_{i-1}, x_i)))))) and y_i = cat(p_i, broadcast(x) - p_i) to obtain the source feature maps and detail information feature maps of the source images; where x denotes a source image; x_i denotes the i-th layer feature map of the dense detail feature extraction network; cat() denotes concatenating the bracketed feature maps along the feature channel; conv_i() denotes the i-th layer convolution operation that extracts features from the concatenated feature maps; p_i denotes the extracted i-th layer source feature map; broadcast(x) - p_i denotes the i-th layer detail information feature map corresponding to the source feature map p_i, where broadcast(x) means that a broadcasting mechanism automatically expands the dimensions of the source image x; and y_i denotes all feature maps of the i-th layer of the dense detail feature extraction network.
Optionally, performing dual-channel max-pooling adaptive fusion on the source feature maps of the source images with the dual-channel adaptive fusion network to obtain the fused source feature map specifically comprises:
performing dual-channel max-pooling adaptive fusion on the source feature maps with the dual-channel adaptive fusion network according to
X° = σ(conv_ir(max(X_ir))) · X_ir + σ(conv_vi(max(X_vi))) · X_vi
to obtain the fused source feature map; where X_ir denotes the source feature map of the infrared image and X_vi denotes the source feature map of the visible light image; max() denotes taking the maximum of the bracketed feature map along the channel dimension; conv_ir() denotes a convolution operation on the source feature map of the infrared image and conv_vi() denotes a convolution operation on the source feature map of the visible light image; σ() denotes the sigmoid operation; and X° denotes the fused source feature map obtained after weighting by the dual-channel max-pooling adaptive fusion mechanism.
Optionally, performing dual-channel average-pooling adaptive fusion on the detail information feature maps of the source images with the dual-channel adaptive fusion network to obtain the fused detail information feature map specifically comprises:
performing dual-channel average-pooling adaptive fusion on the detail information feature maps with the dual-channel adaptive fusion network according to
X_d = σ(conv_ir^d(mean(X_ir^d))) · X_ir^d + σ(conv_vi^d(mean(X_vi^d))) · X_vi^d
to obtain the fused detail information feature map; where X_ir^d denotes the detail information feature map of the infrared image and X_vi^d denotes the detail information feature map of the visible light image; mean() denotes averaging the bracketed feature map along the channel dimension; conv_ir^d() denotes a convolution operation on the detail information feature map of the infrared image and conv_vi^d() denotes a convolution operation on the detail information feature map of the visible light image; σ() denotes the sigmoid operation; and X_d denotes the fused detail information feature map obtained after weighting by the dual-channel average-pooling adaptive fusion mechanism.
Optionally, constructing the adaptive structural similarity loss function from the luminance similarity, contrast similarity and structural similarity between the fused image and the source images specifically comprises:
calculating the luminance similarity l(x, y) between the fused image and a source image as l(x, y) = (2 μ_x μ_y + C_1) / (μ_x² + μ_y² + C_1), where μ_x denotes the mean pixel intensity of a sliding window of the source image x, μ_y denotes the mean pixel intensity of the sliding window of the fused image y, and C_1 is a very small number;
calculating the contrast similarity c(x, y) between the fused image and the source image as c(x, y) = (2 σ_x σ_y + C_2) / (σ_x² + σ_y² + C_2), where σ_x denotes the standard deviation of the source image x, σ_y denotes the standard deviation of the fused image y, and C_2 is a very small number;
calculating the structure similarity s(x, y) between the fused image and the source image as s(x, y) = (σ_xy + C_3) / (σ_x σ_y + C_3), where σ_xy denotes the covariance of the source image x and the fused image y, and C_3 is a very small number;
calculating the structural similarity ssim(x, y) between the fused image and the source image from the luminance similarity l(x, y), the contrast similarity c(x, y) and the structure similarity s(x, y) as ssim(x, y) = l(x, y) · c(x, y) · s(x, y);
constructing the adaptive structural similarity loss function from the structural similarity ssim(x, y) of the fused image and the source images as
SSIM = σ(mean(vi_w) - mean(ir_w)) · ssim(vi_w, f_w) + σ(mean(ir_w) - mean(vi_w)) · ssim(ir_w, f_w),
where vi_w denotes a visible image block in a sliding window, ir_w denotes the infrared image block in the sliding window, and f_w denotes the fused image block in the sliding window; ssim(vi_w, f_w) denotes the structural similarity between the fused image block and the visible image block in the sliding window, and ssim(ir_w, f_w) denotes the structural similarity between the fused image block and the infrared image block in the sliding window; mean(vi_w) denotes the pixel average of the visible image block in the sliding window and mean(ir_w) denotes the pixel average of the infrared image block in the sliding window; and SSIM denotes the value of the final structural similarity loss function in the sliding window.
An image fusion system based on an adaptive-enhancement generative adversarial network comprises:
a source image acquisition module, configured to acquire source images, the source images comprising an infrared image and a visible light image;
a dense detail feature extraction network construction module, configured to combine a dense convolutional network with a detail information compensation mechanism to construct a dense detail feature extraction network;
a feature extraction module, configured to perform feature extraction on the source images with the dense detail feature extraction network to obtain source feature maps and detail information feature maps of the source images, the source feature maps comprising a source feature map of the infrared image and a source feature map of the visible light image, and the detail information feature maps comprising a detail information feature map of the infrared image and a detail information feature map of the visible light image;
a dual-channel adaptive fusion network construction module, configured to construct a dual-channel adaptive fusion network based on a dual-channel max-pooling adaptive fusion mechanism and a dual-channel average-pooling adaptive fusion mechanism;
a source feature map fusion module, configured to perform dual-channel max-pooling adaptive fusion on the source feature maps with the dual-channel adaptive fusion network to obtain a fused source feature map;
a detail information feature map fusion module, configured to perform dual-channel average-pooling adaptive fusion on the detail information feature maps with the dual-channel adaptive fusion network to obtain a fused detail information feature map;
a fused feature map concatenation module, configured to concatenate the fused source feature map and the fused detail information feature map in the dual-channel adaptive fusion network to obtain a concatenated feature map;
a convolutional network fusion module, configured to feed the concatenated feature map into a 1 x 1 convolutional network to realize cross-channel information interaction and information fusion, obtaining a fused feature map;
a feature map decoding module, configured to decode the fused feature map with a decoding network to obtain a fused image;
an adaptive-enhancement generative adversarial network construction module, configured to connect the dense detail feature extraction network, the dual-channel adaptive fusion network, the 1 x 1 convolutional network and the decoding network in sequence to form an adaptive-enhancement generative adversarial network;
an adaptive structural similarity loss function construction module, configured to construct an adaptive structural similarity loss function from the luminance similarity, contrast similarity and structural similarity between the fused image and the source images;
an adaptive-enhancement generative adversarial network training module, configured to train the network parameters of the adaptive-enhancement generative adversarial network through back propagation based on the adaptive structural similarity loss function to obtain a trained adaptive-enhancement generative adversarial network;
and an image fusion module, configured to perform image fusion of infrared and visible light images with the trained adaptive-enhancement generative adversarial network.
Optionally, the feature extraction module specifically comprises:
a feature extraction unit, configured to perform feature extraction on the source images with the dense detail feature extraction network formulas p_i = conv_i(x_1, cat(…, conv_2(cat(x_{i-2}, conv_1(cat(x_{i-1}, x_i)))))) and y_i = cat(p_i, broadcast(x) - p_i) to obtain the source feature maps and detail information feature maps of the source images; where x denotes a source image; x_i denotes the i-th layer feature map of the dense detail feature extraction network; cat() denotes concatenating the bracketed feature maps along the feature channel; conv_i() denotes the i-th layer convolution operation that extracts features from the concatenated feature maps; p_i denotes the extracted i-th layer source feature map; broadcast(x) - p_i denotes the i-th layer detail information feature map corresponding to the source feature map p_i, where broadcast(x) means that a broadcasting mechanism automatically expands the dimensions of the source image x; and y_i denotes all feature maps of the i-th layer of the dense detail feature extraction network.
Optionally, the source feature map fusion module specifically comprises:
a source feature map fusion unit, configured to perform dual-channel max-pooling adaptive fusion on the source feature maps with the dual-channel adaptive fusion network according to
X° = σ(conv_ir(max(X_ir))) · X_ir + σ(conv_vi(max(X_vi))) · X_vi
to obtain the fused source feature map; where X_ir denotes the source feature map of the infrared image and X_vi denotes the source feature map of the visible light image; max() denotes taking the maximum of the bracketed feature map along the channel dimension; conv_ir() denotes a convolution operation on the source feature map of the infrared image and conv_vi() denotes a convolution operation on the source feature map of the visible light image; σ() denotes the sigmoid operation; and X° denotes the fused source feature map obtained after weighting by the dual-channel max-pooling adaptive fusion mechanism.
Optionally, the detail information feature map fusion module specifically comprises:
a detail information feature map fusion unit, configured to perform dual-channel average-pooling adaptive fusion on the detail information feature maps with the dual-channel adaptive fusion network according to
X_d = σ(conv_ir^d(mean(X_ir^d))) · X_ir^d + σ(conv_vi^d(mean(X_vi^d))) · X_vi^d
to obtain the fused detail information feature map; where X_ir^d denotes the detail information feature map of the infrared image and X_vi^d denotes the detail information feature map of the visible light image; mean() denotes averaging the bracketed feature map along the channel dimension; conv_ir^d() denotes a convolution operation on the detail information feature map of the infrared image and conv_vi^d() denotes a convolution operation on the detail information feature map of the visible light image; σ() denotes the sigmoid operation; and X_d denotes the fused detail information feature map obtained after weighting by the dual-channel average-pooling adaptive fusion mechanism.
Optionally, the adaptive structural similarity loss function construction module specifically comprises:
a luminance similarity calculation unit, configured to calculate the luminance similarity l(x, y) between the fused image and a source image as l(x, y) = (2 μ_x μ_y + C_1) / (μ_x² + μ_y² + C_1), where μ_x denotes the mean pixel intensity of a sliding window of the source image x, μ_y denotes the mean pixel intensity of the sliding window of the fused image y, and C_1 is a very small number;
a contrast similarity calculation unit, configured to calculate the contrast similarity c(x, y) between the fused image and the source image as c(x, y) = (2 σ_x σ_y + C_2) / (σ_x² + σ_y² + C_2), where σ_x denotes the standard deviation of the source image x, σ_y denotes the standard deviation of the fused image y, and C_2 is a very small number;
a structure similarity calculation unit, configured to calculate the structure similarity s(x, y) between the fused image and the source image as s(x, y) = (σ_xy + C_3) / (σ_x σ_y + C_3), where σ_xy denotes the covariance of the source image x and the fused image y, and C_3 is a very small number;
a structural similarity calculation unit, configured to calculate the structural similarity ssim(x, y) between the fused image and the source image from the luminance similarity l(x, y), the contrast similarity c(x, y) and the structure similarity s(x, y) as ssim(x, y) = l(x, y) · c(x, y) · s(x, y);
an adaptive structural similarity loss function construction unit, configured to construct the adaptive structural similarity loss function from the structural similarity ssim(x, y) of the fused image and the source images as
SSIM = σ(mean(vi_w) - mean(ir_w)) · ssim(vi_w, f_w) + σ(mean(ir_w) - mean(vi_w)) · ssim(ir_w, f_w),
where vi_w denotes a visible image block in a sliding window, ir_w denotes the infrared image block in the sliding window, and f_w denotes the fused image block in the sliding window; ssim(vi_w, f_w) denotes the structural similarity between the fused image block and the visible image block in the sliding window, and ssim(ir_w, f_w) denotes the structural similarity between the fused image block and the infrared image block in the sliding window; mean(vi_w) denotes the pixel average of the visible image block in the sliding window and mean(ir_w) denotes the pixel average of the infrared image block in the sliding window; and SSIM denotes the value of the final structural similarity loss function in the sliding window.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
the invention provides an image fusion method and system based on adaptive enhancement generation countermeasure network, wherein the method comprises the following steps: acquiring a source image; the source image comprises an infrared image and a visible light image; combining the dense convolutional network with a detail information compensation mechanism to construct a dense detail feature extraction network; performing feature extraction on the source image based on the dense detail feature extraction network to obtain a source feature map and a detail information feature map of the source image; constructing a two-channel self-adaptive fusion network based on a two-channel maximum pooling self-adaptive fusion mechanism and a two-channel average pooling self-adaptive fusion mechanism; performing two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion source feature map; performing two-channel average pooling self-adaptive fusion on the detail information characteristic graph of the source image based on the two-channel self-adaptive fusion network to obtain a fusion detail information characteristic graph; splicing the fusion source characteristic diagram and the fusion detail information characteristic diagram by adopting the dual-channel self-adaptive fusion network to obtain a spliced characteristic diagram; inputting the spliced feature map into a 1 x 1 convolution network to realize cross-channel information interaction and information fusion of the feature map, and obtaining a fused feature map; decoding the fused feature map by adopting a decoding network to obtain a fused image; sequentially connecting the dense detail feature extraction network, the two-channel self-adaptive fusion network, the 1 x 1 convolution network and the decoding network to form a self-adaptive enhancement generation countermeasure network; constructing a self-adaptive structural similarity loss function according to the brightness similarity, the contrast similarity and the structural similarity of the fusion image and the source image; training the network parameters of the adaptive enhancement generation countermeasure network through back propagation based on the adaptive structure similarity loss function, and generating a trained adaptive enhancement generation countermeasure network; and adopting the trained adaptive enhancement to generate a countermeasure network for image fusion of the infrared image and the visible light image. The method of the invention introduces a detail information compensation mechanism to enhance the details of the fused image and reduce the information loss, applies a two-channel self-adaptive fusion network to balance the infrared information and the visible light information in the fused image in the channel dimension, adds a self-adaptive structure similarity loss function to self-adaptively enhance the brightness similarity, the contrast similarity and the structure similarity of the fused image and two source images in the space dimension, solves the problems of information loss and unbalanced effective information distribution of the existing image fusion method based on deep learning, and improves the quality of the fused image.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained from them by those skilled in the art without inventive effort.
FIG. 1 is a flowchart of the image fusion method based on an adaptive-enhancement generative adversarial network provided by the present invention;
FIG. 2 is a schematic diagram of an infrared image in a public data set according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a visible light image in a public data set according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a dense detail feature extraction network provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of the dual-channel adaptive fusion network provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram of the result of fusing an infrared image and a visible light image according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide an image fusion method and system based on an adaptive-enhancement generative adversarial network, so as to solve the problems of information loss and unbalanced distribution of effective information in existing deep-learning-based image fusion methods.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Fig. 1 is a flowchart of the image fusion method based on an adaptive-enhancement generative adversarial network provided by the present invention. Referring to fig. 1, the image fusion method of the present invention comprises:
step 101: and acquiring a source image.
FIG. 2 is a schematic diagram of an infrared image in a public data set according to an embodiment of the present invention; fig. 3 is a schematic diagram of a visible light image in a public data set according to an embodiment of the present invention. Referring to fig. 2 and 3, the present invention adaptively enhances the generation of source images against network inputs including infrared images and visible light images.
Step 102: combining a dense convolutional network with a detail information compensation mechanism to construct a dense detail feature extraction network.
Fig. 4 is a schematic structural diagram of the dense detail feature extraction network according to an embodiment of the present invention. Referring to fig. 4, the present invention combines a dense convolutional network with a detail information compensation mechanism to define a new feature extraction network: the dense detail feature extraction network. The dense convolutional network extracts deep and shallow information through skip connections, and the detail information compensation mechanism acquires additional detail compensation information corresponding to the deep and shallow information. The dense detail feature extraction network constructed by the invention is formulated as:
p_i = conv_i(x_1, cat(…, conv_2(cat(x_{i-2}, conv_1(cat(x_{i-1}, x_i))))))   (1)
y_i = cat(p_i, broadcast(x) - p_i)   (2)
In formulas (1) and (2), x denotes an infrared image or a visible light image, and y_i denotes all feature maps of the i-th layer of the dense detail feature extraction network, where 0 < i ≤ n, n is the number of layers of the dense detail feature extraction network, and n > 2. p_i denotes the extracted i-th layer source feature map, broadcast(x) - p_i denotes the i-th layer detail information feature map (also called the detail compensation feature map) corresponding to the source feature map, and x_i denotes the i-th layer feature map of the dense detail feature extraction network. cat denotes concatenating feature maps along the feature channel, broadcast denotes that the broadcasting mechanism automatically expands dimensions, and conv_i denotes the i-th layer convolution operation that extracts features from the concatenated feature maps.
Step 103: performing feature extraction on the source images based on the dense detail feature extraction network to obtain the source feature maps and detail information feature maps of the source images.
The constructed dense detail feature extraction network serves as the encoding network of the adaptive-enhancement generative adversarial network; it applies dense convolutions combined with the detail information compensation mechanism to each of the two input source images (the infrared image and the visible light image) to obtain their source feature maps and detail information feature maps. The source feature maps comprise two groups, namely the source feature map of the infrared image and the source feature map of the visible light image; the detail information feature maps likewise comprise two groups, namely the detail information feature map of the infrared image and the detail information feature map of the visible light image.
The step 103 specifically includes:
performing feature extraction on the source images with the dense detail feature extraction network formulas p_i = conv_i(x_1, cat(…, conv_2(cat(x_{i-2}, conv_1(cat(x_{i-1}, x_i)))))) and y_i = cat(p_i, broadcast(x) - p_i) to obtain the source feature maps and detail information feature maps of the source images; where x denotes a source image; x_i denotes the i-th layer feature map of the dense detail feature extraction network; cat() denotes concatenating the bracketed feature maps along the feature channel; conv_i() denotes the i-th layer convolution operation that extracts features from the concatenated feature maps; p_i denotes the extracted i-th layer source feature map; broadcast(x) - p_i denotes the i-th layer detail information feature map corresponding to the source feature map p_i, where broadcast(x) means that a broadcasting mechanism automatically expands the dimensions of the source image x; and y_i denotes all feature maps of the i-th layer of the dense detail feature extraction network.
Step 104: constructing a dual-channel adaptive fusion network based on a dual-channel max-pooling adaptive fusion mechanism and a dual-channel average-pooling adaptive fusion mechanism.
Fig. 5 is a schematic structural diagram of the dual-channel adaptive fusion network according to an embodiment of the present invention. Referring to fig. 5, the dual-channel adaptive fusion network is constructed from a dual-channel max-pooling adaptive fusion mechanism and a dual-channel average-pooling adaptive fusion mechanism. After the two groups of source feature maps and the two groups of detail information feature maps are obtained in step 103, the dual-channel max-pooling adaptive fusion mechanism fuses the two groups of source feature maps into one fused source feature map, and the dual-channel average-pooling adaptive fusion mechanism fuses the two groups of detail information feature maps into one fused detail information feature map.
Step 105: performing dual-channel max-pooling adaptive fusion on the source feature maps of the source images based on the dual-channel adaptive fusion network to obtain the fused source feature map.
The four groups of feature maps extracted by the dense detail feature extraction network are taken as input. Dual-channel max-pooling adaptive fusion is applied to the two groups of source feature maps so that their pixel-intensity information is evenly distributed in the fused feature map along the channel dimension; dual-channel average-pooling adaptive fusion is applied to the two groups of detail compensation feature maps so that their detail texture information is evenly distributed in the fused feature map along the channel dimension; in this way the effective information of the two source images is adaptively enhanced in the fused image. The fused source feature map of the dual-channel adaptive fusion network is calculated as:
X° = σ(conv_ir(max(X_ir))) · X_ir + σ(conv_vi(max(X_vi))) · X_vi   (3)
In formula (3), X_ir and X_vi denote the source feature maps of the input infrared image and visible light image respectively, max denotes taking the maximum of the input feature map along the channel dimension, conv_ir denotes a convolution operation on the source feature map of the infrared image, conv_vi denotes a convolution operation on the source feature map of the visible light image, σ denotes the sigmoid operation, and X° denotes the fused source feature map obtained after weighting by the dual-channel max-pooling adaptive fusion mechanism.
Therefore, the step 105 specifically includes:
performing dual-channel max-pooling adaptive fusion on the source feature maps with the dual-channel adaptive fusion network according to X° = σ(conv_ir(max(X_ir))) · X_ir + σ(conv_vi(max(X_vi))) · X_vi to obtain the fused source feature map; where X_ir denotes the source feature map of the infrared image and X_vi denotes the source feature map of the visible light image; max() denotes taking the maximum of the bracketed feature map along the channel dimension; conv_ir() denotes a convolution operation on the source feature map of the infrared image and conv_vi() denotes a convolution operation on the source feature map of the visible light image; σ() denotes the sigmoid operation; and X° denotes the fused source feature map obtained after weighting by the dual-channel max-pooling adaptive fusion mechanism.
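For illustration, the weighting of formula (3) can be realized as in the following PyTorch sketch; the module is written so that the channel-wise statistic can be the maximum (for the source feature maps of this step) or the mean (for the detail information feature maps of step 106 below, formula (4)). The kernel size and the single convolution layer used to produce the weight are assumptions of the sketch.

```python
import torch
import torch.nn as nn

class DualChannelAdaptiveFusion(nn.Module):
    """Sketch of the dual-channel adaptive fusion mechanism: each branch is
    weighted by sigmoid(conv(channel-wise statistic)) and the two weighted
    branches are summed (one reading of formulas (3) and (4))."""

    def __init__(self, reduce="max", kernel_size=3):
        super().__init__()
        padding = kernel_size // 2
        self.reduce = reduce
        self.conv_ir = nn.Conv2d(1, 1, kernel_size, padding=padding)
        self.conv_vi = nn.Conv2d(1, 1, kernel_size, padding=padding)

    def _stat(self, x):
        # channel-wise statistic: max for source maps, mean for detail maps
        if self.reduce == "max":
            return x.max(dim=1, keepdim=True).values
        return x.mean(dim=1, keepdim=True)

    def forward(self, x_ir, x_vi):
        w_ir = torch.sigmoid(self.conv_ir(self._stat(x_ir)))  # spatial weight for the infrared branch
        w_vi = torch.sigmoid(self.conv_vi(self._stat(x_vi)))  # spatial weight for the visible branch
        return w_ir * x_ir + w_vi * x_vi                       # weighted fusion of the two branches
```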
Step 106: performing dual-channel average-pooling adaptive fusion on the detail information feature maps of the source images based on the dual-channel adaptive fusion network to obtain the fused detail information feature map.
Dual-channel average-pooling adaptive fusion is applied to the two groups of detail compensation feature maps so that their detail texture information is evenly distributed in the fused feature map along the channel dimension, which finally enhances the effective information of the two source images in the fused image adaptively. The fused detail information feature map of the dual-channel adaptive fusion network is calculated as:
X_d = σ(conv_ir^d(mean(X_ir^d))) · X_ir^d + σ(conv_vi^d(mean(X_vi^d))) · X_vi^d   (4)
In formula (4), X_ir^d denotes the detail compensation feature map of the input infrared image, X_vi^d denotes the detail compensation feature map of the visible light image, mean denotes averaging the input feature map along the channel dimension, conv_ir^d denotes a convolution operation on the detail compensation feature map of the infrared image, conv_vi^d denotes a convolution operation on the detail compensation feature map of the visible light image, σ denotes the sigmoid operation, and X_d denotes the fused detail information feature map obtained after weighting by the dual-channel average-pooling adaptive fusion mechanism.
Therefore, the step 106 specifically includes:
performing dual-channel average-pooling adaptive fusion on the detail information feature maps with the dual-channel adaptive fusion network according to X_d = σ(conv_ir^d(mean(X_ir^d))) · X_ir^d + σ(conv_vi^d(mean(X_vi^d))) · X_vi^d to obtain the fused detail information feature map; where X_ir^d denotes the detail information feature map of the infrared image and X_vi^d denotes the detail information feature map of the visible light image; mean() denotes averaging the bracketed feature map along the channel dimension; conv_ir^d() denotes a convolution operation on the detail information feature map of the infrared image and conv_vi^d() denotes a convolution operation on the detail information feature map of the visible light image; σ() denotes the sigmoid operation; and X_d denotes the fused detail information feature map obtained after weighting by the dual-channel average-pooling adaptive fusion mechanism.
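Continuing the sketch above, the max-pooling and average-pooling variants can then be instantiated as follows (tensor shapes are purely illustrative and stand in for the encoder outputs):

```python
import torch

# dummy feature maps standing in for the dense detail encoder outputs
ir_src = torch.randn(1, 64, 128, 128); vi_src = torch.randn(1, 64, 128, 128)
ir_detail = torch.randn(1, 64, 128, 128); vi_detail = torch.randn(1, 64, 128, 128)

fuse_src = DualChannelAdaptiveFusion(reduce="max")      # formula (3): fused source feature map
fuse_detail = DualChannelAdaptiveFusion(reduce="mean")  # formula (4): fused detail information feature map

fused_src = fuse_src(ir_src, vi_src)
fused_detail = fuse_detail(ir_detail, vi_detail)
```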
Step 107: concatenating the fused source feature map and the fused detail information feature map in the dual-channel adaptive fusion network to obtain the concatenated feature map.
The two groups of fused feature maps (the fused source feature map and the fused detail information feature map) are concatenated along the channel dimension as input, and a 1 x 1 convolutional network is constructed to realize cross-channel information interaction and information fusion of the feature maps.
Step 108: feeding the concatenated feature map into the 1 x 1 convolutional network to realize cross-channel information interaction and information fusion, obtaining the fused feature map.
The invention concatenates the fused source feature map and the fused detail information feature map with the dual-channel adaptive fusion network, and feeds the concatenated feature map into a 1 x 1 convolutional network for cross-channel information interaction and fusion, which yields the fused feature map. Compared with a 3 x 3 convolution, the 1 x 1 convolution retains more of the accumulated pixel-intensity and detail texture information; the fused feature map is then decoded to obtain the fused image, so that the final fusion result contains rich source image information.
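Under the same assumptions as the sketches above, steps 107 and 108 reduce to a channel-wise concatenation followed by a 1 x 1 convolution, for example:

```python
import torch
import torch.nn as nn

# fused_src and fused_detail stand in for the outputs of the dual-channel adaptive fusion
fused_src = torch.randn(1, 64, 128, 128)
fused_detail = torch.randn(1, 64, 128, 128)

concat = torch.cat([fused_src, fused_detail], dim=1)     # step 107: concatenation along the channel dimension
conv1x1 = nn.Conv2d(concat.shape[1], 64, kernel_size=1)  # step 108: 1 x 1 cross-channel interaction and fusion
fused_features = conv1x1(concat)                         # fed to the decoding network in step 109
```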
Step 109: decoding the fused feature map with a decoding network to obtain the fused image.
Finally, the fused feature map is decoded to obtain the fused image.
Step 110: connecting the dense detail feature extraction network, the dual-channel adaptive fusion network, the 1 x 1 convolutional network and the decoding network in sequence to form the adaptive-enhancement generative adversarial network.
The adaptive-enhancement generative adversarial network takes the dense detail feature extraction network as its encoding network and applies dense convolutions combined with the detail information compensation mechanism to each of the two input source images to extract features; the dual-channel adaptive fusion network fuses the infrared and visible-light information in the image in a balanced way along the channel dimension; the two groups of fused feature maps are concatenated and a 1 x 1 convolutional network realizes cross-channel interaction and information fusion; finally, the fused feature map is decoded to obtain the fused image.
Step 111: constructing the adaptive structural similarity loss function from the luminance similarity, contrast similarity and structural similarity between the fused image and the source images.
The invention uses an adaptive structural similarity loss function to measure the structural, contrast and luminance similarity between the fused image produced by the network model (i.e. the adaptive-enhancement generative adversarial network) and the two source images. A 3 x 3 sliding window is used to compute the average pixel intensity of each window of the two source images; the weight of each window is obtained from the sigmoid of these means; the structural similarity of all windows is then computed and averaged to give the final adaptive structural similarity loss value, and the network parameters are adjusted through back propagation, so that effective image information can be adaptively enhanced in the spatial dimension. The adaptive structural similarity loss is built from formulas (5) to (9):
luminance similarity: l(x, y) = (2 μ_x μ_y + C_1) / (μ_x² + μ_y² + C_1)   (5)
contrast similarity: c(x, y) = (2 σ_x σ_y + C_2) / (σ_x² + σ_y² + C_2)   (6)
structure similarity: s(x, y) = (σ_xy + C_3) / (σ_x σ_y + C_3)   (7)
structural similarity: ssim(x, y) = l(x, y) · c(x, y) · s(x, y)   (8)
where x and y denote the input source image and the fused image, μ_x and μ_y denote the means of the two images (source image and fused image), σ_x and σ_y denote their standard deviations, and σ_xy denotes their covariance; C_1, C_2 and C_3 are very small numbers used to keep the algorithm stable; ssim(x, y) denotes the structural similarity of the two images.
The adaptive structural similarity loss function is expressed as:
SSIM = σ(mean(vi_w) - mean(ir_w)) · ssim(vi_w, f_w) + σ(mean(ir_w) - mean(vi_w)) · ssim(ir_w, f_w)   (9)
In formula (9), vi_w denotes a visible image block in a sliding window, f_w denotes the fused image block in the sliding window, and ir_w denotes the infrared image block in the sliding window; ssim(vi_w, f_w) denotes the structural similarity between the fused image block and the visible image block in the sliding window, and ssim(ir_w, f_w) denotes the structural similarity between the fused image block and the infrared image block in the sliding window; mean(vi_w) denotes the pixel average of the visible image block in the sliding window and mean(ir_w) denotes the pixel average of the infrared image block in the sliding window; SSIM denotes the value of the final structural similarity loss function in the sliding window.
Therefore, the step 111 specifically includes:
calculating the luminance similarity l(x, y) between the fused image and a source image as l(x, y) = (2 μ_x μ_y + C_1) / (μ_x² + μ_y² + C_1); where μ_x denotes the mean pixel intensity of a sliding window of the source image x, μ_y denotes the mean pixel intensity of the sliding window of the fused image y, and C_1 is a very small number;
calculating the contrast similarity c(x, y) between the fused image and the source image as c(x, y) = (2 σ_x σ_y + C_2) / (σ_x² + σ_y² + C_2); where σ_x denotes the standard deviation of the source image x, σ_y denotes the standard deviation of the fused image y, and C_2 is a very small number;
calculating the structure similarity s(x, y) between the fused image and the source image as s(x, y) = (σ_xy + C_3) / (σ_x σ_y + C_3); where σ_xy denotes the covariance of the source image x and the fused image y, and C_3 is a very small number;
calculating the structural similarity ssim(x, y) between the fused image and the source image from the luminance similarity l(x, y), the contrast similarity c(x, y) and the structure similarity s(x, y) as ssim(x, y) = l(x, y) · c(x, y) · s(x, y);
constructing the adaptive structural similarity loss function from the structural similarity ssim(x, y) of the fused image and the source images as SSIM = σ(mean(vi_w) - mean(ir_w)) · ssim(vi_w, f_w) + σ(mean(ir_w) - mean(vi_w)) · ssim(ir_w, f_w); where vi_w denotes a visible image block in a sliding window, ir_w denotes the infrared image block in the sliding window, and f_w denotes the fused image block in the sliding window; ssim(vi_w, f_w) denotes the structural similarity between the fused image block and the visible image block in the sliding window, and ssim(ir_w, f_w) denotes the structural similarity between the fused image block and the infrared image block in the sliding window; mean(vi_w) denotes the pixel average of the visible image block in the sliding window and mean(ir_w) denotes the pixel average of the infrared image block in the sliding window; and SSIM denotes the value of the final structural similarity loss function in the sliding window.
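For illustration, the loss described above can be sketched as follows; the sketch uses a 3 x 3 average-pooling window for the local statistics, the standard SSIM constants for images normalized to [0, 1], and the sigmoid of the difference of window means as the adaptive weight, which is one reading of the description rather than the patent's exact implementation.

```python
import torch
import torch.nn.functional as F

def local_ssim(a, b, win=3, c1=1e-4, c2=9e-4):
    """Per-pixel SSIM computed from local means/variances over a win x win neighbourhood."""
    pad = win // 2
    mu_a = F.avg_pool2d(a, win, stride=1, padding=pad)
    mu_b = F.avg_pool2d(b, win, stride=1, padding=pad)
    var_a = F.avg_pool2d(a * a, win, stride=1, padding=pad) - mu_a ** 2
    var_b = F.avg_pool2d(b * b, win, stride=1, padding=pad) - mu_b ** 2
    cov = F.avg_pool2d(a * b, win, stride=1, padding=pad) - mu_a * mu_b
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))

def adaptive_ssim_loss(ir, vi, fused, win=3):
    """Adaptive structural similarity loss: each window's weight comes from the
    sigmoid of the difference of the source windows' mean intensities."""
    pad = win // 2
    mu_ir = F.avg_pool2d(ir, win, stride=1, padding=pad)
    mu_vi = F.avg_pool2d(vi, win, stride=1, padding=pad)
    w_vi = torch.sigmoid(mu_vi - mu_ir)          # per-window weight of the visible branch
    ssim_map = w_vi * local_ssim(vi, fused, win) + (1 - w_vi) * local_ssim(ir, fused, win)
    return 1.0 - ssim_map.mean()                 # maximizing similarity means minimizing the loss
```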
Step 112: training the network parameters of the adaptive-enhancement generative adversarial network through back propagation based on the adaptive structural similarity loss function to obtain the trained adaptive-enhancement generative adversarial network.
Based on the adaptive structural similarity loss function, the whole network model is trained for multiple iterations through back propagation; after its parameters have been optimized, the network can adaptively enhance infrared and visible-light image information.
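A minimal sketch of one generator training step with this loss is given below; the optimizer, learning rate and the omission of the adversarial (discriminator) term are assumptions of the sketch.

```python
import torch

# `generator` is assumed to be a FusionGenerator instance as sketched earlier,
# and `adaptive_ssim_loss` the loss function sketched under step 111.
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-4)

def train_step(ir, vi):
    optimizer.zero_grad()
    fused = generator(ir, vi)
    loss = adaptive_ssim_loss(ir, vi, fused)  # adaptive structural similarity loss
    loss.backward()                           # back propagation
    optimizer.step()                          # update the generator's network parameters
    return loss.item()
```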
Step 113: performing image fusion of the infrared image and the visible light image with the trained adaptive-enhancement generative adversarial network.
The inputs of the adaptive-enhancement generative adversarial network are an infrared image and a visible light image, and its output is the fused image of the two. Using the trained network to fuse the infrared image shown in fig. 2 with the visible light image shown in fig. 3 produces the fused image shown in fig. 6. As can be seen from fig. 6, the fused image generated by the adaptive-enhancement generative adversarial network of the invention has enhanced detail information, balances the infrared and visible-light information, and greatly improves the image quality of the fusion result.
The method uses the dense detail feature extraction network to solve the information loss of an end-to-end generative adversarial network during feature extraction, and uses the dual-channel attention mechanism and the adaptive structural similarity loss function to address the distribution of effective infrared and visible-light information in the fused image in the channel dimension and the spatial dimension respectively, which effectively improves the quality of the fused image.
Based on the image fusion method for generating the confrontation network based on the adaptive enhancement provided by the invention, the invention also provides an image fusion system for generating the confrontation network based on the adaptive enhancement, and the system comprises:
the source image acquisition module is used for acquiring a source image; the source image comprises an infrared image and a visible light image;
the dense detail feature extraction network construction module is used for combining the dense convolution network with a detail information compensation mechanism to construct a dense detail feature extraction network;
the feature extraction module is used for carrying out feature extraction on the source image based on the dense detail feature extraction network to obtain a source feature map and a detail information feature map of the source image; the source feature map of the source image comprises a source feature map of an infrared image and a source feature map of a visible light image; the detail information characteristic diagram comprises a detail information characteristic diagram of an infrared image and a detail information characteristic diagram of a visible light image;
the two-channel self-adaptive fusion network construction module is used for constructing a two-channel self-adaptive fusion network based on a two-channel maximum pooling self-adaptive fusion mechanism and a two-channel average pooling self-adaptive fusion mechanism;
the source feature map fusion module is used for performing two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion source feature map;
the detail information feature map fusion module is used for carrying out double-channel average pooling self-adaptive fusion on the detail information feature map of the source image based on the double-channel self-adaptive fusion network to obtain a fusion detail information feature map;
the fusion characteristic diagram splicing module is used for splicing the fusion source characteristic diagram and the fusion detail information characteristic diagram by adopting the two-channel self-adaptive fusion network to obtain a spliced characteristic diagram;
the convolution network fusion module is used for inputting the spliced feature map into a 1 x 1 convolution network to realize cross-channel information interaction and information fusion of the feature map so as to obtain a fused feature map;
the feature map decoding module is used for decoding the fused feature map by adopting a decoding network to obtain a fused image;
the adaptive enhancement generation countermeasure network construction module is used for sequentially connecting the dense detail feature extraction network, the two-channel adaptive fusion network, the 1 x 1 convolution network and the decoding network to form an adaptive enhancement generation countermeasure network;
the adaptive structure similarity loss function building module is used for building an adaptive structure similarity loss function according to the brightness similarity, the contrast similarity and the structure similarity of the fusion image and the source image;
the adaptive enhancement generation countermeasure network training module is used for training the network parameters of the adaptive enhancement generation countermeasure network through back propagation based on the adaptive structure similarity loss function, to generate a trained adaptive enhancement generation countermeasure network;
and the image fusion module is used for performing image fusion of the infrared image and the visible light image by using the trained adaptive enhancement generation countermeasure network.
Wherein, the feature extraction module specifically comprises:
a feature extraction unit, configured to perform feature extraction on the source image based on the dense detail feature extraction network by using the dense detail feature extraction formulas p_i = conv_i(cat(x_1, …, conv_2(cat(x_{i-2}, conv_1(cat(x_{i-1}, x_i)))))) and y_i = cat(p_i, broadcast(x) - p_i), to obtain a source feature map and a detail information feature map of the source image; wherein x represents the source image and x_i represents the feature map of the i-th layer of the dense detail feature extraction network; cat() splices the bracketed feature maps along the feature channel; conv_i() denotes the i-th layer convolution operation, which extracts features from the spliced feature map in its brackets; p_i denotes the extracted i-th layer source feature map; broadcast(x) - p_i denotes the i-th layer detail information feature map corresponding to the source feature map p_i, where broadcast(x) indicates that a broadcasting mechanism automatically expands the dimensions of the source image x; and y_i denotes all feature maps of the i-th layer of the dense detail feature extraction network.
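As an illustration only, the following PyTorch sketch shows how one layer of a dense detail feature extraction network with a detail information compensation term broadcast(x) - p_i could be written; the class name, channel sizes, kernel size and activation are assumptions, not the patented architecture.

import torch
import torch.nn as nn

class DenseDetailLayer(nn.Module):
    # One layer: dense concatenation of earlier feature maps, a convolution that
    # produces the source feature map p_i, and the detail compensation term
    # broadcast(x) - p_i obtained by broadcasting the source image over the channels.
    def __init__(self, in_channels, growth):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, growth, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
        )

    def forward(self, x_src, features):
        # x_src: source image (N, 1, H, W); features: list of earlier feature maps
        p_i = self.conv(torch.cat(features, dim=1))   # i-th layer source feature map
        detail_i = x_src.expand_as(p_i) - p_i         # detail information feature map
        y_i = torch.cat([p_i, detail_i], dim=1)       # all feature maps of layer i
        return p_i, y_i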
The source feature map fusion module specifically comprises:
a source feature map fusion unit, configured to perform two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion source feature map; the fusion formula appears only as an image in the original publication, and its notation is as follows: the source feature map of the infrared image; the source feature map of the visible light image; Max(), which takes the maximum value of the bracketed feature map in the channel dimension; a convolution operation on the source feature map of the infrared image; a convolution operation on the source feature map of the visible light image; σ(), a sigmoid operation; and X^o, the fusion source feature map obtained after weighting by the two-channel maximum pooling adaptive fusion mechanism.
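A possible realization of this mechanism is sketched below in PyTorch; because the exact weighting formula of the patent is published only as an image, the combination of channel-wise maxima, per-source convolutions and a sigmoid gate used here is an assumed instantiation.

import torch
import torch.nn as nn

class MaxPoolAdaptiveFusion(nn.Module):
    # Two-channel maximum pooling adaptive fusion (assumed form): channel-wise maxima
    # of the infrared and visible source feature maps are convolved and passed through
    # a sigmoid to obtain spatial weights that re-weight the two feature maps.
    def __init__(self):
        super().__init__()
        self.conv_ir = nn.Conv2d(1, 1, kernel_size=3, padding=1)
        self.conv_vi = nn.Conv2d(1, 1, kernel_size=3, padding=1)

    def forward(self, f_ir, f_vi):
        m_ir = torch.max(f_ir, dim=1, keepdim=True).values   # Max() over the channel dimension
        m_vi = torch.max(f_vi, dim=1, keepdim=True).values
        w_ir = torch.sigmoid(self.conv_ir(m_ir))             # sigmoid weighting, infrared branch
        w_vi = torch.sigmoid(self.conv_vi(m_vi))             # sigmoid weighting, visible branch
        return w_ir * f_ir + w_vi * f_vi                     # fusion source feature map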
The detail information feature map fusion module specifically comprises:
a detail information feature map fusion unit, configured to perform two-channel average pooling adaptive fusion on the detail information feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion detail information feature map; the fusion formula appears only as an image in the original publication, and its notation is as follows: the detail information feature map of the infrared image; the detail information feature map of the visible light image; Mean(), which averages the bracketed feature map over the channel dimension; a convolution operation on the detail information feature map of the infrared image; a convolution operation on the detail information feature map of the visible light image; σ(), a sigmoid operation; and X^d, the fusion detail information feature map obtained after weighting by the two-channel average pooling adaptive fusion mechanism.
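A companion sketch for the average pooling branch follows; it mirrors the maximum pooling sketch above but uses Mean() over the channel dimension, and the weighting scheme is likewise an assumption since the patented formula is published only as an image.

import torch
import torch.nn as nn

class AvgPoolAdaptiveFusion(nn.Module):
    # Two-channel average pooling adaptive fusion (assumed form) for the detail
    # information feature maps of the infrared and visible images.
    def __init__(self):
        super().__init__()
        self.conv_ir = nn.Conv2d(1, 1, kernel_size=3, padding=1)
        self.conv_vi = nn.Conv2d(1, 1, kernel_size=3, padding=1)

    def forward(self, d_ir, d_vi):
        a_ir = torch.mean(d_ir, dim=1, keepdim=True)          # Mean() over the channel dimension
        a_vi = torch.mean(d_vi, dim=1, keepdim=True)
        w_ir = torch.sigmoid(self.conv_ir(a_ir))
        w_vi = torch.sigmoid(self.conv_vi(a_vi))
        return w_ir * d_ir + w_vi * d_vi                      # fusion detail information feature map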
The adaptive structure similarity loss function building module specifically comprises:
a brightness similarity calculation unit, configured to calculate the brightness similarity l(x, y) of the fusion image and the source image using the formula l(x, y) = (2 μ_x μ_y + C_1) / (μ_x^2 + μ_y^2 + C_1); wherein μ_x represents the mean pixel intensity of a sliding window of the source image x, μ_y represents the mean pixel intensity of a sliding window of the fused image y, and C_1 is an extremely small number;
a contrast similarity calculation unit, configured to calculate the contrast similarity c(x, y) of the fusion image and the source image using the formula c(x, y) = (2 σ_x σ_y + C_2) / (σ_x^2 + σ_y^2 + C_2); wherein σ_x represents the standard deviation of the source image x, σ_y represents the standard deviation of the fused image y, and C_2 is an extremely small number;
a structural similarity calculation unit, configured to calculate the structural similarity s(x, y) of the fusion image and the source image using the formula s(x, y) = (σ_xy + C_3) / (σ_x σ_y + C_3); wherein σ_xy represents the covariance of the source image x and the fused image y, and C_3 is an extremely small number;
a structural similarity calculation unit, configured to calculate the structural similarity ssim(x, y) of the fused image and the source image according to the brightness similarity l(x, y), the contrast similarity c(x, y) and the structural similarity s(x, y), using the formula ssim(x, y) = l(x, y) · c(x, y) · s(x, y);
an adaptive structural similarity loss function construction unit, configured to construct the adaptive structural similarity loss function from the structural similarity ssim(x, y) of the fusion image and the source image (the loss formula appears only as an image in the original publication); wherein vi_w represents the visible light image block in a sliding window, ir_w represents the infrared image block in the sliding window, and f_w represents the fused image block in the sliding window; ssim(vi_w, f_w) represents the structural similarity of the fused image block and the visible image block in the sliding window, and ssim(ir_w, f_w) represents the structural similarity of the fused image block and the infrared image block in the sliding window; the formula also uses the pixel average of the visible light image block in the sliding window and the pixel average of the infrared image block in the sliding window; SSIM denotes the value of the final structural similarity loss function in a sliding window.
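The sketch below computes a window-based structural similarity loss in PyTorch. The SSIM map uses the standard combined luminance-contrast-structure form (equivalent to l·c·s with C3 = C2/2), and the per-window weighting by the mean intensities of the visible and infrared blocks is an assumption, since the adaptive weighting formula of the patent is published only as an image.

import torch
import torch.nn.functional as F

def window_ssim(x, y, win=11, C1=1e-4, C2=9e-4):
    # Per-window SSIM map; average pooling plays the role of the sliding window,
    # and C1, C2 are the extremely small stabilizing constants.
    mu_x = F.avg_pool2d(x, win, stride=1)
    mu_y = F.avg_pool2d(y, win, stride=1)
    var_x = F.avg_pool2d(x * x, win, stride=1) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, win, stride=1) - mu_y ** 2
    cov_xy = F.avg_pool2d(x * y, win, stride=1) - mu_x * mu_y
    num = (2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)
    den = (mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2)
    return num / den

def adaptive_ssim_loss(vis, ir, fused, win=11):
    # Assumed weighting: each window's contribution is split between the visible and
    # infrared SSIM terms in proportion to the mean intensity of the two source blocks.
    ssim_vf = window_ssim(vis, fused, win)
    ssim_if = window_ssim(ir, fused, win)
    mu_vi = F.avg_pool2d(vis, win, stride=1)
    mu_ir = F.avg_pool2d(ir, win, stride=1)
    w_vi = mu_vi / (mu_vi + mu_ir + 1e-8)      # relative brightness of the visible block
    w_ir = 1.0 - w_vi
    return 1.0 - (w_vi * ssim_vf + w_ir * ssim_if).mean()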
The invention discloses a method and a system for fusing infrared and visible light images based on an adaptive enhancement generation countermeasure network. The method first takes an infrared image and a visible light image as separate inputs; constructs a dense detail feature extraction network as the coding network and applies a dense convolution scheme and a detail information compensation mechanism to the two input source images to extract two groups of source feature maps and two groups of detail information feature maps; constructs a two-channel maximum pooling adaptive fusion mechanism to fuse the two groups of source feature maps into one group of fusion source feature maps; constructs a two-channel average pooling adaptive fusion mechanism to fuse the two groups of detail information feature maps into one group of fusion detail information feature maps; splices the two groups of fusion feature maps and constructs a 1 x 1 convolution network to realize cross-channel interaction and information fusion; and finally decodes the fused feature map to obtain the fused image. An adaptive structural similarity loss function is added when training the whole network model. The invention introduces a detail compensation mechanism into the generator coding network to enhance the details of the fused image and reduce information loss, balances infrared information and visible light information in the fused image in the channel dimension using the two-channel adaptive fusion network, adaptively enhances the brightness similarity, contrast similarity and structural similarity between the fused image and the two source images in the spatial dimension by adding the adaptive structural similarity loss function, and optimizes the infrared and visible light image fusion network model using the dense detail feature extraction network, the two-channel attention mechanism and the adaptive structural similarity loss function, thereby solving the problems of information loss and effective information distribution in the prior art and effectively improving the image fusion quality.
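To show how the pieces fit together, a high-level generator sketch follows; the encoder, fusion and decoder modules are stand-ins for the components sketched earlier, and the module names and channel counts are assumptions rather than the patented network.

import torch
import torch.nn as nn

class FusionGenerator(nn.Module):
    # Chains the assumed components: encoders producing source / detail feature maps,
    # the two adaptive fusion branches, a 1 x 1 convolution for cross-channel
    # interaction, and a decoder that produces the fused image.
    def __init__(self, encoder_ir, encoder_vi, fuse_src, fuse_detail, channels, decoder):
        super().__init__()
        self.encoder_ir, self.encoder_vi = encoder_ir, encoder_vi
        self.fuse_src, self.fuse_detail = fuse_src, fuse_detail
        self.mix = nn.Conv2d(channels, channels, kernel_size=1)  # 1 x 1 cross-channel fusion
        self.decoder = decoder

    def forward(self, ir, vis):
        p_ir, d_ir = self.encoder_ir(ir)                 # source / detail features of the infrared image
        p_vi, d_vi = self.encoder_vi(vis)                # source / detail features of the visible image
        fused_src = self.fuse_src(p_ir, p_vi)            # two-channel maximum pooling adaptive fusion
        fused_detail = self.fuse_detail(d_ir, d_vi)      # two-channel average pooling adaptive fusion
        x = torch.cat([fused_src, fused_detail], dim=1)  # splice the two fusion feature maps
        return self.decoder(self.mix(x))                 # decode into the fused image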
The embodiments in this description are described progressively; each embodiment focuses on its differences from the others, and identical or similar parts may be cross-referenced between embodiments. Since the disclosed system corresponds to the disclosed method, its description is relatively brief, and the relevant points can be found in the description of the method.
The principles and embodiments of the present invention have been described herein with reference to specific examples, which are provided only to help in understanding the method and its core concept; at the same time, a person skilled in the art may, following the idea of the invention, modify the specific embodiments and the scope of application. In view of the above, the contents of this specification should not be construed as limiting the invention.

Claims (10)

1. An image fusion method for generating a countermeasure network based on adaptive enhancement is characterized by comprising the following steps:
acquiring a source image; the source image comprises an infrared image and a visible light image;
combining the dense convolutional network with a detail information compensation mechanism to construct a dense detail feature extraction network;
performing feature extraction on the source image based on the dense detail feature extraction network to obtain a source feature map and a detail information feature map of the source image; the source feature map of the source image comprises a source feature map of an infrared image and a source feature map of a visible light image; the detail information characteristic diagram comprises a detail information characteristic diagram of an infrared image and a detail information characteristic diagram of a visible light image;
constructing a two-channel self-adaptive fusion network based on a two-channel maximum pooling self-adaptive fusion mechanism and a two-channel average pooling self-adaptive fusion mechanism;
performing two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion source feature map;
performing two-channel average pooling self-adaptive fusion on the detail information characteristic graph of the source image based on the two-channel self-adaptive fusion network to obtain a fusion detail information characteristic graph;
splicing the fusion source characteristic diagram and the fusion detail information characteristic diagram by adopting the dual-channel self-adaptive fusion network to obtain a spliced characteristic diagram;
inputting the spliced feature map into a 1 x 1 convolution network to realize cross-channel information interaction and information fusion of the feature map, and obtaining a fused feature map;
decoding the fused feature map by adopting a decoding network to obtain a fused image;
sequentially connecting the dense detail feature extraction network, the two-channel self-adaptive fusion network, the 1 x 1 convolution network and the decoding network to form a self-adaptive enhancement generation countermeasure network;
constructing a self-adaptive structural similarity loss function according to the brightness similarity, the contrast similarity and the structural similarity of the fusion image and the source image;
training the network parameters of the adaptive enhancement generation countermeasure network through back propagation based on the adaptive structure similarity loss function, and generating a trained adaptive enhancement generation countermeasure network;
and using the trained adaptive enhancement generation countermeasure network to perform image fusion of the infrared image and the visible light image.
2. The method according to claim 1, wherein the extracting the features of the source image based on the dense detail feature extraction network to obtain a source feature map and a detail information feature map of the source image specifically comprises:
performing feature extraction on the source image based on the dense detail feature extraction network by using the dense detail feature extraction formulas p_i = conv_i(cat(x_1, …, conv_2(cat(x_{i-2}, conv_1(cat(x_{i-1}, x_i)))))) and y_i = cat(p_i, broadcast(x) - p_i), to obtain a source feature map and a detail information feature map of the source image; wherein x represents the source image and x_i represents the feature map of the i-th layer of the dense detail feature extraction network; cat() splices the bracketed feature maps along the feature channel; conv_i() denotes the i-th layer convolution operation, which extracts features from the spliced feature map in its brackets; p_i denotes the extracted i-th layer source feature map; broadcast(x) - p_i denotes the i-th layer detail information feature map corresponding to the source feature map p_i, where broadcast(x) indicates that a broadcasting mechanism automatically expands the dimensions of the source image x; and y_i denotes all feature maps of the i-th layer of the dense detail feature extraction network.
3. The method according to claim 1, wherein the performing two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fused source feature map specifically comprises:
performing, based on the two-channel adaptive fusion network, two-channel maximum pooling adaptive fusion on the source feature map of the source image to obtain a fusion source feature map; the fusion formula appears only as an image in the original publication, and its notation is as follows: the source feature map of the infrared image; the source feature map of the visible light image; Max(), which takes the maximum value of the bracketed feature map in the channel dimension; a convolution operation on the source feature map of the infrared image; a convolution operation on the source feature map of the visible light image; σ(), a sigmoid operation; and X^o, the fusion source feature map obtained after weighting by the two-channel maximum pooling adaptive fusion mechanism.
4. The method according to claim 1, wherein the performing two-channel average pooling adaptive fusion on the detail information feature map of the source image based on the two-channel adaptive fusion network to obtain a fused detail information feature map specifically comprises:
performing, based on the two-channel adaptive fusion network, two-channel average pooling adaptive fusion on the detail information feature map of the source image to obtain a fusion detail information feature map; the fusion formula appears only as an image in the original publication, and its notation is as follows: the detail information feature map of the infrared image; the detail information feature map of the visible light image; Mean(), which averages the bracketed feature map over the channel dimension; a convolution operation on the detail information feature map of the infrared image; a convolution operation on the detail information feature map of the visible light image; σ(), a sigmoid operation; and X^d, the fusion detail information feature map obtained after weighting by the two-channel average pooling adaptive fusion mechanism.
5. The method according to claim 1, wherein the constructing an adaptive structural similarity loss function according to the luminance similarity, the contrast similarity, and the structural similarity of the fused image and the source image comprises:
calculating the brightness similarity l(x, y) of the fusion image and the source image using the formula l(x, y) = (2 μ_x μ_y + C_1) / (μ_x^2 + μ_y^2 + C_1); wherein μ_x represents the mean pixel intensity of a sliding window of the source image x, μ_y represents the mean pixel intensity of a sliding window of the fused image y, and C_1 is an extremely small number;
calculating the contrast similarity c(x, y) of the fusion image and the source image using the formula c(x, y) = (2 σ_x σ_y + C_2) / (σ_x^2 + σ_y^2 + C_2); wherein σ_x represents the standard deviation of the source image x, σ_y represents the standard deviation of the fused image y, and C_2 is an extremely small number;
calculating the structural similarity s(x, y) of the fusion image and the source image using the formula s(x, y) = (σ_xy + C_3) / (σ_x σ_y + C_3); wherein σ_xy represents the covariance of the source image x and the fused image y, and C_3 is an extremely small number;
calculating the structural similarity ssim(x, y) of the fused image and the source image according to the brightness similarity l(x, y), the contrast similarity c(x, y) and the structural similarity s(x, y), using the formula ssim(x, y) = l(x, y) · c(x, y) · s(x, y);
constructing the adaptive structural similarity loss function from the structural similarity ssim(x, y) of the fusion image and the source image (the loss formula appears only as an image in the original publication); wherein vi_w represents the visible light image block in a sliding window, ir_w represents the infrared image block in the sliding window, and f_w represents the fused image block in the sliding window; ssim(vi_w, f_w) represents the structural similarity of the fused image block and the visible image block in the sliding window, and ssim(ir_w, f_w) represents the structural similarity of the fused image block and the infrared image block in the sliding window; the formula also uses the pixel average of the visible light image block in the sliding window and the pixel average of the infrared image block in the sliding window; and SSIM denotes the value of the final structural similarity loss function in a sliding window.
6. An image fusion system for generating a countermeasure network based on adaptive enhancement, comprising:
the source image acquisition module is used for acquiring a source image; the source image comprises an infrared image and a visible light image;
the dense detail feature extraction network construction module is used for combining the dense convolution network with a detail information compensation mechanism to construct a dense detail feature extraction network;
the feature extraction module is used for carrying out feature extraction on the source image based on the dense detail feature extraction network to obtain a source feature map and a detail information feature map of the source image; the source feature map of the source image comprises a source feature map of an infrared image and a source feature map of a visible light image; the detail information characteristic diagram comprises a detail information characteristic diagram of an infrared image and a detail information characteristic diagram of a visible light image;
the two-channel self-adaptive fusion network construction module is used for constructing a two-channel self-adaptive fusion network based on a two-channel maximum pooling self-adaptive fusion mechanism and a two-channel average pooling self-adaptive fusion mechanism;
the source feature map fusion module is used for performing two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion source feature map;
the detail information feature map fusion module is used for carrying out double-channel average pooling self-adaptive fusion on the detail information feature map of the source image based on the double-channel self-adaptive fusion network to obtain a fusion detail information feature map;
the fusion characteristic diagram splicing module is used for splicing the fusion source characteristic diagram and the fusion detail information characteristic diagram by adopting the two-channel self-adaptive fusion network to obtain a spliced characteristic diagram;
the convolution network fusion module is used for inputting the spliced feature map into a 1 x 1 convolution network to realize cross-channel information interaction and information fusion of the feature map so as to obtain a fused feature map;
the feature map decoding module is used for decoding the fused feature map by adopting a decoding network to obtain a fused image;
the adaptive enhancement generation countermeasure network construction module is used for sequentially connecting the dense detail feature extraction network, the two-channel adaptive fusion network, the 1 x 1 convolution network and the decoding network to form an adaptive enhancement generation countermeasure network;
the adaptive structure similarity loss function building module is used for building an adaptive structure similarity loss function according to the brightness similarity, the contrast similarity and the structure similarity of the fusion image and the source image;
the adaptive enhancement generation countermeasure network training module is used for training the network parameters of the adaptive enhancement generation countermeasure network through back propagation based on the adaptive structure similarity loss function, to generate a trained adaptive enhancement generation countermeasure network;
and the image fusion module is used for performing image fusion of the infrared image and the visible light image by using the trained adaptive enhancement generation countermeasure network.
7. The system of claim 6, wherein the feature extraction module specifically comprises:
a feature extraction unit, configured to perform feature extraction on the source image based on the dense detail feature extraction network by using the dense detail feature extraction formulas p_i = conv_i(cat(x_1, …, conv_2(cat(x_{i-2}, conv_1(cat(x_{i-1}, x_i)))))) and y_i = cat(p_i, broadcast(x) - p_i), to obtain a source feature map and a detail information feature map of the source image; wherein x represents the source image and x_i represents the feature map of the i-th layer of the dense detail feature extraction network; cat() splices the bracketed feature maps along the feature channel; conv_i() denotes the i-th layer convolution operation, which extracts features from the spliced feature map in its brackets; p_i denotes the extracted i-th layer source feature map; broadcast(x) - p_i denotes the i-th layer detail information feature map corresponding to the source feature map p_i, where broadcast(x) indicates that a broadcasting mechanism automatically expands the dimensions of the source image x; and y_i denotes all feature maps of the i-th layer of the dense detail feature extraction network.
8. The system according to claim 6, wherein the source feature map fusion module specifically comprises:
a source feature map fusion unit, configured to perform two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion source feature map; the fusion formula appears only as an image in the original publication, and its notation is as follows: the source feature map of the infrared image; the source feature map of the visible light image; Max(), which takes the maximum value of the bracketed feature map in the channel dimension; a convolution operation on the source feature map of the infrared image; a convolution operation on the source feature map of the visible light image; σ(), a sigmoid operation; and X^o, the fusion source feature map obtained after weighting by the two-channel maximum pooling adaptive fusion mechanism.
9. The system according to claim 6, wherein the detail information feature map fusion module specifically includes:
a detail information feature map fusion unit, configured to perform two-channel average pooling adaptive fusion on the detail information feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion detail information feature map; the fusion formula appears only as an image in the original publication, and its notation is as follows: the detail information feature map of the infrared image; the detail information feature map of the visible light image; Mean(), which averages the bracketed feature map over the channel dimension; a convolution operation on the detail information feature map of the infrared image; a convolution operation on the detail information feature map of the visible light image; σ(), a sigmoid operation; and X^d, the fusion detail information feature map obtained after weighting by the two-channel average pooling adaptive fusion mechanism.
10. The system according to claim 6, wherein the adaptive structural similarity loss function building module specifically comprises:
a brightness similarity calculation unit, configured to calculate the brightness similarity l(x, y) of the fusion image and the source image using the formula l(x, y) = (2 μ_x μ_y + C_1) / (μ_x^2 + μ_y^2 + C_1); wherein μ_x represents the mean pixel intensity of a sliding window of the source image x, μ_y represents the mean pixel intensity of a sliding window of the fused image y, and C_1 is an extremely small number;
a contrast similarity calculation unit, configured to calculate the contrast similarity c(x, y) of the fusion image and the source image using the formula c(x, y) = (2 σ_x σ_y + C_2) / (σ_x^2 + σ_y^2 + C_2); wherein σ_x represents the standard deviation of the source image x, σ_y represents the standard deviation of the fused image y, and C_2 is an extremely small number;
a structural similarity calculation unit, configured to calculate the structural similarity s(x, y) of the fusion image and the source image using the formula s(x, y) = (σ_xy + C_3) / (σ_x σ_y + C_3); wherein σ_xy represents the covariance of the source image x and the fused image y, and C_3 is an extremely small number;
a structural similarity calculation unit, configured to calculate the structural similarity ssim(x, y) of the fused image and the source image according to the brightness similarity l(x, y), the contrast similarity c(x, y) and the structural similarity s(x, y), using the formula ssim(x, y) = l(x, y) · c(x, y) · s(x, y);
an adaptive structural similarity loss function construction unit, configured to construct the adaptive structural similarity loss function from the structural similarity ssim(x, y) of the fusion image and the source image (the loss formula appears only as an image in the original publication); wherein vi_w represents the visible light image block in a sliding window, ir_w represents the infrared image block in the sliding window, and f_w represents the fused image block in the sliding window; ssim(vi_w, f_w) represents the structural similarity of the fused image block and the visible image block in the sliding window, and ssim(ir_w, f_w) represents the structural similarity of the fused image block and the infrared image block in the sliding window; the formula also uses the pixel average of the visible light image block in the sliding window and the pixel average of the infrared image block in the sliding window; and SSIM denotes the value of the final structural similarity loss function in a sliding window.
CN202210071844.XA 2022-01-21 2022-01-21 Image fusion method and system for generating countermeasure network based on self-adaptive enhancement Active CN114419328B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210071844.XA CN114419328B (en) 2022-01-21 2022-01-21 Image fusion method and system for generating countermeasure network based on self-adaptive enhancement


Publications (2)

Publication Number Publication Date
CN114419328A true CN114419328A (en) 2022-04-29
CN114419328B CN114419328B (en) 2023-05-05

Family

ID=81275736

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210071844.XA Active CN114419328B (en) 2022-01-21 2022-01-21 Image fusion method and system for generating countermeasure network based on self-adaptive enhancement

Country Status (1)

Country Link
CN (1) CN114419328B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090196457A1 (en) * 2008-01-31 2009-08-06 Gregory Zuro Video image processing and fusion
CN111709902A (en) * 2020-05-21 2020-09-25 江南大学 Infrared and visible light image fusion method based on self-attention mechanism
CN111915545A (en) * 2020-08-06 2020-11-10 中北大学 Self-supervision learning fusion method of multiband images
CN112733950A (en) * 2021-01-18 2021-04-30 湖北工业大学 Power equipment fault diagnosis method based on combination of image fusion and target detection
CN113935935A (en) * 2021-10-19 2022-01-14 天翼数字生活科技有限公司 Dark light image enhancement method based on fusion of visible light and near infrared light

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YAOCHEN LIU et al.: "Infrared and Visible Image Fusion through Details Preservation", Sensors, 2019 *
CEN Yueliang: "Research on Fusion Algorithms for Infrared and Visible Light Images of Arbitrary Resolution", China Master's Theses Full-text Database (Information Science and Technology) *

Also Published As

Publication number Publication date
CN114419328B (en) 2023-05-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant