CN114419328A

CN114419328A - Image fusion method and system based on adaptive enhancement generation countermeasure network

Info

Publication number: CN114419328A
Application number: CN202210071844.XA
Authority: CN
Inventors: 张聪炫; 单长鲁; 陈震; 卢锋; 葛利跃; 陈昊; 秦文健; 李凌
Original assignee: Nanchang Hangkong University
Current assignee: Nanchang Hangkong University
Priority date: 2022-01-21
Filing date: 2022-01-21
Publication date: 2022-04-29
Anticipated expiration: 2042-01-21
Also published as: CN114419328B

Abstract

The invention relates to an image fusion method and system based on adaptive enhancement generation countermeasure network. The method comprises the following steps: constructing a dense detail feature extraction network, and respectively extracting features of the source image by using a dense convolution mode and a detail information compensation mechanism; constructing a fusion source characteristic diagram and a detail information characteristic diagram of the dual-channel self-adaptive fusion network; after splicing the fusion characteristic graphs, constructing a 1 x 1 convolution network to realize cross-channel interaction and information fusion; decoding the fused feature map to obtain a fused image; and adding an adaptive structural similarity loss function when training the whole network model. According to the method, a detail information compensation mechanism is introduced to enhance the details of the fused image and reduce information loss, infrared information and visible light information in the fused image are balanced in the channel dimension by using a dual-channel self-adaptive fusion network, and the similarity of the fused image and a source image is enhanced in a self-adaptive manner in the space dimension by adding a self-adaptive structural similarity loss function, so that the quality of the fused image is improved.

Description

Image fusion method and system based on adaptive enhancement generation countermeasure network

Technical Field

The invention relates to the technical field of image fusion, in particular to an image fusion method and system based on adaptive enhancement generation countermeasure network.

Background

Image fusion is an important technique in image processing, and the main goal of image fusion is to integrate salient features extracted from each source image, because a single infrared or visible light sensor cannot capture complete scene information. Generally, the visible image has abundant high-spatial-resolution sensing details, and the infrared image contains the thermal radiation of the target, so that it is important to fuse the complementary information of the infrared image and the visible image into a new composite image to process different tasks. The most advanced fusion algorithms today are widely used in many applications such as autonomous driving of vehicles, visual tracking and video surveillance. Fusion algorithms can be broadly classified into two categories: conventional methods and methods based on deep learning. In recent years, methods based on deep learning have shown great potential in image fusion tasks and are considered to have the potential to provide better performance than traditional algorithms.

At present, a big obstacle of image fusion based on deep learning is that there is no true value, so that only unsupervised training or supervised training is performed on an end-to-end network when training is performed, taking two source images as true values. However, when unsupervised training is adopted, the end-to-end network loses much information when extracting features; when two true value images are adopted for supervised training, the problem of unbalanced information distribution is generated in the fusion image.

Disclosure of Invention

The invention aims to provide an image fusion method and system based on adaptive enhancement generation countermeasure network, which aim to solve the problems of information loss and unbalanced effective information distribution in the existing image fusion method based on deep learning.

In order to achieve the purpose, the invention provides the following scheme:

an image fusion method for generating a countermeasure network based on adaptive enhancement comprises the following steps:

acquiring a source image; the source image comprises an infrared image and a visible light image;

combining the dense convolutional network with a detail information compensation mechanism to construct a dense detail feature extraction network;

performing feature extraction on the source image based on the dense detail feature extraction network to obtain a source feature map and a detail information feature map of the source image; the source feature map of the source image comprises a source feature map of an infrared image and a source feature map of a visible light image; the detail information characteristic diagram comprises a detail information characteristic diagram of an infrared image and a detail information characteristic diagram of a visible light image;

constructing a two-channel self-adaptive fusion network based on a two-channel maximum pooling self-adaptive fusion mechanism and a two-channel average pooling self-adaptive fusion mechanism;

performing two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion source feature map;

performing two-channel average pooling self-adaptive fusion on the detail information characteristic graph of the source image based on the two-channel self-adaptive fusion network to obtain a fusion detail information characteristic graph;

splicing the fusion source characteristic diagram and the fusion detail information characteristic diagram by adopting the dual-channel self-adaptive fusion network to obtain a spliced characteristic diagram;

inputting the spliced feature map into a 1 x 1 convolution network to realize cross-channel information interaction and information fusion of the feature map, and obtaining a fused feature map;

decoding the fused feature map by adopting a decoding network to obtain a fused image;

sequentially connecting the dense detail feature extraction network, the two-channel self-adaptive fusion network, the 1 x 1 convolution network and the decoding network to form a self-adaptive enhancement generation countermeasure network;

constructing a self-adaptive structural similarity loss function according to the brightness similarity, the contrast similarity and the structural similarity of the fusion image and the source image;

training the network parameters of the adaptive enhancement generation countermeasure network through back propagation based on the adaptive structure similarity loss function, and generating a trained adaptive enhancement generation countermeasure network;

and adopting the trained adaptive enhancement to generate a countermeasure network for image fusion of the infrared image and the visible light image.

Optionally, the performing, by the dense detail feature extraction network, feature extraction on the source image to obtain a source feature map and a detail information feature map of the source image specifically includes:

extracting a network formula p by adopting dense detail features based on the dense detail feature extraction networkⁱ＝convⁱ(x¹,cat(…,conv²(cat(x^i-2,conv¹(cat(x^i-1,xⁱ) ))) and y) are providedⁱ＝cat(pⁱ,broadcast(x)-pⁱ) Performing feature extraction on the source image to obtain a source feature map and a detail information feature map of the source image; wherein x represents the source image, xⁱA feature map representing an i-th layer of the dense detail feature extraction network; cat () represents splicing the feature graph in brackets on the feature channel; convⁱ() Representing the ith layer of convolution operation for extracting the features of the spliced feature map in the brackets; p is a radical ofⁱRepresenting the extracted ith layer source characteristic diagram; broadcast (x) -pⁱRepresentation of the source profile pⁱA corresponding ith layer detail information feature map, broadcast (x) represents that a broadcasting mechanism automatically expands the dimension of the source image x; y isⁱAnd representing all feature maps of the ith layer of the dense detail feature extraction network.

Optionally, the performing, based on the two-channel adaptive fusion network, two-channel maximum pooling adaptive fusion on the source feature map of the source image to obtain a fusion source feature map specifically includes:

based on the two-channel self-adaptive fusion network, a formula is adopted

Performing two-channel maximum pooling self-adaptive fusion on the source feature map of the source image to obtain a fusion source feature map; wherein,

a source signature graph representing an infrared image,

a source signature representing a visible light image; max () represents the maximum value in the channel dimension for the bracketed feature map;

representing a convolution operation of the source signature of the infrared image,

representing source to visible light imagesPerforming convolution operation on the symbolic graph; σ () represents a sigmoid operation; and X degree represents a fusion source characteristic diagram obtained after weighting through a two-channel maximum pooling self-adaptive fusion mechanism.

Optionally, the performing, based on the two-channel adaptive fusion network, two-channel average pooling adaptive fusion on the detail information feature map of the source image to obtain a fusion detail information feature map specifically includes:

based on the two-channel self-adaptive fusion network, a formula is adopted

Performing double-channel average pooling self-adaptive fusion on the detail information characteristic graph of the source image to obtain a fusion detail information characteristic graph; wherein

A detailed information feature map representing an infrared image,

a detail information feature map representing a visible light image; mean () represents the averaging of the bracketed feature map over the channel dimension;

showing the convolution operation of the detail information characteristic diagram of the infrared image,

performing convolution operation on a detail information characteristic diagram of the visible light image; σ () represents a sigmoid operation; x^dAnd showing a fusion detail information characteristic diagram obtained after weighting through a dual-channel average pooling self-adaptive fusion mechanism.

Optionally, the constructing an adaptive structural similarity loss function according to the brightness similarity, the contrast similarity, and the structural similarity of the fused image and the source image specifically includes:

using a formula

Calculating the brightness similarity l (x, y) of the fusion image and the source image; wherein mu_xMean value of pixel intensities, μ, representing a sliding window of x source images_yMean value of pixel intensity representing a sliding window of the fused image y; c₁Is an extremely small number;

using a formula

Calculating the contrast similarity c (x, y) of the fusion image and the source image; wherein sigma_xRepresenting the standard deviation, σ, of the source image x_yRepresents the standard deviation of the fused image y; c₂Is an extremely small number;

using a formula

Calculating the structural similarity s (x, y) of the fusion image and the source image; wherein sigma_xyRepresenting the covariance of the source image x and the fused image y; c₃Is an extremely small number;

calculating the structural similarity ssim (x, y) of the fused image and the source image by adopting a formula ssim (x, y) ═ l (x, y) · c (x, y) · s (x, y) according to the brightness similarity l (x, y), the contrast similarity c (x, y) and the structural similarity s (x, y) of the fused image and the source image;

constructing an adaptive structural similarity loss function according to the structural similarity ssim (x, y) of the fusion image and the source image

Wherein vi_wRepresenting blocks of visible light images in a sliding window, ir_wRepresenting an infrared image block in a sliding window, f_wRepresenting a fused image block in a sliding window; ssim (vi)_w+f_w) Representing the structural similarity of the fused image block and the visible image block in the sliding window, ssim (ir)_w+f_w) Representing the structural similarity of the fusion image block and the infrared image block in the sliding window;

representing the average of the pixels of a block of visible light images in a sliding window,

representing the pixel average value of the infrared image block in the sliding window; SSIM denotes the value of the final structural similarity loss function in a sliding window.

An image fusion system for generating a confrontation network based on adaptive enhancement, comprising:

the source image acquisition module is used for acquiring a source image; the source image comprises an infrared image and a visible light image;

the dense detail feature extraction network construction module is used for combining the dense convolution network with a detail information compensation mechanism to construct a dense detail feature extraction network;

the feature extraction module is used for carrying out feature extraction on the source image based on the dense detail feature extraction network to obtain a source feature map and a detail information feature map of the source image; the source feature map of the source image comprises a source feature map of an infrared image and a source feature map of a visible light image; the detail information characteristic diagram comprises a detail information characteristic diagram of an infrared image and a detail information characteristic diagram of a visible light image;

the two-channel self-adaptive fusion network construction module is used for constructing a two-channel self-adaptive fusion network based on a two-channel maximum pooling self-adaptive fusion mechanism and a two-channel average pooling self-adaptive fusion mechanism;

the source feature map fusion module is used for performing two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion source feature map;

the detail information feature map fusion module is used for carrying out double-channel average pooling self-adaptive fusion on the detail information feature map of the source image based on the double-channel self-adaptive fusion network to obtain a fusion detail information feature map;

the fusion characteristic diagram splicing module is used for splicing the fusion source characteristic diagram and the fusion detail information characteristic diagram by adopting the two-channel self-adaptive fusion network to obtain a spliced characteristic diagram;

the convolution network fusion module is used for inputting the spliced feature map into a 1 x 1 convolution network to realize cross-channel information interaction and information fusion of the feature map so as to obtain a fused feature map;

the feature map code module is used for decoding the fused feature map by adopting a decoding network to obtain a fused image;

the adaptive enhancement generation countermeasure network construction module is used for sequentially connecting the dense detail feature extraction network, the two-channel adaptive fusion network, the 1 x 1 convolution network and the decoding network to form an adaptive enhancement generation countermeasure network;

the adaptive structure similarity loss function building module is used for building an adaptive structure similarity loss function according to the brightness similarity, the contrast similarity and the structure similarity of the fusion image and the source image;

the adaptive enhancement generation confrontation network training module is used for training the network parameters of the adaptive enhancement generation confrontation network through back propagation based on the adaptive structure similarity loss function and generating a trained adaptive enhancement generation confrontation network;

and the image fusion module is used for generating a countermeasure network by adopting the trained self-adaptive enhancement to perform image fusion of the infrared image and the visible light image.

Optionally, the feature extraction module specifically includes:

a feature extraction unit for extracting a network formula p by using the dense detail feature based on the dense detail feature extraction networkⁱ＝convⁱ(x¹,cat(…,conv²(cat(x^i-2,conv¹(cat(x^i-1,xⁱ) ))) and y) are providedⁱ＝cat(pⁱ,broadcast(x)-pⁱ) Performing feature extraction on the source image to obtain a source feature map and a detail information feature map of the source image; wherein x represents the source image, xⁱRepresenting the dense detail feature extraction meshA layer i profile of the complex; cat () represents splicing the feature graph in brackets on the feature channel; convⁱ() Representing the ith layer of convolution operation for extracting the features of the spliced feature map in the brackets; p is a radical ofⁱRepresenting the extracted ith layer source characteristic diagram; broadcast (x) -pⁱRepresentation of the source profile pⁱA corresponding ith layer detail information feature map, broadcast (x) represents that a broadcasting mechanism automatically expands the dimension of the source image x; y isⁱAnd representing all feature maps of the ith layer of the dense detail feature extraction network.

Optionally, the source feature map fusion module specifically includes:

a source feature map fusion unit for applying a formula based on the dual-channel adaptive fusion network

a source signature graph representing an infrared image,

representing a convolution operation on a source feature map of a visible light image; σ () represents a sigmoid operation; and X degree represents a fusion source characteristic diagram obtained after weighting through a two-channel maximum pooling self-adaptive fusion mechanism.

Optionally, the detail information feature map fusion module specifically includes:

detail information feature map fusion unit for basisIn the dual-channel self-adaptive fusion network, a formula is adopted

A detailed information feature map representing an infrared image,

Optionally, the adaptive structural similarity loss function constructing module specifically includes:

a brightness similarity calculation unit for employing a formula

a contrast similarity calculation unit for employing a formula

a structural similarity calculation unit for employing a formula

a structural similarity calculation unit, configured to calculate a structural similarity ssim (x, y) between the fused image and the source image by using a formula ssim (x, y) ═ l (x, y) · c (x, y) · s (x, y) according to a luminance similarity l (x, y), a contrast similarity c (x, y), and a structural similarity s (x, y) between the fused image and the source image;

an adaptive structural similarity loss function construction unit, configured to construct an adaptive structural similarity loss function according to the structural similarity ssim (x, y) of the fusion image and the source image

pixel representing an infrared image block in a sliding windowAverage value; SSIM denotes the value of the final structural similarity loss function in a sliding window.

According to the specific embodiment provided by the invention, the invention discloses the following technical effects:

the invention provides an image fusion method and system based on adaptive enhancement generation countermeasure network, wherein the method comprises the following steps: acquiring a source image; the source image comprises an infrared image and a visible light image; combining the dense convolutional network with a detail information compensation mechanism to construct a dense detail feature extraction network; performing feature extraction on the source image based on the dense detail feature extraction network to obtain a source feature map and a detail information feature map of the source image; constructing a two-channel self-adaptive fusion network based on a two-channel maximum pooling self-adaptive fusion mechanism and a two-channel average pooling self-adaptive fusion mechanism; performing two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion source feature map; performing two-channel average pooling self-adaptive fusion on the detail information characteristic graph of the source image based on the two-channel self-adaptive fusion network to obtain a fusion detail information characteristic graph; splicing the fusion source characteristic diagram and the fusion detail information characteristic diagram by adopting the dual-channel self-adaptive fusion network to obtain a spliced characteristic diagram; inputting the spliced feature map into a 1 x 1 convolution network to realize cross-channel information interaction and information fusion of the feature map, and obtaining a fused feature map; decoding the fused feature map by adopting a decoding network to obtain a fused image; sequentially connecting the dense detail feature extraction network, the two-channel self-adaptive fusion network, the 1 x 1 convolution network and the decoding network to form a self-adaptive enhancement generation countermeasure network; constructing a self-adaptive structural similarity loss function according to the brightness similarity, the contrast similarity and the structural similarity of the fusion image and the source image; training the network parameters of the adaptive enhancement generation countermeasure network through back propagation based on the adaptive structure similarity loss function, and generating a trained adaptive enhancement generation countermeasure network; and adopting the trained adaptive enhancement to generate a countermeasure network for image fusion of the infrared image and the visible light image. The method of the invention introduces a detail information compensation mechanism to enhance the details of the fused image and reduce the information loss, applies a two-channel self-adaptive fusion network to balance the infrared information and the visible light information in the fused image in the channel dimension, adds a self-adaptive structure similarity loss function to self-adaptively enhance the brightness similarity, the contrast similarity and the structure similarity of the fused image and two source images in the space dimension, solves the problems of information loss and unbalanced effective information distribution of the existing image fusion method based on deep learning, and improves the quality of the fused image.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.

FIG. 1 is a flow chart of an image fusion method for generating a countermeasure network based on adaptive enhancement according to the present invention;

FIG. 2 is a schematic diagram of an infrared image in a public data set according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a visible light image in a public data set according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a dense detail feature extraction network provided by an embodiment of the present invention;

fig. 5 is a schematic structural diagram of a dual-channel adaptive convergence network according to an embodiment of the present invention;

fig. 6 is a schematic diagram of a result of fusing an infrared image and a visible light image according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.

Fig. 1 is a flowchart of an image fusion method for generating a countermeasure network based on adaptive enhancement according to the present invention. Referring to fig. 1, the image fusion method for generating a countermeasure network based on adaptive enhancement of the present invention includes:

step 101: and acquiring a source image.

FIG. 2 is a schematic diagram of an infrared image in a public data set according to an embodiment of the present invention; fig. 3 is a schematic diagram of a visible light image in a public data set according to an embodiment of the present invention. Referring to fig. 2 and 3, the present invention adaptively enhances the generation of source images against network inputs including infrared images and visible light images.

Step 102: and combining the dense convolutional network with a detail information compensation mechanism to construct a dense detail feature extraction network.

Fig. 4 is a schematic structural diagram of a dense detail feature extraction network according to an embodiment of the present invention. Referring to fig. 4, the present invention combines a dense convolutional network with a detail information compensation mechanism to define a new feature extraction network: dense detail feature extraction networks. The dense convolutional network extracts deep information and shallow information through jump connection, and the detail information compensation mechanism acquires additional detail compensation information corresponding to the deep and shallow information. The dense detail feature extraction network formula constructed by the invention is as follows:

pⁱ＝convⁱ(x¹,cat(…,conv²(cat(x^i-2,conv¹(cat(x^i-1,xⁱ)))))) (1)

yⁱ＝cat(pⁱ,broadcast(x)-pⁱ) (2)

in the formulae (1) and (2), x represents an infrared image or a visible light image, and yⁱAnd (3) representing all feature maps of the ith layer of the dense detail feature extraction network, wherein i is more than 0 and less than or equal to n, n is the layer number of the dense detail feature extraction network, and n is more than 2. p is a radical ofⁱRepresenting the extracted ith layer source feature map, broadcast (x) -pⁱA feature map (also referred to as a detail compensation feature map) representing the i-th layer detail information corresponding to the source feature map, xⁱAnd (3) an i-th layer feature diagram representing the dense detail feature extraction network. cat represents splicing the feature graph on the feature channel, broadcast represents that the broadcast mechanism can automatically expand dimension, convⁱAnd the ith layer of convolution operation for extracting the features after splicing the feature maps is shown.

Step 103: and performing feature extraction on the source image based on the dense detail feature extraction network to obtain a source feature map and a detail information feature map of the source image.

The constructed dense detail feature extraction network is used as a coding network for generating a confrontation network by self-adaptive enhancement, and feature extraction is carried out on two input source images (infrared images and visible light images) by respectively applying a dense convolution mode and combining a detail information compensation mechanism to obtain a source feature image and a detail information feature image of the two source images. The source feature map of the source image comprises two groups of source feature maps, namely a source feature map of an infrared image and a source feature map of a visible light image; the detail information characteristic diagram comprises two groups of detail information characteristic diagrams, namely a detail information characteristic diagram of an infrared image and a detail information characteristic diagram of a visible light image.

The step 103 specifically includes:

extracting a network formula p by adopting dense detail features based on the dense detail feature extraction networkⁱ＝convⁱ(x¹,cat(…,conv²(cat(x^i-2,conv¹(cat(x^i-1,xⁱ) ))) and y) are providedⁱ＝cat(pⁱ,broadcast(x)-pⁱ) For the sourceCarrying out feature extraction on the image to obtain a source feature map and a detail information feature map of the source image; wherein x represents the source image, xⁱA feature map representing an i-th layer of the dense detail feature extraction network; cat () represents splicing the feature graph in brackets on the feature channel; convⁱ() Representing the ith layer of convolution operation for extracting the features of the spliced feature map in the brackets; p is a radical ofⁱRepresenting the extracted ith layer source characteristic diagram; broadcast (x) -pⁱRepresentation of the source profile pⁱA corresponding ith layer detail information feature map, broadcast (x) represents that a broadcasting mechanism automatically expands the dimension of the source image x; y isⁱAnd representing all feature maps of the ith layer of the dense detail feature extraction network.

Step 104: and constructing a two-channel self-adaptive fusion network based on a two-channel maximum pooling self-adaptive fusion mechanism and a two-channel average pooling self-adaptive fusion mechanism.

Fig. 5 is a schematic structural diagram of a dual-channel adaptive convergence network according to an embodiment of the present invention. Referring to fig. 5, the two-channel adaptive fusion network is constructed based on a two-channel maximum pooling adaptive fusion mechanism and a two-channel average pooling adaptive fusion mechanism. After two groups of source characteristic graphs and the other two groups of detailed information characteristic graphs are obtained in step 103, a two-channel maximum pooling adaptive fusion mechanism is constructed to fuse the two groups of source characteristic graphs, and a group of fusion source characteristic graphs is obtained; and constructing a double-channel average pooling self-adaptive fusion mechanism to fuse the two groups of detailed information characteristic graphs to obtain a group of fused detailed information characteristic graphs.

Step 105: and performing two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fusion source feature map.

Acquiring four groups of feature graphs extracted by a dense detail feature extraction network, performing two-channel maximum pooling adaptive fusion on the two groups of obtained source feature graphs, and uniformly distributing pixel intensity information of the two groups of source feature graphs in the fused feature graphs on channel dimensions; performing double-channel average pooling self-adaptive fusion on the two groups of detail compensation feature maps, and uniformly distributing detail texture information of the two groups of detail compensation feature maps in the fusion feature map on channel dimension; and finally, adaptively enhancing effective information of two source images in the fused image. The calculation formula of the fusion source characteristic diagram of the two-channel self-adaptive fusion network is as follows:

in the formula (3), the reaction mixture is,

source profiles representing the input infrared image and visible image, respectively, max represents maximizing the input profile in the channel dimension,

the convolution operation is carried out on the source characteristic diagram of the visible light image, and the X-degree represents that the fusion source characteristic diagram is obtained after weighting through a two-channel maximum pooling self-adaptive fusion mechanism.

Therefore, the step 105 specifically includes:

based on the two-channel self-adaptive fusion network, a formula is adopted

a source signature graph representing an infrared image,

Step 106: and carrying out double-channel average pooling self-adaptive fusion on the detail information characteristic graph of the source image based on the double-channel self-adaptive fusion network to obtain a fusion detail information characteristic graph.

And performing double-channel average pooling self-adaptive fusion on the two groups of detail compensation feature maps, and uniformly distributing detail texture information of the two groups of detail compensation feature maps in the fusion feature map on a channel dimension, so as to finally enhance the effective information of the two source images in the fusion image in a self-adaptive manner. The calculation formula of the fusion detail information characteristic diagram of the two-channel self-adaptive fusion network is as follows:

in the formula (4), the reaction mixture is,

a detail compensation feature map representing an input infrared image,

a detail compensation feature map representing a visible image, mean representing averaging the input feature map over channel dimensions,

showing the convolution operation of the detail compensation feature map of the infrared image,

representing detail compensation features for visible light imagesThe graph is subjected to a convolution operation, σ denotes a sigmoid operation, X^dAnd showing a detail fusion characteristic diagram obtained after weighting through a two-channel average pooling self-adaptive fusion mechanism.

Therefore, the step 106 specifically includes:

based on the two-channel self-adaptive fusion network, a formula is adopted

A detailed information feature map representing an infrared image,

Step 107: and splicing the fusion source characteristic diagram and the fusion detail information characteristic diagram by adopting the dual-channel self-adaptive fusion network to obtain a spliced characteristic diagram.

And splicing two groups of fusion feature maps (a fusion source feature map and a fusion detail information feature map) on the channel dimension as input, and constructing a 1-by-1 convolution network to realize information interaction and information fusion of the feature maps across channels.

Step 108: inputting the spliced feature map into a 1 x 1 convolution network to realize cross-channel information interaction and information fusion of the feature map, and obtaining a fused feature map.

The invention adopts the two-channel self-adaptive fusion network to splice the fusion source characteristic diagram and the fusion detail information characteristic diagram, and inputs the spliced characteristic diagram into a 1 x 1 convolution network to realize the information interaction and information fusion of the characteristic diagram across channels, thereby obtaining the fused characteristic diagram. Using 1 × 1 convolution can retain more pixel intensity information and detail texture information accumulated than 3 × 3 convolution; and then decoding the fused feature map to obtain a fused image, so that the final fused result contains rich source image information.

Step 109: and decoding the fused feature map by adopting a decoding network to obtain a fused image.

And finally, decoding the feature map after fusion to obtain a fused image.

Step 110: and sequentially connecting the dense detail feature extraction network, the two-channel self-adaptive fusion network, the 1 x 1 convolution network and the decoding network to form a self-adaptive enhanced generation countermeasure network.

The adaptive enhancement generation countermeasure network takes the dense detail feature extraction network as a coding network, and respectively applies a dense convolution mode to two input source images and combines a detail information compensation mechanism to extract features; the infrared information and the visible light information in the image are fused in a balanced way on the channel dimension by using a two-channel self-adaptive fusion network; splicing the two groups of fusion feature graphs, and constructing a 1 x 1 convolution network to realize cross-channel interaction and information fusion; and finally, decoding the fused feature map to obtain a fused image.

Step 111: and constructing an adaptive structural similarity loss function according to the brightness similarity, the contrast similarity and the structural similarity of the fused image and the source image.

The method adopts a self-adaptive structural similarity loss function to calculate the structural similarity, the contrast similarity and the brightness similarity of a fused image obtained by a network model (namely the self-adaptive enhancement generation confrontation network) and two source images; calculating the average value of the pixel intensities of the sliding windows of the two source images by using a 3 x 3 sliding window; obtaining the weight corresponding to each window according to the mean sigmoid; and calculating the structural similarity of all windows, calculating the average value as a final adaptive structural similarity loss function value, and adjusting the network parameters of the network model through back propagation, so that the effective image information can be adaptively enhanced in the spatial dimension. The adaptive structural similarity loss function calculation formula is shown in formulas (5) to (9):

brightness similarity:

contrast similarity:

structural similarity:

structural similarity:

wherein x and y represent input source image and fusion image, mu_x、μ_yRepresenting the mean, σ, of two images (source image and fused image)_x、σ_yRepresenting the standard deviation, σ, of two images_xyRepresenting the covariance of the two images; c₁、C₂、C₃Is an extremely small number in order to ensure the stability of the algorithm; ssim (x, y) represents the structural similarity of the two pictures.

The adaptive structural similarity loss function is expressed as follows:

in the formula (9), vi_wRepresenting blocks of visible light images in a sliding window, f_wRepresenting a fused image block in a sliding window, ir_wRepresenting an infrared image block in a sliding window; ssim (vi)_w+f_w) Representing the structural similarity of the fused image block and the visible image block in the sliding window, ssim (ir)_w+f_w) Representing the structural similarity of the fusion image block and the infrared image block in the sliding window;

Therefore, the step 111 specifically includes:

using a formula

using a formula

using a formula

Step 112: and training the network parameters of the adaptive enhancement generation countermeasure network through back propagation based on the adaptive structure similarity loss function, and generating the trained adaptive enhancement generation countermeasure network.

Based on the adaptive structure similarity loss function, the whole network model is trained for multiple times through back propagation, and the network after parameter training optimization can adaptively enhance infrared and visible light image information.

Step 113: and adopting the trained adaptive enhancement to generate a countermeasure network for image fusion of the infrared image and the visible light image.

The invention adaptively enhances and generates the input of the countermeasure network into an infrared image and a visible light image, and outputs the input into a fusion image of the infrared image and the visible light image. The trained adaptive enhancement generation countermeasure network is adopted to perform image fusion on the infrared image shown in the figure 1 and the visible light image shown in the figure 2, and the generated fusion image is shown in figure 6. As can be seen from FIG. 6, the adoption of the invention to adaptively enhance and generate the fusion image generated by the countermeasure network enhances the image detail information, balances the infrared information and the visible light information in the fusion image, and greatly improves the image quality of the fusion image.

The method utilizes the dense detail feature extraction network to solve the problem of information loss of the end-to-end generation countermeasure network during feature extraction; the problem of effective information distribution of the infrared image and the visible light image in the fused image is solved on the channel dimension and the space dimension respectively by utilizing a double-channel attention mechanism and a self-adaptive structure similarity loss function, and the quality of the fused image is effectively improved.

Based on the image fusion method for generating the confrontation network based on the adaptive enhancement provided by the invention, the invention also provides an image fusion system for generating the confrontation network based on the adaptive enhancement, and the system comprises:

Wherein, the feature extraction module specifically comprises:

a feature extraction unit for extracting a network formula p by using the dense detail feature based on the dense detail feature extraction networkⁱ＝convⁱ(x¹,cat(…,conv²(cat(x^i-2,conv¹(cat(x^i-1,xⁱ) ))) and y) are providedⁱ＝cat(pⁱ,broadcast(x)-pⁱ) Performing feature extraction on the source image to obtain a source feature map and a detail information feature map of the source image; wherein x represents the source image, xⁱA feature map representing an i-th layer of the dense detail feature extraction network; cat () represents splicing the feature graph in brackets on the feature channel; convⁱ() Representing the ith layer of convolution operation for extracting the features of the spliced feature map in the brackets; p is a radical ofⁱRepresenting the extracted ith layer source characteristic diagram; broadcast (x) -pⁱRepresentation of the source profile pⁱA corresponding ith layer detail information feature map, broadcast (x) represents that a broadcasting mechanism automatically expands the dimension of the source image x; y isⁱAnd representing all feature maps of the ith layer of the dense detail feature extraction network.

The source feature map fusion module specifically comprises:

a source signature graph representing an infrared image,

The detail information feature map fusion module specifically comprises:

a detail information feature map fusion unit for adopting a formula based on the dual-channel self-adaptive fusion network

A detailed information feature map representing an infrared image,

The adaptive structure similarity loss function building module specifically comprises:

a brightness similarity calculation unit for employing a formula

a contrast similarity calculation unit for employing a formula

a structural similarity calculation unit for employing a formula

The invention discloses a method and a system for fusing infrared and visible light images based on a self-adaptive enhancement generation countermeasure network, wherein the method comprises the steps of firstly, respectively inputting an infrared image and a visible light image; constructing a dense detail feature extraction network as a coding network, and respectively applying a dense convolution mode and a detail information compensation mechanism to the two input source images to carry out feature extraction so as to obtain two groups of source feature images and the other two groups of detail information feature images; constructing a two-channel maximum pooling self-adaptive fusion mechanism to fuse two groups of source characteristic graphs to obtain a group of fusion source characteristic graphs; constructing a double-channel average pooling self-adaptive fusion mechanism to fuse two groups of detail information characteristic graphs to obtain a group of fused detail information characteristic graphs; splicing the two groups of fusion feature graphs, and constructing a 1 x 1 convolution network to realize cross-channel interaction and information fusion; finally, decoding the fused feature map to obtain a fused image; and adding an adaptive structural similarity loss function when training the whole network model. According to the invention, a detail compensation mechanism is introduced into a generator coding network to enhance the details of the fused image and reduce information loss, infrared information and visible light information in the fused image are balanced in channel dimension by using a dual-channel self-adaptive fusion network, the brightness similarity, the contrast similarity and the structure similarity of the fused image and two source images are adaptively enhanced in space dimension by adding a self-adaptive structure similarity loss function, and a dense detail feature extraction network, a dual-channel attention mechanism and a self-adaptive structure similarity loss function are utilized to optimize an infrared and visible light image fusion network model, so that the problems of information loss and effective information distribution in the prior art are solved, and the image fusion quality is effectively improved.

The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.

The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims

1. An image fusion method for generating a countermeasure network based on adaptive enhancement is characterized by comprising the following steps:

2. The method according to claim 1, wherein the extracting the features of the source image based on the dense detail feature extraction network to obtain a source feature map and a detail information feature map of the source image specifically comprises:

3. The method according to claim 1, wherein the performing two-channel maximum pooling adaptive fusion on the source feature map of the source image based on the two-channel adaptive fusion network to obtain a fused source feature map specifically comprises:

based on the two-channel self-adaptive fusion network, a formula is adopted

a source signature graph representing an infrared image,

representing a convolution operation on a source feature map of a visible light image; σ () represents a sigmoid operation; x^oAnd representing a fusion source characteristic diagram obtained after weighting through a two-channel maximum pooling self-adaptive fusion mechanism.

4. The method according to claim 1, wherein the performing two-channel average pooling adaptive fusion on the detail information feature map of the source image based on the two-channel adaptive fusion network to obtain a fused detail information feature map specifically comprises:

based on the two-channel self-adaptive fusion network, a formula is adopted

A detailed information feature map representing an infrared image,

5. The method according to claim 1, wherein the constructing an adaptive structural similarity loss function according to the luminance similarity, the contrast similarity, and the structural similarity of the fused image and the source image comprises:

using a formula

using a formula

using a formula

6. An image fusion system for generating a countermeasure network based on adaptive enhancement, comprising:

7. The system of claim 6, wherein the feature extraction module specifically comprises:

8. The system according to claim 6, wherein the source feature map fusion module specifically comprises:

a source signature graph representing an infrared image,

9. The system according to claim 6, wherein the detail information feature map fusion module specifically includes:

A detailed information feature map representing an infrared image,

performing convolution operation on a detail information characteristic diagram of the visible light image; σ () represents a sigmoid operation;

and showing a fusion detail information characteristic diagram obtained after weighting through a dual-channel average pooling self-adaptive fusion mechanism.

10. The system according to claim 6, wherein the adaptive structural similarity loss function building module specifically comprises:

a brightness similarity calculation unit for employing a formula

a contrast similarity calculation unit for employing a formula

a structural similarity calculation unit for employing a formula