CN106897987A - Image fusion method based on shift-invariant shearlet transform and stacked autoencoder
- Publication number: CN106897987A
- Application number: CN201710053310.3A
- Authority: CN (China)
- Prior art keywords: image, fusion, high-frequency sub-band, fused
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T 5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06T 9/00 — Image coding
- G06T 2207/10048 — Infrared image
- G06T 2207/10052 — Images from lightfield camera
- G06T 2207/10081 — Computed x-ray tomography [CT]
- G06T 2207/10088 — Magnetic resonance imaging [MRI]
- G06T 2207/20048 — Transform domain processing
- G06T 2207/20221 — Image fusion; Image merging
Abstract
Description
Technical Field
The invention relates to an image fusion method based on the shift-invariant shearlet transform and stacked autoencoders. It is a fusion method in the technical field of image processing, with wide application in areas such as military systems and clinical medical diagnosis.
Background Art
Because a single image carries limited information, it often cannot meet the demands of practical applications. Image fusion is a technique that combines images of the same scene, captured by multiple sensors, into a single image by means of a fusion algorithm; the fused image effectively combines the strengths of the source images and is therefore better suited to human visual perception. Image fusion was first proposed around the 1970s, and in recent years, with the rapid development of multi-sensor technology, it has found wide application in military reconnaissance, medical diagnosis, remote sensing, and other fields.
Image fusion methods can be roughly divided into spatial-domain methods and transform-domain methods. Spatial-domain fusion operates directly on image pixels and mainly includes weighted fusion and fusion based on principal component analysis (PCA). Such methods involve no complicated transform and are generally fast, but they share a common drawback: the fused image tends to be distorted. Transform-domain fusion generally decomposes each image at multiple scales and then fuses the sub-band coefficients. Multi-scale decomposition tools include the discrete wavelet transform (DWT), stationary wavelet transform (SWT), contourlet transform (CT), non-subsampled contourlet transform (NSCT), shearlet transform (ST), and shift-invariant shearlet transform (SIST). The DWT, proposed by Mallat in 1989, offers time-frequency localization and multi-resolution analysis, but it captures only limited directional information and suffers from Gibbs oscillations. Rockinger showed that the SWT suppresses the Gibbs oscillations of the DWT, but its high-frequency sub-bands still cover only the horizontal, vertical, and diagonal directions. The CT proposed by Minh N. Do has strong directional selectivity, but because its transform involves downsampling it lacks shift invariance. The NSCT proposed by da Cunha uses non-subsampled filters and is shift-invariant. Guo et al. proposed the ST, which is computationally more efficient than the CT and places no limit on the number of directions, yet its downsampled filters again sacrifice shift invariance. The SIST proposed by Easley et al. adopts a non-subsampled pyramid decomposition and is shift-invariant, which gives it better prospects in the field of image fusion.
The design of the fusion rule determines the quality of the fused result. The main step in transform-domain fusion is to design a suitable scheme for merging the sub-band coefficients at each level so as to obtain a fused image that better matches human visual perception. This step involves two basic problems: choosing the fusion strategy and choosing the activity measure.
Commonly used sub-band coefficient fusion rules include absolute-maximum selection and weighted averaging. Absolute-maximum selection takes, for each position, the sub-band coefficient with the larger activity measure as the fused coefficient. The weighted-averaging rule computes a weight for each coefficient from its activity measure and fuses the coefficients as a weighted sum. Weighted averaging better preserves the useful information of the source images, while absolute-maximum selection improves the contrast of the fused image.
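A minimal numpy sketch of these two rules follows; the function names, and the use of |coefficient| (or a supplied activity array) as the activity measure, are illustrative choices, not prescribed by the patent:

```python
import numpy as np

def fuse_abs_max(c_a, c_b):
    """Absolute-maximum selection: keep, per position, the coefficient
    whose activity measure (here simply |coefficient|) is larger."""
    return np.where(np.abs(c_a) >= np.abs(c_b), c_a, c_b)

def fuse_weighted_average(c_a, c_b, act_a, act_b, eps=1e-12):
    """Weighted averaging: weights derived from the activity measures
    act_a, act_b (e.g. local energy), normalised to sum to one."""
    w_a = act_a / (act_a + act_b + eps)
    return w_a * c_a + (1.0 - w_a) * c_b
```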
The design of the activity measure can also be understood as feature selection on pixels or sub-band coefficients; it must accurately reflect the characteristics of the data from some angle. In PCA-based fusion, He et al. first obtain the principal component from the eigenvalues and eigenvectors of the correlation coefficient matrix between the low-resolution band images, then grey-stretch the high-resolution band image so that it has the same mean and variance as the principal component, and finally substitute this high-resolution image for the principal component and apply the inverse PCA transform to obtain the fused image. Tensors excel at describing high-dimensional data and have also performed well in image fusion: Liang et al. first decompose the images with a higher-order singular value decomposition and then use the 1-norm of the coefficients as the activity measure to build the fusion rule. Sparse representation, which describes complex data as a linear combination of dictionary atoms, has likewise been widely applied: Yang et al. first partition the images into overlapping patches, decompose the patches to obtain sparse coefficients, and then design fusion rules on those coefficients. In recent years, deep learning has achieved great success in computer vision; deep methods excel at discovering hierarchical features in complex data sets. However, because practical image fusion lacks sufficiently large labelled training sets, supervised methods such as convolutional neural networks (CNNs) have no precedent in image fusion, whereas the stacked autoencoder (SAE), an unsupervised deep neural network, fits the image fusion setting. An SAE built by stacking several autoencoder (AE) layers can discover hierarchical image features, and if a sparsity constraint is imposed on the hidden-layer neurons (yielding a stacked sparse autoencoder, SSAE), the extracted features are also sparse, which further suits the needs of image fusion.
Summary of the Invention
The purpose of the present invention is to address the above deficiencies of the prior art by proposing an image fusion algorithm based on the shift-invariant shearlet transform and a stacked autoencoder network, so as to preserve image detail, enhance contrast and contour edges, improve visual quality, and raise the overall quality of image fusion. The specific technical scheme of the invention is as follows:
1) First apply the SIST to the two images to be fused, decomposing each source image $s \in \{A, B\}$ into a low-frequency sub-band $L_s$ and a set of high-frequency sub-bands $H_s$.
2) Design different fusion rules for the low-frequency and high-frequency sub-band coefficients:
2.1) The low-frequency sub-band coefficients carry the basic information of the image, so $L_A$ and $L_B$ are fused by averaging. The low-frequency coefficients obtained from the SIST decomposition contain the main energy of the image and approximate the original image, so they are fused by taking the average:

$$L_F(x,y) = \frac{1}{2}\big(L_A(x,y) + L_B(x,y)\big)$$

where $L_A(x,y)$, $L_B(x,y)$, and $L_F(x,y)$ denote the low-frequency coefficients of the source images A and B and of the fused image F at the point $(x,y)$.
2.2) The high-frequency sub-band coefficients $H_A$ and $H_B$ contain the detail information and are fused with a maximum-selection rule based on SSAE features. High-frequency coefficients carry the texture of the image, and the SSAE, as a deep learning method, is good at learning high-dimensional structural features from complex data sets. Considering these factors, the invention proposes an SSAE-feature-based fusion rule: the sub-bands to be fused are first partitioned into small blocks and each block is reshaped into a vector $v_s(k)$, $s \in \{A, B\}$; these vectors serve as training data for a two-layer SSAE network, which is then used to encode each block into a feature; next, the spatial frequency $SF_s(k)$ of each feature is computed, and the blocks are fused with a maximum-selection rule on $SF_s(k)$.
Finally, all fused vectors are converted back into matrix form, and the inverse of the sliding-window transform yields the fused coefficient matrix $H_F$; pixels in overlapping regions are averaged.
3) The fused image is obtained via the inverse SIST.
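The three steps compose as sketched below. This is a structural outline only: `decompose` and `reconstruct` stand in for a shift-invariant shearlet toolbox (the embodiment uses a MATLAB implementation; no Python API is given), and `fuse_high` stands in for the SSAE/spatial-frequency rule of step 2.2.

```python
def fuse_images(img_a, img_b, decompose, reconstruct, fuse_high):
    """Overall pipeline of the proposed method (steps 1-3).
    decompose(img) -> (low, [high_1, ..., high_n]); reconstruct inverts it."""
    low_a, highs_a = decompose(img_a)
    low_b, highs_b = decompose(img_b)
    low_f = 0.5 * (low_a + low_b)                 # step 2.1: average low-frequency sub-bands
    highs_f = [fuse_high(h_a, h_b)                # step 2.2: SSAE-feature rule per sub-band
               for h_a, h_b in zip(highs_a, highs_b)]
    return reconstruct(low_f, highs_f)            # step 3: inverse SIST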
Compared with existing medical image fusion methods, the present invention has the following advantages:
1. The invention adopts the shift-invariant shearlet transform (SIST) as its multi-scale decomposition tool. Compared with the wavelet transform (DWT), SIST captures more directional information and eliminates pseudo-Gibbs artifacts; compared with the stationary wavelet transform (SWT), it yields high-frequency sub-bands in more directions; compared with the contourlet transform (CT), which lacks shift invariance, SIST involves no downsampling and is therefore shift-invariant; and compared with the non-subsampled contourlet transform (NSCT), it places no limit on the number of directions and is computationally more efficient.
2. The invention adopts a stacked sparse autoencoder (SSAE) as its feature extraction tool. Compared with principal component analysis (PCA), the SSAE is a data-driven extractor that learns its own rules from the input images and therefore yields features more representative than hand-designed ones; compared with tensor methods and the pulse-coupled neural network (PCNN), the SSAE, being a deep network, extracts hierarchical features and is better suited to representing image data.
3. The invention introduces the spatial frequency when constructing the activity measure. Compared with using the SSAE features directly, the activity measure incorporating spatial frequency has stronger local contrast and better represents the characteristics of the source images, so the edge and texture information of the images is preserved to a greater extent.
Description of the Drawings:
Figure 1 is the overall fusion framework of the invention.
Figure 2 is the structure of the shift-invariant shearlet transform.
Figure 3 is the structure of a single-layer sparse autoencoder.
Figure 4 is the structure of the stacked autoencoder.
Figure 5 is a schematic diagram of the high-frequency sub-band fusion rule.
Figures 6(a) and 6(b) are the multi-focus images to be fused in the first embodiment; (c) is the GP-based fused image; (d) the DWT-based fused image; (e) the stDWT-based fused image; (f) the PCNN-based fused image; (g) the N-PCNN-based fused image; (h) the SR-based fused image; (i) the fused image of the proposed method.
Figure 7(a) is the visible-light image to be fused in one embodiment; (b) is the infrared image to be fused; (c) is the GP-based fused image; (d) the DWT-based fused image; (e) the stDWT-based fused image; (f) the PCNN-based fused image; (g) the N-PCNN-based fused image; (h) the SR-based fused image; (i) the fused image of the proposed method.
Figure 8(a) is the CT image to be fused in one embodiment; (b) is the MRI image to be fused; (c) is the GP-based fused image; (d) the DWT-based fused image; (e) the stDWT-based fused image; (f) the PCNN-based fused image; (g) the N-PCNN-based fused image; (h) the SR-based fused image; (i) the fused image of the proposed method.
Detailed Description:
Embodiments of the invention are described in detail below with reference to the drawings. The embodiment is carried out on the premise of the technical scheme of the invention, as shown in Figure 1; the detailed implementation and concrete operating steps are as follows:
Step 1: Decompose the two multi-focus images to be fused with the SIST, obtaining the low-frequency sub-bands $L_A$, $L_B$ and the high-frequency sub-bands $H_A$, $H_B$. The multi-scale Laplacian-pyramid (LP) decomposition uses the "maxflat" filter, the directional filter bank uses "pmaxflat", and the directional decomposition parameters are set to [3, 4, 5].
Step 2: Fuse the low-frequency and high-frequency sub-band coefficients:
1) Fuse the low-frequency coefficients by averaging:

$$L_F(x,y) = \frac{1}{2}\big(L_A(x,y) + L_B(x,y)\big)$$

where $L_A(x,y)$, $L_B(x,y)$, and $L_F(x,y)$ denote the low-frequency coefficients of the source images A and B and of the fused image F at the point $(x,y)$.
2) Fuse the high-frequency coefficients $H_A$ and $H_B$ with the rule based on SSAE features and spatial frequency:
2.1) Partition each high-frequency sub-band $H_s$, $s \in \{A, B\}$, into num small blocks with the sliding-window technique.
2.2) Convert every block into a vector $v_s(k)$, $1 \le k \le num$, and use these vectors as the training set for the first autoencoder network:
The goal of autoencoder training is a set of optimal weights and biases that minimizes the network's reconstruction error. The vectors $v_s(k)$ serve as both the input $x$ and the reconstruction target of the first-layer autoencoder, whose cost combines the reconstruction error with a sparsity penalty:

$$J(W_{l,1}, b_{l,1}) = \frac{1}{2\,num}\sum_{k=1}^{num}\big\|\hat{v}_s(k) - v_s(k)\big\|^2 + \beta\sum_{j} \mathrm{KL}\big(\rho \,\big\|\, \hat{\rho}_j\big), \qquad \mathrm{KL}\big(\rho \,\big\|\, \hat{\rho}_j\big) = \rho\log\frac{\rho}{\hat{\rho}_j} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_j}$$

The optimal $W_{1,1}$ and $b_{1,1}$ are found by gradient descent:

Step 1: set $l = 1$;
Step 2: set $W_{l,1} := 0$, $b_{l,1} := 0$, $\Delta W_{l,1} := 0$, $\Delta b_{l,1} := 0$;
Step 3: compute the reconstruction error $J(W_{l,1}, b_{l,1})$;
Step 4: while $J(W_{l,1}, b_{l,1}) > 10^{-6}$:
    for $i = 1$ to $m$: compute the gradients $\nabla_{W_{l,1}} J$ and $\nabla_{b_{l,1}} J$, and update $W_{l,1} := W_{l,1} - \alpha \nabla_{W_{l,1}} J$, $b_{l,1} := b_{l,1} - \alpha \nabla_{b_{l,1}} J$;

where $W_{l,1}$ and $b_{l,1}$ are the weight matrix and bias, $\Delta W_{l,1}$ and $\Delta b_{l,1}$ their increments, $J(W_{l,1}, b_{l,1})$ the reconstruction error, $\mathrm{KL}(\rho\,\|\,\hat{\rho}_j)$ the Kullback-Leibler (KL) divergence, $\hat{\rho}_j$ the average activation of hidden-layer neuron $j$, $\rho$ the sparsity coefficient, $\beta$ the coefficient of the sparsity penalty term, $\alpha$ the update rate, and $m$ the maximum number of iterations.
After the optimal $W_{1,1}$ and $b_{1,1}$ are obtained, each vector $v_s(k)$ is encoded into the hidden-layer activation of the first encoder:

$$a_{s,1}(k) = \sigma\big(W_{1,1}\, v_s(k) + b_{1,1}\big)$$

where $\sigma(\cdot)$ is the activation function (taken here to be the sigmoid) and $W_{1,1}$ and $b_{1,1}$ are the weight matrix and bias of the first-layer encoder.
With the first-layer activations $a_{s,1}$ as both input and reconstruction target of the second autoencoder, the same procedure used for $W_{1,1}$ and $b_{1,1}$ yields the optimal $W_{2,1}$ and $b_{2,1}$, from which the second-layer hidden activation $a_{s,2}(k)$ is computed and taken as the feature $F_s(k)$ of block $k$.
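A compact numpy sketch of this two-layer SSAE feature extractor follows. It is a minimal full-batch implementation under stated assumptions: sigmoid activations, the standard KL sparsity penalty, block vectors scaled to [0, 1], and illustrative hyper-parameters (hidden sizes, ρ, β, α, iteration count), none of which are fixed by the patent; a fixed iteration count replaces the 10⁻⁶ error threshold above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sparse_ae(X, n_hidden, rho=0.05, beta=3.0, alpha=0.5, iters=400, seed=0):
    """Train one sparse autoencoder layer by full-batch gradient descent.
    X: (n_features, m_samples); columns are the vectorised sub-band blocks."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W1 = rng.normal(0.0, 0.01, (n_hidden, n)); b1 = np.zeros((n_hidden, 1))
    W2 = rng.normal(0.0, 0.01, (n, n_hidden)); b2 = np.zeros((n, 1))
    for _ in range(iters):
        A = sigmoid(W1 @ X + b1)                  # hidden activations
        Xhat = sigmoid(W2 @ A + b2)               # reconstruction of the input
        rho_hat = A.mean(axis=1, keepdims=True)   # average activation per hidden unit
        # backprop deltas: squared error plus the KL sparsity term on the hidden layer
        d2 = (Xhat - X) * Xhat * (1.0 - Xhat)
        sparse = beta * (-rho / rho_hat + (1.0 - rho) / (1.0 - rho_hat))
        d1 = (W2.T @ d2 + sparse) * A * (1.0 - A)
        W2 -= alpha * (d2 @ A.T) / m
        b2 -= alpha * d2.mean(axis=1, keepdims=True)
        W1 -= alpha * (d1 @ X.T) / m
        b1 -= alpha * d1.mean(axis=1, keepdims=True)
    return W1, b1

def ssae_features(X, hidden_sizes=(64, 32)):
    """Stack two sparse AEs: train layer 1 on X, layer 2 on layer-1 activations."""
    feats = X
    for h in hidden_sizes:
        W, b = train_sparse_ae(feats, h)
        feats = sigmoid(W @ feats + b)
    return feats
```

In step 2.2, `X` would hold the vectorised blocks $v_s(k)$ as columns; the returned columns are the features $a_{s,2}(k)$.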
2.3) To sharpen the discrimination of the features, the spatial frequency is introduced; concretely, the spatial frequency of the feature $F_s(k)$ serves as the activity measure:

$$RF_s(k) = \sqrt{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=2}^{N}\big[F_s^k(i,j) - F_s^k(i,j-1)\big]^2}, \qquad CF_s(k) = \sqrt{\frac{1}{MN}\sum_{i=2}^{M}\sum_{j=1}^{N}\big[F_s^k(i,j) - F_s^k(i-1,j)\big]^2}$$

$$SF_s(k) = \sqrt{RF_s(k)^2 + CF_s(k)^2}$$

where $F_s^k(i,j)$ is the value of the feature at position $(i,j)$, with $1 \le i \le M$ and $1 \le j \le N$; $M$ and $N$ are the numbers of rows and columns of the feature, and $k$ is the block index, $1 \le k \le num$.
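The measure transcribes directly into numpy; this sketch assumes the feature has been reshaped to an M×N block:

```python
import numpy as np

def spatial_frequency(f):
    """SF of a 2-D feature block f (M rows, N columns): root of the summed
    squared horizontal and vertical first differences, normalised by MN."""
    mn = f.size
    rf2 = np.sum(np.diff(f, axis=1) ** 2) / mn   # row frequency (horizontal differences)
    cf2 = np.sum(np.diff(f, axis=0) ** 2) / mn   # column frequency (vertical differences)
    return np.sqrt(rf2 + cf2)
```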
2.4) For each pair of blocks $v_A(k)$ and $v_B(k)$, fuse by selecting the block with the larger spatial frequency:

$$v_F(k) = \begin{cases} v_A(k), & SF_A(k) \ge SF_B(k) \\ v_B(k), & \text{otherwise} \end{cases}$$
2.5) Reshape the fused vectors $v_F(k)$ into matrix form and apply the inverse of the sliding-window transform to obtain the fused coefficient matrix $H_F$, averaging pixels in overlapping regions.
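Steps 2.1, 2.4, and 2.5 together amount to sliding-window block extraction, per-block maximum selection on SF, and an overlap-averaging inverse. The sketch below assumes a block size and step the patent does not state:

```python
import numpy as np

def extract_blocks(sub, size=8, step=4):
    """Step 2.1: slide a size-by-size window with the given step; return the
    blocks as columns of a matrix plus their top-left positions."""
    cols, pos = [], []
    for i in range(0, sub.shape[0] - size + 1, step):
        for j in range(0, sub.shape[1] - size + 1, step):
            cols.append(sub[i:i + size, j:j + size].reshape(-1))
            pos.append((i, j))
    return np.stack(cols, axis=1), pos

def select_blocks(v_a, v_b, sf_a, sf_b):
    """Step 2.4: per block k, keep the source block with the larger SF."""
    keep_a = sf_a >= sf_b                      # boolean vector, one entry per block
    return np.where(keep_a[None, :], v_a, v_b)

def assemble_blocks(cols, pos, shape, size=8):
    """Step 2.5: inverse sliding-window transform; overlaps are averaged."""
    acc, cnt = np.zeros(shape), np.zeros(shape)
    for k, (i, j) in enumerate(pos):
        acc[i:i + size, j:j + size] += cols[:, k].reshape(size, size)
        cnt[i:i + size, j:j + size] += 1.0
    return acc / np.maximum(cnt, 1.0)
```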
Step 3: Apply the inverse SIST to obtain the fused image F.
Experimental conditions and methods:
Hardware platform: Intel(R) processor, 1.80 GHz CPU, 1.0 GB RAM;
Software platform: MATLAB R2016a. Three groups of registered source images are used in the experiments: multi-focus images, infrared-visible images, and medical images, each of size 256×256 in TIF format. The multi-focus source images are shown in Figures 6(a) and 6(b): Figure 6(a) is focused on the left, Figure 6(b) on the right. The infrared-visible images are shown in Figures 7(a) and 7(b): Figure 7(a) is the visible-light image, Figure 7(b) the infrared image. The medical images are shown in Figures 8(a) and 8(b): Figure 8(a) is the CT image, Figure 8(b) the MRI image.
Simulation experiments:
To verify the feasibility and effectiveness of the invention, three groups of test images were used: multi-focus, infrared-visible, and medical. The fusion results are shown in Figures 6, 7, and 8.
Simulation 1: Following the technical scheme of the invention, the multi-focus source images (Figures 6(a) and 6(b)) are fused. Analysis of Figures 6(c)-6(i) shows that the invention brings the dial regions of both small clocks into focus; the dial scales and text are the sharpest, the fusion result introduces no extra noise, and the overall subjective visual effect is the best.
Simulation 2: Following the technical scheme of the invention, the infrared-visible source images (Figures 7(a) and 7(b)) are fused. Analysis of Figures 7(c)-7(i) shows that the invention preserves all infrared targets in the fused image, which is the most important task in infrared-visible fusion; moreover, in the background of the image (e.g., the branches and leaves of the trees) it provides the clearest texture detail.
Simulation 3: Following the technical scheme of the invention, the medical source images (Figures 8(a) and 8(b)) are fused. Analysis of Figures 8(c)-8(i) shows that the invention retains all bone contours and tissue information in the fused result; overall, the fused image has the highest contrast at the bones and provides the richest and clearest information in the soft tissue.
Tables 1, 2, and 3 give the objective evaluation indices of the experimental results of the various fusion methods on the three data sets, with the optimal value of each index shown in bold. GP denotes fusion based on the Gaussian gradient pyramid decomposition; DWT, fusion based on the discrete wavelet transform; strDWT, fusion based on the structure tensor and discrete wavelet decomposition; PCNN, fusion based on the pulse-coupled neural network; N-PCNN, fusion based on the non-subsampled contourlet transform and pulse-coupled neural network; SR, fusion based on sparse representation; and Proposed, the fusion method of the invention. Information entropy (EN), average gradient (AG), edge transfer rate (Qabf), edge intensity (EI), mutual information (MI), and standard deviation (SD) serve as the objective evaluation indices.
The data in Tables 1, 2, and 3 show that the fused images obtained by the method of the invention outperform the other fusion methods on objective indices such as information entropy, average gradient, mutual information, and standard deviation. Information entropy reflects how much information the image carries: the larger its value, the more information the fused image contains and the better the fusion. The average gradient reflects the sharpness of the image: the larger, the better the visual effect. The edge transfer rate reflects how much edge information of the source images is transferred into the fused image: the closer to 1, the better. Edge intensity measures the richness of edge detail: the larger, the better the subjective effect. Mutual information reflects the correlation between the source images and the fused image: the larger, the better the visual effect. The standard deviation reflects the dispersion of grey levels around the mean: the larger it is, the more spread out the grey levels and the better the visual effect.
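Three of these six indices have compact common formulations, sketched below in numpy; exact definitions vary slightly across the literature, so the code is illustrative rather than the patent's own measurement code.

```python
import numpy as np

def information_entropy(img):
    """EN: Shannon entropy of the 8-bit grey-level histogram."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def average_gradient(img):
    """AG: mean magnitude of local horizontal/vertical intensity changes."""
    f = img.astype(float)
    gx = np.diff(f, axis=1)[:-1, :]   # horizontal differences, cropped to common shape
    gy = np.diff(f, axis=0)[:, :-1]   # vertical differences, cropped to common shape
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))

def standard_deviation(img):
    """SD: dispersion of grey levels around the mean."""
    return float(np.std(img.astype(float)))
```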
Table 1: Objective evaluation indices of the multi-focus image fusion results
Table 2: Objective evaluation indices of the infrared-visible image fusion results
Table 3: Objective evaluation indices of the medical image fusion results
The fusion results of the simulation experiments show that the fused images of the invention are globally sharp and rich in information. Both subjective human visual perception and the objective evaluation indices demonstrate the effectiveness of the invention.
Claims (4)
Priority Applications (1)
- CN201710053310.3A — priority date 2017-01-18, filing date 2017-01-18 — Image fusion method based on shift-invariant shearlet transform and stacked autoencoder
Publications (1)
- CN106897987A — published 2017-06-27
Family
- Family ID: 59198301
- Family application: CN201710053310.3A (filed 2017-01-18) — CN106897987A, status pending
- Country: CN
Citations (3)
- CN104166859A — priority 2014-08-13, published 2014-11-26 — Polarization SAR image classification based on SSAE and FSALS-SVM
- CN104268833A — priority 2014-09-15, published 2015-01-07 — New image fusion method based on shift-invariance shearlet transformation
- CN104899848A — priority 2015-07-02, published 2015-09-09 — Self-adaptive multi-strategy image fusion method based on Riemannian metric
Cited By (15)
- CN108447041A (published 2018-08-24) / CN108447041B (granted 2020-12-15) — Multi-source image fusion method based on reinforcement learning
- CN108564546A (published 2018-09-21) — Model training method, device and photo terminal
- CN108830819A (published 2018-11-16) / CN108830819B (granted 2021-06-18) — Image fusion method and device for depth image and infrared image
- CN109035160A (published 2018-12-18) / CN109035160B (granted 2022-06-21) — Fusion method of medical images and image detection method based on fused medical image learning
- CN109544492A (published 2019-03-29) — Multi-focus image fusion data set production method based on convolutional neural networks
- CN109685752A (published 2019-04-26) — Multi-scale Shearlet domain image fusion processing method based on block decomposition
- CN110084773A (published 2019-08-02) — Image fusion method based on a deep convolutional autoencoder network
- CN110097528A (published 2019-08-06) / CN110097528B (granted 2023-04-18) — Image fusion method based on a joint convolutional autoencoder network
- CN111784619A (published 2020-10-16) / CN111784619B (granted 2023-04-28) — Fusion method of infrared and visible light images
- CN117934574A (published 2024-04-26) — Method, device, equipment and storage medium for optimizing images from an automobile data recorder
Legal Events
- PB01 — Publication
- SE01 — Entry into force of request for substantive examination
- WD01 — Invention patent application deemed withdrawn after publication (application publication date: 2017-06-27)