CN111882514A - Multi-modal medical image fusion method based on double-residual ultra-dense network - Google Patents

Multi-modal medical image fusion method based on double-residual ultra-dense network Download PDF

Info

Publication number
CN111882514A
Authority
CN
China
Prior art keywords
residual
medical image
layer
image
double
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010734334.7A
Other languages
Chinese (zh)
Other versions
CN111882514B (en)
Inventor
王丽芳
王蕊芳
张晋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North University of China
Original Assignee
North University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North University of China filed Critical North University of China
Priority to CN202010734334.7A priority Critical patent/CN111882514B/en
Publication of CN111882514A publication Critical patent/CN111882514A/en
Application granted granted Critical
Publication of CN111882514B publication Critical patent/CN111882514B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 - Image enhancement or restoration
    • G06T 5/50 - Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/0002 - Inspection of images, e.g. flaw detection
    • G06T 7/0012 - Biomedical image inspection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10072 - Tomographic images
    • G06T 2207/10081 - Computed x-ray tomography [CT]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10072 - Tomographic images
    • G06T 2207/10088 - Magnetic resonance imaging [MRI]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20212 - Image combination
    • G06T 2207/20221 - Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-modal medical image fusion method based on a double-residual ultra-dense network, which comprises the following steps: extracting shallow features of the first-modality medical image and the second-modality medical image through the first Conv layer convolution and PReLU layer activation in the double-residual ultra-dense network; extracting deep features through residual learning and ultra-dense connections; and sequentially applying Concat layer channel-dimension splicing, the final Conv layer convolution and PReLU layer activation in the double-residual ultra-dense network to the deep features to obtain a fused image of the first-modality medical image and the second-modality medical image. According to the invention, the double-residual ultra-dense block proposed by combining the residual dense block with ultra-dense connections applies dense connections not only between layers of the same path but also between layers of different paths, so that information is transmitted between the two paths that extract features of different modality images; the extracted deep features are therefore more detailed and abundant, and the loss of useful information in the intermediate layers of the network is reduced.

Description

Multi-modal medical image fusion method based on double-residual ultra-dense network
Technical Field
The invention relates to the technical field of medical image fusion, in particular to a multi-modal medical image fusion method based on a double-residual ultra-dense network.
Background
Image fusion is used in a wide range of applications, such as medical imaging, remote sensing, machine vision, biometric identification, and military applications. The goal of fusion is to obtain better contrast and a better perceptual experience. In recent years, with the increasing demand from clinical applications, research on multi-modal medical image fusion has received growing attention. The objective of multi-modal medical image fusion is to provide a better medical image to assist the surgeon in surgical intervention.
Currently, medical images come in multiple modalities, such as Magnetic Resonance (MR) images, Computed Tomography (CT) images, Positron Emission Tomography (PET) images, and X-ray images, and images of different modalities have their own advantages and limitations. For example, CT displays bone information well but cannot clearly show structural information such as soft tissue; the MR image fully displays soft tissue information but is much weaker at depicting bone information; the PET image provides abundant metabolic information for clinical use but has low resolution. Combining the medical image information of multiple modalities to perform multi-modal image fusion therefore allows these complementary advantages to be exploited. The multi-modal fused image not only retains the characteristics of the original images but also makes up for the shortcomings of single-modality medical images, shows richer detail information, and provides comprehensive information for clinical diagnosis and treatment and for image-guided surgery.
Image fusion methods fall into three levels: pixel level, feature level and decision level. Pixel-level image fusion combines the pixel values at each corresponding point of two or more source images by some fusion rule to compute new pixel values, so that a new fused image is formed point by point. Common pixel-level image fusion methods include spatial-domain methods and transform-domain methods. Spatial-domain image fusion is mainly divided into block-based fusion and region-based fusion, including the logic filtering method, the weighted average method, the mathematical morphology method, the image algebra method, the simulated annealing method and so on; these methods have the advantages of low computational cost and ease of implementation, but their accuracy is poor and they are not well suited to the medical imaging field. Transform-domain methods decompose the source images, combine the decomposed representations using different fusion rules, and finally perform an inverse transform to reconstruct the fused image; they include pyramid-based image fusion, wavelet-transform image fusion, multi-scale decomposition methods and so on, and they can both maintain contrast and reduce blocking effects while having unique advantages in describing the local characteristics of signals. Feature-level image fusion extracts feature information of interest to the observer, such as edges, contours, shapes and local features, from the source images, and then analyzes, processes and integrates this feature information to obtain fused image features. Commonly used methods are the weighted average method, Bayesian estimation and cluster analysis. Decision-level image fusion analyzes, reasons over, recognizes and judges the feature information of each image to form intermediate results, which are then further fused; the final fusion result is a globally optimal decision. This level has good real-time performance, high flexibility and a certain fault tolerance, but its preprocessing cost is higher and more of the original information in the image is lost.
In recent years, deep learning has been widely applied in image processing by virtue of its strong feature extraction and data representation capabilities; it overcomes the tendency of feature-level image fusion methods to lose detail information and has achieved good results. For example, the Pulse Coupled Neural Network (PCNN) has been combined with sparse representation for medical image fusion, the Convolutional Neural Network (CNN) and the Generative Adversarial Network (GAN) have been used for image fusion, Huang et al. proposed the dense network (DenseNet), He et al. proposed the residual network (ResNet), and Qiu et al. proposed a double-residual dense network for image fusion, which improves the spatial resolution of the fused image.
Pixel-level fusion methods have high computational complexity, long running times and strict requirements on registration accuracy, which limits their practical application. Feature-level fusion methods can compress information and eliminate the uncertainty of some features, but they lose more detail information than pixel-level image fusion. Decision-level fusion methods have high preprocessing costs and lose much of the original information in the image. The PCNN requires no learning or training, but it runs slowly and offers no principled way to set its parameters for the best fusion effect; the GAN can improve the stability of an image fusion result, but suffers from unstable training, gradient vanishing and information loss in each layer of transmission; the CNN detects image features well, but suffers from gradient vanishing and gradient explosion, so that the quality of the fused image does not improve as the network depth increases.
Huang et al. proposed the dense network (DenseNet) and He et al. proposed the residual network (ResNet) to solve the problems of gradient vanishing and gradient explosion through dense connections and residual learning; Liu et al. combined ResNet with NSST for image fusion, improving the texture information of the fused image; Qiu et al. proposed a double-residual dense network for image fusion, improving the spatial resolution of the fused image. Using residual learning or dense connections for image fusion improves the spatial resolution of the fused image, but DenseNet and ResNet perform feature fusion using only the last-layer output of the network, so part of the useful information obtained by the intermediate layers is lost and the details of the fused image are unclear.
Disclosure of Invention
The embodiment of the invention provides a multi-modal medical image fusion method based on a double-residual ultra-dense network, which is used for solving the problems in the background technology.
The embodiment of the invention provides a multi-modal medical image fusion method based on a double-residual ultra-dense network, which comprises the following steps:
acquiring a first modality medical image and a second modality medical image;
extracting shallow features of the first modality medical image and the second modality medical image through convolution of a first Conv layer and activation of a PReLU layer in the double-residual ultra-dense network;
extracting deep features of the first modality medical image and the second modality medical image from the shallow features of the first modality medical image and the second modality medical image respectively through residual learning and ultra-dense connection;
and sequentially carrying out Concat layer channel-dimension splicing, final Conv layer convolution and PReLU layer activation in the double-residual ultra-dense network on the deep features of the first-modality medical image and the second-modality medical image to obtain a fused image of the first-modality medical image and the second-modality medical image.
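For orientation, a minimal PyTorch-style sketch of how these steps chain together is given below. The framework choice, all layer sizes, and the pass-through stand-in for the deep-feature step are illustrative assumptions rather than the claimed implementation; the double-residual ultra-dense block itself is detailed in the embodiments that follow.

```python
import torch
import torch.nn as nn

# Stand-ins for the three stages of the method (assumed sizes, not the patented configuration).
shallow_ct = nn.Sequential(nn.Conv2d(1, 64, 7, stride=2, padding=3), nn.PReLU())
shallow_mr = nn.Sequential(nn.Conv2d(1, 64, 7, stride=2, padding=3), nn.PReLU())
fusion = nn.Sequential(
    nn.Conv2d(128, 3, kernel_size=1),                     # 1 x 1 Conv: channel-dimension reduction
    nn.Conv2d(3, 1, kernel_size=3, stride=1, padding=1),  # final Conv layer
    nn.PReLU(),                                           # final PReLU activation
)

ct = torch.rand(1, 1, 256, 256)   # first-modality (CT) medical image
mr = torch.rand(1, 1, 256, 256)   # second-modality (MR) medical image

f1_0, f2_0 = shallow_ct(ct), shallow_mr(mr)      # step 2: shallow features per path
f1_L, f2_L = f1_0, f2_0                          # step 3: deep features (placeholder pass-through here)
fused = fusion(torch.cat([f1_L, f2_L], dim=1))   # step 4: Concat splice, then Conv + PReLU
print(fused.shape)
```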
Further,
the first modality medical image adopts a Computed Tomography (CT) image;
the second modality medical image is a Magnetic Resonance (MR) image.
Further, the extracting shallow features of the first modality medical image and the second modality medical image through the convolution of the first Conv layer and the activation of the PReLU layer in the double-residual ultra-dense network specifically includes:
taking the first convolutional layer of the residual network ResNet101 pre-trained on the image dataset ImageNet as the first Conv layer of the double-residual ultra-dense network;
obtaining shallow features of the computed tomography CT image and the magnetic resonance MR image through the first Conv layer convolution and PReLU layer activation in the double-residual ultra-dense network:
F1^0 = G1^0(X)
F2^0 = G2^0(Y)
wherein X and Y respectively denote the computed tomography CT source image and the magnetic resonance MR source image, F1^0 is the shallow feature of the computed tomography CT image, F2^0 is the shallow feature of the magnetic resonance MR image, G1^0(·) is the combined function of the convolution operation and the activation function of the computed tomography CT path, and G2^0(·) is the combined function of the convolution operation and the activation function of the magnetic resonance MR path.
Further,
using the parametric rectified linear unit as the activation function of the PReLU layer for all convolutional layers in the double-residual ultra-dense network;
the PReLU layer transforms the input feature map through a nonlinear mapping, according to the formula:
Fj^(l+1) = f(Fj^l)
wherein f(·) is a nonlinear function; Fj^(l+1) is the output, i.e. the (l+1)-th layer feature map of path j; Fj^l is the input, i.e. the l-th layer feature map of path j.
Further, the air conditioner is provided with a fan,
first Conv layer of the dual residual ultra dense network: a 7 × 7 convolution kernel with parameters of 64 channels, step size set to 2, and padding to 3; wherein, 256 × 256 × 1 first modality medical images and second modality medical images are input, and shallow feature maps of the first modality medical images and the second modality medical images with the size of 128 × 128 × 64 are output.
Further, the extracting, through residual learning and ultra-dense connection, deep features of the first modality medical image and the second modality medical image from shallow features of the first modality medical image and the second modality medical image, specifically includes:
splicing the shallow feature of the first-modality medical image and the shallow feature of the second-modality medical image in the channel dimension on a Concat layer of the double-residual ultra-dense network;
realizing dimensionality reduction of the number of convolution kernel channels on a 1 × 1 Conv layer of the double-residual ultra-dense network;
on the summation (splicing) layer of the double-residual ultra-dense network, the input-output relationship in residual learning is as follows:
H(x)=F(x)+x
wherein F(x) denotes the residual mapping, specifically G1^L(·) and G2^L(·) learned by the double-residual ultra-dense network, and x denotes the shallow features F1^0 and F2^0; thus, the output of the double-residual ultra-dense network is as follows:
F1^L = G1^L([F1^0, F2^0, F1^1, F2^1, …, F1^(L-1), F2^(L-1)]) + F1^0
F2^L = G2^L([F1^0, F2^0, F1^1, F2^1, …, F1^(L-1), F2^(L-1)]) + F2^0
wherein F1^L and F2^L denote the output of the double-residual ultra-dense network, and G1^L(·) and G2^L(·) denote the functions of the last 1 × 1 Conv layer in the double-residual ultra-dense network;
let F1^l and F2^l respectively denote the output of the l-th layer in the computed tomography CT image path and the magnetic resonance MR image path; the output features of the l-th layer of the two paths of the double-residual ultra-dense network are as follows:
F1^l = G1^l([F1^(l-1), F2^(l-1), F1^(l-2), F2^(l-2), …, F1^0, F2^0])
F2^l = G2^l([F2^(l-1), F1^(l-1), F2^(l-2), F1^(l-2), …, F2^0, F1^0])
wherein G1^l(·) and G2^l(·) respectively denote the function of the l-th convolutional layer on the computed tomography CT image path and the function of the l-th convolutional layer on the magnetic resonance MR image path in the double-residual ultra-dense network.
Further, the carrying out of channel-dimension splicing, Conv layer convolution and PReLU layer activation on the deep features of the first-modality medical image and the second-modality medical image to obtain a fused image of the first-modality medical image and the second-modality medical image specifically includes:
splicing the deep features of the first-modality medical image and the second-modality medical image in the channel or number dimension on a Concat layer of the double-residual ultra-dense network;
performing dimensionality reduction of the number of convolution kernel channels on a 1 × 1 Conv layer of the double-residual ultra-dense network;
performing convolution on the last Conv layer of the double-residual ultra-dense network;
and activating and completing fusion on a PReLU layer of the double-residual ultra-dense network.
Further, the 1 × 1 Conv layer of the double-residual ultra-dense network reduces the number of channels to 3 using three 1 × 1 convolution kernels.
Further, the last Conv layer of the double-residual ultra-dense network is a 3 × 3 convolution kernel with a step size of 1 and padding of 1, and the output size after the convolution operation is 4 × 4 × 1.
The embodiment of the invention provides a multi-modal medical image fusion method based on a double-residual ultra-dense network, which has the following beneficial effects compared with the prior art:
the invention provides a multi-mode medical image fusion method based on a double Residual error Hyper-dense network (DRHDNs), aiming at the problems that the image fusion method based on a Residual error network and a dense network only fuses the features extracted from the last layer of the network, so that part of useful information extracted from the middle layer is lost, and the details and the definition of a fused image are influenced. The DRHDNs comprise two parts of feature extraction and feature fusion: the characteristic extraction part constructs a double-residual error super-dense block by combining residual error learning and super-dense connection to extract deep characteristics of two source images, the residual error learning simplifies a learning target and difficulty by a jump connection mode and realizes target integrity, and the super-dense connection expands dense connection between layers of different paths, so that the loss of useful information of an intermediate layer is reduced, primary information fusion is completed, the characteristic reuse is encouraged by both the residual error learning and the super-dense connection, the characteristic extraction is more sufficient, and the detail information is richer; and the characteristic fusion part firstly splices the two characteristic graphs in a channel, and then obtains a fusion image with more details and more clearness through the reduction sum and convolution. Meanwhile, the invention provides a double-residual super-dense block by combining the residual dense block and the super-dense connection, which not only applies the dense connection between the layers of the same path, but also applies the dense connection between the layers crossing different paths to transmit information between the two paths extracting different modal image characteristics, so that the extracted deep characteristics are more detailed and abundant, and the loss of useful information of a network middle layer is reduced; the dual residual super-dense block encourages feature reuse through residual learning and super-dense concatenation, which helps to more fully extract deep features from shallow features.
Drawings
FIG. 1 is a multi-modal medical image fusion model based on a double-residual ultra-dense network according to an embodiment of the present invention;
FIG. 2 is a block diagram of a dual residual super-dense block architecture according to an embodiment of the present invention;
FIG. 3 shows the CT/MR image fusion result of brain diseases according to the embodiment of the present invention;
fig. 4 shows brain MR/PET image fusion provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a multi-modal medical image fusion method based on a double-residual hyper-dense network (DRHDNs). The DRHDNs comprise two parts: feature extraction and feature fusion.
Considering that the double-residual dense network does not fully use the useful information extracted by the intermediate layers, and that ultra-dense connections have proven effective in medical image segmentation, the method applies ultra-dense connections to the residual dense block of the double-residual dense network to construct the feature extraction part of the DRHDNs. The features of the CT and MR source images are extracted using ultra-dense connections between the two paths, so the extracted features are richer and clearer; residual learning alleviates the gradient vanishing problem and preserves as much of the source image feature information as possible, while the ultra-dense connections allow each path to extract the deep features of its own source image as fully as possible and to preliminarily learn the features of the other source image. Feature extraction is divided into two steps, shallow feature extraction and deep feature extraction by the double-residual ultra-dense block; the model is shown in figure 1.
Let X, Y and Z respectively denote the CT source image, the MR source image, and the fused CT/MR image; feature extraction and feature fusion are expressed by formulas (1)-(3):
F1=G1(X) (1)
F2=G2(Y) (2)
Z=FFN(F1,F2) (3)
wherein F1 and F2 respectively denote the deep features of the CT and MR images, G1(·) and G2(·) respectively denote the functions for extracting the CT and MR features, and FFN(·) denotes the feature fusion function.
A first part: feature extraction
(1) Extracting shallow features
The DRHDNs use the first convolutional layer of ResNet101 pre-trained on ImageNet as their first Conv layer; its parameters are 64 7 × 7 convolution kernels, with the step size set to 2 and the padding to 3. A 256 × 256 × 1 source image is input, and after the Conv layer convolution and PReLU layer activation a feature map of size 128 × 128 × 64 is output. The shallow features F1^0 and F2^0 are given by formulas (4)-(5), wherein G1^0(·) and G2^0(·) respectively denote the combined function of the convolution operation and the activation function of the CT path and of the MR path.
F1^0 = G1^0(X) (4)
F2^0 = G2^0(Y) (5)
All convolutional layers (Conv) in the DRHDNs thereafter use the Parametric Rectified Linear Unit (PReLU) as the activation function.
The PReLU layer is an activation layer, also called a nonlinear mapping layer; it transforms the input feature map through a nonlinear mapping, as shown in formula (6).
Fj^(l+1) = f(Fj^l) (6)
f(·) in formula (6) is a nonlinear function, which is important for convolutional neural networks because, without a nonlinear mapping, a neural network with any number of layers is equivalent to a network with only one layer. In addition, the nonlinear function also enables the convolutional neural network to extract more complex correlations in the image.
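To make the shallow feature extraction concrete, the following minimal PyTorch sketch builds the 7 × 7, 64-channel convolution with step size 2 and padding 3 followed by a PReLU, matching the parameters stated above. The patent does not specify how the single-channel medical image is matched to the three-channel pretrained conv1 of ResNet101, so averaging the RGB kernels below, as well as the torchvision loading step, is an assumption.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet101, ResNet101_Weights

# First Conv layer (7x7, 64 channels, stride 2, padding 3) + PReLU, as in formulas (4)-(5).
conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
prelu = nn.PReLU()

# Assumption: initialize from the ImageNet-pretrained conv1 of ResNet101 by averaging its
# weights over the RGB input channels, since the medical source images have one channel.
pretrained = resnet101(weights=ResNet101_Weights.IMAGENET1K_V1)
with torch.no_grad():
    conv1.weight.copy_(pretrained.conv1.weight.mean(dim=1, keepdim=True))

x = torch.rand(1, 1, 256, 256)   # a 256 x 256 x 1 source image (CT or MR)
f0 = prelu(conv1(x))             # shallow feature map F^0
print(f0.shape)                  # torch.Size([1, 64, 128, 128])
```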
(2) Extracting deep layer features
The feature extraction part constructs a double-residual ultra-dense block which, through residual learning and ultra-dense connections, can extract deeper features, such as clearer edge information, more detail information and more complete texture information, from the CT and MR image information obtained in the previous step.
As shown by the arrows on both sides of the double-residual ultra-dense block in fig. 1, the input information F1^0 and F2^0 of the feature extraction part is added directly to the output of the feature extraction part, which alleviates the gradient vanishing problem and simplifies the learning difficulty.
In a hyper-dense network, the output of the l-th layer (l ≤ 8) for source image s is given by equation (7):
Fs^l = Gs^l([F1^(l-1), F2^(l-1), F1^(l-2), F2^(l-2), …, F1^0, F2^0]), s ∈ {1, 2} (7)
Let F1^l and F2^l respectively denote the output of the l-th layer in the CT path and the MR path; the output features of the l-th layer of the two paths of the double-residual ultra-dense block in the present invention are given by equations (8)-(9):
F1^l = G1^l([F1^(l-1), F2^(l-1), F1^(l-2), F2^(l-2), …, F1^0, F2^0]) (8)
F2^l = G2^l([F2^(l-1), F1^(l-1), F2^(l-2), F1^(l-2), …, F2^0, F1^0]) (9)
wherein G1^l(·) and G2^l(·) respectively denote the function of the l-th convolutional layer on the CT image path and the function of the l-th convolutional layer on the MR image path in the double-residual ultra-dense block.
The double-residual ultra-dense block fully connects the convolution layers in the block and adds residual learning, so that the finally extracted CT and MR characteristics can comprehensively and deeply represent the image characteristics of the path and preliminarily fuse the other source image characteristics.
The end of the double-residual ultra-dense block consists of the Concat layer, the 1 × 1 Conv layer and the splicing layer. The role of the Concat layer is to splice all previously extracted features in the channel or number dimension; as shown in the double-residual ultra-dense block of fig. 1, the Concat layer splices the output of every layer together along the channel dimension. The 1 × 1 Conv layer is very useful in the network: it can flexibly change the channel dimensionality of a matrix without changing its other properties, and it enables cross-channel interaction, information integration and so on. The 1 × 1 Conv layer reduces the number of convolution kernel channels and thereby the amount of computation. Because the Concat layer splices all previous layer outputs together, the computation would otherwise be very heavy, so dimensionality reduction is required; with 64 1 × 1 convolution kernels the output dimensionality can be reduced to 64.
The splicing layer is the key layer for realizing residual learning; the relation between input and output in a residual learning block is shown in formula (10):
H(x)=F(x)+x (10)
where F(x) denotes the residual mapping, in the present invention the functions G1^L(·) and G2^L(·) learned by the double-residual ultra-dense block, and x represents the shallow features F1^0 and F2^0.
Thus, the output of the dual residual super-dense block is shown in equations (11) - (12):
F1^L = G1^L([F1^0, F2^0, F1^1, F2^1, …, F1^(L-1), F2^(L-1)]) + F1^0 (11)
F2^L = G2^L([F1^0, F2^0, F1^1, F2^1, …, F1^(L-1), F2^(L-1)]) + F2^0 (12)
wherein F1^L and F2^L denote the output of the double-residual ultra-dense block, and G1^L(·) and G2^L(·) denote the functions of the last 1 × 1 Conv layer in the double-residual ultra-dense block.
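For illustration, the sketch below implements a small dual-path block in the spirit of formulas (8)-(12): every 3 × 3 layer receives the concatenation of all earlier outputs from both paths (the ultra-dense connections), and each path's output is a 1 × 1 convolution over the concatenated features plus that path's shallow input (the residual learning). The layer count and channel widths are assumptions chosen for brevity; the actual convolution kernel parameters are those of Table 1 in the example analysis below.

```python
import torch
import torch.nn as nn

class DualResidualHyperDenseBlock(nn.Module):
    """Sketch of a double-residual ultra-dense block; layer count and widths are assumptions."""

    def __init__(self, channels=64, growth=64, num_layers=4):
        super().__init__()
        self.layers_ct = nn.ModuleList()
        self.layers_mr = nn.ModuleList()
        for idx in range(num_layers):
            in_ch = 2 * channels + 2 * growth * idx       # all previous outputs of BOTH paths
            self.layers_ct.append(nn.Sequential(nn.Conv2d(in_ch, growth, 3, padding=1), nn.PReLU()))
            self.layers_mr.append(nn.Sequential(nn.Conv2d(in_ch, growth, 3, padding=1), nn.PReLU()))
        out_in = 2 * channels + 2 * growth * num_layers
        self.out_ct = nn.Conv2d(out_in, channels, 1)      # last 1 x 1 Conv of the CT path
        self.out_mr = nn.Conv2d(out_in, channels, 1)      # last 1 x 1 Conv of the MR path

    def forward(self, f1_0, f2_0):
        feats = [f1_0, f2_0]                              # shared feature history across both paths
        for conv_ct, conv_mr in zip(self.layers_ct, self.layers_mr):
            x = torch.cat(feats, dim=1)                   # ultra-dense connection, formulas (8)-(9)
            feats += [conv_ct(x), conv_mr(x)]
        x = torch.cat(feats, dim=1)                       # Concat layer over every layer output
        f1_L = self.out_ct(x) + f1_0                      # residual learning, formula (11)
        f2_L = self.out_mr(x) + f2_0                      # residual learning, formula (12)
        return f1_L, f2_L

block = DualResidualHyperDenseBlock()
f1_0 = torch.rand(1, 64, 128, 128)
f2_0 = torch.rand(1, 64, 128, 128)
f1_L, f2_L = block(f1_0, f2_0)
print(f1_L.shape, f2_L.shape)    # torch.Size([1, 64, 128, 128]) for each path
```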
A second part: feature fusion
As shown in fig. 1, the feature F1^1 extracted from the CT image has abundant bone information, and the feature F2^1 extracted from the MR image has abundant soft tissue information; by fusing the CT and MR features, the obtained fused image contains rich bone information and soft tissue information, has clear contours, and does not lose the detail information of the source images. The feature fusion part of the invention consists of a Concat layer, a 1 × 1 Conv layer, a Conv layer and a PReLU layer; its input is the two deep features F1^1 and F2^1, and its output is the CT/MR fusion image.
The Concat layer can splice two or more features in the channel or number dimension. Existing features can be fused in add mode or Concat mode: Concat concatenates along the channel dimension, that is, the number of features describing the image increases while the amount of information in each feature is unchanged, whereas add is the opposite, increasing the amount of information in each dimension. The invention adopts Concat-mode fusion, which gives features of the same size richer feature expression.
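A brief illustration of the two fusion modes, assuming two 64-channel feature maps of equal spatial size:

```python
import torch

f1 = torch.rand(1, 64, 128, 128)           # deep feature from the CT path
f2 = torch.rand(1, 64, 128, 128)           # deep feature from the MR path

concat_fused = torch.cat([f1, f2], dim=1)  # Concat mode: 128 channels, every feature kept separate
add_fused = f1 + f2                        # add mode: still 64 channels, information summed per channel
print(concat_fused.shape, add_fused.shape)
```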
The 1 × 1 Conv layer can reduce the number of convolution kernel channels and the amount of computation, allowing a more complex network structure while keeping the computational complexity low. The feature fusion part uses three 1 × 1 convolution kernels to reduce the number of channels to 3.
After the Concat layer and the 1 × 1 Conv layer fuse the deep features of the CT image and the MR image, the final fusion is completed through Conv layer convolution and PReLU layer activation. The last Conv layer is a 3 × 3 convolution kernel with a step size of 1 and padding of 1, and the output size after the convolution operation is 4 × 4 × 1. The final CT/MR fusion image not only contains the clear bone information of the CT image and the soft tissue information of the MR image, but also contains clearer detail information.
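A minimal sketch of this feature fusion stage follows, assuming 64-channel deep features per path; the module name FusionHead is illustrative, and the output spatial size here simply follows from the inputs rather than from the 4 × 4 × 1 figure quoted above.

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Sketch of the fusion part: Concat -> 1x1 Conv (to 3 channels) -> 3x3 Conv -> PReLU."""

    def __init__(self, channels=64):
        super().__init__()
        self.reduce = nn.Conv2d(2 * channels, 3, kernel_size=1)          # three 1x1 kernels -> 3 channels
        self.conv = nn.Conv2d(3, 1, kernel_size=3, stride=1, padding=1)  # last Conv: 3x3, step size 1, padding 1
        self.prelu = nn.PReLU()

    def forward(self, f1, f2):
        x = torch.cat([f1, f2], dim=1)    # Concat layer: channel-dimension splice
        x = self.reduce(x)                # 1x1 Conv layer: channel dimensionality reduction
        return self.prelu(self.conv(x))   # final Conv + PReLU activation -> fused image

head = FusionHead()
fused = head(torch.rand(1, 64, 128, 128), torch.rand(1, 64, 128, 128))
print(fused.shape)   # torch.Size([1, 1, 128, 128])
```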
Example analysis:
referring to fig. 2, the features extracted from each layer in the dual residual super-dense block are not only transferred to all layers behind the path, but also transferred to all layers behind another path, and the curve connecting lines represent that the features extracted from the layer are transferred to all layers behind the path. The ultra-dense connection encourages feature reuse, and increases information interaction between the two source images, so that feature information of the two source images is preliminarily fused. In addition, the connection mode promotes feature learning, improves gradient flow and increases implicit deep supervision.
The dual residual super-dense block has 8 3 × 3Conv layers and 1 × 1Conv layer, and the specific convolution kernel parameters are shown in table 1.
TABLE 1 convolution layer parameters for double residual super dense blocks
The double-residual ultra-dense block fully connects the convolution layers in the block and adds residual learning, so that the finally extracted features can comprehensively and deeply represent the image features of the path and preliminarily fuse the features of another source image.
The output of the dual residual super dense block is as follows:
F1^L = G1^L([F1^0, F2^0, F1^1, F2^1, …, F1^(L-1), F2^(L-1)]) + F1^0
F2^L = G2^L([F1^0, F2^0, F1^1, F2^1, …, F1^(L-1), F2^(L-1)]) + F2^0
wherein F1^L and F2^L denote the output of the double-residual ultra-dense block, F1^0 and F2^0 denote the shallow features, and G1^L(·) and G2^L(·) denote the functions of the last 1 × 1 Conv layer in the double-residual ultra-dense block.
In order to verify the effectiveness of the method, the method is verified from two aspects:
1. CT/MR image fusion experiment of brain diseases
In the experiment, a dataset from Harvard Medical School is adopted; all images are 256 × 256 pixels, and registered CT/MR images of brain diseases are selected as the experimental images. The fusion results are shown in fig. 3, where figs. 3(a)-(i) respectively show the brain MR source image, the brain CT source image, and the fusion results of CSMCA, ASR, JPC, NSST-PAPCNN, CNN, DRDNs and the proposed DRHDNs.
As can be seen from fig. 3, the edge preservation of the JPC and NSST-PAPCNN methods is poor, the texture details of the NSST-PAPCNN fused image are unclear, the ASR and JPC methods lose more detail, and the structure preservation of the CNN method is insufficient; the fused image of the DRDNs method preserves details better but has low contrast, while the fused image of the DRHDNs adopted in the present invention has rich detail, clear texture and high contrast. The evaluation indexes in Table 2 confirm that the present invention is superior in detail, clarity and so on.
TABLE 2 Objective evaluation index of brain CT/MR fusion image
2. MR/PET image fusion experiment for brain diseases
In order to examine the general applicability of the DRHDNs method, a fusion experiment on MR/PET images of the brain was performed; the fusion results are shown in fig. 4. In fig. 4, images (a) and (b) respectively show the MR source image and the PET source image. The MR image can fully display soft tissue information, while the PET image provides detailed metabolic information for clinical use but has low resolution; in order to fully retain the metabolic information of the PET image, its color information needs to be preserved during fusion. As can be seen from the fusion results in fig. 4 and the objective evaluation indexes in Table 3, the CSMCA, ASR, JPC and NSST-PAPCNN methods show serious color distortion; the CSMCA method has low definition, the ASR, JPC and NSST-PAPCNN methods produce poor image quality, and the NSST-PAPCNN method shows edge brightness distortion. The CNN and DRDNs methods have good contrast and definition and are better suited to human visual observation, but they do not retain enough detail information; the DRHDNs method has good color retention, clear edges and rich detail information, and its fused image is of better quality. The evaluation index analysis in Table 3 shows that the index values of the NSST-PAPCNN method are lower, while those of the DRHDNs method are higher than those of the other methods, which is consistent with the subjective observations.
TABLE 3 Objective evaluation index of MR/PET fusion image of brain
Although the embodiments of the present invention have been disclosed in the form of several specific embodiments, the embodiments of the present invention are not limited thereto; various modifications and alterations can be made by those skilled in the art without departing from the spirit and scope of the invention, and any such changes shall fall within the protection scope of the invention.

Claims (9)

1. A multi-modal medical image fusion method based on a double-residual ultra-dense network is characterized by comprising the following steps:
acquiring a first modality medical image and a second modality medical image;
extracting shallow features of the first modality medical image and the second modality medical image through convolution of a first Conv layer and activation of a PReLU layer in the double-residual ultra-dense network;
extracting deep features of the first modality medical image and the second modality medical image from the shallow features of the first modality medical image and the second modality medical image respectively through residual learning and ultra-dense connection;
and sequentially carrying out Concat layer channel-dimension splicing, final Conv layer convolution and PReLU layer activation in the double-residual ultra-dense network on the deep features of the first-modality medical image and the second-modality medical image to obtain a fused image of the first-modality medical image and the second-modality medical image.
2. The multi-modal medical image fusion method based on the dual residual ultra-dense network as claimed in claim 1,
the first modality medical image adopts a Computed Tomography (CT) image;
the second modality medical image is a Magnetic Resonance (MR) image.
3. The method according to claim 2, wherein the extracting shallow features of the first modality medical image and the second modality medical image through the first Conv layer convolution and the PReLU layer activation in the dual-residual super-dense network comprises:
taking the first convolutional layer of the residual network ResNet101 pre-trained on the image dataset ImageNet as the first Conv layer of the double-residual ultra-dense network;
obtaining shallow features of the computed tomography CT image and the magnetic resonance MR image through the first Conv layer convolution and PReLU layer activation in the double-residual ultra-dense network:
F1^0 = G1^0(X)
F2^0 = G2^0(Y)
wherein X and Y respectively denote the computed tomography CT source image and the magnetic resonance MR source image, F1^0 is the shallow feature of the computed tomography CT image, F2^0 is the shallow feature of the magnetic resonance MR image, G1^0(·) is the combined function of the convolution operation and the activation function of the computed tomography CT path, and G2^0(·) is the combined function of the convolution operation and the activation function of the magnetic resonance MR path.
4. The multi-modal medical image fusion method based on the dual residual ultra-dense network as claimed in claim 3,
using the parametric rectified linear unit as the activation function of the PReLU layer for all convolutional layers in the double-residual ultra-dense network;
the PReLU layer transforms the input feature map through a nonlinear mapping, according to the formula:
Fj^(l+1) = f(Fj^l)
wherein f(·) is a nonlinear function; Fj^(l+1) is the output, i.e. the (l+1)-th layer feature map of path j; Fj^l is the input, i.e. the l-th layer feature map of path j.
5. The multi-modal medical image fusion method based on the dual residual ultra-dense network as claimed in claim 3,
the first Conv layer of the double-residual ultra-dense network uses 64 7 × 7 convolution kernels, with the step size set to 2 and the padding to 3; the 256 × 256 × 1 first-modality medical image and second-modality medical image are input, and shallow feature maps of the first-modality medical image and the second-modality medical image of size 128 × 128 × 64 are output.
6. The method according to claim 4, wherein the extracting deep features of the first modality medical image and the second modality medical image from the shallow features of the first modality medical image and the second modality medical image respectively through residual learning and super dense connection specifically comprises:
splicing the shallow feature of the first-modality medical image and the shallow feature of the second-modality medical image in the channel dimension on a Concat layer of the double-residual ultra-dense network;
realizing dimensionality reduction of the number of convolution kernel channels on a 1 × 1 Conv layer of the double-residual ultra-dense network;
on the summation (splicing) layer of the double-residual ultra-dense network, the input-output relationship in residual learning is as follows:
H(x)=F(x)+x
wherein F(x) denotes the residual mapping, specifically G1^L(·) and G2^L(·) learned by the double-residual ultra-dense network, and x denotes the shallow features F1^0 and F2^0; thus, the output of the double-residual ultra-dense network is as follows:
F1^L = G1^L([F1^0, F2^0, F1^1, F2^1, …, F1^(L-1), F2^(L-1)]) + F1^0
F2^L = G2^L([F1^0, F2^0, F1^1, F2^1, …, F1^(L-1), F2^(L-1)]) + F2^0
wherein F1^L and F2^L denote the output of the double-residual ultra-dense network, and G1^L(·) and G2^L(·) denote the functions of the last 1 × 1 Conv layer in the double-residual ultra-dense network;
let F1^l and F2^l respectively denote the output of the l-th layer in the computed tomography CT image path and the magnetic resonance MR image path; the output features of the l-th layer of the two paths of the double-residual ultra-dense network are as follows:
F1^l = G1^l([F1^(l-1), F2^(l-1), F1^(l-2), F2^(l-2), …, F1^0, F2^0])
F2^l = G2^l([F2^(l-1), F1^(l-1), F2^(l-2), F1^(l-2), …, F2^0, F1^0])
wherein G1^l(·) and G2^l(·) respectively denote the function of the l-th convolutional layer in the computed tomography CT image path and the function of the l-th convolutional layer in the magnetic resonance MR image path of the double-residual ultra-dense network.
7. The method according to claim 6, wherein the channel-dimensional stitching, Conv layer convolution and PReLU layer activation are performed on the deep features of the first-modality medical image and the second-modality medical image to obtain a fused image of the first-modality medical image and the second-modality medical image, and specifically includes:
splicing the deep features of the first-modality medical image and the second-modality medical image in the channel or number dimension on a Concat layer of the double-residual ultra-dense network;
performing dimensionality reduction of the number of convolution kernel channels on a 1 × 1 Conv layer of the double-residual ultra-dense network;
performing convolution on the last Conv layer of the double-residual ultra-dense network;
and activating and completing fusion on a PReLU layer of the double-residual ultra-dense network.
8. The multi-modal medical image fusion method based on the double-residual ultra-dense network as claimed in claim 7, wherein the 1 × 1 Conv layer of the double-residual ultra-dense network reduces the number of channels to 3 using three 1 × 1 convolution kernels.
9. The multi-modal medical image fusion method based on the double-residual ultra-dense network as claimed in claim 7, wherein the last Conv layer of the double-residual ultra-dense network is a 3 × 3 convolution kernel with a step size of 1 and padding of 1, and the output size after the convolution operation is 4 × 4 × 1.
CN202010734334.7A 2020-07-27 2020-07-27 Multi-mode medical image fusion method based on double-residual ultra-dense network Active CN111882514B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010734334.7A CN111882514B (en) 2020-07-27 2020-07-27 Multi-mode medical image fusion method based on double-residual ultra-dense network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010734334.7A CN111882514B (en) 2020-07-27 2020-07-27 Multi-mode medical image fusion method based on double-residual ultra-dense network

Publications (2)

Publication Number Publication Date
CN111882514A true CN111882514A (en) 2020-11-03
CN111882514B CN111882514B (en) 2023-05-19

Family

ID=73201421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010734334.7A Active CN111882514B (en) 2020-07-27 2020-07-27 Multi-mode medical image fusion method based on double-residual ultra-dense network

Country Status (1)

Country Link
CN (1) CN111882514B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112750097A (en) * 2021-01-14 2021-05-04 中北大学 Multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network
CN113128422A (en) * 2021-04-23 2021-07-16 重庆市海普软件产业有限公司 Image smoke and fire detection method and system of deep neural network
CN113256500A (en) * 2021-07-02 2021-08-13 北京大学第三医院(北京大学第三临床医学院) Deep learning neural network model system for multi-modal image synthesis
CN113313021A (en) * 2021-05-27 2021-08-27 云南电网有限责任公司电力科学研究院 Deep learning model construction method based on low-quality image recognition
WO2022104618A1 (en) * 2020-11-19 2022-05-27 Intel Corporation Bidirectional compact deep fusion networks for multimodality visual analysis applications
WO2022121100A1 (en) * 2020-12-11 2022-06-16 华中科技大学 Darts network-based multi-modal medical image fusion method
CN115272261A (en) * 2022-08-05 2022-11-01 广州大学 Multi-modal medical image fusion method based on deep learning
CN116168276A (en) * 2023-02-27 2023-05-26 脉得智能科技(无锡)有限公司 Multi-modal feature fusion-based breast nodule classification method, device and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109035160A (en) * 2018-06-29 2018-12-18 哈尔滨商业大学 The fusion method of medical image and the image detecting method learnt based on fusion medical image
CN109064398A (en) * 2018-07-14 2018-12-21 深圳市唯特视科技有限公司 A kind of image super-resolution implementation method based on residual error dense network
CN109636769A (en) * 2018-12-18 2019-04-16 武汉大学 EO-1 hyperion and Multispectral Image Fusion Methods based on the intensive residual error network of two-way
KR101970488B1 (en) * 2017-12-28 2019-04-19 포항공과대학교 산학협력단 RGB-D Multi-layer Residual Feature Fusion Network for Indoor Semantic Segmentation
CN110826576A (en) * 2019-10-10 2020-02-21 浙江大学 Cervical lesion prediction system based on multi-mode feature level fusion
US20200074271A1 (en) * 2018-08-29 2020-03-05 Arizona Board Of Regents On Behalf Of Arizona State University Systems, methods, and apparatuses for implementing a multi-resolution neural network for use with imaging intensive applications including medical imaging
CN111369546A (en) * 2020-03-10 2020-07-03 北京邮电大学 Neck lymph node image classification and identification device and method

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101970488B1 (en) * 2017-12-28 2019-04-19 포항공과대학교 산학협력단 RGB-D Multi-layer Residual Feature Fusion Network for Indoor Semantic Segmentation
CN109035160A (en) * 2018-06-29 2018-12-18 哈尔滨商业大学 The fusion method of medical image and the image detecting method learnt based on fusion medical image
CN109064398A (en) * 2018-07-14 2018-12-21 深圳市唯特视科技有限公司 A kind of image super-resolution implementation method based on residual error dense network
US20200074271A1 (en) * 2018-08-29 2020-03-05 Arizona Board Of Regents On Behalf Of Arizona State University Systems, methods, and apparatuses for implementing a multi-resolution neural network for use with imaging intensive applications including medical imaging
CN109636769A (en) * 2018-12-18 2019-04-16 武汉大学 EO-1 hyperion and Multispectral Image Fusion Methods based on the intensive residual error network of two-way
CN110826576A (en) * 2019-10-10 2020-02-21 浙江大学 Cervical lesion prediction system based on multi-mode feature level fusion
CN111369546A (en) * 2020-03-10 2020-07-03 北京邮电大学 Neck lymph node image classification and identification device and method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JOSE DOLZ: "HyperDense-Net: A Hyper-Densely Connected CNN for Multi-Modal Image Segmentation", IEEE TRANSACTIONS ON MEDICAL IMAGING *
LI QIANG et al.: "Research progress and challenges of MRI brain tumor image segmentation" (in Chinese), Journal of Image and Graphics *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022104618A1 (en) * 2020-11-19 2022-05-27 Intel Corporation Bidirectional compact deep fusion networks for multimodality visual analysis applications
WO2022121100A1 (en) * 2020-12-11 2022-06-16 华中科技大学 Darts network-based multi-modal medical image fusion method
US11769237B2 (en) 2020-12-11 2023-09-26 Huazhong University Of Science And Technology Multimodal medical image fusion method based on darts network
CN112750097A (en) * 2021-01-14 2021-05-04 中北大学 Multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network
CN113128422A (en) * 2021-04-23 2021-07-16 重庆市海普软件产业有限公司 Image smoke and fire detection method and system of deep neural network
CN113128422B (en) * 2021-04-23 2024-03-29 重庆市海普软件产业有限公司 Image smoke and fire detection method and system for deep neural network
CN113313021A (en) * 2021-05-27 2021-08-27 云南电网有限责任公司电力科学研究院 Deep learning model construction method based on low-quality image recognition
CN113256500A (en) * 2021-07-02 2021-08-13 北京大学第三医院(北京大学第三临床医学院) Deep learning neural network model system for multi-modal image synthesis
CN115272261A (en) * 2022-08-05 2022-11-01 广州大学 Multi-modal medical image fusion method based on deep learning
CN115272261B (en) * 2022-08-05 2023-04-21 广州大学 Multi-mode medical image fusion method based on deep learning
CN116168276A (en) * 2023-02-27 2023-05-26 脉得智能科技(无锡)有限公司 Multi-modal feature fusion-based breast nodule classification method, device and storage medium
CN116168276B (en) * 2023-02-27 2023-10-31 脉得智能科技(无锡)有限公司 Multi-modal feature fusion-based breast nodule classification method, device and storage medium

Also Published As

Publication number Publication date
CN111882514B (en) 2023-05-19

Similar Documents

Publication Publication Date Title
CN111882514B (en) Multi-mode medical image fusion method based on double-residual ultra-dense network
CN110689083B (en) Context pyramid fusion network and image segmentation method
CN109934887B (en) Medical image fusion method based on improved pulse coupling neural network
CN102567734B (en) Specific value based retina thin blood vessel segmentation method
WO2022121100A1 (en) Darts network-based multi-modal medical image fusion method
Liu et al. Robust spiking cortical model and total-variational decomposition for multimodal medical image fusion
CN115496771A (en) Brain tumor segmentation method based on brain three-dimensional MRI image design
CN111080778A (en) Online three-dimensional reconstruction method of binocular endoscope soft tissue image
CN114022527A (en) Monocular endoscope depth and pose estimation method and device based on unsupervised learning
CN115457359A (en) PET-MRI image fusion method based on adaptive countermeasure generation network
CN110415252A (en) A kind of eye circumference organ segmentation method, equipment and storage medium based on CNN
CN115456927A (en) Brain medical image synthesis method and system, electronic equipment and storage medium
CN113706695B (en) System and method for deep learning 3D femoral head modeling and storage medium
CN112750097B (en) Multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network
CN113436128B (en) Dual-discriminator multi-mode MR image fusion method, system and terminal
CN115731444A (en) Medical image fusion method based on artificial intelligence and superpixel segmentation
CN109919098A (en) The recognition methods of target object and device
Zhou et al. Synchronizing detection and removal of smoke in endoscopic images with cyclic consistency adversarial nets
CN113808057A (en) Endoscope image enhancement method based on unsupervised learning
Chen et al. Integrated frameworkfor simultaneous segmentation and registration of carpal bones
Torrents-Barrena et al. Fetal MRI synthesis via balanced auto-encoder based generative adversarial networks
Zhang et al. A 3D reconstruction based on an unsupervised domain adaptive for binocular endoscopy
CN113205472A (en) Cross-modal MR image mutual generation method based on cyclic generation countermeasure network cycleGAN model
Wu et al. Fundus Image Enhancement via Semi-Supervised GAN and Anatomical Structure Preservation
Mu et al. Learning to Search a Lightweight Generalized Network for Medical Image Fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant