CN111882514B - Multi-mode medical image fusion method based on double-residual ultra-dense network - Google Patents

Multi-mode medical image fusion method based on double-residual ultra-dense network

Info

Publication number
CN111882514B
CN111882514B (application number CN202010734334.7A)
Authority
CN
China
Prior art keywords
medical image
layer
residual
ultra
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010734334.7A
Other languages
Chinese (zh)
Other versions
CN111882514A (en)
Inventor
王丽芳
王蕊芳
张晋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North University of China
Original Assignee
North University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North University of China filed Critical North University of China
Priority to CN202010734334.7A priority Critical patent/CN111882514B/en
Publication of CN111882514A publication Critical patent/CN111882514A/en
Application granted granted Critical
Publication of CN111882514B publication Critical patent/CN111882514B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10088Magnetic resonance imaging [MRI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Abstract

The invention discloses a multi-modality medical image fusion method based on a double-residual ultra-dense network, which comprises the following steps: extracting shallow features of a first-modality medical image and a second-modality medical image through the first Conv layer convolution and PReLU layer activation in the double-residual ultra-dense network; extracting deep features through residual learning and ultra-dense connection; and splicing the deep features along the channel dimension in the Concat layer of the double-residual ultra-dense network, then performing Conv layer convolution and PReLU layer activation, to obtain a fused image of the first-modality medical image and the second-modality medical image. In the invention, the double-residual ultra-dense block obtained by combining the residual dense block with ultra-dense connection applies dense connections not only between layers of the same path but also between layers of different paths, and information is passed between the two paths that extract the features of the different-modality images, so the extracted deep features are more detailed and abundant and the loss of useful information in the intermediate layers of the network is reduced.

Description

Multi-mode medical image fusion method based on double-residual ultra-dense network
Technical Field
The invention relates to the technical field of medical image fusion, in particular to a multi-mode medical image fusion method based on a double-residual ultra-dense network.
Background
Image fusion is widely used in medical imaging, remote sensing, machine vision, biometric recognition and military applications. The purpose of fusion is to obtain better contrast and better perceptual quality. In recent years, with the growing demands of clinical applications, research on multi-modality medical image fusion has attracted increasing attention. The purpose of multi-modality medical image fusion is to provide a better medical image that assists the physician in surgical intervention.
Currently, medical images come in multiple modalities, such as Magnetic Resonance (MR) images, Computed Tomography (CT) images, Positron Emission Tomography (PET) images and X-ray images, and images of different modalities have their own advantages and limitations. For example, CT displays bone information well but cannot clearly display structural information such as soft tissue; MR images display soft tissue information well but are much weaker at detecting bone information; PET images can provide clinically rich human metabolic information but have lower resolution. Therefore, combining medical image information from multiple modalities to perform multi-modality image fusion achieves complementary advantages. The multi-modality fusion image not only retains the characteristics of the original images but also makes up for the shortcomings of single-modality medical images, shows richer detail information, and provides comprehensive information for clinical diagnosis and treatment and for image-guided surgery.
Image fusion methods fall into three levels: pixel level, feature level and decision level. Pixel-level image fusion combines the pixel values of each corresponding point of two or more source images with a given fusion method to compute new pixel values, so that the fusion of every pixel forms a new fused image. Common pixel-level image fusion methods include fusion methods based on the spatial domain and fusion methods based on the transform domain. Spatial-domain image fusion is mainly divided into block-based fusion and region-based fusion and includes the logical filtering method, the weighted averaging method, mathematical morphology, image algebra, simulated annealing and so on; it requires little computation and is easy to implement, but its accuracy is poor and it is not suitable for the medical imaging field. Transform-domain methods first decompose the source images, then combine the decompositions using different fusion rules, and finally perform the inverse transform to reconstruct the fused image; they include the pyramid image fusion method, the wavelet-transform image fusion method, multi-scale decomposition methods and so on, which not only maintain contrast and reduce blocking effects but also have unique advantages in describing the local characteristics of signals. Feature-level image fusion extracts the feature information of interest to the observer from the source images, such as edges, contours, shapes and local features, and then analyzes, processes and integrates this feature information to obtain the fused image features. Commonly used methods are weighted averaging, Bayesian estimation and cluster analysis. Decision-level image fusion analyzes, infers, identifies and judges the feature information of each image to form corresponding results and then fuses these results further; the final fusion result is the globally optimal decision. This approach has good real-time performance, high flexibility and a certain fault tolerance, but its preprocessing cost is higher and more of the original information in the image is lost.
In recent years, deep learning has been widely applied in the field of image processing by virtue of its strong feature extraction and data representation capability, overcoming the loss of detail information in feature-level image fusion methods and achieving good results. For example, medical image fusion has been performed with pulse coupled neural networks (Pulse Coupled Neural Network, PCNN) combined with sparse representation, image fusion has been performed with convolutional neural networks (Convolutional Neural Network, CNN), and image fusion has been performed with generative adversarial networks (Generative Adversarial Networks, GAN). Huang et al. proposed dense networks (DenseNet), He et al. proposed residual networks (ResNet), and Qiu et al. proposed a dual residual dense network for image fusion, improving the spatial resolution of the fused image.
Pixel-level fusion methods have high computational complexity and long running times and demand high registration accuracy, which limits their practical application. Feature-level fusion methods can compress information and eliminate the uncertainty of some features, but lose more detail information than pixel-level image fusion. Decision-level fusion methods have higher preprocessing costs and lose more of the original information in the image. PCNN needs no learning or training but runs slowly, and it is unclear how its parameters should be set to achieve the best fusion effect. GAN can improve the stability of image fusion results but suffers from unstable training, vanishing gradients, and loss of information during each layer of transmission. CNN detects image features well but suffers from gradient vanishing and gradient explosion, which prevents the quality of the fused image from improving as the network depth increases.
Huang et al. proposed dense networks (DenseNet) and He et al. proposed residual networks (ResNet), which address the problems of gradient vanishing and gradient explosion through dense connection and residual learning, respectively. Liu et al. proposed a method combining ResNet with NSST for image fusion, improving the texture information of the fused image, and Qiu et al. proposed a dual residual dense network for image fusion, improving the spatial resolution of the fused image. Using residual learning or dense connection for image fusion improves the spatial resolution of the fused image, but DenseNet and ResNet use only the results of the last layer of the network for feature fusion, so part of the useful information obtained by the intermediate layers is lost and the details of the fused image are unclear.
Disclosure of Invention
The embodiment of the invention provides a multi-modality medical image fusion method based on a double-residual ultra-dense network to solve the problems described in the background art.
The embodiment of the invention provides a multi-mode medical image fusion method based on a double-residual ultra-dense network, which comprises the following steps:
acquiring a first modality medical image and a second modality medical image;
extracting shallow features of a first-mode medical image and a second-mode medical image through first Conv layer convolution and PReLU layer activation in a double-residual ultra-dense network;
extracting deep features of the first-mode medical image and the second-mode medical image from shallow features of the first-mode medical image and the second-mode medical image through residual learning and ultra-dense connection respectively;
and splicing the deep features of the first-modality medical image and the second-modality medical image along the channel dimension in the Concat layer of the double-residual ultra-dense network, then performing Conv layer convolution and PReLU layer activation, to obtain a fusion image of the first-modality medical image and the second-modality medical image.
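For illustration only (not forming part of the claims), the four steps above can be sketched as a minimal PyTorch-style forward pass; the module and variable names are assumptions introduced here, and the double-residual ultra-dense block is abbreviated to a placeholder:

```python
import torch
import torch.nn as nn

class DRHDNsSketch(nn.Module):
    """Illustrative sketch of the four fusion steps (names are assumptions)."""
    def __init__(self):
        super().__init__()
        # Steps 1-2: first Conv layer + PReLU for each modality path (shallow features)
        self.shallow_ct = nn.Sequential(nn.Conv2d(1, 64, 7, stride=2, padding=3), nn.PReLU())
        self.shallow_mr = nn.Sequential(nn.Conv2d(1, 64, 7, stride=2, padding=3), nn.PReLU())
        # Step 3: double-residual ultra-dense block (abbreviated here to identity placeholders)
        self.deep_ct = nn.Identity()
        self.deep_mr = nn.Identity()
        # Step 4: Concat -> 1x1 Conv -> Conv -> PReLU fusion head
        self.fuse = nn.Sequential(
            nn.Conv2d(128, 3, 1),
            nn.Conv2d(3, 1, 3, stride=1, padding=1),
            nn.PReLU())

    def forward(self, ct, mr):
        f1_0, f2_0 = self.shallow_ct(ct), self.shallow_mr(mr)    # shallow features
        f1_L, f2_L = self.deep_ct(f1_0), self.deep_mr(f2_0)      # deep features
        return self.fuse(torch.cat([f1_L, f2_L], dim=1))         # fused image

ct, mr = torch.randn(1, 1, 256, 256), torch.randn(1, 1, 256, 256)
print(DRHDNsSketch()(ct, mr).shape)   # torch.Size([1, 1, 128, 128])
```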
Further,
the first modality medical image adopts a computed tomography CT image;
the second modality medical image is a magnetic resonance MR image.
Further, the extracting shallow features of the first modality medical image and the second modality medical image through the first Conv layer convolution and the PReLU layer activation in the dual-residual ultra-dense network specifically includes:
taking the first convolution layer of a ResNet101 residual network pre-trained on the ImageNet image dataset as the first Conv layer of the double-residual ultra-dense network;
shallow features of the computed tomography CT image and the magnetic resonance MR image are obtained by the first Conv layer convolution and PReLU layer activation in the double-residual ultra-dense network:
F_1^0 = G_1^0(X)
F_2^0 = G_2^0(Y)
wherein X and Y respectively denote the CT source image and the MR source image, F_1^0 is the shallow feature of the computed tomography CT image, F_2^0 is the shallow feature of the magnetic resonance MR image, G_1^0(·) is the combined function of the convolution operation and the activation function of the computed tomography CT path, and G_2^0(·) is the combined function of the convolution operation and the activation function of the magnetic resonance MR path.
Further,
a Parametric Rectified Linear Unit (PReLU) is used as the activation function of the PReLU layer following all convolution layers in the double-residual ultra-dense network;
the PReLU layer transforms the input feature map through a nonlinear mapping, with the following formula:
F_j^{l+1} = f(F_j^l)
wherein f(·) is a nonlinear function, F_j^{l+1} is the output, i.e. the (l+1)-th layer feature map of path j, and F_j^l is the input, i.e. the l-th layer feature map of path j.
Further,
the first Conv layer of the dual-residual ultra-dense network is a 7×7 convolution kernel with 64 channels, stride set to 2 and padding set to 3; a 256×256×1 first-modality medical image and second-modality medical image are input, and shallow feature maps of the first-modality and second-modality medical images of size 128×128×64 are output.
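A minimal check of these dimensions, written as an illustrative PyTorch sketch (layer and tensor names are assumptions):

```python
import torch
import torch.nn as nn

first_conv = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3),  # first Conv layer: 7x7, 64 channels
    nn.PReLU())                                             # PReLU activation layer

x = torch.randn(1, 1, 256, 256)   # a 256x256x1 single-modality input image
print(first_conv(x).shape)        # torch.Size([1, 64, 128, 128]) -> 128x128x64 shallow features
```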
Further, the extracting deep features of the first modality medical image and the second modality medical image from shallow features of the first modality medical image and the second modality medical image respectively through residual learning and ultra-dense connection specifically includes:
on a Concat layer of the double-residual ultra-dense network, splicing the shallow features of the first-mode medical image and the shallow features of the second-mode medical image in the channel dimension;
on the 1×1 Conv layer of the double-residual ultra-dense network, the number of convolution kernel channels is reduced;
on the splicing layer of the double-residual ultra-dense network, the input-output relationship of residual learning is as follows:
H(x) = F(x) + x
wherein F(x) denotes the residual mapping, specifically the G_1^L(·), G_2^L(·) learned by the double-residual ultra-dense network, and x denotes the shallow features F_1^0, F_2^0. The output of the double-residual ultra-dense network is therefore as follows:
F_1^L = G_1^L(·) + F_1^0
F_2^L = G_2^L(·) + F_2^0
wherein F_1^L, F_2^L denote the outputs of the double-residual ultra-dense network and G_1^L(·), G_2^L(·) denote the function of the last 1×1 Conv layer in the double-residual ultra-dense network;
let F_1^l and F_2^l denote the outputs of the l-th layer in the computed tomography CT image path and the magnetic resonance MR image path respectively; the output features of the l-th layer of the two paths of the dual-residual ultra-dense network are then:
F_1^l = G_1^l([F_1^{l-1}, F_2^{l-1}, F_1^{l-2}, F_2^{l-2}, ..., F_1^0, F_2^0])
F_2^l = G_2^l([F_2^{l-1}, F_1^{l-1}, F_2^{l-2}, F_1^{l-2}, ..., F_2^0, F_1^0])
wherein [·] denotes concatenation of the outputs of all preceding layers of both paths along the channel dimension, and G_1^l(·) and G_2^l(·) denote the function of the l-th convolution layer on the computed tomography CT image path and the function of the l-th convolution layer on the magnetic resonance MR image path in the dual-residual ultra-dense network, respectively.
Further, the performing of channel-dimension splicing, Conv layer convolution and PReLU layer activation on the deep features of the first-modality medical image and the second-modality medical image to obtain a fused image of the first-modality medical image and the second-modality medical image specifically includes:
on the Concat layer of the double-residual ultra-dense network, splicing the deep features of the first-modality medical image and the second-modality medical image along the channel or num dimension;
on the 1×1 Conv layer of the double-residual ultra-dense network, reducing the number of convolution kernel channels;
performing convolution on the last Conv layer of the double-residual ultra-dense network;
and performing activation on the PReLU layer of the double-residual ultra-dense network to complete the fusion.
Further, in the 1×1 Conv layer of the dual-residual ultra-dense network, the number of channels is reduced to 3 using three 1×1 convolution kernels.
Further, the last Conv layer of the dual-residual ultra-dense network is a single 3×3 convolution kernel with stride 1 and padding 1, and the output size after the convolution operation is 4×4×1.
The embodiment of the invention provides a multi-mode medical image fusion method based on a double-residual ultra-dense network, which has the following beneficial effects compared with the prior art:
aiming at the problems that the image fusion method based on the residual error network and the dense network only fuses the characteristics extracted from the last layer of the network, so that the useful information extracted from part of intermediate layers is lost and the detail and definition of the fused image are affected, the invention provides a multi-mode medical image fusion method based on a double-residual error ultra-dense network (DRHDNs). The DRHDNs comprise two parts of feature extraction and feature fusion: the feature extraction part constructs a double-residual ultra-dense block for extracting deep features of two source images by combining residual error learning and ultra-dense connection, the residual error learning simplifies learning targets and difficulty in a jump connection mode, target integrity is realized, the ultra-dense connection is formed by expanding dense connection between layers of different paths, thereby reducing loss of useful information of an intermediate layer, completing preliminary information fusion, and encouraging feature reuse by both residual error learning and ultra-dense connection, so that feature extraction is more complete and detail information is more abundant; the feature fusion part is used for firstly splicing the two feature images in the channel, then performing dimension reduction and convolution to finally obtain a fusion image with more details and clearness. Meanwhile, the double-residual error super-dense block proposed by combining the residual error dense block and the super-dense connection not only applies the dense connection between layers of the same path, but also between layers crossing different paths, and information transmission is carried out between two paths for extracting image features of different modes, so that the extracted deep features are more detailed and abundant, and the loss of useful information of a network intermediate layer is reduced; double residual ultra-dense blocks encourage feature reuse through residual learning and ultra-dense connections, which helps to more fully extract deep features from shallow features.
Drawings
FIG. 1 is a multi-modal medical image fusion model based on a dual residual ultra dense network provided by an embodiment of the invention;
FIG. 2 is a block structure of dual residual super-dense blocks according to an embodiment of the present invention;
FIG. 3 is a CT/MR image fusion result of brain diseases provided by an embodiment of the present invention;
fig. 4 is a brain MR/PET image fusion provided by an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention provides a multi-mode medical image fusion method based on a double-residual ultra-dense network (Dual Residual Hyper-Densely Networks, DRHDNs). The DRHDNs comprise two parts, namely feature extraction and feature fusion.
In view of the fact that the dual residual dense network does not make full use of the useful information extracted by its intermediate layers, and that ultra-dense connection has achieved good results in medical image segmentation, the method applies ultra-dense connection to the residual dense block of the dual residual dense network and constructs the feature extraction part of DRHDNs. The features of the CT and MR source images are extracted through ultra-dense connections between the two paths, so the extracted features are richer and clearer; residual learning alleviates the gradient vanishing problem and preserves the feature information of the source images as much as possible; and through the ultra-dense connection, each path extracts the deep features of its own source image as fully as possible while preliminarily learning the features of the other source image. Feature extraction proceeds in two steps: shallow features are extracted first, and deep features are then extracted by the double-residual ultra-dense block. The model is shown in Fig. 1.
Let X, Y, and Z denote the source CT image, the source MR image, and the fused CT/MR image, respectively. The extracted features and the fused result are given by formulas (1)-(3):
F_1 = G_1(X)   (1)
F_2 = G_2(Y)   (2)
Z = FFN(F_1, F_2)   (3)
where F_1, F_2 denote the deep features of the CT and MR images, G_1(·), G_2(·) denote the functions that extract the CT features and the MR features respectively, and FFN(·) is the feature fusion function.
A first part: feature extraction
(1) Extracting shallow features
DRHDNs uses the first convolutional layer of ResNet101, pre-trained on ImageNet, as its first convolutional layer; the parameters of this layer are a 7×7 convolution kernel with 64 channels, stride set to 2 and padding set to 3. A 256×256×1 source image is input and, after the Conv layer convolution and PReLU layer activation, a 128×128×64 feature map is output. F_1^0 and F_2^0 are given by formulas (4)-(5), where G_1^0(·) and G_2^0(·) represent the combined function of the convolution operation and the activation function of the CT path and of the MR path, respectively:
F_1^0 = G_1^0(X)   (4)
F_2^0 = G_2^0(Y)   (5)
All convolutional layers (Conv) in DRHDNs use the Parametric Rectified Linear Unit (PReLU) as the activation function.
The PReLU layer is an activation layer, also called a nonlinear mapping layer; it transforms the input feature map through a nonlinear mapping, as shown in formula (6):
F_j^{l+1} = f(F_j^l)   (6)
where f(·) is a nonlinear function, F_j^{l+1} is the output, i.e. the (l+1)-th layer feature map of path j, and F_j^l is the input, i.e. the l-th layer feature map of path j. The nonlinear function is essential for convolutional neural networks, because without a nonlinear mapping a neural network with any number of layers is equivalent to a single-layer network. In addition, the nonlinearity enables the convolutional neural network to extract more complex correlations in the image.
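For context, the standard parametric rectified linear unit computes f(x) = max(0, x) + a·min(0, x), where the negative slope a is learned during training. A small illustrative PyTorch sketch (the printed values in the comment are approximate):

```python
import torch
import torch.nn as nn

prelu = nn.PReLU(init=0.25)           # learnable negative-slope parameter a, initialised to 0.25
x = torch.tensor([-2.0, -0.5, 0.0, 1.0, 3.0])
print(prelu(x))                       # roughly tensor([-0.5000, -0.1250, 0.0000, 1.0000, 3.0000])
# Equivalent elementwise form: max(0, x) + a * min(0, x)
```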
(2) Extraction of deep features
The feature extraction part constructs a double-residual super-dense block, which extracts deep features, such as clearer edge information, richer detail information and more complete texture information, from the CT and MR feature maps obtained in the previous step through residual learning and super-dense connection.
As shown by the arrows on both sides of the double-residual super-dense block in Fig. 1, the inputs F_1^0 and F_2^0 of the feature extraction part are added directly to its outputs, which alleviates the gradient vanishing problem and simplifies the learning difficulty.
In an ultra-dense network, the output of the l-th layer (l ≤ 8) of the path for source image s is given by formula (7):
F_s^l = G_s^l([F_1^{l-1}, F_2^{l-1}, F_1^{l-2}, F_2^{l-2}, ..., F_1^0, F_2^0])   (7)
where [·] denotes concatenation of the outputs of all preceding layers of both paths.
Let F_1^l and F_2^l denote the outputs of the l-th layer of the two paths of the double-residual ultra-dense block in the present invention, as shown in formulas (8)-(9):
F_1^l = G_1^l([F_1^{l-1}, F_2^{l-1}, F_1^{l-2}, F_2^{l-2}, ..., F_1^0, F_2^0])   (8)
F_2^l = G_2^l([F_2^{l-1}, F_1^{l-1}, F_2^{l-2}, F_1^{l-2}, ..., F_2^0, F_1^0])   (9)
where G_1^l(·) and G_2^l(·) denote the function of the l-th convolution layer on the CT image path and the function of the l-th convolution layer on the MR image path in the double-residual super-dense block, respectively.
The double residual ultra-dense block is characterized in that all convolution layers in the block are fully connected, residual learning is added, so that the finally extracted CT and MR features can comprehensively and deeply represent the image features of the path, and the other source image features are preliminarily fused.
The final components of the double-residual super-dense block are the Concat layer, the 1×1 Conv layer and the splicing layer. The Concat layer splices all the features extracted in the previous steps along the channel or num dimension; as shown in the double-residual super-dense block of Fig. 1, the Concat layer joins the output of each layer together along the channel dimension. The 1×1 Conv layer plays an important role in the network: it can flexibly change the channel dimension of a matrix without changing its other properties, and enables cross-channel interaction, information integration and so on. The 1×1 Conv layer reduces the number of convolution kernel channels and thus the amount of computation. Since the Concat layer splices all the previous layer outputs together, the computation is very large, so dimension reduction is needed to reduce it. The output dimension can be reduced to 64 by 64 1×1 convolution kernels.
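A minimal sketch of this Concat plus 1×1 Conv reduction (the number and size of the concatenated feature maps are assumptions for the example):

```python
import torch
import torch.nn as nn

# Suppose 8 layer outputs of 64 channels each have been spliced along the channel dimension.
layer_outputs = [torch.randn(1, 64, 128, 128) for _ in range(8)]
concat = torch.cat(layer_outputs, dim=1)                 # shape: 1 x 512 x 128 x 128

reduce = nn.Conv2d(concat.shape[1], 64, kernel_size=1)   # 64 kernels of size 1x1
print(reduce(concat).shape)                              # torch.Size([1, 64, 128, 128])
```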
The splicing layer is the key layer for realizing residual learning; the relationship between the input and the output of the residual learning block is shown in formula (10):
H(x) = F(x) + x   (10)
where F(x) denotes the residual mapping, which in the present invention refers to the features G_1^L(·), G_2^L(·) learned by the double-residual ultra-dense block, and x denotes the shallow features F_1^0, F_2^0.
The output of the double-residual super-dense block is therefore given by formulas (11)-(12):
F_1^L = G_1^L(·) + F_1^0   (11)
F_2^L = G_2^L(·) + F_2^0   (12)
where F_1^L, F_2^L denote the outputs of the double-residual super-dense block and G_1^L(·), G_2^L(·) denote the function of the last 1×1 Conv layer in the double-residual super-dense block.
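Read together, formulas (7)-(12) translate almost directly into code. The following is a simplified illustrative sketch of one possible realization, assuming every layer keeps 64 channels and the spatial size; the actual per-layer kernel parameters are those of Table 1 and are not reproduced here:

```python
import torch
import torch.nn as nn

class DualResidualHyperDenseBlock(nn.Module):
    """Sketch: each layer of both paths receives the concatenated outputs of every earlier
    layer of BOTH paths (hyper-dense connection); a 1x1 Conv then reduces the channels and
    a residual skip adds back the shallow features, as in formulas (11)-(12)."""
    def __init__(self, channels=64, num_layers=8):
        super().__init__()
        def make_layer(idx):
            # input: shallow features of both paths plus idx earlier outputs of both paths
            in_ch = channels * 2 * (idx + 1)
            return nn.Sequential(nn.Conv2d(in_ch, channels, 3, padding=1), nn.PReLU())
        self.path1 = nn.ModuleList([make_layer(i) for i in range(num_layers)])
        self.path2 = nn.ModuleList([make_layer(i) for i in range(num_layers)])
        total_ch = channels * 2 * (num_layers + 1)
        self.reduce1 = nn.Conv2d(total_ch, channels, kernel_size=1)
        self.reduce2 = nn.Conv2d(total_ch, channels, kernel_size=1)

    def forward(self, f1_0, f2_0):
        feats1, feats2 = [f1_0], [f2_0]
        for layer1, layer2 in zip(self.path1, self.path2):
            dense_in = torch.cat(feats1 + feats2, dim=1)   # history of both paths
            feats1.append(layer1(dense_in))
            feats2.append(layer2(dense_in))
        f1_L = self.reduce1(torch.cat(feats1 + feats2, dim=1)) + f1_0   # formula (11)
        f2_L = self.reduce2(torch.cat(feats2 + feats1, dim=1)) + f2_0   # formula (12)
        return f1_L, f2_L

block = DualResidualHyperDenseBlock()
f1_0 = torch.randn(1, 64, 32, 32)    # shallow CT features (small spatial size for the example)
f2_0 = torch.randn(1, 64, 32, 32)    # shallow MR features
f1_L, f2_L = block(f1_0, f2_0)
print(f1_L.shape, f2_L.shape)        # both torch.Size([1, 64, 32, 32])
```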
A second part: feature fusion
As shown in Fig. 1, the features F_1^1 extracted from the CT image carry rich bone information and the features F_2^1 extracted from the MR image carry rich soft tissue information; after the CT and MR images are fused, the resulting fusion image contains rich bone and soft tissue information, has clear contours, and does not lose the detail information of the source images. The feature fusion part of the invention consists of a Concat layer, a 1×1 Conv layer, a Conv layer and a PReLU layer; its inputs are the two deep features F_1^1, F_2^1 and its output is the CT/MR fusion image.
The Concat layer can splice two or more features along the channel or num dimension; existing feature fusion uses either an add mode or a concat mode. Concat is a concatenation of the channel counts, that is, the number of features describing the image itself is increased while the amount of information under each feature is unchanged; add, in contrast, increases the amount of information under each dimension while the number of features stays the same. The present invention fuses in the concat mode, so that features of the same size have more feature representations.
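The difference between the two modes can be illustrated as follows (tensor shapes are assumed for the example):

```python
import torch

f1 = torch.randn(1, 64, 128, 128)           # deep CT features
f2 = torch.randn(1, 64, 128, 128)           # deep MR features

fused_add    = f1 + f2                      # "add": still 64 channels, information summed per channel
fused_concat = torch.cat([f1, f2], dim=1)   # "concat": 128 channels, every feature map kept separately
print(fused_add.shape, fused_concat.shape)  # (1, 64, 128, 128) and (1, 128, 128, 128)
```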
The 1×1 Conv layer reduces the number of convolution kernel channels, lightens the amount of computation, allows a more complex network structure to be realized, and further reduces computational complexity. The feature fusion part uses three 1×1 convolution kernels to reduce the number of channels to 3.
After the Concat layer and the 1×1 Conv layer fuse the deep features of the CT image and the MR image, the final fusion is completed through the Conv layer convolution and PReLU layer activation. The last Conv layer is a 3×3 convolution kernel with stride 1 and padding 1, and the output size after the convolution operation is 4×4×1. The final CT/MR fusion image contains the clear bone information of the CT image and the soft tissue information of the MR image, as well as clear detail information.
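Putting the feature fusion part together, a minimal sketch of the Concat, 1×1 Conv, 3×3 Conv and PReLU head described above (the 128-channel input is an assumption corresponding to two 64-channel deep feature maps; spatial sizes are illustrative):

```python
import torch
import torch.nn as nn

fusion_head = nn.Sequential(
    nn.Conv2d(128, 3, kernel_size=1),                      # three 1x1 kernels: channels reduced to 3
    nn.Conv2d(3, 1, kernel_size=3, stride=1, padding=1),   # last Conv layer: 3x3, stride 1, padding 1
    nn.PReLU())                                            # final PReLU activation

f1 = torch.randn(1, 64, 128, 128)     # deep CT features (assumed shape)
f2 = torch.randn(1, 64, 128, 128)     # deep MR features (assumed shape)
fused = fusion_head(torch.cat([f1, f2], dim=1))
print(fused.shape)                    # torch.Size([1, 1, 128, 128]) -> single-channel fused image
```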
Example analysis:
referring to fig. 2, features extracted from each layer in the double residual super-dense block are transferred to not only all layers behind the present path but also all layers behind the other path, and a curved connecting line represents transferring features extracted from the present layer to all layers behind. The ultra-dense connection encourages feature reuse, and increases information interaction between the two source images, so that feature information of the two source images is primarily fused. In addition, the connection mode promotes feature learning, improves gradient flow and increases implicit depth supervision.
The double residual super-dense block has 8 3×3Conv layers and 1×1Conv layer, and specific convolution kernel parameters are shown in table 1.
Table 1 convolutional layer parameters for double residual super-dense blocks
The double residual ultra-dense block is characterized in that all convolution layers in the block are connected, residual learning is added, and therefore the finally extracted features can comprehensively and deeply represent the image features of the path and preliminarily fuse the features of another source image.
The output of the double-residual super-dense block is as follows:
F_1^L = G_1^L(·) + F_1^0
F_2^L = G_2^L(·) + F_2^0
where F_1^L, F_2^L denote the outputs of the double-residual super-dense block, F_1^0, F_2^0 denote the shallow features, and G_1^L(·), G_2^L(·) denote the function of the last 1×1 Conv layer in the double-residual super-dense block.
The effectiveness of the method of the present invention is verified in two ways:
1. CT/MR image fusion experiment for brain diseases
The experiment uses a dataset from Harvard Medical School; all images are 256×256 pixels, and registered CT/MR images of brain diseases are selected as the experimental images. The fusion results are shown in Fig. 3, where Figs. 3(a)-(i) show the brain MR image, the brain CT image, and the fusion results of the CSMCA, ASR, JPC, NSST-PAPCNN, CNN, DRDNs and DRHDNs methods, respectively.
As can be seen from Fig. 3, the edge retention of the JPC and NSST-PAPCNN methods is poor, the texture details of the NSST-PAPCNN fusion image are unclear, the ASR and JPC methods lose more detail, the CNN method does not sufficiently preserve structure, and the DRDNs fusion image retains detail well but has lower contrast, whereas the DRHDNs fusion image of the invention has rich detail, clear texture and higher contrast. The evaluation indices in Table 2 objectively confirm the superiority of the invention in detail, clarity and other respects.
TABLE 2 brain CT/MR fusion image objective evaluation index
2. MR/PET image fusion experiment for brain diseases
To examine the general applicability of the DRHDNs method, a fusion experiment on brain MR/PET images was performed, and the fusion results are shown in Fig. 4. In Fig. 4, images (a) and (b) represent the MR source image and the PET source image, respectively. The MR image displays soft tissue information well, and the PET image provides detailed human metabolic information for clinical use but has lower resolution, so the metabolic information of the PET image must be fully preserved during fusion and the color information of the image must also be retained. As can be seen from the fusion results in Fig. 4 and the objective evaluation indices in Table 3, the CSMCA, ASR, JPC and NSST-PAPCNN methods show serious color distortion, the CSMCA method has lower clarity, the ASR, JPC and NSST-PAPCNN methods produce poor image quality, and the NSST-PAPCNN method shows edge brightness distortion. The CNN and DRDNs methods achieve better contrast and clarity and are more consistent with human visual observation, but do not preserve enough detail information; the DRHDNs method preserves color better, has clear edges and rich detail information, and yields a fused image of better quality. From the evaluation indices in Table 3, the NSST-PAPCNN method has lower index values, while the DRHDNs method has higher index values than the other methods, consistent with the subjective observations.
TABLE 3 brain MR/PET fusion image objective evaluation index
The foregoing disclosure is only a few specific embodiments of the invention, and those skilled in the art may make various changes and modifications without departing from the spirit and scope of the invention, but the embodiments of the invention are not limited thereto, and any changes that may be made by those skilled in the art should fall within the scope of the invention.

Claims (5)

1. A multi-mode medical image fusion method based on a double-residual ultra-dense network is characterized by comprising the following steps:
acquiring a first modality medical image and a second modality medical image;
extracting shallow features of a first-mode medical image and a second-mode medical image through first Conv layer convolution and PReLU layer activation in a double-residual ultra-dense network;
extracting deep features of the first-mode medical image and the second-mode medical image from shallow features of the first-mode medical image and the second-mode medical image through residual learning and ultra-dense connection respectively;
sequentially splicing the deep features of the first-mode medical image and the second-mode medical image in the Concat layer channel dimension in the double-residual ultra-dense network, and finally performing Conv layer convolution and PReLU layer activation to obtain a fusion image of the first-mode medical image and the second-mode medical image;
wherein,
a Parametric Rectified Linear Unit (PReLU) is used as the activation function of the PReLU layer following all convolution layers in the double-residual ultra-dense network;
the PReLU layer transforms the input feature map through a nonlinear mapping, with the following formula:
F_j^{l+1} = f(F_j^l)
wherein f(·) is a nonlinear function, F_j^{l+1} is the output, i.e. the (l+1)-th layer feature map of path j, and F_j^l is the input, i.e. the l-th layer feature map of path j;
the first modality medical image adopts a computed tomography CT image;
the second modality medical image adopts a magnetic resonance MR image;
the extracting shallow features of the first modality medical image and the second modality medical image through first Conv layer convolution and PReLU layer activation in the double-residual ultra-dense network specifically comprises:
taking the first convolution layer of a ResNet101 residual network pre-trained on the ImageNet image dataset as the first Conv layer of the double-residual ultra-dense network;
shallow features of the computed tomography CT image and the magnetic resonance MR image are obtained by the first Conv layer convolution and PReLU layer activation in the double-residual ultra-dense network:
F_1^0 = G_1^0(X)
F_2^0 = G_2^0(Y)
wherein X and Y respectively denote the CT source image and the MR source image, F_1^0 is the shallow feature of the computed tomography CT image, F_2^0 is the shallow feature of the magnetic resonance MR image, G_1^0(·) is the combined function of the convolution operation and the activation function of the computed tomography CT path, and G_2^0(·) is the combined function of the convolution operation and the activation function of the magnetic resonance MR path;
extracting deep features of the first modality medical image and the second modality medical image from shallow features of the first modality medical image and the second modality medical image respectively through residual learning and ultra-dense connection, specifically comprising:
on a Concat layer of the double-residual ultra-dense network, splicing the shallow features of the first-mode medical image and the shallow features of the second-mode medical image in the channel dimension;
on the 1×1 Conv layer of the double-residual ultra-dense network, the number of convolution kernel channels is reduced;
on the splicing layer of the double-residual ultra-dense network, the input-output relationship of residual learning is as follows:
H(x) = F(x) + x
wherein F(x) denotes the residual mapping, specifically the G_1^L(·), G_2^L(·) learned by the double-residual ultra-dense network, and x denotes the shallow features F_1^0, F_2^0; the output of the double-residual ultra-dense network is therefore as follows:
F_1^L = G_1^L(·) + F_1^0
F_2^L = G_2^L(·) + F_2^0
wherein F_1^L, F_2^L denote the outputs of the double-residual ultra-dense network and G_1^L(·), G_2^L(·) denote the function of the last 1×1 Conv layer in the double-residual ultra-dense network;
let F_1^l and F_2^l denote the outputs of the l-th layer in the computed tomography CT image path and the magnetic resonance MR image path respectively; the output features of the l-th layer of the two paths of the dual-residual ultra-dense network are then:
F_1^l = G_1^l([F_1^{l-1}, F_2^{l-1}, F_1^{l-2}, F_2^{l-2}, ..., F_1^0, F_2^0])
F_2^l = G_2^l([F_2^{l-1}, F_1^{l-1}, F_2^{l-2}, F_1^{l-2}, ..., F_2^0, F_1^0])
wherein [·] denotes concatenation of the outputs of all preceding layers of both paths along the channel dimension, and G_1^l(·) and G_2^l(·) denote the function of the l-th convolution layer on the computed tomography CT image path and the function of the l-th convolution layer on the magnetic resonance MR image path in the dual-residual ultra-dense network, respectively.
2. The multi-modal medical image fusion method based on the dual residual ultra dense network as claimed in claim 1, wherein,
the first Conv layer of the dual-residual ultra-dense network is a 7×7 convolution kernel with 64 channels, stride set to 2 and padding set to 3; a 256×256×1 first-modality medical image and second-modality medical image are input, and shallow feature maps of the first-modality and second-modality medical images of size 128×128×64 are output.
3. The multi-mode medical image fusion method based on the double-residual ultra-dense network according to claim 1, wherein the performing of channel-dimension splicing, Conv layer convolution and PReLU layer activation on the deep features of the first-modality medical image and the second-modality medical image to obtain a fused image of the first-modality medical image and the second-modality medical image specifically comprises:
on the Concat layer of the double-residual ultra-dense network, splicing the deep features of the first-modality medical image and the second-modality medical image along the channel or num dimension;
on the 1×1 Conv layer of the double-residual ultra-dense network, reducing the number of convolution kernel channels;
convolving on the last Conv layer of the double-residual ultra-dense network;
and performing activation on the PReLU layer of the double-residual ultra-dense network to finish fusion.
4. The multi-modal medical image fusion method based on the dual-residual ultra-dense network of claim 3, wherein in the 1×1 Conv layer of the dual-residual ultra-dense network the number of channels is reduced to 3 using three 1×1 convolution kernels.
5. The multi-modal medical image fusion method based on the dual-residual ultra-dense network of claim 3, wherein the last Conv layer of the dual-residual ultra-dense network is a single 3×3 convolution kernel with stride 1 and padding 1, and the output size after the convolution operation is 4×4×1.
CN202010734334.7A 2020-07-27 2020-07-27 Multi-mode medical image fusion method based on double-residual ultra-dense network Active CN111882514B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010734334.7A CN111882514B (en) 2020-07-27 2020-07-27 Multi-mode medical image fusion method based on double-residual ultra-dense network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010734334.7A CN111882514B (en) 2020-07-27 2020-07-27 Multi-mode medical image fusion method based on double-residual ultra-dense network

Publications (2)

Publication Number Publication Date
CN111882514A CN111882514A (en) 2020-11-03
CN111882514B true CN111882514B (en) 2023-05-19

Family

ID=73201421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010734334.7A Active CN111882514B (en) 2020-07-27 2020-07-27 Multi-mode medical image fusion method based on double-residual ultra-dense network

Country Status (1)

Country Link
CN (1) CN111882514B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20240005628A1 (en) * 2020-11-19 2024-01-04 Intel Corporation Bidirectional compact deep fusion networks for multimodality visual analysis applications
CN112488976B (en) * 2020-12-11 2022-05-17 华中科技大学 Multi-modal medical image fusion method based on DARTS network
CN112750097B (en) * 2021-01-14 2022-04-05 中北大学 Multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network
CN113128422B (en) * 2021-04-23 2024-03-29 重庆市海普软件产业有限公司 Image smoke and fire detection method and system for deep neural network
CN113313021B (en) * 2021-05-27 2023-05-30 云南电网有限责任公司电力科学研究院 Deep learning model construction method based on low-quality image recognition
CN113256500B (en) * 2021-07-02 2021-10-01 北京大学第三医院(北京大学第三临床医学院) Deep learning neural network model system for multi-modal image synthesis
CN115272261B (en) * 2022-08-05 2023-04-21 广州大学 Multi-mode medical image fusion method based on deep learning
CN116168276B (en) * 2023-02-27 2023-10-31 脉得智能科技(无锡)有限公司 Multi-modal feature fusion-based breast nodule classification method, device and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101970488B1 (en) * 2017-12-28 2019-04-19 포항공과대학교 산학협력단 RGB-D Multi-layer Residual Feature Fusion Network for Indoor Semantic Segmentation
CN109035160B (en) * 2018-06-29 2022-06-21 哈尔滨商业大学 Medical image fusion method and image detection method based on fusion medical image learning
CN109064398A (en) * 2018-07-14 2018-12-21 深圳市唯特视科技有限公司 A kind of image super-resolution implementation method based on residual error dense network
US11164067B2 (en) * 2018-08-29 2021-11-02 Arizona Board Of Regents On Behalf Of Arizona State University Systems, methods, and apparatuses for implementing a multi-resolution neural network for use with imaging intensive applications including medical imaging
CN109636769B (en) * 2018-12-18 2022-07-05 武汉大学 Hyperspectral and multispectral image fusion method based on two-way dense residual error network
CN110826576B (en) * 2019-10-10 2022-10-04 浙江大学 Cervical lesion prediction system based on multi-mode feature level fusion
CN111369546B (en) * 2020-03-10 2023-07-18 北京邮电大学 Cervical lymph node image classification and identification device and method

Also Published As

Publication number Publication date
CN111882514A (en) 2020-11-03

Similar Documents

Publication Publication Date Title
CN111882514B (en) Multi-mode medical image fusion method based on double-residual ultra-dense network
CN110689543A (en) Improved convolutional neural network brain tumor image segmentation method based on attention mechanism
CN109934887B (en) Medical image fusion method based on improved pulse coupling neural network
WO2022121100A1 (en) Darts network-based multi-modal medical image fusion method
EP2538839A1 (en) Method of analyzing a medical image
Liu et al. Robust spiking cortical model and total-variational decomposition for multimodal medical image fusion
CN112258456A (en) Three-dimensional image segmentation method based on convolutional neural network supervision
CN110415252A (en) A kind of eye circumference organ segmentation method, equipment and storage medium based on CNN
CN115457359A (en) PET-MRI image fusion method based on adaptive countermeasure generation network
CN106504221A (en) Based on the Medical image fusion new method that quaternion wavelet converts context mechanism
Jana et al. Liver fibrosis and nas scoring from ct images using self-supervised learning and texture encoding
Huang et al. Deep unsupervised endoscopic image enhancement based on multi-image fusion
Liu et al. Inflating 2D convolution weights for efficient generation of 3D medical images
CN112750097B (en) Multi-modal medical image fusion based on multi-CNN combination and fuzzy neural network
CN105976339A (en) Method and device for automatically removing bed plate in CT image based on Gaussian model
Kusakunniran et al. Automated tongue segmentation using deep encoder-decoder model
CN113436128B (en) Dual-discriminator multi-mode MR image fusion method, system and terminal
Tao et al. SVT-SDE: spatiotemporal vision transformers-based self-supervised depth estimation in stereoscopic surgical videos
Liu et al. Attention Based Cross-Domain Synthesis and Segmentation From Unpaired Medical Images
CN113205148B (en) Medical image frame interpolation method and terminal for iterative interlayer information fusion
CN113808057A (en) Endoscope image enhancement method based on unsupervised learning
Nobariyan et al. A new MRI and PET image fusion algorithm based on pulse coupled neural network
Zhang et al. A 3D reconstruction based on an unsupervised domain adaptive for binocular endoscopy
Ali et al. Translation of atherosclerotic disease features onto healthy carotid ultrasound images using domain-to-domain translation
Mei et al. Nonsubsampled contourlet transform and adaptive pcnn for medical image fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant